counterpart – SamaGamer

South Korean researchers have reported developing a new artificial intelligence tool that generates an image from a user's text description in 1.5 to 2 seconds. The tool does not require specialized or expensive hardware to run.

To build the tool, the developers used a technique called knowledge distillation, in which a smaller model is trained to reproduce the behavior of a larger one. They applied it to compress the open-source image generation model Stable Diffusion XL. That model has about 2.5 billion parameters, the values a neural network learns during training.
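The general idea of knowledge distillation can be sketched in a few lines of Python. This is a generic, minimal illustration and not the KOALA developers' actual training code: the function names, the `alpha` weighting, and the use of mean squared error for both terms are assumptions made for the example.

```python
def mse(a, b):
    """Mean squared error between two equal-length lists of numbers."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """Blend the ordinary task loss with a 'match the teacher' loss.

    alpha is a hypothetical weighting chosen for illustration:
    alpha=0 ignores the teacher, alpha=1 ignores the ground truth.
    """
    task_term = mse(student_out, target)          # student vs. real target
    distill_term = mse(student_out, teacher_out)  # student vs. large model
    return (1 - alpha) * task_term + alpha * distill_term


# A student that already matches both the teacher and the target has zero loss.
print(distillation_loss([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]))  # 0.0
```

Minimizing a loss of this shape pushes the small model toward the large model's outputs, which is how a compact network can inherit much of a bigger one's behavior.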

The smallest version of the new model, called KOALA, has 700 million parameters. It is described as a fairly compact neural network that runs quickly without energy-hungry, expensive hardware.

The tool can run on low-cost, widely available GPUs and needs 8 GB of RAM to process requests.
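A rough back-of-the-envelope calculation shows why the parameter count matters for hardware. The sketch below uses the parameter counts quoted in this article and assumes, purely for illustration, that each parameter is stored as a 16-bit (2-byte) number. It counts only the weights and ignores everything else a running model needs, so it is not meant to reproduce the 8 GB figure.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes).

    bytes_per_param=2 assumes 16-bit storage; this is an illustrative
    assumption, not a figure from the KOALA developers.
    """
    return num_params * bytes_per_param / 1e9


sdxl_params = 2.5e9   # Stable Diffusion XL, as quoted in the article
koala_params = 700e6  # smallest KOALA version, as quoted in the article

print(f"Stable Diffusion XL weights: {weight_memory_gb(sdxl_params):.1f} GB")  # 5.0 GB
print(f"KOALA weights: {weight_memory_gb(koala_params):.1f} GB")               # 1.4 GB
print(f"KOALA is {koala_params / sdxl_params:.0%} of the original size")       # 28%
```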

In testing, KOALA generated an image from a simple prompt (“a picture of an astronaut reading a book under the moon on Mars”) in about 1.6 seconds. According to the official description, OpenAI's DALL·E 2 takes 12.3 seconds on a similar task, and DALL·E 3 takes 13.7 seconds.
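The size of the gap is easier to see as a ratio. The short sketch below simply divides the reported times; the variable and function names are made up for the example, and the timings are the ones quoted above.

```python
koala_seconds = 1.6
reported_seconds = {"DALL·E 2": 12.3, "DALL·E 3": 13.7}


def speedup(baseline_seconds, new_seconds):
    """How many times faster the new system is than the baseline."""
    return baseline_seconds / new_seconds


for name, seconds in reported_seconds.items():
    print(f"KOALA is about {speedup(seconds, koala_seconds):.1f}x faster than {name}")
# KOALA is about 7.7x faster than DALL·E 2
# KOALA is about 8.6x faster than DALL·E 3
```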

The South Korean researchers presented their results in a paper posted to arXiv. The project is currently available through Hugging Face, a repository for open-source artificial intelligence models.