Kuaishou made a notable move today by open-sourcing its in-house image generation model, "Kolors". The model was trained on billions of text-image pairs, uses a General Language Model (GLM) as its text encoder, accepts prompts in both Chinese and English, and handles contexts of up to 256 tokens.

Key Features of Kolors:

  • Bilingual Support: Uses the General Language Model (GLM) as its text encoder, so the model understands and follows Chinese prompts as well as English ones.

  • Long Text Processing: Supports a context length of up to 256 tokens, giving creators room to describe complex scenes or rich stories in detail.

  • Massive Data Training: Trained on billions of text-image pairs, the model draws on a broad knowledge base and can generate diverse, accurate images.

  • Optimization for Chinese Cultural Elements: The model is specifically tuned for Chinese cultural elements, so generated images better reflect Chinese cultural characteristics and meet localized needs.

  • Chinese Text Generation: "Kolors" not only understands Chinese but can also render Chinese text inside the generated images, making them more expressive.
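To make the 256-token limit concrete, here is a minimal Python sketch that checks whether a prompt fits the budget before it is sent to the model. The tokenizer below is a crude stand-in, not the real GLM tokenizer, so its counts are only illustrative; `naive_tokenize` and `check_prompt` are hypothetical names introduced for this example.

```python
from typing import Callable, List, Tuple

MAX_PROMPT_TOKENS = 256  # context length stated in the article


def naive_tokenize(prompt: str) -> List[str]:
    """Stand-in tokenizer (an assumption, NOT the real GLM tokenizer).

    Each CJK character counts as one token; other text is split on whitespace.
    """
    tokens: List[str] = []
    word = ""
    for ch in prompt:
        if "\u4e00" <= ch <= "\u9fff":  # basic CJK Unified Ideographs block
            if word:
                tokens.append(word)
                word = ""
            tokens.append(ch)
        elif ch.isspace():
            if word:
                tokens.append(word)
                word = ""
        else:
            word += ch
    if word:
        tokens.append(word)
    return tokens


def check_prompt(
    prompt: str,
    tokenize: Callable[[str], List[str]] = naive_tokenize,
    limit: int = MAX_PROMPT_TOKENS,
) -> Tuple[int, bool]:
    """Return (token_count, fits_within_limit) for a prompt."""
    count = len(tokenize(prompt))
    return count, count <= limit


if __name__ == "__main__":
    count, fits = check_prompt("a cat lying down, holding a sign that says 你好")
    print(f"{count} tokens, fits: {fits}")
```

In practice the `tokenize` argument would be replaced by the tokenizer that ships with the model, since real token counts for mixed Chinese and English text will differ from this approximation.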

An AIbase test found that Kolors is currently better at rendering Chinese text in images, where most outputs are correct; English text tends to come out with missing or incorrect characters.

[Image: QQ截图20240708112714.jpg]

[Image: QQ截图20240708111705.jpg]

As the images above show, the "lying down" cat renders its Chinese characters without problems, but when the text is changed to "AIbase", characters go missing. Kolors handles Chinese output well, but the text should be kept short; longer strings are prone to errors.
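A test like this can be made more repeatable by scoring the text that actually appears in the image against the text that was requested. The sketch below assumes the rendered text has already been transcribed (by hand or by an OCR tool, neither of which is shown); `text_render_report` is a hypothetical helper, and it relies only on Python's standard `difflib` module.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Union


def text_render_report(expected: str, rendered: str) -> Dict[str, Union[float, List[str]]]:
    """Compare requested text with the text transcribed from the image.

    `rendered` is assumed to come from manual transcription or OCR.
    """
    matcher = SequenceMatcher(None, expected, rendered, autojunk=False)
    missing: List[str] = []  # characters requested but absent from the image
    wrong: List[str] = []    # characters replaced by something else
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "delete":
            missing.extend(expected[i1:i2])
        elif tag == "replace":
            wrong.extend(expected[i1:i2])
    return {"similarity": matcher.ratio(), "missing": missing, "wrong": wrong}


if __name__ == "__main__":
    # Hypothetical transcription in which one letter was dropped.
    print(text_render_report("AIbase", "AIbse"))
```

Running the same comparison over a batch of Chinese and English prompts would give a rough error rate per language rather than a single anecdotal result.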

[Image: QQ截图20240708112728.jpg]

This model is more than a simple tool; it is backed by Kuaishou's technology. Trained on massive data and specifically optimized for Chinese cultural elements, it produces images with a more distinctly Chinese character. This is not only a technical advance but also a form of cultural inheritance.

The open-source plan also includes support for ControlNet (CN), LoRA (Low-Rank Adaptation), and IP-Adapter (IPA, Image Prompt Adapter), along with direct support for ComfyUI, all intended to make the creative process smoother and more personalized.

Technical Details:

  • "Kolors" is based on the SDXL model architecture and integrates a ChatGLM text encoder with a 256-token context to improve bilingual understanding and text generation.

  • Note that running this model requires a large amount of video memory, about 19 GB, which places real demands on the GPU.
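Before downloading the weights, it can be worth checking whether the local GPU meets the roughly 19 GB requirement quoted above. The following Python sketch is one way to do that under stated assumptions: it treats the article's "19 GB" as 19 GiB, the helper names are hypothetical, and the optional detection step only works if PyTorch with CUDA support is installed.

```python
from typing import Optional

REQUIRED_VRAM_GB = 19.0  # approximate figure quoted in the article (treated as GiB here)


def has_enough_vram(total_bytes: int, required_gb: float = REQUIRED_VRAM_GB) -> bool:
    """Return True if the given amount of GPU memory meets the requirement."""
    return total_bytes / 1024**3 >= required_gb


def detect_vram_bytes() -> Optional[int]:
    """Best-effort query of GPU 0's total memory via PyTorch, if available."""
    try:
        import torch  # optional dependency; not needed for has_enough_vram
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    total = detect_vram_bytes()
    if total is None:
        print("Could not detect a CUDA GPU; check video memory manually.")
    else:
        verdict = "enough" if has_enough_vram(total) else "not enough"
        print(f"GPU 0 has {total / 1024**3:.1f} GiB: {verdict} for ~{REQUIRED_VRAM_GB:.0f} GB")
```

The 19 GB figure is the article's estimate; actual usage may vary with resolution, precision, and any memory-saving options the runtime offers.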

By open-sourcing "Kolors", Kuaishou is both contributing to the technical community and pushing for greater creative freedom. The release demonstrates Kuaishou's commitment to and strength in AI technology, and points to the possibilities of AI in artistic creation.

Official Kolors Website: https://top.aibase.com/tool/kuaishouketudamoxingkolors

Project Address: https://top.aibase.com/tool/kolors