Amid the current wave of AI technology, Kuaishou's large text-to-image model, Kolors, has emerged as a rising star in China's domestic AI scene, thanks to its strong performance and open-source release. Kolors is reported not only to surpass existing open-source models in image generation but also to reach a level comparable to commercial closed-source models, and it quickly sparked discussion on social media.

The Open-Source Journey of Kolors

The open-sourcing of Kolors is not just a technical milestone but also a sign of Kuaishou's open attitude toward AI technology. At the World Artificial Intelligence Conference, Kuaishou announced that Kolors was officially open source, releasing the model weights, the complete code, and a technical report. It is now available on Hugging Face and GitHub for developers worldwide to use for free.

The GitHub page also lays out the open-source roadmap: the inference code and model weights are already released, and LoRA, ControlNet, and other components for Kolors are planned to follow, which is exciting news.

Outstanding Performance of Kolors

Kolors has earned high praise from developers and users for its strong understanding of complex semantics and its photographic-quality textures. In the FlagEval text-to-image evaluation run by Zhiyuan (BAAI), Kolors ranked second globally with a subjective overall score of 75.23, and it ranked first in image quality.

Technological Innovations of Kolors

Kolors pairs a latent diffusion model with a large language model that provides the text representation, which lets it understand long, complex prompts. Through a two-stage progressive training strategy, Kolors reaches an internationally leading level in image aesthetics and quality. Kolors is also described as the first text-to-image model to natively support rendering Chinese text in images, and it is strong at understanding and depicting Chinese elements.

Kolors Deployment with ComfyUI

With the introduction out of the way, you are probably eager to try it out. Let's see how to deploy Kolors locally.

A ComfyUI wrapper that makes Kolors close to a one-click install is already available on GitHub.

GitHub repository: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

Hugging Face model page: https://huggingface.co/Kwai-Kolors/Kolors

First, we copy the project URL.

image.png

After copying it, we install the node through ComfyUI Manager using that URL, and then restart ComfyUI.

image.png
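
If ComfyUI Manager is not available, a custom node like this one can usually be installed by cloning its repository into ComfyUI's custom_nodes folder. The sketch below is only an illustration of that route: it assumes git is on the PATH, the helper names are made up for this example, and the ComfyUI location passed in is hypothetical. The wrapper's own README may list extra dependencies that this sketch does not install.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/kijai/ComfyUI-KwaiKolorsWrapper"


def node_target_dir(comfy_root: Path, repo_url: str = REPO_URL) -> Path:
    """Return the folder the custom node would be cloned into."""
    repo_name = repo_url.rstrip("/").split("/")[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[: -len(".git")]
    return comfy_root / "custom_nodes" / repo_name


def clone_custom_node(comfy_root: Path, repo_url: str = REPO_URL) -> Path:
    """Clone the wrapper into custom_nodes unless the folder already exists."""
    target = node_target_dir(comfy_root, repo_url)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        # Assumes the `git` executable is installed and reachable.
        subprocess.run(["git", "clone", repo_url, str(target)], check=True)
    return target
```

For example, clone_custom_node(Path.home() / "ComfyUI") would clone into ~/ComfyUI/custom_nodes/ComfyUI-KwaiKolorsWrapper, where ~/ComfyUI stands in for your own install folder. ComfyUI still needs a restart afterwards, just as with the Manager route.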

Next, we set up the simplest Kolors text-to-image workflow.

image.png

Once the workflow is set up, we click Queue Prompt; the first run automatically downloads the required main model and text encoder.
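
The same queueing step can also be scripted against ComfyUI's HTTP endpoint. This is a minimal sketch, assuming ComfyUI is running locally on its default address 127.0.0.1:8188 and that the workflow has been exported in API format from the ComfyUI interface. The helper names are hypothetical, and the node contents of the workflow come entirely from your exported file.

```python
import json
import urllib.request

COMFYUI_URL = "http://127.0.0.1:8188"  # assumed default local address


def build_prompt_request(workflow: dict, base_url: str = COMFYUI_URL) -> urllib.request.Request:
    """Wrap an API-format workflow in the JSON body sent to /prompt."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow_path: str, base_url: str = COMFYUI_URL) -> dict:
    """Load an exported workflow file and submit it to the queue."""
    with open(workflow_path, encoding="utf-8") as fh:
        workflow = json.load(fh)
    with urllib.request.urlopen(build_prompt_request(workflow, base_url)) as resp:
        return json.load(resp)
```

Calling queue_workflow("kolors_workflow_api.json"), where the file name is a placeholder for your own export, has the same effect as clicking the button, so the first call may also take a long time while the models download.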

⚠️Note: The models are downloaded from Hugging Face; the main model is about 5 GB and the text encoder is approximately 11 GB, so make sure your internet connection is stable, using a VPN if Hugging Face is not directly reachable from your network.
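
Because the two downloads add up to roughly 16 GB, it can help to check free disk space before queueing the first prompt. The sketch below is one way to do that with the Python standard library; the sizes are the approximate figures from the note above, while the 2 GB safety margin and the function names are assumptions made for this example.

```python
import shutil

MODEL_GB = 5          # approximate size of the main model
TEXT_ENCODER_GB = 11  # approximate size of the text encoder


def required_bytes(margin_gb: float = 2.0) -> int:
    """Total download size plus an assumed safety margin, in bytes."""
    return int((MODEL_GB + TEXT_ENCODER_GB + margin_gb) * 1024**3)


def has_enough_space(path: str = ".", margin_gb: float = 2.0) -> bool:
    """Check whether the drive holding `path` has room for the downloads."""
    free = shutil.disk_usage(path).free
    return free >= required_bytes(margin_gb)
```

Pass the folder where ComfyUI stores its models as path, so that the check runs against the correct drive.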

Finally, the models will be downloaded to this file path:

Troubleshooting Installation Errors

On the first run, we may hit an error saying that the text encoder cannot find a required file.

image.png

The solution is simple: go to the Hugging Face model page, download all the JSON and Python files in the text_encoder folder, and place them in the local text_encoder folder.

image.png

Because the downloaded files are saved under incorrect names, we need to rename them as shown in the figure.

image.png
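
The manual download can also be scripted. The sketch below assumes the third-party huggingface_hub package is installed and uses its snapshot_download function with filename patterns to fetch only the JSON and Python files under text_encoder. The local_dir argument is a placeholder, because the exact local model path is not spelled out here, and the helper names are hypothetical.

```python
from fnmatch import fnmatchcase

REPO_ID = "Kwai-Kolors/Kolors"
PATTERNS = ["text_encoder/*.json", "text_encoder/*.py"]


def is_wanted(repo_path: str) -> bool:
    """True for JSON and Python files under text_encoder/ in the repo."""
    return any(fnmatchcase(repo_path, pattern) for pattern in PATTERNS)


def fetch_text_encoder_configs(local_dir: str) -> str:
    """Download only the matching files; needs `pip install huggingface_hub`."""
    # Imported lazily so the filter above works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=PATTERNS,
        local_dir=local_dir,
    )
```

Files fetched this way are written under local_dir/text_encoder with the names they have in the repository. Whether those names already match what the wrapper expects is not confirmed here, so compare them against the figure before running the workflow again.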

Finally, we also need to download the VAE model and place it in the file path shown below.

image.png

Local VAE file path

image.png

After resolving the above issues, we can use Kolors to generate images. Writing prompts in Chinese works smoothly in our workflow, and the image quality is excellent, with no major problems in how hands are drawn. It also performs well on abstract images, rivaling Midjourney.

image.png

image.png

The Future of Kolors and the Open-Source Community

Amid the turmoil at Stability AI, Kuaishou's open-sourcing of Kolors has become a new focal point for the open-source community. Kuaishou plans to gradually open-source further Kolors application components to enrich the ecosystem, and it looks forward to working with developers worldwide to advance the large text-to-image model community.

Conclusion

Kuaishou's Kolors large model, with its open attitude, high-standard technology, and practical commercial applications, demonstrates the true strength of domestic AI technology. In the ever-evolving world of AI, the open-sourcing and implementation of Kolors reveal the infinite possibilities of combining technology with content forms. With more enterprises and developers joining the Kolors open-source ecosystem, we have reason to believe that this will bring new development opportunities to the entire industry.

------------------------------------------------------------------------------------------

ChinaZ AI Tutorials is the AI drawing tutorial platform under ChinaZ.com

For in-depth learning of more AI drawing tutorials, please visit the ChinaZ AI Tutorials website:

https://aisc.chinaz.com/jiaocheng/