Amid the current wave of AI technology, Kolors, Kuaishou's large text-to-image model, has emerged as a rising star in Chinese AI thanks to its strong performance and open-source release. Kolors is reported to outperform existing open-source models in image generation and to reach a level comparable to commercial closed-source models, which quickly sparked discussion on social media.
The Open-Source Journey of Kolors
The open-sourcing of Kolors is not just a technical milestone but also a sign of Kuaishou's open attitude towards AI technology. At the World Artificial Intelligence Conference, Kuaishou announced that Kolors was officially open source, releasing the model weights, the complete code, and a technical report. It is now available on Hugging Face and GitHub for developers worldwide to use for free.
The open-source roadmap is also published on the GitHub page: the interfaces and the large model are already open-sourced, and LoRA, ControlNet, and other components for Kolors are planned to follow, which is exciting news.
Outstanding Performance of Kolors
Kolors has earned high praise from developers and users for its strong understanding of complex semantics and its photographic-quality textures. In the FlagEval text-to-image model evaluation run by Zhiyuan (BAAI), Kolors ranked second globally with an overall subjective score of 75.23, and it ranked first in image quality.
Technological Innovations of Kolors
Kolors uses a latent diffusion model and relies on a large language model for text representation, which lets it understand long, complex prompts. Through a two-stage progressive training strategy, it reaches an internationally leading level in image aesthetics and quality. Kolors is also presented as the first text-to-image model with native support for generating Chinese text, which shows in how well it understands and renders Chinese elements.
Kolors Deployment with ComfyUI
With the introduction out of the way, you are probably eager to try it out. Let's look at how to deploy Kolors locally.
A one-click deployment for Kolors is already available on GitHub.
GitHub homepage: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper
Hugging Face homepage: https://huggingface.co/Kwai-Kolors/Kolors
First, we copy the project URL.
After copying, we install the node from that URL in the ComfyUI manager and then restart ComfyUI.
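If the manager is not available, a similar result can usually be reached by cloning the repository into ComfyUI's `custom_nodes` folder. The following is only a sketch under assumptions: the ComfyUI location (`~/ComfyUI`) and the function names are hypothetical, `git` is assumed to be on the PATH, and a manual clone does not install any Python dependencies the node may need.

```python
import subprocess
from pathlib import Path

# Project URL copied from the GitHub page given above.
REPO_URL = "https://github.com/kijai/ComfyUI-KwaiKolorsWrapper"


def build_clone_command(comfyui_root: Path, repo_url: str = REPO_URL) -> list[str]:
    """Return the git command that clones the wrapper into custom_nodes."""
    repo_name = repo_url.rstrip("/").split("/")[-1]
    target = comfyui_root / "custom_nodes" / repo_name
    return ["git", "clone", repo_url, str(target)]


def install_wrapper(comfyui_root: Path) -> None:
    """Run the clone; ComfyUI must be restarted afterwards."""
    subprocess.run(build_clone_command(comfyui_root), check=True)


if __name__ == "__main__":
    # Assumed install location; replace with your own ComfyUI folder.
    print(" ".join(build_clone_command(Path.home() / "ComfyUI")))
```

Running the script only prints the command, so it can be checked before `install_wrapper` is called for real.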
Next, we set up the simplest Kolors text-to-image workflow.
Once the workflow is set up, we click Queue Prompt, which automatically downloads the required large model and text encoder.
⚠️Note: The models are downloaded from Hugging Face. The large model is about 5GB and the text encoder about 11GB, so make sure your connection is stable; depending on your network, a VPN may be needed.
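Since the download is roughly 16GB in total, it can be worth checking free disk space before queueing the prompt. A minimal sketch using only the standard library; the 20% safety margin and the function names are assumptions for illustration, and the sizes are the approximate figures from the note above.

```python
import shutil

# Approximate sizes quoted in the note above, in GB.
MODEL_GB = 5
TEXT_ENCODER_GB = 11


def required_bytes(margin: float = 1.2) -> int:
    """Total download size in bytes, padded by an assumed safety margin."""
    return int((MODEL_GB + TEXT_ENCODER_GB) * 1024**3 * margin)


def has_enough_space(path: str = ".", margin: float = 1.2) -> bool:
    """Check whether the drive holding `path` has room for the download."""
    free = shutil.disk_usage(path).free
    return free >= required_bytes(margin)


if __name__ == "__main__":
    need_gb = required_bytes() / 1024**3
    status = "ok" if has_enough_space() else "not enough free space"
    print(f"need about {need_gb:.1f} GB: {status}")
```

Pass the folder where ComfyUI stores its models as `path` so the check runs against the right drive.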
Finally, the models will be downloaded to this file path:
Troubleshooting Installation Errors
The first time we download and run it, we may hit an error saying that the text encoder cannot find a file.
The fix is simple: go to the Hugging Face project page, download all the JSON and Python files in the text_encoder folder, and place them in our local text_encoder folder. Because the downloaded files come with the wrong names, we need to rename them as shown in the figure.
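The same files can also be fetched with a script instead of by hand. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function with `allow_patterns`; the repository ID is taken from the Hugging Face URL above, while `local_dir` and the helper names are placeholders. Files fetched this way should keep their repository names, so the manual renaming step may not be needed, though that is not verified here.

```python
from fnmatch import fnmatch

REPO_ID = "Kwai-Kolors/Kolors"
# Only the JSON and Python files of the text_encoder folder.
PATTERNS = ["text_encoder/*.json", "text_encoder/*.py"]


def wanted(filenames, patterns=PATTERNS):
    """Keep only the repo file paths that match one of the patterns."""
    return [f for f in filenames if any(fnmatch(f, p) for p in patterns)]


def fetch_text_encoder_files(local_dir: str) -> str:
    """Download the matching files into local_dir/text_encoder/."""
    # Imported lazily so the filter above works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=PATTERNS,
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Illustrative file names only, not a listing of the real repository.
    sample = ["text_encoder/config.json", "text_encoder/weights.bin", "vae/config.json"]
    print(wanted(sample))
```

`snapshot_download` keeps the repository layout, so the files land in a `text_encoder` subfolder of `local_dir`; copy them from there into the local text_encoder folder used by the workflow.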
Finally, we also need to download the VAE model and place it in the file path shown below.
Local VAE file path
With these issues resolved, we can use Kolors to generate images. Chinese prompts work smoothly in the workflow, and the image quality is excellent, with no major problems with hands. It also handles abstract images well, rivaling Midjourney.
The Future of Kolors and the Open-Source Community
During the turmoil at Stability AI, Kuaishou's open-sourcing of Kolors has become a new focal point for the open-source community. Kuaishou plans to gradually open-source related application components of Kolors to enrich its open-source ecosystem and looks forward to collaborating with global developers to advance the development of the large text-to-image model community.
Conclusion
Kuaishou's Kolors large model, with its open attitude, high-standard technology, and practical commercial applications, demonstrates the true strength of domestic AI technology. In the ever-evolving world of AI, the open-sourcing and implementation of Kolors reveal the infinite possibilities of combining technology with content forms. With more enterprises and developers joining the Kolors open-source ecosystem, we have reason to believe that this will bring new development opportunities to the entire industry.
------------------------------------------------------------------------------------------
ChinaZ AI Tutorials is the AI drawing tutorial platform under ChinaZ.com