Tencent's Hunyuan text-to-image model (HunyuanDiT) has released three new ControlNet plugins, developed in collaboration with the community: tile (high-definition upscaling), inpainting (image repair and expansion), and lineart (line drawing to image). The new plugins extend HunyuanDiT's existing ControlNet lineup and let the model cover a broader range of application scenarios, including art, creative design, architecture, photography, cosmetics, and e-commerce, serving 80% of use cases and scenarios worldwide. Businesses, individual developers, and creators gain more precise control over image generation and greater creative freedom.

The Tile plugin adds detail to an image, enabling ultra-high-definition upscaling to 4K or even 8K resolution, which suits scenarios that demand extreme attention to image detail. The Inpainting plugin fills in the regions of an image that the creator has painted over (masked), producing effects such as background replacement and subject alteration, and it can handle redrawing of large areas. The Lineart plugin accepts different line styles to generate images of real people, anime, and architecture, making it suitable for architectural renderings and for coloring sketches.

HunyuanDiT had previously released ControlNet models conditioned on canny (edge), depth, and pose (human posture) inputs. These models are available to developers for inference, and the ControlNet training scheme has been open-sourced as well, so developers and creators can train their own custom ControlNet models.

Since its full upgrade and open-source release in May, HunyuanDiT, the industry's first open-source text-to-image model built on a Chinese-native DiT architecture, has been steadily building its developer ecosystem. It has released a dedicated acceleration library that improves inference efficiency and shortens image generation time, and it has open-sourced further inference code. In July, HunyuanDiT was upgraded to version 1.2, and an open-source version that requires only 6 GB of VRAM was made available, making the model easier for developers to deploy locally on personal computers.

HunyuanDiT currently has more than 3.1k stars on GitHub, making it the most popular open-source DiT model developed in China.

Official Website

https://dit.hunyuan.tencent.com/

Code

https://github.com/Tencent/HunyuanDiT

Model

https://huggingface.co/Tencent-Hunyuan/HunyuanDiT

Paper

https://tencent.github.io/HunyuanDiT/asset/Hunyuan_DiT_Tech_Report_05140553.pdf