ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer
A versatile creator and editor that follows instructions via diffusion transformers
Common Product | Image | Visual Generation | Diffusion Model
ACE is a diffusion transformer-based all-in-one creator and editor that supports joint training of multiple visual generation tasks through a unified input format, the Long-context Condition Unit (LCU). To address the shortage of training data, ACE relies on efficient data collection methods and uses multimodal large language models to generate accurate textual instructions. It shows significant performance advantages in visual generation and makes it possible to build chat systems that respond to any image creation request, avoiding the cumbersome workflows typically employed by visual agents.
ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer, Visits Over Time
Monthly Visits: 1040
Bounce Rate: 47.34%
Pages per Visit: 1.2
Visit Duration: 00:00:08