Orthogonal Finetuning (OFT)
OFT effectively stabilizes text-to-image diffusion models during fine-tuning
Common, Product, Image, Text-to-Image Generation, Image Synthesis
The study 'Controlling Text-to-Image Diffusion' explores how to guide or control powerful text-to-image generation models for downstream tasks. It proposes orthogonal finetuning (OFT), a method that adapts the model while maintaining its generative ability. OFT preserves the hyperspherical energy, which characterizes the pairwise relationships between neurons on the unit hypersphere, and this helps prevent the model from collapsing during fine-tuning. The authors consider two fine-tuning tasks: subject-driven generation and controllable generation. Results show that OFT outperforms existing methods in generation quality and convergence speed.
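To make the idea of preserving pairwise neuron relationships concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It assumes the pretrained weight matrix `W0` stores one neuron per column, keeps `W0` frozen, and trains only an orthogonal matrix `R` that multiplies it. `R` is built here with a Cayley transform of a skew-symmetric matrix, which is one standard way to obtain an orthogonal matrix. The helper names (`skew`, `cayley`, `pairwise_cosines`) and the toy numbers are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(A)
    M = [list(map(float, row)) + e for row, e in zip(A, identity(n))]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def skew(a, b, c):
    """3x3 skew-symmetric matrix from three free (trainable) parameters."""
    return [[0.0, a, b], [-a, 0.0, c], [-b, -c, 0.0]]

def cayley(S):
    """Map a skew-symmetric S to an orthogonal R = (I + S)(I - S)^-1."""
    n = len(S)
    I = identity(n)
    plus = [[I[i][j] + S[i][j] for j in range(n)] for i in range(n)]
    minus = [[I[i][j] - S[i][j] for j in range(n)] for i in range(n)]
    return matmul(plus, inverse(minus))

def pairwise_cosines(W):
    """Cosine of the angle between every pair of neurons (columns of W)."""
    cols = transpose(W)
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    return [
        sum(a * b for a, b in zip(cols[i], cols[j])) / (norms[i] * norms[j])
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    ]

# Toy "pretrained" layer: 3 input dimensions, 4 neurons (columns). Values are made up.
W0 = [[1.0, 0.5, -0.3, 0.2],
      [0.0, 1.0, 0.8, -0.6],
      [0.4, -0.2, 1.0, 0.9]]

R = cayley(skew(0.3, -0.2, 0.5))  # stands in for the learned orthogonal matrix
W = matmul(R, W0)                 # fine-tuned weights: every neuron is rotated by R

before = pairwise_cosines(W0)
after = pairwise_cosines(W)
# The weights change, but the angles between neurons do not (up to rounding).
print(max(abs(x - y) for x, y in zip(before, after)))
```

Because `R` is orthogonal, the inner products between neurons are unchanged, so any quantity defined purely from their pairwise angles on the unit hypersphere is unchanged as well. Setting all skew-symmetric parameters to zero gives `R` equal to the identity, so in this sketch fine-tuning can start exactly from the pretrained weights.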