Multi-reward Reinforcement Learning Framework for Text-to-Image Generation
Parrot is a multi-reward reinforcement learning framework designed for text-to-image (T2I) generation. During reinforcement learning optimization of the T2I model, it uses batch-wise Pareto-optimal selection to automatically find the best trade-off among different rewards. Parrot also jointly optimizes the T2I model and a prompt expansion network, which encourages the generation of quality-aware text prompts and further improves final image quality. Because prompt expansion can cause catastrophic forgetting of the original user prompt, Parrot introduces original-prompt-centered guidance at inference time so that generated images stay faithful to the user's input. Extensive experiments and user studies show that Parrot outperforms several baseline methods across multiple quality criteria, including aesthetics, human preference, image sentiment, and text-image alignment.
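To make the batch-wise Pareto-optimal selection idea concrete, here is a minimal sketch of a non-dominated filter over a batch of per-image reward vectors. It is an illustration only, not Parrot's actual implementation: the function names `dominates` and `pareto_front`, the reward ordering, and the example scores are all hypothetical, and the description above does not specify how the selected samples are used during training.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every reward
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(rewards: List[Sequence[float]]) -> List[int]:
    """Return indices of samples in the batch that no other sample dominates."""
    return [
        i
        for i, r in enumerate(rewards)
        if not any(dominates(other, r) for j, other in enumerate(rewards) if j != i)
    ]


# Hypothetical rewards per generated image, assumed to be
# (aesthetics, human preference, text-image alignment).
batch_rewards = [
    (0.9, 0.2, 0.5),  # strong aesthetics, weak preference
    (0.6, 0.6, 0.6),  # balanced
    (0.5, 0.5, 0.5),  # dominated by the balanced sample
    (0.9, 0.2, 0.4),  # dominated by the first sample
]

selected = pareto_front(batch_rewards)
print(selected)  # [0, 1]
```

One plausible use of such a filter is to let only the non-dominated samples in each batch contribute to the policy update, so that no single reward is improved at the expense of all the others. This pairwise check is quadratic in batch size, which is usually acceptable for the small batches typical of image generation.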