A collaboration between Peking University, Stanford University, and Pika Labs has introduced RPG, a new open-source text-to-image framework that leverages multimodal large language models (MLLMs) to tackle two major challenges in text-to-image generation. Its core strategy decomposes the text prompt into sub-prompts, partitions the image space into sub-regions, and generates each sub-region independently, marking a notable advance for the field.
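The decompose-plan-generate pipeline described above can be sketched in plain Python. This is a minimal illustration, not the actual RPG implementation: the MLLM's prompt decomposition and layout planning are stubbed with simple rules, and all function and class names here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Region:
    # One sub-prompt assigned to a normalized bounding box of the canvas.
    sub_prompt: str
    x0: float
    y0: float
    x1: float
    y1: float

def decompose_prompt(prompt: str) -> list[str]:
    # Stand-in for the MLLM's decomposition step: split a compound
    # prompt into per-object sub-prompts (hypothetical rule).
    return [p.strip() for p in prompt.split(" and ") if p.strip()]

def plan_regions(sub_prompts: list[str]) -> list[Region]:
    # Stand-in for the MLLM's layout planning: tile sub-prompts into
    # equal-width vertical strips of the canvas.
    n = len(sub_prompts)
    return [
        Region(sp, i / n, 0.0, (i + 1) / n, 1.0)
        for i, sp in enumerate(sub_prompts)
    ]

def generate_plan(prompt: str) -> list[Region]:
    # In the real framework each region would be denoised with its own
    # sub-prompt and the latents merged; here we only return the plan.
    return plan_regions(decompose_prompt(prompt))

plan = generate_plan("a red fox and a blue lake")
for r in plan:
    print(r.sub_prompt, (r.x0, r.y0, r.x1, r.y1))
```

Running the sketch on a two-object prompt yields two sub-prompts, each mapped to half of the canvas, which mirrors the paper's idea of letting a planner assign prompt fragments to image sub-regions before generation.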