CogView3 is a text-to-image generation system built on a cascaded diffusion framework. It decomposes high-resolution image generation into multiple stages: a low-resolution image is generated first, Gaussian noise is added to that output, and the next stage starts its diffusion process from the noisy image rather than from pure noise. CogView3 is reported to surpass SDXL, with faster generation and higher image quality.
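
The hand-off between stages described above can be illustrated with a minimal sketch. It is not CogView3's actual implementation: the nearest-neighbour upsampling, the standard forward-diffusion noising formula, and the names `upsample_nearest`, `noisy_start`, and `alpha_bar` are all assumptions chosen for illustration, and a random array stands in for a real low-resolution output.

```python
import numpy as np

def upsample_nearest(image, factor):
    """Nearest-neighbour upsampling of an (H, W, C) array (illustrative choice)."""
    return image.repeat(factor, axis=0).repeat(factor, axis=1)

def noisy_start(low_res, factor, alpha_bar, rng):
    """Build a starting point for a higher-resolution stage.

    Assumes the common forward-diffusion form
        x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps,
    where x_0 is the upsampled low-resolution output and eps is
    standard Gaussian noise. alpha_bar is a hypothetical noise-level
    parameter in [0, 1]; smaller values mean more noise.
    """
    x0 = upsample_nearest(low_res, factor)
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

rng = np.random.default_rng(0)
# Random stand-in for the output of a low-resolution stage.
low_res = rng.uniform(-1.0, 1.0, size=(8, 8, 3))
x_start = noisy_start(low_res, factor=2, alpha_bar=0.5, rng=rng)
print(x_start.shape)  # (16, 16, 3)
```

In this sketch the later stage would denoise `x_start` instead of a pure-noise sample, so it only has to traverse part of the noise schedule. That is one plausible reading of where the speed advantage of a staged design comes from.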