Flexible Image Transformer (FiT) is a Transformer architecture for image generation, designed to produce images at unrestricted resolutions and aspect ratios. Rather than treating an image as a fixed-resolution grid, FiT treats it as a sequence of dynamically sized image patches (tokens), which lets the same model adapt to different resolutions. Combined with a carefully designed network structure and extrapolation techniques that require no additional training, this gives FiT considerable flexibility when generating at resolutions beyond those seen in training. The article also covers recent advances in related large-scale models and generative model frameworks.
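To make the token view concrete, here is a minimal sketch of how images of different shapes can become token sequences of different lengths. It is an illustration under stated assumptions, not FiT's actual code: the function names `patchify` and `pad_tokens`, the patch size of 16, and the zero-padding plus boolean mask scheme are all hypothetical choices made for this example.

```python
import numpy as np


def patchify(image, patch_size=16):
    """Split an (H, W, C) image into a (num_tokens, patch_size * patch_size * C) array.

    Hypothetical helper: the patch size and row-major token order are assumptions.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("height and width must be multiples of patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, p, cols, p, C) -> (rows, cols, p, p, C) -> one flat vector per patch
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def pad_tokens(tokens, max_tokens):
    """Zero-pad a token sequence to max_tokens and return (padded, mask).

    Assumed batching scheme: mask is True for real tokens, False for padding.
    """
    n, d = tokens.shape
    if n > max_tokens:
        raise ValueError("sequence is longer than max_tokens")
    padded = np.zeros((max_tokens, d), dtype=tokens.dtype)
    padded[:n] = tokens
    mask = np.zeros(max_tokens, dtype=bool)
    mask[:n] = True
    return padded, mask


if __name__ == "__main__":
    # Three images with different resolutions and aspect ratios.
    for shape in [(256, 256, 3), (128, 512, 3), (64, 192, 3)]:
        tokens = patchify(np.zeros(shape, dtype=np.float32))
        padded, mask = pad_tokens(tokens, max_tokens=256)
        print(shape, "->", tokens.shape[0], "tokens, padded to", padded.shape[0])
```

The point of the sketch is that the sequence length depends on the image shape, while the per-token dimension stays fixed. A Transformer consuming `padded` would use `mask` to ignore the padding positions, so square and wide images can share one batch without cropping or resizing.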