Step1X-Edit is a practical, general-purpose image editing framework. It uses the image understanding capabilities of multimodal large language models (MLLMs) to parse editing instructions and generate editing tokens, which a diffusion transformer (DiT) network then decodes into images. Its significance lies in meeting the editing needs of real users, which makes image editing more convenient and flexible.
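
The two-stage data flow described above (instruction parsing into editing tokens, then decoding into an image) can be illustrated with a minimal sketch. This is not the Step1X-Edit implementation or its API: `EditPipeline`, `toy_mllm`, and `toy_dit` are hypothetical names, and the two toy functions are trivial placeholders standing in for the real MLLM and DiT so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List

# Toy stand-ins for real tensors (assumptions for illustration only).
Image = List[List[float]]   # grayscale pixels in [0, 1]
Tokens = List[float]        # "editing tokens" produced by the MLLM stage


@dataclass
class EditPipeline:
    """Hypothetical wrapper showing the MLLM -> editing tokens -> DiT flow."""
    mllm: Callable[[Image, str], Tokens]   # parses image + instruction
    dit: Callable[[Image, Tokens], Image]  # decodes tokens into an image

    def edit(self, image: Image, instruction: str) -> Image:
        tokens = self.mllm(image, instruction)  # stage 1: understand the edit
        return self.dit(image, tokens)          # stage 2: render the edit


def toy_mllm(image: Image, instruction: str) -> Tokens:
    """Placeholder 'MLLM': maps a keyword to a single brightness token."""
    text = instruction.lower()
    if "brighten" in text:
        return [0.25]
    if "darken" in text:
        return [-0.25]
    return [0.0]


def toy_dit(image: Image, tokens: Tokens) -> Image:
    """Placeholder 'DiT': shifts every pixel by the token sum, clamped to [0, 1]."""
    shift = sum(tokens)
    return [[min(1.0, max(0.0, p + shift)) for p in row] for row in image]


pipeline = EditPipeline(mllm=toy_mllm, dit=toy_dit)
edited = pipeline.edit([[0.5, 0.875]], "Brighten the image")
```

The point of the sketch is only the separation of concerns: the first stage turns an image and a free-form instruction into a compact conditioning signal, and the second stage consumes that signal together with the source image to produce the result.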