LLaDA is a diffusion-based language model that generates text through an iterative denoising process rather than the left-to-right decoding of traditional autoregressive models. It demonstrates strong scalability, instruction following, in-context learning, dialogue ability, and compression performance. Developed by researchers from Renmin University of China and Ant Group, this 8B-parameter model is trained entirely from scratch. Its main advantage is flexible text generation through the diffusion process, which supports a range of language tasks such as mathematical problem solving, code generation, translation, and multi-turn dialogue. LLaDA points to a new direction for language model development, particularly in generation quality and flexibility.
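To make the contrast with autoregressive decoding concrete, here is a minimal, hypothetical sketch of masked-diffusion-style generation: the sequence starts fully masked, and each step fills in the most confident positions while leaving the rest masked for later refinement. The `fake_predict` function is a stand-in for the model and is not part of LLaDA's actual API; this illustrates only the general idea of iterative unmasking.

```python
# Illustrative sketch of masked-diffusion text generation (NOT LLaDA's real API).
import random

MASK = "[MASK]"

def fake_predict(tokens):
    # Stand-in for the model: returns {position: (token, confidence)}
    # for every currently masked position.
    vocab = ["the", "cat", "sat", "on", "mat"]
    return {i: (random.choice(vocab), random.random())
            for i, t in enumerate(tokens) if t == MASK}

def diffusion_generate(length=5, steps=5):
    tokens = [MASK] * length
    for step in range(steps):
        preds = fake_predict(tokens)
        if not preds:
            break
        # Unmask a fraction of positions per step, highest confidence first,
        # so earlier-revealed tokens can condition later predictions.
        k = max(1, len(preds) // (steps - step))
        for i, (tok, _) in sorted(preds.items(),
                                  key=lambda kv: kv[1][1], reverse=True)[:k]:
            tokens[i] = tok
    return tokens

print(diffusion_generate())
```

Unlike autoregressive generation, every position can in principle be revised in parallel at each step; the confidence-based schedule here is one simple choice among many possible unmasking strategies.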