On the Latent Space podcast, Meta scientist Thomas Scialom shared behind-the-scenes details of how Llama 3.1 was built and previewed the upcoming Llama 4.

The birth of Llama 3.1 reflects a careful balance between parameter scale, training time, and hardware constraints. Its 405B parameter count was not an arbitrary choice but a deliberate bid by Meta to compete head-on with GPT-4o. Although hardware requirements mean Llama 3.1 cannot run on every household computer, Meta is betting that the open-source community will find ways to put it to work.

During the development of Llama 3.1, Scialom and his team revisited scaling laws. They concluded that while model size matters, the total number of training tokens matters at least as much, so Llama 3.1 was trained on more tokens even though that demanded more compute.
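
As a rough illustration of that trade-off, the sketch below applies the widely used approximation that training compute is about six FLOPs per parameter per token (C ≈ 6ND). The 405B-parameter and 15T-token figures are the ones publicly cited for Llama 3.1; the per-GPU throughput is an assumed number for illustration only, not Meta's.

```python
# Back-of-envelope training-compute sketch using the common approximation
# C ~= 6 * N * D (FLOPs ~= 6 * parameters * training tokens).
# Parameter and token counts are the publicly stated Llama 3.1 figures;
# the per-GPU throughput below is an illustrative assumption.

N = 405e9          # parameters
D = 15e12          # training tokens
flops = 6 * N * D  # total training FLOPs under the C ~= 6ND rule of thumb

gpu_flops_per_s = 400e12  # assumed sustained throughput per GPU (hypothetical)
gpu_seconds = flops / gpu_flops_per_s
gpu_days = gpu_seconds / 86400

print(f"Estimated training compute: {flops:.2e} FLOPs")
print(f"Roughly {gpu_days:,.0f} GPU-days at the assumed throughput")
```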

Llama 3.1 did not make revolutionary changes to its architecture; instead, Meta invested heavily in the scale and quality of the training data. Roughly 15T tokens of training data give Llama 3.1 a qualitative leap in the depth and breadth of its knowledge.

On data selection, Scialom is blunt: much of the text on the open internet is low quality, and the real gold is synthetic data. Llama 3.1's post-training used no human-written answers at all, relying entirely on synthetic data generated by Llama 2.
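
The exact pipeline was not detailed on the podcast, but one common recipe for building such data is rejection sampling: generate several candidate answers, score them with a reward model, and keep only the best. The snippet below is a toy sketch of that general idea with stand-in functions; it is not Meta's actual process.

```python
import random

# Toy sketch of rejection-sampling-style synthetic data generation:
# sample several candidate answers, score each with a reward model,
# and keep the highest-scoring one as a training example.
# Both helpers are stand-ins, not a real model or API.

def generate_candidates(prompt: str, n: int = 8) -> list[str]:
    # Stand-in for sampling n answers from a generator model.
    return [f"candidate answer {i} to: {prompt}" for i in range(n)]

def reward_score(prompt: str, answer: str) -> float:
    # Stand-in for a learned reward model's scalar score.
    return random.random()

def build_synthetic_dataset(prompts: list[str]) -> list[dict]:
    dataset = []
    for prompt in prompts:
        candidates = generate_candidates(prompt)
        best = max(candidates, key=lambda a: reward_score(prompt, a))
        dataset.append({"prompt": prompt, "answer": best})
    return dataset

if __name__ == "__main__":
    print(build_synthetic_dataset(["Explain scaling laws in one sentence."]))
```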

Model evaluation remains one of the hardest problems in the field. For Llama 3.1 the team experimented with a range of evaluation and improvement methods, including reward models and diverse benchmarks. The real difficulty, however, is finding prompts that can still expose the weaknesses of an already powerful model.
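
One common setup, sketched below under the assumption of a reward model acting as judge, compares two models prompt by prompt, reports a win rate, and collects the prompts that the stronger model still loses on; those "defeating" prompts are exactly what is hard to find. All three callables are placeholders, not any real API.

```python
from typing import Callable

def win_rate_and_failures(
    prompts: list[str],
    model_a: Callable[[str], str],            # model under test
    model_b: Callable[[str], str],            # baseline model
    judge: Callable[[str, str, str], float],  # >0 means model_a's answer wins
):
    # Compare the two models on every prompt and keep the ones model_a loses.
    wins, failures = 0, []
    for p in prompts:
        margin = judge(p, model_a(p), model_b(p))
        if margin > 0:
            wins += 1
        else:
            failures.append(p)  # prompts that "defeat" the model under test
    return wins / len(prompts), failures

if __name__ == "__main__":
    rate, hard_prompts = win_rate_and_failures(
        ["2+2?", "Capital of France?"],
        model_a=lambda p: "a placeholder answer",
        model_b=lambda p: "another placeholder answer",
        judge=lambda p, a, b: len(a) - len(b),  # toy judge, not a reward model
    )
    print(rate, hard_prompts)
```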

Meta began training Llama 4 in June, and this time the focus is on agent capabilities. Work on agent tools such as Toolformer signals Meta's next direction of exploration in AI.
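
Toolformer's core idea is that the model emits inline API calls inside its own text, which a thin runtime detects, executes, and splices back into the output. The toy sketch below illustrates only that general idea; it is not Llama 4's agent stack or Meta's implementation.

```python
import re

# Toy illustration of Toolformer-style inline tool calls: the model writes
# markup like "[Calculator(12*7)]" in its output, and a runtime replaces it
# with the tool's result. Simplified sketch of the concept only.

TOOLS = {
    # Restricted eval as a toy calculator; never eval untrusted input in real code.
    "Calculator": lambda expr: str(eval(expr, {"__builtins__": {}}, {})),
}

CALL_PATTERN = re.compile(r"\[(\w+)\(([^)]*)\)\]")

def execute_tool_calls(text: str) -> str:
    def run(match: re.Match) -> str:
        tool, arg = match.group(1), match.group(2)
        return TOOLS[tool](arg) if tool in TOOLS else match.group(0)
    return CALL_PATTERN.sub(run, text)

print(execute_tool_calls("The answer is [Calculator(12*7)]."))
# -> "The answer is 84."
```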

The open-source release of Llama 3.1 is both a bold move by Meta and a statement about where AI is headed. With Llama 4 on the way, there is good reason to expect Meta to keep pushing the frontier; it will be worth watching how Llama 4 and agent technology redefine what AI can do.