In Kadrey v. Meta, one of the ongoing artificial intelligence copyright cases against Meta, internal communications revealed the fierce competition and potential copyright issues the company faced while developing Llama3. Meta's top executives and researchers were fully committed to surpassing companies like OpenAI and Anthropic in AI model development, treating those rivals' GPT-4 and Claude as the gold standard to beat.

Intense Competitive Spirit: Meta's Pledge to Outperform OpenAI

According to Ahmad Al-Dahle, Meta's Vice President of Generative AI, the company's goal in developing Llama3 was aimed squarely at GPT-4, and Meta was determined to gain an edge in the AI race through large-scale hardware support, such as 64k GPUs. Al-Dahle wrote in an internal message, "We will launch 64k GPUs! We need to learn how to build cutting-edge technology and win this race."

Although Meta releases open AI models, its executives were more focused on defeating competitors like OpenAI and Anthropic, which typically do not share their model weights publicly and instead place their models behind APIs.

Disdain for Mistral and Internal Anxiety

French AI startup Mistral is one of Meta's biggest public competitors, but Meta executives clearly looked down on it. In a message, Al-Dahle stated, "Mistral is insignificant to us; we should be able to do better." The remark shows both the anxiety within Meta over AI competition and the company's strong ambition in the industry.

At the same time, Meta's AI leaders frequently discussed in internal communications how they were actively acquiring data to train Llama3, with some indications that executives were very optimistic about the model. One executive even wrote in a message, "Llama3 is what I truly care about."

Copyright Issues and Legal Challenges

As Meta raced to develop Llama3 amid fierce competition, the training data it used began to spark legal controversy. The plaintiffs in the case allege that Meta executives may have cut corners in their rush to develop AI and used copyrighted books as training data.

In a message, researcher Hugo Touvron said that the mix of datasets used to train Llama2 was "terrible," and suggested improving Llama3 by using better data. The messages also show discussion of clearing obstacles to using the LibGen dataset, which includes copyrighted works from publishers like Cengage Learning, Macmillan Learning, McGraw Hill, and Pearson Education.

Despite the copyright issues, Meta CEO Mark Zuckerberg emphasized that Meta will continue to advance the Llama models and narrow the gap with closed models from companies like OpenAI and Google.

Meta's Future Outlook and Llama3's Market Position

In July 2024, Zuckerberg stated that Llama3's performance had reached parity with the most advanced AI models and was leading in certain areas. He anticipated that, starting in 2025, Meta's Llama series would become the industry's most advanced AI models. However, Llama3 still faces escalating copyright lawsuits, particularly because of the legal scrutiny surrounding its training data.