As large models such as ChatGPT continue to gain popularity, high-quality training data may run short as early as 2026. To address the shortage of training data for developing GPT-5, OpenAI has established a "Data Alliance" to collect private, ultra-long text, video, audio, and other data. Research indicates that high-quality training data is crucial to how accurately large models learn, and that a lack of it could degrade the quality of AI-generated content. If the supply of such data is exhausted by 2026, the iterative development of large models will face significant challenges.