On April 8th, 2025, NVIDIA launched Llama 3.1 Nemotron Ultra 253B, an open-source model optimized from Llama-3.1-405B. With 25.3 billion parameters, it surpasses Meta's Llama 4 Behemoth and Maverick, becoming a focal point in the AI field. This model demonstrates superior performance in benchmarks such as GPQA-Diamond, AIME 2024/25, and LiveCodeBench, achieving inference throughput comparable to DeepSeek.