xAI Grok-2 Ranks Second in Chatbot Leaderboard, Closely Following GPT-4o

AIbase基地

Published in AI News · 3 min read · Aug 26, 2024

The Grok-2 and Grok-2-Mini models from xAI have officially joined the LMSYS Chatbot Arena leaderboard. Grok-2 took second place, outperforming OpenAI's GPT-4o (May) and tying with the latest Gemini model, based on more than 6,000 community votes.

Grok-2 is particularly strong in mathematical tasks, where it ranks first, and it placed second in several other categories, including complex prompts, programming, and instruction following. Grok-2-Mini entered the rankings in fifth place.

Grok-2-Mini has also become significantly faster and now runs at twice its previous speed. The improvement comes from xAI's inference team, which completely rewrote the inference stack using SGLang to achieve more efficient multi-host inference and higher precision. The team also introduced new computation and communication kernel algorithms, along with improved batch scheduling and quantization techniques, further improving the model's overall performance.

Some remain skeptical of Grok-2's ranking and consider OpenAI's GPT-4o superior, but many users report that Grok-2 performs very well in practice on programming and mathematical tasks. The Grok-2 series was released in beta this month, and users can try the models on the X platform. The model also supports image creation through the FLUX.1 image generation model.

Key Points:
✨ Grok-2 ranks second on the LMSYS Chatbot Arena leaderboard, surpassing GPT-4o (May) and tying with Gemini.
🚀 Grok-2 takes first place in mathematical tasks and second place in several other categories.
💡 Grok-2-Mini now runs at twice its previous speed.

This article is from AIbase Daily

—— Created by the AIbase Daily Team
