Recently, a seemingly straightforward question, "Is 9.11 larger than 9.9?", has drawn widespread attention, because nearly all large language models (LLMs) get it wrong. The phenomenon caught the attention of AI expert Andrej Karpathy, who uses it as a starting point to examine the fundamental flaws of current large models and directions for improvement.
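One common hypothesis for the error is that "9.11" can be read two ways: as a decimal it is smaller than 9.9, but as a software version number it is larger. A minimal Python sketch of the two readings (illustrative only; this is a hypothesis about the failure, not a confirmed mechanism):

```python
def as_version(s: str) -> tuple[int, ...]:
    # Read "9.11" as a version number: (9, 11)
    return tuple(int(part) for part in s.split("."))

a, b = "9.11", "9.9"
print(as_version(a) > as_version(b))  # True  -- version 9.11 comes after 9.9
print(float(a) > float(b))            # False -- 9.11 < 9.9 as decimals
```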

Karpathy calls this phenomenon "jagged intelligence," or uneven intelligence: the most advanced LLMs can perform highly complex tasks, such as solving difficult mathematical problems, yet fail on some seemingly trivial questions. Their capabilities are uneven, with a profile shaped less like a smooth curve than a jagged edge.

For instance, OpenAI researcher Noam Brown found that LLMs play Tic-Tac-Toe poorly, failing to make the correct move even when the user is one move away from winning. Karpathy attributes this to the model making "unjustified" decisions, while Noam suggests it may reflect a lack of relevant strategy discussion in the training data.
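The decision the models miss is mechanically simple. Below is a minimal sketch of a blocking-move check for a 3x3 board; the board encoding and function name are my own illustration, not from any cited implementation:

```python
# Board cells are "X", "O", or "" (empty), indexed 0-8, row by row.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def blocking_move(board: list[str], opponent: str) -> int | None:
    """Return the index that stops `opponent` from completing a line."""
    for line in LINES:
        marks = [board[i] for i in line]
        if marks.count(opponent) == 2 and marks.count("") == 1:
            return line[marks.index("")]
    return None

# "X" threatens the top row; the only correct reply is cell 2.
board = ["X", "X", "", "O", "", "", "", "", ""]
print(blocking_move(board, "X"))  # 2
```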

Another example is LLMs miscounting letters in a word. Even the latest Llama 3.1 can give wrong answers to such simple questions. Karpathy explains that this stems from the LLM's lack of "self-awareness": the model cannot distinguish what it can and cannot do, leading to a "mystifying confidence" when facing tasks.
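The task itself is trivial in code, which is what makes the failure striking. Using the widely cited "strawberry" question as a stand-in (the specific word is my choice for illustration):

```python
# Exact counting is a one-liner for a program, yet a known stumbling
# block for token-based LLMs, which do not see individual letters.
word = "strawberry"
print(word.count("r"))  # 3
```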

To address this issue, Karpathy points to the approach described in Meta's Llama 3.1 paper. The paper argues that alignment in the post-training phase should develop the model's self-awareness, so that it knows what it knows, and that merely adding factual knowledge cannot eradicate hallucinations. The Llama team proposes a training method called "knowledge probing," which encourages the model to answer only questions it knows about and to refuse to generate uncertain answers.
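A hedged sketch of what such a knowledge-probing loop could look like: probe the model with factual questions, check whether its own sampled answers are consistently correct, and build finetuning targets that teach it to refuse otherwise. The helper callables (`generate`, `is_correct`) and the threshold below are placeholders of mine, not APIs or values from the paper:

```python
from typing import Callable

REFUSAL = "I'm not sure about that."

def build_probe_example(
    question: str,
    reference_answer: str,
    generate: Callable[[str], list[str]],    # sample several answers from the model
    is_correct: Callable[[str, str], bool],  # judge an answer against the reference
    min_accuracy: float = 0.5,               # assumed threshold, not from the paper
) -> tuple[str, str]:
    """Return a (question, target) finetuning pair.

    If the model's own sampled answers are mostly correct, keep the
    reference answer as the target; otherwise make the target a refusal,
    so training discourages confident guessing beyond its knowledge.
    """
    samples = generate(question)
    accuracy = sum(is_correct(s, reference_answer) for s in samples) / len(samples)
    target = reference_answer if accuracy >= min_accuracy else REFUSAL
    return question, target

# Toy demo with stand-in callables; a real pipeline would query the model.
pair = build_probe_example(
    "What year was the Eiffel Tower completed?",
    "1889",
    generate=lambda q: ["1889", "1887", "1889"],
    is_correct=lambda ans, ref: ans == ref,
)
print(pair)  # ('What year was the Eiffel Tower completed?', '1889')
```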

Karpathy believes that although current AI capabilities have many such issues, they do not constitute fundamental flaws, and feasible solutions exist. In his view, the current training approach amounts to little more than "imitating human labels and scaling up," and further advancing AI intelligence will require more work across the entire development stack.

Until the issue is fully resolved, LLMs used in production should be limited to the tasks they excel at, deployed with the "jagged edges" in mind, and kept under human oversight at all times. This way, we can harness the potential of AI while avoiding the risks posed by its limitations.