Stronger than GPT-4: 2-Billion-Parameter Model Achieves Nearly 100% Accuracy on Arithmetic Tasks

Researchers from Tsinghua University, TAL AI Lab, and Zhipu AI have proposed MathGLM, a 2-billion-parameter language model built to examine how well large language models can perform mathematical reasoning. The model uses a Transformer decoder architecture and is trained on a large-scale arithmetic dataset, which substantially strengthens its arithmetic abilities. In experiments, MathGLM reaches nearly 100% accuracy on a suite of arithmetic tasks, outperforming GPT-4; even a variant with only 100 million parameters surpasses both GPT-4 and ChatGPT. The study also finds that MathGLM's arithmetic ability improves as the parameter count grows, and that it outperforms GPT-4 and ChatGPT on complex mixed arithmetic operations involving intricate number formats. The work suggests that, given sufficient parameters and training data, language models can perform complex mathematical operations accurately.
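To make the reported metric concrete, below is a minimal, self-contained sketch of how accuracy on mixed arithmetic expressions of this kind can be measured: generate random expressions, compute exact answers, and score a model's textual answers against them. This is not the authors' evaluation code; the names `make_expression`, `arithmetic_accuracy`, and the `ask_model` callback are illustrative stand-ins for whatever data pipeline and inference call are actually used.

```python
import random

def make_expression(num_operands=4, max_value=10_000):
    """Random mixed arithmetic expression, e.g. '417 + 38 * 6 - 905'."""
    tokens = [str(random.randint(1, max_value))]
    for _ in range(num_operands - 1):
        tokens.append(random.choice(['+', '-', '*', '/']))
        tokens.append(str(random.randint(1, max_value)))
    return ' '.join(tokens)

def arithmetic_accuracy(ask_model, n_samples=1000, rel_tol=1e-6):
    """Fraction of generated expressions the model answers (near-)exactly."""
    correct = 0
    for _ in range(n_samples):
        expr = make_expression()
        truth = eval(expr)  # ground truth; operands are >= 1, so no division by zero
        try:
            pred = float(ask_model(expr))  # the model is asked to return a number as text
        except ValueError:
            continue  # an unparseable answer simply counts as wrong
        if abs(pred - truth) <= rel_tol * max(1.0, abs(truth)):
            correct += 1
    return correct / n_samples

if __name__ == "__main__":
    # Sanity check with a perfect "oracle" standing in for a real model call.
    print(arithmetic_accuracy(lambda expr: str(eval(expr))))  # prints 1.0
```

Replacing the oracle lambda with a real inference call would yield an accuracy figure of the kind quoted above; the paper's own benchmark additionally varies operand counts, number formats (e.g., decimals and fractions), and expression complexity.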

Source: 学术头条 (Academic Headlines)