In a research paper on evaluating medical AI, Microsoft appears to have once again "unintentionally" disclosed the parameter sizes of several leading large language models. The paper, published on December 26, not only lists parameter estimates for models from companies including OpenAI and Anthropic, but has also sparked industry discussion about model architecture and technical capability.

According to the paper, OpenAI's o1-preview model has approximately 300 billion parameters, GPT-4o around 200 billion, and GPT-4o-mini only about 8 billion. This contrasts sharply with Nvidia's claim earlier this year that GPT-4 uses a 1.76-trillion-parameter MoE architecture. The paper also puts Claude 3.5 Sonnet at roughly 175 billion parameters.


This is not the first time Microsoft has "leaked" model parameter information in a paper. Last October, a Microsoft paper stated that GPT-3.5-Turbo has 20 billion parameters, only for that figure to be removed in a later revision. The recurring "leaks" have led industry insiders to speculate about whether they are intentional.


It is worth noting that the paper's primary purpose is to introduce MEDEC, a medical-domain benchmark. The research team analyzed 488 clinical notes from three U.S. hospitals to evaluate how well various models identify and correct errors in medical documents. In the results, Claude 3.5 Sonnet led the other models in error detection with a score of 70.16.


The accuracy of these figures has sparked heated discussion in the industry. Some argue that if Claude 3.5 Sonnet really achieves this performance with fewer parameters, it underscores Anthropic's technical strength. Other analysts contend that some of the parameter estimates look plausible when cross-checked against the models' API pricing, as sketched below.
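
The pricing argument is easy to reproduce as a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that API price per output token scales roughly linearly with active parameter count and calibrates against a reference model of known size; the prices and the reference size are hypothetical placeholders, not figures from the paper or any provider's price list.

```python
# Back-of-the-envelope sketch of the pricing argument. Assumption (not from the
# paper): serving cost, and hence API price per output token, scales roughly
# linearly with active parameter count. All prices and the reference size below
# are illustrative placeholders.

def estimate_active_params_b(price_per_mtok: float,
                             ref_price_per_mtok: float,
                             ref_active_params_b: float) -> float:
    """Estimate active parameters (in billions) from relative API pricing."""
    return ref_active_params_b * (price_per_mtok / ref_price_per_mtok)

if __name__ == "__main__":
    # Hypothetical: a hosted 8B open-weight model priced at $0.20 per 1M output tokens.
    ref_price, ref_params = 0.20, 8.0
    # Hypothetical price for the model whose size we want to guess.
    target_price = 0.60
    est = estimate_active_params_b(target_price, ref_price, ref_params)
    print(f"~{est:.0f}B active parameters")
```

This kind of estimate only captures active parameters and ignores margins, batching efficiency, and MoE routing, so it is at best a sanity check rather than evidence.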


Notably, the paper gives estimates for most mainstream models but omits a figure for Google's Gemini. Some analysts suggest this is because Gemini runs on TPUs rather than Nvidia GPUs, making it hard to estimate its parameter count from token generation speed.
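
The speed-based method those analysts allude to can be sketched in a few lines: if decoding is memory-bandwidth bound, every generated token must stream all active weights once, so an observed tokens-per-second figure plus an assumed hardware bandwidth yields a crude parameter estimate. The bandwidth, utilization, and observed speed below are illustrative assumptions on a single accelerator at batch size 1, and the whole calculation presumes known serving hardware, which is exactly what is missing for Gemini on TPUs.

```python
# Rough sketch of the speed-based estimate: for memory-bandwidth-bound decoding,
# tokens/sec ~= effective_bandwidth / (active_params * bytes_per_param).
# Inverting gives a crude parameter estimate. Hardware figures, utilization,
# and the observed speed are assumed placeholders, not measurements, and the
# model is assumed to run on a single accelerator at batch size 1.

def estimate_params_b(tokens_per_sec: float,
                      hbm_bandwidth_gbps: float,
                      utilization: float = 0.5,
                      bytes_per_param: float = 2.0) -> float:
    """Estimate active parameters (in billions) from observed decode speed."""
    effective_bw = hbm_bandwidth_gbps * utilization        # GB/s actually achieved
    gb_per_token = effective_bw / tokens_per_sec           # GB streamed per token
    return gb_per_token * 1e9 / bytes_per_param / 1e9      # -> billions of params

if __name__ == "__main__":
    # Hypothetical: ~100 tok/s observed on an H100-class GPU (~3,350 GB/s HBM).
    print(f"~{estimate_params_b(100, 3350):.0f}B active parameters")
```

On this logic, if the serving hardware's memory bandwidth is unknown or unconventional, as with Gemini on TPUs, the estimate has no reliable anchor.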


As OpenAI continues to move away from openness, core details such as parameter counts are likely to remain a focus of industry attention. This unexpected leak has once again prompted reflection on AI model architecture, technical approaches, and commercial competition.

References:

https://arxiv.org/pdf/2412.19260

https://x.com/Yuchenj_UW/status/1874507299303379428

https://www.reddit.com/r/LocalLLaMA/comments/1f1vpyt/why_gpt_4o_mini_is_probably_around_8b_active/