AIbase

2025-04-14 17:36:48.AIbase

Meta's Llama-4-Maverick Plummets in Rankings, Raising Concerns of Benchmark Manipulation

2025-04-14 09:25:20.AIbase

Kimi-VL and Kimi-VL-Thinking, Open-Source Vision-Language Models, Outperform GPT-4o on Several Benchmarks

2025-04-14 09:15:57.AIbase

AI IQ Revolution! The New GAIA Benchmark Surpasses ARC-AGI

2025-04-11 09:47:08.AIbase

OpenAI Open-Sources BrowseComp: A New Benchmark for Evaluating AI Agent Web Browsing Capabilities

2025-04-11 09:00:39.AIbase

Soaring Costs of Benchmarking Reasoning AI Models: Evaluating One Can Cost Nearly $3,000

2025-04-10 14:35:16.AIbase

ByteDance Open-Sources Multi-SWE-bench to Drive Intelligent Upgrades for Large Model Code

2025-04-10 11:33:10.AIbase

OmniSVG: A New Benchmark in Multimodal Vector Graphic Generation from Fudan University and StepFun (Jieyue Xingchen)

2025-04-10 09:47:04.AIbase

OpenAI Launches Pioneers Program to Redefine AI Model Evaluation

2025-04-09 09:24:39.AIbase

NVIDIA Unveils Llama 3.1 Nemotron Ultra 253B: A New Benchmark in Performance

2025-04-08 09:58:19.AIbase

Mozilla Releases LocalScore: A New Tool to Simplify Benchmarking Local AI Models

2025-04-03 09:31:03.AIbase

OpenAI Releases PaperBench, a Benchmark for Evaluating AI Agents

2025-03-25 10:08:07.AIbase

Tencent's HunYuan-T1 Reasoning Model Matches OpenAI's Top Performance in Benchmark Tests

2025-03-21 11:48:03.AIbase

High School Student Creates AI Model Evaluation Website Using Minecraft

2025-03-21 09:45:00.AIbase

Minecraft Transformed into an AI Arena: High School Student Builds Innovative Model Evaluation Platform

2025-03-17 14:13:59.AIbase

Xiaomi's Large Model Team Achieves Major Breakthrough in Audio Reasoning, Topping International Benchmark

2025-03-17 10:37:36.AIbase

The Video Game Factorio Becomes a New Benchmark for AI Capabilities

2025-03-07 14:35:00.AIbase

Mistral AI Unveils Mistral OCR: A Revolutionary Benchmark in Document Understanding

2025-02-27 17:07:26.AIbase

Kimi k1.6 Model Unveiled: Programming Prowess Surpasses GPT-3, Ushering in a New AI Wave

2025-02-27 10:08:10.AIbase

Alibaba's Open-Source Video Generation Model Wan 2.1 Tops Benchmarks, Runs Smoothly on 4070

2025-02-24 11:26:35.AIbase

OpenAI Employee Publicly Questions xAI: Grok 3 Benchmark Results Are Misleading