k1 Series Reinforcement Learning Model Debuts: Moonshot AI's Kimi Launches Visual Thinking Model

AIbase

Published in AI News · 4 min read · Dec 16, 2024

Moonshot AI today announced the release of k1, a new visual thinking model. Built on reinforcement learning, k1 supports end-to-end image understanding and integrates chain-of-thought reasoning. It extends its capabilities beyond mathematics into other fundamental sciences, including physics and chemistry. In benchmark tests, k1 outperformed leading models such as OpenAI's o1, GPT-4o, and Claude 3.5 Sonnet.

The new model is encouraged to generate more detailed reasoning steps, forming high-quality chains of thought that significantly improve its success rate on complex tasks. k1 integrates image understanding and reasoning in a single model: it processes images supplied by the user directly and derives answers without relying on external OCR or additional vision models, which gives users a smoother interaction experience.

Training k1 proceeds in two stages: pre-training to obtain a base model, followed by reinforcement learning on top of that base model. The base model scored 903 on OCRBench and performed strongly on benchmarks such as MathVista-testmini, MMMU-val, and DocVQA. In the reinforcement learning stage, data quality and learning efficiency were optimized, yielding new breakthroughs in scalability.

Kimi has also built its own standardized test set, Science Vista, which covers image-based questions of varying difficulty in mathematics, physics, and chemistry and will be made available to the industry. Internal testing revealed some limitations in k1, including out-of-distribution generalization and the success rate on complex problems, both of which need improvement. Even so, k1 outperforms other models in visually noisy scenarios, demonstrating exceptional visual recognition capabilities.

The k1 visual thinking model in the Kimi intelligent assistant not only excels in mathematics but also extends to physics and chemistry, covering a broad range of foundational scientific abilities. It also has general capabilities: it can explain and reason about non-mathematical material, such as the content of scientists' manuscripts and the stories behind them.

Kimi's intelligent assistant looks forward to exploring a larger world with users. The k1 model is now live, and users can try it in the latest version of the Kimi Intelligent Assistant mobile app or on the web.

This article is from AIbase Daily
