The Alibaba International AI team recently released a new inference model called Marco-o1, which focuses on open-ended questions rather than only subjects with standard answers, such as programming and mathematics. The research team is exploring whether such models can be applied effectively to areas that are hard to quantify and lack clear reward signals.
Marco-o1's key features are fine-tuning on ultra-long chain-of-thought (CoT) data and using Monte Carlo Tree Search (MCTS) to expand the solution space at fine granularity. The model builds a set of ultra-long CoT data with reflection and self-correction capabilities through self-play plus MCTS, and is trained on this data alongside other open-source datasets. The team also defines mini-steps to expand the solution space further, guiding the model toward better answers.
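The MCTS-guided expansion can be illustrated with a toy sketch. Everything here is a stand-in: in the actual system an LLM proposes candidate mini-steps and a confidence score derived from the model is used as the reward, whereas this sketch uses a fixed proposal list and a hand-written scorer. It is not the team's implementation, only the general search pattern.

```python
import math
import random

class Node:
    """One node in the search tree: a partial chain of mini-steps."""
    def __init__(self, steps, parent=None):
        self.steps = steps
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

def uct(node, c=1.4):
    # Standard UCT: average reward plus an exploration bonus.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits
    )

def mcts(propose, score, max_depth=4, iters=500, seed=0):
    random.seed(seed)
    root = Node([])
    for _ in range(iters):
        # 1. Selection: descend by UCT to a leaf.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        # 2. Expansion: ask for candidate next mini-steps.
        if len(node.steps) < max_depth:
            for step in propose(node.steps):
                node.children.append(Node(node.steps + [step], parent=node))
            node = random.choice(node.children)
        # 3. Evaluation: a confidence score stands in for a full rollout.
        reward = score(node.steps)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read out the most-visited path as the final reasoning chain.
    node = root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
    return node.steps

# Toy demo: the "model" always proposes three candidate mini-steps, and the
# scorer prefers chains made of step "a" (standing in for model confidence).
best = mcts(
    propose=lambda steps: ["a", "b", "c"],
    score=lambda steps: steps.count("a") / len(steps) if steps else 0.0,
)
print(best)
```

The search concentrates visits on high-confidence branches, so the returned chain is dominated by the preferred step; swapping the proposal and scoring functions for real model calls yields the step-level search the article describes.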
In translation tasks, Marco-o1 demonstrated its ability to handle long and complex sentences, which the team describes as the first application of inference-time expansion to machine translation. Some of the CoT data and the current best model have been open-sourced, with more data and models planned for release.
During inference, the model analyzes the question in depth. Asked to count the 'r's in the word 'strawberry', for example, it breaks the word down letter by letter, compares each one, and arrives at the correct answer. In machine translation, its reasoning path identifies the difficult parts of a sentence and translates them word by word, improving overall translation accuracy.
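The letter-by-letter decomposition described above amounts to the following trivial check (ordinary Python, not the model's internal reasoning):

```python
word = "strawberry"

# Walk the word one letter at a time, as the model's chain of thought does,
# tallying each occurrence of 'r'.
count = 0
for position, letter in enumerate(word, start=1):
    if letter == "r":
        count += 1
        print(f"position {position}: '{letter}' -> running count {count}")

print(count)  # 'strawberry' contains 3 r's
```

The point of the example is that the model reaches this result through explicit stepwise reasoning rather than a single-pass guess, which is exactly where many language models go wrong on such questions.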
The research team has also tried the model in other areas, showing that it can tackle general real-world problems. In addition, instruction-following datasets from the MarcoPolo family were mixed into training, improving the model's ability to follow instructions.
Regarding usage, the research team provides inference and fine-tuning code, so users can easily load the model and tokenizer and start chatting or fine-tuning. A GGUF version can also be run directly on ModelScope for a quicker start.
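Loading the model for chat follows the standard Hugging Face transformers pattern. A minimal sketch, using the model id from the links below; the chat-template call and generation settings here are common transformers idioms and assumptions, not the team's exact inference script:

```python
def chat(prompt: str, model_id: str = "AIDC-AI/Marco-o1",
         max_new_tokens: int = 512) -> str:
    # Deferred import so the function can be defined without transformers
    # installed; model weights are downloaded on the first call.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Format the user message with the model's own chat template.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:],
                            skip_special_tokens=True)
```

For example, `chat("How many 'r's are in 'strawberry'?")` should trigger the stepwise reasoning described earlier. Fine-tuning follows the team's provided code rather than this sketch.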
The release of the Marco-o1 model marks an important step for the Alibaba International AI team in the field of inference models, providing new ideas and tools for solving open-ended problems.
ModelScope:
https://modelscope.cn/models/AIDC-AI/Marco-o1
Arxiv:
https://arxiv.org/abs/2411.14405
Github:
https://github.com/AIDC-AI/Marco-o1
Hugging Face:
https://huggingface.co/AIDC-AI/Marco-o1