MinMo
MinMo is a multimodal large language model designed for seamless voice interaction.
MinMo, developed by Alibaba Group's Tongyi Laboratory, is a multimodal large language model with approximately 8 billion parameters, focused on seamless voice interaction. It is trained on 1.4 million hours of diverse speech data through successive alignment stages: speech-to-text alignment, text-to-speech alignment, speech-to-speech alignment, and full-duplex interaction alignment.

MinMo achieves state-of-the-art performance across a range of speech understanding and generation benchmarks while preserving the capabilities of text-based large language models. It supports full-duplex dialogue, enabling simultaneous two-way communication between the user and the system.

In addition, MinMo introduces a novel, streamlined voice decoder that surpasses previous models in speech generation. Its instruction-following ability has been enhanced to control voice generation from user instructions, covering attributes such as emotion, dialect, and speaking rate, as well as mimicking specific voices.

MinMo's speech-to-text latency is approximately 100 milliseconds; its theoretical full-duplex latency is around 600 milliseconds, with measured latency around 800 milliseconds. MinMo aims to overcome the major limitations of earlier multimodal models and to give users a more natural, fluid, and human-like voice interaction experience.
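To make the reported latency figures concrete, a minimal sketch of the latency budget is shown below. The constant values come from the description above; the notion of "overhead" (measured minus theoretical full-duplex latency) is an illustrative calculation, not a metric defined by the MinMo authors.

```python
# Latency figures reported for MinMo (milliseconds), taken from the
# description above. "Overhead" here is an illustrative derived value.
SPEECH_TO_TEXT_MS = 100          # speech-to-text latency
THEORETICAL_DUPLEX_MS = 600      # theoretical full-duplex latency
MEASURED_DUPLEX_MS = 800         # measured full-duplex latency

# Extra delay observed in practice beyond the theoretical budget.
overhead_ms = MEASURED_DUPLEX_MS - THEORETICAL_DUPLEX_MS

print(f"Speech-to-text latency:   {SPEECH_TO_TEXT_MS} ms")
print(f"Full-duplex (theoretical): {THEORETICAL_DUPLEX_MS} ms")
print(f"Full-duplex (measured):    {MEASURED_DUPLEX_MS} ms")
print(f"Overhead vs. theory:       {overhead_ms} ms")
```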
MinMo Visit Over Time

Monthly Visits: 38,651
Bounce Rate: 65.83%
Pages per Visit: 1.5
Visit Duration: 00:01:54