Mini-Omni

An open-source multimodal large language model that supports real-time voice input and streaming audio output.

Mini-Omni is an open-source multimodal large language model for real-time spoken dialogue: it accepts voice input and streams audio output. It converses voice-to-voice end to end, without separate ASR or TTS models. It can also produce speech while it is still processing, generating text and audio simultaneously. Batch inference across its 'Audio-to-Text' and 'Audio-to-Audio' modes further improves performance.

Mini-Omni Visit Over Time

Monthly Visits: 515,580,771
Bounce Rate: 37.20%
Pages per Visit: 5.8
Visit Duration: 00:06:42

Mini-Omni Alternatives