Mini-Omni
An open-source multimodal large language model that supports real-time voice input and streaming audio output.
Common Product, Productivity, Multimodal, Speech Recognition
Mini-Omni is an open-source multimodal large language model for real-time spoken dialogue: it takes voice input and responds with streaming audio output. It provides end-to-end voice-to-voice conversation without separate ASR or TTS models. It can also produce speech while it is still generating, emitting text and audio simultaneously. Batch inference across its 'Audio-to-Text' and 'Audio-to-Audio' modes further improves performance.
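To make "voice output while processing" concrete, here is a minimal Python sketch of a streaming consumer. It does not use Mini-Omni's actual API: `StreamStep`, `fake_generate`, and `consume_stream` are hypothetical names, and the "audio" is just placeholder bytes. It only illustrates text and audio arriving together, step by step, instead of after the full reply is finished.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class StreamStep:
    """One generation step: a text token plus the audio chunk emitted with it."""
    text_token: str
    audio_chunk: bytes


def fake_generate(reply: str) -> Iterator[StreamStep]:
    """Hypothetical stand-in for a model that emits text and audio in parallel.

    Each word is yielded as soon as it is "generated", paired with placeholder
    audio bytes (here simply the UTF-8 encoding of the word).
    """
    for word in reply.split():
        yield StreamStep(text_token=word, audio_chunk=word.encode("utf-8"))


def consume_stream(steps: Iterator[StreamStep]) -> Tuple[str, bytes, List[int]]:
    """Handle each step as it arrives instead of waiting for the whole reply."""
    words: List[str] = []
    audio = bytearray()
    sizes: List[int] = []
    for step in steps:
        words.append(step.text_token)
        audio.extend(step.audio_chunk)  # a real client would play this chunk now
        sizes.append(len(audio))        # audio bytes available after this step
    return " ".join(words), bytes(audio), sizes


text, audio, sizes = consume_stream(fake_generate("hello from a streaming model"))
print(text)
print(sizes)
```

Because the consumer works on a generator, the first audio chunk is available after the first step, which is the property a streaming voice interface relies on to keep perceived latency low.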
Mini-Omni Visits Over Time
Monthly Visits: 494,758,773
Bounce Rate: 37.69%
Pages per Visit: 5.7
Visit Duration: 00:06:29