Mini-Omni
An open-source multimodal large language model that supports real-time voice input and streaming audio output.
Common Product, Productivity, Multimodal, Speech Recognition
Mini-Omni is an open-source multimodal large language model that holds real-time spoken conversations, taking voice input and streaming audio output. It converses speech-to-speech without separate ASR or TTS models. It can also speak while it is still generating, producing text and audio at the same time. Its 'Audio-to-Text' and 'Audio-to-Audio' batch inference modes are used to further improve performance.
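To make the idea of simultaneous text and audio generation with streaming output more concrete, here is a minimal toy sketch in Python. It is not Mini-Omni's API: `Step`, `fake_parallel_decoder`, and `stream_reply` are hypothetical names invented for illustration, and the audio chunks are placeholder bytes rather than real speech. It only shows the general pattern, in which each decoding step carries a text token together with an audio chunk so a client can begin playback before the reply is complete.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class Step:
    """One decoding step: a text token plus the audio chunk produced with it."""
    text_token: str
    audio_chunk: bytes


def fake_parallel_decoder(reply: str, bytes_per_token: int = 4) -> Iterator[Step]:
    """Hypothetical stand-in for a model that emits text and audio together.

    Each word of `reply` is yielded with a placeholder audio chunk, so the
    caller can start consuming audio before the whole reply exists.
    """
    for word in reply.split():
        # Placeholder bytes only; a real model would emit audio codec tokens or samples.
        yield Step(text_token=word, audio_chunk=bytes(bytes_per_token))


def stream_reply(steps: Iterator[Step]) -> Tuple[str, List[bytes]]:
    """Consume steps one at a time, handing off each audio chunk as it arrives."""
    words: List[str] = []
    played: List[bytes] = []
    for step in steps:
        words.append(step.text_token)
        # A real client would write this chunk to an audio device here.
        played.append(step.audio_chunk)
    return " ".join(words), played


text, chunks = stream_reply(fake_parallel_decoder("hello from a streaming model"))
print(text, len(chunks))
```

Because the decoder is a generator, nothing is buffered up front: the consumer sees the first audio chunk as soon as the first step is produced, which is the property that streaming output relies on.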
Mini-Omni Visits Over Time
Monthly Visits
515580771
Bounce Rate
37.20%
Pages per Visit
5.8
Visit Duration
00:06:42