Megrez-3B-Omni
Open-source full-modal understanding model for edge deployment
Common Product | Productivity | Full-modal understanding | Image recognition
Megrez-3B-Omni is a full-modal understanding model developed by Wuwen Xinqiong (Infinigence AI) and built on the large language model Megrez-3B-Instruct. It can analyze and understand three data modalities: image, text, and audio, and has achieved leading results on multiple benchmarks for image understanding, language comprehension, and speech recognition. The model supports Chinese and English voice input as well as multi-turn dialogue; it can answer spoken questions about an input image and return text responses to voice commands.
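To illustrate how an omni-modal model like this is typically queried, below is a minimal sketch using the Hugging Face transformers library. The repository id, the message schema, and the chat method are assumptions based on common conventions for models loaded with trust_remote_code, not confirmed details of Megrez-3B-Omni's published API.

```python
# Hypothetical sketch: loading an omni-modal chat model with transformers.
# The repo id, message schema, and model.chat() call are assumptions,
# not a confirmed part of Megrez-3B-Omni's actual interface.
import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "Infinigence/Megrez-3B-Omni"  # assumed repository id

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,        # model ships its own multimodal code
    torch_dtype=torch.bfloat16,
).eval().cuda()

# Assumed message format: one user turn may carry text, an image path, and an audio path.
messages = [
    {
        "role": "user",
        "content": {
            "text": "What is shown in this picture?",
            "image": "./example_image.jpg",
            "audio": "./example_question.wav",
        },
    }
]

# Assumed chat-style entry point exposed by the remote code.
response = model.chat(messages, max_new_tokens=128, sampling=False)
print(response)
```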
Megrez-3B-Omni Visits Over Time
Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Pages per Visit: 5.2
Visit Duration: 00:04:57