2024-12-18 17:52:23.AIbase.
New Breakthrough in Multimodal Models: Fei-Fei Li's Team Unifies Actions and Language, Not Only Understanding Commands but also Reading Implicit Emotions
2024-12-10 08:03:30.AIbase.
Zhipu AI Launches Free Multimodal Model GLM-4V-Flash: Enhancing Image Processing Accuracy
2024-11-30 10:01:37.AIbase.
Zhipu AI Open-Sources GLM-Edge Series of On-Device Large Language and Multimodal Models
2024-11-19 13:51:41.AIbase.
Peking University Team Releases Multimodal Model LLaVA-o1, Reasoning Capabilities Comparable to GPT-o1!
2024-11-19 09:54:07.AIbase.
Mistral Launches the Most Powerful Open Source Multimodal Model Pixtral Large, Upgrading Le Chat to Directly Call Flux Pro
2024-10-25 11:16:59.AIbase.
Salesforce AI Research Unveils New Multimodal Model BLIP-3-Video: Cost-Effective Video Understanding
2024-09-27 17:37:02.AIbase.
Powerful Multimodal Model Emu3: Understanding Images and Videos Through Next-Token Prediction
2024-09-26 14:34:11.AIbase.
The Open Source Multimodal Model Molmo Can Recognize Objects in Images and Generate Accurate Descriptions
2024-08-13 08:15:52.AIbase.
Over Ten Thousand Stars! WallFacer Intelligence's MiniCPM-V2.6 Model Tops GitHub
2024-08-02 09:04:21.AIbase.
Google Launches Powerful Multimodal Model Gemini 1.5 Pro, Outranking GPT-4o and Claude-3.5 Sonnet
2024-07-31 17:56:44.AIbase.
Shusheng · Puyu Lingbi (InternLM-XComposer) Multimodal Model Upgraded to Version 2.5: Supports Longer Contexts, with Image and Video Understanding Comparable to GPT-4V
2024-07-16 10:24:06.AIbase.
Meta to Unveil Massive Multimodal Model Llama 3 405B on July 23rd
2024-07-04 10:48:36.AIbase.
Open-Source Local Real-Time Multimodal Model Moshi: Real-Time Speech Generation with Support for Multiple Accents
Moshi, an open-source, real-time, multimodal model, excels in generating speech instantaneously while accommodating various accents.
2024-06-27 16:41:35.AIbase.
LeCun Launches New Visual Multimodal Model Cambrian-1, Visual Capabilities Outperform GPT-4V
2024-06-19 09:20:50.AIbase.
Meta Releases Multiple Models: Multimodal Model Chameleon, Text-to-Music Generation Model JASCO, Audio Watermarking Technology AudioSeal, and More
2024-06-17 10:47:33.AIbase.
Sketchpad: A Canvas Framework for Multimodal Models to Enhance Mathematical Abilities
2024-01-31 10:12:49.AIbase.
Microsoft Open-Sources Multimodal Model LLaVA-1.5, with Performance Comparable to GPT-4V
2024-01-15 18:04:00.AIbase.
2023 AI Industry Events: GPT-4 Debuts, Multimodal Model War Erupts, AI Sun Yanzi (Stefanie Sun) Sparks Controversy
2023-12-27 10:28:28.AIbase.
Shanghai AI Lab Releases 'PuYi 2.0' (OpenMEDLab 2.0)
2023-12-07 08:33:07.AIbase.