On August 25, Alibaba Cloud introduced Qwen-VL, a large-scale vision-language model that supports multiple languages, including Chinese and English, and can jointly understand text and images. Built on Qwen-7B, the general-purpose language model Alibaba Cloud open-sourced earlier, Qwen-VL goes beyond other vision-language models by adding capabilities such as visual grounding and understanding of text within images. Qwen-VL has garnered over 3,400 stars on GitHub and has been downloaded more than 400,000 times. Vision-language models are widely regarded as an important direction in the evolution of general AI: the industry view is that models accepting multimodal input can understand the world better and serve a wider range of applications. By open-sourcing Qwen-VL, Alibaba Cloud is further advancing the development of general AI technology.
Alibaba Cloud's Tongyi Qianwen Goes Open Source Again with Multimodal Large Model Qwen-VL

Ebrun (亿邦动力)
This article is from AIbase Daily