On August 25, Alibaba Cloud released Qwen-VL, a large-scale vision-language model that supports multiple languages, including Chinese and English, and can jointly understand text and images. Qwen-VL is built on Qwen-7B, the general-purpose language model that Alibaba Cloud open-sourced earlier. Compared with other vision-language models, it adds capabilities such as visual localization (grounding) and reading text inside images. Qwen-VL has received more than 3,400 stars on GitHub and has been downloaded more than 400,000 times. Vision-language models are widely regarded as an important direction in the evolution of general AI: the industry view is that models accepting multimodal input can understand the world better and serve a wider range of applications. By open-sourcing Qwen-VL, Alibaba Cloud aims to further advance the development of general AI technology.
Alibaba Cloud's Tongyi Qianwen Open-Sources Again: Multimodal Large Model Qwen-VL
亿邦动力
© Copyright AIbase Base 2024, Click to View Source - https://www.aibase.com/news/817