The OpenCompass team at Shanghai Artificial Intelligence Laboratory, together with ModelScope, recently announced significant updates to their large model evaluation platform, Compass Arena, introducing a new multi-modal large model competition section called Compass Multi-Modal Arena. This section gives users a platform to try out and compare several mainstream multi-modal large models side by side, helping them find the model that best suits their needs.
The official website of Compass Multi-Modal Arena and the ModelScope page are now open to the public, offering a simple interface where users upload an image and enter a question. The system then has two anonymous multi-modal large models each generate an answer from the same input. Users judge the quality of the generated content and vote for the model they believe performed better; only after voting are the names of the two models revealed.
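The announcement does not describe how these pairwise votes are scored, but arena-style comparison platforms commonly aggregate them with an Elo-style rating update. The sketch below is a minimal illustration of that general technique; the model names, starting ratings, and K-factor are assumptions for demonstration, not details from the platform.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift both ratings toward the observed vote outcome."""
    e_w = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * (1.0 - e_w)
    ratings[loser] -= k * (1.0 - e_w)

# Illustrative: two anonymous models start at the same rating;
# one user vote for "model_A" nudges the ratings apart.
ratings = {"model_A": 1000.0, "model_B": 1000.0}
update(ratings, winner="model_A", loser="model_B")
print(ratings)  # model_A rises, model_B falls by the same amount
```

Because each vote only adjusts two ratings, this scheme scales naturally to many models compared in random anonymous pairs, which is why it is a common choice for crowd-voted leaderboards.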
The platform also includes a built-in question bank for users who have no image to upload. The bank focuses on subjective visual question-answering tasks, such as meme understanding, art appreciation, and photography appreciation, and is designed to assess how well multi-modal large models perform on subjective tasks and how users experience them.
Compass Multi-Modal Arena Official Website:
https://opencompass.org.cn/arena?type=multimodal
ModelScope Page:
https://modelscope.cn/studios/opencompass/CompassArena
HuggingFace Page:
https://huggingface.co/spaces/opencompass/CompassArena
OpenCompass Multi-Modal Evaluation Tool Open Source Link:
https://github.com/open-compass/VLMEvalKit