Tencent's Hunyuan Large Model has excelled in SuperCLUE-V, the evaluation benchmark for Chinese multimodal large models, ranking first among domestic large models in the August edition and placing in the Outstanding Leader quadrant. Multimodal understanding, which requires a model to accurately identify image elements, understand the relationships among them, and generate natural language descriptions, tests the model's precision in image recognition and its grasp of the complex real world.

This evaluation included 12 representative multimodal understanding large models from both domestic and international sources, assessing them along two dimensions: basic capabilities and application capabilities. Tencent's Hunyuan Large Model demonstrated comprehensive advantages in both, scoring 71.95. The SuperCLUE evaluation criteria cover aspects such as understanding accuracy, response relevance, and depth of reasoning, ensuring a scientific and impartial assessment.


The evaluation results indicate that domestic large models have nearly reached the level of top overseas models in basic multimodal understanding capabilities. Tencent's Hunyuan Large Model particularly stood out in application capabilities, benefiting from a deep understanding of the Chinese context and comprehensive abilities across multiple domains.

The technical foundation of Tencent's Hunyuan Large Model supports the AI-native application Tencent Yuanbao, giving it multimodal understanding capabilities to comprehend and analyze a wide range of image types. Additionally, the Tencent Hunyuan Multimodal Model is now live on Tencent Cloud, offering capabilities such as image-to-text generation to enterprise and individual developers.
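For readers unfamiliar with how such an image-to-text capability is typically consumed, the sketch below shows the general shape of a developer calling a cloud-hosted multimodal endpoint. The URL, request fields, model identifier, and authentication scheme here are placeholders for illustration only and are not Tencent Cloud's actual API; the official Tencent Cloud documentation defines the real endpoint, signing method, and parameter names.

```python
# Hypothetical sketch: sending an image URL plus a prompt to an image-to-text
# endpoint and reading back the generated description. All names below
# (endpoint, fields, model id) are assumptions, not Tencent Cloud's real API.
import requests

API_URL = "https://example-hunyuan-endpoint.invalid/v1/image-to-text"  # placeholder URL
API_KEY = "YOUR_API_KEY"  # placeholder credential

def describe_image(image_url: str, prompt: str = "Describe this image.") -> str:
    """Send an image URL and a text prompt; return the model's description."""
    payload = {
        "model": "hunyuan-vision",  # assumed model identifier
        "image_url": image_url,
        "prompt": prompt,
    }
    headers = {"Authorization": f"Bearer {API_KEY}"}
    resp = requests.post(API_URL, json=payload, headers=headers, timeout=30)
    resp.raise_for_status()
    return resp.json().get("text", "")

if __name__ == "__main__":
    print(describe_image("https://example.com/photo.jpg"))
```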

Jiang Jie, Vice President of Tencent, stated that the Hunyuan Large Model is evolving towards full-modal technology. Users will soon be able to experience it in the Tencent Yuanbao App and in Tencent's internal services, and it will be available to external applications through Tencent Cloud. Currently, the Tencent Hunyuan Large Model has been scaled to the trillion-parameter level using a Mixture of Experts (MoE) architecture, and its multimodal understanding capabilities are at a domestically leading level.
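As context for the MoE mention above, the following is a minimal, generic sketch of top-k Mixture of Experts routing, the general architecture family referenced in the article. The layer sizes, expert count, and routing scheme are assumptions chosen for illustration and do not describe Hunyuan's actual implementation.

```python
# Minimal top-k MoE layer: a router scores experts per token, and each token is
# processed only by its top-k experts, so total parameters can grow without a
# proportional increase in per-token compute. Illustrative only; sizes and the
# routing scheme are assumptions, not Hunyuan's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model: int = 64, d_ff: int = 256,
                 num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router over experts
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Pick top-k experts per token and mix their outputs.
        scores = self.gate(x)                           # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # per-token expert choices
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

if __name__ == "__main__":
    layer = TopKMoE()
    tokens = torch.randn(10, 64)
    print(layer(tokens).shape)  # torch.Size([10, 64])
```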