Tongyi Qianwen Teams Up with the ModelScope Community to Open-Source P-MMEval, a Benchmark for Evaluating Models' Multilingual Capabilities
Alibaba DAMO Academy, in collaboration with the ModelScope community, recently announced that it has open-sourced P-MMEval, a new multilingual benchmark. The benchmark is designed to comprehensively evaluate the multilingual capabilities of Large Language Models (LLMs) and to support comparative analysis of their cross-language transfer abilities. It covers efficient datasets for both basic and specialized capabilities and keeps language coverage consistent across all selected datasets. It also provides parallel samples across languages, supporting up to 10 languages from 8 different language families, including English, Chinese, and Arabic.
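To illustrate why parallel samples matter for comparing cross-language transfer, here is a minimal Python sketch. It is not P-MMEval's actual data format or scoring code: the record layout, the function names, the choice of English as the pivot language, and the toy results are all assumptions made up for illustration. Because every language answers the same underlying questions, per-language accuracies can be compared directly against the pivot.

```python
from collections import defaultdict

def per_language_accuracy(records):
    """Compute accuracy per language.

    `records` is a hypothetical layout: (sample_id, language, is_correct).
    With parallel samples, each sample_id appears once per language.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for _sample_id, lang, is_correct in records:
        totals[lang] += 1
        correct[lang] += int(is_correct)
    return {lang: correct[lang] / totals[lang] for lang in totals}

def relative_to_pivot(accuracy, pivot="en"):
    """Express each language's accuracy as a fraction of the pivot's.

    Using English as the pivot is an assumption of this sketch.
    """
    base = accuracy[pivot]
    if base == 0:
        raise ValueError("pivot accuracy is zero; ratio is undefined")
    return {lang: acc / base for lang, acc in accuracy.items() if lang != pivot}

# Toy, made-up results for four parallel questions in three languages.
records = [
    (0, "en", True),  (0, "zh", True),  (0, "ar", False),
    (1, "en", True),  (1, "zh", False), (1, "ar", False),
    (2, "en", True),  (2, "zh", True),  (2, "ar", True),
    (3, "en", False), (3, "zh", True),  (3, "ar", False),
]

accuracy = per_language_accuracy(records)
ratios = relative_to_pivot(accuracy)
print(accuracy)
print(ratios)
```

A ratio close to 1 suggests that performance carries over from the pivot language, while a much lower ratio points to a gap. Without parallel samples, such a gap could just as well reflect differences in question difficulty between languages.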