Recently, Cohere announced the launch of two new open-source AI models from its Aya project, aimed at bridging the language gap in foundation models. The new models, Aya Expanse 8B and 32B, are now available on Hugging Face and significantly enhance AI performance across 23 languages.
Cohere stated in its blog that the 8B parameter model makes breakthroughs more accessible to researchers worldwide, while the 32B parameter model offers industry-leading multilingual capabilities.
The goal of the Aya project is to expand access to foundation models in non-English languages. Cohere's research division launched the Aya initiative last year and released the Aya 101 large language model (LLM), covering 101 languages, in February. Cohere has also released the Aya dataset to support model training in other languages.
The Aya Expanse models build on many of the core methods from Aya 101. Cohere noted that the improvements in Aya Expanse come from rethinking core building blocks of machine learning over the years, with research focused on narrowing the language gap through three key techniques: data arbitrage, preference training for general performance and safety, and model merging.
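Of the techniques above, model merging is the easiest to illustrate: in its simplest form it averages the parameters of several fine-tuned checkpoints into one model. The sketch below shows uniform weight averaging using toy dicts in place of real model state dicts; all names and values are hypothetical, and this is not Cohere's actual merging recipe.

```python
# Minimal sketch of model merging via uniform weight averaging.
# Toy dicts of float lists stand in for real model state dicts.

def merge_checkpoints(checkpoints, weights=None):
    """Average parameters element-wise across checkpoints.

    If no weights are given, use a uniform average.
    """
    if weights is None:
        weights = [1.0 / len(checkpoints)] * len(checkpoints)
    merged = {}
    for name in checkpoints[0]:
        merged[name] = [
            sum(w * ckpt[name][i] for ckpt, w in zip(checkpoints, weights))
            for i in range(len(checkpoints[0][name]))
        ]
    return merged

# Two hypothetical checkpoints fine-tuned on different language mixes.
ckpt_a = {"layer.weight": [0.2, 0.4], "layer.bias": [0.0, 1.0]}
ckpt_b = {"layer.weight": [0.6, 0.0], "layer.bias": [0.2, 0.6]}

merged = merge_checkpoints([ckpt_a, ckpt_b])
print(merged)  # element-wise average of the two checkpoints
```

Passing non-uniform `weights` turns this into a weighted merge, which lets one checkpoint (say, the stronger multilingual one) dominate the final model.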
In several benchmark tests, Cohere reported that the Aya Expanse models outperformed AI models of similar scale from companies like Google, Mistral, and Meta.
Specifically, Aya Expanse 32B outperformed Gemma 2 27B, Mixtral 8x22B, and even the much larger Llama 3.1 70B on multilingual benchmarks. The smaller 8B model likewise surpassed Gemma 2 9B, Llama 3.1 8B, and Ministral 8B, with win rates ranging from 60.4% to 70.6%.
To avoid generating incomprehensible synthetic training content, Cohere employs a sampling method called data arbitrage, which draws on a pool of teacher models rather than relying on a single one; this is especially effective for low-resource languages. Additionally, Cohere focuses on steering models toward "global preferences" that account for different cultural and linguistic perspectives, improving both performance and safety.
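As a rough sketch of the data-arbitrage idea (not Cohere's actual implementation), the key step is choosing the best available teacher model per language before sampling synthetic data from it. The teacher names and quality scores below are hypothetical:

```python
# Illustrative sketch of per-language teacher selection in data arbitrage:
# rather than trusting one teacher model for every language, pick the
# highest-scoring teacher for each language, then sample data from it.

def pick_teacher(teacher_scores, language):
    """Return the teacher model with the highest quality score for a language."""
    scores = teacher_scores[language]
    return max(scores, key=scores.get)

# Hypothetical quality scores for a pool of teacher models.
teacher_scores = {
    "swahili": {"teacher_a": 0.41, "teacher_b": 0.78},  # low-resource case
    "french":  {"teacher_a": 0.92, "teacher_b": 0.63},
}

for lang in teacher_scores:
    print(lang, "->", pick_teacher(teacher_scores, lang))
```

The payoff is largest for low-resource languages, where a general-purpose teacher may produce garbled output while a specialized one does not.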
Cohere's Aya initiative aims to ensure better performance of LLMs in research on non-English languages. While many LLMs eventually release versions in other languages, they often face data scarcity issues during model training, particularly for low-resource languages. Therefore, Cohere's efforts are particularly important in helping to build multilingual AI models.
Official Blog: https://cohere.com/blog/aya-expanse-connecting-our-world
Key Points:
🌍 Cohere launches two new AI models, dedicated to narrowing the language gap in foundational models, enhancing performance for 23 languages.
💡 Aya Expanse models excel, outperforming many competitors in multilingual benchmarks.
🔍 The data arbitrage method helps models avoid generating low-quality content, while attention to global cultural and linguistic perspectives improves multilingual AI training.