Aya Vision is a multilingual vision-language model developed by the Cohere For AI team, focused on multimodal tasks and supporting 23 languages. The model improves performance on visual and text tasks through techniques such as synthetic annotation, multilingual data augmentation, and multimodal model merging. Its main advantages are efficiency, performing well even with limited computing resources, and broad multilingual coverage. Aya Vision was released to advance the frontier of multilingual, multimodal research and to give the global research community a strong open foundation to build on.
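As a rough illustration of how a model like this is typically queried, the sketch below builds a chat-style multimodal request (one image plus a question, which could be in any of the supported languages) and generates an answer via the Hugging Face `transformers` library. The checkpoint name `CohereForAI/aya-vision-8b` and the exact processor/model classes are assumptions not stated in the text above, so treat this as a non-authoritative sketch rather than official usage.

```python
MODEL_ID = "CohereForAI/aya-vision-8b"  # assumed checkpoint name, not from the text


def build_messages(image_url: str, question: str) -> list:
    """Build a chat-style multimodal message: one image plus a text question.

    The question can be written in any of the model's supported languages,
    e.g. Spanish, Arabic, or Hindi.
    """
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


def ask_aya_vision(image_url: str, question: str, max_new_tokens: int = 128) -> str:
    """Load the model and generate an answer (downloads several GB of weights)."""
    # Imported here so the prompt-building helper works without transformers installed.
    from transformers import AutoModelForImageTextToText, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageTextToText.from_pretrained(MODEL_ID, device_map="auto")

    inputs = processor.apply_chat_template(
        build_messages(image_url, question),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return processor.tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

In practice the heavy lifting happens in `ask_aya_vision`; `build_messages` is separated out because the chat-template message format is the part most users need to get right.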