CohereForAI's Aya Vision 8B is an 800-million parameter multilingual vision-language model optimized for various visual language tasks, supporting OCR, image captioning, visual reasoning, summarization, and question answering. Based on the C4AI Command R7B language model and incorporating the SigLIP2 visual encoder, it supports 23 languages and features a 16K context length. Key advantages include multilingual support, powerful visual understanding capabilities, and broad applicability. Released with open-source weights, it aims to advance the global research community. Users must adhere to C4AI's acceptable use policy under the CC-BY-NC license.