Llama-3.2-90B-Vision is a multimodal large language model (LLM) released by Meta, optimized for visual recognition, image reasoning, image captioning, and answering general questions about an image. On common industry benchmarks it outperforms many of the available open-source and closed multimodal models.
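
Below is a minimal sketch of image question answering with this model through the Hugging Face Transformers integration (assuming transformers >= 4.45, which provides `MllamaForConditionalGeneration`, and sufficient GPU memory for the 90B weights). The instruction-tuned checkpoint `meta-llama/Llama-3.2-90B-Vision-Instruct` and the image URL are used purely as illustrative placeholders.

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

# Placeholder checkpoint; swap in the variant you actually want to run.
model_id = "meta-llama/Llama-3.2-90B-Vision-Instruct"

# Load model weights in bf16 and shard them across available devices.
model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder image URL; replace with any image you want to ask about.
url = "https://example.com/sample.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Build a chat-style prompt that pairs the image with a text question.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image in one sentence."},
    ]}
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False,
                   return_tensors="pt").to(model.device)

# Generate and decode the model's answer.
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```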