Kimi-VL is an advanced expert-mixed visual language model designed for multi-modal reasoning, long-context understanding, and powerful agent capabilities. This model excels in several complex domains, boasting efficient 2.8B parameters while exhibiting outstanding mathematical reasoning and image understanding capabilities. Kimi-VL sets a new standard for multi-modal models with its optimized computational performance and ability to handle long inputs.