Kuaishou Keye-VL is a cutting-edge multimodal large language model developed by the Kuaishou Keye team. It performs strongly on video understanding, visual perception, and reasoning tasks. Version 1.5 raises video understanding, image perception, and reasoning to a new level through an innovative fast-slow video encoding strategy, a LongCoT cold-start data pipeline, and reinforcement learning training strategies, and it supports an extended context length of up to 128K tokens.
Tags: Multimodal · Safetensors · Multiple Languages
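As a minimal sketch of how such a checkpoint is typically used for image question answering: the snippet below assumes the weights are published on Hugging Face under an ID like "Kwai-Keye/Keye-VL-1_5-8B" (a placeholder, not confirmed by this card) and that the repository exposes a standard transformers-compatible AutoModelForCausalLM / AutoProcessor interface via trust_remote_code.

```python
# Hedged quick-start sketch; model ID and interface details are assumptions.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "Kwai-Keye/Keye-VL-1_5-8B"  # hypothetical/placeholder model ID

# Load the processor and model, trusting the repo's custom code if present.
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Build a chat-style prompt with one image and one text question.
image = Image.open("example.jpg")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe what is happening in this image."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

# Tokenize text + image together and generate a response.
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```

The same chat-message pattern generalizes to video inputs and long-context prompts (up to the advertised 128K tokens), subject to the exact input format documented by the released checkpoint.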