SlowFast-LLaVA
A training-free large language model for video understanding and reasoning.
Common Product, Productivity, Video Question Answering, Multimodal Learning
SlowFast-LLaVA is a training-free multimodal large language model for video understanding and reasoning. Without fine-tuning on any data, it achieves performance comparable to or better than state-of-the-art video large language models across a range of video question-answering tasks and benchmarks.
SlowFast-LLaVA Visits Over Time
Monthly Visits: 515,580,771
Bounce Rate: 37.20%
Pages per Visit: 5.8
Visit Duration: 00:06:42