ShareGPT4Video
Enhance AI models for video understanding and generation.
Common Product, Video, Video Understanding, Text-to-Video
The ShareGPT4Video series aims to promote video understanding in large video-language models (LVLMs) and video generation in text-to-video models (T2VMs) through dense and precise captions. The series includes:
1) ShareGPT4Video, a dataset of 40K dense video captions annotated by GPT-4V, built with carefully designed data filtering and annotation strategies.
2) ShareCaptioner-Video, an efficient and capable captioning model for arbitrary videos, which was used to annotate 4.8M high-quality aesthetic videos.
3) ShareGPT4Video-8B, a simple yet effective LVLM that achieved top performance on three advanced video benchmarks.