CelebV-Text
A large-scale, high-quality, and diverse face text-video dataset
Common Product, Video, Face, Text
CelebV-Text is a large-scale, high-quality, and diverse face text-video dataset designed to promote research on face text-video generation tasks. It contains 70,000 in-the-wild face video clips, each paired with 20 text descriptions. The descriptions cover 40 general appearances, 5 detailed appearances, 6 lighting conditions, 37 actions, 8 emotions, and 6 light directions. Comprehensive statistical analysis of the videos, the texts, and the text-video correlation supports the dataset's superiority, and a benchmark built on it standardizes the evaluation of face text-video generation tasks.
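To make the stated scale concrete, the figures in the description can be tallied in a few lines of Python. This is only a sketch: the group names, the `ClipRecord` class, and its fields are hypothetical and chosen for illustration, and they do not reflect the dataset's actual file format or annotation schema. Only the counts come from the description.

```python
from dataclasses import dataclass, field
from typing import List

# Counts taken from the description; the key names are illustrative assumptions.
ATTRIBUTE_GROUPS = {
    "general_appearance": 40,
    "detailed_appearance": 5,
    "lighting_condition": 6,
    "action": 37,
    "emotion": 8,
    "light_direction": 6,
}

NUM_CLIPS = 70_000
TEXTS_PER_CLIP = 20


@dataclass
class ClipRecord:
    """Hypothetical record layout, not the dataset's real schema."""
    clip_id: str
    texts: List[str] = field(default_factory=list)

    def is_complete(self) -> bool:
        # A clip is complete when it carries the stated number of descriptions.
        return len(self.texts) == TEXTS_PER_CLIP


# Total text descriptions implied by the stated figures.
total_texts = NUM_CLIPS * TEXTS_PER_CLIP
# Total attribute labels across all six groups.
total_labels = sum(ATTRIBUTE_GROUPS.values())

print(f"text descriptions: {total_texts:,}")
print(f"attribute labels:  {total_labels}")
```

Under these figures the dataset would hold 1,400,000 text descriptions in total, spread over 102 attribute labels in six groups.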
CelebV-Text Visits Over Time
Monthly Visits: 498
Bounce Rate: 40.07%
Pages per Visit: 1.0
Visit Duration: 00:00:00