Ego-Exo4D
Multimodal Multi-view Video Dataset and Benchmark Challenge
Common Product, Video, Multimodal, Multi-view
Ego-Exo4D is a multimodal, multi-view video dataset and benchmark challenge that captures skilled human activities from both first-person (egocentric) and external (exocentric) perspectives. It is intended to support research on multimodal machine perception of everyday activities. The data was collected from 839 camera-wearing volunteers in 13 cities worldwide and comprises 1,422 hours of video of skilled human activity. The dataset includes three kinds of natural language data, each paired and time-aligned with the video: expert commentary, tutorial-style narrations provided by the participants, and one-sentence atomic action descriptions. Ego-Exo4D also captures multiple views and sensor modalities, including multiple cameras, a seven-microphone array, two IMUs, a barometer, and a magnetometer. All recordings were made under strict privacy and ethics policies, with informed consent from participants. For more information, visit the official website.
Ego-Exo4D Visits Over Time
Monthly Visits: 11,693
Bounce Rate: 51.81%
Pages per Visit: 2.8
Visit Duration: 00:00:39