Ego-Exo4D

Multimodal Multi-view Video Dataset and Benchmark Challenge

Common Product, Video, Multimodal, Multi-view
Ego-Exo4D is a multimodal, multi-view video dataset and benchmark challenge that captures skilled human activities from both first-person (egocentric) and external (exocentric) perspectives, supporting research on multimodal machine perception of everyday activities. The footage was recorded by 839 camera-wearing volunteers in 13 cities worldwide and totals 1422 hours of video. Three kinds of natural-language descriptions are time-aligned with the video: expert commentary, tutorial-style narrations provided by the participants, and one-sentence atomic action descriptions. Ego-Exo4D also captures multiple views and sensor modalities, including multiple cameras, a seven-microphone array, two IMUs, a barometer, and a magnetometer. Recording followed strict privacy and ethics policies, with informed consent from participants. For more information, please visit the official website.

Ego-Exo4D Visit Over Time

Monthly Visits: 11693
Bounce Rate: 51.81%
Pages per Visit: 2.8
Visit Duration: 00:00:39
