ManiWAV
Learning robot manipulation from in-the-wild audio-visual data
Common Product, Programming, Robot Learning, Audio-Visual Data
ManiWAV is a research project that learns robot manipulation skills from in-the-wild audio and visual data. It collects human demonstrations with synchronized audio and visual feedback and learns robot manipulation policies directly from those demonstrations through a corresponding policy interface. The system's capabilities are demonstrated on four contact-rich manipulation tasks, which require the robot either to passively sense contact events and patterns or to actively sense the material and state of object surfaces. Because it learns from diverse in-the-wild human demonstrations, the system can also generalize to unseen in-the-wild environments.
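To make "synchronized audio and visual feedback" concrete, the sketch below pairs each video frame with the slice of audio that ends at that frame's timestamp. It is only an illustration under stated assumptions: the function names, the 30 fps frame rate, the 16 kHz sample rate, and the 0.5 s window are hypothetical and are not taken from ManiWAV.

```python
# Hypothetical sketch: align an audio stream to video frames by timestamp.
# All names and numeric values here are illustrative assumptions, not ManiWAV's.

def audio_window_for_frame(frame_index, fps, sample_rate, window_seconds, num_samples):
    """Return (start, end) sample indices of the audio window ending at the frame time."""
    frame_time = frame_index / fps                        # timestamp of the frame in seconds
    end = min(int(round(frame_time * sample_rate)), num_samples)
    start = max(0, end - int(round(window_seconds * sample_rate)))
    return start, end


def pair_frames_with_audio(num_frames, fps, audio, sample_rate, window_seconds):
    """Build (frame_index, audio_slice) pairs, one per video frame."""
    pairs = []
    for i in range(num_frames):
        start, end = audio_window_for_frame(i, fps, sample_rate, window_seconds, len(audio))
        pairs.append((i, audio[start:end]))
    return pairs


if __name__ == "__main__":
    sample_rate = 16000                      # assumed audio sample rate (Hz)
    audio = [0.0] * (2 * sample_rate)        # two seconds of placeholder audio
    pairs = pair_frames_with_audio(
        num_frames=60, fps=30, audio=audio,
        sample_rate=sample_rate, window_seconds=0.5,
    )
    print(len(pairs), len(pairs[30][1]))
```

Early frames get a shorter window because less audio has been recorded before them; a real pipeline would likely pad those windows or skip the frames.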