2024-12-12 11:55:42 · AIbase
New AI Audio Technology MMAudio: Automatically Generating Sound for Videos from Video or Text Input
Recently, a research team from the University of Illinois at Urbana-Champaign, Sony AI, and Sony Group introduced MMAudio, a new technology that aims to synthesize high-quality audio for videos through multimodal joint training. Its core innovation is the ability to generate audio that is synchronized with the video content, taking video input, text input, or both, which broadens the application scenarios for audio generation.