AV-HuBERT
A state-of-the-art self-supervised framework for audio-visual speech representation learning.
Common Product, Programming, Audio-visual processing, Self-supervised learning
AV-HuBERT is a self-supervised representation learning framework for audio-visual speech processing. It has achieved state-of-the-art results in lip reading, automatic speech recognition (ASR), and audio-visual speech recognition on the LRS3 audio-visual speech benchmark. The framework learns audio-visual speech representations through masked multimodal cluster prediction, which also supports robust self-supervised audio-visual speech recognition.
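To make the idea of masked multimodal cluster prediction more concrete, here is a minimal toy sketch in plain Python. It is not the AV-HuBERT implementation or its API: every name in it (`nearest_cluster`, `masked_cluster_loss`, the toy dimensions) is hypothetical, random centroids stand in for the clustering step, and a single linear layer over the averaged visible frames stands in for the real encoder. It only illustrates the shape of the objective: fuse audio and video features per frame, derive a cluster ID for each frame, hide a span of frames, and score the predicted cluster IDs on the hidden frames only.

```python
import math
import random


def nearest_cluster(frame, centroids):
    """Index of the centroid closest to one fused frame (squared Euclidean)."""
    dists = [sum((x - c) ** 2 for x, c in zip(frame, cen)) for cen in centroids]
    return dists.index(min(dists))


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def masked_cluster_loss(audio, video, centroids, weights, mask):
    """Mean cross-entropy of cluster predictions, computed on masked frames only."""
    # Fuse the two modalities by concatenating their per-frame features.
    fused = [a + v for a, v in zip(audio, video)]
    # Targets come from the uncorrupted frames (assumption: nearest centroid).
    targets = [nearest_cluster(f, centroids) for f in fused]
    # Hide the masked frames by replacing them with zeros.
    corrupted = [[0.0] * len(f) if m else f for f, m in zip(fused, mask)]
    # Stand-in "encoder": average the corrupted sequence, then a linear layer.
    dim = len(fused[0])
    context = [sum(f[d] for f in corrupted) / len(corrupted) for d in range(dim)]
    logits = [sum(c * w for c, w in zip(context, row)) for row in weights]
    log_probs = log_softmax(logits)
    # Only masked positions contribute to the loss.
    masked_nll = [-log_probs[t] for t, m in zip(targets, mask) if m]
    return sum(masked_nll) / len(masked_nll), targets


rng = random.Random(0)
T, Da, Dv, K = 12, 3, 2, 4  # frames, audio dim, video dim, clusters (toy sizes)
audio = [[rng.gauss(0, 1) for _ in range(Da)] for _ in range(T)]
video = [[rng.gauss(0, 1) for _ in range(Dv)] for _ in range(T)]
centroids = [[rng.gauss(0, 1) for _ in range(Da + Dv)] for _ in range(K)]
weights = [[rng.gauss(0, 0.1) for _ in range(Da + Dv)] for _ in range(K)]
mask = [4 <= t < 8 for t in range(T)]  # hide frames 4..7

loss, targets = masked_cluster_loss(audio, video, centroids, weights, mask)
print(f"masked-frame loss: {loss:.4f}")
```

In a real system the predictor would see temporal context around each masked frame rather than one averaged vector, and the cluster targets would come from a learned clustering rather than random centroids; the sketch keeps both deliberately simple.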
AV-HuBERT Visit Over Time
Monthly Visits: 488643166
Bounce Rate: 37.28%
Pages per Visit: 5.7
Visit Duration: 00:06:37