VSP-LLM
A framework that combines Visual Speech Processing with Large Language Models
VSP-LLM is a framework that combines Visual Speech Processing (VSP) with Large Language Models (LLMs), designed to maximize context-modeling capability by leveraging the powerful abilities of LLMs. VSP-LLM is engineered for multitasking, performing both visual speech recognition and visual speech translation. It maps input videos to the LLM's input latent space through a self-supervised visual speech model. Training is made efficient by a novel deduplication method, which removes redundant visual features, together with Low-Rank Adaptation (LoRA).
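The deduplication idea can be sketched as follows: consecutive video frames that map to the same discrete visual speech unit carry redundant information, so they are merged (here by averaging their feature vectors) before being fed to the LLM, shortening the input sequence. This is an illustrative sketch under assumed inputs, not the authors' implementation; the function name and the use of NumPy are assumptions.

```python
import numpy as np

def deduplicate(features: np.ndarray, units: np.ndarray) -> np.ndarray:
    """Merge runs of consecutive frames sharing the same visual speech
    unit by averaging their feature vectors.

    features: (T, D) per-frame visual embeddings
    units:    (T,) discrete visual speech unit ID per frame
    returns:  (T', D) deduplicated embeddings, T' <= T
    """
    merged = []
    start = 0
    for t in range(1, len(units) + 1):
        # Close the current run when the unit changes or input ends.
        if t == len(units) or units[t] != units[start]:
            merged.append(features[start:t].mean(axis=0))
            start = t
    return np.stack(merged)

# Example: 6 frames with units [5, 5, 2, 2, 2, 7] collapse to 3 vectors.
feats = np.arange(12, dtype=float).reshape(6, 2)
units = np.array([5, 5, 2, 2, 2, 7])
out = deduplicate(feats, units)
print(out.shape)  # (3, 2)
```

The shortened sequence reduces the number of tokens the LLM must attend over, which is what makes training on long videos tractable.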
VSP-LLM Website Traffic
Monthly Visits: 494,758,773
Bounce Rate: 37.69%
Pages per Visit: 5.7
Visit Duration: 00:06:29