VSP-LLM

A framework that combines Visual Speech Processing with Large Language Models

VSP-LLM is a framework that combines Visual Speech Processing (VSP) with Large Language Models (LLMs) to take advantage of the strong context-modeling ability of LLMs. It is built for multitasking and performs both visual speech recognition and visual speech translation. Input video is mapped into the LLM's input latent space by a self-supervised visual speech model. Training is kept computationally efficient by a novel deduplication method, which reduces redundant visual features, together with Low-Rank Adaptation (LoRA).
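The deduplication idea can be illustrated with a minimal sketch. It assumes that each video frame has a feature vector and a discrete unit id, and that consecutive frames sharing the same unit id are merged by averaging their features. The function name `deduplicate`, the toy data, and the averaging rule are illustrative assumptions, not code from the VSP-LLM project.

```python
from itertools import groupby


def deduplicate(features, units):
    """Merge runs of consecutive frames that share a unit id.

    Hypothetical helper: `features` is a list of per-frame feature
    vectors, `units` is a list of discrete unit ids (one per frame).
    Each run of identical consecutive ids is replaced by the
    element-wise mean of its feature vectors.
    """
    if len(features) != len(units):
        raise ValueError("features and units must have the same length")

    merged = []
    start = 0
    for _, run in groupby(units):
        run_len = len(list(run))
        chunk = features[start:start + run_len]
        # Element-wise mean over the frames in this run.
        merged.append([sum(column) / run_len for column in zip(*chunk)])
        start += run_len
    return merged


# Toy input: six frames, three runs of unit ids (7, 7), (2), (9, 9, 9).
features = [[1.0, 0.0], [3.0, 2.0], [5.0, 5.0],
            [0.0, 4.0], [2.0, 6.0], [4.0, 8.0]]
units = [7, 7, 2, 9, 9, 9]

print(deduplicate(features, units))
# [[2.0, 1.0], [5.0, 5.0], [2.0, 6.0]]
```

In this toy case six frames shrink to three vectors, so the language model would see a sequence half as long. How much a real video shrinks depends on how often neighbouring frames share a unit.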

VSP-LLM Visit Over Time

Monthly Visits: 499,904,316
Bounce Rate: 37.31%
Pages per Visit: 5.8
Visit Duration: 00:06:52
