LSLM

An AI conversational system for real-time voice interaction.

Common Product, Chatting, Artificial Intelligence, Speech Recognition
The Listening-while-Speaking Language Model (LSLM) is a conversational AI model designed to make human-computer interaction more natural. It uses full duplex modeling (FDM) to listen while it speaks, which improves real-time interactivity: a user can interrupt when the generated content is unsatisfactory, and the model can respond immediately. LSLM uses a token-based decoder for speech generation (TTS) and a streaming self-supervised learning (SSL) encoder for real-time audio input. To find the best balance for interaction, it explores three strategies for fusing the two channels: early fusion, middle fusion, and late fusion.
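The listen-while-speaking behavior described above can be illustrated with a toy loop. This is only a sketch under stated assumptions, not LSLM's actual implementation: the function `duplex_generate`, the `<IRQ>` marker, and the energy threshold are hypothetical names chosen for illustration, and a simple threshold stands in for the streaming SSL encoder.

```python
from typing import Callable, Iterable, List

IRQ = "<IRQ>"  # hypothetical marker meaning "the speaker was interrupted"


def duplex_generate(
    planned_tokens: Iterable[str],
    incoming_frames: Iterable[float],
    is_interruption: Callable[[float], bool],
) -> List[str]:
    """Emit speech tokens one step at a time while checking the listening channel."""
    spoken: List[str] = []
    for token, frame in zip(planned_tokens, incoming_frames):
        # Listening channel: inspect the newest audio frame before speaking further.
        if is_interruption(frame):
            spoken.append(IRQ)
            break
        # Speaking channel: emit the next speech token.
        spoken.append(token)
    return spoken


def energy_gate(frame: float) -> bool:
    """Assumed stand-in for a learned encoder: treat loud input as an interruption."""
    return frame > 0.5


reply = ["t0", "t1", "t2", "t3", "t4"]  # tokens the model planned to speak
mic = [0.0, 0.1, 0.9, 0.0, 0.0]         # per-step input energy from the user
result = duplex_generate(reply, mic, energy_gate)
print(result)  # ['t0', 't1', '<IRQ>']
```

In this sketch the user speaks up at the third step, so generation stops after two tokens instead of running to the end of the planned reply. A real system would replace the threshold with features from the streaming encoder and fuse them into the decoder at an early, middle, or late stage.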

LSLM Alternatives