Don't miss any moment of global AI innovation
Daily three-minute AI industry trends
AI industry milestones
AI monetization case sharing
AI image creation monetization cases
AI video creation monetization cases
AI audio creation monetization cases
AI content writing monetization cases
Free sharing of the latest AI tutorials
Shows total visits ranking of AI websites
Track fastest growing AI websites by traffic
Focus on AI websites with significant traffic drops
Shows weekly visits ranking of AI websites
AI websites most popular with US users
AI websites most popular with Chinese users
AI websites most popular with Indian users
AI websites most popular with Brazilian users
Total visits ranking of AI image generation websites
Total visits ranking of AI personal assistant websites
Total visits ranking of AI character generation websites
Total visits ranking of AI video generation websites
GitHub popular AI projects by total stars
GitHub popular AI projects by growth rate
GitHub popular AI developer ranking
GitHub popular AI organization ranking
GitHub popular deepseek open source projects
GitHub popular TTS open source projects
GitHub popular LLM open source projects
GitHub popular ChatGPT open source projects
Overview of GitHub popular AI open source projects
Production First and Production Ready End-to-End Speech Recognition Toolkit
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
? wukong-robot 是一个简单、灵活、优雅的中文语音对话机器人/智能音箱项目,支持ChatGPT多轮对话能力,还可能是首个支持脑机交互的开源智能音箱项目。
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, support 11 programming languages
TEN Agent is a conversational voice AI agent powered by TEN, integrating Deepseek, Gemini, OpenAI, RTC, and hardware like ESP32. It enables realtime AI capabilities like seeing, hearing, and speaking, and is fully compatible with platforms like Dify and Coze.
Multilingual Voice Understanding Model
Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (ASR), and text-to-speech (TTS) capabilities.
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Streamer-Sales 销冠 —— 卖货主播 LLM 大模型??,一个能够根据给定的商品特点从激发用户购买意愿角度出发进行商品解说的卖货主播大模型。??内含详细的数据生成流程? ?另外还集成了 LMDeploy 加速推理?、RAG检索增强生成 ?、TTS文字转语音?、数字人生成 ?、 Agent 使用网络查询实时信息?、ASR 语音转文字??、Vue 生态搭建前端?、FastAPI 搭建后端??、Docker-compose 打包部署?