Don't miss any moment of global AI innovation
Daily three-minute AI industry trends
AI industry milestones
AI monetization case sharing
AI image creation monetization cases
AI video creation monetization cases
AI audio creation monetization cases
AI content writing monetization cases
Free sharing of the latest AI tutorials
Shows total visits ranking of AI websites
Track fastest growing AI websites by traffic
Focus on AI websites with significant traffic drops
Shows weekly visits ranking of AI websites
AI websites most popular with US users
AI websites most popular with Chinese users
AI websites most popular with Indian users
AI websites most popular with Brazilian users
Total visits ranking of AI image generation websites
Total visits ranking of AI personal assistant websites
Total visits ranking of AI character generation websites
Total visits ranking of AI video generation websites
GitHub popular AI projects by total stars
GitHub popular AI projects by growth rate
GitHub popular AI developer ranking
GitHub popular AI organization ranking
GitHub popular deepseek open source projects
GitHub popular TTS open source projects
GitHub popular LLM open source projects
GitHub popular ChatGPT open source projects
Overview of GitHub popular AI open source projects
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and textual inputs to produce high-quality SVG code with remarkable precision.
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 ?
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
? Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
Latest Advances on System-2 Reasoning
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
实时语音交互数字人,支持端到端语音方案(GLM-4-Voice - THG)和级联方案(ASR-LLM-TTS-THG)。可自定义形象与音色,无须训练,支持音色克隆,首包延迟低至3s。Real-time voice interactive digital human, supporting end-to-end voice solutions (GLM-4-Voice - THG) and cascaded solutions (ASR-LLM-TTS-THG). Customizable appearance and voice, supporting voice cloning, with initial package delay as low as 3s.