Don't miss any moment of global AI innovation
Daily three-minute AI industry trends
AI industry milestones
Lists all AI hardware products.
AI monetization case sharing
AI image creation monetization cases
AI video creation monetization cases
AI audio creation monetization cases
AI content writing monetization cases
Free sharing of the latest AI tutorials
Shows total visits ranking of AI websites
Track fastest growing AI websites by traffic
Focus on AI websites with significant traffic drops
Shows weekly visits ranking of AI websites
AI websites most popular with US users
AI websites most popular with Chinese users
AI websites most popular with Indian users
AI websites most popular with Brazilian users
Total visits ranking of AI image generation websites
Total visits ranking of AI personal assistant websites
Total visits ranking of AI character generation websites
Total visits ranking of AI video generation websites
GitHub popular AI projects by total stars
GitHub popular AI projects by growth rate
GitHub popular AI developer ranking
GitHub popular AI organization ranking
GitHub popular deepseek open source projects
GitHub popular TTS open source projects
GitHub popular LLM open source projects
GitHub popular ChatGPT open source projects
Overview of GitHub popular AI open source projects
任意内容描述的实现:详细的本地化图像和视频字幕
[NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
A Framework of Small-scale Large Multimodal Models
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
A collection of resources on applications of multi-modal learning in medical imaging.
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Embed arbitrary modalities (images, audio, documents, etc) into large language models.
[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"