Don't miss any moment of global AI innovation
Daily three-minute AI industry trends
AI industry milestones
AI monetization case sharing
AI image creation monetization cases
AI video creation monetization cases
AI audio creation monetization cases
AI content writing monetization cases
Free sharing of the latest AI tutorials
Shows total visits ranking of AI websites
Track fastest growing AI websites by traffic
Focus on AI websites with significant traffic drops
Shows weekly visits ranking of AI websites
AI websites most popular with US users
AI websites most popular with Chinese users
AI websites most popular with Indian users
AI websites most popular with Brazilian users
Total visits ranking of AI image generation websites
Total visits ranking of AI personal assistant websites
Total visits ranking of AI character generation websites
Total visits ranking of AI video generation websites
GitHub popular AI projects by total stars
GitHub popular AI projects by growth rate
GitHub popular AI developer ranking
GitHub popular AI organization ranking
GitHub popular deepseek open source projects
GitHub popular TTS open source projects
GitHub popular LLM open source projects
GitHub popular ChatGPT open source projects
Overview of GitHub popular AI open source projects
Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming..
A Doctor for your data
Resources for Data Centric AI
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
Lab assignments for Introduction to Data-Centric AI, MIT IAP 2024 ????
[NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`.
DataCLUE: 数据为中心的NLP基准和工具包
OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023)
Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning
nbsynthetic is simple and robust tabular synthetic data generation library for small and medium size datasets