Data-Juicer
A one-stop data processing system that provides high-quality data for large language models.
CommonProductProductivityMachine LearningData Science
Data-Juicer is a comprehensive multimodal data processing system aimed at delivering higher quality, richer, and more digestible data for large language models (LLMs). It offers a systematic and reusable data processing library, supports collaborative development between data and models, allows rapid iteration through a sandbox lab, and provides features like data and model feedback loops, visualization, and multidimensional automated evaluation, helping users better understand and improve their data and models. Data-Juicer is actively maintained and regularly enhanced with more features, data recipes, and datasets.
Data-Juicer Visit Over Time
Monthly Visits
494758773
Bounce Rate
37.69%
Page per Visit
5.7
Visit Duration
00:06:29