2024-09-25 13:54:53 · AIbase · 12.0k
Beijing Academy of Artificial Intelligence Releases Chinese Internet Corpus CCI3.0 Containing 1000GB Dataset
At the 2024 Beijing Cultural Forum, the Beijing Academy of Artificial Intelligence (BAAI) officially announced the release of CCI3.0 (Chinese Corpora Internet), the next generation of its Chinese Internet corpus, as a further step toward jointly building and sharing data. CCI3.0 comprises a 1000GB dataset and a 498GB high-quality subset, CCI3.0-HQ. It follows the initial open-source release of CCI1.0 in November 2023 and the release of CCI2.0 in April 2024.
2023-11-29 14:00:10 · AIbase · 3.7k
ZhiYuan Research Institute Jointly Builds Chinese Internet Corpus CCI to Provide Resources for Big Data and Artificial Intelligence Industries
ZhiYuan Research Institute (the Beijing Academy of Artificial Intelligence, BAAI), in collaboration with TuoSi and ZhongKe WenGe, has built the 'Chinese Internet Corpus' (CCI). The corpus has been strictly screened and cleaned, totals 104GB, and covers data from 2001 to 2023. The institute will continue to expand its data sources and improve its data processing workflows to provide more high-quality, reliable data resources. It has also released other high-quality Chinese datasets, such as the WUDAO corpus, COIG, and MTP. The initiative aims to support the big data and artificial intelligence industries.