The Beijing Academy of Artificial Intelligence (BAAI), in collaboration with TALIS and iFLYTEK, has established the "Chinese Internet Corpus" (CCI). The initial release contains 104 GB of screened and cleaned data covering the period from 2001 to 2023. BAAI plans to expand the data sources, refine its data processing procedures, and open additional high-quality Chinese datasets such as WUDAO corpora, COIG, and MTP. The initiative aims to provide the big data and artificial intelligence industries with secure and reliable corpus resources.