MNBVC (Massive Never-ending BT Vast Chinese corpus) is a project that aims to provide rich Chinese-language data for AI. It covers not only mainstream culture but also niche subcultures and internet slang. The dataset includes many kinds of plain-text Chinese data: news, essays, novels, books, magazines, papers, dialogues, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, anecdotes, and chat logs.