Search AI Products and News
Explore worldwide AI information, discover new AI opportunities
2023-08-24 10:46:12 · AIbase · 772
AI2 Releases Dolma, an Open-Source Dataset of 3 Trillion Tokens for Large Language Models
AI2 recently released Dolma, an open-source dataset of 3 trillion tokens. Dolma will serve as the foundation for OLMo, the Open Language Model that AI2 is developing and expects to launch in early 2024. The dataset draws on a wide range of sources, including web content, academic publications, code, and books, making it the largest publicly available dataset of its kind.
2023-08-21 10:21:44 · AIbase · 657
AI2 Releases Open Dataset Dolma: Breaking Down Data Barriers for AI Language Models
The Allen Institute for Artificial Intelligence (AI2) has released the open text dataset Dolma, aiming to promote transparency and innovation in AI language models. Dolma is the core of AI2's open language model initiative and gives researchers and developers a free data resource. The dataset contains 3 trillion tokens, is distributed under the medium-risk ImpACT license, and asks users to provide contact information and usage feedback.