Meta has recently unveiled the next generation of its open-source model series, Llama 3.1, which includes a 405-billion-parameter version whose performance is on par with, and on some benchmarks even surpasses, closed-source models such as GPT-4. Llama-3.1-8B-Instruct, the 8-billion-parameter instruction-tuned variant, supports English, German, French, Italian, Portuguese, Spanish, Hindi, and Thai, handles a context length of up to 131,072 tokens, and has a knowledge cutoff of December 2023.

To enhance the capabilities of Llama-3.1-8B-Instruct, Meta used over 25 million synthetic data points generated by the larger 405B model during training. As a result, Llama-3.1-8B-Instruct demonstrates cognitive and reasoning abilities comparable to GPT-3.5 Turbo on code and mathematics tests.

Building on Llama-3.1-8B-Instruct, OpenBuddy trained the model on a small amount of Chinese data and released OpenBuddy-Llama3.1-8B-v22.1-131K, a next-generation open-source cross-lingual model capable of Chinese Q&A and translation between languages. Although Llama 3.1 has no built-in Chinese capability, the trained model can correctly answer questions that commonly cause conceptual confusion, a level of answer typically produced only by larger models, indicating strong cognitive potential.

However, owing to limits on training data and time, OpenBuddy-Llama3.1-8B-v22.1 still has gaps in Chinese knowledge, particularly in traditional Chinese culture. Nevertheless, the model performs relatively stably on tasks such as long-text comprehension, thanks to the base model's inherent long-context capability.

Going forward, OpenBuddy plans larger-scale training runs for the 8B and 70B models to strengthen their Chinese knowledge, long-text capability, and cognitive ability, and to explore the feasibility of fine-tuning the 405B model.

Project Address: https://modelscope.cn/models/OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k
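
For readers who want to try the model locally, below is a minimal inference sketch (not part of the announcement), assuming a standard ModelScope plus Hugging Face transformers setup with a GPU available. The system prompt and generation settings are illustrative placeholders, not OpenBuddy's official chat template.

```python
# Minimal sketch: download OpenBuddy-Llama3.1-8B-v22.1-131K from ModelScope
# and run a short Chinese Q&A prompt with transformers.
from modelscope import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the model weights from the ModelScope hub.
model_dir = snapshot_download("OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k")

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    torch_dtype="auto",   # use the checkpoint's native precision (bf16/fp16)
    device_map="auto",    # place layers on the available GPU(s)
)

# Chat-style prompt; this system message is an assumption for illustration.
messages = [
    {"role": "system", "content": "You are a helpful bilingual assistant."},
    {"role": "user", "content": "请用中文简单介绍一下你自己。"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that while the model supports contexts up to 131,072 tokens, feeding very long inputs requires correspondingly more GPU memory, so shorter prompts are advisable for a first test.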