The Beijing Academy of Artificial Intelligence (BAAI) and the Artificial Intelligence Research Institute of China Telecom (TeleAI) have recently upgraded their jointly developed Tele-FLM series of large models, releasing the 52B instruction-tuned model FLM-2-52B-Instruct and the world's first trillion-parameter monolithic dense model, Tele-FLM-1T, and open-sourcing the corresponding technical reports and model checkpoints.

FLM-2-52B-Instruct is an instruction-tuned dialogue model built on the Tele-FLM-52B base model, with a focus on Chinese dialogue capabilities. It was trained with supervised fine-tuning on a pool of one million open-source instruction samples, and the best results were obtained with a curated subset of about 30,000 examples covering mathematical problems, code, and multi-turn dialogues. The batch size, learning rate, and epoch settings used for training are detailed in the technical report. Evaluated on the AlignBench benchmark, FLM-2-52B-Instruct reached roughly 90% of GPT-4's performance on Chinese dialogue.
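As a rough illustration, the released checkpoint can be loaded for inference through the Hugging Face transformers library. The generation settings and the plain-string prompt below are assumptions for the sketch, not the officially recommended usage; consult the model card for the proper chat template and parameters.

```python
# Minimal inference sketch for FLM-2-52B-Instruct (illustrative only).
# The prompt format and generation parameters are assumptions, not the
# officially documented settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "CofeAI/FLM-2-52B-Instruct-2407"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # 52B parameters: multiple GPUs are required
    device_map="auto",
    trust_remote_code=True,
)

prompt = "请用一句话介绍一下北京。"  # "Introduce Beijing in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```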

Tele-FLM-1T is the world's first open-source trillion-parameter dense model. It was pre-trained with a growth-based approach to reduce compute costs: rather than training the trillion-parameter model from scratch, smaller models are progressively grown into larger ones. The architecture builds on the decoder-only Transformer of the GPT series, adding input and output multipliers, rotary positional embedding (RoPE), RMSNorm, and SwiGLU, among other changes. Growth proceeds both widthwise (horizontal) and depthwise (vertical), using value-preserving operators improved from MSG, so that each grown model initially reproduces the behavior of its smaller predecessor, as sketched below. The pre-training hyperparameters are documented in the technical report.
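The idea behind a value-preserving growth operator can be illustrated with a toy example: when a layer is widened, the new hidden units receive fresh incoming weights, but the weights that read from them start at zero, so the widened network computes exactly the same outputs as the original. The sketch below shows this principle for a simple MLP; it is a simplified illustration only, not the MSG-based operator actually used for Tele-FLM-1T, which additionally handles attention heads, normalization layers, annealed masks, and depth growth.

```python
# Toy sketch of function-preserving ("value-preserving") width growth for an MLP.
# New hidden units get fresh incoming weights, but their outgoing weights start
# at zero, so the widened network reproduces the original network's outputs.
import torch
import torch.nn as nn


def grow_mlp_width(fc1: nn.Linear, fc2: nn.Linear, new_hidden: int):
    """Widen fc1/fc2 from hidden size H to new_hidden (> H), preserving outputs."""
    old_hidden = fc1.out_features
    assert new_hidden > old_hidden

    new_fc1 = nn.Linear(fc1.in_features, new_hidden)
    new_fc2 = nn.Linear(new_hidden, fc2.out_features)

    with torch.no_grad():
        # Copy existing weights; the new fc1 rows keep their random init.
        new_fc1.weight[:old_hidden] = fc1.weight
        new_fc1.bias[:old_hidden] = fc1.bias
        # Copy existing fc2 columns; zero the columns reading the new units.
        new_fc2.weight[:, :old_hidden] = fc2.weight
        new_fc2.weight[:, old_hidden:] = 0.0
        new_fc2.bias.copy_(fc2.bias)
    return new_fc1, new_fc2


# Quick check: the widened MLP matches the original on random input.
torch.manual_seed(0)
fc1, fc2 = nn.Linear(16, 32), nn.Linear(32, 16)
g1, g2 = grow_mlp_width(fc1, fc2, new_hidden=64)
x = torch.randn(4, 16)
small = fc2(torch.relu(fc1(x)))
large = g2(torch.relu(g1(x)))
print(torch.allclose(small, large, atol=1e-6))  # expected: True
```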

FLM-2-52B-Instruct Model Open-Source Link:

https://huggingface.co/CofeAI/FLM-2-52B-Instruct-2407

Tele-FLM-1T Model Open-Source Link:

https://huggingface.co/CofeAI/Tele-FLM-1T

52B + 1T Technical Report Link:

https://arxiv.org/abs/2407.02783

52B Base Model Technical Report Link:

https://arxiv.org/abs/2404.16645