Large Model Slimming Expert! LLMC, a Powerful Compression Toolkit That Handles Llama 3.1 405B on a Single Card, Is Here

AIbase基地

Published in AI News · 6 min read · Aug 2, 2024


Artificial intelligence (AI) keeps getting more capable, and large language models (LLMs) in particular are astonishingly good at processing natural language. Behind these intelligent AI brains, however, lies a need for substantial computing power and storage space.

BLOOM, a multilingual model with 176 billion parameters, requires at least 350GB of space just to store its weights, and several high-end GPUs to run. That makes it not only costly but also hard to deploy widely.
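The storage figure follows from simple arithmetic. The sketch below assumes the weights are stored in a 16-bit format (an assumption, though it is consistent with the "at least 350GB" figure) and shows how the footprint would shrink at lower bit widths. The function name is illustrative, and the calculation ignores activations and any per-group quantization metadata.

```python
def weight_storage_gb(num_params: float, bits_per_weight: int) -> float:
    """Return the storage needed for the weights alone, in decimal gigabytes."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9


PARAMS_176B = 176e9  # parameter count quoted in the article

for bits in (16, 8, 4):
    size = weight_storage_gb(PARAMS_176B, bits)
    print(f"{bits:>2}-bit weights: {size:.0f} GB")
```

Halving the bit width halves the weight storage, which is why low-bit formats matter so much for models of this size.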

To address this issue, researchers have proposed a technique called "quantization." Quantization is like putting the AI brain on a diet: it maps the model's weights and activations to a lower-bit data format, which both shrinks the model and speeds up inference. The process carries a risk, though, because it can sacrifice some accuracy.
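To make the idea concrete, here is a minimal sketch of one common scheme, symmetric round-to-nearest integer quantization, written in plain Python. It is not taken from LLMC; the function names and the toy weights are illustrative assumptions. It shows both sides of the trade-off: each value is stored as a small integer plus one shared scale, and the restored values differ slightly from the originals.

```python
def quantize_symmetric(values, bits=8):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]


weights = [0.12, -0.53, 0.98, -1.27, 0.04]  # toy values, not real model weights
codes, scale = quantize_symmetric(weights, bits=8)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("integer codes:", codes)
print("largest round-trip error:", max_error)
```

With round-to-nearest, the error on any value inside the range is at most half the scale, so fewer bits (a larger scale) means a coarser approximation.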

Facing this challenge, researchers from Beihang University and SenseTime have jointly developed the LLMC toolkit. LLMC is like a personal fitness coach for AI, helping researchers and developers find the most suitable "weight loss plan," making the AI model lighter without affecting its "intelligence level."

The LLMC toolkit has three main features:

Diversity: LLMC offers 16 different quantization methods, like preparing 16 different diet plans for AI. Whether your AI wants to lose weight all over or just in certain areas, LLMC can meet your needs.

Low cost: LLMC is frugal with resources and can handle ultra-large models on minimal hardware. For example, a single 40GB A100 GPU is enough to calibrate and evaluate a 175-billion-parameter model such as OPT-175B. It's as efficient as training an Olympic champion on a home treadmill!

High compatibility: LLMC supports multiple quantization settings and model formats, and is compatible with various backends and hardware platforms. It's like a versatile coach: no matter what equipment you use, it can help you devise a suitable training plan.

LLMC's practical application: Making AI smarter and more energy-efficient

The emergence of the LLMC toolkit provides a comprehensive and fair benchmark test for the quantization of large language models. It considers three key factors: training data, algorithms, and data formats, helping users find the best performance optimization solution.

In practical applications, LLMC can help researchers and developers integrate suitable algorithms and low-bit formats more efficiently, promoting the compression and wider adoption of large language models. This means that in the future, we may see more lightweight yet equally powerful AI applications.

The authors of the paper also shared some interesting findings and suggestions:

When selecting training data, choose datasets that are more similar to the test data in terms of vocabulary distribution, just as humans should choose suitable diets based on their own conditions.
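One rough way to act on this advice is to compare word-frequency distributions between a candidate dataset and the test data. The sketch below is an illustration only: the whitespace tokenization, the cosine-similarity metric, and the toy texts are all assumptions made here, not the procedure used in the paper. In practice the model's own tokenizer would be a more faithful choice.

```python
import math
from collections import Counter


def word_distribution(text):
    """Return the relative frequency of each whitespace-separated token."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency dictionaries."""
    dot = sum(value * q.get(word, 0.0) for word, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Toy texts standing in for real datasets (hypothetical examples).
test_text = "def add ( a , b ) : return a + b"
candidate_code = "def mul ( a , b ) : return a * b"
candidate_prose = "the cat sat on the mat and looked at the dog"

test_dist = word_distribution(test_text)
sim_code = cosine_similarity(test_dist, word_distribution(candidate_code))
sim_prose = cosine_similarity(test_dist, word_distribution(candidate_prose))

print("similarity to code-like candidate:", sim_code)
print("similarity to prose candidate:", sim_prose)
```

Under this heuristic, the candidate with the higher score shares more of its vocabulary with the test data and would be the preferred choice.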

Regarding quantization algorithms, they explored the impact of three main techniques: transformation, clipping, and reconstruction, much like comparing the effects of different exercise methods on weight loss.
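As an illustration of one such technique, the sketch below shows the general idea of narrowing the quantization range before rounding, so that a single outlier does not force a coarse step size on every other value. It is a simplified assumption-laden example, not LLMC's implementation: it searches a small grid of candidate ranges and scores them by mean squared error on the weights themselves, whereas real methods often score candidates by the error in a layer's output.

```python
def quantize_with_clip(values, clip, bits=3):
    """Clamp values to [-clip, clip], then round to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1  # 3 for 3 bits
    scale = clip / qmax
    result = []
    for v in values:
        clamped = max(-clip, min(clip, v))
        result.append(round(clamped / scale) * scale)
    return result


def mse(original, approx):
    """Mean squared error between two equal-length lists."""
    return sum((a - b) ** 2 for a, b in zip(original, approx)) / len(original)


# Toy data: 101 values spread over [-0.5, 0.5] plus one outlier.
weights = [((i * 37) % 101 - 50) / 100 for i in range(101)] + [1.5]

full_range = max(abs(w) for w in weights)
err_full = mse(weights, quantize_with_clip(weights, full_range))

# Grid search over candidate ranges from 10% to 100% of the full range.
candidates = [full_range * k / 20 for k in range(2, 21)]
best_clip = min(candidates, key=lambda c: mse(weights, quantize_with_clip(weights, c)))
best_err = mse(weights, quantize_with_clip(weights, best_clip))

print("full-range error:", err_full)
print("best range:", best_clip, "error:", best_err)
```

The search can settle on a range smaller than the largest value, accepting a larger error on the outlier in exchange for finer resolution everywhere else.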

When choosing between integer or floating-point quantization, they found that floating-point quantization has an advantage in handling complex situations, while in some special cases, integer quantization might be better. This is similar to needing different exercise intensities at different stages of weight loss.

The advent of the LLMC toolkit brings a breath of fresh air to the AI field. It not only gives researchers and developers a powerful assistant but also points the way for the future development of AI. Through LLMC, we can look forward to more lightweight, high-performance AI applications that truly integrate AI into our daily lives.

Project address: https://github.com/ModelTC/llmc

Paper address: https://arxiv.org/pdf/2405.06001


This article is from AIbase Daily
