Small but Mighty! A 10-Person Team Develops the First Fine-Tuned Llama 3.1 405B

AIbase基地

Published in AI News · 5 min read · Aug 16, 2024


A team of just 10 people dares to challenge tech giant Meta. This is a real-life version of "David versus Goliath"!

This startup, named Nous Research, is no obscure entity. Their recently launched Hermes3 is a fine-tuned version of the Llama3.1 405B model. Despite the small team size, their capabilities are not to be underestimated. This "ten-person powerhouse" has successfully fine-tuned multiple models, including Mistral, Yi, and Llama, with downloads surpassing 33 million, making them the "hit-maker" of the AI world!

The emergence of Hermes3 is like a shot of adrenaline for the AI world. Even after FP8 quantization, its performance remains astonishingly strong. This optimization not only significantly reduces the model's VRAM and disk requirements but also allows Hermes3 to run on a single node, a boon for developers!
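The memory saving can be illustrated with back-of-the-envelope arithmetic. The sketch below counts weight storage only (no KV cache or activations) and assumes a hypothetical node with 8 GPUs of 80 GB each; the article does not state the actual node configuration, and the 16-bit baseline is likewise an assumption for comparison.

```python
# Rough weight-memory estimate for a 405B-parameter model.
# Counts weights only; KV cache and activations need extra memory.
PARAMS = 405e9  # parameter count, taken from the model name

# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}

# Assumption for illustration: one node with 8 GPUs of 80 GB each.
NODE_GB = 8 * 80


def weight_gb(params: float, dtype: str) -> float:
    """Return weight storage in GB (10^9 bytes) for the given format."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    gb = weight_gb(PARAMS, dtype)
    print(f"{dtype}: {gb:.0f} GB of weights, fits in {NODE_GB} GB node: {gb < NODE_GB}")
```

Under these assumptions, the 16-bit weights (about 810 GB) exceed the node's 640 GB, while the FP8 weights (about 405 GB) fit with room left over for inference state.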

In terms of conversational abilities, Hermes3 is a versatile performer. Whether it's long-term memory, multi-turn conversations, role-playing, or internal monologues, it handles them all with ease. Thanks to the 128K context window of Llama3.1, Hermes3 maintains conversational coherence like a seasoned diplomat.

But Hermes3's capabilities don't stop there. It goes beyond traditional language modeling: it can understand and evaluate the quality of generated text in a nuanced way. This means it can not only speak eloquently but also act as a strict literary critic!

Even more astonishing is that Hermes3 integrates several agent capabilities, including structured output, intermediate step output, and generating internal monologues for transparent decision-making. It's like equipping the AI with a "transparent brain," allowing us to glimpse its thought process.

The training process of Hermes3 is akin to a "devil's training" in the AI world. It underwent a supervised fine-tuning (SFT) phase followed by a direct preference optimization (DPO) phase. The team spent a full five months screening and building the SFT dataset, and their focus and patience are truly commendable.
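For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not Nous Research's training code: the function name, argument names, and the `beta` value are illustrative assumptions, and the inputs are the summed log-probabilities that the policy and a frozen reference model assign to the preferred ("chosen") and dispreferred ("rejected") responses.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    beta is an illustrative default, not a value from the article.
    """
    # How much more the policy favors each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)
    # loss = -log(sigmoid(margin)) = softplus(-margin), computed stably.
    z = -margin
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# When the policy matches the reference, the loss is log(2).
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
# Raising the chosen response's log-probability lowers the loss.
print(dpo_loss(-8.0, -12.0, -10.0, -12.0))
```

The loss falls as the policy raises the likelihood of the chosen response relative to the rejected one, measured against the reference model, which is what steers the model toward preferred outputs without a separate reward model.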

Founded in 2023, Nous Research, a private applied research group headquartered in New York, is the "barbarian invader" of the AI world. They firmly believe in the power of open-source and are determined to challenge the innovation restrictions of closed technologies. The company's slogan is stirring: "We challenge the assumption that closed technologies will forever dominate the peak of innovation; instead, we provide powerful open-source code."

In just over a year, Nous Research has released 5 datasets and 89 models, seemingly proclaiming to the world: size doesn't matter, strength is the key!

Paper link: https://nousresearch.com/wp-content/uploads/2024/08/Hermes-3-Technical-Report.pdf

Official introduction: https://nousresearch.com/freedom-at-the-frontier-hermes-3/
