A small team of just 10 people dares to challenge tech giant Meta: a real-life "David versus Goliath" story!
This startup, Nous Research, is no obscure outfit. Its recently launched Hermes3 is a fine-tune of the 405B-parameter Llama3.1 model. Despite its small size, the team is not to be underestimated: this "ten-person powerhouse" has fine-tuned multiple model families, including Mistral, Yi, and Llama, with downloads surpassing 33 million, making it a "hit-maker" of the AI world!
The arrival of Hermes3 is a shot of adrenaline for the AI world. Even after FP8 quantization, its performance remains strong. Quantization significantly reduces the model's VRAM and disk requirements, which in turn lets the 405B model run on a single node, a boon for developers!
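To see why FP8 matters for single-node deployment, a back-of-the-envelope estimate helps. The sketch below counts weight memory only and rests on assumptions that are not from the article: a 16-bit baseline at 2 bytes per weight, FP8 at 1 byte per weight, and a hypothetical node with 8 accelerators of 80 GB each. KV cache, activations, and runtime overhead are ignored, so real requirements are higher.

```python
# Rough weight-memory estimate for a 405B-parameter model.
# Assumptions (illustrative, not from the article):
#   - 16-bit baseline uses 2 bytes per weight, FP8 uses 1 byte per weight
#   - a "single node" means 8 accelerators with 80 GB each
PARAMS = 405e9
BYTES_PER_WEIGHT = {"bf16": 2, "fp8": 1}
NODE_MEMORY_GB = 8 * 80  # hypothetical node capacity


def weight_memory_gb(params: float, dtype: str) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * BYTES_PER_WEIGHT[dtype] / 1e9


def fits_on_node(params: float, dtype: str, node_gb: float = NODE_MEMORY_GB) -> bool:
    """True if the weights alone fit in the node's total accelerator memory."""
    return weight_memory_gb(params, dtype) <= node_gb


if __name__ == "__main__":
    for dtype in BYTES_PER_WEIGHT:
        gb = weight_memory_gb(PARAMS, dtype)
        print(f"{dtype}: {gb:.0f} GB of weights, fits on one node: {fits_on_node(PARAMS, dtype)}")
```

Under these assumptions the 16-bit weights (about 810 GB) exceed the node's 640 GB, while the FP8 weights (about 405 GB) fit with room left over for the KV cache.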
In conversation, Hermes3 is a versatile performer. Long-term memory, multi-turn dialogue, role-playing, internal monologue: it handles them all with ease. Thanks to the 128K-token context window inherited from Llama3.1, Hermes3 maintains conversational coherence like a seasoned diplomat.
But Hermes3's capabilities don't stop there. It goes beyond traditional language modeling and can judge the quality of generated text with nuance. In other words, it can not only speak eloquently but also act as a strict literary critic!
Even more striking, Hermes3 integrates several agent capabilities, including structured output, output of intermediate steps, and internal monologues that make its decision-making transparent. It's like equipping the AI with a "transparent brain," allowing us to glimpse its thought process.
The training of Hermes3 was the AI world's equivalent of a grueling boot camp. The model went through a supervised fine-tuning (SFT) phase followed by a direct preference optimization (DPO) phase. The team spent a full five months curating and building the SFT dataset; their focus and patience are truly commendable.
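For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair. It illustrates the general technique and is not Nous Research's training code: the function names, the `beta` value, and the example log-probabilities are all illustrative assumptions. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    loss = -log sigmoid(beta * [(pi_c - ref_c) - (pi_r - ref_r)])
    beta = 0.1 is an illustrative value, not taken from the article.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) equals softplus(-z)
    return softplus(-logits)


if __name__ == "__main__":
    # Policy identical to the reference: loss is log(2).
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    # Policy now favors the chosen response more than the reference does: loss drops.
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

The loss falls as the policy raises the likelihood of the preferred response relative to the reference model, which is what pushes the fine-tuned model toward preferred outputs without training a separate reward model.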
Founded in 2023 and headquartered in New York, Nous Research is a private applied research group and the "barbarian invader" of the AI world. It firmly believes in the power of open source and is determined to challenge the limits that closed technologies place on innovation. The company's slogan is stirring: "We challenge the assumption that closed technologies will forever dominate the peak of innovation; instead, we provide powerful open-source code."
In just over a year, Nous Research has released 5 datasets and 89 models, as if to proclaim to the world: size doesn't matter; strength is what counts!
Paper link: https://nousresearch.com/wp-content/uploads/2024/08/Hermes-3-Technical-Report.pdf
Official introduction: https://nousresearch.com/freedom-at-the-frontier-hermes-3/