IBM has officially launched its next-generation open-source large language model, Granite 3.1, aiming to take a leading position in the enterprise AI sector. This series of models features an extended context length of 128K, embedding models, built-in hallucination detection, and significant performance improvements.
IBM claims that the Granite 8B Instruct model outperforms open-source competitors of similar scale, including Meta's Llama 3.1, Alibaba's Qwen 2.5, and Google's Gemma 2.
The release of Granite 3.1 reflects IBM's rapid iteration of the Granite series: Granite 3.0 launched just this past October. IBM says its revenue related to generative AI has reached $2 billion. The core idea of the new version is to pack more functionality into smaller models, so that enterprise users can run them more easily and cost-effectively.
David Cox, Vice President of IBM Research, said the Granite models are used widely in IBM's internal products, consulting services, and customer service, and are also released as open source, so they must meet high standards across the board. Performance is judged not only on raw speed but on efficiency: how quickly users can get useful results.
The context length improvement in Granite 3.1 is particularly notable, expanding from the original 4K to 128K. This matters for enterprise AI users, especially in retrieval-augmented generation (RAG) and agentic AI: the longer context lets the model take in lengthier documents, logs, and conversations, so it can better understand and respond to complex queries.
IBM has also launched a series of embedding models to accelerate converting data into vectors. Among them, the Granite-Embedding-30M-English model answers queries in 0.16 seconds, outperforming competing products. The performance gains in Granite 3.1 come from innovations in the multi-stage training process and the use of higher-quality training data.
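To make the role of embedding models concrete, here is a minimal, illustrative sketch of the retrieval step they enable in a RAG pipeline: encode texts as vectors, then rank documents by cosine similarity to the query. The `embed` function below is a toy stand-in (word hashing), not IBM's model; in practice a real embedding model such as Granite-Embedding-30M-English would produce the vectors.

```python
import hashlib
import math

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy stand-in for a real embedding model: hash each word into a
    fixed-size vector. Real models capture semantics; this only captures
    word overlap, but the retrieval plumbing is the same."""
    vec = [0.0] * dim
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by similarity to the query vector; return the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "Granite models support a 128K context window",
    "The cafeteria menu changes every Tuesday",
    "Embedding models convert text into vectors for retrieval",
]
print(retrieve("how do embedding models convert text into vectors", docs, k=1))
```

The query latency IBM highlights matters precisely because this encode-and-rank step runs on every user query in a RAG system.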
For hallucination detection, Granite 3.1 integrates the protection within the model itself, enabling it to self-detect and reduce erroneous outputs. Building detection in, rather than running a separate checker, improves overall efficiency and reduces the number of inference calls.
The Granite 3.1 models are currently available to enterprise users for free and are offered through IBM's watsonx enterprise AI service. IBM plans to keep up a rapid release cadence, with Granite 3.2 slated to introduce multimodal capabilities in early 2025.
Official blog: https://www.ibm.com/new/announcements/ibm-granite-3-1-powerful-performance-long-context-and-more
Key Points:
🌟 IBM launches the Granite 3.1 model, aiming to lead the open-source large language model market.
💡 The new model supports a 128K context length, significantly enhancing processing capability and efficiency.
🚀 The hallucination detection feature is integrated into the model, optimizing overall performance and accuracy.