Recently, Inflection AI made a striking decision for its latest enterprise platform: dropping Nvidia's GPUs in favor of Intel's Gaudi3 accelerators. The move marks a strategic shift for the company, whose earlier "Pi" consumer applications all ran on Nvidia hardware. Inflection 3.0 will instead rely on Gaudi3, letting customers run it either on premises or on Intel's cloud-based Tiber AI Cloud.
Inflection AI was founded in 2022 and initially focused on developing a conversational personal assistant named Pi. However, after co-founders Mustafa Suleyman and Karén Simonyan left for Microsoft in the spring, the company pivoted to building custom fine-tuned models for enterprises, using customer data to improve service quality.
Inflection 3.0 is the latest version of the platform, designed to tailor AI applications to each enterprise by fine-tuning models on proprietary datasets. Notably, Intel will be among the first customers for the service, which has prompted speculation about whether Inflection is paying full price for the accelerators.
Although Inflection plans to run its services on Gaudi3 accelerators, those systems will not be available right away. Like the earlier Inflection 2.5, the latest version will run on Intel's Tiber AI Cloud service. Recognizing that some customers may want to keep their data on premises, however, Inflection plans to offer physical systems based on Intel's AI accelerators starting in the first quarter of 2025.
One benefit of adopting Gaudi3 accelerators is a significant price-performance gain for Inflection. Sean White, CEO of Inflection AI, said in a blog post that Intel's technology delivered up to a 2x price-performance improvement over current competing products. Gaudi3 is also said to outpace Nvidia's H100 in both training and inference speed, at a lower cost.
Gaudi3's technical specifications are formidable: 128GB of HBM2e memory, up to 3.7 TB/s of memory bandwidth, and 1,835 teraFLOPS of dense FP8 or BF16 compute. At 16-bit precision, Gaudi3's floating-point throughput is almost twice that of the H100, which matters for Inflection's focus on training and fine-tuning workloads.
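The "almost twice" claim at 16-bit precision can be sanity-checked with simple arithmetic. A minimal sketch, assuming a dense BF16 figure of roughly 989 teraFLOPS for the H100 SXM (taken from publicly listed specifications, not from this article):

```python
# Back-of-envelope check of the 16-bit throughput comparison.
# Gaudi3's dense BF16 figure comes from the article; the H100 figure
# is an assumed value from public spec sheets (dense, no sparsity).
GAUDI3_BF16_TFLOPS = 1835
H100_BF16_TFLOPS = 989  # assumption: H100 SXM dense BF16 throughput

ratio = GAUDI3_BF16_TFLOPS / H100_BF16_TFLOPS
print(f"Gaudi3 vs H100 dense BF16: {ratio:.2f}x")
```

Under these assumptions the ratio comes out around 1.85x, consistent with the article's "almost twice" characterization; note that quoting the H100's sparsity-accelerated figure instead would roughly halve the gap.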
Separately, Intel recently announced that IBM will deploy Gaudi3 accelerators in its cloud platform, with availability planned for early 2025, a sign that Gaudi3 is gradually gaining market traction.
Key Points:
🌟 Inflection AI has decided to abandon Nvidia GPUs in favor of Intel's Gaudi3 accelerators.
🚀 Inflection 3.0 will be based on Gaudi3, providing customized AI applications for enterprises.
💰 Using Gaudi3, Inflection AI has achieved up to twice the price-performance improvement.