Recently, Inflection AI made a striking decision for its latest enterprise platform: dropping Nvidia's GPUs in favor of Intel's Gaudi3 accelerators. The move marks a strategic shift for the company, whose earlier "Pi" consumer applications all ran on Nvidia hardware. Inflection 3.0 will instead rely on Gaudi3, letting customers run it either on premises or on Intel's cloud-based Tiber AI Cloud.
Inflection AI was founded in 2022 and initially focused on developing a conversational personal assistant named Pi. However, after co-founders Mustafa Suleyman and Karén Simonyan left for Microsoft in the spring, the company pivoted to building custom fine-tuned models for enterprises, using customer data to improve service quality.
Inflection 3.0 is the latest version of the platform, designed to tailor AI applications to each enterprise by fine-tuning models on proprietary datasets. Notably, Intel will be among the first customers for the service, which has prompted speculation about whether Inflection is paying full price for the accelerators.
Although Inflection plans to run its services on Gaudi3 accelerators, those systems will not be available right away. Like the earlier Inflection 2.5, the latest version will run on Intel's Tiber AI Cloud service. Recognizing that some customers may want to keep their data on premises, however, Inflection plans to offer physical systems based on Intel's AI accelerators starting in the first quarter of 2025.
One benefit of adopting Gaudi3 accelerators is a significant price-performance gain for Inflection. Sean White, CEO of Inflection AI, said in a blog post that Intel's technology delivered up to a 2x price-performance improvement over current competing products. Gaudi3 is also said to outpace Nvidia's H100 in both training and inference speed, at a lower cost.
Gaudi3's technical specifications are formidable: 128GB of HBM2e memory, up to 3.7 TB/s of memory bandwidth, and 1,835 teraFLOPS of dense FP8 or BF16 compute. At 16-bit precision, Gaudi3's floating-point throughput is almost twice that of the H100, which matters for Inflection's focus on training and fine-tuning workloads.
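The "almost twice" claim at 16-bit precision can be sanity-checked with simple arithmetic. A minimal sketch, assuming a dense BF16 figure of roughly 989 teraFLOPS for the H100 SXM (taken from publicly listed specifications, not from this article):

```python
# Back-of-envelope check of the 16-bit throughput comparison.
# Gaudi3's dense BF16 figure comes from the article; the H100 figure
# is an assumed value from public spec sheets (dense, no sparsity).
GAUDI3_BF16_TFLOPS = 1835
H100_BF16_TFLOPS = 989  # assumption: H100 SXM dense BF16 throughput

ratio = GAUDI3_BF16_TFLOPS / H100_BF16_TFLOPS
print(f"Gaudi3 vs H100 dense BF16: {ratio:.2f}x")
```

Under these assumptions the ratio comes out around 1.85x, consistent with the article's "almost twice" characterization; note that quoting the H100's sparsity-accelerated figure instead would roughly halve the gap.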
Separately, Intel recently announced that IBM will deploy Gaudi3 accelerators in its cloud platform, with availability planned for early 2025, a sign that Gaudi3 is gradually gaining market traction.
Key Points:
🌟 Inflection AI has decided to abandon Nvidia GPUs in favor of Intel's Gaudi3 accelerators.
🚀 Inflection 3.0 will be based on Gaudi3, providing customized AI applications for enterprises.
💰 Using Gaudi3, Inflection AI has achieved up to twice the price-performance improvement.