At the 2024 AWS re:Invent conference, Amazon Web Services (AWS) announced the general availability of Amazon Elastic Compute Cloud (EC2) instances powered by its Trainium2 chip. The new instances offer 30-40% better price-performance than the previous generation of GPU-based EC2 instances. AWS CEO Matt Garman stated, "I am excited to announce the official launch of the Trainium2-driven Amazon EC2 Trn2 instances."


In addition to the Trn2 instances, AWS introduced Trn2 UltraServers and previewed its next-generation Trainium3 AI chip. Each Trn2 instance is equipped with 16 Trainium2 chips and delivers up to 20.8 petaflops of compute, designed specifically for training and deploying large language models (LLMs) with billions of parameters.

The Trn2 UltraServers combine four Trn2 servers into a single system, linking 64 Trainium2 chips to provide up to 83.2 petaflops of compute for workloads that need greater scale in training and inference. AWS Vice President of Compute and Networking David Brown stated, "The launch of Trainium2 instances and Trn2 UltraServers provides customers with the computational power needed to tackle the most complex AI models."

AWS has partnered with Anthropic to build a large-scale AI computing cluster called Project Rainier, using hundreds of thousands of Trainium2 chips. The infrastructure will support Anthropic's model development, including optimizing its flagship product, Claude, to run on Trainium2 hardware.

Additionally, Databricks and Hugging Face are also collaborating with AWS to leverage the capabilities of Trainium to enhance the performance and cost-efficiency of their AI products. Databricks plans to utilize this hardware to enhance its Mosaic AI platform, while Hugging Face will integrate Trainium2 into its AI development and deployment tools.

Other Trainium2 customers include Adobe, Poolside, and Qualcomm. Garman said Adobe has been very satisfied with early testing of its Firefly inference model on Trainium2 and expects significant cost savings. "Poolside anticipates a 40% savings compared to other options," he added. "Qualcomm is leveraging Trainium2 to develop AI systems that can be trained in the cloud and deployed at the edge."

Furthermore, AWS has previewed its Trainium3 chip, which is built on a 3-nanometer process. UltraServers based on Trainium3 are expected to launch by the end of 2025, aiming to deliver four times the performance of Trn2 UltraServers.

To help customers get the most out of Trainium hardware, AWS also offers the Neuron SDK, a software toolkit for optimizing models to run at peak performance on Trainium chips. The SDK supports frameworks such as JAX and PyTorch, allowing customers to adopt Trainium in their existing workflows with minimal code changes.

Currently, Trn2 instances are available in the US East (Ohio) region, with plans to expand to other regions in the future. UltraServers are currently in preview.

Highlights:

🌟 AWS launches Trainium2-powered Trn2 instances with a 30-40% price-performance improvement over the previous generation of GPU-based instances.  

💡 Trn2 UltraServers combine multiple Trn2 servers, offering stronger computing power to meet the needs of large AI models.  

🚀 AWS collaborates with multiple companies to advance the application of AI technology, helping customers gain advantages in cost and performance.