GPU Cloud

AWS democratizes access to H100 GPU instances for ML

Sven

November 2nd, 2023

~ 3 min read

Recent advancements in machine learning (ML) have opened up new opportunities for organizations of all sizes and industries to innovate and transform their businesses. However, demand for the GPU capacity needed to train, fine-tune, experiment with, and run inference on these models has outpaced supply, making GPUs a scarce resource. To address this challenge, Amazon Web Services (AWS) is introducing Amazon Elastic Compute Cloud (Amazon EC2) Capacity Blocks for ML, a new usage model that democratizes access to GPU instances for ML and generative AI workloads. This post explores the features and benefits of EC2 Capacity Blocks and how they let customers reserve and access GPU capacity for their ML workloads.

The Challenge of GPU Capacity for ML Workloads

The rapid growth of ML has led to a surge in demand for GPU capacity, while the supply of GPUs remains limited. This creates a bottleneck, particularly for customers whose capacity needs vary with their research and development phase. Without a way to secure GPUs for the periods when they are actually needed, these organizations risk delays to their ML initiatives.

AWS EC2 Capacity Blocks for ML

In response to this demand, AWS has developed EC2 Capacity Blocks for ML. The model works like a hotel room reservation: customers reserve GPU instances for a specific date and duration, choosing from a range of instance sizes and specifying the number of instances they require.

By reserving GPU instances in advance, customers can be sure that capacity will be available for training and fine-tuning ML models, running experiments, and preparing for future surges in demand. According to AWS, EC2 Capacity Blocks also offer the highest performance in EC2 for ML training, which makes them a strong choice for organizations looking to optimize their ML workloads.

Customers can reserve capacity blocks through the Amazon EC2 console, the AWS Command Line Interface (CLI), or the AWS SDKs, selecting the number of instances and the duration of the reservation. Pricing is dynamic and depends on supply and demand at the time of purchase.
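
For readers who prefer the SDK route, the reservation flow might look roughly like the Python sketch below. It assumes the boto3 EC2 client methods `describe_capacity_block_offerings` and `purchase_capacity_block`, the `p5.48xlarge` instance type, and a response in which `UpfrontFee` is a string. The helper names and the "pick the cheapest offer" rule are illustrative choices, not something AWS prescribes. Purchasing is switched off by default, since a real purchase is billed immediately.

```python
from datetime import datetime, timedelta, timezone


def build_offering_query(instance_count, duration_hours, earliest_start,
                         search_window_days=14, instance_type="p5.48xlarge"):
    """Build search parameters for available Capacity Block offerings."""
    return {
        "InstanceType": instance_type,  # assumed P5 size
        "InstanceCount": instance_count,
        "CapacityDurationHours": duration_hours,
        "StartDateRange": earliest_start,  # earliest acceptable start
        # Latest acceptable end of the reservation.
        "EndDateRange": earliest_start + timedelta(days=search_window_days),
    }


def cheapest_offering(offerings):
    """Return the offering with the lowest upfront fee, or None if none exist."""
    # UpfrontFee is assumed to be a string, so compare it numerically.
    return min(offerings, key=lambda o: float(o["UpfrontFee"]), default=None)


def reserve_capacity_block(ec2, query, purchase=False):
    """Look up offerings and, only if asked, buy the cheapest one."""
    response = ec2.describe_capacity_block_offerings(**query)
    offer = cheapest_offering(response.get("CapacityBlockOfferings", []))
    if offer is not None and purchase:
        ec2.purchase_capacity_block(
            CapacityBlockOfferingId=offer["CapacityBlockOfferingId"],
            InstancePlatform="Linux/UNIX",
        )
    return offer


# Example query: 4 instances for 48 hours, starting no sooner than a week from now.
example_start = datetime.now(timezone.utc) + timedelta(days=7)
example_query = build_offering_query(4, 48, example_start)
```

To try it against AWS, pass `boto3.client("ec2", region_name="us-east-2")` as `ec2` and inspect the returned offer before calling the function again with `purchase=True`.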

Once the EC2 Capacity Blocks are reserved, customers have access to their purchased capacity starting from the scheduled start date.
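
As a rough illustration of what using the purchased capacity could look like from the SDK, the sketch below builds the arguments for a boto3 `run_instances` call that targets a reservation. It assumes that instances are launched into a Capacity Block by setting the market type to `capacity-block` and pointing at the reservation ID returned by the purchase. The function name, the `p5.48xlarge` default, and the placeholder IDs are hypothetical.

```python
def capacity_block_launch_args(reservation_id, image_id, count,
                               instance_type="p5.48xlarge"):
    """Build run_instances keyword arguments that target a Capacity Block."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,  # assumed P5 size
        "MinCount": count,
        "MaxCount": count,
        # Assumed: this market type marks the launch as a Capacity Block launch.
        "InstanceMarketOptions": {"MarketType": "capacity-block"},
        # Assumed: the reservation ID comes from the purchase response.
        "CapacityReservationSpecification": {
            "CapacityReservationTarget": {
                "CapacityReservationId": reservation_id,
            }
        },
    }


# Placeholder IDs for illustration only; substitute real values.
launch_args = capacity_block_launch_args(
    reservation_id="cr-EXAMPLE", image_id="ami-EXAMPLE", count=4
)
```

With a boto3 EC2 client in the Region that holds the reservation, the launch itself would then be `ec2.run_instances(**launch_args)`, issued on or after the scheduled start date.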

At the time of writing, EC2 Capacity Blocks are available for Amazon EC2 P5 instances, which are powered by NVIDIA H100 Tensor Core GPUs, in the AWS US East (Ohio) Region. Customers can view the price of a capacity block before making a reservation, and the total price is charged upfront at the time of purchase.

Conclusion

Amazon EC2 Capacity Blocks for ML offer a solution to the GPU capacity challenge faced by organizations working on machine learning initiatives. By making GPU instances easy to reserve, they allow organizations of all sizes to use GPUs for their ML workloads. With capacity secured in advance on high-performance instances, customers can accelerate innovation and drive business transformation.