PERF 2: How do you select your compute solution?
The optimal compute solution for a workload varies based on application design, usage patterns, and configuration settings. Architectures can use different compute solutions for various components and enable different features to improve performance. Selecting the wrong compute solution for an architecture can lead to lower performance efficiency.
Resources
Amazon EC2 foundations (CMP211-R2)
Powering next-gen Amazon EC2: Deep dive into the Nitro system
Deliver high performance ML inference with AWS Inferentia (CMP324-R1)
Optimize performance and cost for your AWS compute (CMP323-R1)
Better, faster, cheaper compute: Cost-optimizing Amazon EC2 (CMP202-R1)
Cloud Compute with AWS
EC2 Instance Types
Processor State Control for Your EC2 Instance
EKS Containers: EKS Worker Nodes
ECS Containers: Amazon ECS Container Instances
Functions: Lambda Function Configuration
Best Practices:
-
Evaluate the available compute options: Understand the performance characteristics of the compute-related options available to you. Know how instances, containers, and functions work, and what advantages, or disadvantages, they bring to your workload.
-
Understand the available compute configuration options: Understand how various options complement your workload, and which configuration options are best for your system. Examples of these options include instance family, sizes, features (GPU, I/O), function sizes, container instances, and single versus multi-tenancy.
-
Collect compute-related metrics: One of the best ways to understand how your compute systems are performing is to record and track the true utilization of various resources. This data can be used to make more accurate determinations about resource requirements.
-
Determine the required configuration by right-sizing: Analyze the various performance characteristics of your workload and how these characteristics relate to memory, network, and CPU usage. Use this data to choose resources that best match your workload's profile. For example, a memory-intensive workload, such as a database, could be served best by the r-family of instances. However, a bursting workload can benefit more from an elastic container system.
-
Use the available elasticity of resources: The cloud provides the flexibility to expand or reduce your resources dynamically through a variety of mechanisms to meet changes in demand. Combined with compute-related metrics, a workload can automatically respond to changes and utilize the optimal set of resources to achieve its goal.
-
Re-evaluate compute needs based on metrics: Use system-level metrics to identify the behavior and requirements of your workload over time. Evaluate your workload's needs by comparing the available resources with these requirements and make changes to your compute environment to best match your workload's profile. For example, over time a system might be observed to be more memory-intensive than initially thought, so moving to a different instance family or size could improve both performance and efficiency.
Improvement Plan
Evaluate the available compute options
Cloud Compute with AWS
Understand the available compute configuration options
AWS Nitro System
Collect compute-related metrics
Amazon CloudWatch
Determine the required configuration by right-sizing
Use the available elasticity of resources
Re-evaluate compute needs based on metrics