PERF 2: How do you select your compute solution?

The optimal compute solution for a workload varies based on application design, usage patterns, and configuration settings. Architectures can use different compute solutions for various components and enable different features to improve performance. Selecting the wrong compute solution for an architecture can lead to lower performance efficiency.

Resources

Amazon EC2 foundations (CMP211-R2)
Powering next-gen Amazon EC2: Deep dive into the Nitro system
Deliver high performance ML inference with AWS Inferentia (CMP324-R1)
Optimize performance and cost for your AWS compute (CMP323-R1)
Better, faster, cheaper compute: Cost-optimizing Amazon EC2 (CMP202-R1)
Cloud Compute with AWS
EC2 Instance Types
Processor State Control for Your EC2 Instance
EKS Containers: EKS Worker Nodes
ECS Containers: Amazon ECS Container Instances
Functions: Lambda Function Configuration

Best Practices:

  • Evaluate the available compute options
  • Understand the available compute configuration options
  • Collect compute-related metrics
  • Determine the required configuration by right-sizing
  • Use the available elasticity of resources
  • Re-evaluate compute needs based on metrics

Improvement Plan

Evaluate the available compute options

  • Consider compute options: Decide which compute option matches your requirements. From a performance perspective, instances suit long-running applications, especially those that maintain state or have long-running computation cycles; functions suit event-initiated, stateless applications that need quick response times; and containers enable you to use the resources of an instance efficiently.
    Cloud Compute with AWS
  • Define compute performance requirements: Identify the compute performance metrics that matter for your workload, and validate your requirements with a data-driven approach such as benchmarking or load testing. Use this data to identify where your compute solution is constrained, and examine configuration options to improve the solution.
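
For example, a minimal load-test sketch in Python is shown below: it measures request latency percentiles for an HTTP endpoint. The endpoint URL, request count, and concurrency level are placeholders to replace with values that reflect your workload and its requirements.

    # Minimal load-test sketch: measure request latency percentiles for an endpoint.
    # The URL, request count, and concurrency below are illustrative placeholders.
    import statistics
    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    ENDPOINT = "https://example.com/api/health"  # replace with your workload's endpoint
    REQUESTS = 200
    CONCURRENCY = 20

    def timed_request(_):
        """Issue one request and return its latency in milliseconds."""
        start = time.perf_counter()
        with urllib.request.urlopen(ENDPOINT, timeout=10) as response:
            response.read()
        return (time.perf_counter() - start) * 1000

    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        latencies = sorted(pool.map(timed_request, range(REQUESTS)))

    cuts = statistics.quantiles(latencies, n=100)
    print(f"p50={cuts[49]:.1f} ms  p95={cuts[94]:.1f} ms  p99={cuts[98]:.1f} ms")

Comparing these percentiles against your stated requirements shows whether the current compute choice is constrained before any configuration tuning begins.
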
Understand the available compute configuration options

  • Learn about available configuration options: For your selected compute option, use the available configuration options to optimize for your performance requirements. Use the AWS Nitro System to consume the full compute and memory resources of the host hardware. Dedicated Nitro Cards enable high-speed networking, high-speed EBS, and I/O acceleration.
    AWS Nitro System
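
As one way to compare configuration options programmatically, the sketch below queries the EC2 DescribeInstanceTypes API through boto3 (assuming AWS credentials and a region are configured). The instance types listed are examples only, and the printed fields are a small subset of what the API returns.

    # Sketch: compare a few EC2 instance types' vCPU, memory, and network characteristics.
    # The instance types below are examples; substitute the candidates for your workload.
    import boto3

    ec2 = boto3.client("ec2")
    response = ec2.describe_instance_types(
        InstanceTypes=["m5.large", "c5.large", "r5.large"]
    )

    for itype in response["InstanceTypes"]:
        print(
            itype["InstanceType"],
            f'hypervisor={itype.get("Hypervisor", "n/a")}',  # 'nitro' for Nitro-based types
            f'vCPUs={itype["VCpuInfo"]["DefaultVCpus"]}',
            f'memory={itype["MemoryInfo"]["SizeInMiB"]} MiB',
            f'network={itype["NetworkInfo"]["NetworkPerformance"]}',
            f'ebs_optimized={itype["EbsInfo"]["EbsOptimizedSupport"]}',
        )
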
Collect compute-related metrics

  • Collect compute-related metrics: Amazon CloudWatch can collect metrics across the compute resources in your environment. Use a combination of CloudWatch and other metrics-recording tools to track the system-level metrics within your workload. Record data such as CPU utilization, memory usage, disk I/O, and network throughput to gain insight into utilization levels and bottlenecks. This data is crucial for understanding how the workload is performing and how effectively it is using resources. Use these metrics as part of a data-driven approach to actively tune and optimize your workload's resources.
    Amazon CloudWatch
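
For instance, a minimal sketch that pulls recent CPU utilization for one instance from CloudWatch via boto3 is shown below; the instance ID, look-back window, and period are placeholders.

    # Sketch: retrieve recent average and peak CPU utilization for one EC2 instance.
    # The instance ID, 24-hour window, and hourly period are illustrative placeholders.
    from datetime import datetime, timedelta, timezone

    import boto3

    cloudwatch = boto3.client("cloudwatch")
    now = datetime.now(timezone.utc)

    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=now - timedelta(hours=24),
        EndTime=now,
        Period=3600,  # one datapoint per hour
        Statistics=["Average", "Maximum"],
    )

    for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], f'avg={point["Average"]:.1f}%', f'max={point["Maximum"]:.1f}%')

Memory utilization and other OS-level metrics typically require the CloudWatch agent or another collector, since EC2 does not publish them by default.
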
Determine the required configuration by right-sizing

  • Modify your workload configuration by right-sizing: To optimize both performance and overall efficiency, determine which resources your workload needs. Choose memory-optimized instances for systems that require more memory than CPU, and compute-optimized instances for components that perform data processing that is not memory-intensive. Right-sizing enables your workload to perform as well as possible while using only the required resources.
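
To illustrate the decision, the sketch below maps a component's measured peak CPU and memory utilization to a rough instance-family suggestion; the 40%/70% thresholds and the family names are assumptions for illustration, not AWS guidance.

    # Sketch: suggest an instance family from observed peak utilization of a component.
    # The thresholds and family names are illustrative assumptions only.
    def suggest_instance_family(peak_cpu_pct: float, peak_mem_pct: float) -> str:
        """Return a rough instance-family suggestion based on which resource dominates."""
        if peak_mem_pct >= 70 and peak_cpu_pct < 40:
            return "memory-optimized (e.g. R family)"
        if peak_cpu_pct >= 70 and peak_mem_pct < 40:
            return "compute-optimized (e.g. C family)"
        if peak_cpu_pct < 40 and peak_mem_pct < 40:
            return "smaller general-purpose instance (downsize)"
        return "general-purpose (e.g. M family)"

    # Example: a component that is memory-bound but not CPU-bound.
    print(suggest_instance_family(peak_cpu_pct=25, peak_mem_pct=82))
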
Use the available elasticity of resources

  • Take advantage of elasticity: Elasticity matches the supply of resources you have against the demand for those resources. Instances, containers, and functions provide mechanisms for elasticity, either in combination with automatic scaling or as a feature of the service. Use elasticity in your architecture to ensure that you have sufficient capacity to meet performance requirements at all scales of use. Validate the metrics that scale elastic resources up or down against the type of workload being deployed. If you are deploying a video transcoding application, 100% CPU utilization is expected and should not be your primary scaling metric; instead, measure the depth of the queue of transcoding jobs waiting to be processed and scale on that. Ensure that workload deployments can handle both scale-up and scale-down events; scaling down safely is as critical as scaling up when demand dictates. Create test scenarios for scale-down events to ensure that the workload behaves as expected.
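
To make the queue-depth example concrete, the sketch below attaches a target tracking scaling policy to an EC2 Auto Scaling group using a customized CloudWatch metric. The group name, metric namespace and name, and the target of 10 queued jobs per instance are placeholders, and the workload must publish that metric itself.

    # Sketch: scale an Auto Scaling group on backlog per instance rather than CPU.
    # The group name, custom metric, and target value are illustrative placeholders.
    import boto3

    autoscaling = boto3.client("autoscaling")

    autoscaling.put_scaling_policy(
        AutoScalingGroupName="transcoding-fleet",        # placeholder group name
        PolicyName="target-queue-depth",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "CustomizedMetricSpecification": {
                "Namespace": "MyApp/Transcoding",        # placeholder custom namespace
                "MetricName": "PendingJobsPerInstance",  # placeholder custom metric
                "Statistic": "Average",
            },
            "TargetValue": 10.0,      # aim for roughly 10 queued jobs per instance (assumption)
            "DisableScaleIn": False,  # keep scale-in enabled so scale-down paths are exercised
        },
    )

Target tracking creates both the scale-out and scale-in alarms for the policy, which makes it straightforward to include scale-down scenarios in testing.
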
Re-evaluate compute needs based on metrics

  • Use a data-driven approach to optimize resources: To achieve maximum performance and efficiency, use the data gathered over time from your workload to tune and optimize your resources. Look at the trends in your workload's usage of current resources and determine where you can make changes to better match your workload's needs. When resources are over-committed, system performance degrades, whereas underutilization results in a less efficient use of resources and higher cost.
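
As a starting point for this kind of review, the sketch below averages 14 days of CloudWatch CPU data for a set of instances and flags candidates worth a closer look; the instance IDs and the 20%/80% thresholds are illustrative assumptions.

    # Sketch: flag instances whose 14-day average CPU suggests re-sizing may be worthwhile.
    # Instance IDs and the 20%/80% thresholds below are illustrative assumptions.
    from datetime import datetime, timedelta, timezone

    import boto3

    INSTANCE_IDS = ["i-0123456789abcdef0", "i-0fedcba9876543210"]  # placeholders
    cloudwatch = boto3.client("cloudwatch")
    now = datetime.now(timezone.utc)

    for instance_id in INSTANCE_IDS:
        stats = cloudwatch.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
            StartTime=now - timedelta(days=14),
            EndTime=now,
            Period=86400,  # one datapoint per day
            Statistics=["Average"],
        )
        datapoints = stats["Datapoints"]
        if not datapoints:
            continue
        avg_cpu = sum(p["Average"] for p in datapoints) / len(datapoints)
        if avg_cpu < 20:
            print(f"{instance_id}: avg CPU {avg_cpu:.1f}% -> candidate to downsize")
        elif avg_cpu > 80:
            print(f"{instance_id}: avg CPU {avg_cpu:.1f}% -> candidate to upsize or scale out")
        else:
            print(f"{instance_id}: avg CPU {avg_cpu:.1f}% -> within expected range")
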