REL 7: How do you design your workload to adapt to changes in demand?
A scalable workload provides elasticity to add or remove resources automatically so that they closely match the current demand at any given point in time.
Resources
AWS Auto Scaling: How Scaling Plans Work
What Is Amazon EC2 Auto Scaling?
Managing Throughput Capacity Automatically with DynamoDB Auto Scaling
What is Amazon CloudFront?
Distributed Load Testing on AWS: simulate thousands of connected users
Telling Stories About Little's Law
AWS Marketplace: products that can be used with auto scaling
APN Partner: partners that can help you create automated compute solutions
Best Practices:
- Use automation when obtaining or scaling resources: When replacing impaired resources or scaling your workload, automate the process by using managed AWS services, such as Amazon S3 and AWS Auto Scaling. You can also use third-party tools and AWS SDKs to automate scaling.
- Obtain resources upon detection of impairment to a workload: Scale resources reactively when necessary if availability is impacted, to restore workload availability.
- Obtain resources upon detection that more resources are needed for a workload: Scale resources proactively to meet demand and avoid availability impact.
- Load test your workload: Adopt a load testing methodology to measure if scaling activity meets workload requirements.
Improvement Plan
Use automation when obtaining or scaling resources
What is AWS Auto Scaling?
- Configure Auto Scaling on your Amazon EC2 instances and Spot Fleets, Amazon ECS tasks, Amazon DynamoDB tables and indexes, Amazon Aurora Replicas, and AWS Marketplace appliances as applicable.
Managing Throughput Capacity Automatically with DynamoDB Auto Scaling
- Use service API operations to specify the alarms, scaling policies, warm-up times, and cool-down times.
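For example, the following Python (boto3) sketch registers a DynamoDB table's read capacity as a scalable target and attaches a target tracking policy. The table name, capacity limits, and cooldown values are illustrative assumptions, not recommendations.

    import boto3

    # Application Auto Scaling manages DynamoDB provisioned throughput.
    aas = boto3.client("application-autoscaling")

    # Register the table's read capacity as a scalable target (table name "Orders" is hypothetical).
    aas.register_scalable_target(
        ServiceNamespace="dynamodb",
        ResourceId="table/Orders",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        MinCapacity=5,
        MaxCapacity=500,
    )

    # Target tracking keeps consumed read capacity near 70% of provisioned capacity;
    # the policy creates the required CloudWatch alarms on your behalf.
    aas.put_scaling_policy(
        PolicyName="orders-read-target-tracking",
        ServiceNamespace="dynamodb",
        ResourceId="table/Orders",
        ScalableDimension="dynamodb:table:ReadCapacityUnits",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": 70.0,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
            },
            "ScaleInCooldown": 60,   # seconds to wait after scaling in
            "ScaleOutCooldown": 60,  # seconds to wait after scaling out
        },
    )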
What is Elastic Load Balancing?
- Application Load Balancers can distribute load by path.
What is an Application Load Balancer?
- Configure an Application Load Balancer to distribute traffic to different workloads based on the path under the domain name.
- Application Load Balancers can be used to distribute loads in a manner that integrates with AWS Auto Scaling to manage demand; a minimal sketch follows below.
Using a load balancer with an Auto Scaling group
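As one illustration of both points, the following boto3 sketch adds a path-based routing rule to an existing Application Load Balancer listener and attaches an Auto Scaling group to the target group it forwards to. The ARNs and the Auto Scaling group name are placeholders.

    import boto3

    elbv2 = boto3.client("elbv2")
    autoscaling = boto3.client("autoscaling")

    # Placeholder ARNs for an existing listener and target group.
    listener_arn = "arn:aws:elasticloadbalancing:...:listener/app/example/..."
    api_target_group_arn = "arn:aws:elasticloadbalancing:...:targetgroup/api/..."

    # Forward requests under /api/* to the API target group.
    elbv2.create_rule(
        ListenerArn=listener_arn,
        Priority=10,
        Conditions=[{"Field": "path-pattern", "Values": ["/api/*"]}],
        Actions=[{"Type": "forward", "TargetGroupArn": api_target_group_arn}],
    )

    # Register an Auto Scaling group with the target group so that instances the
    # group launches or terminates are added to or removed from the load balancer.
    autoscaling.attach_load_balancer_target_groups(
        AutoScalingGroupName="api-asg",  # hypothetical group name
        TargetGroupARNs=[api_target_group_arn],
    )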
- Network Load Balancers can distribute load by connection.
What is a Network Load Balancer?
- Configure a Network Load Balancer to distribute traffic to different workloads using TCP, or to have a constant set of IP addresses for your workload.
- Network Load Balancers can be used to distribute loads in a manner that integrates with AWS Auto Scaling to manage demand.
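The following boto3 sketch creates a Network Load Balancer with a TCP listener in front of a TCP target group. The subnet, VPC, and name values are placeholders.

    import boto3

    elbv2 = boto3.client("elbv2")

    # Create the Network Load Balancer (one static IP address per subnet/Availability Zone).
    nlb = elbv2.create_load_balancer(
        Name="example-nlb",
        Type="network",
        Scheme="internet-facing",
        Subnets=["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"],  # placeholders
    )
    nlb_arn = nlb["LoadBalancers"][0]["LoadBalancerArn"]

    # TCP target group for the workload's instances.
    tg = elbv2.create_target_group(
        Name="example-tcp-targets",
        Protocol="TCP",
        Port=443,
        VpcId="vpc-0123456789abcdef0",  # placeholder
        TargetType="instance",
    )
    tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

    # TCP listener that forwards connections to the target group.
    elbv2.create_listener(
        LoadBalancerArn=nlb_arn,
        Protocol="TCP",
        Port=443,
        DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
    )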
- Use Amazon Route 53 or a trusted DNS provider.
What is Amazon Route 53?
- Use Route 53 to manage your CloudFront distributions and load balancers.
- Determine the domains and subdomains you are going to manage.
- Create appropriate record sets using ALIAS or CNAME records.
Working with records
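For instance, the following boto3 sketch upserts an ALIAS A record that points a domain at an Application Load Balancer. The hosted zone IDs, domain, and load balancer DNS name are placeholders you would look up for your own resources.

    import boto3

    route53 = boto3.client("route53")

    route53.change_resource_record_sets(
        HostedZoneId="Z0EXAMPLE1234",  # your public hosted zone ID (placeholder)
        ChangeBatch={
            "Comment": "Point the apex domain at the Application Load Balancer",
            "Changes": [
                {
                    "Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": "example.com.",
                        "Type": "A",
                        "AliasTarget": {
                            # The load balancer's canonical hosted zone ID and DNS name,
                            # available from the Elastic Load Balancing API or console.
                            "HostedZoneId": "ZLBEXAMPLE5678",
                            "DNSName": "example-alb-1234567890.us-east-1.elb.amazonaws.com.",
                            "EvaluateTargetHealth": True,
                        },
                    },
                }
            ],
        },
    )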
- AWS Global Accelerator is a service that improves the availability and performance of your applications with local or global users. It provides static IP addresses that act as a fixed entry point to your application endpoints in a single or multiple AWS Regions, such as your Application Load Balancers, Network Load Balancers, or Amazon EC2 instances.
What Is AWS Global Accelerator?
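A minimal boto3 sketch for fronting an existing Application Load Balancer with AWS Global Accelerator might look like the following. The accelerator name, Region, and endpoint ARN are assumptions, and the Global Accelerator control-plane API is called through the us-west-2 Region.

    import boto3

    # The Global Accelerator control-plane API is served from us-west-2.
    ga = boto3.client("globalaccelerator", region_name="us-west-2")

    accelerator = ga.create_accelerator(Name="example-accelerator", Enabled=True)
    accelerator_arn = accelerator["Accelerator"]["AcceleratorArn"]

    # Listen for TCP traffic on port 443 at the accelerator's static IP addresses.
    listener = ga.create_listener(
        AcceleratorArn=accelerator_arn,
        Protocol="TCP",
        PortRanges=[{"FromPort": 443, "ToPort": 443}],
    )

    # Route traffic to an Application Load Balancer endpoint in one Region.
    ga.create_endpoint_group(
        ListenerArn=listener["Listener"]["ListenerArn"],
        EndpointGroupRegion="us-east-1",
        EndpointConfigurations=[
            {
                "EndpointId": "arn:aws:elasticloadbalancing:...:loadbalancer/app/example/...",
                "Weight": 128,
            }
        ],
    )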
What is Amazon CloudFront?
- Configure Amazon CloudFront distributions for your workloads, or use a third-party CDN.
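As a rough illustration, the following boto3 sketch creates a distribution with a single custom origin. The origin domain name is a placeholder, and the cache policy ID is assumed to be the managed CachingOptimized policy; verify the current ID in your account before use.

    import boto3

    cloudfront = boto3.client("cloudfront")

    cloudfront.create_distribution(
        DistributionConfig={
            "CallerReference": "workload-cdn-001",  # any unique string per creation request
            "Comment": "CDN in front of the workload's load balancer",
            "Enabled": True,
            "Origins": {
                "Quantity": 1,
                "Items": [
                    {
                        "Id": "alb-origin",
                        "DomainName": "example-alb-1234567890.us-east-1.elb.amazonaws.com",
                        "CustomOriginConfig": {
                            "HTTPPort": 80,
                            "HTTPSPort": 443,
                            "OriginProtocolPolicy": "https-only",
                        },
                    }
                ],
            },
            "DefaultCacheBehavior": {
                "TargetOriginId": "alb-origin",
                "ViewerProtocolPolicy": "redirect-to-https",
                # Assumed ID of the managed CachingOptimized cache policy; confirm before use.
                "CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",
            },
        }
    )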
Obtain resources upon detection of impairment to a workload
- Use scaling plans, which are the core component of AWS Auto Scaling. A scaling plan is where you configure a set of instructions for scaling your resources. If you work with AWS CloudFormation or add tags to AWS resources, you can set up scaling plans for different sets of resources, per application. AWS Auto Scaling provides recommendations for scaling strategies customized to each resource. After you create your scaling plan, AWS Auto Scaling combines dynamic scaling and predictive scaling methods together to support your scaling strategy.
AWS Auto Scaling: How Scaling Plans Work
- Amazon EC2 Auto Scaling helps you ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. You create collections of EC2 instances, called Auto Scaling groups. You can specify the minimum number of instances in each Auto Scaling group, and Amazon EC2 Auto Scaling ensures that your group never goes below this size. You can specify the maximum number of instances in each Auto Scaling group, and Amazon EC2 Auto Scaling ensures that your group never goes above this size. A minimal sketch follows at the end of this list.
What Is Amazon EC2 Auto Scaling?
- Amazon DynamoDB auto scaling uses the AWS Application Auto Scaling service to dynamically adjust provisioned throughput capacity on your behalf, in response to actual traffic patterns. This enables a table or a global secondary index to increase its provisioned read and write capacity to handle sudden increases in traffic, without throttling.
Managing Throughput Capacity Automatically with DynamoDB Auto Scaling
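The boto3 sketch below creates an Auto Scaling group with explicit minimum and maximum sizes and attaches a target tracking policy that scales on average CPU utilization. The launch template, subnets, and thresholds are illustrative assumptions.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Auto Scaling group with hard lower and upper bounds on instance count.
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="web-asg",  # hypothetical group name
        LaunchTemplate={"LaunchTemplateName": "web-template", "Version": "$Latest"},
        MinSize=2,
        MaxSize=10,
        DesiredCapacity=2,
        VPCZoneIdentifier="subnet-0123456789abcdef0,subnet-0fedcba9876543210",
    )

    # Target tracking policy: add or remove instances to keep average CPU near 50%.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-asg",
        PolicyName="keep-average-cpu-at-50",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": 50.0,
        },
    )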
Obtain resources upon detection that more resources are needed for a workload
- Calculate how many compute resources you will need (compute concurrency) to handle a given request rate; a worked sketch follows at the end of this list.
Telling Stories About Little's Law
- When you have a historical pattern for usage, set up scheduled scaling for Amazon EC2 Auto Scaling.
Scheduled Scaling for Amazon EC2 Auto Scaling
- Use AWS predictive scaling.
Predictive Scaling for EC2, Powered by Machine Learning
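As a worked sketch of the first two points, the following Python estimates required concurrency from Little's Law (concurrency = arrival rate x average time in system) and then schedules a recurring scale-out for a known weekday peak. The request rate, latency, per-instance capacity, group name, and schedule are illustrative assumptions.

    import math

    import boto3

    # Little's Law: concurrency = arrival rate * average time in system.
    request_rate = 1000.0   # requests per second at the expected peak (assumed)
    average_latency = 0.2   # seconds per request (assumed)
    concurrency = request_rate * average_latency  # 200 requests in flight

    per_instance_concurrency = 25  # assumed concurrent requests one instance can serve
    instances_needed = math.ceil(concurrency / per_instance_concurrency)  # 8 instances

    # Schedule capacity ahead of the recurring weekday-morning peak.
    autoscaling = boto3.client("autoscaling")
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="web-asg",  # hypothetical group name
        ScheduledActionName="weekday-morning-peak",
        Recurrence="0 8 * * 1-5",        # 08:00 UTC, Monday through Friday
        MinSize=instances_needed,
        MaxSize=instances_needed * 2,
        DesiredCapacity=instances_needed,
    )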
Load test your workload
Distributed Load Testing on AWS: simulate thousands of connected users
- Identify the mix of requests: You may have varied mixes of requests, so you should look at various time frames when identifying the mix of traffic.
- Implement a load driver: You can use custom code, open source, or commercial software to implement a load driver; a minimal sketch follows this list.
- Load test initially using small capacity: You see some immediate effects by driving load onto a smaller capacity, possibly as small as one instance or container.
- Load test against larger capacity: The effects will be different on a distributed load, so you must test against an environment that is as close to your production environment as possible.
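A minimal custom load driver along the lines of the sketch below can be enough for an initial small-capacity test. The target URL and concurrency values are assumptions; larger distributed tests are better run with a dedicated tool or the Distributed Load Testing on AWS solution.

    import concurrent.futures
    import time
    import urllib.request

    TARGET_URL = "https://test.example.com/api/health"  # hypothetical test endpoint
    CONCURRENCY = 20            # simulated concurrent users
    REQUESTS_PER_WORKER = 50    # requests each simulated user sends

    def worker():
        """Send requests sequentially and record each request's latency in seconds."""
        latencies = []
        for _ in range(REQUESTS_PER_WORKER):
            start = time.perf_counter()
            with urllib.request.urlopen(TARGET_URL, timeout=10) as response:
                response.read()
            latencies.append(time.perf_counter() - start)
        return latencies

    def main():
        started = time.perf_counter()
        with concurrent.futures.ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
            futures = [pool.submit(worker) for _ in range(CONCURRENCY)]
            latencies = [latency for future in futures for latency in future.result()]
        elapsed = time.perf_counter() - started

        latencies.sort()
        total = len(latencies)
        print(f"requests: {total}, throughput: {total / elapsed:.1f} req/s")
        print(f"p50: {latencies[total // 2] * 1000:.0f} ms, "
              f"p99: {latencies[int(total * 0.99)] * 1000:.0f} ms")

    if __name__ == "__main__":
        main()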