REL 7: How do you design your workload to adapt to changes in demand?

A scalable workload provides elasticity to add or remove resources automatically so that they closely match the current demand at any given point in time.

Resources

AWS Auto Scaling: How Scaling Plans Work
What Is Amazon EC2 Auto Scaling?
Managing Throughput Capacity Automatically with DynamoDB Auto Scaling
What is Amazon CloudFront?
Distributed Load Testing on AWS: simulate thousands of connected users
Telling Stories About Little's Law
AWS Marketplace: products that can be used with auto scaling
APN Partner: partners that can help you create automated compute solutions

Best Practices:

Improvement Plan

Use automation when obtaining or scaling resources

  • Configure and use AWS Auto Scaling: AWS Auto Scaling monitors your applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. Using AWS Auto Scaling, you can set up application scaling for multiple resources across multiple services.
    What is AWS Auto Scaling?
  • Use Elastic Load Balancing: Load balancers can distribute load by path or by network connectivity.
    What is Elastic Load Balancing?
  • Use a highly available DNS provider: DNS names allow your users to enter names instead of IP addresses to access your workloads, and DNS distributes this information to a defined scope, usually globally for users of the workload.
  • Use the AWS global network to optimize the path from your users to your applications: AWS Global Accelerator continually monitors the health of your application endpoints and redirects traffic to healthy endpoints in less than 30 seconds.
  • Configure and use Amazon CloudFront or a trusted content delivery network: A content delivery network (CDN) can provide faster end-user response times and can serve requests for content that might otherwise cause unnecessary scaling of your workloads.
    What is Amazon CloudFront?

Obtain resources upon detection of impairment to a workload

  • Obtain resources upon detection of impairment to a workload: Scale resources reactively when availability is impacted, to restore workload availability.

Obtain resources upon detection that more resources are needed for a workload

  • Obtain resources upon detection that more resources are needed for a workload: Scale resources proactively to meet demand and avoid availability impact.
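
One way to reason about scaling to meet demand is a proportional rule: if a utilization metric sits above its target, capacity grows by the same ratio, within fixed bounds. The sketch below is only an illustration of that idea. The function name, parameters, and numbers are assumptions made for this example, and it does not describe how AWS Auto Scaling computes capacity internally.

```python
import math


def desired_capacity(current_capacity, metric_value, target_value,
                     min_capacity, max_capacity):
    """Estimate capacity that would bring a utilization metric back to target.

    Hypothetical helper for illustration only. It assumes the metric scales
    inversely with capacity (for example, average CPU across a fleet).
    """
    if current_capacity < 1:
        raise ValueError("current_capacity must be at least 1")
    if target_value <= 0:
        raise ValueError("target_value must be positive")

    # Scale capacity by the ratio of observed metric to target, rounding up
    # so the estimate never undershoots the demand it was computed from.
    raw = math.ceil(current_capacity * metric_value / target_value)

    # Clamp to the configured bounds so one noisy sample cannot drive the
    # fleet to zero or to an unbounded size.
    return max(min_capacity, min(max_capacity, raw))


if __name__ == "__main__":
    # 4 instances at 75% average CPU with a 50% target
    print(desired_capacity(4, 75.0, 50.0, min_capacity=2, max_capacity=10))
```

Rounding up and clamping are deliberate choices in this sketch: they favor availability over cost when the estimate falls between whole units, and they keep the result inside limits you have already agreed are safe.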

Load test your workload

  • Perform load testing to identify which aspect of your workload indicates that you must add or remove capacity: Load tests should use representative traffic, similar to what you receive in production. Increase the load while watching the metrics you have instrumented, to determine which metric indicates when you must add or remove resources.
    Distributed Load Testing on AWS: simulate thousands of connected users
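
The analysis step of a load test can be sketched as follows: record the instrumented metrics at each load step, see which metric tracks load most closely, and note its value at the last step that still met the latency objective. Everything in this example is an assumption for illustration: the sample data is invented, and the metric names, the 200 ms objective, and the helper functions are hypothetical.

```python
from math import sqrt

# Hypothetical results from a stepped load test, ordered by increasing load.
# All numbers are invented for illustration.
STEPS = [
    {"rps": 100, "cpu_pct": 22.0, "mem_pct": 61.0, "p99_ms": 110.0},
    {"rps": 200, "cpu_pct": 41.0, "mem_pct": 63.0, "p99_ms": 120.0},
    {"rps": 300, "cpu_pct": 58.0, "mem_pct": 60.0, "p99_ms": 150.0},
    {"rps": 400, "cpu_pct": 77.0, "mem_pct": 62.0, "p99_ms": 240.0},
    {"rps": 500, "cpu_pct": 93.0, "mem_pct": 61.0, "p99_ms": 900.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is flat."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def best_scaling_metric(steps, candidates, load_key="rps"):
    """Pick the candidate metric whose values follow the applied load most closely."""
    load = [step[load_key] for step in steps]
    scores = {m: pearson(load, [step[m] for step in steps]) for m in candidates}
    best = max(scores, key=lambda m: abs(scores[m]))
    return best, scores


def value_before_breach(steps, metric, latency_key, slo_ms):
    """Return the metric value at the last step that still met the latency objective.

    Assumes steps are ordered by increasing load. Returns None if even the
    first step breached the objective.
    """
    last_ok = None
    for step in steps:
        if step[latency_key] > slo_ms:
            break
        last_ok = step[metric]
    return last_ok


if __name__ == "__main__":
    metric, scores = best_scaling_metric(STEPS, ["cpu_pct", "mem_pct"])
    threshold = value_before_breach(STEPS, metric, "p99_ms", slo_ms=200.0)
    print(f"scale on {metric}; add capacity before it passes {threshold}")
```

Correlation with load is only a first filter. A metric that tracks load well but changes after latency has already degraded would still be a poor scaling signal, so the threshold found this way is a starting point to confirm with further test runs.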