REL 1: How do you limit an individual tenant’s ability to impose load that may impact availability for other tenants of your system?

Identify tenants that are consuming resources at a rate that could undermine the overall stability and availability of your system. Use this data and/or scaling policies to limit the load these tenants can place on the system and to prevent a large-scale outage that could cascade across all tenants of your system.
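As a rough illustration of the identification step, the sketch below flags tenants whose share of requests in an observation window exceeds a threshold. The function name `find_noisy_tenants`, the `max_share` and `min_requests` parameters, and the sample tenant IDs are all assumptions made for this example; a real system would more likely draw on per-tenant metrics from its monitoring stack.

```python
from collections import Counter


def find_noisy_tenants(request_log, max_share=0.5, min_requests=100):
    """Return tenant IDs whose share of requests exceeds max_share.

    request_log is an iterable of tenant IDs, one entry per request seen
    in the observation window. (Hypothetical input shape, for illustration.)
    """
    counts = Counter(request_log)
    total = sum(counts.values())
    # Skip small windows: shares are too unstable to act on.
    if total < min_requests:
        return []
    return sorted(t for t, n in counts.items() if n / total > max_share)


# Example window: tenant-a generates 80 of 100 requests.
window = ["tenant-a"] * 80 + ["tenant-b"] * 15 + ["tenant-c"] * 5
noisy = find_noisy_tenants(window, max_share=0.5)
```

Here `noisy` contains only `tenant-a`. A share-of-traffic threshold is just one possible signal; absolute request rates, CPU time, or database capacity consumed per tenant may be more appropriate depending on where the shared bottleneck lies.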

Resources

AWS re:Invent 2017: GPS: SaaS Monitoring - Creating a Unified View of Multi-tenant Health featuring New Relic (GPSTEC309)
Monolith to serverless SaaS: Migrating to multi-tenant architecture
Partitioning Pooled Multi-Tenant SaaS Data with Amazon DynamoDB
AWS re:Invent 2019: [REPEAT] Scaling up to your first 10 million users (ARC211-R)
Architecting Successful SaaS: Interacting with Your SaaS Customer’s Cloud Accounts
AWS Auto Scaling https://aws.amazon.com/autoscaling/

Best Practices:

Improvement Plan

 Use throttling policies to limit the effect that noisy tenants have on the system

  • Use scaling policies to add capacity in anticipation of tenant spikes
  • Detect and throttle any tenant that is generating load that might impact availability of your system

 Partition tenant load to limit the area of effect

  • Design your multi-tenant microservices to prevent potential tenant bottlenecks that could impact availability
  • Optimize for the most active tenant or higher end tiers

 Define SLAs for each tenant tier

  • Define the tenant tiers and their mapping to application SLAs
  • Introduce a mechanism to manage and enforce these SLAs
  • Surface potential SLA issues as part of the operational experience
  • Use separate reserved concurrency configurations for tenant tiers to help ensure that tenant consumption does not exceed the target consumption profile for a given tier.
    AWS re:Invent 2019: Serverless SaaS deep dive: Building serverless SaaS on AWS (ARC410-R)
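The throttling and tiering practices in this improvement plan can be sketched as a per-tenant token bucket whose limits depend on the tenant's tier. This is a minimal in-process illustration, not an AWS API: the class names, the tier names, and the rate and burst values passed in are all hypothetical. In an AWS deployment the same idea might instead be expressed through managed features such as API Gateway usage plans or Lambda reserved concurrency.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, now):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so short bursts are allowed
        self.updated = now

    def allow(self, now):
        # Refill according to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class TenantThrottle:
    """Keeps one bucket per tenant, sized by the tenant's tier."""

    def __init__(self, tier_limits, clock=time.monotonic):
        # tier_limits maps tier name -> (requests per second, burst size).
        self.tier_limits = tier_limits
        self.clock = clock
        self.buckets = {}

    def allow(self, tenant_id, tier):
        now = self.clock()
        bucket = self.buckets.get(tenant_id)
        if bucket is None:
            rate, burst = self.tier_limits[tier]
            bucket = self.buckets[tenant_id] = TokenBucket(rate, burst, now)
        return bucket.allow(now)


# Hypothetical tiers and limits, chosen only for illustration.
throttle = TenantThrottle({"basic": (5.0, 10.0), "premium": (50.0, 100.0)})
if not throttle.allow("tenant-a", "basic"):
    pass  # e.g. reject the request with HTTP 429 and record an SLA event
```

Because each tenant has its own bucket, a tenant that exhausts its allowance is rejected without reducing what other tenants may consume, and rejected calls are a natural signal to surface in operational dashboards as potential SLA issues.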