PERF 1: How do you prevent one tenant from adversely impacting the experience of another tenant?

Introduce scaling and throttling constructs to help ensure that no single tenant can put load on the system in a way that impacts the performance and experience of other tenants.

Resources

Blog: Optimizing SaaS Tenant Workflows and Costs
Whitepaper: SaaS Storage Strategies: Building a Multi-tenant Storage Model on AWS
Whitepaper: SaaS Solutions on AWS: Tenant Isolation Architectures
AWS re:Invent 2018: SaaS Reference: Review of Real-World Patterns & Strategies (GPSTEC302)
Microservices decomposition for SaaS environments (ARC210)

Best Practices:

Improvement Plan

Silo high-demand tenant resources

  • Identify the workflows where the load from an individual tenant could consume excess resources and degrade the experience of other tenants. For each of these areas, create separate silos that give each affected tenant dedicated resources.
  • Messaging constructs, such as Amazon SNS, Amazon SQS, and Amazon EventBridge, can be impacted by the load of individual tenants. In these cases, you might offer some or all tenants a dedicated message processing resource.
  • For Amazon EC2-based solutions, you might offer some or all tenants dedicated compute resources.
    Microservices decomposition for SaaS environments (ARC210)
    See section “Targeted Silo Isolation” in SaaS Tenant Isolation Strategies: Isolating Resources in a Multi-Tenant Environment
    SaaS tenant isolation patterns (ARC372)
  • In cases where bottlenecks might span the entire tenant experience (compute, storage, messaging, etc.), consider supporting silos that span the entire stack for specific tenants.
  • For serverless SaaS environments, use reserved concurrency to silo the experience of specific tenant tiers, offering a larger pool of concurrent functions to premium tiers.
    Serverless SaaS deep dive: Building serverless SaaS on AWS (ARC410)
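
The reserved concurrency approach for tiered serverless tenants could be sketched as follows. This is a minimal illustration, not a prescribed implementation: the tier names, weights, function names, and the account limit and unreserved floor values are all assumptions chosen for the example. The only AWS call used is the Lambda `put_function_concurrency` operation, invoked through a client that the caller passes in (for example, `boto3.client("lambda")`).

```python
from typing import Dict


def plan_reserved_concurrency(
    account_limit: int,
    unreserved_floor: int,
    tier_weights: Dict[str, int],
) -> Dict[str, int]:
    """Split the reservable concurrency across tiers in proportion to weights.

    account_limit and unreserved_floor are assumptions supplied by the caller;
    check the actual quotas for your account and Region.
    """
    reservable = account_limit - unreserved_floor
    if reservable <= 0:
        raise ValueError("no concurrency left to reserve")
    total_weight = sum(tier_weights.values())
    if total_weight <= 0:
        raise ValueError("tier weights must sum to a positive value")
    # Integer division rounds down, so the plan never exceeds the reservable pool.
    return {
        tier: reservable * weight // total_weight
        for tier, weight in tier_weights.items()
    }


def apply_plan(lambda_client, function_by_tier: Dict[str, str], plan: Dict[str, int]) -> None:
    """Apply each tier's limit to the (hypothetical) function deployed for that tier."""
    for tier, limit in plan.items():
        lambda_client.put_function_concurrency(
            FunctionName=function_by_tier[tier],
            ReservedConcurrentExecutions=limit,
        )


# Hypothetical tiers: premium tenants get the largest pool of concurrent executions.
plan = plan_reserved_concurrency(
    account_limit=1000,
    unreserved_floor=100,
    tier_weights={"premium": 6, "standard": 3, "basic": 1},
)
```

With these assumed inputs, 900 concurrent executions are divided 6:3:1, so load from basic-tier tenants cannot consume the concurrency reserved for premium tenants.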

Decompose and deploy services in a pattern that aligns tenant loads with performance expectations

  • Design your multi-tenant microservices to distribute and optimize your scaling profile.
  • Optimize for the most active tenants or higher-end tiers.

Combine tenant-aware policies with added capacity to address tenant spikes

  • Use tenant consumption insights to define policies that react and respond to scenarios where individual tenants are putting heavier loads on the system.
  • Add capacity to your baseline environment, adding a scaling buffer that can be used to offset bursts of activity from individual tenants.
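
As a rough illustration of the two ideas above, the sketch below flags tenants whose share of recent load exceeds a policy threshold and computes a baseline capacity that includes a scaling buffer. The function names, the per-tenant request counts, the 50 percent share threshold, and the 25 percent buffer are hypothetical values picked for the example, not recommendations.

```python
from typing import Dict, List


def heavy_tenants(requests_by_tenant: Dict[str, int], max_share_percent: int) -> List[str]:
    """Return tenants whose share of total requests exceeds max_share_percent."""
    total = sum(requests_by_tenant.values())
    if total == 0:
        return []
    # Compare with integers to avoid floating-point rounding at the threshold.
    return sorted(
        tenant
        for tenant, count in requests_by_tenant.items()
        if count * 100 > max_share_percent * total
    )


def baseline_with_buffer(steady_state_units: int, buffer_percent: int) -> int:
    """Return steady-state capacity plus a buffer, rounded up to a whole unit."""
    return (steady_state_units * (100 + buffer_percent) + 99) // 100


# Hypothetical consumption data for one metrics window.
window = {"tenant-a": 700, "tenant-b": 200, "tenant-c": 100}
flagged = heavy_tenants(window, max_share_percent=50)
capacity = baseline_with_buffer(steady_state_units=10, buffer_percent=25)
```

Here `flagged` identifies the tenants a policy might react to (for example, by throttling or moving them to dedicated resources), while `capacity` is the number of units to keep provisioned so that short bursts are absorbed before any policy engages.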

Use throttling policies to prevent individual tenants from placing excess load on the system

  • Detect and throttle any tenant that is generating excess load on your system.
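
One common way to throttle per tenant is a token bucket keyed by tenant ID; the sketch below shows that technique under stated assumptions. The class name `TenantThrottle`, the rate and burst values, and the injectable clock are all hypothetical and are not taken from the source. In practice, a managed feature of your API layer may fill the same role.

```python
import time
from typing import Callable, Dict, Tuple


class TenantThrottle:
    """Per-tenant token bucket: each tenant has its own budget of requests."""

    def __init__(
        self,
        rate_per_sec: float,
        burst: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec   # tokens refilled per second, per tenant
        self.burst = burst         # maximum tokens a tenant can accumulate
        self.clock = clock         # injectable so the behavior can be tested
        self._buckets: Dict[str, Tuple[float, float]] = {}  # tenant -> (tokens, last seen)

    def allow(self, tenant_id: str) -> bool:
        """Return True if the tenant may proceed, False if it should be throttled."""
        now = self.clock()
        tokens, last = self._buckets.get(tenant_id, (self.burst, now))
        # Refill based on elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[tenant_id] = (tokens - 1, now)
            return True
        self._buckets[tenant_id] = (tokens, now)
        return False


# Usage: one request per second sustained, with bursts of up to two requests.
throttle = TenantThrottle(rate_per_sec=1.0, burst=2.0)
if not throttle.allow("tenant-a"):
    pass  # for an HTTP API, this is where a 429 response would be returned
```

Because each tenant draws from its own bucket, a tenant that exhausts its budget is rejected without reducing the budget available to any other tenant.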