PERF 2: How do you ensure that consumption of infrastructure resources aligns with the activity and workloads of tenants?

Create a scaling strategy where the infrastructure footprint of a SaaS application closely mirrors the continually evolving profile of the tenant and the tenant’s workload. This requires rich insights into tenant activity and consumption patterns so that you can construct a scaling model that scales efficiently without over-provisioning resources, which is critical to SaaS businesses.

Resources

SaaS Metrics: The Ultimate View of Tenant Consumption (GPSTEC308)
Optimizing SaaS Tenant Workflows and Costs

Best Practices:

Use tenant profile data to configure static scaling policies: Use logs and profiling data to analyze tenant loads and periodically configure infrastructure scaling policies based on historical consumption trends.
Scale based on application-generated tenant insights: Capture and aggregate SaaS-specific metric data and use it (along with other metrics) to build a robust multi-tenant scaling strategy that minimizes the over-provisioning of resources based on live workloads.
Build dynamic tenant scaling policies around standard AWS metrics: Rely on the readily accessible AWS infrastructure metrics to define a series of policies that approximate tenant consumption activity. Use these metrics to build policies that move you closer to aligning tenant activity with resource consumption.

Improvement Plan

Use tenant profile data to configure static scaling policies

Periodically analyze your log and application profile data to determine how tenants are consuming resources and assess how that consumption is influencing the multi-tenant scaling profile of your application.

Use tenant insights to develop and adjust static scaling policies with the goal of better aligning resource consumption with actual tenant activity.
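The periodic analysis described above can be sketched as a small calculation: take historical per-tier request counts, estimate a high percentile, and turn it into a static capacity setting. This is a minimal sketch, not a prescribed method. The function names, the 95th-percentile choice, the 20% headroom, and the `requests_per_instance` figure are all assumptions made for illustration.

```python
import math
from statistics import quantiles


def recommend_capacity(hourly_requests, requests_per_instance,
                       headroom=0.2, min_instances=1):
    """Suggest a static instance count from historical hourly request counts.

    Assumptions (hypothetical, tune for your workload): capacity is sized
    to the 95th percentile of observed load plus a fixed headroom fraction.
    """
    if len(hourly_requests) < 2:
        raise ValueError("need at least two samples to estimate a percentile")
    if requests_per_instance <= 0:
        raise ValueError("requests_per_instance must be positive")
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(hourly_requests, n=20, method="inclusive")[-1]
    needed = p95 * (1 + headroom) / requests_per_instance
    return max(min_instances, math.ceil(needed))


def recommend_by_tier(history_by_tier, requests_per_instance):
    """Apply the same sizing rule to each tenant tier's history."""
    return {
        tier: recommend_capacity(samples, requests_per_instance)
        for tier, samples in history_by_tier.items()
    }
```

Rerunning a calculation like this on a schedule and comparing the result against the currently configured capacity is one way to decide when a static policy needs adjusting.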

For serverless environments, introduce scaling constructs that optimize AWS Lambda function load with tenant tiers and profiles.

Use provisioned concurrency only in scenarios where specific tenant workflows do not meet their SLA without being pre-warmed.
Use separate reserved concurrency configurations for tenant tiers, ensuring that tenant consumption does not exceed the target consumption profile of a given tier.
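As an illustration of per-tier reserved concurrency, the sketch below splits an account's concurrency budget across functions that are assumed to be deployed once per tier. The function names, the share values, and the idea of one function per tier are hypothetical. The resulting dictionaries use the parameter names of the Lambda `PutFunctionConcurrency` API (`FunctionName`, `ReservedConcurrentExecutions`), and the default floor of 100 reflects the unreserved concurrency that Lambda keeps available in an account.

```python
def reserved_concurrency_plan(account_limit, tier_shares, unreserved_floor=100):
    """Build PutFunctionConcurrency parameters for per-tier functions.

    tier_shares maps a (hypothetical) per-tier function name to the
    fraction of reservable concurrency that tier is allowed to consume.
    """
    if any(share < 0 for share in tier_shares.values()):
        raise ValueError("shares must be non-negative")
    if sum(tier_shares.values()) > 1.0:
        raise ValueError("tier shares must not add up to more than 1.0")
    reservable = account_limit - unreserved_floor
    if reservable <= 0:
        raise ValueError("no concurrency left to reserve")
    # int() rounds down, so the total never exceeds the reservable budget.
    return [
        {"FunctionName": name,
         "ReservedConcurrentExecutions": int(reservable * share)}
        for name, share in tier_shares.items()
    ]


# Example with made-up function names and shares:
plan = reserved_concurrency_plan(
    1000, {"orders-basic-tier": 0.25, "orders-premium-tier": 0.5}
)
# Each entry could then be passed to boto3, for example:
#   boto3.client("lambda").put_function_concurrency(**plan[0])
```

Capping the basic tier this way keeps a burst from lower-tier tenants from consuming the concurrency that premium tenants depend on.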

Scale based on application-generated tenant insights

Instrument your application with detailed metrics to capture and surface tenant activity and consumption data that can be analyzed in real time.

Use the detailed tenant insights to construct scaling policies that help ensure that the system scales based on the actual activity trends and patterns of tenants.
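One possible way to surface tenant activity is to emit structured log events in the CloudWatch embedded metric format, which CloudWatch can turn into metrics. The sketch below builds such an event. The namespace, metric name, and the choice to use the tenant tier as the only dimension (keeping the tenant ID as a searchable property rather than a dimension, to limit metric cardinality) are assumptions for illustration.

```python
import json
import time


def tenant_metric_event(tenant_id, tier, name, value, unit="Count",
                        namespace="SaaS/TenantActivity", timestamp_ms=None):
    """Build an embedded-metric-format event tagged with tenant context.

    The namespace and dimension layout are hypothetical defaults.
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                # Only the tier is a dimension; TenantId stays a log property.
                "Dimensions": [["TenantTier"]],
                "Metrics": [{"Name": name, "Unit": unit}],
            }],
        },
        "TenantTier": tier,
        "TenantId": tenant_id,
        name: value,
    }


# Example: record that a (made-up) tenant processed 3 orders.
event = tenant_metric_event("tenant-123", "premium", "OrdersProcessed", 3)
print(json.dumps(event))
```

Aggregating by tier gives a metric that a scaling policy can act on, while the per-tenant property remains available in the logs for deeper analysis.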

Build dynamic tenant scaling policies around standard AWS metrics

For Amazon EC2 and container-based SaaS environments, identify the Amazon CloudWatch metrics that best align with the consumption profile of your tenants.

Create separate automatic scaling policies for each separately scaled service in your system, scaling on the metrics that represent the common performance bottlenecks for each service.
Monitoring CloudWatch metrics for your Auto Scaling groups and instances
Dynamic scaling for Amazon EC2 Auto Scaling
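To make the idea of one policy per separately scaled service concrete, the sketch below builds target tracking policy parameters in the shape accepted by the Amazon EC2 Auto Scaling `PutScalingPolicy` API. The service names, Auto Scaling group names, and target values are hypothetical, and the helper only covers predefined metrics that need no extra resource label.

```python
# Predefined metric types that do not require a ResourceLabel.
SUPPORTED_METRICS = {
    "ASGAverageCPUUtilization",
    "ASGAverageNetworkIn",
    "ASGAverageNetworkOut",
}


def target_tracking_policy(asg_name, service, metric_type, target_value):
    """Build PutScalingPolicy parameters for one service's Auto Scaling group."""
    if metric_type not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported predefined metric: {metric_type}")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{service}-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": metric_type,
            },
            "TargetValue": float(target_value),
        },
    }


# Example: a CPU-bound service and a network-bound service (made-up names),
# each scaled on the metric that reflects its own bottleneck.
policies = [
    target_tracking_policy("order-asg", "order", "ASGAverageCPUUtilization", 60),
    target_tracking_policy("media-asg", "media", "ASGAverageNetworkOut", 5_000_000),
]
# Each entry could be applied with boto3, for example:
#   boto3.client("autoscaling").put_scaling_policy(**policies[0])
```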