PERF 2: How do you ensure that consumption of infrastructure resources aligns with the activity and workloads of tenants?
Create a scaling strategy where the infrastructure footprint of a SaaS application closely mirrors the continually evolving profile of the tenant and the tenant’s workload. This requires rich insight into tenant activity and consumption patterns so that you can construct a scaling model that scales efficiently without over-provisioning resources, which is critical to SaaS businesses.
Resources
SaaS Metrics: The Ultimate View of Tenant Consumption (GPSTEC308)
Optimizing SaaS Tenant Workflows and Costs
Best Practices:
- Use tenant profile data to configure static scaling policies: Use logs and profiling data to analyze tenant loads and periodically configure infrastructure scaling policies based on historical consumption trends.
- Scale based on application-generated tenant insights: Capture and aggregate SaaS-specific metric data and use it (along with other metrics) to build a robust multi-tenant scaling strategy that minimizes the over-provisioning of resources based on live workloads.
- Build dynamic tenant scaling policies around standard AWS metrics: Rely on the readily accessible AWS infrastructure metrics to define a series of policies that approximate tenant consumption activity. Use these metrics to build policies that move you closer to aligning tenant activity with resource consumption.
Improvement Plan
Use tenant profile data to configure static scaling policies
- Use provisioned concurrency only in scenarios where specific tenant workflows do not meet their SLA without being pre-warmed.
- Use separate reserved concurrency configurations for tenant tiers, ensuring that tenant consumption does not exceed the target consumption profile of a given tier.
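As an illustration of the per-tier reserved concurrency point above, the following is a minimal sketch rather than a prescribed implementation. It assumes one Lambda function per tier named `<prefix>-<tier>`; the tier names, limits, and the helper `apply_tier_concurrency` are hypothetical. The only AWS call used is the Lambda `put_function_concurrency` operation, and the client is passed in (for example, `boto3.client("lambda")`) so the logic can be exercised without AWS access.

```python
from typing import Dict

# Hypothetical tier limits; real values would come from your own
# analysis of each tier's target consumption profile.
TIER_RESERVED_CONCURRENCY: Dict[str, int] = {
    "basic": 10,
    "standard": 50,
    "premium": 200,
}


def apply_tier_concurrency(
    lambda_client, function_prefix: str, limits: Dict[str, int]
) -> Dict[str, int]:
    """Set reserved concurrency on one function per tier.

    Assumes (hypothetically) that each tier is served by a function
    named "<function_prefix>-<tier>". Returns the applied limits keyed
    by function name.
    """
    applied: Dict[str, int] = {}
    for tier, limit in limits.items():
        if limit < 0:
            raise ValueError(f"limit for tier {tier!r} must be >= 0")
        function_name = f"{function_prefix}-{tier}"
        # Caps the concurrent executions available to this tier's function.
        lambda_client.put_function_concurrency(
            FunctionName=function_name,
            ReservedConcurrentExecutions=limit,
        )
        applied[function_name] = limit
    return applied
```

With a real client this might be called as `apply_tier_concurrency(boto3.client("lambda"), "orders", TIER_RESERVED_CONCURRENCY)`, where `"orders"` is a placeholder prefix.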
Scale based on application-generated tenant insights
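One way this could look in practice is sketched below, under stated assumptions: the application emits `(tenant_id, units)` consumption events, and a single instance can absorb a known number of units per interval. The event shape, the function names, and the capacity figures are all hypothetical and serve only to show how aggregated tenant metrics might feed a capacity decision.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_by_tenant(events: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum application-reported consumption units per tenant."""
    totals: Dict[str, float] = defaultdict(float)
    for tenant_id, units in events:
        totals[tenant_id] += units
    return dict(totals)


def desired_capacity(
    totals: Dict[str, float],
    units_per_instance: float,
    min_instances: int = 1,
    max_instances: int = 20,
) -> int:
    """Translate aggregate tenant load into an instance count.

    The result is clamped to [min_instances, max_instances] so a metric
    spike or gap cannot drive capacity outside agreed bounds.
    """
    if units_per_instance <= 0:
        raise ValueError("units_per_instance must be positive")
    needed = math.ceil(sum(totals.values()) / units_per_instance)
    return max(min_instances, min(max_instances, needed))
```

For example, events of 120 and 60 units for one tenant and 30 for another total 210 units; at 100 units per instance, the sketch asks for 3 instances.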
Build dynamic tenant scaling policies around standard AWS metrics
- Create separate automatic scaling policies for each separately scaled service in your system, scaling on the metrics that represent the common performance bottlenecks for each service.
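The sketch below shows one possible shape for per-service policies, assuming each service runs in its own EC2 Auto Scaling group and that a predefined metric is an adequate proxy for its bottleneck. The service definitions, group names, and target values are hypothetical. The call used is the EC2 Auto Scaling `put_scaling_policy` operation with a target tracking configuration, and the client is injected (for example, `boto3.client("autoscaling")`).

```python
from typing import Dict, List

# Hypothetical services: each maps an Auto Scaling group to the
# predefined metric assumed to reflect its usual bottleneck.
SERVICES: List[Dict[str, object]] = [
    {"group": "order-service-asg",
     "metric": "ASGAverageCPUUtilization", "target": 50.0},
    {"group": "media-service-asg",
     "metric": "ASGAverageNetworkOut", "target": 5_000_000.0},
]


def put_target_tracking_policies(autoscaling_client, services) -> List[str]:
    """Create one target tracking policy per separately scaled service."""
    policy_names: List[str] = []
    for service in services:
        policy_name = f"{service['group']}-target-tracking"
        autoscaling_client.put_scaling_policy(
            AutoScalingGroupName=service["group"],
            PolicyName=policy_name,
            PolicyType="TargetTrackingScaling",
            TargetTrackingConfiguration={
                "PredefinedMetricSpecification": {
                    "PredefinedMetricType": service["metric"],
                },
                "TargetValue": service["target"],
            },
        )
        policy_names.append(policy_name)
    return policy_names
```

Keeping one policy per group means each service scales on its own signal, so a CPU-bound service and a network-bound service do not share a single compromise threshold.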
Monitoring CloudWatch metrics for your Auto Scaling groups and instances
Dynamic scaling for Amazon EC2 Auto Scaling