OPS 1: How do you effectively monitor and manage the operational health of a multi-tenant environment?

Multi-tenant environments rely on robust operational tools that provide tenant-aware views into the overall activity and health of the system. By adding tenant consumption, activity, and health trends to the system’s operational experience, SaaS teams can more effectively capture, profile, and assess the health trends of tenants and tenant tiers.

Resources

AWS re:Invent 2017: GPS: SaaS Monitoring - Creating a Unified View of Multi-tenant Health featuring New Relic (GPSTEC309)
GPSTEC309-SaaS Monitoring Creating a Unified View of Multi-tenant Health featuring New Relic Slides

Best Practices:

Improvement Plan

Include tenant context in application logs

  • Instrument application logs with tenant context
  • Use this tenant context to streamline the operational experience, so that administrators can filter and analyze logs by tenant with log analytics tools.
  • Use search solutions, such as Amazon OpenSearch Service/Kibana or Amazon Athena, to explore the log data and analyze tenant trends.
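
As an illustration of instrumenting logs with tenant context, the following is a minimal sketch that uses Python's standard `logging` and `contextvars` modules. The logger name `saas.app`, the field names `tenant_id` and `tenant_tier`, and the `handle_request` helper are assumptions made for this example, not part of any AWS API.

```python
import contextvars
import json
import logging

# Hypothetical context variables that hold the tenant for the current request.
current_tenant = contextvars.ContextVar("current_tenant", default="unknown")
current_tier = contextvars.ContextVar("current_tier", default="unknown")


class TenantContextFilter(logging.Filter):
    """Attach the current tenant ID and tier to every log record."""

    def filter(self, record):
        record.tenant_id = current_tenant.get()
        record.tenant_tier = current_tier.get()
        return True  # never drop the record, only enrich it


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per line so log tools can filter by tenant."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "tenant_id": record.tenant_id,
            "tenant_tier": record.tenant_tier,
        })


def build_logger(stream):
    """Create a logger that writes tenant-aware JSON lines to the given stream."""
    logger = logging.getLogger("saas.app")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    handler.addFilter(TenantContextFilter())
    logger.addHandler(handler)
    return logger


def handle_request(logger, tenant_id, tier):
    # In a real service the tenant would be resolved from the caller's
    # identity; here it is passed in directly for illustration.
    current_tenant.set(tenant_id)
    current_tier.set(tier)
    logger.info("order created")
```

Because every record carries the same tenant fields, a query in a log analytics tool can filter or group on `tenant_id` or `tenant_tier` without parsing message text.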

Use purpose-built, tenant-aware tools to enable proactive management of tenant workloads

  • Create a multi-tenant-aware dashboard
  • Enable operations teams to configure tenant alerts and alarms
  • Identify specific patterns of tenant consumption, activity, and SLA metrics that can be combined and used to proactively identify tenant health issues.
  • Configure alerts and alarms that are triggered when tenants reach specific health states or performance thresholds that might be a precursor to a reliability issue.
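
One way to picture tenant-aware alerting is a check that compares each tenant's recent metrics against thresholds defined for its tier. The sketch below is illustrative only: the tier names, metric names, and threshold values are hypothetical, and a production system would more likely express these rules in its monitoring service than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantSample:
    tenant_id: str
    tier: str
    p95_latency_ms: float
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


# Hypothetical per-tier thresholds; premium tenants get tighter limits.
THRESHOLDS = {
    "basic": {"p95_latency_ms": 800.0, "error_rate": 0.05},
    "premium": {"p95_latency_ms": 300.0, "error_rate": 0.01},
}


def evaluate(sample):
    """Return the names of the metrics that exceed the tenant's tier limits."""
    limits = THRESHOLDS[sample.tier]
    breaches = []
    if sample.p95_latency_ms > limits["p95_latency_ms"]:
        breaches.append("p95_latency_ms")
    if sample.error_rate > limits["error_rate"]:
        breaches.append("error_rate")
    return breaches


def find_alerts(samples):
    """Map each tenant with at least one breach to its breached metrics."""
    alerts = {}
    for sample in samples:
        breaches = evaluate(sample)
        if breaches:
            alerts[sample.tenant_id] = breaches
    return alerts
```

Keeping thresholds per tier rather than global lets the same latency value be acceptable for one tenant and an early warning for another.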

Collect detailed tenant insights

  • Publish insights that enhance the operational views of tenant workloads
  • Aggregate tenant metric insights to enable historical analysis of trends and provide insights to shape future architectural and operational strategies.
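
To make the aggregation step concrete, here is a minimal sketch that rolls per-tenant usage records up by day and tier. The record shape (`day`, `tier`, `tenant_id`, `requests`) is an assumption for this example, and the tenant count assumes one record per tenant per day.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_tier(records):
    """Summarize request counts for each (day, tier) pair.

    Each record is a dict with the hypothetical keys
    "day", "tier", "tenant_id", and "requests".
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[(record["day"], record["tier"])].append(record["requests"])

    return {
        key: {
            "tenants": len(values),  # assumes one record per tenant per day
            "total_requests": sum(values),
            "mean_requests": mean(values),
        }
        for key, values in grouped.items()
    }
```

Storing summaries like these over time gives a compact history that can be compared across tiers when evaluating architectural or operational changes.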