OPS 11: How do you evolve operations?
Dedicate time and resources for continuous incremental improvement to evolve the effectiveness and efficiency of your operations.
Have a process for continuous improvement: Regularly evaluate and prioritize opportunities for improvement to focus efforts where they can provide the greatest benefits.
Perform post-incident analysis: Review customer-impacting events, and identify the contributing factors and preventative actions. Use this information to develop mitigations to limit or prevent recurrence. Develop procedures for prompt and effective responses. Communicate contributing factors and corrective actions as appropriate, tailored to target audiences.
Perform Knowledge Management: Mechanisms exist for your team members to discover the information that they are looking for in a timely manner, access it, and identify that it’s current and complete. Mechanisms are present to identify needed content, content in need of refresh, and content that should be archived so that it’s no longer referenced.
Define drivers for improvement: Identify drivers for improvement to help you evaluate and prioritize opportunities.
Validate insights: Review your analysis results and responses with cross-functional teams and business owners. Use these reviews to establish common understanding, identify additional impacts, and determine courses of action. Adjust responses as appropriate.
Perform operations metrics reviews: Regularly perform retrospective analysis of operations metrics with cross-team participants from different areas of the business. Use these reviews to identify opportunities for improvement and potential courses of action, and to share lessons learned.
Document and share lessons learned: Document and share lessons learned from the execution of operations activities so that you can use them internally and across teams.
Allocate time to make improvements: Dedicate time and resources within your processes to make continuous incremental improvements possible.
Have a process for continuous improvement
Perform post-incident analysis
Implement feedback loops
- Immediate feedback: Immediate feedback comes from the execution of operations activities where, through review of the execution and outcomes, it is recognized that the process could be improved. Feedback can come from customers, team members, or automated output of an activity. When the improvement has a low level of effort, or significant benefit, consider implementing it immediately. Track opportunities for improvement in your backlog or issue system as appropriate. For example, a process where data is staged on an intermediate device could be optimized by instead placing the data directly into the target environment. This would eliminate a step in the process and the requirement for the intermediate resources.
- Retrospective analysis: Perform retrospective analysis regularly to capture feedback from the review of operational outcomes and metrics over time. Use trends to identify areas that need improvement. For example, review the rate of deployment failures to identify when potential issues with development and deployment activities have emerged.
Serverless big data analytics - Amazon Athena and Amazon QuickSight - 2017 AWS Online Tech Talks
View AWS CodeDeploy logs in Amazon CloudWatch console
Analyzing VPC flow logs with Amazon Kinesis Firehose, Amazon Athena, and Amazon QuickSight
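The retrospective analysis example above (reviewing the rate of deployment failures over time) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `DEPLOYMENTS` records, the function names, and the 25% threshold are all hypothetical, and in practice the records would come from your own deployment tooling or logs.

```python
from collections import defaultdict
from datetime import date

# Hypothetical deployment records: (deployment date, succeeded?).
# In a real review these would be exported from your deployment tooling.
DEPLOYMENTS = [
    (date(2024, 1, 2), True),
    (date(2024, 1, 4), True),
    (date(2024, 1, 9), True),
    (date(2024, 1, 10), False),
    (date(2024, 1, 11), True),
    (date(2024, 1, 12), True),
    (date(2024, 1, 16), False),
    (date(2024, 1, 18), True),
]


def weekly_failure_rates(deployments):
    """Group deployments by ISO (year, week) and return the failure rate per week."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for day, succeeded in deployments:
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # (ISO year, ISO week number)
        totals[key] += 1
        if not succeeded:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in sorted(totals)}


def weeks_above_threshold(rates, threshold=0.25):
    """Return the weeks whose failure rate exceeds the (assumed) threshold."""
    return [week for week, rate in rates.items() if rate > threshold]


if __name__ == "__main__":
    rates = weekly_failure_rates(DEPLOYMENTS)
    for week, rate in rates.items():
        print(week, f"{rate:.0%}")
    print("Weeks to review:", weeks_above_threshold(rates))
```

Grouping by week rather than by day smooths out noise from low deployment counts; the appropriate window and threshold depend on how often your team deploys.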
Perform Knowledge Management
Define drivers for improvement
- Desired capabilities: Consider desired features and capabilities when evaluating opportunities for improvement.
What's New with AWS
- Unacceptable issues: Consider unacceptable issues, bugs, and vulnerabilities when evaluating opportunities for improvement.
AWS Latest Security Bulletins
AWS Trusted Advisor
- Compliance requirements: Consider updates and changes required to maintain compliance with regulation or policy, or to remain under support from a third party, when evaluating opportunities for improvement.
AWS Compliance Programs
AWS Compliance Latest News
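One way to use the drivers above when evaluating and prioritizing opportunities is to score each opportunity by its driver, expected benefit, and effort. The sketch below is only an illustration under stated assumptions: the `DRIVER_WEIGHTS` values, the 1 to 5 benefit and effort scales, the scoring formula, and the example opportunities are all hypothetical and would need to reflect your own organization's priorities.

```python
from dataclasses import dataclass

# Assumed relative weights for each driver; adjust to your own priorities.
DRIVER_WEIGHTS = {
    "desired_capability": 1,
    "unacceptable_issue": 3,
    "compliance_requirement": 2,
}


@dataclass
class Opportunity:
    name: str
    driver: str   # one of the keys in DRIVER_WEIGHTS
    benefit: int  # estimated benefit, 1 (low) to 5 (high)
    effort: int   # estimated effort, 1 (low) to 5 (high)


def priority(opportunity):
    """Hypothetical score: driver weight times benefit, divided by effort."""
    return DRIVER_WEIGHTS[opportunity.driver] * opportunity.benefit / opportunity.effort


def prioritize(opportunities):
    """Return opportunities ordered from highest to lowest score."""
    return sorted(opportunities, key=priority, reverse=True)


# Illustrative backlog entries, not taken from any real system.
BACKLOG = [
    Opportunity("Add deployment dashboard", "desired_capability", benefit=3, effort=2),
    Opportunity("Patch vulnerable library", "unacceptable_issue", benefit=4, effort=2),
    Opportunity("Upgrade unsupported runtime", "compliance_requirement", benefit=3, effort=3),
]

if __name__ == "__main__":
    for item in prioritize(BACKLOG):
        print(f"{priority(item):.1f}  {item.name}")
```

A simple score like this is a starting point for discussion in a review, not a substitute for it; the ranking should still be validated with cross-functional teams and business owners.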
Perform operations metrics reviews
Using Amazon CloudWatch metrics
Publish custom metrics
Amazon CloudWatch metrics and dimensions reference
Document and share lessons learned
- Share learnings: Have procedures to share lessons learned and associated artifacts across teams. For example, share updated procedures, guidance, governance, and best practices through an accessible wiki; share scripts, code, and libraries through a common repository.
Delegating access to your AWS environment
Share an AWS CodeCommit repository
Easy authorization of AWS Lambda functions
Sharing an AMI with specific AWS Accounts
Speed template sharing with an AWS CloudFormation designer URL
Using AWS Lambda with Amazon SNS
Allocate time to make improvements