l12 culture Flashcards
The Principles of DevOps
The Principles of Flow
1. Make Work visible
2. Limit Work in Progress
3. Reduce Batch Size
4. Reduce the number of handoffs
5. Continually identify and elevate our constraints
6. Eliminate waste in the value stream
The Principles of Feedback
1. Work safely within complex systems
2. See problems as they occur
3. Swarm and Solve problems to build new knowledge
4. Keep pushing quality closer to the source
5. Enable optimizing for downstream work centers
The Principles of Continual Learning and Experimentation
1. Enable organizational learning and a safety culture
2. Institutionalize the improvement of daily work
3. Transform local discoveries into global improvements
4. Inject resilience patterns into the daily work
5. Leaders reinforce a learning culture
Post Mortem
Each incident needs to be described in a Post Mortem
See Post Mortems as part of the process:
- Each incident that meets defined criteria should result in a Post Mortem
- Similar to "Problem" in ITIL (but blameless and focused on analysis instead of result)
- Create a template for easy and widespread usage
- Take your time to create a Post Mortem. A good Post Mortem is more than worth the time invested
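A minimal sketch of what such a template and incident criterion could look like. The field names, the `needs_post_mortem` rule, and its 30-minute default are illustrative assumptions, not part of the source notes:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PostMortem:
    """Hypothetical blameless Post Mortem record; field names are illustrative."""
    title: str
    summary: str
    timeline: List[str] = field(default_factory=list)
    contributing_factors: List[str] = field(default_factory=list)
    action_items: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the record as plain text, marking empty sections as TODO."""
        def section(name, items):
            body = [f"  - {item}" for item in items] or ["  - TODO"]
            return [f"{name}:"] + body

        lines = [f"Post Mortem: {self.title}", f"Summary: {self.summary}"]
        lines += section("Timeline", self.timeline)
        lines += section("Contributing factors", self.contributing_factors)
        lines += section("Action items", self.action_items)
        return "\n".join(lines)


def needs_post_mortem(customer_impact: bool, duration_minutes: int,
                      min_duration: int = 30) -> bool:
    """Assumed criterion: customer impact, or an incident lasting long enough."""
    return customer_impact or duration_minutes >= min_duration
```

Rendering an empty record gives a fill-in skeleton, which keeps the template easy to reuse. The sections deliberately ask for factors and actions rather than for who was at fault.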
4 Common Failure Cause Classes
- Changes to the system itself (new deployments or configuration changes)
- Changes in user behaviour (downstream services playing the user role are updated, as well as new real user behaviour such as Black Friday or Street Parade)
- Changes in dependencies (deployments of services this service depends on, as well as new versions of libraries, possibly transitive)
- Changes in the infrastructure (changes to host, container, network, storage, …)
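The four classes can be used as a checklist when sorting recent changes during an incident. A small sketch, assuming a hypothetical list of `(kind, description)` change events. The event kinds in the mapping are invented for illustration:

```python
from enum import Enum


class CauseClass(Enum):
    SYSTEM = "change to the system itself"
    USER_BEHAVIOUR = "change in user behaviour"
    DEPENDENCY = "change in dependencies"
    INFRASTRUCTURE = "change in the infrastructure"


# Hypothetical mapping from change-event kinds to the four cause classes.
EVENT_KIND_TO_CLASS = {
    "deployment": CauseClass.SYSTEM,
    "config_change": CauseClass.SYSTEM,
    "traffic_spike": CauseClass.USER_BEHAVIOUR,
    "client_update": CauseClass.USER_BEHAVIOUR,
    "dependency_deployment": CauseClass.DEPENDENCY,
    "library_upgrade": CauseClass.DEPENDENCY,
    "host_change": CauseClass.INFRASTRUCTURE,
    "network_change": CauseClass.INFRASTRUCTURE,
}


def candidate_causes(recent_events):
    """Group recent change events by cause class, skipping unknown kinds."""
    grouped = {cls: [] for cls in CauseClass}
    for kind, description in recent_events:
        cls = EVENT_KIND_TO_CLASS.get(kind)
        if cls is not None:
            grouped[cls].append(description)
    return grouped
```

An empty list for a class is also useful information: it says that class has been checked and nothing changed there.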
Failure tracing, USE Method
What works in monitoring works when tracing errors as well:
the USE method (Utilization, Saturation, Errors) is one example.
* Define / learn a flow chart to track down errors
* Start with logs
  * At the application…
  * …to the platform…
  * …to the infrastructure
* Continue with other metrics (saturation / utilization) as well
* Document findings in Post Mortem
* The document may live on; it does not need to be final
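The flow above can be sketched as a top-down walk over the layers: errors from logs first, then utilization and saturation. The input shape (`checks`), the `utilization_limit` default of 0.8, and the wording of the findings are assumptions made for illustration:

```python
LAYERS = ["application", "platform", "infrastructure"]  # order from the notes


def trace(checks, utilization_limit=0.8):
    """Walk the layers top-down and collect findings for the Post Mortem.

    `checks` maps a layer name to a dict with the keys "errors" (count taken
    from logs), "utilization" (0.0 to 1.0) and "saturation" (queued work).
    """
    findings = []
    for layer in LAYERS:
        data = checks.get(layer)
        if data is None:
            findings.append(f"{layer}: no data collected")
            continue
        # Start with logs / errors ...
        if data["errors"] > 0:
            findings.append(f"{layer}: {data['errors']} errors in logs")
        # ... then continue with the other metrics.
        if data["utilization"] >= utilization_limit:
            findings.append(f"{layer}: utilization {data['utilization']:.0%}")
        if data["saturation"] > 0:
            findings.append(f"{layer}: saturation (queued work: {data['saturation']})")
    return findings
```

A layer without data is reported rather than skipped, so gaps in observability end up in the Post Mortem as well.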
Failure tracing, Anti Pattern
- The lack of a deliberate methodology
- Street Light Anti-Method:
  1. Pick observability tools that are
     1. Familiar
     2. Found on the internet
     3. Found at random
  2. Run tools
  3. Look for obvious issues
- Drunk Man Anti-Method
  - Tune things at random until the problem goes away