Table of Contents
- Foreword I
- Foreword II
- Preface
- Chapter 1 - How SRE Relates to DevOps
- Part I - Foundations
- Chapter 2 - Implementing SLOs
- Chapter 3 - SLO Engineering Case Studies
- Chapter 4 - Monitoring
- Chapter 5 - Alerting on SLOs
- Chapter 6 - Eliminating Toil
- Chapter 7 - Simplicity
- Part II - Practices
- Chapter 8 - On-Call
- Chapter 9 - Incident Response
- Chapter 10 - Postmortem Culture: Learning from Failure
- Chapter 11 - Managing Load
- Chapter 12 - Introducing Non-Abstract Large System Design
- Chapter 13 - Data Processing Pipelines
- Chapter 14 - Configuration Design and Best Practices
- Chapter 15 - Configuration Specifics
- Chapter 16 - Canarying Releases
- Part III - Processes
- Chapter 17 - Identifying and Recovering from Overload
- Chapter 18 - SRE Engagement Model
- Chapter 19 - SRE: Reaching Beyond Your Walls
- Chapter 20 - SRE Team Lifecycles
- Chapter 21 - Organizational Change Management in SRE
- Conclusion
- Appendix A - Example SLO Document
- Appendix B - Example Error Budget Policy
- Appendix C - Results of Postmortem Analysis
- Index
- About the Editors
- Colophon