Awesome Site Reliability Engineering
A curated list of awesome Site Reliability and Production Engineering resources.
What is Site Reliability Engineering?
"Fundamentally, it's what happens when you ask a software engineer to design an operations function." - Ben Treynor Sloss, VP Google Engineering, founder of Google SRE
Contributing
Please take a look at the contribution guidelines first. Contributions are always welcome!
Contents
- Culture
- Education
- Books
- Hiring
- Reliability
- Monitoring & Observability & Alerting
- On-Call
- Post-Mortem
- Capacity Planning
- Service Level Agreement
- Performance
- Programming
- Misc Articles
- Real-time Messaging
- Blogs
- Newsletters
- Conferences & Meetups
- SRE Tools
- SRE Podcasts
Culture
- What is Site Reliability Engineering?
- Keys To SRE by Ben Treynor
- Google SRE Resources
- Notes from Production Engineering by Pedro Canahuati
- PostOps: Recovery from Operations
- Love DevOps? Wait 'till you meet SRE [video]
- How Google Does Planet-Scale Engineering for Planet-Scale Infra
- Site Reliability Engineering at Facebook
- A History of Site Reliability Engineering at Uber
- Case Study: Adopting SRE Principles at StackOverflow
- Site Reliability Engineering at Dropbox
- Site Reliability Engineers — Keeping Google up and running 24/7
- Site Reliability Engineering at Salesforce
- From Sys Admin to Netflix SRE - video and slides
- SRE@Google: Thousands of DevOps Since 2004
- Transactional System Administration Is Killing Us and Must be Stopped
- A hierarchy of SRE needs
- PostOps: A Non-Surgical Tale of Software, Fragility, and Reliability
- SRE: An incomplete guide to cultural Narnia [video]
- Putting Together Great SRE Teams
- Work at Google: Meet our Production Engineers for Site Reliability Hangout on Air
- Toil: A Word Every Engineer Should Know
- Engineering Reliability into Web Sites: Google SRE
- DEVOPS & SRE AMA - Building High Performance Organizations
- John Allspaw's AMA on Incident Analysis and Postmortems
- Site Reliability Engineering with Paul Newson - Part 1 & Part 2
- How SysAdmins Devalue Themselves
- The Softer Side of DevOps