A curated collection of resources for Site Reliability and Production Engineering practices.
This repository gathers essential readings, talks, and tools covering the breadth of SRE and production engineering. It is organized into sections such as culture, monitoring, incident response, and capacity planning, and offers insights from industry leaders.
Intended for teams and individuals focused on building and maintaining reliable, scalable software systems.