The Manager of Site Reliability Engineering leads and develops a team of SRE practitioners focused on delivering highly reliable, scalable, and performant cloud-based infrastructure and services.
Requirements
- Lead, mentor, and grow a high-performing team of Site Reliability Engineers
- Implement and champion Site Reliability Engineering principles and DevOps best practices
- Define and track key SRE metrics
- Drive automation efforts
- Own and continuously improve observability practices
- Participate in incident response processes
- Partner closely with software engineering, product management, architecture, and security teams
- Oversee the management and scalability of cloud infrastructure environments
- Advocate for and apply best practices in performance tuning, capacity planning, and system design
- Develop and execute a long-term roadmap for our hybrid cloud platform
- Establish and monitor key performance indicators (KPIs), service level indicators (SLIs), and service level objectives (SLOs)
Benefits
- Health insurance
- Dental insurance
- Vision insurance
- 401(k) matching
- Retirement plan
- Paid time off