We are looking for an experienced Site Reliability Engineer to help strengthen the reliability, observability, and operational excellence of Netflix. The ideal candidate will partner closely with engineering teams to ensure the reliability of the end-to-end member journey.
Requirements
- 5+ years of experience in an SRE, Production Engineering, or similar role operating business-critical, high-traffic services in production.
- Strong coding skills in one or more languages such as Python, Go, or Java, with a focus on building automated solutions instead of relying on manual operations.
- Fluency in modern cloud infrastructure: hands-on experience with large-scale environments on AWS/Azure/GCP, along with abstracted compute and platform orchestration systems.
- Deep understanding of large-scale distributed systems, including common failure modes, performance bottlenecks, and how to design for resilience and graceful degradation.
- Strong observability and performance tuning skills: you can use metrics, logs, and traces to debug issues in complex systems, and you’re comfortable profiling and optimizing services to meet latency, availability, or efficiency goals.
- Experience with incident management and response: you can navigate ambiguous, high-pressure production issues, drive coordinated response, and follow through with durable improvements.
- Strong collaboration and influence skills: you communicate clearly, build trust with partner teams, and can guide engineering teams toward better reliability practices without relying on authority.
- Ability to balance reliability, velocity, and cost: you’re comfortable making and explaining tradeoffs, and using data (SLOs, error budgets, performance metrics) to guide decision-making.
- Growth mindset and curiosity: you are eager to learn, comfortable challenging assumptions (including your own), and motivated by continuous improvement of systems, processes, and yourself.
- Agency: you thrive when given a loosely defined goal, defining the work needed to accomplish it while farming for dissent and feedback from the team and our stakeholders.
Benefits
- Health Plans
- Mental Health Support
- 401(k) Retirement Plan with employer match
- Stock Option Program
- Disability Programs
- Health Savings and Flexible Spending Accounts
- Family-Forming Benefits
- Life and Serious Injury Benefits