Own availability, the most important product feature, by continually striving for sustained operational excellence of Sumo’s planet-scale observability and security products.
Requirements
- Continually improve the lifecycle of microservices and architectural components, from inception and design through deployment, operation, and refinement.
- Write code and automation to reduce operational workload, increase efficiency, improve security posture, eliminate toil, and enable Sumo’s developers to deliver features more rapidly.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Facilitate blame-free root cause analysis meetings for incidents to learn and drive improvement.
- Participate in and continually improve our global IRC (incident response coordination) for all products.
- Drive root cause identification and issue resolution across teams.
- Work in a fast-paced, iterative environment.
Benefits
- 401(k) Matching
- Retirement Plan
- Generous Paid Time Off
- Visa Sponsorship