Join Novibet as an SRE Manager to lead and grow the Site Reliability Engineering team, ensuring the reliability, scalability, and performance of production systems while building a culture of operational excellence.
Requirements
- 5+ years of experience in SRE, DevOps, or application support engineering roles
- 2+ years of experience managing or leading engineering teams
- Deep expertise in cloud platforms (AWS, Azure) and infrastructure-as-code (Terraform, etc.)
- Strong background in Kubernetes, container orchestration, and microservices reliability
- Hands-on experience with observability tooling: Datadog, Prometheus/Grafana, OpenTelemetry, or equivalent
- Proven track record of designing and running incident management frameworks at scale
- Excellent communication skills with the ability to influence and align cross-functional stakeholders
- Experience defining and operating against SLOs and error budgets
- Experience in a high-growth SaaS or platform engineering environment is a plus
- Background in security-adjacent reliability practices (chaos engineering, threat modeling) is a plus
- Contributions to open-source reliability or observability projects are a plus
Benefits
- Competitive Compensation
- Health Insurance
- Top-Notch Equipment
- Career Growth
- Free Access to In-House Gym
- Alternative Transportation
- Inclusive Environment
- Engaging Activities