As a Site Reliability Engineer, you will help ensure the reliability, scalability, and observability of CloudBlue's multi-tenant SaaS platforms used by service providers worldwide.
Requirements
- 3+ years of experience as an SRE, DevOps Engineer, or Production Engineer, with strong ownership of production systems
- Proven experience operating highly available, enterprise-grade, multi-tenant SaaS platforms
- Hands-on experience with observability and monitoring tools such as Datadog, Grafana, and Elasticsearch/Kibana
- Solid understanding of Linux, networking, and distributed systems fundamentals
- Experience working in containerized environments using Docker and Kubernetes
- Strong scripting and automation skills using Python and/or Bash
- Experience participating in on-call rotations and incident response in production environments
- Strong written and spoken English
Benefits
- Competitive salary
- Career advancement & professional development opportunities
- Flexible work arrangements
- Remote work opportunity