At Cloudbeds, we're transforming hospitality with a platform that powers properties across 150 countries. As a Senior Site Reliability Engineer, you'll ensure our platform's reliability and performance, architecting scalable AWS cloud solutions and fostering a culture of automation and continuous improvement.
Requirements
- 5+ years of experience as a DevOps engineer or SRE working within the AWS ecosystem
- 5+ years of experience with Kubernetes (EKS) and Helm charts
- Experience designing, building, and supporting CI/CD pipelines with ArgoCD and GitHub Actions
- Experience with infrastructure-as-code methodologies with Terraform
- Experience with observability and monitoring using Grafana, Prometheus, Datadog, and CloudWatch
- Experience with Incident Management, full stack troubleshooting, performance analysis and root cause analysis (RCA)
- Experience with web application systems such as Nginx, Ingress controllers, load balancing, and content delivery networks (CDNs)
- Experience with Databases (MySQL, PostgreSQL, Aurora) and Middleware technologies (Redis, Memcached and SQS)
- Good networking skills with VPC, Security Groups and Network ACLs
- Ability to work remotely and manage your own time in a global team
- Good written and verbal communication in English
- Bachelor’s degree in Computer Science or equivalent experience
Benefits
- PTO in accordance with local labor requirements
- Monthly Wellness Fridays - enjoy an extra long weekend every month
- Fully paid parental leave
- Home office stipend based on country of residency
- Professional development courses through Cloudbeds University
- Access to professional development, including manager training, upskilling and knowledge transfer