We're a remote team transforming hospitality with a unified platform that integrates with hundreds of partners. We're looking for a Senior Site Reliability Engineer to ensure our platform's reliability and performance, so that millions of hospitality transactions flow seamlessly across the globe.
Responsibilities
- Design and implement reliable, scalable, and efficient cloud infrastructure
- Maintain and support heavily loaded Kubernetes (EKS) clusters and related infrastructure components
- Develop and continuously improve monitoring and logging systems
- Participate in the on-call rotation to support the production environment and ensure rapid response to outages
- Lead incident response efforts, ensuring minimal service impact while documenting learnings and implementing preventive measures
- Collaborate with development teams to establish Service Level Objectives (SLOs) and ensure systems meet or exceed reliability targets
- Champion SRE best practices across engineering, mentoring teams on resiliency, performance optimization, and scalability
- Automate platform operations with infrastructure-as-code (Terraform) and configuration management tools
Benefits
- Remote First, Remote Always
- PTO in accordance with local labor requirements
- Free use of 2 corporate apartments for team members
- Fully Paid Parental Leave
- Home office stipend based on country of residency
- Professional development courses through Cloudbeds University
- Access to professional therapy and coaching
- Access to professional development, including manager training, upskilling, and knowledge transfer