LivePerson, a leading customer engagement company, is seeking a Mid-Level Site Reliability Engineer (SRE) to join its global Platform Engineering team. The successful candidate will keep the company's platform reliable, scalable, and performant, collaborating closely with developers, QA, and product teams. They will be responsible for designing automation, improving observability, and maintaining the health of production systems.
Requirements
- Collaborate with developers, QA, and product teams during sprint planning to understand release plans, dependencies, and infrastructure requirements
- Manage and operate Kubernetes clusters in Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS)
- Develop and manage Terraform modules for provisioning and configuring cloud infrastructure across GCP and AWS
- Build and enhance observability with Prometheus, Grafana, and Datadog to monitor application and platform performance
- Design, implement, and maintain GitLab CI/CD pipelines for build, test, and deployment automation
- Drive an automation-first culture by developing scripts and tooling in Python, Go, or Shell to minimize manual effort and improve efficiency
- Participate in a 24/7 on-call rotation, ensuring quick detection, mitigation, and resolution of incidents
- Perform root cause analysis (RCA) and contribute to post-incident reviews to prevent recurrence
Benefits
- 15 Days PTO + Casual & Sick Leave
- 8 Lakhs Family Floater Coverage
- Personal Accident & Life Insurance: 3x Gross Annual Salary