Join Fortune 7 CVS Health as a Staff Software Engineer to lead and advance Site Reliability Engineering, AIOps, Observability, and Monitoring capabilities in the CVS Digital team. The role is central to building intelligent, automated, and scalable reliability practices across our platforms.
Requirements
- 5+ years of experience in software engineering, SRE, or production engineering in large-scale distributed systems
- Hands-on experience with Observability tools such as AppDynamics, Grafana, Prometheus, Datadog, OpenTelemetry, or similar
- Experience with AIOps or intelligent monitoring platforms, including anomaly detection and event correlation
- Strong expertise in cloud platforms (AWS, Azure, or GCP) and cloud-native architectures (Kubernetes, containers, microservices)
- Proficiency in at least one programming language (e.g., Python, Java, Go)
- Strong understanding of distributed systems, resiliency patterns, and fault tolerance
- Experience implementing incident management, on-call processes, and root cause analysis
- Hands-on expertise with Infrastructure as Code (Terraform, ARM, CloudFormation) and CI/CD pipelines
- Experience using GenAI and automation tools and frameworks such as OpenAI, Copilot, Gemini, Claude, and MCP
- Proven ability to design scalable, reliable, and observable systems
Benefits
- Comprehensive benefits package
- Medical, dental, and vision coverage
- Paid time off
- Retirement savings options
- Wellness programs
- Other resources, based on eligibility