Join Oracle's Health Data Intelligence (HDI) team as a Senior Infrastructure & Reliability Engineer focused on Site Reliability Engineering for large-scale healthcare analytics platforms. You will design, build, and operate highly reliable, scalable infrastructure and data pipelines that power mission-critical analytics globally.
Requirements
- Experience building and operating high-availability, fault-tolerant systems
- Strong understanding of distributed systems, performance monitoring, and resiliency patterns
- Experience with incident response, root-cause analysis, and production troubleshooting
- Hands-on experience applying Generative AI or Agentic AI to infrastructure lifecycle management, observability and anomaly detection, and incident response and remediation automation
- Strong experience with multi-cloud environments (OCI, AWS/Azure)
- Deep understanding of cloud infrastructure design, deployment, and resource optimization
- Advanced competency in CI/CD pipelines (Jenkins, Kubernetes)
- Advanced competency in Infrastructure as Code (Terraform)
- Advanced competency in observability tools (Prometheus, Grafana)
- Proficiency in data warehousing platforms (e.g., Vertica, Snowflake)
- Experience with ETL frameworks and large-scale data processing
- Understanding of columnar storage systems
- Experience supporting or integrating BI tools (Tableau, Power BI, Oracle Analytics)
- Strong proficiency in Python, Java, or Go
- Experience with Docker, Kubernetes, and shell scripting
- Strong troubleshooting skills with the ability to perform root-cause analysis
- Experience resolving complex production issues in distributed systems
Benefits
- Flexible medical insurance
- Life insurance
- Retirement options