McKesson is seeking a Sr. Database Site Reliability Engineer (DB SRE) to own the reliability, availability, and operational maturity of the business-critical Azure PostgreSQL platforms that support CoverMyMeds (CMM).
Requirements
- Own and continuously improve the reliability, availability, and performance of Azure PostgreSQL platforms across development, staging, and production environments
- Design, build, and operate cloud database infrastructure using Infrastructure as Code (Terraform)
- Apply SRE principles to stateful systems, including environment isolation, blast-radius reduction, and automation-first operations
- Define and implement database observability (metrics, logs, dashboards, alerts) using enterprise monitoring tools (e.g., Datadog)
- Lead incident response for database-related production issues and participate in on-call rotations
- Troubleshoot complex issues across performance, replication, connectivity, failover, and permissions
- Define and validate high availability, backup, restore, disaster recovery, and point-in-time recovery (PITR) strategies
- Enforce least-privilege access, support audits, and ensure compliance with security and governance requirements
- Collaborate with platform, application, security, and network teams to design scalable, secure database architectures
- Provide senior technical leadership, set reliability standards, and mentor less-experienced engineers
Benefits
- Generous Paid Time Off
- 401(k) Matching
- Retirement Plan
- Four-Day Work Week
- Generous Parental Leave
- Tuition Reimbursement
- Relocation Assistance