The Sr. Database Site Reliability Engineer (DB SRE) is responsible for owning the reliability, availability, and operational maturity of business-critical Azure PostgreSQL platforms supporting the CoverMyMeds (CMM) platform. The role applies Site Reliability Engineering (SRE) principles to database services, emphasizing automation, Infrastructure as Code, observability, and resilience in a modern cloud-native environment.
Requirements
- Own and continuously improve the reliability, availability, and performance of Azure PostgreSQL platforms across dev, stage, and prod
- Design, build, and operate cloud database infrastructure using Infrastructure as Code (Terraform)
- Apply SRE principles to stateful systems, including environment isolation, blast-radius reduction, and automation-first operations
- Define and implement database observability (metrics, logs, dashboards, alerts) using enterprise monitoring tools (e.g., Datadog)
- Lead incident response for database-related production issues and participate in on-call rotations
- Troubleshoot complex issues across performance, replication, connectivity, failover, and permissions
- Define and validate high availability, backup, restore, disaster recovery, and point-in-time recovery (PITR) strategies
- Enforce least-privilege access, support audits, and ensure compliance with security and governance requirements
- Collaborate with platform, application, security, and network teams to design scalable, secure database architectures
- Provide senior technical leadership, set reliability standards, and mentor less-experienced engineers
Benefits
- Competitive compensation package
- Annual bonus or long-term incentive opportunities
- Benefits package