Director of Site Reliability Engineering role at Webster Financial Corporation, responsible for transforming reliability, performance, and availability across platforms, with a focus on AWS cloud architecture and the MuleSoft integration ecosystem.
Requirements
- Monitoring and Observability: Implement and maintain tools for monitoring, logging, and tracing to gain insights into system performance and health
- Automation: Write software and scripts to automate repetitive tasks, such as deployment, monitoring, and system management
- Incident Management: Respond to incidents, troubleshoot system-level issues, and perform root cause analysis to prevent recurrence
- Reliability Engineering: Design and build reliable and scalable systems, define Service Level Objectives (SLOs) and Service Level Indicators (SLIs), and implement reliability patterns
- Collaboration: Work closely with software developers to ensure applications are reliable and to provide feedback on performance in a production environment
- Documentation: Create and maintain documentation, including runbooks and system diagrams, to ensure knowledge sharing and team efficiency
- Agile and DevOps: Apply Agile methodology and DevOps practices
Benefits
- Incentive compensation
- 401(k) matching
- Generous paid time off
- Relocation assistance