TechInsights is a high-growth company seeking a Site Reliability Developer to design, implement, and maintain its cloud infrastructure. The ideal candidate has expertise in reliability engineering, cloud infrastructure, and software engineering.
Responsibilities
- Design, implement, and maintain highly available, scalable infrastructure systems across multi-region AWS deployments
- Develop and maintain service level objectives (SLOs) and service level indicators (SLIs)
- Monitor system performance, availability, and resource utilization
- Implement capacity planning strategies
- Create comprehensive infrastructure-as-code solutions using Terraform and GitOps methodologies
- Develop and maintain CI/CD pipelines
- Implement and maintain containerization platforms using Docker and Kubernetes
- Build automation tools and scripts
- Lead incident response for critical system outages and performance issues
- Implement comprehensive observability solutions
- Conduct blameless post-mortems and thorough post-incident reviews
- Develop and maintain disaster recovery procedures and business continuity plans
Benefits
- Company-sponsored training and development opportunities
- Comprehensive benefits package (health, dental, vision, wellness, RRSP matching, annual fitness reimbursement)
- Flexible vacation policy
- Bring your own device program
- Community involvement opportunities through charitable alliances
- Wellness resources and support
- Inclusive environment that prioritizes diversity, equity, and accessibility