Senior Site Reliability Engineer position at iManage in Toronto, ON. The ideal candidate will have experience with cloud platforms, containerization, and observability. They will be responsible for creating a modern, cloud-native platform, eliminating toil through automation, and participating in on-call rotations. The company offers a competitive salary, an annual performance-based bonus, comprehensive benefits, and a flexible work environment.
Requirements
- Experience writing design documents and postmortems, and refactoring application code.
- Experience building automation to reduce operational burden or developing internal SaaS tools.
- Ability to advocate for SRE principles (e.g., SLOs vs SLAs) and introduce them effectively.
- Experience in public cloud or hosted datacenter environments (Azure and AKS preferred).
- A passion for collaborative teamwork and influencing reliability best practices across teams.
- Hands-on experience with Linux server stacks (Ubuntu/Debian preferred).
- Knowledge of cloud provisioning platforms (Terraform preferred).
- Exposure to configuration management tools (Chef preferred).
- Experience with containerization/clustering technologies (Docker preferred).
- Familiarity with observability and alerting tools (Prometheus/Grafana or ELK/EFK).
- Practical experience with CI/CD pipelines and rollout strategies.
- A bachelor’s degree (or equivalent experience) in Computer Engineering or a related field.
- Proficiency in one or more programming languages (e.g., Java, Python, Golang).
- Familiarity with scripting languages (e.g., PowerShell, Bash, Python, Ruby).
Benefits
- Market-competitive salary
- Annual performance-based bonus
- Comprehensive Health/Vision/Dental/Life Insurance
- Registered Retirement Savings Plan with a company match of up to 5%
- Enhanced leave for expecting parents
- Flexible time off policy
- Multiple company wellness days each year
- Access to RethinkCare, a global behavioral health platform