Fortinet is seeking a Principal Site Reliability Engineer to lead the design, implementation, and optimization of its highly scalable, resilient, and efficient platform infrastructure. The ideal candidate will have 10+ years of DevOps/SRE experience and a proven track record of building sophisticated tools and workflows.
Responsibilities
- Architect and implement advanced automation strategies to maximize operational efficiency and minimize toil across the FortiCNAPP platform.
- Lead the design, development, and enhancement of infrastructure systems to ensure world-class scalability, resiliency, and performance.
- Proactively identify and resolve complex, systemic issues through innovative automation, tooling, and architectural solutions, preventing customer-facing incidents.
- Drive the evolution of monitoring, instrumentation, and observability systems to anticipate and mitigate scalability and reliability risks before they impact customers.
- Champion company-wide adoption of reliability best practices, establishing key metrics, SLAs, and milestones to embed scalability and resiliency into all engineering processes.
- Collaborate with cross-functional teams to define and implement industry-leading practices for infrastructure, deployment, and operational workflows.
- Provide technical leadership and mentorship to engineering and operations teams, fostering a culture of reliability, automation, and continuous improvement.
- Lead incident response and post-mortem processes, driving root cause analysis and implementing preventive measures.
- Participate in an on-call rotation, serving as an escalation point for complex issues and guiding the team through critical incidents.
- Influence strategic technology decisions, evaluating and integrating cutting-edge tools, services, and methodologies to enhance platform reliability.
Benefits
- Medical, dental, vision, life, and disability insurance
- 401(k)
- 11 paid holidays
- Vacation time
- Sick time
- Comprehensive leave program