Okta is looking for a Manager, Site Reliability Engineering to lead a team of SREs supporting the various workloads and teams behind our IDaaS platform. The role involves driving the microservice journey, DevOps maturity, and workload reliability in tandem with architects and teams across the organization.
Requirements
- 3+ years of experience in technical leadership and people management
- Extensive experience using Agile and DevOps methodologies to build product infrastructure and shared services at scale
- Experience running large-scale infrastructure platforms supporting a SaaS/cloud service in a public cloud, preferably AWS. Experience supporting a multi-cloud environment is a plus.
- Strong expertise in cloud-native architectures, containerization (Kubernetes), IaC (Terraform), and CI/CD pipelines
- Strong background and hands-on experience in software development, PaaS, and automation
- Deep experience building and operating observability platforms and monitoring tools (Grafana, Splunk, APM, etc.) in a large-scale environment
- Effective verbal and written communication and interpersonal skills
- Degree in Computer Science or a related field, or equivalent experience
- Ability to access federal environments and/or protected federal data
Benefits
- 401(k)
- Flexible spending account
- Paid leave (including PTO and parental leave)
- Health, dental and vision insurance