Okta is seeking a Senior Site Reliability Engineer (SRE) with experience managing large-scale cloud production systems. The ideal candidate will have expertise in Kubernetes, Linux systems administration, and cloud infrastructure, as well as strong scripting skills and experience with Docker containers and web applications in high-traffic live environments.
Requirements
- Active TS/SCI clearance
- Deep familiarity with FedRAMP and DoD IL6 compliance standards
- B.S. in Computer Science or equivalent professional experience
- 5+ years of experience building and operating workloads orchestrated by Kubernetes
- Expert-level skill in debugging Helm charts and values
- Strong Linux systems administration background, with proficiency in at least one of Go, Python, Bash, or Ruby
- Expertise in AWS services (EC2, ECS, KMS, CloudWatch) and Infrastructure as Code (Terraform or CloudFormation)
- Experience managing Docker containers and web applications (Java/Apache/Tomcat) in high-traffic live environments
- Solid understanding of networking concepts and IP protocols; experience with multi-cloud environments is a significant plus
Benefits
- Health, dental, and vision insurance
- 401(k)
- Flexible spending account
- Paid leave (including PTO and parental leave)