Principal TechOps Engineer – SRE: a role that helps enable and drive initiatives from design to implementation in a rapidly changing environment, with a focus on Kubernetes (AWS EKS) environments, a DevOps approach, and automation.
Requirements
- 5+ years of hands-on experience with AWS in a production environment
- Experience building and deploying Docker images
- Production experience running Kubernetes workloads, ideally on AWS EKS
- Experience managing and maintaining Kubernetes Clusters on AWS EKS
- Experience with Confluent or Kafka
- Experience creating and deploying Helm charts & libraries
- Hands-on experience with Jenkins Core, including authoring and maintaining declarative CI/CD pipelines and libraries
- Experience with monitoring tools, e.g., CloudWatch, Datadog, and Splunk Cloud
- Proficiency with UNIX operating systems and shell scripting
- Experience managing services and applications in a large cross-account AWS environment using IAM and federated SSO
- Experience crafting and maintaining logging, monitoring, and alerting capabilities using tools like Datadog and Splunk
- Ability to communicate at all levels, with a track record of strong written and verbal communication
- See problems as opportunities to automate
- Ability to work independently with minimal direction
- Drive and champion the overall design of highly available, secure, scalable microservices-based applications in AWS
- Track record of providing technical leadership to strong teams of Site Reliability Engineers / Cloud Engineers
- Experience with configuring and deploying resilient infrastructure in multiple regions and multiple availability zones
- Work cross-functionally with other organizations and collaborate with our risk, product, and engineering team leaders
- Lead the initiative to craft and deploy our applications to the cloud
- Promote a DevOps mentality, provide mentorship, and establish standard development methodologies for AWS infrastructure-as-code
- Champion automation tools to improve software delivery and reduce risk
- Production experience with infrastructure-as-code (IaC), Terraform preferred
- Programming experience, Python preferred
- Experience with distributed version control systems, Git preferred
- Experience with Apache Kafka or Confluent Kafka a plus
- Experience with the agile software development lifecycle and Kanban preferred
- Experience with CDN providers, Akamai preferred
Benefits
- Generous Paid Time Off
- 401k Matching
- Retirement Plan
- Visa Sponsorship