We are seeking a Lead Site Reliability Engineer (Infrastructure) to join our fast-moving VSaaS engineering organization. The role is responsible for the technical leadership and operational execution of the Infrastructure SRE team.

Requirements

10+ years of experience in site reliability engineering, infrastructure, or systems engineering
Strong hands-on experience designing and building automation and operational tooling using Golang and/or Python
Advanced expertise in cloud-native and IaaS architectures, distributed systems, and container orchestration in production environments
Deep understanding of SRE and DevOps principles, including incident management, SLA/SLO ownership, automation, and reliability engineering practices, with experience leading incident response, post-incident analysis, and preventive improvements
Strong experience with CI/CD pipelines, GitOps workflows, release tooling, and modern cloud-native infrastructure practices, ensuring reliable and traceable software and infrastructure changes
Hands-on experience operating Docker and Kubernetes environments, observability platforms (logging, monitoring, alerting), and SQL/NoSQL databases (e.g., Postgres, MongoDB, graph databases)

Benefits

Medical/dental benefits
FSA or HSA
401(k) with 6% Safe Harbor employer match
Paid parental leave
Generous PTO (20 days of vacation, 10 days of paid sick time, and 12 company holidays)
Fully paid short-term disability policy
Fully paid long-term disability policy
Life Insurance

Job Details

About Milestone