Sieve is an AI research lab focused on video data. We're hiring a Reliability Engineer to design and validate the infrastructure that powers petabyte-scale workloads, build monitoring and alerting platforms, and improve cloud and data security.

Requirements

3+ years building internal infrastructure at scale
Experience on-call for Sev 0 / Sev 1 production incidents (L3 preferred)
Strong cloud experience (GCP, AWS, Oracle, Cloudflare, etc.)
Deep Infrastructure-as-Code experience (Terraform preferred)
Familiarity with Argo, Helm, Kustomize, or similar deployment tools
Experience operating observability systems (Prometheus, OTel, VictoriaMetrics)
Backend fundamentals in Python, Go, Rust, or C++
Strong networking and security intuition, including SSO implementation
High ownership mindset over critical systems

Benefits

401k
Full Health Insurance
Breakfast, lunch, and dinner covered, plus your choice of snacks
Uber rides home covered
