The Senior Cloud Platform Engineer will be responsible for ensuring the reliability, performance, and scalability of the company's AI Inferencing Service. The role includes participating in a shared on-call rotation to maintain 24/7 service reliability, developing and maintaining advanced monitoring and alerting systems, and implementing auto-scaling policies to handle variable inference loads cost-effectively.

Requirements

Bachelor's degree in Computer Science, Engineering, or a related field
5-8+ years of experience in a Site Reliability Engineer, DevOps, or related role supporting a large-scale, customer-facing service in a public cloud environment
Strong programming/scripting skills in languages like Python, Go, or Java
Proven experience with containerization and orchestration technologies (Docker, Kubernetes)
Deep understanding of monitoring and observability principles and tools (e.g., Prometheus, Grafana, ELK Stack, Datadog)
Solid experience with Infrastructure as Code (e.g., Terraform, CloudFormation)
Familiarity with CI/CD principles and tools (e.g., Jenkins, GitHub Actions, ArgoCD)
Excellent problem-solving skills and a systematic approach to troubleshooting complex distributed systems

Benefits

Competitive total rewards package, including base salary, equity, and benefits
95% premium coverage for employee medical insurance
77% premium coverage for dependents
Health Savings Account (HSA) with employer contribution
Dental, Vision, Short- and Long-Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans
Flexible Spending Account (FSA) options like Health Care, Limited Purpose, and Dependent Care
Well-being benefits, including a full subscription to Headspace, Gympass+ membership, One Medical membership, counseling services with an Employee Assistance Program, and more

Job Details

About SambaNova Systems