As a Senior Site Reliability Engineer, Observability, you will play a key part in maturing our observability capabilities: standardizing instrumentation, improving telemetry quality, and enabling faster root cause analysis. You will support and evolve end-to-end observability solutions, administer and operate core observability platforms, and drive the adoption of consistent instrumentation practices across services.

Requirements

Bachelor’s degree in Computer Science or equivalent practical experience.
7+ years of experience in Observability, SRE, DevOps, or platform engineering roles supporting production systems.
Strong understanding of APM and SRE fundamentals, including MELT (Metrics, Events, Logs, Traces), latency analysis, error rate monitoring, service dependency mapping, SLIs/SLOs, alert tuning, and root cause analysis.
Hands-on experience administering at least one modern observability/APM platform (e.g., Splunk, New Relic, Grafana), with practical exposure to metrics, logs, distributed tracing, and platform configuration.
Experience building dashboards and actionable alerts, including configuring alert workflows and integrations with incident management tools such as PagerDuty.
Experience implementing or supporting OpenTelemetry-based instrumentation and improving telemetry quality across services, with a focus on reducing alert fatigue and improving signal-to-noise ratio.
Familiarity with Kubernetes and cloud-native environments, including an understanding of how applications are deployed, monitored, and scaled, and experience troubleshooting complex production issues in distributed environments.
Experience managing telemetry pipelines and agents (e.g., collectors, forwarders, sidecars), including onboarding services, troubleshooting ingestion issues, and optimizing pipelines for scale and efficiency.
Working knowledge of scripting or automation (e.g., Shell, Python) and CI/CD concepts.
Experience leading or contributing to incident investigations and postmortems, identifying observability gaps and driving continuous improvement.
Relevant certifications such as New Relic APM Professional, Reliability Engineer – Professional, Splunk Admin, or GCP Associate Cloud Engineer are a plus.

Benefits

Health & wellness coverage including medical, dental, vision, and mental health resources
Generous time off including PTO, holidays, a company-wide Priceline Pause reset week, and paid volunteer days
Work/life support including the ability to work up to 4 weeks per year from anywhere, parental leave, dependent care and family support resources, Summer Fridays, and office perks like stocked kitchens and catered meals (varies by location)
Financial security programs such as retirement plans with company contributions, life and disability coverage, and tax-advantaged accounts
Signature travel perks including employee-only discounts on hotels and flights, VIP deals, and Big Deal Bucks credits
Additional perks & discounts like travel and partner discounts, tuition support, legal support, and pet benefits
