This role is eligible for our hybrid work model: two days in-office.
We are looking for an Observability Developer (SRE / O11y Platform) to support and evolve end-to-end observability solutions for collecting, shipping, storing, and querying OpenTelemetry signals (metrics, logs, and traces) across infrastructure, containers, and Kubernetes environments.
Requirements
- Bachelor’s degree in Computer Science or equivalent practical experience.
- 3+ years of experience in Observability, SRE, DevOps, or platform engineering roles supporting production systems.
- Strong understanding of APM and SRE fundamentals, including MELT (Metrics, Events, Logs, Traces), latency analysis, error rate monitoring, service dependency mapping, SLIs/SLOs, alert tuning, and root cause analysis.
- Hands-on experience administering at least one modern observability/APM platform (e.g., Splunk, New Relic, Grafana), with practical exposure to metrics, logs, distributed tracing, and platform configuration.
- Experience building dashboards and actionable alerts, including configuring alert workflows and integrations with incident management tools such as PagerDuty.
- Experience implementing or supporting OpenTelemetry-based instrumentation and improving telemetry quality across services.
- Familiarity with Kubernetes and cloud-native environments, including an understanding of how applications are deployed, monitored, and scaled.
- Experience managing telemetry pipelines and agents (e.g., collectors, forwarders, sidecars), including onboarding services and troubleshooting ingestion issues.
- Working knowledge of scripting or automation (e.g., Shell, Python) and CI/CD concepts.
- Familiarity with infrastructure-as-code tools such as Terraform for managing platform configurations and integrations is a plus.
Benefits
- Competitive base salary
- Annual bonus
- Equity grant