We are seeking a hands-on Observability Engineer with strong experience using Datadog and modern telemetry tools. This is not a general DevOps or platform engineering role; it is a tool-focused position responsible for implementing, operating, and continuously improving observability across applications, databases, and infrastructure within an established SRE framework.
Requirements
- 4+ years of experience in Observability, SRE, or Production Operations roles
- Strong, hands-on Datadog experience: APM, logs, DBM, dashboards, monitors, integrations
- Experience working with telemetry concepts: metrics, logs, traces, log correlation, distributed tracing
- Working knowledge of AWS environments (EC2, ECS, RDS, S3, DynamoDB, etc.)
- Ability to read and reason about application code (Java and/or Python) to support instrumentation, troubleshooting, and telemetry design
- Experience integrating monitoring tools with PagerDuty and ServiceNow
- Strong troubleshooting, documentation, and communication skills
- Datadog certifications (APM, Logs, Fundamentals)
- Exposure to Splunk, ELK, Dynatrace, or similar tools
- Experience with OpenTelemetry (instrumentation or collectors)
- Familiarity with CI/CD pipelines and containerized workloads
- Experience supporting mission-critical, high-availability systems
- Financial services, index, or data-platform experience
Benefits
- Health & Wellness
- Flexible Downtime
- Continuous Learning
- Invest in Your Future
- Family Friendly Perks
- Beyond the Basics