We are looking for a highly experienced Staff Site Reliability Engineer to help shape the future of reliability, scalability, and performance at AlphaSense. This is a hands-on, high-impact role where you will architect core reliability platforms, lead by example in incident response, and drive cultural adoption of SRE best practices across our global engineering organization.
Requirements
- 8+ years of experience in Site Reliability Engineering, DevOps, or a similar role, including at least 3 years in a Senior+ SRE position.
- Strong background in running production SaaS systems at scale.
- Proficiency in at least one programming/scripting language (Python, Go, or similar).
- Hands-on expertise with cloud platforms (AWS, GCP, or Azure) and Kubernetes.
- Deep understanding of networking fundamentals (TCP/IP, DNS, HTTP/S, load balancing).
- Experience with monitoring & alerting (Prometheus, Grafana, Datadog, ELK).
- Familiarity with advanced observability (OpenTelemetry, continuous profiling).
- Proven incident management experience, including leading high-severity incidents and postmortems.
- Strong troubleshooting skills across the full stack.
- Excellent communication and collaboration skills.
Benefits
- AlphaSense is an equal-opportunity employer. We are committed to a work environment that supports, inspires, and respects all individuals.