Anomali is seeking a Senior Engineer, AI Evaluation & Reliability to lead the design and execution of evaluation, quality assurance, and release gating for its agentic AI features. The role requires 5+ years of experience building evaluation or testing infrastructure for ML/LLM systems or large-scale distributed systems.
Requirements
- 5+ years building evaluation or testing infrastructure for ML/LLM systems or large-scale distributed systems
- Proven ability to translate product requirements into measurable metrics and test plans
- Strong Python skills (or similar language) and experience with modern data tooling
- Hands-on experience running A/B tests, canaries, or experiment frameworks
- Experience defining and maintaining operational reliability metrics (SLIs/SLOs) for AI-driven systems
- Familiarity with large-scale distributed or streaming systems serving AI/agent workflows (millions of events or alerts/day)
- Excellent communication skills -- able to clearly convey technical results and trade-offs to engineers, PMs, and analysts
Benefits
- Generous Paid Time Off
- 401(k) Matching
- Retirement Plan
- Compensation Transparency