Scale Labs is seeking a Research Scientist to work on agent robustness, AI control protocols, and AI risk evaluations to help governments, industry, and the public understand and mitigate AI risk while maximizing AI adoption.
Responsibilities
- Research the science of AI agent capabilities, with a focus on how they relate to safety, risk factors, and methodologies for benchmarking them.
- Design and build harnesses to test AI agents’ tendency to take harmful actions when pressured to do so by users or tricked into doing so by elements of their environment.
- Characterize and design mitigations for potential failure modes or broader risks of systems involving multiple interacting AI agents.
Requirements
- Experience with post-training and RL techniques such as RLHF, DPO, GRPO, and similar approaches.
- A track record of published research in machine learning, particularly in generative AI.
- At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
- Strong written and verbal communication skills for working in a cross-functional team.
Benefits
- Comprehensive health, dental, and vision coverage
- Retirement benefits
- A learning and development stipend
- Generous paid time off