Anthropic is looking for a Cyber Harms Technical Policy Manager to lead the effort to prevent AI misuse in the cyber domain. The role involves leading a team of technical specialists, designing and overseeing capability evaluations, creating comprehensive cyber threat models, and developing policies to govern responsible use of AI models.
Responsibilities
- Lead and grow a team of technical specialists focused on cyber threat modeling and evaluation frameworks
- Design and oversee execution of capability evaluations to assess the cyber-relevant capabilities of new models
- Create comprehensive cyber threat models, including attack vectors, exploit chains, precursor identification, and weaponization techniques
- Develop and iterate on usage policies that govern responsible use of our models for emerging capabilities and use cases related to cyber harms
- Serve as the primary domain expert on cyber harms, advising cross-functional teams on threat landscapes and mitigation strategies
- Collaborate closely with internal and external threat modeling experts to develop training data for safety systems, and with ML engineers to train these systems, optimizing for both robustness against adversarial attacks and low false-positive rates for legitimate security researchers
- Analyze safety system performance on real traffic, identifying gaps and proposing improvements
- Conduct regular reviews of existing policies and enforcement systems to identify and address gaps and ambiguities related to cybersecurity risks
- Develop rigorous stress tests of safeguards against evolving cyber threats and across product surfaces
- Partner with Research, Product, Policy, the Security Team, and the Frontier Red Team to ensure cybersecurity safety is embedded throughout the model development lifecycle
- Translate cybersecurity domain knowledge into actionable safety requirements and clearly articulated policies
- Contribute to external communications, including model cards, blog posts, and policy documents related to cybersecurity safety
- Monitor emerging technologies and threat landscapes for their potential to create new risks or enable new mitigation strategies, and plan responses to them
- Mentor and develop team members, fostering a culture of technical excellence and responsible AI development
Benefits
- Competitive compensation
- Optional equity donation matching
- Generous vacation and parental leave
- Flexible working hours
- Lovely office space in which to collaborate with colleagues