Anthropic is a public benefit corporation headquartered in San Francisco, working on steerable, trustworthy AI.
We're looking for researchers and engineers to join Anthropic's Interpretability team, which works on mechanistic interpretability of neural networks with the goal of making them safe. Responsibilities include developing methods for understanding LLMs, designing and running robust experiments, and creating and analyzing new interpretability features and circuits.