Anthropic is seeking a Research Engineer to join its Alignment Science team, which focuses on creating safe and beneficial AI systems. The role involves conducting experimental research on AI safety, particularly for powerful future systems classified as ASL-3 or ASL-4 under Anthropic's Responsible Scaling Policy.
The position offers a unique opportunity to work on critical AI safety challenges, including AI Control and Alignment Evaluations. Key projects include testing the robustness of safety techniques, running multi-agent reinforcement learning experiments, and building tooling to evaluate the efficacy of LLM-generated jailbreaks.
The ideal candidate combines scientific and engineering mindsets, with experience in software development, ML, or research engineering. They should be collaborative, adaptable, and deeply concerned about AI's societal impact. Experience with LLMs, reinforcement learning, or AI safety research is highly valued.
Anthropic operates as a cohesive team focused on large-scale research efforts and views AI research as an empirical science. The company offers a collaborative environment with frequent research discussions, and its staff have previously contributed to work such as GPT-3, Circuit-Based Interpretability, and Learning from Human Preferences.
The role is based in London, with a requirement to spend at least 25% of working time in the office and occasional travel to San Francisco. Anthropic provides competitive compensation and benefits, equity donation matching, generous leave policies, and a modern office space. The company values diversity and encourages applications from candidates with varied backgrounds and perspectives.