Anthropic is seeking a Trust and Safety Software Engineer to join their mission of creating reliable, interpretable, and steerable AI systems. This role is crucial for ensuring the safety and beneficial impact of their AI technology.
The position focuses on building robust safety and oversight mechanisms for AI systems, with responsibilities including developing monitoring systems, implementing abuse detection infrastructure, and working closely with research teams to enhance model security. You'll be at the forefront of preventing misuse and ensuring user well-being while enforcing terms of service and acceptable use policies.
The ideal candidate has 3-8+ years of software engineering experience, particularly in integrity, spam, fraud, or abuse detection. Strong technical skills in Python, SQL, and data analysis are required, along with the communication skills to bridge technical and non-technical stakeholders.
Anthropic offers a collaborative environment where you'll work as part of a cohesive team focused on high-impact AI research. The company values an empirical approach to science and maintains strong connections to influential research, including work on GPT-3, circuit-based interpretability, and AI safety.
Based in San Francisco, Anthropic provides competitive compensation (£240,000 - £325,000 GBP), comprehensive benefits, and a flexible hybrid work arrangement requiring at least 25% office presence. The company offers visa sponsorship and maintains an inclusive culture that welcomes diverse perspectives, which it considers crucial for addressing the social and ethical implications of AI development.
This role is a distinctive opportunity to contribute to the development of safe, beneficial AI systems alongside leading researchers and engineers in the field, combining demanding technical work with the meaningful goal of ensuring AI safety and ethical deployment at scale.