Anthropic is seeking a Trust and Safety Software Engineer to join their mission of creating reliable, interpretable, and steerable AI systems. This role is crucial for ensuring the safety and beneficial impact of their AI technology.
The position focuses on building robust safety and oversight mechanisms for AI systems, with responsibilities including developing monitoring systems, implementing abuse detection infrastructure, and working closely with research teams to enhance model security. You'll be at the forefront of preventing misuse and ensuring user well-being while enforcing terms of service and acceptable use policies.
The ideal candidate has 3-8+ years of software engineering experience, particularly in integrity, spam, fraud, or abuse detection. Strong technical skills in Python, SQL, and data analysis are required, along with the communication skills to bridge technical and non-technical stakeholders.
Anthropic offers a collaborative environment where you'll work as part of a cohesive team focused on high-impact AI research. The company values an empirical approach to science and maintains strong connections to influential research, including work on GPT-3, circuit-based interpretability, and AI safety.
Based in San Francisco, Anthropic provides competitive compensation (£240,000 - £325,000 GBP), comprehensive benefits, and a flexible hybrid work arrangement requiring at least 25% office presence. The company offers visa sponsorship and maintains an inclusive culture that welcomes diverse perspectives, which it considers crucial for addressing the social and ethical implications of AI development.
This role is a distinctive opportunity to contribute to the development of safe, beneficial AI systems alongside leading researchers and engineers in the field, combining demanding technical work with the meaningful goal of ensuring AI safety and ethical deployment at scale.