Senior Software Engineer, MLOps and Infrastructure

Training and deploying frontier models for developers and enterprises building AI systems for content generation, semantic search, RAG, and agents.
DevOps
Senior Software Engineer
Hybrid
5+ years of experience
AI

Description For Senior Software Engineer, MLOps and Infrastructure

Cohere is on a mission to scale intelligence to serve humanity, focusing on training and deploying frontier models for AI systems. As a Senior Software Engineer in MLOps and Infrastructure, you'll join a team responsible for building critical infrastructure that underpins all of Cohere's success. The role demands expertise in designing and managing large-scale distributed systems, particularly with Kubernetes and GPU workloads. You'll work with cutting-edge cloud technologies across multiple platforms (GCP, Azure, AWS, OCI) and build automation systems for deployment and monitoring.

The position requires participation in a 24x7 on-call rotation (with compensation) and targets candidates based in EMEA. You'll be instrumental in building self-service systems, custom Kubernetes operators, and ensuring high availability of mission-critical infrastructure. The role emphasizes both technical excellence and team collaboration, with opportunities to mentor others and influence the infrastructure roadmap.

Working at Cohere means joining a diverse team of world-class professionals passionate about advancing AI technology. The company offers comprehensive benefits including health coverage, parental leave, enrichment benefits, and flexible work arrangements. With offices in major tech hubs and a hybrid work model, you'll have the opportunity to shape the future of AI infrastructure while maintaining work-life balance.

If you're experienced in production infrastructure, passionate about automation and scalability, and want to contribute to cutting-edge AI development, this role offers an exciting opportunity to make a significant impact in the field of artificial intelligence.

Responsibilities For Senior Software Engineer, MLOps and Infrastructure

  • Build self-service systems that automate the management, deployment, and operation of services
  • Build custom Kubernetes operators that support language model deployments
  • Automate environment observability and resilience
  • Ensure defined SLOs are met, including participation in a 24x7 on-call rotation
  • Build strong relationships with internal developers and influence the Infrastructure team's roadmap
  • Develop the team through knowledge sharing and an active review process

Requirements For Senior Software Engineer, MLOps and Infrastructure

Go
Kubernetes
Linux
  • 5+ years of engineering experience running production infrastructure at large scale
  • Experience designing large, highly available distributed systems with Kubernetes and GPU workloads
  • Experience working with GCP, Azure, AWS and/or OCI
  • Experience designing, deploying, supporting, and troubleshooting complex Linux-based computing environments
  • Excellent collaboration and troubleshooting skills
  • The grit and adaptability to solve complex technical challenges

Benefits For Senior Software Engineer, MLOps and Infrastructure

  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits
  • Mental health budget
  • 100% Parental Leave top-up for 6 months
  • Personal enrichment benefits
  • Remote-flexible work
  • Co-working stipend
  • 6 weeks of vacation
