Sr Site Reliability Engineer, AI Platform Inference

Adobe changes the world through digital experiences, giving everyone from emerging artists to global brands the tools to design and deliver exceptional digital experiences.
$154,000 - $278,800
Site Reliability
Senior Software Engineer
Hybrid
5+ years of experience
AI

Description For Sr Site Reliability Engineer, AI Platform Inference

Adobe is seeking an outstanding Site Reliability Engineer for its AI Inference Platform, Adobe Firefly. As part of a team of Site Reliability Engineers, you'll work closely with Engineering teams to build, scale, and secure the AI Platform. This role enables Firefly product teams to easily manage and deploy Machine Learning capabilities used by Adobe client applications.

The platform will support thousands of models from Adobe Research and other App Teams, offering ML model serving at scale, with high cost efficiency, across multiple cloud platforms. You'll be responsible for ensuring high uptime and quality of service for Adobe's customers through operational excellence.

Key responsibilities include:

  • Identifying and implementing solutions to increase reliability, scalability, security, and efficiency
  • Defining and measuring service level objectives (SLOs) and indicators (SLIs)
  • Supporting and maintaining globally distributed, multi-cloud environments
  • Automating common tasks at scale to streamline operations
  • Improving service resiliency through techniques like chaos engineering and performance testing
  • Coordinating with other Adobe teams and service providers to innovate on Generative AI as a Service

The ideal candidate will have a strong background in distributed systems, containerization (especially Kubernetes), and cloud technologies. They should be proficient in programming (Python or Go preferred) and have experience with infrastructure management tools, observability solutions, and AI/ML frameworks.

This role offers the opportunity to work on cutting-edge AI technologies and shape the future of Adobe's AI platform. The compensation range for this position is $154,000 - $278,800 annually, depending on qualifications and location.

Last updated 15 days ago

Responsibilities For Sr Site Reliability Engineer, AI Platform Inference

  • Identify and implement solutions to increase reliability, scalability, security, and efficiency
  • Ensure high uptime and Quality of Service (QoS) for Adobe's customers
  • Define service level objectives (SLOs) and indicators (SLIs) to measure service quality
  • Support and maintain globally distributed, multi-cloud environments
  • Automate common, repeatable tasks at a large scale
  • Improve service resiliency through chaos engineering and performance testing
  • Coordinate with other Adobe teams and service providers to innovate on Generative AI as a Service

Requirements For Sr Site Reliability Engineer, AI Platform Inference

  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field, or equivalent industry experience
  • Experience in building and scaling distributed systems
  • Expertise with containerization and orchestration technologies, especially Kubernetes
  • Strong programming skills, preferably in Python or Go
  • Experience with infrastructure configuration management tools like Ansible and Terraform
  • Knowledge of observability and tracing tools such as InfluxDB, Prometheus, and the Elastic Stack
  • Understanding of AI/ML frameworks and cloud-based AI/ML solutions
  • Familiarity with modern continuous development techniques and pipelines (IaC, CI/CD, ArgoCD, Git)

Benefits For Sr Site Reliability Engineer, AI Platform Inference

  • Competitive salary range of $154,000 - $278,800 annually
  • Opportunity to work on cutting-edge AI technologies
  • Collaboration with Adobe Research and other App Teams
