AI & Machine Learning Site Reliability Engineer

Oomnitza provides an Enterprise Technology Management platform that orchestrates and automates key business processes for IT through SaaS solutions.
Galway, Ireland
Site Reliability
Senior Software Engineer
Remote
5+ years of experience
AI · Enterprise SaaS

Description For AI & Machine Learning Site Reliability Engineer

Oomnitza, a leading Enterprise Technology Management platform provider, is seeking an AI & ML Site Reliability Engineer to drive their AI and Data product management innovations. This role combines site reliability engineering with AI/ML expertise, focusing on building and maintaining infrastructure for machine learning operations. The position offers an opportunity to work with cutting-edge technologies including vector databases, knowledge graphs, and large language models. You'll be responsible for architecting scalable AI systems, implementing RAG solutions, and ensuring robust ML model deployment pipelines. The role requires strong technical expertise in cloud platforms, containerization, and ML frameworks, combined with the ability to collaborate across teams. Working in a venture-backed company with a progressive culture, you'll have the chance to shape the future of enterprise technology management while enjoying comprehensive benefits and flexible work arrangements. The position offers significant growth potential, working directly with founders and helping scale a fast-growing business backed by notable investors.

Last updated 3 months ago

Responsibilities For AI & Machine Learning Site Reliability Engineer

  • Build and maintain a big data analytics platform
  • Design and build scalable AI infrastructure
  • Implement and manage vector databases and knowledge graphs
  • Develop retrieval-augmented generation systems
  • Train and optimize large language models
  • Deploy, manage, and monitor ML models in production
  • Implement CI/CD processes for machine learning
  • Develop and manage AI agents for task automation
  • Ensure model performance monitoring and governance
  • Collaborate with data scientists and cross-functional teams

Requirements For AI & Machine Learning Site Reliability Engineer

Python
Kubernetes
  • Bachelor's degree in Computer Science, Engineering, Data Science, or related field
  • 5+ years of experience in site reliability engineering, DevOps, or MLOps
  • Experience with cloud platforms (AWS, GCP, Azure)
  • Proficiency in deploying machine learning models
  • Experience with data processing tools
  • Strong understanding of vector databases and knowledge graph tools
  • Experience with containerization and orchestration technologies
  • Proficiency in Python and ML tools
  • Experience in on-call incident response
  • Excellent communication and collaboration skills

Benefits For AI & Machine Learning Site Reliability Engineer

Dental Insurance
Vision Insurance
Medical Insurance
Equity
  • Healthcare for dependents and spouse
  • Dental & Vision Insurance
  • Employee equity plan
  • Pension, life insurance, and income protection
  • Remote working & flexible work schedules
  • Working from home equipment allowance
  • Choice of preferred equipment (Mac or PC)
  • Regular social events and workshops
