AI & Machine Learning Site Reliability Engineer

Oomnitza provides an Enterprise Technology Management platform that orchestrates and automates key business processes for IT through SaaS solutions.
Galway, Ireland
Site Reliability
Senior Software Engineer
Remote
5+ years of experience
AI · Enterprise SaaS

Description For AI & Machine Learning Site Reliability Engineer

Oomnitza, a leading Enterprise Technology Management platform provider, is seeking an AI & ML Site Reliability Engineer to drive innovation in its AI and data product management. This role combines site reliability engineering with AI/ML expertise, focusing on building and maintaining infrastructure for machine learning operations. The position offers an opportunity to work with technologies including vector databases, knowledge graphs, and large language models. You'll be responsible for architecting scalable AI systems, implementing retrieval-augmented generation (RAG) solutions, and ensuring robust ML model deployment pipelines. The role requires strong technical expertise in cloud platforms, containerization, and ML frameworks, combined with the ability to collaborate across teams. Working in a venture-backed company with a progressive culture, you'll have the chance to shape the future of enterprise technology management while enjoying comprehensive benefits and flexible work arrangements. The position offers significant growth potential: you'll work directly with the founders and help scale a fast-growing business backed by notable investors.

Last updated 3 months ago

Responsibilities For AI & Machine Learning Site Reliability Engineer

  • Build and maintain a big data analytics platform
  • Design and build scalable AI infrastructure
  • Implement and manage vector databases and knowledge graphs
  • Develop retrieval-augmented generation systems
  • Train and optimize large language models
  • Deploy, manage, and monitor ML models in production
  • Implement CI/CD processes for machine learning
  • Develop and manage AI agents for task automation
  • Ensure model performance monitoring and governance
  • Collaborate with data scientists and cross-functional teams

Requirements For AI & Machine Learning Site Reliability Engineer

Python
Kubernetes
  • Bachelor's degree in Computer Science, Engineering, Data Science, or related field
  • 5+ years of experience in site reliability engineering, DevOps, or MLOps
  • Experience with cloud platforms (AWS, GCP, Azure)
  • Proficient in deploying machine learning models
  • Experience with data processing tools
  • Strong understanding of vector databases and knowledge graph tools
  • Experience with containerization and orchestration technologies
  • Proficiency in Python and ML tools
  • Experience in on-call incident response
  • Excellent communication and collaboration skills

Benefits For AI & Machine Learning Site Reliability Engineer

Dental Insurance
Vision Insurance
Medical Insurance
Equity
  • Healthcare for dependents and spouse
  • Dental & Vision Insurance
  • Employee equity plan
  • Pension, life insurance, and income protection
  • Remote working & flexible work schedules
  • Working from home equipment allowance
  • Choice of preferred equipment (Mac or PC)
  • Regular social events and workshops
