ML Infrastructure Engineer

WitnessAI is a leader in providing innovative networking solutions designed to enhance security, performance, and reliability for businesses.
Machine Learning
Mid-Level Software Engineer
Hybrid
2+ years of experience
AI · Enterprise SaaS

Description For ML Infrastructure Engineer

WitnessAI, a leader in innovative networking solutions, is seeking an ML Infrastructure Engineer to drive its machine learning operations forward. The role combines ML infrastructure development with the practical implementation of scalable solutions.

The position focuses on optimizing and scaling ML models in production environments. You'll be responsible for managing GPU resources, building continuous learning pipelines, and implementing inference solutions using platforms like NVIDIA Triton and vLLM.

As an ML Infrastructure Engineer, you'll collaborate with cross-functional teams including applied scientists, software engineers, and DevOps professionals. You'll contribute directly to the company's mission by designing and maintaining scalable ML infrastructure components, optimizing workflows, and ensuring high performance of deployed models.

The ideal candidate brings 2+ years of experience in building and scaling ML systems, strong Python skills, and expertise in cloud platforms, particularly AWS. You'll work in a hybrid environment in the San Francisco Bay Area, with comprehensive benefits including health insurance, 401(k), and professional development opportunities.

This role is perfect for someone who combines technical expertise in ML infrastructure with strong problem-solving abilities and excellent communication skills. You'll be at the forefront of implementing and optimizing ML systems while contributing to a company that's pushing the boundaries of networking solutions.

Last updated 7 hours ago

Responsibilities For ML Infrastructure Engineer

  • Design and manage scalable GPU infrastructures for model training and inference
  • Build automated pipelines that accelerate ML workflows
  • Implement feedback loops for continuous learning
  • Evaluate and integrate inference platforms
  • Work closely with applied scientists, software engineers, and DevOps professionals
  • Document best practices to support team knowledge sharing
  • Optimize ML workflows for performance and resource utilization

Requirements For ML Infrastructure Engineer

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • 2+ years of experience building and scaling machine learning systems
  • Experience with inference platforms like NVIDIA Triton, vLLM, or similar
  • Expertise in model quantization, pruning, and optimization techniques
  • Proficiency with cloud platforms (AWS preferred, or GCP or Azure)
  • Strong skills in Python
  • Experience with CUDA packages
  • Experience with PyTorch, TensorFlow, or similar frameworks
  • Proficiency in Docker and Kubernetes
  • Experience with Jenkins, GitHub CI/CD, or similar tools
  • Experience with Prometheus, Grafana, or similar monitoring solutions

Benefits For ML Infrastructure Engineer

  • Hybrid work environment
  • Competitive salary
  • Health, dental, and vision insurance
  • 401(k) plan
  • Opportunities for professional development and growth
  • Generous vacation policy

