ML Engineer Large-scale AI Infrastructure

A Silicon Valley startup combining Generative AI with biology and medicine, pioneering pan-modal Large Biological Models (LBM) for healthcare transformation.
Machine Learning
Mid-Level Software Engineer
In-Person
2+ years of experience
AI · Healthcare · Biotech

Description For ML Engineer Large-scale AI Infrastructure

GenBio is a pioneering startup at the intersection of Generative AI and biomedicine. Headquartered in Silicon Valley with a presence in Paris, we're revolutionizing healthcare through Large Biological Models (LBM). Our team of visionary scientists, engineers, and entrepreneurs is dedicated to decoding biology holistically and enabling next-generation, life-transforming solutions.

As our ML Engineer for Large-scale AI Infrastructure, you'll be at the forefront of building and maintaining the computational backbone that powers our breakthrough research. You'll work with cutting-edge GPU clusters, implement distributed training systems, and optimize performance for our large-scale AI models. This role combines expertise in machine learning infrastructure with high-performance computing, requiring both technical depth and collaborative skills.

The ideal candidate will bring strong experience in GPU cluster management, distributed systems, and deep learning frameworks. You'll work alongside leading minds in AI and biological science, contributing to a mission that could fundamentally transform healthcare and biological research. This is an opportunity to join an exceptionally strong R&D team that's leading the charge in LLM and generative AI applications in biomedicine.

We offer a unique environment where innovation meets impact, and your work will directly contribute to advancing the future of biology and medicine through AI. Join us in our mission to pioneer new paradigms in healthcare, working with state-of-the-art technology and alongside world-class experts in both AI and biological sciences.

Last updated 3 months ago

Responsibilities For ML Engineer Large-scale AI Infrastructure

  • Design, deploy, and maintain high-performance GPU clusters
  • Implement distributed computing techniques for parallel training
  • Tune GPU clusters and deep learning frameworks for optimal performance
  • Collaborate with data scientists and machine learning engineers
  • Ensure GPU clusters can scale effectively
  • Troubleshoot and resolve issues related to GPU clusters
  • Create and maintain documentation for GPU cluster configuration

Requirements For ML Engineer Large-scale AI Infrastructure

Python · Kubernetes
  • Master's or Ph.D. degree in computer science or related field with focus on High-Performance Computing, Distributed Systems, or Deep Learning
  • 2+ years of proven experience managing GPU clusters
  • Strong expertise in distributed deep learning and parallel training techniques
  • Proficiency in PyTorch, Megatron-LM, and DeepSpeed
  • Programming skills in Python and experience with GPU-accelerated libraries
  • Knowledge of performance profiling and optimization tools for HPC and deep learning
  • Familiarity with resource management and scheduling systems
  • Strong background in distributed systems, cloud computing, and containerization
