Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.
$129,300 - $223,600
Machine Learning
Mid-Level Software Engineer
In-Person
5,000+ Employees
3+ years of experience
AI · Enterprise SaaS · Cloud

Description For Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

AWS Utility Computing (UC) is seeking a talented Machine Learning Engineer to join its Distributed Training team for AWS Neuron. This role sits at the intersection of cutting-edge AI technology and cloud computing, working with AWS's custom silicon, Inferentia and Trainium. You'll be responsible for developing and optimizing distributed training solutions for large-scale ML models, including LLMs such as GPT and Llama, as well as Stable Diffusion and Vision Transformers.

The position offers a unique opportunity to work directly with chip architects, compiler engineers, and runtime engineers, creating solutions that push the boundaries of what's possible in machine learning. You'll be part of AWS's innovative culture, working on products that continue to set AWS's services apart in the industry.

The team culture emphasizes knowledge-sharing, mentorship, and inclusive practices. AWS values diverse experiences and backgrounds, offering various employee-led affinity groups and ongoing learning experiences. The company provides comprehensive benefits, emphasizes work-life harmony, and offers competitive compensation ranging from $129,300 to $223,600 based on location and experience.

This role requires strong software development skills combined with deep machine learning knowledge. You'll work with technologies like FSDP, DeepSpeed, PyTorch, and TensorFlow, while having the opportunity to contribute to Amazon's growing suite of generative AI services. The position offers excellent career growth opportunities, with senior team members providing one-on-one mentoring and thorough code reviews.

If you're passionate about machine learning, have strong software development skills, and want to work on technology that helps customers solve previously unimaginable challenges, this role offers an exciting opportunity to make a significant impact in the field of AI and cloud computing.

Responsibilities For Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

  • Lead efforts to build distributed training support into PyTorch and TensorFlow using XLA
  • Work with Neuron compiler and runtime stacks
  • Tune models for highest performance and maximize efficiency on AWS Trainium and Inferentia silicon
  • Develop and manage AWS Compute, Database, Storage, and Platform services
  • Work with massive-scale large language models (LLMs)
  • Support distributed training solutions with Trainium instances

Requirements For Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

  • Skills: Python, Java, TypeScript
  • Bachelor's degree in computer science or equivalent
  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship design or architecture experience
  • Experience programming with at least one software programming language
  • Experience in machine learning, data mining, information retrieval, statistics or natural language processing

Benefits For Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

  • Medical insurance, 401k, parental leave
  • Full range of medical benefits
  • Financial benefits
  • Work-life harmony
  • Career growth opportunities
  • Mentorship programs
