Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

World's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.
$129,300 - $223,600
Machine Learning
Mid-Level Software Engineer
In-Person
5,000+ Employees
2+ years of experience
AI · Enterprise SaaS · Cloud

Description For Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

AWS Utility Computing (UC) is at the forefront of cloud innovation, specifically within the Annapurna Labs division. This role focuses on the AWS Neuron software stack for AWS Inferentia and Trainium, our cloud-scale Machine Learning accelerators. As a machine learning engineer in the Distributed Training team, you'll be responsible for developing and optimizing various ML models, including Large Language Models like GPT and Llama, as well as Stable Diffusion and Vision Transformers.

The position requires expertise in distributed training libraries such as FSDP and DeepSpeed, and involves working directly with custom AWS silicon. You'll collaborate with a diverse team of chip architects and engineers to build and enhance distributed training solutions. The role combines deep machine learning knowledge with strong software development skills.

AWS values diverse experiences and maintains an inclusive culture that celebrates knowledge-sharing and mentorship. The team supports professional growth through one-on-one mentoring and constructive code reviews. AWS pioneered cloud computing and continues to innovate, serving customers from startups to Global 500 companies.

The company emphasizes work-life harmony and provides comprehensive benefits, including medical and financial benefits and equity compensation. Employee-led affinity groups and ongoing learning experiences, including Conversations on Race and Ethnicity (CORE) and AmazeCon conferences, foster an inclusive environment where differences are celebrated.

This is an opportunity to work on cutting-edge ML infrastructure at scale, with competitive compensation ranging from $129,300 to $223,600 based on location and experience. Join a team that's pushing the boundaries of what's possible in cloud computing and machine learning acceleration.

Last updated 2 days ago

Responsibilities For Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

  • Lead efforts building distributed training support into PyTorch and TensorFlow
  • Tune ML models for highest performance on AWS Trainium and Inferentia silicon
  • Work with chip architects, compiler engineers and runtime engineers
  • Develop and enable performance tuning of ML model families including LLMs
  • Create and optimize distributed training solutions with Trainium instances

Requirements For Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

Python
Java
TypeScript
  • Bachelor's degree in computer science or equivalent
  • 2+ years of computer science fundamentals experience
  • 2+ years of systems architecture and design experience
  • Experience programming with at least one software programming language
  • Experience in machine learning, data mining, statistics or natural language processing

Benefits For Software Engineer - AI/ML, AWS Neuron Distributed Training - Multimodal

Medical Insurance
Equity
Mental Health Assistance
  • Medical, financial, and other benefits
  • Equity compensation
  • Sign-on payments
  • Mentorship and career growth opportunities
  • Work-life harmony
  • Inclusive team culture
