Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

World's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.
$151,300 - $261,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS · Cloud

Description For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

AWS Utility Computing (UC) is seeking a Senior Machine Learning Engineer to join its Distributed Training team for AWS Neuron. This role is part of Annapurna Labs, AWS's infrastructure provider focusing on silicon and software innovation. The position involves working on AWS Inferentia and Trainium, cloud-scale Machine Learning accelerators, and developing solutions for massive-scale Large Language Models like GPT and Llama.

The role combines deep software engineering expertise with machine learning knowledge, requiring hands-on experience with distributed training libraries such as FSDP and DeepSpeed. You'll collaborate with chip architects and compiler engineers to optimize performance on custom AWS silicon. The position offers exposure to cutting-edge AI technologies and cloud computing innovations.

AWS provides a supportive environment emphasizing knowledge-sharing and mentorship, with opportunities for career growth and skill development. The company values diverse experiences and maintains an inclusive culture through various employee-led initiatives and affinity groups. Work-life harmony is prioritized, with flexibility built into the working culture.

The compensation package is comprehensive, including competitive base pay ranging from $151,300 to $261,500 depending on location, plus equity, sign-on payments, and extensive benefits. This is an opportunity to work at the intersection of cloud computing and machine learning, developing solutions that push the boundaries of what's possible in AI acceleration.

Last updated 3 hours ago

Responsibilities For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

  • Lead efforts building distributed training support into PyTorch and TensorFlow using XLA
  • Work with Neuron compiler and runtime stacks
  • Tune ML models for highest performance on AWS Trainium and Inferentia silicon
  • Develop and enable performance tuning of ML model families including LLMs
  • Work with chip architects, compiler engineers and runtime engineers
  • Create, build and tune distributed training solutions

Requirements For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Python
TypeScript
  • Bachelor's degree in computer science or equivalent
  • 5+ years of non-internship professional software development experience
  • 5+ years of programming with at least one software programming language
  • 5+ years of experience leading design or architecture
  • 5+ years of full software development life cycle experience
  • Experience as a mentor, tech lead or leading an engineering team
  • Experience in machine learning, data mining, statistics or natural language processing

Benefits For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Medical Insurance
401k
Mental Health Assistance
  • Medical benefits
  • Financial benefits
  • Flexible work arrangements
  • Career development opportunities
  • Mentorship programs
  • Inclusive workplace culture
