Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

World's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.
$151,300 - $261,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS · Cloud

Description For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

AWS Utility Computing (UC) is at the forefront of cloud innovation, developing cutting-edge solutions in the AWS ecosystem. This senior role within the Machine Learning Applications (ML Apps) team for AWS Neuron focuses on developing and optimizing machine learning solutions using AWS's custom silicon accelerators. The position involves working with state-of-the-art ML models, including GPT-series and Vision Transformers, while collaborating with chip architects and compiler engineers.

The role demands expertise in distributed training frameworks like FSDP and DeepSpeed, combined with strong software development skills. You'll be working on AWS Neuron, the complete software stack for AWS Inferentia and Trainium cloud-scale machine learning accelerators. This position offers exposure to Amazon's growing suite of generative AI services and cutting-edge cloud computing technologies.

AWS values diverse experiences and maintains an inclusive culture through employee-led affinity groups and ongoing learning experiences. The team emphasizes knowledge-sharing, mentorship, and career growth, with senior members providing one-on-one mentoring and thorough code reviews. Work-life harmony is prioritized, ensuring success at work doesn't come at the expense of personal life.

The compensation package is comprehensive, including competitive base pay, equity, sign-on payments, and various benefits. The role offers an opportunity to work with advanced ML technologies while contributing to AWS's mission of being Earth's Best Employer. Join a team that's pushing the boundaries of what's possible in cloud computing and machine learning acceleration.

Last updated 2 months ago

Responsibilities For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

  • Lead efforts building distributed training and inference support into PyTorch, TensorFlow, and JAX
  • Work with chip architects, compiler engineers and runtime engineers
  • Create, build and tune distributed training solutions with Trn1
  • Tune models for highest performance on AWS Trainium and Inferentia silicon
  • Support development of AWS Compute, Database, Storage, Platform services
  • Work with AWS Neuron software stack for ML accelerators

Requirements For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Python
Java
TypeScript
  • 5+ years of non-internship professional software development experience
  • 5+ years of programming with at least one software programming language
  • 5+ years of leading design or architecture of new and existing systems
  • 5+ years of full software development life cycle experience
  • Experience as a mentor, tech lead or leading an engineering team
  • Machine learning knowledge, including frameworks and end-to-end model training
  • Bachelor's degree in computer science or equivalent (preferred)

Benefits For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Medical Insurance
Equity
Mental Health Assistance
  • Medical, financial, and other benefits
  • Equity compensation
  • Sign-on payments
  • Mentorship and career growth opportunities
  • Work-life harmony
  • Inclusive team culture
