Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Annapurna Labs designs silicon and software that accelerate innovation for AWS cloud solutions.
$151,300 - $261,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS

Description For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Annapurna Labs, now part of AWS, is seeking a Senior Machine Learning Engineer for its Distributed Training team, which works on AWS Neuron. This role focuses on developing and optimizing distributed training solutions for large-scale ML models including LLMs, Stable Diffusion, and Vision Transformers. The position involves working with AWS Trainium and Inferentia accelerators and requires expertise in distributed training libraries such as FSDP, DeepSpeed, and NeMo. The team offers a collaborative environment with opportunities for mentorship and knowledge-sharing. AWS provides comprehensive benefits, work-life harmony, and a strong commitment to diversity and inclusion. Compensation ranges from $151,300 to $261,500 depending on location, plus additional benefits including equity and sign-on payments. This is an opportunity to work on ML infrastructure for AWS custom silicon.

Last updated a day ago

Responsibilities For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

  • Lead efforts to build distributed training support into PyTorch and JAX using XLA
  • Optimize models to achieve peak performance on AWS custom silicon
  • Work with chip architects, compiler engineers and runtime engineers
  • Create, build and tune distributed training solutions with Trainium instances
  • Develop, enable and tune the performance of ML model families

Requirements For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Python
Java
  • Bachelor's degree in computer science or equivalent
  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience
  • 5+ years of leading design or architecture experience
  • 5+ years of full software development life cycle experience
  • Experience as a mentor, tech lead or leading an engineering team
  • Experience in machine learning, data mining, statistics or natural language processing

Benefits For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Medical Insurance
Equity
Mental Health Assistance
  • Medical, financial, and other benefits
  • Equity compensation
  • Sign-on payments
  • Mentorship and career growth opportunities
  • Work-life harmony
  • Inclusive team culture
