Software Development Engineer II, ML Acceleration

Amazon

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuously innovating.

Cupertino, CA, USA

$129,300 - $223,600

Machine Learning

Senior Software Engineer

In-Person

5,000+ Employees

3+ years of experience

AI · Enterprise SaaS · Cloud

This job posting may no longer be active. You may be interested in these related jobs instead:

Machine Learning Engineer, AWS Neuron Apps

Amazon

Senior ML Engineer role at AWS working on Neuron software stack for machine learning accelerators

Senior Solutions Architect - AI/ML, AWS Cloud Intelligence

Amazon

Senior Solutions Architect position at AWS Cloud Intelligence team, focusing on AI/ML solutions and customer advisory for Azure to AWS migrations.

Sr. Machine Learning Engineer, AGI Foundations

Amazon

Senior Machine Learning Engineer position at Amazon's AGI team focusing on developing industry-leading multimodal AI systems and large language models.

Software Development Engineer, Prime Video Sports

Amazon

Senior Software Engineer role at Amazon Prime Video Sports, focusing on ML/CV technology to enhance sports streaming experiences.

Machine Learning Engineer III, FAR (Frontier AI & Robotics)

Amazon

Senior ML Engineer role at Amazon Robotics, optimizing large-scale foundation models and working with world-class AI researchers to advance robotics technology.

Description For Software Development Engineer II, ML Acceleration

AWS Neuron is the complete software stack for the AWS Inferentia (Inf1/Inf2) and Trainium (Trn1), our cloud-scale Machine Learning accelerators. This role is for a machine learning engineer in the Inference team for AWS Neuron, responsible for development, enablement and performance tuning of a wide variety of ML model families, including massive-scale Large Language Models (LLM) such as GPT and Llama, as well as Stable Diffusion, Vision Transformers (ViT) and many more.

The ML Inference team works side by side with chip architects, compiler engineers and runtime engineers to create, build and optimize distributed inference solutions with Trainium/Inferentia instances. Experience with training and optimizing inference on these large models using Python/C++ is a must. Model parallelization, quantization, memory optimization - vLLM, DeepSpeed and other distributed inference libraries are central to this role, and extending all of them for the Neuron based system is key.

Key responsibilities include:

Leading efforts to build and achieve the best distributed training and inference performance of PyTorch, JAX, TensorFlow with XLA and other advanced frameworks on Neuron stacks.
Optimizing models to ensure the highest performance and maximize efficiency on custom AWS Trainium and Inferentia silicon and the Trn1, Inf1/2 servers.
Strong software development (Python and C++) and Machine Learning knowledge are critical to this role.

This position offers an opportunity to work on cutting-edge ML accelerator technology and contribute to the development of AWS's cloud-scale machine learning infrastructure.

Last updated 2 months ago

Responsibilities For Software Development Engineer II, ML Acceleration

Lead efforts to build and optimize distributed training and inference performance
Develop and tune ML models for AWS Inferentia and Trainium accelerators
Work with chip architects, compiler engineers, and runtime engineers
Optimize large-scale ML models, including LLMs, for performance and efficiency
Extend distributed inference libraries for Neuron-based systems

Requirements For Software Development Engineer II, ML Acceleration

Python

3+ years of non-internship professional software development experience
2+ years of non-internship design or architecture experience
Experience programming with at least one software programming language
3+ years of full software development life cycle experience
Bachelor's degree in computer science or equivalent
Strong software development skills in Python and C++
Experience with PyTorch, JAX, TensorFlow, and XLA
Knowledge of distributed training and inference optimization
Familiarity with model parallelization, quantization, and memory optimization techniques
Experience with vLLM, DeepSpeed, and other distributed inference libraries

Benefits For Software Development Engineer II, ML Acceleration

Medical Insurance

Medical Insurance
Financial Benefits
Other Benefits

Amazon

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuously innovating.

Cupertino, CA, USA

$129,300 - $223,600

Machine Learning

Senior Software Engineer

In-Person

5,000+ Employees

3+ years of experience

AI · Enterprise SaaS · Cloud

Interested in this job?