AWS Neuron is seeking a Senior Machine Learning Engineer to join its ML Applications team, which owns the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators. The role offers the opportunity to work at the forefront of machine learning infrastructure, specifically with massive-scale language models such as Llama 2, GPT-2, and GPT-3.
The position combines deep machine learning expertise with high-impact software development. You'll develop and optimize distributed inference solutions, working directly with compiler and runtime engineers. The role requires expertise in tuning large models for both latency and throughput using Python, PyTorch, or JAX, with a particular focus on DeepSpeed and other distributed inference libraries.
As part of Amazon's AWS team, you'll work in a startup-like environment while having the resources and impact of a global tech leader. The team culture emphasizes knowledge-sharing and mentorship, with senior members providing one-on-one mentoring and thorough code reviews. You'll collaborate with internal and external stakeholders, participate in critical design discussions, and directly influence business decisions through technical expertise.
Compensation is highly competitive, with a base salary ranging from $129,300 to $223,600 depending on location, plus additional benefits including equity and sign-on payments. The role offers the unique opportunity to work on cutting-edge ML infrastructure powering some of the most advanced AI models in production today, making it an ideal position for someone looking to make a significant impact in machine learning systems.