AWS Neuron is seeking a talented Software Engineer to join its Machine Learning Applications team, focusing on the development and optimization of cloud-scale ML accelerators. This role sits at the intersection of cutting-edge AI technology and cloud computing, working with AWS Inferentia and Trainium accelerators. The position involves working with massive-scale language models such as Llama 2 and GPT, as well as Stable Diffusion and Vision Transformers.
The role offers an exciting opportunity to work with state-of-the-art ML infrastructure, where you'll be responsible for building and optimizing distributed inference solutions. You'll collaborate closely with compiler and runtime engineers, tuning large models for both latency and throughput using Python, PyTorch, and JAX.
The team operates in a startup-like environment while having the resources and scale of Amazon. You'll be part of a supportive team that values knowledge-sharing and mentorship, with opportunities to work on high-impact solutions that serve a large customer base. The position offers competitive compensation ranging from $129,300 to $223,600 based on location, plus additional benefits including equity and sign-on payments.
This is an ideal role for someone who combines strong software development skills with deep ML knowledge, offering the chance to work at the forefront of AI infrastructure development while building solutions that will shape the future of cloud-based machine learning applications.