AWS Neuron is Amazon's complete software stack for its cloud-scale machine learning accelerators, AWS Inferentia and AWS Trainium. This Senior Software Engineering role is part of the Machine Learning Inference Applications team, focusing on developing and optimizing core LLM inference components.
The position involves working with cutting-edge LLM technology, including attention mechanisms, MLP layers, quantization, speculative decoding, and mixture of experts. You'll collaborate with chip architects, compiler engineers, and runtime engineers to maximize performance on Neuron devices for models such as Llama 3.3 70B, Llama 3.1 405B, DBRX, and Mixtral.
The team culture emphasizes knowledge-sharing and mentorship, with senior members providing one-on-one mentoring and thorough code reviews. Career growth is supported through strategic project assignments that build engineering expertise. The role offers competitive compensation ranging from $151,300 to $261,500 depending on location, plus equity and comprehensive benefits.
This is an excellent opportunity for experienced engineers passionate about machine learning optimization who want to work on large-scale, impactful projects. The position requires strong programming skills, a solid understanding of ML fundamentals, and the ability to collaborate across teams. Amazon's inclusive culture and commitment to diversity make it a strong environment for innovation and professional growth.