AWS AI is seeking a Software Development Engineer II to join its SageMaker Training team and build next-generation AI compute platforms optimized for LLMs and distributed training. The role sits within AWS Utility Computing (UC), which provides foundational services such as S3 and EC2 along with continuous product innovations. The work centers on technology designed for large-scale deep learning model training, handling GPT models with more than 100 billion parameters across thousands of GPU devices.
In this role you will collaborate with ML scientists and customers to shape cloud-based AI training solutions, and you will design and develop distributed machine learning systems that serve AWS's worldwide customer base. The position requires strong software development skills, particularly in multi-threaded asynchronous programming, as well as experience with resource orchestrators or high-performance computing.
AWS values diverse experiences and maintains an inclusive culture through employee-led affinity groups and ongoing learning opportunities. The company supports career growth through mentorship and knowledge-sharing programs, and it emphasizes work-life harmony, with flexibility built into the working culture. The role offers the chance to have a significant impact on AWS and its global customer base while contributing to innovative solutions in cloud computing.