Senior Software Development Engineer, Annapurna Labs, Trainium Collectives

AWS division specializing in semiconductor, systems, chips, and software development for EC2 infrastructure
$151,300 - $261,500
Distributed Systems
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS

Description For Senior Software Development Engineer, Annapurna Labs, Trainium Collectives

Annapurna Labs, an integral part of AWS, is seeking a Senior Software Development Engineer to join its team working on critical hardware and software components for EC2 infrastructure. The role focuses on developing networking solutions, specifically the Elastic Fabric Adapter, to improve performance for HPC and ML workloads across AWS's infrastructure.

The ideal candidate will bring extensive experience in HPC interconnects with an ML focus, demonstrating expertise in collective operations and RDMA networking. A deep understanding of computer architecture and Linux operating systems is essential, along with comfort in both server and embedded environments.

The team emphasizes work-life balance and fosters an inclusive culture with strong support for professional growth. AWS maintains ten employee-led affinity groups across 190 global chapters, demonstrating its commitment to diversity and inclusion. The position offers competitive compensation ranging from $151,300 to $261,500 based on location and experience, plus comprehensive benefits.

This role presents an exciting opportunity to work on cutting-edge technology that directly impacts AWS's customer experience, particularly in optimizing network-intensive workloads across thousands of CPUs and GPUs. The team values knowledge sharing and mentorship, making it an ideal environment for continued professional development while contributing to critical AWS infrastructure components.

Last updated 3 months ago

Responsibilities For Senior Software Development Engineer, Annapurna Labs, Trainium Collectives

  • Build networking solutions (Elastic Fabric Adapter)
  • Improve network speed and performance for HPC and ML workloads
  • Work on sophisticated embedded devices for AWS
  • Design and implement scalable networking solutions for thousands of CPUs and GPUs

Requirements For Senior Software Development Engineer, Annapurna Labs, Trainium Collectives

  • Linux
  • 5+ years of non-internship professional software development experience
  • 5+ years of programming with at least one software programming language
  • 5+ years of experience leading design or architecture
  • 5+ years of full software development life cycle experience
  • Experience as a mentor or tech lead, or leading an engineering team
  • Deep understanding of computer architecture and operating systems
  • Experience with HPC interconnects, preferably in the ML domain
  • Good understanding of collective operations and networking (RDMA networking preferred)

Benefits For Senior Software Development Engineer, Annapurna Labs, Trainium Collectives

  • Medical insurance
  • 401(k)
  • Work-life balance
  • Flexible working hours
  • Mentorship opportunities
  • Career growth opportunities
  • Inclusive team culture
  • Employee-led affinity groups
  • Comprehensive benefits package
