System Development Engineer, Annapurna Labs Infrastructure

Annapurna Labs is an AWS organization building innovation in silicon and software for AWS customers, with development centers in the U.S. and Israel.
DevOps
Mid-Level Software Engineer
In-Person
5,000+ Employees
4+ years of experience
AI · Enterprise SaaS
This job posting may no longer be active.

Description For System Development Engineer, Annapurna Labs Infrastructure

Annapurna Labs, an AWS organization, is at the forefront of cloud computing innovation, focusing on silicon and software development. As part of the Cloud-Scale Machine Learning Acceleration Infrastructure team, you'll be instrumental in designing and supporting enterprise-scale infrastructure for AWS's cutting-edge hardware development.

The role combines deep technical expertise in Linux systems, networking, and infrastructure automation with a focus on supporting machine learning acceleration product development. You'll work with world-class engineers in a fast-paced, startup-like environment while having the resources and impact of AWS behind you.

Your responsibilities will span from designing networks and developing monitoring systems to troubleshooting complex infrastructure issues. You'll be working with state-of-the-art technology in machine learning acceleration, including ATE testers, emulators, and lab debug equipment. The position offers unique exposure to both cloud and on-premises infrastructure development.

The ideal candidate will be a self-starter with strong problem-solving skills, capable of working effectively in ambiguous situations. You'll need to balance technical excellence with customer obsession, always looking to understand and resolve customer pain points quickly and completely. This role offers the opportunity to work on projects that directly impact AWS's machine learning acceleration capabilities and contribute to the development of some of the most advanced ML accelerators in the world.

You'll be based in Austin, Texas, working with the team that develops custom silicon, owning the infrastructure that enables this innovation. The position offers excellent growth opportunities and the chance to work with cutting-edge technology while making a significant impact on AWS's machine learning infrastructure.

Last updated 7 months ago

Responsibilities For System Development Engineer, Annapurna Labs Infrastructure

  • Design and support enterprise-scale infrastructure
  • Lead across teams to develop and execute infrastructure plans
  • Solve critical infrastructure issues involving networking and high performance compute clusters
  • Implement process improvements for the team's agility and operations
  • Define mechanisms for system health monitoring, diagnostics, and automation
  • Develop and update operational runbooks
  • Participate in on-call rotations
  • Support silicon development workflows
  • Define building infrastructure requirements for labs and server rooms
  • Act as liaison to contractors and vendors for infrastructure

Requirements For System Development Engineer, Annapurna Labs Infrastructure

Python
Linux
  • 4+ years of experience with major internet routing protocols
  • 4+ years of experience working in a Linux/Unix environment
  • Experience with automation scripting using Python, Bash, and/or Perl

Benefits For System Development Engineer, Annapurna Labs Infrastructure

Medical Insurance
Vision Insurance
Dental Insurance
  • Equal opportunity employer
  • Workplace accommodations available
  • Inclusive culture
