Principal Supercomputing Software Engineer

Microsoft is a global technology company that empowers every person and organization on the planet to achieve more.
United States
$137,600 - $267,000
Principal Software Engineer
Remote
5,000+ Employees
6+ years of experience
AI · Enterprise SaaS

Description For Principal Supercomputing Software Engineer

The Microsoft Azure AI/HPC team is seeking a Principal Supercomputing Software Engineer to help customers deploy, monitor, profile, and debug applications on hyperscale cloud infrastructure. This role is critical to building and maintaining Azure's largest supercomputing deployments, which have been recognized in the Top500, MLPerf, and Graph500 rankings.

As a Principal Supercomputing Software Engineer, you'll be responsible for developing state-of-the-art tools and techniques for managing cloud-native supercomputers at scale. You'll maintain system reliability, runtime performance, and health monitoring while meeting customer SLAs. The role involves establishing best practices, driving architectural changes, and influencing the roadmap of software and hardware components.

The position offers a competitive base salary range of $137,600 - $267,000 (higher in the SF Bay Area and NYC), along with comprehensive benefits including healthcare, educational resources, and investment options. This is a remote-friendly role with a 0-25% travel requirement.

The ideal candidate will bring deep expertise in HPC systems, cloud infrastructure, and software engineering. You'll be part of Microsoft's mission to empower every person and organization globally, working in a culture that values growth mindset, innovation, and collaboration.

This is an exceptional opportunity for a seasoned professional to impact the future of AI and HPC in the cloud, working with cutting-edge technology at unprecedented scale. Your work will directly influence a wide range of users and drive the next wave of innovation in cloud supercomputing.

Last updated 20 days ago

Responsibilities For Principal Supercomputing Software Engineer

  • Be part of a comprehensive systems management team focused on operational excellence and customer success
  • Analyze key system metrics and telemetry to proactively identify and debug HPC system issues
  • Build appropriate tooling, help develop processes, and ensure solutions are responsive to emerging user needs
  • Partner with customers, vendors, and other teams within Azure to drive comprehensive solutions
  • Ensure that the Azure platform is performant, scalable, and resilient
  • Foster a test-driven engineering culture to reduce regressions and bugs in production

Requirements For Principal Supercomputing Software Engineer

Python
Java
JavaScript
Linux
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • 5+ years of experience in operating AI/HPC systems, developing and running AI/HPC applications on clusters, or operating Cloud Infrastructure
  • 3+ years of specialized experience in at least one of the following: AI/HPC system management, high-speed networks, HPC storage, or managing cloud infrastructure
  • Must pass Microsoft Cloud Background Check upon hire/transfer and every two years thereafter

Benefits For Principal Supercomputing Software Engineer

Medical Insurance
Parental Leave
Education Budget
401k
  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
