Senior Principal Software Engineer - GPU Cluster Performance and Benchmark Engineering

As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's problems. True innovation starts with diverse perspectives and various abilities and backgrounds. When everyone's voice is heard, we're inspired to go beyond what's been done before. It's why we're committed to expanding our inclusive workforce that promotes diverse insights and perspectives. We've partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity. Oracle careers open the door to global opportunities where work-life balance flourishes. We offer a highly competitive suite of employee benefits designed on the principles of parity and consistency. We put our people first with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
$150,200 - $251,600
Distributed Systems
Principal Software Engineer
Hybrid
5,000+ Employees
10+ years of experience
AI · Enterprise SaaS · Cloud

Description For Senior Principal Software Engineer - GPU Cluster Performance and Benchmark Engineering

We are seeking a highly skilled and experienced Large GPU Cluster Performance and Benchmark Engineer to join our advanced technology team as a Senior Principal. In this role, you will design, optimize, and benchmark large-scale GPU clusters, with a specific focus on running MLPerf benchmarks from MLCommons across thousands of NVIDIA and AMD GPUs. You will play a critical role in optimizing performance for both AI/ML and compute workloads, and in ensuring efficient storage solutions.

Why Join Us?

  • Be at the forefront of GPU performance benchmarking and large-scale infrastructure design.
  • Opportunity to work with a highly skilled team of engineers, architects, and thought leaders in the AI/ML and HPC space.
  • Competitive salary and benefits package, with opportunities for growth and development.

If you are a highly motivated and experienced professional with a passion for pushing the boundaries of GPU cluster performance, we encourage you to apply and join our dynamic team!

Career Level - IC5

Last updated 14 days ago

Benefits For Senior Principal Software Engineer - GPU Cluster Performance and Benchmark Engineering

  • Medical, dental, and vision insurance, including expert medical opinion
  • Short term disability and long term disability
  • Life insurance and AD&D
  • Supplemental life insurance (Employee/Spouse/Child)
  • Health care and dependent care Flexible Spending Accounts
  • Pre-tax commuter and parking benefits
  • 401(k) Savings and Investment Plan with company match
  • Paid time off: Flexible Vacation
  • 11 paid holidays
  • Paid sick leave
  • Paid parental leave
  • Adoption assistance
  • Employee Stock Purchase Plan
  • Financial planning and group legal
  • Voluntary benefits including auto, homeowner and pet insurance
