NVIDIA, known as "the AI computing company," is seeking a Senior Platform Software Engineer for its AI Server - GPU team. This role focuses on PCIe firmware and software for NVIDIA GPU servers, driving innovations in GPU-based AI server architecture.
Key responsibilities include:
- Optimizing I/O performance for various GPU applications
- Debugging complex system issues related to GPU, I/O bus, and CPU
- Architecting complex systems and improving resiliency of GPU-based systems
- Identifying new technologies to enhance performance, functionality, and uptime of GPU systems
- Working across the industry to enable new technologies for AI servers
- Contributing to all phases of product development
The ideal candidate should have:
- Deep understanding of Server Architecture, CPU design, PCI Express, and CXL
- Expertise in PCI Express Error Handling and Performance
- Familiarity with PCIe Switches and Retimers
- Strong knowledge of Memory architecture and RAS
- Experience with UEFI BIOS and Linux Kernel modification
- Excellent communication skills and a strong work ethic
- Bachelor's Degree in Electrical Engineering or Computer Science (or equivalent)
- At least 7 years of experience as an individual contributor
This role offers the opportunity to work at the forefront of AI and GPU technology, contributing to NVIDIA's cutting-edge products like the GH200 superchip. Join a team that's shaping the future of AI computing and server architecture.