HPC Infrastructure Engineer

AHEAD builds platforms for digital business, specializing in cloud infrastructure, automation, analytics, and software delivery solutions.
United States
$135,000 - $165,000
Cloud
Senior Software Engineer
Remote
1,000 - 5,000 Employees
5+ years of experience
Enterprise SaaS · AI

Description For HPC Infrastructure Engineer

AHEAD is seeking a Senior HPC Infrastructure Engineer to join its team in a remote role. The engineer maintains and optimizes high-performance computing environments for managed services customers. The position combines expertise in Kubernetes, HPC systems, and NVIDIA DGX infrastructure, and requires both technical depth and customer service skills.

The ideal candidate will bring 5+ years of expert-level experience in HPC environments, with strong capabilities in Kubernetes, Linux engineering, and infrastructure management. You'll be responsible for designing and managing high-performance computing clusters, ensuring optimal performance, and serving as a technical escalation point for complex issues.

AHEAD offers a comprehensive benefits package including medical, dental, and vision insurance, 401(k), paid time off, and parental leave. The company strongly values diversity and inclusion, creating an environment where all perspectives are valued and respected. It invests in employee growth through access to cutting-edge technologies in its multi-million-dollar lab and supports continued learning through certifications.

This role offers an excellent opportunity for experienced infrastructure engineers looking to work with cutting-edge HPC and cloud technologies while making a significant impact in a customer-facing role. The position combines technical challenges with the opportunity to work in a collaborative, growth-oriented environment that values both technical expertise and soft skills.

Last updated a month ago

Responsibilities For HPC Infrastructure Engineer

  • Provide enterprise-level operational support for incident, problem, and change management activities
  • Design, deploy, and manage Kubernetes clusters optimized for HPC workloads
  • Optimize cluster performance, resource utilization, and cost-effectiveness
  • Implement monitoring, logging, and alerting solutions
  • Ensure security of Kubernetes infrastructure and HPC workloads
  • Troubleshoot and resolve issues related to Kubernetes, DGX systems, and HPC applications
  • Create and maintain detailed documentation
  • Serve as subject matter expert for HPC technologies
  • Work with vendors to resolve infrastructure issues
  • Participate in on-call rotation

Requirements For HPC Infrastructure Engineer

  • Bachelor's degree or equivalent in Information Systems or related field
  • 5+ years of expert-level experience in high-performance computing environments
  • Strong understanding of Kubernetes architecture, components, and networking
  • Linux engineering experience with Red Hat, Ubuntu, and Rocky Linux distributions
  • Experience with containerization technologies (Docker, Singularity)
  • Experience with Infrastructure-as-Code tools (Terraform, Ansible)
  • Strong scripting skills (Bash, Python)
  • Experience with storage technology and distributed file systems
  • Strong problem-solving and communication skills
  • Managed Services or consulting experience

Benefits For HPC Infrastructure Engineer

  • Medical, Dental, and Vision Insurance
  • 401(k)
  • Paid company holidays
  • Paid time off
  • Paid parental and caregiver leave
