
Deep Learning Architect, LLM Inference - New College Graduate 2024

NVIDIA is the world leader in accelerated computing, pioneering solutions for challenges no one else can solve.
$104,000 - $189,750
Machine Learning
Entry-Level Software Engineer
In-Person
AI
This job posting is no longer active. 😔

Job Description

NVIDIA is at the forefront of the generative AI revolution. The Inference Benchmarking (IB) team focuses on advanced inference server performance for Large Language Models (LLMs). As a Deep Learning Architect for LLM Inference, you'll characterize the latest LLMs and inference servers, profile GPU kernel-level performance, develop analysis tools, and verify TensorRT-LLM (TRT-LLM) performance. You'll also collaborate with performance marketing teams, work with AI startup engineers, contribute to deep learning software projects, and guide the direction of inference serving across the company.

Key responsibilities include:

  • Characterizing LLMs and inference servers like vLLM and DeepSpeed-MII
  • Creating content to highlight TRT-LLM achievements
  • Collaborating with AI startup engineers
  • Profiling GPU performance and identifying optimization opportunities
  • Developing profiling and analysis software tools
  • Contributing to projects like PyTorch, vLLM, and LLMPerf
  • Verifying TRT-LLM performance for new GPU product launches
  • Collaborating across teams to ensure world-class performance

Requirements:

  • Master's or PhD in Computer Science, Electrical Engineering, or related fields
  • Knowledge of deep learning inference serving, PyTorch, and compiler optimizations
  • Proficiency in C++ and Python, familiarity with CUDA
  • Experience with LLMs and their performance challenges
  • Understanding of CPU and GPU microarchitecture
  • Experience with complex software projects

Preferred qualifications:

  • Drive to improve software and hardware performance
  • History of developing workplace efficiency tools
  • Experience with database and visualization tools like D3.js

NVIDIA offers a competitive base salary range of $104,000 - $189,750 USD, along with equity and comprehensive benefits. Join a team of highly skilled professionals at one of the technology world's most desirable employers.

Last updated a year ago

Responsibilities For Deep Learning Architect, LLM Inference - New College Graduate 2024

  • Characterize latest LLMs and inference servers
  • Collaborate with performance marketing team
  • Work with AI startup engineers
  • Profile GPU kernel-level performance
  • Develop profiling and analysis software tools
  • Contribute to deep learning software projects
  • Verify TRT-LLM performance for new GPU product launches
  • Guide the direction of inference serving across the company

Requirements For Deep Learning Architect, LLM Inference - New College Graduate 2024

  • Master's or PhD in Computer Science, Electrical Engineering, or related fields
  • Knowledge of deep learning inference serving, PyTorch, and compiler optimizations
  • Proficiency in C++ and Python, familiarity with CUDA
  • Experience with LLMs and their performance challenges
  • Understanding of CPU and GPU microarchitecture
  • Experience with complex software projects like compilers, operating systems, or frameworks

Benefits For Deep Learning Architect, LLM Inference - New College Graduate 2024

  • Equity
  • Comprehensive benefits