Deep Learning Architect, LLM Inference - New College Graduate 2024

NVIDIA is the world leader in accelerated computing, pioneering solutions for challenges no one else can solve.
$104,000 - $189,750
Machine Learning
Entry-Level Software Engineer
In-Person
AI

Description For Deep Learning Architect, LLM Inference - New College Graduate 2024

NVIDIA is at the forefront of the generative AI revolution. The Inference Benchmarking (IB) team focuses on advanced inference server performance for Large Language Models (LLMs). As a Deep Learning Architect for LLM Inference, you'll be responsible for characterizing the latest LLMs and inference servers, collaborating with performance marketing teams, working with AI startup engineers, profiling GPU kernel-level performance, developing analysis tools, contributing to deep learning software projects, verifying TRT-LLM performance, and guiding the direction of inference serving across the company.

Key responsibilities include:

  • Characterizing LLMs and inference servers like vLLM and DeepSpeed-MII
  • Creating content to highlight TRT-LLM achievements
  • Collaborating with AI startup engineers
  • Profiling GPU performance and identifying optimization opportunities
  • Developing profiling and analysis software tools
  • Contributing to projects like PyTorch, vLLM, and LLMPerf
  • Verifying TRT-LLM performance for new GPU product launches
  • Collaborating across teams to ensure world-class performance

Requirements:

  • Master's or PhD in Computer Science, Electrical Engineering, or related fields
  • Knowledge of deep learning inference serving, PyTorch, and compiler optimizations
  • Proficiency in C++ and Python, familiarity with CUDA
  • Experience with LLMs and their performance challenges
  • Understanding of CPU and GPU microarchitecture
  • Experience with complex software projects

Preferred qualifications:

  • Drive to improve software and hardware performance
  • History of developing workplace efficiency tools
  • Experience with database and visualization tools like D3.js

NVIDIA offers a competitive base salary range of $104,000 - $189,750 USD, along with equity and comprehensive benefits. Join a team of highly skilled professionals at one of the technology world's most desirable employers.

Last updated 3 months ago

Responsibilities For Deep Learning Architect, LLM Inference - New College Graduate 2024

  • Characterize latest LLMs and inference servers
  • Collaborate with performance marketing team
  • Work with AI startup engineers
  • Profile GPU kernel-level performance
  • Develop profiling and analysis software tools
  • Contribute to deep learning software projects
  • Verify TRT-LLM performance for new GPU product launches
  • Guide the direction of inference serving across the company

Requirements For Deep Learning Architect, LLM Inference - New College Graduate 2024

  • Master's or PhD in Computer Science, Electrical Engineering, or related fields
  • Knowledge of deep learning inference serving, PyTorch, and compiler optimizations
  • Proficiency in C++ and Python, familiarity with CUDA
  • Experience with LLMs and their performance challenges
  • Understanding of CPU and GPU microarchitecture
  • Experience with complex software projects like compilers, operating systems, or frameworks

Benefits For Deep Learning Architect, LLM Inference - New College Graduate 2024

  • Equity
  • Comprehensive benefits
