We are seeking highly skilled software engineers to join NVIDIA and build AI inference systems that serve large-scale models with extreme efficiency. You'll architect and implement high-performance inference stacks, optimize GPU kernels and compilers, and scale workloads across multi-GPU, multi-node, and multi-cloud environments.
Requirements
- Bachelor's degree (or equivalent experience) in Computer Science (CS), Computer Engineering (CE), or Software Engineering (SE) with 7+ years of experience; alternatively, a Master's degree in CS/CE/SE with 5+ years of experience; or a PhD with a thesis and top-tier publications in ML systems, GPU architecture, or high-performance computing.
- Strong programming skills in Python and C/C++; experience with Go or Rust is a plus; solid CS fundamentals: algorithms and data structures, operating systems, computer architecture, parallel programming, distributed systems, and deep learning theory.
- Knowledge of and passion for performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM, SGLang).
- Familiarity with GPU programming and performance: CUDA, memory hierarchy, streams, NCCL; proficiency with profiling and debugging tools (e.g., Nsight Systems/Compute).
- Experience with containers and orchestration (Docker, Kubernetes, Slurm); familiarity with Linux namespaces and cgroups.
- Excellent debugging, problem-solving, and communication skills; ability to excel in a fast-paced, cross-functional setting.
Benefits
- Eligible for equity and benefits