NVIDIA is seeking a highly skilled Principal Software Engineer to join its team. The company is at the forefront of technological innovation, dedicated to driving efficiency, defining platform architecture, and optimizing the performance of its infrastructure both on premises and in the cloud. The ideal candidate will lead the architectural vision for a massive global platform and spearhead the operationalization of internal frontier-class AI inference systems.
Requirements
- Bachelor's degree in Engineering, Computer Science, Mathematics, or a related field, or equivalent experience.
- 15+ years of proven experience in compute platform engineering, site reliability, or systems architecture with a heavy focus on automation at massive scale.
- Deep expertise in Kubernetes architecture and in designing and deploying virtualization architectures, specifically operating VMs inside Kubernetes (KubeVirt, OpenShift).
- In-depth knowledge of hardware technologies (GPUs, high-speed backplane networking) with a track record of mitigating hardware-level failures, silent data corruption, and anomalies in large-scale environments.
- Experience running large global environments spanning bare metal, virtualized infrastructure, and cloud with a unified GitOps posture (ArgoCD or similar).
- Proficiency in programming languages such as Go and/or Python, alongside expert-level infrastructure-as-code development (Terraform, configuration management).
- Strong leadership skills with the ability to influence technical direction across highly autonomous teams without relying on top-down mandates.
Benefits
- Eligible for equity and benefits
- 401(k) matching