
Job description
Join NVIDIA's DGX Cloud group as a Senior AI Infrastructure Engineer to design, build, and maintain large-scale production systems with high efficiency and availability. This role demands knowledge across different systems, networking, coding, databases, capacity management, and open-source cloud-enabling technologies.
Design, build, deploy, and run internal tooling for a large-scale AI training and inference platform. Conduct performance characterization and analysis on large multi-GPU and multi-node clusters. Engage in the whole lifecycle of services, from inception through deployment and refinement.
We value diversity, curiosity, problem-solving, and openness. Our team includes people with varied backgrounds and perspectives. We encourage collaboration, big thinking, and risk-taking without blame.
Company

Tech, Software & IT Services
NVIDIA, founded in 1993, is a leading full‑stack computing company that designs and manufactures GPUs and related technologies. Its products power a wide spectrum of applications—from high‑performance gaming and professional graphics to AI, deep learning, and autonomous vehicle systems—while its data‑center solutions enable large‑scale supercomputing and virtualization. NVIDIA’s pioneering GPU architecture has driven the growth of PC gaming, catalyzed the modern AI era, and continues to shape emerging fields such as the metaverse. The company’s integrated hardware‑software ecosystem delivers unprecedented performance and scalability, positioning NVIDIA as a key enabler of next‑generation computing across automotive, robotics, and enterprise sectors.