
Job description
The Hardware Automation team builds internal platforms and tooling for Nebius' data center infrastructure. The team operates as a product engineering team, owning the full stack from requirements gathering to rollout and ongoing reliability.
In this role, you will ensure fault tolerance, scalability, and uninterrupted operation of Nebius' services; implement and improve CI/CD processes; and troubleshoot complex system issues spanning hardware, software, and networking.
We expect strong analytical and problem-solving skills, with a focus on optimizing system performance. Proficiency in Linux systems, Python, and Bash scripting is required. Experience designing, developing, and running high-load distributed systems is a bonus.
Company
Tech, Software & IT Services
Nebius provides a comprehensive AI cloud platform designed for developers and organizations building and deploying generative AI applications. We offer a full-stack infrastructure – encompassing secure, high-performance computing and cost-optimized resources – enabling efficient machine learning model training and deployment. Nebius caters to a diverse clientele, including startups, enterprises, and research institutions, empowering them to accelerate AI innovation and deliver impactful scientific breakthroughs. Our platform simplifies the complexities of AI infrastructure, allowing teams to focus on core development and maximize the value of their AI investments.