As a Staff ML Performance Engineer, you’ll play a key role in high-impact projects, optimising ML inference for edge accelerators and GPUs. This team focuses on running large transformer-based models efficiently on low-cost, low-power edge devices to enable Wayve’s first driving product.
Requirements
- Profile and pinpoint bottlenecks across the full inference stack (model graph, compiler/runtime, kernel execution, memory movement) and deliver measurable improvements.
- Implement and validate optimisations in compilers, runtimes, and/or kernels (e.g. operator fusion, scheduling, quantisation-aware performance, custom kernels).
- Build robust benchmarking and regression testing to ensure performance improvements hold across models, devices, and software releases.
- Optimise for multiple targets (e.g. NVIDIA Orin/Thor, Qualcomm) and work with other teams to support them in a maintainable way.
- Collaborate with model developers to influence architecture and training/deployment decisions that affect on-device performance.
- Contribute to technical roadmaps and tooling, and help raise the standard of performance engineering across the team.
Benefits
- Generous Paid Time Off
- 401(k) Matching
- Retirement Plan
- Visa Sponsorship
- Four-Day Work Week
- Generous Parental Leave
- Tuition Reimbursement
- Relocation Assistance