XPENG is seeking a high-caliber Staff Machine Learning Engineer to optimize and deploy VLA models onto vehicle-grade compute platforms for its global fleet. You will lead the effort to bridge the gap between massive research models and production-ready L4 autonomous driving systems.
Requirements
- 5-8 years of experience in model deployment, quantization, or high-performance computing (HPC)
- Mastery of Modern C++ and deep experience with CUDA or other hardware acceleration libraries
- Strong familiarity with PyTorch and deep knowledge of inference engines like TensorRT, ONNX Runtime, or TVM
- Hands-on experience with INT8/FP8/INT4 quantization and knowledge of the unique challenges in quantizing Large Language Models (LLMs) or Transformers
- Solid understanding of computer architecture (caches, memory bandwidth, SIMD) and experience with embedded/edge compute constraints
Benefits
- A fun, supportive and engaging environment
- Infrastructure and computational resources to support ML model development and research
- Opportunity to work on cutting-edge technologies with the top talent in the field
- Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving
- Competitive compensation package
- Snacks, lunches, dinners, and fun activities