XPENG is looking for a full-time Machine Learning Engineer - AI Foundation to help build state-of-the-art ML infrastructure for training very large foundation models and accelerating model training and inference, with the goal of solving autonomous driving.
Requirements
- Optimize transformer-based LLMs for low-latency and high-throughput inference.
- Implement and benchmark model optimization techniques such as quantization, knowledge distillation, structured and unstructured pruning, and KV-cache optimization.
- Deploy optimized models across GPUs, CPUs, and edge accelerators.
- Contribute to internal tooling and documentation for model optimization flows.
- 5-8 years of industry experience.
- Good knowledge of PyTorch.
- Knowledge of the transformer architecture and of techniques for accelerating the training and inference of transformer models.
Benefits
- Competitive compensation package.
- Snacks, lunches, dinners, and fun activities.