Join Baseten, a startup revolutionizing AI deployment with cutting-edge inference infrastructure, as a Software Engineer focused on ML performance. In this role, you will implement and productionize the latest techniques for ML model inference and infrastructure.
Requirements
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or a related field.
- Experience with one or more general-purpose programming languages, such as Python or C++.
- Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous batching).
- Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
- Demonstrated interest in and experience with LLMs.
- Deep understanding of GPU architecture.
Benefits
- Competitive compensation package (unlimited PTO, 401(k), covered healthcare premiums).
- An inclusive and supportive work culture that fosters learning and growth.
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.