At Mistral AI, we are seeking an Evaluation Engineer to design and implement comprehensive evaluation frameworks to measure LLM capabilities across diverse customer use cases.
Responsibilities
- Design and implement comprehensive evaluation frameworks to measure LLM capabilities across diverse customer use cases
- Build scalable evaluation infrastructure and pipelines that enable rapid, reproducible assessment of model performance
- Develop novel evaluation methodologies to assess emerging capabilities or verticalized use cases
- Create custom evaluation suites tailored to enterprise customers' specific needs
- Collaborate with research teams to translate evaluation insights into model improvements and training decisions
- Partner with product teams to continuously improve our evaluation tooling based on customer feedback
Benefits
- PTO: The CDI contract will be a "Forfait 218 jours", corresponding to 25 days of paid holiday plus, on average, 8 to 10 RTT days, with complete autonomy over working hours
- Health: Full health insurance coverage for you and your family
- Transportation: We offer a €600 annual mobility allowance
- Food: Swile meal vouchers worth €10.83 per day worked
- Sport: Gymlib, with Mistral covering a significant part of the monthly fee
- Parental policy: 4 additional weeks of leave for parents, on top of what the French state provides