We're looking for a Senior Evaluation ML Engineer to design and own the end-to-end evaluation stack for our multimodal large language model (MLLM), which is trained on domain-specific, high-complexity medical data. The goal is to reach clinical-grade performance through comprehensive, large-scale, clinically grounded evaluation.
Requirements
- Excellent Python skills and strong software engineering fundamentals (testing, modular design, CI/CD)
- Deep experience designing and operating evaluation or data-quality pipelines for ML/LLMs at scale
- Comfortable with distributed compute (Ray, Spark), data lakehouse paradigms (Delta/Iceberg), and columnar formats (Parquet/ORC)
- Working knowledge of oncology workflows and terminology: staging (TNM), common biomarkers, lines of therapy, response criteria (e.g., RECIST), typical labs and imaging follow-up
Benefits
- Attractive and competitive salary
- Good pension plan
- 25 vacation days per year
- Great offsites and team events
- EUR 1000 learning and development budget
- Autonomy to do your work the way that works best for you
- Annual commuting subsidy