Pathway builds the first post-transformer frontier model that solves AI's fundamental memory problem. The company is seeking an AI Benchmark & Dataset Engineering intern to help define and run the benchmarking processes used for model evaluation.
Responsibilities
- Proactively identify, prioritize, and curate relevant public and client-driven benchmarks
- Evaluate candidate benchmarks for clarity, data quality, evaluation methodology, and fit with our model roadmap
- Run benchmarks with baseline models to validate setup, uncover edge cases, and de-risk R&D runs
- Hand off “benchmark-ready” packages to R&D (specs, data, evaluation scripts, expected metrics, constraints)
- Maintain a shared vocabulary and documentation around benchmarks, datasets, and evaluation formats that GTM and R&D can both use
- Track and organize benchmark results, model leaderboards, and “what good looks like” for different customers and scenarios
- Contribute to demos and public-facing proof points based on benchmark outcomes
Benefits
- Competitive salary
- Opportunity to work on a cutting-edge AI project
- Collaborative and intellectually stimulating work environment