The Senior Data Architect will own the Training Environment data architecture, including dataset design, schema, data selection and sampling strategy, data catalog, and annotation pipeline architecture. They will work closely with LLM/NLU/S2S/ASR/TTS/VB Tech Leads and Senior Engineers to align data architecture with model training requirements.
Requirements
- Own, define, and maintain the data architecture for Omilia's Training Environment end-to-end
- Define and govern data selection and sampling strategy
- Build and maintain the data catalog and dataset discovery infrastructure
- Define annotation requirements and annotation pipeline architecture
- Architect the closed-loop data flywheel
- Own and maintain data pipelines and infrastructure
- Design data quality frameworks
- Identify gaps in production training data
- Work directly with LLM, NLU, and Agentic systems teams
Benefits
- Fixed compensation
- Long-term employment with vacation counted in working days
- Opportunities for professional growth and development
- The chance to work on successful, cutting-edge technology products
- Skilled colleagues who are fun to work with
- Apple gear