The Data Engineer will play a crucial role in preparing the data used to develop and fine-tune LLMs and machine learning models. The role is responsible for the entire data lifecycle, including gathering, cleaning, structuring, and optimizing large, diverse healthcare datasets.
Responsibilities
- Collaborate with data scientists and machine learning engineers to understand data requirements for LLM and machine learning model fine-tuning.
- Design, build, and maintain scalable data pipelines to ingest, process, and store massive and diverse healthcare datasets.
- Implement robust data validation and monitoring to ensure the integrity, accuracy, and consistency of all training datasets.
- Develop and optimize data structures and schemas for efficient access and utilization by LLMs and machine learning models.
- Monitor data pipeline performance, troubleshoot issues, and implement optimizations to improve efficiency and reliability.
Benefits
- Competitive salary and benefits package
- Flexible working arrangements (remote or hybrid options available)
- The opportunity to work on life-changing AI technology that directly impacts patient outcomes
- A team that combines cutting-edge innovation with a mission to save lives and improve health equity
- Continuous learning opportunities with access to the latest tools and advancements in AI and healthcare