We're looking for a Research Scientist to lead work on post-training data curation for foundation models. You'll design and implement algorithms to generate and improve instruction, preference, and other post-training datasets.

Requirements

3+ years of deep learning research experience
Experience with post-training large vision, language, and multimodal models
Post-training algorithm development, data curation, and/or synthetic data methods for: preference-based tuning (e.g. DPO, RLVR, RRHF); alternative supervision and self-supervision techniques such as self-training and chain-of-thought distillation; SFT (e.g. instruction tuning and demonstration fine-tuning); post-training tooling development and engineering experience
Strong understanding of the fundamentals of deep learning
Sufficient software engineering and deep learning framework skills (PyTorch, or a willingness to learn PyTorch) to conduct large-scale research experiments and build production prototypes.
Demonstrated track record of success in deep learning research, whether papers, tools, or other research artifacts.

Benefits

100% covered health benefits (medical, vision, and dental).
401(k) plan with a generous 4% company match.
Unlimited paid time off (PTO) policy.
Annual $2,000 wellness stipend.
Annual $1,000 learning and development stipend.
Daily lunches and snacks are provided in our office!
Relocation assistance for employees moving to the Bay Area.

Research Scientist, Post-Training

Job Details

About DatologyAI