Meta is seeking Research Engineers to join the Post-Training team within Meta Superintelligence Labs, building data pipelines for post-training AI models. Responsibilities include designing, building, and scaling full-stack data collection pipelines, collaborating with external data vendors, and executing on the technical vision of research scientists.
Responsibilities
- Design, build, and scale full-stack data collection pipelines for post-training (SFT, RLHF) across text, vision, and action modalities
- Develop and implement environments to capture complex agentic trajectories, including computer-use agents, deep research workflows, UI generation, and shopping agents
- Collaborate with external data vendors and domain experts to source, securely ingest, and prepare high-quality datasets in fields like STEM, finance, legal, and health
- Execute on the technical vision of research scientists to generate and filter high-quality synthetic data at scale
- Build robust, reusable data processing pipelines that scale across multiple model lines and product areas
- Contribute to tooling that measures and ensures the quality, diversity, and safety of post-training datasets
Benefits