We are seeking a Principal Engineer to lead the development of our high-scale retrieval systems and Retrieval-Augmented Generation (RAG) pipeline. This role is critical in bridging the gap between massive, licensed datasets and real-time generative inference. As the primary architect of the RAG pipeline, you will focus on processing, chunking, and indexing millions of documents to power both semantic and full-text discovery.
Responsibilities
- Design and implement the end-to-end RAG pipeline, ensuring high-performance ingestion and transformation of diverse datasets in near real time.
- Develop and optimize hybrid retrieval strategies that combine the precision of full-text search with the contextual depth of semantic (vector) search.
- Own the document-to-chunk lifecycle: implement advanced strategies for chunking, metadata enrichment, and quality filtering so that the most relevant context reaches the generative models.
- Architect systems that run processing jobs across millions of documents while keeping indices optimized for sub-second latency and high-throughput serving.
- Recommend and implement optimizations for GPU/CPU performance, concurrency, and memory management to minimize serving costs and maximize ROI.
- Act as a technical expert and influencer, guiding the team in software design and bringing strong diagnostic skills to complex distributed-system issues.
Benefits
- Opportunity to work at the forefront of AI technology
- Collaborative and innovative work environment
- Competitive salary and benefits package
- Professional development and growth opportunities
- Chance to make a significant impact on the company's success