We're seeking a Research-Hardware Codesign Engineer to operate at the boundary between model research and silicon/system architecture. You'll help shape the numerics, architecture, and technology bets of future OpenAI silicon in collaboration with both Research and Hardware.
Requirements
- Strong Python, plus C++ or Rust, with a rigorous attitude toward correctness and an intuition for clean extensibility.
- Experience writing Triton, CUDA, or similar kernels, and an understanding of how tensor ops map onto hardware functional units.
- Working knowledge of PyTorch or JAX; experience in large ML codebases is a plus.
- Practical understanding of floating point numerics, the ML tradeoffs of reduced precision, and the current state of the art in model quantization.
- Deep understanding of transformer models, with strong intuition for transformer rooflines and for the tradeoffs of sharded training and inference in large-scale ML systems.
- Experience writing RTL (especially for floating point logic) and understanding of PPA tradeoffs is a plus.
- Strong cross-functional communication (e.g. between ML researchers and hardware engineers); ability to break ambiguous early-incubation ideas into concrete workstreams where progress can be made.
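To give a concrete flavor of the reduced-precision numerics mentioned above (this is an illustrative sketch, not part of the role's actual codebase): naive fp16 accumulation silently loses low-order additions once the running sum grows, which is exactly the kind of hardware/numerics tradeoff this role reasons about.

```python
import numpy as np

# Sum 100,000 values near 1e-3. In fp16, once the running sum is large
# enough that each addend falls below half a ULP, the sum stops growing.
x = np.full(100_000, 1e-3, dtype=np.float16)

naive_fp16 = np.float16(0)
for v in x:                # sequential fp16 accumulation
    naive_fp16 += v        # stalls far below the true total

fp32_ref = x.astype(np.float32).sum()  # same data, fp32 accumulator

print(float(naive_fp16), float(fp32_ref))  # fp16 sum stalls; fp32 is ~100
```

This is why accumulator width (not just storage width) matters when mapping tensor ops to reduced-precision functional units.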
Benefits
- Relocation assistance available