Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. The AI Inference Engineer will port AI models to the Quadric platform, optimize model deployment for efficient inference, and profile and benchmark model performance.

Requirements

Quantize, prune, and convert models for deployment
Port models to the Quadric platform using the Quadric toolchain
Optimize inference deployment for latency and throughput
Benchmark and profile model performance and accuracy
Collaborate across related areas of the AI inference stack to support team and business priorities
Develop tools to scale and speed up model deployment
Improve the SDK and runtime
Provide technical support and documentation to customers and the developer community

Benefits

Competitive salary and meaningful equity
Medical, dental, and vision plans starting on day one
401(k) retirement plan
Flexible paid time off (unlimited, non-accrual) to support work-life balance
