About the Role
In this role, you will be responsible for building a highly scalable inference framework for our future chip generations. You will participate in fundamental architectural decisions and have the opportunity to contribute to upstream open-source projects.
What you will do:
- Research, develop, and maintain the machine learning inference framework used for SEMRON architectures.
- Collaborate with machine learning, compiler, and hardware engineers/researchers to facilitate the quantization and optimization of neural networks for SEMRON’s hardware.
What you should bring:
- Proficiency with machine learning frameworks such as TensorFlow and PyTorch in Linux environments, including the ability to extend those frameworks with custom C/C++/CUDA code.
- A deep understanding of machine learning algorithms and modelling techniques, such as semi-supervised learning, weakly supervised learning, and transfer learning.
Helpful but not required:
- Experience with state-of-the-art neural network compression methods such as AdaRound, QDrop, QuIP, or GPTQ.
- Experience with tools commonly used in ML environments, such as Hugging Face’s Transformers or DeepSpeed.
Why us?