We are seeking an experienced Machine Learning Operations (MLOps) Engineer to join our team and build the infrastructure that powers AI and machine learning at K1X.
Requirements
- Design and build scalable ML infrastructure to support model training, evaluation, and deployment.
- Develop and maintain containerized environments using Docker and Kubernetes.
- Build and manage distributed training pipelines and orchestration workflows.
- Implement and maintain ML lifecycle tooling such as MLflow for experiment tracking and reproducibility.
- Own production inference systems, including NVIDIA Triton Inference Server.
- Design and operate low-latency, high-availability model serving architectures.
- Implement CI/CD pipelines for ML deployment, versioning, and rollback strategies.
- Build and maintain data pipelines integrated with Snowflake and related data systems.
- Implement monitoring, logging, and alerting for model performance, drift detection, and system health.
- Partner with ML Engineers to improve developer experience and accelerate delivery.
Benefits
- Growing Startup Culture
- Unlimited Vacation Policy + Sick Time + Holidays
- Paid Parental Leave
- Fully Remote Opportunity
- Healthcare Benefits and 401K