Zyte is seeking an experienced Team Lead to manage our Core & MLOps Squad, responsible for building the bedrock infrastructure that powers Zyte at scale. This hands-on technical leadership role requires expertise across MLOps, systems programming, and orchestration to lead a cross-functional team in designing and maintaining the scalable foundation that enables all Zyte teams to build and run their services with confidence.
Requirements
- Design and evolve the core platform (Kubernetes, Mesos, GPU scheduling/autoscaling, distributed compute).
- Own the model platform: registry, experiment tracking, training orchestration, evaluation, serving, and monitoring.
- Build the Golden Path: reference repos, a scaffold CLI, opinionated CI/CD pipelines, runtime contracts (health/metrics/tracing/SLOs), high-performance clients, circuit breakers, and other production-ready defaults.
- Operate a secure, multi-tenant model registry and training platform with standardized experiment/evaluation harnesses.
- Provide turnkey serving patterns (online + batch), drift/quality monitoring, and rollback playbooks.
- Integrate public/open-source AI capabilities as managed platform services with cost and data-governance guardrails.
- Run the squad: roadmap/prioritization, delivery, mentoring, and high engineering standards.
- Partner with product engineering (Zyte API, Scrapy Cloud), Prod Ops, and Security on adoption and rollout plans.
- Mentor the team and foster a platform-thinking mindset.
- Container orchestration (Kubernetes/Knative), GPU provisioning & autoscaling, environment & secret management.
- Operators, sidecars, and internal SDKs/libraries (Go/Rust/Python/Java) that enforce the golden path contract.
- Model platform: registry, experiment tracking, training orchestration, evaluation framework, serving infra, model monitoring.
- Observability: logging/metrics/tracing pipelines.
- Billing pipeline: metering/events/cost tracking abstractions.
- Golden Path: Java, Python, ML templates + CI/CD blueprints + docs + scaffold CLI.
- Reliability enablement (SRE practices), cost governance, supply-chain security (SBOM, image signing).
Benefits
- We love fostering and nourishing new ideas and bringing them to market.
- Become part of a self-motivated, progressive, multi-cultural team.
- Have the freedom and flexibility to work from where you do your best work, as we are a completely remote company.
- Get the chance to work with cutting-edge open-source technologies and tools.