Advertima is building the missing infrastructure that brings the power of digital advertising to the physical world. As the Lead Site Reliability Engineer, you will architect the systems and shape the team that owns the resilience of our unique hybrid ecosystem.
Requirements
- Team Leadership: Hire, mentor, and scale the SRE team to support Advertima’s global expansion.
- Maintain Reliability: Define and own SLOs/SLIs for a global IoT fleet and supporting cloud infrastructure.
- Scale Operations: Drive the Infrastructure as Code (IaC) and observability roadmap required to manage thousands of remote devices with minimal manual intervention.
- Incident Management: Establish a professional 24/7 on-call culture, leading by example while refining incident response processes to reduce Mean Time to Recovery (MTTR).
- Edge Resilience: Direct the hardening of IoT devices and edge AI services against real-world failures, ensuring the team remains focused on high-impact engineering over daily 'toil.'
- Collaborative Engineering: Partner with stakeholders to implement monitoring best practices and advise on architectural standards.
Benefits
- Flexible Work Policies: A hybrid work environment with the flexibility to balance your personal and professional life.
- Employee Participation Plan: Share in the company’s success through our employee participation plan, which ties your success directly to ours.
- Self-Development Budget: Get your own budget for professional development, learning, and growth.