Cloaked is a privacy startup dedicated to rebuilding consumer trust in how personal data is used. We're looking for a Senior Site Reliability Engineer to take ownership of the critical infrastructure powering our privacy platform.
Responsibilities
- Define and maintain SLOs/SLAs that balance user experience with engineering velocity
- Implement comprehensive monitoring and alerting in Datadog to detect production issues
- Build resilient architectures that gracefully handle failures
- Establish error budgets and use them to make data-driven decisions about feature velocity vs. stability
- Lead incident response as primary on-call for infrastructure, taking critical load off leadership
- Conduct thorough, blameless post-mortems to prevent recurrence
- Build and maintain runbooks that enable faster resolution
- Serve as the first line of defense when production issues occur
- Identify and eliminate repetitive manual work through intelligent automation
- Build self-healing systems that reduce operational burden
- Improve deployment pipelines for faster, safer releases
- Own reliability for a platform running on AWS with Kubernetes, ArgoCD, Cloudflare, and Terraform
Benefits
- 401(k)
- Top-of-the-line health, dental, and vision benefits
- Flexible work arrangements
- Ability to work remotely as needed
- Home office stipend
- New company laptop
- Competitive PTO
- Monthly health stipend
- Late-night meals
- Professional growth opportunities
- Unlimited professional development fund