Truelogic is a leading provider of nearshore staff augmentation services headquartered in New York. We are seeking a Senior Reliability Engineer to operate and observe existing distributed systems running on AWS and Kubernetes, and to improve their reliability.
Requirements
- 5+ years of experience in Site Reliability Engineering, Platform Engineering, or Infrastructure roles
- Hands-on experience operating and supporting production systems
- Strong experience in observability operations, including defining metrics, logs, traces, dashboards, alerts, and reliability indicators for complex systems
- Fluency in Python and experience with Infrastructure-as-Code using AWS CDK, CDK8s, or equivalent frameworks
- Experience improving existing systems rather than building greenfield infrastructure
- Proven track record of using observability data to drive automation, scaling decisions, and operational improvements
Benefits
- 100% Remote Work
- Highly Competitive USD Pay
- Paid Time Off
- Work with Autonomy
- Work with Top American Companies