TextNow is looking for a motivated Site Reliability Engineer to own infrastructure, monitoring, logging, CI/CD, reliability, and everything in between. The role is about impact at scale, shaping how TextNow builds and operates its systems in an AI-first environment.

Requirements

Ensure System Reliability: Design, build, and maintain scalable, resilient, and highly available systems to support TextNow’s infrastructure and services.
Automation & Infrastructure as Code: Develop and maintain automation using Terraform, Ansible, and other tools to enable efficient deployment, scaling, and operations of cloud-based systems (AWS preferred).
Incident Response & On-Call Support: Participate in an on-call rotation, troubleshoot issues, and drive incident resolution to minimize downtime and improve system performance.
Performance Monitoring & Optimization: Implement and improve observability tools, logging, and monitoring solutions to identify and mitigate potential system issues proactively.
Collaboration & Cross-Team Engagement: Work closely with software engineers, DevOps, and product teams to align technical efforts with business objectives and improve system reliability from development to production.
Continuous Improvement: Identify areas for improvement in architecture, automation, and operational practices. Contribute to the design and implementation of new SRE best practices.

Benefits

Free phone service
Strong work-life blend
Flexible work arrangements (work-from-home, remote, or access to one of our office spaces)
Employee stock options
Unlimited vacation
12 paid holidays per year
Competitive pay
Health, dental, and vision benefits
Short-term & long-term disability
$750 annual wellness benefit or healthcare spending account
RRSP matching (Canada) | 401(k) (USA)
Parental leave for eligible employees
Learning & Development opportunities

Job Details

About TextNow