We are seeking a Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our production systems through automation, observability, and operational excellence. As an SRE, you will work closely with our product development team to integrate observability, reliability, and security considerations into the entire software development lifecycle.

Requirements

Design, implement, and maintain scalable and reliable infrastructure.
Collaborate with engineering and product teams to integrate observability, reliability, and security considerations into the entire software development lifecycle.
Develop and implement automation tools for monitoring, deployment, and incident response to ensure efficient and reliable operations.
Lead and participate in post-incident reviews to learn from operational surprises and drive actionable improvements to system reliability.
Proactively identify and resolve performance bottlenecks and system issues.
Conduct regular security assessments and audits to mitigate risks.
Champion and embed a culture of reliability across the organization.
Implement and manage Infrastructure as Code (IaC) using Ansible and other industry-standard tools.
Implement and enforce cloud security best practices, including identity and access management (IAM), encryption, and network security.
Develop dashboards and alerts to ensure real-time visibility into system operations.
Stay up to date with emerging cloud technologies and recommend improvements to existing systems.

Benefits

Flexible working hours
Free snacks and beverages
Regular team events
Modern office environment
Mental health counselling
Home office setup budget
25 vacation days
Additional day off for birthday

Site Reliability Engineer (SRE) (m/f/d)

About Software Spinner Gmbh