LearnUpon is looking for a Senior Engineer, Site Reliability to join its team in Dublin. The role involves designing, implementing, and maintaining highly available, scalable infrastructure, as well as identifying opportunities to improve and scale that infrastructure for performance, observability, maintainability, and cost.
Requirements
- At least five years of production system administration/SRE experience
- At least three years of experience serving a large-scale SaaS web application solution on AWS or a similar cloud provider
- Strong experience implementing infrastructure as code (e.g. CloudFormation, Terraform), automation tooling (e.g. Puppet, Ansible), and CI/CD (e.g. Jenkins, Travis CI, GitLab)
- Experience implementing observability tech stacks using tools such as Grafana, Prometheus, Datadog, or New Relic
- Ability to analyse and optimise performance in high-traffic web applications
- Ability to solve complex, high-impact problems
- Experience building and supporting large-scale distributed systems that back a consumer app or website with associated requirements of performance, security and disaster recovery
- Ability to communicate technical ideas effectively to, and collaborate with, both technical and non-technical peers
- Experience deploying microservice environments using containerisation technologies such as Docker and Kubernetes is an advantage
- Certification in AWS, any PaaS, and/or related technologies is not required but considered a big plus
Benefits
- Competitive salary and company ESOP
- 25 days’ annual leave and 1 annual wellness day
- Private health insurance and company pension
- Parental benefits, including up to 26 weeks’ paid maternity leave, 4 weeks’ paid paternity leave, and coaching support for new parents
- Up to 4 weeks per year working abroad (role eligibility applies)
- Clear career progression opportunities — take LearnUpon where you think it can go
- A collaborative and supportive environment with regular team events