We are looking for a Site Reliability Engineer with a strong focus on monitoring, observability, and alerting to ensure the reliability, performance, and scalability of our infrastructure and applications.
Requirements
- Proven experience as a Site Reliability Engineer or in a similar role
- Proficiency in logging, metrics, and tracing frameworks (Datadog, Loki, Prometheus, OpenTelemetry)
- Experience with cloud platforms (Azure preferred) and infrastructure-as-code tools (e.g., Terraform)
- Strong programming and scripting skills (Python, Bash)
- Proficiency in containerization technologies and orchestration tools (Docker, Kubernetes)
- Understanding of Linux-based systems, networking, and security principles related to containerized applications
- Strong problem-solving and troubleshooting skills, with a passion for identifying and resolving complex technical issues
- Excellent communication and collaboration abilities
- Ability to thrive in a fast-paced, constantly evolving environment
- Experience with PostgreSQL monitoring and optimization (nice to have)
Benefits
- Unlimited vacation
- Meal vouchers paid in full by the company
- Multisport card contribution
- Pension contributions
- Language courses
- Centrally located office in the heart of Brno
- Bi-weekly team lunches provided by the company
- Tech courses and conferences
- Top-of-the-line MacBook
- Company team-building events
- Flexible working hours and the possibility to work from home