We are looking for a DevOps - SRE/Observability Engineer to instrument, maintain, and optimize the monitoring platforms of one of our banking clients. The engineer will work closely with development, SRE, and DevSecOps teams to ensure coverage of critical services and enable data-driven decisions.
Requirements
- Experience in observability instrumentation with tools like Dynatrace, Grafana, Zabbix, or similar
- Experience with metrics, logs, and traces for monitoring critical services (familiarity with SLOs/SLAs is desirable)
- Experience designing pipelines for log and metric ingestion (Fluent Bit, Beats, Kafka, or similar)
- Scripting and automation experience with Python, Bash, Terraform, and Ansible
- Solid knowledge of Kubernetes/EKS, cloud services (AWS, Azure, or GCP), and databases
- Knowledge of infrastructure and networking
- Experience in advanced troubleshooting and root cause analysis (RCA) using AIOps practices and distributed tracing
- Experience with agile methodologies (Scrum/Kanban) and tools such as Jira and Confluence