The RabbitMQ Platform Engineering Specialist is responsible for managing and optimizing platform maintenance activities and ensuring that changes are deployed correctly to production. The role requires a strong understanding of platform management principles and pub-sub architecture, experience managing deployments and monitoring, excellent problem-solving skills, and the ability to work collaboratively with cross-functional teams.
Requirements
- Design, deploy, configure, and manage VMware Tanzu RabbitMQ clusters across development and production environments.
- Administer RabbitMQ clusters, including nodes, virtual hosts (vhosts), exchanges, queues, bindings, policies, user permissions, Dead Letter Exchanges (DLX), TTL, message priority, and lazy queues.
- Implement and manage RabbitMQ cluster High Availability (HA) configurations across data centers, using classic mirrored queues and quorum queues.
- Configure and manage certificate-based TLS/mTLS for RabbitMQ client and inter-node communication.
- Design and implement cross-cluster message routing and disaster recovery using the Shovel and Federation plugins.
- Manage the lifecycle of RabbitMQ plugins, including Shovel, Federation, Consistent Hash Exchange, and RMQPS.
- Perform capacity planning for message throughput, queue depth, and memory and disk utilization.
- Tune the performance of RabbitMQ nodes, including memory watermarks, disk alarms, prefetch counts, and flow control settings.
- Manage RabbitMQ upgrades and patching with zero or minimal downtime using rolling upgrade strategies.
- Implement comprehensive monitoring and alerting using Prometheus, Grafana, or Tanzu Observability (Wavefront) for RabbitMQ metrics, including queue depth, consumer count, message publish/deliver rates, memory, and node health.
- Define and enforce SLOs/SLAs for RabbitMQ platform availability, latency, and throughput.
- Integrate RabbitMQ logs into centralized logging platforms such as Splunk and build operational dashboards and queries.
- Provide on-call support for platform-level incidents and escalations.
Benefits
- Full range of medical and dental benefit options
- Disability insurance
- Paid time off (inclusive of sick leave)
- Other paid and unpaid leave options