At PayPal, Senior Site Reliability Engineers (SREs) drive the reliability, performance, and availability of our global mobile and backend systems. As part of our new Mobile SRE team, you'll bridge the gap between iOS and Android clients and the backend services that power them, delivering seamless experiences for millions of customers.
Responsibilities
- Take ownership of system performance monitoring, identify inefficiencies, and lead initiatives to improve the overall availability and reliability of digital platforms and applications.
- Lead and manage the response to complex, high-priority incidents, ensuring prompt resolution and a thorough root cause analysis to prevent future occurrences.
- Design and implement advanced automation frameworks to improve operational efficiency, streamline processes, and reduce human error.
- Lead reliability-focused initiatives, ensuring systems are highly available, resilient, and scalable, and promote best practices across engineering teams.
- Enhance the monitoring infrastructure by identifying key metrics, optimizing alerting, and improving system observability to ensure the reliability of large-scale systems.
- Forecast resource requirements and lead capacity planning activities to ensure systems can scale effectively to meet growing user demand.
- Maintain robust disaster recovery strategies and conduct regular testing to verify that systems can recover quickly from failures.
- Partner with engineering and product teams to identify opportunities for improving system architecture, focusing on scalability, reliability, and fault tolerance.
- Provide mentorship and technical guidance to junior site reliability engineers, fostering skill development and knowledge sharing.
- Drive continuous improvement across operational workflows, identifying areas for optimization, cost reduction, and performance enhancement.
Benefits
- Generous paid time off
- Healthcare coverage for you and your family
- Financial security and mental health support