As a Senior Software Engineer - SRE at OneTrust, you will engage and partner with various Engineering, Operations, and Product teams to design, deliver, and maintain a highly available and performant application platform. Your mission is to eliminate toil by automating processes, tuning alerts, and improving code where it is most needed. You will frequently evaluate new ideas and trends to identify potentially useful tools and techniques, and collaborate with different functional groups to identify gaps, prioritize, and resolve issues.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related technical or business field
- 4+ years of application development experience with Java or an equivalent language
- Experience with the Spring environment
- Experience in cloud-based infrastructure (Azure, AWS, GCP, etc.)
- Experience with the factors that affect software application performance at different levels
- Knowledge of the importance of centralizing logging, metrics dashboards, and alerting
- Good awareness of databases (ideally both SQL and NoSQL)
- Hands-on experience with observability tools (Datadog, Prometheus, Grafana, etc.)
- Knowledge of CI/CD pipelines and infrastructure-as-code (Terraform, Helm, Jenkins, GitLab)
- Ability to build and operate AI-assisted incident response systems
- Ability to develop or integrate LLM-based tools to reduce MTTR and improve alert quality
- Ability to apply machine learning techniques for anomaly detection, capacity prediction, or failure pattern analysis
- Experience deploying AI systems in production
- Knowledge of vector databases, embeddings, or RAG architectures for operational intelligence
- Strong understanding of prompt engineering and of evaluating LLM outputs in reliability workflows
- Experience with Kubernetes and container orchestration (EKS/AKS/GKE)
- Experience with distributed systems at scale
- Familiarity with service meshes and microservices architectures
- Nice to Have: experience with chaos engineering tools (Gremlin, Chaos Monkey); background in product-facing services with high traffic scale; understanding of how to use incident management platforms
Benefits
- Comprehensive healthcare coverage
- Flexible PTO
- Equity RSUs
- Annual performance bonus opportunities
- Retirement account support
- 14+ weeks of paid parental leave
- Career development opportunities
- Company-paid privacy certification exam fees