Join Amgen's mission of serving patients by designing, developing, and optimizing data pipelines, data integration frameworks, and metadata-driven architectures in a collaborative, innovative, and science-based culture.
Requirements
- Design, develop, and maintain complex ETL/ELT data pipelines in Databricks using PySpark, Scala, and SQL to process large-scale datasets
- Understand the biotech/pharma or related domains and build highly efficient data pipelines to migrate and deploy complex data across systems
- Design and implement solutions to enable unified data access, governance, and interoperability across hybrid cloud environments
- Ingest and transform structured and unstructured data from databases, APIs, logs, event streams, images, PDFs, and third-party platforms
- Ensure data integrity, accuracy, and consistency through rigorous quality checks and monitoring
- Apply expertise in data quality, data validation, and verification frameworks
- Explore and implement new tools and technologies to make data processing more efficient
- Proactively identify and implement opportunities to automate tasks and develop reusable frameworks
- Work in an Agile and Scaled Agile (SAFe) environment, collaborating with cross-functional teams, product owners, and Scrum Masters to deliver incremental value
- Use JIRA, Confluence, and Agile DevOps tools to manage sprints, backlogs, and user stories
- Support continuous improvement, test automation, and DevOps practices in the data engineering lifecycle
- Collaborate and communicate effectively with product and other cross-functional teams to understand business requirements and translate them into technical solutions
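The data-quality responsibilities above can be sketched as a small reusable check framework. This is a minimal, illustrative example assuming plain Python row dicts rather than a real Spark DataFrame; the function names (`check_not_null`, `check_in_range`, `run_checks`) and the sample columns are hypothetical, not part of any Amgen system.

```python
# Minimal sketch of reusable, metadata-driven quality checks.
# Assumption: data arrives as a list of row dicts; in a real
# Databricks pipeline these checks would run against DataFrames.

def check_not_null(rows, column):
    """Return indices of rows where `column` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]

def check_in_range(rows, column, lo, hi):
    """Return indices of rows where `column` falls outside [lo, hi]."""
    return [i for i, row in enumerate(rows)
            if row.get(column) is not None and not (lo <= row[column] <= hi)]

def run_checks(rows, checks):
    """Run each named check and report failing row indices."""
    return {name: fn(rows) for name, fn in checks.items()}

rows = [
    {"sample_id": "S1", "purity": 98.2},
    {"sample_id": None, "purity": 97.5},   # missing ID
    {"sample_id": "S3", "purity": 120.0},  # out of range
]
report = run_checks(rows, {
    "sample_id_not_null": lambda r: check_not_null(r, "sample_id"),
    "purity_0_100": lambda r: check_in_range(r, "purity", 0, 100),
})
print(report)  # {'sample_id_not_null': [1], 'purity_0_100': [2]}
```

Registering checks by name keeps the framework reusable: new rules can be added per dataset without changing the pipeline code that runs and reports them.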
Benefits
- Competitive and comprehensive Total Rewards Plans
- Inclusive environment of diverse, ethical, committed, and highly accomplished people
- Collaborative culture
- Opportunities for professional and personal growth and well-being