Genpact is seeking a Consultant - Data Engineer with experience working with AWS-native tools and modern data architectures. The role will lead data architecture design and implementation for client engagements, ensuring scalable, secure, and high-performing data solutions tailored to the data needs of life sciences commercial operations.
Responsibilities
- Design, develop, and maintain scalable, high-performance data pipelines and ETL workflows on AWS using services such as Glue, Lambda, Redshift, S3, Athena, EMR, and Step Functions.
- Implement robust data ingestion, transformation, and processing pipelines for both batch and real-time streaming use cases.
- Write and optimize complex SQL queries for data extraction, transformation, and loading across large datasets.
- Develop Apache Spark jobs for large-scale data processing, with a focus on performance tuning and reliability.
- Ensure data quality, consistency, and security by implementing data validation, governance, and monitoring mechanisms.
- Collaborate closely with data architects, analysts, and other engineers to translate business requirements into efficient data workflows and models.
- Participate in code reviews, contribute to technical discussions, and ensure adherence to best practices in data engineering and DevOps (CI/CD, version control, etc.).
- Support the modernization and migration of legacy data systems to cloud-native AWS data platforms.
- Proactively troubleshoot and resolve data pipeline failures, performance bottlenecks, and infrastructure issues.
- Document data pipelines, processes, and standards to ensure maintainability and knowledge sharing.
Benefits
- Competitive salary
- Opportunities for career growth and professional development
- Collaborative and dynamic work environment
- Access to cutting-edge technology and tools
- Recognition and rewards for outstanding performance