We are looking for a data engineering professional with experience in distributed environments to design and maintain scalable data pipelines, ensure data governance and quality, and collaborate with multidisciplinary teams.
Requirements
- Experience with data engineering in distributed environments
- Knowledge of Apache Spark/PySpark
- Experience with AWS (Glue, S3, EMR, Athena, Redshift, etc.)
- Knowledge of data modeling, ETL/ELT, and data pipelines
- Familiarity with relational and NoSQL databases
- Experience with batch and stream processing
- Experience with Apache Kafka or Amazon MSK