Design, build, and support trusted analytical data products for various data initiatives using open Lakehouse technologies. Own end-to-end batch/stream pipelines, logical and physical data models, and production-grade curation of reference and application data.
Requirements
- Strong data engineering programming skills in Python, Java, and SQL
- Strong programming skills in Spark and SQL, with hands-on knowledge of optimization and debugging
- Hands-on experience with Snowflake and Databricks as data platforms
- Working knowledge of open table formats such as Apache Iceberg, catalogs (any of Polaris, Horizon, Unity), or metadata frameworks
- Experience building production-grade services in cloud environments; AWS, GCP, and/or Azure preferred
- Experience with structured and unstructured ingestion, schema evolution, and data modeling
- Strong debugging and performance-tuning skills for data pipelines
- Experience designing curated analytical/semantic data models for consumption (BI/metrics layers and/or serving models), including governance, documentation, and change management
- Working knowledge of building data products for consumption via APIs and BI endpoints, including interface contracts, performance considerations, and access controls
Benefits
- Inclusive development opportunities
- Flexible work-life support
- Paid volunteer days
- Vibrant employee networks