Data Engineer responsible for designing, building, and maintaining data systems, and for analyzing and interpreting data to provide actionable insights. Strong technical skills and experience with big data technologies, data architecture, and ETL processes are required.
Requirements
- Design, develop, and maintain data solutions for data generation, collection, and processing
- Create data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems
- Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions
- Take ownership of data pipeline projects from inception to deployment, managing scope, timelines, and risks
- Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs
- Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency
- Implement data security and privacy measures to protect sensitive data
- Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions
- Collaborate and communicate effectively with product teams
- Collaborate with Data Architects, Business SMEs, and Data Scientists to design and develop end-to-end data pipelines to meet fast-paced business needs across geographic regions
- Identify and resolve complex data-related challenges
- Adhere to best practices for coding, testing, and designing reusable code/components
- Explore new tools and technologies that will help to improve ETL platform performance
- Participate in sprint planning meetings and provide estimations on technical implementation
- Design and develop data pipelines leveraging Databricks, PySpark, and SQL to ingest, transform, and process large-scale datasets (see the first sketch after this list)
- Engineer solutions for both structured and unstructured data to enable advanced analytics and insights
- Implement automated workflows for data ingestion, transformation, and deployment using Databricks Jobs and notebooks, with ongoing monitoring and scheduling
- Apply performance optimization techniques, including Spark job tuning, caching, partitioning, and indexing, to improve scalability and efficiency (see the second sketch after this list)
- Build integrations with multiple data sources, such as SQL databases, APIs, and cloud storage platforms, ensuring seamless connectivity and reliability
- Collaborate effectively with global teams across time zones to maintain alignment, resolve issues, and deliver on shared objectives
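To illustrate the kind of pipeline work described above, here is a minimal PySpark sketch of an ingest-transform-load flow on Databricks. The storage paths, column names, and schema are hypothetical placeholders, not assets from an actual project.

```python
# Minimal PySpark sketch: ingest raw CSV, transform, and write partitioned Delta output.
# Paths, columns, and table names below are illustrative assumptions only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_pipeline_sketch").getOrCreate()

# Ingest: read raw data from cloud storage (placeholder path).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-bucket/raw/orders/")
)

# Transform: basic cleansing and enrichment.
orders = (
    raw.dropDuplicates(["order_id"])
    .filter(F.col("order_status").isNotNull())
    .withColumn("order_date", F.to_date("order_ts"))
    .withColumn("order_total", F.col("quantity") * F.col("unit_price"))
)

# Load: write partitioned output for downstream analytics
# (Delta format assumes a Databricks runtime).
(
    orders.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("s3://example-bucket/curated/orders/")
)
```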
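Likewise, a brief sketch of the optimization techniques mentioned above: caching a reused DataFrame, repartitioning on a join key, and broadcasting a small dimension table. The table and column names are assumed for illustration only.

```python
# Illustrative Spark tuning sketch: caching, repartitioning, and a broadcast join.
# All table and column names are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning_sketch").getOrCreate()

facts = spark.table("curated.orders")       # large fact table (assumed)
dims = spark.table("curated.customers")     # small dimension table (assumed)

# Cache a DataFrame that several downstream aggregations reuse.
facts_recent = facts.filter(F.col("order_date") >= "2024-01-01").cache()

# Repartition on the join key to reduce shuffle skew, then broadcast the small side.
joined = (
    facts_recent.repartition(200, "customer_id")
    .join(broadcast(dims), "customer_id", "left")
)

daily_revenue = joined.groupBy("order_date").agg(F.sum("order_total").alias("revenue"))
daily_revenue.write.format("delta").mode("overwrite").saveAsTable("analytics.daily_revenue")
```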
Benefits
- Generous Paid Time Off
- 401k Matching
- Retirement Plan
- Visa Sponsorship
- Four Day Work Week
- Generous Parental Leave
- Tuition Reimbursement
- Relocation Assistance