Vantage Data Centers is seeking a Mid-Level Data Engineer to help build, operate, and scale our enterprise data platform. This role is designed for an engineer who can operate independently, execute reliably in a fast-paced environment, and take ownership of data pipelines and datasets with minimal ramp-up.
Requirements
- Design, build, and maintain reliable, scalable data pipelines using Python and PySpark on the Microsoft Azure data platform.
- Develop and operate batch and incremental data pipelines leveraging Azure Data Factory for orchestration and Azure Data Lake Storage Gen2 as the primary data store.
- Independently implement SQL- and Spark-based transformations to produce curated datasets that support enterprise reporting, analytics, and downstream consumption.
- Take ownership of assigned data pipelines and datasets, including monitoring, troubleshooting, and performance optimization in production environments.
- Work with Azure Synapse Analytics (dedicated or serverless SQL pools, where applicable) to support analytical workloads and data consumption patterns.
- Collaborate with business analysts and cross-functional stakeholders to translate data requirements into practical, working data solutions.
- Prepare and structure data to support advanced analytics and AI-enabled use cases by ensuring data quality, consistency, and documentation.
- Apply established data governance, security, and engineering standards to ensure compliant, maintainable, and scalable solutions.
- Participate in code reviews, technical discussions, and platform improvement initiatives as an active contributor.
- Proactively identify data quality issues, pipeline risks, and improvement opportunities, and communicate them clearly in a fast-paced environment.
Benefits
- Medical, dental, and vision coverage
- Life and AD&D insurance
- Short- and long-term disability coverage
- Paid time off
- Employee assistance program
- Participation in a 401(k) program that includes a company match