Data Engineer position responsible for designing, developing, and maintaining scalable production pipelines for market data, statistical sources, and edge AI models. The role requires 3+ years of data engineering experience; proficiency in Python, SQL, and cloud platforms; and hands-on expertise with WebSockets, streaming data, and real-time, event-driven architectures.
Requirements
- Minimum of 3 years of practical data engineering experience building scalable production pipelines
- Proficient in Python (including Pandas, asyncio, aiohttp, requests, and BeautifulSoup/Scrapy), with advanced SQL skills
- Experience with PostgreSQL, Redis/DragonflyDB, and cloud platforms such as AWS (S3, Lambda, RDS) or their GCP equivalents
- Hands-on expertise with WebSockets, streaming data, and real-time, event-driven architectures; a minimal consumer sketch follows this list
- Skilled in REST API integration, webhook configuration, and large-scale web scraping (handling rate limits, proxy rotation, and anti-bot measures); a retry/backoff sketch follows this list
- Familiarity with workflow orchestration tools (Airflow, Prefect, or Dagster) and CI/CD pipelines in Linux/Docker environments; a minimal DAG sketch follows this list
- Strong foundation in database design, Medallion/lakehouse architectures, and data modelling for analytical purposes
- Clear, well-structured communication skills in English
- Ability to manage execution independently, proactively identifying, troubleshooting, and resolving issues without supervision
- A meticulous data quality mindset, ensuring data accuracy and reliability as a matter of habit
- Resourceful problem solver, capable of devising quick workarounds and fixes for API failures or changes in rate limits
- Maintains thorough documentation for schema definitions, pipeline DAGs, and failure runbooks
- Comfortable with rapid iteration cycles and designing extensible pipeline architectures
- Strong cross-functional communication skills to clearly convey technical information to non-technical stakeholders
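
As a concrete illustration of the streaming requirement above, a real-time consumer might look like the minimal sketch below. The endpoint URL, message fields, and the `consume_trades` helper are placeholders, not an actual feed; a production version would add authentication, schema validation, and a durable downstream sink.

```python
# Minimal sketch of a real-time market data consumer over a WebSocket,
# assuming a hypothetical endpoint that emits JSON trade messages.
import asyncio
import json

import websockets  # pip install websockets


async def consume_trades(url: str) -> None:
    # Reconnect loop: production consumers must survive dropped connections.
    while True:
        try:
            async with websockets.connect(url) as ws:
                async for raw in ws:
                    trade = json.loads(raw)
                    # Downstream handoff (queue, Redis stream, etc.) would go here.
                    print(trade.get("symbol"), trade.get("price"))
        except (websockets.ConnectionClosed, OSError):
            # Brief pause before reconnecting to avoid hammering the server.
            await asyncio.sleep(1)


if __name__ == "__main__":
    # Placeholder endpoint; a real feed and auth scheme would differ.
    asyncio.run(consume_trades("wss://example.com/feed"))
```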
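Likewise, the rate-limit handling called out in the scraping and troubleshooting bullets typically comes down to retry logic. The sketch below shows one common pattern with aiohttp: exponential backoff that honors a `Retry-After` header when the server sends one. The URL and the `fetch_with_backoff` helper are illustrative only.

```python
# Sketch of a rate-limit-aware HTTP fetch with exponential backoff,
# assuming a hypothetical API that returns 429 when throttled.
import asyncio

import aiohttp  # pip install aiohttp


async def fetch_with_backoff(session: aiohttp.ClientSession, url: str,
                             max_retries: int = 5) -> str:
    delay = 1.0
    for _ in range(max_retries):
        async with session.get(url) as resp:
            # Retry on throttling (429) or transient server errors (5xx).
            if resp.status == 429 or resp.status >= 500:
                retry_after = resp.headers.get("Retry-After")
                await asyncio.sleep(float(retry_after) if retry_after else delay)
                delay *= 2  # exponential backoff between attempts
                continue
            resp.raise_for_status()
            return await resp.text()
    raise RuntimeError(f"Giving up on {url} after {max_retries} attempts")


async def main() -> None:
    async with aiohttp.ClientSession() as session:
        body = await fetch_with_backoff(session, "https://example.com/api/data")
        print(len(body))


if __name__ == "__main__":
    asyncio.run(main())
```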
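Finally, the orchestration and Medallion-architecture bullets point at pipelines structured as DAGs. The sketch below assumes Airflow 2.4+; the DAG id, schedule, and the ingest/transform callables are placeholders standing in for real bronze- and silver-layer tasks.

```python
# Minimal sketch of a daily pipeline DAG, assuming Airflow 2.4+;
# task names and the ingest/transform functions are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest() -> None:
    print("pull raw market data into the bronze layer")


def transform() -> None:
    print("clean and model data into the silver layer")


with DAG(
    dag_id="market_data_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    ingest_task >> transform_task  # transform runs only after ingest succeeds
```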