About the Role
We are looking for a Senior Data Engineer to join our Data Platform team and lead the design, development, and optimization of our enterprise data products on Azure Databricks and the Lakehouse architecture. You will work across ingestion, data modeling, automation, ML, and data governance to build a scalable data ecosystem serving analytics, ML, and business activation use cases.
Key Responsibilities
- Design & develop scalable data pipelines using Azure Databricks (SQL, Python, PySpark).
- Implement Delta Lake / Lakehouse Medallion architecture (Bronze → Silver → Gold).
- Optimize performance and cost through cluster tuning, job scheduling, and serverless compute.
- Implement CI/CD, DBX version control, Unity Catalog governance & cluster policies.
- Integrate Databricks with ADLS Gen2, Azure SQL, ADF / Databricks Jobs, Event Hubs, Key Vault, and Terraform.
- Build automation: Auto EDA (profiling, anomaly detection), AutoML & MLflow pipelines.
- Apply LLMs / Data GPT for automated SQL generation, documentation, data lineage & data quality explanation.
- Work closely with business teams to translate requirements into scalable solutions.
Platform Scope You Will Help Build
- Data ingestion system, data cleaning & standardization
- Global-ID data connection / mapping
- Data crawler
- Enterprise Data Lake & Feature Store
- Real-time & batch analytics
- Activation API
- Data Catalog, Data Lineage, Data Quality Monitoring
- Data access governance, usage monitoring, pricing & FinOps visibility
- Data security best practices
Qualifications
Requirements
- 5+ years of experience in data engineering or distributed data processing
- Expert in Azure Databricks (Delta Lake, Unity Catalog, DBX version control, cluster policies, CI/CD)
- Strong data modeling skills (star schema, dimensional modeling, Data Vault) & experience with ELT frameworks
- Hands-on with PySpark, SQL, Python, Databricks SQL
- Experience with AutoML / MLflow (train → deploy → monitor)
- Experience applying GenAI / Data GPT to data workflows
Nice-to-have
- Streaming: Structured Streaming, Auto Loader, Kafka/EventHub
- Databricks Photon, Serverless SQL, fine-grained access control
- Cost governance & FinOps for Databricks & Azure
Why Join Masan
- Be part of Masan's digital transformation journey with high-impact, real-world data challenges
- Build a modern end-to-end data platform from scratch
- Work with a cutting-edge stack: Databricks, AutoML, and Generative AI
- High ownership, high-impact engineering role