About the Role
We are looking for a Senior Data Engineer to join our Data Platform team and lead the design, development, and optimization of our enterprise data products on Azure Databricks and the Lakehouse architecture. You will work across ingestion, data modeling, automation, ML, and data governance to build a scalable data ecosystem serving analytics, ML, and business activation use cases.
Key Responsibilities
- Design & develop scalable data pipelines using Azure Databricks (SQL, Python, PySpark).
- Implement Delta Lake / Lakehouse Medallion architecture (Bronze → Silver → Gold).
- Optimize performance and cost through cluster tuning, job scheduling, and serverless compute.
- Implement CI/CD, DBX version control, Unity Catalog governance & cluster policies.
- Integrate Databricks with ADLS Gen2, Azure SQL, ADF / Databricks Jobs, Event Hubs, Key Vault, and Terraform.
- Build automation: Auto EDA (profiling, anomaly detection), AutoML & MLflow pipelines.
- Apply LLMs / Data GPT for automated SQL generation, documentation, data lineage & data quality explanation.
- Work closely with business teams to translate requirements into scalable solutions.
Platform Scope You Will Help Build
- Data ingestion system, data cleaning & standardization
- Global-ID data connection / mapping
- Data crawler
- Enterprise Data Lake & Feature Store
- Real-time & batch analytics
- Activation API
- Data Catalog, Data Lineage, Data Quality Monitoring
- Data access governance, usage monitoring, pricing & FinOps visibility
- Data security best practices
Qualifications
Requirements
- 5+ years of experience in data engineering or distributed data processing
- Expert in Azure Databricks (Delta Lake, Unity Catalog, DBX version control, cluster policies, CI/CD)
- Strong data modeling skills (star schema, dimensional modeling, Data Vault) & experience with ELT frameworks
- Hands-on with PySpark, SQL, Python, Databricks SQL
- Experience with AutoML / MLflow (train → deploy → monitor)
- Experience applying GenAI / Data GPT to data workflows
Nice-to-have
- Streaming: Structured Streaming, Auto Loader, Kafka/EventHub
- Databricks Photon, Serverless SQL, fine-grained access control
- Cost governance & FinOps for Databricks & Azure
Why Join Masan
- Be part of Masan's digital transformation journey with high-impact, real-world data challenges
- Build a modern end-to-end data platform from scratch
- Work with a cutting-edge stack: Databricks, AutoML, and Generative AI
- High ownership, high-impact engineering role