Design, develop, and maintain robust, scalable, and reliable data pipelines and ETL/ELT processes.
Write and optimize scalable data processing jobs using Spark (preferably PySpark) within the AWS Glue environment.
Develop and maintain data transformation logic using dbt to create clean, reliable datasets for analytics.
Build and manage data workflows and orchestration using AWS MWAA (Managed Workflows for Apache Airflow).
Develop, test, and debug complex Airflow DAGs to schedule and monitor Spark jobs, dbt runs, and other data pipeline tasks.
Write and optimize complex SQL queries for data extraction, transformation, and analysis using services such as Amazon Athena.
Manage access control and security for data resources using AWS IAM policies and roles.
Collaborate with data analysts, data scientists, and other stakeholders to understand data requirements and deliver appropriate solutions.
Monitor, troubleshoot, and resolve issues in data pipelines to ensure data quality and integrity.
Contribute to the documentation of data processes, pipelines, and architecture.
Requirements
At least 2.5 years of hands-on experience in a Data Engineering role.
Solid understanding of core data engineering concepts, modern data warehousing, and ETL/ELT principles.
Proficiency in writing and optimizing distributed data processing jobs using Spark (PySpark highly preferred).
Proven experience with core AWS data services, including S3, Glue, Athena, MWAA, and IAM.
Hands-on experience with data transformation tools, specifically dbt.
Proficiency in Python and writing efficient, maintainable, and debuggable Airflow DAGs.
Strong SQL skills with the ability to write complex queries and perform performance tuning.
Excellent analytical and problem-solving skills.
Strong communication skills and the ability to work effectively in a collaborative team environment.
(Preferred) Knowledge of, or hands-on experience with, Infrastructure as Code (IaC) tools, particularly Terraform.
Open-minded and willing to work in an international team.
Good spoken English.
The following skills, experience, or certifications are a bonus:
Data Engineering certifications from Microsoft, IBM, or Cloudera.
Data Management Association International (DAMA) or Certified Data Management Professional (CDMP) certification.
Benefits
Salary: Negotiable
An international, professional, young, innovative, and dynamic environment, where you work closely with international experts and attend conferences and workshops on exciting new technologies.
Full employee benefits in accordance with Vietnamese labor law, including social and health insurance.
Holidays in accordance with Vietnamese labor law, plus paid vacation.
Work at Techno Park Tower, recognized as a world-class office tower and ranked among the world's smartest towers.