Job Description:
We are seeking a Data Governance Engineer to establish and maintain data governance practices across our local data platform. The role focuses primarily (70%) on data governance initiatives, including security management, query optimization, and resource allocation, with the remaining 30% supporting analytics automation and pipeline maintenance.
Primary Responsibilities: Data Governance (70%)
Data Governance Framework
- Optimize data pipelines and SQL queries to ensure minimal resource consumption, high stability, and efficient performance across platforms such as HDFS, Presto, and Spark
- Establish and maintain continuous monitoring of governance scores, which are based on storage utilization and compute resource efficiency, ensuring all teams achieve and sustain scores above 85/100
- Develop governance policies and best practices tailored to local team needs
Data Service Management
- Integrate all VN reports and data pipelines into the SLA Manager System for comprehensive resource tracking
- Lead optimization of resource allocation across all teams through performance monitoring and capacity planning
- Develop and manage an Asset Status Tracker for complete visibility of data assets and project dependencies
- Implement and monitor failed task notifications, working with teams to set them up in Seatalk
Data Quality & Performance
- Prioritize and implement data quality rules for key tasks to ensure stable data resources
- Conduct regular query performance reviews and optimization sessions with Functional BI teams
- Establish data quality metrics and monitoring dashboards for proactive issue detection
- Lead troubleshooting efforts for data quality issues and performance bottlenecks
Security & Access Management
- Manage data access controls and security policies across all data platforms
- Conduct regular security audits and compliance reviews
- Work with the Regional team to ensure alignment with global data governance standards
Requirements:
- Strong experience with data governance frameworks and best practices
- Hands-on experience with HDFS, Presto, Spark, and other distributed systems
- Proven track record of working with large distributed data warehouse systems and optimizing queries over very high data volumes
- Strong SQL skills and understanding of data modeling & data warehouse management principles
- Knowledge of data security and access control management
- Excellent communication skills to work with both technical and business stakeholders
- Experience with data quality tools and frameworks
- Experience with Python for automation and monitoring scripts
- Experience with Asset Status Tracking systems
- Knowledge of notification systems integration (Seatalk or similar)
- Background in implementing data governance in cross-functional environments
- Experience with LLM applications