About the Role
We are seeking a Senior Data Engineer to build and stabilize the next generation of Cake Digital Bank's unified data infrastructure — powering analytics, dashboards, and real-time intelligence across the organization.
You will work across streaming, batch, and lakehouse layers, ensuring that our data pipelines and platforms — BigQuery, Doris, and Iceberg — operate with reliability, observability, and cost-efficiency.
This role is central to the evolution of Cake's data mesh and lakehouse architecture, where dashboards, dbt transformations, and machine-learning features all depend on a stable, scalable, and auditable data foundation.
Key Responsibilities
1. Data Infrastructure Stability & Reliability
- Design, implement, and maintain high-availability data pipelines that replicate data from OLTP systems to analytical warehouses (BigQuery, Doris).
- Build resilience and recovery patterns — checkpointing, replay queues, schema-aware ingestion, deduplication, and versioned storage.
- Lead incident response, RCA, and SLA management for data infrastructure components.
- Implement end-to-end observability across ingestion, transformation, and serving layers.
2. Unified Storage & Lakehouse Evolution
- Architect and operationalize Apache Iceberg as the central data storage layer to unify data across BigQuery and Doris.
- Define data layout, partitioning, compaction, and schema evolution strategy for Iceberg tables stored on GCS.
- Design cross-system metadata synchronization between Iceberg catalog, BigQuery external tables, and Doris engines.
3. Platform Automation & Scalability
- Develop and maintain infrastructure-as-code (Terraform, Helm, Config Sync) for Airflow, Flink, Doris, and BigQuery resources.
- Build self-service templates for creating new pipelines, enabling consistent deployment and monitoring.
- Optimize data infrastructure configurations to support workload growth while controlling costs.
- Automate schema detection, dependency validation, and backfill workflows for safe and predictable releases.
4. Data Governance, Quality & Security
- Integrate with DataHub to maintain end-to-end lineage, glossary, and policy-tag enforcement.
- Apply column-level security and row-level policies for sensitive and restricted data (PII, PCI, financial metrics) in BigQuery and Doris.
- Establish validation checks and data contracts to ensure quality and consistency across streaming and batch paths.
- Collaborate with Compliance and Risk teams to maintain auditable data flow and access traceability.
5. Performance Optimization & Cost Efficiency
- Continuously tune BigQuery slot usage, query design, and reservation policies by domain.
- Improve Doris query performance through partition pruning, tablet balancing, and index tuning.
- Design caching, pre-aggregation, and materialized-view strategies to accelerate dashboards while reducing query cost.
- Track job efficiency, data duplication, and storage growth across the platform.
6. Collaboration & Leadership
- Partner with BI, Risk, and ML teams to deliver reliable, low-latency data products that power business decisions.
- Mentor junior engineers on best practices for streaming systems, Airflow orchestration, and infra automation.
- Participate in architecture design sessions to guide the evolution of Cake's multi-engine, unified lakehouse platform.
Required Skills & Experience
- 5+ years of hands-on experience in data engineering or data platform development in a cloud-native environment.
- Strong proficiency in SQL (BigQuery, Doris, or equivalent MPP engines) and Python for building, monitoring, and automating data pipelines.
- Proven experience operating streaming and CDC systems such as Dataflow, Flink, Debezium, or Datastream, with solid understanding of checkpointing, offset management, and backpressure handling.
- Deep understanding of data modeling and warehouse optimization — partitioning, clustering, materialized views, caching, and query tuning.
- Hands-on experience implementing and maintaining infrastructure-as-code using Terraform, Helm, or Config Sync.
- Familiarity with Apache Iceberg (or Delta/Hudi) and lakehouse design patterns, including schema evolution and compaction.
- Strong background in observability and reliability engineering — building dashboards, alerts, and auto-remediation for data pipelines using Prometheus, Grafana, or Cloud Monitoring.
- Understanding of data governance and security concepts: column-level security, policy tags, lineage, and data classification using DataHub or similar systems.
- Working knowledge of orchestration and workflow automation tools such as Airflow or Dagster.
Our Benefits
- Competitive compensation, including a 13th-month salary and up to 3 months of performance-based bonus.
- A MacBook and essential equipment are provided.
- A BE Corp budget (varies by level) is allocated for services such as transportation, food, and passenger car bookings in the Be application.
- Annual health checks and premium medical healthcare (PTI) after probation.
- 15 days of annual leave for all employees.
- Company trips, team-building activities, and happy hour events are organized on a quarterly or annual basis.