About the Role
We are seeking a Senior Data Engineer to build and stabilize the next generation of Cake Digital Bank's unified data infrastructure, which powers analytics, dashboards, and real-time intelligence across the organization.
You will work across streaming, batch, and lakehouse layers, ensuring that our data pipelines and platforms (BigQuery, Doris, and Iceberg) operate with reliability, observability, and cost-efficiency.
This role is central to the evolution of Cake's data mesh and lakehouse architecture, where dashboards, dbt transformations, and machine-learning features all depend on a stable, scalable, and auditable data foundation.
Key Responsibilities
1. Data Infrastructure Stability & Reliability
- Design, implement, and maintain high-availability data pipelines that replicate data from OLTP systems to analytical warehouses (BigQuery, Doris).
- Build resilience and recovery patterns: checkpointing, replay queues, schema-aware ingestion, deduplication, and versioned storage.
- Lead incident response, RCA, and SLA management for data infrastructure components.
- Implement end-to-end observability across ingestion, transformation, and serving layers.
2. Unified Storage & Lakehouse Evolution
- Architect and operationalize Apache Iceberg as the central data storage layer to unify data across BigQuery and Doris.
- Define data layout, partitioning, compaction, and schema evolution strategy for Iceberg tables stored on GCS.
- Design cross-system metadata synchronization between Iceberg catalog, BigQuery external tables, and Doris engines.
3. Platform Automation & Scalability
- Develop and maintain infrastructure-as-code (Terraform, Helm, Config Sync) for Airflow, Flink, Doris, and BigQuery resources.
- Build self-service templates for creating new pipelines, enabling consistent deployment and monitoring.
- Optimize data infrastructure configurations to support workload growth while controlling costs.
- Automate schema detection, dependency validation, and backfill workflows for safe and predictable releases.
4. Data Governance, Quality & Security
- Integrate with DataHub to maintain end-to-end lineage, glossary, and policy-tag enforcement.
- Apply column-level security and row-level policies for sensitive and restricted data (PII, PCI, financial metrics) in BigQuery and Doris.
- Establish validation checks and data contracts to ensure quality and consistency across streaming and batch paths.
- Collaborate with Compliance and Risk teams to maintain auditable data flow and access traceability.
5. Performance Optimization & Cost Efficiency
- Continuously tune BigQuery slot usage, query design, and reservation policies by domain.
- Improve Doris query performance through partition pruning, tablet balancing, and index tuning.
- Design caching, pre-aggregation, and materialized-view strategies to accelerate dashboards while reducing query cost.
- Track job efficiency, data duplication, and storage growth across the platform.
6. Collaboration & Leadership
- Partner with BI, Risk, and ML teams to deliver reliable, low-latency data products that power business decisions.
- Mentor junior engineers on best practices for streaming systems, Airflow orchestration, and infra automation.
- Participate in architecture design sessions to guide the evolution of Cake's multi-engine, unified lakehouse platform.
Required Skills & Experience
- 5+ years of hands-on experience in data engineering or data platform development in a cloud-native environment.
- Strong proficiency in SQL (BigQuery, Doris, or equivalent MPP engines) and Python for building, monitoring, and automating data pipelines.
- Proven experience operating streaming and CDC systems such as Dataflow, Flink, Debezium, or Datastream, with solid understanding of checkpointing, offset management, and backpressure handling.
- Deep understanding of data modeling and warehouse optimization: partitioning, clustering, materialized views, caching, and query tuning.
- Hands-on experience implementing and maintaining infrastructure-as-code using Terraform, Helm, or Config Sync.
- Familiarity with Apache Iceberg (or Delta/Hudi) and lakehouse design patterns, including schema evolution and compaction.
- Strong background in observability and reliability engineering: building dashboards, alerts, and auto-remediation for data pipelines using Prometheus, Grafana, or Cloud Monitoring.
- Understanding of data governance and security concepts: column-level security, policy tags, lineage, and data classification using DataHub or similar systems.
- Working knowledge of orchestration and workflow automation tools such as Airflow or Dagster.
Our Benefits
- Competitive compensation, including a 13th-month salary and up to 3 months of performance-based bonus.
- MacBook and essential equipment provided.
- A BE Corp budget (varying by level) is allocated for services such as transportation, food, and passenger-car bookings in the Be application.
- Annual health checks and premium health insurance (PTI) after probation.
- 15 days of annual leave for all employees.
- Company trips, team-building activities, and happy hour events are organized on a quarterly or annual basis.