
Project Overview:
You will be a key member of the delivery team on a high-stakes data platform migration for a multi-brand iGaming operator. The engagement replaces a legacy Pentaho/Postgres architecture with a modern SQLMesh + Dagster + BigQuery stack, completing before a 3-4X data volume surge expected in July 2026.
This is hands-on build work. You will work directly in the codebase from day one, embedded within a squad led by a Fractional CTO/Data Architect.
Scope of Work
Phase 1a: Core Migration (Mar-Jul 2026):
Migrate 128 Pentaho ETL jobs to idempotent SQLMesh models following Medallion architecture (Bronze → Silver → Gold)
Convert 20+ complex PL/pgSQL trigger functions to event-driven Dagster assets and SQLMesh models
Build and validate a Reverse ETL pipeline (BigQuery → Postgres) covering 107 tables across 3 SLA tiers: Hot: 5 min, Warm: 15-30 min, Cold: daily
Execute a 7-Day Consistency Check with row-count and column-sum reconciliation as the production cutover gate (see the reconciliation sketch after this list)
Support UAT, production cutover, and 1-month hypercare post-cutover
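
For context on that cutover gate: the sketch below shows one way the row-count and column-sum reconciliation could look in Python, assuming google-cloud-bigquery and psycopg2. Every project, dataset, table, column, and connection string here is a hypothetical placeholder, not the engagement's actual naming.

from google.cloud import bigquery
import psycopg2

# (BigQuery table, Postgres table, financial column) triples to reconcile;
# all names are placeholders
CHECKS = [
    ("my-project.gold.bets", "public.bets", "turnover"),
]

bq = bigquery.Client()
pg = psycopg2.connect("postgresql://user:pass@host:5432/db")  # placeholder DSN

def bq_stats(table: str, col: str) -> tuple:
    # Row count and column sum on the BigQuery side
    row = next(iter(bq.query(f"SELECT COUNT(*), SUM({col}) FROM `{table}`").result()))
    return (row[0], row[1])

def pg_stats(table: str, col: str) -> tuple:
    # The same two aggregates on the Postgres side
    with pg.cursor() as cur:
        cur.execute(f"SELECT COUNT(*), SUM({col}) FROM {table}")
        return cur.fetchone()

for bq_table, pg_table, col in CHECKS:
    src, dst = bq_stats(bq_table, col), pg_stats(pg_table, col)
    print(("OK   " if src == dst else "DRIFT"), bq_table, src, "vs", pg_table, dst)

In practice this would run once per day over the 7-day window, with any DRIFT line on a financial column blocking cutover.
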
Phase 1b: BigQuery Modernisation (Jul-Sep 2026):
Rewrite 164 BigQuery Scheduled Queries from EXTERNAL_QUERY to native BigQuery reads (rewrite pattern sketched after this list)
Rewrite 32 BigQuery Stored Procedures to eliminate CloudSQL dependencies
Retire 12 CloudSQL connections across 10 brands after 7-day burn-in validation
1-month hypercare support post-Phase 1b cutover.
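
For illustration, the Scheduled Query rewrite follows a simple before/after pattern. The connection ID, dataset, and table names below are hypothetical, and the sketch assumes the source table has already been replicated into BigQuery:

from google.cloud import bigquery

# Before: federated read through a CloudSQL connection (the dependency
# being retired in Phase 1b)
legacy_sql = """
SELECT member_id, tier
FROM EXTERNAL_QUERY(
  'my-project.europe-west1.cloudsql-conn',
  'SELECT member_id, tier FROM public.member_tiers'
)
"""

# After: native read against the replicated BigQuery table, so the
# CloudSQL connection can be dropped after the 7-day burn-in
native_sql = """
SELECT member_id, tier
FROM `my-project.bronze.member_tiers`
"""

bigquery.Client().query(native_sql).result()
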
Technical Requirements
Must Have:
SQLMesh: Incremental models, full-refresh strategies, model testing, Medallion architecture
Dagster: Asset pipelines, sensors, scheduling, upstream/downstream dependency management
BigQuery: Partitioning, clustering, MERGE with partition pruning, cost-aware query design, INFORMATION_SCHEMA
PostgreSQL: Triggers, stored functions, PL/pgSQL fluency, trigger-to-model conversion patterns
Python: Apache Arrow, Polars (or equivalent) for high-throughput Reverse ETL data movement (see the sketch after this list)
Data validation: row-count reconciliation, column-sum diffing, data-diff tooling
CI/CD for data: Git PR workflows, automated model testing, Azure DevOps experience
Idempotency discipline: all models must be re-runnable with zero side effects
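
As referenced above, a minimal Arrow/Polars Reverse ETL sketch might look like the following, assuming google-cloud-bigquery, polars, and the adbc-driver-postgresql package; all names and the DSN are placeholders. A full-refresh write is shown only as the simplest idempotent baseline; Hot-tier tables would more likely use a keyed upsert.

import polars as pl
from google.cloud import bigquery

# Pull one Gold table out of BigQuery as Arrow (near zero-copy into Polars)
arrow_tbl = bigquery.Client().query(
    "SELECT member_id, tier, turnover FROM `my-project.gold.member_tiers`"
).to_arrow()

df = pl.from_arrow(arrow_tbl)

# Full-refresh write keeps the job re-runnable: repeated runs converge on
# the same state instead of appending duplicates
df.write_database(
    table_name="public.member_tiers",
    connection="postgresql://user:pass@host:5432/db",  # placeholder DSN
    if_table_exists="replace",
    engine="adbc",
)
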
Nice to Have:
iGaming or sportsbook domain knowledge: member tiers, bet/deposit fact tables, fiscal period logic
Experience with Datastream, Airbyte, or other CDC ingestion tools
dbt experience (cross-applicable to SQLMesh patterns)
Terraform or Pulumi on GCP
Grafana or Datadog for pipeline observability
Ideal Candidate
5+ years in data engineering with a track record of shipping production pipelines, not just building in staging
Has executed at least one legacy-to-modern DWH migration (not just contributed to one)
Comfortable reading, debugging, and rewriting PL/pgSQL business logic
Writes clean, testable SQL and Python and holds the same standard for teammates
Works well under an experienced technical lead: takes direction, flags risk early, executes independently
Does not need hand-holding on tooling; picks up SQLMesh or Dagster quickly if not already proficient
Available to start March 2026; commitment to Phase 1b continuation preferred
Code Quality Standards:
All models and assets produced must adhere to:
Explicit partitioning and clustering keys on all destination tables
No broad DISTINCT; use GROUP BY or window functions instead
MERGE with partition pruning for upserts; no DELETE+INSERT anti-patterns (see the sketch after this list)
No temp tables or CTEs materialised to disk
Zero material drift on financial columns (Turnover, Revenue): the mandatory acceptance gate
Python adhering to PEP 8; SQL modular, idempotent, and reusable
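
As referenced in the list above, here is a hedged sketch of the MERGE-with-partition-pruning upsert. Table and column names are hypothetical, and the target is assumed to be partitioned on DATE(bet_ts); the constant date filter on the target in the ON clause is what lets BigQuery prune partitions instead of scanning the whole Gold table.

from google.cloud import bigquery

# Upsert one day of Silver into Gold. The literal date would be rendered
# per run by the orchestrator; it is hardcoded here only for illustration.
merge_sql = """
MERGE `my-project.gold.bets` AS t
USING (
  SELECT bet_id, bet_ts, turnover, revenue
  FROM `my-project.silver.bets`
  WHERE DATE(bet_ts) = DATE '2026-07-01'
) AS s
ON t.bet_id = s.bet_id
   AND DATE(t.bet_ts) = DATE '2026-07-01'  -- constant filter on the target prunes partitions
WHEN MATCHED THEN
  UPDATE SET turnover = s.turnover, revenue = s.revenue
WHEN NOT MATCHED THEN
  INSERT (bet_id, bet_ts, turnover, revenue)
  VALUES (s.bet_id, s.bet_ts, s.turnover, s.revenue)
"""

# Re-runnable with zero side effects: a second run updates the same rows
# rather than duplicating them, unlike a DELETE+INSERT pattern
bigquery.Client().query(merge_sql).result()
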
Working Norms
3x-per-week standup with the Fractional CTO, focused on blockers
Weekly sync reviewing milestones, risks, and forward planning
All work submitted via Pull Request, reviewed and approved by Client Data Engineering before merge
Named to the engagement; no rotation without client approval
Interview Process
This role goes through the Qsoft pre-qualification process, consisting of:
Short 15-20 minute interview with the project team
The candidate may join 1-2 interviews on the client side.
Job ID: 143818939