Skip to content

MOD-038 — Data quality & reconciliation monitor

System: SD06 Snowflake Analytics & Risk Platform Repo: bank-risk-platform Status: Built Owner: Data & Risk Engineering Build date: 2026-05-01


Purpose

MOD-038 is the data quality (DQ) gate for SD06. It owns the governance_meta schema and provides:

  • Continuous runtime DQ monitoring on every key Snowflake table via Snowflake Data Metric Functions (DMFs).
  • A load-bearing FR-226 gate view (V_QUALITY_SCORES) that downstream Phase 2 risk modules query in their dbt staging pre-hook to skip when their upstream is unhealthy.
  • A Snowflake Alert (A_DQ_BREACH) that publishes to BANK_SNS_INTEGRATION → MOD-076 ops alerting whenever any monitored dataset falls below the FR-226 quality threshold.
  • A FR-227 reconciliation log (RECONCILIATION_BREAKS) — V1 placeholder; real LSN-ack-based reconciliation enabled in V2.
  • A FR-228 daily quality summary Dynamic Table for the CRO report.

Per ADR-049 (Snowflake-native compute — Alerts and DMFs preferred over Lambda monitors), MOD-038 has no Lambda surface. The Pulumi stack is minimal — only SSM outputs that publish the view-as-product contract.

FRs / NFRs satisfied

ID Summary
FR-225 DMFs attached to source tables (NULL_COUNT, UNIQUE_COUNT, FRESHNESS, DUPLICATE_COUNT system DMFs + custom value-range / format-conformance DMFs). Quality score per dataset = passing/total checks, computed in V_QUALITY_SCORES.
FR-226 V_QUALITY_SCORES.halt_required is the precomputed gate; threshold sourced from GOVERNANCE_META.CONFIG (default 0.98). The Snowflake Alert A_DQ_BREACH publishes the human-alert event to MOD-076 via BANK_SNS_INTEGRATION when any dataset is in halt.
FR-227 reconciliation_breaks dbt model + GOVERNANCE_META.RECONCILIATION_BREAKS table. V1 status: placeholder — gated on var('recon_check_enabled', false) until MOD-042's LSN-ack-in-Iceberg pattern is wired. Schema is fixed; body is V2 work.
FR-228 DAILY_QUALITY_SUMMARY Dynamic Table aggregates per-domain quality scores, halt counts, and open break counts for the CRO report (target_lag 1 hour).
NFR-006 DQ checks fire on the schedules attached to each table (1 hour for high-frequency, 4-24 hours for slower-changing data). Alert fires daily 06:00 UTC, ahead of the 23:59 NFR deadline.
NFR-010 The DQ halt mechanism (V_QUALITY_SCORES → downstream pre-hook gate) is the load-bearing automation that satisfies "regulatory submissions produced without manual intervention ≥ 90%".
NFR-024 CONFIG and RECONCILIATION_BREAKS are append-only — UPDATE/DELETE not granted to runtime roles. Runtime immutability verified by tests/integration/snowflake-immutability.test.ts.

Architecture

┌──────────────────────────────────────────────────────────────────┐
│ Source tables (MOD-085 MARKET.*, MOD-098 METERING.*, future      │
│ MOD-042 raw_cdc_*) with attached system + custom DMFs            │
└────────────────┬─────────────────────────────────────────────────┘
                 │ DMF schedules fire (1h-24h cadence)
┌──────────────────────────────────────────────────────────────────┐
│ SNOWFLAKE.ACCOUNT_USAGE.DATA_QUALITY_MONITORING_RESULTS          │
│ (native Snowflake-managed result store — no custom log table)    │
└────────────────┬─────────────────────────────────────────────────┘
                 │ joined to GOVERNANCE_META.CONFIG (threshold)
┌──────────────────────────────────────────────────────────────────┐
│ GOVERNANCE_META.V_QUALITY_SCORES   ← FR-226 LOAD-BEARING GATE    │
│ GOVERNANCE_META.V_OPEN_BREAKS                                    │
│ GOVERNANCE_META.DAILY_QUALITY_SUMMARY (Dynamic Table)            │
└────────────────┬─────────────────────────────────────────────────┘
        ┌────────┴────────────────────────────┐
        ▼                                      ▼
┌──────────────────┐                ┌─────────────────────────────┐
│ Phase 2 modules' │                │ GOVERNANCE_META.A_DQ_BREACH │
│ dbt staging      │                │ Snowflake Alert (CRON 0 6)  │
│ pre-hook         │                └────────────┬────────────────┘
│ (skip if halt)   │                             │ on halt_required
└──────────────────┘                             ▼
                                    SYSTEM$SEND_SNOWFLAKE_NOTIFICATION
                                    → BANK_SNS_INTEGRATION
                                    → AWS SNS topic
                                    → MOD-076 ops alerting

Snowflake objects

Object Type Purpose
GOVERNANCE_META Schema Cross-module governance metadata. MOD-038 provisions; future MOD-079/MOD-156 may add to it.
GOVERNANCE_META.CONFIG Table (append-history) Effective-from/to versioned runtime config. Seeds fr_226_quality_threshold = 0.98 and fr_227_recon_tolerance = 0.0001.
GOVERNANCE_META.RECONCILIATION_BREAKS Table (append) FR-227 break log. Append-only via dbt incremental; UPDATE/DELETE revoked.
GOVERNANCE_META.DMF_COUNT_NEGATIVE Custom DMF Counts rows where a NUMBER column is < 0.
GOVERNANCE_META.DMF_COUNT_OUTSIDE_UNIT_INTERVAL Custom DMF Counts rows where a NUMBER column is outside [0, 1].
GOVERNANCE_META.DMF_COUNT_UNPARSEABLE_DATE Custom DMF Counts rows where a VARCHAR column doesn't parse as YYYY-MM-DD.
GOVERNANCE_META.V_QUALITY_SCORES View (always fresh) FR-226 gate. Per-dataset latest quality score + halt_required.
GOVERNANCE_META.V_OPEN_BREAKS View (always fresh) DMF failures + open recon breaks. Read by FR-228 summary.
GOVERNANCE_META.DAILY_QUALITY_SUMMARY Dynamic Table (FULL, 1h lag) FR-228 CRO daily report.
GOVERNANCE_META.A_DQ_BREACH Snowflake Alert (cron 0 6 UTC) Publishes to MOD-076 when V_QUALITY_SCORES.halt_required is TRUE. ADR-049.

DMF attachments (initial set — extend in follow-up commits as new tables come online): - MARKET.FX_SPOT_CURRENT — NULL_COUNT, UNIQUE_COUNT, FRESHNESS, DMF_COUNT_NEGATIVE on MID_RATE - MARKET.BENCHMARK_RATES — NULL_COUNT, UNIQUE_COUNT, FRESHNESS, DMF_COUNT_NEGATIVE on RATE - METERING.AWS_COST_DAILY — NULL_COUNT, FRESHNESS, DMF_COUNT_NEGATIVE on amortized_cost_usd - METERING.DAILY_TENANT_SUMMARY — NULL_COUNT, UNIQUE_COUNT, FRESHNESS, DMF_COUNT_NEGATIVE on attributed_cost_usd

SSM contract

Reads (consumed)

SSM path From
/bank/{env}/snowflake/account-locator MOD-102
/bank/{env}/snowflake/databases/risk MOD-102
/bank/{env}/snowflake/warehouses/{etl,dbt} MOD-102
/bank/{env}/snowflake/roles/{domain-prod,domain-nonprod,dbt} MOD-102
/bank/{env}/snowflake/integrations/sns MOD-102 (BANK_SNS_INTEGRATION used by the Alert; resolved by name in DDL, not by SSM at runtime)

Writes (published) — view-as-product per ADR-046 §3

SSM path Value Consumed by
/bank/{env}/risk-platform/dq/quality-scores-view GOVERNANCE_META.V_QUALITY_SCORES Phase 2 module pre-hooks (MOD-032/033/035/036 etc.)
/bank/{env}/risk-platform/dq/open-breaks-view GOVERNANCE_META.V_OPEN_BREAKS MOD-076 dashboards, ad-hoc ops queries
/bank/{env}/risk-platform/dq/daily-summary-table GOVERNANCE_META.DAILY_QUALITY_SUMMARY CRO daily report consumer (out of MOD-038 scope)
/bank/{env}/risk-platform/dq/breach-alert-name GOVERNANCE_META.A_DQ_BREACH Ops/runbooks

MOD-102 grants required

This module depends on three MOD-102 migrations applied prior to first deploy:

  • 0019IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE to BANK_DBT_ROLE (so dbt can read SNOWFLAKE.ACCOUNT_USAGE.DATA_QUALITY_MONITORING_RESULTS).
  • 0021EXECUTE ALERT ON ACCOUNT to BANK_NONPROD_RISK_ROLE + BANK_PROD_RISK_ROLE + BANK_DBT_ROLE (so the deploy can RESUME the Alert).
  • 0022APPLY METRIC ON ACCOUNT to the same roles (so the deploy can attach DMFs to MOD-085/098-owned tables).

BANK_SNS_INTEGRATION (MOD-102 migration 10) is the notification target for the Alert — must be enabled by MOD-076 in the target stage before notifications are routed.

Deploy flow

Per scripts/deploy.ts:

  1. Snowflake DDL apply (pre-dbt) — schemas → tables → DMFs → DMF attachments → grants → tasks. Connects as the deploy role.
  2. dbt builddbt build --select tag:mod-038. Runs the four dbt models, the schema tests, the unit tests (per the SD06 testing taxonomy), and source freshness. Connects as BANK_DBT_ROLE.
  3. Snowflake DDL apply (post-dbt)alerts/ only. CREATE ALERT validates the body's reference to V_QUALITY_SCORES, which dbt has just created.
  4. Pulumi up — publishes the four SSM outputs.
  5. Integration tests (CI) — verify the deployed shape (alert started, DMFs attached, SSM resolves).

Testing

Per the SD06 testing taxonomy (docs/handoffs/sd06-testing-architecture.handoff.md):

Layer Location Coverage
Vitest unit tests/unit/ None — no TypeScript code in this module.
Vitest policy (structural) tests/policy/ REP-005 GATE (alert structural contract; threshold sourced from config), DT-004 AUTO (DMF attachments mandatory; no skip in CI), GOV-006 LOG (immutability grant absences).
Vitest integration tests/integration/ Schema, tables, views, DT, DMFs, Alert exist with expected state. SSM outputs resolve. Runtime immutability under BANK_DBT_ROLE.
dbt schema tests dbt/models/MOD-038-data-quality-monitor/sources.yml not_null, accepted_values, unique on every published view's key columns.
dbt unit tests same file Behavior assertions on v_quality_scores — halt_required boundary, threshold sourcing from CONFIG (effective_to IS NULL), latest-measurement-wins per metric.
dbt source freshness same file account_usage.data_quality_monitoring_results warned at 24h, errored at 72h.

The dbt unit tests are where FR-225/226 behavioral correctness is asserted. Vitest policy tests are restricted to file-content / DDL pattern assertions that wouldn't be observable from compiled SQL.

Open items at handoff

  1. CDC schema attachments — the dmf_attach/ directory ships with attachments for MARKET.* and METERING.*. Attachments for raw_cdc_* schemas (MOD-042's CDC outputs) aren't included in V1; they should be added in a follow-up once each domain's CDC is verified live in dev.
  2. FR-227 reconciliation enabled-by-default in prod — the reconciliation_breaks dbt model is gated on var('recon_check_enabled', false). Enable per stage by passing the workflow_dispatch input recon_check_enabled=true once MOD-042 publishes LSN-ack-in-Iceberg and the model body's placeholder CTE is replaced with the real read.
  3. MOD-076 SNS subscriptionA_DQ_BREACH notifies via BANK_SNS_INTEGRATION. Until MOD-076 enables the integration in the target stage, alerts fire as no-ops in ALERT_HISTORY but no SNS message is delivered. Coordinate with MOD-076 ops at first-prod deploy.
  4. Wiki body inconsistency — the bank-wiki body of MOD-038.md references a "thin Lambda" for the FR-226 alert, written before ADR-049 finalised. The dependency reasons (.yaml) and ADR-049 are authoritative; this design doc reflects the ADR-049 path. Editorial follow-up for the wiki agent.