Skip to content

ADR-050: SD06 test layering — dbt unit tests for SQL behaviour, Vitest policy tests for SQL structure

Status Accepted
Date 2026-05-01
Deciders CTO, Head of Risk Engineering
Affects repos bank-risk-platform

Status: Accepted — 2026-05-01

Context

During the MOD-038 design review (2026-05-01), a pattern was identified in MOD-085 and MOD-098 policy tests that conflated structural file assertions with behavioural SQL assertions. The tests used Vitest regex matches against raw .sql file content to "assert" arithmetic correctness:

// MOD-098 tests/policy/REP-001-calc-correctness.test.ts
expect(sql).toMatch(
  /attribution_mode\s*=\s*['"]dedicated['"][\s\S]+?warehouse_total_credits\s*\*\s*credit_unit_cost_usd/i,
);

This gives false confidence. A semantically-equivalent SQL refactor — IFF(attribution_mode = 'dedicated', warehouse_total_credits * unit_cost, ...) — breaks the test without breaking the behaviour. Conversely, a real arithmetic bug that does not change the token order passes the test silently. These tests are asserting file content, not correctness.

dbt 1.8 introduced native unit tests that execute a model's SQL against fixture rows in a temp table and assert the output. This is the correct tool for SQL behavioural assertions. It runs as part of dbt build in CI with no new pipeline step required.

The failing pattern surfaced specifically in SD06 modules because they are Snowflake-centric: most of the business logic lives in dbt SQL models, not in Lambda TypeScript. The bank-core and bank-kyc test suites are correctly TypeScript-centric and are not affected by this ADR.

Decision

1. Six-layer test taxonomy

All SD06 modules adopt the following layering. Each layer has a single tool and a clearly scoped responsibility.

Layer Tool Runs when What it proves
TypeScript code Vitest unit pnpm test:unit (no AWS, no Snowflake) Lambda handlers, domain modules, event publishers, helpers — TypeScript logic correctness
dbt SQL behaviour dbt unit tests (1.8+) dbt build A model's SELECT produces the expected output rows for given fixture inputs
Output data quality dbt schema tests dbt build Published columns meet data-quality contracts: not_null, unique, accepted_values, relationships, custom singular tests
Source freshness dbt source freshness dbt build Upstream raw CDC sources are recent enough to use (loaded_at_field thresholds)
Deployed infrastructure Vitest integration pnpm test:integration (against deployed dev stack) The deployed AWS / Snowflake stack matches the IaC declaration — objects exist, grants are present, Alerts are started
Static SQL structure Vitest policy pnpm vitest run tests/policy (no AWS, no Snowflake) Source files structurally encode the policy requirement — e.g., a REVOKE clause is present, a threshold is sourced from config rather than hard-coded

2. The structural vs behavioural boundary

A test belongs in Vitest policy (static structure) if and only if it asserts something about the source file that would not be observable from the model's compiled output — and where a semantically equivalent implementation would correctly pass the test. Examples:

  • governance_meta_grants.sql contains a REVOKE UPDATE, DELETE ON ... clause — confirms the GOV-006 immutability grant is present in the file.
  • unattributed_costs.sql references source('metering', 'config') for the threshold rather than a literal 0.02 — confirms DT-004's threshold-not-hardcoded structural obligation.
  • A DDL file uses CREATE OR REPLACE — confirms the idempotency contract in the file.
  • A materialization is dynamic_table with target_lag = '1 hour' — confirms the SLA is declared in the file.

A test belongs in dbt unit tests (behaviour) if it asserts something about the output of a transformation given known inputs — i.e., something that can only be verified by executing the SQL. Examples:

  • Given a row with attribution_mode = 'dedicated', the output attributed_credits equals warehouse_total_credits (FR-394 arithmetic).
  • Given unattributed_share = 0.04 with a config threshold of 0.05, the output exceeds_threshold is FALSE (FR-396 boundary).
  • Given five consecutive carry-forward rows, consecutive_carry_forward_days equals 5 (FR-384 streak aggregation).

The litmus test: if a semantically-equivalent SQL refactor — one that produces identical output — would break the test, the test is in the wrong layer. Behavioural tests survive refactors. Structural tests are deliberately tied to file content.

3. dbt unit test syntax (dbt 1.8+)

Stored in the model's schema.yml alongside existing column tests and source declarations:

unit_tests:
  - name: dedicated_warehouse_attributes_full_warehouse_credits
    model: snowflake_credit_daily
    given:
      - input: source('account_usage', 'warehouse_metering_history')
        rows:
          - {start_time: '2026-04-29 10:00', warehouse_name: 'TENANT_A_WH',
             credits_used: 42, credits_used_compute: 40}
      - input: source('account_usage', 'query_history')
        rows:
          - {start_time: '2026-04-29 10:00', warehouse_name: 'TENANT_A_WH',
             query_tag: 'tenant-A', credits_used_cloud_services: 7}
    expect:
      rows:
        - {tenant_id: 'tenant-A', attribution_mode: 'dedicated', attributed_credits: 42}

dbt 1.8+ runs unit tests as part of dbt build by default. No new CI step is required. The existing pnpm dbt:run pipeline step in each module requires no change.

4. dbt singular tests vs dbt unit tests

Both are valid dbt test types with different scopes:

  • Unit tests (schema.yml unit_tests: block) — per-model behavioural assertions against fixture rows. Appropriate for CASE expression branches, JOIN classification logic, aggregation arithmetic.
  • Singular tests (tests/<name>.sql files that must return 0 rows) — cross-table consistency assertions. Appropriate for FR-227-style reconciliation checks (e.g., CDC row count matches transaction log), referential integrity checks that span multiple models, or any assertion requiring a full-table scan against real data.

Both run as part of dbt build. Use singular tests when the assertion requires real deployed data or a cross-model JOIN; use unit tests when the assertion can be proven with a small fixture.

5. Coverage guidance

There is no minimum dbt unit test count enforced at the repository level. The expectation per module is:

  • Every non-trivial CASE expression or conditional aggregation in a published view or Dynamic Table has at least one unit test covering each significant branch (true / false, boundary conditions).
  • Every CALC-mode or AUTO-mode policy obligation that depends on SQL arithmetic has a corresponding dbt unit test. Vitest policy tests for these obligations are removed or narrowed to structural assertions only.
  • Every published view in the module's schema has at least one dbt schema test on its primary key and key NOT NULL columns.

Module-level acceptance criteria in the module's design doc record the specific dbt test files that satisfy each CALC/AUTO policy.

6. dbt unit tests as published contract

A dbt unit test's expect: rows are part of the published view contract for downstream consumers, consistent with ADR-046 §3 (view-as-product). A change to a model that causes an existing unit test to fail is treated as a breaking change to that view's contract and requires a version bump or explicit migration notice to downstream consumers.

7. Reference implementation

MOD-038 (data quality monitor, built 2026-05-01) is the first SD06 module built with this layering from V1. Its test structure serves as the canonical reference:

  • tests/unit/ — empty (MOD-038 has no Lambda code).
  • tests/policy/ — file-structural assertions only (REP-005 REVOKE/immutability grant present; DT-004 threshold sourced from config table; GOV-006 CREATE OR REPLACE idempotency).
  • tests/integration/ — deployed Snowflake/AWS object existence and Alert state.
  • dbt/models/MOD-038-data-quality-monitor/schema.yml — schema tests, dbt unit tests (per-model behavioural), source freshness thresholds.

8. Retrofit scope

MOD-085 and MOD-098 contain policy tests that must be split into dbt unit tests (behaviour) and narrowed Vitest policy tests (structure). This retrofit is bundled with the ADR-049 Snowflake Alert refactor pass — not an independent pass.

File Behaviour tests to move to dbt Structural tests to keep in Vitest policy
MOD-098/tests/policy/REP-001-calc-correctness.test.ts Dedicated/proportional attribution arithmetic, TENANT_ prefix classification, untagged fallback classification Materialization config, Dynamic Table target_lag, threshold sourced from config
MOD-085/tests/policy/CLQ-002-calc-correctness.test.ts BKBM carry-forward streak count behaviour, tenor-jurisdiction whitelist pass/fail DDL idempotency (CREATE OR REPLACE), REVOKE present

Consequences

Positive: - Behavioural tests are refactor-resilient. A semantically equivalent SQL rewrite no longer breaks tests incorrectly. - CALC-mode and AUTO-mode policy obligations are now verifiably asserted by executing the SQL, not by inspecting text. The policy test claim is honest. - One fewer mental model: dbt unit tests use the same YAML authoring surface as column tests and source declarations. No TypeScript fixture wiring required. - dbt unit test failures are caught at dbt build time, which runs in CI on every PR before a deploy is possible.

Negative: - dbt unit tests cost a small amount of warehouse compute per CI run (each test creates and queries a temp table). This is proportionate to the confidence gained. - Existing modules (MOD-085, MOD-098) require a one-time test refactor. Deferred to the ADR-049 retrofit pass — not blocking any active build. - dbt unit tests are Snowflake-side only: they run in a Snowflake session, not in a local Node.js process. Contributors need a Snowflake connection to run them locally, though CI always has one.

Alternatives considered

Keep Vitest regex tests. Rejected. The anti-pattern gives false assurance for CALC-mode policy obligations and breaks on refactors that do not change correctness. Not acceptable as a long-term quality strategy.

Skip behavioural testing of dbt SQL entirely. Rejected. Untested SQL arithmetic in CALC-mode policy obligations (FR-394, FR-396, FR-384) leaves the bank exposed to silent calculation errors that pass all existing tests. The risk is proportionate to the financial impact of a wrong number in a risk or cost report.

SQLFluff or Pytest-dbt-utils for behavioural assertions. Considered. dbt's native unit tests (1.8+) are the first-party solution, require no additional tooling, and run within the existing dbt build pipeline. No third-party dependency is needed.


All ADRs Compiled 2026-05-22 from source/entities/adrs/ADR-050.yaml