MOD-017 — ML Behavioural Scoring (Path C: ADR-047 reference write-back)¶

Repo: bank-aml Module dir: MOD-017-ml-behavioural-scoring/ Bus: bank-aml (publishes nothing; consumed by MOD-016 via direct DB read)

What this module does¶

Pre-computes a 0–1000 behavioural risk score per party per model version in Snowflake Cortex, writes the score back to Postgres aml.behavioural_scores so MOD-016's rule path can read it on the real-time evaluation hot path without a Snowflake round-trip, and records every model lifecycle event (CHALLENGER_DEPLOYED, PROMOTED_TO_CHAMPION, ROLLED_BACK, RETIRED) in aml.model_deployment_history for FR-112 evidence and DT-005 governance.

Why this architecture¶

Snowflake Cortex inference is the right tool for behavioural scoring, but it cannot meet FR-112's real-time NFR (≤200ms on the rule path). The resolution is ADR-047 reference write-back: Cortex runs in batch (1- minute target lag), a Snowflake Task forwards new scores to a Lambda via an EXTERNAL FUNCTION, the Lambda writes the score to Postgres, and MOD-016's evaluator reads aml.behavioural_scores on the hot path with no cross-system call.

This is the same pattern MOD-085 uses for SD05 reference data.

Hot-path contract with MOD-016¶

MOD-016's evaluator queries:

SELECT score, risk_tier, model_version, model_role, scored_at, valid_until
  FROM aml.behavioural_scores
 WHERE party_id = $1
   AND model_role = 'CHAMPION'
   AND valid_until > now()
 ORDER BY scored_at DESC
 LIMIT 1;

Indexed by idx_behavioural_scores_party_champion (partial, WHERE model_role='CHAMPION'). Single-row read. No external call.

valid_until is NOT NULL (per wiki SD03 spec) — every row carries an explicit freshness boundary. The MOD-016 reader treats stale rows (valid_until <= now()) as misses and falls back to rule-only.

The contract test at tests/contract/mod-016-reader-shape.test.ts pins the shape — if MOD-017 ever drops or renames a column MOD-016 depends on, that test fails first.

Data model — owned tables¶

Both tables are append-only, defended by the canonical aml.fn_immutable_row() SECURITY DEFINER trigger function (per ADR-048). Created in V001__behavioural_scores_and_deployment_history.sql.

`aml.behavioural_scores`¶

Column	Type	Notes
id	uuid PK	DB default
party_id	uuid	indexed
model_version	text	non-empty, ≤64 chars
model_role	text	CHECK IN ('CHAMPION','CHALLENGER')
score	numeric(5,0)	CHECK 0..1000
risk_tier	text	CHECK IN (LOW,MEDIUM,HIGH,CRITICAL)
feature_vector_hash	text	SHA-256 hex of input vector — FR-111 traceability
score_reasons	jsonb	array of reason codes
scored_at	timestamptz	event time at the model
valid_until	timestamptz NOT NULL	CHECK >= scored_at. Derived by the Lambda as `scored_at + validity_hours` — see `validity-window` service below; not in the inbound payload.
triggered_by	text	CHECK IN (SCHEDULED,EVENT)
source_event_id	text NULL	for triggered_by=EVENT
created_at	timestamptz	DB default

UNIQUE (party_id, model_version, scored_at) — provides INSERT ... ON CONFLICT DO NOTHING idempotency for the write-back Lambda under at-least-once Snowflake-Task delivery.
Partial index on CHAMPION rows only — keeps the MOD-016 hot-path lookup tight.

`aml.model_deployment_history`¶

Column names follow bank-wiki SD03. Two additions (model_role, trace_id) are V001 extensions tracked for wiki update via docs/handoffs/MOD-017-wiki-corrections.handoff.md.

Column	Type	Notes
id	uuid PK	DB default
model_version	text
model_role	text NOT NULL	CHECK IN (CHAMPION,CHALLENGER). V001 extension — orthogonal to event_type for DT-005 audit queries.
previous_model_version	text NULL	Required for PROMOTED_TO_CHAMPION + ROLLED_BACK (chk_mdh_previous_version_required)
event_type	text	CHECK IN (CHALLENGER_DEPLOYED, PROMOTED_TO_CHAMPION, ROLLED_BACK, RETIRED)
effective_at	timestamptz	event time
deployed_by	text NOT NULL	Staff ID or automation principal that triggered the change
change_reason	text NOT NULL	Required human-readable rationale (DT-005 LOG)
champion_metrics	jsonb NULL	Precision/recall/AUC. Populated only on PROMOTED_TO_CHAMPION; NULL on the other three event types (chk_mdh_promotion_carries_metrics)
trace_id	uuid NOT NULL	ADR-031 — every audit-class row carries the runbook-originating trace_id. V001 extension.
created_at	timestamptz	DB default

CHECK constraints (ADR-048):

chk_mdh_created_at_after_effective (Cat 1) — created_at must be ≥ effective_at − 1 minute (rows are recorded as the event takes effect).
chk_mdh_previous_version_required (Cat 1) — PROMOTED_TO_CHAMPION and ROLLED_BACK must name the version they replace.
chk_mdh_promotion_carries_metrics (Cat 2) — PROMOTED_TO_CHAMPION rows must carry champion_metrics; non-promotion rows must omit it. Both directions enforced (Zod refine + DB CHECK).

One row written per event — the orchestrator's quality gate is explicit: a promotion writes one row, the next ROLLED_BACK writes a second row. The integration test fr-112-rollback.test.ts verifies both rows persist and the rollback's effective_at is within the 30-minute window of the prior promotion.

Components¶

1. Score-writeback Lambda (`src/handlers/score-writeback.ts`)¶

Trigger: Snowflake Task → API Integration → EXTERNAL FUNCTION → API Gateway → Lambda.
Body: Snowflake batch envelope { data: [[idx, row], ...] }.
Per row: Zod parse → assertScoreSemantics() → INSERT with ON CONFLICT DO NOTHING → response.
Per-row failures don't fail the batch — Snowflake gets per-row error envelopes back.
Idempotency: UNIQUE (party_id, model_version, scored_at).
Coverage: idempotency integration test plants the same payload twice → exactly one row in the table.

2. Deployment-event Lambda (`src/handlers/deployment-event.ts`)¶

Trigger: model-deployment runbook → API Gateway POST.
Body: single DeploymentEventPayload (Zod).
Idempotent over ${model_version}:${event_type}:${effective_at} via aml.idempotency_keys.
For ROLLED_BACK: looks up the prior PROMOTED_TO_CHAMPION row and computes rollback elapsed time; flags > 30 minutes as out_of_rollback_window=true but never rejects (a delayed rollback in a major incident must still be recorded — the audit trail is what matters).
One row written per event, every time. CHALLENGER_DEPLOYED writes a row. RETIRED writes a row. PROMOTED_TO_CHAMPION writes a row. ROLLED_BACK writes a row.

3. Pure-logic services¶

services/score-writer.ts — INSERT ... ON CONFLICT DO NOTHING + replay re-read. Takes the Lambda-derived validUntil: Date as a third argument.
services/deployment-writer.ts — single INSERT into aml.model_deployment_history. Column names match the wiki spec exactly.
services/score-validator.ts — semantic checks (future-drift on scored_at, tier/score consistency). valid_until is not validated here — it's no longer in the payload (see validity-window).
services/deployment-validator.ts — required-field checks (previous_model_version for PROMOTED + ROLLED_BACK) + 30-minute rollback window flag.
services/validity-window.ts — resolves the model freshness window from /bank/{stage}/mod-017/model/validity-hours (default 24h on miss) and derives valid_until = scored_at + window. Single source of truth so the window can be retuned via SSM without redeploying Snowflake or the Lambda. Cached for the Lambda instance lifetime.

ADR-048 invariants applied from day one¶

V001 contains no follow-up V002 — every invariant is in the first migration (this is what feedback memory feedback_adr-048_invariants.md asks for in new modules):

Invariant	Mechanism
Append-only `aml.behavioural_scores`	trg_behavioural_scores_immutable → aml.fn_immutable_row() SECURITY DEFINER
Append-only `aml.model_deployment_history`	trg_model_deployment_history_immutable → same
score in [0, 1000]	CHECK on column (Cat 1)
valid_until ≥ scored_at	CHECK chk_behavioural_scores_valid_until (Cat 1)
valid_until NOT NULL	column constraint (Cat 1) — Lambda-derived, never NULL
event_type ∈ valid set	CHECK on column
previous_model_version required for PROMOTED/ROLLED_BACK	CHECK chk_mdh_previous_version_required (Cat 1)
created_at ≥ effective_at - 1 minute	CHECK chk_mdh_created_at_after_effective (Cat 1)
champion_metrics required iff PROMOTED_TO_CHAMPION	CHECK chk_mdh_promotion_carries_metrics (Cat 2) + Zod refine
INSERT-only grants	bank_aml_app_user has SELECT+INSERT only (no UPDATE, no DELETE)

Negative test for every invariant: tests/integration/adr-048-invariants.test.ts.

FR/policy mapping¶

FR	Mechanism
FR-109 — score 0..1000	Cortex output, validated by Zod + DB CHECK
FR-110 — alert if score ≥ threshold (default 750)	MOD-016 reads score; threshold is in MOD-016's AppConfig
FR-111 — model_version + feature_vector_hash	Stored on every score row
FR-112 — rollback within 30 minutes	Deployment-event Lambda + ROLLED_BACK row + rollback-elapsed semantic + `RollbackOutOfWindow` metric + alarm

Policy	Mode	Mechanism
AML-001	LOG	aml.behavioural_scores + aml.model_deployment_history immutability
AML-005	AUTO	Source-scan: no `skip_score`, `bypass_model`, `disable_model` tokens
DT-005	LOG	aml.model_deployment_history immutability + previous_version CHECK

Tests¶

Tier	Files	Count	Coverage
Unit	tests/unit/*	49	pure-logic services, payload schemas, lib (logger, errors, trace, emf)
Contract	tests/contract/mod-016-reader-shape.test.ts	4 (in unit run)	reader-shape pin between MOD-017 write side + MOD-016 read side; asserts `valid_until` is NOT in the input payload
Integration	tests/integration/{adr-048,score-writeback-idempotency,fr-112-rollback}.test.ts	17	DB-gated; assert triggers fire, INSERT...ON CONFLICT collapses replays, FR-112 rollback path end-to-end, valid_until NOT NULL + chk_mdh_promotion_carries_metrics
Policy	tests/policy/{pol-aml-001,pol-aml-005,pol-dt-005}.test.ts	8 (1 + 4 + 3)	source-scan + DB-gated immutability/governance

Local result: 53 unit/contract pass, 1 AML-005 source-scan policy pass, 24 DB-gated tests skipped correctly. Coverage 93.15% on the unit-only scope (above 80% gate).

SSM outputs¶

Path	Value	Consumed by
`/bank/{env}/mod017/api/base-url`	API Gateway base URL	Snowflake API Integration; runbook
`/bank/{env}/mod017/lambda/score-writeback/arn`	Lambda ARN	(reference — not invoked outside Snowflake)
`/bank/{env}/mod017/lambda/deployment-event/arn`	Lambda ARN	(reference)

SSM inputs¶

Path	Source
`/bank/{env}/iam/lambda/bank-aml/arn`	MOD-104
`/bank/{env}/observability/adot-layer-arn`	MOD-076
`/bank/{env}/network/vpc-id`	MOD-104
`/bank/{env}/network/private-subnet-ids`	MOD-104
`/bank/{env}/neon/pooler-host`	MOD-103
`/bank/{env}/sns/alerts/arn`	MOD-104
`/bank/{env}/eventbridge/bank-aml/dlq-arn`	MOD-104
`/bank/{env}/mod-017/model/validity-hours`	Operations — sets the freshness window for `valid_until` derivation. Default 24h on miss; the validity-window service degrades gracefully without it. Type: numeric string (hours).

Dependencies + cross-cutting¶

Module	Role
MOD-104	BankAmlRole, VPC, SNS alerts topic
MOD-103	bank_aml database, bank_aml_app_user/migrate_user secrets
MOD-076	ADOT layer
MOD-016	Reads `aml.behavioural_scores` on the rule path
MOD-102	Snowflake API Integration + warehouse access (pre-req for the write-back chain)
MOD-150	Provisions the Snowflake-side DDL (infra/snowflake/cortex_dynamic_table.sql is the contract)

What was NOT delivered¶

Cognito sub-group policy on the deployment-event route — v1 relies on MOD-075's bank-admin/compliance JWT scope at the gateway edge. v2 may tighten to a dedicated aml-model-runbook group.
Per-row bank.aml.score_published event — v1 publishes nothing (per ADR-047 the data IS the contract). v2 may add an event for observability dashboards if MOD-076 needs it.
Snowflake-side IaC — the DDL in infra/snowflake/ is reference; MOD-150 owns provisioning.

Runbook references¶

Promotion: runbook POSTs to /internal/v1/deployment-events with event_type=PROMOTED_TO_CHAMPION, previous_model_version=<old-champion>, champion_metrics={...} (precision/recall/AUC — required), and change_reason (non-empty).
Rollback: same endpoint with event_type=ROLLED_BACK and previous_model_version=<the-version-we-roll-forward-to>. Must complete within 30 minutes of the prior promotion to satisfy FR-112's window — the RollbackOutOfWindow CloudWatch alarm fires if not. The Lambda still records the row (audit trail must capture the event).
Challenger deployed: event_type=CHALLENGER_DEPLOYED, no previous_model_version, no champion_metrics.
Retired: event_type=RETIRED, no previous_model_version required, no champion_metrics.