MOD-039 — Customer risk score model

System: SD06 Snowflake risk platform
Type: Snowflake-native (Python UDF + DT + Stream/Task) + Lambda for EB publication
Status: Built, awaiting deploy

XGBoost classifier scoring each customer's AML/financial-crime risk on a 0–100 scale with a derived tier (LOW / MEDIUM / HIGH / CRITICAL). Two write paths feed the same logical row in Snowflake; a polling Lambda fans out tier changes to the bank-risk-platform EventBridge bus.

Requirements

| FR | Implementation |
| --- | --- |
| FR-229 | customer_risk_scores Snowflake Dynamic Table, 15-min target lag, calls score_customer Python UDF over per-party features from raw_cdc_kyc.* and raw_cdc_core.*. |
| FR-230 | t_score_event_driven Task on streams over raw_cdc_kyc.parties (CDD/PEP) and raw_cdc_kyc.sanctions_results (HITs). 1-minute schedule + 1-minute Lambda poll → consumers receive event in <60s of the underlying CDC delta. Writes to score_overlay; UNION-merged into v_current_scores. |
| FR-231 | NOT owned by MOD-039. Each consumer domain (MOD-010 KYC, MOD-016/017 AML) maintains a Neon mirror table populated from the customer_risk_score_updated EB event; the consumer's existing read API satisfies the 5ms p99 against its local mirror. |
| FR-232 | score_history append-only table, INSERTs from both Tasks. score_version per row joins to model_versions for reproducibility. 7-year retention via storage policy in V2. |
| NFR-021 | Single-row Python UDF call is <50ms after warehouse warm-up; well under 200ms. |
| NFR-024 | Append-only enforced by absence of UPDATE/DELETE/TRUNCATE grants on score_history and model_versions. Verified by tests/integration/snowflake-immutability.test.ts via SHOW GRANTS. |
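
The NFR-024 check can be illustrated with a small Python sketch. It is not the content of tests/integration/snowflake-immutability.test.ts (which is TypeScript and not shown here); it only mirrors the idea of scanning SHOW GRANTS output for privileges that would allow rewriting history. The function name and the sample rows are hypothetical, and the sketch does not treat OWNERSHIP specially.

```python
# Privileges that would let a role rewrite history (per NFR-024).
FORBIDDEN_PRIVILEGES = frozenset({"UPDATE", "DELETE", "TRUNCATE"})


def forbidden_grants(show_grants_rows):
    """Return (privilege, grantee_name) pairs that break the append-only rule.

    Each row is a dict keyed by SHOW GRANTS column names; only the
    `privilege` and `grantee_name` columns are read.
    """
    return [
        (row["privilege"].upper(), row["grantee_name"])
        for row in show_grants_rows
        if row["privilege"].upper() in FORBIDDEN_PRIVILEGES
    ]


# Hypothetical SHOW GRANTS ON TABLE output, reduced to the two columns used.
rows = [
    {"privilege": "INSERT", "grantee_name": "BANK_NONPROD_RISK_ROLE"},
    {"privilege": "SELECT", "grantee_name": "BANK_NONPROD_RISK_ROLE"},
]
violations = forbidden_grants(rows)
```

An empty result for both score_history and model_versions is what the append-only requirement expects.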

Policy obligations

| Code | Mode | How |
| --- | --- | --- |
| AML-002 | AUTO | UDF takes cdd_tier as a feature; score change publishes EB event; MOD-010 re-evaluates CDD tier from the event payload. Test: tests/policy/AML-002-auto.test.ts. |
| AML-005 | AUTO | UDF tier thresholds produce HIGH ≥50, CRITICAL ≥75; event-driven Task + 1-min Lambda poll meets <60s SLA so MOD-016/017 places customer in enhanced monitoring promptly. Test: tests/policy/AML-005-auto.test.ts. |
| DT-005 | LOG | model_versions inventory table + score_history per-row score_version. Quarterly review joins these. Tests: tests/policy/DT-005-log.test.ts + runtime SHOW GRANTS in tests/integration/snowflake-immutability.test.ts. |
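
As a rough illustration of the AML-005 thresholds, the tier derivation might look like the sketch below. HIGH ≥50 and CRITICAL ≥75 are taken from the table above; the LOW/MEDIUM boundary is not stated on this page, so `MEDIUM_MIN` is a hypothetical placeholder, as is the function name.

```python
# HIGH and CRITICAL cut-offs come from the AML-005 row.
HIGH_MIN = 50.0
CRITICAL_MIN = 75.0
# ASSUMPTION: the LOW/MEDIUM boundary is not documented here; 25 is a placeholder.
MEDIUM_MIN = 25.0


def derive_tier(score: float) -> str:
    """Map a 0-100 composite risk score to a tier label."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"score out of range: {score}")
    if score >= CRITICAL_MIN:
        return "CRITICAL"
    if score >= HIGH_MIN:
        return "HIGH"
    if score >= MEDIUM_MIN:
        return "MEDIUM"
    return "LOW"
```

With these cut-offs a score of 67.2 lands in HIGH and a score of exactly 75 lands in CRITICAL, since both boundaries are inclusive.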

Snowflake objects

| Object | Type | Owner | Notes |
| --- | --- | --- | --- |
| RISK_CUSTOMER | schema | deploy role | Owned by BANK_{NONPROD\|PROD}_RISK_ROLE. |
| RISK_CUSTOMER.SCORE_HISTORY | table | deploy role | Append-only audit (FR-232). 7-year retention. |
| RISK_CUSTOMER.MODEL_VERSIONS | table | deploy role | DT-005 model inventory. INSERT-only at runtime. |
| RISK_CUSTOMER.MODEL_ARTEFACTS | internal stage | deploy role | Holds model-<version>.pkl + .model_card.json. SNOWFLAKE_SSE encryption. |
| RISK_CUSTOMER.SCORE_OVERLAY | table | deploy role | FR-230 event-driven writes; UNION-merged into v_current_scores. |
| RISK_CUSTOMER.EB_PUBLISH_CURSOR | table | deploy role | Single-row state for the EB publisher Lambda. |
| RISK_CUSTOMER.SCORE_CUSTOMER | Python UDF | deploy role | xgboost 2.0 / sklearn 1.5 / numpy 1.26. IMPORTS model-<version>.pkl. |
| RISK_CUSTOMER.STREAM_PARTIES_CHANGES | stream | deploy role | On raw_cdc_kyc.parties. Bootstrap-resilient. |
| RISK_CUSTOMER.STREAM_SANCTIONS_HITS | stream | deploy role | On raw_cdc_kyc.sanctions_results. Bootstrap-resilient. |
| RISK_CUSTOMER.STREAM_DT_REFRESH | stream | dbt role | On customer_risk_scores DT. Drives history-writer Task. |
| RISK_CUSTOMER.T_SCORE_EVENT_DRIVEN | task | deploy role | 1-min schedule, fires on stream content. Writes overlay + history. |
| RISK_CUSTOMER.T_SCORE_DT_HISTORY_WRITER | task | deploy role | 5-min schedule, copies DT changes to score_history. |
| RISK_CUSTOMER.CUSTOMER_RISK_SCORES | dynamic table | dbt role | dbt-built. 15-min target lag. FR-229. |
| RISK_CUSTOMER.V_CURRENT_SCORES | view | dbt role | Published contract. UNIONs DT + overlay; ROW_NUMBER picks newest. |
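
The "UNION + newest per party" behaviour of V_CURRENT_SCORES can be sketched in Python as follows. This is an in-memory analogue of the view, not its SQL; the row shape (dicts with `party_id` and `scored_at`) and the tie-breaking rule (first row seen wins on equal timestamps) are assumptions.

```python
from datetime import datetime, timezone


def current_scores(dt_rows, overlay_rows):
    """Merge both write paths and keep the newest row per party_id."""
    newest = {}
    for row in [*dt_rows, *overlay_rows]:
        held = newest.get(row["party_id"])
        # Strictly newer rows replace the held one; ties keep the first seen.
        if held is None or row["scored_at"] > held["scored_at"]:
            newest[row["party_id"]] = row
    return newest


def utc(hour, minute):
    """Helper for the illustrative timestamps below."""
    return datetime(2026, 5, 2, hour, minute, tzinfo=timezone.utc)


# Hypothetical rows: the overlay holds a fresher score for p-1 than the DT.
dt_rows = [
    {"party_id": "p-1", "composite_risk_score": 40.0, "scored_at": utc(8, 0)},
    {"party_id": "p-2", "composite_risk_score": 12.5, "scored_at": utc(8, 0)},
]
overlay_rows = [
    {"party_id": "p-1", "composite_risk_score": 67.2, "scored_at": utc(8, 5)},
]
merged = current_scores(dt_rows, overlay_rows)
```

Here `merged["p-1"]` carries the overlay score because it is newer, while `merged["p-2"]` still comes from the dynamic table.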

SSM outputs

| Path | Value | Consumer |
| --- | --- | --- |
| /bank/{env}/risk-platform/risk-customer/current-scores-view | RISK_CUSTOMER.V_CURRENT_SCORES | MOD-010, MOD-016/017 (mirror writers query this view) |
| /bank/{env}/risk-platform/risk-customer/score-history-table | RISK_CUSTOMER.SCORE_HISTORY | Quarterly model validation reviewers |
| /bank/{env}/risk-platform/risk-customer/event-source-name | bank.risk-platform | EB rule subscribers (MOD-010 KYC, MOD-016/017 AML) |
| /bank/{env}/risk-platform/risk-customer/event-detail-type | customer_risk_score_updated | EB rule subscribers |
| /bank/{env}/risk-platform/risk-customer/publisher-lambda-arn | Lambda ARN | Observability dashboards |
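
A consumer could resolve these outputs with something like the sketch below. The helper names and the `nonprod` environment value are hypothetical; `get_value` is an injected lookup so the example runs without AWS access. With boto3, one plausible implementation of `get_value` is `lambda path: ssm.get_parameter(Name=path)["Parameter"]["Value"]`.

```python
SSM_PREFIX = "/bank/{env}/risk-platform/risk-customer/"
KEYS = (
    "current-scores-view",
    "score-history-table",
    "event-source-name",
    "event-detail-type",
    "publisher-lambda-arn",
)


def ssm_path(env, key):
    """Build the full parameter path for one documented output."""
    if key not in KEYS:
        raise KeyError(key)
    return SSM_PREFIX.format(env=env) + key


def load_contract(env, get_value):
    """Fetch every documented output through the supplied lookup function."""
    return {key: get_value(ssm_path(env, key)) for key in KEYS}


# Illustrative in-memory store standing in for Parameter Store.
fake_store = {
    ssm_path("nonprod", "event-source-name"): "bank.risk-platform",
    ssm_path("nonprod", "event-detail-type"): "customer_risk_score_updated",
}
contract = load_contract("nonprod", lambda path: fake_store.get(path))
```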

EventBridge contract

Source: bank.risk-platform
Detail-type: customer_risk_score_updated
Detail schema:

{
  "party_id": "p-12345",
  "composite_risk_score": 67.2,
  "risk_tier": "HIGH",
  "previous_risk_tier": "MEDIUM",
  "tier_changed": true,
  "score_version": "v1-abc123def456",
  "triggering_event": "sanctions_hit",
  "scored_at": "2026-05-02T08:00:00.000Z"
}
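
For illustration, a row can be turned into an EventBridge PutEvents entry roughly as follows. The entry keys (`Source`, `DetailType`, `Detail`, `EventBusName`) are the standard PutEvents fields; treating every field of the schema above as required and using `bank-risk-platform` as the literal bus name are assumptions, and the function name is hypothetical.

```python
import json

# ASSUMPTION: all fields in the documented detail schema are mandatory.
REQUIRED_FIELDS = (
    "party_id",
    "composite_risk_score",
    "risk_tier",
    "previous_risk_tier",
    "tier_changed",
    "score_version",
    "triggering_event",
    "scored_at",
)


def to_put_events_entry(detail, bus_name="bank-risk-platform"):
    """Wrap a detail dict in the shape PutEvents expects for one entry."""
    missing = [name for name in REQUIRED_FIELDS if name not in detail]
    if missing:
        raise ValueError(f"detail is missing fields: {missing}")
    return {
        "Source": "bank.risk-platform",
        "DetailType": "customer_risk_score_updated",
        "Detail": json.dumps(detail),  # PutEvents takes Detail as a JSON string
        "EventBusName": bus_name,
    }


sample = {
    "party_id": "p-12345",
    "composite_risk_score": 67.2,
    "risk_tier": "HIGH",
    "previous_risk_tier": "MEDIUM",
    "tier_changed": True,
    "score_version": "v1-abc123def456",
    "triggering_event": "sanctions_hit",
    "scored_at": "2026-05-02T08:00:00.000Z",
}
entry = to_put_events_entry(sample)
```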

Publication trigger: EB publisher Lambda runs every 1 min, queries v_current_scores WHERE scored_at > cursor AND tier_changed = TRUE, publishes one event per row.
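
A minimal sketch of one publisher invocation, assuming the rows have already been read from v_current_scores and the cursor from EB_PUBLISH_CURSOR. The function name, the `put_event` callback, and the oldest-first ordering are illustrative choices rather than details confirmed by this page.

```python
from datetime import datetime, timezone


def publish_pending(rows, cursor, put_event):
    """Publish one event per tier-changed row newer than the cursor.

    Returns the new cursor value. The caller persists it only after this
    function returns, so a crash mid-batch republishes on the next run.
    """
    pending = sorted(
        (r for r in rows if r["scored_at"] > cursor and r["tier_changed"]),
        key=lambda r: r["scored_at"],
    )
    for row in pending:
        put_event(row)
        cursor = row["scored_at"]
    return cursor


def utc(hour, minute):
    """Helper for the illustrative timestamps below."""
    return datetime(2026, 5, 2, hour, minute, tzinfo=timezone.utc)


# Hypothetical view contents: only p-3 is both newer and tier-changed.
rows = [
    {"party_id": "p-1", "scored_at": utc(7, 58), "tier_changed": True},
    {"party_id": "p-2", "scored_at": utc(8, 1), "tier_changed": False},
    {"party_id": "p-3", "scored_at": utc(8, 2), "tier_changed": True},
]
published = []
new_cursor = publish_pending(rows, utc(8, 0), published.append)
```

In this run only p-3 is published, and the returned cursor moves to its `scored_at`.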

Idempotency: subscribers MUST be idempotent on (party_id, scored_at) — a Lambda crash before cursor update will republish the same rows on the next invocation.
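
On the consumer side, deduplicating on (party_id, scored_at) might look like the sketch below. It keeps state in memory purely for illustration; a real mirror writer would presumably rely on a key or constraint in its own table. The class and method names are hypothetical.

```python
class MirrorWriter:
    """Applies customer_risk_score_updated events at most once per key."""

    def __init__(self):
        self.seen = set()   # (party_id, scored_at) keys already applied
        self.mirror = {}    # party_id -> latest applied event detail

    def handle(self, detail):
        """Apply the event; return False if it was a redelivery."""
        key = (detail["party_id"], detail["scored_at"])
        if key in self.seen:
            return False
        self.seen.add(key)
        self.mirror[detail["party_id"]] = detail
        return True


writer = MirrorWriter()
event = {
    "party_id": "p-12345",
    "risk_tier": "HIGH",
    "scored_at": "2026-05-02T08:00:00.000Z",
}
first = writer.handle(event)
second = writer.handle(event)  # simulated republish after a Lambda crash
```

The second delivery is recognised by its key and leaves the mirror unchanged.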

Architecture diagram

       ┌───────────────────────────────────────────────────────────┐
       │  raw_cdc_kyc.{parties, sanctions_results}                  │
       │  raw_cdc_core.postings                                     │
       │  (External Iceberg, owned by MOD-042)                      │
       └─────────────┬─────────────────────────────┬────────────────┘
                     │                             │
        ┌────────────┴──────────┐         ┌────────┴────────┐
        │ DT customer_risk_     │         │ Stream parties_  │
        │ scores                │         │ changes +        │
        │ (15-min, FR-229)      │         │ sanctions_hits   │
        │ → score_customer UDF  │         │ (FR-230 source)  │
        └────────────┬──────────┘         └────────┬────────┘
                     │                             │
                     │                             ▼
                     │                  ┌───────────────────┐
                     │                  │ Task t_score_event_│
                     │                  │ driven (1 min) →  │
                     │                  │ score_overlay +   │
                     │                  │ score_history     │
                     │                  └─────────┬─────────┘
                     │                            │
                     ▼                            │
       ┌─────────────────────────┐                │
       │ Stream stream_dt_refresh│                │
       │ → Task dt_history_writer│                │
       │ → score_history         │                │
       └─────────────────────────┘                │
                     │                            │
                     └─────────────┬──────────────┘
                       ┌───────────────────┐
                       │ v_current_scores  │
                       │ (UNION + newest-  │
                       │  per-party)       │
                       └─────────┬─────────┘
                  ┌──────────────┴───────────────┐
                  ▼                              │
       ┌───────────────────┐                     │
       │ EB publisher      │                     │
       │ Lambda (1 min,    │                     │
       │ rate(1 minute))   │                     │
       │ → PutEvents       │                     │
       └─────────┬─────────┘                     │
                 │                               │
                 ▼                               │
       ┌────────────────────────┐                │
       │ bank.risk-platform/    │                │
       │ customer_risk_score_   │                │
       │ updated (EB)           │                │
       └─────────┬──────────────┘                │
                 │                               │
       ┌─────────┴─────────┐                     │
       ▼                   ▼                     │
       MOD-010 KYC mirror  MOD-016/017 AML       │
       (consumer-side)     mirror (consumer-     │
                           side)                 │
       quarterly model validation ───────────────┘
       (joins score_history to model_versions)

V2 / follow-ups

  1. Real fraud-outcome retraining. The V1 model is fitted on synthetic data with hand-tuned distributions. After 6 months of MOD-016/017 confirmed/cleared flags accumulate via CDC, retrain against score_history joined to aml.aml_alerts outcomes. The training script is then replaced (not edited).
  2. Dedicated publisher role. The EB publisher Lambda currently authenticates as the schema owner role (BANK_{NONPROD|PROD}_RISK_ROLE). Create a minimal-privilege BANK_RISK_PLATFORM_PUBLISHER_ROLE in MOD-102 with SELECT on v_current_scores + UPDATE on EB_PUBLISH_CURSOR only.
  3. Storage policy for 7-year retention on score_history. V1 leaves the table untouched; volume is well within budget.
  4. Adverse media feature source. V1 emits adverse_media_score = 0.0 because no source exists. Add a raw_cdc_compliance.adverse_media_screening pipeline (MOD-038 follow-up?) and wire it into the feature-engineering block in customer_risk_scores.sql.
  5. Geographic risk DB. The hand-coded country list in customer_risk_scores.sql is a placeholder. Replace with a managed DT in risk_aml.geographic_risk_classifications.
  6. Wiki updates pending (orchestrator-owned, per 2026-05-02 brief):
     - SD02 data model: add bank_kyc party.risk_scores_mirror.
     - SD03 data model: add bank_aml aml.risk_scores_mirror.
     - MOD-039 yaml: add MOD-079 to dependencies (or document the direct-subscriber pattern that supersedes it).