MOD-039: Customer risk score model
System: SD06 Snowflake risk platform
Type: Snowflake-native (Python UDF + DT + Stream/Task) + Lambda for EB publication
Status: Built, awaiting deploy
XGBoost classifier scoring each customer's AML/financial-crime risk on a 0–100 scale with a derived tier (LOW / MEDIUM / HIGH / CRITICAL). Two write paths feed the same logical row in Snowflake; a polling Lambda fans out tier changes to the bank-risk-platform EventBridge bus.
Requirements
| FR | Implementation |
|---|---|
| FR-229 | customer_risk_scores Snowflake Dynamic Table, 15-min target lag, calls score_customer Python UDF over per-party features from raw_cdc_kyc.* and raw_cdc_core.*. |
| FR-230 | t_score_event_driven Task on streams over raw_cdc_kyc.parties (CDD/PEP) and raw_cdc_kyc.sanctions_results (HITs). 1-minute schedule + 1-minute Lambda poll → consumers receive event in <60s of the underlying CDC delta. Writes to score_overlay; UNION-merged into v_current_scores. |
| FR-231 | NOT owned by MOD-039. Each consumer domain (MOD-010 KYC, MOD-016/017 AML) maintains a Neon mirror table populated from the customer_risk_score_updated EB event; the consumer's existing read API satisfies the 5ms p99 against its local mirror. |
| FR-232 | score_history append-only table, INSERTs from both Tasks. score_version per row joins to model_versions for reproducibility. 7-year retention via storage policy in V2. |
| NFR-021 | Single-row Python UDF call is <50ms after warehouse warm-up; well under 200ms. |
| NFR-024 | Append-only enforced by absence of UPDATE/DELETE/TRUNCATE grants on score_history and model_versions. Verified by tests/integration/snowflake-immutability.test.ts via SHOW GRANTS. |
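
The NFR-024 check can be illustrated with a short sketch. It assumes the grant rows have already been fetched (for example from SHOW GRANTS ON TABLE) and that each row exposes `privilege`, `name` and `grantee_name` keys; the function name and the role names in the sample data are hypothetical, and the real test lives in tests/integration/snowflake-immutability.test.ts.

```python
from typing import Iterable, Mapping

# Privileges that would let a role change or remove rows that already exist.
FORBIDDEN_PRIVILEGES = frozenset({"UPDATE", "DELETE", "TRUNCATE"})


def find_mutating_grants(
    grant_rows: Iterable[Mapping[str, str]],
) -> list[tuple[str, str, str]]:
    """Return (object name, grantee, privilege) for each grant that breaks append-only."""
    violations = []
    for row in grant_rows:
        privilege = row["privilege"].upper()
        if privilege in FORBIDDEN_PRIVILEGES:
            violations.append((row["name"], row["grantee_name"], privilege))
    return violations


# Hypothetical rows shaped like grant listings; the role names are made up.
sample_grants = [
    {"privilege": "INSERT", "name": "RISK_CUSTOMER.SCORE_HISTORY", "grantee_name": "EXAMPLE_WRITER_ROLE"},
    {"privilege": "SELECT", "name": "RISK_CUSTOMER.SCORE_HISTORY", "grantee_name": "EXAMPLE_REVIEWER_ROLE"},
    {"privilege": "DELETE", "name": "RISK_CUSTOMER.SCORE_HISTORY", "grantee_name": "EXAMPLE_ROGUE_ROLE"},
]

violations = find_mutating_grants(sample_grants)
```

An empty result means no role holds a mutating privilege on the table; any entry in `violations` should fail the immutability test.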
Policy obligations
| Code | Mode | How |
|---|---|---|
| AML-002 | AUTO | UDF takes cdd_tier as a feature; score change publishes EB event; MOD-010 re-evaluates CDD tier from the event payload. Test: tests/policy/AML-002-auto.test.ts. |
| AML-005 | AUTO | UDF tier thresholds produce HIGH ≥50, CRITICAL ≥75; event-driven Task + 1-min Lambda poll meets <60s SLA so MOD-016/017 places customer in enhanced monitoring promptly. Test: tests/policy/AML-005-auto.test.ts. |
| DT-005 | LOG | model_versions inventory table + score_history per-row score_version. Quarterly review joins these. Tests: tests/policy/DT-005-log.test.ts + runtime SHOW GRANTS in tests/integration/snowflake-immutability.test.ts. |
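
The tier derivation referenced under AML-005 might look like the sketch below. Only the HIGH (≥50) and CRITICAL (≥75) boundaries come from this page; the MEDIUM boundary of 25 is a placeholder assumption, and `risk_tier` is a hypothetical name rather than the UDF's actual function.

```python
def risk_tier(score: float, medium_threshold: float = 25.0) -> str:
    """Map a 0-100 composite risk score to a tier.

    HIGH and CRITICAL thresholds follow the AML-005 row above.
    medium_threshold is an assumed value, not taken from the source.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"score out of range: {score}")
    if score >= 75.0:
        return "CRITICAL"
    if score >= 50.0:
        return "HIGH"
    if score >= medium_threshold:
        return "MEDIUM"
    return "LOW"


# The score used in the EventBridge example payload falls in the HIGH band.
example_tier = risk_tier(67.2)
```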
Snowflake objects
| Object | Type | Owner | Notes |
|---|---|---|---|
| RISK_CUSTOMER | schema | deploy role | Owned by BANK_{NONPROD\|PROD}_RISK_ROLE. |
| RISK_CUSTOMER.SCORE_HISTORY | table | deploy role | Append-only audit (FR-232). 7-year retention. |
| RISK_CUSTOMER.MODEL_VERSIONS | table | deploy role | DT-005 model inventory. INSERT-only at runtime. |
| RISK_CUSTOMER.MODEL_ARTEFACTS | internal stage | deploy role | Holds model-<version>.pkl + .model_card.json. SNOWFLAKE_SSE encryption. |
| RISK_CUSTOMER.SCORE_OVERLAY | table | deploy role | FR-230 event-driven writes; UNION-merged into v_current_scores. |
| RISK_CUSTOMER.EB_PUBLISH_CURSOR | table | deploy role | Single-row state for the EB publisher Lambda. |
| RISK_CUSTOMER.SCORE_CUSTOMER | Python UDF | deploy role | xgboost 2.0 / sklearn 1.5 / numpy 1.26. IMPORTS model-<version>.pkl. |
| RISK_CUSTOMER.STREAM_PARTIES_CHANGES | stream | deploy role | On raw_cdc_kyc.parties. Bootstrap-resilient. |
| RISK_CUSTOMER.STREAM_SANCTIONS_HITS | stream | deploy role | On raw_cdc_kyc.sanctions_results. Bootstrap-resilient. |
| RISK_CUSTOMER.STREAM_DT_REFRESH | stream | dbt role | On customer_risk_scores DT. Drives history-writer Task. |
| RISK_CUSTOMER.T_SCORE_EVENT_DRIVEN | task | deploy role | 1-min schedule, fires on stream content. Writes overlay + history. |
| RISK_CUSTOMER.T_SCORE_DT_HISTORY_WRITER | task | deploy role | 5-min schedule, copies DT changes to score_history. |
| RISK_CUSTOMER.CUSTOMER_RISK_SCORES | dynamic table | dbt role | dbt-built. 15-min target lag. FR-229. |
| RISK_CUSTOMER.V_CURRENT_SCORES | view | dbt role | Published contract. UNIONs DT + overlay; ROW_NUMBER picks newest. |
SSM outputs
| Path | Value | Consumer |
|---|---|---|
| /bank/{env}/risk-platform/risk-customer/current-scores-view | RISK_CUSTOMER.V_CURRENT_SCORES | MOD-010, MOD-016/017 (mirror writers query this view) |
| /bank/{env}/risk-platform/risk-customer/score-history-table | RISK_CUSTOMER.SCORE_HISTORY | Quarterly model validation reviewers |
| /bank/{env}/risk-platform/risk-customer/event-source-name | bank.risk-platform | EB rule subscribers (MOD-010 KYC, MOD-016/017 AML) |
| /bank/{env}/risk-platform/risk-customer/event-detail-type | customer_risk_score_updated | EB rule subscribers |
| /bank/{env}/risk-platform/risk-customer/publisher-lambda-arn | Lambda ARN | Observability dashboards |
EventBridge contract
Source: bank.risk-platform
Detail-type: customer_risk_score_updated
Detail schema:
{
  "party_id": "p-12345",
  "composite_risk_score": 67.2,
  "risk_tier": "HIGH",
  "previous_risk_tier": "MEDIUM",
  "tier_changed": true,
  "score_version": "v1-abc123def456",
  "triggering_event": "sanctions_hit",
  "scored_at": "2026-05-02T08:00:00.000Z"
}
Publication trigger: EB publisher Lambda runs every 1 min, queries
v_current_scores WHERE scored_at > cursor AND tier_changed = TRUE,
publishes one event per row.
Idempotency: subscribers MUST be idempotent on
(party_id, scored_at) — a Lambda crash before cursor update will
republish the same rows on the next invocation.
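
One way a subscriber could satisfy this requirement is sketched below. The class and its in-memory state are illustrative assumptions: a real mirror writer would more likely enforce the (party_id, scored_at) key with a uniqueness constraint in its own database.

```python
class MirrorWriter:
    """Toy subscriber that ignores events it has already applied."""

    def __init__(self):
        self.seen = set()   # (party_id, scored_at) keys already applied
        self.mirror = {}    # party_id -> latest applied score fields

    def handle(self, detail: dict) -> bool:
        """Apply the event once; return False when it is a duplicate delivery."""
        key = (detail["party_id"], detail["scored_at"])
        if key in self.seen:
            return False
        self.seen.add(key)
        self.mirror[detail["party_id"]] = {
            "composite_risk_score": detail["composite_risk_score"],
            "risk_tier": detail["risk_tier"],
            "score_version": detail["score_version"],
            "scored_at": detail["scored_at"],
        }
        return True


# Detail payload taken from the schema example above.
detail = {
    "party_id": "p-12345",
    "composite_risk_score": 67.2,
    "risk_tier": "HIGH",
    "previous_risk_tier": "MEDIUM",
    "tier_changed": True,
    "score_version": "v1-abc123def456",
    "triggering_event": "sanctions_hit",
    "scored_at": "2026-05-02T08:00:00.000Z",
}

writer = MirrorWriter()
first = writer.handle(detail)
second = writer.handle(detail)  # simulated republish after a publisher crash
```

The second delivery is detected by its key and leaves the mirror unchanged.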
Architecture diagram
┌───────────────────────────────────────────────────────────┐
│ raw_cdc_kyc.{parties, sanctions_results}                  │
│ raw_cdc_core.postings                                     │
│ (External Iceberg, owned by MOD-042)                      │
└─────────────┬─────────────────────────────┬───────────────┘
              │                             │
  ┌───────────┴───────────┐        ┌────────┴────────┐
  │ DT customer_risk_     │        │ Stream parties_ │
  │ scores                │        │ changes +       │
  │ (15-min, FR-229)      │        │ sanctions_hits  │
  │ → score_customer UDF  │        │ (FR-230 source) │
  └───────────┬───────────┘        └────────┬────────┘
              │                             │
              │                             ▼
              │                  ┌─────────────────────┐
              │                  │ Task t_score_event_ │
              │                  │ driven (1 min) →    │
              │                  │ score_overlay +     │
              │                  │ score_history       │
              │                  └──────────┬──────────┘
              │                             │
              ▼                             │
 ┌─────────────────────────┐                │
 │ Stream stream_dt_refresh│                │
 │ → Task dt_history_writer│                │
 │ → score_history         │                │
 └─────────────────────────┘                │
              │                             │
              └──────────────┬──────────────┘
                             ▼
                   ┌───────────────────┐
                   │ v_current_scores  │
                   │ (UNION + newest-  │
                   │ per-party)        │
                   └─────────┬─────────┘
                             │
              ┌──────────────┴──────────────┐
              ▼                             │
    ┌───────────────────┐                   │
    │ EB publisher      │                   │
    │ Lambda (1 min,    │                   │
    │ rate(1 minute))   │                   │
    │ → PutEvents       │                   │
    └─────────┬─────────┘                   │
              │                             │
              ▼                             │
  ┌───────────────────────┐                 │
  │ bank.risk-platform/   │                 │
  │ customer_risk_score_  │                 │
  │ updated (EB)          │                 │
  └───────────┬───────────┘                 │
              │                             │
     ┌────────┴─────────┐                   │
     ▼                  ▼                   │
MOD-010 KYC mirror  MOD-016/017 AML         │
(consumer-side)     mirror                  │
                    (consumer-side)         │
                                            │
quarterly model validation ─────────────────┘
(joins score_history to model_versions)
V2 / follow-ups
- Real fraud-outcome retraining. The V1 model is fitted on synthetic data with hand-tuned distributions. After 6 months of MOD-016/017 confirmed/cleared flags accumulate via CDC, retrain against score_history ⨝ aml.aml_alerts outcomes. The training script is then replaced (not edited).
- Dedicated publisher role. The EB publisher Lambda currently authenticates as the schema owner role (BANK_{NONPROD|PROD}_RISK_ROLE). Create a minimal-privilege BANK_RISK_PLATFORM_PUBLISHER_ROLE in MOD-102 with SELECT on v_current_scores + UPDATE on EB_PUBLISH_CURSOR only.
- Storage policy for 7-year retention on score_history. V1 leaves the table untouched; volume is well within budget.
- Adverse media feature source. V1 emits adverse_media_score = 0.0 because no source exists. Add a raw_cdc_compliance.adverse_media_screening pipeline (MOD-038 follow-up?) and wire it into the feature-engineering block in customer_risk_scores.sql.
- Geographic risk DB. The hand-coded country list in customer_risk_scores.sql is a placeholder. Replace with a managed DT in risk_aml.geographic_risk_classifications.
- Wiki updates pending (orchestrator-owned, per 2026-05-02 brief):
  - SD02 data model: add bank_kyc party.risk_scores_mirror.
  - SD03 data model: add bank_aml aml.risk_scores_mirror.
  - MOD-039 yaml: add MOD-079 to dependencies (or document the direct-subscriber pattern that supersedes it).