Skip to content

MOD-040 — Churn & Health Score Engine

Status: Built (pending CI) Repo: bank-risk-platform System: SD06 Phase: 1 (no internal deps; depends on MOD-039 schema co-residence) ADRs: ADR-002, ADR-039, ADR-046, ADR-054

A weekly churn + financial-health scoring engine in Snowflake. Pure-SQL logistic regression over CDC-replicated engagement, balance, and support data. Outputs: churn_probability_90d (0–1) and financial_health_score (0–100), per active customer, every week. Customers entering HIGH_CHURN trigger a customer_churn_at_risk_identified EventBridge event for the operations work queue.

Functional requirements

ID Requirement How satisfied
FR-233 Weekly churn probability + health score per active customer using engagement, balance trend, txn frequency, support contact data churn_scores DT, FULL refresh, 7-day target_lag, sourced from CDC via stg_customer_features
FR-234 Flag churn>0.7 customers for proactive outreach within 24h v_churn_at_risk_new_entries view (HIGH_CHURN cohort transitions) → weekly cron Lambda → EB event customer_churn_at_risk_identified
FR-235 Scores ≤ 60s after weekly calc to app insights + customer 360 DT refresh + view available immediately on completion
FR-236 ≥ 3 years history per customer churn_scores DT retains daily history; churn_outreach_runs audit log immutable per NFR-024
NFR-004 Insight card refresh ≤ 60s DT availability after refresh
NFR-013 Postgres ≤ 5ms p99 Not directly binding (no Postgres write-back this module)
NFR-022 Categorisation auto-rate ≥ 85% Transitive — uses MOD-041 categorised features

Policies satisfied

Policy Mode Test
CON-001 AUTO tests/policy/CON-001-auto.test.ts — DT + view + Lambda + EB chain present, automatic outreach path proven
CON-003 ALERT tests/policy/CON-003-alert.test.tshealth_segment='STRESSED' payload + decimal-string financial_health_score enables vulnerable-customer flagging by subscribers

Snowflake objects

Object Type Owner Notes
BANK_{ENV}_RISK.RISK_CUSTOMER schema DCM v2 (MOD-039 owns) MOD-040 does NOT redeclare; references via FQN
RISK_CUSTOMER.CHURN_MODEL_COEFFICIENTS table DCM v2 + pre-dbt seed 7 rows for v1 model (intercept + 6 features); UPSERT for new model_version
RISK_CUSTOMER.CHURN_OUTREACH_RUNS table DCM v2 FR-234/236 audit log; INSERT-only on consumer roles
RISK_CUSTOMER.CHURN_SCORES Dynamic Table dbt FULL refresh, 7-day target_lag, cluster (scored_date), target.warehouse, post_hook RESUME. Schema matches SD06 wiki §risk_customer.churn_scores exactly
RISK_CUSTOMER.V_CHURN_CURRENT view dbt ADR-046 §3 published contract — most-recent score per customer
RISK_CUSTOMER.V_CHURN_AT_RISK_NEW_ENTRIES view dbt ADR-046 §3 — cohort transitions into HIGH_CHURN this week (FR-234 trigger)

The wiki is missing entries for risk_customer.churn_model_coefficients and risk_customer.churn_outreach_runs — see docs/handoffs/MOD-040-sd06-wiki-update.handoff.md for the data-model update request.

SSM contract

Reads

Standard MOD-104 + MOD-102 set. No MOD-085 / MOD-103 dependencies.

Writes

SSM path Value Consumed by
/bank/{env}/risk-platform/churn/scores-table RISK_CUSTOMER.CHURN_SCORES History queries
/bank/{env}/risk-platform/churn/current-view RISK_CUSTOMER.V_CHURN_CURRENT MOD-018, MOD-107, customer 360
/bank/{env}/risk-platform/churn/at-risk-view RISK_CUSTOMER.V_CHURN_AT_RISK_NEW_ENTRIES MOD-018 (when shipped)
/bank/{env}/risk-platform/churn/event-source-name bank.risk-platform MOD-018 / MOD-107 EB rules
/bank/{env}/risk-platform/churn/outreach-lambda-arn Lambda ARN Ops

EventBridge contract

Direction Bus Source Detail-type
Publish bank-risk-platform-{env} bank.risk-platform customer_churn_at_risk_identified

Payload (full envelope; numeric values as decimal strings per catalogue convention):

{
  "event_id": "11111111-2222-3333-4444-555555555555",
  "event_time": "2026-05-04T06:00:00.000Z",
  "schema_version": "1.0.0",
  "idempotency_key": "<sha256(scored_date|customer_id)>",
  "trace_id": "trace-abc",
  "run_id": "mod-040-20260504T0600-deadbeef",
  "customer_id": "cust-12345",
  "scored_date": "2026-05-04",
  "scored_at": "2026-05-04T05:00:00.000Z",
  "churn_probability_90d": "0.8124",
  "churn_segment": "HIGH_CHURN",
  "previous_segment": "AT_RISK",
  "primary_churn_signal": "days_since_last_login",
  "financial_health_score": "32.50",
  "health_segment": "STRESSED",
  "model_version": "v1"
}

Schedule

Outreach Lambda fires on cron (0 6 ? * MON *) — Mondays 06:00 UTC. Sequence: Sunday EOD → DT refresh → Monday 06:00 UTC outreach Lambda → EB events land. FR-234 24h window comfortably met.

Dependencies

Module Why Status
MOD-039 Owns RISK_CUSTOMER schema (must deploy first) Built+Deployed
MOD-042 CDC for engagement / balance / txn / support data Built; lazy-resolved
MOD-102 Snowflake account + RISK database + roles Built
MOD-104 AWS bootstrap Built

V1 model coefficients

Stored in RISK_CUSTOMER.CHURN_MODEL_COEFFICIENTS (model_version='v1'):

Feature β Mean (μ) StdDev (σ) Sign
intercept -2.0
product_engagement -0.45 2.5 1.2 engagement → low churn
balance_trend_90d_pct -0.30 0.0 15.0 growing balance → low churn
txn_frequency_per_week -0.25 12.0 8.0 active customer → low churn
support_contacts_30d +0.55 0.5 1.0 contact count → high churn signal
avg_balance_3m (log10) -0.40 4.0 0.8 larger balance → low churn
days_since_last_login +0.65 7.0 10.0 dormant customer → high churn

Calibrated so a high-engagement / growing-balance fixture customer scores ~0.05 churn probability, and a low-engagement / declining-balance fixture scores ~0.85 — well above the 0.7 HIGH_CHURN threshold.

V1 limitations + future work

  • CDC-resilient. stg_customer_features uses lazy adapter.get_relation — when CDC absent, returns 0 rows and churn_scores is empty. Once MOD-042 ships the underlying tables, the next dbt build picks them up automatically.
  • Pure-SQL logistic regression. No Python UDF (DCM v2 constraint from MOD-039 lesson). Coefficients live in versioned config table; data-science team UPSERTS new model versions via SQL post-deploy.
  • MOD-018 case management not yet built. customer_churn_at_risk_identified events sit on the bus until MOD-018 subscribes. Same shape as MOD-032 → MOD-076 pattern.
  • Health score derivation is a simple constant-weighted projection in v1. A future revision can pull weights from the same CHURN_MODEL_COEFFICIENTS table under model_version='health-v1'.

V1 numerical reference

Synthetic customer A (high engagement, growing balance):

features:
  product_engagement      = 4
  balance_trend_90d_pct   = +20
  txn_frequency_per_week  = 25
  support_contacts_30d    = 0
  avg_balance_3m (log10)  = 4.5
  days_since_last_login   = 1

normalised features (z-scores):
  product_engagement      = (4-2.5)/1.2 = 1.25
  balance_trend_90d_pct   = (20-0)/15.0 = 1.33
  txn_frequency_per_week  = (25-12)/8 = 1.625
  support_contacts_30d    = (0-0.5)/1 = -0.5
  avg_balance_3m          = (4.5-4.0)/0.8 = 0.625
  days_since_last_login   = (1-7)/10 = -0.6

logit z = -2.0 + (-0.45)(1.25) + (-0.30)(1.33) + (-0.25)(1.625) + (0.55)(-0.5) + (-0.40)(0.625) + (0.65)(-0.6)
        ≈ -2.0 - 0.5625 - 0.399 - 0.406 - 0.275 - 0.25 - 0.39
        ≈ -4.28
P(churn)  = 1 / (1 + e^4.28) ≈ 0.014

→ churn_segment = STABLE  ✓
→ no outreach event published

Synthetic customer B (declining engagement, dormant):

features:
  product_engagement      = 1
  balance_trend_90d_pct   = -25
  txn_frequency_per_week  = 2
  support_contacts_30d    = 4
  avg_balance_3m (log10)  = 3.0
  days_since_last_login   = 60

logit z ≈ -2.0 + 0.56 + 0.5 + 0.31 + 1.93 + 0.5 + 3.45
        ≈ +5.25
P(churn)  = 1 / (1 + e^-5.25) ≈ 0.995

→ churn_segment = HIGH_CHURN  ✓
→ outreach event published (if previous week was not HIGH_CHURN)