MOD-040 — Churn & Health Score Engine¶

Status: Built (pending CI) Repo: bank-risk-platform System: SD06 Phase: 1 (no internal deps; depends on MOD-039 schema co-residence) ADRs: ADR-002, ADR-039, ADR-046, ADR-054

A weekly churn + financial-health scoring engine in Snowflake. Pure-SQL logistic regression over CDC-replicated engagement, balance, and support data. Outputs: churn_probability_90d (0–1) and financial_health_score (0–100), per active customer, every week. Customers entering HIGH_CHURN trigger a customer_churn_at_risk_identified EventBridge event for the operations work queue.

Functional requirements¶

ID	Requirement	How satisfied
FR-233	Weekly churn probability + health score per active customer using engagement, balance trend, txn frequency, support contact data	`churn_scores` DT, FULL refresh, 7-day target_lag, sourced from CDC via `stg_customer_features`
FR-234	Flag churn>0.7 customers for proactive outreach within 24h	`v_churn_at_risk_new_entries` view (HIGH_CHURN cohort transitions) → weekly cron Lambda → EB event `customer_churn_at_risk_identified`
FR-235	Scores ≤ 60s after weekly calc to app insights + customer 360	DT refresh + view available immediately on completion
FR-236	≥ 3 years history per customer	`churn_scores` DT retains daily history; `churn_outreach_runs` audit log immutable per NFR-024
NFR-004	Insight card refresh ≤ 60s	DT availability after refresh
NFR-013	Postgres ≤ 5ms p99	Not directly binding (no Postgres write-back this module)
NFR-022	Categorisation auto-rate ≥ 85%	Transitive — uses MOD-041 categorised features

Policies satisfied¶

Policy	Mode	Test
CON-001	AUTO	`tests/policy/CON-001-auto.test.ts` — DT + view + Lambda + EB chain present, automatic outreach path proven
CON-003	ALERT	`tests/policy/CON-003-alert.test.ts` — `health_segment='STRESSED'` payload + decimal-string `financial_health_score` enables vulnerable-customer flagging by subscribers

Snowflake objects¶

Object	Type	Owner	Notes
`BANK_{ENV}_RISK.RISK_CUSTOMER`	schema	DCM v2 (MOD-039 owns)	MOD-040 does NOT redeclare; references via FQN
`RISK_CUSTOMER.CHURN_MODEL_COEFFICIENTS`	table	DCM v2 + pre-dbt seed	7 rows for v1 model (intercept + 6 features); UPSERT for new model_version
`RISK_CUSTOMER.CHURN_OUTREACH_RUNS`	table	DCM v2	FR-234/236 audit log; INSERT-only on consumer roles
`RISK_CUSTOMER.CHURN_SCORES`	Dynamic Table	dbt	FULL refresh, 7-day target_lag, cluster (scored_date), target.warehouse, post_hook RESUME. Schema matches SD06 wiki §`risk_customer.churn_scores` exactly
`RISK_CUSTOMER.V_CHURN_CURRENT`	view	dbt	ADR-046 §3 published contract — most-recent score per customer
`RISK_CUSTOMER.V_CHURN_AT_RISK_NEW_ENTRIES`	view	dbt	ADR-046 §3 — cohort transitions into HIGH_CHURN this week (FR-234 trigger)

The wiki is missing entries for risk_customer.churn_model_coefficients and risk_customer.churn_outreach_runs — see docs/handoffs/MOD-040-sd06-wiki-update.handoff.md for the data-model update request.

SSM contract¶

Reads¶

Standard MOD-104 + MOD-102 set. No MOD-085 / MOD-103 dependencies.

Writes¶

SSM path	Value	Consumed by
`/bank/{env}/risk-platform/churn/scores-table`	`RISK_CUSTOMER.CHURN_SCORES`	History queries
`/bank/{env}/risk-platform/churn/current-view`	`RISK_CUSTOMER.V_CHURN_CURRENT`	MOD-018, MOD-107, customer 360
`/bank/{env}/risk-platform/churn/at-risk-view`	`RISK_CUSTOMER.V_CHURN_AT_RISK_NEW_ENTRIES`	MOD-018 (when shipped)
`/bank/{env}/risk-platform/churn/event-source-name`	`bank.risk-platform`	MOD-018 / MOD-107 EB rules
`/bank/{env}/risk-platform/churn/outreach-lambda-arn`	Lambda ARN	Ops

EventBridge contract¶

Direction	Bus	Source	Detail-type
Publish	`bank-risk-platform-{env}`	`bank.risk-platform`	`customer_churn_at_risk_identified`

Payload (full envelope; numeric values as decimal strings per catalogue convention):

{
  "event_id": "11111111-2222-3333-4444-555555555555",
  "event_time": "2026-05-04T06:00:00.000Z",
  "schema_version": "1.0.0",
  "idempotency_key": "<sha256(scored_date|customer_id)>",
  "trace_id": "trace-abc",
  "run_id": "mod-040-20260504T0600-deadbeef",
  "customer_id": "cust-12345",
  "scored_date": "2026-05-04",
  "scored_at": "2026-05-04T05:00:00.000Z",
  "churn_probability_90d": "0.8124",
  "churn_segment": "HIGH_CHURN",
  "previous_segment": "AT_RISK",
  "primary_churn_signal": "days_since_last_login",
  "financial_health_score": "32.50",
  "health_segment": "STRESSED",
  "model_version": "v1"
}

Schedule¶

Outreach Lambda fires on cron (0 6 ? * MON *) — Mondays 06:00 UTC. Sequence: Sunday EOD → DT refresh → Monday 06:00 UTC outreach Lambda → EB events land. FR-234 24h window comfortably met.

Dependencies¶

Module	Why	Status
MOD-039	Owns RISK_CUSTOMER schema (must deploy first)	Built+Deployed
MOD-042	CDC for engagement / balance / txn / support data	Built; lazy-resolved
MOD-102	Snowflake account + RISK database + roles	Built
MOD-104	AWS bootstrap	Built

V1 model coefficients¶

Stored in RISK_CUSTOMER.CHURN_MODEL_COEFFICIENTS (model_version='v1'):

Feature	β	Mean (μ)	StdDev (σ)	Sign
intercept	-2.0	—	—	—
product_engagement	-0.45	2.5	1.2	engagement → low churn
balance_trend_90d_pct	-0.30	0.0	15.0	growing balance → low churn
txn_frequency_per_week	-0.25	12.0	8.0	active customer → low churn
support_contacts_30d	+0.55	0.5	1.0	contact count → high churn signal
avg_balance_3m (log10)	-0.40	4.0	0.8	larger balance → low churn
days_since_last_login	+0.65	7.0	10.0	dormant customer → high churn

Calibrated so a high-engagement / growing-balance fixture customer scores ~0.05 churn probability, and a low-engagement / declining-balance fixture scores ~0.85 — well above the 0.7 HIGH_CHURN threshold.

V1 limitations + future work¶

CDC-resilient. stg_customer_features uses lazy adapter.get_relation — when CDC absent, returns 0 rows and churn_scores is empty. Once MOD-042 ships the underlying tables, the next dbt build picks them up automatically.
Pure-SQL logistic regression. No Python UDF (DCM v2 constraint from MOD-039 lesson). Coefficients live in versioned config table; data-science team UPSERTS new model versions via SQL post-deploy.
MOD-018 case management not yet built. customer_churn_at_risk_identified events sit on the bus until MOD-018 subscribes. Same shape as MOD-032 → MOD-076 pattern.
Health score derivation is a simple constant-weighted projection in v1. A future revision can pull weights from the same CHURN_MODEL_COEFFICIENTS table under model_version='health-v1'.

V1 numerical reference¶

Synthetic customer A (high engagement, growing balance):

features:
  product_engagement      = 4
  balance_trend_90d_pct   = +20
  txn_frequency_per_week  = 25
  support_contacts_30d    = 0
  avg_balance_3m (log10)  = 4.5
  days_since_last_login   = 1

normalised features (z-scores):
  product_engagement      = (4-2.5)/1.2 = 1.25
  balance_trend_90d_pct   = (20-0)/15.0 = 1.33
  txn_frequency_per_week  = (25-12)/8 = 1.625
  support_contacts_30d    = (0-0.5)/1 = -0.5
  avg_balance_3m          = (4.5-4.0)/0.8 = 0.625
  days_since_last_login   = (1-7)/10 = -0.6

logit z = -2.0 + (-0.45)(1.25) + (-0.30)(1.33) + (-0.25)(1.625) + (0.55)(-0.5) + (-0.40)(0.625) + (0.65)(-0.6)
        ≈ -2.0 - 0.5625 - 0.399 - 0.406 - 0.275 - 0.25 - 0.39
        ≈ -4.28
P(churn)  = 1 / (1 + e^4.28) ≈ 0.014

→ churn_segment = STABLE  ✓
→ no outreach event published

Synthetic customer B (declining engagement, dormant):

features:
  product_engagement      = 1
  balance_trend_90d_pct   = -25
  txn_frequency_per_week  = 2
  support_contacts_30d    = 4
  avg_balance_3m (log10)  = 3.0
  days_since_last_login   = 60

logit z ≈ -2.0 + 0.56 + 0.5 + 0.31 + 1.93 + 0.5 + 3.45
        ≈ +5.25
P(churn)  = 1 / (1 + e^-5.25) ≈ 0.995

→ churn_segment = HIGH_CHURN  ✓
→ outreach event published (if previous week was not HIGH_CHURN)