MOD-040 — Churn & Health Score Engine¶
Status: Built (pending CI) Repo: bank-risk-platform System: SD06 Phase: 1 (no internal deps; depends on MOD-039 schema co-residence) ADRs: ADR-002, ADR-039, ADR-046, ADR-054
A weekly churn + financial-health scoring engine in Snowflake. Pure-SQL
logistic regression over CDC-replicated engagement, balance, and
support data. Outputs: churn_probability_90d (0–1) and
financial_health_score (0–100), per active customer, every week.
Customers entering HIGH_CHURN trigger a customer_churn_at_risk_identified
EventBridge event for the operations work queue.
Functional requirements¶
| ID | Requirement | How satisfied |
|---|---|---|
| FR-233 | Weekly churn probability + health score per active customer using engagement, balance trend, txn frequency, support contact data | churn_scores DT, FULL refresh, 7-day target_lag, sourced from CDC via stg_customer_features |
| FR-234 | Flag churn>0.7 customers for proactive outreach within 24h | v_churn_at_risk_new_entries view (HIGH_CHURN cohort transitions) → weekly cron Lambda → EB event customer_churn_at_risk_identified |
| FR-235 | Scores ≤ 60s after weekly calc to app insights + customer 360 | DT refresh + view available immediately on completion |
| FR-236 | ≥ 3 years history per customer | churn_scores DT retains daily history; churn_outreach_runs audit log immutable per NFR-024 |
| NFR-004 | Insight card refresh ≤ 60s | DT availability after refresh |
| NFR-013 | Postgres ≤ 5ms p99 | Not directly binding (no Postgres write-back this module) |
| NFR-022 | Categorisation auto-rate ≥ 85% | Transitive — uses MOD-041 categorised features |
Policies satisfied¶
| Policy | Mode | Test |
|---|---|---|
| CON-001 | AUTO | tests/policy/CON-001-auto.test.ts — DT + view + Lambda + EB chain present, automatic outreach path proven |
| CON-003 | ALERT | tests/policy/CON-003-alert.test.ts — health_segment='STRESSED' payload + decimal-string financial_health_score enables vulnerable-customer flagging by subscribers |
Snowflake objects¶
| Object | Type | Owner | Notes |
|---|---|---|---|
BANK_{ENV}_RISK.RISK_CUSTOMER |
schema | DCM v2 (MOD-039 owns) | MOD-040 does NOT redeclare; references via FQN |
RISK_CUSTOMER.CHURN_MODEL_COEFFICIENTS |
table | DCM v2 + pre-dbt seed | 7 rows for v1 model (intercept + 6 features); UPSERT for new model_version |
RISK_CUSTOMER.CHURN_OUTREACH_RUNS |
table | DCM v2 | FR-234/236 audit log; INSERT-only on consumer roles |
RISK_CUSTOMER.CHURN_SCORES |
Dynamic Table | dbt | FULL refresh, 7-day target_lag, cluster (scored_date), target.warehouse, post_hook RESUME. Schema matches SD06 wiki §risk_customer.churn_scores exactly |
RISK_CUSTOMER.V_CHURN_CURRENT |
view | dbt | ADR-046 §3 published contract — most-recent score per customer |
RISK_CUSTOMER.V_CHURN_AT_RISK_NEW_ENTRIES |
view | dbt | ADR-046 §3 — cohort transitions into HIGH_CHURN this week (FR-234 trigger) |
The wiki is missing entries for risk_customer.churn_model_coefficients and
risk_customer.churn_outreach_runs — see
docs/handoffs/MOD-040-sd06-wiki-update.handoff.md for the data-model
update request.
SSM contract¶
Reads¶
Standard MOD-104 + MOD-102 set. No MOD-085 / MOD-103 dependencies.
Writes¶
| SSM path | Value | Consumed by |
|---|---|---|
/bank/{env}/risk-platform/churn/scores-table |
RISK_CUSTOMER.CHURN_SCORES |
History queries |
/bank/{env}/risk-platform/churn/current-view |
RISK_CUSTOMER.V_CHURN_CURRENT |
MOD-018, MOD-107, customer 360 |
/bank/{env}/risk-platform/churn/at-risk-view |
RISK_CUSTOMER.V_CHURN_AT_RISK_NEW_ENTRIES |
MOD-018 (when shipped) |
/bank/{env}/risk-platform/churn/event-source-name |
bank.risk-platform |
MOD-018 / MOD-107 EB rules |
/bank/{env}/risk-platform/churn/outreach-lambda-arn |
Lambda ARN | Ops |
EventBridge contract¶
| Direction | Bus | Source | Detail-type |
|---|---|---|---|
| Publish | bank-risk-platform-{env} |
bank.risk-platform |
customer_churn_at_risk_identified |
Payload (full envelope; numeric values as decimal strings per catalogue convention):
{
"event_id": "11111111-2222-3333-4444-555555555555",
"event_time": "2026-05-04T06:00:00.000Z",
"schema_version": "1.0.0",
"idempotency_key": "<sha256(scored_date|customer_id)>",
"trace_id": "trace-abc",
"run_id": "mod-040-20260504T0600-deadbeef",
"customer_id": "cust-12345",
"scored_date": "2026-05-04",
"scored_at": "2026-05-04T05:00:00.000Z",
"churn_probability_90d": "0.8124",
"churn_segment": "HIGH_CHURN",
"previous_segment": "AT_RISK",
"primary_churn_signal": "days_since_last_login",
"financial_health_score": "32.50",
"health_segment": "STRESSED",
"model_version": "v1"
}
Schedule¶
Outreach Lambda fires on cron (0 6 ? * MON *) — Mondays 06:00 UTC.
Sequence: Sunday EOD → DT refresh → Monday 06:00 UTC outreach Lambda
→ EB events land. FR-234 24h window comfortably met.
Dependencies¶
| Module | Why | Status |
|---|---|---|
| MOD-039 | Owns RISK_CUSTOMER schema (must deploy first) | Built+Deployed |
| MOD-042 | CDC for engagement / balance / txn / support data | Built; lazy-resolved |
| MOD-102 | Snowflake account + RISK database + roles | Built |
| MOD-104 | AWS bootstrap | Built |
V1 model coefficients¶
Stored in RISK_CUSTOMER.CHURN_MODEL_COEFFICIENTS (model_version='v1'):
| Feature | β | Mean (μ) | StdDev (σ) | Sign |
|---|---|---|---|---|
| intercept | -2.0 | — | — | — |
| product_engagement | -0.45 | 2.5 | 1.2 | engagement → low churn |
| balance_trend_90d_pct | -0.30 | 0.0 | 15.0 | growing balance → low churn |
| txn_frequency_per_week | -0.25 | 12.0 | 8.0 | active customer → low churn |
| support_contacts_30d | +0.55 | 0.5 | 1.0 | contact count → high churn signal |
| avg_balance_3m (log10) | -0.40 | 4.0 | 0.8 | larger balance → low churn |
| days_since_last_login | +0.65 | 7.0 | 10.0 | dormant customer → high churn |
Calibrated so a high-engagement / growing-balance fixture customer scores ~0.05 churn probability, and a low-engagement / declining-balance fixture scores ~0.85 — well above the 0.7 HIGH_CHURN threshold.
V1 limitations + future work¶
- CDC-resilient.
stg_customer_featuresuses lazyadapter.get_relation— when CDC absent, returns 0 rows andchurn_scoresis empty. Once MOD-042 ships the underlying tables, the next dbt build picks them up automatically. - Pure-SQL logistic regression. No Python UDF (DCM v2 constraint from MOD-039 lesson). Coefficients live in versioned config table; data-science team UPSERTS new model versions via SQL post-deploy.
- MOD-018 case management not yet built.
customer_churn_at_risk_identifiedevents sit on the bus until MOD-018 subscribes. Same shape as MOD-032 → MOD-076 pattern. - Health score derivation is a simple constant-weighted projection in v1. A future revision can pull weights from the same CHURN_MODEL_COEFFICIENTS table under model_version='health-v1'.
V1 numerical reference¶
Synthetic customer A (high engagement, growing balance):
features:
product_engagement = 4
balance_trend_90d_pct = +20
txn_frequency_per_week = 25
support_contacts_30d = 0
avg_balance_3m (log10) = 4.5
days_since_last_login = 1
normalised features (z-scores):
product_engagement = (4-2.5)/1.2 = 1.25
balance_trend_90d_pct = (20-0)/15.0 = 1.33
txn_frequency_per_week = (25-12)/8 = 1.625
support_contacts_30d = (0-0.5)/1 = -0.5
avg_balance_3m = (4.5-4.0)/0.8 = 0.625
days_since_last_login = (1-7)/10 = -0.6
logit z = -2.0 + (-0.45)(1.25) + (-0.30)(1.33) + (-0.25)(1.625) + (0.55)(-0.5) + (-0.40)(0.625) + (0.65)(-0.6)
≈ -2.0 - 0.5625 - 0.399 - 0.406 - 0.275 - 0.25 - 0.39
≈ -4.28
P(churn) = 1 / (1 + e^4.28) ≈ 0.014
→ churn_segment = STABLE ✓
→ no outreach event published
Synthetic customer B (declining engagement, dormant):
features:
product_engagement = 1
balance_trend_90d_pct = -25
txn_frequency_per_week = 2
support_contacts_30d = 4
avg_balance_3m (log10) = 3.0
days_since_last_login = 60
logit z ≈ -2.0 + 0.56 + 0.5 + 0.31 + 1.93 + 0.5 + 3.45
≈ +5.25
P(churn) = 1 / (1 + e^-5.25) ≈ 0.995
→ churn_segment = HIGH_CHURN ✓
→ outreach event published (if previous week was not HIGH_CHURN)