ADR-038: Data access tier policy — Snowflake as the reporting and insight layer¶
| Status | Proposed |
| Date | 2026-04-10 |
| Deciders | CTO, Head of Architecture, Head of Data |
| Affects repos | bank-core, bank-kyc, bank-aml, bank-payments, bank-credit, bank-risk-platform, bank-platform, bank-app |
Status¶
Proposed — 2026-04-10
Context¶
ADR-001 established Postgres (Neon) as the OLTP operational store and ADR-002 established Snowflake as the analytics and risk compute platform. Those decisions are settled.
What was not formally settled is the boundary rule for the return path — specifically, which Snowflake-computed values must be written back to Postgres, and which should be read directly from Snowflake via an API layer. Early ADR drafts (ADR-001, ADR-019) conflated the two cases, leading to a pattern where insight signals, reporting values, and display data were proposed for write-back to Postgres alongside genuine operational decisions. That conflation is incorrect and creates unnecessary data duplication, schema coupling, and maintenance overhead.
This ADR formalises the three-tier rule that governs all data access across the platform.
Decision¶
All data access in the platform is classified into one of three tiers. The tier determines where the data lives and how it is accessed. No exception requires a new ADR; an exception that violates this classification requires this ADR to be superseded.
Tier 1 — Transaction hot path (Postgres only)¶
Data that must be read in sub-millisecond time during the synchronous processing of a payment, account operation, or authentication event. This data lives in Postgres and is updated by the operational service that owns it, or by a Snowflake write-back for pre-computed values that gate transactions.
| Data | Table | Updated by |
|---|---|---|
| Account balance | accounts.available_balance |
MOD-001 on every posting |
| Account state | accounts.status |
MOD-007 account state machine |
| Daily payment limit / velocity | payment_limits |
MOD-021 on each payment |
| Pre-computed fraud score | risk_scores.fraud_score |
Snowflake → ADR-036 write-back |
| Sanctions status | sanctions_status.clear |
MOD-013/014 on list change |
| Active session / device trust | sessions |
MOD-068 on authentication |
| Credit decision existence flag | credit_decisions.has_live_offer |
Snowflake → ADR-036 write-back |
Rule: If removing a value from Postgres would prevent a payment from being authorised or an account operation from completing safely, it belongs in Tier 1.
Tier 2 — Operational decisions (Snowflake → Neon via ADR-036 decision inbox)¶
Decisions generated by Snowflake that change the operational state of a customer, account, or application. These cross the Snowflake → Neon boundary via the governed decision publication contract defined in ADR-036. They are not used on the transaction hot path but must be durable, idempotent, and auditable.
| Decision type | Published via | Applied to |
|---|---|---|
| Onboarding outcome (accept/reject/refer) | ADR-036 | customers.onboarding_status |
| CDD risk tier assignment | ADR-036 | customers.cdd_tier |
| Fraud action (block/suspend/clear) | ADR-036 | accounts.status |
| Credit pre-approval decision | ADR-036 | credit_decisions |
| AML case escalation | ADR-036 | aml_cases.status |
Rule: If Snowflake is making a decision that changes how the bank treats a customer or their account, it is Tier 2. It must use the ADR-036 contract. It must not be a direct database write.
Tier 3 — Reporting, insights, and visualisations (Snowflake only)¶
Data used for display, reporting, charting, dashboards, and insight cards. This data lives in Snowflake and is served to the front end via MOD-176 (Snowflake read API service) — the governed gateway built in bank-platform (SD07). It is never written into Postgres for display purposes. No Redis or external cache layer sits between the API and Snowflake.
Tier 3 has two sub-patterns with different access mechanics and warehouse configurations:
Tier 3a — Customer-facing signals: Pre-shaped presentation tables refreshed by Snowflake Dynamic Tables. MOD-176 performs trivial indexed point lookups by party_id. Dedicated XS warehouse, always-on during business hours, ≤500ms p99 target. See "Snowflake presentation layer" section below.
Tier 3b — Back-office and regulatory dashboards: Native Snowflake semantic layer (Cortex Analyst). SD06 modules define CREATE SEMANTIC VIEW objects in their migrations alongside dbt models, exposing named metrics, dimensions, and entities. MOD-176 accepts structured metric queries {metric, groupBy, filters} and proxies them to the Cortex Analyst REST API, which generates and executes SQL against the registered semantic models. Dedicated back-office warehouse, auto-suspend, ≤5 s p95 acceptable. See "Snowflake semantic layer" section below.
| Data | Sub-pattern | Snowflake object |
|---|---|---|
| Spending summary & category breakdown | 3a | intelligence.spend_summary (Dynamic Table) |
| Idle cash / spend anomaly / FX signals | 3a | intelligence.customer_signals (Dynamic Table) |
| Payday automation suggestion | 3a | intelligence.customer_signals (Dynamic Table) |
| Net worth dashboard signals | 3a | intelligence.net_worth_view (Dynamic Table) |
| Predicted upcoming payments | 3a | intelligence.payment_predictions (Dynamic Table) |
| Customer risk score (display) | 3a | risk.customer_risk_view (Dynamic Table) |
| Capital ratio (CET1, RWA), LCR/NSFR | 3b | SD06 semantic views (Cortex Analyst) |
| Credit portfolio view (back office) | 3b | SD06 semantic views (Cortex Analyst) |
| AML typology dashboard | 3b | SD06 semantic views (Cortex Analyst) |
| Model risk register, performance, change control | 3b | SD06 semantic views (Cortex Analyst) |
| Regulatory returns (back office) | 3b | SD06 semantic views (Cortex Analyst) |
Rule: If a value is shown in a chart, card, table, or report and is not required to gate or execute a transaction, it is Tier 3. It lives in Snowflake. It never lives in Postgres.
The boundary in plain language¶
Postgres holds state. Snowflake holds understanding.
Postgres knows the customer's balance, account status, and fraud score — because those facts must be available in microseconds during a payment. Snowflake knows what the balance means — whether there's idle cash, whether spending is anomalous, whether the customer is likely to churn — because that understanding is computed from patterns, not point-in-time facts.
The boundary is not about latency alone. It is about the nature of the data: - Facts that change atomically with operational events → Postgres - Decisions that change operational state → ADR-036 inbox → Postgres - Understanding, patterns, aggregations, and visualisations → Snowflake → API cache
Snowflake presentation layer¶
The heavy computation for Tier 3 data — aggregation, pivoting, roll-ups, ML inference output, signal scoring — happens at Dynamic Table refresh time, not at query time. The Snowflake read API queries purpose-built presentation tables: one row per customer, shaped to match the API response payload exactly. These are the serving layer, equivalent to a traditional BI semantic layer.
| Presentation table | Refresh cadence | Contents |
|---|---|---|
PLATFORM.reporting.home_screen_signals |
Per signal cadence (see ADR-019) | Idle cash, anomaly, FX signal, payday suggestion — one row per customer |
PLATFORM.reporting.spend_summary |
Nightly | Category breakdown, merchant totals — one row per customer per month |
PLATFORM.reporting.net_worth_view |
On balance change + 15-min FX refresh | Account balances in NZD equivalent |
RISK.reporting.customer_risk_view |
Daily | Customer risk score for display |
CREDIT.reporting.credit_offer_detail |
On decision publication | Pre-approval rate, term, rationale |
FINCRIME.reporting.aml_dashboard |
15 minutes | AML typology dashboard data |
RISK.reporting.capital_positions |
End of day | CET1, RWA, LCR, NSFR |
The Snowflake read API query for each endpoint is a trivial indexed point lookup:
Query latency against a warm warehouse on a pre-shaped row is 150–400ms — no external cache required.
Read API warehouse¶
A dedicated XS Snowflake warehouse serves the customer-facing read API exclusively. It is configured always-on during business hours with a 10-minute auto-suspend on inactivity, and auto-resume. This is a minimal always-on footprint — cost is approximately $5–15/day — justified by consistent sub-500ms query latency on the customer path.
Back office reports, dashboards, and analytical queries use a separate warehouse and tolerate higher latency.
Data boundary — insights only¶
The Snowflake read API serves insight and reporting data only. It never returns operational data. Operational data (balances, account status, transaction list, payment limits) is served from Neon directly via the domain APIs. The home screen mixes both sources — operational data from Neon, insight signals from Snowflake — but these are separate API calls with separate latency characteristics.
Snowflake semantic layer (Tier 3b)¶
Back-office and regulatory dashboards use the native Snowflake semantic layer (Cortex Analyst) rather than pre-shaped presentation tables. The two patterns are complementary — Tier 3a optimises for per-customer point lookup latency; Tier 3b optimises for metric flexibility across arbitrary dimensions and time ranges.
Semantic view ownership¶
Each SD06 module defines its semantic model as CREATE SEMANTIC VIEW DDL in its own migrations directory, alongside dbt models. Semantic views reference the module's published dbt views (the schema-as-product contract from ADR-046) — they never reference raw tables or cross-schema implementation detail.
A semantic view exposes:
- Metrics — named, pre-defined measures (lcr_ratio, tier_1_capital, model_psi_score)
- Dimensions — grouping attributes (jurisdiction, model_id, reporting_date, product_type)
- Entities — the primary grain (account, model, counterparty)
Query path¶
MOD-177 (renderer) → POST /v1/snowflake/metrics (MOD-176)
→ Cortex Analyst REST API
→ SQL generated against SD06 semantic views
→ back-office warehouse
→ JSON result → renderer → Recharts
MOD-176 is a proxy only. It forwards the structured query, enforces RBAC role scoping, logs the query, and returns the result. Metric logic lives entirely in Snowflake.
Back-office warehouse¶
A dedicated back-office warehouse serves all Tier 3b queries. It is configured with auto-suspend after inactivity and auto-resume on demand. Query latency on complex aggregations (multi-dimension, multi-period) is 1–5 seconds — acceptable for back-office use. This warehouse is separate from the XS customer-facing warehouse.
Constraint this creates for all repos¶
Every repo must apply the tier classification before writing data access code:
-
bank-app / bank-platform (API layer): The Snowflake read API must be built as a first-class service with per-customer caching, circuit breaking, and query governance. The API layer owns the ≤200ms p99 latency SLA for Tier 3 data; it achieves this via the cache, not by avoiding Snowflake.
-
bank-risk-platform (Snowflake): All analytical outputs are Tier 3 by default unless they are operational decisions (Tier 2). No risk model output should be written to Postgres except via the ADR-036 decision inbox. Capital, liquidity, and customer risk score values stay in Snowflake.
-
All repos: No repo may add a Postgres table for the purpose of caching Snowflake reporting data. If a reporting value is needed in the app, the answer is always: add it to the Snowflake read API and cache — never add it to the Postgres schema.
What this supersedes in prior ADRs¶
| ADR | Prior text | Superseded by this rule |
|---|---|---|
| ADR-001 | Write-back table listed capital_positions, liquidity_positions, customer_scores |
These are Tier 3 — they stay in Snowflake. Write-back is for Tier 1 fraud/sanctions scores and Tier 2 decisions only. |
| ADR-019 | "Results written to Postgres via write-back API" for insight signals | Superseded — insight signals are Tier 3. Postgres holds only the credit decision existence flag (Tier 2). |
ADR-001 and ADR-019 are Proposed drafts and have been updated to reflect this classification. ADR-001 is the Accepted decision on OLTP store choice — its OLTP stance is unchanged; only the overbroad write-back examples are corrected in this ADR.
Snowflake read API design constraints¶
The Snowflake read API is a new internal service (to be built in bank-platform). Constraints:
- Per-customer query scoping: All queries must be scoped to the requesting customer's ID — no query may return data for a different customer regardless of what parameters are passed.
- Row-level security in Snowflake: Snowflake presentation tables serving customer data must have row access policies enforced at the Snowflake layer, providing defence-in-depth independent of the API layer.
- Insights only: The read API returns Tier 3 data exclusively. It must not be used to serve operational data (balances, account status, transactions) — those come from Neon via domain APIs.
- Circuit breaker: If Snowflake query latency exceeds 1 second p95, the API must return a graceful degraded response (empty insight cards with a "not available" state) rather than waiting. A slow or unavailable Snowflake has zero impact on the transaction path.
- Query governance: All Snowflake queries from the read API must be logged with customer ID, signal type, query hash, and warehouse used — for cost attribution and anomaly detection.
- No Redis or external cache: Freshness is managed by Dynamic Table refresh cadence. The presentation layer is the cache — there is no additional caching tier between the API Lambda and Snowflake.
Principles alignment¶
| Principle | Assessment | Notes |
|---|---|---|
| AP-001 KISS | ✓ | Single rule; clear tier classification; no ambiguity about where data lives |
| AP-002 Data governance | ✓ | Snowflake is the governed analytical store; Postgres is not used as a reporting cache |
| AP-004 Security | ✓ | Row-level security in Snowflake; per-customer cache keys; no cross-customer data leakage path |
| AP-006 Cost effective | ✓ | No Postgres schema sprawl from reporting columns; Snowflake compute on demand |
| AP-007 Evolution | ✓ | New insight signals require only a new Snowflake dynamic table — no Postgres migration |
| AP-010 Modular | ✓ | Clean separation: Postgres owns operational state; Snowflake owns analytical understanding |
Perspectives¶
| Perspective | Assessment | Notes |
|---|---|---|
| Performance & Scalability | ✓ | API-layer cache absorbs load; Snowflake compute scales independently |
| Resilience | ~ | Cache provides degraded-mode resilience; Snowflake unavailability affects Tier 3 display only — no transaction impact |
| Evolution | ✓ | Adding new insight signals requires no Postgres schema change |
| Security | ✓ | Row-level security + per-customer cache keys + circuit breaker |
| Regulatory | ✓ | Regulatory reporting stays in Snowflake — the governed, auditable analytical layer |
| Cost | ✓ | Avoids Postgres bloat from reporting columns; cache reduces Snowflake compute cost |
See perspectives.md for guidance on producing these viewpoints.
Related ADRs¶
- ADR-001 — Postgres as the OLTP store; Tier 1 data lives here
- ADR-002 — Snowflake as the analytics platform; Tier 3 data lives here
- ADR-003 — Facts flow Neon → Snowflake
- ADR-036 — Tier 2 decisions flow Snowflake → Neon via decision inbox
- ADR-009 — Visualisation approach; uses Tier 3 data via Snowflake read API
- ADR-019 — Home screen intelligence; Tier 3 data via API cache
Signoff record¶
| Date | Name | Role | Status |
|---|---|---|---|
| 2026-04-10 | Ross Millen | CTO | Approved |
| 2026-04-10 | Ross Millen | Head of Architecture | Approved |
| 2026-04-10 | Ross Millen | Head of Data | Approved |
Capabilities¶
| Capability | Description | Relationship |
|---|---|---|
| CAP-019 | Monthly spending summary & trends | governed — Tier 3: served from Snowflake intelligence.spend_summary via read API |
| CAP-020 | Predicted upcoming payments | governed — Tier 3: served from Snowflake intelligence.payment_predictions |
| CAP-021 | Low balance alert | governed — Tier 3 signal; balance fact is Tier 1 in Postgres |
| CAP-022 | Unusual transaction alert | governed — Tier 3: anomaly signal from Snowflake |
| CAP-023 | Net worth dashboard (linked external accounts) | governed — Tier 3: served from intelligence.net_worth_view |
| CAP-038 | Real-time fraud scoring & block | governed — fraud score is Tier 1 in Postgres (pre-computed write-back) |
| CAP-062 | CDD automatic tier assignment | governed — CDD tier decision is Tier 2 via ADR-036 decision inbox |
| CAP-063 | Proactive financial insight engine | governed — all insight signals are Tier 3; never enter Postgres |
| CAP-065 | Pre-approval engine (nightly eligibility scoring) | governed — credit decision flag is Tier 2; offer detail is Tier 3 |
| CAP-067 | Idle cash detection | governed — idle cash amount is Tier 3; stays in Snowflake |
| CAP-070 | Real-time capital ratio engine (CET1/RWA) | governed — capital_positions is Tier 3; stays in Snowflake |
| CAP-071 | LCR/NSFR continuous calculation | governed — liquidity_positions is Tier 3; stays in Snowflake |
| CAP-072 | Regulatory threshold alerting | governed — threshold monitoring is Tier 3 |
Related decisions¶
| ADR | Title | Relationship |
|---|---|---|
| ADR-001 | Postgres as the OLTP operational store | Tier 1 data lives in Postgres |
| ADR-002 | Snowflake as the analytics and risk compute platform | Tier 2 and Tier 3 data lives in Snowflake |
| ADR-003 | CDC pipeline — Neon Postgres to Snowflake via Firehose and Apache Iceberg | facts flow Neon to Snowflake to populate Tier 3 |
| ADR-009 | Insights and data visualisation approach | visualisation approach consumes Tier 3 via read API |
| ADR-019 | Intelligent customer home screen and financial intelligence layer | home screen reads Tier 1 from Neon and Tier 3 from Snowflake |
| ADR-036 | Decision result publication — governed Snowflake → Neon write-back contract | Tier 2 decisions cross Snowflake to Neon via this ADR |
All ADRs
Compiled 2026-05-22 from source/entities/adrs/ADR-038.yaml