ADR-038: Data access tier policy — Snowflake as the reporting and insight layer¶


Status	Proposed
Date	2026-04-10
Deciders	CTO, Head of Architecture, Head of Data
Affects repos	`bank-core`, `bank-kyc`, `bank-aml`, `bank-payments`, `bank-credit`, `bank-risk-platform`, `bank-platform`, `bank-app`

Status¶

Proposed — 2026-04-10

Context¶

ADR-001 established Postgres (Neon) as the OLTP operational store and ADR-002 established Snowflake as the analytics and risk compute platform. Those decisions are settled.

What was not formally settled is the boundary rule for the return path — specifically, which Snowflake-computed values must be written back to Postgres, and which should be read directly from Snowflake via an API layer. Early ADR drafts (ADR-001, ADR-019) conflated the two cases, leading to a pattern where insight signals, reporting values, and display data were proposed for write-back to Postgres alongside genuine operational decisions. That conflation is incorrect and creates unnecessary data duplication, schema coupling, and maintenance overhead.

This ADR formalises the three-tier rule that governs all data access across the platform.

Decision¶

All data access in the platform is classified into one of three tiers. The tier determines where the data lives and how it is accessed. No exception requires a new ADR; an exception that violates this classification requires this ADR to be superseded.

Tier 1 — Transaction hot path (Postgres only)¶

Data that must be read in sub-millisecond time during the synchronous processing of a payment, account operation, or authentication event. This data lives in Postgres and is updated by the operational service that owns it, or by a Snowflake write-back for pre-computed values that gate transactions.

Data	Table	Updated by
Account balance	`accounts.available_balance`	MOD-001 on every posting
Account state	`accounts.status`	MOD-007 account state machine
Daily payment limit / velocity	`payment_limits`	MOD-021 on each payment
Pre-computed fraud score	`risk_scores.fraud_score`	Snowflake → ADR-036 write-back
Sanctions status	`sanctions_status.clear`	MOD-013/014 on list change
Active session / device trust	`sessions`	MOD-068 on authentication
Credit decision existence flag	`credit_decisions.has_live_offer`	Snowflake → ADR-036 write-back

Rule: If removing a value from Postgres would prevent a payment from being authorised or an account operation from completing safely, it belongs in Tier 1.

Tier 2 — Operational decisions (Snowflake → Neon via ADR-036 decision inbox)¶

Decisions generated by Snowflake that change the operational state of a customer, account, or application. These cross the Snowflake → Neon boundary via the governed decision publication contract defined in ADR-036. They are not used on the transaction hot path but must be durable, idempotent, and auditable.

Decision type	Published via	Applied to
Onboarding outcome (accept/reject/refer)	ADR-036	`customers.onboarding_status`
CDD risk tier assignment	ADR-036	`customers.cdd_tier`
Fraud action (block/suspend/clear)	ADR-036	`accounts.status`
Credit pre-approval decision	ADR-036	`credit_decisions`
AML case escalation	ADR-036	`aml_cases.status`

Rule: If Snowflake is making a decision that changes how the bank treats a customer or their account, it is Tier 2. It must use the ADR-036 contract. It must not be a direct database write.

Tier 3 — Reporting, insights, and visualisations (Snowflake only)¶

Data used for display, reporting, charting, dashboards, and insight cards. This data lives in Snowflake and is served to the front end via MOD-176 (Snowflake read API service) — the governed gateway built in bank-platform (SD07). It is never written into Postgres for display purposes. No Redis or external cache layer sits between the API and Snowflake.

Tier 3 has two sub-patterns with different access mechanics and warehouse configurations:

Tier 3a — Customer-facing signals: Pre-shaped presentation tables refreshed by Snowflake Dynamic Tables. MOD-176 performs trivial indexed point lookups by party_id. Dedicated XS warehouse, always-on during business hours, ≤500ms p99 target. See "Snowflake presentation layer" section below.

Tier 3b — Back-office and regulatory dashboards: Native Snowflake semantic layer (Cortex Analyst). SD06 modules define CREATE SEMANTIC VIEW objects in their migrations alongside dbt models, exposing named metrics, dimensions, and entities. MOD-176 accepts structured metric queries {metric, groupBy, filters} and proxies them to the Cortex Analyst REST API, which generates and executes SQL against the registered semantic models. Dedicated back-office warehouse, auto-suspend, ≤5 s p95 acceptable. See "Snowflake semantic layer" section below.

Data	Sub-pattern	Snowflake object
Spending summary & category breakdown	3a	`intelligence.spend_summary` (Dynamic Table)
Idle cash / spend anomaly / FX signals	3a	`intelligence.customer_signals` (Dynamic Table)
Payday automation suggestion	3a	`intelligence.customer_signals` (Dynamic Table)
Net worth dashboard signals	3a	`intelligence.net_worth_view` (Dynamic Table)
Predicted upcoming payments	3a	`intelligence.payment_predictions` (Dynamic Table)
Customer risk score (display)	3a	`risk.customer_risk_view` (Dynamic Table)
Capital ratio (CET1, RWA), LCR/NSFR	3b	SD06 semantic views (Cortex Analyst)
Credit portfolio view (back office)	3b	SD06 semantic views (Cortex Analyst)
AML typology dashboard	3b	SD06 semantic views (Cortex Analyst)
Model risk register, performance, change control	3b	SD06 semantic views (Cortex Analyst)
Regulatory returns (back office)	3b	SD06 semantic views (Cortex Analyst)

Rule: If a value is shown in a chart, card, table, or report and is not required to gate or execute a transaction, it is Tier 3. It lives in Snowflake. It never lives in Postgres.

The boundary in plain language¶

Postgres holds state. Snowflake holds understanding.

Postgres knows the customer's balance, account status, and fraud score — because those facts must be available in microseconds during a payment. Snowflake knows what the balance means — whether there's idle cash, whether spending is anomalous, whether the customer is likely to churn — because that understanding is computed from patterns, not point-in-time facts.

The boundary is not about latency alone. It is about the nature of the data: - Facts that change atomically with operational events → Postgres - Decisions that change operational state → ADR-036 inbox → Postgres - Understanding, patterns, aggregations, and visualisations → Snowflake → API cache

Snowflake presentation layer¶

The heavy computation for Tier 3 data — aggregation, pivoting, roll-ups, ML inference output, signal scoring — happens at Dynamic Table refresh time, not at query time. The Snowflake read API queries purpose-built presentation tables: one row per customer, shaped to match the API response payload exactly. These are the serving layer, equivalent to a traditional BI semantic layer.

Presentation table	Refresh cadence	Contents
`PLATFORM.reporting.home_screen_signals`	Per signal cadence (see ADR-019)	Idle cash, anomaly, FX signal, payday suggestion — one row per customer
`PLATFORM.reporting.spend_summary`	Nightly	Category breakdown, merchant totals — one row per customer per month
`PLATFORM.reporting.net_worth_view`	On balance change + 15-min FX refresh	Account balances in NZD equivalent
`RISK.reporting.customer_risk_view`	Daily	Customer risk score for display
`CREDIT.reporting.credit_offer_detail`	On decision publication	Pre-approval rate, term, rationale
`FINCRIME.reporting.aml_dashboard`	15 minutes	AML typology dashboard data
`RISK.reporting.capital_positions`	End of day	CET1, RWA, LCR, NSFR

The Snowflake read API query for each endpoint is a trivial indexed point lookup:

SELECT payload FROM platform.reporting.home_screen_signals WHERE customer_id = ?

Query latency against a warm warehouse on a pre-shaped row is 150–400ms — no external cache required.

Read API warehouse¶

A dedicated XS Snowflake warehouse serves the customer-facing read API exclusively. It is configured always-on during business hours with a 10-minute auto-suspend on inactivity, and auto-resume. This is a minimal always-on footprint — cost is approximately $5–15/day — justified by consistent sub-500ms query latency on the customer path.

Back office reports, dashboards, and analytical queries use a separate warehouse and tolerate higher latency.

Data boundary — insights only¶

The Snowflake read API serves insight and reporting data only. It never returns operational data. Operational data (balances, account status, transaction list, payment limits) is served from Neon directly via the domain APIs. The home screen mixes both sources — operational data from Neon, insight signals from Snowflake — but these are separate API calls with separate latency characteristics.

Snowflake semantic layer (Tier 3b)¶

Back-office and regulatory dashboards use the native Snowflake semantic layer (Cortex Analyst) rather than pre-shaped presentation tables. The two patterns are complementary — Tier 3a optimises for per-customer point lookup latency; Tier 3b optimises for metric flexibility across arbitrary dimensions and time ranges.

Semantic view ownership¶

Each SD06 module defines its semantic model as CREATE SEMANTIC VIEW DDL in its own migrations directory, alongside dbt models. Semantic views reference the module's published dbt views (the schema-as-product contract from ADR-046) — they never reference raw tables or cross-schema implementation detail.

A semantic view exposes: - Metrics — named, pre-defined measures (lcr_ratio, tier_1_capital, model_psi_score) - Dimensions — grouping attributes (jurisdiction, model_id, reporting_date, product_type) - Entities — the primary grain (account, model, counterparty)

Query path¶

MOD-177 (renderer) → POST /v1/snowflake/metrics (MOD-176)
  → Cortex Analyst REST API
    → SQL generated against SD06 semantic views
      → back-office warehouse
        → JSON result → renderer → Recharts

MOD-176 is a proxy only. It forwards the structured query, enforces RBAC role scoping, logs the query, and returns the result. Metric logic lives entirely in Snowflake.

Back-office warehouse¶

A dedicated back-office warehouse serves all Tier 3b queries. It is configured with auto-suspend after inactivity and auto-resume on demand. Query latency on complex aggregations (multi-dimension, multi-period) is 1–5 seconds — acceptable for back-office use. This warehouse is separate from the XS customer-facing warehouse.

Constraint this creates for all repos¶

Every repo must apply the tier classification before writing data access code:

bank-app / bank-platform (API layer): The Snowflake read API must be built as a first-class service with per-customer caching, circuit breaking, and query governance. The API layer owns the ≤200ms p99 latency SLA for Tier 3 data; it achieves this via the cache, not by avoiding Snowflake.
bank-risk-platform (Snowflake): All analytical outputs are Tier 3 by default unless they are operational decisions (Tier 2). No risk model output should be written to Postgres except via the ADR-036 decision inbox. Capital, liquidity, and customer risk score values stay in Snowflake.
All repos: No repo may add a Postgres table for the purpose of caching Snowflake reporting data. If a reporting value is needed in the app, the answer is always: add it to the Snowflake read API and cache — never add it to the Postgres schema.

What this supersedes in prior ADRs¶

ADR	Prior text	Superseded by this rule
ADR-001	Write-back table listed `capital_positions`, `liquidity_positions`, `customer_scores`	These are Tier 3 — they stay in Snowflake. Write-back is for Tier 1 fraud/sanctions scores and Tier 2 decisions only.
ADR-019	"Results written to Postgres via write-back API" for insight signals	Superseded — insight signals are Tier 3. Postgres holds only the credit decision existence flag (Tier 2).

ADR-001 and ADR-019 are Proposed drafts and have been updated to reflect this classification. ADR-001 is the Accepted decision on OLTP store choice — its OLTP stance is unchanged; only the overbroad write-back examples are corrected in this ADR.

Snowflake read API design constraints¶

The Snowflake read API is a new internal service (to be built in bank-platform). Constraints:

Per-customer query scoping: All queries must be scoped to the requesting customer's ID — no query may return data for a different customer regardless of what parameters are passed.
Row-level security in Snowflake: Snowflake presentation tables serving customer data must have row access policies enforced at the Snowflake layer, providing defence-in-depth independent of the API layer.
Insights only: The read API returns Tier 3 data exclusively. It must not be used to serve operational data (balances, account status, transactions) — those come from Neon via domain APIs.
Circuit breaker: If Snowflake query latency exceeds 1 second p95, the API must return a graceful degraded response (empty insight cards with a "not available" state) rather than waiting. A slow or unavailable Snowflake has zero impact on the transaction path.
Query governance: All Snowflake queries from the read API must be logged with customer ID, signal type, query hash, and warehouse used — for cost attribution and anomaly detection.
No Redis or external cache: Freshness is managed by Dynamic Table refresh cadence. The presentation layer is the cache — there is no additional caching tier between the API Lambda and Snowflake.

Principles alignment¶

Principle	Assessment	Notes
AP-001 KISS	✓	Single rule; clear tier classification; no ambiguity about where data lives
AP-002 Data governance	✓	Snowflake is the governed analytical store; Postgres is not used as a reporting cache
AP-004 Security	✓	Row-level security in Snowflake; per-customer cache keys; no cross-customer data leakage path
AP-006 Cost effective	✓	No Postgres schema sprawl from reporting columns; Snowflake compute on demand
AP-007 Evolution	✓	New insight signals require only a new Snowflake dynamic table — no Postgres migration
AP-010 Modular	✓	Clean separation: Postgres owns operational state; Snowflake owns analytical understanding

Perspectives¶

Perspective	Assessment	Notes
Performance & Scalability	✓	API-layer cache absorbs load; Snowflake compute scales independently
Resilience	~	Cache provides degraded-mode resilience; Snowflake unavailability affects Tier 3 display only — no transaction impact
Evolution	✓	Adding new insight signals requires no Postgres schema change
Security	✓	Row-level security + per-customer cache keys + circuit breaker
Regulatory	✓	Regulatory reporting stays in Snowflake — the governed, auditable analytical layer
Cost	✓	Avoids Postgres bloat from reporting columns; cache reduces Snowflake compute cost

See perspectives.md for guidance on producing these viewpoints.

ADR-001 — Postgres as the OLTP store; Tier 1 data lives here
ADR-002 — Snowflake as the analytics platform; Tier 3 data lives here
ADR-003 — Facts flow Neon → Snowflake
ADR-036 — Tier 2 decisions flow Snowflake → Neon via decision inbox
ADR-009 — Visualisation approach; uses Tier 3 data via Snowflake read API
ADR-019 — Home screen intelligence; Tier 3 data via API cache

Signoff record¶

Date	Name	Role	Status
2026-04-10	Ross Millen	CTO	Approved
2026-04-10	Ross Millen	Head of Architecture	Approved
2026-04-10	Ross Millen	Head of Data	Approved

Capabilities¶

Capability	Description	Relationship
CAP-019	Monthly spending summary & trends	governed — Tier 3: served from Snowflake intelligence.spend_summary via read API
CAP-020	Predicted upcoming payments	governed — Tier 3: served from Snowflake intelligence.payment_predictions
CAP-021	Low balance alert	governed — Tier 3 signal; balance fact is Tier 1 in Postgres
CAP-022	Unusual transaction alert	governed — Tier 3: anomaly signal from Snowflake
CAP-023	Net worth dashboard (linked external accounts)	governed — Tier 3: served from intelligence.net_worth_view
CAP-038	Real-time fraud scoring & block	governed — fraud score is Tier 1 in Postgres (pre-computed write-back)
CAP-062	CDD automatic tier assignment	governed — CDD tier decision is Tier 2 via ADR-036 decision inbox
CAP-063	Proactive financial insight engine	governed — all insight signals are Tier 3; never enter Postgres
CAP-065	Pre-approval engine (nightly eligibility scoring)	governed — credit decision flag is Tier 2; offer detail is Tier 3
CAP-067	Idle cash detection	governed — idle cash amount is Tier 3; stays in Snowflake
CAP-070	Real-time capital ratio engine (CET1/RWA)	governed — capital_positions is Tier 3; stays in Snowflake
CAP-071	LCR/NSFR continuous calculation	governed — liquidity_positions is Tier 3; stays in Snowflake
CAP-072	Regulatory threshold alerting	governed — threshold monitoring is Tier 3

ADR	Title	Relationship
ADR-001	Postgres as the OLTP operational store	Tier 1 data lives in Postgres
ADR-002	Snowflake as the analytics and risk compute platform	Tier 2 and Tier 3 data lives in Snowflake
ADR-003	CDC pipeline — Neon Postgres to Snowflake via Firehose and Apache Iceberg	facts flow Neon to Snowflake to populate Tier 3
ADR-009	Insights and data visualisation approach	visualisation approach consumes Tier 3 via read API
ADR-019	Intelligent customer home screen and financial intelligence layer	home screen reads Tier 1 from Neon and Tier 3 from Snowflake
ADR-036	Decision result publication — governed Snowflake → Neon write-back contract	Tier 2 decisions cross Snowflake to Neon via this ADR

All ADRs Compiled 2026-05-22 from source/entities/adrs/ADR-038.yaml