Skip to content

ADR-038: Data access tier policy — Snowflake as the reporting and insight layer

Status Proposed
Date 2026-04-10
Deciders CTO, Head of Architecture, Head of Data
Affects repos bank-core, bank-kyc, bank-aml, bank-payments, bank-credit, bank-risk-platform, bank-platform, bank-app

Status

Proposed — 2026-04-10

Context

ADR-001 established Postgres (Neon) as the OLTP operational store and ADR-002 established Snowflake as the analytics and risk compute platform. Those decisions are settled.

What was not formally settled is the boundary rule for the return path — specifically, which Snowflake-computed values must be written back to Postgres, and which should be read directly from Snowflake via an API layer. Early ADR drafts (ADR-001, ADR-019) conflated the two cases, leading to a pattern where insight signals, reporting values, and display data were proposed for write-back to Postgres alongside genuine operational decisions. That conflation is incorrect and creates unnecessary data duplication, schema coupling, and maintenance overhead.

This ADR formalises the three-tier rule that governs all data access across the platform.


Decision

All data access in the platform is classified into one of three tiers. The tier determines where the data lives and how it is accessed. No exception requires a new ADR; an exception that violates this classification requires this ADR to be superseded.

Tier 1 — Transaction hot path (Postgres only)

Data that must be read in sub-millisecond time during the synchronous processing of a payment, account operation, or authentication event. This data lives in Postgres and is updated by the operational service that owns it, or by a Snowflake write-back for pre-computed values that gate transactions.

Data Table Updated by
Account balance accounts.available_balance MOD-001 on every posting
Account state accounts.status MOD-007 account state machine
Daily payment limit / velocity payment_limits MOD-021 on each payment
Pre-computed fraud score risk_scores.fraud_score Snowflake → ADR-036 write-back
Sanctions status sanctions_status.clear MOD-013/014 on list change
Active session / device trust sessions MOD-068 on authentication
Credit decision existence flag credit_decisions.has_live_offer Snowflake → ADR-036 write-back

Rule: If removing a value from Postgres would prevent a payment from being authorised or an account operation from completing safely, it belongs in Tier 1.


Tier 2 — Operational decisions (Snowflake → Neon via ADR-036 decision inbox)

Decisions generated by Snowflake that change the operational state of a customer, account, or application. These cross the Snowflake → Neon boundary via the governed decision publication contract defined in ADR-036. They are not used on the transaction hot path but must be durable, idempotent, and auditable.

Decision type Published via Applied to
Onboarding outcome (accept/reject/refer) ADR-036 customers.onboarding_status
CDD risk tier assignment ADR-036 customers.cdd_tier
Fraud action (block/suspend/clear) ADR-036 accounts.status
Credit pre-approval decision ADR-036 credit_decisions
AML case escalation ADR-036 aml_cases.status

Rule: If Snowflake is making a decision that changes how the bank treats a customer or their account, it is Tier 2. It must use the ADR-036 contract. It must not be a direct database write.


Tier 3 — Reporting, insights, and visualisations (Snowflake only)

Data used for display, reporting, charting, dashboards, and insight cards. This data lives in Snowflake and is served to the front end via MOD-176 (Snowflake read API service) — the governed gateway built in bank-platform (SD07). It is never written into Postgres for display purposes. No Redis or external cache layer sits between the API and Snowflake.

Tier 3 has two sub-patterns with different access mechanics and warehouse configurations:

Tier 3a — Customer-facing signals: Pre-shaped presentation tables refreshed by Snowflake Dynamic Tables. MOD-176 performs trivial indexed point lookups by party_id. Dedicated XS warehouse, always-on during business hours, ≤500ms p99 target. See "Snowflake presentation layer" section below.

Tier 3b — Back-office and regulatory dashboards: Native Snowflake semantic layer (Cortex Analyst). SD06 modules define CREATE SEMANTIC VIEW objects in their migrations alongside dbt models, exposing named metrics, dimensions, and entities. MOD-176 accepts structured metric queries {metric, groupBy, filters} and proxies them to the Cortex Analyst REST API, which generates and executes SQL against the registered semantic models. Dedicated back-office warehouse, auto-suspend, ≤5 s p95 acceptable. See "Snowflake semantic layer" section below.

Data Sub-pattern Snowflake object
Spending summary & category breakdown 3a intelligence.spend_summary (Dynamic Table)
Idle cash / spend anomaly / FX signals 3a intelligence.customer_signals (Dynamic Table)
Payday automation suggestion 3a intelligence.customer_signals (Dynamic Table)
Net worth dashboard signals 3a intelligence.net_worth_view (Dynamic Table)
Predicted upcoming payments 3a intelligence.payment_predictions (Dynamic Table)
Customer risk score (display) 3a risk.customer_risk_view (Dynamic Table)
Capital ratio (CET1, RWA), LCR/NSFR 3b SD06 semantic views (Cortex Analyst)
Credit portfolio view (back office) 3b SD06 semantic views (Cortex Analyst)
AML typology dashboard 3b SD06 semantic views (Cortex Analyst)
Model risk register, performance, change control 3b SD06 semantic views (Cortex Analyst)
Regulatory returns (back office) 3b SD06 semantic views (Cortex Analyst)

Rule: If a value is shown in a chart, card, table, or report and is not required to gate or execute a transaction, it is Tier 3. It lives in Snowflake. It never lives in Postgres.


The boundary in plain language

Postgres holds state. Snowflake holds understanding.

Postgres knows the customer's balance, account status, and fraud score — because those facts must be available in microseconds during a payment. Snowflake knows what the balance means — whether there's idle cash, whether spending is anomalous, whether the customer is likely to churn — because that understanding is computed from patterns, not point-in-time facts.

The boundary is not about latency alone. It is about the nature of the data: - Facts that change atomically with operational events → Postgres - Decisions that change operational state → ADR-036 inbox → Postgres - Understanding, patterns, aggregations, and visualisations → Snowflake → API cache


Snowflake presentation layer

The heavy computation for Tier 3 data — aggregation, pivoting, roll-ups, ML inference output, signal scoring — happens at Dynamic Table refresh time, not at query time. The Snowflake read API queries purpose-built presentation tables: one row per customer, shaped to match the API response payload exactly. These are the serving layer, equivalent to a traditional BI semantic layer.

Presentation table Refresh cadence Contents
PLATFORM.reporting.home_screen_signals Per signal cadence (see ADR-019) Idle cash, anomaly, FX signal, payday suggestion — one row per customer
PLATFORM.reporting.spend_summary Nightly Category breakdown, merchant totals — one row per customer per month
PLATFORM.reporting.net_worth_view On balance change + 15-min FX refresh Account balances in NZD equivalent
RISK.reporting.customer_risk_view Daily Customer risk score for display
CREDIT.reporting.credit_offer_detail On decision publication Pre-approval rate, term, rationale
FINCRIME.reporting.aml_dashboard 15 minutes AML typology dashboard data
RISK.reporting.capital_positions End of day CET1, RWA, LCR, NSFR

The Snowflake read API query for each endpoint is a trivial indexed point lookup:

SELECT payload FROM platform.reporting.home_screen_signals WHERE customer_id = ?

Query latency against a warm warehouse on a pre-shaped row is 150–400ms — no external cache required.

Read API warehouse

A dedicated XS Snowflake warehouse serves the customer-facing read API exclusively. It is configured always-on during business hours with a 10-minute auto-suspend on inactivity, and auto-resume. This is a minimal always-on footprint — cost is approximately $5–15/day — justified by consistent sub-500ms query latency on the customer path.

Back office reports, dashboards, and analytical queries use a separate warehouse and tolerate higher latency.

Data boundary — insights only

The Snowflake read API serves insight and reporting data only. It never returns operational data. Operational data (balances, account status, transaction list, payment limits) is served from Neon directly via the domain APIs. The home screen mixes both sources — operational data from Neon, insight signals from Snowflake — but these are separate API calls with separate latency characteristics.


Snowflake semantic layer (Tier 3b)

Back-office and regulatory dashboards use the native Snowflake semantic layer (Cortex Analyst) rather than pre-shaped presentation tables. The two patterns are complementary — Tier 3a optimises for per-customer point lookup latency; Tier 3b optimises for metric flexibility across arbitrary dimensions and time ranges.

Semantic view ownership

Each SD06 module defines its semantic model as CREATE SEMANTIC VIEW DDL in its own migrations directory, alongside dbt models. Semantic views reference the module's published dbt views (the schema-as-product contract from ADR-046) — they never reference raw tables or cross-schema implementation detail.

A semantic view exposes: - Metrics — named, pre-defined measures (lcr_ratio, tier_1_capital, model_psi_score) - Dimensions — grouping attributes (jurisdiction, model_id, reporting_date, product_type) - Entities — the primary grain (account, model, counterparty)

Query path

MOD-177 (renderer) → POST /v1/snowflake/metrics (MOD-176)
  → Cortex Analyst REST API
    → SQL generated against SD06 semantic views
      → back-office warehouse
        → JSON result → renderer → Recharts

MOD-176 is a proxy only. It forwards the structured query, enforces RBAC role scoping, logs the query, and returns the result. Metric logic lives entirely in Snowflake.

Back-office warehouse

A dedicated back-office warehouse serves all Tier 3b queries. It is configured with auto-suspend after inactivity and auto-resume on demand. Query latency on complex aggregations (multi-dimension, multi-period) is 1–5 seconds — acceptable for back-office use. This warehouse is separate from the XS customer-facing warehouse.


Constraint this creates for all repos

Every repo must apply the tier classification before writing data access code:

  1. bank-app / bank-platform (API layer): The Snowflake read API must be built as a first-class service with per-customer caching, circuit breaking, and query governance. The API layer owns the ≤200ms p99 latency SLA for Tier 3 data; it achieves this via the cache, not by avoiding Snowflake.

  2. bank-risk-platform (Snowflake): All analytical outputs are Tier 3 by default unless they are operational decisions (Tier 2). No risk model output should be written to Postgres except via the ADR-036 decision inbox. Capital, liquidity, and customer risk score values stay in Snowflake.

  3. All repos: No repo may add a Postgres table for the purpose of caching Snowflake reporting data. If a reporting value is needed in the app, the answer is always: add it to the Snowflake read API and cache — never add it to the Postgres schema.


What this supersedes in prior ADRs

ADR Prior text Superseded by this rule
ADR-001 Write-back table listed capital_positions, liquidity_positions, customer_scores These are Tier 3 — they stay in Snowflake. Write-back is for Tier 1 fraud/sanctions scores and Tier 2 decisions only.
ADR-019 "Results written to Postgres via write-back API" for insight signals Superseded — insight signals are Tier 3. Postgres holds only the credit decision existence flag (Tier 2).

ADR-001 and ADR-019 are Proposed drafts and have been updated to reflect this classification. ADR-001 is the Accepted decision on OLTP store choice — its OLTP stance is unchanged; only the overbroad write-back examples are corrected in this ADR.


Snowflake read API design constraints

The Snowflake read API is a new internal service (to be built in bank-platform). Constraints:

  • Per-customer query scoping: All queries must be scoped to the requesting customer's ID — no query may return data for a different customer regardless of what parameters are passed.
  • Row-level security in Snowflake: Snowflake presentation tables serving customer data must have row access policies enforced at the Snowflake layer, providing defence-in-depth independent of the API layer.
  • Insights only: The read API returns Tier 3 data exclusively. It must not be used to serve operational data (balances, account status, transactions) — those come from Neon via domain APIs.
  • Circuit breaker: If Snowflake query latency exceeds 1 second p95, the API must return a graceful degraded response (empty insight cards with a "not available" state) rather than waiting. A slow or unavailable Snowflake has zero impact on the transaction path.
  • Query governance: All Snowflake queries from the read API must be logged with customer ID, signal type, query hash, and warehouse used — for cost attribution and anomaly detection.
  • No Redis or external cache: Freshness is managed by Dynamic Table refresh cadence. The presentation layer is the cache — there is no additional caching tier between the API Lambda and Snowflake.

Principles alignment

Principle Assessment Notes
AP-001 KISS Single rule; clear tier classification; no ambiguity about where data lives
AP-002 Data governance Snowflake is the governed analytical store; Postgres is not used as a reporting cache
AP-004 Security Row-level security in Snowflake; per-customer cache keys; no cross-customer data leakage path
AP-006 Cost effective No Postgres schema sprawl from reporting columns; Snowflake compute on demand
AP-007 Evolution New insight signals require only a new Snowflake dynamic table — no Postgres migration
AP-010 Modular Clean separation: Postgres owns operational state; Snowflake owns analytical understanding

Perspectives

Perspective Assessment Notes
Performance & Scalability API-layer cache absorbs load; Snowflake compute scales independently
Resilience ~ Cache provides degraded-mode resilience; Snowflake unavailability affects Tier 3 display only — no transaction impact
Evolution Adding new insight signals requires no Postgres schema change
Security Row-level security + per-customer cache keys + circuit breaker
Regulatory Regulatory reporting stays in Snowflake — the governed, auditable analytical layer
Cost Avoids Postgres bloat from reporting columns; cache reduces Snowflake compute cost

See perspectives.md for guidance on producing these viewpoints.


  • ADR-001 — Postgres as the OLTP store; Tier 1 data lives here
  • ADR-002 — Snowflake as the analytics platform; Tier 3 data lives here
  • ADR-003 — Facts flow Neon → Snowflake
  • ADR-036 — Tier 2 decisions flow Snowflake → Neon via decision inbox
  • ADR-009 — Visualisation approach; uses Tier 3 data via Snowflake read API
  • ADR-019 — Home screen intelligence; Tier 3 data via API cache

Signoff record

Date Name Role Status
2026-04-10 Ross Millen CTO Approved
2026-04-10 Ross Millen Head of Architecture Approved
2026-04-10 Ross Millen Head of Data Approved

Capabilities

Capability Description Relationship
CAP-019 Monthly spending summary & trends governed — Tier 3: served from Snowflake intelligence.spend_summary via read API
CAP-020 Predicted upcoming payments governed — Tier 3: served from Snowflake intelligence.payment_predictions
CAP-021 Low balance alert governed — Tier 3 signal; balance fact is Tier 1 in Postgres
CAP-022 Unusual transaction alert governed — Tier 3: anomaly signal from Snowflake
CAP-023 Net worth dashboard (linked external accounts) governed — Tier 3: served from intelligence.net_worth_view
CAP-038 Real-time fraud scoring & block governed — fraud score is Tier 1 in Postgres (pre-computed write-back)
CAP-062 CDD automatic tier assignment governed — CDD tier decision is Tier 2 via ADR-036 decision inbox
CAP-063 Proactive financial insight engine governed — all insight signals are Tier 3; never enter Postgres
CAP-065 Pre-approval engine (nightly eligibility scoring) governed — credit decision flag is Tier 2; offer detail is Tier 3
CAP-067 Idle cash detection governed — idle cash amount is Tier 3; stays in Snowflake
CAP-070 Real-time capital ratio engine (CET1/RWA) governed — capital_positions is Tier 3; stays in Snowflake
CAP-071 LCR/NSFR continuous calculation governed — liquidity_positions is Tier 3; stays in Snowflake
CAP-072 Regulatory threshold alerting governed — threshold monitoring is Tier 3

ADR Title Relationship
ADR-001 Postgres as the OLTP operational store Tier 1 data lives in Postgres
ADR-002 Snowflake as the analytics and risk compute platform Tier 2 and Tier 3 data lives in Snowflake
ADR-003 CDC pipeline — Neon Postgres to Snowflake via Firehose and Apache Iceberg facts flow Neon to Snowflake to populate Tier 3
ADR-009 Insights and data visualisation approach visualisation approach consumes Tier 3 via read API
ADR-019 Intelligent customer home screen and financial intelligence layer home screen reads Tier 1 from Neon and Tier 3 from Snowflake
ADR-036 Decision result publication — governed Snowflake → Neon write-back contract Tier 2 decisions cross Snowflake to Neon via this ADR

All ADRs Compiled 2026-05-22 from source/entities/adrs/ADR-038.yaml