Skip to content

Data architecture

Core principle

Postgres is the operational source of truth. Snowflake is the analytical brain. Kafka is the nerve system connecting them. Nothing flows backwards — data moves from Postgres → Kafka → Snowflake. Operational results from Snowflake write back to Postgres via a controlled API.

OLTP layer — Postgres

Purpose: Every transactional write. Source of truth for all financial positions.

Key schemas: - ledger — ledger_entries (append-only), accounts, balances - customers — customer records, kyc_status, risk_score (write-back) - kyc — kyc_records, cdd_decisions, sanctions_matches - payments — payment_instructions, payment_status, fx_conversions - products — product_instances, interest_rates, account_features

Critical rules: - ledger_entries is append-only — enforced by trigger. No UPDATE or DELETE ever. - All balance reads for payment authorisation use this layer — never Snowflake - Write latency target: p99 ≤ 10ms (NFR-012) - Read latency target: p99 ≤ 5ms for balance queries (NFR-013)

CDC pipeline — Debezium

Debezium reads the Postgres WAL (write-ahead log) and publishes a change event to Kafka for every INSERT, UPDATE, and DELETE across all operational tables. Schema-versioned via Confluent Schema Registry.

Latency target: Postgres commit → Kafka delivery ≤ 1s p99 (NFR-015)

Event streaming — Kafka

Topic naming: bank.[domain].[event-type]

Topic Source Consumers
bank.transactions.created bank-core bank-aml, bank-risk-platform, bank-app
bank.accounts.updated bank-core bank-risk-platform, bank-app
bank.kyc.events bank-kyc bank-aml, bank-risk-platform
bank.payments.completed bank-payments bank-core, bank-app
bank.ledger.entries bank-core bank-risk-platform

OLAP layer — Snowflake

Purpose: All analytical computation. Risk models, regulatory reporting, ML training, customer intelligence.

Key objects: - Dynamic tables — continuously refreshed from Kafka via Snowpipe Streaming - Cortex models — fraud scoring, categorisation, credit scoring, AML typology - Regulatory marts — capital/liquidity, RBNZ/APRA returns, AML reporting

Latency from Postgres commit to Snowflake availability: typically 5–15 seconds via Snowpipe Streaming.

Write-back pattern

Snowflake computes values that need to be read on the transaction path. These are written back to Postgres via a controlled write-back API — never via direct DB connection.

Value Snowflake model Write-back target Latency target
Customer risk score customer_risk_score customers.risk_score ≤ 60s (NFR-014)
Fraud score fraud_transaction_model payments.fraud_score ≤ 60s (NFR-014)
Credit pre-approval credit_preapproval_model credit_offers table Nightly batch
AML typology score aml_typology_model aml_scores table ≤ 60s

Rule: If a value is needed on the transaction path, it must be in Postgres. Snowflake is never called inline on a payment or account operation.

Data residency

All Postgres data for NZ customers stored in AWS ap-southeast-2 (Sydney) or ap-southeast-4 (Melbourne). AU customer data in AU regions. No NZ customer PII in US or EU regions. Snowflake account configured for AU data residency with no cross-region replication of PII.


  • Architectural principlesAP-002 (data governance), AP-008 (real time) and AP-001 (KISS) are the primary principles governing this architecture
  • ADR-001 — Postgres as the OLTP operational store
  • ADR-002 — Snowflake as the analytics and risk compute platform
  • ADR-003 — Kafka as the real-time event streaming backbone
  • ADR-013 — External data sharing from Snowflake