Data architecture¶

Core principle¶

Postgres is the operational source of truth. Snowflake is the analytical brain. Kafka is the nerve system connecting them. Nothing flows backwards — data moves from Postgres → Kafka → Snowflake. Operational results from Snowflake write back to Postgres via a controlled API.

OLTP layer — Postgres¶

Purpose: Every transactional write. Source of truth for all financial positions.

Key schemas: - ledger — ledger_entries (append-only), accounts, balances - customers — customer records, kyc_status, risk_score (write-back) - kyc — kyc_records, cdd_decisions, sanctions_matches - payments — payment_instructions, payment_status, fx_conversions - products — product_instances, interest_rates, account_features

Critical rules: - ledger_entries is append-only — enforced by trigger. No UPDATE or DELETE ever. - All balance reads for payment authorisation use this layer — never Snowflake - Write latency target: p99 ≤ 10ms (NFR-012) - Read latency target: p99 ≤ 5ms for balance queries (NFR-013)

CDC pipeline — Debezium¶

Debezium reads the Postgres WAL (write-ahead log) and publishes a change event to Kafka for every INSERT, UPDATE, and DELETE across all operational tables. Schema-versioned via Confluent Schema Registry.

Latency target: Postgres commit → Kafka delivery ≤ 1s p99 (NFR-015)

Event streaming — Kafka¶

Topic naming: bank.[domain].[event-type]

Topic	Source	Consumers
`bank.transactions.created`	bank-core	bank-aml, bank-risk-platform, bank-app
`bank.accounts.updated`	bank-core	bank-risk-platform, bank-app
`bank.kyc.events`	bank-kyc	bank-aml, bank-risk-platform
`bank.payments.completed`	bank-payments	bank-core, bank-app
`bank.ledger.entries`	bank-core	bank-risk-platform

OLAP layer — Snowflake¶

Purpose: All analytical computation. Risk models, regulatory reporting, ML training, customer intelligence.

Key objects: - Dynamic tables — continuously refreshed from Kafka via Snowpipe Streaming - Cortex models — fraud scoring, categorisation, credit scoring, AML typology - Regulatory marts — capital/liquidity, RBNZ/APRA returns, AML reporting

Latency from Postgres commit to Snowflake availability: typically 5–15 seconds via Snowpipe Streaming.

Write-back pattern¶

Snowflake computes values that need to be read on the transaction path. These are written back to Postgres via a controlled write-back API — never via direct DB connection.

Value	Snowflake model	Write-back target	Latency target
Customer risk score	customer_risk_score	`customers.risk_score`	≤ 60s (NFR-014)
Fraud score	fraud_transaction_model	`payments.fraud_score`	≤ 60s (NFR-014)
Credit pre-approval	credit_preapproval_model	`credit_offers` table	Nightly batch
AML typology score	aml_typology_model	`aml_scores` table	≤ 60s

Rule: If a value is needed on the transaction path, it must be in Postgres. Snowflake is never called inline on a payment or account operation.

Data residency¶

All Postgres data for NZ customers stored in AWS ap-southeast-2 (Sydney) or ap-southeast-4 (Melbourne). AU customer data in AU regions. No NZ customer PII in US or EU regions. Snowflake account configured for AU data residency with no cross-region replication of PII.

Architectural principles — AP-002 (data governance), AP-008 (real time) and AP-001 (KISS) are the primary principles governing this architecture
ADR-001 — Postgres as the OLTP operational store
ADR-002 — Snowflake as the analytics and risk compute platform
ADR-003 — Kafka as the real-time event streaming backbone
ADR-013 — External data sharing from Snowflake