Skip to content

ADR-003: CDC pipeline — Neon Postgres to Snowflake via Firehose and Apache Iceberg

Status Accepted
Date 2026-04-10
Deciders CTO, Head of Architecture
Affects repos bank-core, bank-kyc, bank-aml, bank-payments, bank-credit, bank-risk-platform, bank-platform

Status

Accepted — 2026-04-10

The previous file at this ADR number (Kafka as event streaming backbone) was AI-generated without human approval and has been replaced. This is the first human-reviewed decision for ADR-003.

Context

The bank requires a pipeline from the Neon Postgres operational databases to Snowflake for analytics, regulatory reporting, ML model training, and AML monitoring. Requirements: capture all committed WAL changes, preserve an immutable append-only record for lineage, and align with the serverless cost-first paradigm (AP-001, AP-006, AP-010).

The AI-generated decision previously at this number selected Kafka + Debezium, mandating always-on managed infrastructure ($200–800/month baseline) in conflict with the serverless architecture (ADR-025). It also conflated CDC-to-analytics with inter-Lambda event routing — concerns that are separable and addressed by different tools. Inter-Lambda routing is covered in ADR-029 (superseded by ADR-051; see ADR-051 for current EventBridge bus naming convention).

Decision

EventBridge Scheduler → CDC Lambda → Kinesis Firehose → S3 Iceberg → Snowflake External Iceberg Table

Layer Component Detail
Trigger EventBridge Scheduler 60-second interval per domain database
CDC Lambda Direct connection to Neon (not pooled). Calls pg_logical_slot_get_changes(), reads committed changes, advances slot, stores last LSN in S3.
Delivery Kinesis Firehose Buffers at 60s / 128MB. Writes Apache Iceberg files to S3 via AWS Glue Data Catalog. Native Iceberg support added 2024.
Storage S3 Iceberg tables One S3 prefix per domain DB (core, kyc, aml, payments, credit, risk, platform), containing one Iceberg table per source Postgres table within that domain. Append-only snapshots — full time travel and change lineage per table.
Analytics Snowflake External Iceberg Tables One External Iceberg Table per source Postgres table, organised into Snowflake schemas by domain (e.g. raw_core, raw_kyc, raw_payments). Zero-copy reads from S3 — no Snowpipe ingestion cost. Snowflake Dynamic Tables derive the refreshed analytical layer on top.

Migration path

When Snowflake Openflow's Iceberg support reaches GA, the Lambda + Firehose layer will be replaced by Openflow BYOC. S3 Iceberg files produced under this ADR remain readable — no data migration required. A new ADR records that transition.

NFR change

NFR-015 (Kafka end-to-end latency ≤ 1s) is retired — it was written for the superseded Kafka decision and has no independent business basis. Fraud scoring (NFR-021) runs synchronously via Snowflake Cortex on the payment path; it does not depend on CDC latency.

Replacement — NFR-015: CDC-to-Snowflake data availability p99 ≤ 5 minutes. Appropriate for analytics, AML monitoring, and regulatory reporting workloads.

Neon-specific constraints

  • wal_level = logical must be set in the Neon project — irreversible once changed.
  • CDC Lambda must use the direct connection string, not the PgBouncer pooled endpoint. Logical replication is incompatible with connection poolers.
  • Neon drops inactive replication slots after ~40 hours. 60-second polling keeps slots active. Monitoring must alert if the Lambda fails for more than 30 continuous hours.
  • One slot per domain DB. Neon default max_replication_slots = 10 — sufficient for 7 DBs.

Rejected alternatives

Option Reason rejected
Apache Kafka + Debezium Always-on managed cluster; $200–800/month baseline; conflicts with serverless paradigm
Snowflake Openflow (current) EKS baseline ~$72/month; Iceberg not yet GA; SPCS compute pool always-on
AWS DMS Serverless ~$30–60/month minimum; Parquet output only — no native Iceberg
Debezium standalone Requires always-on JVM process

Consequences

Positive — No always-on streaming infrastructure; near-zero cost at launch volumes. S3 Iceberg is open format — readable by Athena, Spark, and future tooling independently of Snowflake. Append-only snapshots satisfy immutability requirements for governance and lineage.

Negative / trade-offs — Lambda manages the replication slot LSN (operational code requiring monitoring). ~60–120s end-to-end latency to Snowflake (acceptable for all analytics workloads). wal_level = logical is irreversible in Neon.



Signoff record

Date Name Role Status
2026-04-10 Ross Millen CTO Approved
2026-04-10 Ross Millen Head of Architecture Approved
2026-04-10 Ross Millen Head of Data Approved

Capabilities

Capability Description Relationship
CAP-025 Automated regulatory reporting pipeline enabled — regulatory data reaches Snowflake via this pipeline
CAP-027 Automated AML submission pipeline enabled — AML monitoring data flows Postgres to Snowflake via CDC
CAP-063 Proactive financial insight engine enabled — customer signals are derived from CDC-delivered facts in Snowflake
CAP-070 Real-time capital ratio engine (CET1/RWA) enabled — capital inputs flow from Postgres to Snowflake via CDC
CAP-071 LCR/NSFR continuous calculation enabled — liquidity position data flows via this pipeline

ADR Title Relationship
ADR-001 Postgres as the OLTP operational store source of CDC stream
ADR-002 Snowflake as the analytics and risk compute platform destination of CDC stream
ADR-024 Database hosting — Neon serverless Postgres Neon-specific WAL constraints documented here
ADR-029 (superseded by ADR-051; see ADR-051 for current EventBridge bus naming convention) Domain event routing via Amazon EventBridge separated concern — EventBridge handles inter-Lambda events; CDC handles analytics ingest

All ADRs Compiled 2026-05-22 from source/entities/adrs/ADR-003.yaml