Technical design — MOD-003 Real-time balance engine

Module: MOD-003
System: SD01 Core Banking
Repo: bank-core
FR scope: FR-053..056, FR-429..432
NFR scope: NFR-012, NFR-013, NFR-019
Policies satisfied: PAY-001 (GATE), CLQ-002 (CALC), CON-005 (AUTO), CLQ-004 (CALC)
Author: AI coding agent (Claude)
Date: 2026-04-30

Objective

MOD-003 is the read-side of SD01 account state. MOD-001 already keeps accounts.accounts.balance and available_balance in sync inside the posting transaction (FR-053's write path); MOD-003 owns the synchronous balance read API, the hold lifecycle (FR-429), historical balance reconstruction from MOD-002's immutable transaction log (FR-056), the daily EOD balance snapshot for regulatory point-in-time queries (FR-431), and the publication of bank.core.balance_updated to downstream consumers. Every payment validator (MOD-020), fraud scorer (MOD-023), liquidity engine (MOD-032), IRRBB engine (MOD-035), customer dashboard (MOD-077), and transaction history view (MOD-070) calls into this module.

Internal architecture

API Gateway HTTP API
   GET  /internal/v1/balance/{account_id}            ─▶ Mod003BalanceQueryHandler
   GET  /internal/v1/balance/by-party/{party_id}     ─▶ Mod003BalanceQueryHandler
   GET  /internal/v1/balance/{account_id}/at         ─▶ Mod003ReconstructionHandler
   POST /internal/v1/holds                           ─▶ Mod003HoldsHandler
   POST /internal/v1/holds/{hold_id}/release         ─▶ Mod003HoldsHandler

EventBridge bank-core ─▶ posting_completed rule ─▶ Mod003BalanceUpdatedPublisher
                                                    └─▶ bank.core.balance_updated

EventBridge schedules:
   cron(55 11 * * ? *)   — NZ EOD, 23:55 NZST ─▶ Mod003EodSnapshotJob
   cron(55 13 * * ? *)   — AU EOD, 23:55 AEST ─▶ Mod003EodSnapshotJob
   rate(5 minutes)       — hold expiry sweep   ─▶ Mod003HoldExpirySweeper

Six Lambdas; one HTTP API with five routes; one EventBridge consumer rule plus three schedule rules; two CloudWatch alarms; one dashboard.

Key design decisions

Decision: MOD-001 keeps the writer path (orchestrator A1)

Context: FR-053 attributes balance maintenance to MOD-003. MOD-001 already updates accounts.accounts.balance / available_balance atomically inside the posting transaction.

Choice: MOD-001 keeps the writer; MOD-003 is the read surface + hold management + reconstruction + EOD snapshot + event publisher.

Reason: Smallest scope; preserves NFR-012 posting latency; treats FR-053 as already-satisfied by MOD-001's atomic update. MOD-003 asserts the read reflects the live denormalised values.

Decision: optimistic locking for hold writes only (orchestrator A2)

Context: FR-430 specifies optimistic locking with a version counter. MOD-001's posting flow uses pessimistic SELECT … FOR UPDATE.

Choice: Pessimistic on the posting hot path, optimistic on the hold-write path. The version int NOT NULL DEFAULT 0 column on accounts.accounts (added by V002) is incremented atomically inside both writers — MOD-001's FOR UPDATE makes the increment serialisable; MOD-003's hold writes do UPDATE … WHERE version = $expected and retry up to 3 times on row-count = 0.

Reason: Avoids invasive changes to MOD-001's posting hot path while satisfying FR-430's contract. The two strategies coexist because FOR UPDATE blocks the optimistic UPDATE until the posting commits, at which point the retry succeeds with the post-image version. After 3 failed retries → CONCURRENT_MODIFICATION (HTTP 503 + retryable per the FR).
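
As a rough illustration of the hold-write retry loop described above, the following TypeScript sketch models UPDATE … WHERE version = $expected as a callback that returns the affected row count. The names writeWithOptimisticLock, readVersion, and tryUpdate are hypothetical and not taken from the bank-core repo. The real calls would be asynchronous database round trips, and the sketch assumes "retry up to 3 times" means one initial attempt followed by at most three retries.

```typescript
// Result of a hold write. The failure branch mirrors the FR-430 contract
// quoted above (HTTP 503 + retryable); the field names are assumptions.
type HoldWriteResult =
  | { ok: true; newVersion: number; attempts: number }
  | { ok: false; code: "CONCURRENT_MODIFICATION"; httpStatus: 503; retryable: true };

// Stand-in for reading accounts.accounts.version for one account.
type ReadVersion = () => number;

// Stand-in for `UPDATE … WHERE version = $expected`.
// Returns the affected row count: 1 on success, 0 if the version moved on.
type TryUpdate = (expectedVersion: number) => number;

function writeWithOptimisticLock(
  readVersion: ReadVersion,
  tryUpdate: TryUpdate,
  maxRetries = 3,
): HoldWriteResult {
  // attempt 0 is the initial try; attempts 1..maxRetries are the retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const expected = readVersion(); // re-read so a retry sees the post-image version
    if (tryUpdate(expected) === 1) {
      return { ok: true, newVersion: expected + 1, attempts: attempt + 1 };
    }
  }
  return { ok: false, code: "CONCURRENT_MODIFICATION", httpStatus: 503, retryable: true };
}
```

Re-reading the version at the top of every iteration is what lets a retry succeed once a competing posting has committed. A loop that reused the first expected value would fail on every attempt.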

Decision: reconstruction reads from MOD-002 (orchestrator A6)

Context: Both accounts.postings (MOD-001) and core.transaction_log (MOD-002) contain the same data. FR-056 is a "match the stored balance" check.

Choice: Replay signed amounts from core.transaction_log. The hash chain verifies its own integrity (FR-427); using it as the source decouples MOD-003 from MOD-001's storage layout.

Reason: Independent verification path. If the live balance ever drifts from the immutable log, the FR-056 endpoint surfaces it.
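
The replay idea can be sketched in a few lines of TypeScript. This is only an in-memory model under stated assumptions: the LogEntry shape, the function names, and the use of integer minor units held in a number are all hypothetical, since the real core.transaction_log columns are not shown in this document.

```typescript
// Hypothetical row shape for one entry of the immutable log.
interface LogEntry {
  accountId: string;
  postedAt: Date;
  signedAmountMinor: number; // credits positive, debits negative, integer minor units
}

// Replay signed amounts for one account up to and including `asOf`.
function reconstructBalance(
  log: readonly LogEntry[],
  accountId: string,
  asOf: Date,
): number {
  let balance = 0;
  for (const entry of log) {
    if (entry.accountId === accountId && entry.postedAt.getTime() <= asOf.getTime()) {
      balance += entry.signedAmountMinor;
    }
  }
  return balance;
}

interface DriftReport {
  reconstructedMinor: number;
  storedMinor: number;
  driftMinor: number; // stored minus reconstructed; zero when they agree
  matches: boolean;
}

// The "match the stored balance" check: compare the replayed value with the
// live denormalised balance and report any difference.
function compareWithStored(reconstructedMinor: number, storedMinor: number): DriftReport {
  const driftMinor = storedMinor - reconstructedMinor;
  return { reconstructedMinor, storedMinor, driftMinor, matches: driftMinor === 0 };
}
```

A non-zero driftMinor is the signal the Reason above refers to: the live balance no longer agrees with the log.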

Decision: EOD snapshots are append-only

Context: FR-431 requires daily EOD snapshots, retained 7 years, "to support regulatory point-in-time balance queries".

Choice: accounts.daily_balance_snapshots is append-only — INSERT-only role grant + RLS policies that block UPDATE / DELETE (same pattern as MOD-002 core.transaction_log). Re-running the snapshot for the same date is a no-op via ON CONFLICT (snapshot_date, account_id) DO NOTHING.

Reason: A regulatory snapshot must be tamper-evident. Anyone who needs to "correct" a snapshot inserts a one-off audit record at a synthetic date — never overwrites the original.
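
To make the idempotent re-run concrete, here is a minimal sketch of what the snapshot statement might look like, held as a TypeScript constant with a small parameter builder. The ON CONFLICT clause and the jurisdiction filter come from this document. Everything else is an assumption for illustration: the names EOD_SNAPSHOT_SQL and eodSnapshotParams, the accounts.accounts.id column, and the snapshot table carrying balance and available_balance columns.

```typescript
type Jurisdiction = "NZ" | "AU";

// Assumed column layout: only snapshot_date and account_id on the snapshot
// table are confirmed by the design text; the rest is illustrative.
const EOD_SNAPSHOT_SQL = `
INSERT INTO accounts.daily_balance_snapshots
  (snapshot_date, account_id, balance, available_balance)
SELECT $1::date, a.id, a.balance, a.available_balance
FROM accounts.accounts AS a
WHERE a.jurisdiction = $2
ON CONFLICT (snapshot_date, account_id) DO NOTHING
`;

// Build the positional parameters, rejecting anything that is not YYYY-MM-DD
// so a malformed date never reaches the database.
function eodSnapshotParams(
  snapshotDate: string,
  jurisdiction: Jurisdiction,
): [string, Jurisdiction] {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(snapshotDate)) {
    throw new Error(`snapshotDate must be YYYY-MM-DD, got "${snapshotDate}"`);
  }
  return [snapshotDate, jurisdiction];
}
```

Because a second run for the same date conflicts on every (snapshot_date, account_id) pair, it inserts nothing and leaves the original rows untouched, which is consistent with the INSERT-only grant.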

Decision: hold expiry computed at read time + scheduled cleanup

Context: FR-429 says active holds reduce available balance and expire after 24h (default).

Choice: The LATERAL aggregate in balance-reader.ts filters by expires_at > now() AND released_at IS NULL, so an expired hold ceases to reduce available balance the instant now() rolls past expires_at. A scheduled sweeper Lambda (rate(5 minutes)) flips released_at = now() on expired rows so the partial index stays small.

Reason: Read-time freshness without depending on the sweeper's cadence; sweeper is bookkeeping not correctness.
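
The split between read-time filtering and the sweeper can be modelled in memory as follows. This is a sketch, not the contents of balance-reader.ts: the Hold shape and the function names are hypothetical, amounts are integer minor units, and it assumes available balance is a base balance minus the sum of active holds.

```typescript
// Hypothetical in-memory view of one accounts.pending_holds row.
interface Hold {
  amountMinor: number;
  expiresAt: Date;
  releasedAt: Date | null;
}

// Same predicate as the read-side filter: expires_at > now() AND released_at IS NULL.
function isActive(hold: Hold, now: Date): boolean {
  return hold.releasedAt === null && hold.expiresAt.getTime() > now.getTime();
}

// Available balance computed at read time, independent of the sweeper.
function availableBalance(balanceMinor: number, holds: readonly Hold[], now: Date): number {
  let held = 0;
  for (const hold of holds) {
    if (isActive(hold, now)) held += hold.amountMinor;
  }
  return balanceMinor - held;
}

// Bookkeeping only: mark expired, unreleased holds as released.
// Returns how many rows were flipped.
function sweepExpired(holds: Hold[], now: Date): number {
  let swept = 0;
  for (const hold of holds) {
    if (hold.releasedAt === null && hold.expiresAt.getTime() <= now.getTime()) {
      hold.releasedAt = now;
      swept++;
    }
  }
  return swept;
}
```

Calling availableBalance before and after sweepExpired with the same now returns the same value, which is the sense in which the sweeper is bookkeeping rather than correctness.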

Decision: per-jurisdiction EOD cron

Context: NZ and AU have different "end of day" wall-clocks (NZST = UTC+12, AEST = UTC+10).

Choice: Two separate aws.cloudwatch.EventRules, each invoking the EOD Lambda with { jurisdiction: "NZ" | "AU" } in the input. The Lambda's SQL filters accounts.accounts WHERE jurisdiction = $1.

Reason: Each jurisdiction's snapshot reflects its own true EOD. Daylight-saving handling deferred to a follow-up — today the cron runs on standard-time offsets and accepts the 1-hour slip across DST.
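
The schedule expressions follow from simple offset arithmetic, sketched below in TypeScript. The helper names are hypothetical and the sketch only handles whole-hour UTC offsets, which is enough for NZST (UTC+12) and AEST (UTC+10).

```typescript
// Convert a local wall-clock time at a fixed whole-hour UTC offset into an
// EventBridge cron expression (EventBridge cron schedules are evaluated in UTC).
function utcCronForLocalTime(
  localHour: number,
  localMinute: number,
  utcOffsetHours: number,
): string {
  const utcHour = (((localHour - utcOffsetHours) % 24) + 24) % 24;
  return `cron(${localMinute} ${utcHour} * * ? *)`;
}

// Inverse direction: the local hour at which a given UTC hour fires.
// Useful for seeing the slip when daylight saving changes the offset.
function localHourForUtc(utcHour: number, utcOffsetHours: number): number {
  return (((utcHour + utcOffsetHours) % 24) + 24) % 24;
}

const NZ_EOD_CRON = utcCronForLocalTime(23, 55, 12); // 23:55 NZST
const AU_EOD_CRON = utcCronForLocalTime(23, 55, 10); // 23:55 AEST
```

With these inputs the helper reproduces the two expressions in the schedule list, cron(55 11 * * ? *) and cron(55 13 * * ? *). Under daylight saving the offsets become UTC+13 and UTC+11, so localHourForUtc(11, 13) and localHourForUtc(13, 11) both give 0: the job fires at 00:55 local time, which is the 1-hour slip accepted above.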

External dependencies

  • Database: bank_core on Neon (provisioned by MOD-103)
      • READ: accounts.accounts, accounts.pending_holds, accounts.account_party_relationships, core.transaction_log
      • WRITE: accounts.pending_holds, accounts.daily_balance_snapshots, accounts.accounts.version
  • EventBridge (bank-core bus)
      • Consumes: bank.core.posting_completed
      • Publishes: bank.core.balance_updated
  • Secrets Manager: bank-neon/{stage}/bank_core/app_user
  • SSM (read):
      • /bank/{stage}/eventbridge/bank-core/arn
      • /bank/{stage}/eventbridge/bank-core/dlq-arn
      • /bank/{stage}/iam/lambda/bank-core/arn
      • /bank/{stage}/observability/adot-nodejs-layer-arn
      • /bank/{stage}/sns/alerts/arn
      • /bank/{stage}/mod-002/transaction-log-table

SSM outputs table

| Output | SSM path | Consumers |
|---|---|---|
| Balance API base URL | /bank/{stage}/mod-003/api/base-url | MOD-020, MOD-023, MOD-070, MOD-077, MOD-032, MOD-035 |
| Single-account URL | /bank/{stage}/mod-003/balance/url | (alias) |
| Multi-account URL | /bank/{stage}/mod-003/balance/multi/url | MOD-077, MOD-074 |
| Holds URL | /bank/{stage}/mod-003/holds/url | MOD-020, MOD-023 |
| Reconstruction URL | /bank/{stage}/mod-003/reconstruct/url | MOD-018, MOD-074 |
| Daily snapshot table | /bank/{stage}/mod-003/daily-snapshot-table | MOD-036, MOD-042 |
| Lambda ARNs | /bank/{stage}/mod-003/{balance-query,holds}-lambda/arn | MOD-020 (direct invoke if selected) |

Security and data handling

  • No customer PII flows through MOD-003; UUIDs and money amounts only.
  • The EOD snapshot table is append-only at the DB layer (privilege revoke + RLS), defending the regulatory point-in-time query against tampering by the runtime app role.
  • Holds carry payment_id and release_reason only — no document or free-text customer data.

Performance approach

  • NFR-013 ≤ 5 ms p99 balance read: a single SELECT on accounts.accounts joined to a LATERAL aggregate of pending_holds. Both keys are indexed (accounts_pkey, idx_pending_holds_account_id_active) so the read is a primary-key lookup + sub-millisecond LATERAL. withConnection (no BEGIN/COMMIT) avoids the transaction overhead for read-only queries. Real verification is the staging in-region load test; the integration NFR-013 check here bounds dev single-read latency at 1 s as a regression gate (laptop ↔ Sydney RTT dominates).
  • NFR-013 ≤ 20 ms p99 multi-account read (FR-432): keyset on idx_acct_party_rel_party_id filters down to ≤ 50 relationships before joining accounts.
  • ADOT Node.js layer attached to all six Lambdas; X-Ray spans flow automatically with trace_id correlation.

Error handling

  • Sync HTTP paths — standard error envelope per the error-handling standard (HTTP 422 / 503 / 500).
  • CONCURRENT_MODIFICATION is HTTP 503 + retryable: true (FR-430 contract) — caller retries with the same idempotency_key.
  • EventBridge consumer (posting_completed → balance_updated): re-raise on transient failures so EventBridge retries; the bank-core DLQ catches the event after retry exhaustion.
  • Scheduled paths (EOD, sweeper) — re-raise on transient failures so the next scheduled run retries; alarm trips if errors ≥ 3 in 5 min.
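
For the synchronous paths, the status mapping above could be expressed along these lines. This is a sketch under assumptions: the error-handling standard itself is not reproduced in this document, so the envelope field names, the VALIDATION_FAILED and INTERNAL_ERROR code strings, and the toHttpError helper are all hypothetical. Only CONCURRENT_MODIFICATION, the 422 / 503 / 500 statuses, and retryable: true for the 503 case come from the text.

```typescript
type ErrorCode = "VALIDATION_FAILED" | "CONCURRENT_MODIFICATION" | "INTERNAL_ERROR";

// Assumed envelope shape; the real standard may name these fields differently.
interface ErrorEnvelope {
  error: {
    code: ErrorCode;
    message: string;
    retryable: boolean;
    trace_id: string;
  };
}

const STATUS_BY_CODE: Record<ErrorCode, number> = {
  VALIDATION_FAILED: 422,
  CONCURRENT_MODIFICATION: 503,
  INTERNAL_ERROR: 500,
};

// Build an API Gateway style response. Only CONCURRENT_MODIFICATION is marked
// retryable here; treating INTERNAL_ERROR as non-retryable is an assumption.
function toHttpError(
  code: ErrorCode,
  message: string,
  traceId: string,
): { statusCode: number; body: string } {
  const envelope: ErrorEnvelope = {
    error: {
      code,
      message,
      retryable: code === "CONCURRENT_MODIFICATION",
      trace_id: traceId,
    },
  };
  return { statusCode: STATUS_BY_CODE[code], body: JSON.stringify(envelope) };
}
```

A caller that receives statusCode 503 with retryable set to true would resend the request with the same idempotency_key, as described in the list above.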

Event types emitted in structured logs

Registered in src/lib/logger.ts (EVENT_TYPES):

  • balance_query_served, balance_multi_query_served, balance_reconstruction_served
  • hold_created, hold_released, hold_expired
  • balance_updated_published, balance_updated_publish_failed
  • eod_snapshot_completed
  • concurrent_modification_retried
  • trace_id_missing_from_upstream, validation_failed, internal_error

Test approach

| Tier | Files | Status |
|---|---|---|
| Unit | tests/unit/{amount,logger,trace,errors,emf,hold-math}.test.ts | 29 / 29 |
| Contract | tests/contract/{balance-updated-event,balance-response,holds-request-response}.test.ts | 6 / 6 |
| FR integration (one per FR) | tests/integration/fr-{053,054,055,056,429,430,431,432}.test.ts + observability-log-schema.test.ts | pending dev Neon |
| Policy satisfaction (one per row) | tests/policy/{pay-001,clq-002,con-005,clq-004}.test.ts | pending dev Neon |

The skipIfNoDb + transactionLogExists guards keep the integration tier green-with-skips while dev Neon's compute is unreachable; once dev is back the tests run unconditionally.