
Technical design — MOD-043 EventBridge domain event governance

Module: MOD-043 (EventBridge domain event governance)
System: SD07 (Data Platform & Governance Infrastructure)
Repo: bank-platform
FR scope: FR-269, FR-270, FR-271, FR-272
Policies satisfied: DT-001 (AUTO), DT-004 (AUTO), PRI-001 (AUTO), PRI-003 (AUTO)
Author: AI agent (Claude Opus 4.7)
Date: 2026-04-19
Stage covered: dev (account 647751526084, region ap-southeast-2)
SST permalink: https://sst.dev/u/491a888e


Objective

MOD-043 governs the 8 custom EventBridge buses provisioned by MOD-104. It adds:

  1. A single Amazon EventBridge Schema Registry (bank-events-{env}) with a Discoverer per bus and canonical JSON-Schema (draft-04) seeds.
  2. A governance rule per bus that captures every event and forwards it to a delivery-logger Lambda. Each rule target is configured with a 3-retry policy and a DLQ (the MOD-104 DLQ for that domain) — satisfying FR-271.
  3. A Node.js 20 delivery-logger Lambda that validates each event against the registry and emits a structured log record with the FR-272 fields (event_id, schema_version, source, event_time/received_at, delivery_status).
  4. CloudWatch alarms on DLQ depth for each of the 8 domain DLQs, publishing to a dedicated SNS topic.

This document describes the as-deployed state in dev. All 78 integration tests pass against the live environment. The Neon-backed platform.event_delivery_log persistence (MOD-042 dependency) is stubbed: the logger writes to CloudWatch Logs (90-day retention — FR-272) and a placeholder SQS queue that MOD-042 will wire to Postgres.


Execution model

Aspect | Decision
IaC tool | SST v3 Ion + raw @pulumi/aws resources (ADR-025)
Lambda runtime | Node.js 20 on arm64; 256 MB memory, 10 s timeout
Lambda packaging | Local esbuild bundle (CJS, Node 20 target) zipped into dist/delivery-logger/ and uploaded via pulumi.asset.FileArchive. Bundle size ~1.7 MB
Schema format | JSON Schema draft-04 (matches wiki schema-registry.md)
Schema validator | ajv-draft-04 + ajv-formats in-Lambda
Region | ap-southeast-2
Tagging | Provider defaultTags: tenant_id, module_id=MOD-043, environment, system_id=SD07, cost_center=sd07-bank-platform, managed_by=sst
Deployment identity | AWS_PROFILE=bank-dev for dev

Stack layout

eventbridge-governance/
├── sst.config.ts
├── src/
│   ├── stacks/
│   │   ├── schema-registry.ts     — bank-events registry + 8 Discoverers + seeded schemas
│   │   ├── governance-rules.ts    — 1 rule/bus forwarding to the logger + RetryPolicy(3) + DLQ target
│   │   ├── delivery-logger.ts     — Lambda + role + log group(90d) + placeholder SQS queue
│   │   └── dlq-alerts.ts          — 8 CloudWatch alarms on DLQ depth + SNS topic
│   ├── lambdas/
│   │   └── delivery-logger/
│   │       ├── index.ts           — handler: validate detail against registry, emit structured log, forward to SQS placeholder
│   │       ├── package.json
│   │       └── tsconfig.json
│   └── outputs.ts                 — 22 SSM parameters under /bank/{env}/mod043/...
├── schemas/
│   ├── bank.platform.notification_sent.json
│   └── bank.core.posting_completed.json
├── scripts/build-lambda.mjs       — esbuild wrapper producing dist/delivery-logger/index.js
└── __tests__/integration/         — 8 test files, 78 assertions (all live AWS)

AWS resources provisioned (dev stage)

Schema Registry (FR-269, FR-270)

Resource | Name | Notes
aws.schemas.Registry | bank-events-dev | Single registry per wiki spec; buses are the domain boundary
aws.schemas.Discoverer × 8 | one per bus | sourceArn resolved via SSM /bank/{env}/eventbridge/{domain}/arn
aws.schemas.Schema × 2 (seed) | bank.platform.notification_sent, bank.core.posting_completed | JSONSchemaDraft4. additionalProperties:false is mandatory per wiki

Schemas auto-version on change (AWS-native). FR-270 SLA (≤1 business day) is satisfied structurally: Discoverer latency <1 minute; seeded schemas deploy via sst deploy on merge to main.
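
To illustrate why additionalProperties:false matters for the seeded schemas, here is a minimal sketch of a draft-04 schema shape and a check for undeclared top-level properties. The schema body and the field names notification_id and channel are assumptions for illustration only; the real seeds live in schemas/, and the Lambda validates with ajv-draft-04 rather than this hand-rolled check.

```typescript
// Hypothetical draft-04 schema shape; the real seed schemas live in schemas/.
interface ObjectSchema {
  $schema: string;
  type: "object";
  required: string[];
  additionalProperties: boolean;
  properties: Record<string, { type: string }>;
}

const notificationSentSketch: ObjectSchema = {
  $schema: "http://json-schema.org/draft-04/schema#",
  type: "object",
  required: ["party_id", "notification_id"],
  additionalProperties: false,
  properties: {
    party_id: { type: "string" },
    notification_id: { type: "string" }, // assumed field name
    channel: { type: "string" }, // assumed field name
  },
};

// Returns the top-level keys that a validator would reject when
// additionalProperties is false. This is not a full JSON Schema validator.
function undeclaredProperties(
  schema: ObjectSchema,
  payload: Record<string, unknown>,
): string[] {
  if (schema.additionalProperties) return [];
  return Object.keys(payload).filter(
    (key) => !Object.prototype.hasOwnProperty.call(schema.properties, key),
  );
}

const rejected = undeclaredProperties(notificationSentSketch, {
  party_id: "p-123",
  notification_id: "n-456",
  email: "someone@example.com", // bare PII, not declared in the schema
});
console.log(rejected); // the undeclared keys, here only "email"
```

Because every property must be declared, a publisher cannot add a bare email or phone field without first changing the schema, which is where review happens.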

Governance rules + targets (FR-271)

Per domain d:

Resource | Name | Notes
aws.cloudwatch.EventRule | bank-{d}-governance-{env} | EventPattern { "source": [{ "prefix": "bank." }] }, which captures every event whose source starts with "bank."
aws.cloudwatch.EventTarget | - | Targets the delivery-logger Lambda. RetryPolicy.MaximumRetryAttempts=3, MaximumEventAgeInSeconds=3600. DeadLetterConfig.Arn = MOD-104 DLQ for d
aws.lambda.Permission | bank-{d}-gov-{env} | events.amazonaws.com → Lambda, scoped to the rule ARN
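
The per-domain target settings can be sketched as a plain function that returns the arguments for one governance target. This is a hedged illustration: the property names mirror the aws.cloudwatch.EventTarget inputs in @pulumi/aws, but the function name, the GovernanceTargetArgs type, the bus naming (taken from the bank-<domain> placeholder in the log schema) and the example ARNs are hypothetical, and governance-rules.ts may be structured differently.

```typescript
// Shape of the inputs this sketch passes to one EventBridge target.
interface GovernanceTargetArgs {
  rule: string;
  eventBusName: string;
  arn: string; // delivery-logger Lambda ARN
  retryPolicy: { maximumRetryAttempts: number; maximumEventAgeInSeconds: number };
  deadLetterConfig: { arn: string };
}

// Event pattern shared by every governance rule: match any source that
// starts with "bank.".
const governancePattern = JSON.stringify({ source: [{ prefix: "bank." }] });

function governanceTargetArgs(
  domain: string,
  env: string,
  loggerArn: string,
  dlqArn: string,
): GovernanceTargetArgs {
  return {
    rule: `bank-${domain}-governance-${env}`,
    eventBusName: `bank-${domain}`, // assumed bus naming
    arn: loggerArn,
    // FR-271: three retries within one hour, then hand off to the domain DLQ.
    retryPolicy: { maximumRetryAttempts: 3, maximumEventAgeInSeconds: 3600 },
    deadLetterConfig: { arn: dlqArn },
  };
}

// Placeholder ARNs for illustration only.
const coreTarget = governanceTargetArgs(
  "core",
  "dev",
  "arn:aws:lambda:ap-southeast-2:123456789012:function:example-logger",
  "arn:aws:sqs:ap-southeast-2:123456789012:example-core-dlq",
);
console.log(governancePattern, coreTarget.rule);
```

Keeping the retry policy and DLQ in one builder means all 8 buses get identical delivery guarantees, with only the domain-specific names and ARNs varying.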

Delivery logger (FR-272)

Resource | Value
Lambda name | bank-eventbridge-delivery-logger-dev
Runtime / arch | nodejs20.x / arm64
Handler | index.handler
Role | bank-eventbridge-delivery-logger-dev (inline: schemas:DescribeSchema, sqs:SendMessage on placeholder queue, logs:* on own group)
Log group | /aws/lambda/bank-eventbridge-delivery-logger-dev, retention 90 days (FR-272)
Env vars | SCHEMA_REGISTRY_NAME=bank-events-dev, DELIVERY_LOG_QUEUE_URL=..., MODULE_ID=MOD-043
Placeholder SQS queue | bank-eventbridge-delivery-log-placeholder-dev, MessageRetentionPeriod=1209600 (14 days, PRI-003), KMS alias/aws/sqs

Structured log schema:

{
  "level": "INFO",
  "event_type": "event_delivery_recorded",
  "event_id": "...",
  "schema_name": "bank.<domain>.<detail-type>",
  "schema_version": "1.0.0",
  "source": "bank.<domain>",
  "bus_name": "bank-<domain>",
  "detail_type": "<detail-type>",
  "event_time": "...",
  "received_at": "...",
  "delivery_status": "PUBLISHED|SCHEMA_REJECTED|SCHEMA_MISSING|ERROR",
  "failure_reason": "...",
  "payload_checksum": "sha256",
  "module_id": "MOD-043",
  "trace_id": "..."
}
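
As a sketch of how the record above might be assembled from an EventBridge envelope, the function below maps envelope fields onto the log schema. The names buildDeliveryRecord, EventEnvelope and RecordExtras are hypothetical, as is the rule deriving bus_name from source; the actual handler in index.ts may compute these differently.

```typescript
type DeliveryStatus = "PUBLISHED" | "SCHEMA_REJECTED" | "SCHEMA_MISSING" | "ERROR";

// Subset of the EventBridge event envelope used by this sketch.
interface EventEnvelope {
  id: string;
  source: string; // "bank.<domain>"
  "detail-type": string;
  time: string; // ISO 8601 timestamp set by EventBridge
}

// Values computed elsewhere in the handler (hypothetical grouping).
interface RecordExtras {
  schemaVersion: string;
  payloadChecksum: string;
  traceId: string;
  failureReason?: string;
}

function buildDeliveryRecord(
  event: EventEnvelope,
  status: DeliveryStatus,
  extras: RecordExtras,
  now: Date = new Date(),
) {
  return {
    level: "INFO",
    event_type: "event_delivery_recorded",
    event_id: event.id,
    schema_name: `${event.source}.${event["detail-type"]}`,
    schema_version: extras.schemaVersion,
    source: event.source,
    // Assumption: the bus name mirrors the source with "." replaced by "-".
    bus_name: event.source.replace(".", "-"),
    detail_type: event["detail-type"],
    event_time: event.time,
    received_at: now.toISOString(),
    delivery_status: status,
    failure_reason: extras.failureReason ?? null,
    payload_checksum: extras.payloadChecksum,
    module_id: "MOD-043",
    trace_id: extras.traceId,
  };
}

// Emitting one JSON line per event keeps the record queryable in CloudWatch Logs.
const record = buildDeliveryRecord(
  { id: "evt-1", source: "bank.core", "detail-type": "posting_completed", time: "2026-04-19T00:00:00Z" },
  "PUBLISHED",
  { schemaVersion: "1.0.0", payloadChecksum: "<sha256 hex>", traceId: "trace-1" },
);
console.log(JSON.stringify(record));
```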

The handler caches compiled validators in memory for the lifetime of each execution environment, so the cache is rebuilt after every cold start. When no schema is registered yet, the record is emitted with delivery_status=SCHEMA_MISSING: the logger provides observability without blocking delivery, because the publisher-side validator in @bank-platform/schema-registry is the enforcement point per the wiki.
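
The caching and status logic described here could look roughly like the sketch below. The schema lookup and compilation are injected as a loadValidator function so the example stays independent of the AWS SDK and ajv; that function, the classify name and the decision not to cache misses are assumptions, not a description of the deployed handler.

```typescript
type DeliveryStatus = "PUBLISHED" | "SCHEMA_REJECTED" | "SCHEMA_MISSING" | "ERROR";
type Validator = (detail: unknown) => boolean;

// Module scope: the cache survives warm invocations and starts empty
// after a cold start.
const validatorCache = new Map<string, Validator>();

// loadValidator resolves to undefined when the registry has no such schema.
async function classify(
  schemaName: string,
  detail: unknown,
  loadValidator: (name: string) => Promise<Validator | undefined>,
): Promise<DeliveryStatus> {
  try {
    let validate = validatorCache.get(schemaName);
    if (!validate) {
      validate = await loadValidator(schemaName);
      // Misses are not cached, so a schema registered later is picked up.
      if (!validate) return "SCHEMA_MISSING";
      validatorCache.set(schemaName, validate);
    }
    return validate(detail) ? "PUBLISHED" : "SCHEMA_REJECTED";
  } catch {
    // Registry or compilation failures are logged, never rethrown.
    return "ERROR";
  }
}
```

In this sketch the function never throws, which matches the observability-only role of the logger: a registry outage shows up as ERROR records rather than failed invocations.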

TODO (MOD-042): replace the placeholder SQS queue with a direct insert into platform.event_delivery_log (Neon) once MOD-042 lands. Reference: bank-wiki/source/pages/design/system/data-models/SD07-data-platform.md.

DLQ alarms + SNS

Resource | Value
aws.sns.Topic | bank-eventbridge-dlq-alerts-dev
aws.cloudwatch.MetricAlarm × 8 | bank-{d}-dlq-depth-{env}: AWS/SQS ApproximateNumberOfMessagesVisible > 0 for 1 of 1 one-minute periods; alarm and OK actions publish to the SNS topic
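
A hedged sketch of the per-domain alarm arguments follows. The property names follow the aws.cloudwatch.MetricAlarm inputs in @pulumi/aws; the dlqAlarmArgs helper, the Maximum statistic and the example queue name and topic ARN are assumptions, since the source only specifies the metric, threshold and period.

```typescript
interface DlqAlarmArgs {
  name: string;
  namespace: string;
  metricName: string;
  dimensions: { QueueName: string };
  statistic: string;
  period: number; // seconds
  evaluationPeriods: number;
  datapointsToAlarm: number;
  threshold: number;
  comparisonOperator: string;
  alarmActions: string[];
  okActions: string[];
}

function dlqAlarmArgs(
  domain: string,
  env: string,
  dlqQueueName: string,
  topicArn: string,
): DlqAlarmArgs {
  return {
    name: `bank-${domain}-dlq-depth-${env}`,
    namespace: "AWS/SQS",
    metricName: "ApproximateNumberOfMessagesVisible",
    dimensions: { QueueName: dlqQueueName },
    statistic: "Maximum", // assumed; the source does not name the statistic
    period: 60,
    evaluationPeriods: 1,
    datapointsToAlarm: 1,
    threshold: 0,
    comparisonOperator: "GreaterThanThreshold", // fires on the first dead-lettered event
    alarmActions: [topicArn],
    okActions: [topicArn],
  };
}

// Placeholder queue name and topic ARN for illustration only.
const coreAlarm = dlqAlarmArgs(
  "core",
  "dev",
  "example-core-dlq",
  "arn:aws:sns:ap-southeast-2:123456789012:example-dlq-alerts",
);
console.log(coreAlarm.name);
```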

MOD-076 will subscribe pager/chat channels at a later phase.


SSM outputs table (consumer contract)

All under arn:aws:ssm:ap-southeast-2:647751526084:parameter, path convention /bank/{env}/mod043/.... Consumers resolve these at deploy time.

SSM path | Value | Consumed by
/bank/{env}/mod043/schema-registry/name | Registry name (bank-events-{env}) | All event-publishing and consuming Lambdas (validator bootstrap)
/bank/{env}/mod043/schema-registry/arn | Registry ARN | IAM policies in downstream modules needing schemas:* on the registry
/bank/{env}/mod043/delivery-logger/arn | Lambda ARN | MOD-076 subscription-filter target; MOD-042 when replacing the placeholder queue
/bank/{env}/mod043/delivery-logger/log-group-arn | Log group ARN | MOD-076 log-subscription ingestion
/bank/{env}/mod043/delivery-log-placeholder/arn | SQS ARN | MOD-042 (will consume + write to platform.event_delivery_log)
/bank/{env}/mod043/delivery-log-placeholder/url | SQS URL | MOD-042 consumer
/bank/{env}/mod043/dlq-alerts-topic/arn | SNS topic ARN | MOD-076 alarm-destination subscriptions
/bank/{env}/mod043/rule/{domain}/arn × 8 | EventRule ARN | Audit/support tooling; identifying governance subscription per bus
/bank/{env}/mod043/dlq-alarm/{domain}/arn × 8 | CloudWatch Alarm ARN | MOD-076 dashboard widgets

Resolution example

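// In a downstream module (Pulumi). `stage` is the deploy stage, e.g. $app.stage in SST.
import * as aws from "@pulumi/aws";

const registryName = aws.ssm.getParameterOutput({
  name: `/bank/${stage}/mod043/schema-registry/name`,
}).value;

// In a publisher Lambda
import { DescribeSchemaCommand, SchemasClient } from "@aws-sdk/client-schemas";

const region = process.env.AWS_REGION; // set by the Lambda runtime
const schema = await new SchemasClient({ region }).send(
  new DescribeSchemaCommand({
    RegistryName: process.env.SCHEMA_REGISTRY_NAME!,
    SchemaName: "bank.core.posting_completed",
  }),
);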

Acceptance criteria status (dev stage, 2026-04-19)

Run: AWS_PROFILE=bank-dev STAGE=dev pnpm test (from eventbridge-governance/)

FR / Policy | Mode | Tests | Pass | Fail | Status
FR-269: schema-validated publish, non-conforming rejected | - | 3 | 3 | 0 | PASS
FR-270: central schema registry, per-bus Discoverer | - | 3 | 3 | 0 | PASS
FR-271: at-least-once + DLQ after 3 retries | - | 24 | 24 | 0 | PASS
FR-272: 90d delivery log with mandatory fields | - | 3 | 3 | 0 | PASS
DT-001: bus access IAM-gated | AUTO | 16 | 16 | 0 | PASS
DT-004: cross-domain subscriptions need explicit contract | AUTO | 17 | 17 | 0 | PASS
PRI-001: no PII in payloads (schema rejects bare PII) | AUTO | 3 | 3 | 0 | PASS
PRI-003: DLQ retention ≤ 14 days | AUTO | 9 | 9 | 0 | PASS
Total | | 78 | 78 | 0 | 100%

Quality gates for hybrid IaC + Lambda modules met: one integration test per FR, one per policy row, all AUTO policies verified automatable with explicit "no-manual-bypass" structural assertions (pol-dt-004, pol-pri-001 grep the source for forbidden toggle tokens), SSM outputs table present and accurate.
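
The "no-manual-bypass" structural assertion can be pictured as a plain substring scan over the module source. The token list and the findBypassTokens name below are hypothetical; the real tokens are defined in pol-dt-004.test.ts and pol-pri-001.test.ts, which read files from disk rather than taking a string.

```typescript
// Hypothetical token list; the real tests define their own forbidden tokens.
const forbiddenTokens = ["SKIP_VALIDATION", "BYPASS_SCHEMA", "DISABLE_GOVERNANCE"];

// Returns every forbidden token that appears anywhere in the source text.
function findBypassTokens(
  source: string,
  tokens: string[] = forbiddenTokens,
): string[] {
  return tokens.filter((token) => source.indexOf(token) !== -1);
}

const cleanSource = "export const handler = async () => validate(event);";
const taintedSource = "if (process.env.SKIP_VALIDATION) return;";

console.log(findBypassTokens(cleanSource)); // empty: no toggle present
console.log(findBypassTokens(taintedSource)); // contains "SKIP_VALIDATION"
```

A scan like this is deliberately crude: it cannot prove a bypass is impossible, but it fails the build as soon as someone introduces a named escape hatch.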


Test approach

Jest + AWS SDK v3 integration tests in __tests__/integration/. No mocks; every assertion queries live AWS in the dev stage.

File | Coverage
fr-269-schema-validation.test.ts | Canonical schema present + Lambda validates valid vs. non-conforming payloads
fr-270-registry-update.test.ts | Registry exists + seeded schemas present + Discoverer per bus STARTED
fr-271-dlq-routing.test.ts | Rule exists + ENABLED + target has 3+ retries and correct DLQ ARN (×8 buses)
fr-272-delivery-log.test.ts | Log group 90d retention + Lambda emits structured record with mandatory fields + end-to-end delivery path working
pol-dt-001.test.ts | Bus policies allow only events:PutEvents, wildcard Principals carry Conditions (×8)
pol-dt-004.test.ts | Bus has explicit-contract policy + Discoverer + module source has no manual-bypass toggles
pol-pri-001.test.ts | party_id-only payload accepted, email/phone bare PII rejected by schema, no bypass flag in code
pol-pri-003.test.ts | All 8 MOD-104 DLQs and the placeholder queue have MessageRetentionPeriod ≤ 1209600
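
As an illustration of the PRI-003 check, the sketch below evaluates the attribute map that SQS GetQueueAttributes returns, where every value is a string. The retentionWithinLimit helper is hypothetical; the live test fetches the attributes with the AWS SDK before making a comparison like this one.

```typescript
// 14 days expressed in seconds, the PRI-003 ceiling.
const MAX_RETENTION_SECONDS = 14 * 24 * 60 * 60; // 1209600

// SQS returns attribute values as strings, so parse before comparing.
function retentionWithinLimit(
  attributes: Record<string, string | undefined>,
): boolean {
  const raw = attributes["MessageRetentionPeriod"];
  if (raw === undefined) return false; // a missing attribute counts as a failure
  const seconds = Number(raw);
  return Number.isInteger(seconds) && seconds > 0 && seconds <= MAX_RETENTION_SECONDS;
}

console.log(retentionWithinLimit({ MessageRetentionPeriod: "1209600" })); // true
console.log(retentionWithinLimit({ MessageRetentionPeriod: "1209601" })); // false
```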



Operational notes

  • Deploy: AWS_PROFILE=bank-dev pnpm -F @bank-platform/eventbridge-governance run deploy --stage <env>
  • The deploy script runs pnpm build:lambda (esbuild bundle) before sst deploy.
  • Remove: AWS_PROFILE=bank-dev pnpm -F @bank-platform/eventbridge-governance run remove --stage <env>
  • SST permalink (latest dev deploy): https://sst.dev/u/491a888e
  • The Lambda bundle is ~1.7 MB and packaged at dist/delivery-logger/index.js. AWS SDK v3 clients (@aws-sdk/client-schemas, @aws-sdk/client-sqs) are bundled into the artifact; the nodejs20.x runtime also ships AWS SDK v3, but bundling keeps the client versions fixed at build time.

Stubs / deferred work

Stub | Owner | Replace with
SQS placeholder queue bank-eventbridge-delivery-log-placeholder-dev | MOD-042 | Consumer writing to platform.event_delivery_log (Neon)
Only 2 seeded schemas | Owning domain modules | schemas/{event_name}.json per new event; each module's deploy uploads its own aws.schemas.Schema

Both are documented inline as TODO(MOD-042) in src/lambdas/delivery-logger/index.ts and src/stacks/delivery-logger.ts.


  • Wiki spec: bank-wiki/source/entities/modules/MOD-043.{yaml,md}
  • Handoff: docs/handoffs/MOD-043-complete.handoff.md
  • Methodology: https://bank-wiki.pages.dev/delivery/methodology/
  • Event catalogue: https://bank-wiki.pages.dev/design/system/event-catalogue/
  • Schema registry spec: https://bank-wiki.pages.dev/design/system/schema-registry/
  • Data model for platform.event_delivery_log: https://bank-wiki.pages.dev/design/system/data-models/SD07-data-platform/
  • ADRs in effect: ADR-023, ADR-025, ADR-029, ADR-030, ADR-031