Technical design — MOD-043 EventBridge domain event governance
Module: MOD-043 — EventBridge domain event governance
System: SD07 — Data Platform & Governance Infrastructure
Repo: bank-platform
FR scope: FR-269, FR-270, FR-271, FR-272
Policies satisfied: DT-001 (AUTO), DT-004 (AUTO), PRI-001 (AUTO), PRI-003 (AUTO)
Author: AI agent (Claude Opus 4.7)
Date: 2026-04-19
Stage covered: dev (account 647751526084, region ap-southeast-2)
SST permalink: https://sst.dev/u/491a888e
Objective
MOD-043 governs the 8 custom EventBridge buses provisioned by MOD-104. It adds:
- A single Amazon EventBridge Schema Registry (bank-events-{env}) with a Discoverer per bus and canonical JSON-Schema (draft-04) seeds.
- A governance rule per bus that captures every event and forwards it to a delivery-logger Lambda. Each rule target is configured with a 3-retry policy and a DLQ (the MOD-104 DLQ for that domain), satisfying FR-271.
- A Node.js 20 delivery-logger Lambda that validates each event against the registry and emits a structured log record with the FR-272 fields (event_id, schema_version, source, event_time/received_at, delivery_status).
- CloudWatch alarms on DLQ depth for each of the 8 domain DLQs, publishing to a dedicated SNS topic.
This document describes the as-deployed state in dev. All 78 integration tests pass against the live environment. The Neon-backed platform.event_delivery_log persistence (MOD-042 dependency) is stubbed: the logger writes to CloudWatch Logs (90-day retention — FR-272) and a placeholder SQS queue that MOD-042 will wire to Postgres.
Execution model
| Aspect | Decision |
|---|---|
| IaC tool | SST v3 Ion + raw @pulumi/aws resources (ADR-025) |
| Lambda runtime | Node.js 20 on arm64; 256 MB / 10s timeout |
| Lambda packaging | Local esbuild bundle (CJS, Node-20 target) zipped into dist/delivery-logger/ and uploaded via pulumi.asset.FileArchive. Bundle size ~1.7 MB |
| Schema format | JSON Schema draft-04 (matches wiki schema-registry.md) |
| Schema validator | ajv-draft-04 + ajv-formats in-Lambda |
| Region | ap-southeast-2 |
| Tagging | Provider defaultTags: tenant_id, module_id=MOD-043, environment, system_id=SD07, cost_center=sd07-bank-platform, managed_by=sst |
| Deployment identity | AWS_PROFILE=bank-dev for dev |
Stack layout
eventbridge-governance/
├── sst.config.ts
├── src/
│   ├── stacks/
│   │   ├── schema-registry.ts — bank-events registry + 8 Discoverers + seeded schemas
│   │   ├── governance-rules.ts — 1 rule/bus forwarding to the logger + RetryPolicy(3) + DLQ target
│   │   ├── delivery-logger.ts — Lambda + role + log group (90d) + placeholder SQS queue
│   │   └── dlq-alerts.ts — 8 CloudWatch alarms on DLQ depth + SNS topic
│   ├── lambdas/
│   │   └── delivery-logger/
│   │       ├── index.ts — handler: validate detail against registry, emit structured log, forward to SQS placeholder
│   │       ├── package.json
│   │       └── tsconfig.json
│   └── outputs.ts — 22 SSM parameters under /bank/{env}/mod043/...
├── schemas/
│   ├── bank.platform.notification_sent.json
│   └── bank.core.posting_completed.json
├── scripts/build-lambda.mjs — esbuild wrapper producing dist/delivery-logger/index.js
└── __tests__/integration/ — 8 test files, 78 assertions (all live AWS)
AWS resources provisioned (dev stage)
Schema Registry (FR-269, FR-270)
| Resource | Name | Notes |
|---|---|---|
| aws.schemas.Registry | bank-events-dev | Single registry per wiki spec; buses are the domain boundary |
| aws.schemas.Discoverer × 8 | one per bus | sourceArn resolved via SSM /bank/{env}/eventbridge/{domain}/arn |
| aws.schemas.Schema × 2 (seed) | bank.platform.notification_sent, bank.core.posting_completed | JSONSchemaDraft4. additionalProperties:false is mandatory per wiki |
Schemas auto-version on change (AWS-native). FR-270 SLA (≤1 business day) is satisfied structurally: Discoverer latency <1 minute; seeded schemas deploy via sst deploy on merge to main.
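To illustrate why additionalProperties:false matters for the seeded schemas, here is a minimal sketch of a draft-04 style object schema and a hand-rolled check that rejects undeclared fields. It is not the module's validator (that is ajv-draft-04) and only understands required and additionalProperties. The schema contents are assumptions: party_id and email appear in this document's test descriptions, while channel is purely hypothetical.

```typescript
// Minimal shape of an object schema; only the keywords this sketch uses.
interface ObjectSchema {
  $schema: string;
  type: "object";
  required: string[];
  properties: Record<string, { type: string }>;
  additionalProperties: boolean;
}

// Hypothetical seed schema. Only party_id is taken from this document.
const notificationSentSchema: ObjectSchema = {
  $schema: "http://json-schema.org/draft-04/schema#",
  type: "object",
  required: ["party_id"],
  properties: {
    party_id: { type: "string" },
    channel: { type: "string" }, // hypothetical field
  },
  additionalProperties: false,
};

// Own-property test that ignores inherited keys such as "toString".
function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Returns the list of violations; an empty list means the payload conforms
// to the two keywords this sketch understands.
function checkPayload(schema: ObjectSchema, payload: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const key of schema.required) {
    if (!hasOwn(payload, key)) errors.push(`missing required property: ${key}`);
  }
  if (!schema.additionalProperties) {
    for (const key of Object.keys(payload)) {
      if (!hasOwn(schema.properties, key)) errors.push(`unexpected property: ${key}`);
    }
  }
  return errors;
}
```

With additionalProperties set to false, a payload carrying a bare email field fails even though email is never mentioned in the schema, which is the structural property the PRI-001 tests rely on.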
Governance rules + targets (FR-271)
Per domain d:
| Resource | Name | Notes |
|---|---|---|
| aws.cloudwatch.EventRule | bank-{d}-governance-{env} | EventPattern { "source": [{ "prefix": "bank." }] }, which captures every event |
| aws.cloudwatch.EventTarget | — | Targets the delivery-logger Lambda. RetryPolicy.MaximumRetryAttempts=3, MaximumEventAgeInSeconds=3600. DeadLetterConfig.Arn = MOD-104 DLQ for d |
| aws.lambda.Permission | bank-{d}-gov-{env} | events.amazonaws.com → Lambda, scoped to the rule ARN |
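The prefix pattern used by the governance rule can be sketched as a plain predicate. This is an illustration of the matching semantics only, not EventBridge's matcher; the BusEvent type and function name are hypothetical.

```typescript
// Subset of the EventBridge event envelope relevant to the governance pattern.
interface BusEvent {
  source: string;
  "detail-type": string;
}

// Mirrors the rule's pattern: { "source": [{ "prefix": "bank." }] }.
const governancePattern = { source: [{ prefix: "bank." }] };

// True when the event source starts with any prefix listed in the pattern.
function matchesGovernancePattern(event: BusEvent): boolean {
  return governancePattern.source.some((matcher) => event.source.startsWith(matcher.prefix));
}
```

Under this reading, "captures every event" holds as long as every publisher on the bus uses a source beginning with bank.; an event with any other source would not match the rule.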
Delivery logger (FR-272)
| Resource | Value |
|---|---|
| Lambda name | bank-eventbridge-delivery-logger-dev |
| Runtime / arch | nodejs20.x / arm64 |
| Handler | index.handler |
| Role | bank-eventbridge-delivery-logger-dev (inline: schemas:DescribeSchema, sqs:SendMessage on placeholder queue, logs:* on own group) |
| Log group | /aws/lambda/bank-eventbridge-delivery-logger-dev, retention 90 days (FR-272) |
| Env vars | SCHEMA_REGISTRY_NAME=bank-events-dev, DELIVERY_LOG_QUEUE_URL=..., MODULE_ID=MOD-043 |
| Placeholder SQS queue | bank-eventbridge-delivery-log-placeholder-dev, MessageRetentionPeriod=1209600 (14d — PRI-003), KMS alias/aws/sqs |
Structured log schema:
{
"level": "INFO",
"event_type": "event_delivery_recorded",
"event_id": "...",
"schema_name": "bank.<domain>.<detail-type>",
"schema_version": "1.0.0",
"source": "bank.<domain>",
"bus_name": "bank-<domain>",
"detail_type": "<detail-type>",
"event_time": "...",
"received_at": "...",
"delivery_status": "PUBLISHED|SCHEMA_REJECTED|SCHEMA_MISSING|ERROR",
"failure_reason": "...",
"payload_checksum": "sha256",
"module_id": "MOD-043",
"trace_id": "..."
}
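A consumer of these records might check the FR-272 mandatory fields as in the sketch below. The field list is taken from the Objective section; treating both event_time and received_at as required is an assumption, as are the type and function names.

```typescript
// FR-272 fields as listed in this document. Requiring both timestamps is an
// assumption of this sketch.
const MANDATORY_FIELDS = [
  "event_id",
  "schema_version",
  "source",
  "event_time",
  "received_at",
  "delivery_status",
] as const;

// A parsed log line; values are unknown until checked.
type DeliveryLogRecord = Record<string, unknown>;

// Returns the mandatory fields that are absent or not non-empty strings.
function missingMandatoryFields(record: DeliveryLogRecord): string[] {
  return MANDATORY_FIELDS.filter((field) => {
    const value = record[field];
    return typeof value !== "string" || value.length === 0;
  });
}
```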
The handler caches compiled validators in memory, so each schema is compiled at most once per cold start and reused across warm invocations. When no schema is registered yet, the record is emitted with delivery_status=SCHEMA_MISSING: the logger provides observability without blocking delivery, because the publisher-side validator in @bank-platform/schema-registry is the enforcement point per the wiki.
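The caching and status classification described here could look roughly like the following. This is a sketch under stated assumptions, not the handler's code: the Validator and SchemaLoader types are hypothetical, the loader is synchronous for brevity (a real registry lookup would be asynchronous), and not caching missing schemas is a design choice of the sketch so that newly registered schemas are picked up.

```typescript
type DeliveryStatus = "PUBLISHED" | "SCHEMA_REJECTED" | "SCHEMA_MISSING" | "ERROR";

// A compiled validator returns null when the detail conforms, else a reason.
type Validator = (detail: unknown) => string | null;

// Hypothetical loader: stands in for fetching a schema from the registry and
// compiling it. Returns undefined when no schema is registered.
type SchemaLoader = (schemaName: string) => Validator | undefined;

// Module-scope cache: survives warm invocations, empty after a cold start.
const validatorCache = new Map<string, Validator>();

function classifyDelivery(
  schemaName: string,
  detail: unknown,
  load: SchemaLoader,
): { status: DeliveryStatus; failureReason?: string } {
  try {
    let validate = validatorCache.get(schemaName);
    if (!validate) {
      validate = load(schemaName);
      // Observe without blocking: a missing schema is recorded, not thrown.
      if (!validate) return { status: "SCHEMA_MISSING" };
      validatorCache.set(schemaName, validate);
    }
    const reason = validate(detail);
    return reason === null
      ? { status: "PUBLISHED" }
      : { status: "SCHEMA_REJECTED", failureReason: reason };
  } catch (err) {
    return { status: "ERROR", failureReason: err instanceof Error ? err.message : String(err) };
  }
}
```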
TODO (MOD-042): replace the placeholder SQS queue with a direct insert into platform.event_delivery_log (Neon) once MOD-042 lands. Reference: bank-wiki/source/pages/design/system/data-models/SD07-data-platform.md.
DLQ alarms + SNS
| Resource | Value |
|---|---|
| aws.sns.Topic | bank-eventbridge-dlq-alerts-dev |
| aws.cloudwatch.MetricAlarm × 8 | bank-{d}-dlq-depth-{env}: AWS/SQS ApproximateNumberOfMessagesVisible > 0, 1/1 min → alert + OK actions → SNS |
MOD-076 will subscribe pager/chat channels at a later phase.
SSM outputs table (consumer contract)
All under arn:aws:ssm:ap-southeast-2:647751526084:parameter, path convention /bank/{env}/mod043/.... Consumers resolve these at deploy time.
| SSM path | Value | Consumed by |
|---|---|---|
| /bank/{env}/mod043/schema-registry/name | Registry name (bank-events-{env}) | All event-publishing and consuming Lambdas (validator bootstrap) |
| /bank/{env}/mod043/schema-registry/arn | Registry ARN | IAM policies in downstream modules needing schemas:* on the registry |
| /bank/{env}/mod043/delivery-logger/arn | Lambda ARN | MOD-076 subscription-filter target; MOD-042 when replacing the placeholder queue |
| /bank/{env}/mod043/delivery-logger/log-group-arn | Log group ARN | MOD-076 log-subscription ingestion |
| /bank/{env}/mod043/delivery-log-placeholder/arn | SQS ARN | MOD-042 (will consume + write to platform.event_delivery_log) |
| /bank/{env}/mod043/delivery-log-placeholder/url | SQS URL | MOD-042 consumer |
| /bank/{env}/mod043/dlq-alerts-topic/arn | SNS topic ARN | MOD-076 alarm-destination subscriptions |
| /bank/{env}/mod043/rule/{domain}/arn × 8 | EventRule ARN | Audit/support tooling; identifying governance subscription per bus |
| /bank/{env}/mod043/dlq-alarm/{domain}/arn × 8 | CloudWatch Alarm ARN | MOD-076 dashboard widgets |
Resolution example
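// In a downstream module (Pulumi); `stage` is the deploy stage, e.g. "dev"
import * as aws from "@pulumi/aws";

const registryName = aws.ssm.getParameterOutput({
  name: `/bank/${stage}/mod043/schema-registry/name`,
}).value;

// In a publisher Lambda; `region` is the AWS region of the registry
import { SchemasClient, DescribeSchemaCommand } from "@aws-sdk/client-schemas";

const schema = await new SchemasClient({ region }).send(
  new DescribeSchemaCommand({
    RegistryName: process.env.SCHEMA_REGISTRY_NAME!,
    SchemaName: "bank.core.posting_completed",
  }),
);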
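Consumers that resolve several of these parameters may prefer to build the paths from one helper instead of repeating string templates. The sketch below follows the /bank/{env}/mod043/... convention from the table; the function names are hypothetical and no such helper is claimed to exist in the repo.

```typescript
// Per-domain parameter kinds listed in the SSM outputs table.
type PerDomainKind = "rule" | "dlq-alarm";

// Builds a MOD-043 parameter path from the stage and a suffix from the table.
function mod043Param(env: string, suffix: string): string {
  return `/bank/${env}/mod043/${suffix}`;
}

// Builds the per-domain ARN parameter path for a rule or DLQ alarm.
function perDomainArnParam(env: string, kind: PerDomainKind, domain: string): string {
  return mod043Param(env, `${kind}/${domain}/arn`);
}
```

For example, perDomainArnParam("dev", "rule", "core") yields /bank/dev/mod043/rule/core/arn, where "core" is used only as an illustrative domain value.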
Acceptance criteria status (dev stage, 2026-04-19)
Run: AWS_PROFILE=bank-dev STAGE=dev pnpm test (from eventbridge-governance/)
| FR / Policy | Mode | Tests | Pass | Fail | Status |
|---|---|---|---|---|---|
| FR-269 — schema-validated publish, non-conforming rejected | — | 3 | 3 | 0 | PASS |
| FR-270 — central schema registry, per-bus Discoverer | — | 3 | 3 | 0 | PASS |
| FR-271 — at-least-once + DLQ after 3 retries | — | 24 | 24 | 0 | PASS |
| FR-272 — 90d delivery log with mandatory fields | — | 3 | 3 | 0 | PASS |
| DT-001 — bus access IAM-gated | AUTO | 16 | 16 | 0 | PASS |
| DT-004 — cross-domain subscriptions need explicit contract | AUTO | 17 | 17 | 0 | PASS |
| PRI-001 — no PII in payloads (schema rejects bare PII) | AUTO | 3 | 3 | 0 | PASS |
| PRI-003 — DLQ retention ≤ 14 days | AUTO | 9 | 9 | 0 | PASS |
| Total | | 78 | 78 | 0 | 100% |
Quality gates for hybrid IaC + Lambda modules met: one integration test per FR, one per policy row, all AUTO policies verified automatable with explicit "no-manual-bypass" structural assertions (pol-dt-004, pol-pri-001 grep the source for forbidden toggle tokens), SSM outputs table present and accurate.
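The "no-manual-bypass" structural assertion amounts to scanning source text for forbidden toggle names. A minimal sketch follows; the token names are hypothetical placeholders, since this document does not list the tokens the real tests look for.

```typescript
// Hypothetical token list; the actual tests define their own forbidden names.
const FORBIDDEN_TOGGLES = ["SKIP_SCHEMA_VALIDATION", "BYPASS_CONTRACT"];

// Returns every forbidden token that occurs anywhere in the given source text.
function findForbiddenToggles(source: string, tokens: string[] = FORBIDDEN_TOGGLES): string[] {
  return tokens.filter((token) => source.includes(token));
}
```

A test would read each module source file, pass its contents to this function, and assert that the result is empty.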
Test approach
Jest + AWS SDK v3 integration tests in __tests__/integration/. No mocks; every assertion queries live AWS in the dev stage.
| File | Coverage |
|---|---|
| fr-269-schema-validation.test.ts | Canonical schema present + Lambda validates valid vs. non-conforming payloads |
| fr-270-registry-update.test.ts | Registry exists + seeded schemas present + Discoverer per bus STARTED |
| fr-271-dlq-routing.test.ts | Rule exists + ENABLED + target has 3+ retries and correct DLQ ARN (×8 buses) |
| fr-272-delivery-log.test.ts | Log group 90d retention + Lambda emits structured record with mandatory fields + end-to-end delivery path working |
| pol-dt-001.test.ts | Bus policies allow only events:PutEvents, wildcard Principals carry Conditions (×8) |
| pol-dt-004.test.ts | Bus has explicit-contract policy + Discoverer + module source has no manual-bypass toggles |
| pol-pri-001.test.ts | party_id-only payload accepted, email/phone bare PII rejected by schema, no bypass flag in code |
| pol-pri-003.test.ts | All 8 MOD-104 DLQs and the placeholder queue have MessageRetentionPeriod ≤ 1209600 |
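The PRI-003 bound of 1209600 seconds is 14 × 24 × 60 × 60. The check itself can be sketched as below, assuming the queue attributes have already been fetched as a string map (SQS returns attribute values as strings); the function name is hypothetical.

```typescript
// 14 days expressed in seconds: 14 * 24 * 60 * 60 = 1209600.
const MAX_RETENTION_SECONDS = 14 * 24 * 60 * 60;

// True when MessageRetentionPeriod is present, a positive integer, and within
// the 14-day policy limit. A missing attribute is treated as a failure.
function retentionWithinPolicy(attrs: Record<string, string | undefined>): boolean {
  const raw = attrs["MessageRetentionPeriod"];
  if (raw === undefined) return false;
  const seconds = Number(raw);
  return Number.isInteger(seconds) && seconds > 0 && seconds <= MAX_RETENTION_SECONDS;
}
```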
Operational notes
- Deploy: AWS_PROFILE=bank-dev pnpm -F @bank-platform/eventbridge-governance run deploy --stage <env>
- The deploy script runs pnpm build:lambda (esbuild bundle) before sst deploy.
- Remove: AWS_PROFILE=bank-dev pnpm -F @bank-platform/eventbridge-governance run remove --stage <env>
- SST permalink (latest dev deploy): https://sst.dev/u/491a888e
- The Lambda bundle is 1.7 MB and packaged at dist/delivery-logger/index.js. The AWS SDK v3 clients (@aws-sdk/client-schemas, @aws-sdk/client-sqs) are bundled into it rather than loaded from the copy of AWS SDK v3 that the nodejs20.x runtime provides.
Stubs / deferred work
| Stub | Owner | Replace with |
|---|---|---|
| SQS placeholder queue bank-eventbridge-delivery-log-placeholder-dev | MOD-042 | Consumer writing to platform.event_delivery_log (Neon) |
| Only 2 seeded schemas | Owning domain modules | schemas/{event_name}.json per new event; each module's deploy uploads its own aws.schemas.Schema |
Both are documented inline as TODO(MOD-042) in src/lambdas/delivery-logger/index.ts and src/stacks/delivery-logger.ts.
Related artefacts
- Wiki spec: bank-wiki/source/entities/modules/MOD-043.{yaml,md}
- Handoff: docs/handoffs/MOD-043-complete.handoff.md
- Methodology: https://bank-wiki.pages.dev/delivery/methodology/
- Event catalogue: https://bank-wiki.pages.dev/design/system/event-catalogue/
- Schema registry spec: https://bank-wiki.pages.dev/design/system/schema-registry/
- Data model for platform.event_delivery_log: https://bank-wiki.pages.dev/design/system/data-models/SD07-data-platform/
- ADRs in effect: ADR-023, ADR-025, ADR-029, ADR-030, ADR-031