ADR-053: Build artefact versioning and stage promotion model

Status Accepted
Date 2026-05-02
Deciders CTO, Head of Platform Engineering
Affects repos bank-core, bank-kyc, bank-aml, bank-payments, bank-credit, bank-app, bank-risk-platform, bank-platform

Status: Proposed

Context

The platform produces six distinct artefact types across its eight code repositories: Lambda function bundles, CloudFormation templates (synthesised by SST CDK), Neon Postgres schemas (managed by Flyway migrations), dbt compiled model manifests, Snowflake non-dbt DDL (managed separately — see ADR-054), and Pulumi infrastructure programs. Currently all repositories recompile or re-synthesise from source on every sst deploy or pulumi up invocation. No versioned artefacts are stored between the CI build step and the deploy step.

This creates three concrete problems:

1. No binary provenance for stage promotion. When promoting from dev to UAT (and later to production), the standard approach would be to rerun sst deploy --stage uat from the same git commit. esbuild is deterministic in practice, but there is no auditable proof that the Lambda ZIP deployed to UAT is byte-for-byte identical to the ZIP that was integration-tested in dev. A regulator or auditor asking "show me the test evidence for what is running in production" cannot be given a clean answer.

2. No Neon schema validation gate in CI. Flyway migration files are applied as a sub-step of sst deploy (bank-core, bank-kyc, bank-aml) or as a separate CI job (bank-payments). There is no flyway validate step that independently confirms the migration files in the repo are consistent with the schema history already applied to the dev Neon branch — before the Lambda code that depends on those schemas is deployed. A migration file edited after initial application would pass unit tests but fail at deploy time with no prior warning.

3. No schema snapshot as test evidence. When integration tests run against the dev Neon database, the exact schema state is not recorded. The flyway_schema_history table in Neon is the authoritative version record but it is not captured as a build artefact. A point-in-time record of the schema that was integration-tested is missing.

This ADR does not cover Snowflake DDL versioning — that is addressed by ADR-054.

Decision

Artefact storage

All build artefacts are stored to an S3 bucket (bank-artefacts, ap-southeast-2, server-side encrypted with the bank-platform KMS key) keyed by repo and commit SHA:

s3://bank-artefacts/{repo}/{commit-sha}/
  build-manifest.json          commit SHA, timestamp, repo, branch,
                               list of included modules, test result
                               summary (unit pass/fail, integration
                               pass/fail), flyway checksum map
  modules/
    {MOD-NNN}/
      functions/
        {function-name}.zip    Lambda bundle (esbuild output)
      cloudformation/
        {stack-name}.json      CDK synthesised CloudFormation template
      schema/
        {db-name}-schema.sql   pg_dump --schema-only output captured
                               immediately after integration tests pass
      dbt/
        manifest.json          dbt compiled manifest (bank-risk-platform
                               modules only)
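
The layout above can be expressed as a small key-building helper. This is a minimal sketch, not the pipeline's actual code: the function names, the `kind` labels, and the example module ID and function name are assumptions made for illustration. Only the bucket name and path shapes come from this ADR.

```python
BUCKET = "bank-artefacts"  # bucket name as given in this ADR

# Path template per artefact kind, mirroring the layout above.
# The kind labels themselves are hypothetical.
KIND_TEMPLATES = {
    "function": "functions/{name}.zip",
    "cloudformation": "cloudformation/{name}.json",
    "schema": "schema/{name}-schema.sql",
    "dbt": "dbt/manifest.json",
}


def manifest_key(repo: str, commit_sha: str) -> str:
    """Key of build-manifest.json for one repo and commit."""
    return f"{repo}/{commit_sha}/build-manifest.json"


def artefact_key(repo: str, commit_sha: str, module_id: str,
                 kind: str, name: str = "") -> str:
    """Key of one module artefact; `kind` selects the sub-folder."""
    if kind not in KIND_TEMPLATES:
        raise ValueError(f"unknown artefact kind: {kind}")
    if kind != "dbt" and not name:
        raise ValueError(f"artefact kind {kind!r} needs a name")
    suffix = KIND_TEMPLATES[kind].format(name=name)
    return f"{repo}/{commit_sha}/modules/{module_id}/{suffix}"


if __name__ == "__main__":
    # "MOD-001" and "create-account" are made-up example values.
    print(f"s3://{BUCKET}/" + artefact_key(
        "bank-core", "abc123", "MOD-001", "function", "create-account"))
```

Deriving every key from the same two functions keeps the CI upload step and the promotion download step from drifting apart on path conventions.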

S3 objects are tagged integration-passed=true by the CI workflow only after the integration test step exits 0. Objects for a commit where integration tests failed carry integration-passed=false. Promotion workflows gate on the tag value.
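
The tag lifecycle described above could be modelled as follows. This is an illustrative sketch: the function names are hypothetical, and treating a missing tag as "not promotable" is an assumption (fail closed), not something this ADR specifies.

```python
TAG_KEY = "integration-passed"  # tag name as given in this ADR


def initial_tags() -> dict[str, str]:
    """Tags applied at upload time, before integration tests have run."""
    return {TAG_KEY: "false"}


def tags_after_integration(exit_code: int) -> dict[str, str]:
    """Tags applied once the integration test step has finished."""
    return {TAG_KEY: "true" if exit_code == 0 else "false"}


def is_promotable(tags: dict[str, str]) -> bool:
    """Promotion gate: only an explicit 'true' passes.

    Assumption: an object with no tag at all is rejected.
    """
    return tags.get(TAG_KEY) == "true"
```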

CI pipeline shape (Lambda repos)

The reusable-lambda workflow takes the following shape:

1.  pnpm typecheck
2.  pnpm test:unit             (Vitest unit + contract tiers; ≥80% coverage)
3.  pnpm test:policy           (one test per policies_satisfied row)
4.  [has_postgres] flyway validate
                               (confirms migration files match dev Neon
                               schema history — fails fast before deploy)
5.  sst build --stage dev      (synthesises CloudFormation + bundles Lambdas)
6.  Upload artefacts to S3     (Lambda ZIPs + CloudFormation templates;
                               tagged integration-passed=false initially)
7.  sst deploy --stage dev     (applies stored artefacts to dev stage)
8.  [has_postgres] flyway migrate
                               (applies any pending migrations to dev Neon)
9.  pnpm test:integration      (RUN_INTEGRATION=1, against live deployed dev)
10. [has_postgres] pg_dump --schema-only
                               (captures dev Neon schema at post-migration
                               state, uploaded to S3 per module)
11. Tag S3 objects integration-passed=true
12. CI writes CI handoff document (steps 1–11 all green):
                               `docs/handoffs/{module_id}-ci-built-{sha}.handoff.md`
                               bank-wiki processes this to advance build_status
                               to Built
13. node tests/verify-deployment.mjs
                               (smoke test against live deployed dev)
14. CI writes CI handoff document (step 13 passes):
                               `docs/handoffs/{module_id}-ci-deployed-{sha}.handoff.md`
                               bank-wiki processes this to advance build_status
                               to Deployed

Step 4 (flyway validate) is a new gate. It runs flyway validate using the migrate_user credential against the dev Neon branch. If any migration file in the repo has a checksum mismatch against the flyway_schema_history table — because a file was edited after application — the workflow fails immediately, before any deploy step runs.
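
The checksum comparison behind this gate can be illustrated with a short sketch. It is not how Flyway is implemented: `zlib.crc32` over raw file bytes is a stand-in for Flyway's own checksum, and the function names and the version-keyed dictionaries are assumptions. The sketch also treats migrations that exist in the repo but are not yet applied as acceptable, which is an assumption about how the gate is configured.

```python
import zlib


def file_checksum(content: bytes) -> int:
    """Stand-in checksum (CRC32 of raw bytes); not Flyway's exact algorithm."""
    return zlib.crc32(content)


def find_mismatches(repo_files: dict[str, bytes],
                    applied: dict[str, int]) -> list[str]:
    """Return applied versions whose repo file is missing or has changed.

    repo_files maps migration version -> file content in the repo.
    applied maps migration version -> checksum recorded at apply time
    (the role played by flyway_schema_history).
    """
    mismatched = []
    for version, recorded in sorted(applied.items()):
        content = repo_files.get(version)
        if content is None:
            mismatched.append(version)  # applied migration no longer in repo
        elif file_checksum(content) != recorded:
            mismatched.append(version)  # file edited after application
    return mismatched


if __name__ == "__main__":
    applied = {"1": file_checksum(b"create table a;")}
    edited_repo = {"1": b"create table a (id int);"}
    print(find_mismatches(edited_repo, applied))
```

A non-empty result corresponds to the case where the workflow should stop before any deploy step runs.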

Stage promotion (dev → UAT)

Stage promotion is a separate workflow (promote-to-uat.yml), manually triggered, requiring a commit SHA as input.

1. Verify S3 objects exist for {commit-sha}
2. Verify build-manifest.json tag: integration-passed=true
3. Download Lambda ZIPs and CloudFormation templates from S3
4. Apply CloudFormation templates to UAT stack (no CDK synth)
5. Deploy Lambda ZIPs to UAT Lambda functions (no esbuild)
6. flyway migrate against UAT Neon branch (from same git ref —
   applies any migrations not yet present in UAT schema history)
7. Run smoke test against UAT endpoints
8. CI writes CI handoff document: status Deployed
                               (`docs/handoffs/{module_id}-ci-deployed-{sha}.handoff.md`
                               bank-wiki processes to advance build_status)
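
Steps 1 and 2 of the promotion workflow amount to a precondition check. A minimal sketch, assuming the workflow has already listed the object keys in the bucket and read the tags on build-manifest.json; the function name and message wording are hypothetical.

```python
def promotion_blockers(commit_sha: str,
                       object_keys: list[str],
                       manifest_tags: dict[str, str]) -> list[str]:
    """Return reasons promotion must not proceed; empty means proceed."""
    blockers = []
    marker = f"/{commit_sha}/"
    keys = [k for k in object_keys if marker in k]

    # Step 1: artefacts must exist for the requested commit.
    if not keys:
        blockers.append(f"no artefacts stored for commit {commit_sha}")
    if not any(k.endswith("/build-manifest.json") for k in keys):
        blockers.append(f"build-manifest.json missing for commit {commit_sha}")

    # Step 2: the manifest must carry integration-passed=true.
    if manifest_tags.get("integration-passed") != "true":
        blockers.append("integration-passed tag is not true")
    return blockers
```

Returning every blocker at once, rather than failing on the first, gives the person who triggered the workflow a complete picture in a single run.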

The Lambda ZIPs and CloudFormation templates applied to UAT are the exact objects uploaded in step 6 of the CI pipeline above (built in step 5), not recompiled. The git SHA of the UAT deploy is the same SHA that was integration-tested.

Pulumi (bank-platform)

Pulumi programs are declarative. The "artefact" for a Pulumi module is the Pulumi program source at a git SHA plus the Pulumi stack state stored in the Pulumi backend (S3 state bucket). Stage promotion is pulumi up --stack uat run from the same git SHA. Pulumi diffs the uat stack state against the program and applies only changes. No separate artefact storage is required beyond S3 state, but the build-manifest.json records the Pulumi program git SHA for audit traceability.

dbt (bank-risk-platform)

dbt's manifest.json (produced by dbt compile) is a complete compiled representation of all models, tests, and their dependencies. It is stored in S3 per commit under modules/{MOD-NNN}/dbt/manifest.json. Stage promotion uses dbt build --target uat from the same git ref against the UAT Snowflake environment. The dbt manifest in S3 serves as the audit record of what logic was deployed.

Snowflake DDL (bank-risk-platform)

Covered by ADR-054.

Built and Deployed definitions

Built: build_status is set to Built by the bank-wiki agent when it processes the CI handoff document written by the pipeline once all of the following are true in a single pipeline run:

  1. pnpm typecheck passes — zero errors
  2. Unit tests pass — ≥ 80% line + function coverage
  3. Policy tests pass — one per policies_satisfied row
  4. [Lambda/Postgres repos] Flyway validate passes — no migration file tampering
  5. Deploy step completes without error (sst deploy, pulumi up, or DCM deploy)
  6. Integration tests pass with RUN_INTEGRATION=1 against live deployed dev
  7. S3 artefacts tagged integration-passed=true
  8. [bank-risk-platform] DCM plan and DCM deploy completed; dbt build completed
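
The eight criteria can be read as one predicate over a single pipeline run. The sketch below is illustrative only: the `PipelineRun` record and its field names are assumptions, and the bracketed repo conditions are modelled as two boolean flags.

```python
from dataclasses import dataclass


@dataclass
class PipelineRun:
    """Hypothetical summary of one CI run; field names are illustrative."""
    typecheck_ok: bool
    unit_ok: bool
    line_coverage: float       # percent
    function_coverage: float   # percent
    policy_ok: bool
    deploy_ok: bool
    integration_ok: bool
    artefacts_tagged: bool
    has_postgres: bool = False
    flyway_validate_ok: bool = False
    is_risk_platform: bool = False
    dcm_and_dbt_ok: bool = False


def unmet_built_criteria(run: PipelineRun) -> list[int]:
    """Return the numbers of the criteria above that this run fails."""
    checks = {
        1: run.typecheck_ok,
        2: (run.unit_ok and run.line_coverage >= 80.0
            and run.function_coverage >= 80.0),
        3: run.policy_ok,
        4: run.flyway_validate_ok or not run.has_postgres,
        5: run.deploy_ok,
        6: run.integration_ok,
        7: run.artefacts_tagged,
        8: run.dcm_and_dbt_ok or not run.is_risk_platform,
    }
    return [number for number, ok in checks.items() if not ok]
```

A run qualifies as Built only when the returned list is empty; criteria 4 and 8 pass trivially for repos they do not apply to.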

The orchestrator must not set Built manually. Built without a passing CI run is not Built.

Deployed (dev): build_status is set to Deployed by the bank-wiki agent when it processes the CI handoff document written immediately after the smoke test (node tests/verify-deployment.mjs) passes in the same pipeline run that achieved Built.

The orchestrator must not set Deployed manually.

Deployed (UAT) — future. Requires the promote-to-uat.yml workflow to have completed for that commit SHA. The UAT status is set automatically by that workflow.

RUN_INTEGRATION requirement

RUN_INTEGRATION=1 must be set as a GitHub Actions environment variable in Settings → Environments → dev → Environment variables for every code repository. It is not set in any workflow YAML file — the environment: dev binding on the job injects it automatically. Without it, integration test guards (skipIfNoAws(), skipIfNoDb(), skipIfNoSnowflake()) fire and tests are silently skipped, meaning a pipeline run cannot produce a valid Built status under this ADR's definition.
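
The silent-skip behaviour can be illustrated with a small guard. This ADR does not show how skipIfNoAws(), skipIfNoDb(), or skipIfNoSnowflake() are implemented; the Python function below is a hypothetical analogue that assumes a guard simply checks the variable.

```python
import os
from typing import Mapping, Optional


def integration_skip_reason(env: Mapping[str, str]) -> Optional[str]:
    """Return why integration tests would be skipped, or None if they run.

    Assumption: only the exact value "1" enables integration tests.
    """
    if env.get("RUN_INTEGRATION") != "1":
        return "RUN_INTEGRATION is not set to 1"
    return None


if __name__ == "__main__":
    reason = integration_skip_reason(os.environ)
    print("integration tests run" if reason is None else f"skipped: {reason}")
```

Because a skipped test does not fail, a run without the variable can look green while proving nothing about integration behaviour, which is why such a run cannot count towards Built.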

Consequences

Positive:
- The exact Lambda binary deployed to UAT is provably the one integration-tested in dev, giving binary-level audit traceability.
- flyway validate as a CI gate prevents the class of bugs where a migration file is edited after initial application, which would cause a silent schema drift that only surfaces at the next deploy.
- pg_dump schema snapshots provide point-in-time schema evidence correlated to specific integration test runs, which directly answers regulatory "what schema was tested?" questions.
- dbt manifest storage provides complete model lineage at any deployed commit.
- The S3 artefact store doubles as a deployment rollback mechanism: redeploying a prior commit's ZIPs to revert a broken Lambda is a single workflow invocation.

Negative:
- Implementation effort: reusable-lambda.yml needs flyway validate, S3 upload, and pg_dump steps added; promote-to-uat.yml needs to be written.
- S3 storage cost: Lambda ZIPs are small (typically 1–10 MB per function); CloudFormation templates are tiny JSON. Storage cost for a full 12-month artefact history across all modules is negligible at current volume.
- flyway validate adds ~5 seconds to every CI run. Acceptable.
- The RUN_INTEGRATION=1 gate means CI runs take longer (integration tests against live dev add 2–5 minutes per module). This is the correct trade-off: the alternative is shipping modules whose integration tests have never run in CI.


Compiled 2026-05-22 from source/entities/adrs/ADR-053.yaml