MOD-098 — Cost attribution engine

System: SD06 Snowflake Analytics & Risk Platform
Repo: bank-risk-platform
Status: In progress
Owner: Data & Risk Engineering
Build date: 2026-04-30


Purpose

MOD-098 converts raw usage events and infrastructure costs into attributed financial figures per licensee, per module, and per billing period. It is the engine that makes the three-part tariff (customer levy + facility fee + variable consumption) computable in Snowflake, and it produces the data that the future billing report (MOD-099) and internal finance reporting (MOD-080) consume.

FRs / NFRs satisfied

| ID | Summary |
|---|---|
| FR-393 | Pull AWS Cost Explorer daily, grouped by tenant_id and module_id tags, into metering.aws_cost_daily within 2h of CE availability |
| FR-394 | Attribute Snowflake compute per tenant: dedicated warehouses 1:1, shared warehouses proportional via query_history. ADR-046 pass 2: implemented as the dbt table model dbt/models/MOD-098.../snowflake_credit_daily.sql sourced from SNOWFLAKE.ACCOUNT_USAGE (not a Dynamic Table, since Snowflake forbids DTs on shared objects); replaces the previous Lambda. |
| FR-395 | Compute the three-part billing summary (customer levy + facility + variable) per tenant per period using the rate card active at period start; refresh every 4h |
| FR-396 | Maintain metering.unattributed_costs Dynamic Table; emit bank.risk-platform/unattributed_cost_threshold_exceeded when share > 2% |
| NFR-014 | Snowflake write-back operational target ≤ 60s; applies to Cost Explorer / credit fetcher inserts |
| NFR-024 | Audit log immutability: cost_rates, aws_cost_daily, snowflake_credit_daily, external_api_costs are append-only via REVOKE |
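
FR-395 binds each billing period to the rate card that was active at period start. The lookup over an effective_from / effective_to versioned card could be sketched as below. This is a minimal illustration, not the production model: the row shape, the rate_key and unit_price_usd fields, the sample prices, and the half-open interval convention are all assumptions.

```typescript
// Hypothetical row shape for METERING.COST_RATES. Only effective_from and
// effective_to are named in this document; the other fields are illustrative.
interface CostRate {
  rate_key: string;
  unit_price_usd: number;
  effective_from: string;      // YYYY-MM-DD, inclusive
  effective_to: string | null; // YYYY-MM-DD, exclusive; null means still open
}

// Return the version of `rateKey` in force on `periodStart`.
// Assumes non-overlapping half-open [effective_from, effective_to) intervals.
// YYYY-MM-DD strings compare correctly as plain strings.
function rateActiveAt(rates: CostRate[], rateKey: string, periodStart: string): CostRate | undefined {
  return rates.find(
    (r) =>
      r.rate_key === rateKey &&
      r.effective_from <= periodStart &&
      (r.effective_to === null || periodStart < r.effective_to),
  );
}

// Made-up rate card with one price change on 2026-04-01.
const rateCard: CostRate[] = [
  { rate_key: "facility_fee", unit_price_usd: 1000, effective_from: "2026-01-01", effective_to: "2026-04-01" },
  { rate_key: "facility_fee", unit_price_usd: 1200, effective_from: "2026-04-01", effective_to: null },
];

// A period that started on 2026-03-01 keeps the old price for its whole length.
const marchRate = rateActiveAt(rateCard, "facility_fee", "2026-03-01");
const aprilRate = rateActiveAt(rateCard, "facility_fee", "2026-04-01");
```

Binding on period start means a mid-period rate change only takes effect from the next period.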

Architecture

┌────────────────────┐    ┌──────────────────────┐    ┌────────────────────┐
│  AWS Cost Explorer │    │ Snowflake account_   │    │ MOD-097 usage      │
│  API (24-48h lag)  │    │ usage (1-3h lag)     │    │ events / external  │
└─────────┬──────────┘    └──────────┬───────────┘    │ API costs (input)  │
          │ daily 06 UTC             │ dbt source     └─────────┬──────────┘
          ▼                          │                          │
┌────────────────────┐               │                          │
│ cost-explorer-     │               │                          │
│ fetcher Lambda     │               │                          │
└─────────┬──────────┘               │                          │
          │ INSERT                   │                          │
          ▼                          ▼                          ▼
┌──────────────────────────────────────────────────────────────────────┐
│  METERING.AWS_COST_DAILY (landing)                                   │
│  METERING.SNOWFLAKE_CREDIT_DAILY (dbt table, FR-394; was a Lambda)   │
│  METERING.EXTERNAL_API_COSTS  METERING.USAGE_EVENTS (← MOD-097)      │
│  METERING.COST_RATES (rate card; ops-managed, append-only)           │
│  METERING.CONFIG (FR-396 threshold; ops-managed, append-history)     │
│  BILLING.TENANT_MODULES   BILLING.TENANT_TIERS (config)              │
└──────────┬───────────────────────────────────────────────────────────┘
           │ Dynamic Table refresh
┌──────────────────────────────────────────────────────────────────────┐
│  METERING.DAILY_TENANT_SUMMARY (1h lag)                              │
│  METERING.BILLING_PERIOD_SUMMARY (4h lag) ← FR-395                   │
│  METERING.UNIT_ECONOMICS (24h lag)                                   │
│  METERING.UNATTRIBUTED_COSTS (1h lag)    ← FR-396                    │
└──────────┬───────────────────────────────────────────────────────────┘
           │ daily 08 UTC read of exceeds_threshold flag
┌────────────────────┐
│ unattributed-      │
│ monitor Lambda     │
└─────────┬──────────┘
          │ thin alert publisher (no threshold logic)
   bank-risk-platform EventBridge
   source = bank.risk-platform
   detail-type = unattributed_cost_threshold_exceeded

Data model — metering.* and billing.*

Schemas, tables, and Dynamic Tables are defined in infra/snowflake/. The METERING.V_* views are the published output contract (ADR-046 §3); the Dynamic Tables, landing tables, and config tables behind them are internal.

| Object | Type | Refresh | Notes |
|---|---|---|---|
| METERING.AWS_COST_DAILY | Table | append | Cost Explorer landing, FR-393 (private) |
| METERING.SNOWFLAKE_CREDIT_DAILY | Table (dbt + Task managed) | dbt build at deploy + Snowflake Task every 4h | FR-394: per-tenant credit attribution (private). ADR-046 pass 2: was a Lambda-populated landing table; now a dbt table model sourced from SNOWFLAKE.ACCOUNT_USAGE. Cannot be a Dynamic Table because Snowflake forbids DTs on shared objects. Pass 5 added the Task T_REFRESH_SNOWFLAKE_CREDIT_DAILY for between-deploy freshness. |
| METERING.EXTERNAL_API_COSTS | Table | append | Stub schema; populated by MOD-097 ext-cost Lambda |
| METERING.COST_RATES | Table | append | Versioned rate card (effective_from / effective_to; private) |
| METERING.CONFIG | Table | append-history | Module-internal runtime config (ADR-046 §4). Seeds FR-396 unattributed-cost threshold; future thresholds + calculation params land here. Effective-from/to versioned. |
| METERING.V_AWS_COST_DAILY | View | always fresh | Published contract for AWS Cost Explorer landings (ADR-046 §3) |
| METERING.V_SNOWFLAKE_CREDIT_DAILY | View | always fresh | Published contract for FR-394 credit attribution (ADR-046 §3) |
| METERING.V_COST_RATES | View | always fresh | Published contract for the versioned rate card (ADR-046 §3) |
| METERING.V_DAILY_TENANT_SUMMARY | View | always fresh | Published contract for the daily per-tenant cost rollup (ADR-046 §3) |
| METERING.V_BILLING_PERIOD_SUMMARY | View (gated) | always fresh | Published contract for FR-395 (gated on mod_099_built) |
| METERING.V_UNIT_ECONOMICS | View (gated) | always fresh | Internal contract for per-tenant gross margin (gated on mod_099_built) |
| METERING.V_UNATTRIBUTED_COSTS | View | always fresh | Published contract for FR-396 (read by the unattributed-monitor Lambda + MOD-076) |
| METERING.USAGE_EVENTS | Table | append | Owned by MOD-097; read-only here |
| BILLING.TENANT_MODULES | Table | append-history | Tenant → activated module timeline |
| BILLING.TENANT_TIERS | Table | append-history | Tenant → tier subscription with included thresholds |
| METERING.DAILY_TENANT_SUMMARY | Dynamic | INCREMENTAL, 1h lag | Rolled up per tenant/module/resource_type/day |
| METERING.BILLING_PERIOD_SUMMARY | Dynamic | FULL, 4h lag | FR-395: three-part tariff per tenant per period |
| METERING.UNIT_ECONOMICS | Dynamic | FULL, 24h lag | Internal use: gross margin per tenant |
| METERING.UNATTRIBUTED_COSTS | Dynamic | INCREMENTAL, 1h lag | FR-396: share + threshold flag |

ADR-046 §3 view-as-product (pass 4): every consumed object has a V_* view that is the published contract. SSM outputs point at the views. The unattributed-monitor Lambda reads METERING.V_UNATTRIBUTED_COSTS. Downstream modules (MOD-080 statutory reporting, future MOD-099 billing report, MOD-076 dashboards) reference METERING.V_* and never the underlying tables directly.

Lambdas — src/modules/MOD-098/handlers/

| Lambda | Schedule | Purpose |
|---|---|---|
| cost-explorer-fetcher | cron(0 6 * * ? *) (06:00 UTC daily) | FR-393: pull Cost Explorer for last 2 days, INSERT into metering.aws_cost_daily |
| unattributed-monitor | cron(0 8 * * ? *) (08:00 UTC daily, post-CE) | FR-396: thin alert publisher per ADR-046 §6. Reads the latest metering.unattributed_costs row and publishes the EventBridge event if the dbt-computed exceeds_threshold flag is TRUE. No threshold logic in Lambda. |

(snowflake-credit-fetcher was removed in ADR-046 pass 2: its 4-hour cron, 37-line CTE, and TS attribution arithmetic are all consolidated into the dbt model dbt/models/MOD-098.../snowflake_credit_daily.sql, a dbt table model refreshed every 4 hours by the Snowflake Task METERING.T_REFRESH_SNOWFLAKE_CREDIT_DAILY.)

Reserved concurrency: 3 per Lambda. Memory: 512 MB. Timeout: 120s (300s for Cost Explorer fetcher).
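
The "last 2 days" window that cost-explorer-fetcher requests can be illustrated with a small date helper. Cost Explorer's TimePeriod takes YYYY-MM-DD strings with an inclusive Start and an exclusive End. The function name and the reading of "last 2 days" as the two complete UTC days before the run date are assumptions, not confirmed behaviour of the handler.

```typescript
// Cost Explorer TimePeriod: Start is inclusive, End is exclusive.
interface TimePeriod {
  Start: string;
  End: string;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Window covering the `days` complete UTC days before the day of `now`.
// Assumption: the still-incomplete current day is excluded.
function lookbackWindow(now: Date, days: number): TimePeriod {
  const todayUtc = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  const toDay = (ms: number): string => new Date(ms).toISOString().slice(0, 10);
  return { Start: toDay(todayUtc - days * MS_PER_DAY), End: toDay(todayUtc) };
}

// A 06:00 UTC run on 2026-05-03 would request 2026-05-01 and 2026-05-02.
const fetchWindow = lookbackWindow(new Date("2026-05-03T06:00:00Z"), 2);
```

Re-requesting the previous day on every run gives late-arriving Cost Explorer data a second chance to land, which fits the 24-48h lag noted in the architecture diagram.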

Domain — src/modules/MOD-098/domain/

| Module | Responsibility |
|---|---|
| cost-explorer-client.ts | Paginated GetCostAndUsage with 5xx-retry / 4xx-fail-fast. Maps AWS group structure to CostExplorerRow |

Removed in the ADR-046 refactor:

  • unattributed-evaluator.ts (pass 1): threshold lookup + comparison moved to dbt/models/MOD-098.../unattributed_costs.sql, joined to METERING.CONFIG for the threshold value.
  • snowflake-credit-attribution.ts (pass 2): credit attribution arithmetic moved to dbt/models/MOD-098.../snowflake_credit_daily.sql as CASE expressions.

Events

| Direction | Bus | Source | Detail-type | Schema |
|---|---|---|---|---|
| Publish | bank-risk-platform-{env} | bank.risk-platform | unattributed_cost_threshold_exceeded | { trace_id, alert_id, cost_date, total_aws_cost_usd, unattributed_cost_usd, unattributed_share, threshold, event_time } |

Consumes: none.
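
The published event can be written down as a typed PutEvents entry. The detail field names come from the schema above; their TypeScript types, the builder function, and the sample values (including the "dev" environment and the alert_id format) are assumptions for illustration only.

```typescript
// Detail fields as listed in the Events schema; the types are assumed.
interface UnattributedCostDetail {
  trace_id: string;
  alert_id: string;
  cost_date: string;
  total_aws_cost_usd: number;
  unattributed_cost_usd: number;
  unattributed_share: number;
  threshold: number;
  event_time: string;
}

// One EventBridge PutEvents entry (field names follow the AWS API).
interface PutEventsEntry {
  EventBusName: string;
  Source: string;
  DetailType: string;
  Detail: string; // JSON-encoded detail
}

// Hypothetical builder: wraps the detail in the bus/source/detail-type above.
function buildAlertEntry(env: string, detail: UnattributedCostDetail): PutEventsEntry {
  return {
    EventBusName: `bank-risk-platform-${env}`,
    Source: "bank.risk-platform",
    DetailType: "unattributed_cost_threshold_exceeded",
    Detail: JSON.stringify(detail),
  };
}

// Sample values are made up.
const alertEntry = buildAlertEntry("dev", {
  trace_id: "trace-123",
  alert_id: "unattributed-2026-05-02",
  cost_date: "2026-05-02",
  total_aws_cost_usd: 1000,
  unattributed_cost_usd: 30,
  unattributed_share: 0.03,
  threshold: 0.02,
  event_time: "2026-05-03T08:00:00Z",
});
```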

Error code enumeration

| error_code | Class | Source | Recovery |
|---|---|---|---|
| COST_EXPLORER_4XX | PROVIDER_ERROR (non-retryable) | cost-explorer-client | Surface to ops; investigate IAM / API enablement |
| COST_EXPLORER_UNAVAILABLE | TRANSIENT_INFRA | cost-explorer-client | Lambda retry via EventBridge schedule |
| EVENTBRIDGE_PUBLISH_FAILED | TRANSIENT_INFRA | event publisher | Lambda retry |
| SNOWFLAKE_* | TRANSIENT_INFRA | shared/snowflake | Lambda retry |
| ENV_MISSING | TRANSIENT_INFRA | handlers | Deploy / SSM mismatch |
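
The two Cost Explorer rows correspond to the client's 5xx-retry / 4xx-fail-fast rule. A sketch of that mapping from HTTP status to error code follows. The function and type names are hypothetical, and the real client may classify on SDK exception types rather than raw status codes.

```typescript
type ErrorClass = "PROVIDER_ERROR" | "TRANSIENT_INFRA";

interface ClassifiedError {
  error_code: string;
  error_class: ErrorClass;
  retryable: boolean;
}

// Map an HTTP status from Cost Explorer to the codes in the table above.
// Returns undefined for statuses outside 4xx/5xx (not an error).
function classifyCostExplorerStatus(status: number): ClassifiedError | undefined {
  if (status >= 500 && status <= 599) {
    return { error_code: "COST_EXPLORER_UNAVAILABLE", error_class: "TRANSIENT_INFRA", retryable: true };
  }
  if (status >= 400 && status <= 499) {
    return { error_code: "COST_EXPLORER_4XX", error_class: "PROVIDER_ERROR", retryable: false };
  }
  return undefined;
}

const serverFault = classifyCostExplorerStatus(503); // retried
const accessDenied = classifyCostExplorerStatus(403); // surfaced to ops, not retried
```

The sketch applies the 4xx-fail-fast rule literally. Whether throttling responses should be retried instead is not stated in this document.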

Event type registry (logger event_type values)

cost_explorer_fetch_completed, cost_explorer_empty, unattributed_monitor_check_ok, unattributed_cost_threshold_exceeded, unattributed_monitor_no_data.

(snowflake_credit_fetch_completed was emitted by the snowflake-credit-fetcher Lambda removed in ADR-046 pass 2.)

SSM parameter contract

Reads (consumed)

| SSM path | From |
|---|---|
| /bank/{env}/iam/lambda/bank-risk-platform/arn | MOD-104 |
| /bank/{env}/observability/adot-layer-arn | MOD-076 |
| /bank/{env}/eventbridge/bank-risk-platform/arn | MOD-104 |
| /bank/{env}/eventbridge/bank-risk-platform/dlq-arn | MOD-104 |
| /bank/{env}/snowflake/account-locator | MOD-102 (proposed contract) |
| /bank/{env}/snowflake/mod-098/warehouse | MOD-102 (proposed contract) |
| /bank/{env}/snowflake/mod-098/database | MOD-102 (proposed contract) |
| /bank/{env}/snowflake/mod-098/ingest-role | MOD-102 (proposed contract) |
| /bank/{env}/snowflake/mod-098/ingest-secret-arn | MOD-102 (proposed contract) |

Writes (published) — same /bank/{env}/risk-platform/* convention as MOD-085

| SSM path | Value | Consumed by |
|---|---|---|
| /bank/{env}/risk-platform/metering/daily-tenant-summary-table | METERING.V_DAILY_TENANT_SUMMARY | MOD-080 (statutory reporting), future MOD-099 |
| /bank/{env}/risk-platform/metering/billing-period-summary-table | METERING.V_BILLING_PERIOD_SUMMARY | MOD-080, future MOD-099 |
| /bank/{env}/risk-platform/metering/unit-economics-table | METERING.V_UNIT_ECONOMICS | Internal finance |
| /bank/{env}/risk-platform/metering/unattributed-costs-table | METERING.V_UNATTRIBUTED_COSTS | MOD-076 dashboards, MOD-098 unattributed-monitor Lambda |
| /bank/{env}/risk-platform/metering/aws-cost-daily-table | METERING.V_AWS_COST_DAILY | Ops |
| /bank/{env}/risk-platform/metering/snowflake-credit-daily-table | METERING.V_SNOWFLAKE_CREDIT_DAILY | Ops |
| /bank/{env}/risk-platform/metering/cost-rates-table | METERING.V_COST_RATES | Ops, future MOD-099 |
| /bank/{env}/risk-platform/metering/event-source-name | bank.risk-platform | SD04/SD07 EB rules |
| /bank/{env}/risk-platform/metering/cost-explorer-fetcher-arn | Lambda ARN | Ops/runbooks |
| /bank/{env}/risk-platform/metering/unattributed-monitor-arn | Lambda ARN | Ops/runbooks |

(The snowflake-credit-fetcher-arn SSM output was removed in ADR-046 pass 2 because the Lambda no longer exists. Running pulumi up --refresh deletes the orphaned SSM parameter on the next deploy.)
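
A consumer resolving the published parameters only needs the path convention and the environment name. The helper below is a hypothetical sketch of that convention. The path segments and view names are copied from the Writes table, while the function names and the "prod" environment are illustrative.

```typescript
// [last path segment, published view] pairs copied from the Writes table.
const PUBLISHED_METERING_VIEWS: Array<[string, string]> = [
  ["daily-tenant-summary-table", "METERING.V_DAILY_TENANT_SUMMARY"],
  ["billing-period-summary-table", "METERING.V_BILLING_PERIOD_SUMMARY"],
  ["unit-economics-table", "METERING.V_UNIT_ECONOMICS"],
  ["unattributed-costs-table", "METERING.V_UNATTRIBUTED_COSTS"],
  ["aws-cost-daily-table", "METERING.V_AWS_COST_DAILY"],
  ["snowflake-credit-daily-table", "METERING.V_SNOWFLAKE_CREDIT_DAILY"],
  ["cost-rates-table", "METERING.V_COST_RATES"],
];

// Build one published metering parameter path for an environment.
function meteringParamPath(env: string, segment: string): string {
  return `/bank/${env}/risk-platform/metering/${segment}`;
}

// All view-valued parameters for one environment as [path, value] pairs.
function publishedViewParams(env: string): Array<[string, string]> {
  return PUBLISHED_METERING_VIEWS.map(
    (pair): [string, string] => [meteringParamPath(env, pair[0]), pair[1]],
  );
}

const prodParams = publishedViewParams("prod");
```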

Acceptance criterion — REP-001 CALC

Produces daily attributed cost and running billing period totals per licensee — the authoritative source for SaaS invoices and internal gross margin reporting.

Per acceptance-criteria.md, CALC mode requires running the calculation against known inputs and verifying the output matches an independently computed result, plus boundary cases.

Test: tests/policy/REP-001-calc-correctness.test.ts.

  • Snowflake credit attribution arithmetic (FR-394): dedicated 42 × 2.5 = 105.000000; proportional 17 × 3.0 = 51.000000; sum-invariant check.
  • billing_period_summary (FR-395) structural correctness: contains all four tariff components; rate-card binding uses period_start; total sums all four; refresh mode FULL with 4-hour lag.
  • unattributed_costs (FR-396) structural correctness: SQL uses strict > (boundary case share == threshold does not alert); exceeds_threshold is emitted as a precomputed boolean; threshold value is sourced from METERING.CONFIG rather than a hard-coded literal; threshold_used is captured per row for audit.
  • METERING.CONFIG (ADR-046 §4): seed migration creates the table with effective-from/to versioning and inserts the FR-396 threshold (0.02) idempotently.
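
The sum-invariant check on shared-warehouse attribution can be illustrated outside SQL. The sketch below splits one warehouse's credits across tenants in proportion to usage and shows that the parts add back up to the whole. Weighting by execution time is an assumption, since the document only says "proportional via query_history". The tenant names and numbers are made up and are not the fixture values from the test file.

```typescript
// Hypothetical per-tenant usage on one shared warehouse for one day,
// e.g. execution time summed from query_history.
interface TenantUsage {
  tenant_id: string;
  execution_ms: number;
}

// Split `warehouseCredits` across tenants in proportion to execution time.
function attributeSharedCredits(usage: TenantUsage[], warehouseCredits: number): Map<string, number> {
  const totalMs = usage.reduce((sum, u) => sum + u.execution_ms, 0);
  const result = new Map<string, number>();
  if (totalMs === 0) {
    return result; // no tenant activity: nothing is attributed
  }
  usage.forEach((u) => {
    const prior = result.get(u.tenant_id) || 0;
    result.set(u.tenant_id, prior + (warehouseCredits * u.execution_ms) / totalMs);
  });
  return result;
}

const sharedCredits = attributeSharedCredits(
  [
    { tenant_id: "tenant-a", execution_ms: 3000 },
    { tenant_id: "tenant-b", execution_ms: 1000 },
  ],
  8,
);

// Sum invariant: attributed credits add up to the warehouse total.
let attributedTotal = 0;
sharedCredits.forEach((credits) => {
  attributedTotal += credits;
});
```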

Runtime numerical correctness against the deployed Snowflake account is in the integration layer (tests/integration/rep-001-calc-runtime.test.ts) — seeds AWS_COST_DAILY with a known fixture, force-refreshes the UNATTRIBUTED_COSTS Dynamic Table, and asserts the dbt-computed UNATTRIBUTED_SHARE, THRESHOLD_USED, and EXCEEDS_THRESHOLD columns match independent arithmetic.
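
The "independent arithmetic" the integration test compares against can be as small as the function below. It is a sketch of the FR-396 rule as described here (share = unattributed / total, strict > against the configured threshold), not the dbt model itself. Treating a zero-cost day as share 0 is an assumption.

```typescript
interface ThresholdCheck {
  unattributed_share: number;
  threshold_used: number;
  exceeds_threshold: boolean;
}

// Recompute the three audited columns from raw totals.
function evaluateUnattributed(totalUsd: number, unattributedUsd: number, threshold: number): ThresholdCheck {
  // Assumption: a day with no cost at all has share 0 and never alerts.
  const share = totalUsd > 0 ? unattributedUsd / totalUsd : 0;
  return {
    unattributed_share: share,
    threshold_used: threshold,
    exceeds_threshold: share > threshold, // strict: share == threshold does not alert
  };
}

const atBoundary = evaluateUnattributed(100, 2, 0.02); // share equals threshold
const aboveThreshold = evaluateUnattributed(100, 3, 0.02);
```

The boundary case is the one worth pinning in a fixture: with a 0.02 threshold, 2 USD unattributed out of 100 USD must not raise the flag, while 3 USD must.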

Observability

  • ADOT layer attached to all Lambdas (per MOD-085 pattern).
  • extract_trace_context at the top of every handler.
  • Structured logs use StructuredLogger with mandatory observability fields.
  • EMF metrics: cost_explorer_rows_written_total, snowflake_credit_rows_written_total, unattributed_share.
  • Dashboard provisioned at bank-{env}-MOD-098.

Idempotency

  • Cost Explorer fetcher: each run uses a fresh run_id. Reprocessing the same window is safe — table is append-only and de-duplicated downstream by Dynamic Tables joining on cost_date.
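  • Snowflake credit refresh: the snowflake-credit-fetcher Lambda, which used a fresh run_id per run, was removed in ADR-046 pass 2; the table is now refreshed by the Task METERING.T_REFRESH_SNOWFLAKE_CREDIT_DAILY (CTAS from ACCOUNT_USAGE).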
  • Threshold monitor: re-evaluates the latest unattributed_costs row each run; multiple firings within a day are deduplicated downstream by alert_id.
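
Why reprocessing an append-only landing table is safe can be shown with a "latest run wins" reduction. This is one possible de-duplication rule, sketched under assumptions: the document does not say how the Dynamic Tables de-duplicate beyond joining on cost_date, and the fetched_at column and the key used here are hypothetical.

```typescript
// Hypothetical landing row; fetched_at is an assumed per-run timestamp column.
interface LandedCostRow {
  run_id: string;
  fetched_at: string; // ISO-8601 UTC timestamp, comparable as a string
  cost_date: string;
  tenant_id: string;
  service: string;
  cost_usd: number;
}

// Keep only the most recently fetched row per (cost_date, tenant_id, service).
function latestPerKey(rows: LandedCostRow[]): LandedCostRow[] {
  const latest = new Map<string, LandedCostRow>();
  rows.forEach((row) => {
    const key = [row.cost_date, row.tenant_id, row.service].join("|");
    const seen = latest.get(key);
    if (seen === undefined || row.fetched_at > seen.fetched_at) {
      latest.set(key, row);
    }
  });
  return Array.from(latest.values());
}

// Two runs covered 2026-05-01; the later run carries a revised cost.
const landedRows: LandedCostRow[] = [
  { run_id: "run-1", fetched_at: "2026-05-02T06:00:00Z", cost_date: "2026-05-01", tenant_id: "tenant-a", service: "AmazonS3", cost_usd: 10 },
  { run_id: "run-2", fetched_at: "2026-05-03T06:00:00Z", cost_date: "2026-05-01", tenant_id: "tenant-a", service: "AmazonS3", cost_usd: 12 },
  { run_id: "run-2", fetched_at: "2026-05-03T06:00:00Z", cost_date: "2026-05-02", tenant_id: "tenant-a", service: "AmazonS3", cost_usd: 7 },
];
const dedupedRows = latestPerKey(landedRows);
```

Under this rule a re-run never double-counts a day; it only supersedes earlier rows for the same key.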

Snowflake Task DAG (ADR-046 §1)

| Task | Schedule | Owns | Notes |
|---|---|---|---|
| METERING.T_REFRESH_SNOWFLAKE_CREDIT_DAILY | USING CRON 0 */4 * * * UTC | snowflake_credit_daily refresh | Pass 5. Mirrors the dbt model body (CTAS from ACCOUNT_USAGE). Replaces the legacy mod-098-snowflake-credit-fetcher Lambda's 4-hour cron. Body duplicates the dbt model SQL for now; a pass-6 candidate is to deduplicate via stored procedure or Snowpark Container Services running dbt. |

The other Lambdas (cost-explorer-fetcher, unattributed-monitor) keep their CloudWatch crons — both are explicitly justified by ADR-046 §6 (external API reader, alert publisher).

Out of scope (explicit non-goals)

  • The rate card data itself — managed by ops via approved rate-change process; MOD-098 only owns the schema.
  • BILLING.TENANT_MODULES / TENANT_TIERS data — populated by onboarding flow (out of scope here); schemas declared so MOD-098 has stable joins.
  • Invoice generation and PDF rendering — future MOD-099 in another repo.
  • AWS resource tagging compliance — MOD-104 GOV-005 enforces tagging at synthesis time. MOD-098 surfaces breaches via unattributed_costs.
  • NZ public-holiday awareness for billing periods — periods are calendar months.

Open items at handoff

  1. MOD-097 SSM contract for metering.usage_events — MOD-097 lives in bank-platform and writes metering.usage_events. Its design doc doesn't publish a per-table SSM. MOD-098 hard-codes METERING.USAGE_EVENTS in DCM SQL — confirm name with MOD-097 before first Snowflake apply.
  2. MOD-097 external API cost contract: MOD-098.md says "Pulled by MOD-097's external cost Lambda and available in metering.external_api_costs" but MOD-097's design doc has no such Lambda. MOD-098 declares the schema as a stub. Either (a) MOD-097 adds the Lambda or (b) MOD-098 absorbs the responsibility in a follow-on iteration.
  3. billing.* schema ownership — MOD-098 currently provisions BILLING.TENANT_MODULES and BILLING.TENANT_TIERS. These may move to a future MOD-099 (billing/onboarding) when that module ships. Captured in handoff.
  4. AWS Cost Explorer dimensions: the current implementation groups by SERVICE and the tenant_id tag. FR-393 also requires grouping by module_id, but Cost Explorer's GroupBy accepts at most two keys, so module_id is to be captured via a tag filter on subsequent passes (not yet implemented). Flagged for follow-up.
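
For open item 4, one way the tag-filter approach could look is a GetCostAndUsage request per module_id value that keeps the two existing GroupBy keys and adds a Tags filter. This is a sketch of a possible follow-up, not the implemented behaviour. The request field names follow the AWS API, while the metric choice, the function name, and the sample module_id value are assumptions.

```typescript
// Subset of the GetCostAndUsage request shape used in this sketch.
interface GroupDefinition {
  Type: "DIMENSION" | "TAG" | "COST_CATEGORY";
  Key: string;
}

interface CostRequest {
  TimePeriod: { Start: string; End: string };
  Granularity: "DAILY";
  Metrics: string[];
  GroupBy: GroupDefinition[];
  Filter?: { Tags: { Key: string; Values: string[] } };
}

// One request per module: group by SERVICE and tenant_id (the two-key limit),
// and restrict the result to a single module_id via the tag filter.
function requestForModule(start: string, end: string, moduleId: string): CostRequest {
  return {
    TimePeriod: { Start: start, End: end },
    Granularity: "DAILY",
    Metrics: ["UnblendedCost"], // metric choice is an assumption
    GroupBy: [
      { Type: "DIMENSION", Key: "SERVICE" },
      { Type: "TAG", Key: "tenant_id" },
    ],
    Filter: { Tags: { Key: "module_id", Values: [moduleId] } },
  };
}

// Hypothetical tag value; real module_id tag values are not given here.
const moduleRequest = requestForModule("2026-05-01", "2026-05-03", "MOD-098");
```

The trade-off is one API call per module per window instead of one call overall, which matters for Cost Explorer request charges and the fetcher's 300s timeout.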