MOD-098: Cost attribution engine
System: SD06 Snowflake Analytics & Risk Platform
Repo: bank-risk-platform
Status: In progress
Owner: Data & Risk Engineering
Build date: 2026-04-30
Purpose
MOD-098 converts raw usage events and infrastructure costs into attributed financial figures per licensee, per module, and per billing period. It is the engine that makes the three-part tariff (customer levy + facility fee + variable consumption) computable in Snowflake, and it produces the data that the future billing report (MOD-099) and internal finance reporting (MOD-080) consume.
FRs / NFRs satisfied
| ID | Summary |
|---|---|
| FR-393 | Pull AWS Cost Explorer daily, grouped by tenant_id and module_id tags, into metering.aws_cost_daily within 2h of CE availability |
| FR-394 | Attribute Snowflake compute per tenant: dedicated warehouses 1:1, shared warehouses proportionally via query_history. ADR-046 pass 2: implemented as the dbt table model dbt/models/MOD-098.../snowflake_credit_daily.sql sourced from SNOWFLAKE.ACCOUNT_USAGE; replaces the previous Lambda. Refreshed between deploys by the Snowflake Task T_REFRESH_SNOWFLAKE_CREDIT_DAILY (pass 5). |
| FR-395 | Compute the three-part billing summary (customer levy + facility + variable) per tenant per period using rate card active at period start; refresh every 4h |
| FR-396 | Maintain metering.unattributed_costs Dynamic Table; emit bank.risk-platform/unattributed_cost_threshold_exceeded when share > 2% |
| NFR-014 | Snowflake write-back operational target ≤ 60s — applies to Cost Explorer / credit fetcher inserts |
| NFR-024 | Audit log immutability — cost_rates, aws_cost_daily, snowflake_credit_daily, external_api_costs are append-only via REVOKE |
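The FR-394 rule (dedicated warehouses 1:1, shared warehouses proportional) can be illustrated with a small TypeScript sketch. This is not the dbt model: the type names and fields are hypothetical, and weighting shared warehouses by per-tenant execution time is an assumption, since the source only says the split is derived from query_history.

```typescript
// Hypothetical input shape for illustration; not the dbt model's columns.
interface WarehouseUsage {
  warehouse: string;
  creditsUsed: number;
  // Present for a dedicated warehouse, absent for a shared one.
  dedicatedTenant?: string;
  // Per-tenant execution time on a shared warehouse (assumed weighting metric).
  execMsByTenant?: { [tenantId: string]: number };
}

interface CreditAttribution {
  byTenant: { [tenantId: string]: number };
  unattributed: number;
}

function attributeCredits(usage: WarehouseUsage[]): CreditAttribution {
  const byTenant: { [tenantId: string]: number } = {};
  let unattributed = 0;
  for (const u of usage) {
    if (u.dedicatedTenant !== undefined) {
      // Dedicated warehouse: every credit goes to its single tenant.
      byTenant[u.dedicatedTenant] = (byTenant[u.dedicatedTenant] || 0) + u.creditsUsed;
      continue;
    }
    const weights = u.execMsByTenant || {};
    const tenants = Object.keys(weights);
    let totalMs = 0;
    for (const t of tenants) totalMs += weights[t] || 0;
    if (totalMs === 0) {
      // Nothing to weight by, so the credits stay unattributed.
      unattributed += u.creditsUsed;
      continue;
    }
    // Shared warehouse: split credits by each tenant's share of execution time.
    for (const t of tenants) {
      byTenant[t] = (byTenant[t] || 0) + u.creditsUsed * ((weights[t] || 0) / totalMs);
    }
  }
  return { byTenant, unattributed };
}

// Sum invariant: attributed plus unattributed credits equal the credits consumed.
function totalAttributed(a: CreditAttribution): number {
  let sum = a.unattributed;
  for (const t of Object.keys(a.byTenant)) sum += a.byTenant[t] || 0;
  return sum;
}
```

For example, a dedicated warehouse with 42 credits for `tenant-a` plus a shared warehouse with 17 credits split 3000 ms to 1000 ms between `tenant-a` and `tenant-b` yields 54.75 and 4.25 credits, and `totalAttributed` returns 59.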
Architecture
┌────────────────────┐  ┌──────────────────────┐  ┌────────────────────┐
│ AWS Cost Explorer  │  │ Snowflake account_   │  │ MOD-097 usage      │
│ API (24-48h lag)   │  │ usage (1-3h lag)     │  │ events / external  │
└─────────┬──────────┘  └──────────┬───────────┘  │ API costs (input)  │
          │ daily 06 UTC           │ dbt source   └─────────┬──────────┘
          ▼                        │                        │
┌────────────────────┐             │                        │
│ cost-explorer-     │             │                        │
│ fetcher Lambda     │             │                        │
└─────────┬──────────┘             │                        │
          │ INSERT                 │                        │
          ▼                        ▼                        ▼
┌──────────────────────────────────────────────────────────────────────┐
│ METERING.AWS_COST_DAILY (landing)                                    │
│ METERING.SNOWFLAKE_CREDIT_DAILY (dbt table+Task; FR-394, was Lambda) │
│ METERING.EXTERNAL_API_COSTS   METERING.USAGE_EVENTS (← MOD-097)      │
│ METERING.COST_RATES (rate card; ops-managed, append-only)            │
│ METERING.CONFIG (FR-396 threshold; ops-managed, append-history)      │
│ BILLING.TENANT_MODULES   BILLING.TENANT_TIERS (config)               │
└──────────┬───────────────────────────────────────────────────────────┘
           │ Dynamic Table refresh
           ▼
┌──────────────────────────────────────────────────────────────────────┐
│ METERING.DAILY_TENANT_SUMMARY (1h lag)                               │
│ METERING.BILLING_PERIOD_SUMMARY (4h lag) ← FR-395                    │
│ METERING.UNIT_ECONOMICS (24h lag)                                    │
│ METERING.UNATTRIBUTED_COSTS (1h lag) ← FR-396                        │
└──────────┬───────────────────────────────────────────────────────────┘
           │ daily 08 UTC read of exceeds_threshold flag
           ▼
 ┌────────────────────┐
 │ unattributed-      │
 │ monitor Lambda     │
 └─────────┬──────────┘
           │ thin alert publisher (no threshold logic)
           ▼
  bank-risk-platform EventBridge
    source = bank.risk-platform
    detail-type = unattributed_cost_threshold_exceeded
Data model: metering.* and billing.*
Schemas, tables, and Dynamic Tables are defined in infra/snowflake/. The METERING.V_* views are the published output contract (ADR-046 §3); the Dynamic Tables behind them and the landing and config tables are internal.
| Object | Type | Refresh | Notes |
|---|---|---|---|
| `METERING.AWS_COST_DAILY` | Table | append | Cost Explorer landing, FR-393 (private) |
| `METERING.SNOWFLAKE_CREDIT_DAILY` | Table (dbt + Task managed) | dbt build at deploy + Snowflake Task every 4h | FR-394: per-tenant credit attribution (private). ADR-046 pass 2: was a Lambda-populated landing table; now a dbt table model sourced from SNOWFLAKE.ACCOUNT_USAGE. Cannot be a Dynamic Table because Snowflake forbids DTs on shared objects. Pass 5 added the Task T_REFRESH_SNOWFLAKE_CREDIT_DAILY for between-deploy freshness. |
| `METERING.EXTERNAL_API_COSTS` | Table | append | Stub schema; populated by MOD-097 ext-cost Lambda |
| `METERING.COST_RATES` | Table | append | Versioned rate card (effective_from / effective_to; private) |
| `METERING.CONFIG` | Table | append-history | Module-internal runtime config (ADR-046 §4). Seeds FR-396 unattributed-cost threshold; future thresholds + calculation params land here. Effective-from/to versioned. |
| `METERING.V_AWS_COST_DAILY` | View | always fresh | Published contract for AWS Cost Explorer landings (ADR-046 §3) |
| `METERING.V_SNOWFLAKE_CREDIT_DAILY` | View | always fresh | Published contract for FR-394 credit attribution (ADR-046 §3) |
| `METERING.V_COST_RATES` | View | always fresh | Published contract for the versioned rate card (ADR-046 §3) |
| `METERING.V_DAILY_TENANT_SUMMARY` | View | always fresh | Published contract for the daily per-tenant cost rollup (ADR-046 §3) |
| `METERING.V_BILLING_PERIOD_SUMMARY` | View (gated) | always fresh | Published contract for FR-395 (gated on mod_099_built) |
| `METERING.V_UNIT_ECONOMICS` | View (gated) | always fresh | Internal contract for per-tenant gross margin (gated on mod_099_built) |
| `METERING.V_UNATTRIBUTED_COSTS` | View | always fresh | Published contract for FR-396 (read by the unattributed-monitor Lambda + MOD-076) |
| `METERING.USAGE_EVENTS` | Table | append | Owned by MOD-097; read-only here |
| `BILLING.TENANT_MODULES` | Table | append-history | Tenant → activated module timeline |
| `BILLING.TENANT_TIERS` | Table | append-history | Tenant → tier subscription with included thresholds |
| `METERING.DAILY_TENANT_SUMMARY` | Dynamic | INCREMENTAL, 1h lag | Rolled up per tenant/module/resource_type/day |
| `METERING.BILLING_PERIOD_SUMMARY` | Dynamic | FULL, 4h lag | FR-395: three-part tariff per tenant per period |
| `METERING.UNIT_ECONOMICS` | Dynamic | FULL, 24h lag | Internal use: gross margin per tenant |
| `METERING.UNATTRIBUTED_COSTS` | Dynamic | INCREMENTAL, 1h lag | FR-396: share + threshold flag |

ADR-046 §3 view-as-product (pass 4): every consumed object has a V_* view that is the published contract. SSM outputs point at the views. The unattributed-monitor Lambda reads METERING.V_UNATTRIBUTED_COSTS. Downstream modules (MOD-080 statutory reporting, future MOD-099 billing report, MOD-076 dashboards) reference METERING.V_* and never the underlying tables directly.
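Both METERING.COST_RATES and METERING.CONFIG are effective-from/to versioned, and FR-395 binds the rate card that is active at period start. A minimal sketch of that lookup follows. The row shape and the `activeAt` helper are hypothetical, and treating `effectiveTo` as an exclusive bound is an assumption that the source does not confirm.

```typescript
// Hypothetical row shape for an effective-dated table (names are assumptions).
interface Versioned<T> {
  effectiveFrom: string;      // inclusive, ISO date "YYYY-MM-DD"
  effectiveTo: string | null; // exclusive (assumed); null means still open
  value: T;
}

// Returns the row active on `asOf`, or undefined when no version covers it.
// ISO dates compare correctly as plain strings.
function activeAt<T>(rows: Versioned<T>[], asOf: string): Versioned<T> | undefined {
  const matches = rows.filter(
    (r) => r.effectiveFrom <= asOf && (r.effectiveTo === null || asOf < r.effectiveTo)
  );
  if (matches.length > 1) {
    // Overlapping versions would make the binding ambiguous.
    throw new Error("overlapping versions active on " + asOf);
  }
  return matches[0];
}
```

With a rate that changes on 2026-04-15, a billing period starting 2026-04-01 resolves to the earlier rate for the whole period, which is the behaviour that binding at period start implies.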
Lambdas: src/modules/MOD-098/handlers/
| Lambda | Schedule | Purpose |
|---|---|---|
| `cost-explorer-fetcher` | `cron(0 6 * * ? *)` (06:00 UTC daily) | FR-393: pull Cost Explorer for the last 2 days, INSERT into metering.aws_cost_daily |
| `unattributed-monitor` | `cron(0 8 * * ? *)` (08:00 UTC daily, post-CE) | FR-396: thin alert publisher per ADR-046 §6. Reads the latest metering.unattributed_costs row and publishes the EventBridge event if the dbt-computed exceeds_threshold flag is TRUE. No threshold logic in the Lambda. |
(snowflake-credit-fetcher was removed in ADR-046 pass 2: its 4-hour cron, 37-line CTE, and TS attribution arithmetic are all consolidated into the dbt model dbt/models/MOD-098.../snowflake_credit_daily.sql, materialized as a dbt table and refreshed every 4 hours by the Snowflake Task T_REFRESH_SNOWFLAKE_CREDIT_DAILY added in pass 5.)
Reserved concurrency: 3 per Lambda. Memory: 512 MB. Timeout: 120s (300s for Cost Explorer fetcher).
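The cost-explorer-fetcher pulls "the last 2 days" on each run. One way to turn that into a Cost Explorer `TimePeriod` is sketched below. The helper name is hypothetical, and reading "last 2 days" as the two complete UTC days before the run date is an assumption. Cost Explorer treats `Start` as inclusive and `End` as exclusive.

```typescript
// Builds a Cost Explorer TimePeriod covering the `days` complete UTC days
// before the day on which `now` falls.
function costExplorerWindow(now: Date, days: number = 2): { Start: string; End: string } {
  const msPerDay = 24 * 60 * 60 * 1000;
  // Truncate to UTC midnight so the window covers whole days only.
  const todayUtc = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  const toIsoDate = (ms: number): string => new Date(ms).toISOString().slice(0, 10);
  return {
    Start: toIsoDate(todayUtc - days * msPerDay), // inclusive
    End: toIsoDate(todayUtc),                     // exclusive
  };
}
```

A run at 06:00 UTC on 2026-04-30 would request `Start = 2026-04-28` and `End = 2026-04-30`, covering the 28th and 29th. Re-fetching the previous day on each run gives late-arriving Cost Explorer data a second chance to land.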
Domain: src/modules/MOD-098/domain/
| Module | Responsibility |
|---|---|
| `cost-explorer-client.ts` | Paginated GetCostAndUsage with 5xx-retry / 4xx-fail-fast. Maps the AWS group structure to CostExplorerRow |
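The pagination and retry behaviour described for cost-explorer-client.ts might look roughly like the sketch below. It is not the module's code: `fetchPage`, the page shape, the attempt limit, and the backoff values are hypothetical, and the error is assumed to expose an HTTP `status` field. `NextPageToken` mirrors the GetCostAndUsage response field of the same name.

```typescript
// Hypothetical page shape; NextPageToken mirrors the GetCostAndUsage response.
interface CostPage<Row> {
  rows: Row[];
  NextPageToken?: string;
}

type CostExplorerErrorCode = "COST_EXPLORER_4XX" | "COST_EXPLORER_UNAVAILABLE";

// 4xx points at the caller (IAM, API enablement) and is not retried.
// 5xx, or a failure with no status at all, is treated as transient.
function classifyStatus(status: number | undefined): CostExplorerErrorCode {
  return status !== undefined && status >= 400 && status < 500
    ? "COST_EXPLORER_4XX"
    : "COST_EXPLORER_UNAVAILABLE";
}

async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void>
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const status =
        typeof err === "object" && err !== null ? (err as { status?: number }).status : undefined;
      const code = classifyStatus(status);
      // Fail fast on 4xx, or once the attempt budget is spent.
      if (code === "COST_EXPLORER_4XX" || attempt >= maxAttempts) throw new Error(code);
      await sleep(100 * Math.pow(2, attempt - 1)); // exponential backoff (assumed values)
    }
  }
}

// Follows NextPageToken until the service stops returning one.
async function fetchAllPages<Row>(
  fetchPage: (token: string | undefined) => Promise<CostPage<Row>>,
  maxAttempts: number = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((res) => setTimeout(res, ms))
): Promise<Row[]> {
  const all: Row[] = [];
  let token: string | undefined = undefined;
  do {
    const current: string | undefined = token;
    const page: CostPage<Row> = await withRetry(() => fetchPage(current), maxAttempts, sleep);
    for (const row of page.rows) all.push(row);
    token = page.NextPageToken;
  } while (token !== undefined);
  return all;
}
```

Injecting `fetchPage` and `sleep` keeps the loop testable without the AWS SDK; a real client would wrap the SDK call and read the status from its error metadata.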
Removed in the ADR-046 refactor:
- unattributed-evaluator.ts (pass 1) — threshold lookup + comparison moved to dbt/models/MOD-098.../unattributed_costs.sql, joined to METERING.CONFIG for the threshold value.
- snowflake-credit-attribution.ts (pass 2) — credit attribution arithmetic moved to dbt/models/MOD-098.../snowflake_credit_daily.sql as CASE expressions.
Events
| Direction | Bus | Source | Detail-type | Schema |
|---|---|---|---|---|
| Publish | `bank-risk-platform-{env}` | `bank.risk-platform` | `unattributed_cost_threshold_exceeded` | `{ trace_id, alert_id, cost_date, total_aws_cost_usd, unattributed_cost_usd, unattributed_share, threshold, event_time }` |
Consumes: none.
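The published event can be sketched as a typed detail payload plus a builder for an EventBridge `PutEvents` entry. The field names in the detail come from the schema above; their TypeScript types and the `buildThresholdEvent` helper are assumptions made for illustration.

```typescript
// Field names follow the event schema; the types are assumptions.
interface UnattributedCostExceededDetail {
  trace_id: string;
  alert_id: string;
  cost_date: string;
  total_aws_cost_usd: number;
  unattributed_cost_usd: number;
  unattributed_share: number;
  threshold: number;
  event_time: string;
}

// Subset of the EventBridge PutEvents request entry.
interface PutEventsEntry {
  EventBusName: string;
  Source: string;
  DetailType: string;
  Detail: string; // JSON-encoded detail
}

function buildThresholdEvent(env: string, detail: UnattributedCostExceededDetail): PutEventsEntry {
  return {
    EventBusName: "bank-risk-platform-" + env,
    Source: "bank.risk-platform",
    DetailType: "unattributed_cost_threshold_exceeded",
    Detail: JSON.stringify(detail),
  };
}
```

The returned object could be passed in the `Entries` array of a `PutEvents` call; the builder itself does no threshold evaluation, in line with the thin-publisher role of the monitor Lambda.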
Error code enumeration
| error_code | Class | Source | Recovery |
|---|---|---|---|
| `COST_EXPLORER_4XX` | PROVIDER_ERROR (non-retryable) | cost-explorer-client | Surface to ops; investigate IAM / API enablement |
| `COST_EXPLORER_UNAVAILABLE` | TRANSIENT_INFRA | cost-explorer-client | Lambda retry via EventBridge schedule |
| `EVENTBRIDGE_PUBLISH_FAILED` | TRANSIENT_INFRA | event publisher | Lambda retry |
| `SNOWFLAKE_*` | TRANSIENT_INFRA | shared/snowflake | Lambda retry |
| `ENV_MISSING` | TRANSIENT_INFRA | handlers | Deploy / SSM mismatch |
Event type registry (logger event_type values)
cost_explorer_fetch_completed, cost_explorer_empty, unattributed_monitor_check_ok, unattributed_cost_threshold_exceeded, unattributed_monitor_no_data.
(snowflake_credit_fetch_completed was emitted by the snowflake-credit-fetcher Lambda removed in ADR-046 pass 2.)
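The three monitor-related values in the registry suggest a simple decision in the unattributed-monitor handler. The sketch below shows one plausible mapping; which log event corresponds to which outcome is inferred from the names rather than stated in the source, and the row shape is hypothetical.

```typescript
type MonitorEventType =
  | "unattributed_monitor_no_data"
  | "unattributed_monitor_check_ok"
  | "unattributed_cost_threshold_exceeded";

// Hypothetical projection of the latest unattributed-costs row.
interface LatestUnattributedRow {
  costDate: string;
  exceedsThreshold: boolean; // precomputed upstream by the dbt model
}

// Chooses the log event_type and whether to publish to EventBridge.
// No threshold comparison happens here; the flag is taken as given.
function monitorOutcome(latest: LatestUnattributedRow | undefined): {
  eventType: MonitorEventType;
  publish: boolean;
} {
  if (latest === undefined) {
    return { eventType: "unattributed_monitor_no_data", publish: false };
  }
  if (latest.exceedsThreshold) {
    return { eventType: "unattributed_cost_threshold_exceeded", publish: true };
  }
  return { eventType: "unattributed_monitor_check_ok", publish: false };
}
```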
SSM parameter contract
Reads (consumed)
| SSM path | From |
|---|---|
| `/bank/{env}/iam/lambda/bank-risk-platform/arn` | MOD-104 |
| `/bank/{env}/observability/adot-layer-arn` | MOD-076 |
| `/bank/{env}/eventbridge/bank-risk-platform/arn` | MOD-104 |
| `/bank/{env}/eventbridge/bank-risk-platform/dlq-arn` | MOD-104 |
| `/bank/{env}/snowflake/account-locator` | MOD-102 (proposed contract) |
| `/bank/{env}/snowflake/mod-098/warehouse` | MOD-102 (proposed contract) |
| `/bank/{env}/snowflake/mod-098/database` | MOD-102 (proposed contract) |
| `/bank/{env}/snowflake/mod-098/ingest-role` | MOD-102 (proposed contract) |
| `/bank/{env}/snowflake/mod-098/ingest-secret-arn` | MOD-102 (proposed contract) |
Writes (published), same /bank/{env}/risk-platform/* convention as MOD-085
| SSM path | Value | Consumed by |
|---|---|---|
| `/bank/{env}/risk-platform/metering/daily-tenant-summary-table` | `METERING.V_DAILY_TENANT_SUMMARY` | MOD-080 (statutory reporting), future MOD-099 |
| `/bank/{env}/risk-platform/metering/billing-period-summary-table` | `METERING.V_BILLING_PERIOD_SUMMARY` | MOD-080, future MOD-099 |
| `/bank/{env}/risk-platform/metering/unit-economics-table` | `METERING.V_UNIT_ECONOMICS` | Internal finance |
| `/bank/{env}/risk-platform/metering/unattributed-costs-table` | `METERING.V_UNATTRIBUTED_COSTS` | MOD-076 dashboards, MOD-098 unattributed-monitor Lambda |
| `/bank/{env}/risk-platform/metering/aws-cost-daily-table` | `METERING.V_AWS_COST_DAILY` | Ops |
| `/bank/{env}/risk-platform/metering/snowflake-credit-daily-table` | `METERING.V_SNOWFLAKE_CREDIT_DAILY` | Ops |
| `/bank/{env}/risk-platform/metering/cost-rates-table` | `METERING.V_COST_RATES` | Ops, future MOD-099 |
| `/bank/{env}/risk-platform/metering/event-source-name` | `bank.risk-platform` | SD04/SD07 EB rules |
| `/bank/{env}/risk-platform/metering/cost-explorer-fetcher-arn` | Lambda ARN | Ops/runbooks |
| `/bank/{env}/risk-platform/metering/unattributed-monitor-arn` | Lambda ARN | Ops/runbooks |
(The snowflake-credit-fetcher-arn SSM output was removed in ADR-046 pass 2 because the Lambda no longer exists. Running `pulumi up --refresh` deletes the orphaned SSM parameter on the next deploy.)
Acceptance criterion: REP-001 CALC
Produces daily attributed cost and running billing period totals per licensee — the authoritative source for SaaS invoices and internal gross margin reporting.
Per acceptance-criteria.md, CALC mode requires running the calculation against known inputs and verifying the output matches an independently computed result, plus boundary cases.
Test: tests/policy/REP-001-calc-correctness.test.ts.
- Snowflake credit attribution arithmetic (FR-394): dedicated `42 × 2.5 = 105.000000`; proportional `17 × 3.0 = 51.000000`; sum-invariant check.
- billing_period_summary (FR-395) structural correctness: contains all four tariff components; rate-card binding uses `period_start`; total sums all four; refresh mode FULL with 4-hour lag.
- unattributed_costs (FR-396) structural correctness: SQL uses strict `>` (boundary case `share == threshold` does not alert); `exceeds_threshold` is emitted as a precomputed boolean; threshold value is sourced from `METERING.CONFIG` rather than a hard-coded literal; `threshold_used` is captured per row for audit.
- METERING.CONFIG (ADR-046 §4): seed migration creates the table with effective-from/to versioning and inserts the FR-396 threshold (`0.02`) idempotently.
Runtime numerical correctness against the deployed Snowflake account is covered in the integration layer (tests/integration/rep-001-calc-runtime.test.ts). That test seeds AWS_COST_DAILY with a known fixture, force-refreshes the UNATTRIBUTED_COSTS Dynamic Table, and asserts that the dbt-computed UNATTRIBUTED_SHARE, THRESHOLD_USED, and EXCEEDS_THRESHOLD columns match independent arithmetic.
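The "independent arithmetic" for the FR-396 columns is small enough to sketch directly. The function below is an illustration rather than the test's actual helper: its name and return shape are hypothetical, and treating a zero-cost day as share 0 is an assumption. It does reproduce the strict `>` comparison, so a share exactly equal to the threshold does not alert.

```typescript
interface UnattributedCheck {
  unattributedShare: number;
  thresholdUsed: number;
  exceedsThreshold: boolean;
}

function checkUnattributed(
  totalUsd: number,
  unattributedUsd: number,
  threshold: number
): UnattributedCheck {
  // Guard against division by zero; share 0 on a zero-cost day is an assumption.
  const share = totalUsd > 0 ? unattributedUsd / totalUsd : 0;
  return {
    unattributedShare: share,
    thresholdUsed: threshold, // kept alongside the result for audit
    exceedsThreshold: share > threshold, // strict: equality does not alert
  };
}
```

With a threshold of 0.02, a day with 1000 USD total and 20 USD unattributed sits exactly on the boundary and does not alert, while 21 USD unattributed does.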
Observability
- ADOT layer attached to all Lambdas (per MOD-085 pattern).
- `extract_trace_context` at the top of every handler.
- Structured logs use `StructuredLogger` with mandatory observability fields.
- EMF metrics: `cost_explorer_rows_written_total`, `snowflake_credit_rows_written_total`, `unattributed_share`.
- Dashboard provisioned at `bank-{env}-MOD-098`.
Idempotency
- Cost Explorer fetcher: each run uses a fresh `run_id`. Reprocessing the same window is safe: the table is append-only and de-duplicated downstream by Dynamic Tables joining on `cost_date`.
- Snowflake credit fetcher: same, with a fresh `run_id` per run.
- Threshold monitor: re-evaluates the latest `unattributed_costs` row each run; multiple firings within a day are deduplicated downstream by `alert_id`.
Snowflake Task DAG (ADR-046 §1)
| Task | Schedule | Owns | Notes |
|---|---|---|---|
| `METERING.T_REFRESH_SNOWFLAKE_CREDIT_DAILY` | `USING CRON 0 */4 * * * UTC` | snowflake_credit_daily refresh | Pass 5. Mirrors the dbt model body (CTAS from ACCOUNT_USAGE). Replaces the legacy mod-098-snowflake-credit-fetcher Lambda's 4-hour cron. The body duplicates the dbt model SQL for now; a pass-6 candidate is to deduplicate via a stored procedure or Snowpark Container Services running dbt. |
The other Lambdas (cost-explorer-fetcher, unattributed-monitor) keep their CloudWatch crons — both are explicitly justified by ADR-046 §6 (external API reader, alert publisher).
Out of scope (explicit non-goals)
- The rate card data itself: managed by ops via the approved rate-change process; MOD-098 only owns the schema.
- `BILLING.TENANT_MODULES` / `TENANT_TIERS` data: populated by the onboarding flow (out of scope here); schemas are declared so MOD-098 has stable joins.
- Invoice generation and PDF rendering: future MOD-099 in another repo.
- AWS resource tagging compliance: MOD-104 GOV-005 enforces tagging at synthesis time. MOD-098 surfaces breaches via `unattributed_costs`.
- NZ public-holiday awareness for billing periods: periods are calendar months.
Open items at handoff
- MOD-097 SSM contract for `metering.usage_events`: MOD-097 lives in `bank-platform` and writes `metering.usage_events`. Its design doc doesn't publish a per-table SSM parameter. MOD-098 hard-codes `METERING.USAGE_EVENTS` in DCM SQL; confirm the name with MOD-097 before the first Snowflake apply.
- MOD-097 external API cost contract: `MOD-098.md` says "Pulled by MOD-097's external cost Lambda and available in `metering.external_api_costs`" but MOD-097's design doc has no such Lambda. MOD-098 declares the schema as a stub. Either (a) MOD-097 adds the Lambda or (b) MOD-098 absorbs the responsibility in a follow-on iteration.
- `billing.*` schema ownership: MOD-098 currently provisions `BILLING.TENANT_MODULES` and `BILLING.TENANT_TIERS`. These may move to a future MOD-099 (billing/onboarding) when that module ships. Captured in handoff.
- AWS Cost Explorer dimensions: the current implementation groups by `SERVICE` and tag `tenant_id`. Per FR-393, `module_id` should also be grouped. Cost Explorer's `GroupBy` is limited to two dimensions, so `module_id` is to be captured via tag filter on subsequent passes (not yet implemented). Flagged for follow-up.
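The tag-filter approach in the last open item could be sketched as one GetCostAndUsage request per `module_id`, each keeping the two existing `GroupBy` keys. This is not implemented in MOD-098; the helper name, the choice of the `UnblendedCost` metric, and the idea of passing in a known list of module IDs are assumptions. The request fields follow the GetCostAndUsage request shape.

```typescript
// Subset of the GetCostAndUsage request shape used in this sketch.
interface GroupDefinition {
  Type: "DIMENSION" | "TAG";
  Key: string;
}

interface CostAndUsageRequest {
  TimePeriod: { Start: string; End: string };
  Granularity: "DAILY";
  Metrics: string[];
  GroupBy: GroupDefinition[];
  Filter: { Tags: { Key: string; Values: string[] } };
}

// One request per module: GroupBy stays at the two-key limit, and the
// module is pinned by a tag filter instead of a third grouping.
function requestsPerModule(
  period: { Start: string; End: string },
  moduleIds: string[]
): CostAndUsageRequest[] {
  const groupBy: GroupDefinition[] = [
    { Type: "DIMENSION", Key: "SERVICE" },
    { Type: "TAG", Key: "tenant_id" },
  ];
  return moduleIds.map((moduleId) => ({
    TimePeriod: period,
    Granularity: "DAILY" as const,
    Metrics: ["UnblendedCost"], // metric choice is an assumption
    GroupBy: groupBy,
    Filter: { Tags: { Key: "module_id", Values: [moduleId] } },
  }));
}
```

Rows returned by each request would then be labelled with the `module_id` used in its filter. Spend without a `module_id` tag matches none of these filters, so it would still need a separate pass to be counted as unattributed.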