Data contracts and decision publication

Defines the contract layer beneath the onboarding decision engine: CDC envelope, signal model, decision publication contract, schema boundaries, reason-code design, and operational write-back from Snowflake to Neon.

Architecture stance: Neon → Snowflake carries facts. Snowflake computes meaning. Snowflake → Neon returns approved decisions.

Design principles

Keep analytical replication separate from operational decisioning. The CDC path exists to preserve history, lineage, and analytical freshness — it is not the operational write-back mechanism.
Use stable contracts between platforms. Neon must not depend on Snowflake's internal model features or transient scoring fields.
Return only approved outcomes to Neon. Raw features, exploratory scores, and intermediate model output stay in Snowflake unless explicitly promoted to a contract.
Every operational result must be idempotent, versioned, and explainable.
Human-readable reasoning is required. Operators must see the decision, the score, and a plain-language explanation; the full internal algorithm is not exposed to them.

Platform interaction model

| Path | Primary purpose | Latency expectation | Contract shape |
| --- | --- | --- | --- |
| Neon → Snowflake | Facts, history, analytical visibility | Minutes, not sub-second | Append-only CDC envelope with before/after payloads and commit lineage |
| Snowflake → Neon | Approved operational outcomes | Near-real-time, replay-safe | Decision publication contract with idempotency, status, scores, and explanations |
| Snowflake → Snowflake | Cross-account sharing and internal curation | Varies by product need | Curated tables, shares, listings, or replicated datasets |
| Neon → Neon | Operational sync between Postgres estates | Operational latency | Replication or API-mediated upsert contracts with stricter ownership boundaries |

Contract families

CDC envelope contract

Used for Neon → Snowflake replication. Carries committed database facts with enough metadata to reconstruct order, lineage, and replay position.

Append-only by design; consumers derive current state downstream.
Includes source schema and table so contracts remain explicit across domain databases.
Supports before/after payloads for update and delete interpretation.
Carries commit metadata and schema version for deterministic replay.
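
The properties above can be sketched as a small Python model. This is a minimal illustration, not the published contract: every field name (`source_schema`, `op`, `commit_lsn`, and so on) is an assumption, and `commit_lsn` stands in for whatever commit-ordering metadata the real envelope carries. The `current_state` helper shows how a downstream consumer might derive current state from an append-only log.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class CdcEnvelope:
    # All field names here are illustrative assumptions.
    source_schema: str
    source_table: str
    op: str                           # "INSERT", "UPDATE", or "DELETE"
    primary_key: str
    before: Optional[dict[str, Any]]  # row image before the change, if any
    after: Optional[dict[str, Any]]   # row image after the change, if any
    commit_lsn: int                   # assumed monotonically increasing commit position
    schema_version: str


def current_state(events: list[CdcEnvelope]) -> dict[tuple[str, str, str], dict[str, Any]]:
    """Fold an append-only event log into the latest row per key."""
    state: dict[tuple[str, str, str], dict[str, Any]] = {}
    # Replay in commit order so the result is deterministic regardless of arrival order.
    for ev in sorted(events, key=lambda e: e.commit_lsn):
        key = (ev.source_schema, ev.source_table, ev.primary_key)
        if ev.op == "DELETE":
            state.pop(key, None)
        else:
            state[key] = dict(ev.after or {})
    return state
```

Because the log itself is never mutated, replaying the same events always yields the same derived state.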

Signal contract

Used inside the risk and fraud platform. A signal is an observed or derived fact about an entity, suitable for scoring, monitoring, and explanation.

Signals are not raw CDC rows — they are produced by verification, screening, fraud, graph, and behavioural modules.
Signals carry severity, confidence, and the source that produced them.
Signals may contribute to decisions but do not themselves change operational state.

See Signal taxonomy v1 for the full signal code catalogue.
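
As a rough sketch of the signal shape described in this section, a signal could be modelled as an immutable record. The field names, the three-level severity scale, and the 0 to 1 confidence range are all assumptions made for illustration; the signal code used in the usage note is a placeholder, not an entry from the taxonomy.

```python
from dataclasses import dataclass

# Assumed severity scale; the real taxonomy may define a different one.
SEVERITIES = ("LOW", "MEDIUM", "HIGH")


@dataclass(frozen=True)
class Signal:
    signal_code: str
    entity_type: str
    entity_id: str
    severity: str
    confidence: float   # assumed to lie in [0.0, 1.0]
    produced_by: str    # module that produced the signal

    def __post_init__(self) -> None:
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")
```

For example, `Signal("SIG_PLACEHOLDER", "APPLICATION", "app_102934", "MEDIUM", 0.8, "screening")` constructs successfully, while a severity outside the assumed scale raises `ValueError`. The record is frozen because a signal is a fact to be read by the decision engine, not a handle for changing operational state.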

Decision publication contract

Used for Snowflake → Neon write-back. Returns approved operational outcomes into the live system.

Carries an operational decision: ACCEPT, REFER, REJECT, HOLD, or CLEAR.
Includes compact score summaries only — enough for operational visibility, not full model internals.
Includes structured reason codes and human-readable labels and explanations.
Must be idempotent and safe to replay.
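
A consumer might check the shape of an incoming decision before doing anything with it. The sketch below uses the five decision values listed above and the field names from the example payload in this document; which fields count as required is an assumption, and `validate_decision` is a hypothetical helper.

```python
from enum import Enum


class DecisionStatus(Enum):
    ACCEPT = "ACCEPT"
    REFER = "REFER"
    REJECT = "REJECT"
    HOLD = "HOLD"
    CLEAR = "CLEAR"


# Assumed minimum set of fields a consumer needs before applying a decision.
REQUIRED_FIELDS = (
    "decision_id",
    "entity_type",
    "entity_id",
    "decision_status",
    "idempotency_key",
    "schema_version",
)


def validate_decision(payload: dict) -> DecisionStatus:
    """Return the parsed status, or raise ValueError if the payload is unusable."""
    missing = [f for f in REQUIRED_FIELDS if f not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Enum lookup by value raises ValueError for any status outside the contract.
    return DecisionStatus(payload["decision_status"])
```

Rejecting unknown statuses up front keeps an unexpected value from silently changing operational state.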

Reason-code and explanation design

Reasoning must serve both machines and people. The contract separates coded causes from plain-language operator guidance.

| Field | Purpose | Audience | Example |
| --- | --- | --- | --- |
| `reason_code` | Stable machine key for rules, analytics, and routing | Systems | `HRGEO01` |
| `reason_label` | Short human-readable name | Operators | `High-risk geography` |
| `reason_explanation` | Plain-language explanation of why the reason applied | Operators and reviewers | `Customer declared residence in a higher-risk jurisdiction under the current policy set.` |
| `decision_summary` | One-line overall explanation of the decision | Operators and customers where appropriate | `Referred for review due to a combination of geography and identity-risk signals.` |

Rules:

Expose the decision, key scores, and top reasons clearly in the operational UI.
Do not duplicate the full scoring algorithm, factor weights, or all model features in the returned contract.
Allow separate internal and external explanation text where customer-facing language needs simplification.
Reason labels should be short enough for queue screens; explanations should remain readable in case detail views.
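
One way to picture the split between internal and external explanation text is sketched below. The `external_explanation` field, the `audience` values, and the 60-character label limit are assumptions introduced for illustration; the contract does not define them here.

```python
from dataclasses import dataclass
from typing import Optional

MAX_LABEL_LENGTH = 60  # assumed limit for queue screens


@dataclass(frozen=True)
class Reason:
    reason_code: str
    reason_label: str
    reason_explanation: str                      # internal, operator-facing text
    external_explanation: Optional[str] = None   # hypothetical customer-facing text


def explanation_for(reason: Reason, audience: str) -> Optional[str]:
    """Pick the explanation text appropriate to the audience."""
    if audience == "customer":
        # Never fall back to internal wording for customers.
        return reason.external_explanation
    return reason.reason_explanation


def fits_queue_screen(reason: Reason) -> bool:
    """Check the label against the assumed queue-screen length limit."""
    return len(reason.reason_label) <= MAX_LABEL_LENGTH
```

Returning `None` for customers when no external text exists is a deliberate choice in this sketch: it is safer to show nothing than to leak operator-facing wording.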

Score visibility

| Score field | Included in write-back | Notes |
| --- | --- | --- |
| `risk_score` | Yes | Numeric value or bounded band used operationally |
| `risk_tier` | Yes | Low / Medium / High or equivalent |
| `fraud_score` | Optional | Include where it directly affects the action |
| Feature contributions | No | Remain in Snowflake explanation stores |
| Model weights | No | Not published back to Neon |
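
The visibility rules in the table can be enforced with an allow-list at publication time. The sketch below is illustrative: `build_score_summary` and the `include_fraud` flag are hypothetical names, and the internal record's extra keys (`feature_contributions`, `model_weights`) are assumed stand-ins for whatever Snowflake holds.

```python
# Fields that are always published, and fields published only when relevant.
PUBLISHED_ALWAYS = {"risk_score", "risk_tier"}
PUBLISHED_OPTIONAL = {"fraud_score"}


def build_score_summary(internal: dict, include_fraud: bool = False) -> dict:
    """Project an internal scoring record down to the fields approved for write-back."""
    missing = PUBLISHED_ALWAYS - set(internal)
    if missing:
        raise ValueError(f"missing required score fields: {sorted(missing)}")
    allowed = set(PUBLISHED_ALWAYS)
    if include_fraud:
        allowed |= PUBLISHED_OPTIONAL
    # Anything not on the allow-list (feature contributions, weights) is dropped.
    return {k: internal[k] for k in sorted(allowed) if k in internal}
```

An allow-list fails closed: a new internal field stays in Snowflake until someone adds it to the published set on purpose.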

Schema boundaries

| Platform | Schema families |
| --- | --- |
| Neon | `core_`, `kyc_`, `aml_`, `decision_`, `integration_*` |
| Snowflake | `raw_cdc_`, `conformed_`, `signal_`, `decision_`, `governance_*` |

Separate inbound decision state from core product state so replay and audit are easier.
Keep integration inbox/outbox structures distinct from business-domain tables.
Reserve governance schemas for contract registry, compatibility records, lineage, and quality controls.

Compatibility and lifecycle rules

Every contract has a semantic version. Backward-compatible additions can be introduced within the same major version. Breaking changes require a new major version and a dual-run period.
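
The versioning rule reduces to a comparison of major versions. A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` strings with no pre-release suffixes; both helper names are hypothetical:

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a MAJOR.MINOR.PATCH string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(consumer_version: str, payload_version: str) -> bool:
    """Versions are treated as compatible when they share a major version."""
    return parse_version(consumer_version)[0] == parse_version(payload_version)[0]


def requires_dual_run(old_version: str, new_version: str) -> bool:
    """A major-version change is a breaking change and needs a dual-run period."""
    return not is_compatible(old_version, new_version)
```

This treats a consumer built against `1.0.0` as able to read `1.2.0`, which only holds if consumers ignore fields they do not recognise.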

Operational consumers must validate schema version before applying a decision. Write-back consumers must use idempotency keys to prevent duplicate effects.
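
The two consumer obligations can be sketched together. This is an in-memory stand-in, not the Neon implementation: `DecisionInbox` is a hypothetical class, and in a real Postgres-backed consumer the `applied` dictionary would more plausibly be a table with a unique constraint on the idempotency key.

```python
class DecisionInbox:
    """In-memory sketch of a replay-safe write-back consumer."""

    def __init__(self, supported_major: int = 1) -> None:
        self.supported_major = supported_major
        self.applied: dict[str, dict] = {}  # idempotency_key -> payload already applied
        self.state: dict[str, str] = {}     # entity_id -> current decision_status

    def apply(self, payload: dict) -> bool:
        """Apply a decision once. Returns False when the payload is a replay."""
        # Validate the schema version before touching any state.
        major = int(payload["schema_version"].split(".")[0])
        if major != self.supported_major:
            raise ValueError(f"unsupported schema version: {payload['schema_version']}")
        key = payload["idempotency_key"]
        if key in self.applied:
            return False  # duplicate delivery: no second effect
        self.state[payload["entity_id"]] = payload["decision_status"]
        self.applied[key] = payload
        return True
```

Delivering the same payload twice leaves `state` unchanged after the first call, which is what makes replay from Snowflake safe.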

Each contract has a technical owner, a data owner, and a policy owner. Decision contracts must reference their policy sources, the model version, and the component that produced the decision. Promotion of a field into a stable contract requires explicit approval; exploratory fields stay internal.

Example decision payload

```json
{
  "decision_id": "2e9fa36f-a993-4dc0-b1ce-6eaabf818001",
  "entity_type": "APPLICATION",
  "entity_id": "app_102934",
  "decision_type": "ONBOARDING",
  "decision_status": "REFER",
  "decision_summary": "Referred for review due to geography and identity-risk signals.",
  "score_summary": {
    "risk_score": 68,
    "risk_tier": "MEDIUM",
    "fraud_score": 41
  },
  "reasons": [
    {
      "reason_code": "HRGEO01",
      "reason_label": "High-risk geography",
      "reason_explanation": "Customer declared residence in a higher-risk jurisdiction under the current policy set."
    },
    {
      "reason_code": "IDV004",
      "reason_label": "Identity confidence below auto-accept threshold",
      "reason_explanation": "Verification passed minimum checks but did not meet the stronger confidence level required for straight-through onboarding."
    }
  ],
  "policy_refs": ["AML-011", "AML-012", "AML-013"],
  "produced_by": "decision_engine.onboarding",
  "model_version": "risk-v1.0.0",
  "idempotency_key": "onb-app_102934-v3",
  "schema_version": "1.0.0"
}
```

ADR-036 — Decision result publication
ADR-003 — CDC pipeline
Signal taxonomy v1
Neon and Snowflake physical storage
Policy DT-012 — Ledger Data Contracts & Event Publication
Policy OPS-007 — Financial Processing Resilience & Idempotency