
Data contracts and decision publication

This page defines the contract layer beneath the onboarding decision engine: the CDC envelope, signal model, decision publication contract, schema boundaries, reason-code design, and operational write-back from Snowflake to Neon.

Architecture stance: Neon → Snowflake carries facts. Snowflake computes meaning. Snowflake → Neon returns approved decisions.

Design principles

  • Keep analytical replication separate from operational decisioning. The CDC path exists to preserve history, lineage, and analytical freshness — it is not the operational write-back mechanism.
  • Use stable contracts between platforms. Neon must not depend on Snowflake's internal model features or transient scoring fields.
  • Return only approved outcomes to Neon. Raw features, exploratory scores, and intermediate model output stay in Snowflake unless explicitly promoted to a contract.
  • Every operational result must be idempotent, versioned, and explainable.
  • Human-readable reasoning is required. Operators must see the decision, score, and a plain-language explanation; the full internal algorithm is not exposed to them.

Platform interaction model

Path | Primary purpose | Latency expectation | Contract shape
Neon → Snowflake | Facts, history, analytical visibility | Minutes, not sub-second | Append-only CDC envelope with before/after payloads and commit lineage
Snowflake → Neon | Approved operational outcomes | Near-real-time, replay-safe | Decision publication contract with idempotency, status, scores, and explanations
Snowflake → Snowflake | Cross-account sharing and internal curation | Varies by product need | Curated tables, shares, listings, or replicated datasets
Neon → Neon | Operational sync between Postgres estates | Operational latency | Replication or API-mediated upsert contracts with stricter ownership boundaries

Contract families

CDC envelope contract

Used for Neon → Snowflake replication. Carries committed database facts with enough metadata to reconstruct order, lineage, and replay position.

  • Append-only by design; consumers derive current state downstream.
  • Includes source schema and table so contracts remain explicit across domain databases.
  • Supports before/after payloads for update and delete interpretation.
  • Carries commit metadata and schema version for deterministic replay.
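
The properties above can be illustrated with a minimal Python sketch. Every field name here (`source_schema`, `op`, `commit_lsn`, and so on) is a hypothetical stand-in rather than the registered contract; the sketch only shows how an append-only log with before/after images and a commit position lets a consumer derive current state downstream.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class CdcEnvelope:
    # All field names are hypothetical; the real contract lives in the registry.
    source_schema: str
    source_table: str
    op: str                            # "insert", "update", or "delete"
    before: Optional[dict[str, Any]]   # row image before the change (None on insert)
    after: Optional[dict[str, Any]]    # row image after the change (None on delete)
    commit_lsn: int                    # commit position, used for ordering and replay
    schema_version: str


def current_state(events: list[CdcEnvelope], key: str) -> dict[Any, dict[str, Any]]:
    """Fold an append-only event list into the latest row per primary key."""
    state: dict[Any, dict[str, Any]] = {}
    # Replay strictly in commit order so the result is deterministic.
    for ev in sorted(events, key=lambda e: e.commit_lsn):
        if ev.op == "delete":
            # Deletes carry only a before image; use it to find the key.
            state.pop(ev.before[key], None)
        else:
            state[ev.after[key]] = ev.after
    return state
```

Because events are sorted by commit position before folding, replaying the same set of envelopes in any arrival order yields the same state.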

Signal contract

Used inside the risk and fraud platform. A signal is an observed or derived fact about an entity, suitable for scoring, monitoring, and explanation.

  • Signals are not raw CDC rows — they are produced by verification, screening, fraud, graph, and behavioural modules.
  • Signals carry severity, confidence, and production source.
  • Signals may contribute to decisions but do not themselves change operational state.

See Signal taxonomy v1 for the full signal code catalogue.
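
As a rough illustration of the signal shape described above, the following Python sketch models a signal as an immutable record. The field names, the three-level severity scale, and the 0 to 1 confidence range are assumptions made for the example, not values taken from the taxonomy.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # Assumed three-level scale, for illustration only.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Signal:
    # Hypothetical shape; field names are not taken from the taxonomy.
    signal_code: str     # a code from the signal taxonomy
    entity_id: str
    severity: Severity
    confidence: float    # assumed to lie in [0.0, 1.0]
    produced_by: str     # module that emitted the signal

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


def strongest(signals: list[Signal]) -> Signal:
    """Pick the signal with the highest severity, breaking ties on confidence."""
    return max(signals, key=lambda s: (s.severity.value, s.confidence))
```

The record is frozen and has no side effects: a signal can feed a decision, but nothing in it writes to operational state.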

Decision publication contract

Used for Snowflake → Neon write-back. Returns approved operational outcomes into the live system.

  • Carries an operational decision: ACCEPT, REFER, REJECT, HOLD, or CLEAR.
  • Includes compact score summaries only — enough for operational visibility, not full model internals.
  • Includes structured reason codes and human-readable labels and explanations.
  • Must be idempotent and safe to replay.
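
A publisher could check these properties before sending anything to Neon. The Python sketch below is one possible pre-publication check; the set of required keys is an assumption inferred from the example payload at the end of this page, and `validation_errors` is a hypothetical helper name.

```python
from typing import Any

ALLOWED_STATUSES = {"ACCEPT", "REFER", "REJECT", "HOLD", "CLEAR"}
# Assumed required keys, inferred from the example payload in this document.
REQUIRED_KEYS = {"decision_id", "entity_id", "decision_status",
                 "reasons", "idempotency_key", "schema_version"}
REASON_KEYS = {"reason_code", "reason_label", "reason_explanation"}


def validation_errors(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload looks publishable."""
    errors = [f"missing field: {k}" for k in sorted(REQUIRED_KEYS - set(payload))]
    status = payload.get("decision_status")
    if status is not None and status not in ALLOWED_STATUSES:
        errors.append(f"unknown decision_status: {status}")
    # Every reason needs the machine code plus both human-readable parts.
    for i, reason in enumerate(payload.get("reasons", [])):
        missing = REASON_KEYS - set(reason)
        if missing:
            errors.append(f"reason {i} missing: {', '.join(sorted(missing))}")
    return errors
```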

Reason-code and explanation design

Reasoning must serve both machines and people. The contract separates coded causes from plain-language operator guidance.

Field | Purpose | Audience | Example
reason_code | Stable machine key for rules, analytics, and routing | Systems | HRGEO01
reason_label | Short human-readable name | Operators | High-risk geography
reason_explanation | Plain-language explanation of why the reason applied | Operators and reviewers | Customer declared residence in a higher-risk jurisdiction under the current policy set.
decision_summary | One-line overall explanation of the decision | Operators and customers where appropriate | Referred for review due to a combination of geography and identity-risk signals.

Rules:

  • Expose the decision, key scores, and top reasons clearly in the operational UI.
  • Do not duplicate the full scoring algorithm, factor weights, or all model features in the returned contract.
  • Allow separate internal and external explanation text where customer-facing language needs simplification.
  • Reason labels should be short enough for queue screens; explanations should remain readable in case detail views.
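
The rules above suggest two renderings of the same decision: a compact line for queue screens and fuller text for case detail views. The Python sketch below shows one way this might look. `external_explanation` is a hypothetical optional field for simplified customer-facing text; it does not appear in the contract as documented here.

```python
from typing import Any


def queue_line(decision: dict[str, Any], max_reasons: int = 2) -> str:
    """Compact one-line form for queue screens: status plus the top reason labels."""
    labels = [r["reason_label"] for r in decision["reasons"][:max_reasons]]
    return f"{decision['decision_status']}: {'; '.join(labels)}"


def explanations(decision: dict[str, Any], external: bool = False) -> list[str]:
    """Explanation texts for a case detail view.

    For an external audience, only reasons that carry the hypothetical
    `external_explanation` field are returned, so internal wording is
    never shown to customers by accident.
    """
    out = []
    for r in decision["reasons"]:
        if not external:
            out.append(r["reason_explanation"])
        elif "external_explanation" in r:
            out.append(r["external_explanation"])
    return out
```

Omitting reasons without external text, rather than falling back to the internal wording, is a deliberately conservative choice in this sketch.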

Score visibility

Score field | Included in write-back | Notes
risk_score | Yes | Numeric value or bounded band used operationally
risk_tier | Yes | Low / Medium / High or equivalent
fraud_score | Optional | Include where it directly affects the action
Feature contributions | No | Remain in Snowflake explanation stores
Model weights | No | Not published back to Neon
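
One way to enforce the table above is an allow-list: the write-back summary is built only from named fields, so anything else in the internal scoring record is dropped by default. This is a sketch under assumptions; the shape of the internal record and the `fraud_affects_action` flag are hypothetical.

```python
from typing import Any

# Mirrors the score visibility table: always published vs. published only when relevant.
ALWAYS = ("risk_score", "risk_tier")
OPTIONAL = ("fraud_score",)


def score_summary(internal: dict[str, Any], fraud_affects_action: bool) -> dict[str, Any]:
    """Build the write-back score summary from an internal scoring record."""
    summary = {name: internal[name] for name in ALWAYS}
    if fraud_affects_action:
        for name in OPTIONAL:
            if name in internal:
                summary[name] = internal[name]
    # Feature contributions, model weights, and any other keys are never copied.
    return summary
```

An allow-list fails safe: a new internal field stays in Snowflake until someone explicitly adds it to the published set.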

Schema boundaries

Platform | Schema families
Neon | core_*, kyc_*, aml_*, decision_*, integration_*
Snowflake | raw_cdc_*, conformed_*, signal_*, decision_*, governance_*

  • Separate inbound decision state from core product state so replay and audit are easier.
  • Keep integration inbox/outbox structures distinct from business-domain tables.
  • Reserve governance schemas for contract registry, compatibility records, lineage, and quality controls.

Compatibility and lifecycle rules

Every contract has a semantic version. Backward-compatible additions can be introduced within the same major version. Breaking changes require a new major version and a dual-run period.
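
The versioning rule can be sketched as a simple classifier. The Python example below compares field names only, which is a simplification: a type change or a newly required field would also be breaking, and a real compatibility check would need to cover those. The function names are hypothetical.

```python
def required_bump(old_fields: set[str], new_fields: set[str]) -> str:
    """Classify a schema change by comparing field names only."""
    if old_fields - new_fields:
        return "major"   # a field was removed or renamed: breaking
    if new_fields - old_fields:
        return "minor"   # only additions: backward-compatible
    return "none"


def next_version(current: str, bump: str) -> str:
    """Apply a bump to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(p) for p in current.split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    return current
```

For example, adding an optional field to a 1.0.0 contract yields 1.1.0, while removing a field yields 2.0.0 and triggers the dual-run period.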

Operational consumers must validate schema version before applying a decision. Write-back consumers must use idempotency keys to prevent duplicate effects.
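
A minimal Python sketch of such a consumer follows. It uses an in-memory dictionary as a stand-in for an inbox table in Neon; the class name, the return values, and the assumption that the consumer supports only major version 1 are all illustrative.

```python
from typing import Any

SUPPORTED_MAJOR = 1  # assumed: this consumer understands 1.x contracts only


class DecisionInbox:
    """In-memory stand-in for a Neon inbox table keyed by idempotency_key."""

    def __init__(self) -> None:
        self.applied: dict[str, dict[str, Any]] = {}

    def apply(self, payload: dict[str, Any]) -> str:
        # Validate the schema version before touching any state.
        major = int(payload["schema_version"].split(".")[0])
        if major != SUPPORTED_MAJOR:
            return "rejected"      # unknown major version: do not apply
        key = payload["idempotency_key"]
        if key in self.applied:
            return "duplicate"     # replayed delivery: no second effect
        self.applied[key] = payload
        return "applied"
```

In Postgres, the same effect is commonly achieved with a unique constraint on the idempotency key and `INSERT ... ON CONFLICT DO NOTHING`, so a replayed message cannot produce a second row.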

Each contract has a technical owner, a data owner, and a policy owner. Decision contracts must reference their policy sources, the model version, and the producing component. Promotion of a field into a stable contract requires explicit approval; exploratory fields stay internal.

Example decision payload

{
  "decision_id": "2e9fa36f-a993-4dc0-b1ce-6eaabf818001",
  "entity_type": "APPLICATION",
  "entity_id": "app_102934",
  "decision_type": "ONBOARDING",
  "decision_status": "REFER",
  "decision_summary": "Referred for review due to geography and identity-risk signals.",
  "score_summary": {
    "risk_score": 68,
    "risk_tier": "MEDIUM",
    "fraud_score": 41
  },
  "reasons": [
    {
      "reason_code": "HRGEO01",
      "reason_label": "High-risk geography",
      "reason_explanation": "Customer declared residence in a higher-risk jurisdiction under the current policy set."
    },
    {
      "reason_code": "IDV004",
      "reason_label": "Identity confidence below auto-accept threshold",
      "reason_explanation": "Verification passed minimum checks but did not meet the stronger confidence level required for straight-through onboarding."
    }
  ],
  "policy_refs": ["AML-011", "AML-012", "AML-013"],
  "produced_by": "decision_engine.onboarding",
  "model_version": "risk-v1.0.0",
  "idempotency_key": "onb-app_102934-v3",
  "schema_version": "1.0.0"
}