Data contracts and decision publication
Defines the contract layer beneath the onboarding decision engine: CDC envelope, signal model, decision publication contract, schema boundaries, reason-code design, and operational write-back from Snowflake to Neon.
Architecture stance: Neon → Snowflake carries facts. Snowflake computes meaning. Snowflake → Neon returns approved decisions.
Design principles
- Keep analytical replication separate from operational decisioning. The CDC path exists to preserve history, lineage, and analytical freshness — it is not the operational write-back mechanism.
- Use stable contracts between platforms. Neon must not depend on Snowflake's internal model features or transient scoring fields.
- Return only approved outcomes to Neon. Raw features, exploratory scores, and intermediate model output stay in Snowflake unless explicitly promoted to a contract.
- Every operational result must be idempotent, versioned, and explainable.
- Human-readable reasoning is required. Operators must see the decision, the score, and a plain-language explanation, but the contract must not expose the full internal algorithm.
Platform interaction model
| Path | Primary purpose | Latency expectation | Contract shape |
|---|---|---|---|
| Neon → Snowflake | Facts, history, analytical visibility | Minutes, not sub-second | Append-only CDC envelope with before/after payloads and commit lineage |
| Snowflake → Neon | Approved operational outcomes | Near-real-time, replay-safe | Decision publication contract with idempotency, status, scores, and explanations |
| Snowflake → Snowflake | Cross-account sharing and internal curation | Varies by product need | Curated tables, shares, listings, or replicated datasets |
| Neon → Neon | Operational sync between Postgres estates | Operational latency | Replication or API-mediated upsert contracts with stricter ownership boundaries |
Contract families
CDC envelope contract
Used for Neon → Snowflake replication. Carries committed database facts with enough metadata to reconstruct order, lineage, and replay position.
- Append-only by design; consumers derive current state downstream.
- Includes source schema and table so contracts remain explicit across domain databases.
- Supports before/after payloads for update and delete interpretation.
- Carries commit metadata and schema version for deterministic replay.
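As a minimal sketch of how a consumer might derive current state from the append-only envelope, the following folds events in commit order. The field names (`op`, `primary_key`, `commit_lsn`) and the schema name `core_app` are assumptions made for illustration, not the published envelope definition.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class CdcEnvelope:
    """Hypothetical envelope shape; every field name here is an assumption."""
    source_schema: str
    source_table: str
    op: str                              # "INSERT", "UPDATE", or "DELETE"
    primary_key: str
    before: Optional[dict[str, Any]]     # row image before the change
    after: Optional[dict[str, Any]]      # row image after the change
    commit_lsn: int                      # assumed increasing commit position
    schema_version: str


def current_state(events: list[CdcEnvelope]) -> dict[tuple[str, str, str], dict[str, Any]]:
    """Fold an append-only event log into the latest row per key."""
    state: dict[tuple[str, str, str], dict[str, Any]] = {}
    # Replay in commit order so late-arriving events do not overwrite newer ones.
    for ev in sorted(events, key=lambda e: e.commit_lsn):
        key = (ev.source_schema, ev.source_table, ev.primary_key)
        if ev.op == "DELETE":
            state.pop(key, None)
        else:
            state[key] = dict(ev.after or {})
    return state
```

Because the log itself is never mutated, the same fold can be rerun from any replay position to rebuild downstream state.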
Signal contract
Used inside the risk and fraud platform. A signal is an observed or derived fact about an entity, suitable for scoring, monitoring, and explanation.
- Signals are not raw CDC rows — they are produced by verification, screening, fraud, graph, and behavioural modules.
- Signals carry severity, confidence, and production source.
- Signals may contribute to decisions but do not themselves change operational state.
See Signal taxonomy v1 for the full signal code catalogue.
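The signal attributes listed above could be modelled roughly as follows. This is a sketch only: the field names, the three-level severity scale, the 0.0 to 1.0 confidence range, and the sample signal code are assumptions, and the real catalogue lives in the signal taxonomy.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # Assumed three-level scale; the actual scale is defined by the taxonomy.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Signal:
    """Hypothetical signal record; immutable because a signal is a fact."""
    signal_code: str
    entity_id: str
    severity: Severity
    confidence: float    # assumed to lie between 0.0 and 1.0
    produced_by: str     # module that produced the signal

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
```

The record carries no method that changes operational state, which mirrors the rule that signals contribute to decisions without acting on their own.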
Decision publication contract
Used for Snowflake → Neon write-back. Returns approved operational outcomes into the live system.
- Carries an operational decision: ACCEPT, REFER, REJECT, HOLD, or CLEAR.
- Includes compact score summaries only, enough for operational visibility but not full model internals.
- Includes structured reason codes and human-readable labels and explanations.
- Must be idempotent and safe to replay.
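One way to make write-back replay-safe is to record each idempotency key before applying its effect. The sketch below uses an in-memory store to show the idea; the class name `DecisionInbox` and its attributes are hypothetical, and a real consumer would persist the keys in the same transaction as the state change.

```python
ALLOWED_STATUSES = {"ACCEPT", "REFER", "REJECT", "HOLD", "CLEAR"}


class DecisionInbox:
    """Hypothetical in-memory inbox illustrating idempotent application."""

    def __init__(self) -> None:
        self._applied: dict[str, dict] = {}      # idempotency_key -> decision
        self.entity_status: dict[str, str] = {}  # entity_id -> current status

    def apply(self, decision: dict) -> bool:
        """Apply a decision once; return False when it is a replay."""
        status = decision["decision_status"]
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown decision status: {status}")
        key = decision["idempotency_key"]
        if key in self._applied:
            return False  # already applied, so the replay has no effect
        self._applied[key] = decision
        self.entity_status[decision["entity_id"]] = status
        return True
```

Delivering the same payload twice leaves `entity_status` unchanged after the first call, so a publisher can retry without coordinating with the consumer.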
Reason-code and explanation design
Reasoning must serve both machines and people. The contract separates coded causes from plain-language operator guidance.
| Field | Purpose | Audience | Example |
|---|---|---|---|
| reason_code | Stable machine key for rules, analytics, and routing | Systems | HRGEO01 |
| reason_label | Short human-readable name | Operators | High-risk geography |
| reason_explanation | Plain-language explanation of why the reason applied | Operators and reviewers | Customer declared residence in a higher-risk jurisdiction under the current policy set. |
| decision_summary | One-line overall explanation of the decision | Operators and customers where appropriate | Referred for review due to a combination of geography and identity-risk signals. |
Rules:
- Expose the decision, key scores, and top reasons clearly in the operational UI.
- Do not duplicate the full scoring algorithm, factor weights, or all model features in the returned contract.
- Allow separate internal and external explanation text where customer-facing language needs simplification.
- Reason labels should be short enough for queue screens; explanations should remain readable in case detail views.
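To illustrate the split between queue screens and case detail views, the sketch below renders the same decision two ways. The function names, the `max_reasons` parameter, and the output layout are assumptions, not part of the contract.

```python
def queue_row(decision: dict, max_reasons: int = 2) -> str:
    """Compact one-line view: status, tier, and the top reason labels only."""
    labels = [r["reason_label"] for r in decision["reasons"][:max_reasons]]
    tier = decision["score_summary"]["risk_tier"]
    return f'{decision["decision_status"]} | {tier} | ' + "; ".join(labels)


def detail_view(decision: dict) -> str:
    """Case detail view: the summary followed by every coded reason in full."""
    lines = [decision["decision_summary"]]
    for r in decision["reasons"]:
        lines.append(f'{r["reason_code"]} {r["reason_label"]}: {r["reason_explanation"]}')
    return "\n".join(lines)
```

Both views read only fields that the contract publishes, so neither needs access to weights or feature contributions.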
Score visibility
| Score field | Included in write-back | Notes |
|---|---|---|
| risk_score | Yes | Numeric value or bounded band used operationally |
| risk_tier | Yes | Low / Medium / High or equivalent |
| fraud_score | Optional | Include where it directly affects the action |
| Feature contributions | No | Remain in Snowflake explanation stores |
| Model weights | No | Not published back to Neon |
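The visibility table can be enforced with an allowlist, so that internal fields are dropped by default instead of being removed one by one. A minimal sketch, assuming a hypothetical internal scoring record and a caller-supplied `include_fraud` flag:

```python
ALWAYS_PUBLISHED = ("risk_score", "risk_tier")


def score_summary(internal: dict, include_fraud: bool = False) -> dict:
    """Project an internal scoring record onto the fields allowed in write-back."""
    summary = {name: internal[name] for name in ALWAYS_PUBLISHED}
    # fraud_score is optional: publish it only when it directly affects the action.
    if include_fraud and "fraud_score" in internal:
        summary["fraud_score"] = internal["fraud_score"]
    return summary
```

Any field not named in the allowlist, such as feature contributions or model weights, never reaches the published payload, even when new internal fields are added later.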
Schema boundaries
| Platform | Schema families |
|---|---|
| Neon | core_*, kyc_*, aml_*, decision_*, integration_* |
| Snowflake | raw_cdc_*, conformed_*, signal_*, decision_*, governance_* |
- Separate inbound decision state from core product state so replay and audit are easier.
- Keep integration inbox/outbox structures distinct from business-domain tables.
- Reserve governance schemas for contract registry, compatibility records, lineage, and quality controls.
Compatibility and lifecycle rules
Every contract has a semantic version. Backward-compatible additions can be introduced within the same major version. Breaking changes require a new major version and a dual-run period.
Operational consumers must validate schema version before applying a decision. Write-back consumers must use idempotency keys to prevent duplicate effects.
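A consumer-side version check might look like the sketch below. It assumes plain `MAJOR.MINOR.PATCH` version strings and a consumer pinned to a single major version; the function names are hypothetical.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH'; raises ValueError on any other shape."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(payload_version: str, supported_major: int) -> bool:
    """Accept any payload in the supported major version.

    Minor and patch bumps are treated as backward-compatible additions,
    so only a differing major version is rejected.
    """
    return parse_version(payload_version)[0] == supported_major
```

Under these assumptions a consumer pinned to major version 1 accepts `1.0.0` and `1.4.2` and rejects `2.0.0`, which is the case where the dual-run period applies.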
Each contract has a technical owner, a data owner, and a policy owner. Decision contracts must reference policy sources, model version, and production component. Promotion of a field into a stable contract requires explicit approval; exploratory fields stay internal.
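The provenance requirement can be checked before publication. The sketch below assumes the payload field names `policy_refs`, `model_version`, and `produced_by`, and treats an empty value as missing; the helper name is hypothetical.

```python
REQUIRED_PROVENANCE = ("policy_refs", "model_version", "produced_by")


def missing_provenance(decision: dict) -> list[str]:
    """Return the provenance fields that are absent or empty in a decision."""
    return [name for name in REQUIRED_PROVENANCE if not decision.get(name)]
```

A publisher could refuse to emit any decision for which this list is non-empty, keeping unreferenced decisions out of Neon.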
Example decision payload
{
  "decision_id": "2e9fa36f-a993-4dc0-b1ce-6eaabf818001",
  "entity_type": "APPLICATION",
  "entity_id": "app_102934",
  "decision_type": "ONBOARDING",
  "decision_status": "REFER",
  "decision_summary": "Referred for review due to geography and identity-risk signals.",
  "score_summary": {
    "risk_score": 68,
    "risk_tier": "MEDIUM",
    "fraud_score": 41
  },
  "reasons": [
    {
      "reason_code": "HRGEO01",
      "reason_label": "High-risk geography",
      "reason_explanation": "Customer declared residence in a higher-risk jurisdiction under the current policy set."
    },
    {
      "reason_code": "IDV004",
      "reason_label": "Identity confidence below auto-accept threshold",
      "reason_explanation": "Verification passed minimum checks but did not meet the stronger confidence level required for straight-through onboarding."
    }
  ],
  "policy_refs": ["AML-011", "AML-012", "AML-013"],
  "produced_by": "decision_engine.onboarding",
  "model_version": "risk-v1.0.0",
  "idempotency_key": "onb-app_102934-v3",
  "schema_version": "1.0.0"
}
Related
- ADR-036 — Decision result publication
- ADR-003 — CDC pipeline
- Signal taxonomy v1
- Neon and Snowflake physical storage
- Policy DT-012 — Ledger Data Contracts & Event Publication
- Policy OPS-007 — Financial Processing Resilience & Idempotency