Stage 1 Certification
Purpose
Certify the Stage 1 ingestion boundary as audit-grade before Stage 2+ progression.
When to run this
- At initial Stage 1 certification
- After Stage 1 hotfixes affecting ingestion/bridge behavior
Prerequisites
- Access to Stage 1 test suite
- BigQuery read access for proof queries
- Tenant/batch IDs for evidence collection
Inputs
- Certification date/branch/commit metadata
- Stage 1 test output
- SQL proof query output
Procedure
Date: 2026-01-31
Branch: feat/partb-stage1-bridge
Commit: 7697419 (initial certification)
Production Certified: 2026-02-01
Production Certified Revision: 5a8e373
1) Stage 1 scope definition
Stage 1 is the ingestion boundary: transaction_events_raw → stage1_bridge_rows
Boundaries
- Input: CSV upload via /api/v1/intake/upload
- Raw Storage: transaction_events_raw table (all rows, ACCEPTED + REJECTED)
- Bridge: stage1_bridge_rows table (MERGE from raw, typed fields via SAFE_CAST)
- Output: Validated, typed rows ready for Stage 2 processing
Explicitly Deferred to Stage 2
- Business onboarding/discovery logic
- Agent PEPM assignments
- Rate plan resolution
- Commission calculations
- Aggregation/shaping (stage1_snapshots)
- Dashboard fields
2) Required wizard inputs
period_label (YYYY-MM-01 format)
- Required: Yes (wizard input, not column-mapped)
- Format: YYYY-MM-01 (e.g., 2025-01-01)
- Validation: Strict regex ^\d{4}-\d{2}-01$
- Source: Wizard UI input, passed in /process request payload
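The period_label rule above can be sketched in Python. The function name is_valid_period_label is hypothetical; only the regex comes from this runbook.

```python
import re

# Pattern copied from the validation rule above.
PERIOD_LABEL_RE = re.compile(r"^\d{4}-\d{2}-01$")


def is_valid_period_label(value: str) -> bool:
    """Return True when value has the YYYY-MM-01 shape (hypothetical helper)."""
    # fullmatch is used because a bare `$` in Python also matches just
    # before a trailing newline, which would let "2025-01-01\n" through.
    return PERIOD_LABEL_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("2025-01-01", "2025-01-15", "2025-1-01"):
        print(candidate, is_valid_period_label(candidate))
```

The regex checks shape only: a label such as 2025-13-01 matches the pattern even though month 13 does not exist. Whether the server adds a calendar check on top of the regex is not stated here.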
3) Verified endpoints
/api/v1/intake/{batch_id}/discover
- Discovers businesses from mapped rows
- Returns classification (configured_ok, unconfigured_seen_before, new_discovered)
- Status: Verified in test suite
/api/v1/intake/{batch_id}/preflight
- Server-side preflight check (authoritative)
- Validates normalization drift invariant
- Checks policy requirements
- Returns preflight hash for idempotency
- Status: Verified in test suite
/api/v1/intake/{batch_id}/process
- Processes mapped rows → transaction_events_raw
- Bridges to stage1_bridge_rows (MERGE)
- Row-resilient: bad rows REJECTED, batch still PROCESSED
- Returns processing summary with counts
- Status: Verified in test suite (17 tests passing)
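The row-resilience behavior of /process (bad rows REJECTED, batch still PROCESSED) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the RowRejected exception, the validate_row rule, the business_name field, and the summary keys are hypothetical and do not describe the actual processor.

```python
class RowRejected(Exception):
    """Hypothetical exception carrying a structured error code."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def validate_row(row):
    # Hypothetical rule: a row needs a non-empty business_name.
    if not row.get("business_name"):
        raise RowRejected("MISSING_BUSINESS_NAME")


def process_batch(rows):
    """Classify each row on its own; one bad row never fails the batch."""
    results = []
    for row_index, row in enumerate(rows):
        try:
            validate_row(row)
        except RowRejected as exc:
            # The row is recorded as REJECTED and processing continues.
            results.append({"row_index": row_index, "status": "REJECTED",
                            "error_codes": [exc.code]})
            continue
        results.append({"row_index": row_index, "status": "ACCEPTED",
                        "error_codes": []})
    accepted = sum(1 for r in results if r["status"] == "ACCEPTED")
    return {"batch_status": "PROCESSED", "accepted": accepted,
            "rejected": len(results) - accepted, "rows": results}


if __name__ == "__main__":
    print(process_batch([{"business_name": "Acme"}, {"business_name": ""}]))
```

Both outcomes are kept in the result, mirroring the raw table's contract of storing all rows, ACCEPTED and REJECTED.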
4) Invariants (audit-grade)
1. NUMERIC Money Fields (ADR-002)
- All money fields stored as NUMERIC (not FLOAT)
- Fields: credit_num, debit_num, total_num
- Parsed via SAFE_CAST(credit_raw AS NUMERIC)
- Verified: Decimal safety tests passing
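To show why money fields avoid FLOAT, the sketch below uses Python's decimal module as an approximate analog of SAFE_CAST(... AS NUMERIC). The helper safe_cast_numeric is hypothetical. It mirrors only the "return NULL instead of raising" behavior and does not reproduce BigQuery's NUMERIC precision or scale limits.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def safe_cast_numeric(raw: Optional[str]) -> Optional[Decimal]:
    """Approximate analog of SAFE_CAST(raw AS NUMERIC): None on bad input."""
    if raw is None:
        return None
    try:
        value = Decimal(raw)
    except InvalidOperation:
        return None
    # Treat NaN and Infinity as unparseable rather than as money amounts.
    return value if value.is_finite() else None


if __name__ == "__main__":
    # Binary floats accumulate representation error; decimals do not.
    print(0.1 + 0.2 == 0.3)
    print(safe_cast_numeric("0.10") + safe_cast_numeric("0.20") == Decimal("0.30"))
    print(safe_cast_numeric("not-a-number"))
```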
2. Idempotent MERGE Keys
- MERGE key: (tenant_id, batch_id, row_index) + NULL-safe org_id matching
- NULL org_id handling: IFNULL(target.org_id,'') = IFNULL(source.org_id,'')
- Verified: Idempotency tests passing (no duplicates on rerun)
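The NULL-safe match matters because in SQL a plain target.org_id = source.org_id never evaluates to true when both sides are NULL, so a rerun would insert those rows again instead of matching them. The sketch below simulates the keyed upsert with a Python dict in place of BigQuery MERGE. The function names are hypothetical. Note that Python's None == None is already True, so the fold to '' is shown only to mirror the IFNULL expression.

```python
def merge_key(row):
    """(tenant_id, batch_id, row_index) plus org_id with None folded to ''."""
    # Mirrors IFNULL(org_id, '') from the MERGE condition above.
    org_id = row.get("org_id")
    return (row["tenant_id"], row["batch_id"], row["row_index"],
            "" if org_id is None else org_id)


def merge_into(target, source_rows):
    """Upsert source rows into target, a dict keyed by merge_key."""
    for row in source_rows:
        # A matched key is overwritten, an unmatched key is inserted.
        target[merge_key(row)] = dict(row)
    return target


if __name__ == "__main__":
    rows = [
        {"tenant_id": "t1", "batch_id": "b1", "row_index": 0, "org_id": None},
        {"tenant_id": "t1", "batch_id": "b1", "row_index": 1, "org_id": "org-1"},
    ]
    bridge = {}
    merge_into(bridge, rows)
    merge_into(bridge, rows)  # rerun of the same batch
    print(len(bridge))
```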
3. Lineage Integrity
- Every bridge row links to raw row via (tenant_id, batch_id, row_index)
- Full traceability: mapped_payload_json preserved in raw table
- Verified: Lineage join tests passing
4. Safe Error Payloads
- No str(e) in client payloads
- Exceptions logged server-side only
- Error codes: structured arrays (not free-form strings)
- Verified: Error handling tests passing
5. Missing Batch Metadata: Controlled 404 + Metric
- Pre-check: Query ingestion_batches before MERGE
- If batch metadata missing: raise RuntimeError("BATCH_METADATA_MISSING")
- Route handler maps to HTTPException(404) with safe payload
- Structured log field: "metric": "stage1_bridge.batch_metadata_missing"
- Verified: Missing metadata tests passing
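The controlled-404 flow can be sketched without a web framework. As assumptions for illustration: handle_process stands in for the route handler and returns a (status_code, payload) tuple instead of raising HTTPException, known_batches replaces the ingestion_batches query, and the payload shape is hypothetical. Only the error string and the metric name come from this runbook.

```python
import json
import logging

logger = logging.getLogger("stage1_bridge")


def bridge_batch(batch_id, known_batches):
    """Pre-check batch metadata before the MERGE step (omitted here)."""
    if batch_id not in known_batches:
        raise RuntimeError("BATCH_METADATA_MISSING")
    return {"batch_id": batch_id, "status": "PROCESSED"}


def handle_process(batch_id, known_batches):
    """Stand-in for the route handler: returns (status_code, payload)."""
    try:
        return 200, bridge_batch(batch_id, known_batches)
    except RuntimeError as exc:
        if exc.args[:1] != ("BATCH_METADATA_MISSING",):
            raise  # unrelated errors are not converted to 404
        # Structured log line carrying the metric field, server-side only.
        logger.warning(json.dumps({
            "metric": "stage1_bridge.batch_metadata_missing",
            "batch_id": batch_id,
        }))
        # Safe payload: a fixed error code, never the exception text.
        return 404, {"error_codes": ["BATCH_METADATA_MISSING"]}


if __name__ == "__main__":
    print(handle_process("b1", {"b1"}))
    print(handle_process("missing", {"b1"}))
```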
5) Targeted test gate
Command
Results (2026-01-31)
Test Coverage
- test_stage1_bridge.py: Bridge MERGE, idempotency, validation, missing metadata handling
- test_intake_processor.py: Full ingestion flow, decimal safety, RBAC, row resilience
6) Proof queries
Counts Verification
Lineage Integrity
Duplicate Check (Idempotency)
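As an illustration of what the three proof checks look for, the sketch below runs them in Python over in-memory rows rather than as BigQuery SQL. The function proof_checks and its result keys are hypothetical, and the duplicate check here groups on the lineage key (tenant_id, batch_id, row_index) only, which is an assumption.

```python
from collections import Counter

KEY = ("tenant_id", "batch_id", "row_index")


def key_of(row):
    """Lineage key shared by raw and bridge rows."""
    return tuple(row[k] for k in KEY)


def proof_checks(raw_rows, bridge_rows):
    """Return counts, bridge keys with no raw row, and repeated bridge keys."""
    raw_keys = {key_of(r) for r in raw_rows}
    bridge_counts = Counter(key_of(r) for r in bridge_rows)
    return {
        "raw_count": len(raw_rows),
        "bridge_count": len(bridge_rows),
        # Lineage: every bridge key should exist in raw, so expect [].
        "orphan_bridge_keys": sorted(k for k in bridge_counts if k not in raw_keys),
        # Idempotency: no key should appear more than once, so expect [].
        "duplicate_bridge_keys": sorted(k for k, n in bridge_counts.items() if n > 1),
    }


if __name__ == "__main__":
    raw = [{"tenant_id": "t1", "batch_id": "b1", "row_index": i} for i in range(3)]
    bridge = [dict(r) for r in raw[:2]]
    print(proof_checks(raw, bridge))
```

Empty orphan_bridge_keys and duplicate_bridge_keys lists correspond to a passing lineage check and a passing duplicate check.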
7) Log verification
STAGE1_BRIDGE_COMPLETE Event
Check logs for structured event:
8) Known limitations (deferred to Stage 2)
- Business Onboarding: Discovery logic exists but onboarding decisions are Stage 2
- Agent PEPM Assignments: Assignment logic is Stage 2+
- Rate Plan Resolution: Rate plan lookup is Stage 2+
- Commission Calculations: Commission math is Stage 2+
- Aggregation: stage1_snapshots shaping is Stage 2+
Verification
✅ CERTIFIED - Stage 1 ingestion boundary is audit-grade and ready for Stage 2 development.
Evidence:
- All 17 Stage 1 targeted tests passing
- Row resilience verified (bad rows REJECTED, batch still PROCESSED)
- Idempotency verified (no duplicates on rerun)
- Lineage integrity verified (every bridge row links to raw row)
- Decimal safety verified (NUMERIC fields, no FLOAT leakage)
- Missing metadata handling verified (controlled 404 + metric)
Failure modes & fixes
- Test gate failures
  - Fix regression before certification; rerun targeted gate.
- Duplicate rows in bridge
  - Re-check merge key and NULL-safe org match logic.
- Lineage mismatch
  - Verify the (tenant_id, batch_id, row_index) join contract.
- Unsafe error payload exposure
  - Ensure safe, structured error responses only.
Artifacts produced
- Certification metadata (date/branch/commit/revision)
- Test gate output (17 passed baseline)
- SQL proof outputs for counts, lineage, and duplicates
- Structured log evidence (STAGE1_BRIDGE_COMPLETE)
Related docs
- docs/runbooks/STAGE2_CERTIFICATION.md
- docs/reference/INVARIANT_ENFORCEMENT.md
- docs/reference/BQ_CONTRACT.md