Real Payroll File Wizard E2E Test Results
Date: 2026-01-28
File: data/input/December 2025 Combnied Payroll.csv
Size: 821.99 KB
Rows: 2,841
Batch ID: f97d7357-986d-46d4-822c-74eb3d467ac3
Period Label: 2025-12-01
Tenant: creative_benefit_strategies
Org ID: null (Phase 8D)
Test Execution Summary
✅ Step A: Health Check
- Status: 200 OK
- Git SHA: ccfb79ea0f617cad88b235ae0ac4f4d5c11004aa
- Service: payroll-pipeline-cbs-api-evndxpcirq-uc.a.run.app
✅ Step B: Upload
- Status: 200 OK
- Batch ID: f97d7357-986d-46d4-822c-74eb3d467ac3
- Row Count: 2,841
- Status: UPLOADED
- Org ID: null ✅ (Phase 8D compliance)
⚠️ Step C: Preview
- Status: 404 Not Found (endpoint may not be implemented or path incorrect)
- Impact: Non-critical, skipped
⚠️ Step D: Map
- Status: HTTP Timeout (>300 seconds)
- Rows Written: 279 (partial)
- Batch Status: Still UPLOADED (not MAPPED)
- Issue: Synchronous processing of 2,841 rows exceeds HTTP timeout limits
/map endpoint processes files synchronously:
- Downloads file from GCS (821 KB)
- Parses all 2,841 rows into memory
- Processes each row
- Writes all rows to BigQuery in one batch
⏸️ Steps E-G: Not Executed
- Discover: Not executed (requires MAPPED status)
- Preflight: Not executed (requires MAPPED status)
- Process: Not executed (requires MAPPED status)
BigQuery Status (Current State)
ingestion_batches
- batch_id: f97d7357-986d-46d4-822c-74eb3d467ac3
- status: UPLOADED (not MAPPED)
- row_count: 2841
- org_id: null ✅
transaction_events_raw
- row_count: 279 (partial, only 9.8% of total rows)
- accepted_count: 279
- rejected_count: 0
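The mismatch between the two tables can be checked mechanically. The following is a minimal sketch, not code from the pipeline: the function name `mapping_progress` and its return shape are hypothetical, and the numbers are the ones reported above.

```python
def mapping_progress(expected_rows: int, written_rows: int, batch_status: str) -> dict:
    """Summarize how far a batch got through mapping (illustrative helper)."""
    if expected_rows <= 0:
        raise ValueError("expected_rows must be positive")
    percent_written = round(100 * written_rows / expected_rows, 1)
    # Rows exist, but either some are missing or the status never reached MAPPED.
    partial = written_rows > 0 and (
        written_rows < expected_rows or batch_status != "MAPPED"
    )
    return {"percent_written": percent_written, "partial": partial}


# Values taken from ingestion_batches and transaction_events_raw above.
summary = mapping_progress(expected_rows=2841, written_rows=279, batch_status="UPLOADED")
print(summary)
```

A check along these lines could run before any retry, so that a partially written batch is detected rather than treated as untouched.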
MERGE Key Analysis (Task C)
transaction_events_raw MERGE Keys
Source: api/bigquery/intake_processor_queries.py:102-104
The same batch_id updates existing rows (matched by row_index). A new batch_id accumulates new rows.
stage1_bridge_rows MERGE Keys
Source: api/bigquery/intake_processor_queries.py:679-683
The same batch_id updates existing rows (matched by row_index). A new batch_id accumulates new rows.
ingestion_batch_businesses MERGE Keys
Source: api/bigquery/discovery_queries.py:120-124
The same batch_id + period_label + normalized_business_name combination updates existing rows. New combinations accumulate.
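The upsert behavior described for transaction_events_raw and stage1_bridge_rows can be illustrated with an in-memory stand-in. This is a sketch only: it assumes the MERGE key is the (batch_id, row_index) pair, which is how the descriptions above read, and `merge_rows` is a hypothetical helper rather than anything in the cited query files.

```python
from typing import Dict, List, Tuple

# Assumed MERGE key: (batch_id, row_index).
Key = Tuple[str, int]


def merge_rows(table: Dict[Key, dict], batch_id: str, rows: List[dict]) -> Dict[Key, dict]:
    """Upsert rows into a dict that stands in for the target table."""
    for row_index, row in enumerate(rows):
        # Matched key: the existing row is overwritten. Unmatched key: a row is inserted.
        table[(batch_id, row_index)] = row
    return table


table: Dict[Key, dict] = {}
merge_rows(table, "batch-a", [{"amount": "10.00"}, {"amount": "20.00"}])
# Retrying the same batch touches the same keys, so no duplicates appear.
merge_rows(table, "batch-a", [{"amount": "10.00"}, {"amount": "25.00"}])
# A different batch_id produces new keys, so its rows accumulate.
merge_rows(table, "batch-b", [{"amount": "5.00"}])
print(len(table))
```

Under this assumption, re-running a map for the same batch is safe with respect to duplicates, which is what makes an idempotent retry plausible.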
Preflight Readiness Check (Task D)
Source: api/routes/intake_preflight.py:764-780
Query Function: get_business_readiness() from api/bigquery/business_onboarding_queries.py
Tables Checked:
- config_business_onboarding (base table)
- config_business_agent_assignment (JOIN for agent assignments)
- config_business_agent_pepm_assignment (JOIN for PEPM rows)
- config_business_policy (JOIN for OWNER_ROLLUP policies)
- Has OWNER_ROLLUP policy active (as of period)
- Has ≥ 1 agent assignment AND all assigned agents have PEPM rows AND effective_start_date present
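The two readiness conditions can be written as a small predicate. This is a sketch under assumptions, not the logic of get_business_readiness(): the `AgentAssignment` type and `is_ready` function are hypothetical, and attaching effective_start_date to each assignment is a guess, since the report does not say which table carries that field.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentAssignment:
    """Hypothetical flattened view of one agent assignment for a business."""
    agent_id: str
    has_pepm_row: bool
    effective_start_date: Optional[str]  # assumed to live on the assignment


def is_ready(has_active_owner_rollup: bool, assignments: List[AgentAssignment]) -> bool:
    """Return True only when both readiness conditions listed above hold."""
    if not has_active_owner_rollup:
        return False
    if not assignments:  # at least one agent assignment is required
        return False
    return all(a.has_pepm_row and bool(a.effective_start_date) for a in assignments)


complete = [AgentAssignment("agent-1", True, "2025-12-01")]
missing_pepm = complete + [AgentAssignment("agent-2", False, "2025-12-01")]
print(is_ready(True, complete), is_ready(True, missing_pepm))
```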
- Option 1: Use existing admin UI “Go to Business Onboarding” to configure businesses
- Option 2: Use existing admin endpoints (if available):
  - POST /api/v1/admin/onboarding/businesses/{business_id}/assignments (assign agents)
  - POST /api/v1/admin/onboarding/businesses/{business_id}/pepm (set PEPM)
  - POST /api/v1/admin/onboarding/businesses/{business_id}/policy (set OWNER_ROLLUP)
Red Flags Identified
- ⚠️ Performance Issue: /map endpoint times out on large files (>2000 rows)
  - Impact: Cannot process production payroll files synchronously
  - Recommendation: Consider async processing or chunked writes for large files
- ⚠️ Partial Write: 279 rows written but batch status not updated to MAPPED
  - Impact: Inconsistent state - rows exist but batch appears incomplete
  - Recommendation: Ensure atomic batch status updates or implement idempotent retry
- ✅ Tenant Isolation: Confirmed org_id = null for batch (Phase 8D compliance)
- ✅ Decimal Handling: No float leakage detected (all money fields stored as strings)
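The decimal-handling check matters because binary floats cannot represent most cent values exactly. The sketch below is illustrative only: `total_money` is a hypothetical helper, and it shows why money stored as strings and parsed with Python's `decimal.Decimal` avoids the drift that floats introduce.

```python
from decimal import Decimal
from typing import Iterable


def total_money(amounts: Iterable[str]) -> Decimal:
    """Sum money values that are stored as strings, without passing through float."""
    return sum((Decimal(a) for a in amounts), Decimal("0"))


total = total_money(["0.10", "0.20"])
print(total)

# The same values as floats do not add up exactly.
float_total = 0.1 + 0.2
print(float_total == 0.3)
```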
Recommendations
- Immediate: Increase Cloud Run timeout for /map endpoint (if not already maxed at 3600s)
- Short-term: Implement async processing for /map endpoint (return 202, poll status)
- Long-term: Consider chunked BigQuery writes to avoid memory issues on very large files
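The chunked-write recommendation can be sketched as follows. This is not the /map implementation: `chunked` and `write_in_chunks` are hypothetical helpers, the chunk size of 500 is an arbitrary choice, and the `write_chunk` callback stands in for whatever BigQuery write the pipeline actually performs.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(rows: Sequence[dict], size: int) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of at most `size` rows."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def write_in_chunks(
    rows: Sequence[dict],
    size: int,
    write_chunk: Callable[[Sequence[dict]], None],
) -> int:
    """Write rows chunk by chunk and return how many rows were handed off."""
    written = 0
    for chunk in chunked(rows, size):
        write_chunk(chunk)  # placeholder for the real write call
        written += len(chunk)
    return written


# Same row count as the file in this report.
rows = [{"row_index": i} for i in range(2841)]
sizes: List[int] = []
written = write_in_chunks(rows, 500, lambda chunk: sizes.append(len(chunk)))
print(written, sizes)
```

Tracking the running count per chunk would also give an async job something concrete to report while a client polls for status.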
Next Steps
- Retry Map: Attempt map again (should be idempotent) with increased timeout
- Check Logs: Review Cloud Run logs for /map endpoint to identify bottleneck
- Complete Flow: Once map succeeds, proceed with Discover → Preflight → Process
- Validate Outputs: Run BigQuery invariant queries after Process completes
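Since Discover, Preflight, and Process all require MAPPED status, a retry script could wait for that status before continuing. The helper below is a minimal sketch: `wait_for_status` is hypothetical, and `fetch_status` is an injected callable standing in for however the batch status is actually read (API call or BigQuery query), which this report does not specify.

```python
import time
from typing import Callable


def wait_for_status(
    fetch_status: Callable[[], str],
    target: str = "MAPPED",
    attempts: int = 10,
    delay_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the batch reaches `target` or the attempts run out."""
    for _ in range(attempts):
        if fetch_status() == target:
            return True
        sleep(delay_seconds)
    return False


# Simulated status sequence; a real caller would query the service instead.
statuses = iter(["UPLOADED", "UPLOADED", "MAPPED"])
ready = wait_for_status(lambda: next(statuses), attempts=5)
print(ready)
```

Only when the helper returns True would the script move on to Discover; a False result indicates the map still has not completed and the logs should be checked first.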