P0 Discovery Implementation — Deliverable
Date: 2026-01-28
Status: ✅ COMPLETE
Summary
Problem: New businesses from uploads appear in transaction_events_raw but do NOT appear in config_business_onboarding, making them invisible to the readiness and businesses list endpoints.
Solution: Implemented discover_businesses_from_upload() that queries transaction_events_raw, normalizes business labels, and upserts into config_business_onboarding.
1) Route File Paths + Handler Names
Readiness Endpoint
- File: api/routes/business_onboarding.py
- Handler: get_business_readiness_endpoint() (line 911)
- Query Function: get_business_readiness() (line 932)
- Query File: api/bigquery/business_onboarding_queries.py (line 44)
Businesses List Endpoint
- File: api/routes/business_onboarding.py
- Handler: list_businesses_endpoint() (line 77)
- Query Function: list_businesses_for_onboarding() (line 96)
- Query File: api/bigquery/business_onboarding_queries.py (line 285)
2) Query Function Names + Table Names
get_business_readiness()
Source Tables:
- config_business_onboarding (line 139): PRIMARY SOURCE
- config_business_policy (line 142): JOIN for OWNER_ROLLUP policies
- config_business_agent_assignment (line 140): JOIN for agent assignments
- config_business_agent_pepm_assignment (line 141): JOIN for PEPM rows
list_businesses_for_onboarding()
Source Tables:
- config_business_onboarding (line 424): PRIMARY SOURCE
- dim_business_mapping (line 423): JOIN for business identity
- config_business_agent_assignment (line 425): JOIN for assignments
- config_business_agent_pepm_assignment (line 428): JOIN for PEPM
3) Discovery Confirmation
Answer: ❌ NO. Discovery was NOT automatically invoked before the P0 fix.
Evidence:
- refresh_dim_business_mapping_from_stage1() only upserts to dim_business_mapping, not config_business_onboarding
- Intake processor (process_batch) wrote to transaction_events_raw but did NOT call any discovery function
- Readiness endpoint queries ONLY config_business_onboarding (does not join with transaction_events_raw)
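The gap described in the evidence amounts to an anti-join: identifiers that exist in the raw events but have no matching onboarding row. A minimal in-memory sketch of that check follows. The function name find_undiscovered and the sample IDs are hypothetical; the real check would run against transaction_events_raw and config_business_onboarding.

```python
from typing import Iterable, List


def find_undiscovered(raw_ids: Iterable[str], onboarding_ids: Iterable[str]) -> List[str]:
    """Return IDs present in the raw events but absent from onboarding."""
    # Set difference is the in-memory equivalent of an anti-join.
    return sorted(set(raw_ids) - set(onboarding_ids))


# Hypothetical sample data: b2 and b3 were uploaded but never onboarded.
raw = ["b1", "b2", "b3", "b2"]
onboarding = ["b1"]
print(find_undiscovered(raw, onboarding))  # ['b2', 'b3']
```

Before the P0 fix, every business in that result would have been invisible to both endpoints.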
4) P0 Implementation
New Function: discover_businesses_from_upload()
Location: api/bigquery/business_onboarding_queries.py (line 2935)
Functionality:
- Queries transaction_events_raw filtered by batch_id (or period_label)
- Extracts DISTINCT business_name_raw values
- Normalizes using a SQL pattern that matches the Python normalize_business_name()
- Generates business_id using MD5 hash (first 16 hex chars)
- Upserts into config_business_onboarding with:
  - first_seen_date = MIN(period_label) from batch
  - last_seen_date = MAX(period_label) from batch
  - scope = NULL (default, can be set later via adoption)
  - owning_org_id = NULL (default, can be set later via adoption)
  - ignored = FALSE
Invocation:
- ✅ Automatic: After process_batch completes successfully (in api/routes/intake_processor.py, line 702)
- ✅ Manual: Debug endpoint GET /api/v1/admin/onboarding/debug/discover/{batch_id} (line 910 in business_onboarding.py)
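The discovery steps can be illustrated with a small in-memory sketch. This is not the BigQuery implementation. The normalization rule shown here (trim, collapse whitespace, uppercase) is an assumption, since the actual rules of normalize_business_name() are not listed in this document, and the OnboardingRow class, the business_label field, and the discover() helper are hypothetical names. The ID derivation (first 16 hex characters of an MD5 digest) and the default column values follow the list above.

```python
from __future__ import annotations

import hashlib
import re
from dataclasses import dataclass
from typing import Iterable, Optional


def normalize_business_name(raw: str) -> str:
    """Assumed normalization: trim, collapse whitespace, uppercase."""
    return re.sub(r"\s+", " ", raw.strip()).upper()


def make_business_id(normalized: str) -> str:
    """First 16 hex characters of the MD5 digest of the normalized label."""
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()[:16]


@dataclass
class OnboardingRow:
    # Hypothetical in-memory stand-in for a config_business_onboarding row.
    business_id: str
    business_label: str
    first_seen_date: str
    last_seen_date: str
    scope: Optional[str] = None          # set later via adoption
    owning_org_id: Optional[str] = None  # set later via adoption
    ignored: bool = False


def discover(events: Iterable[dict]) -> dict[str, OnboardingRow]:
    """Group raw events by normalized label and track first/last period."""
    found: dict[str, OnboardingRow] = {}
    for event in events:
        raw = event.get("business_name_raw")
        if not raw or not raw.strip():
            continue  # skip rows without a usable business label
        label = normalize_business_name(raw)
        bid = make_business_id(label)
        period = event["period_label"]  # ISO dates compare correctly as strings
        row = found.get(bid)
        if row is None:
            found[bid] = OnboardingRow(bid, label, period, period)
        else:
            row.first_seen_date = min(row.first_seen_date, period)
            row.last_seen_date = max(row.last_seen_date, period)
    return found
```

Because the ID is derived only from the normalized label, two spellings that normalize to the same label collapse into a single row, which is what keeps repeated uploads from creating duplicate businesses.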
Debug Endpoint
Route: GET /api/v1/admin/onboarding/debug/discover/{batch_id}
Handler: debug_discover_businesses_endpoint() (line 910)
Returns: Raw labels + normalized labels + business_ids (before upsert)
Response Format:
Files Modified
- api/bigquery/business_onboarding_queries.py
  - Added discover_businesses_from_upload() function (line 2935)
- api/routes/business_onboarding.py
  - Added import for discover_businesses_from_upload (line 60)
  - Added debug endpoint GET /debug/discover/{batch_id} (line 910)
- api/routes/intake_processor.py
  - Added import for discover_businesses_from_upload (line 28)
  - Integrated discovery call after batch processing (line 702)
- docs/READINESS_DATA_SOURCE_TRACE.md (new)
  - Complete trace of data sources and discovery gap
- docs/P0_DISCOVERY_IMPLEMENTATION.md (this file)
  - Implementation summary and deliverable
Testing
Manual Test Steps
- Upload a CSV file via POST /api/v1/intake/upload
- Map columns via POST /api/v1/intake/{batch_id}/map
- Process batch via POST /api/v1/intake/{batch_id}/process
- Verify discovery:
  - Check logs for: "[INTAKE] Discovered X businesses, upserted Y to config_business_onboarding"
  - Query GET /api/v1/admin/onboarding/businesses?period_label=YYYY-MM-01 (new businesses should appear)
  - Query GET /api/v1/admin/onboarding/businesses/readiness?period_label=YYYY-MM-01 (new businesses should appear)
- Debug endpoint test:
  - GET /api/v1/admin/onboarding/debug/discover/{batch_id} (returns raw and normalized labels)
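To make the verification requests repeatable, the three GET URLs can be built from a base URL, a batch ID, and a period label. The helper below is a small sketch using only the standard library. The function name verification_urls and the example host are hypothetical; the paths are the ones listed in the steps above.

```python
from urllib.parse import quote, urlencode, urljoin

ONBOARDING_PREFIX = "/api/v1/admin/onboarding"


def verification_urls(base_url: str, batch_id: str, period_label: str) -> dict:
    """Build the GET URLs used to verify discovery for one batch."""
    query = urlencode({"period_label": period_label})
    batch = quote(batch_id, safe="")  # escape the ID for use in a path segment
    return {
        "businesses": urljoin(base_url, f"{ONBOARDING_PREFIX}/businesses?{query}"),
        "readiness": urljoin(base_url, f"{ONBOARDING_PREFIX}/businesses/readiness?{query}"),
        "debug": urljoin(base_url, f"{ONBOARDING_PREFIX}/debug/discover/{batch}"),
    }


# Hypothetical local host and batch ID, for illustration only.
for name, url in verification_urls("http://localhost:8000", "batch-123", "2026-01-01").items():
    print(name, url)
```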
Commit Message
Next Steps
- Deploy backend with P0 discovery changes
- Test in production:
  - Upload a test CSV with new businesses
  - Verify businesses appear in onboarding list
  - Verify readiness endpoint includes new businesses
- Monitor logs for discovery success/failure rates
- Consider: Add discovery retry logic if batch processing succeeds but discovery fails
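One possible shape for the retry logic mentioned in the last item is a thin wrapper that retries discovery with exponential backoff and never raises, so a discovery failure cannot fail an already successful batch. This is a sketch under assumptions: the wrapper name run_discovery_with_retry, its parameters, and the log message are hypothetical, and the discover argument stands in for discover_businesses_from_upload(), whose exact signature is not shown in this document.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("intake")


def run_discovery_with_retry(
    discover: Callable[[str], dict],
    batch_id: str,
    attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call discover(batch_id), retrying on any exception.

    Returns the discovery result, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return discover(batch_id)
        except Exception:
            # Hypothetical log line; logs the traceback without re-raising.
            logger.exception(
                "[INTAKE] Discovery attempt %d/%d failed for batch %s",
                attempt, attempts, batch_id,
            )
            if attempt < attempts:
                # Exponential backoff: base, 2x base, 4x base, ...
                sleep(base_delay_s * 2 ** (attempt - 1))
    return None
```

Injecting sleep as a parameter keeps the wrapper testable without real delays. A None result signals that the batch should be re-discovered later, for example through the debug endpoint.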