Readiness Batch Discovery Org Filter Fix - Deployment Proof

Summary

Fixed the readiness endpoint returning total_businesses=0 when the x-org-id header was present.
Root Cause: get_batch_discovered_business_ids was called with org_id=effective_org_id, but batch businesses are platform-scoped (org_id IS NULL in ingestion_batch_businesses), so the org filter matched no rows.
Fix: Pass org_id=None to get_batch_discovered_business_ids to retrieve all platform-scoped batch businesses, then filter by org at the config level via get_business_readiness(org_id=effective_org_id).

Commit SHA

  • Commit: c8fff16
  • Message: fix(readiness): pass org_id=None to batch discovery for platform-scoped businesses

Security Validation

Access Control Model (Validated)

  1. Tenant Isolation: get_ingestion_batch(batch_id, tenant_id) enforces tenant_id match
  2. Batch IDs: UUIDs (unguessable)
  3. RBAC: Requires admin/ceo role (enforced by get_ingestion_principal)
  4. Org Filtering: Happens at config level (get_business_readiness filters by effective_org_id)
  5. Batch org_id: Always NULL (Phase 8D constraint)
Conclusion: Passing org_id=None to batch discovery is safe because:
  • Batch access is already protected by tenant_id + RBAC
  • Org filtering happens at the correct layer (config readiness)
  • No cross-org data leakage risk
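The layering described above can be illustrated with a small, self-contained sketch. The in-memory rows and the helpers `discover_batch_businesses` and `configured_for_org` are hypothetical stand-ins, not the real `get_batch_discovered_business_ids` or `get_business_readiness`; the sketch only assumes that discovery matches on a NULL org_id while org scoping is applied to config rows.

```python
from typing import List, Optional

# Hypothetical in-memory stand-ins for the batch and config tables.
BATCH_BUSINESSES = [
    {"tenant_id": "t1", "batch_id": "b1", "business_id": "biz-a", "org_id": None},
    {"tenant_id": "t1", "batch_id": "b1", "business_id": "biz-b", "org_id": None},
]
CONFIGS = [
    {"tenant_id": "t1", "business_id": "biz-a", "owning_org_id": "org-1"},
    {"tenant_id": "t1", "business_id": "biz-b", "owning_org_id": "org-2"},
]


def discover_batch_businesses(
    tenant_id: str, batch_id: str, org_id: Optional[str]
) -> List[str]:
    """Return batch business IDs whose org_id equals org_id (None matches NULL)."""
    return [
        row["business_id"]
        for row in BATCH_BUSINESSES
        if row["tenant_id"] == tenant_id
        and row["batch_id"] == batch_id
        and row["org_id"] == org_id
    ]


def configured_for_org(
    tenant_id: str, business_ids: List[str], org_id: str
) -> List[str]:
    """Return the subset of business_ids that have a config owned by org_id."""
    owned = {
        c["business_id"]
        for c in CONFIGS
        if c["tenant_id"] == tenant_id and c["owning_org_id"] == org_id
    }
    return [b for b in business_ids if b in owned]


# Bug: filtering discovery by the caller's org matches no platform-scoped rows.
buggy = discover_batch_businesses("t1", "b1", org_id="org-1")

# Fix: discover with org_id=None, then apply the org filter at the config layer.
fixed = discover_batch_businesses("t1", "b1", org_id=None)
visible_configs = configured_for_org("t1", fixed, "org-1")
```

In this toy data, discovery with org_id=None sees both batch businesses, while only the config owned by the caller's org is surfaced at the config layer.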

BigQuery Validation

Batch Businesses Are Platform-Scoped (Confirmed)

SELECT org_id, COUNT(*) as c
FROM `payroll-bi-gauntlet.payroll_analytics.ingestion_batch_businesses`
WHERE tenant_id='creative_benefit_strategies'
  AND batch_id='d38d4994-56ef-4360-a7e8-2acdc9e05591'
GROUP BY org_id;
Result:
[
  {
    "c": "113",
    "org_id": null
  }
]
Conclusion: All 113 businesses in the batch have org_id = null (platform-scoped), validating the fix.
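The same conclusion can be checked mechanically against the JSON result shown above. This is a minimal sketch: `all_platform_scoped` is a hypothetical helper, and it assumes the counts arrive as strings, as they do in the result above.

```python
import json


def all_platform_scoped(rows):
    """True if the GROUP BY result is non-empty and every group has org_id NULL."""
    return bool(rows) and all(row["org_id"] is None for row in rows)


# The result copied from the query output above.
result = json.loads('[{"c": "113", "org_id": null}]')

# Counts are strings in the JSON output, so convert before summing.
total = sum(int(row["c"]) for row in result)
platform_scoped = all_platform_scoped(result)
```

A second group with a non-NULL org_id would make `all_platform_scoped` return False, which would contradict the assumption that batch org_id is always NULL.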

Tests Added

Regression Tests (api/tests/test_readiness_batch_discovery_org_filter.py)

  1. test_readiness_calls_batch_discovery_with_org_id_none: Verifies readiness handler calls get_batch_discovered_business_ids with org_id=None even when x-org-id header is present.
  2. test_get_batch_discovered_business_ids_org_filter_clause: Tests SQL clause builder:
    • When org_id is not None => uses "ib.org_id = @org_id"
    • When org_id is None => uses "ib.org_id IS NULL"
  3. test_readiness_returns_businesses_when_batch_is_platform_scoped: Tests exact bug scenario (batch businesses platform-scoped, configs org-scoped).
  4. test_readiness_returns_zero_if_batch_discovery_fails_with_wrong_org: Documents the buggy behavior (zero businesses when batch discovery is filtered by the wrong org) to guard against regression.
Test Results: ✅ All 4 tests pass
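The clause behavior exercised by the second test can be sketched as follows. `build_org_filter` is a hypothetical name and shape, not the actual builder inside get_batch_discovered_business_ids; only the two clause strings come from the test description above. The None branch matters because in SQL a comparison such as `ib.org_id = NULL` never evaluates to TRUE, so NULL rows must be matched with IS NULL.

```python
from typing import Dict, Optional, Tuple


def build_org_filter(org_id: Optional[str]) -> Tuple[str, Dict[str, str]]:
    """Return (SQL clause, query parameters) for the batch org filter.

    Hypothetical sketch: a None org_id selects platform-scoped rows,
    anything else selects rows owned by that org via a named parameter.
    """
    if org_id is None:
        # "= NULL" would match nothing, so use IS NULL and bind no parameter.
        return "ib.org_id IS NULL", {}
    return "ib.org_id = @org_id", {"org_id": org_id}


platform_clause, platform_params = build_org_filter(None)
org_clause, org_params = build_org_filter("cbs-main")
```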

Code Changes

api/routes/intake.py

Change: Added security comment documenting access control model.
Location: Lines 1580-1588
# Get business_ids discovered for this batch (with identity_status)
# CRITICAL: Batch discovery is platform-scoped (org_id IS NULL in ingestion_batch_businesses)
# Pass org_id=None to get all batch businesses, then filter by org at readiness level
#
# SECURITY: Passing org_id=None is safe because:
# 1. Batch access is enforced by tenant_id (get_ingestion_batch checks tenant_id)
# 2. Batch IDs are UUIDs (unguessable)
# 3. RBAC requires admin/ceo role (enforced by get_ingestion_principal)
# 4. Org filtering happens at config level (get_business_readiness filters by effective_org_id)
# 5. Batch org_id is always NULL (Phase 8D constraint)
batch_businesses = get_batch_discovered_business_ids(
    tenant_id=tenant_id,
    batch_id=batch_id,
    period_label=period_label,
    org_id=None  # Platform-scoped: batch businesses have org_id IS NULL
)
Note: The fix itself (passing org_id=None) was already applied in a previous commit. This commit adds documentation and regression tests.

Deployment Status

  • GitHub Actions: Triggered by push to main
  • Cloud Run Service: payroll-backend-prod
  • Deployment Status: Pending verification

Post-Deployment Verification Steps

1. Verify Deployment SHA

curl -s "https://payroll-backend-prod-238826317621.us-central1.run.app/api/health" | jq '.git_commit_sha'
Expected: The returned SHA should equal c8fff16 or, if the endpoint reports the full SHA, begin with it.
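For scripted checks, the comparison can be sketched as a prefix match. `sha_matches` is a hypothetical helper, and it assumes the health endpoint reports either the short SHA or the full SHA of the deployed commit; a prefix test is stricter than a substring test and avoids accidental matches in the middle of an unrelated SHA.

```python
def sha_matches(deployed_sha: str, expected_short: str = "c8fff16") -> bool:
    """True if deployed_sha is, or begins with, the expected short SHA."""
    normalized = (deployed_sha or "").strip().lower()
    return bool(normalized) and normalized.startswith(expected_short.lower())


# Placeholder full-length value for illustration only, not the real full SHA.
placeholder_full_sha = "c8fff16" + "0" * 33

short_ok = sha_matches("c8fff16")
full_ok = sha_matches(placeholder_full_sha)
stale = sha_matches("0000000")
```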

2. Network-Level Verification

Using Chrome DevTools → Network tab:
  1. Navigate to ingestion wizard preflight step
  2. Set period to 2025-12-01 (or current period)
  3. Observe readiness GET request:
    • URL: /api/v1/intake/ingestion-wizard/readiness?batch_id=...&period_label=...
    • Headers: Must include x-org-id: cbs-main (or appropriate org)
    • Response: Must have total_businesses > 0
Expected: total_businesses should match the number of businesses in the batch (e.g., 113 for the test batch).
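The response check can also be applied to a copied response body. This is a minimal sketch: `check_readiness` is a hypothetical helper, and it assumes total_businesses is a top-level integer field in the JSON response, which the source does not state explicitly.

```python
from typing import List, Optional


def check_readiness(payload: dict, expected_total: Optional[int] = None) -> List[str]:
    """Return a list of problems found in a readiness response; empty means OK."""
    problems = []
    total = payload.get("total_businesses")
    if not isinstance(total, int) or total <= 0:
        problems.append(f"total_businesses must be > 0, got {total!r}")
    elif expected_total is not None and total != expected_total:
        problems.append(f"expected {expected_total} businesses, got {total}")
    return problems


# Example payloads: the pre-fix symptom and the expected post-fix shape.
before_fix = check_readiness({"total_businesses": 0}, expected_total=113)
after_fix = check_readiness({"total_businesses": 113}, expected_total=113)
```

An empty list for the post-fix payload corresponds to the expectation stated above; any entry in the list is a reason to fail verification.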

3. UI Verification

  1. Open business drawer for a configured business (e.g., “AIC Inc”)
  2. Verify saved PEPM/assignments are visible
  3. Save a new configuration
  4. Close and reopen drawer
  5. Verify saved configuration persists
Expected: UI should show saved configurations without requiring page refresh.

4. BigQuery Spot Check

Verify that config rows have owning_org_id and scope set correctly:
SELECT business_id, owning_org_id, scope
FROM `payroll-bi-gauntlet.payroll_analytics.config_business_onboarding`
WHERE tenant_id='creative_benefit_strategies'
  AND business_id='a7ac092572c6fa5c'
  AND effective_start_date <= '2025-12-01'
  AND (effective_end_date IS NULL OR effective_end_date >= '2025-12-01')
LIMIT 1;
Expected: Row should exist with owning_org_id='cbs-main' (or appropriate org) and scope='ORG_SCOPED'.

Success Criteria

All criteria must pass:
  1. Readiness endpoint returns total_businesses > 0 when x-org-id header is present
  2. Saved business configurations persist and are visible in UI
  3. No console errors in browser
  4. No 500 errors from readiness endpoint
  5. BigQuery confirms batch businesses are platform-scoped (org_id IS NULL)
  6. BigQuery confirms config rows are org-scoped (owning_org_id set)

Rollback Plan

If deployment causes issues:
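  1. Revert commit c8fff16. Per the note under Code Changes, this commit adds only documentation and regression tests, so if the behavior change itself must be rolled back, also revert the earlier commit that introduced org_id=None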
  2. Push to main to trigger rollback deployment
  3. Investigate root cause before re-applying fix

  • Original Issue: UI shows successful saves but readiness returns 0/113 businesses
  • Root Cause: get_batch_discovered_business_ids called with org_id=effective_org_id instead of org_id=None
  • Fix: Pass org_id=None to batch discovery, filter by org at config level

Notes

  • This fix does not change the security model - it corrects a bug that prevented correct data retrieval
  • Org filtering still happens at the correct layer (config readiness)
  • Batch access is still protected by tenant_id + RBAC
  • No schema changes required