P0 Discovery Deployment Runbook

Purpose

Deploy and validate the P0 discovery fix so discovered businesses are written into onboarding readiness flows.

When to run this

  • Rolling out commit 65c90d3
  • Re-validating P0 discovery behavior after backend deployment changes

Prerequisites

  • Cloud Run deploy access in the payroll-bi-gauntlet project
  • Admin JWT for the onboarding debug endpoint
  • Intake UI access and a December payroll sample file

Inputs

  • Target commit SHA (65c90d3)
  • Cloud Run service (payroll-pipeline-cbs-api)
  • batch_id captured from intake upload
  • Tenant (creative_benefit_strategies)

Procedure

Commit

Committed: 65c90d3 - fix(p0): discover businesses from intake batch into onboarding registry

1) Deploy backend (Cloud Run)

Option A: Cloud Build (recommended)
gcloud builds submit --config=cloudbuild.yaml
Option B: Manual deploy (if Cloud Build unavailable)
# Build and push image
gcloud builds submit --tag gcr.io/payroll-bi-gauntlet/payroll-pipeline-cbs-api:65c90d3

# Deploy to Cloud Run
gcloud run deploy payroll-pipeline-cbs-api \
  --image=gcr.io/payroll-bi-gauntlet/payroll-pipeline-cbs-api:65c90d3 \
  --region=us-central1 \
  --platform=managed \
  --allow-unauthenticated \
  --service-account=sa-worker@payroll-bi-gauntlet.iam.gserviceaccount.com \
  --set-secrets=JWT_SECRET_KEY=jwt-secret:latest \
  --set-env-vars=GCP_PROJECT_ID=payroll-bi-gauntlet,CORS_ORIGINS=https://payroll-pipeline-cbs.vercel.app,GIT_COMMIT_SHA=65c90d3,STAGE=prod,ONBOARDING_STORAGE_BUCKET=payroll-bi-gauntlet-onboarding-storage \
  --update-labels=git-sha=65c90d3 \
  --memory=1Gi \
  --cpu=1 \
  --max-instances=10 \
  --min-instances=0 \
  --timeout=300 \
  --concurrency=80

2) Runtime validation (post-deploy)

Step 2.1: Upload December payroll

  1. Upload December payroll file via intake UI
  2. Capture batch_id from response or logs

Step 2.2: Call Debug Endpoint

# Replace {batch_id} with the batch_id from Step 2.1, {admin_jwt_token} with the admin JWT,
# and <hash> with the hash in the service's Cloud Run URL
curl -X GET "https://payroll-pipeline-cbs-api-<hash>-uc.a.run.app/api/v1/admin/onboarding/debug/discover/{batch_id}" \
  -H "Authorization: Bearer {admin_jwt_token}"
Expected response (both upsert counts are 0 because this call is a dry run):
{
  "success": true,
  "tenant_id": "creative_benefit_strategies",
  "batch_id": "{batch_id}",
  "discovered_count": 3,
  "dim_upserted_count": 0,
  "onboarding_upserted_count": 0,
  "invalid_period_count": 0,
  "dry_run": true,
  "businesses": [
    {
      "raw_name": "AIC",
      "normalized_name": "AIC",
      "business_id": "{hash}",
      "first_seen_date": "2026-12-01",
      "last_seen_date": "2026-12-01"
    },
    {
      "raw_name": "Motor Medics",
      "normalized_name": "MOTOR MEDICS",
      "business_id": "{hash}",
      "first_seen_date": "2026-12-01",
      "last_seen_date": "2026-12-01"
    },
    {
      "raw_name": "Auto Intensive Care of Savannah",
      "normalized_name": "AUTO INTENSIVE CARE OF SAVANNAH",
      "business_id": "{hash}",
      "first_seen_date": "2026-12-01",
      "last_seen_date": "2026-12-01"
    }
  ]
}
Validation checklist:
  • discovered_count > 0, and businesses includes AIC, Motor Medics, and Auto Intensive Care of Savannah
  • invalid_period_count is present (ideally 0)
  • dry_run is true
  • dim_upserted_count is 0 (dry run, no writes)
  • onboarding_upserted_count is 0 (dry run, no writes)
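
The checklist above can also be applied mechanically to a saved response. The following is a minimal sketch, assuming the response body has been parsed into a Python dict with the field names shown in the expected response. The function name check_dry_run_response and the EXPECTED_NAMES constant are illustrative and not part of the codebase.

```python
from __future__ import annotations

from typing import Any

# Businesses the runbook expects in the December sample (raw_name values).
EXPECTED_NAMES = {"AIC", "Motor Medics", "Auto Intensive Care of Savannah"}


def check_dry_run_response(resp: dict[str, Any]) -> list[str]:
    """Return a list of checklist failures; an empty list means all checks pass."""
    failures: list[str] = []

    if resp.get("dry_run") is not True:
        failures.append("dry_run is not true")
    if resp.get("discovered_count", 0) <= 0:
        failures.append("discovered_count is not > 0")
    if "invalid_period_count" not in resp:
        failures.append("invalid_period_count is missing")

    # A dry run must not write anything, so both upsert counts must be 0.
    for field in ("dim_upserted_count", "onboarding_upserted_count"):
        if resp.get(field) != 0:
            failures.append(f"{field} is not 0")

    # Every expected business must appear by raw_name.
    found = {b.get("raw_name") for b in resp.get("businesses", [])}
    missing = EXPECTED_NAMES - found
    if missing:
        failures.append("missing businesses: " + ", ".join(sorted(missing)))

    return failures
```

Saving the curl output to a file and passing json.load(...) of that file to the function gives a quick pass/fail list to attach to the deployment record.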

Step 2.3: Trigger Discovery (Non-Dry-Run)

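Discovery is expected to run automatically when intake processing completes (triggered from api/routes/intake_processor.py, line 707). Alternatively, trigger it manually through the intake completion flow. Unlike the debug call in Step 2.2, this run is not a dry run and writes to the onboarding registry.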

Step 2.4: Refresh Preflight/Readiness

  1. Navigate to Preflight UI
  2. Refresh readiness check
  3. Expected: the previously missing businesses (AIC, Motor Medics, Auto Intensive Care of Savannah) now appear in the “Not ready for ingest” list

Step 2.5: Verify No Duplicates (Idempotency)

  1. Re-upload the same December payroll file (or re-trigger discovery for the same batch_id)
  2. Refresh Preflight again
  3. Expected: No duplicate entries, same business count
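
One way to make the "same business count" check concrete is to compare the business IDs seen before and after the re-run. This is a minimal sketch, assuming each run's businesses are available as a list of dicts with a business_id key (for example, the businesses array returned by the debug endpoint). compare_runs is a hypothetical helper name, not an existing function.

```python
from __future__ import annotations

from collections import Counter


def compare_runs(first: list[dict], second: list[dict]) -> dict[str, list[str]]:
    """Compare business_id values from two discovery runs of the same batch."""
    first_ids = Counter(b["business_id"] for b in first)
    second_ids = Counter(b["business_id"] for b in second)
    return {
        # IDs that occur more than once within the second run.
        "duplicated": sorted(i for i, n in second_ids.items() if n > 1),
        # IDs that appear only after the re-run.
        "added": sorted(set(second_ids) - set(first_ids)),
        # IDs that were present before the re-run but are now gone.
        "removed": sorted(set(first_ids) - set(second_ids)),
    }
```

An idempotent re-run should return three empty lists; any non-empty list points at the failure mode "Duplicates appear on re-run".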

3) BigQuery verification (optional)

-- Verify dim_business_mapping has entries
SELECT 
  tenant_id,
  business_id,
  normalized_name,
  first_seen_date,
  last_seen_date
FROM `payroll-bi-gauntlet.payroll_analytics.dim_business_mapping`
WHERE tenant_id = 'creative_benefit_strategies'
  AND last_seen_date >= DATE('2026-12-01')
ORDER BY normalized_name;

-- Verify config_business_onboarding has registry entries
SELECT 
  tenant_id,
  business_id,
  scope,
  business_label,
  created_at
FROM `payroll-bi-gauntlet.payroll_analytics.config_business_onboarding`
WHERE tenant_id = 'creative_benefit_strategies'
  AND scope = 'PAYROLL_DISCOVERED'
ORDER BY business_label;
Expected:
  • dim_business_mapping: Entries for discovered businesses with last_seen_date = 2026-12-01
  • config_business_onboarding: Registry entries with scope = 'PAYROLL_DISCOVERED'
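
The queries above list rows but do not flag duplicates on their own. As a sketch, assuming the result rows have been exported as dicts keyed by column name (for example, from a JSON export of the query result), the hypothetical helper below counts rows that share the same (tenant_id, business_id, scope). Whether that tuple is the actual idempotency key of config_business_onboarding is an assumption to confirm against the upsert logic.

```python
from __future__ import annotations

from collections import Counter
from typing import Any, Iterable, Sequence


def duplicate_registry_keys(
    rows: Iterable[dict[str, Any]],
    key_fields: Sequence[str] = ("tenant_id", "business_id", "scope"),
) -> dict[tuple, int]:
    """Return each key tuple that occurs more than once, with its row count."""
    # key_fields is an assumed uniqueness key; adjust it to the real one.
    counts = Counter(tuple(row.get(f) for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}
```

An empty result means no key is repeated in the exported rows. For dim_business_mapping, passing key_fields=("tenant_id", "business_id") applies the same check to the columns that query selects.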

Verification

P0 is fully closed when all of the following hold:
  1. Debug endpoint returns discovered businesses (dry-run)
  2. Preflight/readiness shows missing businesses as “Not ready for ingest”
  3. No duplicates on re-run (idempotency verified)

Failure modes & fixes

  1. Debug endpoint returns 0 discovered businesses
    • Verify mapped intake rows and period validity.
  2. Discovery not reflected in readiness UI
    • Confirm intake completion trigger and readiness refresh path.
  3. Duplicates appear on re-run
    • Validate idempotent keys and upsert logic for onboarding registry.

Artifacts produced

  • Deployment record for commit 65c90d3
  • Debug endpoint output for discovery dry-run
  • Readiness UI evidence
  • Duplicate-check SQL result
  • docs/runbooks/P0_DISCOVERY_RUNTIME_VALIDATION_STEPS.md
  • docs/guides/P0_DISCOVERY_IMPLEMENTATION.md
  • docs/reference/READINESS_DATA_SOURCE_TRACE.md

Supporting reference

Rollback (if needed)

# Route 100% of traffic back to the previous working revision
gcloud run services update-traffic payroll-pipeline-cbs-api \
  --to-revisions=<previous_revision>=100 \
  --region=us-central1