Repo B Shadow Mode Execution Guide

Purpose

Guide for running Repo B in shadow mode and validating dashboard parity.

When to run this

Before Repo B cutover
After data/logic changes that may impact dashboard parity
During parity incident investigation

Prerequisites

Environment Setup:

# Set shadow mode environment variable
export BQ_SHADOW_MODE=1

# Verify GCP authentication
gcloud auth application-default login

# Verify project ID
echo $GCP_PROJECT_ID  # Should output: payroll-bi-gauntlet

Required Files:
- Repo B Stage1_Raw CSV files for June-September 2025
- Repo B Stage3_Full_Detail CSV files for June-September 2025
- Location: data/outputs/repo_b/dashboard_data/{YYYY-MM}/dashboard_csvs/

Inputs

Periods to backfill (June-September 2025 baseline in this guide)
Tenant (creative_benefit_strategies in examples)
Repo B output CSV files
Shadow validation output file path (for example parity_results.json)

Procedure

Last Updated

2025-12-19

Overview

Shadow mode allows Repo B to write to parallel datasets (payroll_raw_shadow, payroll_processed_shadow, payroll_analytics_shadow) without affecting production data. This enables safe validation of Repo B output before cutover.

1) Shadow backfill (June-September 2025)

Upload Single Period

# Set shadow mode
export BQ_SHADOW_MODE=1

# Upload June 2025
python -m repo_b.upload_to_bq --period 2025-06

# Upload July 2025
python -m repo_b.upload_to_bq --period 2025-07

# Upload August 2025
python -m repo_b.upload_to_bq --period 2025-08

# Upload September 2025
python -m repo_b.upload_to_bq --period 2025-09

Batch Upload Script

Create scripts/shadow_backfill_june_september.sh:

#!/bin/bash
set -e

export BQ_SHADOW_MODE=1

PERIODS=("2025-06" "2025-07" "2025-08" "2025-09")

for period in "${PERIODS[@]}"; do
    echo "Uploading $period..."
    python -m repo_b.upload_to_bq --period "$period"
    echo "✅ $period complete"
done

echo "✅ Shadow backfill complete"

Run:

chmod +x scripts/shadow_backfill_june_september.sh
./scripts/shadow_backfill_june_september.sh

Verification

After upload, verify shadow datasets:

-- Check shadow dataset row counts
SELECT 
  'stage1_shadow' as table_name,
  period_label,
  COUNT(*) as row_count
FROM `payroll-bi-gauntlet.payroll_raw_shadow.stage1_snapshots`
WHERE tenant_id = 'creative_benefit_strategies'
  AND period_label IN (DATE('2025-06-01'), DATE('2025-07-01'), DATE('2025-08-01'), DATE('2025-09-01'))
GROUP BY period_label
ORDER BY period_label

UNION ALL

SELECT 
  'stage3_shadow' as table_name,
  period_label,
  COUNT(*) as row_count
FROM `payroll-bi-gauntlet.payroll_processed_shadow.stage3_snapshots`
WHERE tenant_id = 'creative_benefit_strategies'
  AND period_label IN (DATE('2025-06-01'), DATE('2025-07-01'), DATE('2025-08-01'), DATE('2025-09-01'))
GROUP BY period_label
ORDER BY period_label

Expected: Row counts should match production datasets for the same periods.

2) Create shadow analytics views

Deploy Views

# Deploy CEO snapshot view
bq query --use_legacy_sql=false < integration/bigquery/sql/views_shadow/analytics_ceo_snapshot.sql

# Deploy top businesses view
bq query --use_legacy_sql=false < integration/bigquery/sql/views_shadow/analytics_top_businesses.sql

# Deploy top agents view
bq query --use_legacy_sql=false < integration/bigquery/sql/views_shadow/analytics_top_agents.sql

# Deploy business growth/loss view
bq query --use_legacy_sql=false < integration/bigquery/sql/views_shadow/analytics_business_growth_loss.sql

Verification

-- Test shadow views
SELECT COUNT(*) as ceo_snapshot_rows
FROM `payroll-bi-gauntlet.payroll_analytics_shadow.ceo_snapshot`
WHERE tenant_id = 'creative_benefit_strategies'
  AND period_label IN (DATE('2025-06-01'), DATE('2025-07-01'), DATE('2025-08-01'), DATE('2025-09-01'))

SELECT COUNT(*) as top_businesses_rows
FROM `payroll-bi-gauntlet.payroll_analytics_shadow.top_businesses`
WHERE tenant_id = 'creative_benefit_strategies'
  AND period_label IN (DATE('2025-06-01'), DATE('2025-07-01'), DATE('2025-08-01'), DATE('2025-09-01'))

SELECT COUNT(*) as top_agents_rows
FROM `payroll-bi-gauntlet.payroll_analytics_shadow.top_agents`
WHERE tenant_id = 'creative_benefit_strategies'
  AND period_label IN (DATE('2025-06-01'), DATE('2025-07-01'), DATE('2025-08-01'), DATE('2025-09-01'))

SELECT COUNT(*) as growth_loss_rows
FROM `payroll-bi-gauntlet.payroll_analytics_shadow.business_growth_loss`
WHERE tenant_id = 'creative_benefit_strategies'
  AND period_label IN (DATE('2025-06-01'), DATE('2025-07-01'), DATE('2025-08-01'), DATE('2025-09-01'))

Expected: All views return non-zero row counts.

3) Run parity validation

Basic Execution

# Run parity validator
python tests/verify_dashboard_parity_shadow.py

With JSON Output

# Save results to JSON file
python tests/verify_dashboard_parity_shadow.py --json-output parity_results.json

Example Output

================================================================================
DASHBOARD PARITY VALIDATION: Repo A (prod) vs Repo B (shadow)
================================================================================
Tenant ID: creative_benefit_strategies

MTD VALIDATION
--------------------------------------------------------------------------------

Period: 2025-07-01
  [PASS] CEO Metrics
  [PASS] Top Businesses Ranking
  [PASS] Top Agents Ranking

Period: 2025-08-01
  [PASS] CEO Metrics
  [PASS] Top Businesses Ranking
  [PASS] Top Agents Ranking

================================================================================
MoM VALIDATION
--------------------------------------------------------------------------------

MoM: 2025-07-01 ↔ 2025-06-01
  [PASS] Growth/Loss Reconciliation

MoM: 2025-08-01 ↔ 2025-07-01
  [PASS] Growth/Loss Reconciliation

================================================================================
QTD VALIDATION (Q3: baseline = 2025-07-01)
--------------------------------------------------------------------------------

Period: 2025-07-01 QTD
  [PASS] CEO Metrics QTD
  [PASS] Growth/Loss QTD

Period: 2025-08-01 QTD
  [PASS] CEO Metrics QTD
  [PASS] Growth/Loss QTD

================================================================================
SUMMARY
--------------------------------------------------------------------------------
Total Checks: 24
Passed: 24
Failed: 0

✅ ALL PARITY CHECKS PASSED

Interpreting Results

[PASS]: Metric matches within tolerance
- Counts: exact match (diff = 0)
- Money: abs diff ≤ $0.05
- Rankings: identical ordered list
[FAIL]: Metric mismatch
- Check diff values in output
- Review JSON output for detailed comparison
- Investigate root cause before proceeding

Common Issues

Missing Shadow Data:
- Error: No shadow data found for {period}
- Fix: Run shadow backfill for the missing period
View Not Created:
- Error: Table not found: payroll_analytics_shadow.{view_name}
- Fix: Deploy shadow views (Step 2)
Money Tolerance Exceeded:
- Error: abs_diff > 0.05
- Fix: Investigate rounding differences or calculation logic
Ranking Mismatch:
- Error: prod != shadow (ordered lists differ)
- Fix: Check business/agent name normalization

4) Review parity results

JSON Output Structure

{
  "mtd": {
    "2025-07-01": {
      "ceo_metrics": {
        "total_businesses": {
          "metric": "total_businesses",
          "prod": 150,
          "shadow": 150,
          "diff": 0,
          "match": true,
          "tolerance": 0
        },
        "gross_payout": {
          "metric": "gross_payout",
          "prod": 18523.78,
          "shadow": 18523.80,
          "diff": 0.02,
          "abs_diff": 0.02,
          "match": true,
          "tolerance": 0.05
        }
      },
      "top_businesses": {
        "ranking": {
          "metric": "top_businesses_ranking",
          "prod": ["Business A", "Business B", ...],
          "shadow": ["Business A", "Business B", ...],
          "match": true
        }
      }
    }
  },
  "mom": {
    "2025-07-01_vs_2025-06-01": {
      "reconciliation": {
        "comparison": {
          "current_employees": {
            "metric": "current_employees",
            "prod": 5000,
            "shadow": 5000,
            "diff": 0,
            "match": true
          }
        }
      }
    }
  },
  "qtd": {
    "2025-07-01": {
      "ceo_metrics": {
        "gross_payout": {
          "metric": "gross_payout_qtd",
          "prod": 18523.78,
          "shadow": 18523.80,
          "diff": 0.02,
          "match": true
        }
      }
    }
  },
  "summary": {
    "total_checks": 24,
    "passed": 24,
    "failed": 0
  }
}

Review Checklist

All MTD checks pass for June-September
All MoM checks pass for Jul↔Jun, Aug↔Jul, Sep↔Aug
All QTD checks pass for July, August, September
No money diffs exceed $0.05
All rankings match exactly
All counts match exactly

Verification

Shadow datasets contain expected row counts for all target periods
Shadow analytics views are deployed and queryable
Parity validator completes and reports pass/fail breakdown
Any failing metrics are documented with root-cause notes

Failure modes & fixes

Issue: `is_tpa` Column Missing

Error:

CRITICAL: Stage3 CSV missing required column 'is_tpa'

Fix:

Regenerate Stage3 CSV with Repo B engine
Ensure is_tpa column is included in CSV export

Issue: Shadow Mode Not Active

Error:

SEATBELT: Refusing to run for period=2025-07

Fix:

export BQ_SHADOW_MODE=1
# Verify
python -c "import os; print('Shadow mode:', os.getenv('BQ_SHADOW_MODE'))"

Issue: Period Not in Shadow Allowlist

Error:

SHADOW SEATBELT: Refusing to run for period=2025-10-01

Fix:

Add period to ALLOWED_PERIODS_SHADOW in repo_b/upload_to_bq.py
Or use --force flag (dangerous, only for testing)

Artifacts produced

Shadow backfill execution logs
Shadow view deployment results
Parity output (console and optional parity_results.json)
Review checklist with pass/fail status

docs/runbooks/engine_cutover_repoB.md
docs/shadow_mode_final_alignment.md
docs/runbooks/STAGE3_KICKOFF.md

Supporting reference

Next steps

After successful parity validation:

Review Results: Ensure all checks pass
Document Findings: Note any minor diffs or edge cases
Proceed to Cutover: Follow docs/runbooks/engine_cutover_repoB.md

Support

For issues or questions:

Check docs/shadow_mode_final_alignment.md for architecture details
Review docs/runbooks/engine_cutover_repoB.md for cutover procedure
Contact Data Engineering team

Start Here

Operate & Support

Reference

Build & Integrate

SHADOW MODE EXECUTION GUIDE

Repo B Shadow Mode Execution Guide

Purpose

When to run this

Prerequisites

Inputs

Procedure

Last Updated

Overview

1) Shadow backfill (June-September 2025)

Upload Single Period

Batch Upload Script

Verification

2) Create shadow analytics views

Deploy Views

Verification

3) Run parity validation

Basic Execution

With JSON Output

Example Output

Interpreting Results

Common Issues

4) Review parity results

JSON Output Structure

Review Checklist

Verification

Failure modes & fixes

Issue: `is_tpa` Column Missing

Issue: Shadow Mode Not Active

Issue: Period Not in Shadow Allowlist

Artifacts produced

Supporting reference

Next steps

Support

Start Here

Operate & Support

Reference

Build & Integrate

​Repo B Shadow Mode Execution Guide

​Purpose

​When to run this

​Prerequisites

​Inputs

​Procedure

​Last Updated

​Overview

​1) Shadow backfill (June-September 2025)

​Upload Single Period

​Batch Upload Script

​Verification

​2) Create shadow analytics views

​Deploy Views

​Verification

​3) Run parity validation

​Basic Execution

​With JSON Output

​Example Output

​Interpreting Results

​Common Issues

​4) Review parity results

​JSON Output Structure

​Review Checklist

​Verification

​Failure modes & fixes

​Issue: is_tpa Column Missing

​Issue: Shadow Mode Not Active

​Issue: Period Not in Shadow Allowlist

​Artifacts produced

​Related docs

​Supporting reference

​Next steps

​Support

Repo B Shadow Mode Execution Guide

Purpose

When to run this

Prerequisites

Inputs

Procedure

Last Updated

Overview

1) Shadow backfill (June-September 2025)

Upload Single Period

Batch Upload Script

Verification

2) Create shadow analytics views

Deploy Views

Verification

3) Run parity validation

Basic Execution

With JSON Output

Example Output

Interpreting Results

Common Issues

4) Review parity results

JSON Output Structure

Review Checklist

Verification

Failure modes & fixes

Issue: `is_tpa` Column Missing

Issue: Shadow Mode Not Active

Issue: Period Not in Shadow Allowlist

Artifacts produced

Related docs

Supporting reference

Next steps

Support