Shadow Mode Implementation - Updated Alignment Check
Date: 2025-01-XXPurpose: Verify updated specification resolves all red flags and check for remaining issues
✅ RESOLVED RED FLAGS
RED FLAG #1: Shadow Naming Strategy ✅ RESOLVED
Status: ✅ EXPLICITLY RESOLVED Updated Spec:“Use dataset-level shadowing only.” “Shadow datasets:Verification: ✅ Spec is unambiguous and matches recommendationpayroll_raw_shadow,payroll_processed_shadow,payroll_analytics_shadow” ”❌ Do NOT suffix individual tables”
RED FLAG #2: Hardcoded Dataset Constants ✅ PARTIALLY RESOLVED
Status: ⚠️ SPEC MENTIONS SOLUTION, NEEDS IMPLEMENTATION DETAIL Updated Spec:
“Introduce a single helper (example): resolve_dataset("payroll_raw") -> "payroll_raw_shadow" if BQ_SHADOW_MODE=1”
“One canonical resolver”
“No hardcoded dataset strings elsewhere”
Verification: ✅ Spec provides clear direction, but implementation details needed:
- Exact function signature
- Where to place helper function
- How to integrate with existing
load_stage1_to_bigquery()andload_stage3_to_bigquery()calls
- All call sites use helper (not hardcoded strings)
- Helper reads
BQ_SHADOW_MODEenv var - Default behavior (shadow mode OFF) writes to prod datasets
RED FLAG #3: top_agents Dual-Table Dependency ✅ RESOLVED
Status: ✅ EXPLICITLY ADDRESSED
Updated Spec:
“Critical:Verification: ✅ Spec explicitly calls out this requirementtop_agentsshadow view must replace both:payroll_processed.stage3_snapshots,payroll_raw.stage1_snapshots”
RED FLAG #4: Period Validation ✅ RESOLVED
Status: ✅ EXPLICITLY RESOLVED Updated Spec:“IntroduceVerification: ✅ Spec provides clear solution Implementation Note: Current code has:ALLOWED_PERIODS_SHADOW” “Shadow allowlist must include:2025-06-01,2025-07-01,2025-08-01” “Shadow mode must not bypass validation silently”
RED FLAG #5: Parity Validator Dual-Dataset Queries ⚠️ NEEDS CLARIFICATION
Status: ⚠️ SCOPE CLEAR, IMPLEMENTATION PATTERN NOT SPECIFIED Updated Spec:“Compare Repo A (prod) vs Repo B (shadow)” Lists all metrics to compareVerification: ✅ Scope is comprehensive, but spec doesn’t show:
- How to construct queries with configurable dataset names
- Whether to use query builder functions or string replacement
- Example query pattern
RED FLAG #6: MoM Calculations Require Previous Period ✅ RESOLVED
Status: ✅ EXPLICITLY RESOLVED Updated Spec:
“Run Repo B shadow ingestion for: June 2025, July 2025, August 2025”
“This is required to satisfy: MoM LAG() logic in views, Growth/Loss reconciliation”
Verification: ✅ Spec explicitly includes multi-month backfill
RED FLAG #7: YTD/QTD Direct Table Queries ✅ RESOLVED
Status: ✅ EXPLICITLY ADDRESSED Updated Spec:“QTD (June–August)” listed in parity validator scope Implies validator will query shadow tables directly for QTDVerification: ✅ QTD validation included in scope Implementation Note: QTD for August 2025 should aggregate:
- Start:
2025-07-01(Q3 start) - End:
2025-08-31(August end) - Baseline:
2025-07-01(for snapshot comparison)
RED FLAG #8: tpa_applied Column ✅ VERIFIED IN CODE
Status: ✅ CODE VERIFIES COLUMN EXISTS
Code Evidence:
tpa_applied from is_tpa column
Action Required: Verify is_tpa column exists in Repo B Stage3 CSV outputs
🔴 REMAINING RED FLAGS
RED FLAG #9: QTD Baseline Calculation for Multi-Month Validation
Issue: Spec mentions “QTD (June–August)” but QTD logic depends on selected period:- If validating August 2025 QTD: baseline = July 1, current = August 31
- If validating July 2025 QTD: baseline = July 1, current = July 31
- If validating June 2025 QTD: baseline = April 1 (Q2), current = June 30
- Parity validator needs to understand QTD baseline varies by selected period
- For August 2025 QTD validation, need July 1 baseline (already covered by multi-month backfill)
- For June 2025 QTD validation, need April 1 baseline (NOT in shadow backfill scope)
- Clarify: Are we validating QTD for each month separately (June QTD, July QTD, August QTD)?
- OR: Are we validating Q3 QTD only (aggregating June–August)?
- Current spec says “QTD (June–August)” which suggests Q3 aggregation, but needs clarification
RED FLAG #10: Growth/Loss QTD/YTD Uses Snapshot Comparison
Issue:get_growth_loss_summary_snapshot() (for QTD/YTD) uses snapshot comparison:
- Current snapshot: Selected period end
- Baseline snapshot: Quarter start (for QTD) or Jan 1 (for YTD)
- QTD Growth/Loss for August 2025 compares:
- Current: August 31 snapshot
- Baseline: July 1 snapshot
- This requires July shadow data (✅ covered)
- But also requires understanding that QTD uses snapshot comparison, not cumulative aggregation
- Parity validator should use
get_growth_loss_summary_snapshot()logic for QTD validation - Document that QTD Growth/Loss uses snapshot comparison (not cumulative)
RED FLAG #11: View DDL Storage Location
Issue: Spec says:
“Store shadow view SQL in: integration/bigquery/sql/views_shadow/”
Verification: ✅ Clear location specified
Action Required: Create directory structure if it doesn’t exist
⚠️ WARNINGS (Non-Blocking)
WARNING #1: Project ID Hardcoding
Issue: All view SQL hardcodespayroll-bi-gauntlet project ID
Impact: Shadow views will also hardcode project ID (acceptable for Phase 1)
Recommendation: Acceptable for shadow mode. Consider parameterization in future.
WARNING #2: Period Format Consistency
Issue: Spec uses both formats:2025-06-01(with day)2025-06(without day)
YYYY-MM-01 format consistently (matches BigQuery DATE partitioning)
WARNING #3: QTD Quarter Definition
Issue: Q3 2025 = July, August, September (not June, July, August) Impact:- If validating “QTD (June–August)”, this is NOT a standard quarter
- Standard Q3 = July–September
- June is end of Q2
- Are we validating Q2 QTD (April–June) + Q3 QTD (July–September)?
- OR: Are we validating a custom 3-month period (June–August)?
📋 UPDATED PRE-IMPLEMENTATION CHECKLIST
- ✅ Shadow naming strategy clarified (dataset-level)
- ✅ Multi-month scope defined (June, July, August)
- ✅
top_agentsdual-table dependency acknowledged - ✅ Shadow allowlist specified
- ⚠️ Clarify QTD validation scope (standard quarter vs custom period)
- ⚠️ Verify
is_tpacolumn exists in Repo B Stage3 CSV - ⚠️ Design query builder pattern for parity validator (or use string replacement)
- ⚠️ Create
views_shadow/directory structure - ⚠️ Document QTD Growth/Loss snapshot comparison logic
✅ ALIGNMENT SUMMARY
Overall Assessment: Updated specification resolves 6 of 8 original red flags explicitly. 2 red flags need minor clarification (QTD scope, query builder pattern).Resolved (6):
- ✅ Shadow naming strategy (dataset-level)
- ✅ Hardcoded constants (helper function specified)
- ✅
top_agentsdual-table dependency - ✅ Period validation (shadow allowlist)
- ✅ MoM previous period requirement (multi-month backfill)
- ✅ YTD/QTD scope (included in validator)
Needs Clarification (2):
- ⚠️ QTD validation scope (standard quarter vs custom June–August period)
- ⚠️ Parity validator query builder pattern (implementation detail)
New Red Flags (1):
- 🔴 QTD baseline calculation for multi-month validation
🎯 READINESS ASSESSMENT
Status: ✅ MOSTLY READY FOR IMPLEMENTATION Blockers: None (all can be resolved during implementation) Recommendations:- Clarify QTD validation scope before writing parity validator SQL
- Verify
is_tpacolumn exists in Repo B outputs before shadow backfill - Design query builder helper early (before writing validator)
Next Step: Proceed with implementation, addressing clarifications as they arise