Stage3 Insert Root Cause — Partition Window
Evidence (from scripts/repro_stage3_insert_error.py)
first_5_errors (first error):
| Field | Value |
|---|---|
| index | 0 |
| location | (empty) |
| reason | invalid |
| message | Value 47117 for partition column period_label of the destination table payroll-bi-gauntlet:payroll_processed.stage3_snapshots_v2 is outside the allowed bounds. You can only stream to partitions of date range within 3650 days in the past and 366 days in the future relative to the current date. Integer value mapping to a BigQuery date column has to be number of days since Unix Epoch time, please double check if your input data is valid. |
period_label (DATE, partition column)
Root Cause
stage3_snapshots_v2 is partitioned by period_label. BigQuery streaming insert (insert_rows_json) to partitioned tables enforces:
- 3650 days in the past
- 366 days in the future
period_label=2099-01-01. That date is beyond the 366‑day future limit, so BQ rejects the streaming insert.
Category
| Category | Example |
|---|---|
| Date mismatch | 2099-01-01 outside BQ streaming window |
YYYY-MM-DD) is correct; the value is outside the allowed partition range.
Fix Options
- Use period within bounds for smoke – Default
PeriodLabelto something like2026-02-01in the smoke script. Requires a batch with data for that period. - Reject out-of-window in API – Add validation in
write_stage3_rowsand returnSTAGE3_WRITE_FAILEDwith a clear message before calling BQ. - Batch load for out-of-window – Use a BQ load job instead of streaming when the period is out of window. More complex and out of scope for the initial fix.
Minimal Fix (Recommended)
- Add validation in
stage3_queries.pythat rejectsperiod_labeloutside the BQ streaming window and raises a descriptiveSTAGE3_WRITE_FAILEDerror. - Update
pr3d_deploy_smoke.ps1defaultPeriodLabelto2026-02-01(or another date within 366 days of today). - Document this constraint in the smoke script.