BigQuery Contract - Payroll Commission Pipeline
Summary (3–6 bullets)
- Defines BigQuery schema, dataset, table, and view expectations for payroll analytics.
- Documents ingestion paths, ETL mappings, and downstream Excel-aligned analytics semantics.
- Lists contract-critical fields, partitioning/clustering, and source-of-truth artifacts.
- Specifies breaking vs safe change policy for schema and pipeline evolution.
- Provides implementation-level mappings from Repo B outputs into BigQuery objects.
When to use this (3–6 bullets)
- Before changing BigQuery schemas, ETL mappings, or analytics views.
- During ELT/ETL onboarding to understand required contracts.
- When validating if Repo B output changes are backward-compatible.
- During incident response for dashboard/data mismatches tied to BigQuery objects.
What you’ll walk away with (2–5 bullets)
- A concrete map of what BigQuery expects and what Repo B must preserve.
- A change-control checklist to avoid breaking dashboards and analytics.
- A fast way to trace table/view lineage and contract dependencies.
Purpose: Defines BigQuery schema expectations, data flow, and Excel dependencies to prevent breaking changes when aligning Repo B Excel outputs.
Table of Contents
- Source of Truth
- Data Flow
- Datasets
- Tables
- Views
- Excel Dependencies
- CSV/DataFrame Mappings
- Change Policy
Source of Truth
Schema Definitions
-
JSON Schemas:
integration/bigquery/schemas/*.json- Primary source for table column definitions
- Used by ETL loader for validation
- Files:
stage1_schema.json,stage3_schema.json,ceo_metrics_schema.json,growth_loss_schema.json,audit_checks_schema.json,dim_business_mapping_schema.json,dim_agent_hierarchy_schema.json
-
SQL Table Definitions:
integration/bigquery/sql/tables/*.sql- Used for table creation and schema evolution
- Files:
bq_schema_agents.sql,monthly_rollup_all.sql,business_churn_metadata.sql,period_toggle_audit.sql
-
SQL View Definitions:
integration/bigquery/sql/views/*.sql- Analytics views that align with Excel outputs
- Key views:
analytics_agent_report_v5.sql,analytics_ceo_snapshot.sql,analytics_business_growth_loss.sql
-
Configuration:
configs/bq_config.yaml- Dataset definitions, partitioning, clustering, access control
- ETL settings, data quality rules
ETL Loader
-
Main Loader:
integration/bigquery/scripts/etl_loader.pyBigQueryETLLoaderclass handles all data loading- Methods:
upload_period_data(),_upload_stage3_v10_3(),_upload_ceo_metrics_v10_3(),_upload_stage1_v10_3() - Loads from DataFrames (not Excel directly)
-
Client Initialization:
api/bigquery/client.pyget_client()function initializes BigQuery client- Uses
GCP_PROJECT_IDenvironment variable or ADC
Data Flow
Primary Path (BigQuery Ingestion)
- Repo B generates DataFrames from canonical JSON
- ETL loader (
etl_loader.py) receives DataFrames directly - Data is uploaded via
load_table_from_dataframe()(not Excel files) - CSV snapshots are used for historical backfill only
Secondary Path (Excel Outputs)
- Generated for human review and reporting
- Used for validation and reconciliation
- May inform downstream processes, but BigQuery loads from DataFrames/CSV
View Alignment
Some BigQuery views align with Excel outputs conceptually:analytics_agent_report_v5matches “V5 Excel output” (per view comment)- Views aggregate from tables, not Excel files
Datasets
1. payroll_raw (Ingestion Layer)
Purpose: Raw Stage 1 payroll data snapshotsLocation: US
Retention: 7 years (220752000000 ms)
Access: Aaron (Owner), Richard/Robin/Sheleeha (Data Viewer) Tables:
stage1_snapshots(partitioned byperiod_label, clustered bytenant_id,business_label,payee_agent_id)
2. payroll_processed (Transformation Layer)
Purpose: Stage 3 collapsed data with TPA normalizationLocation: US
Retention: 7 years
Access: Aaron (Owner), Richard/Robin/Sheleeha (Data Viewer) Tables:
stage3_snapshots(partitioned byperiod_label, clustered bytenant_id,business_name,agent_id,tpa_applied)
3. payroll_analytics (BI Layer)
Purpose: Pre-aggregated analytics and executive reportingLocation: US
Retention: 7 years
Access: Aaron (Owner), Richard/Robin/Sheleeha (Data Viewer) Tables:
ceo_metrics(partitioned byperiod_label)growth_loss(partitioned byperiod_label)audit_checks(partitioned byperiod_label)dim_business_mapping(dimension table)dim_agent_hierarchy(dimension table)dim_ingest_audit(audit dimension)agent_monthly_rollup(agent audit tables)agent_catchups(agent audit tables)agent_churn_register(agent audit tables)agent_summary(agent audit tables)monthly_rollup_all(monthly business rollup)
Tables
payroll_raw.stage1_snapshots
Purpose: Raw Stage 1 payroll data snapshots - raw commission recordsPartitioning:
period_label (DATE, DAY) - partition filter requiredClustering:
tenant_id, business_label, payee_agent_idSource: Repo B Stage 1 DataFrame or CSV snapshot Schema (from
integration/bigquery/schemas/stage1_schema.json):
| Column | Type | Mode | Description |
|---|---|---|---|
tenant_id | STRING | REQUIRED | Tenant identifier for SaaS multi-tenant isolation |
period_label | DATE | REQUIRED | Payroll period in YYYY-MM-DD format (partitioning field) |
business_label | STRING | REQUIRED | Business name from payroll data (Label column) |
member_id | STRING | REQUIRED | Unique employee identifier |
member_first_name | STRING | NULLABLE | Employee first name |
member_last_name | STRING | NULLABLE | Employee last name |
payee_agent_id | INTEGER | REQUIRED | Agent ID (721995=Robin, 668078=Richard) |
payee_agent_name | STRING | NULLABLE | Agent name |
credit | FLOAT | REQUIRED | Gross commission amount before chargebacks |
debit | FLOAT | REQUIRED | Chargeback amount (usually 0) |
total | FLOAT | REQUIRED | Net amount after chargebacks (Credit - Debit) |
period | STRING | REQUIRED | Pay period cadence (Weekly, Bi-Weekly, etc.) |
period_code | STRING | REQUIRED | Normalized period code (W, BW, SM, 4M, M) |
pay_periods | INTEGER | REQUIRED | Number of pay periods per year |
trunc2_applied | BOOLEAN | REQUIRED | Whether tunc2 truncation was applied (audit trail) |
created_at | TIMESTAMP | REQUIRED | Timestamp when record was created |
source_file | STRING | REQUIRED | Source file or process |
pipeline_version | STRING | REQUIRED | Pipeline version |
_upload_stage1_v10_3()):
Label→business_labelMember ID→member_idMember First Name→member_first_nameMember Last Name→member_last_namePayee Agent ID→payee_agent_idPayee Agent Name→payee_agent_namePeriod→period→period_code(mapped via period_map)Credit→creditDebit→debitTotal→total
payroll_processed.stage3_snapshots
Purpose: Historical Stage 3 collapsed payroll data with TPA normalizationPartitioning:
period_label (DATE, DAY) - partition filter requiredClustering:
tenant_id, business_name, agent_id, tpa_appliedSource: Repo B Stage 3 DataFrame Schema (from
integration/bigquery/schemas/stage3_schema.json):
| Column | Type | Mode | Description |
|---|---|---|---|
tenant_id | STRING | REQUIRED | Tenant identifier |
period_label | DATE | REQUIRED | Payroll period (partitioning field) |
business_name | STRING | REQUIRED | Business name after normalization |
business_id | STRING | NULLABLE | Stable business ID (MD5 hash) |
agent_id | STRING | REQUIRED | Agent identifier |
agent_name | STRING | REQUIRED | Agent full name |
downline_agent_name | STRING | NULLABLE | Downline agent name for TPA commissions |
owner_name | STRING | NULLABLE | Owner name (Robin Bundy or Richard Ballard) |
period_code | STRING | REQUIRED | Pay period cadence code (W, BW, SM, 4M, M) |
pay_periods | INTEGER | REQUIRED | Number of pay periods per year |
pepm_rate | FLOAT | NULLABLE | Per Employee Per Month rate from agent2_pepm.csv |
pepm_rate_clean | FLOAT | NULLABLE | Cleaned PEPM rate after normalization |
absorbed_count | FLOAT | REQUIRED | Number of employees for this agent-business-cadence |
tpa_per_pay_amount | FLOAT | NULLABLE | Commission per pay period (PEPM * 12 / pay_periods) |
agent_total | FLOAT | REQUIRED | Total commission for this agent-business-cadence |
tpa_applied | BOOLEAN | REQUIRED | Whether TPA normalization was applied |
match_type | STRING | NULLABLE | Type of agent matching (single_agent, multi_agent) |
match_method | STRING | NULLABLE | Method used (exact, fuzzy) |
slice_note | STRING | NULLABLE | Additional notes about commission slice |
trunc2_applied | BOOLEAN | REQUIRED | Whether tunc2 truncation was applied |
created_at | TIMESTAMP | REQUIRED | Timestamp when record was created |
source_file | STRING | REQUIRED | Source file or process |
pipeline_version | STRING | REQUIRED | Pipeline version |
commission_type | STRING | NULLABLE | Commission type: TPA, MIXED_OWNER, or Owner |
owner_residual_total | FLOAT | NULLABLE | Owner residual amount for MIXED_OWNER slices |
cadence_override_applied | BOOLEAN | NULLABLE | Whether cadence override was applied |
cadence_override_reason | STRING | NULLABLE | Reason for cadence override |
posted_date | DATE | NULLABLE | Posted date from source payroll file |
_upload_stage3_v10_3()):
business_label→business_name(ifbusiness_namenot present)employee_count→absorbed_count(ifabsorbed_countnot present)business_idgenerated viaBusinessIDGeneratorcommission_typederived fromis_tpa_commissionandowner_residual_totalpay_periodsmapped fromperiod_codeif missing:{'W': 52, 'BW': 26, 'SM': 24, '4M': 48, 'M': 12}
payroll_analytics.ceo_metrics
Purpose: Historical CEO metrics and executive KPIs by periodPartitioning:
period_label (DATE, DAY) - partition filter requiredClustering:
tenant_id, period_labelSource: Repo B CEO metrics dictionary Schema (from
integration/bigquery/schemas/ceo_metrics_schema.json):
| Column | Type | Mode | Description |
|---|---|---|---|
tenant_id | STRING | REQUIRED | Tenant identifier |
period_label | DATE | REQUIRED | Payroll period (partitioning field) |
total_businesses | INTEGER | REQUIRED | Total number of businesses with commission activity |
total_employees | INTEGER | REQUIRED | Total number of unique employees |
total_agents | INTEGER | REQUIRED | Total number of TPA agents |
commissionable_agents | INTEGER | REQUIRED | Number of commissionable agents |
gross_payout | FLOAT | REQUIRED | Total gross payout before chargebacks |
chargebacks | FLOAT | REQUIRED | Total chargebacks and refunds |
agent_payout_net | FLOAT | REQUIRED | Total TPA agent commission payout (net) |
business_owner_commission | FLOAT | REQUIRED | Business owner commission (Gross - Chargebacks - Agent Payout) |
average_commission_per_employee | FLOAT | REQUIRED | Average commission per employee (Gross / Total Employees) |
stage1_gross | FLOAT | REQUIRED | Stage 1 gross amount (for reconciliation) |
stage1_chargebacks | FLOAT | REQUIRED | Stage 1 chargebacks (for reconciliation) |
stage3_total | FLOAT | REQUIRED | Stage 3 total amount (for reconciliation) |
reconciliation_status | STRING | REQUIRED | Reconciliation status (PASS, FAIL, WARNING) |
reconciliation_delta | FLOAT | REQUIRED | Reconciliation delta amount |
trunc2_applied | BOOLEAN | REQUIRED | Whether tunc2 truncation was applied |
created_at | TIMESTAMP | REQUIRED | Timestamp when record was created |
source_file | STRING | REQUIRED | Source file or process |
pipeline_version | STRING | REQUIRED | Pipeline version |
_upload_ceo_metrics_v10_3()):
- CEO metrics dictionary keys map directly to columns
period_labelconverted to DATE format
payroll_analytics.growth_loss
Purpose: Employee gains and losses by business over timePartitioning:
period_label (DATE, DAY) - partition filter requiredClustering:
tenant_id, business_nameSource: Repo B Growth/Loss DataFrame or calculation Schema (from
integration/bigquery/schemas/growth_loss_schema.json):
| Column | Type | Mode | Description |
|---|---|---|---|
tenant_id | STRING | REQUIRED | Tenant identifier |
period_label | DATE | REQUIRED | Payroll period (partitioning field) |
business_name | STRING | REQUIRED | Business name for growth/loss analysis |
employees_gained | INTEGER | REQUIRED | Number of employees gained this period |
employees_lost | INTEGER | REQUIRED | Number of employees lost this period |
net_change | INTEGER | REQUIRED | Net change in employees (gained - lost) |
percent_change | FLOAT | REQUIRED | Percentage change in employees |
previous_employees | INTEGER | REQUIRED | Previous period employee count |
current_employees | INTEGER | REQUIRED | Current period employee count |
previous_period | DATE | NULLABLE | Previous period label for comparison |
comparison_available | BOOLEAN | REQUIRED | Whether previous period data was available |
status | STRING | REQUIRED | Growth status (GROWING, DECLINING, STABLE, NEW, CHURNED) |
trunc2_applied | BOOLEAN | REQUIRED | Whether tunc2 truncation was applied |
created_at | TIMESTAMP | REQUIRED | Timestamp when record was created |
source_file | STRING | REQUIRED | Source file or process |
pipeline_version | STRING | REQUIRED | Pipeline version |
payroll_analytics.audit_checks
Purpose: Historical audit check results and data quality validationPartitioning:
period_label (DATE, DAY) - partition filter requiredClustering:
tenant_id, check_type, statusSource: Repo B audit checks DataFrame Schema (from
integration/bigquery/schemas/audit_checks_schema.json):
| Column | Type | Mode | Description |
|---|---|---|---|
tenant_id | STRING | REQUIRED | Tenant identifier |
period_label | DATE | REQUIRED | Payroll period (partitioning field) |
check_type | STRING | REQUIRED | Type of audit check performed |
check_name | STRING | REQUIRED | Specific name of the audit check |
status | STRING | REQUIRED | Check result status (PASS, WARNING, FAIL) |
expected_value | FLOAT | NULLABLE | Expected value for the check |
actual_value | FLOAT | NULLABLE | Actual value found |
threshold | FLOAT | NULLABLE | Threshold value for the check |
variance_percent | FLOAT | NULLABLE | Percentage variance from expected value |
message | STRING | REQUIRED | Human-readable message about check result |
severity | STRING | REQUIRED | Severity level (INFO, WARNING, ERROR, CRITICAL) |
category | STRING | REQUIRED | Check category (IDENTITY, RECONCILIATION, ANOMALY, DATA_QUALITY) |
business_name | STRING | NULLABLE | Business name if check is business-specific |
agent_name | STRING | NULLABLE | Agent name if check is agent-specific |
details | STRING | NULLABLE | Additional details about check result |
trunc2_applied | BOOLEAN | REQUIRED | Whether tunc2 truncation was applied |
created_at | TIMESTAMP | REQUIRED | Timestamp when record was created |
source_file | STRING | REQUIRED | Source file or process |
pipeline_version | STRING | REQUIRED | Pipeline version |
Dimension Tables
payroll_analytics.dim_business_mapping
Purpose: Stable business ID mapping dimensionClustering:
business_id, normalized_nameSource:
BusinessIDGenerator utility
Key Columns:
business_id(STRING, REQUIRED) - Stable MD5 hash identifierbusiness_name(STRING, REQUIRED) - Business name as appears in sourcenormalized_name(STRING, REQUIRED) - Normalized name (uppercase, trimmed)is_active(BOOLEAN, REQUIRED)first_seen_date,last_seen_date(DATE, NULLABLE)
payroll_analytics.dim_agent_hierarchy
Purpose: Agent hierarchy dimension with parent-child relationshipsClustering:
agent_id, parent_agent_idSource:
agent_hierarchy.csv reference file
Key Columns:
agent_id(STRING, REQUIRED)agent_name(STRING, REQUIRED)parent_agent_id,parent_agent_name(STRING, NULLABLE)hierarchy_level(INTEGER, NULLABLE)is_root_owner(BOOLEAN, REQUIRED) - Robin/Richard flagis_metrics_only(BOOLEAN, REQUIRED) - Richard Ballard exclusion flagis_active,is_deleted(BOOLEAN, REQUIRED)effective_date,end_date(DATE, NULLABLE) - SCD Type 2
Views
payroll_analytics.analytics_agent_report_v5
Purpose: TPA Producing Agent Commissions (matches V5 Excel output)Source:
integration/bigquery/sql/views/analytics_agent_report_v5.sqlBase Tables:
stage3_snapshots, stage1_snapshots
Key Columns:
tenant_id,period_label,agent_name,business_namegross_payout(SUM ofagent_total)employee_count(from Stage 1, authoritative)record_count,avg_pepmcadence_codes,pay_periodsoverride_applied,rounding_applied(audit flags)
payroll_analytics.analytics_ceo_snapshot
Purpose: CEO snapshot view that mirrors Excel/Python logicSource:
integration/bigquery/sql/views/analytics_ceo_snapshot.sqlBase Tables:
stage1_snapshots
Key Columns:
tenant_id,period_labeltotal_businesses,total_employeesgross_payout,chargebacks,net_payout- MoM changes:
business_change_mom,employee_change_mom,gross_payout_change_mom,net_payout_change_mom - MoM percentages:
business_change_pct_mom,employee_change_pct_mom, etc.
payroll_analytics.analytics_business_growth_loss
Purpose: Business growth/loss analysisSource:
integration/bigquery/sql/views/analytics_business_growth_loss.sqlBase Tables:
stage1_snapshots
Key Columns:
tenant_id,business_name,period_labelcurrent_employees,previous_employeesnet_change,pct_change
Other Analytics Views
Located inintegration/bigquery/sql/views/:
analytics_agent_at_risk_businesses.sqlanalytics_agent_attrition_trends.sqlanalytics_agent_business_health_scores.sqlanalytics_agent_business_lifecycle.sqlanalytics_agent_business_recovery.sqlanalytics_agent_business_summary.sqlanalytics_agent_churn_metrics_v.sqlanalytics_agent_churn_summary.sqlanalytics_agent_lost_businesses.sqlanalytics_agent_portfolio_metrics.sqlanalytics_agent_trends.sqlanalytics_agent2_pepm_view.sqlanalytics_churn_metrics.sqlanalytics_owner_commission.sqlanalytics_top_agents.sqlanalytics_top_businesses.sqlbusiness_health.sqlstage1_agent_detailed_export_vw.sqltop_anomalies.sqlv_business_absent_2m.sqlv_business_full_churn.sqlv_business_permanently_lost.sqlv_business_recovered.sqlv_business_watchlist.sql
Excel Dependencies
Important Note
BigQuery does NOT directly read Excel files. Excel workbooks are generated for human consumption only. However, Excel sheet structures may inform downstream processes, so we document them for completeness.Excel Sheet Names (from repo_b/excel_contract.py)
V5 Mode Sheets
Required:Agent Commissions- Agent commission breakdownAudit View- Audit trail view
Executive Mode Sheets
Required:CEO Snapshot- Executive KPIsBusiness Summary- Business-level totalsAgent2 Breakdown- Agent commission breakdown (same as “Agent Commissions” with Posted Date)Agent2 Audit- Audit trail (same as “Audit View” with Posted Date)Full Detail- Detailed transaction data (Posted Date must be first column)
Growth/LossorGrowth & Loss- Growth/loss analysisAudit Checks- Data quality checksBusiness Analytics- Business analytics
Excel Column Requirements
Agent Commissions / Agent2 Breakdown
Required Columns (from repo_b/excel_contract.py):
business- Business nameagent_first- Agent first nameagent_last- Agent last nameagent_pepm- Agent PEPM ratepay_periods- Pay periods per yearemployee_count- Employee count (Stage 1 count)normalized_pepm- Normalized PEPM rateagent_total- Total agent commissionnote- Additional notes
Agent2 Breakdown has Posted Date as first column (added by add_posted_date_first()).
Full Detail
Required: Posted Date must be first columnOther columns: Varies based on Stage 3 DataFrame structure
CEO Snapshot
Required Columns:
Metric- Metric nameCurrent Value- Metric value
Business Summary
Required Columns:
Business Name- Business nameTotal Employees- Total employee countGross Commission- Gross commission amountChargebacks- Chargeback amountNet Commission- Net commission amount
Excel → BigQuery Mapping (Indirect)
Since BigQuery doesn’t read Excel directly, the mapping is:Agent2 BreakdownExcel sheet →build_agent2_breakdown()DataFrame → Not directly loaded (Excel is output only)- Stage 3 DataFrame (source of
Agent2 Breakdown) →stage3_snapshotstable
CSV/DataFrame Mappings
Stage 1 → stage1_snapshots
Source: Repo B Stage 1 DataFrame (from canonical JSON)ETL Method:
_upload_stage1_v10_3()
Column Mappings:
- DataFrame columns map directly to BigQuery columns
period_labelconverted to DATE format (YYYY-MM-01)member_idconverted to STRINGpay_periodsconverted to INT64period_codederived fromperiodcolumn if missing
Stage 3 → stage3_snapshots
Source: Repo B Stage 3 DataFrame (from output_adapter.py)ETL Method:
_upload_stage3_v10_3()
Column Mappings:
business_label→business_name(ifbusiness_namenot present)employee_count→absorbed_count(ifabsorbed_countnot present)business_idgenerated viaBusinessIDGenerator.get_or_create_business_id()commission_typederived fromis_tpa_commissionandowner_residual_totalpay_periodsmapped fromperiod_code:{'W': 52, 'BW': 26, 'SM': 24, '4M': 48, 'M': 12}- Currency columns cleaned (remove
$, strip spaces, convert to float)
CEO Metrics → ceo_metrics
Source: Repo B CEO metrics dictionaryETL Method:
_upload_ceo_metrics_v10_3()
Column Mappings:
- Dictionary keys map directly to BigQuery columns
period_labelconverted to DATE format
Growth/Loss → growth_loss
Source: Repo B Growth/Loss DataFrame (from repo_b/reporting/growth_analysis.py or similar)ETL Method: Not yet implemented in ETL loader (placeholder) Expected Mappings:
business_name→business_namecurrent_employees→current_employeesprevious_employees→previous_employeesnet_change→net_changepercent_change→percent_change
Change Policy
What Repo B CAN Change
✅ Safe Changes:- Add new optional Excel sheets (e.g.,
Growth & Loss,Business Analytics) - Add new columns to Excel sheets (as long as required columns remain)
- Change internal DataFrame column names (as long as ETL mapping is updated)
- Add new BigQuery tables/views (with proper schema definitions)
- Modify Excel formatting/styling (doesn’t affect BigQuery)
What Repo B MUST NOT Change
❌ Breaking Changes:-
Required Excel Sheet Names:
Agent Commissions(V5 mode) - DO NOT RENAMEAgent2 Breakdown(Executive mode) - DO NOT RENAMEFull Detail- DO NOT RENAMECEO Snapshot- DO NOT RENAMEBusiness Summary- DO NOT RENAME
-
Required Excel Columns:
- All columns listed in
REQUIRED_COLUMNSdict (repo_b/excel_contract.py) Posted Datemust remain first column inFull DetailandAgent2 Breakdown
- All columns listed in
-
BigQuery Schema:
- DO NOT remove or rename existing columns in tables
- DO NOT change column types (e.g., FLOAT → INTEGER)
- DO NOT change partitioning/clustering fields
- DO NOT remove required fields (
tenant_id,period_label,created_at, etc.)
-
Data Types:
period_labelmust remain DATE type (YYYY-MM-DD format)agent_total,gross_payout, etc. must remain FLOAT/NUMERICpay_periodsmust remain INTEGER
-
ETL Column Mappings:
- If changing DataFrame column names, MUST update ETL loader mappings
business_label→business_namemapping must be preservedemployee_count→absorbed_countmapping must be preserved
-
View Dependencies:
- Views depend on specific table columns - changes to base tables may break views
analytics_agent_report_v5expectsdownline_agent_name,agent_total,tpa_appliedanalytics_ceo_snapshotexpectsbusiness_label,member_id,credit,debit,total
Migration Strategy
If schema changes are needed:- Additive Changes Only: Add new columns, don’t remove existing ones
- Versioning: Use
pipeline_versionfield to track schema versions - Backward Compatibility: Maintain old column names alongside new ones during transition
- Documentation: Update this contract document when changes are made
- Testing: Validate ETL loader and views after schema changes
Summary: What BQ Expects / What Repo B Must Preserve
Critical Requirements
- Excel Sheet Names:
Agent Commissions,Agent2 Breakdown,Full Detail,CEO Snapshot,Business Summarymust not change - Excel Columns: Required columns in
REQUIRED_COLUMNSmust remain present - BigQuery Tables: Schema must remain stable - no column removals or type changes
- Data Types:
period_label(DATE), currency fields (FLOAT),pay_periods(INTEGER) - ETL Mappings: Column mappings in
etl_loader.pymust be maintained - Partitioning:
period_labelpartitioning must be preserved for cost control
Safe to Change
- Optional Excel sheets
- Excel formatting/styling
- Internal DataFrame column names (with ETL updates)
- New tables/views (with proper schemas)
End of Contract Document