Legacy agent_id Minting Function & Join Usage Analysis
Date: 2025-01-XXPurpose: Document current
agent_id minting function and all join dependencies before migration to agent_key
1. Legacy agent_id Minting Function
Function Location
File:repo_b/upload_to_bq.py (lines 169-198)
Function Definition
Normalization Function
Can It Be Frozen Forever?
✅ YES - Function Can Be Frozen Rationale:- Deterministic: Same input → same output (SHA256 hash)
- No External Dependencies: Pure function, no database lookups
- Stable Normalization:
normalize_agent_name()is deterministic - Backward Compatible: Existing
agent_idvalues remain valid
- Move function to
utils/agent_id_legacy.py(frozen module) - Mark as
@deprecatedwith migration path toagent_key - Keep for backward compatibility during migration
- Document that new code should use
agent_keyfromdim_agent_identity
2. agent_id Join Usage Analysis
2.1 BigQuery Views (SQL Files)
A. semantic_view_commission_traceability.sql
File: integration/bigquery/sql/views/semantic_view_commission_traceability.sql
Join Usage:
- Line 227:
PARTITION BY tenant_id, agent_id(QUALIFY deduplication) - Line 371:
AND s1c.agent_id = ah.agent_id(LEFT JOIN todim_agent_hierarchy)
- Add
agent_identityCTE join - Use
agent_keyforstage3_component_idhash - Keep
agent_idjoin for backward compatibility during transition
B. bq_schema_agents.sql
File: integration/bigquery/sql/tables/bq_schema_agents.sql
Join Usage:
- Line 108:
ON a.agent_id = c.agent_id(agent to commission join) - Line 112:
ON a.agent_id = s.agent_id(agent to snapshot join)
agent_key via dim_agent_identity
C. stage1_agent_detailed_export_vw.sql
File: integration/bigquery/sql/views/stage1_agent_detailed_export_vw.sql
Join Usage:
- Line 28:
AND payee_agent_id = @agent_id(WHERE clause filter)
@agent_id parameter to @agent_key, join via dim_agent_identity to resolve
2.2 API Query Functions (api/bigquery/queries.py)
A. get_agent_stage1_detail() (Lines 4645-5082)
Function: Get row-level Stage-1 credit detail for an agent
Join Usage:
- Line 4839:
AND s.{agent_id_col} = 721995(hardcoded Robin Bundy filter) - Line 4840:
AND s.{agent_id_col} != 668078(exclude Richard Ballard) - Line 4918-4919: Same hardcoded filters in subquery
- Line 5005-5006: Same hardcoded filters in summary query
- Line 5081-5082: Same hardcoded filters in count query
mint_agent_id_from_name()
Note: These are upstream numeric IDs from source data, not minted IDs. They should be preserved as upstream_agent_id in dim_agent_identity.
Migration Path:
- Keep hardcoded filters (they’re business requirements)
- Map to
agent_keyviadim_agent_identity.upstream_agent_id
B. get_agent_commission_with_authoritative_pepm() (Lines 6206-6271)
Function: Get agent commission with authoritative PEPM
Join Usage:
- Line 6244:
WHERE payee_agent_id IN (721995, 668078)(hardcoded numeric IDs)
upstream_agent_id
C. get_agent_metrics_range() (Lines 2923-3174)
Function: Get agent metrics for date range
Join Usage:
- No direct
agent_idjoins - Usesagent_namefor filtering
agent_name, no migration needed
D. Window Functions
Line 524:PARTITION BY agent_id, business_name, period_label
Impact: ⚠️ MEDIUM - Used in windowing for deduplication
Migration Path: Update to use agent_key in partition
2.3 API Routes (api/routes/)
A. agents.py
File: api/routes/agents.py
Join Usage:
- Routes extract
agent_namefrom JWT token - Pass
agent_nameto query functions (notagent_id)
agent_name, no direct agent_id joins
2.4 Dimension Tables
A. dim_agent_hierarchy
File: integration/bigquery/sql/tables/bq_schema_agents.sql (referenced)
Join Usage:
- Primary key:
agent_id - Used in semantic view join (line 371)
agent_id as primary key
Migration Path:
- Option 1: Add
agent_keycolumn, keepagent_idfor backward compatibility - Option 2: Create mapping view that joins
dim_agent_hierarchytodim_agent_identity
B. dim_users
File: Referenced in multiple scripts
Join Usage:
- Stores
agent_idfor user records - Used in authentication/authorization
agent_id
Migration Path: Add agent_key column, migrate gradually
3. Summary: Join Dependencies
Critical Dependencies (Must Migrate)
- ✅
semantic_view_commission_traceability- Usesagent_idin hierarchy join - ✅
bq_schema_agents.sql- Usesagent_idin multiple joins - ⚠️ Hardcoded numeric IDs (721995, 668078) - These are NOT minted IDs, preserve as
upstream_agent_id
Medium Dependencies (Should Migrate)
stage1_agent_detailed_export_vw.sql- Uses@agent_idparameter- Window functions - Use
agent_idin PARTITION BY dim_agent_hierarchy- Primary key isagent_id
Low Dependencies (No Migration Needed)
- API routes - Already use
agent_name(noagent_idjoins) - Most query functions - Filter by
agent_name, notagent_id
4. Migration Strategy
Phase 1: Freeze Legacy Function
- ✅ Move
mint_agent_id_from_name()toutils/agent_id_legacy.py - ✅ Mark as
@deprecatedwith migration notice - ✅ Document that new code should use
agent_key
Phase 2: Create dim_agent_identity
- ✅ Create table (SQL already exists)
- ✅ Populate from existing
stage3_snapshotsdata - ✅ Map existing
agent_idvalues toagent_key
Phase 3: Update Semantic View (Backward Compatible)
- Add
agent_identityCTE join - Update
stage3_component_idhash to useagent_key(fallback toagent_id) - Keep
agent_idjoin todim_agent_hierarchyfor backward compatibility - Add
agent_keyto final SELECT
Phase 4: Update Other Views/Queries
- Update
bq_schema_agents.sqljoins to useagent_key - Update window functions to use
agent_key - Update
stage1_agent_detailed_export_vw.sqlto accept@agent_key
Phase 5: Deprecate agent_id (Future)
- Mark
agent_idcolumns as deprecated - Migrate all queries to
agent_key - Remove
agent_idcolumns (breaking change, requires coordination)
5. Hardcoded Numeric IDs (721995, 668078)
Important Discovery
These numeric IDs are NOT generated bymint_agent_id_from_name(). They are:
- Upstream source IDs from payroll data
- Hardcoded business requirements (Robin Bundy = 721995, Richard Ballard = 668078)
- Used in WHERE clauses for Stage1 filtering
Handling Strategy
- Preserve in
dim_agent_identity: Store asupstream_agent_id - Keep hardcoded filters: These are business requirements, not technical artifacts
- Map to
agent_key: Create mapping so queries can useagent_keybut still filter by upstream ID when needed
Example Mapping
6. Freeze Guarantee Statement
LEGACYagent_idMINTING FUNCTION FREEZE GUARANTEE:
The functionmint_agent_id_from_name()inrepo_b/upload_to_bq.pyis FROZEN FOREVER as of [DATE]. Frozen Behavior:Migration Path:
- Same
agent_nameinput → sameagent_idoutput (deterministic SHA256 hash)- Normalization:
UPPER(TRIM(collapse_spaces(agent_name)))- Special cases:
"OWNER"→"OWNER", empty →"UNASSIGNED"- Format:
"NAME:" + SHA256(normalized)[:12]
New code MUST useagent_keyfromdim_agent_identitytable.
Legacyagent_idvalues remain valid for backward compatibility during migration period. Frozen Function Location:utils/agent_id_legacy.py(never modified)
7. Action Items
Immediate
- ✅ Document current
mint_agent_id_from_name()function - ✅ Identify all
agent_idjoin dependencies - ⏳ Create frozen function module (
utils/agent_id_legacy.py)
Short-term
- ⏳ Create
dim_agent_identitytable - ⏳ Populate with existing agents (map
agent_id→agent_key) - ⏳ Update semantic view to use
agent_key(backward compatible)
Long-term
- ⏳ Migrate all views/queries to
agent_key - ⏳ Deprecate
agent_idcolumns - ⏳ Remove
agent_iddependencies (breaking change)
Status: ✅ ANALYSIS COMPLETE - Ready for migration planning