Skip to main content

Phase 1.3 Production Population Runbook

Purpose: Populate Phase 1.3 identity resolution tables (dim_agent_identity_admin, dim_agent_pepm_agent_map) in production using admin API endpoints. Prerequisites:
  • Phase 1.3 code deployed to production
  • Admin/CEO role access to /api/v1/admin/identity/* endpoints
  • BigQuery access to payroll-bi-gauntlet project
  • JWT authentication token with admin/CEO role
Last Updated: 2025-12-31

Overview

Phase 1.3 introduces two admin-managed tables for identity resolution:
  1. dim_agent_pepm_agent_map: Maps PEPM sheet names → agent_key (one-time mapping)
  2. dim_agent_identity_admin: Admin overrides for identity resolution (optional, as-needed)
This runbook covers the initial population of PEPM mappings. Identity admin overrides are optional and can be created later as needed.

Step 1: Derive Tenant-Relevant PEPM Pairs

Goal: Identify PEPM agent/business combinations that actually appear in production data for the tenant. Query stage3_snapshots for the tenant and last closed period to find agent/business pairs that need PEPM mappings:
-- Query to generate PEPM mapping candidates
SELECT DISTINCT
  UPPER(TRIM(COALESCE(downline_agent_name, agent_name))) AS pepm_agent_name_norm,
  UPPER(TRIM(business_name)) AS pepm_business_name_norm,
  agent_id,
  COUNT(*) AS row_count,
  SUM(agent_total) AS total_commission
FROM `payroll-bi-gauntlet.payroll_processed.stage3_snapshots`
WHERE tenant_id = @tenant_id
  AND period_label = @period_label  -- Use last closed period (e.g., DATE('2025-11-01'))
  AND tpa_applied = TRUE
GROUP BY 
  pepm_agent_name_norm,
  pepm_business_name_norm,
  agent_id
ORDER BY pepm_agent_name_norm, pepm_business_name_norm
Parameters:
  • @tenant_id: 'creative_benefit_strategies'
  • @period_label: Last closed period (e.g., DATE('2025-11-01'))
Output: List of (pepm_agent_name_norm, pepm_business_name_norm, agent_id) tuples that need mappings.

Option B: Query Multiple Recent Periods

For more comprehensive coverage, query multiple recent periods:
SELECT DISTINCT
  UPPER(TRIM(COALESCE(downline_agent_name, agent_name))) AS pepm_agent_name_norm,
  UPPER(TRIM(business_name)) AS pepm_business_name_norm,
  agent_id,
  COUNT(DISTINCT period_label) AS periods_seen,
  MIN(period_label) AS first_seen,
  MAX(period_label) AS last_seen
FROM `payroll-bi-gauntlet.payroll_processed.stage3_snapshots`
WHERE tenant_id = @tenant_id
  AND period_label >= DATE('2025-09-01')  -- Last 3 months
  AND tpa_applied = TRUE
GROUP BY 
  pepm_agent_name_norm,
  pepm_business_name_norm,
  agent_id
ORDER BY pepm_agent_name_norm, pepm_business_name_norm

Step 2: Resolve agent_key for Each PEPM Pair

Goal: For each PEPM pair, determine the correct agent_key from dim_agent_identity.

Query dim_agent_identity to Find Matching agent_key

-- Resolve agent_key from agent_id (from Stage3)
SELECT 
  agent_key,
  agent_id,
  display_name,
  UPPER(display_name) AS display_name_upper
FROM `payroll-bi-gauntlet.payroll_analytics.dim_agent_identity`
WHERE tenant_id = @tenant_id
  AND is_active = TRUE
  AND agent_id = @agent_id  -- From Stage3 query
Note: If agent_id doesn’t match exactly, you may need to:
  • Check display_name matches (case-insensitive)
  • Check aliases array for matches
  • Manually resolve ambiguous cases
Manual Resolution Required: For each PEPM pair, document:
  • pepm_agent_name_norm: Normalized PEPM agent name
  • pepm_business_name_norm: Normalized PEPM business name
  • agent_key: Resolved agent_key from dim_agent_identity
  • resolved_agent_id: Optional, informational only (can be NULL)
  • match_source: 'initial_mapping' (or 'admin_override' if manually corrected)

Step 3: Generate Mapping Candidates (Read-Only)

Goal: Create a review document with all mapping candidates before applying. Create a CSV or JSON file with candidate mappings:
[
  {
    "pepm_agent_name_norm": "KENNY YOUNG",
    "pepm_business_name_norm": "ELITE AT WHITESTONE GOLF CLUB LLC",
    "agent_key": "kenny_young",
    "resolved_agent_id": null,
    "match_source": "initial_mapping",
    "effective_start_date": "2024-01-01",
    "reason": "Initial Phase 1.3 production mapping - derived from Stage3 snapshots for tenant creative_benefit_strategies, period 2025-11-01"
  },
  {
    "pepm_agent_name_norm": "KENNY YOUNG",
    "pepm_business_name_norm": "WELLS CUTTING EDGE LLC",
    "agent_key": "kenny_young",
    "resolved_agent_id": null,
    "match_source": "initial_mapping",
    "effective_start_date": "2024-01-01",
    "reason": "Initial Phase 1.3 production mapping - derived from Stage3 snapshots for tenant creative_benefit_strategies, period 2025-11-01"
  }
  // ... more mappings
]
Review Checklist:
  • All PEPM pairs have valid agent_key values
  • pepm_agent_name_norm matches normalization: UPPER(TRIM(CONCAT(agent_first_name, ' ', agent_last_name)))
  • pepm_business_name_norm matches normalization: UPPER(TRIM(business_name))
  • effective_start_date is set to earliest period (e.g., 2024-01-01)
  • reason field is non-empty and descriptive
  • No duplicate (pepm_agent_name_norm, pepm_business_name_norm) pairs

Step 4: Admin Review and Approval

Goal: Review mapping candidates before applying to production. Review Process:
  1. Spot-check 5-10 mappings against source PEPM data
  2. Verify agent_key resolution is correct
  3. Confirm no ambiguous mappings (one agent_key per PEPM pair)
  4. Approve mapping list for production
Approval Artifact: Signed-off mapping candidate file (CSV/JSON) with reviewer name and timestamp.

Step 5: Apply Mappings via Admin API

Goal: Create PEPM mappings using /api/v1/admin/identity/pepm-map endpoint.

Endpoint Details

URL: POST /api/v1/admin/identity/pepm-map Authentication: JWT token with admin or ceo role Request Body Schema (PEPMAgentMapCreate):
{
  "pepm_agent_name_norm": "KENNY YOUNG",
  "pepm_business_name_norm": "ELITE AT WHITESTONE GOLF CLUB LLC",
  "agent_key": "kenny_young",
  "resolved_agent_id": null,
  "match_source": "initial_mapping",
  "effective_start_date": "2024-01-01",
  "effective_end_date": null,
  "is_active": true,
  "reason": "Initial Phase 1.3 production mapping - derived from Stage3 snapshots for tenant creative_benefit_strategies, period 2025-11-01",
  "notes": null
}
Required Fields:
  • pepm_agent_name_norm: STRING (normalized: UPPER(TRIM(CONCAT(agent_first_name, ' ', agent_last_name))))
  • pepm_business_name_norm: STRING (normalized: UPPER(TRIM(business_name)))
  • agent_key: STRING (must exist in dim_agent_identity)
  • match_source: STRING ('admin_override' | 'initial_mapping' | 'auto_resolved')
  • effective_start_date: DATE (e.g., "2024-01-01")
  • reason: STRING (min_length=1, required for audit)
Optional Fields:
  • resolved_agent_id: STRING (informational only, can be NULL)
  • effective_end_date: DATE (NULL = active)
  • is_active: BOOL (default: true)
  • notes: STRING (optional)
Response Schema (PEPMAgentMapResponse):
{
  "override_id": "2fc4e079-c26d-4c7e-a553-16f7dc6a4257",
  "tenant_id": "creative_benefit_strategies",
  "pepm_agent_name_norm": "KENNY YOUNG",
  "pepm_business_name_norm": "ELITE AT WHITESTONE GOLF CLUB LLC",
  "agent_key": "kenny_young",
  "resolved_agent_id": null,
  "match_source": "initial_mapping",
  "effective_start_date": "2024-01-01",
  "effective_end_date": null,
  "is_active": true,
  "reason": "Initial Phase 1.3 production mapping...",
  "created_at": "2025-12-31T00:00:00",
  "updated_at": null,
  "created_by": "admin@example.com",
  "updated_by": null,
  "notes": null
}
Critical: Save override_id from each response for audit trail.

Example cURL Request

curl -X POST "https://api.example.com/api/v1/admin/identity/pepm-map" \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "pepm_agent_name_norm": "KENNY YOUNG",
    "pepm_business_name_norm": "ELITE AT WHITESTONE GOLF CLUB LLC",
    "agent_key": "kenny_young",
    "resolved_agent_id": null,
    "match_source": "initial_mapping",
    "effective_start_date": "2024-01-01",
    "effective_end_date": null,
    "is_active": true,
    "reason": "Initial Phase 1.3 production mapping - derived from Stage3 snapshots for tenant creative_benefit_strategies, period 2025-11-01",
    "notes": null
  }'

Batch Application Script

For large numbers of mappings, create a script to apply them sequentially:
import requests
import json
import time

API_BASE_URL = "https://api.example.com"
JWT_TOKEN = "your_jwt_token_here"

headers = {
    "Authorization": f"Bearer {JWT_TOKEN}",
    "Content-Type": "application/json"
}

# Load approved mappings
with open("approved_mappings.json", "r") as f:
    mappings = json.load(f)

created_mappings = []
failed_mappings = []

for mapping in mappings:
    try:
        response = requests.post(
            f"{API_BASE_URL}/api/v1/admin/identity/pepm-map",
            headers=headers,
            json=mapping
        )
        response.raise_for_status()
        
        result = response.json()
        created_mappings.append({
            "pepm_pair": f"{mapping['pepm_agent_name_norm']} @ {mapping['pepm_business_name_norm']}",
            "override_id": result["override_id"],
            "agent_key": result["agent_key"]
        })
        print(f"✅ Created: {mapping['pepm_agent_name_norm']} @ {mapping['pepm_business_name_norm']} -> {result['override_id']}")
        
        # Rate limiting: 1 request per second
        time.sleep(1)
        
    except requests.exceptions.HTTPError as e:
        error_detail = e.response.json().get("detail", str(e))
        failed_mappings.append({
            "mapping": mapping,
            "error": error_detail
        })
        print(f"❌ Failed: {mapping['pepm_agent_name_norm']} @ {mapping['pepm_business_name_norm']}: {error_detail}")

# Save audit artifact
audit_artifact = {
    "timestamp": datetime.now().isoformat(),
    "created_count": len(created_mappings),
    "failed_count": len(failed_mappings),
    "created_mappings": created_mappings,
    "failed_mappings": failed_mappings
}

with open(f"phase1_3_population_audit_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json", "w") as f:
    json.dump(audit_artifact, f, indent=2)

Step 6: Optional - Create Identity Admin Overrides

Goal: Create identity admin overrides if needed (optional, as-needed basis).

Endpoint Details

URL: POST /api/v1/admin/identity/agent Request Body Schema (AgentIdentityAdminCreate):
{
  "canonical_agent_id": "721995",
  "agent_key": "robin_bundy",
  "display_name": "Robin Bundy",
  "upstream_agent_id": "721995",
  "effective_start_date": "2024-01-01",
  "effective_end_date": null,
  "is_active": true,
  "reason": "Admin override: Robin Bundy identity resolution override",
  "notes": null
}
Required Fields:
  • canonical_agent_id: STRING (canonical agent_id this override resolves to)
  • display_name: STRING (canonical display name, normalized: UPPER, TRIM)
  • effective_start_date: DATE
  • reason: STRING (min_length=1, required for audit)
Optional Fields:
  • agent_key: STRING (optional, may be set by admin override)
  • upstream_agent_id: STRING (compatibility mirror of canonical_agent_id)
  • effective_end_date: DATE (NULL = active)
  • is_active: BOOL (default: true)
  • notes: STRING
Response Schema (AgentIdentityAdminResponse):
{
  "override_id": "uuid-here",
  "tenant_id": "creative_benefit_strategies",
  "canonical_agent_id": "721995",
  "agent_key": "robin_bundy",
  "display_name": "Robin Bundy",
  "upstream_agent_id": "721995",
  "effective_start_date": "2024-01-01",
  "effective_end_date": null,
  "is_active": true,
  "reason": "Admin override: Robin Bundy identity resolution override",
  "created_at": "2025-12-31T00:00:00",
  "updated_at": null,
  "created_by": "admin@example.com",
  "updated_by": null,
  "source": "admin",
  "notes": null
}
Note: Identity admin overrides are optional. Only create if you need to override default identity resolution behavior.

Step 7: Capture Audit Artifacts

Goal: Document all created mappings for audit trail.

Required Artifacts

  1. Mapping Creation Log:
    • Filename: phase1_3_population_audit_YYYYMMDD_HHMMSS.json
    • Location: integration/bigquery/outputs/phase1_3_population/
    • Contents:
      {
        "timestamp": "2025-12-31T00:00:00Z",
        "tenant_id": "creative_benefit_strategies",
        "period_label": "2025-11-01",
        "created_by": "admin@example.com",
        "mapping_count": 50,
        "mappings": [
          {
            "override_id": "uuid-1",
            "pepm_agent_name_norm": "KENNY YOUNG",
            "pepm_business_name_norm": "ELITE AT WHITESTONE GOLF CLUB LLC",
            "agent_key": "kenny_young",
            "reason": "Initial Phase 1.3 production mapping..."
          }
          // ... more mappings
        ]
      }
      
  2. Approval Document:
    • Filename: phase1_3_mapping_approval_YYYYMMDD_HHMMSS.md
    • Contents: Reviewer name, approval timestamp, mapping candidate summary
  3. Validation Results (see Step 8)

Step 8: Run Production Validation

Goal: Verify all validation gates pass after population.

Execute Validation SQL

bq query --location=US --use_legacy_sql=false \
  --parameter=tenant_id:STRING:creative_benefit_strategies \
  --parameter=period_label:STRING:2025-11-01 \
  --format=prettyjson \
  < integration/bigquery/sql/validation/phase1_3_identity_resolution_validation.sql \
  > integration/bigquery/outputs/phase1_3_validation/phase1_3_prod_validation_YYYYMMDD_HHMMSS.json

Expected Validation Results

All validation gates must return 0:
[
  {
    "validation_name": "A1: Unmapped Agent2 PEPM rows",
    "unmapped_count": "0"
  },
  {
    "validation_name": "B1: Overlapping overrides (dim_agent_identity_admin)",
    "overlap_count": "0"
  },
  {
    "validation_name": "C1: Overlapping overrides (dim_agent_pepm_agent_map)",
    "overlap_count": "0"
  },
  {
    "validation_name": "D1: Ambiguous resolution (dim_agent_pepm_agent_map)",
    "ambiguous_count": "0"
  },
  {
    "validation_name": "D2: Ambiguous resolution (dim_agent_identity_admin)",
    "ambiguous_count": "0"
  },
  {
    "validation_name": "E1: Missing agent_key references (dim_agent_pepm_agent_map)",
    "missing_count": "0"
  },
  {
    "validation_name": "F1: Validation Summary",
    "unmapped_pepm_count": "0",
    "identity_overlap_count": "0",
    "pepm_overlap_count": "0",
    "ambiguous_pepm_count": "0",
    "ambiguous_identity_count": "0",
    "missing_fk_count": "0"
  }
]

Step 9: Rollback/Closure Procedure

Goal: Close (not delete) incorrect mappings using end-exclusive window logic.

Close a PEPM Mapping

Endpoint: PUT /api/v1/admin/identity/pepm-map Request Body Schema (PEPMAgentMapUpdate):
{
  "override_id": "2fc4e079-c26d-4c7e-a553-16f7dc6a4257",
  "effective_end_date": "2025-12-01",
  "is_active": false,
  "notes": "Closed: Incorrect mapping, replaced with override_id abc-123"
}
Required Fields:
  • override_id: STRING (UUID from original POST response)
Optional Fields:
  • effective_end_date: DATE (set to closure date, end-exclusive)
  • is_active: BOOL (set to false to mark inactive)
  • notes: STRING (explanation for closure)
End-Exclusive Logic:
  • Setting effective_end_date = '2025-12-01' means the mapping is active for effective_start_date <= as_of_date < '2025-12-01'
  • For as_of_date = '2025-12-01' or later, the mapping is not active

Close an Identity Admin Override

Endpoint: PUT /api/v1/admin/identity/agent Request Body Schema (AgentIdentityAdminUpdate):
{
  "override_id": "uuid-here",
  "effective_end_date": "2025-12-01",
  "is_active": false,
  "notes": "Closed: Override no longer needed"
}
Note: Do NOT delete rows. Always use UPDATE to set effective_end_date and is_active = false. This preserves audit trail and allows overlap validation to work correctly.

Pass/Fail Criteria

Pass Conditions (All Required)

  1. All validation gates = 0:
    • unmapped_pepm_count = 0 (all tenant-relevant PEPM rows mapped)
    • identity_overlap_count = 0 (no overlapping admin overrides)
    • pepm_overlap_count = 0 (no overlapping PEPM mappings)
    • ambiguous_pepm_count = 0 (unambiguous PEPM resolution)
    • ambiguous_identity_count = 0 (unambiguous identity resolution)
    • missing_fk_count = 0 (all agent_key references valid)
  2. Non-empty data present:
    • At least 5 PEPM mappings created
    • All mappings have valid agent_key values
    • All mappings have non-empty reason fields
  3. Audit artifacts retained:
    • Mapping creation log with override_ids
    • Approval document
    • Validation results JSON
  4. No errors during API calls or validation SQL execution

Fail Conditions

  • Any validation count > 0
  • Missing override_ids in audit log
  • API errors during mapping creation
  • SQL execution errors during validation

Artifacts to Retain for Audit

  1. Mapping Creation Log (phase1_3_population_audit_YYYYMMDD_HHMMSS.json):
    • All override_ids from POST responses
    • Mapping details (pepm pairs, agent_keys, reasons)
    • Timestamp and created_by user
  2. Approval Document (phase1_3_mapping_approval_YYYYMMDD_HHMMSS.md):
    • Reviewer name and email
    • Approval timestamp
    • Mapping candidate summary (count, sample pairs)
  3. Validation Results (phase1_3_prod_validation_YYYYMMDD_HHMMSS.json):
    • Full validation SQL output
    • All validation gate results
    • Execution timestamp
  4. Rollback Log (if any closures performed):
    • override_ids closed
    • Closure dates and reasons
    • updated_by user
Retention: All artifacts should be retained in integration/bigquery/outputs/phase1_3_population/ for audit compliance.

Follow-Up Tasks (If Discovered)

Note: If any endpoint behavior is missing or incorrect, document here:
  • List Endpoints: Consider adding GET /api/v1/admin/identity/pepm-map/list and GET /api/v1/admin/identity/agent/list endpoints to query all mappings for a tenant (with optional filters). Currently, users must query BigQuery directly to list mappings. This is not blocking for initial population but would improve operational visibility.

Troubleshooting

Error: “Overlapping effective date window detected”

Cause: Another active mapping exists for the same (pepm_agent_name_norm, pepm_business_name_norm) with overlapping effective dates. Resolution:
  1. Query existing mappings: SELECT * FROM dim_agent_pepm_agent_map WHERE tenant_id = '...' AND pepm_agent_name_norm = '...' AND pepm_business_name_norm = '...'
  2. Close the existing mapping first (PUT with effective_end_date)
  3. Then create the new mapping

Error: “agent_key does not exist in dim_agent_identity”

Cause: The agent_key value doesn’t exist in dim_agent_identity for the tenant. Resolution:
  1. Verify agent_key exists: SELECT * FROM dim_agent_identity WHERE tenant_id = '...' AND agent_key = '...'
  2. If missing, check if agent_id matches and resolve correct agent_key
  3. Update mapping request with correct agent_key

Querying Existing Mappings

Note: There is no list endpoint for querying all mappings. Use BigQuery directly:
-- List all PEPM mappings for a tenant
SELECT 
  override_id,
  pepm_agent_name_norm,
  pepm_business_name_norm,
  agent_key,
  effective_start_date,
  effective_end_date,
  is_active,
  reason,
  created_at,
  created_by
FROM `payroll-bi-gauntlet.payroll_analytics.dim_agent_pepm_agent_map`
WHERE tenant_id = 'creative_benefit_strategies'
ORDER BY pepm_agent_name_norm, pepm_business_name_norm, effective_start_date DESC
-- List all identity admin overrides for a tenant
SELECT 
  override_id,
  canonical_agent_id,
  agent_key,
  display_name,
  effective_start_date,
  effective_end_date,
  is_active,
  reason,
  created_at,
  created_by
FROM `payroll-bi-gauntlet.payroll_analytics.dim_agent_identity_admin`
WHERE tenant_id = 'creative_benefit_strategies'
ORDER BY canonical_agent_id, effective_start_date DESC

Error: “reason field is required and must be non-empty”

Cause: reason field is missing or empty string. Resolution: Ensure reason field is present and has at least 1 character.
  • Phase 1.3 Validation Plan: integration/bigquery/sql/validation/phase1_3_identity_resolution_validation.sql
  • Admin Identity API Schemas: api/schemas/admin_identity.py
  • Admin Identity Endpoints: api/routes/admin_identity.py