Member-ID-Based MIXED_CADENCE Normalization Implementation Summary

Date: 2025-01-XX
Status: ✅ COMPLETE

Implementation Checklist

✅ 1. Cadence Inference Key

  • Uses Member ID (NOT member_key) - Line 245: member_counts = group_df["member_id"].value_counts()
  • Scope: (business, agent, Member ID, month) - Line 461: Called inside agent loop with agent_group_df
  • Guardrail: Line 254: Assert "member_id" in group_df.columns
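
The keying rule above can be illustrated with a small, pandas-free sketch. It assumes rows are plain dicts; the helper name member_id_counts and the sample rows are hypothetical and are not taken from output_adapter.py, which uses group_df["member_id"].value_counts().

```python
from collections import Counter


def member_id_counts(rows):
    """Count rows per member_id for one agent group (hypothetical helper)."""
    for row in rows:
        # Guardrail in the spirit of the assertion described above:
        # fail loudly if the member_id column is missing.
        assert "member_id" in row, "member_id is required for cadence inference"
    # Key on member_id only; member_key is deliberately ignored.
    return Counter(row["member_id"] for row in rows)


# Hypothetical rows for a single (business, agent, month) group.
agent_group = [
    {"member_id": "M001", "member_key": "k1"},
    {"member_id": "M001", "member_key": "k2"},
    {"member_id": "M002", "member_key": "k3"},
]
counts = member_id_counts(agent_group)
```

Here M001 appears twice even though its member_key values differ, which is the distinction the checklist item is guarding.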

✅ 2. Cadence Inference Rules

  • 1 row per Member ID → Monthly (12 PPY) - Lines 272-275
  • 2 rows per Member ID → Bi-weekly (26 PPY) - Lines 276-279
  • 4 rows per Member ID → 4×/month (48 PPY) or Weekly (52 PPY) - Lines 280-314
  • >4 rows per Member ID → Weekly (52 PPY) - Lines 280-314
  • posted_date spacing used when available - Lines 283-307
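
A minimal sketch of the rules above, not the code at Lines 272-314. Several details are assumptions made only for illustration: the function name infer_pay_periods, the label strings, the test that uniform 7-day posted_date gaps mean weekly, the default of 48 when dates are unavailable, and raising an error for row counts the summary does not cover (such as 3).

```python
from datetime import date


def infer_pay_periods(row_count, posted_dates=None):
    """Map rows-per-Member-ID to (pay periods per year, label)."""
    if row_count == 1:
        return 12, "monthly"
    if row_count == 2:
        return 26, "bi-weekly"
    if row_count == 4:
        if posted_dates and len(posted_dates) == 4:
            ordered = sorted(posted_dates)
            gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
            # Assumption: uniform 7-day spacing indicates a weekly schedule.
            if all(gap == 7 for gap in gaps):
                return 52, "weekly"
        # Assumption: fall back to 4x/month when spacing is missing or irregular.
        return 48, "4x/month"
    if row_count > 4:
        return 52, "weekly"
    # The summary does not define other counts, so the sketch refuses to guess.
    raise ValueError(f"no cadence rule for row_count={row_count}")


weekly_dates = [date(2025, 9, 1), date(2025, 9, 8), date(2025, 9, 15), date(2025, 9, 22)]
```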

✅ 3. Commission Math (ROW-BASED ONLY)

  • row_count_basis = len(agent_group_df) - Line 478
  • employee_count = agent_group_df["member_id"].nunique() - Line 479 (KPI ONLY)
  • normalized_rate = (PEPM_monthly * 12) / inferred_pay_periods - Line 497
  • agent_commission = normalized_rate * row_count_basis - Line 504
  • All math in Decimal with CENT quantization - Lines 494, 497, 504
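
The two formulas above can be worked through with Python's decimal module. This is a sketch under stated assumptions: the rounding mode (ROUND_HALF_UP), quantizing after each step, and the sample inputs (a 10.00 monthly PEPM, 26 pay periods, 5 rows) are illustrative choices, not values confirmed by the source.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def normalized_rate(pepm_monthly, inferred_pay_periods):
    """(PEPM_monthly * 12) / inferred_pay_periods, quantized to cents."""
    annual = pepm_monthly * 12
    # Assumption: half-up rounding; the real rounding mode is not stated.
    return (annual / inferred_pay_periods).quantize(CENT, rounding=ROUND_HALF_UP)


def agent_commission(rate, row_count_basis):
    """normalized_rate * row_count_basis; rows, never distinct employees."""
    return (rate * row_count_basis).quantize(CENT, rounding=ROUND_HALF_UP)


# Hypothetical inputs: 120.00 per year spread over 26 pay periods.
rate = normalized_rate(Decimal("10.00"), 26)
total = agent_commission(rate, 5)
```

With these inputs the per-period rate is 4.62 and the commission over 5 rows is 23.10. Multiplying by employee_count instead would undercount whenever a Member ID appears on more than one row.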

✅ 4. Source of Truth

  • PEPM_monthly from agent2_pepm.csv - Line 493: pepm_rate_raw = agent_row["pepm_rate"]
  • Decimal parsing via _d_raw() - Line 494

✅ 5. Required Audit Fields

All fields persisted on agent rows (Lines 538-550):
  • inferred_pay_periods_per_year (int)
  • inferred_cadence_label (string)
  • mixed_cadence_flag (bool)
  • source_pepm_monthly (CENT string)
  • rate_per_pay_period (CENT string) - alias for normalized_rate
  • normalized_rate_decimal (Decimal)
  • row_count_basis (int)
  • member_id_duplication_map (JSON: Member ID → count) - Lines 253, 518, 550
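
For orientation, one audit record with the fields listed above might be assembled as follows. The helper name build_audit_fields, its argument order, and the choice to serialize the map with sorted keys are assumptions for this sketch, not a description of Lines 538-550.

```python
import json
from decimal import Decimal

CENT = Decimal("0.01")


def build_audit_fields(pepm_monthly, rate, pay_periods, label, mixed,
                       row_count_basis, duplication_map):
    """Assemble the audit fields for one agent row (hypothetical helper)."""
    return {
        "inferred_pay_periods_per_year": int(pay_periods),
        "inferred_cadence_label": str(label),
        "mixed_cadence_flag": bool(mixed),
        # CENT strings: Decimals quantized to two places, then stringified.
        "source_pepm_monthly": str(pepm_monthly.quantize(CENT)),
        "rate_per_pay_period": str(rate.quantize(CENT)),
        "normalized_rate_decimal": rate,
        "row_count_basis": int(row_count_basis),
        # JSON object mapping Member ID to its row count.
        "member_id_duplication_map": json.dumps(duplication_map, sort_keys=True),
    }


fields = build_audit_fields(
    Decimal("10"), Decimal("4.62"), 26, "bi-weekly", False, 3,
    {"M001": 2, "M002": 1},
)
```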

✅ 6. Guardrails

  • Period label disagreement → mixed_cadence_flag=True - Lines 469-471
  • Assertion: employee_count prevention - Lines 481-490
  • Assertion: cadence inference per Member ID - Lines 254, 485
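
The guardrails above could look roughly like this. The sketch assumes that "period label disagreement" means more than one distinct non-empty label within a group, and that the employee_count guard checks the commission basis against the per-Member-ID row counts. Both helper names are hypothetical.

```python
def mixed_cadence_flag(period_labels):
    """True when the non-empty period labels in a group disagree."""
    # Assumption: labels are compared case-insensitively after trimming.
    distinct = {label.strip().lower() for label in period_labels if label}
    return len(distinct) > 1


def check_row_basis(row_count_basis, employee_count, member_counts):
    """Guard against using employee_count as the commission basis."""
    # The basis must equal the total number of rows across all Member IDs.
    assert row_count_basis == sum(member_counts.values()), "basis must be row count"
    # employee_count is the number of distinct Member IDs (KPI only).
    assert employee_count == len(member_counts), "employee_count must be distinct IDs"


check_row_basis(3, 2, {"M001": 2, "M002": 1})
```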

✅ 7. Validation Script

  • Created: scripts/validate_canyon_plumbing_september.py
  • Prints: Member ID counts, inferred cadence, normalized_rate, agent_total per agent

Key Code Locations

Cadence Inference Function

  • Location: repo_b/output_adapter.py:207-337
  • Key Line: 245 - Uses member_id directly
  • Returns: Includes member_id_duplication_map (Line 336)

Commission Calculation

  • Location: repo_b/output_adapter.py:477-505
  • Scope: Per (business, agent, month) - Line 461
  • Uses: row_count_basis (never employee_count) - Line 504

Audit Fields

  • Agent rows: Lines 538-550
  • Owner rows: Lines 618-627, 676-685

Next Steps

  1. Run validation script: python scripts/validate_canyon_plumbing_september.py
  2. Verify Canyon Plumbing totals match Repo A (~$553.84)
  3. Check Member ID duplication maps for expected patterns