Member-ID-Based MIXED_CADENCE Normalization Implementation Summary
Date: 2025-01-XX
Status: ✅ COMPLETE
Implementation Checklist
✅ 1. Cadence Inference Key
- Uses Member ID (NOT member_key) - Line 245: member_counts = group_df["member_id"].value_counts()
- Scope: (business, agent, Member ID, month) - Line 461: Called inside agent loop with agent_group_df
- Guardrail: Line 254: Assert "member_id" in group_df.columns
✅ 2. Cadence Inference Rules
- 1 row per Member ID → Monthly (12 PPY) - Lines 272-275
- 2 rows per Member ID → Bi-weekly (26 PPY) - Lines 276-279
- 4 rows per Member ID → 4×/month (48 PPY) or Weekly (52 PPY) - Lines 280-314
- >4 rows per Member ID → Weekly (52 PPY) - Lines 280-314
- posted_date spacing used when available - Lines 283-307
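The rules above can be sketched as a small lookup. This is only an illustration, not the code in output_adapter.py: the function name infer_cadence is hypothetical, and the summary does not say how posted_date spacing separates 4×/month from weekly, so the 7-day median test and the fallback to 48 are both assumptions. Row counts the checklist does not cover (such as 3) are rejected rather than guessed.

```python
from datetime import date
from statistics import median


def infer_cadence(rows_per_member, posted_dates=None):
    """Return (pay_periods_per_year, label) for one Member ID's row count."""
    if rows_per_member == 1:
        return 12, "Monthly"
    if rows_per_member == 2:
        return 26, "Bi-weekly"
    if rows_per_member == 4:
        # ASSUMPTION: a median gap of exactly 7 days is read as weekly pay;
        # without usable dates this sketch falls back to 4x/month (48).
        if posted_dates and len(posted_dates) >= 2:
            ordered = sorted(posted_dates)
            gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
            if median(gaps) == 7:
                return 52, "Weekly"
        return 48, "4x/month"
    if rows_per_member > 4:
        return 52, "Weekly"
    # 0 or 3 rows: the summary states no rule, so refuse to guess.
    raise ValueError(f"no cadence rule for {rows_per_member} rows")


# Hypothetical member with four postings exactly one week apart.
weekly_dates = [date(2025, 9, 5), date(2025, 9, 12), date(2025, 9, 19), date(2025, 9, 26)]
print(infer_cadence(4, weekly_dates))
```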
✅ 3. Commission Math (ROW-BASED ONLY)
- row_count_basis = len(agent_group_df) - Line 478
- employee_count = agent_group_df["member_id"].nunique() - Line 479 (KPI ONLY)
- normalized_rate = (PEPM_monthly * 12) / inferred_pay_periods - Line 497
- agent_commission = normalized_rate * row_count_basis - Line 504
- All math in Decimal with CENT quantization - Lines 494, 497, 504
✅ 4. Source of Truth
- PEPM_monthly from agent2_pepm.csv - Line 493: pepm_rate_raw = agent_row["pepm_rate"]
- Decimal parsing via _d_raw() - Line 494
✅ 5. Required Audit Fields
All fields persisted on agent rows (Lines 538-550):
- ✅ inferred_pay_periods_per_year (int)
- ✅ inferred_cadence_label (string)
- ✅ mixed_cadence_flag (bool)
- ✅ source_pepm_monthly (CENT string)
- ✅ rate_per_pay_period (CENT string) - alias for normalized_rate
- ✅ normalized_rate_decimal (Decimal)
- ✅ row_count_basis (int)
- ✅ member_id_duplication_map (JSON: Member ID → count) - Lines 253, 518, 550
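To make the field types concrete, the sketch below assembles one audit record with the names and types listed above. The helper build_audit_fields and its arguments are hypothetical; how the real adapter attaches these values to agent rows is not shown in this summary.

```python
import json
from decimal import Decimal

CENT = Decimal("0.01")


def build_audit_fields(pay_periods, label, mixed, pepm_monthly,
                       normalized_rate, row_count_basis, duplication_map):
    """Assemble the audit fields for one agent row (illustrative only)."""
    return {
        "inferred_pay_periods_per_year": int(pay_periods),
        "inferred_cadence_label": str(label),
        "mixed_cadence_flag": bool(mixed),
        # CENT strings: quantize to two places, then stringify.
        "source_pepm_monthly": str(Decimal(pepm_monthly).quantize(CENT)),
        "rate_per_pay_period": str(normalized_rate.quantize(CENT)),
        "normalized_rate_decimal": normalized_rate,
        "row_count_basis": int(row_count_basis),
        # JSON object mapping Member ID -> row count.
        "member_id_duplication_map": json.dumps(duplication_map, sort_keys=True),
    }


# Hypothetical values for a bi-weekly agent group.
fields = build_audit_fields(26, "Bi-weekly", False, "10", Decimal("4.62"),
                            3, {"A1": 2, "B2": 1})
print(fields["member_id_duplication_map"])
```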
✅ 6. Guardrails
- Period label disagreement → mixed_cadence_flag=True - Lines 469-471
- Assertion: prevents employee_count from being used in commission math - Lines 481-490
- Assertion: cadence is inferred per Member ID - Lines 254, 485
✅ 7. Validation Script
- Created: scripts/validate_canyon_plumbing_september.py
- Prints: Member ID counts, inferred cadence, normalized_rate, agent_total per agent
Key Code Locations
Cadence Inference Function
- Location: repo_b/output_adapter.py:207-337
- Key Line: 245 - Uses member_id directly
- Returns: Includes member_id_duplication_map (Line 336)
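As a rough illustration of the counting step at the key line, the sketch below mimics group_df["member_id"].value_counts() with collections.Counter over plain dict rows so that it runs without pandas. The function name count_rows_per_member and the sample rows are hypothetical; only the member_id column, the presence assertion, and the Member ID → count map come from this summary.

```python
from collections import Counter


def count_rows_per_member(rows):
    """Count rows per Member ID within one agent group."""
    # Guardrail: every row must carry member_id (mirrors the column assertion).
    assert all("member_id" in row for row in rows), "member_id is required"
    return Counter(row["member_id"] for row in rows)


# Hypothetical group: member A1 appears twice, B2 once.
sample_rows = [
    {"member_id": "A1", "posted_date": "2025-09-05"},
    {"member_id": "A1", "posted_date": "2025-09-19"},
    {"member_id": "B2", "posted_date": "2025-09-05"},
]
member_counts = count_rows_per_member(sample_rows)

# Plain dict form, suitable for a Member ID -> count duplication map.
duplication_map = dict(member_counts)
print(duplication_map)
```

Here the row count is 3 while the number of unique members is 2, which is the gap between row_count_basis and employee_count that the commission math depends on.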
Commission Calculation
- Location: repo_b/output_adapter.py:477-505
- Scope: Per (business, agent, month) - Line 461
- Uses: row_count (never employee_count) - Line 504
Audit Fields
- Agent rows: Lines 538-550
- Owner rows: Lines 618-627, 676-685
Next Steps
- Run validation script: python scripts/validate_canyon_plumbing_september.py
- Verify Canyon Plumbing totals match Repo A (~$553.84)
- Check Member ID duplication maps for expected patterns