Repo B Audit-Grade Commission Engine Lock (v1.0)
Date: 2025-01-XX
Status: LOCKED - No further logic changes unless required by failing tests
Executive Summary
The Repo B commission calculation engine is now audit-grade and stable. This document freezes the canonical rules, rounding standards, and regression test suite.
Locked Components
1. Cadence Inference Rules (CANONICAL)
Function: infer_cadence_from_member_duplication() in repo_b/output_adapter.py
Rules (DO NOT CHANGE WITHOUT AUDIT APPROVAL):
- max_dup_count == 1 → Monthly (12 pay periods)
- max_dup_count == 2 or 3 → Bi-weekly (26 pay periods)
  - Note: 3 duplicates can occur in bi-weekly months that span 3 pay periods
- max_dup_count == 4 → 4×/month (48) or Weekly (52) based on posted_date spacing
  - Default to 48 if the spacing signal is insufficient or posted_date is unavailable
  - If spacing ≈ weekly (7-day intervals), use 52
- max_dup_count >= 5 → Weekly (52 pay periods)
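
The mapping above can be sketched as a small pure function. This is an illustration only, not the code in repo_b/output_adapter.py: the names infer_ppy and _looks_weekly are hypothetical, and the 6-to-8-day median window used to decide "≈ weekly" is an assumption, since the source does not say how spacing is measured.

```python
from datetime import date
from statistics import median
from typing import Optional, Sequence


def _looks_weekly(posted_dates: Optional[Sequence[date]]) -> bool:
    """Return True when the median gap between posting dates is about 7 days.

    Assumption: "about 7 days" means a median gap of 6 to 8 days.
    """
    if not posted_dates or len(posted_dates) < 2:
        return False  # insufficient spacing signal
    ordered = sorted(posted_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return 6 <= median(gaps) <= 8


def infer_ppy(max_dup_count: int,
              posted_dates: Optional[Sequence[date]] = None) -> int:
    """Map the maximum per-member duplicate count to pay periods per year."""
    if max_dup_count < 1:
        # Assumption: a count below 1 is treated as invalid input here.
        raise ValueError("max_dup_count must be at least 1")
    if max_dup_count == 1:
        return 12  # Monthly
    if max_dup_count in (2, 3):
        return 26  # Bi-weekly (3 occurs in months spanning 3 pay periods)
    if max_dup_count == 4:
        # 52 only when spacing looks weekly; otherwise default to 48.
        return 52 if _looks_weekly(posted_dates) else 48
    return 52  # 5 or more duplicates: Weekly
```

Called without dates, infer_ppy(4) falls back to 48; passing four posting dates spaced seven days apart yields 52.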
Member identifier used for duplicate counting:
- Primary: member_id (string)
- Fallback 1: member_key if member_id is null/blank
- Fallback 2: member_first_name + member_last_name composite if both are missing
- Default: PPY=12 (Monthly) with MISSING_MEMBER_ID_FOR_CADENCE warning if all fail
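
A minimal sketch of this fallback chain for a single row follows. The function name resolve_member_key, the "|" separator, and the lowercasing of the name composite are all hypothetical choices made for illustration. It also assumes missing values arrive as None or blank strings, and that the composite requires both name parts; the source does not specify either point.

```python
from typing import Mapping, Optional, Tuple

MISSING_WARNING = "MISSING_MEMBER_ID_FOR_CADENCE"


def _clean(value: object) -> str:
    """Normalize a raw field to a stripped string ('' when missing)."""
    return "" if value is None else str(value).strip()


def resolve_member_key(row: Mapping[str, object]) -> Tuple[Optional[str], Optional[str]]:
    """Return (key, warning). Exactly one of the two is None."""
    member_id = _clean(row.get("member_id"))
    if member_id:
        return member_id, None  # Primary

    member_key = _clean(row.get("member_key"))
    if member_key:
        return member_key, None  # Fallback 1

    first = _clean(row.get("member_first_name"))
    last = _clean(row.get("member_last_name"))
    if first and last:
        # Fallback 2: composite key (separator and casing are assumptions).
        return f"{first}|{last}".lower(), None

    # All identifiers failed: the caller defaults to PPY=12 with this warning.
    return None, MISSING_WARNING
```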
Output fields:
- inferred_pay_periods_per_year (int)
- inferred_cadence_label (string)
- mixed_cadence_flag (bool)
- member_id_duplication_map (JSON: Member ID → count)
- member_dup_count_distribution (JSON: {1×: x, 2×: y, 3×: z, 4×: w, 5×+: v})
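
As an illustration of the two JSON fields, the sketch below counts how often each resolved member key appears and then buckets members by that count. The function name summarize_duplication is hypothetical, and the real adapter may build these structures differently.

```python
import json
from collections import Counter
from typing import Dict, Iterable, Tuple


def summarize_duplication(member_keys: Iterable[str]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (duplication_map, distribution) for a sequence of member keys."""
    dup_map = Counter(member_keys)  # member key -> number of rows
    distribution = {"1×": 0, "2×": 0, "3×": 0, "4×": 0, "5×+": 0}
    for count in dup_map.values():
        label = f"{count}×" if count < 5 else "5×+"
        distribution[label] += 1  # number of members with this row count
    return dict(dup_map), distribution


dup_map, distribution = summarize_duplication(["a", "a", "b", "c", "c", "c", "c", "c", "c"])
max_dup_count = max(dup_map.values())  # input to the cadence rules
# Serialize for storage; ensure_ascii=False keeps the "×" labels readable.
distribution_json = json.dumps(distribution, ensure_ascii=False)
```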
2. Rounding Standard (LOCKED)
Constants:
- CENT = Decimal("0.01") - Final money quantization unit
- TOLERANCE = Decimal("0.05") - Reconciliation tolerance
Rules:
- All money math uses Decimal (never float)
- Final persisted/displayed money values are CENT-quantized via .quantize(CENT)
- Use _money() for final values, _d_raw() for intermediate calculations
- Float contamination guardrail: Assertions prevent float types in commission calculation paths
Helper functions:
- _d_raw(x) - Parse to Decimal WITHOUT quantization (preserves full precision)
- _money(x) - Parse to Decimal WITH quantization to cents (for final values)
3. Row-Based Commission Invariant (ENFORCED)
Rule: Commissions use row_count (absorbed transaction rows), NEVER employee_count (distinct employees).
Guardrails:
- Hard assertion: row_count must be int (not float)
- Hard assertion: employee_count must NEVER be used in commission calculation
- Float contamination guardrail: pepm_rate and normalized_rate must be Decimal (not float)
Implemented in build_stage3_like_dataframe() in repo_b/output_adapter.py (lines 590-622).
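
The type guardrails for the row-based invariant could look like the sketch below. The function name commission_for_rows and the formula pepm_rate × row_count are hypothetical stand-ins; the actual calculation in build_stage3_like_dataframe() is not shown in this document and may involve normalized_rate and the inferred cadence. Explicit exceptions are used here instead of assert statements so the checks survive python -O.

```python
from decimal import Decimal

CENT = Decimal("0.01")


def commission_for_rows(row_count: int, pepm_rate: Decimal) -> Decimal:
    """Compute a row-based commission after enforcing the type guardrails.

    Note that employee_count is deliberately not a parameter.
    """
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(row_count, bool) or not isinstance(row_count, int):
        raise TypeError(f"row_count must be int, got {type(row_count).__name__}")
    if not isinstance(pepm_rate, Decimal):
        raise TypeError(f"pepm_rate must be Decimal, got {type(pepm_rate).__name__}")
    # Hypothetical formula: rate per row times number of absorbed rows.
    return (pepm_rate * row_count).quantize(CENT)
```

With 26 absorbed rows and a rate of Decimal("1.25"), the result is Decimal("32.50"), regardless of how many distinct employees those rows represent.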
4. UNDERFUNDED_TPA Soft-Fail (LOCKED)
Behavior: Non-blocking anomaly detection (pipeline continues, never raises).
Detection: agent_commissions > net_payout + TOLERANCE (where TOLERANCE = Decimal("0.05"))
- Emits UNDERFUNDED_TPA anomaly with:
  - business_name, period_label, shortfall_amount
  - stage1_net_payout, stage3_agent_commissions
  - primary_agent, agent_names, lives
  - details_json (JSON string with full context)
- Sets owner_residual = Decimal("0") to preserve CEO-level invariant
- Tracks shortfall in anomaly (not in owner residual)
- gross == agent_sum + owner + chargebacks still holds at CEO level
Exports:
- CSV: business_anomalies_underfunded_tpa_{period}.csv
- BigQuery: business_anomalies_history table with anomaly_type = "UNDERFUNDED_TPA"
Implemented in build_stage3_like_dataframe() in repo_b/output_adapter.py (lines 650-700).
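
The soft-fail behavior can be illustrated with a function that returns an anomaly record instead of raising. This is a sketch under stated assumptions: the name settle_owner_residual is hypothetical, the shortfall is assumed to be agent_commissions minus net_payout, and the funded-case residual (net_payout minus agent_commissions) is an assumption not stated in this document. Only a subset of the anomaly fields is populated.

```python
from decimal import Decimal
from typing import Optional, Tuple

CENT = Decimal("0.01")
TOLERANCE = Decimal("0.05")


def settle_owner_residual(net_payout: Decimal,
                          agent_commissions: Decimal) -> Tuple[Decimal, Optional[dict]]:
    """Return (owner_residual, anomaly). Never raises on underfunding."""
    if agent_commissions > net_payout + TOLERANCE:
        anomaly = {
            "anomaly_type": "UNDERFUNDED_TPA",
            # Assumed definition of the shortfall.
            "shortfall_amount": (agent_commissions - net_payout).quantize(CENT),
            "stage1_net_payout": net_payout.quantize(CENT),
            "stage3_agent_commissions": agent_commissions.quantize(CENT),
        }
        # The shortfall lives in the anomaly, not in the owner residual.
        return Decimal("0"), anomaly
    # Funded, or underfunded only within tolerance: no anomaly.
    return (net_payout - agent_commissions).quantize(CENT), None
```

A payout of 100.00 against commissions of 100.05 sits exactly on the tolerance boundary and produces no anomaly, because the comparison is strictly greater-than; commissions of 100.06 emit the anomaly with a 0.06 shortfall.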
Regression Test Suite (MUST PASS)
Test Files
- tests/test_cadence_inference.py
  - test_cadence_inference_monthly() - 1× duplicate → 12 PPY
  - test_cadence_inference_biweekly() - 2× duplicate → 26 PPY
  - test_cadence_inference_biweekly_3x() - 3× duplicate → 26 PPY
  - test_cadence_inference_4x_per_month() - 4× duplicate → 48 PPY
  - test_cadence_inference_5plus_weekly() - 5+ duplicates → 52 PPY
  - test_cadence_inference_mixed() - Multiple patterns → mixed_cadence_flag=True
  - test_commission_uses_inferred_cadence() - Commission uses inferred PPY, not Period label
  - test_decimal_purity_no_floats() - No float types in cadence inference results
- tests/test_row_based_commission_invariant.py
  - test_commission_uses_row_count_not_employee_count() - Commission uses row_count, not employee_count
- tests/test_underfunded_tpa.py
  - test_underfunded_case_emits_anomaly() - UNDERFUNDED_TPA anomaly emitted, no exception
  - test_funded_case_no_anomaly() - No anomaly when funded
  - test_tolerance_boundary_no_anomaly() - No anomaly within tolerance
  - test_business_normalization_consistency() - Business normalization handles variations
  - test_decimal_purity_anomaly_payload() - No float types in anomaly payloads
  - test_decimal_purity_ceo_snapshot() - No float types in CEO snapshot
  - test_excel_dataframe_no_floats() - Excel DataFrame uses CENT-quantized strings
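
The decimal-purity tests listed above all need some way to scan a nested payload for stray floats. One way such a check could be written is shown below; find_floats is a hypothetical helper, not a function from the test suite.

```python
from decimal import Decimal
from typing import Any, List


def find_floats(value: Any, path: str = "$") -> List[str]:
    """Return the paths of every float found in a nested dict/list payload."""
    if isinstance(value, float):
        return [path]
    found: List[str] = []
    if isinstance(value, dict):
        for key, item in value.items():
            found.extend(find_floats(item, f"{path}.{key}"))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            found.extend(find_floats(item, f"{path}[{index}]"))
    return found


payload = {
    "shortfall_amount": Decimal("0.06"),
    "lives": 12,
    "agent_names": ["A", "B"],
    "details": {"rate": 1.25},  # deliberate float contamination
}
offenders = find_floats(payload)
```

A purity test would then assert that the returned list is empty; in this example it contains "$.details.rate".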
Repo B as Authoritative Source
Status: Repo B is the AUTHORITATIVE source of truth for:
- Agent payouts
- Audit narratives
- Dispute resolution
The LEGACY_UNDERPAYMENT_DETECTED label applies when Repo A payout < Repo B payout for the same business/agent/month, indicating that Repo A's historical calculation diverged from the audit-grade TPA KEY normalization.
Parity Label (Informational Only):
- LEGACY_UNDERPAYMENT_DETECTED - When Repo A payout < Repo B payout AND Repo B audit basis is supported by cadence inference + TPA KEY normalization
- This label is informational only (no logic changes to payouts)
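
The labeling condition reduces to a two-part check. In the sketch below, parity_label and the audit_basis_supported flag are hypothetical names; how the audit basis is actually established is not described in this document. The function only returns a label and never adjusts a payout.

```python
from decimal import Decimal
from typing import Optional


def parity_label(repo_a_payout: Decimal,
                 repo_b_payout: Decimal,
                 audit_basis_supported: bool) -> Optional[str]:
    """Return the informational parity label, or None when it does not apply."""
    if audit_basis_supported and repo_a_payout < repo_b_payout:
        return "LEGACY_UNDERPAYMENT_DETECTED"
    return None
```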
Story Sheets Wireframe
See repo_b/README.md section “Story Sheets Wireframe” for display requirements for cadence inference and audit fields.
Git Tag
Tag: repo-b-audit-grade-v1
Commit Message: “Lock Repo B audit-grade commission engine (cadence inference, cent rounding, tests, underfunded soft-fail)”
Change Control
Process: No logic changes to locked components unless:
- Regression test fails (investigate root cause)
- Audit approval obtained
- Test suite updated to cover new behavior
- Documentation updated
Allowed changes:
- Bug fixes (with test coverage)
- Performance optimizations (preserving behavior)
- Documentation updates
- Export format changes (preserving calculation logic)
Prohibited changes:
- Cadence inference rules (without audit approval)
- Rounding standard (without audit approval)
- Row-based commission invariant (never change)
- UNDERFUNDED_TPA behavior (soft-fail must remain)