Skip to main content

Reporting Hierarchy MVP Implementation Plan (FINAL)

Overview

Build audit-grade Reporting Hierarchy system (separate from commissions) to seed hierarchy from agent2_pepm.csv safely, even though CSV only contains names. System uses agent_entity_id (UUID) for all FK references, implements SCD2 pattern, enforces cycle detection, maintains org-scoping, and supports ops-based changesets for future drag/drop UI. CRITICAL: Hierarchy is ONLY for reporting/visibility and dashboard scoping; must NOT be used by commission engine. Do not add hierarchy joins to payout paths.

Non-Negotiable Decisions (FINAL)

1) Hashing Strategy (FINAL)

Option A (CHOSEN): op_hash and batch_hash are computed SERVER-SIDE ONLY, returned in API responses, and written to structured audit logs.
  • Do NOT implement any client-side hashing, preview hashes, or cross-language parity tests.
  • Do NOT “choose later.” This is final.
Schema-Compatibility Shim (REQUIRED):
  • Schema shows: record_hash STRING NOT NULL (no default) in integration/bigquery/sql/tables/config_agent_hierarchy.sql
  • REQUIRED: Populate record_hash with deterministic ROW-CORE hash to satisfy schema (insert would fail without it).
  • Document clearly: record_hash is a compatibility field, not relied upon for audit integrity.
  • Row-core fields for record_hash (canonical JSON + sha256):
    • tenant_id, org_id, agent_entity_id, parent_agent_entity_id, effective_start_date, effective_end_date, change_status, is_active
  • Formula: record_hash = sha256(UTF8(canonical_json(row_core)))
  • Implementation: api/utils/hierarchy_hashing.py with compute_record_hash(row_core: Dict) -> str

2) Batch Semantics (FINAL)

/admin/hierarchy/batch_reparent is ATOMIC for MVP:
  • Validate ALL ops first; if ANY invalid -> return 400 and apply NONE.
  • Response must include ALL validation errors with operation_index.
  • Partial-success mode is NOT MVP. If desired later, it must be an explicit flag (not now).

3) Canonical Hashing Rules (SERVER ONLY)

op_hash:
  • Formula: op_hash = sha256(UTF8(canonical_json(op_core)))
  • op_core fields ONLY (exclude notes/comments/UI metadata/timestamps):
    • op: “reparent” (string)
    • tenant_id: string
    • org_id: string (or JSON null)
    • child_agent_entity_id: lowercase UUID string
    • new_parent_agent_entity_id: lowercase UUID string or JSON null (NOT string “null”)
    • effective_start_date: “YYYY-MM-DD” string
    • change_status: string (default “APPROVED”)
    • active: boolean true/false (NOT “Yes”/“No” string)
  • Canonical JSON (Python):
    • json.dumps(op_core, sort_keys=True, separators=(',', ':')) (no whitespace)
    • UTF-8, no BOM, arrays preserve order
    • JSON null is null (never string “null”)
batch_hash:
  • Compute op_hash for each operation first
  • Enforce deterministic server order for batch application: sort by (child_agent_entity_id, effective_start_date)
  • MUST use canonical JSON (NOT string concatenation):
    • batch_hash = sha256(UTF8(canonical_json({tenant_id, org_id, effective_start_date, operations: [{op_hash}, ...]})))
    • Operations array sorted by deterministic order

4) SCD2 Overlap Logic (FINAL)

All overlap checks MUST handle bounded rows, not only “end_date IS NULL”: Overlap predicate:
existing.effective_start_date <= new_start_date AND
(existing.effective_end_date IS NULL OR existing.effective_end_date >= new_start_date)
Use this overlap predicate for:
  • reparent “close existing” selection
  • bootstrap single-parent enforcement
  • as_of tree queries

5) Normalization Consistency (FINAL)

Name matching is FAIL-CLOSED. No fuzzy matching in MVP. Normalization must be identical in SQL and Python:
  • uppercase, trim
  • remove punctuation: . , ' - (including hyphen)
  • collapse ALL whitespace (not single REPLACE)
SQL (MUST use REGEXP_REPLACE to collapse whitespace):
REGEXP_REPLACE(
  UPPER(TRIM(REPLACE(REPLACE(REPLACE(REPLACE(CONCAT(c.first_name, ' ', c.last_name), '.', ''), ',', ''), '''', ''), '-', ''))),
  r'\s+', ' '
) AS full_name_normalized
Python:
def normalize_full_name(first_name: str, last_name: str) -> str:
    full = f"{first_name} {last_name}"
    # Remove punctuation: . , ' - (must match SQL exactly)
    full = full.replace('.', '').replace(',', '').replace("'", '').replace('-', '')
    # Trim and uppercase
    full = full.strip().upper()
    # Collapse multiple spaces
    full = ' '.join(full.split())
    return full
Add a backend-only test that validates the normalization function handles hyphens and multiple spaces.

Security & RBAC (FINAL)

Design Principle: Strict for safety but NEVER blocks the platform owner (super admin) from doing full admin operations across this tenant and future tenants.

1) Platform Owner / Super Admin (FINAL)

Definition: Platform Super Admin is detected using _is_platform_admin() function from api/routes/admin.py:
  • Requires ALL of:
    1. role == "admin"
    2. org_id is None/missing (no org_id in JWT)
    3. User email/sub is in ADMIN_PLATFORM_ALLOWLIST_EMAILS (if allowlist is configured)
  • If ADMIN_PLATFORM_ALLOWLIST_EMAILS is not set, falls back to role+org_id check (backward compatible, but less secure)
MVP Implementation (Minimal Scope):
  • Treat email aaronbundy681@gmail.com as platform_super_admin (allowlist)
  • Apply to all /api/v1/admin/hierarchy/* endpoints and /admin/agents/resolver
  • Super Admin must provide X-Org-Id on WRITE endpoints; missing => 400 (safety)
  • Super Admin write access must NOT depend on org membership rows existing (prevent bootstrapping deadlocks)
  • Non-Super Admin users: require Admin/CEO for org
Super Admin Capabilities:
  • Super Admin can perform ALL hierarchy operations (bootstrap, reparent, batch_reparent, validate_move, tree/history, resolver) across any tenant/org they are authorized to administer
  • Super Admin must NOT be dependent on org membership rows existing ahead of time (avoid bootstrapping deadlocks)
  • Super Admin access is NOT blocked by org membership table checks
  • Super Admin must supply X-Org-Id for ALL hierarchy endpoints (writes AND reads including resolver) - missing X-Org-Id => 400 (safety check to prevent accidental cross-org operations)
Implementation:
  • Use _is_platform_admin(current_user) from api/routes/admin.py to detect platform admin
  • Create helper function is_platform_super_admin(current_user: dict) -> bool in api/routes/admin_hierarchy.py that wraps _is_platform_admin()
  • For MVP: Hardcode check for aaronbundy681@gmail.com OR use ADMIN_PLATFORM_ALLOWLIST_EMAILS env var (if set, must include this email)
  • All hierarchy endpoints check is_platform_super_admin() before enforcing org-scoped restrictions

2) RBAC for Non-Owner Users (FINAL)

Write Endpoints (bootstrap, reparent, batch_reparent, validate_move):
  • For standard org users: Require Admin/CEO role for that org
  • Use require_admin_or_ceo dependency from api/routes/admin_identity.py
  • Non-admin/CEO users receive 403 Forbidden
Read Endpoints (tree, history, resolver):
  • Admin/CEO can read full org tree/history
  • If exposed to agents later, agent reads must be server-side subtree-scoped (filter to agent’s subtree only)
  • MVP: Only Admin/CEO can read (agent access deferred)
Implementation:
  • Write endpoints: Use require_admin_or_ceo dependency
  • Read endpoints: Use require_admin_or_ceo dependency (MVP)
  • Future agent access: Add require_agent_or_admin_or_ceo dependency with subtree filtering

3) Org Scoping Rules (FINAL - Do NOT Lock Out Super Admin)

Do NOT rely solely on X-Org-Id: Validate org scope against auth claims. If caller is Super Admin:
  • X-Org-Id is REQUIRED when performing org-scoped operations (to prevent accidental writes to wrong org)
  • Request must be allowed even if the user is not explicitly “a member” of that org in a membership table
  • Super Admin can operate on any org_id they specify via X-Org-Id header (no membership validation)
If caller is NOT Super Admin:
  • X-Org-Id must match the org in their auth context / membership; otherwise 403
  • Use get_org_id_from_request() from api/utils/org_scoping.py with allow_platform_level=False
  • Org-scoped users cannot override their JWT org_id via X-Org-Id header (prevent spoofing)
Implementation:
  • Create get_org_id_for_hierarchy_ops() function in api/routes/admin_hierarchy.py:
    def get_org_id_for_hierarchy_ops(
        current_user: dict = Depends(get_current_user),
        x_org_id: Optional[str] = Header(None, alias="X-Org-Id")
    ) -> Optional[str]:
        """
        Get org_id for hierarchy operations.
        
        Rules:
        - If Super Admin: Require X-Org-Id header (400 if missing), return it (no membership check)
        - If org-scoped user: Use JWT org_id, validate X-Org-Id matches (403 if mismatched)
        """
        if is_platform_super_admin(current_user):
            # Super Admin: Require X-Org-Id header
            if not x_org_id or not x_org_id.strip():
                raise HTTPException(
                    status_code=status.HTTP_400_BAD_REQUEST,
                    detail="X-Org-Id header required for platform admin org-scoped operations"
                )
            return x_org_id.strip()
        
        # Non-Super Admin: Use existing org scoping pattern
        return get_org_id_from_request(current_user, x_org_id, allow_platform_level=False)
    

4) Safe Default Behavior (FINAL)

For Super Admin:
  • Missing X-Org-Id on write endpoints -> 400 Bad Request with clear message: “X-Org-Id header required for platform admin org-scoped operations”
  • This prevents accidental cross-tenant/org writes while still preserving full power
  • Super Admin can still operate on any org (just must specify it explicitly)
For Non-Owner:
  • Missing X-Org-Id -> Use existing repo pattern (typically 400 or 403 per get_org_id_from_request())
  • Org-scoped users use JWT org_id (X-Org-Id optional but must match if provided)
Implementation:
  • All write endpoints use get_org_id_for_hierarchy_ops() dependency
  • All read endpoints use get_org_id_for_hierarchy_ops() dependency (for consistency)
  • Error messages clearly distinguish between Super Admin (400) and non-Super Admin (403) failures

5) Tests (MUST BE ADDED)

File: api/tests/test_admin_hierarchy.py Test Cases:
  1. Super Admin Access:
    • Super Admin can call all hierarchy endpoints with X-Org-Id set (works even if no prior org membership record exists)
    • Super Admin missing X-Org-Id on writes gets 400 (not 403) with clear message
    • Super Admin can operate on any org_id (no membership validation)
    • Super Admin with aaronbundy681@gmail.com email is detected correctly
  2. Non-Super Admin Access:
    • Non-admin org user cannot call write endpoints (403)
    • Org admin/CEO can call write endpoints for their org (200)
    • Org admin/CEO cannot call write endpoints for different org (403)
    • Org-scoped user with mismatched X-Org-Id gets 403 (prevent spoofing)
  3. Read Endpoints:
    • Admin/CEO can read full org tree/history (200)
    • Non-admin/CEO cannot read (403)
    • Super Admin can read any org with X-Org-Id (200)
  4. Org Scoping:
    • Super Admin with X-Org-Id can write to any org (200, no membership check)
    • Org-scoped user can only write to their JWT org_id (200)
    • Org-scoped user with X-Org-Id mismatched to JWT gets 403
Test Pattern:
  • Mock _is_platform_admin() to return True/False for Super Admin tests
  • Use kill_switch_bigquery fixture to mock BigQuery calls
  • Test both allowlist-configured and fallback scenarios
  • Test aaronbundy681@gmail.com specifically

Security Note

This RBAC/scoping design prevents data leakage and sabotage while guaranteeing the platform owner retains god-mode across all tenants/orgs.

Implementation Tasks

A) Agent Resolver Export Endpoint

File: api/routes/admin_hierarchy.py (recommended: separate router) Endpoint: GET /admin/agents/resolver Requirements:
  • RBAC: Admin/CEO only (use require_admin_or_ceo from api/routes/admin_identity.py)
  • Org-scoped: Use get_org_id_for_hierarchy_ops() dependency (see Security & RBAC section)
    • Super Admin: Requires X-Org-Id header (400 if missing) - applies to ALL hierarchy endpoints including resolver reads
    • Non-Super Admin: Uses JWT org_id (X-Org-Id optional but must match if provided)
  • Returns: agent_entity_id, first_name, last_name, full_name_normalized, agent_id (optional), email (optional)
  • SQL normalization uses REGEXP_REPLACE whitespace collapse + hyphen removal (see Normalization Consistency above)
  • Query joins config_agent_profile_identity (for agent_entity_id, agent_id) with config_agent_profile_contact (for first_name, last_name, email)
  • Filter: is_active=TRUE, effective_end_date IS NULL, tenant/org scope
  • Strict org scoping: Require i.org_id = @org_id (do not use NULL org clause - MVP assumes all tenants have org_id)
  • Response format: List of AgentResolverResponse objects (200 OK)
  • Error handling: 403 if not admin/CEO, 400 if Super Admin missing X-Org-Id, 403 if org-scoped user with mismatched X-Org-Id
Query Pattern:
SELECT DISTINCT
  i.agent_entity_id,
  i.agent_id,
  c.first_name,
  c.last_name,
  c.email,
  REGEXP_REPLACE(
    UPPER(TRIM(REPLACE(REPLACE(REPLACE(REPLACE(CONCAT(c.first_name, ' ', c.last_name), '.', ''), ',', ''), '''', ''), '-', ''))),
    r'\s+', ' '
  ) AS full_name_normalized
FROM config_agent_profile_identity i
LEFT JOIN config_agent_profile_contact c ON c.agent_entity_id = i.agent_entity_id
WHERE i.tenant_id = @tenant_id
  AND i.org_id = @org_id
  AND i.is_active = TRUE
  AND i.effective_end_date IS NULL
Response Schema: Create api/schemas/admin_hierarchy.py:
class AgentResolverResponse(BaseModel):
    agent_entity_id: str
    first_name: Optional[str]
    last_name: Optional[str]
    full_name_normalized: str
    agent_id: Optional[str]
    email: Optional[str]

B) Hierarchy Bootstrap Endpoint

File: api/routes/admin_hierarchy.py Endpoint: POST /admin/hierarchy/bootstrap Requirements:
  • Request format: JSON body with rows array (NOT multipart CSV upload)
  • Accepts JSON rows with ONLY existing schema fields: child_agent_entity_id, parent_agent_entity_id, effective_start_date, effective_end_date (optional), change_status, active
  • IMPORTANT: Schema does NOT have relationship_type or notes fields - MVP must not include these
  • RBAC: Admin/CEO only (use require_admin_or_ceo from api/routes/admin_identity.py)
  • Org-scoped: Use get_org_id_for_hierarchy_ops() dependency (see Security & RBAC section)
    • Super Admin: Requires X-Org-Id header (400 if missing)
    • Non-Super Admin: Uses JWT org_id (X-Org-Id optional but must match if provided)
  • UUID normalization: Normalize all UUIDs to lowercase (accept both cases, store lowercase)
  • Validations:
    • Tenant/org isolation: Child and parent must belong to same org/tenant (reject cross-org refs)
    • IDs exist: Verify child_agent_entity_id and parent_agent_entity_id (if not NULL) exist in config_agent_profile_identity
    • Date normalized: YYYY-MM-DD format, validate date is valid (no timezone, no time component)
    • Single-parent policy (MVP): At most one active parent per child as-of date (use overlap predicate)
    • Cycle prevention: Use detect_circular_hierarchy() from api/bigquery/phase4_hierarchy_queries.py to reject loops
    • Idempotency: Same (child_agent_entity_id, parent_agent_entity_id, effective_start_date) => reused/no-op (query existing row before insert)
  • Error response format: FastAPI HTTPException with detail field (string or list of strings)
    • 400: Validation errors (invalid IDs, cycles, cross-org refs), Super Admin missing X-Org-Id
    • 403: RBAC failure (non-admin/CEO), org-scoped user with mismatched X-Org-Id
    • 500: Internal server error
  • Response format: HierarchyBootstrapResponse (200 OK with created/skipped counts)
  • Writes SCD2 rows: Insert only (no in-place edits), use create_agent_hierarchy() from api/bigquery/phase4_hierarchy_queries.py
  • Set change_status='APPROVED', is_active=TRUE (from active field)
  • Populate record_hash with deterministic row-core hash (schema compatibility shim)
Request Schema: api/schemas/admin_hierarchy.py:
class HierarchyBootstrapRow(BaseModel):
    child_agent_entity_id: str  # UUID, will be normalized to lowercase
    parent_agent_entity_id: Optional[str]  # NULL = root, UUID normalized to lowercase
    effective_start_date: date  # YYYY-MM-DD, no timezone
    effective_end_date: Optional[date] = None  # Optional, NULL = unbounded
    change_status: str = "APPROVED"  # Default: APPROVED
    active: bool = True  # Maps to is_active field

class HierarchyBootstrapRequest(BaseModel):
    rows: List[HierarchyBootstrapRow]
Response Schema:
class HierarchyBootstrapResponse(BaseModel):
    created_count: int
    skipped_count: int  # Idempotent skips
    errors: List[Dict[str, Any]]  # Format: [{"row_index": 0, "error": "message"}, ...]
Note: Multiple roots allowed for MVP. Option 1 uses Robin as root. Known override: Tommy Dang’s parent is Ricky Ballard.

C) Reparent Endpoint

File: api/routes/admin_hierarchy.py Endpoint: POST /admin/hierarchy/reparent Requirements:
  • Input: child_agent_entity_id, new_parent_agent_entity_id (nullable), effective_start_date, change_status (optional, default “APPROVED”), active (optional, default True)
  • IMPORTANT: Schema does NOT have notes field - MVP must not include it
  • RBAC: Admin/CEO only (use require_admin_or_ceo from api/routes/admin_identity.py)
  • Org-scoped: Use get_org_id_for_hierarchy_ops() dependency (see Security & RBAC section)
    • Super Admin: Requires X-Org-Id header (400 if missing)
    • Non-Super Admin: Uses JWT org_id (X-Org-Id optional but must match if provided)
  • UUID normalization: Normalize all UUIDs to lowercase (accept both cases, store lowercase)
  • Algorithm (SCD2):
    1. Scope + existence checks: Verify child and parent (if not null) exist in tenant/org scope
    2. NOOP check: If current parent (as-of effective_start_date) equals new_parent_agent_entity_id, return early (idempotent)
    3. Cycle check (single-parent optimized): Walk up from new_parent_agent_entity_id via parent links (as-of effective_start_date); if hit child_agent_entity_id => reject with 400
    4. Close existing: Find any overlapping existing relationship for child (use overlap predicate: effective_start_date <= new_start_date AND (effective_end_date IS NULL OR effective_end_date >= new_start_date)), set effective_end_date = effective_start_date - 1 day (use close_agent_hierarchy())
    5. Insert new: Create new relationship row with effective_start_date, effective_end_date=NULL (use create_agent_hierarchy())
  • Error response format: FastAPI HTTPException with detail field
    • 400: Validation errors (invalid IDs, cycles, cross-org refs; NOOP returns 200 with status=“noop”), Super Admin missing X-Org-Id
    • 403: RBAC failure (non-admin/CEO), org-scoped user with mismatched X-Org-Id
    • 500: Internal server error
  • Response format: HierarchyReparentResponse (200 OK for created, 200 OK for noop)
  • Populate record_hash with deterministic row-core hash (schema compatibility shim)
  • Server-side hashing: Compute op_hash (SHA-256 of canonical JSON) and return in response (hashes NOT stored in DB except record_hash shim)
  • Audit logging: Log request_id, actor (user email), tenant_id, org_id, op_hash, status
Request Schema:
class HierarchyReparentRequest(BaseModel):
    child_agent_entity_id: str  # UUID, normalized to lowercase
    new_parent_agent_entity_id: Optional[str]  # NULL = set root, UUID normalized to lowercase
    effective_start_date: date  # YYYY-MM-DD, no timezone
    change_status: Optional[str] = "APPROVED"  # Default: APPROVED
    active: Optional[bool] = True  # Default: True (maps to is_active)

class HierarchyReparentResponse(BaseModel):
    hierarchy_id: Optional[str]  # None if noop
    op_hash: str  # SHA-256 hash of canonical JSON representation
    status: str  # 'created' | 'noop'

D) Batch Reparent Endpoint

File: api/routes/admin_hierarchy.py Endpoint: POST /admin/hierarchy/batch_reparent Requirements:
  • Accepts list of reparent operations (same fields as /reparent)
  • RBAC: Admin/CEO only (use require_admin_or_ceo from api/routes/admin_identity.py)
  • Org-scoped: Use get_org_id_for_hierarchy_ops() dependency (see Security & RBAC section)
    • Super Admin: Requires X-Org-Id header (400 if missing)
    • Non-Super Admin: Uses JWT org_id (X-Org-Id optional but must match if provided)
  • UUID normalization: Normalize all UUIDs to lowercase
  • ATOMIC: Validate ALL ops first; if ANY invalid -> return 400 and apply NONE
  • Response must include ALL validation errors with operation_index
  • Deterministic order: Sort ops by (child_agent_entity_id, effective_start_date) before validation and application
  • If all pass, apply each op sequentially using same logic as single reparent
  • Must be idempotent per op (same op can be replayed safely)
  • Error response format:
    • 400: Validation errors (fail whole batch, return all errors in structured format), Super Admin missing X-Org-Id
    • 403: RBAC failure (non-admin/CEO), org-scoped user with mismatched X-Org-Id
    • 500: Internal server error
  • 400 Error Response Schema (FastAPI-compatible structured detail):
    # HTTPException with structured detail array
    {
      "detail": [
        {
          "operation_index": 0,
          "error_code": "CYCLE_DETECTED",
          "message": "Operation would create a cycle in hierarchy"
        },
        {
          "operation_index": 2,
          "error_code": "INVALID_AGENT_ID",
          "message": "child_agent_entity_id does not exist in tenant/org"
        }
      ]
    }
    
  • Response format: HierarchyBatchReparentResponse (200 OK ONLY returned when full batch applies successfully - all ops validated and applied)
  • Populate record_hash for each inserted row (schema compatibility shim)
  • Server-side hashing: Compute per-op op_hash + batch_hash (canonical JSON, NOT concatenation)
  • Audit logging: Log request_id, actor, tenant_id, org_id, op_hashes (list), batch_hash, status
Request Schema:
class HierarchyReparentOp(BaseModel):
    child_agent_entity_id: str  # UUID, normalized to lowercase
    new_parent_agent_entity_id: Optional[str]  # UUID normalized to lowercase or null
    effective_start_date: date  # YYYY-MM-DD, no timezone
    change_status: Optional[str] = "APPROVED"  # Default: APPROVED
    active: Optional[bool] = True  # Default: True (maps to is_active)

class HierarchyBatchReparentRequest(BaseModel):
    operations: List[HierarchyReparentOp]
Response Schema:
class HierarchyReparentOpResult(BaseModel):
    operation_index: int
    status: str  # 'created' | 'noop'
    child_agent_entity_id: str
    op_hash: str  # SHA-256 hash of canonical JSON representation
    hierarchy_id: Optional[str] = None  # None if noop

class HierarchyBatchReparentResponse(BaseModel):
    results: List[HierarchyReparentOpResult]
    batch_hash: str  # SHA-256 hash computed from canonical JSON (NOT concatenation)
    total_created: int
    total_noop: int
Validation Strategy:
  • Check all operations for:
    • Scope/existence (all IDs exist in tenant/org)
    • Cross-op cycles (op A’s child becomes parent of op B’s child, creating cycle)
    • Conflicts (multiple ops try to reparent same child to different parents)
  • If any validation fails, return 400 with structured detail array containing all errors (do not apply any ops)
  • If all pass, apply each op sequentially using same logic as single reparent
  • 200 response is ONLY returned when the full batch applies successfully (all ops validated and applied)
Server-Side Only Deterministic Hashing (Python backend only):
  1. op_hash computation (per reparent operation):
    • Formula: op_hash = sha256(UTF8(canonical_json(op_core)))
    • op_core fields (ONLY these, exclude notes/comments/UI metadata/timestamps):
      • op: “reparent” (string)
      • tenant_id: string
      • org_id: string (or JSON null, NOT string “null”)
      • child_agent_entity_id: lowercase UUID string
      • new_parent_agent_entity_id: lowercase UUID string or JSON null (NOT string “null”)
      • effective_start_date: “YYYY-MM-DD” string
      • change_status: string (default “APPROVED”)
      • active: boolean true/false (NOT “Yes”/“No” string)
    • Canonical JSON rules (Python):
      • UTF-8 encoding, no BOM
      • Keys sorted lexicographically
      • No whitespace: json.dumps(..., sort_keys=True, separators=(',', ':'))
      • Arrays preserve order
      • Null values: null (JSON null, not string “null”)
    • Implementation: api/utils/hierarchy_hashing.py with compute_op_hash(op_core: Dict) -> str
  2. batch_hash computation:
    • Compute op_hash for each operation first
    • Enforce deterministic server order for application: sort by (child_agent_entity_id, effective_start_date)
    • MUST use canonical JSON (NOT string concatenation):
      • batch_hash = sha256(UTF8(canonical_json({tenant_id, org_id, effective_start_date, operations: [{op_hash}, ...]})))
      • Operations array sorted by deterministic order
    • Implementation: api/utils/hierarchy_hashing.py with compute_batch_hash(tenant_id: str, org_id: str, effective_start_date: str, op_hashes: List[str]) -> str
  3. record_hash computation (schema compatibility shim):
    • Row-core fields: tenant_id, org_id, agent_entity_id, parent_agent_entity_id, effective_start_date, effective_end_date, change_status, is_active
    • Formula: record_hash = sha256(UTF8(canonical_json(row_core)))
    • Implementation: api/utils/hierarchy_hashing.py with compute_record_hash(row_core: Dict) -> str
    • Document: record_hash is a compatibility field, not relied upon for audit integrity
  4. API behavior:
    • /admin/hierarchy/reparent returns op_hash in response (hash NOT stored in DB except record_hash shim)
    • /admin/hierarchy/batch_reparent returns per-op op_hash + batch_hash (hashes NOT stored in DB except record_hash shim)
    • Hashes are computed server-side only and returned as opaque metadata
    • UI treats hashes as opaque strings (no client-side computation, no preview hashes)
  5. Audit logging:
    • Log fields: request_id (UUID), actor (user identity/email from JWT), tenant_id, org_id, op_hash(es), batch_hash (if applicable), status (success/failure), errors (if validation fails)
    • Log format: Structured JSON log entry for audit trail
    • Log location: Application logs (structured JSON format for parsing)
File: api/routes/admin_hierarchy.py Endpoint: POST /admin/hierarchy/validate_move Requirements:
  • Input: child_agent_entity_id, new_parent_agent_entity_id (nullable), effective_start_date
  • Output: ok/errors (no writes, validation only)
  • Used later by drag/drop UI for pre-flight validation
  • Checks: scope, existence, cycle detection (same as reparent validation)
  • RBAC: Admin/CEO only (use require_admin_or_ceo from api/routes/admin_identity.py)
  • Org-scoped: Use get_org_id_for_hierarchy_ops() dependency (see Security & RBAC section)
    • Super Admin: Requires X-Org-Id header (400 if missing)
    • Non-Super Admin: Uses JWT org_id (X-Org-Id optional but must match if provided)
  • UUID normalization: Normalize all UUIDs to lowercase
  • Error response format: 400 for validation errors or Super Admin missing X-Org-Id, 403 for RBAC failure or org-scoped user with mismatched X-Org-Id
Request Schema:
class HierarchyValidateMoveRequest(BaseModel):
    child_agent_entity_id: str  # UUID, normalized to lowercase
    new_parent_agent_entity_id: Optional[str]  # UUID normalized to lowercase or null
    effective_start_date: date  # YYYY-MM-DD, no timezone

class HierarchyValidateMoveResponse(BaseModel):
    is_valid: bool
    errors: List[str]  # Empty if valid

F) Tree and History Endpoints

File: api/routes/admin_hierarchy.py Endpoints:
  • GET /admin/hierarchy/tree?as_of=YYYY-MM-DD
    • Returns tree structure (parent-child relationships) as-of date
    • Filter: Use overlap predicate: effective_start_date <= as_of AND (effective_end_date IS NULL OR effective_end_date > as_of)
    • Format: Nested JSON structure (dashboard-friendly)
    • RBAC: Admin/CEO only (use require_admin_or_ceo from api/routes/admin_identity.py)
    • Org-scoped: Use get_org_id_for_hierarchy_ops() dependency (see Security & RBAC section)
      • Super Admin: Requires X-Org-Id header (400 if missing)
      • Non-Super Admin: Uses JWT org_id (X-Org-Id optional but must match if provided)
    • Error response: 400 for invalid date format or Super Admin missing X-Org-Id, 403 for RBAC failure or org-scoped user with mismatched X-Org-Id
  • GET /admin/hierarchy/history?agent_entity_id=...
    • Returns all SCD2 rows for given agent_entity_id
    • Ordered by effective_start_date DESC
    • RBAC: Admin/CEO only (use require_admin_or_ceo from api/routes/admin_identity.py)
    • Org-scoped: Use get_org_id_for_hierarchy_ops() dependency (see Security & RBAC section)
      • Super Admin: Requires X-Org-Id header (400 if missing)
      • Non-Super Admin: Uses JWT org_id (X-Org-Id optional but must match if provided)
    • Error response: 400 for invalid UUID or Super Admin missing X-Org-Id, 403 for RBAC failure or org-scoped user with mismatched X-Org-Id, 404 if agent not found
Query Functions: Add to api/bigquery/phase4_hierarchy_queries.py:
  • get_hierarchy_tree(tenant_id, org_id, as_of_date) (uses overlap predicate)
  • get_hierarchy_history(tenant_id, org_id, agent_entity_id)

G) Seed Generator Script

File: tools/generate_paid_today_hierarchy_seed.py Requirements:
  • Inputs:
    • data/reference/agent2_pepm.csv
    • Resolver export (API call to GET /admin/agents/resolver or CSV file)
    • Optional: hierarchy_changeset.json (for overrides)
  • Logic:
    1. Load agent2_pepm.csv, dedupe by normalized name
    2. Join to resolver by EXACT normalized name only (fail-closed):
      • 0 matches => exception row UNKNOWN_AGENT (write to exceptions CSV)
      • 1 matches => exception row AMBIGUOUS_AGENT_NAME (write to exceptions CSV)
    3. Hierarchy construction (Option 1, FINAL):
      • Robin Bundy is ROOT: parent_agent_entity_id = null
      • Default parent for every resolved paid agent = Robin Bundy (direct child)
      • Apply overrides from ops-based changeset JSON (see H) Changeset/Override Support
    4. Output:
      • hierarchy_paid_today_seed.json: JSON array of HierarchyBootstrapRow objects (for API consumption)
      • hierarchy_paid_today_seed.csv: CSV format (for manual review)
      • Schema-aligned fields only: child_agent_entity_id, parent_agent_entity_id, effective_start_date, effective_end_date (optional), change_status, active
      • hierarchy_paid_today_exceptions.csv: Agent_2_First_Name, Agent_2_Last_Name, Agent_2_Business_Name, Root_Owner, error_type, error_message
      • Optional: hierarchy_changeset.json (export format for ops-based changeset)
Normalization Function (MUST match resolver SQL exactly - see Normalization Consistency above):
def normalize_full_name(first_name: str, last_name: str) -> str:
    full = f"{first_name} {last_name}"
    # Remove punctuation: . , ' - (must match SQL exactly)
    full = full.replace('.', '').replace(',', '').replace("'", '').replace('-', '')
    # Trim and uppercase
    full = full.strip().upper()
    # Collapse multiple spaces
    full = ' '.join(full.split())
    return full

H) Changeset/Override Support

Files:
  • docs/config/hierarchy_changeset.sample.json (new)
Requirements:
  1. Changeset JSON Format (preferred, ops-based, schema-aligned):
    • File: hierarchy_changeset.json
    • Schema: Maps directly to /reparent or /batch_reparent WITHOUT non-schema fields
      {
        "meta": {
          "org_id": "creative_benefit_strategies",
          "as_of": "YYYY-MM-DD",
          "source": "seed|ui",
          "comment": "..."
        },
        "operations": [
          {
            "op": "reparent",
            "child_agent_entity_id": "UUID",
            "new_parent_agent_entity_id": "UUID|null",
            "effective_start_date": "YYYY-MM-DD",
            "change_status": "APPROVED",
            "active": true
          }
        ]
      }
      
    • IMPORTANT: No relationship_type or notes fields (not in schema)
    • Generator can read this file and apply overrides by generating final seed JSON
    • No full-tree snapshot imports. Ops-based only (replayable, auditable, SCD2-safe)
    • Sample must include Tommy Dang -> Richard Ballard reparent op (no notes field)

I) UI/UX Changes (MVP Now, Future Drag/Drop Ready)

File: dashboard/src/pages/admin/hierarchy.tsx (new page) or extend existing hierarchy page Components:
  1. Hierarchy Bootstrap Upload: File upload for hierarchy_paid_today_seed.json (JSON array), parse and call POST /admin/hierarchy/bootstrap with JSON body
  2. Reparent Modal: Child search, parent search or “Set as root”, effective start date picker, change status (default: APPROVED), active toggle (default: Yes), call POST /admin/hierarchy/reparent
  3. Bulk Moves Panel: Upload hierarchy_changeset.json -> parse and call POST /admin/hierarchy/batch_reparent, shows per-op results + export results JSON/CSV
  4. Tree View: Call GET /admin/hierarchy/tree?as_of=YYYY-MM-DD (org-scoped), display hierarchical tree (collapsible nodes)
API Client: dashboard/src/lib/apiClient.ts (add methods):
  • bootstrapHierarchy(rows) - returns response with created/skipped counts
  • reparentHierarchy(request) - returns response with op_hash (opaque metadata, no client-side computation)
  • batchReparentHierarchy(operations) - returns response with per-op op_hash + batch_hash (opaque metadata)
  • validateMoveHierarchy(request) - returns validation result (no writes)
  • getHierarchyTree(asOfDate) - returns tree structure
  • getHierarchyHistory(agentEntityId) - returns SCD2 history
  • getAgentResolver(orgId) - returns resolver export
API Client Pattern (follow existing AgentMappingClient pattern):
// Use requestWithHeaders for X-Org-Id header
const customHeaders: Record<string, string> = {};
if (orgId) {
  customHeaders['X-Org-Id'] = orgId;
}

return apiClient.requestWithHeaders<ResponseType>(endpoint, {
  method: 'POST',
  body: JSON.stringify(request),
  customHeaders,
});

J) Tests (pytest)

File: api/tests/test_admin_hierarchy.py (new) Test Cases:
  1. Resolver endpoint: Returns normalized names deterministically (verify full_name_normalized format matches seed generator), org/tenant scoping enforced, RBAC (Super Admin vs non-Super Admin), X-Org-Id header handling (Super Admin requires header, non-Super Admin uses JWT), normalization handles hyphens and multiple spaces
  2. Bootstrap endpoint: Org isolation rejects cross-org refs, idempotency, invalid IDs => error (400), cycle reject, single-parent overlap handling (uses overlap predicate), UUID normalization, error response format, Super Admin access (works without membership), Super Admin missing X-Org-Id => 400
  3. Reparent endpoint: Reparent closes old row + inserts new (verify SCD2 pattern with overlap predicate), set-root works, NOOP works, cycle reject, UUID normalization, Super Admin access
  4. Batch Reparent endpoint: Atomic validation (fail whole batch on validation errors, return 400), returns per-op results (created/noop), cross-op cycle detection, conflicts detection, idempotent per op, UUID normalization, batch_hash uses canonical JSON (NOT concatenation), Super Admin access
  5. Tree endpoint: Returns correct parent after reparent with effective dating, as_of date filtering works correctly (uses overlap predicate), response format (nested JSON structure), Super Admin access
  6. Server-side hashing (backend only): op_hash is deterministic (same op_core produces same hash), different parent or date => different op_hash, batch_hash is deterministic (canonical JSON, NOT concatenation), hash computation uses canonical JSON (UTF-8, sorted keys, no whitespace, arrays preserve order, JSON null is null), active is boolean (NOT “Yes”/“No”), hashes NOT stored in DB (except record_hash shim), UI treats hashes as opaque metadata
  7. Security & RBAC (prevent unauthorized access):
    • Super Admin can call all hierarchy endpoints with X-Org-Id set (works even if no prior org membership record exists)
    • Super Admin missing X-Org-Id on writes gets 400 (not 403) with clear message
    • Non-admin org user cannot call write endpoints (403)
    • Org admin/CEO can call write endpoints for their org (200)
    • Org admin/CEO cannot call write endpoints for different org (403)
    • Org-scoped user with mismatched X-Org-Id gets 403 (prevent spoofing)
    • Admin/CEO can read full org tree/history (200)
    • Non-admin/CEO cannot read (403)
    • Super Admin can read any org with X-Org-Id (200)
    • Super Admin with aaronbundy681@gmail.com email is detected correctly
  8. Regression guard (prevent commission engine contamination): Test/grep rule: commission engine modules must NOT import hierarchy query module, verify no hierarchy joins in payout paths, verify dashboards use hierarchy only for visibility, not payouts
Test Pattern: Use kill_switch_bigquery fixture from api/tests/test_onboarding.py to mock BigQuery calls.

File Changes Summary

New Files

  • api/routes/admin_hierarchy.py (separate router for hierarchy endpoints)
    • Includes is_platform_super_admin() helper function (wraps _is_platform_admin())
    • Includes get_org_id_for_hierarchy_ops() dependency (Super Admin-aware org scoping)
  • api/schemas/admin_hierarchy.py
  • api/utils/hierarchy_hashing.py (deterministic hashing utilities: compute_op_hash, compute_batch_hash, compute_record_hash)
  • tools/generate_paid_today_hierarchy_seed.py
  • api/tests/test_admin_hierarchy.py
  • api/tests/test_hierarchy_regression_guards.py (regression guard tests)
  • dashboard/src/pages/admin/hierarchy.tsx (if new page)
  • docs/config/hierarchy_changeset.sample.json (include Tommy Dang -> Richard Ballard reparent op, no notes field)

Modified Files

  • api/bigquery/phase4_hierarchy_queries.py (add tree/history query functions with overlap predicate)
  • dashboard/src/lib/apiClient.ts (add hierarchy API methods)
  • api/main.py (register new router: app.include_router(admin_hierarchy_router, prefix="/api/v1/admin", tags=["Admin Hierarchy"]))

Critical Constraints

  1. NO commission engine changes: Do not touch commission logic, assignments, or payout paths. Hierarchy is ONLY for reporting/visibility and dashboard scoping. Do not add hierarchy joins to payout paths.
  2. Guarantee no downstream breakage: No schema changes to ingestion/commission tables, hierarchy endpoints only write config_agent_hierarchy, add regression guard test/grep rule, dashboards may use hierarchy only for visibility scoping and rollups, never payouts
  3. Decimal-only rules: Do not introduce floats anywhere in shared libs
  4. Strict tenant + org isolation: Primary org for Phase 2 prod validation: org_id = "creative_benefit_strategies". All operations enforce tenant/org scoping.
  5. Deterministic, auditable behavior: No destructive edits. Use SCD2 (close prior row + insert new row). Ops-based changesets are replayable and auditable. Implement deterministic hashing (op_hash, batch_hash) for audit trail.
  6. agent_entity_id only: All FK references use agent_entity_id (UUID), never agent_id
  7. Fail-closed: Name matching must reject ambiguous cases (0 or >1 matches)
  8. Normalization consistency: Normalization MUST be identical in SQL and Python (uppercase, trim, remove punctuation: . , ' -, collapse spaces using REGEXP_REPLACE in SQL)
  9. Schema alignment: config_agent_hierarchy table DOES NOT have relationship_type or notes fields. MVP must not include these in CSV/JSON payloads or inserts.
  10. UUID normalization: All UUIDs normalized to lowercase (accept both cases in API, store lowercase in DB)
  11. Date format: All dates use YYYY-MM-DD format, no timezone, no time component
  12. Error response format: FastAPI HTTPException with detail field (string or list of strings)
  13. Router registration: Must register router in api/main.py with prefix /api/v1/admin
  14. X-Org-Id header: All endpoints must accept X-Org-Id header for org-scoping
  15. SCD2 overlap predicate: All overlap checks MUST handle bounded rows using: effective_start_date <= new_start_date AND (effective_end_date IS NULL OR effective_end_date >= new_start_date)
  16. Batch semantics: Atomic validation (fail whole batch on validation errors, return 400), no partial-success mode in MVP
  17. Canonical hashing: op_hash uses boolean true/false (NOT “Yes”/“No”), JSON null is null (NOT string “null”), batch_hash uses canonical JSON (NOT concatenation)
  18. record_hash compatibility: Populate record_hash with deterministic row-core hash to satisfy schema (NOT NULL constraint)
  19. Super Admin support: MVP must include Super Admin bypass to prevent bootstrapping deadlocks. Email aaronbundy681@gmail.com treated as platform_super_admin (allowlist).

Implementation Order

  1. Resolver endpoint (GET /admin/agents/resolver) with REGEXP_REPLACE normalization
  2. Hierarchy hashing utilities (api/utils/hierarchy_hashing.py: compute_op_hash, compute_batch_hash, compute_record_hash)
  3. Hierarchy model + reparent + tree read (POST /admin/hierarchy/reparent, GET /admin/hierarchy/tree, GET /admin/hierarchy/history) with overlap predicate
  4. Batch reparent + validate move (POST /admin/hierarchy/batch_reparent atomic, POST /admin/hierarchy/validate_move optional)
  5. Bootstrap import (POST /admin/hierarchy/bootstrap)
  6. Seed generator script (tools/generate_paid_today_hierarchy_seed.py)
  7. UI wiring for bootstrap + reparent + batch moves
  8. Tests + regression coverage

Validation Checklist

  • Resolver returns normalized names deterministically matching seed generator normalization (REGEXP_REPLACE whitespace collapse)
  • Bootstrap rejects cycles, invalid IDs, multiple parents (overlap predicate), cross-org refs
  • Reparent implements SCD2 correctly (close old + insert new, overlap predicate)
  • Batch reparent validates all then applies atomically (fail whole batch on validation errors, return 400)
  • Tree endpoint respects effective dating (overlap predicate)
  • Seed generator fails-closed on ambiguous names
  • All endpoints enforce org-scoping
  • Changeset JSON format supported (read/export, schema-aligned, no relationship_type/notes)
  • Tests pass with mocked BigQuery
  • Commission/assignment/payout paths confirmed unchanged (no hierarchy joins)
  • Server-side hashing implemented (op_hash, batch_hash) - backend only, canonical JSON (NOT concatenation), boolean active (NOT “Yes”/“No”), JSON null (NOT string “null”)
  • record_hash compatibility shim implemented (populate with row-core hash)
  • Audit logging implemented (request_id, actor, tenant_id, org_id, op_hashes, batch_hash, status, errors)
  • Hashing tests verify deterministic behavior (same op_core => same hash, different => different hash, batch_hash uses canonical JSON)
  • Regression guard tests pass (commission engine does not import hierarchy modules)
  • Router registered in api/main.py with prefix /api/v1/admin
  • All endpoints accept X-Org-Id header
  • UUID normalization (accepts both cases, stores lowercase)
  • Error response format consistent (FastAPI HTTPException with detail field)
  • Tree response format (nested JSON structure)
  • Bootstrap accepts JSON body (not multipart CSV)
  • Normalization test validates hyphens and multiple spaces handling
  • Super Admin support implemented (aaronbundy681@gmail.com allowlist, works without membership records)
  • Super Admin missing X-Org-Id on writes => 400 (not 403)
  • Non-admin users cannot write (403)