
Dashboard Stability Lock-In

Date: January 2026
Status: Active
Purpose: Prevent dashboards and agent detail/export from “going dark” again

What Actually Broke

Impact: The outage affected agent detail/export endpoints (/api/v1/agent/{agent_name}/stage1/detail and /api/v1/agent/{agent_name}/stage1/export). Stage-v5 KPI cards/charts were NOT affected - this was not a data-truth failure. Stage-v5 data was correct throughout the outage.

1. BigQuery Client Initialization Bug

Symptom: Agent Detail Credit Report returned empty results (0 rows, valid 200 response)
Root Cause: get_agent_stage1_detail() checked module-level client variable (always None) instead of calling _get_bq_client()
Why Agent Detail/Export Appeared “Down”:
  • Queries never executed (client was None)
  • Returned empty results silently
  • Stage-v5 data was correct, but Detail Credit Report showed empty
  • Users saw “No credit rows found” even when data existed
Fix: Changed to bq_client = _get_bq_client() pattern
Prevention:
  • Test: test_get_agent_stage1_detail_calls_get_bq_client() ensures function calls _get_bq_client()
  • CI Guard: .github/workflows/lint_bq_client_pattern.yml fails if the pattern “if client is None:” is found in query functions
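
The difference between the buggy and fixed patterns can be sketched as follows. This is a minimal illustration, not the production code: FakeBQClient, its query_rows() method, and the function names detail_buggy/detail_fixed are hypothetical stand-ins, and only _get_bq_client() and the module-level client variable come from the description above.

```python
from typing import Any, List, Optional

# Module-level variable that is never assigned; mirrors the "always None" client.
client: Optional[Any] = None


class FakeBQClient:
    """Hypothetical stand-in for a BigQuery client; returns canned rows."""

    def query_rows(self, sql: str) -> List[dict]:
        return [{"agent_name": "example", "credit": 1}]


def _get_bq_client() -> FakeBQClient:
    """Stand-in for the real lazy accessor named in this document."""
    return FakeBQClient()


def detail_buggy(agent_name: str) -> List[dict]:
    # Bug: checks the module-level variable, which is always None,
    # so the query never runs and the caller sees an empty result.
    if client is None:
        return []
    return client.query_rows("SELECT 1")


def detail_fixed(agent_name: str) -> List[dict]:
    # Fix: obtain the client through the accessor on every call.
    bq_client = _get_bq_client()
    return bq_client.query_rows("SELECT 1")
```

The buggy variant returns an empty list without raising, which is why the endpoint still answered 200 while showing no rows.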

2. Missing Export Dependencies

Symptom: Excel export returned 500 error - “No module named ‘reportlab’”
Root Cause: reportlab and openpyxl not in api/requirements-api.txt
Why Agent Detail/Export Appeared “Down”:
  • Export endpoint crashed with 500 error
  • Users couldn’t download reports
  • Frontend showed error message
Fix: Added reportlab>=4.0.0 and openpyxl>=3.1.0 to api/requirements-api.txt
Prevention:
  • Test: test_export_returns_200_with_valid_excel() verifies export returns 200 with valid Excel bytes
  • Test: test_export_handles_missing_dependencies_gracefully() ensures graceful error handling
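
One way to handle a missing export library gracefully is to import it by name and convert the failure into a clear, catchable error. The sketch below assumes that approach; ExportDependencyError, require_module(), and missing_export_dependencies() are hypothetical names and are not taken from the codebase.

```python
import importlib
from types import ModuleType
from typing import List, Sequence


class ExportDependencyError(RuntimeError):
    """Hypothetical error type for a missing export library."""


def require_module(name: str) -> ModuleType:
    """Import a module by name, or raise an error naming the missing package."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        raise ExportDependencyError(
            f"Export dependency '{name}' is not installed"
        ) from exc


def missing_export_dependencies(
    names: Sequence[str] = ("reportlab", "openpyxl"),
) -> List[str]:
    """Return the names of export libraries that cannot be imported."""
    missing = []
    for name in names:
        try:
            require_module(name)
        except ExportDependencyError:
            missing.append(name)
    return missing
```

A check like missing_export_dependencies() could run at startup or in a test, so a missing package surfaces before a user requests an export.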

3. Date Normalization Inconsistency

Symptom: Detail fetch and export could normalize dates differently (potential data mismatch)
Root Cause: Code duplication - normalizePeriodInput() defined in AgentDetail.tsx, used inconsistently
Why Agent Detail/Export Appeared “Down”:
  • Edge cases (e.g., 2025-9 vs 2025-09) could cause mismatched API calls
  • Export might query different date range than detail view
  • Users might see inconsistent data
Fix: Created dashboard/src/lib/dateUtils.ts as single source of truth, refactored AgentDetail.tsx to import from it
Prevention:
  • Test: test_detail_fetch_uses_dateUtils() ensures detail fetch uses shared utility
  • Test: test_export_uses_dateUtils() ensures export uses shared utility
  • Test: Edge cases normalize correctly (2025-9 → 2025-09)
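
The normalization rule itself (2025-9 → 2025-09) can be illustrated in a few lines. The real normalizePeriodInput() lives in dashboard/src/lib/dateUtils.ts and is TypeScript; the Python below is only a sketch of the rule, and it assumes a YYYY-M or YYYY-MM input and that invalid input is rejected, neither of which is confirmed by this document.

```python
import re

# Accepts a four-digit year and a one- or two-digit month (assumed input shape).
_PERIOD_RE = re.compile(r"^\s*(\d{4})-(\d{1,2})\s*$")


def normalize_period_input(value: str) -> str:
    """Normalize 'YYYY-M' or 'YYYY-MM' to zero-padded 'YYYY-MM'."""
    match = _PERIOD_RE.match(value)
    if match is None:
        raise ValueError(f"Unrecognized period: {value!r}")
    year, month = match.group(1), int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"Month out of range: {value!r}")
    return f"{year}-{month:02d}"
```

Because both the detail fetch and the export call the same function, the two requests always carry the same period string.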

Regression Prevention Tests

| Test File | Test Case | Prevents |
| --- | --- | --- |
| api/tests/test_bigquery_client_pattern.py | test_get_agent_stage1_detail_calls_get_bq_client | Client init bug (silent empty results) |
| api/tests/test_bigquery_client_pattern.py | test_get_agent_stage1_detail_handles_none_client | Client init bug (graceful handling + no silent empty) |
| api/tests/test_agent_dashboard_routes.py | test_export_returns_200_with_valid_excel | Missing dependencies (500 errors) + Content-Disposition header pin |
| api/tests/test_agent_dashboard_routes.py | test_export_handles_missing_dependencies_gracefully | Missing dependencies (graceful errors) |
| dashboard/src/lib/__tests__/dateUtils.test.ts | normalizePeriodInput edge cases | Date normalization bugs |
| dashboard/src/components/__tests__/AgentDetail.test.tsx | test_detail_fetch_uses_dateUtils | Code duplication |
| dashboard/src/components/__tests__/AgentDetail.test.tsx | test_export_uses_dateUtils | Code duplication |

CI/CD Guards

  • BigQuery Client Pattern Check: .github/workflows/lint_bq_client_pattern.yml fails if the pattern “if client is None:” is found (an escape-hatch comment is allowed). It scans only production code (api/bigquery/queries.py); tests are excluded.
  • Export Dependency Test: Route tests verify export returns 200 with valid Excel bytes and pinned Content-Disposition header
  • No Silent Empty Check: Test verifies distinct log code emitted when client unavailable (Part 1.3)
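
The pattern check described above amounts to a line scan with an opt-out. The following is a rough sketch of that idea, not the contents of the workflow file: find_violations() is a hypothetical helper, and the escape-hatch comment text is invented for illustration because the real marker is not given here.

```python
import re
from typing import List

# Pattern the guard looks for, as described above.
FORBIDDEN = re.compile(r"^\s*if\s+client\s+is\s+None\s*:")

# Hypothetical escape-hatch marker; the real comment text is not given here.
ESCAPE_HATCH = "# bq-client-pattern: allow"


def find_violations(source: str) -> List[int]:
    """Return 1-based line numbers that use the forbidden check."""
    violations = []
    for number, line in enumerate(source.splitlines(), start=1):
        if FORBIDDEN.search(line) and ESCAPE_HATCH not in line:
            violations.append(number)
    return violations
```

A CI step could read api/bigquery/queries.py, pass its text to a function like this, and exit non-zero when the returned list is non-empty.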

Important Notes

Detail Credit Report is Legacy

The Detail Credit Report (/api/v1/agent/{agent_name}/stage1/detail) is a legacy implementation that:
  • Queries Stage-1 data directly (with Stage-3 RBAC filtering)
  • Will be replaced by engine-backed Stage-v5 detail endpoint in a future phase
  • Current implementation is stable and will not be rewired until new engine output is ready
Do NOT attempt to rewire Detail Credit Report to Stage-v5 until new engine output is available. This stability lock-in ensures the legacy implementation remains stable while Phase 7 (new engine) development continues.

Watch Items

Prevent regression on these critical behaviors:
  1. HTTPException wrapping in BigQuery queries: get_agent_stage1_detail() intentionally wraps internal errors (e.g., RuntimeError) as HTTPException for consistent API error handling. If this behavior changes, update both the implementation and test_get_agent_stage1_detail_calls_get_bq_client() test expectation together.
  2. Locale-safe month name assertions: Tests for generateMonthOptions() must use locale-safe assertions (non-empty string labels, values 1..12) instead of hardcoded English month names (e.g., “January”, “December”) to prevent i18n regressions.
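
The wrapping behavior in watch item 1 can be sketched as below. The HTTPException class here is a local stand-in so the example runs on its own (the web framework is not named in this document), run_detail_query() is a hypothetical helper, and the 500 status code is an assumption.

```python
from typing import Callable, List


class HTTPException(Exception):
    """Local stand-in for the framework's HTTPException (framework assumed)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def run_detail_query(fetch_rows: Callable[[], List[dict]]) -> List[dict]:
    """Run a query callable, converting internal failures to HTTPException."""
    try:
        return fetch_rows()
    except HTTPException:
        raise  # already an API error; pass it through unchanged
    except RuntimeError as exc:
        # Wrap internal errors so callers see one consistent error type.
        # The 500 status code is an assumption for this sketch.
        raise HTTPException(status_code=500, detail=str(exc)) from exc
```

A test that pins this behavior would assert on the HTTPException, not the original RuntimeError, which is why the implementation and the test expectation need to change together.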

Maintenance

  • Run tests before every deployment: pytest api/tests/test_bigquery_client_pattern.py api/tests/test_agent_dashboard_routes.py
  • Run frontend tests: npm test or vitest in the dashboard directory
  • Verify CI guards pass on every PR
  • Update this document if new stability issues are found

Why Agent Detail/Export Appeared “Down” Even Though Stage-v5 Data Was Correct

The outage was not a data problem - Stage-v5 data was correct throughout. Stage-v5 KPI cards/charts were NOT affected - this was an agent detail/export endpoint issue only. The problems were:
  1. Client initialization bug: Queries never executed, returned empty results silently
  2. Missing dependencies: Export endpoint crashed with 500 error
  3. Date normalization: Potential mismatches between detail and export
All three issues have been fixed and are now protected by tests and CI guards.