| Name | Last commit message | Last commit date |
|---|---|---|
| results | feat: add eval history storage with regression detection — saves every run to JSON, flags when pass rate drops | 1 month ago |
| EVAL_DATASET_README.md | docs: add eval dataset README as open source contribution documentation | 1 month ago |
| __init__.py | feat(agent): add login page, live thinking steps, and UI polish | 1 month ago |
| conftest.py | fix: restore 126 tests — add conftest mock for teleport API, fix async config | 1 month ago |
| coverage_matrix.py | feat(agent): add login page, live thinking steps, and UI polish | 1 month ago |
| golden_results.json | fix: resolve all eval failures — classifier now passes 267/267 tests at 100% | 1 month ago |
| golden_sets.yaml | fix: achieve 25/25 evals — robust criteria + health check routing | 1 month ago |
| labeled_scenarios.yaml | fix: achieve 25/25 evals — robust criteria + health check routing | 1 month ago |
| run_evals.py | feat: UI polish, chat persistence, auth, parallel evals — 60/60 passing | 1 month ago |
| run_golden_sets.py | feat: UI polish, chat persistence, auth, parallel evals — 60/60 passing | 1 month ago |
| save_eval_results.py | feat: add eval history storage with regression detection — saves every run to JSON, flags when pass rate drops | 1 month ago |
| test_cases.json | feat(agent): add login page, live thinking steps, and UI polish | 1 month ago |
| test_equity_advisor.py | feat: add equity unlock advisor to property tracker | 1 month ago |
| test_eval_dataset.py | test: add latency bounds test for tool execution — documents that tools run in <5s, LLM synthesis latency is separate and documented | 1 month ago |
| test_family_planner.py | feat: add family financial planner with global childcare data | 1 month ago |
| test_life_decision_advisor.py | feat: add life decision advisor with safe tool orchestration | 1 month ago |
| test_portfolio.py | feat(agent): complete showcase — real ACTRIS data, property tracker, 27 UI features | 1 month ago |
| test_property_onboarding.py | test: add property onboarding and strategy assumption tests | 1 month ago |
| test_property_tracker.py | feat(agent): complete showcase — real ACTRIS data, property tracker, 27 UI features | 1 month ago |
| test_real_estate.py | test(real-estate): add bedroom/price filter + structured error tests (8 total) | 1 month ago |
| test_realestate_strategy.py | fix: strategy simulator uses user assumptions not hardcoded predictions | 1 month ago |
| test_relocation_runway.py | feat: add relocation runway calculator | 1 month ago |
| test_wealth_bridge.py | feat: complete property_tracker CRUD with SQLite + add 8 wealth bridge tests | 1 month ago |
| test_wealth_visualizer.py | feat: add wealth gap visualizer with Fed Reserve benchmarks | 1 month ago |