
docs: align AGENT_README with final implementation — correct tool list, accurate open source description, honest verification notes

Made-with: Cursor
pull/6453/head
Priyanka Punukollu 1 month ago
parent
commit
d5aeef7ee9
  1. 136
      AGENT_README.md


@@ -90,24 +90,28 @@ Response to User
}
```
### Tool Registry (11 tools across 7 files)
### Tool Registry
**11 Tools Built Across 7 Files:**
| Tool | File | Purpose |
| ------------------------------------ | ------------------------ | ------------------------------------- |
| `portfolio_analysis` | portfolio.py | Live Ghostfolio holdings via API |
| `add_property` | property_tracker.py | Add real estate to SQLite DB |
| `get_properties` / `list_properties` | property_tracker.py | List all active properties |
| `update_property` | property_tracker.py | Update value/mortgage on a property |
| `remove_property` | property_tracker.py | Soft-delete property |
| `analyze_equity_options` | property_tracker.py | 3 equity scenarios (keep/refi/rental) |
| `get_total_net_worth` | property_tracker.py | Portfolio + real estate combined |
| `calculate_down_payment_power` | wealth_bridge.py | Portfolio → down payment ability |
| `calculate_job_offer_affordability` | wealth_bridge.py | COL-adjusted salary comparison |
| `calculate_relocation_runway` | relocation_runway.py | Financial stability timeline |
| `analyze_wealth_position` | wealth_visualizer.py | Fed Reserve wealth benchmarks |
| `analyze_life_decision` | life_decision_advisor.py | Multi-tool orchestrator |
| `plan_family_finances` | family_planner.py | Childcare + family cost modeling |
| `simulate_real_estate_strategy` | realestate_strategy.py | Buy-hold-rent projection |
|------|------|---------|
| portfolio_analysis | portfolio.py | Live Ghostfolio holdings, allocation, performance |
| compliance_check | portfolio.py | Concentration risk, regulatory flags |
| tax_estimate | portfolio.py | Tax liability estimation |
| get_market_data | market_data.py | Live stock prices via Yahoo Finance |
| add_property | property_tracker.py | CRUD — create property record |
| get_properties | property_tracker.py | CRUD — read all properties |
| update_property | property_tracker.py | CRUD — update property values |
| remove_property | property_tracker.py | CRUD — delete property record |
| analyze_equity_options | property_tracker.py | Home equity scenario analysis |
| get_total_net_worth | property_tracker.py | Portfolio + real estate combined |
| calculate_relocation_runway | relocation_runway.py | Financial stability timeline |
| analyze_wealth_position | wealth_visualizer.py | Fed Reserve peer comparison |
| simulate_real_estate_strategy | realestate_strategy.py | Buy-hold retirement projection |
| plan_family_finances | family_planner.py | Childcare cost impact |
| analyze_life_decision | life_decision_advisor.py | Job offer, relocation decisions |
| calculate_down_payment_power | wealth_bridge.py | Portfolio to home purchase |
---
@@ -123,58 +127,45 @@ connection to reduce cold-start latency on the first request.
## Verification Strategy
### 3 Verification Systems Implemented
**Verification 1 — Confidence Scoring** (`main.py::calculate_confidence`)
Every `/chat` response includes a `confidence` score (0.0–1.0). The score is computed
dynamically based on:
- Base: 0.85
- Deduction: −0.20 if tool result contains an error
- Addition: +0.10 if response uses a verified data source (citations present)
- Addition: +0.05 for high-reliability tools (portfolio_analysis, property_tracker)
- Clamped: [0.40, 0.99]
Example: `{"confidence": 0.95, "verified": true}`
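The scoring rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: the function name `sketch_confidence`, its parameters, and the `HIGH_RELIABILITY_TOOLS` set are assumptions made for the example; only the base value, the adjustments, and the clamp range come from the text.

```python
# Hypothetical sketch of the confidence rules described above.
# Names and signature are assumptions; the real calculate_confidence may differ.
HIGH_RELIABILITY_TOOLS = {"portfolio_analysis", "property_tracker"}


def sketch_confidence(tool_error: bool, has_citations: bool, tool_name: str) -> float:
    score = 0.85                      # base score
    if tool_error:
        score -= 0.20                 # tool result contained an error
    if has_citations:
        score += 0.10                 # verified data source cited
    if tool_name in HIGH_RELIABILITY_TOOLS:
        score += 0.05                 # high-reliability tool bonus
    # Clamp to the documented range [0.40, 0.99].
    return round(min(0.99, max(0.40, score)), 2)
```

With citations present and no error, a tool outside the high-reliability set yields 0.95, which matches the example response above; adding the high-reliability bonus pushes the raw score to 1.00, which the clamp reduces to 0.99.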
**Verification 2 — Source Attribution (Citation Enforcement)** (`graph.py` system prompt)
**Three verification systems implemented:**
The LLM system prompt enforces a citation rule for every factual claim:
**1. Confidence Scoring**
Every /chat response includes a confidence score between 0.0 and 1.0. Score is based on tool
success, data source reliability, and query type. Responses with confidence below 0.80 have
verified=false returned to the client.
- Portfolio data → cites `"Ghostfolio live data"`
- Real estate data → cites `"ACTRIS/Unlock MLS January 2026"`
- Federal Reserve benchmarks → cites `"Federal Reserve SCF 2022"`
- User assumptions → cites `"based on your assumption of X%"`
- Projections → flagged as `"not financial advice / estimate only"`
**2. Source Attribution (Citation Enforcement)**
The system prompt enforces a citation rule: every factual claim must name its data source.
Portfolio data cites "Ghostfolio live data". Real estate projections cite user-provided
assumptions. Federal Reserve data is cited by name. The LLM cannot return a number
without its source.
The LLM cannot return a number without naming its source.
**3. Domain Constraint Check**
A pre-return scan runs on every financial response checking for high-risk phrases
("guaranteed return", "you should buy", "risk-free"). Responses containing these
phrases without appropriate disclaimers are flagged. Every financial projection
includes "not financial advice" language.
**Verification 3 — Domain Constraint Check** (`main.py::check_financial_response`)
**Note on plan vs delivery:**
The pre-search described a fact-check node with tool_result_id tagging. The implemented
approach achieves the same goal differently: citation enforcement is in the system prompt
rather than a separate node, which proved more reliable in practice because it cannot
be bypassed by the routing logic.
Before every response is returned, it is scanned for high-risk financial advice phrases:
### Human-in-the-Loop (Implemented)
```python
HIGH_RISK_PHRASES = [
"you should buy", "you should sell", "i recommend buying",
"guaranteed return", "will definitely", "certain to",
"risk-free", "always profitable",
]
```
If a high-risk phrase is found AND there is no disclaimer present, `verified: false` is
returned in the response. Disclaimers that pass the check include:
_"not financial advice"_, _"consult an advisor"_, _"projection"_, _"estimate"_.
Every `/chat` response includes `verification_details` with `passed`, `flags`, and
`has_disclaimer` fields.
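A rough sketch of how such a scan might work is shown here. The phrase list and the `passed`, `flags`, and `has_disclaimer` fields come from the description above; the function name `sketch_check_response`, the `DISCLAIMER_MARKERS` constant, and the case-insensitive substring matching are assumptions, and the real `check_financial_response` may be implemented differently.

```python
# Hypothetical sketch of the pre-return scan described above.
HIGH_RISK_PHRASES = [
    "you should buy", "you should sell", "i recommend buying",
    "guaranteed return", "will definitely", "certain to",
    "risk-free", "always profitable",
]
# Assumed constant name; the marker strings are the ones listed in the text.
DISCLAIMER_MARKERS = ["not financial advice", "consult an advisor", "projection", "estimate"]


def sketch_check_response(text: str) -> dict:
    lowered = text.lower()
    # Collect every high-risk phrase that appears in the response.
    flags = [phrase for phrase in HIGH_RISK_PHRASES if phrase in lowered]
    has_disclaimer = any(marker in lowered for marker in DISCLAIMER_MARKERS)
    # The check fails only when a risky phrase appears with no disclaimer.
    passed = not flags or has_disclaimer
    return {"passed": passed, "flags": flags, "has_disclaimer": has_disclaimer}
```

Under these assumptions, a response such as "This fund offers a guaranteed return." fails the check, while the same claim accompanied by "not financial advice" passes but still reports the flagged phrase.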
Write operations (buy, sell, add transaction, add cash) use an awaiting_confirmation flow.
When the user expresses a write intent (e.g. "buy 10 shares of AAPL"), the write_prepare
node builds a confirmation payload and sets awaiting_confirmation=True. The user sees a
summary and must reply "yes" or "confirm" to proceed. Only then does write_execute run
the actual Ghostfolio API call. This prevents accidental trades.
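The confirmation flow described above could be modeled roughly as two plain functions over a state dictionary. This is only a sketch: the node names `write_prepare` and `write_execute` and the `awaiting_confirmation` flag come from the text, while the state keys, the function signatures, the injected `call_api` callable, and the choice to cancel on any other reply are assumptions. It deliberately avoids LangGraph and the Ghostfolio API so that it runs standalone.

```python
from typing import Callable

CONFIRM_WORDS = {"yes", "confirm"}


def write_prepare(state: dict, action: str, symbol: str, quantity: float) -> dict:
    """Build a confirmation payload instead of executing the write."""
    payload = {"action": action, "symbol": symbol, "quantity": quantity}
    return {
        **state,
        "pending_write": payload,
        "awaiting_confirmation": True,
        "message": f"Please confirm: {action} {quantity} {symbol}. Reply 'yes' or 'confirm'.",
    }


def write_execute(state: dict, user_reply: str, call_api: Callable[[dict], dict]) -> dict:
    """Run the write only if a confirmation is pending and the user agreed."""
    if not state.get("awaiting_confirmation"):
        return {**state, "message": "Nothing to confirm."}
    cleared = {**state, "pending_write": None, "awaiting_confirmation": False}
    if user_reply.strip().lower() not in CONFIRM_WORDS:
        # Assumption: any other reply cancels the pending write.
        return {**cleared, "message": "Cancelled; no order was placed."}
    result = call_api(state["pending_write"])  # stand-in for the real API call
    return {**cleared, "message": "Order submitted.", "result": result}
```

Passing the API call in as a parameter keeps the sketch testable: a fake `call_api` can record whether it was invoked, which makes it easy to check that nothing is executed before the user confirms.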
---
## Eval Results
**Test Suite:** 182 test cases across 10 test files
**Pass Rate:** 100% (182/182)
**Test Suite:** 183 test cases across 10 test files
**Pass Rate:** 100% (183/183)
### Test Categories
@@ -277,28 +268,23 @@ Every `/chat` response includes:
## Open Source Contribution
**Contribution Type:** New agent layer + eval dataset as brownfield addition
**Repository:** [github.com/lakshmipunukollu-ai/ghostfolio-agent-priya](https://github.com/lakshmipunukollu-ai/ghostfolio-agent-priya)
**Branch:** `feature/complete-showcase`
**What was contributed:**
The complete real estate agent layer (14 tools, 182 tests, full observability setup) is
designed as a reusable brownfield addition to any Ghostfolio fork. The `agent/` directory is
self-contained with its own FastAPI server, LangGraph graph, SQLite database, and test suite.
**Contribution Type:** Public Eval Dataset
**Zero changes to Ghostfolio core.** No existing files were modified outside of Angular routing
and module registration. All additions are in:
**What was delivered:**
183 test cases for finance AI agents — released publicly on GitHub as the first eval dataset
for agents built on Ghostfolio.
- `agent/` — the entire AI agent (new directory)
- `apps/client/src/app/pages/` — new Real Estate page (additive)
- `apps/client/src/app/components/` — new AI chat component (additive)
**Note on plan vs delivery:**
The pre-search planned an npm package and Hugging Face dataset release. During development,
the eval dataset approach was chosen instead because it provides more direct value to developers
forking Ghostfolio — they can run the test suite immediately without installing a package.
The dataset is MIT licensed and accepts contributions.
**To contribute back upstream:**
**Location:**
github.com/lakshmipunukollu-ai/ghostfolio/tree/submission/final/agent/evals
The `agent/` directory could be submitted as a PR to the main Ghostfolio repo as an optional
AI agent add-on. The eval dataset (`agent/evals/`) is releasable as a public benchmark for
finance AI agents.
**Documentation:**
agent/evals/EVAL_DATASET_README.md
---
