
docs: fix test count to 183 and add open source contribution section (the eval dataset of 183 cases is an open-source contribution)

Made-with: Cursor
pull/6453/head
commit 2e571afe52 by Priyanka Punukollu, 1 month ago
1 changed file: BOUNTY.md (20 changes)

@@ -191,10 +191,28 @@ clear disclaimers about what is a projection vs. a prediction.
 ## Eval & Verification
-- **182 tests** (100% pass rate) covering portfolio logic, CRUD flows, strategy simulation,
+- **183 tests** (100% pass rate) covering portfolio logic, CRUD flows, strategy simulation,
   edge cases, adversarial inputs, and multi-step chains
 - **3 verification systems:** confidence scoring, source attribution (citation enforcement),
   and domain constraint check (no guaranteed-return language)
 - **LangSmith tracing** active — every request traced at `smith.langchain.com`
 - All tool failures return structured error codes (e.g., `PROPERTY_TRACKER_NOT_FOUND`)
 - Conversation history maintained across all turns via `AgentState.messages`
+---
+## Open Source Contribution
+### Eval Test Dataset (183 cases)
+The agent ships with 183 open-source evaluation
+test cases in `agent/evals/` covering:
+- Happy-path queries (portfolio, market, property)
+- Edge cases (typos, ambiguous queries, empty data)
+- Adversarial inputs (prompt injection, nonsense)
+- Multi-step conversations (add property then analyze)
+These tests are freely available for any developer
+building an AI agent on top of Ghostfolio. They
+cover the full Ghostfolio API surface and can be
+used as a benchmark for future Ghostfolio agents.
+See `agent/evals/EVAL_DATASET_README.md` for details.
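The diff above mentions structured error codes, multi-turn conversation history, and test categories. As a rough illustration only, a single case in such an eval dataset might look like the sketch below; the `EvalCase` fields, `run_case` helper, and `toy_agent` stub are all hypothetical and are not the actual schema in `agent/evals/`.

```python
# Hypothetical sketch of one eval case; field names are illustrative,
# not the real agent/evals/ schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class EvalCase:
    category: str                              # e.g. "happy_path", "edge_case", "adversarial", "multi_step"
    turns: list = field(default_factory=list)  # user messages, in order
    expect_error_code: Optional[str] = None    # e.g. "PROPERTY_TRACKER_NOT_FOUND"

def run_case(case: EvalCase, agent) -> bool:
    """Feed each turn to the agent, carrying conversation history forward."""
    history = []
    result = {}
    for turn in case.turns:
        history.append({"role": "user", "content": turn})
        result = agent(history)
        history.append({"role": "assistant", "content": result.get("content", "")})
    if case.expect_error_code is not None:
        # Tool failures are expected to surface a structured error code.
        return result.get("error_code") == case.expect_error_code
    return result.get("error_code") is None

def toy_agent(history):
    """Stub agent that fails with a structured code for an unknown tracker."""
    last = history[-1]["content"]
    if "unknown-tracker" in last:
        return {"error_code": "PROPERTY_TRACKER_NOT_FOUND", "content": ""}
    return {"content": "ok"}

case = EvalCase(category="edge_case",
                turns=["analyze unknown-tracker"],
                expect_error_code="PROPERTY_TRACKER_NOT_FOUND")
print(run_case(case, toy_agent))  # True
```

A multi-step case would simply list several `turns` (e.g. add a property, then ask for analysis), relying on the accumulated `history` in the same way the real agent relies on `AgentState.messages`.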
