@ghostfolio/finance-agent-evals

Framework-agnostic evaluation dataset and runner for finance AI agents.

Contents

  • 53 deterministic eval cases from the Ghostfolio AI MVP
  • Category split:
    • 22 happy_path
    • 11 edge_case
    • 10 adversarial
    • 10 multi_step
  • Reusable eval runner with category summaries
  • Type definitions for JavaScript and TypeScript consumers
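
To check a category split like the one above against whatever dataset you load, a small tally helper is enough. This is a minimal sketch: the `EvalCaseLike` shape, its `id` and `category` field names, and the sample data are assumptions made for illustration and are not taken from the package's type definitions.

```typescript
// Hypothetical minimal shape: only `id` and `category` are assumed here.
// Check index.d.ts for the real field names before relying on this.
type EvalCategory = 'happy_path' | 'edge_case' | 'adversarial' | 'multi_step';

interface EvalCaseLike {
  id: string;
  category: EvalCategory;
}

// Count how many cases fall into each of the four categories.
function countByCategory(
  cases: readonly EvalCaseLike[]
): Record<EvalCategory, number> {
  const counts: Record<EvalCategory, number> = {
    happy_path: 0,
    edge_case: 0,
    adversarial: 0,
    multi_step: 0
  };

  for (const evalCase of cases) {
    counts[evalCase.category] += 1;
  }

  return counts;
}

// Illustrative sample only; these are not real cases from the dataset.
const sampleCases: EvalCaseLike[] = [
  { id: 'sample-1', category: 'happy_path' },
  { id: 'sample-2', category: 'happy_path' },
  { id: 'sample-3', category: 'adversarial' }
];

console.log(countByCategory(sampleCases));
```

If the real cases do expose a `category` field, passing `FINANCE_AGENT_EVAL_DATASET` to a helper like this should reproduce the 22 / 11 / 10 / 10 split listed above.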

Install

npm install @ghostfolio/finance-agent-evals

Usage

import {
  FINANCE_AGENT_EVAL_DATASET,
  runFinanceAgentEvalSuite
} from '@ghostfolio/finance-agent-evals';

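// `myAgent` is a placeholder for your own agent client; it is not part of this package.
const result = await runFinanceAgentEvalSuite({
  // `execute` receives an eval case and returns the agent's output for it.
  execute: async (evalCase) => {
    const response = await myAgent.chat({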
      query: evalCase.input.query,
      sessionId: evalCase.input.sessionId
    });

    return {
      answer: response.answer,
      citations: response.citations,
      confidence: response.confidence,
      memory: response.memory,
      toolCalls: response.toolCalls,
      verification: response.verification
    };
  }
});

console.log(result.passRate, result.categorySummaries);
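
The snippet above assumes a `myAgent` object with a `chat` method. To try the wiring before a real agent exists, a stub along these lines could stand in. It is a sketch under assumptions: `StubChatRequest`, `StubAgentResponse`, `buildStubResponse`, and `stubAgent` are hypothetical names, and the field types are guesses that mirror only the property names used in the usage example.

```typescript
// Hypothetical request shape, mirroring the two fields passed to chat() above.
interface StubChatRequest {
  query: string;
  sessionId?: string;
}

// Hypothetical response shape. The property names come from the usage
// example; the types are assumptions, so `unknown` is used where unclear.
interface StubAgentResponse {
  answer: string;
  citations: unknown[];
  confidence: number;
  memory: unknown;
  toolCalls: unknown[];
  verification: unknown[];
}

// Build a fixed, deterministic response so repeated runs are comparable.
function buildStubResponse(request: StubChatRequest): StubAgentResponse {
  return {
    answer: `stub answer for: ${request.query}`,
    citations: [],
    confidence: 0,
    memory: null,
    toolCalls: [],
    verification: []
  };
}

// Drop-in stand-in for `myAgent`: same async chat() call shape.
const stubAgent = {
  async chat(request: StubChatRequest): Promise<StubAgentResponse> {
    return buildStubResponse(request);
  }
};

stubAgent
  .chat({ query: 'What is my portfolio allocation?', sessionId: 'demo' })
  .then((response) => console.log(response.answer));
```

A stub like this would presumably fail most eval cases, since it calls no tools and cites nothing. Its purpose is only to confirm that `execute` is invoked and that results come back before a real agent is plugged in.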

Dataset Export

The dataset in this package is generated from:

apps/api/src/app/endpoints/ai/evals/mvp-eval.dataset.ts

Exported artifact:

datasets/ghostfolio-finance-agent-evals.v1.json

Scripts

npm run check
npm run pack:dry-run

License

Apache-2.0