ghostfolio/.cursor/rules/ghostfolio-project-spec.mdc


								---

								description: Ghostfolio CSV Import Auditor project specification and requirements

								globs:

								alwaysApply: true

								---


								# Ghostfolio CSV Import Auditor — Project Spec


								## What We're Building


								A **CSV Import Auditor Reasoning Agent** for the Ghostfolio finance platform that ensures safe, deterministic, and verifiable transaction imports.


								## Use Cases


								1. Parsing broker CSV exports

								2. Mapping broker-specific fields to Ghostfolio schema

								3. Validating transaction correctness

								4. Detecting duplicates and conflicts

								5. Generating a structured preview report

								6. Safely committing transactions to the database


								## Architecture


								- **Backend**: TypeScript + NestJS (Ghostfolio native)

								- **Database**: PostgreSQL (Railway)

								- **Validation**: Zod schemas

								- **Agent Orchestration**: Custom lightweight controller (no LangChain/LangGraph/CrewAI)

								- **LLM**: Reasoning-capable model with tool/function calling

								- **Observability**: Langfuse

								- **Evaluation**: Deterministic + LLM-as-Judge (dual-layer)

								- **CI**: Eval gating

								- **Deployment**: Docker-compatible Ghostfolio backend


								## Agent Pipeline (6 Tools)


								```

								CSV → parseCSV → mapBrokerFields → validateTransactions → detectDuplicates → previewImportReport → commitImport

								```


								Each tool has: strict Zod input/output schemas, deterministic behavior, explicit error handling, no hidden side effects. All tools are pure except `commitImport()`.


								## Verification Requirements (Non-Negotiable)


								- Schema validation (required fields, types, formats)

								- Accounting invariants (quantity >= 0, commission >= 0, price logic)

								- Currency consistency

								- Duplicate prevention (idempotency)

								- Deterministic normalization

								- Transactional database commits

								- Before/after state signature validation

								- No commit occurs unless ALL deterministic checks pass


								## Human-in-the-Loop


								- Always generate `previewImportReport()` before commit

								- Require explicit user confirmation

								- Never auto-commit without approval


								## LLM Usage (Cost-Efficient)


								- Parsing and validation are deterministic code (no LLM)

								- LLM used ONLY for: broker column mapping suggestions + human-readable preview explanation

								- Maximum 1-2 LLM calls per import

								- No per-row inference


								## Performance Targets


								- Small CSV: < 2 seconds

								- Large CSV: < 10-15 seconds

								- Progress feedback required

								- Dozens of concurrent imports supported


								## Evaluation Requirements


								### Layer 1: Deterministic (Primary)

								- Schema correctness, numeric invariants, duplicate detection, idempotency, state signature comparison

								- Minimum 50 test CSV cases, pass/fail enforced in CI


								### Layer 2: LLM-as-a-Judge (Secondary)

								- Mapping explanation clarity (1-5), logical completeness (1-5), issue detection completeness (1-5), hallucination presence (0/1), overall quality (1-10)

								- Judge does NOT control commit, only scores language outputs


								## Observability (Langfuse)


								Per import: one trace (`agent.import_audit`), one span per tool, latency per tool, token usage, `toolCountPlanned` vs `toolCountExecuted`, validation result, duplicate detection metrics, commit success/failure.


								## Security


								- CSV treated as data only

								- Prompt injection prevention

								- No arbitrary CSV content in prompts

								- API keys secured server-side

								- Audit logging enforced


								## Testing Strategy


								- Unit tests per tool

								- Integration tests for full pipeline

								- Adversarial malformed CSV tests

								- Regression test suite in CI

								- Deterministic eval dataset


								## Deadlines


								| Checkpoint | Deadline | Focus |

								|---|---|---|

								| Pre-Search | 2 hours after receiving | Architecture, Plan |

								| MVP | Tuesday (24 hours) | Basic agent with tool use |

								| Early Submission | Friday (4 days) | Eval framework + observability |

								| Final | Sunday (7 days) | Production-ready + open source |


								## MVP Checklist


								- Agent responds to natural language queries in finance domain

								- At least 3 functional tools the agent can invoke

								- Tool calls execute successfully and return structured results

								- Agent synthesizes tool results into coherent responses

								- Conversation history maintained across turns

								- Basic error handling (graceful failure, not crashes)

								- At least one domain-specific verification check

								- Simple evaluation: 5+ test cases with expected outcomes

								- Deployed and publicly accessible