
docs(submission): add AI development log and cost analysis

Branch: pull/6394/head
Author: Max P (1 month ago)
Commit: 2b6506de87
Changed files:
1. `Tasks.md` (+22)
2. `docs/AI-COST-ANALYSIS.md` (+87)
3. `docs/AI-DEVELOPMENT-LOG.md` (+81)
4. `docs/tasks/tasks.md` (+50)
5. `tasks/tasks.md` (+123)

Tasks.md (+22)

@@ -0,0 +1,22 @@
# Tasks
Last updated: 2026-02-23
## Active Tickets
| ID | Feature | Status | Tests | PR / Commit |
| --- | --- | --- | --- | --- |
| T-001 | Presearch package and architecture direction | Complete | Doc review checklist | Local docs update |
| T-002 | ADR foundation in `docs/adr/` | Complete | ADR template and first ADR review | Local docs update |
| T-003 | Agent MVP tool 1: `portfolio_analysis` | Complete | `apps/api/src/app/endpoints/ai/ai.service.spec.ts` | Planned |
| T-004 | Agent memory and response formatter | Complete | `apps/api/src/app/endpoints/ai/ai.service.spec.ts` | Planned |
| T-005 | Eval dataset baseline (MVP 5-10) | Complete | `apps/api/src/app/endpoints/ai/evals/mvp-eval.runner.spec.ts` | Planned |
| T-006 | Full eval dataset (50+) | Planned | Dataset validation and regression run | Planned |
| T-007 | Observability wiring (LangSmith traces and metrics) | Planned | Trace assertions and latency checks | Planned |
| T-008 | Deployment and submission bundle | Complete | `npm run test:ai` + Railway healthcheck + submission docs checklist | Pending push |
## Notes
- Canonical project requirements: `docs/requirements.md`
- ADR location: `docs/adr/`
- Detailed execution tracker: `tasks/tasks.md`

docs/AI-COST-ANALYSIS.md (+87)

@@ -0,0 +1,87 @@
# AI Cost Analysis
Date: 2026-02-23
Project: Ghostfolio Finance Agent MVP
Scope: development, testing, and monthly production projections
## Pricing Inputs
Primary model routing in MVP:
- 80% `glm-5` (Z.AI)
- 20% `MiniMax-M2.5` (MiniMax)
Reference pricing:
- Z.AI pricing page (`glm-5`): input `$1.00 / 1M tokens`, output `$3.20 / 1M tokens`
Source: https://docs.z.ai/guides/getting-started/pricing
- MiniMax pricing page (`MiniMax-M2.5`): input `$0.3 / 1M tokens`, output `$1.2 / 1M tokens`
Source: https://www.minimax.io/platform/pricing
Blended effective rates used for projections:
- Input: `$0.86 / 1M` (`0.8 * 1.00 + 0.2 * 0.30`)
- Output: `$2.80 / 1M` (`0.8 * 3.20 + 0.2 * 1.20`)
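The blended rates fall out of a simple weighted average; a minimal TypeScript sketch, with the per-1M-token prices and the 80/20 split copied from the figures above:

```typescript
// Blended per-million-token rates for the 80/20 glm-5 / MiniMax-M2.5 mix.
interface ModelRate {
  inputPerM: number; // USD per 1M input tokens
  outputPerM: number; // USD per 1M output tokens
}

const GLM5: ModelRate = { inputPerM: 1.0, outputPerM: 3.2 };
const MINIMAX_M25: ModelRate = { inputPerM: 0.3, outputPerM: 1.2 };

function blend(
  primary: ModelRate,
  fallback: ModelRate,
  primaryShare: number
): ModelRate {
  const fallbackShare = 1 - primaryShare;
  return {
    inputPerM:
      primary.inputPerM * primaryShare + fallback.inputPerM * fallbackShare,
    outputPerM:
      primary.outputPerM * primaryShare + fallback.outputPerM * fallbackShare
  };
}

const blended = blend(GLM5, MINIMAX_M25, 0.8); // ≈ $0.86 in / $2.80 out per 1M
```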
## Development and Testing Costs
Local verification for this MVP used mocked model calls in Jest suites:
- `npm run test:ai` uses mocked `generateText`
- `npm run test:mvp-eval` uses mocked `generateText`
Direct external LLM calls from automated tests:
- API calls: `0`
- Input tokens: `0`
- Output tokens: `0`
- Cost: `$0.00`
Manual smoke estimate for development sessions:
- Assumed live calls: `40`
- Average tokens per call: `2400 input`, `700 output`
- Total estimated tokens: `96,000 input`, `28,000 output`
- Estimated model cost:
`96,000/1,000,000 * 0.86 + 28,000/1,000,000 * 2.80 = $0.16`
Observability cost:
- LangSmith tracing integration: planned, current spend in this repository phase: `$0.00`
## Production Cost Projections
Assumptions:
- Average AI queries per user per month: `30` (about `1/day`)
- Average tokens per query: `2400 input`, `700 output`
- Tool call frequency: `1.5 tool calls/query` (portfolio + risk/market mix)
- Verification overhead and retry buffer: `25%`
- Effective cost/query before buffer: `$0.004024`
Monthly projection:
| Users | Queries / Month | Model Cost / Month | With 25% Buffer |
| --- | ---: | ---: | ---: |
| 100 | 3,000 | $12.07 | $15.09 |
| 1,000 | 30,000 | $120.72 | $150.90 |
| 10,000 | 300,000 | $1,207.20 | $1,509.00 |
| 100,000 | 3,000,000 | $12,072.00 | $15,090.00 |
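The table can be reproduced directly from the stated assumptions (blended rates, 2,400 input / 700 output tokens per query, 30 queries per user per month, 25% buffer); a sketch:

```typescript
// Reproduces the monthly projection table from the stated assumptions.
const INPUT_RATE_PER_M = 0.86; // blended USD per 1M input tokens
const OUTPUT_RATE_PER_M = 2.8; // blended USD per 1M output tokens
const QUERIES_PER_USER_PER_MONTH = 30;
const BUFFER = 0.25; // verification overhead and retry buffer

const costPerQuery =
  (2_400 / 1_000_000) * INPUT_RATE_PER_M +
  (700 / 1_000_000) * OUTPUT_RATE_PER_M; // ≈ $0.004024

function monthlyCost(users: number): { model: number; buffered: number } {
  const model = users * QUERIES_PER_USER_PER_MONTH * costPerQuery;
  return { model, buffered: model * (1 + BUFFER) };
}
```

For example, `monthlyCost(1000)` yields roughly `$120.72` model cost and `$150.90` with buffer, matching the 1,000-user row.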
## Sensitivity Range (Model Mix)
Same token assumptions, model-only monthly cost (without 25% buffer):
- 100 users:
- all `MiniMax-M2.5`: `$4.68`
- all `glm-5`: `$13.92`
- 100,000 users:
- all `MiniMax-M2.5`: `$4,680`
- all `glm-5`: `$13,920`
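The single-model bounds follow from the same per-query token and volume assumptions:

```typescript
// Single-model bound: model-only monthly cost if all traffic went to one
// provider, with 2,400 input / 700 output tokens per query.
function singleModelMonthly(
  users: number,
  inputPerM: number,
  outputPerM: number
): number {
  const queries = users * 30; // 30 queries per user per month
  return (
    queries *
    ((2_400 / 1_000_000) * inputPerM + (700 / 1_000_000) * outputPerM)
  );
}
```

For example, `singleModelMonthly(100, 0.3, 1.2)` gives about `$4.68` (all `MiniMax-M2.5`) and `singleModelMonthly(100, 1.0, 3.2)` about `$13.92` (all `glm-5`).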
## Instrumentation Plan for Exact Tracking
1. Add per-request token usage logging at provider response level.
2. Add LangSmith traces for request, tool-call, and verification spans.
3. Export weekly token and cost aggregates into a versioned cost ledger.
4. Set alert thresholds for cost/query drift and high retry rates.
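Step 1 could look roughly like the following; the `Usage` shape, the `recordUsage` helper, and the ledger entry fields are illustrative assumptions, not the repository's actual API:

```typescript
// Hypothetical sketch: derive exact per-request cost from provider-reported
// token usage, so real telemetry can replace the estimates in this document.
interface Usage {
  promptTokens: number;
  completionTokens: number;
}

interface CostLedgerEntry {
  model: string;
  usage: Usage;
  costUsd: number;
  timestamp: string;
}

// Published per-1M-token rates quoted earlier in this document.
const RATES: Record<string, { inputPerM: number; outputPerM: number }> = {
  'glm-5': { inputPerM: 1.0, outputPerM: 3.2 },
  'MiniMax-M2.5': { inputPerM: 0.3, outputPerM: 1.2 }
};

function recordUsage(model: string, usage: Usage): CostLedgerEntry {
  const rate = RATES[model];
  const costUsd =
    (usage.promptTokens / 1_000_000) * rate.inputPerM +
    (usage.completionTokens / 1_000_000) * rate.outputPerM;
  return { model, usage, costUsd, timestamp: new Date().toISOString() };
}
```

Entries like these could then feed the weekly cost ledger and the drift alerts in steps 3 and 4.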

docs/AI-DEVELOPMENT-LOG.md (+81)

@@ -0,0 +1,81 @@
# AI Development Log
Date: 2026-02-23
Project: Ghostfolio Finance Agent MVP
Domain: Finance
## Tools and Workflow
The workflow for this sprint followed a strict loop:
1. Presearch and architecture alignment in `docs/PRESEARCH.md`.
2. Ticket and execution tracking in `tasks/tasks.md` and `Tasks.md`.
3. Implementation in the existing Ghostfolio backend and client surfaces.
4. Focused verification through AI unit tests and MVP eval tests.
5. Deployment through Railway with public health checks.
Technical stack used in this MVP:
- Backend: NestJS (existing Ghostfolio architecture)
- Agent design: custom orchestrator in `ai.service.ts` with helper modules for tool execution
- Memory: Redis with 24-hour TTL and max 10 turns
- Tools: `portfolio_analysis`, `risk_assessment`, `market_data_lookup`
- Models: `glm-5` via Z.AI primary path, `MiniMax-M2.5` fallback path, OpenRouter backup path
- Deployment: Railway (moved to GHCR image source for faster deploy cycles)
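The memory policy above (24-hour TTL, last 10 turns) can be sketched as a pure trim step; the `Turn` shape and the Redis key scheme in the comment are assumptions, not the actual implementation:

```typescript
// Sketch of the session-memory policy: keep at most the last 10 turns and
// expire the session after 24 hours.
const MAX_TURNS = 10;
const TTL_SECONDS = 24 * 60 * 60; // applied when writing the session back,
// e.g. SET ai:session:<id> <json> EX 86400 in an assumed key scheme

interface Turn {
  role: 'user' | 'assistant';
  content: string;
}

// Pure trim policy, applied before the history is persisted to Redis.
function trimHistory(history: Turn[]): Turn[] {
  return history.slice(-MAX_TURNS);
}
```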
## MCP Usage
- Railway CLI and Railway GraphQL API:
- linked project/service
- switched service image source to `ghcr.io/maxpetrusenko/ghostfolio:main`
- redeployed and verified production health
- Local shell tooling:
- targeted test/eval runs
- health checks and deployment diagnostics
- GitHub Actions:
- GHCR publish workflow on `main` pushes
## Effective Prompts
The following user prompts drove the highest-impact delivery steps:
1. `use z_ai_glm_api_key glm-5 and minimax_api_key minimax m2.5 for mvp`
2. `ok 1 and 2 and add data to the app so we can test it`
3. `i dotn see activities and how to test and i dont see ai bot windows. where should i see it?`
4. `publish you have cli here`
5. `ok do 1 and 2 and then 3. AI development log (1 page) 4. AI cost analysis (100/1K/10K/100K users) 5. Submit to GitHub`
## Code Analysis
Rough authorship estimate for the MVP slice:
- AI-generated implementation and docs: ~70%
- Human-guided edits, review, and final acceptance decisions: ~30%
The largest human contribution focused on:
- model/provider routing decisions
- deploy-source migration on Railway
- quality gates and scope control
## Strengths and Limitations
Strengths observed:
- High velocity on brownfield integration with existing architecture
- Fast refactor support for file-size control and helper extraction
- Reliable generation of deterministic test scaffolding and eval cases
- Strong support for deployment automation and incident-style debugging
Limitations observed:
- CLI/API edge cases required manual schema introspection
- Runtime state and environment drift required explicit verification loops
- Exact token-cost accounting still needs production telemetry wiring
## Key Learnings
1. Clear, constraint-rich prompts produce fast and stable implementation output.
2. Deterministic eval cases are essential for regression control during rapid iteration.
3. Deploy speed improves materially when runtime builds move from source builds to prebuilt images.
4. Production readiness depends on traceability: citations, confidence scores, verification checks, and explicit assumptions in cost reporting.

docs/tasks/tasks.md (+50)

@@ -0,0 +1,50 @@
# Tasks
Last updated: 2026-02-23
## Active Tickets
| ID | Feature | Status | Tests | PR / Commit |
| --- | --- | --- | --- | --- |
| T-001 | Presearch package and architecture direction | Complete | Doc review checklist | Local docs update |
| T-002 | ADR foundation in `docs/adr/` | Complete | ADR template and first ADR review | Local docs update |
| T-003 | Agent MVP tool 1: `portfolio_analysis` | Complete | `apps/api/src/app/endpoints/ai/ai.service.spec.ts` | Planned |
| T-004 | Agent memory and response formatter | Complete | `apps/api/src/app/endpoints/ai/ai.service.spec.ts` | Planned |
| T-005 | Eval dataset baseline (MVP 5-10) | Complete | `apps/api/src/app/endpoints/ai/evals/mvp-eval.runner.spec.ts` | Planned |
| T-006 | Full eval dataset (50+) | Planned | Dataset validation and regression run | Planned |
| T-007 | Observability wiring (LangSmith traces and metrics) | Planned | Trace assertions and latency checks | Planned |
| T-008 | Deployment and submission bundle | Complete | `npm run test:ai` + Railway healthcheck + submission docs checklist | Pending push |
## Notes
- Canonical project requirements live in `docs/requirements.md`.
- Architecture decisions live in `docs/adr/`.
- Root tracker mirror lives in `Tasks.md`.
## MVP Local Runbook
1. Install dependencies and infra:
- `npm install`
- `cp .env.dev .env`
- `docker compose -f docker/docker-compose.dev.yml up -d`
- `npm run database:setup`
2. Start API:
- `npm run start:server`
3. Authenticate and call AI chat endpoint:
- Obtain Bearer token using the existing Ghostfolio auth flow.
- Call `POST http://localhost:3333/api/v1/ai/chat` with JSON body:
- `{"query":"Analyze my portfolio concentration risk","sessionId":"mvp-session-1"}`
4. Optional LLM output:
- Preferred for MVP: set `z_ai_glm_api_key` (`glm-5`) and `minimax_api_key` (`MiniMax-M2.5`) in `.env`.
- Fallback path: `API_KEY_OPENROUTER` and `OPENROUTER_MODEL` in properties store.
- Without provider keys, endpoint returns deterministic fallback summaries and still keeps tool and verification metadata.
5. Hostinger infra check:
- `npm run hostinger:check`
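The step-3 chat call can be exercised from a small script; this sketch assumes only the endpoint URL and JSON body shape documented in the runbook above:

```typescript
// Build the request for POST /api/v1/ai/chat against a local API instance.
function buildChatRequest(token: string, query: string, sessionId: string) {
  return {
    url: 'http://localhost:3333/api/v1/ai/chat',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ query, sessionId })
    }
  };
}

// Usage:
//   const { url, init } = buildChatRequest(
//     token,
//     'Analyze my portfolio concentration risk',
//     'mvp-session-1'
//   );
//   const reply = await fetch(url, init).then((r) => r.json());
```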
## Verification Snapshot (2026-02-23)
- `nx run api:lint` passed.
- Full `nx test api` currently fails in pre-existing portfolio calculator suites unrelated to the AI endpoint changes.
- Focused AI endpoint test command passed:
- `npm run test:ai`
- `npm run test:mvp-eval`

tasks/tasks.md (+123)

@@ -0,0 +1,123 @@
# Todo
Updated: 2026-02-23
- [x] Verify current repository state and missing required files
- [x] Create `docs/adr/` for architecture decisions
- [x] Save `Tasks.md` at repository root
- [x] Populate `docs/tasks/tasks.md`
- [x] Create `tasks/improvements.md`
- [x] Create `tasks/lessons.md`
- [x] Confirm files exist on disk
- [x] Kick off MVP slice after Presearch refresh (this session)
# Tasks
Last updated: 2026-02-23
## Active Tickets
| ID | Feature | Status | Tests | PR / Commit |
| --- | --- | --- | --- | --- |
| T-001 | Presearch package and architecture direction | Complete | Doc review checklist | Local docs update |
| T-002 | ADR foundation in `docs/adr/` | Complete | ADR template and first ADR review | Local docs update |
| T-003 | Agent MVP tool 1: `portfolio_analysis` | Complete | `apps/api/src/app/endpoints/ai/ai.service.spec.ts` | Planned |
| T-004 | Agent memory and response formatter | Complete | `apps/api/src/app/endpoints/ai/ai.service.spec.ts` | Planned |
| T-005 | Eval dataset baseline (MVP 5-10) | Complete | `apps/api/src/app/endpoints/ai/evals/mvp-eval.runner.spec.ts` | Planned |
| T-006 | Full eval dataset (50+) | Planned | Dataset validation and regression run | Planned |
| T-007 | Observability wiring (LangSmith traces and metrics) | Planned | Trace assertions and latency checks | Planned |
| T-008 | Deployment and submission bundle | Complete | `npm run test:ai` + Railway healthcheck + submission docs checklist | Pending push |
## Notes
- Canonical project requirements live in `docs/requirements.md`.
- Architecture decisions live in `docs/adr/`.
- Detailed task board mirror lives in `docs/tasks/tasks.md`.
## MVP Start (Finance Agent on Ghostfolio)
- [x] Inspect existing AI endpoint and integration points (`ai.controller.ts`, `ai.service.ts`, portfolio and data-provider services)
- [x] Add `POST /api/v1/ai/chat` endpoint with request validation
- [x] Implement 3 MVP tools in AI service
- [x] Add Redis-backed session memory for conversation continuity
- [x] Add verification checks and structured output formatter (citations, confidence, verification details)
- [x] Add targeted API unit tests for tool selection and response contract
- [x] Run lint and API tests
- [x] Share required and optional `.env` keys for local MVP run
## Session Plan (2026-02-23)
- [x] Refresh `docs/PRESEARCH.md` with source-backed framework and eval notes
- [x] Add root `Tasks.md` mirror for submission checklist compliance
- [x] Add AI chat service tests (tool execution, memory, verification, confidence)
- [x] Add MVP runbook snippet for local execution and API invocation
- [x] Execute focused verification (lint/test on touched surface)
- [x] Update ticket status and evidence links
## Session Plan (2026-02-23, UI + Deploy + Test Data)
- [x] Add client chat integration method for `POST /api/v1/ai/chat`
- [x] Build MVP chat interface in portfolio analysis page
- [x] Add focused frontend tests for chat request and response rendering
- [x] Verify AI test suite + eval suite after UI changes
- [x] Validate Railway API key and project visibility
- [x] Install/use Railway CLI and initialize or link project configuration
- [x] Add local data setup path and seed script command for MVP testing
- [x] Run final verification commands and capture evidence
## Session Plan (2026-02-23, Visibility + Test URL)
- [x] Diagnose why AI panel and activities are not visible in local UI
- [x] Remove analysis page visibility gate that hides AI panel for non-experimental users
- [x] Seed MVP activities for all local users to make testing deterministic
- [x] Update `docs/CODE-REVIEW.md` with exact local test URLs and validation steps
- [x] Run focused tests for touched files and record verification evidence
- [x] Capture lesson for UI discoverability and test-path communication
## Session Plan (2026-02-23, Publish via CLI)
- [x] Validate Railway CLI availability and auth path
- [x] Switch to supported Railway CLI version (`@railway/cli`)
- [x] Link current repository to configured Railway project/service
- [x] Trigger production deployment from local repository
- [x] Return deployed URL and post-deploy health check evidence
## Session Plan (2026-02-23, Seed Expansion)
- [x] Expand AI MVP seed dataset with more symbols and transactions
- [x] Add a second account per user for diversification scenarios
- [x] Run seeding command and verify row counts and sample orders
- [x] Share exact seeded coverage for local and deploy testing
## Session Plan (2026-02-23, Submission Bundle Completion)
- [x] Switch Railway service from source-build deploys to GHCR image deploys
- [x] Trigger redeploy and verify production health endpoint on image-based deploy
- [x] Create 1-page AI development log document for submission
- [x] Create AI cost analysis document with 100/1K/10K/100K projections
- [ ] Push submission documents and deployment updates to `origin/main`
## Session Plan (2026-02-23, Railway Crash Recovery)
- [x] Reproduce Railway start-command failure locally
- [ ] Correct Railway start command to built API entrypoint
- [ ] Verify fixed command resolves module-not-found crash
- [ ] Update task tracker evidence for deploy follow-up
## Verification Notes
- `nx run api:lint` completed successfully (existing workspace warnings only).
- Full `nx test api` currently fails in pre-existing portfolio calculator suites unrelated to AI endpoint changes.
- Focused MVP verification passed:
- `npm run test:ai`
- `npm run test:mvp-eval`
- `npm run hostinger:check`
- `npx dotenv-cli -e .env.example -- npx jest apps/client/src/app/pages/portfolio/analysis/ai-chat-panel/ai-chat-panel.component.spec.ts --config apps/client/jest.config.ts`
- `npm run railway:check`
- `npm run railway:setup`
- `npm run database:seed:ai-mvp`
- `npx nx run client:build:development-en`
- `npx nx run client:lint`
- `npx dotenv-cli -e .env -- npx -y @railway/cli@latest up --detach`
- `npx dotenv-cli -e .env -- npx -y @railway/cli@latest service status`
- `curl -i https://ghostfolio-api-production.up.railway.app/api/v1/health`