# AI Cost Analysis ## Development & Testing Costs ### Observed Usage (Local + Production) | Metric | Value | | -------------------------- | --------------------- | | Total chats logged | 89 | | Total prompt tokens | 367,389 | | Total completion tokens | 17,771 | | Total tokens | 385,160 | | Avg prompt tokens/chat | 4,128 | | Avg completion tokens/chat | 200 | | Avg total tokens/chat | 4,328 | | Error rate | 9.0% (8/89) | | Development period | Feb 26 – Feb 28, 2026 | ### Development API Costs Primary model: Claude Sonnet 4.6 ($3/MTok input, $15/MTok output) | Component | Tokens | Cost | | ------------------------ | ------- | --------- | | Prompt tokens (367K) | 367,389 | $1.10 | | Completion tokens (18K) | 17,771 | $0.27 | | **Total dev/test spend** | 385,160 | **$1.37** | Additional costs: - Eval scoring (Haiku 4.5): ~86 cases x ~500 tokens/judgment = ~43K tokens = $0.05 - Multiple eval runs during development: ~10 runs x $0.05 = $0.50 - **Estimated total dev spend: ~$2.00** ### Per-Chat Cost Breakdown | Model | Avg Input | Avg Output | Cost/Chat | | -------------------- | --------- | ---------- | --------- | | Sonnet 4.6 (default) | 4,128 tok | 200 tok | $0.0154 | | Haiku 4.5 | 4,128 tok | 200 tok | $0.0051 | | Opus 4.6 | 4,128 tok | 200 tok | $0.0769 | ## Production Cost Projections ### Assumptions - **Queries per user per day**: 5 (portfolio check, performance, transactions, market data, misc) - **Avg tokens per query**: 4,328 (4,128 input + 200 output) — observed from production data - **Tool call frequency**: 1.9 tool calls/chat average (observed) - **Verification overhead**: Negligible (deterministic, no extra LLM calls) - **Cache warming overhead**: `warmPortfolioCache` runs after activity/account writes — Redis + BullMQ job, zero LLM tokens, up to 30s added latency per write operation - **Model mix**: 90% Sonnet 4.6, 8% Haiku 4.5, 2% Opus 4.6 - **Blended cost per chat**: $0.0154 x 0.90 + $0.0051 x 0.08 + $0.0769 x 0.02 = $0.0157 ### Monthly Projections | Scale | Users | Chats/Month | Token Volume | Monthly Cost | | ------------- | ------- | ----------- | ------------ | ------------ | | 100 users | 100 | 15,000 | 64.9M tokens | **$235** | | 1,000 users | 1,000 | 150,000 | 649M tokens | **$2,355** | | 10,000 users | 10,000 | 1,500,000 | 6.49B tokens | **$23,550** | | 100,000 users | 100,000 | 15,000,000 | 64.9B tokens | **$235,500** | ### Cost Optimization Levers | Strategy | Estimated Savings | Trade-off | | --------------------------------------- | ----------------- | ------------------------------- | | Default to Haiku 4.5 for simple queries | 60-70% | Slightly less nuanced responses | | Prompt caching (repeated system prompt) | 30-40% | Requires API support | | Response caching for market data | 10-20% | Staleness window | | Reduce system prompt size | 15-25% | Less detailed agent behavior | ### Infrastructure Costs (Render) | Service | Plan | Monthly Cost | | -------------------- | -------- | ------------- | | Web (Standard) | Standard | $25 | | Redis (Standard) | Standard | $10 | | Postgres (Basic 1GB) | Basic | $7 | | **Total infra** | | **$42/month** | ### Total Cost of Ownership | Scale | AI Cost | Infra | Total/Month | Cost/User/Month | | ------------- | -------- | ------ | ----------- | --------------- | | 100 users | $235 | $42 | $277 | $2.77 | | 1,000 users | $2,355 | $42 | $2,397 | $2.40 | | 10,000 users | $23,550 | $100\* | $23,650 | $2.37 | | 100,000 users | $235,500 | $500\* | $236,000 | $2.36 | \*Infrastructure scales with traffic — estimated for higher tiers. ### Real-Time Cost Tracking Cost is tracked live via `GET /api/v1/agent/metrics`: ```json { "cost": { "totalUsd": 0.0226, "avgPerChatUsd": 0.0226 } } ``` Computed per request using model-specific pricing (Sonnet/Haiku/Opus rates) applied to actual prompt and completion token counts.