AI Cost Analysis

Development & Testing Costs

Observed Usage (Local + Production)

| Metric | Value |
| --- | --- |
| Total chats logged | 89 |
| Total prompt tokens | 367,389 |
| Total completion tokens | 17,771 |
| Total tokens | 385,160 |
| Avg prompt tokens/chat | 4,128 |
| Avg completion tokens/chat | 200 |
| Avg total tokens/chat | 4,328 |
| Error rate | 9.0% (8/89) |
| Development period | Feb 26 – Feb 28, 2026 |
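
The derived rows in the table above (averages and error rate) follow directly from the raw totals. A minimal sketch of that arithmetic, where the `UsageTotals` shape and the `summarize` helper are hypothetical names introduced only for illustration:

```typescript
// Raw totals copied from the "Observed Usage" table.
interface UsageTotals {
  chats: number;
  promptTokens: number;
  completionTokens: number;
  errors: number;
}

const observed: UsageTotals = {
  chats: 89,
  promptTokens: 367389,
  completionTokens: 17771,
  errors: 8,
};

// Derive per-chat averages (rounded to whole tokens) and the error rate
// (rounded to one decimal place) from the raw totals.
function summarize(u: UsageTotals) {
  const totalTokens = u.promptTokens + u.completionTokens;
  return {
    totalTokens,
    avgPromptTokens: Math.round(u.promptTokens / u.chats),
    avgCompletionTokens: Math.round(u.completionTokens / u.chats),
    avgTotalTokens: Math.round(totalTokens / u.chats),
    errorRatePct: Math.round((u.errors / u.chats) * 1000) / 10,
  };
}

const usageSummary = summarize(observed);
console.log(usageSummary);
```

Keeping only the raw totals and deriving the rest avoids the averages drifting out of sync as more chats are logged.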

Development API Costs

Primary model: Claude Sonnet 4.6 ($3/MTok input, $15/MTok output)

| Component | Tokens | Cost |
| --- | --- | --- |
| Prompt tokens (367K) | 367,389 | $1.10 |
| Completion tokens (18K) | 17,771 | $0.27 |
| Total dev/test spend | 385,160 | $1.37 |

Additional costs:

  • Eval scoring (Haiku 4.5): ~86 cases × ~500 tokens/judgment ≈ 43K tokens ≈ $0.05 per run
  • Multiple eval runs during development: ~10 runs × $0.05 ≈ $0.50
  • Estimated total dev spend: ~$2.00 ($1.37 chat traffic + ~$0.50 eval runs ≈ $1.87, rounded up)
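
The development spend can be reproduced from the token totals and the Sonnet 4.6 rates quoted above. The sketch below takes the ~$0.05 per eval run and the ~10 runs as given rather than recomputing them, and `tokenCostUsd` is a hypothetical helper name:

```typescript
// Sonnet 4.6 rates as stated in this document, in USD per million tokens.
const SONNET_INPUT_PER_MTOK = 3;
const SONNET_OUTPUT_PER_MTOK = 15;

// Convert a token count to USD at a given per-million-token rate.
function tokenCostUsd(tokens: number, usdPerMTok: number): number {
  return (tokens / 1e6) * usdPerMTok;
}

const promptCost = tokenCostUsd(367389, SONNET_INPUT_PER_MTOK);
const completionCost = tokenCostUsd(17771, SONNET_OUTPUT_PER_MTOK);
const chatSpend = promptCost + completionCost;

// Eval judging: roughly 10 runs at roughly $0.05 each (estimates from the list above).
const evalSpend = 10 * 0.05;

const devSpend = chatSpend + evalSpend;
console.log(chatSpend.toFixed(2)); // "1.37"
console.log(devSpend.toFixed(2)); // "1.87"
```

The computed total of about $1.87 sits slightly under the ~$2.00 estimate, which leaves a small margin for unlogged calls.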

Per-Chat Cost Breakdown

| Model | Avg Input | Avg Output | Cost/Chat |
| --- | --- | --- | --- |
| Sonnet 4.6 (default) | 4,128 tok | 200 tok | $0.0154 |
| Haiku 4.5 | 4,128 tok | 200 tok | $0.0051 |
| Opus 4.6 | 4,128 tok | 200 tok | $0.0769 |
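
The Cost/Chat column is input tokens times the input rate plus output tokens times the output rate. In the sketch below only the Sonnet rates come from this document. The Haiku ($1/$5) and Opus ($15/$75) rates are assumptions chosen because they reproduce the table, and should be checked against current provider pricing.

```typescript
interface ModelRate {
  model: string;
  inputPerMTok: number; // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

// Sonnet rates are stated in this document. The Haiku and Opus rates are
// ASSUMPTIONS, back-derived so that the table's Cost/Chat values reproduce.
const MODEL_RATES: ModelRate[] = [
  { model: "Sonnet 4.6", inputPerMTok: 3, outputPerMTok: 15 },
  { model: "Haiku 4.5", inputPerMTok: 1, outputPerMTok: 5 },
  { model: "Opus 4.6", inputPerMTok: 15, outputPerMTok: 75 },
];

// Cost of one chat in USD for the given token counts.
function costPerChatUsd(rate: ModelRate, inputTokens: number, outputTokens: number): number {
  return (inputTokens * rate.inputPerMTok + outputTokens * rate.outputPerMTok) / 1e6;
}

// Average observed chat: 4,128 input tokens and 200 output tokens.
const perChat = MODEL_RATES.map((r) => ({
  model: r.model,
  usd: costPerChatUsd(r, 4128, 200).toFixed(4),
}));
console.log(perChat);
```

Because input tokens outnumber output tokens roughly 20 to 1 here, the input rate accounts for about 80% of the per-chat cost on every model.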

Production Cost Projections

Assumptions

  • Queries per user per day: 5 (portfolio check, performance, transactions, market data, misc)
  • Avg tokens per query: 4,328 (4,128 input + 200 output), observed across the 89 logged chats (local + production)
  • Tool call frequency: 1.9 tool calls per chat on average (observed)
  • Verification overhead: negligible (deterministic, no extra LLM calls)
  • Cache warming overhead: warmPortfolioCache runs after activity/account writes as a Redis + BullMQ job; it consumes zero LLM tokens but can add up to 30s of latency per write operation
  • Model mix: 90% Sonnet 4.6, 8% Haiku 4.5, 2% Opus 4.6
  • Blended cost per chat: $0.0154 × 0.90 + $0.0051 × 0.08 + $0.0769 × 0.02 ≈ $0.0158 (the projections below use $0.0157)
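
The blended cost in the last assumption is a weighted average of the per-chat costs over the model mix. A minimal sketch, with the `MixEntry` type and `blendedCostPerChat` helper as hypothetical names:

```typescript
interface MixEntry {
  model: string;
  share: number; // fraction of chats served by this model
  costPerChat: number; // USD, from the per-chat cost table
}

const modelMix: MixEntry[] = [
  { model: "Sonnet 4.6", share: 0.9, costPerChat: 0.0154 },
  { model: "Haiku 4.5", share: 0.08, costPerChat: 0.0051 },
  { model: "Opus 4.6", share: 0.02, costPerChat: 0.0769 },
];

// Weighted average cost per chat; rejects a mix whose shares do not sum to 1.
function blendedCostPerChat(mix: MixEntry[]): number {
  const totalShare = mix.reduce((sum, m) => sum + m.share, 0);
  if (Math.abs(totalShare - 1) > 1e-9) {
    throw new Error("model shares must sum to 1");
  }
  return mix.reduce((sum, m) => sum + m.share * m.costPerChat, 0);
}

const blended = blendedCostPerChat(modelMix);
console.log(blended.toFixed(4)); // "0.0158"
```

The 2% Opus share contributes about 10% of the blended cost, so the result is more sensitive to the Opus share than to the Haiku share.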

Monthly Projections

| Scale | Users | Chats/Month | Token Volume | Monthly Cost |
| --- | --- | --- | --- | --- |
| 100 users | 100 | 15,000 | 64.9M tokens | $235 |
| 1,000 users | 1,000 | 150,000 | 649M tokens | $2,355 |
| 10,000 users | 10,000 | 1,500,000 | 6.49B tokens | $23,550 |
| 100,000 users | 100,000 | 15,000,000 | 64.9B tokens | $235,500 |
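
Each row above scales linearly with the user count. The sketch below reproduces the table under two inferred assumptions: a 30-day month (15,000 chats for 100 users at 5 queries per day) and a blended rate of $0.0157 per chat, which is the value the Monthly Cost column corresponds to. `projectMonth` is a hypothetical helper name.

```typescript
const QUERIES_PER_USER_PER_DAY = 5;
const DAYS_PER_MONTH = 30; // inferred from 15,000 chats / (100 users * 5 per day)
const AVG_TOKENS_PER_CHAT = 4328;
const BLENDED_USD_PER_CHAT = 0.0157; // inferred from the Monthly Cost column

interface MonthlyProjection {
  users: number;
  chats: number;
  tokens: number;
  costUsd: number;
}

// Linear projection: chats, tokens and cost all scale with the user count.
function projectMonth(users: number): MonthlyProjection {
  const chats = users * QUERIES_PER_USER_PER_DAY * DAYS_PER_MONTH;
  return {
    users,
    chats,
    tokens: chats * AVG_TOKENS_PER_CHAT,
    costUsd: chats * BLENDED_USD_PER_CHAT,
  };
}

const projections = [100, 1000, 10000, 100000].map((u) => projectMonth(u));
console.log(projections);
```

Since the model is purely linear, cost per user stays constant across tiers; any volume discounts or caching effects would have to be added as separate terms.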

Cost Optimization Levers

| Strategy | Estimated Savings | Trade-off |
| --- | --- | --- |
| Default to Haiku 4.5 for simple queries | 60-70% | Slightly less nuanced responses |
| Prompt caching (repeated system prompt) | 30-40% | Requires API support |
| Response caching for market data | 10-20% | Staleness window |
| Reduce system prompt size | 15-25% | Less detailed agent behavior |
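
To see what the first lever implies, the sketch below estimates savings as a function of how many chats are routed to Haiku instead of Sonnet. It simplifies by using an all-Sonnet baseline rather than the 90/8/2 mix, and `haikuShare` is a hypothetical routing parameter, not a measured value.

```typescript
// Per-chat costs from the per-chat cost table.
const SONNET_USD_PER_CHAT = 0.0154;
const HAIKU_USD_PER_CHAT = 0.0051;

// Fraction of an all-Sonnet bill saved when `haikuShare` (0..1) of chats
// is served by Haiku and the remainder stays on Sonnet.
function routingSavings(haikuShare: number): number {
  if (haikuShare < 0 || haikuShare > 1) {
    throw new RangeError("haikuShare must be in [0, 1]");
  }
  const mixedCost =
    haikuShare * HAIKU_USD_PER_CHAT + (1 - haikuShare) * SONNET_USD_PER_CHAT;
  return 1 - mixedCost / SONNET_USD_PER_CHAT;
}

for (const share of [0.5, 0.9, 1.0]) {
  console.log(share, (routingSavings(share) * 100).toFixed(1) + "%");
}
```

Under these numbers the 60-70% range is reached only when roughly 90% or more of chats go to Haiku; routing half of them saves about a third.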

Infrastructure Costs (Render)

| Service | Plan | Monthly Cost |
| --- | --- | --- |
| Web (Standard) | Standard | $25 |
| Redis (Standard) | Standard | $10 |
| Postgres (Basic 1GB) | Basic | $7 |
| Total infra | | $42/month |

Total Cost of Ownership

| Scale | AI Cost | Infra | Total/Month | Cost/User/Month |
| --- | --- | --- | --- | --- |
| 100 users | $235 | $42 | $277 | $2.77 |
| 1,000 users | $2,355 | $42 | $2,397 | $2.40 |
| 10,000 users | $23,550 | $100* | $23,650 | $2.37 |
| 100,000 users | $235,500 | $500* | $236,000 | $2.36 |

*Infrastructure costs for the 10,000- and 100,000-user tiers are estimates, since infrastructure scales with traffic.
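
The totals and per-user figures are simple sums and quotients of the AI and infrastructure columns. A short sketch that also reports the infrastructure share of each total, with `TcoRow` and `totalCost` as hypothetical names:

```typescript
interface TcoRow {
  users: number;
  aiUsd: number;
  infraUsd: number;
}

// AI and infrastructure costs per tier, copied from the table above.
const tcoInputs: TcoRow[] = [
  { users: 100, aiUsd: 235, infraUsd: 42 },
  { users: 1000, aiUsd: 2355, infraUsd: 42 },
  { users: 10000, aiUsd: 23550, infraUsd: 100 }, // infra is an estimate
  { users: 100000, aiUsd: 235500, infraUsd: 500 }, // infra is an estimate
];

// Monthly total, cost per user, and the share of the total spent on infra.
function totalCost(row: TcoRow) {
  const totalUsd = row.aiUsd + row.infraUsd;
  return {
    users: row.users,
    totalUsd,
    perUserUsd: totalUsd / row.users,
    infraSharePct: (row.infraUsd / totalUsd) * 100,
  };
}

const tco = tcoInputs.map((row) => totalCost(row));
console.log(tco);
```

Infrastructure is about 15% of the total at 100 users and under 2% from 1,000 users upward, so at scale the per-user cost is driven almost entirely by LLM spend.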

Real-Time Cost Tracking

Cost is tracked live via GET /api/v1/agent/metrics:

```json
{
  "cost": {
    "totalUsd": 0.0226,
    "avgPerChatUsd": 0.0226
  }
}
```

Cost is computed per request by applying model-specific pricing (Sonnet/Haiku/Opus rates) to the actual prompt and completion token counts.
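
One way such a tracker could accumulate per-request cost and produce the response shape shown above is sketched below. This is not the endpoint's actual implementation: `CostTracker`, `ModelId`, and `PRICING` are hypothetical names, and the Haiku and Opus rates are assumptions that reproduce the per-chat table earlier in this document.

```typescript
type ModelId = "sonnet" | "haiku" | "opus";

interface Pricing {
  inputPerMTok: number; // USD per million prompt tokens
  outputPerMTok: number; // USD per million completion tokens
}

// Sonnet rates are from this document; Haiku and Opus rates are ASSUMED.
const PRICING: { [K in ModelId]: Pricing } = {
  sonnet: { inputPerMTok: 3, outputPerMTok: 15 },
  haiku: { inputPerMTok: 1, outputPerMTok: 5 },
  opus: { inputPerMTok: 15, outputPerMTok: 75 },
};

class CostTracker {
  private totalUsd = 0;
  private chats = 0;

  // Record one completed chat and return its cost in USD.
  record(model: ModelId, promptTokens: number, completionTokens: number): number {
    const p = PRICING[model];
    const usd =
      (promptTokens * p.inputPerMTok + completionTokens * p.outputPerMTok) / 1e6;
    this.totalUsd += usd;
    this.chats += 1;
    return usd;
  }

  // Build a payload in the same shape as the metrics response, rounded to 4 decimals.
  snapshot() {
    const round4 = (x: number) => Math.round(x * 1e4) / 1e4;
    return {
      cost: {
        totalUsd: round4(this.totalUsd),
        avgPerChatUsd: this.chats === 0 ? 0 : round4(this.totalUsd / this.chats),
      },
    };
  }
}

const tracker = new CostTracker();
tracker.record("sonnet", 4128, 200);
tracker.record("haiku", 4128, 200);
console.log(JSON.stringify(tracker.snapshot()));
```

Rounding only in `snapshot` keeps the running total at full precision, so rounding error does not accumulate across requests.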