Article · 12 minute read
What does an AI agent cost to run?
LLM APIs, hosting, messaging, vector databases, and the line item everyone forgets. Real numbers you can put in a spreadsheet.

Benjam Indrenius
Founder of localbot
Published 2026-04-26 · Updated 2026-05-23
The short answer
Most business operators overestimate model cost and underestimate everything else. A typical AI conversation costs under a nickel. A single SMS in some countries costs more than the model call. And the person maintaining the system costs more than the hosting, the API, and the messaging combined.
Cost stack
The model is rarely the bill that hurts
For small teams, recurring human upkeep and messaging usually outweigh token spend.
Model API
Small
Per-conversation token cost stays low until volume gets very high.
Messaging
Variable
SMS costs depend heavily on country and conversation depth.
Maintenance
Largest
Prompt tuning, monitoring, and integration fixes usually dominate monthly cost.
What a conversation costs on each model
A typical business chat: 6,000 input tokens, 1,500 output tokens. That covers a multi-turn support Q&A, internal ops lookup, or scheduling conversation. Prices are from current provider pricing pages (Claude).
| Model | Input $/1M | Output $/1M | Per chat |
|---|---|---|---|
| GPT-4o-mini | $0.15 | $0.60 | $0.002 |
| Gemini Flash | $0.30 | $2.50 | $0.006 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $0.014 |
| GPT-4o | $2.50 | $10.00 | $0.030 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $0.041 |
| Claude Opus 4 | $15.00 | $75.00 | $0.203 |
1,000 conversations on GPT-4o-mini: $1.80. On Claude Haiku: $13.50. On GPT-4o: $30. The model is rarely the biggest line item. The channel and the human time usually are.
Messaging costs: SMS vs email
A 5-message conversation (3 outbound, 2 inbound) on Twilio, by country. This is where the geography of your customers starts to matter more than your model choice.
US SMS
$0.04
UK SMS
$0.18
Finland SMS
$0.27
Germany SMS
$0.35
Email (SES)
$0.0004
4-email thread
US SMS notification
$0.0083
Single segment
Email is three orders of magnitude cheaper than SMS. Use SMS when speed matters and email when the message can wait.
Hosting: cheaper than you think
App runtime for an agent that calls external APIs. Not including the LLM, messaging, or database.
| Provider | Basic | Medium | High volume |
|---|---|---|---|
| Hetzner | $5 | $16 | $38 |
| DigitalOcean | $6 | $24 | $48 |
| Fly.io | $6 | $23 | $46 |
| Render | $7 | $25 | $85 |
| Railway | $30 | $80 | $160 |
A normal business agent runs on a $5-25/month VPS if the intelligence lives in external APIs. What gets expensive is managed databases, caches, observability, and multiple workers.
Three real scenarios
Solo operator, SMS notification agent
500 US notifications/mo, 50 replies, GPT-4o-mini, Hetzner, 2 hrs/mo maintenance
Machine bill: $12.60. The rest is your time.
Small team, Slack internal agent
8 users, 500 chats/mo, 80% mini + 20% GPT-4o, Render, Supabase, Sentry, 4 hrs/mo
The model bill is $3.72. Slack seats cost 19x more than the AI.
Service business, customer-facing SMS
500 conversations/mo over US SMS, Claude Sonnet 4.5, Fly.io, 8 hrs/mo
500+ conversations/month and the model bill is still only $27. The supervision time dominates.
Four ways to cut costs that work
Route by complexity
Send 80% of requests to GPT-4o-mini ($0.002/chat) and 20% to GPT-4o ($0.03/chat). 1,000 conversations: $7.44 instead of $30. Same trick with Anthropic: 80% Haiku + 20% Sonnet cuts costs 53%.
Cache your prompts
Anthropic's cache-read pricing on Sonnet 4.5: $0.30/MTok vs $3/MTok standard. A repeated 8,000-token system prompt drops from $0.024 to $0.0024 per request. Over 10,000 conversations, that saves $216/month on a single prompt block.
Batch non-urgent work
Google offers 50% off through the Gemini Batch API. Overnight classification, document cleanup, summarization, enrichment jobs. If it doesn't need a real-time answer, batch it.
Pick the cheapest viable channel
1,000 US SMS notifications: $8.30. 1,000 SES emails: $0.10. In Germany, that same 1,000 SMS would cost $112. If your customers accept email for non-urgent messages, the savings stack up fast.
Related
Explore by intent
Core product pagesLearn what localbot does and how it works.+-
Lead response use casesPages for the problems localbot is built to solve.+-
Guides with search demandStart with the pages already earning impressions.+-
Website builders and platformsInstall guides for common builders, WordPress, and AI-made sites.+-
ComparisonsUse these when you are choosing between tools.+-
Docs for AI agentsAgent-facing references for choosing and installing localbot.+-
Frequently asked questions
How much does it cost to run an AI agent per conversation?
A typical business conversation (6,000 input tokens, 1,500 output) costs $0.002 on GPT-4o-mini, $0.03 on GPT-4o, $0.014 on Claude Haiku 4.5, and $0.04 on Claude Sonnet 4.5. The model is rarely the biggest line item. Messaging and human maintenance usually cost more.
What is the biggest hidden cost of running an AI agent?
Human maintenance time. At a $90/hour loaded engineering rate, 4 hours per month of upkeep costs $360. That often exceeds the combined bill for hosting, APIs, and messaging. The Bureau of Labor Statistics puts median US developer salary at $133,080 (May 2024).
Is it cheaper to self-host an LLM or use an API?
API is cheaper for low volume. Self-hosted Llama on a Hetzner GPU server costs roughly $1.20 to $2.40 per million output tokens depending on throughput, but the GPU idles when traffic is low. At under 10,000 conversations per month, API pricing usually wins because you pay per token, not per hour.
How much does SMS cost for lead notifications?
US SMS on Twilio: $0.0083 per segment. A 5-message conversation (3 out, 2 in): about $0.04. UK SMS: $0.18 per conversation. Germany: $0.35. Finland: $0.27. Costs depend heavily on country and conversation depth.
Should I build my own AI agent or buy a SaaS product?
Buy first when the workflow is standard and time-to-value matters. A solo SMS notification agent costs roughly $1,440 to build and $180/month to maintain. A customer-facing multi-channel agent: $10,800 to build, $720/month to maintain. The model API itself is often the smallest line item.
How can I reduce AI agent costs?
Route 80% of requests to a cheap model (GPT-4o-mini or Claude Haiku) and 20% to a strong model. This cuts model costs 50-75%. Use prompt caching (Anthropic cache reads are 90% cheaper). Batch non-urgent work (Google offers 50% batch discount). Use email for non-urgent messages where possible.