Article · 12 minute read

What does an AI agent cost to run?

LLM APIs, hosting, messaging, vector databases, and the line item everyone forgets. Real numbers you can put in a spreadsheet.

Benjam Indrenius

Founder of localbot

Published 2026-04-26 · Updated 2026-05-23

The short answer

Most business operators overestimate model cost and underestimate everything else. A typical AI conversation costs under a nickel. A single SMS in some countries costs more than the model call. And the person maintaining the system costs more than the hosting, the API, and the messaging combined.

Cost stack

The model is rarely the bill that hurts

For small teams, recurring human upkeep and messaging usually outweigh token spend.

Model API: Small. Per-conversation token cost stays low until volume gets very high.

Messaging: Variable. SMS costs depend heavily on country and conversation depth.

Maintenance: Largest. Prompt tuning, monitoring, and integration fixes usually dominate monthly cost.

What a conversation costs on each model

A typical business chat: 6,000 input tokens, 1,500 output tokens. That covers a multi-turn support Q&A, internal ops lookup, or scheduling conversation. Prices are from current provider pricing pages (Claude).

Model               Input $/1M   Output $/1M   Per chat
GPT-4o-mini         $0.15        $0.60         $0.002
Gemini Flash        $0.30        $2.50         $0.006
Claude Haiku 4.5    $1.00        $5.00         $0.014
GPT-4o              $2.50        $10.00        $0.030
Claude Sonnet 4.5   $3.00        $15.00        $0.041
Claude Opus 4       $15.00       $75.00        $0.203

1,000 conversations on GPT-4o-mini: $1.80. On Claude Haiku: $13.50. On GPT-4o: $30. The model is rarely the biggest line item. The channel and the human time usually are.
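The per-chat figures can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the `PRICES` dictionary and the `chat_cost` helper are names made up for this example, and the prices are copied from the table above rather than fetched from any provider.

```python
# USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "GPT-4o-mini": (0.15, 0.60),
    "Gemini Flash": (0.30, 2.50),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4.5": (3.00, 15.00),
    "Claude Opus 4": (15.00, 75.00),
}


def chat_cost(model, input_tokens=6_000, output_tokens=1_500):
    """USD cost of one conversation at the listed per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


for model in PRICES:
    per_chat = chat_cost(model)
    print(f"{model:<18} ${per_chat:.4f}/chat   ${per_chat * 1000:.2f} per 1,000 chats")
```

Change the token defaults to match your own transcripts. Longer conversations scale the cost linearly.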

Messaging costs: SMS vs email

A 5-message conversation (3 outbound, 2 inbound) on Twilio, by country. This is where the geography of your customers starts to matter more than your model choice.

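Channel               Cost      Basis
US SMS                $0.04     5-message conversation
UK SMS                $0.18     5-message conversation
Finland SMS           $0.27     5-message conversation
Germany SMS           $0.35     5-message conversation
Email (SES)           $0.0004   4-email thread
US SMS notification   $0.0083   Single segment
Email (Resend)        $0.002    4-email thread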

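Per conversation, email is two to three orders of magnitude cheaper than SMS: an SES thread at $0.0004 is about 100x cheaper than a US SMS conversation and nearly 900x cheaper than a German one. Use SMS when speed matters and email when the message can wait.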
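To see how the SMS-to-email gap depends on country, here is a rough sketch. It assumes one segment per message and the same US rate for inbound and outbound messages. The function names are hypothetical, and the per-email SES rate is derived from the $0.0004 four-email thread above.

```python
import math

US_SMS_PER_SEGMENT = 0.0083   # USD, single US segment (figure from this article)
SES_PER_EMAIL = 0.0004 / 4    # USD, derived from the 4-email thread figure


def sms_conversation_cost(outbound, inbound, per_segment):
    """Conversation cost, assuming one segment per message and one flat rate."""
    return (outbound + inbound) * per_segment


def orders_of_magnitude(expensive, cheap):
    """How many powers of ten separate two costs."""
    return math.log10(expensive / cheap)


us_sms = sms_conversation_cost(3, 2, US_SMS_PER_SEGMENT)
ses_thread = 4 * SES_PER_EMAIL

print(f"US SMS conversation: ${us_sms:.4f}")
print(f"US SMS vs SES:      {orders_of_magnitude(us_sms, ses_thread):.1f} orders of magnitude")
print(f"Germany SMS vs SES: {orders_of_magnitude(0.35, ses_thread):.1f} orders of magnitude")
```

Longer messages split into multiple segments, so real conversations can cost more than this estimate.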

Hosting: cheaper than you think

App runtime for an agent that calls external APIs. Not including the LLM, messaging, or database.

Monthly cost (USD)

Provider       Basic   Medium   High volume
Hetzner        $5      $16      $38
DigitalOcean   $6      $24      $48
Fly.io         $6      $23      $46
Render         $7      $25      $85
Railway        $30     $80      $160

A normal business agent runs on a $5-25/month VPS if the intelligence lives in external APIs. What gets expensive is managed databases, caches, observability, and multiple workers.

Three real scenarios

Solo operator, SMS notification agent

500 US notifications/mo, 50 replies, GPT-4o-mini, Hetzner, 2 hrs/mo maintenance

Hosting                            $5
LLM API                            $0.21
Twilio SMS + number                $5.72
Backups + domain                   $1.67
Human maintenance (2 hrs @ $90)    $180
Total                              $193/mo

Machine bill: $12.60. The rest is your time.

Small team, Slack internal agent

8 users, 500 chats/mo, 80% mini + 20% GPT-4o, Render, Supabase, Sentry, 4 hrs/mo

Slack Pro (8 seats)                $70
Render runtime                     $25
Supabase Pro                       $25
Mixed model API                    $3.72
Sentry Team                        $26
Human maintenance (4 hrs @ $90)    $360
Total                              $510/mo

The model bill is $3.72. Slack seats cost 19x more than the AI.

Service business, customer-facing SMS

500 conversations/mo over US SMS, Claude Sonnet 4.5, Fly.io, 8 hrs/mo

Fly.io runtime                     $23
Supabase Pro                       $25
Claude Sonnet 4.5 API              $27
SMS messaging                      $20.75
Sentry Team                        $26
Human maintenance (8 hrs @ $90)    $720
Total                              $842/mo

500+ conversations/month and the model bill is still only $27. The supervision time dominates.
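If you want these scenarios in a script rather than a spreadsheet, a minimal sketch might look like the following. It uses the line items from the first and third scenarios. The `SCENARIOS` structure and the `summarize` helper are illustrative names, not part of any tool.

```python
HOURLY_RATE = 90  # USD per hour, the loaded rate used in the scenarios

SCENARIOS = {
    "Solo operator, SMS notification agent": {
        "items": {
            "Hosting": 5.00,
            "LLM API": 0.21,
            "Twilio SMS + number": 5.72,
            "Backups + domain": 1.67,
        },
        "maintenance_hours": 2,
    },
    "Service business, customer-facing SMS": {
        "items": {
            "Fly.io runtime": 23.00,
            "Supabase Pro": 25.00,
            "Claude Sonnet 4.5 API": 27.00,
            "SMS messaging": 20.75,
            "Sentry Team": 26.00,
        },
        "maintenance_hours": 8,
    },
}


def summarize(scenario):
    """Return (machine bill, human cost, total, human share of total)."""
    machine = sum(scenario["items"].values())
    human = scenario["maintenance_hours"] * HOURLY_RATE
    total = machine + human
    return machine, human, total, human / total


for name, scenario in SCENARIOS.items():
    machine, human, total, share = summarize(scenario)
    print(f"{name}: machine ${machine:.2f}, human ${human}, "
          f"total ${total:.2f}/mo ({share:.0%} human time)")
```

Swapping in your own hourly rate and maintenance hours is usually the change that moves the total most.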

Four ways to cut costs that work

Route by complexity

Send 80% of requests to GPT-4o-mini ($0.002/chat) and 20% to GPT-4o ($0.03/chat). 1,000 conversations cost $7.44 instead of $30 on GPT-4o alone. The same trick works with Anthropic: 80% Haiku + 20% Sonnet costs 53% less than sending everything to Sonnet.
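The routing math is a weighted average. This sketch uses the unrounded per-chat costs implied by the pricing table (for example $0.0018 for GPT-4o-mini). `blended_cost` and `savings` are hypothetical helper names, and the 80/20 split is the article's assumption rather than a measured traffic mix.

```python
def blended_cost(cheap_cost, strong_cost, cheap_share=0.8):
    """Average per-chat cost when cheap_share of traffic goes to the cheap model."""
    return cheap_share * cheap_cost + (1 - cheap_share) * strong_cost


def savings(cheap_cost, strong_cost, cheap_share=0.8):
    """Fraction saved versus sending every chat to the strong model."""
    return 1 - blended_cost(cheap_cost, strong_cost, cheap_share) / strong_cost


# Unrounded per-chat costs for 6,000 input + 1,500 output tokens.
MINI, GPT4O = 0.0018, 0.0300
HAIKU, SONNET = 0.0135, 0.0405

print(f"OpenAI, 1,000 chats routed: ${1000 * blended_cost(MINI, GPT4O):.2f}")
print(f"OpenAI savings:    {savings(MINI, GPT4O):.0%}")
print(f"Anthropic savings: {savings(HAIKU, SONNET):.0%}")
```

The savings only hold if the cheap model handles its share acceptably, so it is worth tracking how often routed requests need a retry on the stronger model.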

Cache your prompts

Anthropic's cache-read price on Sonnet 4.5 is $0.30/MTok, versus $3/MTok for standard input. A repeated 8,000-token system prompt drops from $0.024 to $0.0024 per request. Over 10,000 requests a month, that saves $216 on a single prompt block.
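A quick way to check the caching numbers is sketched below. It assumes every request is a cache hit and ignores the premium charged for cache writes, so treat the result as an upper bound. The helper name `prompt_cost` is made up for this example.

```python
STANDARD_INPUT = 3.00   # USD per 1M tokens, Sonnet 4.5 standard input
CACHE_READ = 0.30       # USD per 1M tokens, Sonnet 4.5 cache read


def prompt_cost(tokens, price_per_mtok):
    """USD cost of sending `tokens` input tokens at a per-million-token price."""
    return tokens * price_per_mtok / 1_000_000


PROMPT_TOKENS = 8_000
REQUESTS_PER_MONTH = 10_000

uncached = prompt_cost(PROMPT_TOKENS, STANDARD_INPUT)
cached = prompt_cost(PROMPT_TOKENS, CACHE_READ)
monthly_saving = (uncached - cached) * REQUESTS_PER_MONTH

print(f"Per request: ${uncached:.4f} uncached vs ${cached:.4f} cached")
print(f"Monthly saving at {REQUESTS_PER_MONTH:,} requests: ${monthly_saving:.2f}")
```

Real savings will be somewhat lower, because the cache expires between bursts of traffic and has to be rewritten.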

Batch non-urgent work

Google offers 50% off through the Gemini Batch API. Good candidates are overnight classification, document cleanup, summarization, and enrichment jobs. If it doesn't need a real-time answer, batch it.

Pick the cheapest viable channel

1,000 US SMS notifications: $8.30. 1,000 SES emails: $0.10. In Germany, that same 1,000 SMS would cost $112. If your customers accept email for non-urgent messages, the savings stack up fast.
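As a rough illustration of how a channel split changes the bill, the sketch below sends urgent messages by SMS and the rest by SES email. The per-1,000 prices are the ones quoted above. The 20% urgent share and the `monthly_cost` helper are hypothetical.

```python
# USD per 1,000 messages, from the figures quoted above.
COST_PER_1000 = {
    "US SMS": 8.30,
    "Germany SMS": 112.00,
    "SES email": 0.10,
}


def monthly_cost(messages, urgent_share, sms_channel):
    """Cost when urgent messages go by SMS and everything else by SES email."""
    urgent = messages * urgent_share
    routine = messages - urgent
    return (urgent * COST_PER_1000[sms_channel]
            + routine * COST_PER_1000["SES email"]) / 1000


for country in ("US SMS", "Germany SMS"):
    all_sms = monthly_cost(1000, 1.0, country)
    mixed = monthly_cost(1000, 0.2, country)   # hypothetical: 20% urgent
    print(f"{country}: all SMS ${all_sms:.2f}, 20% SMS + 80% email ${mixed:.2f}")
```

The higher the local SMS price, the more a split like this saves, which is why the German case moves so much more than the US one.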

Frequently asked questions

How much does it cost to run an AI agent per conversation?

A typical business conversation (6,000 input tokens, 1,500 output) costs $0.002 on GPT-4o-mini, $0.03 on GPT-4o, $0.014 on Claude Haiku 4.5, and $0.04 on Claude Sonnet 4.5. The model is rarely the biggest line item. Messaging and human maintenance usually cost more.

What is the biggest hidden cost of running an AI agent?

Human maintenance time. At a $90/hour loaded engineering rate, 4 hours per month of upkeep costs $360, which often exceeds the combined bill for hosting, APIs, and messaging. For reference, the Bureau of Labor Statistics puts median US developer salary at $133,080 (May 2024), or about $64/hour over a 2,080-hour year before benefits and overhead.

Is it cheaper to self-host an LLM or use an API?

API is cheaper for low volume. Self-hosted Llama on a Hetzner GPU server costs roughly $1.20 to $2.40 per million output tokens depending on throughput, but the GPU idles when traffic is low. At under 10,000 conversations per month, API pricing usually wins because you pay per token, not per hour.

How much does SMS cost for lead notifications?

US SMS on Twilio: $0.0083 per segment. A 5-message conversation (3 out, 2 in): about $0.04. UK SMS: $0.18 per conversation. Germany: $0.35. Finland: $0.27. Costs depend heavily on country and conversation depth.

Should I build my own AI agent or buy a SaaS product?

Buy first when the workflow is standard and time-to-value matters. A solo SMS notification agent costs roughly $1,440 to build and $180/month to maintain. A customer-facing multi-channel agent: $10,800 to build, $720/month to maintain. The model API itself is often the smallest line item.

How can I reduce AI agent costs?

Route 80% of requests to a cheap model (GPT-4o-mini or Claude Haiku) and 20% to a strong model. This cuts model costs 50-75%. Use prompt caching (Anthropic cache reads are 90% cheaper). Batch non-urgent work (Google offers 50% batch discount). Use email for non-urgent messages where possible.
