Article · 12 minute read

What does an AI agent cost to run?

LLM APIs, hosting, messaging, vector databases, and the line item everyone forgets. Real numbers you can put in a spreadsheet.

Benjam Indrenius

Founder of localbot

Published 2026-04-26 · Updated 2026-05-23

The short answer

Most business operators overestimate model cost and underestimate everything else. A typical AI conversation costs under a nickel. A single SMS in some countries costs more than the model call. And the person maintaining the system costs more than the hosting, the API, and the messaging combined.

Cost stack

The model is rarely the bill that hurts

For small teams, recurring human upkeep and messaging usually outweigh token spend.

Cost area	Relative size	Why
Model API	Small	Per-conversation token cost stays low until volume gets very high.
Messaging	Variable	SMS costs depend heavily on country and conversation depth.
Maintenance	Largest	Prompt tuning, monitoring, and integration fixes usually dominate monthly cost.

What a conversation costs on each model

A typical business chat: 6,000 input tokens, 1,500 output tokens. That covers a multi-turn support Q&A, internal ops lookup, or scheduling conversation. Prices are from current provider pricing pages (Claude).

Model	Input $/1M	Output $/1M	Per chat
GPT-4o-mini	$0.15	$0.60	$0.002
Gemini Flash	$0.30	$2.50	$0.006
Claude Haiku 4.5	$1.00	$5.00	$0.014
GPT-4o	$2.50	$10.00	$0.030
Claude Sonnet 4.5	$3.00	$15.00	$0.041
Claude Opus 4	$15.00	$75.00	$0.203

1,000 conversations on GPT-4o-mini: $1.80. On Claude Haiku: $13.50. On GPT-4o: $30. The model is rarely the biggest line item. The channel and the human time usually are.
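
The per-chat column is plain arithmetic on the two price columns, so it is easy to rerun with your own token counts. A minimal sketch, assuming the list prices in the table and the 6,000/1,500 token split; the names `PRICES` and `chat_cost` are illustrative, not part of any provider SDK:

```python
# USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "GPT-4o-mini": (0.15, 0.60),
    "Gemini Flash": (0.30, 2.50),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4.5": (3.00, 15.00),
    "Claude Opus 4": (15.00, 75.00),
}


def chat_cost(model, input_tokens=6_000, output_tokens=1_500):
    """USD cost of one conversation at list prices (no caching or batch discounts)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model in PRICES:
    per_chat = chat_cost(model)
    print(f"{model:<18} ${per_chat:.4f} per chat   ${per_chat * 1000:,.2f} per 1,000 chats")
```

Change the default token counts to match your own transcripts; longer system prompts or tool outputs mostly move the input side.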

Messaging costs: SMS vs email

A 5-message conversation (3 outbound, 2 inbound) on Twilio, by country. This is where the geography of your customers starts to matter more than your model choice.

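Channel	Cost	Basis
US SMS	$0.04	5-message conversation
UK SMS	$0.18	5-message conversation
Finland SMS	$0.27	5-message conversation
Germany SMS	$0.35	5-message conversation
US SMS notification	$0.0083	Single segment
Email (SES)	$0.0004	4-email thread
Email (Resend)	$0.002	4-email thread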

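Email is two to three orders of magnitude cheaper than SMS: a 4-email SES thread costs about 1/100 of a US SMS conversation and under 1/800 of a German one. Use SMS when speed matters and email when the message can wait.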
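
To see how the channel choice scales with volume, the per-conversation figures can be multiplied out. A rough sketch using only the numbers quoted in this section; the helper names are made up for illustration, and real Twilio bills also vary with segment count and carrier fees:

```python
# USD per conversation or thread, copied from the figures in this section.
COST_PER_CONVERSATION = {
    "US SMS": 0.04,
    "UK SMS": 0.18,
    "Finland SMS": 0.27,
    "Germany SMS": 0.35,
    "Email (SES)": 0.0004,
    "Email (Resend)": 0.002,
}


def monthly_messaging_cost(channel, conversations):
    """USD per month for a given number of conversations on one channel."""
    return COST_PER_CONVERSATION[channel] * conversations


def cost_ratio(channel, baseline="Email (SES)"):
    """How many times more expensive a channel is than the baseline channel."""
    return COST_PER_CONVERSATION[channel] / COST_PER_CONVERSATION[baseline]


for channel in COST_PER_CONVERSATION:
    monthly = monthly_messaging_cost(channel, 1_000)
    print(f"{channel:<15} ${monthly:>7.2f} per 1,000 conversations   {cost_ratio(channel):>6.0f}x SES")
```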

Hosting: cheaper than you think

App runtime for an agent that calls external APIs. Not including the LLM, messaging, or database.

Provider	Basic ($/mo)	Medium ($/mo)	High volume ($/mo)
Hetzner	$5	$16	$38
DigitalOcean	$6	$24	$48
Fly.io	$6	$23	$46
Render	$7	$25	$85
Railway	$30	$80	$160

A normal business agent runs on a $5-25/month VPS if the intelligence lives in external APIs. What gets expensive is managed databases, caches, observability, and multiple workers.

Three real scenarios

Solo operator, SMS notification agent

500 US notifications/mo, 50 replies, GPT-4o-mini, Hetzner, 2 hrs/mo maintenance

Line item	Monthly cost
Hosting	$5
LLM API	$0.21
Twilio SMS + number	$5.72
Backups + domain	$1.67
Human maintenance (2 hrs @ $90)	$180
Total	$193/mo

Machine bill: $12.60. The rest is your time.

Small team, Slack internal agent

8 users, 500 chats/mo, 80% mini + 20% GPT-4o, Render, Supabase, Sentry, 4 hrs/mo

Line item	Monthly cost
Slack Pro (8 seats)	$70
Render runtime	$25
Supabase Pro	$25
Mixed model API	$3.72
Sentry Team	$26
Human maintenance (4 hrs @ $90)	$360
Total	$510/mo

The model bill is $3.72. The Slack seats cost about 19 times as much as the AI.

Service business, customer-facing SMS

500 conversations/mo over US SMS, Claude Sonnet 4.5, Fly.io, 8 hrs/mo

Line item	Monthly cost
Fly.io runtime	$23
Supabase Pro	$25
Claude Sonnet 4.5 API	$27
SMS messaging	$20.75
Sentry Team	$26
Human maintenance (8 hrs @ $90)	$720
Total	$842/mo

At 500 conversations a month, the model bill is still only $27. The supervision time dominates.
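
The three scenarios can be dropped into a small script to separate the services bill from the human bill. This is a sketch built only from the line items listed above and the $90/hour rate; `SCENARIOS` and `summarize` are illustrative names, and you would swap in your own items and hours:

```python
HOURLY_RATE = 90  # USD per hour, the loaded rate used in the scenarios above

# Monthly non-human line items (USD) and maintenance hours, copied from the scenarios.
SCENARIOS = {
    "Solo operator": {
        "services": {"Hosting": 5, "LLM API": 0.21, "Twilio SMS + number": 5.72, "Backups + domain": 1.67},
        "hours": 2,
    },
    "Small team": {
        "services": {"Slack Pro": 70, "Render": 25, "Supabase Pro": 25, "Model API": 3.72, "Sentry Team": 26},
        "hours": 4,
    },
    "Service business": {
        "services": {"Fly.io": 23, "Supabase Pro": 25, "Claude Sonnet 4.5 API": 27, "SMS": 20.75, "Sentry Team": 26},
        "hours": 8,
    },
}


def summarize(scenario):
    """Return (services, human, total, human_share) for one scenario, in USD."""
    services = sum(scenario["services"].values())
    human = scenario["hours"] * HOURLY_RATE
    total = services + human
    return services, human, total, human / total


for name, scenario in SCENARIOS.items():
    services, human, total, share = summarize(scenario)
    print(f"{name:<17} services ${services:>7.2f}   human ${human:>4}   total ${total:>7.2f}   human share {share:.0%}")
```

In all three cases the human share is the majority of the total, which is the point of the section: halving maintenance hours moves the bill far more than switching models.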

Four ways to cut costs that work

Route by complexity

Send 80% of requests to GPT-4o-mini ($0.002/chat) and 20% to GPT-4o ($0.03/chat). 1,000 conversations: $7.44 instead of $30. Same trick with Anthropic: 80% Haiku + 20% Sonnet cuts costs 53%.
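
The routing numbers are a weighted average of the per-chat costs. A minimal sketch of that arithmetic, assuming the unrounded per-chat costs from the model table; how you classify a request as simple or complex is not covered here, and the function names are illustrative:

```python
def blended_cost(cheap_cost, strong_cost, cheap_share=0.8):
    """Average USD cost per chat when cheap_share of traffic goes to the cheap model."""
    return cheap_share * cheap_cost + (1 - cheap_share) * strong_cost


def savings_vs_strong(cheap_cost, strong_cost, cheap_share=0.8):
    """Fraction saved compared with sending every chat to the strong model."""
    return 1 - blended_cost(cheap_cost, strong_cost, cheap_share) / strong_cost


# Unrounded per-chat costs (6,000 input + 1,500 output tokens) from the model table.
GPT_4O_MINI, GPT_4O = 0.0018, 0.03
HAIKU, SONNET = 0.0135, 0.0405

print(f"OpenAI mix, 1,000 chats: ${blended_cost(GPT_4O_MINI, GPT_4O) * 1000:.2f}")
print(f"Anthropic mix saves:     {savings_vs_strong(HAIKU, SONNET):.0%}")
```

Lowering `cheap_share` shows how quickly the savings erode if the router escalates more often than planned.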

Cache your prompts

Anthropic's cache-read pricing on Sonnet 4.5 is $0.30/MTok, against $3/MTok for standard input. A repeated 8,000-token system prompt drops from $0.024 to $0.0024 per request. At 10,000 requests a month, that saves $216 on a single prompt block.
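
The caching saving can be checked the same way. This sketch assumes every request is a cache hit and ignores the cost of writing the prompt to the cache, so treat the result as an upper bound; the two prices are the ones quoted above and the function names are illustrative:

```python
STANDARD_INPUT = 3.00  # USD per 1M input tokens, Sonnet 4.5 standard rate quoted above
CACHE_READ = 0.30      # USD per 1M tokens read from cache, as quoted above


def prompt_cost(tokens, price_per_mtok):
    """USD cost of sending `tokens` prompt tokens at a given per-million price."""
    return tokens * price_per_mtok / 1_000_000


def monthly_cache_savings(prompt_tokens, requests_per_month):
    """Upper-bound monthly saving, assuming every request hits the cache."""
    uncached = prompt_cost(prompt_tokens, STANDARD_INPUT)
    cached = prompt_cost(prompt_tokens, CACHE_READ)
    return (uncached - cached) * requests_per_month


print(f"Per request: ${prompt_cost(8_000, STANDARD_INPUT):.4f} -> ${prompt_cost(8_000, CACHE_READ):.4f}")
print(f"Saved at 10,000 requests/month: ${monthly_cache_savings(8_000, 10_000):.2f}")
```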

Batch non-urgent work

Google offers 50% off through the Gemini Batch API. Good candidates are overnight classification, document cleanup, summarization, and enrichment jobs. If it doesn't need a real-time answer, batch it.

Pick the cheapest viable channel

1,000 US SMS notifications: $8.30. 1,000 SES emails: $0.10. In Germany, that same 1,000 SMS would cost $112. If your customers accept email for non-urgent messages, the savings stack up fast.

Frequently asked questions

How much does it cost to run an AI agent per conversation?

A typical business conversation (6,000 input tokens, 1,500 output) costs $0.002 on GPT-4o-mini, $0.03 on GPT-4o, $0.014 on Claude Haiku 4.5, and $0.04 on Claude Sonnet 4.5. The model is rarely the biggest line item. Messaging and human maintenance usually cost more.

What is the biggest hidden cost of running an AI agent?

Human maintenance time. At a $90/hour loaded engineering rate, 4 hours per month of upkeep costs $360. That often exceeds the combined bill for hosting, APIs, and messaging. The Bureau of Labor Statistics puts median US developer salary at $133,080 (May 2024).

Is it cheaper to self-host an LLM or use an API?

An API is cheaper at low volume. Self-hosted Llama on a Hetzner GPU server costs roughly $1.20 to $2.40 per million output tokens depending on throughput, but the GPU idles when traffic is low. At under 10,000 conversations per month, API pricing usually wins because you pay per token, not per hour.

How much does SMS cost for lead notifications?

US SMS on Twilio: $0.0083 per segment. A 5-message conversation (3 out, 2 in): about $0.04. UK SMS: $0.18 per conversation. Germany: $0.35. Finland: $0.27. Costs depend heavily on country and conversation depth.

Should I build my own AI agent or buy a SaaS product?

Buy first when the workflow is standard and time-to-value matters. A solo SMS notification agent costs roughly $1,440 to build and $180/month to maintain. A customer-facing multi-channel agent: $10,800 to build, $720/month to maintain. The model API itself is often the smallest line item.

How can I reduce AI agent costs?

Route 80% of requests to a cheap model (GPT-4o-mini or Claude Haiku) and 20% to a strong model. This cuts model costs by 50-75%. Use prompt caching (Anthropic cache reads are 90% cheaper than standard input). Batch non-urgent work (Google offers a 50% batch discount). Use email for non-urgent messages where possible.