The aggregation problem
Most teams start with a single bill from their LLM provider:
- Total monthly spend: $47,293
- Input tokens: 2.3B
- Output tokens: 890M
That’s it. One number. No context. No way to know what’s working and what’s hemorrhaging money. You can’t answer basic questions:
- Which product feature costs the most?
- Is the chatbot more expensive than the code assistant?
- Did that optimization last week actually work?
- Which team owns the spending spike?
The fundamentals: Input + output tokens
Good news: LLM billing is simpler than cloud billing. Almost everything comes down to two core items:
- Input tokens: What you send to the model (prompts, context, documents)
- Output tokens: What the model generates back
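Given those two numbers and your provider’s per-token prices, cost is simple arithmetic. A minimal sketch, with hypothetical prices (substitute your provider’s real rates):

```python
# Hypothetical per-million-token prices; substitute your provider's rates.
PRICE_PER_M_INPUT = 3.00    # USD per 1M input tokens (assumed)
PRICE_PER_M_OUTPUT = 15.00  # USD per 1M output tokens (assumed)

def llm_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost for one request or an aggregate of requests."""
    return (input_tokens / 1_000_000) * PRICE_PER_M_INPUT \
         + (output_tokens / 1_000_000) * PRICE_PER_M_OUTPUT

# The token volumes from the example bill: 2.3B input, 890M output.
monthly = llm_cost(2_300_000_000, 890_000_000)
print(f"${monthly:,.2f}")  # $20,250.00
```

Note the total here differs from the $47,293 bill above because the prices are made up; the point is the formula, not the rates.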
Break down the spending
The goal is to move from aggregate numbers to granular insights. Here’s the progression:
Level 1: Organization-wide
❌ Total: $47,293/month across all models and applications
You have no idea what to optimize. Every feature looks equally responsible.
Level 2: By app or product
- Customer chatbot: $28,400 (60%)
- Code assistant: $12,900 (27%)
- Email classifier: $5,993 (13%)
Level 3: By feature or use case
- Customer chatbot
- Live support: $18,200 (64%)
- FAQ responses: $7,100 (25%)
- Conversation summaries: $3,100 (11%)
Level 4: By owner and cost per outcome
- Live support (Customer Success Team)
- Monthly spend: $18,200
- Conversations handled: 4,320
- Cost per conversation: $4.21
- Owner: Sarah Chen (VP Customer Success)
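The Level 4 numbers are just unit economics: divide spend by outcomes. A one-liner makes the arithmetic explicit:

```python
def cost_per_outcome(monthly_spend: float, outcomes: int) -> float:
    """Unit economics: dollars spent per business outcome (conversation, ticket, ...)."""
    return monthly_spend / outcomes

# Live support, from the Level 4 breakdown above.
print(round(cost_per_outcome(18_200, 4_320), 2))  # 4.21
```

Once every feature has a cost-per-outcome number, you can compare features on value delivered instead of raw spend.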
How to track spending at the source
You need infrastructure that attributes costs automatically. Here are the three most common approaches:
1. Resource tagging
Tag every LLM-related resource (API keys, endpoints, services) with metadata:
Pros: Automatic attribution once configured
Cons: Requires disciplined tagging from day one
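One way to sketch tagging, assuming you record which API key produced each usage record (all key names and tags below are illustrative):

```python
# Minimal tagging sketch: every LLM resource carries team/feature metadata
# at creation time, so raw usage rolls up to owners automatically.
from collections import defaultdict

RESOURCE_TAGS = {  # illustrative keys and tags
    "key-chatbot":    {"team": "customer-success", "feature": "live-support"},
    "key-codeassist": {"team": "engineering",      "feature": "code-assistant"},
}

def attribute(usage_records):
    """Roll raw usage records up to per-team spend using the resource tags."""
    totals = defaultdict(float)
    for rec in usage_records:
        tags = RESOURCE_TAGS.get(rec["api_key"], {"team": "untagged"})
        totals[tags["team"]] += rec["cost_usd"]
    return dict(totals)

usage = [
    {"api_key": "key-chatbot",    "cost_usd": 12.50},
    {"api_key": "key-codeassist", "cost_usd": 4.75},
    {"api_key": "key-chatbot",    "cost_usd": 3.25},
]
print(attribute(usage))  # {'customer-success': 15.75, 'engineering': 4.75}
```

The `untagged` bucket is the important design choice: it surfaces undisciplined tagging instead of silently dropping spend.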
2. Application-specific endpoints
Create separate API keys or proxy endpoints for each application:
- api.yourcompany.com/chatbot → Customer chatbot
- api.yourcompany.com/code-assist → Code assistant
- api.yourcompany.com/classifier → Email classifier
Pros: Immediate visibility with no code changes
Cons: Requires some infrastructure work upfront
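The routing itself can be as small as a lookup table in a thin proxy. A sketch, with made-up paths and key names:

```python
# Illustrative route table for a thin proxy: each path maps to an application
# label and its own upstream API key, so every request is attributable.
ROUTES = {
    "/chatbot":     {"app": "customer-chatbot", "api_key": "key-chatbot"},
    "/code-assist": {"app": "code-assistant",   "api_key": "key-codeassist"},
    "/classifier":  {"app": "email-classifier", "api_key": "key-classifier"},
}

def route(path: str) -> dict:
    """Resolve an incoming proxy path to its application and upstream key."""
    if path not in ROUTES:
        raise ValueError(f"Unknown endpoint: {path}")
    return ROUTES[path]

print(route("/chatbot")["app"])  # customer-chatbot
```

Because each application gets its own upstream key, the provider’s billing export is already segmented before you do any analysis.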
3. Tracing
Wrap your LLM calls with middleware that logs metadata and traces execution:
- OpenTelemetry: Industry-standard observability framework that captures spans, traces, and metrics across your LLM pipeline
- LangSmith: Purpose-built for LLM applications, tracks prompts, completions, latency, and costs
- Langfuse: Open-source LLM observability with automatic cost tracking and evaluation workflows
- Arize Phoenix: Monitors model performance, token usage, and traces multi-step agent workflows
Pros: Maximum flexibility, control, and deep observability
Cons: Most engineering work required; needs integration with your existing monitoring stack
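For a sense of what such middleware does under the hood, here is a hand-rolled sketch (not any particular vendor’s SDK) that records feature, latency, and token counts for every wrapped call:

```python
# Hand-rolled tracing sketch: a decorator that logs metadata per LLM call.
# Real deployments would ship these records to OpenTelemetry, Langfuse, etc.
import functools
import time

TRACE_LOG = []

def traced(feature: str):
    """Decorator that records metadata for every wrapped LLM call."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)  # expected to return token counts
            TRACE_LOG.append({
                "feature": feature,
                "latency_s": time.perf_counter() - start,
                "input_tokens": result["input_tokens"],
                "output_tokens": result["output_tokens"],
            })
            return result
        return inner
    return wrap

@traced("live-support")
def fake_llm_call(prompt: str):
    # Stand-in for a real provider call; returns a usage-style dict.
    return {"text": "...", "input_tokens": len(prompt.split()), "output_tokens": 42}

fake_llm_call("How do I reset my password?")
print(TRACE_LOG[0]["feature"])  # live-support
```

The trace records are deliberately plain dicts: any of the tools listed above can ingest something shaped like this.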
Identify the owners
Every dollar of LLM spending should have a clear owner, someone who:
- Understands the use case and user experience
- Can make tradeoffs between cost, quality, and speed
- Has authority to approve changes
Depending on the application, that owner is typically:
- A product manager (for user-facing features)
- An engineering lead (for internal tooling)
- A team lead (for department-specific applications)
- You (if it’s your project)
Capture the assignments in a simple ownership table:
| Application/Feature | Owner | Team | Monthly Spend | Priority |
|---|---|---|---|---|
| Chatbot - Live support | Sarah Chen | Customer Success | $18,200 | High |
| Chatbot - FAQ | Sarah Chen | Customer Success | $7,100 | Medium |
| Code assistant | Alex Rivera | Engineering | $12,900 | High |
| Email classifier | Jordan Kim | Operations | $5,993 | Low |
Review the table with stakeholders and pressure-test:
- The spending breakdown (does it look right?)
- The ownership assignments (is Alex the right person for code assist?)
- The optimization priorities (should we start with live support or the code assistant?)