DeepSeek’s latest models, deepseek-v4-pro and deepseek-v4-flash, offer high performance at competitive prices. The latest rates are published on the DeepSeek API documentation website, and you can also retrieve them programmatically. This guide explains how to look up current rates via the API and calculate the total cost for your specific usage.
Model availability: you can access DeepSeek models directly via DeepSeek. The model ID has no prefix (for example, deepseek-v4-pro or deepseek-v4-flash) and the provider is deepseek.

1. Look up current pricing

To get the latest pricing for the DeepSeek V4 models, use the GET /api/models/pricing endpoint. You can filter the rates by the model identifier and the provider.

Example: Fetching direct DeepSeek pricing

Query the deepseek-v4-pro model and filter by the deepseek provider.
curl --request GET \
  --url "https://www.narev.ai/api/models/pricing?model_id=deepseek-v4-pro&provider=deepseek"
The response will give you a detailed breakdown of the costs per token, including prompt, completion, and cache read/write prices.
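The same lookup is easy to script. Here is a minimal Python sketch; `pricing_url` is a hypothetical helper that only assembles the query string, and the actual HTTP call is left to your client of choice:

```python
from urllib.parse import urlencode

PRICING_ENDPOINT = "https://www.narev.ai/api/models/pricing"

def pricing_url(model_id: str, provider: str) -> str:
    """Assemble the pricing lookup URL for a model/provider pair."""
    query = urlencode({"model_id": model_id, "provider": provider})
    return f"{PRICING_ENDPOINT}?{query}"

# A real lookup would then be, for example:
#   import requests
#   rates = requests.get(pricing_url("deepseek-v4-pro", "deepseek")).json()
print(pricing_url("deepseek-v4-pro", "deepseek"))
```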

2. Calculate your exact usage cost

To calculate the exact cost of a specific request from its token usage, use the POST /api/models/pricing/calculate endpoint. This is especially useful because DeepSeek models often offer significant discounts for cached input: by providing the exact breakdown of cached vs. non-cached tokens, you get an accurate cost calculation.

Example: Calculating cost for a direct DeepSeek request

Here is an example of how to calculate the cost for a request made directly through DeepSeek using deepseek-v4-pro:
curl --request POST \
  --url https://www.narev.ai/api/models/pricing/calculate \
  --header 'Content-Type: application/json' \
  --data '{
    "modelId": "deepseek-v4-pro",
    "provider": "deepseek",
    "usage": {
      "inputTokens": {
        "total": 1500,
        "noCache": 500,
        "cacheRead": 1000,
        "cacheWrite": 0
      },
      "outputTokens": {
        "total": 250,
        "text": 200,
        "reasoning": 50
      }
    }
  }'
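Before sending the request, it is worth sanity-checking that the token breakdown is internally consistent. A small Python sketch; the rule that noCache + cacheRead + cacheWrite must equal the input total (and text + reasoning the output total) is an assumption inferred from the example payload above:

```python
def check_usage(usage: dict) -> dict:
    """Validate that token sub-counts add up to their totals before POSTing."""
    inp, out = usage["inputTokens"], usage["outputTokens"]
    input_parts = inp["noCache"] + inp["cacheRead"] + inp["cacheWrite"]
    if input_parts != inp["total"]:
        raise ValueError(f"input tokens: parts sum to {input_parts}, total is {inp['total']}")
    if out["text"] + out["reasoning"] != out["total"]:
        raise ValueError("output token parts do not sum to total")
    return usage

usage = {
    "inputTokens": {"total": 1500, "noCache": 500, "cacheRead": 1000, "cacheWrite": 0},
    "outputTokens": {"total": 250, "text": 200, "reasoning": 50},
}
check_usage(usage)  # passes for the example payload above
```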

Understanding the cost breakdown

The API will return a detailed breakdown of your costs based on the usage you provided. For example, the response might look like this:
{
  "modelId": "deepseek-v4-pro",
  "provider": "deepseek",
  "subprovider": "DeepSeek",
  "usage": {
    "inputTokens": 1500,
    "inputCacheReadTokens": 1000,
    "inputCacheWriteTokens": 0,
    "outputTokens": 250,
    "outputReasoningTokens": 50,
    "webSearchRequests": 0
  },
  "pricing": {
    "input": 0.00000174,
    "output": 0.00000348,
    "request": 0,
    "inputCacheRead": 1.45e-7,
    "inputCacheWrite": 0,
    "internalReasoning": 0,
    "webSearch": 0
  },
  "costBreakdown": {
    "input": 0.00261,
    "output": 0.00087,
    "request": 0,
    "inputCacheRead": 0.000145,
    "inputCacheWrite": 0,
    "outputReasoning": 0,
    "webSearch": 0,
    "total": 0.003625
  }
}
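The breakdown above is straightforward to reproduce locally. In this example response, input cost is the total input tokens times the per-token input rate, with cache reads charged at their own rate on top; the rates come from the response’s pricing object. How narev.ai combines these internally is inferred from the example, so treat this as a sanity check rather than a source of truth for billing:

```python
import math

# Per-token rates taken from the example response's "pricing" object.
RATES = {"input": 0.00000174, "output": 0.00000348, "inputCacheRead": 1.45e-7}

def cost_breakdown(input_tokens: int, cache_read_tokens: int, output_tokens: int) -> dict:
    """Recompute the example's costBreakdown from token counts and rates."""
    input_cost = input_tokens * RATES["input"]
    cache_cost = cache_read_tokens * RATES["inputCacheRead"]
    output_cost = output_tokens * RATES["output"]
    return {
        "input": input_cost,
        "inputCacheRead": cache_cost,
        "output": output_cost,
        "total": input_cost + cache_cost + output_cost,
    }

costs = cost_breakdown(1500, 1000, 250)
assert math.isclose(costs["total"], 0.003625)  # matches the example total
```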
Don’t forget cache savings! Reporting the correct token types, such as cacheRead and reasoning, ensures these savings are reflected accurately in your billing. DeepSeek’s cache read prices are typically much lower than standard input prices.

3. Automate billing with @ai-billing/deepseek

If you are building your app with the Vercel AI SDK, you can automate usage tracking and cost calculation using the @ai-billing/deepseek package. It provides AI SDK middleware that automatically intercepts your DeepSeek usage, calculates the cost using the correct pricing (including cache hits and reasoning tokens), and sends billing events to your preferred destination, such as Polar, Stripe, Lago, or your own custom backend. See the Automate DeepSeek Billing with AI SDK guide for complete examples of setting up the billing middleware with generateText and streamText.