Published V4 rates (USD per 1M tokens)
| Model | Cached input | Cache-miss input | Output |
|---|---|---|---|
| DeepSeek V4-Flash | $0.0028 | $0.14 | $0.28 |
| DeepSeek V4-Pro (discount through May 31, 2026) | $0.003625 | $0.435 | $0.87 |
| DeepSeek V4-Pro (standard) | — | $1.74 | $3.48 |
Credits vs “free unlimited”
New DeepSeek accounts sometimes receive starter credits, but you should assume ongoing usage is metered. DeepSeek TUI does not bypass billing—it simply helps you use tokens efficiently.
Why cache-hit pricing matters
Agent loops often reuse long system prompts and prior context. When prefixes qualify as cache hits, input pricing can drop sharply versus cache-miss rates—especially relevant for marathon sessions with stable scaffolding.
- Light reuse (~30% cache hit): ~0.72× vs all cache-miss input
- Heavy reuse (~70% cache hit): ~0.31× vs all cache-miss input
- Stable long context (~90% cache hit): ~0.13× vs all cache-miss input
Illustrative Flash sessions
The table below applies only DeepSeek V4-Flash list prices to hypothetical token mixes. Real invoices include rounding, routing, and promotional schedules.
| Scenario | Notes | Est. Flash cost |
|---|---|---|
| Small session | Short refactor in one folder, mostly Flash, modest output. | ~$0.02 USD |
| Medium session | Repo walkthrough + several edits; mixed reads and writes. | ~$0.08 USD |
| Large session | Parallel analysis + Pro synthesis step. | ~$0.05 USD |
Context: other flagship APIs
Rough order-of-magnitude comparisons—always verify current vendor pricing.
| Offering | Indicative input / 1M | Indicative output / 1M |
|---|---|---|
| Claude Sonnet 4 (API, indicative) | $3.00 | $15.00 |
| GPT-4o (API, indicative) | $2.50 | $10.00 |
| Gemini 2.5 Pro (API, indicative) | $1.25 | $10.00 |
FAQ
Cheapest default for DeepSeek TUI?
Usually DeepSeek V4-Flash unless a task clearly needs Pro reasoning depth.
Should Pro be always on?
No—toggle upward for specific problems; otherwise tokens add up fast.