Overview
- DeepSeek made its 75% discount on the V4‑Pro permanent and set non‑cached input at $0.435 per million tokens and output at $0.87 per million tokens, plus a 90% cut for input cache hits.
- Developers immediately reported single‑API‑key throughput limits that return 429 Too Many Requests errors when agents fire dozens of calls per second, creating a practical bottleneck for heavy workloads.
- Engineers are unlocking the new price advantage by using multi‑key load balancing and automatic failover with self‑hosted proxies like One‑API or paid key‑pool services such as AiCredits.
- Industry observers say the move is a strategic push for market share and fundraising leverage by making frontier performance far cheaper than top Western models, which can cost many times more per token.
- For users this means much lower run costs for iterative testing and autonomous agents, but it also adds operational work or third‑party fees to manage multiple keys and avoid rate‑limit failures.