The global artificial intelligence price war has intensified following an announcement from Chinese AI startup DeepSeek. The company has officially made its temporary 75% promotional discount on its flagship DeepSeek V4-Pro model permanent. The steep price cut, which was originally scheduled to expire on May 31, 2026, cements the startup's position as the most aggressive price-undercutter in the frontier large language model (LLM) market.
Under this locked-in pricing structure, DeepSeek has permanently cut its Application Programming Interface (API) prices to one-quarter of the original list price. Standard input tokens for the V4-Pro model are now priced at $0.435 per million tokens, while output tokens have dropped to $0.87 per million tokens. For developers whose requests hit the cache, the input cost falls even further, to a fraction-of-a-cent rate of $0.003625 per million tokens.
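As a quick sanity check on the "one-quarter" figure, the pre-discount list prices can be backed out of the quoted rates. The sketch below is illustrative only: the list prices it prints are implied by the 75% discount and are not stated directly in the announcement, and the function name is hypothetical.

```python
# Back out the implied pre-discount list price from the quoted discounted rates.
# Assumption: the 75% discount applies uniformly to input and output tokens.
DISCOUNT = 0.75

# Quoted V4-Pro prices in USD per million tokens.
discounted = {"input": 0.435, "output": 0.87}


def implied_list_price(discounted_price: float, discount: float = DISCOUNT) -> float:
    """Return the list price that yields `discounted_price` after `discount`."""
    return discounted_price / (1 - discount)


implied = {kind: round(implied_list_price(price), 4) for kind, price in discounted.items()}

for kind, price in implied.items():
    print(f"{kind}: ${price:.2f} per million tokens before the discount")
```

Under that assumption, the implied list prices work out to about $1.74 per million input tokens and $3.48 per million output tokens.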
This permanent reduction opens a wide price gap between DeepSeek and its prominent Western competitors. By comparison, OpenAI's flagship GPT-5.5 charges $5.00 per million input tokens and $30.00 per million output tokens, making DeepSeek’s V4-Pro roughly 11.5 times cheaper on input and 34.5 times cheaper on output. The model similarly undercuts Anthropic's Claude 4.7 Opus ($5 input / $25 output) and even beats Google's lighter Gemini 3.5 Flash model on cost efficiency.
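The multiples quoted above follow directly from dividing each competitor's per-million-token price by DeepSeek's. A minimal sketch, using only the prices cited in this article (the dictionary layout and function name are illustrative choices):

```python
# Price ratios between competitor models and DeepSeek V4-Pro,
# using the USD-per-million-token figures quoted in the article.
PRICES = {
    "DeepSeek V4-Pro": {"input": 0.435, "output": 0.87},
    "GPT-5.5": {"input": 5.00, "output": 30.00},
    "Claude 4.7 Opus": {"input": 5.00, "output": 25.00},
}


def price_ratio(competitor: str, kind: str, baseline: str = "DeepSeek V4-Pro") -> float:
    """How many times more expensive `competitor` is than `baseline` for `kind` tokens."""
    return PRICES[competitor][kind] / PRICES[baseline][kind]


ratios = {
    name: {kind: round(price_ratio(name, kind), 1) for kind in ("input", "output")}
    for name in PRICES
    if name != "DeepSeek V4-Pro"
}

for name, r in ratios.items():
    print(f"{name}: {r['input']}x on input, {r['output']}x on output")
```

On these figures, GPT-5.5 comes out at roughly 11.5x on input and 34.5x on output, and Claude 4.7 Opus at roughly 11.5x on input and 28.7x on output.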
Industry analysts point out that DeepSeek is able to sustain these razor-thin margins by optimizing its architectures for domestic Chinese infrastructure. Rather than relying heavily on restricted or black-market Western hardware, DeepSeek has tailored its software to run efficiently on large clusters of Huawei Ascend 950 chips. The large-scale deployment of these domestic supernodes has significantly reduced server operating costs, allowing the company to prioritize global market share over per-unit revenue.
The strategy is aimed squarely at high-volume enterprise users and creators building complex AI agent workflows that process very large blocks of text. Because the V4-Pro features a 1 million token context window and can output up to 384,000 tokens in a single request, costs at standard commercial API rates can compound rapidly for businesses. DeepSeek's aggressive pricing presents a highly disruptive, budget-friendly alternative that forces the wider tech ecosystem to rethink the baseline cost of advanced AI inference.
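To see how quickly costs compound at this scale, consider a single long-context request. The sketch below assumes a hypothetical request of 600,000 input tokens and 384,000 output tokens (the workload is an illustration, not a figure from the announcement), ignores cache discounts, and uses only the prices quoted in this article.

```python
# Estimate the cost of one large request at each provider's quoted rates.
# Prices are USD per million tokens; the request size below is hypothetical.
PRICES_PER_MILLION = {
    "DeepSeek V4-Pro": (0.435, 0.87),
    "GPT-5.5": (5.00, 30.00),
    "Claude 4.7 Opus": (5.00, 25.00),
}


def request_cost(input_tokens: int, output_tokens: int, input_price: float, output_price: float) -> float:
    """Cost in USD of one request, with no cache hits assumed."""
    return (input_tokens / 1_000_000) * input_price + (output_tokens / 1_000_000) * output_price


# Hypothetical long-context agent request.
INPUT_TOKENS = 600_000
OUTPUT_TOKENS = 384_000

costs = {
    model: request_cost(INPUT_TOKENS, OUTPUT_TOKENS, in_price, out_price)
    for model, (in_price, out_price) in PRICES_PER_MILLION.items()
}

for model, cost in costs.items():
    print(f"{model}: ${cost:.2f} per request")
```

For this assumed workload, the request costs about $0.60 on V4-Pro, compared with about $14.52 on GPT-5.5 and $12.60 on Claude 4.7 Opus. Multiplied across thousands of agent runs per day, the gap becomes the dominant line item.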






