DeepSeek Unveils V3.2-exp: A New AI Model to Slash Inference Costs

DeepSeek's new model, V3.2-exp, cuts AI inference costs by up to 50% for long texts. This could democratize AI, fostering wider innovation.


DeepSeek, a leading AI innovator, has unveiled V3.2-exp, a new experimental model designed to lower the cost of AI inference. This development could significantly benefit startups, universities, and nonprofits, opening up access to AI and fostering wider innovation.

The V3.2-exp model is open-weight and available on Hugging Face for independent testing and verification. Its central feature, Sparse Attention, pursues efficiency rather than scale: a two-step mechanism first selects the most relevant excerpts of the input, then identifies the key tokens within them, so the model attends only to what matters. The result is a leaner, faster model that reduces server strain while maintaining accuracy.
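The two-step select-then-attend idea can be sketched roughly as follows. This is a minimal illustrative sketch in NumPy, not DeepSeek's actual implementation: the dot-product indexer, the top-k budget, and the single-query setup are all simplifying assumptions.

```python
import numpy as np

def sparse_attention(q, K, V, k_budget):
    """Toy sketch of two-step sparse attention:
    1) score every context token with a cheap relevance indexer,
    2) run standard attention only over the top-k scoring tokens."""
    # Step 1: cheap indexer scores (a plain dot product here; the real
    # model uses a learned lightweight indexer).
    scores = K @ q                           # shape: (n_tokens,)
    top_k = np.argsort(scores)[-k_budget:]   # indices of most relevant tokens

    # Step 2: softmax attention restricted to the selected tokens,
    # so the expensive step costs O(k_budget) instead of O(n_tokens).
    K_sel, V_sel = K[top_k], V[top_k]
    logits = K_sel @ q / np.sqrt(q.shape[0])
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ V_sel                   # attended output vector

rng = np.random.default_rng(0)
n_tokens, dim = 1024, 64
q = rng.normal(size=dim)
K = rng.normal(size=(n_tokens, dim))
V = rng.normal(size=(n_tokens, dim))
out = sparse_attention(q, K, V, k_budget=128)
print(out.shape)  # (64,)
```

The savings come from step 2: full attention touches all 1,024 tokens, while the sparse variant touches only the 128 that the indexer scored highest, which is where the long-context cost reduction comes from.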

DeepSeek's preliminary reports suggest that V3.2-exp can cut inference costs for long-context operations by nearly half, with Sparse Attention driving most of the savings. This focus on efficiency presents an alternative to scale-driven development in the broader U.S.-China AI rivalry.

DeepSeek's V3.2-exp model, with its potential to halve inference costs for long texts, could democratize AI access. Following its successful R1 model earlier in 2025, DeepSeek continues to push boundaries in efficient AI development.
