FSNN | Free Speech News Network
Friday, April 24
Cryptocurrency & Free Speech Finance

DeepSeek V4 Is Here—Its Pro Version Costs 98% Less Than GPT 5.5 Pro

By News Room · 2 hours ago · 10 min read · 1,098 views
In brief

  • DeepSeek released its new V4-Pro model with 1.6 trillion parameters.
  • It costs $1.74/$3.48 per million input/output tokens, roughly 1/20th the price of Claude Opus 4.7 and 98% less than GPT 5.5 Pro.
  • DeepSeek trained V4 partly on Huawei Ascend chips, circumventing U.S. export restrictions, and says that once 950 new supernodes come online later in 2026, the Pro model’s already-low price will drop further.

DeepSeek is back, and it showed up a few hours after OpenAI dropped GPT-5.5. Coincidence? Maybe. But if you’re a Chinese AI lab that the U.S. government has been trying to slow down with chip export bans for the past three years, your sense of timing gets pretty sharp.

The Hangzhou-based lab released preview versions of DeepSeek-V4-Pro and DeepSeek-V4-Flash today, both open-weight, both with one-million-token context windows. That means you can feed it a context roughly the size of the Lord of the Rings trilogy in a single request. Both are also priced well below anything comparable in the West, and both are free for anyone capable of running them locally.

DeepSeek’s last major disruption, R1 in January 2025, wiped $600 billion from Nvidia’s market cap in a single day as investors questioned whether American companies really needed such huge investments to produce results a small Chinese lab had achieved at a fraction of the cost. V4 is a different kind of move: quieter, more technical, and more focused on efficiency for anyone actually building with AI.

Two models, very different jobs

Of the two new models, DeepSeek’s V4-Pro is the big one, with 1.6 trillion total parameters. To put that in perspective, parameters are the internal “settings” or “brain cells” a model uses to store knowledge and recognize patterns; the more parameters a model has, the more complex information it can theoretically hold. That makes V4-Pro the biggest open-source model in the LLM market to date. The size may sound ridiculous until you learn it activates only 49 billion of them per inference pass.

This is the Mixture-of-Experts trick DeepSeek has refined since V3: The full model sits there, but only the relevant slice of it wakes up for any given request. More knowledge, same compute bill.
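To make the idea concrete, here is a toy mixture-of-experts sketch. The expert count, top-k value, and sizes are illustrative only, not DeepSeek’s real configuration, and mean routing via a softmax over the top-scoring experts is the generic MoE recipe rather than anything V4-specific:

```python
import numpy as np

rng = np.random.default_rng(0)
N_EXPERTS, TOP_K, D = 32, 2, 16   # toy sizes, not DeepSeek's real config

# Each "expert" here is just a small weight matrix.
experts = [rng.standard_normal((D, D)) for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS))

def moe_forward(x):
    """Route a token to its TOP_K most relevant experts; the other
    N_EXPERTS - TOP_K experts stay idle for this token."""
    scores = x @ router                    # affinity with each expert
    chosen = np.argsort(scores)[-TOP_K:]   # indices of the top-k experts
    w = np.exp(scores[chosen])
    w /= w.sum()                           # softmax over chosen experts only
    out = sum(wi * (x @ experts[i]) for i, wi in zip(chosen, w))
    return out, chosen

out, active = moe_forward(rng.standard_normal(D))
print(out.shape, len(active))   # only 2 of 32 experts did any work
```

The full set of experts holds the knowledge; per token, only the selected slice costs compute.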

“DeepSeek-V4-Pro-Max, the maximum reasoning effort mode of DeepSeek-V4-Pro, significantly advances the knowledge capabilities of open-source models, firmly establishing itself as the best open-source model available today,” DeepSeek wrote in the model’s official card on Hugging Face. “It achieves top-tier performance in coding benchmarks and significantly bridges the gap with leading closed-source models on reasoning and agentic tasks.”

V4-Flash is the practical one: 284 billion total parameters, 13 billion active. It’s designed to be faster, cheaper, and according to DeepSeek’s own benchmarks, “achieves comparable reasoning performance to the Pro version when given a larger thinking budget.”

Both support one million tokens of context, roughly 750,000 words. And that’s a standard feature, not a premium tier.

DeepSeek’s (not so) secret sauce: Making attention not terrible at scale

Here’s the technical part, for the nerds and anyone curious about the magic powering the model. DeepSeek doesn’t hide its secrets; everything is available for free, and the full paper is on GitHub.

Standard AI attention—the mechanism that lets a model understand relationships between words—has a brutal scaling problem. Every time you double the context length, the compute cost roughly quadruples. So running a model on a million tokens isn’t just twice as expensive as 500,000 tokens. It’s four times as expensive. This is why long context has historically been a checkbox labs add and then silently throttle behind rate limits.
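The quadratic blowup is easy to verify with back-of-the-envelope arithmetic:

```python
def full_attention_cost(n_tokens):
    # Full attention scores every token against every other token,
    # so the work grows with the square of the context length.
    return n_tokens ** 2

# Doubling the context from 500K to 1M tokens quadruples the work.
ratio = full_attention_cost(1_000_000) / full_attention_cost(500_000)
print(ratio)  # 4.0
```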

DeepSeek invented two new attention types to get around this. The first, Compressed Sparse Attention, works in two steps. It first compresses groups of tokens—say, every 4 tokens—into a single entry. Then, instead of attending to all of those compressed entries, it uses a “Lightning Indexer” to pick only the most relevant results for any given query. Your model goes from attending to a million tokens to attending to a much smaller set of the most important chunks, kind of like a librarian who doesn’t read every book but knows exactly which shelf to check.

The second, Heavily Compressed Attention, is more aggressive. It collapses every 128 tokens into a single entry—no sparse selection, just brutal compression. You lose fine-grained detail, but you get an extremely cheap global view. The two attention types run in alternating layers, so the model gets both the detail and the overview.
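A toy sketch of the two schemes helps make them concrete. Mean pooling stands in for the learned compression, and raw dot-product scores stand in for the Lightning Indexer; the real mechanisms are trained, not hand-written like this:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 16                                   # toy hidden size
keys = rng.standard_normal((4096, D))    # a "long" context
q = rng.standard_normal(D)               # current query

def compress(entries, block):
    """Mean-pool every `block` tokens into one entry (the real model
    learns this compression instead of averaging)."""
    n, d = entries.shape
    return entries[: n - n % block].reshape(-1, block, d).mean(axis=1)

def sparse_select(query, entries, top_k):
    """Lightning-Indexer stand-in: cheap relevance scores, keep top-k."""
    scores = entries @ query
    return entries[np.argsort(scores)[-top_k:]]

# Compressed Sparse Attention: 4-to-1 compression, then sparse selection.
csa = sparse_select(q, compress(keys, block=4), top_k=64)

# Heavily Compressed Attention: brutal 128-to-1 compression, no selection.
hca = compress(keys, block=128)

print(len(csa), len(hca))   # 64 and 32 entries instead of all 4096 keys
```

The point of the alternation is visible in the entry counts: the sparse path keeps a detailed but narrow slice, while the heavy path keeps a coarse but complete overview.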

The result, from the technical paper: At one million tokens, V4-Pro uses 27% of the compute its predecessor (V3.2) needed. KV cache—the memory the model needs to track context—drops to just 10% of V3.2. V4-Flash pushes that further: 10% of compute, 7% of memory.

All of this lets DeepSeek offer a much cheaper price per token than its competitors while delivering comparable results. To put that in dollar terms: GPT-5.5 launched yesterday at $5 input and $30 output per million tokens, with GPT-5.5 Pro priced at $30 per million input tokens and $180 per million output tokens.

DeepSeek V4-Pro is $1.74 input and $3.48 output. V4-Flash is $0.14 input and $0.28 output. Cline CEO Saoud Rizwan pointed out that if Uber had used DeepSeek instead of Claude, its 2026 AI budget—reportedly enough for four months of usage—would have lasted seven years.
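Using the per-million-token prices quoted above, the gap compounds quickly at pipeline scale. The daily token volumes below are invented for illustration:

```python
# (input, output) price per million tokens, as quoted in this article.
PRICES = {
    "gpt-5.5-pro":       (30.00, 180.00),
    "deepseek-v4-pro":   (1.74, 3.48),
    "deepseek-v4-flash": (0.14, 0.28),
}

def daily_cost(model, input_tokens, output_tokens):
    """Dollar cost of one day's traffic at the listed rates."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

# Hypothetical workload: 10M input and 2M output tokens per day.
for model in PRICES:
    print(f"{model}: ${daily_cost(model, 10_000_000, 2_000_000):,.2f}/day")
# gpt-5.5-pro:       $660.00/day
# deepseek-v4-pro:    $24.36/day
# deepseek-v4-flash:   $1.96/day
```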

deepseek v4 is now the cheapest sota model available at 1/20th the cost of opus 4.7.

for perspective, if uber used deepseek instead of claude their 2026 ai budget would have lasted 7 years instead of only 4 months. pic.twitter.com/i9rJZzvRBV

— Saoud Rizwan (@sdrzn) April 24, 2026

The benchmarks

DeepSeek does something unusual in its technical report: It publishes the gaps. Most model releases cherry-pick the benchmarks where they win. DeepSeek ran the full comparison against GPT-5.4 and Gemini-3.1-Pro, found that V4-Pro’s reasoning lags behind those models by about three to six months, and printed it anyway.

Where V4-Pro-Max actually wins: Codeforces, a competitive programming benchmark rated like human chess, where V4-Pro scored 3,206, placing it around 23rd among actual human contest participants. On Apex Shortlist, a curated set of hard math and STEM problems, it hit 90.2% versus Opus 4.6’s 85.9% and GPT-5.4’s 78.1%. On SWE-Verified, which measures whether a model can resolve real GitHub issues pulled from actual open-source repositories, it scored 80.6%, matching Claude Opus 4.6.

Where it trails: multitasking benchmark MMLU-Pro (Gemini-3.1-Pro at 91.0% vs V4-Pro at 87.5%), expert knowledge benchmark GPQA Diamond (Gemini 94.3% vs V4-Pro 90.1%), and Humanity’s Last Exam, a graduate-level benchmark where Gemini-3.1-Pro’s 44.4% still beats V4-Pro’s 37.7%.

On long context specifically, V4-Pro leads open-source models and beats Gemini-3.1-Pro on the CorpusQA benchmark (a test simulating real document analysis at one million tokens), but loses to Claude Opus 4.6 on MRCR—a test measuring how well a model retrieves specific needles buried deep in a very long haystack.

Built to run agents, not just answer questions

The agentic stuff is where this release gets interesting for developers actually shipping products.

V4-Pro can run in Claude Code, OpenCode, and other AI coding tools. According to DeepSeek’s internal survey of 85 developers who used V4-Pro as their primary coding agent, 52% said it was ready to be their default model, 39% leaned toward yes, and fewer than 9% said no. Internal employees said it outperforms Claude Sonnet and approaches Claude Opus 4.5 on agentic coding tasks.

Artificial Analysis, which runs independent evaluations of AI models on real-world tasks, ranked V4-Pro first among all open-weight models on GDPval-AA—a benchmark testing economically valuable knowledge work across finance, legal, and research tasks, scored via Elo. V4-Pro-Max scored 1,554 Elo, ahead of GLM-5.1 (1,535) and MiniMax’s M2.7 (1,514). For reference, Claude Opus 4.6 scores 1,619 on the same benchmark—still ahead, but the gap is closing.

DeepSeek V4 Pro is the #1 open weights model on GDPval-AA, our agentic real-world work tasks evaluation@deepseek_ai has released V4 Pro (1.6T total / 49B active) and V4 Flash (284B total / 13B active). V4 is DeepSeek’s first new size since V3, with all intermediate models… pic.twitter.com/2kJWVrKQjF

— Artificial Analysis (@ArtificialAnlys) April 24, 2026

DeepSeek’s V4 also introduces something called “interleaved thinking.” In previous models, if you were running an agent that made multiple tool calls (say, it searched the web, then ran some code, then searched again), the model’s reasoning context got flushed between rounds. At each new step, the model had to rebuild its mental model from scratch. V4 retains the full chain of thought across tool calls, so a 20-step agent workflow doesn’t suffer from amnesia halfway through. This matters more than it sounds for anyone running complex automated pipelines.
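A minimal way to picture the difference, using a hypothetical message log rather than DeepSeek’s actual API format:

```python
def run_agent(tool_results, interleaved):
    """Simulate an agent loop. With interleaving off, reasoning messages
    are flushed before each new tool round; with it on, they accumulate."""
    history = []
    for step, result in enumerate(tool_results):
        if not interleaved:
            # Old behavior: drop all prior reasoning before this round.
            history = [m for m in history if m[0] != "thinking"]
        history.append(("thinking", f"plan for step {step}"))
        history.append(("tool", result))
    return history

steps = ["web_search", "run_code", "web_search"]
kept = sum(m[0] == "thinking" for m in run_agent(steps, interleaved=True))
lost = sum(m[0] == "thinking" for m in run_agent(steps, interleaved=False))
print(kept, lost)   # 3 1: the full chain survives vs. only the latest step
```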

DeepSeek and the U.S.-China AI war

The U.S. has been restricting high-end Nvidia chip exports to China since 2022. The stated goal was to slow Chinese AI development, but the chip ban didn’t stop DeepSeek; instead it pushed the lab to invent a more efficient architecture and build out a domestic hardware supply.

DeepSeek didn’t release V4 in a vacuum; the AI space has been flush with activity of late. Anthropic shipped Claude Opus 4.7 on April 16, a model Decrypt tested and found strong on coding and reasoning, with notably high token usage. A day earlier, Anthropic also disclosed Claude Mythos, a cybersecurity model it says it can’t release publicly because it’s too good at autonomous network attacks.

Xiaomi dropped MiMo V2.5 Pro on April 22, going full multimodal—image, audio, video. Costs $1 input and $3 output per million tokens. It matches Opus 4.6 on most coding benchmarks. Three months ago, nobody was talking about Xiaomi as a frontier AI company. Now it’s shipping competitive models faster than most Western labs.

OpenAI’s GPT-5.5 landed yesterday with costs spiking up to $180 per million tokens of output in the Pro version. It beats V4-Pro on Terminal Bench 2.0 (82.7% vs 70.0%), which tests complex command-line agent workflows. But it costs considerably more than V4-Pro for equivalent tasks. That same day Tencent released Hy3, another state-of-the-art model focused on efficiency.

What this means for you

So with so many new models available, the question developers are actually asking is: when is the premium worth it?

For enterprise, the math may have changed. A model that leads open-source benchmarks at $1.74 per million input tokens means large-scale document processing, legal review, or code generation pipelines that were expensive six months ago are now much cheaper. The one-million-token context means you can feed entire codebases or regulatory filings in a single request instead of chunking them across multiple calls.

Besides, its open-source nature means it can not only be run for free on local hardware but also be customized and improved to fit a company’s needs and use cases.

For developers and solo builders, V4-Flash is the one to watch. At $0.14 input and $0.28 output, it’s cheaper than models that were considered budget options a year ago—and it handles most tasks the Pro version handles. DeepSeek’s existing deepseek-chat and deepseek-reasoner endpoints already route to V4-Flash in non-thinking and thinking modes respectively, so if you’re on the API, you’re already using it.
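For reference, a request against those endpoints takes the widely used OpenAI-style chat-completions shape. The base URL and payload layout below are a sketch based on DeepSeek’s public documentation at the time of writing, so verify against current docs before relying on them:

```python
import json

BASE_URL = "https://api.deepseek.com"   # assumed endpoint; check DeepSeek's docs

def build_chat_request(prompt, thinking=False):
    """Assemble an OpenAI-style chat payload. Per the article,
    `deepseek-reasoner` is the thinking-mode endpoint and
    `deepseek-chat` the non-thinking one; both route to V4-Flash."""
    return {
        "model": "deepseek-reasoner" if thinking else "deepseek-chat",
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_chat_request("Summarize this regulatory filing.", thinking=True)
print(json.dumps(req, indent=2))
```

Any OpenAI-compatible client pointed at `BASE_URL` with a DeepSeek API key should accept a payload of this shape.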

The models are text-only for now. DeepSeek said it’s working on multimodal capabilities, which means other big labs from Xiaomi to OpenAI still have that edge. Both models are MIT licensed and available on Hugging Face today. The old deepseek-chat and deepseek-reasoner endpoints retire on July 24, 2026.
