
CAISI Says DeepSeek V4 Pro Trails U.S. Frontier by About 8 Months

US Government Says China's Best AI Models Lag Behind. Experts Aren't So Sure

Decrypt

Key Point

CAISI, a NIST unit, released an evaluation on May 1 that said DeepSeek V4 Pro lags the U.S. frontier by about eight months and called it the most capable Chinese AI model it has tested. CAISI used Item Response Theory across nine benchmarks, including two non-public datasets, and estimated DeepSeek V4 Pro at about 800 points versus GPT-5.5 at 1,260 and Claude Opus 4.6 at 999. For the cost comparison, CAISI filtered out U.S. models that performed significantly worse or cost significantly more per token than DeepSeek, leaving only GPT-5.4 mini; DeepSeek was cheaper than that model on five of seven benchmarks. By contrast, Stanford's 2026 AI Index said the U.S.-China gap on public leaderboards had narrowed to 2.7%, and Ex0bit said there is no eight-month gap.
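For readers unfamiliar with Item Response Theory, the sketch below shows the general idea in its simplest form, the one-parameter (Rasch) model: each benchmark item has a difficulty, each AI model has a single ability score, and the ability is chosen to best explain which items the model solved. This is only an illustration under stated assumptions. The function names, item difficulties, and response patterns are hypothetical, and nothing here reproduces CAISI's actual model, data, or point scale.

```python
import math

def p_correct(theta, difficulty):
    """Rasch (1PL) probability that a model with ability theta solves an item."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def estimate_ability(responses, difficulties, steps=50):
    """Maximum-likelihood ability estimate via Newton's method.

    responses: list of 0/1 outcomes, one per item.
    difficulties: item difficulties, assumed already known (hypothetical here).
    """
    # With all successes or all failures the estimate is unbounded.
    if not 0 < sum(responses) < len(responses):
        raise ValueError("need mixed responses for a finite estimate")
    theta = 0.0
    for _ in range(steps):
        probs = [p_correct(theta, b) for b in difficulties]
        gradient = sum(r - p for r, p in zip(responses, probs))
        curvature = sum(p * (1.0 - p) for p in probs)
        theta += gradient / curvature
    return theta

# Hypothetical items ordered from easy to hard, and two made-up response patterns.
item_difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
stronger_model = [1, 1, 1, 1, 0]
weaker_model = [1, 1, 0, 0, 0]

theta_strong = estimate_ability(stronger_model, item_difficulties)
theta_weak = estimate_ability(weaker_model, item_difficulties)
print(f"stronger: {theta_strong:.2f}, weaker: {theta_weak:.2f}")
```

Because abilities and difficulties share one scale, results from different benchmarks can in principle be pooled into a single score per model, which is the kind of comparison the point estimates above express. How the item difficulties are set matters a great deal, which is one reason non-public datasets make such rankings harder to check from outside.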

Market Sentiment

Neutral, Event-driven.

Reason: CAISI published an evaluation that placed DeepSeek V4 Pro about eight months behind the U.S. frontier, but the event does not directly change crypto market access or rules.

Similar Past Cases

Government-backed technology scorecards usually shape policy and competitiveness debates before they change market pricing. This case could diverge because the dispute over non-public benchmarks may limit how much weight readers give the ranking.

Ripple Effect

This report mainly affects AI competition narratives rather than crypto market plumbing, so any spillover would likely come through broad risk sentiment or future policy debate. If later government actions start using similar rankings to justify tighter technology controls, then the impact could spread beyond the AI sector.

Opportunities & Risks

Opportunities: The main point to monitor is whether CAISI's fuller methodology write-up resolves the dispute over non-public benchmarks. Clearer methodology would make future model comparisons easier to trust.

Risks: The main risk is that the methodology dispute stays unresolved and keeps benchmark comparisons contested. In that case, this report may remain a weak signal for near-term market positioning.

This content is an AI-generated summary/analysis for informational purposes only and does not constitute investment advice.