3 February 2025

What DeepSeek r1 Means—and What It Doesn’t

Dean W. Ball

On Jan. 20, the Chinese AI company DeepSeek released a language model called r1, and the AI community (as measured by X, at least) has talked about little else since. The model is the first to publicly match the performance of OpenAI’s frontier “reasoning” model, o1, beating frontier labs Anthropic, Google DeepMind, and Meta to the punch. The model matches, or comes close to matching, o1 on benchmarks like GPQA (graduate-level science and math questions), AIME (an advanced math competition), and Codeforces (a competitive programming platform).

What’s more, DeepSeek released the “weights” of the model (though not the data used to train it), along with a detailed technical paper describing much of the methodology needed to produce a model of this caliber, a practice of open science that has largely ceased among American frontier labs (with the notable exception of Meta). As of Jan. 26, the DeepSeek app had risen to number one on the Apple App Store’s list of most downloaded apps, just ahead of ChatGPT and far ahead of competitor apps like Gemini and Claude.

Alongside the main r1 model, DeepSeek released smaller versions (“distillations”) that can be run locally on reasonably well-configured consumer laptops (rather than in a large data center). And even for the full model running in the cloud, DeepSeek’s price is 27 times lower than OpenAI’s price for its competing o1.
