Zeyi Yang
On January 20, DeepSeek, a relatively unknown AI research lab from China, released an open source model that’s quickly become the talk of the town in Silicon Valley. According to a paper authored by the company, DeepSeek-R1 beats the industry’s leading models like OpenAI o1 on several math and reasoning benchmarks. In fact, on many metrics that matter—capability, cost, openness—DeepSeek is giving Western AI giants a run for their money.
DeepSeek’s success points to an unintended outcome of the tech cold war between the US and China. US export controls have severely curtailed the ability of Chinese tech firms to compete on AI in the Western way—that is, infinitely scaling up by buying more chips and training for a longer period of time. As a result, most Chinese companies have focused on downstream applications rather than building their own models. But with its latest release, DeepSeek proves that there’s another way to win: by revamping the foundational structure of AI models and using limited resources more efficiently.
“Unlike many Chinese AI firms that rely heavily on access to advanced hardware, DeepSeek has focused on maximizing software-driven resource optimization,” explains Marina Zhang, an associate professor at the University of Technology Sydney, who studies Chinese innovations. “DeepSeek has embraced open source methods, pooling collective expertise and fostering collaborative innovation. This approach not only mitigates resource constraints but also accelerates the development of cutting-edge technologies, setting DeepSeek apart from more insular competitors.”