Artificial Intelligence

DeepSeek poses a challenge to Beijing as much as to Silicon Valley

Author: Editors Desk
February 2, 2025 at 00:35
Photograph: Anthony Gerace

The story of Liang Wenfeng, the model-maker’s mysterious founder

With the release of its latest artificial-intelligence (AI) model, DeepSeek, an obscure Chinese firm, has laid waste to several years of American policy meant to hold back Chinese innovation and, in the process, blown a hole in the valuations of companies from Nvidia, America’s AI-chip champion, to Siemens Energy, a manufacturer of electrical equipment used in data centres. In demonstrating its ability to innovate around American export restrictions, DeepSeek has raised doubts as to whether access to piles of cutting-edge semiconductors and related equipment is as important as previously thought when it comes to training AI models.

The man at the centre of it all is Liang Wenfeng, DeepSeek’s 40-year-old founder. It is unclear how much he has relished the global market turmoil he has unleashed. A high-school classmate who recently spoke to local media said Mr Liang was hiding out in his home town for the lunar new year, which started on January 29th. Playfully mocked on Chinese social media for his skinny, pale appearance, Mr Liang remains a mystery to most people. Those who have had professional dealings with DeepSeek say he is obsessed with human-like artificial general intelligence (AGI) and the impact it could have on the world. In his pursuit of it, DeepSeek’s founder is up-ending ideas about technological progress both in the West and China.

Public information on Mr Liang is scant. Born into a family of teachers in an impoverished village near the southern city of Zhanjiang in 1985, he was a gifted student. A former instructor claimed he mastered university-level maths in middle school. In 2002 he gained entry into an electronic-information degree at Zhejiang University, a prestigious school in the eastern city of Hangzhou. A master’s degree at the same university, under a well-known machine-vision scientist, exposed him to the field of AI.

At the time, Hangzhou was a bustling hub for internet technology and home to rising companies such as Alibaba, an e-merchant. Mr Liang and several classmates remained in the city and began experimenting with quantitative investing models, which do not rely on company fundamentals but on crunching reams of data. In 2013 Mr Liang and three classmates launched an investment group called Yakebi in an attempt to monetise the trading models they had built.

Two years later Mr Liang co-founded High-Flyer, a quantitative hedge fund that grew rapidly alongside dozens of similar firms during a period of deregulation and market volatility in China. In 2021 it claimed to be managing as much as 100bn yuan ($14bn), though it appears to have rapidly shrunk in size in the latter half of that year. Quant funds have routinely tussled with Chinese regulators, who view them as profiting from market routs. Industry insiders say High-Flyer made a name for itself as one of the most aggressive quant funds, regularly drawing the ire of securities regulators.

DeepSeek’s origins lie in an effort to improve High-Flyer’s algorithms. In 2019 the firm invested 200m yuan to set up a separate unit to develop its own deep-learning platform, called “Fire-Flyer 1”. The fund poured 1bn yuan into the effort in 2021 in order to launch a second iteration armed with 10,000 of Nvidia’s A100 graphics-processing units. This made High-Flyer an outlier: at the time just four other firms in China held such large arsenals of powerful chips, all of which were tech giants such as Alibaba. DeepSeek was made a standalone company in 2023.

It delivered its first jolt to the market in May last year, when it released an ultra-cheap chatbot based on its V2 model. That kicked off a price war in China’s AI industry, forcing the country’s biggest tech firms (Alibaba, Baidu, ByteDance and Tencent) to lower their own prices.

By Mr Liang’s own telling, this was not a ploy to capture more users. In July he said costs had fallen as DeepSeek explored new model structures, something that set it apart from others. Although rival Chinese AI firms have been conducting their own research into models, their disadvantage in computing power, owing to American export restrictions, has led them to focus more on creating clever applications that use the technology. Many Chinese AI companies have used Llama, the family of large language models developed by Meta, an American social-media firm, as a basis for their applications.

Deep thoughts

For Mr Liang, developing models using less computing power is an essential step in pursuit of his longer-term objective. “Our goal is AGI, which requires us to explore new model structures to achieve superior capabilities within limited resources,” he has told local media.

DeepSeek’s new R1 model, which has shocked the West, suggests it is making progress. The company says it cost less than $6m to train, a tiny fraction of comparable models from firms such as OpenAI, maker of ChatGPT. Sam Altman, OpenAI’s boss, has called R1 “impressive” (though he has also promised to produce “much better models”, adding that it is “legit invigorating to have a new competitor”).

DeepSeek certainly has its doubters. Early testing seems to confirm that R1 is as powerful as its maker says it is. But some have questioned whether the firm has underplayed the number of high-end chips it used to develop the model, even if others argue its claims are plausible. There is also speculation that DeepSeek has trained its models by studying the results of American ones, a process known as “distillation”. OpenAI has said it has evidence that points to DeepSeek distilling its models, in violation of its terms of service.

Even if DeepSeek’s efficiency gains are not as impressive as thought, they still pose a challenge to thinking both in Silicon Valley and Beijing. Chinese state media has been quick to champion DeepSeek as a national asset in the country’s fight for AI supremacy. Mr Liang was invited to meet with Li Qiang, China’s premier, on January 20th, alongside a handful of other entrepreneurs.

Yet as Zhang Zhiwei of Pinpoint Asset Management, an investment firm, points out, DeepSeek’s achievements did not emerge from one of China’s myriad government-backed research institutes or state-controlled companies. Mr Liang seems to control most of the shares in DeepSeek, and has steered clear of China’s state-dominated venture-capital industry.

Mr Liang views China’s role over the past 30 years as that of a technological “follower”, building on foundations developed in the West. The gap between America and China is between “originality and imitation”, he said in an interview with local media in July. Nvidia’s success, he argues, has not relied solely on its own performance, but also on technological collaboration among Western companies. China’s efforts to imitate Western computing power have fallen short, in his view, because it lacks this type of collaboration, despite a capital-intensive state-led effort to create it. DeepSeek may be a wake-up call not only for Silicon Valley, but also for China’s leaders in Beijing.

