DeepSeek has rattled the American tech giants. In recent weeks the Chinese AI company has released models whose capabilities match those of its US rivals: R1 is comparable to o1, OpenAI's most advanced reasoning model, while the less advanced DeepSeek V3 plays in the same league as Anthropic's Claude 3.5 Sonnet or OpenAI's GPT-4o.
OpenAI: the tables turned by DeepSeek
What has caused a commotion in the United States, above all, is the way DeepSeek trained these models: because of US export restrictions, the company uses Nvidia GPUs (2,048 H800 cards for V3) that are much less powerful than their American-market counterparts. More importantly, DeepSeek has developed ingenious optimization methods that dramatically reduce the cost of training its models.
The figure cited is under 6 million dollars for the V3 model, but that excludes the costs of research, algorithm development, data acquisition and experimentation with architectures. The real total must therefore be much higher, though probably nowhere near the hundreds of billions that the US giants grandiloquently announce.
In any case, DeepSeek's spectacular rise in recent days, and the fact that its AI models are open source, have had very real financial consequences, particularly for Nvidia, whose market capitalization shed 589 billion dollars (!) on Monday before recovering somewhat the next day. Other AI companies fared little better.
Above all, DeepSeek's arrival casts serious doubt on the very expensive roadmaps of American players, such as OpenAI's megalomaniacal Stargate project.
Good sports, Nvidia, Microsoft, Meta and OpenAI have praised DeepSeek's prowess, but suspicions about the Chinese company's methods were raised just as quickly. OpenAI claims to have evidence that DeepSeek used its proprietary models to train its own competing models. More specifically, the creator of ChatGPT points to the use of a method known as "distillation."
This technique is used by developers to obtain better performance from small models by exploiting the outputs of larger, more powerful models. It allows them to achieve similar results on specific tasks at a much lower cost. Distillation is a common and permitted practice in the industry; the concern here is that DeepSeek may have used it to develop its own competing models, which would breach OpenAI's terms of service.
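To make the idea concrete: in classic knowledge distillation, a small "student" model is trained to match the temperature-softened output distribution of a large "teacher" model, typically by minimizing the KL divergence between the two. The sketch below is a minimal NumPy illustration of that objective; it is not DeepSeek's or OpenAI's actual code, and the function names and temperature value are illustrative choices.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax. A higher temperature flattens the
    distribution, exposing the teacher's relative confidence in
    near-miss answers, which is what the student learns from."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the
    student's: the core training signal in knowledge distillation."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return float(np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student))))

# Example: the loss is zero when the student already matches the teacher,
# and positive when their predictions diverge.
teacher = np.array([2.0, 1.0, 0.1])
matched = distillation_loss(teacher, teacher)
diverged = distillation_loss(np.array([0.0, 0.0, 3.0]), teacher)
```

In practice the student is trained by gradient descent on this loss (often mixed with a standard cross-entropy term on ground-truth labels); the dispute here concerns using another provider's API outputs as the teacher signal.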
“The problem arises when you [exploit this technique outside the platform] to create your own model for personal use,” an OpenAI source told the Financial Times. The company’s terms of service state that users cannot copy an OpenAI service or “use results to develop models that compete with [OpenAI’s].”
OpenAI and Microsoft launched investigations last year into accounts they suspected of belonging to DeepSeek that were using OpenAI’s API. Access was blocked due to suspicions of distillation. The story has taken a political turn: David Sacks, Donald Trump’s “AI czar,” has claimed that there is “substantial evidence” that DeepSeek has distilled OpenAI’s models.
There’s something ironic about DeepSeek “pilfering” OpenAI’s intellectual property. OpenAI’s models are trained on vast amounts of data, some of it content that sits on the “open web” but is not available for commercial use without permission. OpenAI is also said to have drawn on copyrighted content, prompting several complaints from authors and press publishers.
Source: Financial Times