China’s DeepSeek Challenges US AI Costs with Low-Cost Training Model

TEHRAN (Tasnim) – Chinese artificial intelligence firm DeepSeek announced that training its reasoning-focused R1 model cost only $294,000, a fraction of the figures cited by US competitors, underscoring Beijing’s drive to challenge US dominance in AI.

The disclosure appeared in a peer-reviewed article published Wednesday in Nature, marking the first time the Hangzhou-based company revealed details of its training costs.

DeepSeek’s release of lower-cost AI systems earlier this year unsettled global tech markets, with investors fearing the models could erode the position of US giants such as Nvidia.

The Nature article, co-authored by founder Liang Wenfeng, said R1 was trained on 512 Nvidia H800 chips over 80 hours. An earlier version of the paper, released in January, omitted the cost details.

Training large language models typically requires weeks of computation on powerful processors, often costing tens or even hundreds of millions of dollars. OpenAI chief executive Sam Altman said in 2023 that training foundational models had cost “much more” than $100 million, without providing specifics.

Washington has questioned DeepSeek’s claims. US officials told Reuters in June that the company held “large volumes” of Nvidia’s high-end H100 chips despite American export bans. Nvidia said DeepSeek had lawfully used H800 chips, while DeepSeek acknowledged for the first time that it also possessed A100 chips, which it employed in preliminary stages of development.

DeepSeek’s access to advanced processors has helped it attract leading Chinese researchers, Reuters has previously reported.

The company also addressed allegations that it had copied OpenAI’s models. US officials and industry figures suggested in January that DeepSeek had “distilled” OpenAI’s technology into its own models.

DeepSeek defended the practice, saying distillation improves model performance and reduces training costs, making AI more accessible. The method allows one model to learn from another’s outputs, so a newcomer can leverage the investment already made in an existing system while cutting its own expenses.
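For readers unfamiliar with the technique, the sketch below illustrates the basic idea of knowledge distillation: a small “student” network is trained to match the softened output distribution of a larger, frozen “teacher” network. This is a minimal illustrative example only, not DeepSeek’s actual training code; the stand-in models, temperature value, and dummy data are all assumptions made for the sake of a runnable demonstration.

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

VOCAB, HIDDEN_T, HIDDEN_S, T = 100, 256, 64, 2.0  # T is the softening temperature

# Stand-in models: in practice the teacher would be a large pretrained LLM.
teacher = nn.Sequential(nn.Linear(VOCAB, HIDDEN_T), nn.ReLU(), nn.Linear(HIDDEN_T, VOCAB))
student = nn.Sequential(nn.Linear(VOCAB, HIDDEN_S), nn.ReLU(), nn.Linear(HIDDEN_S, VOCAB))
teacher.eval()  # teacher weights stay frozen; only its outputs are used

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(32, VOCAB)  # dummy inputs; real distillation uses text data
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)
    # KL divergence between temperature-softened distributions: the student
    # is pushed toward the teacher's full output distribution, not hard labels.
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Because the student only needs the teacher’s outputs rather than its weights or training data, distillation can transfer capability from a costly model to a much cheaper one, which is why the practice is both economically attractive and contentious.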

The firm acknowledged using Meta’s open-source Llama for some versions of its models. It also noted that training data for its V3 model included web content containing OpenAI-generated answers, but said this was incidental rather than deliberate.

OpenAI did not respond to Reuters’ request for comment.
