Stanford Researchers Use Qwen AI Model to Train Model Comparable to DeepSeek R1 for Just $50, Says Alibaba Cloud | TMTPOST

Feb. 7, 2025

Stanford Researchers Use Qwen AI Model to Train Model Comparable to DeepSeek R1 for Just $50, Says Alibaba Cloud

A team of researchers from Stanford University and the University of Washington, led by Fei-Fei Li, has reportedly developed an AI reasoning model named "s1" for less than $50 in cloud computing costs. The model reportedly performs on par with OpenAI’s o1 and DeepSeek’s R1 on math and coding tasks, and it has drawn significant industry attention.

Controversy soon followed, however, when it emerged that s1 was not trained from scratch but built on Alibaba Cloud’s Qwen model. When approached for confirmation, Alibaba Cloud verified the claim, saying: “They used our Qwen2.5-32B-Instruct open-source model as a foundation and conducted 26 minutes of supervised fine-tuning on 16 H100 GPUs to create the new model, s1-32B. The results showed performance comparable to OpenAI’s o1 and DeepSeek’s R1 in mathematical reasoning and coding tasks. In fact, s1-32B outperformed o1-preview by 27% on competition-level math problems.”

The development has sparked discussion about the potential for cost-effective AI advances and the role of open-source models in accelerating research.
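The training figures quoted above can be sanity-checked with simple arithmetic. The Python sketch below converts 16 GPUs running for 26 minutes into GPU-hours, then works out the hourly price per GPU at which that run would cost exactly $50. The function names are illustrative, and treating $50 as a ceiling on the fine-tuning run alone is an assumption; the article does not state the actual rental rate.

```python
def gpu_hours(num_gpus: int, minutes: float) -> float:
    """Total GPU-hours consumed by num_gpus running for the given minutes."""
    return num_gpus * minutes / 60.0


def implied_max_hourly_rate(budget_usd: float, num_gpus: int, minutes: float) -> float:
    """Highest per-GPU hourly price at which the run still fits the budget."""
    return budget_usd / gpu_hours(num_gpus, minutes)


# Figures from the Alibaba Cloud statement: 16 H100 GPUs for 26 minutes.
total = gpu_hours(16, 26)

# Assumption: the reported "less than $50" is treated as a $50 ceiling.
rate = implied_max_hourly_rate(50.0, 16, 26)

print(f"{total:.2f} GPU-hours")
print(f"${rate:.2f} per GPU-hour at most")
```

Under these assumptions the run amounts to roughly 6.9 GPU-hours, so the sub-$50 figure holds for any H100 rental price up to about $7.21 per GPU-hour. It covers only the fine-tuning step, not the cost of pretraining the underlying Qwen2.5-32B-Instruct model.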
