The floodgates have opened for building AI reasoning models on the cheap.
Researchers at Stanford and the University of Washington have developed a model that performs comparably to OpenAI's o1 and DeepSeek's R1 models in math and coding, for less than $50 of cloud compute credits.
What's more, the model was trained on only 1,000 questions, and training took just 26 minutes on 16 Nvidia H100 GPUs. Stanford researcher Niklas Muennighoff said in an email to Mashable that the cost is an estimate based on the GPU runtime and the number of H100 GPUs used.
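The arithmetic behind an estimate like this is simple to sketch. The snippet below is a rough illustration, not the researchers' own calculation: the GPU count and runtime come from the article, while the hourly rate is a hypothetical placeholder, since cloud prices for H100s vary by provider.

```python
def gpu_hours(num_gpus: int, minutes: float) -> float:
    """Total GPU-hours consumed by a job running on num_gpus for the given minutes."""
    return num_gpus * minutes / 60.0


def estimated_cost(num_gpus: int, minutes: float, usd_per_gpu_hour: float) -> float:
    """Cost of the job at an assumed per-GPU hourly rate."""
    return gpu_hours(num_gpus, minutes) * usd_per_gpu_hour


def max_rate_within_budget(num_gpus: int, minutes: float, budget_usd: float) -> float:
    """Highest per-GPU hourly rate at which the job still fits the budget."""
    return budget_usd / gpu_hours(num_gpus, minutes)


# Figures reported in the article: 16 H100 GPUs for 26 minutes.
hours = gpu_hours(16, 26)  # about 6.93 GPU-hours

# ASSUMPTION: $3.00 per GPU-hour is a made-up rate for illustration only.
cost_at_assumed_rate = estimated_cost(16, 26, usd_per_gpu_hour=3.0)

# Rate below which the run stays under the article's $50 figure.
ceiling = max_rate_within_budget(16, 26, budget_usd=50.0)

print(f"{hours:.2f} GPU-hours, ${cost_at_assumed_rate:.2f} at the assumed rate, "
      f"under $50 if the rate is below ${ceiling:.2f}/GPU-hour")
```

In other words, the run used roughly seven GPU-hours, so it stays under $50 as long as each H100 costs less than about $7.20 per hour.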
The AI industry of late is all about how new approaches to the pre- and post-training processes can massively cut computing costs, as evidenced by DeepSeek's disruptive impact. On top of that, developers can now build on existing AI models at little or no cost, through APIs, open-source access, and even by distilling the outputs of closed-source models, bringing costs down even more.
According to the team's research paper, which was published last Friday, s1 was trained on a dataset consisting of "1,000 carefully curated questions paired with reasoning traces and answers distilled from Gemini Thinking Experimental." Google's Gemini Thinking Experimental model is accessible with daily limits through AI Studio. While it's a closed-source model, that clearly hasn't stopped researchers from making use of its responses.
Next, the researchers took an "off the shelf" pretrained model from Qwen, an Alibaba-owned lab, and performed supervised fine-tuning on their curated dataset. Then, the team created a token budget to control the amount of compute spent when testing the model. If s1 went over budget on thinking tokens, it was cut off and forced to generate whatever answer it had come up with. If the researchers wanted the model to spend more "test-time compute" on a problem, they would simply tell the model to "wait," which extended its thinking time and led to more accurate results.
By controlling the amount of time and compute spent on a problem, the researchers were able to show how increased thinking time leads to improved performance.
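The control loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the s1 team's code: `next_token`, `END_OF_THINKING`, and `toy_model` are hypothetical names, and a real system would call a language model where the stand-in function is used here.

```python
from typing import Callable, List

END_OF_THINKING = "<end_think>"  # hypothetical marker for "the model wants to stop thinking"


def think_with_budget(
    next_token: Callable[[List[str]], str],
    max_tokens: int,
    min_tokens: int = 0,
    nudge: str = "Wait",
) -> List[str]:
    """Collect thinking tokens, cutting off at max_tokens and nudging below min_tokens."""
    trace: List[str] = []
    while len(trace) < max_tokens:  # hard cap: over budget means cut off
        token = next_token(trace)
        if token == END_OF_THINKING:
            if len(trace) >= min_tokens:
                break  # enough thinking done, allow the stop
            trace.append(nudge)  # stopped too early: suppress the stop and say "Wait"
            continue
        trace.append(token)
    return trace


def toy_model(trace: List[str]) -> str:
    """Hypothetical stand-in for a model: tries to stop after 3 tokens since the last nudge."""
    since_nudge = 0
    for tok in reversed(trace):
        if tok == "Wait":
            break
        since_nudge += 1
    return END_OF_THINKING if since_nudge >= 3 else f"step{len(trace)}"


short = think_with_budget(toy_model, max_tokens=2)                    # cut off by the cap
natural = think_with_budget(toy_model, max_tokens=10)                 # stops on its own
extended = think_with_budget(toy_model, max_tokens=10, min_tokens=5)  # nudged once
print(short, natural, extended, sep="\n")
```

Raising `min_tokens` is the knob that corresponds to spending more test-time compute: the loop refuses the model's first attempt to stop and appends the nudge instead, so the trace keeps growing until either the minimum is met or the cap is hit.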
s1 is one example of the open-source reasoning models that have been developed for a fraction of the cost of flagship models from Google and OpenAI. In January, UC Berkeley researchers released an open-source reasoning model called Sky-T1 that cost $450, "demonstrating that it is possible to replicate high-level reasoning capabilities affordably and efficiently," per the team's blog post. There's also the open-source rStar-Math reasoning model from Microsoft Research Asia, Tulu 3 from nonprofit research institute Ai2, and Hugging Face's own initiative to replicate DeepSeek's R1.
As high-quality models become cheaper and more accessible, we're starting to see a power shift from the few AI heavy hitters to the many.