The rumored ‘Strawberry’ model is here, and OpenAI says it can handle more complex queries, for a steep price.
OpenAI has announced o1, the first in a series of AI models trained to reason through complex problems. The company frames the release as a new chapter in its mission to develop human-like artificial intelligence. Alongside o1 comes o1-mini, a smaller, more affordable version designed to offer similar capabilities at a lower cost. This is the model that was codenamed “Strawberry” and has been the subject of much recent speculation in AI circles.
What sets o1 apart is its focus on reasoning, which makes it better suited to intricate tasks like coding and multistep math problems. That capability has a cost: o1 is both more expensive and slower than GPT-4o, OpenAI’s previous large language model. OpenAI is presenting this release as a “preview” to underline how early-stage the technology is.
What’s Different About o1?
Unlike its predecessors, o1 has been trained using a novel approach, which OpenAI’s research lead, Jerry Tworek, refers to as a “completely new optimization algorithm.” Though OpenAI remains tight-lipped about the specifics, Tworek reveals that the training process involved a new dataset, designed specifically for this model. Previous versions of GPT were primarily trained to mimic patterns in large amounts of data, but o1 goes a step further, employing a technique known as reinforcement learning.
In essence, o1 has been trained to learn from its mistakes: reinforcement learning uses rewards and penalties to steer the model toward more accurate outcomes. The model also works through problems with a “chain of thought,” breaking them down step by step, much as a person would. This makes o1 more adept at solving puzzles, writing code, and tackling difficult math problems.
“We’ve seen significant improvements in accuracy,” Tworek explains. “The model hallucinates less than previous versions, but hallucinations aren’t entirely eliminated yet. It’s still an ongoing challenge.”
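The reward-and-penalty idea can be illustrated with a toy example. The sketch below is a generic two-armed bandit with an epsilon-greedy learner, not OpenAI’s training procedure, whose details the company has not disclosed. The function name `train_bandit`, the success probabilities, and every parameter are assumptions made up for illustration.

```python
import random

def train_bandit(success_prob, steps=5000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy learner (illustrative only, not how o1 is trained).

    Each action earns +1 (reward) on success and -1 (penalty) on failure.
    The learner keeps a running average of the outcomes for each action.
    """
    rng = random.Random(seed)
    values = [0.0] * len(success_prob)  # estimated value of each action
    counts = [0] * len(success_prob)    # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(success_prob))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(values)), key=values.__getitem__)
        reward = 1.0 if rng.random() < success_prob[action] else -1.0
        counts[action] += 1
        # Incremental update of the running average for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values

# Hypothetical setup: action 1 succeeds far more often than action 0.
values = train_bandit([0.2, 0.8])
best = max(range(len(values)), key=values.__getitem__)
print(f"estimated values: {values}, preferred action: {best}")
```

Over many trials the penalties drag down the estimate for the weaker action and the rewards raise the estimate for the stronger one, so the learner ends up preferring the action that succeeds more often.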
Who Can Access o1?
Starting today, ChatGPT Plus and Team users will gain access to both o1-preview and o1-mini, while Enterprise and Edu users will have to wait until next week. OpenAI also plans to make o1-mini available to free ChatGPT users, though no specific release date has been announced yet.
Developer access, however, comes at a steep price. In the API, o1-preview costs $15 per million input tokens and $60 per million output tokens, compared with $5 and $15, respectively, for GPT-4o. That is three times GPT-4o’s input price and four times its output price, so developers will have to decide whether o1’s reasoning gains justify the cost.
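To make the price gap concrete, here is a minimal sketch that turns the per-million-token prices quoted above into a per-request estimate. The helper `estimate_cost`, the dictionary keys, and the example workload are assumptions for illustration. The calculation covers only the input and output prices given in this article and ignores any other billing details.

```python
# Prices in US dollars per one million tokens, as quoted in the article.
PRICES_PER_MILLION = {
    "o1-preview": {"input": 15.0, "output": 60.0},
    "gpt-4o": {"input": 5.0, "output": 15.0},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for one request (hypothetical helper)."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
o1_cost = estimate_cost("o1-preview", 10_000, 2_000)
gpt4o_cost = estimate_cost("gpt-4o", 10_000, 2_000)
print(f"o1-preview: ${o1_cost:.2f}, GPT-4o: ${gpt4o_cost:.2f}")
```

For this made-up workload the estimate comes to about 27 cents on o1-preview versus about 8 cents on GPT-4o. The gap widens for output-heavy requests, because output tokens carry the larger price multiple.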
A Smarter, More Accurate AI
The clearest advantage of o1 over GPT-4o is on complex problems, especially in math and programming. According to OpenAI’s chief research officer, Bob McGrew, o1 was tested on a qualifying exam for the International Mathematical Olympiad, where it scored 83%, compared with 13% for GPT-4o.
“The model is better at solving the AP math test than I am,” says McGrew, who notes that he was a math minor in college.
In Codeforces online programming contests, o1 ranked in the 89th percentile of participants. OpenAI claims that future updates to the model could perform at the level of PhD students in challenging subjects like physics, chemistry, and biology.
But o1 is not a perfect replacement for GPT-4o in all areas. It struggles with factual knowledge and lacks capabilities like web browsing and image processing. Nevertheless, OpenAI believes that o1 represents a significant leap forward in artificial intelligence, with its reasoning abilities marking the beginning of a new class of AI.
The model’s name, “o1,” is meant to symbolize a fresh start—“resetting the counter back to 1,” according to McGrew.
“I’ll be honest: We’re not great at naming,” he jokes. “But hopefully this is the first step toward more thoughtful, meaningful names that convey what we’re really doing here.”
A Glimpse Into o1’s Problem-Solving Capabilities
Although I didn’t get the chance to test o1 myself, McGrew and Tworek gave me a live demonstration over video. They tasked the model with solving this brain teaser:
“A princess is as old as the prince will be when the princess is twice as old as the prince was when the princess’s age was half the sum of their present ages. What are the ages of the prince and princess?”
After 30 seconds of processing, the model delivered the correct solution, laying out its reasoning in real time. The interface is designed to show the model’s step-by-step thought process, giving insight into how it arrives at an answer. Phrases like “I’m curious about,” “I’m thinking through,” and “Let me see” create the impression that the model is “thinking,” though Tworek is quick to clarify that this is a design choice.
“This is not human thinking,” Tworek emphasizes. “But the model spends more time processing, and we wanted to design an interface that reflects that deeper, more deliberate approach.”
McGrew adds, “There are moments where it feels surprisingly human, and then there are moments where it feels entirely alien. The model can even signal when it’s running out of time, saying something like, ‘Let me get to an answer quickly,’ or it may weigh options, saying, ‘I could do this or that, what should I do?’”
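The brain teaser from the demonstration can be checked by brute force. The sketch below is my own translation of the riddle into arithmetic, not the model’s output or OpenAI’s code. The function name `satisfies_riddle` and the 1-to-100 search range are assumptions for illustration.

```python
from fractions import Fraction

def satisfies_riddle(princess: int, prince: int) -> bool:
    """Check one pair of present ages against the riddle (hypothetical helper)."""
    # "when the princess's age was half the sum of their present ages"
    half_sum = Fraction(princess + prince, 2)
    years_ago = princess - half_sum
    # "the prince was" at that moment in the past
    prince_then = prince - years_ago
    # "when the princess is twice as old as" that age
    years_ahead = 2 * prince_then - princess
    # "as old as the prince will be" at that future moment
    prince_future = prince + years_ahead
    return (
        years_ago > 0
        and prince_then > 0
        and years_ahead > 0
        and prince_future == princess
    )

# Search whole-number ages from 1 to 100 (range chosen arbitrarily).
solutions = [
    (princess, prince)
    for princess in range(1, 101)
    for prince in range(1, 101)
    if satisfies_riddle(princess, prince)
]
print(solutions[:5])
```

Working through the algebra, the riddle fixes only the ratio of the two ages: the princess’s age is four-thirds of the prince’s. A princess of 40 and a prince of 30 is one pair that fits, and every pair the search finds has the same 4:3 ratio.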
The Future of Reasoning AI and Autonomous Agents
Large language models like o1 still aren’t quite “smart” in the human sense. They’re essentially sophisticated pattern-recognition machines, predicting word sequences based on vast amounts of data. But OpenAI sees reasoning as the next crucial step toward creating AI that can function more autonomously—an agent capable of making decisions and taking actions on behalf of its users.
As OpenAI reportedly seeks to raise additional funding at a $150 billion valuation, the success of models like o1 will be critical to its momentum. Reliable reasoning could lead to breakthroughs in areas ranging from medicine to engineering, with AI systems solving problems that go beyond pattern recognition.
“We’ve been working on reasoning for months,” McGrew says. “We think it’s the key breakthrough needed to push toward human-like intelligence. It’s a new modality that could finally allow models to solve the really hard problems that stand in the way of progress.”
Conclusion: The Dawn of Reasoning AI
While o1 may be slower and more expensive than its predecessors, it represents a bold step forward for OpenAI and for the field of artificial intelligence as a whole. Reasoning capabilities could pave the way for more intelligent, autonomous systems that solve complex problems once thought beyond the reach of machines. The future of AI might not be human, but with models like o1, it’s getting closer to thinking like a human.