Thinking with images.
In a major leap forward for artificial intelligence, OpenAI has officially launched two cutting-edge AI reasoning models—o3 and o4-mini. The company describes o3 as its “most powerful reasoning model” to date, while o4-mini stands out as a compact, high-performance model optimized for speed and cost-efficiency.
These groundbreaking models mark a new era in multimodal intelligence. For the first time, OpenAI’s reasoning engines are capable of using images as part of their thought process—a feature that fundamentally transforms how AI understands and interacts with the world around it.
Smarter Reasoning with Visual Context
Earlier multimodal models could see images, but their reasoning was still driven almost entirely by text. With o3 and o4-mini, that changes dramatically: these models can "think" with images, integrating visual information directly into their chain of thought. Whether it's a whiteboard sketch, a flowchart, or a product photo, they can analyze, interpret, and reason about visual data alongside text inputs.
Even more impressively, the models can interact with images during the reasoning process—zooming in on specific areas, rotating images for better angles, and dissecting complex visuals for deeper understanding. This opens up a wealth of possibilities in fields such as education, engineering, design, and research, where visual context is just as critical as written information.
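The announcement centers on ChatGPT, but as a rough sketch of what image-integrated reasoning looks like programmatically, here is how a multimodal request to o4-mini might be sent with OpenAI's official Python SDK. The Chat Completions call shape is standard; the whiteboard URL is a placeholder, and exact model availability over the API may differ from the ChatGPT rollout described here.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Ask the model to reason over a whiteboard photo together with a question.
# The image URL below is a placeholder, not a real asset.
response = client.chat.completions.create(
    model="o4-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Walk through the algorithm sketched on this "
                            "whiteboard and flag any logical errors.",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/whiteboard.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```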
Fully Equipped with ChatGPT’s Toolset
OpenAI is also rolling out full integration of its reasoning models with every tool in the ChatGPT ecosystem. This means that o3, o4-mini, and o4-mini-high users can now leverage:
- Web browsing for real-time information retrieval
- Python for coding, data analysis, and calculations
- Image analysis to interpret photos, diagrams, and charts
- File interpretation for documents, spreadsheets, and PDFs
- Image generation to create visuals from prompts
These tools are no longer siloed; they work together seamlessly under the new models' reasoning, enabling o3 and o4-mini to act more agentically, independently choosing the right tools at the right time.
This kind of dynamic tool usage enhances the models’ ability to solve complex problems end-to-end, whether it’s writing reports with live data, generating illustrations for concepts, or debugging code.
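ChatGPT's built-in tools aren't something outside developers invoke directly, but the API-side analogue is function calling, where the model itself decides when a tool is warranted. The sketch below, built around a hypothetical run_python tool, shows that decision loop in miniature; it assumes o4-mini is reachable through the standard Chat Completions endpoint.

```python
import json
from openai import OpenAI

client = OpenAI()

# One illustrative tool definition; "run_python" is a hypothetical helper,
# not a built-in. In practice you would register several tools and let the
# model decide which, if any, to call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "run_python",
            "description": "Execute a short Python snippet and return its stdout.",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {"type": "string", "description": "Python source to run."}
                },
                "required": ["code"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="o4-mini",
    messages=[{"role": "user", "content": "What is the 20th Fibonacci number?"}],
    tools=tools,
)

# When the model judges a tool to be the right move, it returns a tool call
# instead of a plain answer; the caller runs it and feeds the result back.
message = response.choices[0].message
for call in message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(f"Model requested {call.function.name} with code:\n{args['code']}")
```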
Availability and Rollout Plans
The new models are being made available immediately to all ChatGPT Plus, Pro, and Team subscribers. These tiers now include access to:
- o3
- o4-mini
- o4-mini-high
Meanwhile, OpenAI has announced that o3-pro will become available in the coming weeks, promising even more advanced capabilities tailored for professional use.
As part of this upgrade, older models such as o1, o3-mini, and o3-mini-high will be phased out, so that the lineup users see always reflects the most capable and efficient models available.
A Future of Visual Intelligence
This launch comes just days after OpenAI introduced GPT-4.1, the successor to its widely used GPT-4o model. With o3 and o4-mini, OpenAI is reinforcing its position at the forefront of general-purpose AI by bridging the gap between text and visual intelligence.
From reading documents and analyzing visuals to generating new ideas and solving real-world problems, these models represent the next chapter in AI’s evolution—more powerful, more intuitive, and more human-like in their thinking.
As OpenAI puts it:
“Our smartest and most capable models to date.”
And with image-integrated reasoning, the future of AI just got a whole lot clearer—and more visually intelligent.
Want to see it in action? If you’re a ChatGPT Plus, Pro, or Team user, the future of reasoning is already at your fingertips.