Google has unveiled its first reasoning model, Gemini 2.0 Flash Thinking, to compete with OpenAI’s o1 series of AI models. The experimental model is trained to generate a “thinking process” before presenting a solution, and according to Google, this new “Thinking Mode” offers stronger reasoning capabilities than the base Gemini 2.0 Flash model.
Gemini 2.0 Flash Thinking: Availability
The Thinking Mode is currently available as an experimental model through Google AI Studio and Vertex AI. Developers can also access it directly via the Gemini API.
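For developers who want to try the model through the Gemini API, a minimal sketch might look like the following. It uses Google’s google-generativeai Python SDK, and the model identifier gemini-2.0-flash-thinking-exp is assumed to be the launch-time experimental name, which may change as the model evolves.

```python
# A minimal sketch of querying the experimental Thinking Mode model via
# the Gemini API. Assumes the google-generativeai SDK is installed
# (pip install google-generativeai) and that GEMINI_API_KEY is set in the
# environment. The model name below is the assumed experimental
# identifier and may differ in later releases.
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")
response = model.generate_content(
    "A ball is thrown upward at 12 m/s. How long until it returns "
    "to the thrower's hand? Show your reasoning."
)

# The experimental model produces its intermediate "thinking" before the
# final answer; both arrive as part of the generated text.
print(response.text)
```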
Gemini 2.0 Flash Thinking: Details
In a post on X (formerly Twitter), Jeff Dean, chief scientist at Google DeepMind, highlighted the strengths of the Gemini 2.0 Flash Thinking model. Built on the performance foundation of Gemini 2.0 Flash, the new mode is designed to “explicitly show its thoughts” to enhance reasoning.
“Want to see Gemini 2.0 Flash Thinking in action? Check out this demo where the model solves a physics problem and explains its reasoning.”
— Jeff Dean (@JeffDean) December 19, 2024
A demo video shared by Dean shows the model solving a complex physics problem. Alongside the final solution, a new interface displays the reasoning steps the model took, revealing how it breaks a problem into smaller components before arriving at the answer. Logan Kilpatrick, product lead for Google AI Studio, shared another demo video showcasing Thinking Mode handling a math problem that combines text and image inputs.
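In the spirit of Kilpatrick’s demo, a multimodal request mixing an image with a text prompt could be sketched as below. The file name is illustrative, and the model identifier is the same assumed experimental name as in the earlier snippet.

```python
# A sketch of a multimodal prompt (text plus image), similar to the demo
# where Thinking Mode works through a math problem supplied as a picture.
# The file name "math_problem.png" is a hypothetical placeholder, and the
# model identifier is the assumed launch-time experimental name.
import os

import PIL.Image
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")

problem_image = PIL.Image.open("math_problem.png")  # hypothetical screenshot
response = model.generate_content(
    [problem_image, "Solve the problem shown in this image step by step."]
)
print(response.text)
```

The SDK accepts a list of parts, so the image and the instruction travel in a single request and the model reasons over both together.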
Gemini 2.0 models: Details
Earlier this month, Google launched the Gemini 2.0 series, introducing advancements in multimodality with native image and audio output, alongside new tools. Gemini 2.0 Flash is the first publicly available model in the series, enabling multimodal reasoning, long-context understanding, and agentic experiences. Google also showcased new prototypes for AI agents, including:
- Project Astra: A prototype exploring universal AI assistant capabilities. Previewed at Google I/O 2024, it includes features like “remembering” visual and auditory inputs from a smartphone’s camera and microphone.
- Project Mariner: This prototype analyses and reasons across browser information, including text, code, images, and forms. With an experimental Chrome extension, it completes tasks using this information.
- Jules: An AI-powered coding agent capable of addressing programming challenges, creating plans, and executing them under developer supervision.
- Gaming Agents: Designed to help players navigate virtual environments, these agents reason about gameplay based on on-screen actions, offering real-time suggestions and acting as virtual companions.