Google Announces Public Release of Gemini 3

Gemini 3 Pro adds agentic workflows, vibe coding, and a 1M-token context window for richer multimodal reasoning. New APIs, tool integrations, and Antigravity accelerate developer automation and app generation.


TL;DR

  • 1 million-token context window: enables much longer multimodal and conversational contexts
  • Agentic workflows and tooling: Google Antigravity public preview (macOS/Windows/Linux), client-side bash + hosted server-side bash tools for code generation (early access; GA planned), integrations with Gemini CLI, Android Studio, Cursor, GitHub, JetBrains, Manus, Cline
  • New API controls: thinking level, granular media resolution, and stricter thought-signature validation to preserve reasoning across multi-turn interactions; supports combining hosted tools like Grounding with Google Search and URL context
  • Vibe coding via Google AI Studio Build mode: single-prompt app generation with automatic planning, wiring of visuals/interactivity, and annotation features for iterative refinement
  • Multimodal and spatial advances: improved image/video reasoning (MMMU-Pro, Video MMMU), document understanding beyond OCR, spatial reasoning for robotics/XR/screen tasks, and high-frame-rate video analysis with long-context recall
  • Pricing for preview tier: $2 per million input tokens and $12 per million output tokens for prompts ≤200k tokens; available via the Gemini API in Google AI Studio and Vertex AI; free, rate-limited access in Google AI Studio

Google has introduced Gemini 3, a new foundation model positioned for developer workflows that require deep reasoning, multimodal understanding, and agentic tool use. The release emphasizes tighter integration with IDEs and agent platforms, expanded multimodal capabilities, and API features intended to support complex, multi-step automation.

What’s new in this release

Gemini 3 Pro is presented as an upgrade over prior Gemini iterations with improved benchmark performance and coding ability. Notable technical highlights from the announcement include:

  • Terminal-Bench 2.0 score: 54.2%, reflecting the model's ability to use tools to operate a computer through the terminal.
  • WebDev Arena Elo: 1487, representing single-prompt web development performance.
  • 1 million-token context window, enabling much longer multimodal and conversational contexts.
  • New API parameters for finer control: a thinking level, more granular media resolution settings, and stricter validation for thought signatures to preserve model reasoning across multi-turn interactions.

Pricing for the preview tier is $2 per million input tokens and $12 per million output tokens for prompts of 200k tokens or fewer; the model is available via the Gemini API in Google AI Studio and Vertex AI (full pricing details are published alongside the API documentation). Gemini 3 Pro is also accessible at no charge, with rate limits, in Google AI Studio.
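The stated preview pricing is easy to turn into a quick cost estimate. A minimal sketch, using only the numbers quoted above:

```python
# Cost estimate at the stated Gemini 3 Pro preview pricing:
# $2 per 1M input tokens, $12 per 1M output tokens,
# applicable to prompts of <= 200k tokens.

INPUT_USD_PER_M = 2.0
OUTPUT_USD_PER_M = 12.0
PROMPT_TIER_LIMIT = 200_000  # this price tier covers prompts up to 200k tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request under the <=200k-token tier."""
    if input_tokens > PROMPT_TIER_LIMIT:
        raise ValueError("prompt exceeds the <=200k-token pricing tier")
    return (input_tokens / 1_000_000) * INPUT_USD_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_USD_PER_M

# Example: 100k input tokens, 5k output tokens.
print(f"${estimate_cost(100_000, 5_000):.2f}")  # $0.26
```

Pricing for prompts above 200k tokens is not covered by the figures quoted here.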

Agentic coding and development tooling

The release focuses heavily on agentic workflows. Gemini 3 Pro is intended to act as a more autonomous partner across editor, terminal, and browser contexts. To showcase those capabilities, Google is promoting two entry points:

  • Google Antigravity, an agentic development platform in public preview for macOS, Windows, and Linux that manages agents across workspaces while preserving an IDE-style experience.
  • A client-side bash tool (for proposing shell commands) plus a hosted server-side bash tool for multi-language code generation and secure prototyping, available now in the Gemini API for early access partners, with general availability planned soon.

Integration points already called out include Gemini CLI, Android Studio, and a range of third-party coding tools such as Cursor, GitHub, JetBrains, Manus, and Cline.
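The client-side bash tool described above follows a propose-then-approve pattern: the model suggests a shell command, and the client decides whether to execute it. The sketch below illustrates that control flow with a stubbed `propose_command` standing in for a real Gemini API call; it is an assumption-laden illustration of the pattern, not the tool's actual protocol.

```python
# Minimal sketch of the client-side bash pattern: the model *proposes*
# a command, the client executes it only after approval.
# propose_command() is a stub for a real model call.
import subprocess

def propose_command(task: str) -> str:
    """Stub for the model: map a task to a shell command."""
    return {"list files": "ls", "show date": "date"}.get(task, "echo unsupported")

def run_if_approved(task: str, approve) -> str:
    cmd = propose_command(task)
    if not approve(cmd):  # the client stays in control of execution
        return "declined"
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout.strip()

# Demo: auto-approve only known read-only commands.
print(run_if_approved("list files", approve=lambda c: c in {"ls", "date"}))
```

Keeping approval on the client side is what distinguishes this from the hosted server-side bash tool, where execution happens in a sandboxed environment managed by the API.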

Vibe coding and Google AI Studio

The term “vibe coding” is used to describe translating high-level, natural-language prompts into fully interactive apps. Gemini 3 Pro aims to improve single-prompt app generation, handling multi-step planning and wiring up visuals and interactivity automatically. Google AI Studio emphasizes a Build mode that automatically configures models and APIs, along with annotation features for iterative refinement. A number of demo apps and examples are linked from the Studio pages.

Multimodal, visual, spatial, and video reasoning

Gemini 3 is positioned for complex multimodal tasks, with reported new highs on MMMU-Pro for image reasoning and Video MMMU for video understanding. Practical developer-facing capabilities include:

  • Document understanding beyond OCR for structured reasoning over complex documents.
  • Improved spatial reasoning for tasks like trajectory prediction and screen understanding, useful in robotics, XR, and computer-use agents.
  • High-frame-rate video understanding combined with long-context recall for analyzing extended footage.

The API also supports combining hosted tools such as Grounding with Google Search and URL context with structured outputs, intended to help agents fetch, extract, and format data for downstream tasks.
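Combining hosted tools with structured outputs, as described above, amounts to declaring both in the request configuration. The sketch below builds such a configuration locally; the tool keys (`google_search`, `url_context`) and schema fields are assumptions drawn from the announcement, so verify names against the API reference before use.

```python
# Illustrative: a config combining hosted tools with a structured-output
# schema. Key names are assumptions, not verified API identifiers.

def build_config() -> dict:
    return {
        "tools": [
            {"google_search": {}},  # Grounding with Google Search (assumed key)
            {"url_context": {}},    # URL context tool (assumed key)
        ],
        "generationConfig": {
            "responseMimeType": "application/json",
            "responseSchema": {  # constrain the agent's final answer shape
                "type": "object",
                "properties": {
                    "source_url": {"type": "string"},
                    "summary": {"type": "string"},
                },
                "required": ["source_url", "summary"],
            },
        },
    }

cfg = build_config()
print(len(cfg["tools"]))  # 2
```

Constraining the final answer to a JSON schema is what lets downstream steps consume tool-grounded results without fragile text parsing.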

Where to try it

Further technical details, API parameters, and developer guidance are available in the Gemini 3 developer documentation and the associated prompting and developer guides: see the Developer Guide.

Original source: https://blog.google/technology/developers/gemini-3-developers/
