Claude Sonnet 4 adds 1M‑token context window in public beta

Anthropic is introducing a 1 million-token context window for Claude Sonnet 4, in public beta on its API and Amazon Bedrock, with revised pricing for prompts beyond 200K tokens.

Overview

Anthropic has announced long‑context support for Claude Sonnet 4, raising the context window to 1 million tokens. The company describes this as a 5x increase, stating that single requests can now accommodate entire codebases with over 75,000 lines of code or process dozens of research papers while maintaining continuity across the input. The update was published on August 12, 2025.

The release focuses on enabling broader, data‑heavy workflows without fragmenting inputs. Anthropic points to three main categories: large‑scale code analysis across source files, tests, and documentation; document synthesis across extensive sets of contracts, research, or technical specs; and context‑aware agents that carry state across many tool calls and steps using complete API references, tool definitions, and interaction histories.

Availability and access

Long context for Sonnet 4 is in public beta on the Anthropic API and in Amazon Bedrock. Anthropic states that support is “coming soon” to Google Cloud’s Vertex AI. Access on the Anthropic API is currently limited to customers with Tier 4 or custom rate limits, with broader availability described as rolling out over the coming weeks. No timeline beyond that has been provided.
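
For illustration, a single long‑context request might look like the sketch below. It uses the Anthropic Python SDK, with the claude-sonnet-4-20250514 model ID and the context-1m-2025-08-07 beta header as documented at the time of writing; verify both against the current docs, and note the input file is a placeholder.

    # Minimal long-context request via the Anthropic Python SDK (pip install anthropic).
    # Model ID and beta header follow Anthropic's docs at the time of writing;
    # the input file is a hypothetical placeholder.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    with open("codebase_dump.txt") as f:  # e.g., a concatenated source tree
        big_input = f.read()

    response = client.beta.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2048,
        betas=["context-1m-2025-08-07"],  # opts in to the 1M-token window
        messages=[{
            "role": "user",
            "content": f"Summarize the architecture of this codebase:\n\n{big_input}",
        }],
    )
    print(response.content[0].text)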

Pricing and cost controls

Anthropic has updated pricing for Sonnet 4 to account for larger prompts, with different rates above and below 200K tokens (pricing is per MTok):

  • Prompts ≤ 200K: $3 input, $15 output
  • Prompts > 200K: $6 input, $22.50 output

The company notes that prompt caching can reduce latency and costs for long‑context use cases. In addition, the 1M context window can be used with batch processing for an additional 50% cost savings, according to the documentation. Further details are available on the pricing page at https://www.anthropic.com/pricing#api and in Anthropic’s documentation.
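
As a back‑of‑the‑envelope illustration of the tiered rates, the sketch below assumes the entire request is billed at the rate of the tier its prompt size falls into (whether the higher rate applies to all tokens or only those beyond 200K is not spelled out above) and treats the batch discount as a flat 50% multiplier; confirm both against the pricing page.

    # Rough cost estimator for the tiered Sonnet 4 pricing listed above.
    # Assumptions (not confirmed here): a >200K-token prompt is billed entirely
    # at the higher rate, and batch processing halves the total.

    def estimate_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
        """Estimated USD cost of one request under the stated rates."""
        if input_tokens <= 200_000:
            in_rate, out_rate = 3.00, 15.00   # $/MTok, prompts <= 200K
        else:
            in_rate, out_rate = 6.00, 22.50   # $/MTok, prompts > 200K
        cost = input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate
        return cost * 0.5 if batch else cost

    print(f"${estimate_cost(150_000, 8_000):.2f}")              # $0.57, lower tier
    print(f"${estimate_cost(500_000, 8_000):.2f}")              # $3.18, higher tier
    print(f"${estimate_cost(500_000, 8_000, batch=True):.2f}")  # $1.59, batched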

Stated use cases

Anthropic highlights several workloads that may benefit from the expanded window:

  • Large‑scale code analysis where the model can consider project architecture and cross‑file relationships while suggesting system‑wide changes.
  • Document synthesis across large corpora (e.g., legal or technical materials), with the model maintaining context over relationships spanning hundreds of documents.
  • Context‑aware agents that keep state across extended, tool‑driven workflows without truncating prior steps or references.

These are presented as supported workflows enabled by the larger context capacity; performance characteristics beyond the longer window are not detailed in the announcement.
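
As a concrete illustration of the document‑synthesis case, one pattern is to pack an entire corpus into a single request rather than chunking and summarizing in stages. The sketch below reuses the model ID and beta header from the earlier example; the directory layout and document‑tagging scheme are hypothetical.

    # Hypothetical document-synthesis request: send a whole corpus in one prompt.
    # Directory name and document tags are illustrative, not an Anthropic convention.
    from pathlib import Path
    import anthropic

    client = anthropic.Anthropic()

    corpus = [
        f"<document name='{path.name}'>\n{path.read_text()}\n</document>"
        for path in sorted(Path("contracts").glob("*.txt"))
    ]

    response = client.beta.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        betas=["context-1m-2025-08-07"],
        messages=[{
            "role": "user",
            "content": "\n\n".join(corpus)
                       + "\n\nIdentify obligations that conflict across these contracts.",
        }],
    )
    print(response.content[0].text)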

Customer references

Anthropic includes two customer spotlights. Bolt.new describes using Claude within its browser‑based development platform; CEO Eric Simons is quoted as saying that Sonnet 4 remains their preferred model for code generation and that the 1M context window lets developers work on larger projects while maintaining accuracy. London‑based iGent AI cites Maestro, its system that turns conversations into code; CEO Sean Ward is quoted as saying the 1M context window “has supercharged autonomous capabilities” and enabled multi‑day sessions on real‑world codebases. These are customer statements from the announcement and are not independently verified here.

Getting started

Long‑context Sonnet 4 is live in public beta on the Anthropic API for organizations at Tier 4 or with custom rate limits, and available through Amazon Bedrock. Anthropic says Vertex AI availability is planned. Documentation is available in Anthropic’s docs, and pricing details on the pricing page (https://www.anthropic.com/pricing#api).
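
For workloads that repeatedly query the same large prefix, the prompt caching mentioned above marks a stable block as reusable across requests. A minimal sketch, assuming the cache_control field and “ephemeral” cache type from Anthropic’s prompt‑caching docs; the reference file and question are placeholders.

    # Sketch of prompt caching over a long, stable prefix. The cache_control
    # field and "ephemeral" type follow Anthropic's prompt-caching docs; the
    # reference file and user question are placeholders.
    import anthropic

    client = anthropic.Anthropic()

    with open("api_reference.txt") as f:  # hypothetical large, stable prefix
        reference = f.read()

    response = client.beta.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        betas=["context-1m-2025-08-07"],
        system=[{
            "type": "text",
            "text": reference,
            "cache_control": {"type": "ephemeral"},  # cache everything up to here
        }],
        messages=[{"role": "user", "content": "Which endpoints support pagination?"}],
    )
    print(response.content[0].text)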

TL;DR

  • Claude Sonnet 4 now supports a 1M‑token context (public beta).
  • Available on the Anthropic API (Tier 4/custom rate limits) and Amazon Bedrock; Vertex AI is listed as “coming soon.”
  • Anthropic cites use cases in large‑scale code analysis, document synthesis, and context‑aware agents.
  • Pricing: ≤200K tokens at $3 input / $15 output per MTok; >200K tokens at $6 input / $22.50 output.
  • Prompt caching and batch processing can reduce costs, with batch processing documented as offering an additional 50% savings.
