Cline: Sonoma Sky and Dusk Alpha: 2M Context Windows, Fast but Less Reliable

Sonoma Sky and Dusk Alpha bring 2M-token context windows and fast inference but lag on accuracy in real coding edits (Sky 84%, Dusk 87%). Use them for experimentation; stick with proven models for production.

September 9, 2025

Cline

TL;DR

Sonoma Sky Alpha & Sonoma Dusk Alpha: new models with 2M token context windows (appeared Sep 6, 2025)
Sky oriented toward reasoning; Dusk prioritized faster inference
Cline testbed: thousands of diff edit ops tracked from Aug 26–Sep 9, 2025
Measured success rates on real coding edits: Claude 4 Sonnet 96% · GPT-5 92% · Gemini 2.5 Pro 90% · Dusk 87% · Sky 84%
The large context window helps with large codebases and multi-page inputs, but it did not translate to top-tier reliability in these workflows
Both models delivered notable inference speed, consistent with Dusk’s design goal
Community reports documented hallucinations and tool-calling failures that lowered success rates
Practical takeaway: appropriate for exploratory or non-production workflows; retain established models for production/high-stakes tasks
Free alpha access available via Vercel AI Gateway and OpenRouter
Original source: https://cline.bot/blog/sonoma-alpha-sky-dusk-models-cline

Sonoma Alpha Sky & Dusk: 2M Context Windows, Real Coding Tasks, and Early Limits

Two new models with 2M-token context windows, Sonoma Sky Alpha and Sonoma Dusk Alpha, appeared on major gateways in early September 2025. Both arrived with free alpha access and fast inference, so Cline tracked them across thousands of real coding edits to evaluate practical performance beyond headline specs.

The models and the test bed

Sky is positioned as the more capable reasoning model, while Dusk focuses on faster inference. Cline tracked thousands of diff edit operations from August 26 to September 9, 2025. Because the Sonoma models first appeared on September 6, their results come from only the last few days of that window.

Performance was measured as the success rate on those real-world edits:

Claude 4 Sonnet — 96%
GPT-5 — 92%
Gemini 2.5 Pro — 90%
Dusk — 87%
Sky — 84%

These figures place both Sonoma Alphas behind the established models on accuracy, despite their 2M-token context window and fast inference: Dusk trails Gemini 2.5 Pro by 3 points, and Sky trails Claude 4 Sonnet by 12.
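To make the gaps concrete, the reported success rates can be turned into expected failure counts. The following is a minimal sketch, not Cline's own tooling: the helper names are hypothetical, and the multi-edit estimate assumes each edit succeeds or fails independently, which real sessions may not satisfy.

```python
# Success rates as reported in the article (share of diff edits counted as successful).
SUCCESS_RATES = {
    "Claude 4 Sonnet": 0.96,
    "GPT-5": 0.92,
    "Gemini 2.5 Pro": 0.90,
    "Dusk": 0.87,
    "Sky": 0.84,
}


def failures_per_thousand(rate: float) -> int:
    """Expected number of failed edits per 1,000 attempts."""
    return round((1 - rate) * 1000)


def clean_run_probability(rate: float, n_edits: int) -> float:
    """Chance that n_edits consecutive edits all succeed.

    Assumption (not from the article): edits are independent of each other.
    """
    return rate ** n_edits


if __name__ == "__main__":
    for model, rate in SUCCESS_RATES.items():
        print(
            f"{model:16s} {failures_per_thousand(rate):4d} failures per 1,000 edits, "
            f"10-edit clean run: {clean_run_probability(rate, 10):.0%}"
        )
```

Read this way, Sky's 84% means about 160 failed edits per 1,000 against about 40 for Claude 4 Sonnet, four times as many. Under the independence assumption, a task that needs ten consecutive successful edits would complete cleanly roughly 66% of the time at a 96% success rate and roughly 17% of the time at 84%.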

Observations from real usage

The 2M-token context window is a significant capability for large codebases or multi-page inputs, but raw context size did not translate into top-tier reliability in the tested workflows.
Both Sonoma models offered notable inference speed, aligning with Dusk’s intended design point.
Community reports in Discord documented mixed experiences, including hallucinations and tool-calling failures, which contributed to lower success rates relative to mature competitors.

Practical implications for teams

The Sonoma Alpha models present an intriguing experiment in scaling context and responsiveness, but current reliability metrics suggest continued reliance on proven models for critical coding tasks.
Free alpha access is available via Vercel AI Gateway and OpenRouter, making hands-on evaluation straightforward for non-critical experimentation.
Given measured success rates, a reasonable approach for engineering teams is to explore Sonoma Alphas for exploratory or non-production workflows while maintaining established models for production automation and higher-stakes editing.
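One way a team might encode this split is a small routing rule keyed on the measured success rates. The sketch below is illustrative only: the `Task` type, the `eligible_models` helper, and the 90% production floor are hypothetical choices, not part of Cline or of the original analysis.

```python
from dataclasses import dataclass

# Success rates reported in the article.
MEASURED_SUCCESS = {
    "Claude 4 Sonnet": 0.96,
    "GPT-5": 0.92,
    "Gemini 2.5 Pro": 0.90,
    "Sonoma Dusk Alpha": 0.87,
    "Sonoma Sky Alpha": 0.84,
}

# Hypothetical policy threshold (an assumption, not a figure from the article).
PRODUCTION_MIN_SUCCESS = 0.90


@dataclass(frozen=True)
class Task:
    description: str
    production: bool  # True for production automation or higher-stakes edits


def eligible_models(task: Task) -> list[str]:
    """Return the models allowed for a task, highest measured success rate first."""
    floor = PRODUCTION_MIN_SUCCESS if task.production else 0.0
    allowed = [name for name, rate in MEASURED_SUCCESS.items() if rate >= floor]
    return sorted(allowed, key=MEASURED_SUCCESS.get, reverse=True)


if __name__ == "__main__":
    print(eligible_models(Task("patch the billing service", production=True)))
    print(eligible_models(Task("prototype a refactor on a scratch branch", production=False)))
```

With this floor, production tasks are limited to the three established models, while exploratory tasks may use any of the five, including the Sonoma Alphas. A team would tune the threshold to its own tolerance for failed edits.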

Results may vary with task complexity and integration patterns, but the early readout emphasizes that large context windows alone are not a substitute for established model reliability.

