xAI releases grok-code-fast-1 for agentic coding, with low-cost API | Oltre.dev

Overview

xAI announced grok-code-fast-1 on August 28, 2025, positioning it as a fast, lower-cost reasoning model optimized for agentic coding workflows. The company says the model was built from a new architecture, trained on a programming-rich corpus, and refined with datasets reflecting real-world pull requests and coding tasks. During development, xAI worked with launch partners to tune behavior inside their agentic platforms.

According to xAI’s announcement, the model is intended to reduce latency during iterative reasoning and tool-call loops common in agentic coding. It has been released generally via the xAI API with published pricing, and is also available free for a limited time through select partners.

Training and Design

xAI describes a multi-stage pipeline:

Pre-training on a corpus “rich with programming-related content”
Post-training using curated datasets mirroring practical coding work, including pull requests
Iterative refinement with partner feedback during a stealth period under the codename “sonic,” with multiple checkpoints shipped in response to community signals

The release notes indicate a focus on responsiveness and behavior under agentic workflows as a core design goal rather than a byproduct of general-purpose training.

Tool Use and Language Coverage

xAI reports that grok-code-fast-1 has been trained to operate common development tools used in agentic systems, including grep, terminal access, and file editing. The model is described as versatile across the software stack, with particular aptitude in TypeScript, Python, Java, Rust, C++, and Go. Claimed use cases include project scaffolding, answering codebase questions, and targeted bug fixes.

Performance, Caching, and Evaluation

xAI says it implemented serving-side techniques to improve responsiveness and lists prompt caching as a key optimization, with partners reportedly seeing cache hit rates above 90%. For quantitative assessment, the company combines public benchmarks with internal harnesses and human evaluations. On the full subset of SWE-Bench-Verified, grok-code-fast-1 scored 70.8% using xAI’s internal harness. xAI notes that while benchmarks are informative, they may not fully represent end-to-end usability in agentic workflows; internal developer ratings and automated behavior evaluations are used to judge trade-offs during training.

A performance chart is included in the announcement, alongside a methodology statement noting measurements via provider APIs (including xAI’s), but no comprehensive cross-model leaderboard is published in the post.

Access and Pricing

grok-code-fast-1 is available via the xAI API with the following published prices:

$0.20 per million input tokens
$1.50 per million output tokens
$0.02 per million cached input tokens

In addition, xAI has made the model available “for free for a limited time” through launch partners including GitHub Copilot, Cursor, Cline, Roo Code, Kilo Code, opencode, and Windsurf. The announcement highlights partner access pages for:

Cursor: https://cursor.com
GitHub Copilot: https://github.com/features/copilot
Cline: https://cline.bot/

No end date for the limited-time partner access is provided.

Development Timeline and What’s Next

xAI states that grok-code-fast-1 was released in stealth under the codename “sonic” during the prior week, with subsequent checkpoints deployed based on feedback. The company says it plans frequent updates and indicates that a variant supporting multimodal inputs, parallel tool calling, and extended context is in training. A community feedback channel is listed at https://discord.gg/x-ai.

Documentation and Resources

Prompt Engineering Guide: https://docs.x.ai/docs/guides/grok-code-prompt-engineering
xAI Cloud Console: https://console.x.ai
Model card (PDF): https://data.x.ai/2025-08-26-grok-code-fast-1-model-card.pdf

TL;DR

New coding-focused model built on a fresh architecture and programming-heavy training data
Tool-use support for grep, terminal, and file editing; languages include TypeScript, Python, Java, Rust, C++, Go
Reported optimizations include >90% prompt cache hit rates with partners
Benchmark snapshot: 70.8% on SWE-Bench-Verified (xAI internal harness)
API pricing: $0.20/M input, $1.50/M output, $0.02/M cached input
Availability: General via xAI API; “free for a limited time” through partners (no end date disclosed)