Kilo Code adds usage-based pricing estimates, Qwen Code support

Kilo Code’s latest update introduces usage-based model cost estimates, Qwen Code integration, timeline panning, and expanded provider controls, alongside targeted fixes and GPT-5 compatibility improvements.

Overview

Kilo Code, a coding assistant that incorporates functionality from open-source projects such as Roo Code and Cline, has released an update with several feature additions and maintenance changes. The release notes characterize three capabilities as unique to Kilo Code at the time of publication: usage-based model cost estimates, Qwen Code integration as an API provider, and drag-to-pan in the Task Timeline header. The update also expands provider routing controls, improves interoperability with GPT-5 on supported providers, and includes a set of fixes and UI refinements.

Usage-Based Model Cost Estimates

The Kilo Code API provider now displays average cost per million tokens based on real-world usage, accessible under Settings -> Providers. The estimate incorporates factors such as cache discounts (contribution credited to @chrarnoldus via PR #1893). The project cites aggregate activity of over 30 billion tokens per day routed via the Kilo Code API provider, with a reference at https://openrouter.ai/apps?url=https%3A%2F%2Fkilocode.ai%2F.

The notes include an example: at publication time, users were averaging approximately $0.85 per million tokens with Sonnet 4 at a 200k context window. For context, Anthropic’s published list pricing for Sonnet 4 is $3 per 1M input tokens, $15 per 1M output tokens, and $0.30 per 1M cached tokens (see https://www.anthropic.com/pricing#api). Because that average is below even the input list price, it implies that a large share of the tokens is billed at the cached rate. The in-product estimate is presented as a single, empirically grounded figure derived from observed usage.
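To show how a single blended figure can arise from three separate list prices, here is a minimal sketch of a weighted average. The list prices are the ones quoted above; the token mix is a hypothetical illustration, not a measured distribution, and the function name `blended_cost_per_mtok` is invented for this example. It is not how Kilo Code computes its estimate, and it ignores any pricing categories beyond the three listed.

```python
# List prices quoted above, in USD per 1M tokens.
LIST_PRICE_PER_MTOK = {"input": 3.00, "output": 15.00, "cached": 0.30}


def blended_cost_per_mtok(token_mix, prices=LIST_PRICE_PER_MTOK):
    """Return the weighted-average price per 1M tokens.

    token_mix maps each token category to its share of all tokens;
    the shares must sum to 1.
    """
    total_share = sum(token_mix.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("token shares must sum to 1")
    return sum(prices[kind] * share for kind, share in token_mix.items())


# Hypothetical mix (an assumption for illustration only):
# 90% cached reads, 9% uncached input, 1% output.
hypothetical_mix = {"cached": 0.90, "input": 0.09, "output": 0.01}
blended = blended_cost_per_mtok(hypothetical_mix)
print(f"${blended:.2f} per 1M tokens")  # prints "$0.69 per 1M tokens"
```

Under this assumed mix the blended price is far below the $3 input list price, which illustrates why a cache-heavy workload can average under $1 per million tokens.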

Qwen Code as an API Provider

Kilo Code now supports Qwen Code as an API provider (external contribution by @Toukaiteio; PR https://github.com/Kilo-Org/kilocode/pull/1868). The notes indicate that Qwen currently offers a free tier allowing up to 1000 daily requests to the Qwen3 Coder model without token limits. Setup guidance provided in the release includes: install Qwen Code and create an account (https://github.com/QwenLM/qwen-code), select Qwen Code as the provider, and choose the Qwen3 Coder model; configuration files are auto-discovered by Kilo Code.

Task Timeline Drag-to-Pan

Navigation in the Task Timeline now supports drag-to-pan in the header (credited to @hassoncs; PR https://github.com/Kilo-Org/kilocode/pull/1948). The timeline is described as a way to jump between chat messages within a task; the new interaction is intended to make it easier to traverse longer timelines.

Provider Pricing, Capabilities, and Routing Controls

Kilo Code has made it more visible and convenient to choose an inference provider per model:

  • Under Settings -> Providers, a “Provider Price and Capability Breakdown” view shows pricing and context window differences across providers. A screenshot in the notes illustrates this for Qwen3-Coder. Additional provider data is available on OpenRouter at https://openrouter.ai/qwen/qwen3-coder.
  • The “Provider Routing” dropdown exposes two paths: letting Kilo Code select a provider by preference (lower price, higher throughput, or lower latency) or selecting a specific provider directly. The release notes state that this functionality existed previously but has been surfaced more clearly in the UI. Related routing and pricing work is attributed to PR https://github.com/Kilo-Org/kilocode/pull/1893.
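The two routing paths described above can be sketched as a small selection function. This is an illustrative model only: the `ProviderOffer` type, the `choose_provider` function, and the provider names and numbers are all hypothetical, and nothing here reflects Kilo Code's actual implementation or real provider data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderOffer:
    """Hypothetical description of one provider serving a model."""
    name: str
    price_per_mtok: float   # USD per 1M tokens
    throughput_tps: float   # tokens per second
    latency_s: float        # seconds to first token


# Each preference maps to a key where the smallest value wins.
_SORT_KEYS = {
    "price": lambda offer: offer.price_per_mtok,
    "throughput": lambda offer: -offer.throughput_tps,  # higher is better
    "latency": lambda offer: offer.latency_s,
}


def choose_provider(offers, preference="price", pinned=None):
    """Pick a provider by preference, or return the pinned one if given."""
    if not offers:
        raise ValueError("no providers available")
    if pinned is not None:
        for offer in offers:
            if offer.name == pinned:
                return offer
        raise ValueError(f"unknown provider: {pinned}")
    return min(offers, key=_SORT_KEYS[preference])


# Made-up offers for illustration.
offers = [
    ProviderOffer("provider-a", price_per_mtok=0.40, throughput_tps=60.0, latency_s=1.2),
    ProviderOffer("provider-b", price_per_mtok=0.90, throughput_tps=150.0, latency_s=0.8),
    ProviderOffer("provider-c", price_per_mtok=0.60, throughput_tps=90.0, latency_s=0.5),
]

print(choose_provider(offers, "price").name)              # provider-a
print(choose_provider(offers, "throughput").name)         # provider-b
print(choose_provider(offers, pinned="provider-c").name)  # provider-c
```

Pinning a provider bypasses the preference entirely, which mirrors the second path in the dropdown.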

GPT-5 Support and Provider Changes

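The project reports improvements to GPT-5 compatibility on supported providers. Among other provider adjustments, the Big Model API provider has been removed; the notes direct users to the Z.AI provider with open.bigmodel.cn instead.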

Fixes and Minor Updates

The release also aggregates several maintenance updates, including changes pulled from Roo Code v3.25.10.

TL;DR

  • Kilo Code now shows usage-based cost estimates per million tokens for models, factoring in cache discounts; figures draw from observed traffic (reference: https://openrouter.ai/apps?url=https%3A%2F%2Fkilocode.ai%2F).
  • Qwen Code integration enables use of the Qwen3 Coder model; the notes mention a free tier of up to 1000 daily requests.
  • The Task Timeline header supports drag-to-pan for easier navigation.
  • The Providers view surfaces pricing/capability tables and a clearer Provider Routing selector with preferences for lower price, higher throughput, or lower latency, plus manual provider selection.
  • GPT-5-related fixes and Roo Code v3.25.10 updates improve compatibility and polish.
  • The Big Model API provider has been removed; use the Z.AI provider with open.bigmodel.cn.
