OpenAI is rolling out three practical updates to Codex access aimed at extending usable cycles and speeding throughput. The changes include a new compact model, increased rate limits for several ChatGPT plans, and priority processing for higher-tier accounts — along with backend efficiency work to get more out of GPU resources.
GPT-5-Codex-Mini: more usage from a compact model
A new offering, GPT-5-Codex-Mini, delivers roughly 4x more usage than GPT-5-Codex because it is a more compact, cost-efficient variant. The tradeoff is a modest capability reduction relative to the full model, which positions it for simpler or repetitive coding tasks where extra throughput is worth more than peak capability.
- Availability: selectable in the CLI and IDE extension when signed in with ChatGPT.
- API support: indicated as coming soon.
- The system will also suggest switching to GPT-5-Codex-Mini when usage hits 90% of configured limits, enabling longer uninterrupted sessions.
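The switch-at-90% behavior described above amounts to a simple threshold rule. A minimal sketch of that pattern, assuming hypothetical names (`pick_model`, token-based limits) that are not part of any announced Codex API:

```python
# Illustrative sketch only, NOT OpenAI's implementation: suggest a
# compact fallback model once usage crosses a fraction of the limit.

FALLBACK_THRESHOLD = 0.90  # the announcement cites 90% of configured limits


def pick_model(used: int, limit: int,
               full: str = "gpt-5-codex",
               mini: str = "gpt-5-codex-mini") -> str:
    """Return the compact model once usage reaches the threshold."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return mini if used / limit >= FALLBACK_THRESHOLD else full
```

In practice Codex surfaces this as a suggestion rather than an automatic switch, leaving the choice with the user.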
Higher limits and priority processing
Two account-level changes accompany the model update:
- 50% higher rate limits for ChatGPT Plus, Business, and Edu plans, resulting from efficiency improvements and better GPU utilization.
- Priority processing for ChatGPT Pro and Enterprise accounts to boost responsiveness and throughput under load.
These adjustments target both high-frequency workflows and scenarios where latency or throughput matters most.
Practical implications for development workflows
The combination of a compact Codex model plus raised limits and priority processing encourages more flexible cost/performance tradeoffs:
- GPT-5-Codex-Mini can be used for scaffolding, simple refactors, or bulk transformations where the primary goal is maximizing token throughput.
- Higher rate caps and priority processing reduce interruptions and contention for users on paid tiers, particularly in collaborative or classroom settings where many sessions run concurrently.
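The tradeoff described above can be framed as simple task routing: cheap, repetitive work goes to the compact model, harder work to the full model. A hedged sketch, where the task labels and function name are invented for illustration and do not come from the announcement:

```python
# Hypothetical task router, an assumption for illustration: send
# bulk/simple jobs to the compact model to maximize throughput, and
# reserve the full model for tasks needing peak capability.

SIMPLE_TASKS = {"scaffolding", "rename", "bulk-transform", "format"}


def route(task_kind: str) -> str:
    """Map a task label to a model name based on expected difficulty."""
    return "gpt-5-codex-mini" if task_kind in SIMPLE_TASKS else "gpt-5-codex"
```

The right split will vary by team; the point is that the compact tier makes such routing worthwhile where a single model previously had to cover everything.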
API integration remains pending, so the immediate effects are most visible within the CLI and IDE extension when authenticated with ChatGPT. The announcement notes efficiency gains on the GPU side as a contributing factor to these stepped-up limits.