Hands-on: GPT-5.2’s gains for coding, vision, and long-form reasoning
After two weeks of hands-on testing that began November 25, the GPT-5.2 family shows clear improvements in several areas that matter for developers and researchers. The release brings better task completion behavior, stronger code generation, and improved vision and long-context handling — though latency and occasional reasoning stalls remain notable downsides.
Better instruction-following and ambition
One of the most useful advances is better instruction-following, specifically a stronger tendency to complete multi-step workflows rather than stop early. In practice the model more reliably carries out longer, explicitly described processes (for example, generating a full list of 50 options before selecting the best). It is also willing to attempt much larger tasks end-to-end, such as drafting a full 200-page book, rather than defaulting to outlines and section-by-section offers. The outputs are not production-ready in those extreme cases, but the willingness to execute entire workflows opens new iterative approaches for creative and research tasks.
Code generation: more capable and persistent
Code generation is a tangible step up from GPT-5.1. The model tends to produce longer, more autonomous coding sessions, remains engaged for more complex tasks, and gets more things right on the first pass. Tests with Three.js highlighted improved styling (textures and lighting) but also revealed that spatial placement and layout reasoning still need work. Overall, the model is more reliable across larger code tasks and shows stronger context-awareness.
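The Three.js tests were along these lines: build a scene with textured, lit objects placed at explicit positions. The sketch below shows the flavor of the task; the scene contents, object names, and texture path are illustrative assumptions, not the exact test prompt.

```ts
import * as THREE from 'three';

// Illustrative scene: textured, lit objects at explicit positions.
// Object choices and the texture path are assumptions for this sketch.
const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(60, window.innerWidth / window.innerHeight, 0.1, 100);
camera.position.set(0, 2, 6);
camera.lookAt(0, 0, 0);

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

// Lighting: one key light plus a soft ambient fill -- the styling work 5.2 handled well.
const sun = new THREE.DirectionalLight(0xffffff, 1.2);
sun.position.set(5, 10, 7);
scene.add(sun, new THREE.AmbientLight(0xffffff, 0.3));

// A textured crate sitting on a ground plane.
const texture = new THREE.TextureLoader().load('crate.jpg'); // hypothetical asset path
const crate = new THREE.Mesh(
  new THREE.BoxGeometry(1, 1, 1),
  new THREE.MeshStandardMaterial({ map: texture })
);
crate.position.set(-1.5, 0.5, 0); // explicit placement: the spatial detail the model still gets wrong

const ground = new THREE.Mesh(
  new THREE.PlaneGeometry(10, 10),
  new THREE.MeshStandardMaterial({ color: 0x888888 })
);
ground.rotation.x = -Math.PI / 2;

scene.add(crate, ground);
renderer.setAnimationLoop(() => renderer.render(scene, camera));
```

In tasks of this shape, the material and lighting choices tended to come out well, while object positions and overall layout were the parts that needed manual correction.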
Vision and long-context handling
Vision capabilities are noticeably improved, especially in spatial understanding and object positioning within images—though generation of exact spatial layouts can still be imperfect. Long-context performance is also stronger: working with huge codebases, extended analysis threads, and large agentic workflows feels more stable than before, which benefits agent-style coding and repo-aware tasks.
GPT-5.2 Pro: deeper reasoning at a cost
The Pro variant delivers the clearest leap. Pro is stronger at deep reasoning, understanding intent beyond literal instructions, and holding more context while synthesizing multiple angles. Examples include meal-planning prompts where Pro optimized not just for cooking time but also for shopping complexity and prep overhead, showing a grasp of user constraints beyond the literal brief.
Those gains come with trade-offs. Pro is notably slower, and it occasionally gets stuck between conflicting directives, spending a long time thinking and sometimes still failing. Pro is also available only inside ChatGPT, not in Codex CLI or the API, which limits where its reasoning strength can be applied directly.
Codex CLI and agentic coding
In Codex CLI, GPT-5.2 is the closest experience to Pro-quality coding in a CLI environment so far. It excels at context-gathering: asking clarifying questions, reading files, and exploring the repo before implementing changes. That behavior reduces blind assumptions and increases first-shot correctness. The trade-off is that the highest-reasoning modes can be very slow, sometimes taking significantly longer than Pro in ChatGPT.
Workflow comparisons and practical guidance
Across parallel usage with other frontier models, the practical roles have settled into distinct buckets:
- Quick lookups and syntax questions: competitors may be faster and more concise.
- Deep research and complex reasoning: GPT-5.2 Pro tends to produce stronger, more thoughtful results.
- Frontend aesthetics: other models can produce more polished-looking UIs, though GPT-5.2 is more reliable for engineering correctness.
Quirks and the speed problem
The biggest friction point is speed. Standard GPT-5.2 Thinking is often slow enough to deter frequent use for everyday queries, and the extra-deep reasoning modes (including Pro) increase latency further. Occasional reasoning loops, and stretches of prolonged deliberation that end in failure, are another practical annoyance.
Conclusion
GPT-5.2 advances instruction fidelity, code generation, vision, and long-context stability. For tasks that benefit from careful thought—research, complex debugging, and agentic coding—GPT-5.2 Pro is a notable step forward, albeit with real-world trade-offs in latency and availability. For quick interactive work, faster alternatives still occupy a useful place in the toolkit.
Further reading and the original review: https://shumer.dev/gpt52review
Related links:
- Deep Pro-mode dive: https://shumer.dev/gpt52prodeepdive
- Concise-response prompt: https://shumerprompt.com/prompts/gpt-52-concise-response-style-custom-instructions-prompt-bfa38620-cda5-4a47-b937-7a5793537907
- Early access sign-up: https://tally.so/r/w2M17p
- Author on X: https://x.com/mattshumer_
