Superpowers 4.0 Tightens Subagent Workflows, Condenses Skills, Adds E2E Tests

Superpowers 4.0 formalizes subagent review loops, condenses skill descriptions to sharpen triggering, adds a lightweight end-to-end test suite, and leans more heavily on GraphViz for process documentation.

TL;DR

  • Subagent workflow formalized: a spec review agent first checks that implementations match the plan, then a code review agent assesses quality; both run as formal loops so reviews can be rerun after fixes. Full automation remains constrained by OpenAI Codex's lack of subagent support (a fix exists but is not production-ready).
  • Skill descriptions trimmed to guidance about when to use a skill, reducing cases where Claude-based agents claim familiarity with a skill without actually consulting it.
  • Skill consolidation: test-driven-development now includes testing-anti-patterns; systematic-debugging absorbs root-cause-tracing, defense-in-depth, and condition-based-waiting; addresses Claude Code skill-description length limits (https://blog.fsck.com/2025/12/17/claude-code-skills-not-triggering/).
  • Basic E2E tests added covering the brainstorming → planning → implementing flow and verification of skill usage.
  • Process documentation increasingly uses GraphViz ‘dot’ notation for formalized agent workflows (https://blog.fsck.com/2025/09/29/using-graphviz-for-claudemd/).
  • Project remains free and open; repo: https://github.com/obra/superpowers — sponsorships: https://github.com/sponsors/obra

Superpowers 4.0 arrives with a set of focused changes aimed at making the AI-driven development flow more reliable and predictable. The release refines the subagent architecture, streamlines how skills are triggered, introduces a modest end-to-end test suite, and increases internal use of GraphViz for process documentation. The project repository is available at the Superpowers 4.0 GitHub page: https://github.com/obra/superpowers.

Subagent-driven development becomes more formal

A central change is a clearer separation of the review steps into two dedicated agents. A spec review agent now evaluates whether an implementation matches the plan, and only after that agent signs off does a code review agent assess code quality. Both review steps are implemented as formal loops rather than one-shot checks, so the coordinating agent can rerun a review after the implementer applies fixes.

This refinement moves the workflow closer to removing the legacy “two windows” interaction where a human bridged the implementing agent and the coordinator. Progress toward that change is limited by OpenAI Codex’s current lack of subagent support; a fix exists but is not yet production-ready.

Skill triggering and condensed descriptions

Skill triggering behavior has been adjusted to reduce instances where Claude-based agents claim familiarity with a skill but proceed without actually consulting it. To address this, skill description fields were revised to contain only guidance about when to use a skill, rather than describing what the skill does.

An illustrative change: the brainstorming skill’s description shifted from a mix of purpose and guidance to a focused directive specifying when it must be used. Relatedly, several narrowly scoped skills were consolidated into broader skills with progressively disclosed parts — for example, test-driven-development now contains testing-anti-patterns, and systematic-debugging absorbs root-cause-tracing, defense-in-depth, and condition-based-waiting. This consolidation also mitigates issues with hidden limits on the total number of characters allowed for skill descriptions in Claude Code (see the write-up on Claude Code skill limits: https://blog.fsck.com/2025/12/17/claude-code-skills-not-triggering/).
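In Claude Code, a skill's description lives in the frontmatter of its SKILL.md file. A sketch of what such a shift might look like (the wording below is invented for illustration, not the actual skill text):

```markdown
---
name: brainstorming
# Before: a mix of purpose and guidance
#   description: Explores a design space and surfaces options; useful
#   before planning work.
# After: trigger guidance only
description: Use when starting any new feature, design, or significant
  change, before writing a plan or any code.
---
```

Keeping the field to when-to-use guidance gives the agent one unambiguous trigger condition, and also helps stay under the description-length limits discussed in the linked write-up.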

Tests and process notation

A lightweight set of end-to-end tests has been added. These are not a formal evals suite but cover the full brainstorming → planning → implementing flow and verify skill usage, and have already helped improve skill triggering.
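One way such a check might be implemented is to scan a session transcript for skill invocations and assert that the expected skills fired in order. The transcript shape below (one JSON object per line, with a `"tool": "Skill"` field) is a simplified assumption for illustration, not the project's actual format:

```python
import json

def skills_used(transcript_lines):
    """Extract skill names from tool-use entries in a session transcript.

    Assumes a simplified format: one JSON object per line, with skill
    invocations recorded as {"tool": "Skill", "name": "<skill-name>"}.
    """
    used = []
    for line in transcript_lines:
        entry = json.loads(line)
        if entry.get("tool") == "Skill":
            used.append(entry["name"])
    return used

def assert_flow(transcript_lines, expected):
    """Check that the expected skills fired, in order (subsequence match)."""
    used = skills_used(transcript_lines)
    it = iter(used)
    # `s in it` consumes the iterator up to the match, enforcing order.
    missing = [s for s in expected if s not in it]
    assert not missing, f"skills not triggered in order: {missing}"

# Fabricated transcript for the brainstorming -> planning -> implementing flow:
transcript = [
    '{"tool": "Skill", "name": "brainstorming"}',
    '{"tool": "Write", "name": "plan.md"}',
    '{"tool": "Skill", "name": "writing-plans"}',
    '{"tool": "Skill", "name": "test-driven-development"}',
]
assert_flow(transcript, ["brainstorming", "writing-plans", "test-driven-development"])
```

A subsequence check (rather than exact equality) tolerates other tool calls and skills interleaved with the expected ones, which matters when agent runs are nondeterministic.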

Internally, Superpowers is making heavier use of GraphViz ‘dot’ notation for process documentation. Dot provides a slightly more formal process representation than prose, and has proven usable for describing agent workflows (see: https://blog.fsck.com/2025/09/29/using-graphviz-for-claudemd/).
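As a rough illustration (node and edge labels here are invented, not taken from the release), the two-stage review loop described earlier could be written in dot along these lines:

```dot
digraph review_loop {
  implement -> spec_review;
  spec_review -> implement [label="plan mismatch: fix"];
  spec_review -> code_review [label="matches plan"];
  code_review -> implement [label="quality issues: fix"];
  code_review -> done [label="approved"];
}
```

The cycles back to `implement` make the rerun semantics explicit in a way that is easy to miss in prose.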

Sponsorship and maintenance

The project remains free and open. The maintainer has enabled GitHub Sponsorships for those who wish to support ongoing development: https://github.com/sponsors/obra

Original post: https://blog.fsck.com/2025/12/18/superpowers-4/
