BMAD vs Spec Kit vs OpenSpec: Choosing Your Spec-Driven AI Framework

Yuriy Butkevych

Co-founder and Technology Evangelist

We trialed five spec-driven frameworks across three real client projects in Q1 2026. Here’s what we’d actually pick and why it might not be BMAD.

If you’re a CTO evaluating AI coding agents for your team right now, you’ve probably hit the same wall we did six months ago: there are too many “spec-driven development” frameworks and almost no honest comparisons between them. Marketing pages all promise the same thing — predictable AI, production-ready code, no more vibe coding hangover. The reality on the ground is very different.

In this article, we lay out a working decision tree for picking between BMAD, GitHub Spec Kit, OpenSpec, GSD, Hermes, and AWS Kiro — the five frameworks our team at Reenbit has actually shipped with, plus the one (Kiro) that’s reshaping the conversation. No feature matrix theater. Just the tradeoffs, the pain points, and the five questions we walk every client through before recommending one.

What Spec-Driven Development Actually Means (in 60 seconds)

Spec-driven development (SDD) is a methodology that treats a written specification — not the prompt, not the chat history — as the single source of truth for AI agents. You write what you want, plan how to build it, break it into tasks, and only then let the AI implement. The spec becomes the contract; the code becomes the artifact.

That single shift solves the three problems that killed early “vibe coding”:

Context loss. The spec persists across sessions and team members.
Drift. Every change traces back to a versioned document, not a forgotten Slack thread.
Auditability. When regulators (or your own QA) ask why a decision was made, you have a paper trail.

The category exploded in 2025. By early 2026, a community map of agentic coding frameworks tracked 30+ tools — and three of them dominate real engineering conversations: BMAD, GitHub Spec Kit, and OpenSpec. Two more — GSD and Hermes — sit on the edges as lighter alternatives, and AWS Kiro has joined the fight as a full IDE rather than just a methodology.

If you’re new to the category, our pillar article, What Is BMAD? The Agentic AI Framework for Production-Ready Development, is prerequisite reading.

BMAD: What It’s Best At and Where It Breaks

BMAD-METHOD (Breakthrough Method for Agile AI-Driven Development) is the most architecturally ambitious framework in the category. It simulates an entire agile software team: 12+ specialized AI agents covering Analyst, PM, Architect, UX Designer, Scrum Master, Developer, QA, and Tech Writer roles. Each agent gets a tightly scoped context window and produces a versioned artifact — PRD, architecture document, sprint stories — before the next agent picks up the work. With 37,000+ GitHub stars and V6’s cross-platform agent team support across Claude Code, Cursor, Codex, Copilot, and Windsurf, BMAD is the framework other frameworks compare themselves to.

Where BMAD shines:

Complex greenfield projects with clear scope. When you’re building something net-new and the cost of “almost right” is high, BMAD’s heavy planning pays for itself within two sprints.
Teams that benefit from formal documentation. If you’re scaling from 5 to 25 engineers, BMAD’s artifact-driven handoffs double as onboarding documentation by default.
Regulated industries. PRDs, architecture diagrams, and story-level test plans double as compliance evidence — a quiet but powerful side benefit.

Where BMAD breaks:

Small features. A 4-hour bug fix shouldn’t need a PRD and a sprint story. BMAD’s three-track system (Quick Flow, Standard, Enterprise) helps, but senior engineers still feel friction.
Token cost. Real-world BMAD usage averages ~31,667 tokens per workflow run, with large projects incurring $800–$2,000+/month in API costs per developer. We’ve seen weeks hit $3,200 on Claude Opus before we tuned the workflow.
Speed for small jobs. In one real-world CRM dashboard build, the same task took 12 minutes with OpenSpec, 90 minutes with Spec Kit, and 5.5 hours with BMAD. Five and a half hours.
Brownfield friction. Active GitHub issues #446 and #563 confirm what we’ve seen on legacy refactors: BMAD’s documentation-first assumptions don’t always map cleanly onto a 10-year-old monolith.
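
The token-cost point above is easy to sanity-check with back-of-envelope arithmetic. The sketch below is illustrative only: the 31,667-token average is the figure quoted above, while the run counts, working days, and per-million-token price are hypothetical placeholders, not vendor rates.

```python
# Back-of-envelope API cost estimate for a spec-driven workflow.
# Only AVG_TOKENS_PER_RUN comes from the article; every other number passed
# to these functions is a hypothetical placeholder to replace with your own.

AVG_TOKENS_PER_RUN = 31_667  # average tokens per BMAD workflow run (quoted above)


def monthly_cost_usd(runs_per_day: float,
                     working_days: int,
                     usd_per_million_tokens: float,
                     tokens_per_run: int = AVG_TOKENS_PER_RUN) -> float:
    """Estimate one developer's monthly API spend."""
    tokens = tokens_per_run * runs_per_day * working_days
    return tokens / 1_000_000 * usd_per_million_tokens


def runs_per_day_for_budget(budget_usd: float,
                            working_days: int,
                            usd_per_million_tokens: float,
                            tokens_per_run: int = AVG_TOKENS_PER_RUN) -> float:
    """Invert the estimate: how many runs per day fit inside a monthly budget."""
    cost_per_run = tokens_per_run / 1_000_000 * usd_per_million_tokens
    return budget_usd / (cost_per_run * working_days)
```

With an assumed blended price of $30 per million tokens, 50 workflow runs a day over 20 working days comes to roughly $950 a month, which sits inside the range quoted above. Real bills depend heavily on the input/output token split and on caching, which this sketch ignores.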

BMAD is excellent. It’s also expensive and overkill for most weekly engineering work. Knowing when not to use it matters as much as knowing how to set it up. Discover more about Reenbit’s AI-Assisted Development Service.

The Other Frameworks: Honest Capsule Reviews

GitHub Spec Kit

Released by GitHub in late 2025, Spec Kit has the strongest distribution story in the category — 80,000+ GitHub stars and template packages for 24+ AI coding agents, including Copilot, Claude Code, Gemini CLI, Cursor, and Windsurf.

Its workflow is built on a four-phase loop: specify → plan → tasks → implement. The defining feature is the constitution, a project-wide ruleset that every spec inherits from, which keeps features consistent without retyping conventions in every prompt. Setup is a single specify init command, and slash commands like /specify and /plan appear directly in your editor.
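
To make the inheritance idea concrete, here is a minimal sketch of a constitution whose rules flow into every feature spec, with the four phases enforced in order. The class names and fields are hypothetical illustrations; they are not Spec Kit's file format or API.

```python
from dataclasses import dataclass, field

PHASES = ("specify", "plan", "tasks", "implement")  # the four-phase loop


@dataclass(frozen=True)
class Constitution:
    """Project-wide rules that every feature spec inherits (hypothetical shape)."""
    rules: tuple


@dataclass
class FeatureSpec:
    """One feature's spec plus its progress through the loop (hypothetical shape)."""
    name: str
    constitution: Constitution
    local_rules: list = field(default_factory=list)
    completed: list = field(default_factory=list)

    def effective_rules(self) -> list:
        # Inherited rules come first, then feature-specific additions.
        return [*self.constitution.rules, *self.local_rules]

    def advance(self, phase: str) -> None:
        """Mark a phase as done, refusing to skip ahead in the loop."""
        done = len(self.completed)
        expected = PHASES[done] if done < len(PHASES) else None
        if phase != expected:
            raise ValueError(f"expected phase {expected!r}, got {phase!r}")
        self.completed.append(phase)
```

The design point is that conventions live in one place: a new feature spec only states what is specific to it, and the shared rules arrive automatically.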

Best for: Standardizing AI quality across an existing team. Medium-sized features where rigor matters but you don’t need an entire simulated team.

Watch out for: Initial setup involves “a lot of questions,” as one Visual Studio Magazine review put it. Spec Kit is opinionated and assumes you’ll invest the time to write a strong constitution upfront.

OpenSpec

OpenSpec is the minimalist of the group. Instead of generating full architecture documents per change, OpenSpec uses delta specs — you write only what’s changing, completed specs archive into a growing source-of-truth document, and the project documentation evolves with the code. No personas, no agent ceremonies, no sprint metaphors. Just spec → change → archive.
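
The delta idea can be sketched in a few lines: a change lists only what it adds, modifies, or removes, and archiving folds it into the running source-of-truth document. The data shapes and the requirement IDs below are assumptions made for illustration, not OpenSpec's actual on-disk layout.

```python
from dataclasses import dataclass, field


@dataclass
class DeltaSpec:
    """A change that lists only what differs from the current spec (hypothetical shape)."""
    name: str
    added: dict = field(default_factory=dict)     # requirement id -> text
    modified: dict = field(default_factory=dict)  # requirement id -> new text
    removed: list = field(default_factory=list)   # requirement ids


def archive(source_of_truth: dict, delta: DeltaSpec) -> dict:
    """Fold a completed delta into a new copy of the source-of-truth spec."""
    merged = dict(source_of_truth)  # leave the caller's copy untouched
    for req_id in delta.removed:
        if req_id not in merged:
            raise KeyError(f"cannot remove unknown requirement {req_id!r}")
        del merged[req_id]
    for req_id, text in delta.modified.items():
        if req_id not in merged:
            raise KeyError(f"cannot modify unknown requirement {req_id!r}")
        merged[req_id] = text
    for req_id, text in delta.added.items():
        if req_id in merged:
            raise KeyError(f"requirement {req_id!r} already exists")
        merged[req_id] = text
    return merged
```

Because each change touches only the requirements it names, the rest of a legacy system never has to be documented up front, which is why this model suits brownfield work.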

Best for: Modernizing legacy systems and brownfield projects where heavy frameworks bog down. Speed-first teams.

Watch out for: Limited ceremony cuts both ways. If your team needs explicit handoffs between roles (PM → Architect → Dev), OpenSpec’s leanness will feel like it’s missing pieces.

GSD (Get Stuff Done)

GSD is a meta-prompting framework built primarily on top of Claude Code’s native capabilities. It uses a 3-phase funnel with just two prompts: a PLANNING prompt that performs gap analysis and generates a prioritized TODO list, and a BUILDING prompt that implements tasks and runs tests in a loop until everything is green. As Context Studios describes it, GSD’s philosophy is that complexity should live in the system, not in the workflow.
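
The two-prompt loop described above can be sketched as two plain functions. This is a simplified model under stated assumptions: the model calls are stubbed out as ordinary callables, and the retry cap is a hypothetical safeguard rather than a documented GSD setting.

```python
from typing import Callable


def plan(requirements: set, implemented: set,
         priority: Callable[[str], int]) -> list:
    """PLANNING step: gap analysis, then a prioritized TODO list."""
    gaps = requirements - implemented
    # Lower priority number first; the name breaks ties deterministically.
    return sorted(gaps, key=lambda item: (priority(item), item))


def build(todo: list,
          implement: Callable[[str], None],
          tests_pass: Callable[[], bool],
          max_attempts: int = 3) -> list:
    """BUILDING step: implement each task, retrying until the tests are green."""
    done = []
    for task in todo:
        for _ in range(max_attempts):
            implement(task)
            if tests_pass():
                done.append(task)
                break
        else:
            # The inner loop never hit `break`, so the tests never went green.
            raise RuntimeError(
                f"tests still failing after {max_attempts} attempts: {task}"
            )
    return done
```

In a real setup, `implement` would hand the task to the coding agent and `tests_pass` would run the project's test suite.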

Best for: Fast iteration on smaller projects where requirements are fluid. Solo developers and small product teams who already live inside Claude Code.

Watch out for: Limited multi-agent orchestration. If you need real role specialization across PM, Architect, and Dev, GSD’s leanness becomes a ceiling.

Hermes

Hermes is the structured communication layer for complex multi-agent systems. Where other frameworks define the process, Hermes defines the protocol for how agents hand off context, artifacts, and decisions to each other. As MindStudio’s framework comparison puts it, Hermes is best when “structured handoffs are a first-class concern.”
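
One way to picture a structured handoff is as a typed, immutable envelope that is validated before it leaves one agent and fingerprinted for the audit log. The sketch below is a hypothetical illustration of that idea; the field names and checks are assumptions and do not describe Hermes's actual message format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Handoff:
    """Envelope one agent passes to the next (hypothetical shape)."""
    sender: str
    receiver: str
    context: str
    artifacts: tuple = ()
    decisions: tuple = ()

    def validate(self) -> None:
        """Reject envelopes that would leave the receiver guessing."""
        if not self.sender or not self.receiver:
            raise ValueError("sender and receiver are required")
        if self.sender == self.receiver:
            raise ValueError("an agent cannot hand off to itself")
        if not self.artifacts and not self.decisions:
            raise ValueError("a handoff must carry artifacts or decisions")

    def digest(self) -> str:
        """Stable SHA-256 fingerprint so an audit log can detect later edits."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()
```

Storing the digest alongside each handoff gives auditors a cheap way to confirm that the context one agent received is the context the previous agent sent.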

Best for: Teams building their own custom agent pipelines, especially in domains where audit-grade context transfer between agents matters (financial services, healthcare AI).

Watch out for: Hermes is closer to a toolkit than a turnkey methodology. If you want something to install and run today, this isn’t it.

AWS Kiro

Kiro is AWS’s spec-driven agentic IDE — the wildcard in this lineup because it isn’t just a methodology; it’s a full development environment with the framework baked in. Before any code is written, Kiro’s agent produces three structured documents: requirements.md (user stories with EARS-notation acceptance criteria), design.md (architecture, sequence diagrams, component breakdown), and tasks.md (a numbered implementation checklist).
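
EARS (Easy Approach to Requirements Syntax) constrains each requirement to a small set of sentence templates, which is what makes acceptance criteria machine-checkable. The sketch below classifies a sentence against simplified versions of those templates. It is a rough illustration, not Kiro's parser, and the regular expressions are deliberately loose assumptions.

```python
import re
from typing import Optional

# Simplified EARS templates, checked in order. The keyword-led patterns come
# before the plain "the <system> shall" form so they are matched first.
_EARS_PATTERNS = [
    ("unwanted behaviour", re.compile(r"^if .+?,? then the .+ shall .+", re.I)),
    ("event-driven", re.compile(r"^when .+?,? the .+ shall .+", re.I)),
    ("state-driven", re.compile(r"^while .+?,? the .+ shall .+", re.I)),
    ("optional feature", re.compile(r"^where .+?,? the .+ shall .+", re.I)),
    ("ubiquitous", re.compile(r"^the .+ shall .+", re.I)),
]


def classify_ears(requirement: str) -> Optional[str]:
    """Return the EARS pattern a requirement follows, or None if it fits none."""
    text = requirement.strip()
    for name, pattern in _EARS_PATTERNS:
        if pattern.match(text):
            return name
    return None
```

A requirement such as "Make login faster" fits none of the templates and returns None, which is the kind of vague statement this style of spec is meant to catch before implementation starts.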

AWS announced a new “spec check” feature that mathematically proves requirements are contradiction-free before code is written, and Quick Plan mode plus parallel task execution can cut implementation time on large projects by ~75%.

Best for: AWS-native shops that want SDD without picking a methodology themselves. Teams that already standardize on a single IDE.

Watch out for: Vendor lock-in. Kiro is excellent inside AWS; outside it, you lose the integrated MCP catalog and parallel execution advantages. It’s also less flexible than running a methodology like BMAD on top of your existing toolchain.

The Decision Matrix: Which Framework, When

We use the matrix below with every client engagement. The variables that matter most aren’t framework features — they’re team size, project type, and compliance requirements.

| Scenario | Team size | Project type | Compliance needs | Recommended framework |
| --- | --- | --- | --- | --- |
| Pre-PMF startup MVP | 1–3 | Greenfield | Low | OpenSpec or GSD |
| Funded startup, scaling team | 4–15 | Greenfield product | Medium | GitHub Spec Kit |
| Complex enterprise greenfield | 10–15 | New platform / big-bet | Medium-High | BMAD |
| Brownfield modernization | Any | Legacy refactor | Medium | OpenSpec (primary) + BMAD brownfield mode (selectively) |
| Regulated industry (FinTech, health) | Any | Either | High (SOC 2, HIPAA, EU AI Act) | BMAD (artifacts double as audit evidence) |
| AWS-only shop, single IDE policy | Any | Greenfield | Any | Kiro |
| Custom multi-agent pipeline | Internal platform team | Tooling build | High | Hermes as the comms layer |
| Solo dev, fast iteration | 1 | Anything small | Low | GSD |

Migration Paths: When You Outgrow Your Current Framework

The good news: because these are methodologies and not deeply coupled platforms, migration is almost always painless. The artifacts (PRDs, specs, stories) port across tools because the underlying data is just markdown plus JSON.

A few migration patterns we’ve seen:

OpenSpec → BMAD. As your brownfield modernization succeeds and you start building new features on top, OpenSpec’s delta-only model can feel thin. Carry your archived spec forward as the BMAD Architect agent’s input document.
Spec Kit → BMAD. Common when a startup raises a Series A, hires its first PM, and needs explicit role separation. The Spec Kit constitution maps cleanly onto BMAD’s master agent prompts.
BMAD → GSD. Yes, the reverse direction. When a complex BMAD project ships and the team enters maintenance mode, GSD’s lean two-prompt loop is often better suited to the bug-fix-and-small-feature cadence that follows.
Anything → Kiro. If your stack is consolidating on AWS, Kiro becomes attractive because it embeds the methodology in the IDE. You give up flexibility; you get integrated tooling.

The Five Questions We Ask Every Client

When a client asks us, “Which spec-driven AI framework should we use?” we don’t start with framework features. We start with these five questions. Whichever framework best matches the answers wins.

1. What’s your project type — greenfield or brownfield? Brownfield tilts toward OpenSpec (primary) with selective BMAD brownfield mode. Greenfield opens up BMAD and Spec Kit.
2. What’s your team size and structure? Solo or small team → GSD or OpenSpec. Scaling team that needs explicit roles → Spec Kit. Multi-team enterprise → BMAD.
3. What are your compliance requirements? SOC 2, HIPAA, EU AI Act, or Colorado AI Act (enforceable June 2026) → BMAD’s audit-friendly artifacts. Internal-only tools → any framework works.
4. What’s your token budget? BMAD is the most expensive to run; OpenSpec and GSD are the cheapest. If $2,000+/month/developer in API spend is a non-starter, that’s your filter.
5. How tied are you to a single IDE or cloud? AWS-only → Kiro deserves a serious look. Multi-IDE, multi-model → BMAD (V6 cross-platform), Spec Kit (24+ agents), or OpenSpec (CLI-agnostic).
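
The five questions above can be encoded as a small shortlist function. This is a sketch under stated assumptions: each question casts equal-weight votes for the frameworks it points to, and the token budget acts as a hard filter on BMAD. The equal weighting is our simplification for illustration; the questions themselves do not prescribe weights, and the field names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

ALL = ("BMAD", "Spec Kit", "OpenSpec", "GSD", "Hermes", "Kiro")


@dataclass(frozen=True)
class Answers:
    brownfield: bool         # Q1: brownfield (True) or greenfield (False)
    team: str                # Q2: "small", "scaling", or "enterprise"
    regulated: bool          # Q3: SOC 2, HIPAA, EU AI Act, and similar
    high_token_budget: bool  # Q4: can absorb roughly $2,000+/month/developer
    aws_only: bool           # Q5: tied to AWS and a single IDE


def shortlist(a: Answers) -> list:
    """Return the framework(s) that the most questions point to."""
    votes = Counter()
    votes.update(["OpenSpec"] if a.brownfield else ["BMAD", "Spec Kit"])
    votes.update({
        "small": ["GSD", "OpenSpec"],
        "scaling": ["Spec Kit"],
        "enterprise": ["BMAD"],
    }[a.team])
    votes.update(["BMAD"] if a.regulated else ALL)
    votes.update(["Kiro"] if a.aws_only else ["BMAD", "Spec Kit", "OpenSpec"])
    if not a.high_token_budget:
        votes.pop("BMAD", None)  # the budget question is a filter, not a vote
    top = max(votes.values())
    return sorted(name for name, count in votes.items() if count == top)
```

Treat the result as a conversation starter rather than a verdict: ties are returned as a list, and a regulated team with a tight token budget will see BMAD filtered out even though question 3 points to it, which is exactly the tension worth discussing.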

If you can answer these five questions clearly, you don’t need a framework consultant. If you can’t — particularly questions 3 and 4 — that’s where most teams need an outside perspective, and that’s the conversation we have most often with Reenbit clients.

Conclusion

After three client projects and five frameworks in Q1 2026, the answer we give most often is not BMAD. It’s “it depends on questions 1 through 5 above — and most teams should mix two or three frameworks across their portfolio rather than standardize on one.” BMAD is the right choice for complex, regulated, greenfield work. OpenSpec wins on brownfield. Spec Kit is the safest bet for a scaling team. Kiro is the easiest path inside AWS. GSD is the lightest tool for a solo dev. Hermes is for teams building their own pipelines.

The teams that ship best aren’t the ones that picked the “right” framework. They’re the ones who picked the appropriate framework for the specific project — and adjusted it when the project changed.

Need help choosing? Talk to our AI-Assisted Software Development team — we’ll walk you through the five-question framework with your actual project context, share the spec templates we use across BMAD, Spec Kit, and OpenSpec, and tell you honestly which one (if any) is the right fit.

FAQ

What is the difference between BMAD and Spec Kit?

BMAD simulates a full agile team with 12+ specialized AI personas (Analyst, PM, Architect, Developer, QA, etc.) and produces extensive artifacts before any code is written. Spec Kit uses a leaner four-phase workflow (specify → plan → tasks → implement) with a project-wide “constitution” that every spec inherits.

BMAD is heavier and better suited to complex greenfield work; Spec Kit is lighter and better suited to standardizing AI quality across an existing team.

Is OpenSpec better than BMAD for legacy code?

For most brownfield modernization projects, yes. OpenSpec’s delta-spec model captures only what’s changing, rather than trying to document the entire system up front — a much better fit for legacy refactors.

BMAD has a brownfield workflow, but active GitHub issues confirm it still struggles on messy legacy codebases.

How much does it cost to run BMAD?

Real-world BMAD usage averages around 31,667 tokens per workflow run, and large projects can consume 230 million tokens per week.

Monthly API costs typically range from $800 to $2,000 per developer for frontier models like Claude Opus 4.5 or Sonnet 5. OpenSpec and GSD run at a fraction of that cost.

Which spec-driven framework has the most GitHub stars?

As of early 2026, GitHub Spec Kit leads with over 80,000 stars (helped by GitHub’s distribution). BMAD-METHOD is second with 37,000+ stars. OpenSpec, GSD, and Hermes are smaller in terms of community size but are actively maintained.

Can I switch from one framework to another mid-project?

Yes, in most cases. Because these are methodologies producing markdown and JSON artifacts (not deeply coupled platforms), specs port across tools with light reformatting. The most common migrations we see are OpenSpec → BMAD (as brownfield becomes greenfield), Spec Kit → BMAD (as the team adds PM/Architect roles), and BMAD → GSD (when a project moves into maintenance mode).

Is AWS Kiro a replacement for BMAD?

Not exactly. Kiro is a full IDE with SDD baked in, while BMAD is a methodology that runs on top of your existing IDE and model of choice. If you’re locked into AWS and want SDD without picking a methodology yourself, Kiro is compelling. If you need flexibility across Claude, GPT, Gemini, and multiple IDEs, BMAD or Spec Kit remain the more portable choices.
