
BMAD v6 vs Plan Mode: The Honest Comparison Nobody Asked For

By The Gray Cat

Summary

Topics Covered

  • BMAD V6 Overpromises Speed Gains
  • Enterprise Process Wastes Simple Tasks
  • YOLO Hack Enables True Autonomy
  • One Prompt Outperforms Seven Hours
  • BMAD Excels as Learning Tool

Full Transcript

The game on the left took me seven hours to build.

The game on the right took one prompt.

Same requirements.

Same model.

Same cat.

A few months ago, I tested the BMAD Method for the first time.

Version five.

It took eight hours to build a single landing page, and the comments section had opinions.

"Try YOLO mode."

"You did it wrong."

"V6 changes everything."

Fair enough.

The creator released a "groundbreaking" v6 update, so I gave it another chance.

This time with Opus 4.6, every advantage I could give it, and the emotional resilience of a man who has already lost one weekend to this framework.

Let me show you how both of these were made.

Before I get into the suffering, a quick overview of what V6 claims to bring to the table.

It's marketed as a ground-up rewrite with five major changes.

First, no more one-size-fits-all process. V5 forced the same heavyweight planning on everything -- even a bug fix got the enterprise treatment.

V6 introduces two tracks: Quick Flow for simple tasks and the full BMad Method for complex projects.

Smart idea.

Second, Agent-as-Code.

Agents are now defined as .agent.yaml files instead of monolithic markdown prompts.

They're versioned, portable, and can be installed across 20 different platforms. On paper, that sounds great.

Third, Party Mode.

You can bring multiple agent personas into a single conversation.

The PM, Architect, and Developer can all debate in one context window.

Worth noting: this is one LLM role-playing multiple characters, not actual parallel processes.

It's improv theater, not a team standup.

Fourth, step-file architecture.

Instead of loading a 50-kilobyte PRD into every session, documents are sharded into small, focused pieces.

Community case studies report token reductions of 74 to 90 percent.

And fifth, expansion packs.

Domain-specific modules like Game Dev Studio, Creative Intelligence Suite, and Test Architect that you can install on top of the core method.

Those are the headlines.

Let me tell you what happens when you actually sit down and use it.

When I was 16, a friend of mine built this ridiculous flash game.

The concept was, let's say, not suitable for YouTube.

I've always wanted to build it in modern web technologies, so I decompiled the original SWF file, extracted the scripts, and used Claude Code to generate proper requirements.

I replaced the original theme with something family-friendly -- a cat hanging from a rope, swinging back and forth.

You charge a cannon and launch tuna at it.

Hit the cat.

Score a point because apparently that's my brand.

The result: "Feed TGC", a canvas-based HTML5 game where you launch tuna at a swinging cat head within a sixty-second timer.

Simple game, perfect test case.
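To make the mechanic concrete, here's a minimal sketch of the swing in plain JavaScript -- the function name and constants are mine for illustration, not taken from either build:

```javascript
// The cat head hangs from a rope and swings like a pendulum, so its canvas
// position is just a sine of time. All numbers here are illustrative defaults.
function catPosition(t, { pivotX = 400, pivotY = 50, ropeLen = 200, maxAngle = Math.PI / 4, period = 2 } = {}) {
  // Angle oscillates between -maxAngle and +maxAngle with the given period (seconds).
  const angle = maxAngle * Math.sin((2 * Math.PI * t) / period);
  return {
    x: pivotX + ropeLen * Math.sin(angle),
    y: pivotY + ropeLen * Math.cos(angle),
  };
}

// At t = 0 the cat hangs straight down under the pivot.
console.log(catPosition(0)); // { x: 400, y: 250 }
```

Call that each animation frame with the elapsed time, draw the cat at the result, and you have the core of the game loop.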

I used the exact same requirements document for both the BMAD build and the plan mode build.

Same Opus 4.6 model.

For comparison, I didn't install the Game Dev Studio expansion pack.

That would be a separate video if there's interest.

If you want to see how v5 performed, I'll link my previous comparison video where I tested BMAD against Spec Kit and Open Spec.

Installation is the same as before.

npx bmad-method install, select your modules, pick your IDE.

I went with the core method plus the Creative Intelligence Suite, because why not.

I'm already here.

The first phase is the PRD.

You summon the PM agent, and it starts asking questions.

Thirteen steps of elicitation.

And honestly, some of these questions are excellent.

Target platform?

Canvas rendering preferences?

Hit detection model -- generous or precise?

Sound management -- Web Audio API or simple audio elements?

Physics for the pendulum swing?

These are questions I wouldn't have thought to ask myself.
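A couple of those choices are concrete enough to sketch. "Generous" hit detection, for instance, usually just means padding the target's collision radius. A minimal illustration in plain JavaScript -- names and numbers are mine, not from the game's code:

```javascript
// Circle-vs-circle hit test; "generous" mode pads the cat's radius so near
// misses still count. hitTest and padding are illustrative names.
function hitTest(tuna, cat, padding = 0) {
  const dx = tuna.x - cat.x;
  const dy = tuna.y - cat.y;
  const r = tuna.r + cat.r + padding;
  // Compare squared distances to avoid a square root per frame.
  return dx * dx + dy * dy <= r * r;
}

const cat = { x: 100, y: 100, r: 20 };
const nearMiss = { x: 130, y: 100, r: 5 }; // 30px away, combined radius only 25

console.log(hitTest(nearMiss, cat));     // false -- precise mode: a miss
console.log(hitTest(nearMiss, cat, 10)); // true  -- generous mode: a hit
```

That one parameter is the whole "generous or precise" decision the PM agent was asking about.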

For someone who's never built a game, this is genuinely educational.

But then there's the other kind.

"Business success metrics" for a cat game I'm building for a YouTube video.

"User persona focus groups."

Who plays this?

Me.

Me and the guy who made the original in 2005.

"Pre-mortem analysis."

Imagine TGC flopped -- what went wrong?

I mean, it's a tuna cannon game on a YouTube channel with a few thousand subscribers.

Let's not overthink this.

I told the PM the only business success metric is whether this video gets a thousand views.

It replied: "I love the honesty."

By the time the PRD was done, I was at 59 percent context and thirty minutes in.

Without writing a single line of code.

Next came the UX design phase.

This was optional -- BMAD asked if I wanted to run it, and I said sure, why not.

In hindsight, I should have skipped it.

Fresh context window, load the UX agent.

It gave me four design directions.

I pointed it to my channel's color palette -- black, gray, orange.

The HTML preview it generated actually looked promising.

Nice CRT retro effect, proper color tokens.

I thought, okay, maybe this time.

Then the walls of text started.

Emotional response definitions.

Micro-emotions.

Core experience mapping.

Flow optimization principles.

I'm reading pages of design theory for a game where you throw fish at a cat.

At one point it asked me to name two or three games that capture the experience I'm going for.

I don't play games.

I told it to pick for me.

It suggested Flappy Bird and Angry Birds.

Yeah, that tracks.

By the end of UX, I was at 76 percent context and starting to question my life choices.

Architecture was another fresh context window.

Another round of questions about things I'd already answered.

Load patterns.

Code conventions.

Folder structure.

All reasonable for a real project.

Total overkill for this one.

But I kept going because I wanted to see the full process, and at this point, I'd spent too many hours to quit.

Four epics.

Fifteen stories.

A coverage validation matrix.

The architecture phase ended at 39 percent context, and the implementation readiness check found zero critical issues.

On paper, everything looked perfect.

On paper, the Titanic was unsinkable.

One genuine improvement I noticed: Opus 4.6 stayed coherent all the way up to 70 percent context usage.

In my v5 test with Sonnet 4.0, the model would start losing track much earlier.

So the model upgrade helped, even if the method itself didn't change much.

This is the part the comments asked about.

YOLO mode.

The promise of autonomous implementation where you just kick it off and go grab a coffee.

The reality: YOLO mode exists, but it's not what you might picture.

It doesn't even run a full story. It runs a single step -- just the task generation, or just the development, or just the review -- and stops.

You have to manually tell it to continue to the next step.

That's not exactly "YOLO."

After some frustration, I discovered the real trick -- and this works for both v5 and v6.

I added instructions to the project's CLAUDE.md file: "When in YOLO mode, do not stop.

Go to the next logical step.

Develop, review, fix, complete.

Do not stop until the story is done.

Use sub-agents where applicable."

That made a drastic improvement.

With those instructions, each story would actually run to completion autonomously.

Develop, review, apply fixes -- all in one go.

I could clear the context, paste the next story command, and walk away.

If you're going to use BMAD, this CLAUDE.md trick is essential.
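For reference, here's what that addition looks like as a single CLAUDE.md block (paraphrased from my own file, so treat the exact wording as illustrative rather than canonical):

```markdown
## YOLO mode
When in YOLO mode, do not stop. Go to the next logical step:
develop, review, fix, complete. Do not stop until the story is done.
Use sub-agents where applicable.
```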

But even with that fix, the process had real problems. After epic two, the code didn't compile.

The dev server was completely broken.

On paper, every story had green checkmarks.

In the browser?

Nothing worked.

"Document not defined" errors because it was running browser code on the server side.

Missing imports.
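For anyone who hasn't hit that error before: "document is not defined" means browser-only code got evaluated in Node, typically during a server-side build or render step. The common guard is a typeof check -- a sketch, with function and element names of my own invention:

```javascript
// Guard browser-only code so it survives being evaluated in Node (SSR,
// build steps, tests). 'game-canvas' is an illustrative element id.
function getCanvas() {
  if (typeof document === 'undefined') {
    return null; // not in a browser: bail out instead of crashing
  }
  return document.getElementById('game-canvas');
}

console.log(getCanvas()); // null when run under Node
```

One check like this would have turned a hard crash into a no-op on the server side.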

The SVG assets were rough placeholders that looked like they were drawn during an earthquake.

I spent several iterations debugging compilation issues.

Fixed the broken cannon state transition -- when you overcharged, it would flash the "game over" screen and immediately restart because the spacebar was still held down.

Fixed the audio context problem where sounds wouldn't play without user interaction.
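The audio issue is a browser policy, not a BMAD bug: an AudioContext created before any user gesture starts in the "suspended" state, and nothing plays until you resume it. The guard below is my own sketch of the fix, not code from either build; `AudioContext.resume()` is the real Web Audio API call.

```javascript
// Resume a suspended AudioContext; browsers require this to happen inside
// (or after) a user gesture such as a click or keypress.
function resumeOnGesture(ctx) {
  if (ctx.state === 'suspended') {
    ctx.resume(); // returns a Promise in real browsers
  }
}

// Browser wiring (illustrative):
//   const ctx = new AudioContext();
//   document.addEventListener('pointerdown', () => resumeOnGesture(ctx), { once: true });
```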

Fixed the charging sound playing in an infinite loop when you overcharged.

One by one, each epic revealed new bugs that the method's own QA step had completely missed.

Eventually, after about seven hours spread across two days, the game was playable.

Sounds worked.

The pendulum swung.

The scoring tracked.

But the visuals were rough SVG shapes, and it took significant manual course correction to get there.

On my hundred-dollar-a-month Claude Code Max subscription, running Opus 4.6 with low reasoning effort, I burned through 46 percent of my five-hour token budget for a tuna cannon game.

Same requirements document.

Same sound files placed in the assets folder.

One prompt: "Read these requirements and build the game for me.

The game should run in HTML5 canvas.

Feel free to suggest the implementation details.

Use sub-agents if needed.

Make sure it runs at the best quality possible."

No BMAD.

No agents.

No epics.

No stories.

Just plan mode with Opus 4.6.

It spent a few minutes planning.

Didn't ask me a single question.

Then it built the entire game.

Plain JavaScript.

No TypeScript, no bundler.

Just an index.html and some script files.

I was skeptical.

I opened the browser.

It worked.

First try.

Every sound played correctly.

The cat looked like a cat.

The fish looked like a fish.

The cannon charged and fired.

The pendulum swung with proper physics.

No compilation errors.

No broken states.

The only fix needed was running a local web server instead of opening the HTML file directly.

And it looked better.

Significantly better.

Not because of some design methodology or emotional response mapping, but because the model just made reasonable visual choices on its own.

Turns out, when you don't spend hours telling an AI how to feel about the tuna cannon, it just builds the thing.

The trade-offs are real.

No tests.

No TypeScript.

No architecture documents.

Simple HTML with inline scripts.

A professional codebase this is not.

But the end user doesn't care about TypeScript.

They care that the game works, looks decent, and is fun to play.

Total time: under fifteen minutes.

Token usage: 4 percent of my five-hour budget.

I also tried different reasoning effort levels, plus Codex and Gemini -- but that's a different video.

I don't want to be completely unfair here.

BMAD does some things genuinely well.

The elicitation questions are useful if you're building something in a domain you don't understand.

The PM agent will ask questions you didn't know to ask.

Hit detection models, audio API choices, physics approaches, difficulty curves.

For someone learning game development for the first time, that guided discovery has real value.

I learned things from those questions.

It's a solid learning tool.

Spending hours with the brainstorming and design phases teaches you terminology, default patterns, and decision points you'd otherwise miss entirely.

If you're a non-technical person trying to build an application, the structured guidance can help you make informed decisions before you write a single line of code.

And the expansion packs could add real domain-specific value.

A Game Dev Studio module that actually knows Unity or Godot conventions might genuinely accelerate game development.

I didn't test it, but the concept is sound.

The negatives haven't changed from v5, though.

The process is still lengthy -- seven hours versus eight is not the improvement the changelog promised.

It's still single-threaded, one context window, no parallel execution.

It generates mountains of documentation that, let's be honest, nobody is going to read after the project ships.

And for a developer with defined tasks, plan mode, Superpowers, or Open Spec will get you to a working result in a fraction of the time.

The biggest gap is brownfield projects.

BMAD is designed for greenfield.

If you have an existing codebase and need to add a feature, the full BMAD ceremony makes no sense.

I'd code it faster myself than go through the BMAD ceremony.

You wouldn't summon a PM agent to change a button color or update some copy.

BMAD is too big for one video.

I skipped the expansion packs, didn't test Quick Flow, and never tried Party Mode even though it kept prompting me to.

If you think any of those deserve their own video, tell me in the comments.

Subscribe.

It genuinely helps the channel grow and means more honest reviews of tools like this.

And remember, if your cat watched you code for seven hours straight, she's not impressed.

She could have caught the tuna herself.
