5.11.2026

gstack: A Virtual Engineering Team Inside Claude Code

So, I have a link to a GitHub repository whose owner, Garry Tan, claims that his productivity is now 810× what it was in 2013.

https://github.com/garrytan/gstack

I looked at the GitHub repository and, on seeing so many files, needed Claude's help to summarize its purpose. So I asked it:

 explain to me the essence of this toolkit. I only have Claude Code as you know. write your explanation to a md then html: https://github.com/garrytan/gstack


Below is the reply, neatly rendered as HTML.

 

gstack: A Virtual Engineering Team Inside Claude Code

Published: May 2026


The One-Line Idea

gstack is a free, open-source toolkit that gives Claude Code a set of specialized roles — CEO, architect, designer, security officer, QA lead, release engineer — so a single person can ship software the way a team does, but faster.

It was built by Garry Tan, the president of Y Combinator, who claims it helped him ship at roughly 810× the rate he did in 2013, while still running one of the world's top startup accelerators full-time.


Why It Exists

Claude Code on its own is already powerful. You describe a problem, and it writes code, fixes bugs, and edits files. But raw capability is not the same as a structured process, and a blank canvas is intimidating. gstack starts from the observation that professional software teams do not work randomly: they move through predictable phases (think, plan, build, review, test, ship, reflect).

Without a structure like that, even the best AI assistant just becomes an expensive autocomplete. gstack imposes the structure. It wraps Claude Code in a sprint workflow where each phase has its own "specialist" with a specific lens and a specific job.


The Core Workflow

Every feature or product goes through seven stages:

| Phase | What Happens |
| --- | --- |
| Think | /office-hours interrogates your idea with hard forcing questions before you write a line of code |
| Plan | CEO, engineer, designer, and DX reviews all weigh in on your plan from their own angle |
| Build | Claude Code writes the actual implementation |
| Review | A virtual staff engineer reviews for production-level bugs; a design reviewer audits visuals |
| Test | /qa opens a real Chromium browser, walks through your flows, fixes bugs, generates regression tests |
| Ship | /ship syncs main, runs tests, opens a PR; /land-and-deploy merges and verifies production |
| Reflect | /retro runs a weekly retrospective with per-contributor metrics |

The key insight is that each skill feeds its output into the next. The plan review's output shapes what gets built. The build's output becomes the review's input. You're not hopping between random prompts — you're moving through a pipeline.
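The hand-off described above can be pictured as a chain in which each stage receives everything produced before it and appends its own result. The sketch below is only an analogy in plain shell: the three stage functions and `run_pipeline` are hypothetical names invented for illustration, not gstack commands, and the real skills run inside Claude Code rather than as shell pipes.

```shell
# Hypothetical stand-ins for three phases; these are NOT gstack commands.
# Each stage copies the previous stages' output from stdin, then adds its own line.
plan_stage()   { cat; echo "plan: scope approved"; }
build_stage()  { cat; echo "build: implemented per plan"; }
review_stage() { cat; echo "review: no blocking issues"; }

run_pipeline() {
  # $1 is the initial idea; every later stage sees all earlier output.
  printf '%s\n' "idea: $1" | plan_stage | build_stage | review_stage
}

run_pipeline "dark mode toggle"
```

The point of the shape is that the review stage never starts from a blank prompt: it reads the idea, the plan, and the build notes that came before it.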


The 23 Skills, Grouped by What They Do

gstack ships 23 "skills" (slash commands) and 8 power tools. Here is how they cluster:

Planning layer — challenges assumptions before coding starts

  • /office-hours — Product interrogation. Forces you to defend what you are building and why.
  • /autoplan — Runs the CEO, design, engineering, and DX reviews automatically in one go.
  • /plan-ceo-review, /plan-eng-review, /plan-design-review, /plan-devex-review — Each specialist review on its own.

Design layer — goes from idea to working visual to production HTML

  • /design-consultation — Builds a design system from scratch.
  • /design-shotgun — Generates four to six distinct mockup variants so you can pick a direction rather than guess.
  • /design-html — Converts the approved mockup into production-ready HTML (30KB, zero dependencies, framework-aware).
  • /design-review — Live design audit; finds problems and applies fixes on the spot.

Code quality layer — catches what you missed

  • /review — Staff engineer level review with auto-fixes for obvious issues.
  • /investigate — Systematic root-cause debugging with hypothesis testing, not just stabs in the dark.
  • /cso — Chief Security Officer: scans for OWASP Top 10 vulnerabilities and runs STRIDE threat modeling.
  • /codex — Sends your code to OpenAI's Codex CLI for a cross-model second opinion.

Testing and deployment layer

  • /qa — The crown jewel. Opens a real Chromium browser, tests actual user flows, fixes bugs it finds, and writes regression tests for next time.
  • /qa-only — Same as above but report-only; no code changes.
  • /ship — Bootstraps test frameworks, audits coverage, and opens a pull request.
  • /land-and-deploy — Merge, CI, and production verification in one command.
  • /canary — Post-deploy monitoring: checks for console errors, performance regressions, page failures.
  • /benchmark — Measures Core Web Vitals and compares before/after.

Safety tools — guardrails so you do not break things accidentally

  • /careful — Warns you before destructive commands like rm -rf, DROP TABLE, or force-push.
  • /freeze — Restricts all file edits to a single directory so debugging one area cannot spill into another.
  • /guard — Activates both /careful and /freeze at once.
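To make the two guardrail ideas concrete, here is a minimal sketch of the kind of check each one implies: matching a command against a short list of destructive patterns, and testing whether a file path falls inside an allowed directory. This post does not show how gstack implements /careful or /freeze, so the function names `is_destructive` and `is_inside` and the pattern list are assumptions made for illustration.

```shell
# Illustrative only: not gstack's actual /careful or /freeze logic.

# Succeeds (status 0) if the command string matches a destructive pattern.
# The input is lowercased first so "DROP TABLE" and "drop table" both match.
is_destructive() {
  cmd=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$cmd" in
    *"rm -rf"*|*"drop table"*|*"push --force"*|*"push -f"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds (status 0) if path $2 lies under directory $1.
# Purely string-based: it does not resolve symlinks or ".." segments.
is_inside() {
  case "$2" in
    "$1"/*) return 0 ;;
    *) return 1 ;;
  esac
}

for c in "git status" "rm -rf build" "git push --force origin main"; do
  if is_destructive "$c"; then echo "WARN  $c"; else echo "ok    $c"; fi
done

if is_inside "src/auth" "src/billing/invoice.ts"; then
  echo "edit allowed"
else
  echo "edit blocked: outside src/auth"
fi
```

Substring matching like this is deliberately crude. A real guard would need to parse commands properly and normalize paths, but the sketch shows why the two checks are independent and why combining them, as /guard does, is a natural step.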

Utility layer

  • /browse — Real Chromium browser with roughly 100ms command latency.
  • /learn — Saves session learnings persistently so the next session starts smarter.
  • /document-release — Auto-updates README and project docs to match what just shipped.
  • /retro — Engineering retrospective with metrics.

GBrain: Memory Across Sessions

By default, Claude Code starts fresh every session. GBrain is gstack's optional persistent memory layer, backed by a local or cloud database.

Once set up, GBrain lets agents remember decisions made last week, design choices that were rejected, and project-specific patterns. You can also set trust tiers per repository: read-write, read-only, or no access.
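The three trust tiers amount to a small access policy. As a rough sketch, assuming the tiers are spelled `read-write`, `read-only`, and `none` (the last spelling and both function names are my own, not taken from GBrain), a deny-by-default check could look like this:

```shell
# Illustrative policy check. The tier labels follow the text above;
# the function names and the "none" spelling are assumptions.

can_read() {
  case "$1" in
    read-write|read-only) return 0 ;;
    *) return 1 ;;   # "none" and anything unrecognized are denied
  esac
}

can_write() {
  case "$1" in
    read-write) return 0 ;;
    *) return 1 ;;   # deny by default
  esac
}

tier="read-only"
can_read "$tier"  && echo "$tier: memory may be read"
can_write "$tier" || echo "$tier: memory may not be written"
```

Denying anything unrecognized means a typo in a tier name fails closed rather than silently granting access.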

Three ways to run it:

  • Supabase cloud — syncs across machines, takes about ninety seconds to provision automatically
  • PGLite local — zero accounts, zero network, everything stays on your machine
  • Existing Supabase — paste your Session Pooler URL and go

What Makes It Different From Just Prompting

Most people who use Claude Code write prompts like "refactor this function" or "fix this bug." That works for isolated tasks. gstack is doing something different: it is encoding the judgment of experienced engineers into repeatable, composable steps.

The /design-shotgun → /design-html → /qa → /ship sequence, for example, closes the loop between having a visual idea and getting something running in production — with testing and review built in between. It does not just generate code; it runs the code in a real browser and validates that it works before handing it off.

The other thing worth noting is that /codex deliberately routes a cross-check to a different AI model (OpenAI's Codex) for a second opinion. A single model reviewing its own output tends to share its own blind spots; a second model acting as a challenger can catch different classes of errors.


Practical Notes If You Want to Try It

Requirements: Claude Code, Git, and Bun v1.0+. On Windows, you also need Node.js because of a Playwright bug with Bun's pipe transport.
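Before installing, it may be worth confirming the prerequisites are on your PATH. The helper below is a small sketch, not part of gstack. It assumes the Claude Code CLI is installed as `claude` and Bun as `bun`, and `missing_tools` is a name made up here. It only checks that each tool exists; it does not verify that Bun is v1.0 or newer.

```shell
# Print each required tool that is not on PATH; prints nothing if all are found.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed binary names: "claude" for Claude Code, "bun" for Bun.
missing=$(missing_tools claude git bun)
if [ -n "$missing" ]; then
  echo "Missing before install:" $missing
else
  echo "All prerequisites found."
fi
```

On Windows you would add `node` to the list, per the note above.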

Install (30 seconds):

```
git clone --single-branch --depth 1 https://github.com/garrytan/gstack.git ~/.claude/skills/gstack \
  && cd ~/.claude/skills/gstack \
  && ./setup
```

Once installed, /gstack-qa runs the browser testing skill. You can shorten it to just /qa with ./setup --no-prefix.

License: MIT. Completely free. No premium tier.


Quick Reference

| Skill | When to Use It |
| --- | --- |
| /office-hours | Before you write anything, to stress-test the idea |
| /autoplan | To get all plan reviews in one pass |
| /design-shotgun | When you need visual options, not a blank prompt |
| /review | After every significant code change |
| /cso | Before any public release |
| /qa | Before opening a PR |
| /ship | When you are ready to open a PR |
| /land-and-deploy | When you are ready to merge to production |
| /careful | Always, as a habit |
| /investigate | When you have a bug you cannot explain |

Key Things to Remember

  • gstack is not a replacement for knowing what you are building. /office-hours exists precisely because the tool assumes you might not have fully thought it through yet. Use it first, not last.
  • The browser testing skill (/qa) is the most unusual thing here. Most AI coding tools stop at generating code. gstack actually runs it in a real browser and verifies behavior. That gap is enormous in practice.
  • You do not need all 23 skills. Start with /review and /qa. Add the planning layer once you have a feel for the workflow.
  • GBrain is optional but valuable. If you find yourself re-explaining your project context at the start of every session, GBrain is the fix.
  • The 810× productivity claim is for Garry Tan specifically, coding at his pace, in his codebase. Your mileage will vary. The structure itself is the real gift — it gives AI assistance a shape that mirrors how professional teams actually work.