* docs: deep audit — fix stale config keys, missing commands, and registry drift Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level user-guide, user-guide/features) against the live registries: hermes_cli/commands.py COMMAND_REGISTRY (slash commands) hermes_cli/auth.py PROVIDER_REGISTRY (providers) hermes_cli/config.py DEFAULT_CONFIG (config keys) toolsets.py TOOLSETS (toolsets) tools/registry.py get_all_tool_names() (tools) python -m hermes_cli.main <subcmd> --help (CLI args) reference/ - cli-commands.md: drop duplicate hermes fallback row + duplicate section, add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand lists to match --help output (status/logout/spotify, login, archive/prune/ list-archived). - slash-commands.md: add missing /sessions and /reload-skills entries + correct the cross-platform Notes line. - tools-reference.md: drop bogus '68 tools' headline, drop fictional 'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated), add missing 'kanban' and 'video' toolset sections, fix MCP example to use the real mcp_<server>_<tool> prefix. - toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser' row, add missing 'kanban' and 'video' toolset rows, drop the stale '38 tools' count for hermes-cli. - profile-commands.md: add missing install/update/info subcommands, document fish completion. - environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the one with the correct gmi-serving.com default). - faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just via OpenRouter), refresh the OpenAI model list. getting-started/ - installation.md: PortableGit (not MinGit) is what the Windows installer fetches; document the 32-bit MinGit fallback. - installation.md / termux.md: installer prefers .[termux-all] then falls back to .[termux]. - nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid 'nix flake update --flake' invocation. - updating.md: 'hermes backup restore --state pre-update' doesn't exist — point at the snapshot/quick-snapshot flow; correct config key 'updates.pre_update_backup' (was 'update.backup'). user-guide/ - configuration.md: api_max_retries default 3 (not 2); display.runtime_footer is the real key (not display.runtime_metadata_footer); checkpoints defaults enabled=false / max_snapshots=20 (not true / 50). - configuring-models.md: 'hermes model list' / 'hermes model set ...' don't exist — hermes model is interactive only. - tui.md: busy_indicator -> tui_status_indicator with values kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none). - security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env, not config.yaml. - windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the OpenAI-compatible API server runs inside hermes gateway. user-guide/features/ - computer-use.md: approvals.mode (not security.approval_level); fix broken ./browser-use.md link to ./browser.md. - fallback-providers.md: top-level fallback_providers (not model.fallback_providers); the picker is subcommand-based, not modal. - api-server.md: API_SERVER_* are env vars — write to per-profile .env, not 'hermes config set' which targets YAML. - web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl modes are exposed through web_extract. - kanban.md: failure_limit default is 2, not '~5'. - plugins.md: drop hard-coded '33 providers' count. - honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document that 'hermes honcho' subcommand is gated on memory.provider=honcho; reconcile subcommand list with actual --help output. - memory-providers.md: legacy 'hermes honcho setup' redirect documented. Verified via 'npm run build' — site builds cleanly; broken-link count went from 149 to 146 (no regressions, fixed a few in passing). * docs: round 2 audit fixes + regenerate skill catalogs Follow-up to the previous commit on this branch: Round 2 manual fixes: - quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY; voice-mode and ACP install commands rewritten — bare 'pip install ...' doesn't work for curl-installed setups (no pip on PATH, not in repo dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e ".[voice]"'. ACP already ships in [all] so the curl install includes it. - cli.md / configuration.md: 'auxiliary.compression.model' shown as 'google/gemini-3-flash-preview' (the doc's own claimed default); actual default is empty (= use main model). Reworded as 'leave empty (default) or pin a cheap model'. - built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row that was missing from the table. Regenerated skill catalogs: - ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill pages and both reference catalogs (skills-catalog.md, optional-skills-catalog.md). This adds the entries that were genuinely missing — productivity/teams-meeting-pipeline (bundled), optional/finance/* (entire category — 7 skills: 3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model, merger-model, pptx-author), creative/hyperframes, creative/kanban-video-orchestrator, devops/watchers, productivity/shop-app, research/searxng-search, apple/macos-computer-use — and rewrites every other per-skill page from the current SKILL.md. Most diffs are tiny (one line of refreshed metadata). Validation: - 'npm run build' succeeded. - Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation shells that lag every newly-added skill page (pre-existing pattern). No regressions on any en/ page.
12 KiB
| title | sidebar_label | description |
|---|---|---|
| Subagent Driven Development — Execute plans via delegate_task subagents (2-stage review) | Subagent Driven Development | Execute plans via delegate_task subagents (2-stage review) |
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
Subagent Driven Development
Execute plans via delegate_task subagents (2-stage review).
Skill metadata
| Source | Bundled (installed by default) |
| Path | skills/software-development/subagent-driven-development |
| Version | 1.1.0 |
| Author | Hermes Agent (adapted from obra/superpowers) |
| License | MIT |
| Platforms | linux, macos, windows |
| Tags | delegation, subagent, implementation, workflow, parallel |
| Related skills | writing-plans, requesting-code-review, test-driven-development |
Reference: full SKILL.md
:::info The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. :::
Subagent-Driven Development
Overview
Execute implementation plans by dispatching fresh subagents per task with systematic two-stage review.
Core principle: Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration.
When to Use
Use this skill when:
- You have an implementation plan (from writing-plans skill or user requirements)
- Tasks are mostly independent
- Quality and spec compliance are important
- You want automated review between tasks
vs. manual execution:
- Fresh context per task (no confusion from accumulated state)
- Automated review process catches issues early
- Consistent quality checks across all tasks
- Subagents can ask questions before starting work
The Process
1. Read and Parse Plan
Read the plan file. Extract ALL tasks with their full text and context upfront. Create a todo list:
# Read the plan
read_file("docs/plans/feature-plan.md")
# Create todo list with all tasks
todo([
{"id": "task-1", "content": "Create User model with email field", "status": "pending"},
{"id": "task-2", "content": "Add password hashing utility", "status": "pending"},
{"id": "task-3", "content": "Create login endpoint", "status": "pending"},
])
Key: Read the plan ONCE. Extract everything. Don't make subagents read the plan file — provide the full task text directly in context.
2. Per-Task Workflow
For EACH task in the plan:
Step 1: Dispatch Implementer Subagent
Use delegate_task with complete context:
delegate_task(
goal="Implement Task 1: Create User model with email and password_hash fields",
context="""
TASK FROM PLAN:
- Create: src/models/user.py
- Add User class with email (str) and password_hash (str) fields
- Use bcrypt for password hashing
- Include __repr__ for debugging
FOLLOW TDD:
1. Write failing test in tests/models/test_user.py
2. Run: pytest tests/models/test_user.py -v (verify FAIL)
3. Write minimal implementation
4. Run: pytest tests/models/test_user.py -v (verify PASS)
5. Run: pytest tests/ -q (verify no regressions)
6. Commit: git add -A && git commit -m "feat: add User model with password hashing"
PROJECT CONTEXT:
- Python 3.11, Flask app in src/app.py
- Existing models in src/models/
- Tests use pytest, run from project root
- bcrypt already in requirements.txt
""",
toolsets=['terminal', 'file']
)
Step 2: Dispatch Spec Compliance Reviewer
After the implementer completes, verify against the original spec:
delegate_task(
goal="Review if implementation matches the spec from the plan",
context="""
ORIGINAL TASK SPEC:
- Create src/models/user.py with User class
- Fields: email (str), password_hash (str)
- Use bcrypt for password hashing
- Include __repr__
CHECK:
- [ ] All requirements from spec implemented?
- [ ] File paths match spec?
- [ ] Function signatures match spec?
- [ ] Behavior matches expected?
- [ ] Nothing extra added (no scope creep)?
OUTPUT: PASS or list of specific spec gaps to fix.
""",
toolsets=['file']
)
If spec issues found: Fix gaps, then re-run spec review. Continue only when spec-compliant.
Step 3: Dispatch Code Quality Reviewer
After spec compliance passes:
delegate_task(
goal="Review code quality for Task 1 implementation",
context="""
FILES TO REVIEW:
- src/models/user.py
- tests/models/test_user.py
CHECK:
- [ ] Follows project conventions and style?
- [ ] Proper error handling?
- [ ] Clear variable/function names?
- [ ] Adequate test coverage?
- [ ] No obvious bugs or missed edge cases?
- [ ] No security issues?
OUTPUT FORMAT:
- Critical Issues: [must fix before proceeding]
- Important Issues: [should fix]
- Minor Issues: [optional]
- Verdict: APPROVED or REQUEST_CHANGES
""",
toolsets=['file']
)
If quality issues found: Fix issues, re-review. Continue only when approved.
Step 4: Mark Complete
todo([{"id": "task-1", "content": "Create User model with email field", "status": "completed"}], merge=True)
3. Final Review
After ALL tasks are complete, dispatch a final integration reviewer:
delegate_task(
goal="Review the entire implementation for consistency and integration issues",
context="""
All tasks from the plan are complete. Review the full implementation:
- Do all components work together?
- Any inconsistencies between tasks?
- All tests passing?
- Ready for merge?
""",
toolsets=['terminal', 'file']
)
4. Verify and Commit
# Run full test suite
pytest tests/ -q
# Review all changes
git diff --stat
# Final commit if needed
git add -A && git commit -m "feat: complete [feature name] implementation"
Task Granularity
Each task = 2-5 minutes of focused work.
Too big:
- "Implement user authentication system"
Right size:
- "Create User model with email and password fields"
- "Add password hashing function"
- "Create login endpoint"
- "Add JWT token generation"
- "Create registration endpoint"
Red Flags — Never Do These
- Start implementation without a plan
- Skip reviews (spec compliance OR code quality)
- Proceed with unfixed critical/important issues
- Dispatch multiple implementation subagents for tasks that touch the same files
- Make subagent read the plan file (provide full text in context instead)
- Skip scene-setting context (subagent needs to understand where the task fits)
- Ignore subagent questions (answer before letting them proceed)
- Accept "close enough" on spec compliance
- Skip review loops (reviewer found issues → implementer fixes → review again)
- Let implementer self-review replace actual review (both are needed)
- Start code quality review before spec compliance is PASS (wrong order)
- Move to next task while either review has open issues
Handling Issues
If Subagent Asks Questions
- Answer clearly and completely
- Provide additional context if needed
- Don't rush them into implementation
If Reviewer Finds Issues
- Implementer subagent (or a new one) fixes them
- Reviewer reviews again
- Repeat until approved
- Don't skip the re-review
If Subagent Fails a Task
- Dispatch a new fix subagent with specific instructions about what went wrong
- Don't try to fix manually in the controller session (context pollution)
Efficiency Notes
Why fresh subagent per task:
- Prevents context pollution from accumulated state
- Each subagent gets clean, focused context
- No confusion from prior tasks' code or reasoning
Why two-stage review:
- Spec review catches under/over-building early
- Quality review ensures the implementation is well-built
- Catches issues before they compound across tasks
Cost trade-off:
- More subagent invocations (implementer + 2 reviewers per task)
- But catches issues early (cheaper than debugging compounded problems later)
Integration with Other Skills
With writing-plans
This skill EXECUTES plans created by the writing-plans skill:
- User requirements → writing-plans → implementation plan
- Implementation plan → subagent-driven-development → working code
With test-driven-development
Implementer subagents should follow TDD:
- Write failing test first
- Implement minimal code
- Verify test passes
- Commit
Include TDD instructions in every implementer context.
With requesting-code-review
The two-stage review process IS the code review. For final integration review, use the requesting-code-review skill's review dimensions.
With systematic-debugging
If a subagent encounters bugs during implementation:
- Follow systematic-debugging process
- Find root cause before fixing
- Write regression test
- Resume implementation
Example Workflow
[Read plan: docs/plans/auth-feature.md]
[Create todo list with 5 tasks]
--- Task 1: Create User model ---
[Dispatch implementer subagent]
Implementer: "Should email be unique?"
You: "Yes, email must be unique"
Implementer: Implemented, 3/3 tests passing, committed.
[Dispatch spec reviewer]
Spec reviewer: ✅ PASS — all requirements met
[Dispatch quality reviewer]
Quality reviewer: ✅ APPROVED — clean code, good tests
[Mark Task 1 complete]
--- Task 2: Password hashing ---
[Dispatch implementer subagent]
Implementer: No questions, implemented, 5/5 tests passing.
[Dispatch spec reviewer]
Spec reviewer: ❌ Missing: password strength validation (spec says "min 8 chars")
[Implementer fixes]
Implementer: Added validation, 7/7 tests passing.
[Dispatch spec reviewer again]
Spec reviewer: ✅ PASS
[Dispatch quality reviewer]
Quality reviewer: Important: Magic number 8, extract to constant
Implementer: Extracted MIN_PASSWORD_LENGTH constant
Quality reviewer: ✅ APPROVED
[Mark Task 2 complete]
... (continue for all tasks)
[After all tasks: dispatch final integration reviewer]
[Run full test suite: all passing]
[Done!]
Remember
Fresh subagent per task
Two-stage review every time
Spec compliance FIRST
Code quality SECOND
Never skip reviews
Catch issues early
Quality is not an accident. It's the result of systematic process.
Further reading (load when relevant)
When the orchestration involves significant context usage, long review loops, or complex validation checkpoints, load these references for the specific discipline:
references/context-budget-discipline.md— Four-tier context degradation model (PEAK / GOOD / DEGRADING / POOR), read-depth rules that scale with context window size, and early warning signs of silent degradation. Load when a run will clearly consume significant context (multi-phase plans, many subagents, large artifacts).references/gates-taxonomy.md— The four canonical gate types (Pre-flight, Revision, Escalation, Abort) with behavior, recovery, and examples. Load when designing or reviewing any workflow that has validation checkpoints — use the vocabulary explicitly so each gate has defined entry, failure behavior, and resumption rules.
Both references adapted from gsd-build/get-shit-done (MIT © 2025 Lex Christopherson).