From 0f6eabb89073e3215daffb0bc1c4d95401be7b3b Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Thu, 23 Apr 2026 22:22:11 -0700
Subject: [PATCH] docs(website): dedicated page per bundled + optional skill (#14929)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Generates a full dedicated Docusaurus page for every one of the 132 skills
(73 bundled + 59 optional) under
website/docs/user-guide/skills/{bundled,optional}//. Each page carries the
skill's description, metadata (version, author, license, dependencies,
platform gating, tags, related skills cross-linked to their own pages), and
the complete SKILL.md body that Hermes loads at runtime.

Previously the two catalog pages just listed skills with a one-line blurb
and no way to see what the skill actually did — users had to go read the
source repo. Now every skill has a browsable, searchable, cross-linked
reference in the docs.

- website/scripts/generate-skill-docs.py — generator that reads skills/ and
  optional-skills/, writes per-skill pages, regenerates both catalog
  indexes, and rewrites the Skills section of sidebars.ts. Handles MDX
  escaping (outside fenced code blocks: curly braces, unsafe HTML-ish tags)
  and rewrites relative references/*.md links to point at the GitHub source.
- website/docs/reference/skills-catalog.md — regenerated; each row links to
  the new dedicated page.
- website/docs/reference/optional-skills-catalog.md — same.
- website/sidebars.ts — Skills section now has Bundled / Optional subtrees
  with one nested category per skill folder.
- .github/workflows/{docs-site-checks,deploy-site}.yml — run the generator
  before `docusaurus build` so CI stays in sync with the source SKILL.md
  files.

Build verified locally with `npx docusaurus build`. Only remaining warnings
are pre-existing broken link/anchor issues in unrelated pages.
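The MDX-escaping behavior described above (escape curly braces and unsafe HTML-ish tags, but only outside fenced code blocks) can be sketched roughly as follows. This is a minimal illustration, not the actual generate-skill-docs.py code: the function name `escape_mdx` and the safe-tag list are hypothetical, and the real generator's rules may differ.

```python
import re

def escape_mdx(markdown: str) -> str:
    """Escape MDX-hostile characters in a Markdown string, leaving
    fenced code blocks untouched. Hypothetical sketch only."""
    out, in_fence = [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # toggle on every fence marker
            out.append(line)
            continue
        if in_fence:
            out.append(line)  # code block content passes through verbatim
        else:
            # Curly braces are JSX expressions in MDX; escape them.
            escaped = line.replace("{", "\\{").replace("}", "\\}")
            # Angle-bracket tokens like <placeholder> parse as JSX tags;
            # neutralize any '<' not opening a known-safe HTML tag
            # (the safe list here is an assumption for illustration).
            escaped = re.sub(r"<(?!/?(?:br|hr|code|pre)\b)", "&lt;", escaped)
            out.append(escaped)
    return "\n".join(out)
```

Skipping fenced blocks matters because SKILL.md bodies are full of shell and Python snippets whose braces and angle brackets must render literally.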
---
 .github/workflows/deploy-site.yml | 3 +
 .github/workflows/docs-site-checks.yml | 3 +
 .../docs/reference/optional-skills-catalog.md | 159 +-
 website/docs/reference/skills-catalog.md | 303 +--
 .../skills/bundled/apple/apple-apple-notes.md | 106 +
 .../bundled/apple/apple-apple-reminders.md | 114 +
 .../skills/bundled/apple/apple-findmy.md | 149 +
 .../skills/bundled/apple/apple-imessage.md | 118 +
 .../autonomous-ai-agents-claude-code.md | 762 ++++++
 .../autonomous-ai-agents-codex.md | 131 +
 .../autonomous-ai-agents-hermes-agent.md | 722 +++++
 .../autonomous-ai-agents-opencode.md | 236 ++
 .../creative/creative-architecture-diagram.md | 164 ++
 .../bundled/creative/creative-ascii-art.md | 337 +++
 .../bundled/creative/creative-ascii-video.md | 252 ++
 .../bundled/creative/creative-baoyu-comic.md | 263 ++
 .../creative/creative-baoyu-infographic.md | 253 ++
 .../creative/creative-creative-ideation.md | 162 ++
 .../bundled/creative/creative-design-md.md | 214 ++
 .../bundled/creative/creative-excalidraw.md | 207 ++
 .../bundled/creative/creative-manim-video.md | 284 ++
 .../skills/bundled/creative/creative-p5js.md | 565 ++++
 .../bundled/creative/creative-pixel-art.md | 232 ++
 .../creative/creative-popular-web-designs.md | 212 ++
 .../creative-songwriting-and-ai-music.md | 297 ++
 .../data-science-jupyter-live-kernel.md | 183 ++
 .../devops/devops-webhook-subscriptions.md | 221 ++
 .../skills/bundled/dogfood/dogfood-dogfood.md | 178 ++
 .../skills/bundled/email/email-himalaya.md | 293 ++
 .../gaming/gaming-minecraft-modpack-server.md | 205 ++
 .../bundled/gaming/gaming-pokemon-player.md | 235 ++
 .../github/github-codebase-inspection.md | 131 +
 .../bundled/github/github-github-auth.md | 264 ++
 .../github/github-github-code-review.md | 498 ++++
 .../bundled/github/github-github-issues.md | 387 +++
 .../github/github-github-pr-workflow.md | 384 +++
 .../github/github-github-repo-management.md | 533 ++++
 .../skills/bundled/mcp/mcp-native-mcp.md | 374 +++
 .../skills/bundled/media/media-gif-search.md | 101 +
 .../skills/bundled/media/media-heartmula.md | 188 ++
 .../skills/bundled/media/media-songsee.md | 97 +
 .../bundled/media/media-youtube-content.md | 88 +
 .../mlops-evaluation-lm-evaluation-harness.md | 507 ++++
 .../mlops-evaluation-weights-and-biases.md | 608 +++++
 .../bundled/mlops/mlops-huggingface-hub.md | 99 +
 .../mlops/mlops-inference-llama-cpp.md | 266 ++
 .../mlops/mlops-inference-obliteratus.md | 348 +++
 .../bundled/mlops/mlops-inference-outlines.md | 670 +++++
 .../bundled/mlops/mlops-inference-vllm.md | 381 +++
 .../bundled/mlops/mlops-models-audiocraft.md | 584 ++++
 .../mlops/mlops-models-segment-anything.md | 520 ++++
 .../bundled/mlops/mlops-research-dspy.md | 608 +++++
 .../bundled/mlops/mlops-training-axolotl.md | 176 ++
 .../mlops/mlops-training-trl-fine-tuning.md | 476 ++++
 .../bundled/mlops/mlops-training-unsloth.md | 97 +
 .../note-taking/note-taking-obsidian.md | 86 +
 .../productivity-google-workspace.md | 296 ++
 .../productivity/productivity-linear.md | 312 +++
 .../bundled/productivity/productivity-maps.md | 209 ++
 .../productivity/productivity-nano-pdf.md | 68 +
 .../productivity/productivity-notion.md | 186 ++
 .../productivity-ocr-and-documents.md | 189 ++
 .../productivity/productivity-powerpoint.md | 252 ++
 .../red-teaming/red-teaming-godmode.md | 421 +++
 .../skills/bundled/research/research-arxiv.md | 299 +++
 .../bundled/research/research-blogwatcher.md | 151 ++
 .../bundled/research/research-llm-wiki.md | 523 ++++
 .../bundled/research/research-polymarket.md | 95 +
 .../research-research-paper-writing.md | 2389 +++++++++++++++++
 .../bundled/smart-home/smart-home-openhue.md | 123 +
 .../bundled/social-media/social-media-xurl.md | 428 +++
 .../software-development-plan.md | 75 +
 ...ware-development-requesting-code-review.md | 297 ++
 ...development-subagent-driven-development.md | 360 +++
 ...ftware-development-systematic-debugging.md | 384 +++
 ...are-development-test-driven-development.md | 360 +++
 .../software-development-writing-plans.md | 314 +++
 website/docs/user-guide/skills/godmode.md | 2 +-
 .../autonomous-ai-agents-blackbox.md | 161 ++
 .../autonomous-ai-agents-honcho.md | 445 +++
 .../optional/blockchain/blockchain-base.md | 248 ++
 .../optional/blockchain/blockchain-solana.md | 224 ++
 .../communication-one-three-one-rule.md | 113 +
 .../optional/creative/creative-blender-mcp.md | 134 +
 .../creative/creative-concept-diagrams.md | 378 +++
 .../creative/creative-meme-generation.md | 146 +
 .../creative/creative-touchdesigner-mcp.md | 356 +++
 .../skills/optional/devops/devops-cli.md | 172 ++
 .../devops/devops-docker-management.md | 296 ++
 .../dogfood/dogfood-adversarial-ux-test.md | 208 ++
 .../skills/optional/email/email-agentmail.md | 142 +
 .../health/health-fitness-nutrition.md | 257 ++
 .../optional/health/health-neuroskill-bci.md | 469 ++++
 .../skills/optional/mcp/mcp-fastmcp.md | 314 +++
 .../skills/optional/mcp/mcp-mcporter.md | 137 +
 .../migration/migration-openclaw-migration.md | 315 +++
 .../skills/optional/mlops/mlops-accelerate.md | 349 +++
 .../skills/optional/mlops/mlops-chroma.md | 424 +++
 .../skills/optional/mlops/mlops-clip.md | 271 ++
 .../skills/optional/mlops/mlops-faiss.md | 239 ++
 .../optional/mlops/mlops-flash-attention.md | 384 +++
 .../skills/optional/mlops/mlops-guidance.md | 590 ++++
 .../mlops-hermes-atropos-environments.md | 320 +++
 .../mlops/mlops-huggingface-tokenizers.md | 534 ++++
 .../skills/optional/mlops/mlops-instructor.md | 758 ++++++
 .../optional/mlops/mlops-lambda-labs.md | 565 ++++
 .../skills/optional/mlops/mlops-llava.md | 322 +++
 .../skills/optional/mlops/mlops-modal.md | 361 +++
 .../optional/mlops/mlops-nemo-curator.md | 400 +++
 .../skills/optional/mlops/mlops-peft.md | 451 ++++
 .../skills/optional/mlops/mlops-pinecone.md | 376 +++
 .../optional/mlops/mlops-pytorch-fsdp.md | 144 +
 .../optional/mlops/mlops-pytorch-lightning.md | 364 +++
 .../skills/optional/mlops/mlops-qdrant.md | 513 ++++
 .../skills/optional/mlops/mlops-saelens.md | 406 +++
 .../skills/optional/mlops/mlops-simpo.md | 236 ++
 .../skills/optional/mlops/mlops-slime.md | 483 ++++
 .../optional/mlops/mlops-stable-diffusion.md | 539 ++++
 .../optional/mlops/mlops-tensorrt-llm.md | 205 ++
 .../skills/optional/mlops/mlops-torchtitan.md | 377 +++
 .../skills/optional/mlops/mlops-whisper.md | 335 +++
 .../productivity/productivity-canvas.md | 113 +
 .../productivity-memento-flashcards.md | 336 +++
 .../productivity/productivity-siyuan.md | 304 +++
 .../productivity/productivity-telephony.md | 434 +++
 .../research/research-bioinformatics.md | 252 ++
 .../research/research-domain-intel.md | 116 +
 .../research/research-drug-discovery.md | 236 ++
 .../research/research-duckduckgo-search.md | 254 ++
 .../research/research-gitnexus-explorer.md | 231 ++
 .../research/research-parallel-cli.md | 408 +++
 .../skills/optional/research/research-qmd.md | 459 ++++
 .../optional/research/research-scrapling.md | 350 +++
 .../optional/security/security-1password.md | 172 ++
 .../security/security-oss-forensics.md | 424 +++
 .../optional/security/security-sherlock.md | 207 ++
 .../web-development-page-agent.md | 206 ++
 website/scripts/generate-skill-docs.py | 714 +++++
 website/sidebars.ts | 385 +++
 139 files changed, 43523 insertions(+), 306 deletions(-)
 create mode 100644 website/docs/user-guide/skills/bundled/apple/apple-apple-notes.md
 create mode 100644 website/docs/user-guide/skills/bundled/apple/apple-apple-reminders.md
 create mode 100644 website/docs/user-guide/skills/bundled/apple/apple-findmy.md
 create mode 100644 website/docs/user-guide/skills/bundled/apple/apple-imessage.md
 create mode 100644 website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code.md
 create mode 100644 website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex.md
 create mode 100644 website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent.md
 create mode 100644 website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-opencode.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-architecture-diagram.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-ascii-art.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-ascii-video.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-baoyu-comic.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-baoyu-infographic.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-creative-ideation.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-design-md.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-excalidraw.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-manim-video.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-p5js.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-pixel-art.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-popular-web-designs.md
 create mode 100644 website/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music.md
 create mode 100644 website/docs/user-guide/skills/bundled/data-science/data-science-jupyter-live-kernel.md
 create mode 100644 website/docs/user-guide/skills/bundled/devops/devops-webhook-subscriptions.md
 create mode 100644 website/docs/user-guide/skills/bundled/dogfood/dogfood-dogfood.md
 create mode 100644 website/docs/user-guide/skills/bundled/email/email-himalaya.md
 create mode 100644 website/docs/user-guide/skills/bundled/gaming/gaming-minecraft-modpack-server.md
 create mode 100644 website/docs/user-guide/skills/bundled/gaming/gaming-pokemon-player.md
 create mode 100644 website/docs/user-guide/skills/bundled/github/github-codebase-inspection.md
 create mode 100644 website/docs/user-guide/skills/bundled/github/github-github-auth.md
 create mode 100644 website/docs/user-guide/skills/bundled/github/github-github-code-review.md
 create mode 100644 website/docs/user-guide/skills/bundled/github/github-github-issues.md
 create mode 100644 website/docs/user-guide/skills/bundled/github/github-github-pr-workflow.md
 create mode 100644 website/docs/user-guide/skills/bundled/github/github-github-repo-management.md
 create mode 100644 website/docs/user-guide/skills/bundled/mcp/mcp-native-mcp.md
 create mode 100644 website/docs/user-guide/skills/bundled/media/media-gif-search.md
 create mode 100644 website/docs/user-guide/skills/bundled/media/media-heartmula.md
 create mode 100644 website/docs/user-guide/skills/bundled/media/media-songsee.md
 create mode 100644 website/docs/user-guide/skills/bundled/media/media-youtube-content.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-evaluation-lm-evaluation-harness.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-evaluation-weights-and-biases.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-huggingface-hub.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-inference-llama-cpp.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-inference-obliteratus.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-inference-outlines.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-inference-vllm.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-models-audiocraft.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-models-segment-anything.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-research-dspy.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-training-axolotl.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-training-trl-fine-tuning.md
 create mode 100644 website/docs/user-guide/skills/bundled/mlops/mlops-training-unsloth.md
 create mode 100644 website/docs/user-guide/skills/bundled/note-taking/note-taking-obsidian.md
 create mode 100644 website/docs/user-guide/skills/bundled/productivity/productivity-google-workspace.md
 create mode 100644 website/docs/user-guide/skills/bundled/productivity/productivity-linear.md
 create mode 100644 website/docs/user-guide/skills/bundled/productivity/productivity-maps.md
 create mode 100644 website/docs/user-guide/skills/bundled/productivity/productivity-nano-pdf.md
 create mode 100644 website/docs/user-guide/skills/bundled/productivity/productivity-notion.md
 create mode 100644 website/docs/user-guide/skills/bundled/productivity/productivity-ocr-and-documents.md
 create mode 100644 website/docs/user-guide/skills/bundled/productivity/productivity-powerpoint.md
 create mode 100644 website/docs/user-guide/skills/bundled/red-teaming/red-teaming-godmode.md
 create mode 100644 website/docs/user-guide/skills/bundled/research/research-arxiv.md
 create mode 100644 website/docs/user-guide/skills/bundled/research/research-blogwatcher.md
 create mode 100644 website/docs/user-guide/skills/bundled/research/research-llm-wiki.md
 create mode 100644 website/docs/user-guide/skills/bundled/research/research-polymarket.md
 create mode 100644 website/docs/user-guide/skills/bundled/research/research-research-paper-writing.md
 create mode 100644 website/docs/user-guide/skills/bundled/smart-home/smart-home-openhue.md
 create mode 100644 website/docs/user-guide/skills/bundled/social-media/social-media-xurl.md
 create mode 100644 website/docs/user-guide/skills/bundled/software-development/software-development-plan.md
 create mode 100644 website/docs/user-guide/skills/bundled/software-development/software-development-requesting-code-review.md
 create mode 100644 website/docs/user-guide/skills/bundled/software-development/software-development-subagent-driven-development.md
 create mode 100644 website/docs/user-guide/skills/bundled/software-development/software-development-systematic-debugging.md
 create mode 100644 website/docs/user-guide/skills/bundled/software-development/software-development-test-driven-development.md
 create mode 100644 website/docs/user-guide/skills/bundled/software-development/software-development-writing-plans.md
 create mode 100644 website/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-blackbox.md
 create mode 100644 website/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-honcho.md
 create mode 100644 website/docs/user-guide/skills/optional/blockchain/blockchain-base.md
 create mode 100644 website/docs/user-guide/skills/optional/blockchain/blockchain-solana.md
 create mode 100644 website/docs/user-guide/skills/optional/communication/communication-one-three-one-rule.md
 create mode 100644 website/docs/user-guide/skills/optional/creative/creative-blender-mcp.md
 create mode 100644 website/docs/user-guide/skills/optional/creative/creative-concept-diagrams.md
 create mode 100644 website/docs/user-guide/skills/optional/creative/creative-meme-generation.md
 create mode 100644 website/docs/user-guide/skills/optional/creative/creative-touchdesigner-mcp.md
 create mode 100644 website/docs/user-guide/skills/optional/devops/devops-cli.md
 create mode 100644 website/docs/user-guide/skills/optional/devops/devops-docker-management.md
 create mode 100644 website/docs/user-guide/skills/optional/dogfood/dogfood-adversarial-ux-test.md
 create mode 100644 website/docs/user-guide/skills/optional/email/email-agentmail.md
 create mode 100644 website/docs/user-guide/skills/optional/health/health-fitness-nutrition.md
 create mode 100644 website/docs/user-guide/skills/optional/health/health-neuroskill-bci.md
 create mode 100644 website/docs/user-guide/skills/optional/mcp/mcp-fastmcp.md
 create mode 100644 website/docs/user-guide/skills/optional/mcp/mcp-mcporter.md
 create mode 100644 website/docs/user-guide/skills/optional/migration/migration-openclaw-migration.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-accelerate.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-chroma.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-clip.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-faiss.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-flash-attention.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-guidance.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-hermes-atropos-environments.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-huggingface-tokenizers.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-instructor.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-lambda-labs.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-llava.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-modal.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-nemo-curator.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-peft.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-pinecone.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-pytorch-fsdp.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-pytorch-lightning.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-qdrant.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-saelens.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-simpo.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-slime.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-stable-diffusion.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-tensorrt-llm.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-torchtitan.md
 create mode 100644 website/docs/user-guide/skills/optional/mlops/mlops-whisper.md
 create mode 100644 website/docs/user-guide/skills/optional/productivity/productivity-canvas.md
 create mode 100644 website/docs/user-guide/skills/optional/productivity/productivity-memento-flashcards.md
 create mode 100644 website/docs/user-guide/skills/optional/productivity/productivity-siyuan.md
 create mode 100644 website/docs/user-guide/skills/optional/productivity/productivity-telephony.md
 create mode 100644 website/docs/user-guide/skills/optional/research/research-bioinformatics.md
 create mode 100644 website/docs/user-guide/skills/optional/research/research-domain-intel.md
 create mode 100644 website/docs/user-guide/skills/optional/research/research-drug-discovery.md
 create mode 100644 website/docs/user-guide/skills/optional/research/research-duckduckgo-search.md
 create mode 100644 website/docs/user-guide/skills/optional/research/research-gitnexus-explorer.md
 create mode 100644 website/docs/user-guide/skills/optional/research/research-parallel-cli.md
 create mode 100644 website/docs/user-guide/skills/optional/research/research-qmd.md
 create mode 100644 website/docs/user-guide/skills/optional/research/research-scrapling.md
 create mode 100644 website/docs/user-guide/skills/optional/security/security-1password.md
 create mode 100644 website/docs/user-guide/skills/optional/security/security-oss-forensics.md
 create mode 100644 website/docs/user-guide/skills/optional/security/security-sherlock.md
 create mode 100644 website/docs/user-guide/skills/optional/web-development/web-development-page-agent.md
 create mode 100755 website/scripts/generate-skill-docs.py

diff --git a/.github/workflows/deploy-site.yml b/.github/workflows/deploy-site.yml
index 3e78bc61b..67f557bad 100644
--- a/.github/workflows/deploy-site.yml
+++ b/.github/workflows/deploy-site.yml
@@ -53,6 +53,9 @@ jobs:
       - name: Extract skill metadata for dashboard
         run: python3 website/scripts/extract-skills.py
 
+      - name: Regenerate per-skill docs pages + catalogs
+        run: python3 website/scripts/generate-skill-docs.py
+
       - name: Build skills index (if not already present)
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
diff --git a/.github/workflows/docs-site-checks.yml b/.github/workflows/docs-site-checks.yml
index 2f985122c..80fe9ea9d 100644
--- a/.github/workflows/docs-site-checks.yml
+++ b/.github/workflows/docs-site-checks.yml
@@ -36,6 +36,9 @@ jobs:
      - name: Extract skill metadata for dashboard
        run: python3 website/scripts/extract-skills.py
 
+     - name: Regenerate per-skill docs pages + catalogs
+       run: python3 website/scripts/generate-skill-docs.py
+
      - name: Lint docs diagrams
        run: npm run lint:diagrams
        working-directory: website
diff --git a/website/docs/reference/optional-skills-catalog.md b/website/docs/reference/optional-skills-catalog.md
index ab48e036d..53b50a641 100644
--- a/website/docs/reference/optional-skills-catalog.md
+++ b/website/docs/reference/optional-skills-catalog.md
@@ -6,7 +6,7 @@ description: "Official optional skills shipped with hermes-agent — install via
 
 # Optional Skills Catalog
 
-Official optional skills ship with the hermes-agent repository under `optional-skills/` but are **not active by default**. Install them explicitly:
+Optional skills ship with hermes-agent under `optional-skills/` but are **not active by default**. Install them explicitly:
 
 ```bash
 hermes skills install official//
@@ -19,7 +19,7 @@ hermes skills install official/blockchain/solana
 hermes skills install official/mlops/flash-attention
 ```
 
-Once installed, the skill appears in the agent's skill list and can be loaded automatically when relevant tasks are detected.
+Each skill below links to a dedicated page with its full definition, setup, and usage.
 
 To uninstall:
 
@@ -27,136 +27,139 @@
 hermes skills uninstall 
 ```
 
----
-
-## Autonomous AI Agents
+## autonomous-ai-agents
 
 | Skill | Description |
 |-------|-------------|
-| **blackbox** | Delegate coding tasks to Blackbox AI CLI agent. Multi-model agent with built-in judge that runs tasks through multiple LLMs and picks the best result. |
-| **honcho** | Configure and use Honcho memory with Hermes — cross-session user modeling, multi-profile peer isolation, observation config, and dialectic reasoning. |
+| [**blackbox**](/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-blackbox) | Delegate coding tasks to Blackbox AI CLI agent. Multi-model agent with built-in judge that runs tasks through multiple LLMs and picks the best result. Requires the blackbox CLI and a Blackbox AI API key. |
+| [**honcho**](/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-honcho) | Configure and use Honcho memory with Hermes -- cross-session user modeling, multi-profile peer isolation, observation config, dialectic reasoning, session summaries, and context budget enforcement. Use when setting up Honcho, troubleshoo... |
 
-## Blockchain
+## blockchain
 
 | Skill | Description |
 |-------|-------------|
-| **base** | Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection, whale detection, and live network stats. No API key required. |
-| **solana** | Query Solana blockchain data with USD pricing — wallet balances, token portfolios, transaction details, NFTs, whale detection, and live network stats. No API key required. |
+| [**base**](/docs/user-guide/skills/optional/blockchain/blockchain-base) | Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection, whale detection, and live network stats. Uses Base RPC + CoinGecko. No API key required. |
+| [**solana**](/docs/user-guide/skills/optional/blockchain/blockchain-solana) | Query Solana blockchain data with USD pricing — wallet balances, token portfolios with values, transaction details, NFTs, whale detection, and live network stats. Uses Solana RPC + CoinGecko. No API key required. |
 
-## Communication
+## communication
 
 | Skill | Description |
 |-------|-------------|
-| **one-three-one-rule** | Structured communication framework for proposals and decision-making. |
+| [**one-three-one-rule**](/docs/user-guide/skills/optional/communication/communication-one-three-one-rule) | Structured decision-making framework for technical proposals and trade-off analysis. When the user faces a choice between multiple approaches (architecture decisions, tool selection, refactoring strategies, migration paths), this skill p... |
 
-## Creative
+## creative
 
 | Skill | Description |
 |-------|-------------|
-| **blender-mcp** | Control Blender directly from Hermes via socket connection to the blender-mcp addon. Create 3D objects, materials, animations, and run arbitrary Blender Python (bpy) code. |
-| **concept-diagrams** | Generate flat, minimal light/dark-aware SVG diagrams as standalone HTML files, using a unified educational visual language (9 semantic color ramps, automatic dark mode). Best for physics setups, chemistry mechanisms, math curves, physical objects (aircraft, turbines, smartphones), floor plans, cross-sections, lifecycle/process narratives, and hub-spoke system diagrams. Ships with 15 example diagrams. |
-| **meme-generation** | Generate real meme images by picking a template and overlaying text with Pillow. Produces actual `.png` meme files. |
-| **touchdesigner-mcp** | Control a running TouchDesigner instance via the twozero MCP plugin — create operators, set parameters, wire connections, execute Python, build real-time audio-reactive visuals and GLSL networks. 36 native tools. |
+| [**blender-mcp**](/docs/user-guide/skills/optional/creative/creative-blender-mcp) | Control Blender directly from Hermes via socket connection to the blender-mcp addon. Create 3D objects, materials, animations, and run arbitrary Blender Python (bpy) code. Use when user wants to create or modify anything in Blender. |
+| [**concept-diagrams**](/docs/user-guide/skills/optional/creative/creative-concept-diagrams) | Generate flat, minimal light/dark-aware SVG diagrams as standalone HTML files, using a unified educational visual language with 9 semantic color ramps, sentence-case typography, and automatic dark mode. Best suited for educational and no... |
+| [**meme-generation**](/docs/user-guide/skills/optional/creative/creative-meme-generation) | Generate real meme images by picking a template and overlaying text with Pillow. Produces actual .png meme files. |
+| [**touchdesigner-mcp**](/docs/user-guide/skills/optional/creative/creative-touchdesigner-mcp) | Control a running TouchDesigner instance via twozero MCP — create operators, set parameters, wire connections, execute Python, build real-time visuals. 36 native tools. |
 
-## Dogfood
+## devops
 
 | Skill | Description |
 |-------|-------------|
-| **adversarial-ux-test** | Roleplay the most difficult, tech-resistant user for a product — browse in-persona, rant, then filter through a RED/YELLOW/WHITE/GREEN pragmatism layer so only real UX friction becomes tickets. |
+| [**inference-sh-cli**](/docs/user-guide/skills/optional/devops/devops-cli) | Run 150+ AI apps via inference.sh CLI (infsh) — image generation, video creation, LLMs, search, 3D, social automation. Uses the terminal tool. Triggers: inference.sh, infsh, ai apps, flux, veo, image generation, video generation, seedrea... |
+| [**docker-management**](/docs/user-guide/skills/optional/devops/devops-docker-management) | Manage Docker containers, images, volumes, networks, and Compose stacks — lifecycle ops, debugging, cleanup, and Dockerfile optimization. |
 
-## DevOps
+## dogfood
 
 | Skill | Description |
 |-------|-------------|
-| **cli** | Run 150+ AI apps via inference.sh CLI (infsh) — image generation, video creation, LLMs, search, 3D, and social automation. |
-| **docker-management** | Manage Docker containers, images, volumes, networks, and Compose stacks — lifecycle ops, debugging, cleanup, and Dockerfile optimization. |
+| [**adversarial-ux-test**](/docs/user-guide/skills/optional/dogfood/dogfood-adversarial-ux-test) | Roleplay the most difficult, tech-resistant user for your product. Browse the app as that persona, find every UX pain point, then filter complaints through a pragmatism layer to separate real problems from noise. Creates actionable ticke... |
 
-## Email
+## email
 
 | Skill | Description |
 |-------|-------------|
-| **agentmail** | Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses. |
+| [**agentmail**](/docs/user-guide/skills/optional/email/email-agentmail) | Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses (e.g. hermes-agent@agentmail.to). |
 
-## Health
+## health
 
 | Skill | Description |
 |-------|-------------|
-| **fitness-nutrition** | Gym workout planner and nutrition tracker. Search 690+ exercises by muscle, equipment, or category via wger. Look up macros and calories for 380,000+ foods via USDA FoodData Central. Computes BMI, TDEE, one-rep max, macro splits, and body fat — pure Python, no pip installs. |
-| **neuroskill-bci** | Brain-Computer Interface (BCI) integration for neuroscience research workflows. |
+| [**fitness-nutrition**](/docs/user-guide/skills/optional/health/health-fitness-nutrition) | Gym workout planner and nutrition tracker. Search 690+ exercises by muscle, equipment, or category via wger. Look up macros and calories for 380,000+ foods via USDA FoodData Central. Compute BMI, TDEE, one-rep max, macro splits, and body... |
+| [**neuroskill-bci**](/docs/user-guide/skills/optional/health/health-neuroskill-bci) | Connect to a running NeuroSkill instance and incorporate the user's real-time cognitive and emotional state (focus, relaxation, mood, cognitive load, drowsiness, heart rate, HRV, sleep staging, and 40+ derived EXG scores) into responses.... |
 
-## MCP
+## mcp
 
 | Skill | Description |
 |-------|-------------|
-| **fastmcp** | Build, test, inspect, install, and deploy MCP servers with FastMCP in Python. Covers wrapping APIs or databases as MCP tools, exposing resources or prompts, and deployment. |
-| **mcporter** | The `mcporter` CLI — list, configure, auth, and call MCP servers/tools directly (HTTP or stdio) from the terminal. Useful for ad-hoc MCP interactions; for always-on tool discovery use the built-in `native-mcp` client instead. |
+| [**fastmcp**](/docs/user-guide/skills/optional/mcp/mcp-fastmcp) | Build, test, inspect, install, and deploy MCP servers with FastMCP in Python. Use when creating a new MCP server, wrapping an API or database as MCP tools, exposing resources or prompts, or preparing a FastMCP server for Claude Code, Cur... |
+| [**mcporter**](/docs/user-guide/skills/optional/mcp/mcp-mcporter) | Use the mcporter CLI to list, configure, auth, and call MCP servers/tools directly (HTTP or stdio), including ad-hoc servers, config edits, and CLI/type generation. |
 
-## Migration
+## migration
 
 | Skill | Description |
 |-------|-------------|
-| **openclaw-migration** | Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports memories, SOUL.md, command allowlists, user skills, and selected workspace assets. |
+| [**openclaw-migration**](/docs/user-guide/skills/optional/migration/migration-openclaw-migration) | Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports Hermes-compatible memories, SOUL.md, command allowlists, user skills, and selected workspace assets from ~/.openclaw, then reports exactly what could not be mig... |
 
-## MLOps
-
-The largest optional category — covers the full ML pipeline from data curation to production inference.
+## mlops
 
 | Skill | Description |
 |-------|-------------|
-| **accelerate** | Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. |
-| **chroma** | Open-source embedding database. Store embeddings and metadata, perform vector and full-text search. Simple 4-function API for RAG and semantic search. |
-| **clip** | OpenAI's vision-language model connecting images and text. Zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for image search, content moderation, or vision-language tasks without fine-tuning. |
-| **faiss** | Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). |
-| **flash-attention** | Optimize transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Supports PyTorch SDPA, flash-attn library, H100 FP8, and sliding window. |
-| **guidance** | Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance — Microsoft Research's constrained generation framework. |
-| **hermes-atropos-environments** | Build, test, and debug Hermes Agent RL environments for Atropos training. Covers the HermesAgentBaseEnv interface, reward functions, agent loop integration, and evaluation. |
-| **huggingface-tokenizers** | Fast Rust-based tokenizers for research and production. Tokenizes 1GB in under 20 seconds. Supports BPE, WordPiece, and Unigram algorithms. |
-| **instructor** | Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, and stream partial results. |
-| **lambda-labs** | Reserved and on-demand GPU cloud instances for ML training and inference. SSH access, persistent filesystems, and multi-node clusters. |
-| **llava** | Large Language and Vision Assistant — visual instruction tuning and image-based conversations combining CLIP vision with LLaMA language models. |
-| **modal** | Serverless GPU cloud platform for running ML workloads. On-demand GPU access without infrastructure management, ML model deployment as APIs, or batch jobs with automatic scaling. |
-| **nemo-curator** | GPU-accelerated data curation for LLM training. Fuzzy deduplication (16x faster), quality filtering (30+ heuristics), semantic dedup, PII redaction. Scales with RAPIDS. |
-| **peft-fine-tuning** | Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Train `<1%` of parameters with minimal accuracy loss for 7B–70B models on limited GPU memory. HuggingFace's official PEFT library. |
-| **pinecone** | Managed vector database for production AI. Auto-scaling, hybrid search (dense + sparse), metadata filtering, and low latency (under 100ms p95). |
-| **pytorch-fsdp** | Expert guidance for Fully Sharded Data Parallel training with PyTorch FSDP — parameter sharding, mixed precision, CPU offloading, FSDP2. |
-| **pytorch-lightning** | High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks, and minimal boilerplate. |
-| **qdrant** | High-performance vector similarity search engine. Rust-powered with fast nearest neighbor search, hybrid search with filtering, and scalable vector storage. |
-| **saelens** | Train and analyze Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features.
| -| **simpo** | Simple Preference Optimization — reference-free alternative to DPO with better performance (+6.4 pts on AlpacaEval 2.0). No reference model needed. | -| **slime** | LLM post-training with RL using Megatron+SGLang framework. Custom data generation workflows and tight Megatron-LM integration for RL scaling. | -| **stable-diffusion-image-generation** | State-of-the-art text-to-image generation with Stable Diffusion via HuggingFace Diffusers. Text-to-image, image-to-image translation, inpainting, and custom diffusion pipelines. | -| **tensorrt-llm** | Optimize LLM inference with NVIDIA TensorRT for maximum throughput. 10-100x faster than PyTorch on A100/H100 with quantization (FP8/INT4) and in-flight batching. | -| **torchtitan** | PyTorch-native distributed LLM pretraining with 4D parallelism (FSDP2, TP, PP, CP). Scale from 8 to 512+ GPUs with Float8 and torch.compile. | -| **whisper** | OpenAI's general-purpose speech recognition. 99 languages, transcription, translation to English, and language ID. Six model sizes from tiny (39M) to large (1550M). Best for robust multilingual ASR. | +| [**huggingface-accelerate**](/docs/user-guide/skills/optional/mlops/mlops-accelerate) | Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch comm... | +| [**chroma**](/docs/user-guide/skills/optional/mlops/mlops-chroma) | Open-source embedding database for AI applications. Store embeddings and metadata, perform vector and full-text search, filter by metadata. Simple 4-function API. Scales from notebooks to production clusters. Use for semantic search, RAG... | +| [**clip**](/docs/user-guide/skills/optional/mlops/mlops-clip) | OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. 
Use for image search, content moderation, or vision-language tasks w... | +| [**faiss**](/docs/user-guide/skills/optional/mlops/mlops-faiss) | Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). Use for fast k-NN search, large-scale vector retrieval, or whe... | +| [**optimizing-attention-flash**](/docs/user-guide/skills/optional/mlops/mlops-flash-attention) | Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training/running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or need faster in... | +| [**guidance**](/docs/user-guide/skills/optional/mlops/mlops-guidance) | Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance - Microsoft Research's constrained generation framework | +| [**hermes-atropos-environments**](/docs/user-guide/skills/optional/mlops/mlops-hermes-atropos-environments) | Build, test, and debug Hermes Agent RL environments for Atropos training. Covers the HermesAgentBaseEnv interface, reward functions, agent loop integration, evaluation with tools, wandb logging, and the three CLI modes (serve/process/eva... | +| [**huggingface-tokenizers**](/docs/user-guide/skills/optional/mlops/mlops-huggingface-tokenizers) | Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track alignments, handle padding/truncation. Integ... 
| +| [**instructor**](/docs/user-guide/skills/optional/mlops/mlops-instructor) | Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, parse complex JSON with type safety, and stream partial results with Instructor - battle-tested structured output library | +| [**lambda-labs-gpu-cloud**](/docs/user-guide/skills/optional/mlops/mlops-lambda-labs) | Reserved and on-demand GPU cloud instances for ML training and inference. Use when you need dedicated GPU instances with simple SSH access, persistent filesystems, or high-performance multi-node clusters for large-scale training. | +| [**llava**](/docs/user-guide/skills/optional/mlops/mlops-llava) | Large Language and Vision Assistant. Enables visual instruction tuning and image-based conversations. Combines CLIP vision encoder with Vicuna/LLaMA language models. Supports multi-turn image chat, visual question answering, and instruct... | +| [**modal-serverless-gpu**](/docs/user-guide/skills/optional/mlops/mlops-modal) | Serverless GPU cloud platform for running ML workloads. Use when you need on-demand GPU access without infrastructure management, deploying ML models as APIs, or running batch jobs with automatic scaling. | +| [**nemo-curator**](/docs/user-guide/skills/optional/mlops/mlops-nemo-curator) | GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs wit... | +| [**peft-fine-tuning**](/docs/user-guide/skills/optional/mlops/mlops-peft) | Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with minimal accuracy loss, or for multi-adapter se... 
| +| [**pinecone**](/docs/user-guide/skills/optional/mlops/mlops-pinecone) | Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or se... | +| [**pytorch-fsdp**](/docs/user-guide/skills/optional/mlops/mlops-pytorch-fsdp) | Expert guidance for Fully Sharded Data Parallel training with PyTorch FSDP - parameter sharding, mixed precision, CPU offloading, FSDP2 | +| [**pytorch-lightning**](/docs/user-guide/skills/optional/mlops/mlops-pytorch-lightning) | High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with same code. Use when you want clean training loops w... | +| [**qdrant-vector-search**](/docs/user-guide/skills/optional/mlops/mlops-qdrant) | High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or scalable vector storage with Rust-powered per... | +| [**sparse-autoencoder-training**](/docs/user-guide/skills/optional/mlops/mlops-saelens) | Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying... | +| [**simpo-training**](/docs/user-guide/skills/optional/mlops/mlops-simpo) | Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (+6.4 points on AlpacaEval 2.0). No reference model needed, more efficient than DPO. Use for preference alignment when you want simpl... 
| +| [**slime-rl-training**](/docs/user-guide/skills/optional/mlops/mlops-slime) | Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling. | +| [**stable-diffusion-image-generation**](/docs/user-guide/skills/optional/mlops/mlops-stable-diffusion) | State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting, or building custom diffusion pipelines. | +| [**tensorrt-llm**](/docs/user-guide/skills/optional/mlops/mlops-tensorrt-llm) | Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantizatio... | +| [**distributed-llm-pretraining-torchtitan**](/docs/user-guide/skills/optional/mlops/mlops-torchtitan) | Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and dist... | +| [**whisper**](/docs/user-guide/skills/optional/mlops/mlops-whisper) | OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast... | -## Productivity +## productivity | Skill | Description | |-------|-------------| -| **canvas** | Canvas LMS integration — fetch enrolled courses and assignments using API token authentication. | -| **memento-flashcards** | Spaced repetition flashcard system for learning and knowledge retention. 
| -| **siyuan** | SiYuan Note API for searching, reading, creating, and managing blocks and documents in a self-hosted knowledge base. | -| **telephony** | Give Hermes phone capabilities — provision a Twilio number, send/receive SMS/MMS, make calls, and place AI-driven outbound calls through Bland.ai or Vapi. | +| [**canvas**](/docs/user-guide/skills/optional/productivity/productivity-canvas) | Canvas LMS integration — fetch enrolled courses and assignments using API token authentication. | +| [**memento-flashcards**](/docs/user-guide/skills/optional/productivity/productivity-memento-flashcards) | Spaced-repetition flashcard system. Create cards from facts or text, chat with flashcards using free-text answers graded by the agent, generate quizzes from YouTube transcripts, review due cards with adaptive scheduling, and export/impor... | +| [**siyuan**](/docs/user-guide/skills/optional/productivity/productivity-siyuan) | SiYuan Note API for searching, reading, creating, and managing blocks and documents in a self-hosted knowledge base via curl. | +| [**telephony**](/docs/user-guide/skills/optional/productivity/productivity-telephony) | Give Hermes phone capabilities without core tool changes. Provision and persist a Twilio number, send and receive SMS/MMS, make direct calls, and place AI-driven outbound calls through Bland.ai or Vapi. | -## Research +## research | Skill | Description | |-------|-------------| -| **bioinformatics** | Gateway to 400+ bioinformatics skills from bioSkills and ClawBio. Covers genomics, transcriptomics, single-cell, variant calling, pharmacogenomics, metagenomics, and structural biology. | -| **domain-intel** | Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, and bulk multi-domain analysis. No API keys required. | -| **duckduckgo-search** | Free web search via DuckDuckGo — text, news, images, videos. No API key needed. 
| -| **gitnexus-explorer** | Index a codebase with GitNexus and serve an interactive knowledge graph via web UI and Cloudflare tunnel. | -| **parallel-cli** | Vendor skill for Parallel CLI — agent-native web search, extraction, deep research, enrichment, and monitoring. | -| **qmd** | Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. | -| **scrapling** | Web scraping with Scrapling — HTTP fetching, stealth browser automation, Cloudflare bypass, and spider crawling via CLI and Python. | +| [**bioinformatics**](/docs/user-guide/skills/optional/research/research-bioinformatics) | Gateway to 400+ bioinformatics skills from bioSkills and ClawBio. Covers genomics, transcriptomics, single-cell, variant calling, pharmacogenomics, metagenomics, structural biology, and more. Fetches domain-specific reference material on... | +| [**domain-intel**](/docs/user-guide/skills/optional/research/research-domain-intel) | Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, domain availability checks, and bulk multi-domain analysis. No API keys required. | +| [**drug-discovery**](/docs/user-guide/skills/optional/research/research-drug-discovery) | Pharmaceutical research assistant for drug discovery workflows. Search bioactive compounds on ChEMBL, calculate drug-likeness (Lipinski Ro5, QED, TPSA, synthetic accessibility), look up drug-drug interactions via OpenFDA, interpret ADMET... | +| [**duckduckgo-search**](/docs/user-guide/skills/optional/research/research-duckduckgo-search) | Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Prefer the `ddgs` CLI when installed; use the Python DDGS library only after verifying that `ddgs` is available in the current runtime. 
| +| [**gitnexus-explorer**](/docs/user-guide/skills/optional/research/research-gitnexus-explorer) | Index a codebase with GitNexus and serve an interactive knowledge graph via web UI + Cloudflare tunnel. | +| [**parallel-cli**](/docs/user-guide/skills/optional/research/research-parallel-cli) | Optional vendor skill for Parallel CLI — agent-native web search, extraction, deep research, enrichment, FindAll, and monitoring. Prefer JSON output and non-interactive flows. | +| [**qmd**](/docs/user-guide/skills/optional/research/research-qmd) | Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. Supports CLI and MCP integration. | +| [**scrapling**](/docs/user-guide/skills/optional/research/research-scrapling) | Web scraping with Scrapling - HTTP fetching, stealth browser automation, Cloudflare bypass, and spider crawling via CLI and Python. | -## Security +## security | Skill | Description | |-------|-------------| -| **1password** | Set up and use 1Password CLI (op). Install the CLI, enable desktop app integration, sign in, and read/inject secrets for commands. | -| **oss-forensics** | Open-source software forensics — analyze packages, dependencies, and supply chain risks. | -| **sherlock** | OSINT username search across 400+ social networks. Hunt down social media accounts by username. | +| [**1password**](/docs/user-guide/skills/optional/security/security-1password) | Set up and use 1Password CLI (op). Use when installing the CLI, enabling desktop app integration, signing in, and reading/injecting secrets for commands. | +| [**oss-forensics**](/docs/user-guide/skills/optional/security/security-oss-forensics) | Supply chain investigation, evidence recovery, and forensic analysis for GitHub repositories. Covers deleted commit recovery, force-push detection, IOC extraction, multi-source evidence collection, hypothesis formation/validation, and st... 
| +| [**sherlock**](/docs/user-guide/skills/optional/security/security-sherlock) | OSINT username search across 400+ social networks. Hunt down social media accounts by username. | + +## web-development + +| Skill | Description | +|-------|-------------| +| [**page-agent**](/docs/user-guide/skills/optional/web-development/web-development-page-agent) | Embed alibaba/page-agent into your own web application — a pure-JavaScript in-page GUI agent that ships as a single <script> tag or npm package and lets end-users of your site drive the UI with natural language ("click login, fill userna... | --- @@ -167,4 +170,4 @@ To add a new optional skill to the repository: 1. Create a directory under `optional-skills/<category>/<skill-name>/` 2. Add a `SKILL.md` with standard frontmatter (name, description, version, author) 3. Include any supporting files in `references/`, `templates/`, or `scripts/` subdirectories -4. Submit a pull request — the skill will appear in this catalog once merged +4. Submit a pull request — the skill will appear in this catalog and get its own docs page once merged diff --git a/website/docs/reference/skills-catalog.md b/website/docs/reference/skills-catalog.md index 301d7ee54..31eb71f11 100644 --- a/website/docs/reference/skills-catalog.md +++ b/website/docs/reference/skills-catalog.md @@ -6,325 +6,174 @@ description: "Catalog of bundled skills that ship with Hermes Agent" # Bundled Skills Catalog -Hermes ships with a large built-in skill library copied into `~/.hermes/skills/` on install. This page catalogs the bundled skills that live in the repository under `skills/`. +Hermes ships with a large built-in skill library copied into `~/.hermes/skills/` on install. Each skill below links to a dedicated page with its full definition, setup, and usage. + +If a skill is present in the repository but missing from this list, the catalog is out of date; regenerate it by running `website/scripts/generate-skill-docs.py`. 
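For contributors, the standard `SKILL.md` frontmatter that the catalog pages and per-skill docs are generated from can be sketched roughly as follows. This is a minimal hypothetical example: the field names match the required set listed in the contribution steps (name, description, version, author), but every value below is invented for illustration.

```markdown
---
name: example-skill
description: One-sentence summary surfaced in the catalog table and on the skill's docs page.
version: 0.1.0
author: your-github-handle
---

# example-skill

Instructions and reference material that Hermes loads at runtime when this skill activates.
```

The generator reads this frontmatter to build the catalog row and the dedicated page; optional metadata (license, dependencies, platform gating, tags, related skills) follows the same key-value pattern when a skill provides it.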
## apple -Apple/macOS-specific skills — iMessage, Reminders, Notes, FindMy, and macOS automation. These skills only load on macOS systems. - | Skill | Description | Path | |-------|-------------|------| -| `apple-notes` | Manage Apple Notes via the memo CLI on macOS (create, view, search, edit). | `apple/apple-notes` | -| `apple-reminders` | Manage Apple Reminders via remindctl CLI (list, add, complete, delete). | `apple/apple-reminders` | -| `findmy` | Track Apple devices and AirTags via FindMy.app on macOS using AppleScript and screen capture. | `apple/findmy` | -| `imessage` | Send and receive iMessages/SMS via the imsg CLI on macOS. | `apple/imessage` | +| [`apple-notes`](/docs/user-guide/skills/bundled/apple/apple-apple-notes) | Manage Apple Notes via the memo CLI on macOS (create, view, search, edit). | `apple/apple-notes` | +| [`apple-reminders`](/docs/user-guide/skills/bundled/apple/apple-apple-reminders) | Manage Apple Reminders via remindctl CLI (list, add, complete, delete). | `apple/apple-reminders` | +| [`findmy`](/docs/user-guide/skills/bundled/apple/apple-findmy) | Track Apple devices and AirTags via FindMy.app on macOS using AppleScript and screen capture. | `apple/findmy` | +| [`imessage`](/docs/user-guide/skills/bundled/apple/apple-imessage) | Send and receive iMessages/SMS via the imsg CLI on macOS. | `apple/imessage` | ## autonomous-ai-agents -Skills for spawning and orchestrating autonomous AI coding agents and multi-agent workflows — running independent agent processes, delegating tasks, and coordinating parallel workstreams. - | Skill | Description | Path | |-------|-------------|------| -| `claude-code` | Delegate coding tasks to Claude Code (Anthropic's CLI agent). Use for building features, refactoring, PR reviews, and iterative coding. Requires the claude CLI installed. | `autonomous-ai-agents/claude-code` | -| `codex` | Delegate coding tasks to OpenAI Codex CLI agent. 
Use for building features, refactoring, PR reviews, and batch issue fixing. Requires the codex CLI and a git repository. | `autonomous-ai-agents/codex` | -| `hermes-agent` | Complete guide to using and extending Hermes Agent — CLI usage, setup, configuration, spawning additional agents, gateway platforms, skills, voice, tools, profiles, and a concise contributor reference. Load this skill when helping users configure Hermes, troubleshoot issues, s… | `autonomous-ai-agents/hermes-agent` | -| `opencode` | Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions. Requires the opencode CLI installed and authenticated. | `autonomous-ai-agents/opencode` | +| [`claude-code`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code) | Delegate coding tasks to Claude Code (Anthropic's CLI agent). Use for building features, refactoring, PR reviews, and iterative coding. Requires the claude CLI installed. | `autonomous-ai-agents/claude-code` | +| [`codex`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex) | Delegate coding tasks to OpenAI Codex CLI agent. Use for building features, refactoring, PR reviews, and batch issue fixing. Requires the codex CLI and a git repository. | `autonomous-ai-agents/codex` | +| [`hermes-agent`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent) | Complete guide to using and extending Hermes Agent — CLI usage, setup, configuration, spawning additional agents, gateway platforms, skills, voice, tools, profiles, and a concise contributor reference. Load this skill when helping users... | `autonomous-ai-agents/hermes-agent` | +| [`opencode`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-opencode) | Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions. 
Requires the opencode CLI installed and authenticated. | `autonomous-ai-agents/opencode` | ## creative -Creative content generation — ASCII art, hand-drawn diagrams, animations, music, and visual design tools. - | Skill | Description | Path | |-------|-------------|------| -| `architecture-diagram` | Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics. Semantic component colors (cyan=frontend, emerald=backend, violet=database, amber=cloud/AWS, rose=security, orange=message bus), JetBrains Mono fon… | `creative/architecture-diagram` | -| `ascii-art` | Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required. | `creative/ascii-art` | -| `ascii-video` | Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid… | `creative/ascii-video` | -| `excalidraw` | Create hand-drawn style diagrams using Excalidraw JSON format. Generate .excalidraw files for architecture diagrams, flowcharts, sequence diagrams, concept maps, and more. Files can be opened at excalidraw.com or uploaded for shareable links. | `creative/excalidraw` | -| `ideation` | Generate project ideas through creative constraints. Use when the user says 'I want to build something', 'give me a project idea', 'I'm bored', 'what should I make', 'inspire me', or any variant of 'I have tools but no direction'. Works for code, art, hardware, writing, tools,… | `creative/creative-ideation` | -| `manim-video` | Production pipeline for mathematical and technical animations using Manim Community Edition. 
Creates 3Blue1Brown-style explainer videos, algorithm visualizations, equation derivations, architecture diagrams, and data stories. Use when users request: animated explanations, math… | `creative/manim-video` | -| `p5js` | Production pipeline for interactive and generative visual art using p5.js. Creates browser-based sketches, generative art, data visualizations, interactive experiences, 3D scenes, audio-reactive visuals, and motion graphics — exported as HTML, PNG, GIF, MP4, or SVG. Covers: 2D… | `creative/p5js` | -| `popular-web-designs` | 54 production-quality design systems extracted from real websites. Load a template to generate HTML/CSS that matches the visual identity of sites like Stripe, Linear, Vercel, Notion, Airbnb, and more. Each template includes colors, typography, components, layout rules, and rea… | `creative/popular-web-designs` | -| `songwriting-and-ai-music` | Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation techniques, phonetic tricks, and lessons learned. These are tools and ideas, not rules. Break any of them when the art calls for it. | `creative/songwriting-and-ai-music` | +| [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) | Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics. Semantic component colors (cyan=frontend, emerald=backend, violet=database, amber=cloud/AWS, rose=security,... | `creative/architecture-diagram` | +| [`ascii-art`](/docs/user-guide/skills/bundled/creative/creative-ascii-art) | Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required. | `creative/ascii-art` | +| [`ascii-video`](/docs/user-guide/skills/bundled/creative/creative-ascii-video) | Production pipeline for ASCII art video — any format. 
Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers,... | `creative/ascii-video` |
+| [`baoyu-comic`](/docs/user-guide/skills/bundled/creative/creative-baoyu-comic) | Knowledge comic creator supporting multiple art styles and tones. Creates original educational comics with detailed panel layouts and sequential image generation. Use when user asks to create "知识漫画", "教育漫画", "biography comic", "tutorial... | `creative/baoyu-comic` |
+| [`baoyu-infographic`](/docs/user-guide/skills/bundled/creative/creative-baoyu-infographic) | Generate professional infographics with 21 layout types and 21 visual styles. Analyzes content, recommends layout×style combinations, and generates publication-ready infographics. Use when user asks to create "infographic", "visual summa... | `creative/baoyu-infographic` |
+| [`ideation`](/docs/user-guide/skills/bundled/creative/creative-creative-ideation) | Generate project ideas through creative constraints. Use when the user says 'I want to build something', 'give me a project idea', 'I'm bored', 'what should I make', 'inspire me', or any variant of 'I have tools but no direction'. Works... | `creative/creative-ideation` |
+| [`design-md`](/docs/user-guide/skills/bundled/creative/creative-design-md) | Author, validate, diff, and export DESIGN.md files — Google's open-source format spec that gives coding agents a persistent, structured understanding of a design system (tokens + rationale in one file). Use when building a design system,... | `creative/design-md` |
+| [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw) | Create hand-drawn style diagrams using Excalidraw JSON format. Generate .excalidraw files for architecture diagrams, flowcharts, sequence diagrams, concept maps, and more. Files can be opened at excalidraw.com or uploaded for shareable l... | `creative/excalidraw` |
+| [`manim-video`](/docs/user-guide/skills/bundled/creative/creative-manim-video) | Production pipeline for mathematical and technical animations using Manim Community Edition. Creates 3Blue1Brown-style explainer videos, algorithm visualizations, equation derivations, architecture diagrams, and data stories. Use when us... | `creative/manim-video` |
+| [`p5js`](/docs/user-guide/skills/bundled/creative/creative-p5js) | Production pipeline for interactive and generative visual art using p5.js. Creates browser-based sketches, generative art, data visualizations, interactive experiences, 3D scenes, audio-reactive visuals, and motion graphics — exported as... | `creative/p5js` |
+| [`pixel-art`](/docs/user-guide/skills/bundled/creative/creative-pixel-art) | Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc.), and animate them into short videos. Presets cover arcade, SNES, and 10+ era-correct looks. Use `clarify` to let the user pick a style... | `creative/pixel-art` |
+| [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs) | 54 production-quality design systems extracted from real websites. Load a template to generate HTML/CSS that matches the visual identity of sites like Stripe, Linear, Vercel, Notion, Airbnb, and more. Each template includes colors, typog... | `creative/popular-web-designs` |
+| [`songwriting-and-ai-music`](/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music) | Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation techniques, phonetic tricks, and lessons learned. These are tools and ideas, not rules. Break any of them when the art calls for it. | `creative/songwriting-and-ai-music` |

## data-science

-Skills for data science workflows — interactive exploration, Jupyter notebooks, data analysis, and visualization.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `jupyter-live-kernel` | Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb. Load this skill when the task involves exploration, iteration, or inspecting intermediate results — data science, ML experimentation, API exploration, or building up complex code step-by-step. Uses… | `data-science/jupyter-live-kernel` |
+| [`jupyter-live-kernel`](/docs/user-guide/skills/bundled/data-science/data-science-jupyter-live-kernel) | Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb. Load this skill when the task involves exploration, iteration, or inspecting intermediate results — data science, ML experimentation, API exploration, or bui... | `data-science/jupyter-live-kernel` |

## devops

-DevOps and infrastructure automation skills.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `webhook-subscriptions` | Create and manage webhook subscriptions for event-driven agent activation. Use when the user wants external services to trigger agent runs automatically. | `devops/webhook-subscriptions` |
+| [`webhook-subscriptions`](/docs/user-guide/skills/bundled/devops/devops-webhook-subscriptions) | Create and manage webhook subscriptions for event-driven agent activation, or for direct push notifications (zero LLM cost). Use when the user wants external services to trigger agent runs OR push notifications to chats. | `devops/webhook-subscriptions` |

## dogfood

-Internal dogfooding and QA skills used to test Hermes Agent itself.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `dogfood` | Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports | `dogfood` |
-| `adversarial-ux-test` | Roleplay the most difficult, tech-resistant user for a product — browse in-persona, rant, then filter through a RED/YELLOW/WHITE/GREEN pragmatism layer so only real UX friction becomes tickets. | `dogfood/adversarial-ux-test` |
+| [`dogfood`](/docs/user-guide/skills/bundled/dogfood/dogfood-dogfood) | Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports | `dogfood` |

## email

-Skills for sending, receiving, searching, and managing email from the terminal.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `himalaya` | CLI to manage emails via IMAP/SMTP. Use himalaya to list, read, write, reply, forward, search, and organize emails from the terminal. Supports multiple accounts and message composition with MML (MIME Meta Language). | `email/himalaya` |
+| [`himalaya`](/docs/user-guide/skills/bundled/email/email-himalaya) | CLI to manage emails via IMAP/SMTP. Use himalaya to list, read, write, reply, forward, search, and organize emails from the terminal. Supports multiple accounts and message composition with MML (MIME Meta Language). | `email/himalaya` |

## gaming

-Skills for setting up, configuring, and managing game servers, modpacks, and gaming-related infrastructure.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `minecraft-modpack-server` | Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip. Covers NeoForge/Forge install, Java version, JVM tuning, firewall, LAN config, backups, and launch scripts. | `gaming/minecraft-modpack-server` |
-| `pokemon-player` | Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal. | `gaming/pokemon-player` |
+| [`minecraft-modpack-server`](/docs/user-guide/skills/bundled/gaming/gaming-minecraft-modpack-server) | Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip. Covers NeoForge/Forge install, Java version, JVM tuning, firewall, LAN config, backups, and launch scripts. | `gaming/minecraft-modpack-server` |
+| [`pokemon-player`](/docs/user-guide/skills/bundled/gaming/gaming-pokemon-player) | Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal. | `gaming/pokemon-player` |

## github

-GitHub workflow skills for managing repositories, pull requests, code reviews, issues, and CI/CD pipelines.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `codebase-inspection` | Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios. Use when asked to check lines of code, repo size, language composition, or codebase stats. | `github/codebase-inspection` |
-| `github-auth` | Set up GitHub authentication for the agent using git (universally available) or the gh CLI. Covers HTTPS tokens, SSH keys, credential helpers, and gh auth — with a detection flow to pick the right method automatically. | `github/github-auth` |
-| `github-code-review` | Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-code-review` |
-| `github-issues` | Create, manage, triage, and close GitHub issues. Search existing issues, add labels, assign people, and link to PRs. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-issues` |
-| `github-pr-workflow` | Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-pr-workflow` |
-| `github-repo-management` | Clone, create, fork, configure, and manage GitHub repositories. Manage remotes, secrets, releases, and workflows. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-repo-management` |
+| [`codebase-inspection`](/docs/user-guide/skills/bundled/github/github-codebase-inspection) | Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios. Use when asked to check lines of code, repo size, language composition, or codebase stats. | `github/codebase-inspection` |
+| [`github-auth`](/docs/user-guide/skills/bundled/github/github-github-auth) | Set up GitHub authentication for the agent using git (universally available) or the gh CLI. Covers HTTPS tokens, SSH keys, credential helpers, and gh auth — with a detection flow to pick the right method automatically. | `github/github-auth` |
+| [`github-code-review`](/docs/user-guide/skills/bundled/github/github-github-code-review) | Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-code-review` |
+| [`github-issues`](/docs/user-guide/skills/bundled/github/github-github-issues) | Create, manage, triage, and close GitHub issues. Search existing issues, add labels, assign people, and link to PRs. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-issues` |
+| [`github-pr-workflow`](/docs/user-guide/skills/bundled/github/github-github-pr-workflow) | Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-pr-workflow` |
+| [`github-repo-management`](/docs/user-guide/skills/bundled/github/github-github-repo-management) | Clone, create, fork, configure, and manage GitHub repositories. Manage remotes, secrets, releases, and workflows. Works with gh CLI or falls back to git + GitHub REST API via curl. | `github/github-repo-management` |

## mcp

-Skills for working with MCP (Model Context Protocol) servers, tools, and integrations.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `native-mcp` | Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools. Supports stdio and HTTP transports with automatic reconnection, security filtering, and zero-config tool injection. | `mcp/native-mcp` |
+| [`native-mcp`](/docs/user-guide/skills/bundled/mcp/mcp-native-mcp) | Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools. Supports stdio and HTTP transports with automatic reconnection, security filterin... | `mcp/native-mcp` |

## media

-Skills for working with media content — YouTube transcripts, GIF search, music generation, and audio visualization.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `gif-search` | Search and download GIFs from Tenor using curl. No dependencies beyond curl and jq. Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat. | `media/gif-search` |
-| `heartmula` | Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support. | `media/heartmula` |
-| `songsee` | Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation. | `media/songsee` |
-| `youtube-content` | Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts). Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to extract and reformat content from any YouT… | `media/youtube-content` |
+| [`gif-search`](/docs/user-guide/skills/bundled/media/media-gif-search) | Search and download GIFs from Tenor using curl. No dependencies beyond curl and jq. Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat. | `media/gif-search` |
+| [`heartmula`](/docs/user-guide/skills/bundled/media/media-heartmula) | Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support. | `media/heartmula` |
+| [`songsee`](/docs/user-guide/skills/bundled/media/media-songsee) | Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation. | `media/songsee` |
+| [`youtube-content`](/docs/user-guide/skills/bundled/media/media-youtube-content) | Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts). Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to ex... | `media/youtube-content` |

## mlops

-General-purpose ML operations tools — model hub management, dataset operations, and workflow orchestration.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `huggingface-hub` | Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Spaces and buckets. | `mlops/huggingface-hub` |
-
-## mlops/evaluation
-
-Model evaluation benchmarks, experiment tracking, and interpretability tools.
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `evaluating-llms-harness` | Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. S… | `mlops/evaluation/lm-evaluation-harness` |
-| `weights-and-biases` | Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform | `mlops/evaluation/weights-and-biases` |
-
-## mlops/inference
-
-Model serving, quantization (GGUF/GPTQ), structured output, inference optimization, and model surgery tools for deploying and running LLMs.
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `llama-cpp` | Run LLM inference with llama.cpp on CPU, Apple Silicon, AMD/Intel GPUs, or NVIDIA — plus GGUF model conversion and quantization (2–8 bit with K-quants and imatrix). Covers CLI, Python bindings, OpenAI-compatible server, and Ollama/LM Studio integration. Use for edge deployment… | `mlops/inference/llama-cpp` |
-| `obliteratus` | Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE decomposition, etc.) to excise guardrails while preserving reasoning. 9 CLI methods, 28 analysis modules, 116 model presets … | `mlops/inference/obliteratus` |
-| `outlines` | Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines - dottxt.ai's structured generation library | `mlops/inference/outlines` |
-| `serving-llms-vllm` | Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), … | `mlops/inference/vllm` |
-
-## mlops/models
-
-Specific model architectures — image segmentation (SAM) and audio generation (AudioCraft / MusicGen). Additional model skills (CLIP, Stable Diffusion, Whisper, LLaVA) are available as optional skills.
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `audiocraft-audio-generation` | PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation. | `mlops/models/audiocraft` |
-| `segment-anything-model` | Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image. | `mlops/models/segment-anything` |
-
-## mlops/research
-
-ML research frameworks for building and optimizing AI systems with declarative programming.
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `dspy` | Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's framework for systematic LM programming | `mlops/research/dspy` |
-
-## mlops/training
-
-Fine-tuning, RLHF/DPO/GRPO training, distributed training frameworks, and optimization tools.
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `axolotl` | Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support | `mlops/training/axolotl` |
-| `fine-tuning-with-trl` | Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when need RLHF, align model with preferences, or train from human feedback. Works with HuggingFace … | `mlops/training/trl-fine-tuning` |
-| `unsloth` | Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization | `mlops/training/unsloth` |
+| [`audiocraft-audio-generation`](/docs/user-guide/skills/bundled/mlops/mlops-models-audiocraft) | PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation. | `mlops/models/audiocraft` |
+| [`axolotl`](/docs/user-guide/skills/bundled/mlops/mlops-training-axolotl) | Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support | `mlops/training/axolotl` |
+| [`dspy`](/docs/user-guide/skills/bundled/mlops/mlops-research-dspy) | Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's framework for systematic LM programming | `mlops/research/dspy` |
+| [`huggingface-hub`](/docs/user-guide/skills/bundled/mlops/mlops-huggingface-hub) | Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Spaces and buckets. | `mlops/huggingface-hub` |
+| [`llama-cpp`](/docs/user-guide/skills/bundled/mlops/mlops-inference-llama-cpp) | llama.cpp local GGUF inference + HF Hub model discovery. | `mlops/inference/llama-cpp` |
+| [`evaluating-llms-harness`](/docs/user-guide/skills/bundled/mlops/mlops-evaluation-lm-evaluation-harness) | Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by El... | `mlops/evaluation/lm-evaluation-harness` |
+| [`obliteratus`](/docs/user-guide/skills/bundled/mlops/mlops-inference-obliteratus) | Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE decomposition, etc.) to excise guardrails while preserving reasoning. 9 CLI methods,... | `mlops/inference/obliteratus` |
+| [`outlines`](/docs/user-guide/skills/bundled/mlops/mlops-inference-outlines) | Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines - dottxt.ai's structured generation library | `mlops/inference/outlines` |
+| [`segment-anything-model`](/docs/user-guide/skills/bundled/mlops/mlops-models-segment-anything) | Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image. | `mlops/models/segment-anything` |
+| [`fine-tuning-with-trl`](/docs/user-guide/skills/bundled/mlops/mlops-training-trl-fine-tuning) | Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when need RLHF, align model with preferences, or train from... | `mlops/training/trl-fine-tuning` |
+| [`unsloth`](/docs/user-guide/skills/bundled/mlops/mlops-training-unsloth) | Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization | `mlops/training/unsloth` |
+| [`serving-llms-vllm`](/docs/user-guide/skills/bundled/mlops/mlops-inference-vllm) | Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible... | `mlops/inference/vllm` |
+| [`weights-and-biases`](/docs/user-guide/skills/bundled/mlops/mlops-evaluation-weights-and-biases) | Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform | `mlops/evaluation/weights-and-biases` |

## note-taking

-Note taking skills, to save information, assist with research, and collaborate on multi-session planning.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `obsidian` | Read, search, and create notes in the Obsidian vault. | `note-taking/obsidian` |
+| [`obsidian`](/docs/user-guide/skills/bundled/note-taking/note-taking-obsidian) | Read, search, and create notes in the Obsidian vault. | `note-taking/obsidian` |

## productivity

-Skills for document creation, presentations, spreadsheets, and other productivity workflows.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `google-workspace` | Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration for Hermes. Uses Hermes-managed OAuth2 setup, prefers the Google Workspace CLI (`gws`) when available for broader API coverage, and falls back to the Python client libraries otherwise. | `productivity/google-workspace` |
-| `linear` | Manage Linear issues, projects, and teams via the GraphQL API. Create, update, search, and organize issues. Uses API key auth (no OAuth needed). All operations via curl — no dependencies. | `productivity/linear` |
-| `maps` | Location intelligence — geocode, reverse-geocode, nearby POI search (44 categories, coordinates or address via `--near`), driving/walking/cycling distance + time, turn-by-turn directions, timezone, bounding box + area, POI search in a rectangle. Uses OpenStreetMap + Overpass + OSRM. No API key needed. Telegram location-pin friendly. | `productivity/maps` |
-| `nano-pdf` | Edit PDFs with natural-language instructions using the nano-pdf CLI. Modify text, fix typos, update titles, and make content changes to specific pages without manual editing. | `productivity/nano-pdf` |
-| `notion` | Notion API for creating and managing pages, databases, and blocks via curl. Search, create, update, and query Notion workspaces directly from the terminal. | `productivity/notion` |
-| `ocr-and-documents` | Extract text from PDFs and scanned documents. Use web_extract for remote URLs, pymupdf for local text-based PDFs, marker-pdf for OCR/scanned docs. For DOCX use python-docx, for PPTX see the powerpoint skill. | `productivity/ocr-and-documents` |
-| `powerpoint` | Use this skill any time a .pptx file is involved in any way — as input, output, or both. This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted content will be used elsewhere, like in … | `productivity/powerpoint` |
+| [`google-workspace`](/docs/user-guide/skills/bundled/productivity/productivity-google-workspace) | Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration for Hermes. Uses Hermes-managed OAuth2 setup, prefers the Google Workspace CLI (`gws`) when available for broader API coverage, and falls back to the Python client libraries... | `productivity/google-workspace` |
+| [`linear`](/docs/user-guide/skills/bundled/productivity/productivity-linear) | Manage Linear issues, projects, and teams via the GraphQL API. Create, update, search, and organize issues. Uses API key auth (no OAuth needed). All operations via curl — no dependencies. | `productivity/linear` |
+| [`maps`](/docs/user-guide/skills/bundled/productivity/productivity-maps) | Location intelligence — geocode a place, reverse-geocode coordinates, find nearby places (46 POI categories), driving/walking/cycling distance + time, turn-by-turn directions, timezone lookup, bounding box + area for a named place, and P... | `productivity/maps` |
+| [`nano-pdf`](/docs/user-guide/skills/bundled/productivity/productivity-nano-pdf) | Edit PDFs with natural-language instructions using the nano-pdf CLI. Modify text, fix typos, update titles, and make content changes to specific pages without manual editing. | `productivity/nano-pdf` |
+| [`notion`](/docs/user-guide/skills/bundled/productivity/productivity-notion) | Notion API for creating and managing pages, databases, and blocks via curl. Search, create, update, and query Notion workspaces directly from the terminal. | `productivity/notion` |
+| [`ocr-and-documents`](/docs/user-guide/skills/bundled/productivity/productivity-ocr-and-documents) | Extract text from PDFs and scanned documents. Use web_extract for remote URLs, pymupdf for local text-based PDFs, marker-pdf for OCR/scanned docs. For DOCX use python-docx, for PPTX see the powerpoint skill. | `productivity/ocr-and-documents` |
+| [`powerpoint`](/docs/user-guide/skills/bundled/productivity/productivity-powerpoint) | Use this skill any time a .pptx file is involved in any way — as input, output, or both. This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted... | `productivity/powerpoint` |

## red-teaming

-Skills for LLM red-teaming, jailbreaking, and safety filter bypass research.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `godmode` | Jailbreak API-served LLMs using G0DM0D3 techniques — Parseltongue input obfuscation (33 techniques), GODMODE CLASSIC system prompt templates, ULTRAPLINIAN multi-model racing, encoding escalation, and Hermes-native prefill/system prompt integration. Use when a user wants to byp… | `red-teaming/godmode` |
+| [`godmode`](/docs/user-guide/skills/bundled/red-teaming/red-teaming-godmode) | Jailbreak API-served LLMs using G0DM0D3 techniques — Parseltongue input obfuscation (33 techniques), GODMODE CLASSIC system prompt templates, ULTRAPLINIAN multi-model racing, encoding escalation, and Hermes-native prefill/system prompt i... | `red-teaming/godmode` |

## research

-Skills for academic research, paper discovery, literature review, market data, content monitoring, and scientific knowledge retrieval.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `arxiv` | Search and retrieve academic papers from arXiv using their free REST API. No API key needed. Search by keyword, author, category, or ID. Combine with web_extract or the ocr-and-documents skill to read full paper content. | `research/arxiv` |
-| `blogwatcher` | Monitor blogs and RSS/Atom feeds for updates using the blogwatcher-cli tool. Add blogs, scan for new articles, track read status, and filter by category. | `research/blogwatcher` |
-| `llm-wiki` | Karpathy's LLM Wiki — build and maintain a persistent, interlinked markdown knowledge base. Ingest sources, query compiled knowledge, and lint for consistency. | `research/llm-wiki` |
-| `polymarket` | Query Polymarket prediction market data — search markets, get prices, orderbooks, and price history. Read-only via public REST APIs, no API key needed. | `research/polymarket` |
-| `research-paper-writing` | End-to-end pipeline for writing ML/AI research papers — from experiment design through analysis, drafting, revision, and submission. Covers NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Integrates automated experiment monitoring, statistical analysis, iterative writing, and citation v… | `research/research-paper-writing` |
+| [`arxiv`](/docs/user-guide/skills/bundled/research/research-arxiv) | Search and retrieve academic papers from arXiv using their free REST API. No API key needed. Search by keyword, author, category, or ID. Combine with web_extract or the ocr-and-documents skill to read full paper content. | `research/arxiv` |
+| [`blogwatcher`](/docs/user-guide/skills/bundled/research/research-blogwatcher) | Monitor blogs and RSS/Atom feeds for updates using the blogwatcher-cli tool. Add blogs, scan for new articles, track read status, and filter by category. | `research/blogwatcher` |
+| [`llm-wiki`](/docs/user-guide/skills/bundled/research/research-llm-wiki) | Karpathy's LLM Wiki — build and maintain a persistent, interlinked markdown knowledge base. Ingest sources, query compiled knowledge, and lint for consistency. | `research/llm-wiki` |
+| [`polymarket`](/docs/user-guide/skills/bundled/research/research-polymarket) | Query Polymarket prediction market data — search markets, get prices, orderbooks, and price history. Read-only via public REST APIs, no API key needed. | `research/polymarket` |
+| [`research-paper-writing`](/docs/user-guide/skills/bundled/research/research-research-paper-writing) | End-to-end pipeline for writing ML/AI research papers — from experiment design through analysis, drafting, revision, and submission. Covers NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Integrates automated experiment monitoring, statistical ana... | `research/research-paper-writing` |

## smart-home

-Skills for controlling smart home devices — lights, switches, sensors, and home automation systems.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `openhue` | Control Philips Hue lights, rooms, and scenes via the OpenHue CLI. Turn lights on/off, adjust brightness, color, color temperature, and activate scenes. | `smart-home/openhue` |
+| [`openhue`](/docs/user-guide/skills/bundled/smart-home/smart-home-openhue) | Control Philips Hue lights, rooms, and scenes via the OpenHue CLI. Turn lights on/off, adjust brightness, color, color temperature, and activate scenes. | `smart-home/openhue` |

## social-media

-Skills for interacting with social platforms — posting, reading, monitoring, and account operations.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `xurl` | Interact with X/Twitter via xurl, the official X API CLI. Use for posting, replying, quoting, searching, timelines, mentions, likes, reposts, bookmarks, follows, DMs, media upload, and raw v2 endpoint access. | `social-media/xurl` |
+| [`xurl`](/docs/user-guide/skills/bundled/social-media/social-media-xurl) | Interact with X/Twitter via xurl, the official X API CLI. Use for posting, replying, quoting, searching, timelines, mentions, likes, reposts, bookmarks, follows, DMs, media upload, and raw v2 endpoint access. | `social-media/xurl` |

## software-development

-General software-engineering skills — planning, reviewing, debugging, and test-driven development.
-
| Skill | Description | Path |
|-------|-------------|------|
-| `plan` | Plan mode for Hermes — inspect context, write a markdown plan into the active workspace's `.hermes/plans/` directory, and do not execute the work. | `software-development/plan` |
-| `requesting-code-review` | Pre-commit verification pipeline — static security scan, baseline-aware quality gates, independent reviewer subagent, and auto-fix loop. Use after code changes and before committing, pushing, or opening a PR. | `software-development/requesting-code-review` |
-| `subagent-driven-development` | Use when executing implementation plans with independent tasks. Dispatches fresh delegate_task per task with two-stage review (spec compliance then code quality). | `software-development/subagent-driven-development` |
-| `systematic-debugging` | Use when encountering any bug, test failure, or unexpected behavior. 4-phase root cause investigation — NO fixes without understanding the problem first. | `software-development/systematic-debugging` |
-| `test-driven-development` | Use when implementing any feature or bugfix, before writing implementation code. Enforces RED-GREEN-REFACTOR cycle with test-first approach. | `software-development/test-driven-development` |
-| `writing-plans` | Use when you have a spec or requirements for a multi-step task. Creates comprehensive implementation plans with bite-sized tasks, exact file paths, and complete code examples. | `software-development/writing-plans` |
-
-
----
-
-# Optional Skills
-
-Optional skills ship with the repository under `optional-skills/` but are **not active by default**. They cover heavier or niche use cases. Install them with:
-
-```bash
-hermes skills install official//
-```
-
-## autonomous-ai-agents
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `blackbox` | Delegate coding tasks to Blackbox AI CLI agent. Multi-model agent with built-in judge that runs tasks through multiple LLMs and picks the best result. Requires the blackbox CLI and a Blackbox AI API key. | `autonomous-ai-agents/blackbox` |
-
-## blockchain
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `base` | Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection, whale detection, and live network stats. Uses Base RPC + CoinGecko. No API key required. | `blockchain/base` |
-| `solana` | Query Solana blockchain data with USD pricing — wallet balances, token portfolios with values, transaction details, NFTs, whale detection, and live network stats. Uses Solana RPC + CoinGecko. No API key required. | `blockchain/solana` |
-
-## creative
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `blender-mcp` | Control Blender directly from Hermes via socket connection to the blender-mcp addon. Create 3D objects, materials, animations, and run arbitrary Blender Python (bpy) code. | `creative/blender-mcp` |
-| `meme-generation` | Generate real meme images by picking a template and overlaying text with Pillow. Produces actual .png meme files. | `creative/meme-generation` |
-| `touchdesigner-mcp` | Control a running TouchDesigner instance via the twozero MCP plugin — create operators, set parameters, wire connections, execute Python, build real-time audio-reactive visuals and GLSL networks. 36 native tools. | `creative/touchdesigner-mcp` |
-
-## devops
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `docker-management` | Manage Docker containers, images, volumes, networks, and Compose stacks — lifecycle ops, debugging, cleanup, and Dockerfile optimization. | `devops/docker-management` |
-
-## email
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `agentmail` | Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses (e.g. hermes-agent@agentmail.to). | `email/agentmail` |
-
-## health
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `neuroskill-bci` | Connect to a running NeuroSkill instance and incorporate the user's real-time cognitive and emotional state (focus, relaxation, mood, cognitive load, drowsiness, heart rate, HRV, sleep staging, and 40+ derived EXG scores) into responses. Requires a BCI wearable (Muse 2/S or OpenBCI) and the NeuroSkill desktop app. | `health/neuroskill-bci` |
-
-## mcp
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `fastmcp` | Build, test, inspect, install, and deploy MCP servers with FastMCP in Python. Use when creating a new MCP server, wrapping an API or database as MCP tools, exposing resources or prompts, or preparing a FastMCP server for HTTP deployment. | `mcp/fastmcp` |
-
-## migration
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `openclaw-migration` | Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports Hermes-compatible memories, SOUL.md, command allowlists, user skills, and selected workspace assets from ~/.openclaw, then reports what could not be migrated and why. | `migration/openclaw-migration` |
-
-## productivity
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `telephony` | Give Hermes phone capabilities — provision and persist a Twilio number, send and receive SMS/MMS, make direct calls, and place AI-driven outbound calls through Bland.ai or Vapi. | `productivity/telephony` |
-
-## research
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `bioinformatics` | Gateway to 400+ bioinformatics skills from bioSkills and ClawBio. Covers genomics, transcriptomics, single-cell, variant calling, pharmacogenomics, metagenomics, structural biology, and more. | `research/bioinformatics` |
-| `qmd` | Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. Supports CLI and MCP integration. | `research/qmd` |
-
-## security
-
-| Skill | Description | Path |
-|-------|-------------|------|
-| `1password` | Set up and use 1Password CLI (op). Use when installing the CLI, enabling desktop app integration, signing in, and reading/injecting secrets for commands. | `security/1password` |
-| `oss-forensics` | Supply chain investigation, evidence recovery, and forensic analysis for GitHub repositories. Covers deleted commit recovery, force-push detection, IOC extraction, multi-source evidence collection, and structured forensic reporting. | `security/oss-forensics` |
-| `sherlock` | OSINT username search across 400+ social networks. Hunt down social media accounts by username.
| `security/sherlock` | +| [`plan`](/docs/user-guide/skills/bundled/software-development/software-development-plan) | Plan mode for Hermes — inspect context, write a markdown plan into the active workspace's `.hermes/plans/` directory, and do not execute the work. | `software-development/plan` | +| [`requesting-code-review`](/docs/user-guide/skills/bundled/software-development/software-development-requesting-code-review) | Pre-commit verification pipeline — static security scan, baseline-aware quality gates, independent reviewer subagent, and auto-fix loop. Use after code changes and before committing, pushing, or opening a PR. | `software-development/requesting-code-review` | +| [`subagent-driven-development`](/docs/user-guide/skills/bundled/software-development/software-development-subagent-driven-development) | Use when executing implementation plans with independent tasks. Dispatches fresh delegate_task per task with two-stage review (spec compliance then code quality). | `software-development/subagent-driven-development` | +| [`systematic-debugging`](/docs/user-guide/skills/bundled/software-development/software-development-systematic-debugging) | Use when encountering any bug, test failure, or unexpected behavior. 4-phase root cause investigation — NO fixes without understanding the problem first. | `software-development/systematic-debugging` | +| [`test-driven-development`](/docs/user-guide/skills/bundled/software-development/software-development-test-driven-development) | Use when implementing any feature or bugfix, before writing implementation code. Enforces RED-GREEN-REFACTOR cycle with test-first approach. | `software-development/test-driven-development` | +| [`writing-plans`](/docs/user-guide/skills/bundled/software-development/software-development-writing-plans) | Use when you have a spec or requirements for a multi-step task. Creates comprehensive implementation plans with bite-sized tasks, exact file paths, and complete code examples. 
| `software-development/writing-plans` | diff --git a/website/docs/user-guide/skills/bundled/apple/apple-apple-notes.md b/website/docs/user-guide/skills/bundled/apple/apple-apple-notes.md new file mode 100644 index 000000000..b3a4905f0 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/apple/apple-apple-notes.md @@ -0,0 +1,106 @@ +--- +title: "Apple Notes — Manage Apple Notes via the memo CLI on macOS (create, view, search, edit)" +sidebar_label: "Apple Notes" +description: "Manage Apple Notes via the memo CLI on macOS (create, view, search, edit)" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Apple Notes + +Manage Apple Notes via the memo CLI on macOS (create, view, search, edit). + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/apple/apple-notes` | +| Version | `1.0.0` | +| Author | Hermes Agent | +| License | MIT | +| Platforms | macos | +| Tags | `Notes`, `Apple`, `macOS`, `note-taking` | +| Related skills | [`obsidian`](/docs/user-guide/skills/bundled/note-taking/note-taking-obsidian) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Apple Notes + +Use `memo` to manage Apple Notes directly from the terminal. Notes sync across all Apple devices via iCloud. 
+ +## Prerequisites + +- **macOS** with Notes.app +- Install: `brew tap antoniorodr/memo && brew install antoniorodr/memo/memo` +- Grant Automation access to Notes.app when prompted (System Settings → Privacy → Automation) + +## When to Use + +- User asks to create, view, or search Apple Notes +- Saving information to Notes.app for cross-device access +- Organizing notes into folders +- Exporting notes to Markdown/HTML + +## When NOT to Use + +- Obsidian vault management → use the `obsidian` skill +- Bear Notes → separate app (not supported here) +- Quick agent-only notes → use the `memory` tool instead + +## Quick Reference + +### View Notes + +```bash +memo notes # List all notes +memo notes -f "Folder Name" # Filter by folder +memo notes -s "query" # Search notes (fuzzy) +``` + +### Create Notes + +```bash +memo notes -a # Interactive editor +memo notes -a "Note Title" # Quick add with title +``` + +### Edit Notes + +```bash +memo notes -e # Interactive selection to edit +``` + +### Delete Notes + +```bash +memo notes -d # Interactive selection to delete +``` + +### Move Notes + +```bash +memo notes -m # Move note to folder (interactive) +``` + +### Export Notes + +```bash +memo notes -ex # Export to HTML/Markdown +``` + +## Limitations + +- Cannot edit notes containing images or attachments +- Interactive prompts require terminal access (use pty=true if needed) +- macOS only — requires Apple Notes.app + +## Rules + +1. Prefer Apple Notes when user wants cross-device sync (iPhone/iPad/Mac) +2. Use the `memory` tool for agent-internal notes that don't need to sync +3. 
Use the `obsidian` skill for Markdown-native knowledge management diff --git a/website/docs/user-guide/skills/bundled/apple/apple-apple-reminders.md b/website/docs/user-guide/skills/bundled/apple/apple-apple-reminders.md new file mode 100644 index 000000000..c7e01a844 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/apple/apple-apple-reminders.md @@ -0,0 +1,114 @@ +--- +title: "Apple Reminders — Manage Apple Reminders via remindctl CLI (list, add, complete, delete)" +sidebar_label: "Apple Reminders" +description: "Manage Apple Reminders via remindctl CLI (list, add, complete, delete)" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Apple Reminders + +Manage Apple Reminders via remindctl CLI (list, add, complete, delete). + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/apple/apple-reminders` | +| Version | `1.0.0` | +| Author | Hermes Agent | +| License | MIT | +| Platforms | macos | +| Tags | `Reminders`, `tasks`, `todo`, `macOS`, `Apple` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Apple Reminders + +Use `remindctl` to manage Apple Reminders directly from the terminal. Tasks sync across all Apple devices via iCloud. 
+ +## Prerequisites + +- **macOS** with Reminders.app +- Install: `brew install steipete/tap/remindctl` +- Grant Reminders permission when prompted +- Check: `remindctl status` / Request: `remindctl authorize` + +## When to Use + +- User mentions "reminder" or "Reminders app" +- Creating personal to-dos with due dates that sync to iOS +- Managing Apple Reminders lists +- User wants tasks to appear on their iPhone/iPad + +## When NOT to Use + +- Scheduling agent alerts → use the cronjob tool instead +- Calendar events → use Apple Calendar or Google Calendar +- Project task management → use GitHub Issues, Notion, etc. +- If user says "remind me" but means an agent alert → clarify first + +## Quick Reference + +### View Reminders + +```bash +remindctl # Today's reminders +remindctl today # Today +remindctl tomorrow # Tomorrow +remindctl week # This week +remindctl overdue # Past due +remindctl all # Everything +remindctl 2026-01-04 # Specific date +``` + +### Manage Lists + +```bash +remindctl list # List all lists +remindctl list Work # Show specific list +remindctl list Projects --create # Create list +remindctl list Work --delete # Delete list +``` + +### Create Reminders + +```bash +remindctl add "Buy milk" +remindctl add --title "Call mom" --list Personal --due tomorrow +remindctl add --title "Meeting prep" --due "2026-02-15 09:00" +``` + +### Complete / Delete + +```bash +remindctl complete 1 2 3 # Complete by ID +remindctl delete 4A83 --force # Delete by ID +``` + +### Output Formats + +```bash +remindctl today --json # JSON for scripting +remindctl today --plain # TSV format +remindctl today --quiet # Counts only +``` + +## Date Formats + +Accepted by `--due` and date filters: +- `today`, `tomorrow`, `yesterday` +- `YYYY-MM-DD` +- `YYYY-MM-DD HH:mm` +- ISO 8601 (`2026-01-04T12:34:56Z`) + +## Rules + +1. When user says "remind me", clarify: Apple Reminders (syncs to phone) vs agent cronjob alert +2. 
Always confirm reminder content and due date before creating +3. Use `--json` for programmatic parsing diff --git a/website/docs/user-guide/skills/bundled/apple/apple-findmy.md b/website/docs/user-guide/skills/bundled/apple/apple-findmy.md new file mode 100644 index 000000000..bf193c81b --- /dev/null +++ b/website/docs/user-guide/skills/bundled/apple/apple-findmy.md @@ -0,0 +1,149 @@ +--- +title: "Findmy — Track Apple devices and AirTags via FindMy" +sidebar_label: "Findmy" +description: "Track Apple devices and AirTags via FindMy" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Findmy + +Track Apple devices and AirTags via FindMy.app on macOS using AppleScript and screen capture. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/apple/findmy` | +| Version | `1.0.0` | +| Author | Hermes Agent | +| License | MIT | +| Platforms | macos | +| Tags | `FindMy`, `AirTag`, `location`, `tracking`, `macOS`, `Apple` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Find My (Apple) + +Track Apple devices and AirTags via the FindMy.app on macOS. Since Apple doesn't +provide a CLI for FindMy, this skill uses AppleScript to open the app and +screen capture to read device locations. + +## Prerequisites + +- **macOS** with Find My app and iCloud signed in +- Devices/AirTags already registered in Find My +- Screen Recording permission for terminal (System Settings → Privacy → Screen Recording) +- **Optional but recommended**: Install `peekaboo` for better UI automation: + `brew install steipete/tap/peekaboo` + +## When to Use + +- User asks "where is my [device/cat/keys/bag]?" 
+- Tracking AirTag locations +- Checking device locations (iPhone, iPad, Mac, AirPods) +- Monitoring pet or item movement over time (AirTag patrol routes) + +## Method 1: AppleScript + Screenshot (Basic) + +### Open FindMy and Navigate + +```bash +# Open Find My app +osascript -e 'tell application "FindMy" to activate' + +# Wait for it to load +sleep 3 + +# Take a screenshot of the Find My window +screencapture -w -o /tmp/findmy.png +``` + +Then use `vision_analyze` to read the screenshot: +``` +vision_analyze(image_url="/tmp/findmy.png", question="What devices/items are shown and what are their locations?") +``` + +### Switch Between Tabs + +```bash +# Switch to Devices tab +osascript -e ' +tell application "System Events" + tell process "FindMy" + click button "Devices" of toolbar 1 of window 1 + end tell +end tell' + +# Switch to Items tab (AirTags) +osascript -e ' +tell application "System Events" + tell process "FindMy" + click button "Items" of toolbar 1 of window 1 + end tell +end tell' +``` + +## Method 2: Peekaboo UI Automation (Recommended) + +If `peekaboo` is installed, use it for more reliable UI interaction: + +```bash +# Open Find My +osascript -e 'tell application "FindMy" to activate' +sleep 3 + +# Capture and annotate the UI +peekaboo see --app "FindMy" --annotate --path /tmp/findmy-ui.png + +# Click on a specific device/item by element ID +peekaboo click --on B3 --app "FindMy" + +# Capture the detail view +peekaboo image --app "FindMy" --path /tmp/findmy-detail.png +``` + +Then analyze with vision: +``` +vision_analyze(image_url="/tmp/findmy-detail.png", question="What is the location shown for this device/item? Include address and coordinates if visible.") +``` + +## Workflow: Track AirTag Location Over Time + +For monitoring an AirTag (e.g., tracking a cat's patrol route): + +```bash +# 1. Open FindMy to Items tab +osascript -e 'tell application "FindMy" to activate' +sleep 3 + +# 2. 
Click on the AirTag item (stay on page — AirTag only updates when page is open) + +# 3. Periodically capture location +while true; do + screencapture -w -o /tmp/findmy-$(date +%H%M%S).png + sleep 300 # Every 5 minutes +done +``` + +Analyze each screenshot with vision to extract coordinates, then compile a route. + +## Limitations + +- FindMy has **no CLI or API** — must use UI automation +- AirTags only update location while the FindMy page is actively displayed +- Location accuracy depends on nearby Apple devices in the FindMy network +- Screen Recording permission required for screenshots +- AppleScript UI automation may break across macOS versions + +## Rules + +1. Keep FindMy app in the foreground when tracking AirTags (updates stop when minimized) +2. Use `vision_analyze` to read screenshot content — don't try to parse pixels +3. For ongoing tracking, use a cronjob to periodically capture and log locations +4. Respect privacy — only track devices/items the user owns diff --git a/website/docs/user-guide/skills/bundled/apple/apple-imessage.md b/website/docs/user-guide/skills/bundled/apple/apple-imessage.md new file mode 100644 index 000000000..d29fab6eb --- /dev/null +++ b/website/docs/user-guide/skills/bundled/apple/apple-imessage.md @@ -0,0 +1,118 @@ +--- +title: "Imessage — Send and receive iMessages/SMS via the imsg CLI on macOS" +sidebar_label: "Imessage" +description: "Send and receive iMessages/SMS via the imsg CLI on macOS" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Imessage + +Send and receive iMessages/SMS via the imsg CLI on macOS. 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/apple/imessage` | +| Version | `1.0.0` | +| Author | Hermes Agent | +| License | MIT | +| Platforms | macos | +| Tags | `iMessage`, `SMS`, `messaging`, `macOS`, `Apple` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# iMessage + +Use `imsg` to read and send iMessage/SMS via macOS Messages.app. + +## Prerequisites + +- **macOS** with Messages.app signed in +- Install: `brew install steipete/tap/imsg` +- Grant Full Disk Access for terminal (System Settings → Privacy → Full Disk Access) +- Grant Automation permission for Messages.app when prompted + +## When to Use + +- User asks to send an iMessage or text message +- Reading iMessage conversation history +- Checking recent Messages.app chats +- Sending to phone numbers or Apple IDs + +## When NOT to Use + +- Telegram/Discord/Slack/WhatsApp messages → use the appropriate gateway channel +- Group chat management (adding/removing members) → not supported +- Bulk/mass messaging → always confirm with user first + +## Quick Reference + +### List Chats + +```bash +imsg chats --limit 10 --json +``` + +### View History + +```bash +# By chat ID +imsg history --chat-id 1 --limit 20 --json + +# With attachments info +imsg history --chat-id 1 --limit 20 --attachments --json +``` + +### Send Messages + +```bash +# Text only +imsg send --to "+14155551212" --text "Hello!" 
+ +# With attachment +imsg send --to "+14155551212" --text "Check this out" --file /path/to/image.jpg + +# Force iMessage or SMS +imsg send --to "+14155551212" --text "Hi" --service imessage +imsg send --to "+14155551212" --text "Hi" --service sms +``` + +### Watch for New Messages + +```bash +imsg watch --chat-id 1 --attachments +``` + +## Service Options + +- `--service imessage` — Force iMessage (requires recipient has iMessage) +- `--service sms` — Force SMS (green bubble) +- `--service auto` — Let Messages.app decide (default) + +## Rules + +1. **Always confirm recipient and message content** before sending +2. **Never send to unknown numbers** without explicit user approval +3. **Verify file paths** exist before attaching +4. **Don't spam** — rate-limit yourself + +## Example Workflow + +User: "Text mom that I'll be late" + +```bash +# 1. Find mom's chat +imsg chats --limit 20 --json | jq '.[] | select(.displayName | contains("Mom"))' + +# 2. Confirm with user: "Found Mom at +1555123456. Send 'I'll be late' via iMessage?" + +# 3. Send after confirmation +imsg send --to "+1555123456" --text "I'll be late" +``` diff --git a/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code.md b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code.md new file mode 100644 index 000000000..515f12ba8 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code.md @@ -0,0 +1,762 @@ +--- +title: "Claude Code — Delegate coding tasks to Claude Code (Anthropic's CLI agent)" +sidebar_label: "Claude Code" +description: "Delegate coding tasks to Claude Code (Anthropic's CLI agent)" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Claude Code + +Delegate coding tasks to Claude Code (Anthropic's CLI agent). 
Use for building features, refactoring, PR reviews, and iterative coding. Requires the claude CLI installed. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/autonomous-ai-agents/claude-code` | +| Version | `2.2.0` | +| Author | Hermes Agent + Teknium | +| License | MIT | +| Tags | `Coding-Agent`, `Claude`, `Anthropic`, `Code-Review`, `Refactoring`, `PTY`, `Automation` | +| Related skills | [`codex`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex), [`hermes-agent`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent), [`opencode`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-opencode) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Claude Code — Hermes Orchestration Guide + +Delegate coding tasks to [Claude Code](https://code.claude.com/docs/en/cli-reference) (Anthropic's autonomous coding agent CLI) via the Hermes terminal. Claude Code v2.x can read files, write code, run shell commands, spawn subagents, and manage git workflows autonomously. + +## Prerequisites + +- **Install:** `npm install -g @anthropic-ai/claude-code` +- **Auth:** run `claude` once to log in (browser OAuth for Pro/Max, or set `ANTHROPIC_API_KEY`) +- **Console auth:** `claude auth login --console` for API key billing +- **SSO auth:** `claude auth login --sso` for Enterprise +- **Check status:** `claude auth status` (JSON) or `claude auth status --text` (human-readable) +- **Health check:** `claude doctor` — checks auto-updater and installation health +- **Version check:** `claude --version` (requires v2.x+) +- **Update:** `claude update` or `claude upgrade` + +## Two Orchestration Modes + +Hermes interacts with Claude Code in two fundamentally different ways. 
Choose based on the task. + +### Mode 1: Print Mode (`-p`) — Non-Interactive (PREFERRED for most tasks) + +Print mode runs a one-shot task, returns the result, and exits. No PTY needed. No interactive prompts. This is the cleanest integration path. + +``` +terminal(command="claude -p 'Add error handling to all API calls in src/' --allowedTools 'Read,Edit' --max-turns 10", workdir="/path/to/project", timeout=120) +``` + +**When to use print mode:** +- One-shot coding tasks (fix a bug, add a feature, refactor) +- CI/CD automation and scripting +- Structured data extraction with `--json-schema` +- Piped input processing (`cat file | claude -p "analyze this"`) +- Any task where you don't need multi-turn conversation + +**Print mode skips ALL interactive dialogs** — no workspace trust prompt, no permission confirmations. This makes it ideal for automation. + +### Mode 2: Interactive PTY via tmux — Multi-Turn Sessions + +Interactive mode gives you a full conversational REPL where you can send follow-up prompts, use slash commands, and watch Claude work in real time. 
**Requires tmux orchestration.** + +``` +# Start a tmux session +terminal(command="tmux new-session -d -s claude-work -x 140 -y 40") + +# Launch Claude Code inside it +terminal(command="tmux send-keys -t claude-work 'cd /path/to/project && claude' Enter") + +# Wait for startup, then send your task +# (after ~3-5 seconds for the welcome screen) +terminal(command="sleep 5 && tmux send-keys -t claude-work 'Refactor the auth module to use JWT tokens' Enter") + +# Monitor progress by capturing the pane +terminal(command="sleep 15 && tmux capture-pane -t claude-work -p -S -50") + +# Send follow-up tasks +terminal(command="tmux send-keys -t claude-work 'Now add unit tests for the new JWT code' Enter") + +# Exit when done +terminal(command="tmux send-keys -t claude-work '/exit' Enter") +``` + +**When to use interactive mode:** +- Multi-turn iterative work (refactor → review → fix → test cycle) +- Tasks requiring human-in-the-loop decisions +- Exploratory coding sessions +- When you need to use Claude's slash commands (`/compact`, `/review`, `/model`) + +## PTY Dialog Handling (CRITICAL for Interactive Mode) + +Claude Code presents up to two confirmation dialogs on first launch. You MUST handle these via tmux send-keys: + +### Dialog 1: Workspace Trust (first visit to a directory) +``` +❯ 1. Yes, I trust this folder ← DEFAULT (just press Enter) + 2. No, exit +``` +**Handling:** `tmux send-keys -t Enter` — default selection is correct. + +### Dialog 2: Bypass Permissions Warning (only with --dangerously-skip-permissions) +``` +❯ 1. No, exit ← DEFAULT (WRONG choice!) + 2. 
Yes, I accept +``` +**Handling:** Must navigate DOWN first, then Enter: +``` +tmux send-keys -t Down && sleep 0.3 && tmux send-keys -t Enter +``` + +### Robust Dialog Handling Pattern +``` +# Launch with permissions bypass +terminal(command="tmux send-keys -t claude-work 'claude --dangerously-skip-permissions \"your task\"' Enter") + +# Handle trust dialog (Enter for default "Yes") +terminal(command="sleep 4 && tmux send-keys -t claude-work Enter") + +# Handle permissions dialog (Down then Enter for "Yes, I accept") +terminal(command="sleep 3 && tmux send-keys -t claude-work Down && sleep 0.3 && tmux send-keys -t claude-work Enter") + +# Now wait for Claude to work +terminal(command="sleep 15 && tmux capture-pane -t claude-work -p -S -60") +``` + +**Note:** After the first trust acceptance for a directory, the trust dialog won't appear again. Only the permissions dialog recurs each time you use `--dangerously-skip-permissions`. + +## CLI Subcommands + +| Subcommand | Purpose | +|------------|---------| +| `claude` | Start interactive REPL | +| `claude "query"` | Start REPL with initial prompt | +| `claude -p "query"` | Print mode (non-interactive, exits when done) | +| `cat file \| claude -p "query"` | Pipe content as stdin context | +| `claude -c` | Continue the most recent conversation in this directory | +| `claude -r "id"` | Resume a specific session by ID or name | +| `claude auth login` | Sign in (add `--console` for API billing, `--sso` for Enterprise) | +| `claude auth status` | Check login status (returns JSON; `--text` for human-readable) | +| `claude mcp add -- ` | Add an MCP server | +| `claude mcp list` | List configured MCP servers | +| `claude mcp remove ` | Remove an MCP server | +| `claude agents` | List configured agents | +| `claude doctor` | Run health checks on installation and auto-updater | +| `claude update` / `claude upgrade` | Update Claude Code to latest version | +| `claude remote-control` | Start server to control Claude from claude.ai 
or mobile app | +| `claude install [target]` | Install native build (stable, latest, or specific version) | +| `claude setup-token` | Set up long-lived auth token (requires subscription) | +| `claude plugin` / `claude plugins` | Manage Claude Code plugins | +| `claude auto-mode` | Inspect auto mode classifier configuration | + +## Print Mode Deep Dive + +### Structured JSON Output +``` +terminal(command="claude -p 'Analyze auth.py for security issues' --output-format json --max-turns 5", workdir="/project", timeout=120) +``` + +Returns a JSON object with: +```json +{ + "type": "result", + "subtype": "success", + "result": "The analysis text...", + "session_id": "75e2167f-...", + "num_turns": 3, + "total_cost_usd": 0.0787, + "duration_ms": 10276, + "stop_reason": "end_turn", + "terminal_reason": "completed", + "usage": { "input_tokens": 5, "output_tokens": 603, ... }, + "modelUsage": { "claude-sonnet-4-6": { "costUSD": 0.078, "contextWindow": 200000 } } +} +``` + +**Key fields:** `session_id` for resumption, `num_turns` for agentic loop count, `total_cost_usd` for spend tracking, `subtype` for success/error detection (`success`, `error_max_turns`, `error_budget`). + +### Streaming JSON Output +For real-time token streaming, use `stream-json` with `--verbose`: +``` +terminal(command="claude -p 'Write a summary' --output-format stream-json --verbose --include-partial-messages", timeout=60) +``` + +Returns newline-delimited JSON events. Filter with jq for live text: +``` +claude -p "Explain X" --output-format stream-json --verbose --include-partial-messages | \ + jq -rj 'select(.type == "stream_event" and .event.delta.type? == "text_delta") | .event.delta.text' +``` + +Stream events include `system/api_retry` with `attempt`, `max_retries`, and `error` fields (e.g., `rate_limit`, `billing_error`). 
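A minimal sketch of how an orchestrator might consume the print-mode result object shown above. The field names come from the example; the literal values and the capture path are illustrative, not output from a real run:

```python
import json

# A print-mode result shaped like the example above — in practice this
# would be captured via: claude -p '...' --output-format json > result.json
raw = """
{
  "type": "result",
  "subtype": "success",
  "result": "The analysis text...",
  "session_id": "75e2167f-0000-0000-0000-000000000000",
  "num_turns": 3,
  "total_cost_usd": 0.0787
}
"""

result = json.loads(raw)

# subtype is the success/error discriminator (success, error_max_turns, error_budget)
if result["subtype"] != "success":
    raise RuntimeError(f"claude run failed: {result['subtype']}")

session_id = result["session_id"]  # keep for later --resume
cost = result["total_cost_usd"]    # accumulate for spend tracking
print(f"{result['num_turns']} turns, ${cost:.4f}, session {session_id}")
```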
+ +### Bidirectional Streaming +For real-time input AND output streaming: +``` +claude -p "task" --input-format stream-json --output-format stream-json --replay-user-messages +``` +`--replay-user-messages` re-emits user messages on stdout for acknowledgment. + +### Piped Input +``` +# Pipe a file for analysis +terminal(command="cat src/auth.py | claude -p 'Review this code for bugs' --max-turns 1", timeout=60) + +# Pipe multiple files +terminal(command="cat src/*.py | claude -p 'Find all TODO comments' --max-turns 1", timeout=60) + +# Pipe command output +terminal(command="git diff HEAD~3 | claude -p 'Summarize these changes' --max-turns 1", timeout=60) +``` + +### JSON Schema for Structured Extraction +``` +terminal(command="claude -p 'List all functions in src/' --output-format json --json-schema '{\"type\":\"object\",\"properties\":{\"functions\":{\"type\":\"array\",\"items\":{\"type\":\"string\"}}},\"required\":[\"functions\"]}' --max-turns 5", workdir="/project", timeout=90) +``` + +Parse `structured_output` from the JSON result. Claude validates output against the schema before returning. + +### Session Continuation +``` +# Start a task +terminal(command="claude -p 'Start refactoring the database layer' --output-format json --max-turns 10 > /tmp/session.json", workdir="/project", timeout=180) + +# Resume with session ID +terminal(command="claude -p 'Continue and add connection pooling' --resume $(cat /tmp/session.json | python3 -c 'import json,sys; print(json.load(sys.stdin)[\"session_id\"])') --max-turns 5", workdir="/project", timeout=120) + +# Or resume the most recent session in the same directory +terminal(command="claude -p 'What did you do last time?' 
--continue --max-turns 1", workdir="/project", timeout=30) + +# Fork a session (new ID, keeps history) +terminal(command="claude -p 'Try a different approach' --resume --fork-session --max-turns 10", workdir="/project", timeout=120) +``` + +### Bare Mode for CI/Scripting +``` +terminal(command="claude --bare -p 'Run all tests and report failures' --allowedTools 'Read,Bash' --max-turns 10", workdir="/project", timeout=180) +``` + +`--bare` skips hooks, plugins, MCP discovery, and CLAUDE.md loading. Fastest startup. Requires `ANTHROPIC_API_KEY` (skips OAuth). + +To selectively load context in bare mode: +| To load | Flag | +|---------|------| +| System prompt additions | `--append-system-prompt "text"` or `--append-system-prompt-file path` | +| Settings | `--settings ` | +| MCP servers | `--mcp-config ` | +| Custom agents | `--agents ''` | + +### Fallback Model for Overload +``` +terminal(command="claude -p 'task' --fallback-model haiku --max-turns 5", timeout=90) +``` +Automatically falls back to the specified model when the default is overloaded (print mode only). 
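Putting the print-mode guards together, a wrapper can assemble one guarded invocation. This sketch only builds the command string rather than executing it; the flag names are from this reference, while the function name, prompt, and limit values are illustrative assumptions:

```python
import shlex

def guarded_claude_cmd(prompt: str, max_turns: int = 10,
                       budget_usd: float = 0.50,
                       fallback_model: str = "haiku") -> str:
    """Build a print-mode invocation with turn, budget, and fallback guards."""
    argv = [
        "claude", "-p", prompt,
        "--output-format", "json",            # machine-readable result object
        "--max-turns", str(max_turns),        # print-mode only: cap agentic loops
        "--max-budget-usd", str(budget_usd),  # print-mode only: cap spend
        "--fallback-model", fallback_model,   # print-mode only: overload fallback
    ]
    return shlex.join(argv)  # shell-safe quoting for terminal(command=...)

print(guarded_claude_cmd("Run all tests and report failures"))
```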
+
+## Complete CLI Flags Reference
+
+### Session & Environment
+| Flag | Effect |
+|------|--------|
+| `-p, --print` | Non-interactive one-shot mode (exits when done) |
+| `-c, --continue` | Resume most recent conversation in current directory |
+| `-r, --resume <id>` | Resume specific session by ID or name (interactive picker if no ID) |
+| `--fork-session` | When resuming, create new session ID instead of reusing original |
+| `--session-id <uuid>` | Use a specific UUID for the conversation |
+| `--no-session-persistence` | Don't save session to disk (print mode only) |
+| `--add-dir <dir>` | Grant Claude access to additional working directories |
+| `-w, --worktree [name]` | Run in an isolated git worktree at `.claude/worktrees/<name>` |
+| `--tmux` | Create a tmux session for the worktree (requires `--worktree`) |
+| `--ide` | Auto-connect to a valid IDE on startup |
+| `--chrome` / `--no-chrome` | Enable/disable Chrome browser integration for web testing |
+| `--from-pr [number]` | Resume session linked to a specific GitHub PR |
+| `--file <spec>` | File resources to download at startup (format: `file_id:relative_path`) |
+
+### Model & Performance
+| Flag | Effect |
+|------|--------|
+| `--model <model>` | Model selection: `sonnet`, `opus`, `haiku`, or full name like `claude-sonnet-4-6` |
+| `--effort <level>` | Reasoning depth: `low`, `medium`, `high`, `max`, `auto` |
+| `--max-turns <n>` | Limit agentic loops (print mode only; prevents runaway) |
+| `--max-budget-usd <amount>` | Cap API spend in dollars (print mode only) |
+| `--fallback-model <model>` | Auto-fallback when default model is overloaded (print mode only) |
+| `--betas <headers>` | Beta headers to include in API requests (API key users only) |
+
+### Permission & Safety
+| Flag | Effect |
+|------|--------|
+| `--dangerously-skip-permissions` | Auto-approve ALL tool use (file writes, bash, network, etc.)
|
+| `--allow-dangerously-skip-permissions` | Enable bypass as an *option* without enabling it by default |
+| `--permission-mode <mode>` | `default`, `acceptEdits`, `plan`, `auto`, `dontAsk`, `bypassPermissions` |
+| `--allowedTools <tools>` | Whitelist specific tools (comma or space-separated) |
+| `--disallowedTools <tools>` | Blacklist specific tools |
+| `--tools <tools>` | Override built-in tool set (`""` = none, `"default"` = all, or tool names) |
+
+### Output & Input Format
+| Flag | Effect |
+|------|--------|
+| `--output-format <fmt>` | `text` (default), `json` (single result object), `stream-json` (newline-delimited) |
+| `--input-format <fmt>` | `text` (default) or `stream-json` (real-time streaming input) |
+| `--json-schema <schema>` | Force structured JSON output matching a schema |
+| `--verbose` | Full turn-by-turn output |
+| `--include-partial-messages` | Include partial message chunks as they arrive (stream-json + print) |
+| `--replay-user-messages` | Re-emit user messages on stdout (stream-json bidirectional) |
+
+### System Prompt & Context
+| Flag | Effect |
+|------|--------|
+| `--append-system-prompt <text>` | **Add** to the default system prompt (preserves built-in capabilities) |
+| `--append-system-prompt-file <path>` | **Add** file contents to the default system prompt |
+| `--system-prompt <text>` | **Replace** the entire system prompt (usually prefer `--append-system-prompt` instead) |
+| `--system-prompt-file <path>` | **Replace** the system prompt with file contents |
+| `--bare` | Skip hooks, plugins, MCP discovery, CLAUDE.md, OAuth (fastest startup) |
+| `--agents '<json>'` | Define custom subagents dynamically as JSON |
+| `--mcp-config <file>` | Load MCP servers from JSON file (repeatable) |
+| `--strict-mcp-config` | Only use MCP servers from `--mcp-config`, ignoring all other MCP configs |
+| `--settings <file>` | Load additional settings from a JSON file or inline JSON |
+| `--setting-sources <list>` | Comma-separated sources to load: `user`, `project`, `local` |
+| `--plugin-dir <dir>` | Load plugins from directories for this session
only |
+| `--disable-slash-commands` | Disable all skills/slash commands |
+
+### Debugging
+| Flag | Effect |
+|------|--------|
+| `-d, --debug [filter]` | Enable debug logging with optional category filter (e.g., `"api,hooks"`, `"!1p,!file"`) |
+| `--debug-file <path>` | Write debug logs to file (implicitly enables debug mode) |
+
+### Agent Teams
+| Flag | Effect |
+|------|--------|
+| `--teammate-mode <mode>` | How agent teams display: `auto`, `in-process`, or `tmux` |
+| `--brief` | Enable `SendUserMessage` tool for agent-to-user communication |
+
+### Tool Name Syntax for --allowedTools / --disallowedTools
+```
+Read                  # All file reading
+Edit                  # File editing (existing files)
+Write                 # File creation (new files)
+Bash                  # All shell commands
+Bash(git *)           # Only git commands
+Bash(git commit *)    # Only git commit commands
+Bash(npm run lint:*)  # Pattern matching with wildcards
+WebSearch             # Web search capability
+WebFetch              # Web page fetching
+mcp__<server>__<tool> # Specific MCP tool
+```
+
+## Settings & Configuration
+
+### Settings Hierarchy (highest to lowest priority)
+1. **CLI flags** — override everything
+2. **Local project:** `.claude/settings.local.json` (personal, gitignored)
+3. **Project:** `.claude/settings.json` (shared, git-tracked)
+4. **User:** `~/.claude/settings.json` (global)
+
+### Permissions in Settings
+```json
+{
+  "permissions": {
+    "allow": ["Bash(npm run lint:*)", "WebSearch", "Read"],
+    "ask": ["Write(*.ts)", "Bash(git push*)"],
+    "deny": ["Read(.env)", "Bash(rm -rf *)"]
+  }
+}
+```
+
+### Memory Files (CLAUDE.md) Hierarchy
+1. **Global:** `~/.claude/CLAUDE.md` — applies to all projects
+2. **Project:** `./CLAUDE.md` — project-specific context (git-tracked)
+3. **Local:** `.claude/CLAUDE.local.md` — personal project overrides (gitignored)
+
+Use the `#` prefix in interactive mode to quickly add to memory: `# Always use 2-space indentation`.
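
The precedence rules above can be sketched as a first-match lookup over the four layers. This is illustrative only; the real merge logic lives inside Claude Code and may combine layers rather than pick one:

```python
def resolve_setting(key, cli_flags, local_project, project, user):
    """Return the effective value of one setting under the hierarchy:
    CLI flags > .claude/settings.local.json > .claude/settings.json >
    ~/.claude/settings.json. First layer that defines the key wins."""
    for layer in (cli_flags, local_project, project, user):
        if key in layer:
            return layer[key]
    return None
```

For example, a `--model` flag on the command line beats a `model` entry in any settings file, while a user-level default only applies when no other layer sets it.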
+ +## Interactive Session: Slash Commands + +### Session & Context +| Command | Purpose | +|---------|---------| +| `/help` | Show all commands (including custom and MCP commands) | +| `/compact [focus]` | Compress context to save tokens; CLAUDE.md survives compaction. E.g., `/compact focus on auth logic` | +| `/clear` | Wipe conversation history for a fresh start | +| `/context` | Visualize context usage as a colored grid with optimization tips | +| `/cost` | View token usage with per-model and cache-hit breakdowns | +| `/resume` | Switch to or resume a different session | +| `/rewind` | Revert to a previous checkpoint in conversation or code | +| `/btw ` | Ask a side question without adding to context cost | +| `/status` | Show version, connectivity, and session info | +| `/todos` | List tracked action items from the conversation | +| `/exit` or `Ctrl+D` | End session | + +### Development & Review +| Command | Purpose | +|---------|---------| +| `/review` | Request code review of current changes | +| `/security-review` | Perform security analysis of current changes | +| `/plan [description]` | Enter Plan mode with auto-start for task planning | +| `/loop [interval]` | Schedule recurring tasks within the session | +| `/batch` | Auto-create worktrees for large parallel changes (5-30 worktrees) | + +### Configuration & Tools +| Command | Purpose | +|---------|---------| +| `/model [model]` | Switch models mid-session (use arrow keys to adjust effort) | +| `/effort [level]` | Set reasoning effort: `low`, `medium`, `high`, `max`, or `auto` | +| `/init` | Create a CLAUDE.md file for project memory | +| `/memory` | Open CLAUDE.md for editing | +| `/config` | Open interactive settings configuration | +| `/permissions` | View/update tool permissions | +| `/agents` | Manage specialized subagents | +| `/mcp` | Interactive UI to manage MCP servers | +| `/add-dir` | Add additional working directories (useful for monorepos) | +| `/usage` | Show plan limits and rate limit 
status |
+| `/voice` | Enable push-to-talk voice mode (20 languages; hold Space to record, release to send) |
+| `/release-notes` | Interactive picker for version release notes |
+
+### Custom Slash Commands
+Create `.claude/commands/<name>.md` (project-shared) or `~/.claude/commands/<name>.md` (personal):
+
+```markdown
+# .claude/commands/deploy.md
+Run the deploy pipeline:
+1. Run all tests
+2. Build the Docker image
+3. Push to registry
+4. Update the $ARGUMENTS environment (default: staging)
+```
+
+Usage: `/deploy production` — `$ARGUMENTS` is replaced with the user's input.
+
+### Skills (Natural Language Invocation)
+Unlike slash commands (manually invoked), skills in `.claude/skills/` are markdown guides that Claude invokes automatically via natural language when the task matches:
+
+```markdown
+# .claude/skills/database-migration.md
+When asked to create or modify database migrations:
+1. Use Alembic for migration generation
+2. Always create a rollback function
+3. Test migrations against a local database copy
+```
+
+## Interactive Session: Keyboard Shortcuts
+
+### General Controls
+| Key | Action |
+|-----|--------|
+| `Ctrl+C` | Cancel current input or generation |
+| `Ctrl+D` | Exit session |
+| `Ctrl+R` | Reverse search command history |
+| `Ctrl+B` | Background a running task |
+| `Ctrl+V` | Paste image into conversation |
+| `Ctrl+O` | Transcript mode — see Claude's thinking process |
+| `Ctrl+G` or `Ctrl+X Ctrl+E` | Open prompt in external editor |
+| `Esc Esc` | Rewind conversation or code state / summarize |
+
+### Mode Toggles
+| Key | Action |
+|-----|--------|
+| `Shift+Tab` | Cycle permission modes (Normal → Auto-Accept → Plan) |
+| `Alt+P` | Switch model |
+| `Alt+T` | Toggle thinking mode |
+| `Alt+O` | Toggle Fast Mode |
+
+### Multiline Input
+| Key | Action |
+|-----|--------|
+| `\` + `Enter` | Quick newline |
+| `Shift+Enter` | Newline (alternative) |
+| `Ctrl+J` | Newline (alternative) |
+
+### Input Prefixes
+| Prefix | Action |
+|--------|--------| +| `!` | Execute bash directly, bypassing AI (e.g., `!npm test`). Use `!` alone to toggle shell mode. | +| `@` | Reference files/directories with autocomplete (e.g., `@./src/api/`) | +| `#` | Quick add to CLAUDE.md memory (e.g., `# Use 2-space indentation`) | +| `/` | Slash commands | + +### Pro Tip: "ultrathink" +Use the keyword "ultrathink" in your prompt for maximum reasoning effort on a specific turn. This triggers the deepest thinking mode regardless of the current `/effort` setting. + +## PR Review Pattern + +### Quick Review (Print Mode) +``` +terminal(command="cd /path/to/repo && git diff main...feature-branch | claude -p 'Review this diff for bugs, security issues, and style problems. Be thorough.' --max-turns 1", timeout=60) +``` + +### Deep Review (Interactive + Worktree) +``` +terminal(command="tmux new-session -d -s review -x 140 -y 40") +terminal(command="tmux send-keys -t review 'cd /path/to/repo && claude -w pr-review' Enter") +terminal(command="sleep 5 && tmux send-keys -t review Enter") # Trust dialog +terminal(command="sleep 2 && tmux send-keys -t review 'Review all changes vs main. Check for bugs, security issues, race conditions, and missing tests.' Enter") +terminal(command="sleep 30 && tmux capture-pane -t review -p -S -60") +``` + +### PR Review from Number +``` +terminal(command="claude -p 'Review this PR thoroughly' --from-pr 42 --max-turns 10", workdir="/path/to/repo", timeout=120) +``` + +### Claude Worktree with tmux +``` +terminal(command="claude -w feature-x --tmux", workdir="/path/to/repo") +``` +Creates an isolated git worktree at `.claude/worktrees/feature-x` AND a tmux session for it. Uses iTerm2 native panes when available; add `--tmux=classic` for traditional tmux. 
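
The deep-review recipe above uses fixed `sleep` calls before sending keys. A polling helper is more robust: capture the pane and wait for the `❯` prompt (the "waiting for input" indicator described under Monitoring Interactive Sessions). A sketch, assuming tmux is installed and the session name is one you created:

```python
import subprocess
import time

def pane_is_idle(pane_text: str) -> bool:
    """Heuristic: a last non-empty line starting with the ❯ prompt
    means Claude is done or asking a question."""
    lines = [l for l in pane_text.splitlines() if l.strip()]
    return bool(lines) and lines[-1].lstrip().startswith("❯")

def wait_until_idle(session: str, timeout: float = 300.0,
                    poll_every: float = 5.0) -> bool:
    """Poll `tmux capture-pane` until the prompt appears or we time out."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        out = subprocess.run(
            ["tmux", "capture-pane", "-t", session, "-p", "-S", "-10"],
            capture_output=True, text=True,
        ).stdout
        if pane_is_idle(out):
            return True
        time.sleep(poll_every)
    return False
```

Replace each blind `sleep 30 && tmux send-keys ...` step with a `wait_until_idle("review")` check before sending the next instruction.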
+ +## Parallel Claude Instances + +Run multiple independent Claude tasks simultaneously: + +``` +# Task 1: Fix backend +terminal(command="tmux new-session -d -s task1 -x 140 -y 40 && tmux send-keys -t task1 'cd ~/project && claude -p \"Fix the auth bug in src/auth.py\" --allowedTools \"Read,Edit\" --max-turns 10' Enter") + +# Task 2: Write tests +terminal(command="tmux new-session -d -s task2 -x 140 -y 40 && tmux send-keys -t task2 'cd ~/project && claude -p \"Write integration tests for the API endpoints\" --allowedTools \"Read,Write,Bash\" --max-turns 15' Enter") + +# Task 3: Update docs +terminal(command="tmux new-session -d -s task3 -x 140 -y 40 && tmux send-keys -t task3 'cd ~/project && claude -p \"Update README.md with the new API endpoints\" --allowedTools \"Read,Edit\" --max-turns 5' Enter") + +# Monitor all +terminal(command="sleep 30 && for s in task1 task2 task3; do echo '=== '$s' ==='; tmux capture-pane -t $s -p -S -5 2>/dev/null; done") +``` + +## CLAUDE.md — Project Context File + +Claude Code auto-loads `CLAUDE.md` from the project root. Use it to persist project context: + +```markdown +# Project: My API + +## Architecture +- FastAPI backend with SQLAlchemy ORM +- PostgreSQL database, Redis cache +- pytest for testing with 90% coverage target + +## Key Commands +- `make test` — run full test suite +- `make lint` — ruff + mypy +- `make dev` — start dev server on :8000 + +## Code Standards +- Type hints on all public functions +- Docstrings in Google style +- 2-space indentation for YAML, 4-space for Python +- No wildcard imports +``` + +**Be specific.** Instead of "Write good code", use "Use 2-space indentation for JS" or "Name test files with `.test.ts` suffix." Specific instructions save correction cycles. 
+
+### Rules Directory (Modular CLAUDE.md)
+For projects with many rules, use the rules directory instead of one massive CLAUDE.md:
+- **Project rules:** `.claude/rules/*.md` — team-shared, git-tracked
+- **User rules:** `~/.claude/rules/*.md` — personal, global
+
+Each `.md` file in the rules directory is loaded as additional context. This is cleaner than cramming everything into a single CLAUDE.md.
+
+### Auto-Memory
+Claude automatically stores learned project context in `~/.claude/projects/<project>/memory/`.
+- **Limit:** 25KB or 200 lines per project
+- This is separate from CLAUDE.md — it's Claude's own notes about the project, accumulated across sessions
+
+## Custom Subagents
+
+Define specialized agents in `.claude/agents/` (project), `~/.claude/agents/` (personal), or via `--agents` CLI flag (session):
+
+### Agent Location Priority
+1. `.claude/agents/` — project-level, team-shared
+2. `--agents` CLI flag — session-specific, dynamic
+3. `~/.claude/agents/` — user-level, personal
+
+### Creating an Agent
+```markdown
+# .claude/agents/security-reviewer.md
+---
+name: security-reviewer
+description: Security-focused code review
+model: opus
+tools: [Read, Bash]
+---
+You are a senior security engineer. Review code for:
+- Injection vulnerabilities (SQL, XSS, command injection)
+- Authentication/authorization flaws
+- Secrets in code
+- Unsafe deserialization
+```
+
+Invoke via: `@security-reviewer review the auth module`
+
+### Dynamic Agents via CLI
+```
+terminal(command="claude --agents '{\"reviewer\": {\"description\": \"Reviews code\", \"prompt\": \"You are a code reviewer focused on performance\"}}' -p 'Use @reviewer to check auth.py'", timeout=120)
+```
+
+Claude can orchestrate multiple agents: "Use @db-expert to optimize queries, then @security to audit the changes."
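
Hand-escaping the `--agents` JSON inside a quoted command string, as in the example above, is easy to get wrong. Building the flag programmatically with `json.dumps` plus `shlex.quote` sidesteps the quoting entirely; a sketch (the agent names and prompts are just placeholders):

```python
import json
import shlex

def agents_flag(agents: dict) -> str:
    """Render a shell-safe --agents '<json>' argument from a dict."""
    return f"--agents {shlex.quote(json.dumps(agents))}"

# Hypothetical usage: compose a full claude invocation string.
cmd = "claude " + agents_flag({
    "reviewer": {
        "description": "Reviews code",
        "prompt": "You are a code reviewer focused on performance",
    },
}) + " -p 'Use @reviewer to check auth.py'"
```

The resulting string can be dropped straight into a `terminal(command=...)` call without manual backslash escaping.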
+ +## Hooks — Automation on Events + +Configure in `.claude/settings.json` (project) or `~/.claude/settings.json` (global): + +```json +{ + "hooks": { + "PostToolUse": [{ + "matcher": "Write(*.py)", + "hooks": [{"type": "command", "command": "ruff check --fix $CLAUDE_FILE_PATHS"}] + }], + "PreToolUse": [{ + "matcher": "Bash", + "hooks": [{"type": "command", "command": "if echo \"$CLAUDE_TOOL_INPUT\" | grep -q 'rm -rf'; then echo 'Blocked!' && exit 2; fi"}] + }], + "Stop": [{ + "hooks": [{"type": "command", "command": "echo 'Claude finished a response' >> /tmp/claude-activity.log"}] + }] + } +} +``` + +### All 8 Hook Types +| Hook | When it fires | Common use | +|------|--------------|------------| +| `UserPromptSubmit` | Before Claude processes a user prompt | Input validation, logging | +| `PreToolUse` | Before tool execution | Security gates, block dangerous commands (exit 2 = block) | +| `PostToolUse` | After a tool finishes | Auto-format code, run linters | +| `Notification` | On permission requests or input waits | Desktop notifications, alerts | +| `Stop` | When Claude finishes a response | Completion logging, status updates | +| `SubagentStop` | When a subagent completes | Agent orchestration | +| `PreCompact` | Before context memory is cleared | Backup session transcripts | +| `SessionStart` | When a session begins | Load dev context (e.g., `git status`) | + +### Hook Environment Variables +| Variable | Content | +|----------|---------| +| `CLAUDE_PROJECT_DIR` | Current project path | +| `CLAUDE_FILE_PATHS` | Files being modified | +| `CLAUDE_TOOL_INPUT` | Tool parameters as JSON | + +### Security Hook Examples +```json +{ + "PreToolUse": [{ + "matcher": "Bash", + "hooks": [{"type": "command", "command": "if echo \"$CLAUDE_TOOL_INPUT\" | grep -qE 'rm -rf|git push.*--force|:(){ :|:& };:'; then echo 'Dangerous command blocked!' 
&& exit 2; fi"}] + }] +} +``` + +## MCP Integration + +Add external tool servers for databases, APIs, and services: + +``` +# GitHub integration +terminal(command="claude mcp add -s user github -- npx @modelcontextprotocol/server-github", timeout=30) + +# PostgreSQL queries +terminal(command="claude mcp add -s local postgres -- npx @anthropic-ai/server-postgres --connection-string postgresql://localhost/mydb", timeout=30) + +# Puppeteer for web testing +terminal(command="claude mcp add puppeteer -- npx @anthropic-ai/server-puppeteer", timeout=30) +``` + +### MCP Scopes +| Flag | Scope | Storage | +|------|-------|---------| +| `-s user` | Global (all projects) | `~/.claude.json` | +| `-s local` | This project (personal) | `.claude/settings.local.json` (gitignored) | +| `-s project` | This project (team-shared) | `.claude/settings.json` (git-tracked) | + +### MCP in Print/CI Mode +``` +terminal(command="claude --bare -p 'Query database' --mcp-config mcp-servers.json --strict-mcp-config", timeout=60) +``` +`--strict-mcp-config` ignores all MCP servers except those from `--mcp-config`. 
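
The CI example above passes `--mcp-config mcp-servers.json`. A minimal generator for that file might look like the sketch below; the top-level `mcpServers` key and the stdio `command`/`args` shape follow the common MCP config layout, and the github server package matches the earlier `claude mcp add` example (treat both as assumptions to verify against your setup):

```python
import json
from pathlib import Path

def write_mcp_config(path: str = "mcp-servers.json") -> str:
    """Write a minimal stdio MCP server config for --mcp-config."""
    config = {
        "mcpServers": {
            "github": {
                "command": "npx",
                "args": ["@modelcontextprotocol/server-github"],
            }
        }
    }
    Path(path).write_text(json.dumps(config, indent=2))
    return path
```

Pair it with `--strict-mcp-config` so only the servers in this file are loaded, regardless of user or project settings.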
+ +Reference MCP resources in chat: `@github:issue://123` + +### MCP Limits & Tuning +- **Tool descriptions:** 2KB cap per server for tool descriptions and server instructions +- **Result size:** Default capped; use `maxResultSizeChars` annotation to allow up to **500K** characters for large outputs +- **Output tokens:** `export MAX_MCP_OUTPUT_TOKENS=50000` — cap output from MCP servers to prevent context flooding +- **Transports:** `stdio` (local process), `http` (remote), `sse` (server-sent events) + +## Monitoring Interactive Sessions + +### Reading the TUI Status +``` +# Periodic capture to check if Claude is still working or waiting for input +terminal(command="tmux capture-pane -t dev -p -S -10") +``` + +Look for these indicators: +- `❯` at bottom = waiting for your input (Claude is done or asking a question) +- `●` lines = Claude is actively using tools (reading, writing, running commands) +- `⏵⏵ bypass permissions on` = status bar showing permissions mode +- `◐ medium · /effort` = current effort level in status bar +- `ctrl+o to expand` = tool output was truncated (can be expanded interactively) + +### Context Window Health +Use `/context` in interactive mode to see a colored grid of context usage. 
Key thresholds: +- **< 70%** — Normal operation, full precision +- **70-85%** — Precision starts dropping, consider `/compact` +- **> 85%** — Hallucination risk spikes significantly, use `/compact` or `/clear` + +## Environment Variables + +| Variable | Effect | +|----------|--------| +| `ANTHROPIC_API_KEY` | API key for authentication (alternative to OAuth) | +| `CLAUDE_CODE_EFFORT_LEVEL` | Default effort: `low`, `medium`, `high`, `max`, or `auto` | +| `MAX_THINKING_TOKENS` | Cap thinking tokens (set to `0` to disable thinking entirely) | +| `MAX_MCP_OUTPUT_TOKENS` | Cap output from MCP servers (default varies; set e.g., `50000`) | +| `CLAUDE_CODE_NO_FLICKER=1` | Enable alt-screen rendering to eliminate terminal flicker | +| `CLAUDE_CODE_SUBPROCESS_ENV_SCRUB` | Strip credentials from sub-processes for security | + +## Cost & Performance Tips + +1. **Use `--max-turns`** in print mode to prevent runaway loops. Start with 5-10 for most tasks. +2. **Use `--max-budget-usd`** for cost caps. Note: minimum ~$0.05 for system prompt cache creation. +3. **Use `--effort low`** for simple tasks (faster, cheaper). `high` or `max` for complex reasoning. +4. **Use `--bare`** for CI/scripting to skip plugin/hook discovery overhead. +5. **Use `--allowedTools`** to restrict to only what's needed (e.g., `Read` only for reviews). +6. **Use `/compact`** in interactive sessions when context gets large. +7. **Pipe input** instead of having Claude read files when you just need analysis of known content. +8. **Use `--model haiku`** for simple tasks (cheaper) and `--model opus` for complex multi-step work. +9. **Use `--fallback-model haiku`** in print mode to gracefully handle model overload. +10. **Start new sessions for distinct tasks** — sessions last 5 hours; fresh context is more efficient. +11. **Use `--no-session-persistence`** in CI to avoid accumulating saved sessions on disk. + +## Pitfalls & Gotchas + +1. **Interactive mode REQUIRES tmux** — Claude Code is a full TUI app. 
Using `pty=true` alone in Hermes terminal works but tmux gives you `capture-pane` for monitoring and `send-keys` for input, which is essential for orchestration.
+2. **`--dangerously-skip-permissions` dialog defaults to "No, exit"** — you must send Down then Enter to accept. Print mode (`-p`) skips this entirely.
+3. **`--max-budget-usd` minimum is ~$0.05** — system prompt cache creation alone costs this much. Setting lower will error immediately.
+4. **`--max-turns` is print-mode only** — ignored in interactive sessions.
+5. **Claude may use `python` instead of `python3`** — on systems without a `python` symlink, Claude's bash commands will fail on first try but it self-corrects.
+6. **Session resumption requires same directory** — `--continue` finds the most recent session for the current working directory.
+7. **`--json-schema` needs enough `--max-turns`** — Claude must read files before producing structured output, which takes multiple turns.
+8. **Trust dialog only appears once per directory** — first-time only, then cached.
+9. **Background tmux sessions persist** — always clean up with `tmux kill-session -t <session>` when done.
+10. **Slash commands (like `/commit`) only work in interactive mode** — in `-p` mode, describe the task in natural language instead.
+11. **`--bare` skips OAuth** — requires `ANTHROPIC_API_KEY` env var or an `apiKeyHelper` in settings.
+12. **Context degradation is real** — AI output quality measurably degrades above 70% context window usage. Monitor with `/context` and proactively `/compact`.
+
+## Rules for Hermes Agents
+
+1. **Prefer print mode (`-p`) for single tasks** — cleaner, no dialog handling, structured output
+2. **Use tmux for multi-turn interactive work** — the only reliable way to orchestrate the TUI
+3. **Always set `workdir`** — keep Claude focused on the right project directory
+4. **Set `--max-turns` in print mode** — prevents infinite loops and runaway costs
+5.
**Monitor tmux sessions** — use `tmux capture-pane -t -p -S -50` to check progress +6. **Look for the `❯` prompt** — indicates Claude is waiting for input (done or asking a question) +7. **Clean up tmux sessions** — kill them when done to avoid resource leaks +8. **Report results to user** — after completion, summarize what Claude did and what changed +9. **Don't kill slow sessions** — Claude may be doing multi-step work; check progress instead +10. **Use `--allowedTools`** — restrict capabilities to what the task actually needs diff --git a/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex.md b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex.md new file mode 100644 index 000000000..70aa3334f --- /dev/null +++ b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex.md @@ -0,0 +1,131 @@ +--- +title: "Codex — Delegate coding tasks to OpenAI Codex CLI agent" +sidebar_label: "Codex" +description: "Delegate coding tasks to OpenAI Codex CLI agent" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Codex + +Delegate coding tasks to OpenAI Codex CLI agent. Use for building features, refactoring, PR reviews, and batch issue fixing. Requires the codex CLI and a git repository. 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/autonomous-ai-agents/codex` | +| Version | `1.0.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `Coding-Agent`, `Codex`, `OpenAI`, `Code-Review`, `Refactoring` | +| Related skills | [`claude-code`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code), [`hermes-agent`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Codex CLI + +Delegate coding tasks to [Codex](https://github.com/openai/codex) via the Hermes terminal. Codex is OpenAI's autonomous coding agent CLI. + +## Prerequisites + +- Codex installed: `npm install -g @openai/codex` +- OpenAI API key configured +- **Must run inside a git repository** — Codex refuses to run outside one +- Use `pty=true` in terminal calls — Codex is an interactive terminal app + +## One-Shot Tasks + +``` +terminal(command="codex exec 'Add dark mode toggle to settings'", workdir="~/project", pty=true) +``` + +For scratch work (Codex needs a git repo): +``` +terminal(command="cd $(mktemp -d) && git init && codex exec 'Build a snake game in Python'", pty=true) +``` + +## Background Mode (Long Tasks) + +``` +# Start in background with PTY +terminal(command="codex exec --full-auto 'Refactor the auth module'", workdir="~/project", background=true, pty=true) +# Returns session_id + +# Monitor progress +process(action="poll", session_id="") +process(action="log", session_id="") + +# Send input if Codex asks a question +process(action="submit", session_id="", data="yes") + +# Kill if needed +process(action="kill", session_id="") +``` + +## Key Flags + +| Flag | Effect | +|------|--------| +| `exec "prompt"` | One-shot execution, exits 
when done | +| `--full-auto` | Sandboxed but auto-approves file changes in workspace | +| `--yolo` | No sandbox, no approvals (fastest, most dangerous) | + +## PR Reviews + +Clone to a temp directory for safe review: + +``` +terminal(command="REVIEW=$(mktemp -d) && git clone https://github.com/user/repo.git $REVIEW && cd $REVIEW && gh pr checkout 42 && codex review --base origin/main", pty=true) +``` + +## Parallel Issue Fixing with Worktrees + +``` +# Create worktrees +terminal(command="git worktree add -b fix/issue-78 /tmp/issue-78 main", workdir="~/project") +terminal(command="git worktree add -b fix/issue-99 /tmp/issue-99 main", workdir="~/project") + +# Launch Codex in each +terminal(command="codex --yolo exec 'Fix issue #78: . Commit when done.'", workdir="/tmp/issue-78", background=true, pty=true) +terminal(command="codex --yolo exec 'Fix issue #99: . Commit when done.'", workdir="/tmp/issue-99", background=true, pty=true) + +# Monitor +process(action="list") + +# After completion, push and create PRs +terminal(command="cd /tmp/issue-78 && git push -u origin fix/issue-78") +terminal(command="gh pr create --repo user/repo --head fix/issue-78 --title 'fix: ...' --body '...'") + +# Cleanup +terminal(command="git worktree remove /tmp/issue-78", workdir="~/project") +``` + +## Batch PR Reviews + +``` +# Fetch all PR refs +terminal(command="git fetch origin '+refs/pull/*/head:refs/remotes/origin/pr/*'", workdir="~/project") + +# Review multiple PRs in parallel +terminal(command="codex exec 'Review PR #86. git diff origin/main...origin/pr/86'", workdir="~/project", background=true, pty=true) +terminal(command="codex exec 'Review PR #87. git diff origin/main...origin/pr/87'", workdir="~/project", background=true, pty=true) + +# Post results +terminal(command="gh pr comment 86 --body ''", workdir="~/project") +``` + +## Rules + +1. **Always use `pty=true`** — Codex is an interactive terminal app and hangs without a PTY +2. 
**Git repo required** — Codex won't run outside a git directory. Use `mktemp -d && git init` for scratch +3. **Use `exec` for one-shots** — `codex exec "prompt"` runs and exits cleanly +4. **`--full-auto` for building** — auto-approves changes within the sandbox +5. **Background for long tasks** — use `background=true` and monitor with `process` tool +6. **Don't interfere** — monitor with `poll`/`log`, be patient with long-running tasks +7. **Parallel is fine** — run multiple Codex processes at once for batch work diff --git a/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent.md b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent.md new file mode 100644 index 000000000..ff60380aa --- /dev/null +++ b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent.md @@ -0,0 +1,722 @@ +--- +title: "Hermes Agent" +sidebar_label: "Hermes Agent" +description: "Complete guide to using and extending Hermes Agent — CLI usage, setup, configuration, spawning additional agents, gateway platforms, skills, voice, tools, pr..." +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Hermes Agent + +Complete guide to using and extending Hermes Agent — CLI usage, setup, configuration, spawning additional agents, gateway platforms, skills, voice, tools, profiles, and a concise contributor reference. Load this skill when helping users configure Hermes, troubleshoot issues, spawn agent instances, or make code contributions. 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/autonomous-ai-agents/hermes-agent` | +| Version | `2.0.0` | +| Author | Hermes Agent + Teknium | +| License | MIT | +| Tags | `hermes`, `setup`, `configuration`, `multi-agent`, `spawning`, `cli`, `gateway`, `development` | +| Related skills | [`claude-code`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code), [`codex`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex), [`opencode`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-opencode) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Hermes Agent + +Hermes Agent is an open-source AI agent framework by Nous Research that runs in your terminal, messaging platforms, and IDEs. It belongs to the same category as Claude Code (Anthropic), Codex (OpenAI), and OpenClaw — autonomous coding and task-execution agents that use tool calling to interact with your system. Hermes works with any LLM provider (OpenRouter, Anthropic, OpenAI, DeepSeek, local models, and 15+ others) and runs on Linux, macOS, and WSL. + +What makes Hermes different: + +- **Self-improving through skills** — Hermes learns from experience by saving reusable procedures as skills. When it solves a complex problem, discovers a workflow, or gets corrected, it can persist that knowledge as a skill document that loads into future sessions. Skills accumulate over time, making the agent better at your specific tasks and environment. +- **Persistent memory across sessions** — remembers who you are, your preferences, environment details, and lessons learned. Pluggable memory backends (built-in, Honcho, Mem0, and more) let you choose how memory works. 
+- **Multi-platform gateway** — the same agent runs on Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email, and 10+ other platforms with full tool access, not just chat. +- **Provider-agnostic** — swap models and providers mid-workflow without changing anything else. Credential pools rotate across multiple API keys automatically. +- **Profiles** — run multiple independent Hermes instances with isolated configs, sessions, skills, and memory. +- **Extensible** — plugins, MCP servers, custom tools, webhook triggers, cron scheduling, and the full Python ecosystem. + +People use Hermes for software development, research, system administration, data analysis, content creation, home automation, and anything else that benefits from an AI agent with persistent context and full system access. + +**This skill helps you work with Hermes Agent effectively** — setting it up, configuring features, spawning additional agent instances, troubleshooting issues, finding the right commands and settings, and understanding how the system works when you need to extend or contribute to it. + +**Docs:** https://hermes-agent.nousresearch.com/docs/ + +## Quick Start + +```bash +# Install +curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash + +# Interactive chat (default) +hermes + +# Single query +hermes chat -q "What is the capital of France?" 
+ +# Setup wizard +hermes setup + +# Change model/provider +hermes model + +# Check health +hermes doctor +``` + +--- + +## CLI Reference + +### Global Flags + +``` +hermes [flags] [command] + + --version, -V Show version + --resume, -r SESSION Resume session by ID or title + --continue, -c [NAME] Resume by name, or most recent session + --worktree, -w Isolated git worktree mode (parallel agents) + --skills, -s SKILL Preload skills (comma-separate or repeat) + --profile, -p NAME Use a named profile + --yolo Skip dangerous command approval + --pass-session-id Include session ID in system prompt +``` + +No subcommand defaults to `chat`. + +### Chat + +``` +hermes chat [flags] + -q, --query TEXT Single query, non-interactive + -m, --model MODEL Model (e.g. anthropic/claude-sonnet-4) + -t, --toolsets LIST Comma-separated toolsets + --provider PROVIDER Force provider (openrouter, anthropic, nous, etc.) + -v, --verbose Verbose output + -Q, --quiet Suppress banner, spinner, tool previews + --checkpoints Enable filesystem checkpoints (/rollback) + --source TAG Session source tag (default: cli) +``` + +### Configuration + +``` +hermes setup [section] Interactive wizard (model|terminal|gateway|tools|agent) +hermes model Interactive model/provider picker +hermes config View current config +hermes config edit Open config.yaml in $EDITOR +hermes config set KEY VAL Set a config value +hermes config path Print config.yaml path +hermes config env-path Print .env path +hermes config check Check for missing/outdated config +hermes config migrate Update config with new options +hermes login [--provider P] OAuth login (nous, openai-codex) +hermes logout Clear stored auth +hermes doctor [--fix] Check dependencies and config +hermes status [--all] Show component status +``` + +### Tools & Skills + +``` +hermes tools Interactive tool enable/disable (curses UI) +hermes tools list Show all tools and status +hermes tools enable NAME Enable a toolset +hermes tools disable NAME Disable a 
toolset + +hermes skills list List installed skills +hermes skills search QUERY Search the skills hub +hermes skills install ID Install a skill +hermes skills inspect ID Preview without installing +hermes skills config Enable/disable skills per platform +hermes skills check Check for updates +hermes skills update Update outdated skills +hermes skills uninstall N Remove a hub skill +hermes skills publish PATH Publish to registry +hermes skills browse Browse all available skills +hermes skills tap add REPO Add a GitHub repo as skill source +``` + +### MCP Servers + +``` +hermes mcp serve Run Hermes as an MCP server +hermes mcp add NAME Add an MCP server (--url or --command) +hermes mcp remove NAME Remove an MCP server +hermes mcp list List configured servers +hermes mcp test NAME Test connection +hermes mcp configure NAME Toggle tool selection +``` + +### Gateway (Messaging Platforms) + +``` +hermes gateway run Start gateway foreground +hermes gateway install Install as background service +hermes gateway start/stop Control the service +hermes gateway restart Restart the service +hermes gateway status Check status +hermes gateway setup Configure platforms +``` + +Supported platforms: Telegram, Discord, Slack, WhatsApp, Signal, Email, SMS, Matrix, Mattermost, Home Assistant, DingTalk, Feishu, WeCom, BlueBubbles (iMessage), Weixin (WeChat), API Server, Webhooks. Open WebUI connects via the API Server adapter. 
+ +Platform docs: https://hermes-agent.nousresearch.com/docs/user-guide/messaging/ + +### Sessions + +``` +hermes sessions list List recent sessions +hermes sessions browse Interactive picker +hermes sessions export OUT Export to JSONL +hermes sessions rename ID T Rename a session +hermes sessions delete ID Delete a session +hermes sessions prune Clean up old sessions (--older-than N days) +hermes sessions stats Session store statistics +``` + +### Cron Jobs + +``` +hermes cron list List jobs (--all for disabled) +hermes cron create SCHED Create: '30m', 'every 2h', '0 9 * * *' +hermes cron edit ID Edit schedule, prompt, delivery +hermes cron pause/resume ID Control job state +hermes cron run ID Trigger on next tick +hermes cron remove ID Delete a job +hermes cron status Scheduler status +``` + +### Webhooks + +``` +hermes webhook subscribe N Create route at /webhooks/ +hermes webhook list List subscriptions +hermes webhook remove NAME Remove a subscription +hermes webhook test NAME Send a test POST +``` + +### Profiles + +``` +hermes profile list List all profiles +hermes profile create NAME Create (--clone, --clone-all, --clone-from) +hermes profile use NAME Set sticky default +hermes profile delete NAME Delete a profile +hermes profile show NAME Show details +hermes profile alias NAME Manage wrapper scripts +hermes profile rename A B Rename a profile +hermes profile export NAME Export to tar.gz +hermes profile import FILE Import from archive +``` + +### Credential Pools + +``` +hermes auth add Interactive credential wizard +hermes auth list [PROVIDER] List pooled credentials +hermes auth remove P INDEX Remove by provider + index +hermes auth reset PROVIDER Clear exhaustion status +``` + +### Other + +``` +hermes insights [--days N] Usage analytics +hermes update Update to latest version +hermes pairing list/approve/revoke DM authorization +hermes plugins list/install/remove Plugin management +hermes honcho setup/status Honcho memory integration (requires honcho 
plugin) +hermes memory setup/status/off Memory provider config +hermes completion bash|zsh Shell completions +hermes acp ACP server (IDE integration) +hermes claw migrate Migrate from OpenClaw +hermes uninstall Uninstall Hermes +``` + +--- + +## Slash Commands (In-Session) + +Type these during an interactive chat session. + +### Session Control +``` +/new (/reset) Fresh session +/clear Clear screen + new session (CLI) +/retry Resend last message +/undo Remove last exchange +/title [name] Name the session +/compress Manually compress context +/stop Kill background processes +/rollback [N] Restore filesystem checkpoint +/background Run prompt in background +/queue Queue for next turn +/resume [name] Resume a named session +``` + +### Configuration +``` +/config Show config (CLI) +/model [name] Show or change model +/provider Show provider info +/personality [name] Set personality +/reasoning [level] Set reasoning (none|minimal|low|medium|high|xhigh|show|hide) +/verbose Cycle: off → new → all → verbose +/voice [on|off|tts] Voice mode +/yolo Toggle approval bypass +/skin [name] Change theme (CLI) +/statusbar Toggle status bar (CLI) +``` + +### Tools & Skills +``` +/tools Manage tools (CLI) +/toolsets List toolsets (CLI) +/skills Search/install skills (CLI) +/skill Load a skill into session +/cron Manage cron jobs (CLI) +/reload-mcp Reload MCP servers +/plugins List plugins (CLI) +``` + +### Gateway +``` +/approve Approve a pending command (gateway) +/deny Deny a pending command (gateway) +/restart Restart gateway (gateway) +/sethome Set current chat as home channel (gateway) +/update Update Hermes to latest (gateway) +/platforms (/gateway) Show platform connection status (gateway) +``` + +### Utility +``` +/branch (/fork) Branch the current session +/btw Ephemeral side question (doesn't interrupt main task) +/fast Toggle priority/fast processing +/browser Open CDP browser connection +/history Show conversation history (CLI) +/save Save conversation to file (CLI) 
+/paste Attach clipboard image (CLI) +/image Attach local image file (CLI) +``` + +### Info +``` +/help Show commands +/commands [page] Browse all commands (gateway) +/usage Token usage +/insights [days] Usage analytics +/status Session info (gateway) +/profile Active profile info +``` + +### Exit +``` +/quit (/exit, /q) Exit CLI +``` + +--- + +## Key Paths & Config + +``` +~/.hermes/config.yaml Main configuration +~/.hermes/.env API keys and secrets +$HERMES_HOME/skills/ Installed skills +~/.hermes/sessions/ Session transcripts +~/.hermes/logs/ Gateway and error logs +~/.hermes/auth.json OAuth tokens and credential pools +~/.hermes/hermes-agent/ Source code (if git-installed) +``` + +Profiles use `~/.hermes/profiles//` with the same layout. + +### Config Sections + +Edit with `hermes config edit` or `hermes config set section.key value`. + +| Section | Key options | +|---------|-------------| +| `model` | `default`, `provider`, `base_url`, `api_key`, `context_length` | +| `agent` | `max_turns` (90), `tool_use_enforcement` | +| `terminal` | `backend` (local/docker/ssh/modal), `cwd`, `timeout` (180) | +| `compression` | `enabled`, `threshold` (0.50), `target_ratio` (0.20) | +| `display` | `skin`, `tool_progress`, `show_reasoning`, `show_cost` | +| `stt` | `enabled`, `provider` (local/groq/openai/mistral) | +| `tts` | `provider` (edge/elevenlabs/openai/minimax/mistral/neutts) | +| `memory` | `memory_enabled`, `user_profile_enabled`, `provider` | +| `security` | `tirith_enabled`, `website_blocklist` | +| `delegation` | `model`, `provider`, `base_url`, `api_key`, `max_iterations` (50), `reasoning_effort` | +| `checkpoints` | `enabled`, `max_snapshots` (50) | + +Full config reference: https://hermes-agent.nousresearch.com/docs/user-guide/configuration + +### Providers + +20+ providers supported. Set via `hermes model` or `hermes setup`. 
+ +| Provider | Auth | Key env var | +|----------|------|-------------| +| OpenRouter | API key | `OPENROUTER_API_KEY` | +| Anthropic | API key | `ANTHROPIC_API_KEY` | +| Nous Portal | OAuth | `hermes auth` | +| OpenAI Codex | OAuth | `hermes auth` | +| GitHub Copilot | Token | `COPILOT_GITHUB_TOKEN` | +| Google Gemini | API key | `GOOGLE_API_KEY` or `GEMINI_API_KEY` | +| DeepSeek | API key | `DEEPSEEK_API_KEY` | +| xAI / Grok | API key | `XAI_API_KEY` | +| Hugging Face | Token | `HF_TOKEN` | +| Z.AI / GLM | API key | `GLM_API_KEY` | +| MiniMax | API key | `MINIMAX_API_KEY` | +| MiniMax CN | API key | `MINIMAX_CN_API_KEY` | +| Kimi / Moonshot | API key | `KIMI_API_KEY` | +| Alibaba / DashScope | API key | `DASHSCOPE_API_KEY` | +| Xiaomi MiMo | API key | `XIAOMI_API_KEY` | +| Kilo Code | API key | `KILOCODE_API_KEY` | +| AI Gateway (Vercel) | API key | `AI_GATEWAY_API_KEY` | +| OpenCode Zen | API key | `OPENCODE_ZEN_API_KEY` | +| OpenCode Go | API key | `OPENCODE_GO_API_KEY` | +| Qwen OAuth | OAuth | `hermes login --provider qwen-oauth` | +| Custom endpoint | Config | `model.base_url` + `model.api_key` in config.yaml | +| GitHub Copilot ACP | External | `COPILOT_CLI_PATH` or Copilot CLI | + +Full provider docs: https://hermes-agent.nousresearch.com/docs/integrations/providers + +### Toolsets + +Enable/disable via `hermes tools` (interactive) or `hermes tools enable/disable NAME`. 
+ +| Toolset | What it provides | +|---------|-----------------| +| `web` | Web search and content extraction | +| `browser` | Browser automation (Browserbase, Camofox, or local Chromium) | +| `terminal` | Shell commands and process management | +| `file` | File read/write/search/patch | +| `code_execution` | Sandboxed Python execution | +| `vision` | Image analysis | +| `image_gen` | AI image generation | +| `tts` | Text-to-speech | +| `skills` | Skill browsing and management | +| `memory` | Persistent cross-session memory | +| `session_search` | Search past conversations | +| `delegation` | Subagent task delegation | +| `cronjob` | Scheduled task management | +| `clarify` | Ask user clarifying questions | +| `messaging` | Cross-platform message sending | +| `search` | Web search only (subset of `web`) | +| `todo` | In-session task planning and tracking | +| `rl` | Reinforcement learning tools (off by default) | +| `moa` | Mixture of Agents (off by default) | +| `homeassistant` | Smart home control (off by default) | + +Tool changes take effect on `/reset` (new session). They do NOT apply mid-conversation to preserve prompt caching. + +--- + +## Voice & Transcription + +### STT (Voice → Text) + +Voice messages from messaging platforms are auto-transcribed. + +Provider priority (auto-detected): +1. **Local faster-whisper** — free, no API key: `pip install faster-whisper` +2. **Groq Whisper** — free tier: set `GROQ_API_KEY` +3. **OpenAI Whisper** — paid: set `VOICE_TOOLS_OPENAI_KEY` +4. **Mistral Voxtral** — set `MISTRAL_API_KEY` + +Config: +```yaml +stt: + enabled: true + provider: local # local, groq, openai, mistral + local: + model: base # tiny, base, small, medium, large-v3 +``` + +### TTS (Text → Voice) + +| Provider | Env var | Free? 
| +|----------|---------|-------| +| Edge TTS | None | Yes (default) | +| ElevenLabs | `ELEVENLABS_API_KEY` | Free tier | +| OpenAI | `VOICE_TOOLS_OPENAI_KEY` | Paid | +| MiniMax | `MINIMAX_API_KEY` | Paid | +| Mistral (Voxtral) | `MISTRAL_API_KEY` | Paid | +| NeuTTS (local) | None (`pip install neutts[all]` + `espeak-ng`) | Free | + +Voice commands: `/voice on` (voice-to-voice), `/voice tts` (always voice), `/voice off`. + +--- + +## Spawning Additional Hermes Instances + +Run additional Hermes processes as fully independent subprocesses — separate sessions, tools, and environments. + +### When to Use This vs delegate_task + +| | `delegate_task` | Spawning `hermes` process | +|-|-----------------|--------------------------| +| Isolation | Separate conversation, shared process | Fully independent process | +| Duration | Minutes (bounded by parent loop) | Hours/days | +| Tool access | Subset of parent's tools | Full tool access | +| Interactive | No | Yes (PTY mode) | +| Use case | Quick parallel subtasks | Long autonomous missions | + +### One-Shot Mode + +``` +terminal(command="hermes chat -q 'Research GRPO papers and write summary to ~/research/grpo.md'", timeout=300) + +# Background for long tasks: +terminal(command="hermes chat -q 'Set up CI/CD for ~/myapp'", background=true) +``` + +### Interactive PTY Mode (via tmux) + +Hermes uses prompt_toolkit, which requires a real terminal. 
Use tmux for interactive spawning: + +``` +# Start +terminal(command="tmux new-session -d -s agent1 -x 120 -y 40 'hermes'", timeout=10) + +# Wait for startup, then send a message +terminal(command="sleep 8 && tmux send-keys -t agent1 'Build a FastAPI auth service' Enter", timeout=15) + +# Read output +terminal(command="sleep 20 && tmux capture-pane -t agent1 -p", timeout=5) + +# Send follow-up +terminal(command="tmux send-keys -t agent1 'Add rate limiting middleware' Enter", timeout=5) + +# Exit +terminal(command="tmux send-keys -t agent1 '/exit' Enter && sleep 2 && tmux kill-session -t agent1", timeout=10) +``` + +### Multi-Agent Coordination + +``` +# Agent A: backend +terminal(command="tmux new-session -d -s backend -x 120 -y 40 'hermes -w'", timeout=10) +terminal(command="sleep 8 && tmux send-keys -t backend 'Build REST API for user management' Enter", timeout=15) + +# Agent B: frontend +terminal(command="tmux new-session -d -s frontend -x 120 -y 40 'hermes -w'", timeout=10) +terminal(command="sleep 8 && tmux send-keys -t frontend 'Build React dashboard for user management' Enter", timeout=15) + +# Check progress, relay context between them +terminal(command="tmux capture-pane -t backend -p | tail -30", timeout=5) +terminal(command="tmux send-keys -t frontend 'Here is the API schema from the backend agent: ...' 
Enter", timeout=5) +``` + +### Session Resume + +``` +# Resume most recent session +terminal(command="tmux new-session -d -s resumed 'hermes --continue'", timeout=10) + +# Resume specific session +terminal(command="tmux new-session -d -s resumed 'hermes --resume 20260225_143052_a1b2c3'", timeout=10) +``` + +### Tips + +- **Prefer `delegate_task` for quick subtasks** — less overhead than spawning a full process +- **Use `-w` (worktree mode)** when spawning agents that edit code — prevents git conflicts +- **Set timeouts** for one-shot mode — complex tasks can take 5-10 minutes +- **Use `hermes chat -q` for fire-and-forget** — no PTY needed +- **Use tmux for interactive sessions** — raw PTY mode has `\r` vs `\n` issues with prompt_toolkit +- **For scheduled tasks**, use the `cronjob` tool instead of spawning — handles delivery and retry + +--- + +## Troubleshooting + +### Voice not working +1. Check `stt.enabled: true` in config.yaml +2. Verify provider: `pip install faster-whisper` or set API key +3. In gateway: `/restart`. In CLI: exit and relaunch. + +### Tool not available +1. `hermes tools` — check if toolset is enabled for your platform +2. Some tools need env vars (check `.env`) +3. `/reset` after enabling tools + +### Model/provider issues +1. `hermes doctor` — check config and dependencies +2. `hermes login` — re-authenticate OAuth providers +3. Check `.env` has the right API key +4. **Copilot 403**: `gh auth login` tokens do NOT work for Copilot API. You must use the Copilot-specific OAuth device code flow via `hermes model` → GitHub Copilot. + +### Changes not taking effect +- **Tools/skills:** `/reset` starts a new session with updated toolset +- **Config changes:** In gateway: `/restart`. In CLI: exit and relaunch. +- **Code changes:** Restart the CLI or gateway process + +### Skills not showing +1. `hermes skills list` — verify installed +2. `hermes skills config` — check platform enablement +3. 
Load explicitly: `/skill name` or `hermes -s name` + +### Gateway issues +Check logs first: +```bash +grep -i "failed to send\|error" ~/.hermes/logs/gateway.log | tail -20 +``` + +Common gateway problems: +- **Gateway dies on SSH logout**: Enable linger: `sudo loginctl enable-linger $USER` +- **Gateway dies on WSL2 close**: WSL2 requires `systemd=true` in `/etc/wsl.conf` for systemd services to work. Without it, gateway falls back to `nohup` (dies when session closes). +- **Gateway crash loop**: Reset the failed state: `systemctl --user reset-failed hermes-gateway` + +### Platform-specific issues +- **Discord bot silent**: Must enable **Message Content Intent** in Bot → Privileged Gateway Intents. +- **Slack bot only works in DMs**: Must subscribe to `message.channels` event. Without it, the bot ignores public channels. +- **Windows HTTP 400 "No models provided"**: Config file encoding issue (BOM). Ensure `config.yaml` is saved as UTF-8 without BOM. + +### Auxiliary models not working +If `auxiliary` tasks (vision, compression, session_search) fail silently, the `auto` provider can't find a backend. Either set `OPENROUTER_API_KEY` or `GOOGLE_API_KEY`, or explicitly configure each auxiliary task's provider: +```bash +hermes config set auxiliary.vision.provider <provider> +hermes config set auxiliary.vision.model <model> +``` + +--- + +## Where to Find Things + +| Looking for...
| Location | +|----------------|----------| +| Config options | `hermes config edit` or [Configuration docs](https://hermes-agent.nousresearch.com/docs/user-guide/configuration) | +| Available tools | `hermes tools list` or [Tools reference](https://hermes-agent.nousresearch.com/docs/reference/tools-reference) | +| Slash commands | `/help` in session or [Slash commands reference](https://hermes-agent.nousresearch.com/docs/reference/slash-commands) | +| Skills catalog | `hermes skills browse` or [Skills catalog](https://hermes-agent.nousresearch.com/docs/reference/skills-catalog) | +| Provider setup | `hermes model` or [Providers guide](https://hermes-agent.nousresearch.com/docs/integrations/providers) | +| Platform setup | `hermes gateway setup` or [Messaging docs](https://hermes-agent.nousresearch.com/docs/user-guide/messaging/) | +| MCP servers | `hermes mcp list` or [MCP guide](https://hermes-agent.nousresearch.com/docs/user-guide/features/mcp) | +| Profiles | `hermes profile list` or [Profiles docs](https://hermes-agent.nousresearch.com/docs/user-guide/profiles) | +| Cron jobs | `hermes cron list` or [Cron docs](https://hermes-agent.nousresearch.com/docs/user-guide/features/cron) | +| Memory | `hermes memory status` or [Memory docs](https://hermes-agent.nousresearch.com/docs/user-guide/features/memory) | +| Env variables | `hermes config env-path` or [Env vars reference](https://hermes-agent.nousresearch.com/docs/reference/environment-variables) | +| CLI commands | `hermes --help` or [CLI reference](https://hermes-agent.nousresearch.com/docs/reference/cli-commands) | +| Gateway logs | `~/.hermes/logs/gateway.log` | +| Session files | `~/.hermes/sessions/` or `hermes sessions browse` | +| Source code | `~/.hermes/hermes-agent/` | + +--- + +## Contributor Quick Reference + +For occasional contributors and PR authors. 
Full developer docs: https://hermes-agent.nousresearch.com/docs/developer-guide/ + +### Project Layout + +``` +hermes-agent/ +├── run_agent.py # AIAgent — core conversation loop +├── model_tools.py # Tool discovery and dispatch +├── toolsets.py # Toolset definitions +├── cli.py # Interactive CLI (HermesCLI) +├── hermes_state.py # SQLite session store +├── agent/ # Prompt builder, context compression, memory, model routing, credential pooling, skill dispatch +├── hermes_cli/ # CLI subcommands, config, setup, commands +│ ├── commands.py # Slash command registry (CommandDef) +│ ├── config.py # DEFAULT_CONFIG, env var definitions +│ └── main.py # CLI entry point and argparse +├── tools/ # One file per tool +│ └── registry.py # Central tool registry +├── gateway/ # Messaging gateway +│ └── platforms/ # Platform adapters (telegram, discord, etc.) +├── cron/ # Job scheduler +├── tests/ # ~3000 pytest tests +└── website/ # Docusaurus docs site +``` + +Config: `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys). + +### Adding a Tool (2 files) + +**1. Create `tools/your_tool.py`:** +```python +import json, os +from tools.registry import registry + +def check_requirements() -> bool: + return bool(os.getenv("EXAMPLE_API_KEY")) + +def example_tool(param: str, task_id: str = None) -> str: + return json.dumps({"success": True, "data": "..."}) + +registry.register( + name="example_tool", + toolset="example", + schema={"name": "example_tool", "description": "...", "parameters": {...}}, + handler=lambda args, **kw: example_tool( + param=args.get("param", ""), task_id=kw.get("task_id")), + check_fn=check_requirements, + requires_env=["EXAMPLE_API_KEY"], +) +``` + +**2. Add to `toolsets.py`** → `_HERMES_CORE_TOOLS` list. + +Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — the `toolsets.py` entry declares the toolset, but no separate import list is needed. + +All handlers must return JSON strings. Use `get_hermes_home()` for paths, never hardcode `~/.hermes`.
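The handler contract above (args dict in, JSON string out, requirements-gated registration) can be sketched end to end with a toy registry. This is a self-contained illustration; `MiniRegistry` is a stand-in for `tools/registry.py`, not the real implementation, which also tracks toolsets, schemas, and `requires_env`:

```python
import json

class MiniRegistry:
    """Toy stand-in showing only the register/dispatch contract."""

    def __init__(self):
        self._tools = {}

    def register(self, name, handler, check_fn=lambda: True):
        # A tool is only exposed when its requirements check passes.
        if check_fn():
            self._tools[name] = handler

    def dispatch(self, name, args, **kw):
        result = self._tools[name](args, **kw)
        # Handlers must return JSON strings, never raw dicts.
        assert isinstance(result, str)
        return result

registry = MiniRegistry()

def example_tool(args, task_id=None):
    return json.dumps({"success": True, "echo": args.get("param", "")})

registry.register("example_tool", example_tool)
print(registry.dispatch("example_tool", {"param": "hello"}, task_id="t1"))
```

The `check_fn` gate mirrors the real behavior: a tool whose env vars are missing simply never appears in the model's tool list.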
+ +### Adding a Slash Command + +1. Add `CommandDef` to `COMMAND_REGISTRY` in `hermes_cli/commands.py` +2. Add handler in `cli.py` → `process_command()` +3. (Optional) Add gateway handler in `gateway/run.py` + +All consumers (help text, autocomplete, Telegram menu, Slack mapping) derive from the central registry automatically. + +### Agent Loop (High Level) + +``` +run_conversation(): + 1. Build system prompt + 2. Loop while iterations < max: + a. Call LLM (OpenAI-format messages + tool schemas) + b. If tool_calls → dispatch each via handle_function_call() → append results → continue + c. If text response → return + 3. Context compression triggers automatically near token limit +``` + +### Testing + +```bash +python -m pytest tests/ -o 'addopts=' -q # Full suite +python -m pytest tests/tools/ -q # Specific area +``` + +- Tests auto-redirect `HERMES_HOME` to temp dirs — never touch real `~/.hermes/` +- Run full suite before pushing any change +- Use `-o 'addopts='` to clear any baked-in pytest flags + +### Commit Conventions + +``` +type: concise subject line + +Optional body. 
+``` + +Types: `fix:`, `feat:`, `refactor:`, `docs:`, `chore:` + +### Key Rules + +- **Never break prompt caching** — don't change context, tools, or system prompt mid-conversation +- **Message role alternation** — never two assistant or two user messages in a row +- Use `get_hermes_home()` from `hermes_constants` for all paths (profile-safe) +- Config values go in `config.yaml`, secrets go in `.env` +- New tools need a `check_fn` so they only appear when requirements are met diff --git a/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-opencode.md b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-opencode.md new file mode 100644 index 000000000..2fe44e129 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-opencode.md @@ -0,0 +1,236 @@ +--- +title: "Opencode" +sidebar_label: "Opencode" +description: "Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Opencode + +Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions. Requires the opencode CLI installed and authenticated. 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/autonomous-ai-agents/opencode` | +| Version | `1.2.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `Coding-Agent`, `OpenCode`, `Autonomous`, `Refactoring`, `Code-Review` | +| Related skills | [`claude-code`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-claude-code), [`codex`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-codex), [`hermes-agent`](/docs/user-guide/skills/bundled/autonomous-ai-agents/autonomous-ai-agents-hermes-agent) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# OpenCode CLI + +Use [OpenCode](https://opencode.ai) as an autonomous coding worker orchestrated by Hermes terminal/process tools. OpenCode is a provider-agnostic, open-source AI coding agent with a TUI and CLI. + +## When to Use + +- User explicitly asks to use OpenCode +- You want an external coding agent to implement/refactor/review code +- You need long-running coding sessions with progress checks +- You want parallel task execution in isolated workdirs/worktrees + +## Prerequisites + +- OpenCode installed: `npm i -g opencode-ai@latest` or `brew install anomalyco/tap/opencode` +- Auth configured: `opencode auth login` or set provider env vars (OPENROUTER_API_KEY, etc.) +- Verify: `opencode auth list` should show at least one provider +- Git repository for code tasks (recommended) +- `pty=true` for interactive TUI sessions + +## Binary Resolution (Important) + +Shell environments may resolve different OpenCode binaries. 
If behavior differs between your terminal and Hermes, check: + +``` +terminal(command="which -a opencode") +terminal(command="opencode --version") +``` + +If needed, pin an explicit binary path: + +``` +terminal(command="$HOME/.opencode/bin/opencode run '...'", workdir="~/project", pty=true) +``` + +## One-Shot Tasks + +Use `opencode run` for bounded, non-interactive tasks: + +``` +terminal(command="opencode run 'Add retry logic to API calls and update tests'", workdir="~/project") +``` + +Attach context files with `-f`: + +``` +terminal(command="opencode run 'Review this config for security issues' -f config.yaml -f .env.example", workdir="~/project") +``` + +Show model thinking with `--thinking`: + +``` +terminal(command="opencode run 'Debug why tests fail in CI' --thinking", workdir="~/project") +``` + +Force a specific model: + +``` +terminal(command="opencode run 'Refactor auth module' --model openrouter/anthropic/claude-sonnet-4", workdir="~/project") +``` + +## Interactive Sessions (Background) + +For iterative work requiring multiple exchanges, start the TUI in background: + +``` +terminal(command="opencode", workdir="~/project", background=true, pty=true) +# Returns session_id + +# Send a prompt +process(action="submit", session_id="", data="Implement OAuth refresh flow and add tests") + +# Monitor progress +process(action="poll", session_id="") +process(action="log", session_id="") + +# Send follow-up input +process(action="submit", session_id="", data="Now add error handling for token expiry") + +# Exit cleanly — Ctrl+C +process(action="write", session_id="", data="\x03") +# Or just kill the process +process(action="kill", session_id="") +``` + +**Important:** Do NOT use `/exit` — it is not a valid OpenCode command and will open an agent selector dialog instead. Use Ctrl+C (`\x03`) or `process(action="kill")` to exit. 
+ +### TUI Keybindings + +| Key | Action | +|-----|--------| +| `Enter` | Submit message (press twice if needed) | +| `Tab` | Switch between agents (build/plan) | +| `Ctrl+P` | Open command palette | +| `Ctrl+X L` | Switch session | +| `Ctrl+X M` | Switch model | +| `Ctrl+X N` | New session | +| `Ctrl+X E` | Open editor | +| `Ctrl+C` | Exit OpenCode | + +### Resuming Sessions + +After exiting, OpenCode prints a session ID. Resume with: + +``` +terminal(command="opencode -c", workdir="~/project", background=true, pty=true) # Continue last session +terminal(command="opencode -s ses_abc123", workdir="~/project", background=true, pty=true) # Specific session +``` + +## Common Flags + +| Flag | Use | +|------|-----| +| `run 'prompt'` | One-shot execution and exit | +| `--continue` / `-c` | Continue the last OpenCode session | +| `--session <id>` / `-s` | Continue a specific session | +| `--agent <name>` | Choose OpenCode agent (build or plan) | +| `--model provider/model` | Force specific model | +| `--format json` | Machine-readable output/events | +| `--file <path>` / `-f` | Attach file(s) to the message | +| `--thinking` | Show model thinking blocks | +| `--variant <level>` | Reasoning effort (high, max, minimal) | +| `--title <name>` | Name the session | +| `--attach <url>` | Connect to a running opencode server | + +## Procedure + +1. Verify tool readiness: + - `terminal(command="opencode --version")` + - `terminal(command="opencode auth list")` +2. For bounded tasks, use `opencode run '...'` (no pty needed). +3. For iterative tasks, start `opencode` with `background=true, pty=true`. +4. Monitor long tasks with `process(action="poll"|"log")`. +5. If OpenCode asks for input, respond via `process(action="submit", ...)`. +6. Exit with `process(action="write", data="\x03")` or `process(action="kill")`. +7. Summarize file changes, test results, and next steps back to user.
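Step 2's bounded one-shot is essentially a subprocess call with a wall-clock limit. A minimal Python sketch of that pattern (illustrative only; inside Hermes the `terminal` tool's `timeout` parameter handles this for you, and the `opencode` invocation shown in the comment assumes the CLI is installed and on PATH):

```python
import subprocess

def run_bounded(cmd, timeout_s=300):
    """Run a one-shot command with a wall-clock limit, like terminal(..., timeout=300)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired:
        # Task needs more time: retry with a bigger budget or run it in background mode.
        return None, ""

# Illustrative call (requires the opencode CLI on PATH):
# rc, out = run_bounded(["opencode", "run", "Add retry logic to API calls"])

# Self-contained demo with a harmless command:
rc, out = run_bounded(["echo", "bounded run ok"], timeout_s=5)
print(rc, out.strip())
```

A `None` return code signals a timeout rather than a failure, which is why complex coding tasks deserve generous budgets up front.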
+ +## PR Review Workflow + +OpenCode has a built-in PR command: + +``` +terminal(command="opencode pr 42", workdir="~/project", pty=true) +``` + +Or review in a temporary clone for isolation: + +``` +terminal(command="REVIEW=$(mktemp -d) && git clone https://github.com/user/repo.git $REVIEW && cd $REVIEW && opencode run 'Review this PR vs main. Report bugs, security risks, test gaps, and style issues.' -f $(git diff origin/main --name-only | head -20 | tr '\n' ' ')", pty=true) +``` + +## Parallel Work Pattern + +Use separate workdirs/worktrees to avoid collisions: + +``` +terminal(command="opencode run 'Fix issue #101 and commit'", workdir="/tmp/issue-101", background=true, pty=true) +terminal(command="opencode run 'Add parser regression tests and commit'", workdir="/tmp/issue-102", background=true, pty=true) +process(action="list") +``` + +## Session & Cost Management + +List past sessions: + +``` +terminal(command="opencode session list") +``` + +Check token usage and costs: + +``` +terminal(command="opencode stats") +terminal(command="opencode stats --days 7 --models anthropic/claude-sonnet-4") +``` + +## Pitfalls + +- Interactive `opencode` (TUI) sessions require `pty=true`. The `opencode run` command does NOT need pty. +- `/exit` is NOT a valid command — it opens an agent selector. Use Ctrl+C to exit the TUI. +- PATH mismatch can select the wrong OpenCode binary/model config. +- If OpenCode appears stuck, inspect logs before killing: + - `process(action="log", session_id="")` +- Avoid sharing one working directory across parallel OpenCode sessions. +- Enter may need to be pressed twice to submit in the TUI (once to finalize text, once to send). + +## Verification + +Smoke test: + +``` +terminal(command="opencode run 'Respond with exactly: OPENCODE_SMOKE_OK'") +``` + +Success criteria: +- Output includes `OPENCODE_SMOKE_OK` +- Command exits without provider/model errors +- For code tasks: expected files changed and tests pass + +## Rules + +1. 
Prefer `opencode run` for one-shot automation — it's simpler and doesn't need pty. +2. Use interactive background mode only when iteration is needed. +3. Always scope OpenCode sessions to a single repo/workdir. +4. For long tasks, provide progress updates from `process` logs. +5. Report concrete outcomes (files changed, tests, remaining risks). +6. Exit interactive sessions with Ctrl+C or kill, never `/exit`. diff --git a/website/docs/user-guide/skills/bundled/creative/creative-architecture-diagram.md b/website/docs/user-guide/skills/bundled/creative/creative-architecture-diagram.md new file mode 100644 index 000000000..a5a8c5084 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/creative/creative-architecture-diagram.md @@ -0,0 +1,164 @@ +--- +title: "Architecture Diagram" +sidebar_label: "Architecture Diagram" +description: "Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Architecture Diagram + +Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics. Semantic component colors (cyan=frontend, emerald=backend, violet=database, amber=cloud/AWS, rose=security, orange=message bus), JetBrains Mono font, grid background. Best suited for software architecture, cloud/VPC topology, microservice maps, service-mesh diagrams, database + API layer diagrams, security groups, message buses — anything that fits a tech-infra deck with a dark aesthetic. If a more specialized diagramming skill exists for the subject (scientific, educational, hand-drawn, animated, etc.), prefer that — otherwise this skill can also serve as a general-purpose SVG diagram fallback. Based on Cocoon AI's architecture-diagram-generator (MIT). 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/creative/architecture-diagram` | +| Version | `1.0.0` | +| Author | Cocoon AI (hello@cocoon-ai.com), ported by Hermes Agent | +| License | MIT | +| Tags | `architecture`, `diagrams`, `SVG`, `HTML`, `visualization`, `infrastructure`, `cloud` | +| Related skills | [`concept-diagrams`](/docs/user-guide/skills/optional/creative/creative-concept-diagrams), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Architecture Diagram Skill + +Generate professional, dark-themed technical architecture diagrams as standalone HTML files with inline SVG graphics. No external tools, no API keys, no rendering libraries — just write the HTML file and open it in a browser. + +## Scope + +**Best suited for:** +- Software system architecture (frontend / backend / database layers) +- Cloud infrastructure (VPC, regions, subnets, managed services) +- Microservice / service-mesh topology +- Database + API map, deployment diagrams +- Anything with a tech-infra subject that fits a dark, grid-backed aesthetic + +**Look elsewhere first for:** +- Physics, chemistry, math, biology, or other scientific subjects +- Physical objects (vehicles, hardware, anatomy, cross-sections) +- Floor plans, narrative journeys, educational / textbook-style visuals +- Hand-drawn whiteboard sketches (consider `excalidraw`) +- Animated explainers (consider an animation skill) + +If a more specialized skill is available for the subject, prefer that. If none fits, this skill can also serve as a general SVG diagram fallback — the output will just carry the dark tech aesthetic described below. 
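The design system described below reduces nicely to a small generator. The sketch here emits one component box using the skill's semantic palette and the double-rect masking trick from the Technical Implementation Details; the layout values and helper name are illustrative, not part of the skill itself.

```python
PALETTE = {
    # Semantic fill/stroke pairs from the skill's color table.
    "frontend": ("rgba(8, 51, 68, 0.4)", "#22d3ee"),
    "backend":  ("rgba(6, 78, 59, 0.4)", "#34d399"),
    "database": ("rgba(76, 29, 149, 0.4)", "#a78bfa"),
}

def component_svg(kind, x, y, w, h, label):
    """One component box: an opaque base rect first (so arrows behind it
    don't show through), then the semi-transparent tinted rect on top."""
    fill, stroke = PALETTE[kind]
    return (
        f'<rect x="{x}" y="{y}" width="{w}" height="{h}" rx="6" fill="#0f172a"/>\n'
        f'<rect x="{x}" y="{y}" width="{w}" height="{h}" rx="6" '
        f'fill="{fill}" stroke="{stroke}" stroke-width="1.5"/>\n'
        f'<text x="{x + w / 2}" y="{y + h / 2}" font-size="12" '
        f'fill="{stroke}" text-anchor="middle">{label}</text>'
    )

print(component_svg("frontend", 40, 40, 180, 60, "Web App"))
```

Concatenating a handful of such fragments inside an `<svg>` element (plus the grid background and arrow defs) yields a diagram in the house style.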
+
+Based on [Cocoon AI's architecture-diagram-generator](https://github.com/Cocoon-AI/architecture-diagram-generator) (MIT).
+
+## Workflow
+
+1. User describes their system architecture (components, connections, technologies)
+2. Generate the HTML file following the design system below
+3. Save with `write_file` to a `.html` file (e.g. `~/architecture-diagram.html`)
+4. User opens in any browser — works offline, no dependencies
+
+### Output Location
+
+Save diagrams to a user-specified path, or default to the current working directory:
+```
+./[project-name]-architecture.html
+```
+
+### Preview
+
+After saving, suggest the user open it:
+```bash
+# macOS
+open ./my-architecture.html
+# Linux
+xdg-open ./my-architecture.html
+```
+
+## Design System & Visual Language
+
+### Color Palette (Semantic Mapping)
+
+Use specific `rgba` fills and hex strokes to categorize components:
+
+| Component Type | Fill (rgba) | Stroke (Hex) |
+| :--- | :--- | :--- |
+| **Frontend** | `rgba(8, 51, 68, 0.4)` | `#22d3ee` (cyan-400) |
+| **Backend** | `rgba(6, 78, 59, 0.4)` | `#34d399` (emerald-400) |
+| **Database** | `rgba(76, 29, 149, 0.4)` | `#a78bfa` (violet-400) |
+| **AWS/Cloud** | `rgba(120, 53, 15, 0.3)` | `#fbbf24` (amber-400) |
+| **Security** | `rgba(136, 19, 55, 0.4)` | `#fb7185` (rose-400) |
+| **Message Bus** | `rgba(251, 146, 60, 0.3)` | `#fb923c` (orange-400) |
+| **External** | `rgba(30, 41, 59, 0.5)` | `#94a3b8` (slate-400) |
+
+### Typography & Background
+- **Font:** JetBrains Mono (Monospace), loaded from Google Fonts
+- **Sizes:** 12px (Names), 9px (Sublabels), 8px (Annotations), 7px (Tiny labels)
+- **Background:** Slate-950 (`#020617`) with a subtle 40px grid pattern
+
+```svg
+<!-- 40px grid on slate-950; exact stroke values are illustrative -->
+<defs>
+  <pattern id="grid" width="40" height="40" patternUnits="userSpaceOnUse">
+    <path d="M 40 0 L 0 0 0 40" fill="none" stroke="#1e293b" stroke-width="0.5"/>
+  </pattern>
+</defs>
+<rect width="100%" height="100%" fill="#020617"/>
+<rect width="100%" height="100%" fill="url(#grid)"/>
+```
+
+## Technical Implementation Details
+
+### Component Rendering
+Components are rounded rectangles (`rx="6"`) with 1.5px strokes. To prevent arrows from showing through semi-transparent fills, use a **double-rect masking technique**:
+1.
Draw an opaque background rect (`#0f172a`)
+2. Draw the semi-transparent styled rect on top
+
+### Connection Rules
+- **Z-Order:** Draw arrows *early* in the SVG (after the grid) so they render behind component boxes
+- **Arrowheads:** Defined via SVG markers
+- **Security Flows:** Use dashed lines in rose color (`#fb7185`)
+- **Boundaries:**
+  - *Security Groups:* Dashed (`4,4`), rose color
+  - *Regions:* Large dashed (`8,4`), amber color, `rx="12"`
+
+### Spacing & Layout Logic
+- **Standard Height:** 60px (Services); 80-120px (Large components)
+- **Vertical Gap:** Minimum 40px between components
+- **Message Buses:** Must be placed *in the gap* between services, not overlapping them
+- **Legend Placement:** **CRITICAL.** Must be placed outside all boundary boxes. Calculate the lowest Y-coordinate of all boundaries and place the legend at least 20px below it.
+
+## Document Structure
+
+The generated HTML file follows a four-part layout:
+1. **Header:** Title with a pulsing dot indicator and subtitle
+2. **Main SVG:** The diagram contained within a rounded border card
+3. **Summary Cards:** A grid of three cards below the diagram for high-level details
+4. **Footer:** Minimal metadata
+
+### Info Card Pattern
+```html
+<!-- Tag and class names are illustrative -->
+<div class="info-card">
+  <div class="info-card-header">
+    <span class="info-card-title">Title</span>
+  </div>
+  <ul class="info-card-list">
+    <li>• Item one</li>
+    <li>• Item two</li>
+  </ul>
+</div>
+``` + +## Output Requirements +- **Single File:** One self-contained `.html` file +- **No External Dependencies:** All CSS and SVG must be inline (except Google Fonts) +- **No JavaScript:** Use pure CSS for any animations (like pulsing dots) +- **Compatibility:** Must render correctly in any modern web browser + +## Template Reference + +Load the full HTML template for the exact structure, CSS, and SVG component examples: + +``` +skill_view(name="architecture-diagram", file_path="templates/template.html") +``` + +The template contains working examples of every component type (frontend, backend, database, cloud, security), arrow styles (standard, dashed, curved), security groups, region boundaries, and the legend — use it as your structural reference when generating diagrams. diff --git a/website/docs/user-guide/skills/bundled/creative/creative-ascii-art.md b/website/docs/user-guide/skills/bundled/creative/creative-ascii-art.md new file mode 100644 index 000000000..852fb28a4 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/creative/creative-ascii-art.md @@ -0,0 +1,337 @@ +--- +title: "Ascii Art" +sidebar_label: "Ascii Art" +description: "Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Ascii Art + +Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required. 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/creative/ascii-art` | +| Version | `4.0.0` | +| Author | 0xbyt4, Hermes Agent | +| License | MIT | +| Tags | `ASCII`, `Art`, `Banners`, `Creative`, `Unicode`, `Text-Art`, `pyfiglet`, `figlet`, `cowsay`, `boxes` | +| Related skills | [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# ASCII Art Skill + +Multiple tools for different ASCII art needs. All tools are local CLI programs or free REST APIs — no API keys required. + +## Tool 1: Text Banners (pyfiglet — local) + +Render text as large ASCII art banners. 571 built-in fonts. + +### Setup + +```bash +pip install pyfiglet --break-system-packages -q +``` + +### Usage + +```bash +python3 -m pyfiglet "YOUR TEXT" -f slant +python3 -m pyfiglet "TEXT" -f doom -w 80 # Set width +python3 -m pyfiglet --list_fonts # List all 571 fonts +``` + +### Recommended fonts + +| Style | Font | Best for | +|-------|------|----------| +| Clean & modern | `slant` | Project names, headers | +| Bold & blocky | `doom` | Titles, logos | +| Big & readable | `big` | Banners | +| Classic banner | `banner3` | Wide displays | +| Compact | `small` | Subtitles | +| Cyberpunk | `cyberlarge` | Tech themes | +| 3D effect | `3-d` | Splash screens | +| Gothic | `gothic` | Dramatic text | + +### Tips + +- Preview 2-3 fonts and let the user pick their favorite +- Short text (1-8 chars) works best with detailed fonts like `doom` or `block` +- Long text works better with compact fonts like `small` or `mini` + +## Tool 2: Text Banners (asciified API — remote, no install) + +Free REST API that converts text to ASCII art. 250+ FIGlet fonts. Returns plain text directly — no parsing needed. 
Use this when pyfiglet is not installed or as a quick alternative. + +### Usage (via terminal curl) + +```bash +# Basic text banner (default font) +curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello+World" + +# With a specific font +curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Slant" +curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Doom" +curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Star+Wars" +curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=3-D" +curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Banner3" + +# List all available fonts (returns JSON array) +curl -s "https://asciified.thelicato.io/api/v2/fonts" +``` + +### Tips + +- URL-encode spaces as `+` in the text parameter +- The response is plain text ASCII art — no JSON wrapping, ready to display +- Font names are case-sensitive; use the fonts endpoint to get exact names +- Works from any terminal with curl — no Python or pip needed + +## Tool 3: Cowsay (Message Art) + +Classic tool that wraps text in a speech bubble with an ASCII character. + +### Setup + +```bash +sudo apt install cowsay -y # Debian/Ubuntu +# brew install cowsay # macOS +``` + +### Usage + +```bash +cowsay "Hello World" +cowsay -f tux "Linux rules" # Tux the penguin +cowsay -f dragon "Rawr!" # Dragon +cowsay -f stegosaurus "Roar!" # Stegosaurus +cowthink "Hmm..." 
# Thought bubble +cowsay -l # List all characters +``` + +### Available characters (50+) + +`beavis.zen`, `bong`, `bunny`, `cheese`, `daemon`, `default`, `dragon`, +`dragon-and-cow`, `elephant`, `eyes`, `flaming-skull`, `ghostbusters`, +`hellokitty`, `kiss`, `kitty`, `koala`, `luke-koala`, `mech-and-cow`, +`meow`, `moofasa`, `moose`, `ren`, `sheep`, `skeleton`, `small`, +`stegosaurus`, `stimpy`, `supermilker`, `surgery`, `three-eyes`, +`turkey`, `turtle`, `tux`, `udder`, `vader`, `vader-koala`, `www` + +### Eye/tongue modifiers + +```bash +cowsay -b "Borg" # =_= eyes +cowsay -d "Dead" # x_x eyes +cowsay -g "Greedy" # $_$ eyes +cowsay -p "Paranoid" # @_@ eyes +cowsay -s "Stoned" # *_* eyes +cowsay -w "Wired" # O_O eyes +cowsay -e "OO" "Msg" # Custom eyes +cowsay -T "U " "Msg" # Custom tongue +``` + +## Tool 4: Boxes (Decorative Borders) + +Draw decorative ASCII art borders/frames around any text. 70+ built-in designs. + +### Setup + +```bash +sudo apt install boxes -y # Debian/Ubuntu +# brew install boxes # macOS +``` + +### Usage + +```bash +echo "Hello World" | boxes # Default box +echo "Hello World" | boxes -d stone # Stone border +echo "Hello World" | boxes -d parchment # Parchment scroll +echo "Hello World" | boxes -d cat # Cat border +echo "Hello World" | boxes -d dog # Dog border +echo "Hello World" | boxes -d unicornsay # Unicorn +echo "Hello World" | boxes -d diamonds # Diamond pattern +echo "Hello World" | boxes -d c-cmt # C-style comment +echo "Hello World" | boxes -d html-cmt # HTML comment +echo "Hello World" | boxes -a c # Center text +boxes -l # List all 70+ designs +``` + +### Combine with pyfiglet or asciified + +```bash +python3 -m pyfiglet "HERMES" -f slant | boxes -d stone +# Or without pyfiglet installed: +curl -s "https://asciified.thelicato.io/api/v2/ascii?text=HERMES&font=Slant" | boxes -d stone +``` + +## Tool 5: TOIlet (Colored Text Art) + +Like pyfiglet but with ANSI color effects and visual filters. Great for terminal eye candy. 
+ +### Setup + +```bash +sudo apt install toilet toilet-fonts -y # Debian/Ubuntu +# brew install toilet # macOS +``` + +### Usage + +```bash +toilet "Hello World" # Basic text art +toilet -f bigmono12 "Hello" # Specific font +toilet --gay "Rainbow!" # Rainbow coloring +toilet --metal "Metal!" # Metallic effect +toilet -F border "Bordered" # Add border +toilet -F border --gay "Fancy!" # Combined effects +toilet -f pagga "Block" # Block-style font (unique to toilet) +toilet -F list # List available filters +``` + +### Filters + +`crop`, `gay` (rainbow), `metal`, `flip`, `flop`, `180`, `left`, `right`, `border` + +**Note**: toilet outputs ANSI escape codes for colors — works in terminals but may not render in all contexts (e.g., plain text files, some chat platforms). + +## Tool 6: Image to ASCII Art + +Convert images (PNG, JPEG, GIF, WEBP) to ASCII art. + +### Option A: ascii-image-converter (recommended, modern) + +```bash +# Install +sudo snap install ascii-image-converter +# OR: go install github.com/TheZoraiz/ascii-image-converter@latest +``` + +```bash +ascii-image-converter image.png # Basic +ascii-image-converter image.png -C # Color output +ascii-image-converter image.png -d 60,30 # Set dimensions +ascii-image-converter image.png -b # Braille characters +ascii-image-converter image.png -n # Negative/inverted +ascii-image-converter https://url/image.jpg # Direct URL +ascii-image-converter image.png --save-txt out # Save as text +``` + +### Option B: jp2a (lightweight, JPEG only) + +```bash +sudo apt install jp2a -y +jp2a --width=80 image.jpg +jp2a --colors image.jpg # Colorized +``` + +## Tool 7: Search Pre-Made ASCII Art + +Search curated ASCII art from the web. Use `terminal` with `curl`. + +### Source A: ascii.co.uk (recommended for pre-made art) + +Large collection of classic ASCII art organized by subject. Art is inside HTML `
<pre>` tags. Fetch the page with curl, then extract art with a small Python snippet.
+
+**URL pattern:** `https://ascii.co.uk/art/{subject}`
+
+**Step 1 — Fetch the page:**
+
+```bash
+curl -s 'https://ascii.co.uk/art/cat' -o /tmp/ascii_art.html
+```
+
+**Step 2 — Extract art from pre tags:**
+
+```python
+import re, html
+with open('/tmp/ascii_art.html') as f:
+    text = f.read()
+arts = re.findall(r'<pre[^>]*>(.*?)</pre>', text, re.DOTALL)
+for art in arts:
+    clean = re.sub(r'<[^>]+>', '', art)
+    clean = html.unescape(clean).strip()
+    if len(clean) > 30:
+        print(clean)
+        print('\n---\n')
+```
+
+**Available subjects** (use as URL path):
+- Animals: `cat`, `dog`, `horse`, `bird`, `fish`, `dragon`, `snake`, `rabbit`, `elephant`, `dolphin`, `butterfly`, `owl`, `wolf`, `bear`, `penguin`, `turtle`
+- Objects: `car`, `ship`, `airplane`, `rocket`, `guitar`, `computer`, `coffee`, `beer`, `cake`, `house`, `castle`, `sword`, `crown`, `key`
+- Nature: `tree`, `flower`, `sun`, `moon`, `star`, `mountain`, `ocean`, `rainbow`
+- Characters: `skull`, `robot`, `angel`, `wizard`, `pirate`, `ninja`, `alien`
+- Holidays: `christmas`, `halloween`, `valentine`
+
+**Tips:**
+- Preserve artist signatures/initials — important etiquette
+- Multiple art pieces per page — pick the best one for the user
+- Works reliably via curl, no JavaScript needed
+
+### Source B: GitHub Octocat API (fun easter egg)
+
+Returns a random GitHub Octocat with a wise quote. No auth needed.
+
+```bash
+curl -s https://api.github.com/octocat
+```
+ +### QR Codes as ASCII Art + +```bash +curl -s "qrenco.de/Hello+World" +curl -s "qrenco.de/https://example.com" +``` + +### Weather as ASCII Art + +```bash +curl -s "wttr.in/London" # Full weather report with ASCII graphics +curl -s "wttr.in/Moon" # Moon phase in ASCII art +curl -s "v2.wttr.in/London" # Detailed version +``` + +## Tool 9: LLM-Generated Custom Art (Fallback) + +When tools above don't have what's needed, generate ASCII art directly using these Unicode characters: + +### Character Palette + +**Box Drawing:** `╔ ╗ ╚ ╝ ║ ═ ╠ ╣ ╦ ╩ ╬ ┌ ┐ └ ┘ │ ─ ├ ┤ ┬ ┴ ┼ ╭ ╮ ╰ ╯` + +**Block Elements:** `░ ▒ ▓ █ ▄ ▀ ▌ ▐ ▖ ▗ ▘ ▝ ▚ ▞` + +**Geometric & Symbols:** `◆ ◇ ◈ ● ○ ◉ ■ □ ▲ △ ▼ ▽ ★ ☆ ✦ ✧ ◀ ▶ ◁ ▷ ⬡ ⬢ ⌂` + +### Rules + +- Max width: 60 characters per line (terminal-safe) +- Max height: 15 lines for banners, 25 for scenes +- Monospace only: output must render correctly in fixed-width fonts + +## Decision Flow + +1. **Text as a banner** → pyfiglet if installed, otherwise asciified API via curl +2. **Wrap a message in fun character art** → cowsay +3. **Add decorative border/frame** → boxes (can combine with pyfiglet/asciified) +4. **Art of a specific thing** (cat, rocket, dragon) → ascii.co.uk via curl + parsing +5. **Convert an image to ASCII** → ascii-image-converter or jp2a +6. **QR code** → qrenco.de via curl +7. **Weather/moon art** → wttr.in via curl +8. **Something custom/creative** → LLM generation with Unicode palette +9. 
**Any tool not installed** → install it, or fall back to next option diff --git a/website/docs/user-guide/skills/bundled/creative/creative-ascii-video.md b/website/docs/user-guide/skills/bundled/creative/creative-ascii-video.md new file mode 100644 index 000000000..18b1ca1fd --- /dev/null +++ b/website/docs/user-guide/skills/bundled/creative/creative-ascii-video.md @@ -0,0 +1,252 @@ +--- +title: "Ascii Video — Production pipeline for ASCII art video — any format" +sidebar_label: "Ascii Video" +description: "Production pipeline for ASCII art video — any format" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Ascii Video + +Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering. Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/creative/ascii-video` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# ASCII Video Production Pipeline + +## Creative Standard + +This is visual art. ASCII characters are the medium; cinema is the standard. + +**Before writing a single line of code**, articulate the creative concept. What is the mood? What visual story does this tell? 
What makes THIS project different from every other ASCII video? The user's prompt is a starting point — interpret it with creative ambition, not literal transcription. + +**First-render excellence is non-negotiable.** The output must be visually striking without requiring revision rounds. If something looks generic, flat, or like "AI-generated ASCII art," it is wrong — rethink the creative concept before shipping. + +**Go beyond the reference vocabulary.** The effect catalogs, shader presets, and palette libraries in the references are a starting vocabulary. For every project, combine, modify, and invent new patterns. The catalog is a palette of paints — you write the painting. + +**Be proactively creative.** Extend the skill's vocabulary when the project calls for it. If the references don't have what the vision demands, build it. Include at least one visual moment the user didn't ask for but will appreciate — a transition, an effect, a color choice that elevates the whole piece. + +**Cohesive aesthetic over technical correctness.** All scenes in a video must feel connected by a unifying visual language — shared color temperature, related character palettes, consistent motion vocabulary. A technically correct video where every scene uses a random different effect is an aesthetic failure. + +**Dense, layered, considered.** Every frame should reward viewing. Never flat black backgrounds. Always multi-grid composition. Always per-scene variation. Always intentional color. 
+ +## Modes + +| Mode | Input | Output | Reference | +|------|-------|--------|-----------| +| **Video-to-ASCII** | Video file | ASCII recreation of source footage | `references/inputs.md` § Video Sampling | +| **Audio-reactive** | Audio file | Generative visuals driven by audio features | `references/inputs.md` § Audio Analysis | +| **Generative** | None (or seed params) | Procedural ASCII animation | `references/effects.md` | +| **Hybrid** | Video + audio | ASCII video with audio-reactive overlays | Both input refs | +| **Lyrics/text** | Audio + text/SRT | Timed text with visual effects | `references/inputs.md` § Text/Lyrics | +| **TTS narration** | Text quotes + TTS API | Narrated testimonial/quote video with typed text | `references/inputs.md` § TTS Integration | + +## Stack + +Single self-contained Python script per project. No GPU required. + +| Layer | Tool | Purpose | +|-------|------|---------| +| Core | Python 3.10+, NumPy | Math, array ops, vectorized effects | +| Signal | SciPy | FFT, peak detection (audio modes) | +| Imaging | Pillow (PIL) | Font rasterization, frame decoding, image I/O | +| Video I/O | ffmpeg (CLI) | Decode input, encode output, mux audio | +| Parallel | concurrent.futures | N workers for batch/clip rendering | +| TTS | ElevenLabs API (optional) | Generate narration clips | +| Optional | OpenCV | Video frame sampling, edge detection | + +## Pipeline Architecture + +Every mode follows the same 6-stage pipeline: + +``` +INPUT → ANALYZE → SCENE_FN → TONEMAP → SHADE → ENCODE +``` + +1. **INPUT** — Load/decode source material (video frames, audio samples, images, or nothing) +2. **ANALYZE** — Extract per-frame features (audio bands, video luminance/edges, motion vectors) +3. **SCENE_FN** — Scene function renders to pixel canvas (`uint8 H,W,3`). Composes multiple character grids via `_render_vf()` + pixel blend modes. See `references/composition.md` +4. **TONEMAP** — Percentile-based adaptive brightness normalization. 
See `references/composition.md` § Adaptive Tonemap +5. **SHADE** — Post-processing via `ShaderChain` + `FeedbackBuffer`. See `references/shaders.md` +6. **ENCODE** — Pipe raw RGB frames to ffmpeg for H.264/GIF encoding + +## Creative Direction + +### Aesthetic Dimensions + +| Dimension | Options | Reference | +|-----------|---------|-----------| +| **Character palette** | Density ramps, block elements, symbols, scripts (katakana, Greek, runes, braille), project-specific | `architecture.md` § Palettes | +| **Color strategy** | HSV, OKLAB/OKLCH, discrete RGB palettes, auto-generated harmony, monochrome, temperature | `architecture.md` § Color System | +| **Background texture** | Sine fields, fBM noise, domain warp, voronoi, reaction-diffusion, cellular automata, video | `effects.md` | +| **Primary effects** | Rings, spirals, tunnel, vortex, waves, interference, aurora, fire, SDFs, strange attractors | `effects.md` | +| **Particles** | Sparks, snow, rain, bubbles, runes, orbits, flocking boids, flow-field followers, trails | `effects.md` § Particles | +| **Shader mood** | Retro CRT, clean modern, glitch art, cinematic, dreamy, industrial, psychedelic | `shaders.md` | +| **Grid density** | xs(8px) through xxl(40px), mixed per layer | `architecture.md` § Grid System | +| **Coordinate space** | Cartesian, polar, tiled, rotated, fisheye, Möbius, domain-warped | `effects.md` § Transforms | +| **Feedback** | Zoom tunnel, rainbow trails, ghostly echo, rotating mandala, color evolution | `composition.md` § Feedback | +| **Masking** | Circle, ring, gradient, text stencil, animated iris/wipe/dissolve | `composition.md` § Masking | +| **Transitions** | Crossfade, wipe, dissolve, glitch cut, iris, mask-based reveal | `shaders.md` § Transitions | + +### Per-Section Variation + +Never use the same config for the entire video. 
For each section/scene: +- **Different background effect** (or compose 2-3) +- **Different character palette** (match the mood) +- **Different color strategy** (or at minimum a different hue) +- **Vary shader intensity** (more bloom during peaks, more grain during quiet) +- **Different particle types** if particles are active + +### Project-Specific Invention + +For every project, invent at least one of: +- A custom character palette matching the theme +- A custom background effect (combine/modify existing building blocks) +- A custom color palette (discrete RGB set matching the brand/mood) +- A custom particle character set +- A novel scene transition or visual moment + +Don't just pick from the catalog. The catalog is vocabulary — you write the poem. + +## Workflow + +### Step 1: Creative Vision + +Before any code, articulate the creative concept: + +- **Mood/atmosphere**: What should the viewer feel? Energetic, meditative, chaotic, elegant, ominous? +- **Visual story**: What happens over the duration? Build tension? Transform? Dissolve? +- **Color world**: Warm/cool? Monochrome? Neon? Earth tones? What's the dominant hue? +- **Character texture**: Dense data? Sparse stars? Organic dots? Geometric blocks? +- **What makes THIS different**: What's the one thing that makes this project unique? +- **Emotional arc**: How do scenes progress? Open with energy, build to climax, resolve? + +Map the user's prompt to aesthetic choices. A "chill lo-fi visualizer" demands different everything from a "glitch cyberpunk data stream." + +### Step 2: Technical Design + +- **Mode** — which of the 6 modes above +- **Resolution** — landscape 1920x1080 (default), portrait 1080x1920, square 1080x1080 @ 24fps +- **Hardware detection** — auto-detect cores/RAM, set quality profile. 
See `references/optimization.md` +- **Sections** — map timestamps to scene functions, each with its own effect/palette/color/shader config +- **Output format** — MP4 (default), GIF (640x360 @ 15fps), PNG sequence + +### Step 3: Build the Script + +Single Python file. Components (with references): + +1. **Hardware detection + quality profile** — `references/optimization.md` +2. **Input loader** — mode-dependent; `references/inputs.md` +3. **Feature analyzer** — audio FFT, video luminance, or synthetic +4. **Grid + renderer** — multi-density grids with bitmap cache; `references/architecture.md` +5. **Character palettes** — multiple per project; `references/architecture.md` § Palettes +6. **Color system** — HSV + discrete RGB + harmony generation; `references/architecture.md` § Color +7. **Scene functions** — each returns `canvas (uint8 H,W,3)`; `references/scenes.md` +8. **Tonemap** — adaptive brightness normalization; `references/composition.md` +9. **Shader pipeline** — `ShaderChain` + `FeedbackBuffer`; `references/shaders.md` +10. **Scene table + dispatcher** — time → scene function + config; `references/scenes.md` +11. **Parallel encoder** — N-worker clip rendering with ffmpeg pipes +12. **Main** — orchestrate full pipeline + +### Step 4: Quality Verification + +- **Test frames first**: render single frames at key timestamps before full render +- **Brightness check**: `canvas.mean() > 8` for all ASCII content. If dark, lower gamma +- **Visual coherence**: do all scenes feel like they belong to the same video? +- **Creative vision check**: does the output match the concept from Step 1? If it looks generic, go back + +## Critical Implementation Notes + +### Brightness — Use `tonemap()`, Not Linear Multipliers + +This is the #1 visual issue. ASCII on black is inherently dark. **Never use `canvas * N` multipliers** — they clip highlights. 
Use adaptive tonemap: + +```python +def tonemap(canvas, gamma=0.75): + f = canvas.astype(np.float32) + lo, hi = np.percentile(f[::4, ::4], [1, 99.5]) + if hi - lo < 10: hi = lo + 10 + f = np.clip((f - lo) / (hi - lo), 0, 1) ** gamma + return (f * 255).astype(np.uint8) +``` + +Pipeline: `scene_fn() → tonemap() → FeedbackBuffer → ShaderChain → ffmpeg` + +Per-scene gamma: default 0.75, solarize 0.55, posterize 0.50, bright scenes 0.85. Use `screen` blend (not `overlay`) for dark layers. + +### Font Cell Height + +macOS Pillow: `textbbox()` returns wrong height. Use `font.getmetrics()`: `cell_height = ascent + descent`. See `references/troubleshooting.md`. + +### ffmpeg Pipe Deadlock + +Never `stderr=subprocess.PIPE` with long-running ffmpeg — buffer fills at 64KB and deadlocks. Redirect to file. See `references/troubleshooting.md`. + +### Font Compatibility + +Not all Unicode chars render in all fonts. Validate palettes at init — render each char, check for blank output. See `references/troubleshooting.md`. + +### Per-Clip Architecture + +For segmented videos (quotes, scenes, chapters), render each as a separate clip file for parallel rendering and selective re-rendering. See `references/scenes.md`. 
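The adaptive `tonemap()` from the brightness note above can be sanity-checked offline before a full render: a dim synthetic canvas should come out well above the `mean > 8` floor. The canvas values below are illustrative stand-ins for a sparse ASCII frame, not real render output.

```python
import numpy as np

def tonemap(canvas, gamma=0.75):  # same function as in the brightness note
    f = canvas.astype(np.float32)
    lo, hi = np.percentile(f[::4, ::4], [1, 99.5])
    if hi - lo < 10:
        hi = lo + 10
    f = np.clip((f - lo) / (hi - lo), 0, 1) ** gamma
    return (f * 255).astype(np.uint8)

# Dim synthetic frame: values 0-39, typical of sparse ASCII on black.
rng = np.random.default_rng(0)
dark = rng.integers(0, 40, size=(90, 160, 3), dtype=np.uint8)
bright = tonemap(dark)
print(round(float(dark.mean()), 1), "->", round(float(bright.mean()), 1))
```

If the mapped mean does not rise well above the input mean, the gamma or percentile window is off for that scene; lower the gamma as the per-scene table suggests.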
+ +## Performance Targets + +| Component | Budget | +|-----------|--------| +| Feature extraction | 1-5ms | +| Effect function | 2-15ms | +| Character render | 80-150ms (bottleneck) | +| Shader pipeline | 5-25ms | +| **Total** | ~100-200ms/frame | + +## References + +| File | Contents | +|------|----------| +| `references/architecture.md` | Grid system, resolution presets, font selection, character palettes (20+), color system (HSV + OKLAB + discrete RGB + harmony generation), `_render_vf()` helper, GridLayer class | +| `references/composition.md` | Pixel blend modes (20 modes), `blend_canvas()`, multi-grid composition, adaptive `tonemap()`, `FeedbackBuffer`, `PixelBlendStack`, masking/stencil system | +| `references/effects.md` | Effect building blocks: value field generators, hue fields, noise/fBM/domain warp, voronoi, reaction-diffusion, cellular automata, SDFs, strange attractors, particle systems, coordinate transforms, temporal coherence | +| `references/shaders.md` | `ShaderChain`, `_apply_shader_step()` dispatch, 38 shader catalog, audio-reactive scaling, transitions, tint presets, output format encoding, terminal rendering | +| `references/scenes.md` | Scene protocol, `Renderer` class, `SCENES` table, `render_clip()`, beat-synced cutting, parallel rendering, design patterns (layer hierarchy, directional arcs, visual metaphors, compositional techniques), complete scene examples at every complexity level, scene design checklist | +| `references/inputs.md` | Audio analysis (FFT, bands, beats), video sampling, image conversion, text/lyrics, TTS integration (ElevenLabs, voice assignment, audio mixing) | +| `references/optimization.md` | Hardware detection, quality profiles, vectorized patterns, parallel rendering, memory management, performance budgets | +| `references/troubleshooting.md` | NumPy broadcasting traps, blend mode pitfalls, multiprocessing/pickling, brightness diagnostics, ffmpeg issues, font problems, common mistakes | + +--- + +## Creative 
Divergence (use only when user requests experimental/creative/unique output) + +If the user asks for creative, experimental, surprising, or unconventional output, select the strategy that best fits and reason through its steps BEFORE generating code. + +- **Forced Connections** — when the user wants cross-domain inspiration ("make it look organic," "industrial aesthetic") +- **Conceptual Blending** — when the user names two things to combine ("ocean meets music," "space + calligraphy") +- **Oblique Strategies** — when the user is maximally open ("surprise me," "something I've never seen") + +### Forced Connections +1. Pick a domain unrelated to the visual goal (weather systems, microbiology, architecture, fluid dynamics, textile weaving) +2. List its core visual/structural elements (erosion → gradual reveal; mitosis → splitting duplication; weaving → interlocking patterns) +3. Map those elements onto ASCII characters and animation patterns +4. Synthesize — what does "erosion" or "crystallization" look like in a character grid? + +### Conceptual Blending +1. Name two distinct visual/conceptual spaces (e.g., ocean waves + sheet music) +2. Map correspondences (crests = high notes, troughs = rests, foam = staccato) +3. Blend selectively — keep the most interesting mappings, discard forced ones +4. Develop emergent properties that exist only in the blend + +### Oblique Strategies +1. Draw one: "Honor thy error as a hidden intention" / "Use an old idea" / "What would your closest friend do?" / "Emphasize the flaws" / "Turn it upside down" / "Only a part, not the whole" / "Reverse" +2. Interpret the directive against the current ASCII animation challenge +3. 
Apply the lateral insight to the visual design before writing code diff --git a/website/docs/user-guide/skills/bundled/creative/creative-baoyu-comic.md b/website/docs/user-guide/skills/bundled/creative/creative-baoyu-comic.md new file mode 100644 index 000000000..c1b37bc80 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/creative/creative-baoyu-comic.md @@ -0,0 +1,263 @@ +--- +title: "Baoyu Comic — Knowledge comic creator supporting multiple art styles and tones" +sidebar_label: "Baoyu Comic" +description: "Knowledge comic creator supporting multiple art styles and tones" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Baoyu Comic + +Knowledge comic creator supporting multiple art styles and tones. Creates original educational comics with detailed panel layouts and sequential image generation. Use when user asks to create "知识漫画", "教育漫画", "biography comic", "tutorial comic", or "Logicomix-style comic". + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/creative/baoyu-comic` | +| Version | `1.56.1` | +| Author | 宝玉 (JimLiu) | +| License | MIT | +| Tags | `comic`, `knowledge-comic`, `creative`, `image-generation` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Knowledge Comic Creator + +Adapted from [baoyu-comic](https://github.com/JimLiu/baoyu-skills) for Hermes Agent's tool ecosystem. + +Create original knowledge comics with flexible art style × tone combinations. + +## When to Use + +Trigger this skill when the user asks to create a knowledge/educational comic, biography comic, tutorial comic, or uses terms like "知识漫画", "教育漫画", or "Logicomix-style". 
The user provides content (text, file path, URL, or topic) and optionally specifies art style, tone, layout, aspect ratio, or language. + +## Reference Images + +Hermes' `image_generate` tool is **prompt-only** — it accepts a text prompt and an aspect ratio, and returns an image URL. It does **NOT** accept reference images. When the user supplies a reference image, use it to **extract traits in text** that get embedded in every page prompt: + +**Intake**: Accept file paths when the user provides them (or pastes images in conversation). +- File path(s) → copy to `refs/NN-ref-{slug}.{ext}` alongside the comic output for provenance +- Pasted image with no path → ask the user for the path via `clarify`, or extract style traits verbally as a text fallback +- No reference → skip this section + +**Usage modes** (per reference): + +| Usage | Effect | +|-------|--------| +| `style` | Extract style traits (line treatment, texture, mood) and append to every page's prompt body | +| `palette` | Extract hex colors and append to every page's prompt body | +| `scene` | Extract scene composition or subject notes and append to the relevant page(s) | + +**Record in each page's prompt frontmatter** when refs exist: + +```yaml +references: + - ref_id: 01 + filename: 01-ref-scene.png + usage: style + traits: "muted earth tones, soft-edged ink wash, low-contrast backgrounds" +``` + +Character consistency is driven by **text descriptions** in `characters/characters.md` (written in Step 3) that get embedded inline in every page prompt (Step 5). The optional PNG character sheet generated in Step 7.1 is a human-facing review artifact, not an input to `image_generate`. 
+ +## Options + +### Visual Dimensions + +| Option | Values | Description | +|--------|--------|-------------| +| Art | ligne-claire (default), manga, realistic, ink-brush, chalk, minimalist | Art style / rendering technique | +| Tone | neutral (default), warm, dramatic, romantic, energetic, vintage, action | Mood / atmosphere | +| Layout | standard (default), cinematic, dense, splash, mixed, webtoon, four-panel | Panel arrangement | +| Aspect | 3:4 (default, portrait), 4:3 (landscape), 16:9 (widescreen) | Page aspect ratio | +| Language | auto (default), zh, en, ja, etc. | Output language | +| Refs | File paths | Reference images used for style / palette trait extraction (not passed to the image model). See [Reference Images](#reference-images) above. | + +### Partial Workflow Options + +| Option | Description | +|--------|-------------| +| Storyboard only | Generate storyboard only, skip prompts and images | +| Prompts only | Generate storyboard + prompts, skip images | +| Images only | Generate images from existing prompts directory | +| Regenerate N | Regenerate specific page(s) only (e.g., `3` or `2,5,8`) | + +Details: [references/partial-workflows.md](https://github.com/NousResearch/hermes-agent/blob/main/skills/creative/baoyu-comic/references/partial-workflows.md) + +### Art, Tone & Preset Catalogue + +- **Art styles** (6): `ligne-claire`, `manga`, `realistic`, `ink-brush`, `chalk`, `minimalist`. 
Full definitions at `references/art-styles/ + + + + + +``` + +Key implementation patterns: +- **Seeded randomness**: Always `randomSeed()` + `noiseSeed()` for reproducibility +- **Color mode**: Use `colorMode(HSB, 360, 100, 100, 100)` for intuitive color control +- **State separation**: CONFIG for parameters, PALETTE for colors, globals for mutable state +- **Class-based entities**: Particles, agents, shapes as classes with `update()` + `display()` methods +- **Offscreen buffers**: `createGraphics()` for layered composition, trails, masks + +### Step 4: Preview & Iterate + +- Open HTML file directly in browser — no server needed for basic sketches +- For `loadImage()`/`loadFont()` from local files: use `scripts/serve.sh` or `python3 -m http.server` +- Chrome DevTools Performance tab to verify 60fps +- Test at target export resolution, not just the window size +- Adjust parameters until the visual matches the concept from Step 1 + +### Step 5: Export + +| Format | Method | Command | +|--------|--------|---------| +| **PNG** | `saveCanvas('output', 'png')` in `keyPressed()` | Press 's' to save | +| **High-res PNG** | Puppeteer headless capture | `node scripts/export-frames.js sketch.html --width 3840 --height 2160 --frames 1` | +| **GIF** | `saveGif('output', 5)` — captures N seconds | Press 'g' to save | +| **Frame sequence** | `saveFrames('frame', 'png', 10, 30)` — 10s at 30fps | Then `ffmpeg -i frame-%04d.png -c:v libx264 output.mp4` | +| **MP4** | Puppeteer frame capture + ffmpeg | `bash scripts/render.sh sketch.html output.mp4 --duration 30 --fps 30` | +| **SVG** | `createCanvas(w, h, SVG)` with p5.js-svg | `save('output.svg')` | + +### Step 6: Quality Verification + +- **Does it match the vision?** Compare output to the creative concept. If it looks generic, go back to Step 1 +- **Resolution check**: Is it sharp at the target display size? No aliasing artifacts? +- **Performance check**: Does it hold 60fps in browser? 
(30fps minimum for animations) +- **Color check**: Do the colors work together? Test on both light and dark monitors +- **Edge cases**: What happens at canvas edges? On resize? After running for 10 minutes? + +## Critical Implementation Notes + +### Performance — Disable FES First + +The Friendly Error System (FES) adds up to 10x overhead. Disable it in every production sketch: + +```javascript +p5.disableFriendlyErrors = true; // BEFORE setup() + +function setup() { + pixelDensity(1); // prevent 2x-4x overdraw on retina + createCanvas(1920, 1080); +} +``` + +In hot loops (particles, pixel ops), use `Math.*` instead of p5 wrappers — measurably faster: + +```javascript +// In draw() or update() hot paths: +let a = Math.sin(t); // not sin(t) +let r = Math.sqrt(dx*dx+dy*dy); // not dist() — or better: skip sqrt, compare magSq +let v = Math.random(); // not random() — when seed not needed +let m = Math.min(a, b); // not min(a, b) +``` + +Never `console.log()` inside `draw()`. Never manipulate DOM in `draw()`. See `references/troubleshooting.md` § Performance. + +### Seeded Randomness — Always + +Every generative sketch must be reproducible. Same seed, same output. + +```javascript +function setup() { + randomSeed(CONFIG.seed); + noiseSeed(CONFIG.seed); + // All random() and noise() calls now deterministic +} +``` + +Never use `Math.random()` for generative content — only for performance-critical non-visual code. Always `random()` for visual elements. If you need a random seed: `CONFIG.seed = floor(random(99999))`. 
+ +### Generative Art Platform Support (fxhash / Art Blocks) + +For generative art platforms, replace p5's PRNG with the platform's deterministic random: + +```javascript +// fxhash convention +const SEED = $fx.hash; // unique per mint +const rng = $fx.rand; // deterministic PRNG +$fx.features({ palette: 'warm', complexity: 'high' }); + +// In setup(): +randomSeed(SEED); // for p5's noise() +noiseSeed(SEED); + +// Replace random() with rng() for platform determinism +let x = rng() * width; // instead of random(width) +``` + +See `references/export-pipeline.md` § Platform Export. + +### Color Mode — Use HSB + +HSB (Hue, Saturation, Brightness) is dramatically easier to work with than RGB for generative art: + +```javascript +colorMode(HSB, 360, 100, 100, 100); +// Now: fill(hue, sat, bri, alpha) +// Rotate hue: fill((baseHue + offset) % 360, 80, 90) +// Desaturate: fill(hue, sat * 0.3, bri) +// Darken: fill(hue, sat, bri * 0.5) +``` + +Never hardcode raw RGB values. Define a palette object, derive variations procedurally. See `references/color-systems.md`. + +### Noise — Multi-Octave, Not Raw + +Raw `noise(x, y)` looks like smooth blobs. Layer octaves for natural texture: + +```javascript +function fbm(x, y, octaves = 4) { + let val = 0, amp = 1, freq = 1, sum = 0; + for (let i = 0; i < octaves; i++) { + val += noise(x * freq, y * freq) * amp; + sum += amp; + amp *= 0.5; + freq *= 2; + } + return val / sum; +} +``` + +For flowing organic forms, use **domain warping**: feed noise output back as noise input coordinates. See `references/visual-effects.md`. + +### createGraphics() for Layers — Not Optional + +Flat single-pass rendering looks flat. 
Use offscreen buffers for composition: + +```javascript +let bgLayer, fgLayer, trailLayer; +function setup() { + createCanvas(1920, 1080); + bgLayer = createGraphics(width, height); + fgLayer = createGraphics(width, height); + trailLayer = createGraphics(width, height); +} +function draw() { + renderBackground(bgLayer); + renderTrails(trailLayer); // persistent, fading + renderForeground(fgLayer); // cleared each frame + image(bgLayer, 0, 0); + image(trailLayer, 0, 0); + image(fgLayer, 0, 0); +} +``` + +### Performance — Vectorize Where Possible + +p5.js draw calls are expensive. For thousands of particles: + +```javascript +// SLOW: individual shapes +for (let p of particles) { + ellipse(p.x, p.y, p.size); +} + +// FAST: single shape with beginShape() +beginShape(POINTS); +for (let p of particles) { + vertex(p.x, p.y); +} +endShape(); + +// FASTEST: pixel buffer for massive counts +loadPixels(); +for (let p of particles) { + let idx = 4 * (floor(p.y) * width + floor(p.x)); + pixels[idx] = r; pixels[idx+1] = g; pixels[idx+2] = b; pixels[idx+3] = 255; +} +updatePixels(); +``` + +See `references/troubleshooting.md` § Performance. + +### Instance Mode for Multiple Sketches + +Global mode pollutes `window`. For production, use instance mode: + +```javascript +const sketch = (p) => { + p.setup = function() { + p.createCanvas(800, 800); + }; + p.draw = function() { + p.background(0); + p.ellipse(p.mouseX, p.mouseY, 50); + }; +}; +new p5(sketch, 'canvas-container'); +``` + +Required when embedding multiple sketches on one page or integrating with frameworks. 
+ +### WebGL Mode Gotchas + +- `createCanvas(w, h, WEBGL)` — origin is center, not top-left +- Y-axis is inverted (positive Y goes up in WEBGL, down in P2D) +- `translate(-width/2, -height/2)` to get P2D-like coordinates +- `push()`/`pop()` around every transform — matrix stack overflows silently +- `texture()` before `rect()`/`plane()` — not after +- Custom shaders: `createShader(vert, frag)` — test on multiple browsers + +### Export — Key Bindings Convention + +Every sketch should include these in `keyPressed()`: + +```javascript +function keyPressed() { + if (key === 's' || key === 'S') saveCanvas('output', 'png'); + if (key === 'g' || key === 'G') saveGif('output', 5); + if (key === 'r' || key === 'R') { randomSeed(millis()); noiseSeed(millis()); } + if (key === ' ') CONFIG.paused = !CONFIG.paused; +} +``` + +### Headless Video Export — Use noLoop() + +For headless rendering via Puppeteer, the sketch **must** use `noLoop()` in setup. Without it, p5's draw loop runs freely while screenshots are slow — the sketch races ahead and you get skipped/duplicate frames. + +```javascript +function setup() { + createCanvas(1920, 1080); + pixelDensity(1); + noLoop(); // capture script controls frame advance + window._p5Ready = true; // signal readiness to capture script +} +``` + +The bundled `scripts/export-frames.js` detects `_p5Ready` and calls `redraw()` once per capture for exact 1:1 frame correspondence. See `references/export-pipeline.md` § Deterministic Capture. + +For multi-scene videos, use the per-clip architecture: one HTML per scene, render independently, stitch with `ffmpeg -f concat`. See `references/export-pipeline.md` § Per-Clip Architecture. + +### Agent Workflow + +When building p5.js sketches: + +1. **Write the HTML file** — single self-contained file, all code inline +2. **Open in browser** — `open sketch.html` (macOS) or `xdg-open sketch.html` (Linux) +3. 
**Local assets** (fonts, images) require a server: `python3 -m http.server 8080` in the project directory, then open `http://localhost:8080/sketch.html` +4. **Export PNG/GIF** — add `keyPressed()` shortcuts as shown above, tell the user which key to press +5. **Headless export** — `node scripts/export-frames.js sketch.html --frames 300` for automated frame capture (sketch must use `noLoop()` + `_p5Ready`) +6. **MP4 rendering** — `bash scripts/render.sh sketch.html output.mp4 --duration 30` +7. **Iterative refinement** — edit the HTML file, user refreshes browser to see changes +8. **Load references on demand** — use `skill_view(name="p5js", file_path="references/...")` to load specific reference files as needed during implementation + +## Performance Targets + +| Metric | Target | +|--------|--------| +| Frame rate (interactive) | 60fps sustained | +| Frame rate (animated export) | 30fps minimum | +| Particle count (P2D shapes) | 5,000-10,000 at 60fps | +| Particle count (pixel buffer) | 50,000-100,000 at 60fps | +| Canvas resolution | Up to 3840x2160 (export), 1920x1080 (interactive) | +| File size (HTML) | < 100KB (excluding CDN libraries) | +| Load time | < 2s to first frame | + +## References + +| File | Contents | +|------|----------| +| `references/core-api.md` | Canvas setup, coordinate system, draw loop, `push()`/`pop()`, offscreen buffers, composition patterns, `pixelDensity()`, responsive design | +| `references/shapes-and-geometry.md` | 2D primitives, `beginShape()`/`endShape()`, Bezier/Catmull-Rom curves, `vertex()` systems, custom shapes, `p5.Vector`, signed distance fields, SVG path conversion | +| `references/visual-effects.md` | Noise (Perlin, fractal, domain warp, curl), flow fields, particle systems (physics, flocking, trails), pixel manipulation, texture generation (stipple, hatch, halftone), feedback loops, reaction-diffusion | +| `references/animation.md` | Frame-based animation, easing functions, `lerp()`/`map()`, spring physics, state 
machines, timeline sequencing, `millis()`-based timing, transition patterns | +| `references/typography.md` | `text()`, `loadFont()`, `textToPoints()`, kinetic typography, text masks, font metrics, responsive text sizing | +| `references/color-systems.md` | `colorMode()`, HSB/HSL/RGB, `lerpColor()`, `paletteLerp()`, procedural palettes, color harmony, `blendMode()`, gradient rendering, curated palette library | +| `references/webgl-and-3d.md` | WEBGL renderer, 3D primitives, camera, lighting, materials, custom geometry, GLSL shaders (`createShader()`, `createFilterShader()`), framebuffers, post-processing | +| `references/interaction.md` | Mouse events, keyboard state, touch input, DOM elements, `createSlider()`/`createButton()`, audio input (p5.sound FFT/amplitude), scroll-driven animation, responsive events | +| `references/export-pipeline.md` | `saveCanvas()`, `saveGif()`, `saveFrames()`, deterministic headless capture, ffmpeg frame-to-video, CCapture.js, SVG export, per-clip architecture, platform export (fxhash), video gotchas | +| `references/troubleshooting.md` | Performance profiling, per-pixel budgets, common mistakes, browser compatibility, WebGL debugging, font loading issues, pixel density traps, memory leaks, CORS | +| `templates/viewer.html` | Interactive viewer template: seed navigation (prev/next/random/jump), parameter sliders, download PNG, responsive canvas. Start from this for explorable generative art | + +--- + +## Creative Divergence (use only when user requests experimental/creative/unique output) + +If the user asks for creative, experimental, surprising, or unconventional output, select the strategy that best fits and reason through its steps BEFORE generating code. 
+ +- **Conceptual Blending** — when the user names two things to combine or wants hybrid aesthetics +- **SCAMPER** — when the user wants a twist on a known generative art pattern +- **Distance Association** — when the user gives a single concept and wants exploration ("make something about time") + +### Conceptual Blending +1. Name two distinct visual systems (e.g., particle physics + handwriting) +2. Map correspondences (particles = ink drops, forces = pen pressure, fields = letterforms) +3. Blend selectively — keep mappings that produce interesting emergent visuals +4. Code the blend as a unified system, not two systems side-by-side + +### SCAMPER Transformation +Take a known generative pattern (flow field, particle system, L-system, cellular automata) and systematically transform it: +- **Substitute**: replace circles with text characters, lines with gradients +- **Combine**: merge two patterns (flow field + voronoi) +- **Adapt**: apply a 2D pattern to a 3D projection +- **Modify**: exaggerate scale, warp the coordinate space +- **Purpose**: use a physics sim for typography, a sorting algorithm for color +- **Eliminate**: remove the grid, remove color, remove symmetry +- **Reverse**: run the simulation backward, invert the parameter space + +### Distance Association +1. Anchor on the user's concept (e.g., "loneliness") +2. Generate associations at three distances: + - Close (obvious): empty room, single figure, silence + - Medium (interesting): one fish in a school swimming the wrong way, a phone with no notifications, the gap between subway cars + - Far (abstract): prime numbers, asymptotic curves, the color of 3am +3. 
Develop the medium-distance associations — they're specific enough to visualize but unexpected enough to be interesting diff --git a/website/docs/user-guide/skills/bundled/creative/creative-pixel-art.md b/website/docs/user-guide/skills/bundled/creative/creative-pixel-art.md new file mode 100644 index 000000000..beecb38f0 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/creative/creative-pixel-art.md @@ -0,0 +1,232 @@ +--- +title: "Pixel Art — Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc.)" +sidebar_label: "Pixel Art" +description: "Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc.)" + +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Pixel Art + +Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc.), and animate them into short videos. Presets cover arcade, SNES, and 10+ era-correct looks. Use `clarify` to let the user pick a style before generating. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/creative/pixel-art` | +| Version | `2.0.0` | +| Author | dodo-reach | +| License | MIT | +| Tags | `creative`, `pixel-art`, `arcade`, `snes`, `nes`, `gameboy`, `retro`, `image`, `video` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Pixel Art + +Convert any image into retro pixel art, then optionally animate it into a short +MP4 or GIF with era-appropriate effects (rain, fireflies, snow, embers).
+ +Two scripts ship with this skill: + +- `scripts/pixel_art.py` — photo → pixel-art PNG (Floyd-Steinberg dithering) +- `scripts/pixel_art_video.py` — pixel-art PNG → animated MP4 (+ optional GIF) + +Each is importable or runnable directly. Presets snap to hardware palettes +when you want era-accurate colors (NES, Game Boy, PICO-8, etc.), or use +adaptive N-color quantization for arcade/SNES-style looks. + +## When to Use + +- User wants retro pixel art from a source image +- User asks for NES / Game Boy / PICO-8 / C64 / arcade / SNES styling +- User wants a short looping animation (rain scene, night sky, snow, etc.) +- Posters, album covers, social posts, sprites, characters, avatars + +## Workflow + +Before generating, confirm the style with the user. Different presets produce +very different outputs and regenerating is costly. + +### Step 1 — Offer a style + +Call `clarify` with 4 representative presets. Pick the set based on what the +user asked for — don't just dump all 14. + +Default menu when the user's intent is unclear: + +```python +clarify( + question="Which pixel-art style do you want?", + choices=[ + "arcade — bold, chunky 80s cabinet feel (16 colors, 8px)", + "nes — Nintendo 8-bit hardware palette (54 colors, 8px)", + "gameboy — 4-shade green Game Boy DMG", + "snes — cleaner 16-bit look (32 colors, 4px)", + ], +) +``` + +When the user already named an era (e.g. "80s arcade", "Gameboy"), skip +`clarify` and use the matching preset directly. + +### Step 2 — Offer animation (optional) + +If the user asked for a video/GIF, or the output might benefit from motion, +ask which scene: + +```python +clarify( + question="Want to animate it? Pick a scene or skip.", + choices=[ + "night — stars + fireflies + leaves", + "urban — rain + neon pulse", + "snow — falling snowflakes", + "skip — just the image", + ], +) +``` + +Do NOT call `clarify` more than twice in a row. One for style, one for scene if +animation is on the table. 
If the user explicitly asked for a specific style +and scene in their message, skip `clarify` entirely. + +### Step 3 — Generate + +Run `pixel_art()` first; if animation was requested, chain into +`pixel_art_video()` on the result. + +## Preset Catalog + +| Preset | Era | Palette | Block | Best for | +|--------|-----|---------|-------|----------| +| `arcade` | 80s arcade | adaptive 16 | 8px | Bold posters, hero art | +| `snes` | 16-bit | adaptive 32 | 4px | Characters, detailed scenes | +| `nes` | 8-bit | NES (54) | 8px | True NES look | +| `gameboy` | DMG handheld | 4 green shades | 8px | Monochrome Game Boy | +| `gameboy_pocket` | Pocket handheld | 4 grey shades | 8px | Mono GB Pocket | +| `pico8` | PICO-8 | 16 fixed | 6px | Fantasy-console look | +| `c64` | Commodore 64 | 16 fixed | 8px | 8-bit home computer | +| `apple2` | Apple II hi-res | 6 fixed | 10px | Extreme retro, 6 colors | +| `teletext` | BBC Teletext | 8 pure | 10px | Chunky primary colors | +| `mspaint` | Windows MS Paint | 24 fixed | 8px | Nostalgic desktop | +| `mono_green` | CRT phosphor | 2 green | 6px | Terminal/CRT aesthetic | +| `mono_amber` | CRT amber | 2 amber | 6px | Amber monitor look | +| `neon` | Cyberpunk | 10 neons | 6px | Vaporwave/cyber | +| `pastel` | Soft pastel | 10 pastels | 6px | Kawaii / gentle | + +Named palettes live in `scripts/palettes.py` (see `references/palettes.md` for +the complete list — 28 named palettes total). 
Any preset can be overridden: + +```python +pixel_art("in.png", "out.png", preset="snes", palette="PICO_8", block=6) +``` + +## Scene Catalog (for video) + +| Scene | Effects | +|-------|---------| +| `night` | Twinkling stars + fireflies + drifting leaves | +| `dusk` | Fireflies + sparkles | +| `tavern` | Dust motes + warm sparkles | +| `indoor` | Dust motes | +| `urban` | Rain + neon pulse | +| `nature` | Leaves + fireflies | +| `magic` | Sparkles + fireflies | +| `storm` | Rain + lightning | +| `underwater` | Bubbles + light sparkles | +| `fire` | Embers + sparkles | +| `snow` | Snowflakes + sparkles | +| `desert` | Heat shimmer + dust | + +## Invocation Patterns + +### Python (import) + +```python +import sys +sys.path.insert(0, "/home/teknium/.hermes/skills/creative/pixel-art/scripts") +from pixel_art import pixel_art +from pixel_art_video import pixel_art_video + +# 1. Convert to pixel art +pixel_art("/path/to/photo.jpg", "/tmp/pixel.png", preset="nes") + +# 2. Animate (optional) +pixel_art_video( + "/tmp/pixel.png", + "/tmp/pixel.mp4", + scene="night", + duration=6, + fps=15, + seed=42, + export_gif=True, +) +``` + +### CLI + +```bash +cd /home/teknium/.hermes/skills/creative/pixel-art/scripts + +python pixel_art.py in.jpg out.png --preset gameboy +python pixel_art.py in.jpg out.png --preset snes --palette PICO_8 --block 6 + +python pixel_art_video.py out.png out.mp4 --scene night --duration 6 --gif +``` + +## Pipeline Rationale + +**Pixel conversion:** +1. Boost contrast/color/sharpness (stronger for smaller palettes) +2. Posterize to simplify tonal regions before quantization +3. Downscale by `block` with `Image.NEAREST` (hard pixels, no interpolation) +4. Quantize with Floyd-Steinberg dithering — against either an adaptive + N-color palette OR a named hardware palette +5. Upscale back with `Image.NEAREST` + +Quantizing AFTER downscale keeps dithering aligned with the final pixel grid. 
+ +Quantizing before would waste error-diffusion on detail that disappears. + +**Video overlay:** +- Copies the base frame each tick (static background) +- Overlays stateless-per-frame particle draws (one function per effect) +- Encodes via ffmpeg `libx264 -pix_fmt yuv420p -crf 18` +- Optional GIF via `palettegen` + `paletteuse` + +## Dependencies + +- Python 3.9+ +- Pillow (`pip install Pillow`) +- ffmpeg on PATH (only needed for video — the Hermes install provides this) + +## Pitfalls + +- Palette keys are case-sensitive (`"NES"`, `"PICO_8"`, `"GAMEBOY_ORIGINAL"`). +- Very small sources (<100px wide) collapse under 8-10px blocks. Upscale the + source first if it's tiny. +- Fractional `block` or `palette` will break quantization — keep them positive ints. +- Animation particle counts are tuned for ~640x480 canvases. On very large + images you may want a second pass with a different seed for density. +- `mono_green` / `mono_amber` force `color=0.0` (desaturate). If you override + and keep chroma, the 2-color palette can produce stripes on smooth regions. +- `clarify` loop: call it at most twice per turn (style, then scene). Don't + pepper the user with more picks. + +## Verification + +- PNG is created at the output path +- Clear square pixel blocks visible at the preset's block size +- Color count matches preset (eyeball the image or run `Image.open(p).getcolors()`) +- Video is a valid MP4 (`ffprobe` can open it) with non-zero size + +## Attribution + +Named hardware palettes and the procedural animation loops in `pixel_art_video.py` +are ported from [pixel-art-studio](https://github.com/Synero/pixel-art-studio) +(MIT). See `ATTRIBUTION.md` in this skill directory for details.
diff --git a/website/docs/user-guide/skills/bundled/creative/creative-popular-web-designs.md b/website/docs/user-guide/skills/bundled/creative/creative-popular-web-designs.md new file mode 100644 index 000000000..838a1c179 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/creative/creative-popular-web-designs.md @@ -0,0 +1,212 @@ +--- +title: "Popular Web Designs — 54 production-quality design systems extracted from real websites" +sidebar_label: "Popular Web Designs" +description: "54 production-quality design systems extracted from real websites" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Popular Web Designs + +54 production-quality design systems extracted from real websites. Load a template to generate HTML/CSS that matches the visual identity of sites like Stripe, Linear, Vercel, Notion, Airbnb, and more. Each template includes colors, typography, components, layout rules, and ready-to-use CSS values. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/creative/popular-web-designs` | +| Version | `1.0.0` | +| Author | Hermes Agent + Teknium (design systems sourced from VoltAgent/awesome-design-md) | +| License | MIT | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Popular Web Designs + +54 real-world design systems ready for use when generating HTML/CSS. Each template captures a +site's complete visual language: color palette, typography hierarchy, component styles, spacing +system, shadows, responsive behavior, and practical agent prompts with exact CSS values. + +## How to Use + +1. Pick a design from the catalog below +2. Load it: `skill_view(name="popular-web-designs", file_path="templates/.md")` +3. 
Use the design tokens and component specs when generating HTML +4. Pair with the `generative-widgets` skill to serve the result via cloudflared tunnel + +Each template includes a **Hermes Implementation Notes** block at the top with: +- CDN font substitute and Google Fonts `<link>` tag (ready to paste) +- CSS font-family stacks for primary and monospace +- Reminders to use `write_file` for HTML creation and `browser_vision` for verification + +## HTML Generation Pattern + +```html +<!DOCTYPE html> +<html lang="en"> +<head> + <meta charset="utf-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1" /> + <title>Page Title</title> + <!-- Google Fonts <link> tag from the template's implementation notes --> + <link href="https://fonts.googleapis.com/css2?family=...&display=swap" rel="stylesheet" /> + <style> + /* design tokens from the template: colors, type scale, spacing, shadows */ + </style> +</head> +<body> + <!-- components built from the template's specs --> +</body> +</html> +``` + +Write the file with `write_file`, serve with the `generative-widgets` workflow (cloudflared tunnel), +and verify the result with `browser_vision` to confirm visual accuracy. + +## Font Substitution Reference + +Most sites use proprietary fonts unavailable via CDN. Each template maps to a Google Fonts +substitute that preserves the design's character. Common mappings: + +| Proprietary Font | CDN Substitute | Character | +|---|---|---| +| Geist / Geist Sans | Geist (on Google Fonts) | Geometric, compressed tracking | +| Geist Mono | Geist Mono (on Google Fonts) | Clean monospace, ligatures | +| sohne-var (Stripe) | Source Sans 3 | Light weight elegance | +| Berkeley Mono | JetBrains Mono | Technical monospace | +| Airbnb Cereal VF | DM Sans | Rounded, friendly geometric | +| Circular (Spotify) | DM Sans | Geometric, warm | +| figmaSans | Inter | Clean humanist | +| Pin Sans (Pinterest) | DM Sans | Friendly, rounded | +| NVIDIA-EMEA | Inter (or Arial system) | Industrial, clean | +| CoinbaseDisplay/Sans | DM Sans | Geometric, trustworthy | +| UberMove | DM Sans | Bold, tight | +| HashiCorp Sans | Inter | Enterprise, neutral | +| waldenburgNormal (Sanity) | Space Grotesk | Geometric, slightly condensed | +| IBM Plex Sans/Mono | IBM Plex Sans/Mono | Available on Google Fonts | +| Rubik (Sentry) | Rubik | Available on Google Fonts | + +When a template's CDN font matches the original (Inter, IBM Plex, Rubik, Geist), no +substitution loss occurs.
When a substitute is used (DM Sans for Circular, Source Sans 3 +for sohne-var), follow the template's weight, size, and letter-spacing values closely — +those carry more visual identity than the specific font face. + +## Design Catalog + +### AI & Machine Learning + +| Template | Site | Style | +|---|---|---| +| `claude.md` | Anthropic Claude | Warm terracotta accent, clean editorial layout | +| `cohere.md` | Cohere | Vibrant gradients, data-rich dashboard aesthetic | +| `elevenlabs.md` | ElevenLabs | Dark cinematic UI, audio-waveform aesthetics | +| `minimax.md` | Minimax | Bold dark interface with neon accents | +| `mistral.ai.md` | Mistral AI | French-engineered minimalism, purple-toned | +| `ollama.md` | Ollama | Terminal-first, monochrome simplicity | +| `opencode.ai.md` | OpenCode AI | Developer-centric dark theme, full monospace | +| `replicate.md` | Replicate | Clean white canvas, code-forward | +| `runwayml.md` | RunwayML | Cinematic dark UI, media-rich layout | +| `together.ai.md` | Together AI | Technical, blueprint-style design | +| `voltagent.md` | VoltAgent | Void-black canvas, emerald accent, terminal-native | +| `x.ai.md` | xAI | Stark monochrome, futuristic minimalism, full monospace | + +### Developer Tools & Platforms + +| Template | Site | Style | +|---|---|---| +| `cursor.md` | Cursor | Sleek dark interface, gradient accents | +| `expo.md` | Expo | Dark theme, tight letter-spacing, code-centric | +| `linear.app.md` | Linear | Ultra-minimal dark-mode, precise, purple accent | +| `lovable.md` | Lovable | Playful gradients, friendly dev aesthetic | +| `mintlify.md` | Mintlify | Clean, green-accented, reading-optimized | +| `posthog.md` | PostHog | Playful branding, developer-friendly dark UI | +| `raycast.md` | Raycast | Sleek dark chrome, vibrant gradient accents | +| `resend.md` | Resend | Minimal dark theme, monospace accents | +| `sentry.md` | Sentry | Dark dashboard, data-dense, pink-purple accent | +| `supabase.md` | Supabase | Dark emerald 
theme, code-first developer tool | +| `superhuman.md` | Superhuman | Premium dark UI, keyboard-first, purple glow | +| `vercel.md` | Vercel | Black and white precision, Geist font system | +| `warp.md` | Warp | Dark IDE-like interface, block-based command UI | +| `zapier.md` | Zapier | Warm orange, friendly illustration-driven | + +### Infrastructure & Cloud + +| Template | Site | Style | +|---|---|---| +| `clickhouse.md` | ClickHouse | Yellow-accented, technical documentation style | +| `composio.md` | Composio | Modern dark with colorful integration icons | +| `hashicorp.md` | HashiCorp | Enterprise-clean, black and white | +| `mongodb.md` | MongoDB | Green leaf branding, developer documentation focus | +| `sanity.md` | Sanity | Red accent, content-first editorial layout | +| `stripe.md` | Stripe | Signature purple gradients, weight-300 elegance | + +### Design & Productivity + +| Template | Site | Style | +|---|---|---| +| `airtable.md` | Airtable | Colorful, friendly, structured data aesthetic | +| `cal.md` | Cal.com | Clean neutral UI, developer-oriented simplicity | +| `clay.md` | Clay | Organic shapes, soft gradients, art-directed layout | +| `figma.md` | Figma | Vibrant multi-color, playful yet professional | +| `framer.md` | Framer | Bold black and blue, motion-first, design-forward | +| `intercom.md` | Intercom | Friendly blue palette, conversational UI patterns | +| `miro.md` | Miro | Bright yellow accent, infinite canvas aesthetic | +| `notion.md` | Notion | Warm minimalism, serif headings, soft surfaces | +| `pinterest.md` | Pinterest | Red accent, masonry grid, image-first layout | +| `webflow.md` | Webflow | Blue-accented, polished marketing site aesthetic | + +### Fintech & Crypto + +| Template | Site | Style | +|---|---|---| +| `coinbase.md` | Coinbase | Clean blue identity, trust-focused, institutional feel | +| `kraken.md` | Kraken | Purple-accented dark UI, data-dense dashboards | +| `revolut.md` | Revolut | Sleek dark interface, gradient cards, 
fintech precision | +| `wise.md` | Wise | Bright green accent, friendly and clear | + +### Enterprise & Consumer + +| Template | Site | Style | +|---|---|---| +| `airbnb.md` | Airbnb | Warm coral accent, photography-driven, rounded UI | +| `apple.md` | Apple | Premium white space, SF Pro, cinematic imagery | +| `bmw.md` | BMW | Dark premium surfaces, precise engineering aesthetic | +| `ibm.md` | IBM | Carbon design system, structured blue palette | +| `nvidia.md` | NVIDIA | Green-black energy, technical power aesthetic | +| `spacex.md` | SpaceX | Stark black and white, full-bleed imagery, futuristic | +| `spotify.md` | Spotify | Vibrant green on dark, bold type, album-art-driven | +| `uber.md` | Uber | Bold black and white, tight type, urban energy | + +## Choosing a Design + +Match the design to the content: + +- **Developer tools / dashboards:** Linear, Vercel, Supabase, Raycast, Sentry +- **Documentation / content sites:** Mintlify, Notion, Sanity, MongoDB +- **Marketing / landing pages:** Stripe, Framer, Apple, SpaceX +- **Dark mode UIs:** Linear, Cursor, ElevenLabs, Warp, Superhuman +- **Light / clean UIs:** Vercel, Stripe, Notion, Cal.com, Replicate +- **Playful / friendly:** PostHog, Figma, Lovable, Zapier, Miro +- **Premium / luxury:** Apple, BMW, Stripe, Superhuman, Revolut +- **Data-dense / dashboards:** Sentry, Kraken, Cohere, ClickHouse +- **Monospace / terminal aesthetic:** Ollama, OpenCode, x.ai, VoltAgent diff --git a/website/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music.md b/website/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music.md new file mode 100644 index 000000000..cd0b7fb14 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music.md @@ -0,0 +1,297 @@ +--- +title: "Songwriting And Ai Music" +sidebar_label: "Songwriting And Ai Music" +description: "Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation techniques, phonetic 
tricks, and lessons learned" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Songwriting And Ai Music + +Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation techniques, phonetic tricks, and lessons learned. These are tools and ideas, not rules. Break any of them when the art calls for it. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/creative/songwriting-and-ai-music` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Songwriting & AI Music Generation + +Everything here is a GUIDELINE, not a rule. Art breaks rules on purpose. +Use what serves the song. Ignore what doesn't. + +--- + +## 1. Song Structure (Pick One or Invent Your Own) + +Common skeletons — mix, modify, or throw out as needed: + +``` +ABABCB Verse/Chorus/Verse/Chorus/Bridge/Chorus (most pop/rock) +AABA Verse/Verse/Bridge/Verse (refrain-based) (jazz standards, ballads) +ABAB Verse/Chorus alternating (simple, direct) +AAA Verse/Verse/Verse (strophic, no chorus) (folk, storytelling) +``` + +The six building blocks: +- Intro — set the mood, pull the listener in +- Verse — the story, the details, the world-building +- Pre-Chorus — optional tension ramp before the payoff +- Chorus — the emotional core, the part people remember +- Bridge — a detour, a shift in perspective or key +- Outro — the farewell, can echo or subvert the rest + +You don't need all of these. Some great songs are just one section +that evolves. Structure serves the emotion, not the other way around. + +--- + +## 2. 
Rhyme, Meter, and Sound + +RHYME TYPES (from tight to loose): +- Perfect: lean/mean +- Family: crate/braid +- Assonance: had/glass (same vowels, different endings) +- Consonance: scene/when (different vowels, similar endings) +- Near/slant: enough to suggest connection without locking it down + +Mix them. All perfect rhymes can sound like a nursery rhyme. +All slant rhymes can sound lazy. The blend is where it lives. + +INTERNAL RHYME: Rhyming within a line, not just at the ends. + "We pruned the lies from bleeding trees / Distilled the storm + from entropy" — "lies/flies," "trees/entropy" create internal echoes. + +METER: The rhythm of stressed vs unstressed syllables. +- Matching syllable counts between parallel lines helps singability +- The STRESSED syllables matter more than total count +- Say it out loud. If you stumble, the meter needs work. +- Intentionally breaking meter can create emphasis or surprise + +--- + +## 3. Emotional Arc and Dynamics + +Think of a song as a journey, not a flat road. + +ENERGY MAPPING (rough idea, not prescription): + Intro: 2-3 | Verse: 5-6 | Pre-Chorus: 7 + Chorus: 8-9 | Bridge: varies | Final Chorus: 9-10 + +The most powerful dynamic trick: CONTRAST. +- Whisper before a scream hits harder than just screaming +- Sparse before dense. Slow before fast. Low before high. +- The drop only works because of the buildup +- Silence is an instrument + +"Whisper to roar to whisper" — start intimate, build to full power, +strip back to vulnerability. Works for ballads, epics, anthems. + +--- + +## 4. 
Writing Lyrics That Work + +SHOW, DON'T TELL (usually): +- "I was sad" = flat +- "Your hoodie's still on the hook by the door" = alive +- But sometimes "I give my life" said plainly IS the power + +THE HOOK: +- The line people remember, hum, repeat +- Usually the title or core phrase +- Works best when melody + lyric + emotion all align +- Place it where it lands hardest (often first/last line of chorus) + +PROSODY — lyrics and music supporting each other: +- Stable feelings (resolution, peace) pair with settled melodies, + perfect rhymes, resolved chords +- Unstable feelings (longing, doubt) pair with wandering melodies, + near-rhymes, unresolved chords +- Verse melody typically sits lower, chorus goes higher +- But flip this if it serves the song + +AVOID (unless you're doing it on purpose): +- Cliches on autopilot ("heart of gold" without earning it) +- Forcing word order to hit a rhyme ("Yoda-speak") +- Same energy in every section (flat dynamics) +- Treating your first draft as sacred — revision is creation + +--- + +## 5. Parody and Adaptation + +When rewriting an existing song with new lyrics: + +THE SKELETON: Map the original's structure first. +- Count syllables per line +- Mark the rhyme scheme (ABAB, AABB, etc.) 
+- Identify which syllables are STRESSED +- Note where held/sustained notes fall + +FITTING NEW WORDS: +- Match stressed syllables to the same beats as the original +- Total syllable count can flex by 1-2 unstressed syllables +- On long held notes, try to match the VOWEL SOUND of the original + (if original holds "LOOOVE" with an "oo" vowel, "FOOOD" fits + better than "LIFE") +- Monosyllabic swaps in key spots keep rhythm intact + (Crime -> Code, Snake -> Noose) +- Sing your new words over the original — if you stumble, revise + +CONCEPT: +- Pick a concept strong enough to sustain the whole song +- Start from the title/hook and build outward +- Generate lots of raw material (puns, phrases, images) FIRST, + then fit the best ones into the structure +- If you need a specific line somewhere, reverse-engineer the + rhyme scheme backward to set it up + +KEEP SOME ORIGINALS: Leaving a few original lines or structures +intact adds recognizability and lets the audience feel the connection. + +--- + +## 6. Suno AI Prompt Engineering + +### Style/Genre Description Field + +FORMULA (adapt as needed): + Genre + Mood + Era + Instruments + Vocal Style + Production + Dynamics + +``` +BAD: "sad rock song" +GOOD: "Cinematic orchestral spy thriller, 1960s Cold War era, smoky + sultry female vocalist, big band jazz, brass section with + trumpets and french horns, sweeping strings, minor key, + vintage analog warmth" +``` + +DESCRIBE THE JOURNEY, not just the genre: +``` +"Begins as a haunting whisper over sparse piano. Gradually layers + in muted brass. Builds through the chorus with full orchestra. + Second verse erupts with raw belting intensity. Outro strips back + to a lone piano and a fragile whisper fading to silence." +``` + +TIPS: +- V4.5+ supports up to 1,000 chars in Style field — use them +- NO artist names or trademarks. Describe the sound instead. 
+ "1960s Cold War spy thriller brass" not "James Bond style" + "90s grunge" not "Nirvana-style" +- Specify BPM and key when you have a preference +- Use Exclude Styles field for what you DON'T want +- Unexpected genre combos can be gold: "bossa nova trap", + "Appalachian gothic", "chiptune jazz" +- Build a vocal PERSONA, not just a gender: + "A weathered torch singer with a smoky alto, slight rasp, + who starts vulnerable and builds to devastating power" + +### Metatags (place in [brackets] inside lyrics field) + +STRUCTURE: + [Intro] [Verse] [Verse 1] [Pre-Chorus] [Chorus] + [Post-Chorus] [Hook] [Bridge] [Interlude] + [Instrumental] [Instrumental Break] [Guitar Solo] + [Breakdown] [Build-up] [Outro] [Silence] [End] + +VOCAL PERFORMANCE: + [Whispered] [Spoken Word] [Belted] [Falsetto] [Powerful] + [Soulful] [Raspy] [Breathy] [Smooth] [Gritty] + [Staccato] [Legato] [Vibrato] [Melismatic] + [Harmonies] [Choir] [Harmonized Chorus] + +DYNAMICS: + [High Energy] [Low Energy] [Building Energy] [Explosive] + [Emotional Climax] [Gradual swell] [Orchestral swell] + [Quiet arrangement] [Falling tension] [Slow Down] + +GENDER: + [Female Vocals] [Male Vocals] + +ATMOSPHERE: + [Melancholic] [Euphoric] [Nostalgic] [Aggressive] + [Dreamy] [Intimate] [Dark Atmosphere] + +SFX: + [Vinyl Crackle] [Rain] [Applause] [Static] [Thunder] + +Put tags in BOTH style field AND lyrics for reinforcement. +Keep to 5-8 tags per section max — too many confuses the AI. +Don't contradict yourself ([Calm] + [Aggressive] in same section). + +### Custom Mode +- Always use Custom Mode for serious work (separate Style + Lyrics) +- Lyrics field limit: ~3,000 chars (~40-60 lines) +- Always add structural tags — without them Suno defaults to + flat verse/chorus/verse with no emotional arc + +--- + +## 7. Phonetic Tricks for AI Singers + +AI vocalists don't read — they pronounce. 
Help them: + +PHONETIC RESPELLING: +- Spell words as they SOUND: "through" -> "thru" +- Proper nouns are highest failure rate — test early +- "Nous" -> "Noose" (forces correct pronunciation) +- Hyphenate to guide syllables: "Re-search", "bio-engineering" + +DELIVERY CONTROL: +- ALL CAPS = louder, more intense +- Vowel extension: "lo-o-o-ove" = sustained/melisma +- Ellipses: "I... need... you" = dramatic pauses +- Hyphenated stretch: "ne-e-ed" = emotional stretch + +ALWAYS: +- Spell out numbers: "24/7" -> "twenty four seven" +- Space acronyms: "AI" -> "A I" or "A-I" +- Test proper nouns/unusual words in a short 30-second clip first +- Once generated, pronunciation is baked in — fix in lyrics BEFORE + +--- + +## 8. Workflow + +1. Write the concept/hook first — what's the emotional core? +2. If adapting, map the original structure (syllables, rhyme, stress) +3. Generate raw material — brainstorm freely before structuring +4. Draft lyrics into the structure +5. Read/sing aloud — catch stumbles, fix meter +6. Build the Suno style description — paint the dynamic journey +7. Add metatags to lyrics for performance direction +8. Generate 3-5 variations minimum — treat them like recording takes +9. Pick the best, use Extend/Continue to build on promising sections +10. If something great happens by accident, keep it + +EXPECT: ~3-5 generations per 1 good result. Revision is normal. +Style can drift in extensions — restate genre/mood when extending. + +--- + +## 9. Lessons Learned + +- Describing the dynamic ARC in the style field matters way more + than just listing genres. "Whisper to roar to whisper" gives + Suno a performance map. +- Keeping some original lines intact in a parody adds recognizability + and emotional weight — the audience feels the ghost of the original. +- The bridge slot in a song is where you can transform imagery. + Swap the original's specific references for your theme's metaphors + while keeping the emotional function (reflection, shift, revelation). 
+- Monosyllabic word swaps in hooks/tags are the cleanest way to + maintain rhythm while changing meaning. +- A strong vocal persona description in the style field makes a + bigger difference than any single metatag. +- Don't be precious about rules. If a line breaks meter but hits + harder, keep it. The feeling is what matters. Craft serves art, + not the other way around. diff --git a/website/docs/user-guide/skills/bundled/data-science/data-science-jupyter-live-kernel.md b/website/docs/user-guide/skills/bundled/data-science/data-science-jupyter-live-kernel.md new file mode 100644 index 000000000..027156ccd --- /dev/null +++ b/website/docs/user-guide/skills/bundled/data-science/data-science-jupyter-live-kernel.md @@ -0,0 +1,183 @@ +--- +title: "Jupyter Live Kernel — Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb" +sidebar_label: "Jupyter Live Kernel" +description: "Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Jupyter Live Kernel + +Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb. Load this skill when the task involves exploration, iteration, or inspecting intermediate results — data science, ML experimentation, API exploration, or building up complex code step-by-step. Uses terminal to run CLI commands against a live Jupyter kernel. No new tools required. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/data-science/jupyter-live-kernel` | +| Version | `1.0.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `jupyter`, `notebook`, `repl`, `data-science`, `exploration`, `iterative` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. 
This is what the agent sees as instructions when the skill is active. +::: + +# Jupyter Live Kernel (hamelnb) + +Gives you a **stateful Python REPL** via a live Jupyter kernel. Variables persist +across executions. Use this instead of `execute_code` when you need to build up +state incrementally, explore APIs, inspect DataFrames, or iterate on complex code. + +## When to Use This vs Other Tools + +| Tool | Use When | +|------|----------| +| **This skill** | Iterative exploration, state across steps, data science, ML, "let me try this and check" | +| `execute_code` | One-shot scripts needing hermes tool access (web_search, file ops). Stateless. | +| `terminal` | Shell commands, builds, installs, git, process management | + +**Rule of thumb:** If you'd want a Jupyter notebook for the task, use this skill. + +## Prerequisites + +1. **uv** must be installed (check: `which uv`) +2. **JupyterLab** must be installed: `uv tool install jupyterlab` +3. A Jupyter server must be running (see Setup below) + +## Setup + +The hamelnb script location: +``` +SCRIPT="$HOME/.agent-skills/hamelnb/skills/jupyter-live-kernel/scripts/jupyter_live_kernel.py" +``` + +If not cloned yet: +``` +git clone https://github.com/hamelsmu/hamelnb.git ~/.agent-skills/hamelnb +``` + +### Starting JupyterLab + +Check if a server is already running: +``` +uv run "$SCRIPT" servers +``` + +If no servers found, start one: +``` +jupyter-lab --no-browser --port=8888 --notebook-dir=$HOME/notebooks \ + --IdentityProvider.token='' --ServerApp.password='' > /tmp/jupyter.log 2>&1 & +sleep 3 +``` + +Note: Token/password disabled for local agent access. The server runs headless. 
+ +### Creating a Notebook for REPL Use + +If you just need a REPL (no existing notebook), create a minimal notebook file: +``` +mkdir -p ~/notebooks +``` +Write a minimal .ipynb JSON file with one empty code cell, then start a kernel +session via the Jupyter REST API: +``` +curl -s -X POST http://127.0.0.1:8888/api/sessions \ + -H "Content-Type: application/json" \ + -d '{"path":"scratch.ipynb","type":"notebook","name":"scratch.ipynb","kernel":{"name":"python3"}}' +``` + +## Core Workflow + +All commands return structured JSON. Always use `--compact` to save tokens. + +### 1. Discover servers and notebooks + +``` +uv run "$SCRIPT" servers --compact +uv run "$SCRIPT" notebooks --compact +``` + +### 2. Execute code (primary operation) + +``` +uv run "$SCRIPT" execute --path <notebook.ipynb> --code '<python code>' --compact +``` + +State persists across execute calls. Variables, imports, objects all survive. + +Multi-line code works with $'...' quoting: +``` +uv run "$SCRIPT" execute --path scratch.ipynb --code $'import os\nfiles = os.listdir(".")\nprint(f"Found {len(files)} files")' --compact +``` + +### 3. Inspect live variables + +``` +uv run "$SCRIPT" variables --path <notebook.ipynb> list --compact +uv run "$SCRIPT" variables --path <notebook.ipynb> preview --name <variable> --compact +``` + +### 4. Edit notebook cells + +``` +# View current cells +uv run "$SCRIPT" contents --path <notebook.ipynb> --compact + +# Insert a new cell +uv run "$SCRIPT" edit --path <notebook.ipynb> insert \ + --at-index <index> --cell-type code --source '<source>' --compact + +# Replace cell source (use cell-id from contents output) +uv run "$SCRIPT" edit --path <notebook.ipynb> replace-source \ + --cell-id <cell-id> --source '<source>' --compact + +# Delete a cell +uv run "$SCRIPT" edit --path <notebook.ipynb> delete --cell-id <cell-id> --compact +``` + +### 5. Verification (restart + run all) + +Only use when the user asks for a clean verification or you need to confirm +the notebook runs top-to-bottom: + +``` +uv run "$SCRIPT" restart-run-all --path <notebook.ipynb> --save-outputs --compact +``` + +## Practical Tips from Experience + +1. 
**First execution after server start may timeout** — the kernel needs a moment + to initialize. If you get a timeout, just retry. + +2. **The kernel Python is JupyterLab's Python** — packages must be installed in + that environment. If you need additional packages, install them into the + JupyterLab tool environment first. + +3. **--compact flag saves significant tokens** — always use it. JSON output can + be very verbose without it. + +4. **For pure REPL use**, create a scratch.ipynb and don't bother with cell editing. + Just use `execute` repeatedly. + +5. **Argument order matters** — subcommand flags like `--path` go BEFORE the + sub-subcommand. E.g.: `variables --path nb.ipynb list` not `variables list --path nb.ipynb`. + +6. **If a session doesn't exist yet**, you need to start one via the REST API + (see Setup section). The tool can't execute without a live kernel session. + +7. **Errors are returned as JSON** with traceback — read the `ename` and `evalue` + fields to understand what went wrong. + +8. **Occasional websocket timeouts** — some operations may timeout on first try, + especially after a kernel restart. Retry once before escalating. + +## Timeout Defaults + +The script has a 30-second default timeout per execution. For long-running +operations, pass `--timeout 120`. Use generous timeouts (60+) for initial +setup or heavy computation. 
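The Setup section says to write a minimal .ipynb JSON file with one empty code cell but leaves the file contents implicit. A short sketch of that step, assuming the standard nbformat 4 layout (the field values here are the usual nbformat defaults, not something hamelnb requires):

```python
import json

# Minimal nbformat-4 notebook: one empty code cell, enough for the
# POST /api/sessions call in Setup to attach a live kernel to it.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {},
            "source": [],
            "outputs": [],
            "execution_count": None,
        }
    ],
}

with open("scratch.ipynb", "w", encoding="utf-8") as f:
    json.dump(notebook, f, indent=1)
```

Put the file in the directory JupyterLab was started with (`--notebook-dir`), then create the session with the `POST /api/sessions` call shown in Setup.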
diff --git a/website/docs/user-guide/skills/bundled/devops/devops-webhook-subscriptions.md b/website/docs/user-guide/skills/bundled/devops/devops-webhook-subscriptions.md new file mode 100644 index 000000000..8b5b8ade8 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/devops/devops-webhook-subscriptions.md @@ -0,0 +1,221 @@ +--- +title: "Webhook Subscriptions" +sidebar_label: "Webhook Subscriptions" +description: "Create and manage webhook subscriptions for event-driven agent activation, or for direct push notifications (zero LLM cost)" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Webhook Subscriptions + +Create and manage webhook subscriptions for event-driven agent activation, or for direct push notifications (zero LLM cost). Use when the user wants external services to trigger agent runs OR push notifications to chats. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/devops/webhook-subscriptions` | +| Version | `1.1.0` | +| Tags | `webhook`, `events`, `automation`, `integrations`, `notifications`, `push` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Webhook Subscriptions + +Create dynamic webhook subscriptions so external services (GitHub, GitLab, Stripe, CI/CD, IoT sensors, monitoring tools) can trigger Hermes agent runs by POSTing events to a URL. + +## Setup (Required First) + +The webhook platform must be enabled before subscriptions can be created. Check with: +```bash +hermes webhook list +``` + +If it says "Webhook platform is not enabled", set it up: + +### Option 1: Setup wizard +```bash +hermes gateway setup +``` +Follow the prompts to enable webhooks, set the port, and set a global HMAC secret. 
+ +### Option 2: Manual config +Add to `~/.hermes/config.yaml`: +```yaml +platforms: + webhook: + enabled: true + extra: + host: "0.0.0.0" + port: 8644 + secret: "generate-a-strong-secret-here" +``` + +### Option 3: Environment variables +Add to `~/.hermes/.env`: +```bash +WEBHOOK_ENABLED=true +WEBHOOK_PORT=8644 +WEBHOOK_SECRET=generate-a-strong-secret-here +``` + +After configuration, start (or restart) the gateway: +```bash +hermes gateway run +# Or if using systemd: +systemctl --user restart hermes-gateway +``` + +Verify it's running: +```bash +curl http://localhost:8644/health +``` + +## Commands + +All management is via the `hermes webhook` CLI command: + +### Create a subscription +```bash +hermes webhook subscribe <name> \ + --prompt "Prompt template with {payload.fields}" \ + --events "event1,event2" \ + --description "What this does" \ + --skills "skill1,skill2" \ + --deliver telegram \ + --deliver-chat-id "12345" \ + --secret "optional-custom-secret" +``` + +Returns the webhook URL and HMAC secret. The user configures their service to POST to that URL. + +### List subscriptions +```bash +hermes webhook list +``` + +### Remove a subscription +```bash +hermes webhook remove <name> +``` + +### Test a subscription +```bash +hermes webhook test <name> +hermes webhook test <name> --payload '{"key": "value"}' +``` + +## Prompt Templates + +Prompts support `{dot.notation}` for accessing nested payload fields: + +- `{issue.title}` — GitHub issue title +- `{pull_request.user.login}` — PR author +- `{data.object.amount}` — Stripe payment amount +- `{sensor.temperature}` — IoT sensor reading + +If no prompt is specified, the full JSON payload is dumped into the agent prompt. + +## Common Patterns + +### GitHub: new issues +```bash +hermes webhook subscribe github-issues \ + --events "issues" \ + --prompt "New GitHub issue #{issue.number}: {issue.title}\n\nAction: {action}\nAuthor: {issue.user.login}\nBody:\n{issue.body}\n\nPlease triage this issue." 
\ + --deliver telegram \ + --deliver-chat-id "-100123456789" +``` + +Then in GitHub repo Settings → Webhooks → Add webhook: +- Payload URL: the returned webhook_url +- Content type: application/json +- Secret: the returned secret +- Events: "Issues" + +### GitHub: PR reviews +```bash +hermes webhook subscribe github-prs \ + --events "pull_request" \ + --prompt "PR #{pull_request.number} {action}: {pull_request.title}\nBy: {pull_request.user.login}\nBranch: {pull_request.head.ref}\n\n{pull_request.body}" \ + --skills "github-code-review" \ + --deliver github_comment +``` + +### Stripe: payment events +```bash +hermes webhook subscribe stripe-payments \ + --events "payment_intent.succeeded,payment_intent.payment_failed" \ + --prompt "Payment {data.object.status}: {data.object.amount} cents from {data.object.receipt_email}" \ + --deliver telegram \ + --deliver-chat-id "-100123456789" +``` + +### CI/CD: build notifications +```bash +hermes webhook subscribe ci-builds \ + --events "pipeline" \ + --prompt "Build {object_attributes.status} on {project.name} branch {object_attributes.ref}\nCommit: {commit.message}" \ + --deliver discord \ + --deliver-chat-id "1234567890" +``` + +### Generic monitoring alert +```bash +hermes webhook subscribe alerts \ + --prompt "Alert: {alert.name}\nSeverity: {alert.severity}\nMessage: {alert.message}\n\nPlease investigate and suggest remediation." \ + --deliver origin +``` + +### Direct delivery (no agent, zero LLM cost) + +For use cases where you just want to push a notification through to a user's chat — no reasoning, no agent loop — add `--deliver-only`. The rendered `--prompt` template becomes the literal message body and is dispatched directly to the target adapter. 
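The `{dot.notation}` rendering described under Prompt Templates, which with `--deliver-only` produces the literal message body, can be sketched in a few lines. This is a hypothetical re-implementation for illustration only, not Hermes's actual code; in particular, leaving unresolved fields untouched is an assumption:

```python
import re

def render_template(template: str, payload: dict) -> str:
    """Resolve {dot.notation} placeholders against a nested JSON payload.
    Hypothetical helper for illustration -- not the Hermes implementation."""
    def resolve(match: re.Match) -> str:
        value = payload
        for key in match.group(1).split("."):
            if not isinstance(value, dict) or key not in value:
                return match.group(0)  # assumed: unknown fields pass through unchanged
            value = value[key]
        return str(value)
    return re.sub(r"\{([\w.]+)\}", resolve, template)

payload = {"match": {"user_name": "Ada"}}
print(render_template("New match: {match.user_name} matched with you!", payload))
# New match: Ada matched with you!
```

In direct-delivery mode the rendered string is dispatched as-is; in normal mode the same rendered string becomes the agent prompt.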
+ +Use this for: +- External service push notifications (Supabase/Firebase webhooks → Telegram) +- Monitoring alerts that should forward verbatim +- Inter-agent pings where one agent is telling another agent's user something +- Any webhook where an LLM round trip would be wasted effort + +```bash +hermes webhook subscribe antenna-matches \ + --deliver telegram \ + --deliver-chat-id "123456789" \ + --deliver-only \ + --prompt "🎉 New match: {match.user_name} matched with you!" \ + --description "Antenna match notifications" +``` + +The POST returns `200 OK` on successful delivery, `502` on target failure — so upstream services can retry intelligently. HMAC auth, rate limits, and idempotency still apply. + +Requires `--deliver` to be a real target (telegram, discord, slack, github_comment, etc.) — `--deliver log` is rejected because log-only direct delivery is pointless. + +## Security + +- Each subscription gets an auto-generated HMAC-SHA256 secret (or provide your own with `--secret`) +- The webhook adapter validates signatures on every incoming POST +- Static routes from config.yaml cannot be overwritten by dynamic subscriptions +- Subscriptions persist to `~/.hermes/webhook_subscriptions.json` + +## How It Works + +1. `hermes webhook subscribe` writes to `~/.hermes/webhook_subscriptions.json` +2. The webhook adapter hot-reloads this file on each incoming request (mtime-gated, negligible overhead) +3. When a POST arrives matching a route, the adapter formats the prompt and triggers an agent run +4. The agent's response is delivered to the configured target (Telegram, Discord, GitHub comment, etc.) + +## Troubleshooting + +If webhooks aren't working: + +1. **Is the gateway running?** Check with `systemctl --user status hermes-gateway` or `ps aux | grep gateway` +2. **Is the webhook server listening?** `curl http://localhost:8644/health` should return `{"status": "ok"}` +3. **Check gateway logs:** `grep webhook ~/.hermes/logs/gateway.log | tail -20` +4. 
**Signature mismatch?** Verify the secret in your service matches the one from `hermes webhook list`. GitHub sends `X-Hub-Signature-256`, GitLab sends `X-Gitlab-Token`. +5. **Firewall/NAT?** The webhook URL must be reachable from the service. For local development, use a tunnel (ngrok, cloudflared). +6. **Wrong event type?** Check `--events` filter matches what the service sends. Use `hermes webhook test <name>` to verify the route works. diff --git a/website/docs/user-guide/skills/bundled/dogfood/dogfood-dogfood.md b/website/docs/user-guide/skills/bundled/dogfood/dogfood-dogfood.md new file mode 100644 index 000000000..0ff7e72d9 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/dogfood/dogfood-dogfood.md @@ -0,0 +1,178 @@ +--- +title: "Dogfood" +sidebar_label: "Dogfood" +description: "Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Dogfood + +Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/dogfood` | +| Version | `1.0.0` | +| Tags | `qa`, `testing`, `browser`, `web`, `dogfood` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Dogfood: Systematic Web Application QA Testing + +## Overview + +This skill guides you through systematic exploratory QA testing of web applications using the browser toolset. You will navigate the application, interact with elements, capture evidence of issues, and produce a structured bug report. 
+ +## Prerequisites + +- Browser toolset must be available (`browser_navigate`, `browser_snapshot`, `browser_click`, `browser_type`, `browser_vision`, `browser_console`, `browser_scroll`, `browser_back`, `browser_press`) +- A target URL and testing scope from the user + +## Inputs + +The user provides: +1. **Target URL** — the entry point for testing +2. **Scope** — what areas/features to focus on (or "full site" for comprehensive testing) +3. **Output directory** (optional) — where to save screenshots and the report (default: `./dogfood-output`) + +## Workflow + +Follow this 5-phase systematic workflow: + +### Phase 1: Plan + +1. Create the output directory structure: + ``` + {output_dir}/ + ├── screenshots/ # Evidence screenshots + └── report.md # Final report (generated in Phase 5) + ``` +2. Identify the testing scope based on user input. +3. Build a rough sitemap by planning which pages and features to test: + - Landing/home page + - Navigation links (header, footer, sidebar) + - Key user flows (sign up, login, search, checkout, etc.) + - Forms and interactive elements + - Edge cases (empty states, error pages, 404s) + +### Phase 2: Explore + +For each page or feature in your plan: + +1. **Navigate** to the page: + ``` + browser_navigate(url="https://example.com/page") + ``` + +2. **Take a snapshot** to understand the DOM structure: + ``` + browser_snapshot() + ``` + +3. **Check the console** for JavaScript errors: + ``` + browser_console(clear=true) + ``` + Do this after every navigation and after every significant interaction. Silent JS errors are high-value findings. + +4. **Take an annotated screenshot** to visually assess the page and identify interactive elements: + ``` + browser_vision(question="Describe the page layout, identify any visual issues, broken elements, or accessibility concerns", annotate=true) + ``` + The `annotate=true` flag overlays numbered `[N]` labels on interactive elements. 
Each `[N]` maps to ref `@eN` for subsequent browser commands. + +5. **Test interactive elements** systematically: + - Click buttons and links: `browser_click(ref="@eN")` + - Fill forms: `browser_type(ref="@eN", text="test input")` + - Test keyboard navigation: `browser_press(key="Tab")`, `browser_press(key="Enter")` + - Scroll through content: `browser_scroll(direction="down")` + - Test form validation with invalid inputs + - Test empty submissions + +6. **After each interaction**, check for: + - Console errors: `browser_console()` + - Visual changes: `browser_vision(question="What changed after the interaction?")` + - Expected vs actual behavior + +### Phase 3: Collect Evidence + +For every issue found: + +1. **Take a screenshot** showing the issue: + ``` + browser_vision(question="Capture and describe the issue visible on this page", annotate=false) + ``` + Save the `screenshot_path` from the response — you will reference it in the report. + +2. **Record the details**: + - URL where the issue occurs + - Steps to reproduce + - Expected behavior + - Actual behavior + - Console errors (if any) + - Screenshot path + +3. **Classify the issue** using the issue taxonomy (see `references/issue-taxonomy.md`): + - Severity: Critical / High / Medium / Low + - Category: Functional / Visual / Accessibility / Console / UX / Content + +### Phase 4: Categorize + +1. Review all collected issues. +2. De-duplicate — merge issues that are the same bug manifesting in different places. +3. Assign final severity and category to each issue. +4. Sort by severity (Critical first, then High, Medium, Low). +5. Count issues by severity and category for the executive summary. + +### Phase 5: Report + +Generate the final report using the template at `templates/dogfood-report-template.md`. + +The report must include: +1. **Executive summary** with total issue count, breakdown by severity, and testing scope +2. 
**Per-issue sections** with: + - Issue number and title + - Severity and category badges + - URL where observed + - Description of the issue + - Steps to reproduce + - Expected vs actual behavior + - Screenshot references (use `MEDIA:` for inline images) + - Console errors if relevant +3. **Summary table** of all issues +4. **Testing notes** — what was tested, what was not, any blockers + +Save the report to `{output_dir}/report.md`. + +## Tools Reference + +| Tool | Purpose | +|------|---------| +| `browser_navigate` | Go to a URL | +| `browser_snapshot` | Get DOM text snapshot (accessibility tree) | +| `browser_click` | Click an element by ref (`@eN`) or text | +| `browser_type` | Type into an input field | +| `browser_scroll` | Scroll up/down on the page | +| `browser_back` | Go back in browser history | +| `browser_press` | Press a keyboard key | +| `browser_vision` | Screenshot + AI analysis; use `annotate=true` for element labels | +| `browser_console` | Get JS console output and errors | + +## Tips + +- **Always check `browser_console()` after navigating and after significant interactions.** Silent JS errors are among the most valuable findings. +- **Use `annotate=true` with `browser_vision`** when you need to reason about interactive element positions or when the snapshot refs are unclear. +- **Test with both valid and invalid inputs** — form validation bugs are common. +- **Scroll through long pages** — content below the fold may have rendering issues. +- **Test navigation flows** — click through multi-step processes end-to-end. +- **Check responsive behavior** by noting any layout issues visible in screenshots. +- **Don't forget edge cases**: empty states, very long text, special characters, rapid clicking. +- When reporting screenshots to the user, include `MEDIA:` so they can see the evidence inline. 
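The Phase 4 ordering (Critical first, then High, Medium, Low) can also be scripted rather than sorted by hand. A sketch, assuming a throwaway scratch file with one tab-separated severity/title pair per line; the scratch-file format is invented here, not part of the skill:

```bash
# Sample collected issues: severity, a tab, then the issue title.
printf 'Medium\tFooter link returns 404\nCritical\tCheckout button throws TypeError\nLow\tLogo is off-center on mobile\nHigh\tLogin form accepts empty password\n' > /tmp/issues.tsv

# Map severities to sortable ranks, sort numerically, then drop the rank column.
sorted=$(awk -F'\t' '
  BEGIN { rank["Critical"]=0; rank["High"]=1; rank["Medium"]=2; rank["Low"]=3 }
  { print rank[$1] "\t" $0 }
' /tmp/issues.tsv | sort -n | cut -f2-)
echo "$sorted"   # Critical first, Low last
```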
diff --git a/website/docs/user-guide/skills/bundled/email/email-himalaya.md b/website/docs/user-guide/skills/bundled/email/email-himalaya.md new file mode 100644 index 000000000..55178bdc9 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/email/email-himalaya.md @@ -0,0 +1,293 @@ +--- +title: "Himalaya — CLI to manage emails via IMAP/SMTP" +sidebar_label: "Himalaya" +description: "CLI to manage emails via IMAP/SMTP" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Himalaya + +CLI to manage emails via IMAP/SMTP. Use himalaya to list, read, write, reply, forward, search, and organize emails from the terminal. Supports multiple accounts and message composition with MML (MIME Meta Language). + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/email/himalaya` | +| Version | `1.0.0` | +| Author | community | +| License | MIT | +| Tags | `Email`, `IMAP`, `SMTP`, `CLI`, `Communication` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Himalaya Email CLI + +Himalaya is a CLI email client that lets you manage emails from the terminal using IMAP, SMTP, Notmuch, or Sendmail backends. + +## References + +- `references/configuration.md` (config file setup + IMAP/SMTP authentication) +- `references/message-composition.md` (MML syntax for composing emails) + +## Prerequisites + +1. Himalaya CLI installed (`himalaya --version` to verify) +2. A configuration file at `~/.config/himalaya/config.toml` +3. 
IMAP/SMTP credentials configured (password stored securely) + +### Installation + +```bash +# Pre-built binary (Linux/macOS — recommended) +curl -sSL https://raw.githubusercontent.com/pimalaya/himalaya/master/install.sh | PREFIX=~/.local sh + +# macOS via Homebrew +brew install himalaya + +# Or via cargo (any platform with Rust) +cargo install himalaya --locked +``` + +## Configuration Setup + +Run the interactive wizard to set up an account: + +```bash +himalaya account configure +``` + +Or create `~/.config/himalaya/config.toml` manually: + +```toml +[accounts.personal] +email = "you@example.com" +display-name = "Your Name" +default = true + +backend.type = "imap" +backend.host = "imap.example.com" +backend.port = 993 +backend.encryption.type = "tls" +backend.login = "you@example.com" +backend.auth.type = "password" +backend.auth.cmd = "pass show email/imap" # or use keyring + +message.send.backend.type = "smtp" +message.send.backend.host = "smtp.example.com" +message.send.backend.port = 587 +message.send.backend.encryption.type = "start-tls" +message.send.backend.login = "you@example.com" +message.send.backend.auth.type = "password" +message.send.backend.auth.cmd = "pass show email/smtp" +``` + +## Hermes Integration Notes + +- **Reading, listing, searching, moving, deleting** all work directly through the terminal tool +- **Composing/replying/forwarding** — piped input (`cat << EOF | himalaya template send`) is recommended for reliability. 
Interactive `$EDITOR` mode works with `pty=true` + background + process tool, but requires knowing the editor and its commands +- Use `--output json` for structured output that's easier to parse programmatically +- The `himalaya account configure` wizard requires interactive input — use PTY mode: `terminal(command="himalaya account configure", pty=true)` + +## Common Operations + +### List Folders + +```bash +himalaya folder list +``` + +### List Emails + +List emails in INBOX (default): + +```bash +himalaya envelope list +``` + +List emails in a specific folder: + +```bash +himalaya envelope list --folder "Sent" +``` + +List with pagination: + +```bash +himalaya envelope list --page 1 --page-size 20 +``` + +### Search Emails + +```bash +himalaya envelope list from john@example.com subject meeting +``` + +### Read an Email + +Read email by ID (shows plain text): + +```bash +himalaya message read 42 +``` + +Export raw MIME: + +```bash +himalaya message export 42 --full +``` + +### Reply to an Email + +To reply non-interactively from Hermes, read the original message, compose a reply, and pipe it: + +```bash +# Get the reply template, edit it, and send +himalaya template reply 42 | sed 's/^$/\nYour reply text here\n/' | himalaya template send +``` + +Or build the reply manually: + +```bash +cat << 'EOF' | himalaya template send +From: you@example.com +To: sender@example.com +Subject: Re: Original Subject +In-Reply-To: + +Your reply here. 
+EOF +``` + +Reply-all (interactive — needs $EDITOR, use template approach above instead): + +```bash +himalaya message reply 42 --all +``` + +### Forward an Email + +```bash +# Get forward template and pipe with modifications +himalaya template forward 42 | sed 's/^To:.*/To: newrecipient@example.com/' | himalaya template send +``` + +### Write a New Email + +**Non-interactive (use this from Hermes)** — pipe the message via stdin: + +```bash +cat << 'EOF' | himalaya template send +From: you@example.com +To: recipient@example.com +Subject: Test Message + +Hello from Himalaya! +EOF +``` + +Or with headers flag: + +```bash +himalaya message write -H "To:recipient@example.com" -H "Subject:Test" "Message body here" +``` + +Note: `himalaya message write` without piped input opens `$EDITOR`. This works with `pty=true` + background mode, but piping is simpler and more reliable. + +### Move/Copy Emails + +Move to folder: + +```bash +himalaya message move 42 "Archive" +``` + +Copy to folder: + +```bash +himalaya message copy 42 "Important" +``` + +### Delete an Email + +```bash +himalaya message delete 42 +``` + +### Manage Flags + +Add flag: + +```bash +himalaya flag add 42 --flag seen +``` + +Remove flag: + +```bash +himalaya flag remove 42 --flag seen +``` + +## Multiple Accounts + +List accounts: + +```bash +himalaya account list +``` + +Use a specific account: + +```bash +himalaya --account work envelope list +``` + +## Attachments + +Save attachments from a message: + +```bash +himalaya attachment download 42 +``` + +Save to specific directory: + +```bash +himalaya attachment download 42 --dir ~/Downloads +``` + +## Output Formats + +Most commands support `--output` for structured output: + +```bash +himalaya envelope list --output json +himalaya envelope list --output plain +``` + +## Debugging + +Enable debug logging: + +```bash +RUST_LOG=debug himalaya envelope list +``` + +Full trace with backtrace: + +```bash +RUST_LOG=trace RUST_BACKTRACE=1 himalaya envelope list 
+``` + +## Tips + +- Use `himalaya --help` for detailed usage; every subcommand also accepts `--help`. +- Message IDs are relative to the current folder; re-list after folder changes. +- For composing rich emails with attachments, use MML syntax (see `references/message-composition.md`). +- Store passwords securely using `pass`, system keyring, or a command that outputs the password. diff --git a/website/docs/user-guide/skills/bundled/gaming/gaming-minecraft-modpack-server.md b/website/docs/user-guide/skills/bundled/gaming/gaming-minecraft-modpack-server.md new file mode 100644 index 000000000..d85495a18 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/gaming/gaming-minecraft-modpack-server.md @@ -0,0 +1,205 @@ +--- +title: "Minecraft Modpack Server — Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip" +sidebar_label: "Minecraft Modpack Server" +description: "Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Minecraft Modpack Server + +Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip. Covers NeoForge/Forge install, Java version, JVM tuning, firewall, LAN config, backups, and launch scripts. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/gaming/minecraft-modpack-server` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. 
+::: + +# Minecraft Modpack Server Setup + +## When to use +- User wants to set up a modded Minecraft server from a server pack zip +- User needs help with NeoForge/Forge server configuration +- User asks about Minecraft server performance tuning or backups + +## Gather User Preferences First +Before starting setup, ask the user for: +- **Server name / MOTD** — what should it say in the server list? +- **Seed** — specific seed or random? +- **Difficulty** — peaceful / easy / normal / hard? +- **Gamemode** — survival / creative / adventure? +- **Online mode** — true (Mojang auth, legit accounts) or false (LAN/cracked friendly)? +- **Player count** — how many players expected? (affects RAM & view distance tuning) +- **RAM allocation** — or let agent decide based on mod count & available RAM? +- **View distance / simulation distance** — or let agent pick based on player count & hardware? +- **PvP** — on or off? +- **Whitelist** — open server or whitelist only? +- **Backups** — want automated backups? How often? + +Use sensible defaults if the user doesn't care, but always ask before generating the config. + +## Steps + +### 1. Download & Inspect the Pack +```bash +mkdir -p ~/minecraft-server +cd ~/minecraft-server +wget -O serverpack.zip "" +unzip -o serverpack.zip -d server +ls server/ +``` +Look for: `startserver.sh`, installer jar (neoforge/forge), `user_jvm_args.txt`, `mods/` folder. +Check the script to determine: mod loader type, version, and required Java version. + +### 2. Install Java +- Minecraft 1.21+ → Java 21: `sudo apt install openjdk-21-jre-headless` +- Minecraft 1.18-1.20 → Java 17: `sudo apt install openjdk-17-jre-headless` +- Minecraft 1.16 and below → Java 8: `sudo apt install openjdk-8-jre-headless` +- Verify: `java -version` + +### 3. Install the Mod Loader +Most server packs include an install script. 
Use the INSTALL_ONLY env var to install without launching: +```bash +cd ~/minecraft-server/server +ATM10_INSTALL_ONLY=true bash startserver.sh +# Or for generic Forge packs: +# java -jar forge-*-installer.jar --installServer +``` +This downloads libraries, patches the server jar, etc. + +### 4. Accept EULA +```bash +echo "eula=true" > ~/minecraft-server/server/eula.txt +``` + +### 5. Configure server.properties +Key settings for modded/LAN. Note that `server.properties` is a Java properties file: `#` starts a comment only at the beginning of a line, and an inline `# ...` becomes part of the value, so keep comments on their own lines: +```properties +motd=\u00a7b\u00a7lServer Name \u00a7r\u00a78| \u00a7aModpack Name +server-port=25565 +# false for LAN without Mojang auth (keep enforce-secure-profile in sync) +online-mode=true +enforce-secure-profile=true +# most modpacks balance around hard +difficulty=hard +# REQUIRED for modded (flying mounts/items) +allow-flight=true +# let everyone build at spawn +spawn-protection=0 +# modded needs longer tick timeout +max-tick-time=180000 +enable-command-block=true +``` + +Performance settings (scale to hardware): +```properties +# 2 players, beefy machine: +view-distance=16 +simulation-distance=10 + +# 4-6 players, moderate machine: +view-distance=10 +simulation-distance=6 + +# 8+ players or weaker hardware: +view-distance=8 +simulation-distance=4 +``` + +### 6. Tune JVM Args (user_jvm_args.txt) +Scale RAM to player count and mod count. Rule of thumb for modded: +- 100-200 mods: 6-12GB +- 200-350+ mods: 12-24GB +- Leave at least 8GB free for the OS/other tasks + +``` +-Xms12G +-Xmx24G +-XX:+UseG1GC +-XX:+ParallelRefProcEnabled +-XX:MaxGCPauseMillis=200 +-XX:+UnlockExperimentalVMOptions +-XX:+DisableExplicitGC +-XX:+AlwaysPreTouch +-XX:G1NewSizePercent=30 +-XX:G1MaxNewSizePercent=40 +-XX:G1HeapRegionSize=8M +-XX:G1ReservePercent=20 +-XX:G1HeapWastePercent=5 +-XX:G1MixedGCCountTarget=4 +-XX:InitiatingHeapOccupancyPercent=15 +-XX:G1MixedGCLiveThresholdPercent=90 +-XX:G1RSetUpdatingPauseTimePercent=5 +-XX:SurvivorRatio=32 +-XX:+PerfDisableSharedMem +-XX:MaxTenuringThreshold=1 +``` + +### 7. 
Open Firewall +```bash +sudo ufw allow 25565/tcp comment "Minecraft Server" +``` +Check with: `sudo ufw status | grep 25565` + +### 8. Create Launch Script +```bash +cat > ~/start-minecraft.sh << 'EOF' +#!/bin/bash +cd ~/minecraft-server/server +java @user_jvm_args.txt @libraries/net/neoforged/neoforge//unix_args.txt nogui +EOF +chmod +x ~/start-minecraft.sh +``` +Note: For Forge (not NeoForge), the args file path differs. Check `startserver.sh` for the exact path. + +### 9. Set Up Automated Backups +Create backup script: +```bash +cat > ~/minecraft-server/backup.sh << 'SCRIPT' +#!/bin/bash +SERVER_DIR="$HOME/minecraft-server/server" +BACKUP_DIR="$HOME/minecraft-server/backups" +WORLD_DIR="$SERVER_DIR/world" +MAX_BACKUPS=24 +mkdir -p "$BACKUP_DIR" +[ ! -d "$WORLD_DIR" ] && echo "[BACKUP] No world folder" && exit 0 +TIMESTAMP=$(date +%Y-%m-%d_%H-%M-%S) +BACKUP_FILE="$BACKUP_DIR/world_${TIMESTAMP}.tar.gz" +echo "[BACKUP] Starting at $(date)" +tar -czf "$BACKUP_FILE" -C "$SERVER_DIR" world +SIZE=$(du -h "$BACKUP_FILE" | cut -f1) +echo "[BACKUP] Saved: $BACKUP_FILE ($SIZE)" +BACKUP_COUNT=$(ls -1t "$BACKUP_DIR"/world_*.tar.gz 2>/dev/null | wc -l) +if [ "$BACKUP_COUNT" -gt "$MAX_BACKUPS" ]; then + REMOVE=$((BACKUP_COUNT - MAX_BACKUPS)) + ls -1t "$BACKUP_DIR"/world_*.tar.gz | tail -n "$REMOVE" | xargs rm -f + echo "[BACKUP] Pruned $REMOVE old backup(s)" +fi +echo "[BACKUP] Done at $(date)" +SCRIPT +chmod +x ~/minecraft-server/backup.sh +``` + +Add hourly cron (the `grep -v` must match the full `minecraft-server/backup.sh` path so re-running this line replaces the entry instead of duplicating it): +```bash +(crontab -l 2>/dev/null | grep -v "minecraft-server/backup.sh"; echo "0 * * * * $HOME/minecraft-server/backup.sh >> $HOME/minecraft-server/backups/backup.log 2>&1") | crontab - +``` + +## Pitfalls +- ALWAYS set `allow-flight=true` for modded — mods with jetpacks/flight will kick players otherwise +- `max-tick-time=180000` or higher — modded servers often have long ticks during worldgen +- First startup is SLOW (several minutes for big packs) — don't panic +- "Can't keep up!" 
warnings on first launch are normal, settles after initial chunk gen +- If online-mode=false, set enforce-secure-profile=false too or clients get rejected +- The pack's startserver.sh often has an auto-restart loop — make a clean launch script without it +- Delete the world/ folder to regenerate with a new seed +- Some packs have env vars to control behavior (e.g., ATM10 uses ATM10_JAVA, ATM10_RESTART, ATM10_INSTALL_ONLY) + +## Verification +- `pgrep -fa neoforge` or `pgrep -fa minecraft` to check if running +- Check logs: `tail -f ~/minecraft-server/server/logs/latest.log` +- Look for "Done (Xs)!" in the log = server is ready +- Test connection: player adds server IP in Multiplayer diff --git a/website/docs/user-guide/skills/bundled/gaming/gaming-pokemon-player.md b/website/docs/user-guide/skills/bundled/gaming/gaming-pokemon-player.md new file mode 100644 index 000000000..ab070f867 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/gaming/gaming-pokemon-player.md @@ -0,0 +1,235 @@ +--- +title: "Pokemon Player — Play Pokemon games autonomously via headless emulation" +sidebar_label: "Pokemon Player" +description: "Play Pokemon games autonomously via headless emulation" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Pokemon Player + +Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/gaming/pokemon-player` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. 
+::: + +# Pokemon Player + +Play Pokemon games via headless emulation using the `pokemon-agent` package. + +## When to Use +- User says "play pokemon", "start pokemon", "pokemon game" +- User asks about Pokemon Red, Blue, Yellow, FireRed, etc. +- User wants to watch an AI play Pokemon +- User references a ROM file (.gb, .gbc, .gba) + +## Startup Procedure + +### 1. First-time setup (clone, venv, install) +The repo is NousResearch/pokemon-agent on GitHub. Clone it, then +set up a Python 3.10+ virtual environment. Use uv (preferred for speed) +to create the venv and install the package in editable mode with the +pyboy extra. If uv is not available, fall back to python3 -m venv + pip. + +On this machine it is already set up at /home/teknium/pokemon-agent +with a venv ready — just cd there and source .venv/bin/activate. + +You also need a ROM file. Ask the user for theirs. On this machine +one exists at roms/pokemon_red.gb inside that directory. +NEVER download or provide ROM files — always ask the user. + +### 2. Start the game server +From inside the pokemon-agent directory with the venv activated, run +pokemon-agent serve with --rom pointing to the ROM and --port 9876. +Run it in the background with &. +To resume from a saved game, add --load-state with the save name. +Wait 4 seconds for startup, then verify with GET /health. + +### 3. Set up live dashboard for user to watch +Use an SSH reverse tunnel via localhost.run so the user can view +the dashboard in their browser. Connect with ssh, forwarding local +port 9876 to remote port 80 on nokey@localhost.run. Redirect output +to a log file, wait 10 seconds, then grep the log for the .lhr.life +URL. Give the user the URL with /dashboard/ appended. +The tunnel URL changes each time — give the user the new one if restarted. 
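The dashboard steps above can be sketched as a script. The `ssh` invocation and 10-second wait mirror the prose; the log line below is simulated so the URL extraction is demonstrable offline, and localhost.run's real output wording may differ:

```bash
# Real use (commented out here): start the tunnel, then wait for it to come up.
#   ssh -R 80:localhost:9876 nokey@localhost.run > /tmp/tunnel.log 2>&1 &
#   sleep 10
printf 'Connect to http://abc123.lhr.life or https://abc123.lhr.life\n' > /tmp/tunnel.log

# Pull the first https .lhr.life URL out of the log and append /dashboard/.
url=$(grep -oE 'https://[a-z0-9]+\.lhr\.life' /tmp/tunnel.log | head -n 1)
echo "${url}/dashboard/"   # → https://abc123.lhr.life/dashboard/
```

Hand the printed URL to the user; since the tunnel URL changes on every restart, re-run the extraction after any restart.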
+ +## Save and Load + +### When to save +- Every 15-20 turns of gameplay +- ALWAYS before gym battles, rival encounters, or risky fights +- Before entering a new town or dungeon +- Before any action you are unsure about + +### How to save +POST /save with a descriptive name. Good examples: +before_brock, route1_start, mt_moon_entrance, got_cut + +### How to load +POST /load with the save name. + +### List available saves +GET /saves returns all saved states. + +### Loading on server startup +Use --load-state flag when starting the server to auto-load a save. +This is faster than loading via the API after startup. + +## The Gameplay Loop + +### Step 1: OBSERVE — check state AND take a screenshot +GET /state for position, HP, battle, dialog. +GET /screenshot and save to /tmp/pokemon.png, then use vision_analyze. +Always do BOTH — RAM state gives numbers, vision gives spatial awareness. + +### Step 2: ORIENT +- Dialog/text on screen → advance it +- In battle → fight or run +- Party hurt → head to Pokemon Center +- Near objective → navigate carefully + +### Step 3: DECIDE +Priority: dialog > battle > heal > story objective > training > explore + +### Step 4: ACT — move 2-4 steps max, then re-check +POST /action with a SHORT action list (2-4 actions, not 10-15). + +### Step 5: VERIFY — screenshot after every move sequence +Take a screenshot and use vision_analyze to confirm you moved where +intended. This is the MOST IMPORTANT step. Without vision you WILL get lost. 
+ +### Step 6: RECORD progress to memory with PKM: prefix + +### Step 7: SAVE periodically + +## Action Reference +- press_a — confirm, talk, select +- press_b — cancel, close menu +- press_start — open game menu +- walk_up/down/left/right — move one tile +- hold_b_N — hold B for N frames (use for speeding through text) +- wait_60 — wait about 1 second (60 frames) +- a_until_dialog_end — press A repeatedly until dialog clears + +## Critical Tips from Experience + +### USE VISION CONSTANTLY +- Take a screenshot every 2-4 movement steps +- The RAM state tells you position and HP but NOT what is around you +- Ledges, fences, signs, building doors, NPCs — only visible via screenshot +- Ask the vision model specific questions: "what is one tile north of me?" +- When stuck, always screenshot before trying random directions + +### Warp Transitions Need Extra Wait Time +When walking through a door or stairs, the screen fades to black during +the map transition. You MUST wait for it to complete. Add 2-3 wait_60 +actions after any door/stair warp. Without waiting, the position reads +as stale and you will think you are still in the old map. + +### Building Exit Trap +When you exit a building, you appear directly IN FRONT of the door. +If you walk north, you go right back inside. ALWAYS sidestep first +by walking left or right 2 tiles, then proceed in your intended direction. + +### Dialog Handling +Gen 1 text scrolls slowly letter-by-letter. To speed through dialog, +hold B for 120 frames then press A. Repeat as needed. Holding B makes +text display at max speed. Then press A to advance to the next line. +The a_until_dialog_end action checks the RAM dialog flag, but this flag +does not catch ALL text states. If dialog seems stuck, use the manual +hold_b + press_a pattern instead and verify via screenshot. + +### Ledges Are One-Way +Ledges (small cliff edges) can only be jumped DOWN (south), never climbed +UP (north). 
If blocked by a ledge going north, you must go left or right +to find the gap around it. Use vision to identify which direction the +gap is. Ask the vision model explicitly. + +### Navigation Strategy +- Move 2-4 steps at a time, then screenshot to check position +- When entering a new area, screenshot immediately to orient +- Ask the vision model "which direction to [destination]?" +- If stuck for 3+ attempts, screenshot and re-evaluate completely +- Do not spam 10-15 movements — you will overshoot or get stuck + +### Running from Wild Battles +On the battle menu, RUN is bottom-right. To reach it from the default +cursor position (FIGHT, top-left): press down then right to move cursor +to RUN, then press A. Wrap with hold_b to speed through text/animations. + +### Battling (FIGHT) +On the battle menu FIGHT is top-left (default cursor position). +Press A to enter move selection, A again to use the first move. +Then hold B to speed through attack animations and text. + +## Battle Strategy + +### Decision Tree +1. Want to catch? → Weaken then throw Poke Ball +2. Wild you don't need? → RUN +3. Type advantage? → Use super-effective move +4. No advantage? → Use strongest STAB move +5. Low HP? → Switch or use Potion + +### Gen 1 Type Chart (key matchups) +- Water beats Fire, Ground, Rock +- Fire beats Grass, Bug, Ice +- Grass beats Water, Ground, Rock +- Electric beats Water, Flying +- Ground beats Fire, Electric, Rock, Poison +- Psychic beats Fighting, Poison (dominant in Gen 1!) 
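These matchups are easy to fumble mid-battle. A tiny lookup helper covering only the rows listed above, not the full Gen 1 chart:

```bash
# Prints "yes" when the move type is super effective against the defender type.
super_effective() {
  case "$1:$2" in
    water:fire|water:ground|water:rock)                    echo yes ;;
    fire:grass|fire:bug|fire:ice)                          echo yes ;;
    grass:water|grass:ground|grass:rock)                   echo yes ;;
    electric:water|electric:flying)                        echo yes ;;
    ground:fire|ground:electric|ground:rock|ground:poison) echo yes ;;
    psychic:fighting|psychic:poison)                       echo yes ;;
    *)                                                     echo no ;;
  esac
}

super_effective water fire   # → yes
super_effective fire water   # → no
```

A plain POSIX `case` keeps this dependency-free, so it can run in the same shell session as the server commands.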
+ +### Gen 1 Quirks +- Special stat = both offense AND defense for special moves +- Psychic type is overpowered (Ghost moves bugged) +- Critical hits based on Speed stat +- Wrap/Bind prevent opponent from acting +- Focus Energy bug: REDUCES crit rate instead of raising it + +## Memory Conventions +| Prefix | Purpose | Example | +|--------|---------|---------| +| PKM:OBJECTIVE | Current goal | Get Parcel from Viridian Mart | +| PKM:MAP | Navigation knowledge | Viridian: mart is northeast | +| PKM:STRATEGY | Battle/team plans | Need Grass type before Misty | +| PKM:PROGRESS | Milestone tracker | Beat rival, heading to Viridian | +| PKM:STUCK | Stuck situations | Ledge at y=28 go right to bypass | +| PKM:TEAM | Team notes | Squirtle Lv6, Tackle + Tail Whip | + +## Progression Milestones +- Choose starter +- Deliver Parcel from Viridian Mart, receive Pokedex +- Boulder Badge — Brock (Rock) → use Water/Grass +- Cascade Badge — Misty (Water) → use Grass/Electric +- Thunder Badge — Lt. Surge (Electric) → use Ground +- Rainbow Badge — Erika (Grass) → use Fire/Ice/Flying +- Soul Badge — Koga (Poison) → use Ground/Psychic +- Marsh Badge — Sabrina (Psychic) → hardest gym +- Volcano Badge — Blaine (Fire) → use Water/Ground +- Earth Badge — Giovanni (Ground) → use Water/Grass/Ice +- Elite Four → Champion! + +## Stopping Play +1. Save the game with a descriptive name via POST /save +2. Update memory with PKM:PROGRESS +3. Tell user: "Game saved as [name]! Say 'play pokemon' to resume." +4. 
Kill the server and tunnel background processes + +## Pitfalls +- NEVER download or provide ROM files +- Do NOT send more than 4-5 actions without checking vision +- Always sidestep after exiting buildings before going north +- Always add wait_60 x2-3 after door/stair warps +- Dialog detection via RAM is unreliable — verify with screenshots +- Save BEFORE risky encounters +- The tunnel URL changes each time you restart it diff --git a/website/docs/user-guide/skills/bundled/github/github-codebase-inspection.md b/website/docs/user-guide/skills/bundled/github/github-codebase-inspection.md new file mode 100644 index 000000000..13c3fe442 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/github/github-codebase-inspection.md @@ -0,0 +1,131 @@ +--- +title: "Codebase Inspection" +sidebar_label: "Codebase Inspection" +description: "Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Codebase Inspection + +Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios. Use when asked to check lines of code, repo size, language composition, or codebase stats. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/github/codebase-inspection` | +| Version | `1.0.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `LOC`, `Code Analysis`, `pygount`, `Codebase`, `Metrics`, `Repository` | +| Related skills | [`github-repo-management`](/docs/user-guide/skills/bundled/github/github-github-repo-management) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. 
+::: + +# Codebase Inspection with pygount + +Analyze repositories for lines of code, language breakdown, file counts, and code-vs-comment ratios using `pygount`. + +## When to Use + +- User asks for LOC (lines of code) count +- User wants a language breakdown of a repo +- User asks about codebase size or composition +- User wants code-vs-comment ratios +- General "how big is this repo" questions + +## Prerequisites + +```bash +pip install --break-system-packages pygount 2>/dev/null || pip install pygount +``` + +## 1. Basic Summary (Most Common) + +Get a full language breakdown with file counts, code lines, and comment lines: + +```bash +cd /path/to/repo +pygount --format=summary \ + --folders-to-skip=".git,node_modules,venv,.venv,__pycache__,.cache,dist,build,.next,.tox,.eggs,*.egg-info" \ + . +``` + +**IMPORTANT:** Always use `--folders-to-skip` to exclude dependency/build directories, otherwise pygount will crawl them and take a very long time or hang. + +## 2. Common Folder Exclusions + +Adjust based on the project type: + +```bash +# Python projects +--folders-to-skip=".git,venv,.venv,__pycache__,.cache,dist,build,.tox,.eggs,.mypy_cache" + +# JavaScript/TypeScript projects +--folders-to-skip=".git,node_modules,dist,build,.next,.cache,.turbo,coverage" + +# General catch-all +--folders-to-skip=".git,node_modules,venv,.venv,__pycache__,.cache,dist,build,.next,.tox,vendor,third_party" +``` + +## 3. Filter by Specific Language + +```bash +# Only count Python files +pygount --suffix=py --format=summary . + +# Only count Python and YAML +pygount --suffix=py,yaml,yml --format=summary . +``` + +## 4. Detailed File-by-File Output + +```bash +# Default format shows per-file breakdown +pygount --folders-to-skip=".git,node_modules,venv" . + +# Sort by code lines (pipe through sort) +pygount --folders-to-skip=".git,node_modules,venv" . | sort -t$'\t' -k1 -nr | head -20 +``` + +## 5. 
Output Formats + +```bash +# Summary table (default recommendation) +pygount --format=summary . + +# JSON output for programmatic use +pygount --format=json . + +# Pipe-friendly: Language, file count, code, docs, empty, string +pygount --format=summary . 2>/dev/null +``` + +## 6. Interpreting Results + +The summary table columns: +- **Language** — detected programming language +- **Files** — number of files of that language +- **Code** — lines of actual code (executable/declarative) +- **Comment** — lines that are comments or documentation +- **%** — percentage of total + +Special pseudo-languages: +- `__empty__` — empty files +- `__binary__` — binary files (images, compiled, etc.) +- `__generated__` — auto-generated files (detected heuristically) +- `__duplicate__` — files with identical content +- `__unknown__` — unrecognized file types + +## Pitfalls + +1. **Always exclude .git, node_modules, venv** — without `--folders-to-skip`, pygount will crawl everything and may take minutes or hang on large dependency trees. +2. **Markdown shows 0 code lines** — pygount classifies all Markdown content as comments, not code. This is expected behavior. +3. **JSON files show low code counts** — pygount may count JSON lines conservatively. For accurate JSON line counts, use `wc -l` directly. +4. **Large monorepos** — for very large repos, consider using `--suffix` to target specific languages rather than scanning everything. 
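Pitfall 3 above can be cross-checked with standard tools. A minimal sketch, not part of the original skill, assuming you only need raw physical line counts (blank lines included) rather than pygount's classification:

```shell
# Count raw physical lines across all JSON files, skipping dependency directories.
# Unlike pygount, this counts every line, including blanks and punctuation-only lines.
find . -type f -name "*.json" \
  -not -path "./.git/*" -not -path "./node_modules/*" -not -path "./venv/*" \
  -exec cat {} + | wc -l
```

The same pattern works for any suffix pygount undercounts; swap the `-name` glob as needed.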
diff --git a/website/docs/user-guide/skills/bundled/github/github-github-auth.md b/website/docs/user-guide/skills/bundled/github/github-github-auth.md new file mode 100644 index 000000000..4f7360c43 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/github/github-github-auth.md @@ -0,0 +1,264 @@ +--- +title: "Github Auth — Set up GitHub authentication for the agent using git (universally available) or the gh CLI" +sidebar_label: "Github Auth" +description: "Set up GitHub authentication for the agent using git (universally available) or the gh CLI" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Github Auth + +Set up GitHub authentication for the agent using git (universally available) or the gh CLI. Covers HTTPS tokens, SSH keys, credential helpers, and gh auth — with a detection flow to pick the right method automatically. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/github/github-auth` | +| Version | `1.1.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `GitHub`, `Authentication`, `Git`, `gh-cli`, `SSH`, `Setup` | +| Related skills | [`github-pr-workflow`](/docs/user-guide/skills/bundled/github/github-github-pr-workflow), [`github-code-review`](/docs/user-guide/skills/bundled/github/github-github-code-review), [`github-issues`](/docs/user-guide/skills/bundled/github/github-github-issues), [`github-repo-management`](/docs/user-guide/skills/bundled/github/github-github-repo-management) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# GitHub Authentication Setup + +This skill sets up authentication so the agent can work with GitHub repositories, PRs, issues, and CI. 
It covers two paths: + +- **`git` (always available)** — uses HTTPS personal access tokens or SSH keys +- **`gh` CLI (if installed)** — richer GitHub API access with a simpler auth flow + +## Detection Flow + +When a user asks you to work with GitHub, run this check first: + +```bash +# Check what's available +git --version +gh --version 2>/dev/null || echo "gh not installed" + +# Check if already authenticated +gh auth status 2>/dev/null || echo "gh not authenticated" +git config --global credential.helper 2>/dev/null || echo "no git credential helper" +``` + +**Decision tree:** +1. If `gh auth status` shows authenticated → you're good, use `gh` for everything +2. If `gh` is installed but not authenticated → use "gh auth" method below +3. If `gh` is not installed → use "git-only" method below (no sudo needed) + +--- + +## Method 1: Git-Only Authentication (No gh, No sudo) + +This works on any machine with `git` installed. No root access needed. + +### Option A: HTTPS with Personal Access Token (Recommended) + +This is the most portable method — works everywhere, no SSH config needed. 
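The token flow in the steps below prompts once interactively. When the agent is running fully headless, the credential store file can also be written directly; a sketch, not part of the original skill, assuming hypothetical `GH_USER` and `GH_TOKEN` variables already hold the username and token:

```shell
# Non-interactive equivalent of the "store" helper prompt:
# write the entry in the exact format ~/.git-credentials uses.
git config --global credential.helper store
printf 'https://%s:%s@github.com\n' "$GH_USER" "$GH_TOKEN" >> ~/.git-credentials
chmod 600 ~/.git-credentials
```

After this, any HTTPS operation against github.com reuses the stored token without prompting.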
+ +**Step 1: Create a personal access token** + +Tell the user to go to: **https://github.com/settings/tokens** + +- Click "Generate new token (classic)" +- Give it a name like "hermes-agent" +- Select scopes: + - `repo` (full repository access — read, write, push, PRs) + - `workflow` (trigger and manage GitHub Actions) + - `read:org` (if working with organization repos) +- Set expiration (90 days is a good default) +- Copy the token — it won't be shown again + +**Step 2: Configure git to store the token** + +```bash +# Set up the credential helper to cache credentials +# "store" saves to ~/.git-credentials in plaintext (simple, persistent) +git config --global credential.helper store + +# Now do a test operation that triggers auth — git will prompt for credentials +# Username: <your-github-username> +# Password: <your-personal-access-token> +git ls-remote https://github.com/<owner>/<repo>.git +``` + +After entering credentials once, they're saved and reused for all future operations. + +**Alternative: cache helper (credentials expire from memory)** + +```bash +# Cache in memory for 8 hours (28800 seconds) instead of saving to disk +git config --global credential.helper 'cache --timeout=28800' +``` + +**Alternative: set the token directly in the remote URL (per-repo)** + +```bash +# Embed token in the remote URL (avoids credential prompts entirely) +git remote set-url origin https://<username>:<token>@github.com/<owner>/<repo>.git +``` + +**Step 3: Configure git identity** + +```bash +# Required for commits — set name and email +git config --global user.name "Their Name" +git config --global user.email "their-email@example.com" +``` + +**Step 4: Verify** + +```bash +# Test push access (this should work without any prompts now) +git ls-remote https://github.com/<owner>/<repo>.git + +# Verify identity +git config --global user.name +git config --global user.email +``` + +### Option B: SSH Key Authentication + +Good for users who prefer SSH or already have keys set up.
+ +**Step 1: Check for existing SSH keys** + +```bash +ls -la ~/.ssh/id_*.pub 2>/dev/null || echo "No SSH keys found" +``` + +**Step 2: Generate a key if needed** + +```bash +# Generate an ed25519 key (modern, secure, fast) +ssh-keygen -t ed25519 -C "their-email@example.com" -f ~/.ssh/id_ed25519 -N "" + +# Display the public key for them to add to GitHub +cat ~/.ssh/id_ed25519.pub +``` + +Tell the user to add the public key at: **https://github.com/settings/keys** +- Click "New SSH key" +- Paste the public key content +- Give it a title like "hermes-agent-<machine-name>" + +**Step 3: Test the connection** + +```bash +ssh -T git@github.com +# Expected: "Hi <username>! You've successfully authenticated..." +``` + +**Step 4: Configure git to use SSH for GitHub** + +```bash +# Rewrite HTTPS GitHub URLs to SSH automatically +git config --global url."git@github.com:".insteadOf "https://github.com/" +``` + +**Step 5: Configure git identity** + +```bash +git config --global user.name "Their Name" +git config --global user.email "their-email@example.com" +``` + +--- + +## Method 2: gh CLI Authentication + +If `gh` is installed, it handles both API access and git credentials in one step. + +### Interactive Browser Login (Desktop) + +```bash +gh auth login +# Select: GitHub.com +# Select: HTTPS +# Authenticate via browser +``` + +### Token-Based Login (Headless / SSH Servers) + +```bash +echo "<token>" | gh auth login --with-token + +# Set up git credentials through gh +gh auth setup-git +``` + +### Verify + +```bash +gh auth status +``` + +--- + +## Using the GitHub API Without gh + +When `gh` is not available, you can still access the full GitHub API using `curl` with a personal access token. This is how the other GitHub skills implement their fallbacks.
+ +### Setting the Token for API Calls + +```bash +# Option 1: Export as env var (preferred — keeps it out of commands) +export GITHUB_TOKEN="<your-token>" + +# Then use in curl calls: +curl -s -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/user +``` + +### Extracting the Token from Git Credentials + +If git credentials are already configured (via credential.helper store), the token can be extracted: + +```bash +# Read from git credential store +grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|' +``` + +### Helper: Detect Auth Method + +Use this pattern at the start of any GitHub workflow: + +```bash +# Try gh first, fall back to git + curl +if command -v gh &>/dev/null && gh auth status &>/dev/null; then + echo "AUTH_METHOD=gh" +elif [ -n "$GITHUB_TOKEN" ]; then + echo "AUTH_METHOD=curl" +elif [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then + export GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r') + echo "AUTH_METHOD=curl" +elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then + export GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|') + echo "AUTH_METHOD=curl" +else + echo "AUTH_METHOD=none" + echo "Need to set up authentication first" +fi +``` + +--- + +## Troubleshooting + +| Problem | Solution | +|---------|----------| +| `git push` asks for password | GitHub disabled password auth.
Use a personal access token as the password, or switch to SSH | +| `remote: Permission to X denied` | Token may lack `repo` scope — regenerate with correct scopes | +| `fatal: Authentication failed` | Cached credentials may be stale — run `git credential reject` then re-authenticate | +| `ssh: connect to host github.com port 22: Connection refused` | Try SSH over HTTPS port: add `Host github.com` with `Port 443` and `Hostname ssh.github.com` to `~/.ssh/config` | +| Credentials not persisting | Check `git config --global credential.helper` — must be `store` or `cache` | +| Multiple GitHub accounts | Use SSH with different keys per host alias in `~/.ssh/config`, or per-repo credential URLs | +| `gh: command not found` + no sudo | Use git-only Method 1 above — no installation needed | diff --git a/website/docs/user-guide/skills/bundled/github/github-github-code-review.md b/website/docs/user-guide/skills/bundled/github/github-github-code-review.md new file mode 100644 index 000000000..9a18c45e1 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/github/github-github-code-review.md @@ -0,0 +1,498 @@ +--- +title: "Github Code Review" +sidebar_label: "Github Code Review" +description: "Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Github Code Review + +Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Works with gh CLI or falls back to git + GitHub REST API via curl. 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/github/github-code-review` | +| Version | `1.1.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `GitHub`, `Code-Review`, `Pull-Requests`, `Git`, `Quality` | +| Related skills | [`github-auth`](/docs/user-guide/skills/bundled/github/github-github-auth), [`github-pr-workflow`](/docs/user-guide/skills/bundled/github/github-github-pr-workflow) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# GitHub Code Review + +Perform code reviews on local changes before pushing, or review open PRs on GitHub. Most of this skill uses plain `git` — the `gh`/`curl` split only matters for PR-level interactions. + +## Prerequisites + +- Authenticated with GitHub (see `github-auth` skill) +- Inside a git repository + +### Setup (for PR interactions) + +```bash +if command -v gh &>/dev/null && gh auth status &>/dev/null; then + AUTH="gh" +else + AUTH="git" + if [ -z "$GITHUB_TOKEN" ]; then + if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then + GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r') + elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then + GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|') + fi + fi +fi + +REMOTE_URL=$(git remote get-url origin) +OWNER_REPO=$(echo "$REMOTE_URL" | sed -E 's|.*github\.com[:/]||; s|\.git$||') +OWNER=$(echo "$OWNER_REPO" | cut -d/ -f1) +REPO=$(echo "$OWNER_REPO" | cut -d/ -f2) +``` + +--- + +## 1. Reviewing Local Changes (Pre-Push) + +This is pure `git` — works everywhere, no API needed. 
+ +### Get the Diff + +```bash +# Staged changes (what would be committed) +git diff --staged + +# All changes vs main (what a PR would contain) +git diff main...HEAD + +# File names only +git diff main...HEAD --name-only + +# Stat summary (insertions/deletions per file) +git diff main...HEAD --stat +``` + +### Review Strategy + +1. **Get the big picture first:** + +```bash +git diff main...HEAD --stat +git log main..HEAD --oneline +``` + +2. **Review file by file** — use `read_file` on changed files for full context, and the diff to see what changed: + +```bash +git diff main...HEAD -- src/auth/login.py +``` + +3. **Check for common issues:** + +```bash +# Debug statements, TODOs, console.logs left behind +git diff main...HEAD | grep -n "print(\|console\.log\|TODO\|FIXME\|HACK\|XXX\|debugger" + +# Large files accidentally staged +git diff main...HEAD --stat | sort -t'|' -k2 -rn | head -10 + +# Secrets or credential patterns +git diff main...HEAD | grep -in "password\|secret\|api_key\|token.*=\|private_key" + +# Merge conflict markers +git diff main...HEAD | grep -n "<<<<<<\|>>>>>>\|=======" +``` + +4. **Present structured feedback** to the user. + +### Review Output Format + +When reviewing local changes, present findings in this structure: + +``` +## Code Review Summary + +### Critical +- **src/auth.py:45** — SQL injection: user input passed directly to query. + Suggestion: Use parameterized queries. + +### Warnings +- **src/models/user.py:23** — Password stored in plaintext. Use bcrypt or argon2. +- **src/api/routes.py:112** — No rate limiting on login endpoint. + +### Suggestions +- **src/utils/helpers.py:8** — Duplicates logic in `src/core/utils.py:34`. Consolidate. +- **tests/test_auth.py** — Missing edge case: expired token test. + +### Looks Good +- Clean separation of concerns in the middleware layer +- Good test coverage for the happy path +``` + +--- + +## 2. 
Reviewing a Pull Request on GitHub + +### View PR Details + +**With gh:** + +```bash +gh pr view 123 +gh pr diff 123 +gh pr diff 123 --name-only +``` + +**With git + curl:** + +```bash +PR_NUMBER=123 + +# Get PR details +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER \ + | python3 -c " +import sys, json +pr = json.load(sys.stdin) +print(f\"Title: {pr['title']}\") +print(f\"Author: {pr['user']['login']}\") +print(f\"Branch: {pr['head']['ref']} -> {pr['base']['ref']}\") +print(f\"State: {pr['state']}\") +print(f\"Body:\n{pr['body']}\")" + +# List changed files +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/files \ + | python3 -c " +import sys, json +for f in json.load(sys.stdin): + print(f\"{f['status']:10} +{f['additions']:-4} -{f['deletions']:-4} {f['filename']}\")" +``` + +### Check Out PR Locally for Full Review + +This works with plain `git` — no `gh` needed: + +```bash +# Fetch the PR branch and check it out +git fetch origin pull/123/head:pr-123 +git checkout pr-123 + +# Now you can use read_file, search_files, run tests, etc. + +# View diff against the base branch +git diff main...pr-123 +``` + +**With gh (shortcut):** + +```bash +gh pr checkout 123 +``` + +### Leave Comments on a PR + +**General PR comment — with gh:** + +```bash +gh pr comment 123 --body "Overall looks good, a few suggestions below." 
+``` + +**General PR comment — with curl:** + +```bash +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues/$PR_NUMBER/comments \ + -d '{"body": "Overall looks good, a few suggestions below."}' +``` + +### Leave Inline Review Comments + +**Single inline comment — with gh (via API):** + +```bash +HEAD_SHA=$(gh pr view 123 --json headRefOid --jq '.headRefOid') + +# Use -F (typed field) for line so it is sent as an integer; -f would send a string, which the API rejects +gh api repos/$OWNER/$REPO/pulls/123/comments \ + --method POST \ + -f body="This could be simplified with a list comprehension." \ + -f path="src/auth/login.py" \ + -f commit_id="$HEAD_SHA" \ + -F line=45 \ + -f side="RIGHT" +``` + +**Single inline comment — with curl:** + +```bash +# Get the head commit SHA +HEAD_SHA=$(curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER \ + | python3 -c "import sys,json; print(json.load(sys.stdin)['head']['sha'])") + +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/comments \ + -d "{ + \"body\": \"This could be simplified with a list comprehension.\", + \"path\": \"src/auth/login.py\", + \"commit_id\": \"$HEAD_SHA\", + \"line\": 45, + \"side\": \"RIGHT\" + }" +``` + +### Submit a Formal Review (Approve / Request Changes) + +**With gh:** + +```bash +gh pr review 123 --approve --body "LGTM!" +gh pr review 123 --request-changes --body "See inline comments." +gh pr review 123 --comment --body "Some suggestions, nothing blocking."
+``` + +**With curl — multi-comment review submitted atomically:** + +```bash +HEAD_SHA=$(curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER \ + | python3 -c "import sys,json; print(json.load(sys.stdin)['head']['sha'])") + +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/reviews \ + -d "{ + \"commit_id\": \"$HEAD_SHA\", + \"event\": \"COMMENT\", + \"body\": \"Code review from Hermes Agent\", + \"comments\": [ + {\"path\": \"src/auth.py\", \"line\": 45, \"body\": \"Use parameterized queries to prevent SQL injection.\"}, + {\"path\": \"src/models/user.py\", \"line\": 23, \"body\": \"Hash passwords with bcrypt before storing.\"}, + {\"path\": \"tests/test_auth.py\", \"line\": 1, \"body\": \"Add test for expired token edge case.\"} + ] + }" +``` + +Event values: `"APPROVE"`, `"REQUEST_CHANGES"`, `"COMMENT"` + +The `line` field refers to the line number in the *new* version of the file. For deleted lines, use `"side": "LEFT"`. + +--- + +## 3. Review Checklist + +When performing a code review (local or PR), systematically check: + +### Correctness +- Does the code do what it claims? +- Edge cases handled (empty inputs, nulls, large data, concurrent access)? +- Error paths handled gracefully? + +### Security +- No hardcoded secrets, credentials, or API keys +- Input validation on user-facing inputs +- No SQL injection, XSS, or path traversal +- Auth/authz checks where needed + +### Code Quality +- Clear naming (variables, functions, classes) +- No unnecessary complexity or premature abstraction +- DRY — no duplicated logic that should be extracted +- Functions are focused (single responsibility) + +### Testing +- New code paths tested? +- Happy path and error cases covered? +- Tests readable and maintainable? 
+ +### Performance +- No N+1 queries or unnecessary loops +- Appropriate caching where beneficial +- No blocking operations in async code paths + +### Documentation +- Public APIs documented +- Non-obvious logic has comments explaining "why" +- README updated if behavior changed + +--- + +## 4. Pre-Push Review Workflow + +When the user asks you to "review the code" or "check before pushing": + +1. `git diff main...HEAD --stat` — see scope of changes +2. `git diff main...HEAD` — read the full diff +3. For each changed file, use `read_file` if you need more context +4. Apply the checklist above +5. Present findings in the structured format (Critical / Warnings / Suggestions / Looks Good) +6. If critical issues found, offer to fix them before the user pushes + +--- + +## 5. PR Review Workflow (End-to-End) + +When the user asks you to "review PR #N", "look at this PR", or gives you a PR URL, follow this recipe: + +### Step 1: Set up environment + +```bash +source "${HERMES_HOME:-$HOME/.hermes}/skills/github/github-auth/scripts/gh-env.sh" +# Or run the inline setup block from the top of this skill +``` + +### Step 2: Gather PR context + +Get the PR metadata, description, and list of changed files to understand scope before diving into code. + +**With gh:** +```bash +gh pr view 123 +gh pr diff 123 --name-only +gh pr checks 123 +``` + +**With curl:** +```bash +PR_NUMBER=123 + +# PR details (title, author, description, branch) +curl -s -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$GH_OWNER/$GH_REPO/pulls/$PR_NUMBER + +# Changed files with line counts +curl -s -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$GH_OWNER/$GH_REPO/pulls/$PR_NUMBER/files +``` + +### Step 3: Check out the PR locally + +This gives you full access to `read_file`, `search_files`, and the ability to run tests. 
+ +```bash +git fetch origin pull/$PR_NUMBER/head:pr-$PR_NUMBER +git checkout pr-$PR_NUMBER +``` + +### Step 4: Read the diff and understand changes + +```bash +# Full diff against the base branch +git diff main...HEAD + +# Or file-by-file for large PRs +git diff main...HEAD --name-only +# Then for each file: +git diff main...HEAD -- path/to/file.py +``` + +For each changed file, use `read_file` to see full context around the changes — diffs alone can miss issues visible only with surrounding code. + +### Step 5: Run automated checks locally (if applicable) + +```bash +# Run tests if there's a test suite +python -m pytest 2>&1 | tail -20 +# or: npm test, cargo test, go test ./..., etc. + +# Run linter if configured +ruff check . 2>&1 | head -30 +# or: eslint, clippy, etc. +``` + +### Step 6: Apply the review checklist (Section 3) + +Go through each category: Correctness, Security, Code Quality, Testing, Performance, Documentation. + +### Step 7: Post the review to GitHub + +Collect your findings and submit them as a formal review with inline comments. + +**With gh:** +```bash +# If no issues — approve +gh pr review $PR_NUMBER --approve --body "Reviewed by Hermes Agent. Code looks clean — good test coverage, no security concerns." + +# If issues found — request changes with inline comments +gh pr review $PR_NUMBER --request-changes --body "Found a few issues — see inline comments." 
+``` + +**With curl — atomic review with multiple inline comments:** +```bash +HEAD_SHA=$(curl -s -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$GH_OWNER/$GH_REPO/pulls/$PR_NUMBER \ + | python3 -c "import sys,json; print(json.load(sys.stdin)['head']['sha'])") + +# Build the review JSON — event is APPROVE, REQUEST_CHANGES, or COMMENT +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$GH_OWNER/$GH_REPO/pulls/$PR_NUMBER/reviews \ + -d "{ + \"commit_id\": \"$HEAD_SHA\", + \"event\": \"REQUEST_CHANGES\", + \"body\": \"## Hermes Agent Review\n\nFound 2 issues, 1 suggestion. See inline comments.\", + \"comments\": [ + {\"path\": \"src/auth.py\", \"line\": 45, \"body\": \"🔴 **Critical:** User input passed directly to SQL query — use parameterized queries.\"}, + {\"path\": \"src/models.py\", \"line\": 23, \"body\": \"⚠️ **Warning:** Password stored without hashing.\"}, + {\"path\": \"src/utils.py\", \"line\": 8, \"body\": \"💡 **Suggestion:** This duplicates logic in core/utils.py:34.\"} + ] + }" +``` + +### Step 8: Also post a summary comment + +In addition to inline comments, leave a top-level summary so the PR author gets the full picture at a glance. Use the review output format from `references/review-output-template.md`. 
+ +**With gh:** +```bash +gh pr comment $PR_NUMBER --body "$(cat <<'EOF' +## Code Review Summary + +**Verdict: Changes Requested** (2 issues, 1 suggestion) + +### 🔴 Critical +- **src/auth.py:45** — SQL injection vulnerability + +### ⚠️ Warnings +- **src/models.py:23** — Plaintext password storage + +### 💡 Suggestions +- **src/utils.py:8** — Duplicated logic, consider consolidating + +### ✅ Looks Good +- Clean API design +- Good error handling in the middleware layer + +--- +*Reviewed by Hermes Agent* +EOF +)" +``` + +### Step 9: Clean up + +```bash +git checkout main +git branch -D pr-$PR_NUMBER +``` + +### Decision: Approve vs Request Changes vs Comment + +- **Approve** — no critical or warning-level issues, only minor suggestions or all clear +- **Request Changes** — any critical or warning-level issue that should be fixed before merge +- **Comment** — observations and suggestions, but nothing blocking (use when you're unsure or the PR is a draft) diff --git a/website/docs/user-guide/skills/bundled/github/github-github-issues.md b/website/docs/user-guide/skills/bundled/github/github-github-issues.md new file mode 100644 index 000000000..8493663cd --- /dev/null +++ b/website/docs/user-guide/skills/bundled/github/github-github-issues.md @@ -0,0 +1,387 @@ +--- +title: "Github Issues — Create, manage, triage, and close GitHub issues" +sidebar_label: "Github Issues" +description: "Create, manage, triage, and close GitHub issues" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Github Issues + +Create, manage, triage, and close GitHub issues. Search existing issues, add labels, assign people, and link to PRs. Works with gh CLI or falls back to git + GitHub REST API via curl. 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/github/github-issues` | +| Version | `1.1.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `GitHub`, `Issues`, `Project-Management`, `Bug-Tracking`, `Triage` | +| Related skills | [`github-auth`](/docs/user-guide/skills/bundled/github/github-github-auth), [`github-pr-workflow`](/docs/user-guide/skills/bundled/github/github-github-pr-workflow) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# GitHub Issues Management + +Create, search, triage, and manage GitHub issues. Each section shows `gh` first, then the `curl` fallback. + +## Prerequisites + +- Authenticated with GitHub (see `github-auth` skill) +- Inside a git repo with a GitHub remote, or specify the repo explicitly + +### Setup + +```bash +if command -v gh &>/dev/null && gh auth status &>/dev/null; then + AUTH="gh" +else + AUTH="git" + if [ -z "$GITHUB_TOKEN" ]; then + if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then + GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r') + elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then + GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|') + fi + fi +fi + +REMOTE_URL=$(git remote get-url origin) +OWNER_REPO=$(echo "$REMOTE_URL" | sed -E 's|.*github\.com[:/]||; s|\.git$||') +OWNER=$(echo "$OWNER_REPO" | cut -d/ -f1) +REPO=$(echo "$OWNER_REPO" | cut -d/ -f2) +``` + +--- + +## 1. 
Viewing Issues + +**With gh:** + +```bash +gh issue list +gh issue list --state open --label "bug" +gh issue list --assignee @me +gh issue list --search "authentication error" --state all +gh issue view 42 +``` + +**With curl:** + +```bash +# List open issues +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/repos/$OWNER/$REPO/issues?state=open&per_page=20" \ + | python3 -c " +import sys, json +for i in json.load(sys.stdin): + if 'pull_request' not in i: # GitHub API returns PRs in /issues too + labels = ', '.join(l['name'] for l in i['labels']) + print(f\"#{i['number']:5} {i['state']:6} {labels:30} {i['title']}\")" + +# Filter by label +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/repos/$OWNER/$REPO/issues?state=open&labels=bug&per_page=20" \ + | python3 -c " +import sys, json +for i in json.load(sys.stdin): + if 'pull_request' not in i: + print(f\"#{i['number']} {i['title']}\")" + +# View a specific issue +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues/42 \ + | python3 -c " +import sys, json +i = json.load(sys.stdin) +labels = ', '.join(l['name'] for l in i['labels']) +assignees = ', '.join(a['login'] for a in i['assignees']) +print(f\"#{i['number']}: {i['title']}\") +print(f\"State: {i['state']} Labels: {labels} Assignees: {assignees}\") +print(f\"Author: {i['user']['login']} Created: {i['created_at']}\") +print(f\"\n{i['body']}\")" + +# Search issues +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/search/issues?q=authentication+error+repo:$OWNER/$REPO" \ + | python3 -c " +import sys, json +for i in json.load(sys.stdin)['items']: + print(f\"#{i['number']} {i['state']:6} {i['title']}\")" +``` + +## 2. Creating Issues + +**With gh:** + +```bash +gh issue create \ + --title "Login redirect ignores ?next= parameter" \ + --body "## Description +After logging in, users always land on /dashboard. 
+ +## Steps to Reproduce +1. Navigate to /settings while logged out +2. Get redirected to /login?next=/settings +3. Log in +4. Actual: redirected to /dashboard (should go to /settings) + +## Expected Behavior +Respect the ?next= query parameter." \ + --label "bug,backend" \ + --assignee "username" +``` + +**With curl:** + +```bash +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues \ + -d '{ + "title": "Login redirect ignores ?next= parameter", + "body": "## Description\nAfter logging in, users always land on /dashboard.\n\n## Steps to Reproduce\n1. Navigate to /settings while logged out\n2. Get redirected to /login?next=/settings\n3. Log in\n4. Actual: redirected to /dashboard\n\n## Expected Behavior\nRespect the ?next= query parameter.", + "labels": ["bug", "backend"], + "assignees": ["username"] + }' +``` + +### Bug Report Template + +``` +## Bug Description + + +## Steps to Reproduce +1. +2. + +## Expected Behavior + + +## Actual Behavior + + +## Environment +- OS: +- Version: +``` + +### Feature Request Template + +``` +## Feature Description + + +## Motivation + + +## Proposed Solution + + +## Alternatives Considered + +``` + +## 3. 
Managing Issues + +### Add/Remove Labels + +**With gh:** + +```bash +gh issue edit 42 --add-label "priority:high,bug" +gh issue edit 42 --remove-label "needs-triage" +``` + +**With curl:** + +```bash +# Add labels +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues/42/labels \ + -d '{"labels": ["priority:high", "bug"]}' + +# Remove a label +curl -s -X DELETE \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues/42/labels/needs-triage + +# List available labels in the repo +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/labels \ + | python3 -c " +import sys, json +for l in json.load(sys.stdin): + print(f\" {l['name']:30} {l.get('description', '')}\")" +``` + +### Assignment + +**With gh:** + +```bash +gh issue edit 42 --add-assignee username +gh issue edit 42 --add-assignee @me +``` + +**With curl:** + +```bash +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues/42/assignees \ + -d '{"assignees": ["username"]}' +``` + +### Commenting + +**With gh:** + +```bash +gh issue comment 42 --body "Investigated — root cause is in auth middleware. Working on a fix." +``` + +**With curl:** + +```bash +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues/42/comments \ + -d '{"body": "Investigated — root cause is in auth middleware. 
Working on a fix."}' +``` + +### Closing and Reopening + +**With gh:** + +```bash +gh issue close 42 +gh issue close 42 --reason "not planned" +gh issue reopen 42 +``` + +**With curl:** + +```bash +# Close +curl -s -X PATCH \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues/42 \ + -d '{"state": "closed", "state_reason": "completed"}' + +# Reopen +curl -s -X PATCH \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues/42 \ + -d '{"state": "open"}' +``` + +### Linking Issues to PRs + +Issues are automatically closed when a PR merges with the right keywords in the body: + +``` +Closes #42 +Fixes #42 +Resolves #42 +``` + +To create a branch from an issue: + +**With gh:** + +```bash +gh issue develop 42 --checkout +``` + +**With git (manual equivalent):** + +```bash +git checkout main && git pull origin main +git checkout -b fix/issue-42-login-redirect +``` + +## 4. Issue Triage Workflow + +When asked to triage issues: + +1. **List untriaged issues:** + +```bash +# With gh +gh issue list --label "needs-triage" --state open + +# With curl +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/repos/$OWNER/$REPO/issues?labels=needs-triage&state=open" \ + | python3 -c " +import sys, json +for i in json.load(sys.stdin): + if 'pull_request' not in i: + print(f\"#{i['number']} {i['title']}\")" +``` + +2. **Read and categorize** each issue (view details, understand the bug/feature) + +3. **Apply labels and priority** (see Managing Issues above) + +4. **Assign** if the owner is clear + +5. **Comment with triage notes** if needed + +## 5. 
Bulk Operations + +For batch operations, combine API calls with shell scripting: + +**With gh:** + +```bash +# Close all issues with a specific label +gh issue list --label "wontfix" --json number --jq '.[].number' | \ + xargs -I {} gh issue close {} --reason "not planned" +``` + +**With curl:** + +```bash +# List issue numbers with a label, then close each +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/repos/$OWNER/$REPO/issues?labels=wontfix&state=open" \ + | python3 -c "import sys,json; [print(i['number']) for i in json.load(sys.stdin)]" \ + | while read num; do + curl -s -X PATCH \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/issues/$num \ + -d '{"state": "closed", "state_reason": "not_planned"}' + echo "Closed #$num" + done +``` + +## Quick Reference Table + +| Action | gh | curl endpoint | +|--------|-----|--------------| +| List issues | `gh issue list` | `GET /repos/{o}/{r}/issues` | +| View issue | `gh issue view N` | `GET /repos/{o}/{r}/issues/N` | +| Create issue | `gh issue create ...` | `POST /repos/{o}/{r}/issues` | +| Add labels | `gh issue edit N --add-label ...` | `POST /repos/{o}/{r}/issues/N/labels` | +| Assign | `gh issue edit N --add-assignee ...` | `POST /repos/{o}/{r}/issues/N/assignees` | +| Comment | `gh issue comment N --body ...` | `POST /repos/{o}/{r}/issues/N/comments` | +| Close | `gh issue close N` | `PATCH /repos/{o}/{r}/issues/N` | +| Search | `gh issue list --search "..."` | `GET /search/issues?q=...` | diff --git a/website/docs/user-guide/skills/bundled/github/github-github-pr-workflow.md b/website/docs/user-guide/skills/bundled/github/github-github-pr-workflow.md new file mode 100644 index 000000000..f1a31e157 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/github/github-github-pr-workflow.md @@ -0,0 +1,384 @@ +--- +title: "Github Pr Workflow" +sidebar_label: "Github Pr Workflow" +description: "Full pull request lifecycle — create branches, 
commit changes, open PRs, monitor CI status, auto-fix failures, and merge" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Github Pr Workflow + +Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge. Works with gh CLI or falls back to git + GitHub REST API via curl. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/github/github-pr-workflow` | +| Version | `1.1.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `GitHub`, `Pull-Requests`, `CI/CD`, `Git`, `Automation`, `Merge` | +| Related skills | [`github-auth`](/docs/user-guide/skills/bundled/github/github-github-auth), [`github-code-review`](/docs/user-guide/skills/bundled/github/github-github-code-review) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# GitHub Pull Request Workflow + +Complete guide for managing the PR lifecycle. Each section shows the `gh` way first, then the `git` + `curl` fallback for machines without `gh`. 
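The gh-first, curl-fallback pattern that every section repeats can also be captured once in a helper. A minimal sketch, assuming `$GITHUB_TOKEN` is set as in the Prerequisites below and that API paths are relative to `https://api.github.com` — `gh_api` and its argument order are illustrative, not part of the skill:

```bash
# Hypothetical helper: send a REST call through gh when it is installed
# and authenticated, otherwise fall back to curl with $GITHUB_TOKEN.
# Usage: gh_api METHOD /path [json-body]
gh_api() {
  local method="$1" path="$2" body="${3:-}"
  if command -v gh &>/dev/null && gh auth status &>/dev/null; then
    if [ -n "$body" ]; then
      gh api --method "$method" "$path" --input - <<<"$body"
    else
      gh api --method "$method" "$path"
    fi
  else
    curl -s -X "$method" \
      -H "Authorization: token $GITHUB_TOKEN" \
      ${body:+-d "$body"} \
      "https://api.github.com$path"
  fi
}
```

With this in place, a call like `gh_api POST /repos/$OWNER/$REPO/pulls "$PR_JSON"` works with either auth method.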
+ +## Prerequisites + +- Authenticated with GitHub (see `github-auth` skill) +- Inside a git repository with a GitHub remote + +### Quick Auth Detection + +```bash +# Determine which method to use throughout this workflow +if command -v gh &>/dev/null && gh auth status &>/dev/null; then + AUTH="gh" +else + AUTH="git" + # Ensure we have a token for API calls + if [ -z "$GITHUB_TOKEN" ]; then + if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then + GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r') + elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then + GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|') + fi + fi +fi +echo "Using: $AUTH" +``` + +### Extracting Owner/Repo from the Git Remote + +Many `curl` commands need `owner/repo`. Extract it from the git remote: + +```bash +# Works for both HTTPS and SSH remote URLs +REMOTE_URL=$(git remote get-url origin) +OWNER_REPO=$(echo "$REMOTE_URL" | sed -E 's|.*github\.com[:/]||; s|\.git$||') +OWNER=$(echo "$OWNER_REPO" | cut -d/ -f1) +REPO=$(echo "$OWNER_REPO" | cut -d/ -f2) +echo "Owner: $OWNER, Repo: $REPO" +``` + +--- + +## 1. Branch Creation + +This part is pure `git` — identical either way: + +```bash +# Make sure you're up to date +git fetch origin +git checkout main && git pull origin main + +# Create and switch to a new branch +git checkout -b feat/add-user-authentication +``` + +Branch naming conventions: +- `feat/description` — new features +- `fix/description` — bug fixes +- `refactor/description` — code restructuring +- `docs/description` — documentation +- `ci/description` — CI/CD changes + +## 2. 
Making Commits + +Use the agent's file tools (`write_file`, `patch`) to make changes, then commit: + +```bash +# Stage specific files +git add src/auth.py src/models/user.py tests/test_auth.py + +# Commit with a conventional commit message +git commit -m "feat: add JWT-based user authentication + +- Add login/register endpoints +- Add User model with password hashing +- Add auth middleware for protected routes +- Add unit tests for auth flow" +``` + +Commit message format (Conventional Commits): +``` +type(scope): short description + +Longer explanation if needed. Wrap at 72 characters. +``` + +Types: `feat`, `fix`, `refactor`, `docs`, `test`, `ci`, `chore`, `perf` + +## 3. Pushing and Creating a PR + +### Push the Branch (same either way) + +```bash +git push -u origin HEAD +``` + +### Create the PR + +**With gh:** + +```bash +gh pr create \ + --title "feat: add JWT-based user authentication" \ + --body "## Summary +- Adds login and register API endpoints +- JWT token generation and validation + +## Test Plan +- [ ] Unit tests pass + +Closes #42" +``` + +Options: `--draft`, `--reviewer user1,user2`, `--label "enhancement"`, `--base develop` + +**With git + curl:** + +```bash +BRANCH=$(git branch --show-current) + +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + -H "Accept: application/vnd.github.v3+json" \ + https://api.github.com/repos/$OWNER/$REPO/pulls \ + -d "{ + \"title\": \"feat: add JWT-based user authentication\", + \"body\": \"## Summary\nAdds login and register API endpoints.\n\nCloses #42\", + \"head\": \"$BRANCH\", + \"base\": \"main\" + }" +``` + +The response JSON includes the PR `number` — save it for later commands. + +To create as a draft, add `"draft": true` to the JSON body. + +## 4. 
Monitoring CI Status + +### Check CI Status + +**With gh:** + +```bash +# One-shot check +gh pr checks + +# Watch until all checks finish (polls every 10s) +gh pr checks --watch +``` + +**With git + curl:** + +```bash +# Get the latest commit SHA on the current branch +SHA=$(git rev-parse HEAD) + +# Query the combined status +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/commits/$SHA/status \ + | python3 -c " +import sys, json +data = json.load(sys.stdin) +print(f\"Overall: {data['state']}\") +for s in data.get('statuses', []): + print(f\" {s['context']}: {s['state']} - {s.get('description', '')}\")" + +# Also check GitHub Actions check runs (separate endpoint) +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/commits/$SHA/check-runs \ + | python3 -c " +import sys, json +data = json.load(sys.stdin) +for cr in data.get('check_runs', []): + print(f\" {cr['name']}: {cr['status']} / {cr['conclusion'] or 'pending'}\")" +``` + +### Poll Until Complete (git + curl) + +```bash +# Simple polling loop — check every 30 seconds, up to 10 minutes +SHA=$(git rev-parse HEAD) +for i in $(seq 1 20); do + STATUS=$(curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/commits/$SHA/status \ + | python3 -c "import sys,json; print(json.load(sys.stdin)['state'])") + echo "Check $i: $STATUS" + if [ "$STATUS" = "success" ] || [ "$STATUS" = "failure" ] || [ "$STATUS" = "error" ]; then + break + fi + sleep 30 +done +``` + +## 5. Auto-Fixing CI Failures + +When CI fails, diagnose and fix. This loop works with either auth method. 
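The fix loop can be sketched end-to-end before diving into the individual steps. `autofix_ci`, `ci_state`, and `fix_ci_failure` are hypothetical names — the last is a stand-in for the agent's diagnose-and-patch work, which this sketch does not implement:

```bash
# Sketch of a bounded auto-fix loop. ci_state wraps the combined-status
# query from Section 4; fix_ci_failure is a placeholder supplied elsewhere.
ci_state() {
  curl -s -H "Authorization: token $GITHUB_TOKEN" \
    "https://api.github.com/repos/$OWNER/$REPO/commits/$(git rev-parse HEAD)/status" \
    | python3 -c "import sys, json; print(json.load(sys.stdin)['state'])"
}

autofix_ci() {
  local max_attempts="${1:-3}" attempt state
  for attempt in $(seq 1 "$max_attempts"); do
    state=$(ci_state)
    echo "Attempt $attempt: CI state is $state"
    [ "$state" = "success" ] && return 0
    fix_ci_failure || return 1  # placeholder: read logs, patch, commit
    git push
    sleep 30                    # let the new CI run start before re-checking
  done
  echo "Still failing after $max_attempts attempts; ask the user"
  return 1
}
```

The 3-attempt default mirrors the loop pattern at the end of this section: fix, push, re-check, and escalate to the user if CI is still red.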
+ +### Step 1: Get Failure Details + +**With gh:** + +```bash +# List recent workflow runs on this branch +gh run list --branch $(git branch --show-current) --limit 5 + +# View failed logs +gh run view --log-failed +``` + +**With git + curl:** + +```bash +BRANCH=$(git branch --show-current) + +# List workflow runs on this branch +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/repos/$OWNER/$REPO/actions/runs?branch=$BRANCH&per_page=5" \ + | python3 -c " +import sys, json +runs = json.load(sys.stdin)['workflow_runs'] +for r in runs: + print(f\"Run {r['id']}: {r['name']} - {r['conclusion'] or r['status']}\")" + +# Get failed job logs (download as zip, extract, read) +RUN_ID= +curl -s -L \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/actions/runs/$RUN_ID/logs \ + -o /tmp/ci-logs.zip +cd /tmp && unzip -o ci-logs.zip -d ci-logs && cat ci-logs/*.txt +``` + +### Step 2: Fix and Push + +After identifying the issue, use file tools (`patch`, `write_file`) to fix it: + +```bash +git add <files> +git commit -m "fix: resolve CI failure in <area>" +git push +``` + +### Step 3: Verify + +Re-check CI status using the commands from Section 4 above. + +### Auto-Fix Loop Pattern + +When asked to auto-fix CI, follow this loop: + +1. Check CI status → identify failures +2. Read failure logs → understand the error +3. Use `read_file` + `patch`/`write_file` → fix the code +4. `git add . && git commit -m "fix: ..." && git push` +5. Wait for CI → re-check status +6. Repeat if still failing (up to 3 attempts, then ask the user) + +## 6. 
Merging + +**With gh:** + +```bash +# Squash merge + delete branch (cleanest for feature branches) +gh pr merge --squash --delete-branch + +# Enable auto-merge (merges when all checks pass) +gh pr merge --auto --squash --delete-branch +``` + +**With git + curl:** + +```bash +PR_NUMBER= + +# Merge the PR via API (squash) +curl -s -X PUT \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER/merge \ + -d "{ + \"merge_method\": \"squash\", + \"commit_title\": \"feat: add user authentication (#$PR_NUMBER)\" + }" + +# Delete the remote branch after merge +BRANCH=$(git branch --show-current) +git push origin --delete $BRANCH + +# Switch back to main locally +git checkout main && git pull origin main +git branch -d $BRANCH +``` + +Merge methods: `"merge"` (merge commit), `"squash"`, `"rebase"` + +### Enable Auto-Merge (curl) + +```bash +# Auto-merge requires the repo to have it enabled in settings. +# This uses the GraphQL API since REST doesn't support auto-merge. +PR_NODE_ID=$(curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/pulls/$PR_NUMBER \ + | python3 -c "import sys,json; print(json.load(sys.stdin)['node_id'])") + +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/graphql \ + -d "{\"query\": \"mutation { enablePullRequestAutoMerge(input: {pullRequestId: \\\"$PR_NODE_ID\\\", mergeMethod: SQUASH}) { clientMutationId } }\"}" +``` + +## 7. Complete Workflow Example + +```bash +# 1. Start from clean main +git checkout main && git pull origin main + +# 2. Branch +git checkout -b fix/login-redirect-bug + +# 3. (Agent makes code changes with file tools) + +# 4. Commit +git add src/auth/login.py tests/test_login.py +git commit -m "fix: correct redirect URL after login + +Preserves the ?next= parameter instead of always redirecting to /dashboard." + +# 5. Push +git push -u origin HEAD + +# 6. 
Create PR (picks gh or curl based on what's available) +# ... (see Section 3) + +# 7. Monitor CI (see Section 4) + +# 8. Merge when green (see Section 6) +``` + +## Useful PR Commands Reference + +| Action | gh | git + curl | +|--------|-----|-----------| +| List my PRs | `gh pr list --author @me` | `curl -s -H "Authorization: token $GITHUB_TOKEN" "https://api.github.com/repos/$OWNER/$REPO/pulls?state=open"` | +| View PR diff | `gh pr diff` | `git diff main...HEAD` (local) or `curl -H "Accept: application/vnd.github.diff" ...` | +| Add comment | `gh pr comment N --body "..."` | `curl -X POST .../issues/N/comments -d '{"body":"..."}'` | +| Request review | `gh pr edit N --add-reviewer user` | `curl -X POST .../pulls/N/requested_reviewers -d '{"reviewers":["user"]}'` | +| Close PR | `gh pr close N` | `curl -X PATCH .../pulls/N -d '{"state":"closed"}'` | +| Check out someone's PR | `gh pr checkout N` | `git fetch origin pull/N/head:pr-N && git checkout pr-N` | diff --git a/website/docs/user-guide/skills/bundled/github/github-github-repo-management.md b/website/docs/user-guide/skills/bundled/github/github-github-repo-management.md new file mode 100644 index 000000000..839225034 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/github/github-github-repo-management.md @@ -0,0 +1,533 @@ +--- +title: "Github Repo Management — Clone, create, fork, configure, and manage GitHub repositories" +sidebar_label: "Github Repo Management" +description: "Clone, create, fork, configure, and manage GitHub repositories" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Github Repo Management + +Clone, create, fork, configure, and manage GitHub repositories. Manage remotes, secrets, releases, and workflows. Works with gh CLI or falls back to git + GitHub REST API via curl. 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/github/github-repo-management` | +| Version | `1.1.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `GitHub`, `Repositories`, `Git`, `Releases`, `Secrets`, `Configuration` | +| Related skills | [`github-auth`](/docs/user-guide/skills/bundled/github/github-github-auth), [`github-pr-workflow`](/docs/user-guide/skills/bundled/github/github-github-pr-workflow), [`github-issues`](/docs/user-guide/skills/bundled/github/github-github-issues) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# GitHub Repository Management + +Create, clone, fork, configure, and manage GitHub repositories. Each section shows `gh` first, then the `git` + `curl` fallback. + +## Prerequisites + +- Authenticated with GitHub (see `github-auth` skill) + +### Setup + +```bash +if command -v gh &>/dev/null && gh auth status &>/dev/null; then + AUTH="gh" +else + AUTH="git" + if [ -z "$GITHUB_TOKEN" ]; then + if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then + GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r') + elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then + GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|') + fi + fi +fi + +# Get your GitHub username (needed for several operations) +if [ "$AUTH" = "gh" ]; then + GH_USER=$(gh api user --jq '.login') +else + GH_USER=$(curl -s -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/user | python3 -c "import sys,json; print(json.load(sys.stdin)['login'])") +fi +``` + +If you're inside a repo already: + +```bash +REMOTE_URL=$(git remote get-url origin) +OWNER_REPO=$(echo "$REMOTE_URL" | sed -E 's|.*github\.com[:/]||; 
s|\.git$||') +OWNER=$(echo "$OWNER_REPO" | cut -d/ -f1) +REPO=$(echo "$OWNER_REPO" | cut -d/ -f2) +``` + +--- + +## 1. Cloning Repositories + +Cloning is pure `git` — works identically either way: + +```bash +# Clone via HTTPS (works with credential helper or token-embedded URL) +git clone https://github.com/owner/repo-name.git + +# Clone into a specific directory +git clone https://github.com/owner/repo-name.git ./my-local-dir + +# Shallow clone (faster for large repos) +git clone --depth 1 https://github.com/owner/repo-name.git + +# Clone a specific branch +git clone --branch develop https://github.com/owner/repo-name.git + +# Clone via SSH (if SSH is configured) +git clone git@github.com:owner/repo-name.git +``` + +**With gh (shorthand):** + +```bash +gh repo clone owner/repo-name +gh repo clone owner/repo-name -- --depth 1 +``` + +## 2. Creating Repositories + +**With gh:** + +```bash +# Create a public repo and clone it +gh repo create my-new-project --public --clone + +# Private, with description and license +gh repo create my-new-project --private --description "A useful tool" --license MIT --clone + +# Under an organization +gh repo create my-org/my-new-project --public --clone + +# From existing local directory +cd /path/to/existing/project +gh repo create my-project --source . --public --push +``` + +**With git + curl:** + +```bash +# Create the remote repo via API +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/user/repos \ + -d '{ + "name": "my-new-project", + "description": "A useful tool", + "private": false, + "auto_init": true, + "license_template": "mit" + }' + +# Clone it +git clone https://github.com/$GH_USER/my-new-project.git +cd my-new-project + +# -- OR -- push an existing local directory to the new repo +cd /path/to/existing/project +git init +git add . 
+git commit -m "Initial commit" +git remote add origin https://github.com/$GH_USER/my-new-project.git +git push -u origin main +``` + +To create under an organization: + +```bash +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/orgs/my-org/repos \ + -d '{"name": "my-new-project", "private": false}' +``` + +### From a Template + +**With gh:** + +```bash +gh repo create my-new-app --template owner/template-repo --public --clone +``` + +**With curl:** + +```bash +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/owner/template-repo/generate \ + -d '{"owner": "'"$GH_USER"'", "name": "my-new-app", "private": false}' +``` + +## 3. Forking Repositories + +**With gh:** + +```bash +gh repo fork owner/repo-name --clone +``` + +**With git + curl:** + +```bash +# Create the fork via API +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/owner/repo-name/forks + +# Wait a moment for GitHub to create it, then clone +sleep 3 +git clone https://github.com/$GH_USER/repo-name.git +cd repo-name + +# Add the original repo as "upstream" remote +git remote add upstream https://github.com/owner/repo-name.git +``` + +### Keeping a Fork in Sync + +```bash +# Pure git — works everywhere +git fetch upstream +git checkout main +git merge upstream/main +git push origin main +``` + +**With gh (shortcut):** + +```bash +gh repo sync $GH_USER/repo-name +``` + +## 4. 
Repository Information + +**With gh:** + +```bash +gh repo view owner/repo-name +gh repo list --limit 20 +gh search repos "machine learning" --language python --sort stars +``` + +**With curl:** + +```bash +# View repo details +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO \ + | python3 -c " +import sys, json +r = json.load(sys.stdin) +print(f\"Name: {r['full_name']}\") +print(f\"Description: {r['description']}\") +print(f\"Stars: {r['stargazers_count']} Forks: {r['forks_count']}\") +print(f\"Default branch: {r['default_branch']}\") +print(f\"Language: {r['language']}\")" + +# List your repos +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/user/repos?per_page=20&sort=updated" \ + | python3 -c " +import sys, json +for r in json.load(sys.stdin): + vis = 'private' if r['private'] else 'public' + print(f\" {r['full_name']:40} {vis:8} {r.get('language') or '':10} ★{r['stargazers_count']}\")" + +# Search repos +curl -s \ + "https://api.github.com/search/repositories?q=machine+learning+language:python&sort=stars&per_page=10" \ + | python3 -c " +import sys, json +for r in json.load(sys.stdin)['items']: + print(f\" {r['full_name']:40} ★{r['stargazers_count']:6} {r['description'][:60] if r['description'] else ''}\")" +``` + +## 5. 
Repository Settings + +**With gh:** + +```bash +gh repo edit --description "Updated description" --visibility public +gh repo edit --enable-wiki=false --enable-issues=true +gh repo edit --default-branch main +gh repo edit --add-topic "machine-learning,python" +gh repo edit --enable-auto-merge +``` + +**With curl:** + +```bash +curl -s -X PATCH \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO \ + -d '{ + "description": "Updated description", + "has_wiki": false, + "has_issues": true, + "allow_auto_merge": true + }' + +# Update topics +curl -s -X PUT \ + -H "Authorization: token $GITHUB_TOKEN" \ + -H "Accept: application/vnd.github.mercy-preview+json" \ + https://api.github.com/repos/$OWNER/$REPO/topics \ + -d '{"names": ["machine-learning", "python", "automation"]}' +``` + +## 6. Branch Protection + +```bash +# View current protection +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/branches/main/protection + +# Set up branch protection +curl -s -X PUT \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/branches/main/protection \ + -d '{ + "required_status_checks": { + "strict": true, + "contexts": ["ci/test", "ci/lint"] + }, + "enforce_admins": false, + "required_pull_request_reviews": { + "required_approving_review_count": 1 + }, + "restrictions": null + }' +``` + +## 7. 
Secrets Management (GitHub Actions) + +**With gh:** + +```bash +gh secret set API_KEY --body "your-secret-value" +gh secret set SSH_KEY < ~/.ssh/id_rsa +gh secret list +gh secret delete API_KEY +``` + +**With curl:** + +Secrets require encryption with the repo's public key — more involved via API: + +```bash +# Get the repo's public key for encrypting secrets +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/actions/secrets/public-key + +# Encrypt and set (requires Python with PyNaCl) +python3 -c " +from base64 import b64encode +from nacl import encoding, public +import json, sys + +# Get the public key +key_id = '' +public_key = '' + +# Encrypt +sealed = public.SealedBox( + public.PublicKey(public_key.encode('utf-8'), encoding.Base64Encoder) +).encrypt('your-secret-value'.encode('utf-8')) +print(json.dumps({ + 'encrypted_value': b64encode(sealed).decode('utf-8'), + 'key_id': key_id +}))" + +# Then PUT the encrypted secret +curl -s -X PUT \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/actions/secrets/API_KEY \ + -d '' + +# List secrets (names only, values hidden) +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/actions/secrets \ + | python3 -c " +import sys, json +for s in json.load(sys.stdin)['secrets']: + print(f\" {s['name']:30} updated: {s['updated_at']}\")" +``` + +Note: For secrets, `gh secret set` is dramatically simpler. If setting secrets is needed and `gh` isn't available, recommend installing it for just that operation. + +## 8. 
Releases + +**With gh:** + +```bash +gh release create v1.0.0 --title "v1.0.0" --generate-notes +gh release create v2.0.0-rc1 --draft --prerelease --generate-notes +gh release create v1.0.0 ./dist/binary --title "v1.0.0" --notes "Release notes" +gh release list +gh release download v1.0.0 --dir ./downloads +``` + +**With curl:** + +```bash +# Create a release +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/releases \ + -d '{ + "tag_name": "v1.0.0", + "name": "v1.0.0", + "body": "## Changelog\n- Feature A\n- Bug fix B", + "draft": false, + "prerelease": false, + "generate_release_notes": true + }' + +# List releases +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/releases \ + | python3 -c " +import sys, json +for r in json.load(sys.stdin): + tag = r.get('tag_name', 'no tag') + print(f\" {tag:15} {r['name'] or '':30} {'draft' if r['draft'] else 'published'}\")" + +# Upload a release asset (binary file) +RELEASE_ID= +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + -H "Content-Type: application/octet-stream" \ + "https://uploads.github.com/repos/$OWNER/$REPO/releases/$RELEASE_ID/assets?name=binary-amd64" \ + --data-binary @./dist/binary-amd64 +``` + +## 9. 
GitHub Actions Workflows + +**With gh:** + +```bash +gh workflow list +gh run list --limit 10 +gh run view +gh run view --log-failed +gh run rerun +gh run rerun --failed +gh workflow run ci.yml --ref main +gh workflow run deploy.yml -f environment=staging +``` + +**With curl:** + +```bash +# List workflows +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/actions/workflows \ + | python3 -c " +import sys, json +for w in json.load(sys.stdin)['workflows']: + print(f\" {w['id']:10} {w['name']:30} {w['state']}\")" + +# List recent runs +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + "https://api.github.com/repos/$OWNER/$REPO/actions/runs?per_page=10" \ + | python3 -c " +import sys, json +for r in json.load(sys.stdin)['workflow_runs']: + print(f\" Run {r['id']} {r['name']:30} {r['conclusion'] or r['status']}\")" + +# Download failed run logs +RUN_ID= +curl -s -L \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/actions/runs/$RUN_ID/logs \ + -o /tmp/ci-logs.zip +cd /tmp && unzip -o ci-logs.zip -d ci-logs + +# Re-run a failed workflow +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/actions/runs/$RUN_ID/rerun + +# Re-run only failed jobs +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/actions/runs/$RUN_ID/rerun-failed-jobs + +# Trigger a workflow manually (workflow_dispatch) +WORKFLOW_ID= +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/repos/$OWNER/$REPO/actions/workflows/$WORKFLOW_ID/dispatches \ + -d '{"ref": "main", "inputs": {"environment": "staging"}}' +``` + +## 10. 
Gists + +**With gh:** + +```bash +gh gist create script.py --public --desc "Useful script" +gh gist list +``` + +**With curl:** + +```bash +# Create a gist +curl -s -X POST \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/gists \ + -d '{ + "description": "Useful script", + "public": true, + "files": { + "script.py": {"content": "print(\"hello\")"} + } + }' + +# List your gists +curl -s \ + -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/gists \ + | python3 -c " +import sys, json +for g in json.load(sys.stdin): + files = ', '.join(g['files'].keys()) + print(f\" {g['id']} {g['description'] or '(no desc)':40} {files}\")" +``` + +## Quick Reference Table + +| Action | gh | git + curl | +|--------|-----|-----------| +| Clone | `gh repo clone o/r` | `git clone https://github.com/o/r.git` | +| Create repo | `gh repo create name --public` | `curl POST /user/repos` | +| Fork | `gh repo fork o/r --clone` | `curl POST /repos/o/r/forks` + `git clone` | +| Repo info | `gh repo view o/r` | `curl GET /repos/o/r` | +| Edit settings | `gh repo edit --...` | `curl PATCH /repos/o/r` | +| Create release | `gh release create v1.0` | `curl POST /repos/o/r/releases` | +| List workflows | `gh workflow list` | `curl GET /repos/o/r/actions/workflows` | +| Rerun CI | `gh run rerun ID` | `curl POST /repos/o/r/actions/runs/ID/rerun` | +| Set secret | `gh secret set KEY` | `curl PUT /repos/o/r/actions/secrets/KEY` (+ encryption) | diff --git a/website/docs/user-guide/skills/bundled/mcp/mcp-native-mcp.md b/website/docs/user-guide/skills/bundled/mcp/mcp-native-mcp.md new file mode 100644 index 000000000..267c8c064 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/mcp/mcp-native-mcp.md @@ -0,0 +1,374 @@ +--- +title: "Native Mcp" +sidebar_label: "Native Mcp" +description: "Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools" +--- + +{/* This page 
is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Native Mcp + +Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools. Supports stdio and HTTP transports with automatic reconnection, security filtering, and zero-config tool injection. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/mcp/native-mcp` | +| Version | `1.0.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `MCP`, `Tools`, `Integrations` | +| Related skills | [`mcporter`](/docs/user-guide/skills/optional/mcp/mcp-mcporter) | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Native MCP Client + +Hermes Agent has a built-in MCP client that connects to MCP servers at startup, discovers their tools, and makes them available as first-class tools the agent can call directly. No bridge CLI needed -- tools from MCP servers appear alongside built-in tools like `terminal`, `read_file`, etc. + +## When to Use + +Use this whenever you want to: +- Connect to MCP servers and use their tools from within Hermes Agent +- Add external capabilities (filesystem access, GitHub, databases, APIs) via MCP +- Run local stdio-based MCP servers (npx, uvx, or any command) +- Connect to remote HTTP/StreamableHTTP MCP servers +- Have MCP tools auto-discovered and available in every conversation + +For ad-hoc, one-off MCP tool calls from the terminal without configuring anything, see the `mcporter` skill instead. + +## Prerequisites + +- **mcp Python package** -- optional dependency; install with `pip install mcp`. If not installed, MCP support is silently disabled. 
+- **Node.js** -- required for `npx`-based MCP servers (most community servers) +- **uv** -- required for `uvx`-based MCP servers (Python-based servers) + +Install the MCP SDK: + +```bash +pip install mcp +# or, if using uv: +uv pip install mcp +``` + +## Quick Start + +Add MCP servers to `~/.hermes/config.yaml` under the `mcp_servers` key: + +```yaml +mcp_servers: + time: + command: "uvx" + args: ["mcp-server-time"] +``` + +Restart Hermes Agent. On startup it will: +1. Connect to the server +2. Discover available tools +3. Register them with the prefix `mcp_time_*` +4. Inject them into all platform toolsets + +You can then use the tools naturally -- just ask the agent to get the current time. + +## Configuration Reference + +Each entry under `mcp_servers` is a server name mapped to its config. There are two transport types: **stdio** (command-based) and **HTTP** (url-based). + +### Stdio Transport (command + args) + +```yaml +mcp_servers: + server_name: + command: "npx" # (required) executable to run + args: ["-y", "pkg-name"] # (optional) command arguments, default: [] + env: # (optional) environment variables for the subprocess + SOME_API_KEY: "value" + timeout: 120 # (optional) per-tool-call timeout in seconds, default: 120 + connect_timeout: 60 # (optional) initial connection timeout in seconds, default: 60 +``` + +### HTTP Transport (url) + +```yaml +mcp_servers: + server_name: + url: "https://my-server.example.com/mcp" # (required) server URL + headers: # (optional) HTTP headers + Authorization: "Bearer sk-..." 
+ timeout: 180 # (optional) per-tool-call timeout in seconds, default: 120 + connect_timeout: 60 # (optional) initial connection timeout in seconds, default: 60 +``` + +### All Config Options + +| Option | Type | Default | Description | +|-------------------|--------|---------|---------------------------------------------------| +| `command` | string | -- | Executable to run (stdio transport, required) | +| `args` | list | `[]` | Arguments passed to the command | +| `env` | dict | `{}` | Extra environment variables for the subprocess | +| `url` | string | -- | Server URL (HTTP transport, required) | +| `headers` | dict | `{}` | HTTP headers sent with every request | +| `timeout` | int | `120` | Per-tool-call timeout in seconds | +| `connect_timeout` | int | `60` | Timeout for initial connection and discovery | + +Note: A server config must have either `command` (stdio) or `url` (HTTP), not both. + +## How It Works + +### Startup Discovery + +When Hermes Agent starts, `discover_mcp_tools()` is called during tool initialization: + +1. Reads `mcp_servers` from `~/.hermes/config.yaml` +2. For each server, spawns a connection in a dedicated background event loop +3. Initializes the MCP session and calls `list_tools()` to discover available tools +4. Registers each tool in the Hermes tool registry + +### Tool Naming Convention + +MCP tools are registered with the naming pattern: + +``` +mcp_{server_name}_{tool_name} +``` + +Hyphens and dots in names are replaced with underscores for LLM API compatibility. + +Examples: +- Server `filesystem`, tool `read_file` → `mcp_filesystem_read_file` +- Server `github`, tool `list-issues` → `mcp_github_list_issues` +- Server `my-api`, tool `fetch.data` → `mcp_my_api_fetch_data` + +### Auto-Injection + +After discovery, MCP tools are automatically injected into all `hermes-*` platform toolsets (CLI, Discord, Telegram, etc.). This means MCP tools are available in every conversation without any additional configuration. 
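The rename rule in the tool naming convention above is simple enough to sketch. The following is an illustrative Python sketch of the documented behavior; the function name `mcp_tool_name` is invented for illustration and is not part of the actual Hermes internals:

```python
# Illustrative sketch of the documented naming rule (not the real Hermes code):
# registered name = mcp_{server_name}_{tool_name}, with '-' and '.' mapped to
# '_' for LLM API compatibility.
def mcp_tool_name(server_name: str, tool_name: str) -> str:
    raw = f"mcp_{server_name}_{tool_name}"
    return raw.replace("-", "_").replace(".", "_")

print(mcp_tool_name("filesystem", "read_file"))  # mcp_filesystem_read_file
print(mcp_tool_name("github", "list-issues"))    # mcp_github_list_issues
print(mcp_tool_name("my-api", "fetch.data"))     # mcp_my_api_fetch_data
```

The three calls reproduce the examples listed above, which is a quick way to predict what a given server/tool pair will be registered as.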
+ +### Connection Lifecycle + +- Each server runs as a long-lived asyncio Task in a background daemon thread +- Connections persist for the lifetime of the agent process +- If a connection drops, automatic reconnection with exponential backoff kicks in (up to 5 retries, max 60s backoff) +- On agent shutdown, all connections are gracefully closed + +### Idempotency + +`discover_mcp_tools()` is idempotent -- calling it multiple times only connects to servers that aren't already connected. Failed servers are retried on subsequent calls. + +## Transport Types + +### Stdio Transport + +The most common transport. Hermes launches the MCP server as a subprocess and communicates over stdin/stdout. + +```yaml +mcp_servers: + filesystem: + command: "npx" + args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"] +``` + +The subprocess inherits a **filtered** environment (see Security section below) plus any variables you specify in `env`. + +### HTTP / StreamableHTTP Transport + +For remote or shared MCP servers. Requires the `mcp` package to include HTTP client support (`mcp.client.streamable_http`). + +```yaml +mcp_servers: + remote_api: + url: "https://mcp.example.com/mcp" + headers: + Authorization: "Bearer sk-..." +``` + +If HTTP support is not available in your installed `mcp` version, the server will fail with an ImportError and other servers will continue normally. + +## Security + +### Environment Variable Filtering + +For stdio servers, Hermes does NOT pass your full shell environment to MCP subprocesses. Only safe baseline variables are inherited: + +- `PATH`, `HOME`, `USER`, `LANG`, `LC_ALL`, `TERM`, `SHELL`, `TMPDIR` +- Any `XDG_*` variables + +All other environment variables (API keys, tokens, secrets) are excluded unless you explicitly add them via the `env` config key. This prevents accidental credential leakage to untrusted MCP servers. 
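The baseline filter described above can be sketched as follows. This is an assumed illustration of the documented rules, not the real Hermes code; `filtered_env` and `SAFE_VARS` are invented names:

```python
import os

# Baseline variables the docs say stdio MCP subprocesses inherit.
SAFE_VARS = {"PATH", "HOME", "USER", "LANG", "LC_ALL", "TERM", "SHELL", "TMPDIR"}

def filtered_env(extra=None):
    """Illustrative sketch (not the actual implementation) of the env filter."""
    env = {
        name: value
        for name, value in os.environ.items()
        if name in SAFE_VARS or name.startswith("XDG_")  # XDG_* also inherited
    }
    env.update(extra or {})  # explicit `env:` entries from config pass through
    return env
```

Under this rule, something like `AWS_SECRET_ACCESS_KEY` sitting in your shell would be dropped unless you listed it under `env:` for that server.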
+ +```yaml +mcp_servers: + github: + command: "npx" + args: ["-y", "@modelcontextprotocol/server-github"] + env: + # Only this token is passed to the subprocess + GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_..." +``` + +### Credential Stripping in Error Messages + +If an MCP tool call fails, any credential-like patterns in the error message are automatically redacted before being shown to the LLM. This covers: + +- GitHub PATs (`ghp_...`) +- OpenAI-style keys (`sk-...`) +- Bearer tokens +- Generic `token=`, `key=`, `API_KEY=`, `password=`, `secret=` patterns + +## Troubleshooting + +### "MCP SDK not available -- skipping MCP tool discovery" + +The `mcp` Python package is not installed. Install it: + +```bash +pip install mcp +``` + +### "No MCP servers configured" + +No `mcp_servers` key in `~/.hermes/config.yaml`, or it's empty. Add at least one server. + +### "Failed to connect to MCP server 'X'" + +Common causes: +- **Command not found**: The `command` binary isn't on PATH. Ensure `npx`, `uvx`, or the relevant command is installed. +- **Package not found**: For npx servers, the npm package may not exist or may need `-y` in args to auto-install. +- **Timeout**: The server took too long to start. Increase `connect_timeout`. +- **Port conflict**: For HTTP servers, the URL may be unreachable. + +### "MCP server 'X' requires HTTP transport but mcp.client.streamable_http is not available" + +Your `mcp` package version doesn't include HTTP client support. Upgrade: + +```bash +pip install --upgrade mcp +``` + +### Tools not appearing + +- Check that the server is listed under `mcp_servers` (not `mcp` or `servers`) +- Ensure the YAML indentation is correct +- Look at Hermes Agent startup logs for connection messages +- Tool names are prefixed with `mcp_{server}_{tool}` -- look for that pattern + +### Connection keeps dropping + +The client retries up to 5 times with exponential backoff (1s, 2s, 4s, 8s, 16s, capped at 60s). 
If the server is fundamentally unreachable, it gives up after 5 attempts. Check the server process and network connectivity. + +## Examples + +### Time Server (uvx) + +```yaml +mcp_servers: + time: + command: "uvx" + args: ["mcp-server-time"] +``` + +Registers tools like `mcp_time_get_current_time`. + +### Filesystem Server (npx) + +```yaml +mcp_servers: + filesystem: + command: "npx" + args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/documents"] + timeout: 30 +``` + +Registers tools like `mcp_filesystem_read_file`, `mcp_filesystem_write_file`, `mcp_filesystem_list_directory`. + +### GitHub Server with Authentication + +```yaml +mcp_servers: + github: + command: "npx" + args: ["-y", "@modelcontextprotocol/server-github"] + env: + GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_xxxxxxxxxxxxxxxxxxxx" + timeout: 60 +``` + +Registers tools like `mcp_github_list_issues`, `mcp_github_create_pull_request`, etc. + +### Remote HTTP Server + +```yaml +mcp_servers: + company_api: + url: "https://mcp.mycompany.com/v1/mcp" + headers: + Authorization: "Bearer sk-xxxxxxxxxxxxxxxxxxxx" + X-Team-Id: "engineering" + timeout: 180 + connect_timeout: 30 +``` + +### Multiple Servers + +```yaml +mcp_servers: + time: + command: "uvx" + args: ["mcp-server-time"] + + filesystem: + command: "npx" + args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"] + + github: + command: "npx" + args: ["-y", "@modelcontextprotocol/server-github"] + env: + GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_xxxxxxxxxxxxxxxxxxxx" + + company_api: + url: "https://mcp.internal.company.com/mcp" + headers: + Authorization: "Bearer sk-xxxxxxxxxxxxxxxxxxxx" + timeout: 300 +``` + +All tools from all servers are registered and available simultaneously. Each server's tools are prefixed with its name to avoid collisions. + +## Sampling (Server-Initiated LLM Requests) + +Hermes supports MCP's `sampling/createMessage` capability — MCP servers can request LLM completions through the agent during tool execution. 
This enables agent-in-the-loop workflows (data analysis, content generation, decision-making). + +Sampling is **enabled by default**. Configure per server: + +```yaml +mcp_servers: + my_server: + command: "npx" + args: ["-y", "my-mcp-server"] + sampling: + enabled: true # default: true + model: "gemini-3-flash" # model override (optional) + max_tokens_cap: 4096 # max tokens per request + timeout: 30 # LLM call timeout (seconds) + max_rpm: 10 # max requests per minute + allowed_models: [] # model whitelist (empty = all) + max_tool_rounds: 5 # tool loop limit (0 = disable) + log_level: "info" # audit verbosity +``` + +Servers can also include `tools` in sampling requests for multi-turn tool-augmented workflows. The `max_tool_rounds` config prevents infinite tool loops. Per-server audit metrics (requests, errors, tokens, tool use count) are tracked via `get_mcp_status()`. + +Disable sampling for untrusted servers with `sampling: { enabled: false }`. + +## Notes + +- MCP tools are called synchronously from the agent's perspective but run asynchronously on a dedicated background event loop +- Tool results are returned as JSON with either `{"result": "..."}` or `{"error": "..."}` +- The native MCP client is independent of `mcporter` -- you can use both simultaneously +- Server connections are persistent and shared across all conversations in the same agent process +- Adding or removing servers requires restarting the agent (no hot-reload currently) diff --git a/website/docs/user-guide/skills/bundled/media/media-gif-search.md b/website/docs/user-guide/skills/bundled/media/media-gif-search.md new file mode 100644 index 000000000..67b56645d --- /dev/null +++ b/website/docs/user-guide/skills/bundled/media/media-gif-search.md @@ -0,0 +1,101 @@ +--- +title: "Gif Search — Search and download GIFs from Tenor using curl" +sidebar_label: "Gif Search" +description: "Search and download GIFs from Tenor using curl" +--- + +{/* This page is auto-generated from the skill's SKILL.md by 
website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Gif Search + +Search and download GIFs from Tenor using curl. No dependencies beyond curl and jq. Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/media/gif-search` | +| Version | `1.1.0` | +| Author | Hermes Agent | +| License | MIT | +| Tags | `GIF`, `Media`, `Search`, `Tenor`, `API` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# GIF Search (Tenor API) + +Search and download GIFs directly via the Tenor API using curl. No extra tools needed. + +## Setup + +Set your Tenor API key in your environment (add to `~/.hermes/.env`): + +```bash +TENOR_API_KEY=your_key_here +``` + +Get a free API key at https://developers.google.com/tenor/guides/quickstart — the Google Cloud Console Tenor API key is free and has generous rate limits. 
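If you prefer to build the request from Python rather than curl, the URL construction looks like this; `tenor_search_url` is a hypothetical helper, not part of the skill, and it only encodes the query parameters documented below:

```python
import os
from urllib.parse import urlencode

def tenor_search_url(query, limit=5):
    """Hypothetical helper: build a Tenor v2 search URL from a free-text query."""
    # urlencode uses quote_plus, so spaces become '+' as the API expects
    params = {"q": query, "limit": limit, "key": os.environ.get("TENOR_API_KEY", "")}
    return "https://tenor.googleapis.com/v2/search?" + urlencode(params)

print(tenor_search_url("thumbs up", 3))
```

Feed the result to any HTTP client; piping the JSON response through the `jq` filters shown below works the same way.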
+ +## Prerequisites + +- `curl` and `jq` (`curl` ships with macOS and most Linux distros; install `jq` with your package manager if it is missing) +- `TENOR_API_KEY` environment variable + +## Search for GIFs + +```bash +# Search and get GIF URLs +curl -s "https://tenor.googleapis.com/v2/search?q=thumbs+up&limit=5&key=${TENOR_API_KEY}" | jq -r '.results[].media_formats.gif.url' + +# Get smaller/preview versions +curl -s "https://tenor.googleapis.com/v2/search?q=nice+work&limit=3&key=${TENOR_API_KEY}" | jq -r '.results[].media_formats.tinygif.url' +``` + +## Download a GIF + +```bash +# Search and download the top result +URL=$(curl -s "https://tenor.googleapis.com/v2/search?q=celebration&limit=1&key=${TENOR_API_KEY}" | jq -r '.results[0].media_formats.gif.url') +curl -sL "$URL" -o celebration.gif +``` + +## Get Full Metadata + +```bash +curl -s "https://tenor.googleapis.com/v2/search?q=cat&limit=3&key=${TENOR_API_KEY}" | jq '.results[] | {title: .title, url: .media_formats.gif.url, preview: .media_formats.tinygif.url, dimensions: .media_formats.gif.dims}' +``` + +## API Parameters + +| Parameter | Description | +|-----------|-------------| +| `q` | Search query (URL-encode spaces as `+`) | +| `limit` | Max results (1-50, default 20) | +| `key` | API key (from `$TENOR_API_KEY` env var) | +| `media_filter` | Filter formats: `gif`, `tinygif`, `mp4`, `tinymp4`, `webm` | +| `contentfilter` | Safety: `off`, `low`, `medium`, `high` | +| `locale` | Language: `en_US`, `es`, `fr`, etc.
| + +## Available Media Formats + +Each result has multiple formats under `.media_formats`: + +| Format | Use case | +|--------|----------| +| `gif` | Full quality GIF | +| `tinygif` | Small preview GIF | +| `mp4` | Video version (smaller file size) | +| `tinymp4` | Small preview video | +| `webm` | WebM video | +| `nanogif` | Tiny thumbnail | + +## Notes + +- URL-encode the query: spaces as `+`, special chars as `%XX` +- For sending in chat, `tinygif` URLs are lighter weight +- GIF URLs can be used directly in markdown: `![alt](url)` diff --git a/website/docs/user-guide/skills/bundled/media/media-heartmula.md b/website/docs/user-guide/skills/bundled/media/media-heartmula.md new file mode 100644 index 000000000..85dae5e86 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/media/media-heartmula.md @@ -0,0 +1,188 @@ +--- +title: "Heartmula — Set up and run HeartMuLa, the open-source music generation model family (Suno-like)" +sidebar_label: "Heartmula" +description: "Set up and run HeartMuLa, the open-source music generation model family (Suno-like)" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Heartmula + +Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/media/heartmula` | +| Version | `1.0.0` | +| Tags | `music`, `audio`, `generation`, `ai`, `heartmula`, `heartcodec`, `lyrics`, `songs` | +| Related skills | `audiocraft` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
+::: + +# HeartMuLa - Open-Source Music Generation + +## Overview +HeartMuLa is a family of open-source music foundation models (Apache-2.0) that generates music conditioned on lyrics and tags. Comparable to Suno for open-source. Includes: +- **HeartMuLa** - Music language model (3B/7B) for generation from lyrics + tags +- **HeartCodec** - 12.5Hz music codec for high-fidelity audio reconstruction +- **HeartTranscriptor** - Whisper-based lyrics transcription +- **HeartCLAP** - Audio-text alignment model + +## When to Use +- User wants to generate music/songs from text descriptions +- User wants an open-source Suno alternative +- User wants local/offline music generation +- User asks about HeartMuLa, heartlib, or AI music generation + +## Hardware Requirements +- **Minimum**: 8GB VRAM with `--lazy_load true` (loads/unloads models sequentially) +- **Recommended**: 16GB+ VRAM for comfortable single-GPU usage +- **Multi-GPU**: Use `--mula_device cuda:0 --codec_device cuda:1` to split across GPUs +- 3B model with lazy_load peaks at ~6.2GB VRAM + +## Installation Steps + +### 1. Clone Repository +```bash +cd ~/ # or desired directory +git clone https://github.com/HeartMuLa/heartlib.git +cd heartlib +``` + +### 2. Create Virtual Environment (Python 3.10 required) +```bash +uv venv --python 3.10 .venv +. .venv/bin/activate +uv pip install -e . +``` + +### 3. Fix Dependency Compatibility Issues + +**IMPORTANT**: As of Feb 2026, the pinned dependencies have conflicts with newer packages. Apply these fixes: + +```bash +# Upgrade datasets (old version incompatible with current pyarrow) +uv pip install --upgrade datasets + +# Upgrade transformers (needed for huggingface-hub 1.x compatibility) +uv pip install --upgrade transformers +``` + +### 4. 
Patch Source Code (Required for transformers 5.x) + +**Patch 1 - RoPE cache fix** in `src/heartlib/heartmula/modeling_heartmula.py`: + +In the `setup_caches` method of the `HeartMuLa` class, add RoPE reinitialization after the `reset_caches` try/except block and before the `with device:` block: + +```python +# Re-initialize RoPE caches that were skipped during meta-device loading +from torchtune.models.llama3_1._position_embeddings import Llama3ScaledRoPE +for module in self.modules(): + if isinstance(module, Llama3ScaledRoPE) and not module.is_cache_built: + module.rope_init() + module.to(device) +``` + +**Why**: `from_pretrained` creates model on meta device first; `Llama3ScaledRoPE.rope_init()` skips cache building on meta tensors, then never rebuilds after weights are loaded to real device. + +**Patch 2 - HeartCodec loading fix** in `src/heartlib/pipelines/music_generation.py`: + +Add `ignore_mismatched_sizes=True` to ALL `HeartCodec.from_pretrained()` calls (there are 2: the eager load in `__init__` and the lazy load in the `codec` property). + +**Why**: VQ codebook `initted` buffers have shape `[1]` in checkpoint vs `[]` in model. Same data, just scalar vs 0-d tensor. Safe to ignore. + +### 5. Download Model Checkpoints +```bash +cd heartlib # project root +hf download --local-dir './ckpt' 'HeartMuLa/HeartMuLaGen' +hf download --local-dir './ckpt/HeartMuLa-oss-3B' 'HeartMuLa/HeartMuLa-oss-3B-happy-new-year' +hf download --local-dir './ckpt/HeartCodec-oss' 'HeartMuLa/HeartCodec-oss-20260123' +``` + +All 3 can be downloaded in parallel. Total size is several GB. + +## GPU / CUDA + +HeartMuLa uses CUDA by default (`--mula_device cuda --codec_device cuda`). No extra setup needed if the user has an NVIDIA GPU with PyTorch CUDA support installed. 
+ +- The installed `torch==2.4.1` includes CUDA 12.1 support out of the box +- `torchtune` may report version `0.4.0+cpu` — this is just package metadata, it still uses CUDA via PyTorch +- To verify GPU is being used, look for "CUDA memory" lines in the output (e.g. "CUDA memory before unloading: 6.20 GB") +- **No GPU?** You can run on CPU with `--mula_device cpu --codec_device cpu`, but expect generation to be **extremely slow** (potentially 30-60+ minutes for a single song vs ~4 minutes on GPU). CPU mode also requires significant RAM (~12GB+ free). If the user has no NVIDIA GPU, recommend using a cloud GPU service (Google Colab free tier with T4, Lambda Labs, etc.) or the online demo at https://heartmula.github.io/ instead. + +## Usage + +### Basic Generation +```bash +cd heartlib +. .venv/bin/activate +python ./examples/run_music_generation.py \ + --model_path=./ckpt \ + --version="3B" \ + --lyrics="./assets/lyrics.txt" \ + --tags="./assets/tags.txt" \ + --save_path="./assets/output.mp3" \ + --lazy_load true +``` + +### Input Formatting + +**Tags** (comma-separated, no spaces): +``` +piano,happy,wedding,synthesizer,romantic +``` +or +``` +rock,energetic,guitar,drums,male-vocal +``` + +**Lyrics** (use bracketed structural tags): +``` +[Intro] + +[Verse] +Your lyrics here... + +[Chorus] +Chorus lyrics... + +[Bridge] +Bridge lyrics... 
+ +[Outro] +``` + +### Key Parameters +| Parameter | Default | Description | +|-----------|---------|-------------| +| `--max_audio_length_ms` | 240000 | Max length in ms (240s = 4 min) | +| `--topk` | 50 | Top-k sampling | +| `--temperature` | 1.0 | Sampling temperature | +| `--cfg_scale` | 1.5 | Classifier-free guidance scale | +| `--lazy_load` | false | Load/unload models on demand (saves VRAM) | +| `--mula_dtype` | bfloat16 | Dtype for HeartMuLa (bf16 recommended) | +| `--codec_dtype` | float32 | Dtype for HeartCodec (fp32 recommended for quality) | + +### Performance +- RTF (Real-Time Factor) ≈ 1.0 — a 4-minute song takes ~4 minutes to generate +- Output: MP3, 48kHz stereo, 128kbps + +## Pitfalls +1. **Do NOT use bf16 for HeartCodec** — degrades audio quality. Use fp32 (default). +2. **Tags may be ignored** — known issue (#90). Lyrics tend to dominate; experiment with tag ordering. +3. **Triton not available on macOS** — Linux/CUDA only for GPU acceleration. +4. **RTX 5080 incompatibility** reported in upstream issues. +5. The dependency pin conflicts require the manual upgrades and patches described above. + +## Links +- Repo: https://github.com/HeartMuLa/heartlib +- Models: https://huggingface.co/HeartMuLa +- Paper: https://arxiv.org/abs/2601.10547 +- License: Apache-2.0 diff --git a/website/docs/user-guide/skills/bundled/media/media-songsee.md b/website/docs/user-guide/skills/bundled/media/media-songsee.md new file mode 100644 index 000000000..231b87ea3 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/media/media-songsee.md @@ -0,0 +1,97 @@ +--- +title: "Songsee — Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.)" +sidebar_label: "Songsee" +description: "Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.)" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page.
*/} + +# Songsee + +Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/media/songsee` | +| Version | `1.0.0` | +| Author | community | +| License | MIT | +| Tags | `Audio`, `Visualization`, `Spectrogram`, `Music`, `Analysis` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# songsee + +Generate spectrograms and multi-panel audio feature visualizations from audio files. + +## Prerequisites + +Requires [Go](https://go.dev/doc/install): +```bash +go install github.com/steipete/songsee/cmd/songsee@latest +``` + +Optional: `ffmpeg` for formats beyond WAV/MP3. + +## Quick Start + +```bash +# Basic spectrogram +songsee track.mp3 + +# Save to specific file +songsee track.mp3 -o spectrogram.png + +# Multi-panel visualization grid +songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux + +# Time slice (start at 12.5s, 8s duration) +songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg + +# From stdin +cat track.mp3 | songsee - --format png -o out.png +``` + +## Visualization Types + +Use `--viz` with comma-separated values: + +| Type | Description | +|------|-------------| +| `spectrogram` | Standard frequency spectrogram | +| `mel` | Mel-scaled spectrogram | +| `chroma` | Pitch class distribution | +| `hpss` | Harmonic/percussive separation | +| `selfsim` | Self-similarity matrix | +| `loudness` | Loudness over time | +| `tempogram` | Tempo estimation | +| `mfcc` | Mel-frequency cepstral coefficients | +| `flux` | Spectral flux (onset detection) | + +Multiple `--viz` types render as a grid in a single image. 
+ +## Common Flags + +| Flag | Description | +|------|-------------| +| `--viz` | Visualization types (comma-separated) | +| `--style` | Color palette: `classic`, `magma`, `inferno`, `viridis`, `gray` | +| `--width` / `--height` | Output image dimensions | +| `--window` / `--hop` | FFT window and hop size | +| `--min-freq` / `--max-freq` | Frequency range filter | +| `--start` / `--duration` | Time slice of the audio | +| `--format` | Output format: `jpg` or `png` | +| `-o` | Output file path | + +## Notes + +- WAV and MP3 are decoded natively; other formats require `ffmpeg` +- Output images can be inspected with `vision_analyze` for automated audio analysis +- Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines diff --git a/website/docs/user-guide/skills/bundled/media/media-youtube-content.md b/website/docs/user-guide/skills/bundled/media/media-youtube-content.md new file mode 100644 index 000000000..e94c755c9 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/media/media-youtube-content.md @@ -0,0 +1,88 @@ +--- +title: "Youtube Content" +sidebar_label: "Youtube Content" +description: "Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts)" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Youtube Content + +Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts). Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to extract and reformat content from any YouTube video. 
+ +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/media/youtube-content` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# YouTube Content Tool + +Extract transcripts from YouTube videos and convert them into useful formats. + +## Setup + +```bash +pip install youtube-transcript-api +``` + +## Helper Script + +`SKILL_DIR` is the directory containing this SKILL.md file. The script accepts any standard YouTube URL format, short links (youtu.be), shorts, embeds, live links, or a raw 11-character video ID. + +```bash +# JSON output with metadata +python3 SKILL_DIR/scripts/fetch_transcript.py "https://youtube.com/watch?v=VIDEO_ID" + +# Plain text (good for piping into further processing) +python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --text-only + +# With timestamps +python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --timestamps + +# Specific language with fallback chain +python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --language tr,en +``` + +## Output Formats + +After fetching the transcript, format it based on what the user asks for: + +- **Chapters**: Group by topic shifts, output timestamped chapter list +- **Summary**: Concise 5-10 sentence overview of the entire video +- **Chapter summaries**: Chapters with a short paragraph summary for each +- **Thread**: Twitter/X thread format — numbered posts, each under 280 chars +- **Blog post**: Full article with title, sections, and key takeaways +- **Quotes**: Notable quotes with timestamps + +### Example — Chapters Output + +``` +00:00 Introduction — host opens with the problem statement +03:45 Background — prior work and why existing solutions fall short +12:20 Core method — walkthrough of the proposed approach +24:10 Results — benchmark comparisons and key takeaways +31:55 Q&A — 
audience questions on scalability and next steps +``` + +## Workflow + +1. **Fetch** the transcript using the helper script with `--text-only --timestamps`. +2. **Validate**: confirm the output is non-empty and in the expected language. If empty, retry without `--language` to get any available transcript. If still empty, tell the user the video likely has transcripts disabled. +3. **Chunk if needed**: if the transcript exceeds ~50K characters, split into overlapping chunks (~40K with 2K overlap) and summarize each chunk before merging. +4. **Transform** into the requested output format. If the user did not specify a format, default to a summary. +5. **Verify**: re-read the transformed output to check for coherence, correct timestamps, and completeness before presenting. + +## Error Handling + +- **Transcript disabled**: tell the user; suggest they check if subtitles are available on the video page. +- **Private/unavailable video**: relay the error and ask the user to verify the URL. +- **No matching language**: retry without `--language` to fetch any available transcript, then note the actual language to the user. +- **Dependency missing**: run `pip install youtube-transcript-api` and retry. diff --git a/website/docs/user-guide/skills/bundled/mlops/mlops-evaluation-lm-evaluation-harness.md b/website/docs/user-guide/skills/bundled/mlops/mlops-evaluation-lm-evaluation-harness.md new file mode 100644 index 000000000..0112f747a --- /dev/null +++ b/website/docs/user-guide/skills/bundled/mlops/mlops-evaluation-lm-evaluation-harness.md @@ -0,0 +1,507 @@ +--- +title: "Evaluating Llms Harness — Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag)" +sidebar_label: "Evaluating Llms Harness" +description: "Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag)" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. 
Edit the source SKILL.md, not this page. */} + +# Evaluating Llms Harness + +Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/mlops/evaluation/lm-evaluation-harness` | +| Version | `1.0.0` | +| Author | Orchestra Research | +| License | MIT | +| Dependencies | `lm-eval`, `transformers`, `vllm` | +| Tags | `Evaluation`, `LM Evaluation Harness`, `Benchmarking`, `MMLU`, `HumanEval`, `GSM8K`, `EleutherAI`, `Model Quality`, `Academic Benchmarks`, `Industry Standard` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# lm-evaluation-harness - LLM Benchmarking + +## Quick start + +lm-evaluation-harness evaluates LLMs across 60+ academic benchmarks using standardized prompts and metrics. + +**Installation**: +```bash +pip install lm-eval +``` + +**Evaluate any HuggingFace model**: +```bash +lm_eval --model hf \ + --model_args pretrained=meta-llama/Llama-2-7b-hf \ + --tasks mmlu,gsm8k,hellaswag \ + --device cuda:0 \ + --batch_size 8 +``` + +**View available tasks**: +```bash +lm_eval --tasks list +``` + +## Common workflows + +### Workflow 1: Standard benchmark evaluation + +Evaluate model on core benchmarks (MMLU, GSM8K, HumanEval). 
+ +Copy this checklist: + +``` +Benchmark Evaluation: +- [ ] Step 1: Choose benchmark suite +- [ ] Step 2: Configure model +- [ ] Step 3: Run evaluation +- [ ] Step 4: Analyze results +``` + +**Step 1: Choose benchmark suite** + +**Core reasoning benchmarks**: +- **MMLU** (Massive Multitask Language Understanding) - 57 subjects, multiple choice +- **GSM8K** - Grade school math word problems +- **HellaSwag** - Common sense reasoning +- **TruthfulQA** - Truthfulness and factuality +- **ARC** (AI2 Reasoning Challenge) - Science questions + +**Code benchmarks**: +- **HumanEval** - Python code generation (164 problems) +- **MBPP** (Mostly Basic Python Problems) - Python coding + +**Standard suite** (recommended for model releases): +```bash +--tasks mmlu,gsm8k,hellaswag,truthfulqa,arc_challenge +``` + +**Step 2: Configure model** + +**HuggingFace model**: +```bash +lm_eval --model hf \ + --model_args pretrained=meta-llama/Llama-2-7b-hf,dtype=bfloat16 \ + --tasks mmlu \ + --device cuda:0 \ + --batch_size auto # Auto-detect optimal batch size +``` + +**Quantized model (4-bit/8-bit)**: +```bash +lm_eval --model hf \ + --model_args pretrained=meta-llama/Llama-2-7b-hf,load_in_4bit=True \ + --tasks mmlu \ + --device cuda:0 +``` + +**Custom checkpoint**: +```bash +lm_eval --model hf \ + --model_args pretrained=/path/to/my-model,tokenizer=/path/to/tokenizer \ + --tasks mmlu \ + --device cuda:0 +``` + +**Step 3: Run evaluation** + +```bash +# Full MMLU evaluation (57 subjects), 5-shot (the standard setting) +lm_eval --model hf \ + --model_args pretrained=meta-llama/Llama-2-7b-hf \ + --tasks mmlu \ + --num_fewshot 5 \ + --batch_size 8 \ + --output_path results/ \ + --log_samples # Save individual predictions + +# Multiple benchmarks at once +lm_eval --model hf \ + --model_args pretrained=meta-llama/Llama-2-7b-hf \ + --tasks mmlu,gsm8k,hellaswag,truthfulqa,arc_challenge \ + --num_fewshot 5 \ + --batch_size 8 \ + --output_path results/llama2-7b-eval.json +``` + +**Step 4: Analyze
results** + +Results saved to `results/llama2-7b-eval.json`: + +```json +{ + "results": { + "mmlu": { + "acc": 0.459, + "acc_stderr": 0.004 + }, + "gsm8k": { + "exact_match": 0.142, + "exact_match_stderr": 0.006 + }, + "hellaswag": { + "acc_norm": 0.765, + "acc_norm_stderr": 0.004 + } + }, + "config": { + "model": "hf", + "model_args": "pretrained=meta-llama/Llama-2-7b-hf", + "num_fewshot": 5 + } +} +``` + +### Workflow 2: Track training progress + +Evaluate checkpoints during training. + +``` +Training Progress Tracking: +- [ ] Step 1: Set up periodic evaluation +- [ ] Step 2: Choose quick benchmarks +- [ ] Step 3: Automate evaluation +- [ ] Step 4: Plot learning curves +``` + +**Step 1: Set up periodic evaluation** + +Evaluate every N training steps: + +```bash +#!/bin/bash +# eval_checkpoint.sh + +CHECKPOINT_DIR=$1 +STEP=$2 + +# 0-shot for speed +lm_eval --model hf \ + --model_args pretrained=$CHECKPOINT_DIR/checkpoint-$STEP \ + --tasks gsm8k,hellaswag \ + --num_fewshot 0 \ + --batch_size 16 \ + --output_path results/step-$STEP.json +``` + +**Step 2: Choose quick benchmarks** + +Fast benchmarks for frequent evaluation: +- **HellaSwag**: ~10 minutes on 1 GPU +- **GSM8K**: ~5 minutes +- **PIQA**: ~2 minutes + +Avoid for frequent eval (too slow): +- **MMLU**: ~2 hours (57 subjects) +- **HumanEval**: Requires code execution + +**Step 3: Automate evaluation** + +Integrate with training script: + +```python +import os + +# In training loop +if step % eval_interval == 0: + model.save_pretrained(f"checkpoints/checkpoint-{step}") + + # Run evaluation (eval_checkpoint.sh takes the checkpoint dir and the step) + os.system(f"./eval_checkpoint.sh checkpoints {step}") +``` + +Or use PyTorch Lightning callbacks: + +```python +import os + +from pytorch_lightning import Callback + +class EvalHarnessCallback(Callback): + def on_validation_epoch_end(self, trainer, pl_module): + step = trainer.global_step + checkpoint_path = f"checkpoints/step-{step}" + + # Save checkpoint + trainer.save_checkpoint(checkpoint_path) + + # Run lm-eval + os.system(f"lm_eval --model hf
--model_args pretrained={checkpoint_path} ...") +``` + +**Step 4: Plot learning curves** + +```python +import glob +import json +import matplotlib.pyplot as plt + +def step_of(path): + return int(path.split("-")[1].split(".")[0]) + +# Load all results, sorted numerically by step +# (a plain lexical sort would put step-1000 before step-200) +steps = [] +gsm8k_scores = [] + +for file in sorted(glob.glob("results/step-*.json"), key=step_of): + with open(file) as f: + data = json.load(f) + steps.append(step_of(file)) + gsm8k_scores.append(data["results"]["gsm8k"]["exact_match"]) + +# Plot +plt.plot(steps, gsm8k_scores) +plt.xlabel("Training Step") +plt.ylabel("GSM8K Exact Match") +plt.title("Training Progress") +plt.savefig("training_curve.png") +``` + +### Workflow 3: Compare multiple models + +Benchmark suite for model comparison. + +``` +Model Comparison: +- [ ] Step 1: Define model list +- [ ] Step 2: Run evaluations +- [ ] Step 3: Generate comparison table +``` + +**Step 1: Define model list** + +```bash +# models.txt +meta-llama/Llama-2-7b-hf +meta-llama/Llama-2-13b-hf +mistralai/Mistral-7B-v0.1 +microsoft/phi-2 +``` + +**Step 2: Run evaluations** + +```bash +#!/bin/bash +# eval_all_models.sh + +TASKS="mmlu,gsm8k,hellaswag,truthfulqa" + +while read model; do + echo "Evaluating $model" + + # Extract model name for output file + model_name=$(echo $model | sed 's/\//-/g') + + lm_eval --model hf \ + --model_args pretrained=$model,dtype=bfloat16 \ + --tasks $TASKS \ + --num_fewshot 5 \ + --batch_size auto \ + --output_path results/$model_name.json + +done < models.txt +``` + +**Step 3: Generate comparison table** + +```python +import json +import pandas as pd + +models = [ + "meta-llama/Llama-2-7b-hf", + "meta-llama/Llama-2-13b-hf", + "mistralai/Mistral-7B-v0.1", + "microsoft/phi-2" +] + +tasks = ["mmlu", "gsm8k", "hellaswag", "truthfulqa"] + +results = [] +for model in models: + # File names use the same / -> - substitution as eval_all_models.sh + with open(f"results/{model.replace('/', '-')}.json") as f: + data = json.load(f) + row = {"Model": model} + for task in tasks: + # Get primary metric for each task + metrics = data["results"][task] + if "acc" in metrics: + row[task.upper()] =
f"{metrics['acc']:.3f}" + elif "exact_match" in metrics: + row[task.upper()] = f"{metrics['exact_match']:.3f}" + results.append(row) + +df = pd.DataFrame(results) +print(df.to_markdown(index=False)) +``` + +Output: +``` +| Model | MMLU | GSM8K | HELLASWAG | TRUTHFULQA | +|------------------------|-------|-------|-----------|------------| +| meta-llama/Llama-2-7b | 0.459 | 0.142 | 0.765 | 0.391 | +| meta-llama/Llama-2-13b | 0.549 | 0.287 | 0.801 | 0.430 | +| mistralai/Mistral-7B | 0.626 | 0.395 | 0.812 | 0.428 | +| microsoft/phi-2 | 0.560 | 0.613 | 0.682 | 0.447 | +``` + +### Workflow 4: Evaluate with vLLM (faster inference) + +Use vLLM backend for 5-10x faster evaluation. + +``` +vLLM Evaluation: +- [ ] Step 1: Install vLLM +- [ ] Step 2: Configure vLLM backend +- [ ] Step 3: Run evaluation +``` + +**Step 1: Install vLLM** + +```bash +pip install vllm +``` + +**Step 2: Configure vLLM backend** + +```bash +lm_eval --model vllm \ + --model_args pretrained=meta-llama/Llama-2-7b-hf,tensor_parallel_size=1,dtype=auto,gpu_memory_utilization=0.8 \ + --tasks mmlu \ + --batch_size auto +``` + +**Step 3: Run evaluation** + +vLLM is 5-10× faster than standard HuggingFace: + +```bash +# Standard HF: ~2 hours for MMLU on 7B model +lm_eval --model hf \ + --model_args pretrained=meta-llama/Llama-2-7b-hf \ + --tasks mmlu \ + --batch_size 8 + +# vLLM: ~15-20 minutes for MMLU on 7B model +lm_eval --model vllm \ + --model_args pretrained=meta-llama/Llama-2-7b-hf,tensor_parallel_size=2 \ + --tasks mmlu \ + --batch_size auto +``` + +## When to use vs alternatives + +**Use lm-evaluation-harness when:** +- Benchmarking models for academic papers +- Comparing model quality across standard tasks +- Tracking training progress +- Reporting standardized metrics (everyone uses same prompts) +- Need reproducible evaluation + +**Use alternatives instead:** +- **HELM** (Stanford): Broader evaluation (fairness, efficiency, calibration) +- **AlpacaEval**: Instruction-following evaluation with LLM 
judges +- **MT-Bench**: Conversational multi-turn evaluation +- **Custom scripts**: Domain-specific evaluation + +## Common issues + +**Issue: Evaluation too slow** + +Use vLLM backend: +```bash +lm_eval --model vllm \ + --model_args pretrained=model-name,tensor_parallel_size=2 +``` + +Or reduce fewshot examples: +```bash +--num_fewshot 0 # Instead of 5 +``` + +Or evaluate subset of MMLU: +```bash +--tasks mmlu_stem # Only STEM subjects +``` + +**Issue: Out of memory** + +Reduce batch size: +```bash +--batch_size 1 # Or --batch_size auto +``` + +Use quantization: +```bash +--model_args pretrained=model-name,load_in_8bit=True +``` + +Enable CPU offloading: +```bash +--model_args pretrained=model-name,device_map=auto,offload_folder=offload +``` + +**Issue: Different results than reported** + +Check fewshot count: +```bash +--num_fewshot 5 # Most papers use 5-shot +``` + +Check exact task name: +```bash +--tasks mmlu # Not mmlu_direct or mmlu_fewshot +``` + +Verify model and tokenizer match: +```bash +--model_args pretrained=model-name,tokenizer=same-model-name +``` + +**Issue: HumanEval not executing code** + +Install execution dependencies: +```bash +pip install human-eval +``` + +Enable code execution: +```bash +lm_eval --model hf \ + --model_args pretrained=model-name \ + --tasks humaneval \ + --allow_code_execution # Required for HumanEval +``` + +## Advanced topics + +**Benchmark descriptions**: See [references/benchmark-guide.md](https://github.com/NousResearch/hermes-agent/blob/main/skills/mlops/evaluation/lm-evaluation-harness/references/benchmark-guide.md) for detailed description of all 60+ tasks, what they measure, and interpretation. + +**Custom tasks**: See [references/custom-tasks.md](https://github.com/NousResearch/hermes-agent/blob/main/skills/mlops/evaluation/lm-evaluation-harness/references/custom-tasks.md) for creating domain-specific evaluation tasks. 
+ +**API evaluation**: See [references/api-evaluation.md](https://github.com/NousResearch/hermes-agent/blob/main/skills/mlops/evaluation/lm-evaluation-harness/references/api-evaluation.md) for evaluating OpenAI, Anthropic, and other API models. + +**Multi-GPU strategies**: See [references/distributed-eval.md](https://github.com/NousResearch/hermes-agent/blob/main/skills/mlops/evaluation/lm-evaluation-harness/references/distributed-eval.md) for data parallel and tensor parallel evaluation. + +## Hardware requirements + +- **GPU**: NVIDIA (CUDA 11.8+), works on CPU (very slow) +- **VRAM**: + - 7B model: 16GB (bf16) or 8GB (8-bit) + - 13B model: 28GB (bf16) or 14GB (8-bit) + - 70B model: Requires multi-GPU or quantization +- **Time** (7B model, single A100): + - HellaSwag: 10 minutes + - GSM8K: 5 minutes + - MMLU (full): 2 hours + - HumanEval: 20 minutes + +## Resources + +- GitHub: https://github.com/EleutherAI/lm-evaluation-harness +- Docs: https://github.com/EleutherAI/lm-evaluation-harness/tree/main/docs +- Task library: 60+ tasks including MMLU, GSM8K, HumanEval, TruthfulQA, HellaSwag, ARC, WinoGrande, etc. +- Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard (uses this harness) diff --git a/website/docs/user-guide/skills/bundled/mlops/mlops-evaluation-weights-and-biases.md b/website/docs/user-guide/skills/bundled/mlops/mlops-evaluation-weights-and-biases.md new file mode 100644 index 000000000..db8c4d4d7 --- /dev/null +++ b/website/docs/user-guide/skills/bundled/mlops/mlops-evaluation-weights-and-biases.md @@ -0,0 +1,608 @@ +--- +title: "Weights And Biases" +sidebar_label: "Weights And Biases" +description: "Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - coll..." +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. 
*/} + +# Weights And Biases + +Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/mlops/evaluation/weights-and-biases` | +| Version | `1.0.0` | +| Author | Orchestra Research | +| License | MIT | +| Dependencies | `wandb` | +| Tags | `MLOps`, `Weights And Biases`, `WandB`, `Experiment Tracking`, `Hyperparameter Tuning`, `Model Registry`, `Collaboration`, `Real-Time Visualization`, `PyTorch`, `TensorFlow`, `HuggingFace` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Weights & Biases: ML Experiment Tracking & MLOps + +## When to Use This Skill + +Use Weights & Biases (W&B) when you need to: +- **Track ML experiments** with automatic metric logging +- **Visualize training** in real-time dashboards +- **Compare runs** across hyperparameters and configurations +- **Optimize hyperparameters** with automated sweeps +- **Manage model registry** with versioning and lineage +- **Collaborate on ML projects** with team workspaces +- **Track artifacts** (datasets, models, code) with lineage + +**Users**: 200,000+ ML practitioners | **GitHub Stars**: 10.5k+ | **Integrations**: 100+ + +## Installation + +```bash +# Install W&B +pip install wandb + +# Login (creates API key) +wandb login + +# Or set API key programmatically +export WANDB_API_KEY=your_api_key_here +``` + +## Quick Start + +### Basic Experiment Tracking + +```python +import wandb + +# Initialize a run +run = wandb.init( + project="my-project", + config={ + "learning_rate": 0.001, + "epochs": 10, + "batch_size": 32, + "architecture": "ResNet50" + } +) + +# Training loop +for epoch in range(run.config.epochs): + # 
Your training code + train_loss = train_epoch() + val_loss = validate() + + # Log metrics + wandb.log({ + "epoch": epoch, + "train/loss": train_loss, + "val/loss": val_loss, + "train/accuracy": train_acc, + "val/accuracy": val_acc + }) + +# Finish the run +wandb.finish() +``` + +### With PyTorch + +```python +import torch +import wandb + +# Initialize +wandb.init(project="pytorch-demo", config={ + "lr": 0.001, + "epochs": 10 +}) + +# Access config +config = wandb.config + +# Training loop +for epoch in range(config.epochs): + for batch_idx, (data, target) in enumerate(train_loader): + # Forward pass + output = model(data) + loss = criterion(output, target) + + # Backward pass + optimizer.zero_grad() + loss.backward() + optimizer.step() + + # Log every 100 batches + if batch_idx % 100 == 0: + wandb.log({ + "loss": loss.item(), + "epoch": epoch, + "batch": batch_idx + }) + +# Save model +torch.save(model.state_dict(), "model.pth") +wandb.save("model.pth") # Upload to W&B + +wandb.finish() +``` + +## Core Concepts + +### 1. Projects and Runs + +**Project**: Collection of related experiments +**Run**: Single execution of your training script + +```python +# Create/use project +run = wandb.init( + project="image-classification", + name="resnet50-experiment-1", # Optional run name + tags=["baseline", "resnet"], # Organize with tags + notes="First baseline run" # Add notes +) + +# Each run has unique ID +print(f"Run ID: {run.id}") +print(f"Run URL: {run.url}") +``` + +### 2. 
Configuration Tracking + +Track hyperparameters automatically: + +```python +config = { + # Model architecture + "model": "ResNet50", + "pretrained": True, + + # Training params + "learning_rate": 0.001, + "batch_size": 32, + "epochs": 50, + "optimizer": "Adam", + + # Data params + "dataset": "ImageNet", + "augmentation": "standard" +} + +wandb.init(project="my-project", config=config) + +# Access config during training +lr = wandb.config.learning_rate +batch_size = wandb.config.batch_size +``` + +### 3. Metric Logging + +```python +# Log scalars +wandb.log({"loss": 0.5, "accuracy": 0.92}) + +# Log multiple metrics +wandb.log({ + "train/loss": train_loss, + "train/accuracy": train_acc, + "val/loss": val_loss, + "val/accuracy": val_acc, + "learning_rate": current_lr, + "epoch": epoch +}) + +# Log with custom x-axis +wandb.log({"loss": loss}, step=global_step) + +# Log media (images, audio, video) +wandb.log({"examples": [wandb.Image(img) for img in images]}) + +# Log histograms +wandb.log({"gradients": wandb.Histogram(gradients)}) + +# Log tables +table = wandb.Table(columns=["id", "prediction", "ground_truth"]) +wandb.log({"predictions": table}) +``` + +### 4. Model Checkpointing + +```python +import torch +import wandb + +# Save model checkpoint +checkpoint = { + 'epoch': epoch, + 'model_state_dict': model.state_dict(), + 'optimizer_state_dict': optimizer.state_dict(), + 'loss': loss, +} + +torch.save(checkpoint, 'checkpoint.pth') + +# Upload to W&B +wandb.save('checkpoint.pth') + +# Or use Artifacts (recommended) +artifact = wandb.Artifact('model', type='model') +artifact.add_file('checkpoint.pth') +wandb.log_artifact(artifact) +``` + +## Hyperparameter Sweeps + +Automatically search for optimal hyperparameters. 
+ +### Define Sweep Configuration + +```python +sweep_config = { + 'method': 'bayes', # or 'grid', 'random' + 'metric': { + 'name': 'val/accuracy', + 'goal': 'maximize' + }, + 'parameters': { + 'learning_rate': { + 'distribution': 'log_uniform_values', # min/max are the values themselves + 'min': 1e-5, + 'max': 1e-1 + }, + 'batch_size': { + 'values': [16, 32, 64, 128] + }, + 'optimizer': { + 'values': ['adam', 'sgd', 'rmsprop'] + }, + 'dropout': { + 'distribution': 'uniform', + 'min': 0.1, + 'max': 0.5 + } + } +} + +# Initialize sweep +sweep_id = wandb.sweep(sweep_config, project="my-project") +``` + +### Define Training Function + +```python +def train(): + # Initialize run + run = wandb.init() + + # Access sweep parameters + lr = wandb.config.learning_rate + batch_size = wandb.config.batch_size + optimizer_name = wandb.config.optimizer + + # Build model with sweep config + model = build_model(wandb.config) + optimizer = get_optimizer(optimizer_name, lr) + + # Training loop + for epoch in range(NUM_EPOCHS): + train_loss = train_epoch(model, optimizer, batch_size) + val_acc = validate(model) + + # Log metrics + wandb.log({ + "train/loss": train_loss, + "val/accuracy": val_acc + }) + +# Run sweep +wandb.agent(sweep_id, function=train, count=50) # Run 50 trials +``` + +### Sweep Strategies + +```python +# Grid search - exhaustive +sweep_config = { + 'method': 'grid', + 'parameters': { + 'lr': {'values': [0.001, 0.01, 0.1]}, + 'batch_size': {'values': [16, 32, 64]} + } +} + +# Random search +sweep_config = { + 'method': 'random', + 'parameters': { + 'lr': {'distribution': 'uniform', 'min': 0.0001, 'max': 0.1}, + 'dropout': {'distribution': 'uniform', 'min': 0.1, 'max': 0.5} + } +} + +# Bayesian optimization (recommended) +sweep_config = { + 'method': 'bayes', + 'metric': {'name': 'val/loss', 'goal': 'minimize'}, + 'parameters': { + 'lr': {'distribution': 'log_uniform_values', 'min': 1e-5, 'max': 1e-1} + } +} +``` + +## Artifacts + +Track datasets, models, and other files with lineage.
+ +### Log Artifacts + +```python +# Create artifact +artifact = wandb.Artifact( + name='training-dataset', + type='dataset', + description='ImageNet training split', + metadata={'size': '1.2M images', 'split': 'train'} +) + +# Add files +artifact.add_file('data/train.csv') +artifact.add_dir('data/images/') + +# Log artifact +wandb.log_artifact(artifact) +``` + +### Use Artifacts + +```python +# Download and use artifact +run = wandb.init(project="my-project") + +# Download artifact +artifact = run.use_artifact('training-dataset:latest') +artifact_dir = artifact.download() + +# Use the data +data = load_data(f"{artifact_dir}/train.csv") +``` + +### Model Registry + +```python +# Log model as artifact +model_artifact = wandb.Artifact( + name='resnet50-model', + type='model', + metadata={'architecture': 'ResNet50', 'accuracy': 0.95} +) + +model_artifact.add_file('model.pth') +wandb.log_artifact(model_artifact, aliases=['best', 'production']) + +# Link to model registry +run.link_artifact(model_artifact, 'model-registry/production-models') +``` + +## Integration Examples + +### HuggingFace Transformers + +```python +from transformers import Trainer, TrainingArguments +import wandb + +# Initialize W&B +wandb.init(project="hf-transformers") + +# Training arguments with W&B +training_args = TrainingArguments( + output_dir="./results", + report_to="wandb", # Enable W&B logging + run_name="bert-finetuning", + logging_steps=100, + save_steps=500 +) + +# Trainer automatically logs to W&B +trainer = Trainer( + model=model, + args=training_args, + train_dataset=train_dataset, + eval_dataset=eval_dataset +) + +trainer.train() +``` + +### PyTorch Lightning + +```python +from pytorch_lightning import Trainer +from pytorch_lightning.loggers import WandbLogger +import wandb + +# Create W&B logger +wandb_logger = WandbLogger( + project="lightning-demo", + log_model=True # Log model checkpoints +) + +# Use with Trainer +trainer = Trainer( + logger=wandb_logger, + max_epochs=10 +) + 
+trainer.fit(model, datamodule=dm) +``` + +### Keras/TensorFlow + +```python +import wandb +from wandb.keras import WandbCallback + +# Initialize +wandb.init(project="keras-demo") + +# Add callback +model.fit( + x_train, y_train, + validation_data=(x_val, y_val), + epochs=10, + callbacks=[WandbCallback()] # Auto-logs metrics +) +``` + +## Visualization & Analysis + +### Custom Charts + +```python +# Log custom visualizations +import matplotlib.pyplot as plt + +fig, ax = plt.subplots() +ax.plot(x, y) +wandb.log({"custom_plot": wandb.Image(fig)}) + +# Log confusion matrix +wandb.log({"conf_mat": wandb.plot.confusion_matrix( + probs=None, + y_true=ground_truth, + preds=predictions, + class_names=class_names +)}) +``` + +### Reports + +Create shareable reports in W&B UI: +- Combine runs, charts, and text +- Markdown support +- Embeddable visualizations +- Team collaboration + +## Best Practices + +### 1. Organize with Tags and Groups + +```python +wandb.init( + project="my-project", + tags=["baseline", "resnet50", "imagenet"], + group="resnet-experiments", # Group related runs + job_type="train" # Type of job +) +``` + +### 2. Log Everything Relevant + +```python +# Log system metrics +wandb.log({ + "gpu/util": gpu_utilization, + "gpu/memory": gpu_memory_used, + "cpu/util": cpu_utilization +}) + +# Log code version +wandb.log({"git_commit": git_commit_hash}) + +# Log data splits +wandb.log({ + "data/train_size": len(train_dataset), + "data/val_size": len(val_dataset) +}) +``` + +### 3. Use Descriptive Names + +```python +# ✅ Good: Descriptive run names +wandb.init( + project="nlp-classification", + name="bert-base-lr0.001-bs32-epoch10" +) + +# ❌ Bad: Generic names +wandb.init(project="nlp", name="run1") +``` + +### 4. 
Save Important Artifacts + +```python +# Save final model +artifact = wandb.Artifact('final-model', type='model') +artifact.add_file('model.pth') +wandb.log_artifact(artifact) + +# Save predictions for analysis +predictions_table = wandb.Table( + columns=["id", "input", "prediction", "ground_truth"], + data=predictions_data +) +wandb.log({"predictions": predictions_table}) +``` + +### 5. Use Offline Mode for Unstable Connections + +```python +import os + +# Enable offline mode +os.environ["WANDB_MODE"] = "offline" + +wandb.init(project="my-project") +# ... your code ... + +# Sync later +# wandb sync +``` + +## Team Collaboration + +### Share Runs + +```python +# Runs are automatically shareable via URL +run = wandb.init(project="team-project") +print(f"Share this URL: {run.url}") +``` + +### Team Projects + +- Create team account at wandb.ai +- Add team members +- Set project visibility (private/public) +- Use team-level artifacts and model registry + +## Pricing + +- **Free**: Unlimited public projects, 100GB storage +- **Academic**: Free for students/researchers +- **Teams**: $50/seat/month, private projects, unlimited storage +- **Enterprise**: Custom pricing, on-prem options + +## Resources + +- **Documentation**: https://docs.wandb.ai +- **GitHub**: https://github.com/wandb/wandb (10.5k+ stars) +- **Examples**: https://github.com/wandb/examples +- **Community**: https://wandb.ai/community +- **Discord**: https://wandb.me/discord + +## See Also + +- `references/sweeps.md` - Comprehensive hyperparameter optimization guide +- `references/artifacts.md` - Data and model versioning patterns +- `references/integrations.md` - Framework-specific examples diff --git a/website/docs/user-guide/skills/bundled/mlops/mlops-huggingface-hub.md b/website/docs/user-guide/skills/bundled/mlops/mlops-huggingface-hub.md new file mode 100644 index 000000000..27ab41b5e --- /dev/null +++ b/website/docs/user-guide/skills/bundled/mlops/mlops-huggingface-hub.md @@ -0,0 +1,99 @@ +--- 
+title: "Huggingface Hub" +sidebar_label: "Huggingface Hub" +description: "Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Space..." +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Huggingface Hub + +Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Spaces and buckets. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/mlops/huggingface-hub` | +| Version | `1.0.0` | +| Author | Hugging Face | +| License | MIT | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# Hugging Face CLI (`hf`) Reference Guide + +The `hf` command is the modern command-line interface for interacting with the Hugging Face Hub, providing tools to manage repositories, models, datasets, and Spaces. + +> **IMPORTANT:** The `hf` command replaces the now deprecated `huggingface-cli` command. + +## Quick Start +* **Installation:** `curl -LsSf https://hf.co/cli/install.sh | bash -s` +* **Help:** Use `hf --help` to view all available functions and real-world examples. +* **Authentication:** Recommended via `HF_TOKEN` environment variable or the `--token` flag. + +--- + +## Core Commands + +### General Operations +* `hf download REPO_ID`: Download files from the Hub. +* `hf upload REPO_ID`: Upload files/folders (recommended for single-commit). +* `hf upload-large-folder REPO_ID LOCAL_PATH`: Recommended for resumable uploads of large directories. +* `hf sync`: Sync files between a local directory and a bucket. +* `hf env` / `hf version`: View environment and version details. 
+ +### Authentication (`hf auth`) +* `login` / `logout`: Manage sessions using tokens from [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens). +* `list` / `switch`: Manage and toggle between multiple stored access tokens. +* `whoami`: Identify the currently logged-in account. + +### Repository Management (`hf repos`) +* `create` / `delete`: Create or permanently remove repositories. +* `duplicate`: Clone a model, dataset, or Space to a new ID. +* `move`: Transfer a repository between namespaces. +* `branch` / `tag`: Manage Git-like references. +* `delete-files`: Remove specific files using patterns. + +--- + +## Specialized Hub Interactions + +### Datasets & Models +* **Datasets:** `hf datasets list`, `info`, and `parquet` (list parquet URLs). +* **SQL Queries:** `hf datasets sql SQL` — Execute raw SQL via DuckDB against dataset parquet URLs. +* **Models:** `hf models list` and `info`. +* **Papers:** `hf papers list` — View daily papers. + +### Discussions & Pull Requests (`hf discussions`) +* Manage the lifecycle of Hub contributions: `list`, `create`, `info`, `comment`, `close`, `reopen`, and `rename`. +* `diff`: View changes in a PR. +* `merge`: Finalize pull requests. + +### Infrastructure & Compute +* **Endpoints:** Deploy and manage Inference Endpoints (`deploy`, `pause`, `resume`, `scale-to-zero`, `catalog`). +* **Jobs:** Run compute tasks on HF infrastructure. Includes `hf jobs uv` for running Python scripts with inline dependencies and `stats` for resource monitoring. +* **Spaces:** Manage interactive apps. Includes `dev-mode` and `hot-reload` for Python files without full restarts. + +### Storage & Automation +* **Buckets:** Full S3-like bucket management (`create`, `cp`, `mv`, `rm`, `sync`). +* **Cache:** Manage local storage with `list`, `prune` (remove detached revisions), and `verify` (checksum checks). +* **Webhooks:** Automate workflows by managing Hub webhooks (`create`, `watch`, `enable`/`disable`). 
+* **Collections:** Organize Hub items into collections (`add-item`, `update`, `list`). + +--- + +## Advanced Usage & Tips + +### Global Flags +* `--format json`: Produces machine-readable output for automation. +* `-q` / `--quiet`: Limits output to IDs only. + +### Extensions & Skills +* **Extensions:** Extend CLI functionality via GitHub repositories using `hf extensions install REPO_ID`. +* **Skills:** Manage AI assistant skills with `hf skills add`. diff --git a/website/docs/user-guide/skills/bundled/mlops/mlops-inference-llama-cpp.md b/website/docs/user-guide/skills/bundled/mlops/mlops-inference-llama-cpp.md new file mode 100644 index 000000000..19f08067f --- /dev/null +++ b/website/docs/user-guide/skills/bundled/mlops/mlops-inference-llama-cpp.md @@ -0,0 +1,266 @@ +--- +title: "Llama Cpp — llama" +sidebar_label: "Llama Cpp" +description: "llama" +--- + +{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */} + +# Llama Cpp + +llama.cpp local GGUF inference + HF Hub model discovery. + +## Skill metadata + +| | | +|---|---| +| Source | Bundled (installed by default) | +| Path | `skills/mlops/inference/llama-cpp` | +| Version | `2.1.2` | +| Author | Orchestra Research | +| License | MIT | +| Dependencies | `llama-cpp-python>=0.2.0` | +| Tags | `llama.cpp`, `GGUF`, `Quantization`, `Hugging Face Hub`, `CPU Inference`, `Apple Silicon`, `Edge Deployment`, `AMD GPUs`, `Intel GPUs`, `NVIDIA`, `URL-first` | + +## Reference: full SKILL.md + +:::info +The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active. +::: + +# llama.cpp + GGUF + +Use this skill for local GGUF inference, quant selection, or Hugging Face repo discovery for llama.cpp. 
+ +## When to use + +- Run local models on CPU, Apple Silicon, CUDA, ROCm, or Intel GPUs +- Find the right GGUF for a specific Hugging Face repo +- Build a `llama-server` or `llama-cli` command from the Hub +- Search the Hub for models that already support llama.cpp +- Enumerate available `.gguf` files and sizes for a repo +- Decide between Q4/Q5/Q6/IQ variants for the user's RAM or VRAM + +## Model Discovery workflow + +Prefer URL workflows before asking for `hf`, Python, or custom scripts. + +1. Search for candidate repos on the Hub: + - Base: `https://huggingface.co/models?apps=llama.cpp&sort=trending` + - Add `search=` for a model family + - Add `num_parameters=min:0,max:24B` or similar when the user has size constraints +2. Open the repo with the llama.cpp local-app view: + - `https://huggingface.co/<repo>?local-app=llama.cpp` +3. Treat the local-app snippet as the source of truth when it is visible: + - copy the exact `llama-server` or `llama-cli` command + - report the recommended quant exactly as HF shows it +4. Read the same `?local-app=llama.cpp` URL as page text or HTML and extract the section under `Hardware compatibility`: + - prefer its exact quant labels and sizes over generic tables + - keep repo-specific labels such as `UD-Q4_K_M` or `IQ4_NL_XL` + - if that section is not visible in the fetched page source, say so and fall back to the tree API plus generic quant guidance +5. Query the tree API to confirm what actually exists: + - `https://huggingface.co/api/models/<repo>/tree/main?recursive=true` + - keep entries where `type` is `file` and `path` ends with `.gguf` + - use `path` and `size` as the source of truth for filenames and byte sizes + - separate quantized checkpoints from `mmproj-*.gguf` projector files and `BF16/` shard files + - use `https://huggingface.co/<repo>/tree/main` only as a human fallback +6.
If the local-app snippet is not text-visible, reconstruct the command from the repo plus the chosen quant:
+   - shorthand quant selection: `llama-server -hf <org>/<repo>:<quant>`
+   - exact-file fallback: `llama-server --hf-repo <org>/<repo> --hf-file <file>.gguf`
+7. Only suggest conversion from Transformers weights if the repo does not already expose GGUF files.
+
+## Quick start
+
+### Install llama.cpp
+
+```bash
+# macOS / Linux (simplest)
+brew install llama.cpp
+```
+
+```bash
+# Windows
+winget install llama.cpp
+```
+
+```bash
+# Build from source (any platform)
+git clone https://github.com/ggml-org/llama.cpp
+cd llama.cpp
+cmake -B build
+cmake --build build --config Release
+```
+
+### Run directly from the Hugging Face Hub
+
+```bash
+llama-cli -hf bartowski/Llama-3.2-3B-Instruct-GGUF:Q8_0
+```
+
+```bash
+llama-server -hf bartowski/Llama-3.2-3B-Instruct-GGUF:Q8_0
+```
+
+### Run an exact GGUF file from the Hub
+
+Use this when the tree API shows custom file naming or the exact HF snippet is missing.
+
+```bash
+llama-server \
+  --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf \
+  --hf-file Phi-3-mini-4k-instruct-q4.gguf \
+  -c 4096
+```
+
+### OpenAI-compatible server check
+
+```bash
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "messages": [
+      {"role": "user", "content": "Write a limerick about Python exceptions"}
+    ]
+  }'
+```
+
+## Python bindings (llama-cpp-python)
+
+`pip install llama-cpp-python` (CUDA: `CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --force-reinstall --no-cache-dir`; Metal: `CMAKE_ARGS="-DGGML_METAL=on" ...`).
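The tree-API step of the discovery workflow can also be scripted when you need an exact filename rather than a quant shorthand. This is a sketch, not part of the skill: the helper names are made up, and the demonstration uses a synthetic response instead of a live call to `https://huggingface.co/api/models/<org>/<repo>/tree/main?recursive=true`.

```python
import json
import urllib.request

TREE_API = "https://huggingface.co/api/models/{repo}/tree/main?recursive=true"

def split_ggufs(entries):
    """Split tree-API entries into main quantized GGUFs and mmproj projector files."""
    main, projectors = [], []
    for entry in entries:
        # Keep only real files whose path ends in .gguf; this drops READMEs,
        # BF16 shard directories, imatrix blobs, and other artifacts.
        if entry.get("type") != "file" or not entry["path"].endswith(".gguf"):
            continue
        name = entry["path"].rsplit("/", 1)[-1]
        bucket = projectors if name.startswith("mmproj-") else main
        bucket.append({"path": entry["path"], "size": entry["size"]})
    return main, projectors

def list_ggufs(repo_id):
    # Live lookup against the Hub tree API (requires network access).
    with urllib.request.urlopen(TREE_API.format(repo=repo_id)) as resp:
        return split_ggufs(json.load(resp))

# Offline demonstration with a synthetic tree-API response:
sample = [
    {"type": "file", "path": "model-Q4_K_M.gguf", "size": 4_920_000_000},
    {"type": "file", "path": "mmproj-model-f16.gguf", "size": 850_000_000},
    {"type": "file", "path": "README.md", "size": 12_000},
]
main, projectors = split_ggufs(sample)
print(main)        # [{'path': 'model-Q4_K_M.gguf', 'size': 4920000000}]
print(projectors)  # [{'path': 'mmproj-model-f16.gguf', 'size': 850000000}]
```

The `path` and `size` fields returned here are exactly the values the workflow treats as the source of truth for filenames and byte sizes.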
+ +### Basic generation + +```python +from llama_cpp import Llama + +llm = Llama( + model_path="./model-q4_k_m.gguf", + n_ctx=4096, + n_gpu_layers=35, # 0 for CPU, 99 to offload everything + n_threads=8, +) + +out = llm("What is machine learning?", max_tokens=256, temperature=0.7) +print(out["choices"][0]["text"]) +``` + +### Chat + streaming + +```python +llm = Llama( + model_path="./model-q4_k_m.gguf", + n_ctx=4096, + n_gpu_layers=35, + chat_format="llama-3", # or "chatml", "mistral", etc. +) + +resp = llm.create_chat_completion( + messages=[ + {"role": "system", "content": "You are a helpful assistant."}, + {"role": "user", "content": "What is Python?"}, + ], + max_tokens=256, +) +print(resp["choices"][0]["message"]["content"]) + +# Streaming +for chunk in llm("Explain quantum computing:", max_tokens=256, stream=True): + print(chunk["choices"][0]["text"], end="", flush=True) +``` + +### Embeddings + +```python +llm = Llama(model_path="./model-q4_k_m.gguf", embedding=True, n_gpu_layers=35) +vec = llm.embed("This is a test sentence.") +print(f"Embedding dimension: {len(vec)}") +``` + +You can also load a GGUF straight from the Hub: + +```python +llm = Llama.from_pretrained( + repo_id="bartowski/Llama-3.2-3B-Instruct-GGUF", + filename="*Q4_K_M.gguf", + n_gpu_layers=35, +) +``` + +## Choosing a quant + +Use the Hub page first, generic heuristics second. + +- Prefer the exact quant that HF marks as compatible for the user's hardware profile. +- For general chat, start with `Q4_K_M`. +- For code or technical work, prefer `Q5_K_M` or `Q6_K` if memory allows. +- For very tight RAM budgets, consider `Q3_K_M`, `IQ` variants, or `Q2` variants only if the user explicitly prioritizes fit over quality. +- For multimodal repos, mention `mmproj-*.gguf` separately. The projector is not the main model file. +- Do not normalize repo-native labels. If the page says `UD-Q4_K_M`, report `UD-Q4_K_M`. 
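The quant-selection heuristics above can be condensed into a small helper. The function and its priority lists are an illustrative paraphrase of the bullets, not something the skill ships:

```python
def suggest_quants(task: str = "chat", tight_memory: bool = False) -> list[str]:
    """Return generic quant labels to try in order, per the heuristics above."""
    if tight_memory:
        # Only when the user explicitly prioritizes fit over quality.
        return ["Q3_K_M", "IQ4_XS", "Q2_K"]
    if task in ("code", "technical"):
        # Prefer higher-precision quants for code/technical work if memory allows.
        return ["Q6_K", "Q5_K_M", "Q4_K_M"]
    # General chat: start with Q4_K_M.
    return ["Q4_K_M", "Q5_K_M"]

print(suggest_quants())                   # ['Q4_K_M', 'Q5_K_M']
print(suggest_quants("code"))             # ['Q6_K', 'Q5_K_M', 'Q4_K_M']
print(suggest_quants(tight_memory=True))  # ['Q3_K_M', 'IQ4_XS', 'Q2_K']
```

Repo-native labels reported by the Hub, such as `UD-Q4_K_M`, still override these generic defaults.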
+
+## Extracting available GGUFs from a repo
+
+When the user asks what GGUFs exist, return:
+
+- filename
+- file size
+- quant label
+- whether it is a main model or an auxiliary projector
+
+Ignore unless requested:
+
+- README
+- BF16 shard files
+- imatrix blobs or calibration artifacts
+
+Use the tree API for this step:
+
+- `https://huggingface.co/api/models/<org>/<repo>/tree/main?recursive=true`
+
+For a repo like `unsloth/Qwen3.6-35B-A3B-GGUF`, the local-app page can show quant chips such as `UD-Q4_K_M`, `UD-Q5_K_M`, `UD-Q6_K`, and `Q8_0`, while the tree API exposes exact file paths such as `Qwen3.6-35B-A3B-UD-Q4_K_M.gguf` and `Qwen3.6-35B-A3B-Q8_0.gguf` with byte sizes. Use the tree API to turn a quant label into an exact filename.
+
+## Search patterns
+
+Use these URL shapes directly:
+
+```text
+https://huggingface.co/models?apps=llama.cpp&sort=trending
+https://huggingface.co/models?search=<query>&apps=llama.cpp&sort=trending
+https://huggingface.co/models?search=<query>&apps=llama.cpp&num_parameters=min:0,max:24B&sort=trending
+https://huggingface.co/<org>/<repo>?local-app=llama.cpp
+https://huggingface.co/api/models/<org>/<repo>/tree/main?recursive=true
+https://huggingface.co/<org>/<repo>/tree/main
+```
+
+## Output format
+
+When answering discovery requests, prefer a compact structured result like:
+
+```text
+Repo: <org>/<repo>
+Recommended quant from HF: <quant label>