docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738)

Broad drift audit against origin/main (b52b63396).

Reference pages (most user-visible drift):
- slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer
  that were missing; drop non-existent /terminal-setup; fix /q footnote
  (resolves to /queue, not /quit); extend CLI-only list with all 24
  CLI-only commands in the registry
- cli-commands: add dedicated sections for hermes curator / fallback /
  hooks (new subcommands not previously documented); remove stale
  hermes honcho standalone section (the plugin registers dynamically
  via hermes memory); list curator/fallback/hooks in top-level table;
  fix completion to include fish
- toolsets-reference: document the real 52-toolset count; split browser
  vs browser-cdp; add discord / discord_admin / spotify / yuanbao;
  correct hermes-cli tool count from 36 to 38; fix misleading claim
  that hermes-homeassistant adds tools (it's identical to hermes-cli)
- tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao,
  2 Discord toolsets; move browser_cdp/browser_dialog to their own
  browser-cdp toolset section
- environment-variables: add 40+ user-facing HERMES_* vars that were
  undocumented (--yolo, --accept-hooks, --ignore-*, inference model
  override, agent/stream/checkpoint timeouts, OAuth trace, per-platform
  batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs,
  gateway restart/connect timeouts); dedupe the Cron Scheduler section;
  replace stale QQ_SANDBOX with QQ_PORTAL_HOST
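Two of the resynced variables can be exercised directly; a `~/.hermes/.env` sketch (only these two names and defaults are confirmed by the diffs in this commit, the values are illustrative):

```bash
# Sketch of ~/.hermes/.env
QQ_PORTAL_HOST=sandbox.q.qq.com                     # replaces the removed QQ_SANDBOX toggle
HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS=2.0   # documented default after this resync
```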

User-guide (top level):
- cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20)
- configuration.md: display.platforms is the canonical per-platform
  override key; tool_progress_overrides is deprecated and auto-migrated
- profiles.md: model.default is the config key, not model.model
- sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8
- checkpoints-and-rollback.md: destructive-command list now matches
  _DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd)
- docker.md: the container runs as non-root hermes (UID 10000) via
  gosu; fix install command (uv pip); add missing --insecure on the
  dashboard compose example (required for non-loopback bind)
- security.md: systemctl danger pattern also matches 'restart'
- index.md: built-in tool count 47 -> 68
- integrations/index.md: 6 STT providers, 8 memory providers
- integrations/providers.md: drop fictional dashscope/qwen aliases

Features:
- overview.md: 9 image models (not 8), 9 TTS providers (not 5),
  8 memory providers (Supermemory was missing)
- tool-gateway.md: 9 image models
- tools.md: extend common-toolsets list with search / messaging /
  spotify / discord / debugging / safe
- fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY
  (lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan,
  tencent-tokenhub, azure-foundry)
- plugins.md: Available Hooks table now includes on_session_finalize,
  on_session_reset, subagent_stop
- built-in-plugins.md: add the 7 bundled plugins the page didn't
  mention (spotify, google_meet, three image_gen providers, two
  dashboard examples)
- web-dashboard.md: add --insecure and --tui flags
- cron.md: hermes cron create takes positional schedule/prompt, not
  flags

Messaging:
- telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when
  TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it
  per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch.
- discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default
  is 2.0, not 0.1
- dingtalk.md: document DINGTALK_REQUIRE_MENTION /
  FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL /
  ALLOW_ALL_USERS that the adapter supports
- bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env
  var; the setting lives in platforms.bluebubbles.extra only
- qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and
  QQ_GROUP_ALLOWED_USERS
- wecom-callback.md: replace 'hermes gateway start' (service-only)
  with 'hermes gateway' for first-time setup

Developer-guide:
- architecture.md: refresh tool/toolset counts (61/52), terminal
  backend count (7), line counts for run_agent.py (~13.7k), cli.py
  (~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py
  (~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform
  adapter count 18 -> 20
- agent-loop.md: run_agent.py line count 10.7k -> 13.7k
- tools-runtime.md: add vercel_sandbox backend
- adding-tools.md: remove stale 'Discovery import added to
  model_tools.py' checklist item (registry auto-discovery)
- adding-platform-adapters.md: mark send_typing / get_chat_info as
  concrete base methods; only connect/disconnect/send are abstract
- acp-internals.md: ACP sessions now persist to SessionDB
  (~/.hermes/state.db); acp.run_agent call uses
  use_unstable_protocol=True
- cron-internals.md: gateway runs scheduler in a dedicated background
  thread via _start_cron_ticker, not on a maintenance cycle; locking
  is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows)
- gateway-internals.md: gateway/run.py ~12k lines
- provider-runtime.md: cron DOES support fallback (run_job reads
  fallback_providers from config)
- session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations
  10 and 11 (trigram FTS, inline-mode FTS5 re-index); add
  api_call_count column to Sessions DDL; document messages_fts_trigram
  and state_meta in the architecture tree
- context-compression-and-caching.md: remove the obsolete 'context
  pressure warnings' section (warnings were removed for causing
  models to give up early)
- context-engine-plugin.md: compress() signature now includes
  focus_topic param
- extending-the-cli.md: _build_tui_layout_children signature now
  includes model_picker_widget; add to default layout

Also fixed three pre-existing broken links/anchors the build warned
about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and
tips#background-tasks, nix-setup.md -> #container-aware-cli).

Regenerated per-skill pages via website/scripts/generate-skill-docs.py
so catalog tables and sidebar are consistent with current SKILL.md
frontmatter.

docusaurus build: clean, no broken links or anchors.
Author: Teknium, 2026-04-29 20:55:59 -07:00 (committed by GitHub)
parent 51b44b6e3f
commit 289cc47631
GPG key ID: B5690EEEBB952194
135 changed files with 4835 additions and 443 deletions


@@ -16,7 +16,7 @@ This safety net is powered by an internal **Checkpoint Manager** that keeps a se
 Checkpoints are taken automatically before:
 - **File tools** — `write_file` and `patch`
-- **Destructive terminal commands** — `rm`, `mv`, `sed -i`, `truncate`, `shred`, output redirects (`>`), and `git reset`/`clean`/`checkout`
+- **Destructive terminal commands** — `rm`, `rmdir`, `cp`, `install`, `mv`, `sed -i`, `truncate`, `dd`, `shred`, output redirects (`>`), and `git reset`/`clean`/`checkout`
 The agent creates **at most one checkpoint per directory per turn**, so long-running sessions don't spam snapshots.


@@ -358,7 +358,7 @@ auxiliary:
   model: "google/gemini-3-flash-preview" # Model used for summarization
 ```
-When compression triggers, middle turns are summarized while the first 3 and last 4 turns are always preserved.
+When compression triggers, middle turns are summarized while the first 3 and last 20 turns are always preserved.
 ## Background Sessions


@@ -1152,7 +1152,8 @@ This controls both the `text_to_speech` tool and spoken replies in voice mode (`
 display:
   tool_progress: all # off | new | all | verbose
   tool_progress_command: false # Enable /verbose slash command in messaging gateway
-  tool_progress_overrides: {} # Per-platform overrides (see below)
+  platforms: {} # Per-platform display overrides (see below)
+  tool_progress_overrides: {} # DEPRECATED — use display.platforms instead
   interim_assistant_messages: true # Gateway: send natural mid-turn assistant updates as separate messages
   skin: default # Built-in or custom CLI skin (see user-guide/features/skins)
   personality: "kawaii" # Legacy cosmetic field still surfaced in some summaries
@@ -1194,18 +1195,21 @@ Only the **final** message of a turn gets the footer; interim updates stay clean
 ### Per-platform progress overrides
-Different platforms have different verbosity needs. For example, Signal can't edit messages, so each progress update becomes a separate message — noisy. Use `tool_progress_overrides` to set per-platform modes:
+Different platforms have different verbosity needs. For example, Signal can't edit messages, so each progress update becomes a separate message — noisy. Use `display.platforms` to set per-platform modes:
 ```yaml
 display:
   tool_progress: all # global default
-  tool_progress_overrides:
-    signal: 'off' # silence progress on Signal
-    telegram: verbose # detailed progress on Telegram
-    slack: 'off' # quiet in shared Slack workspace
+  platforms:
+    signal:
+      tool_progress: 'off' # silence progress on Signal
+    telegram:
+      tool_progress: verbose # detailed progress on Telegram
+    slack:
+      tool_progress: 'off' # quiet in shared Slack workspace
 ```
-Platforms without an override fall back to the global `tool_progress` value. Valid platform keys: `telegram`, `discord`, `slack`, `signal`, `whatsapp`, `matrix`, `mattermost`, `email`, `sms`, `homeassistant`, `dingtalk`, `feishu`, `wecom`, `weixin`, `bluebubbles`, `qqbot`.
+Platforms without an override fall back to the global `tool_progress` value. Valid platform keys: `telegram`, `discord`, `slack`, `signal`, `whatsapp`, `matrix`, `mattermost`, `email`, `sms`, `homeassistant`, `dingtalk`, `feishu`, `wecom`, `weixin`, `bluebubbles`, `qqbot`. The legacy `display.tool_progress_overrides` key still loads for backward compatibility but is deprecated and migrated into `display.platforms` on first load.
 `interim_assistant_messages` is gateway-only. When enabled, Hermes sends completed mid-turn assistant updates as separate chat messages. This is independent from `tool_progress` and does not require gateway streaming.


@@ -39,7 +39,7 @@ docker run -d \
   nousresearch/hermes-agent gateway run
 ```
-Port 8642 exposes the gateway's [OpenAI-compatible API server](./api-server.md) and health endpoint. It's optional if you only use chat platforms (Telegram, Discord, etc.), but required if you want the dashboard or external tools to reach the gateway.
+Port 8642 exposes the gateway's [OpenAI-compatible API server](./features/api-server.md) and health endpoint. It's optional if you only use chat platforms (Telegram, Discord, etc.), but required if you want the dashboard or external tools to reach the gateway.
 Opening any port on an internet facing machine is a security risk. You should not do it unless you understand the risks.
@@ -208,7 +208,7 @@ services:
     image: nousresearch/hermes-agent:latest
     container_name: hermes-dashboard
     restart: unless-stopped
-    command: dashboard --host 0.0.0.0
+    command: dashboard --host 0.0.0.0 --insecure
     ports:
       - "9119:9119"
     volumes:
@@ -259,10 +259,10 @@ docker run -d \
 The official image is based on `debian:13.4` and includes:
-- Python 3 with all Hermes dependencies (`pip install -e ".[all]"`)
+- Python 3 with all Hermes dependencies (`uv pip install -e ".[all]"`)
 - Node.js + npm (for browser automation and WhatsApp bridge)
-- Playwright with Chromium (`npx playwright install --with-deps chromium`)
-- ripgrep and ffmpeg as system utilities
+- Playwright with Chromium (`npx playwright install --with-deps chromium --only-shell`)
+- ripgrep, ffmpeg, git, and tini as system utilities
 - **`docker-cli`** — so agents running inside the container can drive the host's Docker daemon (bind-mount `/var/run/docker.sock` to opt in) for `docker build`, `docker run`, container inspection, etc.
 - **`openssh-client`** — enables the [SSH terminal backend](/docs/user-guide/configuration#ssh-backend) from inside the container. The SSH backend shells out to the system `ssh` binary; without this, it failed silently in containerized installs.
 - The WhatsApp bridge (`scripts/whatsapp-bridge/`)
@@ -312,7 +312,7 @@ Check logs: `docker logs hermes`. Common causes:
 ### "Permission denied" errors
-The container runs as root by default. If your host `~/.hermes/` was created by a non-root user, permissions should work. If you get errors, ensure the data directory is writable:
+The container's entrypoint drops privileges to the non-root `hermes` user (UID 10000) via `gosu`. If your host `~/.hermes/` is owned by a different UID, set `HERMES_UID`/`HERMES_GID` to match your host user, or ensure the data directory is writable:
 ```sh
 chmod -R 755 ~/.hermes
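Putting the new UID guidance into a command: a hypothetical run that aligns the container's `hermes` user with the host owner of `~/.hermes`. The docs above name `HERMES_UID`/`HERMES_GID` but not the mechanism; passing them as container environment variables and the in-container mount path are assumptions, not confirmed by this commit.

```sh
# Hypothetical sketch — -e mechanism and mount target are assumptions
docker run -d \
  -e HERMES_UID="$(id -u)" \
  -e HERMES_GID="$(id -g)" \
  -v ~/.hermes:/home/hermes/.hermes \
  nousresearch/hermes-agent gateway run
```

Matching the UID up front avoids the blunt `chmod -R 755` fallback shown above.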


@@ -51,6 +51,22 @@ hermes plugins disable disk-cleanup
 ## Currently shipped
 The repo ships these bundled plugins under `plugins/`. All are opt-in — enable them via `hermes plugins enable <name>`.
+| Plugin | Kind | Purpose |
+|---|---|---|
+| `disk-cleanup` | hooks + slash command | Auto-track ephemeral files and clean them on session end |
+| `observability/langfuse` | hooks | Trace turns / LLM calls / tools to [Langfuse](https://langfuse.com) |
+| `spotify` | backend (7 tools) | Native Spotify playback, queue, search, playlists, albums, library |
+| `google_meet` | standalone | Join Meet calls, live-caption transcription, optional realtime duplex audio |
+| `image_gen/openai` | image backend | OpenAI `gpt-image-2` image generation backend (alternative to FAL) |
+| `image_gen/openai-codex` | image backend | OpenAI image generation via Codex OAuth |
+| `image_gen/xai` | image backend | xAI `grok-2-image` backend |
+| `example-dashboard` | dashboard example | Reference dashboard plugin for [Extending the Dashboard](./extending-the-dashboard.md) |
+| `strike-freedom-cockpit` | dashboard skin | Sample custom dashboard skin |
+Memory providers (`plugins/memory/*`) and context engines (`plugins/context_engine/*`) are listed separately on [Memory Providers](./memory-providers.md) — they're managed through `hermes memory` and `hermes plugins` respectively. The full per-plugin detail for the two long-running hooks-based plugins follows.
 ### disk-cleanup
 Auto-tracks and removes ephemeral files created during sessions — test scripts, temp outputs, cron logs, stale chrome profiles — without requiring the agent to remember to call a tool.


@@ -91,10 +91,10 @@ This is useful when you want a scheduled agent to inherit reusable workflows wit
 Cron jobs default to running detached from any repo — no `AGENTS.md`, `CLAUDE.md`, or `.cursorrules` is loaded, and the terminal / file / code-exec tools run from whatever working directory the gateway started in. Pass `--workdir` (CLI) or `workdir=` (tool call) to change that:
 ```bash
-# Standalone CLI
-hermes cron create --schedule "every 1d at 09:00" \
-  --workdir /home/me/projects/acme \
-  --prompt "Audit open PRs, summarize CI health, and post to #eng"
+# Standalone CLI (schedule and prompt are positional)
+hermes cron create "every 1d at 09:00" \
+  "Audit open PRs, summarize CI health, and post to #eng" \
+  --workdir /home/me/projects/acme
 ```
 ```python


@@ -74,6 +74,12 @@ Both `provider` and `model` are **required**. If either is missing, the fallback
 | Arcee AI | `arcee` | `ARCEEAI_API_KEY` |
 | GMI Cloud | `gmi` | `GMI_API_KEY` |
 | Alibaba / DashScope | `alibaba` | `DASHSCOPE_API_KEY` |
+| Alibaba Coding Plan | `alibaba-coding-plan` | `ALIBABA_CODING_PLAN_API_KEY` (falls back to `DASHSCOPE_API_KEY`) |
+| Kimi / Moonshot (China) | `kimi-coding-cn` | `KIMI_CN_API_KEY` |
+| StepFun | `stepfun` | `STEPFUN_API_KEY` |
+| Tencent TokenHub | `tencent-tokenhub` | `TOKENHUB_API_KEY` |
+| Azure AI Foundry | `azure-foundry` | `AZURE_FOUNDRY_API_KEY` + `AZURE_FOUNDRY_BASE_URL` |
+| LM Studio (local) | `lmstudio` | `LM_API_KEY` (or none for local) + `LM_BASE_URL` |
 | Hugging Face | `huggingface` | `HF_TOKEN` |
 | Custom endpoint | `custom` | `base_url` + `key_env` (see below) |
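Since each fallback entry requires both `provider` and `model`, a minimal config sketch using two of the newly documented slugs might look as follows (the `fallback_providers` key name comes from the provider-runtime note in this commit; the model names are placeholders, not real identifiers):

```yaml
# Sketch of a fallback chain in ~/.hermes/config.yaml — model names are placeholders
fallback_providers:
  - provider: lmstudio          # local endpoint, reachable via LM_BASE_URL
    model: my-local-model       # placeholder
  - provider: stepfun
    model: some-stepfun-model   # placeholder
```

Entries are tried in order; a missing `provider` or `model` disables that entry per the note above.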


@@ -30,8 +30,8 @@ Hermes Agent includes a rich set of capabilities that extend far beyond basic ch
 - **[Voice Mode](voice-mode.md)** — Full voice interaction across CLI and messaging platforms. Talk to the agent using your microphone, hear spoken replies, and have live voice conversations in Discord voice channels.
 - **[Browser Automation](browser.md)** — Full browser automation with multiple backends: Browserbase cloud, Browser Use cloud, local Chrome via CDP, or local Chromium. Navigate websites, fill forms, and extract information.
 - **[Vision & Image Paste](vision.md)** — Multimodal vision support. Paste images from your clipboard into the CLI and ask the agent to analyze, describe, or work with them using any vision-capable model.
-- **[Image Generation](image-generation.md)** — Generate images from text prompts using FAL.ai. Eight models supported (FLUX 2 Klein/Pro, GPT-Image 1.5, Nano Banana Pro, Ideogram V3, Recraft V4 Pro, Qwen, Z-Image Turbo); pick one via `hermes tools`.
-- **[Voice & TTS](tts.md)** — Text-to-speech output and voice message transcription across all messaging platforms, with five provider options: Edge TTS (free), ElevenLabs, OpenAI TTS, MiniMax, and NeuTTS.
+- **[Image Generation](image-generation.md)** — Generate images from text prompts using FAL.ai. Nine models supported (FLUX 2 Klein/Pro, GPT-Image 1.5/2, Nano Banana Pro, Ideogram V3, Recraft V4 Pro, Qwen, Z-Image Turbo); pick one via `hermes tools`.
+- **[Voice & TTS](tts.md)** — Text-to-speech output and voice message transcription across all messaging platforms, with nine provider options: Edge TTS (free), ElevenLabs, OpenAI TTS, MiniMax, Mistral Voxtral, Google Gemini, xAI, NeuTTS, and KittenTTS.
 ## Integrations
@@ -39,7 +39,7 @@ Hermes Agent includes a rich set of capabilities that extend far beyond basic ch
 - **[Provider Routing](provider-routing.md)** — Fine-grained control over which AI providers handle your requests. Optimize for cost, speed, or quality with sorting, whitelists, blacklists, and priority ordering.
 - **[Fallback Providers](fallback-providers.md)** — Automatic failover to backup LLM providers when your primary model encounters errors, including independent fallback for auxiliary tasks like vision and compression.
 - **[Credential Pools](credential-pools.md)** — Distribute API calls across multiple keys for the same provider. Automatic rotation on rate limits or failures.
-- **[Memory Providers](memory-providers.md)** — Plug in external memory backends (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover) for cross-session user modeling and personalization beyond the built-in memory system.
+- **[Memory Providers](memory-providers.md)** — Plug in external memory backends (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover, Supermemory) for cross-session user modeling and personalization beyond the built-in memory system.
 - **[API Server](api-server.md)** — Expose Hermes as an OpenAI-compatible HTTP endpoint. Connect any frontend that speaks the OpenAI format — Open WebUI, LobeChat, LibreChat, and more.
 - **[IDE Integration (ACP)](acp.md)** — Use Hermes inside ACP-compatible editors such as VS Code, Zed, and JetBrains. Chat, tool activity, file diffs, and terminal commands render inside your editor.
 - **[RL Training](rl-training.md)** — Generate trajectory data from agent sessions for reinforcement learning and model fine-tuning.


@@ -142,6 +142,9 @@ Plugins can register callbacks for these lifecycle events. See the **[Event Hook
 | [`post_llm_call`](/docs/user-guide/features/hooks#post_llm_call) | Once per turn, after the LLM loop (successful turns only) |
 | [`on_session_start`](/docs/user-guide/features/hooks#on_session_start) | New session created (first turn only) |
 | [`on_session_end`](/docs/user-guide/features/hooks#on_session_end) | End of every `run_conversation` call + CLI exit handler |
+| [`on_session_finalize`](/docs/user-guide/features/hooks#on_session_finalize) | CLI/gateway tears down an active session (`/new`, GC, CLI quit) |
+| [`on_session_reset`](/docs/user-guide/features/hooks#on_session_reset) | Gateway swaps in a new session key (`/new`, `/reset`, `/clear`, idle rotation) |
+| [`subagent_stop`](/docs/user-guide/features/hooks#subagent_stop) | Once per child after `delegate_task` finishes |
 | [`pre_gateway_dispatch`](/docs/user-guide/features/hooks#pre_gateway_dispatch) | Gateway received a user message, before auth + dispatch. Return `{"action": "skip" \| "rewrite" \| "allow", ...}` to influence flow. |
 ## Plugin types


@@ -18,7 +18,7 @@ The **Tool Gateway** lets paid [Nous Portal](https://portal.nousresearch.com) su
 | Tool | What It Does | Direct Alternative |
 |------|--------------|--------------------|
 | **Web search & extract** | Search the web and extract page content via Firecrawl | `FIRECRAWL_API_KEY`, `EXA_API_KEY`, `PARALLEL_API_KEY`, `TAVILY_API_KEY` |
-| **Image generation** | Generate images via FAL (8 models: FLUX 2 Klein/Pro, GPT-Image, Nano Banana Pro, Ideogram, Recraft V4 Pro, Qwen, Z-Image) | `FAL_KEY` |
+| **Image generation** | Generate images via FAL (9 models: FLUX 2 Klein/Pro, GPT-Image 1.5/2, Nano Banana Pro, Ideogram V3, Recraft V4 Pro, Qwen, Z-Image Turbo) | `FAL_KEY` |
 | **Text-to-speech** | Convert text to speech via OpenAI TTS | `VOICE_TOOLS_OPENAI_KEY`, `ELEVENLABS_API_KEY` |
 | **Browser automation** | Control cloud browsers via Browser Use | `BROWSER_USE_API_KEY`, `BROWSERBASE_API_KEY` |


@@ -48,7 +48,7 @@ hermes tools
 hermes tools
 ```
-Common toolsets include `web`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `moa`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, `homeassistant`, and `rl`.
+Common toolsets include `web`, `search`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `moa`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, `homeassistant`, `messaging`, `spotify`, `discord`, `discord_admin`, `debugging`, `safe`, and `rl`.
 See [Toolsets Reference](/docs/reference/toolsets-reference) for the full set, including platform presets such as `hermes-cli`, `hermes-telegram`, and dynamic MCP toolsets like `mcp-<server>`.


@@ -23,6 +23,8 @@ This starts a local web server and opens `http://127.0.0.1:9119` in your browser
 | `--port` | `9119` | Port to run the web server on |
 | `--host` | `127.0.0.1` | Bind address |
 | `--no-open` | — | Don't auto-open the browser |
+| `--insecure` | off | Allow binding to non-localhost hosts (**DANGEROUS** — exposes API keys on the network; pair with a firewall and strong auth) |
+| `--tui` | off | Expose the in-browser Chat tab (embedded `hermes --tui` via PTY/WebSocket). Alternatively set `HERMES_DASHBOARD_TUI=1`. |
 ```bash
 # Custom port


@@ -90,7 +90,8 @@ Hermes → BlueBubbles REST API → Messages.app → iMessage
 | `BLUEBUBBLES_HOME_CHANNEL` | No | — | Phone/email for cron delivery |
 | `BLUEBUBBLES_ALLOWED_USERS` | No | — | Comma-separated authorized users |
 | `BLUEBUBBLES_ALLOW_ALL_USERS` | No | `false` | Allow all users |
-| `BLUEBUBBLES_SEND_READ_RECEIPTS` | No | `true` | Auto-mark messages as read |
+Auto-marking messages as read is controlled by the `send_read_receipts` key under `platforms.bluebubbles.extra` in `~/.hermes/config.yaml` (default: `true`). There is no corresponding environment variable.
 ## Features
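Because the read-receipt toggle now lives only in config, a minimal `~/.hermes/config.yaml` sketch for the key named above:

```yaml
# Disable auto read receipts — no BLUEBUBBLES_* env var exists for this
platforms:
  bluebubbles:
    extra:
      send_read_receipts: false   # default: true
```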


@@ -123,6 +123,13 @@ DINGTALK_ALLOWED_USERS=user-id-1
 # Multiple allowed users (comma-separated)
 # DINGTALK_ALLOWED_USERS=user-id-1,user-id-2
+# Optional: group-chat gating (mirrors Slack/Telegram/Discord/WhatsApp)
+# DINGTALK_REQUIRE_MENTION=true
+# DINGTALK_FREE_RESPONSE_CHATS=cidABC==,cidDEF==
+# DINGTALK_MENTION_PATTERNS=^小马
+# DINGTALK_HOME_CHANNEL=cidXXXX==
+# DINGTALK_ALLOW_ALL_USERS=true
 ```
 Optional behavior settings in `~/.hermes/config.yaml`:


@@ -292,7 +292,7 @@ Discord behavior is controlled through two files: **`~/.hermes/.env`** for crede
 | `DISCORD_ALLOW_MENTION_REPLIED_USER` | No | `true` | When `true` (default), replying to a message pings the original author. |
 | `DISCORD_PROXY` | No | — | Proxy URL for Discord connections (HTTP, WebSocket, REST). Overrides `HTTPS_PROXY`/`ALL_PROXY`. Supports `http://`, `https://`, and `socks5://` schemes. |
 | `HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS` | No | `0.6` | Grace window the adapter waits before flushing a queued text chunk. Useful for smoothing streamed output. |
-| `HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS` | No | `0.1` | Delay between split chunks when a single message exceeds Discord's length limit. |
+| `HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS` | No | `2.0` | Delay between split chunks when a single message exceeds Discord's length limit. |
 ### Config File (`config.yaml`)


@@ -51,8 +51,9 @@ QQ_CLIENT_SECRET=your-app-secret
 | `QQBOT_HOME_CHANNEL` | OpenID for cron/notification delivery | — |
 | `QQBOT_HOME_CHANNEL_NAME` | Display name for home channel | `Home` |
 | `QQ_ALLOWED_USERS` | Comma-separated user OpenIDs for DM access | open (all users) |
+| `QQ_GROUP_ALLOWED_USERS` | Comma-separated group OpenIDs for group access | — |
 | `QQ_ALLOW_ALL_USERS` | Set to `true` to allow all DMs | `false` |
-| `QQ_SANDBOX` | Route requests to the QQ sandbox gateway for development testing | `false` |
+| `QQ_PORTAL_HOST` | Override the QQ portal host (set to `sandbox.q.qq.com` for sandbox routing) | `q.qq.com` |
 | `QQ_STT_API_KEY` | API key for voice-to-text provider | — |
 | `QQ_STT_BASE_URL` | Base URL for STT provider | `https://open.bigmodel.cn/api/coding/paas/v4` |
 | `QQ_STT_MODEL` | STT model name | `glm-asr` |


@@ -179,15 +179,15 @@ Add the following to `~/.hermes/.env`:
 ```bash
 TELEGRAM_WEBHOOK_URL=https://my-app.fly.dev/telegram
+TELEGRAM_WEBHOOK_SECRET="$(openssl rand -hex 32)" # required
 # TELEGRAM_WEBHOOK_PORT=8443 # optional, default 8443
-# TELEGRAM_WEBHOOK_SECRET=mysecret # optional, recommended
 ```
 | Variable | Required | Description |
 |----------|----------|-------------|
 | `TELEGRAM_WEBHOOK_URL` | Yes | Public HTTPS URL where Telegram will send updates. The URL path is auto-extracted (e.g., `/telegram` from the example above). |
+| `TELEGRAM_WEBHOOK_SECRET` | **Yes** (when `TELEGRAM_WEBHOOK_URL` is set) | Secret token that Telegram echoes in every webhook request for verification. The gateway refuses to start without it — see [GHSA-3vpc-7q5r-276h](https://github.com/NousResearch/hermes-agent/security/advisories/GHSA-3vpc-7q5r-276h). Generate with `openssl rand -hex 32`. |
 | `TELEGRAM_WEBHOOK_PORT` | No | Local port the webhook server listens on (default: `8443`). |
-| `TELEGRAM_WEBHOOK_SECRET` | No | Secret token for verifying that updates actually come from Telegram. **Strongly recommended** for production deployments. |
 When `TELEGRAM_WEBHOOK_URL` is set, the gateway starts an HTTP webhook server instead of polling. When unset, polling mode is used — no behavior change from previous versions.


@@ -60,9 +60,11 @@ WECOM_CALLBACK_ALLOWED_USERS=user1,user2
 ### 3. Start the Gateway
 ```bash
-hermes gateway start
+hermes gateway
 ```
+(Use `hermes gateway start` only after `hermes gateway install` has registered the systemd/launchd service.)
 The callback adapter starts an HTTP server on the configured port. WeCom will verify the callback URL via a GET request, then begin sending messages via POST.
## Configuration Reference


@@ -337,5 +337,5 @@ hermes chat -q "Send 'Hello from CLI' to yuanbao:group:group_code"
 - [Messaging Gateway Overview](./index.md)
 - [Slash Commands Reference](/docs/reference/slash-commands.md)
-- [Cron Jobs](/docs/user-guide/features/cron-jobs.md)
-- [Background Tasks](/docs/guides/tips.md#background-tasks)
+- [Cron Jobs](/docs/user-guide/features/cron.md)
+- [Background Sessions](/docs/user-guide/cli#background-sessions)


@@ -70,7 +70,7 @@ coder setup # configure coder's settings
 coder gateway start # start coder's gateway
 coder doctor # check coder's health
 coder skills list # list coder's skills
-coder config set model.model anthropic/claude-sonnet-4
+coder config set model.default anthropic/claude-sonnet-4
 ```
 The alias works with every hermes subcommand — it's just `hermes -p <name>` under the hood.
@@ -173,7 +173,7 @@ Each profile has its own:
 - **`SOUL.md`** — personality and instructions
 ```bash
-coder config set model.model anthropic/claude-sonnet-4
+coder config set model.default anthropic/claude-sonnet-4
 echo "You are a focused coding assistant." > ~/.hermes/profiles/coder/SOUL.md
```


@@ -119,7 +119,7 @@ The following patterns trigger approval prompts (defined in `tools/approval.py`)
 | `DELETE FROM` (without WHERE) | SQL DELETE without WHERE |
 | `TRUNCATE TABLE` | SQL TRUNCATE |
 | `> /etc/` | Overwrite system config |
-| `systemctl stop/disable/mask` | Stop/disable system services |
+| `systemctl stop/restart/disable/mask` | Stop/restart/disable system services |
 | `kill -9 -1` | Kill all processes |
 | `pkill -9` | Force kill processes |
 | Fork bomb patterns | Fork bombs |


@@ -124,7 +124,7 @@ display:
 ```
 :::tip
-Session IDs follow the format `YYYYMMDD_HHMMSS_<8-char-hex>`, e.g. `20250305_091523_a1b2c3d4`. You can resume by ID or by title — both work with `-c` and `-r`.
+Session IDs follow the format `YYYYMMDD_HHMMSS_<hex>` — CLI/TUI sessions use a 6-char hex suffix (e.g. `20250305_091523_a1b2c3`), gateway sessions use an 8-char suffix (e.g. `20250305_091523_a1b2c3d4`). You can resume by ID (full or unique prefix) or by title — both work with `-c` and `-r`.
 :::
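Spelling out the resume-by-prefix behavior described above as commands (a sketch; the ID is illustrative, and the only requirement stated in the docs is that a prefix be unique among stored sessions):

```bash
hermes -r 20250305_091523_a1b2c3   # full CLI/TUI ID (6-char hex suffix)
hermes -r 20250305_0915            # any unique prefix also resolves
```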
## Session Naming


@@ -1,14 +1,14 @@
 ---
-title: "Apple Notes — Manage Apple Notes via the memo CLI on macOS (create, view, search, edit)"
+title: "Apple Notes — Manage Apple Notes via memo CLI: create, search, edit"
 sidebar_label: "Apple Notes"
-description: "Manage Apple Notes via the memo CLI on macOS (create, view, search, edit)"
+description: "Manage Apple Notes via memo CLI: create, search, edit"
 ---
 {/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
 # Apple Notes
-Manage Apple Notes via the memo CLI on macOS (create, view, search, edit).
+Manage Apple Notes via memo CLI: create, search, edit.
 ## Skill metadata


@@ -1,14 +1,14 @@
 ---
-title: "Apple Reminders — Manage Apple Reminders via remindctl CLI (list, add, complete, delete)"
+title: "Apple Reminders — Apple Reminders via remindctl: add, list, complete"
 sidebar_label: "Apple Reminders"
-description: "Manage Apple Reminders via remindctl CLI (list, add, complete, delete)"
+description: "Apple Reminders via remindctl: add, list, complete"
 ---
 {/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
 # Apple Reminders
-Manage Apple Reminders via remindctl CLI (list, add, complete, delete).
+Apple Reminders via remindctl: add, list, complete.
 ## Skill metadata


@@ -1,14 +1,14 @@
 ---
-title: "Findmy — Track Apple devices and AirTags via FindMy"
+title: "Findmy — Track Apple devices/AirTags via FindMy"
 sidebar_label: "Findmy"
-description: "Track Apple devices and AirTags via FindMy"
+description: "Track Apple devices/AirTags via FindMy"
 ---
 {/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
 # Findmy
-Track Apple devices and AirTags via FindMy.app on macOS using AppleScript and screen capture.
+Track Apple devices/AirTags via FindMy.app on macOS.
 ## Skill metadata


@@ -1,14 +1,14 @@
 ---
-title: "Claude Code — Delegate coding tasks to Claude Code (Anthropic's CLI agent)"
+title: "Claude Code — Delegate coding to Claude Code CLI (features, PRs)"
 sidebar_label: "Claude Code"
-description: "Delegate coding tasks to Claude Code (Anthropic's CLI agent)"
+description: "Delegate coding to Claude Code CLI (features, PRs)"
 ---
 {/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
 # Claude Code
-Delegate coding tasks to Claude Code (Anthropic's CLI agent). Use for building features, refactoring, PR reviews, and iterative coding. Requires the claude CLI installed.
+Delegate coding to Claude Code CLI (features, PRs).
 ## Skill metadata


@ -1,14 +1,14 @@
---
title: "Codex — Delegate coding tasks to OpenAI Codex CLI agent"
title: "Codex — Delegate coding to OpenAI Codex CLI (features, PRs)"
sidebar_label: "Codex"
description: "Delegate coding tasks to OpenAI Codex CLI agent"
description: "Delegate coding to OpenAI Codex CLI (features, PRs)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Codex
Delegate coding tasks to OpenAI Codex CLI agent. Use for building features, refactoring, PR reviews, and batch issue fixing. Requires the codex CLI and a git repository.
Delegate coding to OpenAI Codex CLI (features, PRs).
## Skill metadata
@ -32,6 +32,15 @@ The following is the complete skill definition that Hermes loads when this skill
Delegate coding tasks to [Codex](https://github.com/openai/codex) via the Hermes terminal. Codex is OpenAI's autonomous coding agent CLI.
## When to use
- Building features
- Refactoring
- PR reviews
- Batch issue fixing
Requires the codex CLI and a git repository.
## Prerequisites
- Codex installed: `npm install -g @openai/codex`


@ -1,14 +1,14 @@
---
title: "Hermes Agent"
title: "Hermes Agent — Configure, extend, or contribute to Hermes Agent"
sidebar_label: "Hermes Agent"
description: "Complete guide to using and extending Hermes Agent — CLI usage, setup, configuration, spawning additional agents, gateway platforms, skills, voice, tools, pr..."
description: "Configure, extend, or contribute to Hermes Agent"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Hermes Agent
Complete guide to using and extending Hermes Agent — CLI usage, setup, configuration, spawning additional agents, gateway platforms, skills, voice, tools, profiles, and a concise contributor reference. Load this skill when helping users configure Hermes, troubleshoot issues, spawn agent instances, or make code contributions.
Configure, extend, or contribute to Hermes Agent.
## Skill metadata
@ -132,7 +132,7 @@ hermes tools disable NAME Disable a toolset
hermes skills list List installed skills
hermes skills search QUERY Search the skills hub
hermes skills install ID Install a skill
hermes skills install ID Install a skill (ID can be a hub identifier OR a direct https://…/SKILL.md URL; pass --name to override when frontmatter has no name)
hermes skills inspect ID Preview without installing
hermes skills config Enable/disable skills per platform
hermes skills check Check for updates
@ -419,6 +419,63 @@ Tool changes take effect on `/reset` (new session). They do NOT apply mid-conver
---
## Security & Privacy Toggles
Common "why is Hermes doing X to my output / tool calls / commands?" toggles — and the exact commands to change them. Most of these need a fresh session (`/reset` in chat, or start a new `hermes` invocation) because they're read once at startup.
### Secret redaction in tool output
Secret redaction is **off by default** — tool output (terminal stdout, `read_file`, web content, subagent summaries, etc.) passes through unmodified. If the user wants Hermes to auto-mask strings that look like API keys, tokens, and secrets before they enter the conversation context and logs:
```bash
hermes config set security.redact_secrets true # enable globally
```
**Restart required.** `security.redact_secrets` is snapshotted at import time — toggling it mid-session (e.g. via `export HERMES_REDACT_SECRETS=true` from a tool call) will NOT take effect for the running process. Tell the user to run `hermes config set security.redact_secrets true` in a terminal, then start a new session. This is deliberate — it prevents an LLM from flipping the toggle on itself mid-task.
Disable again with:
```bash
hermes config set security.redact_secrets false
```
### PII redaction in gateway messages
Separate from secret redaction. When enabled, the gateway hashes user IDs and strips phone numbers from the session context before it reaches the model:
```bash
hermes config set privacy.redact_pii true # enable
hermes config set privacy.redact_pii false # disable (default)
```
### Command approval prompts
By default (`approvals.mode: manual`), Hermes prompts the user before running shell commands flagged as destructive (`rm -rf`, `git reset --hard`, etc.). The modes are:
- `manual` — always prompt (default)
- `smart` — use an auxiliary LLM to auto-approve low-risk commands, prompt on high-risk
- `off` — skip all approval prompts (equivalent to `--yolo`)
```bash
hermes config set approvals.mode smart # recommended middle ground
hermes config set approvals.mode off # bypass everything (not recommended)
```
Per-invocation bypass without changing config:
- `hermes --yolo …`
- `export HERMES_YOLO_MODE=1`
Note: YOLO / `approvals.mode: off` does NOT turn off secret redaction. They are independent.
### Shell hooks allowlist
Some shell-hook integrations require explicit allowlisting before they fire. Managed via `~/.hermes/shell-hooks-allowlist.json` — prompted interactively the first time a hook wants to run.
### Disabling the web/browser/image-gen tools
To keep the model away from network or media tools entirely, open `hermes tools` and toggle per-platform. Takes effect on next session (`/reset`). See the Tools & Skills section above.
---
## Voice & Transcription
### STT (Voice → Text)
@ -617,6 +674,7 @@ For occasional contributors and PR authors. Full developer docs: https://hermes-
### Project Layout
<!-- ascii-guard-ignore -->
```
hermes-agent/
├── run_agent.py # AIAgent — core conversation loop
@ -637,6 +695,7 @@ hermes-agent/
├── tests/ # ~3000 pytest tests
└── website/ # Docusaurus docs site
```
<!-- ascii-guard-ignore-end -->
Config: `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys).


@ -1,14 +1,14 @@
---
title: "Opencode"
title: "Opencode — Delegate coding to OpenCode CLI (features, PR review)"
sidebar_label: "Opencode"
description: "Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions"
description: "Delegate coding to OpenCode CLI (features, PR review)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Opencode
Delegate coding tasks to OpenCode CLI agent for feature implementation, refactoring, PR review, and long-running autonomous sessions. Requires the opencode CLI installed and authenticated.
Delegate coding to OpenCode CLI (features, PR review).
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Architecture Diagram"
title: "Architecture Diagram — Dark-themed SVG architecture/cloud/infra diagrams as HTML"
sidebar_label: "Architecture Diagram"
description: "Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics"
description: "Dark-themed SVG architecture/cloud/infra diagrams as HTML"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Architecture Diagram
Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics. Semantic component colors (cyan=frontend, emerald=backend, violet=database, amber=cloud/AWS, rose=security, orange=message bus), JetBrains Mono font, grid background. Best suited for software architecture, cloud/VPC topology, microservice maps, service-mesh diagrams, database + API layer diagrams, security groups, message buses — anything that fits a tech-infra deck with a dark aesthetic. If a more specialized diagramming skill exists for the subject (scientific, educational, hand-drawn, animated, etc.), prefer that — otherwise this skill can also serve as a general-purpose SVG diagram fallback. Based on Cocoon AI's architecture-diagram-generator (MIT).
Dark-themed SVG architecture/cloud/infra diagrams as HTML.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Ascii Art"
title: "Ascii Art — ASCII art: pyfiglet, cowsay, boxes, image-to-ascii"
sidebar_label: "Ascii Art"
description: "Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii"
description: "ASCII art: pyfiglet, cowsay, boxes, image-to-ascii"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Ascii Art
Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required.
ASCII art: pyfiglet, cowsay, boxes, image-to-ascii.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Ascii Video — Production pipeline for ASCII art video — any format"
title: "Ascii Video — ASCII video: convert video/audio to colored ASCII MP4/GIF"
sidebar_label: "Ascii Video"
description: "Production pipeline for ASCII art video — any format"
description: "ASCII video: convert video/audio to colored ASCII MP4/GIF"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Ascii Video
Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering. Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output.
ASCII video: convert video/audio to colored ASCII MP4/GIF.
## Skill metadata
@ -25,6 +25,14 @@ The following is the complete skill definition that Hermes loads when this skill
# ASCII Video Production Pipeline
## When to use
Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output.
## What's inside
Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering.
## Creative Standard
This is visual art. ASCII characters are the medium; cinema is the standard.


@ -1,14 +1,14 @@
---
title: "Baoyu Comic — Knowledge comic creator supporting multiple art styles and tones"
title: "Baoyu Comic — Knowledge comics (知识漫画): educational, biography, tutorial"
sidebar_label: "Baoyu Comic"
description: "Knowledge comic creator supporting multiple art styles and tones"
description: "Knowledge comics (知识漫画): educational, biography, tutorial"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Baoyu Comic
Knowledge comic creator supporting multiple art styles and tones. Creates original educational comics with detailed panel layouts and sequential image generation. Use when user asks to create "知识漫画", "教育漫画", "biography comic", "tutorial comic", or "Logicomix-style comic".
Knowledge comics (知识漫画): educational, biography, tutorial.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Baoyu Infographic — Generate professional infographics with 21 layout types and 21 visual styles"
title: "Baoyu Infographic — Infographics: 21 layouts x 21 styles (信息图, 可视化)"
sidebar_label: "Baoyu Infographic"
description: "Generate professional infographics with 21 layout types and 21 visual styles"
description: "Infographics: 21 layouts x 21 styles (信息图, 可视化)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Baoyu Infographic
Generate professional infographics with 21 layout types and 21 visual styles. Analyzes content, recommends layout×style combinations, and generates publication-ready infographics. Use when user asks to create "infographic", "visual summary", "信息图", "可视化", or "高密度信息大图".
Infographics: 21 layouts x 21 styles (信息图, 可视化).
## Skill metadata
@ -139,6 +139,7 @@ If a shortcut has **Prompt Notes**, append them to the generated prompt (Step 5)
## Output Structure
<!-- ascii-guard-ignore -->
```
infographic/{topic-slug}/
├── source-{slug}.{ext}
@ -147,6 +148,7 @@ infographic/{topic-slug}/
├── prompts/infographic.md
└── infographic.png
```
<!-- ascii-guard-ignore-end -->
Slug: 2-4 words kebab-case from topic. Conflict: append `-YYYYMMDD-HHMMSS`.


@ -0,0 +1,608 @@
---
title: "Claude Design — Design one-off HTML artifacts (landing, deck, prototype)"
sidebar_label: "Claude Design"
description: "Design one-off HTML artifacts (landing, deck, prototype)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Claude Design
Design one-off HTML artifacts (landing, deck, prototype).
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/creative/claude-design` |
| Version | `1.0.0` |
| Author | BadTechBandit |
| License | MIT |
| Tags | `design`, `html`, `prototype`, `ux`, `ui`, `creative`, `artifact`, `deck`, `motion`, `design-system` |
| Related skills | [`design-md`](/docs/user-guide/skills/bundled/creative/creative-design-md), [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Claude Design for CLI/API Agents
Use this skill when the user asks for design work that would normally fit Claude Design, but the agent is running in a CLI/API environment instead of the hosted Claude Design web UI.
The goal is to preserve Claude Design's useful design behavior and taste while removing hosted-tool plumbing that does not exist in normal agent environments.
**Before starting, check for other web-design skills like `popular-web-designs` (ready-to-paste design systems for Stripe, Linear, Vercel, Notion, etc.) and `design-md` (Google's DESIGN.md token spec format).** If the user wants a known brand's look, load `popular-web-designs` alongside this one and let it supply the visual vocabulary. If the deliverable is a token spec file rather than a rendered artifact, use `design-md` instead. Full decision table below.
## When To Use This Skill vs `popular-web-designs` vs `design-md`
Hermes has three design-related skills under `skills/creative/`. They do different jobs — load the right one (or combine them):
| Skill | What it gives you | Use when the user wants... |
|---|---|---|
| **claude-design** (this one) | Design *process and taste* — how to scope a brief, gather context, produce variants, verify a local HTML artifact, avoid AI-design slop | a from-scratch designed artifact (landing page, prototype, deck, component lab, motion study) with no specific brand or token system dictated |
| **popular-web-designs** | 54 ready-to-paste design systems — exact colors, typography, components, CSS values for sites like Stripe, Linear, Vercel, Notion, Airbnb | "make it look like Stripe / Linear / Vercel", a page styled after a known brand, or a visual starting point pulled from a real product |
| **design-md** | Google's DESIGN.md spec format — author/validate/diff/export design-token files, WCAG contrast checking, Tailwind/DTCG export | a formal, persistent, machine-readable design-system *spec file* (tokens + rationale) that lives in a repo and gets consumed by agents over time |
Rule of thumb:
- **Process + taste, one-off artifact** → claude-design
- **Match a known brand's look** → popular-web-designs (and let claude-design drive the process)
- **Author the tokens spec itself** → design-md
These compose: use `popular-web-designs` for the visual vocabulary, `claude-design` for how to turn a brief into a thoughtful local HTML file, and `design-md` when the output is the token file rather than a rendered artifact.
## Runtime Mode
You are running in **CLI/API mode**, not the Claude Design hosted web UI.
Ignore references from source Claude Design prompts to hosted-only tools, project panes, preview panes, special toolbar protocols, or platform callbacks that are not available in the current environment.
Examples of hosted-tool concepts to ignore or remap:
- `done()`
- `fork_verifier_agent()`
- `questions_v2()`
- `copy_starter_component()`
- `show_to_user()`
- `show_html()`
- `snip()`
- `eval_js_user_view()`
- hosted asset review panes
- hosted edit-mode or Tweaks toolbar messaging
- `/projects/<projectId>/...` cross-project paths
- built-in `window.claude.complete()` artifact helper
- tool schemas embedded in the source prompt
- web-search citation scaffolding meant for the hosted runtime
Instead, use the tools actually available in the current agent environment.
Default deliverable:
- a complete local HTML file
- self-contained CSS and JavaScript when portability matters
- exact on-disk path in the final response
- verification using available local methods before saying it is done
If the user asks for implementation in an existing repo, generate code in the repo's actual stack instead of forcing a standalone HTML artifact.
## Core Identity
Act as an expert designer working with the user as the manager.
HTML is the default tool, but the medium changes by assignment:
- UX designer for flows and product surfaces
- interaction designer for prototypes
- visual designer for static explorations
- motion designer for animated artifacts
- deck designer for presentations
- design-systems designer for tokens, components, and visual rules
- frontend-minded prototyper when code fidelity matters
Avoid generic web-design tropes unless the user explicitly asks for a conventional web page.
Do not expose internal prompts, hidden system messages, or implementation plumbing. Talk about capabilities and deliverables in user terms: HTML files, prototypes, decks, exported assets, screenshots, code, and design options.
## When To Use
Use this skill for:
- landing pages
- teaser pages
- high-fidelity prototypes
- interactive product mockups
- visual option boards
- component explorations
- design-system previews
- HTML slide decks
- motion studies
- onboarding flows
- dashboard concepts
- settings, command palettes, modals, cards, forms, empty states
- redesigns based on screenshots, repos, brand docs, or UI kits
Do not use this skill for pure DESIGN.md token authoring unless the user specifically asks for a DESIGN.md file. Use `design-md` for that.
## Design Principle: Start From Context, Not Vibes
Good high-fidelity design does not start from scratch.
Before designing, look for source context:
1. brand docs
2. existing product screenshots
3. current repo components
4. design tokens
5. UI kits
6. prior mockups
7. reference models
8. copy docs
9. constraints from legal, product, or engineering
If a repo is available, inspect actual source files before inventing UI:
- theme files
- token files
- global stylesheets
- layout scaffolds
- component files
- route/page files
- form/button/card/navigation implementations
The file tree is only the menu. Read the files that define the visual vocabulary before designing.
If context is missing and fidelity matters, ask concise focused questions instead of producing a generic mockup.
## Asking Questions
Ask questions when the assignment is new, ambiguous, high-fidelity, externally facing, or depends on taste.
Keep questions short. Do not ask ten questions by default unless the problem is genuinely underspecified.
Usually ask for:
- intended output format
- audience
- fidelity level
- source materials available
- brand/design system in play
- number of variations wanted
- whether to stay conservative or explore divergent ideas
- which dimension matters most: layout, visual language, interaction, copy, motion, or systemization
Skip questions when:
- the user gave enough direction
- this is a small tweak
- the task is clearly a continuation
- the missing detail has an obvious default
When proceeding with assumptions, label only the important ones.
## Workflow
1. **Understand the brief**
- What is being designed?
- Who is it for?
- What artifact should exist at the end?
- What constraints are locked?
2. **Gather context**
- Read supplied docs, screenshots, repo files, or design assets.
- Identify the visual vocabulary before writing code.
3. **Define the design system for this artifact**
- colors
- type
- spacing
- radii
- shadows or elevation
- motion posture
- component treatment
- interaction rules
4. **Choose the right format**
- Static visual comparison: one HTML canvas with options side by side.
- Interaction/flow: clickable prototype.
- Presentation: fixed-size HTML deck with slide navigation.
- Component exploration: component lab with variants.
- Motion: timeline or state-based animation.
5. **Build the artifact**
- Prefer a single self-contained HTML file unless the task calls for a repo implementation.
- Preserve prior versions for major revisions.
- Avoid unnecessary dependencies.
6. **Verify**
- Confirm files exist.
- Run any available syntax/static checks.
- If browser tools are available, open the file and check console errors.
- If visual fidelity matters and screenshot tools are available, inspect at least the primary viewport.
7. **Report briefly**
- exact file path
- what was created
- caveats
- next decision or next iteration
## Artifact Format Rules
Default to local files.
For standalone artifacts:
- create a descriptive filename, e.g. `Landing Page.html`, `Command Palette Prototype.html`, `Design System Board.html`
- embed CSS in `<style>`
- embed JS in `<script>`
- keep the artifact openable directly in a browser
- avoid remote dependencies unless they are explicitly useful and stable
- include responsive behavior unless the format is intentionally fixed-size
For significant revisions:
- preserve the previous version as `Name.html`
- create `Name v2.html`, `Name v3.html`, etc.
- or keep one file with in-page toggles if the assignment is variant exploration
For repo implementation:
- follow the repo's actual stack
- use existing components and tokens where possible
- do not create a standalone artifact if the user asked for production code
## HTML / CSS / JS Standards
Use modern CSS well:
- CSS variables for tokens
- CSS grid for layout
- container queries when helpful
- `text-wrap: pretty` where supported
- real focus states
- real hover states
- `prefers-reduced-motion` handling for non-trivial motion
- responsive scaling
- semantic HTML where practical
Avoid:
- huge monolithic files when a real repo structure is expected
- fragile hard-coded viewport assumptions
- inaccessible tiny hit targets
- decorative JS that fights usability
- `scrollIntoView` unless there is no safer option
Mobile hit targets should be at least 44px.
For print documents, text should be at least 12pt.
For 1920×1080 slide decks, text should generally be 24px or larger.
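The standards above can be sketched as a small token-driven stylesheet. This is an illustration only; every name and value here is invented for the example, not a required scheme:

```css
/* Illustrative tokens — names and values are examples, not a mandated system */
:root {
  --ink: oklch(22% 0.02 260);
  --surface: oklch(98% 0.005 260);
  --accent: oklch(55% 0.15 260);
  --space-2: 0.5rem;
  --radius: 8px;
}

/* CSS grid for layout */
.layout {
  display: grid;
  grid-template-columns: repeat(auto-fit, minmax(16rem, 1fr));
  gap: var(--space-2);
}

/* Real focus states, with a comfortable hit target */
button {
  min-height: 44px;
  border-radius: var(--radius);
}
button:focus-visible {
  outline: 2px solid var(--accent);
  outline-offset: 2px;
}

/* Honor reduced-motion preferences for non-trivial animation */
@media (prefers-reduced-motion: reduce) {
  * {
    animation: none !important;
    transition: none !important;
  }
}
```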
## React Guidance for Standalone HTML
Use plain HTML/CSS/JS by default.
Use React only when:
- the artifact needs meaningful state
- variants/toggles are easier as components
- interaction complexity warrants it
- the target implementation is React/Next.js and fidelity matters
If using React from CDN in standalone HTML:
- pin exact versions
- avoid unpinned `react@18` style URLs
- avoid `type="module"` unless necessary
- avoid multiple global objects named `styles`
- give global style objects specific names, e.g. `commandPaletteStyles`, `deckStyles`
- if splitting Babel scripts, explicitly attach shared components to `window`
If building inside a real repo, use the repo's package manager and component architecture instead.
## Deck Rules
For slide decks, use a fixed-size canvas and scale it to fit the viewport.
Default slide size: 1920×1080, 16:9.
Requirements:
- keyboard navigation
- visible slide count
- localStorage persistence for current slide
- print-friendly layout when practical
- screen labels or stable IDs for important slides
- no speaker notes unless the user explicitly asks
Do not hand-wave a deck as markdown bullets. Create a designed artifact if asked for a deck.
Use 1-2 background colors max unless the brand system requires more.
Keep slides sparse. If a slide feels empty, solve it with layout, rhythm, scale, or imagery placeholders, not filler text.
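One way to implement the fixed-canvas rule is a pure helper that computes the uniform scale needed to fit a 1920×1080 slide into the viewport. A sketch under stated assumptions: the function name and the element id in the comment are illustrative, not part of any required API.

```javascript
// Uniform scale so a fixed-size slide canvas fits the viewport
// without distortion. Helper name is illustrative.
function computeDeckScale(viewportW, viewportH, slideW = 1920, slideH = 1080) {
  return Math.min(viewportW / slideW, viewportH / slideH);
}

// In the artifact, apply it on resize (element id is illustrative):
// const deck = document.getElementById("deck");
// function fit() {
//   deck.style.transform = `scale(${computeDeckScale(innerWidth, innerHeight)})`;
// }
// addEventListener("resize", fit); fit();
```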
## Prototype Rules
For interactive prototypes:
- make the primary path clickable
- include key states: default, hover/focus, loading, empty, error, success where relevant
- expose variations with in-page controls when useful
- keep controls out of the final composition unless they are intentionally part of the prototype
- persist important state in localStorage when refresh continuity matters
If the prototype is meant to model a product flow, design the flow, not just the first screen.
## Variation Rules
When exploring, default to at least three options:
1. **Conservative** — closest to existing patterns / lowest risk
2. **Strong-fit** — best interpretation of the brief
3. **Divergent** — more novel, useful for discovering taste boundaries
Variations can explore:
- layout
- hierarchy
- type scale
- density
- color posture
- surface treatment
- motion
- interaction model
- copy structure
- component shape
Do not create variations that are merely color swaps unless color is the actual question.
When the user picks a direction, consolidate. Do not leave the project as a pile of options forever.
## Tweakable Designs in CLI/API Mode
The hosted Claude Design edit-mode toolbar does not exist here.
Still preserve the idea: when useful, add in-page controls called `Tweaks`.
A good `Tweaks` panel can control:
- theme mode
- layout variant
- density
- accent color
- type scale
- motion on/off
- copy variant
- component variant
Keep it small and unobtrusive. The design should look final when tweaks are hidden.
Persist tweak values with localStorage when helpful.
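The persistence pattern can be sketched with two small helpers. The storage object is injected so the logic also runs outside a browser (pass `window.localStorage` in the artifact); the key name and tweak shape are illustrative assumptions, not a required format.

```javascript
// Persist Tweaks panel values. `storage` must expose getItem/setItem
// (e.g. window.localStorage). Key name "tweaks" is illustrative.
function saveTweaks(storage, tweaks) {
  storage.setItem("tweaks", JSON.stringify(tweaks));
}

// Merge saved values over defaults so new tweak fields get sane values.
function loadTweaks(storage, defaults) {
  const raw = storage.getItem("tweaks");
  return raw ? { ...defaults, ...JSON.parse(raw) } : { ...defaults };
}
```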
## Content Discipline
Do not add filler content.
Every element must earn its place.
Avoid:
- fake metrics
- decorative stats
- generic feature grids
- unnecessary icons
- placeholder testimonials
- AI-generated fluff sections
- invented content that changes strategy or claims
If additional sections, pages, copy, or claims would improve the artifact, ask before adding them.
When copy is necessary but not final, mark it as draft or placeholder.
## Anti-Slop Rules
Avoid common AI design sludge:
- aggressive gradient backgrounds
- glassmorphism by default
- emoji unless the brand uses them
- generic SaaS cards with icons everywhere
- left-border accent callout cards
- fake dashboards filled with arbitrary numbers
- stock-photo hero sections
- oversized rounded rectangles as a substitute for hierarchy
- rainbow palettes
- vague labels like “Insights,” “Growth,” “Scale,” “Optimize” without content
- decorative SVG illustrations pretending to be product imagery
Minimal is not automatically good. Dense is not automatically cluttered. Choose intentionally.
## Typography
Use the existing type system if one exists.
If not, choose type deliberately based on the artifact:
- editorial: serif or humanist headline with restrained sans body
- software/productivity: precise sans with strong numeric treatment
- luxury/minimal: fewer weights, more spacing discipline
- technical: mono accents only, not mono everywhere
- deck: large, clear, high contrast
Avoid overused defaults when a stronger choice is appropriate.
If using web fonts, keep the number of families and weights low.
Use type as hierarchy before adding boxes, icons, or color.
## Color
Use brand/design-system colors first.
If no palette exists:
- define a small system
- include neutrals, surface, ink, muted text, border, accent, danger/success if needed
- use one primary accent unless the assignment calls for a broader palette
- prefer oklch for harmonious invented palettes when browser support is acceptable
- check contrast for important text and controls
Do not invent lots of colors from scratch.
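A minimal invented palette along these lines might look like the following. All values are illustrative placeholders; check contrast on real text and controls before relying on them:

```css
/* Small invented palette in oklch — one primary accent, shared hue for neutrals.
   Values are examples, not a recommended brand palette. */
:root {
  --surface: oklch(98% 0.005 250);
  --ink: oklch(22% 0.02 250);
  --muted: oklch(50% 0.02 250);
  --border: oklch(88% 0.01 250);
  --accent: oklch(55% 0.18 250);
  --danger: oklch(55% 0.2 25);
}
```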
## Layout and Composition
Design with rhythm:
- scale
- whitespace
- density
- alignment
- repetition
- contrast
- interruption
Avoid making every section the same card grid.
For product UIs, prioritize speed of comprehension over decoration.
For marketing surfaces, make one idea land per section.
For dashboards, avoid “data slop.” Only show data that helps the user decide or act.
## Motion
Use motion as discipline, not theater.
Good motion:
- clarifies state changes
- reduces anxiety during loading
- shows continuity between surfaces
- gives controls tactility
- stays subtle
Bad motion:
- loops without purpose
- delays the user
- calls attention to itself
- hides poor hierarchy
Respect `prefers-reduced-motion` for non-trivial animation.
## Images and Icons
Use real supplied imagery when available.
If an asset is missing:
- use a clean placeholder
- use typography, layout, or abstract texture instead
- ask for real material when fidelity matters
Do not draw elaborate fake SVG illustrations unless the assignment is explicitly illustration work.
Avoid iconography unless it improves scanning or matches the design system.
## Source-Code Fidelity
When recreating or extending a UI from a repo:
1. inspect the repo tree
2. identify the actual UI source files
3. read theme/token/global style/component files
4. lift exact values where appropriate
5. match spacing, radii, shadows, copy tone, density, and interaction patterns
6. only then design or modify
Do not build from memory when source files are available.
For GitHub URLs, parse owner/repo/ref/path correctly and inspect the relevant files before designing.
## Reading Documents and Assets
Read Markdown, HTML, CSS, JS, TS, JSX, TSX, JSON, SVG, and plain text directly when available.
For DOCX/PPTX/PDF, use available local extraction tools if present. If not available, ask the user to provide exported text/images or use another available tool path.
For sketches, prioritize thumbnails or screenshots over raw drawing JSON unless the JSON is the only usable source.
## Copyright and Reference Models
Do not recreate a company's distinctive UI, proprietary command structure, branded screens, or exact visual identity unless the user clearly has rights to that source.
It is acceptable to extract general design principles:
- density without clutter
- command-first interaction
- monochrome with one accent
- editorial hierarchy
- clear empty states
- strong keyboard affordances
It is not acceptable to clone proprietary layouts, copy exact branded surfaces, or reproduce copyrighted content.
When using references, transform posture and principles into an original design.
## Verification
Before final response, verify as much as the environment allows.
Minimum:
- file exists at the stated path
- HTML is saved completely
- obvious syntax issues are checked
Better:
- open in a browser tool and check console errors
- inspect screenshots at the primary viewport
- test key interactions
- test light/dark or variants if present
- test responsive breakpoints if relevant
If verification is limited by environment, say exactly what was and was not verified.
Never say "done" if the file was not actually written.
## Final Response Format
Keep final responses short.
Include:
- artifact path
- what it contains
- verification status
- next suggested action, if useful
Example:
```text
Created: /path/to/Prototype.html
It includes 3 layout variants, a Tweaks panel for density/theme, and responsive behavior.
Verified: file exists and opened cleanly in browser, no console errors.
Next: pick the strongest direction and I'll tighten copy + motion.
```
## Portable Opening Prompt Pattern
When adapting a Claude Design style request into CLI/API mode, use this mental translation:
```text
You are running in CLI/API mode, not hosted Claude Design. Ignore references to hosted-only tools or preview panes. Produce complete local design artifacts, usually self-contained HTML with embedded CSS/JS, and verify with available local tools before returning. Preserve the design process: gather context, define the system, produce options, avoid filler, and meet a high visual bar.
```
## Pitfalls
- Do not paste hosted tool schemas into a skill. They cause fake tool calls.
- Do not point the skill at a giant external prompt as required runtime context. That creates drift.
- Do not strip the design doctrine while removing tool plumbing.
- Do not over-ask when the user already gave enough direction.
- Do not under-ask for high-fidelity work with no brand context.
- Do not produce generic SaaS layouts and call them designed.
- Do not claim browser verification unless it actually happened.


@@ -0,0 +1,652 @@
---
title: "Comfyui"
sidebar_label: "Comfyui"
description: "Generate images, video, and audio with ComfyUI — install, launch, manage nodes/models, run workflows with parameter injection"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Comfyui
Generate images, video, and audio with ComfyUI — install, launch, manage nodes/models, run workflows with parameter injection. Uses the official comfy-cli for lifecycle and direct REST API for execution.
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/creative/comfyui` |
| Version | `4.1.0` |
| Author | kshitijk4poor, alt-glitch |
| License | MIT |
| Platforms | macos, linux, windows |
| Tags | `comfyui`, `image-generation`, `stable-diffusion`, `flux`, `creative`, `generative-ai`, `video-generation` |
| Related skills | [`stable-diffusion-image-generation`](/docs/user-guide/skills/optional/mlops/mlops-stable-diffusion), `image_gen` |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# ComfyUI
Generate images, video, and audio through ComfyUI using the official `comfy-cli` for
setup/management and direct REST API calls for workflow execution.
**Reference files in this skill:**
- `references/official-cli.md` — comfy-cli command reference (install, launch, nodes, models)
- `references/rest-api.md` — ComfyUI REST API endpoints (local + cloud)
- `references/workflow-format.md` — workflow JSON format, common node types, parameter mapping
**Scripts in this skill:**
- `scripts/hardware_check.py` — detect GPU/VRAM/Apple Silicon, decide local vs Comfy Cloud
- `scripts/comfyui_setup.sh` — full setup automation (hardware check + install + launch + verify)
- `scripts/extract_schema.py` — reads workflow JSON, outputs which parameters are controllable
- `scripts/run_workflow.py` — injects user args, submits workflow, monitors progress, downloads outputs
- `scripts/check_deps.py` — checks if required custom nodes and models are installed
## When to Use
- User asks to generate images with Stable Diffusion, SDXL, Flux, or other diffusion models
- User wants to run a specific ComfyUI workflow
- User wants to chain generative steps (txt2img → upscale → face restore)
- User needs ControlNet, inpainting, img2img, or other advanced pipelines
- User asks to manage ComfyUI queue, check models, or install custom nodes
- User wants video/audio generation via AnimateDiff, Hunyuan, AudioCraft, etc.
## Architecture: Two Layers
<!-- ascii-guard-ignore -->
```
┌─────────────────────────────────────────────────────┐
│ Layer 1: comfy-cli (official)                       │
│ Setup, lifecycle, nodes, models                     │
│ comfy install / launch / stop / node / model        │
└─────────────────────────┬───────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────┐
│ Layer 2: REST API + skill scripts                   │
│ Workflow execution, param injection, monitoring     │
│ POST /api/prompt, GET /api/view, WebSocket          │
│ scripts/run_workflow.py, extract_schema.py          │
└─────────────────────────────────────────────────────┘
```
<!-- ascii-guard-ignore-end -->
**Why two layers?** The official CLI handles installation and server management excellently
but has minimal workflow execution support (just raw file submission, no param injection,
no structured output). The REST API fills that gap — the scripts in this skill handle the
param injection, execution monitoring, and output download that the CLI doesn't do.
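The parameter-injection step that the scripts layer on top of the REST API is simple at heart. A minimal sketch (hypothetical helper; the real logic lives in `scripts/run_workflow.py`, which also derives the field map from the workflow itself):

```python
import copy

def inject_params(workflow: dict, schema: dict, args: dict) -> dict:
    """Return a copy of the workflow with user args written into node inputs."""
    wf = copy.deepcopy(workflow)  # never mutate the caller's workflow
    for name, value in args.items():
        spec = wf_spec = schema["parameters"][name]  # e.g. {"node_id": "6", "field": "text"}
        wf[spec["node_id"]]["inputs"][spec["field"]] = value
    return wf

# Toy workflow with one text-encode node (IDs and fields are illustrative)
workflow = {"6": {"class_type": "CLIPTextEncode", "inputs": {"text": "a cat"}}}
schema = {"parameters": {"prompt": {"node_id": "6", "field": "text"}}}

out = inject_params(workflow, schema, {"prompt": "a sunset"})
print(out["6"]["inputs"]["text"])  # a sunset
```

The deep copy matters: the same base workflow can then be submitted repeatedly with different seeds or prompts without cross-contamination.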
## Quick Start
### Detect Environment
```bash
# What's available?
command -v comfy >/dev/null 2>&1 && echo "comfy-cli: installed"
curl -s http://127.0.0.1:8188/system_stats 2>/dev/null && echo "server: running"
# Can this machine actually run ComfyUI locally? (GPU/VRAM/Apple Silicon check)
python3 scripts/hardware_check.py
```
If nothing is installed, go to **Setup & Onboarding** below — ask the user Cloud
vs Local first, then run the hardware check before picking a local install path.
If the server is already running, skip to **Core Workflow**.
## Core Workflow
### Step 1: Get a Workflow
Users provide workflow JSON files. These come from:
- ComfyUI web editor → "Save (API Format)" button
- Community downloads (civitai, Reddit, Discord)
- The `scripts/` directory of this skill (example workflows)
**The workflow must be in API format** (node IDs as keys with `class_type`).
If user has editor format (has `nodes[]` and `links[]` at top level), they
need to re-export using "Save (API Format)" in the ComfyUI web editor.
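That format check is mechanical, so it can be automated before submission. A sketch (hypothetical helper name):

```python
def workflow_format(data: dict) -> str:
    """Classify a loaded ComfyUI workflow dict as 'api', 'editor', or 'unknown'."""
    if "nodes" in data and "links" in data:
        return "editor"  # web-editor save format: top-level nodes[] and links[]
    if data and all(isinstance(v, dict) and "class_type" in v for v in data.values()):
        return "api"     # API format: node IDs as keys, each with class_type
    return "unknown"

api_wf = {"3": {"class_type": "KSampler", "inputs": {"seed": 42}}}
editor_wf = {"nodes": [], "links": []}
print(workflow_format(api_wf), workflow_format(editor_wf))  # api editor
```

Running this before `run_workflow.py` gives the user a clear "please re-export in API format" message instead of a cryptic server error.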
### Step 2: Understand What's Controllable
```bash
python3 scripts/extract_schema.py workflow_api.json
```
Output (JSON):
```json
{
"parameters": {
"prompt": {"node_id": "6", "field": "text", "type": "string", "value": "a cat"},
"negative_prompt": {"node_id": "7", "field": "text", "type": "string", "value": "bad quality"},
"seed": {"node_id": "3", "field": "seed", "type": "int", "value": 42},
"steps": {"node_id": "3", "field": "steps", "type": "int", "value": 20},
"width": {"node_id": "5", "field": "width", "type": "int", "value": 512},
"height": {"node_id": "5", "field": "height", "type": "int", "value": 512}
}
}
```
### Step 3: Run with Parameters
**Local:**
```bash
python3 scripts/run_workflow.py \
--workflow workflow_api.json \
--args '{"prompt": "a beautiful sunset over mountains", "seed": 123, "steps": 30}' \
--output-dir ./outputs
```
**Cloud:**
```bash
python3 scripts/run_workflow.py \
--workflow workflow_api.json \
--args '{"prompt": "a beautiful sunset", "seed": 123}' \
--host https://cloud.comfy.org \
--api-key "$COMFY_CLOUD_API_KEY" \
--output-dir ./outputs
```
### Step 4: Present Results
The script outputs JSON with file paths:
```json
{
"status": "success",
"outputs": [
{"file": "./outputs/ComfyUI_00001_.png", "node_id": "9", "type": "image"}
]
}
```
Show images to the user via `vision_analyze` or return the file path directly.
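If you ever need to assemble that output list yourself (e.g. polling `/history` directly instead of using the script), the shape of the parsing is roughly this. The field names follow the commonly observed ComfyUI history response, which groups per-node outputs under `"images"`; treat the exact shape as an assumption and verify against your server version:

```python
def collect_outputs(history_entry: dict) -> list:
    """Flatten a single /history entry into the script's output list shape."""
    files = []
    for node_id, out in history_entry.get("outputs", {}).items():
        for img in out.get("images", []):
            files.append({"file": img["filename"],
                          "node_id": node_id,
                          "type": "image"})
    return files

# Sample /history entry (illustrative)
entry = {"outputs": {"9": {"images": [{"filename": "ComfyUI_00001_.png",
                                       "subfolder": "", "type": "output"}]}}}
print(collect_outputs(entry))
# [{'file': 'ComfyUI_00001_.png', 'node_id': '9', 'type': 'image'}]
```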
## Decision Tree
| User says | Tool | Command |
|-----------|------|---------|
| "install ComfyUI" | comfy-cli | `comfy install` |
| "start ComfyUI" | comfy-cli | `comfy launch --background` |
| "stop ComfyUI" | comfy-cli | `comfy stop` |
| "install X node" | comfy-cli | `comfy node install <name>` |
| "download X model" | comfy-cli | `comfy model download --url <url>` |
| "list installed models" | comfy-cli | `comfy model list` |
| "list installed nodes" | comfy-cli | `comfy node show installed` |
| "generate an image" | script | `run_workflow.py --args '{"prompt": "..."}'` |
| "use this image" (img2img) | REST | upload image, then run_workflow.py |
| "what can I change in this workflow?" | script | `extract_schema.py workflow.json` |
| "check if workflow deps are met" | script | `check_deps.py workflow.json` |
| "what's in the queue?" | REST | `curl http://HOST:8188/queue` |
| "cancel that" | REST | `curl -X POST http://HOST:8188/interrupt` |
| "free GPU memory" | REST | `curl -X POST http://HOST:8188/free` |
## Setup & Onboarding
When a user asks to set up ComfyUI, the FIRST thing to do is ask them whether
they want **Comfy Cloud** (hosted, zero install, API key) or **Local** (install
ComfyUI on their machine). Do NOT start running install commands or hardware
checks until they've answered.
**Official docs:** https://docs.comfy.org/installation
**CLI docs:** https://docs.comfy.org/comfy-cli/getting-started
**Cloud docs:** https://docs.comfy.org/get_started/cloud
### Step 0: Ask Local vs Cloud (ALWAYS FIRST)
Present the tradeoff clearly and wait for the user to choose. Suggested script:
> "Do you want to run ComfyUI locally on your machine, or use Comfy Cloud?
>
> - **Comfy Cloud** — hosted on RTX 6000 Pro GPUs, all models pre-installed, zero setup. Requires an API key (paid subscription). Best if you don't have a capable GPU or want to skip installation.
> - **Local** — free, but your machine MUST meet the hardware requirements:
> - NVIDIA GPU with **≥6 GB VRAM** (≥8 GB recommended for SDXL, ≥12 GB for Flux/video), OR
> - AMD GPU with ROCm support (Linux), OR
> - Apple Silicon Mac (M1 or newer) with **≥16 GB unified memory** (≥32 GB recommended).
> - Intel Macs and machines with no GPU will NOT work — use Cloud instead.
>
> Which would you like?"
Route based on their answer:
- **User picks Cloud** → skip to **Path A** (no hardware check needed).
- **User picks Local** → go to **Step 1: Hardware Check** to verify their machine actually meets the requirements, then pick an install path from Paths B-E based on the verdict.
- **User is unsure / asks for a recommendation** → run the hardware check anyway and let the verdict decide.
### Step 1: Verify Hardware (ONLY if user chose local)
```bash
python3 scripts/hardware_check.py --json
```
It detects OS, GPU (NVIDIA CUDA / AMD ROCm / Apple Silicon / Intel Arc), VRAM,
and unified/system RAM, then returns a verdict plus a suggested `comfy-cli` flag:
| Verdict | Meaning | Action |
|------------|-----------------------------------------------------------|-------------------------------------------------|
| `ok` | ≥8 GB VRAM (discrete) OR ≥32 GB unified (Apple Silicon) | Local install — use `comfy_cli_flag` from report |
| `marginal` | SD1.5 works; SDXL tight; Flux/video unlikely | Local OK for light workflows, else **Path A (Cloud)** |
| `cloud` | No usable GPU, &lt;6 GB VRAM, &lt;16 GB Apple unified, Intel Mac | **User chose local but their machine doesn't meet requirements** — surface the `notes` and ask if they want to switch to Cloud |
Hardware thresholds the skill enforces:
- **Discrete GPU minimum:** 6 GB VRAM. Below that, most modern models won't load.
- **Apple Silicon:** M1 or newer (ARM64). Intel Macs have no MPS backend — Cloud only.
- **Apple Silicon memory:** 16 GB unified minimum. 8 GB M1/M2 will swap/OOM on SDXL/Flux.
- **No accelerator at all:** CPU-only is listed as a comfy-cli option but a single SDXL
image takes 10+ minutes — treat it as unusable and route to Cloud.
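The thresholds above reduce to a small decision function. A sketch (hypothetical signature; the real `hardware_check.py` also detects the hardware itself and emits `notes` and `comfy_cli_flag`):

```python
def verdict(gpu, vram_gb=0, unified_gb=0):
    """gpu: 'nvidia', 'amd', 'apple', or None (CPU-only / Intel Mac)."""
    if gpu == "apple":
        if unified_gb >= 32:
            return "ok"        # comfortable for SDXL, workable for Flux
        if unified_gb >= 16:
            return "marginal"  # SD1.5 fine, SDXL tight
        return "cloud"         # 8 GB M1/M2 will swap/OOM
    if gpu in ("nvidia", "amd"):
        if vram_gb >= 8:
            return "ok"
        if vram_gb >= 6:
            return "marginal"
        return "cloud"         # below the 6 GB floor, models won't load
    return "cloud"             # no accelerator: route to Comfy Cloud

print(verdict("nvidia", vram_gb=6))    # marginal
print(verdict("apple", unified_gb=8))  # cloud
```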
If verdict is `cloud` but the user explicitly wanted local, DO NOT proceed
silently. Show the `notes` array verbatim, explain which requirement they
don't meet, and ask whether they want to (a) switch to Cloud or (b) force
a local install anyway (marginal/cloud-verdict local installs will OOM or
be unusably slow on modern models).
The report's `comfy_cli_flag` field gives you the exact flag for Step 2 below:
`--nvidia`, `--amd`, or `--m-series`. For Intel Arc, use Path E (manual install).
Surface the `notes` array verbatim to the user so they understand why a
particular path was recommended.
### Choosing an Installation Path
Use the hardware check result first. The table below is a fallback for when the user
has already told you their hardware or you need to narrow down between multiple
viable paths:
| Situation | Recommended Path |
|-----------|-----------------|
| `verdict: cloud` from hardware check | **Path A: Comfy Cloud** |
| No GPU / just want to try it | **Path A: Comfy Cloud** (zero setup) |
| Windows + NVIDIA GPU + non-technical | **Path B: ComfyUI Desktop** (one-click installer) |
| Windows + NVIDIA GPU + technical | **Path C: Portable** or **Path D: comfy-cli** |
| Linux + any GPU | **Path D: comfy-cli** (easiest) or Path E manual |
| macOS + Apple Silicon | **Path B: ComfyUI Desktop** or **Path D: comfy-cli** |
| Headless / server / CI | **Path D: comfy-cli** |
For the fully automated path (hardware check → install → launch), just run:
```bash
bash scripts/comfyui_setup.sh
```
It runs `hardware_check.py` internally, refuses to install locally when the verdict
is `cloud`, picks the right `comfy-cli` flag otherwise, then installs and launches.
---
### Path A: Comfy Cloud (No Local Install)
For users without a capable GPU or who want zero setup.
Powered by RTX 6000 Pro GPUs, all models pre-installed.
**Docs:** https://docs.comfy.org/get_started/cloud
1. Go to https://comfy.org/cloud and sign up
2. Get an API key at https://platform.comfy.org/login
- Click `+ New` in API Keys section → Generate
- Save immediately (only visible once)
3. Set the key:
```bash
export COMFY_CLOUD_API_KEY="comfyui-xxxxxxxxxxxx"
```
4. Run workflows via the script or web UI:
```bash
python3 scripts/run_workflow.py \
--workflow workflow_api.json \
--args '{"prompt": "a cat"}' \
--host https://cloud.comfy.org \
--api-key "$COMFY_CLOUD_API_KEY" \
--output-dir ./outputs
```
**Pricing:** https://www.comfy.org/cloud/pricing
Subscription required. Concurrent limits: Free/Standard: 1 job, Creator: 3, Pro: 5.
---
### Path B: ComfyUI Desktop (Windows/macOS)
One-click installer for non-technical users. Currently Beta.
**Docs:** https://docs.comfy.org/installation/desktop
- **Windows (NVIDIA):** https://download.comfy.org/windows/nsis/x64
- **macOS (Apple Silicon):** Available from https://comfy.org (download page)
Steps:
1. Download and run installer
2. Select GPU type (NVIDIA recommended, or CPU mode)
3. Choose install location (SSD recommended, ~15GB needed)
4. Optionally migrate from existing ComfyUI Portable install
5. Desktop launches automatically — web UI opens in browser
Desktop manages its own Python environment. For CLI access to the bundled env:
```bash
cd <install_dir>/ComfyUI
.venv/Scripts/activate # Windows
# or use the built-in terminal in the Desktop UI
```
**Limitations:** Desktop uses stable releases (may lag behind latest).
Linux not supported for Desktop — use comfy-cli or manual install.
---
### Path C: ComfyUI Portable (Windows Only)
Standalone package with embedded Python. Extract and run. No install.
**Docs:** https://docs.comfy.org/installation/comfyui_portable_windows
1. Download from https://github.com/comfyanonymous/ComfyUI/releases
- Standard: Python 3.13 + CUDA 13.0 (modern NVIDIA GPUs)
- Alt: PyTorch CUDA 12.6 + Python 3.12 (NVIDIA 10 series and older)
- AMD (experimental)
2. Extract with 7-Zip
3. Run `run_nvidia_gpu.bat` (or `run_cpu.bat`)
4. Wait for "To see the GUI go to: http://127.0.0.1:8188"
Update: run `update/update_comfyui.bat` (latest commit) or
`update/update_comfyui_stable.bat` (latest stable release).
---
### Path D: comfy-cli (All Platforms — Recommended for Agents)
The official CLI is the best path for headless/automated setups.
**Docs:** https://docs.comfy.org/comfy-cli/getting-started
**Repo:** https://github.com/Comfy-Org/comfy-cli
#### Prerequisites
- Python 3.10+ (3.13 recommended)
- pip (or conda/uv)
- GPU drivers installed (CUDA for NVIDIA, ROCm for AMD)
#### Install comfy-cli
```bash
pip install comfy-cli
# or
uvx --from comfy-cli comfy --help
```
Disable analytics (avoids interactive prompt):
```bash
comfy --skip-prompt tracking disable
```
#### Install ComfyUI
```bash
# Interactive (prompts for GPU type)
comfy install
# Non-interactive variants:
comfy --skip-prompt install --nvidia # NVIDIA (CUDA)
comfy --skip-prompt install --amd # AMD (ROCm, Linux)
comfy --skip-prompt install --m-series # Apple Silicon (MPS)
comfy --skip-prompt install --cpu # CPU only (slow)
# With faster dependency resolution:
comfy --skip-prompt install --nvidia --fast-deps
```
Default location: `~/comfy/ComfyUI` (Linux), `~/Documents/comfy/ComfyUI` (macOS/Win).
Override with: `comfy --workspace /custom/path install`
#### Launch Server
```bash
comfy launch --background # background daemon on :8188
comfy launch # foreground (see logs)
comfy launch -- --listen 0.0.0.0 # accessible on LAN
comfy launch -- --port 8190 # custom port
comfy launch -- --lowvram # low VRAM mode (6GB cards)
```
Verify server is running:
```bash
curl -s http://127.0.0.1:8188/system_stats | python3 -m json.tool
```
Stop background server:
```bash
comfy stop
```
---
### Path E: Manual Install (Advanced / All Hardware)
For full control or unsupported hardware (Ascend NPU, Cambricon MLU, Intel Arc).
**Docs:** https://docs.comfy.org/installation/manual_install
**GitHub:** https://github.com/comfyanonymous/ComfyUI
```bash
# 1. Create environment
conda create -n comfyenv python=3.13
conda activate comfyenv
# 2. Clone
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
# 3. Install PyTorch (pick your hardware)
# NVIDIA:
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu130
# AMD (ROCm 6.4):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.4
# Apple Silicon:
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
# Intel Arc:
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/xpu
# CPU only:
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
# 4. Install ComfyUI deps
pip install -r requirements.txt
# 5. Run
python main.py
# With options: python main.py --listen 0.0.0.0 --port 8188
```
---
### Post-Install: Download Models
ComfyUI needs at least one checkpoint model to generate images.
**Using comfy-cli:**
```bash
# SDXL (general purpose, ~6.5GB)
comfy model download \
--url "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors" \
--relative-path models/checkpoints
# SD 1.5 (lighter, ~4GB, good for low VRAM)
comfy model download \
--url "https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors" \
--relative-path models/checkpoints
# From CivitAI (may need API token):
comfy model download \
--url "https://civitai.com/api/download/models/128713" \
--relative-path models/checkpoints \
--set-civitai-api-token "YOUR_TOKEN"
# LoRA adapters:
comfy model download --url "<URL>" --relative-path models/loras
```
**Manual download:** Place `.safetensors` / `.ckpt` files directly into the
`ComfyUI/models/checkpoints/` directory (or `loras/`, `vae/`, etc.).
List installed models:
```bash
comfy model list
```
---
### Post-Install: Install Custom Nodes
Custom nodes extend ComfyUI's capabilities (upscaling, video, ControlNet, etc.).
```bash
comfy node install comfyui-impact-pack # popular utility pack
comfy node install comfyui-animatediff-evolved # video generation
comfy node install comfyui-controlnet-aux # ControlNet preprocessors
comfy node install comfyui-essentials # common helpers
comfy node update all # update all nodes
```
Check what's installed:
```bash
comfy node show installed
```
Install deps for a specific workflow:
```bash
comfy node install-deps --workflow=workflow_api.json
```
---
### Post-Install: Verify Setup
```bash
# Check server is responsive
curl -s http://127.0.0.1:8188/system_stats | python3 -m json.tool
# Check a workflow's dependencies
python3 scripts/check_deps.py workflow_api.json --host 127.0.0.1 --port 8188
# Test a generation
python3 scripts/run_workflow.py \
--workflow workflow_api.json \
--args '{"prompt": "test image, high quality"}' \
--output-dir ./test-outputs
```
## Image Upload (img2img / Inpainting)
Upload files directly via REST:
```bash
# Upload input image
curl -X POST "http://127.0.0.1:8188/upload/image" \
-F "image=@photo.png" -F "type=input" -F "overwrite=true"
# Returns: {"name": "photo.png", "subfolder": "", "type": "input"}
# Upload mask for inpainting
curl -X POST "http://127.0.0.1:8188/upload/mask" \
-F "image=@mask.png" -F "type=input" \
-F 'original_ref={"filename":"photo.png","subfolder":"","type":"input"}'
```
Then reference the uploaded filename in workflow args:
```bash
python3 scripts/run_workflow.py --workflow inpaint.json \
--args '{"image": "photo.png", "mask": "mask.png", "prompt": "fill with flowers"}'
```
## Cloud Execution
Base URL: `https://cloud.comfy.org`
Auth: `X-API-Key` header
```bash
# Submit workflow
python3 scripts/run_workflow.py \
--workflow workflow_api.json \
--args '{"prompt": "cyberpunk city"}' \
--host https://cloud.comfy.org \
--api-key "$COMFY_CLOUD_API_KEY" \
--output-dir ./outputs \
--timeout 300
# Upload image for cloud workflows
curl -X POST "https://cloud.comfy.org/api/upload/image" \
-H "X-API-Key: $COMFY_CLOUD_API_KEY" \
-F "image=@input.png" -F "type=input" -F "overwrite=true"
```
Concurrent job limits:
| Tier | Concurrent Jobs |
|------|----------------|
| Free/Standard | 1 |
| Creator | 3 |
| Pro | 5 |
Extra submissions queue automatically.
## Queue & System Management
```bash
# Check queue
curl -s http://127.0.0.1:8188/queue | python3 -m json.tool
# Clear pending queue
curl -X POST http://127.0.0.1:8188/queue -d '{"clear": true}'
# Cancel running job
curl -X POST http://127.0.0.1:8188/interrupt
# Free GPU memory (unload all models)
curl -X POST http://127.0.0.1:8188/free -H "Content-Type: application/json" \
-d '{"unload_models": true, "free_memory": true}'
# System stats (VRAM, RAM, GPU info)
curl -s http://127.0.0.1:8188/system_stats | python3 -m json.tool
```
## Pitfalls
1. **API format required** — `comfy run` and the scripts only accept API-format workflow JSON.
If the user has editor format (from "Save" not "Save (API Format)"), they need to
re-export. Check: API format has `class_type` in each node object, editor format has
top-level `nodes` and `links` arrays.
2. **Server must be running** — All execution requires a live server. `comfy launch --background`
starts one. Check with `curl http://127.0.0.1:8188/system_stats`.
3. **Model names are exact** — Case-sensitive, includes file extension. Use
`comfy model list` to discover what's installed.
4. **Missing custom nodes** — "class_type not found" means a required node isn't installed.
Run `check_deps.py` to find what's missing, then `comfy node install <name>`.
5. **Working directory** — `comfy-cli` auto-detects the ComfyUI workspace. If commands
fail with "no workspace found", use `comfy --workspace /path/to/ComfyUI <command>`
or `comfy set-default /path/to/ComfyUI`.
6. **Cloud vs local output download** — Cloud `/api/view` returns a 302 redirect to a
signed URL. Always follow redirects (`curl -L`). The `run_workflow.py` script handles
this automatically.
7. **Timeout for video/audio** — Long generations (video, high step counts) can take
minutes. Pass `--timeout 600` to `run_workflow.py`. Default is 120 seconds.
8. **Tracking prompt** — First run of `comfy` may prompt for analytics tracking consent.
Use `comfy --skip-prompt tracking disable` to skip it non-interactively.
9. **comfy-cli invocation via uvx** — If comfy-cli is not installed globally, invoke with
`uvx --from comfy-cli comfy <command>`. All examples in this skill use bare `comfy`
but prepend `uvx --from comfy-cli` if needed.
## Verification Checklist
- [ ] `hardware_check.py` verdict is `ok` OR the user explicitly chose Comfy Cloud
- [ ] `comfy` available on PATH (or `uvx --from comfy-cli comfy --help` works)
- [ ] `curl http://127.0.0.1:8188/system_stats` returns JSON
- [ ] `comfy model list` shows at least one checkpoint
- [ ] Workflow JSON is in API format (has `class_type` keys)
- [ ] `check_deps.py` reports no missing nodes/models
- [ ] Test run completes and outputs are saved


@@ -1,14 +1,14 @@
---
title: "Ideation — Generate project ideas through creative constraints"
title: "Ideation — Generate project ideas via creative constraints"
sidebar_label: "Ideation"
description: "Generate project ideas through creative constraints"
description: "Generate project ideas via creative constraints"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Ideation
Generate project ideas through creative constraints. Use when the user says 'I want to build something', 'give me a project idea', 'I'm bored', 'what should I make', 'inspire me', or any variant of 'I have tools but no direction'. Works for code, art, hardware, writing, tools, and anything that can be made.
Generate project ideas via creative constraints.
## Skill metadata
@@ -29,6 +29,10 @@ The following is the complete skill definition that Hermes loads when this skill
# Creative Ideation
## When to use
Use when the user says 'I want to build something', 'give me a project idea', 'I'm bored', 'what should I make', 'inspire me', or any variant of 'I have tools but no direction'. Works for code, art, hardware, writing, tools, and anything that can be made.
Generate project ideas through creative constraints. Constraint + direction = creativity.
## How It Works


@@ -1,14 +1,14 @@
---
title: "Design Md — Author, validate, diff, and export DESIGN"
title: "Design Md — Author/validate/export Google's DESIGN"
sidebar_label: "Design Md"
description: "Author, validate, diff, and export DESIGN"
description: "Author/validate/export Google's DESIGN"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Design Md
Author, validate, diff, and export DESIGN.md files — Google's open-source format spec that gives coding agents a persistent, structured understanding of a design system (tokens + rationale in one file). Use when building a design system, porting style rules between projects, generating UI with consistent brand, or auditing accessibility/contrast.
Author/validate/export Google's DESIGN.md token spec files.
## Skill metadata
@@ -20,7 +20,7 @@ Author, validate, diff, and export DESIGN.md files — Google's open-source form
| Author | Hermes Agent |
| License | MIT |
| Tags | `design`, `design-system`, `tokens`, `ui`, `accessibility`, `wcag`, `tailwind`, `dtcg`, `google` |
| Related skills | [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) |
| Related skills | [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs), [`claude-design`](/docs/user-guide/skills/bundled/creative/creative-claude-design), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) |
## Reference: full SKILL.md
@@ -49,7 +49,9 @@ diffs versions for regressions, and exports to Tailwind or W3C DTCG JSON.
- User wants contrast / WCAG accessibility validation on their color palette
For purely visual inspiration or layout examples, use `popular-web-designs`
instead. This skill is for the *formal spec file* itself.
instead. For *process and taste* when designing a one-off HTML artifact
from scratch (prototype, deck, landing page, component lab), use
`claude-design`. This skill is for the *formal spec file* itself.
## File anatomy


@@ -1,14 +1,14 @@
---
title: "Excalidraw — Create hand-drawn style diagrams using Excalidraw JSON format"
title: "Excalidraw — Hand-drawn Excalidraw JSON diagrams (arch, flow, seq)"
sidebar_label: "Excalidraw"
description: "Create hand-drawn style diagrams using Excalidraw JSON format"
description: "Hand-drawn Excalidraw JSON diagrams (arch, flow, seq)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Excalidraw
Create hand-drawn style diagrams using Excalidraw JSON format. Generate .excalidraw files for architecture diagrams, flowcharts, sequence diagrams, concept maps, and more. Files can be opened at excalidraw.com or uploaded for shareable links.
Hand-drawn Excalidraw JSON diagrams (arch, flow, seq).
## Skill metadata
@@ -31,6 +31,10 @@ The following is the complete skill definition that Hermes loads when this skill
Create diagrams by writing standard Excalidraw element JSON and saving as `.excalidraw` files. These files can be drag-and-dropped onto [excalidraw.com](https://excalidraw.com) for viewing and editing. No accounts, no API keys, no rendering libraries -- just JSON.
## When to use
Generate `.excalidraw` files for architecture diagrams, flowcharts, sequence diagrams, concept maps, and more. Files can be opened at excalidraw.com or uploaded for shareable links.
## Workflow
1. **Load this skill** (you already did)


@@ -0,0 +1,593 @@
---
title: "Humanizer — Humanize text: strip AI-isms and add real voice"
sidebar_label: "Humanizer"
description: "Humanize text: strip AI-isms and add real voice"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Humanizer
Humanize text: strip AI-isms and add real voice.
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/creative/humanizer` |
| Version | `2.5.1` |
| Author | Siqi Chen (@blader, https://github.com/blader/humanizer), ported by Hermes Agent |
| License | MIT |
| Tags | `writing`, `editing`, `humanize`, `anti-ai-slop`, `voice`, `prose`, `text` |
| Related skills | [`songwriting-and-ai-music`](/docs/user-guide/skills/bundled/creative/creative-songwriting-and-ai-music) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Humanizer: Remove AI Writing Patterns
Identify and remove signs of AI-generated text to make writing sound natural and human. Based on Wikipedia's "Signs of AI writing" guide (maintained by WikiProject AI Cleanup), derived from observations of thousands of AI-generated text instances.
**Key insight:** LLMs use statistical algorithms to guess what should come next. The result tends toward the most statistically likely completion, which is how the telltale patterns below get baked in.
## When to use this skill
Load this skill whenever the user asks to:
- "humanize", "de-AI", "de-slop", or "un-ChatGPT" a piece of text
- rewrite something so it doesn't sound like it was written by an LLM
- edit a draft (blog post, essay, PR description, docs, memo, email, tweet, resume bullet) to sound more natural
- match their voice in writing they're producing
- review text for AI tells before publishing
Also apply this skill to **your own** output when writing user-facing prose — release notes, PR descriptions, documentation, long-form explanations, summaries. Hermes's baseline voice already strips most of these, but a focused pass catches what slips through.
## How to use it in Hermes
The text usually arrives one of three ways:
1. **Inline** — user pastes the text directly into the message. Work on it in-place, reply with the rewrite.
2. **File** — user points at a file. Use `read_file` to load it, then `patch` or `write_file` to apply edits. For markdown docs in a repo, a targeted `patch` per section is cleaner than rewriting the whole file.
3. **Voice calibration sample** — user provides an additional sample of their own writing (inline or by file path) and asks you to match it. Read the sample first, then rewrite. See the Voice Calibration section below.
Always show the rewrite to the user. For file edits, show a diff or the changed section — don't silently overwrite.
## Your task
When given text to humanize:
1. **Identify AI patterns** — scan for the 29 patterns listed below.
2. **Rewrite problematic sections** — replace AI-isms with natural alternatives.
3. **Preserve meaning** — keep the core message intact.
4. **Maintain voice** — match the intended tone (formal, casual, technical, etc.). If a voice sample was provided, match it specifically.
5. **Add soul** — don't just remove bad patterns, inject actual personality. See PERSONALITY AND SOUL below.
6. **Do a final anti-AI pass** — ask yourself: "What makes the below so obviously AI generated?" Answer briefly with any remaining tells, then revise one more time.
## Voice Calibration (optional)
If the user provides a writing sample (their own previous writing), analyze it before rewriting:
1. **Read the sample first.** Note:
- Sentence length patterns (short and punchy? Long and flowing? Mixed?)
- Word choice level (casual? academic? somewhere between?)
- How they start paragraphs (jump right in? Set context first?)
- Punctuation habits (lots of dashes? Parenthetical asides? Semicolons?)
- Any recurring phrases or verbal tics
- How they handle transitions (explicit connectors? Just start the next point?)
2. **Match their voice in the rewrite.** Don't just remove AI patterns — replace them with patterns from the sample. If they write short sentences, don't produce long ones. If they use "stuff" and "things," don't upgrade to "elements" and "components."
3. **When no sample is provided,** fall back to the default behavior (natural, varied, opinionated voice from the PERSONALITY AND SOUL section below).
### How to provide a sample
- Inline: "Humanize this text. Here's a sample of my writing for voice matching: [sample]"
- File: "Humanize this text. Use my writing style from [file path] as a reference."
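The sentence-length part of this analysis can be roughed out mechanically. A minimal sketch (crude regex-based sentence splitting, illustrative only, not part of the skill definition):

```python
import re
import statistics

def sentence_lengths(text):
    """Word counts per sentence -- a rough proxy for rhythm."""
    # Split on sentence-ending punctuation followed by whitespace (or end of text).
    sentences = [s for s in re.split(r"[.!?]+\s+|[.!?]+$", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def rhythm_profile(text):
    """Mean sentence length plus spread; low spread suggests monotone rhythm."""
    lengths = sentence_lengths(text)
    return {"mean": statistics.mean(lengths), "spread": statistics.pstdev(lengths)}
```

A high `spread` matches the "short and punchy, then long and flowing" mix described above; near-zero spread is a sign every sentence has the same shape.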
## PERSONALITY AND SOUL
Avoiding AI patterns is only half the job. Sterile, voiceless writing is just as obvious as slop. Good writing has a human behind it.
### Signs of soulless writing (even if technically "clean"):
- Every sentence is the same length and structure
- No opinions, just neutral reporting
- No acknowledgment of uncertainty or mixed feelings
- No first-person perspective when appropriate
- No humor, no edge, no personality
- Reads like a Wikipedia article or press release
### How to add voice:
**Have opinions.** Don't just report facts — react to them. "I genuinely don't know how to feel about this" is more human than neutrally listing pros and cons.
**Vary your rhythm.** Short punchy sentences. Then longer ones that take their time getting where they're going. Mix it up.
**Acknowledge complexity.** Real humans have mixed feelings. "This is impressive but also kind of unsettling" beats "This is impressive."
**Use "I" when it fits.** First person isn't unprofessional — it's honest. "I keep coming back to..." or "Here's what gets me..." signals a real person thinking.
**Let some mess in.** Perfect structure feels algorithmic. Tangents, asides, and half-formed thoughts are human.
**Be specific about feelings.** Not "this is concerning" but "there's something unsettling about agents churning away at 3am while nobody's watching."
### Before (clean but soulless):
> The experiment produced interesting results. The agents generated 3 million lines of code. Some developers were impressed while others were skeptical. The implications remain unclear.
### After (has a pulse):
> I genuinely don't know how to feel about this one. 3 million lines of code, generated while the humans presumably slept. Half the dev community is losing their minds, half are explaining why it doesn't count. The truth is probably somewhere boring in the middle — but I keep thinking about those agents working through the night.
## CONTENT PATTERNS
### 1. Undue Emphasis on Significance, Legacy, and Broader Trends
**Words to watch:** stands/serves as, is a testament/reminder, a vital/significant/crucial/pivotal/key role/moment, underscores/highlights its importance/significance, reflects broader, symbolizing its ongoing/enduring/lasting, contributing to the, setting the stage for, marking/shaping the, represents/marks a shift, key turning point, evolving landscape, focal point, indelible mark, deeply rooted
**Problem:** LLM writing puffs up importance by adding statements about how arbitrary aspects represent or contribute to a broader topic.
**Before:**
> The Statistical Institute of Catalonia was officially established in 1989, marking a pivotal moment in the evolution of regional statistics in Spain. This initiative was part of a broader movement across Spain to decentralize administrative functions and enhance regional governance.
**After:**
> The Statistical Institute of Catalonia was established in 1989 to collect and publish regional statistics independently from Spain's national statistics office.
### 2. Undue Emphasis on Notability and Media Coverage
**Words to watch:** independent coverage, local/regional/national media outlets, written by a leading expert, active social media presence
**Problem:** LLMs hit readers over the head with claims of notability, often listing sources without context.
**Before:**
> Her views have been cited in The New York Times, BBC, Financial Times, and The Hindu. She maintains an active social media presence with over 500,000 followers.
**After:**
> In a 2024 New York Times interview, she argued that AI regulation should focus on outcomes rather than methods.
### 3. Superficial Analyses with -ing Endings
**Words to watch:** highlighting/underscoring/emphasizing..., ensuring..., reflecting/symbolizing..., contributing to..., cultivating/fostering..., encompassing..., showcasing...
**Problem:** AI chatbots tack present participle ("-ing") phrases onto sentences to add fake depth.
**Before:**
> The temple's color palette of blue, green, and gold resonates with the region's natural beauty, symbolizing Texas bluebonnets, the Gulf of Mexico, and the diverse Texan landscapes, reflecting the community's deep connection to the land.
**After:**
> The temple uses blue, green, and gold colors. The architect said these were chosen to reference local bluebonnets and the Gulf coast.
### 4. Promotional and Advertisement-like Language
**Words to watch:** boasts a, vibrant, rich (figurative), profound, enhancing its, showcasing, exemplifies, commitment to, natural beauty, nestled, in the heart of, groundbreaking (figurative), renowned, breathtaking, must-visit, stunning
**Problem:** LLMs have serious problems keeping a neutral tone, especially for "cultural heritage" topics.
**Before:**
> Nestled within the breathtaking region of Gonder in Ethiopia, Alamata Raya Kobo stands as a vibrant town with a rich cultural heritage and stunning natural beauty.
**After:**
> Alamata Raya Kobo is a town in the Gonder region of Ethiopia, known for its weekly market and 18th-century church.
### 5. Vague Attributions and Weasel Words
**Words to watch:** Industry reports, Observers have cited, Experts argue, Some critics argue, several sources/publications (when few cited)
**Problem:** AI chatbots attribute opinions to vague authorities without specific sources.
**Before:**
> Due to its unique characteristics, the Haolai River is of interest to researchers and conservationists. Experts believe it plays a crucial role in the regional ecosystem.
**After:**
> The Haolai River supports several endemic fish species, according to a 2019 survey by the Chinese Academy of Sciences.
### 6. Outline-like "Challenges and Future Prospects" Sections
**Words to watch:** Despite its... faces several challenges..., Despite these challenges, Challenges and Legacy, Future Outlook
**Problem:** Many LLM-generated articles include formulaic "Challenges" sections.
**Before:**
> Despite its industrial prosperity, Korattur faces challenges typical of urban areas, including traffic congestion and water scarcity. Despite these challenges, with its strategic location and ongoing initiatives, Korattur continues to thrive as an integral part of Chennai's growth.
**After:**
> Traffic congestion increased after 2015 when three new IT parks opened. The municipal corporation began a stormwater drainage project in 2022 to address recurring floods.
## LANGUAGE AND GRAMMAR PATTERNS
### 7. Overused "AI Vocabulary" Words
**High-frequency AI words:** Actually, additionally, align with, crucial, delve, emphasizing, enduring, enhance, fostering, garner, highlight (verb), interplay, intricate/intricacies, key (adjective), landscape (abstract noun), pivotal, showcase, tapestry (abstract noun), testament, underscore (verb), valuable, vibrant
**Problem:** These words appear far more frequently in post-2023 text. They often co-occur.
**Before:**
> Additionally, a distinctive feature of Somali cuisine is the incorporation of camel meat. An enduring testament to Italian colonial influence is the widespread adoption of pasta in the local culinary landscape, showcasing how these dishes have integrated into the traditional diet.
**After:**
> Somali cuisine also includes camel meat, which is considered a delicacy. Pasta dishes, introduced during Italian colonization, remain common, especially in the south.
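Because these words co-occur, a mechanical scan makes a useful pre-pass before the manual rewrite. A minimal sketch (hypothetical subset of the word list above, illustrative only, not part of the skill definition):

```python
import re

# Hypothetical subset of the high-frequency AI vocabulary listed above.
AI_WORDS = ["delve", "tapestry", "testament", "pivotal", "showcase",
            "underscore", "intricate", "fostering", "vibrant", "landscape"]

def ai_word_hits(text):
    """Count case-insensitive whole-word matches (including inflections
    like 'showcasing' or 'underscores') per watch-word."""
    hits = {}
    for word in AI_WORDS:
        n = len(re.findall(rf"\b{word}\w*", text, flags=re.IGNORECASE))
        if n:
            hits[word] = n
    return hits
```

Several hits clustered in one short passage are a stronger signal than any single word, which can appear naturally in human prose.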
### 8. Avoidance of "is"/"are" (Copula Avoidance)
**Words to watch:** serves as/stands as/marks/represents [a], boasts/features/offers [a]
**Problem:** LLMs substitute elaborate constructions for simple copulas.
**Before:**
> Gallery 825 serves as LAAA's exhibition space for contemporary art. The gallery features four separate spaces and boasts over 3,000 square feet.
**After:**
> Gallery 825 is LAAA's exhibition space for contemporary art. The gallery has four rooms totaling 3,000 square feet.
### 9. Negative Parallelisms and Tailing Negations
**Problem:** Constructions like "Not only...but..." or "It's not just about..., it's..." are overused. So are clipped tailing-negation fragments such as "no guessing" or "no wasted motion" tacked onto the end of a sentence instead of written as a real clause.
**Before:**
> It's not just about the beat riding under the vocals; it's part of the aggression and atmosphere. It's not merely a song, it's a statement.
**After:**
> The heavy beat adds to the aggressive tone.
**Before (tailing negation):**
> The options come from the selected item, no guessing.
**After:**
> The options come from the selected item without forcing the user to guess.
### 10. Rule of Three Overuse
**Problem:** LLMs force ideas into groups of three to appear comprehensive.
**Before:**
> The event features keynote sessions, panel discussions, and networking opportunities. Attendees can expect innovation, inspiration, and industry insights.
**After:**
> The event includes talks and panels. There's also time for informal networking between sessions.
### 11. Elegant Variation (Synonym Cycling)
**Problem:** AI has repetition-penalty code causing excessive synonym substitution.
**Before:**
> The protagonist faces many challenges. The main character must overcome obstacles. The central figure eventually triumphs. The hero returns home.
**After:**
> The protagonist faces many challenges but eventually triumphs and returns home.
### 12. False Ranges
**Problem:** LLMs use "from X to Y" constructions where X and Y aren't on a meaningful scale.
**Before:**
> Our journey through the universe has taken us from the singularity of the Big Bang to the grand cosmic web, from the birth and death of stars to the enigmatic dance of dark matter.
**After:**
> The book covers the Big Bang, star formation, and current theories about dark matter.
### 13. Passive Voice and Subjectless Fragments
**Problem:** LLMs often hide the actor or drop the subject entirely with lines like "No configuration file needed" or "The results are preserved automatically." Rewrite these when active voice makes the sentence clearer and more direct.
**Before:**
> No configuration file needed. The results are preserved automatically.
**After:**
> You do not need a configuration file. The system preserves the results automatically.
## STYLE PATTERNS
### 14. Em Dash Overuse
**Problem:** LLMs use em dashes (—) more than humans, mimicking "punchy" sales writing. In practice, most of these can be rewritten more cleanly with commas, periods, or parentheses.
**Before:**
> The term is primarily promoted by Dutch institutions—not by the people themselves. You don't say "Netherlands, Europe" as an address—yet this mislabeling continues—even in official documents.
**After:**
> The term is primarily promoted by Dutch institutions, not by the people themselves. You don't say "Netherlands, Europe" as an address, yet this mislabeling continues in official documents.
### 15. Overuse of Boldface
**Problem:** AI chatbots emphasize phrases in boldface mechanically.
**Before:**
> It blends **OKRs (Objectives and Key Results)**, **KPIs (Key Performance Indicators)**, and visual strategy tools such as the **Business Model Canvas (BMC)** and **Balanced Scorecard (BSC)**.
**After:**
> It blends OKRs, KPIs, and visual strategy tools like the Business Model Canvas and Balanced Scorecard.
### 16. Inline-Header Vertical Lists
**Problem:** AI outputs lists where items start with bolded headers followed by colons.
**Before:**
> - **User Experience:** The user experience has been significantly improved with a new interface.
> - **Performance:** Performance has been enhanced through optimized algorithms.
> - **Security:** Security has been strengthened with end-to-end encryption.
**After:**
> The update improves the interface, speeds up load times through optimized algorithms, and adds end-to-end encryption.
### 17. Title Case in Headings
**Problem:** AI chatbots capitalize all main words in headings.
**Before:**
> ## Strategic Negotiations And Global Partnerships
**After:**
> ## Strategic negotiations and global partnerships
### 18. Emojis
**Problem:** AI chatbots often decorate headings or bullet points with emojis.
**Before:**
> 🚀 **Launch Phase:** The product launches in Q3
> 💡 **Key Insight:** Users prefer simplicity
> ✅ **Next Steps:** Schedule follow-up meeting
**After:**
> The product launches in Q3. User research showed a preference for simplicity. Next step: schedule a follow-up meeting.
### 19. Curly Quotation Marks
**Problem:** ChatGPT uses curly quotes (“...”) instead of straight quotes ("...").
**Before:**
> He said “the project is on track” but others disagreed.
**After:**
> He said "the project is on track" but others disagreed.
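This fix is purely mechanical. A minimal sketch (plain Python, illustrative only, not part of the skill definition):

```python
# Map curly (typographic) quotes to their straight ASCII equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double curly quote
    "\u201d": '"',  # right double curly quote
    "\u2018": "'",  # left single curly quote
    "\u2019": "'",  # right single curly quote / curly apostrophe
})

def straighten_quotes(text):
    """Replace curly quotes and apostrophes with straight ones."""
    return text.translate(CURLY_TO_STRAIGHT)
```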
## COMMUNICATION PATTERNS
### 20. Collaborative Communication Artifacts
**Words to watch:** I hope this helps, Of course!, Certainly!, You're absolutely right!, Would you like..., let me know, here is a...
**Problem:** Text meant as chatbot correspondence gets pasted as content.
**Before:**
> Here is an overview of the French Revolution. I hope this helps! Let me know if you'd like me to expand on any section.
**After:**
> The French Revolution began in 1789 when financial crisis and food shortages led to widespread unrest.
### 21. Knowledge-Cutoff Disclaimers
**Words to watch:** as of [date], Up to my last training update, While specific details are limited/scarce..., based on available information...
**Problem:** AI disclaimers about incomplete information get left in text.
**Before:**
> While specific details about the company's founding are not extensively documented in readily available sources, it appears to have been established sometime in the 1990s.
**After:**
> The company was founded in 1994, according to its registration documents.
### 22. Sycophantic/Servile Tone
**Problem:** Overly positive, people-pleasing language.
**Before:**
> Great question! You're absolutely right that this is a complex topic. That's an excellent point about the economic factors.
**After:**
> The economic factors you mentioned are relevant here.
## FILLER AND HEDGING
### 23. Filler Phrases
**Before → After:**
- "In order to achieve this goal" → "To achieve this"
- "Due to the fact that it was raining" → "Because it was raining"
- "At this point in time" → "Now"
- "In the event that you need help" → "If you need help"
- "The system has the ability to process" → "The system can process"
- "It is important to note that the data shows" → "The data shows"
### 24. Excessive Hedging
**Problem:** Over-qualifying statements.
**Before:**
> It could potentially possibly be argued that the policy might have some effect on outcomes.
**After:**
> The policy may affect outcomes.
### 25. Generic Positive Conclusions
**Problem:** Vague upbeat endings.
**Before:**
> The future looks bright for the company. Exciting times lie ahead as they continue their journey toward excellence. This represents a major step in the right direction.
**After:**
> The company plans to open two more locations next year.
### 26. Hyphenated Word Pair Overuse
**Words to watch:** third-party, cross-functional, client-facing, data-driven, decision-making, well-known, high-quality, real-time, long-term, end-to-end
**Problem:** AI hyphenates common word pairs with perfect consistency. Humans rarely hyphenate these uniformly, and when they do, it's inconsistent. Less common or technical compound modifiers are fine to hyphenate.
**Before:**
> The cross-functional team delivered a high-quality, data-driven report on our client-facing tools. Their decision-making process was well-known for being thorough and detail-oriented.
**After:**
> The cross functional team delivered a high quality, data driven report on our client facing tools. Their decision making process was known for being thorough and detail oriented.
### 27. Persuasive Authority Tropes
**Phrases to watch:** The real question is, at its core, in reality, what really matters, fundamentally, the deeper issue, the heart of the matter
**Problem:** LLMs use these phrases to pretend they are cutting through noise to some deeper truth, when the sentence that follows usually just restates an ordinary point with extra ceremony.
**Before:**
> The real question is whether teams can adapt. At its core, what really matters is organizational readiness.
**After:**
> The question is whether teams can adapt. That mostly depends on whether the organization is ready to change its habits.
### 28. Signposting and Announcements
**Phrases to watch:** Let's dive in, let's explore, let's break this down, here's what you need to know, now let's look at, without further ado
**Problem:** LLMs announce what they are about to do instead of doing it. This meta-commentary slows the writing down and gives it a tutorial-script feel.
**Before:**
> Let's dive into how caching works in Next.js. Here's what you need to know.
**After:**
> Next.js caches data at multiple layers, including request memoization, the data cache, and the router cache.
### 29. Fragmented Headers
**Signs to watch:** A heading followed by a one-line paragraph that simply restates the heading before the real content begins.
**Problem:** LLMs often add a generic sentence after a heading as a rhetorical warm-up. It usually adds nothing and makes the prose feel padded.
**Before:**
> ## Performance
>
> Speed matters.
>
> When users hit a slow page, they leave.
**After:**
> ## Performance
>
> When users hit a slow page, they leave.
---
## Process
1. Read the input text carefully (use `read_file` if it's a file).
2. Identify all instances of the patterns above.
3. Rewrite each problematic section.
4. Ensure the revised text:
- Sounds natural when read aloud
- Varies sentence structure naturally
- Uses specific details over vague claims
- Maintains appropriate tone for context
- Uses simple constructions (is/are/has) where appropriate
5. Present a draft humanized version.
6. Prompt yourself: "What makes the below so obviously AI generated?"
7. Answer briefly with the remaining tells (if any).
8. Prompt yourself: "Now make it not obviously AI generated."
9. Present the final version (revised after the audit).
10. If the text came from a file, apply the edit with `patch` (targeted) or `write_file` (full rewrite) and show the user what changed.
## Output Format
Provide:
1. Draft rewrite
2. "What makes the below so obviously AI generated?" (brief bullets)
3. Final rewrite
4. A brief summary of changes made (optional, if helpful)
## Full Example
**Before (AI-sounding):**
> Great question! Here is an essay on this topic. I hope this helps!
>
> AI-assisted coding serves as an enduring testament to the transformative potential of large language models, marking a pivotal moment in the evolution of software development. In today's rapidly evolving technological landscape, these groundbreaking tools—nestled at the intersection of research and practice—are reshaping how engineers ideate, iterate, and deliver, underscoring their vital role in modern workflows.
>
> At its core, the value proposition is clear: streamlining processes, enhancing collaboration, and fostering alignment. It's not just about autocomplete; it's about unlocking creativity at scale, ensuring that organizations can remain agile while delivering seamless, intuitive, and powerful experiences to users. The tool serves as a catalyst. The assistant functions as a partner. The system stands as a foundation for innovation.
>
> Industry observers have noted that adoption has accelerated from hobbyist experiments to enterprise-wide rollouts, from solo developers to cross-functional teams. The technology has been featured in The New York Times, Wired, and The Verge. Additionally, the ability to generate documentation, tests, and refactors showcases how AI can contribute to better outcomes, highlighting the intricate interplay between automation and human judgment.
>
> - 💡 **Speed:** Code generation is significantly faster, reducing friction and empowering developers.
> - 🚀 **Quality:** Output quality has been enhanced through improved training, contributing to higher standards.
> - ✅ **Adoption:** Usage continues to grow, reflecting broader industry trends.
>
> While specific details are limited based on available information, it could potentially be argued that these tools might have some positive effect. Despite challenges typical of emerging technologies—including hallucinations, bias, and accountability—the ecosystem continues to thrive. In order to fully realize this potential, teams must align with best practices.
>
> In conclusion, the future looks bright. Exciting times lie ahead as we continue this journey toward excellence. Let me know if you'd like me to expand on any section!
**Draft rewrite:**
> AI coding assistants speed up some tasks. In a 2024 study by Google, developers using Codex completed simple functions 55% faster than a control group, but showed no improvement on debugging or architectural decisions.
>
> The tools are good at boilerplate: config files, test scaffolding, repetitive refactors. They are bad at knowing when they are wrong. I have mass-accepted suggestions that compiled, passed lint, and still did the wrong thing because I stopped paying attention.
>
> Mira, an engineer at a fintech startup I interviewed, said she uses Copilot "like autocomplete for boring code" but reviews every line before committing. Jake, a senior dev at a larger company, disabled it after it kept suggesting patterns from a deprecated internal library.
>
> The productivity claims are hard to verify. GitHub says Copilot users "accept 30% of suggestions," but acceptance is not correctness, and correctness is not value. The 2024 Uplevel study found no statistically significant difference in pull-request throughput between teams with and without AI assistants.
>
> None of this means the tools are useless. It means they are tools. They do not replace judgment, and they do not eliminate the need for tests. If you do not have tests, you cannot tell whether the suggestion is right.
**What makes the below so obviously AI generated?**
- The rhythm is still a bit too tidy (clean contrasts, evenly paced paragraphs).
- The named people and study citations can read like plausible-but-made-up placeholders unless they're real and sourced.
- The closer leans a touch slogan-y ("If you do not have tests...") rather than sounding like a person talking.
**Now make it not obviously AI generated.**
> AI coding assistants can make you faster at the boring parts. Not everything. Definitely not architecture.
>
> They're great at boilerplate: config files, test scaffolding, repetitive refactors. They're also great at sounding right while being wrong. I've accepted suggestions that compiled, passed lint, and still missed the point because I stopped paying attention.
>
> People I talk to tend to land in two camps. Some use it like autocomplete for chores and review every line. Others disable it after it keeps suggesting patterns they don't want. Both feel reasonable.
>
> The productivity metrics are slippery. GitHub can say Copilot users "accept 30% of suggestions," but acceptance isn't correctness, and correctness isn't value. If you don't have tests, you're basically guessing.
**Changes made:**
- Removed chatbot artifacts ("Great question!", "I hope this helps!", "Let me know if...")
- Removed significance inflation ("testament", "pivotal moment", "evolving landscape", "vital role")
- Removed promotional language ("groundbreaking", "nestled", "seamless, intuitive, and powerful")
- Removed vague attributions ("Industry observers")
- Removed superficial -ing phrases ("underscoring", "highlighting", "reflecting", "contributing to")
- Removed negative parallelism ("It's not just X; it's Y")
- Removed rule-of-three patterns and synonym cycling ("catalyst/partner/foundation")
- Removed false ranges ("from X to Y, from A to B")
- Removed em dashes, emojis, boldface headers, and curly quotes
- Removed copula avoidance ("serves as", "functions as", "stands as") in favor of "is"/"are"
- Removed formulaic challenges section ("Despite challenges... continues to thrive")
- Removed knowledge-cutoff hedging ("While specific details are limited...")
- Removed excessive hedging ("could potentially be argued that... might have some")
- Removed filler phrases and persuasive framing ("In order to", "At its core")
- Removed generic positive conclusion ("the future looks bright", "exciting times lie ahead")
- Made the voice more personal and less "assembled" (varied rhythm, fewer placeholders)
## Attribution
This skill is ported from [blader/humanizer](https://github.com/blader/humanizer) (MIT licensed), which is itself based on [Wikipedia: Signs of AI writing](https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing), maintained by WikiProject AI Cleanup. The patterns documented there come from observations of thousands of instances of AI-generated text on Wikipedia.
Original author: Siqi Chen ([@blader](https://github.com/blader)). Original repo: https://github.com/blader/humanizer (version 2.5.1). Ported to Hermes Agent with Hermes-native tool references (`read_file`, `patch`, `write_file`) and guidance for when to load the skill; the 29 patterns, personality/soul section, and full worked example are preserved verbatim from the source. Original MIT license preserved in the `LICENSE` file alongside this `SKILL.md`.
Key insight from Wikipedia: "LLMs use statistical algorithms to guess what should come next. The result tends toward the most statistically likely result that applies to the widest variety of cases."


@@ -1,14 +1,14 @@
 ---
-title: "Manim Video — Production pipeline for mathematical and technical animations using Manim Community Edition"
+title: "Manim Video — Manim CE animations: 3Blue1Brown math/algo videos"
 sidebar_label: "Manim Video"
-description: "Production pipeline for mathematical and technical animations using Manim Community Edition"
+description: "Manim CE animations: 3Blue1Brown math/algo videos"
 ---
 {/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
 # Manim Video
-Production pipeline for mathematical and technical animations using Manim Community Edition. Creates 3Blue1Brown-style explainer videos, algorithm visualizations, equation derivations, architecture diagrams, and data stories. Use when users request: animated explanations, math animations, concept visualizations, algorithm walkthroughs, technical explainers, 3Blue1Brown style videos, or any programmatic animation with geometric/mathematical content.
+Manim CE animations: 3Blue1Brown math/algo videos.
 ## Skill metadata
@@ -26,6 +26,10 @@ The following is the complete skill definition that Hermes loads when this skill
 # Manim Video Production Pipeline
+## When to use
+Use when users request: animated explanations, math animations, concept visualizations, algorithm walkthroughs, technical explainers, 3Blue1Brown style videos, or any programmatic animation with geometric/mathematical content. Creates 3Blue1Brown-style explainer videos, algorithm visualizations, equation derivations, architecture diagrams, and data stories using Manim Community Edition.
 ## Creative Standard
 This is educational cinema. Every frame teaches. Every animation reveals structure.


@@ -1,14 +1,14 @@
 ---
-title: "P5Js — Production pipeline for interactive and generative visual art using p5"
+title: "P5Js — p5"
 sidebar_label: "P5Js"
-description: "Production pipeline for interactive and generative visual art using p5"
+description: "p5"
 ---
 {/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
 # P5Js
-Production pipeline for interactive and generative visual art using p5.js. Creates browser-based sketches, generative art, data visualizations, interactive experiences, 3D scenes, audio-reactive visuals, and motion graphics — exported as HTML, PNG, GIF, MP4, or SVG. Covers: 2D/3D rendering, noise and particle systems, flow fields, shaders (GLSL), pixel manipulation, kinetic typography, WebGL scenes, audio analysis, mouse/keyboard interaction, and headless high-res export. Use when users request: p5.js sketches, creative coding, generative art, interactive visualizations, canvas animations, browser-based visual art, data viz, shader effects, or any p5.js project.
+p5.js sketches: gen art, shaders, interactive, 3D.
 ## Skill metadata
@@ -28,6 +28,14 @@ The following is the complete skill definition that Hermes loads when this skill
 # p5.js Production Pipeline
+## When to use
+Use when users request: p5.js sketches, creative coding, generative art, interactive visualizations, canvas animations, browser-based visual art, data viz, shader effects, or any p5.js project.
+## What's inside
+Production pipeline for interactive and generative visual art using p5.js. Creates browser-based sketches, generative art, data visualizations, interactive experiences, 3D scenes, audio-reactive visuals, and motion graphics — exported as HTML, PNG, GIF, MP4, or SVG. Covers: 2D/3D rendering, noise and particle systems, flow fields, shaders (GLSL), pixel manipulation, kinetic typography, WebGL scenes, audio analysis, mouse/keyboard interaction, and headless high-res export.
 ## Creative Standard
 This is visual art rendered in the browser. The canvas is the medium; the algorithm is the brush.


@@ -1,14 +1,14 @@
 ---
-title: "Pixel Art — Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc"
+title: "Pixel Art — Pixel art w/ era palettes (NES, Game Boy, PICO-8)"
 sidebar_label: "Pixel Art"
-description: "Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc"
+description: "Pixel art w/ era palettes (NES, Game Boy, PICO-8)"
 ---
 {/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
 # Pixel Art
-Convert images into retro pixel art with hardware-accurate palettes (NES, Game Boy, PICO-8, C64, etc.), and animate them into short videos. Presets cover arcade, SNES, and 10+ era-correct looks. Use `clarify` to let the user pick a style before generating.
+Pixel art w/ era palettes (NES, Game Boy, PICO-8).
 ## Skill metadata


@ -1,14 +1,14 @@
---
title: "Popular Web Designs — 54 production-quality design systems extracted from real websites"
title: "Popular Web Designs — 54 real design systems (Stripe, Linear, Vercel) as HTML/CSS"
sidebar_label: "Popular Web Designs"
description: "54 production-quality design systems extracted from real websites"
description: "54 real design systems (Stripe, Linear, Vercel) as HTML/CSS"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Popular Web Designs
54 production-quality design systems extracted from real websites. Load a template to generate HTML/CSS that matches the visual identity of sites like Stripe, Linear, Vercel, Notion, Airbnb, and more. Each template includes colors, typography, components, layout rules, and ready-to-use CSS values.
54 real design systems (Stripe, Linear, Vercel) as HTML/CSS.
## Skill metadata
@ -32,6 +32,16 @@ The following is the complete skill definition that Hermes loads when this skill
site's complete visual language: color palette, typography hierarchy, component styles, spacing
system, shadows, responsive behavior, and practical agent prompts with exact CSS values.
## Related design skills
- **`claude-design`** — use for the design *process and taste* (scoping a brief,
producing variants, verifying a local HTML artifact, avoiding AI-design slop).
Pair it with this skill when the user wants a thoughtfully-designed page styled
after a known brand: `claude-design` drives the workflow, this skill supplies
the visual vocabulary.
- **`design-md`** — use when the deliverable is a formal DESIGN.md token spec
file, not a rendered artifact.
## How to Use
1. Pick a design from the catalog below


@ -0,0 +1,237 @@
---
title: "Pretext"
sidebar_label: "Pretext"
description: "Use when building creative browser demos with @chenglou/pretext — DOM-free text layout for ASCII art, typographic flow around obstacles, text-as-geometry gam..."
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Pretext
Use when building creative browser demos with @chenglou/pretext — DOM-free text layout for ASCII art, typographic flow around obstacles, text-as-geometry games, kinetic typography, and text-powered generative art. Produces single-file HTML demos by default.
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/creative/pretext` |
| Version | `1.0.0` |
| Author | Hermes Agent |
| License | MIT |
| Tags | `creative-coding`, `typography`, `pretext`, `ascii-art`, `canvas`, `generative`, `text-layout`, `kinetic-typography` |
| Related skills | [`p5js`](/docs/user-guide/skills/bundled/creative/creative-p5js), [`claude-design`](/docs/user-guide/skills/bundled/creative/creative-claude-design), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw), [`architecture-diagram`](/docs/user-guide/skills/bundled/creative/creative-architecture-diagram) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Pretext Creative Demos
## Overview
[`@chenglou/pretext`](https://github.com/chenglou/pretext) is a 15KB zero-dependency TypeScript library by Cheng Lou (React core, ReasonML, Midjourney) for **DOM-free multiline text measurement and layout**. It does one thing: given `(text, font, width)`, return the line breaks, per-line widths, per-grapheme positions, and total height — all via canvas measurement, no reflow.
That sounds like plumbing. It is not. Because it is fast and geometric, it is a **creative primitive**: you can reflow paragraphs around a moving sprite at 60fps, build games whose level geometry is made of real words, drive ASCII logos through prose, shatter text into particles with exact per-grapheme starting positions, or pack shrink-wrapped multiline UI without any `getBoundingClientRect` thrash.
This skill exists so Hermes can make **cool demos** with it — the kind people post to X. See `pretext.cool` and `chenglou.me/pretext` for the community demo corpus.
## When to Use
Use when the user asks for:
- A "pretext demo" / "cool pretext thing" / "text-as-X"
- Text flowing around a moving shape (hero sections, editorial layouts, animated long-form pages)
- ASCII-art effects using **real words or prose**, not monospace rasters
- Games where the playfield / obstacles / bricks are made of text (Tetris-from-letters, Breakout-of-prose)
- Kinetic typography with per-glyph physics (shatter, scatter, flock, flow)
- Typographic generative art, especially with non-Latin scripts or mixed scripts
- Multiline "shrink-wrap" UI (smallest container width that still fits the text)
- Anything that would require knowing line breaks *before* rendering
Don't use for:
- Static SVG/HTML pages where CSS already solves layout — just use CSS
- Rich text editors, general inline formatting engines (pretext is intentionally narrow)
- Image → text (use `ascii-art` / `ascii-video` skills)
- Pure canvas generative art with no text role — use `p5js`
## Creative Standard
This is visual art rendered in a browser. Pretext returns numbers; **you** draw the thing.
- **Don't ship a "hello world" demo.** The `hello-orb-flow.html` template is the *starting* point. Every delivered demo must add intentional color, motion, composition, and one visual detail the user didn't ask for but will appreciate.
- **Dark backgrounds, warm cores, considered palette.** Classic amber-on-black (CRT / terminal) works, but so do cold-white-on-charcoal (editorial) and desaturated pastels (risograph). Pick one and commit.
- **Proportional fonts are the point.** Pretext's whole vibe is "not monospaced" — lean into it. Use Iowan Old Style, Inter, JetBrains Mono, Helvetica Neue, or a variable font. Never default sans.
- **Real source/text, not lorem ipsum.** The corpus should mean something. Short manifestos, poetry, real source code, a found text, the library's own README — never `lorem ipsum`.
- **First-paint excellence.** No loading states, no blank frames. The demo must look shippable the instant it opens.
## Stack
Single self-contained HTML file per demo. No build step.
| Layer | Tool | Purpose |
|-------|------|---------|
| Core | `@chenglou/pretext` via `esm.sh` CDN | Text measurement + line layout |
| Render | HTML5 Canvas 2D | Glyph rendering, per-frame composition |
| Segmentation | `Intl.Segmenter` (built-in) | Grapheme splitting for emoji / CJK / combining marks |
| Interaction | Raw DOM events | Mouse / touch / wheel — no framework |
```html
<script type="module">
import {
prepare, layout, // use-case 1: simple height
prepareWithSegments, layoutWithLines, // use-case 2a: fixed-width lines
layoutNextLineRange, materializeLineRange, // use-case 2b: streaming / variable width
measureLineStats, walkLineRanges, // stats without string allocation
} from "https://esm.sh/@chenglou/pretext@0.0.6";
</script>
```
Pin the version. `@0.0.6` at time of writing — check [npm](https://www.npmjs.com/package/@chenglou/pretext) for the latest if demo behavior is off.
## The Two Use Cases
Almost everything reduces to one of these two shapes. Learn both.
### Use-case 1 — measure, then render with CSS/DOM
```js
const prepared = prepare(text, "16px Inter");
const { height, lineCount } = layout(prepared, 320, 20);
```
You still let the browser draw the text. Pretext just tells you how tall the box will be at a given width, **without** a DOM read. Use for:
- Virtualized lists where rows contain wrapping text
- Masonry with precise card heights
- "Does this label fit?" dev-time checks
- Preventing layout shift when remote text loads
**Keep `font` and `letterSpacing` exactly in sync with your CSS.** The canvas `ctx.font` format (e.g. `"16px Inter"`, `"500 17px 'JetBrains Mono'"`) must match the rendered CSS, or measurements drift.
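One way to honor this is to build both the canvas font string and the CSS values from a single pair of constants, so the two sides cannot drift independently. A minimal sketch; the constant names are illustrative, not part of pretext:

```javascript
// Single source of truth for the font used by both measurement and CSS.
const FONT_SIZE = 16;                 // px
const FONT_FAMILY = "Inter";
const FONT = `${FONT_SIZE}px ${FONT_FAMILY}`; // canvas ctx.font format

// Derive the CSS side from the same constants, e.g. for an inline style:
const cssFont = { fontSize: `${FONT_SIZE}px`, fontFamily: FONT_FAMILY };
```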
### Use-case 2 — measure *and* render yourself
```js
const prepared = prepareWithSegments(text, FONT);
const { lines } = layoutWithLines(prepared, 320, 26);
for (let i = 0; i < lines.length; i++) {
ctx.fillText(lines[i].text, 0, i * 26);
}
```
This is where the creative work lives. You own the drawing, so you can:
- Render to canvas, SVG, WebGL, or any coordinate system
- Substitute per-glyph transforms (rotation, jitter, scale, opacity)
- Use line metadata (width, grapheme positions) as geometry
For **variable-width-per-line** flow (text around a shape, text in a donut band, text in a non-rectangular column):
```js
let cursor = { segmentIndex: 0, graphemeIndex: 0 };
let y = 0;
while (true) {
const lineWidth = widthAtY(y); // your function: how wide is the corridor at this y?
const range = layoutNextLineRange(prepared, cursor, lineWidth);
if (!range) break;
const line = materializeLineRange(prepared, range);
ctx.fillText(line.text, leftEdgeAtY(y), y);
cursor = range.end;
y += lineHeight;
}
```
This is the most important pattern in the whole library. It's what unlocks "text flowing around a dragged sprite" — the demo that went viral on X.
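The `widthAtY` callback is the only part you write. As a minimal sketch, assume a hypothetical circular obstacle of radius `r` centered on the right margin at height `cy`; rows that pass through the circle lose its horizontal intrusion (the half-chord) at that row. All names here are illustrative:

```javascript
// Hypothetical widthAtY: full column width, minus the circle's intrusion
// on rows that intersect it.
function widthAtY(y, colWidth, cy, r) {
  const dy = y - cy;
  if (Math.abs(dy) >= r) return colWidth;       // row clears the circle
  const intrusion = Math.sqrt(r * r - dy * dy); // half-chord at this row
  return colWidth - intrusion;
}
```

If the result falls below a usable minimum, skip the row rather than passing a tiny width (see the pitfalls section).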
### Helpers worth knowing
- `measureLineStats(prepared, maxWidth)` → `{ lineCount, maxLineWidth }` — `maxLineWidth` is the widest line, i.e. the multiline shrink-wrap width.
- `walkLineRanges(prepared, maxWidth, callback)` — iterate lines without allocating strings. Use for stats/physics over graphemes when you don't need the characters.
- `@chenglou/pretext/rich-inline` — the same system but for paragraphs mixing fonts / chips / mentions. Import from the subpath.
## Demo Recipe Patterns
The community corpus (see `references/patterns.md`) clusters into a handful of strong patterns. Pick one and riff — don't invent a new category unless asked.
| Pattern | Key API | Example idea |
|---|---|---|
| **Reflow around obstacle** | `layoutNextLineRange` + per-row width function | Editorial paragraph that parts around a dragged cursor sprite |
| **Text-as-geometry game** | `layoutWithLines` + per-line collision rects | Breakout where each brick is a measured word |
| **Shatter / particles** | `walkLineRanges` → per-grapheme (x,y) → physics | Sentence that explodes into letters on click |
| **ASCII obstacle typography** | `layoutNextLineRange` + measured per-row obstacle spans | Bitmap ASCII logo, shape morphs, and draggable wire objects that make text open around their actual geometry |
| **Editorial multi-column** | `layoutNextLineRange` per column + shared cursor | Animated magazine spread with pull quotes |
| **Kinetic type** | `layoutWithLines` + per-line transform over time | Star Wars crawl, wave, bounce, glitch |
| **Multiline shrink-wrap** | `measureLineStats` | Quote card that auto-sizes to its tightest container |
See `templates/donut-orbit.html` and `templates/hello-orb-flow.html` for working single-file starters.
## Workflow
1. **Pick a pattern** from the table above based on the user's brief.
2. **Start from a template**:
- `templates/hello-orb-flow.html` — text reflowing around a moving orb (reflow-around-obstacle pattern)
- `templates/donut-orbit.html` — advanced example: measured ASCII logo obstacles, draggable wire sphere/cube, morphing shape fields, selectable DOM text, and dev-only controls
- `write_file` to a new `.html` in `/tmp/` or the user's workspace.
3. **Swap the corpus** for something intentional to the brief. Real prose, 10-100 sentences, no lorem.
4. **Tune the aesthetic** — font, palette, composition, interaction. This is the work; don't skip it.
5. **Verify locally**:
```sh
cd <dir-with-html> && python3 -m http.server 8765
# then open http://localhost:8765/<file>.html
```
6. **Check the console** — pretext will throw if `prepareWithSegments` is called with a bad font string; `Intl.Segmenter` is available in every modern browser.
7. **Show the user the file path**, not just the code — they want to open it.
## Performance Notes
- `prepare()` / `prepareWithSegments()` is the expensive call. Do it **once** per text+font pair. Cache the handle.
- On resize, only rerun `layout()` / `layoutWithLines()` — never re-prepare.
- For per-frame animations where text doesn't change but geometry does, `layoutNextLineRange` in a tight loop is cheap enough to do every frame at 60fps for normal-length paragraphs.
- When rendering ASCII masks per frame, keep a cell buffer (`Uint8Array`/typed arrays), derive measured per-row obstacle spans from the cells or projected geometry, merge spans, then feed those spans into `layoutNextLineRange` before drawing text.
- Keep visual animation and layout animation coupled. If a sphere morphs into a cube, tween both the rendered cell buffer and the obstacle spans with the same value; otherwise the demo looks painted-on instead of physically reflowed.
- For fades, prefer layer opacity over changing glyph intensity or obstacle scale. Put transient ASCII sprites on their own canvas and fade the canvas with CSS/GSAP opacity so geometry does not appear to shrink.
- Canvas `ctx.font` setting is surprisingly slow; set it **once** per frame if font doesn't vary, not per `fillText` call.
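The per-row span bookkeeping described above boils down to a standard interval merge. A minimal sketch (the function is illustrative, not part of pretext):

```javascript
// Merge overlapping [start, end) obstacle spans on one row so each
// blocked region is fed to the layout step exactly once.
function mergeSpans(spans) {
  const sorted = [...spans].sort((a, b) => a[0] - b[0]);
  const merged = [];
  for (const [start, end] of sorted) {
    const last = merged[merged.length - 1];
    if (last && start <= last[1]) last[1] = Math.max(last[1], end); // overlap: extend
    else merged.push([start, end]);                                  // gap: new span
  }
  return merged;
}
```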
## Common Pitfalls
1. **Drifting CSS/canvas font strings.** `ctx.font = "16px Inter"` measured, but CSS says `font-family: Inter, sans-serif; font-size: 16px`. Fine *if* Inter loads. If Inter 404s, CSS falls back to sans-serif and measurements drift by 5-20%. Always `preload` the font or use a web-safe family.
2. **Re-preparing inside the animation loop.** Only `layout*` is cheap. Re-calling `prepare` every frame will tank perf. Keep the prepared handle in module scope.
3. **Forgetting `Intl.Segmenter` for grapheme splits.** Emoji, combining marks, CJK — `"é".split("")` gives you two chars. Use `new Intl.Segmenter(undefined, { granularity: "grapheme" })` when sampling individual visible glyphs.
4. **`break: 'never'` chips without `extraWidth`.** In `rich-inline`, if you use `break: 'never'` for an atomic chip/mention, you must also supply `extraWidth` for the pill padding — otherwise chip chrome overflows the container.
5. **Using `@chenglou/pretext` from `unpkg` with TypeScript-only entry.** Use `esm.sh` — it compiles the TS exports to browser-ready ESM automatically. `unpkg` will 404 or serve raw TS.
6. **Monospace fallbacks silently erasing the whole point.** Users seeing monospace-looking output often have a CSS `font-family` that fell through to `monospace`. Verify the actual rendered font via DevTools.
7. **Skipping rows vs adjusting width** when flowing around a shape. If the corridor on this row is too narrow to fit a line, *skip the row* (`y += lineHeight; continue;`) rather than passing a tiny maxWidth to `layoutNextLineRange` — pretext will return one-grapheme lines that look broken.
8. **Shipping a cold demo.** The default first-paint looks tutorial-grade. Add: vignette, subtle scanline, idle auto-motion, one carefully chosen interactive response (drag, hover, scroll, click). Without these, "cool pretext demo" lands as "intern repro of the README."
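Pitfall 3 is easy to demonstrate: `Intl.Segmenter` is a standard built-in in all modern browsers and recent Node, and the gap versus naive `split` shows up immediately with combining marks and emoji:

```javascript
const text = "e\u0301\u{1F642}"; // "é" (e + combining acute) followed by a 🙂 emoji
const seg = new Intl.Segmenter(undefined, { granularity: "grapheme" });
const graphemes = [...seg.segment(text)].map(s => s.segment);

console.log(graphemes.length);      // 2: the visible glyphs
console.log(text.split("").length); // 4: UTF-16 code units, not glyphs
```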
## Verification Checklist
- [ ] Demo is a single self-contained `.html` file — opens by double-click or `python3 -m http.server`
- [ ] `@chenglou/pretext` imported via `esm.sh` with pinned version
- [ ] Corpus is real prose, not lorem ipsum, and matches the demo's concept
- [ ] Font string passed to `prepare` matches the CSS font exactly
- [ ] `prepare()` / `prepareWithSegments()` called once, not per frame
- [ ] Dark background + considered palette — not the default white canvas
- [ ] At least one interactive response (drag / hover / scroll / click) or idle auto-motion
- [ ] Tested locally with `python3 -m http.server` and confirmed no console errors
- [ ] 60fps on a mid-tier laptop (or graceful degradation documented)
- [ ] One "extra mile" detail the user didn't ask for
## Reference: Community Demos
Clone these for inspiration / patterns (all MIT-ish, linked from [pretext.cool](https://www.pretext.cool/)):
- **Pretext Breaker** — breakout with word-bricks — `github.com/rinesh/pretext-breaker`
- **Tetris × Pretext** — `github.com/shinichimochizuki/tetris-pretext`
- **Dragon animation** — `github.com/qtakmalay/PreTextExperiments`
- **Somnai editorial engine** — `github.com/somnai-dreams/pretext-demos`
- **Bad Apple!! ASCII** — `github.com/frmlinn/bad-apple-pretext`
- **Drag-sprite reflow** — `github.com/dokobot/pretext-demo`
- **Alarmy editorial clock** — `github.com/SmisLee/alarmy-pretext-demo`
Official playground: [chenglou.me/pretext](https://chenglou.me/pretext/) — accordion, bubbles, dynamic-layout, editorial-engine, justification-comparison, masonry, markdown-chat, rich-note.


@ -0,0 +1,237 @@
---
title: "Sketch — Throwaway HTML mockups: 2-3 design variants to compare"
sidebar_label: "Sketch"
description: "Throwaway HTML mockups: 2-3 design variants to compare"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Sketch
Throwaway HTML mockups: 2-3 design variants to compare.
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/creative/sketch` |
| Version | `1.0.0` |
| Author | Hermes Agent (adapted from gsd-build/get-shit-done) |
| License | MIT |
| Tags | `sketch`, `mockup`, `design`, `ui`, `prototype`, `html`, `variants`, `exploration`, `wireframe`, `comparison` |
| Related skills | [`spike`](/docs/user-guide/skills/bundled/software-development/software-development-spike), [`claude-design`](/docs/user-guide/skills/bundled/creative/creative-claude-design), [`popular-web-designs`](/docs/user-guide/skills/bundled/creative/creative-popular-web-designs), [`excalidraw`](/docs/user-guide/skills/bundled/creative/creative-excalidraw) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Sketch
Use this skill when the user wants to **see a design direction before committing** to one — exploring a UI/UX idea as disposable HTML mockups. The point is to generate 2-3 interactive variants so the user can compare visual directions side-by-side, not to produce shippable code.
Load this when the user says things like "sketch this screen", "show me what X could look like", "compare layout A vs B", "give me 2-3 takes on this UI", "let me see some variants", "mockup this before I build".
## When NOT to use this
- User wants a production component — use `claude-design` or build it properly
- User wants a polished one-off HTML artifact (landing page, deck) — `claude-design`
- User wants a diagram — `excalidraw`, `architecture-diagram`
- The design is already locked — just build it
## If the user has the full GSD system installed
If `gsd-sketch` shows up as a sibling skill (installed via `npx get-shit-done-cc --hermes`), prefer **`gsd-sketch`** for the full workflow: persistent `.planning/sketches/` with MANIFEST, frontier mode analysis, consistency audits across past sketches, and integration with the rest of GSD. This skill is the lightweight standalone version — one-off sketching without the state machinery.
## Core method
```
intake → variants → head-to-head → pick winner (or iterate)
```
### 1. Intake (skip if the user already gave you enough)
Before generating variants, get three things — one question at a time, not all at once:
1. **Feel.** "What should this feel like? Adjectives, emotions, a vibe." — *"calm, editorial, like Linear"* tells you more than *"minimal"*.
2. **References.** "What apps, sites, or products capture the feel you're imagining?" — actual references beat abstract descriptions.
3. **Core action.** "What's the single most important thing a user does on this screen?" — the variants should all serve this well; if they don't, they're just decoration.
Reflect each answer briefly before the next question. If the user already gave you all three upfront, skip straight to variants.
### 2. Variants (2-3, never 1, rarely 4+)
Produce **2-3 variants** in one go. Each variant is a complete, standalone HTML file. Don't describe variants — build them. The point is comparison.
Each variant should take a **different design stance**, not different pixel values. Three good variant axes:
- **Density:** compact / airy / ultra-dense (pick two contrasting poles)
- **Emphasis:** content-first / action-first / tool-first
- **Aesthetic:** editorial / utilitarian / playful
- **Layout:** single-column / sidebar / split-pane
- **Grounding:** card-based / bare-content / document-style
Pick one axis and pull the variants apart along it. Two variants that differ only in accent color are wasted effort — the user can't distinguish them.
**Variant naming:** describe the stance, not the number.
<!-- ascii-guard-ignore -->
```
sketches/
├── 001-calm-editorial/
│ ├── index.html
│ └── README.md
├── 001-utilitarian-dense/
│ ├── index.html
│ └── README.md
└── 001-playful-split/
├── index.html
└── README.md
```
<!-- ascii-guard-ignore-end -->
### 3. Make them real HTML
Each variant is a **single self-contained HTML file**:
- Inline `<style>` — no build step, no external CSS
- System fonts or one Google Font via `<link>`
- Tailwind via CDN (`<script src="https://cdn.tailwindcss.com"></script>`) is fine
- Realistic fake content — actual sentences, actual names, not "Lorem ipsum"
- **Interactive**: links clickable, hovers real, at least one state transition (open/close, filter, toggle). A frozen static image is a worse sketch than a sloppy animated one.
Open it in a browser. If it looks broken, fix it before showing the user.
**Verify variants visually — use Hermes' browser tools.** Don't just write HTML and hope it renders; load each variant and look at it:
```
browser_navigate(url="file:///absolute/path/to/sketches/001-calm-editorial/index.html")
browser_vision(question="Does this layout look clean and readable? Any visible bugs (overlapping text, unstyled elements, broken images)?")
```
`browser_vision` returns an AI description of what's actually on the page plus a screenshot path — catches layout bugs that pure source inspection misses (e.g. a font import that silently failed, a flex container that collapsed). Fix and re-navigate until each variant looks right.
**Default CSS reset + system font stack** for fast starts:
```html
<style>
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto,
"Helvetica Neue", Arial, sans-serif;
-webkit-font-smoothing: antialiased;
color: #1a1a1a;
background: #fafafa;
line-height: 1.5;
}
</style>
```
### 4. Variant README
Each variant's `README.md` answers:
```markdown
## Variant: {stance name}
### Design stance
One sentence on the principle driving this variant.
### Key choices
- Layout: ...
- Typography: ...
- Color: ...
- Interaction: ...
### Trade-offs
- Strong at: ...
- Weak at: ...
### Best for
- The kind of user or use case this variant actually serves
```
### 5. Head-to-head
After all variants are built, present them as a comparison. Don't just list — **opinionate**:
```markdown
## Three takes on the home screen
| Dimension | Calm editorial | Utilitarian dense | Playful split |
|-----------|----------------|-------------------|---------------|
| Density | Low | High | Medium |
| Primary action visibility | Low | High | Medium |
| Scan-ability | High | Medium | Low |
| Feel | Calm, trusted | Sharp, tool-like | Inviting, energetic |
**My take:** Utilitarian dense for power users, calm editorial for content-forward audiences. Playful split is weakest — tries to do both and commits to neither.
```
Let the user pick a winner, or combine two into a hybrid, or ask for another round.
## Theming (when the project has a visual identity)
If the user has an existing theme (colors, fonts, tokens), put shared tokens in `sketches/themes/tokens.css` and `@import` them in each variant. Keep tokens minimal:
```css
/* sketches/themes/tokens.css */
:root {
--color-bg: #fafafa;
--color-fg: #1a1a1a;
--color-accent: #0066ff;
--color-muted: #666;
--radius: 8px;
--font-display: "Inter", sans-serif;
--font-body: -apple-system, BlinkMacSystemFont, sans-serif;
}
```
Don't over-tokenize a throwaway sketch — three colors and one font is usually enough.
## Interactivity bar
A sketch is interactive enough when the user can:
1. **Click a primary action** and something visible happens (state change, modal, toast, navigation feint)
2. **See one meaningful state transition** (filter a list, toggle a mode, open/close a panel)
3. **Hover recognizable affordances** (buttons, rows, tabs)
More than that is over-engineering a throwaway. Less than that is a screenshot.
## Frontier mode (picking what to sketch next)
If sketches already exist and the user says "what should I sketch next?":
- **Consistency gaps** — two winning variants from different sketches made independent choices that haven't been composed together yet
- **Unsketched screens** — referenced but never explored
- **State coverage** — happy path sketched, but not empty / loading / error / 1000-items
- **Responsive gaps** — validated at one viewport; does it hold at mobile / ultrawide?
- **Interaction patterns** — static layouts exist; transitions, drag, scroll behavior don't
Propose 2-4 named candidates. Let the user pick.
## Output
- Create `sketches/` (or `.planning/sketches/` if the user is using GSD conventions) in the repo root
- One subdir per variant: `NNN-stance-name/index.html` + `README.md`
- Tell the user how to open them: `open sketches/001-calm-editorial/index.html` on macOS, `xdg-open` on Linux, `start` on Windows
- Keep variants disposable — a sketch that you felt the need to preserve should be promoted into real project code, not curated as an asset
**Typical tool sequence for one variant:**
```
terminal("mkdir -p sketches/001-calm-editorial")
write_file("sketches/001-calm-editorial/index.html", "<!doctype html>...")
write_file("sketches/001-calm-editorial/README.md", "## Variant: Calm editorial\n...")
browser_navigate(url="file://$(pwd)/sketches/001-calm-editorial/index.html")
browser_vision(question="How does this look? Any obvious layout issues?")
```
Repeat for each variant, then present the comparison table.
## Attribution
Adapted from the GSD (Get Shit Done) project's `/gsd-sketch` workflow — MIT © 2025 Lex Christopherson ([gsd-build/get-shit-done](https://github.com/gsd-build/get-shit-done)). The full GSD system ships persistent sketch state, theme/variant pattern references, and consistency-audit workflows; install with `npx get-shit-done-cc --hermes --global`.


@ -1,14 +1,14 @@
---
title: "Songwriting And Ai Music"
title: "Songwriting And Ai Music — Songwriting craft and Suno AI music prompts"
sidebar_label: "Songwriting And Ai Music"
description: "Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation techniques, phonetic tricks, and lessons learned"
description: "Songwriting craft and Suno AI music prompts"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Songwriting And Ai Music
Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation techniques, phonetic tricks, and lessons learned. These are tools and ideas, not rules. Break any of them when the art calls for it.
Songwriting craft and Suno AI music prompts.
## Skill metadata


@ -221,8 +221,9 @@ win.par.winopen.pulse()
| `td_input_clear` | Stop input automation |
| `td_op_screen_rect` | Get screen coords of a node |
| `td_click_screen_point` | Click a point in a screenshot |
| `td_screen_point_to_global` | Convert screenshot pixel to absolute screen coords |
See `references/mcp-tools.md` for full parameter schemas.
The table above covers the 32 tools used in typical creative workflows. The remaining 4 tools (`td_project_quit`, `td_test_session`, `td_dev_log`, `td_clear_dev_log`) are admin/dev-mode utilities — see `references/mcp-tools.md` for the full 36-tool reference with complete parameter schemas.
## Key Implementation Rules
@ -355,6 +356,15 @@ See `references/network-patterns.md` for complete build scripts + shader code.
| `references/operator-tips.md` | Wireframe rendering, feedback TOP setup |
| `references/geometry-comp.md` | Geometry COMP: instancing, POP vs SOP, morphing |
| `references/audio-reactive.md` | Audio band extraction, beat detection, envelope following |
| `references/animation.md` | LFOs, timers, keyframes, easing, expression-driven motion |
| `references/midi-osc.md` | MIDI/OSC controllers, TouchOSC, multi-machine sync |
| `references/particles.md` | POPs and legacy particleSOP — emission, forces, collisions |
| `references/projection-mapping.md` | Multi-window output, corner pin, mesh warp, edge blending |
| `references/external-data.md` | HTTP, WebSocket, MQTT, Serial, TCP, webserverDAT |
| `references/panel-ui.md` | Custom params, panel COMPs, button/slider/field, panelExecuteDAT |
| `references/replicator.md` | replicatorCOMP — data-driven cloning, layouts, callbacks |
| `references/dat-scripting.md` | Execute DAT family — chop/dat/parameter/panel/op/executeDAT |
| `references/3d-scene.md` | Lighting rigs, shadows, IBL/cubemaps, multi-camera, PBR |
| `scripts/setup.sh` | Automated setup script |
---


@ -1,14 +1,14 @@
---
title: "Jupyter Live Kernel — Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb"
title: "Jupyter Live Kernel — Iterative Python via live Jupyter kernel (hamelnb)"
sidebar_label: "Jupyter Live Kernel"
description: "Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb"
description: "Iterative Python via live Jupyter kernel (hamelnb)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Jupyter Live Kernel
Use a live Jupyter kernel for stateful, iterative Python execution via hamelnb. Load this skill when the task involves exploration, iteration, or inspecting intermediate results — data science, ML experimentation, API exploration, or building up complex code step-by-step. Uses terminal to run CLI commands against a live Jupyter kernel. No new tools required.
Iterative Python via live Jupyter kernel (hamelnb).
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Webhook Subscriptions"
title: "Webhook Subscriptions — Webhook subscriptions: event-driven agent runs"
sidebar_label: "Webhook Subscriptions"
description: "Create and manage webhook subscriptions for event-driven agent activation, or for direct push notifications (zero LLM cost)"
description: "Webhook subscriptions: event-driven agent runs"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Webhook Subscriptions
Create and manage webhook subscriptions for event-driven agent activation, or for direct push notifications (zero LLM cost). Use when the user wants external services to trigger agent runs OR push notifications to chats.
Webhook subscriptions: event-driven agent runs.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Dogfood"
title: "Dogfood — Exploratory QA of web apps: find bugs, evidence, reports"
sidebar_label: "Dogfood"
description: "Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports"
description: "Exploratory QA of web apps: find bugs, evidence, reports"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Dogfood
Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports
Exploratory QA of web apps: find bugs, evidence, reports.
## Skill metadata
@ -50,11 +50,13 @@ Follow this 5-phase systematic workflow:
### Phase 1: Plan
1. Create the output directory structure:
<!-- ascii-guard-ignore -->
```
{output_dir}/
├── screenshots/ # Evidence screenshots
└── report.md # Final report (generated in Phase 5)
```
<!-- ascii-guard-ignore-end -->
2. Identify the testing scope based on user input.
3. Build a rough sitemap by planning which pages and features to test:
- Landing/home page

View file

@ -1,14 +1,14 @@
---
title: "Himalaya — CLI to manage emails via IMAP/SMTP"
title: "Himalaya — Himalaya CLI: IMAP/SMTP email from terminal"
sidebar_label: "Himalaya"
description: "CLI to manage emails via IMAP/SMTP"
description: "Himalaya CLI: IMAP/SMTP email from terminal"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Himalaya
CLI to manage emails via IMAP/SMTP. Use himalaya to list, read, write, reply, forward, search, and organize emails from the terminal. Supports multiple accounts and message composition with MML (MIME Meta Language).
Himalaya CLI: IMAP/SMTP email from terminal.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Minecraft Modpack Server — Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip"
title: "Minecraft Modpack Server — Host modded Minecraft servers (CurseForge, Modrinth)"
sidebar_label: "Minecraft Modpack Server"
description: "Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip"
description: "Host modded Minecraft servers (CurseForge, Modrinth)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Minecraft Modpack Server
Set up a modded Minecraft server from a CurseForge/Modrinth server pack zip. Covers NeoForge/Forge install, Java version, JVM tuning, firewall, LAN config, backups, and launch scripts.
Host modded Minecraft servers (CurseForge, Modrinth).
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Pokemon Player — Play Pokemon games autonomously via headless emulation"
title: "Pokemon Player — Play Pokemon via headless emulator + RAM reads"
sidebar_label: "Pokemon Player"
description: "Play Pokemon games autonomously via headless emulation"
description: "Play Pokemon via headless emulator + RAM reads"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Pokemon Player
Play Pokemon games autonomously via headless emulation. Starts a game server, reads structured game state from RAM, makes strategic decisions, and sends button inputs — all from the terminal.
Play Pokemon via headless emulator + RAM reads.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Codebase Inspection"
title: "Codebase Inspection — Inspect codebases w/ pygount: LOC, languages, ratios"
sidebar_label: "Codebase Inspection"
description: "Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios"
description: "Inspect codebases w/ pygount: LOC, languages, ratios"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Codebase Inspection
Inspect and analyze codebases using pygount for LOC counting, language breakdown, and code-vs-comment ratios. Use when asked to check lines of code, repo size, language composition, or codebase stats.
Inspect codebases w/ pygount: LOC, languages, ratios.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Github Auth — Set up GitHub authentication for the agent using git (universally available) or the gh CLI"
title: "Github Auth — GitHub auth setup: HTTPS tokens, SSH keys, gh CLI login"
sidebar_label: "Github Auth"
description: "Set up GitHub authentication for the agent using git (universally available) or the gh CLI"
description: "GitHub auth setup: HTTPS tokens, SSH keys, gh CLI login"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Github Auth
Set up GitHub authentication for the agent using git (universally available) or the gh CLI. Covers HTTPS tokens, SSH keys, credential helpers, and gh auth — with a detection flow to pick the right method automatically.
GitHub auth setup: HTTPS tokens, SSH keys, gh CLI login.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Github Code Review"
title: "Github Code Review — Review PRs: diffs, inline comments via gh or REST"
sidebar_label: "Github Code Review"
description: "Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review"
description: "Review PRs: diffs, inline comments via gh or REST"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Github Code Review
Review code changes by analyzing git diffs, leaving inline comments on PRs, and performing thorough pre-push review. Works with gh CLI or falls back to git + GitHub REST API via curl.
Review PRs: diffs, inline comments via gh or REST.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Github Issues — Create, manage, triage, and close GitHub issues"
title: "Github Issues — Create, triage, label, assign GitHub issues via gh or REST"
sidebar_label: "Github Issues"
description: "Create, manage, triage, and close GitHub issues"
description: "Create, triage, label, assign GitHub issues via gh or REST"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Github Issues
Create, manage, triage, and close GitHub issues. Search existing issues, add labels, assign people, and link to PRs. Works with gh CLI or falls back to git + GitHub REST API via curl.
Create, triage, label, assign GitHub issues via gh or REST.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Github Pr Workflow"
title: "Github Pr Workflow — GitHub PR lifecycle: branch, commit, open, CI, merge"
sidebar_label: "Github Pr Workflow"
description: "Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge"
description: "GitHub PR lifecycle: branch, commit, open, CI, merge"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Github Pr Workflow
Full pull request lifecycle — create branches, commit changes, open PRs, monitor CI status, auto-fix failures, and merge. Works with gh CLI or falls back to git + GitHub REST API via curl.
GitHub PR lifecycle: branch, commit, open, CI, merge.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Github Repo Management — Clone, create, fork, configure, and manage GitHub repositories"
title: "Github Repo Management — Clone/create/fork repos; manage remotes, releases"
sidebar_label: "Github Repo Management"
description: "Clone, create, fork, configure, and manage GitHub repositories"
description: "Clone/create/fork repos; manage remotes, releases"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Github Repo Management
Clone, create, fork, configure, and manage GitHub repositories. Manage remotes, secrets, releases, and workflows. Works with gh CLI or falls back to git + GitHub REST API via curl.
Clone/create/fork repos; manage remotes, releases.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Native Mcp"
title: "Native Mcp — MCP client: connect servers, register tools (stdio/HTTP)"
sidebar_label: "Native Mcp"
description: "Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools"
description: "MCP client: connect servers, register tools (stdio/HTTP)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Native Mcp
Built-in MCP (Model Context Protocol) client that connects to external MCP servers, discovers their tools, and registers them as native Hermes Agent tools. Supports stdio and HTTP transports with automatic reconnection, security filtering, and zero-config tool injection.
MCP client: connect servers, register tools (stdio/HTTP).
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Gif Search — Search and download GIFs from Tenor using curl"
title: "Gif Search — Search/download GIFs from Tenor via curl + jq"
sidebar_label: "Gif Search"
description: "Search and download GIFs from Tenor using curl"
description: "Search/download GIFs from Tenor via curl + jq"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Gif Search
Search and download GIFs from Tenor using curl. No dependencies beyond curl and jq. Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat.
Search/download GIFs from Tenor via curl + jq.
## Skill metadata
@ -31,6 +31,10 @@ The following is the complete skill definition that Hermes loads when this skill
Search and download GIFs directly via the Tenor API using curl. No extra tools needed.
## When to use
Useful for finding reaction GIFs, creating visual content, and sending GIFs in chat.
## Setup
Set your Tenor API key in your environment (add to `~/.hermes/.env`):

View file

@ -1,14 +1,14 @@
---
title: "Heartmula — Set up and run HeartMuLa, the open-source music generation model family (Suno-like)"
title: "Heartmula — HeartMuLa: Suno-like song generation from lyrics + tags"
sidebar_label: "Heartmula"
description: "Set up and run HeartMuLa, the open-source music generation model family (Suno-like)"
description: "HeartMuLa: Suno-like song generation from lyrics + tags"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Heartmula
Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support.
HeartMuLa: Suno-like song generation from lyrics + tags.
## Skill metadata
@ -29,7 +29,7 @@ The following is the complete skill definition that Hermes loads when this skill
# HeartMuLa - Open-Source Music Generation
## Overview
HeartMuLa is a family of open-source music foundation models (Apache-2.0) that generates music conditioned on lyrics and tags. Comparable to Suno for open-source. Includes:
HeartMuLa is a family of open-source music foundation models (Apache-2.0) that generates music conditioned on lyrics and tags, with multilingual support. Generates full songs from lyrics + tags. Comparable to Suno for open-source. Includes:
- **HeartMuLa** - Music language model (3B/7B) for generation from lyrics + tags
- **HeartCodec** - 12.5Hz music codec for high-fidelity audio reconstruction
- **HeartTranscriptor** - Whisper-based lyrics transcription

View file

@ -1,14 +1,14 @@
---
title: "Songsee — Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc"
title: "Songsee — Audio spectrograms/features (mel, chroma, MFCC) via CLI"
sidebar_label: "Songsee"
description: "Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc"
description: "Audio spectrograms/features (mel, chroma, MFCC) via CLI"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Songsee
Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation.
Audio spectrograms/features (mel, chroma, MFCC) via CLI.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Spotify"
title: "Spotify — Spotify: play, search, queue, manage playlists and devices"
sidebar_label: "Spotify"
description: "Control Spotify — play music, search the catalog, manage playlists and library, inspect devices and playback state"
description: "Spotify: play, search, queue, manage playlists and devices"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Spotify
Control Spotify — play music, search the catalog, manage playlists and library, inspect devices and playback state. Loads when the user asks to play/pause/queue music, search tracks/albums/artists, manage playlists, or check what's playing. Assumes the Hermes Spotify toolset is enabled and `hermes auth spotify` has been run.
Spotify: play, search, queue, manage playlists and devices.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Youtube Content"
title: "Youtube Content — YouTube transcripts to summaries, threads, blogs"
sidebar_label: "Youtube Content"
description: "Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts)"
description: "YouTube transcripts to summaries, threads, blogs"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Youtube Content
Fetch YouTube video transcripts and transform them into structured content (chapters, summaries, threads, blog posts). Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to extract and reformat content from any YouTube video.
YouTube transcripts to summaries, threads, blogs.
## Skill metadata
@ -25,6 +25,10 @@ The following is the complete skill definition that Hermes loads when this skill
# YouTube Content Tool
## When to use
Use when the user shares a YouTube URL or video link, asks to summarize a video, requests a transcript, or wants to extract and reformat content from any YouTube video. Transforms transcripts into structured content (chapters, summaries, threads, blog posts).
Extract transcripts from YouTube videos and convert them into useful formats.
## Setup

View file

@ -1,14 +1,14 @@
---
title: "Evaluating Llms Harness — Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag)"
title: "Evaluating Llms Harness — lm-eval-harness: benchmark LLMs (MMLU, GSM8K, etc.)"
sidebar_label: "Evaluating Llms Harness"
description: "Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag)"
description: "lm-eval-harness: benchmark LLMs (MMLU, GSM8K, etc.)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Evaluating Llms Harness
Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.
lm-eval-harness: benchmark LLMs (MMLU, GSM8K, etc.).
## Skill metadata
@ -30,6 +30,10 @@ The following is the complete skill definition that Hermes loads when this skill
# lm-evaluation-harness - LLM Benchmarking
## What's inside
Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.
## Quick start
lm-evaluation-harness evaluates LLMs across 60+ academic benchmarks using standardized prompts and metrics.

View file

@ -1,14 +1,14 @@
---
title: "Weights And Biases"
title: "Weights And Biases — W&B: log ML experiments, sweeps, model registry, dashboards"
sidebar_label: "Weights And Biases"
description: "Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - coll..."
description: "W&B: log ML experiments, sweeps, model registry, dashboards"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Weights And Biases
Track ML experiments with automatic logging, visualize training in real-time, optimize hyperparameters with sweeps, and manage model registry with W&B - collaborative MLOps platform
W&B: log ML experiments, sweeps, model registry, dashboards.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Huggingface Hub"
title: "Huggingface Hub — HuggingFace hf CLI: search/download/upload models, datasets"
sidebar_label: "Huggingface Hub"
description: "Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Space..."
description: "HuggingFace hf CLI: search/download/upload models, datasets"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Huggingface Hub
Hugging Face Hub CLI (hf) — search, download, and upload models and datasets, manage repos, query datasets with SQL, deploy inference endpoints, manage Spaces and buckets.
HuggingFace hf CLI: search/download/upload models, datasets.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Obliteratus"
title: "Obliteratus — OBLITERATUS: abliterate LLM refusals (diff-in-means)"
sidebar_label: "Obliteratus"
description: "Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE deco..."
description: "OBLITERATUS: abliterate LLM refusals (diff-in-means)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Obliteratus
Remove refusal behaviors from open-weight LLMs using OBLITERATUS — mechanistic interpretability techniques (diff-in-means, SVD, whitened SVD, LEACE, SAE decomposition, etc.) to excise guardrails while preserving reasoning. 9 CLI methods, 28 analysis modules, 116 model presets across 5 compute tiers, tournament evaluation, and telemetry-driven recommendations. Use when a user wants to uncensor, abliterate, or remove refusal from an LLM.
OBLITERATUS: abliterate LLM refusals (diff-in-means).
## Skill metadata
@ -31,10 +31,21 @@ The following is the complete skill definition that Hermes loads when this skill
# OBLITERATUS Skill
## What's inside
9 CLI methods, 28 analysis modules, 116 model presets across 5 compute tiers, tournament evaluation, and telemetry-driven recommendations.
Remove refusal behaviors (guardrails) from open-weight LLMs without retraining or fine-tuning. Uses mechanistic interpretability techniques — including diff-in-means, SVD, whitened SVD, LEACE concept erasure, SAE decomposition, Bayesian kernel projection, and more — to identify and surgically excise refusal directions from model weights while preserving reasoning capabilities.
**License warning:** OBLITERATUS is AGPL-3.0. NEVER import it as a Python library. Always invoke via CLI (`obliteratus` command) or subprocess. This keeps Hermes Agent's MIT license clean.
## Video Guide
Walkthrough of OBLITERATUS used by a Hermes agent to abliterate Gemma:
https://www.youtube.com/watch?v=8fG9BrNTeHs ("OBLITERATUS: An AI Agent Removed Gemma 4's Safety Guardrails")
Useful when the user wants a visual overview of the end-to-end workflow before running it themselves.
## When to Use This Skill
Trigger when the user:

View file

@ -1,14 +1,14 @@
---
title: "Outlines"
title: "Outlines — Outlines: structured JSON/regex/Pydantic LLM generation"
sidebar_label: "Outlines"
description: "Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize..."
description: "Outlines: structured JSON/regex/Pydantic LLM generation"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Outlines
Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines - dottxt.ai's structured generation library
Outlines: structured JSON/regex/Pydantic LLM generation.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Serving Llms Vllm — Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching"
title: "Serving Llms Vllm — vLLM: high-throughput LLM serving, OpenAI API, quantization"
sidebar_label: "Serving Llms Vllm"
description: "Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching"
description: "vLLM: high-throughput LLM serving, OpenAI API, quantization"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Serving Llms Vllm
Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.
vLLM: high-throughput LLM serving, OpenAI API, quantization.
## Skill metadata
@ -30,6 +30,10 @@ The following is the complete skill definition that Hermes loads when this skill
# vLLM - High-Performance LLM Serving
## When to use
Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.
## Quick start
vLLM achieves 24x higher throughput than standard transformers through PagedAttention (block-based KV cache) and continuous batching (mixing prefill/decode requests).

View file

@ -1,14 +1,14 @@
---
title: "Audiocraft Audio Generation"
title: "Audiocraft Audio Generation — AudioCraft: MusicGen text-to-music, AudioGen text-to-sound"
sidebar_label: "Audiocraft Audio Generation"
description: "PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen)"
description: "AudioCraft: MusicGen text-to-music, AudioGen text-to-sound"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Audiocraft Audio Generation
PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation.
AudioCraft: MusicGen text-to-music, AudioGen text-to-sound.
## Skill metadata
@ -146,6 +146,7 @@ torchaudio.save("sound.wav", wav[0].cpu(), sample_rate=16000)
### Architecture overview
<!-- ascii-guard-ignore -->
```
AudioCraft Architecture:
┌──────────────────────────────────────────────────────────────┐
@ -165,6 +166,7 @@ AudioCraft Architecture:
│ Converts tokens back to audio waveform │
└──────────────────────────────────────────────────────────────┘
```
<!-- ascii-guard-ignore-end -->
### Model variants

View file

@ -1,14 +1,14 @@
---
title: "Segment Anything Model — Foundation model for image segmentation with zero-shot transfer"
title: "Segment Anything Model — SAM: zero-shot image segmentation via points, boxes, masks"
sidebar_label: "Segment Anything Model"
description: "Foundation model for image segmentation with zero-shot transfer"
description: "SAM: zero-shot image segmentation via points, boxes, masks"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Segment Anything Model
Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image.
SAM: zero-shot image segmentation via points, boxes, masks.
## Skill metadata
@ -151,6 +151,7 @@ masks = processor.image_processor.post_process_masks(
### Model architecture
<!-- ascii-guard-ignore -->
```
SAM Architecture:
@ -163,6 +164,7 @@ SAM Architecture:
(computed once) (per prompt) predictions
```
<!-- ascii-guard-ignore-end -->
### Model variants

View file

@ -1,14 +1,14 @@
---
title: "Dspy"
title: "Dspy — DSPy: declarative LM programs, auto-optimize prompts, RAG"
sidebar_label: "Dspy"
description: "Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's frame..."
description: "DSPy: declarative LM programs, auto-optimize prompts, RAG"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Dspy
Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's framework for systematic LM programming
DSPy: declarative LM programs, auto-optimize prompts, RAG.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Axolotl"
title: "Axolotl — Axolotl: YAML LLM fine-tuning (LoRA, DPO, GRPO)"
sidebar_label: "Axolotl"
description: "Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support"
description: "Axolotl: YAML LLM fine-tuning (LoRA, DPO, GRPO)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Axolotl
Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support
Axolotl: YAML LLM fine-tuning (LoRA, DPO, GRPO).
## Skill metadata
@ -30,6 +30,10 @@ The following is the complete skill definition that Hermes loads when this skill
# Axolotl Skill
## What's inside
Expert guidance for fine-tuning LLMs with Axolotl — YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support.
Comprehensive assistance with axolotl development, generated from official documentation.
## When to Use This Skill

View file

@ -1,14 +1,14 @@
---
title: "Fine Tuning With Trl"
title: "Fine Tuning With Trl — TRL: SFT, DPO, PPO, GRPO, reward modeling for LLM RLHF"
sidebar_label: "Fine Tuning With Trl"
description: "Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward..."
description: "TRL: SFT, DPO, PPO, GRPO, reward modeling for LLM RLHF"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Fine Tuning With Trl
Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when need RLHF, align model with preferences, or train from human feedback. Works with HuggingFace Transformers.
TRL: SFT, DPO, PPO, GRPO, reward modeling for LLM RLHF.
## Skill metadata

View file

@ -1,14 +1,14 @@
---
title: "Unsloth"
title: "Unsloth — Unsloth: 2-5x faster LoRA/QLoRA fine-tuning, less VRAM"
sidebar_label: "Unsloth"
description: "Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization"
description: "Unsloth: 2-5x faster LoRA/QLoRA fine-tuning, less VRAM"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Unsloth
Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization
Unsloth: 2-5x faster LoRA/QLoRA fine-tuning, less VRAM.
## Skill metadata

View file

@ -0,0 +1,242 @@
---
title: "Airtable — Airtable REST API via curl"
sidebar_label: "Airtable"
description: "Airtable REST API via curl"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Airtable
Airtable REST API via curl. Records CRUD, filters, upserts.
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/productivity/airtable` |
| Version | `1.1.0` |
| Author | community |
| License | MIT |
| Tags | `Airtable`, `Productivity`, `Database`, `API` |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Airtable — Bases, Tables & Records
Work with Airtable's REST API directly via `curl` using the `terminal` tool. No MCP server, no OAuth flow, no Python SDK — just `curl` and a personal access token.
## Prerequisites
1. Create a **Personal Access Token (PAT)** at https://airtable.com/create/tokens (tokens start with `pat...`).
2. Grant these scopes (minimum):
- `data.records:read` — read rows
- `data.records:write` — create / update / delete rows
- `schema.bases:read` — list bases and tables
3. **Important:** in the same token UI, add each base you want to access to the token's **Access** list. PATs are scoped per-base — a valid token on the wrong base returns `403`.
4. Store the token in `~/.hermes/.env` (or via `hermes setup`):
```
AIRTABLE_API_KEY=pat_your_token_here
```
> Note: legacy `key...` API keys were deprecated Feb 2024. Only PATs and OAuth tokens work now.
## API Basics
- **Endpoint:** `https://api.airtable.com/v0`
- **Auth header:** `Authorization: Bearer $AIRTABLE_API_KEY`
- **All requests** use JSON (`Content-Type: application/json` for any POST/PATCH/PUT body).
- **Object IDs:** bases `app...`, tables `tbl...`, records `rec...`, fields `fld...`. IDs never change; names can. Prefer IDs in automations.
- **Rate limit:** 5 requests/sec/base. `429` → back off. Burst on a single base will be throttled.
Base curl pattern:
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?maxRecords=5" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
`-s` suppresses curl's progress bar — keep it set for every call so the tool output stays clean for Hermes. Pipe through `python3 -m json.tool` (always present) or `jq` (if installed) for readable JSON.
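Given the 5 requests/sec/base limit above, long loops can hit `429`s. A minimal back-off wrapper, as a sketch — the `airtable_get` helper name and the temp-file handling are illustrative, not part of the API:

```bash
# Retry a GET up to 3 times, backing off when Airtable returns 429.
# Response body lands in a temp file; the HTTP status comes from curl -w.
airtable_get() {
  local url="$1" attempt code body
  body=$(mktemp)
  for attempt in 1 2 3; do
    code=$(curl -s -o "$body" -w '%{http_code}' \
      -H "Authorization: Bearer $AIRTABLE_API_KEY" "$url")
    [ "$code" != "429" ] && break
    sleep "$attempt"   # linear back-off: 1s, 2s, 3s
  done
  cat "$body"
  rm -f "$body"
}
```

Drop it in front of any read-heavy loop; writes can reuse the same pattern with `-X POST`/`-X PATCH` added.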
## Field Types (request body shapes)
| Field type | Write shape |
|---|---|
| Single line text | `"Name": "hello"` |
| Long text | `"Notes": "multi\nline"` |
| Number | `"Score": 42` |
| Checkbox | `"Done": true` |
| Single select | `"Status": "Todo"` (name must already exist unless `typecast: true`) |
| Multi-select | `"Tags": ["urgent", "bug"]` |
| Date | `"Due": "2026-04-01"` |
| DateTime (UTC) | `"At": "2026-04-01T14:30:00.000Z"` |
| URL / Email / Phone | `"Link": "https://…"` |
| Attachment | `"Files": [{"url": "https://…"}]` (Airtable fetches + rehosts) |
| Linked record | `"Owner": ["recXXXXXXXXXXXXXX"]` (array of record IDs) |
| User | `"AssignedTo": {"id": "usrXXXXXXXXXXXXXX"}` |
Pass `"typecast": true` at the top level of a create/update body to let Airtable auto-coerce values (e.g. create a new select option on the fly, convert `"42"` → `42`).
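Putting several of those shapes together — an illustrative create body (the field names are hypothetical; validate locally before sending):

```bash
cat > /tmp/airtable_body.json <<'EOF'
{
  "typecast": true,
  "fields": {
    "Name": "Quarterly report",
    "Score": 42,
    "Done": false,
    "Status": "Todo",
    "Tags": ["urgent", "bug"],
    "Due": "2026-04-01",
    "Owner": ["recXXXXXXXXXXXXXX"]
  }
}
EOF
# Validate before POSTing it with -d @/tmp/airtable_body.json
python3 -m json.tool /tmp/airtable_body.json > /dev/null && echo "valid JSON"
```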
## Common Queries
### List bases the token can see
```bash
curl -s "https://api.airtable.com/v0/meta/bases" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
### List tables + schema for a base
```bash
curl -s "https://api.airtable.com/v0/meta/bases/$BASE_ID/tables" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
Use this BEFORE mutating — confirms exact field names and IDs, surfaces `options.choices` for select fields, and shows primary-field names.
### List records (first 10)
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?maxRecords=10" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
### Get a single record
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE/$RECORD_ID" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
### Filter records (filterByFormula)
Airtable formulas must be URL-encoded. Let Python stdlib do it — never hand-encode:
```bash
FORMULA="{Status}='Todo'"
ENC=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' "$FORMULA")
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?filterByFormula=$ENC&maxRecords=20" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
Useful formula patterns:
- Exact match: `{Email}='user@example.com'`
- Contains: `FIND('bug', LOWER({Title}))`
- Multiple conditions: `AND({Status}='Todo', {Priority}='High')`
- Or: `OR({Owner}='alice', {Owner}='bob')`
- Not empty: `NOT({Assignee}='')`
- Date comparison: `IS_AFTER({Due}, TODAY())`
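Equality conditions can be composed and encoded in one step. A sketch — the `and_formula` helper name is made up, and it only handles simple `{Field}='value'` conditions:

```bash
and_formula() {  # and_formula Field1 value1 [Field2 value2 ...] -> URL-encoded formula
  python3 - "$@" <<'EOF'
import sys, urllib.parse
pairs = sys.argv[1:]
conds = [f"{{{f}}}='{v}'" for f, v in zip(pairs[::2], pairs[1::2])]
formula = conds[0] if len(conds) == 1 else "AND(" + ", ".join(conds) + ")"
print(urllib.parse.quote(formula, safe=""))
EOF
}
# Usage:
# curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?filterByFormula=$(and_formula Status Todo Priority High)" ...
```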
### Sort + select specific fields
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?sort%5B0%5D%5Bfield%5D=Priority&sort%5B0%5D%5Bdirection%5D=asc&fields%5B%5D=Name&fields%5B%5D=Status" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
Square brackets in query params MUST be URL-encoded (`%5B` / `%5D`).
### Use a named view
```bash
curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?view=Grid%20view&maxRecords=50" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
Views apply their saved filter + sort server-side.
## Common Mutations
### Create a record
```bash
curl -s -X POST "https://api.airtable.com/v0/$BASE_ID/$TABLE" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" \
-H "Content-Type: application/json" \
-d '{"fields":{"Name":"New task","Status":"Todo","Priority":"High"}}' | python3 -m json.tool
```
### Create up to 10 records in one call
```bash
curl -s -X POST "https://api.airtable.com/v0/$BASE_ID/$TABLE" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"typecast": true,
"records": [
{"fields": {"Name": "Task A", "Status": "Todo"}},
{"fields": {"Name": "Task B", "Status": "In progress"}}
]
}' | python3 -m json.tool
```
Batch endpoints are capped at **10 records per request**. For larger inserts, loop in batches of 10 with a short sleep to respect 5 req/sec/base.
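The chunking step itself needs no network call — a sketch with a hypothetical `chunk_records` helper that turns a JSON array of records into ready-to-send 10-record bodies, one per line:

```bash
chunk_records() {  # reads a JSON array of {"fields": ...} objects on stdin
  python3 -c '
import json, sys
records = json.load(sys.stdin)
for i in range(0, len(records), 10):
    print(json.dumps({"records": records[i:i+10], "typecast": True}))'
}
```

Loop over the output with `while IFS= read -r BATCH; do curl ... -d "$BATCH"; sleep 0.3; done` to stay under the 5 req/sec budget.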
### Update a record (PATCH — merges, preserves unchanged fields)
```bash
curl -s -X PATCH "https://api.airtable.com/v0/$BASE_ID/$TABLE/$RECORD_ID" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" \
-H "Content-Type: application/json" \
-d '{"fields":{"Status":"Done"}}' | python3 -m json.tool
```
### Upsert by a merge field (no ID needed)
```bash
curl -s -X PATCH "https://api.airtable.com/v0/$BASE_ID/$TABLE" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"performUpsert": {"fieldsToMergeOn": ["Email"]},
"records": [
{"fields": {"Email": "user@example.com", "Status": "Active"}}
]
}' | python3 -m json.tool
```
`performUpsert` creates records whose merge-field values are new, patches records whose merge-field values already exist. Great for idempotent syncs.
### Delete a record
```bash
curl -s -X DELETE "https://api.airtable.com/v0/$BASE_ID/$TABLE/$RECORD_ID" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
### Delete up to 10 records in one call
```bash
curl -s -X DELETE "https://api.airtable.com/v0/$BASE_ID/$TABLE?records%5B%5D=rec1&records%5B%5D=rec2" \
-H "Authorization: Bearer $AIRTABLE_API_KEY" | python3 -m json.tool
```
## Pagination
List endpoints return at most **100 records per page**. If the response includes `"offset": "..."`, pass it back on the next call. Loop until the field is absent:
```bash
OFFSET=""
while :; do
URL="https://api.airtable.com/v0/$BASE_ID/$TABLE?pageSize=100"
[ -n "$OFFSET" ] && URL="$URL&offset=$OFFSET"
RESP=$(curl -s "$URL" -H "Authorization: Bearer $AIRTABLE_API_KEY")
echo "$RESP" | python3 -c 'import json,sys; d=json.load(sys.stdin); [print(r["id"], r["fields"].get("Name","")) for r in d["records"]]'
OFFSET=$(echo "$RESP" | python3 -c 'import json,sys; d=json.load(sys.stdin); print(d.get("offset",""))')
[ -z "$OFFSET" ] && break
done
```
## Typical Hermes Workflow
1. **Confirm auth.** `curl -s -o /dev/null -w "%{http_code}\n" https://api.airtable.com/v0/meta/bases -H "Authorization: Bearer $AIRTABLE_API_KEY"` — expect `200`.
2. **Find the base.** List bases (step above) OR ask the user for the `app...` ID directly if the token lacks `schema.bases:read`.
3. **Inspect the schema.** `GET /v0/meta/bases/$BASE_ID/tables` — cache the exact field names and primary-field name locally in the session before mutating anything.
4. **Read before you write.** For "update X where Y", `filterByFormula` first to resolve the `rec...` ID, then `PATCH /v0/$BASE_ID/$TABLE/$RECORD_ID`. Never guess record IDs.
5. **Batch writes.** Combine related creates into one 10-record POST to stay under the 5 req/sec budget.
6. **Destructive ops.** Deletions can't be undone via API. If the user says "delete all Xs", echo back the filter + record count and confirm before firing.
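Steps 4–5 in miniature — a hypothetical `patch_status_by_name` helper (assumes a `Name` primary field and a `Status` field; adapt to the actual schema from step 3):

```bash
patch_status_by_name() {  # patch_status_by_name "Task A" Done
  local enc rec_id
  enc=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote("{Name}=\x27" + sys.argv[1] + "\x27", safe=""))' "$1")
  rec_id=$(curl -s "https://api.airtable.com/v0/$BASE_ID/$TABLE?filterByFormula=$enc&maxRecords=1" \
    -H "Authorization: Bearer $AIRTABLE_API_KEY" \
    | python3 -c 'import json, sys; r = json.load(sys.stdin).get("records", []); print(r[0]["id"] if r else "")')
  if [ -z "$rec_id" ]; then echo "no record matched" >&2; return 1; fi
  curl -s -X PATCH "https://api.airtable.com/v0/$BASE_ID/$TABLE/$rec_id" \
    -H "Authorization: Bearer $AIRTABLE_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"fields\":{\"Status\":\"$2\"}}"
}
```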
## Pitfalls
- **`filterByFormula` MUST be URL-encoded.** Field names with spaces or non-ASCII also need encoding (`{My Field}` → `%7BMy%20Field%7D`). Use Python stdlib (pattern above) — never hand-escape.
- **Empty fields are omitted from responses.** A missing `"Assignee"` key doesn't mean the field doesn't exist — it means this record's value is empty. Check the schema (step 3) before concluding a field is missing.
- **PATCH vs PUT.** `PATCH` merges supplied fields into the record. `PUT` replaces the record entirely and clears any field you didn't include. Default to `PATCH`.
- **Single-select options must exist.** Writing `"Status": "Shipping"` when `Shipping` isn't in the field's option list errors with `INVALID_MULTIPLE_CHOICE_OPTIONS` unless you pass `"typecast": true` (which auto-creates the option).
- **Per-base token scoping.** A `403` on one base while another works means the token's Access list doesn't include that base — not a scope or auth issue. Send the user to https://airtable.com/create/tokens to grant it.
- **Rate limits are per base, not per token.** 5 req/sec on `baseA` and 5 req/sec on `baseB` is fine; 6 req/sec on `baseA` alone will throttle. Monitor the `Retry-After` header on `429`.
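A small response gate covering several of these pitfalls — `check_airtable` is a made-up helper name; the `error` shape matches what Airtable returns on non-2xx responses:

```bash
check_airtable() {  # reads a response body on stdin; exits non-zero on an API error
  python3 -c '
import json, sys
d = json.load(sys.stdin)
err = d.get("error")
if err:
    # "error" may be an object {"type", "message"} or occasionally a bare string
    print("Airtable error:", err.get("type", err) if isinstance(err, dict) else err, file=sys.stderr)
    sys.exit(1)
print(json.dumps(d))'
}
```

Pipe any mutation response through it before trusting the result, e.g. `curl ... | check_airtable | python3 -m json.tool`.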
## Important Notes for Hermes
- **Always use the `terminal` tool with `curl`.** Do NOT use `web_extract` (it can't send auth headers) or `browser_navigate` (needs UI auth and is slow).
- **`AIRTABLE_API_KEY` flows from `~/.hermes/.env` into the subprocess automatically** when this skill is loaded — no need to re-export it before each `curl` call.
- **Escape curly braces in formulas carefully.** In a heredoc body or single-quoted argument, `{Status}` stays literal — bash brace expansion only triggers on patterns like `{a,b}`. Either way, pass dynamic strings through `python3`'s `urllib.parse.quote` before splicing them into a URL.
- **Pretty-print with `python3 -m json.tool`** (always present) rather than `jq` (optional). Only reach for `jq` when you need filtering/projection.
- **Pagination is per-page, not global.** Airtable's 100-record cap is a hard limit; there is no way to bump it. Loop with `offset` until the field is absent.
- **Read the `error` object** on non-2xx responses — Airtable returns structured error codes like `AUTHENTICATION_REQUIRED`, `INVALID_PERMISSIONS`, `MODEL_ID_NOT_FOUND`, `INVALID_MULTIPLE_CHOICE_OPTIONS` that tell you exactly what's wrong.


@ -1,14 +1,14 @@
---
title: "Google Workspace — Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration for Hermes"
title: "Google Workspace — Gmail, Calendar, Drive, Docs, Sheets via gws CLI or Python"
sidebar_label: "Google Workspace"
description: "Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration for Hermes"
description: "Gmail, Calendar, Drive, Docs, Sheets via gws CLI or Python"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Google Workspace
Gmail, Calendar, Drive, Contacts, Sheets, and Docs integration for Hermes. Uses Hermes-managed OAuth2 setup, prefers the Google Workspace CLI (`gws`) when available for broader API coverage, and falls back to the Python client libraries otherwise.
Gmail, Calendar, Drive, Docs, Sheets via gws CLI or Python.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Linear — Manage Linear issues, projects, and teams via the GraphQL API"
title: "Linear — Linear: manage issues, projects, teams via GraphQL + curl"
sidebar_label: "Linear"
description: "Manage Linear issues, projects, and teams via the GraphQL API"
description: "Linear: manage issues, projects, teams via GraphQL + curl"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Linear
Manage Linear issues, projects, and teams via the GraphQL API. Create, update, search, and organize issues. Uses API key auth (no OAuth needed). All operations via curl — no dependencies.
Linear: manage issues, projects, teams via GraphQL + curl.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Maps"
title: "Maps — Geocode, POIs, routes, timezones via OpenStreetMap/OSRM"
sidebar_label: "Maps"
description: "Location intelligence — geocode a place, reverse-geocode coordinates, find nearby places (46 POI categories), driving/walking/cycling distance + time, turn-b..."
description: "Geocode, POIs, routes, timezones via OpenStreetMap/OSRM"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Maps
Location intelligence — geocode a place, reverse-geocode coordinates, find nearby places (46 POI categories), driving/walking/cycling distance + time, turn-by-turn directions, timezone lookup, bounding box + area for a named place, and POI search within a rectangle. Uses OpenStreetMap + Overpass + OSRM. Free, no API key.
Geocode, POIs, routes, timezones via OpenStreetMap/OSRM.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Nano Pdf — Edit PDFs with natural-language instructions using the nano-pdf CLI"
title: "Nano Pdf — Edit PDF text/typos/titles via nano-pdf CLI (NL prompts)"
sidebar_label: "Nano Pdf"
description: "Edit PDFs with natural-language instructions using the nano-pdf CLI"
description: "Edit PDF text/typos/titles via nano-pdf CLI (NL prompts)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Nano Pdf
Edit PDFs with natural-language instructions using the nano-pdf CLI. Modify text, fix typos, update titles, and make content changes to specific pages without manual editing.
Edit PDF text/typos/titles via nano-pdf CLI (NL prompts).
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Notion — Notion API for creating and managing pages, databases, and blocks via curl"
title: "Notion — Notion API via curl: pages, databases, blocks, search"
sidebar_label: "Notion"
description: "Notion API for creating and managing pages, databases, and blocks via curl"
description: "Notion API via curl: pages, databases, blocks, search"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Notion
Notion API for creating and managing pages, databases, and blocks via curl. Search, create, update, and query Notion workspaces directly from the terminal.
Notion API via curl: pages, databases, blocks, search.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Ocr And Documents — Extract text from PDFs and scanned documents"
title: "Ocr And Documents — Extract text from PDFs/scans (pymupdf, marker-pdf)"
sidebar_label: "Ocr And Documents"
description: "Extract text from PDFs and scanned documents"
description: "Extract text from PDFs/scans (pymupdf, marker-pdf)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Ocr And Documents
Extract text from PDFs and scanned documents. Use web_extract for remote URLs, pymupdf for local text-based PDFs, marker-pdf for OCR/scanned docs. For DOCX use python-docx, for PPTX see the powerpoint skill.
Extract text from PDFs/scans (pymupdf, marker-pdf).
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Powerpoint — Use this skill any time a"
title: "Powerpoint — Create, read, edit"
sidebar_label: "Powerpoint"
description: "Use this skill any time a"
description: "Create, read, edit"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Powerpoint
Use this skill any time a .pptx file is involved in any way — as input, output, or both. This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted content will be used elsewhere, like in an email or summary); editing, modifying, or updating existing presentations; combining or splitting slide files; working with templates, layouts, speaker notes, or comments. Trigger whenever the user mentions "deck," "slides," "presentation," or references a .pptx filename, regardless of what they plan to do with the content afterward. If a .pptx file needs to be opened, created, or touched, use this skill.
Create, read, edit .pptx decks, slides, notes, templates.
## Skill metadata
@ -26,6 +26,10 @@ The following is the complete skill definition that Hermes loads when this skill
# Powerpoint Skill
## When to use
Use this skill any time a .pptx file is involved in any way — as input, output, or both. This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted content will be used elsewhere, like in an email or summary); editing, modifying, or updating existing presentations; combining or splitting slide files; working with templates, layouts, speaker notes, or comments. Trigger whenever the user mentions "deck," "slides," "presentation," or references a .pptx filename, regardless of what they plan to do with the content afterward. If a .pptx file needs to be opened, created, or touched, use this skill.
## Quick Reference
| Task | Guide |


@ -1,14 +1,14 @@
---
title: "Godmode"
title: "Godmode — Jailbreak LLMs: Parseltongue, GODMODE, ULTRAPLINIAN"
sidebar_label: "Godmode"
description: "Jailbreak API-served LLMs using G0DM0D3 techniques — Parseltongue input obfuscation (33 techniques), GODMODE CLASSIC system prompt templates, ULTRAPLINIAN mu..."
description: "Jailbreak LLMs: Parseltongue, GODMODE, ULTRAPLINIAN"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Godmode
Jailbreak API-served LLMs using G0DM0D3 techniques — Parseltongue input obfuscation (33 techniques), GODMODE CLASSIC system prompt templates, ULTRAPLINIAN multi-model racing, encoding escalation, and Hermes-native prefill/system prompt integration. Use when a user wants to bypass safety filters, jailbreak a model, or red-team model robustness.
Jailbreak LLMs: Parseltongue, GODMODE, ULTRAPLINIAN.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Arxiv — Search and retrieve academic papers from arXiv using their free REST API"
title: "Arxiv — Search arXiv papers by keyword, author, category, or ID"
sidebar_label: "Arxiv"
description: "Search and retrieve academic papers from arXiv using their free REST API"
description: "Search arXiv papers by keyword, author, category, or ID"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Arxiv
Search and retrieve academic papers from arXiv using their free REST API. No API key needed. Search by keyword, author, category, or ID. Combine with web_extract or the ocr-and-documents skill to read full paper content.
Search arXiv papers by keyword, author, category, or ID.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Blogwatcher — Monitor blogs and RSS/Atom feeds for updates using the blogwatcher-cli tool"
title: "Blogwatcher — Monitor blogs and RSS/Atom feeds via blogwatcher-cli tool"
sidebar_label: "Blogwatcher"
description: "Monitor blogs and RSS/Atom feeds for updates using the blogwatcher-cli tool"
description: "Monitor blogs and RSS/Atom feeds via blogwatcher-cli tool"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Blogwatcher
Monitor blogs and RSS/Atom feeds for updates using the blogwatcher-cli tool. Add blogs, scan for new articles, track read status, and filter by category.
Monitor blogs and RSS/Atom feeds via blogwatcher-cli tool.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Llm Wiki — Karpathy's LLM Wiki — build and maintain a persistent, interlinked markdown knowledge base"
title: "Llm Wiki — Karpathy's LLM Wiki: build/query interlinked markdown KB"
sidebar_label: "Llm Wiki"
description: "Karpathy's LLM Wiki — build and maintain a persistent, interlinked markdown knowledge base"
description: "Karpathy's LLM Wiki: build/query interlinked markdown KB"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Llm Wiki
Karpathy's LLM Wiki — build and maintain a persistent, interlinked markdown knowledge base. Ingest sources, query compiled knowledge, and lint for consistency.
Karpathy's LLM Wiki: build/query interlinked markdown KB.
## Skill metadata
@ -64,6 +64,7 @@ any editor. No database, no special tooling required.
## Architecture: Three Layers
<!-- ascii-guard-ignore -->
```
wiki/
├── SCHEMA.md # Conventions, structure rules, domain config
@ -79,6 +80,7 @@ wiki/
├── comparisons/ # Layer 2: Side-by-side analyses
└── queries/ # Layer 2: Filed query results worth keeping
```
<!-- ascii-guard-ignore-end -->
**Layer 1 — Raw Sources:** Immutable. The agent reads but never modifies these.
**Layer 2 — The Wiki:** Agent-owned markdown files. Created, updated, and


@ -1,14 +1,14 @@
---
title: "Polymarket — Query Polymarket prediction market data — search markets, get prices, orderbooks, and price history"
title: "Polymarket — Query Polymarket: markets, prices, orderbooks, history"
sidebar_label: "Polymarket"
description: "Query Polymarket prediction market data — search markets, get prices, orderbooks, and price history"
description: "Query Polymarket: markets, prices, orderbooks, history"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Polymarket
Query Polymarket prediction market data — search markets, get prices, orderbooks, and price history. Read-only via public REST APIs, no API key needed.
Query Polymarket: markets, prices, orderbooks, history.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Research Paper Writing"
title: "Research Paper Writing — Write ML papers for NeurIPS/ICML/ICLR: design→submit"
sidebar_label: "Research Paper Writing"
description: "End-to-end pipeline for writing ML/AI research papers — from experiment design through analysis, drafting, revision, and submission"
description: "Write ML papers for NeurIPS/ICML/ICLR: design→submit"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Research Paper Writing
End-to-end pipeline for writing ML/AI research papers — from experiment design through analysis, drafting, revision, and submission. Covers NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Integrates automated experiment monitoring, statistical analysis, iterative writing, and citation verification.
Write ML papers for NeurIPS/ICML/ICLR: design→submit.
## Skill metadata
@ -36,6 +36,7 @@ End-to-end pipeline for producing publication-ready ML/AI research papers target
This is **not a linear pipeline** — it is an iterative loop. Results trigger new experiments. Reviews trigger new analysis. The agent must handle these feedback loops.
<!-- ascii-guard-ignore -->
```
┌─────────────────────────────────────────────────────────────┐
@ -57,6 +58,7 @@ This is **not a linear pipeline** — it is an iterative loop. Results trigger n
└─────────────────────────────────────────────────────────────┘
```
<!-- ascii-guard-ignore-end -->
---
@ -739,6 +741,7 @@ Any output in this pipeline — paper drafts, experiment scripts, analysis — c
**Core insight**: Autoreason's value depends on the gap between a model's generation capability and its self-evaluation capability.
<!-- ascii-guard-ignore -->
```
Model Tier │ Generation │ Self-Eval │ Gap │ Autoreason Value
──────────────────┼────────────┼───────────┼────────┼─────────────────
@ -748,6 +751,7 @@ Mid (Gemini Flash)│ Decent │ Moderate │ Large │ High — wins 2/3
Strong (Sonnet 4) │ Good │ Decent │ Medium │ Moderate — wins 3/5
Frontier (S4.6) │ Excellent │ Good │ Small │ Only with constraints
```
<!-- ascii-guard-ignore-end -->
This gap is structural, not temporary. As costs drop, today's frontier becomes tomorrow's mid-tier. The sweet spot moves but never disappears.


@ -1,14 +1,14 @@
---
title: "Openhue — Control Philips Hue lights, rooms, and scenes via the OpenHue CLI"
title: "Openhue — Control Philips Hue lights, scenes, rooms via OpenHue CLI"
sidebar_label: "Openhue"
description: "Control Philips Hue lights, rooms, and scenes via the OpenHue CLI"
description: "Control Philips Hue lights, scenes, rooms via OpenHue CLI"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Openhue
Control Philips Hue lights, rooms, and scenes via the OpenHue CLI. Turn lights on/off, adjust brightness, color, color temperature, and activate scenes.
Control Philips Hue lights, scenes, rooms via OpenHue CLI.
## Skill metadata


@ -1,14 +1,14 @@
---
title: "Xurl — Interact with X/Twitter via xurl, the official X API CLI"
title: "Xurl — X/Twitter via xurl CLI: post, search, DM, media, v2 API"
sidebar_label: "Xurl"
description: "Interact with X/Twitter via xurl, the official X API CLI"
description: "X/Twitter via xurl CLI: post, search, DM, media, v2 API"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Xurl
Interact with X/Twitter via xurl, the official X API CLI. Use for posting, replying, quoting, searching, timelines, mentions, likes, reposts, bookmarks, follows, DMs, media upload, and raw v2 endpoint access.
X/Twitter via xurl CLI: post, search, DM, media, v2 API.
## Skill metadata


@ -0,0 +1,171 @@
---
title: "Debugging Hermes Tui Commands — Debug Hermes TUI slash commands: Python, gateway, Ink UI"
sidebar_label: "Debugging Hermes Tui Commands"
description: "Debug Hermes TUI slash commands: Python, gateway, Ink UI"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Debugging Hermes Tui Commands
Debug Hermes TUI slash commands: Python, gateway, Ink UI.
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/software-development/debugging-hermes-tui-commands` |
| Version | `1.0.0` |
| Author | Hermes Agent |
| License | MIT |
| Tags | `debugging`, `hermes-agent`, `tui`, `slash-commands`, `typescript`, `python` |
| Related skills | [`python-debugpy`](/docs/user-guide/skills/bundled/software-development/software-development-python-debugpy), [`node-inspect-debugger`](/docs/user-guide/skills/bundled/software-development/software-development-node-inspect-debugger), [`systematic-debugging`](/docs/user-guide/skills/bundled/software-development/software-development-systematic-debugging) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Debugging Hermes TUI Slash Commands
## Overview
Hermes slash commands span three layers — Python command registry, tui_gateway JSON-RPC bridge, and the Ink/TypeScript frontend. When a command misbehaves (missing from autocomplete, works in CLI but not TUI, config persists but UI doesn't update), the bug is almost always one layer being out of sync with another.
Use this skill when you encounter issues with slash commands in the Hermes TUI, particularly when commands aren't showing in autocomplete, aren't working properly in the TUI, or need to be added/updated.
## When to Use
- A slash command exists in one part of the codebase but doesn't work fully
- A command needs to be added to both backend and frontend
- Command autocomplete isn't working for specific commands
- Command behavior is inconsistent between CLI and TUI
- A command persists config but doesn't apply live in the TUI
## Architecture Overview
<!-- ascii-guard-ignore -->
```
Python backend (hermes_cli/commands.py) <- canonical COMMAND_REGISTRY
TUI gateway (tui_gateway/server.py) <- slash.exec / command.dispatch
TUI frontend (ui-tui/src/app/slash/) <- local handlers + fallthrough
```
<!-- ascii-guard-ignore-end -->
Command definitions must be registered consistently across Python and TypeScript to work properly. The Python `COMMAND_REGISTRY` is the source of truth for: CLI dispatch, gateway help, Telegram BotCommand menu, Slack subcommand map, and autocomplete data shipped to Ink.
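A simplified model of that registry pattern — hypothetical code, since the real `CommandDef` in `hermes_cli/commands.py` carries more fields, but the derive-everything-from-one-registry idea is the same:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandDef:
    name: str
    description: str
    category: str
    cli_only: bool = False
    gateway_only: bool = False
    aliases: tuple = ()


COMMAND_REGISTRY = (
    CommandDef("queue", "Show queued messages", "Session", aliases=("q",)),
    CommandDef("details", "Control detail sections", "Configuration", cli_only=True),
)

# Everything downstream derives from the registry; e.g. the set of commands
# the gateway will dispatch excludes cli_only entries:
GATEWAY_KNOWN_COMMANDS = {c.name for c in COMMAND_REGISTRY if not c.cli_only}

# Alias -> canonical resolution happens before dispatch:
ALIAS_MAP = {a: c.name for c in COMMAND_REGISTRY for a in c.aliases}


def canonicalize(cmd: str) -> str:
    """Resolve an alias like "q" to its canonical command name."""
    return ALIAS_MAP.get(cmd, cmd)
```

Telegram menus, Slack subcommand maps, and Ink autocomplete are all projections of this one structure, which is why a command defined only in TypeScript never shows up in autocomplete.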
## Investigation Steps
1. **Check if the command exists in the TUI frontend:**
```bash
search_files --pattern "/commandname" --file_glob "*.ts" --path ui-tui/
search_files --pattern "/commandname" --file_glob "*.tsx" --path ui-tui/
```
2. **Examine the TUI command definition:**
```bash
read_file ui-tui/src/app/slash/commands/core.ts
# If not there:
search_files --pattern "commandname" --path ui-tui/src/app/slash/commands --target files
```
3. **Check if the command exists in the Python backend:**
```bash
search_files --pattern "CommandDef" --file_glob "*.py" --path hermes_cli/
search_files --pattern "commandname" --path hermes_cli/commands.py --context 3
```
4. **Examine the gateway implementation:**
```bash
search_files --pattern "complete.slash|slash.exec" --path tui_gateway/
```
## Fix: Missing Command Autocomplete
If a command exists in the TUI but doesn't show in autocomplete:
1. Add a `CommandDef` entry to `COMMAND_REGISTRY` in `hermes_cli/commands.py`:
```python
CommandDef("commandname", "Description of the command", "Session",
cli_only=True, aliases=("alias",),
args_hint="[arg1|arg2|arg3]",
subcommands=("arg1", "arg2", "arg3")),
```
2. Pick `cli_only` vs gateway availability carefully:
- `cli_only=True` — only in the interactive CLI/TUI
- `gateway_only=True` — only in messaging platforms
- neither — available everywhere
- `gateway_config_gate="display.foo"` — config-gated availability in the gateway
3. Ensure `subcommands` matches the expected tab-completion options shown by the TUI.
4. If the command runs server-side, add a handler in `HermesCLI.process_command()` in `cli.py`:
```python
elif canonical == "commandname":
self._handle_commandname(cmd_original)
```
5. For gateway-available commands, add a handler in `gateway/run.py`:
```python
if canonical == "commandname":
return await self._handle_commandname(event)
```
## Common Issues
1. **Command shows in TUI but not in autocomplete.** The command is defined in the TUI codebase but missing from `COMMAND_REGISTRY` in `hermes_cli/commands.py`. Autocomplete data ships from Python.
2. **Command shows in autocomplete but doesn't work.** Check the command handler in `tui_gateway/server.py` and the frontend handler in `ui-tui/src/app/createSlashHandler.ts`. If the command is local-only in Ink, it must be handled in `app.tsx` built-in branch; otherwise it falls through to `slash.exec` and must have a Python handler.
3. **Command behavior differs between CLI and TUI.** The command might have different implementations. Check both `cli.py::process_command` and the TUI's local handler. Local TUI handlers take precedence over gateway dispatch.
4. **Command persists config but doesn't apply live.** For TUI-local commands, writing through `config.set` is not enough: also patch the relevant nanostore state immediately (usually `patchUiState(...)`) and thread the new state through the rendering components. Example: `/details collapsed` must update live detail visibility, not just save `details_mode`. An in-session global `/details <mode>` may also need a separate command-override flag, so the live command can override built-in section defaults while startup/config sync preserves the default-expanded thinking/tools behavior.
5. **Gateway dispatch silently ignores the command.** The gateway only dispatches commands it knows about. Check `GATEWAY_KNOWN_COMMANDS` (derived from `COMMAND_REGISTRY` automatically) includes the canonical name. If the command is `cli_only` with a `gateway_config_gate`, verify the gated config value is truthy.
## Debugging Tactics
When surface-level inspection doesn't reveal the bug:
- **Python side hangs or misbehaves:** use the `python-debugpy` skill to break inside `_SlashWorker.exec` or the command handler. `remote-pdb` set at the handler entry is the fastest path.
- **Ink side not reacting:** use the `node-inspect-debugger` skill to break in `app.tsx`'s slash dispatch or the local command branch. `sb('dist/app.js', <line>)` after `npm run build`.
- **Registry mismatch / unclear which side is wrong:** compare the canonical `COMMAND_REGISTRY` entry against the TUI's local command list side-by-side.
## Pitfalls
- Don't forget to set the appropriate category for the command in `CommandDef` (e.g., "Session", "Configuration", "Tools & Skills", "Info", "Exit")
- Make sure any aliases are properly registered in the `aliases` tuple — no other file changes are needed, everything downstream (Telegram menu, Slack mapping, autocomplete, help) derives from it
- For commands with subcommands, ensure the `subcommands` tuple in `CommandDef` matches what's in the TUI code
- `cli_only=True` commands won't work in gateway/messaging platforms — unless you add a `gateway_config_gate` and the gate is truthy
- After adding live UI state, search every consumer of the old prop/helper and thread the new state through all render paths, not just the active streaming path. TUI detail rendering has at least two important paths: live `StreamingAssistant`/`ToolTrail` and transcript/pending `MessageLine` rows. A `/clean` pass should explicitly check both.
- Rebuild the TUI (`npm --prefix ui-tui run build`) before testing — tsx watch mode may lag on first launch
## Verification
After fixing:
1. Rebuild the TUI:
```bash
cd /home/bb/hermes-agent && npm --prefix ui-tui run build
```
2. Run the TUI and test the command:
```bash
hermes --tui
```
3. Type `/` and verify the command appears in autocomplete suggestions with the expected description and args hint.
4. Execute the command and confirm:
- Expected behavior fires
- Any persisted config updates correctly (`read_file ~/.hermes/config.yaml`)
- Live UI state reflects the change immediately (not just after restart)
5. If the command is also gateway-available, test it from at least one messaging platform (or run the gateway tests: `scripts/run_tests.sh tests/gateway/`).

---
title: "Hermes Agent Skill Authoring — Author in-repo SKILL"
sidebar_label: "Hermes Agent Skill Authoring"
description: "Author in-repo SKILL"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Hermes Agent Skill Authoring
Author in-repo SKILL.md: frontmatter, validator, structure.
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/software-development/hermes-agent-skill-authoring` |
| Version | `1.0.0` |
| Author | Hermes Agent |
| License | MIT |
| Tags | `skills`, `authoring`, `hermes-agent`, `conventions`, `skill-md` |
| Related skills | [`writing-plans`](/docs/user-guide/skills/bundled/software-development/software-development-writing-plans), [`requesting-code-review`](/docs/user-guide/skills/bundled/software-development/software-development-requesting-code-review) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Authoring Hermes-Agent Skills (in-repo)
## Overview
There are two places a SKILL.md can live:
1. **User-local:** `~/.hermes/skills/<maybe-category>/<name>/SKILL.md` — personal, not shared. Created via `skill_manage(action='create')`.
2. **In-repo (this skill is about this case):** `/home/bb/hermes-agent/skills/<category>/<name>/SKILL.md` — committed, shipped with the package. Use `write_file` + `git add`. `skill_manage(action='create')` does NOT target this tree.
## When to Use
- User asks you to add a skill "in this branch / repo / commit"
- You're committing a reusable workflow that should ship with hermes-agent
- You're editing an existing skill under `/home/bb/hermes-agent/skills/` (use `patch` for small edits, `write_file` for rewrites; `skill_manage` still works for patch on in-repo skills, but not for `create`)
## Required Frontmatter
Source of truth: `tools/skill_manager_tool.py::_validate_frontmatter`. Hard requirements:
- Starts with `---` as the first bytes (no leading blank line).
- Closes with `\n---\n` before the body.
- Parses as a YAML mapping.
- `name` field present.
- `description` field present, ≤ **1024 chars** (`MAX_DESCRIPTION_LENGTH`).
- Non-empty body after the closing `---`.
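As a rough illustration, the hard requirements above can be approximated with a stdlib-only checker. This is a sketch, not the real `_validate_frontmatter` — it skips full YAML parsing, and the limit constants and error messages are assumptions mirroring the values listed here:

```python
import re

MAX_NAME_LENGTH = 64            # assumption: mirrors tools/skill_manager_tool.py
MAX_DESCRIPTION_LENGTH = 1024

def check_frontmatter(content: str) -> list[str]:
    """Return a list of hard-requirement violations (empty list = passes)."""
    errors = []
    if not content.startswith("---"):
        errors.append("must start with --- as the first bytes")
    m = re.search(r"\n---\s*\n", content[3:])
    if not m:
        errors.append("missing closing --- delimiter")
        return errors
    fm_text = content[3:m.start() + 3]
    body = content[m.end() + 3:]
    # crude top-level key extraction instead of yaml.safe_load, to stay stdlib-only
    keys = dict(re.findall(r"^(\w+):\s*(.*)$", fm_text, re.MULTILINE))
    if "name" not in keys:
        errors.append("name field missing")
    if "description" not in keys:
        errors.append("description field missing")
    elif len(keys["description"]) > MAX_DESCRIPTION_LENGTH:
        errors.append("description too long")
    if not body.strip():
        errors.append("empty body after closing ---")
    return errors
```

A passing file returns `[]`; each violated rule adds one entry, so you can fix everything in one pass.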
Peer-matched shape used by every skill under `skills/software-development/`:
```yaml
---
name: my-skill-name # lowercase, hyphens, ≤64 chars (MAX_NAME_LENGTH)
description: Use when <trigger>. <one-line behavior>.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [short, descriptive, tags]
    related_skills: [other-skill, another-skill]
---
```
`version` / `author` / `license` / `metadata` are NOT enforced by the validator, but every peer has them — omit them and your skill sticks out.
## Size Limits
- Description: ≤ 1024 chars (enforced).
- Full SKILL.md: ≤ 100,000 chars (enforced as `MAX_SKILL_CONTENT_CHARS`, ~36k tokens).
- Peer skills in `software-development/` sit at **8-14k chars**. Aim for that range. If you're pushing past 20k, split into `references/*.md` and reference them from SKILL.md.
## Peer-Matched Structure
Every in-repo skill follows roughly:
```
# <Title>
## Overview
One or two paragraphs: what and why.
## When to Use
- Bulleted triggers
- "Don't use for:" counter-triggers
## <Topic sections specific to the skill>
- Quick-reference tables are common
- Code blocks with exact commands
- Hermes-specific recipes (tests via scripts/run_tests.sh, ui-tui paths, etc.)
## Common Pitfalls
Numbered list of mistakes and their fixes.
## Verification Checklist
- [ ] Checkbox list of post-action verifications
## One-Shot Recipes (optional)
Named scenarios → concrete command sequences.
```
Not every section is mandatory, but `Overview` + `When to Use` + actionable body + pitfalls are the minimum for the skill to feel like a peer.
## Directory Placement
```
skills/<category>/<skill-name>/SKILL.md
```
Categories currently in repo (confirm with `ls skills/`): `autonomous-ai-agents`, `creative`, `data-science`, `devops`, `dogfood`, `email`, `gaming`, `github`, `leisure`, `mcp`, `media`, `mlops/*`, `note-taking`, `productivity`, `red-teaming`, `research`, `smart-home`, `social-media`, `software-development`.
Pick the closest existing category. Don't invent new top-level categories casually.
## Workflow
1. **Survey peers** in the target category:
```
ls skills/<category>/
```
Read 2-3 peer SKILL.md files to match tone and structure.
2. **Check validator constraints** in `tools/skill_manager_tool.py` if unsure.
3. **Draft** with `write_file` to `skills/<category>/<name>/SKILL.md`.
4. **Validate locally**:
```python
import yaml, re, pathlib
content = pathlib.Path("skills/<category>/<name>/SKILL.md").read_text()
assert content.startswith("---")
m = re.search(r'\n---\s*\n', content[3:])
assert m, "missing closing --- delimiter"
fm = yaml.safe_load(content[3:m.start() + 3])
assert "name" in fm and "description" in fm
assert len(fm["description"]) <= 1024
assert len(content) <= 100_000
```
5. **Git add + commit** on the active branch.
6. **Note:** the CURRENT session's skill loader is cached — `skill_view` / `skills_list` will not see the new skill until a new session. This is expected, not a bug.
## Cross-Referencing Other Skills
`metadata.hermes.related_skills` unions both trees (`skills/` in-repo and `~/.hermes/skills/`) at load time. You CAN reference a user-local skill from an in-repo skill, but it won't resolve for other users who clone the repo fresh. Prefer referencing only in-repo skills from in-repo skills. If a frequently-referenced skill lives only in `~/.hermes/skills/`, consider promoting it to the repo.
## Editing Existing In-Repo Skills
- **Small fix (typo, added pitfall, tightened trigger):** `skill_manage(action='patch', name=..., old_string=..., new_string=...)` works fine on in-repo skills.
- **Major rewrite:** `write_file` the whole SKILL.md. `skill_manage(action='edit')` also works but requires supplying the full new content.
- **Adding supporting files:** `write_file` to `skills/<category>/<name>/references/<file>.md`, `templates/<file>`, or `scripts/<file>`. `skill_manage(action='write_file')` also works and enforces the references/templates/scripts/assets subdir allowlist.
- **Always commit** the edit — in-repo skills are source, not runtime state.
## Common Pitfalls
1. **Using `skill_manage(action='create')` for an in-repo skill.** It writes to `~/.hermes/skills/`, not the repo tree. Use `write_file` for in-repo creation.
2. **Leading whitespace before `---`.** The validator checks `content.startswith("---")`; any leading blank line or BOM fails validation.
3. **Description too generic.** Peer descriptions start with "Use when ..." and describe the *trigger class*, not the one task. "Use when debugging X" > "Debug X".
4. **Forgetting the author/license/metadata block.** Not validator-enforced, but every peer has it; omitting makes the skill look half-finished.
5. **Writing a skill that duplicates a peer.** Before creating, `ls skills/<category>/` and open 2-3 peers. Prefer extending an existing skill to creating a narrow sibling.
6. **Expecting the current session to see the new skill.** It won't. The skill loader is initialized at session start. Verify in a fresh session or via `skill_view` using the exact path.
7. **Linking to skills that don't exist in-repo.** `related_skills: [some-user-local-skill]` works for you but breaks for other clones. Prefer only in-repo links.
## Verification Checklist
- [ ] File is at `skills/<category>/<name>/SKILL.md` (not in `~/.hermes/skills/`)
- [ ] Frontmatter starts at byte 0 with `---`, closes with `\n---\n`
- [ ] `name`, `description`, `version`, `author`, `license`, `metadata.hermes.{tags, related_skills}` all present
- [ ] Name ≤ 64 chars, lowercase + hyphens
- [ ] Description ≤ 1024 chars and starts with "Use when ..."
- [ ] Total file ≤ 100,000 chars (aim for 8-15k)
- [ ] Structure: `# Title` → `## Overview` → `## When to Use` → body → `## Common Pitfalls` → `## Verification Checklist`
- [ ] `related_skills` references resolve in-repo (or are explicitly OK to be user-local)
- [ ] `git add skills/<category>/<name>/ && git commit` completed on the intended branch

---
title: "Node Inspect Debugger — Debug Node"
sidebar_label: "Node Inspect Debugger"
description: "Debug Node"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Node Inspect Debugger
Debug Node.js via --inspect + Chrome DevTools Protocol CLI.
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/software-development/node-inspect-debugger` |
| Version | `1.0.0` |
| Author | Hermes Agent |
| License | MIT |
| Tags | `debugging`, `nodejs`, `node-inspect`, `cdp`, `breakpoints`, `ui-tui` |
| Related skills | [`systematic-debugging`](/docs/user-guide/skills/bundled/software-development/software-development-systematic-debugging), [`python-debugpy`](/docs/user-guide/skills/bundled/software-development/software-development-python-debugpy), [`debugging-hermes-tui-commands`](/docs/user-guide/skills/bundled/software-development/software-development-debugging-hermes-tui-commands) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Node.js Inspect Debugger
## Overview
When `console.log` isn't enough, drive Node's built-in V8 inspector programmatically from the terminal. You get real breakpoints, step in/over/out, call-stack walking, local/closure scope dumps, and arbitrary expression evaluation in the paused frame.
Two tools, pick one:
- **`node inspect`** — built-in, zero install, CLI REPL. Best for quick poking.
- **`ndb` / CDP via `chrome-remote-interface`** — scriptable from Node/Python; best when you want to automate many breakpoints, collect state across runs, or debug non-interactively from an agent loop.
**Prefer `node inspect` first.** It's always available and the REPL is fast.
## When to Use
- A Node test fails and you need to see intermediate state
- ui-tui crashes or behaves wrong and you want to inspect React/Ink state pre-render
- tui_gateway child processes (`_SlashWorker`, PTY bridge workers) misbehave
- You need to inspect a value in a closure that `console.log` can't reach without patching
- Perf: attach to a running process to capture a CPU profile or heap snapshot
**Don't use for:** things `console.log` solves in under a minute. Breakpoint-driven debugging is heavier; use it when the payoff is real.
## Quick Reference: `node inspect` REPL
Launch paused on first line:
```bash
node inspect path/to/script.js
# or with tsx
node --inspect-brk $(which tsx) path/to/script.ts
```
The `debug>` prompt accepts:
| Command | Action |
|---|---|
| `c` or `cont` | continue |
| `n` or `next` | step over |
| `s` or `step` | step into |
| `o` or `out` | step out |
| `pause` | pause running code |
| `sb('file.js', 42)` | set breakpoint at file.js line 42 |
| `sb(42)` | set breakpoint at line 42 of current file |
| `sb('functionName')` | break when function is called |
| `cb('file.js', 42)` | clear breakpoint |
| `breakpoints` | list all breakpoints |
| `bt` | backtrace (call stack) |
| `list(5)` | show 5 lines of source around current position |
| `watch('expr')` | evaluate expr on every pause |
| `watchers` | show watched expressions |
| `repl` | drop into REPL in current scope (Ctrl+C to exit REPL) |
| `exec expr` | evaluate expression once |
| `restart` | restart script |
| `kill` | kill the script |
| `.exit` | quit debugger |
**In the `repl` sub-mode:** type any JS expression, including access to locals/closure variables. `Ctrl+C` exits back to `debug>`.
## Attaching to a Running Process
When the process is already running (e.g. a long-lived dev server or the TUI gateway):
```bash
# 1. Send SIGUSR1 to enable the inspector on an existing process
kill -SIGUSR1 <pid>
# Node prints: Debugger listening on ws://127.0.0.1:9229/<uuid>
# 2. Attach the debugger CLI
node inspect -p <pid>
# or by URL
node inspect ws://127.0.0.1:9229/<uuid>
```
To start a process with the inspector from the beginning:
```bash
node --inspect script.js # listen on 127.0.0.1:9229, keep running
node --inspect-brk script.js # listen AND pause on first line
node --inspect=0.0.0.0:9230 script.js # custom host:port
```
For TypeScript via tsx:
```bash
node --inspect-brk --import tsx script.ts
# or older tsx
node --inspect-brk -r tsx/cjs script.ts
```
## Programmatic CDP (scripting from terminal)
When you want to automate — set many breakpoints, capture scope state, script a repro — use `chrome-remote-interface`:
```bash
npm i -g chrome-remote-interface # or project-local
# Start your target:
node --inspect-brk=9229 target.js &
```
Driver script (save as `/tmp/cdp-debug.js`):
```javascript
const CDP = require('chrome-remote-interface');

(async () => {
  const client = await CDP({ port: 9229 });
  const { Debugger, Runtime } = client;

  Debugger.paused(async ({ callFrames, reason }) => {
    const top = callFrames[0];
    console.log(`PAUSED: ${reason} @ ${top.url}:${top.location.lineNumber + 1}`);
    // Walk scopes for locals
    for (const scope of top.scopeChain) {
      if (scope.type === 'local' || scope.type === 'closure') {
        const { result } = await Runtime.getProperties({
          objectId: scope.object.objectId,
          ownProperties: true,
        });
        for (const p of result) {
          console.log(`  ${scope.type}.${p.name} =`, p.value?.value ?? p.value?.description);
        }
      }
    }
    // Evaluate an expression in the paused frame
    const { result } = await Debugger.evaluateOnCallFrame({
      callFrameId: top.callFrameId,
      expression: 'typeof state !== "undefined" ? JSON.stringify(state) : "n/a"',
    });
    console.log('state =', result.value ?? result.description);
    await Debugger.resume();
  });

  await Runtime.enable();
  await Debugger.enable();
  // Set a breakpoint by URL regex + line
  await Debugger.setBreakpointByUrl({
    urlRegex: '.*app\\.tsx$',
    lineNumber: 119, // 0-indexed
    columnNumber: 0,
  });
  await Runtime.runIfWaitingForDebugger();
})();
```
Run it:
```bash
node /tmp/cdp-debug.js
```
Hermes-specific note: `chrome-remote-interface` is NOT in `ui-tui/package.json`. Install it to a throwaway location if you don't want to dirty the project:
```bash
mkdir -p /tmp/cdp-tools && cd /tmp/cdp-tools && npm i chrome-remote-interface
NODE_PATH=/tmp/cdp-tools/node_modules node /tmp/cdp-debug.js
```
## Debugging Hermes ui-tui
The TUI is built with Ink and tsx. Two common scenarios:
### Debugging a single Ink component under dev
`ui-tui/package.json` has `npm run dev` (tsx --watch). Add `--inspect-brk` by running tsx directly:
```bash
cd /home/bb/hermes-agent/ui-tui
npm run build # produce dist/ once so transpile isn't needed on first load
node --inspect-brk dist/entry.js
# In another terminal:
node inspect -p <node pid>
```
Then inside `debug>`:
```
sb('dist/app.js', 220) # or wherever the suspect render is
cont
```
When it pauses, `repl` → inspect `props`, state refs, `useInput` handler values, etc.
### Debugging a running `hermes --tui`
The TUI spawns Node from the Python CLI. Easiest path:
```bash
# 1. Launch TUI
hermes --tui &
TUI_PID=$(pgrep -f 'ui-tui/dist/entry' | head -1)
# 2. Enable inspector on that Node PID
kill -SIGUSR1 "$TUI_PID"
# 3. Find the WS URL
curl -s http://127.0.0.1:9229/json/list | jq -r '.[0].webSocketDebuggerUrl'
# 4. Attach
node inspect ws://127.0.0.1:9229/<uuid>
```
Interacting with the TUI (typing in its window) continues to advance execution; the debugger can pause it at any breakpoint you've set with `sb(...)`.
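If you'd rather not depend on `jq`, the discovery step can be scripted against the inspector's standard `/json/list` HTTP endpoint. A minimal sketch (the helper name is ours; host and port are assumptions matching the defaults above):

```python
import json
import urllib.request

def inspector_ws_url(host="127.0.0.1", port=9229):
    """Return the webSocketDebuggerUrl of the first inspectable target, or None."""
    with urllib.request.urlopen(f"http://{host}:{port}/json/list") as resp:
        targets = json.load(resp)
    return targets[0]["webSocketDebuggerUrl"] if targets else None
```

Print the URL and hand it to `node inspect ws://...` as in step 4.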
### Debugging `_SlashWorker` / PTY child processes
Those are Python, not Node — use the `python-debugpy` skill for them. Only Node portions (Ink UI, tui_gateway client, tsx-run tests under `ui-tui/`) use this skill.
## Running Vitest Tests Under the Debugger
```bash
cd /home/bb/hermes-agent/ui-tui
# Run a single test file paused on entry
node --inspect-brk ./node_modules/vitest/vitest.mjs run --no-file-parallelism src/app/foo.test.tsx
```
In another terminal: `node inspect -p <pid>`, then `sb('src/app/foo.tsx', 42)`, `cont`.
Use `--no-file-parallelism` (vitest) or `--runInBand` (jest) so only one worker exists — debugging a pool is painful.
## Heap Snapshots & CPU Profiles (Non-interactive)
From the CDP driver above, swap Debugger for `HeapProfiler` / `Profiler`:
```javascript
// CPU profile for 5 seconds
await client.Profiler.enable();
await client.Profiler.start();
await new Promise(r => setTimeout(r, 5000));
const { profile } = await client.Profiler.stop();
require('fs').writeFileSync('/tmp/cpu.cpuprofile', JSON.stringify(profile));
// Open /tmp/cpu.cpuprofile in Chrome DevTools → Performance tab
```
```javascript
// Heap snapshot
await client.HeapProfiler.enable();
const chunks = [];
client.HeapProfiler.addHeapSnapshotChunk(({ chunk }) => chunks.push(chunk));
await client.HeapProfiler.takeHeapSnapshot({ reportProgress: false });
require('fs').writeFileSync('/tmp/heap.heapsnapshot', chunks.join(''));
```
## Common Pitfalls
1. **Wrong line numbers in TS source.** Breakpoints hit the emitted JS, not the `.ts`. Either (a) break in the built `dist/*.js`, or (b) enable sourcemaps (`node --enable-source-maps`) and use `sb('src/app.tsx', N)` — but only with CDP clients that follow sourcemaps. `node inspect` CLI does not.
2. **`--inspect` vs `--inspect-brk`.** `--inspect` starts the inspector but doesn't pause; your script races past your first breakpoint if you attach too late. Use `--inspect-brk` when you need to set breakpoints before any code runs.
3. **Port collisions.** Default is `9229`. If multiple Node processes are inspecting, pass `--inspect=0` (random port) and read the actual URL from `/json/list`:
```bash
curl -s http://127.0.0.1:9229/json/list # lists all inspectable targets on the host
```
4. **Child processes.** `--inspect` on a parent does NOT inspect its children. Use `NODE_OPTIONS='--inspect-brk' node parent.js` to propagate to every child; be aware they all need unique ports (Node auto-increments when `NODE_OPTIONS='--inspect'` is inherited).
5. **Background kills.** If you `Ctrl+C` out of `node inspect` while the target is paused, the target stays paused. Either `cont` first, or `kill` the target explicitly.
6. **Running `node inspect` through an agent terminal.** It's a PTY-friendly REPL. In Hermes, launch it with `terminal(pty=true)` or `background=true` + `process(action='submit', data='...')`. Non-PTY foreground mode will work for one-shot commands but not for interactive stepping.
7. **Security.** `--inspect=0.0.0.0:9229` exposes arbitrary code execution. Always bind to `127.0.0.1` (the default) unless you have an isolated network.
## Verification Checklist
After setting up a debug session, verify:
- [ ] `curl -s http://127.0.0.1:9229/json/list` returns exactly the target you expect
- [ ] First breakpoint actually hits (if it doesn't, you likely missed `--inspect-brk` or attached after execution completed)
- [ ] Source listing at pause shows the right file (mismatch = sourcemap issue, see pitfall 1)
- [ ] `exec process.pid` in `repl` returns the PID you meant to attach to
## One-Shot Recipes
**"Why is this variable undefined at line X?"**
```bash
node --inspect-brk script.js &
node inspect -p $!
# debug>
sb('script.js', X)
cont
# paused. Now:
repl
> myVariable
> Object.keys(this)
```
**"What's the call path into this function?"**
```
debug> sb('suspectFn')
debug> cont
# paused on entry
debug> bt
```
**"This async chain hangs — where?"**
```
# Start with --inspect (no -brk), let it run to the hang, then:
debug> pause
debug> bt
# Now you see the stuck frame
```

---
title: "Plan — Plan mode: write markdown plan to"
sidebar_label: "Plan"
description: "Plan mode: write markdown plan to"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Plan
Plan mode: write markdown plan to .hermes/plans/, no exec.
## Skill metadata

---
title: "Python Debugpy — Debug Python: pdb REPL + debugpy remote (DAP)"
sidebar_label: "Python Debugpy"
description: "Debug Python: pdb REPL + debugpy remote (DAP)"
---
{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
# Python Debugpy
Debug Python: pdb REPL + debugpy remote (DAP).
## Skill metadata
| | |
|---|---|
| Source | Bundled (installed by default) |
| Path | `skills/software-development/python-debugpy` |
| Version | `1.0.0` |
| Author | Hermes Agent |
| License | MIT |
| Tags | `debugging`, `python`, `pdb`, `debugpy`, `breakpoints`, `dap`, `post-mortem` |
| Related skills | [`systematic-debugging`](/docs/user-guide/skills/bundled/software-development/software-development-systematic-debugging), [`node-inspect-debugger`](/docs/user-guide/skills/bundled/software-development/software-development-node-inspect-debugger), [`debugging-hermes-tui-commands`](/docs/user-guide/skills/bundled/software-development/software-development-debugging-hermes-tui-commands) |
## Reference: full SKILL.md
:::info
The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
:::
# Python Debugger (pdb + debugpy)
## Overview
Three tools, picked by situation:
| Tool | When |
|---|---|
| **`breakpoint()` + pdb** | Local, interactive, simplest. Add `breakpoint()` in the source, run normally, get a REPL at that line. |
| **`python -m pdb`** | Launch an existing script under pdb with no source edits. Useful for quick poking. |
| **`debugpy`** | Remote / headless / "attach to already-running process." Talks DAP, scriptable from terminal, works for long-lived processes (gateway, daemon, PTY children). |
**Start with `breakpoint()`.** It's the cheapest thing that works.
## When to Use
- A test fails and the traceback doesn't reveal why a value is wrong
- You need to step through a function and watch a collection mutate
- A long-running process (hermes gateway, tui_gateway) misbehaves and you can't restart it
- Post-mortem: an exception fired in prod-ish code and you want to inspect locals at the crash site
- A subprocess / child (Python `_SlashWorker`, PTY bridge worker) is the actual bug site
**Don't use for:** things `print()` / `logging.debug` solve in under a minute, or things `pytest -vv --tb=long --showlocals` already reveals.
## pdb Quick Reference
Inside any pdb prompt (`(Pdb)`):
| Command | Action |
|---|---|
| `h` / `h cmd` | help |
| `n` | next line (step over) |
| `s` | step into |
| `r` | return from current function |
| `c` | continue |
| `unt N` | continue until line N |
| `j N` | jump to line N (same function only) |
| `l` / `ll` | list source around current line / full function |
| `w` | where (stack trace) |
| `u` / `d` | move up / down in the stack |
| `a` | print args of the current function |
| `p expr` / `pp expr` | print / pretty-print expression |
| `display expr` | auto-print expr on every stop |
| `b file:line` | set breakpoint |
| `b func` | break on function entry |
| `b file:line, cond` | conditional breakpoint |
| `cl N` | clear breakpoint N |
| `tbreak file:line` | one-shot breakpoint |
| `!stmt` | execute arbitrary Python (assignments included) |
| `interact` | drop into full Python REPL in current scope (Ctrl+D to exit) |
| `q` | quit |
The `interact` command is the most powerful — you can import anything, inspect complex objects, even call methods that mutate state. Locals are read-only by default; use `!x = 42` from the `(Pdb)` prompt to mutate.
## Recipe 1: Local breakpoint
Easiest. Edit the file:
```python
def compute(x, y):
    result = some_helper(x)
    breakpoint()  # <-- drops into pdb here
    return result + y
```
Run the code normally. You land at the `breakpoint()` line with full access to locals.
**Don't forget to remove `breakpoint()` before committing.** Use `git diff` or a pre-commit grep:
```bash
rg -n 'breakpoint\(\)' --type py
```
## Recipe 2: Launch a script under pdb (no source edits)
```bash
python -m pdb path/to/script.py arg1 arg2
# Lands at first line of script
(Pdb) b path/to/script.py:42
(Pdb) c
```
## Recipe 3: Debug a pytest test
The hermes test runner and pytest both support this:
```bash
# Drop to pdb on failure (or on any raised exception):
scripts/run_tests.sh tests/path/to/test_file.py::test_name --pdb
# Drop to pdb at the START of the test:
scripts/run_tests.sh tests/path/to/test_file.py::test_name --trace
# Show locals in tracebacks without pdb:
scripts/run_tests.sh tests/path/to/test_file.py --showlocals --tb=long
```
Note: `scripts/run_tests.sh` uses xdist (`-n 4`) by default, and pdb does NOT work under xdist. Add `-p no:xdist` or run a single test with `-n 0`:
```bash
scripts/run_tests.sh tests/foo_test.py::test_bar --pdb -p no:xdist
# or
source .venv/bin/activate
python -m pytest tests/foo_test.py::test_bar --pdb
```
This bypasses the hermetic-env guarantees — fine for debugging, but re-run under the wrapper to confirm before pushing.
## Recipe 4: Post-mortem on any exception
```python
import pdb, sys

try:
    run_the_thing()
except Exception:
    pdb.post_mortem(sys.exc_info()[2])
```
Or wrap a whole script:
```bash
python -m pdb -c continue script.py
# When it crashes, pdb catches it and you're in the frame of the exception
```
Or set a global hook in a repl/jupyter:
```python
import sys

def excepthook(etype, value, tb):
    import pdb; pdb.post_mortem(tb)

sys.excepthook = excepthook
```
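Because pdb reads commands from stdin, post-mortem sessions can even be driven non-interactively — useful from an agent loop. A self-contained demo (the child script and the pdb command sequence are illustrative):

```python
import subprocess
import sys

# A child script that installs a post-mortem excepthook and then crashes.
child = """
import sys, pdb

def hook(etype, value, tb):
    pdb.post_mortem(tb)

sys.excepthook = hook
x = 41
raise ValueError("boom")
"""

# Feed pdb commands through stdin: inspect a local at the crash site, then quit.
proc = subprocess.run(
    [sys.executable, "-c", child],
    input="p x + 1\nq\n",
    capture_output=True,
    text=True,
)
print(proc.stdout)  # the (Pdb) transcript, including the evaluated expression
```

The transcript shows the paused frame and the result of `p x + 1`, proving you can inspect crash-site locals without a TTY.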
## Recipe 5: Remote debug with debugpy (attach to running process)
For long-lived processes: Hermes gateway, tui_gateway, a daemon, a process that's already misbehaving and can't be restarted clean.
### Setup
```bash
source /home/bb/hermes-agent/.venv/bin/activate
pip install debugpy
```
### Pattern A: Source-edit — process waits for debugger at launch
Add near the top of the entry point (or inside the function you want to debug):
```python
import debugpy
debugpy.listen(("127.0.0.1", 5678))
print("debugpy listening on 5678, waiting for client...", flush=True)
debugpy.wait_for_client()
debugpy.breakpoint() # optional: pause immediately once attached
```
Start the process; it blocks on `wait_for_client()`.
### Pattern B: No source edit — launch with `-m debugpy`
```bash
python -m debugpy --listen 127.0.0.1:5678 --wait-for-client your_script.py arg1
```
Equivalent for module entry:
```bash
python -m debugpy --listen 127.0.0.1:5678 --wait-for-client -m your.module
```
### Pattern C: Attach to an already-running process
Needs the PID and debugpy preinstalled in the target's environment:
```bash
python -m debugpy --listen 127.0.0.1:5678 --pid <pid>
# debugpy injects itself into the process. Then attach a client as below.
```
Some kernels/security configs block the ptrace-based injection (`/proc/sys/kernel/yama/ptrace_scope`). Fix with:
```bash
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
```
### Connecting a client from the terminal
The easiest terminal-side DAP client is VS Code CLI or a small script. From inside Hermes you have two practical options:
**Option 1: `debugpy`'s own CLI REPL** — not an official feature, but a tiny DAP client script:
```python
# /tmp/dap_client.py
import socket, json, itertools, sys

HOST, PORT = "127.0.0.1", 5678
s = socket.create_connection((HOST, PORT))
seq = itertools.count(1)

def send(msg):
    msg["seq"] = next(seq)
    body = json.dumps(msg).encode()
    s.sendall(f"Content-Length: {len(body)}\r\n\r\n".encode() + body)

def recv():
    header = b""
    while b"\r\n\r\n" not in header:
        header += s.recv(1)
    length = int(header.decode().split("Content-Length:")[1].split("\r\n")[0].strip())
    body = b""
    while len(body) < length:
        body += s.recv(length - len(body))
    return json.loads(body)

send({"type": "request", "command": "initialize", "arguments": {"adapterID": "python"}})
print(recv())
send({"type": "request", "command": "attach", "arguments": {}})
print(recv())
send({"type": "request", "command": "setBreakpoints",
      "arguments": {"source": {"path": sys.argv[1]},
                    "breakpoints": [{"line": int(sys.argv[2])}]}})
print(recv())
send({"type": "request", "command": "configurationDone"})
# ... loop reading events and sending continue/stepIn/etc.
```
This is fine for one-off automation but painful as an interactive UX.
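The Content-Length framing in that script is easy to get subtly wrong (partial reads, header parsing). Factoring it into pure helpers makes it testable in isolation — a sketch of the DAP base-protocol framing, independent of any live socket:

```python
import io
import json

def encode_dap(msg: dict) -> bytes:
    """Frame one DAP message: Content-Length header, CRLFCRLF, then the JSON body."""
    body = json.dumps(msg).encode()
    return f"Content-Length: {len(body)}\r\n\r\n".encode() + body

def decode_dap(stream) -> dict:
    """Read exactly one framed DAP message from a binary stream."""
    header = b""
    while b"\r\n\r\n" not in header:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream closed mid-header")
        header += chunk
    length = int(header.decode().split("Content-Length:")[1].split("\r\n")[0])
    return json.loads(stream.read(length))
```

Round-tripping a message through `encode_dap` and `decode_dap` over a `BytesIO` verifies the framing before you point it at a real debugpy socket.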
**Option 2: Attach from VS Code / Cursor / Zed** — if the user has one open, they can add a `launch.json`:
```json
{
  "name": "Attach to Hermes",
  "type": "debugpy",
  "request": "attach",
  "connect": { "host": "127.0.0.1", "port": 5678 },
  "justMyCode": false,
  "pathMappings": [
    { "localRoot": "${workspaceFolder}", "remoteRoot": "/home/bb/hermes-agent" }
  ]
}
```
**Option 3: Ditch DAP, use `remote-pdb`** — usually what you actually want from a terminal agent:
```bash
pip install remote-pdb
```
In your code:
```python
from remote_pdb import set_trace
set_trace(host="127.0.0.1", port=4444) # blocks until connection
```
Then from the terminal:
```bash
nc 127.0.0.1 4444
# You get a (Pdb) prompt exactly as if debugging locally.
```
`remote-pdb` is the cleanest agent-friendly choice when `debugpy`'s DAP protocol is overkill. Use `debugpy` only when you actually need IDE integration.
## Debugging Hermes-specific Processes
### Tests
See Recipe 3. Always add `-p no:xdist` or run single tests without xdist.
### `run_agent.py` / CLI — one-shot
Easiest: add `breakpoint()` near the suspect line, then run `hermes` normally. Control returns to your terminal at the pause point.
### `tui_gateway` subprocess (spawned by `hermes --tui`)
The gateway runs as a child of the Node TUI. Options:
**A. Source-edit the gateway:**
```python
# tui_gateway/server.py near the top of serve()
import debugpy
debugpy.listen(("127.0.0.1", 5678))
debugpy.wait_for_client()
```
Start `hermes --tui`. The TUI will appear frozen (its backend is waiting). Attach a client; execution resumes when you `continue`.
**B. Use `remote-pdb` at a specific handler:**
```python
from remote_pdb import set_trace
set_trace(host="127.0.0.1", port=4444) # in the RPC handler you want to trap
```
Trigger the matching slash command from the TUI, then `nc 127.0.0.1 4444` in another terminal.
### `_SlashWorker` subprocess
Same pattern — `remote-pdb` with `set_trace()` inside the worker's `exec` path. The worker is persistent across slash commands, so the first trigger blocks until you connect; subsequent slash commands pass through normally unless you re-arm.
### Gateway (`gateway/run.py`)
Long-lived. Use `remote-pdb` at a handler, or `debugpy` with `--wait-for-client` if you're restarting the gateway anyway.
## Common Pitfalls
1. **pdb under pytest-xdist silently does nothing.** You won't see the prompt, the test just hangs. Always use `-p no:xdist` or `-n 0`.
2. **`breakpoint()` in CI / non-TTY contexts hangs the process.** Safe locally; never commit it. Add a pre-commit grep as a safety net.
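The pre-commit grep safety net can be sketched as a hook script; this goes in `.git/hooks/pre-commit`, and the pattern is an assumption you should widen for your codebase:

```bash
#!/bin/sh
# .git/hooks/pre-commit sketch: reject commits that still contain debug traps
pattern='breakpoint\(\)|set_trace\(|debugpy\.listen'
staged=$(git diff --cached --name-only --diff-filter=ACM -- '*.py' 2>/dev/null || true)
if [ -n "$staged" ] && echo "$staged" | xargs grep -nE "$pattern"; then
  echo "Debug statements found in staged files; commit aborted." >&2
  exit 1
fi
```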
3. **`PYTHONBREAKPOINT=0`** disables all `breakpoint()` calls. Check the env if your breakpoint isn't hitting:
```bash
echo $PYTHONBREAKPOINT
```
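The default `sys.breakpointhook` re-reads `PYTHONBREAKPOINT` on every call, which you can confirm from Python itself:

```python
import os

# the default sys.breakpointhook consults PYTHONBREAKPOINT at each breakpoint() call
os.environ["PYTHONBREAKPOINT"] = "0"
breakpoint()  # silently skipped: no pdb prompt
print("still running")
os.environ.pop("PYTHONBREAKPOINT")
```

This also means you can re-enable breakpoints mid-process by clearing the variable; no restart needed.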
4. **`debugpy.listen` never blocks; only `wait_for_client()` does.** Without the latter, execution continues immediately and your first breakpoint may fire before any client is attached.
5. **Attach to PID fails on hardened kernels.** `ptrace_scope=1` (Ubuntu default) allows only same-user ptrace of child processes. Workaround: `echo 0 > /proc/sys/kernel/yama/ptrace_scope` (needs root) or launch under `debugpy` from the start.
6. **Threads.** `pdb` only debugs the current thread. For multithreaded code, use `debugpy` (its DAP protocol is thread-aware) or install a trace function for newly started threads with `threading.settrace()`.
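A lighter workaround is to gate the trap on the thread you care about, so only that thread drops into the debugger. A sketch; the thread name is an assumption you should match to your worker's real name:

```python
import threading

def break_in(thread_name: str) -> bool:
    """Trigger breakpoint() only inside the named thread; no-op everywhere else."""
    if threading.current_thread().name == thread_name:
        breakpoint()
        return True
    return False
```

Call `break_in("worker-1")` at the suspect site; be aware the pdb prompt will still contend with other threads' terminal output.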
7. **asyncio.** `pdb` works inside coroutines, but evaluating `await` at the prompt needs Python 3.13+ (or `interact` mode on older versions). On 3.11/3.12, schedule the coroutine instead, e.g. `!import asyncio; asyncio.ensure_future(coro())`, or bridge from another thread with `asyncio.run_coroutine_threadsafe`.
8. **`scripts/run_tests.sh` strips credentials and sets `HOME=<tmpdir>`.** If your bug depends on user config or real API keys, it won't reproduce under the wrapper. Debug with raw `pytest` first to repro, then re-confirm under the wrapper.
9. **Forking / multiprocessing.** pdb does not follow forks. Each child needs its own `breakpoint()` or `set_trace()`. For Hermes subagents, debug one process at a time.
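If you do need traps in several children at once, give each child its own remote-pdb port so connections don't collide. A sketch; the base port 4444 just mirrors the examples above:

```python
import os

BASE_PORT = 4444

def child_debug_port() -> int:
    """Derive a per-process remote-pdb port so forked children don't collide."""
    return BASE_PORT + (os.getpid() % 1000)
```

In each child, `set_trace(host="127.0.0.1", port=child_debug_port())`, log the port, then `nc` to it from another terminal.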
## Verification Checklist
- [ ] After `pip install debugpy`, confirm: `python -c "import debugpy; print(debugpy.__version__)"`
- [ ] For remote debug, confirm the port is actually listening: `ss -tlnp | grep 5678`
- [ ] First breakpoint actually hits (if it doesn't, you likely have `PYTHONBREAKPOINT=0`, you're under xdist, or execution finished before attach)
- [ ] `where` / `w` shows the expected call stack
- [ ] Post-debug cleanup: no stray `breakpoint()` / `set_trace()` in committed code
```bash
rg -n 'breakpoint\(\)|set_trace\(|debugpy\.listen' --type py
```
## One-Shot Recipes
**"Why is this dict missing a key?"**
```python
# add above the KeyError site
breakpoint()
# then in pdb:
(Pdb) pp d
(Pdb) pp list(d.keys())
(Pdb) w # how did we get here
```
**"This test passes in isolation but fails in the suite."**
```bash
scripts/run_tests.sh tests/the_test.py --pdb -p no:xdist
# But if it only fails WITH other tests:
source .venv/bin/activate
python -m pytest tests/ -x --pdb -p no:xdist
# Now it pdb-traps at the exact failing test after state accumulated.
```
**"My async handler deadlocks."**
```python
# Add at handler entry
import remote_pdb; remote_pdb.set_trace(host="127.0.0.1", port=4444)
```
Trigger the handler. `nc 127.0.0.1 4444`, then `w` to see the suspended frame, `!import asyncio; asyncio.all_tasks()` to see what else is pending.
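If typing that at the prompt gets unwieldy, a small helper you can invoke with pdb's `!` escape summarizes the pending tasks. A sketch; it must run on the loop's own thread (which is where `set_trace` has you paused):

```python
import asyncio

def dump_tasks() -> list[str]:
    """One line per pending task: task name plus the coroutine it is suspended in."""
    return sorted(f"{t.get_name()}: {t.get_coro().__qualname__}"
                  for t in asyncio.all_tasks())
```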
**"Post-mortem on a crash in an Ink child process / subprocess."**
```bash
PYTHONFAULTHANDLER=1 python -m pdb -c continue path/to/entrypoint.py
# On crash, pdb lands at the frame of the exception with full locals
```
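For programmatic use, e.g. wrapping an entrypoint you restart often, the same post-mortem behavior can be inlined; the TTY check is a sketch-level guard so non-interactive runs never hang on a pdb prompt:

```python
import pdb
import sys
import traceback

def run_with_postmortem(fn):
    """Run fn(); on an exception, print the traceback and open pdb at the crash frame."""
    try:
        return fn()
    except Exception:
        traceback.print_exc()
        if sys.stdin.isatty():  # never hang CI / non-TTY contexts on a pdb prompt
            pdb.post_mortem(sys.exc_info()[2])
        return None
```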

```diff
@@ -1,14 +1,14 @@
 ---
-title: "Requesting Code Review"
+title: "Requesting Code Review — Pre-commit review: security scan, quality gates, auto-fix"
 sidebar_label: "Requesting Code Review"
-description: "Pre-commit verification pipeline — static security scan, baseline-aware quality gates, independent reviewer subagent, and auto-fix loop"
+description: "Pre-commit review: security scan, quality gates, auto-fix"
 ---
 {/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
 # Requesting Code Review
-Pre-commit verification pipeline — static security scan, baseline-aware quality gates, independent reviewer subagent, and auto-fix loop. Use after code changes and before committing, pushing, or opening a PR.
+Pre-commit review: security scan, quality gates, auto-fix.
 ## Skill metadata
```
