Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor

2026-04-26 01:01:40 +00:00 · 2026-04-17 08:59:33 -05:00 · 2026-04-17 08:59:33 -05:00 · 1f37ef2fd1
commit 1f37ef2fd1
parent 5435287dec 6ea7386a6f
126 changed files with 12584 additions and 2666 deletions
--- a/website/docs/user-guide/features/image-generation.md
+++ b/website/docs/user-guide/features/image-generation.md
@@ -1,6 +1,6 @@
 ---
 title: Image Generation
-description: Generate images via FAL.ai — 8 models including FLUX 2, GPT-Image, Nano Banana, Ideogram, and more, selectable via `hermes tools`.
+description: Generate images via FAL.ai — 8 models including FLUX 2, GPT-Image, Nano Banana Pro, Ideogram, Recraft V4 Pro, and more, selectable via `hermes tools`.
 sidebar_label: Image Generation
 sidebar_position: 6
 ---
@@ -13,13 +13,13 @@ Hermes Agent generates images from text prompts via FAL.ai. Eight models are sup

 | Model | Speed | Strengths | Price |
 |---|---|---|---|
-| `fal-ai/flux-2/klein/9b` *(default)* | <1s | Fast, crisp text | $0.006/MP |
+| `fal-ai/flux-2/klein/9b` *(default)* | `<1s` | Fast, crisp text | $0.006/MP |
 | `fal-ai/flux-2-pro` | ~6s | Studio photorealism | $0.03/MP |
 | `fal-ai/z-image/turbo` | ~2s | Bilingual EN/CN, 6B params | $0.005/MP |
-| `fal-ai/nano-banana` | ~6s | Gemini 2.5, character consistency | $0.08/image |
+| `fal-ai/nano-banana-pro` | ~8s | Gemini 3 Pro, reasoning depth, text rendering | $0.15/image (1K) |
 | `fal-ai/gpt-image-1.5` | ~15s | Prompt adherence | $0.034/image |
 | `fal-ai/ideogram/v3` | ~5s | Best typography | $0.03–0.09/image |
-| `fal-ai/recraft-v3` | ~8s | Vector art, brand styles | $0.04/image |
+| `fal-ai/recraft/v4/pro/text-to-image` | ~8s | Design, brand systems, production-ready | $0.25/image |
 | `fal-ai/qwen-image` | ~12s | LLM-based, complex text | $0.02/MP |

 Prices are FAL's pricing at time of writing; check [fal.ai](https://fal.ai/) for current numbers.
@@ -87,7 +87,7 @@ Make me a futuristic cityscape, landscape orientation

 Every model accepts the same three aspect ratios from the agent's perspective. Internally, each model's native size spec is filled in automatically:

-| Agent input | image_size (flux/z-image/qwen/recraft/ideogram) | aspect_ratio (nano-banana) | image_size (gpt-image) |
+| Agent input | image_size (flux/z-image/qwen/recraft/ideogram) | aspect_ratio (nano-banana-pro) | image_size (gpt-image) |
 |---|---|---|---|
 | `landscape` | `landscape_16_9` | `16:9` | `1536x1024` |
 | `square` | `square_hd` | `1:1` | `1024x1024` |
--- a/website/docs/user-guide/features/overview.md
+++ b/website/docs/user-guide/features/overview.md
@@ -30,7 +30,7 @@ Hermes Agent includes a rich set of capabilities that extend far beyond basic ch
 - **[Voice Mode](voice-mode.md)** — Full voice interaction across CLI and messaging platforms. Talk to the agent using your microphone, hear spoken replies, and have live voice conversations in Discord voice channels.
 - **[Browser Automation](browser.md)** — Full browser automation with multiple backends: Browserbase cloud, Browser Use cloud, local Chrome via CDP, or local Chromium. Navigate websites, fill forms, and extract information.
 - **[Vision & Image Paste](vision.md)** — Multimodal vision support. Paste images from your clipboard into the CLI and ask the agent to analyze, describe, or work with them using any vision-capable model.
-- **[Image Generation](image-generation.md)** — Generate images from text prompts using FAL.ai. Eight models supported (FLUX 2 Klein/Pro, GPT-Image 1.5, Nano Banana, Ideogram V3, Recraft V3, Qwen, Z-Image Turbo); pick one via `hermes tools`.
+- **[Image Generation](image-generation.md)** — Generate images from text prompts using FAL.ai. Eight models supported (FLUX 2 Klein/Pro, GPT-Image 1.5, Nano Banana Pro, Ideogram V3, Recraft V4 Pro, Qwen, Z-Image Turbo); pick one via `hermes tools`.
 - **[Voice & TTS](tts.md)** — Text-to-speech output and voice message transcription across all messaging platforms, with five provider options: Edge TTS (free), ElevenLabs, OpenAI TTS, MiniMax, and NeuTTS.

 ## Integrations
--- a/website/docs/user-guide/features/skills.md
+++ b/website/docs/user-guide/features/skills.md
@@ -278,6 +278,8 @@ hermes skills check                               # Check installed hub skills f
 hermes skills update                              # Reinstall hub skills with upstream changes when needed
 hermes skills audit                               # Re-scan all hub skills for security
 hermes skills uninstall k8s                       # Remove a hub skill
+hermes skills reset google-workspace              # Un-stick a bundled skill from "user-modified" (see below)
+hermes skills reset google-workspace --restore    # Also restore the bundled version, deleting your local edits
 hermes skills publish skills/my-skill --to github --repo owner/repo
 hermes skills snapshot export setup.json          # Export skill config
 hermes skills tap add myorg/skills-repo           # Add a custom GitHub source
@@ -430,6 +432,43 @@ This uses the stored source identifier plus the current upstream bundle content
 Skills hub operations use the GitHub API, which has a rate limit of 60 requests/hour for unauthenticated users. If you see rate-limit errors during install or search, set `GITHUB_TOKEN` in your `.env` file to increase the limit to 5,000 requests/hour. The error message includes an actionable hint when this happens.
 :::

+## Bundled skill updates (`hermes skills reset`)
+
+Hermes ships with a set of bundled skills in `skills/` inside the repo. On install and on every `hermes update`, a sync pass copies those into `~/.hermes/skills/` and records a manifest at `~/.hermes/skills/.bundled_manifest` mapping each skill name to the content hash at the time it was synced (the **origin hash**).
+
+On each sync, Hermes recomputes the hash of your local copy and compares it to the origin hash:
+
+- **Unchanged** → safe to pull upstream changes, copy the new bundled version in, record the new origin hash.
+- **Changed** → treated as **user-modified** and skipped forever, so your edits never get stomped.
+
+The protection is good, but it has one sharp edge. If you edit a bundled skill and then later want to abandon your changes and go back to the bundled version by just copy-pasting from `~/.hermes/hermes-agent/skills/`, the manifest still holds the *old* origin hash from whenever the last successful sync ran. Your fresh copy-paste contents (current bundled hash) won't match that stale origin hash, so sync keeps flagging it as user-modified.
+
+`hermes skills reset` is the escape hatch:
+
+```bash
+# Safe: clears the manifest entry for this skill. Your current copy is preserved,
+# but the next sync re-baselines against it so future updates work normally.
+hermes skills reset google-workspace
+
+# Full restore: also deletes your local copy and re-copies the current bundled
+# version. Use this when you want the pristine upstream skill back.
+hermes skills reset google-workspace --restore
+
+# Non-interactive (e.g. in scripts or TUI mode) — skip the --restore confirmation.
+hermes skills reset google-workspace --restore --yes
+```
+
+The same command works in chat as a slash command:
+
+```text
+/skills reset google-workspace
+/skills reset google-workspace --restore
+```
+
+:::note Profiles
+Each profile has its own `.bundled_manifest` under its own `HERMES_HOME`, so `hermes -p coder skills reset <name>` only affects that profile.
+:::
+
 ### Slash commands (inside chat)

 All the same commands work with `/skills`:
@@ -442,6 +481,7 @@ All the same commands work with `/skills`:
 /skills install openai/skills/skill-creator --force
 /skills check
 /skills update
+/skills reset google-workspace
 /skills list
 ```
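
The sync rule documented in the new "Bundled skill updates" section can be illustrated with a small sketch. It is not Hermes's implementation: the helper names `content_hash` and `sync_decision`, the use of SHA-256, and the string return values are assumptions made for illustration. It shows why pasting in the current bundled version still counts as user-modified while the manifest holds an older origin hash, and why clearing the manifest entry fixes that.

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """Hash skill content. SHA-256 is an assumption, not confirmed by the docs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sync_decision(local_text: str, origin_hash: Optional[str]) -> str:
    """Decide what a sync pass does with one bundled skill (hypothetical helper)."""
    if origin_hash is None:
        # No manifest entry (for example after `hermes skills reset`):
        # the next sync re-baselines against the current local copy.
        return "rebaseline"
    if content_hash(local_text) == origin_hash:
        # Local copy is unchanged since the last sync: safe to pull upstream.
        return "update"
    # Local copy differs from the recorded origin hash: treated as user-modified.
    return "skip"


old_bundled = "skill v1"   # content at the last successful sync
new_bundled = "skill v2"   # current bundled content, pasted in by hand
manifest_hash = content_hash(old_bundled)

print(sync_decision(old_bundled, manifest_hash))  # update
print(sync_decision(new_bundled, manifest_hash))  # skip
print(sync_decision(new_bundled, None))           # rebaseline
```

The second call is the sharp edge the section describes: the pasted content matches the current bundled version but not the stale origin hash, so it is skipped until the manifest entry is cleared.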

--- a/website/docs/user-guide/features/tool-gateway.md
+++ b/website/docs/user-guide/features/tool-gateway.md
@@ -18,7 +18,7 @@ The **Tool Gateway** lets paid [Nous Portal](https://portal.nousresearch.com) su
 | Tool | What It Does | Direct Alternative |
 |------|--------------|--------------------|
 | **Web search & extract** | Search the web and extract page content via Firecrawl | `FIRECRAWL_API_KEY`, `EXA_API_KEY`, `PARALLEL_API_KEY`, `TAVILY_API_KEY` |
-| **Image generation** | Generate images via FAL (8 models: FLUX 2 Klein/Pro, GPT-Image, Nano Banana, Ideogram, Recraft, Qwen, Z-Image) | `FAL_KEY` |
+| **Image generation** | Generate images via FAL (8 models: FLUX 2 Klein/Pro, GPT-Image, Nano Banana Pro, Ideogram, Recraft V4 Pro, Qwen, Z-Image) | `FAL_KEY` |
 | **Text-to-speech** | Convert text to speech via OpenAI TTS | `VOICE_TOOLS_OPENAI_KEY`, `ELEVENLABS_API_KEY` |
 | **Browser automation** | Control cloud browsers via Browser Use | `BROWSER_USE_API_KEY`, `BROWSERBASE_API_KEY` |

--- a/website/docs/user-guide/messaging/discord.md
+++ b/website/docs/user-guide/messaging/discord.md
@@ -283,6 +283,10 @@ Discord behavior is controlled through two files: **`~/.hermes/.env`** for crede
 | `DISCORD_IGNORED_CHANNELS` | No | — | Comma-separated channel IDs where the bot **never** responds, even when `@mentioned`. Takes priority over all other channel settings. |
 | `DISCORD_NO_THREAD_CHANNELS` | No | — | Comma-separated channel IDs where the bot responds directly in the channel instead of creating a thread. Only relevant when `DISCORD_AUTO_THREAD` is `true`. |
 | `DISCORD_REPLY_TO_MODE` | No | `"first"` | Controls reply-reference behavior: `"off"` — never reply to the original message, `"first"` — reply-reference on the first message chunk only (default), `"all"` — reply-reference on every chunk. |
+| `DISCORD_ALLOW_MENTION_EVERYONE` | No | `false` | When `false` (default), the bot cannot ping `@everyone` or `@here` even if its response contains those tokens. Set to `true` to opt back in. See [Mention Control](#mention-control) below. |
+| `DISCORD_ALLOW_MENTION_ROLES` | No | `false` | When `false` (default), the bot cannot ping `@role` mentions. Set to `true` to allow. |
+| `DISCORD_ALLOW_MENTION_USERS` | No | `true` | When `true` (default), the bot can ping individual users by ID. |
+| `DISCORD_ALLOW_MENTION_REPLIED_USER` | No | `true` | When `true` (default), replying to a message pings the original author. |

 ### Config File (`config.yaml`)

@@ -298,6 +302,11 @@ discord:
  ignored_channels: []            # Channel IDs where bot never responds
  no_thread_channels: []          # Channel IDs where bot responds without threading
  channel_prompts: {}             # Per-channel ephemeral system prompts
+  allow_mentions:                 # What the bot is allowed to ping (safe defaults)
+    everyone: false               # @everyone / @here pings (default: false)
+    roles: false                  # @role pings (default: false)
+    users: true                   # @user pings (default: true)
+    replied_user: true            # reply-reference pings the author (default: true)

 # Session isolation (applies to all gateway platforms, not just Discord)
 group_sessions_per_user: true     # Isolate sessions per user in shared channels
@@ -552,6 +561,34 @@ If you intentionally want a shared room conversation, leave it off — just expe
 Always set `DISCORD_ALLOWED_USERS` to restrict who can interact with the bot. Without it, the gateway denies all users by default as a safety measure. Only add User IDs of people you trust — authorized users have full access to the agent's capabilities, including tool use and system access.
 :::

+### Mention Control
+
+By default, Hermes blocks the bot from pinging `@everyone`, `@here`, and role mentions, even if its reply contains those tokens. This prevents a poorly-worded prompt or echoed user content from spamming a whole server. Individual `@user` pings and reply-reference pings (the little "replying to…" chip) stay enabled so normal conversation still works.
+
+You can relax these defaults via either env vars or `config.yaml`:
+
+```yaml
+# ~/.hermes/config.yaml
+discord:
+  allow_mentions:
+    everyone: false      # allow the bot to ping @everyone / @here
+    roles: false         # allow the bot to ping @role mentions
+    users: true          # allow the bot to ping individual @users
+    replied_user: true   # ping the author when replying to their message
+```
+
+```bash
+# ~/.hermes/.env — env vars win over config.yaml
+DISCORD_ALLOW_MENTION_EVERYONE=false
+DISCORD_ALLOW_MENTION_ROLES=false
+DISCORD_ALLOW_MENTION_USERS=true
+DISCORD_ALLOW_MENTION_REPLIED_USER=true
+```
+
+:::tip
+Leave `everyone` and `roles` at `false` unless you know exactly why you need them. It is very easy for an LLM to produce the string `@everyone` inside a normal-looking response; without this protection, that would notify every member of your server.
+:::
+
 For more information on securing your Hermes Agent deployment, see the [Security Guide](../security.md).
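
The precedence described in the new "Mention Control" section (documented defaults, then `config.yaml`, then env vars) can be sketched as below. This is an illustration under assumptions, not the adapter's code: `resolve_allow_mentions` and `parse_bool` are hypothetical helpers, and the set of strings accepted as true is a guess. Only the setting names, env var names, and defaults come from the documentation in this diff.

```python
from typing import Dict, Mapping

# Defaults as documented in the diff above.
DEFAULTS: Dict[str, bool] = {
    "everyone": False,
    "roles": False,
    "users": True,
    "replied_user": True,
}

# Env var names as documented in the diff above.
ENV_NAMES: Dict[str, str] = {
    "everyone": "DISCORD_ALLOW_MENTION_EVERYONE",
    "roles": "DISCORD_ALLOW_MENTION_ROLES",
    "users": "DISCORD_ALLOW_MENTION_USERS",
    "replied_user": "DISCORD_ALLOW_MENTION_REPLIED_USER",
}


def parse_bool(raw: str) -> bool:
    """Assumed parsing rule: a few common spellings count as true."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def resolve_allow_mentions(
    config: Mapping[str, bool], env: Mapping[str, str]
) -> Dict[str, bool]:
    """Merge defaults, then config.yaml values, then env vars (env wins)."""
    resolved = dict(DEFAULTS)
    for key, value in config.items():
        if key in resolved:
            resolved[key] = bool(value)
    for key, name in ENV_NAMES.items():
        if name in env:
            resolved[key] = parse_bool(env[name])
    return resolved


# config.yaml enables role pings, but the env var turns them back off.
settings = resolve_allow_mentions(
    {"roles": True}, {"DISCORD_ALLOW_MENTION_ROLES": "false"}
)
print(settings["roles"])  # False
```

If the adapter is built on discord.py, a dict like this would map onto the keyword arguments of `discord.AllowedMentions`, though that wiring is not shown in this diff.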


--- a/website/docs/user-guide/messaging/weixin.md
+++ b/website/docs/user-guide/messaging/weixin.md
@@ -16,14 +16,14 @@ This adapter is for **personal WeChat accounts** (微信). If you need enterpris

 - A personal WeChat account
 - Python packages: `aiohttp` and `cryptography`
-- The `qrcode` package is optional (for terminal QR rendering during setup)
+- Terminal QR rendering is included when Hermes is installed with the `messaging` extra

 Install the required dependencies:

 ```bash
 pip install aiohttp cryptography
 # Optional: for terminal QR code display
-pip install qrcode
+pip install hermes-agent[messaging]
 ```

 ## Setup
@@ -90,7 +90,7 @@ The adapter will restore saved credentials, connect to the iLink API, and begin
 - **Media support** — images, video, files, and voice messages
 - **AES-128-ECB encrypted CDN** — automatic encryption/decryption for all media transfers
 - **Context token persistence** — disk-backed reply continuity across restarts
-- **Markdown formatting** — headers, tables, and code blocks are reformatted for WeChat readability
+- **Markdown formatting** — preserves Markdown, including headers, tables, and code blocks, so WeChat clients that support Markdown can render it natively
 - **Smart message chunking** — messages stay as a single bubble when under the limit; only oversized payloads split at logical boundaries
 - **Typing indicators** — shows "typing…" status in the WeChat client while the agent processes
 - **SSRF protection** — outbound media URLs are validated before download
@@ -206,12 +206,12 @@ This ensures reply continuity even after gateway restarts.

 ## Markdown Formatting

-WeChat's personal chat does not natively render full Markdown. The adapter reformats content for better readability:
+WeChat clients connected through the iLink Bot API can render Markdown directly, so the adapter preserves Markdown instead of rewriting it:

-- **Headers** (`# Title`) → converted to `【Title】` (level 1) or `**Title**` (level 2+)
-- **Tables** → reformatted as labeled key-value lists (e.g., `- Column: Value`)
-- **Code fences** → preserved as-is (WeChat renders these adequately)
-- **Excessive blank lines** → collapsed to double newlines
+- **Headers** stay as Markdown headings (`#`, `##`, ...)
+- **Tables** stay as Markdown tables
+- **Code fences** stay as fenced code blocks
+- **Excessive blank lines** are collapsed to double newlines outside fenced code blocks

 ## Message Chunking

@@ -296,4 +296,4 @@ Only one Weixin gateway instance can use a given token at a time. The adapter ac
 | Voice messages show as text | If WeChat provides a transcription, the adapter uses the text. This is expected behavior |
 | Messages appear duplicated | The adapter deduplicates by message ID. If you see duplicates, check if multiple gateway instances are running |
 | `iLink POST ... HTTP 4xx/5xx` | API error from the iLink service. Check your token validity and network connectivity |
-| Terminal QR code doesn't render | Install `qrcode`: `pip install qrcode`. Alternatively, open the URL printed above the QR |
+| Terminal QR code doesn't render | Reinstall with the messaging extra: `pip install hermes-agent[messaging]`. Alternatively, open the URL printed above the QR |
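
The revised Weixin "Markdown Formatting" rule, where excessive blank lines are collapsed to double newlines outside fenced code blocks, can be sketched as follows. This is a minimal illustration, not the adapter's code: `collapse_blank_lines` is a hypothetical name, and treating any line that starts with three backticks as a fence toggle is a simplifying assumption.

```python
def collapse_blank_lines(text: str) -> str:
    """Keep at most one blank line in a row, except inside fenced code blocks."""
    out = []
    in_fence = False
    blank_run = 0
    for line in text.split("\n"):
        if line.lstrip().startswith("```"):
            # A fence line opens or closes a code block (simplified detection).
            in_fence = not in_fence
            blank_run = 0
            out.append(line)
            continue
        if in_fence:
            # Inside a fenced block every line is kept verbatim.
            out.append(line)
            continue
        if line.strip() == "":
            blank_run += 1
            if blank_run > 1:
                continue  # drop the extra blank line
        else:
            blank_run = 0
        out.append(line)
    return "\n".join(out)


prose = "Intro\n\n\n\nNext paragraph"
fenced = "```\nx = 1\n\n\n\ny = 2\n```"

print(repr(collapse_blank_lines(prose)))    # 'Intro\n\nNext paragraph'
print(collapse_blank_lines(fenced) == fenced)  # True
```

Headers, tables, and the fences themselves pass through unchanged, which matches the diff's statement that the adapter now preserves Markdown instead of rewriting it.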