mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-02 02:01:47 +00:00
docs: two-week gap sweep — platforms, CLI, config, TUI, hooks, providers (#17727)
Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only.

Highlights:

Stale / incorrect:
- configuration.md: auxiliary auto-routing line was wrong since #11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern.
- integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim.
- website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs).

Platform messaging (#17417 #16997 #16193 #14315 #13151 #11794 #10610 #10283 #10246 #11564 #13178):
- Signal: native formatting (bodyRanges), reply quotes, reactions.
- Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats.
- Slack: strict_mention config.
- Discord: slash_commands disable, send_animation GIF, send_message native media attachments.
- DingTalk: require_mention + allowed_users.

CLI (#16052 #16539 #16566 #15841 #14798 #10043):
- New 'hermes fallback' interactive manager.
- New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior.
- 'hermes gateway start/restart --all' multi-profile flag.
- cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining.

Config keys / env vars (#17305 #17026 #17000 #15077 #14557 #14227 #14166 #14730 #17008):
- terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits.

TUI / CLI UX (#17130 #17113 #17175 #17150 #16707 #12312 #12305 #12934 #14810 #14045 #17286 #17126):
- HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per-prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse.

Gateway features (#16506 #15027 #13428 #12116):
- Native multimodal image routing based on vision capability.
- /usage account-limits section.
- /steer slash command (added to reference + explanation in CLI).

Plugins / hooks (#12929 #12972 #10763 #16364):
- transform_tool_result, transform_terminal_output plugin hooks.
- PluginContext.dispatch_tool() documented with slash-command example.
- google_meet bundled plugin entry under built-in-plugins.md.

Other (#16576 #16572 #16383 #15878 #15608 #15606 #14809 #14767 #14231 #14232 #14307 #13683 #12373 #11891 #11291 #10066):
- hermes backup exclusions (WAL/SHM/journal + checkpoints/).
- security.md hardline blocklist (floor below --yolo).
- FHS install layout for root installs.
- openssh-client + docker-cli baked into the Docker image.
- MEDIA: tag supported extensions table (docs/office/archives/pdf).
- Remote-to-host file sync on SSH/Modal/Daytona teardown.
- 'hermes model' -> Configure Auxiliary Models interactive picker.
- Podman support via HERMES_DOCKER_BINARY.

Providers / STT / one-shot (#15045 #14473 #15704):
- alibaba-coding-plan first-class provider entry.
- xAI Grok STT as a 6th transcription option.
- 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL.

Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
parent
8dcab19d02
commit
22ff6ca32b
26 changed files with 727 additions and 19 deletions
|
|
@ -96,11 +96,17 @@ When resuming a previous session (`hermes -c` or `hermes --resume <id>`), a "Pre
|
|||
| `Alt+V` | Paste an image from the clipboard when supported by the terminal |
|
||||
| `Ctrl+V` | Paste text and opportunistically attach clipboard images |
|
||||
| `Ctrl+B` | Start/stop voice recording when voice mode is enabled (`voice.record_key`, default: `ctrl+b`) |
|
||||
| `Ctrl+G` | Open the current input buffer in `$EDITOR` (vim/nvim/nano/VS Code/etc.). Save and quit to send the edited text as the next prompt — ideal for long, multi-paragraph prompts. |
|
||||
| `Ctrl+X Ctrl+E` | Emacs-style alternate binding for the external editor (same behavior as `Ctrl+G`). |
|
||||
| `Ctrl+C` | Interrupt agent (double-press within 2s to force exit) |
|
||||
| `Ctrl+D` | Exit |
|
||||
| `Ctrl+Z` | Suspend Hermes to background (Unix only). Run `fg` in the shell to resume. |
|
||||
| `Tab` | Accept auto-suggestion (ghost text) or autocomplete slash commands |
|
||||
|
||||
**Multiline paste preview.** When you paste a multi-line block, the CLI echoes a compact single-line preview (`[pasted: 47 lines, 1,842 chars — press Enter to send]`) instead of dumping the whole payload into the scrollback. The full content is still what gets sent; this is just display polish.
|
||||
|
||||
**Markdown stripping in final responses.** The CLI strips the most verbose markdown fences and `**bold**` / `*italic*` wrappers from *final* agent replies so they render as readable terminal prose rather than raw source. Code blocks and lists are preserved. This does not affect gateway platforms or tool results — they keep their markdown for native rendering.
|
||||
|
||||
## Slash Commands
|
||||
|
||||
Type `/` to see the autocomplete dropdown. Hermes supports a large set of CLI slash commands, dynamic skill commands, and user-defined quick commands.
|
||||
|
|
|
|||
|
|
@ -132,6 +132,7 @@ terminal:
|
|||
backend: docker
|
||||
docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
|
||||
docker_mount_cwd_to_workspace: false # Mount launch dir into /workspace
|
||||
docker_run_as_host_user: false # See "Running container as host user" below
|
||||
docker_forward_env: # Env vars to forward into container
|
||||
- "GITHUB_TOKEN"
|
||||
docker_volumes: # Host directory mounts
|
||||
|
|
@ -145,7 +146,7 @@ terminal:
|
|||
container_persistent: true # Persist /workspace and /root across sessions
|
||||
```
|
||||
|
||||
**Requirements:** Docker Desktop or Docker Engine installed and running. Hermes probes `$PATH` plus common macOS install locations (`/usr/local/bin/docker`, `/opt/homebrew/bin/docker`, Docker Desktop app bundle).
|
||||
**Requirements:** Docker Desktop or Docker Engine installed and running. Hermes probes `$PATH` plus common macOS install locations (`/usr/local/bin/docker`, `/opt/homebrew/bin/docker`, Docker Desktop app bundle). Podman is supported out of the box: set `HERMES_DOCKER_BINARY=podman` (or the full path) to force it when both are installed.
|
||||
|
||||
**Container lifecycle:** Hermes reuses a single long-lived container (`docker run -d ... sleep 2h`) for every terminal and file-tool call, across sessions, `/new`, `/reset`, and `delegate_task` subagents, for the lifetime of the Hermes process. Commands run via `docker exec` with a login shell, so working-directory changes, installed packages, and files in `/workspace` all persist from one tool call to the next. The container is stopped and removed on Hermes shutdown (or when the idle-sweep reclaims it).
|
||||
|
||||
|
|
@ -301,6 +302,23 @@ If terminal commands fail immediately or the terminal tool is reported as disabl
|
|||
|
||||
When in doubt, set `terminal.backend` back to `local` and verify that commands run there first.
|
||||
|
||||
### Remote-to-Host File Sync on Teardown
|
||||
|
||||
For the **SSH**, **Modal**, and **Daytona** backends (anywhere the agent's working tree lives on a different machine than the host running Hermes), Hermes tracks files the agent touched inside the remote sandbox and, on session teardown / sandbox cleanup, **syncs the modified files back to the host** under `~/.hermes/cache/remote-syncs/<session-id>/`.
|
||||
|
||||
- Triggers on: session close, `/new`, `/reset`, gateway message timeout, `delegate_task` subagent completion when the child used a remote backend.
|
||||
- Covers the whole tree the agent modified, not just files it explicitly opened. Additions, edits, and deletions are all captured.
|
||||
- The remote sandbox may have been torn down by the time you go looking; the local `~/.hermes/cache/remote-syncs/…` copy is the authoritative record of what the agent changed.
|
||||
- Large binary outputs (model checkpoints, raw datasets) are capped by size — the sync skips files over `file_sync_max_mb` (default `100`). Bump that if you expect bigger artifacts to come back.
|
||||
|
||||
```yaml
terminal:
  file_sync_max_mb: 100       # default — sync files up to 100 MB each
  file_sync_enabled: true     # default — set false to skip the sync entirely
```
|
||||
|
||||
This is how you recover results from ephemeral cloud sandboxes that get destroyed after the session ends, without having to tell the agent to explicitly `scp` or `modal volume put` every artifact.
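The size cap can be pictured as a filter over the files the agent touched. The sketch below is an illustration only, not Hermes's sync code: `select_files_to_sync` and its `(path, size_bytes)` input shape are hypothetical, and it assumes 1 MB means 1024 × 1024 bytes and that a file exactly at the cap is still synced.

```python
def select_files_to_sync(files, file_sync_max_mb=100, file_sync_enabled=True):
    """Return the paths that would be copied back to the host.

    files: iterable of (path, size_bytes) pairs (hypothetical input shape).
    """
    if not file_sync_enabled:
        return []  # sync switched off entirely
    max_bytes = file_sync_max_mb * 1024 * 1024  # assumption: binary megabytes
    return [path for path, size in files if size <= max_bytes]


touched = [
    ("report.md", 4_096),
    ("model.ckpt", 250 * 1024 * 1024),  # 250 MB, over the default cap
]
print(select_files_to_sync(touched))
```

With the default cap only the small file comes back; raising `file_sync_max_mb` to 300 would include the checkpoint as well.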
|
||||
|
||||
### Docker Volume Mounts
|
||||
|
||||
When using the Docker backend, `docker_volumes` lets you share host directories with the container. Each entry uses standard Docker `-v` syntax: `host_path:container_path[:options]`.
|
||||
|
|
@ -355,6 +373,20 @@ Hermes resolves each listed variable from your current shell first, then falls b
|
|||
Anything listed in `docker_forward_env` becomes visible to commands run inside the container. Only forward credentials you are comfortable exposing to the terminal session.
|
||||
:::
|
||||
|
||||
### Running the Container as Your Host User
|
||||
|
||||
By default Docker containers run as `root` (UID 0). Files created inside `/workspace` or other bind-mounts end up owned by root on the host, so after a session you have to `sudo chown` them before you can edit them from your host editor. The `terminal.docker_run_as_host_user` flag fixes this:
|
||||
|
||||
```yaml
terminal:
  backend: docker
  docker_run_as_host_user: true   # default: false
```
|
||||
|
||||
When enabled, Hermes appends `--user $(id -u):$(id -g)` to the `docker run` command so files written into bind-mounted directories (`/workspace`, `/root`, anything in `docker_volumes`) are owned by your host user, not root. The trade-off: the container can no longer `apt install` or write to root-owned paths like `/root/.npm` — use a base image whose `HOME` is owned by a non-root user (or add your required tooling at image build time) if you need both.
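To make the effect of the flag concrete, here is a rough sketch of how the `docker run` argument list could be assembled. `build_docker_run_args` is a hypothetical helper, not a Hermes function, and the real command carries more options than shown; only the `--user UID:GID` part mirrors the behavior described above.

```python
import os


def build_docker_run_args(image, run_as_host_user=False, uid=None, gid=None):
    """Assemble a simplified `docker run` argv (illustrative only)."""
    args = ["docker", "run", "-d"]
    if run_as_host_user:
        # Equivalent of `--user $(id -u):$(id -g)`; os.getuid/getgid are Unix-only.
        uid = os.getuid() if uid is None else uid
        gid = os.getgid() if gid is None else gid
        args += ["--user", f"{uid}:{gid}"]
    args += [image, "sleep", "2h"]
    return args


print(build_docker_run_args("nikolaik/python-nodejs:python3.11-nodejs20",
                            run_as_host_user=True, uid=1000, gid=1000))
```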
|
||||
|
||||
Leave this `false` (the default) for backwards-compatible behavior. Turn it on when your workflow is mostly "edit mounted host files" and you're tired of `sudo chown -R`.
|
||||
|
||||
### Optional: Mount the Launch Directory into `/workspace`
|
||||
|
||||
Docker sandboxes stay isolated by default. Hermes does **not** pass your current host working directory into the container unless you explicitly opt in.
|
||||
|
|
@ -447,6 +479,17 @@ hermes config set skills.config.myplugin.path ~/myplugin-data
|
|||
|
||||
For details on declaring config settings in your own skills, see [Creating Skills — Config Settings](/docs/developer-guide/creating-skills#config-settings-configyaml).
|
||||
|
||||
### Guard on agent-created skill writes
|
||||
|
||||
When the agent uses `skill_manage` to create, edit, patch, or delete a skill, Hermes can optionally scan the new/updated content for dangerous keyword patterns (credential harvesting, obvious prompt injection, exfil instructions). The scanner is **off by default** — real agent workflows that legitimately touch `~/.ssh/` or mention `$OPENAI_API_KEY` were tripping the heuristic too often. Turn it back on if you want the scanner to prompt you before the agent's skill writes land:
|
||||
|
||||
```yaml
skills:
  guard_agent_created: true   # default: false
```
|
||||
|
||||
When on, any flagged `skill_manage` write surfaces as an approval prompt with the scanner's rationale. Accepted writes land; denied writes return an explanatory error to the agent.
|
||||
|
||||
## Memory Configuration
|
||||
|
||||
```yaml
|
||||
|
|
@ -560,6 +603,7 @@ compression:
|
|||
threshold: 0.50 # Compress at this % of context limit
|
||||
target_ratio: 0.20 # Fraction of threshold to preserve as recent tail
|
||||
protect_last_n: 20 # Min recent messages to keep uncompressed
|
||||
hygiene_hard_message_limit: 400 # Gateway safety valve — see below
|
||||
|
||||
# The summarization model/provider is configured under auxiliary:
|
||||
auxiliary:
|
||||
|
|
@ -573,6 +617,12 @@ auxiliary:
|
|||
Older configs with `compression.summary_model`, `compression.summary_provider`, and `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load (config version 17). No manual action needed.
|
||||
:::
|
||||
|
||||
`hygiene_hard_message_limit` is a gateway-only **pre-compression safety valve**. Runaway sessions with thousands of messages can hit model context limits before the normal percent-of-context threshold fires; when message count crosses this ceiling, Hermes forces compression regardless of token usage. Default `400` — raise it for platforms where very long sessions are normal, lower it to force more aggressive compression. Editing this value on a running gateway takes effect on the next message (see below).
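One way to read the safety valve is as a second trigger alongside the token threshold. The function below is a minimal sketch under that reading; `should_compress` and its parameters are illustrative names rather than Hermes internals, and treating both comparisons as inclusive is an assumption.

```python
def should_compress(message_count, tokens_used, context_length,
                    threshold=0.50, hygiene_hard_message_limit=400):
    """Return True when either compression trigger fires (illustrative)."""
    # Normal trigger: token usage reaches the configured share of the context window.
    over_token_threshold = tokens_used >= threshold * context_length
    # Safety valve: message count alone forces compression, regardless of tokens.
    over_message_ceiling = message_count >= hygiene_hard_message_limit
    return over_token_threshold or over_message_ceiling


# A session with many tiny messages trips the ceiling long before the token threshold.
print(should_compress(message_count=500, tokens_used=1_000, context_length=100_000))
```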
|
||||
|
||||
:::tip Gateway hot-reload of compression and context length
|
||||
As of recent releases, editing `model.context_length` or any `compression.*` key in `config.yaml` on a running gateway takes effect on the next message — no gateway restart, no `/reset`, no session rotation required. The cached-agent signature includes these keys, so the gateway transparently rebuilds the agent when it sees a change. API keys and tool/skill config still require the usual reload paths.
|
||||
:::
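The hot-reload behavior described in the tip can be approximated with a cache keyed on a signature of the relevant config keys. This is a sketch of the general technique, not the gateway's implementation: `agent_signature`, `AgentCache`, and the config dictionary layout are all assumptions made for illustration.

```python
import hashlib
import json


def agent_signature(config):
    """Hash only the keys whose edits should trigger a rebuild (assumed set)."""
    relevant = {
        "context_length": config.get("model", {}).get("context_length"),
        "compression": config.get("compression", {}),
    }
    blob = json.dumps(relevant, sort_keys=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


class AgentCache:
    """Rebuild the agent only when the signature changes."""

    def __init__(self, build_agent):
        self._build_agent = build_agent
        self._signature = None
        self._agent = None

    def get(self, config):
        sig = agent_signature(config)
        if sig != self._signature:
            self._agent = self._build_agent(config)
            self._signature = sig
        return self._agent
```

Because keys outside the signature (API keys, for example) do not change the hash, edits to them would not cause a rebuild here, which is consistent with the note that those still need the usual reload paths.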
|
||||
|
||||
### Common setups
|
||||
|
||||
**Default (auto-detect) — no configuration needed:**
|
||||
|
|
@ -581,7 +631,7 @@ compression:
|
|||
enabled: true
|
||||
threshold: 0.50
|
||||
```
|
||||
Uses the first available provider (OpenRouter → Nous → Codex) with Gemini Flash.
|
||||
Uses your main provider and main model. Override per-task (e.g. `auxiliary.compression.provider: openrouter` + `model: google/gemini-2.5-flash`) if you want compression on a cheaper model than your main chat model.
|
||||
|
||||
**Force a specific provider** (OAuth or API-key based):
|
||||
```yaml
|
||||
|
|
@ -647,12 +697,15 @@ Warnings are injected into the last tool result's JSON (as a `_budget_warning` f
|
|||
```yaml
agent:
  max_turns: 90         # Max iterations per conversation turn (default: 90)
  api_max_retries: 2    # Retries per provider before fallback engages (default: 2)
```
|
||||
|
||||
Budget pressure is enabled by default. The agent sees warnings naturally as part of tool results, encouraging it to consolidate its work and deliver a response before running out of iterations.
|
||||
|
||||
When the iteration budget is fully exhausted, the CLI shows a notification to the user: `⚠ Iteration budget reached (90/90) — response may be incomplete`. If the budget runs out during active work, the agent generates a summary of what was accomplished before stopping.
|
||||
|
||||
`agent.api_max_retries` controls how many times Hermes retries a provider API call on transient errors (rate limits, connection drops, 5xx) **before** fallback-provider switching engages. The default is `2` — three attempts total, matching the OpenAI SDK default. If you have [fallback providers](/docs/user-guide/features/fallback-providers) configured and want to fail over faster, drop this to `0` so the first transient error on your primary immediately hands off to the fallback instead of churning retries against the flaky endpoint.
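The relationship between retries and fallback can be sketched as two nested loops. This is a simplified model, not Hermes's request path: `TransientError` and `call_with_fallback` are hypothetical names, providers are plain callables, and real retry logic would normally also wait between attempts.

```python
class TransientError(Exception):
    """Stand-in for rate limits, connection drops, and 5xx responses."""


def call_with_fallback(providers, api_max_retries=2):
    """Try each provider in order; each gets 1 + api_max_retries attempts."""
    last_error = None
    for call in providers:
        for _attempt in range(api_max_retries + 1):
            try:
                return call()
            except TransientError as exc:
                last_error = exc  # retry, or move on once attempts are used up
    if last_error is None:
        raise ValueError("no providers configured")
    raise last_error
```

With `api_max_retries=2` a failing primary is called three times before the fallback is tried; with `api_max_retries=0` it is called once.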
|
||||
|
||||
### API Timeouts
|
||||
|
||||
Hermes has separate timeout layers for streaming, plus a stale detector for non-streaming calls. The stale detectors auto-adjust for local providers only when you leave them at their implicit defaults.
|
||||
|
|
@ -709,7 +762,29 @@ Options: `fill_first` (default), `round_robin`, `least_used`, `random`. See [Cre
|
|||
|
||||
## Auxiliary Models
|
||||
|
||||
Hermes uses lightweight "auxiliary" models for side tasks like image analysis, web page summarization, and browser screenshot analysis. By default, these use **Gemini Flash** via auto-detection — you don't need to configure anything.
|
||||
Hermes uses "auxiliary" models for side tasks like image analysis, web page summarization, browser screenshot analysis, session-title generation, and context compression. By default (`auxiliary.*.provider: "auto"`), Hermes routes every auxiliary task to your **main chat model** — the same provider/model you picked in `hermes model`. You don't need to configure anything to get started, but be aware that on expensive reasoning models (Opus, MiniMax M2.7, etc.) auxiliary tasks add meaningful cost. If you want cheap-and-fast side tasks regardless of your main model, set `auxiliary.<task>.provider` and `auxiliary.<task>.model` explicitly (for example, Gemini Flash on OpenRouter for vision and web extraction).
|
||||
|
||||
:::note Why "auto" uses your main model
|
||||
Earlier builds split aggregator users (OpenRouter, Nous Portal) onto a cheap provider-side default. That was surprising — users who paid for an aggregator subscription would see a different model handling their auxiliary traffic. `auto` now uses the main model for everyone, and per-task overrides in `config.yaml` still win (see [Full auxiliary config reference](#full-auxiliary-config-reference) below).
|
||||
:::
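The routing rule can be summarized in a few lines. The sketch below assumes a simplified config shape in which `config["model"]` holds `provider` and `model` keys; that layout and the `resolve_auxiliary` helper are illustrative, not the actual config schema or resolver.

```python
def resolve_auxiliary(task, config):
    """Return (provider, model) for an auxiliary task (illustrative)."""
    main = config["model"]
    override = config.get("auxiliary", {}).get(task, {})
    provider = override.get("provider", "auto")
    if provider == "auto":
        # "auto" routes to the main chat model.
        return main["provider"], main["model"]
    # Explicit per-task override; this sketch expects both keys to be set.
    return provider, override["model"]


config = {
    "model": {"provider": "anthropic", "model": "claude-opus"},  # placeholder values
    "auxiliary": {
        "vision": {"provider": "openrouter", "model": "google/gemini-2.5-flash"},
    },
}
print(resolve_auxiliary("compression", config))
print(resolve_auxiliary("vision", config))
```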
|
||||
|
||||
### Configuring auxiliary models interactively
|
||||
|
||||
Instead of hand-editing YAML, run `hermes model` and pick **"Configure auxiliary models"** from the menu. You'll get an interactive per-task picker:
|
||||
|
||||
```
|
||||
$ hermes model
|
||||
→ Configure auxiliary models
|
||||
|
||||
[ ] vision currently: auto / main model
|
||||
[ ] web_extract currently: auto / main model
|
||||
[ ] session_search currently: openrouter / google/gemini-2.5-flash
|
||||
[ ] title_generation currently: openrouter / google/gemini-3-flash-preview
|
||||
[ ] compression currently: auto / main model
|
||||
[ ] approval currently: auto / main model
|
||||
```
|
||||
|
||||
Select a task, pick a provider (OAuth flows open a browser; API-key providers prompt), pick a model. The change persists to `auxiliary.<task>.*` in `config.yaml`. Same machinery as the main-model picker — no extra syntax to learn.
|
||||
|
||||
### Video Tutorial
|
||||
|
||||
|
|
@ -1088,6 +1163,7 @@ display:
|
|||
streaming: false # Stream tokens to terminal as they arrive (real-time output)
|
||||
show_cost: false # Show estimated $ cost in the CLI status bar
|
||||
tool_preview_length: 0 # Max chars for tool call previews (0 = no limit, show full paths/commands)
|
||||
runtime_metadata_footer: false # Gateway: append a runtime-context footer to final replies
|
||||
```
|
||||
|
||||
| Mode | What you see |
|
||||
|
|
@ -1099,6 +1175,23 @@ display:
|
|||
|
||||
In the CLI, cycle through these modes with `/verbose`. To use `/verbose` in messaging platforms (Telegram, Discord, Slack, etc.), set `tool_progress_command: true` in the `display` section above. The command will then cycle the mode and save to config.
|
||||
|
||||
### Runtime-metadata footer (gateway only)
|
||||
|
||||
When `display.runtime_metadata_footer: true`, Hermes appends a small runtime-context footer to the **final** message of each gateway turn — same info the CLI shows in its status bar (model, session duration, tokens, cost). Off by default; opt in per-gateway if your team wants every reply to include the provenance.
|
||||
|
||||
```yaml
display:
  runtime_metadata_footer: true
```
|
||||
|
||||
Example footer appended to a Telegram/Discord/Slack reply:
|
||||
|
||||
```
|
||||
— claude-opus-4.7 · 12 tool calls · 2m 14s · $0.042
|
||||
```
|
||||
|
||||
Only the **final** message of a turn gets the footer; interim updates stay clean.
|
||||
|
||||
### Per-platform progress overrides
|
||||
|
||||
Different platforms have different verbosity needs. For example, Signal can't edit messages, so each progress update becomes a separate message — noisy. Use `tool_progress_overrides` to set per-platform modes:
|
||||
|
|
|
|||
|
|
@ -263,6 +263,8 @@ The official image is based on `debian:13.4` and includes:
|
|||
- Node.js + npm (for browser automation and WhatsApp bridge)
|
||||
- Playwright with Chromium (`npx playwright install --with-deps chromium`)
|
||||
- ripgrep and ffmpeg as system utilities
|
||||
- **`docker-cli`** — so agents running inside the container can drive the host's Docker daemon (bind-mount `/var/run/docker.sock` to opt in) for `docker build`, `docker run`, container inspection, etc.
|
||||
- **`openssh-client`** — enables the [SSH terminal backend](/docs/user-guide/configuration#ssh-backend) from inside the container. The SSH backend shells out to the system `ssh` binary; without this, it failed silently in containerized installs.
|
||||
- The WhatsApp bridge (`scripts/whatsapp-bridge/`)
|
||||
|
||||
The entrypoint script (`docker/entrypoint.sh`) bootstraps the data volume on first run:
|
||||
|
|
|
|||
|
|
@ -162,6 +162,36 @@ Hermes-prefixed and standard SDK env vars (`LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECR
|
|||
|
||||
**Disabling:** `hermes plugins disable observability/langfuse`. The plugin module is still discovered, but no module code runs until you re-enable.
|
||||
|
||||
### google_meet
|
||||
|
||||
Lets the agent **join, transcribe, and participate in Google Meet calls** — take notes on a meeting, summarize the back-and-forth after, follow up on specific points, and (optionally) speak replies back into the call via TTS.
|
||||
|
||||
**What it adds:**
|
||||
|
||||
- A headless virtual participant that joins a Meet URL using browser automation
|
||||
- Live transcription of the meeting audio via the configured STT provider
|
||||
- A `meet_summarize` / `meet_speak` / `meet_followup` toolset the agent invokes to act on what it heard
|
||||
- Post-meeting artifacts (transcript, speaker-attributed notes, action items) saved under `~/.hermes/cache/google_meet/<meeting_id>/`
|
||||
|
||||
**Setup:**
|
||||
|
||||
```bash
|
||||
hermes plugins enable google_meet
|
||||
# Prompts you to sign in via the plugin's OAuth flow on first use —
|
||||
# needs a Google account with Meet access. Host approval may be required
|
||||
# if the meeting enforces "only invited participants can join".
|
||||
```
|
||||
|
||||
Usage from chat:
|
||||
|
||||
> "Join meet.google.com/abc-defg-hij and take notes. After the call, send me a summary with action items."
|
||||
|
||||
The agent kicks off the meeting join, streams the transcription back into its context as the call proceeds, and produces a structured summary when the meeting ends (or when you tell it to stop).
|
||||
|
||||
**When to use it:** recurring standups where you want a bot to transcribe + summarize for async attendees; deposition-style interviews where you want structured notes; any case where you'd otherwise need Fireflies / Otter / Grain. When you'd rather not have an AI listening in — don't enable it.
|
||||
|
||||
**Disabling:** `hermes plugins disable google_meet`. Any cached transcripts and recordings stay in `~/.hermes/cache/google_meet/` until you remove them.
|
||||
|
||||
## Adding a bundled plugin
|
||||
|
||||
Bundled plugins are written exactly like any other Hermes plugin — see [Build a Hermes Plugin](/docs/guides/build-a-hermes-plugin). The only differences are:
|
||||
|
|
|
|||
|
|
@ -366,6 +366,64 @@ cronjob(action="remove", job_id="...")
|
|||
|
||||
For `update`, pass `skills=[]` to remove all attached skills.
|
||||
|
||||
## Toolsets available to cron jobs
|
||||
|
||||
Cron runs each job in a fresh agent session with no chat platform attached. By default the cron agent gets **the toolset you configured for the `cron` platform in `hermes tools`** — not the CLI default, not everything under the sun.
|
||||
|
||||
```bash
|
||||
hermes tools
|
||||
# → pick the "cron" platform in the curses UI
|
||||
# → toggle toolsets on/off just like you would for Telegram/Discord/etc.
|
||||
```
|
||||
|
||||
Tighter per-job control is available via the `enabled_toolsets` field on `cronjob.create` (or on an existing job via `cronjob.update`):
|
||||
|
||||
```text
cronjob(action="create", name="weekly-news-summary",
        schedule="every sunday 9am",
        enabled_toolsets=["web", "file"],   # just web + file, no terminal/browser/etc.
        prompt="Summarize this week's AI news: ...")
```
|
||||
|
||||
When `enabled_toolsets` is set on a job it wins; otherwise the `hermes tools` cron-platform config wins; otherwise Hermes falls back to the built-in defaults. This matters for cost control: carrying `moa`, `browser`, `delegation` into every tiny "fetch news" job bloats the tool-schema prompt on every LLM call.
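The precedence order above reads naturally as a three-step lookup. The following sketch is only a model of that order: `resolve_cron_toolsets`, the dictionary shapes, and the contents of `BUILTIN_DEFAULT_TOOLSETS` are placeholders, not Hermes's actual defaults.

```python
BUILTIN_DEFAULT_TOOLSETS = ["web", "file", "terminal"]  # placeholder values


def resolve_cron_toolsets(job, platform_toolsets):
    """Pick toolsets for a cron job: per-job, then cron platform, then defaults."""
    if job.get("enabled_toolsets") is not None:
        return list(job["enabled_toolsets"])      # 1. per-job setting wins
    cron_config = platform_toolsets.get("cron")
    if cron_config is not None:
        return list(cron_config)                  # 2. `hermes tools` cron platform
    return list(BUILTIN_DEFAULT_TOOLSETS)         # 3. built-in fallback


job = {"name": "weekly-news-summary", "enabled_toolsets": ["web", "file"]}
print(resolve_cron_toolsets(job, {"cron": ["web", "file", "browser"]}))
```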
|
||||
|
||||
### Skipping the agent entirely: `wakeAgent`
|
||||
|
||||
If your cron job attaches a pre-check script (via `script=`), the script can decide at runtime whether Hermes should even invoke the agent. Emit a final stdout line of the form:
|
||||
|
||||
```text
|
||||
{"wakeAgent": false}
|
||||
```
|
||||
|
||||
…and cron skips the agent run entirely for this tick. Useful for frequent polls (every 1–5 min) that only need to wake the LLM when state actually changed — otherwise you pay for zero-content agent turns over and over.
|
||||
|
||||
```python
# pre-check script
# fetch_latest_issue_count, read_state, and write_state are placeholders:
# define them for your own data source and state storage.
import json, sys

latest = fetch_latest_issue_count()
prev = read_state("issue_count")
if latest == prev:
    print(json.dumps({"wakeAgent": False}))   # skip this tick
    sys.exit(0)
write_state("issue_count", latest)
print(json.dumps({"wakeAgent": True, "context": {"new_issues": latest - prev}}))
```
|
||||
|
||||
When `wakeAgent` is omitted, the default is `true` (wake the agent as usual).
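On the consuming side, the gate amounts to inspecting the script's final stdout line. The parser below is a sketch of that idea, not the cron runner's code: `should_wake_agent` is a hypothetical name, and treating empty or non-JSON output as "wake" is an assumption chosen to match the documented default.

```python
import json


def should_wake_agent(script_stdout):
    """Decide whether to run the agent from a pre-check script's stdout."""
    lines = [ln for ln in script_stdout.splitlines() if ln.strip()]
    if not lines:
        return True  # nothing printed: fall back to the default
    try:
        payload = json.loads(lines[-1])  # only the final line is inspected
    except json.JSONDecodeError:
        return True  # assumption: non-JSON output does not block the run
    if not isinstance(payload, dict):
        return True
    return payload.get("wakeAgent", True) is not False


print(should_wake_agent('checking issues...\n{"wakeAgent": false}\n'))
```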
|
||||
|
||||
### Chaining jobs: `context_from`
|
||||
|
||||
A cron job can consume the most recent successful output of one or more other jobs by listing their names (or IDs) in `context_from`:
|
||||
|
||||
```text
cronjob(action="create", name="daily-digest",
        schedule="every day 7am",
        context_from=["ai-news-fetch", "github-prs-fetch"],
        prompt="Write the daily digest using the outputs above.")
```
|
||||
|
||||
The referenced jobs' most recent completed outputs are injected above the prompt as context for this run. Each upstream entry must be a valid job ID or name (see `cronjob action="list"`). Note: chaining reads the *most recent completed* output — it does not wait for upstream jobs that are running in the same tick.
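The "most recent completed output" rule can be illustrated with an in-memory list of runs. Everything in this sketch is hypothetical (the `gather_context` helper, the run record fields, the heading format); it only demonstrates that runs still in progress are ignored and the newest completed one is used.

```python
def gather_context(context_from, runs):
    """Build the context text injected above the prompt (illustrative).

    runs: list of dicts with "job", "status", "finished_at", "output" keys
    (a made-up record shape for this example).
    """
    blocks = []
    for job in context_from:
        completed = [r for r in runs
                     if r["job"] == job and r["status"] == "completed"]
        if not completed:
            continue  # no finished run yet: nothing to inject for this job
        latest = max(completed, key=lambda r: r["finished_at"])
        blocks.append(f"## {job}\n{latest['output']}")
    return "\n\n".join(blocks)


runs = [
    {"job": "ai-news-fetch", "status": "completed", "finished_at": 1, "output": "old"},
    {"job": "ai-news-fetch", "status": "completed", "finished_at": 2, "output": "new"},
    {"job": "ai-news-fetch", "status": "running", "finished_at": 3, "output": ""},
]
print(gather_context(["ai-news-fetch", "github-prs-fetch"], runs))
```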
|
||||
|
||||
## Job storage
|
||||
|
||||
Jobs are stored in `~/.hermes/cron/jobs.json`. Output from job runs is saved to `~/.hermes/cron/output/{job_id}/{timestamp}.md`.
|
||||
|
|
|
|||
|
|
@ -173,6 +173,32 @@ delegate_task(
|
|||
)
|
||||
```
|
||||
|
||||
## Child Timeout
|
||||
|
||||
Subagents are killed as stuck if they go quiet for more than `delegation.child_timeout_seconds` wall-clock seconds. The default is **600** (10 minutes) — bumped up from 300s in earlier releases because high-reasoning models on non-trivial research tasks were getting killed mid-think. Tune it per-install:
|
||||
|
||||
```yaml
delegation:
  child_timeout_seconds: 600    # default
```
|
||||
|
||||
Lower it for fast local models; raise it for slow reasoning models on hard problems. The timer resets every time the child makes an API call or tool call — only genuinely idle workers trigger the kill.
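The reset-on-activity behavior is the classic idle-watchdog pattern. A minimal sketch, assuming a monotonic clock and a hypothetical `IdleWatchdog` class (this is not the delegation code itself):

```python
import time


class IdleWatchdog:
    """Flags a worker as stuck only after a full quiet period."""

    def __init__(self, child_timeout_seconds=600, clock=time.monotonic):
        self._timeout = child_timeout_seconds
        self._clock = clock
        self._last_activity = clock()

    def record_activity(self):
        # Call on every API call or tool call made by the child.
        self._last_activity = self._clock()

    def is_stuck(self):
        return self._clock() - self._last_activity > self._timeout
```

Injecting the clock keeps the class testable without sleeping; a long-running child that keeps making calls never trips the timeout, no matter how long the whole task takes.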
|
||||
|
||||
:::tip Diagnostic dump on zero-call timeout
|
||||
If a subagent times out having made **zero** API calls (usually: provider unreachable, auth failure, or tool-schema rejection), `delegate_task` writes a structured diagnostic to `~/.hermes/logs/subagent-timeout-<session>-<timestamp>.log` containing the subagent's config snapshot, credential-resolution trace, and any early error messages. Much easier to root-cause than the previous silent-timeout behavior.
|
||||
:::
|
||||
|
||||
## Monitoring Running Subagents (`/agents`)
|
||||
|
||||
The TUI ships a `/agents` overlay (alias `/tasks`) that turns recursive `delegate_task` fan-out into a first-class audit surface:
|
||||
|
||||
- Live tree view of running and recently-finished subagents, grouped by parent
|
||||
- Per-branch cost, token, and file-touched rollups
|
||||
- Kill and pause controls — cancel a specific subagent mid-flight without interrupting its siblings
|
||||
- Post-hoc review: step through each subagent's turn-by-turn history even after they've returned to the parent
|
||||
|
||||
The classic CLI just prints `/agents` as a text summary; the TUI is where the overlay shines. See [TUI — Slash commands](/docs/user-guide/tui#slash-commands).
|
||||
|
||||
## Depth Limit and Nested Orchestration
|
||||
|
||||
By default, delegation is **flat**: a parent (depth 0) spawns children (depth 1), and those children cannot delegate further. This prevents runaway recursive delegation.
|
||||
|
|
|
|||
|
|
@ -21,7 +21,15 @@ When your main LLM provider encounters errors — rate limits, server overload,
|
|||
|
||||
### Configuration
|
||||
|
||||
Add a `fallback_model` section to `~/.hermes/config.yaml`:
|
||||
The easiest path is the interactive manager:
|
||||
|
||||
```bash
|
||||
hermes fallback
|
||||
```
|
||||
|
||||
`hermes fallback` reuses the provider picker from `hermes model` — same provider list, same credential prompts, same validation. Press `a` to add a fallback, `↑`/`↓` to reorder, `d` to remove, `q` to save and exit. Changes persist under `model.fallback_providers` in `config.yaml`.
|
||||
|
||||
If you'd rather edit the YAML directly, add a `fallback_model` section to `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
fallback_model:
|
||||
|
|
@ -31,6 +39,10 @@ fallback_model:
|
|||
|
||||
Both `provider` and `model` are **required**. If either is missing, the fallback is disabled.
|
||||
|
||||
:::note `fallback_model` vs `fallback_providers`
|
||||
`fallback_model` (singular) is the legacy single-fallback key — Hermes still honors it for back-compat. `fallback_providers` (plural, list) supports multiple fallbacks tried in order; `hermes fallback` writes to this key. When both are set, Hermes merges them with `fallback_providers` taking priority.
|
||||
:::
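One plausible reading of the merge rule is "list entries first, then the legacy entry if it adds something new". The helper below sketches that reading; `merged_fallbacks`, the assumption that list entries use the same `provider`/`model` shape as `fallback_model`, and the de-duplication step are all illustrative rather than confirmed behavior.

```python
def merged_fallbacks(config):
    """Return the ordered fallback chain from both config keys (illustrative)."""
    entries = list(config.get("fallback_providers") or [])
    legacy = config.get("fallback_model")
    # The legacy key only counts when both provider and model are present.
    if legacy and legacy.get("provider") and legacy.get("model"):
        if legacy not in entries:
            entries.append(legacy)  # lower priority than the list entries
    return entries


config = {
    "fallback_providers": [{"provider": "openrouter", "model": "model-a"}],  # placeholder names
    "fallback_model": {"provider": "nous", "model": "model-b"},
}
print(merged_fallbacks(config))
```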
|
||||
|
||||
### Supported Providers
|
||||
|
||||
| Provider | Value | Requirements |
|
||||
|
|
|
|||
|
|
@ -385,6 +385,8 @@ def register(ctx):
|
|||
| [`pre_gateway_dispatch`](#pre_gateway_dispatch) | Gateway received a user message, before auth + dispatch | `{"action": "skip" \| "rewrite" \| "allow", ...}` to influence flow |
|
||||
| [`pre_approval_request`](#pre_approval_request) | Dangerous command needs user approval, before the prompt/notification is sent | ignored |
|
||||
| [`post_approval_response`](#post_approval_response) | User responded to an approval prompt (or it timed out) | ignored |
|
||||
| [`transform_tool_result`](#transform_tool_result) | After any tool returns, before the result is handed back to the model | `str` to replace the result, `None` to leave unchanged |
|
||||
| [`transform_terminal_output`](#transform_terminal_output) | Inside the `terminal` tool, before truncation/ANSI-strip/redact | `str` to replace the raw output, `None` to leave unchanged |
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -1003,6 +1005,94 @@ def register(ctx):

---

### `transform_tool_result`

Fires **after** a tool returns and **before** the result is appended to the conversation. Lets a plugin rewrite any tool's result string, not just terminal output, before the model sees it.

**Callback signature:**

```python
def my_callback(
    tool_name: str,
    arguments: dict,
    result: str,
    task_id: str | None,
    **kwargs,
) -> str | None:
    ...
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `tool_name` | `str` | Tool that produced the result (`read_file`, `web_extract`, `delegate_task`, …). |
| `arguments` | `dict` | Arguments the model called the tool with. |
| `result` | `str` | The tool's raw result string, post-truncation and post-ANSI-strip. |
| `task_id` | `str \| None` | Task/session ID when running inside RL/benchmark environments. |

**Return value:** `str` to replace the result (the returned string is what the model sees), `None` to leave it unchanged.

**Use cases:** Redact organization-specific PII from `web_extract` output, wrap long JSON tool responses in a summary header, inject retrieval-augmented hints into `read_file` results, rewrite `delegate_task` subagent reports into a project-specific schema.

```python
import re

# Matches key-shaped tokens: "sk-" followed by 32 or more alphanumerics.
SECRET = re.compile(r"sk-[A-Za-z0-9]{32,}")


def redact_secrets(tool_name, result, **kwargs):
    # Return a replacement only when something was actually redacted.
    if SECRET.search(result):
        return SECRET.sub("[REDACTED]", result)
    return None  # None leaves the result unchanged


def register(ctx):
    ctx.register_hook("transform_tool_result", redact_secrets)
```

Applies to every tool. For terminal-only rewriting see `transform_terminal_output` below; it's narrower and runs earlier in the pipeline (pre-truncation, pre-redaction).
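
The "summary header" use case listed above could look roughly like the following. Treat it as a sketch under stated assumptions: the `MAX_CHARS` threshold and the header wording are invented for the example, and only the hook name and the `(tool_name, result, **kwargs)` callback shape come from this section.

```python
import json

MAX_CHARS = 2_000  # illustrative threshold, not a Hermes default


def summarize_json(tool_name, result, **kwargs):
    """Prefix long JSON results with a one-line summary header."""
    if len(result) <= MAX_CHARS:
        return None  # short results pass through unchanged
    try:
        data = json.loads(result)
    except ValueError:
        return None  # not JSON: leave unchanged
    if isinstance(data, dict):
        shape = f"object with {len(data)} top-level keys"
    elif isinstance(data, list):
        shape = f"array with {len(data)} items"
    else:
        return None
    # The original payload is kept intact below the header.
    return f"[{tool_name}: JSON {shape}, {len(result)} chars]\n{result}"


def register(ctx):
    ctx.register_hook("transform_tool_result", summarize_json)
```

Returning `None` on every path that does not rewrite keeps the hook cheap for the common case and avoids touching non-JSON results.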

---

### `transform_terminal_output`

Fires inside the `terminal` tool's foreground-output pipeline, **before** the default 50 KB truncation, ANSI strip, and secret redaction. Lets plugins rewrite the raw stdout/stderr of a shell command before any downstream processing touches it.

**Callback signature:**

```python
def my_callback(
    command: str,
    output: str,
    exit_code: int,
    cwd: str,
    task_id: str | None,
    **kwargs,
) -> str | None:
    ...
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `command` | `str` | The shell command that produced the output. |
| `output` | `str` | Raw combined stdout/stderr (may be very large; truncation happens after the hook). |
| `exit_code` | `int` | Process exit code. |
| `cwd` | `str` | Working directory the command ran in. |
| `task_id` | `str \| None` | Task/session ID when running inside RL/benchmark environments. |

**Return value:** `str` to replace the output, `None` to leave it unchanged.

**Use cases:** Inject summaries for commands that produce massive output (`du -ah`, `find`, `tree`), tag output with a project-specific marker so downstream hooks know how to handle it, strip timing noise that flaps between runs and defeats prompt caching.

```python
def summarize_find(command, output, **kwargs):
    # Only rewrite oversized `find` output; everything else passes through.
    if command.startswith("find ") and len(output) > 50_000:
        paths = output.splitlines()
        head = "\n".join(paths[:40])
        return f"{head}\n\n[summary: {len(paths)} paths total, showing first 40]"
    return None


def register(ctx):
    ctx.register_hook("transform_terminal_output", summarize_find)
```

Pairs well with `transform_tool_result`, which sees the final result string of every tool.
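
The "strip timing noise" use case can be sketched in the same style. The regular expression below is an assumption chosen for the example (it only recognizes phrases such as `in 1.23s` or `took 45 ms`); real tools print durations in many other shapes.

```python
import re

# Matches durations such as "in 1.23s" or "took 45 ms" (illustrative pattern).
TIMING = re.compile(r"\b(in|took)\s+\d+(?:\.\d+)?\s?(?:ms|s)\b")


def strip_timing(command, output, **kwargs):
    """Replace run-to-run timing values with a stable placeholder."""
    cleaned = TIMING.sub(r"\1 <time>", output)
    # Return None when nothing changed so the output is left untouched.
    return cleaned if cleaned != output else None


def register(ctx):
    ctx.register_hook("transform_terminal_output", strip_timing)
```

Because the placeholder is constant, two runs that differ only in elapsed time produce identical text, which is the property that matters for prompt caching.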

---

## Shell Hooks

Declare shell-script hooks in your `cli-config.yaml` and Hermes will run them as subprocesses whenever the corresponding plugin-hook event fires, in both CLI and gateway sessions. No Python plugin authoring required.

@ -135,13 +135,15 @@ Local transcription works out of the box when `faster-whisper` is installed. If
```yaml
# In ~/.hermes/config.yaml
stt:
  provider: "local"  # "local" | "groq" | "openai" | "mistral" | "xai"
  local:
    model: "base"  # tiny, base, small, medium, large-v3
  openai:
    model: "whisper-1"  # whisper-1, gpt-4o-mini-transcribe, gpt-4o-transcribe
  mistral:
    model: "voxtral-mini-latest"  # voxtral-mini-latest, voxtral-mini-2602
  xai:
    model: "grok-stt"  # xAI Grok STT
```

### Provider Details

@ -162,6 +164,8 @@ stt:

**Mistral API (Voxtral Transcribe).** Requires `MISTRAL_API_KEY`. Uses Mistral's [Voxtral Transcribe](https://docs.mistral.ai/capabilities/audio/speech_to_text/) models. Supports 13 languages, speaker diarization, and word-level timestamps. Install with `pip install hermes-agent[mistral]`.

**xAI Grok STT.** Requires `XAI_API_KEY`. Posts to `https://api.x.ai/v1/stt` as multipart/form-data. Good choice if you're already using xAI for chat or TTS and want one API key for everything. Auto-detection order puts it after Groq; set `stt.provider: xai` explicitly to force it.

**Custom local CLI fallback.** Set `HERMES_LOCAL_STT_COMMAND` if you want Hermes to call a local transcription command directly. The command template supports `{input_path}`, `{output_dir}`, `{language}`, and `{model}` placeholders.

### Fallback Behavior

@ -189,3 +189,16 @@ Image paste works with any vision-capable model. The image is sent as a base64-e
```

Most modern models support this format, including GPT-4 Vision, Claude (with vision), Gemini, and open-source multimodal models served through OpenRouter.

## Image Routing (Vision-Capable vs Text-Only Models)

When a user attaches an image (from the CLI clipboard, the gateway as a Telegram/Discord photo, or any other entry point), Hermes routes it based on whether your current model actually supports vision:

| Your model | What happens to the image |
|---|---|
| **Vision-capable** (GPT-4V, Claude with vision, Gemini, Qwen-VL, MiMo-VL, etc.) | Sent as **real pixels** using the provider's native image content format above. No text summary layer. |
| **Text-only** (DeepSeek V3, smaller open-source models, older chat-only endpoints) | Routed through the `vision_analyze` auxiliary tool: an auxiliary vision model describes the image, and the text description is injected into the conversation. |

You don't configure this: Hermes looks up your current model's capability in the provider metadata and picks the right path automatically. The practical effect is that you can switch between vision and non-vision models mid-session and image handling "just works" without changing your workflow. Text-only models get coherent context about the image rather than a broken multimodal payload they'd have to reject.

Which auxiliary model handles the text-description path is configurable under `auxiliary.vision`; see [Auxiliary Models](/docs/user-guide/configuration#auxiliary-models).
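
The two routing paths can be sketched as a single decision function. Everything named here is hypothetical: `ModelInfo`, `route_image`, the `describe` callable standing in for the auxiliary vision model, and the OpenAI-style `image_url` content part are assumptions, not Hermes internals.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelInfo:
    name: str
    supports_vision: bool  # stands in for the provider-metadata lookup


def route_image(model: ModelInfo, image_b64: str,
                describe: Callable[[str], str]) -> dict:
    """Build one message content part for an attached image (illustrative)."""
    if model.supports_vision:
        # Vision-capable: pass the pixels through as an image content part.
        return {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{image_b64}"},
        }
    # Text-only: an auxiliary vision model turns the image into text first.
    return {"type": "text", "text": f"[Image description: {describe(image_b64)}]"}
```

Keeping the decision in one place is what lets a session switch models between turns: the same attachment simply takes the other branch on the next message.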
@ -105,6 +105,8 @@ If `faster-whisper` is installed, voice mode works with **zero API keys** for ST

## CLI Voice Mode

Voice mode is available in both the **classic CLI** (`hermes chat`) and the **TUI** (`hermes --tui`). Behavior is identical across both: the same slash commands, VAD silence detection, streaming TTS, and hallucination filter. The TUI additionally forwards crash-forensic logs to `~/.hermes/logs/` so push-to-talk failures on exotic audio backends can be reported with a full stack trace rather than disappearing silently.

### Quick Start

Start the CLI and enable voice mode:

@ -129,9 +129,25 @@ Optional behavior settings in `~/.hermes/config.yaml`:

```yaml
group_sessions_per_user: true

gateway:
  platforms:
    dingtalk:
      extra:
        # Require @mention in groups before the bot replies (parity with Slack/Telegram/Discord).
        # DMs ignore this; the bot always replies in 1:1 chats.
        require_mention: true

        # Per-platform allowlist. When set, only these DingTalk user IDs can interact with the bot
        # (same semantics as DINGTALK_ALLOWED_USERS, but scoped here instead of in .env).
        allowed_users:
          - user-id-1
          - user-id-2
```

- `group_sessions_per_user: true` keeps each participant's context isolated inside shared group chats
- `require_mention: true` prevents the bot from responding to every group message; it only answers when someone @-mentions it
- `allowed_users` under `dingtalk.extra` is an alternative to `DINGTALK_ALLOWED_USERS`; if both are set, they're merged

### Start the Gateway

@ -482,6 +482,34 @@ Hermes automatically registers installed skills as **native Discord Application

No extra configuration is needed: any skill installed via `hermes skills install` is automatically registered as a Discord slash command on the next gateway restart.

### Disabling Slash Command Registration

If you run multiple Hermes gateways against the same Discord application (e.g. staging + production), only one of them should own the global slash-command registration; otherwise the last startup wins and the registrations flap. Turn slash registration off on the "follower" gateway:

```yaml
gateway:
  platforms:
    discord:
      extra:
        slash_commands: false  # default: true
```

Leaving this at `true` on the "primary" gateway keeps the normal behavior: global `/`-menu commands for built-ins and installed skills.

## Sending Media (`send_message` + `MEDIA:` tags)

The Discord adapter supports native file uploads for every common media type via the `send_message` tool and inline `MEDIA:/path/to/file` tags emitted by the agent:

| Type | How it's delivered |
|---|---|
| Images (PNG/JPG/WebP) | Native Discord image attachment with inline preview |
| Animated GIFs | `send_animation` uploads as `animation.gif` so Discord plays it inline (not as a static thumbnail) |
| Video (MP4/MOV) | `send_video`: native video player |
| Audio / Voice | `send_voice`: native voice message when possible, file attachment otherwise |
| Documents (PDF/ZIP/docx/etc.) | `send_document`: native attachment with download button |

Discord's per-upload size limit depends on the server's boost tier (25 MB free, up to 500 MB). If Hermes gets an HTTP 413, the adapter falls back to a link pointing at the local cache path rather than failing silently.

## Home Channel

You can designate a "home channel" where the bot sends proactive messages (such as cron job output, reminders, and notifications). There are two ways to set it:

@ -168,6 +168,16 @@ All outgoing media goes through Signal's standard attachment API. Unlike some pl

Attachment size limit: **100 MB** (both directions).

### Native Formatting, Reply Quotes, and Reactions

Signal messages render with **native formatting** instead of literal markdown characters. The adapter converts markdown (`**bold**`, `*italic*`, `` `code` ``, `~~strike~~`, `||spoiler||`, headings) into Signal `bodyRanges` so the text shows up with real styling on the recipient's client rather than as visible `**` / `` ` `` characters.

**Reply quotes.** When Hermes replies to a specific message, it now posts a native reply that quotes the original, the same UI affordance Signal users see when they use "Reply" themselves. This is automatic for replies generated in response to an inbound message.

**Reactions.** The agent can react to messages via the standard reaction API; reactions surface in Signal as emoji reactions on the referenced message rather than as extra text.

None of this requires additional config; it ships on by default in recent signal-cli builds. If your `signal-cli` version is too old, Hermes falls back to plaintext delivery and logs a one-time warning.

### Typing Indicators

The bot sends typing indicators while processing messages, refreshing every 8 seconds.

@ -347,6 +347,14 @@ slack:
  # but you can set this explicitly for consistency with other platforms)
  require_mention: true

  # Prevent thread auto-engagement: only reply to channel messages that
  # contain an explicit @mention. With this OFF (default), Slack can
  # "auto-engage": remembering past mentions in a thread and following
  # up on bot-message replies, and resuming active sessions without a
  # fresh mention. With strict_mention ON, every new channel message
  # must @mention the bot before Hermes will respond.
  strict_mention: false

  # Custom mention patterns that trigger the bot
  # (in addition to the default @mention detection)
  mention_patterns:
@ -357,6 +365,10 @@ slack:
  reply_prefix: ""
```

:::tip When to use `strict_mention`
Set this to `true` in busy workspaces where Slack's default "the bot remembers this thread" behavior surprises users, for example a long tech-support thread where the bot helped at the start and you'd rather it stay silent unless explicitly pinged again. DMs and active interactive sessions are unaffected.
:::

:::info
Slack supports both patterns: `@mention` required to start a conversation by default, but you can opt specific channels out via `SLACK_FREE_RESPONSE_CHANNELS` (comma-separated channel IDs) or `slack.free_response_channels` in `config.yaml`. Once the bot has an active session in a thread, subsequent thread replies do not require a mention. In DMs the bot always responds without needing a mention.
:::

@ -144,6 +144,22 @@ Then:
If you already have a `docker_volumes:` section, add the new mount to the same list. YAML duplicate keys silently override earlier ones.

### Supported `MEDIA:` file extensions

The gateway extracts `MEDIA:/path/to/file` tags from agent replies and ships the referenced file as a platform-native attachment. Supported extensions across all gateway platforms:

| Category | Extensions |
|---|---|
| Images | `png`, `jpg`, `jpeg`, `gif`, `webp`, `bmp`, `tiff`, `svg` |
| Audio | `mp3`, `wav`, `ogg`, `m4a`, `opus`, `flac`, `aac` |
| Video | `mp4`, `mov`, `webm`, `mkv`, `avi` |
| **Documents** | `pdf`, `txt`, `md`, `csv`, `json`, `xml`, `html`, `yaml`, `yml`, `log` |
| **Office** | `docx`, `xlsx`, `pptx`, `odt`, `ods`, `odp` |
| **Archives** | `zip`, `rar`, `7z`, `tar`, `gz`, `bz2` |
| **Books / packages** | `epub`, `apk`, `ipa` |

Anything on this list is delivered as a native attachment on platforms that support it (Telegram, Discord, Signal, Slack, WhatsApp, Feishu, Matrix, etc.); on platforms without native support it falls back to a link or plain-text indicator. The **bold** categories were added in the last few releases. If you were relying on the model saying `here is the file: /path/to/report.docx` instead, swap to `MEDIA:/path/to/report.docx` for native delivery.
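
To make the tag mechanics concrete, here is a minimal sketch of pulling `MEDIA:` tags out of a reply. The function name, the regular expression, and the decision to leave unsupported tags in the text are assumptions for illustration; the gateway's real parser may differ, and this version does not handle paths containing spaces.

```python
import re
from pathlib import PurePosixPath

# A subset of the extension table, kept short for the example.
SUPPORTED = {"png", "jpg", "jpeg", "gif", "webp", "mp3", "mp4", "pdf", "docx", "zip"}

MEDIA_TAG = re.compile(r"MEDIA:(/\S+)")


def extract_media(reply: str) -> tuple[str, list[str]]:
    """Return the reply without supported MEDIA tags, plus the extracted paths."""
    paths: list[str] = []

    def _collect(match: re.Match) -> str:
        path = match.group(1)
        ext = PurePosixPath(path).suffix.lstrip(".").lower()
        if ext in SUPPORTED:
            paths.append(path)
            return ""  # remove the tag from the visible text
        return match.group(0)  # unsupported extension: leave the tag as-is

    text = MEDIA_TAG.sub(_collect, reply)
    return text.strip(), paths
```

For example, `extract_media("Report ready. MEDIA:/tmp/report.docx")` yields the cleaned text plus one path that an adapter could then upload as a document.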

## Webhook Mode

By default, Hermes connects to Telegram using **long polling**: the gateway makes outbound requests to Telegram's servers to fetch new updates. This works well for local and always-on deployments.

@ -451,6 +467,50 @@ To find a topic's `thread_id`, open the topic in Telegram Web or Desktop and loo
- **Privacy policy:** Telegram now requires bots to have a privacy policy. Set one via BotFather with `/setprivacy_policy`, or Telegram may auto-generate a placeholder. This is particularly important if your bot is public-facing.
- **Message streaming:** Bot API 9.x added support for streaming long responses, which can improve perceived latency for lengthy agent replies.

## Rendering: Tables and Link Previews

Telegram's MarkdownV2 has no native table syntax; pipe tables render as backslash-escaped noise if passed through raw. Hermes normalizes markdown tables automatically:

- **Small tables** are flattened into **row-group bullets**: each row becomes a readable bulleted list under the column headings. Good for 2–4 columns and short cells.
- **Larger or wider tables** fall back to a **fenced code block** with aligned columns so nothing collapses. A one-line prompt hint is added so the agent knows to prefer prose follow-ups over more tables on Telegram.

No configuration is required: the adapter picks the right fallback per message. If you want the legacy "always code-block" behavior, disable table normalization by setting `telegram.pretty_tables: false` in `config.yaml` (default: `true`).
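
The row-group flattening can be illustrated with a few lines of Python. The exact bullet layout Hermes emits is not specified here, so the output format below is an assumption, as are the function name and the sample table; escaped pipes (`\|`) inside cells are not handled.

```python
def table_to_bullets(markdown_table: str) -> str:
    """Flatten a pipe table into one bullet group per row (illustrative)."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in markdown_table.strip().splitlines()
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator
    out = []
    for row in body:
        out.append(f"• {row[0]}")  # first column acts as the row label
        for name, cell in zip(header[1:], row[1:]):
            out.append(f"    - {name}: {cell}")
    return "\n".join(out)


# Made-up sample data for the demonstration.
table = """
| Tool | Status | Notes |
|---|---|---|
| web | ok | none |
| terminal | slow | retry |
"""
print(table_to_bullets(table))
```

Each row becomes a labelled group, so nothing depends on column alignment, which is the property that makes the result readable in a chat client.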

**Link previews.** Telegram auto-generates link previews for URLs in bot messages. If you'd rather suppress those (long `/tools` output, agent reply that mentions ten links, etc.):

```yaml
gateway:
  platforms:
    telegram:
      extra:
        disable_link_previews: true
```

When enabled, Hermes attaches Telegram's `LinkPreviewOptions(is_disabled=True)` to every outgoing message and falls back to the legacy `disable_web_page_preview` parameter on older `python-telegram-bot` versions.

## Group Allowlisting by Chat ID

In addition to per-user access control via `TELEGRAM_ALLOWED_USERS`, you can allowlist entire group chats (and forum topics) by their numeric chat ID. Useful for team/support bots where any group member should be able to chat, but only in certain groups or topics.

```yaml
gateway:
  platforms:
    telegram:
      extra:
        group_allowed_chats:
          - -1001234567890     # supergroup: all members allowed
          - -1001234567891/42  # supergroup + forum thread_id 42 only
```

Equivalent env var: `TELEGRAM_GROUP_ALLOWED_USERS="-1001234567890,-1001234567891/42"` (comma-separated; the `/<thread_id>` suffix is optional).

Behavior:

- A chat that appears in `group_allowed_chats` bypasses `TELEGRAM_ALLOWED_USERS` for its members: anyone in the group can interact with the bot.
- Omit the `/<thread_id>` suffix to allow the whole group; include it to allow just that forum topic.
- DMs still require the user ID to be in `TELEGRAM_ALLOWED_USERS`.
- This layers cleanly on top of `group_topics` (for topic-scoped skill binding) and `ignored_threads` (for silencing specific topics).

## Interactive Model Picker

When you send `/model` with no arguments in a Telegram chat, Hermes shows an interactive inline keyboard for switching models:

@ -65,9 +65,31 @@ The `/yolo` command is a **toggle** — each use flips the mode on or off:
YOLO mode is available in both CLI and gateway sessions. Internally, it sets the `HERMES_YOLO_MODE` environment variable which is checked before every command execution.

:::danger
YOLO mode disables **all** dangerous command safety checks for the session, **except** the hardline blocklist (see below). Use only when you fully trust the commands being generated (e.g., well-tested automation scripts in disposable environments).
:::

### Hardline Blocklist (Always-On Floor)

Some commands are so catastrophic (irreversible filesystem wipes, fork bombs, direct block-device writes) that Hermes refuses to run them **regardless** of:

- `--yolo` / `/yolo` toggled on
- `approvals.mode: off`
- Cron jobs running in headless `approve` mode
- User explicitly clicking "allow always"

The blocklist is the floor below `--yolo`. It trips **before** the approval layer even sees the command, and there's no override flag. Patterns currently covered (not exhaustive; kept in sync with `tools/approval.py::UNRECOVERABLE_BLOCKLIST`):

| Pattern | Why it's hardline |
|---|---|
| `rm -rf /` and obvious variants | Wipes the filesystem root |
| `rm -rf --no-preserve-root /` | The explicit "yes I mean root" variant |
| `:(){ :\|:& };:` (bash fork bomb) | Pegs the host until reboot |
| `mkfs.*` on a mounted root device | Formats the live system |
| `dd if=/dev/zero of=/dev/sd*` | Zeroes a physical disk |
| Piping untrusted URLs to `sh` at the rootfs top level | Remote-code-execution attack vector too broad to approve |

If you hit the blocklist, the tool call returns an explanatory error to the agent and nothing runs. If a legitimate workflow needs one of these commands (you're the operator of a wipe-and-reinstall pipeline, for example), run it outside the agent.
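
The ordering is the important part: the floor is evaluated before YOLO or any approval setting is consulted. A toy sketch of that ordering follows. The two regular expressions are simplified stand-ins written for this example (they are not the contents of the real blocklist), and `check_command` with its return values is hypothetical.

```python
import re

# Two simplified, illustrative patterns; the real list is maintained elsewhere.
HARDLINE = [
    re.compile(r"\brm\s+-rf\s+(--no-preserve-root\s+)?/\s*$"),
    re.compile(r"\bdd\s+.*\bof=/dev/sd[a-z]"),
]


def check_command(command: str, yolo: bool = False) -> str:
    """Return 'blocked', 'run', or 'approval_layer' (illustrative)."""
    # The floor is checked first, so no later setting can override it.
    if any(pattern.search(command) for pattern in HARDLINE):
        return "blocked"
    if yolo:
        return "run"  # YOLO skips the approval layer entirely
    return "approval_layer"  # normal dangerous-command checks happen next
```

Because the blocklist check has no parameters other than the command itself, there is nothing for a flag, config value, or stored "allow always" decision to influence.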

### Approval Timeout

When a dangerous command prompt appears, the user has a configurable amount of time to respond. If no response is given within the timeout, the command is **denied** by default (fail-closed).

@ -479,7 +501,20 @@ All URL-capable tools (web search, web extract, vision, browser) validate URLs b
- **Cloud metadata hostnames**: `metadata.google.internal`, `metadata.goog`
- **Reserved, multicast, and unspecified addresses**

SSRF protection is always active for internet-facing use and DNS failures are treated as blocked (fail-closed). Redirect chains are re-validated at each hop to prevent redirect-based bypasses.

#### Intentionally allowing private URLs

Some setups legitimately need private/internal URL access: home networks that resolve `home.arpa` to RFC 1918 space, LAN-only Ollama/llama.cpp endpoints, internal wikis, cloud metadata debugging, and the like. For those cases there's a global opt-out:

```yaml
security:
  allow_private_urls: true  # default: false
```

When on, web tools, the browser, vision URL fetches, and gateway media downloads no longer reject RFC 1918 / loopback / link-local / CGNAT / cloud-metadata destinations. **This is a deliberate trust boundary.** Only enable it on machines where it is an acceptable risk for the agent to fetch arbitrary, possibly prompt-injected URLs on the local network. Public-facing gateways should leave it off.

The host-substring guard (which blocks lookalike Unicode domain tricks even when the underlying IP is public) stays on regardless of this setting.
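
For intuition about what the flag changes, the address classes named above map fairly directly onto Python's standard `ipaddress` module. The function below is a sketch, not Hermes's validator: `is_blocked` is a hypothetical name, and keeping reserved, multicast, and unspecified addresses blocked even when the flag is on is an assumption of this example.

```python
import ipaddress

CGNAT = ipaddress.ip_network("100.64.0.0/10")  # RFC 6598 shared address space


def is_blocked(ip: str, allow_private_urls: bool = False) -> bool:
    """Return True when a resolved IP should be rejected (illustrative)."""
    addr = ipaddress.ip_address(ip)
    # Assumption: these classes stay blocked whatever the flag says.
    if addr.is_multicast or addr.is_unspecified or addr.is_reserved:
        return True
    internal = (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr in CGNAT
    )
    # Internal destinations are rejected unless the operator opted in.
    return internal and not allow_private_urls
```

With the default `allow_private_urls=False`, `10.0.0.5` and `169.254.169.254` are rejected while a public address such as `8.8.8.8` passes; flipping the flag lets the internal ones through.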

### Tirith Pre-Exec Security Scanning

@ -76,6 +76,8 @@ Keybindings match the [Classic CLI](cli.md#keybindings) exactly. The only behavi
- **`Cmd+V` / `Ctrl+V`** first tries normal text paste, then falls back to OSC52/native clipboard reads, and finally image attach when the clipboard or pasted payload resolves to an image.
- **`/terminal-setup`** installs local VS Code / Cursor / Windsurf terminal bindings for better `Cmd+Enter` and undo/redo parity on macOS.
- **Slash autocompletion** opens as a floating panel with descriptions, not an inline dropdown.
- **`Ctrl+X`**: when a queued message is highlighted (sent while the agent was still running), delete it from the queue. **`Esc`** cancels editing and unhighlights without deleting.
- **`Ctrl+G` / `Ctrl+X Ctrl+E`**: open the current input buffer in `$EDITOR` for multi-line / long-prompt composition; save-and-exit sends the contents back as the prompt.

## Slash commands

@ -89,9 +91,56 @@ All slash commands work unchanged. A few are TUI-owned — they produce richer o
| `/skin` | Live preview: theme change applies as you browse |
| `/details` | Toggle verbose tool-call details (global or per-section) |
| `/usage` | Rich token / cost / context panel |
| `/agents` (alias `/tasks`) | Observability overlay: live subagent tree with kill/pause controls, per-branch cost / token / file rollups, turn-by-turn history |
| `/reload` | Re-reads `~/.hermes/.env` into the running TUI process so newly added API keys take effect without a restart |
| `/mouse` | Toggle mouse tracking on/off at runtime (also persists to `display.mouse_tracking` in `config.yaml`) |

Every other slash command (including installed skills, quick commands, and personality toggles) works identically to the classic CLI. See [Slash Commands Reference](../reference/slash-commands.md).

## LaTeX math rendering

The TUI's markdown pipeline renders LaTeX math: inline `$E = mc^2$` and block `$$\frac{a}{b}$$` both appear as Unicode-formatted math instead of raw TeX source. Unsupported syntax falls back to the literal TeX wrapped in a code span so it remains copyable.

This is always on; there is nothing to configure. The classic CLI keeps the raw TeX.

## Light-terminal detection

The TUI auto-detects light terminals and swaps to the light theme accordingly. Detection works in three layers:

1. `HERMES_TUI_THEME` env var (highest priority). Values: `light`, `dark`, or a raw 6-char background hex (e.g. `ffffff`, `1a1a2e`).
2. `COLORFGBG` env var, the classic "what's my background color?" hint used by xterm-derived terminals.
3. Terminal background probe via OSC 11, which works on modern terminals (Ghostty, Warp, iTerm2, WezTerm, Kitty) that don't set `COLORFGBG`.

If you want the light theme permanently regardless of terminal:

```bash
export HERMES_TUI_THEME=light
```
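
The three detection layers can be read as a simple priority chain. The sketch below follows that order but is not the TUI's implementation: the luminance cutoff, the "background 7 or 15 means light" reading of `COLORFGBG`, the injected `probe_osc11` callable, and the dark default are all assumptions of this example.

```python
import os


def luminance_is_light(hex_rgb: str) -> bool:
    """Classify a 6-char hex background as light (raises ValueError on bad hex)."""
    r, g, b = (int(hex_rgb[i:i + 2], 16) for i in (0, 2, 4))
    # Rec. 601 luma weights; the 0.5 cutoff is an assumption of this sketch.
    return (0.299 * r + 0.587 * g + 0.114 * b) / 255 > 0.5


def detect_theme(env=None, probe_osc11=lambda: None) -> str:
    env = os.environ if env is None else env
    # 1. Explicit override: "light", "dark", or a 6-char background hex.
    override = env.get("HERMES_TUI_THEME", "").strip().lower()
    if override in ("light", "dark"):
        return override
    if len(override) == 6:
        try:
            return "light" if luminance_is_light(override) else "dark"
        except ValueError:
            pass  # not valid hex: fall through to the next layer
    # 2. COLORFGBG looks like "fg;bg"; the last field is the background index.
    bg = env.get("COLORFGBG", "").rsplit(";", 1)[-1]
    if bg.isdigit():
        return "light" if int(bg) in (7, 15) else "dark"
    # 3. OSC 11 probe, injected as a callable returning a 6-char hex or None.
    probed = probe_osc11()
    if probed:
        return "light" if luminance_is_light(probed) else "dark"
    return "dark"
```

Injecting the environment and the probe as parameters keeps the function testable without a real terminal attached.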

## Busy indicator styles

The status-bar FaceTicker is pluggable: the default rotates Hermes' kawaii face palette every 2.5 seconds during agent work. Pick a different style (or `none` for a minimal dot) via config:

```yaml
display:
  busy_indicator:
    style: kawaii  # kawaii | minimal | dots | wings | none
```

Styles ship with matched glyph widths so the rest of the status bar doesn't jitter on rotation.

## Auto-resume

By default, `hermes --tui` starts a fresh session each launch. To re-attach to the most recent TUI session automatically (useful when your terminal or SSH connection drops unexpectedly), opt in:

```bash
export HERMES_TUI_RESUME=1  # most-recent TUI session
# or:
export HERMES_TUI_RESUME=<session-id>  # specific session
```

Unset the variable or pass `--resume <id>` explicitly to override on a per-launch basis.

## Status line

The TUI's status line tracks agent state in real time:

@ -106,6 +155,11 @@ The TUI's status line tracks agent state in real time:

The per-skin status-bar colors and thresholds are shared with the classic CLI; see [Skins](features/skins.md) for customization.

The status line also shows:

- **Working directory with git branch**, for example `~/projects/hermes-agent (docs/two-week-gap-sweep)`. The branch suffix updates when you `git checkout` in a side terminal (mtime-cached) so the TUI reflects your actual active branch, not whatever it was at launch.
- **Per-prompt elapsed time**, shown as `⏱ 12s/3m 45s` while the turn is running (live) and frozen to `⏲ 32s / 3m 45s` after the turn completes. The first number is the time since the last user message and resets on every new prompt; the second is the total session duration.

## Configuration

The TUI respects all standard Hermes config: `~/.hermes/config.yaml`, profiles, personalities, skins, quick commands, credential pools, memory providers, tool/skill enablement. No TUI-specific config file exists.
