Drops the duplicate _FILE_MUTATING_TOOLS frozenset in run_agent.py and
imports the canonical FILE_MUTATING_TOOL_NAMES from
agent/tool_result_classification.py (aliased as _FILE_MUTATING_TOOLS to
avoid renaming the existing call sites). Prevents future drift if
another file-mutating tool is added — only one set needs updating.
No behavior change: same frozenset({'write_file', 'patch'}), and the
117 PR-scoped tests still pass.
`lsp` is registered as a top-level subparser in `main()` (lines 9539-9545)
via `agent.lsp.cli.register_subparser`, so it shows up in `hermes --help`
output alongside the other built-ins. The `_BUILTIN_SUBCOMMANDS` set used
by `_plugin_cli_discovery_needed` to short-circuit the ~500-650ms plugin
import pass did not list it, so every `hermes lsp ...` invocation paid
the full discovery cost despite being a fully-built-in command.
This is also caught by the parity guard added in #22120:
`tests/hermes_cli/test_startup_plugin_gating.py::test_builtin_set_covers_every_registered_subcommand`
has been failing on clean origin/main with:
AssertionError: _BUILTIN_SUBCOMMANDS is missing these live
subcommands: ['lsp']. Add them to hermes_cli/main.py::_BUILTIN_SUBCOMMANDS
so plugin discovery can be skipped when the user targets them.
Fix: add `"lsp"` to the frozenset (alphabetical position between `logs`
and `mcp`). The accompanying `test_builtin_set_has_no_phantom_entries`
guard still passes because `lsp` is genuinely live — registered via the
guarded `try/except Exception` in main() since #24168.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Dockerfile permissions section made /opt/hermes/.venv readable but not
writable by the hermes runtime user. Since the 2026-05-12 policy change
moved messaging packages (discord.py, telegram, slack, etc.) out of [all]
and into lazy_deps.py, the Docker image no longer ships with them
pre-installed. At first gateway boot, lazy_deps.ensure() tries to
`uv pip install` them into the venv but fails with EACCES because
site-packages is root-owned.
The result: every messaging platform adapter silently fails to load inside
Docker containers, producing only a cryptic "discord.py not installed"
warning despite the gateway being correctly configured.
Two-part fix:
1. Dockerfile: add /opt/hermes/.venv to the existing chown -R hermes:hermes
line so the default (UID 10000) case works out of the box.
2. docker/entrypoint.sh: extend the needs_chown block to also re-chown the
.venv when HERMES_UID is remapped. Without this, the build-time chown
becomes stale when someone uses the documented HERMES_UID override in
docker-compose.yml.
Fixes#21536
Related: #17674, #21543, #21755
- Rename 'Alibaba Cloud (DashScope)' display label to 'Qwen Cloud'
in CANONICAL_PROVIDERS (model picker, /model, hermes model TUI) and
PROVIDER_REGISTRY (setup wizard prompts, status output).
- Move Qwen Cloud (alibaba) up to position 6 — directly below
OpenAI Codex and above Xiaomi MiMo.
- Move Qwen OAuth (Portal) (qwen-oauth) to the bottom of the
canonical provider list.
Provider slug 'alibaba' is unchanged — only the display label
moved. DashScope env var (DASHSCOPE_API_KEY) and base URL are
unchanged. The separate 'alibaba-coding-plan' plugin provider is
not affected.
* feat(nous): unified client=hermes-client-v<version> tag on every Portal request
Every Hermes request to Nous Portal now carries the same
client=hermes-client-v<__version__> tag (e.g. client=hermes-client-v0.13.0
on this release), sourced live from hermes_cli.__version__. The release
script's regex bump auto-aligns it on every release.
Centralized in agent/portal_tags.py and wired into all four call sites:
- NousProfile.build_extra_body (main agent loop, every chat completion)
- auxiliary_client.NOUS_EXTRA_BODY + _build_call_kwargs (aux client)
- run_agent.py compression-summary fallback path
- tools/web_tools.py web_extract fallback
Replaces the client=aux marker added in #24194 with the unified version
tag. Tests assert against the helper output (invariant) rather than the
literal string, so they don't need updating on every release.
* feat(nous): cover /goal judge and kanban specify aux paths
Two aux-using surfaces bypassed call_llm by invoking
client.chat.completions.create() directly without extra_body, so they
were missing the unified Portal client tag:
- hermes_cli/goals.py — /goal standing-goal judge
- hermes_cli/kanban_specify.py — kanban triage specifier
Both now pass extra_body=get_auxiliary_extra_body() or None so they
inherit the version tag when the aux client points at Nous Portal, and
emit nothing otherwise (no tag leak to OpenRouter/Anthropic auxes).
The long-lived prefix-cache layout split the system prompt into stable/
context/volatile blocks and re-derived them on every API call. The
volatile tier (timestamp + memory snapshot + USER profile) ticks per
turn, so the system message bytes mutated mid-conversation and broke
upstream prompt caches (OpenRouter, Nous Portal, Anthropic).
Diagnosed via live wire-format diffing: an 8-turn conversation showed
OLD layout flipping system block[1] sha mid-session at the minute
boundary, dropping cached_tokens to 0 on that turn (cumulative
66.6% vs 83.3% for the single-block layout). Hermes invariant:
history (system + all but the last 1-2 messages) must be static.
Fix: drop the long-lived layout entirely. Single layout everywhere —
system_and_3 with one cached system string built once on first turn,
replayed verbatim on every subsequent turn. Loses cross-session 1h
prefix caching for Claude (the feature that motivated the split), but
within-session caching now actually works on every provider.
Removed:
- run_agent.py: _use_long_lived_prefix_cache flag, _long_lived_cache_ttl,
_supports_long_lived_anthropic_cache method, the long-lived branch in
run_conversation, mark_tools_for_long_lived_cache call site
- agent/prompt_caching.py: apply_anthropic_cache_control_long_lived,
mark_tools_for_long_lived_cache, _mark_system_stable_block helper
- hermes_cli/config.py: prompt_caching.long_lived_prefix and
prompt_caching.long_lived_ttl config keys
- tests/agent/test_prompt_caching_live.py (entire file)
- tests/agent/test_prompt_caching.py: TestMarkToolsForLongLivedCache,
TestApplyAnthropicCacheControlLongLived
- tests/run_agent/test_anthropic_prompt_cache_policy.py:
TestSupportsLongLivedAnthropicCache
Targeted tests: 62/62 pass.
When switching models via /model, AIAgent._config_context_length was
never cleared, so the new model inherited the previous model's context
window instead of auto-detecting the correct one via
get_model_context_length().
Clear _config_context_length to None before the runtime field swap so
the full resolution chain (custom_providers per-model, endpoint probe,
models.dev, etc.) is re-evaluated for the newly selected model.
Closes#21509
The test_restart_command_while_busy_requests_drain_without_interrupt test
was asserting against a hardcoded emoji string that was valid before the
i18n migration. After gateway/run.py switched to t("gateway.draining",
count=N), the test sees the translated output (or the raw key when the
locale catalog isn't resolved in xdist workers).
Fix by asserting against t("gateway.draining", count=1) — this produces
the correct expected value regardless of whether the locale file is
available in the test environment.
Default timeout raised from 60s to 300s (5 minutes) to accommodate
slower systems like Unraid NAS. Configurable via WHATSAPP_NPM_INSTALL_TIMEOUT
environment variable.
The live adapter path in _send_via_adapter called adapter.send() without
passing thread_id, while the standalone fallback path correctly forwarded
it. For plugin platforms (google_chat, teams, irc, line) running with the
gateway in-process, this caused every threaded reply to land as a new
top-level message instead of continuing the thread.
Matches the pattern already used by _send_matrix_via_adapter and
_send_feishu: build metadata={"thread_id": thread_id} and pass it through.
The WeCom adapter's _listen_loop() automatically reconnects when the
WebSocket drops, but it never called _mark_connected() after a successful
reconnection. This left the runtime status file (gateway_state.json) stuck
in "disconnected" even though the adapter was fully operational again.
Add self._mark_connected() right after _open_connection() succeeds so
that the dashboard and health probes report the correct state.
Tested by forcing a WebSocket close via the heartbeat loop and verifying
that the status file updated from "disconnected" back to "connected".
The LINE adapter calls self.create_source(...) which raises
AttributeError on every inbound message — no such method exists.
The base PlatformAdapter exposes this factory as build_source(),
consistent with the IRC and Teams adapters.
Fixes#23728
GLM-family models (z-ai/glm-4.5-air, z-ai/glm-4.5-flash, etc.) exhibit
the same "describe-instead-of-call" failure mode that gpt/codex/gemini/
gemma/grok already trigger enforcement for. Without the injection,
free-tier GLM workers spawned by the kanban dispatcher routinely exit
cleanly (rc=0) without invoking kanban_complete or kanban_block,
producing the "protocol violation" error and triggering the dispatcher's
gave_up path.
Observed in real workloads: seven consecutive kanban tasks across three
GLM-tier profiles (shipbackend, frontend-engineer, backend-engineer) all
failed with the identical message:
worker exited cleanly (rc=0) without calling kanban_complete or
kanban_block — protocol violation
Re-running the same tasks on Claude Haiku immediately resolved them.
Adding "glm" to TOOL_USE_ENFORCEMENT_MODELS closes the gap so future
GLM-routed work receives the explicit "every response must contain a
tool call or final result" steering that already protects the other
enforcement-gated model families.
One-line change; no behavior change for non-GLM models.
PR #23458 introduced _send_message_with_thread_fallback() and applied it
to all control-style sends (send_update_prompt, send_approval_request,
send_model_picker_prompt), but the slash-confirm result message in
handle_callback_query still called self._bot.send_message directly.
In supergroups with stale message_thread_id on the callback's parent
message, this raises "Message thread not found" and silently swallows
the result text. Replace with the helper so the same retry-without-
thread-id logic applies.
Autostash creates refs/stash as a pointer to the latest stash commit, but
git stash apply/drop expect the symbolic ref format like stash@{0}, not
the raw commit SHA. Using the commit SHA causes: error: 'X is not a stash reference'
- Note that typescript-language-server pulls in the typescript SDK
automatically (peer-dep relationship was previously implicit and
caused initialize failures when the SDK was absent).
- Add a Troubleshooting entry for the new Backend warnings section
in hermes lsp status, with the shellcheck install commands across
apt / brew / scoop.
Reflects what shipped in PR #24630.
_session_info() used os.getcwd() which reflects the gateway process
working directory, not the user's actual working directory. This caused
the TUI status line to display incorrect paths (e.g. D:\HermesWork
instead of D:\Hermes\HermesWork) after agent turns that changed the
process cwd.
Align with session.create which already correctly reads TERMINAL_CWD
env var set by the CLI launcher.
In WSL2, sounddevice.query_devices() returns [] even when the
PulseAudio bridge is functional. The existing code already handled
the case where the query itself raises an exception, but it missed
the empty-list case.
This change treats an empty device list as non-fatal in WSL when
PULSE_SERVER is configured, matching the existing exception-handler
behavior.
Fixes: WSL users seeing 'No audio input/output devices detected'
even though paplay/arecord work fine.
Closes#23064
When Hermes connects to Signal via signal-cli in daemon mode (linked
device setup), group messages sent from the user's phone were silently
dropped. The syncMessage handler only processed events where
destinationNumber equals the bot's own number (Note to Self).
Group messages from linked devices carry a groupInfo.groupId instead of a
destinationNumber. Extend the condition to also pass through sync messages
that have a groupId, so group messages are promoted to dataMessage and
reach the agent.
PR #24151 routed Portal Qwen (qwen3.6-plus) through the prefix_and_2
long-lived cache layout, attaching {"type":"ephemeral","ttl":"1h"}
markers to the tools[-1] entry and the stable system-prefix block.
That layout works for Portal Claude because Anthropic / OpenRouter on
Anthropic routes honour 1h TTL — but Portal Qwen ultimately proxies to
Alibaba DashScope, which documents a single "ephemeral" TTL of 5
minutes on its Context Cache. The ttl="1h" qualifier is silently
dropped upstream, so the two highest-value breakpoints (tools array +
system prefix) never land. Only the rolling-window 5m markers on the
last 2 messages cache, which matches the observed ~25% read rate.
Fix: keep Portal Qwen on cache_control via _anthropic_prompt_cache_policy
returning (True, False), but drop it from _supports_long_lived_anthropic_cache
so it rides the standard system_and_3 5m layout (system + last 3 messages,
all at 5m). Same 4 breakpoints, all in a TTL the upstream actually honours.
Refs: https://www.alibabacloud.com/help/en/model-studio/context-cachehttps://openrouter.ai/docs/features/prompt-caching (Alibaba Qwen
section: "TTL: 5 minutes")
- _supports_long_lived_anthropic_cache: Portal scope narrowed back to Claude
- tests: flip the two qwen long-lived expectations to False, retitle
non_claude_non_qwen_rejected -> non_claude_rejected
Cron jobs using `deliver: whatsapp` were silently dropped because the
resolver's home-channel env var dict in cron/scheduler.py listed every
messaging platform except whatsapp. _resolve_delivery_targets() returned
[] and no message was sent — but jobs.json marked the run successful and
no log line surfaced the failure.
The gateway adapter and the send_message tool path both honored
WHATSAPP_HOME_CHANNEL correctly; only the cron path missed.
Adds 'whatsapp' -> 'WHATSAPP_HOME_CHANNEL' to _HOME_TARGET_ENV_VARS.
Verified end-to-end with multiple cron pings landing in WhatsApp
self-chat after the fix.
Fixes#22997
Tavily's /crawl endpoint requires Authorization: Bearer <key> in the header,
unlike /search and /extract which accept api_key in the JSON body.
Without the header, crawl returns 401 Unauthorized.
Xiaomi MiMo's /v1/models endpoint returns 401 even with a valid API key,
causing hermes doctor to falsely report 'invalid API key'.
Add a `supports_health_check` field to ProviderProfile (default True).
Providers whose /models endpoint doesn't support auth verification can
set it to False. The doctor's dynamic provider discovery now reads this
field instead of hardcoding True.
The xiaomi provider plugin sets supports_health_check=False.
_parse_target_ref() has no handler for XMPP JIDs (user@server or
room@conference.server), so they fall through to the final
`return None, None, False`. This causes send_message to fail when
targeting an XMPP chat by JID, since the JID is not numeric and
doesn't match any other platform pattern.
Add an explicit check for XMPP targets containing '@', matching the
existing Matrix pattern above it.
Salvage of #21063 — adds 'Weixin, and more' to module-level docstrings
in gateway/__init__.py, gateway/config.py, gateway/platforms/base.py
and the 'hermes gateway' subparser description.
Co-authored-by: wuwuzhijing <chuang.guo@hopechart.com>
Three follow-ups to PR #24168 found during live E2E testing on TS/bash files:
1. typescript-language-server now installs the typescript SDK (tsserver)
alongside it. Without that sibling install, initialize() failed with
"Could not find a valid TypeScript installation" and the server was
marked broken — no diagnostics ever reached the agent. New extra_pkgs
field on INSTALL_RECIPES makes that explicit and reusable for future
peer-dep cases.
2. _check_lint now treats "linter command exists on PATH but cannot
actually run" as skipped instead of error. The motivating case is
npx tsc when typescript is not in node_modules — npx prints its
"This is not the tsc command you are looking for" banner and exits
non-zero, which previously blocked the LSP semantic tier (gated on
success or skipped). Pattern-matched per base command (npx,
rustfmt, go) so genuine lint errors still flow through normally.
3. hermes lsp status now surfaces a Backend warnings section when
bash-language-server is installed but shellcheck is missing. The
server itself spawns fine but bash-language-server delegates
diagnostics to shellcheck — without it on PATH the integration
looks alive but never reports any problems. Same warning is
logged once at server spawn time.
Validation:
- 12 new tests in tests/agent/lsp/test_install_and_lint_fixes.py:
* recipe carries typescript SDK
* _install_npm passes both pkg + extras to npm CLI
* backwards compat: recipes without extras still work
* _backend_warnings quiet when bash absent / both present
* _backend_warnings fires when bash installed without shellcheck
* status output includes the Backend warnings section
* _looks_like_linter_unusable catches the npx tsc banner
* real TS type errors not misclassified as unusable
* unfamiliar linters fall through normally
* _check_lint returns skipped on npx tsc unusable
* _check_lint returns error on real tsc type errors
- Full lsp + file_operations test suite: 245/245 pass
- Live E2E:
* try_install("typescript-language-server") installs both packages
into node_modules
* write_file(bad.ts, ...) returns lint=skipped + lsp_diagnostics
with two real TS errors (was lint=error, no lsp_diagnostics)
* hermes lsp status renders the shellcheck warning when bash is
installed but shellcheck is not on PATH
When the user runs /stop or a session is interrupted mid-flight, the
👀 in-progress reaction lingered on the user's message indefinitely.
Without another agent run to swap it for 👍/👎, the eyes stayed there
forever — visually misleading (looks like the agent is still working).
Fix: on ProcessingOutcome.CANCELLED, call set_message_reaction with
reaction=None to clear all reactions on the message. Documented Bot API
semantics (equivalent to Bot API 10.0's deleteMessageReaction, but works
on PTB 22.6 already without the version bump).
Test changes:
- Renamed test_on_processing_complete_cancelled_keeps_existing_reaction
→ test_on_processing_complete_cancelled_clears_reaction; updated
assertion to expect set_message_reaction(reaction=None).
- Added test_on_processing_complete_cancelled_skipped_when_disabled
(TELEGRAM_REACTIONS=false short-circuits).
- Added test_clear_reactions_handles_api_error_gracefully and
test_clear_reactions_returns_false_without_bot to cover the new
_clear_reactions helper.
The fuzzy @-file completer shells out to 'rg --files' via subprocess.run
with text=True. On Windows, Python 3.13 decodes stdout using the system
ANSI codepage (cp1252), so any filename containing bytes like 0x81/0x8f
crashes the background reader thread with UnicodeDecodeError. The
exception is swallowed inside subprocess, leaving proc.stdout=None, and
the next line ('proc.stdout.strip()') blows up with:
AttributeError: 'NoneType' object has no attribute 'strip'
This takes down the prompt_toolkit event loop and forces 'Press ENTER to
continue' until the user clears the @-query.
Fix:
- Pass encoding='utf-8', errors='replace' so rg's UTF-8 output is decoded
consistently across platforms and unmappable bytes don't crash.
- Guard 'proc.stdout' with a None check before .strip(), so a future
reader-thread failure degrades gracefully instead of breaking input.
Replace `len(label)` with `HermesCLI._status_bar_display_width(label)`
in two places where the response box top border is rendered.
`len()` counts characters, not terminal columns. CJK characters like
`测` and `试` each occupy 2 columns, causing the top border
`╭─ 测试 ───╮` to render 2 columns wider than the bottom border
`╰─────────╯`.
The `_status_bar_display_width` helper already exists (line 2881) and
uses `prompt_toolkit.utils.get_cwidth` for proper CJK width calculation.
When TUI exits, tmux captures some TUI output into its scrollback buffer.
On restart, stale scrollback content appears at the top of screen before
AlternateScreen takes over.
Add ANSI escape sequences at startup:
- ESC[2J clear visible screen
- ESC[H cursor home
- ESC[3J clear scrollback buffer
Replace the hardcoded i18n placeholder "~/.hermes/config.yaml" with the
real config_path returned from api.getStatus(), falling back to the i18n
string while loading or on API failure.
Co-authored-by: aqilaziz <gonzes7@gmail.com>
Fixes#24127
On headless Linux VPS (no DISPLAY or WAYLAND_DISPLAY), some Python
webbrowser backends register TUI programs such as links, lynx, or
www-browser. GenericBrowser.open() spawns these without redirecting
stdin/stdout, allowing them to take over the terminal. This can cause
the process to receive SIGHUP and exit immediately even though uvicorn
bound the port successfully, producing a misleading success message
followed by an empty --status.
Fix: detect headless Linux at startup and skip the auto-open when no
display server is available. On such systems the URL is still printed
so the user can open it manually or via an SSH tunnel. The webbrowser
call is also wrapped in a try/except so any unexpected failure on other
platforms is silently absorbed rather than surfacing as an unhandled
exception in the daemon thread.
_resolve_task_provider_model drops cfg_base_url and cfg_api_key when
returning a named provider, causing configured API keys and base URLs
to be lost. Pass them through so named providers can use custom
endpoints while still resolving credentials from provider-specific
env vars.
Closes#20139
Built-in commands with required args (e.g. /queue, /steer, /background)
were excluded from Telegram setMyCommands output, making them invisible
in the autocomplete menu. However, their handlers already return usage
text when invoked without arguments, so hiding them hurts discoverability.
This commit removes the _requires_argument filter for built-in commands
(COMMAND_REGISTRY) while keeping it for plugin-registered slash commands,
which may not provide a no-arg usage fallback.
Closes#24312
The clarify tool returned 'not available in this execution context' for
every gateway-mode agent because gateway/run.py never passed
clarify_callback into the AIAgent constructor. Schema actively encouraged
calling it; users never saw the question.
Changes:
- tools/clarify_gateway.py — new event-based primitive mirroring
tools/approval.py: register/wait_for_response/resolve_gateway_clarify
with per-session FIFO, threading.Event blocking with 1s heartbeat
slices (so the inactivity watchdog keeps ticking), and
clear_session for boundary cleanup.
- gateway/platforms/base.py — abstract send_clarify with a numbered-text
fallback so every adapter (Discord, Slack, WhatsApp, Signal, Matrix,
etc.) gets a working clarify out of the box. Plus an active-session
bypass: when the agent is blocked on a text-awaiting clarify, the next
non-command message routes inline to the runner's intercept instead
of being queued + triggering an interrupt. Same shape as the /approve
deadlock fix from PR #4926.
- gateway/platforms/telegram.py — concrete send_clarify renders one
inline button per choice plus '✏️ Other (type answer)'. cl: callback
handler resolves numeric choices immediately, flips to text-capture
mode for Other, with the same authorization guards as exec/slash
approvals.
- gateway/run.py — clarify_callback wired at the cached-agent per-turn
callback assignment site (only the user-facing agent path; cron and
hygiene-compress agents have no human attached). Bridges sync→async
via run_coroutine_threadsafe, blocks with the configured timeout, and
returns a '[user did not respond within Xm]' sentinel on timeout so
the agent adapts rather than pinning the running-agent guard. Text-
intercept added to _handle_message before slash-confirm intercept
(skipping slash commands). clear_session called in the run's finally
to cancel any orphan entries.
- hermes_cli/config.py — agent.clarify_timeout default 600s.
- website/docs/user-guide/messaging/telegram.md — Interactive Prompts
section.
Tests:
- tests/tools/test_clarify_gateway.py (14 tests) — full primitive
coverage: button resolve, open-ended auto-await, Other flip, timeout
None, unknown-id idempotency, clear_session cancellation, FIFO
ordering, register/unregister notify, config default.
- tests/gateway/test_telegram_clarify_buttons.py (12 tests) — render
paths (multi-choice/open-ended/long-label/HTML-escape/not-connected),
callback dispatch (numeric resolve/Other flip/already-resolved/
unauthorized/invalid-token), and base-adapter text fallback.
Out of scope: bot-to-bot, guest mode, checklists, poll media, live
photos. Closes#24191.