os.walk() by default does not follow symlinks, causing skills
linked via symlinks to be invisible to the skill discovery system.
Add followlinks=True so that symlinked skill directories are scanned.
Port from cline/cline#10266.
When OpenAI-compatible proxies (OpenRouter, Vercel AI Gateway, Cline)
route Claude models, they sometimes surface the Anthropic-native cache
counters (`cache_read_input_tokens`, `cache_creation_input_tokens`) at
the top level of the `usage` object instead of nesting them inside
`prompt_tokens_details`. Our chat-completions branch of
`normalize_usage()` only read the nested `prompt_tokens_details` fields,
so those responses:
- reported `cache_write_tokens = 0` even when the model actually did a
prompt-cache write,
- reported only some of the cache-read tokens when the proxy exposed them
top-level only,
- overstated `input_tokens` by the missed cache-write amount, which in
turn made cost estimation and the status-bar cache-hit percentage wrong
for Claude traffic going through these gateways.
Now the chat-completions branch tries the OpenAI-standard
`prompt_tokens_details` first and falls back to the top-level
Anthropic-shape fields only if the nested values are absent/zero. The
Anthropic and Codex Responses branches are unchanged.
Regression guards added for three shapes: top-level write + nested read,
top-level-only, and both-present (nested wins).
Version managers like frum (Ruby), rvm, nvm, and others commonly alias
cd to a wrapper function that runs additional logic after directory
changes. When Hermes captures the shell environment into a session
snapshot, these aliases are preserved. If the wrapper function fails
in the subprocess context (e.g. frum not on PATH), every cd fails,
causing all terminal commands to exit with code 126.
Using builtin cd bypasses any aliases or functions, ensuring the
directory change always uses the real bash builtin regardless of
what version managers are installed.
Make Tavily client respect a TAVILY_BASE_URL environment variable,
defaulting to https://api.tavily.com for backward compatibility.
Consistent with FIRECRAWL_API_URL pattern already used in this module.
Zhipu AI (智谱) serves both international users via api.z.ai and
China-based users via open.bigmodel.cn. The domestic endpoint was not
mapped in _URL_TO_PROVIDER, causing Hermes to treat it as an unknown
custom endpoint and fall back to the default 128K context length
instead of resolving the correct 200K+ context via models.dev or the
hardcoded GLM defaults.
This affects users of both the standard API
(https://open.bigmodel.cn/api/paas/v4) and the Coding Plan
(https://open.bigmodel.cn/api/coding/paas/v4).
New and newer models from models.dev now surface automatically in
/model (both hermes model CLI and the gateway Telegram/Discord picker)
for a curated set of secondary providers — no Hermes release required
when the registry publishes a new model.
Primary user-visible fix: on OpenCode Go, typing '/model mimo-v2.5-pro'
no longer silently fuzzy-corrects to 'mimo-v2-pro'. The exact match
against the merged models.dev catalog wins.
Scope (opt-in frozenset _MODELS_DEV_PREFERRED in hermes_cli/models.py):
opencode-go, opencode-zen, deepseek, kilocode, fireworks, mistral,
togetherai, cohere, perplexity, groq, nvidia, huggingface, zai,
gemini, google.
Explicitly NOT merged:
- openrouter and nous (never): curated list is already a hand-picked
subset / Portal is source of truth.
- xai, xiaomi, minimax, minimax-cn, kimi-coding, kimi-coding-cn,
alibaba, qwen-oauth (per-project decision to keep curated-only).
- providers with dedicated live-endpoint paths (copilot, anthropic,
ai-gateway, ollama-cloud, custom, stepfun, openai-codex) — those
paths already handle freshness themselves.
Changes:
- hermes_cli/models.py: add _MODELS_DEV_PREFERRED + _merge_with_models_dev
helper. provider_model_ids() branches on the set at its curated-fallback
return. Merge is models.dev-first, curated-only extras appended,
case-insensitive dedup, graceful fallback when models.dev is offline.
- hermes_cli/model_switch.py: list_authenticated_providers() calls the
same merge in both its code paths (PROVIDER_TO_MODELS_DEV loop +
HERMES_OVERLAYS loop). Picker AND validation-fallback both see
fresh entries.
- tests/hermes_cli/test_models_dev_preferred_merge.py (new): 13 tests —
merge-helper unit tests (empty/raise/order/dedup), opencode-go/zen
behavior, openrouter+nous explicitly guarded from merge.
- tests/hermes_cli/test_opencode_go_in_model_list.py: converted from
snapshot-style assertion to a behavior-based floor check, so it
doesn't break when models.dev publishes additional opencode-go
entries.
Addresses a report from @pfanis via Telegram: newer Xiaomi variants
on OpenCode Go weren't appearing in the /model picker, and /model
was silently routing requests for new variants to older ones.
The code execution sandbox creates a Unix domain socket in /tmp with
default permissions, allowing any local user to connect and execute
tool calls. Restrict to 0o600 after bind.
Closes#6230
- Adds 'ctx_size' field to _CONTEXT_LENGTH_KEYS tuple
- Enables hermes agent to correctly detect context size from custom LLMs
running on Lemonade server that use this field name instead of the
standard keys (max_seq_len, n_ctx_train, n_ctx)
The _looks_like_gateway_process function was missing the
hermes-gateway script pattern, causing dashboard to report gateway
as not running even when the process was active.
Patterns now cover all entry points:
- hermes_cli.main gateway
- hermes_cli/main.py gateway
- hermes gateway
- hermes-gateway (new)
- gateway/run.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Follow-up for salvaged PR #14179.
`_cleanup_invalid_pid_path` previously called `remove_pid_file()` for the
default PID path, but that helper defensively refuses to delete a PID file
whose pid field differs from `os.getpid()` (to protect --replace handoffs).
Every realistic stale-PID scenario is exactly that case: a crashed/Ctrl+C'd
gateway left behind a PID file owned by a now-dead foreign PID.
Once `get_running_pid()` has confirmed the runtime lock is inactive, the
on-disk metadata is known to belong to a dead process, so we can force-unlink
both the PID file and the sibling `gateway.lock` directly instead of going
through the defensive helper.
Also adds a regression test with a dead foreign PID that would have failed
against the previous cleanup logic.
Upgrades agent-browser from 0.13.0 to 0.26.0, picking up 13 releases of
daemon reliability fixes:
- Daemon hang on Linux from waitpid(-1) race in SIGCHLD handler (#1098)
- Chrome killed after ~10s idle due to PR_SET_PDEATHSIG thread tracking (#1157)
- Orphaned Chrome processes via process-group kill on shutdown (#1137)
- Stale daemon after upgrade via .version sidecar and auto-restart (#1134)
- Idle timeout not firing (sleep future recreated each loop) (#1110)
- Navigation hanging on lifecycle events that never fire (#1059, #1092)
- CDP attach hang on Chrome 144+ (#1133)
- Windows daemon TCP bind with Hyper-V port conflicts (#1041)
- Shadow DOM traversal in accessibility tree snapshots
- doctor command for user self-diagnosis
Also wires AGENT_BROWSER_IDLE_TIMEOUT_MS into the browser subprocess
environment so the daemon self-terminates after our configured inactivity
timeout (default 300s). This is the daemon-side counterpart to the
Python-side inactivity reaper — the daemon kills itself and its Chrome
children when no commands arrive, preventing orphan accumulation even
when the Python process dies without running atexit handlers.
Addresses #7343 (daemon socket hangs, shadow DOM) and #13793 (orphan
accumulation from force-killed sessions).
Fixes#12976
The generic "gemma": 8192 fallback was incorrectly matching gemma4:31b-cloud
before the more specific Gemma 4 entries could match, causing Hermes to assign
only 8K context instead of 262K. Added "gemma-4" and "gemma4" entries before
the fallback to correctly handle Gemma 4 model naming conventions.
Plugin slash commands now surface as first-class commands in every gateway
enumerator — Discord native slash picker, Telegram BotCommand menu, Slack
/hermes subcommand map — without a separate per-platform plugin API.
The existing 'command:<name>' gateway hook gains a decision protocol via
HookRegistry.emit_collect(): handlers that return a dict with
{'decision': 'deny'|'handled'|'rewrite'|'allow'} can intercept slash
command dispatch before core handling runs, unifying what would otherwise
have been a parallel 'pre_gateway_command' hook surface.
Changes:
- gateway/hooks.py: add HookRegistry.emit_collect() that fires the same
handler set as emit() but collects non-None return values. Backward
compatible — fire-and-forget telemetry hooks still work via emit().
- hermes_cli/plugins.py: add optional 'args_hint' param to
register_command() so plugins can opt into argument-aware native UI
registration (Discord arg picker, future platforms).
- hermes_cli/commands.py: add _iter_plugin_command_entries() helper and
merge plugin commands into telegram_bot_commands() and
slack_subcommand_map(). New is_gateway_known_command() recognizes both
built-in and plugin commands so the gateway hook fires for either.
- gateway/platforms/discord.py: extract _build_auto_slash_command helper
from the COMMAND_REGISTRY auto-register loop and reuse it for
plugin-registered commands. Built-in name conflicts are skipped.
- gateway/run.py: before normal slash dispatch, call emit_collect on
command:<canonical> and honor deny/handled/rewrite/allow decisions.
Hook now fires for plugin commands too.
- scripts/release.py: AUTHOR_MAP entry for @Magaav.
- Tests: emit_collect semantics, plugin command surfacing per platform,
decision protocol (deny/handled/rewrite/allow + non-dict tolerance),
Discord plugin auto-registration + conflict skipping, is_gateway_known_command.
Salvaged from #14131 (@Magaav). Original PR added a parallel
'pre_gateway_command' hook and a platform-keyed plugin command
registry; this re-implementation reuses the existing 'command:<name>'
hook and treats plugin commands as platform-agnostic so the same
capability reaches Telegram and Slack without new API surface.
Co-authored-by: Magaav <73175452+Magaav@users.noreply.github.com>
Replace xiaomi/mimo-v2-pro with xiaomi/mimo-v2.5-pro and xiaomi/mimo-v2.5
in the OpenRouter fallback catalog and the nous provider model list.
Add matching DEFAULT_CONTEXT_LENGTHS entries (1M tokens each).
- appLayout.tsx: restore the 1-row placeholder when `showStickyPrompt`
is false. Dropping it saved a row but the composer height shifted by
one as the prompt appeared/disappeared, jumping the input vertically
on scroll.
- useInputHandlers: gateway.rpc (from useMainApp) already catches errors
with its own sys() message and resolves to null. The previous `.catch`
was dead code and on RPC failures the user saw both 'error: ...' (from
rpc) and 'failed to toggle yolo'. Drop the catch and gate 'failed to
toggle yolo' on a non-null response so null (= rpc already spoke)
stays silent.
`is_local_endpoint()` leaned on `ipaddress.is_private`, which classifies
RFC-1918 ranges and link-local as private but deliberately excludes the
RFC 6598 CGNAT block (100.64.0.0/10) — the range Tailscale uses for its
mesh IPs. As a result, Ollama reached over Tailscale (e.g.
`http://100.77.243.5:11434`) was treated as remote and missed the
automatic stream-read / stale-stream timeout bumps, so cold model load
plus long prefill would trip the 300 s watchdog before the first token.
Add a module-level `_TAILSCALE_CGNAT = ipaddress.IPv4Network("100.64.0.0/10")`
(built once) and extend `is_local_endpoint()` to match the block both
via the parsed-`IPv4Address` path and the existing bare-string fallback
(for symmetry with the 10/172/192 checks). Also hoist the previously
function-local `import ipaddress` to module scope now that it's used by
the constant.
Extend `TestIsLocalEndpoint` with a CGNAT positive set (lower bound,
representative host, MagicDNS anchor, upper bound) and a near-miss
negative set (just below 100.64.0.0, just above 100.127.255.255, well
outside the block, and first-octet-wrong).
Resolve Feishu @_user_N / @_all placeholders into display names plus a
structured [Mentioned: Name (open_id=...), ...] hint so agents can both
reason about who was mentioned and call Feishu OpenAPI tools with stable
open_ids. Strip bot self-mentions only at message edges (leading
unconditionally, trailing only before whitespace/terminal punctuation)
so commands parse cleanly while mid-text references are preserved.
Covers both plain-text and rich-post payloads.
Also fixes a pre-existing hydration bug: Client.request no longer accepts
the 'method' kwarg on lark-oapi 1.5.3, so bot identity silently failed
to hydrate and self-filtering never worked. Migrate to the
BaseRequest.builder() pattern and accept the 'app_name' field the API
actually returns. Tighten identity matching precedence so open_id is
authoritative when present on both sides.
Adds security.allow_private_urls / HERMES_ALLOW_PRIVATE_URLS toggle so
users on OpenWrt routers, TUN-mode proxies (Clash/Mihomo/Sing-box),
corporate split-tunnel VPNs, and Tailscale networks — where DNS resolves
public domains to 198.18.0.0/15 or 100.64.0.0/10 — can use web_extract,
browser, vision URL fetching, and gateway media downloads.
Single toggle in tools/url_safety.py; all 23 is_safe_url() call sites
inherit automatically. Cached for process lifetime.
Cloud metadata endpoints stay ALWAYS blocked regardless of the toggle:
169.254.169.254 (AWS/GCP/Azure/DO/Oracle), 169.254.170.2 (AWS ECS task
IAM creds), 169.254.169.253 (Azure IMDS wire server), 100.100.100.200
(Alibaba), fd00:ec2::254 (AWS IPv6), the entire 169.254.0.0/16
link-local range, and the metadata.google.internal / metadata.goog
hostnames (checked pre-DNS so they can't be bypassed on networks where
those names resolve to local IPs).
Supersedes #3779 (narrower HERMES_ALLOW_RFC2544 for the same class of
users).
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
- normalizeStatusBar: replace Set + early-returns + cast with a single
alias lookup table. Handles legacy `false`, trims/lowercases strings,
maps `on` → `top` in one pass. One expression, no `as` hacks.
- Tab title block: drop the narrative comment, fold
blockedOnInput/titleStatus/cwdTag/terminalTitle into inline expressions
inside useTerminalTitle. Avoids shadowing the outer `cwd`.
- tui_gateway statusbar set branch: read `display` once instead of
`cfg0.get("display")` twice.
Previously the terminal tab title was `{⏳/✓} {model} — Hermes` which
only distinguished busy vs idle. Users juggling multiple Hermes tabs had
no way to tell which one was waiting on them for approval/clarify/sudo/
secret, and no cue for which workspace the tab was attached to.
- 3-state marker: `⚠` when an overlay prompt is open, `⏳` busy, `✓` idle.
- Append `· {shortCwd}` (28-char budget, $HOME → ~) so the tab surfaces
the workspace directly.
- Drop the `— Hermes` suffix — the marker already signals what this is,
and tab titles are tight.
Anthropic's API can legitimately return content=[] with stop_reason="end_turn"
when the model has nothing more to add after a turn that already delivered the
user-facing text alongside a trivial tool call (e.g. memory write). The transport
validator was treating that as an invalid response, triggering 3 retries that
each returned the same valid-but-empty response, then failing the run with
"Invalid API response after 3 retries."
The downstream normalizer already handles empty content correctly (empty loop
over response.content, content=None, finish_reason="stop"), so the only fix
needed is at the validator boundary.
Tests:
- Empty content + stop_reason="end_turn" → valid (the fix)
- Empty content + stop_reason="tool_use" → still invalid (regression guard)
- Empty content without stop_reason → still invalid (existing behavior preserved)
Copilot on #14145 flagged that the shift+tab yolo handler treated any
non-null RPC result as valid, so a response shape like {value: undefined}
or {value: 'weird'} would incorrectly echo 'yolo off'. Now only '1' and
'0' map to on/off; anything else (including missing value) surfaces as
'failed to toggle yolo', matching the null/catch branches.
When the streaming connection dropped AFTER user-visible text was
delivered but a tool call was in flight, we stubbed the turn with a
'⚠ Stream stalled mid tool-call; Ask me to retry' warning — costing
an iteration and breaking the flow. Users report this happening
increasingly often on long SSE streams through flaky provider routes.
Fix: in the existing inner stream-retry loop, relax the
deltas_were_sent short-circuit. If a tool call was in flight
(partial_tool_names populated) AND the error is a transient connection
error (timeout, RemoteProtocolError, SSE 'connection lost', etc.),
silently retry instead of bailing out. Fire a brief 'Connection
dropped mid tool-call; reconnecting…' marker so the user understands
the preamble is about to be re-streamed.
Researched how Claude Code (tombstone + non-streaming fallback),
OpenCode (blind Effect.retry wrapping whole stream), and Clawdbot
(4-way gate: stopReason==error + output==0 + !hadPotentialSideEffects)
handle this. Chose the narrow Clawdbot-style gate: retry only when
(a) a tool call was actually in flight (otherwise the existing
stub-with-recovered-text is correct for pure-text stalls) and
(b) the error is transient. Side-effect safety is automatic — no
tool has been dispatched within this single API call yet.
UX trade-off: user sees preamble text twice on retry (OpenCode-style).
Strictly better than a lost action with a 'retry manually' message.
If retries exhaust, falls through to the existing stub-with-warning
path so the user isn't left with zero signal.
Tests: 3 new tests in TestSilentRetryMidToolCall covering
(1) silent retry recovers tool call; (2) exhausted retries fall back
to stub; (3) text-only stalls don't trigger retry. 30/30 pass.
Copilot on #14138 flagged that the share report says '(file not found)'
when the log exists but is empty (either because the primary is empty
and no .1 rotation exists, or in the rare race where the file is
truncated between _resolve_log_path() and stat()).
- Split _primary_log_path() out of _resolve_log_path so both can share
the LOG_FILES/home math without duplication.
- _capture_log_snapshot now reports '(file empty)' when the primary
path exists on disk with zero bytes, and keeps '(file not found)'
for the truly-missing case.
Tests: rename test_returns_none_for_empty → test_empty_primary_reports_file_empty
with the new assertion, plus a race-path test that monkeypatches
_resolve_log_path to exercise the size==0 branch directly.
- entry.tsx no longer writes bootBanner() to the main screen before the
alt-screen enters. The <Banner> renders inside the alt screen via the
seeded intro row, so nothing is lost — just the flash that preceded it.
Fixes the torn first frame reported on Alacritty (blitz row 5 #17) and
shaves the 'starting agent' hang perception (row 5 #1) since the UI
paints straight into the steady-state view
- AlternateScreen prefixes ERASE_SCROLLBACK (\x1b[3J) to its entry so
strict emulators start from a pristine grid; named constants replace
the inline sequences for clarity
- bootBanner.ts deleted — dead code
- normalizeStatusBar: trim/lowercase + 'on' → 'top' alias so user-edited
YAML variants (Top, " bottom ", on) coerce correctly
- shift-tab yolo: no-op with sys note when no live session; success-gated
echo and catch fallback so RPC failures don't report as 'yolo off'
- tui_gateway config.set/get statusbar: isinstance(display, dict) guards
mirroring the compact branch so a malformed display scalar in config.yaml
can't raise
Tests: +1 vitest for trim/case/on, +2 pytest for non-dict display survival.