`fetch_models_dev()` is on the hot path of every `AIAgent.__init__`
(via `context_compressor → get_model_context_length`). The previous
policy was "always try network first, only fall back to disk if
network fails," so every fresh `hermes chat` / `hermes gateway` /
batch / cron process paid 250-500 ms re-fetching a 2 MB JSON registry
that was already on disk from earlier runs.
Add a stage 2 between in-mem and network: if
`models_dev_cache.json` exists and its mtime is younger than the
existing `_MODELS_DEV_CACHE_TTL` (1 hour, same TTL the in-mem cache
already uses), load from disk and skip the network call.
The in-mem TTL is anchored to the disk file's age, so a 50-min-old
cache stays in-memory for only 10 more minutes — no surprise
extension of staleness window.
Invariants preserved:
- `force_refresh=True` still always hits the network and only falls
back to disk on failure (`hermes config refresh` semantics).
- Missing disk cache → fall through to network (first-ever run).
- Stale disk cache (mtime > TTL) → fall through to network.
- Negative file age (clock skew) → fall through to network.
- Network failure → existing stage-4 stale-disk fallback unchanged.
Measured impact (3-run medians, 9950X3D, fresh process per run):
fetch_models_dev cold: 256 → 17 ms (-93%)
hermes chat -q wall: 4.00 → 3.73 s (-7% median)
3.99 → 3.60 s (-10% min)
The chat-end-to-end win is bounded below by API latency variance, but
the fetch_models_dev microbenchmark is the cleanest signal: 239 ms
shaved off every fresh-process agent construction.
Win compounds with the previous perf PRs:
#22681 google_chat lazy-load
#22766 doctor parallel + IMDS off
#22790 gateway.platforms PEP 562
Tests: all 30 `tests/agent/test_models_dev.py` pass (added 4 new ones
covering the new disk-cache-first path, force_refresh override, stale
disk fallback, and missing-disk-cache fall-through). Full `tests/agent/`
suite: 2560 passed, 0 failed.
The is_xai_responses branch only sent include=[reasoning.encrypted_content]
without forwarding the resolved reasoning_effort. Other Responses providers
(OpenAI, GitHub) already get effort forwarded — this aligns the xAI path.
Without this, agent.reasoning_effort is silently dropped on the xAI direct
path, making Hermes unable to control reasoning depth on grok-4.x via
api.x.ai. Tests added to TestCodexBuildKwargs cover effort passthrough,
disabled state, and minimal-clamp parity with non-xAI.
* docs: deep audit — fix stale config keys, missing commands, and registry drift
Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level
user-guide, user-guide/features) against the live registries:
hermes_cli/commands.py COMMAND_REGISTRY (slash commands)
hermes_cli/auth.py PROVIDER_REGISTRY (providers)
hermes_cli/config.py DEFAULT_CONFIG (config keys)
toolsets.py TOOLSETS (toolsets)
tools/registry.py get_all_tool_names() (tools)
python -m hermes_cli.main <subcmd> --help (CLI args)
reference/
- cli-commands.md: drop duplicate hermes fallback row + duplicate section,
add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand
lists to match --help output (status/logout/spotify, login, archive/prune/
list-archived).
- slash-commands.md: add missing /sessions and /reload-skills entries +
correct the cross-platform Notes line.
- tools-reference.md: drop bogus '68 tools' headline, drop fictional
'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated),
add missing 'kanban' and 'video' toolset sections, fix MCP example to use
the real mcp_<server>_<tool> prefix.
- toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser'
row, add missing 'kanban' and 'video' toolset rows, drop the stale
'38 tools' count for hermes-cli.
- profile-commands.md: add missing install/update/info subcommands, document
fish completion.
- environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the
one with the correct gmi-serving.com default).
- faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just
via OpenRouter), refresh the OpenAI model list.
getting-started/
- installation.md: PortableGit (not MinGit) is what the Windows installer
fetches; document the 32-bit MinGit fallback.
- installation.md / termux.md: installer prefers .[termux-all] then falls
back to .[termux].
- nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid
'nix flake update --flake' invocation.
- updating.md: 'hermes backup restore --state pre-update' doesn't exist —
point at the snapshot/quick-snapshot flow; correct config key
'updates.pre_update_backup' (was 'update.backup').
user-guide/
- configuration.md: api_max_retries default 3 (not 2); display.runtime_footer
is the real key (not display.runtime_metadata_footer); checkpoints defaults
enabled=false / max_snapshots=20 (not true / 50).
- configuring-models.md: 'hermes model list' / 'hermes model set ...' don't
exist — hermes model is interactive only.
- tui.md: busy_indicator -> tui_status_indicator with values
kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none).
- security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env,
not config.yaml.
- windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the
OpenAI-compatible API server runs inside hermes gateway.
user-guide/features/
- computer-use.md: approvals.mode (not security.approval_level); fix broken
./browser-use.md link to ./browser.md.
- fallback-providers.md: top-level fallback_providers (not
model.fallback_providers); the picker is subcommand-based, not modal.
- api-server.md: API_SERVER_* are env vars — write to per-profile .env,
not 'hermes config set' which targets YAML.
- web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl
modes are exposed through web_extract.
- kanban.md: failure_limit default is 2, not '~5'.
- plugins.md: drop hard-coded '33 providers' count.
- honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document
that 'hermes honcho' subcommand is gated on memory.provider=honcho;
reconcile subcommand list with actual --help output.
- memory-providers.md: legacy 'hermes honcho setup' redirect documented.
Verified via 'npm run build' — site builds cleanly; broken-link count went
from 149 to 146 (no regressions, fixed a few in passing).
* docs: round 2 audit fixes + regenerate skill catalogs
Follow-up to the previous commit on this branch:
Round 2 manual fixes:
- quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY;
voice-mode and ACP install commands rewritten — bare 'pip install ...'
doesn't work for curl-installed setups (no pip on PATH, not in repo
dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e
".[voice]"'. ACP already ships in [all] so the curl install includes it.
- cli.md / configuration.md: 'auxiliary.compression.model' shown as
'google/gemini-3-flash-preview' (the doc's own claimed default);
actual default is empty (= use main model). Reworded as 'leave empty
(default) or pin a cheap model'.
- built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row
that was missing from the table.
Regenerated skill catalogs:
- ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill
pages and both reference catalogs (skills-catalog.md,
optional-skills-catalog.md). This adds the entries that were genuinely
missing — productivity/teams-meeting-pipeline (bundled),
optional/finance/* (entire category — 7 skills:
3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model,
merger-model, pptx-author), creative/hyperframes,
creative/kanban-video-orchestrator, devops/watchers,
productivity/shop-app, research/searxng-search,
apple/macos-computer-use — and rewrites every other per-skill page from
the current SKILL.md. Most diffs are tiny (one line of refreshed
metadata).
Validation:
- 'npm run build' succeeded.
- Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation
shells that lag every newly-added skill page (pre-existing pattern).
No regressions on any en/ page.
`gateway/platforms/__init__.py` eagerly imported `QQAdapter` and
`YuanbaoAdapter` at package-init time, which transitively pulled in
qqbot's chunked-upload + keyboards + onboard machinery and yuanbao's
websocket stack. About 84 ms wall and 23 MB RSS on every fresh process
that touched anything under `gateway.platforms` — including `hermes
chat` (via run_agent → cli's plugin discovery transitive import).
Nothing in the codebase actually consumes these symbols from the
package root; every real call site uses the long-form path
(`from gateway.platforms.qqbot import QQAdapter`,
`from gateway.platforms.yuanbao import YuanbaoAdapter` in gateway/run.py).
The eager re-export was only there for convenience.
Replace with a PEP 562 module-level `__getattr__` that lazily imports
on first attribute access. Public API stays identical:
`from gateway.platforms import QQAdapter` keeps working but only
pays the import cost when the symbol is actually touched. `__dir__`
preserves help() / autocomplete behavior.
Measured impact (7-run medians, 9950X3D):
import gateway.platforms 127 → 43 ms (-66%)
50 → 27 MB (-46%)
import gateway.platforms.base 127 → 44 ms (-65%)
50 → 27 MB (-46%)
import cli (full chat path) 745 → 710 ms ( -5%)
96 → 90 MB ( -6%)
hermes chat -q (cold) -5 MB
The per-import win is biggest because qqbot/yuanbao deps don't overlap
with anything on the gateway-platforms path — full `import cli`
already loads aiohttp/websockets transitively from other places, so
the marginal CLI win is smaller than the isolated import benchmark.
The `gateway.platforms.base` win is what matters most for long-lived
gateway processes: every gateway boot saves 23 MB resident.
All 144 qqbot tests pass; broader gateway suite (5132 tests) passes
modulo 4 pre-existing flakes also failing on main without this change.
The xAI image-gen provider was DOA from PR #14765 onward — every request
422'd because the resolution param was being mapped to '1024'/'2048' but
xAI's API expects the literal strings '1k'/'2k'. PR #18678 fixed the
mapping; this test asserts the wire payload carries the literal so the
regression cannot recur silently.
The xAI /v1/images/generations endpoint expects resolution as a
literal string ('1k' or '2k'), not the numeric value ('1024').
- Change _XAI_RESOLUTIONS from a dict mapping to a validation set
- Use the resolution key directly instead of the mapped value
- Fall back to DEFAULT_RESOLUTION on invalid config values
Fixes 422 Unprocessable Entity errors when resolution was sent.
`hermes doctor` ran every connectivity probe sequentially and on a typical
developer laptop spent ~2s of its ~5s wall time inside boto3's EC2
instance-metadata-service lookup (169.254.169.254) — the default
AWS credential chain probes IMDS even when AWS_BEARER_TOKEN_BEDROCK
or AWS_ACCESS_KEY_ID is the only legitimate source.
Refactor the API Connectivity section so every probe (OpenRouter,
Anthropic, ~16 static API-key providers + dynamic profiles, AWS
Bedrock) is a pure function returning a structured result, then
fan them out through a ThreadPoolExecutor(max_workers=8). Output
order, glyphs, colours, padding, and issue strings stay byte-for-byte
identical to the sequential implementation; results are gathered
in submission order.
Also disable IMDS for the parallel block by setting
AWS_EC2_METADATA_DISABLED=true on the parent thread before submitting
work (and restoring its prior value in a finally block). Bedrock's
real-API call gets a Config(connect_timeout=5, read_timeout=10,
retries={max_attempts:1}) so a transient regional failure can't pad
the run by 30+ seconds.
Measured impact (5-run medians, 9950X3D):
hermes doctor: 5.07 → 2.16 s (-57%)
Doctor tests: 48 passed (test_doctor.py + test_doctor_command_install.py).
The remaining ~2s of wall is import overhead + a couple of one-off
network calls outside the API Connectivity section (`fetch_models_dev`
provider catalog refresh, Nous OAuth refresh in `Auth Providers`).
Those are next-tier targets, not part of this change.
Returning users who enabled '🖱️ Computer Use (macOS)' via 'hermes tools'
saw '✓ Saved configuration' but no install — cua-driver was never on
PATH and the toolset failed at first use. Two compounding causes:
1. _toolset_needs_configuration_prompt fell through to _toolset_has_keys,
which returned True for any provider with empty env_vars. cua-driver
has no env vars, so the gate skipped _configure_toolset entirely and
_run_post_setup('cua_driver') never ran.
2. No stable CLI entry-point existed for re-running the install when
the picker no-op'd it (e.g. when toggling the toolset off+on inside
one picker session, where 'added' is empty).
Changes:
- hermes_cli/tools_config.py: add _POST_SETUP_INSTALLED registry
mapping post_setup keys to installed-state predicates. The gate
now returns True when any visible provider has a registered
post_setup whose predicate fails. cua_driver is the only opt-in
for now; other post_setup hooks keep their existing behaviour.
- hermes_cli/main.py: add 'hermes computer-use install' and
'hermes computer-use status' as a stable docs target. install
reuses the same _run_post_setup('cua_driver') path that the
picker invokes; status reports whether cua-driver is on PATH.
- tools/computer_use/cua_backend.py: install hint now points users
at 'hermes computer-use install' first.
- website/docs/user-guide/features/computer-use.md: document the
new command as the primary install path.
- website/docs/reference/cli-commands.md: catalog 'hermes
computer-use' alongside 'hermes tools'.
- tests/hermes_cli/test_post_setup_gating.py: regression coverage
for the gate predicate (missing -> setup forced, installed ->
setup skipped, broken predicate -> non-blocking, unregistered
keys -> behaviour unchanged).
Fixes#22737. Reported by @f-trycua.
The model regularly writes session-outcome facts to MEMORY.md despite
the existing 'Do NOT save task progress' line — entries like
'Submitted PR #22577 for the kanban dedup fix' or 'Fixed bug X in
file Y'. These are stale within days, pollute the system prompt,
and crowd out durable user preferences (the issue #22563 reporter
saw 9 sections of bug-fix notes injected on a brand-new task).
Add explicit examples of what NOT to save (PR numbers, issue
numbers, commit SHAs, 'fixed/submitted/Phase N done', file counts)
plus the 7-day-staleness heuristic so the model has a concrete
calibration target rather than guessing what counts as 'task progress'.
Closes#22563 (the prompt-side, low-risk portion). The bigger
relevance-based-injection / vector-retrieval feature requested in
#22563 is tracked under #2184 (Richer local memory). Per skill rule
on prompt caching, dynamic memory injection breaks the frozen-snapshot
invariant and needs a separate design call.
_try_activate_fallback() walked the chain by index without comparing
the candidate entry against the currently-failing backend. So a
misconfigured chain that listed the same provider+model as the primary,
or two custom_providers entries pointing at the same shim URL, would
loop the same failure 3x for the same backend.
After the fix, advance() skips:
- entries where (provider, model) match the current agent's
- entries with a base_url + model matching the current backend
(catches two custom_providers names pointing at the same shim)
Recursing through self._try_activate_fallback() continues to the next
chain entry; if everything matches, returns False and the caller
moves on without retrying the same broken path.
3 regression tests covering same-provider-same-model skip, same-base_url-
same-model skip, and the all-self-matching-returns-False exhaustion path.
Closes#22548 (the Hermes-side portion). The 120s timeout itself in
the downstream claude-cli shim is a deployment concern documented in
that issue's wherewolf87 comment.
Native Windows, WSL, SSH sessions, and Windows Terminal all send
Ctrl+Enter as bare LF (c-j). Hermes was binding c-j as submit on
every POSIX platform, so Ctrl+Enter submitted instead of inserting
a newline on those terminals. Reported in #22379.
Add _preserve_ctrl_enter_newline() predicate that detects the
environments where Ctrl+Enter must produce a newline (sys.platform
== 'win32', SSH_CONNECTION/SSH_CLIENT/SSH_TTY env, WT_SESSION,
WSL_DISTRO_NAME, /proc/version 'microsoft' marker). Gate the
c-j-as-submit binding off in those environments and gate the
c-j-as-newline handler on. Local POSIX TTYs without those markers
(docker exec, plain ssh from a Mac) keep c-j as submit so plain
Enter still works on thin PTYs.
Add install_ctrl_enter_alias() in hermes_cli/pt_input_extras.py
mapping the three CSI-u / modifyOtherKeys variants of Ctrl+Enter
('\x1b[13;5u', '\x1b[27;5;13~', '\x1b[27;5;13u') to the
(Escape, ControlM) tuple Alt+Enter produces. This lets Kitty /
mintty / xterm-with-modifyOtherKeys users over SSH get a Ctrl+Enter
newline through the existing Alt+Enter handler.
9 new tests + extended existing test_lf_enter_binds_to_submit_handler_posix
to cover bare-local vs SSH branches.
Closes#22379.
Non-streaming /v1/chat/completions wrapped any AIAgent result \u2014 including
partial/failed runs \u2014 as a successful 200 with finish_reason='stop' and
the internal failure string substituted into message.content. API
clients had no way to distinguish 'agent answered: X' from
'agent crashed and the X you see is its error message'.
After the fix:
- completed: True \u2192 200 finish_reason='stop' (unchanged)
- partial + truncated text \u2192 200 finish_reason='length' + hermes extras
- partial + no text / failed \u2192 502 OpenAI error envelope (SDKs raise)
- other failures \u2192 200 finish_reason='error' + hermes extras
Adds X-Hermes-Completed / X-Hermes-Partial / X-Hermes-Error headers
plus a 'hermes' extras object on partial responses for clients that
want the full picture.
Closes#22496.
Gateway creates a fresh AIAgent per inbound message in several common
scenarios: cache miss, idle eviction (1h TTL), config-signature
mismatch, process restart. A freshly-built AIAgent has
_turns_since_memory=0 and _user_turn_count=0, so the
memory.nudge_interval trigger ('_turns_since_memory >=
_memory_nudge_interval') can never be reached when these reconstructions
happen on roughly the cadence of the interval. A user can chat for hours
on Telegram without ever seeing a self-improvement review fire.
Reconstruct the counters from conversation_history at the top of
run_conversation(), right after the existing _hydrate_todo_store call.
Idempotent guard ('if self._user_turn_count == 0') means a cached agent
that already accumulated counters keeps them; only freshly-built agents
hydrate. Modulo arithmetic preserves the original 1-in-N cadence rather
than firing a review immediately on resume.
7 regression tests pinning the contract (mid-cycle history, modulo wrap,
idempotency, zero-interval skip, role==user filtering, production-code
anchor).
Closes#22357.
Operator-controlled HERMES_PROFILE values were rendered as
'**${author}** (${ts}):' — markdown bold with no provenance prefix.
Worker comment bodies render directly underneath. A misleading
profile name like 'hermes-system' or 'operator' could be misread by
the next worker as a system directive above attacker-influenced
content (confused-deputy primitive gated on operator misconfig).
The LLM-controlled author-forgery surface was already closed in
#22435 (author removed from KANBAN_COMMENT_SCHEMA). This is
defense-in-depth: render with an explicit 'comment from worker
`<author>` at <ts>:' prefix so even 'hermes-system' resolves to
'comment from worker `hermes-system` at ...' — parseable as
worker-comment metadata, not a system directive. Strip backticks
from author so they can't break out of the fence.
Update test_build_worker_context_caps_comments to count by body
regex since the rendered author line now also starts with
'comment '.
Closes#22452.
Two unrelated but co-located fixes to scripts/run_tests.sh:
1. pytest-split bootstrap (#22401): the script tried '$PYTHON -m pip
install pytest-split' on first run, but uv-created venvs ship without
pip. Result: 'No module named pip' before any test ran. Add a uv
fallback (uv pip install --python $PYTHON), keep pip as a secondary
path, and emit a clear error pointing at 'uv pip install -e ".[dev]"'
when neither is available. Also declare pytest-split in
pyproject.toml dev extra so a normal '.[dev]' install provisions it.
2. HERMES_CRON_SESSION leak (#22400): the hermetic env scrub already
unsets HERMES_GATEWAY_SESSION and HERMES_INTERACTIVE but missed the
sibling HERMES_CRON_SESSION. When run_tests.sh is invoked from a
Hermes cron job, that variable leaks into pytest, flipping
tools/approval.py into cron-deny mode and breaking
tests/acp/test_approval_isolation.py and friends.
Closes#22400.
Closes#22401.
When session_id rotates (e.g. /new), commit_memory_session was firing
MemoryManager.on_session_end but skipping ContextEngine.on_session_end.
Engines that accumulate per-session state (LCM-style DAGs, summary
stores) leaked that state from the rotated-out session into whatever
continued under the same compressor instance.
Mirror the call shutdown_memory_provider already makes — same
lifecycle moment, same hook contract ("real session boundaries (CLI
exit, /reset, gateway expiry)"). /new is a real boundary for the old
session_id; providers keep their state but the rotated-out session_id
is done.
6 regression tests covering both-hooks-fire, no-memory-manager,
no-context-engine, both failure-tolerant paths.
Closes#22394.
- Restore allowed_chats gate before thread_id check so ignored_threads
applies universally (even to guest mentions).
- Compute _message_mentions_bot once in _should_process_message to
eliminate redundant second entity scan when guest_mode=true and the
message does not mention the bot.
- Remove redundant _is_group_chat from _is_guest_mention (caller already
verified the message is a group chat).
- Update _telegram_allowed_chats docstring to note guest_mode exception.
- Add test coverage: bot_command entity, text_mention entity,
caption_entities, and ignored_threads + guest_mode interaction.
- Add nik1t7n to AUTHOR_MAP.
The original PR placed 'pwd = pytest.importorskip("pwd")' on line 4
but 'import pytest' on line 9 — NameError on module load. Same for
test_file_sync_back.py. Plus, the in-function 'pwd = pytest.importorskip'
calls in test_auto_detected_root_is_rejected confused Python's scope
analysis (later 'import pytest' made pytest local everywhere in the
function) and caused UnboundLocalError. Drop the now-redundant
in-function importorskip calls and rely on the module-level guard.
The github-pr-workflow skill wraps the URL in double-quotes
('curl -H ... "https://api.github.com/..."'), which the original
allowlist regex (\s+https://api...) did not match. Without this,
the bundled github-pr-workflow skill is still blocked at every
cron tick despite #22605's fix landing for the bare-URL form.
Make the leading quote optional and add a regression test pinning
both single- and double-quoted forms.
Adds 'codex' to the _MCP_PRESETS registry so users can add it via
Connecting to 'codex'...
✓ Connected! Found 2 tool(s) from 'codex':
codex Run a Codex session. Accepts configuration parameters matchi...
codex-reply Continue a Codex conversation by providing the thread id and...
Enable all 2 tools? [Y/n/select]:
Cancelled. without manually specifying
the command and args.
Enables: codex mcp-server → Hermes native MCP client → Codex tools
available as first-class Hermes tools.
Problem:
After `hermes profile use NAME`, the gateway (started via systemd with
HERMES_HOME=/root/.hermes hardcoded) ignores the active profile and
always runs as the Default profile. WebUI, Telegram, and all non-CLI
platforms are affected.
Root cause:
_apply_profile_override() contained an early-return guard:
if profile_name is None and os.environ.get("HERMES_HOME"):
return # trust the inherited value
The intent was to let child processes inherit their parent's profile via
HERMES_HOME without redundantly re-reading active_profile. But
systemd also sets HERMES_HOME — to the hermes root (/root/.hermes),
not a profile directory — so the guard fired and silently skipped the
active_profile check. The user's `hermes profile use NAME` write to
~/.hermes/active_profile was never seen by the gateway process.
Fix:
Only skip the active_profile check when HERMES_HOME is already a
profile directory, identified by its immediate parent directory being
named "profiles" (e.g. ~/.hermes/profiles/coder or
/opt/data/profiles/coder). When HERMES_HOME points to a root
directory (parent name != "profiles"), continue to read active_profile.
Tests:
- test_hermes_home_at_root_with_active_profile_is_redirected: the
bug scenario — HERMES_HOME=/root/.hermes + active_profile=coder →
HERMES_HOME must be redirected to .../profiles/coder.
Stash-verified: FAILS without fix, PASSES with fix.
- test_hermes_home_already_profile_dir_is_trusted: child-process
inheritance contract unchanged — .../profiles/coder is trusted as-is.
- test_hermes_home_unset_reads_active_profile: classic path unchanged.
- test_hermes_home_unset_default_profile_no_redirect: "default" still
produces no redirect.
4/4 tests green.
Closes#22502.
When a Telegram user replies using the native quote feature to select
only part of a prior message, _build_message_event was injecting the
ENTIRE replied-to message into reply_to_text via
message.reply_to_message.text/caption. python-telegram-bot exposes
the user-selected substring as message.quote (TextQuote.text); we now
prefer that and fall back to the full replied-to text only when no
native quote is present.
The agent-visible "[Replying to: \"...\"]" prefix can otherwise expand
the user's narrow quote into the full prior message, causing the agent
to act on unrelated actionable-looking text the user did not select
(e.g. multi-item briefings where the user quotes one bullet but the
prefix injects every bullet). Falls back cleanly when message.quote
is absent (PTB <21 or replies that don't quote a substring).
Fixes#22619
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolve git via shutil.which with POSIX and Git-for-Windows fallbacks before clone and pull so Dashboard/API installs do not misreport Git as missing.
Add regression tests for the resolver and pull subprocess invocation.
When platform_toolsets[<platform>] contains both a composite (e.g.
hermes-cli) and at least one configurable opt-in (e.g. spotify), the
has_explicit_config branch in _get_platform_tools silently dropped the
composite, leaving sessions with only the configurable + plugin tools
and no native tools (terminal, file, web, browser, memory, etc.).
Mirror the else-branch's subset inference for composites that sit
alongside the configurables, but apply _DEFAULT_OFF_TOOLSETS only to the
implicit expansion so user-listed default-off toolsets (spotify,
discord) survive.
The delegate_task tool description hardcoded 'default 3' / 'default 2' for
max_concurrent_children / max_spawn_depth, which misled the model on any
install that raised these limits — the schema text said 'default 3' even
when the user had set max_concurrent_children=15 / max_spawn_depth=3, so
the model would self-cap at 3 and never use the headroom.
Make the description dynamic. ToolEntry gains an optional
dynamic_schema_overrides callable; registry.get_definitions() merges its
output on top of the static schema before returning it. delegate_tool
registers a builder that reads the current delegation.* config and emits:
- 'up to N items concurrently for this user' (N = max_concurrent_children)
- 'Nested delegation IS enabled / OFF for this user (max_spawn_depth=N)'
- 'orchestrator children can themselves delegate up to M more level(s)'
- 'orchestrator_enabled=false' when the kill switch is set
The model_tools cache key already includes config.yaml mtime+size, so
edits to delegation.* in config invalidate the cached tool definitions
without an explicit hook. CLI_CONFIG staleness within a process is a
pre-existing limitation of _load_config and out of scope here.
Static description / tasks.description / role.description in
DELEGATE_TASK_SCHEMA are placeholders so module import doesn't trigger
cli.CLI_CONFIG load before the test conftest can redirect HERMES_HOME.
Enforce the parent-completion invariant at claim_task (the single
ready->running chokepoint) and re-gate unblock_task so blocked->ready
only fires when parents are done. Prevents child tasks from running
ahead of in-progress parents under the create-then-link race.
Also adds a stress test that races concurrent create+link against
hammered claim_task and asserts no child runs while any parent is undone.
Ref: kanban/boards/cookai/workspaces/t_a6acd07d/root-cause.md
Refs: t_8d6af9d6
Plugin authors had no easy way to figure out why their plugin wasn't
loading — failures were buried in agent.log at WARNING and skip reasons
(disabled, not enabled, depth cap, exclusive) were DEBUG-only and
invisible by default.
Set HERMES_PLUGINS_DEBUG=1 to attach a stderr handler at DEBUG to the
hermes_cli.plugins logger only. Surfaces:
- which directories were scanned + manifest counts per source
- per manifest: resolved key, name, kind, source, on-disk path
- skip reasons (disabled, not enabled, exclusive, depth cap, no register)
- per load: tools/hooks/slash/CLI commands the plugin registered
- full traceback on YAML parse failure (exc_info on the existing warning)
- full traceback on register() exceptions, pointing at the plugin author's line
Env var off (default) → zero new stderr output, same as before.
Touches only hermes_cli/plugins.py + a doc section in the plugin-build
guide + an entry in the env-vars reference. 3 new tests lock the
attach/idempotent/no-attach behavior.
Plugin discovery imports every bundled platform plugin at model_tools
import time. The google_chat adapter unconditionally pulled in
google.cloud.pubsub_v1, googleapiclient, grpc, httplib2, and friends at
module top — about 33 MB RSS and 110 ms wall on every CLI invocation,
even ones that never construct a gateway adapter.
Wrap the heavy imports in _load_google_modules(): an idempotent loader
that rebinds the module-level globals (pubsub_v1, service_account,
HttpError, MediaFileUpload, …) on first call and is invoked from
GoogleChatAdapter.__init__, connect(), and check_google_chat_requirements().
The HttpError = Exception placeholder is preserved for the brief window
before the loader runs, so 'except HttpError as exc:' clauses stay
correct (Python looks up the name at try/except evaluation time, not
at function definition time).
Measured impact on a 9950X3D, 7-run medians:
import cli: 895 → 787 ms (-108 ms / -12%)
133 → 110 MB ( -23 MB / -17%)
import model_tools: 491 → 400 ms ( -91 ms / -19%)
95 → 66 MB ( -29 MB / -31%)
google_chat alone: 244 → 132 ms (-112 ms / -46%)
83 → 50 MB ( -33 MB / -40%)
hermes chat -q (cold): 177 → 145 MB ( -32 MB / -18%)
Real-world win lands on every path that imports cli.py: hermes chat,
hermes gateway, cron jobs, batch runs, subagents. Long-lived gateway
processes save ~30 MB resident.
All 157 google_chat tests pass; full gateway suite (5050 tests) green.
Problem:
unlink_tasks() removes a parent→child dependency edge but does not trigger
recompute_ready(). A child whose last blocking parent is unlinked stays
stuck in 'todo' indefinitely — it only promotes to 'ready' on the next
dispatcher tick or a manual 'hermes kanban recompute'. For CLI-only users
without a dispatcher, the child is permanently stuck.
Root cause:
complete_task() and unblock_task() both call recompute_ready() after their
write transaction so downstream children are evaluated immediately.
unlink_tasks() was missing this call — removing a dependency is
semantically equivalent to completing one, so the same recompute is needed.
Fix:
Capture the rowcount result before the write_txn exits, then call
recompute_ready(conn) outside the transaction when a row was actually
deleted (so the child sees the updated task_links state).
Tests:
Added test_unlink_tasks_triggers_recompute_ready in
tests/hermes_cli/test_kanban_db.py: creates parent A (done) + parent C
(running), child B with both parents (todo), unlinks C→B, asserts B is
ready immediately. Stash-verified: FAILS without fix (child stays todo),
PASSES with fix.
62/62 tests green in tests/hermes_cli/test_kanban_db.py.
Closes#22459.
/clear, /new, /reset, and /undo now ask the user to confirm before
discarding conversation state — three-option prompt routed through the
existing tools.slash_confirm primitive.
Native yes/no buttons render on Telegram, Discord, and Slack (their
adapters already implement send_slash_confirm); other platforms get a
text-fallback prompt and reply with /approve, /always, or /cancel.
The classic prompt_toolkit CLI uses the same three-option flow via the
established _prompt_text_input pattern (see _confirm_and_reload_mcp).
TUI keeps its existing modal overlay (#12312).
Gated by new config key approvals.destructive_slash_confirm (default
true). Picking 'Always Approve' flips the gate to false so subsequent
destructive commands run silently — matches the established
mcp_reload_confirm UX.
Out of scope: /cron remove (separate domain — scheduled jobs, not
session history). Existing TUI overlay env-var (HERMES_TUI_NO_CONFIRM)
left unchanged; cosmetic unification can come later.
Closes#4069.
When a GFM table has a row-label column (first column with no header),
_render_table_block_for_telegram incorrectly included the row-label cell
in the bullet zip alongside the data cells, producing a spurious bullet
like '• 維度: 核心賣點' before the real data rows.
Detect the row-label column by comparing the first data row cell count
against the header count (has_row_label_col = len(first_data_row) ==
len(headers) + 1). When present, use cells[0] as the heading and
zip headers against cells[1:] only, correctly excluding the row-label
from the bullet list.
Fixes#22604
check_for_updates() and _resolve_repo_dir() were preferring
$HERMES_HOME/hermes-agent/ over Path(__file__).parent.parent.resolve()
when looking for a .git checkout. For profiles created with
--clone-all, $HERMES_HOME/hermes-agent/ points to a stale copy
with a frozen HEAD, causing persistent "N commits behind" banners
that never resolved.
Flip the resolution order: prefer the running code's location first,
fall back to $HERMES_HOME/hermes-agent/ only when the live checkout
doesn't have a .git (system-wide pip installs, distro packages).
The embedded-rev branch (HERMES_REVISION env var, set by nix builds)
is unaffected — it uses git ls-remote against upstream, never reads
the local checkout's HEAD.
Based on PR #21728 by @fahdad
When the source profile is the default (~/.hermes), shutil.copytree()
was copying multi-GB infrastructure alongside the ~40 MB of actual
profile data: hermes-agent/ (repo checkout + 3 GB venv), .worktrees/,
profiles/ (sibling profiles — recursive!), bin/ (installed binaries),
node_modules/ (hundreds of MB).
Add _CLONE_ALL_DEFAULT_EXCLUDE_ROOT frozenset with these five entries
and pass an ignore callback to copytree(). Exclusions are gated on
the source actually being the default profile (is_default_source) so
named-profile sources are never affected.
Also exclude at any depth: __pycache__/, *.pyc, *.pyo, *.sock, *.tmp.
Profile data (config.yaml, .env, auth.json, state.db, sessions/,
skills/, logs/) is preserved intact — clone-all means 'complete
snapshot minus infrastructure'.
Mirrors the approach already used by _default_export_ignore() and
_DEFAULT_EXPORT_EXCLUDE_ROOT (the export-side exclusion set which is
broader because it produces a portable archive, not a live clone).
Co-authored-by: MustafaKara7 <karamusti912@gmail.com>
Co-authored-by: fahdad <30740087+fahdad@users.noreply.github.com>
Fixes#5022
Based on PRs #5025, #5026, and #21728