Commit graph

7856 commits

Author SHA1 Message Date
Teknium
775c0e22cf
perf(models_dev): cache-first lookup, skip network when disk cache is fresh (#22808)
`fetch_models_dev()` is on the hot path of every `AIAgent.__init__`
(via `context_compressor → get_model_context_length`). The previous
policy was "always try network first, only fall back to disk if
network fails," so every fresh `hermes chat` / `hermes gateway` /
batch / cron process paid 250-500 ms re-fetching a 2 MB JSON registry
that was already on disk from earlier runs.

Add a stage 2 between in-mem and network: if
`models_dev_cache.json` exists and its mtime is younger than the
existing `_MODELS_DEV_CACHE_TTL` (1 hour, same TTL the in-mem cache
already uses), load from disk and skip the network call.

The in-mem TTL is anchored to the disk file's age, so a 50-min-old
cache stays in-memory for only 10 more minutes — no surprise
extension of staleness window.

Invariants preserved:
- `force_refresh=True` still always hits the network and only falls
  back to disk on failure (`hermes config refresh` semantics).
- Missing disk cache → fall through to network (first-ever run).
- Stale disk cache (mtime > TTL) → fall through to network.
- Negative file age (clock skew) → fall through to network.
- Network failure → existing stage-4 stale-disk fallback unchanged.

Measured impact (3-run medians, 9950X3D, fresh process per run):
  fetch_models_dev cold:  256 → 17 ms  (-93%)
  hermes chat -q wall:   4.00 → 3.73 s (-7% median)
                         3.99 → 3.60 s (-10% min)

The chat-end-to-end win is bounded below by API latency variance, but
the fetch_models_dev microbenchmark is the cleanest signal: 239 ms
shaved off every fresh-process agent construction.

Win compounds with the previous perf PRs:
  #22681 google_chat lazy-load
  #22766 doctor parallel + IMDS off
  #22790 gateway.platforms PEP 562

Tests: all 30 `tests/agent/test_models_dev.py` pass (added 4 new ones
covering the new disk-cache-first path, force_refresh override, stale
disk fallback, and missing-disk-cache fall-through). Full `tests/agent/`
suite: 2560 passed, 0 failed.
2026-05-09 13:32:38 -07:00
Julien Talbot
cd712b176a feat(transports/codex): pass reasoning.effort to xAI Responses API
The is_xai_responses branch only sent include=[reasoning.encrypted_content]
without forwarding the resolved reasoning_effort. Other Responses providers
(OpenAI, GitHub) already get effort forwarded — this aligns the xAI path.

Without this, agent.reasoning_effort is silently dropped on the xAI direct
path, making Hermes unable to control reasoning depth on grok-4.x via
api.x.ai. Tests added to TestCodexBuildKwargs cover effort passthrough,
disabled state, and minimal-clamp parity with non-xAI.
2026-05-09 13:23:02 -07:00
Teknium
252d68fd45
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784)
* docs: deep audit — fix stale config keys, missing commands, and registry drift

Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level
user-guide, user-guide/features) against the live registries:

  hermes_cli/commands.py    COMMAND_REGISTRY (slash commands)
  hermes_cli/auth.py        PROVIDER_REGISTRY (providers)
  hermes_cli/config.py      DEFAULT_CONFIG (config keys)
  toolsets.py               TOOLSETS (toolsets)
  tools/registry.py         get_all_tool_names() (tools)
  python -m hermes_cli.main <subcmd> --help (CLI args)

reference/
- cli-commands.md: drop duplicate hermes fallback row + duplicate section,
  add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand
  lists to match --help output (status/logout/spotify, login, archive/prune/
  list-archived).
- slash-commands.md: add missing /sessions and /reload-skills entries +
  correct the cross-platform Notes line.
- tools-reference.md: drop bogus '68 tools' headline, drop fictional
  'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated),
  add missing 'kanban' and 'video' toolset sections, fix MCP example to use
  the real mcp_<server>_<tool> prefix.
- toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser'
  row, add missing 'kanban' and 'video' toolset rows, drop the stale
  '38 tools' count for hermes-cli.
- profile-commands.md: add missing install/update/info subcommands, document
  fish completion.
- environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the
  one with the correct gmi-serving.com default).
- faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just
  via OpenRouter), refresh the OpenAI model list.

getting-started/
- installation.md: PortableGit (not MinGit) is what the Windows installer
  fetches; document the 32-bit MinGit fallback.
- installation.md / termux.md: installer prefers .[termux-all] then falls
  back to .[termux].
- nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid
  'nix flake update --flake' invocation.
- updating.md: 'hermes backup restore --state pre-update' doesn't exist —
  point at the snapshot/quick-snapshot flow; correct config key
  'updates.pre_update_backup' (was 'update.backup').

user-guide/
- configuration.md: api_max_retries default 3 (not 2); display.runtime_footer
  is the real key (not display.runtime_metadata_footer); checkpoints defaults
  enabled=false / max_snapshots=20 (not true / 50).
- configuring-models.md: 'hermes model list' / 'hermes model set ...' don't
  exist — hermes model is interactive only.
- tui.md: busy_indicator -> tui_status_indicator with values
  kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none).
- security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env,
  not config.yaml.
- windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the
  OpenAI-compatible API server runs inside hermes gateway.

user-guide/features/
- computer-use.md: approvals.mode (not security.approval_level); fix broken
  ./browser-use.md link to ./browser.md.
- fallback-providers.md: top-level fallback_providers (not
  model.fallback_providers); the picker is subcommand-based, not modal.
- api-server.md: API_SERVER_* are env vars — write to per-profile .env,
  not 'hermes config set' which targets YAML.
- web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl
  modes are exposed through web_extract.
- kanban.md: failure_limit default is 2, not '~5'.
- plugins.md: drop hard-coded '33 providers' count.
- honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document
  that 'hermes honcho' subcommand is gated on memory.provider=honcho;
  reconcile subcommand list with actual --help output.
- memory-providers.md: legacy 'hermes honcho setup' redirect documented.

Verified via 'npm run build' — site builds cleanly; broken-link count went
from 149 to 146 (no regressions, fixed a few in passing).

* docs: round 2 audit fixes + regenerate skill catalogs

Follow-up to the previous commit on this branch:

Round 2 manual fixes:
- quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY;
  voice-mode and ACP install commands rewritten — bare 'pip install ...'
  doesn't work for curl-installed setups (no pip on PATH, not in repo
  dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e
  ".[voice]"'. ACP already ships in [all] so the curl install includes it.
- cli.md / configuration.md: 'auxiliary.compression.model' shown as
  'google/gemini-3-flash-preview' (the doc's own claimed default);
  actual default is empty (= use main model). Reworded as 'leave empty
  (default) or pin a cheap model'.
- built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row
  that was missing from the table.

Regenerated skill catalogs:
- ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill
  pages and both reference catalogs (skills-catalog.md,
  optional-skills-catalog.md). This adds the entries that were genuinely
  missing — productivity/teams-meeting-pipeline (bundled),
  optional/finance/* (entire category — 7 skills:
  3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model,
  merger-model, pptx-author), creative/hyperframes,
  creative/kanban-video-orchestrator, devops/watchers,
  productivity/shop-app, research/searxng-search,
  apple/macos-computer-use — and rewrites every other per-skill page from
  the current SKILL.md. Most diffs are tiny (one line of refreshed
  metadata).

Validation:
- 'npm run build' succeeded.
- Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation
  shells that lag every newly-added skill page (pre-existing pattern).
  No regressions on any en/ page.
2026-05-09 13:19:51 -07:00
Teknium
ea2d66ddc0
perf(gateway): defer QQAdapter and YuanbaoAdapter imports via PEP 562 (#22790)
`gateway/platforms/__init__.py` eagerly imported `QQAdapter` and
`YuanbaoAdapter` at package-init time, which transitively pulled in
qqbot's chunked-upload + keyboards + onboard machinery and yuanbao's
websocket stack. About 84 ms wall and 23 MB RSS on every fresh process
that touched anything under `gateway.platforms` — including `hermes
chat` (via run_agent → cli's plugin discovery transitive import).

Nothing in the codebase actually consumes these symbols from the
package root; every real call site uses the long-form path
(`from gateway.platforms.qqbot import QQAdapter`,
`from gateway.platforms.yuanbao import YuanbaoAdapter` in gateway/run.py).
The eager re-export was only there for convenience.

Replace with a PEP 562 module-level `__getattr__` that lazily imports
on first attribute access. Public API stays identical:
`from gateway.platforms import QQAdapter` keeps working but only
pays the import cost when the symbol is actually touched. `__dir__`
preserves help() / autocomplete behavior.

Measured impact (7-run medians, 9950X3D):
  import gateway.platforms        127 →  43 ms  (-66%)
                                   50 →  27 MB  (-46%)
  import gateway.platforms.base   127 →  44 ms  (-65%)
                                   50 →  27 MB  (-46%)
  import cli (full chat path)     745 → 710 ms  ( -5%)
                                   96 →  90 MB  ( -6%)
  hermes chat -q (cold)                  -5 MB

The per-import win is biggest because qqbot/yuanbao deps don't overlap
with anything on the gateway-platforms path — full `import cli`
already loads aiohttp/websockets transitively from other places, so
the marginal CLI win is smaller than the isolated import benchmark.
The `gateway.platforms.base` win is what matters most for long-lived
gateway processes: every gateway boot saves 23 MB resident.

All 144 qqbot tests pass; broader gateway suite (5132 tests) passes
modulo 4 pre-existing flakes also failing on main without this change.
2026-05-09 13:17:48 -07:00
Teknium
dcff23a25f test(xai-image): regression-guard literal '1k'/'2k' resolution payload
The xAI image-gen provider was DOA from PR #14765 onward — every request
422'd because the resolution param was being mapped to '1024'/'2048' but
xAI's API expects the literal strings '1k'/'2k'. PR #18678 fixed the
mapping; this test asserts the wire payload carries the literal so the
regression cannot recur silently.
2026-05-09 13:07:46 -07:00
Ayman Kamal
5b32c9fc66 chore: add A-kamal to AUTHOR_MAP for PR #18678 2026-05-09 13:07:46 -07:00
Ayman Kamal
13b474c56e fix: send correct resolution param to xAI image generation API
The xAI /v1/images/generations endpoint expects resolution as a
literal string ('1k' or '2k'), not the numeric value ('1024').

- Change _XAI_RESOLUTIONS from a dict mapping to a validation set
- Use the resolution key directly instead of the mapped value
- Fall back to DEFAULT_RESOLUTION on invalid config values

Fixes 422 Unprocessable Entity errors when resolution was sent.
2026-05-09 13:07:46 -07:00
Teknium
e612c3d6f0
perf(doctor): parallelize API connectivity checks and disable IMDS (#22766)
`hermes doctor` ran every connectivity probe sequentially and on a typical
developer laptop spent ~2s of its ~5s wall time inside boto3's EC2
instance-metadata-service lookup (169.254.169.254) — the default
AWS credential chain probes IMDS even when AWS_BEARER_TOKEN_BEDROCK
or AWS_ACCESS_KEY_ID is the only legitimate source.

Refactor the API Connectivity section so every probe (OpenRouter,
Anthropic, ~16 static API-key providers + dynamic profiles, AWS
Bedrock) is a pure function returning a structured result, then
fan them out through a ThreadPoolExecutor(max_workers=8). Output
order, glyphs, colours, padding, and issue strings stay byte-for-byte
identical to the sequential implementation; results are gathered
in submission order.

Also disable IMDS for the parallel block by setting
AWS_EC2_METADATA_DISABLED=true on the parent thread before submitting
work (and restoring its prior value in a finally block). Bedrock's
real-API call gets a Config(connect_timeout=5, read_timeout=10,
retries={max_attempts:1}) so a transient regional failure can't pad
the run by 30+ seconds.

Measured impact (5-run medians, 9950X3D):
  hermes doctor:           5.07 → 2.16 s  (-57%)

Doctor tests: 48 passed (test_doctor.py + test_doctor_command_install.py).

The remaining ~2s of wall is import overhead + a couple of one-off
network calls outside the API Connectivity section (`fetch_models_dev`
provider catalog refresh, Nous OAuth refresh in `Auth Providers`).
Those are next-tier targets, not part of this change.
2026-05-09 13:03:20 -07:00
Teknium
8f711f79a4
fix(tools): install cua-driver when Computer Use is enabled via 'hermes tools' (#22765)
Returning users who enabled '🖱️ Computer Use (macOS)' via 'hermes tools'
saw '✓ Saved configuration' but no install — cua-driver was never on
PATH and the toolset failed at first use. Two compounding causes:

1. _toolset_needs_configuration_prompt fell through to _toolset_has_keys,
   which returned True for any provider with empty env_vars. cua-driver
   has no env vars, so the gate skipped _configure_toolset entirely and
   _run_post_setup('cua_driver') never ran.

2. No stable CLI entry-point existed for re-running the install when
   the picker no-op'd it (e.g. when toggling the toolset off+on inside
   one picker session, where 'added' is empty).

Changes:

- hermes_cli/tools_config.py: add _POST_SETUP_INSTALLED registry
  mapping post_setup keys to installed-state predicates. The gate
  now returns True when any visible provider has a registered
  post_setup whose predicate fails. cua_driver is the only opt-in
  for now; other post_setup hooks keep their existing behaviour.
- hermes_cli/main.py: add 'hermes computer-use install' and
  'hermes computer-use status' as a stable docs target. install
  reuses the same _run_post_setup('cua_driver') path that the
  picker invokes; status reports whether cua-driver is on PATH.
- tools/computer_use/cua_backend.py: install hint now points users
  at 'hermes computer-use install' first.
- website/docs/user-guide/features/computer-use.md: document the
  new command as the primary install path.
- website/docs/reference/cli-commands.md: catalog 'hermes
  computer-use' alongside 'hermes tools'.
- tests/hermes_cli/test_post_setup_gating.py: regression coverage
  for the gate predicate (missing -> setup forced, installed ->
  setup skipped, broken predicate -> non-blocking, unregistered
  keys -> behaviour unchanged).

Fixes #22737. Reported by @f-trycua.
2026-05-09 13:02:25 -07:00
Teknium
6e5489c9f3
fix(memory): tighten MEMORY_GUIDANCE against ephemeral PR/issue/SHA notes (#22781)
The model regularly writes session-outcome facts to MEMORY.md despite
the existing 'Do NOT save task progress' line — entries like
'Submitted PR #22577 for the kanban dedup fix' or 'Fixed bug X in
file Y'. These are stale within days, pollute the system prompt,
and crowd out durable user preferences (the issue #22563 reporter
saw 9 sections of bug-fix notes injected on a brand-new task).

Add explicit examples of what NOT to save (PR numbers, issue
numbers, commit SHAs, 'fixed/submitted/Phase N done', file counts)
plus the 7-day-staleness heuristic so the model has a concrete
calibration target rather than guessing what counts as 'task progress'.

Closes #22563 (the prompt-side, low-risk portion). The bigger
relevance-based-injection / vector-retrieval feature requested in
#22563 is tracked under #2184 (Richer local memory). Per skill rule
on prompt caching, dynamic memory injection breaks the frozen-snapshot
invariant and needs a separate design call.
2026-05-09 12:48:25 -07:00
Teknium
e7c0d6ee53
fix(fallback): skip chain entries matching current provider/model/base_url (#22780)
_try_activate_fallback() walked the chain by index without comparing
the candidate entry against the currently-failing backend. So a
misconfigured chain that listed the same provider+model as the primary,
or two custom_providers entries pointing at the same shim URL, would
loop the same failure 3x for the same backend.

After the fix, advance() skips:
  - entries where (provider, model) match the current agent's
  - entries with a base_url + model matching the current backend
    (catches two custom_providers names pointing at the same shim)

Recursing through self._try_activate_fallback() continues to the next
chain entry; if everything matches, returns False and the caller
moves on without retrying the same broken path.

3 regression tests covering same-provider-same-model skip, same-base_url-
same-model skip, and the all-self-matching-returns-False exhaustion path.

Closes #22548 (the Hermes-side portion). The 120s timeout itself in
the downstream claude-cli shim is a deployment concern documented in
that issue's wherewolf87 comment.
2026-05-09 12:48:19 -07:00
Teknium
70bc52e408
fix(cli): make Ctrl+Enter insert newline on WSL/SSH/Windows Terminal (#22777)
Native Windows, WSL, SSH sessions, and Windows Terminal all send
Ctrl+Enter as bare LF (c-j). Hermes was binding c-j as submit on
every POSIX platform, so Ctrl+Enter submitted instead of inserting
a newline on those terminals. Reported in #22379.

Add _preserve_ctrl_enter_newline() predicate that detects the
environments where Ctrl+Enter must produce a newline (sys.platform
== 'win32', SSH_CONNECTION/SSH_CLIENT/SSH_TTY env, WT_SESSION,
WSL_DISTRO_NAME, /proc/version 'microsoft' marker). Gate the
c-j-as-submit binding off in those environments and gate the
c-j-as-newline handler on. Local POSIX TTYs without those markers
(docker exec, plain ssh from a Mac) keep c-j as submit so plain
Enter still works on thin PTYs.

Add install_ctrl_enter_alias() in hermes_cli/pt_input_extras.py
mapping the three CSI-u / modifyOtherKeys variants of Ctrl+Enter
('\x1b[13;5u', '\x1b[27;5;13~', '\x1b[27;5;13u') to the
(Escape, ControlM) tuple Alt+Enter produces. This lets Kitty /
mintty / xterm-with-modifyOtherKeys users over SSH get a Ctrl+Enter
newline through the existing Alt+Enter handler.

9 new tests + extended existing test_lf_enter_binds_to_submit_handler_posix
to cover bare-local vs SSH branches.

Closes #22379.
2026-05-09 12:48:14 -07:00
Teknium
2124ad72a2
fix(api-server): emit length/error finish_reason for truncation/failure (#22775)
Non-streaming /v1/chat/completions wrapped any AIAgent result \u2014 including
partial/failed runs \u2014 as a successful 200 with finish_reason='stop' and
the internal failure string substituted into message.content. API
clients had no way to distinguish 'agent answered: X' from
'agent crashed and the X you see is its error message'.

After the fix:
  - completed: True             \u2192 200 finish_reason='stop' (unchanged)
  - partial + truncated text    \u2192 200 finish_reason='length' + hermes extras
  - partial + no text / failed  \u2192 502 OpenAI error envelope (SDKs raise)
  - other failures              \u2192 200 finish_reason='error' + hermes extras

Adds X-Hermes-Completed / X-Hermes-Partial / X-Hermes-Error headers
plus a 'hermes' extras object on partial responses for clients that
want the full picture.

Closes #22496.
2026-05-09 12:48:08 -07:00
Teknium
86f69e8c2a
fix(agent): hydrate memory-nudge counters from conversation_history (#22774)
Gateway creates a fresh AIAgent per inbound message in several common
scenarios: cache miss, idle eviction (1h TTL), config-signature
mismatch, process restart. A freshly-built AIAgent has
_turns_since_memory=0 and _user_turn_count=0, so the
memory.nudge_interval trigger ('_turns_since_memory >=
_memory_nudge_interval') can never be reached when these reconstructions
happen on roughly the cadence of the interval. A user can chat for hours
on Telegram without ever seeing a self-improvement review fire.

Reconstruct the counters from conversation_history at the top of
run_conversation(), right after the existing _hydrate_todo_store call.
Idempotent guard ('if self._user_turn_count == 0') means a cached agent
that already accumulated counters keeps them; only freshly-built agents
hydrate. Modulo arithmetic preserves the original 1-in-N cadence rather
than firing a review immediately on resume.

7 regression tests pinning the contract (mid-cycle history, modulo wrap,
idempotency, zero-interval skip, role==user filtering, production-code
anchor).

Closes #22357.
2026-05-09 12:48:03 -07:00
Teknium
ade5981429
fix(kanban): sanitize comment author rendering in build_worker_context (#22769)
Operator-controlled HERMES_PROFILE values were rendered as
'**${author}** (${ts}):' — markdown bold with no provenance prefix.
Worker comment bodies render directly underneath. A misleading
profile name like 'hermes-system' or 'operator' could be misread by
the next worker as a system directive above attacker-influenced
content (confused-deputy primitive gated on operator misconfig).

The LLM-controlled author-forgery surface was already closed in
#22435 (author removed from KANBAN_COMMENT_SCHEMA). This is
defense-in-depth: render with an explicit 'comment from worker
`<author>` at <ts>:' prefix so even 'hermes-system' resolves to
'comment from worker `hermes-system` at ...' — parseable as
worker-comment metadata, not a system directive. Strip backticks
from author so they can't break out of the fence.

Update test_build_worker_context_caps_comments to count by body
regex since the rendered author line now also starts with
'comment '.

Closes #22452.
2026-05-09 12:47:58 -07:00
Teknium
f00dc6d7a3
fix(tests): harden run_tests.sh — uv-aware bootstrap + scrub HERMES_CRON_SESSION (#22767)
Two unrelated but co-located fixes to scripts/run_tests.sh:

1. pytest-split bootstrap (#22401): the script tried '$PYTHON -m pip
   install pytest-split' on first run, but uv-created venvs ship without
   pip. Result: 'No module named pip' before any test ran. Add a uv
   fallback (uv pip install --python $PYTHON), keep pip as a secondary
   path, and emit a clear error pointing at 'uv pip install -e ".[dev]"'
   when neither is available. Also declare pytest-split in
   pyproject.toml dev extra so a normal '.[dev]' install provisions it.

2. HERMES_CRON_SESSION leak (#22400): the hermetic env scrub already
   unsets HERMES_GATEWAY_SESSION and HERMES_INTERACTIVE but missed the
   sibling HERMES_CRON_SESSION. When run_tests.sh is invoked from a
   Hermes cron job, that variable leaks into pytest, flipping
   tools/approval.py into cron-deny mode and breaking
   tests/acp/test_approval_isolation.py and friends.

Closes #22400.
Closes #22401.
2026-05-09 12:47:52 -07:00
Teknium
e90aa7f280
fix(agent): notify context engine on commit_memory_session (#22764)
When session_id rotates (e.g. /new), commit_memory_session was firing
MemoryManager.on_session_end but skipping ContextEngine.on_session_end.
Engines that accumulate per-session state (LCM-style DAGs, summary
stores) leaked that state from the rotated-out session into whatever
continued under the same compressor instance.

Mirror the call shutdown_memory_provider already makes — same
lifecycle moment, same hook contract ("real session boundaries (CLI
exit, /reset, gateway expiry)"). /new is a real boundary for the old
session_id; providers keep their state but the rotated-out session_id
is done.

6 regression tests covering both-hooks-fire, no-memory-manager,
no-context-engine, both failure-tolerant paths.

Closes #22394.
2026-05-09 12:28:42 -07:00
kshitijk4poor
dae94fa652 fix: follow-up for salvaged PR #22263
- Restore allowed_chats gate before thread_id check so ignored_threads
  applies universally (even to guest mentions).
- Compute _message_mentions_bot once in _should_process_message to
  eliminate redundant second entity scan when guest_mode=true and the
  message does not mention the bot.
- Remove redundant _is_group_chat from _is_guest_mention (caller already
  verified the message is a group chat).
- Update _telegram_allowed_chats docstring to note guest_mode exception.
- Add test coverage: bot_command entity, text_mention entity,
  caption_entities, and ignored_threads + guest_mode interaction.
- Add nik1t7n to AUTHOR_MAP.
2026-05-09 11:54:04 -07:00
Nikita Nosov
55f518e521 feat(gateway): add Telegram guest mention mode 2026-05-09 11:54:04 -07:00
Teknium
369cee018d chore: add wali-reheman to AUTHOR_MAP 2026-05-09 11:12:03 -07:00
Teknium
b959cfa056 fix: move pytest.importorskip below pytest import in skip-guarded tests
The original PR placed 'pwd = pytest.importorskip("pwd")' on line 4
but 'import pytest' on line 9 — NameError on module load. Same for
test_file_sync_back.py. Plus, the in-function 'pwd = pytest.importorskip'
calls in test_auto_detected_root_is_rejected confused Python's scope
analysis (later 'import pytest' made pytest local everywhere in the
function) and caused UnboundLocalError. Drop the now-redundant
in-function importorskip calls and rely on the module-level guard.
2026-05-09 11:12:03 -07:00
Wali Reheman
4e8b8573ca tests: add Windows skip guards for UNIX-only stdlib imports 2026-05-09 11:12:03 -07:00
Teknium
b6ff96c057 fix(cron): allow quoted URL in github auth-header allowlist
The github-pr-workflow skill wraps the URL in double-quotes
('curl -H ... "https://api.github.com/..."'), which the original
allowlist regex (\s+https://api...) did not match. Without this,
the bundled github-pr-workflow skill is still blocked at every
cron tick despite #22605's fix landing for the bare-URL form.

Make the leading quote optional and add a regression test pinning
both single- and double-quoted forms.
2026-05-09 11:11:45 -07:00
qWaitCrypto
691778a08b fix(cron): keep auth-header exfiltration blocked 2026-05-09 11:11:45 -07:00
qWaitCrypto
783d11717a fix(cron): avoid github skill false positives in scanner 2026-05-09 11:11:45 -07:00
kshitijk4poor
9aefa74a9f feat(mcp): add codex preset for built-in MCP server discovery
Adds 'codex' to the _MCP_PRESETS registry so users can add it via

  Connecting to 'codex'...

  ✓ Connected! Found 2 tool(s) from 'codex':

    codex                                    Run a Codex session. Accepts configuration parameters matchi...
    codex-reply                              Continue a Codex conversation by providing the thread id and...

  Enable all 2 tools? [Y/n/select]:
  Cancelled. without manually specifying
the command and args.

Enables: codex mcp-server → Hermes native MCP client → Codex tools
available as first-class Hermes tools.
2026-05-09 11:11:28 -07:00
Teknium
684fd14db0 fix(dingtalk): align override signatures with base + guard Optional[error] in tests 2026-05-09 11:11:10 -07:00
qWaitCrypto
c705c7ac9b fix(dingtalk): clarify webhook media behavior 2026-05-09 11:11:10 -07:00
Wesley Simplicio
a33c63b9f8 fix(profiles): honour active_profile when HERMES_HOME points to hermes root
Problem:
After `hermes profile use NAME`, the gateway (started via systemd with
HERMES_HOME=/root/.hermes hardcoded) ignores the active profile and
always runs as the Default profile.  WebUI, Telegram, and all non-CLI
platforms are affected.

Root cause:
_apply_profile_override() contained an early-return guard:

    if profile_name is None and os.environ.get("HERMES_HOME"):
        return   # trust the inherited value

The intent was to let child processes inherit their parent's profile via
HERMES_HOME without redundantly re-reading active_profile.  But
systemd also sets HERMES_HOME — to the hermes root (/root/.hermes),
not a profile directory — so the guard fired and silently skipped the
active_profile check.  The user's `hermes profile use NAME` write to
~/.hermes/active_profile was never seen by the gateway process.

Fix:
Only skip the active_profile check when HERMES_HOME is already a
profile directory, identified by its immediate parent directory being
named "profiles" (e.g. ~/.hermes/profiles/coder or
/opt/data/profiles/coder).  When HERMES_HOME points to a root
directory (parent name != "profiles"), continue to read active_profile.

Tests:
- test_hermes_home_at_root_with_active_profile_is_redirected: the
  bug scenario — HERMES_HOME=/root/.hermes + active_profile=coder →
  HERMES_HOME must be redirected to .../profiles/coder.
  Stash-verified: FAILS without fix, PASSES with fix.
- test_hermes_home_already_profile_dir_is_trusted: child-process
  inheritance contract unchanged — .../profiles/coder is trusted as-is.
- test_hermes_home_unset_reads_active_profile: classic path unchanged.
- test_hermes_home_unset_default_profile_no_redirect: "default" still
  produces no redirect.
4/4 tests green.

Closes #22502.
2026-05-09 11:10:53 -07:00
briandevans
854c2ce309 fix(telegram): honor message.quote for partial-quote reply context
When a Telegram user replies using the native quote feature to select
only part of a prior message, _build_message_event was injecting the
ENTIRE replied-to message into reply_to_text via
message.reply_to_message.text/caption. python-telegram-bot exposes
the user-selected substring as message.quote (TextQuote.text); we now
prefer that and fall back to the full replied-to text only when no
native quote is present.

The agent-visible "[Replying to: \"...\"]" prefix can otherwise expand
the user's narrow quote into the full prior message, causing the agent
to act on unrelated actionable-looking text the user did not select
(e.g. multi-item briefings where the user quotes one bullet but the
prefix injects every bullet). Falls back cleanly when message.quote
is absent (PTB <21 or replies that don't quote a substring).

Fixes #22619

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 11:10:36 -07:00
Teknium
78b8155ecb chore: add xieNniu to AUTHOR_MAP 2026-05-09 11:10:04 -07:00
xieNniu
c8ede8aa1b fix(plugins): resolve Git binary for installs under minimal PATH
Resolve git via shutil.which with POSIX and Git-for-Windows fallbacks before clone and pull so Dashboard/API installs do not misreport Git as missing.

Add regression tests for the resolver and pull subprocess invocation.
2026-05-09 11:10:04 -07:00
qWaitCrypto
124fbb0af0 fix(gateway): refresh runtime argv metadata 2026-05-09 11:08:23 -07:00
JackJin
7d276bfbee fix(cli): expand composite toolset when mixed with configurables in platform_toolsets
When platform_toolsets[<platform>] contains both a composite (e.g.
hermes-cli) and at least one configurable opt-in (e.g. spotify), the
has_explicit_config branch in _get_platform_tools silently dropped the
composite, leaving sessions with only the configurable + plugin tools
and no native tools (terminal, file, web, browser, memory, etc.).

Mirror the else-branch's subset inference for composites that sit
alongside the configurables, but apply _DEFAULT_OFF_TOOLSETS only to the
implicit expansion so user-listed default-off toolsets (spotify,
discord) survive.
2026-05-09 11:08:05 -07:00
Teknium
1f4200debf
feat(delegate): show user's actual concurrency / spawn-depth limits in tool description (#22694)
The delegate_task tool description hardcoded 'default 3' / 'default 2' for
max_concurrent_children / max_spawn_depth, which misled the model on any
install that raised these limits — the schema text said 'default 3' even
when the user had set max_concurrent_children=15 / max_spawn_depth=3, so
the model would self-cap at 3 and never use the headroom.

Make the description dynamic. ToolEntry gains an optional
dynamic_schema_overrides callable; registry.get_definitions() merges its
output on top of the static schema before returning it. delegate_tool
registers a builder that reads the current delegation.* config and emits:

- 'up to N items concurrently for this user' (N = max_concurrent_children)
- 'Nested delegation IS enabled / OFF for this user (max_spawn_depth=N)'
- 'orchestrator children can themselves delegate up to M more level(s)'
- 'orchestrator_enabled=false' when the kill switch is set

The model_tools cache key already includes config.yaml mtime+size, so
edits to delegation.* in config invalidate the cached tool definitions
without an explicit hook. CLI_CONFIG staleness within a process is a
pre-existing limitation of _load_config and out of scope here.

Static description / tasks.description / role.description in
DELEGATE_TASK_SCHEMA are placeholders so module import doesn't trigger
cli.CLI_CONFIG load before the test conftest can redirect HERMES_HOME.
2026-05-09 11:07:53 -07:00
Teknium
000ddb8a93 chore: add SiliconID to AUTHOR_MAP 2026-05-09 11:07:37 -07:00
Matthew Cater
cda20eec0c fix(kanban): gate claim + unblock on parent completion
Enforce the parent-completion invariant at claim_task (the single
ready->running chokepoint) and re-gate unblock_task so blocked->ready
only fires when parents are done. Prevents child tasks from running
ahead of in-progress parents under the create-then-link race.

Also adds a stress test that races concurrent create+link against
hammered claim_task and asserts no child runs while any parent is undone.

Ref: kanban/boards/cookai/workspaces/t_a6acd07d/root-cause.md
Refs: t_8d6af9d6
2026-05-09 11:07:37 -07:00
Teknium
79694018f8
feat(plugins): HERMES_PLUGINS_DEBUG=1 surfaces plugin discovery logs (#22684)
Plugin authors had no easy way to figure out why their plugin wasn't
loading — failures were buried in agent.log at WARNING and skip reasons
(disabled, not enabled, depth cap, exclusive) were DEBUG-only and
invisible by default.

Set HERMES_PLUGINS_DEBUG=1 to attach a stderr handler at DEBUG to the
hermes_cli.plugins logger only. Surfaces:

  - which directories were scanned + manifest counts per source
  - per manifest: resolved key, name, kind, source, on-disk path
  - skip reasons (disabled, not enabled, exclusive, depth cap, no register)
  - per load: tools/hooks/slash/CLI commands the plugin registered
  - full traceback on YAML parse failure (exc_info on the existing warning)
  - full traceback on register() exceptions, pointing at the plugin author's line

Env var off (default) → zero new stderr output, same as before.

Touches only hermes_cli/plugins.py + a doc section in the plugin-build
guide + an entry in the env-vars reference. 3 new tests lock the
attach/idempotent/no-attach behavior.
2026-05-09 11:07:12 -07:00
Teknium
8f83046f6c
perf(google_chat): defer heavy google-cloud imports to first adapter use (#22681)
Plugin discovery imports every bundled platform plugin at model_tools
import time. The google_chat adapter unconditionally pulled in
google.cloud.pubsub_v1, googleapiclient, grpc, httplib2, and friends at
module top — about 33 MB RSS and 110 ms wall on every CLI invocation,
even ones that never construct a gateway adapter.

Wrap the heavy imports in _load_google_modules(): an idempotent loader
that rebinds the module-level globals (pubsub_v1, service_account,
HttpError, MediaFileUpload, …) on first call and is invoked from
GoogleChatAdapter.__init__, connect(), and check_google_chat_requirements().

The HttpError = Exception placeholder is preserved for the brief window
before the loader runs, so 'except HttpError as exc:' clauses stay
correct (Python looks up the name at try/except evaluation time, not
at function definition time).

Measured impact on a 9950X3D, 7-run medians:
  import cli:              895 → 787 ms  (-108 ms / -12%)
                           133 → 110 MB  ( -23 MB / -17%)
  import model_tools:      491 → 400 ms  ( -91 ms / -19%)
                            95 →  66 MB  ( -29 MB / -31%)
  google_chat alone:       244 → 132 ms  (-112 ms / -46%)
                            83 →  50 MB  ( -33 MB / -40%)
  hermes chat -q (cold):   177 → 145 MB  ( -32 MB / -18%)

Real-world win lands on every path that imports cli.py: hermes chat,
hermes gateway, cron jobs, batch runs, subagents. Long-lived gateway
processes save ~30 MB resident.

All 157 google_chat tests pass; full gateway suite (5050 tests) green.
2026-05-09 11:07:06 -07:00
Teknium
0d9800743c chore: add wesleysimplicio to AUTHOR_MAP 2026-05-09 11:06:21 -07:00
Wesley Simplicio
0c22434f03 fix(kanban): call recompute_ready after unlink_tasks removes a dependency
Problem:
unlink_tasks() removes a parent→child dependency edge but does not trigger
recompute_ready().  A child whose last blocking parent is unlinked stays
stuck in 'todo' indefinitely — it only promotes to 'ready' on the next
dispatcher tick or a manual 'hermes kanban recompute'.  For CLI-only users
without a dispatcher, the child is permanently stuck.

Root cause:
complete_task() and unblock_task() both call recompute_ready() after their
write transaction so downstream children are evaluated immediately.
unlink_tasks() was missing this call — removing a dependency is
semantically equivalent to completing one, so the same recompute is needed.

Fix:
Capture the rowcount result before the write_txn exits, then call
recompute_ready(conn) outside the transaction when a row was actually
deleted (so the child sees the updated task_links state).

Tests:
Added test_unlink_tasks_triggers_recompute_ready in
tests/hermes_cli/test_kanban_db.py: creates parent A (done) + parent C
(running), child B with both parents (todo), unlinks C→B, asserts B is
ready immediately.  Stash-verified: FAILS without fix (child stays todo),
PASSES with fix.
62/62 tests green in tests/hermes_cli/test_kanban_db.py.

Closes #22459.
2026-05-09 11:06:21 -07:00
Teknium
b9c001116e
feat: confirm prompt for destructive slash commands (#4069) (#22687)
/clear, /new, /reset, and /undo now ask the user to confirm before
discarding conversation state — three-option prompt routed through the
existing tools.slash_confirm primitive.

Native yes/no buttons render on Telegram, Discord, and Slack (their
adapters already implement send_slash_confirm); other platforms get a
text-fallback prompt and reply with /approve, /always, or /cancel.

The classic prompt_toolkit CLI uses the same three-option flow via the
established _prompt_text_input pattern (see _confirm_and_reload_mcp).
TUI keeps its existing modal overlay (#12312).

Gated by new config key approvals.destructive_slash_confirm (default
true). Picking 'Always Approve' flips the gate to false so subsequent
destructive commands run silently — matches the established
mcp_reload_confirm UX.

Out of scope: /cron remove (separate domain — scheduled jobs, not
session history). Existing TUI overlay env-var (HERMES_TUI_NO_CONFIRM)
left unchanged; cosmetic unification can come later.

Closes #4069.
2026-05-09 11:04:46 -07:00
ethernet
0cafe7d50d
Merge pull request #22510 from novax635/fix/gateway-slash-confirm-boundary-cleanup
fix gateway: clear slash confirm state during session boundary cleanup
2026-05-09 12:48:49 -04:00
ethernet
f1f42a7b9f
Merge pull request #22610 from uzunkuyruk/fix/telegram-table-row-label-duplicate-bullet
fix(telegram): exclude row-label column from bullet items in table re…
2026-05-09 11:47:45 -04:00
uzunkuyruk
8fdaf4d3d6 fix(telegram): exclude row-label column from bullet items in table rendering
When a GFM table has a row-label column (first column with no header),
_render_table_block_for_telegram incorrectly included the row-label cell
in the bullet zip alongside the data cells, producing a spurious bullet
like '• 維度: 核心賣點' before the real data rows.

Detect the row-label column by comparing the first data row cell count
against the header count (has_row_label_col = len(first_data_row) ==
len(headers) + 1). When present, use cells[0] as the heading and
zip headers against cells[1:] only, correctly excluding the row-label
from the bullet list.

Fixes #22604
2026-05-09 17:39:16 +03:00
kshitijk4poor
f6d45e5df4 chore: add nik1t7n to AUTHOR_MAP
Nikita Nosov (nik1t7n, PR #22264) — first-time contributor email
and noreply alias.
2026-05-09 04:34:55 -07:00
Nikita Nosov
1ac8deb3ca feat(gateway): stream Telegram edits safely 2026-05-09 04:34:55 -07:00
novax635
8b6501786c fix(gateway): clear slash-confirm state during session boundary cleanup 2026-05-09 14:18:20 +03:00
fahdad
cca2869d78 fix(banner): resolve update-check repo from running code, not profile-scoped path
check_for_updates() and _resolve_repo_dir() were preferring
$HERMES_HOME/hermes-agent/ over Path(__file__).parent.parent.resolve()
when looking for a .git checkout.  For profiles created with
--clone-all, $HERMES_HOME/hermes-agent/ points to a stale copy
with a frozen HEAD, causing persistent "N commits behind" banners
that never resolved.

Flip the resolution order: prefer the running code's location first,
fall back to $HERMES_HOME/hermes-agent/ only when the live checkout
doesn't have a .git (system-wide pip installs, distro packages).

The embedded-rev branch (HERMES_REVISION env var, set by nix builds)
is unaffected — it uses git ls-remote against upstream, never reads
the local checkout's HEAD.

Based on PR #21728 by @fahdad
2026-05-09 04:10:35 -07:00
donrhmexe
f7e514d4ad fix(profiles): exclude infrastructure artifacts when cloning with --clone-all
When the source profile is the default (~/.hermes), shutil.copytree()
was copying multi-GB infrastructure alongside the ~40 MB of actual
profile data: hermes-agent/ (repo checkout + 3 GB venv), .worktrees/,
profiles/ (sibling profiles — recursive!), bin/ (installed binaries),
node_modules/ (hundreds of MB).

Add _CLONE_ALL_DEFAULT_EXCLUDE_ROOT frozenset with these five entries
and pass an ignore callback to copytree().  Exclusions are gated on
the source actually being the default profile (is_default_source) so
named-profile sources are never affected.

Also exclude at any depth: __pycache__/, *.pyc, *.pyo, *.sock, *.tmp.
Profile data (config.yaml, .env, auth.json, state.db, sessions/,
skills/, logs/) is preserved intact — clone-all means 'complete
snapshot minus infrastructure'.

Mirrors the approach already used by _default_export_ignore() and
_DEFAULT_EXPORT_EXCLUDE_ROOT (the export-side exclusion set which is
broader because it produces a portable archive, not a live clone).

Co-authored-by: MustafaKara7 <karamusti912@gmail.com>
Co-authored-by: fahdad <30740087+fahdad@users.noreply.github.com>
Fixes #5022
Based on PRs #5025, #5026, and #21728
2026-05-09 04:10:35 -07:00