Commit graph

7927 commits

Author SHA1 Message Date
tymrtn
d1fc748def fix(kanban): /kanban slash command emits argparse garbage instead of help
Closes #21794.

`/kanban`, `/kanban help`, `/kanban --help`, and `/kanban <sub> -h`
all returned broken output to the gateway and interactive CLI. Three
underlying bugs in `hermes_cli.kanban.run_slash`:

1. argparse writes help to **stdout** but `run_slash` only captured
   stderr at parse time, so `-h` text was silently swallowed and
   replaced with the `(usage error: 0)` sentinel.
2. The wrapping parser used `prog="/"` and routed via a synthetic
   "_top → kanban" subparser, producing `usage: / kanban …` (stray
   space) and `usage: /kanban kanban …` (doubled token) in error text.
3. Bare `/kanban` and `/kanban help` dumped argparse's full ~3KB
   usage tree, which reads as visual garbage in a chat bubble.

Fix: drive the kanban_parser directly (no double-wrap), rewrite prog
strings on every leaf subparser, capture stdout AND stderr around
parse_args, distinguish SystemExit(0) (help — return captured stdout)
from SystemExit(2) (error — return single-line ⚠-prefixed message),
and add an explicit chat-friendly short-help block returned for bare
invocation and the help aliases (`help`, `--help`, `-h`, `?`).

Added 5 regression tests covering bare invocation, every help alias,
subcommand help, unknown action, and missing required arg.

Affects every chat platform via gateway/run.py::_handle_kanban_command
and the interactive CLI via cli.py::_handle_kanban_command.

Co-Authored-By: Nagatha (Claude Opus 4.7) <noreply@anthropic.com>
2026-05-09 22:49:29 -07:00
Teknium
3d2bfc502e
chore(models): refresh OpenRouter + Nous fallback lists (#23001)
Reorder Anthropic Opus 4.7/4.6 + Sonnet 4.6 to the top, cluster free
models at the bottom of the OpenRouter list, and mirror the same
ordering into the Nous portal list (paid models only).

- Add inclusionai/ring-2.6-1t:free
- Drop minimax-m2.5, minimax-m2.5:free, sonnet-4.5, mimo-v2.5,
  glm-5v-turbo, glm-5-turbo, trinity-large-preview:free,
  trinity-large-thinking, qwen3.5-plus-02-15
- Replace qwen3.5-35b-a3b with qwen3.6-35b-a3b
- Drop x-ai/grok-4.20-beta from the Nous list
2026-05-09 22:47:38 -07:00
Teknium
e2ce89a8aa chore: AUTHOR_MAP entry for li0near gmail (#21378) 2026-05-09 22:38:01 -07:00
li0near
6f2d60559e fix(kanban): drop redundant init_db() in gateway watchers (#21378)
Both `_kanban_notifier_watcher` and `_kanban_dispatcher_watcher`'s
`_tick_once_for_board` called `_kb.connect(board=slug)` immediately
followed by `_kb.init_db(board=slug)`. Since `connect()` already runs
the schema + idempotent migration on first open per process, the
explicit `init_db()` was redundant — and worse, `init_db()` deliberately
busts the per-process `_INITIALIZED_PATHS` cache and re-runs the migration
on a *second* connection that races the first.

On every cold gateway start against a legacy DB this surfaced as either
`sqlite3.OperationalError: duplicate column name: <col>` or intermittent
`database is locked` errors logged at the first tick. The duplicate-column
case is now tolerated by `_add_column_if_missing` (commit 78698381a), but
the wasted second migration plus the database-is-locked race remain
fixable by skipping the redundant call entirely.

Drops `_kb.init_db(board=slug)` at both call sites and adds a regression
test in `tests/hermes_cli/test_kanban_notify.py` that pins the absence
via source inspection plus a runtime spy.

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-05-09 22:38:01 -07:00
Teknium
68e44642c8
fix(stream-retry): collapse two-line drop status, name provider, and let agent.log capture diagnostics (#22993)
Subagent stream drops were spamming the parent terminal with two lines
per blip ('Connection dropped...' + 'Reconnected...') while leaving zero
breadcrumb in agent.log to debug them.

Two underlying bugs, fixed together:

1. quiet_mode raised the run_agent/tools/etc. loggers to ERROR, which
   filters records before root-logger file handlers see them. The comment
   claimed 'File handlers still capture everything' — that was wrong.
   Removed in both run_agent.py and cli.py; console quietness already
   comes from hermes_logging not installing a console StreamHandler in
   non-verbose mode.

2. The stream-retry blocks emitted two _emit_status calls per drop
   ('⚠️ Connection dropped... Reconnecting...' + '🔄 Reconnected —
   resuming…') with no provider name, so multi-provider sessions had to
   dig through agent.log to attribute a drop. Replaced both call sites
   with a single _emit_stream_drop helper that emits ONE line naming the
   provider and error class, and always writes a structured WARNING to
   agent.log with subagent_id, depth, provider, base_url, error_type.

Net UX change: 6 lines per triple-subagent drop → 3 lines, each
naming the provider. agent.log now has a structured breadcrumb per
retry that didn't exist before.

Tests: 6 new tests in tests/run_agent/test_stream_drop_logging.py
covering the logger-level guard, structured WARNING content, single
status line per drop (no Reconnected follow-up), and provider naming.
2026-05-09 22:35:35 -07:00
Teknium
3800972dd0
feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955)
When the active main model has native vision and the provider supports
multimodal tool results (Anthropic, OpenAI Chat, Codex Responses, Gemini
3, OpenRouter, Nous), vision_analyze loads the image bytes and returns
them to the model as a multimodal tool-result envelope. The model then
sees the pixels directly on its next turn instead of receiving a lossy
text description from an auxiliary LLM.

Falls back to the legacy aux-LLM text path for non-vision models and
unverified providers.

Mirrors the architecture used in OpenCode, Claude Code, Codex CLI, and
Cline. All four converge on the same pattern: tool results carry image
content blocks for vision-capable provider/model combinations.

Changes
- tools/vision_tools.py: _vision_analyze_native fast path + provider
  capability table (_supports_media_in_tool_results). Schema description
  updated to reflect new behaviour.
- agent/codex_responses_adapter.py: function_call_output.output now
  accepts the array form for multimodal tool results (was string-only).
  Preflight validates input_text/input_image parts.
- agent/auxiliary_client.py: _RUNTIME_MAIN_PROVIDER/_MODEL globals so
  tools see the live CLI/gateway override, not the stale config.yaml
  default. set_runtime_main()/clear_runtime_main() helpers.
- run_agent.py: AIAgent.run_conversation calls set_runtime_main at turn
  start so vision_analyze's fast-path check sees the actual runtime.
- tests/conftest.py: clear runtime-main override between tests.

Tests
- tests/tools/test_vision_native_fast_path.py: provider capability
  table, envelope shape, fast-path gating (vision-capable model uses
  fast path; non-vision model falls through to aux).
- tests/run_agent/test_codex_multimodal_tool_result.py: list tool
  content becomes function_call_output.output array; preflight
  preserves arrays and drops unknown part types.

Live verified
- Opus 4.6 + Sonnet 4.6 on OpenRouter: model calls vision_analyze on a
  typed filepath, gets pixels back, reads exact text from images that
  no aux description could capture (font color irony, multi-line
  fruit-count list, etc.).

PR replaces the closed prior efforts (#16506 shipped the inbound user-
attached path; this PR closes the gap for tool-discovered images).
2026-05-09 21:06:19 -07:00
Teknium
e62250453b
docs(user-stories): add 18 verified social entries (99 → 117) (#22920)
Found 18 real Hermes-Agent stories from HN, X, and Reddit not yet
captured on the page. All URLs HTTP-verified to return 200 with
matching titles.

Reddit (15): r/hermesagent (Obsidian-as-memory writeup at 794 upvotes,
LLM cheatsheet at 635 upvotes, Kanban game-changer post, OpenRouter #1
ranking, AMA from the Nous team, etc.); r/LocalLLaMA, r/Rag,
r/openclaw, r/SideProject, r/LocalLLM threads where users describe
their actual setups (Qwen3.5-9b on 16gb VRAM, 5060Ti + Telegram, smart
routing tiers).

X (3): @vmiss33's 'what I use Hermes for' guide, @HeyYanvi's
X-to-NotebookLM podcast workflow, @ExileAI_0's spare-laptop Iris
running RenPy + ComfyUI, @brucexu_eth's Hermes Inc. Telegram startup
sim from the hackathon, Hype's deep-dive blog.

HN (1): 'I'm using Hermes — sandbox it like any agent.'

No component changes — all new entries fit the existing schema
(real URL, real author, real date).
2026-05-09 20:58:09 -07:00
Clooooode
998676dd0c chore(test): comment of test case rewrite to english
Some checks are pending
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Docker Build and Publish / move-latest (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (push) Waiting to run
Tests / e2e (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
2026-05-09 19:31:41 -07:00
Clooooode
a4036654f1 fix(kanban): remove blocked kind from unsub 2026-05-09 19:31:41 -07:00
Clooooode
dd49d50389 test(kanban): assert re-block notification is delivered after unblock cycle
Adds test_notifier_second_blocked_delivers to cover the case where a
task is blocked, unblocked, then blocked again — the second blocked
event must still deliver a gateway notification.

Currently fails because blocked is treated as a terminal event kind,
causing the subscription to be dropped after the first block.
2026-05-09 19:31:41 -07:00
Tranquil-Flow
8954537f95 fix(kanban): request default board explicitly (#21819) 2026-05-09 19:31:32 -07:00
Teknium
eb3db231dc chore: AUTHOR_MAP entry for eloklam (#22898) 2026-05-09 19:31:14 -07:00
eloklam
d04a0b81ee docs(skills): clarify kanban fan-out decomposition 2026-05-09 19:31:14 -07:00
Teknium
08ec602770
fix(tool-result-storage): persist via stdin to bypass 128 KB exec-arg cap (#22913)
Linux's MAX_ARG_STRLEN caps any single argv element at 128 KB
(32 * PAGE_SIZE). The previous heredoc-in-the-command-string approach
in _write_to_sandbox put the entire tool result inside the 'bash -c'
arg, so any result over ~128 KB raised OSError [Errno 7] 'Argument
list too long' before the heredoc ever ran. The caller logged a
warning, but quiet_mode (CLI default) sets tools.* to ERROR — so the
warning never reached agent.log either, and the agent saw a 1.5 KB
preview tagged 'Full output could not be saved to sandbox'. Hits
delegate_task with 3+ subagent outputs routinely now.

Switch to passing content via env.execute(stdin_data=...). cmd is
now just 'mkdir -p X && cat > Y' (under 1 KB), and the heavyweight
payload travels through stdin where there is no argv-element limit.

E2E reproduced the user's exact 144,778-char delegate_task envelope:
old code OSError'd, new code round-trips cleanly to disk with all
three task summaries intact.
2026-05-09 18:44:58 -07:00
Teknium
ded194eb6a
chore(skills): move heavy training skills + outlines to optional-skills (#22912)
These skills require heavy GPU/CUDA stacks or are niche enough that they shouldn't
be active by default. Moved to optional-skills/ where users opt-in via
`hermes skills install official/...`.

Moved:
- mlops/training/axolotl
- mlops/training/trl-fine-tuning
- mlops/training/unsloth
- mlops/inference/outlines

Counts: 91 -> 87 built-in, 72 -> 76 optional.

Auto-regenerated docs (per-skill pages + catalogs) reflect the move.
2026-05-09 18:44:12 -07:00
Teknium
4375b82cd9
feat(curator): show rename map in user-visible summary (#22910)
* feat(curator): show rename map (where skills went) in user-visible summary

The full data has always been on disk in REPORT.md, but the user-visible
curator summary (gateway 💾 line, CLI session-start panel,
`hermes curator status`) was counts-only — "consolidated 4 into 2
umbrellas" with no names. Users only discovered renames when something
they expected was gone.

New `_build_rename_summary()` formats the rename map and appends it to
`final_summary`:

    auto: 1 marked stale; llm: consolidated 2 into 1, pruned 1
    archived 3 skill(s):
      • docx-extraction → document-tools
      • pdf-extraction → document-tools
      • old-stale-thing — pruned (stale)
    full report: hermes curator status

Empty on no-op ticks (no archives), so most ticks add zero log noise.
Cap of 10 entries keeps agent.log readable when a 50-skill
consolidation lands; the full list is always in REPORT.md.

`hermes curator status` indents continuation lines so the multi-line
summary reads as one logical field.

5 new tests in tests/agent/test_curator_classification.py covering
empty / consolidation / pruning / cap / mixed cases.

* feat(curator): show recent run summary once on `hermes update`

The rename map is now visible from where users actually look — the
update flow they explicitly run, instead of just the live gateway log
or transient CLI session-start panel.

Behavior:
- After `hermes update`, if the most recent curator run produced a
  rename map (multi-line summary) that the user hasn't seen yet, print
  it once with a 'last run Xh ago' header and a one-time-message
  footer.
- Stamp `last_run_summary_shown_at = last_run_at` after printing so
  subsequent `hermes update` invocations are silent until a newer
  curator run lands.
- Silent on no-op runs (single-line summary like 'auto: no changes;
  llm: no change'). Still stamps shown so we don't reconsider on
  every update.
- Silent when the curator has never run (the existing first-run
  notice handles that case).

Output:

    ℹ Skill curator — last run 4h ago
      auto: 1 marked stale; llm: consolidated 2 into 1, pruned 1
      archived 3 skill(s):
        • docx-extraction → document-tools
        • pdf-extraction → document-tools
        • old-stale-thing — pruned (stale)
      full report: hermes curator status
      (This message shows once per curator run. View anytime: hermes curator status)

State migration:
- `_default_state()` gains `last_run_summary_shown_at: None`. Existing
  state files lack the field; `.get()` returns None; the comparison
  treats any prior run as 'not yet shown' and prints once on next
  update. Self-healing.

Wiring:
- Both `hermes update` paths in main.py call the new
  `_print_curator_recent_run_notice()` right after the existing
  first-run notice. Best-effort try/except so a state-load bug
  never breaks the update flow.

6 tests in tests/hermes_cli/test_curator_recent_run_notice.py:
no-run / single-line / multi-line / show-once / new-run-resets /
time-formatter buckets.
2026-05-09 18:43:40 -07:00
Teknium
b67ea7ff47
perf(cli): skip welcome banner on chat -q single-query mode (#22904)
`hermes chat -q "..."` printed the full welcome banner before
running the query — kawaii ASCII logo, available toolsets list,
available skills list, model name, session ID, working directory,
update-available notice. Building it took ~420 ms on cold start
(~200 ms version-update probe, the rest is toolset / skill enumeration
plus Rich panel rendering).

For a one-shot `-q` query the banner is noise: the user already
picked the prompt, doesn't need a toolset reference, and gets the
session ID + resume hint from `_print_exit_summary()` after the
response prints.

The fully-quiet `-Q` / `--quiet` machine-readable path was already
banner-free; this brings the human-facing single-query path in line
so all non-interactive invocations are fast.

Measured impact (`hermes chat -q "ok" --max-turns 1`, 10-run
percentiles, 9950X3D):
  median:  1.90 → 1.75 s  (-150 ms)
  min:     1.80 → 1.73 s  ( -70 ms)
  P25:     1.82 → 1.74 s  ( -80 ms)

Wider variance than expected; the banner cost overlaps with API
latency on real `chat -q` runs. Min-time delta of 70 ms is the
cleanest signal — that's the deterministic banner-build cost gone.
The 150 ms median delta picks up cases where the version-update
probe also finishes during the wait.

Interactive mode (`hermes` with no `-q`) and the `--list-tools` /
`--list-toolsets` one-shot listing commands still show the banner —
those are the contexts where it's actually wanted.

Tests: 656/656 `tests/cli/` pass on top of latest main (modulo 5 pre-
existing flakes in `test_cli_save_config_value.py` that fail with
`No module named 'ruamel'` both with and without this change).
2026-05-09 18:20:28 -07:00
Teknium
5971a4e092
feat(docs): richer info panels on the Skills Hub for built-in + optional skills (#22905)
The Skills Hub at /skills had cards that, when expanded, showed only the
one-line description, tags, author, version, and an install command. For
the 163 bundled and optional skills shipped with the repo, this was thinner
than the data we already have on disk.

Three changes, all under website/:

1. extract-skills.py now pulls four extra fields per local skill:
   - 'overview' — first non-heading body paragraph from SKILL.md (stripped
     of admonitions/code fences, capped at ~500 chars at a sentence boundary)
   - 'envVars' / 'commands' — from the prerequisites: block in frontmatter
   - 'license' — from the top-level frontmatter
   - 'docsPath' — slug to the per-skill /docs/user-guide/skills/.../* page,
     computed with the same logic as generate-skill-docs.py

   162 of 163 local skills get a non-empty overview automatically. The
   remaining one (media/heartmula) has only headings/code in its body and
   falls through to the description.

2. Skill TS interface + SkillCard expanded-panel render the new fields:
   - Overview paragraph at the top of the panel
   - Prerequisites box (env vars + required commands) when frontmatter
     declares them
   - License row alongside author/version
   - 'View full documentation →' link to the per-skill docs page

   Search now covers the overview text too, so users can find skills by
   matching content from inside SKILL.md, not just the one-line description.

3. styles.module.css gains six new classes (overviewBlock, detailLabel,
   overviewText, prereqBlock/Row/Kind/List/Item, docsLink) styled to match
   the existing dark panel aesthetic.

External / community skills (Anthropic, LobeHub, Claude Marketplace cached
indexes) keep the old behavior — overview is empty, no prereqs, no docsPath.

Validation: 'npm run build' clean (exit 0); broken-link count unchanged at
155 baseline; all 163 generated docsPath values resolve to existing pages
under website/docs/user-guide/skills/.
2026-05-09 18:17:39 -07:00
Teknium
da086a0154 chore: add ming1523 to AUTHOR_MAP 2026-05-09 17:55:12 -07:00
ming
85383c6363 fix(cli): preserve config comments on setting writes 2026-05-09 17:55:12 -07:00
Teknium
de54618720 chore: add v1b3coder to AUTHOR_MAP 2026-05-09 17:54:58 -07:00
v1b3coder
4fdaf0b4d8 fix: use credential_pool for custom endpoint model listing probes
Same-provider /model switches on a 'custom' endpoint kept stale credentials
because (a) _resolve_named_custom_runtime's bare-custom + explicit_base_url
path went straight to OPENAI_API_KEY/OPENROUTER_API_KEY env fallbacks
without consulting the credential pool, and (b) switch_model() guarded
against custom-provider re-resolution to preserve base_url, locking in
the prior api_key.

Now the bare-custom path queries the credential pool first (mirroring
the named-custom-provider branch behavior), and the same-provider switch
guard is removed since resolve_runtime_provider has since grown a robust
custom-resolution path that preserves base_url from model_cfg.

Refs #18681 (the gateway-side api_key wiring is still separate),
#16254, #12919.
2026-05-09 17:54:58 -07:00
Teknium
f93b8c28e3 chore: add DanielLSM to AUTHOR_MAP 2026-05-09 17:54:44 -07:00
Daniel Marta
1fb9f7c68c fix(gateway): pass max_total_size_mb and max_file_size_mb to CheckpointManager
The /rollback command handler in gateway/run.py was constructing
CheckpointManager with only enabled and max_snapshots, omitting
max_total_size_mb and max_file_size_mb that the __init__ expects.
This caused a TypeError on every /rollback invocation when checkpoints
were enabled.

Fixes: NousResearch/hermes-agent#18841
2026-05-09 17:54:44 -07:00
Teknium
4ca7c2104d test(gateway): stub /proc unavailability in find_gateway_pids fallback test
Follow-up test fix for #22693 — the existing test for ps-failure +
pid-file fallback needed the /proc walk path stubbed too since /proc
is now consulted first.
2026-05-09 17:54:17 -07:00
Wesley Simplicio
6bf7ac3185 fix(gateway): detect gateway process via /proc in Docker without procps
Salvage of NousResearch/hermes-agent#7622.

Docker images often lack procps so `ps` is unavailable.  Try reading
/proc/*/cmdline first (works in any Linux container) and fall back to
`ps -A eww` only when /proc is not present.  PermissionError on
individual PIDs is silently skipped.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 17:54:17 -07:00
Teknium
2ffef15675
fix(test_gateway): stop run_gateway() tests from rewriting the dev's installed systemd unit (#22900)
run_gateway() calls refresh_systemd_unit_if_needed() on every invocation
so restart settings stay current after exit-code-75 respawns. The
user-scope unit path resolves under Path.home() (NOT sandboxed by
conftest, only HERMES_HOME is), and generate_systemd_unit() bakes the
current HERMES_HOME into the unit's Environment= line.

Result: any test that exercises run_gateway() end-to-end on a real
Linux dev box silently rewrites the developer's installed
~/.config/systemd/user/hermes-gateway.service with a polluted
HERMES_HOME pointing at /tmp/pytest-of-<user>/.../hermes_test. On the
next reboot, systemd loads that unit, the gateway starts looking at an
empty tmp dir, and Telegram/Discord/etc. all show as 'No messaging
platforms enabled' even though the user's real config is fine. Three
tests in tests/hermes_cli/test_gateway.py hit this path:
test_run_gateway_exits_cleanly_on_keyboard_interrupt,
test_run_gateway_exits_nonzero_when_start_gateway_reports_failure, and
test_run_gateway_root_guard_has_escape_hatch.

Two-layer fix:

1. _install_fake_gateway_run helper (covers all four run_gateway() call
   sites in test_gateway.py and any future ones) now also stubs
   supports_systemd_services and refresh_systemd_unit_if_needed.

2. refresh_systemd_unit_if_needed() itself sniffs the generated unit
   body for /pytest-of- and /hermes_test markers and refuses to write
   when present. Defense in depth so a future test that bypasses the
   helper still can't corrupt the dev's gateway. Tests that legitimately
   exercise the refresh flow (test_run_gateway_refreshes_outdated_unit_on_boot)
   patch generate_systemd_unit to return synthetic content that doesn't
   carry those markers, so they keep working.

Adds test_refresh_refuses_to_bake_pytest_tmpdir_into_real_user_unit as a
regression test for the source-side guard.
2026-05-09 17:54:09 -07:00
Wesley Simplicio
4f8d8ad912 fix(error_classifier): classify generic-typed timeout messages as transient (carve-out of #22664)
RuntimeError('claude CLI turn timed out') from a local OpenAI-compatible
shim was falling through to FailoverReason.unknown, surfacing as 'Empty
response from model' and burning 3 retry slots on the same failing
endpoint. _classify_by_message had no timeout-message branch — only
billing/rate_limit/auth/context_overflow/model_not_found patterns. The
type-based check at line 565 also requires isinstance(error, (TimeoutError,
ConnectionError, OSError)) — a plain RuntimeError doesn't match.

Add _TIMEOUT_MESSAGE_PATTERNS for 'timed out', 'deadline exceeded',
'request timed out', 'operation timed out', 'upstream timed out', 'turn
timed out'. _classify_by_message returns FailoverReason.timeout (retryable=True)
when any pattern matches.

Salvage of #22664's classifier portion. The original PR also bundled a
fallback self-selection guard which is now redundant (already on main
via #22780) plus DeepSeek thinking and session_search fixes that are
their own separate concerns.

Follow-up to #22780 — fixes the still-broken classification of
generic-typed provider-shim timeouts that #22780's dedup didn't cover.
2026-05-09 17:54:07 -07:00
Wesley Simplicio
6ddc48b058 fix(fallback): resolve api_key_env in fallback chain entries (carve-out of #22665)
Fallback chain entries with 'api_key_env: ENV_VAR_NAME' weren't being
resolved by either the init-time fallback path (line ~1660) or the
runtime _try_activate_fallback path (line ~8045). Only literal
'api_key' was honored; the snake_case 'api_key_env' alias documented
elsewhere in the config was silently dropped, so a 'provider: custom'
fallback with base_url + api_key_env worked as primary but failed as
fallback with 'no endpoint credentials found' / 401.

Adds 'or fb.get("api_key_env")' to the existing 'key_env' lookup in
both call sites, with empty-string-to-None coercion so unset env vars
don't poison the resolver.

Salvage of #22665's fallback portion. The original PR also bundled
gateway-degrade-on-no-adapters changes (those land via the carve-out
in #22853 which is the same code) and run_agent.py memory-nudge
counter hydration (issue #22357 territory, not mentioned in the
title). Drops both bundled pieces; keeps just the api_key_env fix.

Closes #5392.
2026-05-09 17:53:56 -07:00
Wesley Simplicio
246c676c2b fix(gateway): degrade gracefully when all platform adapters are missing
When connected_count == 0 AND enabled_platform_count > 0, the gateway
treated 'all adapters returned None' identically to 'all adapters
failed to connect' — both as fatal startup errors. The 'returned None'
case happens when imports fail silently or when adapters are present
in config but their dependencies aren't installed (e.g. discord.py
missing). Cron jobs and other gateway-runtime work would unnecessarily
fail to start.

Split: only return False when startup_retryable_errors is non-empty
(real connection attempt failed). When the list is empty AND enabled
> 0, log a warning and continue running, matching the 'no platforms
enabled' cron path.

Salvage of #22642's gateway slice. Drops the bundled run_agent.py
memory-nudge counter hydration block (issue #22357 territory) which
wasn't mentioned in the PR description.

Closes #5196.
2026-05-09 17:53:46 -07:00
Wesley Simplicio
116a1446a4 fix(terminal): bridge docker_env config to TERMINAL_DOCKER_ENV
Problem: terminal.docker_env set in config.yaml was silently ignored.
Docker containers never received the user-specified env vars.

Root cause: docker_env was missing from all three config→env bridging
maps (cli.py env_mappings, gateway/run.py _terminal_env_map,
hermes_cli/config.py _config_to_env_sync) and from the terminal_tool
_get_env_config() reader. _create_environment() consumed the key from
container_config correctly, but it was always {} because TERMINAL_DOCKER_ENV
was never set.

Also extend the list-serialisation branches in cli.py and gateway/run.py
to handle dict values via json.dumps (lists already used json.dumps;
plain str() on a dict produces undecodable output).

Fix:
- cli.py: add "docker_env": "TERMINAL_DOCKER_ENV" to env_mappings;
  serialise dict values with json.dumps alongside existing list path
- gateway/run.py: same additions to _terminal_env_map and serialisation
- hermes_cli/config.py: add "terminal.docker_env": "TERMINAL_DOCKER_ENV"
  to _config_to_env_sync so `hermes config set terminal.docker_env …`
  persists to .env correctly
- tools/terminal_tool.py: add docker_env key to _get_env_config() reading
  TERMINAL_DOCKER_ENV via _parse_env_var with default "{}"

Tests: add test_docker_env_is_bridged_everywhere to
tests/tools/test_terminal_config_env_sync.py — stash-verified: fails on
origin/main, passes with fix.

Fixes #20537
2026-05-09 17:53:35 -07:00
Wesley Simplicio
53ec32819c fix(process_registry): kill orphaned Popen on post-spawn setup failure
After Popen succeeds with os.setsid (detached process group), 5 things
happen with no try/except: Thread construction, reader.start(), lock
acquisition, prune+register, checkpoint write. If any raises, the
Popen object goes unregistered and the detached process group leaks
indefinitely.

Wrap the post-spawn setup in try/except. On failure:
  - os.killpg(getpgid(pid), SIGKILL) takes down the entire process
    group (not just the shell - important because of detached PG +
    -lic shell wrapper that may have spawned children)
  - proc.kill() fallback for ProcessLookupError/PermissionError/OSError
  - proc.wait(timeout=5) reaps with a bound
  - re-raise to preserve original traceback
Nested try/except around cleanup so a secondary failure can't mask the
original.

Closes #2749.
2026-05-09 17:53:24 -07:00
Teknium
c179bdab3c fix(install): also patch psutil on Termux fresh-install path
The Termux update path (PR #22814) prebuilds psutil from a marker-patched
sdist so 'platform android is not supported' doesn't kill it. The same
psutil setup.py error blocks fresh installs via scripts/install.sh — only
the update path was wired up. Without this, a brand-new Termux user can't
get past the very first 'pip install -e .[termux-all]' call.

- New scripts/install_psutil_android.py — standalone version of the same
  patcher hermes_cli/main.py uses, callable from bash.
- scripts/install.sh detects sys.platform == 'android' and runs the
  patcher before pip install.
- TODO note added to both copies pointing at upstream
  https://github.com/giampaolo/psutil/pull/2762; remove both when that
  ships.

Note: we keep psutil as a base dep on Android (do not adopt the proposed
sys_platform != 'android' marker in pyproject). Removing it would crash
five unguarded 'import psutil' sites at runtime
(tools/code_execution_tool.py, tools/tts_tool.py, tools/process_registry.py
(2x), gateway/platforms/whatsapp.py).
2026-05-09 17:53:15 -07:00
adybag14-cyber
6d5d467d39 fix(update): use termux-all uv fallback path on Termux 2026-05-09 17:53:15 -07:00
adybag14-cyber
3863d6d344 fix(update): prebuild psutil on Termux Android via Linux path shim 2026-05-09 17:53:15 -07:00
Wesley Simplicio
2245879af0 fix(checkpoint): guard _touch_project against non-dict project metadata
Problem
=======
`tools.checkpoint_manager._touch_project` reads the project metadata
file with `json.loads(meta_path.read_text(...))`, then immediately does:

    meta["workdir"] = str(_normalize_path(working_dir))

The `except` block only catches `(OSError, ValueError)`.  When the file
parses successfully but returns a non-dict value (a list `[]`, `null`,
or a scalar from a corrupted or hand-truncated write), `json.loads`
succeeds without error and `meta` is set to, e.g., `[]`.  The subsequent
subscript assignment then raises `TypeError: list indices must be
integers or slices, not str`, which is NOT caught by the narrow except
clause.

This TypeError propagates up through `_take` to `ensure_checkpoint`,
where the broad `except Exception` safety net swallows it.  The effect
is that `ensure_checkpoint` silently returns False for the entire
session — all checkpoints are skipped for the affected working directory
without any user-visible error.

Root cause
==========
Missing `isinstance(meta, dict)` guard after `json.loads`, identical in
pattern to bugs fixed in `cron/jobs.py` (#22569) and
`tools/process_registry.py` (#22544).  The same guard is already
present one function below in `_list_projects` (line 506), but was
inadvertently omitted in `_touch_project`.

Fix
===
Add two lines after the try/except:

```python
if not isinstance(meta, dict):
    meta = {}
```

This matches the existing guard in `_list_projects` and ensures a fresh
empty dict is used whenever the persisted value is not a mapping —
preserving the `created_at` semantics via `setdefault` on the next line.

Tests
=====
`TestTouchProjectMalformedMeta` covers four non-dict root values
(`[]`, `null`, `42`, `"oops"`).  Each writes a corrupted metadata file,
calls `_touch_project`, and asserts: (a) no exception raised, (b) the
metadata file is rewritten as a valid dict containing `last_touch` and
`workdir`.  All four fail on main with `TypeError`, pass with fix.
Full `tests/tools/test_checkpoint_manager.py` regression: 77 passed.
2026-05-09 17:53:13 -07:00
Wesley Simplicio
058c50816c fix(session): route OR-combined short CJK tokens to LIKE fallback (#20494)
The FTS5 trigram tokenizer requires >=3 CJK characters per individual
token to produce matchable trigrams. A query like "广西 OR 桂林 OR 漓江"
has cjk_count=6 (passes the existing >=3 guard) but each token is only
2 CJK chars, so the trigram index returns 0 results.

Fix:
- Add per-token check: if any non-operator CJK token has <3 CJK chars,
  force the LIKE fallback path regardless of total cjk_count.
- Expand the LIKE fallback to build one LIKE condition per non-operator
  token joined with OR, so each term is matched independently.

Regression tests added in TestCJKSearchFallback:
- test_cjk_or_combined_short_tokens_returns_results
- test_cjk_short_token_or_query_preserves_filters
2026-05-09 17:53:02 -07:00
Wesley Simplicio
35f773c459 fix(context_compressor): treat streaming premature-close as transient error
Problem:
When a provider or proxy drops a streaming response mid-flight (httpcore
raises RemoteProtocolError: "incomplete chunked read", "peer closed
connection", "response ended prematurely", etc.), _generate_summary
would not classify it as a transient error.  Instead of retrying on the
main model, it entered the generic 60-second cooldown, leaving context
growing unbounded until the cooldown expired.  Issue #18458.

Root cause:
_is_connection_error in auxiliary_client.py did not match httpcore's
streaming premature-close error substrings.  context_compressor.py's
_generate_summary except block never called _is_connection_error, so
those errors fell through to the 60-second generic cooldown rather than
triggering the retry-on-main fallback path used for timeouts.

Fix:
1. auxiliary_client.py — extend _is_connection_error keyword list with:
   "incomplete chunked read", "peer closed connection",
   "response ended prematurely", "unexpected eof",
   "remoteprotocolerror", "localprotocolerror".
   Also guard the `from openai import ...` with try/except ImportError
   so the function works in environments without the openai package.
2. context_compressor.py — import _is_connection_error and call it in
   _generate_summary's except block as _is_streaming_closed.  Include
   _is_streaming_closed in the fallback-to-main condition (alongside
   _is_model_not_found, _is_timeout, _is_json_decode) and use the
   shorter 30s transient cooldown for streaming-closed errors.

Tests:
4 new regression tests in TestStreamingClosedFallback:
- test_incomplete_chunked_read_falls_back_to_main
- test_peer_closed_connection_falls_back_to_main
- test_streaming_closed_on_main_uses_short_cooldown  (stash-verified)
- test_non_streaming_unknown_error_still_uses_long_cooldown

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 17:52:51 -07:00
heathley
0c5c4d1b8d fix(skills-hub): cover remaining SSRF fetch paths after #10029 2026-05-09 17:52:12 -07:00
Teknium
af9df46525 chore: add kidonng to AUTHOR_MAP 2026-05-09 17:51:04 -07:00
Kid
1321bcf5fe fix(gateway): finalize final stream edit on done 2026-05-09 17:51:04 -07:00
Teknium
c1cc3d4ea6
perf(image_gen): defer fal_client import to first generation request (#22859)
`tools/image_generation_tool.py` did `import fal_client` at module
top, which pulled the entire fal_client + httpx + rich stack on every
process that ran `discover_builtin_tools()` — every `hermes` cold
start, even ones that never touch image generation.

Make the import lazy: replace the eager import with a placeholder
(`fal_client: Any = None`) and add an idempotent `_load_fal_client()`
that rebinds the module global on first use. Call it from the two
runtime entry points (`_ManagedFalSyncClient.__init__` and
`_submit_fal_request`) and from the SDK-presence check in
`check_image_generation_requirements`.

The loader short-circuits if the global is already truthy, which
preserves the test pattern of monkeypatching `fal_client` to install
a mock — the `monkeypatch.setattr(image_tool, "fal_client", ...)`
calls in test_image_generation.py keep working unchanged.

Measured impact (15-run min times, 9950X3D):
  tools.image_generation_tool alone:  77 → 20 ms  (-74%)
                                      36 → 20 MB   (-44%)
  import cli (full):                 734 → 720 ms  (-2%)
  import model_tools:                372 → 366 ms  (-2%)

The microbench is dramatic but the full-CLI win is small — fal_client
shares its httpx + rich dependencies with the rest of the agent, so
on a real cold start most of the 16 MB / 64 ms is already paid by
other imports. The win matters mostly for processes that touch this
tool without otherwise loading httpx (rare) and for architectural
consistency with the previous lazy-load PRs (#22681 google_chat,
#22831 teams).

Tests: 55/55 `tests/tools/test_image_generation.py` pass, including
the cases that monkeypatch the module global to install a mock
fal_client. End-to-end verification confirms `import model_tools`
no longer pulls `fal_client` into `sys.modules`.
2026-05-09 17:45:09 -07:00
Teknium
fef1a41248
docs: round 2 audit — messaging, developer-guide, guides, integrations (#22858)
Cross-checked 75 docs pages under user-guide/messaging/, developer-guide/,
guides/, and integrations/ against the live registries and gateway code.

messaging/
- index.md: API Server toolset is hermes-api-server (was 'hermes (default)');
  Google Chat slug is hermes-google_chat (underscore — plugin name uses _).
- google_chat.md: drop bogus 'pip install hermes-agent[google_chat]' (no such
  extra); list the actual deps (google-cloud-pubsub, google-api-python-client,
  google-auth, google-auth-oauthlib).
- qqbot.md: config namespace is platforms.qqbot (was platforms.qq, which is
  silently ignored by the adapter); QQ_STT_BASE_URL is not read directly —
  baseUrl lives under platforms.qqbot.extra.stt.
- teams-meetings.md: 'hermes teams-pipeline' is plugin-gated (teams_pipeline
  plugin must be enabled), not a built-in subcommand.
- sms.md: example log line 0.0.0.0:8080 -> 127.0.0.1:8080 (default
  SMS_WEBHOOK_HOST).
- open-webui.md: API_SERVER_* are env vars, not YAML keys — write them to
  per-profile .env, not 'hermes config set' (same pattern fixed in
  api-server.md last round). Also bumped example ports to 8650+ to dodge the
  default webhook (8644)/wecom-callback (8645)/msgraph-webhook (8646)
  collision.

developer-guide/
- architecture.md: tool/toolset counts (61/52 -> 70+/~28); LOC stamps for
  run_agent.py, cli.py, hermes_cli/main.py, setup.py, mcp_tool.py,
  gateway/run.py replaced with 'large file' to stop drifting.
- agent-loop.md: same LOC drift (~13,700 -> 'a large file (15k+ lines)').
- gateway-internals.md: '14+ external messaging platforms' -> '20+'; gateway
  platform tree updated (qqbot is a sub-package, not qqbot.py; added
  yuanbao.py, feishu_comment.py, msgraph_webhook.py); 'gateway/builtin_hooks/
  (always active)' was wrong — it's an empty extension point and
  _register_builtin_hooks() is a no-op stub.
- acp-internals.md: drop fictional 'message_callback' from the bridged-
  callbacks list; clarify thinking_callback is currently set to None.
- provider-runtime.md: provider list was missing AWS Bedrock, Azure Foundry,
  NVIDIA NIM, xAI, Arcee, GMI Cloud, StepFun, Qwen OAuth, Xiaomi, Ollama
  Cloud, LM Studio, Tencent TokenHub. Fallback section described only the
  legacy single-pair model — corrected to the canonical list-form
  fallback_providers chain.
- environments.md: parsers list missing llama4_json and the deepseek_v31
  alias; both register via @register_parser.
- browser-supervisor.md: drop reference to scripts/browser_supervisor_e2e.py
  which doesn't exist in-repo.
- contributing.md: tinker-atropos is a git submodule — note that
  'git submodule update --init' is required if cloning without
  --recurse-submodules.

guides/
- operate-teams-meeting-pipeline.md: cron flags were all wrong — schedule is
  positional (not --schedule), the script-only flag is --no-agent (not
  --script-only), and there's no --command flag. Replaced with a real example
  that creates the script under ~/.hermes/scripts/ and uses the actual flags.
  Also replaced fictional 'hermes cron show <name>' with 'hermes cron status'.
- automation-templates.md: 'cron create --skills "a,b"' doesn't work —
  the flag is --skill (singular, repeatable). Fixed all 5 occurrences via AST
  rewrite.
- minimax-oauth.md: 'hermes auth add minimax-oauth --region cn' silently
  fails because --region isn't registered on the auth-add argparse spec.
  Pointed users at the minimax-cn provider (or MINIMAX_CN_API_KEY env) for
  China-region access.
- cron-script-only.md: 'hermes send' is fictional — replaced the comparison-
  table mention with a webhook-subscription pointer; also fixed the dead link
  to /guides/pipe-script-output (page doesn't exist).
- cron-troubleshooting.md: 'hermes serve' isn't a real subcommand. Pointed
  at 'hermes gateway' (foreground) / 'hermes gateway start' (service).
- local-ollama-setup.md: 'agent.api_timeout' is not a config key. The right
  knob is the HERMES_API_TIMEOUT env var.
- python-library.md: run_conversation() return dict has only final_response
  and messages — task_id is stored on the agent instance, not echoed back.
- use-mcp-with-hermes.md: '--args /c "npx -y …"' wraps the npx command in
  one quoted string, so cmd.exe gets a single arg instead of the multi-token
  command line it needs. Removed the surrounding quotes — argparse nargs='*'
  collects each token correctly.

integrations/
- providers.md: Bedrock guardrail YAML keys were 'id'/'version' (don't exist);
  actual keys are guardrail_identifier/guardrail_version (matches DEFAULT_CONFIG
  and the run_agent.py reader). GMI default base URL (api.gmi.ai/v1 ->
  api.gmi-serving.com/v1) and portal URL (inference.gmi.ai -> www.gmicloud.ai)
  refreshed. Fallback section rewritten to lead with the canonical
  fallback_providers list form (was leading with the legacy fallback_model
  single dict); supported-providers list extended to include azure-foundry,
  alibaba-coding-plan, lmstudio.

index.md
- '68 built-in tools' -> '70+'; '15+ platforms' was both inconsistent with
  integrations/index.md ('19+') and undercounted — bumped to 20+ and added
  Weixin/QQ Bot/Yuanbao/Google Chat to the list.

Validation: 'npm run build' clean (exit 0); broken-link count unchanged at
155 (same as round-1 post-skill-regen baseline). 24 files, +132/-89.
2026-05-09 15:00:24 -07:00
Teknium
0bcc327cab
docs(openrouter): document auxiliary.<task>.extra_body for OR routing and Pareto (#22844)
The plumbing for setting OpenRouter provider preferences and the Pareto Code
router on auxiliary tasks already exists — auxiliary.<task>.extra_body is
forwarded verbatim by call_llm() / async_call_llm(). It just wasn't documented,
so users who wanted (e.g.) Pareto Code routing for compression but the strongest
coder for the main agent had no way to discover the escape hatch.

- hermes_cli/config.py: expand the auxiliary section header with a YAML
  example showing provider routing plus plugins under extra_body, and an
  explicit note that main-agent provider_routing / openrouter.min_coding_score
  do NOT propagate to aux calls (each task is independent by design)
- website/docs/user-guide/configuration.md: new 'OpenRouter routing and
  Pareto Code for auxiliary tasks' subsection with worked example
- website/docs/integrations/providers.md: cross-link from the Pareto Code
  Router section to the aux-side doc

E2E verified that auxiliary.<task>.extra_body reaches the OpenRouter API with
the configured provider routing and plugins blocks intact.
2026-05-09 14:51:20 -07:00
Teknium
70bfd429e5
fix(gateway): preserve reasoning_content, codex_message_items, finish_reason on transcript replay (#22839)
PR #2974 whitelisted three reasoning fields (reasoning, reasoning_details,
codex_reasoning_items) for the gateway's simple-text replay branch. Three
more fields were added to the DB later but the whitelist was never updated:

  - reasoning_content: provider-facing thinking text. _copy_reasoning_content_for_api
    promotes 'reasoning' -> 'reasoning_content' at send time only when the
    strings happen to match. Carrying the original verbatim avoids loss
    for providers that return them as distinct fields (DeepSeek/Kimi/
    Moonshot thinking modes), and preserves the empty-string sentinel
    that DeepSeek V4 Pro requires for thinking-mode replay.
  - codex_message_items: exact assistant message items with 'phase'.
    OpenAI docs: 'preserve and resend phase on all assistant messages —
    dropping it can degrade performance.' Required for prefix cache hits.
    No recovery path exists — once dropped, gone.
  - finish_reason: informational; cheap to keep so transcripts replay
    identically across CLI and gateway.

The CLI is unaffected because cli.py keeps the live in-memory message list
across turns (cli.py:10046 'self.conversation_history = result["messages"]').
The gateway rebuilds agent_history from the SQLite transcript on every turn,
so any field stripped during replay is silently lost.

Refactors the inline whitelist into a module-level _build_replay_entry()
helper so the contract can be unit-tested. 16 new tests pin the field set
and falsy-value handling.

Verified end-to-end: DB stores all 8 fields, replay now preserves all 8
(was preserving only 5 for assistant text turns).
2026-05-09 14:47:33 -07:00
Teknium
c7f0aab949
feat(openrouter): wire Pareto Code router with min_coding_score knob (#22838)
Pick openrouter/pareto-code as your model and OpenRouter auto-routes each
request to the cheapest model meeting your coding-quality bar (ranked by
Artificial Analysis). The new openrouter.min_coding_score config key (0.0-1.0,
default 0.65) tunes the floor.

- hermes_cli/models.py: add openrouter/pareto-code to OPENROUTER_MODELS so
  it shows up in the picker with a description
- hermes_cli/config.py: add openrouter.min_coding_score (default 0.65 — lands
  on a mid-tier coder on the current Pareto frontier)
- plugins/model-providers/openrouter: emit extra_body.plugins =
  [{id: pareto-router, min_coding_score: X}] when model is openrouter/pareto-code
  AND the score is a valid float in [0.0, 1.0]
- agent/transports/chat_completions.py: same emission on the legacy flag
  path (when no provider profile is loaded)
- run_agent.py: openrouter_min_coding_score kwarg + storage; plumbed into
  both build_kwargs() invocations and the context-summary extra_body path
- cli.py: read openrouter.min_coding_score once at init, validate float in
  [0,1], pass to AIAgent constructions (CLI + background-task paths)
- cron/scheduler.py, batch_runner.py, tools/delegate_tool.py,
  tui_gateway/server.py: propagate the kwarg (mirrors providers_order
  plumbing — subagents inherit, cron/batch read from config)
- tests: profile-level + transport-level coverage of the model gating,
  unset/empty/out-of-range handling, and the legacy flag path
- docs: new 'OpenRouter Pareto Code Router' section in providers.md

Verified end-to-end against api.openrouter.ai: at score=0.65 we land on a
mid-tier coder, at omission we get the strongest. Score is silently dropped
on any model other than openrouter/pareto-code, so it's safe to leave set.
2026-05-09 14:47:00 -07:00
Henkey
b349ae1e4c fix(acp): honor task cwd for foreground terminal commands 2026-05-09 14:46:34 -07:00
Teknium
550f6e2efc
perf(teams): defer httpx import to first webhook call (#22831)
Same pattern as the google_chat lazy-load (PR #22681), applied to the
Teams plugin. The bundled `plugins/platforms/teams/adapter.py` did
`import httpx` at module top, which dragged the entire httpx +
httpcore stack into every process that triggered plugin discovery —
including `hermes` invocations that never instantiate the Teams
adapter.

`httpx` is only needed inside one method
(`TeamsMeetingPipeline._write_summary_via_incoming_webhook`), and the
`httpx.AsyncBaseTransport` parameter annotation is already string-only
thanks to the existing `from __future__ import annotations`. Move the
runtime import inside the method.

Measured impact (7-run medians, 9950X3D):
  teams plugin alone:    118 → 89 ms  (-25%)
                         46 → 38 MB   (-17%)
  import cli (full):     unchanged
  import model_tools:    unchanged

The full-CLI numbers are flat because httpx is loaded transitively
from many other modules on that path. The microbench win is the real
signal: 29 ms / 8 MB shaved off any process that touches the teams
plugin without otherwise pulling httpx — primarily future workflows
where the gateway is enabled but Teams is not configured.

Tests: 44/44 `tests/gateway/test_teams.py` pass; 345 across all
plugin-platform suites (teams + qqbot + google_chat). The test file
imports `httpx` itself for the `MockTransport` fixture, which is
correct — tests legitimately use httpx, only the plugin's module-level
import was the issue.
2026-05-09 14:42:12 -07:00
HenkDz
840ebe063e fix: make session search initialize session db 2026-05-09 14:36:58 -07:00
helix4u
9c26297c80 fix(gateway): preserve Ctrl+C for Windows foreground runs 2026-05-09 14:34:18 -07:00