Commit graph

9782 commits

Author SHA1 Message Date
Teknium
e0572a6def
fix(skills-hub): stop ellipsis-truncating the Identifier column (#33810)
`hermes skills search` rendered the Identifier column with the default
overflow behaviour, so long slugs (notably browse-sh — every browse-sh
skill ends in a `-XXXXXX` hash that's part of the identifier) were cut
to `browse-sh/weathe…`. Users copied the visible string into
`hermes skills install` and got a not-found error because the hash was
gone.

Set overflow="fold" on the Identifier column in both search tables
(`do_search` and the `_resolve_short_name` multi-match table) so long
slugs wrap onto a second line instead of getting eaten. Also add a
`--json` flag to `hermes skills search` (and the `/skills search`
slash variant) for scripting — emits a list of {name, identifier,
source, trust_level, description} objects with the full identifier,
which is the right shape for copy-paste pipelines too.

Closes #33674.
2026-05-28 04:53:13 -07:00
Teknium
5e1f793430
chore(web): remove web_crawl tool + provider crawl plumbing (#33824)
The web_crawl_tool() function was an orphan — no model schema registered
it, no skill or CLI command called it, and the agent had no way to invoke
it. PR #32608 proposed wiring it up as a model-callable tool; we've
decided not to expose crawl as a separate capability since web_search +
web_extract cover the use cases we want models to have.

Removed:
- tools/web_tools.py: web_crawl_tool() (~230 LOC)
- plugins/web/firecrawl/provider.py: supports_crawl() + crawl()
- plugins/web/tavily/provider.py: supports_crawl() + crawl()
- plugins/web/xai/provider.py: supports_crawl() override
- agent/web_search_provider.py: supports_crawl() + crawl() ABC methods
- agent/web_search_registry.py: get_active_crawl_provider() +
  the 'crawl' branch in _resolve()
- agent/display.py: web_crawl tool-progress rendering
- hermes_cli/config.py: 'web_crawl' from TAVILY_API_KEY.tools
- tools/website_policy.py: stale comment reference
- Tests: removed TestWebCrawlTavily class, the two website-policy
  web_crawl tests, the searxng/ddgs/brave-free crawl-error tests,
  the integration test_web_crawl method, and the
  test_unconfigured_crawl_emits_top_level_error test. Trimmed the
  capability-flag parametrize list and the WebSearchProvider ABC
  conformance tests.
- Docs: trimmed the Crawl column from capability tables in both EN
  and zh-Hans, updated the developer-guide ABC table.

Net: 25 files, +115/-1067.

Closes #33762 (the schema-text bug only existed if #32608 landed).
Supersedes #32608.
2026-05-28 04:52:42 -07:00
teknium1
b243afb68b fix(discord): skip backfill for auto-created threads and update test fakes
When auto-threading kicked in, the broadened backfill gate ran on the
freshly-created thread — but the thread has no prior context to fetch,
and the parent-channel reference passed to _fetch_channel_context would
have leaked unrelated context (see #31467).

Skip backfill when auto_threaded_channel is set.  Also teach the
_FakeTextChannel / _FakeThreadChannel test doubles to expose a no-op
history() async generator so the broadened gate doesn't trip
AttributeError → discord.Forbidden (MagicMock) → TypeError in the
existing auto-thread tests.  Add a regression test that asserts
auto-threaded messages do not trigger backfill.
2026-05-28 04:52:02 -07:00
teknium1
68ddd6b338 refactor(discord): inline backfill gate and document intent
Drop the _needed_mention local variable now that it has only one use,
inline its expression as _has_mention_gap, and add a comment explaining
the three backfill cases (mention-gated channel, thread, DM skip).

Behaviorally identical to the prior commit; cleanup only.

Co-authored-by: liuhao1024 <liuhao1024@users.noreply.github.com>
2026-05-28 04:52:02 -07:00
Pluviobyte
eafe11d456 fix(gateway): backfill Discord thread context
Discord threads where the bot has already participated bypass mention gating by default, but the backfill check was still tied to the mention-needed condition. That meant follow-up thread messages could trigger a response without providing recent thread history to the session.

Run history backfill for thread messages whenever backfill is enabled, while keeping DMs skipped and channel mention backfill behavior unchanged. Add a regression test for a known thread follow-up without an explicit mention.

Fixes #33666

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-28 04:52:02 -07:00
Teknium
a1eaad2fc0
perf(skills-page): lazy-fetch the catalog instead of bundling 34MB into JS (#33809)
PR #33748 grew the live skills index from ~2k skills to ~69k, which made
the previous build-time bundling strategy untenable: the skills page's
JS chunk was about to balloon from ~1MB to ~35MB.  Initial page load
on mobile became unusable, search lagged on every keystroke against the
68k-item array, and JSON.parse blocked the main thread at startup.

Three changes:

1. extract-skills.py writes skills.json + skills-meta.json into
   website/static/api/ instead of website/src/data/.  Static-served by
   Vercel as /docs/api/skills.json (gzipped on the wire), same CDN that
   already serves skills-index.json.

2. skills/index.tsx drops the static import and fetches both files in
   parallel on mount.  Loading state shows '…' for the count; failures
   surface a small error pill instead of blanking the page.

3. Search is debounced 150ms and runs against a precomputed lowercase
   haystack stamped onto each row at load time.  Before: array-join +
   toLowerCase per row per keystroke on a 68k array.  After: single
   .includes() per row, deferred until typing settles.

Validation:

| | before | after |
|---|---|---|
| skills.json location | src/data/ (bundled) | static/api/ (CDN) |
| Largest JS chunk | would be ~35MB at 68k skills | 659 KB |
| Initial page render | wait for full parse | immediate, fetch async |
| Per-keystroke filter | join+lowercase x 68k rows | single includes x 68k rows |
| Debounce | none | 150ms |

Built locally for both en and zh-Hans locales; the 34MB skills.json now
lives in build/api/ and is served separately rather than inlined into
the page's bundle.

skills.json and skills-meta.json added to .gitignore — they were already
build artifacts, but the gitignore only listed skills-index.json before.
2026-05-28 03:41:43 -07:00
teknium1
6f9182cb34 fix(kanban): content-addressed corrupt-DB backup filename
Repeated quarantines of an unchanged corrupt kanban.db used to amplify
disk usage by N: the gateway dispatcher's 5-minute retry loop, multi-
profile fleets sharing one DB, and manual reopen attempts each produced
a fresh '.corrupt.<timestamp>.bak' copy of the same bytes. After 10
retries on a 100KB DB you had 11x the disk footprint of duplicate
corrupt data.

Derive the backup filename from a sha256 of the main DB instead of a
timestamp + collision counter. Same bytes → same filename → skip the
copy on retries. Different bytes (partial repair, further damage) →
different filename → preserve separately. Sidecar (-wal/-shm) backups
inherit the same content-addressed name.

Inspired by @hanzckernel's PR #33529, simplified down to ~30 LOC: drop
the persistent JSON marker file, drop the atomic temp+fsync+rename
helper (shutil.copy2 is fine for a quarantine-only path), drop the
gateway-side WAL/SHM fingerprint extension (the existing
(path, mtime, size) tuple still gives the 5-minute retry semantics it
needs), and drop the gateway-side helper extraction. The backup file
existing IS the marker; no separate state needed.

Test: tests/hermes_cli/test_kanban_db.py::test_repeated_corrupt_open_reuses_single_backup
proves 10 retries on the same corrupt bytes produce 1 backup (was 11),
and mutating the corrupt bytes produces a second backup with a
different fingerprint.

Refs #33529
Co-authored-by: hanzckernel <zhicheng.han@mathematik.uni-goettingen.de>
2026-05-28 03:38:09 -07:00
Teknium
432a691758
fix(update): stream + idle-kill npm run build so a stalled webui-build can't soft-brick the install (#33803)
`hermes update` ran the webui build with `capture_output=True` and no timeout. On low-memory hosts (WSL2's 4 GB default, small VPSes, antivirus stalls) Vite goes silent for minutes; users see a frozen terminal, decide the update is hung, and reboot. The reboot lands *after* `pip install -e .` has already touched the install but *before* the build completes, leaving the `hermes` launcher in place while `hermes_cli` is no longer importable — i.e. `ModuleNotFoundError: No module named 'hermes_cli'` (#33788, same class as #32384).

Changes:

- New `_run_with_idle_timeout()` helper: streams subprocess output line-by-line (so the user sees Vite progress in real time) and kills the process if no bytes appear on stdout/stderr for 180s. The existing stale-dist fallback (#23817) then serves the previous build instead of failing the update.
- `_build_web_ui()` uses the helper for `npm run build` (the actual stall site). `npm install` keeps `subprocess.run` + capture_output to preserve the existing EPERM-retry-on-Windows contract.
- Both `cmd_update` call sites print `→ Core update complete. Building dashboard (optional)...` before the webui build. The CLI is fully functional at this point; a webui-build failure only affects `hermes dashboard`. Telegraphing the boundary explicitly stops users from rebooting through the build step.

Tests:

- `tests/hermes_cli/test_run_with_idle_timeout.py` — 4 tests covering streaming success, nonzero exit, idle-kill, and missing-binary cases. Uses real `subprocess.Popen` on tiny Python scripts; isolated in its own file so per-file canonical-runner parallelism doesn't pair it with the mock-heavy tests.
- `tests/hermes_cli/test_web_ui_build.py` — updated existing tests to patch `_run_with_idle_timeout` for the build step in addition to `subprocess.run` for the install step.
- `tests/hermes_cli/test_cmd_update.py::test_update_refreshes_repo_and_tui_node_dependencies` — same update.

Full suite: `scripts/run_tests.sh tests/hermes_cli/` → 5646 passed, 0 failed.

Fixes #33788.
2026-05-28 03:34:47 -07:00
teknium1
78be458608 fix(patch): widen new_string \t/\r unescape to all match strategies (#33733)
Extends @liuhao1024's escape-normalized fix so the patch tool also
recovers when old_string carries a real tab byte and matches via the
`exact` strategy — which is the headline reproduction in the issue and
the most common case in practice (LLMs frequently get old_string right
because they re-read the file, but still serialize new_string's tabs as
two-character `\t`).

Instead of gating on the match strategy, decide per-sequence by looking
at the *matched region of the file*: only convert `\t` -> tab and
`\r` -> CR when the file region we're replacing actually contains the
corresponding control byte. That mirrors the region-based heuristic in
`_detect_escape_drift` and keeps legitimate writes of the literal
two-character string `"\t"` (e.g. patching `sep = "\t"` in Python
source) untouched — those files have a backslash+t in the matched
region, not a real tab, so new_string passes through verbatim. `\n` is
still excluded because newlines serialize correctly through JSON and
unescaping would corrupt source escape sequences far more often than
help.

E2E verified against the live `patch` tool: tab-indented file + literal
`\t` in new_string under both `exact` (Variant 1) and `escape_normalized`
(Variant 2) strategies now produces real tab bytes; a Python source line
containing `sep = "\t"` (legitimate literal backslash-t) survives a
patch unchanged.

Tests updated to cover both strategies and the legitimate-literal case,
and to assert that `\n` is intentionally preserved.

Refs #33733
2026-05-28 03:27:20 -07:00
liuhao1024
e9f3f2b34a fix(tools): unescape common sequences in new_string when escape_normalized matches
When the patch tool matches via the escape_normalized strategy, old_string
contains literal \t, \n, \r sequences that get unescaped to match real
control characters in the file. However, new_string was written as-is,
leaving literal backslash sequences in the output.

Add _unescape_common_sequences() helper and apply it to new_string when
the matching strategy is escape_normalized. This ensures LLM-generated
tab/newline sequences become real bytes in the patched file.

Fixes #33733
2026-05-28 03:27:20 -07:00
Teknium
10ee4a729b
fix(gateway): drain on Windows hermes gateway stop so sessions survive restart (#33798)
Sessions now survive `hermes gateway stop` / `restart` on native Windows.
Previously the gateway died on schtasks `/End` + os.kill SIGTERM without
ever running the drain loop, so the v0.13.0 session-resume feature (#21192)
silently broke on Windows: `resume_pending=True` was never written, and
the next boot started with a blank conversation history (issue #33778).

Root cause is twofold and the reporter only identified half of it:

1. `hermes_cli/gateway_windows.py::stop()` did not write the
   `planned_stop_marker` before signalling. The reporter caught this.

2. The bigger reason: `asyncio.add_signal_handler` raises
   NotImplementedError for SIGTERM/SIGINT on Windows, so even if the
   marker had been written, the gateway's existing SIGTERM handler
   (which is what calls `runner.stop()` and the `mark_resume_pending`
   loop) was never invoked. Writing the marker would have been
   necessary-but-insufficient.

The fix has two parts:

* gateway/run.py: new `_run_planned_stop_watcher` daemon thread polls
  for the planned-stop marker file every 0.5s. When the marker appears
  it `loop.call_soon_threadsafe(shutdown_signal_handler, None)` — the
  same shutdown path a real SIGTERM would have driven, including the
  pre-drain `mark_resume_pending` writes (run.py:5977) and graceful
  drain wait. The existing signal handler already accepts
  `received_signal=None` and falls through to
  `consume_planned_stop_marker_for_self()`, so no handler changes
  needed. Runs on every platform as cheap belt-and-suspenders.

* hermes_cli/gateway_windows.py: `stop()` now writes the marker for
  the running gateway PID and waits up to `agent.restart_drain_timeout`
  (default 30s) for the PID to exit cleanly. On clean drain, the kill
  sweep is non-forceful; on timeout, escalates to
  `kill_gateway_processes(force=True)` which routes to taskkill /T /F
  per `references/windows-native-support.md`.

Validation:

* 7 new tests in tests/gateway/test_planned_stop_watcher.py covering:
  marker→handler dispatch, no-marker idle, already-draining skip,
  not-yet-running skip, stop_event responsiveness, fire-once
  semantics, error tolerance.
* 8 new tests in tests/hermes_cli/test_gateway_windows.py covering:
  marker-before-kill ordering, clean-drain skips force-kill,
  drain-timeout escalates to force=True, no-pid-skips-drain,
  invalid-pid handling, fast-exit success, timeout failure,
  marker-write-failure tolerance.
* E2E (Linux, detached orphan): write_planned_stop_marker(pid) +
  `_drain_gateway_pid(pid, 5.0)` returns True in 0.5s after the
  victim sees the marker and exits. Tested with a double-forked
  subprocess so the test parent isn't holding it as a zombie.
* Targeted: tests/gateway/{restart_drain,restart_resume_pending,
  signal,signal_format,status,shutdown_forensics,approve_deny_commands,
  planned_stop_watcher} + tests/hermes_cli/{gateway_windows,
  gateway_service} → 519/519.

What was wrong with the reporter's claim (for future archaeology): they
described the symptom as "no `resume_pending=True` written to
`sessions.json`" — but Hermes uses `state.db` (SQLite), not
`sessions.json`, and `mark_resume_pending` is called regardless of
the marker (the marker only affects exit code 0 vs 1 for systemd
revival semantics). The real session-loss path is the missing drain
on Windows, not a missing marker. Both halves are fixed here.

Closes #33778.
2026-05-28 03:25:32 -07:00
teknium1
f8896dedc8 chore(release): map biser@bisko.be -> bisko in AUTHOR_MAP 2026-05-28 03:21:00 -07:00
Biser Perchinkov
b5495db701 fix(agent): re-pad reasoning_content on cross-provider fallback to require-side providers
api_messages is built once before the retry loop while the primary provider
is active. When a mid-conversation fallback switches to a require-side thinking
provider (DeepSeek/Kimi/MiMo), assistant turns built under a non-require primary
(e.g. Codex) go out without reasoning_content and the new provider rejects the
request with HTTP 400 ("reasoning_content must be passed back").

Re-apply the echo-back pad against the current provider immediately before
building the request kwargs. Idempotent and a no-op unless the active provider
enforces echo-back, so it covers all fallback paths without affecting normal or
reject-side operation.

Drafted by Claude (Opus 4.7) under human review while fixing a personal deployment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 03:21:00 -07:00
Indigo Karasu
9179396cb7 fix(stream-consumer): only set _final_content_delivered when final response confirmed delivered
In GatewayStreamConsumer._run(), _final_content_delivered was set to True
based on the success of a mid-stream finalize edit, before the final
finalize edit was attempted. When the final edit later failed (Telegram
flood control, retry-after), _final_response_sent stayed False but
_final_content_delivered was already True, so gateway/run.py suppressed
its normal final send and the user saw a partial / fallback message
instead of the real answer.

Changes in gateway/stream_consumer.py:
- Remove the premature _final_content_delivered = True at the top of
  the got_done block.
- Set _final_content_delivered = True only when the actual final send /
  edit succeeds, in each finalize branch (no-finalize adapter,
  _message_id finalize, no-_already_sent send).
- _send_fallback_final: don't set _final_response_sent = True when only
  some chunks were delivered; the gateway should still attempt a
  complete final send. Set _final_content_delivered = True alongside
  _final_response_sent on the success path and short-text path.
- Cancellation handler: set _final_content_delivered = True alongside
  _final_response_sent when the best-effort final edit succeeds.

Adds TestFinalContentDeliveredGuard with 3 regression tests covering
the core bug scenario, the happy path, and partial fallback.

Closes #33708
Closes #25010
Refs #29200

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-05-28 03:15:19 -07:00
Dusk1e
a91b1c8b31 fix(tirith): reject non-regular tar members during auto-install process 2026-05-28 02:49:26 -07:00
teknium1
247b24b49f chore(release): add AUTHOR_MAP entry for AdityaRajeshGadgil 2026-05-28 02:45:25 -07:00
Aditya Rajesh Gadgil
031983bbf8 fix: limit pre-update state snapshots 2026-05-28 02:45:25 -07:00
Teknium
8b6beaab5f
docs: 30-day overhaul — correctness audit, PR coverage, Nous Portal weave, sidebar reorg (#33782)
* docs(audit): correctness pass across getting-started, reference, features, messaging, developer-guide, guides, integrations, user-guide

* docs: add PR coverage for last 30d + Nous Portal weave + nav reorg + build fixes

- Add docs for top user-visible PRs that shipped without docs (api-server
  session control, kanban features, telegram pin/edit, provider client tag,
  xAI retired-model migration, cron name lookup, --branch update flag, etc.)
- Apply Nous Portal weave across 23 pages (tasteful one-liners on
  getting-started/learning-path, configuration, overview, vision, x-search,
  credential-pools, provider-routing, cron, codex-runtime, profiles, docker,
  messaging/index, multiple guides, plus FAQ + index promotion)
- Reorganize sidebar: split Messaging into Popular/M365/Chinese/Other,
  Reference into Command/Configuration/Tools-Skills sub-categories, add
  orphan developer-guide pages (web-search-provider-plugin,
  browser-supervisor), move features from Integrations back to Features,
  fold lone spotify into Media & Web.
- Regenerate skill stubs + catalogs (kanban-codex-lane, hermes-s6-container-
  supervision, web-pentest)
- Fix broken anchor links (security/cron, configuration/fallback, telegram
  large-files, adding-platform-adapters step-by-step)
2026-05-28 02:41:36 -07:00
teknium1
c7f7783e5c test(xai-proxy): regression coverage for #28932 429 handling
Three new tests in tests/hermes_cli/test_proxy.py:

- xai_adapter_retry_rotates_pool_entry_on_429 — headline #28932 case.
  Two-entry pool, 429 on first entry, must rotate to second entry
  AND must NOT call refresh_xai_oauth_pure (refresh is irrelevant
  for rate limits).
- xai_adapter_retry_returns_none_on_429_when_pool_exhausted —
  single-entry pool: 429 returns None so the rate-limit response
  flows back to the client unchanged (existing behavior preserved).
- xai_adapter_retry_returns_none_for_unrelated_status — non-{401,
  429} statuses must not trigger any retry path at all; guards
  against the gate becoming too broad in future changes.

Each test asserts that refresh_xai_oauth_pure is never called on the
429 path — refresh is a 401-specific concern.

39/39 in tests/hermes_cli/test_proxy.py.
2026-05-28 02:36:37 -07:00
sprmn24
4ed482549f fix(xai-proxy): handle 429 rate-limit responses in proxy retry path
get_retry_credential only triggered on 401; a 429 Too Many Requests from
xAI was silently streamed back with no key rotation or back-off signal.

- server.py: widen retry gate from == 401 to in {401, 429}
- xai.py: on 429, skip token refresh and call mark_exhausted_and_rotate
  to stamp the 1-hour cooldown on the rate-limited key and return the
  next available credential. Returns None if pool is exhausted.
2026-05-28 02:36:37 -07:00
Dusk1e
aa3466063b fix(android): reject unsafe tar members in psutil compatibility installer 2026-05-28 02:36:09 -07:00
Teknium
bb0ac5ced2 chore(release): AUTHOR_MAP entry for vynxevainglory-ai
PR #29233 salvage.
2026-05-28 02:33:51 -07:00
Teknium
70abae8e3b fix(kanban): show horizontal scrollbar instead of wrapping columns
Salvage follow-up on top of @vynxevainglory-ai's PR #29233. Keep the
column-body flex:1 + min-height:0 fix (tall columns scroll internally
now), but drop the flex-wrap: wrap part — instead just stop hiding
the existing horizontal scrollbar.

PR #523254b34 (sadiksaifi, May 18) deliberately moved the kanban board
from a wrapping grid to a single-row pinned-width flex so the board
stays as one stable horizontal row. The mistake in that PR was the
scrollbar-width: none + ::-webkit-scrollbar { display: none } pair,
which hid the affordance so columns past the viewport became visually
inaccessible. Fixing that hidden-scrollbar bug while keeping the
single-row design honors both contributors' intent.
2026-05-28 02:33:51 -07:00
Vynxe Vainglory
538f0fa339 fix(kanban): wrap columns into rows and fix vertical overflow
Two CSS issues in the kanban dashboard:

1. Columns overflow horizontally with no way to reach them — the
   original scrollbar-width: none hid the scrollbar entirely, and
   even with a scrollbar, a wrapping layout is better UX for a board
   with 8+ columns. Changed to flex-wrap: wrap and removed the
   overflow-x: auto + hidden scrollbar rules. Columns now flow into
   multiple rows (~3 per row on a typical viewport) instead of
   running off-screen.

2. .hermes-kanban-column-body lacked flex: 1 and min-height: 0,
   so the flex child's implicit min-height: auto prevented it from
   shrinking below its content size. Columns with many cards pushed
   past the parent max-height instead of scrolling internally.

Verified: 9 columns wrap into 3 rows, all visible without
horizontal scroll. Done column (53 tasks) scrolls vertically
within its column bounds.
2026-05-28 02:33:51 -07:00
teknium1
9b5dae17a5 feat(context-engine): host contract for external context engines
Condenses the substance of PRs #16453, #17453, #16451, #17600, and #13373
into a minimal generic host contract that external context engine plugins
(e.g. hermes-lcm) need to integrate cleanly. Drops scaffolding that
duplicated existing infrastructure or had marginal value.

Five concrete changes:

1. `_transition_context_engine_session()` on AIAgent — generic lifecycle
   helper that fires on_session_end → on_session_reset → on_session_start
   → optional carry_over_new_session_context. Engines implement only the
   hooks they need; missing hooks are skipped. Built-in compressor keeps
   its existing reset-only behavior because callers default to no
   metadata. `reset_session_state()` now optionally accepts
   previous_messages / old_session_id / carry_over_context and delegates
   to the transition helper when provided. (#16453)

2. `conversation_id` passed to `on_session_start()` — both the
   agent-init call site and the compression-boundary call site now
   forward `self._gateway_session_key` so plugin engines have a stable
   conversation identity that survives session_id rotation (compression
   splits, /new, resume). The key already existed on AIAgent; it just
   wasn't reaching engines. (#16453)

3. Canonical cache buckets forwarded to engines — the usage dict passed
   to `update_from_response()` now includes input_tokens, output_tokens,
   cache_read_tokens, cache_write_tokens, and reasoning_tokens on top of
   the legacy prompt/completion/total keys. Engines can make decisions on
   cache-hit ratios and reasoning costs instead of only aggregates. ABC
   docstring updated. (#17453)

4. Plugin-registered context engines visible in the picker —
   `_discover_context_engines()` in plugins_cmd.py now also includes
   engines registered via `ctx.register_context_engine()` from plugin
   manifests, deduplicating by name so repo-shipped descriptions win on
   collision. (#16451)

5. `_EngineCollector.register_command()` — context engines using the
   standard `register(ctx)` pattern can now expose slash commands (e.g.
   `/lcm`). Routes to the global plugin command registry with the same
   conflict-rejection policy regular plugins use (no shadowing built-ins,
   no clobbering other plugins). Previously these calls hit a no-op and
   the slash commands silently never appeared. (#17600)

Dropped from the original 5 PRs:

- Compression boundary signal (`boundary_reason="compression"`) from
  #16453 — already on main at `agent/conversation_compression.py:412-424`,
  landed via the bg-review extraction.

- `discover_plugins()` before fallback in run_agent.py from #16451 —
  redundant: `get_plugin_context_engine()` already routes through
  `_ensure_plugins_discovered()` which is idempotent.

- Runtime identity diagnostics method + helpers from #13373 (+251 LOC) —
  operators can already read engine state via `engine.get_status()`;
  the diagnostics view added marginal value relative to its surface area.

- The 553-LOC slash-command machinery from #17600 — replaced with a
  20-LOC `register_command` method on the collector that reuses the
  existing plugin command registry instead of building a parallel one.

Net: ~215 LOC of host-contract changes + 282 LOC of focused tests, vs
~1,176 LOC across the original 5 PRs.

Co-authored-by: Tosko4 <1294707+Tosko4@users.noreply.github.com>

Closes #16453.
Closes #17453.
Closes #16451.
Closes #17600.
Closes #13373.
Related: stephenschoettler/hermes-lcm#68.
2026-05-28 01:45:30 -07:00
Teknium
fb9f3a4ef9
fix(skills): pull full ClawHub catalog into the skills index (200 → 20k+) (#33748)
* fix(skills): pull full ClawHub catalog into the skills index

The website was showing 200 ClawHub skills out of 20k+ because
`ClawHubSource.search("")` for empty queries went straight to a single
unpaginated request. ClawHub's API caps any single page at 200 items and
returns a `nextCursor`; we grabbed page 1 and stopped, so the cached
index served from hermes-agent.nousresearch.com had a silent 99%
truncation.

End users never hit clawhub.ai directly (the index is rebuilt twice
daily by .github/workflows/skills-index.yml and served as a static JSON
on the docs site), so the cap-and-cache architecture is correct — it
just wasn't being filled.

Changes:
- `ClawHubSource.search(query="")` now routes through the existing
  `_load_catalog_index()` paginating walker instead of the unpaginated
  listing fallback (non-empty queries still hit the fast catalog search).
- `_load_catalog_index()` max_pages 50 → 250 (50k-skill ceiling; live
  catalog is ~20k as of May 2026, with headroom for growth).
- `build_skills_index.py`: per-source crawl limits split out — ClawHub
  and LobeHub get 100k, others keep their effective caps.
- `EXPECTED_FLOORS["clawhub"]` 50 → 5000 so the next pagination
  regression hard-fails the CI build instead of silently shipping a
  degenerate index.

Test plan:
- New unit test `test_search_empty_query_paginates_full_catalog`
  exercises the cursor-following path with three mocked pages (450
  total items) and asserts all pages are walked.
- Existing 9 ClawHub tests + 127 broader skills_hub tests all pass.
- E2E against live ClawHub API: walker reached 9700+ skills across 49
  pages before this commit landed, paginating well past the previous
  50-page cap.

* fix(skills): raise ClawHub ceilings — live catalog is 50k, not 20k

E2E walk against live ClawHub API hit my initial 250-page cap at 49,698
skills with cursor=yes still pending. The catalog is roughly 2.5x larger
than the docstring estimate.

- max_pages 250 → 750 (150k ceiling, walks terminate on cursor=None
  well before this in practice)
- SOURCE_LIMITS['clawhub'] 100k → 200k
- EXPECTED_FLOORS['clawhub'] 5000 → 20000
2026-05-28 01:42:19 -07:00
Teknium
09a5cd8084
fix(auth): sync manual:device_code Codex pool entries on re-auth (#33744)
#33164 made _save_codex_tokens sync the singleton-seeded `device_code`
pool entry on Codex OAuth re-auth. That fixed the #33000 path but missed
`manual:device_code` entries created by `hermes auth add openai-codex`
(the recommended workaround for users who hit #33000 before #33164
landed).

Every subsequent re-auth would refresh the device_code entry but leave
the manual:device_code entry holding the consumed refresh token plus
stale last_error_* markers — immediately recreating the 401
token_invalidated symptom on the next request, exactly as reported in
#33538.

Extend the refreshable source set to include `manual:device_code`.
Completing the device-code OAuth flow proves the user owns the ChatGPT
account, so it is safe to refresh every device-code-backed entry. Keep
`manual:api_key` and other non-device-code manual sources untouched —
those represent independent credentials.

Closes #33538.
2026-05-28 01:33:10 -07:00
Dusk1e
43abc51f66 fix(security): require source CIDR allowlisting for public msgraph webhook binds 2026-05-28 01:26:18 -07:00
Teknium
986abb3cf7
docs: drop stale Kimi/DeepSeek vision example (#33736)
Kimi K2.6 is natively multimodal — flagged by Shengyuan from the Kimi
growth team. Replace the named-vendor example with a model-agnostic
phrasing so the row doesn't go stale as more vendors ship vision.
2026-05-28 01:23:38 -07:00
Teknium
87e5b2fae0
feat(mcp): support TLS client certificates (mTLS) for HTTP and SSE servers (#33721)
Adds first-class `client_cert` / `client_key` config keys so MCP servers
behind mTLS work without an external TLS-terminating proxy. Resolves
inbound community question (Jeremy W.).

Schema (per `mcp_servers.<name>`, HTTP/SSE only):

- `client_cert: "/path/to/combined.pem"` — single PEM with cert + key
- `client_cert: "/path/to/cert"` + `client_key: "/path/to/key"` — separate
- `client_cert: [cert, key]` or `[cert, key, password]` — list form,
  with optional passphrase for encrypted keys

Paths support `~` expansion. Missing files raise a server-scoped
`FileNotFoundError` at connect time rather than failing later with an
opaque TLS handshake error.

Wiring:

- New SDK HTTP path (mcp >= 1.24): `cert=` on the user-owned
  `httpx.AsyncClient` alongside the existing `verify=` handling.
- SSE path: routed through an `httpx_client_factory` that wraps the
  SDK's defaults (follow_redirects=True) and layers `verify` + `cert`
  on top. The factory is only injected when needed, so the SDK's
  built-in `create_mcp_http_client` keeps being used in the default
  case.
- Deprecated mcp<1.24 path left untouched — that SDK's
  `streamablehttp_client` signature doesn't expose `cert`, and adding
  it would be dead code.

Also documents the previously-undocumented `ssl_verify` key (bool or
CA bundle path) in the MCP config reference.

Tests:

- `tests/tools/test_mcp_client_cert.py` (new, 19 tests):
  - `_resolve_client_cert` helper: all three input forms, `~` expansion,
    missing-file and validation errors.
  - HTTP transport: `cert=` forwarded into `httpx.AsyncClient` for
    string and tuple forms; absent when unset; missing-file error
    propagates.
  - SSE transport: factory only injected when cert or non-default
    verify is set; factory applies cert, custom CA bundle, and
    preserves `follow_redirects=True` + forwarded headers/auth.
- Existing tests: 200/200 in `test_mcp_tool.py` + `test_mcp_sse_transport.py`
  still pass.
2026-05-28 00:55:55 -07:00
Stephen Schoettler
8595281f3c fix: expose context engine tools with saved toolsets 2026-05-28 00:28:42 -07:00
Dusk1e
1a9ef83147 fix(security): require API_SERVER_KEY before dispatching API server work 2026-05-28 00:25:08 -07:00
LeonSGP43
442a9203c0 Fix xAI OAuth timeout manual fallback 2026-05-28 00:24:17 -07:00
helix4u
459d7694d3 fix(agent): preload jiter native parser 2026-05-28 00:20:11 -07:00
Robin Fernandes
dc52b82d53 test(auth): update entitlement CI expectations 2026-05-28 00:19:31 -07:00
Robin Fernandes
1cf5e639b3 fix(auth): refresh Nous entitlement in tool menus 2026-05-28 00:19:31 -07:00
Robin Fernandes
406901b27d feat(auth) normalise the way in which we check whether a user has free/paid access to nous portal so we can expose behaviour and error messages accordingly. 2026-05-28 00:19:31 -07:00
stephenschoettler
0bf9b867cf fix(website): pin serialize-javascript and uuid via npm overrides
Resolves the two Dependabot alerts currently open against the website
lockfile:

- serialize-javascript: pin to ^7.0.5 (was 6.0.2 — high-severity RCE
  via RegExp.flags + Date.prototype.to*, plus medium-severity DoS)
- uuid: pin to ^14.0.0 (was 8.3.2 — medium buffer bounds check miss
  in v3/v5/v6 when buf is provided)

Lockfile regenerated against current main (not the stale lockfile
from the original PR — several Dependabot bumps for mermaid,
webpack-dev-server, @babel/plugin-transform-modules-systemjs,
fast-uri, lodash-es+langium, lodash, follow-redirects, and dompurify
have landed since #30036 was opened, so the website portion was
re-applied surgically on top of those).

Salvaged the website half of PR #30036. The TUI test half landed
on main separately, so this PR is web-only.
2026-05-28 00:07:54 -07:00
kshitijk4poor
7b778db472 chore(release): map MoonRay305 contributor email for #32759 salvage
Adds `squiddy@2rook.ai → MoonRay305` to AUTHOR_MAP so contributor_audit.py
passes for the salvaged commits in #33482-followup PR.
2026-05-27 23:28:51 -07:00
Squiddy
3ba8962738 fix(kanban): add Windows init lock guard 2026-05-27 23:28:51 -07:00
Squiddy
90b6b3d18f fix(kanban): harden sqlite connection concurrency 2026-05-27 23:28:51 -07:00
Brian D. Evans
3ad46933d3
docs(voice): use uv pip install faster-whisper in STT install hints (#29800)
* docs(voice): use `uv pip install faster-whisper` in STT install hints

Three runtime messages told users to `pip install faster-whisper`
(reported in #29782 for the gateway STT failure message under
Telegram-in-Docker, where the user hit `bash: pip: command not
found`). The Hermes Docker image is built on `ghcr.io/astral-sh/uv`
with a uv-managed venv that doesn't ship `pip` on PATH; users on
modern `uv tool install` / `uv venv` installs see the same problem.

The canonical install command in this repo is `uv pip install`
(see `tools/lazy_deps.py:509` `feature_install_command()`), which
works in Docker (uv image), in `uv tool install` venvs, and in
pip-based venvs that already have uv on PATH.

Changed three locations to match:

- `gateway/run.py` — Telegram/Discord/Slack/WhatsApp/etc. voice
  reply when no STT provider is configured. Suggests
  `uv pip install faster-whisper` and notes that
  `pip install faster-whisper` also works if `pip` is on PATH.
- `tools/voice_mode.py` — `/voice` status line for missing STT.
- `cli.py` — Voice-mode startup error, "Option 1".

No behavior change beyond the user-facing text. No production
code path was touched.

* docs(voice): add pip fallback to cli + voice_mode STT hints

Copilot flagged that cli.py and tools/voice_mode.py recommend
`uv pip install faster-whisper` without a fallback for environments
where uv isn't on PATH. The gateway/run.py message already lists
`pip install faster-whisper` as an alternative; this commit aligns
the two remaining call sites to match.

Addresses inline Copilot review on #29800.

---------

Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>
2026-05-28 16:23:14 +10:00
Teknium
4e702fe2d9
test(ci): harden two flaky tests against CI noise (#33675)
Some checks are pending
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker / shell lint / Lint Dockerfile (hadolint) (push) Waiting to run
Docker / shell lint / Lint docker/ shell scripts (shellcheck) (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
Two unrelated transient failures on PR #33661's initial CI run, both
pre-existing on main and recovered on rerun. Hardening:

1. tests/cron/test_scheduler.py::TestRunJobConfigLogging — added mocks for
   resolve_runtime_provider() and discover_mcp_tools(). The yaml-warning
   tests intend to exercise only the warning-log path, but
   _run_job_impl continues into provider resolution and MCP discovery
   after the warning. Both can spawn subprocesses / hit the network and
   pushed the test over its 30s budget under GHA load.

2. tests/tools/test_browser_supervisor.py — wrapped Chrome teardown
   against the stdlib subprocess._wait() race (bpo-38630). When SIGCHLD
   arrives during proc.wait(), _try_wait(WNOHANG) can return a foreign
   pid and the 'assert pid == self.pid or pid == 0' fires. Fixture now
   catches AssertionError/TimeoutExpired, force-kills, and always reaps
   so no zombie escapes. Same hardening applied to the early-skip branch.
2026-05-27 23:15:41 -07:00
Ben
875d930ac7 test(docker-update): stub subprocess.run in git-install regression guard
The regression-guard test
`test_cmd_update_on_git_install_does_not_print_docker_message` mocked
`is_managed` and `detect_install_method` but not `subprocess.run`, so
once `cmd_update(check=True)` decided this was a git install it shelled
out to a real `git fetch upstream` / `git fetch origin`. On CI runners
the worktree has no `upstream` remote configured and the fetch hung
past the 30s pytest-timeout — test (4) slice failed in #33659 CI.

Fix: stub `subprocess.run` with a successful CompletedProcess-shaped
object whose stdout is `"0\n"`, so:
  - no real git command is ever invoked
  - the rev-list parsing later in the flow (`int(stdout.strip())`)
    succeeds rather than `ValueError`-ing through the test's
    SystemExit catch
  - the flow proceeds far enough to confirm the docker banner is
    absent (the actual assertion)

Also broaden the except clause to `(SystemExit, Exception)`: the only
assertion in this test is the negative-banner check on captured stdout;
any further failure in the rest of the update flow is irrelevant to
that contract.

Verified locally: all 7 tests in
`tests/hermes_cli/test_cmd_update_docker.py` pass in 0.39s (previously
the regression-guard test alone consumed 30s+ and got SIGTERM'd).
2026-05-28 15:50:25 +10:00
Ben
b924b22a9d fix(docker): hermes update prints docker pull guidance instead of bogus git error
Inside the published Docker image, `hermes update` was hitting the
".git missing → reinstall via curl" fallback:

    ✗ Not a git repository. Please reinstall:
      curl -fsSL https://raw.githubusercontent.com/.../install.sh | bash

That message is wrong on two counts:
  1. It tells the user to run the host-side installer, which would
     install a *new* Hermes on the host — not update the running
     container.
  2. It doesn't mention `docker pull` at all, leaving Docker users
     to figure out the right action from scratch.

`hermes update --check` was worse: it bailed with "Not a git
repository — cannot check for updates." and nothing else.

Fix: detect the Docker install method (already stamped by
`docker/stage2-hook.sh` and surfaced by `detect_install_method()`)
in both update entry points and print a long-form message that
covers:

  - The right command: `docker pull nousresearch/hermes-agent:latest`
  - Restart guidance (`docker compose up -d --force-recreate` /
    re-run `docker run`)
  - How to verify the new version after restart
  - Tag-pinning caveat (`:latest` doesn't move a pinned tag)
  - Config persistence across upgrades (state under `HERMES_HOME` /
    `/opt/data` is bind-mounted and survives)
  - Fork escape hatch (build your own image with the repo's Dockerfile)

Exit code is 1 (matches `managed_error` semantic for "tried to
update but can't update this way").

Plumbing:
  - hermes_cli/config.py: new `format_docker_update_message()` helper
    sits next to the existing `_NIX_UPDATE_MSG` /
    `format_managed_message()` family so the wording lives in one
    place and both call sites (apply path + check path) consume it.
  - hermes_cli/main.py:
      * `cmd_update()`: bail right after the `is_managed()` gate, before
        any of the apply-path branches.
      * `_cmd_update_check()`: bail at the top of the function, before
        the existing `method == "pip"` branch.
    Neither path touches subprocess.run / git when method == "docker".

Coverage:
  - 7 new tests in `tests/hermes_cli/test_cmd_update_docker.py`:
      * `hermes update` in Docker → message + exit 1, no git calls
      * `hermes update --check` (via cmd_update) → same
      * `--yes` / `--force` don't bypass (intentional)
      * `_cmd_update_check` called directly → bails too
      * git/pip installs still take their normal paths (regression guards)
      * `format_docker_update_message` content-lock test pinning the
        five user-actionable bits the message must contain
  - Existing test_cmd_update.py (21 tests) + test_managed_installs.py
    (5 tests) still pass — no regression on the source-install path.
  - Verified end-to-end in a real container: `docker run ... update`
    and `docker run ... update --check` both render the message and
    exit 1.
2026-05-28 15:50:25 +10:00
stephenschoettler
4a6f1863ac test: cover ci-unblocker production regressions
Snapshot review_agent._session_messages before teardown so close() can
clean per-session state without dropping the user-visible
self-improvement summary. Adds two regressions:

- bg-review summarizer receives captured review-agent tool messages
  after review_agent.close() runs
- context-compressor protected-head handoff rehydration populates
  _previous_summary and keeps the old handoff out of newly summarized
  turns

Salvaged from PR #26039 onto current main after agent/background_review.py
extraction. Original commit 63eaf6055; bg-review test updated to patch
the module-level summarize_background_review_actions in
agent.background_review instead of the now-forwarder
AIAgent._summarize_background_review_actions.
2026-05-27 22:14:53 -07:00
Ben
66489f38c7 fix(docker): bake build-time git SHA into the image
`hermes dump` and the startup banner both call `git rev-parse HEAD` to
report the running commit, but `.dockerignore` line 2 excludes `.git` —
so inside the published image `hermes dump` shows
`version: ... [(unknown)]` and the banner drops its `· upstream <sha>`
suffix entirely.  That makes support triage from container bug reports
impossible: we can't tell which commit the user is actually running.

Fix: thread the build-time SHA through as a Docker build-arg, write it
to `/opt/hermes/.hermes_build_sha` in the image, and have a new
`hermes_cli/build_info.get_build_sha()` read it as a fallback after the
existing live-git lookup fails.  Output format is unchanged in both
callsites — same 8-char short SHA whether resolved live or baked.

Wiring:
  - Dockerfile: `ARG HERMES_GIT_SHA=` + write-file step after the source
    copy.  Empty/missing arg → no file written → callers fall through to
    live git (so local `docker build` without --build-arg is unchanged).
  - docker-publish.yml: passes `HERMES_GIT_SHA=${{ github.sha }}` on all
    four build-push-action steps (amd64/arm64, smoke-test + final push).
  - dump.py:_get_git_commit() / banner.py:get_git_banner_state(): try
    live git first, fall back to baked SHA, then to legacy `(unknown)`
    / None.  Banner returns `upstream == local, ahead=0` because a built
    image is by definition pinned to one commit.

Coverage:
  - Unit tests cover build_info (file present/absent/empty/error,
    truncation, whitespace), dump (live-git wins, both fallbacks,
    identical output-format regression guard), and banner (no-repo +
    baked, no-repo + no-sha, shallow-clone fallback).
  - tests/docker/test_dump_build_sha.py is an integration regression
    guard that runs against the real image, reads
    `/opt/hermes/.hermes_build_sha`, and asserts `hermes dump` surfaces
    its content (or stays at `(unknown)` if no file).
  - Verified end-to-end: `docker build --build-arg HERMES_GIT_SHA=abc...`
    → `docker run ... dump` reports `[abc12345]`; without the build-arg
    it reports `[(unknown)]` as before.
2026-05-28 15:14:05 +10:00
teknium1
ebe04c66cd fix(kanban): close kanban.db FD after every connect() in long-lived processes
`sqlite3.Connection.__exit__` commits/rollbacks but does NOT close the
underlying FD. `with kb.connect() as conn:` in long-lived processes
(gateway `run_slash`, dashboard `decompose_task_endpoint`) therefore
leaks one FD to `kanban.db` per call. After enough operations the
gateway dies with `[Errno 24] Too many open files` (~4 days uptime
in the production report — #33159).

Fix: add a `connect_closing()` context manager in `hermes_cli/kanban_db`
that wraps `connect()` with a real `try/finally: conn.close()`. Switch
the 42 leak-prone call sites in `hermes_cli/kanban.py` (35),
`hermes_cli/kanban_decompose.py` (4), and `hermes_cli/kanban_specify.py`
(3) over to it.

`kanban.py` matters because `run_slash` (called from the gateway for
every `/kanban` slash command) parses argparse and dispatches to those
`_cmd_*` functions in-process — each one was leaking one FD per
invocation.

Tests inside `tests/` are untouched: short-lived processes where OS
cleanup masks the leak. Regression tests added in
`test_kanban_db.py` cover both happy-path and exception-path closure,
plus an explicit assertion that bare `with kb.connect()` still does
NOT close (documenting the upstream sqlite3 behaviour we're working
around).

Closes #33159.
2026-05-27 22:07:49 -07:00
Teknium
6d947e4d78
feat(image_gen/fal): add Krea 2 Medium + Large to FAL catalog (#33506)
fal announced Krea 2 day-0 as an official API partner on 2026-05-27.
Add both variants to the FAL_MODELS catalog so they appear in the
'hermes tools' model picker alongside flux-2, gpt-image, nano-banana,
etc. Users who already bill through FAL or Nous Portal subscription
can now use Krea without registering directly with Krea.

Model IDs (as listed in fal's launch announcement):
  fal-ai/krea/v2/medium/text-to-image  — $0.030 / image
  fal-ai/krea/v2/large/text-to-image   — $0.060 / image

Both share the same parameter schema:
  - aspect_ratio (1:1, 4:3, 3:2, 16:9, 2.35:1, 4:5, 2:3, 9:16)
    mapped from our 3 abstract ratios via size_style='aspect_ratio'
  - creativity (raw|low|medium|high; default medium)
  - seed (reproducibility)
  - image_style_references (up to 10 per Krea's API spec)

No num_inference_steps / guidance_scale / num_images — Krea 2 does
not expose those, and the supports-set filter strips them defensively
if the agent ever passes them.

This is the FAL-routed variant. The separate native-Krea-API plugin
shipped in PR #33236 (plugins/image_gen/krea/) remains available for
users who want to bill directly through Krea's API with their own
key. Both routes converge on the same underlying model.

Nous Portal managed-FAL gateway: this commit makes the model IDs
known to the catalog and the picker. The Portal team will need to
allowlist these two endpoint slugs on the fal-queue origin server-side
for them to flow through the managed billing path.
2026-05-27 21:42:52 -07:00
Wesley Simplicio
10f13c3881
fix(web): allow mobile dashboard scrolling (#28051) (#28577)
* fix(web): allow mobile dashboard scrolling

* fix(web): combine mobile root scroll rules

---------

Co-authored-by: Wesley Simplicio <wesley.simplicio.ext@siemens-energy.com>
2026-05-28 00:02:50 -04:00