Commit graph

8158 commits

Author SHA1 Message Date
wesleysimplicio
1beb578fde fix(ci): install ripgrep in e2e job
Closes #22003
2026-05-12 17:09:45 -07:00
wuwuzhijing
a694a26330 docs(gateway): mention Weixin in gateway help and docstrings
Salvage of #21063 — adds 'Weixin, and more' to module-level docstrings
in gateway/__init__.py, gateway/config.py, gateway/platforms/base.py
and the 'hermes gateway' subparser description.

Co-authored-by: wuwuzhijing <chuang.guo@hopechart.com>
2026-05-12 17:08:51 -07:00
Teknium
29c9ff9ba5
fix(lsp): typescript SDK install + tsc-missing skip + shellcheck warning (#24630)
Three follow-ups to PR #24168 found during live E2E testing on TS/bash files:

1. typescript-language-server now installs the typescript SDK (tsserver)
   alongside it. Without that sibling install, initialize() failed with
   "Could not find a valid TypeScript installation" and the server was
   marked broken — no diagnostics ever reached the agent. New extra_pkgs
   field on INSTALL_RECIPES makes that explicit and reusable for future
   peer-dep cases.

2. _check_lint now treats "linter command exists on PATH but cannot
   actually run" as skipped instead of error. The motivating case is
   npx tsc when typescript is not in node_modules — npx prints its
   "This is not the tsc command you are looking for" banner and exits
   non-zero, which previously blocked the LSP semantic tier (gated on
   success or skipped). Pattern-matched per base command (npx,
   rustfmt, go) so genuine lint errors still flow through normally.

3. hermes lsp status now surfaces a Backend warnings section when
   bash-language-server is installed but shellcheck is missing. The
   server itself spawns fine but bash-language-server delegates
   diagnostics to shellcheck — without it on PATH the integration
   looks alive but never reports any problems. Same warning is
   logged once at server spawn time.

Validation:

- 12 new tests in tests/agent/lsp/test_install_and_lint_fixes.py:
    * recipe carries typescript SDK
    * _install_npm passes both pkg + extras to npm CLI
    * backwards compat: recipes without extras still work
    * _backend_warnings quiet when bash absent / both present
    * _backend_warnings fires when bash installed without shellcheck
    * status output includes the Backend warnings section
    * _looks_like_linter_unusable catches the npx tsc banner
    * real TS type errors not misclassified as unusable
    * unfamiliar linters fall through normally
    * _check_lint returns skipped on npx tsc unusable
    * _check_lint returns error on real tsc type errors
- Full lsp + file_operations test suite: 245/245 pass
- Live E2E:
    * try_install("typescript-language-server") installs both packages
      into node_modules
    * write_file(bad.ts, ...) returns lint=skipped + lsp_diagnostics
      with two real TS errors (was lint=error, no lsp_diagnostics)
    * hermes lsp status renders the shellcheck warning when bash is
      installed but shellcheck is not on PATH
2026-05-12 17:02:35 -07:00
Teknium
6f285efb80
fix(telegram): clear in-progress reaction on cancelled processing (#24628)
When the user runs /stop or a session is interrupted mid-flight, the
👀 in-progress reaction lingered on the user's message indefinitely.
Without another agent run to swap it for 👍/👎, the eyes stayed there
forever — visually misleading (looks like the agent is still working).

Fix: on ProcessingOutcome.CANCELLED, call set_message_reaction with
reaction=None to clear all reactions on the message. Documented Bot API
semantics (equivalent to Bot API 10.0's deleteMessageReaction, but works
on PTB 22.6 already without the version bump).

Test changes:
- Renamed test_on_processing_complete_cancelled_keeps_existing_reaction
  → test_on_processing_complete_cancelled_clears_reaction; updated
  assertion to expect set_message_reaction(reaction=None).
- Added test_on_processing_complete_cancelled_skipped_when_disabled
  (TELEGRAM_REACTIONS=false short-circuits).
- Added test_clear_reactions_handles_api_error_gracefully and
  test_clear_reactions_returns_false_without_bot to cover the new
  _clear_reactions helper.
2026-05-12 17:02:29 -07:00
Teknium
413990c945 chore(release): add AUTHOR_MAP entries for JamesX88 2026-05-12 16:45:04 -07:00
JamesX88
a33ec10874 fix(cli): @-file completion crash on Windows when paths aren't cp1252-decodable
The fuzzy @-file completer shells out to 'rg --files' via subprocess.run
with text=True. On Windows, Python 3.13 decodes stdout using the system
ANSI codepage (cp1252), so any filename containing bytes like 0x81/0x8f
crashes the background reader thread with UnicodeDecodeError. The
exception is swallowed inside subprocess, leaving proc.stdout=None, and
the next line ('proc.stdout.strip()') blows up with:

  AttributeError: 'NoneType' object has no attribute 'strip'

This takes down the prompt_toolkit event loop and forces 'Press ENTER to
continue' until the user clears the @-query.

Fix:
- Pass encoding='utf-8', errors='replace' so rg's UTF-8 output is decoded
  consistently across platforms and unmappable bytes don't crash.
- Guard 'proc.stdout' with a None check before .strip(), so a future
  reader-thread failure degrades gracefully instead of breaking input.
2026-05-12 16:45:04 -07:00
Teknium
c7cfad5d96 chore(release): add AUTHOR_MAP entries for NorethSea 2026-05-12 16:44:03 -07:00
NorethSea
7a4ad5ccb4 fix(cli): use display-width for response box header label to support CJK
Replace `len(label)` with `HermesCLI._status_bar_display_width(label)`
in two places where the response box top border is rendered.

`len()` counts characters, not terminal columns. CJK characters like
`测` and `试` each occupy 2 columns, causing the top border
`╭─ 测试 ───╮` to render 2 columns wider than the bottom border
`╰─────────╯`.

The `_status_bar_display_width` helper already exists (line 2881) and
uses `prompt_toolkit.utils.get_cwidth` for proper CJK width calculation.
2026-05-12 16:44:03 -07:00
Teknium
b7bd0f77f3 chore(release): add AUTHOR_MAP entries for laoli-no1 2026-05-12 16:42:53 -07:00
laoli-no1
d33deb7cbe fix(tui): clear scrollback buffer on startup to prevent tmux scrollback leakage
When TUI exits, tmux captures some TUI output into its scrollback buffer.
On restart, stale scrollback content appears at the top of screen before
AlternateScreen takes over.

Add ANSI escape sequences at startup:
- ESC[2J  clear visible screen
- ESC[H   cursor home
- ESC[3J  clear scrollback buffer
2026-05-12 16:42:53 -07:00
liuhao1024
2a3140a814 fix(dashboard): rescan plugins when cached directory is removed 2026-05-12 16:41:33 -07:00
Teknium
6ec89d885d chore(release): add AUTHOR_MAP entries for aqilaziz 2026-05-12 16:40:10 -07:00
aqilaziz
80375cbe2c fix(dashboard): display real config path on Config page
Replace the hardcoded i18n placeholder "~/.hermes/config.yaml" with the
real config_path returned from api.getStatus(), falling back to the i18n
string while loading or on API failure.

Co-authored-by: aqilaziz <gonzes7@gmail.com>
2026-05-12 16:40:10 -07:00
Teknium
782e3f5164 chore(release): add AUTHOR_MAP entries for AllynSheep 2026-05-12 16:38:14 -07:00
AllynSheep
e3858772d0 fix(dashboard): skip browser-open on headless Linux to prevent process exit
Fixes #24127

On headless Linux VPS (no DISPLAY or WAYLAND_DISPLAY), some Python
webbrowser backends register TUI programs such as links, lynx, or
www-browser.  GenericBrowser.open() spawns these without redirecting
stdin/stdout, allowing them to take over the terminal.  This can cause
the process to receive SIGHUP and exit immediately even though uvicorn
bound the port successfully, producing a misleading success message
followed by an empty --status.

Fix: detect headless Linux at startup and skip the auto-open when no
display server is available.  On such systems the URL is still printed
so the user can open it manually or via an SSH tunnel.  The webbrowser
call is also wrapped in a try/except so any unexpected failure on other
platforms is silently absorbed rather than surfacing as an unhandled
exception in the daemon thread.
2026-05-12 16:38:14 -07:00
Teknium
b3ca6362a8 chore(release): add AUTHOR_MAP entry for hookinglau 2026-05-12 16:36:20 -07:00
hookinglau
d68a0ec383 fix(auxiliary): pass cfg_base_url and cfg_api_key when resolving task provider
_resolve_task_provider_model drops cfg_base_url and cfg_api_key when
returning a named provider, causing configured API keys and base URLs
to be lost. Pass them through so named providers can use custom
endpoints while still resolving credentials from provider-specific
env vars.

Closes #20139
2026-05-12 16:36:20 -07:00
Teknium
389c707e42 chore(release): add AUTHOR_MAP entry for ryptotalent 2026-05-12 16:34:40 -07:00
ryptotalent
9b2488af2a fix: include arg-taking commands in Telegram menu
Built-in commands with required args (e.g. /queue, /steer, /background)
were excluded from Telegram setMyCommands output, making them invisible
in the autocomplete menu. However, their handlers already return usage
text when invoked without arguments, so hiding them hurts discoverability.

This commit removes the _requires_argument filter for built-in commands
(COMMAND_REGISTRY) while keeping it for plugin-registered slash commands,
which may not provide a no-arg usage fallback.

Closes #24312
2026-05-12 16:34:40 -07:00
Teknium
29d7c244c5
feat(gateway): wire clarify tool with inline keyboard buttons on Telegram (#24199)
The clarify tool returned 'not available in this execution context' for
every gateway-mode agent because gateway/run.py never passed
clarify_callback into the AIAgent constructor. Schema actively encouraged
calling it; users never saw the question.

Changes:

- tools/clarify_gateway.py — new event-based primitive mirroring
  tools/approval.py: register/wait_for_response/resolve_gateway_clarify
  with per-session FIFO, threading.Event blocking with 1s heartbeat
  slices (so the inactivity watchdog keeps ticking), and
  clear_session for boundary cleanup.

- gateway/platforms/base.py — abstract send_clarify with a numbered-text
  fallback so every adapter (Discord, Slack, WhatsApp, Signal, Matrix,
  etc.) gets a working clarify out of the box. Plus an active-session
  bypass: when the agent is blocked on a text-awaiting clarify, the next
  non-command message routes inline to the runner's intercept instead
  of being queued + triggering an interrupt. Same shape as the /approve
  deadlock fix from PR #4926.

- gateway/platforms/telegram.py — concrete send_clarify renders one
  inline button per choice plus '✏️ Other (type answer)'. cl: callback
  handler resolves numeric choices immediately, flips to text-capture
  mode for Other, with the same authorization guards as exec/slash
  approvals.

- gateway/run.py — clarify_callback wired at the cached-agent per-turn
  callback assignment site (only the user-facing agent path; cron and
  hygiene-compress agents have no human attached). Bridges sync→async
  via run_coroutine_threadsafe, blocks with the configured timeout, and
  returns a '[user did not respond within Xm]' sentinel on timeout so
  the agent adapts rather than pinning the running-agent guard. Text-
  intercept added to _handle_message before slash-confirm intercept
  (skipping slash commands). clear_session called in the run's finally
  to cancel any orphan entries.

- hermes_cli/config.py — agent.clarify_timeout default 600s.

- website/docs/user-guide/messaging/telegram.md — Interactive Prompts
  section.

Tests:

- tests/tools/test_clarify_gateway.py (14 tests) — full primitive
  coverage: button resolve, open-ended auto-await, Other flip, timeout
  None, unknown-id idempotency, clear_session cancellation, FIFO
  ordering, register/unregister notify, config default.

- tests/gateway/test_telegram_clarify_buttons.py (12 tests) — render
  paths (multi-choice/open-ended/long-label/HTML-escape/not-connected),
  callback dispatch (numeric resolve/Other flip/already-resolved/
  unauthorized/invalid-token), and base-adapter text fallback.

Out of scope: bot-to-bot, guest mode, checklists, poll media, live
photos. Closes #24191.
2026-05-12 16:33:33 -07:00
teknium
76bbb94be4 chore: AUTHOR_MAP entry for AhmetArif0 (PR #24600) 2026-05-12 16:33:09 -07:00
EloquentBrush
f9559c39c4 fix(gateway): consult lock record argv when cmdline unreadable in scoped-lock stale check
PR #24500 introduced stale-lock detection that calls
`_looks_like_gateway_process` to confirm a running PID is not an
unrelated process that reused the slot.  On Windows neither `/proc`
nor `ps` is available, so `_read_process_cmdline` always returns
`None` and `_looks_like_gateway_process` always returns `False` —
causing every valid Windows gateway lock to be marked stale and
immediately evicted.

Fix: after `_looks_like_gateway_process` returns `False`, call
`_read_process_cmdline` directly.  If the result is non-`None` the
live cmdline was readable and confirms the PID is foreign → stale.
If it is `None` (cmdline unreadable, e.g. Windows without ps), fall
back to `_record_looks_like_gateway` which validates the stored
`argv` the gateway wrote into the lock file at startup.  Both
oracles must say "not a gateway" before the lock is evicted — the
same two-oracle pattern already used in `get_running_pid` (line 941).

Adds a regression test that simulates a Windows host where
`_looks_like_gateway_process` returns `False` for every PID and
`_read_process_cmdline` returns `None`, confirming the lock is kept
when the record's argv identifies it as a gateway process.
2026-05-12 16:33:09 -07:00
Teknium
24e2151cd6 chore(release): add AUTHOR_MAP entries for zccyman and Osraka 2026-05-12 16:32:57 -07:00
zccyman
88ede807c4 fix(pricing): add deepseek-v4-pro to official docs pricing table
deepseek-v4-pro has been routable since v0.12 but was missing from
the _OFFICIAL_DOCS_PRICING table. Sessions using this model showed
as "unknown cost" in hermes insights instead of a dollar estimate.

Add pricing entry using published list prices:
- input: \$1.74/M tokens
- output: \$3.48/M tokens
- cache_read: \$0.0145/M tokens

Uses standard list rates (not the 75% promo) so estimates remain
accurate after promo expires 2026-05-31.

Closes #24218
2026-05-12 16:32:57 -07:00
Teknium
83b93898c2
feat(lsp): semantic diagnostics from real language servers in write_file/patch (#24168)
* feat(lsp): semantic diagnostics from real language servers in write_file/patch

Wire ~26 language servers (pyright, gopls, rust-analyzer, typescript-language-server,
clangd, bash-language-server, ...) into the post-write lint check used by write_file
and patch. The model now sees type errors, undefined names, missing imports, and
project-wide semantic issues introduced by its edits, not just syntax errors.

LSP is gated on git workspace detection: when the agent's cwd or the file being
edited is inside a git worktree, LSP runs against that workspace; otherwise the
existing in-process syntax checks are the only tier. This keeps users on
user-home cwds (Telegram/Discord gateway chats) from spawning daemons.

The post-write check is layered: in-process syntax check first (microseconds),
then LSP semantic diagnostics second when syntax is clean. Diagnostics are
delta-filtered against a baseline captured at write start, so the agent only
sees errors its edit introduced. A flaky/missing language server can never
break a write -- every LSP failure path falls back silently to the syntax-only
result.

New module agent/lsp/ split into:

- protocol.py: Content-Length JSON-RPC framer + envelope helpers
- client.py: async LSPClient (spawn, initialize, didOpen/didChange,
  ContentModified retry, push/pull diagnostic stores)
- workspace.py: git worktree walk-up + per-server NearestRoot resolver
- servers.py: registry of 26 language servers (extension match,
  root resolver, spawn builder per language)
- install.py: auto-install dispatch (npm install --prefix, go install
  with GOBIN, pip install --target) into HERMES_HOME/lsp/bin/
- manager.py: LSPService (per-(server_id, root) client registry, lazy
  spawn, broken-set, in-flight dedupe, sync facade for tools layer)
- reporter.py: <diagnostics> block formatter (severity-1-only, 20-per-file)
- cli.py: hermes lsp {status,list,install,install-all,restart,which}

Wired into tools/file_operations.py:

- write_file/patch_replace now call _snapshot_lsp_baseline before write
- _check_lint_delta gains a third tier: LSP semantic diagnostics when
  syntax is clean
- All LSP code paths swallow exceptions; write_file's contract unchanged

Config: 'lsp' section in DEFAULT_CONFIG with enabled (default true),
wait_mode, wait_timeout, install_strategy (default 'auto'), and per-server
overrides (disabled, command, env, initialization_options).

Tests: tests/agent/lsp/ -- 49 tests covering protocol framing (encode and
read_message round-trip, EOF/truncation/missing Content-Length), workspace
gate (git walk-up, exclude markers, fallback to file location), reporter
(severity filter, max-per-file cap, truncation), service-level delta filter,
and an in-process mock LSP server that exercises the full client lifecycle
including didChange version bumps, dedup, crash recovery, and idempotent
teardown.

Live E2E verified end-to-end through ShellFileOperations: pyright
auto-installed via npm into HERMES_HOME, baseline captured, type error
introduced, single delta diagnostic surfaced with correct line/column/code/
source, then patch fix removes the diagnostic from the output.

Docs: new website/docs/user-guide/features/lsp.md page covering supported
languages, configuration knobs, performance characteristics, and
troubleshooting; cli-commands.md updated with the 'hermes lsp' reference;
sidebar updated.

* feat(lsp): structured logging, backend gate, defensive walk caps

Cherry-picks the substantive ideas from #24155 (different scope, same
problem space) onto our PR.

agent/lsp/eventlog.py (new): dedicated structured logger
``hermes.lint.lsp`` with steady-state silence. Module-level dedup sets
keep a 1000-write session at exactly ONE INFO line ("active for
<root>") at the default INFO threshold; clean writes log at DEBUG so
they never reach agent.log under normal config. State transitions
(server starts, no project root for a file, server unavailable) fire
at INFO/WARNING once per (server_id, key); novel events (timeouts,
unexpected errors) fire WARNING per call. Grep recipe: ``rg 'lsp\\['``.

agent/lsp/manager.py: wire the eventlog into _get_or_spawn and
get_diagnostics_sync so users can answer "did LSP fire on this edit?"
with a single grep, plus surface "binary not on PATH" warnings once
instead of silently retrying every write.

tools/file_operations.py: backend-type gate. ``_lsp_local_only()``
returns False for non-local backends (Docker / Modal / SSH /
Daytona); ``_snapshot_lsp_baseline`` and ``_maybe_lsp_diagnostics``
now skip entirely on remote envs. The host-side language server
can't see files inside a sandbox, so this prevents pretending to
lint a file the host process can't open.

agent/lsp/protocol.py: 8 KiB cap on the header block in
``read_message``. A pathological server that streams headers
without ever emitting CRLF-CRLF would have looped forever consuming
bytes; now raises ``LSPProtocolError`` instead.

agent/lsp/workspace.py: 64-step cap on ``find_git_worktree`` and
``nearest_root`` upward walks, plus try/except containment around
``Path(...).resolve()`` and child ``.exists()`` calls. Defensive
against pathological inputs (symlink loops, encoding errors,
permission failures mid-walk) — the lint hook is hot-path code and
must never raise.

Tests:
- tests/agent/lsp/test_eventlog.py: 18 tests covering steady-state
  silence (clean writes stay DEBUG), state-transition INFO-once
  semantics (active for, no project root), action-required
  WARNING-once (server unavailable), per-call WARNING (timeouts,
  spawn failures), and the "1000 clean writes => 1 INFO" contract.
- tests/agent/lsp/test_backend_gate.py: 5 tests verifying
  _lsp_local_only / snapshot_baseline / maybe_lsp_diagnostics skip
  the LSP layer for non-local backends and route correctly for
  LocalEnvironment.
- tests/agent/lsp/test_protocol.py: new test_read_message_rejects_runaway_header
  exercising the 8 KiB cap.

Validation:
- 73/73 LSP tests pass (49 original + 18 eventlog + 5 backend-gate + 1 framer cap)
- 198/198 pass when run alongside existing file_operations tests
- Live E2E re-run with pyright still surfaces "ERROR [2:12] Type
  ... reportReturnType (Pyright)" through the full path, then patch
  fix removes it on the next call.

* feat(lsp): atexit cleanup + separate lsp_diagnostics JSON field

Two improvements salvaged from #24414's plugin-form alternative,
keeping our core-integrated design:

1. atexit cleanup of spawned language servers
   ----------------------------------------------------------------
   ``agent/lsp/__init__.get_service`` now registers an ``atexit``
   handler on first creation that tears down the LSPService on
   Python exit.  Without this, every ``hermes chat`` exit was
   leaking pyright/gopls/etc. processes for a few seconds while
   their stdout buffers drained -- they got reaped by the kernel
   eventually but a watchful ``ps aux`` would catch them.

   The handler runs once per process (gated by
   ``_atexit_registered``); idempotent ``shutdown_service``
   ensures double-fire is a no-op.  Errors during shutdown are
   swallowed at debug level since by the time atexit fires the
   user has already seen the agent's final response.

2. Separate ``lsp_diagnostics`` field on WriteResult / PatchResult
   ----------------------------------------------------------------
   Previously the LSP layer folded its diagnostic block into the
   ``lint.output`` string, conflating the syntax-check tier with
   the semantic tier.  The agent (and any downstream parsers) now
   read syntax errors and semantic errors as independent signals:

       {
         "bytes_written": 42,
         "lint": {"status": "ok", "output": ""},
         "lsp_diagnostics": "<diagnostics file=...>\nERROR [2:12] ..."
       }

   ``_check_lint_delta`` returns to its original two-tier shape
   (syntax check + delta filter); ``write_file`` and
   ``patch_replace`` independently fetch LSP diagnostics via
   ``_maybe_lsp_diagnostics`` and pass them into the new field.
   ``patch_replace`` propagates the inner write_file's
   ``lsp_diagnostics`` so the outer PatchResult carries the patch's
   delta correctly.

Tests: 19 new
- tests/agent/lsp/test_lifecycle.py (8 tests): atexit registration
  fires once and only once across N get_service calls; the
  registered callable is our internal shutdown wrapper;
  shutdown_service is idempotent and safe when never started;
  exceptions during shutdown are swallowed; inactive service is
  cached so we don't rebuild on every check.
- tests/agent/lsp/test_diagnostics_field.py (11 tests): WriteResult
  / PatchResult dataclass shape, to_dict include/omit semantics,
  channel separation (lint and lsp_diagnostics carry independent
  signals), write_file populates the field via
  _maybe_lsp_diagnostics only when the syntax tier is clean,
  patch_replace propagates the field forward from its internal
  write_file.

Validation:
- 92/92 LSP tests pass (73 prior + 8 lifecycle + 11 diagnostics field)
- 217/217 pass with file_operations + LSP combined
- Live E2E reverified: clean writes -> both fields empty/none; type
  error introduced -> lint clean (parses), lsp_diagnostics carries
  the pyright reportReturnType block; patch fix -> both fields
  clean again.

* fix(lsp): broken-set short-circuit so a wedged server isn't paid every write

Discovered while auditing failure paths: a language server binary that
hangs (sleep forever, no LSP traffic on stdin/stdout) caused EVERY
subsequent write to re-pay the 8s snapshot_baseline timeout. Five
writes = ~64s of dead time.

The bug: ``_get_or_spawn`` adds the (server_id, root) pair to
``_broken`` inside its inner exception handler, but when the OUTER
``_loop.run`` timeout fires, it cancels the inner task before that
handler runs. The pair never makes it to broken-set, so the next
write re-enters the spawn path and re-pays the timeout.

Fix:

- New ``_mark_broken_for_file`` helper at the service layer marks
  the (server_id, workspace_root) pair broken from the OUTSIDE when
  the outer timeout fires. Called from the except branches in
  ``snapshot_baseline``, ``get_diagnostics_sync`` (asyncio.TimeoutError
  + generic Exception). Also kills any orphan client process that
  survived the cancelled future, fire-and-forget with a 1s ceiling.

- ``enabled_for`` now consults the broken-set BEFORE returning True.
  Files in already-broken (server_id, root) pairs short-circuit to
  False, so the file_operations layer skips the LSP path entirely
  with no spawn cost. Until the service is restarted (``hermes lsp
  restart``) or the process exits.

- A single eventlog WARNING is emitted on first mark-broken so the
  user knows which server gave up. Subsequent edits in the same
  project stay silent.

Tests: 7 new in tests/agent/lsp/test_broken_set.py — covers the
key shape (server_id, per_server_root), enabled_for short-circuit,
sibling-file skip in same project, project isolation (broken in
A doesn't affect B), graceful no-op for missing-server / no-workspace,
and an end-to-end test that snapshots after a failure and verifies
the next ``enabled_for`` returns False.

Validation:

- Live retest of the wedged-binary scenario: 5 sequential writes,
  first 8.88s (the one snapshot timeout), subsequent four ~0.84s
  (no LSP cost). Down from 5x12.85s = 64s before this fix.
- 99/99 LSP tests pass (92 prior + 7 broken-set)
- 224/224 pass with file_operations + LSP combined
- Happy path E2E reverified — clean write, type error introduced,
  patch fix all behave correctly with the new broken-set logic.

Note: the FIRST write to a wedged binary still pays 8s (the
snapshot_baseline timeout). We could shorten that, but pyright/
tsserver normally take 2-3s and slow CI rust-analyzer can need
5+ seconds, so 8s is the conservative ceiling. Subsequent writes
are instant.
2026-05-12 16:31:54 -07:00
Teknium
d89553c2d6
fix(daytona): migrate legacy-sandbox lookup to cursor-based list() (#24587)
Daytona ships breaking SDK changes on June 10, 2026 — `list()` returns
an iterator and the `page=` offset parameter is removed. We pin
daytona==0.155.0 so we're past the May 24 hard-cutoff, but the
legacy-sandbox resume path in DaytonaEnvironment still passes `page=1`
and reads `.items` off the result.

Switch to `next(iter(results), None)` against a single-result
`list(labels=..., limit=1)` call. Update tests to use `iter([...])`
and drop the `page=1` kwarg from list() assertions.
2026-05-12 16:31:46 -07:00
Teknium
38441a7d77
docs(camofox): expand externally-managed sessions section (#24584)
Adds behavior detail to the existing 'Externally managed Camofox sessions'
subsection in features/browser.md:

- Three-row settings table (config key + env var + effect).
- 'What changes when user_id is set' — soft-cleanup behavior, why
  DELETE /sessions/<user_id> is skipped.
- 'How tab adoption works' — 4-step lookup against GET /tabs, listItemId
  matching, fallback to new-tab creation, no mid-run re-polling.
- Picking session_key: how to attach to a specific existing tab vs
  share-profile-only behavior with the default per-task session_key.
- Concurrency note that Camofox does not arbitrate per-tab focus.
2026-05-12 15:20:42 -07:00
Teknium
f63d520496 chore(camofox): document new env vars + AUTHOR_MAP entry
Follow-up to externally managed Camofox session support:
- .env.example: document CAMOFOX_URL plus the new CAMOFOX_USER_ID,
  CAMOFOX_SESSION_KEY, CAMOFOX_ADOPT_EXISTING_TAB env vars.
- scripts/release.py: AUTHOR_MAP entry for db@project-aeon.com -> db-aeon.
2026-05-12 15:14:49 -07:00
Dan Benyamin
62fd905340 feat(browser): support externally managed Camofox sessions
Allow integrations to share a visible Camofox identity with Hermes and recover existing tabs without carrying local patches.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 15:14:49 -07:00
Teknium
3955aefced
fix(install): use --extra all not --all-extras; drop lazy-covered extras from [all] (#24515)
* fix(install): use `--extra all` not `--all-extras`; drop lazy-covered extras from [all]

Two coupled fixes for the Windows install hang where uv sync built
python-olm from sdist and failed on missing make.

# Root cause: --all-extras vs --extra all (credit: ethernet)

`uv sync --all-extras` installs every key in [project.optional-
dependencies], bypassing the curated [all] extra entirely. So even
when [all] excluded [matrix], [rl], [yc-bench], etc., the installer
pulled them anyway because they were still defined as extras. On
Windows that meant python-olm (no wheel, needs make to build from
sdist) and the install died there.

The right flag is `--extra all` — install just the [all] extra's
contents, respecting curation. Empirically verified via dry-run:

  --all-extras: pulls python-olm, mautrix, ctranslate2, onnxruntime,
                atroposlib, tinker, wandb, modal, daytona, vercel,
                python-telegram-bot, discord.py, slack-bolt,
                dingtalk-stream, lark-oapi, anthropic, boto3,
                edge-tts, elevenlabs, exa-py, fal-client, faster-
                whisper, firecrawl-py, honcho-ai, parallel-web
  --extra all:  pulls none of those — just [all]'s curated set

Dockerfile already uses `--extra all` (with comment explaining the
gotcha) — knowledge existed; the gap was install.sh / install.ps1 /
setup-hermes.sh.

Sites fixed: scripts/install.sh L1118, scripts/install.ps1 L809,
setup-hermes.sh L245.

# Companion fix: drop lazy-covered extras from [all]

`tools/lazy_deps.py` already covers anthropic, bedrock, exa,
firecrawl, parallel-web, fal, edge-tts, elevenlabs, modal, daytona,
vercel, all messaging platforms (telegram/discord/slack/matrix/
dingtalk/feishu), honcho, and faster-whisper. They were ALSO in
[all], which defeats the whole point of lazy-install — fresh
installs eager-pulled them and inherited whatever was broken
upstream (the matrix → python-olm → no Windows wheel chain being
the proximate symptom).

[all] now contains only what genuinely can't be lazy-installed:
cron, cli, dev, pty, mcp, homeassistant, sms, acp, google, web,
youtube. Same trim applied to [termux-all]. New regression test
asserts the contract: every extra in LAZY_DEPS must NOT also appear
in [all].

# Companion fix: surface uv progress + errors

setup-hermes.sh's hash-verified path swallowed uv's stderr to a
tempfile, identical to the install.sh bug fixed in PR #24504. Same
fix applied: stream stderr through directly so users see live
progress instead of staring at a frozen prompt.

# Files

- pyproject.toml: trim [all] and [termux-all] to non-lazy extras only.
- scripts/install.sh: --all-extras → --extra all; trim _ALL_EXTRAS /
  _PYPI_EXTRAS to match.
- scripts/install.ps1: --all-extras → --extra all; trim $allExtras /
  $pypiExtras to match.
- setup-hermes.sh: --all-extras → --extra all; stream stderr.
- tests/test_project_metadata.py: invert matrix-in-[all] assertion;
  add lazy-coverage contract test.
- uv.lock: regenerated.

# Validation

5/5 metadata tests pass. 37/37 in update_autostash + tool_token_
estimation. `uv lock --check` passes. Empirical dry-run confirms
`--extra all` excludes python-olm + RL chain on the new lockfile.

* fix(install): parse [all] from pyproject.toml instead of mirroring it

ethernet's review point: the previous patch left two hand-mirrored
copies of [all]'s contents (in install.sh's $_ALL_EXTRAS and
install.ps1's $allExtras). That guarantees future drift the next
time pyproject.toml's [all] changes.

Now both scripts parse pyproject.toml at install time using stdlib
tomllib (Python 3.11+, which the bootstrap step already requires).
Single source of truth. The only purpose of the parsed list is to
build the 'Tier 2: [all] minus broken extras' fallback spec — so we
parse, filter against $brokenExtras, and rebuild the .[a,b,c] spec.

Also: removed redundant fallback tiers.

  Before:   Tier 1 [all]
            Tier 2 [all] minus broken
            Tier 3 PyPI-only extras (no git deps)
            Tier 4 [web,mcp,cron,cli,messaging,dev]
            Tier 5 .

  After:    Tier 1 [all]
            Tier 2 [all] minus broken
            Tier 3 .

Tier 3 (PyPI-only) and Tier 4 (dashboard+core) used to dodge the [rl]
git+sdist deps and the [matrix] python-olm build. Both are no longer
in [all] post-2026-05-12 lazy-install migration, so the carve-out
tiers had no remaining content. Tier 4 also referenced [messaging],
which is now lazy-installed — the hardcoded fallback was actually
inconsistent with the new policy.

Defensive fallback: if tomllib parse fails (corrupted pyproject,
unexpected schema), Tier 2 collapses to '.[all]' (same as Tier 1) so
the broken-extras path becomes a no-op rather than crashing.

* fix(gateway): hide Matrix from setup picker on Windows

Matrix is the one messaging platform that has no working install path
on Windows: [matrix] -> mautrix[encryption] -> python-olm, which has
Linux-only wheels and needs make + libolm to build from sdist. The
[all] cleanup in this PR keeps mautrix out of fresh installs, but a
user who picked Matrix in 'hermes setup gateway' would still walk
into the same sdist build failure when the wizard tried to install
the extra.

Hide the option at the picker so users never get the chance to try.
The gate lives in _all_platforms() — single source of truth for the
setup wizard, the curses gateway-config menu, and any future picker.

Adapter loading at runtime is intentionally NOT gated: users who
already have MATRIX_* env vars set (e.g. config copied from a Linux
install) keep working if they somehow have python-olm available.
This is the lowest-friction fix — picker visibility only.

Tests cover linux/darwin/win32 and verify other platforms aren't
collateral damage.
2026-05-12 15:06:25 -07:00
Ahmet Oşrak
4bb0a82a2b fix(gateway): enqueue SSE EOS sentinel on task completion 2026-05-12 15:04:54 -07:00
Teknium
4fa5f7b765 chore(release): add AUTHOR_MAP entry for luarss 2026-05-12 15:03:33 -07:00
luarss
1189ed7855 fix(docs): correct broken internal links to webhooks and mlops skill pages
- cron-script-only: webhook subscription links pointed to
  /docs/user-guide/features/webhooks; the page lives under messaging/
- mlops-hermes-atropos-environments: axolotl and TRL related-skill links
  pointed to skills/bundled/mlops/; both files live under skills/optional/mlops/

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 15:03:33 -07:00
墨綠BG
71198b9e19 📝 docs(kanban): clarify dependent task gating 2026-05-12 15:01:55 -07:00
Teknium
954e854ccc chore(release): map kyanam.preetham@gmail.com → pkyanam 2026-05-12 15:00:29 -07:00
Teknium
629c33c633 test(gateway): patch _pid_exists instead of os.kill for scoped-lock tests
Post-#21561 the liveness probe in acquire_scoped_lock() routes through
gateway.status._pid_exists (psutil-first, safe on Windows), not
os.kill(pid, 0). The two new macOS regression tests were patching
status.os.kill, which had no effect — the unmocked psutil call returned
False for PID 99999, marking the lock stale before the new code branch
ran. The 'replaces' test passed only because acquired=True was already
the expected outcome; the 'keeps' test failed in CI.

Switch both tests to monkeypatch status._pid_exists directly, matching
the existing test_acquire_scoped_lock_rejects_live_other_process pattern,
so they actually exercise the new start_time=None + cmdline-based
staleness branch.
2026-05-12 15:00:29 -07:00
Preetham Kyanam
653d304290 fix(gateway): detect stale scoped locks via cmdline when start_time is absent on macOS
On macOS (and Windows), /proc is unavailable so _get_process_start_time()
always returns None. When a gateway creates a scoped lock record with
start_time=None and then exits, macOS can reuse that PID for an unrelated
process. On restart, acquire_scoped_lock() sees:

  1. os.kill(pid, 0) succeeds (PID is alive — but it's bluetoothuserd, not
     the gateway)
  2. existing.start_time is None and current_start is None, so the
     start_time comparison is inconclusive
  3. The lock is treated as active, blocking gateway startup with:
     "Telegram bot token already in use (PID 873). Stop the other gateway
     first."

Root cause: _read_process_cmdline() only reads /proc/<pid>/cmdline, which
doesn't exist on macOS. It always returns None, making
_looks_like_gateway_process() always return False, so the cmdline fallback
path in acquire_scoped_lock() was unreachable on macOS.

Fix (two parts):

1. _read_process_cmdline(): Add a ps(1) fallback for platforms without
   /proc. When /proc/<pid>/cmdline doesn't exist, we now run
   "ps -p <pid> -o command=" to retrieve the process command line. The
   /proc path is tried first (preserving Linux performance); ps is only
   invoked as a fallback.

2. acquire_scoped_lock(): When both the lock record's start_time and the
   live process's start_time are None (the macOS case), fall back to
   checking whether the live PID still looks like a Hermes gateway process
   via _looks_like_gateway_process(). If it doesn't, the lock is stale.

Closes #16376
2026-05-12 15:00:29 -07:00
Austin Pickett
642768c5c7
Merge pull request #24161 from NousResearch/austin/fix/dashboard
fix(dashboard): UI polish — modals, layout, consistency
2026-05-12 17:57:31 -04:00
helix4u
a34998ee2f fix(cli): parse positional insights days 2026-05-12 14:56:47 -07:00
rob-maron
c23a87bc16
union paid recs from nous portal with static list (#24509) 2026-05-12 12:16:17 -07:00
Teknium
d186186e1a
fix(install): surface uv install + uv.lock sync errors instead of silently hanging (#24504)
The c1eb2dcda tiered installer made two install paths look frozen on
slow networks or broken environments because both swallowed the
underlying tool's stderr.

scripts/install.sh, setup-hermes.sh:
  curl -LsSf https://astral.sh/uv/install.sh | sh 2>/dev/null
  printed only '✗ Failed to install uv' on failure with no diagnostic.
  Common real causes (glibc mismatch on old distros, corp proxy / TLS
  interception, missing curl, ~/.local/bin not writable, disk full)
  were invisible. Also: piping curl into sh masks curl failures under
  set -e (no pipefail) — sh exits 0 on empty stdin, so a network error
  succeeded silently.
  Fix: download installer to a tempfile first, then run it. Capture
  curl + installer output to a log; on failure, indent and print it.

scripts/install.sh hash-verified tier:
  uv sync --all-extras --locked 2>"$(mktemp)" silenced uv's progress
  output, making a fresh-venv install (~50 transitives including
  torch-class deps) look hung for 1-5 minutes — users see 'Trying tier:
  hash-verified (uv.lock) ...' and assume it's frozen. The mktemp
  substitution also wasn't saved to a variable, so the uv error on
  failure was unreachable.
  Fix: stream uv's stderr directly so users see live 'Resolved N /
  Prepared / Installed' progress. Print an upfront note that the first
  run takes 1-5 minutes.
2026-05-12 12:11:16 -07:00
rob-maron
2863e9484a
Use nous portal as model metadata authority (#24502)
* nous portal metadata resolver

* minor fixes
2026-05-12 11:59:31 -07:00
Teknium
c594a23047
feat(agent): per-turn file-mutation verifier footer (#24498)
Detect when write_file / patch calls fail during a turn and are never
superseded by a successful write to the same path.  When the final
text response is delivered, append an advisory footer listing the
files that did NOT change — so models that over-claim 'patched 5 files'
after 4 silent failures can't hide the lie.

Catches the failure mode reported in Ben Eng's llm-wiki session:
grok-4.1-fast issued batches of parallel patches, half failed with
'Could not find old_string', and the agent summarised the turn
claiming every file was edited.  The user had to manually run
'git status' each turn to catch it.

The verifier is a pure post-hoc check on tool results — no new LLM
calls, no synthetic messages injected into history (prompt cache
preserved), no changes to tool argument dispatch.  Per-turn state is
keyed by path; a later successful write to the same path clears the
failure entry so single-file retry recovery is not flagged.

Wired into both _execute_tool_calls_concurrent and
_execute_tool_calls_sequential, so batched parallel patches and one-at-
a-time edits are both covered.  Footer emission happens after the
agent loop exits, before transform_llm_output / post_llm_call plugin
hooks run, so plugins still see (and can modify) the augmented text.

Config: display.file_mutation_verifier (bool, default true) +
HERMES_FILE_MUTATION_VERIFIER env override.

31 unit tests in tests/run_agent/test_file_mutation_verifier.py cover
target extraction (write_file, patch-replace, patch-v4a single and
multi-file), error-preview extraction (JSON .error field and plain
string), per-turn state transitions (first-error-wins on repeated
failure, success supersedes failure), footer rendering (truncation
at 10 entries, user-actionable hint), and env/config precedence.

Companion docs updated: user-guide/configuration.md +
reference/environment-variables.md.
2026-05-12 11:54:13 -07:00
Austin Pickett
fc3fd6bb6b fix(dashboard): UI polish — modals, layout, consistency, test fixes
Dashboard UX polish pass — consolidates create forms into modals
triggered from the page header, fixes layout inconsistencies, adds
scroll-to navigation for the Keys page, and aligns the TokenBar with
the design system.

Changes:
- App.tsx: add padding to sidebar header
- resolve-page-title.ts: add missing routes, better fallback title
- en.ts: fix nav labels (Profiles was 'profiles : multi agents')
- ModelsPage: two-col layout, auxiliary tasks modal, TokenBar redesign
- ProfilesPage: create button in header, form in modal, Checkbox component
- CronPage: create button in header, form in modal
- EnvPage: scroll-to sub-nav in header, fix text overflow

Modal and dialog standardization:
- Replace all native confirm()/window.confirm() with ConfirmDialog
  (OAuthProvidersCard, PluginsPage, ModelsPage, ConfigPage)
- Add useModalBehavior hook (Escape-to-close, scroll lock, focus restore)
- Apply hook to ProfilesPage, CronPage, AuxiliaryTasksModal

Component fixes (from PR review):
- Checkbox: fix controlled/uncontrolled mismatch, add focus-visible ring
- TokenBar: add rounded-full to legend dots, remove dead code

CI/test fixes:
- Fix TS unused imports (noUnusedLocals), type-narrow PickerTarget union
- Add windows-footgun suppression on platform-guarded os.killpg
- Fix 19 stale unit tests + 9 e2e tests broken by recent main changes
- Restore minimal example-dashboard plugin for plugin auth test
2026-05-12 13:59:22 -04:00
Teknium
dd0923bb89
docs: remove public advisory page (handle community comms separately) (#24253) 2026-05-12 01:09:58 -07:00
Teknium
c1eb2dcda7
feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220)
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback

Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.

# What this PR makes true

1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
   detection banner with copy-pasteable remediation steps the moment
   they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
   a fresh install to 'core only' — the installer keeps every other
   extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
   lazy-install on first use under a strict allowlist, instead of
   eagerly pulling everything at install time.

# Detection: hermes_cli/security_advisories.py

- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
  mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
  dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
  the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
  re-banner after ack.
- Wired into:
  * hermes doctor — runs first, prints full remediation block
  * hermes doctor --ack <id> — dismisses an advisory
  * cli.py interactive run() and single-query branches — short
    stderr banner pointing at hermes doctor
  * gateway/run.py startup — operator-visible warning in gateway.log

# Lazy-install framework: tools/lazy_deps.py

- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
  memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
  uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
  pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
  HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
  * tools/tts_tool.py — _import_elevenlabs() calls ensure first
  * plugins/memory/honcho/client.py — get_honcho_client lazy-installs
  * tts.mistral / stt.mistral entries pre-registered for when PyPI
    restores mistralai

# Installer fallback tiers

scripts/install.sh, scripts/install.ps1, setup-hermes.sh:

- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
  array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
  PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
  landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
  the same _BROKEN_EXTRAS array so updates stay in sync.

Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).

# Config

hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: []  (advisory IDs the user has dismissed)
- allow_lazy_installs: True  (security gate for ensure())

No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.

# Tests

tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
  gateway_log_message
- shipped catalog well-formedness invariant

tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
  every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
  pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command

Combined: 63 new tests, all passing under scripts/run_tests.sh.

# Validation

- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
  tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
  tests/hermes_cli/test_doctor_command_install.py
  tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
  tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
  9191 passed, 8 pre-existing failures (verified on origin/main
  before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
  + gateway_log_message with mocked installed version → produces
  copy-pasteable remediation output

# Community

Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md

Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md

Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>

* build(deps): pin every direct dep to ==X.Y.Z (no ranges)

Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.

Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.

What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.

Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.

Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.

mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.

LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.

Validation:

- Cross-checked all 77 pinned direct deps in pyproject.toml against
  uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
  → 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.

* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra

You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.

# What this commit fixes

1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
   uv.lock records SHA256 hashes for every transitive — a compromised
   package with a different hash gets REJECTED. Falls through to the
   existing `uv pip install` cascade if the lockfile is missing or
   stale, with a loud warning that the fallback path does NOT
   hash-verify transitives. Previously only `setup-hermes.sh` (the dev
   path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
   (the paths fresh users actually run) skipped it.

2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
   project is fully quarantined right now — every version returns 404,
   so any pin we wrote was unresolvable, which broke `uv lock --check`
   in CI. Restoration is documented in pyproject.toml as a 5-step
   checklist (verify, re-add extra, re-enable in 4 modules, regenerate
   lock, optionally re-add to [all]).

3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
   jsonpath-python pruned. `uv lock --check` now passes.

# Defense-in-depth view

| Layer                      | Where             | Protects against                          |
|----------------------------|-------------------|-------------------------------------------|
| Exact pins in pyproject    | direct deps       | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install | transitive graph  | transitive worm injection                  |
| Tier-0 hash-verified path  | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| `uv lock --check` CI gate  | every PR          | drift between pyproject and lockfile      |
| `hermes_cli/security_advisories.py` | runtime  | cleanup for users who already got hit      |

The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.

# Validation

- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
  (test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
  test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.

* chore: remove community announcement drafts (PR body covers it)

* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)

Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.

Moved out of core dependencies = []:
- anthropic   (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client  (image gen; only when picked)
- edge-tts    (default TTS but still optional)

New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].

New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.

Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.

Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).

Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
2026-05-12 01:02:25 -07:00
Teknium
99ad2d1372
fix(deps): unbreak [all] install — drop mistralai while PyPI quarantined (#24205)
The `mistralai` PyPI package was quarantined on 2026-05-12 after a
malicious 2.4.6 release. Every fresh resolve (AUR makepkg, Docker build,
CI run, install.sh first-run) currently fails on
`mistralai>=2.3.0,<3` because PyPI returns zero candidates.

Existing users running `hermes update` mostly didn't notice — `hermes
update` falls back from `.[all]` to per-extra retries and silently
skips mistral with a warning that scrolls past. But fresh installs
hard-fail or lose every other extra.

Changes:
- pyproject.toml: drop `hermes-agent[mistral]` from `[all]` and
  `[termux-all]`. The `mistral` extra itself is preserved so users
  can opt back in once PyPI un-quarantines.
- hermes_cli/tools_config.py: hide Mistral Voxtral TTS from the
  `hermes tools` provider picker until restored.
- hermes_cli/web_server.py: drop "mistral" from dashboard STT options.
- tools/transcription_tools.py: explicit `provider: mistral` returns
  "none" with a clear status message; auto-detect skips mistral.
- tools/tts_tool.py: dispatcher returns a clear "temporarily disabled"
  error before any SDK import attempt (avoids cached-stale-package
  surprises).
- tests/tools/: update three test files to assert the new disabled
  behavior. Each test docstring records why and points at the rollback
  trigger (PyPI un-quarantines mistralai).

Restore plan: revert this commit once the package is available on PyPI
again. The behavior change is intentional and documented in code
comments + test docstrings to make the rollback trivial.

Validation:
- scripts/run_tests.sh tests/tools/ -k 'mistral or stt or tts' →
  425/425 passing.

Refs: https://pypi.org/simple/mistralai/ (currently
"pypi:project-status: quarantined").
2026-05-11 23:02:15 -07:00
nightcityblade
407683b72d fix(docs): repair Voice & TTS provider table
Fixes NousResearch/hermes-agent#24101
2026-05-11 22:42:00 -07:00
Robin Fernandes
94d9db72ba add client marker tag on aux inference requests 2026-05-11 22:30:42 -07:00
Austin Pickett
58e2109f10 fix(minimax): harden OAuth dashboard and runtime
Handle MiniMax OAuth expiry values consistently across CLI and dashboard
flows, fix CLI status/add behavior, and force pooled OAuth runtime
requests through Anthropic Messages.

- web_server._minimax_poller: parse expired_in via the shared resolver
  so unix-ms absolute timestamps stop landing as TTL seconds and crashing
  with 'year 583911 is out of range' when a user connects MiniMax OAuth
  from the dashboard.
- auth._minimax_oauth_login / _refresh_minimax_oauth_state: same fix on
  the CLI login + refresh paths.
- auth.get_auth_status: dispatch minimax-oauth to its dedicated status
  function instead of falling through.
- auth_commands.auth_add_command: 'hermes auth add minimax-oauth' now
  starts the device-code login flow and persists a pool entry with the
  access + refresh tokens, instead of requiring credentials to already
  exist.
- runtime_provider._resolve_runtime_from_pool_entry: pin pooled
  minimax-oauth credentials to anthropic_messages so a stale
  model.api_mode: chat_completions can't send requests to
  /anthropic/chat/completions and trigger MiniMax nginx 404s.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 22:15:16 -07:00