Commit graph

764 commits

Author SHA1 Message Date
wysie
a0fc3df878
fix(browser): rewrite Camofox Docker loopback URLs (#25541)
Co-authored-by: Wysie <wysie@users.noreply.github.com>
2026-05-29 15:43:55 +10:00
Ben
d77d877665 fix(docker): startup orphan reaper for crashed-process containers
The cleanup-fix in the previous commit handles the graceful-exit leak: a
Hermes process that runs ``atexit`` will now actually wait on the docker
stop/rm worker thread, so containers either survive (persist mode) or are
fully removed (opt-out mode) by the time the interpreter exits.

But ``atexit`` doesn't fire on SIGKILL, OOM-kill, or terminal-window
close. Containers from those exits stay parked with no surviving Python
process to reuse or remove them, so they accumulate until the operator
intervenes with ``docker rm -f``. The cleanup-fix doesn't help this class
— there's no live cleanup() to fix.

This commit adds the safety net: a startup orphan reaper that runs once
per Hermes process and removes long-Exited hermes-labeled containers
that the prior commit couldn't reach.

Implementation:

* New ``reap_orphan_containers()`` in ``tools/environments/docker.py``.
  Filters: ``label=hermes-agent=1`` + ``status=exited`` + (optional)
  ``label=hermes-profile=<current>``. Per-container ``docker inspect``
  parses ``State.FinishedAt`` (with nanosecond-precision trimming for
  Python's microsecond-bound ``fromisoformat``); containers older than
  the threshold get ``docker rm -f``'d. The ``status=exited`` filter is
  load-bearing — a running container may belong to a sibling Hermes
  process whose reuse path will pick it up; killing it would crash the
  sibling mid-command. Single-container failures are logged and the
  sweep continues to the next candidate.

* New ``_maybe_reap_docker_orphans()`` helper in
  ``tools/terminal_tool.py``. Wired into ``_create_environment()`` for
  ``env_type == "docker"``. Gated by:

    - ``terminal.docker_orphan_reaper: true`` (default; opt-out for
      operators running multiple Hermes processes in the same profile
      who don't trust the conservative defaults)
    - ``_docker_orphan_reaper_ran`` module flag with double-checked
      locking — parallel subagents and RL rollouts don't trigger N
      concurrent docker ps storms
    - Age threshold = ``2 × TERMINAL_LIFETIME_SECONDS`` with a 60s floor
      (so ``TERMINAL_LIFETIME_SECONDS=0`` doesn't race the user's own
      setup)
    - Profile scoping — a research profile NEVER reaps the default
      profile's stragglers
    - Exception swallow — a janitor failure must never block container
      creation

* New config ``terminal.docker_orphan_reaper`` wired through all four
  config-bridge sites (cli.py, gateway/run.py, hermes_cli/config.py,
  tests/conftest.py) and pinned by
  ``test_docker_orphan_reaper_is_bridged_everywhere``.

Coverage:

* 9 new unit tests in test_docker_environment.py — happy path, recent-
  container sparing, profile scoping, unparseable-timestamp safety,
  docker-ps-failure handling, partial-failure continuation, nanosecond
  timestamp parsing, zero-value FinishedAt rejection.
* 6 new integration tests in test_docker_orphan_reaper_integration.py
  — once-per-process gate, disable-flag respected, lifetime doubling
  with 60s floor, current-profile filter wiring, exception swallow.
* 1 new bridge-invariant regression test.

Closes #20561 (combined with the two prior commits on this branch).
2026-05-29 11:49:54 +10:00
Ben
ac8e238bc8 fix(docker): reuse containers across processes + fix cleanup leaks
The Docker backend docs claim "Single persistent container — ONE long-
lived container shared across sessions, /new, /reset, and delegate_task
subagents. Stopped/removed on shutdown." In practice the code only
honored that contract within a single Python process via the in-memory
\`_active_environments[task_id]\` cache. Every \`hermes chat\` invocation
spawned a fresh \`hermes-<hex>\` container; older containers piled up in
\`Exited\` state and accumulated until manual \`docker rm\` (issue #20561).

Three root causes, all addressed by this commit:

1. No cross-process container discovery.
2. \`cleanup()\` used fire-and-forget \`subprocess.Popen("... &", shell=True)\`
   which raced with parent-process exit — when Python exited promptly the
   detached shell child got killed mid-\`docker stop\`, leaving stopped
   containers behind.
3. The \`docker rm\` step in cleanup was gated on \`not self._persistent\`
   (the bind-mount-persistence flag). Default config sets
   \`container_persistent: true\`, so the default happy path skipped \`rm\`
   entirely — even when the user explicitly didn't want cross-process
   reuse, containers leaked.

Fix:

* Add \`DockerEnvironment.__init__(persist_across_processes=True)\`. When
  true, init probes
  \`docker ps -a --filter label=hermes-agent=1
                  --filter label=hermes-task-id=<task>
                  --filter label=hermes-profile=<profile>\`
  and reuses a matching container (running → attach; stopped →
  \`docker start\` → attach; \`docker start\` failure → fall through to a
  fresh \`docker run\`). Multiple matches prefer the running one, with the
  stragglers left for the orphan reaper (next commit) to clean up.

* Rewrite \`cleanup()\`. Uses \`subprocess.run(..., timeout=30)\` on a
  daemon \`threading.Thread\`, not the racy \`Popen(... &)\`. The
  \`_persistent\` guard is dropped on the \`rm\` step — \`rm\` now runs
  whenever \`persist_across_processes\` is false, regardless of the
  bind-mount-persistence setting. The leak class is gone in all
  combinations.

* Add \`wait_for_cleanup(timeout)\`. \`tools/terminal_tool.py\`'s atexit
  hook calls this on every active env, blocking up to 15s for the
  cleanup thread before interpreter exit. Without this, \`hermes /quit\`
  raced the daemon-thread teardown and dropped the stop/rm work.

* New config \`terminal.docker_persist_across_processes\` (default
  \`true\` — restores the documented contract). Set \`false\` for hard
  per-process isolation. Wired through all four config-bridge sites
  (cli.py env_mappings, gateway/run.py _terminal_env_map,
  hermes_cli/config.py _config_to_env_sync, tests/conftest.py env-strip
  list); regression-pinned by
  \`test_docker_persist_across_processes_is_bridged_everywhere\` matching
  the existing pattern for docker_run_as_host_user / docker_env.

Reuse intentionally does NOT compare image / mounts / resources — only
the labels. Operators changing those settings should set
\`docker_persist_across_processes: false\` (or \`docker rm -f\` the
labeled container) to force a fresh start. This keeps the probe cheap
and the failure mode obvious.

Coverage: 12 new unit tests in tests/tools/test_docker_environment.py
covering reuse paths (running, stopped, fallback, opt-out, duplicate
preference) and cleanup behavior (persist-mode no-rm, opt-out always-rm,
no-Popen, wait_for_cleanup semantics, partial-init safety). Plus one
config-bridge regression pin.

Refs #20561
2026-05-29 11:49:54 +10:00
Teknium
769ee86cd2
feat(kanban): attach images referenced in task bodies to worker vision (#34210)
Kanban workers now scan the task body for local image paths and
http(s) image URLs and attach them to the worker's first user turn —
matching the CLI/gateway behaviour for inbound images. Before, a
user pasting `/home/me/screenshot.png` or `https://example.com/img.png`
into a kanban task description had it sent to the model as plain
text and the pixels were never seen.

How it works:
* agent/image_routing.py gains extract_image_refs(text) → (paths, urls)
  that mirrors gateway/platforms/base.py:extract_local_files (absolute /
  ~-relative paths, image extensions only, ignores fenced/inline code).
* build_native_content_parts() accepts an optional image_urls= kwarg
  and emits passthrough image_url parts for remote URLs alongside the
  base64 data: URLs used for local paths.
* cli.py (single-query/quiet branch — the path every dispatcher-spawned
  worker takes) detects HERMES_KANBAN_TASK, reads the task body via
  kanban_db.get_task, runs extract_image_refs, and threads the results
  into the existing image-routing decision (native vs text). Best-effort:
  enrichment failures never block worker startup.

Tested:
* tests/agent/test_image_routing.py — 22 new tests for extract_image_refs
  and URL pass-through in build_native_content_parts.
* tests/hermes_cli/test_kanban_worker_image_extraction.py — 10 new tests
  driving real kanban_db round-trip (create task → read body → extract
  refs → build parts).
* E2E: created a fake kanban task with a body referencing both a local
  PNG and an https URL; verified the worker pipeline produces a
  multimodal user turn with 1 text part + 2 image_url parts (data URL
  for the local file, passthrough URL for the remote).
2026-05-28 17:50:42 -07:00
kshitijk4poor
5cbc3fbdcc fix(cli): /yolo in chat must enable session bypass, not just set env var
The CLI's in-chat `/yolo` toggle mutated `os.environ["HERMES_YOLO_MODE"]`
but had no effect because `tools/approval.py:_YOLO_MODE_FROZEN` captures
that env var once at module-import time (a deliberate security floor that
keeps prompt-injected skills from flipping the bypass mid-run). By the
time the user reaches `/yolo` in a running CLI session, `tools.approval`
has already been imported, so the env flip after that is a silent no-op.

Result: `/yolo` advertised "⚠ YOLO" in the status bar while every
dangerous command still hit the approval prompt or got denied.  Only
`hermes --yolo` (set before tool imports), `HERMES_YOLO_MODE=1 hermes ...`,
and `hermes config set approvals.mode off` actually bypassed.

This patches the CLI to match what the gateway and TUI `/yolo` handlers
already do, plus mirrors the TUI's session-rename YOLO transfer:

* `_toggle_yolo()` now calls `enable_session_yolo(self.session_id)` /
  `disable_session_yolo(self.session_id)` instead of touching the env
  var.  Matches `gateway/run.py:_handle_yolo_command` and the
  `tui_gateway/server.py` key=="yolo" branch.
* Around each `run_conversation()` call, `run_agent()` now binds
  `set_current_session_key(self.session_id)` so
  `tools.approval.is_current_session_yolo_enabled()` resolves against
  the same key the toggle writes under, and resets it in `finally` so
  reused threads don't see stale identity.  Matches the
  `tui_gateway/server.py` and `gateway/platforms/api_server.py` binding
  pattern.
* New `_transfer_session_yolo()` helper carries YOLO bypass state
  across `self.session_id` reassignments — `/branch` forking into a
  new session id and the auto-compression sync that rotates into a
  fresh continuation session id.  Without this, the same UX failure
  mode the rest of this fix addresses (silent `/yolo` no-op) would
  reappear after a single `/branch` or auto-compression event.
  Mirrors `tui_gateway/server.py` ~line 1297-1305.
* New `_is_session_yolo_active()` helper replaces the two
  `bool(os.getenv("HERMES_YOLO_MODE"))` reads in the status-bar
  builders, so the badge reflects the actual bypass state.  Uses
  `getattr(self, "session_id", None)` so status-bar test fixtures
  that bypass `__init__` via `HermesCLI.__new__(HermesCLI)` don't
  trip `AttributeError` (the builders swallow exceptions silently
  and lose every field after the failure).  Still honors
  `_YOLO_MODE_FROZEN` so `hermes --yolo` keeps lighting it up.

The `_YOLO_MODE_FROZEN` security freeze is preserved — env-var-based
opt-in still only works when set before process start, which is the
documented contract for `--yolo` / `HERMES_YOLO_MODE`.

Closes #33925
2026-05-28 12:10:21 -07:00
teknium1
f30db14ced fix(kanban): SIGTERM on worker must terminate the process (#28181)
The single-query signal handler in cli.py raises KeyboardInterrupt on
SIGTERM/SIGHUP. For interactive 'hermes chat -q' that unwinds the main
thread cleanly. For kanban workers spawned by the dispatcher, the
worker process is likely to have a non-daemon thread alive (terminal
_wait_for_process, custom plugins, etc.). With KeyboardInterrupt only
the main thread unwinds; the non-daemon thread keeps the process alive,
the gateway has already restarted, and the dispatcher's _pid_alive
check returns True forever — task stuck in 'running' indefinitely.

When HERMES_KANBAN_TASK is set (dispatcher-spawned worker), flush
logging + stdout/stderr, then os._exit(0) instead of raising
KeyboardInterrupt. The kernel reclaims the PID immediately, and the
existing zombie-state detection in _pid_alive flips the task to
crashed on the next dispatcher tick. detect_crashed_workers then
re-spawns it on the following tick — no manual recovery needed.

A SIGALRM(2s) deadman is armed before the flush so a pathological
blocking-I/O flush can't wedge the worker forever. In practice the
reporter measured flush in <1ms; the alarm is a failsafe, never
the common path.

Interactive (non-kanban) chat -q is unchanged — the env-gated branch
only fires for dispatcher-spawned workers.

Live verification on this machine:
- Without HERMES_KANBAN_TASK + non-daemon thread alive: process hangs
  alive 4+ seconds after SIGTERM. Dispatcher's _pid_alive returns
  True → task stuck.
- With HERMES_KANBAN_TASK + same non-daemon thread: process exits in
  0.10s via os._exit(0). Dispatcher reclaims on next tick.

Tests:
- tests/hermes_cli/test_signal_handler_kanban_worker.py (3 cases):
  end-to-end subprocess test with a non-daemon thread,
  HERMES_KANBAN_TASK env, SIGTERM, dispatcher-style _pid_alive check.
  Plus a source-level invariant test catching future refactors that
  drop the env-gated exit.
- 452/452 kanban tests pass.

Co-authored-by: andrewhosf <andrewho.sf@gmail.com>
2026-05-28 11:59:58 -07:00
Teknium
3a9bc9d88a
fix(model picker): unify /model and hermes model lists, add disk cache (#33867)
* fix(model picker): unify /model and `hermes model` model lists, add disk cache

The /model slash picker and `hermes model` were drifting apart. /model
read the raw static `OPENROUTER_MODELS` list (31 entries, including 5
that fail at runtime — no tool-call support or absent from live catalog),
while `hermes model` ran the same list through the live OpenRouter
/v1/models tool-support filter and showed 26 valid entries. Same problem
existed for every other authed provider: /model used curated static
lists, `hermes model` used live /v1/models.

Unifies both surfaces on `provider_model_ids()` and adds a generic
disk-cached wrapper so the picker stays snappy.

Changes
- hermes_cli/models.py: new `cached_provider_model_ids()` —
  ~/.hermes/provider_models_cache.json, 1h TTL, per-provider entries
  keyed by credential fingerprint (env vars + OAuth file mtimes).
  Stale-data-beats-no-data on transient failures. Pair with
  `clear_provider_models_cache(provider=None)`.
- hermes_cli/models.py: `provider_model_ids("nous")` now falls back
  to the docs-hosted manifest (not the in-repo snapshot) when the live
  Portal /models call fails — preserves the model_catalog regression
  guarantee while still going through the unified pathway.
- hermes_cli/model_switch.py: `list_authenticated_providers` routes
  sections 1, 2, and 2b through `cached_provider_model_ids(slug)` with
  curated fallback when the live fetcher comes up empty.
- hermes_cli/model_switch.py: `parse_model_flags` extended to a
  4-tuple, parses `--refresh`.
- cli.py / gateway/run.py / tui_gateway/server.py: updated unpacking;
  CLI + gateway wire `--refresh` to `clear_provider_models_cache()`.
- hermes_cli/main.py: `hermes model --refresh` argparse flag.
- hermes_cli/commands.py: `/model` args_hint advertises `--refresh`.
- tests/hermes_cli/test_inventory.py: refresh stale comment.

Live PTY parity verification
- /model → OpenRouter row: `(26 models)` (was 31, with broken entries)
- `hermes model` → OpenRouter: 26 models (unchanged)
- The 5 dropped entries: `pareto-code` (no tool-call support),
  `gemini-3-pro-image-preview` (no tool-call support),
  `elephant-alpha`, `hy3-preview:free`, `ring-2.6-1t:free` (gone
  from OpenRouter's live catalog).

Live PTY timing
- First /model open, empty cache: 4624 ms (full network round trip
  across every authed provider)
- Second /model open, warm cache: 51 ms (90× faster)
- `/model --refresh` clears the disk cache and re-fetches.

Cache schema (~/.hermes/provider_models_cache.json, ~3 KB):
  { "anthropic": {"fp": "<sha256:16>", "at": 1748..., "models": [...]},
    ... }

Targeted tests: tests/hermes_cli/ + gateway model tests + tui_gateway —
5855/5855 pass.

* fix(model picker): use blake2b for cache fingerprint to silence CodeQL

py/weak-sensitive-data-hashing flagged the sha256 call in
_credential_fingerprint() as a high-severity alert because the input
includes env var values whose names contain *_API_KEY / *_TOKEN.

The hash is used solely as a cache-bust identity — never reversed, never
stored, collisions are harmless (worst case: cache miss → live re-fetch).
blake2b serves the same purpose and isn't flagged by this rule.

Functional behavior identical: 16-hex-char digest, cache hit/miss logic
unchanged. Live re-verified — 26 OpenRouter models, warm-cache 78ms.
2026-05-28 11:33:16 -07:00
Brian D. Evans
3ad46933d3
docs(voice): use uv pip install faster-whisper in STT install hints (#29800)
* docs(voice): use `uv pip install faster-whisper` in STT install hints

Three runtime messages told users to `pip install faster-whisper`
(reported in #29782 for the gateway STT failure message under
Telegram-in-Docker, where the user hit `bash: pip: command not
found`). The Hermes Docker image is built on `ghcr.io/astral-sh/uv`
with a uv-managed venv that doesn't ship `pip` on PATH; users on
modern `uv tool install` / `uv venv` installs see the same problem.

The canonical install command in this repo is `uv pip install`
(see `tools/lazy_deps.py:509` `feature_install_command()`), which
works in Docker (uv image), in `uv tool install` venvs, and in
pip-based venvs that already have uv on PATH.

Changed three locations to match:

- `gateway/run.py` — Telegram/Discord/Slack/WhatsApp/etc. voice
  reply when no STT provider is configured. Suggests
  `uv pip install faster-whisper` and notes that
  `pip install faster-whisper` also works if `pip` is on PATH.
- `tools/voice_mode.py` — `/voice` status line for missing STT.
- `cli.py` — Voice-mode startup error, "Option 1".

No behavior change beyond the user-facing text. No production
code path was touched.

* docs(voice): add pip fallback to cli + voice_mode STT hints

Copilot flagged that cli.py and tools/voice_mode.py recommend
`uv pip install faster-whisper` without a fallback for environments
where uv isn't on PATH. The gateway/run.py message already lists
`pip install faster-whisper` as an alternative; this commit aligns
the two remaining call sites to match.

Addresses inline Copilot review on #29800.

---------

Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>
2026-05-28 16:23:14 +10:00
LeonSGP43
458a94e425 fix(cli): keep destructive slash modal on Linux 2026-05-27 05:57:01 -07:00
Teknium
febc4cfec0
remove Vercel AI Gateway and Vercel Sandbox (#33067)
* remove Vercel AI Gateway provider and Vercel Sandbox terminal backend

Both Vercel-hosted integrations are removed end-to-end. Users on the AI
Gateway should switch to OpenRouter or one of the other aggregators
(Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should
switch to Docker, Modal, Daytona, or SSH.

What's removed:
- `plugins/model-providers/ai-gateway/` provider plugin
- `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper
- `tools/environments/vercel_sandbox.py` terminal backend
- `ai-gateway` provider wiring across auth, doctor, setup, models,
  config, status, providers, main, web_server, model_normalize, dump
- `vercel_sandbox` backend wiring across terminal_tool, file_tools,
  code_execution_tool, file_operations, approval, skills_tool,
  environments/local, credential_files, lazy_deps, prompt_builder,
  cli, gateway/run
- `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client
  header set, run_agent base-URL header/reasoning special-cases
- `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock
- env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`,
  `VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`,
  `TERMINAL_VERCEL_RUNTIME`
- Tests: deletes test_ai_gateway_models.py and
  test_vercel_sandbox_environment.py; scrubs references across 23
  surviving test files (no entire tests deleted unless they were
  dedicated to AI Gateway / Sandbox)
- Docs: provider tables, env-var reference, setup guides, security
  notes, tool config, terminal-backend tables — English plus zh-Hans
  i18n parity
- `hermes-agent` skill: provider table entry and remote-backend list

What stays (intentional):
- `popular-web-designs/templates/vercel.md` — CSS design reference,
  unrelated to Vercel-the-AI-product
- `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN
  response header, useful diag signal on any Vercel-hosted endpoint
- `vercel-labs/agent-browser` URL in browser config — lightpanda
  browser project, different OSS effort
- `userStories.json` historical contributor entry mentioning Vercel
  Sandbox — archive, not active docs

Validation:
- 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`)
- Full repo `py_compile` clean
- Live import of every touched module + invariant check (no
  `ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no
  `vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`)

* test: convert profile-count check from change-detector to invariant

The hardcoded "== 34" assertion broke when ai-gateway was removed.
Per AGENTS.md change-detector-test guidance, assert the relationship
(registry count >= number of plugin dirs) instead of a literal count.
Counts shift when providers are added/removed; that's expected.
2026-05-27 00:43:32 -07:00
Teknium
2517917de3
fix(cli): restore fallback paste collapse + handle long single-line pastes (#32447)
Follow-up to #32087 after community report from @ethernet that 8000-char
single-line pastes get dumped raw into the input box.

A) Fallback regression revert
   paste_collapse_threshold_fallback default: 0 -> 5
   #32087 disabled the fallback handler by default. The fallback path
   has been always-on with line_count >= 5 since #3065 (March 2026);
   the previous shape was the salvaged contributor's design and didn't
   match pre-existing behavior for terminals without bracketed paste
   support (Windows terminals, some SSH setups). Restoring the original
   on-by-default.

B) Long single-line paste guard
   New config key: paste_collapse_char_threshold (default 2000)
   Bracketed-paste handler and fallback handler now BOTH collapse when
   line count >= line threshold OR total char length >= char threshold.
   Catches the case ethernet hit: ~8000 chars of minified JSON / log
   output on a single line dumped raw into the buffer.
   TUI mirrors the same config via uiStore.pasteCollapseChars.
   Set 0 to disable.

Defaults verified:
  paste_collapse_threshold: 5
  paste_collapse_threshold_fallback: 5
  paste_collapse_char_threshold: 2000

Tests:
  tests/hermes_cli/test_config.py: 87/87 pass
  ui-tui useConfigSync.test.ts: 34/34 pass
  ui-tui useComposerState.test.ts: 9/9 pass
  tsc: 0 new errors in touched files
2026-05-25 23:49:01 -07:00
kylekahraman
ab42658dfc feat: configurable paste collapse thresholds (TUI + CLI)
Adds two new config keys:
- paste_collapse_threshold (default: 5) — line count threshold for
  bracketed paste collapse in both TUI and CLI
- paste_collapse_threshold_fallback (default: 0, disabled) — same for
  the fallback heuristic in terminals without bracketed paste support

TUI frontend reads these from config.get full via applyDisplay/patchUiState.
CLI reads from self.config at paste-handling time.

Closes #5626
Related: #5623
2026-05-25 06:23:18 -07:00
Teknium
1c3c364287
feat(cli): show live background terminal-process count in status bar (#32061)
The CLI status bar tracked /background agent tasks (▶ N) but not shell
processes spawned via terminal(background=true). Both kinds of work can
run concurrently and a user has no in-bar signal for shell processes.

Add an independent indicator (⚙ N) sourced from
tools.process_registry.process_registry._running. The two indicators
render side-by-side when both are active (▶ 1 │ ⚙ 2), hidden when their
count is zero. Renders at all four status-bar tiers (text fallback +
prompt_toolkit fragments, narrow + wide widths). The narrow <52 tier
still drops both for space — unchanged.

New ProcessRegistry.count_running() returns len(_running) without
acquiring _lock; CPython dict len is atomic and we're polling on every
status-bar tick, so lock-free is the right tradeoff.
2026-05-25 05:35:02 -07:00
Evi Nova
1b12cd5241 fix(cli): bracketed-paste timeout prevents permanent input freeze (#16263)
When the terminal drops the ESC[201~ end mark during a bracketed paste
(terminal race, torn write, SSH glitch, macOS sleep/wake), prompt_toolkit's
Vt100Parser keeps buffering all later input in _paste_buffer forever. From
the user's perspective, the CLI appears frozen — the only recovery was
closing the tab/session.

This patch monkey-patches Vt100Parser.feed() so that bracketed-paste mode
flushes buffered content as a normal BracketedPaste event after 2 seconds
without an end marker, then restores normal parsing.

Includes 8 regression tests covering normal paste, timeout recovery,
torn end marks, and edge cases.

Surgical reapply of PR #27518. Original branch was many months stale
(1193 files / 172k LOC of unrelated reverts); the substantive ~77 LOC
patch in cli.py plus the new 157-line test file were reapplied onto
current main with the contributor's authorship preserved via --author.
2026-05-25 05:07:11 -07:00
ygd58
63d6b9e637 fix(cli): catch KeyboardInterrupt during slash commands to prevent session exit
A Ctrl+C during a slow slash command (e.g. /skills browse on a large
skill tree, /sessions list against a multi-GB SQLite DB) used to unwind
past self.process_command() to the outer prompt_toolkit event loop,
which killed the entire session — losing all conversation state.

Fix: wrap the slash-command dispatch in try/except KeyboardInterrupt
so Ctrl+C aborts the command but the prompt loop continues. Other
exceptions still propagate so real bugs aren't silently swallowed.

Surgical reapply of PR #5189. Original branch was many months stale
(3764 files / 1M+ LOC of unrelated reverts); the substantive ~6 LOC
change in cli.py was reapplied by hand onto current main with the
contributor's authorship preserved via --author.
2026-05-25 05:06:06 -07:00
simokiihamaki
fae815adc2 fix(cli): prevent /reset and /new freeze on Windows by falling back to stdin prompt
On Windows (PowerShell/Windows Terminal), the queue-based modal used for
destructive slash command confirmations deadlocks because prompt_toolkit's
input channel becomes unresponsive when entered from the process_loop daemon
thread. Keystrokes never reach the key bindings, so response_queue.get()
blocks until the 120-second timeout expires.

Fix: fall back to _prompt_text_input (stdin-based) when:
1. sys.platform == 'win32' — Windows console doesn't support the modal reliably
2. Called from non-main thread — key bindings can't fire from daemon threads
3. self._app is not set — existing behavior for tests/non-interactive

This mirrors the thread-aware guard from _prompt_text_input (PR #23454).

9 new regression tests covering Windows detection, non-main thread fallback,
macOS/Linux modal preservation, and integration with _confirm_destructive_slash.

Fixes #30768

Surgical reapply of PR #30773. Original branch was many months stale (911
files / 146k LOC of unrelated reverts); the substantive ~30 LOC change in
cli.py plus the new test file were reapplied onto current main with the
contributor's authorship preserved via --author.
2026-05-25 05:06:03 -07:00
Michel Belleau
25295e7ac9 fix(cli): redirect resume status lines to stderr in quiet mode (#11793)
When 'hermes chat --quiet --resume <id> -q "..."' is used, three status
messages were written to stdout via ChatConsole / _cprint:

  - '↻ Resumed session <id> (N user messages, M total messages)'
  - 'Session <id> found but has no messages. Starting fresh.'
  - 'Session not found: <id>' / usage hint

This polluted the machine-readable stdout that automation wrappers capture
with $(...), making it impossible to cleanly separate the agent's answer
from the resume banner.

Fix: detect quiet mode via tool_progress_mode == 'off' and route the three
resume status messages to stderr (as plain text, matching the existing
stderr convention for session_id). Interactive mode is unchanged — it
still uses the Rich-rendered path through ChatConsole.

Surgical reapply of PR #11868. Original branch was stale against current
main; reapplied onto current cli.py by hand with original authorship
preserved via --author.
2026-05-25 01:47:12 -07:00
CK iRonin.IT
2a2cef4ac7 fix: include -p profile flag in exit resume hint
Session IDs are profile-constrained, so the resume hint needs to
include the active profile for multi-profile users. Without this,
copying the hint from a non-default profile fails to resume the
correct session.

Before:  hermes --resume 20260414_063228_c1240e
After:   hermes --resume 20260414_063228_c1240e -p dev

Also includes -p on the resume-by-title hint. Skipped for
'default' and 'custom' profiles (no -p needed).

Surgical reapply of PR #9652. Original branch was stale against
current main (~6 months); reapplied onto current cli.py by hand
with original authorship preserved.
2026-05-25 01:41:54 -07:00
Claw Assistant
1c7a783c42 fix(cli,gateway): strip outer brackets/quotes from /resume args + accept session IDs in gateway
The /resume usage hint shows '<session_id_or_title>' which a few users have
typed verbatim, including the angle brackets. Strip outer <>, [], "", and ''
from the argument before lookup so '/resume <abc123>' works the same as
'/resume abc123'. Mirrors the new bracket-stripping in the CLI handler.

Also let the gateway resolve a bare session ID. Previously the gateway only
called resolve_session_by_title, so '/resume <session_id>' always returned
'Session not found' even for valid IDs. Try get_session() first, fall back
to title resolution second.

Surgical reapply of PR #10215 (branch was based on a many-months-old main
and reverted ~3100 unrelated files; original commit by claw@openclaw.ai
preserved via --author).
2026-05-25 01:33:32 -07:00
helix4u
3b839f4369 fix(context): align guidance with 64k minimum 2026-05-24 23:23:12 -07:00
daizhonggeng
fef733d56b feat: support numbered resume selection in cli and gateway 2026-05-24 16:22:48 -07:00
LeonSGP43
6c44d537cc fix(cli): show full session titles in /resume list 2026-05-24 16:13:23 -07:00
Teknium
8e68426981 fix(cli): add inline --yes/now skip for destructive slash commands (#30768)
Issue #30768 reports that on native Windows PowerShell the destructive-slash
confirmation modal renders but never registers keypresses, leaving the user
unable to confirm or cancel /reset, /new, /clear, or /undo. The modal works
on macOS, Linux, and WSL; PR #23907 (merged May 11) replaced the
daemon-thread input() pattern with a prompt_toolkit-native keybinding modal
but the win32 input pipeline apparently doesn't dispatch keys to the
filter-conditioned handlers. The modal investigation is ongoing.

This change ships the immediate escape hatch: append `now`, `--yes`, or `-y`
to any destructive slash command to bypass the modal and run the action
immediately. Works on every platform without touching the broken Windows
code path.

  /reset now            -> reset, no modal
  /new --yes my-session -> new session titled "my-session", no modal
  /clear -y             -> clear, no modal
  /undo -y              -> undo, no modal

The default behavior (modal prompts when approvals.destructive_slash_confirm
is True) is unchanged for users who don't pass a skip token.

Implementation:

- New classmethod HermesCLI._split_destructive_skip(text) -> (remainder, skip)
  parses a destructive-slash command string, strips the leading "/cmd" word
  and any recognized skip tokens (case-insensitive exact match, not substring),
  and reports whether a skip was requested.
- HermesCLI._confirm_destructive_slash gains an optional cmd_original= arg.
  When the arg contains a skip token, it returns "once" immediately —
  before the gate check and before any modal rendering.
- The /clear, /new, /undo handlers in process_command pass cmd_original
  through. /new additionally uses _split_destructive_skip to strip skip
  tokens from the remaining text before deriving the session title, so
  "/new now My Session" yields title="My Session" (not "now My Session").

Tests:

- 7 new unit tests in tests/cli/test_destructive_slash_confirm.py covering
  the helper (recognized tokens, command-word stripping, case-insensitive
  exact match, None/empty input) and the modal bypass (now and --yes both
  skip; no-skip-token still consults the modal).
- 3 new integration tests in tests/cli/test_destructive_slash_inline_skip_e2e.py
  driving HermesCLI.process_command end-to-end and asserting (a) new_session
  is invoked, (b) the modal is never reached, (c) the skip token does not
  leak into the session title, and (d) the no-skip-token path still reaches
  the modal as a sanity check that we haven't accidentally short-circuited
  the normal flow.

All 31 tests across the destructive-slash test surface pass.

Docs:

- website/docs/reference/slash-commands.md documents the new flags both in
  the destructive-commands table and the dedicated approval section, with a
  link back to issue #30768 explaining why the escape hatch exists.
2026-05-24 16:13:03 -07:00
Teknium
13b85bc646 feat(config): document resume-recap tuning keys in DEFAULT_CONFIG
The hardcoded constants in _display_resumed_history were exposed as
config in PR #4434; declare them in DEFAULT_CONFIG and the CLI fallback
dict so they show up in 'hermes config' diagnostics and the schema
validator.
2026-05-24 15:36:37 -07:00
ygd58
cdf4876bfe fix(cli): skip tool-call-only entries in resume recap, expose limits as config options 2026-05-24 15:36:37 -07:00
Samuel Zhang
961e34a1d3 fix: show recap after in-session resume 2026-05-24 15:36:37 -07:00
Teknium
c9b3eeabdc
fix(cli): decouple tool_progress=verbose from global DEBUG logging (#31379)
PR #6a1aa420e coupled `display.tool_progress: verbose` (a per-tool display
toggle for full args / results / think blocks) to `self.verbose` — which
controls root-logger DEBUG level. Result: setting tool_progress: verbose
in config silently flipped every module in the process to DEBUG and
flooded the terminal with internal logging, far beyond just full tool
calls.

The two concepts are separate:
- `tool_progress_mode == 'verbose'` → display behavior (tool rendering)
- `self.verbose` → logging behavior (root logger → DEBUG, line 9795)

This change keeps PR #6a1aa420e's argparse.SUPPRESS / config-fallback
plumbing but severs the verbose-display → debug-logging link.

Changes:
- cli.py:2868 — `self.verbose` only follows explicit `verbose=` arg; no
  longer auto-True when tool_progress_mode == 'verbose'.
- cli.py:_toggle_verbose — slash-cycle through tool progress modes no
  longer flips `self.verbose` / `agent.verbose_logging` / `agent.quiet_mode`.
- cli.py:9355 — fix misleading label (drop 'and debug logs').
- tui_gateway/server.py:_make_agent — same decoupling on the TUI side
  (verbose_logging no longer derived from tool_progress_mode).
- tests/cli/test_tool_progress_scrollback.py — invert the test that
  asserted the broken coupling; add coverage for explicit `--verbose`
  still enabling DEBUG independent of tool_progress.

Live verified:
- tool_progress: verbose, no --verbose flag → 0 DEBUG/INFO log lines
- --verbose flag explicit → 32 DEBUG/INFO log lines (as expected)
2026-05-24 02:19:20 -07:00
novax635
421ab81052 fix(cli): reuse canonical root model key normalization in load_cli_config 2026-05-23 23:08:05 -07:00
Albert.Zhou
094d732378 fix(cli): surface tool failures with specific error messages
Improves the failure suffix on tool completion lines. Instead of always
showing '[error]' for non-terminal failures, parse the tool's JSON result
and surface the actual message:

  Before:  ┊ 📖 read      foo.py  0.1s [error]
  After:   ┊ 📖 read      foo.py  0.1s [File not found: foo.py]

  Before:  ┊ 💻 $         ls bad  0.1s [exit 127]
  After:   ┊ 💻 $         ls bad  0.1s [ls: cannot access 'bad'...]

Adds a _trim_error helper that strips long absolute paths down to the
filename and caps the suffix at 48 chars so it stays readable on narrow
terminals.

Threads the tool result through the tool.completed progress callback so
agent/display.get_cute_tool_message can inspect it. The cli.py [error]
post-suffix is removed in favor of the richer suffix _detect_tool_failure
now produces directly.

Originally proposed in PR #17194 by Albert.Zhou; salvaged onto current
main with the dead-code preview-length bumps dropped (tool_preview_length
config already strictly caps previews, so the per-tool n= defaults are
unreachable).

Co-authored-by: Albert.Zhou <albert748@gmail.com>
2026-05-23 21:03:51 -07:00
honor2030
6a1aa420e7 Fix CLI verbose tool progress config fallback 2026-05-23 21:03:51 -07:00
novax635
86871ee25a fix(cli): synchronize HERMES_SESSION_ID across environment and contextvar during session switches 2026-05-23 17:46:55 -07:00
QuenVix
7245bc77eb fix(fallback): merge fallback_providers with legacy fallback_model configurations 2026-05-23 05:24:57 -07:00
adybag14-cyber
a3beee475b perf(termux): speed up bare cli prompt startup 2026-05-22 14:27:38 -07:00
helix4u
3462b097e2 fix(voice): chunk oversized CLI recordings 2026-05-21 14:17:39 -07:00
Teknium
975e13091e fix(cli): honour image-routing decision in quiet-mode -q --image path
The interactive CLI input path consults decide_image_input_mode() to pick
between native image_url attachment and the vision_analyze text pipeline,
but the non-interactive 'hermes chat -Q -q ... --image FOO' path
unconditionally called _preprocess_images_with_vision() — so even with
`model.supports_vision: true` set, --image always went through the
text-pipeline. Symptom: vision_analyze runs 4-5s per image and the model
sees a lossy text summary instead of the actual pixels.

Mirror the interactive path: load config, call decide_image_input_mode,
branch on native vs text. Falls back to the text-pipeline on any import
or build error (Pyright-clean: _build_parts guarded with `is not None`).

Live E2E (provider=custom, base_url=openrouter, anthropic/claude-haiku-4.5,
red 64x64 PNG):
  baseline (no override): vision_analyze called (8 log lines), 5.8s
  with supports_vision:   vision_analyze NOT called (0 log lines),  3.9s
Same model, same image, single knob flips text→native routing.
2026-05-20 23:27:10 -07:00
yoniebans
cebd480818 refactor(session-log): drop branch/compress re-point of session_log_file
The attribute no longer exists; nothing to re-point.
2026-05-20 11:44:10 -07:00
H-Ali13381
697d38a3f4 feat: auto-launch Chromium-family browser for CDP
Add browser CDP launch candidates for Chrome, Chromium, Brave, and Edge while preserving Chrome-first selection. Retry candidate launch failures instead of giving up after the first executable.

Update /browser CLI and TUI messaging, docs, and tool descriptions from Chrome-only wording to Chromium-family browser support. Add regression coverage for Brave/Edge paths, Chrome-first precedence, fallback launches, and CDP endpoint probing.
2026-05-19 22:34:05 -07:00
墨綠BG
c9d5ef28bf 🐛 fix(cli): handle missing remote tracking refs 2026-05-19 14:50:42 -07:00
墨綠BG
28ab420302 🐛 fix(cli): handle no-remote worktree cleanup 2026-05-19 14:50:42 -07:00
Teknium
784febe1cf
perf(cli): defer openai._base_client import via sys.meta_path finder (#28864)
`cli.py` was eager-importing `openai._base_client` at module-load time
purely to monkeypatch `AsyncHttpxClientWrapper.__del__` (defense against
"Press ENTER to continue..." errors when AsyncOpenAI clients are GC'd
against dead event loops). That import cost ~166ms / ~30MB on every
cold CLI start because openai's type tree (responses/*, graders/*) is huge.

Replace with a `sys.meta_path` finder that intercepts the first import
of `openai._base_client` from anywhere in the codebase, lets the normal
load run, then applies the `__del__ = lambda self: None` patch before
control returns to the caller. Same correctness guarantee (patch
applies before any AsyncOpenAI instance can be constructed), zero cost
until the SDK is actually needed.

Hot path: every hermes chat / gateway boot / cron tick / subagent spawn.

A/B benchmark, 10 runs each, fresh subprocess:
                     BEFORE  AFTER   delta
  import cli wall    0.86s   0.62s   -28% (median)
  import cli wall    0.85s   0.59s   -31% (min)
  import cli RSS     91.2MB  74.0MB  -19% (median)

The `neuter_async_httpx_del` function in agent/auxiliary_client.py is
unchanged; its tests still pass and any future callers can still invoke
it directly.

Verified:
- import cli no longer pulls openai into sys.modules
- first 'from openai._base_client import AsyncHttpxClientWrapper'
  triggers the patch; __del__.__name__ == '<lambda>'
- tests/run_agent/test_async_httpx_del_neuter.py: 9/9 pass
- tests/agent/test_auxiliary_client.py: 159/159 pass
- tests/cli/: 715/715 pass
2026-05-19 14:24:53 -07:00
Teknium
f1254b1bc2
fix(cli): exit prompt_toolkit cleanly on SIGTERM/SIGHUP instead of raising KeyboardInterrupt (#28688)
The SIGTERM/SIGHUP handler raised KeyboardInterrupt() at the end of its
agent-interrupt + grace-window sequence. Python delivers signals between
bytecodes on the main thread, so when the signal hit mid-event-loop
(typically inside prompt_toolkit's '_poll_output_size' coroutine's
'await asyncio.sleep()'), the KeyboardInterrupt unwound INTO that
coroutine. prompt_toolkit's Task captured it as a BaseException;
prompt_toolkit's '_handle_exception' then printed 'Unhandled exception
in event loop' + the full asyncio traceback and parked the terminal on
'Press ENTER to continue...' before exiting.

Same root cause as #13710, different surface: there the failure was an
EIO cascade after a logging-cache KeyError escaped the handler; here
it's the KBI raise itself landing inside an asyncio Task. The fix is
the same shape — let the event loop unwind on its own terms.

Now: schedule 'app.exit()' via 'loop.call_soon_threadsafe()'. The
prompt_toolkit Application returns normally from 'app.run()' and the
existing '(EOFError, KeyboardInterrupt, BrokenPipeError)' handler in
the input loop catches everything else. Fallback to 'raise
KeyboardInterrupt()' preserved for contexts where prompt_toolkit isn't
the active app (e.g. -q one-shot mode).

The agent interrupt + 1.5 s grace window run unchanged before the new
exit path, so subprocess-group cleanup ('os.killpg' on Linux) still
gets its window.

Tested live: external SIGTERM to the CLI (with 'kill <pid>') now exits
cleanly with no traceback dump and no ENTER pause.
2026-05-19 03:33:27 -07:00
Teknium
b5c1fe78aa
feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373)
Skill bundles are tiny YAML files in ~/.hermes/skill-bundles/ that
group several skills under one slash command. Invoking /<bundle-name>
from any surface (CLI, TUI, dashboard, any gateway platform) loads
every referenced skill into a single combined user message.

Use cases:
- /backend-dev → loads github-code-review + test-driven-development
  + github-pr-workflow as one bundle.
- /research → loads several research skills together.
- Team task profiles shared via dotfiles.

Behavior:
- Bundles take precedence over individual skills when slugs collide.
- Missing skills are skipped with a note, not fatal.
- No system-prompt mutation — bundles generate a fresh user message
  at invocation time, the same way /<skill> does. Prompt cache stays
  intact.
- Works in CLI dispatch, gateway dispatch, autocomplete (CLI + TUI),
  /help display.

Schema (~/.hermes/skill-bundles/<slug>.yaml):
    name: backend-dev
    description: Backend feature work.
    skills:
      - github-code-review
      - test-driven-development
    instruction: |
      Optional extra guidance prepended to the loaded skills.

New module: agent/skill_bundles.py — load, scan, resolve, build
invocation message, save, delete. yaml.safe_load only; broken
bundles log a warning and are skipped, never raise.

New CLI subcommand: hermes bundles {list,show,create,delete,reload}.
Implementation in hermes_cli/bundles.py; wired in hermes_cli/main.py.
'bundles' added to _BUILTIN_SUBCOMMANDS so plugin discovery skips it.

New in-session slash command: /bundles lists installed bundles in
both CLI and gateway. /<bundle-name> dispatch added to CLI (cli.py)
and gateway (gateway/run.py) before the existing /<skill-name> path.

Autocomplete: SlashCommandCompleter gained an optional
skill_bundles_provider parameter that defaults to None — the prompt
shows '▣ <description> (N skills)' for bundles vs '' for skills.

Tests:
- tests/agent/test_skill_bundles.py — 33 tests covering slugify,
  scan/cache freshness, resolve (including underscore→hyphen
  Telegram alias), build_bundle_invocation_message (loading, missing
  skills, user/bundle instruction injection, dedup), save/delete,
  reload diff, list sort.
- tests/hermes_cli/test_bundles.py — 8 tests for the CLI
  subcommand (create/list/show/delete/reload, --force, missing
  bundle errors).
- tests/gateway/test_bundles_command.py — 4 tests for the gateway
  handler and bundle resolution priority.

Live E2E: verified subprocess invocations of hermes bundles
{list,create,show,reload,delete} round-trip correctly against an
isolated HERMES_HOME.

Docs:
- website/docs/user-guide/features/skills.md — new 'Skill Bundles'
  section with quick example, YAML schema, management commands,
  behavior notes.
- website/docs/reference/cli-commands.md — 'hermes bundles' added to
  the top-level command table and given its own subcommand section.
2026-05-18 21:38:05 -07:00
Bartok9
365da2d2df fix: 4 small surgical bugs
Salvages #23302 by @Bartok9. Four independent one-area fixes:

1. kanban boards delete alias now hard-deletes (not archives) — the
   alias didn't carry --delete, so getattr(args, 'delete', False)
   returned False. Detect boards_action=='delete' explicitly.
2. Gateway auto-title failures no longer leak as user-visible
   warnings — debug-log only since they're not actionable.
3. Background process completion notification snaps truncation to
   the next newline boundary, prepends a marker when content is
   dropped.
4. _cprint() schedules the run_in_terminal coroutine via
   asyncio.ensure_future so output isn't silently dropped from
   background threads (fixes #23185 Bug A). Skips the
   double-print fallback that would fire for mock paths.
2026-05-18 20:54:52 -07:00
felix-windsor
5d1f350784 fix(cli): preserve cron asterisks in strip mode 2026-05-18 20:08:36 -07:00
Grogger
8bcb6082ac fix(windows): handle redirected stdout in _cprint fallback
Wraps _pt_print in try/except with a print() fallback. When a
kanban worker's stdout is piped to a log file, prompt_toolkit
raises NoConsoleScreenBufferError (Windows) or OSError (other)
because there is no real console buffer. The fallback keeps
worker output flowing instead of crashing.
2026-05-18 20:00:07 -07:00
Austin Pickett
2ef501e1f5
feat(cli): add /update slash command to CLI and TUI (#23854)
* feat: add /update slash command to CLI and TUI

* test(cli): add Python tests for /update slash command

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cli): address Copilot review for /update slash command

Route classic CLI /update through prompt_toolkit modal confirmation and
defer relaunch to the main-thread cleanup path after app.exit(). Tighten
Y/n semantics, add Python wrapper and catalog coverage tests, and assert
/update stays visible in the TUI command catalog.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cli): address review feedback on /update command

- Replace raw input() with _prompt_text_input_modal in _handle_update_command
  to avoid EOF/hang/keystroke-leak races with prompt_toolkit's stdin ownership
- Fix confirmation logic: only proceed on recognized affirmative aliases
  (y/yes/1/ok); cancel on everything else including empty string, typos,
  and unrecognized input — matches all other [Y/n] prompts in the codebase
- Route relaunch through main-thread shutdown path: set _pending_relaunch
  and return False from process_command so process_loop triggers app.exit();
  run() then calls relaunch() after prompt_toolkit has restored terminal modes
  and after cleanup — safe on both POSIX (execvp) and Windows (subprocess+exit)
- Fix misleading docstring in test_update_command.py: the Vitest only covers
  the TypeScript slash handler that emits code 42, not the Python wrapper
  branch that acts on it
- Rewrite tests to use SimpleNamespace pattern (like test_destructive_slash_confirm)
  so _prompt_text_input_modal can be stubbed directly
- Add Python test for _launch_tui exit-code-42 → relaunch branch in main.py

Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb

Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com>

* fix(cli): polish test fixtures for /update command

- Remove unused _prompt_text_input from SimpleNamespace stub
- Use pytest.fail sentinel in managed-install guard test to catch unexpected modal invocations

Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb

Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com>

* chore: re-trigger CI after Copilot review fixes

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com>
2026-05-18 20:10:46 -04:00
Teknium
1634397ddb
fix(compress): abort instead of dropping messages when summary LLM fails (#28102)
When auxiliary compression's summary generation returns None (aux model
errored, returned non-JSON, timed out, etc.) the compressor previously
still dropped every middle message between compress_start..compress_end
and replaced them with a static 'Summary generation was unavailable'
placeholder. The session kept going but the user silently lost N turns
of context for nothing.

New behavior: on summary failure, compress() aborts entirely — returns
the input messages unchanged and sets _last_compress_aborted=True. The
existing _summary_failure_cooldown_until gate (30-60s) keeps the aux
model from being burned on every turn. Auto-compress callers detect
the no-op (len(after) == len(before)) and stop looping. The chat is
'frozen' at its current size until the next /compress or /new.

Manual /compress (CLI + gateway) now passes force=True which clears
the cooldown so users can retry immediately after an auto-abort. If
the manual retry also fails, the user gets a visible warning telling
them nothing was dropped and how to retry.

- agent/context_compressor.py: compress() gains force= kwarg; failure
  branch sets _last_compress_aborted and returns messages unchanged
  instead of inserting placeholder.
- run_agent.py: _compress_context() detects abort, surfaces warning,
  skips session-rotation entirely, returns messages unchanged.
- cli.py + gateway/run.py: manual /compress paths pass force=True.
- gateway/run.py: hygiene + /compress handlers detect _last_compress_aborted
  and emit the new 'Compression aborted' warning (gateway.compress.aborted)
  instead of the old 'N historical messages were removed' message.
- locales/*.yaml: new gateway.compress.aborted key in all 16 locales.
- tests: updated to assert the abort contract (messages preserved,
  compression_count not incremented, abort flag set, no placeholder
  leaked). New test_force_true_bypasses_failure_cooldown covers the
  manual-retry path.
2026-05-18 10:19:40 -07:00
glennc
9df9816dab feat(azure-foundry): add Microsoft Entra ID auth
Use azure-identity DefaultAzureCredential for keyless Foundry auth.

Preserve refreshable callable credentials through OpenAI and Anthropic client paths.

Add setup, doctor, auth status, docs, and tests for Entra auth.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-18 10:14:38 -07:00
ms-alan
3c51da1cb7 fix(cli): sync _skill_commands after /reload-skills so Tab completion picks up new skills
The Tab-completion lambda captured _skill_commands at startup, so newly
installed skills were missing from Tab completion even after /reload-skills
reported them as added.

Two changes:
1. Tab-completion lambda now calls get_skill_commands() instead of reading
   the module-level _skill_commands snapshot — ensures the lambda always
   gets fresh data without needing to touch global state.
2. _reload_skills() now syncs cli.py's module-level _skill_commands via
   get_skill_commands() after reload, so help display, command dispatch,
   and any other direct _skill_commands readers also see the updated map.

Closes #26441
2026-05-17 02:31:18 -07:00
kshitij
5fba236644
chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)
Six days after #23937 (608 fixes) the codebase had accumulated 241 new
PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix,
same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the
two are semantically equivalent for hashable scalar membership tests.

All 241 instances fixed via `ruff check --select PLR6201 --fix
--unsafe-fixes`, zero remaining. Every changed value is a hashable
scalar (str/int/None/enum/signal); no risk of unhashable runtime
errors. No behavior change.

Test plan:
- 119 files changed, +244/-244 (net zero) — exactly one-line edits
- `ruff check` clean afterward
- Compile checks pass on the largest touched files (cli.py, run_agent.py,
  gateway/run.py, gateway/platforms/discord.py, model_tools.py)
- Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/
  tests/tools/: 18187 passed, 59 pre-existing failures (verified against
  origin/main with the same shape — identical failure count, identical
  category — all xdist test-order flakes unrelated to this change)

Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).
2026-05-17 02:29:41 -07:00