Commit graph

2797 commits

Author SHA1 Message Date
Scott Trinh
fd943461ca fix(doctor): accept catalog provider aliases
Validate configured providers against both Hermes runtime provider ids and
catalog-normalized provider ids. This keeps providers like ai-gateway from
being rejected after catalog resolution maps them to models.dev ids.

Keep credential checks and vendor-slug warnings anchored to the runtime id
so doctor reports actionable provider names in follow-up diagnostics.
2026-04-28 18:27:42 -07:00
Teknium
9f004b6d94
perf(tools): memoize get_tool_definitions + TTL-cache check_fn results (#17098)
Two amplifying optimizations to per-turn overhead in the gateway:

1. get_tool_definitions() memoization (model_tools.py)
   Keyed on (frozenset(enabled), frozenset(disabled),
   registry._generation, config.yaml mtime+size). Only active when
   quiet_mode=True (which is every hot-path caller — gateway,
   AIAgent.__init__); quiet_mode=False keeps the existing print side
   effects. Cached path returns a shallow-copy list sharing read-only
   schema dicts.

   Measured: 7.5 ms → 0.01 ms per call (~750× speedup). Gateway
   constructs fresh AIAgent per message, so this saves ~7 ms/turn before
   any LLM work.

2. check_fn() TTL cache (tools/registry.py)
   check_fn callables like check_terminal_requirements probe external
   state (Docker daemon, Modal SDK, playwright binary). For a long-lived
   process, hitting them on every get_definitions() pass was pure waste
   — external state changes on human timescales. 30 s TTL so env-var
   flips (hermes tools enable X) propagate within a turn or two without
   explicit invalidation.

   Measured: first call 7.5ms → 1.6ms (check_fn probes now dominate);
   subsequent calls ~0.01ms via the upstream memoization.

Invalidation surface:
- registry._generation bumps on register/deregister/register_toolset_alias,
  invalidating the memoized definitions automatically.
- config.yaml mtime in the cache key captures user-visible config edits
  affecting dynamic schemas (execute_code mode, discord allowlist).
- invalidate_check_fn_cache() exposed for explicit flushes (e.g. after
  hermes tools enable/disable).
- tests/conftest.py autouse fixture clears both caches before every test
  so env-var monkeypatches don't see stale results.

Also fixes a regression from PR #17046 that I missed:
- tools/web_tools.py — Firecrawl was removed from module scope by the
  lazy import, breaking 8 tests that patch 'tools.web_tools.Firecrawl'.
  Applied the same _FirecrawlProxy pattern used in auxiliary_client/
  run_agent for OpenAI (module-level proxy that looks like the class
  but imports the SDK on first call/isinstance; patch() replaces the
  attribute as usual).

Verified:
- 49/49 tests/tools/test_web_tools_config.py pass (was 8 failing on main)
- 68/68 tests/tools/test_homeassistant_tool.py pass (was 1 failing in
  the full suite due to check_fn TTL cross-test pollution; fixed by
  the autouse fixture)
- 3887/3895 tests/tools/ (8 pre-existing fails: 2 delegate, 1 mcp
  dynamic discovery, 5 mcp structured content — all confirmed on main)
- 2973/2976 tests/agent/ + tests/run_agent/ (3 pre-existing fails)
- 868/868 tests/run_agent/ (excluding test_run_agent.py which has
  pre-existing suite-level issues)
- Live smoke: 2 turns + /model switch + tool calls, zero errors in
  agent.log session window.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 18:20:17 -07:00
brooklyn!
188eaa57c4
fix(tui): honor documented mouse_tracking config key (#17188)
* fix(tui): honor documented mouse_tracking config key

The TUI runtime was reading display.tui_mouse while docs and user-facing
examples pointed users at display.mouse_tracking. That made persistent
mouse-disable config look like a no-op for users trying to restore native
terminal selection/copy behavior on Linux/SSH/tmux terminals.

Use display.mouse_tracking as the canonical key, keep display.tui_mouse as
a legacy fallback, and have /mouse write the documented key. Both gateway
config.get and client-side config sync now share the same precedence: the
canonical key wins, then the legacy key, then default on.

* review(copilot): align mouse tracking config coercion

- Load gateway config once before deriving display.mouse_tracking state.
- Use key-presence precedence on the TUI client too, so canonical
  mouse_tracking wins over legacy tui_mouse even when the value is null.
- Treat numeric 0 as disabled on both gateway and client, matching the
  existing string "0" handling.
- Widen ConfigDisplayConfig mouse fields because config.get full returns raw
  YAML, not normalized booleans.
2026-04-28 17:39:07 -07:00
brooklyn!
6b09df39be
fix(tui): restore macOS copy behavior and theme polish (#17131)
This PR groups the TUI fixes that restore macOS Terminal usability and clean up the theme/composer regressions:

- copy transcript selections on macOS drag-release so Terminal.app users can copy while mouse tracking is enabled
- copy composer selections on macOS drag-release; composer selection is internal to TextInput and does not use the global Ink selection bus
- keep IDE Cmd+C forwarding setup macOS-only, and make keybinding conflict checks respect simple when-clause overlap/negation
- force truecolor before chalk initializes (unless NO_COLOR / FORCE_COLOR / HERMES_TUI_TRUECOLOR opt-outs apply) so the default banner keeps its gold/amber/bronze gradient in Terminal.app
- move TUI surfaces onto semantic theme tokens and preserve skin prompt symbols as bare tokens with renderer-owned spacing
- render focused placeholders as dim hint text in TTY mode instead of inverse/selected-looking synthetic cursor text
2026-04-28 18:47:14 -05:00
brooklyn!
7d81d76366
feat(tui): pluggable busy-indicator styles (#13610) (#17150)
* feat(tui): pluggable busy-indicator styles (kaomoji/emoji/unicode/ascii)

The status-bar `FaceTicker` rotated through wide-and-variable kaomoji
glyphs (`(。•́︿•̀。)`, `( ͡° ͜ʖ ͡°)`, …) every 2.5s.  Real display widths range
from ~5 to ~16 columns, so the rest of the bar (cwd, ctx %, voice,
bg counter) shifted on every cycle.  Padding the verb alone (#17116)
helped but didn't address the dominant jitter source — the glyph
itself.

Add four indicator styles, configurable + hot-swappable:

* `kaomoji` (default — preserves the existing vibe; verb is now
  pad-stable so the only width churn left is the kaomoji itself).
* `emoji`  — single 2-col emoji frame (`⚕ 🌀 🤔  🍵 🔮`).
* `unicode` — `unicode-animations` braille spinner (1-col, smooth).
* `ascii`  — `| / - \` (1-col, max compat).

Wires:

* `display.tui_status_indicator` in `DEFAULT_CONFIG` (default
  `kaomoji`).
* New JSON-RPC `config.set/get indicator` keys, narrow allow-list.
* `applyDisplay` reads the field and patches `UiState.indicatorStyle`,
  so the existing `mtime` poll picks up `~/.hermes/config.yaml` edits
  within ~5s without a TUI restart.
* `/indicator [style]` slash command (alias `/indicator-style`,
  subcommand completion `kaomoji|emoji|unicode|ascii`).  Bare form
  shows the current style; setter fires `config.set` and
  optimistically `patchUiState({ indicatorStyle })` so the live TUI
  swaps immediately, matching the `/skin` UX.
* `CommandDef("indicator", ..., subcommands=...)` so classic CLI
  autocomplete + TUI `complete.slash` both surface it.
* `FaceTicker` decouples spinner cadence from verb cadence — the
  glyph runs at the spinner's authored interval (or `FACE_TICK_MS`
  for kaomoji), the verb stays on the original 2.5s cycle, and both
  re-arm cleanly when style changes.

Tests:

* `normalizeIndicatorStyle` rejects unknown / non-string input.
* `applyDisplay → tui_status_indicator` covers fan-out + fallback.
* `/indicator <style>` hot-swaps `UiState.indicatorStyle` after a
  successful `config.set`.
* `/indicator sparkle` rejects with the usage hint and never hits
  the gateway.
* Slash-parity matrix gets `'/indicator'` → `config.get`.

Validation:
  cd ui-tui && npm run type-check — clean; npm test --run — 398/398.
  scripts/run_tests.sh tests/test_tui_gateway_server.py
  tests/hermes_cli/test_commands.py — 220/220.

* chore(tui): drop /indicator-style alias to declutter autocomplete

* fix(tui): drop verb-width pad — /indicator handles glyph jitter directly

* fix(tui): unicode indicator style hides the verb (cleanest option)

* refactor(tui): single source of truth for INDICATOR_STYLES; cleaner error format

Round 1 Copilot review on PR #17150:

- Exported `INDICATOR_STYLES` const tuple from `interfaces.ts`;
  `IndicatorStyle` union type is derived from it. `useConfigSync`
  builds its validation Set from the tuple, and `session.ts` uses it
  for both the usage hint and the runtime allow-list — adding/removing
  a style now touches one line.
- Backend `config.set indicator` error message: switched
  `sorted(allowed)` list repr to `pick one of ascii|emoji|kaomoji|unicode`
  (matches the TUI usage hint), and reports the normalized `raw`
  instead of the original `value`. Backend allowed tuple now has a
  comment pointing back at `INDICATOR_STYLES` so the two stay aligned.

Note: kept the verb portion unpadded per design intent — fixed-width
padding was the exact UX the `/indicator` command was added to remove.
Stable width comes from the glyph; verbs cycling is part of the kawaii
aesthetic. Reply on the verb thread will explain.

* fix(tui): drop type collapse + gate verb timer + DEFAULT_INDICATOR_STYLE

Round 2 Copilot review on PR #17150:

- `tui_status_indicator?: 'ascii' | ... | string` collapses to `string`
  in TS — consumers got no narrowing. Documented as plain `string` with
  a comment about runtime validation via `normalizeIndicatorStyle`.
- `FaceTicker` always started a 2.5s verb interval, even for the
  `unicode` style which hides the verb entirely. Now gated on
  `showVerb` from `renderIndicator` — `unicode` stays calm.

Pre-emptive self-review (avoid round 3):
- Three call sites duplicated the literal `'kaomoji'` default
  (uiStore, normalizeIndicatorStyle, slash command). Added
  `DEFAULT_INDICATOR_STYLE` to interfaces.ts and threaded it through
  so changing the default touches one line.

* fix(tui-gateway): normalize config.get indicator output to match TUI render

Round 4 Copilot review on PR #17150: `config.get` for `indicator`
returned the raw `display.tui_status_indicator` value without
validation, so a hand-edited config.yaml with stray casing or an
unknown style would leave `/indicator` printing one thing while
the TUI rendered the kaomoji default (frontend's
`normalizeIndicatorStyle` does this normalization on receive).

Lifted the allow-list to module scope as `_INDICATOR_STYLES` /
`_INDICATOR_DEFAULT`, reused by both `config.set` and `config.get`.
Comment notes the alignment with `INDICATOR_STYLES` /
`DEFAULT_INDICATOR_STYLE` in interfaces.ts so adding/removing a
style is a one-line change on each end.

Tests cover: known value verbatim, casing/whitespace normalize,
unknown→default, unset→default.

* fix(tui-gateway): preserve falsy-input diagnostics in config.set indicator error

Round 5 Copilot review on PR #17150: `raw = str(value or "").strip().lower()`
collapsed any falsy non-string (`0`, `False`, `[]`) to empty string,
so the error message read `unknown indicator: ` with nothing after —
losing the original input.

Switched to `("" if value is None else str(value)).strip().lower()`
so only `None` (the genuine 'no value' case) becomes blank.  Used
`{raw!r}` in the error so the diagnostic is unambiguous (`'0'` vs `0`).

Tests:
- known-value happy path (`'EMOJI'` → `'emoji'`)
- falsy non-string inputs (`0` / `False` / `[]`) surface meaningfully
- `None` keeps the blank-repr error
2026-04-28 18:19:16 -05:00
brooklyn!
1e326c686d
fix(tui-gateway): harden stdio transport against half-closed pipes + SIGTERM races (#17118)
* fix(tui-gateway): harden stdio transport against half-closed pipes + SIGTERM races

`tui_gateway` reports `tui_gateway_crash.log` traces where the main
thread sits in `sys.stdin` while a worker holds `_stdout_lock` mid-
flush, and SIGTERM then calls `sys.exit(0)` while the lock is still
held — the interpreter shutdown stalls behind the wedged write.

Two narrowly scoped hardenings:

**`tui_gateway/transport.py`**

* Move JSON serialisation outside the lock — long messages no longer
  block sibling writers while we serialise.
* Treat `BrokenPipeError`, `ValueError` ("I/O on closed file") and
  generic `OSError` from both `write` and `flush` as "peer is gone":
  return `False` instead of bubbling, matching what `write_json`'s
  callers in `entry.py` already expect.
* Split `flush` into its own try block so a stuck flush never strands
  a partial write or holds the lock indefinitely on its way out.
* Optional `HERMES_TUI_GATEWAY_NO_FLUSH=1` env knob to skip explicit
  `flush()` entirely on environments where a half-closed read pipe
  produces an indefinite kernel-level block.  Default unchanged.

**`tui_gateway/entry.py`**

* `_log_signal` now spawns a 1-second daemon timer that calls
  `os._exit(0)` if the orderly `sys.exit(0)` path is itself stuck
  behind a wedged worker.  Atexit handlers run inside the grace
  window when they can; the timer is the safety net so a deadlocked
  flush no longer strands the gateway process.

Tests:

* `test_write_json_closed_stream_returns_false` — ValueError path.
* `test_write_json_oserror_on_flush_returns_false` — OSError on flush
  must not strand the lock; the write portion still landed before the
  flush failure.
* `test_write_json_no_flush_env_skips_flush` — env knob bypass.

Validation: `scripts/run_tests.sh tests/tui_gateway/test_protocol.py`
(42/42 pass; one pre-existing failure on
`test_session_resume_returns_hydrated_messages` is unrelated to this
change — same `include_ancestors` mock kwarg issue tracked elsewhere).
`scripts/run_tests.sh tests/test_tui_gateway_server.py` 90/90 pass.

* review(copilot): tighten transport hardening comments + test cleanup

* review(copilot): narrow exception capture, configurable grace, simpler no-flush test

* fix(tui-gateway): narrow ValueError to closed-stream; surface UnicodeEncodeError

Copilot review on PR #17118: `UnicodeEncodeError` is a ValueError
subclass, so a non-UTF-8 stdout (mismatched PYTHONIOENCODING / locale)
would have been silently swallowed as 'peer gone' under
`except ValueError`.  That hides a real environment bug.

Now:
- UnicodeEncodeError → log with exc_info (warning) and drop the frame
- ValueError where str(e) contains 'closed file' → peer gone, return False
- Any other ValueError → log loudly, drop frame (defensive, but visible)

Same shape applied to flush.  Adds two regression tests.

* fix(tui-gateway): reserve write() False for peer-gone; re-raise programming errors

Round 2 Copilot review on PR #17118: `Transport.write()` returning
`False` is documented as 'peer is gone', and `entry.py` reacts by
calling `sys.exit(0)`.  But the implementation also returned False
for non-IO conditions (non-JSON-safe payloads, UnicodeEncodeError,
unrelated ValueErrors), so a programming error or local env bug would
present as a clean disconnect — exactly the diagnosis pain we wanted
to eliminate.

Now:
- `json.dumps` failure → re-raises (TypeError/ValueError surfaces in crash log)
- `BrokenPipeError` → False (peer gone)
- `ValueError('...closed file...')` → False (peer gone)
- `UnicodeEncodeError` and any other ValueError → re-raise
- `OSError` → False (existing IO-failure semantics, debug-logged)

Tests updated to assert the re-raise behaviour and added a
non-serializable-payload regression test.

* fix(tui-gateway): narrow OSError to peer-gone errnos; honest test naming

Round 3 Copilot review on PR #17118:

- Docstring claimed False = peer gone, but generic OSError on write/flush
  also returned False — meaning ENOSPC/EACCES/EIO would silently exit.
  Added `_PEER_GONE_ERRNOS = {EPIPE, ECONNRESET, EBADF, ESHUTDOWN, +WSA}`
  and narrowed the OSError handlers; non-peer-gone errnos re-raise.
  Docstring now lists OSError as peer-gone branch with the errno set.
- The `_DISABLE_FLUSH` test was named after the env var but actually
  patched the module constant. Renamed it to reflect the contract being
  tested (skips flush when constant is true) AND added a real
  end-to-end test that sets the env var, reloads transport.py, and
  asserts the constant flips. Cleanup reload restores defaults so
  parallel tests stay isolated.

Self-review (avoid round 4):
- Verified TeeTransport's secondary-swallow stays intentional.
- _log_signal grace path already covered by separate tests.
2026-04-28 17:54:06 -05:00
brooklyn!
15ef11a8b8
fix(tui): make /browser connect actually take effect on the live agent (#17120)
* fix(tui): make /browser connect actually take effect on the live agent

Reports were that `/browser connect <url>` (and "changes to CDP url
don't get picked up") didn't propagate to the live agent in `--tui`,
forcing users to fall back to setting `browser.cdp_url` in
`config.yaml` and restarting.  Tracing the path on current main shows
the protocol wiring is already correct — `/browser` is registered in
`ui-tui/src/app/slash/commands/ops.ts` and dispatches `browser.manage`
through the gateway RPC, NOT the slash worker (covered by the
`browser.manage` row in `slashParity.test.ts`).  But three real gaps
left the experience flaky:

1. `cleanup_all_browsers()` ran AFTER `os.environ["BROWSER_CDP_URL"]`
   was rewritten.  `_ensure_cdp_supervisor(...)` reads the env to
   resolve its target URL, so a tool call landing in that brief window
   could re-attach the supervisor to the OLD CDP endpoint just before
   we reaped sessions, leaving the agent talking to a dead URL.
   Reorder to clean first, swap env, clean again so the supervisor
   for the default task is definitively closed.
2. `browser.manage status` reported only the env var, ignoring
   `browser.cdp_url` from config.yaml.  `_get_cdp_override()` (the
   resolver the agent itself uses) consults both — match it so
   `/browser status` answers the same question the next
   `browser_navigate` will see.  Closes a stealth bug where users
   saw "browser not connected" while their CDP URL was perfectly
   set in config.yaml.
3. `/browser disconnect` only cleared `BROWSER_CDP_URL` and reaped
   once, leaving the same swap window as connect.  Symmetrical
   double-cleanup here too.

Frontend (`ops.ts`):
* Echo "next browser tool call will use this CDP endpoint" on success
  so users see immediate confirmation that the gateway accepted the
  swap, even before any tool runs.
* Mention `browser.cdp_url` in `config.yaml` in the usage hint and
  the not-connected status line.  Persistent config is the correct
  fix for some terminal-multiplexer / sub-agent flows where env
  inheritance is unreliable; surfacing it makes that workaround
  discoverable.

Tests (4 new, all hermetic):
* `status` returns the resolved URL when only `browser.cdp_url` is
  set in config.yaml.
* `connect` writes env AND cleans before/after, in that order.
* `connect` against an unreachable endpoint does NOT mutate env or
  reap.
* `disconnect` removes env and cleans twice.

Validation:
  scripts/run_tests.sh tests/test_tui_gateway_server.py — 94/94 pass.
  cd ui-tui && npm run type-check — clean; npm test --run — 389/389.

* review(copilot): always defer to _get_cdp_override; normalize bare host:port

* review(copilot): collapse discovery-style CDP paths so /json/version isn't duplicated

* fix(tui): /browser status must not perform CDP discovery I/O

Copilot review on PR #17120: previous version routed through
`tools.browser_tool._get_cdp_override`, which calls
`_resolve_cdp_override` and performs an HTTP probe to /json/version
with a multi-second timeout for discovery-style URLs.  That blocks
the TUI on `/browser status` whenever the configured host is slow
or unreachable.

Status now reads env-then-config directly with no network I/O.  The
WS normalization still happens in `browser_navigate` for actual
tool calls, so behaviour-on-call is unchanged.

* fix(tui): skip /json/version probe for concrete ws://devtools/browser endpoints

Round 2 Copilot review on PR #17120: hosted CDP providers (Browserbase,
browserless, etc.) return concrete `ws[s]://.../devtools/browser/<id>`
URLs which are already directly connectable but don't serve the HTTP
discovery path.  The previous `/json/version` probe rejected these
valid endpoints with 'could not reach browser CDP'.

For `ws[s]://...` URLs whose path starts with `/devtools/browser/` we
now do a TCP-level reachability check (`socket.create_connection`)
instead of the HTTP probe.  The actual CDP handshake happens on the
next `browser_navigate` call, so we still surface unreachable hosts
as 5031 errors — just without the false negatives.

Discovery-style URLs (`http://host:port[/json[/version]]`) keep the
HTTP probe path unchanged.  Updated existing test + added two new
ones (TCP-only success, TCP unreachable → 5031).
2026-04-28 17:46:57 -05:00
brooklyn!
87d3fa6f1c
feat(tui): opt-in auto-resume of the most recent session (#17130)
* feat(tui): opt-in auto-resume of the most recent session

`hermes --tui` always forges a fresh session at startup unless the user
sets `HERMES_TUI_RESUME=<id>`.  Disconnects, terminal-window crashes,
and accidental Ctrl+D therefore lose every piece of in-flight context
even though `state.db` still has the full history a `/resume` away.

Add an opt-in path that mirrors classic CLI's `hermes -c` muscle
memory: when `display.tui_auto_resume_recent: true` is set in
`~/.hermes/config.yaml`, the TUI looks up the most recent human-facing
session and resumes it instead of starting fresh.  Default off so
existing users aren't surprised; explicit `HERMES_TUI_RESUME` always
wins.

Wires:

* New `session.most_recent` JSON-RPC in `tui_gateway/server.py` that
  returns the first non-`tool` row from `list_sessions_rich`, or
  `{"session_id": null}` when none.  Uses the same deny-list as
  `session.list` so sub-agent rows can't sneak in.
* `createGatewayEventHandler.handleReady` re-ordered: explicit
  `STARTUP_RESUME_ID` first (unchanged), then conditional auto-resume
  via `config.get full → display.tui_auto_resume_recent`, then the
  legacy `newSession()` fallback.  Failures of either RPC fall back
  to `newSession()` so the path is always finite.
* Default `display.tui_auto_resume_recent: False` added to
  `DEFAULT_CONFIG` in `hermes_cli/config.py` (no `_config_version`
  bump per AGENTS.md — deep-merge handles the additive key).

Tests:

* 4 new vitest cases in `createGatewayEventHandler.test.ts` cover
  every gate-and-fallback combination (env wins, config off, config
  on with hit, config on with miss).
* 3 new pytest cases for `session.most_recent` (denied row skip,
  tool-only → null, db-unavailable → null).

Validation:
  scripts/run_tests.sh tests/test_tui_gateway_server.py — 93/93.
  cd ui-tui && npm run type-check — clean; npm test --run — 393/393.

* review(copilot): fold session.most_recent errors into null + extend ConfigDisplayConfig

* review(copilot): cover RPC-rejection fallbacks in auto-resume tests
2026-04-28 16:53:38 -05:00
Gille
0d957a8d48
fix(tui): surface mouse slash command (#17126) 2026-04-28 13:27:43 -07:00
brooklyn!
5f215b13ce
fix(docker): materialize bundled TUI Ink package (#16690)
* fix(docker): materialize bundled TUI Ink package

* fix(docker): keep nested deps out of build context

* fix(docker): make TUI Ink smoke check deterministic

* test(docker): skip dockerignore assertion in partial checkouts

* fix(docker): use lockfile install for vendored Ink deps

* test(cli): expect deterministic npm ci in /update flow

* fix(docker): fall back to npm install for vendored Ink deps

* fix(docker): keep bundled Ink source for TUI runtime builds

* fix(docker): dedupe React in vendored Ink package
2026-04-28 15:11:47 -05:00
kshitijk4poor
5d2f9b5d7d fix: follow-up for salvaged PR #17061
- Remove dead _lmstudio_loaded_context attribute from run_agent.py (set
  but never read — the loaded context is pushed to context_compressor.update_model
  which is the actual consumer)
- Cache empty reasoning options with 60s TTL to avoid per-turn HTTP probe
  for non-reasoning LM Studio models. Non-empty results cached permanently.
- Extract _lmstudio_server_root(), _lmstudio_request_headers(), and
  _lmstudio_fetch_raw_models() shared helpers in models.py — eliminates
  URL-strip + auth-header + HTTP-call duplication across probe_lmstudio_models,
  ensure_lmstudio_model_loaded, and lmstudio_model_reasoning_options
- Revert runtime_provider.py base_url precedence change: preserve the
  established contract (saved config.base_url > env var > default) for all
  api_key providers
- Remove unnecessary config version bump 22→23
- Fix TUI test: relax target_model assertion to avoid module-cache flake
- AUTHOR_MAP: added rugved@lmstudio.ai → rugvedS07
2026-04-28 12:27:36 -07:00
Rugved Somwanshi
01ad0aacaf fix(tui): show correct context length 2026-04-28 12:27:36 -07:00
Rugved Somwanshi
fa2bee1215 fix(tui): update test for target model 2026-04-28 12:27:36 -07:00
Rugved Somwanshi
214ca943ac feat(agent): add lmstudio integration 2026-04-28 12:27:36 -07:00
nfb0408
74c209534c fix(copilot-acp): disable streaming path for CopilotACPClient
CopilotACPClient communicates via subprocess stdio and returns a plain
SimpleNamespace from _create_chat_completion(). The streaming path tries
to iterate this as a stream, crashing with:
  TypeError: 'types.SimpleNamespace' object is not iterable

Mirror the existing ACP exclusion pattern (used for Responses API upgrade)
to disable streaming when provider is copilot-acp or base_url starts with
acp:// or acp+tcp://.

Based on PR #9428 by @ningfangbin and issue #16271 by @Joseph19820124.

Fixes #16271
2026-04-28 11:33:07 -07:00
Teknium
42be5e49b0
fix(browser): detect missing Chromium and fail fast with actionable error (#17039)
Previously, check_browser_requirements() only checked for the agent-browser
CLI, not the Chromium binary it drives. When the CLI was present but
Chromium wasn't (common in Docker images predating the playwright install
step), the browser tool was advertised to the agent, every call hung for
the full command timeout (~30s each, ~220s for a chained navigate), and
the agent eventually gave up with no useful error — users saw 'browser
not working' with empty errors.log.

Changes:
- tools/browser_tool.py: add _chromium_installed() checking
  PLAYWRIGHT_BROWSERS_PATH + default Playwright cache paths for
  chromium-* / chromium_headless_shell-* dirs; wire into
  check_browser_requirements() for local mode (cloud providers
  unaffected). _run_browser_command fails fast with an actionable
  Docker vs. host message instead of hanging. _running_in_docker()
  checks /.dockerenv and /proc/1/cgroup.
- hermes_cli/tools_config.py: post_setup for 'Local Browser' now runs
  'agent-browser install --with-deps' after npm install to actually
  download Chromium. In Docker, points user at the updated image pull
  instead of trying to install into a read-only layer. Cloud-provider
  post_setup (browserbase) skips Chromium install entirely.
- tests/tools/test_browser_chromium_check.py: new tests covering
  search roots, install detection, requirements branches (local/cloud/
  camofox), and the fast-fail guard in docker/non-docker contexts.
- tests/tools/test_browser_homebrew_paths.py: 5 existing subprocess-path
  tests now mock _chromium_installed=True since they exercise the
  post-guard subprocess path.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 07:03:44 -07:00
konsisumer
e4b69bf149 fix(gateway): guard against None request_overrides in _build_api_kwargs 2026-04-28 06:57:23 -07:00
Teknium
1d8b9e6458
fix(auxiliary): auto-detect Anthropic Messages transport for all aux clients (#17027)
Auxiliary tasks (title_generation, vision, compression, web_extract,
session_search) now pick the correct wire protocol based on the
endpoint, not just on which resolve_provider_client branch built the
client.  Fixes 404s on Kimi Coding Plan and any other named provider
whose endpoint speaks Anthropic Messages.

Root cause: the 'api_key' branch of resolve_provider_client (and the
Step 2 fallback chain inside _resolve_auto) always built a plain
OpenAI client regardless of what the endpoint actually spoke.  For
provider=kimi-coding + model=kimi-for-coding, that meant:

    POST https://api.kimi.com/coding/v1/chat/completions
    { "model": "kimi-for-coding", ... }
    → 404 resource_not_found_error

The /coding route only accepts the Anthropic Messages shape (the main
agent already uses api_mode=anthropic_messages for it).  Earlier fixes
(#16819, #22ddac4b1) patched the anonymous-custom, named-custom, and
external-process branches — but the named api_key branch (kimi-coding,
minimax, zai, future /anthropic providers) was the fourth sibling and
never got the same treatment.

Fix: one module-level helper _maybe_wrap_anthropic() that rewraps a
plain OpenAI client in AnthropicAuxiliaryClient when:

  - api_mode is explicitly 'anthropic_messages', OR
  - the URL ends in '/anthropic', OR
  - the host is api.kimi.com + path contains '/coding', OR
  - the host is api.anthropic.com.

Wired into _wrap_if_needed (covers all resolve_provider_client
branches that already go through it) and into the Step 2 api_key
fallback chain inside _resolve_auto.  Explicit api_mode still wins:
passing api_mode='chat_completions' forces OpenAI wire, and already-
wrapped specialized adapters (Codex, Gemini native, CopilotACP) pass
through unchanged.

E2E verified:
- resolve_provider_client('kimi-coding', 'kimi-for-coding')
  → AnthropicAuxiliaryClient (was plain OpenAI, which 404'd)
- _resolve_auto Step 1 for kimi-coding runtime → AnthropicAuxiliaryClient
- resolve_provider_client('openrouter', ...) → plain OpenAI (no regression)
- api_mode='chat_completions' override → plain OpenAI (explicit wins)

Tests:
- tests/agent/test_auxiliary_transport_autodetect.py (new): 21 tests
  covering URL detection, wrap decisions, and integration.
- 204/205 existing auxiliary tests pass (1 pre-existing failure on
  main, unrelated to this change).

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 06:50:14 -07:00
Teknium
e123f4ecf0
feat(gateway): opt-in runtime-metadata footer on final replies (#17026)
Append a compact 'model · 68% · ~/projects/hermes' footer to the FINAL
message of each turn, disabled by default (display.runtime_footer.enabled).
Answers the Telegram-side parity ask: runtime context that the CLI status
bar already shows is now available in messaging replies when enabled.

Wiring:
- gateway/runtime_footer.py: resolve_footer_config + format_runtime_footer +
  build_footer_line. Pure-function renderer; per-platform overrides under
  display.platforms.<platform>.runtime_footer.
- gateway/run.py: appends footer to response right after reasoning prepend
  so it lands only on the final message (never tool progress or streaming
  chunks). When streaming already delivered the body (already_sent), the
  footer is sent as a small trailing message instead.
- agent_result now exposes context_length alongside last_prompt_tokens so
  the footer can compute the pct; both gateway return paths updated.
- /footer [on|off|status] slash command, wired in CLI (cli.py) and gateway
  (gateway/run.py both running-agent bypass and main dispatch). Global
  toggle only; per-platform overrides via config.yaml.

Graceful degradation:
- Missing context_length (unknown model) → pct field silently dropped
  (no '?%' artifact).
- Empty final_response → no footer appended.
- Unknown field names in config → silently ignored.

Tests: 25-case unit suite (tests/gateway/test_runtime_footer.py) plus E2E
harness covering streaming vs non-streaming branches, per-platform override,
and the exact argument contract gateway/run.py uses.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 06:50:04 -07:00
teknium1
3d8be2c617 fix(install): widen /dev/tty open-probe to sibling gates (#16746)
The contributor's PR (#16750) scoped the fix to run_setup_wizard() and
explicitly punted the two sibling sites. Both have the identical
[ -e /dev/tty ] pattern followed by a < /dev/tty redirect and crash in
Docker the same way:

- scripts/install.sh:732 install_system_packages() -- apt sudo prompt
  fallback. sudo ... < /dev/tty dies with the same ENXIO.
- scripts/install.sh:1395 maybe_start_gateway() -- gateway-install gate,
  same function path as the wizard reproducer.

Fix both with the same (: </dev/tty) 2>/dev/null probe, and parametrize
the regression test over all three gated functions so any future
regression is caught regardless of which site breaks.
2026-04-28 06:45:55 -07:00
briandevans
89e8c87354 test(install): regex-based gate assertions per copilot review on #16750
Address the three Copilot inline findings on the regression test:

- Switch _extract_run_setup_wizard() from str.index() with hard-coded
  markers (which raises ValueError if `maybe_start_gateway()` is renamed
  or the marker leaks into a comment) to an anchored regex on the
  function-definition + closing-brace boundaries.
- Match `[ -e /dev/tty ]` with surrounding whitespace, optional quoting,
  and the `test -e /dev/tty` form so the regression guard catches every
  spelling of the existence-only check, not just the exact substring.
- Replace the literal `(: </dev/tty)` substring assertion with a
  higher-level invariant — the gate must be an `if`/`if !` whose test
  redirects stdin from /dev/tty — so equivalent open-based probes
  (`exec 3</dev/tty` + close, brace-grouped variants, etc.) keep the
  test green while the bare existence check stays caught.

Verified guard: both tests still pass on the fix and both fail on
`origin/main` with the documented messages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:45:55 -07:00
briandevans
20c9340c34 fix(install): probe /dev/tty by opening it, not bare existence (#16746)
In Docker builds the `/dev/tty` device node is present in the mount
namespace, so `[ -e /dev/tty ]` returns true — but opening it fails
with `ENXIO: No such device or address`. Under the old gate the
"no terminal available" skip never triggered, the setup wizard ran,
and the build aborted a few lines later when bash tried `< /dev/tty`:

    /tmp/install.sh: line 1347: /dev/tty: No such device or address

Replace the existence check with `(: </dev/tty) 2>/dev/null`, which
actually attempts to open /dev/tty in a subshell. The probe succeeds
when piped from `curl | bash` on a real terminal (the wizard's intended
use case) and fails cleanly in Docker build / CI contexts so the skip
kicks in before the redirect can crash.

Add a regression test that statically asserts run_setup_wizard does not
gate on the bare existence check and that the open-based probe is in
place.

Fixes #16746.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:45:55 -07:00
Dejie Guo
8cced33784 fix(model): prefer live models for user providers 2026-04-28 06:45:35 -07:00
Teknium
5f84eac451
feat(gateway): bust cached agent on compression/context_length config edits (#17008)
The gateway caches one AIAgent per session to preserve prompt-cache hits,
keyed by _agent_config_signature().  The signature previously only
fingerprinted model/credentials/toolsets/ephemeral-prompt — NOT the
compression or context_length config.  As a result, users who edited
model.context_length or compression.threshold in config.yaml on a
long-lived gateway saw no effect until they triggered an unrelated
cache eviction (/model switch, /reset, gateway restart).

Add a new cache_keys parameter to _agent_config_signature and a
_CACHE_BUSTING_CONFIG_KEYS registry listing config values the agent
bakes in at construction time.  Call sites read the current config and
pass it through — next gateway message with an edited config
rebuilds the agent.

Keys registered:
- model.context_length
- compression.enabled
- compression.threshold
- compression.target_ratio
- compression.protect_last_n

Reported by @OP (Apr 26 feedback bundle).

## Changes
- gateway/run.py: new _CACHE_BUSTING_CONFIG_KEYS tuple,
  _extract_cache_busting_config classmethod, cache_keys kwarg on
  _agent_config_signature, call site passes the extracted dict
- tests/gateway/test_agent_cache.py: 11 new tests
  (5 on _agent_config_signature behavior, 6 on _extract_cache_busting_config)

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 06:37:42 -07:00
Siwen Wang
d6137453ac fix(gateway): drain stale httpx polling connections on Telegram reconnect
Network errors through proxies (e.g. sing-box) can leave httpx
connections in a half-closed state occupying pool slots.  After enough
reconnect cycles the 256-connection default fills up entirely, causing
Pool timeout: All connections in the connection pool are occupied.

Fix: cycle only the getUpdates request object (_request[0]) via
shut-down + re-initialize before restarting polling.  This drains stale
connections without touching the general request (_request[1]) that
concurrent send_message / edit_message calls rely on.

The drain is applied to both _handle_polling_network_error and
_handle_polling_conflict reconnect paths via a shared
_drain_polling_connections() helper.  Failures in the drain are
swallowed so reconnect always proceeds.

Based on #16466 by @Mirac1eSky.
2026-04-28 06:37:22 -07:00
Teknium
391f1ca1f4
feat(aux): translate extra_body.reasoning into Codex Responses API (#17004)
Auxiliary callers that configure reasoning via
auxiliary.<task>.extra_body.reasoning were having that config silently
dropped by the Codex Responses adapter — it only forwarded
messages/model/tools through to responses.stream(), never translating
chat.completions-shaped reasoning hints into the Responses API's
top-level reasoning + include fields.

Mirror the main-agent translation from agent/transports/codex.py:
- extra_body.reasoning.effort → resp_kwargs.reasoning.{effort, summary:"auto"}
- 'minimal' → 'low' clamp (Codex backend rejects 'minimal')
- Always include ['reasoning.encrypted_content'] when reasoning is enabled
- {'enabled': False} → omit reasoning and include entirely
- Non-dict reasoning values are ignored defensively

Reported by @OP (Apr 26 feedback bundle).

## Changes
- agent/auxiliary_client.py: _CodexCompletionsAdapter.create() now reads
  and translates extra_body.reasoning before calling responses.stream()
- tests/agent/test_auxiliary_client.py: 9 new tests covering all effort
  levels, the minimal→low clamp, the disabled path, the no-op paths,
  and defensive handling of wrong-shape inputs

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 05:47:42 -07:00
Teknium
72dea9f4f7
feat(gateway): make hygiene hard message limit configurable (#17000)
The gateway session-hygiene pre-compression safety valve had a hardcoded
400-message threshold. On long-lived sessions with short turns this was
either too high (users with aggressive compression preferences) or too
low (users with very large context models who want to keep more history
in-flight).

Add compression.hygiene_hard_message_limit (default 400) so it can be
tuned without forking the gateway.

Reported by @OP (Apr 26 feedback bundle).

## Changes
- hermes_cli/config.py: new DEFAULT_CONFIG key with 400 default
- gateway/run.py: read compression.hygiene_hard_message_limit at
  hygiene-time, fall back to 400 if missing/invalid
- tests/gateway/test_session_hygiene.py: two tests — override fires at
  the configured limit, default does not fire below 400

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 05:43:12 -07:00
Teknium
06164a7b28
fix(codex): resync pool entry from auth.json after reauth (#17001)
When openai-codex tokens expire or the ChatGPT account hits a 429
window, the pool entry gets marked STATUS_EXHAUSTED with
last_error_reset_at many hours in the future. If the user then runs
`hermes model` / `hermes auth openai-codex` to reauth, fresh tokens
land in ~/.hermes/auth.json but the pool entry stayed frozen behind
its reset_at — every request kept failing with 'credential pool: no
available entries (all exhausted or empty)' until the original window
elapsed.

_available_entries() already had auth.json/credentials-file resync
branches for anthropic/claude_code and nous/device_code; openai-codex
was missing. Added _sync_codex_entry_from_auth_store() mirroring the
nous version (reads state["tokens"][{access,refresh}_token] +
state["last_refresh"]) and wired it into the exhausted-entry resync
loop.

Also softens the 'codex CLI not found' doctor warning — native
device-code OAuth does not require the Codex binary, only
importing existing Codex CLI tokens does. Downgraded to an info line.

Reported on Discord by p1aceho1der: Codex stalled indefinitely after
a rate-limit reset, reauth didn't help, and doctor falsely warned
that the codex CLI was required.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 05:43:09 -07:00
teknium1
529eb29b6a fix(gemini): clamp Flash thinkingLevel to documented low/medium/high set
Gemini 3 Flash documents low/medium/high as the accepted thinkingLevel
values. The salvaged bridge was forwarding Hermes' "minimal" effort to
Flash verbatim, which is not a documented Gemini level and risks a 400
from the native adapter.

Clamp minimal->low on Flash (matching how Pro already clamps minimal+low
down), and funnel anything outside {low, medium, high} into medium to
keep the request valid by construction. No behaviour change for the
documented effort levels.
2026-04-28 05:38:23 -07:00
Nanako0129
dbbe2d1973 fix(gemini): bridge reasoning_config into thinking_config for chat-completions routes 2026-04-28 05:38:23 -07:00
LeonSGP43
a3b9343f08 feat(telegram): render markdown tables as row groups 2026-04-28 05:37:50 -07:00
helix4u
d8c5573ffe fix(profiles): migrate Honcho host on rename 2026-04-28 05:37:09 -07:00
revar
052b3449e5 test(cli): regression test for manual /compress system_message
Add tests/test_cli_manual_compress.py verifying _manual_compress passes
None (not the cached system prompt) to _compress_context, forwards the
/compress <topic> focus string, rotates CLI session_id to the new child
session, and clears the pending title.

Co-authored-by: revar <revar@users.noreply.github.com>
2026-04-28 05:21:49 -07:00
teknium1
7444e49d4e fix(gateway): use transcript timestamp for auto-continue freshness
Follow-up to PR #16802 (BeliefanX). The original fix read
`agent_history[-1].get("timestamp")` for the tool-tail freshness gate,
but `gateway/run.py` strips the `timestamp` field off all tool/tool_call
rows when building `agent_history` from the raw transcript (see
`clean_msg = {k: v for k, v in msg.items() if k != "timestamp"}`).  At
runtime the tool-tail branch always saw `None` and silently took the
legacy-fresh path — the stale-guard never fired for the tool-tail case
it was supposed to cover.

Changes:
- Read the freshness signal from the RAW `history` list (via new
  `_last_transcript_timestamp()` helper) BEFORE the strip.  Both the
  resume_pending branch and the tool-tail branch use this single signal,
  replacing the two divergent ones.
- Default window bumped 15 min → 1 hour via new
  `_AUTO_CONTINUE_FRESHNESS_SECS_DEFAULT`.  The 15-minute default was
  shorter than the default `gateway_timeout` of 30 min, so a legitimate
  long-running turn interrupted near its timeout boundary and resumed
  shortly after would have been misclassified as stale.
- Configurable via `config.yaml` `agent.gateway_auto_continue_freshness`
  (bridged to `HERMES_AUTO_CONTINUE_FRESHNESS` at gateway startup — same
  pattern as `gateway_timeout`).  Set to 0 to disable the gate.
- `_coerce_gateway_timestamp` now explicitly rejects bool (which is a
  subclass of int and would otherwise coerce to 0.0/1.0).
- Tests rewritten to exercise the real production data shape: raw
  `history` → `_build_agent_history` strip → freshness decision.  A
  regression guard (`test_stale_tool_tail_with_production_data_shape`)
  asserts `agent_history` tool rows carry NO timestamp, protecting
  against someone "fixing" the original bug by re-adding the stripped
  field (which would break the OpenAI tool-result message contract).

Add BeliefanX to scripts/release.py AUTHOR_MAP.

E2E verified: config.yaml → env var bridge → helper returns configured
value; default 1h window; malformed/empty env var falls back to default;
ISO-Z timestamps parse; ms-epoch coerced; bool rejected.
2026-04-28 05:20:35 -07:00
beliefanx
93feffbcfa fix(gateway): avoid stale interrupted turn auto-continue 2026-04-28 05:20:35 -07:00
Teknium
b61d9b297a refactor: consolidate symlink-safe atomic replace into shared helper
Extract the islink/realpath guard from the 16743 fix into a single
atomic_replace() helper in utils.py, then migrate every os.replace()
call site in the codebase to use it.

The original PR #16777 correctly identified and fixed the bug, but
only patched 9 of ~24 call sites. The same bug class (managed
deployments that symlink state files silently losing the link on
every write) still existed at auth.json, sessions file, gateway
config, env_loader, webhook subscriptions, debug store, model
catalog, pairing, google OAuth, nous rate guard, and more.

Rather than add another 10+ copies of the same three-line guard,
consolidate into atomic_replace(tmp, target) which:
- resolves symlinks via os.path.realpath before os.replace
- returns the resolved real path so callers can re-apply permissions
- is a drop-in replacement for os.replace at the use sites

Changes:
- utils.py: new atomic_replace() helper + atomic_json_write /
  atomic_yaml_write now call it instead of inlining the guard
- 16 files: all os.replace() call sites migrated to atomic_replace()
  - agent/{google_oauth, nous_rate_guard, shell_hooks}.py
  - cron/jobs.py
  - gateway/{pairing, session, platforms/telegram}.py
  - hermes_cli/{auth, config, debug, env_loader, model_catalog, webhook}.py
  - tools/{memory_tool, skill_manager_tool, skills_sync}.py

Tests: tests/test_atomic_replace_symlinks.py pins the invariant for
atomic_replace + atomic_json_write + atomic_yaml_write, covers plain
files, first-time creates, broken symlinks, and permission preservation.

Refs #16743
Builds on #16777 by @vominh1919.
2026-04-28 04:58:22 -07:00
teknium1
1369dae226 test(openclaw-migration): cover alias reverse-lookup for real OpenClaw schema
Real OpenClaw configs key agents.defaults.models by full provider/model
API ID with an 'alias' field on the value (e.g.
{'anthropic/claude-opus-4-6': {'alias': 'Claude Opus 4.6'}}).  Add
regression tests for issue #16745 covering:

- reverse-lookup of alias against real schema (keyed by API ID)
- alias resolution when model is a bare string vs {'primary': ...}
- passthrough when the value is already a provider/model API ID
- passthrough when the alias has no catalog match
- string-valued catalog entries (belt-and-suspenders)
- no catalog at all
2026-04-28 04:58:13 -07:00
Pony.Ma
aa94883288 fix(mcp): preserve nullable schema coercion 2026-04-28 04:58:03 -07:00
Pony.Ma
1350d12b0b fix: keep mcp dynamic refresh tasks tracked 2026-04-28 04:58:03 -07:00
Pony.Ma
02ae152222 fix(mcp): normalize nullable tool schemas 2026-04-28 04:58:03 -07:00
Ruda Porto Filgueiras
37551ee53e test(bedrock): add model picker and region routing tests
25 new tests (all Bedrock API calls mocked, no real AWS creds needed):

tests/hermes_cli/test_bedrock_model_picker.py (20 tests):
  - provider_model_ids("bedrock") uses live discovery, returns regional
    model IDs, falls back gracefully on empty/exception, resolves all
    bedrock aliases (aws, aws-bedrock, amazon-bedrock) to live discovery
  - list_authenticated_providers() section 2: bedrock appears with AWS
    creds, model list from discover_bedrock_models(), total_models
    matches, is_current flag works, absent creds hides bedrock, discovery
    failure does not crash, no duplicate entries
  - Region routing: botocore profile eu-central-1 yields eu.* model IDs
    end-to-end; env var takes priority over botocore profile
  - providers.py overlay: exists with correct transport/auth_type, label
    is non-empty, all aliases normalize to bedrock

tests/agent/test_bedrock_adapter.py (5 tests):
  - resolve_bedrock_region() botocore profile fallback, botocore failure
    fallback, us-east-1 hard fallback (with botocore mocked)
2026-04-28 03:53:11 -07:00
Teknium
023f5c74b1
fix(anthropic): remove Claude Code fingerprinting from OAuth Messages API path (#16957)
* fix(anthropic): remove Claude Code fingerprinting from OAuth Messages API path

OAuth requests now identify as Hermes on the wire. Removed:

  - "You are Claude Code, Anthropic's official CLI for Claude." system
    prompt prepend
  - Hermes Agent → Claude Code / Nous Research → Anthropic
    system-prompt substitutions
  - mcp_ tool-name prefix on outgoing tool schemas + message history
  - Matching mcp_ strip on inbound tool_use blocks (strip_tool_prefix path
    removed from AnthropicTransport.normalize_response, + all 5 call
    sites in run_agent.py and auxiliary_client.py)
  - user-agent: claude-cli/<v> (external, cli) and x-app: cli headers on
    the Messages API client

Added:

  - OAuth path strips context-1m-2025-08-07 — Anthropic rejects OAuth
    requests carrying it with HTTP 400 'This authentication style is
    incompatible with the long context beta header.'

Kept (auth plumbing, not identity spoofing):

  - _is_oauth_token classifier and is_oauth flag threading
  - Bearer vs x-api-key auth routing
  - _OAUTH_ONLY_BETAS (claude-code-20250219, oauth-2025-04-20) — backend
    requires these on the OAuth-gated Messages endpoint
  - _OAUTH_CLIENT_ID (Claude Code's) — Anthropic doesn't issue OAuth
    creds to third parties; this is the only way the login flow works
  - claude-cli/<v> User-Agent on the OAuth token exchange + refresh
    endpoints at platform.claude.com/v1/oauth/token — bare requests get
    Cloudflare 1010 blocked

Verified live against api.anthropic.com with a fresh sk-ant-oat01-*
token:

  - claude-haiku-4-5 simple message: HTTP 200, 'OK' response
  - claude-haiku-4-5 tool call: HTTP 200, stop_reason=tool_use, tool
    named 'terminal' (no mcp_ prefix) round-tripped correctly
  - Outgoing wire: no user-agent, no x-app, real Hermes identity in
    system prompt, real tool name in schema

Closes/supersedes #16820 (mcp_ PascalCase normalization patch — no longer
needed since the mcp_ round-trip is gone).

* fix(anthropic): resolve_anthropic_token() reads credential pool first

Close the gap where ~/.hermes/auth.json → credential_pool.anthropic
(where hermes login + dashboard PKCE flow write OAuth tokens) was not
in resolve_anthropic_token()'s source list.

Before: users who authed via hermes login got the token written into
the pool, but legacy fallback code paths (auxiliary_client, models
catalog fetch, explicit-runtime path) that call resolve_anthropic_token()
saw None and raised 'No Anthropic credentials found' — even though the
token was sitting in auth.json.

New priority 1: pool.select() with env-sourced entries skipped. Skipping
env:* entries preserves the existing env-var priority logic further
down the chain (static env OAuth → refreshable Claude Code upgrade via
_prefer_refreshable_claude_code_token).

Surfaced while writing the hermes-agent-dev skill playbook for
'finding a live OAuth token for an E2E test'.

---------

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 03:51:17 -07:00
Teknium
2b728e1274
fix(agent): drop thinking-only assistant turns before provider call (#16959)
Adds a pre-call sanitizer that detects assistant messages containing only
reasoning (reasoning / reasoning_content, no visible content, no
tool_calls) and drops them from the API copy. Adjacent user messages
left behind are merged so role alternation is preserved for the
provider.

Mirrors Claude Code's approach in src/utils/messages.ts
(filterOrphanedThinkingOnlyMessages + mergeAdjacentUserMessages). We
drop the whole turn rather than fabricate stub text (the '.' /
'(continued)' pattern from contributor PRs #11098, #13010, #16842 that
were rejected because they put words in the model's mouth).

The stored conversation history (self.messages) is never mutated — only
the per-call api_messages copy. Users still see the reasoning block in
the CLI/gateway transcript; only the wire copy is cleaned. Session
persistence keeps the full trace.

Two call sites covered:
- Main agent loop, after _sanitize_api_messages (catches every turn).
- Iteration-limit-summary fallback path.

Tests: tests/run_agent/test_thinking_only_sanitizer.py — 25 cases
covering detection (string/list content, whitespace-only, tool_calls,
reasoning_details list form), drop behavior, adjacent-user merge
(string+string, list+list, mixed), non-mutation of input dicts, and
system-message handling.

E2E live-tested against 5 providers with a poisoned history (empty
assistant message + reasoning_content): OpenRouter→Anthropic/OpenAI/
DeepSeek-R1/Qwen, native Gemini. All 5 accepted the cleaned request.
Happy-path regression (5/5) confirms the sanitizer is a noop when no
thinking-only turn exists.

Related: #16823 (wontfix — stub-text approach rejected).

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 03:50:51 -07:00
simonweng
a6a6cf047d feat(providers): add tencent-tokenhub provider support
Registers tencent-tokenhub (https://tokenhub.tencentmaas.com/v1) as a
new API-key provider with model tencent/hy3-preview (256K context).

- PROVIDER_REGISTRY entry + TOKENHUB_API_KEY / TOKENHUB_BASE_URL env vars
- Aliases: tencent, tokenhub, tencent-cloud, tencentmaas
- openai_chat transport with is_tokenhub branch for top-level
  reasoning_effort (Hy3 is a reasoning model)
- tencent/hy3-preview:free added to OpenRouter curated list
- 60+ tests (provider registry, aliases, runtime resolution,
  credentials, model catalog, URL mapping, context length)
- Docs: integrations/providers.md, environment-variables.md,
  model-catalog.json

Author: simonweng <simonweng@tencent.com>
Salvaged from PR #16860 onto current main (resolved conflicts with
#16935 Azure Anthropic env-var hint tests and the --provider choices=
list removal in chat_parser).
2026-04-28 03:45:52 -07:00
Teknium
bd10acd747
fix(providers): honor key_env/api_key_env on Azure Anthropic + accept alias in normalizer (#16935)
Three related fixes around custom env-var-name hints for provider entries.

1. Azure Anthropic path: previously hardcoded to look up AZURE_ANTHROPIC_KEY
   then ANTHROPIC_API_KEY with no way to override.  If a user wrote
     model:
       provider: anthropic
       base_url: https://my-resource.services.ai.azure.com/anthropic
       key_env: MY_CUSTOM_KEY
   the key_env hint was silently ignored and the resolver raised
   'No Azure Anthropic API key found' even when MY_CUSTOM_KEY was set
   in the environment.  The runtime now checks, in order:
     (1) os.getenv(model_cfg.key_env)
     (2) os.getenv(model_cfg.api_key_env)    # docs alias
     (3) model_cfg.api_key                     # inline value
     (4) AZURE_ANTHROPIC_KEY                   # historical default
     (5) ANTHROPIC_API_KEY                     # historical default
   Error message updated to mention key_env as an option.

2. Provider entry normalizer (_normalize_custom_provider_entry): accept
   'api_key_env' as a snake_case alias for 'key_env', and 'apiKeyEnv' as a
   camelCase alias.  Adds both to the _KNOWN_KEYS set so the 'unknown
   config keys ignored' warning doesn't fire on valid configs.

3. _VALID_CUSTOM_PROVIDER_FIELDS: add 'key_env'.  That set documents
   supported custom_providers entry fields; it was drifting from reality
   since key_env has been read at runtime in auxiliary_client.py,
   runtime_provider.py, and main.py for a while.

Docs: website/docs/guides/azure-foundry.md now uses the canonical key_env
field and notes that api_key_env / keyEnv / apiKeyEnv are accepted as
aliases.

Validation: 12 new tests in test_runtime_provider_resolution.py covering
all 5 Azure Anthropic resolution paths + 4 normalizer-alias tests.  Pass
rate across related suites (165 + 46 tests): 100%.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 02:12:08 -07:00
墨綠BG
4462b349b2 feat(web): expose search result limit 2026-04-28 02:09:30 -07:00
Teknium
e63364b8df
revert: computer-use cua-driver (PR #16919) (#16927)
Reverts PR #16919 (commits dad10a78d, 413ee1a28, b4a8031b2, afb958829)
which was merged prematurely. Restoring the pre-merge state so #14817
and #15328 can be revisited as standing PRs.

Reverted commits:
- afb958829 fix(computer-use): harden image-rejection fallback + AUTHOR_MAP
- b4a8031b2 fix(computer-use): unwrap _multimodal tool results
- 413ee1a28 feat(computer-use): background focus-safe backend
- dad10a78d feat(computer-use): cua-driver backend, universal any-model schema

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 01:57:21 -07:00
Teknium
cf0852f92e
feat(claw-migrate): harden OpenClaw import with plan-first apply, redaction, and pre-migration backup (#16911)
* feat(claw-migrate): harden OpenClaw import with plan-first apply, redaction, and pre-migration backup

Adopts four design patterns from OpenClaw's reciprocal migrate-hermes
importer so both migration paths have the same safety posture.

- **Refuse-on-conflict apply.** 'hermes claw migrate' now refuses to
  execute when the plan has any conflict items, unless --overwrite is
  set. Previously the user could say 'yes, proceed' and end up with a
  silent partial migration that skipped every conflicting item.
- **Engine-level secret redaction.** The report.json and summary.md
  written to disk (and --json stdout) run through a redactor that
  matches OpenClaw's key-name markers and value-shape patterns
  (sk-*, ghp_*, xox*-, AIza*, Bearer *). Prevents accidental API key
  leakage in bug reports and support channels.
- **Pre-migration tarball snapshot.** Apply creates one timestamped
  restore-point archive of ~/.hermes/ at ~/.hermes/migration/pre-migration-backups/
  before any mutation, excluding regenerable directories
  (sessions, logs, cache). Opt out with --no-backup.
- **Blocked-by-earlier-conflict sequencing.** If a config.yaml write
  hits conflict/error mid-apply, subsequent config-mutating options
  are marked skipped with reason 'blocked by earlier apply conflict'
  rather than attempting partial writes.
- **Structured warnings[] and next_steps[] on the report** — actionable
  guidance surfaces in both JSON output and summary.md.
- **--json output mode** — emits the redacted report on stdout for CI.

Also flips --preset full to NOT auto-enable --migrate-secrets. Users
now have to opt in to secret import explicitly, mirroring OpenClaw's
two-phase posture.

Status/kind/action constants are defined (STATUS_MIGRATED etc) with
values that match the existing strings the script emits, so the
report schema is backward-compatible. ItemResult gains a 'sensitive'
bool field that redaction and consumers can key off.

Validation: 26 new unit tests + 1 updated test in tests/skills/
test_openclaw_migration_hardening.py and test_claw.py cover redaction
(key markers, value patterns, recursion, on-disk), warnings/next_steps,
blocked-by-earlier sequencing, --json mode, and the preset-flip.
Manual E2E against a fake $HERMES_HOME with real-shaped secrets
confirmed: (1) secrets never appear in stdout or on disk,
(2) _cmd_migrate refuses apply when plan has conflicts,
(3) --overwrite proceeds past the guard and the backup tarball is
created, (4) --no-backup skips the archive.

Related docs: website/docs/guides/migrate-from-openclaw.md and
website/docs/reference/cli-commands.md updated to reflect the
preset-flip and new --no-backup flag.

* refactor(claw-migrate): reuse hermes backup system for pre-migration snapshot

Drops the inline tarball in hermes_cli/claw.py in favor of
hermes_cli.backup.create_pre_migration_backup(), which shares an
implementation with create_pre_update_backup via a new
_write_full_zip_backup helper.  Benefits:

- Consistent exclusion rules with hermes backup (_EXCLUDED_DIRS,
  _EXCLUDED_SUFFIXES, _EXCLUDED_NAMES — single source of truth).
- SQLite safe-copy via _safe_copy_db (state.db restores cleanly).
- Zip format restorable with 'hermes import <archive>'.
- Lives under ~/.hermes/backups/pre-migration-*.zip alongside
  pre-update-*.zip — one place for all snapshot archives.
- Auto-prune rotation with separate keep counters (pre-migration
  keeps 5, pre-update keeps 5, they don't touch each other's files).

7 new tests in tests/hermes_cli/test_backup.py lock the contract:
directory location, shared exclusion rules, _validate_backup_zip
acceptance (i.e. restorable with 'hermes import'), non-recursive
into prior backups, rotation, missing-home handling, and the
invariant that pre-migration rotation never touches pre-update
backups.

Help text and docs updated — the restore hint now says
'hermes import <name>' instead of 'tar -xzf <archive> -C ~/'.

* chore(claw-migrate): use backup._format_size and drop duplicate output line

Minor polish using another existing primitive from hermes_cli.backup:

- Show backup archive size with _format_size (e.g. '(245 B)' or '(2.4 MB)')
  matching the format hermes backup already uses.
- Drop the duplicate 'Pre-migration backup saved' line after Migration
  Results — the earlier 'Pre-migration backup: <path> (<size>)' line
  already surfaces the path before apply runs.

---------

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
2026-04-28 01:50:23 -07:00
ThomassJonax
2f9243c333 fix(session): make SQLite transcript rewrites transactional 2026-04-28 01:49:46 -07:00
crayfish-ai
f3371c39a4 fix(auxiliary): custom provider URL rewrite + main_runtime model for title gen
- auxiliary_client: apply _to_openai_base_url() to custom base_url
  (fixes /anthropic → /v1 rewrite missing for provider="custom")
- auxiliary_client: use main_runtime.get("model") instead of _read_main_model()
  so auxiliary tasks follow system default model changes
- title_generator: thread main_runtime through generate_title → auto_title_session → maybe_auto_title
- cli.py / gateway/run.py: pass main_runtime to maybe_auto_title
- tests: update mock assertions for new main_runtime parameter
2026-04-28 01:47:25 -07:00