* feat(mcp): raise default tool-call timeout 120s -> 300s
Port from openai/codex#28234. Long-running MCP tools (web fetches,
sandboxed builds, deep-research servers) routinely exceed 120s, causing
spurious timeout failures. Codex bumped its default MCP tool timeout from
120 to 300 for the same reason.
- _DEFAULT_TOOL_TIMEOUT 120 -> 300 in tools/mcp_tool.py (per-server
'timeout' config override unchanged)
- update test_default_timeout assertion
- document the default in mcp-config-reference.md
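A minimal sketch of how the new default and the per-server override could interact. Only `_DEFAULT_TOOL_TIMEOUT` and the 'timeout' key come from the change above; the `resolve_tool_timeout` helper and the shape of `server_config` are assumptions for illustration.

```python
_DEFAULT_TOOL_TIMEOUT = 300  # seconds; was 120 before this change


def resolve_tool_timeout(server_config: dict) -> float:
    """Return the per-server 'timeout' if set, else the default (hypothetical helper)."""
    value = server_config.get("timeout")
    if value is None:
        return float(_DEFAULT_TOOL_TIMEOUT)
    return float(value)
```

A server that sets `timeout: 45` keeps its 45s limit; servers with no setting now get 300s.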
* fix(dump): show commit date instead of release date in hermes dump
The version line in `hermes dump` (the top of the /debug report) appended
the package release date in parentheses, which reads like a wall-clock
"generated at" timestamp and confuses support triage. Replace it with the
date the HEAD commit was actually made, resolved live via
`git log -1 --format=%cd --date=short`, kept next to the commit SHA.
On Docker/wheel installs with no .git the date resolves to '' and the
suffix is simply omitted (the baked SHA still identifies the build).
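The resolution and fallback described above can be sketched roughly as follows. The git invocation is the one quoted in the message; the function names and the exact layout of the version line are assumptions, not the actual `hermes dump` code.

```python
import subprocess


def head_commit_date(repo_dir: str) -> str:
    """Return HEAD's commit date (YYYY-MM-DD), or '' when git or .git is unavailable."""
    try:
        out = subprocess.run(
            ["git", "log", "-1", "--format=%cd", "--date=short"],
            cwd=repo_dir, capture_output=True, text=True, timeout=5, check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return ""
    return out.stdout.strip() if out.returncode == 0 else ""


def format_version_line(version: str, sha: str, commit_date: str) -> str:
    """Keep the date next to the SHA; omit it when none was resolved (assumed layout)."""
    suffix = f" ({sha}, {commit_date})" if commit_date else f" ({sha})"
    return f"Hermes {version}{suffix}"
```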
* fix(desktop): resolve electronDist dynamically + self-heal blocked installs
Supersedes the static-path approach (#48081) and the install-step self-heal
(#48082) with a fix that removes the whole failure class instead of chasing each
symptom. Three distinct faults converged into the June desktop-build outage; this
closes all three.
Root cause (the part #48081 left open — "Gap B"):
build.electronDist was a static relative path in apps/desktop/package.json, but
npm workspace hoisting is NOT deterministic — depending on the npm version and
what else is installed, npm nests the workspace-only electron devDep under
apps/desktop/node_modules/electron OR hoists it to the repo root. A static path
matches only one layout, so a clean install intermittently fails with "The
specified electronDist does not exist". #48081 re-pointed the path at the
nested layout (correct today) but electron-builder reads electronDist
STATICALLY, so any future hoist change silently breaks it again — only caught
by a CI invariant, never self-corrected.
Fix:
- scripts/run-electron-builder.cjs: resolve electron the way Node's runtime does
— require.resolve("electron/package.json") walks node_modules from the desktop
project upward and finds electron wherever npm actually put it. The path can
never drift out of sync with the install layout again, on any OS/npm version.
* dist present -> pass -c.electronDist=<abs>/dist so electron-builder reuses
the unpacked runtime (keeps the #38673 fast path that dodges the 26.8.x
missing-binary re-unpack bug).
* dist absent -> omit electronDist; electron-builder fetches Electron itself
via @electron/get honoring electronVersion + ELECTRON_MIRROR.
package.json: builder script now runs the wrapper; the static build.electronDist
is removed (the resolver owns it).
- main.py / install.sh / install.ps1: on a dependency-install failure where the
electron package staged but its dist is missing (electron's install.js calls
process.exit(1) on a blocked/throttled binary download; see #47266/#47917/#48021),
repopulate the dist via electron's downloader (canonical, then npmmirror.com)
and CONTINUE to the build instead of aborting. npm runs postinstall LAST, so
the only casualty is electron/dist; bailing here is what made the pack-time
mirror self-heal unreachable on a blocked network. Hard-fail only when electron
never staged at all (a genuine dependency error).
- The pack-time mirror fallback now retries the build even when the pre-fetch
can't populate the dist: the wrapper lets electron-builder download Electron
itself via the mirror, so the retry is no longer a no-op (it was, when
electronDist was a static path).
The exact 40.10.2 pin (already on main) keeps the third fault from recurring: the
native @electron-internal/extract-zip win32 binding that 40.10.3/40.10.4 ship
without a published prebuild.
Tests:
- test_desktop_electron_pin.py: replace the static-path-matches-lockfile
invariant with contracts that there is no hardcoded electronDist to drift, the
builder script routes through the resolver, and the resolver uses Node module
resolution + injects -c.electronDist.
- test_gui_command.py: install-failure self-heal continues to build; genuine
(electron-never-staged) install failure still hard-fails; pack retries under
the mirror even when the pre-fetch is blocked.
Salvages/supersedes the overlapping community work in #48003 (sitkarev),
#48012 (omegazheng), #48033 (james47kjv), and #48082.
Co-authored-by: sitkarev <59806492+sitkarev@users.noreply.github.com>
Co-authored-by: omegazheng <zheng@omegasys.eu>
Co-authored-by: james47kjv <220877172+james47kjv@users.noreply.github.com>
* fix(desktop): narrow Electron self-heal to real missing-dist failures
Follow-up on #48091 to remove the remaining misdiagnosis risk from the
installer/build fallback path (#46785 concern): only take the Electron
repair/retry path when Electron's package files are staged and dist is actually
missing/corrupt.
- main.py: add _electron_pkg_staged_missing_dist() and use it to gate install
failure recovery; fail fast for unrelated npm install errors.
- main.py/install.sh/install.ps1: run cache purge + retry only when dist is
missing; do not retry unrelated tsc/vite/build failures under an
Electron-specific narrative.
- install.sh/install.ps1: tighten install-stage self-heal guard to require both
package.json + install.js and missing dist.
- tests: add coverage that install failure hard-fails when Electron dist already
exists, and update retry test to reflect the tightened recovery condition.
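A rough sketch of the tightened gate, assuming "missing" means the dist directory is absent or empty. The real `_electron_pkg_staged_missing_dist()` may check more (for example a corrupt dist); the signature here is hypothetical.

```python
from pathlib import Path


def electron_pkg_staged_missing_dist(electron_dir: Path) -> bool:
    """True only when Electron's package files exist but dist/ is absent or empty."""
    staged = (electron_dir / "package.json").is_file() and (
        electron_dir / "install.js"
    ).is_file()
    if not staged:
        return False  # Electron never staged: treat as a genuine dependency error
    dist = electron_dir / "dist"
    return not dist.is_dir() or not any(dist.iterdir())
```

Only a `True` result should route into the repair/retry path; every other install or build failure fails fast.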
Validation:
- Python tests: 64 passed
- install.sh-related tests included in the run
- Real mac build on this machine:
- npm ci at repo root: success
- cd apps/desktop && npm run pack: success
- electron-builder packaged darwin arm64 and used custom unpacked Electron dist
* refactor(desktop): trim electron self-heal helpers and comments
Deduplicate mirror-retry into _try_redownload_electron_dist / shell
counterparts; shorten wrapper and install-script commentary without
changing recovery semantics.
---------
Co-authored-by: sitkarev <59806492+sitkarev@users.noreply.github.com>
Co-authored-by: omegazheng <zheng@omegasys.eu>
Co-authored-by: james47kjv <220877172+james47kjv@users.noreply.github.com>
- test_ws_transport.py: drives WebSocketRelayTransport against a REAL in-process
websockets server (not a mock socket): handshake (hello->descriptor), inbound
frame -> handler, outbound request/response correlation, follow_up routing,
and clean disconnect failing pending waiters. Skips if websockets is absent.
- test_relay_registration.py: rewritten for the config-driven gate — registers
when GATEWAY_RELAY_URL is set / an explicit url is passed / force=True; no-op
without a URL; trailing slash stripped; adapter constructs through the registry.
Full relay suite: 57 passed.
The relay outbound surface had send/edit/typing but no way to act on a
SHARED-identity capability (e.g. a Discord interaction follow-up token,
~15min) that the connector captured + stripped at the edge. Under A2 that
credential never reaches the gateway, so the gateway can't just 'send with
the token' — it needs a semantic op naming the session it's already in.
Adds the follow_up op end to end on the gateway side:
- RelayTransport.send_follow_up(action): protocol method. Action carries
op='follow_up' + session_key + kind + content (+ metadata) and NO token.
- RelayAdapter.send_follow_up(session_key, kind, content, metadata): builds
that action and returns a SendResult. The connector resolves the real
capability (its resolveOutboundCapability), enforces the tenant match so
tenant B can't wield tenant A's capability, and egresses; success=False
when the capability is absent/expired/mismatched (nothing to retry — a
leaked gateway holds zero capability material).
- StubConnector records follow_ups + a canned next_follow_up_result.
Tests: round-trips without a token; the wire action carries only session
refs (no credential value field — the 'kind' string is a type ref, not the
secret); failure surfaces when the connector can't resolve; no-transport
fails cleanly. 55 passed. §4 doc entry follows in the contract-rewrite commit.
Under the A2 trust model the connector is the SOLE crypto/identity
boundary: it verifies/decrypts every inbound platform payload at the edge
(it holds the tenant secrets), normalizes to a tenant-scoped MessageEvent,
and forwards only the sanitized event. The gateway re-validates nothing —
it cannot without being handed the shared signing secret, which on a
shared bot is itself the cross-tenant leak.
The relay path already imports no platform-crypto today; this locks that
in as an enforced invariant so nobody bolts re-validation (Discord
ed25519, Twilio HMAC, WeCom BizMsgCrypt, generic webhook signature checks)
onto the relay later and silently re-couples the gateway to platform
secrets it must never hold. Verification stays in the direct platform
adapters (gateway/platforms/*) which serve non-relay deployments.
- test_relay_package_imports_no_platform_crypto: AST-walks gateway/relay/*
and fails on any import of a platform-crypto/verification module.
- test_relay_package_calls_no_signature_verification: fails on any
verification-symbol reference (ed25519/hmac/bizmsg/verify_*).
Invariants (assert the relation 'relay re-validates nothing'), not frozen
snapshots. Verified the guard bites: injecting a wecom_crypto import makes
it fail, removing it goes green. docs §6 rewrite follows in a later commit.
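The import guard can be approximated with the standard `ast` module. This is a sketch of the technique only: the forbidden-module set below is illustrative, and the helper name is not taken from the test file.

```python
import ast

# Illustrative names only; the real test keeps its own list.
FORBIDDEN_MODULES = {"wecom_crypto", "nacl", "hmac"}


def forbidden_imports(source: str) -> list:
    """Return the forbidden modules imported anywhere in the given source text."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        # Flag a dotted path if any component is a forbidden module name.
        hits += [n for n in names if any(p in FORBIDDEN_MODULES for p in n.split("."))]
    return hits
```

Running this over every file under gateway/relay/ and asserting an empty result gives a relation-style invariant: it passes today and fails as soon as a crypto import is introduced.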
The Phase 1 exit gate requires BOTH Discord and Telegram to round-trip
through the relay stub, but test_relay_roundtrip.py only covered Discord.
Add the Telegram companion exercising its distinct discriminator profile:
- no guild_id — two chats isolate on chat_id alone
- forum topics share one chat_id and isolate by thread_id (the Telegram
analog of Discord per-guild isolation), shared across participants by
default (thread_sessions_per_user=False)
- DM isolation by chat_id
- utf16 len_unit + markdown_v2 dialect round-trip and configure the adapter
- outbound send round-trips through the stub
Proves the CapabilityDescriptor + build_session_key generalize beyond
Discord, not just the struct (which the descriptor unit tests already
covered).
Add an invariant test pinning docs/relay-connector-contract.md to the
Python source of truth so the doc (which the connector repo mirrors by
hand) cannot silently drift:
- CapabilityDescriptor §2 table ⟷ dataclass fields + required/optional
- SessionSource wire keys (to_dict output) ⟷ §3 documented fields
- per-platform discriminator columns exist as real SessionSource fields
- guard that is_bot stays off the wire until deliberately promoted
Writing the test surfaced a real gap: §3 only enumerated 5 discriminators
in its per-platform table while to_dict() emits 12 keys. Seven wire keys
the connector must populate (chat_name, chat_topic, user_id_alt,
chat_id_alt, parent_chat_id, message_id, user_name) were undocumented —
a connector author reading the doc would never know to set them. Added a
complete SessionSource wire-field table to §3. The connector's existing
contract.ts already carries all 12, so no connector change is needed; the
doc was the lagging artifact.
The platform-connected-checker invariant test requires every built-in
Platform enum member to have either a generic token path or a bespoke
entry in _PLATFORM_CONNECTED_CHECKERS. Platform.RELAY was added without
one, so test_all_builtins_have_checker_or_generic_token_path failed.
Relay dials OUT to a connector and is 'connected' once an endpoint URL
is configured (extra['relay_url'] or extra['url']); the capability
descriptor is negotiated at handshake time, so the URL is the only
config-level signal in the experimental phase. Add the checker plus a
synthetic-config case exercising its True path.
CI guard: fails if gateway/ or plugins/ ever imports the test-only stub
connector or defines StubConnector. Matches code leaks (imports / class defs),
not prose mentions, so the transport.py docstring reference to the stub's path
is allowed.
Phase 1 complete. Task 1.6 of the gateway-relay plan.
RelayAdapter.on_interrupt(session_key, chat_id) bridges a connector-delivered
mid-turn /stop into the existing interrupt_session_activity path, setting the
per-session _active_sessions Event and clearing typing — cancelling exactly the
targeted session's turn without touching siblings (mirrors test_stop_thread_
sibling isolation). Transport.send_interrupt carries the gateway-side egress to
the connector for socket-owner routing.
Phase 1, Task 1.4 of the gateway-relay plan.
register_relay_adapter() registers the generic 'relay' platform via the same
PlatformRegistry path as plugin adapters — no core dispatch changes. OFF by
default (dark-launch): only registers when HERMES_GATEWAY_RELAY is truthy (or
force=True for tests), so existing single-tenant/direct deployments are
unaffected. Factory builds a transport-less RelayAdapter with a placeholder
descriptor; the real descriptor is negotiated at handshake.
Phase 1, Task 1.3 of the gateway-relay plan.
Defines RelayTransport (lifecycle/handshake/inbound/outbound/interrupt) as the
gateway<->connector wire contract; RelayAdapter.connect now registers an inbound
handler that bridges connector-delivered MessageEvents into handle_message.
Adds an in-memory StubConnector under tests/ and an E2E round-trip proving:
connect registers the handler, inbound events reach the adapter, guild_id drives
build_session_key isolation (two guilds -> two keys; same guild/channel/user ->
one), outbound send round-trips, get_chat_info is proxied.
Phase 1, Task 1.2 of the gateway-relay plan.
One BasePlatformAdapter subclass that reads its capability profile from a
CapabilityDescriptor: MAX_MESSAGE_LENGTH attribute, message_len_fn (table-driven
by len_unit: chars=len, utf16=Telegram-style code units), supports_draft_streaming.
Implements the four abstract methods (connect/disconnect/send/get_chat_info) by
delegating to an injected RelayTransport (full protocol lands in Task 1.2). Adds
Platform.RELAY enum member. No per-platform gateway code.
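A small sketch of the table-driven length function, assuming `len_unit` takes the two values named above. The `LEN_FNS` table and `utf16_len` helper are illustrative names.

```python
from typing import Callable, Dict


def utf16_len(text: str) -> int:
    """Length in UTF-16 code units (characters outside the BMP count as two)."""
    return len(text.encode("utf-16-le")) // 2


# len_unit -> length function
LEN_FNS: Dict[str, Callable[[str], int]] = {"chars": len, "utf16": utf16_len}


def message_len_fn(len_unit: str) -> Callable[[str], int]:
    """Look up the length function for a descriptor's len_unit."""
    return LEN_FNS[len_unit]
```

For a string holding one ASCII letter and one emoji, `chars` reports 2 while `utf16` reports 3, which is why the unit has to come from the descriptor.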
Phase 1, Task 1.1 of the gateway-relay plan.
CapabilityDescriptor.from_platform_entry() projects an existing PlatformEntry
(label, max_message_length, emoji, platform_hint, pii_safe, name) into a
descriptor, proving the descriptor is a projection of existing config rather
than a parallel concept. Runtime-only capabilities (len_unit, draft/edit/
thread/markdown) are caller-supplied. max_message_length==0 ('no limit') maps
to the stream_consumer 4096 default.
Phase 0 complete. Task 0.3 of the gateway-relay plan.
Behavioral regression harness locking the capability surface that the future
RelayAdapter must reproduce: the abstract-method set (connect/disconnect/send/
get_chat_info), message_len_fn default, supports_draft_streaming default, and
the stream_consumer MAX_MESSAGE_LENGTH attribute read. Passes on main before
any RelayAdapter exists.
Phase 0, Task 0.1 of the gateway-relay plan.
After the June lockfile regeneration (#46652) floated electron and reshuffled
npm workspace hoisting, the desktop pack fails with "The specified electronDist
does not exist". apps/desktop/package.json pointed electronDist at the repo
root (../../node_modules/electron/dist) while npm now installs electron nested
under apps/desktop/node_modules/electron. The two contradict, so a clean
install can never package the app (Windows + macOS).
- electronDist -> node_modules/electron/dist (resolved relative to apps/desktop,
i.e. the workspace-local install npm actually produces).
- hermes_cli/main.py, scripts/install.sh, scripts/install.ps1: add a runtime
electron-dir resolver that prefers apps/desktop/node_modules/electron and
falls back to the root hoist, so dist checks + the mirror re-download work
under either npm layout.
- patch-electron-builder-mac-binary.cjs: try the workspace-local Electron.app
before the root hoist in the macOS binary-restore fallback (a sibling site that
no PR touched).
- test: assert build.electronDist resolves to where the lockfile installs
electron, so a future hoist change (root <-> nested) can't silently break it.
Salvages the overlapping work in #48003 (sitkarev), #48012 (omegazheng), and
#48033 (james47kjv).
Co-authored-by: sitkarev <59806492+sitkarev@users.noreply.github.com>
Co-authored-by: omegazheng <zheng@omegasys.eu>
Co-authored-by: james47kjv <220877172+james47kjv@users.noreply.github.com>
* fix(photon): preserve text in mixed iMessage attachments
When an iMessage bubble carried both text and an attachment, spectrum-ts'
inbound mapper returned only buildAttachmentMessage(...), dropping the user's
typed text before Hermes could see it. The Photon adapter then had no 'group'
content path, so the text was lost entirely.
- adapter.py: handle a new 'group' content type that flattens text + attachment
items, preserving the typed text alongside cached media (extracted shared
_normalize_binary_payload helper).
- sidecar: emit 'group' content in normalizeContent, and ship
patch-spectrum-mixed-attachments.mjs which patches spectrum-ts' pinned mapper
(at npm postinstall AND at sidecar startup, so existing installs self-heal).
Windows robustness fixes on top of the original PR:
- The patcher's CLI guard used 'import.meta.url === file://${argv[1]}', which
never matches on Windows (file:/// + drive letter) — it silently no-opped.
Switched to pathToFileURL(argv[1]).href.
- The patcher matched \n-joined strings, so a CRLF checkout (Windows git
autocrlf) defeated every replacement. It now normalizes CRLF->LF for matching
and restores the original EOL style on write.
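The CRLF handling can be illustrated in Python (the patcher itself is an .mjs script, so this is a rendering of the idea rather than its code; the function name is hypothetical).

```python
def replace_preserving_eol(text: str, old: str, new: str) -> str:
    """Match on LF-normalized text, then restore the file's original EOL style."""
    uses_crlf = "\r\n" in text
    normalized = text.replace("\r\n", "\n")
    patched = normalized.replace(old, new)
    return patched.replace("\n", "\r\n") if uses_crlf else patched
```

Here `old` and `new` are written with `\n` only, so one set of patterns covers both LF and CRLF checkouts.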
Co-authored-by: Yuhang Lin <yuhanglin@YuhangdeMac-mini.local>
* chore: map YuhangLin contributor email for attribution (#46513)
---------
Co-authored-by: Yuhang Lin <yuhanglin@YuhangdeMac-mini.local>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
A /title typed before any message in a fresh desktop chat could be silently
lost: the session DB row is deferred to the first prompt, so session.title
found no row, only stashed pending_title, and returned pending:true. It then
relied on a post-turn apply block to write the title. When that turn never
landed under the same session_key (or the apply path didn't fire), the title
was dropped and the sidebar fell back to the first-message preview — e.g.
"/title my-custom-name" then "hello" left the session titled "hello".
Mirror the messaging gateway's _handle_title_command: an explicit /title is
clear user intent, not an abandoned draft, so create the row up front
(_ensure_session_db_row) and set the title immediately via the profile-aware
_session_db handle, returning pending:false. This also fixes the frontend
symptom for free — the desktop handler's immediate refreshSessions() now pulls
the correct persisted title instead of clobbering the optimistic value with a
still-NULL row.
If row creation can't take (DB unavailable / racing writer), fall back to the
existing pending_title queue so the post-turn apply block remains a recovery
path. The sidebar's min-messages filter keeps a titled 0-message row hidden, so
a /title'd-but-never-used draft still doesn't clutter the list.
Updates the test that asserted the old queue-on-missing-row behavior and adds a
fallback-to-queue regression test.
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
* feat(search_files): path-grouped lossless densification of content matches
Content-mode search_files results repeat the {path,line,content} JSON keys
and the full path string for every match. Group consecutive same-path matches
under one path header with indented '<line>: <content>' rows — lossless (every
path/line/content byte preserved), self-describing (matches_format key), and
readable by the model with no decode step.
57.8% mean token reduction on real search_files content outputs (422-output
corpus), fires on 97% of them. Gated at >=5 matches; below that the verbose
array is left untouched. Default to_dict(densify=False) is unchanged, so no
other caller is affected.
ripgrep emits matches path-ordered, so consecutive grouping never reorders
results.
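A sketch of the grouping step under stated assumptions: the two-space row indent and the `densify` function name are illustrative, and the real output also carries a matches_format key that is omitted here.

```python
from itertools import groupby
from typing import List, Union

MIN_MATCHES = 5  # below this gate the verbose array is left untouched


def densify(matches: List[dict]) -> Union[str, List[dict]]:
    """Group consecutive same-path matches under one path header."""
    if len(matches) < MIN_MATCHES:
        return matches
    lines = []
    for path, group in groupby(matches, key=lambda m: m["path"]):
        lines.append(path)
        lines.extend(f"  {m['line']}: {m['content']}" for m in group)
    return "\n".join(lines)
```

Because `groupby` only merges adjacent items, the output order always matches the input order; the path-ordered ripgrep output is what makes each path appear once.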
* test: accept densify kwarg in _FakeSearchResult.to_dict
The search loop-detection tests stub SearchResult with a fake whose
to_dict() must mirror the real signature now that it takes densify=.
* test(search_files): edge-case losslessness battery for densification
Adversarial single-line content (colons, indentation, unicode/emoji, empty,
trailing whitespace, quotes+commas), paths with spaces, and an explicit
one-line-per-match invariant documenting the ripgrep contract the format
relies on (0/6775 real match contents contained a newline).
* fix(logging): alias RotatingFileHandler to concurrent-log-handler
On Windows, stdlib RotatingFileHandler.doRollover() uses os.rename(), which
fails with PermissionError [WinError 32] whenever another process holds an
append-mode handle on agent.log — essentially always in Hermes (TUI, gateway,
hy_memory server, MCP servers, and on-demand CLI commands all log from separate
processes). This pinned agent.log at the 5 MiB threshold and spammed stderr
with a traceback on every emit (#44873).
Add concurrent-log-handler==0.9.29 as a core dep and alias its
ConcurrentRotatingFileHandler as RotatingFileHandler in hermes_logging.py. It
wraps the rename in a cross-process file lock (via portalocker: pywin32 on
Windows, fcntl on POSIX) so only one process rotates at a time. Aliasing keeps
every existing isinstance/class-declaration reference working unchanged.
Co-authored-by: tuancookiez-hub <tuancookiez@gmail.com>
* fix(logging): gate concurrent-log-handler swap to Windows only
The initial salvage aliased RotatingFileHandler -> ConcurrentRotatingFileHandler
unconditionally, which regressed POSIX: CLH opens lazily and rotates via its own
lock path, breaking managed-mode (NixOS) group-writable perms and eager file
creation that _ManagedRotatingFileHandler depends on. CI caught it as 2 failures
in test_managed_mode_*_group_writable on Linux.
The WinError 32 bug (#44873) is Windows-specific — POSIX renames an open file
fine, so stdlib already works on Linux/macOS. Gate the swap behind
sys.platform == 'win32': Windows uses CLH, POSIX keeps stdlib RotatingFileHandler.
- hermes_logging.py: platform-conditional import.
- tests/test_hermes_logging.py: import RotatingFileHandler from hermes_logging
(single source of truth) so the autouse fixture's isinstance checks match the
real handler class on both platforms.
- pyproject.toml/uv.lock: mark the dep 'sys_platform == "win32"' so portalocker
/pywin32 only ship where used.
---------
Co-authored-by: tuancookiez-hub <tuancookiez@gmail.com>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
Follow-up hardening on @ehz0ah / @harshitAgr's session-switch work (#28296):
- on_session_switch no longer runs the old-session writer-drain + pending-token
GET + commit POST inline on the caller's command thread. /new, /branch,
/resume, /undo call it synchronously, so a slow drain (up to 10s) or wedged
commit blocked the user-facing command — the same hazard #41945 fixed for
end-of-turn sync. State now rotates synchronously (cheap) and the old-session
commit is offloaded to a daemon finalizer (generalized _finalize_session_async).
- Guard the (_session_id, _turn_count) pair with _session_state_lock: sync_turn
runs on the memory-manager executor thread while the session hooks run on the
command thread, so the snapshot+reset vs increment was a cross-thread race.
- _session_needs_commit checks the committed-session guard BEFORE the
turn_count>0 shortcut, closing a double-commit window when a racing sync_turn
re-increments after commit+reset.
- Add a _shutting_down flag so deferred finalizers stop POSTing against a
torn-down client; track all prefetch threads in a set so invalidate/shutdown
join every one, not just the latest slot.
Tests: regression for the non-blocking switch (asserts the caller returns while
a slow drain is parked off-thread) and the committed-guard ordering; updated the
deferred-commit test to the unified finalizer contract.
* fix(desktop): pin Electron below the broken native extract-zip install
The Windows desktop install fails at "Building desktop app": Electron's
postinstall aborts with `ERR_DLOPEN_FAILED loading
index.win32-x64-msvc.node` / "Cannot find native binding" from
`@electron-internal/extract-zip`.
Root cause is a dependency drift, not the user's machine. Electron changed
its install mechanism mid-patch-series:
electron 40.9.3 .. 40.10.2 -> @electron/get@^2 + extract-zip@^2 (pure JS)
electron 40.10.3 / 40.10.4 -> @electron/get@^5 + @electron-internal/extract-zip@^1 (native napi)
apps/desktop declares `electronVersion: 40.9.3` (the tested, JS-extract
build) but pinned the dependency as `electron: ^40.9.3`, so `npm ci`/`npm
install` silently resolved 40.10.3/40.10.4 — onto the brand-new native
extract-zip whose win32-x64 binding fails to dlopen on some Windows hosts.
The committed lockfile already carried 40.10.3, and the installer's mirror
fallback can't help (it re-runs Electron's own `install.js`, which uses the
same broken native module).
Fix:
- Pin `electron` to an exact `40.10.2` — the newest build before the native
extract-zip switch — and align `build.electronVersion` to match (Electron
Builder needs electronVersion/electronDist to match the installed binary).
- Add a root `yauzl: ^3.3.1` override so the (re-introduced) JS extract-zip
path also works on Node >= 24.16 / >= 26.1, where the old yauzl hangs.
This is the same workaround the wider Electron ecosystem adopted.
- Regenerate package-lock.json: drops @electron-internal/extract-zip and
@electron/get@5, restores @electron/get@2 + extract-zip@2 + yauzl@3.4.0.
* test(desktop): lock the Electron pin/version/lockfile consistency contract
Guards against the dependency drift that broke the Windows desktop install:
the Electron dependency must be an exact version, must equal
build.electronVersion, and the lockfile must resolve to that same version so
`npm ci` installs exactly what electron-builder packages. Asserts the
relationships, not a specific version number.
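The three relationships can be expressed as a small checker. This is a sketch, not the contents of test_desktop_electron_pin.py; the function takes already-extracted version strings so it makes no assumption about where the lockfile stores the electron entry.

```python
import re
from typing import List

EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+$")  # no ^, ~ or range operators


def electron_pin_problems(dep_spec: str, builder_version: str, locked_version: str) -> List[str]:
    """Return the violated relationships; an empty list means the pin is consistent."""
    problems = []
    if not EXACT_VERSION.match(dep_spec):
        problems.append("dependency is not an exact version")
    if dep_spec != builder_version:
        problems.append("dependency != build.electronVersion")
    if dep_spec != locked_version:
        problems.append("dependency != lockfile version")
    return problems
```

The pre-fix state (`^40.9.3` declared, 40.9.3 in the builder config, 40.10.3 in the lockfile) violates all three checks.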
The model is callable via xAI OAuth but omitted from models.dev and
/v1/models listings. Merge it into the curated xAI catalog so it appears
in `hermes model` without requiring a custom model name.
Avoid applying text-only persist_user_message overrides to multimodal current-turn user messages. Early crash-resilience persistence mutates the same messages list later used for the API call, so clobbering list content drops ACP image blocks before model dispatch.
Add regression coverage for both text override behavior and multimodal preservation.
Closes #44242
* refactor: remove agent-callable send_message tool
The agent should not decide on its own to fire off cross-platform
messages or reactions. Outbound platform messaging is handled outside
the agent loop — cron delivery, the gateway kanban notifier
(dashboard-toggled), and the `hermes send` CLI.
Removes the model-tool registration only; the send engine in
send_message_tool.py (_send_to_platform, _send_via_adapter,
_parse_target_ref, per-platform _send_* helpers) is kept intact for
those non-agent callers. Drops the now-empty 'messaging' toolset and
its `hermes tools` toggle. Yuanbao DM guidance now points at the
native yb_send_dm tool.
restore_skill() falls back to p.name.startswith(f"{skill_name}-") when no
archive directory matches the requested name exactly. That fallback is meant
to catch the timestamped duplicate archive_skill() writes on a name collision
(<skill>-YYYYMMDDHHMMSS), but the bare prefix also matches any unrelated
archived skill named <name>-something. So restoring "git" can pull an archived
"git-helpers" out of .archive/, rename it to "git", and report success: the
requested skill is not restored and the sibling is gone from the archive.
Constrain the fallback to the exact suffix archive_skill() produces, a 14-digit
timestamp. The exact-name match and the recursive nested-archive walk are
unchanged, so nested and timestamped restores still work; unrelated siblings no
longer match.
Fixes #47647
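A sketch of the constrained match, assuming the archive directory name is either the bare skill name or the name plus a 14-digit timestamp. The helper name is hypothetical; the real check lives inside restore_skill().

```python
import re


def matches_archived_name(dir_name: str, skill_name: str) -> bool:
    """Accept the exact name, or only the '<skill>-YYYYMMDDHHMMSS' collision suffix."""
    if dir_name == skill_name:
        return True
    return re.fullmatch(re.escape(skill_name) + r"-\d{14}", dir_name) is not None
```

With this rule, restoring "git" can no longer pick up an archived "git-helpers".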
The OpenAI device-code login (POST auth.openai.com/.../deviceauth/usercode)
had no retry or 429 handling — a transient throttle from OpenAI surfaced as
a bare "Device code request returned status 429" with no guidance, reading
as a hard login failure.
- Retry the device-code request with capped exponential backoff (honoring
Retry-After), up to 4 attempts.
- On persistent 429, raise a clear AuthError tagged CODEX_RATE_LIMITED_CODE
(classified transient, not a credential problem) with a wait hint.
- Apply the same 429 classification to the token-exchange step (same bug
class).
Unrelated to PR #47399 (Responses-API cache headers); this is the OAuth
device-code path in hermes_cli/auth.py.
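A minimal sketch of capped exponential backoff with Retry-After, under stated assumptions: the delay constants, the `send` callback shape, and `RateLimitedError` (a stand-in for the tagged AuthError) are all illustrative. Only the delta-seconds form of Retry-After is handled.

```python
import time
from typing import Callable, Optional, Tuple

MAX_ATTEMPTS = 4   # from the change above
BASE_DELAY = 1.0   # seconds; assumed
MAX_DELAY = 8.0    # cap; assumed


class RateLimitedError(Exception):
    """Stand-in for an AuthError classified as transient rate limiting."""


def request_with_backoff(
    send: Callable[[], Tuple[int, Optional[str]]],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it stops returning 429; send returns (status, Retry-After)."""
    for attempt in range(MAX_ATTEMPTS):
        status, retry_after = send()
        if status != 429:
            return status
        if attempt == MAX_ATTEMPTS - 1:
            break  # out of attempts; do not sleep again
        delay = min(BASE_DELAY * 2 ** attempt, MAX_DELAY)
        if retry_after and retry_after.isdigit():
            delay = min(float(retry_after), MAX_DELAY)  # server hint wins, still capped
        sleep(delay)
    raise RateLimitedError("still rate limited after retries; wait before trying again")
```

Injecting `sleep` keeps the retry logic testable without real waiting.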
Context files (AGENTS.md, CLAUDE.md, .hermes.md, .cursorrules, SOUL.md) were
hard-capped at a flat 20K chars before head/tail truncation. Among the agent
harnesses we track, only Codex caps project docs at all (32 KiB); Claude Code,
OpenCode, and Cline load them whole. The flat 20K predates large context
windows and silently truncates real-world AGENTS.md files.
B — dynamic cap: when context_file_max_chars is unset (now the shipped
default), the cap scales with the model's context window
(ctx_tokens * 4 * 0.06, floor 20K, ceiling 500K). Small-context models stay at
the historical 20K; a 200K model gets 48K; large models stop truncating real
docs. An explicit context_file_max_chars still wins. Context length is resolved
once per conversation (stable -> prompt cache untouched).
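The scaling rule works out as below. Integer arithmetic stands in for the `* 0.06` factor to avoid float rounding; reading the factor 4 as characters per token is an assumption, as is the function name.

```python
from typing import Optional

CHARS_PER_TOKEN = 4      # assumed meaning of the "* 4" factor
FLOOR_CHARS = 20_000
CEILING_CHARS = 500_000


def context_file_cap(ctx_tokens: int, configured: Optional[int] = None) -> int:
    """Return the per-file character cap; an explicit setting always wins."""
    if configured is not None:
        return configured
    scaled = ctx_tokens * CHARS_PER_TOKEN * 6 // 100  # 6% of the window, in chars
    return max(FLOOR_CHARS, min(scaled, CEILING_CHARS))
```

A 200K-token window yields 200,000 x 4 x 0.06 = 48,000 chars; a 32K window computes 7,680 and is raised to the 20K floor.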
C — when truncation does happen, the marker now names the concrete file path
and tells the agent to read_file it for the full content.
Validation: 154 targeted tests + full agent/ + hermes_cli/ + test_config
(0 failures); E2E against a real 60K AGENTS.md confirms small windows truncate
with the path-bearing marker, large windows load whole, and the system prompt
is byte-stable across rebuilds.
Rolling back to the oldest curator snapshot failed and deleted that
snapshot. rollback() takes a safety snapshot first, and snapshot_skills()
ends by pruning the backups directory down to keep (5 by default). At the
steady keep limit that prune removed the oldest snapshot, which is the very
one being restored, so the extract found no skills.tar.gz and the rollback
stopped with "snapshot extract failed (state restored)".
Thread an optional protect set through snapshot_skills() into _prune_old()
so the pre-rollback safety snapshot can never evict the snapshot being
restored. Add two regression tests covering restore of the oldest snapshot
at the keep limit.
Fixes #47612
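A sketch of pruning with a protect set, assuming snapshots are ordered oldest to newest and that a protected snapshot is simply skipped (so the directory may briefly hold keep + 1 entries). The signature is illustrative, not the real _prune_old().

```python
from typing import Iterable, List


def prune_old(snapshots: List[str], keep: int = 5, protect: Iterable[str] = ()) -> List[str]:
    """Return the snapshot ids to delete: oldest beyond `keep`, never a protected one."""
    protected = set(protect)
    excess = max(0, len(snapshots) - keep)
    return [s for s in snapshots[:excess] if s not in protected]
```

At the keep limit, taking the safety snapshot makes six entries; without protection the oldest (the one being restored) is selected for deletion, with protection nothing is.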
The curator now defaults to prune-only: the deterministic inactivity pass
(mark stale / archive long-unused skills) still runs whenever the curator is
enabled, but the opinionated LLM umbrella-building consolidation fork is OFF
by default.
- agent/curator.py: add DEFAULT_CONSOLIDATE=False + get_consolidate(); gate
the forked aux-model review in run_curator_review behind it (new consolidate
param, None=read config). When off, the LLM pass is skipped entirely (no
aux-model cost); the run is still recorded and reported.
- config.py: add curator.consolidate (default false); v29->v30 migration seeds
the key for existing installs without clobbering a user-set value.
- hermes_cli/curator.py: 'hermes curator run --consolidate' override; status
shows consolidate state; prune-only notice on run.
- docs + tests.
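A rough sketch of how the default, the per-run override, and the migration
seed could fit together. Only DEFAULT_CONSOLIDATE, get_consolidate() and the
curator.consolidate key come from the notes above; the dict-shaped config and
the other two helper names are assumptions.

```python
from typing import Any, Dict, Optional

DEFAULT_CONSOLIDATE = False


def get_consolidate(config: Dict[str, Any]) -> bool:
    """Read curator.consolidate, falling back to the prune-only default."""
    curator = config.get("curator") or {}
    return bool(curator.get("consolidate", DEFAULT_CONSOLIDATE))


def should_consolidate(config: Dict[str, Any],
                       consolidate: Optional[bool] = None) -> bool:
    """Hypothetical gate: None means 'read config', a bool is an override."""
    return get_consolidate(config) if consolidate is None else consolidate


def seed_consolidate_key(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical migration step: add the key without clobbering it."""
    curator = config.setdefault("curator", {})
    curator.setdefault("consolidate", DEFAULT_CONSOLIDATE)
    return config
```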
When refresh_launchd_plist_if_needed() runs from inside the gateway's own
launchd process tree (agent-initiated self-update via the terminal tool), a
direct launchctl bootout tears down the service's process group — including
the CLI doing the refresh — before the follow-up bootstrap can run. The
gateway is left unloaded and KeepAlive can't revive it (#43842).
Detect in-service execution via gateway.status.get_running_pid() +
_is_pid_ancestor_of_current_process(), and delegate the bootout->bootstrap to
a detached (start_new_session=True) helper that survives the process-group
teardown. The normal out-of-tree CLI path is unchanged.
Fixes #43842.
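The two pieces can be sketched roughly as follows. This is an illustration
under assumptions, not the shipped code: is_pid_ancestor, its parent_of
lookup callback, and spawn_detached are hypothetical names; only
subprocess.Popen(start_new_session=True) is the mechanism named above.

```python
import subprocess
import sys
from typing import Callable, List, Optional


def is_pid_ancestor(target_pid: int, start_pid: int,
                    parent_of: Callable[[int], Optional[int]]) -> bool:
    """Walk the parent chain from start_pid; True if target_pid is on it.

    parent_of is an assumed lookup (pid -> parent pid or None). A process
    counts as its own ancestor here, which also covers the gateway itself.
    """
    seen = set()
    pid: Optional[int] = start_pid
    while pid is not None and pid > 0 and pid not in seen:
        if pid == target_pid:
            return True
        seen.add(pid)  # guard against cycles in a bad lookup table
        pid = parent_of(pid)
    return False


def spawn_detached(argv: List[str]) -> subprocess.Popen:
    """Start a helper in its own session so a process-group kill skips it."""
    return subprocess.Popen(
        argv,
        start_new_session=True,  # setsid() in the child on POSIX
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

A caller would check is_pid_ancestor(gateway_pid, os.getpid(), ...) and only
then hand the bootout and bootstrap sequence to spawn_detached.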
The double-underscore prefix swap fixed bare native tools but SKIPPED tools
already named mcp_<server>_<tool> (real MCP servers, e.g. mcp_linear_get_issue):
they went onto the OAuth wire with a single underscore and still tripped Anthropic's
third-party billing classifier -> HTTP 400 'extra usage, not plan limits'.
Verified empirically against a live Max subscription: a single mcp_ tool flips
the whole request to the extra-usage lane; mcp__ is accepted.
- build_anthropic_kwargs: promote ANY leading single-underscore mcp_ to mcp__
(bare names -> mcp__name; mcp_<server>_<tool> -> mcp__<server>_<tool>),
never double-prefixing an already-mcp__ name. Same for tool_use blocks in
history.
- normalize_response: reverse the mcp__ wire name back to whichever original
the registry knows — the single-underscore mcp_<server>_<tool> form for MCP
server tools, or the bare name for native tools — preferring a name that
already resolves natively.
- Tests rewritten to assert the invariant: ZERO single-underscore mcp_ names
reach the OAuth wire, and the mcp__ round-trip resolves back to the
registered name for both native and MCP-server tools.
Builds on liuhao1024's mcp__ prefix commit (cherry-picked). Closes the
MCP-server gap that left any session with an MCP server configured still
billing to extra usage.
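The round-trip invariant can be sketched like this. The function names and
the set-shaped registry are hypothetical, and the lookup order in
from_wire_name is one reading of "preferring a name that already resolves
natively", not a confirmed detail of normalize_response.

```python
from typing import Iterable

WIRE_PREFIX = "mcp__"
LEGACY_PREFIX = "mcp_"


def to_wire_name(name: str) -> str:
    """Promote any tool name to the double-underscore wire form."""
    if name.startswith(WIRE_PREFIX):
        return name  # already mcp__, never double-prefix
    if name.startswith(LEGACY_PREFIX):
        return WIRE_PREFIX + name[len(LEGACY_PREFIX):]
    return WIRE_PREFIX + name


def from_wire_name(wire: str, registry: Iterable[str]) -> str:
    """Map a wire name back to whichever original the registry knows."""
    known = set(registry)
    if not wire.startswith(WIRE_PREFIX):
        return wire
    rest = wire[len(WIRE_PREFIX):]
    # Assumed preference order: the name as-is, then the bare native name,
    # then the legacy single-underscore MCP-server form.
    for candidate in (wire, rest, LEGACY_PREFIX + rest):
        if candidate in known:
            return candidate
    return wire  # unknown tool: leave untouched


registry = {"read_file", "mcp_linear_get_issue"}
wire_names = sorted(to_wire_name(n) for n in registry)
# wire_names == ["mcp__linear_get_issue", "mcp__read_file"]
```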
Anthropic's Claude-Code request classifier treats tool names with a
single-underscore `mcp_<x>` prefix as non-Claude-Code / third-party,
routing the request to extra-usage billing (HTTP 400). Real Claude Code
uses double underscores: `mcp__<server>__<tool>`.
Change the tool-name prefix from `mcp_` to `mcp__` in both the outgoing
path (build_anthropic_kwargs) and the incoming path
(normalize_response). Update the skip-guard to check for both `mcp_`
and `mcp__` prefixes so native MCP server tools (which use the legacy
single-underscore format) are not double-prefixed.
Fixes #46675
`hermes login` was removed in favor of `hermes auth` / `hermes model`, but
the subparser still validated `--provider` against a hardcoded choices list
(nous, openai-codex, xai-oauth). Running `hermes login --provider anthropic`
therefore crashed in argparse with `invalid choice: 'anthropic'` *before* the
deprecation handler could print the redirect to `hermes model` — so a user
trying to authenticate a perfectly valid provider just saw a hard error and
assumed the feature was broken rather than relocated.
- Drop the restrictive `choices=` so every `--provider` value reaches the
deprecation handler (which ignores the value and prints guidance).
- Omit the subparser `help=` kwarg so the dead command no longer advertises
itself in `hermes --help` (#24756). Avoids the `==SUPPRESS==` placeholder
leak that `help=argparse.SUPPRESS` emits for a top-level subparser on 3.12+.
- `hermes login [--flags]` still reaches the actionable deprecation message
for old scripts/aliases; `hermes login --help` shows the redirect.
Picks up the intent of the inactivity-closed #24902, rebased onto the
post-refactor parser location (hermes_cli/subcommands/login.py) and extended
to fix the whole bug class (any provider value), not just hiding from --help.
Tests: parametrized provider acceptance + help-suppression (no SUPPRESS leak).
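A stripped-down illustration of the parser change, assuming a plain argparse
layout; build_parser and the "model" help text are placeholders, not the
contents of hermes_cli/subcommands/login.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="hermes")
    sub = parser.add_subparsers(dest="command")
    sub.add_parser("model", help="select a provider and model")
    # No help= kwarg: argparse then adds no description row for "login"
    # in the subcommand listing.
    login = sub.add_parser("login")
    # No choices=: any provider string parses, so the deprecation handler
    # (not shown) gets a chance to print the redirect.
    login.add_argument("--provider")
    return parser


args = build_parser().parse_args(["login", "--provider", "anthropic"])
# args.command == "login", args.provider == "anthropic"
```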
Regression coverage for the keystroke-latency fix: a URL token contains
"/", so the bare-slash path heuristic used to return it as a path word and
run os.listdir on every keystroke. Assert _extract_path_word rejects
http/https/ssh scheme tokens, that ordinary paths (incl. a bare colon) are
unaffected, and that the completer never touches the filesystem for a URL
under the cursor.
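A simplified sketch of the heuristic being tested. The real
_extract_path_word is not shown here, so the tokenisation and the regex
below are assumptions chosen to match the asserted behaviour.

```python
import re
from typing import Optional

# Scheme tokens that must never be treated as filesystem paths.
_URL_SCHEME = re.compile(r"^(?:https?|ssh)://", re.IGNORECASE)


def extract_path_word(text_before_cursor: str) -> Optional[str]:
    """Return the word under the cursor if it looks like a path, else None."""
    if not text_before_cursor or text_before_cursor[-1].isspace():
        return None
    word = text_before_cursor.split()[-1]
    if _URL_SCHEME.match(word):
        return None  # URL: skip before any os.listdir call can happen
    if "/" in word:
        return word  # bare-slash heuristic for ordinary paths
    return None
```

Because the scheme check runs before the slash check, a completer built on
this would return early for URLs and never reach the directory listing.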
sync_turn's bounded join could drop a still-alive previous worker by
replacing the single _sync_thread slot. The dropped worker kept POSTing
under the old sid but was no longer visible to on_session_end /
on_session_switch, so the commit could fire while orphaned writes were
still in flight — those writes landed past the commit boundary and were
never extracted.
Replace the single _sync_thread slot with _inflight_writers:
Dict[sid, Set[Thread]]. Writers self-register on spawn (sync_turn,
on_memory_write) and self-deregister on exit. The commit path calls
_drain_writers(sid, 10.0) and skips the commit if any writer for that
sid is still alive after the bounded budget.
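The registry can be sketched as below. WriterRegistry, spawn and drain are
illustrative names standing in for _inflight_writers and _drain_writers; in
this sketch the thread is registered just before start() to avoid a race,
which may differ from the real registration point.

```python
import threading
import time
from collections import defaultdict
from typing import Callable, Dict, Set


class WriterRegistry:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: Dict[str, Set[threading.Thread]] = defaultdict(set)

    def spawn(self, sid: str, work: Callable[[], object]) -> threading.Thread:
        def run() -> None:
            try:
                work()
            finally:
                # Self-deregister on exit, even if work() raised.
                with self._lock:
                    self._inflight[sid].discard(threading.current_thread())

        thread = threading.Thread(target=run, daemon=True)
        with self._lock:
            self._inflight[sid].add(thread)
        thread.start()
        return thread

    def drain(self, sid: str, budget: float) -> bool:
        """Join this sid's writers within budget; True if all have exited."""
        deadline = time.monotonic() + budget
        with self._lock:
            writers = list(self._inflight.get(sid, ()))
        for thread in writers:
            thread.join(max(deadline - time.monotonic(), 0.0))
        return not any(thread.is_alive() for thread in writers)
```

A commit path would call drain(sid, 10.0) and skip the commit on False.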
Also trim inline review-rationale comments to short invariants per
reviewer style ask: "commit only after session writes drain" and
"drop prefetch results from older switch generations."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit 7537ee6f5b)
Three follow-ups from review on #28296:
1. Sync worker outliving the bounded join. Each sync_turn POST has
_TIMEOUT=30s and there are two per turn, but on_session_end and
on_session_switch only join for 10s. If the worker is still alive
after the join, committing the old session orphans the worker's
late writes past the commit boundary — they land in an already-
committed session and never get extracted. Both hooks now re-check
is_alive() after the join and skip the commit when the worker
hasn't drained.
2. on_memory_write late session_id capture. Same shape as the
pre-fix sync_turn: f-string for the post path read self._session_id
inside the worker, so a switch between thread spawn and post call
landed the memory note in the new session. Snapshot sid at call
time, same pattern as sync_turn.
3. Stale prefetch repopulating the new session. The pre-switch
drain+clear only protects against workers that finish before the
join completes; one finishing after the clear would write its
result into the new generation's slot. Added a monotonic
_prefetch_generation; workers capture it at spawn and refuse to
write if it has advanced.
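Item 3 can be sketched as a generation-gated slot. PrefetchSlot and its
method names are hypothetical; only the monotonic counter that workers
capture at spawn and compare before writing comes from the description.

```python
import threading
from typing import Optional


class PrefetchSlot:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._generation = 0
        self._result: Optional[str] = None

    def begin(self) -> int:
        """Called at worker spawn: capture the current generation."""
        with self._lock:
            return self._generation

    def on_session_switch(self) -> None:
        """Advance the generation and clear the old session's result."""
        with self._lock:
            self._generation += 1
            self._result = None

    def publish(self, generation: int, result: str) -> bool:
        """Store result only if no switch happened since begin()."""
        with self._lock:
            if generation != self._generation:
                return False  # stale worker: refuse to write
            self._result = result
            return True

    def get(self) -> Optional[str]:
        with self._lock:
            return self._result
```

The comparison and the write happen under one lock, so a worker that
finishes after the switch cannot slip its result into the new slot.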
Tests: existing in-flight-sync test updated to drain (it tested the
join-before-commit happy path); four new tests cover hung-writer skip
on end + switch, on_memory_write sid capture, and prefetch generation
gating. 177/177 memory tests pass.
(cherry picked from commit 3791a87dbe)