patch_tool extracts V4A patch paths so _check_sensitive_path can refuse
writes to /etc/*, /boot/*, etc. before they reach the low-level file ops.
The extraction regex had two gaps:
1. `*** Move File: src -> dst` was never extracted (regex only matched
Update/Add/Delete), so a Move targeting /etc/crontab skipped the
pre-check and fell back on the narrower file_operations deny list.
2. The regex required `\\s+` after `***` but patch_parser uses `\\s*`, so
`***Update File: /etc/hosts` (no space) parsed + applied while
skipping the check.
Loosen the leading whitespace to \\s* and add a Move regex that checks
both endpoints. Move endpoints also run through the same '..' traversal
rejection as the other V4A headers (closes the sibling gap on current
main, which gained that traversal guard after this PR was opened).
`_resolve_api_key_provider_secret` resolved API keys via `get_env_value`,
which returns the `os.environ` value first and only falls back to
`~/.hermes/.env`. After a user rotates a key in `.env`, a stale value still
exported in the parent shell (Codex CLI, test runner, login profile) shadows
the fresh key on every request, producing persistent 401s.
The credential-pool seeding path was already fixed to prefer `.env`
(#18254/#18755), but the live request-time resolution path was not — so the
pool re-seeded with the fresh key while `_resolve_api_key_provider_secret`
kept returning the stale shell export. This closes that remaining path.
- config: add `get_env_value_prefer_dotenv()` — checks `~/.hermes/.env`
first, then `os.environ`. Distinct from `get_env_value()` (unchanged,
os.environ-first) so only Hermes-managed credential resolution flips
precedence; the generic helper's many callers are unaffected.
- auth: `_resolve_api_key_provider_secret` resolves through the new helper.
- tests: regression coverage for both the pool-seeding path and the
auth resolution path (a rotated `.env` key must beat a stale shell export).
Closes#20591.
Co-authored-by: 0xDevNinja <manmit0x@gmail.com>
Follow-up to the salvaged #8008 fix:
- Sibling-site fix: _evaluate_slash_authorization gated DISCORD_ALLOWED_CHANNELS /
DISCORD_IGNORED_CHANNELS on numeric IDs only, so name/#name config that now works
for on_message still silently failed for slash-command interactions. Refactor the
channel-key helper to _discord_channel_keys_from_channel(channel, parent) and reuse
it at the interaction gate. Fail-closed on missing channel id is preserved.
- The contributor's hardcoded 8s flush deadline could be hard-cancelled mid-flush:
_teardown_adapter already wraps cancel_background_tasks() in the per-adapter
disconnect budget (HERMES_GATEWAY_ADAPTER_DISCONNECT_TIMEOUT, default 5s). The flush
deadline now derives from that budget with headroom so it always completes inside it.
- AUTHOR_MAP: map cypher@augmentl.com -> Nickperillo for CI.
- Tests: slash-auth name/#name allow + name ignore matching.
Two related fixes to the Discord gateway adapter:
1. Channel name matching (free-response, allowed, ignored, no-thread channels)
Previously these config values only matched against numeric channel IDs.
If a user configured free_response_channels: cypher (by name), the adapter
would silently ignore it because it only intersected against channel_ids.
Now the adapter builds a channel_keys set that includes the channel ID,
channel name, and #channel-name form, and checks all three for each gate.
2. Flush pending text-batch tasks before shutdown
The Discord adapter uses _pending_text_batch_tasks (its own dict) for
merging rapid successive message chunks. These tasks were NOT added to
self._background_tasks (the base class list), so the base
cancel_background_tasks() never awaited them on restart/shutdown.
This caused a race: in-flight response deliveries were cancelled before
Discord had a chance to send them, resulting in silent dropped messages
visible to users as tool-log-only replies with no text body.
Fix: override cancel_background_tasks() in DiscordAdapter to await all
pending text-batch tasks (8s deadline) before delegating to the base class.
The gateway-lifecycle guard's hermes-CLI pattern required `hermes`
and `gateway` to be adjacent, so a profile flag slipped the agent
past it: `hermes -p ade gateway restart` was not flagged. That is the
exact form from the 2026-04-11 ade-profile self-kill loop. Allow an
optional run of global flags (`-p ade`, `--profile ade`, multiple
flags) between `hermes` and the gateway subcommand.
launchctl self-termination is already covered on main by #33071; this
narrows the only remaining real gap.
Update the shared queued-edit ref synchronously with React state so draft persistence sees the correct edit mode while loading and restoring queued prompts. Also drop the accidental node_modules symlink from the PR.
Ensure Windows desktop and local terminal teardown kill full process trees so Git Bash descendants cannot survive wrapper exits and accumulate across retries.
Audit follow-up. ChatBar subscribed to the whole `$statusItemsBySession` (a
computed that rebuilds the entire map) + `$previewStatusBySession` maps just to
derive a boolean, so every per-item status mutation (a subagent tick, the 5s
background poll) and every OTHER session's change re-rendered the ~1.4k
component. The queue hook likewise subscribed to the whole `$queuedPromptsBySession`
map.
- Add `useSessionStatusPresence` — a coarse edge (useSyncExternalStore) that
flips only when the stack shows/hides; ChatBar uses it for the styling
data-attr instead of the two map subscriptions.
- Add generic `useSessionSlice(store, key)` — subscribes to one session's array,
bailing out when other sessions churn (the plain atom keeps per-key refs
stable). The queue hook now reads its slice through it.
Result: ChatBar re-renders only when the stack's presence flips or this session's
queue changes — not on background/subagent status streaming or other sessions.
Verified: typecheck clean, 0 lint errors, composer tests 39/40 (pre-existing
attachments failure unrelated).
Two composer fixes:
- **Paste/input lag** — `flushEditorToDraft` serializes the whole editor
(`composerPlainText` is O(n)); running it on every event during a burst
(holding a key, or holding Cmd+V into a growing editor) was O(n²). Coalesce
the input/paste path to one flush per animation frame. Lossless: the
contentEditable DOM is the source of truth and submit + the compositionend /
keydown paths re-read it synchronously (those stay immediate).
- **Detached-composer dock glow** — was `fixed inset-x-0` (full viewport, spilled
under the sessions sidebar). Switched to `absolute inset-x-0`, so it anchors to
the chat-column root the docked composer centers in — the glow now spans only
the thread area, matching the actual dock target.
Verified: typecheck clean, 0 lint errors, composer DOM repro tests pass.
Lift the submit orchestration out of ChatBar into
composer/hooks/use-composer-submit.ts: `submitDraft` (the one decision tree —
queue-edit save · slash-now-while-busy · queue · drain · send · stop),
`dispatchSubmit` (the shared send-with-restore primitive + the external-submit
listener), and `steerDraft`.
This is the seam where the draft and queue engines meet; it now reads both clean
APIs as explicit inputs instead of closing over inline state. ChatBar is left as
a thin coordinator that owns the shared `queueEditRef` and wires the four engines
(draft · queue · submit · metrics/voice/drop) into render.
Behaviour-identical (verbatim move). Verified: typecheck clean, composer DOM
repro tests (enter-submit, IME, slash-now, steer, drain) pass.
Lift the queue subsystem out of ChatBar into composer/hooks/use-composer-queue.ts:
the per-session queue-store binding + queuedPrompts, in-place queued-prompt
editing (begin/step/exit), the shared drain lock + send-then-remove sequence,
manual send-now, bounded auto-drain, and the three queue effects (re-key migrate,
idle auto-drain, queue-edit cleanup).
It consumes the draft API (draftRef/clearDraft/loadIntoComposer/focusInput) and
writes the coordinator-owned `queueEditRef` the draft engine reads — so the
draft↔queue coupling is two explicit deps, not an inline tangle. `steerDraft`
and the chat-focus Esc-cancel stay in ChatBar (not queue-internal).
Behaviour-identical (verbatim move). Verified: typecheck clean, composer DOM
repro tests + queue/edit paths pass.
De-entangle the draft spine: lift the source-of-truth engine (the imperative
composer-runtime subscription, edit primitives, focus, edge selectors, and
per-session load/clear/stash/restore) out of ChatBar into
composer/hooks/use-composer-draft.ts.
The draft↔queue cycle is broken by making `queueEditRef` a coordinator-owned
ref ChatBar threads into the hook (explicit dep, not an implicit shared global).
The contentEditable *event* handlers stay in ChatBar (they bridge into the
trigger engine) and drive the primitives the hook exposes.
Behaviour-preserving (verbatim move); typing perf preserved. Verified: typecheck
clean, composer DOM repro tests (enter-submit, IME, slash-nav) + text-guard pass.
Lift the attachment drop engine (dragActive + the 7 drag/drop handlers + the
in-app-ref vs OS-upload split) out of ChatBar into
composer/hooks/use-composer-drop.ts. Self-contained, off the keystroke path —
consumes insertInlineRefs + onAttachDroppedItems + requestMainFocus. Verbatim
move, behaviour-preserving.
Lift the dictation + voice-conversation + auto-speak subsystem out of ChatBar
into composer/hooks/use-composer-voice.ts. It owns voiceConversationActive,
lastSpokenIdRef, the pending-reply readers, submitVoiceTurn, the voice
hooks (recorder/conversation/auto-speak), the Ctrl+B toggle event, and
handleToggleAutoSpeak; it exposes dictate/voiceStatus/voiceActivityState/
conversation/start+endConversation/handleToggleAutoSpeak for the controls.
Self-contained: consumes the draft/submit primitives (insertText, clearDraft,
focusInput, onSubmit) passed in, nothing depends back on it — so unlike the
queue subsystem (which is circularly coupled to the draft helpers) it lifts
cleanly. Behaviour-preserving; verbatim move.
The real composer state-engine fix. ChatBar subscribed to the full draft string
(`useAuiState(s => s.composer.text)`), so every keystroke re-rendered the whole
~2k-line component even though the contentEditable DOM already owns the text.
Replace that with:
- an imperative composer-runtime subscription (useComposerRuntime().subscribe)
that mirrors text into draftRef, repaints the editor ONLY on external changes
(clear/restore/insert; the focused editor is the source otherwise), and drives
the debounced per-session stash — all without a React render. This folds the
old `[draft]` sync effect and the `[draft]` debounced-stash effect into one
place keyed off the runtime, surviving core rebinds via the effect dep.
- coarse edge selectors (hasText / isHelpHint / isSteerableText, plus
isEmpty / hasHardNewline in useComposerMetrics) for the chrome, which only
re-render when an edge actually flips.
Net: typing within a line does zero ChatBar re-renders / style invalidations;
work happens only on real edges. Behaviour-preserving — draftRef + editor are
already kept current by every mutation path; verified by the composer DOM repro
tests (enter-submit, IME composition, slash-nav) + text-guard.
First step of decomposing the ChatBar god component (composer/index.tsx). Pull
the self-contained *sizing* engine — stacked/inline layout + the measured-height
CSS vars the thread reads for clearance — into composer/hooks/use-composer-metrics.ts.
The hook owns: the media-query `narrow`, `expanded`/`tight`, the 8px height
bucketing (so per-keystroke growth never invalidates the tree's computed style),
the ResizeObserver, the popout re-sync, and the CSS-var cleanup. ChatBar now
just calls `useComposerMetrics(...)` and consumes `stacked`.
Behaviour-preserving (no keystroke/IME/contentEditable path touched): code moved
verbatim. Deliberately a low-risk first slice on the app's most fragile file;
the draft/state-engine spine is the next, dogfood-heavy step
(see desktop-composer-plan.md).
Two status-stack icon nits:
- Subagents used `hubot`; switch to the dedicated `agent` codicon.
- The queue section had no icon while every other group (todos, subagents,
background) has one. Give it `layers` (a stack of pending turns), matched to
the group-icon styling so all four sections read consistently.
MCP tools connected and enabled but never surfaced into the agent's
session toolset on the desktop app + dashboard WebUI (#51587).
There are two independent background MCP discovery thread owners by
surface: tui_gateway.entry (stdio 'hermes --tui') and hermes_cli.mcp_startup
(desktop app + dashboard WS sidecar via tui_gateway/ws.py, and 'hermes
dashboard'). The late-refresh scheduler gates on
tui_gateway.entry.mcp_discovery_in_flight(), which read ONLY the entry
thread global. On the desktop/dashboard surfaces that global is None, so a
server slower than the bounded build-time wait never triggered a late
refresh and its tools stayed invisible for the whole session.
Make mcp_discovery_in_flight() / join_mcp_discovery() consult BOTH thread
owners. Adds the matching in-flight/join helpers to hermes_cli.mcp_startup
and has tui_gateway.entry delegate to them as a second owner.
Preview links (detected HTML files / localhost dev URLs) were rendered as
CHILDREN of the background StatusSection, which is collapsed by default — so the
moment a background task appeared, the previews got swallowed into the collapsed
"N Background" expandable and vanished until you manually expanded it. With no
background group they rendered as a standalone always-visible block, so the bug
only showed once a bg task was running.
Render the preview links as their own always-visible block right after the
background section instead of as collapsible children. They stay visually
associated with the background group (a localhost dev server and its preview are
the same thing) but are no longer hidden by its collapse — a one-tap open is the
whole point.
_sanitize_api_messages() compared raw tool_call_id strings without
stripping whitespace. When assistant-side IDs and tool-result IDs
diverged due to surrounding whitespace, valid tool results were treated
as orphaned and replaced with [Result unavailable] stub placeholders.
Strip whitespace in _get_tool_call_id_static() (both call_id/id paths,
dict and object) and at the two result_call_id comparison sites in
sanitize_api_messages(). Adds regression tests for preserved-whitespace
results and orphaned-whitespace removal.
Closes#9999
- Rework share/import into one Dialog (matches rename/create): a single code
field (copy to share, paste + Load to import) with a hover copy button, a
Reset link beside the upload icon when viewing an imported map, and plainer
copy.
- Core orb: scales with the world zoom (~1.25× the inner shell), backdrop wash
behind it; on focus/hover the scene composites above the orb so the active
tooltip + lit lines are never covered.
- fitViewport floors zoom at the reference (5-ring) extent, so big maps render
at a constant scale and pan instead of shrinking every node to fit.
- Light mode: flip inter-ring band shading to read as depth (not a mound),
fade the core ring in from t=0, drop the timeline star glow.
- Timeline: filled play glyph, crisper constellation, date moved into the legend.
After the slash dispatcher, the next-largest body unit was submitPromptText —
a ~280-line submit pipeline. Lift it into a colocated useSubmitPrompt sub-hook
(use-prompt-actions/submit.ts) with a typed SubmitPromptDeps object; body moves
verbatim. SubmitTextOptions moves to utils.ts (shared by submit + submitText).
Pure restructuring, no behaviour change (full use-prompt-actions suite green).
index.ts: 1,212 -> 937.
The remaining bulk of useMessageStream was handleGatewayEvent — a ~550-line
event-type dispatcher. Lift it into a colocated useGatewayEventHandler sub-hook
(use-message-stream/gateway-event.ts): the values it closed over (sibling
streaming callbacks + the 3 stable refs the deps array omitted + options)
become a typed GatewayEventDeps object; the dispatcher body moves verbatim.
Pure restructuring, no behaviour change (utils tests still green). index.ts:
1,120 -> 540.
Follow-up hardening on the salvaged #54465 backoff persistence work.
The lease refresher's loop treated ANY falsy refresh as a permanent stop
(`if not refreshed: break`), conflating two distinct cases:
- genuine lost-ownership (rowcount 0) — correct to stop, and
- a one-off transient DB error (write contention that escapes
_execute_write's retry budget) — which returned False identically.
A single transient blip therefore killed the lease for the rest of a
multi-minute compression call, silently reintroducing the exact 300s-TTL <
~361s-call expiry wedge the PR set out to fix.
Changes:
- _CompressionLockLeaseRefresher._run now tolerates a bounded run of
consecutive failures (_MAX_CONSECUTIVE_REFRESH_FAILURES = 3) before giving
up the lease; a recovered tick resets the counter. Worst-case extra hold is
cap * refresh_interval, still bounded by the acquirer's TTL.
- Replace the two remaining silent `except Exception: pass` arms in the
compression-failure-cooldown persist/clear helpers with debug logging, for
parity with their sqlite3.Error sibling arms (a non-sqlite bug was invisible).
- Document the join(timeout=1.0) quiesce bound in stop().
- Add 3 regression tests: single-blip tolerance, persistent-failure stop at the
cap, and refresh-raising tolerance.
The usePromptActions body's largest unit was executeSlashCommand — a ~530-line
`/command` dispatcher. Lift it into a colocated useSlashCommand sub-hook
(use-prompt-actions/slash.ts): the ~13 values it closed over become a typed
SlashCommandDeps object the parent passes in; the dispatcher body (and its inner
runSlash recursion) moves verbatim. SlashActionCtx (slash-only) moves with it.
Pure restructuring, no behaviour change (verified: full use-prompt-actions test
suite still green). index.ts: 1,772 -> ~1,250.
Cron pre-run scripts were capped at 120s by default, which surprised
users running long data-collection scripts on crons (the whole point of
crons being to offload long work). Raise _DEFAULT_SCRIPT_TIMEOUT to 3600s
(1 hour).
This bounds the script only — skill/agent jobs already run on a separate
inactivity budget (HERMES_CRON_TIMEOUT, default 600s idle, 0=unlimited),
not a wall-clock cap. Scripts dispatch to a persistent thread pool and do
not hold the tick lock, so a long script doesn't starve other due jobs.
Docs clarified to make the script-vs-agent timeout distinction explicit.
env/config overrides (HERMES_CRON_SCRIPT_TIMEOUT,
cron.script_timeout_seconds) unchanged and still take precedence.
Extract the standalone gateway-event helpers (session-info patch derivation,
completion-error detection, todo-payload routing, delegate_task -> subagent
spec mapping, + the stream-flush/subagent-event constants) out of the
1,285-line hook into a colocated, tested use-message-stream/utils.ts. index.ts
keeps the stateful streaming hook and consumes the helpers.
Pure restructuring, no behaviour change; folder index keeps the import path
intact. index.ts: 1,285 -> ~1,120. Adds unit tests for the pure helpers.
Extract the ~16 standalone helpers (message reconciliation, optimistic/resolved
session upserts, stored-session resolution, runtime-info application, error
classification) out of the 1,254-line god hook into a colocated, tested
use-session-actions/utils.ts. index.ts keeps the hook orchestrator (the
stateful action callbacks) and consumes the helpers.
Pure restructuring, no behaviour change; folder index keeps the import path
(`@/app/session/hooks/use-session-actions`) intact. index.ts: 1,254 -> ~950.
Adds unit tests for the pure helpers.
Detect a routing key whose session is already ended in state.db
(end_reason set) inside get_or_create_session and drop the stale entry
instead of silently routing the message into a closed session.
Previously the only runtime cleanup of sessions.json was the startup
_prune_stale_sessions_locked (#52808/#54138), which requires a restart.
A session ended while the gateway stays alive — any path that finalizes
the DB row without clearing sessions.json — left a live routing key
pointing at a closed session. get_or_create_session never consulted
end_reason, so it returned that stale entry and every subsequent message
was silently dropped (no log, no error, no response) until the next
restart. This is the live-gateway variant of #52804/FM9, which needed an
actual gateway crash.
The guard drops the stale entry and falls through to
_recover_session_from_db, which reopens agent_close-ended rows and
resumes the SAME session_id (transcript preserved); if the row ended for
a non-recoverable reason (e.g. /new) it correctly starts a fresh
session. A warning is logged so the event is visible (the field
incident reported zero log output).
Adds tests/gateway/test_session_store_runtime_stale_guard.py covering
the _is_session_ended_in_db helper and the end-to-end routing self-heal
(recover-vs-fresh, live-entry untouched, stale-wins-over-suspended,
force_new short-circuit).
Closes#54878.
Co-authored-by: David Gutowsky <david.gutowsky@gmail.com>
Two robustness gaps from the #54843 truncate-store path:
- _store_full_text wrote the full clean page to cache/web with no upper
bound (path.write_text(content)); a multi-MB page → unbounded per-extract
disk write. Cap at MAX_STORED_TEXT_CHARS (2MB, the pre-truncate-store
refusal ceiling) with a marker when capped.
- The truncation footer told the model 'read_file ... offset=<line>' — a
literal placeholder it had to guess. Compute the real starting line of the
omitted middle (head line count + 1) so the first read_file lands in the gap.
Multiple @-references in one message (esp. @url: refs, each a full
web_extract round-trip) were expanded in a serial `for ref in refs: await`
loop. Switch to asyncio.gather over the independent _expand_reference calls,
reassembling warnings/blocks in original positional order so output is
byte-identical to the serial path; the token-budget check is unchanged.
Generic + provider-agnostic: helps every web backend equally (exa/tavily/
firecrawl/parallel) since it's above the provider layer. RED/GREEN test:
3 url refs @ 0.2s each = 0.60s serial -> ~0.20s concurrent.