hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-16 09:31:37 +00:00

Author	SHA1	Message	Date
Teknium	36d68bcb82	fix(api-server): persist incomplete snapshot on asyncio.CancelledError too Extends PR #15171 to also cover the server-side cancellation path (aiohttp shutdown, request-level timeout) — previously only ConnectionResetError triggered the incomplete-snapshot write, so cancellations left the store stuck at the in_progress snapshot written on response.created. Factors the incomplete-snapshot build into a _persist_incomplete_if_needed() helper called from both the ConnectionResetError and CancelledError branches; the CancelledError handler re-raises so cooperative cancellation semantics are preserved. Adds two regression tests that drive _write_sse_responses directly (the TestClient disconnect path races the server handler, which makes the end-to-end assertion flaky).	2026-04-24 15:22:19 -07:00
UgwujaGeorge	a29bad2a3c	fix(api-server): persist response snapshot on client disconnect when store=True	2026-04-24 15:22:19 -07:00
sprmn24	7957da7a1d	fix(web_server): hold _oauth_sessions_lock during PKCE session state writes _submit_anthropic_pkce() retrieved sess under _oauth_sessions_lock but wrote back to sess["status"] and sess["error_message"] outside the lock. A concurrent session GC or cancel could race with these writes, producing inconsistent session state. Wrap all 4 sess write sites in _oauth_sessions_lock: - network exception path (Token exchange failed) - missing access_token path - credential save failure path - success path (approved)	2026-04-24 15:22:04 -07:00
Cyprian Kowalczyk	fd3864d8bd	feat(cli): wrap /compress in _busy_command to block input during compression Before this, typing during /compress was accepted by the classic CLI prompt and landed in the next prompt after compression finished, effectively consuming a keystroke for a prompt that was about to be replaced. Wrapping the body in self._busy_command('Compressing context...') blocks input rendering for the duration, matching the pattern /skills install and other slow commands already use. Salvages the useful part of #10303 (@iRonin). The `_compressing` flag added to run_agent.py in the original PR was dead code (set in 3 spots, read nowhere — not by cli.py, not by run_agent.py, not by the Ink TUI which doesn't use _busy_command at all) and was dropped.	2026-04-24 15:21:22 -07:00
Yukipukii1	8ea389a7f8	fix(gateway/config): coerce quoted boolean values in config parsing	2026-04-24 15:20:05 -07:00
knockyai	3e6c108565	fix(gateway): honor queue mode in runner PRIORITY interrupt path When display.busy_input_mode is 'queue', the runner-level PRIORITY block in _handle_message was still calling running_agent.interrupt() for every text follow-up to an active session. The adapter-level busy handler already honors queue mode (commit `9d147f7fd`), but this runner-level path was an unconditional interrupt regardless of config. Adds a queue-mode branch that queues the follow-up via _queue_or_replace_pending_event() and returns without interrupting. Salvages the useful part of #12070 (@knockyai). The config fan-out to per-platform extra was redundant — runner already loads busy_input_mode directly via _load_busy_input_mode().	2026-04-24 15:18:34 -07:00
Teknium	e3a1a9c24d	chore(release): map julia@alexland.us -> alexg0bot in AUTHOR_MAP (#15384 )	2026-04-24 15:18:09 -07:00
Teknium	e3697e20a6	chore(release): map iRonin personal email to GitHub login	2026-04-24 15:17:09 -07:00
Teknium	ed91b79b7e	fix(cli): keep Ctrl+D no-op when only attachments pending Follow-up to @iRonin's Ctrl+D EOF fix. If the input text is empty but the user has pending attached images, do nothing rather than exiting — otherwise a stray Ctrl+D silently discards the attachments.	2026-04-24 15:17:09 -07:00
CK iRonin.IT	08d5c9c539	fix: Ctrl+D deletes char under cursor, only exits on empty input (bash/zsh behaviour)	2026-04-24 15:17:09 -07:00
Julia Bennet	1dcf79a864	feat: add slash command for busy input mode	2026-04-24 15:15:26 -07:00
teknium1	2de8a7a229	fix(skills): drop raw_content to avoid doubling skill payload skill_view response went to the model verbatim; duplicating the SKILL.md body as raw_content on every tool call added token cost with no agent-facing benefit. Remove the field and update tests to assert on content only. The slash/preload caller (agent/skill_commands.py) already falls back to content when raw_content is absent, and it calls skill_view(preprocess=False) anyway, so content is already unrendered on that path.	2026-04-24 15:15:07 -07:00
helix4u	ead66f0c92	fix(skills): apply inline shell in skill_view	2026-04-24 15:15:07 -07:00
Allard	0bcbc9e316	docs(faq): Update docs on backups - update faq answer with new `backup` command in release 0.9.0 - move profile export section together with backup section so related information can be read more easily - add table comparison between `profile export` and `backup` to assist users if understanding the nuances between both	2026-04-24 15:14:08 -07:00
Teknium	2d444fc84d	fix(run_agent): handle unescaped control chars in tool_call arguments (#15356 ) Extends _repair_tool_call_arguments() to cover the most common local-model JSON corruption pattern: llama.cpp/Ollama backends emit literal tabs and newlines inside JSON string values (memory save summaries, file contents, etc.). Previously fell through to '{}' replacement, losing the call. Adds two repair passes: - Pass 0: json.loads(strict=False) + re-serialise to canonical wire form - Pass 4: escape 0x00-0x1F control chars inside string values, then retry Ports the core utility from #12068 / PR #12093 without the larger plumbing change (that PR also replaced json.loads at 8 call sites; current main's _repair_tool_call_arguments is already the single chokepoint, so the upgrade happens transparently for every existing caller). Credit: @truenorth-lj for the original utility design. 4 new regression tests covering literal newlines, tabs, re-serialisation to strict=True-valid output, and the trailing-comma + control-char combination case.	2026-04-24 15:06:41 -07:00
Teknium	bb53d79d26	chore(release): map q19dcp@gmail.com -> aj-nt in AUTHOR_MAP	2026-04-24 15:03:07 -07:00
AJ	17fc84c256	fix: repair malformed tool call args in streaming assembly before flagging as truncated When the streaming path (chat completions) assembled tool call deltas and detected malformed JSON arguments, it set has_truncated_tool_args=True but passed the broken args through unchanged. This triggered the truncation handler which returned a partial result and killed the session (/new required). _many_ malformations are repairable: trailing commas, unclosed brackets, Python None, empty strings. _repair_tool_call_arguments() already existed for the pre-API-request path but wasn't called during streaming assembly. Now when JSON parsing fails during streaming assembly, we attempt repair via _repair_tool_call_arguments() before flagging as truncated. If repair succeeds (returns valid JSON), the tool call proceeds normally. Only truly unrepairable args fall through to the truncation handler. This prevents the most common session-killing failure mode for models like GLM-5.1 that produce trailing commas or unclosed brackets. Tests: 12 new streaming assembly repair tests, all 29 existing repair tests still passing.	2026-04-24 15:03:07 -07:00
Teknium	b7c1d77e55	fix(dashboard): remove unimplemented 'block' busy_input_mode option The web UI schema advertised 'block' as a busy_input_mode choice, but no implementation ever existed — the gateway and CLI both silently collapsed 'block' (and anything other than 'queue') to 'interrupt'. Users who picked 'block' in the dashboard got interrupts anyway. Drop 'block' from the select options. The two supported modes are 'interrupt' (default) and 'queue'.	2026-04-24 15:01:38 -07:00
luyao618	7a192b124e	fix(run_agent): repair corrupted tool_call arguments before sending to provider When a session is split by context compression mid-tool-call, an assistant message may end up with truncated/invalid JSON in tool_calls[].function.arguments. On the next turn this is replayed verbatim and providers reject the entire request with HTTP 400 invalid_tool_call_format, bricking the conversation in a loop that cannot recover without manual session quarantine. This patch adds a defensive sanitizer that runs immediately before client.chat.completions.create() in AIAgent.run_conversation(): - Validates each assistant tool_calls[].function.arguments via json.loads - Replaces invalid/empty arguments with '{}' - Injects a synthetic tool response (or prepends a marker to the existing one) so downstream messages keep valid tool_call_id pairing - Logs each repair with session_id / message_index / preview for observability Defense in depth: corruption can originate from compression splits, manual edits, or plugin bugs. Sanitizing at the send chokepoint catches all sources. Adds 7 unit tests covering: truncated JSON, empty string, None, non-string args, existing matching tool response (no duplicate injection), non-assistant messages ignored, multiple repairs. Fixes #15236	2026-04-24 14:55:47 -07:00
Teknium	4093ee9c62	fix(codex): detect leaked tool-call text in assistant content (#15347 ) gpt-5.x on the Codex Responses API sometimes degenerates and emits Harmony-style `to=functions.<name> {json}` serialization as plain assistant-message text instead of a structured `function_call` item. The intent never makes it into `response.output` as a function_call, so `tool_calls` is empty and `_normalize_codex_response()` returns the leaked text as the final content. Downstream (e.g. delegate_task), this surfaces as a confident-looking summary with `tool_trace: []` because no tools actually ran — the Taiwan-embassy-email bug report. Detect the pattern, scrub the content, and return finish_reason= 'incomplete' so the existing Codex-incomplete continuation path (run_agent.py:11331, 3 retries) gets a chance to re-elicit a proper function_call item. Encrypted reasoning items are preserved so the model keeps its chain-of-thought on the retry. Regression tests: leaked text triggers incomplete, real tool calls alongside leak-looking text are preserved, clean responses pass through unchanged. Reported on Discord (gpt-5.4 / openai-codex).	2026-04-24 14:39:59 -07:00
helix4u	6a957a74bc	fix(memory): add write origin metadata	2026-04-24 14:37:55 -07:00
Teknium	14b27bb68c	chore(release): map @tochukwuada in AUTHOR_MAP Contributor email for PR #15161 salvage (debthemelon <thomasgeorgevii09@gmail.com>).	2026-04-24 14:32:21 -07:00
Teknium	ef9355455b	test: regression coverage for checkpoint dedup and inf/nan coercion Covers the two bugs salvaged from PR #15161: - test_batch_runner_checkpoint: TestFinalCheckpointNoDuplicates asserts the final aggregated completed_prompts list has no duplicate indices, and keeps a sanity anchor test documenting the pre-fix pattern so a future refactor that re-introduces it is caught immediately. - test_model_tools: TestCoerceNumberInfNan asserts _coerce_number returns the original string for inf/-inf/nan/Infinity inputs and that the result round-trips through strict (allow_nan=False) json.dumps.	2026-04-24 14:32:21 -07:00
debthemelon	dbdefa43c8	fix: eliminate duplicate checkpoint entries and JSON-unsafe coercion batch_runner: completed_prompts_set is already fully populated by the time the aggregation loop runs (incremental updates happen at result collection time), so the subsequent extend() call re-added every completed prompt index a second time. Removed the redundant variable and extend, and write sorted(completed_prompts_set) directly to the final checkpoint instead. model_tools: _coerce_number returned Python float('inf')/float('nan') for inf/nan strings rather than the original string. json.dumps raises ValueError for these values, so any tool call where the model emitted "inf" or "nan" for a numeric parameter would crash at serialization. Changed the guard to return the original string, matching the function's documented "returns original string on failure" contract.	2026-04-24 14:32:21 -07:00
Teknium	db9d6375fb	feat(models): add openai/gpt-5.5 and gpt-5.5-pro to OpenRouter + Nous Portal (#15343 ) Replaces gpt-5.4 / gpt-5.4-pro entries in the OpenRouter fallback snapshot and the Nous Portal curated list. Other aggregators (Vercel AI Gateway) and provider-native lists are unchanged.	2026-04-24 14:31:47 -07:00
helix4u	8a2506af43	fix(aux): surface auxiliary failures in UI	2026-04-24 14:31:21 -07:00
helix4u	e7590f92a2	fix(telegram): honor no_proxy for explicit proxy setup	2026-04-24 14:31:04 -07:00
brooklyn!	a5129c72ef	Merge pull request #15337 from NousResearch/bb/tui-kawaii-default-off fix(tui): keep default personality neutral	2026-04-24 16:23:00 -05:00
Brooklyn Nicholson	53fc10fc9a	fix(tui): keep default personality neutral	2026-04-24 16:19:23 -05:00
brooklyn!	93ddff53e3	Merge pull request #15321 from NousResearch/bb/tui-inline-diff-tooltrail-order fix(tui): render tool trail before anchored inline diffs	2026-04-24 15:20:42 -05:00
Brooklyn Nicholson	de596aca1c	fix(tui): render tool trail before anchored inline diffs Inline diff segments were anchored relative to assistant narration, but the turn details pane still rendered after streamSegments. On completion that put the diff before the tool telemetry that produced it. When a turn has anchored diff segments, commit the accumulated thinking/tool trail as a pre-diff trail message, then render the diff and final summary.	2026-04-24 15:07:02 -05:00
brooklyn!	6f1eed3968	Merge pull request #15274 from NousResearch/bb/tui-null-config-guard fix(tui): tolerate + warn on null sections in config.yaml	2026-04-24 13:02:12 -05:00
Brooklyn Nicholson	e3940f9807	fix(tui): guard personality overlay when personalities is null TUI auto-resolves `display.personality` at session init, unlike the base CLI. If config contains `agent.personalities: null`, `_resolve_personality_prompt` called `.get()` on None and failed before model/provider selection. Normalize null personalities to `{}` and surface a targeted config warning.	2026-04-24 12:57:51 -05:00
Brooklyn Nicholson	bfa60234c8	feat(tui): warn on bare null sections in config.yaml Tolerating null top-level keys silently drops user settings (e.g. `agent.system_prompt` next to a bare `agent:` line is gone). Probe at session create, log via `logger.warning`, and surface in the boot info under `config_warning` — rendered in the TUI feed alongside the existing `credential_warning` banner.	2026-04-24 12:49:02 -05:00
Brooklyn Nicholson	fd9b692d33	fix(tui): tolerate null top-level sections in config.yaml YAML parses bare keys like `agent:` or `display:` as None. `dict.get(key, {})` returns that None instead of the default (defaults only fire on missing keys), so every `cfg.get("agent", {}).get(...)` chain in tui_gateway/server.py crashed agent init with `'NoneType' object has no attribute 'get'`. Guard all 21 sites with `(cfg.get(X) or {})`. Regression test covers the null-section init path reported on Twitter against the new TUI.	2026-04-24 12:43:09 -05:00
Austin Pickett	c61547c067	Merge pull request #14890 from NousResearch/bb/tui-web-chat-unified feat(web): dashboard Chat tab — xterm.js + JSON-RPC sidecar (supersedes #12710 + #13379)	2026-04-24 10:35:43 -07:00
brooklyn!	7f0f67d5f7	Merge pull request #15266 from NousResearch/bb/fix-tui-section-toggle fix(tui): chevrons re-toggle even when section default is expanded	2026-04-24 12:24:27 -05:00
Brooklyn Nicholson	f5e2a77a80	fix(tui): chevrons re-toggle even when section default is expanded Recovers the manual click on the details accordion: with #14968's new SECTION_DEFAULTS (thinking/tools start `expanded`), every panel render was OR-ing the local open toggle against `visible.X === 'expanded'`. That pinned `open=true` for the default-expanded sections, so clicking the chevron flipped the local state but the panel never collapsed. Local toggle is now the sole source of truth at render time; the useState init still seeds from the resolved visibility (so first paint is correct) and the existing useEffect still re-syncs when the user mutates visibility at runtime via `/details`. Same OR-lock cleared inside SubagentAccordion (`showChildren \|\| openX`) — pre-existing but the same shape, so expand-all on the spawn tree no longer makes inner sections un-collapsible either.	2026-04-24 12:22:20 -05:00
Austin Pickett	850fac14e3	chore: address copilot comments	2026-04-24 12:51:04 -04:00
Austin Pickett	5500b51800	chore: fix lint	2026-04-24 12:32:10 -04:00
Austin Pickett	63975aa75b	fix: mobile chat in new layout	2026-04-24 12:07:46 -04:00
Teknium	62c14d5513	refactor(gateway): extract WhatsApp identity helpers into shared module Follow-up to the canonical-identity session-key fix: pull the JID/LID normalize/expand/canonical helpers into gateway/whatsapp_identity.py instead of living in two places. gateway/session.py (session-key build) and gateway/run.py (authorisation allowlist) now both import from the shared module, so the two resolution paths can't drift apart. Also switches the auth path from module-level _hermes_home (cached at import time) to dynamic get_hermes_home() lookup, which matches the session-key path and correctly reflects HERMES_HOME env overrides. The lone test that monkeypatched gateway.run._hermes_home for the WhatsApp auth path is updated to set HERMES_HOME env var instead; all other tests that monkeypatch _hermes_home for unrelated paths (update, restart drain, shutdown marker, etc.) still work — the module-level _hermes_home is untouched.	2026-04-24 07:55:55 -07:00
Keira Voss	10deb1b87d	fix(gateway): canonicalize WhatsApp identity in session keys Hermes' WhatsApp bridge routinely surfaces the same person under either a phone-format JID (60123456789@s.whatsapp.net) or a LID (…@lid), and may flip between the two for a single human within the same conversation. Before this change, build_session_key used the raw identifier verbatim, so the bridge reshuffling an alias form produced two distinct session keys for the same person — in two places: 1. DM chat_id — a user's DM sessions split in half, transcripts and per-sender state diverge. 2. Group participant_id (with group_sessions_per_user enabled) — a member's per-user session inside a group splits in half for the same reason. Add a canonicalizer that walks the bridge's lid-mapping-*.json files and picks the shortest/numeric-preferred alias as the stable identity. build_session_key now routes both the DM chat_id and the group participant_id through this helper when the platform is WhatsApp. All other platforms and chat types are untouched. Expose canonical_whatsapp_identifier and normalize_whatsapp_identifier as public helpers. Plugins that need per-sender behaviour (role-based routing, per-contact authorization, policy gating) need the same identity resolution Hermes uses internally; without a public helper, each plugin would have to re-implement the walker against the bridge's internal on-disk format. Keeping this alongside build_session_key makes it authoritative and one refactor away if the bridge ever changes shape. _expand_whatsapp_aliases stays private — it's an implementation detail of how the mapping files are walked, not a contract callers should depend on.	2026-04-24 07:55:55 -07:00
emozilla	f49afd3122	feat(web): add /api/pty WebSocket bridge to embed TUI in dashboard Exposes hermes --tui over a PTY-backed WebSocket so the dashboard can embed the real TUI rather than reimplement its surface. The browser attaches xterm.js to the socket; keystrokes flow in, PTY output bytes flow out. Architecture: browser <Terminal> (xterm.js) │ onData ───► ws.send(keystrokes) │ onResize ► ws.send('\x1b[RESIZE:cols;rows]') │ write ◄── ws.onmessage (PTY bytes) ▼ FastAPI /api/pty (token-gated, loopback-only) ▼ PtyBridge (ptyprocess) ── spawns node ui-tui/dist/entry.js ──► tui_gateway + AIAgent Components ---------- hermes_cli/pty_bridge.py Thin wrapper around ptyprocess.PtyProcess: byte-safe read/write on the master fd via os.read/os.write (not PtyProcessUnicode — ANSI is inherently byte-oriented and UTF-8 boundaries may land mid-read), non-blocking select-based reads, TIOCSWINSZ resize, idempotent SIGHUP→SIGTERM→SIGKILL teardown, platform guard (POSIX-only; Windows is WSL-supported only). hermes_cli/web_server.py @app.websocket("/api/pty") endpoint gated by the existing _SESSION_TOKEN (via ?token= query param since browsers can't set Authorization on WS upgrades). Loopback-only enforcement. Reader task uses run_in_executor to pump PTY bytes without blocking the event loop. Writer loop intercepts a custom \x1b[RESIZE:cols;rows] escape before forwarding to the PTY. The endpoint resolves the TUI argv through a _resolve_chat_argv hook so tests can inject fake commands without building the real TUI. Tests ----- tests/hermes_cli/test_pty_bridge.py — 12 unit tests: spawn, stdout, stdin round-trip, EOF, resize (via TIOCSWINSZ + tput readback), close idempotency, cwd, env forwarding, unavailable-platform error. tests/hermes_cli/test_web_server.py — TestPtyWebSocket adds 7 tests: missing/bad token rejection (close code 4401), stdout streaming, stdin round-trip, resize escape forwarding, unavailable-platform ANSI error frame + 1011 close, resume parameter forwarding to argv. 96 tests pass under scripts/run_tests.sh. (cherry picked from commit `29b337bca7`) feat(web): add Chat tab with xterm.js terminal + Sessions resume button (cherry picked from commit `3d21aee8` by emozilla, conflicts resolved against current main: BUILTIN_ROUTES table + plugin slot layout) fix(tui): replace OSC 52 jargon in /copy confirmation When the user ran /copy successfully, Ink confirmed with: sent OSC52 copy sequence (terminal support required) That reads like a protocol spec to everyone who isn't a terminal implementer. The caveat was a historical artifact — OSC 52 wasn't universally supported when this message was written, so the TUI honestly couldn't guarantee the copy had landed anywhere. Today every modern terminal (including the dashboard's embedded xterm.js) handles OSC 52 reliably. Say what the user actually wants to know — that it copied, and how much — matching the message the TUI already uses for selection copy: copied 1482 chars (cherry picked from commit `a0701b1d5a`) docs: document the dashboard Chat tab AGENTS.md — new subsection under TUI Architecture explaining that the dashboard embeds the real hermes --tui rather than rewriting it, with pointers to the pty_bridge + WebSocket endpoint and the rule 'never add a parallel chat surface in React.' website/docs/user-guide/features/web-dashboard.md — user-facing Chat section inside the existing Web Dashboard page, covering how it works (WebSocket + PTY + xterm.js), the Sessions-page resume flow, and prerequisites (Node.js, ptyprocess, POSIX kernel / WSL on Windows). (cherry picked from commit `2c2e32cc45`) feat(tui-gateway): transport-aware dispatch + WebSocket sidecar Decouples the JSON-RPC dispatcher from its I/O sink so the same handler surface can drive multiple transports concurrently. The PTY chat tab already speaks to the TUI binary as bytes — this adds a structured event channel alongside it for dashboard-side React widgets that need typed events (tool.start/complete, model picker state, slash catalog) that PTY can't surface. - `tui_gateway/transport.py` — `Transport` protocol + `contextvars` binding + module-level `StdioTransport` fallback. The stdio stream resolves through a lambda so existing tests that monkey-patch `_real_stdout` keep passing without modification. - `tui_gateway/ws.py` — WebSocket transport implementation; FastAPI endpoint mounting lives in hermes_cli/web_server.py. - `tui_gateway/server.py`: - `write_json` routes via session transport (for async events) → contextvar transport (for in-request writes) → stdio fallback. - `dispatch(req, transport=None)` binds the transport for the request lifetime and propagates it to pool workers via `contextvars.copy_context` so async handlers don't lose their sink. - `_init_session` and the manual-session create path stash the request's transport so out-of-band events (subagent.complete, etc.) fan out to the right peer. `tui_gateway.entry` (Ink's stdio handshake) is unchanged externally — it falls through every precedence step into the stdio fallback, byte- identical to the previous behaviour. feat(web): ChatSidebar — JSON-RPC sidecar next to xterm.js terminal Composes the two transports into a single Chat tab: ┌─────────────────────────────────────────┬──────────────┐ │ xterm.js / PTY (emozilla #13379) │ ChatSidebar │ │ the literal hermes --tui process │ /api/ws │ └─────────────────────────────────────────┴──────────────┘ terminal bytes structured events The terminal pane stays the canonical chat surface — full TUI fidelity, slash commands, model picker, mouse, skin engine, wide chars all paint inside the terminal. The sidebar opens a parallel JSON-RPC WebSocket to the same gateway and renders metadata that PTY can't surface to React chrome: • model + provider badge with connection state (click → switch) • running tool-call list (driven by tool.start / tool.progress / tool.complete events) • model picker dialog (gateway-driven, reuses ModelPickerDialog) The sidecar is best-effort. If the WS can't connect (older gateway, network hiccup, missing token) the terminal pane keeps working unimpaired — sidebar just shows the connection-state badge in the appropriate tone. - `web/src/components/ChatSidebar.tsx` — new component (~270 lines). Owns its GatewayClient, drives the model picker through `slash.exec`, fans tool events into a capped tool list. - `web/src/pages/ChatPage.tsx` — split layout: terminal pane (`flex-1`) + sidebar (`w-80`, `lg+` only). - `hermes_cli/web_server.py` — mount `/api/ws` (token + loopback guards mirror /api/pty), delegate to `tui_gateway.ws.handle_ws`. Co-authored-by: emozilla <emozilla@nousresearch.com> refactor(web): /clean pass on ChatSidebar + ChatPage lint debt - ChatSidebar: lift gw out of useRef into a useMemo derived from a reconnect counter. React 19's react-hooks/refs and react-hooks/ set-state-in-effect rules both fire when you touch a ref during render or call setState from inside a useEffect body. The counter-derived gw is the canonical pattern for "external resource that needs to be replaceable on user action" — re-creating the client comes from bumping `version`, the effect just wires + tears down. Drops the imperative `gwRef.current = …` reassign in reconnect, drops the truthy ref guard in JSX. modelLabel + banner inlined as derived locals (one-off useMemo was overkill). - ChatPage: lazy-init the banner state from the missing-token check so the effect body doesn't have to setState on first run. Drops the unused react-hooks/exhaustive-deps eslint-disable. Adds a scoped no-control-regex disable on the SGR mouse parser regex (the \\x1b is intentional for xterm escape sequences). All my-touched files now lint clean. Remaining warnings on web/ belong to pre-existing files this PR doesn't touch. Verified: vitest 249/249, ui-tui eslint clean, web tsc clean, python imports clean. chore: uptick fix(web): drop ChatSidebar tool list — events can't cross PTY/WS boundary The /api/pty endpoint spawns `hermes --tui` as a child process with its own tui_gateway and _sessions dict; /api/ws runs handle_ws in-process in the dashboard server with a separate _sessions dict. Tool events fire on the child's gateway and never reach the WS sidecar, so the sidebar's tool.start/progress/complete listeners always observed an empty list. Drop the misleading list (and the now-orphaned ToolCall primitive), keep model badge + connection state + model picker + error banner — those work because they're sidecar-local concerns. Surfacing tool calls in the sidebar requires cross-process forwarding (PTY child opens a back-WS to the dashboard, gateway tees emits onto stdio + sidecar transport) — proper feature for a follow-up. feat(web): wire ChatSidebar tool list to PTY child via /api/pub broadcast The dashboard's /api/pty spawns hermes --tui as a child process; tool events fire in the python tui_gateway grandchild and never crossed the process boundary into the in-process WS sidecar — so the sidebar tool list was always empty. Cross-process forwarding: - tui_gateway: TeeTransport (transport.py) + WsPublisherTransport (event_publisher.py, sync websockets client). entry.py installs the tee on _stdio_transport when HERMES_TUI_SIDECAR_URL is set, mirroring every dispatcher emit to a back-WS without disturbing Ink's stdio handshake. - hermes_cli/web_server.py: new /api/pub (publisher) + /api/events (subscriber) endpoints with a per-channel registry. /api/pty now accepts ?channel= and propagates the sidecar URL via env. start_server also stashes app.state.bound_port so the URL is constructable. - web/src/pages/ChatPage.tsx: generates a channel UUID per mount, passes it to /api/pty and as a prop to ChatSidebar. - web/src/components/ChatSidebar.tsx: opens /api/events?channel=, fans tool.start/progress/complete back into the ToolCall list. Restores the ToolCall primitive. Tests: 4 new TestPtyWebSocket cases cover channel propagation, broadcast fan-out, and missing-channel rejection (10 PTY tests pass, 120 web_server tests overall). fix(web): address Copilot review on #14890 Five threads, all real: - gatewayClient.ts: register `message`/`close` listeners BEFORE awaiting the open handshake. Server emits `gateway.ready` immediately after accept, so a listener attached after the open promise could race past the initial skin payload and lose it. - ChatSidebar.tsx: wire `error`/`close` on the /api/events subscriber WS into the existing error banner. 4401/4403 (auth/loopback reject) surface as a "reload the page" message; mid-stream drops surface as "events feed disconnected" with the existing reconnect button. Clean unmount closes (1000/1001) stay silent. - web-dashboard.md: install hint was `pip install hermes-agent[web]` but ptyprocess lives in the `pty` extra, not `web`. Switch to `hermes-agent[web,pty]` in both prerequisite blocks. - AGENTS.md: previous "never add a parallel React chat surface" guidance was overbroad and contradicted this PR's sidebar. Tightened to forbid re-implementing the transcript/composer/PTY terminal while explicitly allowing structured supporting widgets (sidebar / model picker / inspectors), matching the actual architecture. - web/package-lock.json: regenerated cleanly so the wterm sibling workspace paths (extraneous machine-local entries) stop polluting CI. Tests: 249/249 vitest, 10/10 PTY/events, web tsc clean. refactor(web): /clean pass on ChatSidebar events handler Spotted in the round-2 review: - Banner flashed on clean unmount: `ws.close()` from the effect cleanup fires `close` with code 1005, opened=true, neither 1000 nor 1001 — hit the "unexpected drop" branch. Track `unmounting` in the effect scope and gate the banner through a `surface()` helper so cleanup closes stay silent. - DRY the duplicated "events feed disconnected" string into a local const used by both the error and close handlers. - Drop the `opened` flag (no longer needed once the unmount guard is the source of truth for "is this an expected close?").	2026-04-24 10:51:49 -04:00
Austin Pickett	1143f234e3	Merge pull request #14899 from NousResearch/feat/dashboard-layout Feat/dashboard layout	2026-04-24 07:48:31 -07:00
Teknium	c4627f4933	chore(release): map Group G contributors in AUTHOR_MAP	2026-04-24 07:26:07 -07:00
bsgdigital	7c3e5706d8	fix(bedrock): Bedrock-aware _rebuild_anthropic_client helper on interrupt Three interrupt-recovery sites in run_agent.py rebuilt self._anthropic_client with build_anthropic_client(self._anthropic_api_key, ...) unconditionally. When provider=bedrock + api_mode=anthropic_messages (AnthropicBedrock SDK path), self._anthropic_api_key is the sentinel 'aws-sdk' — build_anthropic_client doesn't accept that and the rebuild either crashed or produced a non-functional client. Extract a _rebuild_anthropic_client() helper that dispatches to build_anthropic_bedrock_client(region) when provider='bedrock', falling back to build_anthropic_client() for native Anthropic and other anthropic_messages providers (MiniMax, Kimi, Alibaba, etc.). Three inline rebuild sites now call the helper. Partial salvage of #14680 by @bsgdigital — only the _rebuild_anthropic_client helper. The normalize_model_name Bedrock-prefix piece was subsumed by #14664, and the aux client aws_sdk branch was subsumed by #14770 (both in the same salvage PR as this commit).	2026-04-24 07:26:07 -07:00
Andre Kurait	a9ccb03ccc	fix(bedrock): evict cached boto3 client on stale-connection errors ## Problem When a pooled HTTPS connection to the Bedrock runtime goes stale (NAT timeout, VPN flap, server-side TCP RST, proxy idle cull), the next Converse call surfaces as one of: * botocore.exceptions.ConnectionClosedError / ReadTimeoutError / EndpointConnectionError / ConnectTimeoutError * urllib3.exceptions.ProtocolError * A bare AssertionError raised from inside urllib3 or botocore (internal connection-pool invariant check) The agent loop retries the request 3x, but the cached boto3 client in _bedrock_runtime_client_cache is reused across retries — so every attempt hits the same dead connection pool and fails identically. Only a process restart clears the cache and lets the user keep working. The bare-AssertionError variant is particularly user-hostile because str(AssertionError()) is an empty string, so the retry banner shows: ⚠️ API call failed: AssertionError 📝 Error: with no hint of what went wrong. ## Fix Add two helpers to agent/bedrock_adapter.py: * is_stale_connection_error(exc) — classifies exceptions that indicate dead-client/dead-socket state. Matches botocore ConnectionError + HTTPClientError subtrees, urllib3 ProtocolError / NewConnectionError, and AssertionError raised from a frame whose module name starts with urllib3., botocore., or boto3.. Application-level AssertionErrors are intentionally excluded. * invalidate_runtime_client(region) — per-region counterpart to the existing reset_client_cache(). Evicts a single cached client so the next call rebuilds it (and its connection pool). Wire both into the Converse call sites: * call_converse() / call_converse_stream() in bedrock_adapter.py (defense-in-depth for any future caller) * The two direct client.converse(kwargs) / client.converse_stream(kwargs) call sites in run_agent.py (the paths the agent loop actually uses) On a stale-connection exception, the client is evicted and the exception re-raised unchanged. The agent's existing retry loop then builds a fresh client on the next attempt and recovers without requiring a process restart. ## Tests tests/agent/test_bedrock_adapter.py gets three new classes (14 tests): * TestInvalidateRuntimeClient — per-region eviction correctness; non-cached region returns False. * TestIsStaleConnectionError — classifies botocore ConnectionClosedError / EndpointConnectionError / ReadTimeoutError, urllib3 ProtocolError, library-internal AssertionError (both urllib3.* and botocore.* frames), and correctly ignores application-level AssertionError and unrelated exceptions (ValueError, KeyError). * TestCallConverseInvalidatesOnStaleError — end-to-end: stale error evicts the cached client, non-stale error (validation) leaves it alone, successful call leaves it cached. All 116 tests in test_bedrock_adapter.py pass. Signed-off-by: Andre Kurait <andrekurait@gmail.com>	2026-04-24 07:26:07 -07:00
Tranquil-Flow	7dc6eb9fbf	fix(agent): handle aws_sdk auth type in resolve_provider_client Bedrock's aws_sdk auth_type had no matching branch in resolve_provider_client(), causing it to fall through to the "unhandled auth_type" warning and return (None, None). This broke all auxiliary tasks (compression, memory, summarization) for Bedrock users — the main conversation loop worked fine, but background context management silently failed. Add an aws_sdk branch that creates an AnthropicAuxiliaryClient via build_anthropic_bedrock_client(), using boto3's default credential chain (IAM roles, SSO, env vars, instance metadata). Default auxiliary model is Haiku for cost efficiency. Closes #13919	2026-04-24 07:26:07 -07:00
Andre Kurait	b290297d66	fix(bedrock): resolve context length via static table before custom-endpoint probe ## Problem `get_model_context_length()` in `agent/model_metadata.py` had a resolution order bug that caused every Bedrock model to fall back to the 128K default context length instead of reaching the static Bedrock table (200K for Claude, etc.). The root cause: `bedrock-runtime.<region>.amazonaws.com` is not listed in `_URL_TO_PROVIDER`, so `_is_known_provider_base_url()` returned False. The resolution order then ran the custom-endpoint probe (step 2) before the Bedrock branch (step 4b), which: 1. Treated Bedrock as a custom endpoint (via `_is_custom_endpoint`). 2. Called `fetch_endpoint_model_metadata()` → `GET /models` on the bedrock-runtime URL (Bedrock doesn't serve this shape). 3. Fell through to `return DEFAULT_FALLBACK_CONTEXT` (128K) at the "probe-down" branch — never reaching the Bedrock static table. Result: users on Bedrock saw 128K context for Claude models that actually support 200K on Bedrock, causing premature auto-compression. ## Fix Promote the Bedrock branch from step 4b to step 1b, so it runs before the custom-endpoint probe at step 2. The static table in `bedrock_adapter.py::get_bedrock_context_length()` is the authoritative source for Bedrock (the ListFoundationModels API doesn't expose context window sizes), so there's no reason to probe `/models` first. The original step 4b is replaced with a one-line breadcrumb comment pointing to the new location, to make the resolution-order docstring accurate. ## Changes - `agent/model_metadata.py` - Add step 1b: Bedrock static-table branch (unchanged predicate, moved). - Remove dead step 4b block, replace with breadcrumb comment. - Update resolution-order docstring to include step 1b. - `tests/agent/test_model_metadata.py` - New `TestBedrockContextResolution` class (3 tests): - `test_bedrock_provider_returns_static_table_before_probe`: confirms `provider="bedrock"` hits the static table and does NOT call `fetch_endpoint_model_metadata` (regression guard). - `test_bedrock_url_without_provider_hint`: confirms the `bedrock-runtime.*.amazonaws.com` host match works without an explicit `provider=` hint. - `test_non_bedrock_url_still_probes`: confirms the probe still fires for genuinely-custom endpoints (no over-reach). ## Testing pytest tests/agent/test_model_metadata.py -q # 83 passed in 1.95s (3 new + 80 existing) ## Risk Very low. - Predicate is identical to the original step 4b — no behaviour change for non-Bedrock paths. - Original step 4b was dead code for the user-facing case (always hit the 128K fallback first), so removing it cannot regress behaviour. - Bedrock path now short-circuits before any network I/O — faster too. - `ImportError` fall-through preserved so users without `boto3` installed are unaffected. ## Related - This is a prerequisite for accurate context-window accounting on Bedrock — the fix for #14710 (stale-connection client eviction) depends on correct context sizing to know when to compress. Signed-off-by: Andre Kurait <andrekurait@gmail.com>	2026-04-24 07:26:07 -07:00

1 2 3 4 5 ...

5823 commits