hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-20 15:33:54 +00:00

Author	SHA1	Message	Date
LifeJiggy	6d495d9e7c	fix(approval): surface pending-approval state with explicit marker visible to LLM When a tool call requires user approval in the non-blocking gateway path, the LLM previously received a result that was indistinguishable from a failed tool call (exit_code=-1, error=message). The LLM could not tell whether the tool was pending approval, had returned empty results, or had failed silently — causing it to burn context on wrong hypotheses. Fix changes the result format to include: - status: pending_approval (clear state name) - approval_pending: True (explicit boolean for LLMs to detect) - error: cleared to empty string (removes misleading error signal) This lets the LLM reason about approval latency vs actual errors, short-circuiting the previous silent failure mode. Fixes #14806	2026-05-18 19:37:16 -07:00
sadiksaifi	523254b34a	fix(kanban): single-row horizontal scroll for board columns Switch .hermes-kanban-columns from auto-fit CSS grid to a flex row with overflow-x: auto and a hidden scrollbar (scrollbar-width / ::-webkit-scrollbar), and pin .hermes-kanban-column to flex: 0 0 280px so columns sit side-by-side at a fixed width instead of wrapping into a 2xN grid. Page vertical scroll is unaffected: each column already caps at max-height: calc(100vh - 220px), so the container never grows tall enough to introduce its own vertical scrollbar.	2026-05-18 19:36:50 -07:00
EloquentBrush0x	5cbf86f1c8	fix(acp): resolve /tmp symlink before workspace auto-approve check on macOS Path.resolve() follows the /tmp -> /private/tmp symlink on macOS, so str(path).startswith("/tmp/") is always False for temp-dir paths. The "Accept Edits" (workspace_session) mode silently refused to auto-approve every /tmp write on macOS, breaking the documented behaviour and making the existing test fail on this platform. Fix: keep the raw expanded path (pre-resolve) for the /tmp prefix check and continue using the resolved form only for the cwd relative_to() call where symlink resolution is correct behaviour.	2026-05-18 19:36:27 -07:00
burjorjee	52b049b560	fix: treat inline-shell timeout guard as timeout	2026-05-18 19:36:04 -07:00
zccyman	4e9df52d60	fix: elevate plugin discovery failures from debug to warning Plugin discovery exceptions in gateway startup (gateway/run.py) and CLI startup (hermes_cli/main.py) are caught and logged at DEBUG level, making them invisible at the default INFO log level. If any plugin import fails — syntax error, missing dependency, import cycle — operators get zero indication unless they bump the log level to DEBUG. This makes broken plugins appear enabled but silently non-functional. Change both locations to logger.warning() so failures are visible at production log levels. Closes #28137	2026-05-18 19:35:41 -07:00
Teknium	a24184f295	chore(release): alias stale-ID salvage commit for @LifeJiggy (#28317) * fix(process-registry): detach stdin from background subprocesses to prevent keyboard freeze Background process non-PTY path used stdin=subprocess.PIPE unconditionally, creating an orphan pipe that was never written to and never closed. Child processes that read stdin would block indefinitely, competing with the parent's prompt_toolkit event loop for terminal ownership and causing complete keyboard lockout. Change to stdin=subprocess.DEVNULL so children get immediate EOF on stdin reads instead of blocking forever. For interactive stdin, the PTY path (which has its own independent PTY via ptyprocess.PtyProcess.spawn) should be used instead. Fixes #17959 * chore(release): alias stale-ID salvage commit for LifeJiggy PR #28315 was salvaged with a wrong noreply numeric ID (192385615 vs the correct 141562589). The commit on main is correctly authored to LifeJiggy by username, but the noreply email doesn't match AUTHOR_MAP. Adds an alias so release-notes generation maps both forms to the same contributor. --------- Co-authored-by: LifeJiggy <192385615+LifeJiggy@users.noreply.github.com>	2026-05-18 19:35:21 -07:00
LifeJiggy	214b95392b	fix(process-registry): detach stdin from background subprocesses to prevent keyboard freeze Background process non-PTY path used stdin=subprocess.PIPE unconditionally, creating an orphan pipe that was never written to and never closed. Child processes that read stdin would block indefinitely, competing with the parent's prompt_toolkit event loop for terminal ownership and causing complete keyboard lockout. Change to stdin=subprocess.DEVNULL so children get immediate EOF on stdin reads instead of blocking forever. For interactive stdin, the PTY path (which has its own independent PTY via ptyprocess.PtyProcess.spawn) should be used instead. Fixes #17959	2026-05-18 19:34:16 -07:00
EloquentBrush0x	5766504c60	fix(gateway): align kanban artifact _IMAGE_EXTS with response dispatch _deliver_kanban_artifacts used a broader _IMAGE_EXTS that included .bmp, .tiff, and .svg. These three extensions are absent from the equivalent set in _deliver_media_from_response (line 10661), which intentionally routes them through send_document rather than send_multiple_images (comment near line 10522 notes that Telegram sendPhoto recompresses and rejects non-raster formats). Routing .svg (XML text), .bmp, or .tiff through the photo API causes send_multiple_images to raise on most platforms; the exception is caught and logged as a warning, silently dropping the artifact. Aligning the two sets ensures kanban deliverables with these extensions follow the same send_document path as regular agent responses. No behaviour change for .png/.jpg/.jpeg/.gif/.webp.	2026-05-18 19:33:53 -07:00
zccyman	7923f844fa	fix: include hermes_plugins in gateway.log component filter gateway.log uses a _ComponentFilter that only passes records from loggers starting with ('gateway',). Plugin modules are loaded under the hermes_plugins.* namespace, so all plugin log output is silently dropped from gateway.log. This makes plugin registration — which directly affects gateway hooks (pre_gateway_dispatch, transform_llm_output, etc.) — invisible in the gateway-specific log. Operators debugging gateway behavior check gateway.log and see no plugin activity, even when plugins are working correctly. Add 'hermes_plugins' to the gateway component prefixes tuple so plugin log messages appear in gateway.log. Closes #28138	2026-05-18 19:33:30 -07:00
rudi193-cmd	95846eddd2	fix(auth): treat empty credential pool entries as unauthenticated Fixes #28140 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-18 19:33:07 -07:00
02356abc	8dca28775e	fix(wecom): handle WSMsgType.CLOSING to prevent CPU spin The WeCom adapter's _read_events() loop only handled CLOSE, CLOSED, and ERROR websocket message types. When the server initiates a graceful shutdown, aiohttp returns WSMsgType.CLOSING before the connection is fully closed. This message type was not handled, causing the receive() call to return immediately in a tight loop while self._ws.closed remained False. The result was 100% CPU usage on the asyncio event loop. Add WSMsgType.CLOSING to the set of terminal message types that raise RuntimeError("WeCom websocket closed"), allowing _listen_loop() to enter its normal reconnect backoff path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-18 19:32:42 -07:00
teknium1	e73e487d40	chore(release): pre-stage AUTHOR_MAP for May 2026 LHF batch group 7 Pre-stages AUTHOR_MAP entries for 5 new contributors whose PRs are being salvaged in the May 2026 low-hanging-fruit batch (group 7). Lands ahead of the per-PR salvage PRs so they don't get blocked by AUTHOR_MAP CI. Contributors: - 02356abc (#28286 — wecom WSMsgType.CLOSING) - burjorjee (#28201 — inline-shell timeout guard) - oseftg (#28168 — natural response ending: emoji + caret) - rudi193-cmd (#28241 — empty credential pool entries) - sadiksaifi (#27982 — kanban horizontal scroll) Per references/batch-pr-salvage-may14-additions.md.	2026-05-18 19:31:00 -07:00
0xjackyang	3df699be50	chore(release): map Jack Yang contributor email Adds the contributor email mapping for Jack Yang (@0xjackyang) so future release-note generation attributes commits correctly. Salvage of #27964 by @0xjackyang.	2026-05-18 19:31:00 -07:00
teknium1	3d258097db	chore(skills/baoyu-article-illustrator): tighten description, add platforms, regen docs	2026-05-18 18:28:56 -07:00
Jim Liu 宝玉	a93de60b68	fix(skills): align article-illustrator with real Hermes tool capabilities Addresses review feedback on #13193: 1. Reference-image flow no longer assumes write_file/read_file handle binaries. vision_analyze produces a textual description; the binary is optionally copied via terminal (cp/curl). The description is what gets embedded in prompts. 2. image_generate's URL-only return is now explicit. Step 6 downloads the returned URL to local disk via terminal (curl -sSL -o ...), then verifies non-zero size before proceeding. 3. Removed "Please use nano banana pro..." line from prompts/system.md — the backend is user-configured and not agent-selectable, so routing hints in the prompt are misleading. PORT_NOTES.md updated: prompts/system.md is no longer verbatim, and the file-ops/backend-selection rows now reflect Hermes' actual tool surface (write_file/read_file for text, terminal for binaries and URL downloads, vision_analyze for reading images).	2026-05-18 18:28:56 -07:00
Jim Liu 宝玉	4bd297094a	feat(skills): adapt baoyu-article-illustrator for Hermes Adapts the upstream baoyu-article-illustrator skill (verbatim-copied in the previous commit) to Hermes' tool ecosystem, matching the pattern used by baoyu-infographic. - Metadata: openclaw → hermes; add author, license, tags, category - Triggering: slash command + CLI flags → natural language - User config: remove EXTEND.md, first-time-setup, preferences-schema - User prompts: AskUserQuestion (batched) → clarify (one at a time) - Image gen: baoyu-imagine → image_generate (describe refs in prompt text) - Platform: drop Windows/PowerShell; Linux/macOS only - File ops: switch to write_file / read_file - Watermark: opt-in per-article instead of EXTEND.md-driven - Add PORT_NOTES.md describing the adaptation and sync procedure Style, palette, and prompt/system.md reference files are verbatim copies and are the sync points with upstream.	2026-05-18 18:28:56 -07:00
Jim Liu 宝玉	680189b5de	feat(skills): add baoyu-article-illustrator skill	2026-05-18 18:28:56 -07:00
Jeffrey Quesnelle	49c8299798	Merge pull request #28169 from NousResearch/jq/install-ps1-improvements feat(install.ps1): strip BOM, add -Commit/-Tag pin params, harden git ops	2026-05-18 21:28:40 -04:00
Austin Pickett	2ef501e1f5	feat(cli): add /update slash command to CLI and TUI (#23854) * feat: add /update slash command to CLI and TUI * test(cli): add Python tests for /update slash command Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cli): address Copilot review for /update slash command Route classic CLI /update through prompt_toolkit modal confirmation and defer relaunch to the main-thread cleanup path after app.exit(). Tighten Y/n semantics, add Python wrapper and catalog coverage tests, and assert /update stays visible in the TUI command catalog. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cli): address review feedback on /update command - Replace raw input() with _prompt_text_input_modal in _handle_update_command to avoid EOF/hang/keystroke-leak races with prompt_toolkit's stdin ownership - Fix confirmation logic: only proceed on recognized affirmative aliases (y/yes/1/ok); cancel on everything else including empty string, typos, and unrecognized input — matches all other [Y/n] prompts in the codebase - Route relaunch through main-thread shutdown path: set _pending_relaunch and return False from process_command so process_loop triggers app.exit(); run() then calls relaunch() after prompt_toolkit has restored terminal modes and after cleanup — safe on both POSIX (execvp) and Windows (subprocess+exit) - Fix misleading docstring in test_update_command.py: the Vitest only covers the TypeScript slash handler that emits code 42, not the Python wrapper branch that acts on it - Rewrite tests to use SimpleNamespace pattern (like test_destructive_slash_confirm) so _prompt_text_input_modal can be stubbed directly - Add Python test for _launch_tui exit-code-42 → relaunch branch in main.py Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> * fix(cli): polish test fixtures for /update command - Remove unused _prompt_text_input from SimpleNamespace stub - Use pytest.fail sentinel in managed-install guard test to catch unexpected modal invocations Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> * chore: re-trigger CI after Copilot review fixes Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com>	2026-05-18 20:10:46 -04:00
Teknium	378bca1d2f	chore(release): add AUTHOR_MAP entry for falasi	2026-05-18 14:31:37 -07:00
falasi	43802ef3e3	fix: add default base_url_override for ollama-cloud provider	2026-05-18 14:31:37 -07:00
emozilla	a53e8ca733	feat(install.ps1): strip BOM, add -Commit/-Tag pin params, harden git ops Three install.ps1 improvements pulled from the thin-installer work on bb/gui (PR #27822) that benefit the canonical CLI install flow on main: 1. Strip UTF-8 BOM from scripts/install.ps1. The canonical 'irm <raw URL> | iex' install flow has been broken since commit `4279da4db` re-introduced a UTF-8 BOM that PR #27224 had explicitly stripped. PowerShell 5.1's 'irm' returns the response body as a string with the BOM surviving as a leading \ufeff character; 'iex' then evaluates that string and the parser chokes on the invisible character before param(), surfacing as a cascade of 'The assignment expression is not valid' errors at every param default value. File body is verified pure ASCII (no character above byte 127), so PS 5.1 with no BOM falls back to Windows-1252 decoding which is identical to ASCII for our content. Both install paths work: - 'irm ... | iex' (canonical one-liner) - 'powershell -File install.ps1' (programmatic / desktop bootstrap) 2. New -Commit and -Tag string params for reproducible pinning. Higher-precedence variants of -Branch. When set, the repository stage clones $Branch (fast partial fetch) and then 'git checkout's the exact ref. Precedence: Commit > Tag > Branch. Honoured by all three code paths: - Update path (existing valid checkout): fetch + checkout --detach <commit|tag> instead of checkout + pull. - Fresh clone: clone --branch $Branch, then post-clone 'git checkout --detach' to the requested ref. - ZIP fallback: pick archive URL for the most-specific ref (commit -> archive/<sha>.zip, tag -> archive/refs/tags/<tag>.zip, else archive/refs/heads/<branch>.zip). Used by the Hermes desktop's first-launch bootstrap to pin the .exe to the exact commit it was built against, so the cloned Hermes Agent tree always matches what the .exe was tested with. Also enables release-bundle pinning (e.g. Microsoft Store builds pinning to a release tag) and CI reproducibility. 3. EAP=Continue wrap around the new pin-step git invocations. 'git fetch origin <commit>' writes the routine 'From <url>' info line to stderr. Under the script's global $ErrorActionPreference = 'Stop' that stderr line is wrapped as an ErrorRecord and terminates the script even though fetch+checkout actually succeed. Same EAP=Stop + native-stderr footgun we hit during the install.ps1 hardening pass in Install-Uv, Test-Python, _Run-NpmInstall. Wrap both the update-path fetch/checkout block AND the post-clone pin block in $ErrorActionPreference = 'Continue' (restored in finally). Real failures still caught by $LASTEXITCODE checks.	2026-05-18 15:45:28 -04:00
Austin Pickett	6fa1701bd3	feat(web): mobile dashboard UX polish (#28127) * feat(web): mobile dashboard UX polish Bottom sheets for sidebar theme/language pickers on narrow viewports with enter/exit animation and drag-to-close; inline header badges beside titles; bottom padding on the route outlet for scroll clearance; profiles loading uses a unicode braille spinner; align profile/cron card actions to the top; viewport-fit cover and supporting layout tweaks across dashboard pages. Co-authored-by: Cursor <cursoragent@cursor.com> * Fix Nix web npm hash and mobile sheet accessibility. Align fetchNpmDeps in nix/web.nix with web/package-lock.json for CI. Improve BottomPickSheet backdrop labeling, avoid aria-hidden on the dialog during exit animation, and wire theme/language sheets with listbox semantics and localized dismiss labels. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-18 15:20:31 -04:00
HenkDz	52e3bfc2f4	feat(acp): enrich permission request cards	2026-05-18 11:47:27 -07:00
teknium1	2057977102	fix(acp): use refresh moment as updated_at on session info push Follow-up to #26543. The sessions table does not have an updated_at column (see hermes_state.py — only started_at/ended_at), so row.get('updated_at') always returned None and the str() coercion was dead code. Use datetime.now(UTC).isoformat() instead, which reflects exactly what the field means here: 'the title was refreshed at this moment'. Drop the dead coercion.	2026-05-18 11:46:04 -07:00
HenkDz	741a349458	fix(acp): refresh session info after auto-title	2026-05-18 11:46:04 -07:00
teknium1	eda1c97a1e	fix(acp): also mark raised-exception tool results as failed Extends #26573 to also catch the case the original PR deliberately left out: when a tool raises an exception, the agent's tool executor wraps it in a canonical 'Error executing tool '<name>': ...' string prefix (see agent/tool_executor.py around the try/except). That prefix is unique to the wrapper and cannot legitimately appear in well-behaved tool output, so it is a safe signal that the tool blew up. Without this, the canonical 'tool raised' case still rendered as a green 'completed' row in Zed despite being a runtime failure — exactly the class of bug #26573 set out to fix. Adds a positive test (raised-exception prefix -> failed) and a negative test (bare 'Error:' word in legit tool output stays completed) so a future contributor doesn't accidentally widen the rule to false-positive on compiler/linter diagnostics.	2026-05-18 11:43:45 -07:00
HenkDz	9cf1140caa	fix(acp): treat polished tool error payloads as failed	2026-05-18 11:43:45 -07:00
HenkDz	b38d2d133b	fix(acp): mark failed tool completions	2026-05-18 11:43:45 -07:00
HenkDz	375c7f9cc3	fix(acp): render structured JSON tool output	2026-05-18 11:41:57 -07:00
墨綠BG	50e93f23f2	🐛 fix(memory): require newline after context tag	2026-05-18 10:53:08 -07:00
墨綠BG	341c8d3030	🐛 fix(memory): keep inline memory-context mentions visible	2026-05-18 10:53:08 -07:00
teknium1	956dd44625	chore(release): add AUTHOR_MAP entry for dskwe	2026-05-18 10:51:15 -07:00
Ryan Lee	6143ce1546	fix(url_safety): block IPv4-mapped IPv6 addresses to prevent SSRF bypass	2026-05-18 10:51:15 -07:00
alt-glitch	e3f391c1ac	test(cron): cover profile + workdir combined scenario	2026-05-18 17:39:50 +00:00
alt-glitch	ef5fe8dfaf	fix(cron): gracefully degrade when runtime profile is deleted Instead of raising FileNotFoundError (which silently bricks the job), log a warning and fall back to the scheduler default home. Validation at create/update time still catches typos. Idea from PR #19958.	2026-05-18 17:39:50 +00:00
alt-glitch	1d74d7f73a	fix(cron): use delta-based env restore instead of clear+update Avoids a brief window where other threads see an empty os.environ during profile job teardown. Idea from PR #19958.	2026-05-18 17:39:50 +00:00
alt-glitch	1f9b2e4d0b	chore: add gianfrancopiana to AUTHOR_MAP	2026-05-18 17:39:50 +00:00
Gianfranco Piana	9c48d47aaf	fix(cron): isolate profile job env	2026-05-18 17:39:50 +00:00
Gianfranco Piana	544406ef23	fix: avoid process-wide cron profile home mutation	2026-05-18 17:39:50 +00:00
Gianfranco Piana	bb9ecb2178	feat: add cron job profile support	2026-05-18 17:39:50 +00:00
teknium1	47bc8e080d	chore(release): AUTHOR_MAP noreply entry for Slimydog21	2026-05-18 10:37:35 -07:00
Slimydog21	aae1615977	fix(xai-responses): strip enum values containing '/' from tool schemas xAI's /v1/responses and /v1/chat/completions endpoints reject tool schemas whose enum values contain a forward slash with a generic HTTP 400 'Invalid arguments passed to the model.' before any token is emitted — the schema compiler trips on the '/' character regardless of where it appears. Most commonly hit by MCP-derived tools whose enum lists HuggingFace model IDs ('Qwen/Qwen3.5-0.8B', 'openai/gpt-oss-20b') or owner/name environment identifiers. Mirrors the existing strip_pattern_and_format sanitizer (PR for #27197). The new strip_slash_enum walks tool parameters and drops the entire enum keyword when any value contains '/' — keeping it partial would still 400 since xAI's failure is all-or-nothing on the enum. The field description still reaches the model so the prompting hint is preserved. Wired in at both code paths for parity: - agent/chat_completion_helpers.py (main agent xAI Responses path) - agent/auxiliary_client.py (aux client xAI Responses path, matching the same parity guarantee `2fae8fba9` established for pattern/format) Salvaged from #28021 by @Slimydog21 — contributor's branch was severely stale (would have reverted ~5000 LOC across azure/kanban/i18n); fix re-applied surgically on current main with their sanitizer + 9 tests preserved verbatim. Author noreply email used (original was a Mac hostname leak).	2026-05-18 10:37:35 -07:00
EloquentBrush0x	d9331eecee	fix(minimax-oauth): quarantine dead tokens on terminal refresh failure resolve_minimax_oauth_runtime_credentials called _refresh_minimax_oauth_state without a try/except, so a terminal failure (invalid_grant, refresh_token_reused, invalid_refresh_token) raised AuthError but left the dead refresh_token in auth.json. Every subsequent API call retried the same token via a network round-trip, failing identically each time. Fix: wrap the refresh call and, when exc.relogin_required is True and a refresh_token is present, clear the dead OAuth fields (access_token, refresh_token, expires_*) and write a last_auth_error quarantine marker to auth.json before re-raising. The next call sees no access_token and fails fast with 'not_logged_in' — no network retry — and the user is prompted to re-authenticate. Mirrors the existing quarantine pattern for Nous (_quarantine_nous_oauth_state), xAI-OAuth (#28116), and Codex-OAuth (#28118). Persist failure is best-effort (logged at DEBUG, error still re-raised). Salvaged from #28003 by @EloquentBrush0x — contributor's branch was severely stale (would have reverted ~5000 LOC across azure/kanban/i18n subsystems); fix re-applied surgically with their pattern preserved and added two regression tests (terminal-quarantines + transient-does-not-quarantine).	2026-05-18 10:34:03 -07:00
EloquentBrush0x	b570e0fdd0	fix(codex-oauth): quarantine terminal refresh errors so dead tokens are not replayed across sessions When a Codex OAuth refresh token is permanently invalidated (HTTP 400/401/403, token revoked or reused), _mark_exhausted was called but auth.json was left with the dead credentials. On the next session, _seed_from_singletons re-read auth.json and re-seeded the pool with the same revoked token, triggering the same terminal failure in a loop. Add _is_terminal_codex_oauth_refresh_error to auth.py and a matching quarantine block in _refresh_entry: when a terminal error is detected and auth.json holds no newer tokens, clear access_token/refresh_token from auth.json and remove all device_code-sourced pool entries from memory. Mirrors the Nous quarantine added in `c90556262` and the xAI quarantine in #28116. Also add a pre-refresh sync from auth.json before calling refresh_codex_oauth_pure, matching the xAI and Nous patterns, to avoid refresh_token_reused races when multiple Hermes processes share the same auth.json singleton. Salvaged from #27911 by @EloquentBrush0x — contributor's branch was severely stale (would have reverted ~5000 LOC across azure/kanban/i18n subsystems); fix re-applied surgically on current main with their predicate and tests preserved.	2026-05-18 10:31:40 -07:00
Teknium	9aae59feab	fix(compress): make abort-on-summary-failure opt-in via config flag (#28117) PR #28102 made the summary-failure abort path the unconditional default, changing established behavior. Gate it behind config.yaml flag `compression.abort_on_summary_failure` (default False = historical fallback-placeholder behavior). - hermes_cli/config.py: new `compression.abort_on_summary_failure` key, default False, documented inline. - agent/agent_init.py: read the flag from compression config and pass to ContextCompressor. - agent/context_compressor.py: `__init__` accepts `abort_on_summary_failure` (default False). `compress()` failure branch gates the abort on the flag; when False, falls through to the restored legacy fallback path (static "summary unavailable" placeholder + drop middle window). - tests: restore original fallback expectations as default; add new TestAbortOnSummaryFailure class for the opt-in mode. Gateway/CLI plumbing (force=True on /compress, hygiene/handler abort detection, locale `gateway.compress.aborted` key) from PR #28102 stays intact — those paths only fire when `_last_compress_aborted` is True, which now only happens when the flag is enabled.	2026-05-18 10:28:20 -07:00
EloquentBrush0x	5e40f83cb7	fix(xai-oauth): quarantine terminal refresh errors so dead tokens are not replayed across sessions When refresh_xai_oauth_pure raises a terminal error (HTTP 400/401/403, i.e. revoked or reused refresh token), _refresh_entry's existing race-recovery path re-syncs from auth.json and returns if another process has already rotated the tokens. If auth.json still holds the same stale token pair, the function fell through to _mark_exhausted — leaving the dead credentials in auth.json. On the next Hermes startup _seed_from_singletons re-seeded the pool from those stale tokens, causing the same failure loop on every session. Fix: after the auth.json re-sync check in the xAI-oauth error handler, detect terminal errors with the new _is_terminal_xai_oauth_refresh_error helper and apply a quarantine: - Clear access_token and refresh_token from providers["xai-oauth"]["tokens"] in auth.json so they are not re-seeded. - Write a last_auth_error entry for hermes doctor / auth status diagnostics. - Remove all loopback_pkce entries from the in-memory pool so the current session stops retrying with the dead credentials. Mirrors the identical quarantine already in place for Nous OAuth (`c90556262`). Closes the parity gap introduced when `c90556262` added Nous-only terminal error handling without a corresponding xAI-oauth path.	2026-05-18 10:28:09 -07:00
konsisumer	226680500d	fix(auth): improve xAI OAuth SSH hint with visual header and auto-detected host	2026-05-18 10:26:55 -07:00
briandevans	bf6eeb3f93	fix(xai-oauth): show "not received" page when loopback callback has no code When xAI's auth backend fails to redirect (e.g. the German "We couldn't reach your app" fallback shown in #27385), users sometimes navigate manually to the bare loopback callback URL — `http://127.0.0.1:<port>/callback` with no query string. The handler used to return 200 "xAI authorization received" for any GET that hit the expected path, because `parse_qs("")` yields no `code` and no `error`, leaving `result` untouched while the success page was still served. The CLI's wait loop, of course, still saw no code and timed out with `AuthError: xAI authorization timed out waiting for the local callback.` The user is left looking at a browser tab that claims success and a terminal that says failure — exactly the contradiction in #27385. This change makes the empty-callback case return 400 with an explicit "not received" page and a hint to retry `hermes auth add xai-oauth`. The wait-loop semantics are unchanged: `result["code"]` and `result["error"]` both stay None, so the CLI still raises a real timeout rather than treating the bare hit as a successful callback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 10:26:00 -07:00
EloquentBrush0x	1fabd6e100	fix(error_classifier): classify xAI Grok entitlement SSE errors as auth When xAI returns a subscription/entitlement error through an SSE ``type=error`` frame, ``_StreamErrorEvent`` is raised with ``status_code=None``. This caused ``_classify_by_status`` (step 2 of ``classify_api_error``) to be skipped entirely, and the Grok-specific phrases ("do not have an active Grok subscription", "out of available resources") appeared in none of the message-pattern lists. The error fell through to ``FailoverReason.unknown (retryable=True)``, burning ``max_retries`` on every affected X Premium+ / SuperGrok user before the agent stopped — and ``_is_entitlement_failure`` was never called because it only fires under ``FailoverReason.auth``. The HTTP 403 path already handled this correctly (``_classify_by_status`` returns ``auth/non-retryable`` for 403). Add an explicit pattern block at step 1 (highest priority, before the ``status_code`` guard) so both code paths route to ``FailoverReason.auth, retryable=False, should_fallback=True`` — matching the 403 path exactly. Add three regression tests in ``Fix D`` section of ``test_codex_xai_oauth_recovery.py``: - primary "do not have an active Grok subscription" phrase - "out of available resources" + "grok" variant - unrelated ``_StreamErrorEvent`` must not be reclassified	2026-05-18 10:24:13 -07:00
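
The enum sanitizer described in commit aae1615977 can be illustrated with a rough sketch. The name `strip_slash_enum` and the rule (drop the whole `enum` keyword when any value contains '/', keep the description) come from the commit message; the function body, its signature, and the sample schema below are assumptions for illustration and are not the code in the repository.

```python
from copy import deepcopy
from typing import Any


def strip_slash_enum(schema: Any) -> Any:
    """Return a copy of `schema` with slash-bearing enums removed (hypothetical sketch)."""
    if isinstance(schema, list):
        return [strip_slash_enum(item) for item in schema]
    if not isinstance(schema, dict):
        return deepcopy(schema)
    cleaned = {}
    for key, value in schema.items():
        if (
            key == "enum"
            and isinstance(value, list)
            and any(isinstance(v, str) and "/" in v for v in value)
        ):
            # Drop the entire keyword: per the commit message, a partially
            # trimmed enum would still be rejected.
            continue
        cleaned[key] = strip_slash_enum(value)
    return cleaned


# Hypothetical tool parameters; the model IDs are the ones quoted in the commit message.
params = {
    "type": "object",
    "properties": {
        "model": {
            "type": "string",
            "description": "HuggingFace model ID",
            "enum": ["Qwen/Qwen3.5-0.8B", "openai/gpt-oss-20b"],
        },
        "mode": {"type": "string", "enum": ["fast", "slow"]},
    },
}
cleaned = strip_slash_enum(params)
```

In this sketch the input is copied rather than mutated, so the same schema can still be sent unchanged to providers that accept slashes in enum values.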

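For commit 6143ce1546 (blocking IPv4-mapped IPv6 addresses), a minimal sketch of the general technique using Python's standard `ipaddress` module follows. The helper name `is_blocked_address` and the exact set of blocked ranges are assumptions; the log does not show how the repository's url_safety module implements the check.

```python
import ipaddress


def is_blocked_address(host: str) -> bool:
    """Hypothetical check: True when `host` is an IP literal in a non-public range."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; hostname resolution is out of scope here
    # Unwrap ::ffff:a.b.c.d so the embedded IPv4 address is the one that gets checked.
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

Without the unwrapping step, a literal such as `::ffff:127.0.0.1` could slip past a filter that only inspects IPv4 addresses, which is the bypass the commit title refers to.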

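Commit 1d74d7f73a replaces a clear+update restore of `os.environ` with a delta-based one. One way such a restore might look is sketched below; the context-manager shape and the name `temporary_env` are assumptions, and the actual cron code may compute its delta differently.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(overrides: dict):
    """Apply `overrides` to os.environ, then undo only the keys that were touched."""
    missing = object()  # sentinel for "key was not set before"
    previous = {key: os.environ.get(key, missing) for key in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        # Restore key by key; os.environ is never emptied, so other threads
        # do not observe a window with no variables at all.
        for key, old in previous.items():
            if old is missing:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old
```

Usage would be along the lines of `with temporary_env({"HOME": "/tmp/profile-home"}): ...`, where the path is a made-up example.
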
8787 commits