hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
Teknium	0b6ace6498	test(verbose): align with telegram tier-1 inbox default Two tests in test_verbose_command.py asserted Telegram's tool_progress default was "new" and expected /verbose to cycle that to "all". The default has since been overridden to "off" in gateway/display_config.py (_PLATFORM_DEFAULTS for telegram — tier-1 inbox preset that keeps mobile chats final-answer-first), making the first /verbose invocation cycle off → new, not all → verbose. The behavioral change was intentional; the tests were stale and missing from the same commit. Surfaced as a pre-existing failure on origin/main during CI for the unrelated #33164 / #33168 Codex auth salvages.	2026-05-27 03:13:15 -07:00
konsisumer	f1422ffd77	fix(gateway): classify Codex 429 quota as rate-limit, not missing credentials When the Codex OAuth token endpoint returns 429 (usage-limit / quota exhaustion), refresh_codex_oauth_pure raised a generic auth error that the gateway surfaced as 'Primary provider auth failed: No Codex credentials stored. Run hermes auth', prompting re-auth that cannot lift a quota cap. Classify 429 distinctly (codex_rate_limited, relogin_required=False) with a non-alarming quota message that honors Retry-After, log it as 'Primary provider rate-limited (429)', and stop format_auth_error from appending the re-authenticate remediation. Also log the fallback provider's literal config key instead of the resolved runtime category. Refs #32790	2026-05-27 03:13:15 -07:00
konsisumer	2bbd53493d	fix(cli): sync credential_pool on Codex re-auth Codex re-auth via `hermes setup` / `hermes model` wrote fresh OAuth tokens to providers.openai-codex.tokens but left the credential_pool device_code entry holding the consumed refresh token and stale error markers. Since the runtime selects from the pool, the next request spent a dead token and got a 401 token_invalidated. Update the singleton-seeded pool entries in lockstep and clear their error state. Fixes #33000	2026-05-27 03:02:06 -07:00
houenyang-momo	60f84c6c28	gateway: quiet Telegram operational chatter	2026-05-27 02:41:24 -07:00
Robert DaSilva	efa952531b	fix: ignore Telegram start pings	2026-05-27 02:41:24 -07:00
sir-ad	8807b1c727	fix(gateway): hide telegram compaction status noise	2026-05-27 02:41:24 -07:00
chaconne67	9c69204d87	fix(codex_responses_adapter): drop foreign-issuer reasoning on replay reasoning.encrypted_content is sealed to the Responses endpoint that minted it. When a session switches model providers mid-conversation — say the user runs /model gpt-5.5 after several turns on grok-4.3, or vice versa — the persisted codex_reasoning_items carry blobs the new endpoint cannot decrypt, and every subsequent turn fails with HTTP 400 invalid_encrypted_content. This is the cross-issuer prevention layer. Pairs with: * PR #33035 — runtime recovery when the HTTP 400 fires anyway * PR #33146 — prevention for transient rs_tmp_* items Stamps each reasoning item with the issuer kind that minted it (codex_backend / xai_responses / github_responses / other:<url>) at normalize time, then drops items at replay time when the active endpoint differs from the stamp. Unstamped (legacy) items pass through for backwards compatibility. Cherry-picked from @chaconne67's PR #31629. Conflict against current main (#33035's replay_encrypted_reasoning parameter) resolved as 'keep both' — the two guards compose: replay_encrypted_reasoning=False is the session-wide kill switch, current_issuer_kind is the per-item filter that runs only when replay is still enabled.	2026-05-27 02:40:03 -07:00
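A minimal sketch of the per-item replay filter described in commit 9c69204d87. It assumes each persisted reasoning item is a dict and that the stamp lives under a hypothetical `issuer_kind` key; the function name is also illustrative, since the log does not give the real ones.

```python
from typing import Any, Dict, List


def filter_reasoning_for_replay(
    items: List[Dict[str, Any]], current_issuer_kind: str
) -> List[Dict[str, Any]]:
    """Drop reasoning items minted by a different issuer than the active endpoint.

    Unstamped (legacy) items pass through, as the commit message describes.
    The "issuer_kind" key is an assumption made for this sketch.
    """
    kept = []
    for item in items:
        stamp = item.get("issuer_kind")
        if stamp is None or stamp == current_issuer_kind:
            kept.append(item)
    return kept


history = [
    {"id": "a", "issuer_kind": "codex_backend"},
    {"id": "b", "issuer_kind": "xai_responses"},
    {"id": "c"},  # legacy item without a stamp
]
replayable = filter_reasoning_for_replay(history, "codex_backend")
```

Here only the item stamped by the other issuer is dropped; the matching item and the legacy item are kept.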
Krishna	b1a46b3047	fix(codex): drop transient rs_tmp reasoning replay state	2026-05-27 02:25:59 -07:00
Teknium	187cf0f257	tools(terminal): nudge homebrewed CI pollers at the tool surface (#33142) Background processes whose command contains `gh pr view --json statusCheckRollup` or `gh pr checks | jq` now get a runtime hint in the result pointing at the canonical green-ci-policy snippets. The homebrew shape has caused at least seven silent CI-watcher failures in the past two weeks (#31329, #31448, #31695, #31709, #31745, #32264, #33131) — each one a different jq/awk/grep variation of the same fundamental problem (stdout buffering, jq null-key edge cases, conclusion-vs-status confusion, TTY-only banner grepping). The skill that documents this anti-pattern is excellent, but a skill only fires if the agent loads it. The tool surface fires on every misuse. This is the embed-footguns-in-tool-surface pattern from PR #31289 applied to a recurring failure mode that's outgrown skill-only enforcement. Detector is deliberately narrow — flags two specific shapes: 1. Any command containing `statusCheckRollup` (the JSON-API path — conclusion vs status field semantics keep burning us). 2. `gh pr view` / `gh pr checks` combined with `jq` (gh pr checks doesn't emit JSON, so any `| jq` here is confused intent; the canonical column-2 poller uses awk-on-tabs, not jq). Does NOT flag the blessed column-2 awk-on-tabs poller (which uses `awk -F"\t" "\==\"pending\""`) or the exit-code-driven `gh pr checks $PR >/dev/null` snippet. Hint composes with the existing background-without-notify_on_complete hint — both can fire on the same call. Each is independently actionable.
Tests:
- 4 new cases in tests/tools/test_notify_on_complete.py
  - test_homebrew_ci_poller_via_statusCheckRollup_emits_hint (positive)
  - test_homebrew_ci_poller_via_gh_pr_checks_piped_to_jq_emits_hint (positive)
  - test_canonical_column2_awk_poller_does_not_emit_homebrew_hint (negative)
  - test_canonical_gh_pr_checks_exit_code_loop_does_not_emit_hint (negative)
  - test_non_ci_background_command_does_not_emit_homebrew_hint (negative)
- 30/30 passing (was 26)	2026-05-27 02:22:08 -07:00
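The two shapes the detector in commit 187cf0f257 flags can be sketched as below. This is an illustration under assumptions, not the shipped detector: the function name and the regular expressions are hypothetical, chosen only to match the two rules stated in the message.

```python
import re

# Assumed patterns: `gh pr view` / `gh pr checks`, and a standalone `jq` word.
_GH_PR = re.compile(r"\bgh\s+pr\s+(?:view|checks)\b")
_JQ = re.compile(r"\bjq\b")


def looks_like_homebrew_ci_poller(command: str) -> bool:
    """Return True for the two narrow shapes named in the commit message."""
    # Shape 1: any command that touches the statusCheckRollup JSON field.
    if "statusCheckRollup" in command:
        return True
    # Shape 2: gh pr view / gh pr checks combined with jq.
    return bool(_GH_PR.search(command) and _JQ.search(command))
```

The exit-code-driven `gh pr checks $PR >/dev/null` loop and an awk-based poller contain neither `statusCheckRollup` nor `jq`, so they are not flagged by this sketch.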
Ben	a890389b69	feat(dashboard-auth): HERMES_DASHBOARD_PUBLIC_URL / dashboard.public_url override Operators behind reverse proxies that don't reliably forward X-Forwarded-Host / X-Forwarded-Proto / X-Forwarded-Prefix (manual nginx setups, on-prem ingresses, custom-domain Fly deploys with incomplete proxy chains) had no way to force the absolute base URL the OAuth callback redirects from. The dashboard would reconstruct the redirect_uri from request headers, the IDP would echo it back, and the user would land on the wrong host or wrong path — 404. Add `dashboard.public_url` to config.yaml with env override HERMES_DASHBOARD_PUBLIC_URL. When set, it is the complete authority — scheme + host + optional path prefix (e.g. https://example.com/hermes) — and becomes the base for the OAuth `redirect_uri`. X-Forwarded-Prefix is IGNORED on this code path because the operator has explicitly declared the public URL; we no longer need to guess from proxy headers, and stacking the prefix on top would double-prefix the common case where the prefix is already baked into public_url. When unset, the existing proxy_headers + X-Forwarded-Prefix reconstruction runs untouched. Existing Fly.io deploys continue to work without configuration — this is purely additive. Precedence mirrors dashboard.oauth.client_id: env (non-empty) > config.yaml > reconstructed from request Implementation: - hermes_cli/config.py: add dashboard.public_url to DEFAULT_CONFIG with a multi-paragraph doc comment explaining the use case, the X-Forwarded-Prefix interaction, and the validation rules. - hermes_cli/dashboard_auth/prefix.py: factored out the existing _REJECT_CHARS frozenset, added _normalise_public_url() validator (requires http/https scheme + non-empty host + no header-injection chars), _load_dashboard_section() loader (robust to load_config raising, non-dict shapes), and resolve_public_url() entry point with the env-overrides-config precedence. 
A malformed value silently falls through to ""; the caller treats "" as "reconstruct from request" so a typo never breaks the login flow. - hermes_cli/dashboard_auth/routes.py: rewrite _redirect_uri() docstring to spell out the three resolution tiers; add the public_url short-circuit before the existing X-Forwarded-Prefix splicing. Source-level comment notes that X-Forwarded-Prefix is intentionally ignored when public_url is set so a future reader doesn't try to "fix" the missing prefix layering. - cli-config.yaml.example: extend the existing dashboard section with a public_url block. - website/docs/user-guide/features/web-dashboard.md: new "Public URL override" section between the provider configuration and the OAuth flow walkthrough. Documents the env-vs-config table, the validation rules, and the `http://` `public_url` ↔ Secure cookie footgun. Test coverage — new TestPublicUrlOverride class (8 tests): - env var overrides request reconstruction (the primary motivating case) - config.yaml used when env unset - env wins over config (precedence pin) - public_url with a path prefix already baked in (the Q1-a case the user explicitly chose) - public_url suppresses X-Forwarded-Prefix layering (defends against the double-prefix bug) - trailing slash stripped from public_url (no //auth/callback) - malformed public_url falls through to reconstruction (six hostile inputs: javascript:, ftp:, missing scheme, missing host, quote chars, CRLF injection) - empty env string doesn't shadow config.yaml entry (CI / Fly provisioned-but-empty secret case) Mutation-tested: flipping the precedence in resolve_public_url() trips exactly test_env_overrides_config_public_url; weakening the validator (accept any scheme) trips exactly test_malformed_public_url_falls_through_to_reconstruction. Both other tests in each pair stay green, confirming the suite discriminates the specific regression each test pins.	2026-05-27 02:12:27 -07:00
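The validation and fall-through behaviour described for `dashboard.public_url` in commit a890389b69 could look roughly like this. It is a sketch under assumptions: the contents of the reject set, the function names, and the `/auth/callback` suffix helper are illustrative, and the real `_normalise_public_url()` may differ.

```python
from urllib.parse import urlsplit

# Assumed reject set: characters that could enable header or attribute injection.
_REJECT_CHARS = frozenset("\"'<>\r\n\t ")


def normalise_public_url(raw: str) -> str:
    """Return a cleaned base URL, or "" so the caller reconstructs from the request."""
    value = (raw or "").strip()
    if not value or any(ch in _REJECT_CHARS for ch in value):
        return ""
    try:
        parts = urlsplit(value)
    except ValueError:  # e.g. a malformed IPv6 literal
        return ""
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return ""
    return value.rstrip("/")  # avoids "//auth/callback" later


def redirect_uri_from_public_url(public_url: str) -> str:
    """Build the OAuth callback URL from an already-normalised base (hypothetical helper)."""
    return f"{public_url}/auth/callback"
```

Returning an empty string rather than raising matches the stated design goal that a typo in the setting never breaks the login flow.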
Ben	61dcc33893	feat(dashboard-auth): config.yaml as canonical surface for dashboard.oauth Per AGENTS.md, ~/.hermes/.env is reserved for API keys / secrets and config.yaml is the surface for non-secret configuration. The Nous Portal plugin previously read HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL from the environment only, which forced local-dev / on-prem operators to put non-secret per-instance configuration in .env — violating the convention. Add dashboard.oauth.{client_id,portal_url} to DEFAULT_CONFIG and have the plugin resolve each setting with env-overrides-config precedence: 1. Env var when set to a non-empty value (Fly.io platform-secret injection — what pushes per-deploy client_ids without baking them into the image). 2. config.yaml entry (canonical surface for local dev / on-prem). 3. Plugin default (no provider registered when client_id is empty; portal_url defaults to https://portal.nousresearch.com). Empty env values are explicitly treated as unset so a provisioned-but-not-populated Fly secret can't accidentally shadow a valid config.yaml entry with an empty string — operators would otherwise lose the gate. Implementation: - hermes_cli/config.py: add dashboard.oauth.{client_id,portal_url} block to DEFAULT_CONFIG with full doc comment explaining the override precedence and Fly.io rationale. - plugins/dashboard_auth/nous/__init__.py: add _load_config_oauth_section, _resolve_client_id, _resolve_portal_url helpers; replace the two direct os.environ.get() calls in register() with the resolvers. Update the skip-reason string to mention BOTH surfaces so an operator looking at the fail-closed bind error knows config.yaml is a valid alternative to the env var. - plugins/dashboard_auth/nous/plugin.yaml: update description to name both surfaces. requires_env stays pointing at the env var name — it's metadata-only (not used by the plugin loader for gating) so this is documentation/UX, not enforcement.
- cli-config.yaml.example: append commented dashboard.oauth block with the same override rationale operators see in code. - website/docs/user-guide/features/web-dashboard.md: rewrite the 'Default provider: Nous Research' section to lead with config.yaml, present env vars as operator overrides (Fly.io's primary path). Updated the example fail-closed bind error to match the new skip-reason text. Test coverage — new TestConfigYamlSource class (8 tests) pinning every tier of the precedence chain: - config-yaml-only path registers correctly - both config-yaml fields (client_id + portal_url) honoured - env var overrides config for client_id (Fly.io critical path) - env var overrides config for portal_url - empty env string does NOT shadow config (CI/Fly edge case) - neither source set → skip with reason mentioning BOTH surfaces - load_config() raising falls through to env-only path (resilience) - non-dict oauth section falls through cleanly (typo resilience) Mutation-tested: flipping the precedence to config-wins-over-env trips exactly test_env_overrides_config_client_id while the other 7 stay green, confirming the suite discriminates the order, not just the sources. This closes the last item in Teknium's PR review (PR #30156).	2026-05-27 02:12:27 -07:00
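The three-tier precedence in commit 61dcc33893 (non-empty env var, then config.yaml, then plugin default) can be sketched as a single resolver. The function name and signature are hypothetical; only the env var names, the config keys, and the default portal URL come from the commit message.

```python
import os
from typing import Any, Mapping, Optional

DEFAULT_PORTAL_URL = "https://portal.nousresearch.com"


def resolve_setting(
    env_name: str,
    config_section: Any,
    key: str,
    default: str = "",
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve one setting: non-empty env var > config.yaml entry > default."""
    env = os.environ if environ is None else environ
    env_value = (env.get(env_name) or "").strip()
    if env_value:  # an empty env value is treated as unset
        return env_value
    # A non-dict section (e.g. a typo in config.yaml) falls through cleanly.
    if isinstance(config_section, Mapping):
        cfg_value = config_section.get(key)
        if isinstance(cfg_value, str) and cfg_value.strip():
            return cfg_value.strip()
    return default


oauth_cfg = {"client_id": "agent:from-config"}
client_id = resolve_setting(
    "HERMES_DASHBOARD_OAUTH_CLIENT_ID", oauth_cfg, "client_id",
    environ={"HERMES_DASHBOARD_OAUTH_CLIENT_ID": ""},
)
```

In this usage the env var is provisioned but empty, so the config.yaml value is used instead of being shadowed.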
Ben	b26d81d536	feat(dashboard-auth): honour X-Forwarded-Prefix + __Host-/__Secure- cookies Mission-control style deploys reverse-proxy the dashboard at a path prefix (e.g. mission-control.tilos.com/hermes/* -> :9119) and inject X-Forwarded-Prefix: /hermes on every request. The SPA mount already honoured this for asset URLs and the bootstrap __HERMES_BASE_PATH__, but the OAuth gate didn't: 1. The gate's Location: header to /login and the 401 envelope's login_url were built bare ("/login?next=..."). Under a /hermes prefix the browser follows that to mission-control.tilos.com/login which the proxy doesn't route to the dashboard. 2. _redirect_uri (the OAuth callback URL handed to the IDP) used request.url_for() which doesn't honour X-Forwarded-Prefix (Starlette/uvicorn proxy_headers only covers Host + Proto + For). The IDP redirects back to /auth/callback instead of /hermes/auth/callback → 404 in the user's browser. 3. Cookies were set with Path=/ which leaks them to other apps on the same origin and won't be sent back on requests under the prefix in the first place. Fix threads the normalised prefix through every boundary: * New hermes_cli/dashboard_auth/prefix.py — single source of truth for X-Forwarded-Prefix parsing. web_server._normalise_prefix becomes a re-export so the SPA mount, the gate, and the cookies helper all agree. * middleware._unauth_response builds login_url = f"{prefix}/login". * routes._redirect_uri splices the prefix into the path component of the IDP-bound URL (with full validation of the header). * cookies.{set,clear}_{session,pkce}_cookie now take prefix="". Path attribute switches to /hermes when set; cookie name switches variant (see below). Every caller passes the request's normalised prefix. Cookie hardening (Teknium's lesser-note #1 in the PR review): adopt the __Host- / __Secure- cookie name prefixes per draft-west-cookie-prefixes.
The variant is selected from (use_https, prefix): * Loopback HTTP → bare "hermes_session_at" (both prefixes require Secure, incompatible with HTTP). * HTTPS, direct deploy (Path=/) → "__Host-hermes_session_at". Strongest spec: bound to exact origin, no Domain attribute, Secure required. * HTTPS, behind a proxy prefix (Path=/hermes) → "__Secure-hermes_session_at". __Host- forbids Path != "/"; the explicit Path=/hermes covers same-origin app isolation. Setter and reader BOTH consult the prefix because the cookie name changes — a reader that looked up the bare name when the setter wrote __Secure- would never find the value. The reader falls back across all three variants so a request whose shape changed mid-session (e.g. post-deploy from no-prefix to /hermes) still picks up the existing cookie until it expires. Test coverage: - tests/hermes_cli/test_dashboard_auth_prefix.py — new file. 11 tests pinning: • Location: /hermes/login on the gate's HTML redirect • 401 envelope login_url carries the prefix • Malformed X-Forwarded-Prefix is ignored (header-injection defence; the script-tag value is normalised to empty string) • _redirect_uri splices /hermes into the path (the property that prevents the IDP-returns-to-404 failure) • PKCE cookie uses Path=/hermes + __Secure- when proxied • Session cookies use __Host- when direct, __Secure- when proxied, bare on loopback HTTP • End-to-end round trip with hand-managed PKCE cookie carriage (TestClient can't simulate a Path=/hermes cookie automatically) - tests/hermes_cli/test_dashboard_auth_cookies.py — rewritten to pin each (use_https, prefix) shape produces its expected cookie name, plus reader-side coverage that __Host- and __Secure- variants are both recognised. - Existing tests across middleware / 401-reauth / etc. updated to match the new cookie names (substring contains instead of startswith). 
Mutation-tested: reverting _unauth_response to build the bare "/login" URL trips exactly the two tests that pin the prefix carriage, confirming the suite discriminates the regression.	2026-05-27 02:12:27 -07:00
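The (use_https, prefix) selection rule from commit b26d81d536 is small enough to sketch directly. The helper names below are hypothetical; the cookie name `hermes_session_at` and the three variants are taken from the commit message.

```python
def session_cookie_name(
    use_https: bool, prefix: str = "", base: str = "hermes_session_at"
) -> str:
    """Pick the cookie name variant for a given deployment shape."""
    if not use_https:
        # Both __Host- and __Secure- require the Secure attribute,
        # which is incompatible with plain HTTP on loopback.
        return base
    if prefix:
        # __Host- forbids Path != "/", so a proxied prefix uses __Secure-.
        return f"__Secure-{base}"
    return f"__Host-{base}"


def session_cookie_path(prefix: str = "") -> str:
    """Path attribute: the normalised proxy prefix when set, otherwise '/'."""
    return prefix or "/"


def candidate_cookie_names(base: str = "hermes_session_at") -> list:
    """Reader-side fallback order across all three variants (order is an assumption)."""
    return [f"__Host-{base}", f"__Secure-{base}", base]
```

A reader that tries every name in `candidate_cookie_names()` still finds a cookie written before the deployment shape changed, which is the fallback behaviour the message describes.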
Ben	034ad95fed	fix(dashboard-auth): propagate next= through login page + PKCE cookie The gate's _unauth_response set next=<path> on the /login redirect URL, but nothing downstream read it: render_login_html ignored next=, auth_login dropped it, and auth_callback read next= from its own query string — which an IDP never sets on the callback URL (real IDPs only echo back code+state). The _validate_post_login_target plumbing in the callback was unreachable on the happy path, so users always landed on "/" regardless of what they originally requested. Worse: reading next= from the callback URL was a latent open-redirect sink, since an attacker could craft /auth/callback?...&next=/admin and have the server honour it post-auth. Fix carries next= through the round trip on a server-controlled channel: 1. login_page reads request.query_params['next'] and passes it (post-validation) to render_login_html. 2. render_login_html threads next= URL-encoded into each provider button's href, with HTML-attribute escaping as defence in depth. 3. auth_login accepts ?next= as a query param, re-validates, and appends it as a fourth segment (next=<urlquoted>) in the PKCE cookie payload alongside provider/state/verifier. 4. auth_callback no longer accepts a next: str = "" query param. It parses next= out of the PKCE cookie and validates that with the same same-origin rules. Any attacker-supplied ?next= on the callback URL is silently ignored — server-only carrier. Test coverage adds three classes: - TestAuthCallbackNext drives /login → /auth/login → IDP-bounce → /auth/callback end-to-end without smuggling next= onto the callback URL (which is what the previous tests did and why they didn't catch the bug). Includes test_attacker_callback_next_param_is_ignored to pin the security property that the URL value is never read. - TestRenderLoginHtmlNext covers the rendering function at the unit boundary so a regression that drops next_path is caught without spinning up the full app.
- TestAuthLoginPkceCookieNext inspects the Set-Cookie header on /auth/login responses so a regression in cookie encoding is caught without driving the full round trip. Mutation-tested: reverting auth_callback to read next= from the URL trips 3 of 6 TestAuthCallbackNext tests (the safe-path and attacker-hardening ones), confirming the suite discriminates between the cookie read and the URL read.	2026-05-27 02:12:27 -07:00
Ben	c3104195b8	fix(dashboard-auth): bypass loopback WS peer check in gated mode When the OAuth gate is active, start_server runs uvicorn with proxy_headers=True so the dashboard can honour X-Forwarded-Proto from Fly's TLS terminator (cookies, redirect URI reconstruction). A side effect: ws.client.host is rewritten to the X-Forwarded-For value, which on Fly is the real internet client IP — never loopback. The loopback peer guard in _ws_client_is_allowed then rejected every WS upgrade in gated mode (4403 close) even after a successful OAuth round trip and ticket consumption, silently breaking /api/pty, /api/ws, /api/pub, and /api/events. Fix: in gated mode, bypass the peer-IP check. The OAuth gate + single-use ticket is the auth. The Host/Origin guard in _ws_host_origin_is_allowed still runs and is what protects against DNS-rebinding here, not the peer IP. Loopback mode behaviour is unchanged: the legacy ?token= path is the only auth there and we don't want LAN hosts guessing tokens. Regression coverage: TestWsRequestIsAllowedGated pins all four behaviours — non-loopback peer allowed in gated mode, non-loopback peer rejected in loopback mode, loopback peer allowed in loopback mode, and the Host/Origin guard still firing on a rebinding attempt with gated mode + matching peer.	2026-05-27 02:12:27 -07:00
Ben	866cc988b5	fix(dashboard-auth): use fixed-length sig suffix in stub token framing The stub auth provider's _sign/_unsign helpers joined payload and HMAC with a 'b"."' separator and recovered the parts via bytes.rsplit. HMAC-SHA256 digests are random bytes, so ~12% of the time the digest contains 0x2E ('.') and rsplit picks the wrong split point -- HMAC verification then spuriously rejects valid tokens. test_stub_refresh_round_trips was failing ~25% of the time in isolation because of this. Switch to a fixed-length suffix (32 bytes, sliced off in _unsign): no separator means no collision class. After the fix, 10/10 runs pass.	2026-05-27 02:12:27 -07:00
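The fixed-length suffix framing from commit 866cc988b5 can be sketched as follows. The function names and error type are illustrative rather than the stub provider's actual helpers. The quoted ~12% figure is consistent with the chance that at least one of 32 uniformly random bytes equals 0x2E, which is 1 - (255/256)^32, about 0.118.

```python
import hashlib
import hmac

SIG_LEN = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256


def sign(payload: bytes, key: bytes) -> bytes:
    """Append a fixed-length HMAC-SHA256 digest; no separator byte is used."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def unsign(token: bytes, key: bytes) -> bytes:
    """Slice the last SIG_LEN bytes off as the signature and verify it."""
    if len(token) < SIG_LEN:
        raise ValueError("token too short")
    payload, sig = token[:-SIG_LEN], token[-SIG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    return payload


# Probability that a 32-byte random digest contains at least one '.' byte.
p_dot_in_digest = 1 - (255 / 256) ** SIG_LEN
```

Because the split point is derived from the length alone, a digest or payload that happens to contain `.` can no longer shift it.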
Ben	c598076b76	test(dashboard-auth): strip HERMES_DASHBOARD_OAUTH_* env vars in hermetic fixture When these vars are set in the developer's shell, every /api/status call triggers load_gateway_config() -> discover_plugins() -> the bundled dashboard_auth/nous plugin auto-registers itself, leaking a provider into the registry across tests on the same xdist worker. That breaks assertions like 'auth_providers == []' (loopback) and '== ["stub"]' (gated) in test_dashboard_auth_status_endpoint.py. CI never has these set, so this only surfaced locally -- exactly the hermeticity gap _hermetic_environment is meant to close. Add them to _HERMES_BEHAVIORAL_VARS so the autouse fixture strips them, and to the unset list in scripts/run_tests.sh as belt-and-suspenders for direct pytest invocations.	2026-05-27 02:12:27 -07:00
Ben	a498485631	feat(dashboard-auth-nous): surface token iss/aud in verification-failure error When jwt.decode raises InvalidTokenError, decode the token a second time without signature verification (safe — we never trust the values, just display them) and append the actual iss/aud claims plus our configured expected values to the error message. Lets operators see config drift between HERMES_DASHBOARD_PORTAL_URL / HERMES_DASHBOARD_OAUTH_CLIENT_ID and what Portal is actually emitting without having to hand-decode the JWT from the browser cookie.	2026-05-27 02:12:27 -07:00
Ben	b3dc539304	feat(dashboard-auth): Nous plugin always-on; default portal URL; specific error messages The Nous OAuth provider plugin (plugins/dashboard_auth/nous) is bundled and auto-loaded — same as before — but previously refused to register unless BOTH HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL were set, then the gate's fail-closed branch told the operator 'install the default Nous provider'. That message is misleading: the provider IS installed; it's just unconfigured. And the contract only really needs the per-instance client_id — the portal URL is the same for everyone in production. Three changes: 1. plugins/dashboard_auth/nous/__init__.py: - HERMES_DASHBOARD_PORTAL_URL is now optional and defaults to 'https://portal.nousresearch.com'. Override only for staging (portal.rewbs.uk) or a custom deployment. Empty string also falls back to the default so an empty Fly secret can't point the dashboard at nowhere. - Plugin exposes a module-level LAST_SKIP_REASON: str that the gate reads when no providers register. Cleared on each register() call. Skip reasons are human-readable and actionable ('HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. The Nous Portal provisions this env var…'). 2. plugins/dashboard_auth/nous/plugin.yaml: - requires_env drops HERMES_DASHBOARD_PORTAL_URL; only the client_id is mandatory. Description updated to reflect this. 3. hermes_cli/web_server.py: - When the gate fail-closes for 'no providers', it now reads each bundled plugin's LAST_SKIP_REASON and embeds them in the SystemExit message. Operator sees the specific config fix needed: Bundled providers reported these issues: • nous: HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. … instead of the prior generic 'Install the default Nous provider'. Tests: - TestPluginRegister rewritten to assert the new defaults + LAST_SKIP_REASON contents (6 tests, +1 new for empty-string env). - New gate test test_start_server_surfaces_nous_skip_reason_when_unconfigured. 
- test_get_method_is_not_allowed widened to handle the SPA-shell 200 path explicitly — assertion now verifies no JSON ticket leaks rather than asserting a specific status code (covers all four of 401/404/405/200). Docs updated: web-dashboard.md's 'Default provider' section now shows the env-var table with required/optional columns and embeds the fail-closed error message verbatim so operators can match what they see at the prompt.	2026-05-27 02:12:27 -07:00
Ben	2fc4615fc4	feat(dashboard-auth): Phase 7 — SPA AuthWidget + /api/status auth fields Phase 7 surfaces the OAuth gate state to users. web/src/components/AuthWidget.tsx (new): Sidebar widget that fetches /api/auth/me on mount and renders a compact 'Logged in as <user_id…> via <provider>' row with a logout icon. Contract V1 (Nous Portal) emits no email/display_name claims, so user_id is the display value (truncated to 14 chars + ellipsis); display_name and email fallthroughs are forward-compat for OQ-C1. Renders nothing on 401 from /api/auth/me — that's the signal the gate isn't engaged (loopback mode), in which case the widget would be confusing. Logout POSTs /auth/logout (which clears cookies + redirects to /login) then full-page-navigates to /login itself; the SPA's fetch wrapper doesn't follow that redirect, so the navigation is explicit. web/src/App.tsx: mounts <AuthWidget /> above <SidebarFooter />. Component is self-hiding in loopback mode so there's no need for a conditional mount. web/src/lib/api.ts: - getAuthMe() + logout() helpers - AuthMeResponse type - StatusResponse gets optional auth_required + auth_providers fields so the existing StatusPage can render a gated/loopback badge. hermes_cli/web_server.py: /api/status payload now includes - auth_required: bool — whether app.state.auth_required is True - auth_providers: list[str] — registered DashboardAuthProvider names Lazy-imports list_providers so early-startup status calls don't crash if the dashboard_auth module is still being set up. tests/hermes_cli/test_dashboard_auth_status_endpoint.py: 3 new tests covering the new status fields in both gated and loopback modes plus a regression that no existing field got dropped from the payload. The hermes status CLI is unchanged in this commit — that command tracks model providers + OAuth credentials, not running-dashboard state. The /api/status endpoint is the canonical place to query dashboard auth-gate state, consumed by the React StatusPage already.	2026-05-27 02:12:27 -07:00
Ben	5e9308b5b8	feat(dashboard-auth): Phase 6 — 401 re-auth envelope + next= propagation Contract V1 of nous-account-service PR #180 ships no refresh tokens, so the original Phase 6 silent-refresh design is replaced with a thinner '401 → redirect to /login' UX. The dashboard's gated middleware now emits a structured envelope on any auth failure; the SPA's fetch wrapper sees it and full-page-navigates the user through re-auth. hermes_cli/dashboard_auth/cookies.py: set_session_cookies(refresh_token='') SKIPS writing the hermes_session_rt cookie. Forward-compat: a non-empty refresh_token still emits the cookie unchanged, so a future Portal contract that starts issuing RTs flips the persistence on with no other change. clear_session_cookies still emits a Max-Age=0 deletion for the RT cookie so stale cookies from earlier deployments get flushed on logout / session expiry. Deprecation marker + rationale in module docstring per the user's docstring-only deprecation pattern. hermes_cli/dashboard_auth/middleware.py: _unauth_response now builds a structured JSON envelope for API 401s: { error: 'session_expired' | 'unauthenticated', detail: 'Unauthorized', reason: <internal>, login_url: '/login?next=<safe-path>' } HTML redirects also carry next= so a user landing on /sessions without a cookie bounces back to /sessions after re-auth. _safe_next_target validates same-origin: drops protocol-relative paths (//evil.com), absolute URLs, and any /login or /auth/* loop. Dead cookies are cleared on the 401 path so the browser stops replaying invalid tokens. hermes_cli/dashboard_auth/routes.py: /auth/callback accepts next= query param and validates via _validate_post_login_target (same rules as the gate's _safe_next_target — defence-in-depth because next= survived a full IDP round trip and attacker-controlled state can re-enter via the callback URL). Open-redirect attempts land at '/' instead.
web/src/lib/api.ts: fetchJSON parses the 401 envelope and full-page-navigates to body.login_url ONLY on the known session-expiry error codes. Domain-level 401s (e.g. permission errors) bubble up as regular errors. credentials: 'include' added so cookie auth works for all fetches routed through this wrapper. sessionStorage.lastLocation is preserved for future use by AuthWidget / hermes_status. Test files marked with pytest.mark.xdist_group so the four files that mutate web_server.app.state.auth_required serialize onto the same xdist worker — eliminates 'works locally, fails in CI' app-state bleed. 20 new tests in test_dashboard_auth_401_reauth.py: - set_session_cookies(refresh_token='') skips RT cookie - clear_session_cookies still emits RT deletion - 401 envelope shape (unauthenticated vs session_expired) - dead cookie cleared on invalid-token 401 - login_url carries next= for deep paths - login loop avoided when path is /login/auth/api-auth - protocol-relative URL rejected - _safe_next_target unit tests (accept same-origin, reject loops/abs) - /auth/callback respects safe next= but rejects open redirects 2 pre-existing tests updated to accept the new /login?next=%2F shape. Full dashboard-auth suite: 168 passed, 1 skipped (Phase 0 pre-existing).	2026-05-27 02:12:27 -07:00
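The same-origin rules listed for `_safe_next_target` in commit 5e9308b5b8 can be illustrated with a short validator. This is a sketch under assumptions: the function name, the extra backslash and control-character checks, and the choice to fall back to "/" on every rejection are illustrative and not confirmed by the log.

```python
from urllib.parse import quote, urlsplit


def safe_next_target(raw: str) -> str:
    """Return raw if it is a same-origin path that cannot loop, else '/'."""
    if not raw or not raw.startswith("/"):
        return "/"  # empty, relative, or absolute URL with a scheme
    if raw.startswith("//") or "\\" in raw:
        return "/"  # protocol-relative, or backslash tricks some browsers normalise
    if any(ord(ch) < 0x20 for ch in raw):
        return "/"  # control characters (CR/LF header injection)
    parts = urlsplit(raw)
    if parts.scheme or parts.netloc:
        return "/"
    if parts.path == "/login" or parts.path == "/auth" or parts.path.startswith("/auth/"):
        return "/"  # avoid a redirect loop through the login flow
    return raw


def login_url(next_path: str) -> str:
    """Build the login_url field for the 401 envelope (shape from the commit message)."""
    return "/login?next=" + quote(safe_next_target(next_path), safe="")
```

With this encoding, a request for "/" yields `/login?next=%2F`, the shape the updated pre-existing tests are said to accept.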
Ben	b2360ba44e	feat(dashboard-auth): _ws_auth_ok helper + ticket auth on all 4 WS endpoints Phase 5 task 5.2. Four WebSocket endpoints — /api/pty, /api/ws, /api/pub, /api/events — previously authed with the same constant-time check against `_SESSION_TOKEN`. Replaced with a single helper that branches on `app.state.auth_required`: Loopback / --insecure: legacy ?token=<_SESSION_TOKEN> path (unchanged). Gated: ?ticket=<single-use> consumed against the dashboard-auth ticket store. Critical security property: gated mode UNCONDITIONALLY rejects the ?token= path. A leaked _SESSION_TOKEN value from a log line is not replayable for WS access in gated deployments. `_build_sidecar_url` now branches too: loopback uses the legacy token; gated mode mints a server-internal ticket via mint_ticket() with pseudo-user 'pty-sidecar' / provider 'server-internal' so audit logs can distinguish PTY-internal sidecar tickets from browser tickets. PTY children open /api/pub exactly once at startup so single-use suffices. Ticket rejections audit-log as WS_TICKET_REJECTED with truncated reason + client IP + WS path. Operators debugging 'WS keeps closing' issues see which endpoint and why. 17 new tests: - POST /api/auth/ws-ticket: 200 with cookie, 401/302 without, distinct per call, GET-not-allowed. - _ws_auth_ok loopback: token accept/reject, missing-token reject, ticket-param-ignored. - _ws_auth_ok gated: ticket accept, single-use rejection, unknown reject, legacy-token-rejected-in-gated assertion, audit-log emission. - _build_sidecar_url: loopback uses token=, gated uses ticket=, no-bound returns None.	2026-05-27 02:12:27 -07:00
Ben	b69fce9c86	feat(dashboard-auth): single-use WS tickets + POST /api/auth/ws-ticket Phase 5 task 5.1. Browsers cannot set Authorization on a WebSocket upgrade, so in gated mode the SPA needs an alternative way to bind the upgrade to its authenticated session. hermes_cli/dashboard_auth/ws_tickets.py — in-memory single-use ticket store with 30s TTL. Thread-safe (threading.Lock), token_urlsafe(32) values, ticket value truncated to 8 chars in error messages for log hygiene. Module-level state with _reset_for_tests() helper. hermes_cli/dashboard_auth/routes.py — adds POST /api/auth/ws-ticket. Auth-required (the gate middleware already attaches Session to request.state.session). Returns {ticket, ttl_seconds}; emits WS_TICKET_MINTED audit event with user_id + provider + ip. hermes_cli/dashboard_auth/audit.py — adds WS_TICKET_REJECTED enum value for the consume-side rejection event (wired into the WS endpoints in task 5.2). 11 new tests covering round-trip, single-use, TTL boundary, unknown ticket rejection, secret-hygiene truncation in error messages, and concurrent mint+consume from 20 threads.	2026-05-27 02:12:27 -07:00
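A minimal sketch of the single-use ticket store that commit b69fce9c86 describes: in-memory, lock-protected, `token_urlsafe(32)` values, a 30s TTL, and ticket values truncated to 8 characters in error messages. The class and method names, the injectable `clock` parameter, and the choice to treat "now >= expiry" as expired are assumptions for illustration.

```python
import secrets
import threading
import time

TTL_SECONDS = 30  # TTL stated in the commit message


class TicketError(Exception):
    """Raised when a ticket is unknown, already consumed, or expired."""


class TicketStore:
    def __init__(self, ttl=TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._lock = threading.Lock()
        self._tickets = {}  # ticket value -> (expires_at, user_id)

    def mint(self, user_id):
        ticket = secrets.token_urlsafe(32)
        with self._lock:
            self._tickets[ticket] = (self._clock() + self._ttl, user_id)
        return ticket

    def consume(self, ticket):
        # pop() under the lock makes the ticket single-use even with concurrent callers.
        with self._lock:
            entry = self._tickets.pop(ticket, None)
        if entry is None:
            raise TicketError(f"unknown or used ticket {ticket[:8]}...")
        expires_at, user_id = entry
        if self._clock() >= expires_at:
            raise TicketError(f"expired ticket {ticket[:8]}...")
        return user_id
```

Removing the entry before checking expiry means an expired ticket is also discarded on its first presentation, so the store does not retain dead tickets that clients have already tried.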
Ben	848baeb0a8	feat(dashboard-auth): plugins/dashboard_auth/nous — contract-compliant Nous OAuth provider Bundled, kind=backend, auto-loads. Activates ONLY when Portal-injected env vars are present: HERMES_DASHBOARD_OAUTH_CLIENT_ID — agent:{instance_id} HERMES_DASHBOARD_PORTAL_URL — Portal base URL Loopback / --insecure operators leave both unset and never see this plugin register anything. The fail-closed branch in start_server handles the 'public bind + zero providers' case independently. Implementation follows nous-account-service PR #180's published OAuth contract verbatim: - client_id is per-instance (agent:{instance_id}); the suffix is cross-checked against the token's agent_instance_id claim as defense-in-depth (contract C9). - scope is agent_dashboard:access only (contract C3). - aud is the bare client_id, no hermes-cli: prefix (contract C2). - RS256 JWT verification against /.well-known/jwks.json with 5-minute cache (contract C7). - No refresh tokens in V1: refresh_session always raises RefreshExpiredError; revoke_session is a no-op (contract C5). - oauth_contract_version claim: missing → warn + proceed; present and != 1 → refuse (contract C11, OQ-C2 tolerant treatment). - redirect_uri validated client-side as defense before bouncing to Portal; authoritative check is server-side per agent-redirect-uri.ts. 41 new tests covering construction, plugin-entry env gating, start_login shape, complete_login httpx-mocked happy path + error mapping, verify_session JWT verification (RSA keypair fixture, full claim-check matrix), refresh_session always raising, revoke_session no-op. PyJWT + cryptography are already in the venv (jose was previously suggested; switched to pyjwt[crypto] since the latter is already pulled in transitively).	2026-05-27 02:12:27 -07:00
Ben	53736b3922	feat(dashboard-auth): fail-closed on no providers; proxy_headers when gated; suppress _SESSION_TOKEN injection Phase 3, Task 3.5. Three changes to web_server.py: 1. start_server replaces the legacy SystemExit-refusing-to-bind guard with: if app.state.auth_required and no providers registered, exit with a clear message; otherwise log the gate-on banner. --insecure keeps its existing behaviour. 2. uvicorn proxy_headers flag is computed from app.state.auth_required. Loopback / --insecure keep it False (so _ws_client_is_allowed sees the real peer for the loopback gate); gated mode flips it True so X-Forwarded-Proto from Fly's TLS terminator is honoured for cookie Secure-flag decisions in detect_https(). 3. _serve_index no longer injects window.__HERMES_SESSION_TOKEN__ when the gate is on — the SPA reads identity from /api/auth/me using cookie auth instead. window.__HERMES_AUTH_REQUIRED__ flag lets the SPA pick between ticket-auth (gated) and token-auth (loopback) for /api/pty + /api/ws (Phase 5 will wire this in the React layer). 4 new behavioural tests; loopback regression harness still green.	2026-05-27 02:12:27 -07:00
Ben	5b17eab67a	feat(dashboard-auth): auth gate middleware + /auth/* routes + /login HTML Phase 3, Tasks 3.2 + 3.3 + 3.4. These three pieces are mutually dependent so they land together. middleware.py - gated_auth_middleware engages when app.state.auth_required is True. Allowlists /login, /auth/, /api/auth/providers, and static asset paths; everything else demands a valid session_at cookie. Verifies by trying every registered provider's verify_session in turn (multi- provider stack); attaches verified Session to request.state.session. Returns 401 JSON for /api/ and 302 -> /login for HTML. ProviderError during verify -> 503. routes.py - APIRouter with: GET /login server-rendered HTML GET /auth/login?provider=N 302 to IDP + PKCE cookie GET /auth/callback?code,state completes login, sets session cookies POST /auth/logout clears cookies + best-effort revoke GET /api/auth/providers public bootstrap endpoint (503 if zero) GET /api/auth/me verified session as JSON (auth-required) login_page.py - Inline-CSS HTML template, no React, no JavaScript. web_server.py - Mounted gated_auth_middleware between host_header and auth_middleware (FastAPI runs middlewares in registration order: host check -> cookie auth -> token auth). auth_middleware short-circuits when auth_required so cookie auth is authoritative in gated mode. Router is included before mount_spa so the catch-all doesn't swallow /login or /auth/*. 17 new behavioural tests; loopback regression harness still green.	2026-05-27 02:12:27 -07:00
Ben	a30c4d8ebd	feat(dashboard-auth): cookie helpers for session_at/session_rt/pkce Phase 3, Task 3.1. Three cookies: - hermes_session_at: OAuth access token (HttpOnly, TTL = token TTL) - hermes_session_rt: OAuth refresh token (HttpOnly, 30d max-age) - hermes_session_pkce: PKCE state + verifier + provider hint (10min) All SameSite=Lax + Path=/. Secure flag is set ONLY when the request scheme is https — uvicorn proxy_headers=True (enabled in gated mode at Phase 3.5) rewrites scheme from X-Forwarded-Proto so Fly's TLS terminator works.	2026-05-27 02:12:27 -07:00
Ben	628a52fce2	test(dashboard-auth): stub auth provider for E2E gate testing Phase 2, Task 2.1. Self-contained fake IDP — start_login redirects straight back to {redirect_uri}?code=stub_code&state=<s> so tests can walk the OAuth round trip in-process. Tokens are HMAC-signed JSON blobs (not real JWTs) — enough structure for verify_session to detect tamper and expiry without pulling in pyjwt. Lives in tests/ only — never registered as a real plugin. Phase 3's end-to-end tests import StubAuthProvider directly. Convention: exp <= now counts as expired (TTL=0 means born-expired) — matches what Phase 6's silent-refresh test will need.	2026-05-27 02:12:27 -07:00
Ben	865cae4f61	feat(dashboard-auth): json-lines audit log at $HERMES_HOME/logs/dashboard-auth.log Phase 1, Task 1.4. Records every auth event (login start/success/failure, logout, refresh success/failure, revoke, session verify failure, WS ticket mint) as one JSON object per line. Token-like kwargs (access_token, refresh_token, code, code_verifier, state, ticket, cookie, Authorization) are dropped before serialisation so the log never contains live secrets. Write failures log at WARNING but never raise — auth flows must not fail because the audit logger broke.	2026-05-27 02:12:27 -07:00
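The redaction and never-raise behaviour described in commit 865cae4f61 can be sketched as follows. The key list is taken from the commit message; the function names, the record layout (`ts`, `event`), and the case-insensitive key matching are assumptions, not the repository's actual implementation.

```python
import json
import logging
import time
from pathlib import Path

logger = logging.getLogger("dashboard_auth.audit")

# Token-like keys named in the commit message, compared case-insensitively (assumption).
SECRET_KEYS = frozenset({
    "access_token", "refresh_token", "code", "code_verifier",
    "state", "ticket", "cookie", "authorization",
})


def build_record(event, **fields):
    """Drop secret-bearing fields before the record is serialised."""
    safe = {k: v for k, v in fields.items() if k.lower() not in SECRET_KEYS}
    return {**safe, "ts": time.time(), "event": event}


def write_event(log_path, event, **fields):
    """Append one JSON object per line; warn instead of raising on failure."""
    try:
        path = Path(log_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        line = json.dumps(build_record(event, **fields), default=str)
        with path.open("a", encoding="utf-8") as fh:
            fh.write(line + "\n")
    except Exception as exc:  # auth flows must not fail because logging broke
        logger.warning("audit log write failed: %s", exc)
```

Dropping the secret keys outright, rather than masking their values, means a later change to the serialiser cannot accidentally expose them.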
Ben	c32b17f557	feat(plugins): add register_dashboard_auth_provider hook on PluginContext Phase 1, Task 1.3. Mirrors the existing register_image_gen_provider pattern (plugins.py:531) — wrong-type or duplicate-name registrations log at WARNING and silently return rather than raising, so a misbehaving auth plugin cannot crash the host. Deviation from plan: the plan's draft raised TypeError on non-provider input; switched to silent-warn to match the established image_gen convention. Test updated to match.	2026-05-27 02:12:27 -07:00
Ben	1bbfed70c4	test(dashboard-auth): cover registry register/get/list/clear semantics Phase 1, Task 1.2. Verifies registration order is preserved, duplicate names are rejected with ValueError, and non-compliant providers fail at register time (not later when the middleware tries to dispatch).	2026-05-27 02:12:27 -07:00
Ben	2dc6d03a3d	feat(dashboard-auth): define DashboardAuthProvider ABC + Session dataclass Phase 1, Task 1.1. New package hermes_cli/dashboard_auth/ contains: base.py - DashboardAuthProvider ABC with 5 abstract methods (start_login, complete_login, verify_session, refresh_session, revoke_session), Session + LoginStart frozen dataclasses, three exception types (ProviderError / InvalidCodeError / RefreshExpiredError), and assert_protocol_compliance() for plugins to call in their own tests. registry.py - Module-level register/get/list/clear with a lock. Nothing reads the registry yet — Phase 2 adds the StubAuthProvider and Phase 3 wires the gate middleware. The plugin hook lands in Task 1.3.	2026-05-27 02:12:27 -07:00
Ben	949ad95e4b	feat(dashboard): stash auth_required flag on app.state Phase 0, Task 0.3. start_server now computes should_require_auth(host, allow_public) and records it on app.state.auth_required BEFORE the existing legacy SystemExit guard fires. This gives middleware, the SPA token-injection path, and WS endpoints a consistent read source for 'is the gate active'. The flag is set but no one reads it yet — Phase 3 registers the gate middleware. Note: 4 pre-existing test failures in tests/hermes_cli/test_web_server.py (PtyWebSocket) + test_update_hangup_protection.py reproduce on pristine HEAD and are unrelated to this change (starlette TestClient WS regression).	2026-05-27 02:12:27 -07:00
Ben	8773bbf186	feat(dashboard): add should_require_auth predicate for OAuth gate Phase 0, Task 0.2. Single source of truth for 'is the auth gate active?'. Reuses the existing _LOOPBACK_HOST_VALUES frozenset so this stays in sync with the DNS-rebinding host-header check. RFC1918/CGNAT/link-local are treated as public — exact threat model the gate exists for.	2026-05-27 02:12:27 -07:00
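A rough sketch of the predicate from commit 8773bbf186, using the `should_require_auth(host, allow_public)` call shape that the following commit mentions. The contents of the loopback set and the reading of `allow_public` as the `--insecure` opt-out are assumptions; the real `_LOOPBACK_HOST_VALUES` may differ.

```python
# Assumed contents; the repository reuses its own _LOOPBACK_HOST_VALUES frozenset.
LOOPBACK_HOST_VALUES = frozenset({"localhost", "127.0.0.1", "::1"})


def should_require_auth(host, allow_public=False):
    """Return True when the dashboard bind address needs the OAuth gate."""
    if allow_public:
        # Assumption: the operator explicitly opted out of the gate (--insecure).
        return False
    # Anything that is not loopback is treated as public. That includes
    # RFC1918, CGNAT, and link-local addresses, per the commit message.
    return host.strip().lower() not in LOOPBACK_HOST_VALUES
```

Testing membership in a fixed set, rather than classifying address ranges, keeps private-network binds such as `10.0.0.5` on the gated side by default.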
Ben	f2b479e7a2	test(dashboard): pin current loopback auth behavior as regression harness Phase 0, Task 0.1 of the dashboard-oauth plan. Establishes a baseline for the loopback dashboard's auth surface so future phases can prove they didn't regress the existing _SESSION_TOKEN flow when adding the OAuth gate.	2026-05-27 02:12:27 -07:00
Teknium	249534e472	plugins: add security-guidance — pattern-matched warnings on dangerous code writes (#33131 ) New opt-in plugin that scans the content passed to write_file / patch / skill_manage for 25 known-dangerous code patterns — pickle.load, yaml.load, eval(, os.system, subprocess(shell=True), child_process.exec, dangerouslySetInnerHTML, innerHTML/outerHTML/document.write/ insertAdjacentHTML, crypto.createCipher (no IV), AES ECB, TLS verification disabled, XXE-prone xml.etree/minidom parsers, <script src=//...> without SRI, torch.load without weights_only=True, GitHub Actions ${{ github.event.* }} injection — and appends a "Security guidance" warning block to the tool result via the transform_tool_result hook. Default behaviour is non-blocking: the file is written and the warning rides back to the model in the next turn so it can self-correct or document why the construct is safe. SECURITY_GUIDANCE_BLOCK=1 upgrades to refusing the write entirely; SECURITY_GUIDANCE_DISABLE=1 is the kill switch. Pattern data (patterns.py) is a verbatim Apache-2.0 fork of Anthropic's claude-plugins-official/plugins/security-guidance/hooks/ patterns.py at commit 0bde168 (2026-05-26). LICENSE and NOTICE preserve attribution. The Hermes-side plugin glue (__init__.py, plugin.yaml, README.md, tests) is original work. Plugin is opt-in like all bundled plugins: hermes plugins enable security-guidance Inspired by https://x.com/ClaudeDevs/status/1927108527247... — Anthropic shipped this as their security-guidance plugin for Claude Code on 2026-05-26 with a measured 30-40% reduction in security-related PR comments on internal rollout. What's NOT ported (deferred): * Layer 2 (LLM diff review on turn end) — would route through main model by default on Hermes, real money on reasoning models. A follow-up can wire it to a cheap aux model with explicit opt-in. * Layer 3 (agentic commit-time review) — agent can run this on demand via delegate_task today. 
* .hermes/security-guidance.md project-rules file — only used by layers 2/3 upstream.	2026-05-27 02:07:21 -07:00
SuperEarn	4920f8437f	test(codex): cover null output stream terminal events	2026-05-27 02:06:21 -07:00
Teknium	96223265b9	chore(api-server): mark skills_api capability True now that /v1/skills shipped #33016 added GET /v1/skills + /v1/toolsets on the API server; the capability flag introduced in this branch was placeholder-False. Flip to True so capability probers see the truth.	2026-05-27 01:56:55 -07:00
Jonathan	464b51d455	Support media in session chat API	2026-05-27 01:56:55 -07:00
Bailey Dixon	f7527b0fdb	feat: add API server session controls	2026-05-27 01:56:55 -07:00
EvilHumphrey	4243b6dc45	fix(codex): update silent-hang workaround hint	2026-05-27 01:52:34 -07:00
Teknium	25f43d38de	feat(api-server): add GET /v1/skills and /v1/toolsets (#33016 ) Lets external clients enumerate the agent's skills and resolved toolsets deterministically over the OpenAI-compatible API server, without standing up the dashboard web server or sending a chat message and asking the model to list them. - GET /v1/skills — list installed skills (name, description, category) - GET /v1/toolsets — list toolsets resolved for the api_server platform, with enabled/configured state and the concrete tool names each expands to - Both gated by API_SERVER_KEY (same Bearer scheme as every other /v1/* endpoint) - /v1/capabilities advertises both new endpoints Closes the gap a community user just hit asking how to list skills over REST when only the OpenAI-compatible server is running. Test plan - python -m pytest tests/gateway/test_api_server.py -k "Skills or Toolsets or Capabilities" -o 'addopts=' -q → 9/9 pass - python -m pytest tests/gateway/test_api_server.py -o 'addopts=' -q → 156/156 pass, no regressions - E2E: started a real adapter on an isolated HERMES_HOME with a fake skill installed; curl-equivalent calls to /v1/capabilities, /v1/skills, /v1/toolsets returned the expected JSON; unauthenticated calls returned 401 with the configured API_SERVER_KEY.	2026-05-27 01:27:26 -07:00
Teknium	febc4cfec0	remove Vercel AI Gateway and Vercel Sandbox (#33067 ) * remove Vercel AI Gateway provider and Vercel Sandbox terminal backend Both Vercel-hosted integrations are removed end-to-end. Users on the AI Gateway should switch to OpenRouter or one of the other aggregators (Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should switch to Docker, Modal, Daytona, or SSH. What's removed: - `plugins/model-providers/ai-gateway/` provider plugin - `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper - `tools/environments/vercel_sandbox.py` terminal backend - `ai-gateway` provider wiring across auth, doctor, setup, models, config, status, providers, main, web_server, model_normalize, dump - `vercel_sandbox` backend wiring across terminal_tool, file_tools, code_execution_tool, file_operations, approval, skills_tool, environments/local, credential_files, lazy_deps, prompt_builder, cli, gateway/run - `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client header set, run_agent base-URL header/reasoning special-cases - `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock - env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`, `VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`, `TERMINAL_VERCEL_RUNTIME` - Tests: deletes test_ai_gateway_models.py and test_vercel_sandbox_environment.py; scrubs references across 23 surviving test files (no entire tests deleted unless they were dedicated to AI Gateway / Sandbox) - Docs: provider tables, env-var reference, setup guides, security notes, tool config, terminal-backend tables — English plus zh-Hans i18n parity - `hermes-agent` skill: provider table entry and remote-backend list What stays (intentional): - `popular-web-designs/templates/vercel.md` — CSS design reference, unrelated to Vercel-the-AI-product - `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN response header, useful diag signal on any Vercel-hosted endpoint - 
`vercel-labs/agent-browser` URL in browser config — lightpanda browser project, different OSS effort - `userStories.json` historical contributor entry mentioning Vercel Sandbox — archive, not active docs Validation: - 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`) - Full repo `py_compile` clean - Live import of every touched module + invariant check (no `ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no `vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`) * test: convert profile-count check from change-detector to invariant The hardcoded "== 34" assertion broke when ai-gateway was removed. Per AGENTS.md change-detector-test guidance, assert the relationship (registry count >= number of plugin dirs) instead of a literal count. Counts shift when providers are added/removed; that's expected.	2026-05-27 00:43:32 -07:00
Teknium	cb38ce28cb	refactor(codex): drop SDK responses.stream() helper; consume events directly (#33042 ) * refactor(codex): drop SDK responses.stream() helper; consume events directly The OpenAI Python SDK's high-level `client.responses.stream(...)` helper does post-hoc typed reconstruction from the terminal `response.completed.response.output` field. The chatgpt.com Codex backend has been observed (today, gpt-5.5) to ship `response.output = null` on terminal frames, which crashes the SDK with `TypeError: 'NoneType' object is not iterable` mid-iteration. Carlton's #32963 patched the symptom by wrapping the helper in try/except and recovering from the same per-event accumulator the SDK was supposed to populate. This PR removes the helper from the call path entirely: we now use `client.responses.create(stream=True)` (raw AsyncIterable of SSE events) and assemble the final response object ourselves from `response.output_item.done` events as they arrive. The terminal event's `output` field is never read for content. Same strategy OpenClaw uses for the same backend. This makes Hermes structurally immune to the bug class, not patched. The next time OpenAI ships a shape change to chatgpt.com's terminal frame, our consumer keeps working because it doesn't read that frame for content — only for usage/status/id. Changes - `agent/codex_runtime.py`: new `_consume_codex_event_stream()` shared consumer; `run_codex_stream()` uses `responses.create(stream=True)`; `run_codex_create_stream_fallback()` collapses into a thin alias since the primary path now does what the fallback used to do. - `agent/auxiliary_client.py`: `_CodexCompletionsAdapter` uses the same consumer; old null-output recovery helpers deleted as unreferenced. - Tests migrated: fixtures that mocked `responses.stream` now mock `responses.create` returning a raw iterable. New regression test asserts the auxiliary path returns streamed items even when the terminal event's `output` is literally `null`. 
Validation - Live: tested against fresh OAuth on `chatgpt.com/backend-api/codex` with `gpt-5.5` — response built correctly with `response.output=null` on the terminal frame, all events consumed, usage/reasoning tokens propagated. - `tests/run_agent/test_run_agent_codex_responses.py` + `tests/agent/test_auxiliary_client.py`: 242 passed. * test+fix(codex): migrate streaming tests, raise on truncated streams CI surfaced 10 test failures across tests/run_agent/test_streaming.py and tests/run_agent/test_codex_xai_oauth_recovery.py — both files had their own `responses.stream(...)` mocks I missed in the first sweep. agent/codex_runtime.py: _consume_codex_event_stream() now raises "Codex Responses stream did not emit a terminal response" when the stream ends without any terminal frame AND no usable content. This preserves the signal callers used to get from the SDK's high-level helper, which they distinguished from "completed with empty body" in error handling. Tests migrated: - test_streaming.py: text-delta callback, activity-touch, and remote-protocol-error tests all switch from mocking responses.stream to responses.create returning an iterable of events. - test_codex_xai_oauth_recovery.py: prelude-error tests are recast as wire-error-event tests (the new path raises _StreamErrorEvent directly when the wire emits type=error, which is strictly better than the old two-phase "SDK RuntimeError → retry → fallback"). The retry-on-transport-error test moves from responses.stream side-effect to responses.create side-effect. Verified live against chatgpt.com Codex with gpt-5.5 — AIAgent.chat() through the full codex_responses path returns correctly, 319/319 targeted tests passing.	2026-05-27 00:30:06 -07:00
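The consumer strategy from commit cb38ce28cb, sketched under simplifying assumptions: events are plain dicts in a synchronous iterable (the SDK yields typed event objects), and the function, exception, and field layout here are hypothetical stand-ins for `_consume_codex_event_stream()`. Content is assembled only from `response.output_item.done` events, and the terminal frame is read for id, status, and usage but never for `output`.

```python
TERMINAL_TYPES = frozenset({
    "response.completed", "response.incomplete", "response.failed",
})


class StreamErrorEvent(RuntimeError):
    """Raised when the wire emits an event with type == 'error'."""


def consume_event_stream(events):
    items = []        # output items collected as they complete
    terminal = None   # metadata from the terminal frame, if one arrives
    for event in events:
        etype = event.get("type")
        if etype == "error":
            raise StreamErrorEvent(event.get("message", "stream error"))
        if etype == "response.output_item.done":
            items.append(event["item"])
        elif etype in TERMINAL_TYPES:
            # Deliberately ignore the terminal frame's "output"; it may be null.
            terminal = event.get("response") or {}
    if terminal is None and not items:
        raise RuntimeError(
            "Codex Responses stream did not emit a terminal response"
        )
    meta = terminal or {}
    return {
        "id": meta.get("id"),
        "status": meta.get("status"),
        "usage": meta.get("usage"),
        "output": items,
    }
```

Because the result is built from per-item events, a terminal frame with `output: null` changes nothing about the assembled content.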
Teknium	b6ca56f651	fix(codex-responses): gracefully recover from invalid_encrypted_content (salvage #10144 ) (#33035 ) * fix(codex-responses): gracefully recover from invalid_encrypted_content (salvage #10144) When an OpenAI-compatible Responses API surface accepts an initial request but later rejects the replayed `codex_reasoning_items` encrypted blob with HTTP 400 `invalid_encrypted_content`, the session previously got stuck retrying the same poisoned payload. Recovery: classify the error as a dedicated FailoverReason, and on the first hit disable encrypted reasoning replay for the rest of the session, strip cached items from message history, and retry once. Changes: * error_classifier: add FailoverReason.invalid_encrypted_content branch in _classify_400 (before context_overflow so the messages that mention 'encrypted content … could not be verified' don't trip context heuristics), in _classify_by_error_code, and extend _extract_error_code to peek inside wrapped JSON in error.message and ignore the bare '400' as a code. * agent_init: initialize `_codex_reasoning_replay_enabled = True` on every agent. * run_agent: add AIAgent._disable_codex_reasoning_replay() helper that flips the flag and pops cached items. * codex_responses_adapter: thread a `replay_encrypted_reasoning` kwarg through _chat_messages_to_responses_input so that when the flag is False we don't replay codex_reasoning_items. * transports/codex.py: read `replay_encrypted_reasoning` from params, thread it into the adapter, and gate the `include=['reasoning.encrypted_content']` request hint on it. * chat_completion_helpers: pass the agent's replay flag through to the transport. 
* conversation_loop: in the retry loop, add an invalid_encrypted_content recovery branch that fires once per session, only when api_mode == codex_responses, only when replay is still enabled, and only when at least one assistant message in history actually carries cached reasoning items (otherwise the 400 has nothing to do with our cache and the normal retry path handles it). Tests: * test_error_classifier: new wrapped-JSON _extract_error_code case; new TestClassifyApiError cases proving the 400 is retryable with no fallback, that the broad message match doesn't catch a generic 'parsed' message, and that the error code match is case-insensitive. * test_run_agent_codex_responses: end-to-end test of the recovery branch firing once and disabling replay, plus a sibling test that proves the branch does not fire (and the flag stays True) when history has no cached reasoning items. Salvages PR #10144 onto the post-refactor module layout (error_classifier / codex_responses_adapter / transports/codex / conversation_loop / agent_init) since the original diff was written against the pre-refactor monolithic run_agent.py. * chore(release): map victorGPT in AUTHOR_MAP for #10144 salvage --------- Co-authored-by: victorGPT <wuxuebin1993@gmail.com>	2026-05-26 22:01:17 -07:00
emozilla	3d9a26afad	Merge remote-tracking branch 'origin/main' into jq/hermes-update-branch-flag	2026-05-27 00:48:25 -04:00
Ben Barclay	81a4f280d2	Merge pull request #22534 from wesleysimplicio/fix/voice-mode-docker-respect-pulse-pipewire fix(voice): honor PULSE_SERVER/PIPEWIRE_REMOTE inside Docker (#21203)	2026-05-27 13:59:12 +10:00
Nick	0a83247e9f	feat: add TUI session orchestrator Add a first-class active-session orchestrator for the Ink TUI: - list, activate, close, and launch live process-local TUI sessions - hydrate committed and in-flight output when switching sessions - dispatch a new prompt session from the +new row with session-scoped model picks - expose a clickable live-session count in the status chrome - preserve stable row order while initially focusing the current session - support mouse hit-testing for floating orchestrator overlays - add backend and frontend regression coverage for the lifecycle and UI helpers	2026-05-26 20:51:59 -07:00
beardthelion	2fc77c53f0	feat(opencode-go): route qwen3.7-max via anthropic_messages qwen3.7-max on OpenCode Go rejects the OpenAI-compatible (oa-compat) format with HTTP 401 but works correctly via the Anthropic Messages endpoint (/v1/messages with x-api-key auth). Route it the same way MiniMax models are routed: anthropic_messages api_mode. Changes: - hermes_cli/models.py: add qwen3.7-max routing + curated list - hermes_cli/setup.py: add to setup wizard model list - hermes_cli/auth.py: update provider comment - tests: add assertions for qwen3.7-max api_mode routing	2026-05-26 20:44:43 -07:00
Will Falcon	bba50977bc	fix: parse Codex image generation SSE directly	2026-05-26 20:40:29 -07:00
Carlton	43a3f119fc	fix(agent): recover Codex streams with null output	2026-05-26 19:37:37 -07:00


4392 commits