hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-30 06:41:51 +00:00

Author	SHA1	Message	Date
kshitij	a82c88bac0	fix(xai-oauth): accept bare-code manual paste (state=None) (#26923 ) (#33880 ) xAI's consent page renders the authorization code in-page rather than redirecting through the 127.0.0.1 callback, so on remote/headless setups (GCP Cloud Shell, Codespaces, container consoles, headless VPS) the only value the user can paste is the opaque code with no `code=`/`state=` query parameters. `_parse_pasted_callback` correctly returns `state=None` for that input, but `_xai_oauth_loopback_login` then validated state unconditionally and raised `xai_state_mismatch`, making the documented bare-code paste path unreachable. PKCE (code_verifier) still binds the token exchange to this client, so the local state-equality check is redundant when there is no state to compare. On the manual-paste path only, substitute the locally generated state when the callback returned none — the rest of the validation chain (code presence, error field, token exchange) is unchanged. The loopback HTTP-server path still requires a matching state (a real browser redirect always carries one). Also: clarify the manual-paste prompt to mention xAI's in-page code rendering so users know pasting the bare code on its own is expected. Root-cause analysis from #26923 comment by @AccursedGalaxy (2026-05-20). Tests ----- * test_xai_loopback_login_manual_paste_bare_code_succeeds — positive end-to-end through the token exchange with state=None. * test_xai_loopback_login_loopback_path_rejects_missing_state — the HTTP-server path still rejects state=None as a regression guard (the bare-code relaxation must NOT widen the loopback path). * Existing test_xai_loopback_login_manual_paste_state_mismatch_raises continues to verify wrong (non-None) state is rejected on manual-paste. Closes #26923.	2026-05-28 05:47:30 -07:00
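The state rule described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code in `hermes_cli/auth.py`: `resolve_callback_state` and `StateMismatch` are hypothetical names, and only the behaviour stated in the message (substitute the local state on manual paste, reject a missing state on the loopback path, reject a wrong state everywhere) is modelled.

```python
import hmac


class StateMismatch(Exception):
    """Hypothetical stand-in for the error reported as xai_state_mismatch."""


def resolve_callback_state(returned_state, expected_state, manual_paste):
    """Return the state to use for the rest of the validation chain."""
    if returned_state is None:
        if manual_paste:
            # Bare-code paste: nothing to compare, so reuse the local state.
            # PKCE still binds the token exchange to this client.
            return expected_state
        # Loopback path: a real browser redirect always carries a state.
        raise StateMismatch("callback did not include a state value")
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise StateMismatch("callback state does not match the local state")
    return returned_state
```

Keeping the relaxation behind the `manual_paste` flag is what stops it from widening the loopback path.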
Teknium	09a5cd8084	fix(auth): sync manual:device_code Codex pool entries on re-auth (#33744 ) #33164 made _save_codex_tokens sync the singleton-seeded `device_code` pool entry on Codex OAuth re-auth. That fixed the #33000 path but missed `manual:device_code` entries created by `hermes auth add openai-codex` (the recommended workaround for users who hit #33000 before #33164 landed). Every subsequent re-auth would refresh the device_code entry but leave the manual:device_code entry holding the consumed refresh token plus stale last_error_* markers — immediately recreating the 401 token_invalidated symptom on the next request, exactly as reported in #33538. Extend the refreshable source set to include `manual:device_code`. Completing the device-code OAuth flow proves the user owns the ChatGPT account, so it is safe to refresh every device-code-backed entry. Keep `manual:api_key` and other non-device-code manual sources untouched — those represent independent credentials. Closes #33538.	2026-05-28 01:33:10 -07:00
LeonSGP43	442a9203c0	Fix xAI OAuth timeout manual fallback	2026-05-28 00:24:17 -07:00
Robin Fernandes	406901b27d	feat(auth) normalise the way in which we check whether a user has free/paid access to nous portal so we can expose behaviour and error messages accordingly.	2026-05-28 00:19:31 -07:00
JohnC1009	414a5bc924	fix(auth): fall back to global auth.json in _load_provider_state In profile mode, _load_provider_state previously returned None when a provider was absent from the profile's auth.json — even if the user had authenticated at the global root. This broke runtime credential resolvers that read state directly (resolve_nous_access_token, resolve_nous_runtime_credentials), causing profiles without their own nous login to fail with 'Hermes is not logged into Nous Portal' despite a valid global session. Push the existing read-only global fallback (already used by get_provider_auth_state and read_credential_pool) into _load_provider_state so every caller benefits, and simplify get_provider_auth_state into a thin wrapper. Writes still target the profile only — profile state continues to shadow global state on the next read after a per-profile login. Behavior in classic (non-profile) mode is unchanged because _load_global_auth_store returns an empty dict. Adds 5 tests covering the new contract on _load_provider_state directly. Existing 770 auth/credential/nous tests still pass.	2026-05-27 09:38:58 -07:00
Teknium	69dfcdcc15	fix(auth): codex chat path falls back to credential_pool when singleton is empty Closes #32992. The chat path resolves Codex credentials via `resolve_codex_runtime_credentials` which only reads `providers.openai-codex.tokens` (the singleton). The auxiliary path uses `_read_codex_access_token` which checks the credential_pool first. For users whose tokens live only in the pool — manual seed, partial re-auth, restore from backup, or any state where the singleton is empty but the pool is healthy — the chat path raised AuthError or (worse, since OpenAI(api_key='') silently attaches no header) the wire saw HTTP 401 "Missing Authentication header" while the auxiliary path worked fine. This adds a pool fallback to `resolve_codex_runtime_credentials`: when the singleton has no usable access_token, scan `credential_pool.openai-codex` for the first entry that has a non-empty access_token and isn't in an exhaustion cooldown window (`last_error_reset_at` in the future). If found, return that token with `source="credential_pool"`. If no usable entry exists, the original AuthError propagates as before. Regression tests cover: - Empty singleton + healthy pool entry → pool token returned - Pool fallback skips entries currently in cooldown - Empty singleton + empty/wedged pool → AuthError propagates (existing contract preserved)	2026-05-27 03:43:51 -07:00
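A rough sketch of the pool scan described above, under stated assumptions: `pick_pool_token` is a hypothetical helper, pool entries are plain dicts, and `last_error_reset_at` is assumed to be an epoch timestamp in seconds (the commit message does not state its format).

```python
import time


def pick_pool_token(pool_entries, now=None):
    """Return the first usable pool credential, or None if there is none."""
    now = time.time() if now is None else now
    for entry in pool_entries:
        token = entry.get("access_token") or ""
        # Assumed format: epoch seconds; a missing marker means no cooldown.
        reset_at = entry.get("last_error_reset_at") or 0
        if token and reset_at <= now:
            return {"access_token": token, "source": "credential_pool"}
    return None
```

When this returns `None`, the caller would let the original `AuthError` propagate, which matches the contract the regression tests describe.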
konsisumer	f1422ffd77	fix(gateway): classify Codex 429 quota as rate-limit, not missing credentials When the Codex OAuth token endpoint returns 429 (usage-limit / quota exhaustion), refresh_codex_oauth_pure raised a generic auth error that the gateway surfaced as 'Primary provider auth failed: No Codex credentials stored. Run hermes auth', prompting re-auth that cannot lift a quota cap. Classify 429 distinctly (codex_rate_limited, relogin_required=False) with a non-alarming quota message that honors Retry-After, log it as 'Primary provider rate-limited (429)', and stop format_auth_error from appending the re-authenticate remediation. Also log the fallback provider's literal config key instead of the resolved runtime category. Refs #32790	2026-05-27 03:13:15 -07:00
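One way to honor `Retry-After` when classifying a 429, sketched with the standard library only. The commit message does not say how the header is parsed, so the helper names and the returned dict shape are assumptions; only `codex_rate_limited` and `relogin_required=False` come from the message.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Parse Retry-After (delta-seconds or HTTP-date) into seconds, else None."""
    if not header_value:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def classify_codex_429(retry_after_header=None):
    """Hypothetical result shape for a quota-exhausted refresh."""
    return {
        "code": "codex_rate_limited",
        "relogin_required": False,
        "retry_after": retry_after_seconds(retry_after_header),
    }
```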
konsisumer	2bbd53493d	fix(cli): sync credential_pool on Codex re-auth Codex re-auth via `hermes setup` / `hermes model` wrote fresh OAuth tokens to providers.openai-codex.tokens but left the credential_pool device_code entry holding the consumed refresh token and stale error markers. Since the runtime selects from the pool, the next request spent a dead token and got a 401 token_invalidated. Update the singleton-seeded pool entries in lockstep and clear their error state. Fixes #33000	2026-05-27 03:02:06 -07:00
Teknium	febc4cfec0	remove Vercel AI Gateway and Vercel Sandbox (#33067) * remove Vercel AI Gateway provider and Vercel Sandbox terminal backend Both Vercel-hosted integrations are removed end-to-end. Users on the AI Gateway should switch to OpenRouter or one of the other aggregators (Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should switch to Docker, Modal, Daytona, or SSH. What's removed: - `plugins/model-providers/ai-gateway/` provider plugin - `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper - `tools/environments/vercel_sandbox.py` terminal backend - `ai-gateway` provider wiring across auth, doctor, setup, models, config, status, providers, main, web_server, model_normalize, dump - `vercel_sandbox` backend wiring across terminal_tool, file_tools, code_execution_tool, file_operations, approval, skills_tool, environments/local, credential_files, lazy_deps, prompt_builder, cli, gateway/run - `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client header set, run_agent base-URL header/reasoning special-cases - `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock - env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`, `VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`, `TERMINAL_VERCEL_RUNTIME` - Tests: deletes test_ai_gateway_models.py and test_vercel_sandbox_environment.py; scrubs references across 23 surviving test files (no entire tests deleted unless they were dedicated to AI Gateway / Sandbox) - Docs: provider tables, env-var reference, setup guides, security notes, tool config, terminal-backend tables — English plus zh-Hans i18n parity - `hermes-agent` skill: provider table entry and remote-backend list What stays (intentional): - `popular-web-designs/templates/vercel.md` — CSS design reference, unrelated to Vercel-the-AI-product - `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN response header, useful diag signal on any Vercel-hosted endpoint - `vercel-labs/agent-browser` URL in browser config — lightpanda browser project, different OSS effort - `userStories.json` historical contributor entry mentioning Vercel Sandbox — archive, not active docs Validation: - 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`) - Full repo `py_compile` clean - Live import of every touched module + invariant check (no `ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no `vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`) * test: convert profile-count check from change-detector to invariant The hardcoded "== 34" assertion broke when ai-gateway was removed. Per AGENTS.md change-detector-test guidance, assert the relationship (registry count >= number of plugin dirs) instead of a literal count. Counts shift when providers are added/removed; that's expected.	2026-05-27 00:43:32 -07:00
beardthelion	2fc77c53f0	feat(opencode-go): route qwen3.7-max via anthropic_messages qwen3.7-max on OpenCode Go rejects the OpenAI-compatible (oa-compat) format with HTTP 401 but works correctly via the Anthropic Messages endpoint (/v1/messages with x-api-key auth). Route it the same way MiniMax models are routed: anthropic_messages api_mode. Changes: - hermes_cli/models.py: add qwen3.7-max routing + curated list - hermes_cli/setup.py: add to setup wizard model list - hermes_cli/auth.py: update provider comment - tests: add assertions for qwen3.7-max api_mode routing	2026-05-26 20:44:43 -07:00
jacevys	aeb87508c6	feat(providers): add OpenAI API provider option	2026-05-25 00:59:53 -07:00
Hasan Ali	d7c5d5dee5	fix: avoid persisting borrowed credential secrets (#31416)	2026-05-25 00:32:08 -07:00
Teknium	b0135c741d	diag(xai-oauth): log loopback callback hits + wait-timeout outcome (#27385 ) (#31894 ) #27385 reports that on macOS the browser sees the xAI 'authorization received' success page but Hermes still raises xai_callback_timeout. The loopback HTTP handler was silent — no log line on receipt, no log line on wait timeout — so triaging the gap between 'browser saw success' and 'CLI saw timeout' required either a code change or guesswork. Adds two INFO log lines: - Per callback hit (handler): path, has_code, has_state, has_error, truncated User-Agent. Booleans / fingerprints only — no actual code/state strings leak. - On wait timeout: report whether result.code or result.error was populated at deadline. Distinguishes three failure modes: 1. No hit log + timeout log w/ has_code=False has_error=False → xAI's IDP never reached the loopback (firewall, port-binding, IPv6/IPv4 mismatch, browser blocked private-network access). 2. Hit log w/ has_code=False has_error=False + timeout log → xAI hit the loopback without OAuth params (the bare-URL case the handler already 400s on). 3. Hit log w/ has_code=True + timeout log w/ has_code=False → result_lock contention or race; would indicate a real bug. 133/133 in tests/hermes_cli/test_auth_xai_oauth_provider.py, tests/hermes_cli/test_xai_oauth_pkce_token_exchange.py, and tests/run_agent/test_codex_xai_oauth_recovery.py.	2026-05-24 23:05:25 -07:00
teknium1	af144cd60d	fix(model): include Premium+ in xAI OAuth label X Premium+ also grants Grok OAuth access — the 'SuperGrok Subscription' wording suggested SuperGrok was the only entitlement path. Updated to 'SuperGrok / Premium+' across the picker label, setup wizard, auth flows, and docs so Premium+ subscribers know the row applies to them too.	2026-05-24 18:12:16 -07:00
Teknium	be27bfed01	security: harden API server key placeholder handling (#30738)	2026-05-24 04:25:32 -07:00
soynchux	e8fa415a9e	fix(cli): validate runtime token refresh capability in Qwen auth status	2026-05-23 17:47:36 -07:00
Teknium	a84cec61ca	fix(minimax-oauth): refresh short-lived access tokens per request (#30619 ) * fix(minimax-oauth): refresh short-lived access tokens per request MiniMax OAuth issues ~15-minute access tokens. The Anthropic SDK caches api_key as a static string at client construction, so a session that resolves credentials once at startup keeps sending the same bearer until MiniMax returns 401 mid-session. Swap the static string for a callable token provider, reusing the existing Entra-ID bearer-hook infrastructure in build_anthropic_client. The callable re-reads auth.json on each invocation and calls _refresh_minimax_oauth_state, which is a no-op when the token still has more than 60s of life left and refreshes proactively otherwise. Refreshes persist to auth.json so other processes (gateway, cron) see them immediately. The wire-up lives at the agent-init / model-switch boundary rather than in resolve_runtime_provider, so aux client paths that hand the api_key string to OpenAI(api_key=...) are unaffected. * docs: add infographic for minimax-oauth token refresh	2026-05-22 15:16:15 -07:00
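The callable-token idea above might look roughly like this. It is a sketch under assumptions: `RefreshingToken`, `load_state`, and `refresh` are hypothetical names standing in for the auth.json read and `_refresh_minimax_oauth_state`, and the state is assumed to carry an `expires_at` epoch timestamp. Only the 60-second threshold comes from the commit message.

```python
import time


class RefreshingToken:
    """Callable that returns a current bearer instead of a cached string."""

    def __init__(self, load_state, refresh, skew_seconds=60, clock=time.time):
        self._load_state = load_state  # re-reads persisted state on every call
        self._refresh = refresh        # returns the refreshed, persisted state
        self._skew = skew_seconds
        self._clock = clock

    def __call__(self):
        state = self._load_state()
        remaining = state["expires_at"] - self._clock()
        if remaining <= self._skew:
            # 60s or less left (or already expired): refresh proactively.
            state = self._refresh(state)
        return state["access_token"]
```

Re-reading the persisted state on each call is what lets one process pick up a refresh performed by another.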
Teknium	e32d2ffc1d	fix(security): wire Nous URL allowlist into refresh / mint persistence sites @memosr's PR #27612 put the inference_base_url allowlist check only at the Nous proxy adapter forward boundary. The poisoned URL, however, lands in ``auth.json`` upstream of that — at five refresh / agent-key-mint payload read sites inside ``resolve_nous_runtime_credentials`` and ``_extend_state_from_refresh``. Without gating those sites, a single MITM on a refresh response persists the attacker's URL across restarts, even if the proxy adapter's defense-in-depth check would later catch it on the way out. Replace ``_optional_base_url`` with ``_validate_nous_inference_url_from_network`` at all five Portal-network reads: - hermes_cli/auth.py L4840 (refresh-only access-token path) - hermes_cli/auth.py L4876 (mint payload path) - hermes_cli/auth.py L5154 (terminal-runtime access-token refresh) - hermes_cli/auth.py L5262 (cross-process serialized refresh) - hermes_cli/auth.py L5317 (terminal-runtime mint payload) The state-read path at L5025 (``state.get("inference_base_url")``) is deliberately NOT gated — pre-existing state in ``auth.json`` is either already validated (it came from one of the five network sites above) or set by a trusted local actor (manual edit, ``_setup_nous_auth`` test fixture, ``hermes login nous`` against a staging endpoint via the documented ``NOUS_INFERENCE_BASE_URL`` env override). Direct write_file / patch tampering with auth.json is independently blocked by PR #14157. Adds tests/hermes_cli/test_nous_inference_url_validation.py covering: - validator https + host + edge-case rules (12 cases) - all 5 network call sites grep contracts (no _optional_base_url regression possible without test failure) - proxy adapter defense-in-depth check still present - env override path NOT gated (documented dev/staging behaviour) 18 new tests, all 119 Nous-auth tests green.	2026-05-22 14:17:40 -07:00
memosr	d33c99bbb1	fix(security): validate Nous Portal inference_base_url against host allowlist The Nous Portal proxy adapter forwards minted ``agent_key`` bearer tokens to whatever ``base_url`` ``resolve_nous_runtime_credentials()`` returns, which is read directly from the refresh / agent-key-mint response and persisted to ``~/.hermes/auth.json``. With no validation beyond a trailing-slash strip, a poisoned URL (Portal-side MITM, or local write to auth.json) gets forwarded the legitimate bearer on every subsequent proxy request — exfiltrating the user's inference budget and opening a response-injection channel back into the IDE / chat client. Add ``_validate_nous_inference_url_from_network()`` in ``hermes_cli.auth``: an https + host-allowlist check that returns None for anything outside ``inference-api.nousresearch.com``, so callers fall back to the documented default rather than ship the bearer to an attacker. This commit wires the validator into the proxy adapter at ``nous_portal.py``. A follow-up commit wires it into the four refresh / mint sites in ``auth.py`` so the poisoned URL never lands in auth.json in the first place. The env-var override path (``NOUS_INFERENCE_BASE_URL``) bypasses validation by design — that's the documented staging/dev escape hatch and the env source is already trusted (the user set it themselves). Co-authored-by: memosr <mehmet.sr35@gmail.com>	2026-05-22 14:17:40 -07:00
liuhao1024	4ead464f97	fix(security): guard os.chmod(parent) against / and top-level dirs Five call sites do os.chmod(path.parent, 0o700) without checking that the parent resolves to a safe directory. If HERMES_HOME or another path env var resolves to /, the chmod strips traversal permission from the root inode and bricks the entire host. Add secure_parent_dir() to hermes_constants.py that refuses to chmod / or any top-level directory (depth < 2). Replace all 5 call sites with this helper. Fixes #25821	2026-05-20 22:56:55 -07:00
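The depth guard described above can be sketched as below. This is not the `secure_parent_dir()` in `hermes_constants.py`; the function names and the boolean return value are assumptions, and the wrapper only treats POSIX-style absolute paths as safe.

```python
import os
from pathlib import Path, PurePosixPath


def is_safe_chmod_target(directory: PurePosixPath) -> bool:
    """True only for absolute directories at least two levels below the root."""
    if not directory.is_absolute():
        return False
    depth = len(directory.parts) - 1  # parts[0] is the root itself
    return depth >= 2


def secure_parent_dir_sketch(path: Path, mode: int = 0o700) -> bool:
    """chmod the resolved parent of `path` unless it is / or a top-level dir."""
    parent = path.resolve().parent
    if not is_safe_chmod_target(PurePosixPath(parent.as_posix())):
        return False  # refuse rather than strip permissions from / or /home
    os.chmod(parent, mode)
    return True
```

With this rule, `/` (depth 0) and `/home` (depth 1) are refused, while `/home/user` (depth 2) is allowed.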
Teknium	64a9a199bb	fix(xai-oauth): pin inference base_url to x.ai origin (#28952 ) XAI_BASE_URL / HERMES_XAI_BASE_URL let users repoint the OAuth-authenticated inference endpoint, but the env override was an unguarded credential-leak vector: a tampered .env or hostile shell init setting XAI_BASE_URL=https://attacker.example/v1 would silently ship the SuperGrok OAuth bearer to a third party on every request. Add _xai_validate_inference_base_url() that pins the host to x.ai or a *.x.ai subdomain and rejects non-HTTPS. On rejection, fall back to the default with a warning rather than raise — a bad env var should not deadlock auth, but should never leak the bearer either. Apply at all three sites that read the env override for xai-oauth: - hermes_cli/auth.py resolve_xai_oauth_runtime_credentials (main path) - hermes_cli/auth.py _xai_oauth_loopback_login (initial login) - agent/auxiliary_client.py _resolve_xai_oauth_for_aux (aux client) E2E validated against four scenarios: attacker.example, lookalike api.x.ai.evil.com, http:// downgrade on api.x.ai, and legit custom.x.ai subdomain (which still resolves correctly). Discovered while comparing against the opencode-grok-auth plugin (github.com/ysnock404/opencode-grok-auth), which highlighted the same guard on the OpenCode side.	2026-05-19 14:51:21 -07:00
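A minimal sketch of the origin pin described above, covering the four scenarios the commit lists. `validate_inference_base_url` is a hypothetical name, and the default URL shown is an assumption for illustration, not a value taken from the repository.

```python
import logging
from urllib.parse import urlsplit

logger = logging.getLogger(__name__)

DEFAULT_BASE_URL = "https://api.x.ai/v1"  # assumed default, for illustration


def validate_inference_base_url(candidate, default=DEFAULT_BASE_URL):
    """Return `candidate` only if it is HTTPS on x.ai or a *.x.ai subdomain."""
    if not candidate:
        return default
    cleaned = candidate.strip()
    try:
        parts = urlsplit(cleaned)
        host = (parts.hostname or "").lower()
    except ValueError:
        host, parts = "", None
    allowed_host = host == "x.ai" or host.endswith(".x.ai")
    if parts is None or parts.scheme != "https" or not allowed_host:
        # Fall back instead of raising: a bad env var should not block auth.
        logger.warning("Ignoring untrusted base URL override; using default")
        return default
    return cleaned
```

Comparing the parsed hostname, not the raw string, is what rejects lookalikes such as `api.x.ai.evil.com` and userinfo tricks such as `https://x.ai@attacker.example`.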
vanthinh6886	62573f44cf	fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes 1. trajectory_compressor.py: yaml.safe_load() returns None on empty files, crashing with TypeError on `if 'tokenizer' in data`. Fix by adding `or {}` fallback. (HIGH — blocks startup with empty config) 2. 6 files with fcntl.flock(LOCK_UN) in finally blocks without try/except: cron/scheduler.py, hermes_cli/auth.py, agent/shell_hooks.py, tools/skill_usage.py, tools/environments/file_sync.py, tools/memory_tool.py. If unlock raises OSError, fd.close() is skipped and the lock is held forever. The msvcrt branches already had try/except; the fcntl branches did not. Fix by wrapping in try/except (OSError, IOError): pass. 3. agent/copilot_acp_client.py line 639: TOCTOU race — path.exists() followed by path.read_text() with no try/except. If file is deleted between the check and the read, FileNotFoundError propagates. Fix by using try/except FileNotFoundError. 4. gateway/sticker_cache.py: non-atomic write via Path.write_text() can leave truncated JSON on crash, causing JSONDecodeError on next load. Fix by writing to tempfile + fsync + os.replace (atomic).	2026-05-19 00:12:41 -07:00
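The tempfile + fsync + `os.replace` pattern from item 4 above, as a generic sketch. `atomic_write_json` is a hypothetical helper, not the function in `gateway/sticker_cache.py`.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path, payload) -> None:
    """Write JSON so readers see either the old file or the complete new one."""
    path = Path(path)
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, path)     # atomic swap into place
    except BaseException:
        try:
            os.unlink(tmp_name)        # do not leave a stray temp file behind
        except FileNotFoundError:
            pass
        raise
```

A crash before `os.replace` leaves the previous file untouched, which avoids the truncated-JSON failure the commit describes.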
xxxigm	5a5c265bcf	fix(oauth): add manual-paste fallback for browser-only remote consoles xAI Grok OAuth (and Spotify) use a loopback redirect to ``http://127.0.0.1:<port>/callback`` to capture the authorization code. That works when the browser and Hermes run on the same machine, and the SSH tunnel recipe handles the regular remote case. It breaks completely on browser-only remote consoles (GCP Cloud Shell, GitHub Codespaces, AWS EC2 Instance Connect, Gitpod, Replit, …) where the user has a browser but no real SSH client to forward a port — the redirect to 127.0.0.1 on the remote VM simply isn't reachable from the laptop, and there's nothing the existing flow can do about it (#26923). This commit adds the foundation for a manual-paste fallback: * ``_is_remote_session`` now also recognises Cloud Shell, Codespaces, Gitpod, Replit, StackBlitz (in addition to SSH), so the existing tunnel hint at least fires in those environments. * ``_parse_pasted_callback`` accepts any of: a full ``http(s)://...?code=...&state=...`` URL, a bare ``?code=...`` query string, a bare ``code=...&state=...`` fragment, or a bare opaque code value. Returns the same dict shape the HTTP callback handler produces, so the caller's state / error validation works unchanged (no CSRF bypass). * ``_prompt_manual_callback_paste`` reads stdin with a clear multi-line explanation of what's happening and what to paste. * ``_xai_oauth_loopback_login`` gains a ``manual_paste`` kwarg that skips the HTTP listener entirely. The redirect_uri, PKCE verifier, state, and nonce are byte-identical to the loopback path so xAI's token endpoint can't tell the difference at the protocol level. * ``_print_loopback_ssh_hint`` now also mentions ``--manual-paste`` so users without a real SSH client see a path forward instead of a dead-end tunnel recipe. * ``_login_xai_oauth`` threads ``args.manual_paste`` into the loopback helper.	2026-05-18 20:10:52 -07:00
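The four accepted paste shapes listed above could be handled along these lines. This is a hypothetical stand-in for `_parse_pasted_callback`, written only from the description in the message; the returned dict keys are assumed to be `code`, `state`, and `error`.

```python
from urllib.parse import parse_qs, urlsplit


def parse_pasted_callback(pasted: str) -> dict:
    """Normalize pasted input into the shape a callback handler would produce."""
    text = pasted.strip()
    if not text:
        return {"code": None, "state": None, "error": None}

    if "://" in text:
        query = urlsplit(text).query  # full callback URL
    elif text.startswith("?"):
        query = text[1:]              # bare query string
    elif "=" in text:
        query = text                  # bare code=...&state=... fragment
    else:
        # Opaque code on its own: there is no state to report.
        return {"code": text, "state": None, "error": None}

    params = parse_qs(query)

    def first(name):
        values = params.get(name)
        return values[0] if values else None

    return {"code": first("code"), "state": first("state"), "error": first("error")}
```

One limitation of this sketch: an opaque code that itself contains `=` would be treated as a query fragment, so a real parser would need a stricter check for that case.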
xxxigm	60ef368792	fix(xai-oauth): split 403 (tier/entitlement) from 400/401 in token endpoint xAI's token endpoint returns HTTP 403 to the OAuth grant when the account isn't on the allowlist for API access (e.g. standard SuperGrok subscribers — see #26847). Treating it like a stale-token 400/401 made ``format_auth_error`` append "Run ``hermes model`` to re-authenticate", which is misleading because re-login can't change xAI's tier decision. Split 403 off in both ``refresh_xai_oauth_pure`` and the loopback login token exchange: * New error code ``xai_oauth_tier_denied`` with ``relogin_required=False`` * Message explains the entitlement gate and points at the ``XAI_API_KEY`` + ``provider: xai`` fallback * 400/401 still set ``relogin_required=True`` as before * 5xx still set ``relogin_required=False`` as before	2026-05-18 20:08:09 -07:00
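The status split above can be summarized in a small classifier. Only `xai_oauth_tier_denied` and the `relogin_required` values come from the commit message; the function name and the other error codes are placeholders invented for this sketch.

```python
def classify_token_endpoint_status(status: int) -> dict:
    """Map a token-endpoint HTTP status to an error code and re-login hint."""
    if status == 403:
        # Entitlement gate: logging in again cannot change the tier decision.
        return {"code": "xai_oauth_tier_denied", "relogin_required": False}
    if status in {400, 401}:
        # Stale or rejected grant: a fresh login can fix this.
        return {"code": "token_rejected", "relogin_required": True}  # placeholder code
    if 500 <= status <= 599:
        # Server-side trouble: retry later, the stored tokens may still be fine.
        return {"code": "server_error", "relogin_required": False}  # placeholder code
    return {"code": "unexpected_status", "relogin_required": False}  # placeholder code
```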
EloquentBrush0x	b3e714e8b7	fix(xai-oauth): quarantine dead tokens on terminal refresh failure resolve_xai_oauth_runtime_credentials() called _refresh_xai_oauth_tokens() with no try/except. A terminal refresh failure (HTTP 400/401/403 — invalid_grant, token revoked) propagated without clearing the dead access_token / refresh_token from auth.json, causing every subsequent session to retry the same doomed network request. Add a try/except around the refresh call that mirrors the existing credential_pool.py quarantine: when _is_terminal_xai_oauth_refresh_error identifies a non-retryable failure, clear the dead token fields from auth.json and write a last_auth_error diagnostic marker so future calls fail fast with a clear relogin_required error instead of hitting the network. active_provider is preserved (set_active=False) so multi-provider users whose chosen provider is not xai-oauth are unaffected. Tests: two new cases in test_auth_xai_oauth_provider.py cover terminal quarantine and transient pass-through.	2026-05-18 20:02:11 -07:00
EloquentBrush0x	d9331eecee	fix(minimax-oauth): quarantine dead tokens on terminal refresh failure resolve_minimax_oauth_runtime_credentials called _refresh_minimax_oauth_state without a try/except, so a terminal failure (invalid_grant, refresh_token_reused, invalid_refresh_token) raised AuthError but left the dead refresh_token in auth.json. Every subsequent API call retried the same token via a network round-trip, failing identically each time. Fix: wrap the refresh call and, when exc.relogin_required is True and a refresh_token is present, clear the dead OAuth fields (access_token, refresh_token, expires_*) and write a last_auth_error quarantine marker to auth.json before re-raising. The next call sees no access_token and fails fast with 'not_logged_in' — no network retry — and the user is prompted to re-authenticate. Mirrors the existing quarantine pattern for Nous (_quarantine_nous_oauth_state), xAI-OAuth (#28116), and Codex-OAuth (#28118). Persist failure is best-effort (logged at DEBUG, error still re-raised). Salvaged from #28003 by @EloquentBrush0x — contributor's branch was severely stale (would have reverted ~5000 LOC across azure/kanban/i18n subsystems); fix re-applied surgically with their pattern preserved and added two regression tests (terminal-quarantines + transient-does-not-quarantine).	2026-05-18 10:34:03 -07:00
EloquentBrush0x	b570e0fdd0	fix(codex-oauth): quarantine terminal refresh errors so dead tokens are not replayed across sessions When a Codex OAuth refresh token is permanently invalidated (HTTP 400/401/403, token revoked or reused), _mark_exhausted was called but auth.json was left with the dead credentials. On the next session, _seed_from_singletons re-read auth.json and re-seeded the pool with the same revoked token, triggering the same terminal failure in a loop. Add _is_terminal_codex_oauth_refresh_error to auth.py and a matching quarantine block in _refresh_entry: when a terminal error is detected and auth.json holds no newer tokens, clear access_token/refresh_token from auth.json and remove all device_code-sourced pool entries from memory. Mirrors the Nous quarantine added in `c90556262` and the xAI quarantine in #28116. Also add a pre-refresh sync from auth.json before calling refresh_codex_oauth_pure, matching the xAI and Nous patterns, to avoid refresh_token_reused races when multiple Hermes processes share the same auth.json singleton. Salvaged from #27911 by @EloquentBrush0x — contributor's branch was severely stale (would have reverted ~5000 LOC across azure/kanban/i18n subsystems); fix re-applied surgically on current main with their predicate and tests preserved.	2026-05-18 10:31:40 -07:00
EloquentBrush0x	5e40f83cb7	fix(xai-oauth): quarantine terminal refresh errors so dead tokens are not replayed across sessions When refresh_xai_oauth_pure raises a terminal error (HTTP 400/401/403, i.e. revoked or reused refresh token), _refresh_entry's existing race- recovery path re-syncs from auth.json and returns if another process has already rotated the tokens. If auth.json still holds the same stale token pair, the function fell through to _mark_exhausted — leaving the dead credentials in auth.json. On the next Hermes startup _seed_from_singletons re-seeded the pool from those stale tokens, causing the same failure loop on every session. Fix: after the auth.json re-sync check in the xAI-oauth error handler, detect terminal errors with the new _is_terminal_xai_oauth_refresh_error helper and apply a quarantine: - Clear access_token and refresh_token from providers["xai-oauth"]["tokens"] in auth.json so they are not re-seeded. - Write a last_auth_error entry for hermes doctor / auth status diagnostics. - Remove all loopback_pkce entries from the in-memory pool so the current session stops retrying with the dead credentials. Mirrors the identical quarantine already in place for Nous OAuth (`c90556262`). Closes the parity gap introduced when `c90556262` added Nous-only terminal error handling without a corresponding xAI-oauth path.	2026-05-18 10:28:09 -07:00
konsisumer	226680500d	fix(auth): improve xAI OAuth SSH hint with visual header and auto-detected host	2026-05-18 10:26:55 -07:00
briandevans	bf6eeb3f93	fix(xai-oauth): show "not received" page when loopback callback has no code When xAI's auth backend fails to redirect (e.g. the German "We couldn't reach your app" fallback shown in #27385), users sometimes navigate manually to the bare loopback callback URL — `http://127.0.0.1:<port>/callback` with no query string. The handler used to return 200 "xAI authorization received" for any GET that hit the expected path, because `parse_qs("")` yields no `code` and no `error`, leaving `result` untouched while the success page was still served. The CLI's wait loop, of course, still saw no code and timed out with `AuthError: xAI authorization timed out waiting for the local callback.` The user is left looking at a browser tab that claims success and a terminal that says failure — exactly the contradiction in #27385. This change makes the empty-callback case return 400 with an explicit "not received" page and a hint to retry `hermes auth add xai-oauth`. The wait-loop semantics are unchanged: `result["code"]` and `result["error"]` both stay None, so the CLI still raises a real timeout rather than treating the bare hit as a successful callback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 10:26:00 -07:00
Fewmanism	0d63661702	fix: latch xAI OAuth callback result	2026-05-18 10:23:13 -07:00
Fewmanism	eac198b6d5	fix: make xAI OAuth callback server threaded	2026-05-18 10:23:13 -07:00
glennc	9df9816dab	feat(azure-foundry): add Microsoft Entra ID auth Use azure-identity DefaultAzureCredential for keyless Foundry auth. Preserve refreshable callable credentials through OpenAI and Anthropic client paths. Add setup, doctor, auth status, docs, and tests for Entra auth. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-18 10:14:38 -07:00
Robin Fernandes	569bc94b59	fix(auth) fix a few cases where refresh tokens were not rotated.	2026-05-17 16:56:37 -07:00
Robin Fernandes	20bffa5b37	refactor(auth): mostly cleanups and style changes	2026-05-17 16:56:37 -07:00
Robin Fernandes	0bac7dd05b	refactor(auth): collapse Nous inference fallback controls	2026-05-17 16:56:37 -07:00
Robin Fernandes	89a3d038cf	Switch to JWT token for inference against Nous, falling back to old opaque token on failure.	2026-05-17 16:56:37 -07:00
Robin Fernandes	c905562623	fix(auth): stop replaying invalid Nous refresh tokens Quarantine Nous OAuth state when refresh fails with terminal invalid_grant/invalid_token errors. Clear local and shared refresh material across runtime, managed access-token, proxy, and credential-pool paths so Hermes stops retrying revoked refresh sessions.	2026-05-17 16:56:37 -07:00
xxxigm	cb53c40e45	fix(xai-oauth): echo code_challenge in token POST so PKCE exchange succeeds xAI's OAuth implementation at ``auth.x.ai`` validates the PKCE ``code_challenge`` at the token endpoint, not just at the authorize step. When Hermes sends the standards-compliant token POST with ``code_verifier`` alone — exactly what RFC 7636 §4.5 prescribes — xAI rejects the exchange with ``code_challenge is required`` and the user is stuck with no working OAuth login. The fix: * Extract the token POST into ``_xai_oauth_exchange_code_for_tokens`` so the wire format is unit-testable in isolation. * Send the original ``code_challenge`` and ``code_challenge_method`` in the form body alongside ``code_verifier``. Strict RFC-compliant servers ignore the extras at the token endpoint, and xAI's permissive implementation accepts the exchange. This is the standard "defensive echo" workaround used by every OAuth client that targets a server with this quirk. * Refuse to fire the POST when ``code_verifier`` is empty — leaking the authorization code to a server that can't redeem it is worse than failing locally with an actionable error. The new error code is ``xai_pkce_verifier_missing`` and the message points at this issue for context. * Surface the HTTP status code prominently in the 4xx error message (``xAI token exchange failed (HTTP 400). Response: …``) so users and maintainers can tell a 400 (bad request / PKCE problem) from a 403 (tier denied, see #26847) at a glance instead of parsing the JSON body by eye. Closes #26990	2026-05-17 12:35:01 -07:00
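A sketch of the "defensive echo" form body described above. It assumes the S256 challenge method (the commit message does not name the method), and `build_token_form` with its `client_id` and `redirect_uri` parameters is a hypothetical helper, not `_xai_oauth_exchange_code_for_tokens`.

```python
import base64
import hashlib


def s256_challenge(verifier: str) -> str:
    """BASE64URL(SHA256(verifier)) without padding, per RFC 7636 section 4.2."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_token_form(code: str, verifier: str, redirect_uri: str, client_id: str) -> dict:
    """Form body for the token POST, echoing the challenge next to the verifier."""
    if not verifier:
        # Fail locally instead of sending a code the server cannot redeem.
        raise ValueError("code_verifier is empty; refusing to send the authorization code")
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "code_verifier": verifier,
        # Extras that a strict RFC 7636 server ignores at the token endpoint.
        "code_challenge": s256_challenge(verifier),
        "code_challenge_method": "S256",
    }
```

Recomputing the challenge from the verifier yields the same value that was sent at the authorize step, so nothing extra has to be stored between the two requests.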
kshitij	5fba236644	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355 ) Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).	2026-05-17 02:29:41 -07:00
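The equivalence claimed in 5fba236644 holds only for hashable values, which is why the commit notes that every changed value is a hashable scalar. A small sketch of the difference, using made-up constants rather than anything from the codebase:

```python
# Illustrative values only; not taken from the repository.
STATUS_TUPLE = ("ok", "retry", None)
STATUS_SET = {"ok", "retry", None}


def same_result(value) -> bool:
    """True when tuple and set membership agree for a hashable value."""
    return (value in STATUS_TUPLE) == (value in STATUS_SET)


def set_membership_raises(value) -> bool:
    """True when `value in STATUS_SET` raises because value is unhashable."""
    try:
        _ = value in STATUS_SET
    except TypeError:
        return True
    return False
```

For a list such as `["ok"]`, the tuple test quietly evaluates to `False`, while the set test raises `TypeError`. That is the one behavioral gap the rewrite has to rule out.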
Teknium	3b9368a0c4	fix(auth): point SSH OAuth users at the tunnel they actually need (#26592 ) Two loopback-redirect OAuth flows (xAI Grok, Spotify) silently fail when Hermes runs on a remote host: the auth server redirects to 127.0.0.1:<port> on the user's laptop, not on the remote box. The --no-browser flag only suppresses webbrowser.open() — it doesn't change the bind address. Symptom xAI surfaces is 'Could not establish connection. We couldn't reach your app.', followed by a 'xAI authorization timed out waiting for the local callback' on the CLI side. Changes - hermes_cli/auth.py: new _print_loopback_ssh_hint() helper, called from _xai_oauth_loopback_login() and _spotify_login() right after they print the redirect URI. Silent off SSH; on SSH prints the exact 'ssh -N -L <port>:127.0.0.1:<port>' command using the actually-bound port (not the hardcoded constant — the listener auto-bumps when the preferred port is busy), a provider-specific docs URL, and a link to the new shared guide. - website/docs/guides/oauth-over-ssh.md (new): single source of truth for the tunnel pattern — TL;DR command, jump-box / ProxyJump variant, mosh+tmux+ControlMaster gotchas, troubleshooting. - website/docs/guides/xai-grok-oauth.md: fix the two sections that claimed --no-browser alone was enough; link to the shared guide. - website/docs/user-guide/features/spotify.md: expand the existing one-liner; link to the shared guide. - website/sidebars.ts: register the new page. - tests/hermes_cli/test_auth_loopback_ssh_hint.py: 7 unit tests covering SSH-vs-not, loopback-vs-not, malformed URIs, port echo, with and without provider docs URL.	2026-05-15 14:27:50 -07:00
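A rough sketch of the hint logic from 3b9368a0c4, under stated assumptions: the function name, the environment variables used to detect SSH, and the `user@remote-host` placeholder are all hypothetical, and the real `_print_loopback_ssh_hint()` also prints documentation links that are omitted here. The port in the usage example is arbitrary.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse

_LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def loopback_ssh_hint(redirect_uri: str,
                      environ: Optional[Mapping[str, str]] = None,
                      ssh_target: str = "user@remote-host") -> Optional[str]:
    """Return an `ssh -N -L` command for a loopback redirect URI, or None.

    Returns None when not on SSH (assumed detectable via SSH_* variables),
    when the URI is not a loopback address, or when it is malformed.
    """
    env = os.environ if environ is None else environ
    on_ssh = any(env.get(k) for k in ("SSH_CONNECTION", "SSH_CLIENT", "SSH_TTY"))
    if not on_ssh:
        return None
    try:
        parsed = urlparse(redirect_uri)
        host, port = parsed.hostname, parsed.port  # .port raises on junk
    except ValueError:
        return None
    if host not in _LOOPBACK_HOSTS or port is None:
        return None
    # Use the port from the URI that was actually bound, not a constant.
    return f"ssh -N -L {port}:127.0.0.1:{port} {ssh_target}"
```

Passing the redirect URI that was printed, rather than a preferred-port constant, keeps the hint correct when the listener has moved to another port.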
teknium1	aac6d97a14	chore(xai-oauth): trim CORS allowlist to xAI auth origins Drop accounts.mouseion.dev and localhost:20000 / 127.0.0.1:20000 from the loopback callback CORS allowlist — leftover dev origins. The redirect_uri is bound to 127.0.0.1 and gated by PKCE + state, so only xAI's own auth origins are needed. Co-Authored-By: Jaaneek <Jaaneek@users.noreply.github.com>	2026-05-15 12:11:32 -07:00
Jaaneek	b62c997973	feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider Adds a new authentication provider that lets SuperGrok subscribers sign in to Hermes with their xAI account via the standard OAuth 2.0 PKCE loopback flow, instead of pasting a raw API key from console.x.ai. Highlights ---------- * OAuth 2.0 PKCE loopback login against accounts.x.ai with discovery, state/nonce, and a strict CORS-origin allowlist on the callback. * Authorize URL carries `plan=generic` (required for non-allowlisted loopback clients) and `referrer=hermes-agent` for best-effort attribution in xAI's OAuth server logs. * Token storage in `auth.json` with file-locked atomic writes; JWT `exp`-based expiry detection with skew; refresh-token rotation synced both ways between the singleton store and the credential pool so multi-process / multi-profile setups don't tear each other's refresh tokens. * Reactive 401 retry: on a 401 from the xAI Responses API, the agent refreshes the token, swaps it back into `self.api_key`, and retries the call once. Guarded against silent account swaps when the active key was sourced from a different (manual) pool entry. * Auxiliary tasks (curator, vision, embeddings, etc.) route through a dedicated xAI Responses-mode auxiliary client instead of falling back to OpenRouter billing. * Direct HTTP tools (`tools/xai_http.py`, transcription, TTS, image-gen plugin) resolve credentials through a unified runtime → singleton → env-var fallback chain so xai-oauth users get them for free. * `hermes auth add xai-oauth` and `hermes auth remove xai-oauth N` are wired through the standard auth-commands surface; remove cleans up the singleton loopback_pkce entry so it doesn't silently reinstate. * `hermes model` provider picker shows "xAI Grok OAuth (SuperGrok Subscription)" and the model-flow falls back to pool credentials when the singleton is missing. 
Hardening --------- * Discovery and refresh responses validate the returned `token_endpoint` host against the same `.x.ai` allowlist as the authorization endpoint, blocking MITM persistence of a hostile endpoint. Discovery / refresh / token-exchange `response.json()` calls are wrapped to raise typed `AuthError` on malformed bodies (captive portals, proxy error pages) instead of leaking JSONDecodeError tracebacks. * `prompt_cache_key` is routed through `extra_body` on the codex transport (sending it as a top-level kwarg trips xAI's SDK with a TypeError). * Credential-pool sync-back preserves `active_provider` so refreshing an OAuth entry doesn't silently flip the active provider out from under the running agent. Testing ------- * New `tests/hermes_cli/test_auth_xai_oauth_provider.py` (~63 tests) covers JWT expiry, OAuth URL params (plan + referrer), CORS origins, redirect URI validation, singleton↔pool sync, concurrency races, refresh error paths, runtime resolution, and malformed-JSON guards. * Extended `test_credential_pool.py`, `test_codex_transport.py`, and `test_run_agent_codex_responses.py` cover the pool sync-back, `extra_body` routing, and 401 reactive refresh paths. * 165 tests passing on this branch via `scripts/run_tests.sh`.	2026-05-15 12:11:32 -07:00
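The JWT `exp`-based expiry detection with skew mentioned in b62c997973 might look roughly like this. It is a sketch only: the helper names, the 60-second default skew, and the choice to treat an unparseable token as expired are assumptions, and no signature verification is attempted because the check only decides when to refresh.

```python
import base64
import json
import time
from typing import Optional


def jwt_exp(token: str) -> Optional[int]:
    """Read the `exp` claim from a JWT payload without verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return None
    exp = claims.get("exp") if isinstance(claims, dict) else None
    return int(exp) if isinstance(exp, (int, float)) else None


def is_expired(token: str, skew_seconds: int = 60,
               now: Optional[float] = None) -> bool:
    """True when the token expires within `skew_seconds` of `now`."""
    exp = jwt_exp(token)
    if exp is None:
        return True  # assumption: refresh rather than trust an opaque token
    current = time.time() if now is None else now
    return current >= exp - skew_seconds
```

The skew makes the client refresh slightly early, so a request does not start with a token that lapses in flight.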
Teknium	3f13d78088	perf(tools): cache get_nous_auth_status() and load_env() to fix slow `hermes tools` menus (#25341 ) `hermes tools` -> "All Platforms" took ~14s to render the checklist because building the toolset labels called `get_nous_auth_status()` ~31x transitively (`_toolset_has_keys` -> `_visible_providers` -> `get_nous_subscription_features` -> `managed_nous_tools_enabled`). Each call did a synchronous OAuth refresh POST to portal.nousresearch.com (~350ms even on the failure path), so one menu paint burned >13s of HTTP and 31 single-use Nous refresh tokens. Secondary hot spot: every `get_env_value()` re-read and re-sanitised the entire .env file. 116 reads with O(lines x known-keys) scanning added ~300ms of CPU per render. Fix is two process-level caches, both mtime-keyed so login/logout/edit invalidate naturally: * `hermes_cli/auth.py`: memoise `get_nous_auth_status()` for 15s keyed on auth.json mtime. Splits `_compute_nous_auth_status()` as the uncached impl. Adds `invalidate_nous_auth_status_cache()`. * `hermes_cli/config.py`: memoise `load_env()` keyed on .env (path, mtime, size). Adds `invalidate_env_cache()`, wired into `save_env_value`, `remove_env_value`, and the sanitize-on-load writer so writers don't return stale dicts on same-second writes. Before/after on Teknium's box (real HERMES_HOME, no Nous login): * "All Platforms" cold path: ~13,874ms -> ~691ms label-build * Warm re-open within the same process: ~122ms -> ~17ms Side benefit: stops burning a Nous refresh token on every menu paint, which was risking the portal's reuse-detection revocation logic.	2026-05-13 18:40:14 -07:00
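The two caches in 3f13d78088 share one idea: reuse a computed value until the backing file changes or a time limit passes. A combined sketch follows; the class name and structure are hypothetical and do not mirror `hermes_cli/auth.py` or `hermes_cli/config.py`, which implement the two caches separately.

```python
import os
import time
from typing import Any, Callable, Optional, Tuple


class MtimeTTLCache:
    """Memoise `compute()` keyed on a file's (mtime, size), with a TTL."""

    def __init__(self, path: str, compute: Callable[[], Any],
                 ttl_seconds: float = 15.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._path = path
        self._compute = compute
        self._ttl = ttl_seconds
        self._clock = clock
        self._entry: Optional[Tuple[Any, float, Any]] = None

    def _key(self) -> Optional[Tuple[int, int]]:
        try:
            st = os.stat(self._path)
        except OSError:
            return None  # a missing file is a valid, stable key
        return (st.st_mtime_ns, st.st_size)

    def get(self) -> Any:
        key, now = self._key(), self._clock()
        if self._entry is not None:
            cached_key, stored_at, value = self._entry
            if cached_key == key and now - stored_at < self._ttl:
                return value
        value = self._compute()
        self._entry = (key, now, value)
        return value

    def invalidate(self) -> None:
        """Explicit reset for writers that change state within one mtime tick."""
        self._entry = None
```

Including the file size in the key, plus the explicit `invalidate()`, covers same-second writes that a coarse mtime alone could miss.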
Teknium	1e01b25e76	feat(providers): rename Alibaba Cloud to Qwen Cloud, reorder picker (#24835) - Rename 'Alibaba Cloud (DashScope)' display label to 'Qwen Cloud' in CANONICAL_PROVIDERS (model picker, /model, hermes model TUI) and PROVIDER_REGISTRY (setup wizard prompts, status output). - Move Qwen Cloud (alibaba) up to position 6 — directly below OpenAI Codex and above Xiaomi MiMo. - Move Qwen OAuth (Portal) (qwen-oauth) to the bottom of the canonical provider list. Provider slug 'alibaba' is unchanged — only the display label moved. DashScope env var (DASHSCOPE_API_KEY) and base URL are unchanged. The separate 'alibaba-coding-plan' plugin provider is not affected.	2026-05-12 22:43:41 -07:00
rob-maron	c23a87bc16	union paid recs from nous portal with static list (#24509)	2026-05-12 12:16:17 -07:00
Austin Pickett	58e2109f10	fix(minimax): harden OAuth dashboard and runtime Handle MiniMax OAuth expiry values consistently across CLI and dashboard flows, fix CLI status/add behavior, and force pooled OAuth runtime requests through Anthropic Messages. - web_server._minimax_poller: parse expired_in via the shared resolver so unix-ms absolute timestamps stop landing as TTL seconds and crashing with 'year 583911 is out of range' when a user connects MiniMax OAuth from the dashboard. - auth._minimax_oauth_login / _refresh_minimax_oauth_state: same fix on the CLI login + refresh paths. - auth.get_auth_status: dispatch minimax-oauth to its dedicated status function instead of falling through. - auth_commands.auth_add_command: 'hermes auth add minimax-oauth' now starts the device-code login flow and persists a pool entry with the access + refresh tokens, instead of requiring credentials to already exist. - runtime_provider._resolve_runtime_from_pool_entry: pin pooled minimax-oauth credentials to anthropic_messages so a stale model.api_mode: chat_completions can't send requests to /anthropic/chat/completions and trigger MiniMax nginx 404s. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 22:15:16 -07:00
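The expiry bug fixed in 58e2109f10 comes from one field carrying either a TTL in seconds or an absolute unix-millisecond timestamp. The shared resolver is not shown in the log, so the following is only a guess at the shape of such a function; the name, the threshold, and the heuristic itself are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Union

# Assumption: anything at or above 1e11 cannot be a sensible TTL in seconds
# (that would be over 3000 years), so it is read as unix milliseconds.
_ABSOLUTE_MS_THRESHOLD = 1e11


def resolve_expiry(expired_in: Union[int, float, str],
                   now: Optional[datetime] = None) -> datetime:
    """Turn an `expired_in` value into an absolute UTC expiry time."""
    current = now if now is not None else datetime.now(timezone.utc)
    value = float(expired_in)
    if value >= _ABSOLUTE_MS_THRESHOLD:
        return datetime.fromtimestamp(value / 1000.0, tz=timezone.utc)
    return current + timedelta(seconds=value)
```

Routing every login, refresh, and dashboard path through one such function keeps the interpretation of the field consistent, which is the point of the fix.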
Teknium	e85592591e	fix(nous): surface Portal-flagged free models in picker even when curated list is stale (#24082 ) Free-tier users were seeing 'No free models currently available.' in the `hermes model` and post-login pickers even though qwen/qwen3.6-plus is free on the Portal right now. Three independent breakages compounded: 1. The docs-hosted catalog manifest at website/static/api/model-catalog.json was not regenerated when _PROVIDER_MODELS['nous'] was updated, so users fetching the manifest got a list that didn't include qwen/qwen3.6-plus. 2. _resolve_nous_pricing_credentials() returned ('', '') on any auth blip, collapsing get_pricing_for_provider('nous') to {} and making every curated model fall through the free-tier filter as 'paid'. 3. Even with healthy pricing, the picker only ever showed models from the in-repo curated list intersected with live pricing — a Portal-flagged free model not yet in the curated list could never appear. Changes: - hermes_cli/models.py: new union_with_portal_free_recommendations() that augments the curated list with Portal freeRecommendedModels entries (with synthetic free pricing so partition keeps them). The Portal's /api/nous/recommended-models endpoint is now the source of truth for free-tier surfacing — old Hermes builds will see new free models without a CLI release. - hermes_cli/models.py: _resolve_nous_pricing_credentials() falls back to the public inference base URL when runtime cred resolution fails. The /v1/models endpoint exposes pricing without auth, so silently returning {} just because a refresh token expired was wrong. - hermes_cli/auth.py + hermes_cli/main.py: both free-tier picker call sites call union_with_portal_free_recommendations() before partition. - tests/hermes_cli/test_models.py: 7 tests covering union behaviour (prepend, dedup, end-to-end with stale pricing, empty/missing/error payloads, invalid entries). 
- tests/hermes_cli/test_model_catalog.py: drift guard TestManifestMatchesInRepoLists fails CI when _PROVIDER_MODELS['nous'] or OPENROUTER_MODELS is edited without re-running scripts/build_model_catalog.py. Verified empirically that removing a manifest entry triggers an assertion with an actionable error message. Validation: - 133/133 targeted tests pass (test_models, test_model_catalog, test_auth_nous_provider). - Live E2E against the real Portal: - Stale curated list ['claude-opus','claude-sonnet','gpt-5.4'] (no qwen) → after union: ['qwen/qwen3.6-plus', ...] → partition(free_tier=True): selectable=['qwen/qwen3.6-plus']. - Simulated expired refresh token → anon fetch returns 403 pricing entries including qwen/qwen3.6-plus -> {prompt:0, completion:0}. - ruff: clean.	2026-05-11 18:08:16 -07:00
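The union step from e85592591e (prepend Portal-flagged free models, deduplicate, attach synthetic free pricing) can be sketched as below. The signature and return shape are hypothetical and are not those of `union_with_portal_free_recommendations()`; overwriting any existing pricing for a Portal-flagged model is also an assumption.

```python
from typing import Any, Dict, Iterable, List, Tuple

FREE_PRICING = {"prompt": 0, "completion": 0}


def union_with_free_recommendations(
    curated: Iterable[str],
    portal_free: Iterable[Any],
    pricing: Dict[str, Dict[str, float]],
) -> Tuple[List[str], Dict[str, Dict[str, float]]]:
    """Return (merged model list, pricing) with Portal free models first."""
    merged: List[str] = []
    seen = set()
    new_pricing = dict(pricing)  # leave the caller's dict untouched
    for model_id in portal_free:
        # Skip invalid payload entries (None, empty strings, non-strings).
        if not isinstance(model_id, str) or not model_id.strip():
            continue
        if model_id in seen:
            continue
        seen.add(model_id)
        merged.append(model_id)
        new_pricing[model_id] = dict(FREE_PRICING)  # synthetic free pricing
    for model_id in curated:
        if model_id not in seen:
            seen.add(model_id)
            merged.append(model_id)
    return merged, new_pricing
```

With the stale curated list from the commit message and a Portal payload of `["qwen/qwen3.6-plus"]`, the free model ends up first in the merged list and carries zero pricing, so a free-tier filter keeps it.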
kshitij	2ec8d2b42f	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) Replace `x in (...)` with `x in {...}` for all literal-tuple membership tests. Set lookup is O(1) vs O(n) for tuple — consistent micro-optimization across the codebase. 608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining. 133 files, +626/-626 (net zero).	2026-05-11 11:13:25 -07:00
Teknium	cc38282b04	feat(cross-platform): psutil for PID/process management + Windows footgun checker ## Why Hermes supports Linux, macOS, and native Windows, but the codebase grew up POSIX-first and has accumulated patterns that silently break (or worse, silently kill!) on Windows: - `os.kill(pid, 0)` as a liveness probe — on Windows this maps to CTRL_C_EVENT and broadcasts Ctrl+C to the target's entire console process group (bpo-14484, open since 2012). - `os.killpg` — doesn't exist on Windows at all (AttributeError). - `os.setsid` / `os.getuid` / `os.geteuid` — same. - `signal.SIGKILL` / `signal.SIGHUP` / `signal.SIGUSR1` — module-attr errors at runtime on Windows. - `open(path)` / `open(path, "r")` without explicit encoding= — inherits the platform default, which is cp1252/mbcs on Windows (UTF-8 on POSIX), causing mojibake round-tripping between hosts. - `wmic` — removed from Windows 10 21H1+. This commit does three things: 1. Makes `psutil` a core dependency and migrates critical callsites to it. 2. Adds a grep-based CI gate (`scripts/check-windows-footguns.py`) that blocks new instances of any of the above patterns. 3. Fixes every existing instance in the codebase so the baseline is clean. ## What changed ### 1. psutil as a core dependency (pyproject.toml) Added `psutil>=5.9.0,<8` to core deps. psutil is the canonical cross-platform answer for "is this PID alive" and "kill this process tree" — its `pid_exists()` uses `OpenProcess + GetExitCodeProcess` on Windows (NOT a signal call), and its `Process.children(recursive=True)` + `.kill()` combo replaces `os.killpg()` portably. ### 2. `gateway/status.py::_pid_exists` Rewrote to call `psutil.pid_exists()` first, falling back to the hand-rolled ctypes `OpenProcess + WaitForSingleObject` dance on Windows (and `os.kill(pid, 0)` on POSIX) only if psutil is somehow missing — e.g. during the scaffold phase of a fresh install before pip finishes. ### 3. 
`os.killpg` migration to psutil (7 callsites, 5 files) - `tools/code_execution_tool.py` - `tools/process_registry.py` - `tools/tts_tool.py` - `tools/environments/local.py` (3 sites kept as-is, suppressed with `# windows-footgun: ok` — the pgid semantics psutil can't replicate, and the calls are already Windows-guarded at the outer branch) - `gateway/platforms/whatsapp.py` ### 4. `scripts/check-windows-footguns.py` (NEW, 500 lines) Grep-based checker with 11 rules covering every Windows cross-platform footgun we've hit so far: 1. `os.kill(pid, 0)` — the silent killer 2. `os.setsid` without guard 3. `os.killpg` (recommends psutil) 4. `os.getuid` / `os.geteuid` / `os.getgid` 5. `os.fork` 6. `signal.SIGKILL` 7. `signal.SIGHUP/SIGUSR1/SIGUSR2/SIGALRM/SIGCHLD/SIGPIPE/SIGQUIT` 8. `subprocess` shebang script invocation 9. `wmic` without `shutil.which` guard 10. Hardcoded `~/Desktop` (OneDrive trap) 11. `asyncio.add_signal_handler` without try/except 12. `open()` without `encoding=` on text mode Features: - Triple-quoted-docstring aware (won't flag prose inside docstrings) - Trailing-comment aware (won't flag mentions in `# os.kill(pid, 0)` comments) - Guard-hint aware (skips lines with `hasattr(os, ...)`, `shutil.which(...)`, `if platform.system() != 'Windows'`, etc.) - Inline suppression with `# windows-footgun: ok — <reason>` - `--list` to print all rules with fixes - `--all` / `--diff <ref>` / staged-files (default) modes - Scans 380 files in under 2 seconds ### 5. CI integration A GitHub Actions workflow that runs the checker on every PR and push is staged at `/tmp/hermes-stash/windows-footguns.yml` — not included in this commit because the GH token on the push machine lacks `workflow` scope. A maintainer with `workflow` permissions should add it as `.github/workflows/windows-footguns.yml` in a follow-up. 
Content: ```yaml name: Windows footgun check on: push: branches: [main] pull_request: branches: [main] jobs: check: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: actions/setup-python@v5 with: {python-version: "3.11"} - run: python scripts/check-windows-footguns.py --all ``` ### 6. CONTRIBUTING.md — "Cross-Platform Compatibility" expansion Expanded from 5 to 16 rules, each with message, example, and fix. Recommends psutil as the preferred API for PID / process-tree operations. ### 7. Baseline cleanup (91 → 0 findings) - 14 `open()` sites → added `encoding='utf-8'` (internal logs/caches) or `encoding='utf-8-sig'` (user-editable files that Notepad may BOM) - 23 POSIX-only callsites in systemd helpers, pty_bridge, and plugin tool subprocess management → annotated with `# windows-footgun: ok — <reason>` - 7 `os.killpg` sites → migrated to psutil (see §3 above) ## Verification ``` $ python scripts/check-windows-footguns.py --all ✓ No Windows footguns found (380 file(s) scanned). $ python -c "from gateway.status import _pid_exists; import os > print('self:', _pid_exists(os.getpid())); print('bogus:', _pid_exists(999999))" self: True bogus: False ``` Proof-of-repro that `os.kill(pid, 0)` was actually killing processes before this fix — see commit ``1cbe39914`` and bpo-14484. This commit removes the last hand-rolled ctypes path from the hot liveness-check path and defers to the best-maintained cross-platform answer.	2026-05-08 14:27:40 -07:00
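The grep-based approach described in cc38282b04 can be illustrated with a very small checker. This is a toy with two rules, not `scripts/check-windows-footguns.py`: the rule names and regexes are invented for the example, comment stripping is naive (it would mis-handle a `#` inside a string literal), and docstring and guard-hint awareness are left out.

```python
import re
from typing import List, Tuple

# Hypothetical rule set: (rule name, pattern).
RULES = [
    ("os-kill-zero", re.compile(r"\bos\.kill\([^,()]+,\s*0\s*\)")),
    ("os-killpg", re.compile(r"\bos\.killpg\b")),
]
SUPPRESS_MARKER = "# windows-footgun: ok"


def scan(source: str) -> List[Tuple[int, str]]:
    """Return (line number, rule name) for each unsuppressed finding."""
    findings: List[Tuple[int, str]] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if SUPPRESS_MARKER in line:
            continue  # inline suppression
        code = line.split("#", 1)[0]  # naive trailing-comment removal
        for name, pattern in RULES:
            if pattern.search(code):
                findings.append((lineno, name))
    return findings
```

A checker like this trades precision for speed and simplicity, which is why inline suppression is needed for the call sites that are already guarded.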

197 commits