Commit graph

9642 commits

Author SHA1 Message Date
Ben
2fc4615fc4 feat(dashboard-auth): Phase 7 — SPA AuthWidget + /api/status auth fields
Phase 7 surfaces the OAuth gate state to users.

web/src/components/AuthWidget.tsx (new):
  Sidebar widget that fetches /api/auth/me on mount and renders a
  compact 'Logged in as <user_id…> via <provider>' row with a logout
  icon. Contract V1 (Nous Portal) emits no email/display_name claims,
  so user_id is the display value (truncated to 14 chars + ellipsis);
  display_name and email fallthroughs are forward-compat for OQ-C1.
  Renders nothing on 401 from /api/auth/me — that's the signal the
  gate isn't engaged (loopback mode), in which case the widget would
  be confusing.
  Logout POSTs /auth/logout (which clears cookies + redirects to
  /login) then full-page-navigates to /login itself; the SPA's fetch
  wrapper doesn't follow that redirect, so the navigation is explicit.

web/src/App.tsx: mounts <AuthWidget /> above <SidebarFooter />.
  Component is self-hiding in loopback mode so there's no need for a
  conditional mount.

web/src/lib/api.ts:
  - getAuthMe() + logout() helpers
  - AuthMeResponse type
  - StatusResponse gets optional auth_required + auth_providers fields
    so the existing StatusPage can render a gated/loopback badge.

hermes_cli/web_server.py: /api/status payload now includes
  - auth_required: bool — whether app.state.auth_required is True
  - auth_providers: list[str] — registered DashboardAuthProvider names
  Lazy-imports list_providers so early-startup status calls don't
  crash if the dashboard_auth module is still being set up.

tests/hermes_cli/test_dashboard_auth_status_endpoint.py: 3 new tests
covering the new status fields in both gated and loopback modes plus
a regression that no existing field got dropped from the payload.

The hermes status CLI is unchanged in this commit — that command
tracks model providers + OAuth credentials, not running-dashboard
state. The /api/status endpoint is the canonical place to query
dashboard auth-gate state, consumed by the React StatusPage already.
2026-05-27 02:12:27 -07:00
Ben
5e9308b5b8 feat(dashboard-auth): Phase 6 — 401 re-auth envelope + next= propagation
Contract V1 of nous-account-service PR #180 ships no refresh tokens, so
the original Phase 6 silent-refresh design is replaced with a thinner
'401 → redirect to /login' UX. The dashboard's gated middleware now
emits a structured envelope on any auth failure; the SPA's fetch
wrapper sees it and full-page-navigates the user through re-auth.

hermes_cli/dashboard_auth/cookies.py:
  set_session_cookies(refresh_token='') SKIPS writing the
  hermes_session_rt cookie. Forward-compat: a non-empty refresh_token
  still emits the cookie unchanged, so a future Portal contract that
  starts issuing RTs flips the persistence on with no other change.
  clear_session_cookies still emits a Max-Age=0 deletion for the RT
  cookie so stale cookies from earlier deployments get flushed on
  logout / session expiry. Deprecation marker + rationale in
  module docstring per the user's docstring-only deprecation pattern.

hermes_cli/dashboard_auth/middleware.py:
  _unauth_response now builds a structured JSON envelope for API 401s:
    { error: 'session_expired' | 'unauthenticated',
      detail: 'Unauthorized',
      reason: <internal>,
      login_url: '/login?next=<safe-path>' }
  HTML redirects also carry next= so a user landing on /sessions
  without a cookie bounces back to /sessions after re-auth.
  _safe_next_target validates same-origin: drops protocol-relative
  paths (//evil.com), absolute URLs, and any /login or /auth/* loop.
  Dead cookies are cleared on the 401 path so the browser stops
  replaying invalid tokens.

hermes_cli/dashboard_auth/routes.py:
  /auth/callback accepts next= query param and validates via
  _validate_post_login_target (same rules as the gate's
  _safe_next_target — defence-in-depth because next= survived a full
  IDP round trip and attacker-controlled state can re-enter via the
  callback URL). Open-redirect attempts land at '/' instead.

web/src/lib/api.ts:
  fetchJSON parses the 401 envelope and full-page-navigates to
  body.login_url ONLY on the known session-expiry error codes.
  Domain-level 401s (e.g. permission errors) bubble up as regular
  errors. credentials: 'include' added so cookie auth works for all
  fetches routed through this wrapper. sessionStorage.lastLocation is
  preserved for future use by AuthWidget / hermes_status.

Test files marked with pytest.mark.xdist_group so the four files that
mutate web_server.app.state.auth_required serialize onto the same xdist
worker — eliminates 'works locally, fails in CI' app-state bleed.

20 new tests in test_dashboard_auth_401_reauth.py:
  - set_session_cookies(refresh_token='') skips RT cookie
  - clear_session_cookies still emits RT deletion
  - 401 envelope shape (unauthenticated vs session_expired)
  - dead cookie cleared on invalid-token 401
  - login_url carries next= for deep paths
  - login loop avoided when path is /login/auth/api-auth
  - protocol-relative URL rejected
  - _safe_next_target unit tests (accept same-origin, reject loops/abs)
  - /auth/callback respects safe next= but rejects open redirects

2 pre-existing tests updated to accept the new /login?next=%2F shape.

Full dashboard-auth suite: 168 passed, 1 skipped (Phase 0 pre-existing).
2026-05-27 02:12:27 -07:00
Ben
8971e94831 feat(dashboard-auth): SPA WS auth — getWsTicket() + buildWsAuthParam()
Phase 5 task 5.3. The dashboard's three WS-using surfaces (ChatPage,
gatewayClient, ChatSidebar) previously hardcoded ?token=<session>. In
gated mode the server rejects that path; the SPA must mint a single-use
ticket via POST /api/auth/ws-ticket and pass ?ticket= on the upgrade.

web/src/lib/api.ts: adds getWsTicket() (POST /api/auth/ws-ticket with
credentials: 'include') and buildWsAuthParam() — a helper that returns
['ticket', <minted>] in gated mode and ['token', <session>] in loopback.
Window.__HERMES_AUTH_REQUIRED__ is read from the server-injected
bootstrap script and toggles the path. Documented as the bridge from
cookie auth (REST) to WS auth.

web/src/pages/ChatPage.tsx: buildWsUrl() now takes an [authName, authValue]
pair instead of a bare token. The WS construct is wrapped in an IIFE so
the outer effect can stay synchronous (the cleanup returns the effect's
disposer at top level). onDataDisposable + onResizeDisposable hoisted to
`let` bindings the cleanup closes over.

web/src/lib/gatewayClient.ts: connect() branches on
window.__HERMES_AUTH_REQUIRED__ before opening /api/ws. Explicit token
overrides win (test-only path); otherwise gated → fetch ticket, loopback
→ use injected session token.

web/src/components/ChatSidebar.tsx: events-feed WS opens through the
same IIFE pattern as ChatPage. The ws local is hoisted so the cleanup's
ws?.close() works after the async mint resolves.

Server side already injects window.__HERMES_AUTH_REQUIRED__ in
_serve_index (Phase 3.5).
2026-05-27 02:12:27 -07:00
Ben
b2360ba44e feat(dashboard-auth): _ws_auth_ok helper + ticket auth on all 4 WS endpoints
Phase 5 task 5.2. Four WebSocket endpoints — /api/pty, /api/ws, /api/pub,
/api/events — previously authed with the same constant-time check against
`_SESSION_TOKEN`. Replaced with a single helper that branches on
`app.state.auth_required`:

  Loopback / --insecure: legacy ?token=<_SESSION_TOKEN> path (unchanged).
  Gated:                  ?ticket=<single-use> consumed against the
                          dashboard-auth ticket store.

Critical security property: gated mode UNCONDITIONALLY rejects the
?token= path. A leaked _SESSION_TOKEN value from a log line is not
replayable for WS access in gated deployments.

`_build_sidecar_url` now branches too: loopback uses the legacy token;
gated mode mints a server-internal ticket via mint_ticket() with
pseudo-user 'pty-sidecar' / provider 'server-internal' so audit logs can
distinguish PTY-internal sidecar tickets from browser tickets. PTY
children open /api/pub exactly once at startup so single-use suffices.

Ticket rejections audit-log as WS_TICKET_REJECTED with truncated reason
+ client IP + WS path. Operators debugging 'WS keeps closing' issues see
which endpoint and why.

17 new tests:
- POST /api/auth/ws-ticket: 200 with cookie, 401/302 without, distinct
  per call, GET-not-allowed.
- _ws_auth_ok loopback: token accept/reject, missing-token reject,
  ticket-param-ignored.
- _ws_auth_ok gated: ticket accept, single-use rejection, unknown reject,
  legacy-token-rejected-in-gated assertion, audit-log emission.
- _build_sidecar_url: loopback uses token=, gated uses ticket=, no-bound
  returns None.
2026-05-27 02:12:27 -07:00
Ben
b69fce9c86 feat(dashboard-auth): single-use WS tickets + POST /api/auth/ws-ticket
Phase 5 task 5.1. Browsers cannot set Authorization on a WebSocket
upgrade, so in gated mode the SPA needs an alternative way to bind the
upgrade to its authenticated session.

  hermes_cli/dashboard_auth/ws_tickets.py — in-memory single-use ticket
  store with 30s TTL. Thread-safe (threading.Lock), token_urlsafe(32)
  values, ticket value truncated to 8 chars in error messages for log
  hygiene. Module-level state with _reset_for_tests() helper.

  hermes_cli/dashboard_auth/routes.py — adds POST /api/auth/ws-ticket.
  Auth-required (the gate middleware already attaches Session to
  request.state.session). Returns {ticket, ttl_seconds}; emits
  WS_TICKET_MINTED audit event with user_id + provider + ip.

  hermes_cli/dashboard_auth/audit.py — adds WS_TICKET_REJECTED enum
  value for the consume-side rejection event (wired into the WS
  endpoints in task 5.2).

11 new tests covering round-trip, single-use, TTL boundary, unknown
ticket rejection, secret-hygiene truncation in error messages, and
concurrent mint+consume from 20 threads.
2026-05-27 02:12:27 -07:00
Ben
848baeb0a8 feat(dashboard-auth): plugins/dashboard_auth/nous — contract-compliant Nous OAuth provider
Bundled, kind=backend, auto-loads. Activates ONLY when Portal-injected
env vars are present:

  HERMES_DASHBOARD_OAUTH_CLIENT_ID  — agent:{instance_id}
  HERMES_DASHBOARD_PORTAL_URL       — Portal base URL

Loopback / --insecure operators leave both unset and never see this
plugin register anything. The fail-closed branch in start_server handles
the 'public bind + zero providers' case independently.

Implementation follows nous-account-service PR #180's published OAuth
contract verbatim:

  - client_id is per-instance (agent:{instance_id}); the suffix is
    cross-checked against the token's agent_instance_id claim as
    defense-in-depth (contract C9).
  - scope is agent_dashboard:access only (contract C3).
  - aud is the bare client_id, no hermes-cli: prefix (contract C2).
  - RS256 JWT verification against /.well-known/jwks.json with
    5-minute cache (contract C7).
  - No refresh tokens in V1: refresh_session always raises
    RefreshExpiredError; revoke_session is a no-op (contract C5).
  - oauth_contract_version claim: missing → warn + proceed; present
    and != 1 → refuse (contract C11, OQ-C2 tolerant treatment).
  - redirect_uri validated client-side as defense before bouncing to
    Portal; authoritative check is server-side per agent-redirect-uri.ts.

41 new tests covering construction, plugin-entry env gating, start_login
shape, complete_login httpx-mocked happy path + error mapping,
verify_session JWT verification (RSA keypair fixture, full claim-check
matrix), refresh_session always raising, revoke_session no-op.

PyJWT + cryptography are already in the venv (jose was previously
suggested; switched to pyjwt[crypto] since the latter is already
pulled in transitively).
2026-05-27 02:12:27 -07:00
Ben
53999b9e95 docs(dashboard-auth): plan v2 — incorporate Portal OAuth contract (PR #180)
Adds a 'Contract Anchor' section at the top of the plan summarizing the
11 material findings from nous-account-service PR #180's published
contract. Rewrites Phase 4 (Nous provider) and Phase 6 (re-auth UX)
in-place; the v1 drafts are preserved inline marked 'rejected —
preserved for archeology' for reviewer context.

Phases 0–3 (already shipped) are unaffected — they set up gate
engagement and cookie plumbing only. The cookies module's RT cookie
becomes dead in Phase 6 task 6.3 and is removed there.

Key contract-driven reversals:
  - client_id is per-instance (agent:{id}), env-injected — not static
  - audience is bare client_id, not 'hermes-cli:' prefixed
  - scope is 'agent_dashboard:access' only
  - JWT claims do NOT include email/name — surface user_id instead
  - no refresh tokens in V1 — 401 → redirect to /login
  - JWKS-only verification, no userinfo fallback
  - redirect_uri is exact-match per AgentInstance, not wildcard

Phase 7's AuthWidget needs to display user_id (truncated) instead of
email; one-line annotation added at the top of that phase.
2026-05-27 02:12:27 -07:00
Ben
53736b3922 feat(dashboard-auth): fail-closed on no providers; proxy_headers when gated; suppress _SESSION_TOKEN injection
Phase 3, Task 3.5. Three changes to web_server.py:

  1. start_server replaces the legacy SystemExit-refusing-to-bind guard
     with: if app.state.auth_required and no providers registered, exit
     with a clear message; otherwise log the gate-on banner. --insecure
     keeps its existing behaviour.

  2. uvicorn proxy_headers flag is computed from app.state.auth_required.
     Loopback / --insecure keep it False (so _ws_client_is_allowed sees
     the real peer for the loopback gate); gated mode flips it True so
     X-Forwarded-Proto from Fly's TLS terminator is honoured for cookie
     Secure-flag decisions in detect_https().

  3. _serve_index no longer injects window.__HERMES_SESSION_TOKEN__ when
     the gate is on — the SPA reads identity from /api/auth/me using
     cookie auth instead. window.__HERMES_AUTH_REQUIRED__ flag lets the
     SPA pick between ticket-auth (gated) and token-auth (loopback) for
     /api/pty + /api/ws (Phase 5 will wire this in the React layer).

4 new behavioural tests; loopback regression harness still green.
2026-05-27 02:12:27 -07:00
Ben
5b17eab67a feat(dashboard-auth): auth gate middleware + /auth/* routes + /login HTML
Phase 3, Tasks 3.2 + 3.3 + 3.4. These three pieces are mutually
dependent so they land together.

middleware.py - gated_auth_middleware engages when app.state.auth_required
is True.  Allowlists /login, /auth/*, /api/auth/providers, and static
asset paths; everything else demands a valid session_at cookie.  Verifies
by trying every registered provider's verify_session in turn (multi-
provider stack); attaches verified Session to request.state.session.
Returns 401 JSON for /api/* and 302 -> /login for HTML.  ProviderError
during verify -> 503.

routes.py - APIRouter with:
  GET  /login              server-rendered HTML
  GET  /auth/login?provider=N  302 to IDP + PKCE cookie
  GET  /auth/callback?code,state  completes login, sets session cookies
  POST /auth/logout        clears cookies + best-effort revoke
  GET  /api/auth/providers public bootstrap endpoint (503 if zero)
  GET  /api/auth/me        verified session as JSON (auth-required)

login_page.py - Inline-CSS HTML template, no React, no JavaScript.

web_server.py - Mounted gated_auth_middleware between host_header and
auth_middleware (FastAPI runs middlewares in registration order: host
check -> cookie auth -> token auth).  auth_middleware short-circuits
when auth_required so cookie auth is authoritative in gated mode.
Router is included before mount_spa so the catch-all doesn't swallow
/login or /auth/*.

17 new behavioural tests; loopback regression harness still green.
2026-05-27 02:12:27 -07:00
Ben
a30c4d8ebd feat(dashboard-auth): cookie helpers for session_at/session_rt/pkce
Phase 3, Task 3.1. Three cookies:
  - hermes_session_at: OAuth access token (HttpOnly, TTL = token TTL)
  - hermes_session_rt: OAuth refresh token (HttpOnly, 30d max-age)
  - hermes_session_pkce: PKCE state + verifier + provider hint (10min)

All SameSite=Lax + Path=/. Secure flag is set ONLY when the request
scheme is https — uvicorn proxy_headers=True (enabled in gated mode at
Phase 3.5) rewrites scheme from X-Forwarded-Proto so Fly's TLS
terminator works.
2026-05-27 02:12:27 -07:00
Ben
628a52fce2 test(dashboard-auth): stub auth provider for E2E gate testing
Phase 2, Task 2.1. Self-contained fake IDP — start_login redirects
straight back to {redirect_uri}?code=stub_code&state=<s> so tests can
walk the OAuth round trip in-process. Tokens are HMAC-signed JSON blobs
(not real JWTs) — enough structure for verify_session to detect tamper
and expiry without pulling in pyjwt.

Lives in tests/ only — never registered as a real plugin. Phase 3's
end-to-end tests import StubAuthProvider directly.

Convention: exp <= now counts as expired (TTL=0 means born-expired)
— matches what Phase 6's silent-refresh test will need.
2026-05-27 02:12:27 -07:00
Ben
865cae4f61 feat(dashboard-auth): json-lines audit log at $HERMES_HOME/logs/dashboard-auth.log
Phase 1, Task 1.4. Records every auth event (login start/success/failure,
logout, refresh success/failure, revoke, session verify failure, WS
ticket mint) as one JSON object per line. Token-like kwargs (access_token,
refresh_token, code, code_verifier, state, ticket, cookie, Authorization)
are dropped before serialisation so the log never contains live secrets.

Write failures log at WARNING but never raise — auth flows must not fail
because the audit logger broke.
2026-05-27 02:12:27 -07:00
Ben
c32b17f557 feat(plugins): add register_dashboard_auth_provider hook on PluginContext
Phase 1, Task 1.3. Mirrors the existing register_image_gen_provider
pattern (plugins.py:531) — wrong-type or duplicate-name registrations
log at WARNING and silently return rather than raising, so a misbehaving
auth plugin cannot crash the host.

Deviation from plan: the plan's draft raised TypeError on non-provider
input; switched to silent-warn to match the established image_gen
convention. Test updated to match.
2026-05-27 02:12:27 -07:00
Ben
1bbfed70c4 test(dashboard-auth): cover registry register/get/list/clear semantics
Phase 1, Task 1.2. Verifies registration order is preserved, duplicate
names are rejected with ValueError, and non-compliant providers fail at
register time (not later when the middleware tries to dispatch).
2026-05-27 02:12:27 -07:00
Ben
2dc6d03a3d feat(dashboard-auth): define DashboardAuthProvider ABC + Session dataclass
Phase 1, Task 1.1. New package hermes_cli/dashboard_auth/ contains:

  base.py     - DashboardAuthProvider ABC with 5 abstract methods
                (start_login, complete_login, verify_session,
                refresh_session, revoke_session), Session + LoginStart
                frozen dataclasses, three exception types
                (ProviderError / InvalidCodeError / RefreshExpiredError),
                and assert_protocol_compliance() for plugins to call
                in their own tests.
  registry.py - Module-level register/get/list/clear with a lock.

Nothing reads the registry yet — Phase 2 adds the StubAuthProvider and
Phase 3 wires the gate middleware. The plugin hook lands in Task 1.3.
2026-05-27 02:12:27 -07:00
Ben
949ad95e4b feat(dashboard): stash auth_required flag on app.state
Phase 0, Task 0.3. start_server now computes should_require_auth(host,
allow_public) and records it on app.state.auth_required BEFORE the
existing legacy SystemExit guard fires. This gives middleware, the SPA
token-injection path, and WS endpoints a consistent read source for
'is the gate active'. The flag is set but no one reads it yet — Phase 3
registers the gate middleware.

Note: 4 pre-existing test failures in tests/hermes_cli/test_web_server.py
(PtyWebSocket) + test_update_hangup_protection.py reproduce on pristine
HEAD and are unrelated to this change (starlette TestClient WS regression).
2026-05-27 02:12:27 -07:00
Ben
8773bbf186 feat(dashboard): add should_require_auth predicate for OAuth gate
Phase 0, Task 0.2. Single source of truth for 'is the auth gate active?'.
Reuses the existing _LOOPBACK_HOST_VALUES frozenset so this stays in sync
with the DNS-rebinding host-header check. RFC1918/CGNAT/link-local are
treated as public — exact threat model the gate exists for.
2026-05-27 02:12:27 -07:00
Ben
f2b479e7a2 test(dashboard): pin current loopback auth behavior as regression harness
Phase 0, Task 0.1 of the dashboard-oauth plan. Establishes a baseline for
the loopback dashboard's auth surface so future phases can prove they
didn't regress the existing _SESSION_TOKEN flow when adding the OAuth gate.
2026-05-27 02:12:27 -07:00
Teknium
249534e472
plugins: add security-guidance — pattern-matched warnings on dangerous code writes (#33131)
New opt-in plugin that scans the content passed to write_file / patch /
skill_manage for 25 known-dangerous code patterns — pickle.load,
yaml.load, eval(, os.system, subprocess(shell=True), child_process.exec,
dangerouslySetInnerHTML, innerHTML/outerHTML/document.write/
insertAdjacentHTML, crypto.createCipher (no IV), AES ECB,
TLS verification disabled, XXE-prone xml.etree/minidom parsers,
<script src=//...> without SRI, torch.load without weights_only=True,
GitHub Actions ${{ github.event.* }} injection — and appends a
"Security guidance" warning block to the tool result via the
transform_tool_result hook.

Default behaviour is non-blocking: the file is written and the warning
rides back to the model in the next turn so it can self-correct or
document why the construct is safe. SECURITY_GUIDANCE_BLOCK=1 upgrades
to refusing the write entirely; SECURITY_GUIDANCE_DISABLE=1 is the
kill switch.

Pattern data (patterns.py) is a verbatim Apache-2.0 fork of
Anthropic's claude-plugins-official/plugins/security-guidance/hooks/
patterns.py at commit 0bde168 (2026-05-26). LICENSE and NOTICE
preserve attribution. The Hermes-side plugin glue (__init__.py,
plugin.yaml, README.md, tests) is original work.

Plugin is opt-in like all bundled plugins:
  hermes plugins enable security-guidance

Inspired by https://x.com/ClaudeDevs/status/1927108527247... — Anthropic
shipped this as their security-guidance plugin for Claude Code on
2026-05-26 with a measured 30-40% reduction in security-related PR
comments on internal rollout.

What's NOT ported (deferred):
  * Layer 2 (LLM diff review on turn end) — would route through main
    model by default on Hermes, real money on reasoning models. A
    follow-up can wire it to a cheap aux model with explicit opt-in.
  * Layer 3 (agentic commit-time review) — agent can run this on
    demand via delegate_task today.
  * .hermes/security-guidance.md project-rules file — only used by
    layers 2/3 upstream.
2026-05-27 02:07:21 -07:00
Teknium
c752205635 chore(release): map superearn-fisher noreply for #33122 salvage 2026-05-27 02:06:21 -07:00
SuperEarn
4920f8437f test(codex): cover null output stream terminal events 2026-05-27 02:06:21 -07:00
orcool
f0fdb5e67d feat(catalog): add qwen3.7-max to alibaba + alibaba-coding-plan model lists
Alibaba's latest flagship Qwen model is released but not yet present in the
DashScope (alibaba) or Alibaba Coding Plan curated catalogs.  Add it so it
shows up in the /model picker and setup wizard for those providers.

OpenCode Go routing for qwen3.7-max already landed via #32780 (commit 2fc77c53f).
OpenRouter + Nous catalog entries already landed via #32809 (commit ccd3d04fc).
This salvage picks up the remaining alibaba / alibaba-coding-plan entries from
#32806 — the AI Gateway entry is dropped because Vercel AI Gateway was removed
in #33067.
2026-05-27 02:05:58 -07:00
Teknium
96223265b9 chore(api-server): mark skills_api capability True now that /v1/skills shipped
#33016 added GET /v1/skills + /v1/toolsets on the API server; the
capability flag introduced in this branch was placeholder-False. Flip
to True so capability probers see the truth.
2026-05-27 01:56:55 -07:00
Jonathan
464b51d455 Support media in session chat API 2026-05-27 01:56:55 -07:00
Bailey Dixon
f7527b0fdb feat: add API server session controls 2026-05-27 01:56:55 -07:00
Teknium
f0be32232d chore(release): map EvilHumphrey noreply for #33034 salvage 2026-05-27 01:52:34 -07:00
EvilHumphrey
4243b6dc45 fix(codex): update silent-hang workaround hint 2026-05-27 01:52:34 -07:00
Siddharth Balyan
976979489a
feat(nix): add #messaging and #full package variants (#33108)
* fix(plugins/discord): correct install_hint extra to [messaging]

The Discord platform registered install_hint pointing at
'hermes-agent[discord]', but pyproject.toml has no [discord] extra —
the deps live in [messaging] alongside Telegram and Slack. Users hitting
"Platform 'Discord' requirements not met" were directed at a pip command
that installs nothing.

* feat(nix): add #messaging and #full package variants

Make Discord/Telegram/Slack work out of the box for `nix profile install`
users. Messaging deps were dropped from [all] on 2026-05-12 in favor of
lazy-install, but lazy-install can't write to the read-only /nix/store —
users hit "No adapter available for discord" with no actionable guidance.

  - #messaging: pre-built with discord.py/telegram/slack (+33 MB venv)
  - #full:      all 18 platform-portable extras + matrix on Linux only
                (python-olm lacks Darwin PyPI wheels) (+738 MB venv)

Also adds a `messaging-variant` flake check that verifies `import discord`
succeeds in the sealed venv — regression guard for the lazy-install
migration.

Docs updated: Quick Start callout, extraDependencyGroups rewrite with
messaging as primary example + full extras table, troubleshooting row,
cheatsheet row.

Closure size deltas (measured x86_64-linux):
  default   1792 MB pkg / 512 MB venv
  messaging 1826 MB pkg / 546 MB venv   (+33 MB)
  full      2530 MB pkg / 1250 MB venv  (+738 MB)

* chore(nix): trim variant comments + alphabetize full extras

Drop the date-stamped changelog from messaging-variant's comment and the
"+33 MB / +704 MB" numbers from the variant defs — those drift and belong
in the PR description, not source. Alphabetize the 18-extra list in #full
so future additions produce clean one-line diffs.

No semantic change. messaging-variant check still passes.
2026-05-27 14:15:39 +05:30
Teknium
25f43d38de
feat(api-server): add GET /v1/skills and /v1/toolsets (#33016)
Lets external clients enumerate the agent's skills and resolved toolsets
deterministically over the OpenAI-compatible API server, without standing
up the dashboard web server or sending a chat message and asking the model
to list them.

- GET /v1/skills — list installed skills (name, description, category)
- GET /v1/toolsets — list toolsets resolved for the api_server platform,
  with enabled/configured state and the concrete tool names each expands
  to
- Both gated by API_SERVER_KEY (same Bearer scheme as every other /v1/*
  endpoint)
- /v1/capabilities advertises both new endpoints

Closes the gap a community user just hit asking how to list skills over
REST when only the OpenAI-compatible server is running.

Test plan
- python -m pytest tests/gateway/test_api_server.py -k "Skills or Toolsets or Capabilities" -o 'addopts=' -q
  → 9/9 pass
- python -m pytest tests/gateway/test_api_server.py -o 'addopts=' -q
  → 156/156 pass, no regressions
- E2E: started a real adapter on an isolated HERMES_HOME with a fake
  skill installed; curl-equivalent calls to /v1/capabilities,
  /v1/skills, /v1/toolsets returned the expected JSON; unauthenticated
  calls returned 401 with the configured API_SERVER_KEY.
2026-05-27 01:27:26 -07:00
Teknium
febc4cfec0
remove Vercel AI Gateway and Vercel Sandbox (#33067)
* remove Vercel AI Gateway provider and Vercel Sandbox terminal backend

Both Vercel-hosted integrations are removed end-to-end. Users on the AI
Gateway should switch to OpenRouter or one of the other aggregators
(Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should
switch to Docker, Modal, Daytona, or SSH.

What's removed:
- `plugins/model-providers/ai-gateway/` provider plugin
- `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper
- `tools/environments/vercel_sandbox.py` terminal backend
- `ai-gateway` provider wiring across auth, doctor, setup, models,
  config, status, providers, main, web_server, model_normalize, dump
- `vercel_sandbox` backend wiring across terminal_tool, file_tools,
  code_execution_tool, file_operations, approval, skills_tool,
  environments/local, credential_files, lazy_deps, prompt_builder,
  cli, gateway/run
- `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client
  header set, run_agent base-URL header/reasoning special-cases
- `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock
- env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`,
  `VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`,
  `TERMINAL_VERCEL_RUNTIME`
- Tests: deletes test_ai_gateway_models.py and
  test_vercel_sandbox_environment.py; scrubs references across 23
  surviving test files (no entire tests deleted unless they were
  dedicated to AI Gateway / Sandbox)
- Docs: provider tables, env-var reference, setup guides, security
  notes, tool config, terminal-backend tables — English plus zh-Hans
  i18n parity
- `hermes-agent` skill: provider table entry and remote-backend list

What stays (intentional):
- `popular-web-designs/templates/vercel.md` — CSS design reference,
  unrelated to Vercel-the-AI-product
- `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN
  response header, useful diag signal on any Vercel-hosted endpoint
- `vercel-labs/agent-browser` URL in browser config — lightpanda
  browser project, different OSS effort
- `userStories.json` historical contributor entry mentioning Vercel
  Sandbox — archive, not active docs

Validation:
- 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`)
- Full repo `py_compile` clean
- Live import of every touched module + invariant check (no
  `ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no
  `vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`)

* test: convert profile-count check from change-detector to invariant

The hardcoded "== 34" assertion broke when ai-gateway was removed.
Per AGENTS.md change-detector-test guidance, assert the relationship
(registry count >= number of plugin dirs) instead of a literal count.
Counts shift when providers are added/removed; that's expected.
2026-05-27 00:43:32 -07:00
Teknium
cb38ce28cb
refactor(codex): drop SDK responses.stream() helper; consume events directly (#33042)
* refactor(codex): drop SDK responses.stream() helper; consume events directly

The OpenAI Python SDK's high-level `client.responses.stream(...)` helper
does post-hoc typed reconstruction from the terminal
`response.completed.response.output` field.  The chatgpt.com Codex
backend has been observed (today, gpt-5.5) to ship `response.output =
null` on terminal frames, which crashes the SDK with `TypeError:
'NoneType' object is not iterable` mid-iteration.

Carlton's #32963 patched the symptom by wrapping the helper in
try/except and recovering from the same per-event accumulator the SDK
was supposed to populate.  This PR removes the helper from the call
path entirely: we now use `client.responses.create(stream=True)` (raw
AsyncIterable of SSE events) and assemble the final response object
ourselves from `response.output_item.done` events as they arrive.  The
terminal event's `output` field is never read for content.  Same
strategy OpenClaw uses for the same backend.

This makes Hermes structurally immune to the bug class, not patched.
The next time OpenAI ships a shape change to chatgpt.com's terminal
frame, our consumer keeps working because it doesn't read that frame
for content — only for usage/status/id.

Changes
- `agent/codex_runtime.py`: new `_consume_codex_event_stream()` shared
  consumer; `run_codex_stream()` uses `responses.create(stream=True)`;
  `run_codex_create_stream_fallback()` collapses into a thin alias
  since the primary path now does what the fallback used to do.
- `agent/auxiliary_client.py`: `_CodexCompletionsAdapter` uses the
  same consumer; old null-output recovery helpers deleted as
  unreferenced.
- Tests migrated: fixtures that mocked `responses.stream` now mock
  `responses.create` returning a raw iterable.  New regression test
  asserts the auxiliary path returns streamed items even when the
  terminal event's `output` is literally `null`.

Validation
- Live: tested against fresh OAuth on `chatgpt.com/backend-api/codex`
  with `gpt-5.5` — response built correctly with `response.output=null`
  on the terminal frame, all events consumed, usage/reasoning tokens
  propagated.
- `tests/run_agent/test_run_agent_codex_responses.py` +
  `tests/agent/test_auxiliary_client.py`: 242 passed.

* test+fix(codex): migrate streaming tests, raise on truncated streams

CI surfaced 10 test failures across tests/run_agent/test_streaming.py
and tests/run_agent/test_codex_xai_oauth_recovery.py — both files had
their own `responses.stream(...)` mocks I missed in the first sweep.

agent/codex_runtime.py: _consume_codex_event_stream() now raises
"Codex Responses stream did not emit a terminal response" when the
stream ends without any terminal frame AND no usable content. This
preserves the signal callers used to get from the SDK's high-level
helper, which they distinguished from "completed with empty body"
in error handling.

Tests migrated:
- test_streaming.py: text-delta callback, activity-touch, and
  remote-protocol-error tests all switch from mocking responses.stream
  to responses.create returning an iterable of events.
- test_codex_xai_oauth_recovery.py: prelude-error tests are recast as
  wire-error-event tests (the new path raises _StreamErrorEvent
  directly when the wire emits type=error, which is strictly better
  than the old two-phase "SDK RuntimeError → retry → fallback"). The
  retry-on-transport-error test moves from responses.stream side-effect
  to responses.create side-effect.

Verified live against chatgpt.com Codex with gpt-5.5 — AIAgent.chat()
through the full codex_responses path returns correctly, 319/319
targeted tests passing.
2026-05-27 00:30:06 -07:00
Ben
fb298a958c fix(docker): mkdir HERMES_HOME as root in stage2 before chown / privilege drop (#18488)
When HERMES_HOME points at a custom path whose parent directories
only root can create (e.g. HERMES_HOME=/home/hermes/.hermes in a
Compose file, or any path under a fresh / not pre-populated by the
image), stage2-hook.sh fails on first boot:

  [stage2] Warning: chown failed (rootless container?) - continuing
  mkdir: cannot create directory '/custom': Permission denied
  mkdir: cannot create directory '/custom': Permission denied
  ... (one per s6-setuidgid hermes mkdir invocation)
  cont-init: info: /etc/cont-init.d/01-hermes-setup exited 1

The mkdirs fail because s6-setuidgid drops to hermes (UID 10000)
before invoking mkdir -p, and the runtime user has no permission to
create root-owned ancestor directories. 02-reconcile-profiles then
crashes with FileNotFoundError, .install_method never lands, and
the container limps on in a half-initialized state.

Bootstrap HERMES_HOME with mkdir -p while still root, before the
ownership normalization. Idempotent on the default /opt/data path
(directory already exists from the Dockerfile RUN mkdir -p) and on
any subsequent restart. (#18482)

Retargeted from the original PR's docker/entrypoint.sh (now a
deprecated shim) to docker/stage2-hook.sh where the related chown
logic moved during the s6-overlay rework.

Co-authored-by: wpengpeng168 <133926080+wpengpeng168@users.noreply.github.com>
2026-05-27 17:16:40 +10:00
Ben
c3bdb2af37 ci(docker): add shellcheck shell=sh directive to main-wrapper.sh
shellcheck doesn't recognize the s6-overlay `#!/command/with-contenv sh`
shebang and aborts with SC1008 ("This shebang was unrecognized. ShellCheck
only supports sh/bash/dash/ksh/'busybox sh'. Add a 'shell' directive to
specify."). The error fires at --severity=error too, so it fails the
"Docker / shell lint" CI job on every PR that touches docker/.

Add the canonical `# shellcheck shell=sh` directive — same fix already
applied to the sibling cont-init.d scripts (`02-reconcile-profiles` and
`015-supervise-perms`) when they adopted the with-contenv shebang.

The shebang was changed from `#!/bin/sh` → `#!/command/with-contenv sh`
in PR #32412 (commit 29c71e9) to fix env-propagation through s6's PID 1.
The shellcheck-directive line was missed in that PR; this patches it.

Reproduces locally:
  docker run --rm -v "$PWD:/mnt" -w /mnt koalaman/shellcheck:stable \
    --severity=error --format=gcc docker/main-wrapper.sh

Before:  docker/main-wrapper.sh:1:1: error: [SC1008]  (rc=1)
After:   (no output)                                   (rc=0)

Script behavior is unchanged — the directive is a comment, and `sh -n`
/ `bash -n` parse the file cleanly either way.
2026-05-27 16:32:35 +10:00
Ben
27a29ee54e feat(docker): upgrade Node to 22 LTS via multi-stage from node:22-bookworm-slim (#4977)
Debian trixie's bundled `nodejs` package is pinned to 20.19.2, which
reached LTS EOL in April 2026. Trixie won't upgrade in place; Debian 14
(forky) — where the apt nodejs is 24.x — isn't released until ~mid-2027.

To stay on a supported LTS without waiting for Debian 14, copy node + npm
+ corepack from the upstream `node:22-bookworm-slim` image as a
multi-stage source, matching the existing `uv_source` and `gosu_source`
patterns in the Dockerfile. Bookworm-based slim image is used so the
produced binary links against glibc 2.36, which runs cleanly on Debian 13
(trixie, glibc 2.41).

Changes:
- Add `FROM node:22-bookworm-slim@sha256:... AS node_source` stage
- Remove `nodejs npm` from `apt-get install` (now sourced from node_source)
- Add `ca-certificates` explicitly to apt install (was a transitive of
  the apt nodejs package; removing nodejs broke the chain and curl
  inside the build failed with "error setting certificate file")
- COPY node binary + npm + corepack from node_source; recreate the
  symlinks at /usr/local/bin/{npm,npx,corepack}
- Update the npm_config_install_links=false comment block — npm 10's
  default is already `install-links=false`, but we keep the env as
  defense-in-depth against future Node-source-version regressions

Future bumps to Node 24/26 are a one-line ARG change.

Validation:
- Built --no-cache against current origin/main; build succeeds in 1m42s
- Image size: 3.27 GB (pre-salvage-1 baseline) → 3.14 GB (this PR);
  net 130 MiB savings (60 MiB from this change alone vs current main —
  removing apt nodejs+transitive deps that duplicated what node bundles)
- Node 22.22.3 / npm 10.9.8 / esbuild 0.27.7 all run cleanly under
  trixie's glibc 2.41
- Standard image smoke (6/6), Node-version E2E (8/8), chown E2E from
  #19788 (6/6), TUI UID-remap E2E from #28851 (4/4) — 24 checks total

Co-authored-by: Prithvi Monangi <8312237+Prithvi1994@users.noreply.github.com>
2026-05-27 16:22:21 +10:00
Ben
22eb4d13f7 fix(docker): chown ui-tui and node_modules on UID remap so TUI esbuild works (#28851)
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker / shell lint / Lint Dockerfile (hadolint) (push) Waiting to run
Docker / shell lint / Lint docker/ shell scripts (shellcheck) (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Docker Build and Publish / move-latest (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
Build Skills Index / build-index (push) Has been cancelled
Build Skills Index / trigger-deploy (push) Has been cancelled
When HERMES_UID remaps the hermes user from 10000 to another UID
(e.g. matching the host user's UID for bind-mount ergonomics), the TUI
launcher's esbuild step fails:

  ✘ [ERROR] Failed to write to output file:
     open /opt/hermes/ui-tui/dist/entry.js: permission denied
  TUI build failed.

This is because the Dockerfile's build-time `chown -R hermes:hermes` on
`/opt/hermes/{.venv,ui-tui,node_modules}` (line 154) wrote UID 10000,
and stage2-hook.sh only re-chowned `.venv` on UID remap — leaving the
TUI build trees still owned by the old UID.

Extend the stage2 re-chown to include the same set as the build-time
chown: `.venv`, `ui-tui`, `node_modules`. These are the runtime-writable
trees under $INSTALL_DIR; everything else under /opt/hermes is read-only
at runtime so keeping it root-owned is fine.

Original fix targeted docker/entrypoint.sh which is now a deprecated shim;
retargeted to docker/stage2-hook.sh where the .venv chown moved during
the s6-overlay rework.

Co-authored-by: Andreas Steffan <623481+deas@users.noreply.github.com>
2026-05-27 15:41:48 +10:00
Ben
9eadb6805c fix(docker): targeted chown to preserve host file ownership in HERMES_HOME (#19795)
Replaces the recursive chown of $HERMES_HOME in stage2-hook.sh with a
targeted approach: chown the top-level dir (so hermes can create new subdirs)
plus the specific hermes-owned subdirectories (cron/, sessions/, logs/,
hooks/, memories/, skills/, skins/, plans/, workspace/, home/, profiles/) —
the same canonical list seeded by the s6-setuidgid mkdir -p block below.

Avoids clobbering host-side file ownership when $HERMES_HOME is a bind
mount that contains user-owned files not managed by hermes (issue #19788).

Original fix targeted docker/entrypoint.sh which is now a deprecated shim;
retargeted to docker/stage2-hook.sh where the recursive chown moved during
the s6-overlay rework.

Co-authored-by: Ptichalouf <1809721+ptichalouf@users.noreply.github.com>
2026-05-27 15:08:41 +10:00
Teknium
b6ca56f651
fix(codex-responses): gracefully recover from invalid_encrypted_content (salvage #10144) (#33035)
* fix(codex-responses): gracefully recover from invalid_encrypted_content (salvage #10144)

When an OpenAI-compatible Responses API surface accepts an initial
request but later rejects the replayed `codex_reasoning_items`
encrypted blob with HTTP 400 `invalid_encrypted_content`, the
session previously got stuck retrying the same poisoned payload.

Recovery: classify the error as a dedicated FailoverReason, and on the
first hit disable encrypted reasoning replay for the rest of the
session, strip cached items from message history, and retry once.

Changes:
* error_classifier: add FailoverReason.invalid_encrypted_content
  branch in _classify_400 (before context_overflow so the messages
  that mention 'encrypted content … could not be verified' don't trip
  context heuristics), in _classify_by_error_code, and extend
  _extract_error_code to peek inside wrapped JSON in error.message and
  ignore the bare '400' as a code.
* agent_init: initialize `_codex_reasoning_replay_enabled = True` on
  every agent.
* run_agent: add AIAgent._disable_codex_reasoning_replay() helper
  that flips the flag and pops cached items.
* codex_responses_adapter: thread a `replay_encrypted_reasoning`
  kwarg through _chat_messages_to_responses_input so that when the
  flag is False we don't replay codex_reasoning_items.
* transports/codex.py: read `replay_encrypted_reasoning` from params,
  thread it into the adapter, and gate the
  `include=['reasoning.encrypted_content']` request hint on it.
* chat_completion_helpers: pass the agent's replay flag through to
  the transport.
* conversation_loop: in the retry loop, add an
  invalid_encrypted_content recovery branch that fires once per
  session, only when api_mode == codex_responses, only when replay is
  still enabled, and only when at least one assistant message in
  history actually carries cached reasoning items (otherwise the 400
  has nothing to do with our cache and the normal retry path handles
  it).

Tests:
* test_error_classifier: new wrapped-JSON _extract_error_code case;
  new TestClassifyApiError cases proving the 400 is retryable with
  no fallback, that the broad message match doesn't catch a generic
  'parsed' message, and that the error code match is
  case-insensitive.
* test_run_agent_codex_responses: end-to-end test of the recovery
  branch firing once and disabling replay, plus a sibling test that
  proves the branch does *not* fire (and the flag stays True) when
  history has no cached reasoning items.

Salvages PR #10144 onto the post-refactor module layout
(error_classifier / codex_responses_adapter / transports/codex /
conversation_loop / agent_init) since the original diff was written
against the pre-refactor monolithic run_agent.py.

* chore(release): map victorGPT in AUTHOR_MAP for #10144 salvage

---------

Co-authored-by: victorGPT <wuxuebin1993@gmail.com>
2026-05-26 22:01:17 -07:00
Jeffrey Quesnelle
9d3e9316f4
Merge pull request #29591 from NousResearch/jq/hermes-update-branch-flag
feat(cli): add --branch flag to `hermes update`
2026-05-27 00:57:37 -04:00
emozilla
3d9a26afad Merge remote-tracking branch 'origin/main' into jq/hermes-update-branch-flag 2026-05-27 00:48:25 -04:00
Ben
1e5884e38f refactor(docker): drop build-essential from apt install (#27507)
build-essential is a Debian metapackage (libc6-dev + gcc + g++ + make + dpkg-dev).
The Dockerfile already installs gcc + python3-dev + libffi-dev explicitly,
which covers the C-ext compile cases lazy_deps may hit at first boot.
g++/make/dpkg-dev aren't reached by the resolved [all]+[messaging] tree
on current main — verified via uv sync --dry-run on cp313-linux.

Co-authored-by: Monty Taylor <mordred@inaugust.com>
2026-05-27 14:35:19 +10:00
Ben Barclay
81a4f280d2
Merge pull request #22534 from wesleysimplicio/fix/voice-mode-docker-respect-pulse-pipewire
fix(voice): honor PULSE_SERVER/PIPEWIRE_REMOTE inside Docker (#21203)
2026-05-27 13:59:12 +10:00
teknium1
9feadc2734 chore(release): map ticketclosed-wontfix noreply to GitHub login 2026-05-26 20:51:59 -07:00
Nick
0a83247e9f feat: add TUI session orchestrator
Add a first-class active-session orchestrator for the Ink TUI:

- list, activate, close, and launch live process-local TUI sessions
- hydrate committed and in-flight output when switching sessions
- dispatch a new prompt session from the +new row with session-scoped model picks
- expose a clickable live-session count in the status chrome
- preserve stable row order while initially focusing the current session
- support mouse hit-testing for floating orchestrator overlays
- add backend and frontend regression coverage for the lifecycle and UI helpers
2026-05-26 20:51:59 -07:00
beardthelion
2fc77c53f0 feat(opencode-go): route qwen3.7-max via anthropic_messages
qwen3.7-max on OpenCode Go rejects the OpenAI-compatible (oa-compat)
format with HTTP 401 but works correctly via the Anthropic Messages
endpoint (/v1/messages with x-api-key auth).  Route it the same way
MiniMax models are routed: anthropic_messages api_mode.

Changes:
- hermes_cli/models.py: add qwen3.7-max routing + curated list
- hermes_cli/setup.py: add to setup wizard model list
- hermes_cli/auth.py: update provider comment
- tests: add assertions for qwen3.7-max api_mode routing
2026-05-26 20:44:43 -07:00
Ben Barclay
3c7f786ade
Merge pull request #31557 from yu-xin-c/codex/docs-xurl-docker-home-29108
docs: clarify xurl auth HOME in Docker
2026-05-27 13:42:51 +10:00
Ben Barclay
7d94eee0a9
Merge pull request #32122 from yu-xin-c/codex/docs-docker-audio-bridge-32009
docs: add Docker audio bridge notes
2026-05-27 13:42:36 +10:00
Ben Barclay
628aaea63a
Merge pull request #32412 from jonpol01/fix/docker-env-propagation
fix(docker): propagate env through s6 to cont-init and main CMD
2026-05-27 13:42:26 +10:00
Ben Barclay
840f79ed12
Merge pull request #31031 from Sunil123135/feat/windows-docker-desktop
feat(docker): add Windows Docker Desktop compatible compose file
2026-05-27 13:41:16 +10:00
Will Falcon
bba50977bc fix: parse Codex image generation SSE directly 2026-05-26 20:40:29 -07:00
Teknium
16e86ce6a7
chore(release): map wangpuv contributor email for #32933 (#33005)
Pre-stages the AUTHOR_MAP entry so the contributor-check workflow
passes when Will Falcon's image-gen SSE fix lands.
2026-05-26 20:40:17 -07:00