Commit graph

530 commits

Author SHA1 Message Date
Teknium
05d8f11085
fix(/model): show provider-enforced context length, not raw models.dev (#15438)
/model gpt-5.5 on openai-codex showed 'Context: 1,050,000 tokens' because
the display block used ModelInfo.context_window directly from models.dev.
Codex OAuth actually enforces 272K for the same slug, and the agent's
compressor already runs at 272K via get_model_context_length() — so the
banner + real context budget said 272K while /model lied with 1M.

Route the display context through a new resolve_display_context_length()
helper that always prefers agent.model_metadata.get_model_context_length
(which knows about Codex OAuth, Copilot, Nous caps) and only falls back
to models.dev when that returns nothing.

Fix applied to all 3 /model display sites:
  cli.py _handle_model_switch
  gateway/run.py picker on_model_selected callback
  gateway/run.py text-fallback confirmation

Reported by @emilstridell (Telegram, April 2026).
2026-04-24 17:21:38 -07:00
Austin Pickett
63975aa75b fix: mobile chat in new layout 2026-04-24 12:07:46 -04:00
emozilla
f49afd3122 feat(web): add /api/pty WebSocket bridge to embed TUI in dashboard
Exposes hermes --tui over a PTY-backed WebSocket so the dashboard can
embed the real TUI rather than reimplement its surface. The browser
attaches xterm.js to the socket; keystrokes flow in, PTY output bytes
flow out.

Architecture:

    browser <Terminal> (xterm.js)
           │  onData ───► ws.send(keystrokes)
           │  onResize ► ws.send('\x1b[RESIZE:cols;rows]')
           │  write   ◄── ws.onmessage (PTY bytes)
           ▼
    FastAPI /api/pty (token-gated, loopback-only)
           ▼
    PtyBridge (ptyprocess) ── spawns node ui-tui/dist/entry.js ──► tui_gateway + AIAgent

Components
----------

hermes_cli/pty_bridge.py
  Thin wrapper around ptyprocess.PtyProcess: byte-safe read/write on the
  master fd via os.read/os.write (not PtyProcessUnicode — ANSI is
  inherently byte-oriented and UTF-8 boundaries may land mid-read),
  non-blocking select-based reads, TIOCSWINSZ resize, idempotent
  SIGHUP→SIGTERM→SIGKILL teardown, platform guard (POSIX-only; Windows
  is WSL-supported only).

hermes_cli/web_server.py
  @app.websocket("/api/pty") endpoint gated by the existing
  _SESSION_TOKEN (via ?token= query param since browsers can't set
  Authorization on WS upgrades). Loopback-only enforcement. Reader task
  uses run_in_executor to pump PTY bytes without blocking the event
  loop. Writer loop intercepts a custom \x1b[RESIZE:cols;rows] escape
  before forwarding to the PTY. The endpoint resolves the TUI argv
  through a _resolve_chat_argv hook so tests can inject fake commands
  without building the real TUI.

Tests
-----

tests/hermes_cli/test_pty_bridge.py — 12 unit tests: spawn, stdout,
stdin round-trip, EOF, resize (via TIOCSWINSZ + tput readback), close
idempotency, cwd, env forwarding, unavailable-platform error.

tests/hermes_cli/test_web_server.py — TestPtyWebSocket adds 7 tests:
missing/bad token rejection (close code 4401), stdout streaming,
stdin round-trip, resize escape forwarding, unavailable-platform ANSI
error frame + 1011 close, resume parameter forwarding to argv.

96 tests pass under scripts/run_tests.sh.

(cherry picked from commit 29b337bca7)

feat(web): add Chat tab with xterm.js terminal + Sessions resume button

(cherry picked from commit 3d21aee8 by emozilla, conflicts resolved
 against current main: BUILTIN_ROUTES table + plugin slot layout)

fix(tui): replace OSC 52 jargon in /copy confirmation

When the user ran /copy successfully, Ink confirmed with:

  sent OSC52 copy sequence (terminal support required)

That reads like a protocol spec to everyone who isn't a terminal
implementer. The caveat was a historical artifact — OSC 52 wasn't
universally supported when this message was written, so the TUI
honestly couldn't guarantee the copy had landed anywhere.

Today every modern terminal (including the dashboard's embedded
xterm.js) handles OSC 52 reliably. Say what the user actually wants
to know — that it copied, and how much — matching the message the
TUI already uses for selection copy:

  copied 1482 chars

(cherry picked from commit a0701b1d5a)

docs: document the dashboard Chat tab

AGENTS.md — new subsection under TUI Architecture explaining that the
dashboard embeds the real hermes --tui rather than rewriting it,
with pointers to the pty_bridge + WebSocket endpoint and the rule
'never add a parallel chat surface in React.'

website/docs/user-guide/features/web-dashboard.md — user-facing Chat
section inside the existing Web Dashboard page, covering how it works
(WebSocket + PTY + xterm.js), the Sessions-page resume flow, and
prerequisites (Node.js, ptyprocess, POSIX kernel / WSL on Windows).

(cherry picked from commit 2c2e32cc45)

feat(tui-gateway): transport-aware dispatch + WebSocket sidecar

Decouples the JSON-RPC dispatcher from its I/O sink so the same handler
surface can drive multiple transports concurrently. The PTY chat tab
already speaks to the TUI binary as bytes — this adds a structured
event channel alongside it for dashboard-side React widgets that need
typed events (tool.start/complete, model picker state, slash catalog)
that PTY can't surface.

- `tui_gateway/transport.py` — `Transport` protocol + `contextvars` binding
  + module-level `StdioTransport` fallback. The stdio stream resolves
  through a lambda so existing tests that monkey-patch `_real_stdout`
  keep passing without modification.
- `tui_gateway/ws.py` — WebSocket transport implementation; FastAPI
  endpoint mounting lives in hermes_cli/web_server.py.
- `tui_gateway/server.py`:
  - `write_json` routes via session transport (for async events) →
    contextvar transport (for in-request writes) → stdio fallback.
  - `dispatch(req, transport=None)` binds the transport for the request
    lifetime and propagates it to pool workers via `contextvars.copy_context`
    so async handlers don't lose their sink.
  - `_init_session` and the manual-session create path stash the
    request's transport so out-of-band events (subagent.complete, etc.)
    fan out to the right peer.

`tui_gateway.entry` (Ink's stdio handshake) is unchanged externally —
it falls through every precedence step into the stdio fallback, byte-
identical to the previous behaviour.

feat(web): ChatSidebar — JSON-RPC sidecar next to xterm.js terminal

Composes the two transports into a single Chat tab:

  ┌─────────────────────────────────────────┬──────────────┐
  │  xterm.js / PTY  (emozilla #13379)      │ ChatSidebar  │
  │  the literal hermes --tui process       │  /api/ws     │
  └─────────────────────────────────────────┴──────────────┘
        terminal bytes                          structured events

The terminal pane stays the canonical chat surface — full TUI fidelity,
slash commands, model picker, mouse, skin engine, wide chars all paint
inside the terminal. The sidebar opens a parallel JSON-RPC WebSocket
to the same gateway and renders metadata that PTY can't surface to
React chrome:

  • model + provider badge with connection state (click → switch)
  • running tool-call list (driven by tool.start / tool.progress /
    tool.complete events)
  • model picker dialog (gateway-driven, reuses ModelPickerDialog)

The sidecar is best-effort. If the WS can't connect (older gateway,
network hiccup, missing token) the terminal pane keeps working
unimpaired — sidebar just shows the connection-state badge in the
appropriate tone.

- `web/src/components/ChatSidebar.tsx` — new component (~270 lines).
  Owns its GatewayClient, drives the model picker through
  `slash.exec`, fans tool events into a capped tool list.
- `web/src/pages/ChatPage.tsx` — split layout: terminal pane
  (`flex-1`) + sidebar (`w-80`, `lg+` only).
- `hermes_cli/web_server.py` — mount `/api/ws` (token + loopback
  guards mirror /api/pty), delegate to `tui_gateway.ws.handle_ws`.

Co-authored-by: emozilla <emozilla@nousresearch.com>

refactor(web): /clean pass on ChatSidebar + ChatPage lint debt

- ChatSidebar: lift gw out of useRef into a useMemo derived from a
  reconnect counter. React 19's react-hooks/refs and react-hooks/
  set-state-in-effect rules both fire when you touch a ref during
  render or call setState from inside a useEffect body. The
  counter-derived gw is the canonical pattern for "external resource
  that needs to be replaceable on user action" — re-creating the
  client comes from bumping `version`, the effect just wires + tears
  down. Drops the imperative `gwRef.current = …` reassign in
  reconnect, drops the truthy ref guard in JSX. modelLabel +
  banner inlined as derived locals (one-off useMemo was overkill).
- ChatPage: lazy-init the banner state from the missing-token check
  so the effect body doesn't have to setState on first run. Drops
  the unused react-hooks/exhaustive-deps eslint-disable. Adds a
  scoped no-control-regex disable on the SGR mouse parser regex
  (the \\x1b is intentional for xterm escape sequences).

All my-touched files now lint clean. Remaining warnings on web/
belong to pre-existing files this PR doesn't touch.

Verified: vitest 249/249, ui-tui eslint clean, web tsc clean,
python imports clean.

chore: uptick

fix(web): drop ChatSidebar tool list — events can't cross PTY/WS boundary

The /api/pty endpoint spawns `hermes --tui` as a child process with its
own tui_gateway and _sessions dict; /api/ws runs handle_ws in-process in
the dashboard server with a separate _sessions dict. Tool events fire on
the child's gateway and never reach the WS sidecar, so the sidebar's
tool.start/progress/complete listeners always observed an empty list.

Drop the misleading list (and the now-orphaned ToolCall primitive),
keep model badge + connection state + model picker + error banner —
those work because they're sidecar-local concerns. Surfacing tool calls
in the sidebar requires cross-process forwarding (PTY child opens a
back-WS to the dashboard, gateway tees emits onto stdio + sidecar
transport) — proper feature for a follow-up.

feat(web): wire ChatSidebar tool list to PTY child via /api/pub broadcast

The dashboard's /api/pty spawns hermes --tui as a child process; tool
events fire in the python tui_gateway grandchild and never crossed the
process boundary into the in-process WS sidecar — so the sidebar tool
list was always empty.

Cross-process forwarding:

- tui_gateway: TeeTransport (transport.py) + WsPublisherTransport
  (event_publisher.py, sync websockets client). entry.py installs the
  tee on _stdio_transport when HERMES_TUI_SIDECAR_URL is set, mirroring
  every dispatcher emit to a back-WS without disturbing Ink's stdio
  handshake.

- hermes_cli/web_server.py: new /api/pub (publisher) + /api/events
  (subscriber) endpoints with a per-channel registry. /api/pty now
  accepts ?channel= and propagates the sidecar URL via env. start_server
  also stashes app.state.bound_port so the URL is constructable.

- web/src/pages/ChatPage.tsx: generates a channel UUID per mount,
  passes it to /api/pty and as a prop to ChatSidebar.

- web/src/components/ChatSidebar.tsx: opens /api/events?channel=, fans
  tool.start/progress/complete back into the ToolCall list. Restores
  the ToolCall primitive.

Tests: 4 new TestPtyWebSocket cases cover channel propagation,
broadcast fan-out, and missing-channel rejection (10 PTY tests pass,
120 web_server tests overall).

fix(web): address Copilot review on #14890

Five threads, all real:

- gatewayClient.ts: register `message`/`close` listeners BEFORE awaiting
  the open handshake.  Server emits `gateway.ready` immediately after
  accept, so a listener attached after the open promise could race past
  the initial skin payload and lose it.

- ChatSidebar.tsx: wire `error`/`close` on the /api/events subscriber
  WS into the existing error banner.  4401/4403 (auth/loopback reject)
  surface as a "reload the page" message; mid-stream drops surface as
  "events feed disconnected" with the existing reconnect button.  Clean
  unmount closes (1000/1001) stay silent.

- web-dashboard.md: install hint was `pip install hermes-agent[web]` but
  ptyprocess lives in the `pty` extra, not `web`.  Switch to
  `hermes-agent[web,pty]` in both prerequisite blocks.

- AGENTS.md: previous "never add a parallel React chat surface" guidance
  was overbroad and contradicted this PR's sidebar.  Tightened to forbid
  re-implementing the transcript/composer/PTY terminal while explicitly
  allowing structured supporting widgets (sidebar / model picker /
  inspectors), matching the actual architecture.

- web/package-lock.json: regenerated cleanly so the wterm sibling
  workspace paths (extraneous machine-local entries) stop polluting CI.

Tests: 249/249 vitest, 10/10 PTY/events, web tsc clean.

refactor(web): /clean pass on ChatSidebar events handler

Spotted in the round-2 review:

- Banner flashed on clean unmount: `ws.close()` from the effect cleanup
  fires `close` with code 1005, opened=true, neither 1000 nor 1001 —
  hit the "unexpected drop" branch.  Track `unmounting` in the effect
  scope and gate the banner through a `surface()` helper so cleanup
  closes stay silent.

- DRY the duplicated "events feed disconnected" string into a local
  const used by both the error and close handlers.

- Drop the `opened` flag (no longer needed once the unmount guard is
  the source of truth for "is this an expected close?").
2026-04-24 10:51:49 -04:00
5park1e
e1106772d9 fix: re-auth on stale OAuth token; read Claude Code credentials from macOS Keychain
Bug 3 — Stale OAuth token not detected in 'hermes model':
- _model_flow_anthropic used 'has_creds = bool(existing_key)' which treats
  any non-empty token (including expired OAuth tokens) as valid.
- Added existing_is_stale_oauth check: if the only credential is an OAuth
  token (sk-ant- prefix) with no valid cc_creds fallback, mark it stale
  and force the re-auth menu instead of silently accepting a broken token.

Bug 4 — macOS Keychain credentials never read:
- Claude Code >=2.1.114 migrated from ~/.claude/.credentials.json to the
  macOS Keychain under service 'Claude Code-credentials'.
- Added _read_claude_code_credentials_from_keychain() using the 'security'
  CLI tool; read_claude_code_credentials() now tries Keychain first then
  falls back to JSON file.
- Non-Darwin platforms return None from Keychain read immediately.

Tests:
- tests/agent/test_anthropic_keychain.py: 11 cases covering Darwin-only
  guard, security command failures, JSON parsing, fallback priority.
- tests/hermes_cli/test_anthropic_model_flow_stale_oauth.py: 8 cases
  covering stale OAuth detection, API key passthrough, cc_creds fallback.

Refs: #12905
2026-04-24 07:14:00 -07:00
Teknium
8d12fb1e6b
refactor(spotify): convert to built-in bundled plugin under plugins/spotify (#15174)
Moves the Spotify integration from tools/ into plugins/spotify/,
matching the existing pattern established by plugins/image_gen/ for
third-party service integrations.

Why:
- tools/ should be reserved for foundational capabilities (terminal,
  read_file, web_search, etc.). tools/providers/ was a one-off
  directory created solely for spotify_client.py.
- plugins/ is already the home for image_gen backends, memory
  providers, context engines, and standalone hook-based plugins.
  Spotify is a third-party service integration and belongs alongside
  those, not in tools/.
- Future service integrations (eventually: Deezer, Apple Music, etc.)
  now have a pattern to copy.

Changes:
- tools/spotify_tool.py → plugins/spotify/tools.py (handlers + schemas)
- tools/providers/spotify_client.py → plugins/spotify/client.py
- tools/providers/ removed (was only used for Spotify)
- New plugins/spotify/__init__.py with register(ctx) calling
  ctx.register_tool() × 7. The handler/check_fn wiring is unchanged.
- New plugins/spotify/plugin.yaml (kind: backend, bundled, auto-load).
- tests/tools/test_spotify_client.py: import paths updated.

tools_config fix — _DEFAULT_OFF_TOOLSETS now wins over plugin auto-enable:
- _get_platform_tools() previously auto-enabled unknown plugin
  toolsets for new platforms. That was fine for image_gen (which has
  no toolset of its own) but bad for Spotify, which explicitly
  requires opt-in (don't ship 7 tool schemas to users who don't use
  it). Added a check: if a plugin toolset is in _DEFAULT_OFF_TOOLSETS,
  it stays off until the user picks it in 'hermes tools'.

Pre-existing test bug fix:
- tests/hermes_cli/test_plugins.py::test_list_returns_sorted
  asserted names were sorted, but list_plugins() sorts by key
  (path-derived, e.g. image_gen/openai). With only image_gen plugins
  bundled, name and key order happened to agree. Adding plugins/spotify
  broke that coincidence (spotify sorts between openai-codex and xai
  by name but after xai by key). Updated test to assert key order,
  which is what the code actually documents.

Validation:
- scripts/run_tests.sh tests/hermes_cli/test_plugins.py \
    tests/hermes_cli/test_tools_config.py \
    tests/hermes_cli/test_spotify_auth.py \
    tests/tools/test_spotify_client.py \
    tests/tools/test_registry.py
  → 143 passed
- E2E plugin load: 'spotify' appears in loaded plugins, all 7 tools
  register into the spotify toolset, check_fn gating intact.
2026-04-24 07:06:11 -07:00
XieNBi
4a51ab61eb fix(cli): non-zero /model counts for native OpenAI and direct API rows 2026-04-24 05:48:15 -07:00
wangshengyang2004
647900e813 fix(cli): support model validation for anthropic_messages and cloudflare-protected endpoints
- probe_api_models: add api_mode param; use x-api-key + anthropic-version
  headers for anthropic_messages mode (Anthropic's native Models API auth)
- probe_api_models: add User-Agent header to avoid Cloudflare 403 blocks
  on third-party OpenAI-compatible endpoints
- validate_requested_model: pass api_mode through from switch_model
- validate_requested_model: for anthropic_messages mode, attempt probe with
  correct auth; if probe fails (many proxies don't implement /v1/models),
  accept the model with an informational warning instead of rejecting
- fetch_api_models: propagate api_mode to probe_api_models
2026-04-24 05:48:15 -07:00
Teknium
05394f2f28
feat(spotify): interactive setup wizard + docs page (#15130)
Previously 'hermes auth spotify' crashed with 'HERMES_SPOTIFY_CLIENT_ID
is required' if the user hadn't manually created a Spotify developer
app and set env vars. Now the command detects a missing client_id and
walks the user through the one-time app registration inline:

- Opens https://developer.spotify.com/dashboard in the browser
- Tells the user exactly what to paste into the Spotify form
  (including the correct default redirect URI, 127.0.0.1:43827)
- Prompts for the Client ID
- Persists HERMES_SPOTIFY_CLIENT_ID to ~/.hermes/.env so subsequent
  runs skip the wizard
- Continues straight into the PKCE OAuth flow

Also prints the docs URL at both the start of the wizard and the end
of a successful login so users can find the full guide.

Adds website/docs/user-guide/features/spotify.md with the complete
setup walkthrough, tool reference, and troubleshooting, and wires it
into the sidebar under User Guide > Features > Advanced.

Fixes a stale redirect URI default in the hermes_cli/tools_config.py
TOOL_CATEGORIES entry (was 8888/callback from the PR description
instead of the actual DEFAULT_SPOTIFY_REDIRECT_URI value
43827/spotify/callback defined in auth.py).
2026-04-24 05:30:05 -07:00
0xbyt4
4ac731c841 fix(model-normalize): pass DeepSeek V-series IDs through instead of folding to deepseek-chat
`_normalize_for_deepseek` was mapping every non-reasoner input into
`deepseek-chat` on the assumption that DeepSeek's API accepts only two
model IDs. That assumption no longer holds — `deepseek-v4-pro` and
`deepseek-v4-flash` are first-class IDs accepted by the direct API,
and on aggregators `deepseek-chat` routes explicitly to V3 (DeepInfra
backend returns `deepseek-chat-v3`). So a user picking V4 Pro through
the model picker was being silently downgraded to V3.

Verified 2026-04-24 against Nous portal's OpenAI-compat surface:
  - `deepseek/deepseek-v4-flash` → provider: DeepSeek,
    model: deepseek-v4-flash-20260423
  - `deepseek/deepseek-chat`     → provider: DeepInfra,
    model: deepseek/deepseek-chat-v3

Fix:
- Add `deepseek-v4-pro` and `deepseek-v4-flash` to
  `_DEEPSEEK_CANONICAL_MODELS` so exact matches pass through.
- Add `_DEEPSEEK_V_SERIES_RE` (`^deepseek-v\d+(...)?$`) so future
  V-series IDs (`deepseek-v5-*`, dated variants) keep passing through
  without another code change.
- Update docstring + module header to reflect the new rule.

Tests:
- New `TestDeepseekVSeriesPassThrough` — 8 parametrized cases covering
  bare, vendor-prefixed, case-variant, dated, and future V-series IDs
  plus end-to-end `normalize_model_for_provider(..., "deepseek")`.
- New `TestDeepseekCanonicalAndReasonerMapping` — regression coverage
  for canonical pass-through, reasoner-keyword folding, and
  fall-back-to-chat behaviour.
- 77/77 pass.

Reported on Discord (Ufonik, Don Piedro): `/model > Deepseek >
deepseek-v4-pro` surfaced
`Normalized 'deepseek-v4-pro' to 'deepseek-chat'`. Picker listing
showed the v4 names, so validation also rejected the post-normalize
`deepseek-chat` as "not in provider listing" — the contradiction
users saw. Normalizer now respects the picker's choice.
2026-04-24 05:24:54 -07:00
Dilee
7e9dd9ca45 Add native Spotify tools with PKCE auth 2026-04-24 05:20:38 -07:00
Michael Steuer
cd221080ec fix: validate nous auth status against runtime credentials 2026-04-24 05:20:05 -07:00
jakubkrcmar
1af44a13c0 fix(model_picker): detect mapped-provider auth-store credentials 2026-04-24 05:20:05 -07:00
Andy
fff7ee31ae fix: clarify auth retry guidance 2026-04-24 05:20:05 -07:00
NiuNiu Xia
76329196c1 fix(copilot): wire live /models max_prompt_tokens into context-window resolver
The Copilot provider resolved context windows via models.dev static data,
which does not include account-specific models (e.g. claude-opus-4.6-1m
with 1M context). This adds the live Copilot /models API as a higher-
priority source for copilot/copilot-acp/github-copilot providers.

New helper get_copilot_model_context() in hermes_cli/models.py extracts
capabilities.limits.max_prompt_tokens from the cached catalog. Results
are cached in-process for 1 hour.

In agent/model_metadata.py, step 5a queries the live API before falling
through to models.dev (step 5b). This ensures account-specific models
get correct context windows while standard models still have a fallback.

Part 1 of #7731.
Refs: #7272
2026-04-24 05:09:08 -07:00
NiuNiu Xia
d7ad07d6fe fix(copilot): exchange raw GitHub token for Copilot API JWT
Raw GitHub tokens (gho_/github_pat_/ghu_) are now exchanged for
short-lived Copilot API tokens via /copilot_internal/v2/token before
being used as Bearer credentials. This is required to access
internal-only models (e.g. claude-opus-4.6-1m with 1M context).

Implementation:
- exchange_copilot_token(): calls the token exchange endpoint with
  in-process caching (dict keyed by SHA-256 fingerprint), refreshed
  2 minutes before expiry. No disk persistence — gateway is long-running
  so in-memory cache is sufficient.
- get_copilot_api_token(): convenience wrapper with graceful fallback —
  returns exchanged token on success, raw token on failure.
- Both callers (hermes_cli/auth.py and agent/credential_pool.py) now
  pipe the raw token through get_copilot_api_token() before use.

12 new tests covering exchange, caching, expiry, error handling,
fingerprinting, and caller integration. All 185 existing copilot/auth
tests pass.

Part 2 of #7731.
2026-04-24 05:09:08 -07:00
Teknium
78450c4bd6
fix(nous-oauth): preserve obtained_at in pool + actionable message on RT reuse (#15111)
Two narrow fixes motivated by #15099.

1. _seed_from_singletons() was dropping obtained_at, agent_key_obtained_at,
   expires_in, and friends when seeding device_code pool entries from the
   providers.nous singleton. Fresh credentials showed up with
   obtained_at=None, which broke downstream freshness-sensitive consumers
   (self-heal hooks, pool pruning by age) — they treated just-minted
   credentials as older than they actually were and evicted them.

2. When the Nous Portal OAuth 2.1 server returns invalid_grant with
   'Refresh token reuse detected' in the error_description, rewrite the
   message to explain the likely cause (an external process consumed the
   rotated RT without persisting it back) and the mitigation. The generic
   reuse message led users to report this as a Hermes persistence bug when
   the actual trigger was typically a third-party monitoring script calling
   /api/oauth/token directly. Non-reuse errors keep their original server
   description untouched.

Closes #15099.

Regression tests:
- tests/agent/test_credential_pool.py::test_nous_seed_from_singletons_preserves_obtained_at_timestamps
- tests/hermes_cli/test_auth_nous_provider.py::test_refresh_token_reuse_detection_surfaces_actionable_message
- tests/hermes_cli/test_auth_nous_provider.py::test_refresh_non_reuse_error_keeps_original_description
2026-04-24 05:08:46 -07:00
Teknium
0e235947b9
fix(redact): honor security.redact_secrets from config.yaml (#15109)
agent/redact.py snapshots _REDACT_ENABLED from HERMES_REDACT_SECRETS at
module-import time. hermes_cli/main.py calls setup_logging() early, which
transitively imports agent.redact — BEFORE any config bridge has run. So
users who set 'security.redact_secrets: false' in config.yaml (instead of
HERMES_REDACT_SECRETS=false in .env) had the toggle silently ignored in
both 'hermes chat' and 'hermes gateway run'.

Bridge config.yaml -> env var in hermes_cli/main.py BEFORE setup_logging.
.env still wins (only set env when unset) — config.yaml is the fallback.

Regression tests in tests/hermes_cli/test_redact_config_bridge.py spawn
fresh subprocesses to verify:
- redact_secrets: false in config.yaml disables redaction
- default (key absent) leaves redaction enabled
- .env HERMES_REDACT_SECRETS=true overrides config.yaml
2026-04-24 05:03:26 -07:00
Teknium
1eb29e6452
fix(opencode): derive api_mode from target model, not stale config default (#15106)
/model kimi-k2.6 on opencode-zen (or glm-5.1 on opencode-go) returned OpenCode's
website 404 HTML page when the user's persisted model.default was a Claude or
MiniMax model. The switched-to chat_completions request hit
https://opencode.ai/zen (or /zen/go) with no /v1 suffix.

Root cause: resolve_runtime_provider() computed api_mode from
model_cfg.get('default') instead of the model being requested. With a Claude
default, it resolved api_mode=anthropic_messages, stripped /v1 from base_url
(required for the Anthropic SDK), then switch_model()'s opencode_model_api_mode
override flipped api_mode back to chat_completions without restoring /v1.

Fix: thread an optional target_model kwarg through resolve_runtime_provider
and _resolve_runtime_from_pool_entry. When the caller is performing an explicit
mid-session model switch (i.e. switch_model()), the target model drives both
api_mode selection and the conditional /v1 strip. Other callers (CLI init,
gateway init, cron, ACP, aux client, delegate, account_usage, tui_gateway) pass
nothing and preserve the existing config-default behavior.

Regression tests added in test_model_switch_opencode_anthropic.py use the REAL
resolver (not a mock) to guard the exact Quentin-repro scenario. Existing tests
that mocked resolve_runtime_provider with 'lambda requested:' had their mock
signatures widened to '**kwargs' to accept the new kwarg.
2026-04-24 04:58:46 -07:00
georgex8001
1dca2e0a28 fix(runtime): resolve bare custom provider to loopback or CUSTOM_BASE_URL
When /model selects Custom but model.provider in YAML still reflects a prior provider, trust model.base_url only for loopback hosts or when provider is custom. Consult CUSTOM_BASE_URL before OpenRouter defaults (#14676).
2026-04-24 04:54:16 -07:00
Matt Maximo
271f0e6eb0 fix(model): let Codex setup reuse or reauthenticate 2026-04-24 04:53:32 -07:00
j3ffffff
f76df30e08 fix(auth): parse OpenAI nested error shape in Codex token refresh
OpenAI's OAuth token endpoint returns errors in a nested shape —
{"error": {"code": "refresh_token_reused", "message": "..."}} —
not the OAuth spec's flat {"error": "...", "error_description": "..."}.
The existing parser only handled the flat shape, so:

- `err.get("error")` returned a dict, the `isinstance(str)` guard
  rejected it, and `code` stayed `"codex_refresh_failed"`.
- The dedicated `refresh_token_reused` branch (with its actionable
  "re-run codex + hermes auth" message and `relogin_required=True`)
  never fired.
- Users saw the generic "Codex token refresh failed with status 401"
  when another Codex client (CLI, VS Code extension) had consumed
  their single-use refresh token — giving no hint that re-auth was
  required.

Parse both shapes, mapping OpenAI's nested `code`/`type` onto the
existing `code` variable so downstream branches (`refresh_token_reused`,
`invalid_grant`, etc.) fire correctly.

Add regression tests covering:
- nested `refresh_token_reused` → actionable message + relogin_required
- nested generic code → code + message surfaced
- flat OAuth-spec `invalid_grant` still handled (back-compat)
- unparseable body → generic fallback message, relogin_required=False

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-24 04:53:32 -07:00
LeonSGP43
ccc8fccf77 fix(cli): validate user-defined providers consistently 2026-04-24 04:48:56 -07:00
Teknium
3aa1a41e88
feat(gemini): block free-tier keys at setup + surface guidance on 429 (#15100)
Google AI Studio's free tier (<= 250 req/day for gemini-2.5-flash) is
exhausted in a handful of agent turns, so the setup wizard now refuses
to wire up Gemini when the supplied key is on the free tier, and the
runtime 429 handler appends actionable billing guidance.

Setup-time probe (hermes_cli/main.py):
- `_model_flow_api_key_provider` fires one minimal generateContent call
  when provider_id == 'gemini' and classifies the response as
  free/paid/unknown via x-ratelimit-limit-requests-per-day header or
  429 body containing 'free_tier'.
- Free  -> print block message, refuse to save the provider, return.
- Paid  -> 'Tier check: paid' and proceed.
- Unknown (network/auth error) -> 'could not verify', proceed anyway.

Runtime 429 handler (agent/gemini_native_adapter.py):
- `gemini_http_error` appends billing guidance when the 429 error body
  mentions 'free_tier', catching users who bypass setup by putting
  GOOGLE_API_KEY directly in .env.

Tests: 21 unit tests for the probe + error path, 4 tests for the
setup-flow block. All 67 existing gemini tests still pass.
2026-04-24 04:46:17 -07:00
Teknium
18f3fc8a6f
fix(tests): resolve 17 persistent CI test failures (#15084)
Make the main-branch test suite pass again. Most failures were tests
still asserting old shapes after recent refactors; two were real source
bugs.

Source fixes:
- tools/mcp_tool.py: _kill_orphaned_mcp_children() slept 2s on every
  shutdown even when no tracked PIDs existed, making test_shutdown_is_parallel
  measure ~3s for 3 parallel 1s shutdowns. Early-return when pids is empty.
- hermes_cli/tips.py: tip 105 was 157 chars; corpus max is 150.

Test fixes (mostly stale mock targets / missing fixture fields):
- test_zombie_process_cleanup, test_agent_cache: patch run_agent.cleanup_vm
  (the local name bound at import), not tools.terminal_tool.cleanup_vm.
- test_browser_camofox: patch tools.browser_camofox.load_config, not
  hermes_cli.config.load_config (the source module, not the resolved one).
- test_flush_memories_codex._chat_response_with_memory_call: add
  finish_reason, tool_call.id, tool_call.type so the chat_completions
  transport normalizer doesn't AttributeError.
- test_concurrent_interrupt: polling_tool signature now accepts
  messages= kwarg that _invoke_tool() passes through.
- test_minimax_provider: add _fallback_chain=[] to the __new__'d agent
  so switch_model() doesn't AttributeError.
- test_skills_config: SKILLS_DIR MagicMock + .rglob stopped working
  after the scanner switched to agent.skill_utils.iter_skill_index_files
  (os.walk-based). Point SKILLS_DIR at a real tmp_path and patch
  agent.skill_utils.get_external_skills_dirs.
- test_browser_cdp_tool: browser_cdp toolset was intentionally split into
  'browser-cdp' (commit 96b0f3700) so its stricter check_fn doesn't gate
  the whole browser toolset; test now expects 'browser-cdp'.
- test_registry: add tools.browser_dialog_tool to the expected
  builtin-discovery set (PR #14540 added it).
- test_file_tools TestPatchHints: patch_tool surfaces hints as a '_hint'
  key on the JSON payload, not inline '[Hint: ...' text.
- test_write_deny test_hermes_env: resolve .env via get_hermes_home() so
  the path matches the profile-aware denylist under hermetic HERMES_HOME.
- test_checkpoint_manager test_falls_back_to_parent: guard the walk-up
  so a stray /tmp/pyproject.toml on the host doesn't pick up /tmp as the
  project root.
- test_quick_commands: set cli.session_id in the __new__'d CLI so the
  alias-args path doesn't trip AttributeError when fuzzy-matching leaks
  a skill command across xdist test distribution.
2026-04-24 03:46:46 -07:00
Nicecsh
fe34741f32 fix(model): repair Discord Copilot /model flow
Keep Discord Copilot model switching responsive and current by refreshing picker data from the live catalog when possible, correcting the curated fallback list, and clearing stale controls before the switch completes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 03:33:29 -07:00
kshitij
7897f65a94
fix(normalize): lowercase Xiaomi model IDs for case-insensitive config (#15066)
Xiaomi's API (api.xiaomimimo.com) requires lowercase model IDs like
"mimo-v2.5-pro" but rejects mixed-case names like "MiMo-V2.5-Pro"
that users copy from marketing docs or the ProviderEntry description.

Add _LOWERCASE_MODEL_PROVIDERS set and apply .lower() to model names
for providers in this set (currently just xiaomi) after stripping the
provider prefix. This ensures any case variant in config.yaml is
normalized before hitting the API.

Other providers (minimax, zai, etc.) are NOT affected — their APIs
accept mixed case (e.g. MiniMax-M2.7).
2026-04-24 03:33:05 -07:00
Harry Riddle
ac25e6c99a feat(auth-codex): add config-provider fallback detection for logout in hermes-agent/hermes_cli/auth.py 2026-04-24 03:17:18 -07:00
Teknium
b2e124d082
refactor(commands): drop /provider, /plan handler, and clean up slash registry (#15047)
* refactor(commands): drop /provider and clean up slash registry

* refactor(commands): drop /plan special handler — use plain skill dispatch
2026-04-24 03:10:52 -07:00
Keira Voss
1ef1e4c669 feat(plugins): add pre_gateway_dispatch hook
Introduces a new plugin hook `pre_gateway_dispatch` fired once per
incoming MessageEvent in `_handle_message`, after the internal-event
guard but before the auth / pairing chain. Plugins may return a dict
to influence flow:

    {"action": "skip",    "reason": "..."}  -> drop (no reply)
    {"action": "rewrite", "text":   "..."}  -> replace event.text
    {"action": "allow"}  /  None             -> normal dispatch

Motivation: gateway-level message-flow patterns that don't fit cleanly
into any single adapter — e.g. listen-only group-chat windows (buffer
ambient messages, collapse on @mention), or human-handover silent
ingest (record messages while an owner handles the chat manually).
Today these require forking core; with this hook they can live in a
single profile-agnostic plugin.

Hook runs BEFORE auth so plugins can handle unauthorized senders
(e.g. customer-service handover ingest) without triggering the
pairing-code flow. Exceptions in plugin callbacks are caught and
logged; the first non-None action dict wins, remaining results are
ignored.

Includes:
- `VALID_HOOKS` entry + inline doc in `hermes_cli/plugins.py`
- Invocation block in `gateway/run.py::_handle_message`
- 5 new tests in `tests/gateway/test_pre_gateway_dispatch.py`
  (skip, rewrite, allow, exception safety, internal-event bypass)
- 2 additional tests in `tests/hermes_cli/test_plugins.py`
- Table entry in `website/docs/user-guide/features/plugins.md`

Made-with: Cursor
2026-04-24 03:02:03 -07:00
0xbyt4
8aa37a0cf9 fix(auth): honor SSL CA env vars across httpx + requests callsites
- hermes_cli/auth.py: add _default_verify() with macOS Homebrew certifi
  fallback (mirrors weixin 3a0ec1d93). Extend env var chain to include
  REQUESTS_CA_BUNDLE so one env var works across httpx + requests paths.
- agent/model_metadata.py: add _resolve_requests_verify() reading
  HERMES_CA_BUNDLE / REQUESTS_CA_BUNDLE / SSL_CERT_FILE in priority
  order. Apply explicit verify= to all 6 requests.get callsites.
- Tests: 18 new unit tests + autouse platform pin on existing
  TestResolveVerifyFallback to keep its "returns True" assertions
  platform-independent.

Empirically verified against self-signed HTTPS server: requests honors
REQUESTS_CA_BUNDLE only; httpx honors SSL_CERT_FILE only. Hermes now
honors all three everywhere.

Triggered by Discord reports — Nous OAuth SSL failure on macOS
Homebrew Python; custom provider self-signed cert ignored despite
REQUESTS_CA_BUNDLE set in env.
2026-04-24 03:00:33 -07:00
Teknium
6051fba9dc
feat(banner): hyperlink startup banner title to latest GitHub release (#14945)
Wrap the existing version label in the welcome-banner panel title
('Hermes Agent v… · upstream … · local …') with an OSC-8 terminal
hyperlink pointing at the latest git tag's GitHub release page
(https://github.com/NousResearch/hermes-agent/releases/tag/<tag>).

Clickable in modern terminals (iTerm2, WezTerm, Windows Terminal,
GNOME Terminal, Kitty, etc.); degrades to plain text on terminals
without OSC-8 support. No new line added to the banner.

New get_latest_release_tag() helper runs 'git describe --tags
--abbrev=0' in the Hermes checkout (3s timeout, per-process cache,
silent fallback for non-git/pip installs and forks without tags).
2026-04-23 23:28:34 -07:00
0xbyt4
04c489b587 feat(tui): match CLI's voice slash + VAD-continuous recording model
The TUI had drifted from the CLI's voice model in two ways:

- /voice on was lighting up the microphone immediately and Ctrl+B was
  interpreted as a mode toggle.  The CLI separates the two: /voice on
  just flips the umbrella bit, recording only starts once the user
  presses Ctrl+B, which also sets _voice_continuous so the VAD loop
  auto-restarts until the user presses Ctrl+B again or three silent
  cycles pass.
- /voice tts was missing entirely, so users couldn't turn agent reply
  speech on/off from inside the TUI.

This commit brings the TUI to parity.

Python

- hermes_cli/voice.py: continuous-mode API (start_continuous,
  stop_continuous, is_continuous_active) layered on the existing PTT
  wrappers. The silence callback transcribes, fires on_transcript,
  tracks consecutive no-speech cycles, and auto-restarts — mirroring
  cli.py:_voice_stop_and_transcribe + _restart_recording.
- tui_gateway/server.py:
  - voice.toggle now supports on / off / tts / status.  The umbrella
    bit lives in HERMES_VOICE + display.voice_enabled; tts lives in
    HERMES_VOICE_TTS + display.voice_tts.  /voice off also tears down
    any active continuous loop so a toggle-off really releases the
    microphone.
  - voice.record start/stop now drives start_continuous/stop_continuous.
    start is refused with a clear error when the mode is off, matching
    cli.py:handle_voice_record's early return on `not _voice_mode`.
  - New voice.transcript / voice.status events emit through
    _voice_emit (remembers the sid that last enabled the mode so
    events land in the right session).

TypeScript

- gatewayTypes.ts: voice.status + voice.transcript event
  discriminants; VoiceToggleResponse gains tts; VoiceRecordResponse
  gains status for the new "started/stopped" responses.
- interfaces.ts: GatewayEventHandlerContext gains composer.setInput +
  submission.submitRef + voice.{setRecording, setProcessing,
  setVoiceEnabled}; InputHandlerContext.voice gains enabled +
  setVoiceEnabled for the mode-aware Ctrl+B handler.
- createGatewayEventHandler.ts: voice.status drives REC/STT badges;
  voice.transcript auto-submits when the composer is empty (CLI
  _pending_input.put parity) and appends when a draft is in flight.
  no_speech_limit flips voice off + sys line.
- useInputHandlers.ts: Ctrl+B now calls voice.record (start/stop),
  not voice.toggle, and nudges the user with a sys line when the
  mode is off instead of silently flipping it on.
- useMainApp.ts: wires the new event-handler context fields.
- slash/commands/session.ts: /voice handles on / off / tts / status
  with CLI-matching output ("voice: mode on · tts off").

Backward compat preserved for voice.record (was always PTT shape;
gateway still honours start/stop with mode-gating added).
2026-04-23 16:18:15 -07:00
0xbyt4
0bb460b070 fix(tui): add missing hermes_cli.voice wrapper for gateway RPC
tui_gateway/server.py:3486/3491/3509 imports start_recording,
stop_and_transcribe, and speak_text from hermes_cli.voice, but the
module never existed (not in git history — never shipped, never
deleted). Every voice.record / voice.tts RPC call hit the ImportError
branch and the TUI surfaced it as "voice module not available — install
audio dependencies" even on boxes with sounddevice / faster-whisper /
numpy installed.

Adds a thin wrapper on top of tools.voice_mode (recording +
transcription) and tools.tts_tool (text-to-speech):

- start_recording() — idempotent; stores the active AudioRecorder in a
  module-global guarded by a Lock so repeat Ctrl+B presses don't fight
  over the mic.
- stop_and_transcribe() — returns None for no-op / no-speech /
  Whisper-hallucination cases so the TUI's existing "no speech detected"
  path keeps working unchanged.
- speak_text(text) — lazily imports tts_tool (optional provider SDKs
  stay unloaded until the first /voice tts call), parses the tool's
  JSON result, and plays the audio via play_audio_file.

Paired with the Ctrl+B keybinding fix in the prior commit, the TUI
voice pipeline now works end-to-end for the first time.
2026-04-23 16:18:15 -07:00
Teknium
ef5eaf8d87
feat(cron): honor hermes tools config for the cron platform (#14798)
Cron now resolves its toolset from the same per-platform config the
gateway uses — `_get_platform_tools(cfg, 'cron')` — instead of blindly
loading every default toolset.  Existing cron jobs without a per-job
override automatically lose `moa`, `homeassistant`, and `rl` (the
`_DEFAULT_OFF_TOOLSETS` set), which stops the "surprise $4.63
mixture_of_agents run" class of bug (Norbert, Discord).

Precedence inside `run_job`:
  1. per-job `enabled_toolsets` (PR #14767 / #6130) — wins if set
  2. `_get_platform_tools(cfg, 'cron')` — new, the blanket gate
  3. `None` fallback (legacy) — only on resolver exception

Changes:
- hermes_cli/platforms.py: register 'cron' with default_toolset
  'hermes-cron'
- toolsets.py: add 'hermes-cron' toolset (mirrors 'hermes-cli';
  `_get_platform_tools` then filters via `_DEFAULT_OFF_TOOLSETS`)
- cron/scheduler.py: add `_resolve_cron_enabled_toolsets(job, cfg)`,
  call it at the `AIAgent(...)` kwargs site
- tests/cron/test_scheduler.py: replace the 'None when not set' test
  (outdated contract) with an invariant ('moa not in default cron
  toolset') + new per-job-wins precedence test
- tests/hermes_cli/test_tools_config.py: mark 'cron' as non-messaging
  in the gateway-toolset-coverage test
2026-04-23 15:48:50 -07:00
Teknium
f593c367be
feat(dashboard): reskin extension points for themes and plugins (#14776)
Themes and plugins can now pull off arbitrary dashboard reskins (cockpit
HUD, retro terminal, etc.) without touching core code.

Themes gain four new fields:
- layoutVariant: standard | cockpit | tiled — shell layout selector
- assets: {bg, hero, logo, crest, sidebar, header, custom: {...}} —
  artwork URLs exposed as --theme-asset-* CSS vars
- customCSS: raw CSS injected as a scoped <style> tag on theme apply
  (32 KiB cap, cleaned up on theme switch)
- componentStyles: per-component CSS-var overrides (clipPath,
  borderImage, background, boxShadow, ...) for card/header/sidebar/
  backdrop/tab/progress/badge/footer/page

Plugin manifests gain three new fields:
- tab.override: replaces a built-in route instead of adding a tab
- tab.hidden: register component + slots without adding a nav entry
- slots: declares shell slots the plugin populates

10 named shell slots: backdrop, header-left/right/banner, sidebar,
pre-main, post-main, footer-left/right, overlay. Plugins register via
window.__HERMES_PLUGINS__.registerSlot(name, slot, Component). A
<PluginSlot> React helper is exported on the plugin SDK.

Ships a full demo at plugins/strike-freedom-cockpit/ — theme YAML +
slot-only plugin that reproduces a Gundam cockpit dashboard: MS-STATUS
sidebar with live telemetry, COMPASS crest in header, notched card
corners via componentStyles, scanline overlay via customCSS, gold/cyan
palette, Orbitron typography.

Validation:
- 15 new tests in test_web_server.py covering every extended field
- tests/hermes_cli/: 2615 passed (3 pre-existing unrelated failures)
- tsc -b --noEmit: clean
- vite build: 418 kB bundle, ~2 kB delta for slots/theme extensions

Co-authored-by: Teknium <p@nousresearch.com>
2026-04-23 15:31:01 -07:00
helix4u
1cc0bdd5f3 fix(dashboard): avoid auth header collision with reverse proxies 2026-04-23 14:05:23 -07:00
Teknium
97b9b3d6a6
fix(gateway): drain-aware hermes update + faster still-working pings (#14736)
cmd_update no longer SIGKILLs in-flight agent runs, and users get
'still working' status every 3 min instead of 10. Two long-standing
sources of '@user — agent gives up mid-task' reports on Telegram and
other gateways.

Drain-aware update:
- New helper hermes_cli.gateway._graceful_restart_via_sigusr1(pid,
  drain_timeout) sends SIGUSR1 to the gateway and polls os.kill(pid,
  0) until the process exits or the budget expires.
- cmd_update's systemd loop now reads MainPID via 'systemctl show
  --property=MainPID --value' and tries the graceful path first. The
  gateway's existing SIGUSR1 handler -> request_restart(via_service=
  True) -> drain -> exit(75) is wired in gateway/run.py and is
  respawned by systemd's Restart=on-failure (and the explicit
  RestartForceExitStatus=75 on newer units).
- Falls back to 'systemctl restart' when MainPID is unknown, the
  drain budget elapses, or the unit doesn't respawn after exit (older
  units missing Restart=on-failure). Old install behavior preserved.
- Drain budget = max(restart_drain_timeout, 30s) + 15s margin so the
  drain loop in run_agent + final exit have room before fallback
  fires. Composes with #14728's tool-subprocess reaping.

Notification interval:
- agent.gateway_notify_interval default 600 -> 180.
- HERMES_AGENT_NOTIFY_INTERVAL env-var fallback in gateway/run.py
  matched.
- 9-minute weak-model spinning runs now ping at 3 min and 6 min
  instead of 27 seconds before completion, removing the 'is the bot
  dead?' reflex that drives gateway-restart cycles.

Tests:
- Two new tests in tests/hermes_cli/test_update_gateway_restart.py:
  one asserts SIGUSR1 is sent and 'systemctl restart' is NOT called
  when MainPID is known and the helper succeeds; one asserts the
  fallback fires when the helper returns False.
- E2E: spawned detached bash processes confirm the helper returns
  True on SIGUSR1-handling exit (~0.5s) and False on SIGUSR1-ignoring
  processes (timeout). Verified non-existent PID and pid=0 edge cases.
- 41/41 in test_update_gateway_restart.py (was 39, +2 new).
- 154/154 in shutdown-related suites including #14728's new tests.

Reported by @GeoffWellman and @ANT_1515 on X.
2026-04-23 14:01:57 -07:00
Teknium
327b57da91
fix(gateway): kill tool subprocesses before adapter disconnect on drain timeout (#14728)
Closes #8202.

Root cause: stop() reclaimed tool-call bash/sleep children only at the
very end of the shutdown sequence — after a 60s drain, 5s interrupt
grace, and per-adapter disconnect. Under systemd (TimeoutStopSec bounded
by drain_timeout), that meant the cgroup SIGKILL escalation fired first,
and systemd reaped the bash/sleep children instead of us.

Fix:
- Extract tool-subprocess cleanup into a local helper
  _kill_tool_subprocesses() in _stop_impl().
- Invoke it eagerly right after _interrupt_running_agents() on the
  drain-timeout path, before adapter disconnect.
- Keep the existing catch-all call at the end for the graceful path
  and defense in depth against mid-teardown respawns.
- Bump generated systemd unit TimeoutStopSec to drain_timeout + 30s
  so cleanup + disconnect + DB close has headroom above the drain
  budget, matching the 'subprocess timeout > TimeoutStopSec + margin'
  rule from the skill.

Tests:
- New: test_gateway_stop_kills_tool_subprocesses_before_adapter_disconnect_on_timeout
  asserts kill_all() runs before disconnect() when drain times out.
- New: test_gateway_stop_kills_tool_subprocesses_on_graceful_path
  guards that the final catch-all still fires when drain succeeds
  (regression guard against accidental removal during refactor).
- Updated: existing systemd unit generator tests expect TimeoutStopSec=90
  (= 60s drain + 30s headroom) with explanatory comment.
2026-04-23 13:59:29 -07:00
Teknium
255ba5bf26
feat(dashboard): expand themes to fonts, layout, density (#14725)
Dashboard themes now control typography and layout, not just colors.
Each built-in theme picks its own fonts, base size, radius, and density
so switching produces visible changes beyond hue.

Schema additions (per theme):

- typography — fontSans, fontMono, fontDisplay, fontUrl, baseSize,
  lineHeight, letterSpacing. fontUrl is injected as <link> on switch
  so Google/Bunny/self-hosted stylesheets all work.
- layout — radius (any CSS length) and density
  (compact | comfortable | spacious, multiplies Tailwind spacing).
- colorOverrides (optional) — pin individual shadcn tokens that would
  otherwise derive from the palette.

Built-in themes are now distinct beyond palette:

- default  — system stack, 15px, 0.5rem radius, comfortable
- midnight — Inter + JetBrains Mono, 14px, 0.75rem, comfortable
- ember    — Spectral (serif) + IBM Plex Mono, 15px, 0.25rem
- mono     — IBM Plex Sans + Mono, 13px, 0 radius, compact
- cyberpunk— Share Tech Mono everywhere, 14px, 0 radius, compact
- rose     — Fraunces (serif) + DM Mono, 16px, 1rem, spacious

Also fixes two bugs:

1. Custom user themes silently fell back to default. ThemeProvider
   only applied BUILTIN_THEMES[name], so YAML files in
   ~/.hermes/dashboard-themes/ showed in the picker but did nothing.
   Server now ships the full normalised definition; client applies it.
2. Docs documented a 21-token flat colors schema that never matched
   the code (applyPalette reads a 3-layer palette). Rewrote the
   Themes section against the actual shape.

Implementation:

- web/src/themes/types.ts: extend DashboardTheme with typography,
  layout, colorOverrides; ThemeListEntry carries optional definition.
- web/src/themes/presets.ts: 6 built-ins with distinct typography+layout.
- web/src/themes/context.tsx: applyTheme() writes palette+typography+
  layout+overrides as CSS vars, injects fontUrl stylesheet, fixes the
  fallback-to-default bug via resolveTheme(name).
- web/src/index.css: html/body/code read the new theme-font vars;
  --radius-sm/md/lg/xl derive from --theme-radius; --spacing scales
  with --theme-spacing-mul so Tailwind utilities shift with density.
- hermes_cli/web_server.py: _normalise_theme_definition() parses loose
  YAML (bare hex strings, partial blocks) into the canonical wire
  shape; /api/dashboard/themes ships full definitions for user themes.
- tests/hermes_cli/test_web_server.py: 16 new tests covering the
  normaliser and discovery (rejection cases, clamping, defaults).
- website/docs/user-guide/features/web-dashboard.md: rewrite Themes
  section with real schema, per-model tables, full YAML example.
2026-04-23 13:49:51 -07:00
kshitij
82a0ed1afb
feat: add Xiaomi MiMo v2.5-pro and v2.5 model support (#14635)
## Merged

Adds MiMo v2.5-pro and v2.5 support to Xiaomi native provider, OpenCode Go, and setup wizard.

### Changes
- Context lengths: added v2.5-pro (1M) and v2.5 (1M), corrected existing MiMo entries to exact values (262144)
- Provider lists: xiaomi, opencode-go, setup wizard
- Vision: upgraded from mimo-v2-omni to mimo-v2.5 (omnimodal)
- Config description updated for XIAOMI_API_KEY
- Tests updated for new vision model preference

### Verification
- 4322 tests passed, 0 new regressions
- Live API tested on Xiaomi portal: basic, reasoning, tool calling, multi-tool, file ops, system prompt, vision — all pass
- Self-review found and fixed 2 issues (redundant vision check, stale HuggingFace context length)
2026-04-23 10:06:25 -07:00
Teknium
d45c738a52
fix(gateway): preflight user D-Bus before systemctl --user start (#14531)
On fresh RHEL/Debian SSH sessions without linger, `systemctl --user
start hermes-gateway` fails with 'Failed to connect to bus: No medium
found' because /run/user/$UID/bus doesn't exist. Setup previously
showed a raw CalledProcessError and continued claiming success, so the
gateway never actually started.

systemd_start() and systemd_restart() now call _preflight_user_systemd()
for the user scope first:
- Bus socket already there → no-op (desktop / linger-enabled servers)
- Linger off → try loginctl enable-linger (works when polkit permits,
  needs sudo otherwise), wait for socket
- Still unreachable → raise UserSystemdUnavailableError with a clean
  remediation message pointing to sudo loginctl + hermes gateway run
  as the foreground fallback

Setup's start/restart handlers and gateway_command() catch the new
exception and render the multi-line guidance instead of a traceback.
2026-04-23 05:09:38 -07:00
David VV
39fcf1d127 fix(model_switch): group custom_providers by endpoint in /model picker (#9210)
Multiple custom_providers entries sharing the same base_url + api_key
are now grouped into a single picker row. A local Ollama host with
per-model display names ("Ollama — GLM 5.1", "Ollama — Qwen3-coder",
"Ollama — Kimi K2", "Ollama — MiniMax M2.7") previously produced four
near-duplicate picker rows that differed only by suffix; now it appears
as one "Ollama" row with four models.

Key changes:
- Grouping key changed from slug-by-name to (base_url, api_key). Names
  frequently differ per model while the endpoint stays the same.
- When the grouped endpoint matches current_base_url, the row's slug is
  set to current_provider so picker-driven switches route through the
  live credential pipeline (no re-resolution needed).
- Per-model suffix is stripped from the display name ("Ollama — X" →
  "Ollama") via em-dash / " - " separators.
- Two groups with different api_keys at the same base_url (or otherwise
  colliding on cleaned name) are disambiguated with a numeric suffix
  (custom:openai, custom:openai-2) so both stay visible.
- current_base_url parameter plumbed through both gateway call sites.

Existing #8216, #11499, #13509 regressions covered (dict/list shapes
of models:, section-3/section-4 dedup, normalized list-format entries).

Salvaged from @davidvv's PR #9210 — the underlying code had diverged
~1400 commits since that PR was opened, so this is a reconstruction of
the same approach on current main rather than a clean cherry-pick.
Authorship preserved via --author on this commit.

Closes #9210
2026-04-23 03:10:30 -07:00
Aslaaen
51c1d2de16 fix(profiles): stage profile imports to prevent directory clobbering 2026-04-23 03:02:34 -07:00
drstrangerujn
a5b0c7e2ec fix(config): preserve list-format models in custom_providers normalize
_normalize_custom_provider_entry silently drops the models field when it's
a list. Hand-edited configs (and the shape used by older Hermes versions)
still write models as a plain list of ids, so after the normalize pass the
entry reaches list_authenticated_providers() with no models and /model
shows the provider with (0) models — even though the underlying picker
code handles lists fine.

Convert list-format models into the empty-value dict shape the rest of
the pipeline already expects. Dict-format entries keep passing through
unchanged.

Repro (before the fix):

    custom_providers:
    - name: acme
      base_url: https://api.example.com/v1
      models: [foo, bar, baz]

/model shows "acme (0)"; bypassing normalize in list_authenticated_providers
returns three models, confirming the drop happens in normalize.

Adds four unit tests covering list→dict conversion, dict pass-through,
filtering of empty/non-string entries, and the empty-list case.
2026-04-23 02:37:07 -07:00
helix4u
bace220d29 fix(image-gen): persist plugin provider on reconfigure 2026-04-23 01:56:09 -07:00
Teknium
a2a8092e90 feat(cli): add --ignore-user-config and --ignore-rules flags
Port from openai/codex#18646.

Adds two flags to 'hermes chat' that fully isolate a run from user-level
configuration and rules:

* --ignore-user-config: skip ~/.hermes/config.yaml and fall back to
  built-in defaults. Credentials in .env are still loaded so the agent
  can actually call a provider.
* --ignore-rules: skip auto-injection of AGENTS.md, SOUL.md,
  .cursorrules, and persistent memory (maps to AIAgent(skip_context_files=True,
  skip_memory=True)).

Primary use cases:
- Reproducible CI runs that should not pick up developer-local config
- Third-party integrations (e.g. Chronicle in Codex) that bring their
  own config and don't want user preferences leaking in
- Bug-report reproduction without the reporter's personal overrides
- Debugging: bisect 'was it my config?' vs 'real bug' in one command

Both flags are registered on the parent parser AND the 'chat' subparser
(with argparse.SUPPRESS on the subparser to avoid overwriting the parent
value when the flag is placed before the subcommand, matching the
existing --yolo/--worktree/--pass-session-id pattern).

Env vars HERMES_IGNORE_USER_CONFIG=1 and HERMES_IGNORE_RULES=1 are set
by cmd_chat BEFORE 'from cli import main' runs, which is critical
because cli.py evaluates CLI_CONFIG = load_cli_config() at module import
time. The cli.py / hermes_cli.config.load_cli_config() function checks
the env var and skips ~/.hermes/config.yaml when set.

Tests: 11 new tests in tests/hermes_cli/test_ignore_user_config_flags.py
covering the env gate, constructor wiring, cmd_chat simulation, and
argparse flag registration. All pass; existing hermes_cli + cli suites
unaffected (3005 pass, 2 pre-existing unrelated failures).
2026-04-22 19:58:42 -07:00
Teknium
9eb543cafe
feat(/model): merge models.dev entries for lesser-loved providers (#14221)
New and newer models from models.dev now surface automatically in
/model (both hermes model CLI and the gateway Telegram/Discord picker)
for a curated set of secondary providers — no Hermes release required
when the registry publishes a new model.

Primary user-visible fix: on OpenCode Go, typing '/model mimo-v2.5-pro'
no longer silently fuzzy-corrects to 'mimo-v2-pro'. The exact match
against the merged models.dev catalog wins.

Scope (opt-in frozenset _MODELS_DEV_PREFERRED in hermes_cli/models.py):
  opencode-go, opencode-zen, deepseek, kilocode, fireworks, mistral,
  togetherai, cohere, perplexity, groq, nvidia, huggingface, zai,
  gemini, google.

Explicitly NOT merged:
  - openrouter and nous (never): curated list is already a hand-picked
    subset / Portal is source of truth.
  - xai, xiaomi, minimax, minimax-cn, kimi-coding, kimi-coding-cn,
    alibaba, qwen-oauth (per-project decision to keep curated-only).
  - providers with dedicated live-endpoint paths (copilot, anthropic,
    ai-gateway, ollama-cloud, custom, stepfun, openai-codex) — those
    paths already handle freshness themselves.

Changes:
  - hermes_cli/models.py: add _MODELS_DEV_PREFERRED + _merge_with_models_dev
    helper. provider_model_ids() branches on the set at its curated-fallback
    return. Merge is models.dev-first, curated-only extras appended,
    case-insensitive dedup, graceful fallback when models.dev is offline.
  - hermes_cli/model_switch.py: list_authenticated_providers() calls the
    same merge in both its code paths (PROVIDER_TO_MODELS_DEV loop +
    HERMES_OVERLAYS loop). Picker AND validation-fallback both see
    fresh entries.
  - tests/hermes_cli/test_models_dev_preferred_merge.py (new): 13 tests —
    merge-helper unit tests (empty/raise/order/dedup), opencode-go/zen
    behavior, openrouter+nous explicitly guarded from merge.
  - tests/hermes_cli/test_opencode_go_in_model_list.py: converted from
    snapshot-style assertion to a behavior-based floor check, so it
    doesn't break when models.dev publishes additional opencode-go
    entries.

Addresses a report from @pfanis via Telegram: newer Xiaomi variants
on OpenCode Go weren't appearing in the /model picker, and /model
was silently routing requests for new variants to older ones.
2026-04-22 17:33:42 -07:00
helix4u
b52123eb15 fix(gateway): recover stale pid and planned restart state 2026-04-22 16:33:46 -07:00
Teknium
51ca575994 feat(gateway): expose plugin slash commands natively on all platforms + decision-capable command hook
Plugin slash commands now surface as first-class commands in every gateway
enumerator — Discord native slash picker, Telegram BotCommand menu, Slack
/hermes subcommand map — without a separate per-platform plugin API.

The existing 'command:<name>' gateway hook gains a decision protocol via
HookRegistry.emit_collect(): handlers that return a dict with
{'decision': 'deny'|'handled'|'rewrite'|'allow'} can intercept slash
command dispatch before core handling runs, unifying what would otherwise
have been a parallel 'pre_gateway_command' hook surface.

Changes:

- gateway/hooks.py: add HookRegistry.emit_collect() that fires the same
  handler set as emit() but collects non-None return values. Backward
  compatible — fire-and-forget telemetry hooks still work via emit().
- hermes_cli/plugins.py: add optional 'args_hint' param to
  register_command() so plugins can opt into argument-aware native UI
  registration (Discord arg picker, future platforms).
- hermes_cli/commands.py: add _iter_plugin_command_entries() helper and
  merge plugin commands into telegram_bot_commands() and
  slack_subcommand_map(). New is_gateway_known_command() recognizes both
  built-in and plugin commands so the gateway hook fires for either.
- gateway/platforms/discord.py: extract _build_auto_slash_command helper
  from the COMMAND_REGISTRY auto-register loop and reuse it for
  plugin-registered commands. Built-in name conflicts are skipped.
- gateway/run.py: before normal slash dispatch, call emit_collect on
  command:<canonical> and honor deny/handled/rewrite/allow decisions.
  Hook now fires for plugin commands too.
- scripts/release.py: AUTHOR_MAP entry for @Magaav.
- Tests: emit_collect semantics, plugin command surfacing per platform,
  decision protocol (deny/handled/rewrite/allow + non-dict tolerance),
  Discord plugin auto-registration + conflict skipping, is_gateway_known_command.

Salvaged from #14131 (@Magaav). Original PR added a parallel
'pre_gateway_command' hook and a platform-keyed plugin command
registry; this re-implementation reuses the existing 'command:<name>'
hook and treats plugin commands as platform-agnostic so the same
capability reaches Telegram and Slack without new API surface.

Co-authored-by: Magaav <73175452+Magaav@users.noreply.github.com>
2026-04-22 16:23:21 -07:00
Brooklyn Nicholson
b641639e42 fix(debug): distinguish empty-log from missing-log in report placeholder
Copilot on #14138 flagged that the share report says '(file not found)'
when the log exists but is empty (either because the primary is empty
and no .1 rotation exists, or in the rare race where the file is
truncated between _resolve_log_path() and stat()).

- Split _primary_log_path() out of _resolve_log_path so both can share
  the LOG_FILES/home math without duplication.
- _capture_log_snapshot now reports '(file empty)' when the primary
  path exists on disk with zero bytes, and keeps '(file not found)'
  for the truly-missing case.

Tests: rename test_returns_none_for_empty → test_empty_primary_reports_file_empty
with the new assertion, plus a race-path test that monkeypatches
_resolve_log_path to exercise the size==0 branch directly.
2026-04-22 15:27:54 -05:00