Quarantine Nous OAuth state when refresh fails with terminal invalid_grant/invalid_token errors. Clear local and shared refresh material across runtime, managed access-token, proxy, and credential-pool paths so Hermes stops retrying revoked refresh sessions.
* feat(kanban): orchestrator-driven auto-decomposition on triage
Closes the core gap in the kanban system: dropping a one-liner into Triage
now decomposes it into a graph of child tasks routed to specialist
profiles by description, matching teknium's original vision ("main
orchestrator splits/creates actual tasks, doles them out to each agent").
The build
---------
- hermes_cli/profiles.py: new `description` + `description_auto` fields
on ProfileInfo, persisted in <profile_dir>/profile.yaml. Helpers
read_profile_meta / write_profile_meta. `create_profile` accepts
optional description.
- hermes_cli/profile_describer.py: new module — auto-generate a 1-2
sentence description from a profile's skills + model + name via the
auxiliary LLM (`auxiliary.profile_describer`).
- hermes_cli/main.py: new `hermes profile create --description ...`
flag; new `hermes profile describe [name] [--text ... | --auto |
--all --auto]` subcommand.
- hermes_cli/kanban_db.py: new `decompose_triage_task` atomic helper —
creates N child tasks, links the root as a child of every leaf
(root waits for the whole graph), flips root `triage -> todo` with
orchestrator assignee, records an audit comment + `decomposed` event
in a single write_txn.
- hermes_cli/kanban_decompose.py: new module — calls the auxiliary LLM
(`auxiliary.kanban_decomposer`) with the profile roster + descriptions
to produce a JSON task graph, then invokes the DB helper. Rewrites
unknown assignees to the configured `kanban.default_assignee` (or
the active default profile) so a task NEVER lands with assignee=None.
Falls back to specify-style single-task promotion when the LLM
returns `fanout: false`.
- hermes_cli/kanban.py: new `hermes kanban decompose [task_id | --all]`
CLI verb.
- hermes_cli/config.py: new DEFAULT_CONFIG keys —
kanban.orchestrator_profile, kanban.default_assignee,
kanban.auto_decompose (default True), kanban.auto_decompose_per_tick
(default 3), auxiliary.kanban_decomposer, auxiliary.profile_describer.
- gateway/run.py: kanban dispatcher watcher now runs auto-decompose
before each `_tick_once`, capped by `auto_decompose_per_tick` so a
bulk-load of triage tasks doesn't burst-spend the aux LLM.
- plugins/kanban/dashboard/plugin_api.py: new endpoints —
GET /profiles (list roster + descriptions),
PATCH /profiles/<name> (set description, user-authored),
POST /profiles/<name>/describe-auto (LLM-generate),
POST /tasks/<id>/decompose (run decomposer),
GET/PUT /orchestration (orchestrator/default-assignee/auto-decompose
pickers, with resolved fallbacks echoed back).
- plugins/kanban/dashboard/dist/index.js: new OrchestrationPanel
collapsible — dropdowns for orchestrator profile and default
assignee, auto-decompose toggle, per-profile description editor with
Save and Auto-generate buttons. New ⚗ Decompose button next to
✨ Specify on triage-column task drawers.
Behavior
--------
- A task in Triage gets fanned out into a small DAG of child tasks.
Children with no internal parents flip to `ready` immediately
(parallel dispatch). Children with sibling parents wait. The root
stays alive as a parent of every child — when the whole graph
finishes, it promotes to `ready` and the orchestrator profile wakes
back up to judge completion (the "adds more tasks until done" part
of the original vision).
- `kanban.orchestrator_profile` unset -> falls back to the default
profile (whichever `hermes` launches with no -p flag).
- `kanban.default_assignee` unset -> same fallback. Tasks NEVER end
up unassigned.
- `kanban.auto_decompose=true` (default) runs the decomposer
automatically on dispatcher ticks; manual `hermes kanban decompose`
is always available.
Tests
-----
- tests/hermes_cli/test_kanban_decompose_db.py — 7 tests for the
atomic DB helper (status transitions, dep graph, audit trail,
validation errors).
- tests/hermes_cli/test_kanban_decompose.py — 6 tests for the
decomposer module (fanout, no-fanout fallback, unknown-assignee
rewrite, malformed-JSON resilience, no-aux-client path).
- tests/hermes_cli/test_profile_describer.py — 10 tests for
profile.yaml r/w + the LLM auto-describer (yaml corrupt tolerance,
user-vs-auto description protection, --overwrite, fallback parsing).
E2E
---
- CLI end-to-end: created profiles with descriptions, dropped a triage
task, mocked the aux LLM with a 3-task graph -> verified all three
children were created with the right assignees, the dependency
edges matched the LLM's graph, root flipped to todo gated by every
child, audit comment + `decomposed` event recorded.
- Dashboard end-to-end: started the dashboard against an isolated
HERMES_HOME, verified all four new endpoints via curl (profile
listing, PATCH for description, PUT for orchestration settings,
POST for decompose). Opened the UI in the browser, confirmed the
OrchestrationPanel renders with all three pickers + the per-profile
description editor, typed a description, clicked Save, verified
~/.hermes/profile.yaml was written. Clicked Decompose on the triage
card and confirmed the inline error message surfaced as designed
("no auxiliary client configured").
* feat(kanban): surface decompose mode (Auto/Manual) as a one-click pill
The auto/manual toggle already existed as kanban.auto_decompose (default
true), but it was buried inside the collapsed Orchestration settings
panel — users couldn't tell at a glance which mode they were in. This
hoists it to a pill at the top of the kanban page so the state is always
visible and one click flips it.
UX
- New "⚗ Decompose: AUTO|MANUAL" pill in the kanban header. Emerald
styling when Auto is on (the default), muted/gray when Manual.
- Pill is visible both in the collapsed AND expanded Orchestration
settings views so context is preserved when the user opens the panel.
- Tooltip explains both states + what clicking does.
- Renamed the in-panel "Auto-decompose on triage / Enabled" checkbox
to "Decompose mode / Auto (default) | Manual" for language parity
with the pill.
Behavior preserved
- Default remains Auto (kanban.auto_decompose=true).
- Manual mode restores pre-PR behavior: triage tasks stay in triage
until the user clicks ⚗ Decompose on each card (or runs
`hermes kanban decompose <id>`).
Implementation
- plugins/kanban/dashboard/dist/index.js: load /orchestration on mount
(not just on expand) so the collapsed pill reflects real state.
Render mode pill in both collapsed and expanded headers. Reuses the
existing PUT /api/plugins/kanban/orchestration endpoint — no new
backend, no new tests required.
E2E verified
- Pill renders as "⚗ Decompose: AUTO" on page load (default).
- One click flips to "⚗ Decompose: MANUAL" with muted styling.
- config.yaml on disk shows auto_decompose: false after the flip.
- Second click round-trips back to Auto; config.yaml flips to true.
* feat(kanban): rename mode pill to "Orchestration: Auto/Manual"
Per Teknium feedback — "Decompose" was too implementation-specific.
"Orchestration" is the user-facing concept (the whole pitch is the
orchestrator profile routing work), and the pill is the front door to it.
- Pill text: "Orchestration: Auto" / "Orchestration: Manual" (title case,
no ⚗ prefix, no SHOUTY-CAPS for the mode value)
- In-panel checkbox label: "Orchestration mode" (was "Decompose mode")
- Tooltips updated to match
- No behavior change
* docs(kanban): document decompose, profile descriptions, orchestration mode
Brings the docs site up to parity with the PR. English build verified
locally (npx docusaurus build --locale en) — clean, no new broken links
or anchors. Pre-existing broken-link warnings (rl-training, llms.txt,
step-by-step-checklist, fallback-model) untouched.
- website/docs/reference/cli-commands.md
+ `hermes kanban decompose` action row in the action table, with
pointer to the Auto vs Manual orchestration section.
- website/docs/reference/profile-commands.md
+ `--description "<text>"` flag on `hermes profile create`.
+ Full `hermes profile describe` section: read, --text, --auto,
--overwrite, --all flags with examples.
- website/docs/user-guide/features/kanban.md (the big one)
+ Triage column intro rewritten around the Auto-decompose default
behavior, with pointer to the new Auto vs Manual section.
+ Status action row updated to mention both ⚗ Decompose and
✨ Specify on triage cards.
+ New "Auto vs Manual orchestration" section explaining the two
modes, how to flip them (pill, config), how routing-by-description
works, the no-None-assignee guarantee, plus a config knob table
(auto_decompose, auto_decompose_per_tick, orchestrator_profile,
default_assignee) and the two new auxiliary slots
(kanban_decomposer, profile_describer).
+ REST surface table gains 6 new endpoint rows: /tasks/:id/decompose,
/profiles (GET), /profiles/:name (PATCH), /profiles/:name/describe-auto,
/orchestration (GET + PUT).
- website/docs/user-guide/features/kanban-tutorial.md
+ Triage column blurb updated for Auto by default + Manual via the
pill, with cross-link to the Auto vs Manual orchestration section.
- website/docs/user-guide/profiles.md
+ Blank-profile flow now mentions --description and points to the
kanban routing model for context.
- website/docs/user-guide/configuration.md
+ `kanban_decomposer` and `profile_describer` added to the
`hermes model -> Configure auxiliary models` menu listing.
xAI's OAuth implementation at ``auth.x.ai`` validates the PKCE
``code_challenge`` at the **token** endpoint, not just at the
authorize step. When Hermes sends the standards-compliant token
POST with ``code_verifier`` alone — exactly what RFC 7636 §4.5
prescribes — xAI rejects the exchange with ``code_challenge is
required`` and the user is stuck with no working OAuth login.
The fix:
* Extract the token POST into ``_xai_oauth_exchange_code_for_tokens``
so the wire format is unit-testable in isolation.
* Send the original ``code_challenge`` and ``code_challenge_method``
in the form body alongside ``code_verifier``. Strict RFC-compliant
servers ignore the extras at the token endpoint, and xAI's
permissive implementation accepts the exchange. This is the
standard "defensive echo" workaround used by every OAuth client
that targets a server with this quirk.
* Refuse to fire the POST when ``code_verifier`` is empty — leaking
the authorization code to a server that can't redeem it is worse
than failing locally with an actionable error. The new error
code is ``xai_pkce_verifier_missing`` and the message points at
this issue for context.
* Surface the HTTP status code prominently in the 4xx error message
(``xAI token exchange failed (HTTP 400). Response: …``) so users
and maintainers can tell a 400 (bad request / PKCE problem) from
a 403 (tier denied, see #26847) at a glance instead of parsing
the JSON body by eye.
Closes#26990
`_ws_client_is_allowed()` enforces a loopback-only client check on every
dashboard WebSocket upgrade (`/api/ws`, `/api/events`, `/api/pty`,
`/api/pub`):
def _ws_client_is_allowed(ws):
if _is_public_bind():
return True
client_host = ws.client.host if ws.client else ""
if not client_host:
return True
return client_host in _LOOPBACK_HOSTS
The intent is: when bound to 127.0.0.1, only accept WS upgrades from
loopback peers. Public bind (--insecure) trades that for token-only.
However, `uvicorn.run(app, host=host, port=port, log_level="warning")`
omits `proxy_headers`. In modern uvicorn (>= 0.20) `proxy_headers`
defaults to True and `forwarded_allow_ips` defaults to "127.0.0.1".
With those defaults, any reverse proxy connecting from loopback (nginx,
in-cluster proxy, Cloudflare Tunnel sidecar in HTTP mode, K8s
ingress-nginx) causes uvicorn to rewrite `ws.client.host` from the
request's `X-Forwarded-For` header. So the gate sees the original
client's IP (a public address) instead of the loopback peer, returns
False, and closes every browser WS with code=4403 (surfaces as HTTP
403 to the proxy).
Passing `proxy_headers=False` keeps the loopback gate's view of
`ws.client.host` at the immediate transport peer (the proxy on
127.0.0.1), which is exactly what the gate is designed to check.
The bug is invisible in dev (no proxy → no XFF → ws.client.host stays
loopback). It surfaces in proxied production: dashboard chat tab opens,
events feed banner shows "disconnected — tool calls may not appear",
all WS endpoints return 403. Reproduces with:
curl -i -H "Connection: Upgrade" -H "Upgrade: websocket" \
-H "Sec-WebSocket-Version: 13" -H "Sec-WebSocket-Key: ..." \
-H "X-Forwarded-For: 1.2.3.4" \
"http://127.0.0.1:9119/api/ws?token=\$TOKEN"
# Before: HTTP/1.1 403 Forbidden
# After: HTTP/1.1 101 Switching Protocols
Without the XFF header, both behave the same (101) — confirming the
single-variable trigger.
Discovered while diagnosing why the Hermes dashboard at
mandy.loadmagic.ai (behind nginx + Cloudflare Tunnel + CF Access)
refused all browser WS upgrades despite Access app config matching a
known-working sibling deployment (Simone, which doesn't have nginx in
the path).
`hermes doctor` displayed OAuth status for Nous, Codex, Gemini, and MiniMax
but silently omitted xAI OAuth, even though `get_xai_oauth_auth_status()`
exists and the same information is already surfaced in `hermes status`.
Add xAI OAuth as a *separate* try/except block so an import failure cannot
silence the already-printed provider rows above it — consistent with the
per-provider isolation introduced in the doctor fallback fix.
Tests:
- 9 new tests in TestDoctorXaiOAuthStatus covering: logged-in ok, not-logged-in
warn, error line present/absent, import failure isolation, runtime exception
and None-return safety.
- 9 existing run_doctor helpers updated to mock get_xai_oauth_auth_status for
deterministic output.
hermes status listed Nous Portal, OpenAI Codex, Qwen OAuth, and MiniMax
OAuth in the Auth Providers section but omitted xAI OAuth entirely.
Users who authenticated via `hermes auth add xai-oauth` had no way to
verify their session state from the status output.
Add xAI OAuth display using the same field shape as OpenAI Codex:
auth_store (Auth file:), last_refresh (Refreshed:), and error when
not logged in. The import is isolated in its own try/except so an
import failure cannot affect the already-printed rows above it.
Tests cover:
- logged in: check mark, auth_store, last_refresh, error suppressed
- not logged in: login command hint, error shown, error absent = no line
- resilience: import failure, status function raises, returns None
- isolation: xAI import failure does not break Nous/MiniMax display
Shared try/except import block meant that if any one status function was
missing, all providers lost their OAuth fallback suppression. Split into
per-provider try/except so each branch is independently safe.
Add end-to-end test for xAI: bad XAI_API_KEY with healthy OAuth does not
surface a blocking issue in run_doctor output. Add tests for None return,
import failure isolation (xAI missing does not break Gemini), and move
test_returns_false_for_unknown_provider out of the xAI-specific class.
_has_healthy_oauth_fallback_for_apikey_provider() covers Gemini and
MiniMax (added by #26853) but omits xAI. The xAI provider profile
(plugins/model-providers/xai/__init__.py) has auth_type="api_key" and
env_vars=("XAI_API_KEY",), so it enters the generic API-key
connectivity loop. When XAI_API_KEY fails a 401 probe but xAI OAuth
is healthy, the failure is promoted to the blocking summary even though
xAI works fine via OAuth — the same false-positive #26853 fixed for
Gemini and MiniMax.
Fix: import get_xai_oauth_auth_status alongside the existing two
helpers and add the "xai" branch. get_xai_oauth_auth_status() already
exists in hermes_cli/auth.py and returns {"logged_in": True} when a
valid OAuth token is present.
Symmetric with the Gemini and MiniMax branches introduced in #26853.
No behavior change for providers without an OAuth path.
Drops the three hardcoded browser-provider rows (Browserbase, Browser Use,
Firecrawl) from TOOL_CATEGORIES['browser']['providers'] and replaces them
with runtime injection from agent.browser_registry — mirroring the
_plugin_web_search_providers() pattern PR #25182 established for the
Web Search and Extract category.
Adds _plugin_browser_providers() helper in hermes_cli/tools_config.py
that walks list_providers() and builds a TOOL_CATEGORIES-shape dict per
provider via get_setup_schema(). The new visible_providers() hook calls
it for cat['name'] == 'Browser Automation'.
The three remaining hardcoded rows are non-provider UX setup-flow rows:
- 'Nous Subscription (Browser Use cloud)' — managed Browser Use billed
via Nous subscription; uses the browser-use plugin as the underlying
backend but has distinct setup UX (requires_nous_auth gates it).
- 'Local Browser' — headless Chromium, no CloudBrowserProvider.
- 'Camofox' — anti-detection local Firefox; _is_camofox_mode()
short-circuits the cloud-provider dispatch path entirely.
Verified the picker output matches pre-migration order/content:
Local Browser, Camofox, Browser Use, Browserbase, Firecrawl
(with 'Nous Subscription' surfaced only when the user is Nous-authed,
unchanged from main).
Foundation commit for the browser-provider plugin migration (#25214).
Mirrors the architecture established by PR #25182 (web providers):
- agent/browser_provider.py — BrowserProvider ABC. Preserves the legacy
CloudBrowserProvider lifecycle contract bit-for-bit (create_session,
close_session, emergency_cleanup, session metadata shape) so the
dispatcher in tools/browser_tool.py becomes a pure registry lookup.
Renames is_configured() → is_available() for parity with WebSearchProvider.
- agent/browser_registry.py — selection registry with the same
three-rule resolution as web_search_registry:
1. Explicit config wins (returns even if is_available() == False so
the dispatcher surfaces a precise credentials error)
2. Single-eligible shortcut
3. Legacy preference walk: browser-use → browserbase, filtered by
availability. Firecrawl is intentionally NOT in the legacy walk
(matches pre-migration behaviour — Firecrawl was only reachable
via explicit browser.cloud_provider: firecrawl).
- hermes_cli/plugins.py — adds ctx.register_browser_provider() facade,
one-liner mirror of register_web_search_provider().
No plugins registered yet; no dispatcher cutover yet. The next commits
move browserbase/browser-use/firecrawl into plugins/browser/<vendor>/
and switch tools/browser_tool.py over to the registry.
The SSH connectivity check in `run_doctor` only passed the host to ssh,
using the current OS user and default port 22. When the target requires a
different user (TERMINAL_SSH_USER), non-standard port (TERMINAL_SSH_PORT),
or a specific identity file (TERMINAL_SSH_KEY), the check always failed
with "Permission denied" — even though the agent itself connects fine.
Fix: read all four TERMINAL_SSH_* env vars and build the ssh command with
-p, -i, and user@host as appropriate, matching how the terminal tool
actually establishes the connection.
Six days after #23937 (608 fixes) the codebase had accumulated 241 new
PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix,
same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the
two are semantically equivalent for hashable scalar membership tests.
All 241 instances fixed via `ruff check --select PLR6201 --fix
--unsafe-fixes`, zero remaining. Every changed value is a hashable
scalar (str/int/None/enum/signal); no risk of unhashable runtime
errors. No behavior change.
Test plan:
- 119 files changed, +244/-244 (net zero) — exactly one-line edits
- `ruff check` clean afterward
- Compile checks pass on the largest touched files (cli.py, run_agent.py,
gateway/run.py, gateway/platforms/discord.py, model_tools.py)
- Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/
tests/tools/: 18187 passed, 59 pre-existing failures (verified against
origin/main with the same shape — identical failure count, identical
category — all xdist test-order flakes unrelated to this change)
Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).
_configure_provider() calls _run_post_setup() after collecting env vars
(line 2286). _reconfigure_provider() did not — providers with both
env_vars and post_setup (Browserbase, Browser Use, Firecrawl, Camofox)
skipped the installation step on reconfiguration.
Fix: mirror the _configure_provider() call. post_setup hooks are
idempotent (check before installing), so no behaviour change for users
who already have the dependencies installed.
The x_search toolset is gated on xAI credentials (SuperGrok OAuth or
XAI_API_KEY), but it was staying off-by-default even for users who had
already configured those credentials — they had to also click through
`hermes tools` → X (Twitter) Search to flip it on. The HASS_TOKEN →
homeassistant rule already handles the parallel case cleanly; x_search
needs the same treatment.
Why a separate code path from HASS_TOKEN: `ha_*` tools live inside
the `hermes-cli` composite, so the subset-inference loop picks them
up and the HASS branch just unmasks default_off. `x_search` is its
own one-tool toolset NOT in the composite, so the subset loop never
adds it — it has to be injected directly.
* Add `_xai_credentials_present()` — side-effect-free check for stored
xAI OAuth tokens or XAI_API_KEY (dotenv or env). No network.
* In `_get_platform_tools()` else branch (no explicit user config),
inject `x_search` and carve a parallel hole in default_off.
* Auto-enable does NOT fire when the user has saved an explicit toolset
list via `hermes tools` — that list stays authoritative.
* `agent.disabled_toolsets: [x_search]` still wins (global override).
Tests: 4 new in test_tools_config.py covering OAuth path, API-key path,
no-creds path, and explicit-config-respect. All pass alongside existing
70/70 in that file.
The goal judge only receives the goal text and the agent's last
response. It has no concept of the current time, making it
impossible to evaluate time-sensitive goals like 'keep working
until 5pm'.
This commit adds 'Current time' to both JUDGE_USER_PROMPT_TEMPLATE
and JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE, computed from
datetime.now().astimezone() at judge call time.
The Telegram/Discord model picker skipped live model discovery for
custom providers (llama.cpp, Ollama) unless an api_key was configured.
Local providers typically don't require auth on the /models endpoint.
The CLI always probes /models, so this brings the gateway picker into
parity.
Change: `if api_url and api_key:` -> `if api_url:`
Add creationflags=CREATE_NO_WINDOW to every Windows Popen call
across the terminal, process registry, code execution, and kanban
worker subsystems. Prevents visible CMD windows from flashing on
the user's desktop during agent operation.
Also adds the _IS_WINDOWS module constant to kanban_db.py where
it was missing, for consistency with the other patched files.
5 Popen sites across 4 files:
- tools/environments/local.py (terminal foreground spawn)
- tools/process_registry.py (background process spawn)
- tools/code_execution_tool.py (sandbox + interpreter probe)
- hermes_cli/kanban_db.py (kanban worker spawn)
The _normalize_custom_provider_entry() function was dropping the
discover_models field from custom_provider entries because:
1. It was not listed in _KNOWN_KEYS, so it was logged as an
unknown key and ignored.
2. The function builds the normalized dict by explicitly copying
known fields, so even if the warning was suppressed, the value
was not carried through.
This caused downstream model_switch.py to default discover_models
to True, triggering /models HTTP probes on unreachable endpoints.
With 4 unreachable internal endpoints at ~6s timeout each, the
/api/model/options endpoint took ~24s instead of <1s.
hermes_cli/gateway.py:3702 referenced logger.debug() but 'logger' was
never defined in the module, causing a NameError at runtime if the
try/except around discover_plugins() caught an exception.
Added import logging and logger = logging.getLogger(__name__)
at module level to resolve the undefined name.
Consume multi-byte non-CSI ESC sequences during ANSI sanitization and handle UnicodeDecodeError for `hermes send --file` so review findings are resolved without regressions.
Avoid Terminal.app paint corruption by disabling fast-echo in that terminal, sanitizing non-SGR control sequences before ANSI rendering, and defaulting Apple Terminal back to the safer 256-color path unless truecolor is explicitly requested.
The Discord adapter silently dropped any attachment whose extension wasn't
in the SUPPORTED_DOCUMENT_TYPES allowlist (PDF, text family, zip, office).
Users uploading .wav / .bin / other unrecognized formats saw nothing in
their conversation — the file got logged as 'Unsupported document type'
and discarded before the agent ever saw it.
Add discord.allow_any_attachment (default false) to bypass the allowlist.
When on:
- Any file is downloaded, cached under ~/.hermes/cache/documents/, and
surfaced as a DOCUMENT-typed event with application/octet-stream MIME
- gateway/run.py already emits a context note with the cached path,
auto-translated via to_agent_visible_cache_path() for Docker/Modal
sandboxed terminals
- File body is NOT inlined — only the path — so binary uploads don't
blow up the context window
- Allowlisted text formats (.txt/.md/.log) keep their 100 KiB inline
behavior unchanged
Also adds discord.max_attachment_bytes (default 32 MiB matches the
historical hardcoded cap; 0 = unlimited) since users opting into arbitrary
types may want to raise the cap. The whole attachment is held in memory
while being cached, so unlimited carries a real memory cost.
Env overrides: DISCORD_ALLOW_ANY_ATTACHMENT, DISCORD_MAX_ATTACHMENT_BYTES.
Discord-only by deliberate scope. Telegram has hard 20 MB API limits and
Slack has its own caps — extending the same flag there is a separate
follow-up if/when requested.
Add a TestDiscoverAllPlugins class covering the six cases the recursive
scan needs to handle:
- flat plugin uses its manifest ``name:`` as the key
- category-namespaced plugin keys off ``<category>/<dirname>`` even when
the manifest ``name:`` is bare (regression test for the original bug —
``plugins/observability/langfuse/`` with ``name: langfuse`` must
surface as ``observability/langfuse``, not ``langfuse``)
- user-installed plugin overrides bundled on key collision
- depth cap: anything below ``<root>/<category>/<plugin>/`` is ignored
- bundled ``memory/`` and ``context_engine/`` are skipped (they have
their own loaders), but user plugins under those category names are
still scanned
Also add an in-source comment next to the key derivation pointing at the
loader's matching line (``PluginManager._parse_manifest`` in
plugins.py:1027-1028), so future renames of one site flag the other.
Both items raised in Copilot review on #27161.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `if key in seen and source == "bundled": continue` check was
unreachable: bundled is scanned before user, so `key in seen` can never
be true while `source == "bundled"`. The "user overrides bundled"
semantics are preserved automatically by the unconditional
`seen[key] = …` on the user pass.
Replaces the dead guard with a one-line comment explaining the
overwrite semantics, so a future contributor adding a third source
(e.g. project plugins) can see at a glance how ordering interacts with
the dict-overwrite. Matches `PluginManager.discover_and_load`'s
"user wins" rule.
Spotted by Copilot in code review on #27161.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The langfuse plugin is hooks-only (no toolsets), so it never appears in
`hermes tools` — that menu iterates `_get_effective_configurable_toolsets()`
(= `CONFIGURABLE_TOOLSETS` + plugin-registered toolsets), and "langfuse"
is in neither. The `TOOL_CATEGORIES["langfuse"]` setup wizard (with its
`post_setup: "langfuse"` hook that pip-installs the SDK and writes
`plugins.enabled`) was reachable only when a toolset key "langfuse" got
enabled, which can't happen — so it's been dead code, and the docs that
promised "Setup (interactive): hermes tools → Langfuse Observability"
were silently broken.
Right home for that wizard is `hermes plugins` (e.g. auto-running a
plugin's post-setup hook on enable), which is a generic plugin-setup
mechanism worth designing properly rather than shoehorning langfuse
back into `hermes tools`. Until that exists, point users at the
working manual flow.
Code:
- Delete `TOOL_CATEGORIES["langfuse"]` (24 lines) — unreachable.
- Delete the `post_setup_key == "langfuse"` branch in `_run_post_setup`
(29 lines) — only caller was the deleted TOOL_CATEGORIES entry.
Docs / comments (point at the manual flow + interactive `hermes plugins`):
- `plugins/observability/langfuse/README.md`: collapse the two-option
setup section to the single working flow.
- `plugins/observability/langfuse/plugin.yaml`: update `description`.
- `plugins/observability/langfuse/__init__.py`: update module docstring.
- `hermes_cli/config.py`: update inline comment above the LANGFUSE_*
env-var allow-list.
- `website/docs/user-guide/features/built-in-plugins.md`: collapse
"Setup (interactive)" + "Setup (manual)" into one accurate block.
- `website/docs/reference/environment-variables.md`: update the
cross-reference in the Langfuse env-vars section.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_discover_all_plugins()` in plugins_cmd.py did a flat scan of the
bundled and user plugin directories — only direct children with a
plugin.yaml were surfaced. Category directories like `observability/`,
`image_gen/`, `platforms/`, `model-providers/`, `web/`, and `video_gen/`
have no plugin.yaml of their own, so their nested plugins
(`observability/langfuse`, `image_gen/openai`, etc.) never appeared in
`hermes plugins list` or the interactive `hermes plugins` UI — even
though the runtime loader (`PluginManager._scan_directory_level`)
discovers them correctly and they do load at runtime.
This broke the documented promise that bundled plugins appear in
`hermes plugins list` and the interactive UI before being enabled,
and made it look like `observability/langfuse` didn't exist.
Refactor `_discover_all_plugins()` to mirror the loader's recursion
(depth cap = 2, same skip set, user overrides bundled on key collision).
Return the path-derived registry key (e.g. `observability/langfuse`) as
the displayed name, matching what the user passes to
`hermes plugins enable …` / writes under `plugins.enabled` in
config.yaml.
Also clarify the plugins docs: spell out that sub-category plugins
surface by their `<category>/<plugin>` key in `hermes plugins list` /
interactive UI, add an `observability/langfuse` example to the command
reference, and include a nested entry in the interactive-UI mock.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces a thin CLI wrapper around the existing send_message_tool so
shell scripts, cron scripts, CI hooks, and monitoring daemons can reuse
the gateway's already-configured platform credentials without
reimplementing each platform's REST client.
hermes send --to telegram "deploy finished"
echo "RAM 92%" | hermes send --to telegram:-1001234567890
hermes send --to discord:#ops --file report.md
hermes send --to slack:#eng --subject "[CI]" --file build.log
hermes send --list # all targets
hermes send --list telegram # filter by platform
Supports all platforms the send_message tool already does (Telegram,
Discord, Slack, Signal, SMS, WhatsApp, Matrix, Feishu, DingTalk, WeCom,
Weixin, Email, etc.), including threaded targets and #channel-name
resolution via the channel directory.
hermes_cli/send_cmd.py delegates to tools.send_message_tool.send_message_tool,
which means there is zero new platform-specific code. The subcommand just:
1. Bridges ~/.hermes/.env and top-level ~/.hermes/config.yaml scalars into
os.environ (same bootstrap the gateway does at startup) — required so
TELEGRAM_HOME_CHANNEL and friends are visible to load_gateway_config().
2. Resolves the message body from positional arg, --file, or piped stdin.
3. Calls the shared tool and translates its JSON result to exit codes:
0 success, 1 delivery failure, 2 usage error.
No running gateway is required for bot-token platforms (Telegram, Discord,
Slack, Signal, SMS, WhatsApp) — the tool hits each platform's REST API
directly. Plugin platforms that rely on a live adapter connection still
need the gateway running; the error message is forwarded verbatim.
- New guide: website/docs/guides/pipe-script-output.md covering real-world
patterns (memory watchdogs, CI hooks, cron pipes, long-running task
completion pings) and the security/gateway notes.
- Cross-links added from automate-with-cron.md ("no LLM? use hermes send")
and developer-guide/gateway-internals.md (delivery-path section).
tests/hermes_cli/test_send_cmd.py (20 tests, all green):
- Happy paths: positional message, stdin, --file, --file -, --subject,
--json, --quiet.
- Error paths: missing --to, missing body, file not found, tool returns
error payload (exit 1), tool skipped-send result (exit 0).
- --list: human output, --json output, platform filter, unknown platform.
- Env loader: bridges config.yaml scalars into env, does not override
existing env vars, gracefully handles missing files.
- Registrar contract: register_send_subparser() returns a working parser.
Smoke-tested end-to-end against a live Telegram bot before commit.
Adds a pure-local recap of recent session activity — turn counts,
tools used, files touched, last user ask, last assistant reply —
appended to the existing /status output. Useful when juggling multiple
sessions and you want a one-glance reminder of where this one left off.
Inspired by Claude Code 2.1.114's /recap, but folded into /status so
we don't add a 6th info command. Pure local computation: no LLM call,
no auxiliary model, no prompt-cache invalidation, instant and free.
Salvage of #18587 — kept the shared hermes_cli.session_recap.build_recap
helper and its 13 unit tests, dropped the /recap slash command +
ACTIVE_SESSION_BYPASS_COMMANDS entry + Level-2 bypass since /status
already covers both surfaces.
Tailored to hermes-agent's tool vocabulary: file-editing tools
(patch, write_file, read_file, skill_manage, skill_view) surface
touched paths; tool-call counts highlight which classes of work
drove the session.
Source: https://code.claude.com/docs/en/whats-new/2026-w17
Emit a grep-friendly '[MEMORY] rss=...MB ...' line in agent.log /
gateway.log every N minutes (default 5) so slow leaks in the long-lived
gateway process show up as a time series. Based on
https://github.com/cline/cline/pull/10343
(src/standalone/memory-monitor.ts).
- gateway/memory_monitor.py: new module. Daemon thread, baseline on
start, final snapshot on stop. Uses resource.getrusage() (stdlib)
first, falls back to psutil, disables itself with one WARNING if
neither is available.
- gateway/run.py: start monitor right after setup_logging() in
start_gateway(); stop it in the shutdown block next to MCP teardown.
- hermes_cli/config.py: logging.memory_monitor { enabled, interval_seconds }
defaults under the existing logging section.
- tests/gateway/test_memory_monitor.py: 10 unit tests covering format,
baseline/shutdown snapshots, double-start noop, periodic timer,
daemon thread invariant, and unavailable-RSS warn-and-skip path.
Adapted from TypeScript/Node to Python (threading.Event-based daemon
thread instead of setInterval/unref), added Python-specific gc + thread
counts to the log line (handier than ext/arrayBuffers for diagnosing
Python gateway leaks), and gated behind a config.yaml toggle so users
can silence the periodic line if they want.
No heap-snapshot-on-OOM equivalent — CPython doesn't have V8's
--heapsnapshot-near-heap-limit; tracemalloc would be the Python
equivalent but adds non-trivial overhead, so leaving that out.
Port from google-gemini/gemini-cli#19332.
Users can now exit with '/exit --delete' (or '/quit --delete', '/exit -d')
to permanently remove the current session's SQLite history plus on-disk
transcripts (*.json / *.jsonl / request_dump_*) in one shot. Useful for
privacy-sensitive workflows and one-off interactions where leaving a
session recording behind is undesirable.
Implementation:
- New HermesCLI._delete_session_on_exit one-shot flag (defaults False).
- process_command() parses --delete / -d after /exit or /quit and arms
the flag. Unknown args print a hint and keep the CLI running (prevents
typos like '/exit -delete' from accidentally exiting).
- Shutdown path calls SessionDB.delete_session(session_id, sessions_dir=...)
right after end_session() when the flag is set. That API already
existed for 'hermes sessions delete' and handles both SQLite removal
(orphaning child sessions so FK constraints hold) and on-disk file
cleanup.
- /quit CommandDef now advertises '[--delete]' in args_hint so /help
and CLI autocomplete surface it.
Tests: tests/cli/test_exit_delete_session.py (12 cases covering both
aliases, case insensitivity, whitespace, short form, unknown-arg
rejection, and registry metadata).
E2E-verified with isolated HERMES_HOME: session row deleted, all three
transcript/request-dump files removed, second delete_session call
correctly returns False.
`hermes update` ran the repo-root and ui-tui npm installs with both
`--silent` and `subprocess.run(..., capture_output=True)`, which hides
all output from optional postinstall scripts. The largest of those —
`@askjo/camofox-browser`'s `npx camoufox-js fetch` — downloads a
Firefox-fork browser binary that can take many minutes on slow
connections. Because nothing was printed during that wait, the updater
appeared to hang at "Updating Node.js dependencies..." and users
Ctrl-C'd, sometimes leaving `node_modules` partially installed.
Drop `--silent` and pass `capture_output=False` for the repo-root and
ui-tui paths so npm streams its `info run …` postinstall lines straight
to the terminal. Output is still mirrored to `~/.hermes/logs/update.log`
by the existing `_UpdateOutputStream` wrapper, so SSH-disconnect safety
is preserved.
The `web/` install path is untouched — its build step is fast and does
not run binary-fetching postinstalls.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `@askjo/camofox-browser` npm package was a top-level entry in
the root `package.json` `dependencies` block, so `hermes update`
ran its postinstall on every user, every update. That postinstall
calls `npx camoufox-js fetch`, which silently downloads a ~300MB
Firefox-fork browser binary from GitHub Releases — multi-minute on
fast connections, and a hard block for users on slow / restricted
networks (notably users in China running through a VPN).
Camofox is an explicit opt-in browser backend. The runtime check
in `tools/browser_tool.py` only routes through Camofox when the
user has set `CAMOFOX_URL` (selected via `hermes tools` →
Browser Automation → Camofox). Users who never opted in never
touched the package at runtime, yet every `hermes update` paid
for the binary fetch anyway.
This change:
* Removes `@askjo/camofox-browser` from root `package.json`
dependencies (and the regenerated `package-lock.json` drops
Camofox's entire transitive tree, ~2.6k lines).
* Updates the Camofox `post_setup` handler in
`hermes_cli/tools_config.py` to install
`@askjo/camofox-browser@^1.5.2` explicitly when the user
selects Camofox, and streams npm output (no `--silent`, no
`capture_output`) so the ~300MB download is visible rather
than appearing frozen.
* Adds `tests/test_package_json_lazy_deps.py` as a regression
guard so future PRs can't silently re-add Camofox (or any
binary-postinstall package) to eager root dependencies.
`agent-browser` stays eager — it is the default Chromium-driving
backend used by every session that does not have a cloud browser
provider configured, and its postinstall is small.
Validation:
| | Before | After |
|---|---|---|
| `hermes update` time on slow network | multi-minute hang at `→ Updating Node.js dependencies...` | seconds (no binary fetch) |
| Camofox opt-in install visibility | silent, looked frozen | streamed npm output |
| Regression guard against re-adding | none | `test_package_json_lazy_deps.py` |
Tests:
- `tests/test_package_json_lazy_deps.py`: 3/3 pass
- `tests/tools/test_browser_camofox*`: 92/92 pass
- `tests/hermes_cli/test_tools_config.py`: 66/66 pass
- `tests/hermes_cli/test_cmd_update.py` + adjacent: green
Reported by lulu (Discord, May 2026) — `hermes update` hangs at
`→ Updating Node.js dependencies...` in China.
Related: #18840, #18869.
Address two blocking issues when using GitHub Copilot integrations:
1. ACP mode: detect the gh-copilot CLI deprecation error from stderr
and surface an actionable message with alternatives instead of
hanging or showing a cryptic error.
2. GitHub Models (Azure) 413: recognize models.inference.ai.azure.com
as a known GitHub Models URL, and print a targeted hint explaining
the hard 8K token limit that makes this endpoint incompatible with
Hermes' system prompt size.
Fixes#26693
`hermes doctor` currently promotes invalid direct API keys into the final
summary even when the matching OAuth path is already healthy. That makes
the setup look more broken than it really is.
This change keeps the failed API Connectivity row visible but stops
treating it as a blocking summary issue when a healthy OAuth fallback
already exists for the same provider family.
Covered cases:
- Gemini OAuth + invalid direct Gemini key
- MiniMax OAuth + invalid direct MiniMax key
Based on #26704 by @worlldz.
Subagent delegation hardcoded api_mode='chat_completions' for any
delegation.base_url that didn't match three specific hostnames
(chatgpt.com, api.anthropic.com, api.kimi.com/coding), and never
read delegation.api_mode from config. Azure AI Foundry's
https://foundry.services.ai.azure.com/anthropic endpoint fell through
and got chat_completions, causing 404s on every delegate_task call.
The main agent already handles this correctly via the shared
_detect_api_mode_for_url() helper (anything ending in /anthropic →
anthropic_messages); delegation reimplemented its own narrower check.
Reuse the shared detector and honor an explicit delegation.api_mode
when set so users can also force the transport on non-standard
endpoints the URL heuristic can't classify.
Fixes#10213.
Co-authored-by: HiddenPuppy <HiddenPuppy@users.noreply.github.com>
* feat(x_search): gated X (Twitter) search tool with OAuth-or-API-key auth
Salvages tools/x_search_tool.py from the closed PR #10786 (originally by
@Jaaneek) and reworks its credential resolution so the tool registers
when EITHER xAI credential path is available:
* XAI_API_KEY (paid xAI API key) is set in ~/.hermes/.env or the env, OR
* The user is signed in via xAI Grok OAuth — SuperGrok subscription —
i.e. hermes auth add xai-oauth has been run
Both paths route through xAI's built-in x_search Responses tool at
https://api.x.ai/v1/responses. When both credentials exist OAuth wins,
matching tools/xai_http.py's existing preference order (uses SuperGrok
quota instead of paid API spend).
The check_fn calls resolve_xai_http_credentials() which auto-refreshes
the OAuth access token if it's within the refresh skew window, so a
True return means the bearer is fetchable AND non-empty.
Wiring
- tools/x_search_tool.py — new tool, ~370 LOC. Schema gated by check_fn,
bearer resolved per-call so revoked OAuth surfaces a clean tool_error
rather than an HTTP 401.
- toolsets.py — "x_search" toolset def. NOT added to _HERMES_CORE_TOOLS;
users opt in via hermes tools.
- hermes_cli/tools_config.py — CONFIGURABLE_TOOLSETS entry + TOOL_CATEGORIES
block with two provider options (OAuth + API key) sharing the existing
xai_grok post_setup hook for credential bootstrap.
- hermes_cli/config.py — DEFAULT_CONFIG["x_search"] with model /
timeout_seconds / retries. Additive nested key; no version bump.
- tests/tools/test_x_search_tool.py — 13 tests covering HTTP shape,
handle validation, citation extraction, 4xx/5xx/timeout handling,
and the full credential-resolution matrix (OAuth-only, API-key-only,
both-set, neither-set, resolver-raises, config overrides, registry
registration).
- website/docs/guides/xai-grok-oauth.md — adds X Search to the
direct-to-xAI tools section with off-by-default note.
- website/docs/user-guide/features/tools.md — new row in the tools table.
Off by default — users enable via `hermes tools` → 🐦 X (Twitter) Search.
Schema only appears to the model when xAI credentials are configured.
Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>
* docs(x_search): add dedicated feature page + reference entries
- website/docs/user-guide/features/x-search.md (new) — full feature
walkthrough: authentication, enablement, configuration, parameters,
returned fields, example, troubleshooting, see-also links.
- website/docs/reference/tools-reference.md — new "x_search" toolset
section with parameter docs and credential gating note.
- website/docs/reference/toolsets-reference.md — new row in the
toolset catalog table.
- website/sidebars.ts — wires the new feature page under
Media & Web, after web-search.
---------
Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>
Plugins can now replace a built-in tool by passing override=True to
ctx.register_tool(). Without it, the registry rejects any registration
that would shadow an existing tool from a different toolset (unchanged
default behavior).
Unlocks the use case from #11049: drop-in replacement of browser/web
backends without forking core. Composes with the existing pre_tool_call
hook for runtime interception of any implementation.
The override is audit-logged at INFO so it surfaces in agent.log.