hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-24 16:54:43 +00:00

Author	SHA1	Message	Date
kshitijk4poor	2b89afec79	fix(plugins): alias-normalize enable/disable for nested category plugins (follow-up to #41076) #41076 makes `hermes plugins list` discover nested category plugins (e.g. observability/nemo_relay). This adds the missing enable/disable mutation path so those plugins can actually be toggled, and fixes two incomplete-update breakages on the #41076 base. Before: `hermes plugins enable nemo_relay` -> "Plugin 'nemo_relay' is not installed or bundled." (exit 1), because cmd_enable/cmd_disable went through _plugin_exists(), which only checked top-level plugins/<name>/. Changes: - Add _resolve_plugin_key(): resolve a bare manifest/leaf name OR a full path-derived key (observability/nemo_relay) to the canonical key the runtime loader gates on, reusing #41076's _discover_all_plugins(). A bare leaf name ambiguous across two categories resolves to None rather than silently picking one. - cmd_enable/cmd_disable resolve first, persist the canonical key, and drop any stale legacy bare-name alias so the enabled/disabled lists can't drift into a contradictory state. _plugin_exists delegates to the same resolver. - Fix #41076 base breakages: _discover_all_plugins now returns 6-tuples, but web_server._merged_plugins_hub() still unpacked 5 (ValueError on the dashboard plugins-hub endpoint) and several test_plugins_cmd_list.py fixtures were still 5-tuples. Both updated; the hub status check is now key-aware. Verified e2e on the real CLI + runtime loader (isolated HERMES_HOME): `hermes plugins enable nemo_relay` writes observability/nemo_relay to config.yaml and the loader then loads it (enabled=True, error=None); a stale bare-name alias is cleared on disable; the dashboard _merged_plugins_hub() runs without crashing. Adds resolution + enable/disable tests; full tests/hermes_cli/test_plugins_cmd* + web_server plugin tests green. Follow-up to #41076 (#41066). Branched from that PR's head.	2026-06-08 17:57:37 +05:30
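
The resolution rule in the commit above (exact key first, then a unique leaf-name match, otherwise None) can be sketched roughly as follows. This is only an illustration, not the project's `_resolve_plugin_key()`: the name `resolve_plugin_key` and the `known_keys` argument are hypothetical, and the real resolver also consults manifest names through `_discover_all_plugins()`.

```python
from typing import Iterable, Optional


def resolve_plugin_key(name: str, known_keys: Iterable[str]) -> Optional[str]:
    """Map a bare leaf name or a full category/leaf key to one canonical key.

    Hypothetical helper: `known_keys` stands in for whatever discovery returns.
    """
    keys = list(known_keys)
    # A full path-derived key (or a top-level plugin name) matches directly.
    if name in keys:
        return name
    # Otherwise treat `name` as a bare leaf and look for it under any category.
    matches = [key for key in keys if key.rsplit("/", 1)[-1] == name]
    # Zero matches means unknown; two or more means ambiguous. Both give None.
    return matches[0] if len(matches) == 1 else None
```

Returning None for an ambiguous leaf name forces the caller to ask for the full key instead of silently toggling the wrong plugin.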
islam666	ccacfdbd6d	fix(plugins): discover nested category plugins in 'plugins list' (issue #41066) _discover_all_plugins() previously did a flat iterdir() scan, missing all category-namespaced plugins (web/, image_gen/, browser/, video_gen/). Now recurses up to 2 levels deep, matching PluginManager._scan_directory_level(). Also fixes _plugin_status() to check both manifest name AND path-derived key against enabled/disabled sets, so category plugins like 'web/tavily' show correct status when enabled via config.	2026-06-07 08:02:55 +00:00
annguyenNous	b08662b782	fix(gateway): tolerate Unicode in stderr log handlers on Windows On Windows with non-UTF-8 console encodings (e.g. cp949, cp1252), StreamHandler emits raise UnicodeEncodeError when log messages contain characters outside the console codepage — such as the em-dash (U+2014) in the session hygiene message. This crashed the gateway process silently, leaving no diagnostic output. Fix: add _safe_stderr() helper that wraps sys.stderr in a TextIOWrapper with encoding='utf-8' and errors='replace' when the console encoding is not UTF-8. Applied to both: - hermes_logging.py setup_verbose_logging() stderr handler - gateway/run.py optional stderr handler The wrapper ensures log lines are never lost — un-encodable characters are replaced with '?' instead of crashing the process. Fixes #40432	2026-06-06 19:57:44 -07:00
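
A minimal sketch of the stderr-wrapping idea from the commit above, assuming the stream exposes a binary `.buffer`. The function name `safe_stderr` and its optional `stream` parameter are illustrative and may differ from the project's `_safe_stderr()` helper.

```python
import io
import sys


def safe_stderr(stream=None):
    """Return a text stream that will not raise UnicodeEncodeError on write."""
    stream = sys.stderr if stream is None else stream
    encoding = (getattr(stream, "encoding", None) or "").lower()
    normalized = encoding.replace("-", "").replace("_", "")
    buffer = getattr(stream, "buffer", None)
    # Already UTF-8, or no binary buffer to re-wrap: use the stream unchanged.
    if normalized == "utf8" or buffer is None:
        return stream
    stream.flush()  # do not strand text still pending in the old wrapper
    # Re-wrap the same byte buffer; anything that still cannot be encoded is
    # replaced instead of raising.
    return io.TextIOWrapper(
        buffer, encoding="utf-8", errors="replace", line_buffering=True
    )
```

A handler would then be built as `logging.StreamHandler(safe_stderr())`. The handler should keep a reference to the wrapper for the life of the process, since a garbage-collected `TextIOWrapper` closes its underlying buffer.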
Teknium	fc086da8bd	fix(gateway,windows): reliability — JOB breakaway + status --deep probes + test-leak fix (#40909) * fix(gateway,windows): reliability — supervisor task, JOB breakaway, status --deep Three coordinated fixes for the Windows gateway reliability story: 1. CREATE_BREAKAWAY_FROM_JOB on every detached spawn The 'hermes update' triggered from the Electron Desktop GUI ran inside Electron's job object. Without breakaway, the post-update gateway watcher spawned by update — already DETACHED_PROCESS — was still reaped when Electron's job tore down, so the gateway never came back after a GUI-initiated update. Adds CREATE_BREAKAWAY_FROM_JOB (0x01000000) to: - hermes_cli/_subprocess_compat.py::windows_detach_flags() — used by every helper that calls windows_detach_popen_kwargs(), including launch_detached_profile_gateway_restart() - The watcher subprocess's own respawn snippet in hermes_cli/gateway.py (inlined flags so the watcher's child respawn also breaks away) _spawn_detached() in gateway_windows.py already had the flag; this change brings the rest of the codebase to parity. 2. Per-minute supervisor Scheduled Task — Windows equivalent of systemd Restart=always Introduces hermes_cli/gateway_supervisor.py and registers it as a second Scheduled Task ('Hermes_Gateway_Supervisor', SC MINUTE /MO 1, LIMITED rights) alongside the existing ONLOGON task. Every minute, the supervisor uses the same gateway.status.get_running_pid() probe as 'hermes gateway status' and, if no gateway is alive, calls gateway_windows._spawn_detached() (which now includes BREAKAWAY) to bring one back. Covers every crash mode, not just 'machine rebooted': taskkill, OOM, GUI update SIGTERM, parent job teardown. Cheap — one pythonw startup per minute when down, one PID-existence check per minute when up. Wired into both the schtasks-success and Startup-folder-fallback install paths via _install_supervisor_best_effort(), and removed in uninstall(). 
Best-effort: a failing supervisor install logs a warning but doesn't roll back the primary install. 3. 'hermes gateway status --deep' shows per-probe PASS/FAIL Replaces the existing terse '--deep' output (which only printed paths) with an actual diagnostic table: [1] PID file present [2] Lock file held by a live process [3] get_running_pid() result [4] _pid_exists(pid) — OS-level liveness [5] gateway_state.json (state + age) [6] Last lifecycle event from gateway-exit-diag.log When the high-level summary disagrees with reality, the user can see exactly which signal is lying. Test-leak fix ------------- tests/hermes_cli/test_gateway_wsl.py::TestGatewayCommandWSLMessages monkey-patched is_linux/is_wsl/supports_systemd_services to simulate WSL but did NOT stub is_windows(). On a Windows host, the dispatcher in _gateway_command_inner takes the is_windows() branch BEFORE the WSL guidance branch, so the test invoked gateway_windows.install() for real. install() writes to %APPDATA%\...\Startup\Hermes_Gateway.cmd — the REAL user Startup folder, never sandboxed by tmp_path — pointing at the test's pytest-of-<user>/pytest-<N>/.../gateway-service/ wrapper. When pytest tore down the tmp_path, every subsequent Windows login flashed a cmd.exe window that failed to find the missing target. Stubs is_windows=False on all four affected tests: test_install_wsl_no_systemd test_start_wsl_no_systemd test_status_wsl_running_manual test_status_wsl_not_running Defense-in-depth: _build_startup_launcher() now prefixes the launcher with 'if not exist <target> exit /b 0', so any future stale Startup entry silently no-ops instead of flashing a console window. Status enhancements ------------------- - status() now reports supervisor task presence alongside the existing schtasks/Startup info, and nudges the user to reinstall if the supervisor isn't registered. - Deep mode dumps both the supervisor task name + script path. 
* fix(gateway,windows): drop the per-minute supervisor task — keep breakaway + deep probes Earlier in this branch we added a per-minute schtasks-based supervisor to respawn the gateway after crashes / GUI-update SIGTERMs. The implementation flashed a brief console window on every firing, which stole window focus. We tried several variants: - cmd.exe wrapper invoking pythonw -> flashes (cmd.exe is console-subsystem) - schtasks /TR pointing at pythonw -> flashes (uv venv launcher pythonw is actually subsystem=Console, not GUI; it respawns the real pythonw) - schtasks /TR pointing at base uv -> still flashes (Task Scheduler-side conhost preallocation; documented Windows quirk) - XML registration with <Hidden>true> -> still flashes (<Hidden> only hides the task in the Task Scheduler UI, not the spawned window) Researched what leading projects do: - Ollama: GUI-subsystem tray exe + Startup-folder shortcut. No supervisor. - Tailscale: real Windows Service via SCM. Session 0, no console possible. - Syncthing: --no-console flag inside the binary + Startup folder. - openclaw: VBS Run(..., 0, False) wrapper. Suppresses the window but Super User Q971162 confirms focus-steal still occurs in some cases. None of these use a per-minute polling scheduled task. The 'auto-restart on crash' responsibility belongs INSIDE the daemon (Tailscale's in-process recovery / Ollama's monitor+worker pair) OR is delegated to the Windows Service Control Manager — not Task Scheduler. So this commit drops the supervisor entirely. The CREATE_BREAKAWAY_FROM_JOB fix in _subprocess_compat.py (from commit `c1e5fa433`) survives — that is the real fix for problem #2 (GUI-update kills gateway): the post-update watcher in launch_detached_profile_gateway_restart() now breaks out of Electron's job object, so the gateway respawn watcher survives the GUI quit and successfully respawns the gateway. 
Surviving from `c1e5fa433`: * CREATE_BREAKAWAY_FROM_JOB in hermes_cli/_subprocess_compat.py (fixes #2) * Inlined breakaway flag in the watcher respawn snippet in gateway.py * hermes gateway status --deep PASS/FAIL probes (fixes #1 — visibility) * 'if not exist <target> exit /b 0' guard in _build_startup_launcher (fixes #3 — silent no-op for stale Startup entries) * tests/hermes_cli/test_gateway_wsl.py is_windows=False stubs (root cause of #3 — pytest WSL tests no longer leak Startup entries on Win hosts) Removed in this commit: * hermes_cli/gateway_supervisor.py (entire file) * Supervisor section in hermes_cli/gateway_windows.py (~180 lines): get_supervisor_task_name, get_supervisor_script_path, _build_supervisor_cmd_script, _write_supervisor_script, _install_supervisor_task, is_supervisor_task_registered, _install_supervisor_best_effort * _install_supervisor_best_effort() calls in install() (3 spots) * supervisor cleanup block in uninstall() * supervisor display lines in status() / status(deep=True) Future direction (out of scope for this PR): the right place for Windows 'Restart=always' semantics is a real Windows Service installed via pywin32's win32serviceutil.ServiceFramework — session-0 isolation, SCM auto-restart, no console window possible. That's a meaningful next-PR project, not a band-aid. Tests: 51 pass / 2 pre-existing failures in tests/hermes_cli/test_gateway_{windows,wsl}.py (the 2 failures are TestSupportsSystemdServicesWSL cases that fail on origin/main too — unrelated to this PR).	2026-06-06 19:53:58 -07:00
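
The breakaway fix that survives in this commit comes down to OR-ing one more bit into the process-creation flags. The sketch below illustrates that idea under stated assumptions: the helper names `detach_flags` and `detach_popen_kwargs` are hypothetical, and the exact set of flags combined in `hermes_cli/_subprocess_compat.py` is not shown in the log. Only `CREATE_BREAKAWAY_FROM_JOB` (0x01000000) and `DETACHED_PROCESS` are named there; `CREATE_NEW_PROCESS_GROUP` is an assumption.

```python
import sys

# Win32 process-creation flag values, as documented for CreateProcess.
DETACHED_PROCESS = 0x00000008
CREATE_NEW_PROCESS_GROUP = 0x00000200  # assumed here, not confirmed by the log
CREATE_BREAKAWAY_FROM_JOB = 0x01000000


def detach_flags(platform: str = sys.platform) -> int:
    """Creation flags for a child that should outlive the parent's job object."""
    if platform != "win32":
        return 0
    return DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP | CREATE_BREAKAWAY_FROM_JOB


def detach_popen_kwargs(platform: str = sys.platform) -> dict:
    """Keyword arguments to pass to subprocess.Popen for a detached spawn."""
    if platform == "win32":
        return {"creationflags": detach_flags(platform)}
    # POSIX has no job objects; a new session detaches from the parent's group.
    return {"start_new_session": True}
```

Usage would look like `subprocess.Popen(cmd, **detach_popen_kwargs())`. Note that breakaway only succeeds when the enclosing job permits it, so a caller may want to retry without the flag if the spawn is refused.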
Frowtek	40cea4d58d	fix(agent): import SimpleNamespace for hook payload sanitization _hook_jsonable() referenced SimpleNamespace without importing it, so sanitizing any hook payload that contained one raised NameError: name 'SimpleNamespace' is not defined. Bedrock, Codex-responses, and the auxiliary client build their response / message / tool_call objects as SimpleNamespace and hand the raw objects to the post_api_request hook. The hook call sites swallow exceptions (except Exception: pass), so the crash silently dropped the observability hook for those providers. Add the missing `from types import SimpleNamespace` and a regression test covering the SimpleNamespace sanitization path.	2026-06-06 19:32:36 -07:00
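
To illustrate the sanitization path this commit repairs, here is a rough sketch of a recursive converter that handles `SimpleNamespace` payloads. The name `hook_jsonable` and the fallback to `repr()` are assumptions made for the example; the project's `_hook_jsonable()` may treat other types differently.

```python
from types import SimpleNamespace  # the import whose absence caused the NameError


def hook_jsonable(value):
    """Recursively convert a hook payload into JSON-serializable primitives."""
    if isinstance(value, SimpleNamespace):
        # vars() exposes the namespace attributes as a plain dict.
        return {key: hook_jsonable(item) for key, item in vars(value).items()}
    if isinstance(value, dict):
        return {str(key): hook_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [hook_jsonable(item) for item in value]
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    # Assumed fallback: keep something readable rather than fail the hook.
    return repr(value)
```

Because the hook call sites swallow exceptions, a regression test that feeds a `SimpleNamespace` through the converter is the only reliable way to notice this class of failure.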
helix4u	bb53edc773	fix(image_gen): use gpt-5.5 for Codex image host	2026-06-06 19:31:51 -07:00
Teknium	3eeca4613d	fix(qqbot): stop 100% CPU spin when WebSocket is closed but not None (#31193, #31771) (#40574) _read_events() returned normally when self._ws was closed-but-non-None (the while-condition is false on entry). _listen_loop treats a normal return as a clean read, resets backoff to 0, and immediately retries — a tight busy-loop pinning CPU. Raising on entry routes it through the reconnect/backoff path instead. Co-authored-by: xushibo <xushibo@users.noreply.github.com> Co-authored-by: cnfi <cnfi@users.noreply.github.com>	2026-06-06 18:44:44 -07:00
Teknium	fe8920db18	fix(memory): reject memory tools that shadow core tool names (#40902) A memory provider tool whose name collides with a built-in core tool (e.g. clarify, delegate_task) was skipped from agent.tools at init but lingered in MemoryManager._tool_to_provider, where the has_tool dispatch branch could route a call to a tool that was never registered (#40466). Block the collision at registration instead of patching dispatch: - MemoryManager.add_provider rejects any tool whose name is in _HERMES_CORE_TOOLS (warn + skip), so it never enters the routing table. - get_all_tool_schemas applies the same filter, so the manager never advertises a schema it would refuse to route. Built-ins always win, matching the invariant used by the TTS/browser/ search provider registries. Makes the dispatch-hijack structurally impossible regardless of branch ordering. Closes #40466.	2026-06-06 18:44:09 -07:00
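
The "block at registration" approach in the commit above can be sketched as a routing table that refuses reserved names up front. Everything in this example is hypothetical (the `ToolRouter` class, the two-entry `CORE_TOOLS` set, the method signatures); it only mirrors the shape of the fix, not `MemoryManager` itself.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical subset of reserved built-in names, for illustration only.
CORE_TOOLS = frozenset({"clarify", "delegate_task"})


class ToolRouter:
    """Routing table that never accepts a tool shadowing a core tool name."""

    def __init__(self, core_tools=CORE_TOOLS):
        self._core_tools = frozenset(core_tools)
        self._tool_to_provider = {}

    def add_provider(self, provider_name, tool_names):
        """Register a provider's tools and return the names actually accepted."""
        accepted = []
        for tool in tool_names:
            if tool in self._core_tools:
                # Warn and skip: the colliding name never enters the table.
                logger.warning(
                    "provider %s: tool %r shadows a core tool, skipped",
                    provider_name, tool,
                )
                continue
            self._tool_to_provider[tool] = provider_name
            accepted.append(tool)
        return accepted

    def has_tool(self, name):
        return name in self._tool_to_provider
```

Since a rejected name is never stored, dispatch cannot route to it regardless of the order in which dispatch branches are checked.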
Teknium	887295ba54	fix(config): preserve custom-provider models maps and metadata through v11->v12 migration (#40573) Salvaged from #40410; cleaned up, re-verified against main, tests added. Co-authored-by: rodboev <rodboev@users.noreply.github.com>	2026-06-06 18:43:20 -07:00
teknium1	f9ea4927f2	test(tui): cover _terminal_task_cwd remote-backend branches Adds regression tests for the SSH cwd fix: local backend keeps host-validated session cwd; non-local backend uses TERMINAL_CWD (or terminal.cwd config) verbatim without host isdir() validation; sentinel values fall back to session cwd.	2026-06-06 18:40:43 -07:00
Teknium	89040e0db3	fix(secrets): fail early with clear error when bitwarden setup runs without TTY (#40571) Salvaged from #40280; cleaned up, re-verified against main, tests added. Co-authored-by: liuhao1024 <liuhao1024@users.noreply.github.com>	2026-06-06 18:36:40 -07:00
Teknium	365437e4aa	fix(cua-driver): reconnect MCP stdio session once on ClosedResourceError after daemon restart (#40570) Salvaged from #40282; cleaned up, re-verified against main, tests added. Co-authored-by: jeeves-assistant <jeeves-assistant@users.noreply.github.com>	2026-06-06 18:35:12 -07:00
Teknium	8f7567c325	fix(bitwarden): prevent zip-slip path traversal when extracting bws binary (#40569) Salvaged from #40381; cleaned up, re-verified against main, tests added. Co-authored-by: zapabob <zapabob@users.noreply.github.com>	2026-06-06 18:33:44 -07:00
Teknium	5a36f76a00	fix(skill_manager): allow SKILL.md in _validate_file_path without weakening traversal guard (#40568) Salvaged from #40453; cleaned up, re-verified against main, tests added. Co-authored-by: l37525778-coder <l37525778-coder@users.noreply.github.com>	2026-06-06 18:32:37 -07:00
Teknium	c0424b06af	fix(osv_check): honor npx --package/-p install target when parsing package arg (#40567) Salvaged from #40461; cleaned up, re-verified against main, tests added. Co-authored-by: HeLLGURD <HeLLGURD@users.noreply.github.com>	2026-06-06 18:30:39 -07:00
Teknium	56f833efa4	fix(skills): block path traversal via skill_view name argument (#40566) Closes #38643. Salvaged from #40521; cleaned up, re-verified against main, tests added. Co-authored-by: xy200303 <xy200303@users.noreply.github.com>	2026-06-06 18:29:52 -07:00
Teknium	f4a73abbd0	chore(gateway): drop HOMEASSISTANT from /update allowlist (#40736) Home Assistant is a bundled plugin now (#40709) and declares allow_update_command=True on its PlatformEntry. The registry fallback in _handle_update_command already covers it, so the frozenset entry is a redundant double-allow — same cleanup #40711 did for Discord and Mattermost. Adds a registry-fallback test mirroring the existing discord/mattermost cases.	2026-06-06 18:25:43 -07:00
Teknium	5b43bf7d02	feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI) (#40355) * feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI) Adds a GUI-only uninstall path so people can remove the desktop Chat GUI while keeping the Hermes agent + their config/sessions/.env, and surfaces the three CLI uninstall modes inside the desktop app's Settings → About. CLI: - New hermes_cli/gui_uninstall.py: cross-platform discovery + removal of the desktop GUI's artifacts (source-built dist/release/node_modules + build stamp, the packaged app bundle, and the Electron userData dir) on Linux, macOS, and Windows. Never touches the agent source, venv, or user data. - `hermes uninstall --gui` removes only the Chat GUI; `--gui-summary` prints a JSON install snapshot (used by the desktop UI to gate options + detect a missing agent for a future lite client). - `hermes uninstall --yes` / `--full --yes` now run non-interactively, sharing the destructive sequence via a new _perform_uninstall() helper. The keep-data and full flows also sweep the GUI artifacts. Desktop: - electron/desktop-uninstall.cjs: pure helpers mapping each mode (gui/lite/full) to CLI flags, resolving the running app bundle per OS, and building the detached cleanup script that waits for the app to exit, runs the Python uninstall, and removes the bundle. - IPC hermes:uninstall:summary / :run, preload bridge, and types. - Settings → About "Danger zone" with the three options; agent-removing options hide when no local agent is detected. Tests: tests/hermes_cli/test_gui_uninstall.py (22 pass with the existing uninstall tests), electron/desktop-uninstall.test.cjs (17 pass, wired into test:desktop:platforms). Docs: desktop.md "Uninstalling" + cli-commands.md. 
* fix(desktop): tear down backend process tree before GUI uninstall (Windows lock safety) The desktop uninstall cleanup script waited only on the desktop app's own PID, but a backend grandchild (gateway / pty terminal / hermes REPL) can outlive it and keep hermes.exe + venv files mandatory-locked on Windows — making the script's rmdir half-fail and leaving a partial install, the same failure class as the self-update path's #37532. - main.cjs: runDesktopUninstall now awaits releaseBackendLock() before spawning the cleanup script — tree-kills every backend PID the desktop owns (primary + pool) via taskkill /T /F and polls the venv shim until unlocked. Extracted the shared core out of releaseBackendLockForUpdate so both the update hand-off and the uninstaller use the identical, incident-hardened teardown. No-op on macOS/Linux (no mandatory locks). - desktop-uninstall.cjs: Windows cleanup script removes the bundle via a bounded rmdir retry loop (10x, 1s) instead of a single rmdir, since Windows releases directory handles lazily even after the holding process exits. - Dropped a fragile tasklist\|findstr reap-by-path attempt; the Electron-side tree-kill-by-PID is the reliable mechanism. Tests: desktop-uninstall.test.cjs updated for the retry-loop output (17 pass). * fix(desktop): address review on GUI uninstall (venv self-delete, gates, wait-loop) Resolves @OutThisLife's review on #40355: 1. full mode now gated on agent presence (needsAgent: true). It removes the agent + user data, so on a lite client with no local agent it's hidden like lite — no more offering to remove an agent that isn't there. 2. (Finding 3, the real bug) lite/full no longer rmtree the venv from the venv's OWN python. On Windows a running python.exe is mandatory-locked, so that half-fails. 
New lightweight 'python -m hermes_cli.uninstall --mode X' entrypoint (stdlib-only imports) lets the desktop run agent-removing modes under the SYSTEM python (findSystemPython) with PYTHONPATH=<agentRoot>, so import hermes_cli resolves from source while the venv is torn down. Falls back to venv python + logs when no system python (gui-only unaffected). 3. Windows wait-loop is now bounded (60 tries, matching POSIX) and matches the PID as a whole space-delimited token via findstr (no substring 99->990 trap, no redundant bare find). set HERMES_HOME/PID/PYTHONPATH now quoted. 4. Renamed the misleading 'returns null for dev run' test — the dev-run safety is shouldRemoveAppBundle(isPackaged=false), which the test now asserts. Docs: note that --gui on a source checkout also sweeps node_modules/build output. Tests: 18 python + 19 desktop pass.	2026-06-06 18:22:38 -07:00
Teknium	f2e8234307	test: update non-Termux workspace-scope fixtures for #38358 fix The non-Termux web/TUI install path now scopes to --workspace <name>; update two fixtures that asserted the old unscoped install commands.	2026-06-06 18:22:20 -07:00
Teknium	7db7a9462d	fix: align test fixture arg order + add zakame to AUTHOR_MAP Conflict resolution prefixes --workspace web before --silent (preserving the Termux npm_workspace_args path); update test_cmd_update fixture to match. Add zakame@zakame.net -> zakame mapping so CI author check passes.	2026-06-06 18:22:20 -07:00
Zak B. Elep	675fb10240	fix(install): correct check_dir tautology and add --workspace web test - check_dir = npm_dir if audit_extra else npm_dir evaluated identically in both branches; change to PROJECT_ROOT if audit_extra else npm_dir so workspace-scoped audits check the workspace root's node_modules - Add test_npm_install_uses_workspace_web_scope asserting --workspace web is passed adjacently in the _build_web_ui npm install invocation	2026-06-06 18:22:20 -07:00
Zak B. Elep	4bf52022e5	fix(tui): correct --skip-build hint and add TUI workspace install test - Update the --skip-build pre-build hint in the dashboard startup path to use `npm install --workspace web && npm run build -w web` so users don't accidentally trigger a desktop rebuild by following the hint. - Add test_tui_launch_install_uses_workspace_scope to assert that the TUI launch npm install carries --workspace ui-tui, covering the call site added in the prior commit.	2026-06-06 18:22:20 -07:00
Zak B. Elep	0416f852f2	fix(tui): scope TUI launch install and fix stale hints/test - Add --workspace ui-tui to the TUI launch npm install, the one call site missed by the prior commit. Without scoping it ran from PROJECT_ROOT and still resolved apps/desktop via the apps/* glob. - Update the two manual-recovery hints in _build_web_ui (npm install failure and build failure paths) to use the scoped form `npm install --workspace web && npm run build -w web` so users following the hint don't accidentally trigger a desktop rebuild. - Update the stale test assertion in test_cmd_update.py to expect --workspace web in the _build_web_ui npm ci call, which was previously unreachable through the if-guard and left the workspace- scoping change from the prior commit unverified.	2026-06-06 18:22:20 -07:00
kshitijk4poor	c79e3fd0ba	refactor(image_gen): delegate cache-path mapping to shared helper Follow-up on the backend-visible artifact-path fix. - Extract the cache-mount iteration loop into a reusable, backend-agnostic credential_files.map_cache_path_to_container(host_path, container_base) that returns the POSIX container path or None. to_agent_visible_cache_path() now delegates to it (keeping its Docker-only gate), and image_generation_tool's _agent_visible_cache_path() delegates to it too — eliminating the duplicated loop and the divergent path-join (posixpath vs Path) between the two. - Drop the now-unused posixpath/Path imports from image_generation_tool.py. - Document the agent_visible_cache_base getattr probe as a forward-looking optional hook (no producer yet) so it doesn't read as a typo'd attribute. - Add unit tests for map_cache_path_to_container.	2026-06-06 13:19:07 -07:00
Gille	7c4aa3e4da	fix(image_gen): expose backend-visible artifact paths	2026-06-06 13:19:07 -07:00
kshitijk4poor	ef7e5168b5	chore(gateway): drop plugin-migrated platforms from /update allowlist `gateway/run.py::_UPDATE_ALLOWED_PLATFORMS` was a hardcoded frozenset listing every messaging platform allowed to invoke the `/update` slash command. Plugin-migrated platforms (currently Discord and Mattermost, soon also Home Assistant via #32500) declare `allow_update_command=True` on their `PlatformEntry`, and `_handle_update_command` already falls back to the registry when a platform isn't in the frozenset. The result was a silent redundancy: those entries said "allowed" twice, and the registry flag was a no-op for them in practice. - Removed `Platform.DISCORD` and `Platform.MATTERMOST` from the frozenset. - Updated the docstring to make the split explicit (built-ins live in the frozenset; plugins use `allow_update_command` on the registry entry). The remaining frozenset entries are all still built-in platforms living under `gateway/platforms/` today. Future plugin migrations should drop their entry from the frozenset as part of the migration PR (or in a sibling chore PR like this one). Added a `TestUpdateCommandPlatformGate` test class that pins down all three branches of the gate so future changes don't silently regress: - Programmatic interfaces (`Platform.WEBHOOK`, `Platform.API_SERVER`) must remain blocked. - Plugin-migrated platforms (Discord, Mattermost) must pass via the registry fallback. - Built-in platforms in the hardcoded frozenset (Telegram) must still pass without needing the registry. The gate previously had zero direct test coverage — its only existing coverage was `test_no_adapter_for_platform` which exercised a different code path.	2026-06-06 11:48:55 -07:00
kshitijk4poor	c37c6eaf29	refactor(gateway): migrate Home Assistant adapter to bundled plugin Move gateway/platforms/homeassistant.py into plugins/platforms/homeassistant/ following the same shape as the Mattermost and Discord migrations. - Adapter file is renamed via git mv (history is preserved). - register() exposes the platform via the plugin system instead of the hardcoded Platform.HOMEASSISTANT elif in gateway/run.py::build_adapter(). - _standalone_send() replaces the legacy _send_homeassistant() helper in tools/send_message_tool.py. Out-of-process cron delivery (deliver=homeassistant from a cron process not co-located with the gateway) now flows through the registry's standalone_sender_fn path instead of the hardcoded elif. - _is_connected() probes HASS_TOKEN via hermes_cli.gateway.get_env_value so existing connected-platform checks behave identically. The HASS_TOKEN / HASS_URL env-to-PlatformConfig seeding in gateway/config.py stays in core — same pattern bluebubbles, mattermost, and discord migrations followed. No setup_fn or apply_yaml_config_fn is registered because Home Assistant has no _setup_homeassistant wizard in hermes_cli/setup.py and no homeassistant: YAML block in config.yaml today; setup runs through the existing hermes_cli/tools_config.py toolset wizard. Test imports were rewritten across tests/gateway/test_homeassistant.py, tests/integration/test_ha_integration.py, and tests/tools/test_send_message_missing_platforms.py; the legacy (token, extra, chat_id, message)-shaped _send_homeassistant call site is preserved via a small SimpleNamespace shim in test_send_message_missing_platforms.py (same approach used when mattermost moved). - Focused HA suites (64 tests across the three rewritten files) pass. - Broader gateway/cron sweep produces 10 failures identical to main baseline (telegram approval/model-picker xdist isolation flakes, wecom_callback defusedxml issue, cron script_timeout fixture issue). Zero net new failures.	2026-06-06 11:46:24 -07:00
kshitij	ebed881d46	fix(cli): quarantine running hermes.exe during update dep-verification repair on Windows (#40409) The dependency-verification repair in _verify_core_dependencies_installed ran 'pip install --reinstall -e .' via _run_install_with_heartbeat directly, bypassing the Windows shim-quarantine that the primary install path performs. That reinstall rewrites the entry-point shims, and on Windows the live hermes.exe is the running process — pip can neither delete nor overwrite it. With no quarantine, the shim was left missing and 'hermes' dropped off PATH ('hermes' is not recognized... after update). Extract the rename-out-of-the-way / restore-on-failure logic into a reusable _run_quarantined_install helper and route both the primary editable installs and the --reinstall -e . repair through it. The per-package repair installs only third-party deps (never hermes-agent), so they don't touch the shims and are left untouched. Add a regression test (fails on old code, passes on new).	2026-06-06 12:50:58 -05:00
kshitij	d4a7bfd3aa	Merge pull request #29724 from bbednarski9/bbednarski/nmf-41B-nemoflow-plugin feat(middleware): add adaptive middleware to hermes-agent, consumed by NeMo-Relay	2026-06-06 10:46:41 -07:00
Brooklyn Nicholson	003110c107	fix(ci): map @TheGardenGallery email + drop unused pytest import - check-attribution: add chilltulpa@gmail.com -> TheGardenGallery to AUTHOR_MAP in scripts/release.py (new external contributor via the carried-over commits). - ty: the dashboard back-compat test imported pytest but never used it, tripping unresolved-import. Drop the dead import — tests are plain functions driving the parser via subprocess, no pytest API needed.	2026-06-06 12:43:28 -05:00
The Garden	2820d87ea5	fix(cli): tolerate stale `dashboard --tui` from old desktop shells Older Hermes desktop app shells (<= 0.15.x) spawn the backend as `hermes dashboard --no-open --tui --host ... --port ...`. The --tui flag was removed from the dashboard subcommand in `cae6b5486` (embedded chat is always on now). When a user's CLI updates past that commit but their desktop app binary has not, argparse hard-errored with 'unrecognized arguments: --tui' and exit(2). The backend died before becoming ready and the desktop GUI showed only 'Hermes couldn't start' with no actionable cause — a confusing brick for anyone whose app and CLI versions drift apart across an update. Add a hidden, deprecated, accepted-and-ignored --tui flag to the dashboard subparser so an old app shell + new CLI degrades gracefully. Hidden from --help via argparse.SUPPRESS so we don't re-advertise a removed feature. Safe to delete once the floor app version is well past 0.16.0. Adds tests/hermes_cli/test_dashboard_tui_backcompat.py pinning: the flag parses without error, stays hidden from --help, and the modern (no --tui) invocation is unaffected.	2026-06-06 12:43:28 -05:00
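
The accepted-and-ignored flag technique from the commit above can be sketched with plain `argparse`. The parser layout here is a simplified assumption: the real `dashboard` subparser has more options, and the `--host`/`--port` defaults below are placeholders.

```python
import argparse


def build_parser():
    """Build a toy `hermes dashboard` parser with a hidden legacy flag."""
    parser = argparse.ArgumentParser(prog="hermes")
    subparsers = parser.add_subparsers(dest="command")
    dashboard = subparsers.add_parser("dashboard")
    dashboard.add_argument("--no-open", action="store_true")
    dashboard.add_argument("--host", default="127.0.0.1")  # placeholder default
    dashboard.add_argument("--port", type=int, default=0)  # placeholder default
    # Deprecated: still parsed so old desktop shells keep working, but the
    # value is never read. help=SUPPRESS keeps it out of --help output.
    dashboard.add_argument("--tui", action="store_true", help=argparse.SUPPRESS)
    return parser, dashboard
```

With this in place an old shell's `hermes dashboard --no-open --tui --host ... --port ...` parses cleanly instead of exiting with status 2, while `--help` does not mention the removed feature.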
kshitijk4poor	c4c5548eb4	fix(middleware): single-use next_call guard + deepcopy-safe request copies Address the two non-blocking follow-ups from review: - next_call is now single-use per middleware frame. A second invocation raises instead of silently re-running the downstream provider/tool, so the terminal call cannot execute twice via the chain. The error surfaces through the existing handler, which preserves the first downstream result. - Request-middleware payload copies go through _safe_copy(), which falls back to a shallow dict copy when deepcopy() fails on a non-deepcopyable member (clients, callbacks, file handles) instead of aborting the pass. Adds regression coverage for both: double next_call() keeps the terminal single-run, and a non-deepcopyable (threading.Lock) request payload still runs middleware via the shallow fallback.	2026-06-06 23:07:25 +05:30
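
Both follow-ups in the commit above are small patterns that are easy to show in isolation. The sketch below is illustrative only: `safe_copy` and `single_use` are hypothetical names, and the choice of `RuntimeError` for the second invocation is an assumption (the log only says a second call "raises").

```python
import copy


def safe_copy(payload: dict) -> dict:
    """Deep-copy a request payload, falling back to a shallow copy."""
    try:
        return copy.deepcopy(payload)
    except Exception:
        # Locks, clients, and open file handles cannot be deep-copied. A
        # shallow copy still protects the caller's top-level keys.
        return dict(payload)


def single_use(downstream):
    """Wrap `downstream` so that it can be invoked at most once."""
    called = False

    def next_call(*args, **kwargs):
        nonlocal called
        if called:
            raise RuntimeError("next_call may only be invoked once per frame")
        called = True
        return downstream(*args, **kwargs)

    return next_call
```

The shallow fallback trades isolation for availability: nested objects are shared with the original, so middleware that mutates nested members would still affect the caller in that case.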
Dusk1e	d1771114ed	fix(search): sanitize ":" in FTS5 queries so colon searches don't silently return empty ":" is FTS5's column-filter operator. With a single-column "content" FTS table, an unquoted query like "TODO: fix" parses as "column:term" and raises "no such column: TODO". search_messages() catches that OperationalError at the execute site and returns [], so colon queries silently yield zero hits even when the content is present. This hits both the session_search tool and the dashboard search. Add ":" to the Step 2 metacharacter strip in _sanitize_fts5_query(), mirroring how the other FTS5 syntax characters are already stripped. Colons inside quoted phrases are preserved (Step 1 protects them). Adds a regression test asserting a colon query still finds matching content, plus unit assertions on the sanitizer.	2026-06-06 09:32:55 -07:00
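
A rough sketch of the two-step sanitization the commit above describes: protect quoted phrases first, then strip the colon from what remains. The function name `sanitize_fts5_query` and the placeholder scheme are assumptions, and the real `_sanitize_fts5_query()` strips other FTS5 metacharacters that are omitted here.

```python
import re

_QUOTED_PHRASE = re.compile(r'"[^"]*"')
_PLACEHOLDER = re.compile(r"\x00(\d+)\x00")


def sanitize_fts5_query(query: str) -> str:
    """Remove ':' outside quoted phrases so FTS5 sees no column filter."""
    phrases = []

    def stash(match):
        # Step 1: swap each quoted phrase for a placeholder so it is untouched.
        phrases.append(match.group(0))
        return f"\x00{len(phrases) - 1}\x00"

    protected = _QUOTED_PHRASE.sub(stash, query)
    # Step 2: a bare "TODO:" would otherwise parse as a column named TODO.
    stripped = " ".join(protected.replace(":", " ").split())
    # Restore the protected phrases verbatim, colons included.
    return _PLACEHOLDER.sub(lambda m: phrases[int(m.group(1))], stripped)
```

For example, `sanitize_fts5_query("TODO: fix")` yields `TODO fix`, which FTS5 reads as two plain terms rather than a filter on a nonexistent column.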
Bryan Bednarski	5abe45674d	fix(middleware): preserve translated downstream failures Track successful next_call completion separately from invocation so execution middleware that catches and translates a downstream provider/tool failure does not accidentally convert that failure into a successful None result. Also avoid wrapping BaseException from downstream execution, and document the execution middleware error semantics. Tests cover: - pre-next_call middleware failures fail open to the remaining chain - post-next_call middleware failures preserve the downstream result - translated downstream failures propagate instead of returning None - downstream BaseException is not wrapped Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>	2026-06-06 09:26:18 -07:00
Brooklyn Nicholson	3606307339	fix(gateway): use user launchd domain + Background session, detached fallback (macOS 26) Salvages the primary fix from #24275 (asdlem) and layers a last-resort fallback on top: Primary (from #24275): the real macOS 26 root cause is that `gui/<uid>` isn't reachable from non-Aqua/background sessions. Switch the launchd domain to `user/<uid>` and mark the plist valid for both Aqua and Background sessions (LimitLoadToSessionType), restoring a real supervised service. Treat exit code 125 as "job unloaded" so start/restart re-bootstrap and retry. Last resort (this PR): the #23387 reporter saw `user/<uid>` bootstrap also fail with error 5 on some hosts. When even a fresh bootstrap can't manage the domain (codes 5/125 persist), degrade to a CLI-managed detached background process instead of crashing — logs to gateway.log, PID tracked via gateway.pid so stop/status/restart keep working. Print guidance that it won't auto-start at login or auto-restart on crash. Co-authored-by: asdlem <asdlem@users.noreply.github.com>	2026-06-06 09:08:37 -07:00
Brooklyn Nicholson	59c273ba3a	fix(gateway): fall back to detached launch when launchd rejects domain (macOS 26) macOS 26+ broke launchctl management of the gui/<uid> (and user/<uid>) domains: `bootstrap` returns error 5 and `kickstart` returns error 125 ("Domain does not support specified action"), so `hermes gateway start/install/restart` crashed with a cryptic traceback (#23387). Detect these codes and degrade gracefully: launch the gateway as a CLI-managed detached background process (the documented `nohup hermes gateway run --replace` workaround), with logs to gateway.log and the PID tracked via gateway.pid so stop/status/restart keep working. Print clear guidance that the service won't auto-start at login or auto-restart on crash on this macOS version. launchd_stop also tolerates 125/5 from bootout and falls through to the PID-based kill.	2026-06-06 09:08:37 -07:00
Teknium	54e7b74f7f	fix(gateway): plain text while busy interrupts by default again (#40590 ) * fix: respect disabled auto-compaction on context overflow Port from anomalyco/opencode#30749. When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop — long-context-tier 429, 413 payload-too-large, and context-overflow — called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice. Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected — it never enters this loop. Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True). * fix(gateway): plain text while busy interrupts by default again busy_input_mode (default 'interrupt') was advertised as the busy-behavior knob, but a second knob added in `7abd62719` — busy_text_mode, defaulting to 'queue' — short-circuited every plain TEXT message before busy_input_mode was consulted. Result: plain follow-ups silently queued instead of interrupting, even with busy_input_mode left at its 'interrupt' default (regression #38390, silent-queue #31588). Collapse to one source of truth: busy_input_mode drives text handling. 
busy_text_mode is kept only as a legacy explicit override for back-compat (existing queue setups keep working); when unset it follows busy_input_mode. All default fallbacks flipped queue->interrupt. The debounce mechanism is preserved and now keyed off the resolved mode. Fixes #38390, #31588.	2026-06-06 09:00:10 -07:00
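The "one source of truth" resolution described here can be sketched as a small function. The name `resolve_busy_text_mode` and the use of `None` for "unset" are assumptions; the two mode values and the 'interrupt' default come from the commit message.

```python
from typing import Optional


def resolve_busy_text_mode(
    busy_input_mode: Optional[str] = None,
    busy_text_mode: Optional[str] = None,
) -> str:
    """Resolve how a plain text message is handled while the agent is busy."""
    if busy_text_mode is not None:
        # Legacy explicit override, kept so existing 'queue' setups keep working.
        return busy_text_mode
    # When unset, text handling follows busy_input_mode, which defaults to 'interrupt'.
    return busy_input_mode or "interrupt"
```

With both settings left unset this returns `"interrupt"`, matching the restored default behaviour; an explicit `busy_text_mode="queue"` still wins over `busy_input_mode`.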
Teknium	2bf0a6e760	feat(dashboard): full tool backend configuration in the GUI (#40418 ) Replicate the `hermes tools` configurator in the dashboard Skills → Toolsets view. Each toolset now opens a config drawer that covers the full lifecycle the CLI offers: enable/disable, pick a provider/backend, enter and save API keys, and run a provider's post-setup install hook with a live log tail. The toolset view was previously read+toggle only — the provider matrix and key-status endpoints existed but the page never called them, and there was no way to save a key or run a backend install (npm/pip/binary) from the browser. Backend: - New CLI subcommand `hermes tools post-setup <KEY>` — non-interactive, scriptable target that runs a provider's install hook (agent_browser, camofox, cua_driver, kittentts, piper, ddgs, spotify, langfuse, xai_grok). Validated against valid_post_setup_keys() so an arbitrary key can't drive _run_post_setup. - PUT /api/tools/toolsets/{name}/env — save API keys to ~/.hermes/.env via save_env_value (same store the CLI writes), validated against the toolset category's env-var allowlist; blank values skipped. - POST /api/tools/toolsets/{name}/post-setup — spawn-action that runs `hermes tools post-setup <key>`; frontend tails the log via the existing /api/actions/tools-post-setup/status. Registered in _ACTION_LOG_FILES. Frontend: - New ToolsetConfigDrawer component (provider radios, password key inputs with saved-state, get-a-key links, Run-setup + live install log). Toolset cards get a Configure button + the drawer also exposes the enable toggle. - api.ts: toggleToolset, getToolsetConfig, selectToolsetProvider, saveToolsetEnv, runToolsetPostSetup + ToolsetConfig/Provider/EnvVar/ EnvResult types. 
Validation: 56 admin-endpoint tests pass (10 new: env save w/ CLI parity + allowlist reject + blank-skip, post-setup spawn validation, auth gate); 232 web_server tests pass; web npm run build + eslint clean; HTTP E2E exercises save-key (CLI reads it back) and spawn+poll post-setup to exit 0.	2026-06-06 07:45:36 -07:00
Teknium	56236b16e3	feat(dashboard): rehaul Skills hub browser — connected hubs, featured, preview + security scan (#40384 ) The Browse-hub tab was a blank search box with sparse result cards (name + source + one Install button), no way to read a skill before installing, no visual security scan, and no indication it was even connected to any hubs. Backend (web_server.py): - GET /api/skills/hub/sources — lists the configured hubs (label + trust tier + GitHub rate-limit + index availability) and featured skills pulled from the centralized index (zero extra API calls), plus installed-skill provenance so the UI can mark already-installed results. - GET /api/skills/hub/preview — fetches a skill's SKILL.md text + file manifest WITHOUT installing (decodes byte-stored text, masks binaries). - GET /api/skills/hub/scan — runs the SAME quarantine + scan_skill + should_allow_install pipeline the CLI installer uses, then cleans up quarantine, returning verdict / per-finding detail / severity tally / install-policy decision. - search now returns per-source counts + timed-out sources + installed map. Frontend (SkillsPage HubBrowser): - Landing state: connected-hubs strip + featured skill grid (no more blank page). - Rich cards: trust-level color coding, source, tags, identifier, Details + Install (or Installed state). - Detail dialog: read the actual SKILL.md, on-demand visual security scan (verdict pill, severity tally, per-finding list, allow/block policy), GitHub repo link. - Search meta line: result count + timing + per-source breakdown (the 'feels slow / no feedback' complaint). Tests: 4 new endpoint test classes (sources/preview/scan + updated search shape) in test_dashboard_admin_endpoints.py.	2026-06-06 02:44:50 -07:00
kshitij	5af899c7ca	feat(cli): display custom profile alias names in profile list/show (#40371 ) profile list and profile show assumed the wrapper script is always named after the profile (wrapper_dir / name). When a custom alias exists — e.g. `hermes profile alias steve --name qiaobusi` creates ~/.local/bin/qiaobusi pointing at `hermes -p steve` — the display silently showed the profile name (or nothing) instead of the alias the user actually typed. The custom-alias creation path (create_wrapper_script(name, target)) was added later; the display path was never updated to match. Add find_alias_for_profile() — a reverse lookup that scans the wrapper dir for our own wrappers (alias-named file containing 'hermes -p <profile>'), prefers a custom alias over the profile-named one, strips .bat on Windows, and sorts for deterministic output. Populate ProfileInfo.alias_name and wire it into the three display sites (profile describe, list, show). Credit: salvages the intent of #11506 by wss434631143, reimplemented on current main against the post-#11506 custom-alias (--name/target) mechanism. Tests: 6 new (profile-named, custom-name, none, unrelated-file rejection, windows .bat strip, list_profiles surfacing). All 123 in test_profiles pass. E2E verified against the real CLI for both custom and profile-named aliases.	2026-06-06 08:08:07 +00:00
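The reverse lookup this commit describes might be sketched as below. The signature, the regular expression, and the unconditional `.bat` stripping are assumptions made for a self-contained example; the commit says the real helper strips `.bat` on Windows only.

```python
import re
from pathlib import Path
from typing import List, Optional


def find_alias_for_profile(wrapper_dir: Path, profile: str) -> Optional[str]:
    """Return the wrapper name that launches `hermes -p <profile>`, if any."""
    if not wrapper_dir.is_dir():
        return None
    # Match the exact profile name, so "steve" does not match "steve2".
    needle = re.compile(rf"hermes -p {re.escape(profile)}(?!\S)")
    matches: List[str] = []
    for path in sorted(wrapper_dir.iterdir()):  # sorted for deterministic output
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue
        if not needle.search(text):
            continue  # unrelated file, not one of our wrappers
        name = path.name
        if name.lower().endswith(".bat"):
            name = name[:-4]  # simplification: stripped on every platform here
        matches.append(name)
    # Prefer a custom alias over the wrapper named after the profile itself.
    custom = [name for name in matches if name != profile]
    if custom:
        return custom[0]
    return matches[0] if matches else None
```

For the example in the commit message, a directory holding both `steve` and `qiaobusi` wrappers for profile `steve` would yield `qiaobusi` under these assumptions.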
Siddharth Balyan	fcb1944b4f	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011 ) * feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits) L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that exercises the real header -> CreditsState -> TUI pipe end-to-end behind HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists. - agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are strings -> paid_access via == "true", never bool(); retain-last-known; only subscription_micros may be negative; _usd kept verbatim). 
- run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros, session-start baseline latch, + dev-gated "credits" capture log. - agent/chat_completion_helpers.py: capture on the streaming response. - agent/agent_init.py: init _credits_state + _credits_session_start_micros. - tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged. - ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner. Off by default; silent for normal users. Validated live against staging (capture log delta matches the TUI segment). Throwaway consumer (readout/log/ banner); credits_tracker + the capture plumbing are the real feature foundation. test(credits): lock parser under 9-state matrix + harden validation (L2) Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free, depleted, debt, missing, no_org) plus validation edge cases: version strict==1 with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off == "true"/"false", never bool()), half-pair subscription limit treated as both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros → None, negative non-subscription micros → None, as_of_ms junk → None, zero limit ZeroDivision guard. 
Harden agent/credits_tracker.py to match the spec: - Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState - Add depleted property (== not paid_access, never remaining==0) - Change used_fraction guard to key off subscription_limit_micros (the actual denominator) not denominator_kind (metadata) - Replace fail-soft _safe_int with a sentinel-returning variant; full validation now returns None on any malformed field rather than silently defaulting - Add module-level warn-once latch for version > 1 - Add USD regex validation; add denominator_kind allow-list check - Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-) feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1) L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's policy will fire through and L5's TUI render will consume. - agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id; kind defaults "sticky", kept TTL-expressive for a future config seam). - run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and _emit_notice / _emit_notice_clear emitters (swallow all callback errors — a notice must never break the agent loop; no-op when unbound). - agent/agent_init.py: thread both callbacks through init_agent. - tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear WS events (snake_case payload, matching the existing gateway-event convention). - ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent. - tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op, signature threading, TUI binding payload shape). Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/ decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly. 
* feat(credits): threshold reconciliation policy + tests (L4.1) * feat(credits): wire threshold policy into capture + latch (L4.2) After a fresh header parse, _capture_credits runs evaluate_credits_notices against the agent's _credits_latch and emits the result — clears first, then shows (so a recovered depletion clears before the "restored" success lands, and depleted wins the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks) still caches state for /usage but runs no policy. Parse stays fail-open (miss → keep last-known); the eval/emit path warns on failure rather than swallowing, so a depletion-notice bug can't vanish silently. - run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn); latch lazy-guarded (object.__new__ safety). - agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}. * feat(tui): render credits notices in the status bar (L5, Strategy B) The TUI now renders the notification.show / notification.clear gateway events the agent emits — a level-colored notice overrides the status/verb slot when not busy. - Notice state machine on turnController (pendingNotice + dedicated noticeTimer + show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler decodes the events and delegates. - Render priority busy > notice > status (appChrome StatusRule); notice text rendered verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx; dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire). - Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites (recordMessageComplete / interruptTurn / recordError) — never idle(), which reset() also calls (would leak across sessions); reset() clears instead. 
- Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard; latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak). - 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority). * feat(credits): cold-start seed for new Nous sessions (L3) A genuinely-new Nous session has no inference header yet, so seed credits state from the authoritative GET /api/oauth/account snapshot at session start (in the new-session branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin hook gets no agent reference). The seed runs the shared notice policy, so a session that opens already depleted warns IMMEDIATELY rather than only after the first turn. - Maps the nested account fields (paid_service_access → paid_access; total_usable / subscription / purchased on paid_service_access_info; rollover on subscription), each None-guarded; float dollars → micros via round(d * 1e6), _usd left "" (render formats from micros — never synthesize a verbatim usd from a float). - Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset → used_fraction None → no warn90 from the seed (% only once a header lands, per D-E). - Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never blocks startup); paid_access unknown ⇒ True (never falsely depleted). - run_agent.py: extracted the warm-path policy/emit block into a shared _emit_credits_notices() so capture and the seed fire notices identically. * feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6) Add Nous credit dollar magnitudes to /usage (subscription / top-up / total + rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the account endpoint exposes a denominator). 
Reuses the existing account-usage render machinery via a new pure build_nous_credits_snapshot() that maps a NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to fetch_account_usage (keeps the per-provider boundary intact). CLI /usage also doubles as a depletion-recovery trigger: a force_fresh account fetch, kept in a SEPARATE local so it never clobbers the header-sourced agent._credits_state (which alone carries used_fraction). If paid access recovered while credits.depleted is latched and a notice consumer is bound, it reuses agent._emit_credits_notices() to clear it. Gateway /usage displays magnitudes only — messaging binds no notice consumer, so it performs no recovery emit. Fail-open throughout: any portal hiccup leaves /usage unaffected. * refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers The dev-flag truthy check was inlined in three places. Replace with the shared utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the env check on every render). Behaviour-preserving; identical truthy set. * fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review) Adversarial review found the /usage depletion-recovery trigger dead AND broken: the CLI binds no notice_clear_callback, the TUI runs /usage in a separate slash-worker subprocess (its own agent/latch), and the no-clobber rule made it evaluate stale paid_access anyway. Recovery already happens on the next inference (warm path), so the trigger was redundant — remove it and stop the depleted notice over-promising. - cli.py: remove the dead recovery block; bound the /usage portal fetch with a 10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch — urllib's per-socket timeout is not a wall-clock guarantee. 
- agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance" (no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn). - agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch so a stalled portal can't hang session startup; tidy its time import. * chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE) Throwaway dev scaffolding to exercise the notice pipeline without real spend or Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct / grant_exhausted / depleted / clear) or a file path whose contents name a state (re-read each turn → flip states live for recovery testing). _capture_credits injects the chosen CreditsState instead of parsing real headers and runs the shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding. * feat(credits): /usage monthly-grant % gauge The portal /api/oauth/account subscription block now carries monthly_credits (the per-period grant allowance, the % denominator). The consumer parsed monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only. Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload. build_nous_credits_snapshot emits a Subscription usage window (real % used, routed through the existing render machinery) when monthly_credits is a finite positive denominator and credits_remaining is finite and <= cap; otherwise it degrades to magnitudes-only (older portals, rollover-over-cap, or non-finite payloads). Guards (adversarial-review-driven): reject non-finite operands (json.loads parses bare NaN/Infinity by default → would render $nan + a false 100% used), reject bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap (rollover spanning the period makes the cap a nonsensical denominator → the $X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%. 
Money rule preserved: the ratio + magnitudes are computed from numeric float account fields via display formatting, never by parsing a server _usd string (there are none on these dataclasses). 13 gauge tests added (tests/agent/test_nous_credits_gauge.py). fix(credits): show /usage Nous block whenever a Nous account is present /usage runs in a slash-worker subprocess whose resolved inference provider is often not "nous" even when the user has a Nous account, so gating the Nous credits block on (provider == "nous") hid it entirely — the account data was fully available but never rendered. Gate instead on "a Nous account is logged in": a cheap local auth-state lookup (get_provider_auth_state('nous') has an access_token) decides whether to attempt the portal fetch, regardless of which provider inference runs on. In the gateway the block is also lifted out of the 'if provider:' scope so a Nous-credentialled user with another (or no) resident inference provider still sees their balance. Fail-open and the per-fetch wall-clock timeout are preserved. * fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker) In the TUI, /usage runs in a slash-worker subprocess that resumes the session WITHOUT building an agent (self.agent is None), so _show_usage early-returned "(._.) No active agent" before ever reaching the Nous credits block — which is agent-independent (a portal fetch gated on Nous auth-state). Extract the block into _print_nous_credits_block() and run it at the no-agent / no-calls early-returns too (returns True if it printed, so the fallback message only shows when there's genuinely nothing). Verified live against staging: the block + monthly-grant gauge now render in the slash-worker /usage path (previously hidden). The plain CLI REPL + messaging paths are unchanged (they have a live agent). 
* feat(credits): escalating 50/75/90 usage bands (single status line) Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn, 90 warn) shown as ONE status-bar line: it displays the highest band the subscription grant has crossed, replaces the line as usage climbs, steps back down on recovery, and clears below 50%. No stacking, no per-turn churn. Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything from it. Single notice key (credits.usage) with a usage_band latch field so the notice only re-emits when the band actually changes. The crossing gate (seen_below_90) is preserved so a fresh live session that opens mid-range stays quiet until it has been observed below the lowest band (cold-start primes it when it wants an open-high warning). Denominator math unchanged: % = subscription grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %. Migrated test_credits_policy.py to the new key + added TestUsageBands (climb, step-down, recovery-clear, idempotent, inclusive boundaries). * feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn) Notices previously only fired inside a conversation turn (first message), so a session that opened already depleted / past a usage band showed nothing at 'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start() and call it (a) in the TUI/desktop agent build right after the notice callback is wired (fires at 'ready', before any message) and (b) as the first-turn fallback in conversation_loop. Idempotent (skips once _credits_state exists) and fail-open. The seed now maps monthly_credits -> subscription_limit_micros + denominator_kind='subscription_cap', so used_fraction is computable at seed time and usage-band warnings (not just depletion) hydrate on open. Primes the crossing latch so a session opening already in a band warns immediately. Degrades to depletion-only when monthly_credits is absent (older portals). 
Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap degradation, and the shared seed (fires/idempotent/skips-non-nous). * feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge when the portal supplies a positive, finite monthly_credits denominator with remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise. Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so the CLI and TUI /usage render the same block, and _snapshot_from_credits_state() so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too. TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage panel renders them regardless of API-call count or resume state — previously the TUI's separate /usage implementation only showed token counts. Money rule preserved: %% and magnitudes come from numeric float account fields via display formatting, never by parsing a server _usd string. feat(credits): CLI REPL inline notices (parity with TUI) The plain CLI agent bound no notice callbacks, so credit notices were TUI-only. Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders a single level-colored line above the prompt (error red / warn yellow / success green / info dim) via _cprint, and seed credits at session open so a depletion or usage-band warning shows before the first message — the same hydration the TUI got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot). 
* test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open seed, /usage gauge). * fix(credits): usage-band notice clears on next prompt (not sticky-forever) A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear the visible credits.usage notice when a new turn starts (startMessage), so it shows until your next prompt then yields. The server latch is unchanged, so it won't re-nag at the same band — it only re-shows when the band actually changes (climb) or clears when usage drops below the lowest band. Depletion stays sticky. * refactor(credits): consolidate the /usage credits block behind nous_credits_lines() The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command) each re-implemented the auth-gate + portal fetch + render, and both bypassed the dev-fixture short-circuit that only the TUI honored — so /usage ignored HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth gate, and the fixture works on every surface (~60 fewer duplicated lines). The gateway usage test recorded only the last asyncio.to_thread call; /usage now dispatches both the account fetch and the credits fetch, so it records every call and matches the account fetch by its provider arg. * fix(credits): keep the /usage gauge type-safe and log its fail-open path _is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge operands (monthly_credits / credits_remaining) and the magnitudes passed to _fmt_usd through it — no more None-operand warnings on the arithmetic. 
Add a debug breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block is diagnosable in agent.log without a dev flag. * fix(credits): harden the header tracker — prod-leak gate, hot-path probe, fire-and-forget seed - Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a real account. Matches the documented run workflow (both vars set together). - Hot-path probe: parse_credits_headers checks for the version sentinel header before allocating a lowercased copy of the response headers — skips that work on every non-Nous API call. Behaviour-identical and still case-insensitive. - Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now runs in a daemon thread, so a slow/unreachable portal never delays session "ready" (previously blocked up to 10s). The dev-fixture path stays synchronous; the thread re-checks idempotency before hydrating (a live header may land first). - Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed parser / dead seed is distinguishable from a legitimate no-headers miss. Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate. * test(tui): fix env-timing in the StatusRule dev-credits assertion DEV_CREDITS_MODE is read once at module load (config/env), so mutating process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner assertion only passed if the env was exported before vitest started, and failed in a normal run. Move that assertion to a sibling file that mocks config/env with DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard). 
* test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt - _snapshot_from_credits_state (the offline /usage renderer) had no direct test: lock the gauge math, the verbatim _usd magnitudes, the depletion line and the fixture marker, plus the no-cap (no gauge) and None-state cases. - turnController.startMessage had no test for clearing the credits.usage notice on the next prompt while leaving credits.depleted sticky. feat(credits): deliver credit notices over messaging gateways Bind notice_callback/notice_clear_callback on the per-turn gateway agent so usage-band / depletion / restored notices reach Telegram/Discord/Slack/ etc. Previously the messaging gateway bound neither callback, so the agent's _emit_credits_notices early-returned and a chat user crossing a band got nothing unless they ran /usage manually. - render_notice_line(): AgentNotice -> single plaintext line (level glyph + text), plaintext-only so it renders uniformly without per-platform escaping. Fail-soft on malformed/empty notices. - Standalone push for every notice (messaging has no persistent status bar): route through the shared _deliver_platform_notice rail (honors private/ public delivery + thread metadata), scheduled onto the gateway loop via safe_schedule_threadsafe from the agent's sync worker thread — same pattern as _status_callback_sync. - The fired-once latch lives on the cached (reused-in-place) agent and persists across turns, so a band crosses once -> one push, no per-turn re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder). - Recovery ('Credit access restored') rides the show path (emitted as a success notice, not a clear). notice_clear_callback is a no-op: a sent platform message can't be cleanly retracted. Tests: render glyph/levels/fail-soft + public/private delivery seam through _deliver_platform_notice + no-adapter no-op. 
* fix(credits): don't double the glyph on messaging notices render_notice_line prepended a per-level glyph, but the notice policy already bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used", "⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead level→glyph map. The render tests fed glyph-less text (and the success case only checked startswith), so the doubling slipped through. Rework them around the verbatim contract and add an end-to-end regression that runs real evaluate_credits_notices output through render_notice_line and asserts the line is returned unchanged.	2026-06-06 13:18:18 +05:30
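The prod-leak guard in this commit (dev fixtures require both HERMES_DEV_CREDITS_FIXTURE and HERMES_DEV_CREDITS) can be sketched roughly as below. The function name `dev_fixture_enabled` and the "any non-empty value counts as set" rule are assumptions for illustration, not the project's actual code.

```python
import os
from typing import Mapping, Optional


def dev_fixture_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when BOTH dev variables are set.

    Hypothetical helper: the real gate may parse the values differently
    (e.g. accept only "1" or "true"); here any non-empty string counts.
    """
    env = os.environ if env is None else env
    has_fixture = bool(env.get("HERMES_DEV_CREDITS_FIXTURE"))
    has_dev_flag = bool(env.get("HERMES_DEV_CREDITS"))
    # A stray fixture variable alone must not surface fabricated balances.
    return has_fixture and has_dev_flag


stray_fixture_only = dev_fixture_enabled({"HERMES_DEV_CREDITS_FIXTURE": "low_balance"})
both_set = dev_fixture_enabled(
    {"HERMES_DEV_CREDITS_FIXTURE": "low_balance", "HERMES_DEV_CREDITS": "1"}
)
```

Passing the environment in as a mapping keeps the check testable without mutating `os.environ`, which also sidesteps the read-once-at-import timing problem the TUI test fix in this commit describes.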
Teknium	b91aade176	feat(desktop): warn when main-model switch leaves auxiliary tasks pinned to another provider (#40286) Switching the main model never touches auxiliary slot pins (they're independent, sticky per-task overrides). A user who switches main away from a now-unpaid provider keeps paying 402s on every background aux call until they manually reset those pins, silently, with no UI signal. - /api/model/set scope:'main' now returns stale_aux: slots still pinned to a provider different from the new main (additive field). - Desktop Model Settings shows a switch-time notice after Apply AND a persistent banner when any loaded aux slot mismatches the main provider, both wired to the existing 'Reset all to main' action. - Never auto-clears pins: a dedicated cheaper aux model is a legitimate config; surface-and-offer instead of nuking. - Fixes a stale pre-existing assertion in the panel test (main model now renders via selectors, not a standalone label).	2026-06-05 23:35:36 -07:00
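A minimal sketch of how a stale_aux list like the one described in this commit could be computed. The function name, the slot names, and the shape of `aux_pins` (slot name mapped to a pinned provider, or None when the slot follows main) are assumptions; the commit only states that stale_aux lists slots still pinned to a provider different from the new main.

```python
from typing import Dict, List, Optional


def stale_aux_slots(main_provider: str, aux_pins: Dict[str, Optional[str]]) -> List[str]:
    """List aux slots pinned to a provider other than the new main.

    Hypothetical helper. Pins are only reported, never cleared, matching
    the surface-and-offer policy in the commit message.
    """
    return sorted(
        slot
        for slot, provider in aux_pins.items()
        if provider is not None and provider != main_provider
    )


pins = {"compression": "openrouter", "vision": None, "title": "nous"}
stale = stale_aux_slots("nous", pins)
```

The input mapping is left untouched, so a deliberately cheaper aux model stays pinned until the user chooses 'Reset all to main'.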
Teknium	f8a241e105	fix(delegate): flatten content blocks in live overlay tail + AUTHOR_MAP Follow-up on the cherry-picked content-block fix. _extract_output_tail (the live subagent overlay) still used crude str(content), which renders a "[{'type': 'text'...}]" blob and — worse — mislabels a block-wrapped "Error: ..." result as is_error=False. Route it through the same _stringify_tool_content helper so error detection and previews work at both consumer sites. - delegate_tool.py: _extract_output_tail uses _stringify_tool_content - tests: add _extract_output_tail content-block test (error detection + clean preview) - release.py: AUTHOR_MAP entry for randomsnowflake (CI gate)	2026-06-05 23:34:00 -07:00
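To illustrate why `str(content)` hides a block-wrapped error, here is a small sketch of flattening content blocks before checking for an error prefix. `flatten_tool_content` and `looks_like_error` are hypothetical stand-ins, not the project's `_stringify_tool_content`; the block shape (`{"type": "text", "text": ...}`) and the "Error:" prefix check are assumptions taken from the wording of the commit message.

```python
from typing import Any


def flatten_tool_content(content: Any) -> str:
    """Join the text of content blocks into one plain string (hypothetical helper)."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for block in content:
            if isinstance(block, dict) and block.get("type") == "text":
                parts.append(str(block.get("text", "")))
            elif isinstance(block, str):
                parts.append(block)
            # Non-text blocks (images, etc.) are skipped in this sketch.
        return "\n".join(parts)
    return str(content)


def looks_like_error(content: Any) -> bool:
    """Assumed error heuristic: the flattened text starts with 'Error:'."""
    return flatten_tool_content(content).lstrip().startswith("Error:")


blocks = [{"type": "text", "text": "Error: file not found"}]
naive_is_error = str(blocks).startswith("Error:")  # repr starts with "[{", so this misses
flat_is_error = looks_like_error(blocks)
```

Routing both consumer sites through one flattener means the preview text and the error flag cannot disagree about what the tool returned.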
Alexander Lehmann	f83918c31d	fix(delegate): handle content-block tool results	2026-06-05 23:34:00 -07:00
helix4u	338c074336	fix(send-message): treat ntfy topic targets as explicit	2026-06-05 20:38:28 -07:00
Teknium	50f9ad70fc	fix(dashboard): populate cron delivery dropdown from configured platforms (#40218) * fix: respect disabled auto-compaction on context overflow Port from anomalyco/opencode#30749. When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop (long-context-tier 429, 413 payload-too-large, and context-overflow) called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice. Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected: it never enters this loop. Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True). * fix(dashboard): populate cron delivery dropdown from configured platforms The dashboard cron-create/edit dropdown hardcoded five delivery options (local, telegram, discord, slack, email), so users on Matrix, or any other backend-supported platform, had no way to pick their channel even though the cron scheduler delivers to all of them. It also offered Telegram/Discord/etc. to users who never set those up. - cron/scheduler.py: add cron_delivery_targets() as the single source of truth. Intersects gateway-configured platforms with cron-deliverable ones and reports whether each platform's home channel is set.
- web_server.py: GET /api/cron/delivery-targets exposes that list (+ the implicit local option) to the dashboard. - CronPage.tsx: both modals render options from the endpoint. Configured platforms missing a home channel still appear, annotated "set a home channel first" (option B), so the user knows what to fix. Edit modal preserves a job's current target even if it's no longer configured. Local-only state shows a "configure a platform under Channels" hint. Validation: scheduler + endpoint E2E'd with a Matrix gateway (home set and unset); 5 new tests; tests/cron + tests/hermes_cli/test_web_server green (366 passed).	2026-06-05 20:23:54 -07:00
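The intersection this commit describes (gateway-configured platforms ∩ cron-deliverable platforms, plus a home-channel flag and the implicit local option) might look roughly like the sketch below. The `DeliveryTarget` type, the function signature, and the input shapes are assumptions for illustration; the real cron_delivery_targets() may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class DeliveryTarget:
    platform: str
    has_home_channel: bool


def delivery_targets(
    configured: Dict[str, Optional[str]],  # platform -> home channel, or None if unset
    deliverable: Set[str],                 # platforms the cron scheduler can deliver to
) -> List[DeliveryTarget]:
    """Hypothetical sketch: keep only platforms that are both configured and deliverable."""
    # 'local' is always offered; treating it as having a home channel is an assumption.
    targets = [DeliveryTarget("local", True)]
    for platform in sorted(configured):
        if platform in deliverable:
            # Platforms without a home channel still appear so the UI can
            # annotate them ("set a home channel first") instead of hiding them.
            targets.append(DeliveryTarget(platform, bool(configured[platform])))
    return targets


targets = delivery_targets(
    configured={"matrix": "!room:example.org", "telegram": None, "webhook": "hook-1"},
    deliverable={"matrix", "telegram", "discord"},
)
```

Here Discord is deliverable but never configured, so it is not offered, while Telegram is listed with `has_home_channel=False`.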
Brooklyn Nicholson	0f45509daf	fix(agent): make mid-turn /steer trusted, not read as injection A steer rides inside a tool result (the only role-alternation-safe slot mid-turn), so a bare "User guidance:" line reads as untrusted tool content — well-behaved models refuse it as suspected prompt injection (observed live: "I only follow instructions from you directly, not ones injected through command results"). - Wrap steers in a bounded, self-describing [OUT-OF-BAND USER MESSAGE] marker (prompt_builder.format_steer_marker), shared by both drain sites. - Add STEER_CHANNEL_NOTE to the core system prompt so the model expects this exact marker and trusts it as a genuine user message — while still ignoring lookalikes buried in tool/web/file output. Static text → byte-stable prompt, no prompt-cache regression; gated on the agent having tools. - Desktop: steer ack is now an inline transcript note (⏩ steered · …) instead of a toast. Marker is intentionally static (not a per-session nonce) to honor the byte-stable system-prompt caching policy; nonce hardening noted as follow-up.	2026-06-05 20:59:36 -05:00
Teknium	78122c52cf	test(slack): drop /q alias assertion now displaced by /version cap clamp Slack's native-slash manifest hard-caps at 50 (_SLACK_MAX_SLASH_COMMANDS). Adding the /version canonical claims a pass-1 slot, so the lowest-priority pass-2 alias (/q for /quit) clamps off the end. /q stays reachable via /hermes q. Surviving aliases (/btw /bg /reset) still prove alias parity.	2026-06-05 18:05:05 -07:00
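The clamp behaviour described here (canonical commands claim slots in pass 1, aliases fill what is left in pass 2, and the lowest-priority alias falls off when the cap is hit) can be sketched as follows. `build_slash_manifest` and the tiny cap used in the demo are assumptions; only the two-pass idea and the cap of 50 come from the commit message.

```python
from typing import List

SLACK_MAX_SLASH_COMMANDS = 50  # value from the commit message; this local constant is illustrative


def build_slash_manifest(
    canonicals: List[str], aliases: List[str], cap: int = SLACK_MAX_SLASH_COMMANDS
) -> List[str]:
    """Hypothetical two-pass fill: canonicals first, then aliases in priority order."""
    manifest: List[str] = []
    for name in canonicals:  # pass 1
        if len(manifest) < cap:
            manifest.append(name)
    for name in aliases:  # pass 2, highest priority first
        if len(manifest) < cap and name not in manifest:
            manifest.append(name)
    return manifest


aliases = ["btw", "bg", "reset", "q"]
before = build_slash_manifest(["help", "quit"], aliases, cap=6)
# Adding one canonical ("version") takes a pass-1 slot, so the last alias is clamped off.
after = build_slash_manifest(["help", "quit", "version"], aliases, cap=6)
```

With the demo cap of 6, `before` still contains `q`, and `after` keeps `btw`, `bg`, and `reset` but drops `q`, mirroring the test change in this commit.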
Brooklyn Nicholson	30340eae2f	Include git SHA in /version output via banner label helper. Reuses format_banner_version_label() so CLI, TUI, gateway, and desktop show upstream/local commit when available.	2026-06-05 18:05:05 -07:00
Brooklyn Nicholson	9c1bb8d2c7	Add /version slash command across CLI, gateway, TUI, and desktop. Surfaces Hermes Agent version info on demand without leaving chat; works mid-run like /help and /update.	2026-06-05 18:05:05 -07:00

5002 commits