hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-18 04:41:56 +00:00

Author	SHA1	Message	Date
kshitij	3034eee38e	fix(acp): replay session history before responding to session/load (#12285 follow-up) (#26957 ) Switches `_replay_session_history` from `loop.call_soon`-deferred (after the `LoadSessionResponse` is written) to `await`-inline (before the response is constructed) for both `session/load` and `session/resume`. Adds defensive try/except around the awaited call so a replay helper crash still yields a successful load response — partial transcripts are acceptable, total load failure is not. The deferral was added on May 2 in commit `19854c7cd` with the rationale "Zed only attaches streamed transcript/tool updates once the load/resume response has completed." That justification was incorrect: - Zed's current ACP integration (zed-industries/zed crates/agent_servers/src/acp.rs) explicitly registers the session-update routing entry BEFORE awaiting the loadSession RPC, with the comment: "so that any session/update notifications that arrive during the call (e.g. history replay during session/load) can find the thread." - Every other reference ACP server (Codex, Claude Code, OpenCode, Pi, agentao) replays history BEFORE responding to the load request. - The ACP spec wording ("Stream the entire conversation history back to the client via notifications") and the natural JSON-RPC reading both mean "during the request's lifetime", not "after the response resolves". Empirical reproduction (reported by Biraj on @agentclientprotocol/sdk v0.21.1): the same custom ACP client works correctly against Codex / Claude Code / OpenCode / Pi but receives 0 notifications from Hermes because it measures the per-call notification count at the moment `loadSession` resolves — which on Hermes was before the `call_soon`- scheduled replay coroutine had a chance to run. Changes: - `acp_adapter/server.py`: remove `_schedule_history_replay`; both `load_session` and `resume_session` now `await self._replay_session_history` before returning, wrapped in try/except that logs and continues on helper exceptions. - `tests/acp/test_server.py`: replace the single `test_load_session_schedules_history_replay_after_response` (which encoded the now-incorrect post-response ordering) with two tests asserting `events == ["replay", "returned"]` for load and resume. Add two regression tests confirming that a replay helper raising still yields a `LoadSessionResponse` / `ResumeSessionResponse` rather than propagating the exception out as a JSON-RPC error. Result: 240 ACP tests pass (was 238), ruff clean. Verified end-to-end: biraj's synchronous notification-counter pattern now sees 6 notifications during `loadSession` for a 5-message session, matching all other reference ACP servers. The `_fenced_text` change in `acp_adapter/tools.py` from the same May 2 commit is orthogonal and intentionally left intact — it's a separate, still-valid fix for Zed's pipe-as-table rendering. Refs #12285. Follows up #26943 (which added thought-chunk replay but kept the deferral).	2026-05-16 07:41:34 -07:00
kshitij	f3a4af9cf2	fix(acp): replay assistant reasoning as agent_thought_chunk on session/load (#12285 ) (#26943 ) Persisted assistant `reasoning_content` / `reasoning` fields are now emitted as ACP `agent_thought_chunk` notifications during `_replay_session_history`, so editor clients (Zed, etc.) rebuild collapsed Thinking panes when the user re-opens a session that used a thinking model. Ordering matches live streaming: thought precedes message text within the same assistant turn, mirroring how `reasoning_callback` deltas arrive before `stream_delta_callback` deltas in `events.py::make_thinking_cb` / `make_message_cb`. Behavior on non-reasoning histories is unchanged; the replay loop's existing text / tool_call / tool_call_update / plan emission is preserved bit-for-bit. Closes #12285. Credit: - @Yukipukii1 (#14691) — original thought-replay design via `acp.update_agent_thought_text`; the tool-call portion of that PR has since landed via #19139, but the reasoning replay is theirs. - @HenkDz (#17652 / #18578) — established the `_replay_session_history` and `_history_*` helper conventions this builds on. - @D1zzyDwarf (#16531) — also closed by this work.	2026-05-16 06:45:29 -07:00
HenkDz	bd3a5873e1	fix(acp): replay native todo plans	2026-05-15 14:07:53 -07:00
HenkDz	4444d5fe4f	fix(acp): emit native plan updates for todo	2026-05-15 14:07:53 -07:00
Teknium	4e89c53082	fix(async): close unscheduled coroutines in all threadsafe bridges (#26584 ) Wraps every sync->async coroutine-scheduling site in the codebase with a new agent.async_utils.safe_schedule_threadsafe() helper that closes the coroutine on scheduling failure (closed loop, shutdown race, etc.) instead of leaking it as 'coroutine was never awaited' RuntimeWarnings plus reference leaks. 22 production call sites migrated across the codebase: - acp_adapter/events.py, acp_adapter/permissions.py - agent/lsp/manager.py - cron/scheduler.py (media + text delivery paths) - gateway/platforms/feishu.py (5 sites, via existing _submit_on_loop helper which now delegates to safe_schedule_threadsafe) - gateway/run.py (10 sites: telegram rename, agent:step hook, status callback, interim+bg-review, clarify send, exec-approval button+text, temp-bubble cleanup, channel-directory refresh) - plugins/memory/hindsight, plugins/platforms/google_chat - tools/browser_supervisor.py (3), browser_cdp_tool.py, computer_use/cua_backend.py, slash_confirm.py - tools/environments/modal.py (_AsyncWorker) - tools/mcp_tool.py (2 + 8 _run_on_mcp_loop callers converted to factory-style so the coroutine is never constructed on a dead loop) - tui_gateway/ws.py Tests: new tests/agent/test_async_utils.py covers helper behavior under live loop, dead loop, None loop, and scheduling exceptions. Regression tests added at three PR-original sites (acp events, acp permissions, mcp loop runner) mirroring contributor's intent. Live-tested end-to-end: - Helper stress test: 1500 schedules across live/dead/race scenarios, zero leaked coroutines - Race exercised: 5000 schedules with loop killed mid-flight, 100 ok / 4900 None returns, zero leaks - hermes chat -q with terminal tool call (exercises step_callback bridge) - MCP probe against failing subprocess servers + factory path - Real gateway daemon boot + SIGINT shutdown across multiple platform adapter inits - WSTransport 100 live + 50 dead-loop writes - Cron delivery path live + dead loop Salvages PR #2657 — adopts contributor's intent over a much wider site list and a single centralized helper instead of inline try/except at each site. 3 of the original PR's 6 sites no longer exist on main (environments/patches.py deleted, DingTalk refactored to native async); the equivalent fix lives in tools/environments/modal.py instead. Co-authored-by: JithendraNara <jithendranaidunara@gmail.com>	2026-05-15 14:00:01 -07:00
teknium1	85782a4ed7	feat(acp): hermes acp --setup-browser bootstraps browser tools for registry installs The Zed ACP Registry path (uvx --from 'hermes-agent[acp]==X' hermes-acp) gets a Python-only install. Browser tools depend on the agent-browser npm package + Chromium, neither of which are in the wheel. Without an explicit bootstrap, registry users have no path to working browser tools. Ship a bundled, idempotent bootstrap script (Linux/macOS bash + Windows PowerShell) inside acp_adapter/bootstrap/ as wheel package-data. New entry points: hermes acp --setup-browser # interactive; prompts before Chromium download hermes acp --setup-browser --yes # non-interactive hermes-acp --setup-browser The terminal-auth flow (hermes acp --setup) also offers the browser bootstrap as a follow-up after model selection, so first-run registry users get the option without knowing the flag exists. Key design choices: - npm install -g --prefix $NODE_PREFIX so we never need sudo. System Node on PATH is respected; only the install target is redirected to the user-writable Hermes-managed Node prefix. - tools/browser_tool.py::_browser_candidate_path_dirs() already walks $HERMES_HOME/node/bin, so installed binaries are discovered with no agent-side code change. - System Chrome/Chromium detection short-circuits the ~400 MB Playwright download when a suitable browser already exists. - Bash + PowerShell live as ONE copy each under acp_adapter/bootstrap/. Not duplicated under scripts/. install.sh and install.ps1 keep their inline browser blocks for the source-checkout path. E2E validated end-to-end: bash bootstrap_browser_tools.sh --skip-chromium → installs agent-browser into ~/.hermes/node/bin/ tools.browser_tool._find_agent_browser() → returns the installed path check_browser_requirements() → returns True (browser tools register) Tests: - tests/acp/test_entry.py: 11 tests covering --setup-browser dispatch (linux + windows + --yes forwarding + failure propagation), the terminal-auth follow-up prompt path, and a package-data wheel-shipping assertion that catches any future pyproject.toml regression. Docs: website/docs/user-guide/features/acp.md gains a 'Browser tools (optional)' subsection with the two-line install + what-it-does.	2026-05-15 01:38:24 -07:00
mr-r0b0t	4c94396206	feat: add ACP registry metadata for Zed	2026-05-14 20:26:02 -07:00
mr.Shu	31b4721791	fix: simplify ACP approval bridging Previously ACP dangerous-command approvals mixed an invalid ACP payload shape with partial Hermes option mapping, and the callback plumbing was shared across worker threads. This commit uses ACP tool-call updates, preserves Hermes once/session/always semantics, and scopes approval callbacks to the current worker thread. - Build permission requests with `update_tool_call` and unique `perm-check-*` ids in `acp_adapter/permissions.py` - Keep ACP option mapping explicit and fail closed on unknown outcomes or request failures - Set approval callbacks inside the ACP executor worker and read them from thread-local state in `tools/terminal_tool.py` - Replace duplicated ACP bridge coverage with focused tests in `tests/acp/test_permissions.py` and add a thread-local callback test	2026-05-13 22:59:39 -07:00
kshitij	2ec8d2b42f	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 ) Replace with for all literal-tuple membership tests. Set lookup is O(1) vs O(n) for tuple — consistent micro-optimization across the codebase. 608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining. 133 files, +626/-626 (net zero).	2026-05-11 11:13:25 -07:00
HenkDz	840ebe063e	fix: make session search initialize session db	2026-05-09 14:36:58 -07:00
Teknium	26bac67ef9	fix(entry-points): guard hermes_bootstrap import so partial updates don't brick hermes (#22091 ) teknium1 hit ModuleNotFoundError: No module named 'hermes_bootstrap' after a code update, on both his Windows machine AND his Linux workstation. The failure mode is real and affects every user who updates hermes by any path OTHER than a fully-successful ``hermes update``. ## What happens hermes_bootstrap.py is a top-level module registered via pyproject.toml's ``py-modules`` list (added by Brooklyn's Windows UTF-8 stdio work). It must be registered in the venv's editable-install .pth file before Python can find it as a bare ``import hermes_bootstrap``. ``hermes update`` handles this correctly: (1) git reset --hard, (2) clear __pycache__, (3) uv pip install -e . (re-registers the package including the new py-modules list), (4) restart. BUT if any step AFTER (1) fails — network blip during pip install, PEP 668 on a system Python, venv locked, uv not in PATH, a crash mid-update — the user is left with new code that references hermes_bootstrap and a venv that doesn't know about it. Every hermes invocation after that crashes with ModuleNotFoundError, including ``hermes update`` itself. No recovery path without manual `uv pip install -e .`. Also affects users who ``git pull`` the repo directly without running hermes update — relatively common for developers. ## Fix Wrap ``import hermes_bootstrap`` in a try/except ModuleNotFoundError across all 6 entry points (hermes_cli/main, run_agent, gateway/run, acp_adapter/entry, cli, batch_runner). On Windows, missing bootstrap means the UTF-8 stdio setup doesn't run — degraded behavior (Unicode chars may fail to print) but NOT a crash. POSIX is unaffected either way since the bootstrap is a no-op there. Once hermes is running again, the user can ``hermes update`` to fully recover. ## Test update tests/test_hermes_bootstrap.py::test_entry_point_imports_bootstrap scans for the first top-level import in each entry point and asserts it is hermes_bootstrap. Extended the check to accept a Try block whose body is a lone Import of hermes_bootstrap — that's the recovery-friendly form we just introduced. Verified behavior by ``mv hermes_bootstrap.py hermes_bootstrap.py.bak`` and confirming ``python -c "import hermes_cli.main"`` succeeds. 82/82 tests pass (hermes_bootstrap + windows-native + windows-compat).	2026-05-08 14:43:13 -07:00
Teknium	d94fb47717	hermes_bootstrap: Windows-only UTF-8 stdio shim for all entry points Codebase-wide fix for Python-on-Windows UTF-8 footguns, complementing the earlier execute_code sandbox fixes (which remain load-bearing for when the sandbox explicitly scrubs child env). Problem: Python on Windows has two long-standing text-encoding pitfalls: 1. sys.stdout/stderr are bound to the console code page (cp1252 on US-locale installs) — print('café') crashes with UnicodeEncodeError. 2. Subprocess children don't know to use UTF-8 unless PYTHONUTF8 and/or PYTHONIOENCODING are set in their env — so any Python we spawn (linters, sandbox children, delegation workers) hits the same bug. Solution: A tiny bootstrap module (hermes_bootstrap.py) imported as the first statement of every Hermes entry point: - hermes_cli/main.py (hermes / hermes-agent console_script) - run_agent.py (hermes-agent direct) - acp_adapter/entry.py (hermes-acp) - gateway/run.py (messaging gateway) - batch_runner.py (parallel batch mode) - cli.py (legacy direct-launch CLI) On Windows, the bootstrap: - os.environ.setdefault('PYTHONUTF8', '1') (PEP 540 UTF-8 mode) - os.environ.setdefault('PYTHONIOENCODING', 'utf-8') - sys.stdout/stderr/stdin.reconfigure(encoding='utf-8', errors='replace') Children inherit the env vars → they run in UTF-8 mode. Current process's stdio is reconfigured → print('café') works now. On POSIX (Linux/macOS), the bootstrap is a complete no-op. We don't touch LANG, LC_, or anything else — users who have intentionally configured a non-UTF-8 locale aren't affected. POSIX systems are already UTF-8 by default in 99% of modern setups, so there's nothing to fix. setdefault() (not overwrite) means users who explicitly set PYTHONUTF8=0 or PYTHONIOENCODING=cp1252 in their environment are respected. What this does NOT fix: bare open(path, 'w') calls in the parent* process still default to locale encoding because PYTHONUTF8 is only read at interpreter init. A ruff PLW1514 sweep (separate follow-up) will add explicit encoding='utf-8' at those ~219 call sites for belt-and-suspenders. Tests (17): 16 passed, 1 skipped on Windows. - Windows: env vars set, stdio reconfigured, child inherits UTF-8 mode - POSIX: complete no-op (verified on fake POSIX + skipped on real POSIX since we don't have a Linux box in this session) - Idempotence: multiple calls safe - Graceful degradation: non-reconfigurable streams don't crash - User opt-out: explicit PYTHONUTF8=0 is respected - Load order: every entry point's FIRST top-level import is hermes_bootstrap, enforced by an AST-level parametrized test pyproject.toml: added hermes_bootstrap to py-modules so it ships with pip installs.	2026-05-08 14:27:40 -07:00
Teknium	7e2af0c2e8	feat(acp): pass image file attachments through as image_url parts Extends PR #21400's resource inlining with image-specific handling: ACP resource_link and embedded blob resources with an image/* mime (or image file suffix when mime is missing) now emit an OpenAI image_url part with a base64 data URL, so vision models actually see the image instead of a [Binary file omitted] note. Non-image resources keep the existing text-inlining behavior. Adds 3 tests: local PNG via resource_link, JPEG mime inferred from suffix when client omits mimeType, and embedded blob PNG.	2026-05-07 09:24:32 -07:00
HenkDz	733e297b8a	fix(acp): inline file attachment resources	2026-05-07 09:24:32 -07:00
Junass1	5795b3be4e	fix(acp): use SessionDB.replace_messages for atomic history rewrite ACP's save_session() did a non-atomic clear_messages() + append_message() loop. If any message hit an exception mid-loop (bad tool_call shape, etc.), the DELETE had already committed and the persisted conversation was lost. SessionDB.replace_messages() wraps DELETE + bulk INSERT in a single BEGIN IMMEDIATE transaction that rolls back on any exception, so a bad message can no longer clobber previously-persisted history. Salvages @Awsh1's PR #13675 — uses the existing replace_messages() helper (which covers more message fields than the PR's own copy) instead of adding a duplicate.	2026-05-05 10:05:23 -07:00
Henkey	9987f3d824	fix(acp): compact Zed tool replay rendering	2026-05-03 01:44:23 -07:00
Henkey	19854c7cd2	Schedule ACP history replay and fence file output	2026-05-03 01:44:23 -07:00
Henkey	eb612f5574	fix(acp): keep web extract rendering compact	2026-05-03 01:44:23 -07:00
Henkey	b294d1d022	fix(acp): keep read-file starts compact	2026-05-03 01:44:23 -07:00
Henkey	72c8037a24	fix(acp): polish common tool rendering	2026-05-03 01:44:23 -07:00
Henkey	ef9a08a872	fix(acp): polish Zed context and tool rendering	2026-05-03 01:44:23 -07:00
Henkey	e26f9b2070	fix(acp): route Zed thoughts to reasoning callbacks	2026-05-03 01:44:23 -07:00
Teknium	f0dc919f92	fix(compression): include system prompt + tool schemas in token estimates (#18265 ) The user-visible /compress banner and the post-compression last_prompt_tokens writeback both counted only the raw message transcript (chars/4). With a 15KB system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of request pressure — a 234x gap. Two user-facing consequences: - Banner shows 'Compressing … (~45 tokens)…' while compression is actually firing on 10K+ tokens of real pressure, confusing users about why compression triggered (reported by @codecovenant on X; #6217). - Post-compression last_prompt_tokens writeback omits tool schemas, so the next should_compress() check compares real usage against a stale underestimate — compression triggers late, potentially past the model's context limit on small-context models (#14695). Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough() at every user-visible banner and at the post-compression writeback. estimate_request_tokens_rough() already existed for exactly this purpose and includes system prompt + tool schemas. Touched call sites: - run_agent.py: post-compression last_prompt_tokens writeback, post-tool call should_compress() fallback when provider usage is missing - cli.py: /compress banner + summary - gateway/run.py: gateway /compress banner + summary - tui_gateway/server.py: TUI /compress status + summary - acp_adapter/server.py: ACP /compact before/after Left intentionally alone: - Session-hygiene fallback and the 'no agent' /status path in gateway/run.py — no agent instance is in scope to query for system prompt/tools, and the existing 30-50% overestimate wobble on hygiene is safety-accepted. - Verbose-mode 'Request size' logging — informational only, already counts system prompt via api_messages[0]. Also relabels the feedback line from 'Rough transcript estimate' to 'Approx request size' so the metric label matches what it actually measures. Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217); user report @codecovenant on X (2026-04-30). Closes #14695 Closes #6217	2026-04-30 23:03:54 -07:00
Teknium	41fa1f1b5c	fix(acp): run /steer as a regular prompt on idle sessions (#18258 ) When a user types /steer <text> on an ACP session that isn't actively running a turn (and there's no interrupted-prompt salvage available), _cmd_steer silently appended to state.queued_prompts and replied "No active turn — queued for the next turn". That looks identical to /queue output even though the user never typed /queue — @EddyLeeKhane reported this as "/steer never works, gets queued instead". Rewrite the payload to a plain user prompt before the slash-intercept fires, matching the gateway's idle-/steer fallthrough in gateway/run.py ~L4898.	2026-04-30 22:45:14 -07:00
Henkey	ec1443b9f1	fix(acp): normalize Windows cwd for WSL tool execution	2026-04-30 20:55:14 -07:00
Henkey	78886365c2	fix(acp): replay interrupted prompts for steer	2026-04-30 20:54:37 -07:00
Henkey	e27b0b7651	feat(acp): add steer and queue slash commands	2026-04-30 20:54:37 -07:00
Henkey	cdf9793d6d	fix(acp): advertise and forward image prompts	2026-04-30 10:31:16 -07:00
teknium1	658947480a	fix(acp): drop dead message_id kwarg from replay chunks UserMessageChunk and AgentMessageChunk do not have a message_id field in the ACP schema. Passing it silently dropped the kwarg (pydantic does not raise on unknown init kwargs here) and the subsequent test assertions on .message_id raised AttributeError. Strip the dead plumbing (uuid import, message_id= kwarg on both chunk types, unused session_id/index parameters) and remove the matching .message_id asserts from the test.	2026-04-30 02:45:54 -07:00
Henkey	d2536a72bf	fix(acp): replay session history on load	2026-04-30 02:45:54 -07:00
Teknium	0edcc57d9a	fix(acp): wire HERMES_SESSION_KEY per session so sudo cache scope activates PR #16858's session-scoped interactive sudo password cache falls back to a thread-identity scope when no HERMES_SESSION_KEY is bound. ACP never set that contextvar, so two ACP sessions landing on the same reused ThreadPoolExecutor thread still shared the cache — the exact scenario the PR headlined. acp_adapter/server.py now: - binds HERMES_SESSION_KEY=<session_id> via gateway.session_context inside _run_agent() (and clears on exit) - wraps the loop.run_in_executor(_executor, _run_agent) call in a fresh contextvars.copy_context() so concurrent ACP sessions don't stomp on each other's ContextVar writes (executor pool threads would otherwise share a context). Adds tests/acp/test_approval_isolation.py:: test_sudo_password_cache_isolated_across_acp_sessions_on_same_pool_thread which drives two back-to-back sessions through a 1-worker ThreadPoolExecutor and asserts B does not observe A's cached password.	2026-04-28 01:34:16 -07:00
Teknium	dd789a4fdf	fix(mcp): move discovery out of model_tools import side effect (#16856 ) (#16899 ) model_tools.py ran discover_mcp_tools() as a module-level side effect. discover_mcp_tools() uses a blocking 120s wait internally (via _run_on_mcp_loop -> future.result(timeout=120)). The gateway lazy-imports run_agent -> model_tools on the first user message, which happens inside the asyncio event loop thread. A slow or unreachable MCP server therefore froze Discord shard heartbeats and Telegram polling for up to 120s on the first message after gateway start. Fix: remove the module-level call. Every entry point now runs discovery explicitly at its own startup, using the context-appropriate blocking/non-blocking pattern: - gateway/run.py: loop.run_in_executor(None, discover_mcp_tools) before platforms start accepting traffic - hermes_cli/main.py: inline (no event loop at CLI startup) - tui_gateway/entry.py: inline (sync stdin loop, no event loop) - acp_adapter/entry.py: inline before asyncio.run() Closes #16856.	2026-04-28 01:17:58 -07:00
Cameron Aragon	dfc5563641	fix(acp): include MCP toolsets in ACP sessions	2026-04-24 03:04:42 -07:00
Teknium	62348cffbe	fix(acp): wire approval callback + make it thread-local (#13525 ) Two related ACP approval issues: GHSA-96vc-wcxf-jjff — ACP's _run_agent never set HERMES_INTERACTIVE (or any other flag recognized by tools.approval), so check_all_command_guards took the non-interactive auto-approve path and never consulted the ACP-supplied approval callback (conn.request_permission). Dangerous commands executed in ACP sessions without operator approval despite the callback being installed. Fix: set HERMES_INTERACTIVE=1 around the agent run so check_all_command_guards routes through prompt_dangerous_approval(approval_callback=...) — the correct shape for ACP's per-session request_permission call. HERMES_EXEC_ASK would have routed through the gateway-queue path instead, which requires a notify_cb registered in _gateway_notify_cbs (not applicable to ACP). GHSA-qg5c-hvr5-hjgr — _approval_callback and _sudo_password_callback were module-level globals in terminal_tool. Concurrent ACP sessions running in ThreadPoolExecutor threads each installed their own callback into the same slot, racing. Fix: store both callbacks in threading.local() so each thread has its own slot. CLI mode (single thread) is unaffected; gateway mode uses a separate queue-based approval path and was never touched. set_approval_callback is now called INSIDE _run_agent (the executor thread) rather than before dispatching — so the TLS write lands on the correct thread. Tests: 5 new in tests/acp/test_approval_isolation.py covering thread-local isolation of both callbacks and the HERMES_INTERACTIVE callback routing. Existing tests/acp/ (159 tests) and tests/tools/ approval-related tests continue to pass. Fixes GHSA-96vc-wcxf-jjff Fixes GHSA-qg5c-hvr5-hjgr	2026-04-21 06:20:40 -07:00
Teknium	4cc5065f63	fix(acp): follow-up — named-const page size, alias kwarg, tests - Replace kwargs.get('limit', 50) with module-level _LIST_SESSIONS_PAGE_SIZE constant. ListSessionsRequest schema has no 'limit' field, so the kwarg path was dead. Constant is the single source of truth for the page cap. - Use next_cursor= (field name) instead of nextCursor= (alias). Both work under the schema's populate_by_name config, but using the declared Python field name is the consistent style in this file. - Add docstring explaining cwd pass-through and cursor semantics. - Add 4 tests: first-page with next_cursor, single-page no next_cursor, cursor resumes after match, unknown cursor returns empty page.	2026-04-21 06:00:41 -07:00
Aniruddha Adak	c1fb7b6d27	fix: support pagination and cwd filtering in list_sessions	2026-04-21 06:00:41 -07:00
Aniruddha Adak	ea06104a3c	fix(permissions): handle None response from ACP request_permission	2026-04-21 05:57:23 -07:00
Teknium	c6974043ef	refactor(acp): validate method_id against advertised provider in authenticate() (#13468 ) * feat(models): hide OpenRouter models that don't advertise tool support Port from Kilo-Org/kilocode#9068. hermes-agent is tool-calling-first — every provider path assumes the model can invoke tools. Models whose OpenRouter supported_parameters doesn't include 'tools' (e.g. image-only or completion-only models) cannot be driven by the agent loop and fail at the first tool call. Filter them out of fetch_openrouter_models() so they never appear in the model picker (`hermes model`, setup wizard, /model slash command). Permissive when the field is missing — OpenRouter-compatible gateways (Nous Portal, private mirrors, older snapshots) don't always populate supported_parameters. Treat missing as 'unknown → allow' rather than silently emptying the picker on those gateways. Only hide models whose supported_parameters is an explicit list that omits tools. Tests cover: tools present → kept, tools absent → dropped, field missing → kept, malformed non-list → kept, non-dict item → kept, empty list → dropped. * refactor(acp): validate method_id against advertised provider in authenticate() Previously authenticate() accepted any method_id whenever the server had provider credentials configured. This was not a vulnerability under the personal-assistant trust model (ACP is stdio-only, local-trust — anything that can reach the transport is already code-execution-equivalent to the user), but it was sloppy API hygiene: the advertised auth_methods list from initialize() was effectively ignored. Now authenticate() only returns AuthenticateResponse when method_id matches the currently-advertised provider (case-insensitive). Mismatched or missing method_id returns None, consistent with the no-credentials case. Raised by xeloxa via GHSA-g5pf-8w9m-h72x. Declined as a CVE (ACP transport is stdio, local-trust model), but the correctness fix is worth having on its own.	2026-04-21 03:39:55 -07:00
alt-glitch	1010e5fa3c	refactor: remove redundant local imports already available at module level Sweep ~74 redundant local imports across 21 files where the same module was already imported at the top level. Also includes type fixes and lint cleanups on the same branch.	2026-04-21 00:50:58 -07:00
Teknium	a33e890644	fix(acp): silence 'Background task failed' noise on liveness-probe requests (#12855 ) Clients like acp-bridge send periodic bare `ping` JSON-RPC requests as a liveness probe. The acp router correctly returns JSON-RPC -32601 to the caller, which those clients already handle as 'agent alive'. But the supervisor task that ran the request then surfaces the raised RequestError via `logging.exception('Background task failed', ...)`, dumping a full traceback to stderr on every probe interval. Install a logging filter on the stderr handler that suppresses 'Background task failed' records only when the exception is an acp RequestError(-32601) for one of {ping, health, healthcheck}. Real method_not_found for any other method, other exception classes, other log messages, and -32601 logged under a different message all pass through untouched. The protocol response is unchanged — the client still receives a standard -32601 'Method not found' error back. Only the server-side stderr noise is silenced. Closes #12529	2026-04-20 00:10:27 -07:00
Henkey	cb883f9e97	fix(acp): improve zed integration	2026-04-17 13:29:26 -07:00
Yao	1bcc87a153	fix(acp): declare session load and resume capabilities in initialize response (#6985 ) The resume_session and load_session handlers were implemented but undiscoverable by ACP clients because the capabilities weren't declared in the initialize response. Adds load_session=True and resume=SessionResumeCapabilities() plus wire-format tests. Fixes #6633. Contributed by @luyao618.	2026-04-10 03:45:36 -07:00
Teknium	4e78963fe8	fix(acp): remove dead nested usage dict path run_conversation() never returns a result["usage"] nested dict — token counters are always at the top level. The nested path used the wrong key name ("cached_tokens" vs "cache_read_tokens") and was never reachable. Remove it.	2026-04-10 03:00:12 -07:00
Yuhan Lei	f92298fe95	fix(acp): populate usage from top-level result fields	2026-04-10 03:00:12 -07:00
Teknium	d0ffb111c2	refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821 ) Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture) and manual analysis of the entire codebase. Changes by category: Unused imports removed (~95 across 55 files): - Removed genuinely unused imports from all major subsystems - agent/, hermes_cli/, tools/, gateway/, plugins/, cron/ - Includes imports in try/except blocks that were truly unused (vs availability checks which were left alone) Unused variables removed (~25): - Removed dead variables: connected, inner, channels, last_exc, source, new_server_names, verify, pconfig, default_terminal, result, pending_handled, temperature, loop - Dropped unused argparse subparser assignments in hermes_cli/main.py (12 instances of add_parser() where result was never used) Dead code removed: - run_agent.py: Removed dead ternary (None if False else None) and surrounding unreachable branch in identity fallback - run_agent.py: Removed write-only attribute _last_reported_tool - hermes_cli/providers.py: Removed dead @property decorator on module-level function (decorator has no effect outside a class) - gateway/run.py: Removed unused MCP config load before reconnect - gateway/platforms/slack.py: Removed dead SessionSource construction Undefined name bugs fixed (would cause NameError at runtime): - batch_runner.py: Added missing logger = logging.getLogger(__name__) - tools/environments/daytona.py: Added missing Dict and Path imports Unnecessary global statements removed (14): - tools/terminal_tool.py: 5 functions declared global for dicts they only mutated via .pop()/[key]=value (no rebinding) - tools/browser_tool.py: cleanup thread loop only reads flag - tools/rl_training_tool.py: 4 functions only do dict mutations - tools/mcp_oauth.py: only reads the global - hermes_time.py: only reads cached values Inefficient patterns fixed: - startswith/endswith tuple form: 15 instances of x.startswith('a') or x.startswith('b') consolidated to x.startswith(('a', 'b')) - len(x)==0 / len(x)>0: 13 instances replaced with pythonic truthiness checks (not x / bool(x)) - in dict.keys(): 5 instances simplified to in dict - Redefined unused name: removed duplicate _strip_mdv2 import in send_message_tool.py Other fixes: - hermes_cli/doctor.py: Replaced undefined logger.debug() with pass - hermes_cli/config.py: Consolidated chained .endswith() calls Test results: 3934 passed, 17 failed (all pre-existing on main), 19 skipped. Zero regressions.	2026-04-07 10:25:31 -07:00
Teknium	8b861b77c1	refactor: remove browser_close tool — auto-cleanup handles it (#5792 ) * refactor: remove browser_close tool — auto-cleanup handles it The browser_close tool was called in only 9% of browser sessions (13/144 navigations across 66 sessions), always redundantly — cleanup_browser() already runs via _cleanup_task_resources() at conversation end, and the background inactivity reaper catches anything else. Removing it saves one tool schema slot in every browser-enabled API call. Also fixes a latent bug: cleanup_browser() now handles Camofox sessions too (previously only Browserbase). Camofox sessions were never auto-cleaned per-task because they live in a separate dict from _active_sessions. Files changed (13): - tools/browser_tool.py: remove function, schema, registry entry; add camofox cleanup to cleanup_browser() - toolsets.py, model_tools.py, prompt_builder.py, display.py, acp_adapter/tools.py: remove browser_close from all tool lists - tests/: remove browser_close test, update toolset assertion - docs/skills: remove all browser_close references * fix: repeat browser_scroll 5x per call for meaningful page movement Most backends scroll ~100px per call — barely visible on a typical viewport. Repeating 5x gives ~500px (~half a viewport), making each scroll tool call actually useful. Backend-agnostic approach: works across all 7+ browser backends without needing to configure each one's scroll amount individually. Breaks early on error for the agent-browser path. * feat: auto-return compact snapshot from browser_navigate Every browser session starts with navigate → snapshot. Now navigate returns the compact accessibility tree snapshot inline, saving one tool call per browser task. The snapshot captures the full page DOM (not viewport-limited), so scroll position doesn't affect it. browser_snapshot remains available for refreshing after interactions or getting full=true content. Both Browserbase and Camofox paths auto-snapshot. If the snapshot fails for any reason, navigation still succeeds — the snapshot is a bonus, not a requirement. Schema descriptions updated to guide models: navigate mentions it returns a snapshot, snapshot mentions it's for refresh/full content. * refactor: slim cronjob tool schema — consolidate model/provider, drop unused params Session data (151 calls across 67 sessions) showed several schema properties were never used by models. Consolidated and cleaned up: Removed from schema (still work via backend/CLI): - skill (singular): use skills array instead - reason: pause-only, unnecessary - include_disabled: now defaults to true - base_url: extreme edge case, zero usage - provider (standalone): merged into model object Consolidated: - model + provider → single 'model' object with {model, provider} fields. If provider is omitted, the current main provider is pinned at creation time so the job stays stable even if the user changes their default. Kept: - script: useful data collection feature - skills array: standard interface for skill loading Schema shrinks from 14 to 10 properties. All backend functionality preserved — the Python function signature and handler lambda still accept every parameter. * fix: remove mixture_of_agents from core toolsets — opt-in only via hermes tools MoA was in _HERMES_CORE_TOOLS and composite toolsets (hermes-cli, hermes-messaging, safe), which meant it appeared in every session for anyone with OPENROUTER_API_KEY set. The _DEFAULT_OFF_TOOLSETS gate only works after running 'hermes tools' explicitly. Now MoA only appears when a user explicitly enables it via 'hermes tools'. The moa toolset definition and check_fn remain unchanged — it just needs to be opted into.	2026-04-07 03:28:44 -07:00
Mibayy	cc2b56b26a	feat(api): structured run events via /v1/runs SSE endpoint Add POST /v1/runs to start async agent runs and GET /v1/runs/{run_id}/events for SSE streaming of typed lifecycle events (tool.started, tool.completed, message.delta, reasoning.available, run.completed, run.failed). Changes the internal tool_progress_callback signature from positional (tool_name, preview, args) to event-type-first (event_type, tool_name, preview, args, **kwargs). Existing consumers filter on event_type and remain backward-compatible. Adds concurrency limit (_MAX_CONCURRENT_RUNS=10) and orphaned run sweep. Fixes logic inversion in cli.py _on_tool_progress where the original PR would have displayed internal tools instead of non-internal ones. Co-authored-by: Mibayy <mibayy@users.noreply.github.com>	2026-04-05 12:05:13 -07:00
NexVeridian	c71b1d197f	fix(acp): advertise slash commands via ACP protocol Send AvailableCommandsUpdate on session create/load/resume/fork so ACP clients (Zed, etc.) can discover /help, /model, /tools, /compact, etc. Also rewrites /compact to use agent._compress_context() properly with token estimation and session DB isolation. Co-authored-by: NexVeridian <NexVeridian@users.noreply.github.com>	2026-04-05 12:05:13 -07:00
Git-on-my-level	fcdd5447e2	fix: keep ACP stdout protocol-clean Route AIAgent print output to stderr via _print_fn for ACP stdio sessions. Gate quiet-mode spinner startup on _should_start_quiet_spinner() so JSON-RPC on stdout isn't corrupted. Child agents inherit the redirect. Co-authored-by: Git-on-my-level <Git-on-my-level@users.noreply.github.com>	2026-04-05 12:05:13 -07:00
Teknium	914a7db448	fix(acp): rename AuthMethod to AuthMethodAgent for agent-client-protocol 0.9.0 Straight rename to match the 0.9.0 API where AuthMethod was split into AuthMethodAgent, AuthMethodEnvVar, AuthMethodTerminal. Bump pin to >=0.9.0,<1.0. Co-authored-by: Mibayy <mibayy@users.noreply.github.com>	2026-04-05 12:05:13 -07:00

1 2

62 commits