hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-18 04:41:56 +00:00

Author	SHA1	Message	Date
Teknium	486b692ddd	feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779 ) Some checks failed Deploy Site / deploy-vercel (push) Waiting to run Details Deploy Site / deploy-docs (push) Waiting to run Details Docker Build and Publish / build-amd64 (push) Waiting to run Details Docker Build and Publish / build-arm64 (push) Waiting to run Details Docker Build and Publish / merge (push) Blocked by required conditions Details Docker Build and Publish / move-latest (push) Blocked by required conditions Details Lint (ruff + ty) / ruff + ty diff (push) Waiting to run Details Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run Details Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run Details Nix / nix (macos-latest) (push) Waiting to run Details Nix / nix (ubuntu-latest) (push) Waiting to run Details Tests / test (push) Waiting to run Details Tests / e2e (push) Waiting to run Details OSV-Scanner / Scan lockfiles (push) Has been cancelled Details uv.lock check / uv lock --check (push) Has been cancelled Details * feat(nous): unified client=hermes-client-v<version> tag on every Portal request Every Hermes request to Nous Portal now carries the same client=hermes-client-v<__version__> tag (e.g. client=hermes-client-v0.13.0 on this release), sourced live from hermes_cli.__version__. The release script's regex bump auto-aligns it on every release. Centralized in agent/portal_tags.py and wired into all four call sites: - NousProfile.build_extra_body (main agent loop, every chat completion) - auxiliary_client.NOUS_EXTRA_BODY + _build_call_kwargs (aux client) - run_agent.py compression-summary fallback path - tools/web_tools.py web_extract fallback Replaces the client=aux marker added in #24194 with the unified version tag. Tests assert against the helper output (invariant) rather than the literal string, so they don't need updating on every release. * feat(nous): cover /goal judge and kanban specify aux paths Two aux-using surfaces bypassed call_llm by invoking client.chat.completions.create() directly without extra_body, so they were missing the unified Portal client tag: - hermes_cli/goals.py — /goal standing-goal judge - hermes_cli/kanban_specify.py — kanban triage specifier Both now pass extra_body=get_auxiliary_extra_body() or None so they inherit the version tag when the aux client points at Nous Portal, and emit nothing otherwise (no tag leak to OpenRouter/Anthropic auxes).	2026-05-12 20:49:20 -07:00
Teknium	b06e999302	fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778 ) The long-lived prefix-cache layout split the system prompt into stable/ context/volatile blocks and re-derived them on every API call. The volatile tier (timestamp + memory snapshot + USER profile) ticks per turn, so the system message bytes mutated mid-conversation and broke upstream prompt caches (OpenRouter, Nous Portal, Anthropic). Diagnosed via live wire-format diffing: an 8-turn conversation showed OLD layout flipping system block[1] sha mid-session at the minute boundary, dropping cached_tokens to 0 on that turn (cumulative 66.6% vs 83.3% for the single-block layout). Hermes invariant: history (system + all but the last 1-2 messages) must be static. Fix: drop the long-lived layout entirely. Single layout everywhere — system_and_3 with one cached system string built once on first turn, replayed verbatim on every subsequent turn. Loses cross-session 1h prefix caching for Claude (the feature that motivated the split), but within-session caching now actually works on every provider. Removed: - run_agent.py: _use_long_lived_prefix_cache flag, _long_lived_cache_ttl, _supports_long_lived_anthropic_cache method, the long-lived branch in run_conversation, mark_tools_for_long_lived_cache call site - agent/prompt_caching.py: apply_anthropic_cache_control_long_lived, mark_tools_for_long_lived_cache, _mark_system_stable_block helper - hermes_cli/config.py: prompt_caching.long_lived_prefix and prompt_caching.long_lived_ttl config keys - tests/agent/test_prompt_caching_live.py (entire file) - tests/agent/test_prompt_caching.py: TestMarkToolsForLongLivedCache, TestApplyAnthropicCacheControlLongLived - tests/run_agent/test_anthropic_prompt_cache_policy.py: TestSupportsLongLivedAnthropicCache Targeted tests: 62/62 pass.	2026-05-12 20:46:04 -07:00
amathxbt	80374d4dd9	fix: approval DELETE pattern DOTALL flag allows newline bypass	2026-05-12 18:50:37 -07:00
AgentArcLab	8ac351407e	fix(agent): clear stale config context_length on model switch When switching models via /model, AIAgent._config_context_length was never cleared, so the new model inherited the previous model's context window instead of auto-detecting the correct one via get_model_context_length(). Clear _config_context_length to None before the runtime field swap so the full resolution chain (custom_providers per-model, endpoint probe, models.dev, etc.) is re-evaluated for the newly selected model. Closes #21509	2026-05-12 18:50:04 -07:00
alblez	a4289d74ac	fix(test): use i18n t() for restart drain assertion The test_restart_command_while_busy_requests_drain_without_interrupt test was asserting against a hardcoded emoji string that was valid before the i18n migration. After gateway/run.py switched to t("gateway.draining", count=N), the test sees the translated output (or the raw key when the locale catalog isn't resolved in xdist workers). Fix by asserting against t("gateway.draining", count=1) — this produces the correct expected value regardless of whether the locale file is available in the test environment.	2026-05-12 18:49:33 -07:00
liuhao1024	1a4e8f7041	fix(gateway): make WhatsApp npm install timeout configurable Default timeout raised from 60s to 300s (5 minutes) to accommodate slower systems like Unraid NAS. Configurable via WHATSAPP_NPM_INSTALL_TIMEOUT environment variable.	2026-05-12 18:49:07 -07:00
AhmetArif0	420762f867	fix(tools): forward thread_id via metadata in _send_via_adapter live path The live adapter path in _send_via_adapter called adapter.send() without passing thread_id, while the standalone fallback path correctly forwarded it. For plugin platforms (google_chat, teams, irc, line) running with the gateway in-process, this caused every threaded reply to land as a new top-level message instead of continuing the thread. Matches the pattern already used by _send_matrix_via_adapter and _send_feishu: build metadata={"thread_id": thread_id} and pass it through.	2026-05-12 18:48:44 -07:00
02356abc	e77fd75c44	fix(wecom): update connection status after WebSocket reconnection The WeCom adapter's _listen_loop() automatically reconnects when the WebSocket drops, but it never called _mark_connected() after a successful reconnection. This left the runtime status file (gateway_state.json) stuck in "disconnected" even though the adapter was fully operational again. Add self._mark_connected() right after _open_connection() succeeds so that the dashboard and health probes report the correct state. Tested by forcing a WebSocket close via the heartbeat loop and verifying that the status file updated from "disconnected" back to "connected".	2026-05-12 18:48:17 -07:00
mizgyo	7c67097325	fix(line): use build_source instead of nonexistent create_source The LINE adapter calls self.create_source(...) which raises AttributeError on every inbound message — no such method exists. The base PlatformAdapter exposes this factory as build_source(), consistent with the IRC and Teams adapters. Fixes #23728	2026-05-12 18:46:54 -07:00
ALIYILD	afa5b81918	fix(prompt_builder): inject tool-use enforcement for GLM models GLM-family models (z-ai/glm-4.5-air, z-ai/glm-4.5-flash, etc.) exhibit the same "describe-instead-of-call" failure mode that gpt/codex/gemini/ gemma/grok already trigger enforcement for. Without the injection, free-tier GLM workers spawned by the kanban dispatcher routinely exit cleanly (rc=0) without invoking kanban_complete or kanban_block, producing the "protocol violation" error and triggering the dispatcher's gave_up path. Observed in real workloads: seven consecutive kanban tasks across three GLM-tier profiles (shipbackend, frontend-engineer, backend-engineer) all failed with the identical message: worker exited cleanly (rc=0) without calling kanban_complete or kanban_block — protocol violation Re-running the same tasks on Claude Haiku immediately resolved them. Adding "glm" to TOOL_USE_ENFORCEMENT_MODELS closes the gap so future GLM-routed work receives the explicit "every response must contain a tool call or final result" steering that already protects the other enforcement-gated model families. One-line change; no behavior change for non-GLM models.	2026-05-12 18:46:28 -07:00
AhmetArif0	e474130c48	fix(telegram): use thread fallback helper in slash-confirm result send PR #23458 introduced _send_message_with_thread_fallback() and applied it to all control-style sends (send_update_prompt, send_approval_request, send_model_picker_prompt), but the slash-confirm result message in handle_callback_query still called self._bot.send_message directly. In supergroups with stale message_thread_id on the callback's parent message, this raises "Message thread not found" and silently swallows the result text. Replace with the helper so the same retry-without- thread-id logic applies.	2026-05-12 18:46:02 -07:00
Jwd-gity	327b8cee9e	fix(install): use stash@{0} instead of git rev-parse refs/stash for autostash recovery Autostash creates refs/stash as a pointer to the latest stash commit, but git stash apply/drop expect the symbolic ref format like stash@{0}, not the raw commit SHA. Using the commit SHA causes: error: 'X is not a stash reference'	2026-05-12 18:45:26 -07:00
diablozzc	dd1d4e9c5d	fix(gateway): add chat_id to hook_ctx for message source tracking	2026-05-12 18:44:57 -07:00
Teknium	80c4b27437	docs(lsp): document follow-up fixes from #24630 (#24709 ) - Note that typescript-language-server pulls in the typescript SDK automatically (peer-dep relationship was previously implicit and caused initialize failures when the SDK was absent). - Add a Troubleshooting entry for the new Backend warnings section in hermes lsp status, with the shellcheck install commands across apt / brew / scoop. Reflects what shipped in PR #24630.	2026-05-12 18:44:33 -07:00
Dangooy	557deece6f	fix(tui): use TERMINAL_CWD in _session_info for accurate status line path _session_info() used os.getcwd() which reflects the gateway process working directory, not the user's actual working directory. This caused the TUI status line to display incorrect paths (e.g. D:\HermesWork instead of D:\Hermes\HermesWork) after agent turns that changed the process cwd. Align with session.create which already correctly reads TERMINAL_CWD env var set by the CLI launcher.	2026-05-12 18:44:17 -07:00
RhombusMaximus	081f9368bc	fix(voice_mode): detect audio in WSL when sd.query_devices() returns empty list but PULSE_SERVER is set In WSL2, sounddevice.query_devices() returns [] even when the PulseAudio bridge is functional. The existing code already handled the case where the query itself raises an exception, but it missed the empty-list case. This change treats an empty device list as non-fatal in WSL when PULSE_SERVER is configured, matching the existing exception-handler behavior. Fixes: WSL users seeing 'No audio input/output devices detected' even though paplay/arecord work fine.	2026-05-12 18:43:50 -07:00
ms-alan	e71393237e	fix(signal): handle group messages from linked devices in syncMessage path Closes #23064 When Hermes connects to Signal via signal-cli in daemon mode (linked device setup), group messages sent from the user's phone were silently dropped. The syncMessage handler only processed events where destinationNumber equals the bot's own number (Note to Self). Group messages from linked devices carry a groupInfo.groupId instead of a destinationNumber. Extend the condition to also pass through sync messages that have a groupId, so group messages are promoted to dataMessage and reach the agent.	2026-05-12 18:43:26 -07:00
ryptotalent	4c825554c1	fix(retry): use float() for Retry-After header to handle sub-second values	2026-05-12 18:42:52 -07:00
Teknium	2a18b6283b	fix(cache): drop ttl=1h on Portal Qwen — Alibaba upstream is 5m-only (#24702 ) PR #24151 routed Portal Qwen (qwen3.6-plus) through the prefix_and_2 long-lived cache layout, attaching {"type":"ephemeral","ttl":"1h"} markers to the tools[-1] entry and the stable system-prefix block. That layout works for Portal Claude because Anthropic / OpenRouter on Anthropic routes honour 1h TTL — but Portal Qwen ultimately proxies to Alibaba DashScope, which documents a single "ephemeral" TTL of 5 minutes on its Context Cache. The ttl="1h" qualifier is silently dropped upstream, so the two highest-value breakpoints (tools array + system prefix) never land. Only the rolling-window 5m markers on the last 2 messages cache, which matches the observed ~25% read rate. Fix: keep Portal Qwen on cache_control via _anthropic_prompt_cache_policy returning (True, False), but drop it from _supports_long_lived_anthropic_cache so it rides the standard system_and_3 5m layout (system + last 3 messages, all at 5m). Same 4 breakpoints, all in a TTL the upstream actually honours. Refs: https://www.alibabacloud.com/help/en/model-studio/context-cache https://openrouter.ai/docs/features/prompt-caching (Alibaba Qwen section: "TTL: 5 minutes") - _supports_long_lived_anthropic_cache: Portal scope narrowed back to Claude - tests: flip the two qwen long-lived expectations to False, retitle non_claude_non_qwen_rejected -> non_claude_rejected	2026-05-12 18:34:43 -07:00
dhruv-saxena	d8c4460fe3	fix(cron): include whatsapp in _HOME_TARGET_ENV_VARS Cron jobs using `deliver: whatsapp` were silently dropped because the resolver's home-channel env var dict in cron/scheduler.py listed every messaging platform except whatsapp. _resolve_delivery_targets() returned [] and no message was sent — but jobs.json marked the run successful and no log line surfaced the failure. The gateway adapter and the send_message tool path both honored WHATSAPP_HOME_CHANNEL correctly; only the cron path missed. Adds 'whatsapp' -> 'WHATSAPP_HOME_CHANNEL' to _HOME_TARGET_ENV_VARS. Verified end-to-end with multiple cron pings landing in WhatsApp self-chat after the fix. Fixes #22997	2026-05-12 17:13:15 -07:00
McClean	6f92a21926	fix(web): add Bearer auth header for Tavily /crawl endpoint Tavily's /crawl endpoint requires Authorization: Bearer <key> in the header, unlike /search and /extract which accept api_key in the JSON body. Without the header, crawl returns 401 Unauthorized.	2026-05-12 17:12:49 -07:00
jak983464779	0c233e70f8	fix(doctor): skip /models health check for providers that don't support it Xiaomi MiMo's /v1/models endpoint returns 401 even with a valid API key, causing hermes doctor to falsely report 'invalid API key'. Add a `supports_health_check` field to ProviderProfile (default True). Providers whose /models endpoint doesn't support auth verification can set it to False. The doctor's dynamic provider discovery now reads this field instead of hardcoding True. The xiaomi provider plugin sets supports_health_check=False.	2026-05-12 17:12:25 -07:00
Quarkex	a54d4b0e46	fix(send_message): recognize XMPP JIDs as explicit targets _parse_target_ref() has no handler for XMPP JIDs (user@server or room@conference.server), so they fall through to the final `return None, None, False`. This causes send_message to fail when targeting an XMPP chat by JID, since the JID is not numeric and doesn't match any other platform pattern. Add an explicit check for XMPP targets containing '@', matching the existing Matrix pattern above it.	2026-05-12 17:11:56 -07:00
silv-mt-holdings	0bc5f7b235	fix(gateway): reduce systemd restart delay	2026-05-12 17:11:25 -07:00
wesleysimplicio	8d553056c0	fix(ci): bump e2e job timeout to 15 minutes Closes #22006	2026-05-12 17:10:57 -07:00
wesleysimplicio	1beb578fde	fix(ci): install ripgrep in e2e job Closes #22003	2026-05-12 17:09:45 -07:00
wuwuzhijing	a694a26330	docs(gateway): mention Weixin in gateway help and docstrings Salvage of #21063 — adds 'Weixin, and more' to module-level docstrings in gateway/__init__.py, gateway/config.py, gateway/platforms/base.py and the 'hermes gateway' subparser description. Co-authored-by: wuwuzhijing <chuang.guo@hopechart.com>	2026-05-12 17:08:51 -07:00
Teknium	29c9ff9ba5	fix(lsp): typescript SDK install + tsc-missing skip + shellcheck warning (#24630 ) Three follow-ups to PR #24168 found during live E2E testing on TS/bash files: 1. typescript-language-server now installs the typescript SDK (tsserver) alongside it. Without that sibling install, initialize() failed with "Could not find a valid TypeScript installation" and the server was marked broken — no diagnostics ever reached the agent. New extra_pkgs field on INSTALL_RECIPES makes that explicit and reusable for future peer-dep cases. 2. _check_lint now treats "linter command exists on PATH but cannot actually run" as skipped instead of error. The motivating case is npx tsc when typescript is not in node_modules — npx prints its "This is not the tsc command you are looking for" banner and exits non-zero, which previously blocked the LSP semantic tier (gated on success or skipped). Pattern-matched per base command (npx, rustfmt, go) so genuine lint errors still flow through normally. 3. hermes lsp status now surfaces a Backend warnings section when bash-language-server is installed but shellcheck is missing. The server itself spawns fine but bash-language-server delegates diagnostics to shellcheck — without it on PATH the integration looks alive but never reports any problems. Same warning is logged once at server spawn time. Validation: - 12 new tests in tests/agent/lsp/test_install_and_lint_fixes.py: * recipe carries typescript SDK * _install_npm passes both pkg + extras to npm CLI * backwards compat: recipes without extras still work * _backend_warnings quiet when bash absent / both present * _backend_warnings fires when bash installed without shellcheck * status output includes the Backend warnings section * _looks_like_linter_unusable catches the npx tsc banner * real TS type errors not misclassified as unusable * unfamiliar linters fall through normally * _check_lint returns skipped on npx tsc unusable * _check_lint returns error on real tsc type errors - Full lsp + file_operations test suite: 245/245 pass - Live E2E: * try_install("typescript-language-server") installs both packages into node_modules * write_file(bad.ts, ...) returns lint=skipped + lsp_diagnostics with two real TS errors (was lint=error, no lsp_diagnostics) * hermes lsp status renders the shellcheck warning when bash is installed but shellcheck is not on PATH	2026-05-12 17:02:35 -07:00
Teknium	6f285efb80	fix(telegram): clear in-progress reaction on cancelled processing (#24628 ) When the user runs /stop or a session is interrupted mid-flight, the 👀 in-progress reaction lingered on the user's message indefinitely. Without another agent run to swap it for 👍/👎, the eyes stayed there forever — visually misleading (looks like the agent is still working). Fix: on ProcessingOutcome.CANCELLED, call set_message_reaction with reaction=None to clear all reactions on the message. Documented Bot API semantics (equivalent to Bot API 10.0's deleteMessageReaction, but works on PTB 22.6 already without the version bump). Test changes: - Renamed test_on_processing_complete_cancelled_keeps_existing_reaction → test_on_processing_complete_cancelled_clears_reaction; updated assertion to expect set_message_reaction(reaction=None). - Added test_on_processing_complete_cancelled_skipped_when_disabled (TELEGRAM_REACTIONS=false short-circuits). - Added test_clear_reactions_handles_api_error_gracefully and test_clear_reactions_returns_false_without_bot to cover the new _clear_reactions helper.	2026-05-12 17:02:29 -07:00
Teknium	413990c945	chore(release): add AUTHOR_MAP entries for JamesX88	2026-05-12 16:45:04 -07:00
JamesX88	a33ec10874	fix(cli): @-file completion crash on Windows when paths aren't cp1252-decodable The fuzzy @-file completer shells out to 'rg --files' via subprocess.run with text=True. On Windows, Python 3.13 decodes stdout using the system ANSI codepage (cp1252), so any filename containing bytes like 0x81/0x8f crashes the background reader thread with UnicodeDecodeError. The exception is swallowed inside subprocess, leaving proc.stdout=None, and the next line ('proc.stdout.strip()') blows up with: AttributeError: 'NoneType' object has no attribute 'strip' This takes down the prompt_toolkit event loop and forces 'Press ENTER to continue' until the user clears the @-query. Fix: - Pass encoding='utf-8', errors='replace' so rg's UTF-8 output is decoded consistently across platforms and unmappable bytes don't crash. - Guard 'proc.stdout' with a None check before .strip(), so a future reader-thread failure degrades gracefully instead of breaking input.	2026-05-12 16:45:04 -07:00
Teknium	c7cfad5d96	chore(release): add AUTHOR_MAP entries for NorethSea	2026-05-12 16:44:03 -07:00
NorethSea	7a4ad5ccb4	fix(cli): use display-width for response box header label to support CJK Replace `len(label)` with `HermesCLI._status_bar_display_width(label)` in two places where the response box top border is rendered. `len()` counts characters, not terminal columns. CJK characters like `测` and `试` each occupy 2 columns, causing the top border `╭─ 测试 ───╮` to render 2 columns wider than the bottom border `╰─────────╯`. The `_status_bar_display_width` helper already exists (line 2881) and uses `prompt_toolkit.utils.get_cwidth` for proper CJK width calculation.	2026-05-12 16:44:03 -07:00
Teknium	b7bd0f77f3	chore(release): add AUTHOR_MAP entries for laoli-no1	2026-05-12 16:42:53 -07:00
laoli-no1	d33deb7cbe	fix(tui): clear scrollback buffer on startup to prevent tmux scrollback leakage When TUI exits, tmux captures some TUI output into its scrollback buffer. On restart, stale scrollback content appears at the top of screen before AlternateScreen takes over. Add ANSI escape sequences at startup: - ESC[2J clear visible screen - ESC[H cursor home - ESC[3J clear scrollback buffer	2026-05-12 16:42:53 -07:00
liuhao1024	2a3140a814	fix(dashboard): rescan plugins when cached directory is removed	2026-05-12 16:41:33 -07:00
Teknium	6ec89d885d	chore(release): add AUTHOR_MAP entries for aqilaziz	2026-05-12 16:40:10 -07:00
aqilaziz	80375cbe2c	fix(dashboard): display real config path on Config page Replace the hardcoded i18n placeholder "~/.hermes/config.yaml" with the real config_path returned from api.getStatus(), falling back to the i18n string while loading or on API failure. Co-authored-by: aqilaziz <gonzes7@gmail.com>	2026-05-12 16:40:10 -07:00
Teknium	782e3f5164	chore(release): add AUTHOR_MAP entries for AllynSheep	2026-05-12 16:38:14 -07:00
AllynSheep	e3858772d0	fix(dashboard): skip browser-open on headless Linux to prevent process exit Fixes #24127 On headless Linux VPS (no DISPLAY or WAYLAND_DISPLAY), some Python webbrowser backends register TUI programs such as links, lynx, or www-browser. GenericBrowser.open() spawns these without redirecting stdin/stdout, allowing them to take over the terminal. This can cause the process to receive SIGHUP and exit immediately even though uvicorn bound the port successfully, producing a misleading success message followed by an empty --status. Fix: detect headless Linux at startup and skip the auto-open when no display server is available. On such systems the URL is still printed so the user can open it manually or via an SSH tunnel. The webbrowser call is also wrapped in a try/except so any unexpected failure on other platforms is silently absorbed rather than surfacing as an unhandled exception in the daemon thread.	2026-05-12 16:38:14 -07:00
Teknium	b3ca6362a8	chore(release): add AUTHOR_MAP entry for hookinglau	2026-05-12 16:36:20 -07:00
hookinglau	d68a0ec383	fix(auxiliary): pass cfg_base_url and cfg_api_key when resolving task provider _resolve_task_provider_model drops cfg_base_url and cfg_api_key when returning a named provider, causing configured API keys and base URLs to be lost. Pass them through so named providers can use custom endpoints while still resolving credentials from provider-specific env vars. Closes #20139	2026-05-12 16:36:20 -07:00
Teknium	389c707e42	chore(release): add AUTHOR_MAP entry for ryptotalent	2026-05-12 16:34:40 -07:00
ryptotalent	9b2488af2a	fix: include arg-taking commands in Telegram menu Built-in commands with required args (e.g. /queue, /steer, /background) were excluded from Telegram setMyCommands output, making them invisible in the autocomplete menu. However, their handlers already return usage text when invoked without arguments, so hiding them hurts discoverability. This commit removes the _requires_argument filter for built-in commands (COMMAND_REGISTRY) while keeping it for plugin-registered slash commands, which may not provide a no-arg usage fallback. Closes #24312	2026-05-12 16:34:40 -07:00
Teknium	29d7c244c5	feat(gateway): wire clarify tool with inline keyboard buttons on Telegram (#24199 ) The clarify tool returned 'not available in this execution context' for every gateway-mode agent because gateway/run.py never passed clarify_callback into the AIAgent constructor. Schema actively encouraged calling it; users never saw the question. Changes: - tools/clarify_gateway.py — new event-based primitive mirroring tools/approval.py: register/wait_for_response/resolve_gateway_clarify with per-session FIFO, threading.Event blocking with 1s heartbeat slices (so the inactivity watchdog keeps ticking), and clear_session for boundary cleanup. - gateway/platforms/base.py — abstract send_clarify with a numbered-text fallback so every adapter (Discord, Slack, WhatsApp, Signal, Matrix, etc.) gets a working clarify out of the box. Plus an active-session bypass: when the agent is blocked on a text-awaiting clarify, the next non-command message routes inline to the runner's intercept instead of being queued + triggering an interrupt. Same shape as the /approve deadlock fix from PR #4926. - gateway/platforms/telegram.py — concrete send_clarify renders one inline button per choice plus '✏️ Other (type answer)'. cl: callback handler resolves numeric choices immediately, flips to text-capture mode for Other, with the same authorization guards as exec/slash approvals. - gateway/run.py — clarify_callback wired at the cached-agent per-turn callback assignment site (only the user-facing agent path; cron and hygiene-compress agents have no human attached). Bridges sync→async via run_coroutine_threadsafe, blocks with the configured timeout, and returns a '[user did not respond within Xm]' sentinel on timeout so the agent adapts rather than pinning the running-agent guard. Text- intercept added to _handle_message before slash-confirm intercept (skipping slash commands). clear_session called in the run's finally to cancel any orphan entries. - hermes_cli/config.py — agent.clarify_timeout default 600s. - website/docs/user-guide/messaging/telegram.md — Interactive Prompts section. Tests: - tests/tools/test_clarify_gateway.py (14 tests) — full primitive coverage: button resolve, open-ended auto-await, Other flip, timeout None, unknown-id idempotency, clear_session cancellation, FIFO ordering, register/unregister notify, config default. - tests/gateway/test_telegram_clarify_buttons.py (12 tests) — render paths (multi-choice/open-ended/long-label/HTML-escape/not-connected), callback dispatch (numeric resolve/Other flip/already-resolved/ unauthorized/invalid-token), and base-adapter text fallback. Out of scope: bot-to-bot, guest mode, checklists, poll media, live photos. Closes #24191.	2026-05-12 16:33:33 -07:00
teknium	76bbb94be4	chore: AUTHOR_MAP entry for AhmetArif0 (PR #24600 )	2026-05-12 16:33:09 -07:00
EloquentBrush	f9559c39c4	fix(gateway): consult lock record argv when cmdline unreadable in scoped-lock stale check PR #24500 introduced stale-lock detection that calls `_looks_like_gateway_process` to confirm a running PID is not an unrelated process that reused the slot. On Windows neither `/proc` nor `ps` is available, so `_read_process_cmdline` always returns `None` and `_looks_like_gateway_process` always returns `False` — causing every valid Windows gateway lock to be marked stale and immediately evicted. Fix: after `_looks_like_gateway_process` returns `False`, call `_read_process_cmdline` directly. If the result is non-`None` the live cmdline was readable and confirms the PID is foreign → stale. If it is `None` (cmdline unreadable, e.g. Windows without ps), fall back to `_record_looks_like_gateway` which validates the stored `argv` the gateway wrote into the lock file at startup. Both oracles must say "not a gateway" before the lock is evicted — the same two-oracle pattern already used in `get_running_pid` (line 941). Adds a regression test that simulates a Windows host where `_looks_like_gateway_process` returns `False` for every PID and `_read_process_cmdline` returns `None`, confirming the lock is kept when the record's argv identifies it as a gateway process.	2026-05-12 16:33:09 -07:00
Teknium	24e2151cd6	chore(release): add AUTHOR_MAP entries for zccyman and Osraka	2026-05-12 16:32:57 -07:00
zccyman	88ede807c4	fix(pricing): add deepseek-v4-pro to official docs pricing table deepseek-v4-pro has been routable since v0.12 but was missing from the _OFFICIAL_DOCS_PRICING table. Sessions using this model showed as "unknown cost" in hermes insights instead of a dollar estimate. Add pricing entry using published list prices: - input: \$1.74/M tokens - output: \$3.48/M tokens - cache_read: \$0.0145/M tokens Uses standard list rates (not the 75% promo) so estimates remain accurate after promo expires 2026-05-31. Closes #24218	2026-05-12 16:32:57 -07:00
Teknium	83b93898c2	feat(lsp): semantic diagnostics from real language servers in write_file/patch (#24168 ) * feat(lsp): semantic diagnostics from real language servers in write_file/patch Wire ~26 language servers (pyright, gopls, rust-analyzer, typescript-language-server, clangd, bash-language-server, ...) into the post-write lint check used by write_file and patch. The model now sees type errors, undefined names, missing imports, and project-wide semantic issues introduced by its edits, not just syntax errors. LSP is gated on git workspace detection: when the agent's cwd or the file being edited is inside a git worktree, LSP runs against that workspace; otherwise the existing in-process syntax checks are the only tier. This keeps users on user-home cwds (Telegram/Discord gateway chats) from spawning daemons. The post-write check is layered: in-process syntax check first (microseconds), then LSP semantic diagnostics second when syntax is clean. Diagnostics are delta-filtered against a baseline captured at write start, so the agent only sees errors its edit introduced. A flaky/missing language server can never break a write -- every LSP failure path falls back silently to the syntax-only result. New module agent/lsp/ split into: - protocol.py: Content-Length JSON-RPC framer + envelope helpers - client.py: async LSPClient (spawn, initialize, didOpen/didChange, ContentModified retry, push/pull diagnostic stores) - workspace.py: git worktree walk-up + per-server NearestRoot resolver - servers.py: registry of 26 language servers (extension match, root resolver, spawn builder per language) - install.py: auto-install dispatch (npm install --prefix, go install with GOBIN, pip install --target) into HERMES_HOME/lsp/bin/ - manager.py: LSPService (per-(server_id, root) client registry, lazy spawn, broken-set, in-flight dedupe, sync facade for tools layer) - reporter.py: <diagnostics> block formatter (severity-1-only, 20-per-file) - cli.py: hermes lsp {status,list,install,install-all,restart,which} Wired into tools/file_operations.py: - write_file/patch_replace now call _snapshot_lsp_baseline before write - _check_lint_delta gains a third tier: LSP semantic diagnostics when syntax is clean - All LSP code paths swallow exceptions; write_file's contract unchanged Config: 'lsp' section in DEFAULT_CONFIG with enabled (default true), wait_mode, wait_timeout, install_strategy (default 'auto'), and per-server overrides (disabled, command, env, initialization_options). Tests: tests/agent/lsp/ -- 49 tests covering protocol framing (encode and read_message round-trip, EOF/truncation/missing Content-Length), workspace gate (git walk-up, exclude markers, fallback to file location), reporter (severity filter, max-per-file cap, truncation), service-level delta filter, and an in-process mock LSP server that exercises the full client lifecycle including didChange version bumps, dedup, crash recovery, and idempotent teardown. Live E2E verified end-to-end through ShellFileOperations: pyright auto-installed via npm into HERMES_HOME, baseline captured, type error introduced, single delta diagnostic surfaced with correct line/column/code/ source, then patch fix removes the diagnostic from the output. Docs: new website/docs/user-guide/features/lsp.md page covering supported languages, configuration knobs, performance characteristics, and troubleshooting; cli-commands.md updated with the 'hermes lsp' reference; sidebar updated. * feat(lsp): structured logging, backend gate, defensive walk caps Cherry-picks the substantive ideas from #24155 (different scope, same problem space) onto our PR. agent/lsp/eventlog.py (new): dedicated structured logger ``hermes.lint.lsp`` with steady-state silence. Module-level dedup sets keep a 1000-write session at exactly ONE INFO line ("active for <root>") at the default INFO threshold; clean writes log at DEBUG so they never reach agent.log under normal config. State transitions (server starts, no project root for a file, server unavailable) fire at INFO/WARNING once per (server_id, key); novel events (timeouts, unexpected errors) fire WARNING per call. Grep recipe: ``rg 'lsp\\['``. agent/lsp/manager.py: wire the eventlog into _get_or_spawn and get_diagnostics_sync so users can answer "did LSP fire on this edit?" with a single grep, plus surface "binary not on PATH" warnings once instead of silently retrying every write. tools/file_operations.py: backend-type gate. ``_lsp_local_only()`` returns False for non-local backends (Docker / Modal / SSH / Daytona); ``_snapshot_lsp_baseline`` and ``_maybe_lsp_diagnostics`` now skip entirely on remote envs. The host-side language server can't see files inside a sandbox, so this prevents pretending to lint a file the host process can't open. agent/lsp/protocol.py: 8 KiB cap on the header block in ``read_message``. A pathological server that streams headers without ever emitting CRLF-CRLF would have looped forever consuming bytes; now raises ``LSPProtocolError`` instead. agent/lsp/workspace.py: 64-step cap on ``find_git_worktree`` and ``nearest_root`` upward walks, plus try/except containment around ``Path(...).resolve()`` and child ``.exists()`` calls. Defensive against pathological inputs (symlink loops, encoding errors, permission failures mid-walk) — the lint hook is hot-path code and must never raise. Tests: - tests/agent/lsp/test_eventlog.py: 18 tests covering steady-state silence (clean writes stay DEBUG), state-transition INFO-once semantics (active for, no project root), action-required WARNING-once (server unavailable), per-call WARNING (timeouts, spawn failures), and the "1000 clean writes => 1 INFO" contract. - tests/agent/lsp/test_backend_gate.py: 5 tests verifying _lsp_local_only / snapshot_baseline / maybe_lsp_diagnostics skip the LSP layer for non-local backends and route correctly for LocalEnvironment. - tests/agent/lsp/test_protocol.py: new test_read_message_rejects_runaway_header exercising the 8 KiB cap. Validation: - 73/73 LSP tests pass (49 original + 18 eventlog + 5 backend-gate + 1 framer cap) - 198/198 pass when run alongside existing file_operations tests - Live E2E re-run with pyright still surfaces "ERROR [2:12] Type ... reportReturnType (Pyright)" through the full path, then patch fix removes it on the next call. * feat(lsp): atexit cleanup + separate lsp_diagnostics JSON field Two improvements salvaged from #24414's plugin-form alternative, keeping our core-integrated design: 1. atexit cleanup of spawned language servers ---------------------------------------------------------------- ``agent/lsp/__init__.get_service`` now registers an ``atexit`` handler on first creation that tears down the LSPService on Python exit. Without this, every ``hermes chat`` exit was leaking pyright/gopls/etc. processes for a few seconds while their stdout buffers drained -- they got reaped by the kernel eventually but a watchful ``ps aux`` would catch them. The handler runs once per process (gated by ``_atexit_registered``); idempotent ``shutdown_service`` ensures double-fire is a no-op. Errors during shutdown are swallowed at debug level since by the time atexit fires the user has already seen the agent's final response. 2. Separate ``lsp_diagnostics`` field on WriteResult / PatchResult ---------------------------------------------------------------- Previously the LSP layer folded its diagnostic block into the ``lint.output`` string, conflating the syntax-check tier with the semantic tier. The agent (and any downstream parsers) now read syntax errors and semantic errors as independent signals: { "bytes_written": 42, "lint": {"status": "ok", "output": ""}, "lsp_diagnostics": "<diagnostics file=...>\nERROR [2:12] ..." } ``_check_lint_delta`` returns to its original two-tier shape (syntax check + delta filter); ``write_file`` and ``patch_replace`` independently fetch LSP diagnostics via ``_maybe_lsp_diagnostics`` and pass them into the new field. ``patch_replace`` propagates the inner write_file's ``lsp_diagnostics`` so the outer PatchResult carries the patch's delta correctly. Tests: 19 new - tests/agent/lsp/test_lifecycle.py (8 tests): atexit registration fires once and only once across N get_service calls; the registered callable is our internal shutdown wrapper; shutdown_service is idempotent and safe when never started; exceptions during shutdown are swallowed; inactive service is cached so we don't rebuild on every check. - tests/agent/lsp/test_diagnostics_field.py (11 tests): WriteResult / PatchResult dataclass shape, to_dict include/omit semantics, channel separation (lint and lsp_diagnostics carry independent signals), write_file populates the field via _maybe_lsp_diagnostics only when the syntax tier is clean, patch_replace propagates the field forward from its internal write_file. Validation: - 92/92 LSP tests pass (73 prior + 8 lifecycle + 11 diagnostics field) - 217/217 pass with file_operations + LSP combined - Live E2E reverified: clean writes -> both fields empty/none; type error introduced -> lint clean (parses), lsp_diagnostics carries the pyright reportReturnType block; patch fix -> both fields clean again. * fix(lsp): broken-set short-circuit so a wedged server isn't paid every write Discovered while auditing failure paths: a language server binary that hangs (sleep forever, no LSP traffic on stdin/stdout) caused EVERY subsequent write to re-pay the 8s snapshot_baseline timeout. Five writes = ~64s of dead time. The bug: ``_get_or_spawn`` adds the (server_id, root) pair to ``_broken`` inside its inner exception handler, but when the OUTER ``_loop.run`` timeout fires, it cancels the inner task before that handler runs. The pair never makes it to broken-set, so the next write re-enters the spawn path and re-pays the timeout. Fix: - New ``_mark_broken_for_file`` helper at the service layer marks the (server_id, workspace_root) pair broken from the OUTSIDE when the outer timeout fires. Called from the except branches in ``snapshot_baseline``, ``get_diagnostics_sync`` (asyncio.TimeoutError + generic Exception). Also kills any orphan client process that survived the cancelled future, fire-and-forget with a 1s ceiling. - ``enabled_for`` now consults the broken-set BEFORE returning True. Files in already-broken (server_id, root) pairs short-circuit to False, so the file_operations layer skips the LSP path entirely with no spawn cost. Until the service is restarted (``hermes lsp restart``) or the process exits. - A single eventlog WARNING is emitted on first mark-broken so the user knows which server gave up. Subsequent edits in the same project stay silent. Tests: 7 new in tests/agent/lsp/test_broken_set.py — covers the key shape (server_id, per_server_root), enabled_for short-circuit, sibling-file skip in same project, project isolation (broken in A doesn't affect B), graceful no-op for missing-server / no-workspace, and an end-to-end test that snapshots after a failure and verifies the next ``enabled_for`` returns False. Validation: - Live retest of the wedged-binary scenario: 5 sequential writes, first 8.88s (the one snapshot timeout), subsequent four ~0.84s (no LSP cost). Down from 5x12.85s = 64s before this fix. - 99/99 LSP tests pass (92 prior + 7 broken-set) - 224/224 pass with file_operations + LSP combined - Happy path E2E reverified — clean write, type error introduced, patch fix all behave correctly with the new broken-set logic. Note: the FIRST write to a wedged binary still pays 8s (the snapshot_baseline timeout). We could shorten that, but pyright/ tsserver normally take 2-3s and slow CI rust-analyzer can need 5+ seconds, so 8s is the conservative ceiling. Subsequent writes are instant.	2026-05-12 16:31:54 -07:00

1 2 3 4 5 ...

8183 commits