hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-18 04:41:56 +00:00

History

Teknium 83b93898c2 feat(lsp): semantic diagnostics from real language servers in write_file/patch (#24168 ) * feat(lsp): semantic diagnostics from real language servers in write_file/patch Wire ~26 language servers (pyright, gopls, rust-analyzer, typescript-language-server, clangd, bash-language-server, ...) into the post-write lint check used by write_file and patch. The model now sees type errors, undefined names, missing imports, and project-wide semantic issues introduced by its edits, not just syntax errors. LSP is gated on git workspace detection: when the agent's cwd or the file being edited is inside a git worktree, LSP runs against that workspace; otherwise the existing in-process syntax checks are the only tier. This keeps users on user-home cwds (Telegram/Discord gateway chats) from spawning daemons. The post-write check is layered: in-process syntax check first (microseconds), then LSP semantic diagnostics second when syntax is clean. Diagnostics are delta-filtered against a baseline captured at write start, so the agent only sees errors its edit introduced. A flaky/missing language server can never break a write -- every LSP failure path falls back silently to the syntax-only result. New module agent/lsp/ split into: - protocol.py: Content-Length JSON-RPC framer + envelope helpers - client.py: async LSPClient (spawn, initialize, didOpen/didChange, ContentModified retry, push/pull diagnostic stores) - workspace.py: git worktree walk-up + per-server NearestRoot resolver - servers.py: registry of 26 language servers (extension match, root resolver, spawn builder per language) - install.py: auto-install dispatch (npm install --prefix, go install with GOBIN, pip install --target) into HERMES_HOME/lsp/bin/ - manager.py: LSPService (per-(server_id, root) client registry, lazy spawn, broken-set, in-flight dedupe, sync facade for tools layer) - reporter.py: <diagnostics> block formatter (severity-1-only, 20-per-file) - cli.py: hermes lsp {status,list,install,install-all,restart,which} Wired into tools/file_operations.py: - write_file/patch_replace now call _snapshot_lsp_baseline before write - _check_lint_delta gains a third tier: LSP semantic diagnostics when syntax is clean - All LSP code paths swallow exceptions; write_file's contract unchanged Config: 'lsp' section in DEFAULT_CONFIG with enabled (default true), wait_mode, wait_timeout, install_strategy (default 'auto'), and per-server overrides (disabled, command, env, initialization_options). Tests: tests/agent/lsp/ -- 49 tests covering protocol framing (encode and read_message round-trip, EOF/truncation/missing Content-Length), workspace gate (git walk-up, exclude markers, fallback to file location), reporter (severity filter, max-per-file cap, truncation), service-level delta filter, and an in-process mock LSP server that exercises the full client lifecycle including didChange version bumps, dedup, crash recovery, and idempotent teardown. Live E2E verified end-to-end through ShellFileOperations: pyright auto-installed via npm into HERMES_HOME, baseline captured, type error introduced, single delta diagnostic surfaced with correct line/column/code/ source, then patch fix removes the diagnostic from the output. Docs: new website/docs/user-guide/features/lsp.md page covering supported languages, configuration knobs, performance characteristics, and troubleshooting; cli-commands.md updated with the 'hermes lsp' reference; sidebar updated. * feat(lsp): structured logging, backend gate, defensive walk caps Cherry-picks the substantive ideas from #24155 (different scope, same problem space) onto our PR. agent/lsp/eventlog.py (new): dedicated structured logger ``hermes.lint.lsp`` with steady-state silence. Module-level dedup sets keep a 1000-write session at exactly ONE INFO line ("active for <root>") at the default INFO threshold; clean writes log at DEBUG so they never reach agent.log under normal config. State transitions (server starts, no project root for a file, server unavailable) fire at INFO/WARNING once per (server_id, key); novel events (timeouts, unexpected errors) fire WARNING per call. Grep recipe: ``rg 'lsp\\['``. agent/lsp/manager.py: wire the eventlog into _get_or_spawn and get_diagnostics_sync so users can answer "did LSP fire on this edit?" with a single grep, plus surface "binary not on PATH" warnings once instead of silently retrying every write. tools/file_operations.py: backend-type gate. ``_lsp_local_only()`` returns False for non-local backends (Docker / Modal / SSH / Daytona); ``_snapshot_lsp_baseline`` and ``_maybe_lsp_diagnostics`` now skip entirely on remote envs. The host-side language server can't see files inside a sandbox, so this prevents pretending to lint a file the host process can't open. agent/lsp/protocol.py: 8 KiB cap on the header block in ``read_message``. A pathological server that streams headers without ever emitting CRLF-CRLF would have looped forever consuming bytes; now raises ``LSPProtocolError`` instead. agent/lsp/workspace.py: 64-step cap on ``find_git_worktree`` and ``nearest_root`` upward walks, plus try/except containment around ``Path(...).resolve()`` and child ``.exists()`` calls. Defensive against pathological inputs (symlink loops, encoding errors, permission failures mid-walk) — the lint hook is hot-path code and must never raise. Tests: - tests/agent/lsp/test_eventlog.py: 18 tests covering steady-state silence (clean writes stay DEBUG), state-transition INFO-once semantics (active for, no project root), action-required WARNING-once (server unavailable), per-call WARNING (timeouts, spawn failures), and the "1000 clean writes => 1 INFO" contract. - tests/agent/lsp/test_backend_gate.py: 5 tests verifying _lsp_local_only / snapshot_baseline / maybe_lsp_diagnostics skip the LSP layer for non-local backends and route correctly for LocalEnvironment. - tests/agent/lsp/test_protocol.py: new test_read_message_rejects_runaway_header exercising the 8 KiB cap. Validation: - 73/73 LSP tests pass (49 original + 18 eventlog + 5 backend-gate + 1 framer cap) - 198/198 pass when run alongside existing file_operations tests - Live E2E re-run with pyright still surfaces "ERROR [2:12] Type ... reportReturnType (Pyright)" through the full path, then patch fix removes it on the next call. * feat(lsp): atexit cleanup + separate lsp_diagnostics JSON field Two improvements salvaged from #24414's plugin-form alternative, keeping our core-integrated design: 1. atexit cleanup of spawned language servers ---------------------------------------------------------------- ``agent/lsp/__init__.get_service`` now registers an ``atexit`` handler on first creation that tears down the LSPService on Python exit. Without this, every ``hermes chat`` exit was leaking pyright/gopls/etc. processes for a few seconds while their stdout buffers drained -- they got reaped by the kernel eventually but a watchful ``ps aux`` would catch them. The handler runs once per process (gated by ``_atexit_registered``); idempotent ``shutdown_service`` ensures double-fire is a no-op. Errors during shutdown are swallowed at debug level since by the time atexit fires the user has already seen the agent's final response. 2. Separate ``lsp_diagnostics`` field on WriteResult / PatchResult ---------------------------------------------------------------- Previously the LSP layer folded its diagnostic block into the ``lint.output`` string, conflating the syntax-check tier with the semantic tier. The agent (and any downstream parsers) now read syntax errors and semantic errors as independent signals: { "bytes_written": 42, "lint": {"status": "ok", "output": ""}, "lsp_diagnostics": "<diagnostics file=...>\nERROR [2:12] ..." } ``_check_lint_delta`` returns to its original two-tier shape (syntax check + delta filter); ``write_file`` and ``patch_replace`` independently fetch LSP diagnostics via ``_maybe_lsp_diagnostics`` and pass them into the new field. ``patch_replace`` propagates the inner write_file's ``lsp_diagnostics`` so the outer PatchResult carries the patch's delta correctly. Tests: 19 new - tests/agent/lsp/test_lifecycle.py (8 tests): atexit registration fires once and only once across N get_service calls; the registered callable is our internal shutdown wrapper; shutdown_service is idempotent and safe when never started; exceptions during shutdown are swallowed; inactive service is cached so we don't rebuild on every check. - tests/agent/lsp/test_diagnostics_field.py (11 tests): WriteResult / PatchResult dataclass shape, to_dict include/omit semantics, channel separation (lint and lsp_diagnostics carry independent signals), write_file populates the field via _maybe_lsp_diagnostics only when the syntax tier is clean, patch_replace propagates the field forward from its internal write_file. Validation: - 92/92 LSP tests pass (73 prior + 8 lifecycle + 11 diagnostics field) - 217/217 pass with file_operations + LSP combined - Live E2E reverified: clean writes -> both fields empty/none; type error introduced -> lint clean (parses), lsp_diagnostics carries the pyright reportReturnType block; patch fix -> both fields clean again. * fix(lsp): broken-set short-circuit so a wedged server isn't paid every write Discovered while auditing failure paths: a language server binary that hangs (sleep forever, no LSP traffic on stdin/stdout) caused EVERY subsequent write to re-pay the 8s snapshot_baseline timeout. Five writes = ~64s of dead time. The bug: ``_get_or_spawn`` adds the (server_id, root) pair to ``_broken`` inside its inner exception handler, but when the OUTER ``_loop.run`` timeout fires, it cancels the inner task before that handler runs. The pair never makes it to broken-set, so the next write re-enters the spawn path and re-pays the timeout. Fix: - New ``_mark_broken_for_file`` helper at the service layer marks the (server_id, workspace_root) pair broken from the OUTSIDE when the outer timeout fires. Called from the except branches in ``snapshot_baseline``, ``get_diagnostics_sync`` (asyncio.TimeoutError + generic Exception). Also kills any orphan client process that survived the cancelled future, fire-and-forget with a 1s ceiling. - ``enabled_for`` now consults the broken-set BEFORE returning True. Files in already-broken (server_id, root) pairs short-circuit to False, so the file_operations layer skips the LSP path entirely with no spawn cost. Until the service is restarted (``hermes lsp restart``) or the process exits. - A single eventlog WARNING is emitted on first mark-broken so the user knows which server gave up. Subsequent edits in the same project stay silent. Tests: 7 new in tests/agent/lsp/test_broken_set.py — covers the key shape (server_id, per_server_root), enabled_for short-circuit, sibling-file skip in same project, project isolation (broken in A doesn't affect B), graceful no-op for missing-server / no-workspace, and an end-to-end test that snapshots after a failure and verifies the next ``enabled_for`` returns False. Validation: - Live retest of the wedged-binary scenario: 5 sequential writes, first 8.88s (the one snapshot timeout), subsequent four ~0.84s (no LSP cost). Down from 5x12.85s = 64s before this fix. - 99/99 LSP tests pass (92 prior + 7 broken-set) - 224/224 pass with file_operations + LSP combined - Happy path E2E reverified — clean write, type error introduced, patch fix all behave correctly with the new broken-set logic. Note: the FIRST write to a wedged binary still pays 8s (the snapshot_baseline timeout). We could shorten that, but pyright/ tsserver normally take 2-3s and slow CI rust-analyzer can need 5+ seconds, so 8s is the conservative ceiling. Subsequent writes are instant.		2026-05-12 16:31:54 -07:00
..
__init__.py	chore: release v0.13.0 (2026.5.7) (#21406 )	2026-05-07 09:22:48 -07:00
_parser.py	fix: add dashboard to CLI help epilogue and Docker CI smoke test	2026-05-07 06:16:23 -07:00
_subprocess_compat.py	feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags	2026-05-08 14:27:40 -07:00
auth.py	union paid recs from nous portal with static list (#24509 )	2026-05-12 12:16:17 -07:00
auth_commands.py	fix(minimax): harden OAuth dashboard and runtime	2026-05-11 22:15:16 -07:00
azure_detect.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 )	2026-04-28 06:46:45 -07:00
backup.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
banner.py	fix(banner): resolve update-check repo from running code, not profile-scoped path	2026-05-09 04:10:35 -07:00
browser_connect.py	fix(browser): address Copilot review on /browser connect	2026-04-28 22:11:10 -07:00
callbacks.py	fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902 )	2026-04-14 16:11:37 -07:00
checkpoints.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
claw.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
cli_output.py	refactor: remove dead code — 1,784 lines across 77 files (#9180 )	2026-04-13 16:32:04 -07:00
clipboard.py	feat: fix img pasting in new ink plus newline after tools	2026-04-11 13:14:32 -05:00
codex_models.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
colors.py	feat: respect NO_COLOR env var and TERM=dumb (#4079 )	2026-03-30 17:07:21 -07:00
commands.py	chore: ruff auto-fixes — collapsible-else-if, if-stmt-min-max, dict.fromkeys (#23926 )	2026-05-11 11:03:29 -07:00
completion.py	fix(completion): use valid zsh _arguments exclusion-group syntax	2026-05-09 13:36:44 -07:00
config.py	feat(lsp): semantic diagnostics from real language servers in write_file/patch (#24168 )	2026-05-12 16:31:54 -07:00
copilot_auth.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
cron.py	feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709 )	2026-05-04 12:31:01 -07:00
curator.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
curses_ui.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
debug.py	fix(debug): redact log content at upload time in hermes debug share	2026-05-03 11:42:20 -07:00
default_soul.py	fix: reset default SOUL.md to baseline identity text (#3159 )	2026-03-26 01:34:27 -07:00
dingtalk_auth.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
doctor.py	feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220 )	2026-05-12 01:02:25 -07:00
dump.py	refactor(env): use shared Hermes dotenv loader	2026-05-05 10:13:13 -07:00
env_loader.py	feat(cross-platform): psutil for PID/process management + Windows footgun checker	2026-05-08 14:27:40 -07:00
fallback_cmd.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
gateway.py	fix(install): use `--extra all` not `--all-extras`; drop lazy-covered extras from [all] (#24515 )	2026-05-12 15:06:25 -07:00
gateway_windows.py	fix(gateway): preserve Ctrl+C for Windows foreground runs	2026-05-09 14:34:18 -07:00
goals.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
hooks.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
kanban.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
kanban_db.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
kanban_diagnostics.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
kanban_specify.py	feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks (#21435 )	2026-05-07 13:04:41 -07:00
logs.py	feat: component-separated logging with session context and filtering (#7991 )	2026-04-11 17:23:36 -07:00
main.py	feat(lsp): semantic diagnostics from real language servers in write_file/patch (#24168 )	2026-05-12 16:31:54 -07:00
mcp_config.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
memory_setup.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
model_catalog.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
model_normalize.py	fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802 )	2026-05-06 09:08:33 -07:00
model_switch.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
models.py	union paid recs from nous portal with static list (#24509 )	2026-05-12 12:16:17 -07:00
nous_subscription.py	feat(web): add SearXNG as a native search-only backend	2026-05-06 10:05:29 -07:00
oneshot.py	fix: make session search initialize session db	2026-05-09 14:36:58 -07:00
pairing.py	fix(pairing): enforce lockout on approve_code, not just generate_code (#10195 ) (#21325 )	2026-05-07 07:18:21 -07:00
platforms.py	feat: complete plugin platform parity — all 12 integration points	2026-04-29 21:56:51 -07:00
plugins.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
plugins_cmd.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
profile_distribution.py	feat(profile): shareable profile distributions via git (#20831 )	2026-05-08 10:04:32 -07:00
profiles.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
providers.py	fix: prevent bare 'custom' slug in model.provider (#17478 )	2026-04-30 04:32:11 -07:00
pt_input_extras.py	fix(cli): make Ctrl+Enter insert newline on WSL/SSH/Windows Terminal (#22777 )	2026-05-09 12:48:14 -07:00
pty_bridge.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
relaunch.py	fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch	2026-05-08 14:27:40 -07:00
runtime_provider.py	fix(minimax): harden OAuth dashboard and runtime	2026-05-11 22:15:16 -07:00
security_advisories.py	feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220 )	2026-05-12 01:02:25 -07:00
setup.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
skills_config.py	refactor(config): migrate remaining 33 cfg_get call sites (#17311 )	2026-04-29 04:03:03 -07:00
skills_hub.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
skin_engine.py	fix(tui): honor skin highlight colors (#20895 )	2026-05-06 14:01:56 -07:00
slack_cli.py	fix(slack): enable writable app home DMs in manifest	2026-05-08 17:01:12 -07:00
status.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
stdio.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
timeouts.py	refactor(timeouts): drop redundant ImportError in except clause	2026-04-26 20:48:20 -07:00
tips.py	feat: Ctrl+Enter inserts newline on Windows Terminal	2026-05-08 14:27:40 -07:00
tools_config.py	fix(deps): unbreak [all] install — drop mistralai while PyPI quarantined (#24205 )	2026-05-11 23:02:15 -07:00
uninstall.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
vercel_auth.py	feat: add Vercel Sandbox backend	2026-04-29 07:22:33 -07:00
voice.py	fix(tui): restore voice push-to-talk parity (#20897 )	2026-05-06 15:49:59 -07:00
web_server.py	feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220 )	2026-05-12 01:02:25 -07:00
webhook.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00