* feat(lsp): semantic diagnostics from real language servers in write_file/patch
Wire ~26 language servers (pyright, gopls, rust-analyzer, typescript-language-server,
clangd, bash-language-server, ...) into the post-write lint check used by write_file
and patch. The model now sees type errors, undefined names, missing imports, and
project-wide semantic issues introduced by its edits, not just syntax errors.
LSP is gated on git workspace detection: when the agent's cwd or the file being
edited is inside a git worktree, LSP runs against that workspace; otherwise the
existing in-process syntax checks are the only tier. This keeps users on
user-home cwds (Telegram/Discord gateway chats) from spawning daemons.
The post-write check is layered: in-process syntax check first (microseconds),
then LSP semantic diagnostics second when syntax is clean. Diagnostics are
delta-filtered against a baseline captured at write start, so the agent only
sees errors its edit introduced. A flaky/missing language server can never
break a write -- every LSP failure path falls back silently to the syntax-only
result.
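The delta filter can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the ``Diagnostic`` shape and the ``delta_filter`` name are hypothetical, and it assumes diagnostics are compared by exact (line, column, code, message).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    # Hypothetical shape; the real client may key diagnostics differently.
    line: int
    column: int
    code: str
    message: str


def delta_filter(baseline, current):
    """Return the diagnostics in `current` that were not in `baseline`."""
    seen = set(baseline)
    return [d for d in current if d not in seen]


# A pre-existing error plus one introduced by the edit.
baseline = [Diagnostic(10, 1, "reportMissingImports", "Import could not be resolved")]
current = baseline + [Diagnostic(2, 12, "reportReturnType", "Type mismatch")]
introduced = delta_filter(baseline, current)
```

One caveat of exact-position matching: an edit that shifts a pre-existing error to a different line would make it look new. Whether the real filter compensates for that is not stated here.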
New module agent/lsp/ split into:
- protocol.py: Content-Length JSON-RPC framer + envelope helpers
- client.py: async LSPClient (spawn, initialize, didOpen/didChange,
ContentModified retry, push/pull diagnostic stores)
- workspace.py: git worktree walk-up + per-server NearestRoot resolver
- servers.py: registry of 26 language servers (extension match,
root resolver, spawn builder per language)
- install.py: auto-install dispatch (npm install --prefix, go install
with GOBIN, pip install --target) into HERMES_HOME/lsp/bin/
- manager.py: LSPService (per-(server_id, root) client registry, lazy
spawn, broken-set, in-flight dedupe, sync facade for tools layer)
- reporter.py: <diagnostics> block formatter (severity-1-only, 20-per-file)
- cli.py: hermes lsp {status,list,install,install-all,restart,which}
Wired into tools/file_operations.py:
- write_file/patch_replace now call _snapshot_lsp_baseline before write
- _check_lint_delta gains a third tier: LSP semantic diagnostics when
syntax is clean
- All LSP code paths swallow exceptions; write_file's contract unchanged
Config: 'lsp' section in DEFAULT_CONFIG with enabled (default true),
wait_mode, wait_timeout, install_strategy (default 'auto'), and per-server
overrides (disabled, command, env, initialization_options).
Tests: tests/agent/lsp/ -- 49 tests covering protocol framing (encode and
read_message round-trip, EOF/truncation/missing Content-Length), workspace
gate (git walk-up, exclude markers, fallback to file location), reporter
(severity filter, max-per-file cap, truncation), service-level delta filter,
and an in-process mock LSP server that exercises the full client lifecycle
including didChange version bumps, dedup, crash recovery, and idempotent
teardown.
Live E2E verified end-to-end through ShellFileOperations: pyright
auto-installed via npm into HERMES_HOME, baseline captured, type error
introduced, single delta diagnostic surfaced with correct line/column/code/
source, then patch fix removes the diagnostic from the output.
Docs: new website/docs/user-guide/features/lsp.md page covering supported
languages, configuration knobs, performance characteristics, and
troubleshooting; cli-commands.md updated with the 'hermes lsp' reference;
sidebar updated.
* feat(lsp): structured logging, backend gate, defensive walk caps
Cherry-picks the substantive ideas from #24155 (different scope, same
problem space) onto our PR.
agent/lsp/eventlog.py (new): dedicated structured logger
``hermes.lint.lsp`` with steady-state silence. Module-level dedup sets
keep a 1000-write session at exactly ONE INFO line ("active for
<root>") at the default INFO threshold; clean writes log at DEBUG so
they never reach agent.log under normal config. State transitions
(server starts, no project root for a file, server unavailable) fire
at INFO/WARNING once per (server_id, key); novel events (timeouts,
unexpected errors) fire WARNING per call. Grep recipe: ``rg 'lsp\['``.
agent/lsp/manager.py: wire the eventlog into _get_or_spawn and
get_diagnostics_sync so users can answer "did LSP fire on this edit?"
with a single grep, plus surface "binary not on PATH" warnings once
instead of silently retrying every write.
tools/file_operations.py: backend-type gate. ``_lsp_local_only()``
returns False for non-local backends (Docker / Modal / SSH /
Daytona); ``_snapshot_lsp_baseline`` and ``_maybe_lsp_diagnostics``
now skip entirely on remote envs. The host-side language server
can't see files inside a sandbox, so this prevents pretending to
lint a file the host process can't open.
agent/lsp/protocol.py: 8 KiB cap on the header block in
``read_message``. A pathological server that streams headers
without ever emitting CRLF-CRLF would have looped forever consuming
bytes; now raises ``LSPProtocolError`` instead.
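A synchronous sketch of a capped Content-Length reader, for illustration only. The real ``read_message`` is presumably async and may be structured differently; the ``encode`` helper and the byte-at-a-time loop are simplifications.

```python
import io
import json

MAX_HEADER_BYTES = 8 * 1024  # the 8 KiB cap described above


class LSPProtocolError(Exception):
    """Raised on malformed or runaway framing."""


def encode(payload):
    """Frame a JSON-RPC payload with a Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body


def read_message(stream):
    """Read one framed message from a binary stream."""
    header = b""
    while not header.endswith(b"\r\n\r\n"):
        chunk = stream.read(1)
        if not chunk:
            raise LSPProtocolError("EOF while reading headers")
        header += chunk
        if len(header) > MAX_HEADER_BYTES:
            # Without this check a server that never sends CRLF-CRLF
            # would keep us reading forever.
            raise LSPProtocolError("header block exceeds 8 KiB")
    length = None
    for line in header.decode("ascii", errors="replace").split("\r\n"):
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    if length is None:
        raise LSPProtocolError("missing Content-Length")
    body = stream.read(length)
    if len(body) != length:
        raise LSPProtocolError("truncated body")
    return json.loads(body)
```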
agent/lsp/workspace.py: 64-step cap on ``find_git_worktree`` and
``nearest_root`` upward walks, plus try/except containment around
``Path(...).resolve()`` and child ``.exists()`` calls. Defensive
against pathological inputs (symlink loops, encoding errors,
permission failures mid-walk) — the lint hook is hot-path code and
must never raise.
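A capped walk-up could be sketched like this. The function name matches the text, but the body is an assumption about how the cap and the try/except containment fit together, and it ignores the exclude markers the real resolver handles.

```python
from pathlib import Path
from typing import Optional

MAX_WALK_STEPS = 64  # the cap described above


def find_git_worktree(start: str) -> Optional[Path]:
    """Walk upward from `start` looking for a .git entry; never raises."""
    try:
        current = Path(start).resolve()
    except (OSError, RuntimeError, ValueError):
        return None
    for _ in range(MAX_WALK_STEPS):
        try:
            # .git is a directory in a normal checkout and a file in a
            # linked worktree; exists() covers both.
            if (current / ".git").exists():
                return current
        except (OSError, ValueError):
            return None
        if current.parent == current:
            return None  # reached the filesystem root
        current = current.parent
    return None
```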
Tests:
- tests/agent/lsp/test_eventlog.py: 18 tests covering steady-state
silence (clean writes stay DEBUG), state-transition INFO-once
semantics (active for, no project root), action-required
WARNING-once (server unavailable), per-call WARNING (timeouts,
spawn failures), and the "1000 clean writes => 1 INFO" contract.
- tests/agent/lsp/test_backend_gate.py: 5 tests verifying
_lsp_local_only / snapshot_baseline / maybe_lsp_diagnostics skip
the LSP layer for non-local backends and route correctly for
LocalEnvironment.
- tests/agent/lsp/test_protocol.py: new test_read_message_rejects_runaway_header
exercising the 8 KiB cap.
Validation:
- 73/73 LSP tests pass (49 original + 18 eventlog + 5 backend-gate + 1 framer cap)
- 198/198 pass when run alongside existing file_operations tests
- Live E2E re-run with pyright still surfaces "ERROR [2:12] Type
... reportReturnType (Pyright)" through the full path, then patch
fix removes it on the next call.
* feat(lsp): atexit cleanup + separate lsp_diagnostics JSON field
Two improvements salvaged from #24414's plugin-form alternative,
keeping our core-integrated design:
1. atexit cleanup of spawned language servers
----------------------------------------------------------------
``agent/lsp/__init__.get_service`` now registers an ``atexit``
handler on first creation that tears down the LSPService on
Python exit. Without this, every ``hermes chat`` exit was
leaking pyright/gopls/etc. processes for a few seconds while
their stdout buffers drained -- they got reaped by the kernel
eventually but a watchful ``ps aux`` would catch them.
The handler runs once per process (gated by
``_atexit_registered``); idempotent ``shutdown_service``
ensures double-fire is a no-op. Errors during shutdown are
swallowed at debug level since by the time atexit fires the
user has already seen the agent's final response.
2. Separate ``lsp_diagnostics`` field on WriteResult / PatchResult
----------------------------------------------------------------
Previously the LSP layer folded its diagnostic block into the
``lint.output`` string, conflating the syntax-check tier with
the semantic tier. The agent (and any downstream parsers) now
read syntax errors and semantic errors as independent signals:
{
"bytes_written": 42,
"lint": {"status": "ok", "output": ""},
"lsp_diagnostics": "<diagnostics file=...>\nERROR [2:12] ..."
}
``_check_lint_delta`` returns to its original two-tier shape
(syntax check + delta filter); ``write_file`` and
``patch_replace`` independently fetch LSP diagnostics via
``_maybe_lsp_diagnostics`` and pass them into the new field.
``patch_replace`` propagates the inner write_file's
``lsp_diagnostics`` so the outer PatchResult carries the patch's
delta correctly.
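A rough sketch of the two-channel result shape. The field names follow the JSON above; the defaults and the rule that ``to_dict`` omits an empty ``lsp_diagnostics`` are assumptions, not confirmed behaviour.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WriteResult:
    bytes_written: int
    # Syntax tier: in-process parse result.
    lint: dict = field(default_factory=lambda: {"status": "ok", "output": ""})
    # Semantic tier: formatted block from the language server, if any.
    lsp_diagnostics: Optional[str] = None

    def to_dict(self) -> dict:
        out = {"bytes_written": self.bytes_written, "lint": self.lint}
        # Assumed omit semantics: leave the key out when there is nothing to say.
        if self.lsp_diagnostics:
            out["lsp_diagnostics"] = self.lsp_diagnostics
        return out
```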
Tests: 19 new
- tests/agent/lsp/test_lifecycle.py (8 tests): atexit registration
fires once and only once across N get_service calls; the
registered callable is our internal shutdown wrapper;
shutdown_service is idempotent and safe when never started;
exceptions during shutdown are swallowed; inactive service is
cached so we don't rebuild on every check.
- tests/agent/lsp/test_diagnostics_field.py (11 tests): WriteResult
/ PatchResult dataclass shape, to_dict include/omit semantics,
channel separation (lint and lsp_diagnostics carry independent
signals), write_file populates the field via
_maybe_lsp_diagnostics only when the syntax tier is clean,
patch_replace propagates the field forward from its internal
write_file.
Validation:
- 92/92 LSP tests pass (73 prior + 8 lifecycle + 11 diagnostics field)
- 217/217 pass with file_operations + LSP combined
- Live E2E reverified: clean writes -> both fields empty/none; type
error introduced -> lint clean (parses), lsp_diagnostics carries
the pyright reportReturnType block; patch fix -> both fields
clean again.
* fix(lsp): broken-set short-circuit so a wedged server isn't paid every write
Discovered while auditing failure paths: a language server binary that
hangs (sleep forever, no LSP traffic on stdin/stdout) caused EVERY
subsequent write to re-pay the 8s snapshot_baseline timeout. Five
writes = ~64s of dead time.
The bug: ``_get_or_spawn`` adds the (server_id, root) pair to
``_broken`` inside its inner exception handler, but when the OUTER
``_loop.run`` timeout fires, it cancels the inner task before that
handler runs. The pair never makes it to broken-set, so the next
write re-enters the spawn path and re-pays the timeout.
Fix:
- New ``_mark_broken_for_file`` helper at the service layer marks
the (server_id, workspace_root) pair broken from the OUTSIDE when
the outer timeout fires. Called from the except branches in
``snapshot_baseline`` and ``get_diagnostics_sync`` (asyncio.TimeoutError
+ generic Exception). It also kills any orphan client process that
survived the cancelled future, fire-and-forget with a 1s ceiling.
- ``enabled_for`` now consults the broken-set BEFORE returning True.
Files in already-broken (server_id, root) pairs short-circuit to
False, so the file_operations layer skips the LSP path entirely
with no spawn cost. The pair stays broken until the service is
restarted (``hermes lsp restart``) or the process exits.
- A single eventlog WARNING is emitted on first mark-broken so the
user knows which server gave up. Subsequent edits in the same
project stay silent.
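The short-circuit can be illustrated with a small registry. ``BrokenRegistry`` and its method names are hypothetical; the real logic lives on the service and also cleans up orphan processes.

```python
import logging

logger = logging.getLogger("hermes.lint.lsp")


class BrokenRegistry:
    """Hypothetical stand-in for the service's broken-set bookkeeping."""

    def __init__(self):
        self._broken = set()

    def mark_broken(self, server_id, root):
        """Record the pair; return True only the first time (when we warn)."""
        key = (server_id, root)
        if key in self._broken:
            return False
        self._broken.add(key)
        logger.warning("lsp[%s] marked broken for %s", server_id, root)
        return True

    def enabled_for(self, server_id, root):
        """Consult the broken-set before anything that could spawn a server."""
        return (server_id, root) not in self._broken

    def restart(self):
        """Clear the set, as ``hermes lsp restart`` is described as doing."""
        self._broken.clear()
```

Keying on the (server_id, root) pair is what gives project isolation: a wedged server in one project leaves the same server usable in another.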
Tests: 7 new in tests/agent/lsp/test_broken_set.py — covers the
key shape (server_id, per_server_root), enabled_for short-circuit,
sibling-file skip in same project, project isolation (broken in
A doesn't affect B), graceful no-op for missing-server / no-workspace,
and an end-to-end test that snapshots after a failure and verifies
the next ``enabled_for`` returns False.
Validation:
- Live retest of the wedged-binary scenario: 5 sequential writes,
first 8.88s (the one snapshot timeout), subsequent four ~0.84s
(no LSP cost). Down from 5x12.85s = 64s before this fix.
- 99/99 LSP tests pass (92 prior + 7 broken-set)
- 224/224 pass with file_operations + LSP combined
- Happy path E2E reverified — clean write, type error introduced,
patch fix all behave correctly with the new broken-set logic.
Note: the FIRST write to a wedged binary still pays 8s (the
snapshot_baseline timeout). We could shorten that, but pyright/
tsserver normally take 2-3s and slow CI rust-analyzer can need
5+ seconds, so 8s is the conservative ceiling. Subsequent writes
are instant.
| sidebar_position | title | description |
|---|---|---|
| 16 | LSP — Semantic Diagnostics | Real language servers (pyright, gopls, rust-analyzer, …) wired into the post-write lint check used by write_file and patch. |
Language Server Protocol (LSP)
Hermes runs full language servers — pyright, gopls, rust-analyzer,
typescript-language-server, clangd, and ~20 more — as background
subprocesses and feeds their semantic diagnostics into the post-write
lint check used by write_file and patch. When the agent edits a
file, it sees exactly the errors that edit introduced — not just
syntax errors, but type errors, undefined names, missing imports,
and project-wide semantic issues the language server detects.
This is the same architecture top-tier coding agents use. Hermes ships it self-contained: no editor host required, no plugins to install, no separate daemon to manage.
When LSP runs
LSP is gated on git workspace detection. When the agent's working directory (or the file being edited) is inside a git worktree, LSP runs against that workspace. When neither is in a git repo, LSP stays dormant — useful for messaging gateways where the cwd is the user's home directory and there's no project to diagnose.
The check is layered: in-process syntax check first (microseconds), then LSP diagnostics second when syntax is clean. A flaky or missing language server can never break a write — every LSP failure path falls back silently to the syntax-only result.
Concretely, on every successful write_file or patch:
- Hermes captures a baseline of current diagnostics for the file.
- Performs the write.
- Re-queries the language server, filters out diagnostics that were already in the baseline, and surfaces only the new ones.
The agent sees output like:
{
"bytes_written": 42,
"dirs_created": false,
"lint": {"status": "ok", "output": ""},
"lsp_diagnostics": "LSP diagnostics introduced by this edit:\n<diagnostics file=\"/path/to/foo.py\">\nERROR [42:5] Cannot find name 'foo' [reportUndefinedVariable] (Pyright)\nERROR [50:1] Argument of type \"str\" is not assignable to \"int\" [reportArgumentType] (Pyright)\n</diagnostics>"
}
The lint field carries the syntax-check result (microsecond
in-process parse via ast.parse, json.loads, etc.); the
lsp_diagnostics field carries the semantic diagnostics from the
real language server. Two channels, independent signals — the
agent sees a syntax-clean file with semantic problems as
lint: ok plus a populated lsp_diagnostics.
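For illustration, a block like the one above could be produced from raw LSP diagnostics roughly as follows. The function name, the 1-based position conversion and the truncation line are assumptions; the severity-1-only filter and the 20-per-file cap come from the reporter's description.

```python
MAX_PER_FILE = 20  # per-file cap from the reporter description


def format_diagnostics(path, diagnostics):
    """Format LSP diagnostics (spec-shaped dicts) into a <diagnostics> block."""
    # Severity 1 is "Error" in the LSP specification; everything else is dropped.
    errors = [d for d in diagnostics if d.get("severity") == 1]
    if not errors:
        return ""
    lines = ['<diagnostics file="%s">' % path]
    for d in errors[:MAX_PER_FILE]:
        start = d["range"]["start"]
        # LSP positions are zero-based; assume the report shows them one-based.
        line = start["line"] + 1
        col = start["character"] + 1
        message = d["message"]
        code = d.get("code", "")
        source = d.get("source", "")
        lines.append(f"ERROR [{line}:{col}] {message} [{code}] ({source})")
    hidden = len(errors) - MAX_PER_FILE
    if hidden > 0:
        lines.append(f"... and {hidden} more")  # hypothetical truncation notice
    lines.append("</diagnostics>")
    return "\n".join(lines)
```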
Supported languages
| Language | Server | Auto-install |
|---|---|---|
| Python | pyright-langserver | npm |
| TypeScript / JavaScript / JSX / TSX | typescript-language-server | npm |
| Vue | @vue/language-server | npm |
| Svelte | svelte-language-server | npm |
| Astro | @astrojs/language-server | npm |
| Go | gopls | go install |
| Rust | rust-analyzer | manual (rustup) |
| C / C++ | clangd | manual (LLVM) |
| Bash / Zsh | bash-language-server | npm |
| YAML | yaml-language-server | npm |
| Lua | lua-language-server | manual (GitHub releases) |
| PHP | intelephense | npm |
| OCaml | ocaml-lsp | manual (opam) |
| Dockerfile | dockerfile-language-server-nodejs | npm |
| Terraform | terraform-ls | manual |
| Dart | dart language-server | manual (dart sdk) |
| Haskell | haskell-language-server | manual (ghcup) |
| Julia | julia + LanguageServer.jl | manual |
| Clojure | clojure-lsp | manual |
| Nix | nixd | manual |
| Zig | zls | manual |
| Gleam | gleam lsp | manual (gleam install) |
| Elixir | elixir-ls | manual |
| Prisma | prisma language-server | manual |
| Kotlin | kotlin-language-server | manual |
| Java | jdtls | manual |
For "manual" entries, install the server through whatever toolchain
manager makes sense for that language (rustup, ghcup, opam, brew,
…). Hermes auto-detects the binary on PATH or in
<HERMES_HOME>/lsp/bin/.
CLI
hermes lsp status # service state + per-server install status
hermes lsp list # registry, optionally --installed-only
hermes lsp install <id> # eagerly install one server
hermes lsp install-all # try every server with a known recipe
hermes lsp restart # tear down running clients
hermes lsp which <id> # print resolved binary path
hermes lsp status is the best starting point — it shows which
languages will get semantic diagnostics today and which need a
binary installed.
Configuration
The defaults work for typical setups; nothing to set if the binaries are on PATH.
# config.yaml
lsp:
  # Master toggle. Disabling skips the entire subsystem: no servers
  # spawn, no background event loop runs.
  enabled: true
  # How long to wait for diagnostics after each write.
  wait_mode: document # "document" or "full"
  wait_timeout: 5.0
  # How to handle missing server binaries.
  #   auto: install via npm/pip/go install into <HERMES_HOME>/lsp/bin
  #   manual: only use binaries already on PATH
  install_strategy: auto
  # Per-server overrides (all optional).
  servers:
    pyright:
      disabled: false
      command: ["/abs/path/to/pyright-langserver", "--stdio"]
      env: { PYRIGHT_LOG_LEVEL: "info" }
      initialization_options:
        python:
          analysis:
            typeCheckingMode: "strict"
    typescript:
      disabled: true # skip TS even when its extensions match
Per-server keys
- disabled: true -- skip this server entirely even when its extensions match a file.
- command: [bin, ...args] -- pin a custom binary path. Bypasses auto-install.
- env: {KEY: value} -- extra env vars passed to the spawned process.
- initialization_options: {...} -- merged into the LSP initializationOptions payload sent in the initialize handshake. Server-specific; consult the language server's docs.
Installation locations
When install_strategy: auto, Hermes installs binaries into
<HERMES_HOME>/lsp/bin/. NPM packages land in
<HERMES_HOME>/lsp/node_modules/ with bin symlinks one level up.
Go binaries come from go install with GOBIN pointed at the
staging dir.
Nothing is ever installed to /usr/local/, ~/.local/, or any other
shared location — the staging dir is fully Hermes-owned and is
removed when you reset the profile.
Performance characteristics
LSP servers are lazy-spawned on first use. Editing a Python file
in a project that's never seen .py traffic spawns pyright; the
spawn takes 1-3 seconds for most servers (rust-analyzer can take 10+
seconds on a cold project). Subsequent edits in the same workspace re-use
the running server.
The LSP layer adds a few milliseconds to clean writes when no
diagnostics are emitted. When diagnostics are emitted, the wait
budget is wait_timeout seconds — typically the server responds in
tens of milliseconds for pyright/tsserver and a few seconds for
rust-analyzer mid-indexing.
Servers are kept alive for the life of the Hermes process. There's no idle-timeout reaper — the cost of restarting the server's index on every write would be far higher than holding the daemon.
Disabling
Set lsp.enabled: false in config.yaml to disable the entire
subsystem. The post-write check falls back to the in-process syntax
check (ast.parse for Python, json.loads for JSON, etc.) which
ships unchanged from earlier versions.
To disable a single language without disabling the whole layer:
lsp:
  servers:
    rust-analyzer:
      disabled: true
Troubleshooting
hermes lsp status shows a server as "missing"
The binary isn't on PATH and isn't in <HERMES_HOME>/lsp/bin/. Run
hermes lsp install <server_id> to attempt an auto-install, or
install the binary manually through the language's normal toolchain.
Server starts but never returns diagnostics
Check ~/.hermes/logs/agent.log for [agent.lsp.client] entries —
both stderr from the language server and protocol errors land
there. Some servers (rust-analyzer especially) need to finish a
project-wide index before they emit per-file diagnostics; the first
edit after server start may complete with no diagnostics, with
subsequent edits picking them up.
Server crashed
A crashed server is added to the broken-set and won't be retried for
the rest of the session. Run hermes lsp restart to clear the set;
the next edit re-spawns.
Editing a file outside any git repo
By design, LSP only runs inside git worktrees. Run git init in the
project, or accept the in-process syntax-only fallback.