mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-30 06:41:51 +00:00
5 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
5e743559e0
|
fix(lint): skip per-file shell linter when LSP will handle the file (#29054)
* fix(lint): skip per-file shell linter when LSP will handle the file `_check_lint` ran `npx tsc --noEmit FILE.ts` after every `.ts`/`.tsx` edit. `tsc` ignores `tsconfig.json` when given an explicit file argument (documented quirk) and defaults to no-lib / ES5, so every ES2015+ stdlib reference reports as missing: - `Cannot find global value 'Promise'` - `Cannot find name 'Map' / 'Set' / 'ReadonlySet' / 'Iterable'` - `Property 'isFinite' does not exist on type 'NumberConstructor'` - `Module 'phaser' can only be default-imported using esModuleInterop` - `import.meta is only allowed when --module is es2020+` On real TypeScript projects this floods the `lint` field on WriteResult / PatchResult with up to 25K tokens of false positives per edit. The delta filter in `_check_lint_delta` is supposed to mask them, but a tiny edit shifts line numbers and every phantom resurfaces as "introduced by this edit". The result is a 1MB+ phantom-error dump on every patch that eats the agent's context budget. Same shape for `.go` (`go vet` outside a module) and `.rs` (`rustfmt --check` outside a Cargo project). PR #24168 added an LSP tier on top of this — real `tsserver` / `gopls` / `rust-analyzer` diagnostics surface in the separate `lsp_diagnostics` field. But the broken shell linter kept running underneath, so the phantom-error dump kept happening even when LSP was giving us a clean authoritative signal. This change short-circuits the shell linter for the structurally-broken extensions (`.ts`, `.tsx`, `.go`, `.rs`) when an LSP server is active and claims the file via `LSPService.enabled_for(path)`. The LSP tier runs as before and carries the real diagnostics in `lsp_diagnostics`. Other shell linters (`py_compile`, `node --check`) keep running unconditionally — they're fast, file-local, and correct. Default behavior (LSP disabled, LSP misconfigured, remote backend, file outside a workspace) is unchanged — the existing fallback paths trigger when `_lsp_will_handle` returns False, so users who haven't opted into LSP get the same shell-linter behavior they had before. Drive-by: `.tsx` was missing from the `LINTERS` table entirely, so TS React files got no post-edit syntax check at all. Added it for symmetry; in practice it now hits the LSP-skip path. Tests: - `tests/agent/lsp/test_shell_linter_lsp_skip.py` — 14 tests covering: * skip happens for each redundant extension when LSP claims the file (asserted by patching `_exec` to raise on any shell-linter call) * shell linter still runs when LSP is inactive (regression guard) * `.py` / `.js` continue to run unconditionally even with LSP active * `_lsp_will_handle` is exception-safe: returns False on None service, remote backend, or `enabled_for` raising * `.tsx` is in both `LINTERS` and `_SHELL_LINTER_LSP_REDUNDANT` - All pre-existing tests in `tests/agent/lsp/` and `tests/tools/test_file_operations*.py` still pass (233/233). * fix(lint): address Copilot review on #29054 Two fixes from copilot-pull-request-reviewer on PR #29054: 1. `.tsx` regression with LSP disabled (https://github.com/NousResearch/hermes-agent/pull/29054#discussion_r3271017282) The first revision added `.tsx` to the `LINTERS` table so that TypeScript React files would hit the LSP skip path. Side effect: when LSP is *disabled* (the default), `.tsx` edits would suddenly run `npx tsc --noEmit FILE.tsx` and inherit the same phantom-error dump this PR is supposed to fix. Pre-PR behavior was implicit `skipped` (no `LINTERS` entry); restore that. - Remove `.tsx` from `LINTERS`. - Remove `.tsx` from `_SHELL_LINTER_LSP_REDUNDANT` (the skip path is unreachable without a `LINTERS` entry — falls through to `ext not in LINTERS` first). - When LSP IS enabled, `.tsx` is still covered by the LSP tier via `_maybe_lsp_diagnostics` (typescript-language-server's `extensions` tuple includes `.tsx`), so the diagnostics still surface — just on the `lsp_diagnostics` channel, not `lint`. - Update test_shell_linter_lsp_skip.py to reflect this contract (drop `.tsx` from the parametrize lists; add `test_tsx_stays_out_of_linters_table_for_default_compatibility` and `test_tsx_default_check_lint_returns_skipped`). 2. V4A patches dropped `WriteResult.lsp_diagnostics` (https://github.com/NousResearch/hermes-agent/pull/29054#discussion_r3271017295) `tools/patch_parser.py::apply_v4a_operations` calls `file_ops.write_file()` per operation, then calls `_check_lint()` directly afterwards — but never propagates `WriteResult.lsp_diagnostics` to the `PatchResult`. The shell-linter skip introduced in this PR makes the gap visible: a `.ts` / `.go` / `.rs` V4A patch with LSP active would return `lint = {f: {skipped: True}}` and zero diagnostics from any channel. - `_apply_add` and `_apply_update` now return `Tuple[bool, str, Optional[str]]` where the third element is `WriteResult.lsp_diagnostics` (or `None` on failure / no diags). - `_apply_delete` and `_apply_move` stay 2-tuples — they don't produce diagnostics, no write goes through `write_file`. - `apply_v4a_operations` accumulates per-file diagnostics blocks and surfaces a combined block on `PatchResult.lsp_diagnostics`. Each block already carries its `<diagnostics file="...">` header from `LSPService.report_for_file`, so concatenation preserves per-file attribution. Tests added (`test_patch_parser.py::TestV4ALspDiagnosticsPropagation`): - ADD op: `WriteResult.lsp_diagnostics` flows to `PatchResult` - UPDATE op: same - No diagnostics → `PatchResult.lsp_diagnostics is None` (not "") - Multi-file patch: combined block contains every per-file block Verification: - Targeted test scope: 257/257 pass (tests/agent/lsp/, tests/tools/test_file_operations*.py, tests/tools/test_patch_parser.py) - Wider sweep: 5400 pass; 11 failures all pre-existing on origin/main (file_staleness / file_read_guards / file_state_registry — unrelated macOS /var/folders tmp-path sensitivity issues, confirmed by re-running on a clean origin/main checkout) * docs(test): align shell-linter LSP skip docstring with .tsx behavior Copilot review feedback (review #4324947616, comment #3271049036): the test module docstring still listed .tsx alongside .ts/.go/.rs in the skip contract, but .tsx is now intentionally NOT in LINTERS or _SHELL_LINTER_LSP_REDUNDANT. Updated the bullet list to drop .tsx from the skip contract and added a paragraph documenting why .tsx is left out (preserves pre-PR implicit-skip behavior for LSP-disabled users; LSP coverage still happens via _maybe_lsp_diagnostics). * test(lsp): drop unused tmp_path from _make_fops helper Copilot review #3271069484: the helper accepted tmp_path but never used it. Callers still need tmp_path themselves for the file they're asserting against, so we just drop the helper's parameter. |
||
|
|
5fba236644
|
chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)
Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)). |
||
|
|
19071529f6
|
fix(lsp): shift baseline diagnostics into post-edit coordinates (#25978)
Pre-existing diagnostics below an edit point used to surface as 'LSP diagnostics introduced by this edit' whenever the edit deleted or inserted lines. The delta-filter key included the diagnostic's range, so the same logical error reported at a different line in the post-edit snapshot looked like a brand new diagnostic. Concrete case: deleting 14 lines in cli.py caused Pyright errors at lines 9873, 10590, 12413, 13004 (unrelated to the edit) to be reported as introduced by it. Fix: build a piecewise-linear line-shift map (via difflib's SequenceMatcher) from pre and post content, and remap baseline diagnostics into post-edit coordinates before the set-difference. Diagnostics in deleted regions drop out cleanly; diagnostics below the edit shift by the right amount; diagnostics above are untouched. The strict (range-aware) equality key stays — so a genuinely new instance of an identical error class at a different line still surfaces as new. Pieces: - agent/lsp/range_shift.py — build_line_shift, shift_diagnostic_range, shift_baseline. Pure functions, no LSP state. - agent/lsp/manager.py — LSPService.get_diagnostics_sync gains an optional line_shift kwarg; baseline is shift_baseline'd before computing the seen-set. _diag_key keeps the strict range key. - tools/file_operations.py — write_file captures pre_content for any LSP-handled extension (not just LINTERS_INPROC) and passes pre/post to _maybe_lsp_diagnostics, which builds the shift map. - New _lsp_handles_extension helper guards the pre_content read. Trade-offs preserved: - Genuinely new same-class errors at different lines still surface (content-only key would have swallowed them). - Pre-existing errors at unshifted positions still get filtered (covered by the strict-key path with no shift). - Best-effort: when pre_content can't be captured (file didn't exist, permissions), the unshifted comparison still catches most pre-existing errors; the edge case it misses is a new file with a non-empty baseline, which is structurally impossible. |
||
|
|
29c9ff9ba5
|
fix(lsp): typescript SDK install + tsc-missing skip + shellcheck warning (#24630)
Three follow-ups to PR #24168 found during live E2E testing on TS/bash files: 1. typescript-language-server now installs the typescript SDK (tsserver) alongside it. Without that sibling install, initialize() failed with "Could not find a valid TypeScript installation" and the server was marked broken — no diagnostics ever reached the agent. New extra_pkgs field on INSTALL_RECIPES makes that explicit and reusable for future peer-dep cases. 2. _check_lint now treats "linter command exists on PATH but cannot actually run" as skipped instead of error. The motivating case is npx tsc when typescript is not in node_modules — npx prints its "This is not the tsc command you are looking for" banner and exits non-zero, which previously blocked the LSP semantic tier (gated on success or skipped). Pattern-matched per base command (npx, rustfmt, go) so genuine lint errors still flow through normally. 3. hermes lsp status now surfaces a Backend warnings section when bash-language-server is installed but shellcheck is missing. The server itself spawns fine but bash-language-server delegates diagnostics to shellcheck — without it on PATH the integration looks alive but never reports any problems. Same warning is logged once at server spawn time. Validation: - 12 new tests in tests/agent/lsp/test_install_and_lint_fixes.py: * recipe carries typescript SDK * _install_npm passes both pkg + extras to npm CLI * backwards compat: recipes without extras still work * _backend_warnings quiet when bash absent / both present * _backend_warnings fires when bash installed without shellcheck * status output includes the Backend warnings section * _looks_like_linter_unusable catches the npx tsc banner * real TS type errors not misclassified as unusable * unfamiliar linters fall through normally * _check_lint returns skipped on npx tsc unusable * _check_lint returns error on real tsc type errors - Full lsp + file_operations test suite: 245/245 pass - Live E2E: * try_install("typescript-language-server") installs both packages into node_modules * write_file(bad.ts, ...) returns lint=skipped + lsp_diagnostics with two real TS errors (was lint=error, no lsp_diagnostics) * hermes lsp status renders the shellcheck warning when bash is installed but shellcheck is not on PATH |
||
|
|
83b93898c2
|
feat(lsp): semantic diagnostics from real language servers in write_file/patch (#24168)
* feat(lsp): semantic diagnostics from real language servers in write_file/patch
Wire ~26 language servers (pyright, gopls, rust-analyzer, typescript-language-server,
clangd, bash-language-server, ...) into the post-write lint check used by write_file
and patch. The model now sees type errors, undefined names, missing imports, and
project-wide semantic issues introduced by its edits, not just syntax errors.
LSP is gated on git workspace detection: when the agent's cwd or the file being
edited is inside a git worktree, LSP runs against that workspace; otherwise the
existing in-process syntax checks are the only tier. This keeps users on
user-home cwds (Telegram/Discord gateway chats) from spawning daemons.
The post-write check is layered: in-process syntax check first (microseconds),
then LSP semantic diagnostics second when syntax is clean. Diagnostics are
delta-filtered against a baseline captured at write start, so the agent only
sees errors its edit introduced. A flaky/missing language server can never
break a write -- every LSP failure path falls back silently to the syntax-only
result.
New module agent/lsp/ split into:
- protocol.py: Content-Length JSON-RPC framer + envelope helpers
- client.py: async LSPClient (spawn, initialize, didOpen/didChange,
ContentModified retry, push/pull diagnostic stores)
- workspace.py: git worktree walk-up + per-server NearestRoot resolver
- servers.py: registry of 26 language servers (extension match,
root resolver, spawn builder per language)
- install.py: auto-install dispatch (npm install --prefix, go install
with GOBIN, pip install --target) into HERMES_HOME/lsp/bin/
- manager.py: LSPService (per-(server_id, root) client registry, lazy
spawn, broken-set, in-flight dedupe, sync facade for tools layer)
- reporter.py: <diagnostics> block formatter (severity-1-only, 20-per-file)
- cli.py: hermes lsp {status,list,install,install-all,restart,which}
Wired into tools/file_operations.py:
- write_file/patch_replace now call _snapshot_lsp_baseline before write
- _check_lint_delta gains a third tier: LSP semantic diagnostics when
syntax is clean
- All LSP code paths swallow exceptions; write_file's contract unchanged
Config: 'lsp' section in DEFAULT_CONFIG with enabled (default true),
wait_mode, wait_timeout, install_strategy (default 'auto'), and per-server
overrides (disabled, command, env, initialization_options).
Tests: tests/agent/lsp/ -- 49 tests covering protocol framing (encode and
read_message round-trip, EOF/truncation/missing Content-Length), workspace
gate (git walk-up, exclude markers, fallback to file location), reporter
(severity filter, max-per-file cap, truncation), service-level delta filter,
and an in-process mock LSP server that exercises the full client lifecycle
including didChange version bumps, dedup, crash recovery, and idempotent
teardown.
Live E2E verified end-to-end through ShellFileOperations: pyright
auto-installed via npm into HERMES_HOME, baseline captured, type error
introduced, single delta diagnostic surfaced with correct line/column/code/
source, then patch fix removes the diagnostic from the output.
Docs: new website/docs/user-guide/features/lsp.md page covering supported
languages, configuration knobs, performance characteristics, and
troubleshooting; cli-commands.md updated with the 'hermes lsp' reference;
sidebar updated.
* feat(lsp): structured logging, backend gate, defensive walk caps
Cherry-picks the substantive ideas from #24155 (different scope, same
problem space) onto our PR.
agent/lsp/eventlog.py (new): dedicated structured logger
``hermes.lint.lsp`` with steady-state silence. Module-level dedup sets
keep a 1000-write session at exactly ONE INFO line ("active for
<root>") at the default INFO threshold; clean writes log at DEBUG so
they never reach agent.log under normal config. State transitions
(server starts, no project root for a file, server unavailable) fire
at INFO/WARNING once per (server_id, key); novel events (timeouts,
unexpected errors) fire WARNING per call. Grep recipe: ``rg 'lsp\\['``.
agent/lsp/manager.py: wire the eventlog into _get_or_spawn and
get_diagnostics_sync so users can answer "did LSP fire on this edit?"
with a single grep, plus surface "binary not on PATH" warnings once
instead of silently retrying every write.
tools/file_operations.py: backend-type gate. ``_lsp_local_only()``
returns False for non-local backends (Docker / Modal / SSH /
Daytona); ``_snapshot_lsp_baseline`` and ``_maybe_lsp_diagnostics``
now skip entirely on remote envs. The host-side language server
can't see files inside a sandbox, so this prevents pretending to
lint a file the host process can't open.
agent/lsp/protocol.py: 8 KiB cap on the header block in
``read_message``. A pathological server that streams headers
without ever emitting CRLF-CRLF would have looped forever consuming
bytes; now raises ``LSPProtocolError`` instead.
agent/lsp/workspace.py: 64-step cap on ``find_git_worktree`` and
``nearest_root`` upward walks, plus try/except containment around
``Path(...).resolve()`` and child ``.exists()`` calls. Defensive
against pathological inputs (symlink loops, encoding errors,
permission failures mid-walk) — the lint hook is hot-path code and
must never raise.
Tests:
- tests/agent/lsp/test_eventlog.py: 18 tests covering steady-state
silence (clean writes stay DEBUG), state-transition INFO-once
semantics (active for, no project root), action-required
WARNING-once (server unavailable), per-call WARNING (timeouts,
spawn failures), and the "1000 clean writes => 1 INFO" contract.
- tests/agent/lsp/test_backend_gate.py: 5 tests verifying
_lsp_local_only / snapshot_baseline / maybe_lsp_diagnostics skip
the LSP layer for non-local backends and route correctly for
LocalEnvironment.
- tests/agent/lsp/test_protocol.py: new test_read_message_rejects_runaway_header
exercising the 8 KiB cap.
Validation:
- 73/73 LSP tests pass (49 original + 18 eventlog + 5 backend-gate + 1 framer cap)
- 198/198 pass when run alongside existing file_operations tests
- Live E2E re-run with pyright still surfaces "ERROR [2:12] Type
... reportReturnType (Pyright)" through the full path, then patch
fix removes it on the next call.
* feat(lsp): atexit cleanup + separate lsp_diagnostics JSON field
Two improvements salvaged from #24414's plugin-form alternative,
keeping our core-integrated design:
1. atexit cleanup of spawned language servers
----------------------------------------------------------------
``agent/lsp/__init__.get_service`` now registers an ``atexit``
handler on first creation that tears down the LSPService on
Python exit. Without this, every ``hermes chat`` exit was
leaking pyright/gopls/etc. processes for a few seconds while
their stdout buffers drained -- they got reaped by the kernel
eventually but a watchful ``ps aux`` would catch them.
The handler runs once per process (gated by
``_atexit_registered``); idempotent ``shutdown_service``
ensures double-fire is a no-op. Errors during shutdown are
swallowed at debug level since by the time atexit fires the
user has already seen the agent's final response.
2. Separate ``lsp_diagnostics`` field on WriteResult / PatchResult
----------------------------------------------------------------
Previously the LSP layer folded its diagnostic block into the
``lint.output`` string, conflating the syntax-check tier with
the semantic tier. The agent (and any downstream parsers) now
read syntax errors and semantic errors as independent signals:
{
"bytes_written": 42,
"lint": {"status": "ok", "output": ""},
"lsp_diagnostics": "<diagnostics file=...>\nERROR [2:12] ..."
}
``_check_lint_delta`` returns to its original two-tier shape
(syntax check + delta filter); ``write_file`` and
``patch_replace`` independently fetch LSP diagnostics via
``_maybe_lsp_diagnostics`` and pass them into the new field.
``patch_replace`` propagates the inner write_file's
``lsp_diagnostics`` so the outer PatchResult carries the patch's
delta correctly.
Tests: 19 new
- tests/agent/lsp/test_lifecycle.py (8 tests): atexit registration
fires once and only once across N get_service calls; the
registered callable is our internal shutdown wrapper;
shutdown_service is idempotent and safe when never started;
exceptions during shutdown are swallowed; inactive service is
cached so we don't rebuild on every check.
- tests/agent/lsp/test_diagnostics_field.py (11 tests): WriteResult
/ PatchResult dataclass shape, to_dict include/omit semantics,
channel separation (lint and lsp_diagnostics carry independent
signals), write_file populates the field via
_maybe_lsp_diagnostics only when the syntax tier is clean,
patch_replace propagates the field forward from its internal
write_file.
Validation:
- 92/92 LSP tests pass (73 prior + 8 lifecycle + 11 diagnostics field)
- 217/217 pass with file_operations + LSP combined
- Live E2E reverified: clean writes -> both fields empty/none; type
error introduced -> lint clean (parses), lsp_diagnostics carries
the pyright reportReturnType block; patch fix -> both fields
clean again.
* fix(lsp): broken-set short-circuit so a wedged server isn't paid every write
Discovered while auditing failure paths: a language server binary that
hangs (sleep forever, no LSP traffic on stdin/stdout) caused EVERY
subsequent write to re-pay the 8s snapshot_baseline timeout. Five
writes = ~64s of dead time.
The bug: ``_get_or_spawn`` adds the (server_id, root) pair to
``_broken`` inside its inner exception handler, but when the OUTER
``_loop.run`` timeout fires, it cancels the inner task before that
handler runs. The pair never makes it to broken-set, so the next
write re-enters the spawn path and re-pays the timeout.
Fix:
- New ``_mark_broken_for_file`` helper at the service layer marks
the (server_id, workspace_root) pair broken from the OUTSIDE when
the outer timeout fires. Called from the except branches in
``snapshot_baseline``, ``get_diagnostics_sync`` (asyncio.TimeoutError
+ generic Exception). Also kills any orphan client process that
survived the cancelled future, fire-and-forget with a 1s ceiling.
- ``enabled_for`` now consults the broken-set BEFORE returning True.
Files in already-broken (server_id, root) pairs short-circuit to
False, so the file_operations layer skips the LSP path entirely
with no spawn cost. Until the service is restarted (``hermes lsp
restart``) or the process exits.
- A single eventlog WARNING is emitted on first mark-broken so the
user knows which server gave up. Subsequent edits in the same
project stay silent.
Tests: 7 new in tests/agent/lsp/test_broken_set.py — covers the
key shape (server_id, per_server_root), enabled_for short-circuit,
sibling-file skip in same project, project isolation (broken in
A doesn't affect B), graceful no-op for missing-server / no-workspace,
and an end-to-end test that snapshots after a failure and verifies
the next ``enabled_for`` returns False.
Validation:
- Live retest of the wedged-binary scenario: 5 sequential writes,
first 8.88s (the one snapshot timeout), subsequent four ~0.84s
(no LSP cost). Down from 5x12.85s = 64s before this fix.
- 99/99 LSP tests pass (92 prior + 7 broken-set)
- 224/224 pass with file_operations + LSP combined
- Happy path E2E reverified — clean write, type error introduced,
patch fix all behave correctly with the new broken-set logic.
Note: the FIRST write to a wedged binary still pays 8s (the
snapshot_baseline timeout). We could shorten that, but pyright/
tsserver normally take 2-3s and slow CI rust-analyzer can need
5+ seconds, so 8s is the conservative ceiling. Subsequent writes
are instant.
|