hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-09 03:11:58 +00:00

Author	SHA1	Message	Date
Teknium	cc38282b04	feat(cross-platform): psutil for PID/process management + Windows footgun checker ## Why Hermes supports Linux, macOS, and native Windows, but the codebase grew up POSIX-first and has accumulated patterns that silently break (or worse, silently kill!) on Windows: - `os.kill(pid, 0)` as a liveness probe — on Windows this maps to CTRL_C_EVENT and broadcasts Ctrl+C to the target's entire console process group (bpo-14484, open since 2012). - `os.killpg` — doesn't exist on Windows at all (AttributeError). - `os.setsid` / `os.getuid` / `os.geteuid` — same. - `signal.SIGKILL` / `signal.SIGHUP` / `signal.SIGUSR1` — module-attr errors at runtime on Windows. - `open(path)` / `open(path, "r")` without explicit encoding= — inherits the platform default, which is cp1252/mbcs on Windows (UTF-8 on POSIX), causing mojibake round-tripping between hosts. - `wmic` — removed from Windows 10 21H1+. This commit does three things: 1. Makes `psutil` a core dependency and migrates critical callsites to it. 2. Adds a grep-based CI gate (`scripts/check-windows-footguns.py`) that blocks new instances of any of the above patterns. 3. Fixes every existing instance in the codebase so the baseline is clean. ## What changed ### 1. psutil as a core dependency (pyproject.toml) Added `psutil>=5.9.0,<8` to core deps. psutil is the canonical cross-platform answer for "is this PID alive" and "kill this process tree" — its `pid_exists()` uses `OpenProcess + GetExitCodeProcess` on Windows (NOT a signal call), and its `Process.children(recursive=True)` + `.kill()` combo replaces `os.killpg()` portably. ### 2. `gateway/status.py::_pid_exists` Rewrote to call `psutil.pid_exists()` first, falling back to the hand-rolled ctypes `OpenProcess + WaitForSingleObject` dance on Windows (and `os.kill(pid, 0)` on POSIX) only if psutil is somehow missing — e.g. during the scaffold phase of a fresh install before pip finishes. ### 3. `os.killpg` migration to psutil (7 callsites, 5 files) - `tools/code_execution_tool.py` - `tools/process_registry.py` - `tools/tts_tool.py` - `tools/environments/local.py` (3 sites kept as-is, suppressed with `# windows-footgun: ok` — the pgid semantics psutil can't replicate, and the calls are already Windows-guarded at the outer branch) - `gateway/platforms/whatsapp.py` ### 4. `scripts/check-windows-footguns.py` (NEW, 500 lines) Grep-based checker with 11 rules covering every Windows cross-platform footgun we've hit so far: 1. `os.kill(pid, 0)` — the silent killer 2. `os.setsid` without guard 3. `os.killpg` (recommends psutil) 4. `os.getuid` / `os.geteuid` / `os.getgid` 5. `os.fork` 6. `signal.SIGKILL` 7. `signal.SIGHUP/SIGUSR1/SIGUSR2/SIGALRM/SIGCHLD/SIGPIPE/SIGQUIT` 8. `subprocess` shebang script invocation 9. `wmic` without `shutil.which` guard 10. Hardcoded `~/Desktop` (OneDrive trap) 11. `asyncio.add_signal_handler` without try/except 12. `open()` without `encoding=` on text mode Features: - Triple-quoted-docstring aware (won't flag prose inside docstrings) - Trailing-comment aware (won't flag mentions in `# os.kill(pid, 0)` comments) - Guard-hint aware (skips lines with `hasattr(os, ...)`, `shutil.which(...)`, `if platform.system() != 'Windows'`, etc.) - Inline suppression with `# windows-footgun: ok — <reason>` - `--list` to print all rules with fixes - `--all` / `--diff <ref>` / staged-files (default) modes - Scans 380 files in under 2 seconds ### 5. CI integration A GitHub Actions workflow that runs the checker on every PR and push is staged at `/tmp/hermes-stash/windows-footguns.yml` — not included in this commit because the GH token on the push machine lacks `workflow` scope. A maintainer with `workflow` permissions should add it as `.github/workflows/windows-footguns.yml` in a follow-up. Content: ```yaml name: Windows footgun check on: push: branches: [main] pull_request: branches: [main] jobs: check: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: actions/setup-python@v5 with: {python-version: "3.11"} - run: python scripts/check-windows-footguns.py --all ``` ### 6. CONTRIBUTING.md — "Cross-Platform Compatibility" expansion Expanded from 5 to 16 rules, each with message, example, and fix. Recommends psutil as the preferred API for PID / process-tree operations. ### 7. Baseline cleanup (91 → 0 findings) - 14 `open()` sites → added `encoding='utf-8'` (internal logs/caches) or `encoding='utf-8-sig'` (user-editable files that Notepad may BOM) - 23 POSIX-only callsites in systemd helpers, pty_bridge, and plugin tool subprocess management → annotated with `# windows-footgun: ok — <reason>` - 7 `os.killpg` sites → migrated to psutil (see §3 above) ## Verification ``` $ python scripts/check-windows-footguns.py --all ✓ No Windows footguns found (380 file(s) scanned). $ python -c "from gateway.status import _pid_exists; import os > print('self:', _pid_exists(os.getpid())); print('bogus:', _pid_exists(999999))" self: True bogus: False ``` Proof-of-repro that `os.kill(pid, 0)` was actually killing processes before this fix — see commit ``1cbe39914`` and bpo-14484. This commit removes the last hand-rolled ctypes path from the hot liveness-check path and defers to the best-maintained cross-platform answer.	2026-05-08 14:27:40 -07:00
Teknium	cbce5e93fc	codebase: add encoding='utf-8' to all bare open() calls (PLW1514) Closes the last Python-on-Windows UTF-8 exposure by making every text-mode open() call explicit about its encoding. Before: on Windows, bare open(path, 'r') defaults to the system locale encoding (cp1252 on US-locale installs). That means reading any config/yaml/markdown/json file with non-ASCII content either crashes with UnicodeDecodeError or silently mis-decodes bytes. After: all 89 affected call sites in production code now pass encoding='utf-8' explicitly. Works identically on every platform and every locale, no surprise behavior. Mechanical sweep via: ruff check --preview --extend-select PLW1514 --unsafe-fixes --fix --exclude 'tests,venv,.venv,node_modules,website,optional-skills, skills,tinker-atropos,plugins' . All 89 fixes have the same shape: open(x) or open(x, mode) became open(x, encoding='utf-8') or open(x, mode, encoding='utf-8'). Nothing else changed. Every modified file still parses and the Windows/sandbox test suite is still green (85 passed, 14 skipped, 0 failed across tests/tools/test_code_execution_windows_env.py + tests/tools/test_code_execution_modes.py + tests/tools/test_env_passthrough.py + tests/test_hermes_bootstrap.py). Scope notes: - tests/ excluded: test fixtures can use locale encoding intentionally (exercising edge cases). If we want to tighten tests later that's a separate PR. - plugins/ excluded: plugin-specific conventions may differ; plugin authors own their code. - optional-skills/ and skills/ excluded: skill scripts are user-authored and we don't want to mass-edit them. - website/ and tinker-atropos/ excluded: vendored / generated content. 46 files touched, 89 +/- lines (symmetric replacement). No behavior change on POSIX or on Windows when the file is ASCII; bug fix on Windows when the file contains non-ASCII.	2026-05-08 14:27:40 -07:00
web-dev0521	dfe512c58d	fix(paths): route achievements plugin + profile-tui through HERMES_HOME Four callsites hardcoded Path.home() / '.hermes' with no HERMES_HOME check, breaking Docker deployments and profile isolation (hermes -p): - plugins/hermes-achievements/dashboard/plugin_api.py: state_path(), snapshot_path(), checkpoint_path() bare-literal paths - scripts/profile-tui.py: DEFAULT_STATE_DB and DEFAULT_LOG defaults ignored HERMES_HOME - hermes_cli/slack_cli.py: except-Exception fallback for slack-manifest.json dump - optional-skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py: --target argparse default Use get_hermes_home() (with an ImportError shim for the standalone scripts) or 'os.environ.get("HERMES_HOME") or str(Path.home()/".hermes")' where importing hermes_constants is impractical. E2E-verified: with HERMES_HOME=/tmp/x all three achievements paths and both profile-tui defaults route under /tmp/x. Salvaged from #18068 (original scope was broader mechanical cleanup claiming 23 callsites were buggy; most were already respecting HERMES_HOME via os.environ.get(key, default) — only these 4 had no env check at all). Credit: @web-dev0521.	2026-04-30 23:21:54 -07:00
Brooklyn Nicholson	ffa33e53f6	chore(tui): remove dead branch cleanup code - drop unused TUI helpers, test-only layout scaffolding, and stale public debug exports - remove an unused profiler import and trim test-only coverage for deleted helpers	2026-04-26 21:54:24 -05:00
Brooklyn Nicholson	82f842277e	perf(tui): profile harness gains --loop, --save, --compare Before: change code → build → run profile → manually compare to mental model of last run. After: `--loop` watches ui-tui/src and packages/hermes-ink/src for .ts(x) changes, rebuilds on change, re-runs the same scenario, prints a side-by-side A/B diff against the previous iteration — so each edit's impact is quantified instantly. Ctrl+C to stop. Also added: --save LABEL saves metrics snapshot to /tmp/perf-<LABEL>.json --compare LABEL diffs the current run vs that snapshot --extra-flag X pass-through to node dist/entry.js (prepping for --no-fullscreen below) key_metrics() flattens a full run into scalar numbers across frames, React commits, and per-phase timings. format_diff() prints a table with ↑/↓ markers denoting regressions vs improvements based on whether the metric is lower-is-better (p99, max, patches, drain) or higher-is-better (fps, gaps_under_16ms). Run-to-run noise on static code is ~5-15% on most metrics — big signal (>30% change on renderer_p99 / fps) cuts through cleanly. Useful both for validating a single fix and for detecting subtle regressions during the wheel-accel port. Usage during the next perf session: # one-shot with a baseline for later comparison scripts/profile-tui.py --seconds 6 --hold wheel_up --save pre-accel # after porting the wheel handler scripts/profile-tui.py --seconds 6 --hold wheel_up --compare pre-accel # continuous iteration scripts/profile-tui.py --seconds 6 --hold wheel_up --loop	2026-04-26 17:08:07 -05:00
Brooklyn Nicholson	f823535db2	perf(tui): instrument stdout drain — rule out terminal parse bottleneck Adds four fields to FrameEvent.phases and the matching profile summary: optimizedPatches post-optimize patch count (what's actually written to stdout; the .patches field is pre-optimize) writeBytes UTF-8 byte count of the write this frame backpressure true when Node's stdout.write returned false (Writable buffer full — outer terminal can't keep up) prevFrameDrainMs end-to-end drain time of the PREVIOUS frame's write, captured from stdout.write's 2-arg callback. Reported on the next frame so the measurement reflects "time until OS flushed the bytes to the terminal fd", not "time until queued in Node". writeDiffToTerminal() now returns { bytes, backpressure } and accepts an optional onDrain callback. Only attached on TTY with diff; piped/non-TTY stdout bypasses flow control so the callback would fire synchronously anyway. Initial measurements under hold-wheel_up against 1106-msg session (30Hz for 6s): patches total 28,888 optimized total 16,700 (ratio 0.58 — optimizer cuts ~42%) writeBytes 42 KB / 10s = 4.2 KB/s throughput drainMs p50 0.14 ms terminal accepts bytes instantly drainMs p99 0.85 ms backpressure 0% of frames This rules out the terminal-parse hypothesis — Cursor's xterm.js drains our output in sub-millisecond time at only 4 KB/s. The remaining lag has to be in the render pipeline, not the wire. Profile output now includes the bytes+drain+backpressure lines to keep this visible on every subsequent iteration.	2026-04-26 17:06:22 -05:00
Brooklyn Nicholson	71eee26640	perf(tui): full-pipeline instrumentation + profiling harness Extends HERMES_DEV_PERF to capture the complete render pipeline, not just React commits. Adds scripts/profile-tui.py to drive repeatable hold-PageUp stress tests against a real long session. perfPane.tsx: Wires ink's onFrame callback (already plumbed through the fork) into the same perf.log as the React.Profiler samples. Captures per-phase timing (yoga calculateLayout, renderNodeToOutput, screen diff, patch optimize, stdout write) plus yoga counters (visited/measured/cache- Hits/live) and patch counts per frame. Events are tagged {src: 'react'\|'frame'} so jq can split them. logFrameEvent is undefined when HERMES_DEV_PERF is unset, so ink doesn't even attach the callback. entry.tsx: Passes logFrameEvent into render(). types/hermes-ink.d.ts: Declares FrameEvent + onFrame on RenderOptions so the ui-tui side type-checks against the plumbed-through ink option. scripts/profile-tui.py: New harness. Launches the built TUI under a PTY with the longest session in state.db resumed, holds PageUp/PageDown/etc at a configurable Hz for N seconds, then parses perf.log and prints per-phase p50/p95/p99/max plus yoga-counter summaries. Zero deps beyond stdlib. Exit 2 if nothing was captured (wiring broken). Initial findings (1106-msg session, 6s PageUp hold at 30Hz): - Steady state: 10 fps; renderer phase p99=63ms, write p99=0.2ms - 4/107 heavy frames (>=16ms), all dominated by renderNodeToOutput - One pathological 97ms frame with yoga measuring 70,415 text cells and Yoga visiting 225k nodes — the cold-unmeasured-region hit - Ink's scroll fast-path (DECSTBM blit from prevScreen) is disqualified because our spacer-based virtual history doesn't keep heightDelta in sync with scroll.delta, so every PageUp step falls through to a full 2000-4800 patch re-render instead of ~40	2026-04-26 16:36:25 -05:00

7 commits