mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-10 03:22:05 +00:00
feat(cross-platform): psutil for PID/process management + Windows footgun checker
## Why
Hermes supports Linux, macOS, and native Windows, but the codebase grew up
POSIX-first and has accumulated patterns that silently break (or worse,
silently kill!) on Windows:
- `os.kill(pid, 0)` as a liveness probe — on Windows this maps to
CTRL_C_EVENT and broadcasts Ctrl+C to the target's entire console
process group (bpo-14484, open since 2012).
- `os.killpg` — doesn't exist on Windows at all (AttributeError).
- `os.setsid` / `os.getuid` / `os.geteuid` — same.
- `signal.SIGKILL` / `signal.SIGHUP` / `signal.SIGUSR1` — module-attr
errors at runtime on Windows.
- `open(path)` / `open(path, "r")` without explicit encoding= — inherits
the platform default, which is cp1252/mbcs on Windows (UTF-8 on POSIX),
causing mojibake round-tripping between hosts.
- `wmic` — removed from Windows 10 21H1+.
This commit does three things:
1. Makes `psutil` a core dependency and migrates critical callsites to it.
2. Adds a grep-based CI gate (`scripts/check-windows-footguns.py`) that
blocks new instances of any of the above patterns.
3. Fixes every existing instance in the codebase so the baseline is clean.
## What changed
### 1. psutil as a core dependency (pyproject.toml)
Added `psutil>=5.9.0,<8` to core deps. psutil is the canonical
cross-platform answer for "is this PID alive" and "kill this process
tree" — its `pid_exists()` uses `OpenProcess + GetExitCodeProcess` on
Windows (NOT a signal call), and its `Process.children(recursive=True)`
+ `.kill()` combo replaces `os.killpg()` portably.
### 2. `gateway/status.py::_pid_exists`
Rewrote to call `psutil.pid_exists()` first, falling back to the
hand-rolled ctypes `OpenProcess + WaitForSingleObject` dance on Windows
(and `os.kill(pid, 0)` on POSIX) only if psutil is somehow missing —
e.g. during the scaffold phase of a fresh install before pip finishes.
### 3. `os.killpg` migration to psutil (7 callsites, 5 files)
- `tools/code_execution_tool.py`
- `tools/process_registry.py`
- `tools/tts_tool.py`
- `tools/environments/local.py` (3 sites kept as-is, suppressed with
`# windows-footgun: ok` — the pgid semantics psutil can't replicate,
and the calls are already Windows-guarded at the outer branch)
- `gateway/platforms/whatsapp.py`
### 4. `scripts/check-windows-footguns.py` (NEW, 500 lines)
Grep-based checker with 11 rules covering every Windows cross-platform
footgun we've hit so far:
1. `os.kill(pid, 0)` — the silent killer
2. `os.setsid` without guard
3. `os.killpg` (recommends psutil)
4. `os.getuid` / `os.geteuid` / `os.getgid`
5. `os.fork`
6. `signal.SIGKILL`
7. `signal.SIGHUP/SIGUSR1/SIGUSR2/SIGALRM/SIGCHLD/SIGPIPE/SIGQUIT`
8. `subprocess` shebang script invocation
9. `wmic` without `shutil.which` guard
10. Hardcoded `~/Desktop` (OneDrive trap)
11. `asyncio.add_signal_handler` without try/except
12. `open()` without `encoding=` on text mode
Features:
- Triple-quoted-docstring aware (won't flag prose inside docstrings)
- Trailing-comment aware (won't flag mentions in `# os.kill(pid, 0)` comments)
- Guard-hint aware (skips lines with `hasattr(os, ...)`,
`shutil.which(...)`, `if platform.system() != 'Windows'`, etc.)
- Inline suppression with `# windows-footgun: ok — <reason>`
- `--list` to print all rules with fixes
- `--all` / `--diff <ref>` / staged-files (default) modes
- Scans 380 files in under 2 seconds
### 5. CI integration
A GitHub Actions workflow that runs the checker on every PR and push is
staged at `/tmp/hermes-stash/windows-footguns.yml` — not included in this
commit because the GH token on the push machine lacks `workflow` scope.
A maintainer with `workflow` permissions should add it as
`.github/workflows/windows-footguns.yml` in a follow-up. Content:
```yaml
name: Windows footgun check
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with: {python-version: "3.11"}
- run: python scripts/check-windows-footguns.py --all
```
### 6. CONTRIBUTING.md — "Cross-Platform Compatibility" expansion
Expanded from 5 to 16 rules, each with message, example, and fix.
Recommends psutil as the preferred API for PID / process-tree operations.
### 7. Baseline cleanup (91 → 0 findings)
- 14 `open()` sites → added `encoding='utf-8'` (internal logs/caches) or
`encoding='utf-8-sig'` (user-editable files that Notepad may BOM)
- 23 POSIX-only callsites in systemd helpers, pty_bridge, and plugin
tool subprocess management → annotated with
`# windows-footgun: ok — <reason>`
- 7 `os.killpg` sites → migrated to psutil (see §3 above)
## Verification
```
$ python scripts/check-windows-footguns.py --all
✓ No Windows footguns found (380 file(s) scanned).
$ python -c "from gateway.status import _pid_exists; import os
> print('self:', _pid_exists(os.getpid())); print('bogus:', _pid_exists(999999))"
self: True
bogus: False
```
Proof-of-repro that `os.kill(pid, 0)` was actually killing processes
before this fix — see commit `1cbe39914` and bpo-14484. This commit
removes the last hand-rolled ctypes path from the hot liveness-check
path and defers to the best-maintained cross-platform answer.
This commit is contained in:
parent
324567c936
commit
cc38282b04
28 changed files with 943 additions and 85 deletions
|
|
@ -135,7 +135,7 @@ def discover_memory_providers() -> List[Tuple[str, str, bool]]:
|
|||
if yaml_file.exists():
|
||||
try:
|
||||
import yaml
|
||||
with open(yaml_file) as f:
|
||||
with open(yaml_file, encoding="utf-8-sig") as f:
|
||||
meta = yaml.safe_load(f) or {}
|
||||
desc = meta.get("description", "")
|
||||
except Exception:
|
||||
|
|
@ -381,7 +381,7 @@ def discover_plugin_cli_commands() -> List[dict]:
|
|||
if yaml_file.exists():
|
||||
try:
|
||||
import yaml
|
||||
with open(yaml_file) as f:
|
||||
with open(yaml_file, encoding="utf-8-sig") as f:
|
||||
meta = yaml.safe_load(f) or {}
|
||||
desc = meta.get("description", "")
|
||||
if desc:
|
||||
|
|
|
|||
|
|
@ -1215,7 +1215,7 @@ class HindsightMemoryProvider(MemoryProvider):
|
|||
# would capture output from other threads.
|
||||
import hindsight_embed.daemon_embed_manager as dem
|
||||
from rich.console import Console
|
||||
dem.console = Console(file=open(log_path, "a"), force_terminal=False)
|
||||
dem.console = Console(file=open(log_path, "a", encoding="utf-8"), force_terminal=False)
|
||||
|
||||
client = self._get_client()
|
||||
profile = self._config.get("profile", "hermes")
|
||||
|
|
@ -1231,15 +1231,15 @@ class HindsightMemoryProvider(MemoryProvider):
|
|||
if config_changed:
|
||||
profile_env = _materialize_embedded_profile_env(self._config)
|
||||
if client._manager.is_running(profile):
|
||||
with open(log_path, "a") as f:
|
||||
with open(log_path, "a", encoding="utf-8") as f:
|
||||
f.write("\n=== Config changed, restarting daemon ===\n")
|
||||
client._manager.stop(profile)
|
||||
|
||||
client._ensure_started()
|
||||
with open(log_path, "a") as f:
|
||||
with open(log_path, "a", encoding="utf-8") as f:
|
||||
f.write("\n=== Daemon started successfully ===\n")
|
||||
except Exception as e:
|
||||
with open(log_path, "a") as f:
|
||||
with open(log_path, "a", encoding="utf-8") as f:
|
||||
f.write(f"\n=== Daemon startup failed: {e} ===\n")
|
||||
traceback.print_exc(file=f)
|
||||
|
||||
|
|
|
|||
|
|
@ -101,7 +101,7 @@ def _load_plugin_config() -> dict:
|
|||
return {}
|
||||
try:
|
||||
import yaml
|
||||
with open(config_path) as f:
|
||||
with open(config_path, encoding="utf-8-sig") as f:
|
||||
all_config = yaml.safe_load(f) or {}
|
||||
return cfg_get(all_config, "plugins", "hermes-memory-store", default={}) or {}
|
||||
except Exception:
|
||||
|
|
@ -136,11 +136,11 @@ class HolographicMemoryProvider(MemoryProvider):
|
|||
import yaml
|
||||
existing = {}
|
||||
if config_path.exists():
|
||||
with open(config_path) as f:
|
||||
with open(config_path, encoding="utf-8-sig") as f:
|
||||
existing = yaml.safe_load(f) or {}
|
||||
existing.setdefault("plugins", {})
|
||||
existing["plugins"]["hermes-memory-store"] = values
|
||||
with open(config_path, "w") as f:
|
||||
with open(config_path, "w", encoding="utf-8") as f:
|
||||
yaml.dump(existing, f, default_flow_style=False)
|
||||
except Exception:
|
||||
pass
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue