hermes-agent/gateway/platforms
Teknium e93bfc6c93 feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags
Second pass on native Windows support, driven by a systematic audit across
five areas: POSIX-only primitives (signal.SIGKILL/SIGHUP/SIGPIPE, os.WNOHANG,
os.setsid), path translation bugs (/c/Users → C:\Users), subprocess patterns
(npm.cmd batch shims, start_new_session no-op on Windows), subsystem health
(cron, gateway daemon, update flow), and module-level import guards.

Every change is platform-gated — POSIX (Linux/macOS) behaviour is preserved
bit-identical. Explicit "do no harm" test: test_posix_path_preserved_on_linux,
test_posix_noop, test_windows_detach_popen_kwargs_is_posix_equivalent_on_posix.

## New module

- hermes_cli/_subprocess_compat.py — shared helpers (resolve_node_command,
  windows_detach_flags, windows_hide_flags, windows_detach_popen_kwargs).
  All no-ops on non-Windows.

## CRITICAL fixes (would crash or silently break on Windows)

- tui_gateway/entry.py: SIGPIPE/SIGHUP referenced at module top level would
  AttributeError on import on Windows, breaking `hermes --tui` entirely (it
  spawns this module as a subprocess).  Guard each signal.signal() call with
  hasattr() and add SIGBREAK as Windows' SIGHUP equivalent.

- hermes_cli/kanban_db.py: os.waitpid(-1, os.WNOHANG) in dispatcher tick was
  unguarded.  os.WNOHANG doesn't exist on Windows.  Gate the whole reap loop
  behind `os.name != "nt"` — Windows has no zombies anyway.

- tools/code_execution_tool.py: AF_UNIX socket for execute_code RPC fails on
  most Windows builds.  Fall back to loopback TCP (AF_INET on 127.0.0.1:0
  ephemeral port) when _IS_WINDOWS.  HERMES_RPC_SOCKET env var now accepts
  either a filesystem path (POSIX) or `tcp://127.0.0.1:<port>` (Windows).
  Generated sandbox client parses both.

- cron/scheduler.py: `argv = ["/bin/bash", str(path)]` hardcoded.  Use
  shutil.which("bash") so Windows (Git Bash via MinGit) works, with a
  readable error when bash is genuinely absent.

- 6 bare npm/npx spawn sites: tools_config.py x2, doctor.py, whatsapp.py
  (npm install + node version probe), browser_tool.py x2.  On Windows npm
  is npm.cmd / npx is npx.cmd (batch shims); subprocess.Popen(["npm", ...])
  fails with WinError 193.  shutil.which(...) returns the absolute .cmd
  path which CreateProcessW accepts because the extension routes through
  cmd.exe /c.  POSIX behaviour unchanged (shutil.which still returns the
  same path subprocess would resolve itself).

## HIGH fixes (silent misbehaviour on Windows)

- tools/environments/local.py get_temp_dir: hardcoded /tmp returned on
  Windows meant `_cwd_file = "/tmp/hermes-cwd-*.txt"`, which bash wrote
  via MSYS2's virtual /tmp but native Python couldn't open.  Result: cwd
  tracking silently broken — `cd` in terminal tool did nothing.  Windows
  branch now returns `%HERMES_HOME%/cache/terminal` with forward slashes
  (works in both bash and Python, guaranteed no spaces).

- tools/environments/local.py _make_run_env PATH injection: `/usr/bin not
  in split(":")` heuristic mangles Windows PATH (";" separator).  Gate
  the injection behind `not _IS_WINDOWS`.

- hermes_cli/gateway.py launch_detached_profile_gateway_restart: outer
  Popen + watcher-script Popen both used start_new_session=True, which
  Windows silently ignores.  Watcher stayed attached to CLI's console,
  died when user closed terminal after `hermes update`, left gateway
  stale.  Now branches through windows_detach_popen_kwargs() helper
  (CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS | CREATE_NO_WINDOW on
  Windows, start_new_session=True on POSIX — identical to main).

## MEDIUM fixes

- gateway/run.py /restart and /update handlers: hardcoded bash/setsid
  chain crashes on Windows when user triggers /update in-gateway.  Now
  has sys.platform=="win32" branch using sys.executable + a tiny
  Python watcher with proper detach flags.  POSIX path is unchanged.

- cli.py _git_repo_root: Git on Windows sometimes returns /c/Users/...
  style paths that break subprocess.Popen(cwd=...) and Path().resolve().
  Added _normalize_git_bash_path() helper that translates /c/Users,
  /cygdrive/c, /mnt/c variants to native C:\Users form.  POSIX no-op.
  _git_repo_root() now routes every result through it.

- cli.py worktree .worktreeinclude: os.symlink on directories failed
  hard on Windows (requires admin or Developer Mode).  Falls back to
  shutil.copytree with a warning log.

## Tests

- 29 new tests in tests/tools/test_windows_native_support.py covering:
  subprocess_compat helpers, TUI entry signal guards, kanban waitpid
  guard, code_execution TCP fallback source-level invariants, cron bash
  resolution, npm/npx bare-spawn lint per-file, local env Windows temp
  dir, PATH injection gating, git bash path normalization, symlink
  fallback, gateway detached watcher flags.

- One existing test assertion adjusted in test_browser_homebrew_paths:
  it compared captured Popen argv to the BARE `"npx"` literal; after the
  shutil.which() change argv[0] is the absolute path.  New assertion
  checks the shape (two items, second is `agent-browser`) rather than
  the exact first-item string.  Behaviour unchanged; test was too strict.

All 56 tests pass on Linux (30 from previous commits + 26 new).
267 tests from the affected files/dirs (browser, code_exec, local_env,
process_registry, kanban_db, windows_compat) all pass — zero regressions.
tests/hermes_cli/ (3909 pass) and tests/gateway/ (5021 pass) unchanged;
all pre-existing test failures confirmed unrelated via `git stash` re-run.

## What's still deferred (LOW priority)

- Visible cmd-window flashes on short-lived console apps (~14 sites) —
  cosmetic, needs a follow-up pass once we have user reports.
- agent/file_safety.py POSIX-only security deny patterns — separate
  hardening task.
- tools/process_registry.py returning "/tmp" as fallback — theoretical;
  reachable only when all env-var candidates fail.
2026-05-08 14:27:40 -07:00
..
qqbot feat(qqbot): wire native tool-approval UX via inline keyboards 2026-05-07 07:48:15 -07:00
__init__.py yuanbao platform (#16298) 2026-04-26 18:50:49 -07:00
_http_client_limits.py fix(gateway): tighten httpx keepalive and close whatsapp typing-response leak (#18451) 2026-05-02 02:23:37 -07:00
ADDING_A_PLATFORM.md docs(platforms): document env_enablement_fn + cron_deliver_env_var hooks (#21331) 2026-05-07 07:36:42 -07:00
api_server.py feat(api-server): expose run approval events 2026-05-08 07:30:14 -07:00
base.py fix(gateway): defer goal status notices until after response delivery 2026-05-07 17:33:09 -07:00
bluebubbles.py fix(gateway): tighten httpx keepalive and close whatsapp typing-response leak (#18451) 2026-05-02 02:23:37 -07:00
dingtalk.py feat(gateway): add allowed_{chats,channels,rooms} whitelist to Telegram, Mattermost, Matrix, DingTalk 2026-05-07 06:54:29 -07:00
discord.py fix(discord): route DM role-auth opt-in through config.yaml (not env var) 2026-05-07 05:51:56 -07:00
email.py fix(email): drop non-allowlisted senders before dispatch to prevent mail loops 2026-05-04 12:35:22 -07:00
feishu.py fix(gateway): use monotonic deadlines in QR onboarding flows 2026-05-07 05:09:39 -07:00
feishu_comment.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
feishu_comment_rules.py fix(feishu-comment): use get_hermes_home(); drop dead asyncio wrapper; AUTHOR_MAP 2026-04-17 19:04:11 -07:00
helpers.py fix(gateway): ensure deterministic thread eviction in helpers 2026-05-05 10:13:55 -07:00
homeassistant.py fix(gateway): correct ws scheme conversion for https urls 2026-05-03 03:54:03 -07:00
matrix.py feat(gateway): add allowed_{chats,channels,rooms} whitelist to Telegram, Mattermost, Matrix, DingTalk 2026-05-07 06:54:29 -07:00
mattermost.py feat(gateway): add allowed_{chats,channels,rooms} whitelist to Telegram, Mattermost, Matrix, DingTalk 2026-05-07 06:54:29 -07:00
msgraph_webhook.py fix(msgraph_webhook): harden auth surface + IP allowlisting + response hygiene 2026-05-08 10:29:58 -07:00
signal.py fix(signal): skip reactions for unauthorized senders 2026-05-04 01:38:21 -07:00
signal_rate_limit.py feat(gateway/signal): add support for multiple images sending 2026-04-30 04:28:08 -07:00
slack.py feat(slack): add allowed_channels whitelist config 2026-05-07 06:54:29 -07:00
sms.py test(sms): use clear=True in test_missing_phone_number_is_non_retryable 2026-05-04 05:25:09 -07:00
telegram.py fix(telegram): preserve thread_id=1 for forum General typing indicator (#21390) 2026-05-07 08:39:21 -07:00
telegram_network.py fix(gateway): keep DoH-confirmed Telegram IPs that match system DNS (#14520) 2026-05-05 04:42:59 -07:00
webhook.py fix(webhook): widen INSECURE_NO_AUTH loopback check + tests + docs 2026-05-07 07:38:43 -07:00
wecom.py fix(gateway): use monotonic deadlines in QR onboarding flows 2026-05-07 05:09:39 -07:00
wecom_callback.py fix(gateway): tighten httpx keepalive and close whatsapp typing-response leak (#18451) 2026-05-02 02:23:37 -07:00
wecom_crypto.py feat(gateway): add WeCom callback-mode adapter for self-built apps 2026-04-11 15:22:49 -07:00
weixin.py fix(weixin): wrap long copy-unfriendly lines 2026-05-07 06:08:06 -07:00
whatsapp.py feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags 2026-05-08 14:27:40 -07:00
yuanbao.py fix(yuanbao): enforce owner identity check on group slash commands 2026-04-30 23:57:55 -07:00
yuanbao_media.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
yuanbao_proto.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
yuanbao_sticker.py yuanbao platform (#16298) 2026-04-26 18:50:49 -07:00