hermes-agent/hermes_cli
firefly ae94ed1728
fix(tui-gateway): reap leaked slash_worker sessions on disconnect + active_list liveness (re-scoped onto current main)
Salvaged from #35626 (banditburai) and re-scoped after maintainers landed the
parent-death watchdog (slash_worker.py) and PTY process-group teardown
(pty_bridge.py) directly on main. Those pieces are intentionally NOT included
here — this carries only what is still missing:

- C1 disconnect reap: ws.py's `finally` only re-pointed the dead transport at
  stdio. `_close_sessions_for_transport` now reaps `close_on_disconnect`
  sessions and schedules the grace-reap for the rest, offloaded via
  `asyncio.to_thread` so the blocking worker.close() + DB write never stalls
  the uvicorn loop.
- C2 create/close orphan race: `_attach_worker` stores the worker iff
  `_sessions.get(sid) is session` under the lock (else closes it), applied at
  every spawn site incl. the post-turn `_restart_slash_worker`.
- Single idempotent teardown funnel: session.close, WS disconnect, the
  generous-TTL idle reaper, shutdown, and the WS grace-reap all reach
  `_close_session_by_id` → `_teardown_session`; `_finalized`/`_closed` flags
  make concurrent/double teardown a no-op. `_sessions_lock` upgraded to RLock.
- uvicorn `ws_ping_interval/timeout=20s` so a half-open socket (reverse-proxy
  524) becomes a `WebSocketDisconnect` and the C1 path runs.

Plus three review-driven hardening fixes (mine):

- `session.active_list` now skips `_finalized` sessions so the footer
  "N sessions" count reflects attachable sessions instead of only ever
  growing until restart (#38950). Keys on `_finalized` only, NOT the stdio
  sentinel, so a standalone `hermes --tui` session stays visible.
- `_schedule_ws_orphan_reap._reap` pops via `_close_session_by_id`
  (under `_sessions_lock`) instead of `_sessions.pop` under the unrelated
  `_session_resume_lock` (#39591); the resume_lock now only guards the orphan
  re-check against `session.resume`.
- Float env knobs (`HERMES_SLASH_WATCHDOG_*`, `HERMES_TUI_SESSION_TTL_S`)
  parse with a fallback helper so a malformed value can't crash the worker at
  import.
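
A fallback parser for float env knobs might look like the sketch below. The helper name `env_float` and the default values are hypothetical; the finite-value check is an extra assumption that the message does not mention.

```python
import math
import os


def env_float(name, default):
    """Read a float from the environment, falling back to `default` on bad input."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        # Malformed value (e.g. "abc" or an empty string): keep the default
        # instead of raising at import time.
        return default
    # Assumption: NaN and infinity are treated as malformed as well.
    return value if math.isfinite(value) else default


# Hypothetical default; the real default for this knob is not stated in the message.
SESSION_TTL_S = env_float("HERMES_TUI_SESSION_TTL_S", 3600.0)
```

Because the helper never raises on a bad string, a module-level constant like `SESSION_TTL_S` can be evaluated safely at import.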

Fixes #32377
Fixes #38950
Addresses #22855

Co-authored-by: banditburai <123342691+banditburai@users.noreply.github.com>
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-06-08 10:02:05 -07:00
dashboard_auth fix(desktop): gate OAuth remote connect on AT-or-RT, not access token alone 2026-06-04 22:18:46 -07:00
proxy chore: remove dead code — 28 unused functions/classes across 16 files 2026-05-29 04:22:27 -07:00
subcommands refactor(cli): promote 9 closure handlers to top-level + extract their parsers (god-file Phase 2 follow-up) 2026-06-07 22:56:23 -07:00
__init__.py chore: release v0.16.0 (2026.6.5) (#40206) 2026-06-05 17:55:43 -07:00
_parser.py feat(cli): configurable default interface (cli vs tui) 2026-06-02 20:49:44 -05:00
_subprocess_compat.py fix(windows): retry watcher Popen without breakaway when parent job denies it, plus regression tests for the breakaway bit (#40956) 2026-06-07 01:21:58 -07:00
auth.py fix(auth): auto-detect OpenRouter credential from the pool, not just env (#42263) 2026-06-08 10:01:47 -07:00
auth_commands.py fix(auth): set active_provider after hermes auth add qwen-oauth 2026-06-04 05:58:33 -07:00
azure_detect.py feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
backup.py fix(cron): restore jobs.json emptied by config migration on update 2026-05-29 13:22:54 -07:00
banner.py fix(update-check): stop reporting phantom "N commits behind" inside Docker (#39559) 2026-06-05 15:37:19 +10:00
browser_connect.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
build_info.py fix(docker): bake build-time git SHA into the image 2026-05-28 15:14:05 +10:00
bundles.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
callbacks.py fix(cli): show masked feedback for secret prompts 2026-05-25 01:20:33 -07:00
checkpoints.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
claw.py fix: batch of small robustness/correctness fixes from @kyssta-exe 2026-06-01 19:51:03 -07:00
cli_agent_setup_mixin.py refactor(cli): extract agent-construction cluster into CLIAgentSetupMixin (god-file Phase 4) 2026-06-08 09:41:34 -07:00
cli_commands_mixin.py refactor(cli): extract 32 slash-command handlers into CLICommandsMixin (god-file Phase 4) 2026-06-08 02:13:07 -07:00
cli_output.py fix(cli): show masked feedback for secret prompts 2026-05-25 01:20:33 -07:00
clipboard.py fix(clipboard): only read PNG signature bytes, not entire file 2026-05-13 22:54:21 -07:00
codex_models.py fix(codex): drop dead model slugs that HTTP 400 on ChatGPT Pro (#33424) 2026-05-27 12:16:15 -07:00
codex_runtime_plugin_migration.py fix(codex-runtime): de-dup [plugins.X] tables and stop leaking HERMES_HOME into config.toml 2026-05-15 02:31:30 -07:00
codex_runtime_switch.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
colors.py
commands.py Add /version slash command across CLI, gateway, TUI, and desktop. 2026-06-05 18:05:05 -07:00
completion.py fix: batch of small robustness/correctness fixes from @kyssta-exe 2026-06-01 19:51:03 -07:00
config.py feat(onboarding): opt-in structured profile-build path on first contact (#41114) 2026-06-07 08:36:48 -07:00
container_boot.py fix(docker): seed s6 gateway state for legacy run cmd (#34829) 2026-06-01 11:28:56 +10:00
copilot_auth.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
cron.py fix(cron): don't crash on cron list when a job's repeat is null 2026-06-05 00:19:45 -07:00
curator.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
curses_ui.py feat(cli): ranked fuzzy search in the curses model picker 2026-06-01 16:58:58 -07:00
dashboard_register.py fix(dashboard): honor --portal-url / HERMES_DASHBOARD_PORTAL_URL override in register 2026-06-04 00:17:57 -07:00
debug.py feat(dashboard): add Debug Share to the System page (#38600) 2026-06-03 19:37:04 -07:00
default_soul.py
dep_ensure.py feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection (#27845) 2026-05-18 16:34:24 +05:30
dingtalk_auth.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
doctor.py fix: respect Honcho env var fallback in doctor and honcho status 2026-06-07 05:37:02 -07:00
dump.py fix(auth): auto-detect OpenRouter credential from the pool, not just env (#42263) 2026-06-08 10:01:47 -07:00
env_loader.py fix(secrets): only apply external secrets once per HERMES_HOME per process (#32271) 2026-05-25 15:18:55 -07:00
fallback_cmd.py fix(fallback): merge fallback_providers with legacy fallback_model configurations 2026-05-23 05:24:57 -07:00
fallback_config.py fix(fallback): merge fallback_providers with legacy fallback_model configurations 2026-05-23 05:24:57 -07:00
gateway.py fix(gateway): drop --replace from systemd unit templates (#41892) 2026-06-08 00:20:08 -07:00
gateway_windows.py fix(gateway,windows): reliability — JOB breakaway + status --deep probes + test-leak fix (#40909) 2026-06-06 19:53:58 -07:00
goals.py feat(kanban): goal_mode cards run workers in a /goal loop (#35710) 2026-05-31 01:16:33 -07:00
gui_uninstall.py feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI) (#40355) 2026-06-06 18:22:38 -07:00
hooks.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
inventory.py fix(inventory): avoid fresh Nous tier checks in picker payloads 2026-06-07 00:41:13 -07:00
kanban.py fix(kanban): isolate board override per concurrent call 2026-06-04 07:39:53 -07:00
kanban_db.py fix(kanban): sweep deferred scratch parent on non-scratch child completion + tests 2026-06-07 09:50:44 -07:00
kanban_decompose.py docs(kanban): clarify decomposer profile roles 2026-06-06 19:29:00 -07:00
kanban_diagnostics.py chore: remove dead code — 28 unused functions/classes across 16 files 2026-05-29 04:22:27 -07:00
kanban_specify.py fix: guard int(os.getenv()) casts against malformed env vars (#40598) 2026-06-07 06:14:24 -07:00
kanban_swarm.py fix(kanban): CLI dispatch honors max_in_progress/max_spawn from config; swap missing 'avoid-ai-writing' skill for bundled humanizer (#33488, #29415) (#34337) 2026-05-28 21:00:46 -07:00
logs.py feat(debug): include desktop.log in hermes debug share / /debug / hermes logs (#38203) 2026-06-03 05:41:35 -07:00
main.py refactor(cli): extract 18 model-flow wizard functions into model_setup_flows (god-file Phase 2) 2026-06-08 09:42:44 -07:00
managed_uv.py fix(update/windows): don't return _UvResult on Windows (subprocess argv crash) (#39820) 2026-06-05 07:54:08 -05:00
mcp_catalog.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
mcp_config.py fix(mcp): ensure server.shutdown() on probe iteration failure 2026-06-04 17:11:17 -07:00
mcp_picker.py feat(mcp): Nous-approved MCP catalog with interactive picker (#30870) 2026-05-26 12:48:14 -07:00
mcp_startup.py perf(cli): stop eager MCP discovery from blocking agent-capable startup 2026-05-30 07:45:26 -07:00
memory_setup.py fix(memory): fall back to pip when uv is unavailable (salvage #5954) (#38668) 2026-06-04 14:03:02 +10:00
middleware.py fix(middleware): single-use next_call guard + deepcopy-safe request copies 2026-06-06 23:07:25 +05:30
migrate.py feat(cli): hermes migrate xai [--apply] [--no-backup] 2026-05-20 09:18:23 -07:00
model_catalog.py feat(models): refresh model catalog hourly instead of daily (#35756) 2026-05-31 00:29:40 -07:00
model_normalize.py remove Vercel AI Gateway and Vercel Sandbox (#33067) 2026-05-27 00:43:32 -07:00
model_setup_flows.py refactor(cli): extract 18 model-flow wizard functions into model_setup_flows (god-file Phase 2) 2026-06-08 09:42:44 -07:00
model_switch.py refactor(inventory): make force_fresh_nous_tier keyword-only + pin contract 2026-06-07 00:41:13 -07:00
models.py fix(models): use deepseek-v4-flash as Nous silent default 2026-06-05 02:54:34 -07:00
nous_account.py feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011) 2026-06-06 13:18:18 +05:30
nous_subscription.py fix(cli): require Chromium for local browser readiness in setup/status surfaces 2026-06-05 04:06:17 -07:00
oneshot.py fix(cli): surface oneshot agent exceptions to stderr with rc=1 2026-05-30 07:31:48 -07:00
pairing.py fix(pairing): enforce lockout on approve_code, not just generate_code (#10195) (#21325) 2026-05-07 07:18:21 -07:00
partial_compress.py Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here' (#35048) 2026-05-29 17:49:15 -07:00
platforms.py feat: complete plugin platform parity — all 12 integration points 2026-04-29 21:56:51 -07:00
plugins.py feat(middleware): add adaptive execution intercepts 2026-06-03 11:22:06 -07:00
plugins_cmd.py fix(plugins): alias-normalize enable/disable for nested category plugins (follow-up to #41076) 2026-06-08 17:57:37 +05:30
portal_cli.py feat(cli): make hermes portal run the full quick-setup Nous flow (model picker) 2026-06-04 02:20:31 +05:30
profile_describer.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
profile_distribution.py fix(dist): stop USER_OWNED_EXCLUDE from filtering nested directories 2026-06-07 21:50:57 -07:00
profiles.py fix(profiles): skip 'default' in named profiles scan to prevent duplicates 2026-06-07 21:50:57 -07:00
prompt_size.py feat(cli): add hermes prompt-size diagnostic (#35276) 2026-05-30 02:53:42 -07:00
providers.py fix(model-picker): OpenAI shows curated models; OpenRouter no longer phantom-shows (#37404) 2026-06-02 06:31:37 -07:00
psutil_android.py fix(android): reject unsafe tar members in psutil compatibility installer 2026-05-28 02:36:09 -07:00
pt_input_extras.py fix(cli): ignore terminal focus reports (salvage of #16780) 2026-05-29 00:31:44 -07:00
pty_bridge.py fix(pty-bridge): mark os.killpg/getpgid windows-footgun-ok (POSIX-only module) 2026-06-08 07:03:12 -07:00
relaunch.py fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch 2026-05-08 14:27:40 -07:00
runtime_provider.py fix: guard int(os.getenv()) casts against malformed env vars (#40598) 2026-06-07 06:14:24 -07:00
secret_prompt.py fix(cli): show masked feedback for secret prompts 2026-05-25 01:20:33 -07:00
secrets_cli.py fix(secrets): fail early with clear error when bitwarden setup runs without TTY (#40571) 2026-06-06 18:36:40 -07:00
security_advisories.py fix(stt,tts): restore mistralai — 2.4.8 is clean, ban lifted (#34841) 2026-05-29 13:24:12 -07:00
security_audit.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
send_cmd.py fix(review): address Copilot follow-up on sanitizer and file decode errors 2026-05-16 23:00:58 -05:00
service_manager.py Remove prviliges drop when you never ran as root (#34837) 2026-06-01 13:54:18 +10:00
session_recap.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
setup.py fix(cli): require Chromium for local browser readiness in setup/status surfaces 2026-06-05 04:06:17 -07:00
skills_config.py refactor(config): migrate remaining 33 cfg_get call sites (#17311) 2026-04-29 04:03:03 -07:00
skills_hub.py fix(skills): browse shows full catalog, not first 5000 (#41413) 2026-06-07 10:15:31 -07:00
skin_engine.py fix(tui): improve charizard completion menu contrast 2026-05-18 20:05:23 -07:00
slack_cli.py fix(slack): enable writable app home DMs in manifest 2026-05-08 17:01:12 -07:00
status.py feat(cli): make hermes portal the human-readable Portal onboarding alias 2026-06-04 01:19:28 +05:30
stdio.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
telegram_managed_bot.py Add CLI Telegram QR onboarding 2026-06-05 03:20:10 -07:00
timeouts.py perf(agent-loop): cut 47% of per-conversation function calls via 3 targeted hot-path optimizations (#28866) 2026-05-19 14:25:10 -07:00
tips.py docs: align runtime footer field docs 2026-06-06 11:20:40 -06:00
tools_config.py fix(install): scope npm installs/audits to avoid pulling in apps/desktop 2026-06-06 18:22:20 -07:00
uninstall.py feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI) (#40355) 2026-06-06 18:22:38 -07:00
voice.py fix(tui): restore voice push-to-talk parity (#20897) 2026-05-06 15:49:59 -07:00
web_server.py fix(tui-gateway): reap leaked slash_worker sessions on disconnect + active_list liveness (re-scoped onto current main) 2026-06-08 10:02:05 -07:00
webhook.py fix(state): restrict sensitive store file permissions 2026-05-24 04:55:18 -07:00
xai_retirement.py fix(xai): align migrate retirement map with docs 2026-05-20 09:18:23 -07:00