hermes-agent/hermes_cli
Ben 19b2624404 feat(gateway): external drain trigger + accept-gating (begin/cancel + control channel)
Tasks 2.1 + 2.2 + 2.3 of the safe-shutdown plan — the reversible
quiesce-without-restart machinery NAS drives during a lifecycle action (D4a).
These ship together because the endpoint, the control channel, and the gateway
state machine are one coherent slice.

2.2 — control channel (gateway/drain_control.py, new):
The dashboard has no HTTP path into a running gateway (guardrails: "there is NO
external control channel into a running gateway"); restart/drain is driven only
by markers the gateway reacts to. So begin/cancel-drain writes/removes a
presence-based marker .drain_request.json (HERMES_HOME-scoped, atomic write,
never-raises read; a corrupt marker reads as present-contentless → fail-safe
toward quiescing). This is Q-B option A.

2.2 — gateway state machine (gateway/run.py):
- _external_drain_active flag, DISTINCT from the shutdown _draining flag: this
  one does NOT exit the process and is fully reversible.
- _enter_external_drain / _exit_external_drain: idempotent transitions that
  flip gateway_state→draining / →running via _update_runtime_status (preserving
  the live active_agents count). exit refuses to revert to running during a
  real shutdown or after the loop stops (shutdown wins).
- _drain_control_watcher: 1s background task (modelled on _handoff_watcher)
  reconciling accept-state with the marker; honours a marker that survived a
  restart on its first tick. Registered alongside the other watchers in start.
- New-turn accept gate in _handle_message, placed BEFORE the session-slot
  claim: when draining, refuse to START a new turn (so active_agents can only
  fall → no TOCTOU race), while in-flight turns finish untouched. Internal/
  system events (restart-recovery replays, bg-process completions) bypass it.

2.1 — endpoint (hermes_cli/web_server.py):
POST /api/gateway/drain {action: drain|cancel}. Authenticated by the Task-2.0a
token seam (the drain plugin registered this exact path as a token route);
attributes the request to the verified token principal. Begin writes the
marker, cancel removes it — the gateway process owns the actual transition.
Force-override (D6) is NOT here; it maps onto the existing immediate
/api/gateway/restart force path.

Tests (mocked — necessary-not-sufficient; the HARD live gate Q-B is next):
- tests/gateway/test_external_drain_control.py — marker contract (write/clear/
  read/corrupt/atomic), state machine (enter/exit/idempotency/shutdown-wins/
  loop-stopped), watcher reconcile-enter-then-exit, new-turn refusal, and
  in-flight-not-interrupted. 15 tests.
- tests/hermes_cli/test_web_server.py — /api/gateway/drain begin/default-begin/
  cancel/cancel-idempotent/bad-action-400. 6 tests.
- dashboard.drain_auth config section already added in 2.0b commit.

All touched suites green: 301 (gateway+auth) + 9 (web_server endpoints) passed.

Intentionally deferred:
- HARD live-validation gate (Q-B): real isolated `hermes gateway run`, drive a
  real begin-drain marker, prove the 5-point checklist a–e.
- Spec-doc status flip + Phase-2 PR.

Build status: external-drain, restart-drain, status, dashboard-auth, drain-plugin,
token-auth, and web_server-endpoint suites green.
2026-06-26 00:47:19 -07:00
..
dashboard_auth feat(dashboard-auth): generic non-interactive API-token capability 2026-06-26 00:47:19 -07:00
proxy fix(auth): honor NOUS_INFERENCE_BASE_URL env override for Nous OAuth sessions (#52270) 2026-06-25 00:11:15 -07:00
subcommands fix(mcp): auto-recover from invalid_client on stale OAuth client registration 2026-06-26 00:35:27 -07:00
__init__.py chore: release v0.17.0 (2026.6.19) 2026-06-19 12:38:31 -07:00
_parser.py fix(desktop): keep composer usable during reconnect (#45488) 2026-06-13 02:36:09 -07:00
_subprocess_compat.py fix(windows): retry watcher Popen without breakaway when parent job denies it, plus regression tests for the breakaway bit (#40956) 2026-06-07 01:21:58 -07:00
active_sessions.py fix(tui): preserve live session identity across compression (#49041) 2026-06-24 00:54:18 +05:30
auth.py fix(auth): honor NOUS_INFERENCE_BASE_URL env override for Nous OAuth sessions (#52270) 2026-06-25 00:11:15 -07:00
auth_commands.py feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492) 2026-06-21 19:53:27 -07:00
azure_detect.py feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
backup.py fix(cron): detect partial job loss in restore_cron_jobs_if_emptied (#52144) 2026-06-25 18:49:18 -07:00
banner.py fix(update): don't count across shallow-clone boundary (bogus '12492 commits behind') (#50784) 2026-06-22 05:39:11 -07:00
blueprint_cmd.py refactor(cron): rebrand Cron Recipes -> Automation Blueprints 2026-06-11 10:49:47 -07:00
browser_connect.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
build_info.py fix(docker): bake build-time git SHA into the image 2026-05-28 15:14:05 +10:00
bundles.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
callbacks.py fix(cli): show masked feedback for secret prompts 2026-05-25 01:20:33 -07:00
checkpoints.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
claw.py fix: batch of small robustness/correctness fixes from @kyssta-exe 2026-06-01 19:51:03 -07:00
cli_agent_setup_mixin.py fix(cli): publish agent ref to cli module so memory on_session_end fires on exit 2026-06-19 16:59:43 -07:00
cli_commands_mixin.py feat(pets): polish generate flow and reduce hatch CPU pressure 2026-06-24 19:08:06 -05:00
cli_output.py fix(cli): show masked feedback for secret prompts 2026-05-25 01:20:33 -07:00
clipboard.py fix(clipboard): only read PNG signature bytes, not entire file 2026-05-13 22:54:21 -07:00
codex_models.py fix(codex): drop dead model slugs that HTTP 400 on ChatGPT Pro (#33424) 2026-05-27 12:16:15 -07:00
codex_runtime_plugin_migration.py fix(codex-runtime): de-dup [plugins.X] tables and stop leaking HERMES_HOME into config.toml 2026-05-15 02:31:30 -07:00
codex_runtime_switch.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
colors.py
commands.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
completion.py fix: batch of small robustness/correctness fixes from @kyssta-exe 2026-06-01 19:51:03 -07:00
config.py feat(dashboard-auth): drain shared-bearer-secret provider plugin 2026-06-26 00:47:19 -07:00
container_boot.py fix(gateway): exit 78 (EX_CONFIG) on fatal startup errors, s6 finish script stops restart loop 2026-06-24 16:34:51 +10:00
context_switch_guard.py fix(cli): log instead of swallow preflight-warning errors; consistent TUI warning field 2026-06-21 16:31:56 +05:30
copilot_auth.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
cron.py feat(cron): warn when gateway not running on cron create/list (#51696) 2026-06-23 23:29:50 -07:00
curator.py feat(curator): make skill consolidation opt-in (prune stays default-on) (#47840) 2026-06-17 05:20:32 -07:00
curses_ui.py feat(cli): ranked fuzzy search in the curses model picker 2026-06-01 16:58:58 -07:00
dashboard_register.py fix(cli): persist custom --portal-url to .env on dashboard register (#42435) 2026-06-09 13:56:33 +10:00
debug.py fix(debug): include gui.log (dashboard/TUI/pty/websocket) in hermes debug share 2026-06-19 07:05:42 -07:00
default_soul.py fix(soul): installers seed the real default persona, upgrade legacy empty templates (#52246) 2026-06-24 18:56:26 -07:00
dep_ensure.py fix(browser): validate agent-browser is runnable, not just present (#51740) 2026-06-24 00:14:49 -07:00
dingtalk_auth.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
doctor.py fix(state): detect and repair FTS write corruption that silently drops gateway history (#52798) 2026-06-25 21:18:41 -07:00
dump.py fix(dump): show commit date instead of release date in hermes debug (#48104) 2026-06-17 16:53:42 -07:00
env_loader.py feat(managed-scope): apply managed .env last with override 2026-06-19 07:46:33 -07:00
fallback_cmd.py fix(fallback): merge fallback_providers with legacy fallback_model configurations 2026-05-23 05:24:57 -07:00
fallback_config.py fix(fallback): merge fallback_providers with legacy fallback_model configurations 2026-05-23 05:24:57 -07:00
gateway.py fix(windows): suppress console flashes and harden gateway restarts 2026-06-25 14:42:38 -07:00
gateway_enroll.py feat(relay): Phase 5 Unit C — wake primitive (gateway side) (#51595) 2026-06-24 11:00:11 +10:00
gateway_windows.py fix(windows): suppress console flashes and harden gateway restarts 2026-06-25 14:42:38 -07:00
goals.py feat(goals): completion contracts for /goal — evidence-based judging (#50501) 2026-06-22 12:20:09 -07:00
gui_uninstall.py feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI) (#40355) 2026-06-06 18:22:38 -07:00
hooks.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
inventory.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
kanban.py feat(kanban): typed block reasons + unblock-loop breaker (#52848) 2026-06-25 21:46:58 -07:00
kanban_db.py feat(kanban): typed block reasons + unblock-loop breaker (#52848) 2026-06-25 21:46:58 -07:00
kanban_decompose.py docs(kanban): clarify decomposer profile roles 2026-06-06 19:29:00 -07:00
kanban_diagnostics.py chore: remove dead code — 28 unused functions/classes across 16 files 2026-05-29 04:22:27 -07:00
kanban_specify.py fix: guard int(os.getenv()) casts against malformed env vars (#40598) 2026-06-07 06:14:24 -07:00
kanban_swarm.py refactor(kanban): fold worker/orchestrator skills into injected guidance (#50473) 2026-06-21 17:06:48 -07:00
logs.py feat(debug): include desktop.log in hermes debug share / /debug / hermes logs (#38203) 2026-06-03 05:41:35 -07:00
main.py fix(cron): detect partial job loss in restore_cron_jobs_if_emptied (#52144) 2026-06-25 18:49:18 -07:00
managed_scope.py fix(managed-scope): honor managed scope in all standalone config loaders 2026-06-19 07:46:33 -07:00
managed_uv.py fix(update/windows): don't return _UvResult on Windows (subprocess argv crash) (#39820) 2026-06-05 07:54:08 -05:00
mcp_catalog.py fix(mcp): block exfil-shaped stdio server configs (#46083) 2026-06-14 04:24:14 -07:00
mcp_config.py fix(mcp): auto-recover from invalid_client on stale OAuth client registration 2026-06-26 00:35:27 -07:00
mcp_picker.py feat(mcp): Nous-approved MCP catalog with interactive picker (#30870) 2026-05-26 12:48:14 -07:00
mcp_security.py fix(security): close hermes-0day MCP-persistence attack surface 2026-06-21 19:05:27 -07:00
mcp_startup.py fix(mcp): address adversarial review round 2 (stale-publish race, parity holes) 2026-06-19 11:57:43 -07:00
memory_oauth.py feat(memory): Honcho OAuth connect — desktop and CLI flows + token refresh (#44335) 2026-06-22 19:16:47 -05:00
memory_providers.py fix(desktop): show Hindsight memory provider (#37546) 2026-06-18 16:48:47 -05:00
memory_setup.py feat(memory): improve OpenViking setup UX 2026-06-17 01:04:26 +08:00
middleware.py fix(middleware): single-use next_call guard + deepcopy-safe request copies 2026-06-06 23:07:25 +05:30
migrate.py feat(cli): hermes migrate xai [--apply] [--no-backup] 2026-05-20 09:18:23 -07:00
moa_cmd.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
moa_config.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
model_catalog.py feat(models): seed model-catalog disk cache from checkout on update (#42614) 2026-06-08 22:31:06 -07:00
model_cost_guard.py fix(model): require confirmation for expensive model selections 2026-06-10 00:24:06 -07:00
model_normalize.py fix(gemini): strip native self prefixes before generateContent (#36141) 2026-06-13 13:47:08 -07:00
model_setup_flows.py refactor(setup): simplify Z.AI picker — drop dead fallback, fix tests 2026-06-25 12:07:01 +05:30
model_switch.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
models.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
nous_account.py feat(billing): /credits command — balance + portal top-up handoff (#44776) 2026-06-12 08:51:10 +00:00
nous_auth_keepalive.py fix Nous auth refresh for idle agents 2026-06-21 22:43:48 -07:00
nous_billing.py feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449) 2026-06-19 01:53:32 +05:30
nous_subscription.py fix(browser): validate agent-browser is runnable, not just present (#51740) 2026-06-24 00:14:49 -07:00
oneshot.py fix(cli): surface oneshot agent exceptions to stderr with rc=1 2026-05-30 07:31:48 -07:00
pairing.py
partial_compress.py Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here' (#35048) 2026-05-29 17:49:15 -07:00
pets.py feat(pets): generation RPCs, non-blocking gallery + gateway plumbing 2026-06-24 13:48:38 -05:00
platforms.py feat(whatsapp): add WhatsApp Business Cloud API adapter 2026-05-23 01:07:01 -04:00
plugins.py feat(kanban): add task lifecycle plugin hooks (claimed/completed/blocked) (#50349) 2026-06-21 12:38:14 -07:00
plugins_cmd.py fix(plugins): normalize browser-pasted GitHub repo URLs (#33539) 2026-06-13 13:23:59 -07:00
portal_cli.py feat(cli): make hermes portal run the full quick-setup Nous flow (model picker) 2026-06-04 02:20:31 +05:30
profile_describer.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
profile_distribution.py fix(dist): stop USER_OWNED_EXCLUDE from filtering nested directories 2026-06-07 21:50:57 -07:00
profiles.py fix(gateway): scope dashboard liveness fallback to the profile 2026-06-25 10:25:54 +10:00
projects_cmd.py feat(projects): add per-profile project store 2026-06-25 16:40:26 -05:00
projects_db.py feat(projects): add per-profile project store 2026-06-25 16:40:26 -05:00
prompt_size.py feat(cli): add hermes prompt-size diagnostic (#35276) 2026-05-30 02:53:42 -07:00
provider_catalog.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
providers.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
psutil_android.py fix(android): reject unsafe tar members in psutil compatibility installer 2026-05-28 02:36:09 -07:00
pt_input_extras.py fix(cli): ignore terminal focus reports (salvage of #16780) 2026-05-29 00:31:44 -07:00
pty_bridge.py fix(pty-bridge): mark os.killpg/getpgid windows-footgun-ok (POSIX-only module) 2026-06-08 07:03:12 -07:00
relaunch.py
runtime_provider.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
secret_prompt.py feat(memory): improve OpenViking setup UX 2026-06-17 01:04:26 +08:00
secrets_cli.py fix(secrets): fail early with clear error when bitwarden setup runs without TTY (#40571) 2026-06-06 18:36:40 -07:00
security_advisories.py fix(stt,tts): restore mistralai — 2.4.8 is clean, ban lifted (#34841) 2026-05-29 13:24:12 -07:00
security_audit.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
security_audit_startup.py style(security-audit): add explicit encoding to read_text calls (ruff PLW1514) 2026-06-21 19:05:27 -07:00
send_cmd.py fix(managed-scope): honor managed scope in config→env bridges too 2026-06-19 07:46:33 -07:00
service_manager.py fix(gateway): exit 78 (EX_CONFIG) on fatal startup errors, s6 finish script stops restart loop 2026-06-24 16:34:51 +10:00
session_listing.py fix: harden salvaged session and browser improvements 2026-06-15 07:46:34 -07:00
session_recap.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
setup.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
setup_whatsapp_cloud.py fix(whatsapp-cloud): review follow-ups for #43921 2026-06-11 07:51:01 -07:00
skills_config.py fix(skills): apply global|platform disabled union to all resolution sites 2026-06-14 22:54:54 +05:30
skills_hub.py feat(skills): find & diff user-modified bundled skills 2026-06-18 12:26:20 +05:30
skin_engine.py fix(tui): improve charizard completion menu contrast 2026-05-18 20:05:23 -07:00
slack_cli.py feat(slack): add --no-assistant flag to manifest generation 2026-06-23 11:30:10 -07:00
sqlite_util.py feat(projects): add per-profile project store 2026-06-25 16:40:26 -05:00
status.py Merge commit '6110aed9b' into feat/whatsapp-cloud-api 2026-06-10 21:39:22 -04:00
stdio.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
suggestions_cmd.py refactor(cron): rebrand Cron Recipes -> Automation Blueprints 2026-06-11 10:49:47 -07:00
telegram_managed_bot.py Add CLI Telegram QR onboarding 2026-06-05 03:20:10 -07:00
timeouts.py perf(agent-loop): cut 47% of per-conversation function calls via 3 targeted hot-path optimizations (#28866) 2026-05-19 14:25:10 -07:00
tips.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
tools_config.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
uninstall.py feat: uninstall the Chat GUI without removing the agent (CLI + desktop UI) (#40355) 2026-06-06 18:22:38 -07:00
voice.py
web_server.py feat(gateway): external drain trigger + accept-gating (begin/cancel + control channel) 2026-06-26 00:47:19 -07:00
webhook.py fix(state): restrict sensitive store file permissions 2026-05-24 04:55:18 -07:00
win_pty_bridge.py feat(windows): enable dashboard /chat tab via ConPTY (win_pty_bridge) + tests (#42251) 2026-06-08 11:32:43 -07:00
write_approval_commands.py refactor(memory,skills): replace tri-state write_mode with boolean write_approval (default off) (#43354) 2026-06-09 23:21:14 -07:00
xai_retirement.py fix(xai): align migrate retirement map with docs 2026-05-20 09:18:23 -07:00