mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-29 06:31:32 +00:00
Sessions now survive `hermes gateway stop` / `restart` on native Windows. Previously the gateway died on schtasks `/End` + os.kill SIGTERM without ever running the drain loop, so the v0.13.0 session-resume feature (#21192) silently broke on Windows: `resume_pending=True` was never written, and the next boot started with a blank conversation history (issue #33778). Root cause is twofold and the reporter only identified half of it: 1. `hermes_cli/gateway_windows.py::stop()` did not write the `planned_stop_marker` before signalling. The reporter caught this. 2. The bigger reason: `asyncio.add_signal_handler` raises NotImplementedError for SIGTERM/SIGINT on Windows, so even if the marker had been written, the gateway's existing SIGTERM handler (which is what calls `runner.stop()` and the `mark_resume_pending` loop) was never invoked. Writing the marker would have been necessary-but-insufficient. The fix has two parts: * gateway/run.py: new `_run_planned_stop_watcher` daemon thread polls for the planned-stop marker file every 0.5s. When the marker appears it `loop.call_soon_threadsafe(shutdown_signal_handler, None)` — the same shutdown path a real SIGTERM would have driven, including the pre-drain `mark_resume_pending` writes (run.py:5977) and graceful drain wait. The existing signal handler already accepts `received_signal=None` and falls through to `consume_planned_stop_marker_for_self()`, so no handler changes needed. Runs on every platform as cheap belt-and-suspenders. * hermes_cli/gateway_windows.py: `stop()` now writes the marker for the running gateway PID and waits up to `agent.restart_drain_timeout` (default 30s) for the PID to exit cleanly. On clean drain, the kill sweep is non-forceful; on timeout, escalates to `kill_gateway_processes(force=True)` which routes to taskkill /T /F per `references/windows-native-support.md`. Validation: * 7 new tests in tests/gateway/test_planned_stop_watcher.py covering: marker→handler dispatch, no-marker idle, already-draining skip, not-yet-running skip, stop_event responsiveness, fire-once semantics, error tolerance. * 8 new tests in tests/hermes_cli/test_gateway_windows.py covering: marker-before-kill ordering, clean-drain skips force-kill, drain-timeout escalates to force=True, no-pid-skips-drain, invalid-pid handling, fast-exit success, timeout failure, marker-write-failure tolerance. * E2E (Linux, detached orphan): write_planned_stop_marker(pid) + `_drain_gateway_pid(pid, 5.0)` returns True in 0.5s after the victim sees the marker and exits. Tested with a double-forked subprocess so the test parent isn't holding it as a zombie. * Targeted: tests/gateway/{restart_drain,restart_resume_pending, signal,signal_format,status,shutdown_forensics,approve_deny_commands, planned_stop_watcher} + tests/hermes_cli/{gateway_windows, gateway_service} → 519/519. What was wrong with the reporter's claim (for future archaeology): they described the symptom as "no `resume_pending=True` written to `sessions.json`" — but Hermes uses `state.db` (SQLite), not `sessions.json`, and `mark_resume_pending` is called regardless of the marker (the marker only affects exit code 0 vs 1 for systemd revival semantics). The real session-loss path is the missing drain on Windows, not a missing marker. Both halves are fixed here. Closes #33778. |
||
|---|---|---|
| .. | ||
| acp | ||
| acp_adapter | ||
| agent | ||
| cli | ||
| cron | ||
| docker | ||
| e2e | ||
| fakes | ||
| gateway | ||
| hermes_cli | ||
| hermes_state | ||
| honcho_plugin | ||
| integration | ||
| openviking_plugin | ||
| plugins | ||
| providers | ||
| run_agent | ||
| scripts | ||
| skills | ||
| stress | ||
| tools | ||
| tui_gateway | ||
| website | ||
| __init__.py | ||
| conftest.py | ||
| run_interrupt_test.py | ||
| test_account_usage.py | ||
| test_atomic_replace_symlinks.py | ||
| test_base_url_hostname.py | ||
| test_batch_runner_checkpoint.py | ||
| test_bitwarden_secrets.py | ||
| test_cli_file_drop.py | ||
| test_cli_manual_compress.py | ||
| test_cli_skin_integration.py | ||
| test_ctx_halving_fix.py | ||
| test_docker_home_override_scripts.py | ||
| test_empty_model_fallback.py | ||
| test_env_loader_secret_sources.py | ||
| test_evidence_store.py | ||
| test_gateway_streaming_nested_config.py | ||
| test_get_tool_definitions_cache_isolation.py | ||
| test_hermes_bootstrap.py | ||
| test_hermes_constants.py | ||
| test_hermes_home_profile_warning.py | ||
| test_hermes_logging.py | ||
| test_hermes_state.py | ||
| test_hermes_state_wal_fallback.py | ||
| test_honcho_client_config.py | ||
| test_honcho_session_context.py | ||
| test_install_sh_browser_install.py | ||
| test_install_sh_pythonpath_sanitization.py | ||
| test_install_sh_root_fhs_uv_python_path.py | ||
| test_install_sh_setup_wizard_tty_probe.py | ||
| test_install_sh_symlink_stomp.py | ||
| test_install_sh_termux_network_prereqs.py | ||
| test_ipv4_preference.py | ||
| test_lazy_session_regressions.py | ||
| test_lint_config.py | ||
| test_live_system_guard_self_test.py | ||
| test_mcp_serve.py | ||
| test_mini_swe_runner.py | ||
| test_minimax_model_validation.py | ||
| test_minimax_oauth.py | ||
| test_minisweagent_path.py | ||
| test_model_picker_scroll.py | ||
| test_model_tools.py | ||
| test_model_tools_async_bridge.py | ||
| test_ollama_num_ctx.py | ||
| test_package_json_lazy_deps.py | ||
| test_packaging_metadata.py | ||
| test_plugin_skills.py | ||
| test_process_loop_event_loop_warning.py | ||
| test_project_metadata.py | ||
| test_retry_utils.py | ||
| test_run_tests_parallel.py | ||
| test_sanitize_tool_error.py | ||
| test_sql_injection.py | ||
| test_subprocess_home_isolation.py | ||
| test_termux_all_extra_compat.py | ||
| test_timezone.py | ||
| test_toolset_distributions.py | ||
| test_toolsets.py | ||
| test_trajectory_compressor.py | ||
| test_trajectory_compressor_async.py | ||
| test_transform_llm_output_hook.py | ||
| test_transform_tool_result_hook.py | ||
| test_tui_gateway_server.py | ||
| test_utils_truthy_values.py | ||
| test_yuanbao_integration.py | ||
| test_yuanbao_markdown.py | ||
| test_yuanbao_pipeline.py | ||
| test_yuanbao_proto.py | ||