hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Author	SHA1	Message	Date
Teknium	f53a5a7fe1	fix: suppress duplicate completion notifications when agent already consumed output via wait/poll/log (#8228 ) When the agent calls process(action='wait') or process(action='poll') and gets the exited status, the completion_queue notification is redundant — the agent already has the output from the tool return. Previously, the drain loops in CLI and gateway would still inject the [SYSTEM: Background process completed] message, causing the agent to receive the same information twice. Fix: track session IDs in _completion_consumed set when wait/poll/log returns an exited process. Drain loops in cli.py and gateway watcher skip completion events for consumed sessions. Watch pattern events are never suppressed (they have independent semantics). Adds 4 tests covering wait/poll/log marking and running-process negative case.	2026-04-12 00:36:22 -07:00
Teknium	dfc820345d	fix: scope tool interrupt signal per-thread to prevent cross-session leaks (#7930 ) The interrupt mechanism in tools/interrupt.py used a process-global threading.Event. In the gateway, multiple agents run concurrently in the same process via run_in_executor. When any agent was interrupted (user sends a follow-up message), the global flag killed ALL agents' running tools — terminal commands, browser ops, web requests — across all sessions. Changes: - tools/interrupt.py: Replace single threading.Event with a set of interrupted thread IDs. set_interrupt() targets a specific thread; is_interrupted() checks the current thread. Includes a backward- compatible _ThreadAwareEventProxy for legacy _interrupt_event usage. - run_agent.py: Store execution thread ID at start of run_conversation(). interrupt() and clear_interrupt() pass it to set_interrupt() so only this agent's thread is affected. - tools/code_execution_tool.py: Use is_interrupted() instead of directly checking _interrupt_event.is_set(). - tools/process_registry.py: Same — use is_interrupted(). - tests: Update interrupt tests for per-thread semantics. Add new TestPerThreadInterruptIsolation with two tests verifying cross-thread isolation.	2026-04-11 14:02:58 -07:00
Teknium	cac6178104	fix(gateway): propagate user identity through process watcher pipeline Background process watchers (notify_on_complete, check_interval) created synthetic SessionSource objects without user_id/user_name. While the internal=True bypass (`1d8d4f28`) prevented false pairing for agent- generated notifications, the missing identity caused: - Garbage entries in pairing rate limiters (discord:None, telegram:None) - 'User None' in approval messages and logs - No user identity available for future code paths that need it Additionally, platform messages arriving without from_user (Telegram service messages, channel forwards, anonymous admin actions) could still trigger false pairing because they are not internal events. Fix: 1. Propagate user_id/user_name through the full watcher chain: session_context.py → gateway/run.py → terminal_tool.py → process_registry.py (including checkpoint persistence/recovery) 2. Add None user_id guard in _handle_message() — silently drop non-internal messages with no user identity instead of triggering the pairing flow. Salvaged from PRs #7664 (kagura-agent, ContextVar approach), #6540 (MestreY0d4-Uninter, tests), and #7709 (guang384, None guard). Closes #6341, #6485, #7643 Relates to #6516, #7392	2026-04-11 13:46:16 -07:00
Teknium	f459214010	feat: background process monitoring — watch_patterns for real-time output alerts * feat: add watch_patterns to background processes for output monitoring Adds a new 'watch_patterns' parameter to terminal(background=true) that lets the agent specify strings to watch for in process output. When a matching line appears, a notification is queued and injected as a synthetic message — triggering a new agent turn, similar to notify_on_complete but mid-process. Implementation: - ProcessSession gets watch_patterns field + rate-limit state - _check_watch_patterns() in ProcessRegistry scans new output chunks from all three reader threads (local, PTY, env-poller) - Rate limited: max 8 notifications per 10s window - Sustained overload (45s) permanently disables watching for that process - watch_queue alongside completion_queue, same consumption pattern - CLI drains watch_queue in both idle loop and post-turn drain - Gateway drains after agent runs via _inject_watch_notification() - Checkpoint persistence + crash recovery includes watch_patterns - Blocked in execute_code sandbox (like other bg params) - 20 new tests covering matching, rate limiting, overload kill, checkpoint persistence, schema, and handler passthrough Usage: terminal( command='npm run dev', background=true, watch_patterns=['ERROR', 'WARN', 'listening on port'] ) * refactor: merge watch_queue into completion_queue Unified queue with 'type' field distinguishing 'completion', 'watch_match', and 'watch_disabled' events. Extracted _format_process_notification() in CLI and gateway to handle all event types in a single drain loop. Removes duplication across both CLI drain sites and the gateway.	2026-04-11 03:13:23 -07:00
aaronagent	307697688e	fix: prevent zombie processes, redact cron stderr, skip symlinks in skill enumeration process_registry.py: _reader_loop() has process.wait() after the try-except block (line 380). If the reader thread crashes with an unexpected exception (e.g. MemoryError, KeyboardInterrupt), control exits the except handler but skips wait() — leaving the child as a zombie process. Move wait() and the cleanup into a finally block so the child is always reaped. cron/scheduler.py: _run_job_script() only redacts secrets in stdout on the SUCCESS path (line 417-421). When a cron script fails (non-zero exit), both stdout and stderr are returned WITHOUT redaction (lines 407-413). A script that accidentally prints an API key to stderr during a failure would leak it into the LLM context. Move redaction before the success/failure branch so both paths benefit. skill_commands.py: _build_skill_message() enumerates supporting files using rglob("*") but only checks is_file() (line 171) without filtering symlinks. PR #6693 added symlink protection to scan_skill_commands() but missed this function. A malicious skill can create symlinks in references/ pointing to arbitrary files, exposing their paths (and potentially content via skill_view) to the LLM. Add is_symlink() check to match the guard in scan_skill_commands. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 02:03:20 -07:00
coffee	c1f832a610	fix(tools): guard against ValueError on int() env var and header parsing Three locations perform `int()` conversion on environment variables or HTTP headers without error handling, causing unhandled `ValueError` crashes when the values are non-numeric: 1. `send_message_tool.py` — `EMAIL_SMTP_PORT` env var parsed outside the try/except block; a non-numeric value crashes `_send_email()` instead of returning a user-friendly error. 2. `process_registry.py` — `TERMINAL_TIMEOUT` env var parsed without protection; a non-numeric value crashes the `wait()` method. 3. `skills_hub.py` — HTTP `Retry-After` header can contain date strings per RFC 7231; `int()` conversion crashes on non-numeric values. All three now fall back to their default values on `ValueError`/`TypeError`.	2026-04-10 16:47:44 -07:00
Teknium	c8e4dcf412	fix: prevent duplicate completion notifications on process kill (#7124 ) When kill_process() sends SIGTERM, both it and the reader thread race to call _move_to_finished() — kill_process sets exit_code=-15 and enqueues a notification, then the reader thread's process.wait() returns with exit_code=143 (128+SIGTERM) and enqueues a second one. Fix: make _move_to_finished() idempotent by tracking whether the session was actually removed from _running. The second call sees it was already moved and skips the completion_queue.put(). Adds regression test: test_move_to_finished_idempotent_no_duplicate	2026-04-10 03:52:16 -07:00
adybag14-cyber	54d5138a54	fix(termux): harden env-backed background jobs	2026-04-09 16:24:53 -07:00
adybag14-cyber	e79cc88985	feat: add tested Termux install path and EOF-aware gh auth	2026-04-09 16:24:53 -07:00
mrshu	19b0ddce40	fix(process): correct detached crash recovery state Previously crash recovery recreated detached sessions as if they were fully managed, so polls and kills could lie about liveness and the checkpoint could forget recovered jobs after the next restart. This commit refreshes recovered host-backed sessions from real PID state, keeps checkpoint data durable, and preserves notify watcher metadata while treating sandbox-only PIDs as non-recoverable. - Persist `pid_scope` in `tools/process_registry.py` and skip recovering sandbox-backed entries without a host-visible PID handle - Refresh detached sessions on access so `get`/`poll`/`wait` and active session queries observe exited processes instead of hanging forever - Allow recovered host PIDs to be terminated honestly and requeue `notify_on_complete` watchers during checkpoint recovery - Add regression tests for durable checkpoints, detached exit/kill behavior, sandbox skip logic, and recovered notify watchers	2026-04-08 03:35:43 -07:00
Teknium	678a87c477	refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites Add three reusable helpers to eliminate pervasive boilerplate: tools/registry.py — tool_error() and tool_result(): Every tool handler returns JSON strings. The pattern json.dumps({"error": msg}, ensure_ascii=False) appeared 106 times, and json.dumps({"success": False, "error": msg}, ...) another 23. Now: tool_error(msg) or tool_error(msg, success=False). tool_result() handles arbitrary result dicts: tool_result(success=True, data=payload) or tool_result(some_dict). hermes_cli/config.py — read_raw_config(): Lightweight YAML reader that returns the raw config dict without load_config()'s deep-merge + migration overhead. Available for callsites that just need a single config value. Migration (129 callsites across 32 files): - tools/: browser_camofox (18), file_tools (10), homeassistant (8), web_tools (7), skill_manager (7), cronjob (11), code_execution (4), delegate (5), send_message (4), tts (4), memory (7), session_search (3), mcp (2), clarify (2), skills_tool (3), todo (1), vision (1), browser (1), process_registry (2), image_gen (1) - plugins/memory/: honcho (9), supermemory (9), hindsight (8), holographic (7), openviking (7), mem0 (7), byterover (6), retaindb (2) - agent/: memory_manager (2), builtin_memory_provider (1)	2026-04-07 13:36:38 -07:00
Teknium	ca0459d109	refactor: remove 24 confirmed dead functions — 432 lines of unused code Each function was verified to have exactly 1 reference in the entire codebase (its own definition). Zero calls, zero imports, zero string references anywhere including tests. Removed by category: Superseded wrappers (replaced by newer implementations): - agent/anthropic_adapter.py: run_hermes_oauth_login, refresh_hermes_oauth_token - hermes_cli/callbacks.py: sudo_password_callback (superseded by CLI method) - hermes_cli/setup.py: _set_model_provider, _sync_model_from_disk - tools/file_tools.py: get_file_tools (superseded by registry.register) - tools/cronjob_tools.py: get_cronjob_tool_definitions (same) - tools/terminal_tool.py: _check_dangerous_command (_check_all_guards used) Dead private helpers (lost their callers during refactors): - agent/anthropic_adapter.py: _convert_user_content_part_to_anthropic - agent/display.py: honcho_session_line, write_tty - hermes_cli/providers.py: _build_labels (+ dead _labels_cache var) - hermes_cli/tools_config.py: _prompt_yes_no - hermes_cli/models.py: _extract_model_ids - hermes_cli/uninstall.py: log_error - gateway/platforms/feishu.py: _is_loop_ready - tools/file_operations.py: _read_image (64-line method) - tools/process_registry.py: cleanup_expired - tools/skill_manager_tool.py: check_skill_manage_requirements Dead class methods (zero callers): - run_agent.py: _is_anthropic_url (logic duplicated inline at L618) - run_agent.py: _classify_empty_content_response (68-line method, never wired) - cli.py: reset_conversation (callers all use new_session directly) - cli.py: _clear_current_input (added but never wired in) Other: - gateway/delivery.py: build_delivery_context_for_tool - tools/browser_tool.py: get_active_browser_sessions	2026-04-07 11:41:26 -07:00
Teknium	e120d2afac	feat: notify_on_complete for background processes (#5779 ) * feat: notify_on_complete for background processes When terminal(background=true, notify_on_complete=true), the system auto-triggers a new agent turn when the process exits — no polling needed. Changes: - ProcessSession: add notify_on_complete field - ProcessRegistry: add completion_queue, populate on _move_to_finished() - Terminal tool: add notify_on_complete parameter to schema + handler - CLI: drain completion_queue after agent turn AND during idle loop - Gateway: enhanced _run_process_watcher injects synthetic MessageEvent on completion, triggering a full agent turn - Checkpoint persistence includes notify_on_complete for crash recovery - code_execution_tool: block notify_on_complete in sandbox scripts - 15 new tests covering queue mechanics, checkpoint round-trip, schema * docs: update terminal tool descriptions for notify_on_complete - background: remove 'ONLY for servers' language, describe both patterns (long-lived processes AND long-running tasks with notify_on_complete) - notify_on_complete: more prescriptive about when to use it - TERMINAL_TOOL_DESCRIPTION: remove 'Do NOT use background for builds' guidance that contradicted the new feature	2026-04-07 02:40:16 -07:00
Teknium	8bb1d15da4	chore: remove ~100 unused imports across 55 files (#3016 ) Automated cleanup via pyflakes + autoflake with manual review. Changes: - Removed unused stdlib imports (os, sys, json, pathlib.Path, etc.) - Removed unused typing imports (List, Dict, Any, Optional, Tuple, Set, etc.) - Removed unused internal imports (hermes_cli.auth, hermes_cli.config, etc.) - Fixed cli.py: removed 8 shadowed banner imports (imported from hermes_cli.banner then immediately redefined locally — only build_welcome_banner is actually used) - Added noqa comments to imports that appear unused but serve a purpose: - Re-exports (gateway/session.py SessionResetPolicy, tools/terminal_tool.py is_interrupted/_interrupt_event) - SDK presence checks in try/except (daytona, fal_client, discord) - Test mock targets (auxiliary_client.py Path, mcp_config.py get_hermes_home) Zero behavioral changes. Full test suite passes (6162/6162, 2 pre-existing streaming test failures unrelated to this change).	2026-03-25 15:02:03 -07:00
Teknium	934fbe3c06	fix: strip ANSI at the source — clean terminal output before it reaches the model Root cause: terminal_tool, execute_code, and process_registry returned raw subprocess output with ANSI escape sequences intact. The model saw these in tool results and copied them into file writes. Previous fix (PR #2532) stripped ANSI at the write point in file_tools.py, but this was a band-aid — regex on file content risks corrupting legitimate content, and doesn't prevent ANSI from wasting tokens in the model context. Source-level fix: - New tools/ansi_strip.py with comprehensive ECMA-48 regex covering CSI (incl. private-mode, colon-separated, intermediate bytes), OSC (both terminators), DCS/SOS/PM/APC strings, Fp/Fe/Fs/nF escapes, 8-bit C1 - terminal_tool.py: strip output before returning to model - code_execution_tool.py: strip stdout/stderr before returning - process_registry.py: strip output in poll/read_log/wait - file_tools.py: remove _strip_ansi band-aid (no longer needed) Verified: `ls --color=always` output returned as clean text to model, file written from that output contains zero ESC bytes.	2026-03-23 07:43:12 -07:00
Teknium	d87655afff	fix(gateway): persist watcher metadata in checkpoint for crash recovery (#1706 ) Salvaged from PR #1573 by @eren-karakus0. Cherry-picked with authorship preserved. Fixes #1143 — background process notifications resume after gateway restart. Co-authored-by: Muhammet Eren Karakuş <erenkar950@gmail.com>	2026-03-17 03:52:15 -07:00
teknium1	210d5ade1e	feat(tools): centralize tool emoji metadata in registry + skin integration - Add 'emoji' field to ToolEntry and 'get_emoji()' to ToolRegistry - Add emoji= to all 50+ registry.register() calls across tool files - Add get_tool_emoji() helper in agent/display.py with 3-tier resolution: skin override → registry default → hardcoded fallback - Replace hardcoded emoji maps in run_agent.py, delegate_tool.py, and gateway/run.py with centralized get_tool_emoji() calls - Add 'tool_emojis' field to SkinConfig so skins can override per-tool emojis (e.g. ares skin could use swords instead of wrenches) - Add 11 tests (5 registry emoji, 6 display/skin integration) - Update AGENTS.md skin docs table Based on the approach from PR #1061 by ForgingAlex (emoji centralization in registry). This salvage fixes several issues from the original: - Does NOT split the cronjob tool (which would crash on missing schemas) - Does NOT change image_generate toolset/requires_env/is_async - Does NOT delete existing tests - Completes the centralization (gateway/run.py was missed) - Hooks into the skin system for full customizability	2026-03-15 20:21:21 -07:00
teknium1	b177b4abad	fix(security): block gateway and tool env vars in subprocesses Extend subprocess env sanitization beyond provider credentials by blocking Hermes-managed tool, messaging, and related gateway runtime vars. Reuse a shared sanitizer in LocalEnvironment and ProcessRegistry so background and PTY processes honor the same blocklist and _HERMES_FORCE_ escape hatch. Add regression coverage for local env execution and process_registry spawning.	2026-03-15 02:51:04 -07:00
0xIbra	437ec17125	fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths Several files resolved paths via Path.home() / ".hermes" or os.path.expanduser("~/.hermes/..."), bypassing the HERMES_HOME environment variable. This broke isolation when running multiple Hermes instances with distinct HERMES_HOME directories. Replace all hardcoded paths with calls to get_hermes_home() from hermes_cli.config, consistent with the rest of the codebase. Files fixed: - tools/process_registry.py (processes.json) - gateway/pairing.py (pairing/) - gateway/sticker_cache.py (sticker_cache.json) - gateway/channel_directory.py (channel_directory.json, sessions.json) - gateway/config.py (gateway.json, config.yaml, sessions_dir) - gateway/mirror.py (sessions/) - gateway/hooks.py (hooks/) - gateway/platforms/base.py (image_cache/, audio_cache/, document_cache/) - gateway/platforms/whatsapp.py (whatsapp/session) - gateway/delivery.py (cron/output) - agent/auxiliary_client.py (auth.json) - agent/prompt_builder.py (SOUL.md) - cli.py (config.yaml, images/, pastes/, history) - run_agent.py (logs/) - tools/environments/base.py (sandboxes/) - tools/environments/modal.py (modal_snapshots.json) - tools/environments/singularity.py (singularity_snapshots.json) - tools/tts_tool.py (audio_cache) - hermes_cli/status.py (cron/jobs.json, sessions.json) - hermes_cli/gateway.py (logs/, whatsapp session) - hermes_cli/main.py (whatsapp/session) Tests updated to use HERMES_HOME env var instead of patching Path.home(). Closes #892 (cherry picked from commit `78ac1bba43`)	2026-03-13 21:32:53 -07:00
Teknium	646b4ec533	fix(terminal): strip provider env vars from background and PTY subprocesses (#1172 ) * fix: Home Assistant event filtering now closed by default Previously, when no watch_domains or watch_entities were configured, ALL state_changed events passed through to the agent, causing users to be flooded with notifications for every HA entity change. Now events are dropped by default unless the user explicitly configures: - watch_domains: list of domains to monitor (e.g. climate, light) - watch_entities: list of specific entity IDs to monitor - watch_all: true (new option — opt-in to receive all events) A warning is logged at connect time if no filters are configured, guiding users to set up their HA platform config. All 49 gateway HA tests + 52 HA tool tests pass. * docs: update Home Assistant integration documentation - homeassistant.md: Fix event filtering docs to reflect closed-by-default behavior. Add watch_all option. Replace Python dict config example with YAML. Fix defaults table (was incorrectly showing 'all'). Add required configuration warning admonition. - environment-variables.md: Add HASS_TOKEN and HASS_URL to Messaging section. - messaging/index.md: Add Home Assistant to description, architecture diagram, platform toolsets table, and Next Steps links. * fix(terminal): strip provider env vars from background and PTY subprocesses Extends the env var blocklist from #1157 to also cover the two remaining leaky paths in process_registry.py: - spawn_local() PTY path (line 156) - spawn_local() background Popen path (line 197) Both were still using raw os.environ, leaking provider vars to background processes and interactive PTY sessions. Now uses the same dynamic _HERMES_PROVIDER_ENV_BLOCKLIST from local.py. Explicit env_vars passed to spawn_local() still override the blocklist, matching the existing behavior for callers that intentionally need these. Gap identified by PR #1004 (@PeterFile).	2026-03-13 07:54:46 -07:00
teknium1	4de5e017f1	Merge PR #457 : Use pywinpty for PTY support on Windows Authored by shitcoinsherpa. Imports winpty.PtyProcess on Windows instead of ptyprocess.PtyProcess, and adds platform markers to the [pty] extra so the correct package is installed automatically.	2026-03-09 21:09:56 -07:00
teknium1	d63b363cde	refactor: extract atomic_json_write helper, add 24 checkpoint tests Extract the duplicated temp-file + fsync + os.replace pattern from batch_runner.py (1 instance) and process_registry.py (2 instances) into a shared utils.atomic_json_write() function. Add 12 tests for atomic_json_write covering: valid JSON, parent dir creation, overwrite, crash safety (original preserved on error), no temp file leaks, string paths, unicode, custom indent, concurrent writes. Add 12 tests for batch_runner checkpoint behavior covering: _save_checkpoint (valid JSON, last_updated, overwrite, lock/no-lock, parent dirs, no temp leaks), _load_checkpoint (missing file, existing data, corrupt JSON), and resume logic (preserves prior progress, different run_name starts fresh).	2026-03-06 05:50:12 -08:00
teknium1	c05c60665e	Merge PR #298 : Make process_registry checkpoint writes atomic Authored by aydnOktay. Companion to PR #297 (batch_runner). Applies the same atomic write pattern (temp file + fsync + os.replace) to both _write_checkpoint() and recover_from_checkpoint() in process_registry.py. Prevents checkpoint corruption on gateway crashes. Also improves error handling: bare 'pass' replaced with logger.debug(..., exc_info=True) for better debugging.	2026-03-06 05:32:35 -08:00
teknium1	3670089a42	docs: add Daytona to batch_runner, process_registry, agent_loop, tool_context Add daytona_image to batch_runner per-prompt container image overrides so batch processing works with the Daytona backend. Update inline comments in RL environment files (agent_loop, tool_context) and process_registry docstrings to include Daytona in backend lists.	2026-03-06 03:49:59 -08:00
shitcoinsherpa	dcba291d45	Use pywinpty instead of ptyprocess on Windows for PTY support ptyprocess depends on Unix-only APIs (fork, openpty) and cannot work on Windows at all. pywinpty provides a compatible PtyProcess interface using the Windows ConPTY API. This conditionally imports winpty.PtyProcess on Windows and ptyprocess.PtyProcess on Unix. The pyproject.toml pty extra now uses platform markers so the correct package is installed automatically.	2026-03-05 17:16:04 -05:00
teknium1	ff3a479156	fix: coerce session_id and data to string in process tool handler Some models send session_id as an integer instead of a string, causing type errors downstream. Defensively cast session_id and write/submit data args to str to handle non-compliant model outputs.	2026-03-04 16:37:00 -08:00
teknium1	de59d91add	feat: Windows native support via Git Bash - Add scripts/install.cmd batch wrapper for CMD users (delegates to install.ps1) - Add _find_shell() in local.py: detects Git Bash on Windows via HERMES_GIT_BASH_PATH env var, shutil.which, or common install paths (same pattern as Claude Code's CLAUDE_CODE_GIT_BASH_PATH) - Use _find_shell() in process_registry.py for background processes - Fix hermes_cli/gateway.py: use wmic instead of ps aux on Windows, skip SIGKILL (doesn't exist on Windows), fix venv path (Scripts/python.exe vs bin/python) - Update README with three install commands (Linux/macOS, PowerShell, CMD) and Windows native documentation Requires Git for Windows, which bundles bash.exe. The terminal tool transparently uses Git Bash for shell commands regardless of whether the user launched hermes from PowerShell or CMD.	2026-03-02 22:03:29 -08:00
teknium1	2ba87a10b0	Merge PR #219 : fix: guard POSIX-only process functions for Windows compatibility Authored by Farukest. Fixes #218.	2026-03-02 17:07:49 -08:00
aydnOktay	5fa3e24b76	Make process_registry checkpoint writes atomic	2026-03-03 02:44:01 +03:00
teknium1	dda9f3e734	fix(process_registry): ensure unbuffered output for subprocesses Updated the environment variables for subprocess execution in the ProcessRegistry class to set PYTHONUNBUFFERED to "1". This change ensures that output from Python scripts is unbuffered, allowing for real-time visibility of progress during background execution. Adjusted both the pty and background process spawning methods to use the new environment configuration.	2026-03-01 16:14:57 -08:00
teknium1	1db5598294	feat(tests): add live integration tests for file operations and shell noise filtering - Introduce a new test suite in `test_file_tools_live.py` to validate file operations and ensure accurate command execution in a real environment. - Implement assertions to check for shell noise contamination in outputs, enhancing the reliability of command results. - Create fixtures for setting up a local environment and populating directories with known file contents for comprehensive testing. - Refactor shell noise handling in `process_registry.py` and `local.py` to support multiple noise patterns, improving output cleanliness.	2026-02-28 22:57:58 -08:00
Farukest	3f58e47c63	fix: guard POSIX-only process functions for Windows compatibility os.setsid, os.killpg, and os.getpgid do not exist on Windows and raise AttributeError on import or first call. This breaks the terminal tool, code execution sandbox, process registry, and WhatsApp bridge on Windows. Added _IS_WINDOWS platform guard in all four affected files, following the pattern documented in CONTRIBUTING.md. On Windows, preexec_fn is set to None and process termination falls back to proc.terminate() / proc.kill() instead of process group signals. Files changed: - tools/environments/local.py (3 call sites) - tools/process_registry.py (2 call sites) - tools/code_execution_tool.py (3 call sites) - gateway/platforms/whatsapp.py (3 call sites)	2026-03-01 01:54:27 +03:00
teknium1	66a5bc64db	fix(process): use shlex to safely quote commands in bg_command for improved security	2026-02-27 22:50:26 -08:00
Teknium	7f423508e4	Merge pull request #151 from johnh4098/fix/shell-injection-spawn-via-env-v2 fix(process): escape single quotes in spawn_via_env bg_command	2026-02-27 22:49:04 -08:00
teknium1	fb7df099e0	feat(cli): add shell noise filtering and improve command execution with interactive login shell	2026-02-27 16:26:47 -08:00
teknium1	f14ff3e041	feat(cli): use user's login shell for command execution to ensure environment consistency	2026-02-27 15:10:27 -08:00
johnh4098	e5f719a33b	fix(process): escape single quotes in spawn_via_env bg_command	2026-02-27 21:03:17 +03:30
teknium1	08ff1c1aa8	More major refactor/tech debt removal!	2026-02-21 20:22:33 -08:00
teknium1	748fd3db88	refactor: enhance error handling with structured logging across multiple modules - Updated various modules including cli.py, run_agent.py, gateway, and tools to replace silent exception handling with structured logging. - Improved error messages to provide more context, aiding in debugging and monitoring. - Ensured consistent logging practices throughout the codebase, enhancing traceability and maintainability.	2026-02-21 03:32:11 -08:00
teknium1	a885d2f240	refactor: implement structured logging across multiple modules - Introduced logging functionality in cli.py, run_agent.py, scheduler.py, and various tool modules to replace print statements with structured logging. - Enhanced error handling and informational messages to improve debugging and monitoring capabilities. - Ensured consistent logging practices across the codebase, facilitating better traceability and maintenance.	2026-02-21 03:11:11 -08:00
teknium1	ec59d71e60	Update PTY write handling in ProcessRegistry to ensure data is encoded as bytes before writing. This change improves compatibility with string inputs and clarifies the expected data type in comments.	2026-02-17 03:14:47 -08:00
teknium1	061fa70907	Add background process management with process tool, wait, PTY, and stdin support New process registry and tool for managing long-running background processes across all terminal backends (local, Docker, Singularity, Modal, SSH). Process Registry (tools/process_registry.py): - ProcessSession tracking with rolling 200KB output buffer - spawn_local() with optional PTY via ptyprocess for interactive CLIs - spawn_via_env() for non-local backends (runs inside sandbox, never on host) - Background reader threads per process (Popen stdout or PTY) - wait() with timeout clamping, interrupt support, and transparent limit reporting - JSON checkpoint to ~/.hermes/processes.json for gateway crash recovery - Module-level singleton shared across agent loop, gateway, and RL Process Tool (model_tools.py): - 7 actions: list, poll, log, wait, kill, write, submit - Paired with terminal in all toolsets (CLI, messaging, RL) - Timeout clamping with transparent notes in response Terminal Tool Updates (tools/terminal_tool.py): - Replaced nohup background mode with registry spawn (returns session_id) - Added workdir parameter for per-command working directory - Added check_interval parameter for gateway auto-check watchers - Added pty parameter for interactive CLI tools (Codex, Claude Code) - Updated TERMINAL_TOOL_DESCRIPTION with full background workflow docs - Cleanup thread now respects active background processes (won't reap sandbox) Gateway Integration (gateway/run.py, session.py, config.py): - Session reset protection: sessions with active processes exempt from reset - Default idle timeout increased from 2 hours to 24 hours - from_dict fallback aligned to match (was 120, now 1440) - session_key env var propagated to process registry for session mapping - Crash recovery on gateway startup via checkpoint probe - check_interval watcher: asyncio task polls process, delivers updates to platform RL Safety (environments/): - tool_context.py cleanup() kills background processes on episode end - hermes_base_env.py warns when enabled_toolsets is None (loads all tools) - Process tool safe in RL via wait() blocking the agent loop Also: - Added ptyprocess as optional dependency (in pyproject.toml [pty] extra + [all]) - Fixed pre-existing bug: rl_test_inference missing from TOOL_TO_TOOLSET_MAP - Updated AGENTS.md with process management docs and project structure - Updated README.md terminal section with process management overview	2026-02-17 02:51:31 -08:00

42 commits