hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-24 10:52:21 +00:00

Author	SHA1	Message	Date
Tranquil-Flow	15880da8bb	fix(file_tools): resolve tilde using profile home for file operations (#48552) File tools (read_file, write_file, patch, list_directory, etc.) used os.path.expanduser() which reads the gateway process HOME env var. In Docker/systemd/s6 deployments where the gateway HOME differs from interactive sessions, tilde expanded to the wrong directory. Add _expand_tilde() helper that delegates to get_subprocess_home() when available, falling back to os.path.expanduser(). Replace all 9 expanduser() call sites in file_tools.py with _expand_tilde().	2026-06-23 03:17:47 +05:30
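The tilde-resolution change above can be illustrated with a minimal sketch. This is not the repository's `_expand_tilde()`; the function name `expand_tilde` and the `profile_home` parameter are hypothetical stand-ins for whatever `get_subprocess_home()` returns, and only the bare `~` form is remapped.

```python
import os
from typing import Optional


def expand_tilde(path: str, profile_home: Optional[str] = None) -> str:
    """Expand a leading '~' against profile_home if given, else the process HOME.

    profile_home is a hypothetical stand-in for the profile's home directory.
    """
    if profile_home and path == "~":
        return profile_home
    if profile_home and path.startswith("~/"):
        # Remap only the bare "~/" form; "~user/..." is left to expanduser().
        return os.path.join(profile_home, path[2:])
    # Fallback: the standard behaviour, which reads the process environment.
    return os.path.expanduser(path)
```

Passing the profile home explicitly keeps the helper independent of the process environment, which is the property the commit message is after.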
kshitijk4poor	0e69cd4b37	fix(memory): honor configured char limits in the no-agent on-disk store Follow-up to the /memory approve fresh-store fix. Both the CLI fallback and the messaging-gateway handler built a bare MemoryStore() with the hardcoded default char limits (2200/1375), ignoring the user's configured memory.memory_char_limit / user_char_limit. A live agent honors those overrides (agent/agent_init.py), so an approval applied without a live agent could accept a write the user's lower cap would reject, or vice versa. Extract a shared tools.memory_tool.load_on_disk_store() factory that reads the configured limits (falling back to defaults if config can't load) and wire both the CLI and gateway handlers to it, closing the gap on both surfaces and de-duplicating the construction block.	2026-06-23 03:10:53 +05:30
Max Hsu	3147cbb136	fix(memory): apply /memory approve against a fresh store when no live agent The CLI /memory slash handler (cli_commands_mixin._handle_memory_command) passed self.agent._memory_store straight through, which is None when the command runs without a live agent — e.g. /memory approve from the Desktop GUI. The shared write-approval handler then returns "memory store unavailable" and applies nothing, even with built-in memory enabled and pending writes present. Fall back to a freshly loaded on-disk MemoryStore when no live store is available, mirroring the gateway path (gateway/slash_commands.py). It persists to the same MEMORY/USER.md and creates MEMORY.md on the first approved write. Fixes #46783 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-23 03:10:53 +05:30
Francesco Bonacci	f2e37549c6	feat(computer_use): cross-platform cua-driver (macOS/Windows/Linux) Make the computer_use toolset platform-agnostic by driving cua-driver on macOS, Windows, and Linux. Consumes the 8 cua-driver decoupling surfaces (capability discovery, structuredContent AX tree, opaque element_token, click button enum, explicit mimeType, machine-readable manifest, structured list_windows, structured health_report), each degrading gracefully on older drivers. Adds `hermes computer-use doctor` (drives cua-driver health_report with a per-OS check matrix and an exit 0/1/2 ok/degraded/blocked contract), full typed wrappers for the previously-uncovered cua-driver tools plus a generic call_tool escape hatch, per-session agent-cursor lifecycle, platform-aware system-prompt guidance (host-deterministic, cache-safe), and honors HERMES_CUA_DRIVER_CMD end-to-end. Replaces the macOS-only skills/apple/macos-computer-use skill with a cross-platform skills/computer-use skill, and refreshes the EN + zh-Hans docs. Supersedes #44221 (Windows-enablement salvage of #30660). Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-06-22 06:42:30 -07:00
Ben Barclay	de6b3ae377	fix(terminal): bridge docker_extra_args to TERMINAL_DOCKER_EXTRA_ARGS in CLI + gateway (#50631 ) terminal.docker_extra_args passes flags verbatim to `docker run` (e.g. --gpus=all, --shm-size=16g). It was wired into DEFAULT_CONFIG, TERMINAL_CONFIG_ENV_MAP (so `hermes config set` bridged it), terminal_tool._get_env_config (reads TERMINAL_DOCKER_EXTRA_ARGS), and DockerEnvironment (applies extra_args) -- but it was MISSING from cli.py's env_mappings and gateway/run.py's _terminal_env_map. Consequence: a user who hand-edits config.yaml (rather than running `hermes config set`) has docker_extra_args silently dropped on the CLI and gateway/desktop startup paths, while docker_image / docker_volumes (which ARE in those maps) bridge correctly -- producing the reported 'Hermes partially reads the Docker config' symptom where --gpus=all and --shm-size=16g never reach docker run. This is the same bridge-coverage bug class that shipped before for docker_run_as_host_user (cli + gateway) and docker_mount_cwd_to_workspace (gateway). Fix by adding the key to both maps, plus a dedicated regression pin in test_terminal_config_env_sync.py mirroring the existing test_docker_*_is_bridged_everywhere guards.	2026-06-22 15:41:23 +10:00
Teknium	b0a25980f8	fix(terminal): make hermes install dir reachable in subshell PATH (#50534 ) Plugins shelling out to bare `hermes` via the terminal tool hit `command not found` (exit 127) when the gateway was launched without the hermes install dir on PATH (systemd, service managers, cron, desktop launchers) — even though `hermes` works in the user's own interactive terminal, which sources the shell rc that exports that dir. The terminal tool's subshell PATH was the agent process PATH plus a static set of system dirs (_SANE_PATH); it never included wherever the hermes console-script actually lives (~/.local/bin, the venv bin/Scripts, pipx, nix). Resolve that dir once (which/argv0/sys.executable) and prepend-if-missing it so bare `hermes` resolves regardless of launch method.	2026-06-21 20:00:06 -07:00
teknium1	8cfcbd327d	fix(process): SIGKILL the whole tree on escalation, not just wait_procs survivors Live testing against a real SIGTERM-ignoring process TREE (parent + children, the agent-browser daemon + renderer shape) revealed psutil.wait_procs's gone/alive partition mis-handles a parent/child tree: it reaps via Process.wait() and could mark targets gone/alive inconsistently across the tree, leaving survivors un-killed (flaky — sometimes the parent lived, sometimes a child). Replace it with: sleep out the grace window, then directly re-probe every captured target (_proc_alive, treating zombies as dead) and SIGKILL any that's still running. Add a multi-child-tree regression test. 6/6 escalation tests green across repeated runs; the real-tree E2E now kills the full tree 6/6 runs.	2026-06-21 19:08:52 -07:00
teknium1	8cecaf0b29	feat(process): escalate SIGTERM->SIGKILL on host-pid termination after grace A daemon that ignores or stalls in its SIGTERM handler currently survives the process-registry reap and leaks until reboot (observed as agent-browser daemons accumulating to EMFILE on long-running gateways). _terminate_host_pid now snapshots the tree, SIGTERMs it, waits a bounded grace window (terminal.daemon_term_grace_seconds, default 2.0s, 0 disables), then SIGKILLs any survivor. The recycled-PID identity guard still gates the whole path, so escalation never reaches a stranger; Windows is unchanged (taskkill /F is already a hard kill). Config lives in config.yaml (terminal.daemon_term_grace_seconds), NOT an env var, per the .env-secrets-only policy. Implements the SIGKILL-escalation idea from @tkwong's #15008, reworked onto the current _terminate_host_pid tree-kill path (the original predated it) and config-gated instead of env-var-gated. Co-authored-by: Benjamin Wong <tkwong@inspiresynergy.com>	2026-06-21 19:08:52 -07:00
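The SIGTERM-then-SIGKILL escalation described in the two process commits above can be sketched for a single child process. This is an illustration under stated assumptions, not `_terminate_host_pid`: it handles one `subprocess.Popen` handle rather than a process tree, and the function name and return strings are hypothetical. The 2.0 s default and the "0 disables" rule come from the commit message.

```python
import subprocess


def terminate_with_escalation(proc: subprocess.Popen, grace_seconds: float = 2.0) -> str:
    """Ask a child to stop, then force-kill it if it outlives the grace window."""
    proc.terminate()  # SIGTERM on POSIX
    if grace_seconds <= 0:
        # Escalation disabled: only the polite signal is sent.
        return "term-sent"
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be ignored by the child
        proc.wait()  # reap the child so no zombie is left behind
        return "killed"
```

A tree-aware version would snapshot the children before signalling and re-probe each one after the grace window, as the follow-up commit describes.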
valentt	e447723149	fix(process-registry): re-validate PID identity before killing host processes The background-process registry signalled host PIDs (recovery adoption, detached-session kill, tree-kill) using a number captured at spawn, guarded only by a bare liveness check. Once a session's process exits and is reaped the kernel recycles that PID onto an unrelated process, so an alive-but-different PID passed the check and got tree-killed. Observed in the wild: a recycled background-session PID landed on Firefox's session leader; a later kill/refresh walked its process tree and SIGTERMed every tab — Firefox "closing" at irregular intervals with no crash/coredump. This is the same PID/PGID-recycling class fixed for the MCP orphan reaper in `7bd1f8a2d`, but the process_registry subsystem was never guarded — so the bug persisted. Fix: record each host process's kernel start time (/proc/<pid>/stat field 22) at spawn, persist it in the checkpoint, and re-validate it before every signal via `_host_pid_is_ours`. A PID whose start time no longer matches — or that is gone — is never signalled: - recover_from_checkpoint: a recycled PID is not adopted as a session. - _refresh_detached_session: a recycled detached PID is marked exited. - kill_process / _terminate_host_pid: refuse to tree-kill a stranger. Legacy checkpoints and platforms without /proc (no baseline) degrade to the prior best-effort liveness behaviour, so nothing else changes. Adds TestPidReuseGuard: real-process tests proving a mismatched start time refuses termination while a matching one still kills, plus recovery/refresh recycling paths. 74 registry + 22 MCP-stability tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-21 17:23:33 -07:00
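The start-time identity check above relies on field 22 of `/proc/<pid>/stat`. A minimal sketch of the parsing and comparison follows; the function names are hypothetical and this is not the repository's `_host_pid_is_ours`. The parser splits after the last `)` because the process name (field 2) may itself contain spaces and parentheses.

```python
from typing import Optional


def parse_start_time(stat_text: str) -> Optional[int]:
    """Return field 22 (starttime, in clock ticks) from /proc/<pid>/stat text."""
    _, sep, rest = stat_text.rpartition(")")
    if not sep:
        return None
    fields = rest.split()
    # fields[0] is field 3 (state), so field 22 sits at index 22 - 3 = 19.
    if len(fields) <= 19:
        return None
    try:
        return int(fields[19])
    except ValueError:
        return None


def read_start_time(pid: int) -> Optional[int]:
    """Read the start time of a live PID, or None if unavailable (no /proc, gone)."""
    try:
        with open(f"/proc/{pid}/stat", "r") as handle:
            return parse_start_time(handle.read())
    except OSError:
        return None


def pid_matches(recorded: Optional[int], current: Optional[int]) -> bool:
    """True when it is acceptable to treat the PID as the one recorded at spawn."""
    if recorded is None:
        # No baseline (legacy checkpoint or no /proc): caller falls back to liveness.
        return True
    return current is not None and current == recorded
```

A caller would record `read_start_time(pid)` at spawn and compare it against a fresh read immediately before sending any signal.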
Teknium	84e1d31e54	refactor(kanban): fold worker/orchestrator skills into injected guidance (#50473 ) The kanban-worker and kanban-orchestrator bundled skills existed only to be force-loaded into dispatcher-spawned workers, gated by environments:[kanban] so they wouldn't leak into normal CLI listings. That gating was fragile (the leak that #50443 patched) and the --skills auto-load was already best-effort — most workers ran without it because the bundled skill isn't present in profile-scoped skills dirs. Remove the skills entirely and promote their load-bearing content (workspace kinds, deliverable artifacts, created-card integrity, profile discovery) into KANBAN_GUIDANCE, which is already injected into every kanban worker's system prompt. Net result: every worker reliably gets the guidance, nothing can leak into a CLI/blank-slate session, and the gating machinery is gone. - agent/prompt_builder.py: promote the 4 load-bearing rules into KANBAN_GUIDANCE - hermes_cli/kanban_db.py: drop --skills kanban-worker auto-injection + _kanban_worker_skill_available probe - hermes_cli/kanban_swarm.py: drop skills=[kanban-orchestrator] on the root card - hermes_cli/kanban.py: drop kanban-init skill seeding; fix help text - delete skills/devops/kanban-{worker,orchestrator} - docs: delete the two skill pages (EN+zh), fix sidebars/catalog/kanban.md/kanban-worker-lanes.md and the video-orchestrator + codex-lane references - tests: update spawn-argv expectations; re-bound the guidance-size guard Supersedes the skill-leak half of #50443 (credit @helix4u for flagging the area).	2026-06-21 17:06:48 -07:00
Dusk1e	84fcbbf6a9	fix(security): quote HERMES_TIMEZONE in remote code execution to prevent shell injection	2026-06-21 16:55:12 -07:00
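The commit above does not show its diff, so the following is only a sketch of the general technique it names: quoting an environment value before it is interpolated into a shell command. The function `build_remote_command` and the command shape are hypothetical; `shlex.quote` is the standard-library call for this.

```python
import shlex


def build_remote_command(timezone: str, script: str) -> str:
    """Build a shell command line with the timezone value safely quoted."""
    # shlex.quote wraps any value containing shell metacharacters in single quotes.
    return f"HERMES_TIMEZONE={shlex.quote(timezone)} python3 -c {shlex.quote(script)}"
```

With quoting in place, a value such as `UTC; rm -rf /` stays inside a single shell word instead of terminating the assignment and starting a second command.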
Dusk1e	8fcb8136bb	fix(security): harden smart approval guard against prompt injection # Conflicts: # tools/approval.py	2026-06-21 16:39:48 -07:00
teknium1	624580e836	fix(browser): verify daemon identity before orphan reaper kills a PID (#14073) The browser orphan reaper reads a daemon PID from a `.pid` file in a world-writable, predictably-named temp dir (`/tmp/agent-browser-h_`) it does not write itself, then tree-kills that PID via `_terminate_host_pid` after only a liveness check. A same-user actor could plant a fake socket dir whose `.pid` points at an arbitrary victim process, and OS PID reuse after the real daemon exits could land the recorded PID on an unrelated process; either way an arbitrary same-user process (and its whole tree) gets SIGTERMed. Local DoS. Add `_verify_reapable_browser_daemon()`, gated before the kill: via psutil (a hard dep, fine cross-platform for the same-user processes the reaper can signal) require both (1) identity (`agent-browser` in the process name/cmdline) and (2) binding (the live process references this session's socket dir in its cmdline or `AGENT_BROWSER_SOCKET_DIR`). The binding check is the real spoof defense: a planted/recycled PID won't embed our exact socket path. Fail-closed on any ambiguity (unreadable cmdline, no match), leaving the process and its socket dir untouched for a later sweep. Builds on @sgaofen's fix in #14394 (cmdline identity check); rewritten to use psutil instead of `/proc`+`ps` (cross-platform, Windows-covered) and to add the session-socket-dir binding check for recycled-PID / spoof resistance. Co-authored-by: sgaofen <135070653+sgaofen@users.noreply.github.com>	2026-06-21 15:23:47 -07:00
teknium1	87ab373381	test(url-safety): cover IPv6 scope-ID strip + fail-closed in URL guards Follow-up to the salvaged #25961 fix: regression tests asserting that scope-bearing IPv6 addresses (fe80::1%eth0, ::1%lo) are blocked by is_safe_url after the scope is stripped, that a still-unparseable address fails closed, and that a scoped IPv4-mapped IMDS address is caught by the always-blocked floor.	2026-06-21 13:56:35 -07:00
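The scope-ID handling tested above can be sketched with the standard `ipaddress` module. This is an assumption-laden illustration, not the repository's `is_safe_url`: the function name is hypothetical, it takes a bare address literal rather than a URL, and the set of blocked categories is a guess at a reasonable floor.

```python
import ipaddress


def is_blocked_address(host: str) -> bool:
    """Return True if the address literal should be refused (fail closed)."""
    bare = host.split("%", 1)[0]  # strip an IPv6 scope ID such as "%eth0"
    try:
        addr = ipaddress.ip_address(bare)
    except ValueError:
        return True  # still unparseable after stripping: fail closed
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return (
        addr.is_loopback
        or addr.is_link_local
        or addr.is_private
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )
```

The link-local check is what catches a mapped metadata address such as `::ffff:169.254.169.254` once the scope suffix has been removed.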
teknium1	4cff0360ea	test(approval): regression for interrupt-unblocks-approval; AUTHOR_MAP - Add thread-scoped regression test: interrupt on the waiting thread resolves the approval as deny well under the 300s timeout; a foreign-thread interrupt does NOT release the wait (interrupts are per-thread). - Add panghuer023 to AUTHOR_MAP for the salvaged #37994 fix.	2026-06-21 13:33:48 -07:00
Eugeniusz Gilewski	def3f6388f	fix(file): anchor device symlink guard to task cwd The read_file device guard now walks symlink hops before the file operation layer, but that hop walk still interpreted relative paths against the Python process cwd. In sessions where TERMINAL_CWD points at the task workspace, a relative workspace symlink to a blocked alias such as /dev/../dev/stdin could therefore miss the intermediate device target before later task-cwd resolution. Anchor relative device checks to the task base before symlink-hop inspection so the pre-I/O guard sees the same workspace path that read_file would otherwise read. Absolute device paths and the existing final realpath fallback remain unchanged. Refs #10141 Refs #29158	2026-06-21 12:16:10 -07:00
Stephen Chin	3b56d3a29a	fix(security): redact secrets in kanban tool payloads before persistence	2026-06-21 12:02:30 -07:00
teknium1	16899ae144	test(file): update guard assertions for unified display-text message The salvaged #19820 unifies the write_file guard under _is_internal_file_tool_content with the message 'internal read_file display text'. Two tests added to test_file_read_guards.py after the PR branch point still asserted the old 'status text' wording. Update them to match the new (correct, more general) message.	2026-06-21 11:55:59 -07:00
Brandon Zarnitz	71274f264b	fix(file): reject read_file line-numbered writeback	2026-06-21 11:55:59 -07:00
yeyitech	b17180d950	fix(session): finalize owned SQLite session rows on AIAgent.close() Funnel session finalization through AIAgent.close() — the single terminal path every agent (CLI, gateway, subagent, cron) funnels through — so finished agents stop leaving rows with ended_at IS NULL. The biggest leak source was delegate_task subagent + background-review forks whose close() never ended their row. end_session() is first-reason-wins and no-ops on an already-ended row, so a 'compression'/'cron_complete'/'cli_close' reason set by an earlier terminal path is never clobbered. /resume already calls reopen_session(), so finalizing-on-close does not break resumability. Temporary helper agents that rotate/share the session forward (manual compression, gateway session-hygiene) opt out via _end_session_on_close=False. Also stop the long-running gateway heartbeat once the executor is done or the session slot is rebound to a different agent, preventing a stale 'running: delegate_task' bubble from outliving its run. Closes #12029.	2026-06-21 11:35:09 -07:00
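The "first-reason-wins" behaviour of `end_session()` described above can be sketched as a conditional SQL update. The table and column names (`sessions`, `id`, `ended_at`, `end_reason`) are assumptions for illustration; only the `ended_at IS NULL` condition is taken from the commit message.

```python
import sqlite3
import time


def end_session(conn: sqlite3.Connection, session_id: str, reason: str) -> bool:
    """Finalize a session row once; later calls on an ended row are no-ops."""
    cursor = conn.execute(
        "UPDATE sessions SET ended_at = ?, end_reason = ? "
        "WHERE id = ? AND ended_at IS NULL",
        (time.time(), reason, session_id),
    )
    conn.commit()
    # rowcount is 1 only for the first caller; an already-ended row matches nothing.
    return cursor.rowcount == 1
```

Because the guard lives in the `WHERE` clause, a late `close()` cannot overwrite a reason set by an earlier terminal path.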
Teknium	41ba90f814	fix(process): keep CLI drain dedup after poll goes read-only (#10156 ) Follow-up to @de1tydev's poll-read-only fix. Removing the _completion_consumed.add() from poll() fixes the gateway/tui watcher suppression (#10156) but reintroduces the CLI duplicate that #8228 fixed: a notify_on_complete process always enqueues a completion event, and the CLI idle/post-turn drain would re-inject it as a [SYSTEM: ...] message even though the agent already saw the exit inline in its poll result. Add a separate _poll_observed set that poll() populates on an observed exit. drain_notifications() (CLI only) skips poll-observed sessions; the gateway/tui watchers keep checking only is_completion_consumed, so a read-only poll never suppresses their autonomous delivery turn. - _poll_observed pruned alongside _completion_consumed in _prune_if_needed - 4 tests: CLI drain dedup after poll, gateway gate untouched, running poll doesn't mark observed, wait/log still skip CLI drain	2026-06-21 11:11:23 -07:00
Liao Shiwu	6f5f58e34b	fix: keep poll read-only for notify_on_complete watcher	2026-06-21 11:11:23 -07:00
Eugeniusz Gilewski	9078b4bbdf	fix(file): harden read_file device alias blocking Security-hardening fix for the read_file device guard, not a new sandbox boundary. The guard already rejects direct device paths and upstream now has a resolved-path pass for workspace symlinks to blocked devices, but its concrete-path helper still compared the expanded path before normalization. That leaves residual alias cases where the dangerous path is visible before final terminal-specific resolution, for example: 1. /dev/../dev/zero and /dev/./urandom should match the blocked-device list as concrete paths, not only after final realpath; 2. /dev/stdin-style aliases can disappear once realpath follows them to /proc/self/fd/0 and then to a tty path; 3. a user symlink to /dev/../dev/stdin exposes the dangerous intermediate target before final resolution, but not necessarily after it. Normalize expanded paths before matching and inspect each symlink hop before falling back to realpath. This preserves the existing /proc fd and /proc pseudo-file guards while enforcing the intended security invariant: model-supplied read paths must not reach blocking or infinite device streams through spelling, normalization, or symlink-hop tricks. Classification: security hardening / residual bypass fix for the read_file device blocklist. This is defensive code at the file-tool boundary, but it fixes a concrete denial-of-service class tracked as security in #10141 and #29158. Tests: - normalized /dev/../dev/zero and /dev/./urandom aliases - symlink to /dev/../dev/stdin blocked before realpath - existing symlink-to-device and regular-symlink guards still pass Fixes #10141 Fixes #29158	2026-06-21 11:11:19 -07:00
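The normalization step in the device-guard fix above can be shown in isolation. This sketch covers only the spelling aliases (`..` and `.` segments); it does not walk symlink hops, and the blocklist contents and function name are illustrative assumptions limited to devices the commit message mentions.

```python
import posixpath

# Illustrative subset of blocked device paths (assumption, not the real list).
BLOCKED_DEVICES = frozenset({"/dev/zero", "/dev/urandom", "/dev/stdin"})


def is_blocked_device(path: str) -> bool:
    """Match a path against the blocklist after collapsing '.' and '..' segments."""
    return posixpath.normpath(path) in BLOCKED_DEVICES
```

Normalizing before the comparison means `/dev/../dev/zero` and `/dev/./urandom` match as concrete paths rather than only after a final `realpath`.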
kshitij	ed8f7898b9	Merge pull request #50136 from NousResearch/fix/context-aware-tool-budget fix(agent): scale tool-output budget to the model context window (#23767)	2026-06-21 20:01:32 +05:30
liuhao1024	6984026f12	fix(browser): enable SSRF guard when terminal runs in container When terminal.backend is docker/modal/daytona/ssh/singularity, the terminal runs in a sandboxed container with network isolation, but the browser still runs on the host. The SSRF guard was skipped because _is_local_backend() only checked browser.cloud_provider, not the terminal backend. Now _is_local_backend() also checks TERMINAL_ENV — when the terminal is containerized, the browser is treated as non-local and SSRF protection is enabled. Fixes #38690	2026-06-21 07:26:18 -07:00
kshitijk4poor	1965d56219	fix(agent): scale tool-output budget to the model context window (#23767 ) The tool-result persistence budget was a fixed 100K chars/result and 200K chars/turn regardless of the active model. On a small-context model (e.g. a 65K-token local model switched into mid-session) a single large tool result (reporter: a 279K-char search result) or a full 200K-char turn (~50K tokens) could by itself approach or exceed the window, forcing an oversized request that the provider rejects as "Prompt too long". - budget_config.budget_for_context_window() scales per-result/per-turn char caps to a fraction of the model window, clamped to the historical 100K/200K defaults (large models unchanged) and floored so small models stay usable. - resolve_threshold() now caps the per-tool registry value at default_result_size so tools that register a fixed 100K cap (web/terminal/x_search) don't re-inflate a scaled-down budget. No-op for the default budget (both 100K). - tool_executor wires the agent's live context_length (recomputed on model switch) into all four persist/turn-budget call sites. read_file stays inf-pinned (no persist loop). Verified E2E: a 279K-char result against a 65K model collapses to a ~1.6K preview; a 200K model is byte-identical to today.	2026-06-21 17:46:38 +05:30
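The scaling rule above (a fraction of the model window, clamped to the historical 100K/200K caps and floored for small models) can be sketched as follows. The fractions, the floor, and the 4-characters-per-token ratio are assumptions chosen for illustration; the commit message gives only the caps and the rough "200K chars is about 50K tokens" equivalence.

```python
from typing import NamedTuple

DEFAULT_RESULT_CHARS = 100_000  # historical per-result cap (from the commit message)
DEFAULT_TURN_CHARS = 200_000    # historical per-turn cap (from the commit message)


class Budget(NamedTuple):
    per_result_chars: int
    per_turn_chars: int


def budget_for_context_window(
    context_tokens: int,
    chars_per_token: int = 4,    # assumption
    result_share_pct: int = 25,  # assumption
    turn_share_pct: int = 50,    # assumption
    floor_chars: int = 4_000,    # assumption
) -> Budget:
    """Scale tool-output caps to the model window, clamped and floored."""
    window_chars = context_tokens * chars_per_token

    def scaled(share_pct: int, default_cap: int) -> int:
        return max(floor_chars, min(default_cap, window_chars * share_pct // 100))

    return Budget(
        scaled(result_share_pct, DEFAULT_RESULT_CHARS),
        scaled(turn_share_pct, DEFAULT_TURN_CHARS),
    )
```

Under these assumed parameters a 200K-token model keeps the default caps unchanged, while a 65K-token model gets proportionally smaller ones.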
xxxigm	7b9a0b315b	test(mcp): cover 'unknown method' ping keepalive fallback (#50028) Two regression tests for the agentmemory reconnect-loop: - _is_method_not_found_error matches the plain 'Unknown method: ping' phrasing (no structural -32601 code). - _keepalive_probe latches _ping_unsupported and falls back to list_tools when send_ping raises 'Unknown method: ping', instead of propagating (which would reconnect-loop).	2026-06-21 16:02:56 +05:30
kyssta-exe	65d7c7fafd	fix(cron): execute job immediately on action='run' `cronjob(action='run')` (and `hermes cron run`) only set `next_run_at = now` and returned success, relying on the scheduler ticker to actually execute the job on its next tick. When no gateway/ticker is running (a CLI-only setup, or the Windows case in #41037) the job never executed: `run` reported success, but `last_run_at` stayed null forever, no output, no delivery. A manual `run` should actually run. `_execute_job_now` now: - claims the job via `claim_job_for_fire`, the same at-most-once CAS the scheduler/external-provider fire path uses. This both advances `next_run_at` for recurring jobs and blocks a concurrently-running gateway ticker from double-firing the same job; if the claim is lost, the run is skipped (the tool reports `execution_skipped`). This closes the double-fire race that a bare `advance_next_run` left open (a tick whose `get_due_jobs` already captured the job between trigger and advance would still fire it). - delegates firing to `run_one_job` (the single shared execute→save→deliver→mark body the ticker and external providers use) so failure delivery, `[SILENT]` handling, and live-adapter delivery stay identical across paths and can't drift. (The original salvage re-implemented this sequence inline and had already dropped failure delivery + `[SILENT]`.) The tool response carries `executed`, `execution_success`, and either `execution_error` or `execution_skipped`. The `hermes cron run` CLI message no longer claims "It will run on the next scheduler tick"; it reports the actual "Ran now: succeeded/failed" outcome (or the skip). Salvaged from #41130 by @kyssta-exe (authorship preserved); reworked to reuse `claim_job_for_fire` + `run_one_job` per review rather than re-implementing the fire sequence inline. Adds tests for the claim-then-fire path, claim-lost skip, failure reporting, and exception capture. Fixes #41037 Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>	2026-06-21 13:28:04 +05:30
Teknium	c6bf6bda90	fix(memory): recover from missing old_text on single-op replace/remove (#49997) Single-op replace/remove failed with a dead-end 'old_text is required' error when a structured-output client omitted the optional old_text field (it can't be schema-required without a top-level if/then combinator that OpenAI's Codex backend 400s on). The model couldn't recover. Now a missing old_text returns the current entry inventory plus a retry instruction (mirroring the batch path's _batch_error), so the model can reissue the call with old_text set. Also sharpens the old_text schema description to state it's required for replace/remove. Fixes #49466, #43412.	2026-06-20 23:46:52 -07:00
aieng-abdullah	74b5cc7ca4	docs(spotify): document 6-month re-auth cycle and add client-level invalid_grant test - Remove the 'you only log in once per machine' claim from spotify.md and document the ~6-month refresh token expiry with re-auth instructions - Add test_client_wraps_invalid_grant_as_spotify_auth_required_error to confirm SpotifyClient wraps AuthError(code=spotify_refresh_invalid_grant) into SpotifyAuthRequiredError with a user-facing message Refs: #28155	2026-06-20 23:23:47 -07:00
Andres Sommerhoff	97563ab821	fix: warn on line-oriented newline search patterns	2026-06-20 23:23:47 -07:00
lkz-de	6403ed06b3	docs(session-search): document source-first retrieval limits Clarify that session_search is secondary context and direct source identifiers must be inspected first when accessible. Add regression coverage for the tool description.	2026-06-20 23:23:47 -07:00
Teknium	ea8a8b4af8	feat(delegation): background fan-out — parallel subagents, one consolidated return (#49734 ) * feat(delegation): single-task delegate_task always runs in the background The model no longer decides whether a subagent runs in the background — a single-task delegate_task from the top-level agent is now always dispatched async, so the parent turn returns immediately and the subagent's result re-enters the conversation when it finishes. - run_agent._dispatch_delegate_task (the live model path) forces background=True for top-level single-task calls; the schema-level `background` param is ignored. - A batch (tasks with >1 item) stays synchronous (fan-out can't go async). - A delegation from an orchestrator subagent (depth > 0) stays synchronous — it needs its workers' results within its own turn. - The function-level default is unchanged, so direct Python callers/tests keep the historical synchronous behavior. - On async-pool capacity rejection, single-task now falls through to a synchronous run instead of erroring (the child stays attached for interrupt propagation; detach happens only on a successful dispatch). - Schema `background` param marked deprecated/ignored; tool description updated to state the always-background single-task rule. * feat(delegation): all delegate_task fan-out runs in the background Extend the always-background behavior to the full fan-out. A batch is now dispatched as N independent async subagents (one handle each), instead of running synchronously. Single task and batch both return immediately; each subagent's result re-enters the conversation as its own message when it finishes. - delegate_task: when background is set, loop over ALL built children and dispatch each via dispatch_async_delegation; return a combined handle block (count + per-task delegation_ids). Children the async pool rejects (at capacity) run synchronously inline and are reported alongside the dispatched handles, so nothing is silently dropped. 
- run_agent._dispatch_delegate_task + registry handler: force background for any top-level model delegation (single OR batch); orchestrator subagents (depth > 0) still run synchronously since they need workers' results within their own turn. - Removed the v1 'batch async not supported' rejection. - Tool description updated: BOTH MODES RUN IN THE BACKGROUND. - Tests updated to assert batch fan-out dispatches each task async (verified E2E: 3-task batch -> 3 independent completion-queue events). * fix(delegation): background fan-out joins and returns one consolidated block Correct the fan-out semantics: a backgrounded batch is dispatched as ONE async unit (one handle, one async-pool slot), not N independent dispatches. The unit runs all children in parallel, waits on every one, and emits a SINGLE completion event carrying the consolidated per-task results. The chat is never blocked; when all subagents finish, their full summaries re-enter the conversation together as one message. - async_delegation.dispatch_async_delegation_batch + _finalize_batch: a batch occupies one slot; its runner returns the combined {results:[...]} dict and one event with the full results list is pushed to the completion queue. - delegate_tool: extract the sync execution+aggregation into _execute_and_aggregate(); background dispatches it via the batch unit and returns one handle; on pool-capacity rejection it runs the batch inline. - process_registry._format_async_delegation: render a consolidated multi-task block (TASK i/N + per-task summary) when the event carries is_batch/results. - Tests updated; E2E verified: 3-task batch -> immediate return -> one combined completion block with all three summaries.	2026-06-20 11:27:12 -07:00
Teknium	c329279482	test: retarget source-path refs to migrated plugin paths test_telegram_webhook_secret reads telegram adapter source by path; point it at plugins/platforms/telegram/adapter.py. test_windows_native_support npm-spawn parametrization referenced gateway/platforms/whatsapp.py; point it at plugins/platforms/whatsapp/adapter.py.	2026-06-20 10:26:45 -07:00
Teknium	5600105478	refactor(gateway): migrate slack/dingtalk/whatsapp/matrix/feishu/telegram/wecom/email/sms adapters to bundled plugins Salvage of PR #41284 onto current main. Relocates the last 9 inline messaging adapters (+ satellites: telegram_network, feishu_comment/_rules/meeting_invite, wecom_crypto, wecom_callback) from gateway/platforms/ into self-contained bundled plugins under plugins/platforms/<x>/, discovered via the platform registry. Strips the per-platform core touchpoints from gateway/run.py, gateway/config.py, hermes_cli/gateway.py, hermes_cli/setup.py, and tools/send_message_tool.py. Carries forward the migration fixes (explicit enabled:false honored, get_connected_platforms forces discovery, plugin is_connected via gateway.get_env_value, logs --component gateway matches plugins.platforms.*, matrix hidden on Windows). Additionally ports config keys main added since the PR base: the matrix plugin's _apply_yaml_config now also covers allowed_users, ignore_user_patterns, process_notices, and session_scope (the inline gateway/config.py matrix block gained these in the 1340 commits the PR sat open; they would otherwise have been silently dropped on deletion).	2026-06-20 10:26:45 -07:00
lkz-de	905820b59f	fix(signal): share markdown formatting across send paths Route Signal send paths through shared markdown formatting helpers and render markdown bullets consistently as Unicode bullets. Add coverage for Signal formatting and send_message integration.	2026-06-20 13:47:14 +05:30
hakanpak	d45addc2f1	fix(tools): never let a model whitelist strip the prompt / source images _build_fal_payload and _build_fal_edit_payload assemble the request and then filter it down to the model's supports / edit_supports whitelist. That filter also covers prompt (and image_urls for edits), which every FAL endpoint requires. Today all model configs happen to list those keys, but a single config that omits one would silently produce a request with no prompt or no source images — a broken generation with no error. Always keep the mandatory keys regardless of the whitelist so a missing whitelist entry can only drop optional knobs, never the prompt or the images.	2026-06-19 16:59:54 -07:00
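The whitelist rule above (optional knobs may be dropped, mandatory keys never) can be sketched as a small filter. The function name and the exact mandatory set are assumptions; the commit message names `prompt` and, for edits, `image_urls`.

```python
from typing import Any, Dict, Iterable

# Keys every request needs regardless of the per-model whitelist (assumed set).
MANDATORY_KEYS = frozenset({"prompt", "image_urls"})


def filter_payload(payload: Dict[str, Any], supported: Iterable[str]) -> Dict[str, Any]:
    """Keep whitelisted keys plus the mandatory ones; drop everything else."""
    allowed = set(supported) | MANDATORY_KEYS
    return {key: value for key, value in payload.items() if key in allowed}
```

A model config that forgets to list `prompt` then loses only its optional parameters instead of silently sending an empty request.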
emozilla	40722058e5	fix(mcp): keep short-TTL HTTP sessions alive with configurable ping keepalive MCP Streamable HTTP servers that garbage-collect idle sessions on a short TTL (e.g. Unreal Engine's editor MCP, ~15s) were unusable: the keepalive was hardcoded at 180s, so the session was always dead by the time it ran, and every idle tool call then landed on an expired session and paid the full reconnect path (observed hangs of 113-143s until interrupt, bounded only by the 300s tool_timeout). Two coordinated, backward-compatible changes: - Add per-server `keepalive_interval` (config.yaml, not an env var per the contribution rubric). Default 180s (byte-identical to the old hardcoded value when unset), floored at 5s. Servers with short session TTLs set it below their TTL so the session stays warm. - Switch the keepalive probe from `list_tools()` to `ping` (the MCP base protocol liveness primitive). On large servers `list_tools` pulled ~1 MB every cycle (830 tools = 1,068,041 bytes); `ping` is ~55 bytes and works uniformly across tool/prompt/resource servers. Tool-list changes still arrive out-of-band via notifications/tools/list_changed -> _refresh_tools. `ping` is an OPTIONAL utility, so to guarantee zero regression for a tool-capable server that doesn't implement it: the first -32601 latches `_ping_unsupported` and the probe falls back to the pre-ping `list_tools` path for that connection (no reconnect loop). The latch resets on each fresh connection (_discover_tools, all transport paths) so a server that gains ping support after a reconnect is re-probed with the cheap path. Non-(-32601) ping errors propagate as genuine liveness failures. Verified end-to-end against a live Unreal MCP server (idle 22s past the ~15s TTL -> post-idle tool call returns in 0.31s, no teardown) and with a simulated ping-less tool server driving the real keepalive loop (ping once, list_tools thereafter, no reconnect). 25/25 unit tests pass. Note: a separate upstream defect (modelcontextprotocol/python-sdk#2604) still tears down the whole session when one tool-call POST returns 4xx; that is not addressed here.	2026-06-19 12:16:33 -07:00
alt-glitch	88d523220f	fix(mcp): address adversarial review round 2 (stale-publish race, parity holes) Second review pass (Codex + Hermes subagent). Codex reproduced a real race with a two-thread harness; both converged on the remaining issues. - Generation-aware publish (fixes a lost-update race): two refresh callers (the late-refresh daemon and the between-turns prologue around turn 1) could each compute a snapshot outside the lock; a SLOWER caller holding an OLDER registry generation could acquire the publish lock after a newer caller and clobber it, deleting just-landed tools. refresh_agent_mcp_tools now captures registry._generation before computing and refuses to publish a stale set; agent._tool_snapshot_generation tracks the published generation. - Context-engine routing names (_context_engine_tool_names) are now staged on a local and published atomically with the snapshot, and only claimed when this rebuild actually appended the schema — matching agent_init's dedup so a registry/plugin tool of the same name keeps its own dispatch. (Previously mutated live, before the publish lock, and on no-change refreshes.) - CLI /reload-mcp: self.enabled_toolsets is resolved once at startup, so a server newly ENABLED in config mid-session wasn't picked up (TUI already re-resolved). Merge now-connected MCP server names into the override (unless the user pinned all/*), mirroring startup, and keep self.enabled_toolsets in sync. Closes the CLI/TUI parity hole. - ACP (acp_adapter/server.py) routed through the shared helper — it was a 5th sibling rebuild that re-injected memory tools but NOT context-engine tools and bypassed the atomic/name-diff path (inert today, fragile). - mcp_startup._resolve_discovery_timeout pulls its default from DEFAULT_CONFIG (single source of truth) instead of a stale hardcoded 5.0 literal. - Tests: stale-generation-no-clobber, _skip_mcp_refresh honored, timeout fallback uses DEFAULT_CONFIG.	2026-06-19 11:57:43 -07:00
alt-glitch	b6e2a54a94	fix(mcp): address adversarial review round 1 (cache parity, gates, races) Consolidated findings from three independent reviewers (Codex, Claude Code, a Hermes subagent w/ the hermes-agent-dev skill): - BLOCKING: refresh_agent_mcp_tools rebuilt only the registry subset, silently dropping post-build-injected memory-provider (mem0/honcho/…) and context-engine (lcm_) tools on every refresh. Now additive-preserving: re-applies the same injectors agent_init uses, staged on locals and published atomically. - Re-injection now honors the #5544 enabled_toolsets gate for context-engine tools, so a restricted-toolset platform can't get lcm_ leaked back in. - Atomic read-diff-publish under one lock: the returned `added` set and the (tools, valid_tool_names) pair are consistent even under concurrent callers (no half-swap, no TOCTOU). - background_review fork opts out (_skip_mcp_refresh) so its byte-identical tools[] cache parity with the parent is preserved. - CLI /reload-mcp routed through the shared helper (was a 4th divergent copy with the same clobber bug + missing disabled_toolsets). - Explicit reloads (TUI RPC + CLI) pass enabled_override so a server the user just enabled in config this session is picked up; automatic paths reuse the agent's build-time selection. - mcp_discovery_timeout default 5.0 -> 1.5s: correctness now comes from the between-turns refresh, so the startup wait is only a small turn-1 UX bump rather than a heavy dead-server latency penalty. - has_registered_mcp_tools checks registered TOOLS (not connected servers) so a zero-tool/prompt-only server doesn't make the per-turn hook fire forever. - Tests: rewrote the thread-safety test to actually exercise the write path (alternating tool sets), added the #5544-gate regression, the memory/context preservation regression, and a "callable next turn via valid_tool_names" contract; removed a dead monkeypatch line.	2026-06-19 11:57:43 -07:00
alt-glitch	93d6e73028	fix(mcp): expose late-connecting MCP tools to the agent (TUI/CLI/gateway) MCP servers that connect after the agent's one-time tool snapshot were invisible for the whole session. Two root causes, fixed together: 1. The startup discovery wait was a flat 0.75s. HTTP/OAuth servers commonly take 2-6s on a cold connect, so they missed the window and their tools never entered the agent's snapshot. `thread.join(timeout)` already returns the instant discovery completes, so raising the bound costs ~0s for the common case (no MCP / fast servers) and only ever blocks for a genuinely-pending server, capped so a dead server can't freeze startup. The bound is now configurable via `mcp_discovery_timeout` (config.yaml, default 5.0s). 2. Three call sites duplicated the agent tool-snapshot rebuild (the TUI `reload.mcp` RPC, the gateway reload, and the TUI late-binding refresh thread), and the late-refresh detected changes by tool COUNT — missing an equal-size add/remove swap. Consolidated into one shared `tools.mcp_tool.refresh_agent_mcp_tools(agent)` helper that diffs by tool NAME, mutates the agent under a lock (thread-safe), and respects the agent's own enabled/disabled toolsets. The late-binding refresh keeps its pre-first-turn cache-safety guard: it never rebuilds the tool list once a turn has started, so the cached prompt prefix is never invalidated mid-conversation. Tests: new tests/tools/test_refresh_agent_mcp_tools.py covers the name-based diff, in-place mutation, agent-scoped filtering, thread safety, and the config-driven discovery bound (incl. instant-return when nothing is pending). 75 passed across the touched areas.	2026-06-19 11:57:43 -07:00
Ludo Galabru	239740a19e	feat(tools): MCP elicitation handler with gateway-aware approval routing Wires support for the MCP `elicitation/create` request (Python SDK 1.11+) so MCP servers can ask the user to confirm sensitive operations mid-tool-call (payment authorization, OAuth confirmation, etc.) instead of failing closed or requiring out-of-band biometrics. Behavior: - `tools/mcp_tool.py` adds `ElicitationHandler`, attached per server task and passed to `ClientSession` as `elicitation_callback`. Form-mode requests route through the existing approval system; URL-mode requests decline cleanly (out of scope for this pass). - `tools/approval.py` adds `request_elicitation_consent()`, which dispatches to whichever surface owns the active session — `_await_gateway_decision` for Telegram / Slack / etc. (so the approval prompt lands on the right platform), `prompt_dangerous_approval` for CLI / TUI. Fails closed on timeout, missing notify_cb, or exception. - The MCP tool wrapper snapshots `contextvars.copy_context()` into `MCPServerTask._pending_call_context` before each `session.call_tool` and clears it after. The recv-loop task that dispatches incoming `elicitation/create` requests does not inherit the agent task's contextvars (HERMES_SESSION_PLATFORM and friends), so without the bridge `_is_gateway_approval_context()` returns False on every gateway session and the elicitation falls through to a CLI prompt that has no TTY → fail-closed decline. The handler now reads the snapshot via its `owner` back-reference and replays it through `Context.copy().run(...)` so attribution survives the task hop. 
Tests (`tests/tools/test_mcp_elicitation.py`): - form-mode accept / decline / cancel - URL-mode declined without prompting - exception in approval system → decline - timeout in approval → cancel - context-bridge regression tests (replay observed in consent call, missing-context fallback, multiple-replay safety, owner with cleared `_pending_call_context`) Verified end-to-end against pay's MCP server on macOS: agent message arrives via Telegram, agent calls `mcp_pay_curl` against a paid endpoint, pay returns 402, ElicitationHandler routes the approval prompt back to the originating Telegram chat, user replies in TG, the curl tool signs and completes. Platforms tested: macOS 14 (darwin/arm64). No Unix-only syscalls introduced; Windows footgun checker passes on the touched files.	2026-06-19 11:46:25 -07:00
Carlos Diosdado	e00b965406	feat(tts): add xAI TTS speed and optimize_streaming_latency config knobs The xAI TTS REST endpoint (POST /v1/tts) accepts 'speed' (0.7-1.5) and 'optimize_streaming_latency' (0/1/2) parameters, but the Hermes built-in xAI provider was reading neither from config nor sending either in the request body. Add them as tts.xai.speed and tts.xai.optimize_streaming_latency config knobs (with global tts.speed / tts.optimize_streaming_latency fallbacks). - speed: float, clamped to 0.7-1.5. 1.0 (the API default) is omitted from the request body to preserve the existing minimal-payload contract. - optimize_streaming_latency: int, clamped to 0-2. 0 (best quality, the API default) is omitted from the request body. Resolver order: tts.xai.<knob> overrides the global tts.<knob>.	2026-06-19 07:26:56 -07:00
Carlos Diosdado	8ae6bd0823	test(tts): cover xAI auto speech-tags auxiliary rewrite path The previous xAI auto-speech-tag tests asserted on the local pause-only fallback and only passed because call_llm silently returns None in the test environment. They gave zero coverage of the new auxiliary-rewrite path added in the previous commit. Add tests that: - mock agent.auxiliary_client.call_llm and pin down the new contract (auxiliary rewriter output wins over the local fallback) - verify the system prompt lists every documented inline + wrapping tag and uses BBCode-style [/tag] closing syntax - cover markdown-fence stripping (with and without language hint) - exercise the local fallback on rewriter exception, empty response, None response, and missing-choices response - confirm call_llm is NOT invoked when the input already has explicit speech tags, or is empty / whitespace-only - replace the end-to-end test that asserted on the silent-fallback output with one that mocks the rewriter and asserts the rewriter's tagged text is what reaches the xAI TTS API	2026-06-19 07:16:57 -07:00
Cdddo	160bb565b4	feat(tts): expose speaker_id on built-in Piper provider The built-in Piper provider (tts.provider: piper, Python piper-tts package) already constructs piper.SynthesisConfig for the advanced tuning knobs, but did not forward speaker_id from the user config. This wires tts.piper.speaker_id through to SynthesisConfig.speaker_id so multi-speaker ONNX models (e.g. libritts_r) can be addressed via config without dropping to the command-provider path. Changes: - Add speaker_id to the has_advanced tuple so setting it triggers SynthesisConfig construction (same gating as the other knobs). - Pass speaker_id=speaker_id to SynthesisConfig. Defaults to 0 (Piper's own default; single-speaker models ignore the field). - Tolerant parse: bad input (non-int strings, lists, dicts) is dropped to 0 instead of raising. Booleans are rejected outright (True/False would silently coerce to 1/0 and hide a config mistake). Mirrors the same shape as the command-provider's _resolve_command_tts_optional_number helper. speaker_id is applied per-call via syn_config.speaker_id, so the PiperVoice cache key is intentionally left as just (model, cuda) -- the same loaded model serves all speakers. Tests cover the config knob, the tolerant parse, and the no-reload invariant. sentence_silence is intentionally not added here: the Python piper-tts SynthesisConfig does not expose that field (CLI-only).	2026-06-19 07:04:58 -07:00
teknium1	2c3aebcadc	fix(clarify): unwrap dict choices at the source so every surface gets clean text The Discord fix (previous commit) handles dict-shaped clarify choices at the Discord adapter only. The same dict-repr leak originates upstream at tools/clarify_tool.py's str(c).strip() normalization — the single platform-agnostic point both the CLI and every gateway adapter flow through. When an LLM emits [{"description": "..."}] instead of bare strings, str(c) produced {'description': '...'} which leaked onto the CLI panel (cli.py:13048/13081), was returned verbatim as the user's answer (cli.py:11945), and hit Telegram's numbered list too. Add _flatten_choice (same label->description->text->title unwrap as the Discord adapter, name/value excluded, keyless dicts dropped) and apply it at the normalization line. Fixes CLI + Telegram + all platforms at the root; the Discord smart-truncation now operates on already-clean text. Adds johnjacobkenny to AUTHOR_MAP for the salvaged commit.	2026-06-19 06:31:08 -07:00
Teknium	c02192ff6a	feat(image-gen): add image-to-image / editing to image_generate (#48705) * feat(image-gen): add image-to-image / editing to image_generate Brings image generation to parity with video generation: the unified image_generate tool now edits/transforms a source image (image-to-image) when given image_url / reference_image_urls, routing to each backend's edit endpoint, exactly as video_generate routes to image-to-video. - ImageGenProvider ABC: generate() gains keyword-only image_url + reference_image_urls; new capabilities() declares modalities + max_reference_images (defaults to text-only, backward compatible). success_response gains a modality field; adds normalize_reference_images. - image_generate tool: schema exposes image_url + reference_image_urls; dynamic schema reflects the active model's actual edit capability so the agent knows when image_url is honored. Handler + plugin dispatch forward the new inputs; legacy/text-only providers get a clear modality_unsupported error instead of silently dropping the source image. - In-tree FAL: 7 models gain edit endpoints (flux-2-klein, flux-2-pro, nano-banana-pro, gpt-image-1.5, gpt-image-2, ideogram/v3, qwen-image) with per-model edit_supports whitelists + reference caps; routes to the /edit endpoint and skips the upscaler for edits. - Plugins: openai (images.edit, 16 refs), xai (/v1/images/edits via grok-imagine-image-quality, JSON body per xAI docs), krea (image_style_references, 10 refs). openai-codex stays text-only and rejects edits with an actionable error. - Tests: 15 new (payload, routing, dispatch forwarding, dynamic schema, capabilities); updated 2 change-detector/lambda tests for the new schema. - Docs: image-generation feature page, image-gen provider plugin guide, tools reference. 
* fix(image-gen): preserve legacy passthrough in fal/krea plugin tests Two existing plugin tests asserted pre-image-to-image behavior: - fal: forward image_url/reference_image_urls only when supplied, so a text-to-image delegation stays byte-identical (no None kwargs). - krea: keep dict-shaped image_style_references refs verbatim (the unified string refs go through normalize_reference_images; legacy non-string ref objects pass through unchanged) — fixes KeyError when callers pass the richer Krea ref-object shape. * fix(image-gen): clearer not-capable message for text-to-image-only models When a text-to-image-only model (incl. gpt-image-2 on the Codex OAuth path, which can't do editing through the Responses image_generation tool) gets a source image, say 'this model is not capable of image-to-image / editing — provide a text-only prompt' rather than sending the user shopping for other backends. Applies to the openai-codex guard, the in-tree FAL no-edit-endpoint error, and the dynamic tool-schema text-only line.	2026-06-18 22:13:07 -07:00
flooryyyy	f8d8f045fa	feat(kanban): auto-subscribe calling session on kanban_create When a worker calls kanban_create from inside a session that has a persistent delivery channel, the originating session is now subscribed to the new task's completion/block events automatically. The agent that dispatched the task gets notified instead of having to poll. - Gateway sessions (telegram/discord/slack): HERMES_SESSION_PLATFORM + HERMES_SESSION_CHAT_ID ContextVars, set by the messaging gateway. - TUI / desktop sessions: HERMES_SESSION_KEY in the subprocess env. The TUI notification poller keys on platform='tui' + chat_id=<key>. - CLI / cron / test: no persistent channel, no subscription. Gated by kanban.auto_subscribe_on_create in config.yaml (default True). Disable to mirror pre-feature behaviour — users who want explicit kanban_notify-subscribe calls per task can set it to false. This config gate addresses the design concern that got PR #19718 reverted upstream (unconditional implicit auto-subscribe on tool-driven kanban_create was too aggressive for orchestrator users). HERMES_SESSION_ID is intentionally not a fallback channel — it is set by ACP/agent subprocess telemetry for every invocation, not just TUI, so treating it as a notification target would auto-subscribe every CLI session and re-introduce the over-eager behaviour. The kanban_create response now includes a 'subscribed' bool so orchestrators can react if subscription failed (e.g. by falling back to explicit kanban_notify-subscribe or to polling). Includes 6 tests covering the gateway / TUI / CLI / partial-context / gated / add_notify_sub-failure paths. All 90 tests in test_kanban_tools.py pass; 509 broader kanban tests pass.	2026-06-18 14:10:51 -07:00
Teknium	38c8a9c10f	feat(memory): batch operations for single-turn memory updates (#48507) The memory tool was strictly one-op-per-call. With the store running near its char limit by design, a new add that would overflow gets rejected with 'consolidate now, then retry' -- but the model could not consolidate and add in one call. It had to remove/replace across several turns, then retry the add, each turn re-sending the whole conversation context. Expensive thrash. Add an 'operations' array: a list of add/replace/remove ops applied atomically against the FINAL char budget. The model frees space and adds new entries in ONE call, even when an add alone would overflow. All-or-nothing: any bad op aborts the whole batch, nothing written. Root-cause note: the two agent-level memory interception sites (agent_runtime_helpers.py, tool_executor.py) silently dropped any param not in their explicit kwarg list, so 'operations' never reached the handler and batch calls failed with 'Unknown action None'. Both now pass it through and bridge each add/replace op to external memory providers. Also: success response is now terminal (done=true + 'do not repeat' note, no full-entries echo that invited re-edits); schema rewritten to lead with the batch mechanism and an explicit one-shot stop rule (2138 -> 1476 chars). Live-verified: near-full consolidate-and-add went 7 calls -> 1 call, stable across 3 reps. 103 memory/approval tests + 398 background-review/run_agent tests green; 6 new batch tests added.	2026-06-18 10:19:33 -07:00
Teknium	25c590ccd0	fix(skills): refuse SKILLS_DIR root in rmtree guard, not just outside-tree The salvaged guard allowed _rmtree_writable(SKILLS_DIR) itself. No call site ever passes the root — every site passes a skill subdir or its .bak sibling — so allowing the root only preserves the #48200 footgun (a dest that collapses to the root wipes every installed skill). Require a strict-child relationship and update the test that documented the nonexistent 'full reset' capability.	2026-06-18 08:53:35 -07:00
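
The strict-child requirement described in the last commit above (25c590ccd0) can be sketched roughly as follows. This is an illustration only, not the repository's code: `is_strict_child` and the example directory are hypothetical, and the real `_rmtree_writable` guard may resolve and compare paths differently.

```python
from pathlib import Path


def is_strict_child(candidate: Path, root: Path) -> bool:
    """Return True only if candidate lies strictly below root.

    Hypothetical helper: both the root itself and any path outside the
    tree are refused.
    """
    candidate = candidate.resolve()
    root = root.resolve()
    if candidate == root:
        # Refuse the root: deleting it would wipe every installed skill.
        return False
    return root in candidate.parents


# Assumed location, chosen only for illustration.
skills_dir = Path("/tmp/hermes-example/skills")

ok_subdir = is_strict_child(skills_dir / "my-skill", skills_dir)
root_itself = is_strict_child(skills_dir, skills_dir)
collapsed = is_strict_child(skills_dir / "my-skill" / "..", skills_dir)
outside = is_strict_child(skills_dir / ".." / "other", skills_dir)
print(ok_subdir, root_itself, collapsed, outside)
```

Resolving both paths before comparing is what catches a destination that collapses to the root (the `my-skill/..` case), which a plain string-prefix check would let through.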

1205 commits
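
The generation-aware publish from commit 88d523220f (a slower refresh caller holding an older registry generation must not clobber a newer snapshot) could look something like the sketch below. All class, attribute, and function names here are hypothetical stand-ins; the commit message only names `registry._generation` and `agent._tool_snapshot_generation`, and the actual `refresh_agent_mcp_tools` logic is not shown in this log.

```python
import threading


class Registry:
    """Stand-in tool registry whose generation bumps on every change."""

    def __init__(self):
        self.generation = 0
        self.tools = set()

    def register(self, name):
        self.tools.add(name)
        self.generation += 1


class Agent:
    """Stand-in agent holding the published tool snapshot."""

    def __init__(self):
        self.publish_lock = threading.Lock()
        self.tool_names = frozenset()
        self.published_generation = -1


def compute_snapshot(registry):
    """Capture the generation before reading the tools (done outside the lock)."""
    generation = registry.generation
    return generation, frozenset(registry.tools)


def publish(agent, generation, snapshot):
    """Publish the snapshot unless a newer generation has already landed."""
    with agent.publish_lock:
        if generation < agent.published_generation:
            return False  # stale snapshot: refuse to overwrite newer tools
        agent.tool_names = snapshot
        agent.published_generation = generation
        return True


registry, agent = Registry(), Agent()
registry.register("tool_a")
slow = compute_snapshot(registry)   # older generation, computed first
registry.register("tool_b")
fast = compute_snapshot(registry)   # newer generation, computed second

fast_published = publish(agent, *fast)   # the newer caller wins the lock first
slow_published = publish(agent, *slow)   # the older caller arrives late
print(fast_published, slow_published, sorted(agent.tool_names))
```

The late caller is rejected instead of overwriting the set, so `tool_b` stays visible. The interleaving is replayed sequentially here to keep the example deterministic; in the scenario the commit describes, the two callers would be separate threads.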