hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-31 19:16:29 +00:00

Author	SHA1	Message	Date
johnncenae	9ae1fa9e39	fix(delegate): honor runtime default model during provider resolution	2026-04-30 19:58:55 -07:00
Yukipukii1	5f3f456784	fix(approval): wake blocked gateway approvals on session cleanup	2026-04-30 19:46:27 -07:00
Teknium	c868425467	feat(kanban): durable multi-profile collaboration board (#17805 ) Salvage of PR #16100 onto current main (after emozilla's #17514 fix that unblocks plugin Pydantic body validation). History preserved on the standing `feat/kanban-standing` branch; this squashes the 22 iterative commits into one clean landing. What this lands: - SQLite kernel (hermes_cli/kanban_db.py) — durable task board with tasks, task_links, task_runs, task_comments, task_events, kanban_notify_subs tables. WAL mode, atomic claim via CAS, tenant-namespaced, skills JSON array per task, max-runtime timeouts, worker heartbeats, idempotency keys, circuit breaker on repeated spawn failures, crash detection via /proc/<pid>/status, run history preserved across attempts. - Dispatcher — runs inside the gateway by default (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims stale claims, promotes ready tasks, spawns `hermes -p <assignee> chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK + HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker` plus any per-task skills. Health telemetry warns on stuck ready queue. - Structured tool surface (tools/kanban_tools.py) — 7 tools (kanban_show, kanban_complete, kanban_block, kanban_heartbeat, kanban_comment, kanban_create, kanban_link). Gated on HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal sessions. - System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE) injected only when kanban tools are active. - Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board UI: triage/todo/ready/running/blocked/done columns, drag-drop, inline create, task drawer with markdown, comments, run history, dependency editor, bulk ops, lanes-by-profile grouping, WS-driven live refresh. Matches active dashboard theme via CSS variables. 
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|claim|comment|complete|block|unblock|archive|tail|dispatch|context|init|gc|watch|stats|notify|log|heartbeat|runs|assignees` + `/kanban` slash in-session. - Worker + orchestrator skills (skills/devops/kanban-worker + kanban-orchestrator) — pattern library for good summary/metadata shapes, retry diagnostics, block-reason examples, fan-out patterns. - Per-task force-loaded skills — `--skill <name>` (repeatable), stored as JSON, threaded through to dispatcher argv as one `--skills X` pair per skill alongside the built-in kanban-worker. Dashboard + CLI + tool parity. - Deprecation of standalone `hermes kanban daemon` — stub exits 2 with migration guidance; `--force` escape hatch for headless hosts. - Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md) with 11 dashboard screenshots walking through four user stories (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker). - Tests (251 passing): kernel schema + migration + CAS atomicity, dispatcher logic, circuit breaker, crash detection, max-runtime timeouts, claim lifecycle, tenant isolation, idempotency keys, per-task skills round-trip + validation + dispatcher argv, tool surface (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk + links + warnings), gateway-embedded dispatcher (config gate, env override, graceful shutdown), CLI deprecation stub, migration from legacy schemas. Gateway integration: - GatewayRunner._kanban_dispatcher_watcher — new asyncio background task, symmetric with _kanban_notifier_watcher. Runs dispatch_once via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0 env override for debugging. - Config: new `kanban` section in DEFAULT_CONFIG with `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`. Additive — no _config_version bump needed. 
Forward-compat: - workflow_template_id / current_step_key columns on tasks (v1 writes NULL; v2 will use them for routing). - task_runs holds claim machinery (claim_lock, claim_expires, worker_pid, last_heartbeat_at) so multi-attempt history is first-class from day one. Closes #16102. Co-authored-by: emozilla <emozilla@nousresearch.com>	2026-04-30 13:36:47 -07:00
y0shualee	f4b76fa272	fix: use skill activity in curator status Treat skill views and edits as activity when curator reports and applies lifecycle transitions, so recently loaded or patched skills are not displayed or transitioned as never used. Adds regression tests for activity derivation, automatic transitions, and CLI status output.	2026-04-30 10:31:47 -07:00
0xDevNinja	564a649e6a	fix(curator): scan nested archive subdirs in restore_skill restore_skill() in tools/skill_usage.py used archive_root.iterdir(), which only walked the top level of .archive/. Skills archived under nested layouts (e.g. .archive/openclaw-imports/<skill>/ from older archive paths or external imports) were invisible to both the exact-match and prefix-match candidate scans, surfacing as a misleading "skill '<name>' not found in archive" error even though the directory existed on disk. Switch both candidate scans to archive_root.rglob('*') so the lookup descends into category subdirectories. Fixes #17942	2026-04-30 10:31:44 -07:00
Stephen Schoettler	407dfbb021	fix(ci): stabilize current main test regressions	2026-04-30 06:36:50 -07:00
Teknium	ae8930afa5	fix(skills): also bump_use on skill_view tool invocation Widen #17818 to cover the dominant 'agent actively used this skill' path: when the model calls the skill_view tool, bump use_count alongside view_count. The slash-command and --skill preload paths (covered by the cherry-picked commit) only catch user-initiated invocation; most skill activation happens via the agent calling skill_view to consume an indexed skill. Curator's stale-timer keys off last_used_at (agent/curator.py:233), so without this wire-up agent-created skills would transition to stale simultaneously regardless of actual use.	2026-04-30 05:07:34 -07:00
Teknium	a845177ebe	fix(skills): also exclude .archive in skills_tool + add author map entry Widen #17639 to the fourth sibling site (tools/skills_tool.py _EXCLUDED_SKILL_DIRS) and register leoneparise in scripts/release.py AUTHOR_MAP so CI release script resolves the contributor.	2026-04-30 04:59:22 -07:00
Maxence Groine	04ea895ffb	feat(gateway/signal): add support for multiple images sending Adds a new `send_multiple_images` method to the ``BasePlatformAdapter`` that implements the default "One image per message" loop and allows for platform-specific overriding. Implements such an override for the Signal adapter, batching images and trying (best-effort) to work around rate-limits for voluminous batches using a specific scheduler. Also implements batching + rate-limit handling in the `send_message` tool. New tests added for the Signal adapter, its rate-limit scheduler and the `send_message` tool	2026-04-30 04:28:08 -07:00
Heltman	19f9be1dff	fix(tools): serialize concurrent hermes_tools RPC calls from execute_code The sandbox-side `_call()` in both the UDS and file-based transports was not thread-safe, so scripts that call tools from multiple threads (e.g. `ThreadPoolExecutor` over `terminal()`) inside a single `execute_code` run could silently receive each other's responses. Root cause: * UDS transport — a single module-level `_sock` was shared across all threads; the newline-framed protocol has no request-id; and the server-side RPC loop handles one connection serially. With concurrent callers, each thread would `sendall()` then race to `recv()` the next newline-terminated response from the shared buffer, so responses got delivered to the wrong caller. * File transport — `_seq += 1` is a non-atomic read-modify-write, so two threads could allocate the same sequence number and clobber each other's request/response files. Fix: guard `_call()` with a `threading.Lock` in the UDS case (covering send+recv), and guard `_seq` allocation with a lock in the file case. No protocol change. Regression tests cover both the generated-source level (lock is present and used) and an end-to-end concurrency test: running a sandboxed ThreadPoolExecutor of 10 `terminal()` calls against a slow mock dispatcher, asserting every caller sees its own tagged response. The test fails without the fix (10/10 mismatched, matching real-world repro) and passes with it.	2026-04-30 03:31:16 -07:00
vominh1919	cb130bf776	fix(ssh): prevent tar from overwriting remote home dir permissions tar xf - -C / extracts the staging directory tree to the remote root. GNU tar default behavior overwrites metadata (including mode) of existing directories. When the local umask is 002 (Ubuntu default), the staging dirs are 0775, and tar chmod's /home/<user> to 0775 — breaking sshd StrictModes which requires 0755 or stricter for home dirs. Add --no-overwrite-dir to the remote tar command so existing directory metadata is preserved. Fixes #17767	2026-04-30 03:26:35 -07:00
Teknium	8d302e37a8	feat(tts): add Piper as a native local TTS provider (closes #8508 ) (#17885 ) Piper (OHF-Voice/piper1-gpl) is a fast, local neural TTS engine from the Home Assistant project that supports 44 languages with zero API keys. Adds it as a native built-in provider alongside edge/neutts/kittentts, installable via 'hermes tools' with one keystroke. What ships: - New 'piper' built-in provider in tools/tts_tool.py - Lazy import via _import_piper() - Module-level voice cache keyed on (model_path, use_cuda) so switching voices doesn't invalidate older cached voices - _resolve_piper_voice_path() accepts either an absolute .onnx path or a voice name (auto-downloaded on first use via 'python -m piper.download_voices --download-dir <cache>') - Voice cache at ~/.hermes/cache/piper-voices/ (profile-aware via get_hermes_dir) - Optional SynthesisConfig knobs: length_scale, noise_scale, noise_w_scale, volume, normalize_audio, use_cuda — passed through only when configured, so older piper-tts versions aren't broken - WAV output then ffmpeg conversion path (same as neutts/kittentts) so Telegram voice bubbles work when ffmpeg is present - Piper added to BUILTIN_TTS_PROVIDERS so a user's tts.providers.piper.command cannot shadow the native provider (regression test included) - 'hermes tools' wizard entry - Piper appears under Voice and TTS as local free, with 'pip install piper-tts' auto-install via post_setup handler - Prints voice-catalog URL and default-voice info after install - config.yaml defaults - tts.piper.voice defaults to en_US-lessac-medium - Commented advanced knobs for discoverability - Docs - New 'Piper (local, 44 languages)' section in features/tts.md explaining install path, voice switching, pre-downloaded voices, and advanced knobs - Piper listed in the ten-provider table and ffmpeg table - Custom-command-providers section updated to drop the Piper example (now native) and add a piper-custom example for users with their own trained .onnx models - 
overview.md bumps provider count to ten - Tests (tests/tools/test_tts_piper.py, 16 tests) - Registration (BUILTIN_TTS_PROVIDERS, PROVIDER_MAX_TEXT_LENGTH) - _resolve_piper_voice_path across every branch: direct .onnx path, cached voice name, fresh download with correct CLI args, download failure, successful-exit-but-missing-files, empty voice to default - _generate_piper_tts: loads voice once, reuses cache, voice-name download wiring, advanced knobs flow through SynthesisConfig - text_to_speech_tool end-to-end dispatch and missing-package error - check_tts_requirements: piper availability toggles the return value - Regression guard: piper cannot be shadowed by a command provider with the same name - Pre-existing test_tts_mistral test broadened to mock the new piper/kittentts/command-provider checks (otherwise it false-passes when piper is installed in the test venv) E2E verification (live): Actual pip install piper-tts, config piper + en_US-lessac-low, text_to_speech_tool call, voice auto-downloaded from HuggingFace, WAV synthesized, ffmpeg-converted to Ogg/Opus. Second call hits the cache (~60ms). Cache dir populated with .onnx and .onnx.json. This caught a real bug during development: the first pass used '-d' as the download-dir flag; the actual piper.download_voices CLI wants '--download-dir'. Fixed before PR opened.	2026-04-30 02:53:20 -07:00
Teknium	2facea7f71	feat(tts): add command-type provider registry under tts.providers.<name> (#17843 ) Reshape of PR #17211 (@versun). Lets users wire any local or external TTS CLI into Hermes without adding engine-specific Python code. Users declare any number of named providers in config.yaml and switch between them with tts.provider: <name>, alongside the built-ins (edge, openai, elevenlabs, …). Config shape: tts: provider: piper-en providers: piper-en: type: command command: 'piper -m ~/model.onnx -f {output_path} < {input_path}' output_format: wav Placeholders: {input_path}, {text_path}, {output_path}, {format}, {voice}, {model}, {speed}. Use {{ / }} for literal braces. Key behavior: - Built-in provider names always win — a tts.providers.openai entry cannot shadow the native OpenAI provider. - type: command is the default when command: is set. - Placeholder values are shell-quote-aware (bare / single / double context), so paths with spaces and shell metacharacters are safe. - Default delivery is a regular audio attachment. voice_compatible: true opts in to Telegram voice-bubble delivery via ffmpeg Opus conversion. - Command failures (non-zero exit, timeout, empty output) surface to the agent with stderr/stdout included so you can debug from chat. - Process-tree kill on timeout (Unix killpg, Windows taskkill /T). - max_text_length defaults to 5000 for command providers; override under tts.providers.<name>.max_text_length. Tests: tests/tools/test_tts_command_providers.py — 42 new tests cover provider resolution, shell-quote context, placeholder rendering with injection payloads, timeout, non-zero exit, empty output, voice_compatible opt-in, and end-to-end dispatch through text_to_speech_tool. All 88 pre-existing TTS tests still pass. Docs: new "Custom command providers" section in website/docs/user-guide/features/tts.md with three worked examples (Piper, VoxCPM, MLX-Kokoro), placeholder reference, optional keys, behavior notes, and security caveat. 
E2E-verified live: isolated HERMES_HOME, command provider declared in config.yaml, text_to_speech_tool dispatches through the registered shell command and the output file is produced as expected. Co-authored-by: Versun <me+github7604@versun.org>	2026-04-30 02:29:08 -07:00
Teknium	aa7bf329bc	feat(gateway): centralize audio routing + FLAC support + Telegram doc fallback (#17833) Extracted from PR #17211 (@versun) so it can land independently of the local_command TTS provider redesign. - Add should_send_media_as_audio(platform, ext, is_voice) in gateway/platforms/base.py; single source of truth for audio routing. - Add .flac to recognized audio extensions (MEDIA regex, weixin audio set, send_message audio set). - Telegram send_voice() now falls back to send_document for formats Telegram's Bot API can't play natively (.wav, .flac, ...) instead of raising; MP3/M4A still go to sendAudio, Opus/OGG still go to sendVoice. - Route _send_telegram() in send_message_tool through a narrower _TELEGRAM_SEND_AUDIO_EXTS = {.mp3, .m4a} set. - cron.scheduler._send_media_via_adapter now delegates the audio decision to should_send_media_as_audio so it matches the gateway. - Update the cron live-adapter ogg test to flag [[audio_as_voice]] so it still routes to sendVoice under the new Telegram-specific policy. - Tests: unit coverage for should_send_media_as_audio across platforms, end-to-end MEDIA routing via _process_message_background and GatewayRunner._deliver_media_from_response, TelegramAdapter.send_voice fallback for FLAC/WAV. Co-authored-by: Versun <me+github7604@versun.org>	2026-04-30 01:32:31 -07:00
Teknium	21e695fcb6	fix: clean up defensive shims and finish CI stabilization from #17660 (#17801) PR #17660 landed a sweep of CI fixes but left three loose ends: 1. tests/cli/test_cli_loading_indicator.py::test_reload_mcp_sets_busy_state_and_prints_status — /reload-mcp gained a prompt-cache-invalidation confirmation (commit `4d7fc0f37`) that was never wired into this test. The test exercises the loading-indicator path, so pre-approve via config and go straight into _reload_mcp(). 2. tools/mcp_tool.py _make_tool_handler — the added getattr(server, '_rpc_lock', None) + 'skip the lock if missing' branch is inconsistent with four sibling call sites that still direct-access server._rpc_lock. The lock is guaranteed by MCPServerTask.__init__; falling through to an unlocked session.call_tool would silently serialize-strip RPCs if the guard ever triggered. Restore direct access. 3. tui_gateway/server.py _messages_as_conversation — the helper existed only to catch 'TypeError: include_ancestors unexpected' from mocked SessionDBs that don't actually exist. The real SessionDB.get_messages_as_conversation has accepted include_ancestors since introduction, and every test FakeDB in the repo already declares the kwarg. Remove the shim, inline the two call sites.	2026-04-29 23:53:17 -07:00
Stephen Schoettler	f73364b1c4	fix(ci): stabilize main test suite regressions (#17660) * fix: stabilize main test suite regressions * test(agent): update MiniMax normalization expectation * test: stabilize remaining CI assertions * test: harden config helper monkeypatching * test: harden CI-only assertions * fix(agent): propagate fast streaming interrupts	2026-04-29 23:18:55 -07:00
Teknium	71c8ca17dc	chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 Removes drive-by duplication that accumulated during the contributor branch's multiple rebases. All runtime-benign (dict last-wins, redefinition last-wins) but left dead source that would confuse reviewers and maintainers. Surgical in-place de-duplication (kept PR's intentional additions, removed only the doubled copy): * hermes_cli/auth.py: duplicate "gmi" + "azure-foundry" ProviderConfig * hermes_cli/models.py: duplicate "gmi" entry in _PROVIDER_MODELS * hermes_cli/config.py: duplicate NOTION/LINEAR/AIRTABLE/TENOR skill env block + duplicate get_custom_provider_context_length definition * hermes_cli/gateway.py: duplicate _setup_yuanbao * gateway/platforms/base.py: duplicate is_host_excluded_by_no_proxy * gateway/platforms/telegram.py: duplicate delete_message * gateway/stream_consumer.py: duplicate _should_send_fresh_final and _try_fresh_final * gateway/run.py: duplicate _parse_reasoning_command_args / _resolve_session_reasoning_config / _set_session_reasoning_override, duplicate "Drain silently when interrupted" interrupt check * run_agent.py: duplicate HERMES_AGENT_HELP_GUIDANCE append, duplicate codex_message_items capture, duplicate custom_providers resolution * tools/approval.py: duplicate HARDLINE_PATTERNS section and duplicate hardline call in check_dangerous_command * tools/mcp_tool.py: duplicate _orphan_stdio_pids module-level decl * cron/scheduler.py: duplicate "not configured/enabled" check — kept the new early-rejection, removed the stale late-path copy Full-file resets to origin/main (all PR additions were duplicates of content already on main): * ui-tui/packages/hermes-ink/index.d.ts * ui-tui/packages/hermes-ink/src/entry-exports.ts * ui-tui/packages/hermes-ink/src/ink/selection.ts * ui-tui/src/app/interfaces.ts * ui-tui/src/app/slash/commands/core.ts * ui-tui/src/components/thinking.tsx * ui-tui/src/lib/memoryMonitor.ts * ui-tui/src/types.ts * 
ui-tui/src/types/hermes-ink.d.ts * tests/hermes_cli/test_doctor.py * tests/hermes_cli/test_api_key_providers.py * tests/hermes_cli/test_model_validation.py * tests/plugins/memory/test_hindsight_provider.py * tests/run_agent/test_run_agent.py * tests/gateway/test_email.py * tests/tools/test_dockerfile_pid1_reaping.py * hermes_cli/commands.py (slack_native_slashes block — full duplicate)	2026-04-29 21:56:51 -07:00
Ari Lotter	868bc1c242	feat(irc): add interactive setup feat(gateway): refine Platform._missing_ and platform-connected dispatch Restricts plugin-name acceptance to bundled plugin scan + registry (no arbitrary string -> enum-pollution), pulls per-platform connectivity checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean _is_platform_connected method, and adds tests covering the checker map, plugin platform interface, and IRC setup wizard.	2026-04-29 21:56:51 -07:00
Teknium	2e20f6ae2d	feat: complete plugin platform parity — all 12 integration points Extends the platform plugin interface from Phase 1 to cover every touchpoint where built-in platforms have hardcoded behavior. - allowed_users_env / allow_all_env: per-platform auth env vars - max_message_length: smart-chunking for send_message tool - pii_safe: session PII redaction flag - emoji: CLI/gateway display - allow_update_command: /update access control send_message tool (tools/send_message_tool.py): - Replaced hardcoded platform_map dict with Platform() call - Added _send_via_adapter() for plugin platforms — routes through live gateway adapter when available - Registry-aware max message length for smart chunking Cron delivery (cron/scheduler.py): - Replaced hardcoded 15-entry platform_map with Platform() call - Plugin platforms now work as cron delivery targets User authorization (gateway/run.py _is_user_authorized): - Registry fallback: checks PlatformEntry.allowed_users_env and allow_all_env when platform not in hardcoded maps - Plugin platforms get per-platform auth support _UPDATE_ALLOWED_PLATFORMS: checks registry allow_update_command flag Channel directory: includes plugin platforms in session enumeration Orphaned config warning: descriptive message when plugin platform is in config but no plugin registered it Gateway weakref: _gateway_runner_ref for cross-module adapter access hermes status: shows plugin platforms with (plugin) tag hermes gateway setup: plugin platforms appear in menu with setup hints hermes_cli/platforms.py: get_all_platforms() merges with registry, platform_label() falls back to registry for plugin names - 8 new tests (extended fields, cron resolution, platforms merge) - Updated 3 tests for new Platform() based resolution - 2829 passed, 24 pre-existing failures, zero new failures	2026-04-29 21:56:51 -07:00
Teknium	4d7fc0f37c	feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation Reloading MCP servers rebuilds the tool set for the active session, which invalidates the provider prompt cache (tool schemas are baked into the system prompt). The next message re-sends full input tokens — can be expensive on long-context or high-reasoning models. To surface that cost, /reload-mcp now routes through a new slash-confirm primitive with three options: Approve Once / Always Approve / Cancel. 'Always Approve' persists approvals.mcp_reload_confirm: false so future reloads run silently. Coverage: * Classic CLI (cli.py) — interactive numbered prompt. * TUI (tui_gateway + Ink ops.ts) — text warning on first call; `now` / `always` args skip the gate; `always` also persists the opt-out. * Messenger gateway — button UI on Telegram (inline keyboard), Discord (discord.ui.View), Slack (Block Kit actions); text fallback on every other platform via /approve /always /cancel replies intercepted in gateway/run.py _handle_message. * Config key: approvals.mcp_reload_confirm (default true). * Auto-reload paths (CLI file watcher, TUI config-sync mtime poll) pass confirm=true so they do NOT prompt. Implementation: * tools/slash_confirm.py — module-level pending-state store used by all adapters and by the CLI prompt. Thread-safe register/resolve/clear. * gateway/platforms/base.py — send_slash_confirm hook (default 'Not supported' → text fallback). * gateway/run.py — _request_slash_confirm helper + text intercept in _handle_message (yields to in-progress tool-exec approvals so dangerous-command /approve still unblocks the tool thread first). Tests: * tests/tools/test_slash_confirm.py — primitive lifecycle + async resolution + double-click atomicity (16 tests). * tests/hermes_cli/test_mcp_reload_confirm_gate.py — default-config shape + deep-merge preserves user opt-out (5 tests). 
Targeted runs (hermetic): 89 passed (slash-confirm, config gate, existing agent cache, existing telegram approval buttons).	2026-04-29 21:56:47 -07:00
teknium1	dd2d1ba5e6	refactor(reload-skills): queue note for next turn, drop cache invalidation + agent tool Salvage-follow-up to @shannonsands's /reload-skills PR. Trims the feature to match the design: user-initiated rescan, no prompt-cache reset, no new schema surface, no phantom user turn, and the next-turn note carries each added/removed skill's 60-char description (not just its name). Changes vs the original PR: * Drop the in-process skills prompt-cache clear in reload_skills(). Skills are invoked at runtime via /skill-name, skills_list, or skill_view — they don't need to live in the system prompt for the model to use them. Keeping the cache intact preserves prefix caching across the reload so /reload-skills pays no cache-reset cost. (MCP has to break the cache because tool schemas must be known at conversation start; skills do not.) * Drop the skills_reload agent tool and SKILLS_RELOAD_SCHEMA from tools/skills_tool.py, plus the four skills_reload enumerations in toolsets.py. No new schema surface — agents can already see a freshly- installed skill via skill_view / skills_list the moment it's on disk. * Replace the phantom 'role: user' turn injection with a one-shot queued note. CLI uses self._pending_skills_reload_note (same pattern as _pending_model_switch_note, prepended to the next API call and cleared). Gateway uses self._pending_skills_reload_notes[session_key]. The note is prepended to the NEXT real user message in this session, so message alternation stays intact and nothing out-of-band is persisted to the transcript. * reload_skills() now returns added/removed as [{'name': str, 'description': str}, ...] (description truncated to 60 chars — matches the curator / gateway adapter budget). The injected next-turn note formats each entry as 'name — description' so the model can actually reason about which new skills to call without running skills_list first. * Only emit the note when the diff is non-empty. 
On empty diff, print 'No new skills detected' and do nothing else. * Tests rewritten to cover the queue semantics, the description payload, and a regression guard that the prompt-cache snapshot is preserved.	2026-04-29 21:07:47 -07:00
Shannon Sands	7966560fb5	feat(skills): /reload-skills slash command + skills_reload agent tool Adds a public reload path for the in-process skill caches so newly installed (or removed) skills become visible mid-session without a gateway restart. Mirrors the shape of /reload-mcp. Three surfaces: * /reload-skills slash command — CLI (cli.py) and gateway (gateway/run.py), with /reload_skills alias for Telegram autocomplete and an explicit Discord registration. * skills_reload agent tool (tools/skills_tool.py) — lets agents/subagents pick up freshly-installed skills via tool call. * agent.skill_commands.reload_skills() — shared helper that clears _skill_commands, _SKILLS_PROMPT_CACHE (in-process LRU), and the on-disk .skills_prompt_snapshot.json, then returns an added/removed diff plus the new total count. Tested: * tests/agent/test_skill_commands_reload.py (9 cases) * tests/cli/test_cli_reload_skills.py (3 cases) * tests/gateway/test_reload_skills_command.py (4 cases) Use case: NemoClaw / OpenShell-style sandboxed orchestrators that drop skills into ~/.hermes/skills mid-session, plus agentic flows where the agent itself installs a skill via the shell tool and needs it bound without a gateway restart. The Python helper clear_skills_system_prompt_cache(clear_snapshot=True) already exists internally — this PR just exposes it via slash command and tool.	2026-04-29 21:07:47 -07:00
Teknium	0ba451d004	fix(vision): use HERMES_HOME-based cache dir instead of cwd (#17719) vision_analyze used Path('./temp_vision_images') — a relative path that resolved against cwd. Under Docker the image's WORKDIR is /opt/hermes, which is root-owned and only chmoded a+rX (read + traversal). Since #5811 landed (run as non-root hermes UID 10000, Apr 12), remote-URL vision calls fail with PermissionError on mkdir. Switch to get_hermes_dir('cache/vision', 'temp_vision_images'): resolves to $HERMES_HOME/cache/vision/ (= /opt/data/cache/vision/ in Docker — the user-owned volume mount). Existing installs with the old dir keep using it via the get_hermes_dir back-compat path; no migration needed. Only site in the codebase that stored runtime files via Path('./...'). Reported via Discord: https://juick.com/i/p/3089079.jpg → Telegram → gateway → [Errno 13] Permission denied: 'temp_vision_images'.	2026-04-29 20:14:02 -07:00
Teknium	c61b2e0af7	feat(skills): refuse skill_manage writes on pinned skills (#17562) Extend curator's pin flag from 'skip auto-transitions' to 'no agent edits at all'. All five skill_manage mutation actions (edit, patch, delete, write_file, remove_file) now refuse pinned skills with a message pointing the user at `hermes curator unpin <name>`. Motivation: pin used to only stop the curator's own maintenance pass from touching a skill. Nothing prevented the main agent from editing or deleting a pinned skill via skill_manage in-session. This gives users a hard fence against unwanted agent edits — same semantics as curator pinning, extended to the write tool. Create is unaffected (you can't pin a name that doesn't exist yet, and name collisions already error out). Broken sidecars fail open rather than lock the agent out. The schema description advertises the new refusal so models know not to route around it with rename/recreate tricks.	2026-04-29 10:28:25 -07:00
Teknium	8c8fc6c1ec	fix(skills): let skill_manage patch/edit/delete skills in external_dirs in place (#17512) Closes #4759, closes #4381. Mutating actions (patch, edit, write_file, remove_file, delete) used to refuse skills that lived under `skills.external_dirs` with 'Skill X is in an external directory and cannot be modified. Copy it to your local skills directory first.' Faced with that error, the agent would fall back to action='create', which always writes under ~/.hermes/skills/ — producing a silent duplicate of the external skill in the local store. Fix: drop the read-only gate. `skills.external_dirs` is configured by the user; if they pointed it at a directory, they already said 'these are my skills, treat them the same.' Filesystem permissions handle the genuine read-only case (write fails, agent sees the error). - New _containing_skills_root() resolves whichever dir actually contains the skill; _delete_skill uses it to bound empty-category cleanup so an external root is never rmdir'd. - _create_skill behavior is unchanged: new skills still land in local SKILLS_DIR only. Fewer moving parts. - Seven new TestExternalSkillMutations tests covering patch/edit/write_file/remove_file/delete/create against a mocked two-root layout + a category rmdir-safety check.	2026-04-29 08:16:52 -07:00
kshitijk4poor	13c238327e	fix: address self-review findings for Vercel Sandbox salvage - Add vercel_sandbox to hardline blocklist container bypass test - Add vercel_sandbox to skills_tool remote backend parametrize test - Deduplicate runtime set: doctor.py and setup.py now import _SUPPORTED_VERCEL_RUNTIMES from terminal_tool.py - Add docstring to _run_bash explaining timeout/stdin_data discards - Always stop sandbox during cleanup (unconditional, matching Modal/Daytona) - Update security.md: container bypass text, production tip, comparison table - Update environment-variables.md: TERMINAL_ENV list, Vercel auth vars, TERMINAL_VERCEL_RUNTIME - Update inline comments in cli.py and config.py to include vercel_sandbox	2026-04-29 07:22:33 -07:00
Scott Trinh	5a1d4f6804	feat: add Vercel Sandbox backend Adds Vercel Sandbox as a supported Hermes terminal backend alongside existing providers (Local, Docker, Modal, SSH, Daytona, Singularity). Uses the Vercel Python SDK to create/manage cloud microVMs, supports snapshot-based filesystem persistence keyed by task_id, and integrates with the existing BaseEnvironment shell contract and FileSyncManager for credential/skill syncing. Based on #17127 by @scotttrinh, cherry-picked onto current main.	2026-04-29 07:22:33 -07:00
Teknium	398945e7b1	fix(cron): accept list-form deliver values so deliver=['telegram'] works (#17456 ) The cron schema contracts deliver as a string ("local", "origin", "telegram", "telegram:chat_id[:thread_id]", or comma-separated combos), but MCP clients and scripts sometimes pass an array like ['telegram']. Before this change, the list was written to jobs.json verbatim, and the scheduler's str(deliver).split(',') then tried to resolve the literal string "['telegram']" as a platform — returning None and logging 'no delivery target resolved for deliver=[\'telegram\']'. Fix on both ends: - tools/cronjob_tools.py: normalize deliver at the API boundary on create and update, so storage is always a string. - cron/scheduler.py: normalize deliver in _resolve_delivery_targets, so existing jobs.json entries with list-form deliver are handled gracefully without requiring users to edit the file. Closes #17139	2026-04-29 06:35:34 -07:00
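The normalization described in the commit above can be sketched roughly as follows. This is an illustration under assumptions, not the code in tools/cronjob_tools.py or cron/scheduler.py: the name `normalize_deliver` and the handling of `None` and blank entries are hypothetical.

```python
from typing import Optional, Sequence, Union


def normalize_deliver(deliver: Union[str, Sequence[str], None]) -> Optional[str]:
    """Hypothetical helper: coerce list-form deliver values to the string form."""
    if deliver is None:
        return None
    if isinstance(deliver, str):
        return deliver.strip()
    # List or tuple input (e.g. from an MCP client): join non-empty entries
    # with commas so storage always holds the documented string form.
    parts = [str(item).strip() for item in deliver if str(item).strip()]
    return ",".join(parts)


# The failure mode from the commit message: str() on a list yields its repr,
# which can never resolve to a platform name.
broken = str(["telegram"])
fixed = normalize_deliver(["telegram"])
```

Applying the same function both when a job is written and when it is read back is what lets older list-form entries keep working without manual edits.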
teknium1	9e63062b6c	fix(stt): resolve API keys from ~/.hermes/.env via get_env_value (#17140 ) Widen #17163 to the sibling file tools/transcription_tools.py, which had the same class of bug. STT provider call sites and the _get_provider selection gate called os.getenv(...) directly and missed keys that only lived in ~/.hermes/.env. Same pattern as tts_tool.py: one guarded top-level import of get_env_value (falls back to os.getenv on ImportError), then every API-key and paired-base-URL lookup swapped over. Call sites migrated: - _transcribe_groq — GROQ_API_KEY - _transcribe_mistral — MISTRAL_API_KEY - _transcribe_xai — XAI_API_KEY, XAI_STT_BASE_URL - _get_provider — GROQ/MISTRAL/XAI_API_KEY in explicit + auto branches Module-level defaults (DEFAULT_STT_MODEL, GROQ_BASE_URL, etc.) stay on os.getenv — they're import-time constants, not runtime config, and the dotenv fallback would add no value there. New regression tests in tests/tools/test_transcription_dotenv_fallback.py (8 cases) mirror briandevans' TTS tests: per-provider dotenv-key forwarding, selection-gate dotenv visibility, and an end-to-end probe that patches hermes_cli.config.load_env to simulate ~/.hermes/.env carrying the key while os.environ does not.	2026-04-29 06:25:20 -07:00
briandevans	33967b4e52	fix(tts): tolerate missing hermes_cli.config in tts_tool import Wrap the new top-level `from hermes_cli.config import get_env_value` in try/except ImportError and fall back to a thin os.getenv shim, so importing tools.tts_tool keeps working in environments where hermes_cli.config is unavailable. This matches the existing tolerance in `_load_tts_config()` (tools/tts_tool.py) and the same import-fallback pattern in tools/tool_backend_helpers.py::fal_key_is_configured. Also update the TestDotenvFallbackPerProvider docstring to accurately describe the mocking strategy: per-provider tests patch `tools.tts_tool.get_env_value` directly, while the regression-guard tests cover the lower-level `hermes_cli.config.load_env` integration. Addresses Copilot review on #17163. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 06:25:20 -07:00
briandevans	40d25e125b	fix(tts): resolve API keys from ~/.hermes/.env via get_env_value (#17140 ) TTS provider tools (elevenlabs, xai, minimax, mistral, gemini) called os.getenv("X_API_KEY") directly, which bypassed Hermes's dotenv bridge in hermes_cli.config. Users who keep their TTS keys only in ~/.hermes/.env saw "X_API_KEY not set" errors even though the rest of the stack (agent/credential_pool, hermes_cli/auth) already resolves keys through get_env_value() — same class of bug as #15914 fixed for those modules. Switch every TTS env-var lookup (API keys, base URLs, and check_tts_requirements gates) to get_env_value, which checks os.environ first and then ~/.hermes/.env. Behaviour for users with keys exported in the shell is unchanged; users with dotenv-only keys now succeed. The two diagnostics prints in __main__ are migrated for consistency. Regression test (tests/tools/test_tts_dotenv_fallback.py): - per-provider: each backend reads the dotenv key when only ~/.hermes/.env carries it (5 providers). - end-to-end: with hermes_cli.config.load_env returning the key and os.environ empty, _generate_minimax_tts and check_tts_requirements both succeed; reverting tools/tts_tool.py back to os.getenv makes all 7 tests fail with "MINIMAX_API_KEY not set" / similar. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 06:25:20 -07:00
vominh1919	fe6c86623f	fix: close file descriptor in LocalEnvironment._update_cwd _update_cwd() uses a bare open(self._cwd_file).read() that never closes the file descriptor. This method runs on every terminal command execution, so the fd leaks accumulate in long sessions. Use a with statement so the fd is released promptly. Fixes #15552 (standalone resubmission)	2026-04-29 05:46:52 -07:00
Teknium	20b759cd02	fix(process): reconcile session.exited against real child exit in poll/wait (#17430 ) When a background terminal process spawns a descendant daemon that inherits the stdout pipe (e.g. 'hermes update' triggering a gateway systemctl restart), the reader thread's stdout.read() never returns EOF and its finally: block never runs. session.exited stays False forever, so process(action='poll') returns 'running' indefinitely even though the direct child exited long ago. Issue #17327: Feishu user polled 74 times over 7 minutes before killing the gateway manually. Fix: add _reconcile_local_exit() that checks the direct Popen.poll() before trusting session.exited. If the direct child has exited, drain any immediately-readable bytes non-blocking and flip session.exited. Called from poll() and wait(). The stuck reader thread remains blocked but is a daemon thread and gets reaped with the process. Safe no-op for env/PTY sessions, already-exited sessions, and live children (returns None from Popen.poll()).	2026-04-29 04:59:21 -07:00
loongzhao	ecaf8008bb	feat(yuanbao): wire native text + media delivery into send_message _send_yuanbao() already supported media_files= and the user-facing error strings already advertised yuanbao support, but there was no dispatch branch in _send_to_platform() actually routing to it. Target yuanbao in send_message previously fell through to "Direct sending not yet implemented". - Add yuanbao media-chunk branch (mirrors Signal/Matrix: media on final chunk only). - Add yuanbao elif in the non-media loop. Salvage of #17411; SKILL.md description change and redundant sidebars.ts entry dropped, indentation/trailing-whitespace cleaned up.	2026-04-29 04:56:18 -07:00
Ben Barclay	58a6171bfb	Merge pull request #17305 from NousResearch/feat/docker-run-as-host-user feat(docker): run container as host user to avoid root-owned bind mounts	2026-04-29 16:41:55 +10:00
Teknium	2d137074a3	refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304 ) The "cfg.get('X', {}).get('Y', default)" pattern appears 50+ times across tools/, gateway/, and plugins/. Each call site manually handles the same three gotchas: 1. Missing intermediate key → empty dict → chain works 2. Non-dict value at intermediate position → AttributeError (uncaught in most sites, so a misconfigured YAML crashes the tool) 3. cfg is None → AttributeError Introduces cfg_get(cfg, keys, default=None) in hermes_cli/config.py as the canonical helper. Handles all three uniformly, returns default only when the final key is absent* (matches dict.get semantics — explicit None values are preserved, falsy values like 0 / False / '' are preserved). Named cfg_get rather than cfg_path to avoid shadowing the existing 'cfg_path = _hermes_home / "config.yaml"' local variable that appears in gateway/run.py, cron/scheduler.py, hermes_cli/main.py, etc. Migrated 20 call sites as the first-batch proof-of-value: gateway/run.py 10 sites (agent/display subtrees) tools/browser_tool.py 3 sites tools/vision_tools.py 2 sites tools/browser_camofox.py 1 site tools/approval.py 1 site tools/skills_tool.py 1 site tools/skill_manager_tool.py 1 site tools/credential_files.py 1 site tools/env_passthrough.py 1 site The remaining ~30 sites across plugins/ and smaller tool files can be migrated opportunistically — the helper is now available and the pattern is established. Fixed a latent bug along the way: tools/vision_tools.py had its cfg_get usage at line 560 inside a function that locally re-imports 'from hermes_cli.config import load_config', but the AST-based migration script wrote the top-level cfg_get import to a different function scope, leaving line 560's cfg_get as a NameError silently swallowed by the surrounding try/except. Test test_vision_uses_configured_temperature_and_timeout caught it. Fixed by including cfg_get in the function-local import. 
Verified: - 7880/7893 tests/tools/ + tests/gateway/ + tests/hermes_cli/test_config tests pass; all 13 failures pre-existing on main (MCP, delegate, session_split_brain — verified earlier in the sweep). - All 20 migrated sites AST-verified to have cfg_get in scope (either module-level or function-local). - Live 'hermes chat' smoke: 2 turns + /model switch + tool calls + /quit, zero errors. Agent correctly counted 20 cfg_get hits across 8 tool files — matching the migration. Semantic parity verified against the original pattern across 8 edge cases (missing keys, None values, falsy values, empty strings, string instead of dict, None cfg, nested levels).	2026-04-28 23:17:39 -07:00
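A minimal sketch of a nested-get helper with the semantics the commit above describes. It is not the implementation in hermes_cli/config.py; the name `cfg_get_sketch` is hypothetical, and passing `keys` as a sequence of strings is an assumption, since the commit only gives the signature `cfg_get(cfg, keys, default=None)`.

```python
from typing import Any, Sequence


def cfg_get_sketch(cfg: Any, keys: Sequence[str], default: Any = None) -> Any:
    """Walk nested dicts; fall back to default on any broken link."""
    node = cfg
    for key in keys:
        # Covers cfg being None, a non-dict value at an intermediate
        # position, and a missing key, all without raising.
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    # A key that is present is returned as-is, so explicit None and
    # falsy values such as 0, False or '' are preserved.
    return node


config = {"agent": {"display": {"timeout": 0, "theme": None}}, "tools": "oops"}
timeout = cfg_get_sketch(config, ["agent", "display", "timeout"], 30)
```

The design point is that the default applies only when the path cannot be followed, which matches `dict.get` at the final key.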
Ben	5531c0df82	feat(docker): run container as host user to avoid root-owned bind mounts Add opt-in terminal.docker_run_as_host_user config flag that passes --user $(id -u):$(id -g) to the Docker backend so files written into bind-mounted directories (/workspace, /root, docker_volumes entries) are owned by the host user instead of root. When enabled on POSIX platforms, also drops SETUID/SETGID caps since the container no longer needs gosu/su to switch users. Falls back cleanly on platforms without os.getuid (e.g. native Windows Docker) with a warning. Wired through all three config.yaml -> TERMINAL_* env-var bridges: - cli.py env_mappings (CLI + TUI startup) - gateway/run.py _terminal_env_map (gateway / messaging platforms) - hermes_cli/config.py _config_to_env_sync (`hermes config set`) Also fixes docker_mount_cwd_to_workspace silently failing in gateway mode -- it was missing from gateway/run.py's _terminal_env_map. Adds tests/tools/test_terminal_config_env_sync.py to guard against future drift between the three bridges (same bug class shipped twice in one month). Bundled Hermes image won't work with this flag since its entrypoint expects to start as root for the usermod/gosu hermes flow; works with the default nikolaik/python-nodejs image and plain Debian/Ubuntu.	2026-04-29 16:16:43 +10:00
brooklyn!	5e68503d2f	Merge pull request #17190 from NousResearch/bb/tui-cold-start-profiling perf(tui): cut visible cold start ~57% with lazy agent init	2026-04-28 22:45:14 -07:00
Teknium	0d31864e3b	fix(curator): defense-in-depth gates against bundled/hub skills Previous invariants only gated the primary entry points (apply_automatic_transitions, archive_skill, CLI pin). Several paths were unprotected: - bump_view / bump_use / bump_patch / set_state / set_pinned wrote usage records unconditionally, which is confusing noise in .usage.json even though the review list filtered them out - restore_skill did not check whether a bundled skill now shadows the archived name - CLI unpin was asymmetric with CLI pin — it had no gate Fixes: - _mutate() (the shared counter / state writer) now drops silently when the skill is not agent-created. .usage.json never gains a record for a bundled or hub-installed skill. - restore_skill() refuses to restore under a name that is now bundled or hub-installed (would shadow upstream). - CLI unpin gate matches CLI pin. New tests: - 5 provenance-guard tests on skill_usage (one per mutator) - 1 end-to-end test that hammers every mutator at a bundled skill and a hub skill, asserts both are untouched on disk, and asserts the sidecar stays clean - 2 CLI tests proving pin/unpin refuse bundled skills symmetrically 64/64 tests passing (29 skill_usage + 27 curator + 8 new guards).	2026-04-28 22:33:33 -07:00
Teknium	bc79e227e6	feat(curator): background skill maintenance (issue #7816 ) Adds the Curator — an auxiliary-model background task that periodically reviews AGENT-CREATED skills and keeps the collection tidy: tracks usage, transitions unused skills through active → stale → archived, and spawns a forked AIAgent to consolidate overlaps and patch drift. Default: enabled, inactivity-triggered (no cron daemon). Runs on CLI startup and gateway boot when the last run is older than interval_hours (default 24) AND the agent has been idle for min_idle_hours (default 2). Invariants (all load-bearing): - Never touches bundled or hub-installed skills (.bundled_manifest + .hub/lock.json double-filter) - Never auto-deletes — archive only. Archives are recoverable via `hermes curator restore <skill>` - Pinned skills bypass all auto-transitions - Uses the aux client; never touches the main session's prompt cache New files: - tools/skill_usage.py — sidecar .usage.json telemetry, atomic writes, provenance filter - agent/curator.py — orchestrator: config, idle gating, state-machine transitions (pure, no LLM), forked-agent review prompt - hermes_cli/curator.py — `hermes curator {status,run,pause,resume, pin,unpin,restore}` subcommand - tests/tools/test_skill_usage.py — 29 tests - tests/agent/test_curator.py — 25 tests Modified files (surgical patches): - tools/skills_tool.py — bump view_count on successful skill_view - tools/skill_manager_tool.py — bump patch_count on skill_manage patch/edit/write_file/remove_file; forget record on delete - hermes_cli/config.py — add curator: section to DEFAULT_CONFIG - hermes_cli/commands.py — add /curator CommandDef with subcommands - hermes_cli/main.py — register `hermes curator` subparser via register_cli() from hermes_cli.curator - cli.py — /curator slash-command dispatch + startup hook - gateway/run.py — gateway-boot hook (mirrors CLI) Validation: - 54 new tests across skill_usage + curator, all passing in 3s - 346 tests across all touched 
files' neighbors green - 2783 tests across hermes_cli/ + gateway/test_run_progress_topics.py green - CLI smoke: `hermes curator status/pause/resume` work end-to-end Companion to PR #16026 (class-first skill review prompt) — together they form a loop: the review prompt stops near-duplicate skill creation at the source, and the curator prunes/consolidates what still accumulates. Refs #7816.	2026-04-28 22:33:33 -07:00
Lyle Lengyel	80e474f11f	fix(gateway,terminal): expand shell tilde in terminal.cwd before subprocess Commit `3c42064e` made config.yaml the single source of truth for TERMINAL_CWD, but the config bridge passes cwd values verbatim to os.environ. When a user sets terminal.cwd: ~/ in config.yaml, the literal string '~/' reaches subprocess.Popen, which the kernel rejects because it does not expand shell tilde syntax. This patch adds three defensive layers: 1. gateway/run.py — expanduser at config bridge time so TERMINAL_CWD is always an absolute path. 2. tools/terminal_tool.py — expanduser when reading TERMINAL_CWD in _get_env_config(), guarding against stale or manually-set env vars. 3. tools/environments/local.py — expanduser in LocalEnvironment before passing cwd to subprocess.Popen, the final safety net. Includes regression tests in test_config_cwd_bridge.py for nested terminal.cwd, top-level cwd alias, and precedence ordering. Refs: `3c42064e`	2026-04-28 22:26:09 -07:00
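The tilde expansion in the commit above relies on `os.path.expanduser`. A small sketch, assuming a hypothetical helper name `resolve_cwd` and an added `os.path.abspath` call that the commit does not mention:

```python
import os


def resolve_cwd(raw: str) -> str:
    """Hypothetical helper: turn a configured cwd into an absolute path."""
    # expanduser replaces a leading '~' with the user's home directory;
    # abspath then normalizes the result (for example, dropping a trailing slash).
    return os.path.abspath(os.path.expanduser(raw))


# subprocess.Popen(cwd='~/') would fail because nothing below the shell
# interprets '~'. Expanding first gives Popen a real directory path.
home_cwd = resolve_cwd("~/")
```

Expanding at each layer is redundant by design: `expanduser` leaves an already-absolute path unchanged, so repeating it is harmless.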
Tranquil-Flow	ac855bba0e	fix(cli): respect terminal.cwd config in local terminal backend init_session() runs a login shell bootstrap that sources profile scripts (.bashrc, .bash_profile, etc.) before capturing pwd. If any profile script changes the working directory, the captured cwd overwrites the configured terminal.cwd value — so terminal commands run in the wrong directory despite the TUI banner showing the configured path. Add an explicit 'builtin cd' to the configured cwd in the bootstrap script, after profile sourcing but before pwd capture, ensuring the configured terminal.cwd is always what gets recorded. Fixes #14044	2026-04-28 22:16:08 -07:00
Brooklyn Nicholson	9e398e1809	perf(tui): avoid importing classic CLI during tool discovery TUI session readiness was still laggy after the gateway-ready fixes. Profiling session.create -> session.info showed the slow phase is background AIAgent construction (~1.1s). A cProfile run of tui_gateway.server::_make_agent showed model_tools/tool discovery importing tools.code_execution_tool, whose module-level EXECUTE_CODE_SCHEMA calls _get_execution_mode(), which imported cli.CLI_CONFIG. That pulled the classic interactive CLI stack (prompt_toolkit/Rich and REPL setup) into every agent startup path, including hermes --tui where it is not used. Replace that with hermes_cli.config.read_raw_config(), which is cached and reads only the raw code_execution section. Existing defaults still apply when the key is absent. Measurements on macOS Terminal.app: - import run_agent: ~466ms -> ~347ms - model_tools import: ~418ms -> ~272ms - _make_agent: ~1452ms -> ~1239ms - session.create -> session.info: ~1167ms -> ~999ms - full hermes --tui ready p50: ~1655ms -> ~1537ms Tests: - scripts/run_tests.sh tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py	2026-04-28 22:42:17 -05:00
Teknium	cd7150a195	perf(approval): precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS (#17206 ) detect_dangerous_command() and detect_hardline_command() were calling re.search(pattern, text, re.IGNORECASE | re.DOTALL) inline — Python's re._cache (512 patterns) amortizes compile cost on the warm path, but: 1. The first terminal() call per process pays the full compile fan-out for all 59 patterns (12 HARDLINE + 47 DANGEROUS). Measured at ~2.6 ms per detect_dangerous_command() call after re.purge(). 2. The re._cache is LRU — unrelated regex work elsewhere in the agent (response parsing, text normalization, etc.) can evict our patterns and silently re-compile them on the next terminal() call. Precompiling at module load eliminates both costs: detect_dangerous_command: cold 2.613 ms → 0.298 ms (-88%) warm 0.042 ms → 0.004 ms (-90%) detect_hardline_command: cold ~0.6 ms → 0.006 ms warm 0.011 ms → 0.002 ms Savings are per terminal() call. Agents with heavy terminal use see compound savings; the bigger value is the stability guarantee (no re._cache eviction can silently re-introduce the 2.6 ms cold cost mid-session). Implementation: - HARDLINE_PATTERNS_COMPILED and DANGEROUS_PATTERNS_COMPILED built at module load from the existing (pattern, description) tuples, using shared _RE_FLAGS = re.IGNORECASE | re.DOTALL. - detect_* functions now iterate the compiled list and call pattern_re.search(text). - Original HARDLINE_PATTERNS and DANGEROUS_PATTERNS lists kept as-is (other code in the file uses them for key derivation / _PATTERN_KEY_ALIASES). Verified: - 160/161 tests/tools/test_approval*.py pass (1 pre-existing heartbeat test flake on main). - 349/349 tests/tools/ 'approval or terminal or dangerous' pass. - Live hermes chat smoke: 3 benign terminal commands + 1 rm -rf /tmp/ (clarify prompt fired — approval path still works) + 1 sudo (sudo password prompt fired — DANGEROUS pattern match still works). 23 log lines in the smoke window, zero errors. 
Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 18:44:14 -07:00
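The precompilation pattern from the commit above, sketched with two made-up patterns. Only `_RE_FLAGS` and the (pattern, description) tuple shape come from the commit message; `EXAMPLE_PATTERNS`, `EXAMPLE_PATTERNS_COMPILED` and `detect_example` are hypothetical names, and the regexes are illustrative rather than the project's real lists.

```python
import re
from typing import Optional

_RE_FLAGS = re.IGNORECASE | re.DOTALL

# Illustrative (pattern, description) tuples; the real lists are larger.
EXAMPLE_PATTERNS = [
    (r"\brm\s+-[a-z]*r[a-z]*f", "recursive force delete"),
    (r"\bsudo\b", "privilege escalation"),
]

# Compiled once at module load, so no call depends on re's internal cache.
EXAMPLE_PATTERNS_COMPILED = [
    (re.compile(pattern, _RE_FLAGS), description)
    for pattern, description in EXAMPLE_PATTERNS
]


def detect_example(text: str) -> Optional[str]:
    """Return the description of the first matching pattern, else None."""
    for pattern_re, description in EXAMPLE_PATTERNS_COMPILED:
        if pattern_re.search(text):
            return description
    return None
```

Keeping the uncompiled tuple list alongside the compiled one mirrors the commit's note that other code still derives keys from the original lists.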
Teknium	9f004b6d94	perf(tools): memoize get_tool_definitions + TTL-cache check_fn results (#17098 ) Two amplifying optimizations to per-turn overhead in the gateway: 1. get_tool_definitions() memoization (model_tools.py) Keyed on (frozenset(enabled), frozenset(disabled), registry._generation, config.yaml mtime+size). Only active when quiet_mode=True (which is every hot-path caller — gateway, AIAgent.__init__); quiet_mode=False keeps the existing print side effects. Cached path returns a shallow-copy list sharing read-only schema dicts. Measured: 7.5 ms → 0.01 ms per call (~750× speedup). Gateway constructs fresh AIAgent per message, so this saves ~7 ms/turn before any LLM work. 2. check_fn() TTL cache (tools/registry.py) check_fn callables like check_terminal_requirements probe external state (Docker daemon, Modal SDK, playwright binary). For a long-lived process, hitting them on every get_definitions() pass was pure waste — external state changes on human timescales. 30 s TTL so env-var flips (hermes tools enable X) propagate within a turn or two without explicit invalidation. Measured: first call 7.5ms → 1.6ms (check_fn probes now dominate); subsequent calls ~0.01ms via the upstream memoization. Invalidation surface: - registry._generation bumps on register/deregister/register_toolset_alias, invalidating the memoized definitions automatically. - config.yaml mtime in the cache key captures user-visible config edits affecting dynamic schemas (execute_code mode, discord allowlist). - invalidate_check_fn_cache() exposed for explicit flushes (e.g. after hermes tools enable/disable). - tests/conftest.py autouse fixture clears both caches before every test so env-var monkeypatches don't see stale results. Also fixes a regression from PR #17046 that I missed: - tools/web_tools.py — Firecrawl was removed from module scope by the lazy import, breaking 8 tests that patch 'tools.web_tools.Firecrawl'. 
Applied the same _FirecrawlProxy pattern used in auxiliary_client/ run_agent for OpenAI (module-level proxy that looks like the class but imports the SDK on first call/isinstance; patch() replaces the attribute as usual). Verified: - 49/49 tests/tools/test_web_tools_config.py pass (was 8 failing on main) - 68/68 tests/tools/test_homeassistant_tool.py pass (was 1 failing in the full suite due to check_fn TTL cross-test pollution; fixed by the autouse fixture) - 3887/3895 tests/tools/ (8 pre-existing fails: 2 delegate, 1 mcp dynamic discovery, 5 mcp structured content — all confirmed on main) - 2973/2976 tests/agent/ + tests/run_agent/ (3 pre-existing fails) - 868/868 tests/run_agent/ (excluding test_run_agent.py which has pre-existing suite-level issues) - Live smoke: 2 turns + /model switch + tool calls, zero errors in agent.log session window. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 18:20:17 -07:00
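A rough sketch of the check_fn TTL cache idea from the commit above, assuming a plain dict keyed on the callable. The names `cached_check` and `invalidate_check_cache_sketch` and the injectable `now` clock are hypothetical and not taken from tools/registry.py; only the 30 s TTL comes from the commit message.

```python
import time
from typing import Callable, Dict, Tuple

_TTL_SECONDS = 30.0
_cache: Dict[Callable[[], bool], Tuple[float, bool]] = {}


def cached_check(check_fn: Callable[[], bool],
                 now: Callable[[], float] = time.monotonic) -> bool:
    """Return check_fn(), reusing a result younger than _TTL_SECONDS."""
    stamp = now()
    hit = _cache.get(check_fn)
    if hit is not None and stamp - hit[0] < _TTL_SECONDS:
        return hit[1]
    # Cache miss or expired entry: probe external state again.
    result = bool(check_fn())
    _cache[check_fn] = (stamp, result)
    return result


def invalidate_check_cache_sketch() -> None:
    """Explicit flush, e.g. after a tool is enabled or disabled."""
    _cache.clear()
```

A monotonic clock is used here so that wall-clock adjustments cannot extend or shorten the TTL; that choice is part of the sketch, not something the commit states.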
Teknium	b5128a751b	perf(startup): lazy-import OpenAI, Anthropic, Firecrawl, account_usage (#17046 ) * perf(startup): lazy-import OpenAI, Anthropic, Firecrawl, account_usage Four heavy SDK/module imports are now deferred off the hot startup path. Net savings on cold module imports: cli 1200 → 958 ms (-242) run_agent 1220 → 901 ms (-319) tools.web_tools 711 → 423 ms (-288) agent.anthropic_adapter 230 → 15 ms (-215) agent.auxiliary_client 253 → 68 ms (-185) Four independent changes in one PR since they all use the same pattern and share the same risk profile (heavy SDK import → lazy proxy or function-local import): 1. tools/web_tools.py: 'from firecrawl import Firecrawl' moved into _get_firecrawl_client(), which is only called when backend='firecrawl'. Users on Exa/Tavily/ Parallel pay zero firecrawl cost. 2. cli.py + gateway/run.py: 'from agent.account_usage import ...' moved into the /limits handlers. account_usage transitively pulls the OpenAI SDK chain; only needed when the user runs /limits. 3. agent/anthropic_adapter.py: 'try: import anthropic as _anthropic_sdk' replaced with a cached '_get_anthropic_sdk()' accessor. The three usage sites (build_anthropic_client, build_anthropic_bedrock_client, read_claude_code_credentials_from_keychain) now resolve via the accessor. All pre-existing test patches of 'agent.anthropic_adapter._anthropic_sdk' keep working because the accessor respects any value already in module globals. 4. agent/auxiliary_client.py AND run_agent.py: 'from openai import OpenAI' replaced with an '_OpenAIProxy()' module- level object that looks like the OpenAI class but imports the SDK on first call/isinstance check. This preserves: - 15+ in-module OpenAI(...) 
construction sites in auxiliary_client and the single site in run_agent's _create_openai_client (Python's function-scope name lookup finds the proxy, forwards the call); - 'patch("agent.auxiliary_client.OpenAI", ...)' and 'patch("run_agent.OpenAI", ...)' test patterns used by 28+ test files (patch replaces the module attribute as usual). Tried two alternatives first: - 'from openai._client import OpenAI' — doesn't skip openai/__init__.py (the audit's hypothesis here was wrong). - Module-level __getattr__ — works for external access but Python function-scope name resolution skips __getattr__, so in-module OpenAI(...) calls NameError. Note: 'openai' still loads on 'import cli' because cli.py -> neuter_async_httpx_del() -> openai._base_client, and run_agent.py -> code_execution_tool.py (module-level build_execute_code_schema) -> _load_config() -> 'from cli import CLI_CONFIG'. Deferring those is a separate, larger change — out of scope for this PR. The savings above all come from avoiding the openai/, anthropic/, and firecrawl/* top-level type-tree imports on paths that don't need them. Verified: - 302/302 tests in tests/agent/{test_anthropic_adapter, test_bedrock_1m_context, test_minimax_provider, test_anthropic_keychain} pass. Two pre-existing failures on main unchanged. - 106/106 tests/agent/test_auxiliary_client.py pass (1 pre-existing fail). - 97/97 tests/run_agent/test_create_openai_client_kwargs_isolation.py, test_plugin_context_engine_init.py, test_invalid_context_length_warning.py, test_api_max_retries_config.py, tests/hermes_cli/test_gemini_provider.py, test_ollama_cloud_provider.py pass (1 pre-existing fail). - Live hermes chat smoke: 2 turns + /model switch + tool calls, zero errors in the 57-line agent.log window. - Module-level import of run_agent + auxiliary_client + anthropic_adapter no longer pulls 'anthropic' or 'firecrawl' at all. 
* fix(gateway): restore top-level account_usage import for test-patch surface CI caught two failures in tests/gateway/test_usage_command.py that I missed locally: AttributeError: 'module' object at gateway.run has no attribute 'fetch_account_usage' The test uses monkeypatch.setattr('gateway.run.fetch_account_usage', ...) to inject a fake account-fetch call. Moving the import inside the handler deleted that module-level attribute, breaking the patch surface. Restoring the top-level import in gateway/run.py gives up the ~230 ms gateway-boot savings from that one lazy, but: 1. the gateway is a long-running daemon — boot cost is paid once per install, not per turn; 2. the other four lazy-imports (firecrawl, openai, anthropic, cli's account_usage) remain in place and still account for the bulk of the savings reported in the PR body; 3. preserving the patch surface keeps the established 'gateway.run.fetch_account_usage' monkeypatch pattern working without touching tests. Verified: tests/gateway/test_usage_command.py — 8 passed, 0 failed. Full targeted sweep (2336 tests across agent/gateway/hermes_cli/run_agent): 2332 passed, 4 failed — all 4 pre-existing on main. --------- Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 09:38:42 -07:00
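The lazy proxy approach described in the PR above can be illustrated with a stdlib class standing in for a heavy SDK. This is a sketch under assumptions: `LazyClassProxy` is a hypothetical name, `fractions.Fraction` is used only so the example runs without third-party packages, and the real `_OpenAIProxy` may differ in detail.

```python
import importlib


class LazyClassProxy:
    """Hypothetical stand-in that imports its target class on first use."""

    def __init__(self, module_name: str, class_name: str):
        self._module_name = module_name
        self._class_name = class_name
        self._target = None

    def _resolve(self):
        # Import is deferred until the first call or isinstance check.
        if self._target is None:
            module = importlib.import_module(self._module_name)
            self._target = getattr(module, self._class_name)
        return self._target

    def __call__(self, *args, **kwargs):
        # Lets Name(...) construct an instance of the real class.
        return self._resolve()(*args, **kwargs)

    def __instancecheck__(self, instance):
        # Lets isinstance(obj, Name) work against the proxy object.
        return isinstance(instance, self._resolve())


# Module-level name that looks like the class to in-module callers.
Fraction = LazyClassProxy("fractions", "Fraction")


def make_half():
    # Ordinary name lookup finds the module-level proxy and forwards the call.
    return Fraction(1, 2)
```

Because the proxy is a real module attribute, `unittest.mock.patch` on that attribute replaces it as usual, which is the property the PR description says a module-level `__getattr__` could not provide for in-module calls.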
Teknium	42be5e49b0	fix(browser): detect missing Chromium and fail fast with actionable error (#17039 ) Previously, check_browser_requirements() only checked for the agent-browser CLI, not the Chromium binary it drives. When the CLI was present but Chromium wasn't (common in Docker images predating the playwright install step), the browser tool was advertised to the agent, every call hung for the full command timeout (~30s each, ~220s for a chained navigate), and the agent eventually gave up with no useful error — users saw 'browser not working' with empty errors.log. Changes: - tools/browser_tool.py: add _chromium_installed() checking PLAYWRIGHT_BROWSERS_PATH + default Playwright cache paths for chromium-* / chromium_headless_shell-* dirs; wire into check_browser_requirements() for local mode (cloud providers unaffected). _run_browser_command fails fast with an actionable Docker vs. host message instead of hanging. _running_in_docker() checks /.dockerenv and /proc/1/cgroup. - hermes_cli/tools_config.py: post_setup for 'Local Browser' now runs 'agent-browser install --with-deps' after npm install to actually download Chromium. In Docker, points user at the updated image pull instead of trying to install into a read-only layer. Cloud-provider post_setup (browserbase) skips Chromium install entirely. - tests/tools/test_browser_chromium_check.py: new tests covering search roots, install detection, requirements branches (local/cloud/ camofox), and the fast-fail guard in docker/non-docker contexts. - tests/tools/test_browser_homebrew_paths.py: 5 existing subprocess-path tests now mock _chromium_installed=True since they exercise the post-guard subprocess path. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 07:03:44 -07:00
Teknium	6085d7a93e	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 ) Mechanical cleanup across 43 files — removes 46 unused imports (F401) and 14 unused local variables (F841) detected by `ruff check --select F401,F841`. Net: -49 lines. Also fixes a latent NameError in rl_cli.py where `get_hermes_home()` was called at module line 32 before its import at line 65 — the module never imported successfully on main. The ruff audit surfaced this because it correctly saw the symbol as imported-but-unused (the call happened before the import ran); the fix moves the import to the top of the file alongside other stdlib imports. One `# noqa: F401` kept in hermes_cli/status.py for `subprocess`: tests monkeypatch `hermes_cli.status.subprocess` as a regression guard that systemctl isn't called on Termux, so the name must exist at module scope even though the module body doesn't reference it. Docstring explains the reason. Also fixes an invalid `# noqa:` directive in gateway/platforms/discord.py:308 that lacked a rule code. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 06:46:45 -07:00
Teknium	69b8fa65d4	docs(delegate_task): clarify that it is synchronous and not durable (#17022 ) delegate_task runs inside the parent turn and is cancelled when the parent is interrupted (new user message, /stop, /new). The child status payload (status=interrupted, exit_reason=interrupted) is already honest, but the tool schema and user-facing docs did not set the expectation, so users reasonably assumed delegated subagents would keep running in the background after interrupting the parent. Updates: - tools/delegate_tool.py DELEGATE_TASK_SCHEMA description adds a WHEN NOT TO USE bullet pointing at cronjob / terminal(background=True, notify_on_complete=True) for durable long-running work. - website/docs/user-guide/features/delegation.md gains a Lifetime and Durability callout above Key Properties. - website/docs/guides/delegation-patterns.md expands the Use something else list and the Constraints section with the same guidance. Reported by LizLiz (@lizliz404) via Teknium. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 06:45:15 -07:00
Teknium	b61d9b297a	refactor: consolidate symlink-safe atomic replace into shared helper Extract the islink/realpath guard from the 16743 fix into a single atomic_replace() helper in utils.py, then migrate every os.replace() call site in the codebase to use it. The original PR #16777 correctly identified and fixed the bug, but only patched 9 of ~24 call sites. The same bug class (managed deployments that symlink state files silently losing the link on every write) still existed at auth.json, sessions file, gateway config, env_loader, webhook subscriptions, debug store, model catalog, pairing, google OAuth, nous rate guard, and more. Rather than add another 10+ copies of the same three-line guard, consolidate into atomic_replace(tmp, target) which: - resolves symlinks via os.path.realpath before os.replace - returns the resolved real path so callers can re-apply permissions - is a drop-in replacement for os.replace at the use sites Changes: - utils.py: new atomic_replace() helper + atomic_json_write / atomic_yaml_write now call it instead of inlining the guard - 16 files: all os.replace() call sites migrated to atomic_replace() - agent/{google_oauth, nous_rate_guard, shell_hooks}.py - cron/jobs.py - gateway/{pairing, session, platforms/telegram}.py - hermes_cli/{auth, config, debug, env_loader, model_catalog, webhook}.py - tools/{memory_tool, skill_manager_tool, skills_sync}.py Tests: tests/test_atomic_replace_symlinks.py pins the invariant for atomic_replace + atomic_json_write + atomic_yaml_write, covers plain files, first-time creates, broken symlinks, and permission preservation. Refs #16743 Builds on #16777 by @vominh1919.	2026-04-28 04:58:22 -07:00
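A minimal sketch of the symlink-safe replace described in the commit above, based only on the behaviour stated there (resolve via `os.path.realpath`, then `os.replace`, return the resolved path). The name `atomic_replace_sketch` is hypothetical and the real helper in utils.py may include additional guards.

```python
import os


def atomic_replace_sketch(tmp_path: str, target_path: str) -> str:
    """Replace the file behind target_path without clobbering a symlink."""
    # If target_path is a symlink, write through to the file it points at;
    # a plain os.replace would swap the link itself for a regular file.
    real_target = os.path.realpath(target_path)
    os.replace(tmp_path, real_target)
    # Returning the resolved path lets callers re-apply permissions to it.
    return real_target
```

For a target that is not a symlink, `os.path.realpath` returns the same location in resolved form, so the helper behaves like a plain `os.replace` at those call sites.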


1143 commits