hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-03 02:11:48 +00:00

Author	SHA1	Message	Date
Mind-Dragon	ab6c629ccc	fix(terminal): skip sudo prompt when local NOPASSWD sudo works When running on a host with sudoers NOPASSWD configured for the current user, interactive Hermes sessions were unnecessarily entering the password prompt path before executing sudo commands. Outside Hermes, `sudo -n true` exits 0 for that user. Add `_sudo_nopasswd_works()` that probes `sudo -n true` and, when it succeeds, lets `_transform_sudo_command()` return the command unchanged with no stdin password. The probe: - Is scoped to the `local` terminal backend only, so Docker/SSH/Modal and other remote backends do not inherit host sudo state. - Re-probes every call (no process-lifetime cache) so an expired sudo timestamp cannot silently make a later command block waiting for a password that Hermes never prompts for. - Is bypassed entirely when `SUDO_PASSWORD` is configured or a cached password already exists, preserving existing explicit-password flows. Co-authored-by: Junting Wu <juntingpublic@gmail.com>	2026-04-30 20:38:09 -07:00
hharry11	24130b7e53	fix(approval): harden YOLO mode env parsing against quoted-bool strings	2026-04-30 20:37:37 -07:00
teknium1	e21898ea98	test(discord_tool): add regression test for per-token capability cache Proves token A's detected capabilities do not leak to token B after the fix in the preceding commit. Before the fix this test would have seen both tokens return token A's cached value.	2026-04-30 20:37:12 -07:00
Teknium	82b5786721	test(browser_supervisor): cover cache-hit healthcheck on dead thread/loop Pure unit tests for _SupervisorRegistry — no Chrome required. Verified to fail when the fix is reverted, pass with it in place.	2026-04-30 20:33:33 -07:00
simbam99	142b4bf3ce	fix(session_search): order recent mode by last activity instead of start time - order session_search recent-mode results by last activity instead of session start time - add an opt-in `order_by_last_active` path to `SessionDB.list_sessions_rich` - add regression coverage for both the database ordering and recent-mode call path	2026-04-30 20:17:15 -07:00
johnncenae	9ae1fa9e39	fix(delegate): honor runtime default model during provider resolution	2026-04-30 19:58:55 -07:00
briandevans	97d6f25008	test(toolsets): include kanban in expected post-#17805 toolset assertions The kanban PR (#17805, `c86842546`) added the `kanban` toolset and `tools/kanban_tools.py`, but didn't update three pre-existing test assertions that bake the full toolset/tool inventory: * `tests/tools/test_registry.py::test_matches_previous_manual_builtin_tool_set` hard-codes the manual list of builtin tool modules. `tools.kanban_tools` was missing. * `tests/test_tui_gateway_server.py::test_load_enabled_toolsets_rejects_disabled_mcp_env` and `test_load_enabled_toolsets_falls_back_when_tui_env_invalid` both expect `["memory"]` from `_load_enabled_toolsets()`. With kanban now auto-recovered by `_get_platform_tools` (its tools live in hermes-cli's universe but are not in CONFIGURABLE_TOOLSETS), the resolver returns `["kanban", "memory"]`. * `tests/hermes_cli/test_tools_config.py::test_get_platform_tools_preserves_explicit_empty_selection` asserts `set()` for an explicit empty list. The recovery loop now also surfaces `kanban`. Reframed to assert the contract the test name describes — no CONFIGURABLE toolset gets re-enabled when the user explicitly saved an empty list — which stays correct as more non-configurable platform toolsets are added. Verified the failures reproduce on clean origin/main (`180a7036b`) with `.[all,dev]`-equivalent extras (fastapi, starlette, httpx, pytest-asyncio) and that all four pass with this commit applied. CI on main itself is currently red on these tests; this restores green for everyone's PRs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 19:43:03 -07:00
Teknium	c868425467	feat(kanban): durable multi-profile collaboration board (#17805 ) Salvage of PR #16100 onto current main (after emozilla's #17514 fix that unblocks plugin Pydantic body validation). History preserved on the standing `feat/kanban-standing` branch; this squashes the 22 iterative commits into one clean landing. What this lands: - SQLite kernel (hermes_cli/kanban_db.py) — durable task board with tasks, task_links, task_runs, task_comments, task_events, kanban_notify_subs tables. WAL mode, atomic claim via CAS, tenant-namespaced, skills JSON array per task, max-runtime timeouts, worker heartbeats, idempotency keys, circuit breaker on repeated spawn failures, crash detection via /proc/<pid>/status, run history preserved across attempts. - Dispatcher — runs inside the gateway by default (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims stale claims, promotes ready tasks, spawns `hermes -p <assignee> chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK + HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker` plus any per-task skills. Health telemetry warns on stuck ready queue. - Structured tool surface (tools/kanban_tools.py) — 7 tools (kanban_show, kanban_complete, kanban_block, kanban_heartbeat, kanban_comment, kanban_create, kanban_link). Gated on HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal sessions. - System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE) injected only when kanban tools are active. - Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board UI: triage/todo/ready/running/blocked/done columns, drag-drop, inline create, task drawer with markdown, comments, run history, dependency editor, bulk ops, lanes-by-profile grouping, WS-driven live refresh. Matches active dashboard theme via CSS variables. - CLI — `hermes kanban init\|create\|list\|show\|assign\|link\|unlink\| claim\|comment\|complete\|block\|unblock\|archive\|tail\|dispatch\|context\| init\|gc\|watch\|stats\|notify\|log\|heartbeat\|runs\|assignees` + `/kanban` slash in-session. - Worker + orchestrator skills (skills/devops/kanban-worker + kanban-orchestrator) — pattern library for good summary/metadata shapes, retry diagnostics, block-reason examples, fan-out patterns. - Per-task force-loaded skills — `--skill <name>` (repeatable), stored as JSON, threaded through to dispatcher argv as one `--skills X` pair per skill alongside the built-in kanban-worker. Dashboard + CLI + tool parity. - Deprecation of standalone `hermes kanban daemon` — stub exits 2 with migration guidance; `--force` escape hatch for headless hosts. - Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md) with 11 dashboard screenshots walking through four user stories (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker). - Tests (251 passing): kernel schema + migration + CAS atomicity, dispatcher logic, circuit breaker, crash detection, max-runtime timeouts, claim lifecycle, tenant isolation, idempotency keys, per- task skills round-trip + validation + dispatcher argv, tool surface (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk + links + warnings), gateway-embedded dispatcher (config gate, env override, graceful shutdown), CLI deprecation stub, migration from legacy schemas. Gateway integration: - GatewayRunner._kanban_dispatcher_watcher — new asyncio background task, symmetric with _kanban_notifier_watcher. Runs dispatch_once via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0 env override for debugging. - Config: new `kanban` section in DEFAULT_CONFIG with `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`. Additive — no \_config_version bump needed. Forward-compat: - workflow_template_id / current_step_key columns on tasks (v1 writes NULL; v2 will use them for routing). - task_runs holds claim machinery (claim_lock, claim_expires, worker_pid, last_heartbeat_at) so multi-attempt history is first- class from day one. Closes #16102. Co-authored-by: emozilla <emozilla@nousresearch.com>	2026-04-30 13:36:47 -07:00
y0shualee	f4b76fa272	fix: use skill activity in curator status Treat skill views and edits as activity when curator reports and applies lifecycle transitions, so recently loaded or patched skills are not displayed or transitioned as never used.\n\nAdds regression tests for activity derivation, automatic transitions, and CLI status output.	2026-04-30 10:31:47 -07:00
0xDevNinja	564a649e6a	fix(curator): scan nested archive subdirs in restore_skill restore_skill() in tools/skill_usage.py used archive_root.iterdir(), which only walked the top level of .archive/. Skills archived under nested layouts (e.g. .archive/openclaw-imports/<skill>/ from older archive paths or external imports) were invisible to both the exact-match and prefix-match candidate scans, surfacing as a misleading "skill '<name>' not found in archive" error even though the directory existed on disk. Switch both candidate scans to archive_root.rglob('*') so the lookup descends into category subdirectories. Fixes #17942	2026-04-30 10:31:44 -07:00
Stephen Schoettler	407dfbb021	fix(ci): stabilize current main test regressions	2026-04-30 06:36:50 -07:00
Teknium	87f5e1a25a	test(ssh): update tar pipe assertion for --no-overwrite-dir Existing test_tar_pipe_commands asserted the literal substring 'tar xf - -C /' in ssh_str, which is no longer present after the #17767 fix adds --no-overwrite-dir between 'tar xf -' and '-C /'. Split the one substring check into three independent assertions for the tar stdin mode, the new --no-overwrite-dir flag (regression guard for #17767), and the extract target.	2026-04-30 04:32:28 -07:00
Maxence Groine	04ea895ffb	feat(gateway/signal): add support for multiple images sending Adds a new `send_multiple_images` method to the ``BasePlatformAdapter`` that implements the default "One image per message" loop and allows for platform-specific overriding. Implements such an override for the Signal adapter, batching images and trying (best-effort) to work around rate-limits for voluminous batches using a specific scheduler. Also implements batching + rate-limit handling in the `send_message` tool. New tests added for the Signal adapter, its rate-limit scheduler and the `send_message` tool	2026-04-30 04:28:08 -07:00
Heltman	19f9be1dff	fix(tools): serialize concurrent hermes_tools RPC calls from execute_code The sandbox-side `_call()` in both the UDS and file-based transports was not thread-safe, so scripts that call tools from multiple threads (e.g. `ThreadPoolExecutor` over `terminal()`) inside a single `execute_code` run could silently receive each other's responses. Root cause: * UDS transport — a single module-level `_sock` was shared across all threads; the newline-framed protocol has no request-id; and the server-side RPC loop handles one connection serially. With concurrent callers, each thread would `sendall()` then race to `recv()` the next newline-terminated response from the shared buffer, so responses got delivered to the wrong caller. * File transport — `_seq += 1` is a non-atomic read-modify-write, so two threads could allocate the same sequence number and clobber each other's request/response files. Fix: guard `_call()` with a `threading.Lock` in the UDS case (covering send+recv), and guard `_seq` allocation with a lock in the file case. No protocol change. Regression tests cover both the generated-source level (lock is present and used) and an end-to-end concurrency test: running a sandboxed ThreadPoolExecutor of 10 `terminal()` calls against a slow mock dispatcher, asserting every caller sees its own tagged response. The test fails without the fix (10/10 mismatched, matching real-world repro) and passes with it.	2026-04-30 03:31:16 -07:00
Teknium	8d302e37a8	feat(tts): add Piper as a native local TTS provider (closes #8508 ) (#17885 ) Piper (OHF-Voice/piper1-gpl) is a fast, local neural TTS engine from the Home Assistant project that supports 44 languages with zero API keys. Adds it as a native built-in provider alongside edge/neutts/kittentts, installable via 'hermes tools' with one keystroke. What ships: - New 'piper' built-in provider in tools/tts_tool.py - Lazy import via _import_piper() - Module-level voice cache keyed on (model_path, use_cuda) so switching voices doesn't invalidate older cached voices - _resolve_piper_voice_path() accepts either an absolute .onnx path or a voice name (auto-downloaded on first use via 'python -m piper.download_voices --download-dir <cache>') - Voice cache at ~/.hermes/cache/piper-voices/ (profile-aware via get_hermes_dir) - Optional SynthesisConfig knobs: length_scale, noise_scale, noise_w_scale, volume, normalize_audio, use_cuda — passed through only when configured, so older piper-tts versions aren't broken - WAV output then ffmpeg conversion path (same as neutts/kittentts) so Telegram voice bubbles work when ffmpeg is present - Piper added to BUILTIN_TTS_PROVIDERS so a user's tts.providers.piper.command cannot shadow the native provider (regression test included) - 'hermes tools' wizard entry - Piper appears under Voice and TTS as local free, with 'pip install piper-tts' auto-install via post_setup handler - Prints voice-catalog URL and default-voice info after install - config.yaml defaults - tts.piper.voice defaults to en_US-lessac-medium - Commented advanced knobs for discoverability - Docs - New 'Piper (local, 44 languages)' section in features/tts.md explaining install path, voice switching, pre-downloaded voices, and advanced knobs - Piper listed in the ten-provider table and ffmpeg table - Custom-command-providers section updated to drop the Piper example (now native) and add a piper-custom example for users with their own trained .onnx models - overview.md bumps provider count to ten - Tests (tests/tools/test_tts_piper.py, 16 tests) - Registration (BUILTIN_TTS_PROVIDERS, PROVIDER_MAX_TEXT_LENGTH) - _resolve_piper_voice_path across every branch: direct .onnx path, cached voice name, fresh download with correct CLI args, download failure, successful-exit-but-missing-files, empty voice to default - _generate_piper_tts: loads voice once, reuses cache, voice-name download wiring, advanced knobs flow through SynthesisConfig - text_to_speech_tool end-to-end dispatch and missing-package error - check_tts_requirements: piper availability toggles the return value - Regression guard: piper cannot be shadowed by a command provider with the same name - Pre-existing test_tts_mistral test broadened to mock the new piper/kittentts/command-provider checks (otherwise it false-passes when piper is installed in the test venv) E2E verification (live): Actual pip install piper-tts, config piper + en_US-lessac-low, text_to_speech_tool call, voice auto-downloaded from HuggingFace, WAV synthesized, ffmpeg-converted to Ogg/Opus. Second call hits the cache (~60ms). Cache dir populated with .onnx and .onnx.json. This caught a real bug during development: the first pass used '-d' as the download-dir flag; the actual piper.download_voices CLI wants '--download-dir'. Fixed before PR opened.	2026-04-30 02:53:20 -07:00
Teknium	2facea7f71	feat(tts): add command-type provider registry under tts.providers.<name> (#17843 ) Reshape of PR #17211 (@versun). Lets users wire any local or external TTS CLI into Hermes without adding engine-specific Python code. Users declare any number of named providers in config.yaml and switch between them with tts.provider: <name>, alongside the built-ins (edge, openai, elevenlabs, …). Config shape: tts: provider: piper-en providers: piper-en: type: command command: 'piper -m ~/model.onnx -f {output_path} < {input_path}' output_format: wav Placeholders: {input_path}, {text_path}, {output_path}, {format}, {voice}, {model}, {speed}. Use {{ / }} for literal braces. Key behavior: - Built-in provider names always win — a tts.providers.openai entry cannot shadow the native OpenAI provider. - type: command is the default when command: is set. - Placeholder values are shell-quote-aware (bare / single / double context), so paths with spaces and shell metacharacters are safe. - Default delivery is a regular audio attachment. voice_compatible: true opts in to Telegram voice-bubble delivery via ffmpeg Opus conversion. - Command failures (non-zero exit, timeout, empty output) surface to the agent with stderr/stdout included so you can debug from chat. - Process-tree kill on timeout (Unix killpg, Windows taskkill /T). - max_text_length defaults to 5000 for command providers; override under tts.providers.<name>.max_text_length. Tests: tests/tools/test_tts_command_providers.py — 42 new tests cover provider resolution, shell-quote context, placeholder rendering with injection payloads, timeout, non-zero exit, empty output, voice_compatible opt-in, and end-to-end dispatch through text_to_speech_tool. All 88 pre-existing TTS tests still pass. Docs: new "Custom command providers" section in website/docs/user-guide/features/tts.md with three worked examples (Piper, VoxCPM, MLX-Kokoro), placeholder reference, optional keys, behavior notes, and security caveat. E2E-verified live: isolated HERMES_HOME, command provider declared in config.yaml, text_to_speech_tool dispatches through the registered shell command and the output file is produced as expected. Co-authored-by: Versun <me+github7604@versun.org>	2026-04-30 02:29:08 -07:00
Stephen Schoettler	f73364b1c4	fix(ci): stabilize main test suite regressions (#17660 ) * fix: stabilize main test suite regressions * test(agent): update MiniMax normalization expectation * test: stabilize remaining CI assertions * test: harden config helper monkeypatching * test: harden CI-only assertions * fix(agent): propagate fast streaming interrupts	2026-04-29 23:18:55 -07:00
Teknium	71c8ca17dc	chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 Removes drive-by duplication that accumulated during the contributor branch's multiple rebases. All runtime-benign (dict last-wins, redefinition last-wins) but left dead source that would confuse reviewers and maintainers. Surgical in-place de-duplication (kept PR's intentional additions, removed only the doubled copy): * hermes_cli/auth.py: duplicate "gmi" + "azure-foundry" ProviderConfig * hermes_cli/models.py: duplicate "gmi" entry in _PROVIDER_MODELS * hermes_cli/config.py: duplicate NOTION/LINEAR/AIRTABLE/TENOR skill env block + duplicate get_custom_provider_context_length definition * hermes_cli/gateway.py: duplicate _setup_yuanbao * gateway/platforms/base.py: duplicate is_host_excluded_by_no_proxy * gateway/platforms/telegram.py: duplicate delete_message * gateway/stream_consumer.py: duplicate _should_send_fresh_final and _try_fresh_final * gateway/run.py: duplicate _parse_reasoning_command_args / _resolve_session_reasoning_config / _set_session_reasoning_override, duplicate "Drain silently when interrupted" interrupt check * run_agent.py: duplicate HERMES_AGENT_HELP_GUIDANCE append, duplicate codex_message_items capture, duplicate custom_providers resolution * tools/approval.py: duplicate HARDLINE_PATTERNS section and duplicate hardline call in check_dangerous_command * tools/mcp_tool.py: duplicate _orphan_stdio_pids module-level decl * cron/scheduler.py: duplicate "not configured/enabled" check — kept the new early-rejection, removed the stale late-path copy Full-file resets to origin/main (all PR additions were duplicates of content already on main): * ui-tui/packages/hermes-ink/index.d.ts * ui-tui/packages/hermes-ink/src/entry-exports.ts * ui-tui/packages/hermes-ink/src/ink/selection.ts * ui-tui/src/app/interfaces.ts * ui-tui/src/app/slash/commands/core.ts * ui-tui/src/components/thinking.tsx * ui-tui/src/lib/memoryMonitor.ts * ui-tui/src/types.ts * ui-tui/src/types/hermes-ink.d.ts * tests/hermes_cli/test_doctor.py * tests/hermes_cli/test_api_key_providers.py * tests/hermes_cli/test_model_validation.py * tests/plugins/memory/test_hindsight_provider.py * tests/run_agent/test_run_agent.py * tests/gateway/test_email.py * tests/tools/test_dockerfile_pid1_reaping.py * hermes_cli/commands.py (slack_native_slashes block — full duplicate)	2026-04-29 21:56:51 -07:00
Ari Lotter	868bc1c242	feat(irc): add interactive setup feat(gateway): refine Platform._missing_ and platform-connected dispatch Restricts plugin-name acceptance to bundled plugin scan + registry (no arbitrary string -> enum-pollution), pulls per-platform connectivity checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean _is_platform_connected method, and adds tests covering the checker map, plugin platform interface, and IRC setup wizard.	2026-04-29 21:56:51 -07:00
Teknium	4d7fc0f37c	feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation Reloading MCP servers rebuilds the tool set for the active session, which invalidates the provider prompt cache (tool schemas are baked into the system prompt). The next message re-sends full input tokens — can be expensive on long-context or high-reasoning models. To surface that cost, /reload-mcp now routes through a new slash-confirm primitive with three options: Approve Once / Always Approve / Cancel. 'Always Approve' persists approvals.mcp_reload_confirm: false so future reloads run silently. Coverage: * Classic CLI (cli.py) — interactive numbered prompt. * TUI (tui_gateway + Ink ops.ts) — text warning on first call; `now` / `always` args skip the gate; `always` also persists the opt-out. * Messenger gateway — button UI on Telegram (inline keyboard), Discord (discord.ui.View), Slack (Block Kit actions); text fallback on every other platform via /approve /always /cancel replies intercepted in gateway/run.py _handle_message. * Config key: approvals.mcp_reload_confirm (default true). * Auto-reload paths (CLI file watcher, TUI config-sync mtime poll) pass confirm=true so they do NOT prompt. Implementation: * tools/slash_confirm.py — module-level pending-state store used by all adapters and by the CLI prompt. Thread-safe register/resolve/clear. * gateway/platforms/base.py — send_slash_confirm hook (default 'Not supported' → text fallback). * gateway/run.py — _request_slash_confirm helper + text intercept in _handle_message (yields to in-progress tool-exec approvals so dangerous-command /approve still unblocks the tool thread first). Tests: * tests/tools/test_slash_confirm.py — primitive lifecycle + async resolution + double-click atomicity (16 tests). * tests/hermes_cli/test_mcp_reload_confirm_gate.py — default-config shape + deep-merge preserves user opt-out (5 tests). Targeted runs (hermetic): 89 passed (slash-confirm, config gate, existing agent cache, existing telegram approval buttons).	2026-04-29 21:56:47 -07:00
Teknium	31f70d1f2a	fix(ci): recover 38 failing tests on main (#17642 ) CI Tests workflow has been red on main for 40+ consecutive runs. This commit recovers every failure visible in run 25130722163 (most recent completed run prior to this PR). Root causes, by group: Test-mock drift after product landed (fix: update mocks) - test_mcp_structured_content / test_mcp_dynamic_discovery (6 tests): product added _rpc_lock (#`02ae15222`) and _schedule_tools_refresh (#`1350d12b0`) without updating sibling test files. Install a real asyncio.Lock inside the fake run-loop and patch at _schedule_tools_refresh. - test_session.py: renamed normalize_whatsapp_identifier → canonical_ whatsapp_identifier upstream; keep a local alias so the legacy tests keep working. - test_run_progress_topics Slack DM test: PR #8006 made Slack default tool_progress=off; explicitly set it to 'all' in the test fixture so the progress-callback path still runs. Also read tool_progress_callback at call time rather than freezing it in FakeAgent.__init__ — production assigns it AFTER construction. - test_tui_gateway_server session-create/close race: session.create now defers _start_agent_build behind a 50ms timer — wait for the build thread to enter _make_agent before closing, otherwise the orphan- cleanup path never runs. - test_protocol session.resume: product get_messages_as_conversation now takes include_ancestors kwarg; accept *_kwargs in the test stub. - test_copilot_acp_client redaction: redactor is OFF by default (snapshots HERMES_REDACT_SECRETS at import); patch agent.redact._REDACT_ENABLED=True for the duration of the test. - test_minimax_provider: after #17171, dots in non-Anthropic model names stay dots even with preserve_dots=False. Assert the new invariant rather than the old 'broken for MiniMax' behavior. - test_update_autostash: updater now scans `ps -A` for dashboard PIDs; the test's catch-all subprocess.run stub needed stdout/stderr fields. - test_accretion_caps: read_timestamps dict is populated lazily when os.path.getmtime succeeds. Use .get("read_timestamps", {}) to tolerate CI filesystems where the stat races file creation. Change-detector tests (fix: rewrite as structural invariants) - test_credential_sources_registry_has_expected_steps: was a frozen set comparison that broke when minimax-oauth was added. Rewrite as an invariant check (every step has description, no dupes, core steps present) per AGENTS.md 'don't write change-detector tests'. xdist ordering / test pollution (fix: reset state, use module-local patches) - test_setup vercel: sibling test saved VERCEL_PROJECT_ID='project' to os.environ via save_env_value() and never cleared it. monkeypatch.delenv the VERCEL_ vars in the link-file test. - test_clipboard TestIsWsl: GitHub Actions is on Azure VMs whose real /proc/version often contains 'microsoft'. Patching builtins.open with mock_open didn't reliably intercept hermes_constants.is_wsl's call in xdist workers that had already cached _wsl_detected=True from an earlier test. Patch hermes_constants.open directly and add teardown_method to reset the cache after each test. Pytest-asyncio cancellation hangs (fix: bound product await with timeout) - test_session_split_brain_11016 (3 params) + test_gateway_shutdown cancel-inflight: under pytest-asyncio 1.3.0, 'await task' and 'asyncio.gather(cancelled_tasks)' can stall for 30s when the cancelled task's finally block awaits typing-task cleanup. Bound both with asyncio.wait_for(..., timeout=5.0) and asyncio.shield — the stragglers are released from adapter tracking and allowed to finish unwinding in the background. This is also a legitimate hardening: a wedged finally shouldn't stall the caller's dispatch or a gateway shutdown. Orphan UI config (fix: merge tiny tab into messaging category) - test_web_server test_no_single_field_categories: the telegram.reactions config field lived in its own 'telegram' schema category with no siblings. Fold it under 'discord' via _CATEGORY_MERGE so the dashboard doesn't render an orphan single-field tab. Local verification: 38/38 originally-failing tests pass; 4044/4044 gateway tests pass; 684/684 targeted subset (all 16 touched test files) passes.	2026-04-29 20:05:32 -07:00
Teknium	c61b2e0af7	feat(skills): refuse skill_manage writes on pinned skills (#17562 ) Extend curator's pin flag from 'skip auto-transitions' to 'no agent edits at all'. All five skill_manage mutation actions (edit, patch, delete, write_file, remove_file) now refuse pinned skills with a message pointing the user at `hermes curator unpin <name>`. Motivation: pin used to only stop the curator's own maintenance pass from touching a skill. Nothing prevented the main agent from editing or deleting a pinned skill via skill_manage in-session. This gives users a hard fence against unwanted agent edits — same semantics as curator pinning, extended to the write tool. Create is unaffected (you can't pin a name that doesn't exist yet, and name collisions already error out). Broken sidecars fail open rather than lock the agent out. The schema description advertises the new refusal so models know not to route around it with rename/recreate tricks.	2026-04-29 10:28:25 -07:00
Teknium	8c8fc6c1ec	fix(skills): let skill_manage patch/edit/delete skills in external_dirs in place (#17512 ) Closes #4759, closes #4381. Mutating actions (patch, edit, write_file, remove_file, delete) used to refuse skills that lived under `skills.external_dirs` with 'Skill X is in an external directory and cannot be modified. Copy it to your local skills directory first.' Faced with that error, the agent would fall back to action='create', which always writes under ~/.hermes/skills/ — producing a silent duplicate of the external skill in the local store. Fix: drop the read-only gate. `skills.external_dirs` is configured by the user; if they pointed it at a directory, they already said 'these are my skills, treat them the same.' Filesystem permissions handle the genuine read-only case (write fails, agent sees the error). - New _containing_skills_root() resolves whichever dir actually contains the skill; _delete_skill uses it to bound empty-category cleanup so an external root is never rmdir'd. - _create_skill behavior is unchanged: new skills still land in local SKILLS_DIR only. Fewer moving parts. - Seven new TestExternalSkillMutations tests covering patch/edit/write_file/ remove_file/delete/create against a mocked two-root layout + a category rmdir-safety check.	2026-04-29 08:16:52 -07:00
kshitijk4poor	13c238327e	fix: address self-review findings for Vercel Sandbox salvage - Add vercel_sandbox to hardline blocklist container bypass test - Add vercel_sandbox to skills_tool remote backend parametrize test - Deduplicate runtime set: doctor.py and setup.py now import _SUPPORTED_VERCEL_RUNTIMES from terminal_tool.py - Add docstring to _run_bash explaining timeout/stdin_data discards - Always stop sandbox during cleanup (unconditional, matching Modal/Daytona) - Update security.md: container bypass text, production tip, comparison table - Update environment-variables.md: TERMINAL_ENV list, Vercel auth vars, TERMINAL_VERCEL_RUNTIME - Update inline comments in cli.py and config.py to include vercel_sandbox	2026-04-29 07:22:33 -07:00
Scott Trinh	5a1d4f6804	feat: add Vercel Sandbox backend Adds Vercel Sandbox as a supported Hermes terminal backend alongside existing providers (Local, Docker, Modal, SSH, Daytona, Singularity). Uses the Vercel Python SDK to create/manage cloud microVMs, supports snapshot-based filesystem persistence keyed by task_id, and integrates with the existing BaseEnvironment shell contract and FileSyncManager for credential/skill syncing. Based on #17127 by @scotttrinh, cherry-picked onto current main.	2026-04-29 07:22:33 -07:00
Teknium	398945e7b1	fix(cron): accept list-form deliver values so deliver=['telegram'] works (#17456 ) The cron schema contracts deliver as a string ("local", "origin", "telegram", "telegram:chat_id[:thread_id]", or comma-separated combos), but MCP clients and scripts sometimes pass an array like ['telegram']. Before this change, the list was written to jobs.json verbatim, and the scheduler's str(deliver).split(',') then tried to resolve the literal string "['telegram']" as a platform — returning None and logging 'no delivery target resolved for deliver=[\'telegram\']'. Fix on both ends: - tools/cronjob_tools.py: normalize deliver at the API boundary on create and update, so storage is always a string. - cron/scheduler.py: normalize deliver in _resolve_delivery_targets, so existing jobs.json entries with list-form deliver are handled gracefully without requiring users to edit the file. Closes #17139	2026-04-29 06:35:34 -07:00
teknium1	9e63062b6c	fix(stt): resolve API keys from ~/.hermes/.env via get_env_value (#17140 ) Widen #17163 to the sibling file tools/transcription_tools.py, which had the same class of bug. STT provider call sites and the _get_provider selection gate called os.getenv(...) directly and missed keys that only lived in ~/.hermes/.env. Same pattern as tts_tool.py: one guarded top-level import of get_env_value (falls back to os.getenv on ImportError), then every API-key and paired-base-URL lookup swapped over. Call sites migrated: - _transcribe_groq — GROQ_API_KEY - _transcribe_mistral — MISTRAL_API_KEY - _transcribe_xai — XAI_API_KEY, XAI_STT_BASE_URL - _get_provider — GROQ/MISTRAL/XAI_API_KEY in explicit + auto branches Module-level defaults (DEFAULT_STT_MODEL, GROQ_BASE_URL, etc.) stay on os.getenv — they're import-time constants, not runtime config, and the dotenv fallback would add no value there. New regression tests in tests/tools/test_transcription_dotenv_fallback.py (8 cases) mirror briandevans' TTS tests: per-provider dotenv-key forwarding, selection-gate dotenv visibility, and an end-to-end probe that patches hermes_cli.config.load_env to simulate ~/.hermes/.env carrying the key while os.environ does not.	2026-04-29 06:25:20 -07:00
briandevans	33967b4e52	fix(tts): tolerate missing hermes_cli.config in tts_tool import Wrap the new top-level `from hermes_cli.config import get_env_value` in try/except ImportError and fall back to a thin os.getenv shim, so importing tools.tts_tool keeps working in environments where hermes_cli.config is unavailable. This matches the existing tolerance in `_load_tts_config()` (tools/tts_tool.py) and the same import-fallback pattern in tools/tool_backend_helpers.py::fal_key_is_configured. Also update the TestDotenvFallbackPerProvider docstring to accurately describe the mocking strategy: per-provider tests patch `tools.tts_tool.get_env_value` directly, while the regression-guard tests cover the lower-level `hermes_cli.config.load_env` integration. Addresses Copilot review on #17163. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 06:25:20 -07:00
briandevans	40d25e125b	fix(tts): resolve API keys from ~/.hermes/.env via get_env_value (#17140 ) TTS provider tools (elevenlabs, xai, minimax, mistral, gemini) called os.getenv("X_API_KEY") directly, which bypassed Hermes's dotenv bridge in hermes_cli.config. Users who keep their TTS keys only in ~/.hermes/.env saw "X_API_KEY not set" errors even though the rest of the stack (agent/credential_pool, hermes_cli/auth) already resolves keys through get_env_value() — same class of bug as #15914 fixed for those modules. Switch every TTS env-var lookup (API keys, base URLs, and check_tts_requirements gates) to get_env_value, which checks os.environ first and then ~/.hermes/.env. Behaviour for users with keys exported in the shell is unchanged; users with dotenv-only keys now succeed. The two diagnostics prints in __main__ are migrated for consistency. Regression test (tests/tools/test_tts_dotenv_fallback.py): - per-provider: each backend reads the dotenv key when only ~/.hermes/.env carries it (5 providers). - end-to-end: with hermes_cli.config.load_env returning the key and os.environ empty, _generate_minimax_tts and check_tts_requirements both succeed; reverting tools/tts_tool.py back to os.getenv makes all 7 tests fail with "MINIMAX_API_KEY not set" / similar. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 06:25:20 -07:00
Teknium	20b759cd02	fix(process): reconcile session.exited against real child exit in poll/wait (#17430 ) When a background terminal process spawns a descendant daemon that inherits the stdout pipe (e.g. 'hermes update' triggering a gateway systemctl restart), the reader thread's stdout.read() never returns EOF and its finally: block never runs. session.exited stays False forever, so process(action='poll') returns 'running' indefinitely even though the direct child exited long ago. Issue #17327: Feishu user polled 74 times over 7 minutes before killing the gateway manually. Fix: add _reconcile_local_exit() that checks the direct Popen.poll() before trusting session.exited. If the direct child has exited, drain any immediately-readable bytes non-blocking and flip session.exited. Called from poll() and wait(). The stuck reader thread remains blocked but is a daemon thread and gets reaped with the process. Safe no-op for env/PTY sessions, already-exited sessions, and live children (returns None from Popen.poll()).	2026-04-29 04:59:21 -07:00
Ben Barclay	58a6171bfb	Merge pull request #17305 from NousResearch/feat/docker-run-as-host-user feat(docker): run container as host user to avoid root-owned bind mounts	2026-04-29 16:41:55 +10:00
Ben	5531c0df82	feat(docker): run container as host user to avoid root-owned bind mounts Add opt-in terminal.docker_run_as_host_user config flag that passes --user $(id -u):$(id -g) to the Docker backend so files written into bind-mounted directories (/workspace, /root, docker_volumes entries) are owned by the host user instead of root. When enabled on POSIX platforms, also drops SETUID/SETGID caps since the container no longer needs gosu/su to switch users. Falls back cleanly on platforms without os.getuid (e.g. native Windows Docker) with a warning. Wired through all three config.yaml -> TERMINAL_* env-var bridges: - cli.py env_mappings (CLI + TUI startup) - gateway/run.py _terminal_env_map (gateway / messaging platforms) - hermes_cli/config.py _config_to_env_sync (`hermes config set`) Also fixes docker_mount_cwd_to_workspace silently failing in gateway mode -- it was missing from gateway/run.py's _terminal_env_map. Adds tests/tools/test_terminal_config_env_sync.py to guard against future drift between the three bridges (same bug class shipped twice in one month). Bundled Hermes image won't work with this flag since its entrypoint expects to start as root for the usermod/gosu hermes flow; works with the default nikolaik/python-nodejs image and plain Debian/Ubuntu.	2026-04-29 16:16:43 +10:00
brooklyn!	5e68503d2f	Merge pull request #17190 from NousResearch/bb/tui-cold-start-profiling perf(tui): cut visible cold start ~57% with lazy agent init	2026-04-28 22:45:14 -07:00
Teknium	0d31864e3b	fix(curator): defense-in-depth gates against bundled/hub skills Previous invariants only gated the primary entry points (apply_automatic_transitions, archive_skill, CLI pin). Several paths were unprotected: - bump_view / bump_use / bump_patch / set_state / set_pinned wrote usage records unconditionally, which is confusing noise in .usage.json even though the review list filtered them out - restore_skill did not check whether a bundled skill now shadows the archived name - CLI unpin was asymmetric with CLI pin — it had no gate Fixes: - _mutate() (the shared counter / state writer) now drops silently when the skill is not agent-created. .usage.json never gains a record for a bundled or hub-installed skill. - restore_skill() refuses to restore under a name that is now bundled or hub-installed (would shadow upstream). - CLI unpin gate matches CLI pin. New tests: - 5 provenance-guard tests on skill_usage (one per mutator) - 1 end-to-end test that hammers every mutator at a bundled skill and a hub skill, asserts both are untouched on disk, and asserts the sidecar stays clean - 2 CLI tests proving pin/unpin refuse bundled skills symmetrically 64/64 tests passing (29 skill_usage + 27 curator + 8 new guards).	2026-04-28 22:33:33 -07:00
Teknium	bc79e227e6	feat(curator): background skill maintenance (issue #7816 ) Adds the Curator — an auxiliary-model background task that periodically reviews AGENT-CREATED skills and keeps the collection tidy: tracks usage, transitions unused skills through active → stale → archived, and spawns a forked AIAgent to consolidate overlaps and patch drift. Default: enabled, inactivity-triggered (no cron daemon). Runs on CLI startup and gateway boot when the last run is older than interval_hours (default 24) AND the agent has been idle for min_idle_hours (default 2). Invariants (all load-bearing): - Never touches bundled or hub-installed skills (.bundled_manifest + .hub/lock.json double-filter) - Never auto-deletes — archive only. Archives are recoverable via `hermes curator restore <skill>` - Pinned skills bypass all auto-transitions - Uses the aux client; never touches the main session's prompt cache New files: - tools/skill_usage.py — sidecar .usage.json telemetry, atomic writes, provenance filter - agent/curator.py — orchestrator: config, idle gating, state-machine transitions (pure, no LLM), forked-agent review prompt - hermes_cli/curator.py — `hermes curator {status,run,pause,resume, pin,unpin,restore}` subcommand - tests/tools/test_skill_usage.py — 29 tests - tests/agent/test_curator.py — 25 tests Modified files (surgical patches): - tools/skills_tool.py — bump view_count on successful skill_view - tools/skill_manager_tool.py — bump patch_count on skill_manage patch/edit/write_file/remove_file; forget record on delete - hermes_cli/config.py — add curator: section to DEFAULT_CONFIG - hermes_cli/commands.py — add /curator CommandDef with subcommands - hermes_cli/main.py — register `hermes curator` subparser via register_cli() from hermes_cli.curator - cli.py — /curator slash-command dispatch + startup hook - gateway/run.py — gateway-boot hook (mirrors CLI) Validation: - 54 new tests across skill_usage + curator, all passing in 3s - 346 tests across all touched files' neighbors green - 2783 tests across hermes_cli/ + gateway/test_run_progress_topics.py green - CLI smoke: `hermes curator status/pause/resume` work end-to-end Companion to PR #16026 (class-first skill review prompt) — together they form a loop: the review prompt stops near-duplicate skill creation at the source, and the curator prunes/consolidates what still accumulates. Refs #7816.	2026-04-28 22:33:33 -07:00
Tranquil-Flow	ac855bba0e	fix(cli): respect terminal.cwd config in local terminal backend init_session() runs a login shell bootstrap that sources profile scripts (.bashrc, .bash_profile, etc.) before capturing pwd. If any profile script changes the working directory, the captured cwd overwrites the configured terminal.cwd value — so terminal commands run in the wrong directory despite the TUI banner showing the configured path. Add an explicit 'builtin cd' to the configured cwd in the bootstrap script, after profile sourcing but before pwd capture, ensuring the configured terminal.cwd is always what gets recorded. Fixes #14044	2026-04-28 22:16:08 -07:00
Brooklyn Nicholson	9e398e1809	perf(tui): avoid importing classic CLI during tool discovery TUI session readiness was still laggy after the gateway-ready fixes. Profiling session.create -> session.info showed the slow phase is background AIAgent construction (~1.1s). A cProfile run of tui_gateway.server::_make_agent showed model_tools/tool discovery importing tools.code_execution_tool, whose module-level EXECUTE_CODE_SCHEMA calls _get_execution_mode(), which imported cli.CLI_CONFIG. That pulled the classic interactive CLI stack (prompt_toolkit/Rich and REPL setup) into every agent startup path, including hermes --tui where it is not used. Replace that with hermes_cli.config.read_raw_config(), which is cached and reads only the raw code_execution section. Existing defaults still apply when the key is absent. Measurements on macOS Terminal.app: - import run_agent: ~466ms -> ~347ms - model_tools import: ~418ms -> ~272ms - _make_agent: ~1452ms -> ~1239ms - session.create -> session.info: ~1167ms -> ~999ms - full hermes --tui ready p50: ~1655ms -> ~1537ms Tests: - scripts/run_tests.sh tests/tools/test_code_execution_modes.py tests/tools/test_code_execution.py	2026-04-28 22:42:17 -05:00
brooklyn!	5f215b13ce	fix(docker): materialize bundled TUI Ink package (#16690 ) * fix(docker): materialize bundled TUI Ink package * fix(docker): keep nested deps out of build context * fix(docker): make TUI Ink smoke check deterministic * test(docker): skip dockerignore assertion in partial checkouts * fix(docker): use lockfile install for vendored Ink deps * test(cli): expect deterministic npm ci in /update flow * fix(docker): fall back to npm install for vendored Ink deps * fix(docker): keep bundled Ink source for TUI runtime builds * fix(docker): dedupe React in vendored Ink package	2026-04-28 15:11:47 -05:00
Teknium	42be5e49b0	fix(browser): detect missing Chromium and fail fast with actionable error (#17039 ) Previously, check_browser_requirements() only checked for the agent-browser CLI, not the Chromium binary it drives. When the CLI was present but Chromium wasn't (common in Docker images predating the playwright install step), the browser tool was advertised to the agent, every call hung for the full command timeout (~30s each, ~220s for a chained navigate), and the agent eventually gave up with no useful error — users saw 'browser not working' with empty errors.log. Changes: - tools/browser_tool.py: add _chromium_installed() checking PLAYWRIGHT_BROWSERS_PATH + default Playwright cache paths for chromium-* / chromium_headless_shell-* dirs; wire into check_browser_requirements() for local mode (cloud providers unaffected). _run_browser_command fails fast with an actionable Docker vs. host message instead of hanging. _running_in_docker() checks /.dockerenv and /proc/1/cgroup. - hermes_cli/tools_config.py: post_setup for 'Local Browser' now runs 'agent-browser install --with-deps' after npm install to actually download Chromium. In Docker, points user at the updated image pull instead of trying to install into a read-only layer. Cloud-provider post_setup (browserbase) skips Chromium install entirely. - tests/tools/test_browser_chromium_check.py: new tests covering search roots, install detection, requirements branches (local/cloud/ camofox), and the fast-fail guard in docker/non-docker contexts. - tests/tools/test_browser_homebrew_paths.py: 5 existing subprocess-path tests now mock _chromium_installed=True since they exercise the post-guard subprocess path. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 07:03:44 -07:00
Pony.Ma	aa94883288	fix(mcp): preserve nullable schema coercion	2026-04-28 04:58:03 -07:00
Pony.Ma	1350d12b0b	fix: keep mcp dynamic refresh tasks tracked	2026-04-28 04:58:03 -07:00
Pony.Ma	02ae152222	fix(mcp): normalize nullable tool schemas	2026-04-28 04:58:03 -07:00
墨綠BG	4462b349b2	✨ feat(web): expose search result limit	2026-04-28 02:09:30 -07:00
Teknium	e63364b8df	revert: computer-use cua-driver (PR #16919 ) (#16927 ) Reverts PR #16919 (commits `dad10a78d`, `413ee1a28`, `b4a8031b2`, `afb958829`) which was merged prematurely. Restoring the pre-merge state so #14817 and #15328 can be revisited as standing PRs. Reverted commits: - `afb958829` fix(computer-use): harden image-rejection fallback + AUTHOR_MAP - `b4a8031b2` fix(computer-use): unwrap _multimodal tool results - `413ee1a28` feat(computer-use): background focus-safe backend - `dad10a78d` feat(computer-use): cua-driver backend, universal any-model schema Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 01:57:21 -07:00
Teknium	afb9588298	fix(computer-use): harden image-rejection fallback + AUTHOR_MAP Follow-up to #15328's vision-unsupported retry branch in run_agent.py. _strip_images_from_messages() previously deleted any message whose content was entirely images. That's fine for synthetic user messages injected for attachment delivery, but it breaks providers for tool-role messages — the paired tool_call_id on the preceding assistant message ends up unmatched, which OpenAI-compatible APIs reject with HTTP 400. Fix: tool-role messages whose content becomes empty are replaced with a plaintext placeholder that preserves the tool_call_id linkage. Only non-tool messages are dropped. Added 10 tests covering the role-alternation invariants + image-type coverage. Image-rejection detector: expanded phrase list (image content not supported / multimodal input / vision input / model does not support image) and gated on 4xx status so transient 5xx errors never get misinterpreted as 'server said no to images'. Detection is documented as best-effort English phrase matching. AUTHOR_MAP: mapped 3820588+ddupont808@users.noreply.github.com to ddupont808 so release notes attribute the salvage correctly.	2026-04-28 01:46:36 -07:00
Teknium	dad10a78d0	feat(computer-use): cua-driver backend, universal any-model schema Background macOS desktop control via cua-driver MCP — does NOT steal the user's cursor or keyboard focus, works with any tool-capable model. Replaces the Anthropic-native `computer_20251124` approach from the abandoned #4562 with a generic OpenAI function-calling schema plus SOM (set-of-mark) captures so Claude, GPT, Gemini, and open models can all drive the desktop via numbered element indices. - `tools/computer_use/` package — swappable ComputerUseBackend ABC + CuaDriverBackend (stdio MCP client to trycua/cua's cua-driver binary). - Universal `computer_use` tool with one schema for all providers. Actions: capture (som/vision/ax), click, double_click, right_click, middle_click, drag, scroll, type, key, wait, list_apps, focus_app. - Multimodal tool-result envelope (`_multimodal=True`, OpenAI-style `content: [text, image_url]` parts) that flows through handle_function_call into the tool message. Anthropic adapter converts into native `tool_result` image blocks; OpenAI-compatible providers get the parts list directly. - Image eviction in convert_messages_to_anthropic: only the 3 most recent screenshots carry real image data; older ones become text placeholders to cap per-turn token cost. - Context compressor image pruning: old multimodal tool results have their image parts stripped instead of being skipped. - Image-aware token estimation: each image counts as a flat 1500 tokens instead of its base64 char length (~1MB would have registered as ~250K tokens before). - COMPUTER_USE_GUIDANCE system-prompt block — injected when the toolset is active. - Session DB persistence strips base64 from multimodal tool messages. - Trajectory saver normalises multimodal messages to text-only. - `hermes tools` post-setup installs cua-driver via the upstream script and prints permission-grant instructions. - CLI approval callback wired so destructive computer_use actions go through the same prompt_toolkit approval dialog as terminal commands. - Hard safety guards at the tool level: blocked type patterns (curl\|bash, sudo rm -rf, fork bomb), blocked key combos (empty trash, force delete, lock screen, log out). - Skill `apple/macos-computer-use/SKILL.md` — universal (model-agnostic) workflow guide. - Docs: `user-guide/features/computer-use.md` plus reference catalog entries. 44 new tests in tests/tools/test_computer_use.py covering schema shape (universal, not Anthropic-native), dispatch routing, safety guards, multimodal envelope, Anthropic adapter conversion, screenshot eviction, context compressor pruning, image-aware token estimation, run_agent helpers, and universality guarantees. 469/469 pass across tests/tools/test_computer_use.py + the affected agent/ test suites. - `model_tools.py` provider-gating: the tool is available to every provider. Providers without multi-part tool message support will see text-only tool results (graceful degradation via `text_summary`). - Anthropic server-side `clear_tool_uses_20250919` — deferred; client-side eviction + compressor pruning cover the same cost ceiling without a beta header. - macOS only. cua-driver uses private SkyLight SPIs (SLEventPostToPid, SLPSPostEventRecordTo, _AXObserverAddNotificationAndCheckRemote) that can break on any macOS update. Pin with HERMES_CUA_DRIVER_VERSION. - Requires Accessibility + Screen Recording permissions — the post-setup prints the Settings path. Supersedes PR #4562 (pyautogui/Quartz foreground backend, Anthropic- native schema). Credit @0xbyt4 for the original #3816 groundwork whose context/eviction/token design is preserved here in generic form.	2026-04-28 01:46:36 -07:00
hharry11	de03a332f7	fix(security): isolate interactive sudo password cache per session	2026-04-28 01:34:16 -07:00
Teknium	d7528d43ac	fix(web): scope dashboard config Reset button to the current tab (#16813 ) * Port from Kilo-Org/kilocode#9448: roll up subagent costs into parent session total Child subagents built by delegate_task() each track their own session_estimated_cost_usd, but the parent agent's total never folded those numbers in. On runs where the parent mostly delegates and the children do the expensive work, the footer/UI was reporting a fraction of the actual spend — sometimes $0.00 when the parent itself made no billed calls. Fix: - Capture each child's session_estimated_cost_usd into _child_cost_usd on the result entry (before child.close() drops the counter). - After the existing subagent_stop hook loop, sum the children's costs and add the total to parent.session_estimated_cost_usd. - Promote session_cost_source from 'none' -> 'subagent' when the parent had no direct spend but children did, so the UI doesn't label the total as having unknown provenance. Real sources (openrouter, anthropic, etc.) are preserved. Nested orchestrator -> worker trees roll up naturally: each layer's own delegate_task() folds its direct children in, and when the orchestrator itself returns, its parent folds the orchestrator's now-inflated total on top. Internal fields (_child_cost_usd, _child_role) are stripped from the results dict before it's serialised back to the model — same contract as _child_role already followed. Tests: TestSubagentCostRollup (5 cases) covers single-child, batch, zero-cost-children, preserved-source, and legacy-fixture paths. Source: https://github.com/Kilo-Org/kilocode/pull/9448 * fix(web): scope dashboard config Reset button to the current tab Reported by @ykmfb001 via X: clicking 'Restore Defaults' (恢复默认值) on the Auxiliary page wiped the entire config.yaml to defaults, not just the auxiliary section. The button sits next to the category tabs and users reasonably assumed 'reset this tab', not 'reset everything'. Changes: - handleReset now scopes to the fields in the current view: active category's fields (form mode) or search-matched fields (search mode). Only those keys are copied from defaults; the rest of the config is left alone. - Added a window.confirm() with the scope name before applying. - Button is hidden in YAML mode (scoping doesn't apply there). - Tooltip/aria-label now name the scope, e.g. 'Reset Auxiliary to defaults'. - i18n: new resetScopeTooltip / confirmResetScope / resetScopeToast strings in en + zh; resetDefaults key preserved for compat.	2026-04-27 21:09:14 -07:00
Teknium	30307a9802	feat(plugins): add pre_approval_request / post_approval_response hooks (#16776 ) Plugins can now observe dangerous-command approval events in real time, on both the CLI-interactive path and the async gateway path. This is the missing hook surface external tools need to build approval notifiers (macOS menu-bar allow/deny, Slack alerts, audit logs, etc.) without forking Hermes or running a parallel gateway adapter. Changes: - hermes_cli/plugins.py: add two entries to VALID_HOOKS - tools/approval.py: fire both hooks from check_all_command_guards -- around prompt_dangerous_approval (CLI surface) and around the notify_cb + blocking event.wait loop (gateway surface) - website/docs/user-guide/features/hooks.md: document both hooks with a macOS-notification example - tests/tools/test_approval_plugin_hooks.py: 5 tests covering CLI once, CLI deny, plugin-crash resilience, gateway approve, gateway timeout Hooks are observer-only: return values are ignored, so plugins cannot veto or pre-answer an approval (use pre_tool_call for that). A crashing plugin cannot break the approval flow -- invoke_hook swallows per- callback errors, and the wrapper logs and swallows dispatch-layer errors too. Surface kwarg distinguishes "cli" from "gateway"; post hook reports choice as one of once/session/always/deny/timeout.	2026-04-27 20:08:33 -07:00
Brooklyn Nicholson	633f74504f	fix(ci): resolve follow-up title edge case and flaky checks Handle queued-title ValueError cleanup during session init, harden Discord message source building for test stubs, and fix the Dockerfile contract test syntax error. Also refresh the TUI lockfile and Nix build flags so nix ubuntu-latest no longer fails on npm lock/peer resolution drift.	2026-04-27 11:49:02 -05:00

1 2 3 4 5 ...

678 commits