hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-17 09:41:58 +00:00

Author	SHA1	Message	Date
Teknium	87f5e1a25a	test(ssh): update tar pipe assertion for --no-overwrite-dir Existing test_tar_pipe_commands asserted the literal substring 'tar xf - -C /' in ssh_str, which is no longer present after the #17767 fix adds --no-overwrite-dir between 'tar xf -' and '-C /'. Split the one substring check into three independent assertions for the tar stdin mode, the new --no-overwrite-dir flag (regression guard for #17767), and the extract target.	2026-04-30 04:32:28 -07:00
Teknium	b50bc13ef9	fix(config): preserve YAML lists in hermes config set (#17876 ) _set_nested unconditionally replaced any non-dict value with an empty dict when walking the dotted path, which silently destroyed list-typed config nodes the moment someone set a value with a numeric index (e.g. 'hermes config set custom_providers.0.api_key NEW'). Any sibling entries and any fields inside the targeted entry that the user didn't write were lost. Fix: - _set_nested now detects list nodes and navigates by numeric index, and preserves both dicts AND lists at intermediate positions (scalars are still replaced so bare-scalar -> nested overrides keep working). - set_config_value drops its duplicated navigation logic and calls _set_nested instead -- single source of truth for the rules. Regression tests (tests/hermes_cli/test_set_config_value.py): - test_indexed_set_preserves_sibling_list_entries -- exact #17876 repro - test_indexed_set_preserves_non_targeted_fields -- inner-dict fields survive - test_deeper_nesting_through_list -- dict -> list -> dict -> scalar path 35/35 existing + new tests pass. E2E-verified with the issue's repro against a real on-disk config.yaml -- list stays a list, entry 0 updated, entry 1 intact. Closes #17876	2026-04-30 04:32:17 -07:00
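A minimal sketch of the list-preserving walk this commit describes, assuming a simplified standalone helper. The name `set_nested` and its signature are illustrative, not the real `_set_nested` in hermes_cli.

```python
from typing import Any


def set_nested(root: dict, dotted_path: str, value: Any) -> None:
    """Set `value` at `dotted_path`, walking dicts by key and lists by index."""
    keys = dotted_path.split(".")
    node: Any = root
    for key in keys[:-1]:
        if isinstance(node, list):
            # Numeric path segment: navigate into the existing list entry.
            index = int(key)
            child = node[index]
            if not isinstance(child, (dict, list)):
                child = {}
                node[index] = child
        else:
            child = node.get(key)
            if not isinstance(child, (dict, list)):
                # Scalars and missing keys are replaced so nested overrides work.
                child = {}
                node[key] = child
        node = child
    last = keys[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value
```

In this sketch an out-of-range index raises `IndexError` instead of growing the list. Whether the real helper extends lists is not stated in the commit message.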
Teknium	3fc4c63d38	test(model_switch): update regression to reflect bare-custom guard	2026-04-30 04:32:11 -07:00
Sanjays2402	e0fa2cf972	fix(tools): isolate get_tool_definitions quiet_mode cache + dedup LCM injection (#17335) Long-lived Gateway processes were sending duplicate tool names to providers that enforce uniqueness: - DeepSeek: 'Tool names must be unique.' - Xiaomi MiMo: 'tools contains duplicate names: lcm_expand' - Moonshot/Kimi: 'function name lcm_grep is duplicated' TUI was unaffected because TUI runs with quiet_mode=False and skips the cache entirely. Root cause (two layered bugs) - model_tools.get_tool_definitions(quiet_mode=True) memoizes its result in _tool_defs_cache. The cache-hit path returned list(cached) (safe), but the FIRST uncached call stored and returned the SAME object. run_agent.py mutates self.tools (memory + LCM context-engine schemas) in-place, so the very first agent init in a Gateway process poisoned the cache, and every subsequent init appended LCM schemas again on top of the already-polluted list. - run_agent.py's context-engine injection (lcm_grep / lcm_describe / lcm_expand) had no dedup, unlike the memory-tools injection right above it which already skips already-present names. Fix (defense in depth, per the issue's suggested fix) - model_tools.get_tool_definitions: on the uncached branch, cache the computed list but return list(result) to the caller. Same pattern as the cache-hit path. - run_agent.py: build _existing_tool_names from self.tools and skip schemas whose names are already present, mirroring the memory-tools block. This also defends against plugin paths that may register the same schemas via ctx.register_tool(). Tests (tests/test_get_tool_definitions_cache_isolation.py) - test_first_uncached_call_returns_fresh_list: pins the fix; without it, first-call alias caused all the symptoms. - test_cache_hit_returns_fresh_list: pre-existing behavior stays. - test_caller_mutation_does_not_poison_cache: simulates run_agent appending lcm_grep / lcm_expand to the returned list and asserts the next call doesn't see them. - test_repeated_caller_mutation_does_not_accumulate: reproduces the long-lived Gateway accumulation pattern across 5 agent inits. - test_non_quiet_mode_does_not_use_cache: sanity, explains why TUI was fine. 5/5 pass on the new file; 23/23 still pass on tests/test_model_tools.py.	2026-04-30 04:32:06 -07:00
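The two layered fixes above can be sketched as follows. This is a simplified stand-in, not the code in model_tools.py or run_agent.py: `compute_tool_definitions`, `inject_context_engine_tools`, the cache key, and the tool names `terminal` / `read_file` are assumptions made for illustration.

```python
_tool_defs_cache = {}


def compute_tool_definitions():
    # Hypothetical stand-in for the expensive schema build.
    return [{"name": "terminal"}, {"name": "read_file"}]


def get_tool_definitions(quiet_mode=False):
    if not quiet_mode:
        # The non-quiet path skips the cache entirely.
        return compute_tool_definitions()
    cached = _tool_defs_cache.get("quiet")
    if cached is not None:
        return list(cached)  # cache hit: hand out a fresh list
    result = compute_tool_definitions()
    _tool_defs_cache["quiet"] = result
    return list(result)  # first call: also a fresh list, never the cached object


def inject_context_engine_tools(tools, schemas):
    # Dedup by name so repeated injection cannot produce duplicates.
    existing = {tool["name"] for tool in tools}
    for schema in schemas:
        if schema["name"] not in existing:
            tools.append(schema)
            existing.add(schema["name"])
```

The copy is shallow, as with `list(cached)` in the commit message: callers may append to the returned list safely, but mutating an individual schema dict would still reach the cache.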
Rob Moen	0dd373ec43	fix(context): honor model.context_length for Ollama num_ctx and all display paths When a user sets model.context_length in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config, causing OOM on smaller GPUs like the P100 (16GB). Root cause: two separate context values existed independently: - context_compressor.context_length = config value (e.g. 65536) ✓ - _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config Changes: 1. Cap Ollama num_ctx to config context_length (run_agent.py) When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix: it prevents Ollama from allocating more VRAM than the user budgeted. 2. Pass config_context_length through all secondary call sites Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback: - cli.py: @-reference expansion and /model switch display - gateway/run.py: @-reference expansion and /model switch display - tui_gateway/server.py: @-reference expansion - hermes_cli/model_switch.py: resolve_display_context_length() 3. Normalize root-level context_length in config (hermes_cli/config.py) _normalize_root_model_keys() now migrates root-level context_length into the model section, matching existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` had it silently ignored. 4. Fix misleading comments (agent/model_metadata.py) DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated. Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests).	2026-04-30 04:31:23 -07:00
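The capping rule in change 1 reduces to a small precedence check. A hedged sketch, where the function name and parameters are hypothetical and only the precedence order comes from the commit message:

```python
from typing import Optional


def resolve_ollama_num_ctx(
    detected_num_ctx: int,
    config_context_length: Optional[int],
    explicit_num_ctx: Optional[int],
) -> int:
    """Pick the num_ctx to send to Ollama."""
    if explicit_num_ctx is not None:
        # An explicit ollama_num_ctx override always wins.
        return explicit_num_ctx
    if config_context_length is not None:
        # Cap the GGUF-detected value; never raise it above what was detected.
        return min(detected_num_ctx, config_context_length)
    return detected_num_ctx
```

With the values from the message, a detected 256000 and a configured 65536 yield 65536, so Ollama no longer allocates for the full GGUF context.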
Bartok9	fbb3775770	fix(gateway): enforce auth check in busy-session path to prevent unauthorized injection (#17775 ) The busy-session handler (_handle_active_session_busy_message) bypassed the authorization gate that the cold path enforces via _is_user_authorized(). In shared-thread contexts (Slack threads, Telegram forum topics, Discord threads) where thread_sessions_per_user=False (the default), all participants share one session_key. An unauthorized user posting in the same thread as an authorized user would hit the active-session branch, skip the auth check, and have their text merged into _pending_messages or injected via agent.interrupt(). This commit adds the same _is_user_authorized() check at the top of the busy handler, before any message queuing, steering, or interrupt logic. Unauthorized messages are silently dropped (return True) with a warning log — matching the cold-path behavior. Affected platforms: Slack, Telegram, Discord, any adapter with shared-session thread contexts. Closes #17775	2026-04-30 04:29:15 -07:00
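A rough sketch of the ordering this fix enforces: the authorization check runs before any queuing. The function below is a hypothetical reduction of the busy handler, with a plain set standing in for `_is_user_authorized()` and a list standing in for `_pending_messages`.

```python
import logging

logger = logging.getLogger("gateway.sketch")


def handle_busy_message(user_id, text, authorized_users, pending_messages):
    """Return True when the message needs no further processing."""
    if user_id not in authorized_users:
        # Drop silently (from the sender's point of view) but leave a log trail.
        logger.warning("dropping busy-session message from unauthorized user %s", user_id)
        return True
    # Only authorized text may reach the shared session's queue.
    pending_messages.append(text)
    return True
```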
briandevans	cc5b9fb581	fix(transport): omit thinking_config for Gemma on the gemini provider (#17426 ) The `gemini` provider also serves Gemma (e.g. `gemma-4-31b-it`) and historically other Google models like PaLM. Those reject `extra_body.thinking_config` with HTTP 400: Unknown name "thinking_config": Cannot find field `_build_gemini_thinking_config()` was unconditionally producing a config dict for any model on the `gemini` / `google-gemini-cli` provider, which `ChatCompletionsTransport.build_kwargs` then dropped into `extra_body["thinking_config"]`. The result: every chat turn for Gemma users on the gemini provider blew up at the API edge. The fix is the same shape Hermes already uses for the Gemini-2.5 vs Gemini-3 family clamping: normalise the model id, strip an `OpenRouter`-style `google/` prefix, and short-circuit early when the result doesn't start with `gemini`. We return `None` rather than `{"includeThoughts": False}`, because the API rejects the field name itself — even the polite "off" form trips the same 400. Three regression tests cover Gemma with reasoning enabled, Gemma with reasoning disabled, and the `google/gemma-…` OpenRouter-style id; the existing Gemini-2.5 / Gemini-3 / `google/gemini-…` cases keep passing because the Gemini guard fires after the prefix strip. Fixes #17426 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 04:29:04 -07:00
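The guard described above (normalise the id, strip a `google/` prefix, short-circuit for non-Gemini models) might look roughly like this. It is a simplified illustration: the real `_build_gemini_thinking_config()` also handles per-family clamping, which is omitted here, and the function name and return shape for Gemini models are assumptions.

```python
from typing import Optional


def build_thinking_config(model: str, reasoning_enabled: bool) -> Optional[dict]:
    """Return a thinking config for Gemini models, or None for anything else."""
    model_id = model.strip().lower()
    if model_id.startswith("google/"):
        # OpenRouter-style ids carry a vendor prefix; strip it before matching.
        model_id = model_id[len("google/"):]
    if not model_id.startswith("gemini"):
        # Gemma and other non-Gemini models reject the field name itself,
        # so omit the config entirely instead of sending an "off" value.
        return None
    return {"includeThoughts": reasoning_enabled}
```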
Teknium	3de8e21683	feat(gateway): native send_multiple_images for Telegram, Discord, Slack, Mattermost, Email Ports PR #17888's send_multiple_images ABC to every gateway platform that has a native multi-attachment API, so images arrive as a single bundled message instead of N separate ones. Native overrides: - Telegram: send_media_group (10 photos per album, chunks over); animated GIFs peeled off and routed through send_animation (albums don't support animations) - Discord: channel.send(files=[...]) (10 attachments per message, chunks over); URL images downloaded into BytesIO so they render inline; forum channels use create_thread with files=[...] - Slack: files_upload_v2(file_uploads=[...]) (10 per call, chunks over); respects thread_ts; records thread participation - Mattermost: single post with file_ids list (5 per post — Mattermost cap, chunks over) - Email: single SMTP message with multiple MIME attachments (no chunk cap, SMTP size governs); remote URLs remain linked in body (parity with existing send_image) All platforms fall back to the base per-image loop on any failure, so a single bad image in a batch never loses the rest. Matrix, WhatsApp, and single-attachment platforms (BlueBubbles, Feishu, WeCom, WeChat, DingTalk) continue to use the base default loop — their server APIs only accept one attachment per message anyway. Tests: adds tests/gateway/test_send_multiple_images.py with 19 targeted tests covering base default loop, chunking, animation peel-off, fallback paths, and empty-batch no-ops across all five new overrides. Co-authored-by: Maxence Groine <maxence@groine.fr>	2026-04-30 04:28:08 -07:00
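The chunk-then-fall-back behaviour shared by these overrides can be sketched independently of any platform SDK. `chunked` and `send_images` are hypothetical helpers; `send_batch` and `send_one` stand in for a platform's native multi-attachment call and the base per-image path.

```python
def chunked(items, size):
    """Split `items` into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def send_images(images, send_batch, send_one, limit):
    """Send in native batches; on a batch failure, retry that batch per image.

    Returns the images that could not be delivered either way.
    """
    failed = []
    for batch in chunked(images, limit):
        try:
            send_batch(batch)
        except Exception:
            # One bad image must not lose the rest of the batch.
            for image in batch:
                try:
                    send_one(image)
                except Exception:
                    failed.append(image)
    return failed
```

With `limit=10` this matches the Telegram, Discord, and Slack caps quoted above, and `limit=5` matches Mattermost. An empty input produces no calls at all.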
Maxence Groine	04ea895ffb	feat(gateway/signal): add support for multiple images sending Adds a new `send_multiple_images` method to the ``BasePlatformAdapter`` that implements the default "One image per message" loop and allows for platform-specific overriding. Implements such an override for the Signal adapter, batching images and trying (best-effort) to work around rate-limits for voluminous batches using a specific scheduler. Also implements batching + rate-limit handling in the `send_message` tool. New tests added for the Signal adapter, its rate-limit scheduler and the `send_message` tool	2026-04-30 04:28:08 -07:00
Heltman	19f9be1dff	fix(tools): serialize concurrent hermes_tools RPC calls from execute_code The sandbox-side `_call()` in both the UDS and file-based transports was not thread-safe, so scripts that call tools from multiple threads (e.g. `ThreadPoolExecutor` over `terminal()`) inside a single `execute_code` run could silently receive each other's responses. Root cause: * UDS transport — a single module-level `_sock` was shared across all threads; the newline-framed protocol has no request-id; and the server-side RPC loop handles one connection serially. With concurrent callers, each thread would `sendall()` then race to `recv()` the next newline-terminated response from the shared buffer, so responses got delivered to the wrong caller. * File transport — `_seq += 1` is a non-atomic read-modify-write, so two threads could allocate the same sequence number and clobber each other's request/response files. Fix: guard `_call()` with a `threading.Lock` in the UDS case (covering send+recv), and guard `_seq` allocation with a lock in the file case. No protocol change. Regression tests cover both the generated-source level (lock is present and used) and an end-to-end concurrency test: running a sandboxed ThreadPoolExecutor of 10 `terminal()` calls against a slow mock dispatcher, asserting every caller sees its own tagged response. The test fails without the fix (10/10 mismatched, matching real-world repro) and passes with it.	2026-04-30 03:31:16 -07:00
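To illustrate why one lock around send+recv is enough for a newline-framed protocol without request ids, here is a self-contained sketch over `socket.socketpair()`. `LockedRpcClient` and `serve_echo` are invented for the example and are not the generated sandbox code.

```python
import json
import socket
import threading


class LockedRpcClient:
    """Newline-framed JSON client that is safe to share between threads."""

    def __init__(self, sock: socket.socket) -> None:
        self._sock = sock
        self._reader = sock.makefile("r", encoding="utf-8")
        self._lock = threading.Lock()

    def call(self, payload: dict) -> dict:
        line = json.dumps(payload) + "\n"
        # Hold the lock across send AND receive so the next line read
        # from the shared socket is always the reply to this request.
        with self._lock:
            self._sock.sendall(line.encode("utf-8"))
            return json.loads(self._reader.readline())


def serve_echo(sock: socket.socket) -> None:
    """Serial server loop: answer each request line with its own tag."""
    reader = sock.makefile("r", encoding="utf-8")
    for line in reader:
        request = json.loads(line)
        reply = json.dumps({"echo": request["tag"]}) + "\n"
        sock.sendall(reply.encode("utf-8"))
```

The lock serializes callers, so throughput is that of a single connection. That matches the commit's "no protocol change" choice; adding request ids would allow pipelining but would need a protocol change on both sides.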
Rylen Anil	3858f9419e	fix: handle gateway Ctrl+C shutdown cleanly	2026-04-30 03:29:57 -07:00
Sebastian B	362996e269	fix(runtime_provider): _get_named_custom_provider must honour transport field on v12+ providers dict The v11→v12 migrate_config step writes the API mode for every entry under the new transport: field (per the v12+ schema in _normalize_custom_provider_entry). _get_named_custom_provider read the legacy api_mode: spelling only, so for every migrated config the lookup returned None for the api mode. Downstream, _resolve_named_custom_runtime then falls back through custom_provider.get("api_mode") or _detect_api_mode_for_url(base_url) or "chat_completions". For loopback URLs (proxies, local servers) or unknown hostnames, the URL detector returns None and the resolver silently downgrades the configured codex_responses / anthropic_messages transport to chat_completions. Requests get sent to /v1/chat/completions instead of /v1/responses or /v1/messages and the provider 404s — or worse, returns a usable chat_completions response while skipping the model's reasoning / caching surface. Fix: read both field names — entry.get("api_mode") or entry.get("transport") — at the two match-by-key + match-by-name branches in _get_named_custom_provider. The runtime normaliser _normalize_custom_provider_entry already accepts both spellings; this lifts the same compat into the direct-dict reader so v12+ configs work without going through the shim. Adds three regression tests under tests/hermes_cli/test_user_providers_model_switch.py: - transport field is read on the match-by-key branch - legacy api_mode spelling still works for hand-edited configs - transport is read on the match-by-display-name branch	2026-04-30 03:29:48 -07:00
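The fallback chain quoted above, with the fix applied, can be sketched as a single expression. `resolve_api_mode` and the injected `detect_from_url` callable are illustrative stand-ins for the real resolver and `_detect_api_mode_for_url`.

```python
from typing import Callable, Optional


def resolve_api_mode(entry: dict, detect_from_url: Callable[[str], Optional[str]]) -> str:
    """Resolve the API mode for a custom provider entry."""
    return (
        entry.get("api_mode")        # legacy spelling, still honoured
        or entry.get("transport")    # v12+ spelling written by the migration
        or detect_from_url(entry.get("base_url", ""))
        or "chat_completions"        # last-resort default
    )
```

Before the fix only the first lookup existed, so a migrated entry with `transport: codex_responses` on a loopback URL fell all the way through to `chat_completions`.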
briandevans	f54935738c	fix(cron): surface agent run_conversation failure flags as job failure run_job() ignored the result's `failed=True` / `completed=False` flags that agent.run_conversation populates on API exhaustion, mid-run interrupts, and model aborts. Because final_response on those paths is often a non-empty error string ("API call failed after 3 retries: Request timed out."), the existing empty-response soft-fail in _process_job did not trip either: the error text was delivered as if it were the agent's reply and last_status was set to "ok" with no error notification. Detect those flags right after the dict-shape guard and raise so the existing except handler builds the proper failure tuple, preserving the agent's error message via result["error"]. Adds a parametrized regression covering: API-retry-exhausted with error text in final_response, completed=False with no final_response, completed=False without an explicit failed flag, and the partial-reply plus failed=True case. Plus a guard that a normal completed=True success result is still treated as success. Fixes #17855 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 03:27:37 -07:00
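A minimal sketch of the flag check, assuming the result is a plain dict with the `failed`, `completed`, `error`, and `final_response` keys named in the message. The exception type and helper names are hypothetical, and treating a missing `completed` key as success is an assumption of this sketch.

```python
class AgentRunFailed(RuntimeError):
    """Raised when the agent reports a failed or incomplete run."""


def is_failed_run(result: dict) -> bool:
    # Either flag alone is enough: failed=True, or completed=False.
    return result.get("failed") is True or result.get("completed") is False


def final_response_or_raise(result: dict) -> str:
    if is_failed_run(result):
        # Prefer the agent's own error text so the failure notice is useful.
        message = result.get("error") or result.get("final_response") or "agent run did not complete"
        raise AgentRunFailed(message)
    return result.get("final_response", "")
```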
briandevans	f44f1f9615	fix(gateway): preserve session guard across in-band drain handoff When the in-band pending-message drain spawns a fresh task and transfers ownership via _session_tasks[session_key] = drain_task, the original task still unwinds through the finally block. The drain task picks up the same interrupt_event in its own _process_message_background entry, so an unconditional _release_session_guard(session_key, guard=interrupt_event) at the end of the finally matches and deletes _active_sessions[session_key] while the drain task is still pending its first await. A concurrent inbound message arriving in that handoff window passes the Level-1 guard (no entry exists) and spawns a second _process_message_background for the same session — two agents on one session_key, duplicate responses, duplicate tool calls. Fix: only call _release_session_guard when the current task still owns _session_tasks[session_key]. When ownership has been transferred to a drain task, leave _active_sessions populated; the drain task's own lifecycle releases it. This mirrors the late-arrival drain path in the same finally block, which already leaves both entries alone after handing off. Also reorder stdlib imports in the new regression test file to match the gateway test convention (stdlib before third-party). Regression test: capture _active_sessions[sk] identity at every handler entry across a 2-step in-band drain chain and assert the guard Event identity stays the same. Pre-fix, the original task's finally deletes the entry, the drain task falls through to the `or asyncio.Event()` branch, and a fresh Event is installed — identity diverges. Post-fix, the entry is preserved and the drain task reuses the original Event. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 03:27:08 -07:00
briandevans	663ba9a58f	fix(gateway): drain pending messages via fresh task, not recursion (#17758 ) `_process_message_background` finished a turn, found a queued follow-up, and drained it via `await self._process_message_background(pending_event, session_key)`. Each chained follow-up added a frame to the call stack instead of starting fresh. Under sustained pending-queue activity (e.g. a user sending follow-ups faster than the agent finishes turns) the C stack would exhaust at ~2000 nested frames and SIGSEGV the process. Mirror the late-arrival drain pattern that already exists in the same function: spawn a new `asyncio.create_task(...)` for the pending event and return so the current frame can unwind. The new task takes ownership via `_session_tasks[session_key]`. The late-arrival drain in `finally` could now race with the in-band drain across the `await typing_task` / `await stop_typing` window, so add a guard: if `_session_tasks[session_key]` is no longer the current task, an in-band drain already spawned a follow-up task — re-queue the late-arrival event so that task picks it up after its current event, instead of spawning a second concurrent task for the same session_key. Regression test (`test_pending_drain_no_recursion.py`) chains 12 follow-ups and asserts the recorded `_process_message_background` stack depth stays bounded at handler entry. Pre-fix: depths grow linearly `[1,2,3,…,12]`. Post-fix: all depths are `1`. `test_duplicate_reply_suppression::test_stale_response_suppressed_when_interrupted` called `_process_message_background` directly and implicitly relied on the old recursive `await` semantic — updated to wait for the spawned drain task before checking the sent list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 03:27:08 -07:00
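The difference between draining by recursion and draining by a fresh task can be shown with a toy session. Everything here (`Session`, `run_chain`, the depth counter) is invented for illustration and is far simpler than `_process_message_background`.

```python
import asyncio
from collections import deque


class Session:
    def __init__(self, pending):
        self.pending = deque(pending)
        self.handled = []
        self.depths = []   # nesting depth observed at each handler entry
        self._depth = 0
        self.task = None   # the task that currently owns the session

    async def process(self, event):
        self._depth += 1
        self.depths.append(self._depth)
        try:
            await asyncio.sleep(0)  # stand-in for running the agent turn
            self.handled.append(event)
            if self.pending:
                follow_up = self.pending.popleft()
                # Hand ownership to a fresh task and return, so this
                # frame unwinds instead of awaiting the follow-up.
                self.task = asyncio.create_task(self.process(follow_up))
        finally:
            self._depth -= 1


async def run_chain(events):
    session = Session(events[1:])
    session.task = asyncio.create_task(session.process(events[0]))
    while True:
        task = session.task
        await task
        if session.task is task:  # no hand-off happened: chain is done
            return session
```

Replacing the `create_task` line with `await self.process(follow_up)` would make the recorded depths grow by one per follow-up, which is the pattern the regression test guards against.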
Teknium	8d302e37a8	feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) Piper (OHF-Voice/piper1-gpl) is a fast, local neural TTS engine from the Home Assistant project that supports 44 languages with zero API keys. Adds it as a native built-in provider alongside edge/neutts/kittentts, installable via 'hermes tools' with one keystroke. What ships: - New 'piper' built-in provider in tools/tts_tool.py - Lazy import via _import_piper() - Module-level voice cache keyed on (model_path, use_cuda) so switching voices doesn't invalidate older cached voices - _resolve_piper_voice_path() accepts either an absolute .onnx path or a voice name (auto-downloaded on first use via 'python -m piper.download_voices --download-dir <cache>') - Voice cache at ~/.hermes/cache/piper-voices/ (profile-aware via get_hermes_dir) - Optional SynthesisConfig knobs: length_scale, noise_scale, noise_w_scale, volume, normalize_audio, use_cuda (passed through only when configured, so older piper-tts versions aren't broken) - WAV output then ffmpeg conversion path (same as neutts/kittentts) so Telegram voice bubbles work when ffmpeg is present - Piper added to BUILTIN_TTS_PROVIDERS so a user's tts.providers.piper.command cannot shadow the native provider (regression test included) - 'hermes tools' wizard entry - Piper appears under Voice and TTS as local free, with 'pip install piper-tts' auto-install via post_setup handler - Prints voice-catalog URL and default-voice info after install - config.yaml defaults - tts.piper.voice defaults to en_US-lessac-medium - Commented advanced knobs for discoverability - Docs - New 'Piper (local, 44 languages)' section in features/tts.md explaining install path, voice switching, pre-downloaded voices, and advanced knobs - Piper listed in the ten-provider table and ffmpeg table - Custom-command-providers section updated to drop the Piper example (now native) and add a piper-custom example for users with their own trained .onnx models - overview.md bumps provider count to ten - Tests (tests/tools/test_tts_piper.py, 16 tests) - Registration (BUILTIN_TTS_PROVIDERS, PROVIDER_MAX_TEXT_LENGTH) - _resolve_piper_voice_path across every branch: direct .onnx path, cached voice name, fresh download with correct CLI args, download failure, successful-exit-but-missing-files, empty voice to default - _generate_piper_tts: loads voice once, reuses cache, voice-name download wiring, advanced knobs flow through SynthesisConfig - text_to_speech_tool end-to-end dispatch and missing-package error - check_tts_requirements: piper availability toggles the return value - Regression guard: piper cannot be shadowed by a command provider with the same name - Pre-existing test_tts_mistral test broadened to mock the new piper/kittentts/command-provider checks (otherwise it false-passes when piper is installed in the test venv) E2E verification (live): Actual pip install piper-tts, config piper + en_US-lessac-low, text_to_speech_tool call, voice auto-downloaded from HuggingFace, WAV synthesized, ffmpeg-converted to Ogg/Opus. Second call hits the cache (~60ms). Cache dir populated with .onnx and .onnx.json. This caught a real bug during development: the first pass used '-d' as the download-dir flag; the actual piper.download_voices CLI wants '--download-dir'. Fixed before PR opened.	2026-04-30 02:53:20 -07:00
Teknium	2662bfb756	fix(tests): make test_update_stale_dashboard immune to hermes_cli.main reload (#17881 ) Six tests in this file failed in CI (-n auto) after #17832 landed because other tests on the same xdist worker reload hermes_cli.main: tests/hermes_cli/test_env_loader.py:85-86 sys.modules.pop('hermes_cli.main', None) importlib.import_module('hermes_cli.main') tests/hermes_cli/test_skills_subparser.py:24-25 del sys.modules['hermes_cli.main'] When either ran first on a worker, our top-of-file 'from hermes_cli.main import _kill_stale_dashboard_processes' captured a stale function object whose __globals__ points at the old module dict. patch('hermes_cli.main._find_stale_dashboard_pids', ...) then patched the new module, but the stale function resolved the dependency via its stale __globals__, so every patch became a no-op: pids=[] → early return → no signals, no output, assertions failed. Fix: add an autouse fixture that rebinds the three module-level names to whatever is currently live in sys.modules['hermes_cli.main'] before each test runs. The pollutants in the other two files are load-bearing for their own tests, so fixing it on the consumer side is correct. Repro: pytest tests/hermes_cli/test_env_loader.py tests/hermes_cli/test_update_stale_dashboard.py	2026-04-30 02:46:56 -07:00
Teknium	0da968e521	fix(curator): unify under auxiliary.curator (hermes model, dashboard) (#17868 ) Voscko reported curator.auxiliary.provider/model was advertised in the docs but ignored — the review fork read only model.provider/default. The narrow fix would wire the one-off key through, but that leaves curator as a parallel system: not in `hermes model` → auxiliary picker, not in the dashboard Models tab, missing per-task base_url/api_key/timeout/ extra_body. Unify curator with the rest of the aux task system so `hermes model` and the dashboard configure it like every other aux task. Four sources of truth updated: - hermes_cli/config.py — add 'curator' slot to DEFAULT_CONFIG.auxiliary (timeout=600 since reviews run long), drop the one-off curator.auxiliary block from DEFAULT_CONFIG.curator. - hermes_cli/main.py — add ('curator', 'Curator', 'skill-usage review pass') to _AUX_TASKS so the CLI picker offers it. - hermes_cli/web_server.py — add 'curator' to _AUX_TASK_SLOTS so the dashboard REST endpoint accepts it. - web/src/pages/ModelsPage.tsx — add Curator entry so the dashboard Models tab renders the task. agent/curator.py _resolve_review_model() now reads auxiliary.curator first (canonical), falls back to legacy curator.auxiliary (with an info log asking users to migrate), then falls back to the main chat model. Pre-unification users keep working. Docs updated: docs/user-guide/features/curator.md now points at `hermes model` → auxiliary → Curator and the dashboard Models tab. Tests: 6 unit tests on _resolve_review_model (auto default, canonical slot honored, partial override fallback, legacy fallback with deprecation log assertion, new-wins-over-legacy, empty-config safety) plus a cross-registry test that curator is wired into all four sources of truth. test_aux_tasks_keys_all_exist_in_default_config already covers the DEFAULT_CONFIG ↔ _AUX_TASKS invariant. Reported by Voscko on Discord.	2026-04-30 02:46:01 -07:00
teknium1	658947480a	fix(acp): drop dead message_id kwarg from replay chunks UserMessageChunk and AgentMessageChunk do not have a message_id field in the ACP schema. Passing it silently dropped the kwarg (pydantic does not raise on unknown init kwargs here) and the subsequent test assertions on .message_id raised AttributeError. Strip the dead plumbing (uuid import, message_id= kwarg on both chunk types, unused session_id/index parameters) and remove the matching .message_id asserts from the test.	2026-04-30 02:45:54 -07:00
Henkey	d2536a72bf	fix(acp): replay session history on load	2026-04-30 02:45:54 -07:00
teknium1	5d253e65b7	fix(openviking): pre-check fs/stat to route file URIs before hitting directory-only endpoints Adds a deterministic pre-check on top of htsh's exception-based fallback: before calling /content/abstract or /content/overview on a non-pseudo URI, probe /api/v1/fs/stat. If the server says the URI is a file, route straight to /content/read instead of eating a failing 500 round-trip. This is the same idea pty819 and chennest independently landed in PRs #12757 and #12937 — merged here on top of htsh's broader fix so we keep pseudo-URI normalization and v0.3.3 browse-shape handling while avoiding the slow exception path on servers that return a raised 500 every time. The exception fallback from #5886 stays in place for environments where fs/stat is unavailable or returns an unfamiliar shape. Also credits pty819, chennest, and htsh in AUTHOR_MAP so future release notes attribute them correctly.	2026-04-30 02:35:29 -07:00
hitesh	10e43edc09	fix(openviking): fallback summary reads to content/read for file URIs OpenViking returns 500 for /content/abstract and /content/overview when URI points to mem_*.md files. Add resilient fallback to /content/read for non-pseudo summary file URIs while preserving pseudo summary normalization. Also add regression tests for fallback behavior.	2026-04-30 02:35:29 -07:00
hitesh	bff8ab0311	test(openviking): add helper regression coverage	2026-04-30 02:35:29 -07:00
Teknium	0ad4f55aa8	feat(dashboard): add --stop and --status flags (#17840 ) `hermes dashboard` is a long-lived foreground server that users often start and forget about, sometimes in a shell they've since closed. We didn't have a way to stop it — users had to find the PID manually. Adds two lifecycle flags that reuse the same detection + termination path the post-`hermes update` cleanup (PR #17832) uses: hermes dashboard --status List running hermes dashboard processes with PID + cmdline. Exit 0, informational. hermes dashboard --stop Terminate all running dashboards (3s grace then force-kill survivors). Exit 0 if none remain, 1 if any couldn't be stopped. Windows uses `taskkill /F` as before. Both flags short-circuit before any fastapi/uvicorn import so they work even on installations where the dashboard extras aren't installed — useful when you're cleaning up after uninstalling. The kill helper gained an optional `reason=...` param so the output reads "(requested via --stop)" instead of the post-update-specific "running backend no longer matches the updated frontend" wording. E2E: `hermes dashboard --status` with nothing running prints the empty message; with a fake `hermes dashboard ...` cmdline spawned via `exec -a`, `--status` lists it, `--stop` terminates it (exit -15), and a follow-up `--status` returns empty.	2026-04-30 02:30:20 -07:00
Teknium	2facea7f71	feat(tts): add command-type provider registry under tts.providers.<name> (#17843) Reshape of PR #17211 (@versun). Lets users wire any local or external TTS CLI into Hermes without adding engine-specific Python code. Users declare any number of named providers in config.yaml and switch between them with tts.provider: <name>, alongside the built-ins (edge, openai, elevenlabs, …). Config shape: tts: provider: piper-en providers: piper-en: type: command command: 'piper -m ~/model.onnx -f {output_path} < {input_path}' output_format: wav Placeholders: {input_path}, {text_path}, {output_path}, {format}, {voice}, {model}, {speed}. Use {{ / }} for literal braces. Key behavior: - Built-in provider names always win: a tts.providers.openai entry cannot shadow the native OpenAI provider. - type: command is the default when command: is set. - Placeholder values are shell-quote-aware (bare / single / double context), so paths with spaces and shell metacharacters are safe. - Default delivery is a regular audio attachment. voice_compatible: true opts in to Telegram voice-bubble delivery via ffmpeg Opus conversion. - Command failures (non-zero exit, timeout, empty output) surface to the agent with stderr/stdout included so you can debug from chat. - Process-tree kill on timeout (Unix killpg, Windows taskkill /T). - max_text_length defaults to 5000 for command providers; override under tts.providers.<name>.max_text_length. Tests: tests/tools/test_tts_command_providers.py: 42 new tests cover provider resolution, shell-quote context, placeholder rendering with injection payloads, timeout, non-zero exit, empty output, voice_compatible opt-in, and end-to-end dispatch through text_to_speech_tool. All 88 pre-existing TTS tests still pass. Docs: new "Custom command providers" section in website/docs/user-guide/features/tts.md with three worked examples (Piper, VoxCPM, MLX-Kokoro), placeholder reference, optional keys, behavior notes, and security caveat. E2E-verified live: isolated HERMES_HOME, command provider declared in config.yaml, text_to_speech_tool dispatches through the registered shell command and the output file is produced as expected. Co-authored-by: Versun <me+github7604@versun.org>	2026-04-30 02:29:08 -07:00
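A reduced sketch of placeholder rendering for a command template, covering only the bare (unquoted) context. `render_command` is a hypothetical helper: the commit describes quote-context awareness for placeholders inside single or double quotes, which this version does not attempt.

```python
import shlex


def render_command(template: str, values: dict) -> str:
    """Fill {name} placeholders with shell-quoted values.

    `{{` and `}}` in the template come out as literal braces, because
    str.format already treats doubled braces that way.
    """
    quoted = {name: shlex.quote(str(value)) for name, value in values.items()}
    return template.format(**quoted)
```

A placeholder that appears in the template but not in `values` raises `KeyError` here, which seems preferable to running a half-rendered command.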
Teknium	5b85a7d351	fix(update): kill stale dashboard processes instead of warning (#17832 ) `hermes update` previously just printed a warning when it detected a running `hermes dashboard` process from the previous version, telling the user to kill and restart it themselves. In practice dashboards get started and forgotten, so the warning was routinely ignored and users ended up with a silent frontend/backend mismatch (new JS bundle served against the old in-memory Python backend, e.g. new auth headers the old code doesn't recognise → every API call 401s). The dashboard has no service manager, no PID file, and we don't record the original launch args (--host, --port, --insecure, --tui, --no-open) so we can't auto-restart it. But we CAN stop it, which is what the user wants — the failure mode when the stale process is left alive is worse than the dashboard just being down. - POSIX: SIGTERM, poll for ~3s, SIGKILL any survivors. - Windows: `taskkill /PID <pid> /F`. - Print each PID's outcome plus a one-line restart hint. - Detection logic is unchanged (same ps / wmic scan, same guards against the `pgrep -f` greedy-match trap from #16872 and the #17049 wmic UnicodeDecodeError fix). Also split the old monolithic `_warn_stale_dashboard_processes` into `_find_stale_dashboard_pids` (scan) + `_kill_stale_dashboard_processes` (kill), keeping the old name as an alias so any external callers still work. E2E verified: spawned a fake `hermes dashboard` cmdline via `exec -a 'hermes dashboard …' sleep 300`, ran `_kill_stale_dashboard_processes()`, confirmed SIGTERM exit (-15) and that a post-scan returns an empty PID list.	2026-04-30 01:34:34 -07:00
Teknium	fd0796947f	fix: stabilize CI — TS widen, sys.modules restore, WS subscriber race (#17836) Three narrow fixes targeting the remaining red checks after #17828: 1. ui-tui/src/app/slash/commands/ops.ts (Docker Build): /reload-mcp's local params type annotated session_id: string while ctx.sid is string | null. Widen to string | null — matches every other rpc call site and the test harness which passes { session_id: null }. Fixes TS2322 on line 86. The rpc signature itself is Record<string, unknown>, so this is purely a local typing fix, no behavioral change. 2. tests/plugins/test_achievements_plugin.py (13 cascading test failures): _install_fake_session_db did a raw sys.modules['hermes_state'] = fake_module without restoration, leaking the fake across xdist worker boundaries. Downstream tests doing from hermes_state import SessionDB got a module whose SessionDB was lambda: fake_db — 6 test_hermes_state.py tests failed with AttributeError: 'function' object has no attribute '_sanitize_fts5_query' / _contains_cjk, and 7 test_860_dedup.py tests failed with TypeError: got unexpected keyword argument 'db_path' (real code calls SessionDB(db_path=...)). Fix: stash monkeypatch on the plugin_api module object in the fixture, and have the helper do monkeypatch.setitem(sys.modules, 'hermes_state', fake_module) for auto-restoration at test teardown. 3. tests/hermes_cli/test_web_server.py (WS race): TestPtyWebSocket::test_pub_broadcasts_to_events_subscribers hit the 30s test timeout on CI. websocket_connect returns after ws.accept() — but /api/events registers the subscriber in _event_channels on the NEXT await (inside _event_lock). A publish immediately after connect could race ahead of registration and be dropped, and the subsequent receive_text() blocked until SIGALRM killed the test. Fix: poll _event_channels after the subscriber connects, before publishing. 
Validation: scripts/run_tests.sh tests/plugins/test_achievements_plugin.py tests/run_agent/test_860_dedup.py tests/test_hermes_state.py tests/hermes_cli/test_web_server.py 338 passed cd ui-tui && npm run type-check clean cd ui-tui && npm run build clean Remaining red checks are pure infra (Nix ubuntu hits TwirpErrorResponse ResourceExhausted on the GH Actions cache API; Nix macos bounces between npm build openssl-legacy and cache rate-limits) and cannot be fixed in the codebase.	2026-04-30 01:34:08 -07:00
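The sys.modules leak in item 2 comes from installing a fake module without ever removing it. The commit uses pytest's `monkeypatch.setitem`; the sketch below shows the same restore-on-exit idea with the standard library's `unittest.mock.patch.dict`. `install_fake_module` and the module name in the test are hypothetical.

```python
import sys
import types
from unittest import mock


def install_fake_module(name: str, **attrs):
    """Return a context manager that swaps a fake module into sys.modules."""
    fake = types.ModuleType(name)
    for key, value in attrs.items():
        setattr(fake, key, value)
    # patch.dict snapshots sys.modules on entry and restores it on exit,
    # so the fake cannot leak into tests that run later in the same worker.
    return mock.patch.dict(sys.modules, {name: fake})
```

Inside the `with` block an ordinary `import` of that name resolves to the fake; once the block exits, the entry is gone again.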
Teknium	aa7bf329bc	feat(gateway): centralize audio routing + FLAC support + Telegram doc fallback (#17833) Extracted from PR #17211 (@versun) so it can land independently of the local_command TTS provider redesign. - Add should_send_media_as_audio(platform, ext, is_voice) in gateway/platforms/base.py; single source of truth for audio routing. - Add .flac to recognized audio extensions (MEDIA regex, weixin audio set, send_message audio set). - Telegram send_voice() now falls back to send_document for formats Telegram's Bot API can't play natively (.wav, .flac, ...) instead of raising; MP3/M4A still go to sendAudio, Opus/OGG still go to sendVoice. - Route _send_telegram() in send_message_tool through a narrower _TELEGRAM_SEND_AUDIO_EXTS = {.mp3, .m4a} set. - cron.scheduler._send_media_via_adapter now delegates the audio decision to should_send_media_as_audio so it matches the gateway. - Update the cron live-adapter ogg test to flag [[audio_as_voice]] so it still routes to sendVoice under the new Telegram-specific policy. - Tests: unit coverage for should_send_media_as_audio across platforms, end-to-end MEDIA routing via _process_message_background and GatewayRunner._deliver_media_from_response, TelegramAdapter.send_voice fallback for FLAC/WAV. Co-authored-by: Versun <me+github7604@versun.org>	2026-04-30 01:32:31 -07:00
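The Telegram part of this routing policy can be illustrated by a small extension-to-method table. This is only a sketch under assumptions: `telegram_method_for_audio` is a hypothetical name, and it ignores the `is_voice` flag that the real `should_send_media_as_audio(platform, ext, is_voice)` takes into account.

```python
VOICE_EXTS = {".ogg", ".opus"}   # playable as a Telegram voice message
AUDIO_EXTS = {".mp3", ".m4a"}    # playable via sendAudio


def telegram_method_for_audio(ext: str) -> str:
    """Pick a Bot API method name for an audio file extension."""
    ext = ext.lower()
    if ext in VOICE_EXTS:
        return "sendVoice"
    if ext in AUDIO_EXTS:
        return "sendAudio"
    # Formats Telegram cannot play natively (.wav, .flac, ...) are
    # delivered as plain documents instead of raising an error.
    return "sendDocument"
```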
Teknium	26787ce638	test(gateway): isolate plugin adapter imports and guard the anti-pattern Fixes the xdist collision that broke CI on PR #17764, and structurally prevents future plugin-adapter tests from reintroducing it. Problem ------- tests/gateway/test_teams.py (new in this PR) and tests/gateway/test_irc_adapter.py (already on main) both followed the same anti-pattern: sys.path.insert(0, str(_REPO_ROOT / 'plugins' / 'platforms' / '<name>')) from adapter import <Adapter> Every platform plugin ships its own adapter.py, so the bare 'from adapter import ...' races for sys.modules['adapter']. Whichever test collected first in a given xdist worker won; the other crashed at collection with ImportError, and the polluted sys.path cascaded into 19 unrelated test failures across tools/, hermes_cli/, and run_agent/ in the same worker. Fix --- 1. tests/gateway/_plugin_adapter_loader.py (new): shared helper load_plugin_adapter('<name>') that imports plugins/platforms/<name>/adapter.py via importlib.util under the unique module name plugin_adapter_<name>. Zero sys.path mutation, no possibility of collision. 2. tests/gateway/test_irc_adapter.py and tests/gateway/test_teams.py: migrated to the helper. All 'from adapter import ...' statements (including the ones inside test methods) are replaced with module-level attribute access on the loaded module. 3. tests/gateway/conftest.py: new pytest_configure guard that AST-scans every test_.py under tests/gateway/ at session start and fails the run with a pointer to the helper if any test uses sys.path.insert into plugins/platforms/ OR a bare 'import adapter' / 'from adapter import'. Runs on the xdist controller only (skipped in workers). The next plugin adapter test that tries to reintroduce this pattern gets rejected at collection time with a clear remediation message. 4. scripts/release.py: add aamirjawaid@microsoft.com -> heyitsaamir to AUTHOR_MAP so the check-attribution workflow passes. 
Validation ---------- scripts/run_tests.sh tests/gateway/ 4194 passed scripts/run_tests.sh tests/gateway/test_{teams,irc} 72 passed (both orderings) scripts/run_tests.sh <11 prev-failing test files> 398 passed Guard triggers correctly on both Path-operator and string-literal forms of the anti-pattern.	2026-04-30 01:19:34 -07:00
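Loading each plugin's `adapter.py` under a unique module name, as the fix describes, might look like the sketch below. It uses the standard `importlib.util` calls; `load_adapter_sketch` and its `plugins_root` parameter are assumptions for illustration and do not reproduce the actual `load_plugin_adapter` helper.

```python
import importlib.util
import sys
from pathlib import Path


def load_adapter_sketch(plugins_root: Path, name: str):
    """Import <plugins_root>/<name>/adapter.py as plugin_adapter_<name>."""
    path = plugins_root / name / "adapter.py"
    module_name = f"plugin_adapter_{name}"
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load adapter from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register under the unique name only; sys.path is never touched and
    # no bare "adapter" entry is created in sys.modules.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module
```

Two plugins that each ship an `adapter.py` then end up as two distinct modules, regardless of which test is collected first.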
Aamir Jawaid	b3137d758c	feat(teams): add Microsoft Teams platform adapter as a plugin Hello! I am the maintainer of the microsoft-teams-apps Python SDK and I built this Teams adapter to integrate Microsoft Teams into Hermes. Adds a `plugins/platforms/teams` platform plugin using the new PlatformRegistry system from #17751. The adapter self-registers via `register(ctx)` — no hardcoding in run.py, toolsets.py, or any other core file. Key features: - Supports personal DMs, group chats, and channel posts - Adaptive Card approval prompts with in-place button replacement (Allow Once / Allow Session / Always Allow / Deny) - aiohttp webhook server bridged from the Teams SDK to avoid the fastapi/uvicorn dependency - ConversationReference caching for correct proactive sends in non-DM chats - `interactive_setup()` for `hermes gateway setup` integration - `platform_hint` for LLM context (Teams markdown subset) - 34 tests covering adapter init, send, message handling, and plugin registration Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 01:19:34 -07:00
Teknium	21e695fcb6	fix: clean up defensive shims and finish CI stabilization from #17660 (#17801) PR #17660 landed a sweep of CI fixes but left three loose ends: 1. tests/cli/test_cli_loading_indicator.py::test_reload_mcp_sets_busy_state_and_prints_status — /reload-mcp gained a prompt-cache-invalidation confirmation (commit `4d7fc0f37`) that was never wired into this test. The test exercises the loading-indicator path, so pre-approve via config and go straight into _reload_mcp(). 2. tools/mcp_tool.py _make_tool_handler — the added getattr(server, '_rpc_lock', None) + 'skip the lock if missing' branch is inconsistent with four sibling call sites that still direct-access server._rpc_lock. The lock is guaranteed by MCPServerTask.__init__; falling through to an unlocked session.call_tool would silently serialize-strip RPCs if the guard ever triggered. Restore direct access. 3. tui_gateway/server.py _messages_as_conversation — the helper existed only to catch 'TypeError: include_ancestors unexpected' from mocked SessionDBs that don't actually exist. The real SessionDB.get_messages_as_conversation has accepted include_ancestors since introduction, and every test FakeDB in the repo already declares the kwarg. Remove the shim, inline the two call sites.	2026-04-29 23:53:17 -07:00
Teknium	62a5d7207d	feat(plugins): bundle hermes-achievements + scan full session history (#17754 ) * feat(plugins): bundle hermes-achievements, scan full session history Ships @PCinkusz's hermes-achievements dashboard plugin (https://github.com/PCinkusz/hermes-achievements) as a bundled plugin at plugins/hermes-achievements/ and fixes a bug in the scan path that made the plugin only see the first 200 sessions — making lifetime badges (50k tool calls, 75k errors, etc.) unreachable on long-running installs. Changes: - plugins/hermes-achievements/: vendor v0.3.1 verbatim (manifest, dist/, plugin_api.py, tests, docs, README). - plugins/hermes-achievements/dashboard/plugin_api.py: * scan_sessions(): limit=None now scans ALL sessions via SQLite LIMIT -1. Previously capped at 200, so users with 8000+ sessions saw ~2% of their history. * evaluate_all(): first-ever scans run in a background thread so the dashboard request path never blocks. Stale snapshots serve immediately while a background refresh runs. force=True still blocks synchronously for manual /rescan. * _build_pending_snapshot(), _start_background_scan(), _run_scan_and_update_cache(): supporting plumbing + idempotent thread spawn. - tests/plugins/test_achievements_plugin.py: new tests covering the 200-cap regression, the background-scan first-run flow, stale-serve-plus-background-refresh, forced sync rescan, and scan-thread idempotency. - website/docs/user-guide/features/built-in-plugins.md: lists hermes-achievements in the bundled-plugins table and documents API endpoints, state files, and performance characteristics. E2E validated against a real 8564-session ~6.4GB state.db: * Cold scan: 13m 19s (one-time, backgrounded — UI never blocks) * Warm rescan: 1.47s (8563/8564 sessions reused from checkpoint cache) * 57/60 achievements unlocked, 3 discovered — aggregates like total_tool_calls=259958, total_errors=164213, skill_events=368243 correctly surface lifetime badges that the 200-cap made unreachable. 
Original credit: @PCinkusz (MIT-licensed). Upstream repo remains the staging ground for new badges; this bundle keeps the dashboard feature parity with Hermes core changes. * feat(achievements): publish partial snapshots during cold scan Previously a cold scan on a large session DB (13min on 8564 sessions) showed zero badges for the entire duration, then every badge at once when the scan completed. A dashboard refresh mid-scan was indistinguishable from a fresh install with no history. Now the scanner publishes a partial snapshot to _SNAPSHOT_CACHE every 250 sessions, so each refresh during a cold scan surfaces more badges incrementally. Mechanism: - scan_sessions() takes an optional progress_callback fired every progress_every sessions with (sessions_so_far, scanned, total). - _compute_from_scan() is extracted from compute_all() and gains an is_partial flag that skips writing to state.json — we don't want to record unlocked_at based on a half-complete aggregate that a later session might rebalance. - _run_scan_and_update_cache() installs a publisher callback that builds a partial snapshot, marks it mode='in_progress', and writes it to the cache with age=0 so the UI keeps polling /scan-status and picks up the final snapshot when the scan completes. - Manual /rescan (force=True) disables partial publishing — the caller is blocking on the final result anyway. E2E against real 8564-session state.db (polled cache every 10s): t=10s: cache empty t=20s: 250/8564 scanned, 35 unlocked, 25 discovered t=40s: 500/8564 scanned, 42 unlocked, 18 discovered t=60s: 1000/8564 scanned, 49 unlocked, 11 discovered ... Tests: 9/9 pass (2 new — partial snapshot publication + no-persist-on-partial). Upstream unittest suite: 10/10 pass. * feat(achievements): in-progress scan banner with live % progress Previously the dashboard showed zero badges silently during long cold scans (13min on 8564 sessions). 
The backend was publishing partial snapshots every 250 sessions, but the bundled UI didn't surface any indicator that a scan was running — it just rendered the main page with whatever counts were currently published and no way for the user to know more progress was coming. UI changes (dist/index.js, dist/style.css): - Added a scan-in-progress banner rendered between the hero and stats when scan_meta.mode is 'pending' or 'in_progress'. Shows: BUILDING ACHIEVEMENT PROFILE… Scanned 1,750 of 8,564 sessions · 20%. Badges unlock as more history streams in. with a pulsing teal indicator and a filling teal/cyan progress bar. Disappears the moment the backend flips to 'full' or 'incremental'. - Added an auto-poller via useEffect — while scanInFlight is true the page re-fetches /achievements every 4s WITHOUT toggling the loading skeleton, so unlock counts tick up visibly without the user refreshing. The effect cleans itself up when the scan finishes. - Added refresh() (re-fetch, no loading flip) alongside the existing load() (full reload, used by the Rescan button). Attribution preserved: - Added a header comment to index.js crediting @PCinkusz (https://github.com/PCinkusz/hermes-achievements, MIT) as the original author, noting the banner is a layered addition on top of the original dist bundle. - Matching header comment in style.css, flagging the new .ha-scan-banner* rules as the local addition. Live-verified end to end: - Spun up `hermes dashboard --port 9229 --no-open` against a fresh HERMES_HOME symlinked to the real 8564-session state.db. - Opened /achievements in a browser, confirmed the banner renders with live progress: 'Scanned 1,000 of 8,564 sessions · 11%' → updates to '1,250 ... · 14%' → '1,750 ... · 20%' without user interaction, matching the backend's partial publications. - Stats row simultaneously climbed from 35 → 49 → 53 unlocked as more history streamed in. 
- Vision analysis of the rendered page confirms the banner styling matches the rest of the dashboard (dark card bg, teal accent, same small-caps typography, pulsing indicator reusing ha-pulse keyframes).	2026-04-29 23:23:57 -07:00
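The partial-snapshot mechanism rests on a scan loop that calls back every `progress_every` sessions with `(sessions_so_far, scanned, total)`. A rough sketch of that loop follows; `scan_with_progress` is a hypothetical stand-in and does none of the real aggregation or caching.

```python
from typing import Callable, Iterable, List, Optional


def scan_with_progress(
    sessions: Iterable,
    total: int,
    progress_callback: Optional[Callable[[List, int, int], None]] = None,
    progress_every: int = 250,
) -> List:
    """Collect sessions, reporting progress every `progress_every` items."""
    seen: List = []
    for scanned, session in enumerate(sessions, start=1):
        seen.append(session)
        if progress_callback is not None and scanned % progress_every == 0:
            # Pass a copy so the publisher can build a partial snapshot
            # without racing against later appends.
            progress_callback(list(seen), scanned, total)
    return seen
```

With 600 sessions and the default interval the callback fires at 250 and 500; the final result is delivered by the return value, not by a last callback.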
Teknium	ce0c3ae493	fix(aux): remove hardcoded Codex fallback model, drop Codex from auto chain (#17765 ) The _CODEX_AUX_MODEL constant had already rotated twice in 6 weeks (gpt-5.3-codex -> gpt-5.2-codex -> now broken again at gpt-5.2-codex) because ChatGPT-account Codex gates which models it accepts via an undocumented, shifting allow-list that OpenAI publishes no changelog for. Any pinned default will keep going stale. Issue #17533 reports the current breakage: every ChatGPT-account auxiliary fallback fails with HTTP 400 "model is not supported" and the 60s pause loop degrades long sessions. Rather than reset the clock with another stale pin (PR #17544 proposes gpt-5.2-codex -> gpt-5.4), remove the hardcoded second-order Codex fallback entirely: - Delete `_CODEX_AUX_MODEL`. - Drop `_try_codex` from `_get_provider_chain()` (the auto chain now ends at api-key providers; 4 rungs instead of 5). - Rename `_try_codex() -> _build_codex_client(model)` and require an explicit model from the caller. No more guessing. - `resolve_provider_client("openai-codex", model=None)` now warns and returns (None, None) instead of silently guessing a stale model ID. - Remove `_try_codex` from the `provider="custom"` fallback ladder (same stale-constant trap). - `_resolve_strict_vision_backend("openai-codex")` routes through `resolve_provider_client` so the caller's explicit model is honored. Codex-main users are unaffected: Step 1 of `_resolve_auto` already uses `main_provider` + `main_model` directly and passes the user's configured Codex model through `resolve_provider_client`, which never touched `_CODEX_AUX_MODEL`. Per-task overrides (`auxiliary.<task>.provider/model`) continue to work and are the supported way to route specific aux tasks through Codex. Users whose main provider fails with a payment/connection error and who have ONLY ChatGPT-account Codex auth will now see the 60s pause without a stale-model-rejection noise line in between -- same outcome, cleaner failure. 
Closes #17533. Supersedes #17544 (which resets the clock on the same stale-constant problem).	2026-04-29 23:23:50 -07:00
Stephen Schoettler	f73364b1c4	fix(ci): stabilize main test suite regressions (#17660) * fix: stabilize main test suite regressions * test(agent): update MiniMax normalization expectation * test: stabilize remaining CI assertions * test: harden config helper monkeypatching * test: harden CI-only assertions * fix(agent): propagate fast streaming interrupts	2026-04-29 23:18:55 -07:00
Teknium	828d3a320b	fix(anthropic): reactive recovery for OAuth 1M-context beta rejection (#17752 ) Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable subscriptions retain full context. When Anthropic rejects a request with 400 'long context beta is not yet available for this subscription', disable the beta for the rest of the session, rebuild the client, and retry once. Addresses #17680 (thanks @JayGwod for the clean reproduction) without forcing every OAuth user off the 1M context window. Changes: - agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden; pattern matches 400 + 'long context beta' + 'not yet available'. Narrow enough that the existing 429 tier-gate pattern keeps its own reason. - agent/anthropic_adapter.py: _common_betas_for_base_url, build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged. - agent/transports/anthropic.py: build_kwargs forwards the flag. - run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard, recovery branch next to the image-shrink path. _rebuild_anthropic_client honors the flag. The main build_kwargs call site threads it through for fast-mode extra_headers. - hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models probes get the same reactive retry — previously they'd falsely report the Anthropic API as unreachable for affected subscriptions. Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit tests cover the classifier pattern (including the collision guard against the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default keeps 1M, flag strips only 1M while preserving every other beta).	2026-04-29 21:56:54 -07:00
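The reactive recovery in this commit boils down to: classify the 400 rejection, then retry once with the 1M-context beta dropped. A sketch under stated assumptions: `ApiError`, `is_long_context_beta_rejection`, and `call_with_beta_recovery` are hypothetical names, and the per-session "stay disabled" flag is not modelled.

```python
class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status and message."""

    def __init__(self, status: int, message: str):
        super().__init__(message)
        self.status = status
        self.message = message


def is_long_context_beta_rejection(status: int, message: str) -> bool:
    """Match 400 + 'long context beta' + 'not yet available'."""
    text = message.lower()
    return status == 400 and "long context beta" in text and "not yet available" in text


def call_with_beta_recovery(send):
    """Call send(drop_context_1m_beta) and retry once on the beta rejection."""
    try:
        return send(False)  # default: keep the 1M-context beta
    except ApiError as exc:
        if not is_long_context_beta_rejection(exc.status, exc.message):
            raise
    # Retry exactly once with the beta stripped; a second failure propagates.
    return send(True)
```

Requiring the status to be 400 is what keeps a 429 tier-gate message from being classified the same way.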
Teknium	4d363499db	feat(plugins): bundled platform plugins auto-load by default Platform plugins shipped in-repo under plugins/platforms/ should be available out of the box — users shouldn't have to add 'irc-platform' to plugins.enabled before they can pick IRC from the gateway setup menu. Adds a new ``kind: platform`` plugin type that mirrors the existing ``kind: backend`` auto-load semantics: - Bundled (shipped in the hermes-agent repo): auto-load unconditionally. - User-installed (~/.hermes/plugins/): still opt-in via plugins.enabled so untrusted code doesn't silently run. Changes: * hermes_cli/plugins.py: add 'platform' to _VALID_PLUGIN_KINDS, document the new kind in the PluginManifest docstring, extend the bundled auto- load rule from 'backend only' to 'backend or platform'. * plugins/platforms/irc/plugin.yaml: declare kind: platform. * hermes_cli/gateway.py: remove the now-redundant _load_bundled_platform_plugins_for_enumeration() helper and the _enable_plugin_for_platform() helper. The setup menu's _all_platforms() just calls discover_plugins() and reads the registry — bundled platforms are already loaded at that point. Drops the 'needs_enable' flag and the 'plugin disabled — select to enable' status string. * hermes_cli/setup.py: relax the "gateway is configured" detector used during OpenClaw migration. Switching to _platform_status() in an earlier commit tightened the check to require an exact "configured" match, dropping platforms whose status is "enabled, not paired", "partially configured", "configured + E2EE", etc. Now any non-"not configured" status counts — the user has already started setup there and we shouldn't force the section to rerun. * tests/hermes_cli/test_setup_irc.py: drop the TestIRCPluginDisabledFlow class and test_configure_platform_enables_disabled_plugin_first — the no-longer-existent flow they were testing. 
* tests/hermes_cli/test_setup_openclaw_migration.py: patch both setup.get_env_value and gateway.get_env_value in the 4 gateway-section tests that reach _platform_status() through the unified setup flow; switch WHATSAPP_ENABLED to the literal "true" in the registry-parity test so WhatsApp's value-shape validator matches. Verified via fresh-install smoke (empty plugins.enabled, no env vars): IRC plugin loads, Platform('irc') resolves, _all_platforms() lists IRC with status 'not configured'. 160 targeted tests pass.	2026-04-29 21:56:51 -07:00
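The auto-load rule for the new `kind: platform` can be stated as a small predicate. This is a sketch only: `should_load_plugin` is a hypothetical name, and treating bundled plugins of other kinds as opt-in is an assumption the commit text does not spell out.

```python
AUTO_LOAD_KINDS = {"backend", "platform"}


def should_load_plugin(name: str, kind: str, bundled: bool, enabled: set) -> bool:
    """Decide whether a discovered plugin is loaded at startup."""
    if bundled and kind in AUTO_LOAD_KINDS:
        # Shipped in-repo backends and platforms load unconditionally.
        return True
    # Everything else, including user-installed platform plugins, must be
    # listed in plugins.enabled so untrusted code does not run silently.
    return name in enabled
```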
Teknium	71c8ca17dc	chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 Removes drive-by duplication that accumulated during the contributor branch's multiple rebases. All runtime-benign (dict last-wins, redefinition last-wins) but left dead source that would confuse reviewers and maintainers. Surgical in-place de-duplication (kept PR's intentional additions, removed only the doubled copy): * hermes_cli/auth.py: duplicate "gmi" + "azure-foundry" ProviderConfig * hermes_cli/models.py: duplicate "gmi" entry in _PROVIDER_MODELS * hermes_cli/config.py: duplicate NOTION/LINEAR/AIRTABLE/TENOR skill env block + duplicate get_custom_provider_context_length definition * hermes_cli/gateway.py: duplicate _setup_yuanbao * gateway/platforms/base.py: duplicate is_host_excluded_by_no_proxy * gateway/platforms/telegram.py: duplicate delete_message * gateway/stream_consumer.py: duplicate _should_send_fresh_final and _try_fresh_final * gateway/run.py: duplicate _parse_reasoning_command_args / _resolve_session_reasoning_config / _set_session_reasoning_override, duplicate "Drain silently when interrupted" interrupt check * run_agent.py: duplicate HERMES_AGENT_HELP_GUIDANCE append, duplicate codex_message_items capture, duplicate custom_providers resolution * tools/approval.py: duplicate HARDLINE_PATTERNS section and duplicate hardline call in check_dangerous_command * tools/mcp_tool.py: duplicate _orphan_stdio_pids module-level decl * cron/scheduler.py: duplicate "not configured/enabled" check — kept the new early-rejection, removed the stale late-path copy Full-file resets to origin/main (all PR additions were duplicates of content already on main): * ui-tui/packages/hermes-ink/index.d.ts * ui-tui/packages/hermes-ink/src/entry-exports.ts * ui-tui/packages/hermes-ink/src/ink/selection.ts * ui-tui/src/app/interfaces.ts * ui-tui/src/app/slash/commands/core.ts * ui-tui/src/components/thinking.tsx * ui-tui/src/lib/memoryMonitor.ts * ui-tui/src/types.ts * 
ui-tui/src/types/hermes-ink.d.ts * tests/hermes_cli/test_doctor.py * tests/hermes_cli/test_api_key_providers.py * tests/hermes_cli/test_model_validation.py * tests/plugins/memory/test_hindsight_provider.py * tests/run_agent/test_run_agent.py * tests/gateway/test_email.py * tests/tools/test_dockerfile_pid1_reaping.py * hermes_cli/commands.py (slack_native_slashes block — full duplicate)	2026-04-29 21:56:51 -07:00
Ari Lotter	868bc1c242	feat(irc): add interactive setup feat(gateway): refine Platform._missing_ and platform-connected dispatch Restricts plugin-name acceptance to bundled plugin scan + registry (no arbitrary string -> enum-pollution), pulls per-platform connectivity checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean _is_platform_connected method, and adds tests covering the checker map, plugin platform interface, and IRC setup wizard.	2026-04-29 21:56:51 -07:00
Ari Lotter	1f1608067c	feat(gateway): unify setup flows, load platforms dynamically from registry Merge the two gateway setup paths (hermes setup gateway + hermes gateway setup) to use a single _unified_platforms() list that merges built-in _PLATFORMS with dynamically registered plugin entries from platform_registry. - Add setup_fn field to PlatformEntry for plugin setup flows - _unified_platforms() merges built-ins with registry entries by key - setup_gateway() now uses unified list instead of hardcoded _GATEWAY_PLATFORMS tuple list - gateway_setup() uses same unified list, plugin entries appear alongside built-ins with no [plugin] suffix - _platform_status() handles plugin platforms via registry check_fn - Plugin platforms with setup_fn get called directly; plugins without get a generic env-var display fallback IRC and other plugin platforms now appear automatically in the setup menu when registered via platform_registry.register(). feat(gateway): surface disabled platform plugins in setup and auto-enable on select Platform plugins under plugins/platforms/* (IRC, etc.) were gated behind plugins.enabled, so `hermes gateway setup` wouldn't list them until the user ran `hermes plugins enable <name>` first. Now the setup menu always surfaces them as "plugin disabled — select to enable", and picking one adds it to plugins.enabled before running its setup flow. Along the way, unify the two gateway setup flows so `hermes setup gateway` and `hermes gateway setup` both read from the same platform list (built-in _PLATFORMS + platform_registry entries), dispatch through a single _configure_platform() helper, and share _platform_status(). Deletes the dead bespoke wrappers in setup.py (_setup_whatsapp, _setup_weixin, _setup_email, etc.) that duplicated logic now covered by the registry path or _setup_standard_platform. Also: - PlatformEntry gains a plugin_name field so the registry knows which plugin owns each entry (required for auto-enable). 
- PluginContext.register_platform auto-stamps plugin_name from the manifest so plugins don't have to pass it explicitly. - PluginManager now scans plugins/platforms/* as its own category root, one level below the bundled plugin scan. - Fix IRC plugin discovery: rename PLUGIN.yaml → plugin.yaml (the scanner is case-sensitive) and add the missing __init__.py that _load_directory_module requires.	2026-04-29 21:56:51 -07:00
Teknium	2e20f6ae2d	feat: complete plugin platform parity — all 12 integration points Extends the platform plugin interface from Phase 1 to cover every touchpoint where built-in platforms have hardcoded behavior. - allowed_users_env / allow_all_env: per-platform auth env vars - max_message_length: smart-chunking for send_message tool - pii_safe: session PII redaction flag - emoji: CLI/gateway display - allow_update_command: /update access control send_message tool (tools/send_message_tool.py): - Replaced hardcoded platform_map dict with Platform() call - Added _send_via_adapter() for plugin platforms — routes through live gateway adapter when available - Registry-aware max message length for smart chunking Cron delivery (cron/scheduler.py): - Replaced hardcoded 15-entry platform_map with Platform() call - Plugin platforms now work as cron delivery targets User authorization (gateway/run.py _is_user_authorized): - Registry fallback: checks PlatformEntry.allowed_users_env and allow_all_env when platform not in hardcoded maps - Plugin platforms get per-platform auth support _UPDATE_ALLOWED_PLATFORMS: checks registry allow_update_command flag Channel directory: includes plugin platforms in session enumeration Orphaned config warning: descriptive message when plugin platform is in config but no plugin registered it Gateway weakref: _gateway_runner_ref for cross-module adapter access hermes status: shows plugin platforms with (plugin) tag hermes gateway setup: plugin platforms appear in menu with setup hints hermes_cli/platforms.py: get_all_platforms() merges with registry, platform_label() falls back to registry for plugin names - 8 new tests (extended fields, cron resolution, platforms merge) - Updated 3 tests for new Platform() based resolution - 2829 passed, 24 pre-existing failures, zero new failures	2026-04-29 21:56:51 -07:00
Teknium	8f144fe36b	feat: pluggable platform adapter registry + IRC reference implementation Adds a platform adapter plugin interface so anyone can create new gateway platforms (IRC, Viber, Line, etc.) as drop-in plugins without modifying core gateway code. - PlatformEntry dataclass: name, label, adapter_factory, check_fn, validate_config, required_env, install_hint, source - PlatformRegistry singleton with register/unregister/create_adapter - _create_adapter() in gateway/run.py checks registry first, falls through to existing if/elif chain for built-in platforms - Platform._missing_() accepts unknown string values, creating cached pseudo-members so Platform('irc') is Platform('irc') holds true - GatewayConfig.from_dict() now parses plugin platform names from config.yaml without rejecting them - get_connected_platforms() delegates to registry for unknown platforms - PluginContext.register_platform() for plugin authors - Mirrors the existing register_tool() / register_hook() pattern - Full async IRC adapter using stdlib asyncio (zero external deps) - Connects via TLS, handles PING/PONG, nick collision, NickServ auth - Channel messages require addressing (nick: msg), DMs always dispatch - Markdown stripping for IRC-clean output, message splitting for 512-byte line limit - Config via config.yaml extra dict or IRC_* env vars - Platform enum dynamic members (identity stability, case normalization) - PlatformRegistry (register, unregister, create, validation, factory) - GatewayConfig integration (from_dict parsing, get_connected_platforms) - IRC adapter (init, send, protocol parsing, markdown, requirements) No existing platform adapters were migrated — the if/elif chain is untouched. This is Phase 1: prove the interface with a real plugin.	2026-04-29 21:56:51 -07:00
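The identity guarantee mentioned here (`Platform('irc') is Platform('irc')`) can be obtained by caching pseudo-members from `Enum._missing_`. The sketch below is one way to do it and is not taken from the Hermes source; it pokes at the private `_value2member_map_`, `_name_`, and `_value_` attributes of the standard `enum` module, which is an assumption about CPython internals.

```python
from enum import Enum


class Platform(Enum):
    TELEGRAM = "telegram"
    DISCORD = "discord"

    @classmethod
    def _missing_(cls, value):
        if not isinstance(value, str) or not value.strip():
            return None  # let Enum raise its usual ValueError
        key = value.strip().lower()
        # Case normalization: "Telegram" resolves to the built-in member.
        for member in cls:
            if member.value == key:
                return member
        # Reuse a pseudo-member created by an earlier lookup.
        cached = cls._value2member_map_.get(key)
        if cached is not None:
            return cached
        # Create and cache a pseudo-member for an unknown plugin platform.
        member = object.__new__(cls)
        member._name_ = key.upper()
        member._value_ = key
        cls._value2member_map_[key] = member
        return member
```

Pseudo-members created this way do not appear when iterating `Platform`, so code that enumerates built-in platforms is unaffected.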
Teknium	4d7fc0f37c	feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation Reloading MCP servers rebuilds the tool set for the active session, which invalidates the provider prompt cache (tool schemas are baked into the system prompt). The next message re-sends full input tokens — can be expensive on long-context or high-reasoning models. To surface that cost, /reload-mcp now routes through a new slash-confirm primitive with three options: Approve Once / Always Approve / Cancel. 'Always Approve' persists approvals.mcp_reload_confirm: false so future reloads run silently. Coverage: * Classic CLI (cli.py) — interactive numbered prompt. * TUI (tui_gateway + Ink ops.ts) — text warning on first call; `now` / `always` args skip the gate; `always` also persists the opt-out. * Messenger gateway — button UI on Telegram (inline keyboard), Discord (discord.ui.View), Slack (Block Kit actions); text fallback on every other platform via /approve /always /cancel replies intercepted in gateway/run.py _handle_message. * Config key: approvals.mcp_reload_confirm (default true). * Auto-reload paths (CLI file watcher, TUI config-sync mtime poll) pass confirm=true so they do NOT prompt. Implementation: * tools/slash_confirm.py — module-level pending-state store used by all adapters and by the CLI prompt. Thread-safe register/resolve/clear. * gateway/platforms/base.py — send_slash_confirm hook (default 'Not supported' → text fallback). * gateway/run.py — _request_slash_confirm helper + text intercept in _handle_message (yields to in-progress tool-exec approvals so dangerous-command /approve still unblocks the tool thread first). Tests: * tests/tools/test_slash_confirm.py — primitive lifecycle + async resolution + double-click atomicity (16 tests). * tests/hermes_cli/test_mcp_reload_confirm_gate.py — default-config shape + deep-merge preserves user opt-out (5 tests). 
Targeted runs (hermetic): 89 passed (slash-confirm, config gate, existing agent cache, existing telegram approval buttons).	2026-04-29 21:56:47 -07:00
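A thread-safe pending-state store with register / resolve / clear, where a second click on the same prompt is a no-op, might be sketched as follows. The commit describes a module-level store in tools/slash_confirm.py; the `PendingConfirms` class here is a hypothetical simplification.

```python
import threading


class PendingConfirms:
    """Hold one pending confirmation callback per session key."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}

    def register(self, key, on_choice):
        with self._lock:
            self._pending[key] = on_choice

    def resolve(self, key, choice) -> bool:
        # Pop under the lock so only the first of two racing clicks wins.
        with self._lock:
            on_choice = self._pending.pop(key, None)
        if on_choice is None:
            return False
        on_choice(choice)  # run the callback outside the lock
        return True

    def clear(self, key):
        with self._lock:
            self._pending.pop(key, None)
```

Running the callback after releasing the lock keeps a slow handler from blocking other sessions' confirmations.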
helix4u	7fae87bc00	fix(gateway): refresh cached agents after MCP tool changes	2026-04-29 21:56:47 -07:00
simbam99	ffa65291d1	fix(cron): clear auto-delivery thread context between jobs	2026-04-29 21:08:59 -07:00
teknium1	763aadd6bf	fix(telegram): preserve pre-#17686 chat-ID-in-_USERS configs + doc split PR #15027 (5 days ago) shipped TELEGRAM_GROUP_ALLOWED_USERS as a chat-ID allowlist. #17686 correctly renames that to sender user IDs and moves chat IDs to TELEGRAM_GROUP_ALLOWED_CHATS. Without a shim, any user on PR #15027's guidance would silently start rejecting group traffic on upgrade. - gateway/run.py: in _is_user_authorized, if TELEGRAM_GROUP_ALLOWED_USERS contains values starting with '-' (chat-ID-shaped), honor them as chat IDs and log a one-shot deprecation warning pointing users at the new TELEGRAM_GROUP_ALLOWED_CHATS var. - tests/gateway/test_unauthorized_dm_behavior.py: three new tests cover legacy chat-ID values authorizing the listed chat, not crossing to other chats, and mixed sender/chat values in the same var. - website/docs/user-guide/messaging/telegram.md: rewrite the Group Allowlisting section to document the new user/chat split + migration note. Remove stale '/thread_id' suffix claim (code never parsed it). - website/docs/reference/environment-variables.md: document all three Telegram allowlist env vars.	2026-04-29 21:07:55 -07:00
Anders Bell	1f712173b2	fix(telegram): support group user allowlist	2026-04-29 21:07:55 -07:00
teknium1	dd2d1ba5e6	refactor(reload-skills): queue note for next turn, drop cache invalidation + agent tool Salvage-follow-up to @shannonsands's /reload-skills PR. Trims the feature to match the design: user-initiated rescan, no prompt-cache reset, no new schema surface, no phantom user turn, and the next-turn note carries each added/removed skill's 60-char description (not just its name). Changes vs the original PR: * Drop the in-process skills prompt-cache clear in reload_skills(). Skills are invoked at runtime via /skill-name, skills_list, or skill_view — they don't need to live in the system prompt for the model to use them. Keeping the cache intact preserves prefix caching across the reload so /reload-skills pays no cache-reset cost. (MCP has to break the cache because tool schemas must be known at conversation start; skills do not.) * Drop the skills_reload agent tool and SKILLS_RELOAD_SCHEMA from tools/skills_tool.py, plus the four skills_reload enumerations in toolsets.py. No new schema surface — agents can already see a freshly- installed skill via skill_view / skills_list the moment it's on disk. * Replace the phantom 'role: user' turn injection with a one-shot queued note. CLI uses self._pending_skills_reload_note (same pattern as _pending_model_switch_note, prepended to the next API call and cleared). Gateway uses self._pending_skills_reload_notes[session_key]. The note is prepended to the NEXT real user message in this session, so message alternation stays intact and nothing out-of-band is persisted to the transcript. * reload_skills() now returns added/removed as [{'name': str, 'description': str}, ...] (description truncated to 60 chars — matches the curator / gateway adapter budget). The injected next-turn note formats each entry as 'name — description' so the model can actually reason about which new skills to call without running skills_list first. * Only emit the note when the diff is non-empty. On empty diff, print 'No new skills detected' and do nothing else. * Tests rewritten to cover the queue semantics, the description payload, and a regression guard that the prompt-cache snapshot is preserved.	2026-04-29 21:07:47 -07:00
Shannon Sands	7966560fb5	feat(skills): /reload-skills slash command + skills_reload agent tool Adds a public reload path for the in-process skill caches so newly installed (or removed) skills become visible mid-session without a gateway restart. Mirrors the shape of /reload-mcp. Three surfaces: * /reload-skills slash command — CLI (cli.py) and gateway (gateway/run.py), with /reload_skills alias for Telegram autocomplete and an explicit Discord registration. * skills_reload agent tool (tools/skills_tool.py) — lets agents/subagents pick up freshly-installed skills via tool call. * agent.skill_commands.reload_skills() — shared helper that clears _skill_commands, _SKILLS_PROMPT_CACHE (in-process LRU), and the on-disk .skills_prompt_snapshot.json, then returns an added/removed diff plus the new total count. Tested: * tests/agent/test_skill_commands_reload.py (9 cases) * tests/cli/test_cli_reload_skills.py (3 cases) * tests/gateway/test_reload_skills_command.py (4 cases) Use case: NemoClaw / OpenShell-style sandboxed orchestrators that drop skills into ~/.hermes/skills mid-session, plus agentic flows where the agent itself installs a skill via the shell tool and needs it bound without a gateway restart. The Python helper clear_skills_system_prompt_cache(clear_snapshot=True) already exists internally — this PR just exposes it via slash command and tool.	2026-04-29 21:07:47 -07:00
ethernet	7d48a16f14	remove relaunch_chat not needed	2026-04-29 20:33:29 -07:00
ethernet	3c673468b4	refactor(cli): derive relaunch flag table from argparse introspection Pull the top-level + chat parser construction out of main() into hermes_cli/_parser.py so relaunch.py can introspect parser._actions to discover which flags exist and whether they take values, instead of maintaining a parallel hand-rolled (flag, takes_value) tuple list. - _parser.py: build_top_level_parser() returns (parser, subparsers, chat_parser); side-effect-free import. - main.py: ~290 lines of inline parser construction collapsed to a helper call. Other subparsers stay inline (dispatch is bound to module-level cmd_* functions). - _parser._inherited_flag(parser, ...): wraps parser.add_argument and sets action.inherit_on_relaunch = True. Used in place of parser.add_argument for the 25 flags (top-level + chat) that need to carry over. - _parser.PRE_ARGPARSE_INHERITED_FLAGS: holds --profile/-p, which isn't on argparse (consumed earlier by main._apply_profile_override). - relaunch.py: drops _CRITICAL_DESTS and _PRE_ARGPARSE_FLAGS; the table builder now filters by getattr(action, 'inherit_on_relaunch', False). - test_ignore_user_config_flags.py: brittle inspect.getsource grep replaced with proper parser introspection. - test_relaunch.py: introspection sanity tests added. Salvaged from PR #17549; added top-level -t/--toolsets flag to _parser.py so #17623 (fix(tui): honor launch toolsets) behavior is preserved on current main. Co-authored-by: ethernet <arilotter@gmail.com>	2026-04-29 20:33:29 -07:00

2917 commits