hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-03 02:11:48 +00:00

Author	SHA1	Message	Date
Teknium	27ec74c68a	fix: coerce show_reasoning and guard_agent_created config bools Widens #16528 to two sibling sites that had the same quoted-boolean bug: a YAML string "false" (or "0", "no", "off") silently evaluated truthy under bool() / if-check. - gateway/run.py _load_show_reasoning: is_truthy_value wrap - tools/skill_manager_tool.py _guard_agent_created_enabled: is_truthy_value wrap - regression tests for both	2026-04-30 20:40:46 -07:00
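The quoted-boolean pitfall described in 27ec74c68a can be illustrated with a minimal sketch. The `is_truthy_value` below is a hypothetical stand-in written for this example, not the project's helper; the accepted spellings are assumptions based on the values named in the message ("false", "0", "no", "off").

```python
def is_truthy_value(value, default=False):
    """Hypothetical coercion helper; the real hermes-agent helper may differ."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        text = value.strip().lower()
        if text in {"true", "1", "yes", "on"}:
            return True
        if text in {"false", "0", "no", "off", ""}:
            return False
        return default  # unrecognized spelling: fall back to the default
    return bool(value)


# bool() treats any non-empty string as truthy, which is the bug:
print(bool("false"))             # True
print(is_truthy_value("false"))  # False
```

A YAML value written as `show_reasoning: "false"` arrives as a string, so only the coercing path yields the intended result.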
johnncenae	bb706c3f38	fix(gateway): coerce tool_progress_command as a real boolean	2026-04-30 20:40:46 -07:00
simbam99	7ba1a2b3df	fix(gateway): preserve assistant metadata when branching sessions	2026-04-30 20:40:28 -07:00
Roy-oss1	b94cb8e2c4	feat(feishu): operator-configurable bot admission and mention policy Add two operator-facing toggles for inbound Feishu admission, enabling bot-to-bot scenarios such as A2A orchestration and inter-bot notifications: FEISHU_ALLOW_BOTS=none|mentions|all (default: none) Accept messages from other bots. `mentions` requires the peer bot to @-mention Hermes; `all` admits every peer-bot message. FEISHU_REQUIRE_MENTION=true|false (default: true) Whether group messages must @-mention the bot. Override per-chat via `group_rules.<chat_id>.require_mention` in config.yaml. Defaults preserve prior behavior. Self-echo protection is always on: when the bot's identity is unresolved (auto-detection failed and FEISHU_BOT_OPEN_ID unset), peer-bot messages are rejected fail-closed to avoid feedback loops. Admitted peer bots bypass the human-user allowlist (FEISHU_ALLOWED_USERS) to match existing Discord behavior; humans still need an explicit allowlist entry. yaml feishu.allow_bots is bridged to the env var so the adapter and gateway auth layer share one source of truth. Resolving peer-bot display names requires the application:bot.basic_info:read scope; without it, peers still route but appear as their open_id. Test: tests/gateway/test_feishu_bot_admission.py covers the admission pipeline, group-policy bot-bypass, hydration, and event-dispatch plumbing as a parametrized matrix. Change-Id: I363cccb578c2a5c8b8bf0f0a890c01c89909e256	2026-04-30 20:30:31 -07:00
buray	fa9fd26acb	fix(gateway): re-inject topic-bound skill after /new or /reset reset_session() creates a fresh SessionEntry with created_at == updated_at, but get_or_create_session() bumps updated_at on the next inbound message, causing _is_new_session in _handle_message_with_agent to evaluate False. The topic/channel skill auto-load gate (group_topics, channel_skill_bindings) silently skips the first message after a manual reset. Add an is_fresh_reset flag on SessionEntry, set by reset_session() and consumed once by the message handler. Kept distinct from was_auto_reset because that flag also drives a 'session expired due to inactivity' user-facing notice and a context-note prepend — both wrong for an explicit /new or /reset. Persisted through to_dict/from_dict so the flag survives gateway restart between /reset and the next message. Fixes #6508 Co-authored-by: warabe1122 <45554392+warabe1122@users.noreply.github.com> Co-authored-by: willy-scr <187001140+willy-scr@users.noreply.github.com>	2026-04-30 20:29:19 -07:00
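The consume-once flag from fa9fd26acb can be sketched roughly as follows. This is a simplified illustration only: the `SessionEntry` here carries just two fields, and `consume_fresh_reset` is a hypothetical helper name, not something taken from the repository.

```python
from dataclasses import dataclass, asdict


@dataclass
class SessionEntry:
    """Simplified stand-in for the gateway's session record."""
    session_key: str
    is_fresh_reset: bool = False

    def to_dict(self):
        return asdict(self)

    @classmethod
    def from_dict(cls, data):
        # Older persisted entries lack the key, so default to False.
        return cls(
            session_key=data["session_key"],
            is_fresh_reset=bool(data.get("is_fresh_reset", False)),
        )


def consume_fresh_reset(entry):
    """Return the flag and clear it, so it applies to one message only."""
    was_fresh = entry.is_fresh_reset
    entry.is_fresh_reset = False
    return was_fresh


entry = SessionEntry("telegram:123", is_fresh_reset=True)
restored = SessionEntry.from_dict(entry.to_dict())  # simulated restart
print(consume_fresh_reset(restored))  # True
print(consume_fresh_reset(restored))  # False
```

Round-tripping through `to_dict`/`from_dict` is what lets the flag survive a restart between the reset and the next message.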
Jezza Hehn	7abc9ce4df	fix(gateway): read /status token totals from SessionDB (#17158) /status was reading session_entry.total_tokens from the in-memory SessionStore (gateway/session.py), which the agent never writes to — so the token count was always 0. The agent already persists token deltas to the SQLite SessionDB (run_agent.py:11497) for every platform with a session_id. Route /status through that single source of truth instead of duplicating token writes into a second store. Fix: - gateway/run.py: _handle_status_command now calls self._session_db.get_session(session_id) and sums the five token component columns (input/output/cache_read/cache_write/reasoning). Falls back to 0 when no SessionDB is configured or no row exists. - Two new regression tests covering the populated-row and missing-row paths. Co-authored-by: Hermes <127238744+teknium1@users.noreply.github.com>	2026-04-30 20:28:50 -07:00
Teknium	a178081468	fix(gateway): use _session_key_for_source for native image buffer write Minor follow-up to the native-image-buffer isolation fix. The write site in _prepare_inbound_message_text was calling build_session_key directly, while every other call site in gateway/run.py uses the _session_key_for_source helper — which consults session_store._generate_session_key first and falls back to build_session_key. Keeping the write key and consume key on the same helper prevents key drift if the session store ever overrides the default keying behavior.	2026-04-30 20:26:35 -07:00
Yukipukii1	bdb7edd89e	fix(gateway): isolate pending native image paths by session	2026-04-30 20:26:35 -07:00
Teknium	9a75743496	fix(gateway): apply agent.disabled_toolsets in gateway message loop Widens the cherry-picked fix from @jatingodnani (#17343) to the gateway path. On main, user_config.agent.disabled_toolsets was only honored by _get_platform_tools' name-level subtraction — it did not catch tools pulled in implicitly by a composite toolset (browser includes web_search, hermes-* platforms include most tools). Changes: - gateway/run.py: resolve disabled_toolsets alongside enabled_toolsets and pass to AIAgent at both user-facing construction sites (normal message loop + single-turn cron-like path). Hygiene/compression agents (fixed enabled_toolsets=[memory]) are intentionally untouched. - gateway/run.py: add (agent, disabled_toolsets) to _CACHE_BUSTING_CONFIG_KEYS so editing the list in config.yaml invalidates the cached AIAgent on the next message. - cli.py: drop unused 'import platform' left over from PR #17343's import churn; restore 'import sys' used throughout the file. - model_tools.py: drop unused 'import os, sys' added by PR #17343; fix comment reference from #15291 (unrelated OAuth issue) to #17309. Co-authored-by: jatin godnani <godnanijatin@gmail.com>	2026-04-30 20:24:39 -07:00
Teknium	01cc701e54	docs + nit: busy_ack_enabled follow-ups - Move the disabled-ack guard above the debounce so we don't stamp _busy_ack_ts[session_key] when no ack was actually sent. Harmless (never read when disabled) but cosmetically off. - Document display.busy_ack_enabled in user-guide/messaging/index.md and HERMES_GATEWAY_BUSY_ACK_ENABLED in reference/environment-variables.md. - Add JezzaHehn to scripts/release.py AUTHOR_MAP for contributor credit. Follow-up to #17491 (Jezza Hehn).	2026-04-30 20:22:30 -07:00
Jezza Hehn	2b512cbca4	feat(gateway): add busy_ack_enabled config option to suppress ack messages When a user sends a message while the gateway is busy processing, an acknowledgment message is sent. This can be spammy for users who send rapid messages. Add display.busy_ack_enabled config option (default: true) to allow users to suppress these busy-input acknowledgment messages. Fixes #17457	2026-04-30 20:22:30 -07:00
Yukipukii1	25cbe3e1d6	fix(gateway): preserve thread routing for /update progress and prompts	2026-04-30 20:19:23 -07:00
johnncenae	1ef9e88549	fix(gateway): write restart markers atomically and fix Windows lock collisions	2026-04-30 19:58:16 -07:00
Teknium	c868425467	feat(kanban): durable multi-profile collaboration board (#17805 ) Salvage of PR #16100 onto current main (after emozilla's #17514 fix that unblocks plugin Pydantic body validation). History preserved on the standing `feat/kanban-standing` branch; this squashes the 22 iterative commits into one clean landing. What this lands: - SQLite kernel (hermes_cli/kanban_db.py) — durable task board with tasks, task_links, task_runs, task_comments, task_events, kanban_notify_subs tables. WAL mode, atomic claim via CAS, tenant-namespaced, skills JSON array per task, max-runtime timeouts, worker heartbeats, idempotency keys, circuit breaker on repeated spawn failures, crash detection via /proc/<pid>/status, run history preserved across attempts. - Dispatcher — runs inside the gateway by default (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims stale claims, promotes ready tasks, spawns `hermes -p <assignee> chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK + HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker` plus any per-task skills. Health telemetry warns on stuck ready queue. - Structured tool surface (tools/kanban_tools.py) — 7 tools (kanban_show, kanban_complete, kanban_block, kanban_heartbeat, kanban_comment, kanban_create, kanban_link). Gated on HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal sessions. - System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE) injected only when kanban tools are active. - Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board UI: triage/todo/ready/running/blocked/done columns, drag-drop, inline create, task drawer with markdown, comments, run history, dependency editor, bulk ops, lanes-by-profile grouping, WS-driven live refresh. Matches active dashboard theme via CSS variables. 
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|claim|comment|complete|block|unblock|archive|tail|dispatch|context|init|gc|watch|stats|notify|log|heartbeat|runs|assignees` + `/kanban` slash in-session. - Worker + orchestrator skills (skills/devops/kanban-worker + kanban-orchestrator) — pattern library for good summary/metadata shapes, retry diagnostics, block-reason examples, fan-out patterns. - Per-task force-loaded skills — `--skill <name>` (repeatable), stored as JSON, threaded through to dispatcher argv as one `--skills X` pair per skill alongside the built-in kanban-worker. Dashboard + CLI + tool parity. - Deprecation of standalone `hermes kanban daemon` — stub exits 2 with migration guidance; `--force` escape hatch for headless hosts. - Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md) with 11 dashboard screenshots walking through four user stories (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker). - Tests (251 passing): kernel schema + migration + CAS atomicity, dispatcher logic, circuit breaker, crash detection, max-runtime timeouts, claim lifecycle, tenant isolation, idempotency keys, per-task skills round-trip + validation + dispatcher argv, tool surface (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk + links + warnings), gateway-embedded dispatcher (config gate, env override, graceful shutdown), CLI deprecation stub, migration from legacy schemas. Gateway integration: - GatewayRunner._kanban_dispatcher_watcher — new asyncio background task, symmetric with _kanban_notifier_watcher. Runs dispatch_once via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0 env override for debugging. - Config: new `kanban` section in DEFAULT_CONFIG with `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`. Additive — no _config_version bump needed.
Forward-compat: - workflow_template_id / current_step_key columns on tasks (v1 writes NULL; v2 will use them for routing). - task_runs holds claim machinery (claim_lock, claim_expires, worker_pid, last_heartbeat_at) so multi-attempt history is first-class from day one. Closes #16102. Co-authored-by: emozilla <emozilla@nousresearch.com>	2026-04-30 13:36:47 -07:00
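The "atomic claim via CAS" idea mentioned for the kanban kernel in c868425467 can be sketched with a conditional UPDATE. The table layout, the `claimed_by` column, and the `try_claim` name below are assumptions made for this illustration and are not the schema in hermes_cli/kanban_db.py.

```python
import sqlite3


def try_claim(conn, task_id, worker):
    """Compare-and-swap: the row changes only if it is still 'ready'."""
    cur = conn.execute(
        "UPDATE tasks SET status = 'running', claimed_by = ? "
        "WHERE id = ? AND status = 'ready'",
        (worker, task_id),
    )
    conn.commit()
    # rowcount is 1 for the winner and 0 for every later claimant.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT, claimed_by TEXT)"
)
conn.execute("INSERT INTO tasks (id, status) VALUES (1, 'ready')")
conn.commit()

first = try_claim(conn, 1, "worker-a")
second = try_claim(conn, 1, "worker-b")
print(first, second)  # True False
```

Because the status check and the write happen in one statement, two workers racing for the same task cannot both see a successful claim.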
Leone Parise	eda1d516dc	fix(skills): exclude .archive from skill index walk Archived skills (moved to ~/.hermes/skills/.archive/ by the curator) were still surfaced in the <available_skills> system prompt under a fake '.archive' category, causing the agent to load and try to use deprecated skills. The os.walk in iter_skill_index_files() only excluded .git/.github/.hub. Add '.archive' to EXCLUDED_SKILL_DIRS, and to the two other places that hardcode the same exclusion tuple (gateway/run.py and agent/skill_commands.py).	2026-04-30 04:59:22 -07:00
konsisumer	d1d0ef6dbd	fix(gateway): persist user message on transient agent failures (#7100) The #1630 fix introduced a blanket ``agent_failed_early`` transcript skip to prevent context-overflow sessions from looping. That guard also triggers for unrelated transient failures (429 rate limits, read timeouts, connection resets, provider 5xx) which have nothing to do with session size — and it silently drops the user's message, so the agent has no memory of the last turn on retry. Split the failure classification in ``GatewayRunner._run_agent``: * Context-overflow (``compression_exhausted`` flag, explicit context-length phrases, or generic 400 with a long history) → keep the existing skip, preserving the #1630/#9893 fix. * Anything else that failed → persist just the user message so the conversation survives a retry. Use specific multi-word phrases (``context length``, ``token limit``, ``prompt is too long``, etc.) to match ``run_agent.py``'s own classifier; bare ``exceed`` false-positively flagged "rate limit exceeded" as context overflow. Covered by new tests in ``tests/gateway/test_7100_transient_failure_transcript.py`` and the existing #1630 suite still passes.	2026-04-30 04:32:33 -07:00
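The failure split in d1d0ef6dbd can be sketched as a phrase-based classifier. The function names are hypothetical, the phrase tuple contains only the three examples quoted in the message, and the "generic 400 with a long history" branch is deliberately left out of this sketch.

```python
# Only the phrases quoted in the commit message; the real list is longer.
CONTEXT_OVERFLOW_PHRASES = ("context length", "token limit", "prompt is too long")


def is_context_overflow(error_text, compression_exhausted=False):
    """Classify a failure as context overflow using multi-word phrases."""
    if compression_exhausted:
        return True
    lowered = error_text.lower()
    return any(phrase in lowered for phrase in CONTEXT_OVERFLOW_PHRASES)


def should_persist_user_message(error_text, compression_exhausted=False):
    """Transient failures keep the user's message; overflow keeps the skip."""
    return not is_context_overflow(error_text, compression_exhausted)


print(should_persist_user_message("429: rate limit exceeded"))           # True
print(should_persist_user_message("maximum context length exceeded"))    # False
```

Matching on a bare `exceed` would classify both messages the same way, which is the false positive the commit describes.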
Rob Moen	0dd373ec43	fix(context): honor model.context_length for Ollama num_ctx and all display paths When a user sets model.context_length in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config — causing OOM on smaller GPUs like the P100 (16GB). Root cause: two separate context values existed independently: - context_compressor.context_length = config value (e.g. 65536) ✓ - _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config Changes: 1. Cap Ollama num_ctx to config context_length (run_agent.py) When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix — it prevents Ollama from allocating more VRAM than the user budgeted. 2. Pass config_context_length through all secondary call sites Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback: - cli.py: @-reference expansion and /model switch display - gateway/run.py: @-reference expansion and /model switch display - tui_gateway/server.py: @-reference expansion - hermes_cli/model_switch.py: resolve_display_context_length() 3. Normalize root-level context_length in config (hermes_cli/config.py) _normalize_root_model_keys() now migrates root-level context_length into the model section, matching existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` had it silently ignored. 4. Fix misleading comments (agent/model_metadata.py) DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated. Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests).	2026-04-30 04:31:23 -07:00
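The capping rule from change 1 of 0dd373ec43 amounts to a small precedence check. The sketch below is an illustration under assumed names (`resolve_num_ctx` and its parameters are hypothetical), not the code in run_agent.py.

```python
def resolve_num_ctx(detected, config_context_length=None, explicit_num_ctx=None):
    """Pick the num_ctx value to send to Ollama.

    detected: value auto-detected from GGUF metadata
    config_context_length: model.context_length from config, if set
    explicit_num_ctx: an explicit ollama_num_ctx override, if set
    """
    if explicit_num_ctx is not None:
        return explicit_num_ctx  # an explicit override always wins
    if config_context_length is not None:
        # Cap, never raise: a smaller detected value is left alone.
        return min(detected, config_context_length)
    return detected


print(resolve_num_ctx(256000, config_context_length=65536))  # 65536
print(resolve_num_ctx(256000))                                # 256000
```

With the figures from the message, a 256000-token GGUF value is cut to the configured 65536, so the allocation follows the user's budget.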
Bartok9	fbb3775770	fix(gateway): enforce auth check in busy-session path to prevent unauthorized injection (#17775) The busy-session handler (_handle_active_session_busy_message) bypassed the authorization gate that the cold path enforces via _is_user_authorized(). In shared-thread contexts (Slack threads, Telegram forum topics, Discord threads) where thread_sessions_per_user=False (the default), all participants share one session_key. An unauthorized user posting in the same thread as an authorized user would hit the active-session branch, skip the auth check, and have their text merged into _pending_messages or injected via agent.interrupt(). This commit adds the same _is_user_authorized() check at the top of the busy handler, before any message queuing, steering, or interrupt logic. Unauthorized messages are silently dropped (return True) with a warning log — matching the cold-path behavior. Affected platforms: Slack, Telegram, Discord, any adapter with shared-session thread contexts. Closes #17775	2026-04-30 04:29:15 -07:00
Maxence Groine	04ea895ffb	feat(gateway/signal): add support for multiple images sending Adds a new `send_multiple_images` method to the ``BasePlatformAdapter`` that implements the default "One image per message" loop and allows for platform-specific overriding. Implements such an override for the Signal adapter, batching images and trying (best-effort) to work around rate-limits for voluminous batches using a specific scheduler. Also implements batching + rate-limit handling in the `send_message` tool. New tests added for the Signal adapter, its rate-limit scheduler and the `send_message` tool	2026-04-30 04:28:08 -07:00
Teknium	411f586c67	refactor(gateway): extract _float_env helper for env-var float casts Follow-up to the try/except guards added in the previous commit. Four sibling call sites all read HERMES_AGENT_TIMEOUT / HERMES_AGENT_TIMEOUT_WARNING / HERMES_AGENT_NOTIFY_INTERVAL via the same read-env-or-fallback pattern, so factor it into _float_env(name, default) alongside the existing _auto_continue_freshness_window() helper.	2026-04-30 03:32:37 -07:00
vominh1919	ca87c822ed	fix(gateway): guard yaml.safe_load and float() env var casts against crash Two defensive fixes in gateway/run.py: 1. yaml.safe_load returning None on empty config files (line 12706): GatewayConfig.from_dict(data) crashes with AttributeError when the YAML file is empty because safe_load returns None. All 6 other yaml.safe_load call sites already use `or {}` — this one was missed. Impact: gateway fails to start with empty --config file. 2. float() on env vars without ValueError guard (lines 3951, 11757, 11805, 11807): HERMES_AGENT_TIMEOUT, HERMES_AGENT_TIMEOUT_WARNING, and HERMES_AGENT_NOTIFY_INTERVAL are cast via float() directly from os.getenv(). A typo (e.g. "abc") raises ValueError and crashes the agent turn or gateway startup. Impact: single misconfigured env var crashes the entire gateway.	2026-04-30 03:32:37 -07:00
Teknium	aa7bf329bc	feat(gateway): centralize audio routing + FLAC support + Telegram doc fallback (#17833) Extracted from PR #17211 (@versun) so it can land independently of the local_command TTS provider redesign. - Add should_send_media_as_audio(platform, ext, is_voice) in gateway/platforms/base.py; single source of truth for audio routing. - Add .flac to recognized audio extensions (MEDIA regex, weixin audio set, send_message audio set). - Telegram send_voice() now falls back to send_document for formats Telegram's Bot API can't play natively (.wav, .flac, ...) instead of raising; MP3/M4A still go to sendAudio, Opus/OGG still go to sendVoice. - Route _send_telegram() in send_message_tool through a narrower _TELEGRAM_SEND_AUDIO_EXTS = {.mp3, .m4a} set. - cron.scheduler._send_media_via_adapter now delegates the audio decision to should_send_media_as_audio so it matches the gateway. - Update the cron live-adapter ogg test to flag [[audio_as_voice]] so it still routes to sendVoice under the new Telegram-specific policy. - Tests: unit coverage for should_send_media_as_audio across platforms, end-to-end MEDIA routing via _process_message_background and GatewayRunner._deliver_media_from_response, TelegramAdapter.send_voice fallback for FLAC/WAV. Co-authored-by: Versun <me+github7604@versun.org>	2026-04-30 01:32:31 -07:00
Stephen Schoettler	f73364b1c4	fix(ci): stabilize main test suite regressions (#17660) * fix: stabilize main test suite regressions * test(agent): update MiniMax normalization expectation * test: stabilize remaining CI assertions * test: harden config helper monkeypatching * test: harden CI-only assertions * fix(agent): propagate fast streaming interrupts	2026-04-29 23:18:55 -07:00
Teknium	71c8ca17dc	chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 Removes drive-by duplication that accumulated during the contributor branch's multiple rebases. All runtime-benign (dict last-wins, redefinition last-wins) but left dead source that would confuse reviewers and maintainers. Surgical in-place de-duplication (kept PR's intentional additions, removed only the doubled copy): * hermes_cli/auth.py: duplicate "gmi" + "azure-foundry" ProviderConfig * hermes_cli/models.py: duplicate "gmi" entry in _PROVIDER_MODELS * hermes_cli/config.py: duplicate NOTION/LINEAR/AIRTABLE/TENOR skill env block + duplicate get_custom_provider_context_length definition * hermes_cli/gateway.py: duplicate _setup_yuanbao * gateway/platforms/base.py: duplicate is_host_excluded_by_no_proxy * gateway/platforms/telegram.py: duplicate delete_message * gateway/stream_consumer.py: duplicate _should_send_fresh_final and _try_fresh_final * gateway/run.py: duplicate _parse_reasoning_command_args / _resolve_session_reasoning_config / _set_session_reasoning_override, duplicate "Drain silently when interrupted" interrupt check * run_agent.py: duplicate HERMES_AGENT_HELP_GUIDANCE append, duplicate codex_message_items capture, duplicate custom_providers resolution * tools/approval.py: duplicate HARDLINE_PATTERNS section and duplicate hardline call in check_dangerous_command * tools/mcp_tool.py: duplicate _orphan_stdio_pids module-level decl * cron/scheduler.py: duplicate "not configured/enabled" check — kept the new early-rejection, removed the stale late-path copy Full-file resets to origin/main (all PR additions were duplicates of content already on main): * ui-tui/packages/hermes-ink/index.d.ts * ui-tui/packages/hermes-ink/src/entry-exports.ts * ui-tui/packages/hermes-ink/src/ink/selection.ts * ui-tui/src/app/interfaces.ts * ui-tui/src/app/slash/commands/core.ts * ui-tui/src/components/thinking.tsx * ui-tui/src/lib/memoryMonitor.ts * ui-tui/src/types.ts * 
ui-tui/src/types/hermes-ink.d.ts * tests/hermes_cli/test_doctor.py * tests/hermes_cli/test_api_key_providers.py * tests/hermes_cli/test_model_validation.py * tests/plugins/memory/test_hindsight_provider.py * tests/run_agent/test_run_agent.py * tests/gateway/test_email.py * tests/tools/test_dockerfile_pid1_reaping.py * hermes_cli/commands.py (slack_native_slashes block — full duplicate)	2026-04-29 21:56:51 -07:00
Ari Lotter	868bc1c242	feat(irc): add interactive setup feat(gateway): refine Platform._missing_ and platform-connected dispatch Restricts plugin-name acceptance to bundled plugin scan + registry (no arbitrary string -> enum-pollution), pulls per-platform connectivity checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean _is_platform_connected method, and adds tests covering the checker map, plugin platform interface, and IRC setup wizard.	2026-04-29 21:56:51 -07:00
Teknium	2e20f6ae2d	feat: complete plugin platform parity — all 12 integration points Extends the platform plugin interface from Phase 1 to cover every touchpoint where built-in platforms have hardcoded behavior. - allowed_users_env / allow_all_env: per-platform auth env vars - max_message_length: smart-chunking for send_message tool - pii_safe: session PII redaction flag - emoji: CLI/gateway display - allow_update_command: /update access control send_message tool (tools/send_message_tool.py): - Replaced hardcoded platform_map dict with Platform() call - Added _send_via_adapter() for plugin platforms — routes through live gateway adapter when available - Registry-aware max message length for smart chunking Cron delivery (cron/scheduler.py): - Replaced hardcoded 15-entry platform_map with Platform() call - Plugin platforms now work as cron delivery targets User authorization (gateway/run.py _is_user_authorized): - Registry fallback: checks PlatformEntry.allowed_users_env and allow_all_env when platform not in hardcoded maps - Plugin platforms get per-platform auth support _UPDATE_ALLOWED_PLATFORMS: checks registry allow_update_command flag Channel directory: includes plugin platforms in session enumeration Orphaned config warning: descriptive message when plugin platform is in config but no plugin registered it Gateway weakref: _gateway_runner_ref for cross-module adapter access hermes status: shows plugin platforms with (plugin) tag hermes gateway setup: plugin platforms appear in menu with setup hints hermes_cli/platforms.py: get_all_platforms() merges with registry, platform_label() falls back to registry for plugin names - 8 new tests (extended fields, cron resolution, platforms merge) - Updated 3 tests for new Platform() based resolution - 2829 passed, 24 pre-existing failures, zero new failures	2026-04-29 21:56:51 -07:00
Teknium	8f144fe36b	feat: pluggable platform adapter registry + IRC reference implementation Adds a platform adapter plugin interface so anyone can create new gateway platforms (IRC, Viber, Line, etc.) as drop-in plugins without modifying core gateway code. - PlatformEntry dataclass: name, label, adapter_factory, check_fn, validate_config, required_env, install_hint, source - PlatformRegistry singleton with register/unregister/create_adapter - _create_adapter() in gateway/run.py checks registry first, falls through to existing if/elif chain for built-in platforms - Platform._missing_() accepts unknown string values, creating cached pseudo-members so Platform('irc') is Platform('irc') holds true - GatewayConfig.from_dict() now parses plugin platform names from config.yaml without rejecting them - get_connected_platforms() delegates to registry for unknown platforms - PluginContext.register_platform() for plugin authors - Mirrors the existing register_tool() / register_hook() pattern - Full async IRC adapter using stdlib asyncio (zero external deps) - Connects via TLS, handles PING/PONG, nick collision, NickServ auth - Channel messages require addressing (nick: msg), DMs always dispatch - Markdown stripping for IRC-clean output, message splitting for 512-byte line limit - Config via config.yaml extra dict or IRC_* env vars - Platform enum dynamic members (identity stability, case normalization) - PlatformRegistry (register, unregister, create, validation, factory) - GatewayConfig integration (from_dict parsing, get_connected_platforms) - IRC adapter (init, send, protocol parsing, markdown, requirements) No existing platform adapters were migrated — the if/elif chain is untouched. This is Phase 1: prove the interface with a real plugin.	2026-04-29 21:56:51 -07:00
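The cached pseudo-member behavior attributed to Platform._missing_() in 8f144fe36b can be sketched with the standard library's Enum hook. This is an illustration only: the member list is shortened, and caching through the private `_value2member_map_` attribute is an assumption about one way to do it, not necessarily how the gateway does.

```python
from enum import Enum


class Platform(Enum):
    TELEGRAM = "telegram"
    DISCORD = "discord"

    @classmethod
    def _missing_(cls, value):
        if not isinstance(value, str):
            return None  # Enum then raises ValueError as usual
        key = value.strip().lower()  # case normalization
        existing = cls._value2member_map_.get(key)
        if existing is not None:
            return existing
        # Build a pseudo-member and cache it so later lookups return
        # the same object (identity stability).
        pseudo = object.__new__(cls)
        pseudo._name_ = key.upper()
        pseudo._value_ = key
        cls._value2member_map_[key] = pseudo
        return pseudo


print(Platform("irc") is Platform("irc"))          # True
print(Platform("IRC") is Platform("irc"))          # True
print(Platform("telegram") is Platform.TELEGRAM)   # True
```

Pseudo-members created this way do not appear when iterating over `Platform`, so built-in members stay distinguishable from plugin-provided names.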
Teknium	4d7fc0f37c	feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation Reloading MCP servers rebuilds the tool set for the active session, which invalidates the provider prompt cache (tool schemas are baked into the system prompt). The next message re-sends full input tokens — can be expensive on long-context or high-reasoning models. To surface that cost, /reload-mcp now routes through a new slash-confirm primitive with three options: Approve Once / Always Approve / Cancel. 'Always Approve' persists approvals.mcp_reload_confirm: false so future reloads run silently. Coverage: * Classic CLI (cli.py) — interactive numbered prompt. * TUI (tui_gateway + Ink ops.ts) — text warning on first call; `now` / `always` args skip the gate; `always` also persists the opt-out. * Messenger gateway — button UI on Telegram (inline keyboard), Discord (discord.ui.View), Slack (Block Kit actions); text fallback on every other platform via /approve /always /cancel replies intercepted in gateway/run.py _handle_message. * Config key: approvals.mcp_reload_confirm (default true). * Auto-reload paths (CLI file watcher, TUI config-sync mtime poll) pass confirm=true so they do NOT prompt. Implementation: * tools/slash_confirm.py — module-level pending-state store used by all adapters and by the CLI prompt. Thread-safe register/resolve/clear. * gateway/platforms/base.py — send_slash_confirm hook (default 'Not supported' → text fallback). * gateway/run.py — _request_slash_confirm helper + text intercept in _handle_message (yields to in-progress tool-exec approvals so dangerous-command /approve still unblocks the tool thread first). Tests: * tests/tools/test_slash_confirm.py — primitive lifecycle + async resolution + double-click atomicity (16 tests). * tests/hermes_cli/test_mcp_reload_confirm_gate.py — default-config shape + deep-merge preserves user opt-out (5 tests). 
Targeted runs (hermetic): 89 passed (slash-confirm, config gate, existing agent cache, existing telegram approval buttons).	2026-04-29 21:56:47 -07:00
helix4u	7fae87bc00	fix(gateway): refresh cached agents after MCP tool changes	2026-04-29 21:56:47 -07:00
teknium1	763aadd6bf	fix(telegram): preserve pre-#17686 chat-ID-in-_USERS configs + doc split PR #15027 (5 days ago) shipped TELEGRAM_GROUP_ALLOWED_USERS as a chat-ID allowlist. #17686 correctly renames that to sender user IDs and moves chat IDs to TELEGRAM_GROUP_ALLOWED_CHATS. Without a shim, any user on PR #15027's guidance would silently start rejecting group traffic on upgrade. - gateway/run.py: in _is_user_authorized, if TELEGRAM_GROUP_ALLOWED_USERS contains values starting with '-' (chat-ID-shaped), honor them as chat IDs and log a one-shot deprecation warning pointing users at the new TELEGRAM_GROUP_ALLOWED_CHATS var. - tests/gateway/test_unauthorized_dm_behavior.py: three new tests cover legacy chat-ID values authorizing the listed chat, not crossing to other chats, and mixed sender/chat values in the same var. - website/docs/user-guide/messaging/telegram.md: rewrite the Group Allowlisting section to document the new user/chat split + migration note. Remove stale '/thread_id' suffix claim (code never parsed it). - website/docs/reference/environment-variables.md: document all three Telegram allowlist env vars.	2026-04-29 21:07:55 -07:00
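The compatibility shim in 763aadd6bf hinges on telling chat-ID-shaped values (leading '-') apart from sender user IDs. A minimal sketch, assuming the variable holds a comma-separated list and using a hypothetical helper name:

```python
def split_group_allowlist(raw):
    """Split a TELEGRAM_GROUP_ALLOWED_USERS-style value into two sets.

    Values starting with '-' are treated as legacy chat IDs; everything
    else is treated as a sender user ID. The comma-separated format is
    an assumption of this sketch.
    """
    user_ids, legacy_chat_ids = set(), set()
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("-"):
            legacy_chat_ids.add(item)
        else:
            user_ids.add(item)
    return user_ids, legacy_chat_ids


users, chats = split_group_allowlist("12345, -1001234567890")
print(users, chats)
```

A non-empty second set is the condition under which the commit logs its one-shot deprecation warning pointing at TELEGRAM_GROUP_ALLOWED_CHATS.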
Anders Bell	1f712173b2	fix(telegram): support group user allowlist	2026-04-29 21:07:55 -07:00
teknium1	dd2d1ba5e6	refactor(reload-skills): queue note for next turn, drop cache invalidation + agent tool Salvage-follow-up to @shannonsands's /reload-skills PR. Trims the feature to match the design: user-initiated rescan, no prompt-cache reset, no new schema surface, no phantom user turn, and the next-turn note carries each added/removed skill's 60-char description (not just its name). Changes vs the original PR: * Drop the in-process skills prompt-cache clear in reload_skills(). Skills are invoked at runtime via /skill-name, skills_list, or skill_view — they don't need to live in the system prompt for the model to use them. Keeping the cache intact preserves prefix caching across the reload so /reload-skills pays no cache-reset cost. (MCP has to break the cache because tool schemas must be known at conversation start; skills do not.) * Drop the skills_reload agent tool and SKILLS_RELOAD_SCHEMA from tools/skills_tool.py, plus the four skills_reload enumerations in toolsets.py. No new schema surface — agents can already see a freshly- installed skill via skill_view / skills_list the moment it's on disk. * Replace the phantom 'role: user' turn injection with a one-shot queued note. CLI uses self._pending_skills_reload_note (same pattern as _pending_model_switch_note, prepended to the next API call and cleared). Gateway uses self._pending_skills_reload_notes[session_key]. The note is prepended to the NEXT real user message in this session, so message alternation stays intact and nothing out-of-band is persisted to the transcript. * reload_skills() now returns added/removed as [{'name': str, 'description': str}, ...] (description truncated to 60 chars — matches the curator / gateway adapter budget). The injected next-turn note formats each entry as 'name — description' so the model can actually reason about which new skills to call without running skills_list first. * Only emit the note when the diff is non-empty. 
On empty diff, print 'No new skills detected' and do nothing else. * Tests rewritten to cover the queue semantics, the description payload, and a regression guard that the prompt-cache snapshot is preserved.	2026-04-29 21:07:47 -07:00
Shannon Sands	7966560fb5	feat(skills): /reload-skills slash command + skills_reload agent tool Adds a public reload path for the in-process skill caches so newly installed (or removed) skills become visible mid-session without a gateway restart. Mirrors the shape of /reload-mcp. Three surfaces: * /reload-skills slash command — CLI (cli.py) and gateway (gateway/run.py), with /reload_skills alias for Telegram autocomplete and an explicit Discord registration. * skills_reload agent tool (tools/skills_tool.py) — lets agents/subagents pick up freshly-installed skills via tool call. * agent.skill_commands.reload_skills() — shared helper that clears _skill_commands, _SKILLS_PROMPT_CACHE (in-process LRU), and the on-disk .skills_prompt_snapshot.json, then returns an added/removed diff plus the new total count. Tested: * tests/agent/test_skill_commands_reload.py (9 cases) * tests/cli/test_cli_reload_skills.py (3 cases) * tests/gateway/test_reload_skills_command.py (4 cases) Use case: NemoClaw / OpenShell-style sandboxed orchestrators that drop skills into ~/.hermes/skills mid-session, plus agentic flows where the agent itself installs a skill via the shell tool and needs it bound without a gateway restart. The Python helper clear_skills_system_prompt_cache(clear_snapshot=True) already exists internally — this PR just exposes it via slash command and tool.	2026-04-29 21:07:47 -07:00
Scott Trinh	5a1d4f6804	feat: add Vercel Sandbox backend Adds Vercel Sandbox as a supported Hermes terminal backend alongside existing providers (Local, Docker, Modal, SSH, Daytona, Singularity). Uses the Vercel Python SDK to create/manage cloud microVMs, supports snapshot-based filesystem persistence keyed by task_id, and integrates with the existing BaseEnvironment shell contract and FileSyncManager for credential/skill syncing. Based on #17127 by @scotttrinh, cherry-picked onto current main.	2026-04-29 07:22:33 -07:00
tmimmanuel	3606414ec7	fix(gateway): isolate platform connect failures with per-platform timeout Wrap each adapter.connect() in asyncio.wait_for() so one platform hanging during startup or reconnect cannot block the others. Telegram's 8-retry connect loop (~140s worst case) previously prevented Feishu from ever starting when Telegram was network-restricted — common for users in regions where Telegram is blocked. Default timeout is 30s; override via HERMES_GATEWAY_PLATFORM_CONNECT_TIMEOUT (0 disables). Applied to both startup and the reconnect watcher so a platform that hangs mid-retry also does not stall retries for others. Fixes #17242	2026-04-29 05:00:37 -07:00
Teknium	13683c0842	feat(memory): notify providers on mid-process session_id rotation (#17409 ) Fixes #6672 Memory providers now receive on_session_switch() whenever AIAgent.session_id rotates mid-process — /resume, /branch, /reset, /new, and context compression. Before this, providers that cached per-session state in initialize() (Hindsight's _session_id, _document_id, accumulated _session_turns, _turn_counter) kept writing into the old session's record after the agent had moved on. MemoryProvider ABC ------------------ - New optional hook on_session_switch(new_session_id, *, parent_session_id='', reset=False, **kwargs) with no-op default for backward compat. reset=True signals /reset or /new — providers should flush accumulated per-session buffers. reset=False for /resume, /branch, compression where the logical conversation continues. MemoryManager ------------- - on_session_switch() fans the hook out to every registered provider. Isolated try/except per provider — one bad provider can't block others. - Empty/None new_session_id is a no-op to avoid corrupting provider state during shutdown paths. run_agent.py ------------ - _sync_external_memory_for_turn now passes session_id=self.session_id into sync_all() and queue_prefetch_all(). Providers with defensive session_id updates in sync_turn (Hindsight already had this at plugins/memory/hindsight/__init__.py:1199) now actually receive the current id. - Compression block at ~L8884 already notified the context engine of the rollover; now also calls _memory_manager.on_session_switch(reason='compression'). cli.py ------ - new_session() fires reset=True, reason='new_session' so providers flush buffers. - _handle_resume_command fires reset=False, reason='resume' with the previous session as parent_session_id. - _handle_branch_command fires reset=False, reason='branch' with the parent session_id already captured for the DB parent link. 
gateway/run.py -------------- - _handle_resume_command now evicts the cached AIAgent, mirroring /branch and /reset. The next message rebuilds a fresh agent whose memory provider initialize() runs with the correct session_id — matches the pattern the gateway already uses for provider state cross-session transitions. Hindsight reference implementation ---------------------------------- - plugins/memory/hindsight/__init__.py adds on_session_switch that: updates _session_id, mints a fresh _document_id (prevents vectorize-io/hindsight#1303 overwrite), and clears _session_turns / _turn_counter / _turn_index so in-flight batches don't flush under the new document id. parent_session_id only overwritten when provided (avoids clobbering on a bare switch). Tests ----- - tests/agent/test_memory_session_switch.py: new dedicated file. ABC default no-op, manager fan-out, failure isolation, empty-id no-op, session_id propagation through sync_all/queue_prefetch_all, Hindsight state transitions for every reset/non-reset case, parent preservation. - tests/cli/test_branch_command.py: new test verifying /branch fires the hook with correct parent_session_id + reset=False + reason. - tests/gateway/test_resume_command.py: new test verifying /resume evicts the cached agent. - tests/run_agent/test_memory_sync_interrupted.py: updated existing assertions to account for the session_id kwarg on sync_all and queue_prefetch_all. E2E verified (real imports, tmp HERMES_HOME): - /resume: session_id updates, doc_id fresh, buffers cleared, parent set - /branch: session_id forks, parent links to original - /new: reset=True clears accumulated state - compression: reason='compression' propagated, lineage preserved - Empty id: no-op, state preserved - Legacy provider without on_session_switch: no crash Reported by @nicoloboschi (Hindsight maintainer); related scope-widening comment by @kidonng extending coverage to compression.	2026-04-29 04:57:22 -07:00
Ben Barclay	58a6171bfb	Merge pull request #17305 from NousResearch/feat/docker-run-as-host-user feat(docker): run container as host user to avoid root-owned bind mounts	2026-04-29 16:41:55 +10:00
Teknium	2d137074a3	refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304 ) The "cfg.get('X', {}).get('Y', default)" pattern appears 50+ times across tools/, gateway/, and plugins/. Each call site manually handles the same three gotchas: 1. Missing intermediate key → empty dict → chain works 2. Non-dict value at intermediate position → AttributeError (uncaught in most sites, so a misconfigured YAML crashes the tool) 3. cfg is None → AttributeError Introduces cfg_get(cfg, keys, default=None) in hermes_cli/config.py as the canonical helper. Handles all three uniformly, returns default only when the final key is absent (matches dict.get semantics — explicit None values are preserved, falsy values like 0 / False / '' are preserved). Named cfg_get rather than cfg_path to avoid shadowing the existing 'cfg_path = _hermes_home / "config.yaml"' local variable that appears in gateway/run.py, cron/scheduler.py, hermes_cli/main.py, etc. Migrated 20 call sites as the first-batch proof-of-value: gateway/run.py 10 sites (agent/display subtrees) tools/browser_tool.py 3 sites tools/vision_tools.py 2 sites tools/browser_camofox.py 1 site tools/approval.py 1 site tools/skills_tool.py 1 site tools/skill_manager_tool.py 1 site tools/credential_files.py 1 site tools/env_passthrough.py 1 site The remaining ~30 sites across plugins/ and smaller tool files can be migrated opportunistically — the helper is now available and the pattern is established. Fixed a latent bug along the way: tools/vision_tools.py had its cfg_get usage at line 560 inside a function that locally re-imports 'from hermes_cli.config import load_config', but the AST-based migration script wrote the top-level cfg_get import to a different function scope, leaving line 560's cfg_get as a NameError silently swallowed by the surrounding try/except. Test test_vision_uses_configured_temperature_and_timeout caught it. Fixed by including cfg_get in the function-local import. 
Verified: - 7880/7893 tests/tools/ + tests/gateway/ + tests/hermes_cli/test_config tests pass; all 13 failures pre-existing on main (MCP, delegate, session_split_brain — verified earlier in the sweep). - All 20 migrated sites AST-verified to have cfg_get in scope (either module-level or function-local). - Live 'hermes chat' smoke: 2 turns + /model switch + tool calls + /quit, zero errors. Agent correctly counted 20 cfg_get hits across 8 tool files — matching the migration. Semantic parity verified against the original pattern across 8 edge cases (missing keys, None values, falsy values, empty strings, string instead of dict, None cfg, nested levels).	2026-04-28 23:17:39 -07:00
Ben	5531c0df82	feat(docker): run container as host user to avoid root-owned bind mounts Add opt-in terminal.docker_run_as_host_user config flag that passes --user $(id -u):$(id -g) to the Docker backend so files written into bind-mounted directories (/workspace, /root, docker_volumes entries) are owned by the host user instead of root. When enabled on POSIX platforms, also drops SETUID/SETGID caps since the container no longer needs gosu/su to switch users. Falls back cleanly on platforms without os.getuid (e.g. native Windows Docker) with a warning. Wired through all three config.yaml -> TERMINAL_* env-var bridges: - cli.py env_mappings (CLI + TUI startup) - gateway/run.py _terminal_env_map (gateway / messaging platforms) - hermes_cli/config.py _config_to_env_sync (`hermes config set`) Also fixes docker_mount_cwd_to_workspace silently failing in gateway mode -- it was missing from gateway/run.py's _terminal_env_map. Adds tests/tools/test_terminal_config_env_sync.py to guard against future drift between the three bridges (same bug class shipped twice in one month). Bundled Hermes image won't work with this flag since its entrypoint expects to start as root for the usermod/gosu hermes flow; works with the default nikolaik/python-nodejs image and plain Debian/Ubuntu.	2026-04-29 16:16:43 +10:00
Teknium	019d4c1c3f	feat(curator): hook into the gateway's cron-ticker thread Long-running gateways need the curator to fire on cadence without restarts. Piggy-back on the existing cron ticker thread (which already runs image/document cache cleanup every hour on the same pattern) instead of spawning a dedicated timer thread. - New CURATOR_EVERY = 60 ticks (poll hourly at default 60s interval). The inner config.interval_hours gate controls the real cadence, so most of these hourly pokes are cheap no-ops and only the one that crosses the interval runs the review. - Removed the boot-time call added in the prior commit — the ticker covers boot + every hour thereafter. Avoids double-running. Handles the weekly-default-on-24/7-gateway gap flagged in review.	2026-04-28 22:33:33 -07:00
Teknium	bc79e227e6	feat(curator): background skill maintenance (issue #7816 ) Adds the Curator — an auxiliary-model background task that periodically reviews AGENT-CREATED skills and keeps the collection tidy: tracks usage, transitions unused skills through active → stale → archived, and spawns a forked AIAgent to consolidate overlaps and patch drift. Default: enabled, inactivity-triggered (no cron daemon). Runs on CLI startup and gateway boot when the last run is older than interval_hours (default 24) AND the agent has been idle for min_idle_hours (default 2). Invariants (all load-bearing): - Never touches bundled or hub-installed skills (.bundled_manifest + .hub/lock.json double-filter) - Never auto-deletes — archive only. Archives are recoverable via `hermes curator restore <skill>` - Pinned skills bypass all auto-transitions - Uses the aux client; never touches the main session's prompt cache New files: - tools/skill_usage.py — sidecar .usage.json telemetry, atomic writes, provenance filter - agent/curator.py — orchestrator: config, idle gating, state-machine transitions (pure, no LLM), forked-agent review prompt - hermes_cli/curator.py — `hermes curator {status,run,pause,resume, pin,unpin,restore}` subcommand - tests/tools/test_skill_usage.py — 29 tests - tests/agent/test_curator.py — 25 tests Modified files (surgical patches): - tools/skills_tool.py — bump view_count on successful skill_view - tools/skill_manager_tool.py — bump patch_count on skill_manage patch/edit/write_file/remove_file; forget record on delete - hermes_cli/config.py — add curator: section to DEFAULT_CONFIG - hermes_cli/commands.py — add /curator CommandDef with subcommands - hermes_cli/main.py — register `hermes curator` subparser via register_cli() from hermes_cli.curator - cli.py — /curator slash-command dispatch + startup hook - gateway/run.py — gateway-boot hook (mirrors CLI) Validation: - 54 new tests across skill_usage + curator, all passing in 3s - 346 tests across all touched 
files' neighbors green - 2783 tests across hermes_cli/ + gateway/test_run_progress_topics.py green - CLI smoke: `hermes curator status/pause/resume` work end-to-end Companion to PR #16026 (class-first skill review prompt) — together they form a loop: the review prompt stops near-duplicate skill creation at the source, and the curator prunes/consolidates what still accumulates. Refs #7816.	2026-04-28 22:33:33 -07:00
Lyle Lengyel	80e474f11f	fix(gateway,terminal): expand shell tilde in terminal.cwd before subprocess Commit `3c42064e` made config.yaml the single source of truth for TERMINAL_CWD, but the config bridge passes cwd values verbatim to os.environ. When a user sets terminal.cwd: ~/ in config.yaml, the literal string '~/' reaches subprocess.Popen, which the kernel rejects because it does not expand shell tilde syntax. This patch adds three defensive layers: 1. gateway/run.py — expanduser at config bridge time so TERMINAL_CWD is always an absolute path. 2. tools/terminal_tool.py — expanduser when reading TERMINAL_CWD in _get_env_config(), guarding against stale or manually-set env vars. 3. tools/environments/local.py — expanduser in LocalEnvironment before passing cwd to subprocess.Popen, the final safety net. Includes regression tests in test_config_cwd_bridge.py for nested terminal.cwd, top-level cwd alias, and precedence ordering. Refs: `3c42064e`	2026-04-28 22:26:09 -07:00
Teknium	dcd7b717f8	fix(gateway): linearize tool-progress bubbles with content messages (#17280 ) After PR #7885 (`97b0cd51e`) added content-side segment breaks for natural mid-turn assistant messages, the tool-progress task in gateway/run.py was not updated to match. progress_msg_id and progress_lines persisted for the whole run, so after a tool batch produced bubble B1 followed by content bubble C1, the next tool.started kept editing the OLD bubble B1 above C1 — making the chat appear out of order on Telegram, Discord, and Slack. Add on_new_message callback to GatewayStreamConsumer, fired at the four sites where a fresh content bubble lands on the platform: - _send_or_edit first-send branch (NOT edits) - _send_commentary - _send_new_chunk (overflow split) - each successful chunk of _send_fallback_final Gateway supplies a lambda that enqueues ('__reset__',) into the progress_queue. send_progress_messages() handles the marker in both the main loop and the CancelledError drain path, clearing progress_msg_id, progress_lines, and the dedup state so the next tool.started opens a fresh bubble below the new content. Result: each tool batch appears in chronological order below the preceding content. When no content appears between tool batches, tools still group in one bubble (CLI-style compactness). Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 22:17:33 -07:00
Rugved Somwanshi	214ca943ac	feat(agent): add lmstudio integration	2026-04-28 12:27:36 -07:00
Teknium	b5128a751b	perf(startup): lazy-import OpenAI, Anthropic, Firecrawl, account_usage (#17046 ) * perf(startup): lazy-import OpenAI, Anthropic, Firecrawl, account_usage Four heavy SDK/module imports are now deferred off the hot startup path. Net savings on cold module imports: cli 1200 → 958 ms (-242) run_agent 1220 → 901 ms (-319) tools.web_tools 711 → 423 ms (-288) agent.anthropic_adapter 230 → 15 ms (-215) agent.auxiliary_client 253 → 68 ms (-185) Four independent changes in one PR since they all use the same pattern and share the same risk profile (heavy SDK import → lazy proxy or function-local import): 1. tools/web_tools.py: 'from firecrawl import Firecrawl' moved into _get_firecrawl_client(), which is only called when backend='firecrawl'. Users on Exa/Tavily/ Parallel pay zero firecrawl cost. 2. cli.py + gateway/run.py: 'from agent.account_usage import ...' moved into the /limits handlers. account_usage transitively pulls the OpenAI SDK chain; only needed when the user runs /limits. 3. agent/anthropic_adapter.py: 'try: import anthropic as _anthropic_sdk' replaced with a cached '_get_anthropic_sdk()' accessor. The three usage sites (build_anthropic_client, build_anthropic_bedrock_client, read_claude_code_credentials_from_keychain) now resolve via the accessor. All pre-existing test patches of 'agent.anthropic_adapter._anthropic_sdk' keep working because the accessor respects any value already in module globals. 4. agent/auxiliary_client.py AND run_agent.py: 'from openai import OpenAI' replaced with an '_OpenAIProxy()' module- level object that looks like the OpenAI class but imports the SDK on first call/isinstance check. This preserves: - 15+ in-module OpenAI(...) 
construction sites in auxiliary_client and the single site in run_agent's _create_openai_client (Python's function-scope name lookup finds the proxy, forwards the call); - 'patch("agent.auxiliary_client.OpenAI", ...)' and 'patch("run_agent.OpenAI", ...)' test patterns used by 28+ test files (patch replaces the module attribute as usual). Tried two alternatives first: - 'from openai._client import OpenAI' — doesn't skip openai/__init__.py (the audit's hypothesis here was wrong). - Module-level __getattr__ — works for external access but Python function-scope name resolution skips __getattr__, so in-module OpenAI(...) calls raise NameError. Note: 'openai' still loads on 'import cli' because cli.py -> neuter_async_httpx_del() -> openai._base_client, and run_agent.py -> code_execution_tool.py (module-level build_execute_code_schema) -> _load_config() -> 'from cli import CLI_CONFIG'. Deferring those is a separate, larger change — out of scope for this PR. The savings above all come from avoiding the openai/*, anthropic/*, and firecrawl/* top-level type-tree imports on paths that don't need them. Verified: - 302/302 tests in tests/agent/{test_anthropic_adapter, test_bedrock_1m_context, test_minimax_provider, test_anthropic_keychain} pass. Two pre-existing failures on main unchanged. - 106/106 tests/agent/test_auxiliary_client.py pass (1 pre-existing fail). - 97/97 tests/run_agent/test_create_openai_client_kwargs_isolation.py, test_plugin_context_engine_init.py, test_invalid_context_length_warning.py, test_api_max_retries_config.py, tests/hermes_cli/test_gemini_provider.py, test_ollama_cloud_provider.py pass (1 pre-existing fail). - Live hermes chat smoke: 2 turns + /model switch + tool calls, zero errors in the 57-line agent.log window. - Module-level import of run_agent + auxiliary_client + anthropic_adapter no longer pulls 'anthropic' or 'firecrawl' at all. 
* fix(gateway): restore top-level account_usage import for test-patch surface CI caught two failures in tests/gateway/test_usage_command.py that I missed locally: AttributeError: 'module' object at gateway.run has no attribute 'fetch_account_usage' The test uses monkeypatch.setattr('gateway.run.fetch_account_usage', ...) to inject a fake account-fetch call. Moving the import inside the handler deleted that module-level attribute, breaking the patch surface. Restoring the top-level import in gateway/run.py gives up the ~230 ms gateway-boot savings from that one lazy, but: 1. the gateway is a long-running daemon — boot cost is paid once per install, not per turn; 2. the other four lazy-imports (firecrawl, openai, anthropic, cli's account_usage) remain in place and still account for the bulk of the savings reported in the PR body; 3. preserving the patch surface keeps the established 'gateway.run.fetch_account_usage' monkeypatch pattern working without touching tests. Verified: tests/gateway/test_usage_command.py — 8 passed, 0 failed. Full targeted sweep (2336 tests across agent/gateway/hermes_cli/run_agent): 2332 passed, 4 failed — all 4 pre-existing on main. --------- Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 09:38:42 -07:00
Teknium	df51ad7973	perf(config): mtime-cache load_config() and read_raw_config() (#17041 ) load_config() and read_raw_config() now cache their result keyed on the config file's (mtime_ns, size). On cache hit they return a deepcopy of the cached value, skipping yaml.safe_load + deep-merge + normalize + env-var expansion entirely. save_config() + migrate_config() write via atomic_yaml_write which produces a fresh inode, so stat() sees a new mtime_ns and the next load repopulates automatically — no explicit invalidation hook needed. Measured per-call cost: load_config() cold: 13.3 ms load_config() cached: 0.23 ms (57x faster) read_raw_config() cached: 0.13 ms A single gateway turn hits the config 5-15 times (session context, auxiliary client resolution, memory config, plugin hooks, approval lookups, per-tool settings). That's 65-200 ms/turn of pure YAML re-parsing on main. After this change: 1-3 ms/turn. Also migrates gateway/run.py's 6 direct yaml.safe_load(config.yaml) call sites through _load_gateway_config, which now shares the read_raw_config cache when _hermes_home agrees with the canonical config path. The direct-read fallback is retained for tests that monkeypatch gateway_run._hermes_home without touching HERMES_HOME. Safety: - load_config() returns a deepcopy on every call; the 67+ call sites that mutate the result (cfg["model"]["default"] = ..., etc.) can't corrupt the cache. - save_config() / atomic_yaml_write bump mtime, naturally invalidating the cache for the next reader. - Cache is keyed on str(config_path), so HERMES_HOME profile switches don't collide. Verified: - 112 config tests pass (test_config, test_config_env_expansion, test_config_env_refs, test_config_drift, test_config_validation, test_aux_config). - 87 gateway tests pass (test_verbose_command, test_session_info, test_compress_focus, test_runtime_footer, test_resume_command, test_reasoning_command, test_approve_deny_commands, test_run_progress_interrupt). 
- Live hermes chat smoke — 2 turns + /model switch + tool calls, zero errors in agent.log. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 07:06:35 -07:00
konsisumer	e4b69bf149	fix(gateway): guard against None request_overrides in _build_api_kwargs	2026-04-28 06:57:23 -07:00
Teknium	e123f4ecf0	feat(gateway): opt-in runtime-metadata footer on final replies (#17026 ) Append a compact 'model · 68% · ~/projects/hermes' footer to the FINAL message of each turn, disabled by default (display.runtime_footer.enabled). Answers the Telegram-side parity ask: runtime context that the CLI status bar already shows is now available in messaging replies when enabled. Wiring: - gateway/runtime_footer.py: resolve_footer_config + format_runtime_footer + build_footer_line. Pure-function renderer; per-platform overrides under display.platforms.<platform>.runtime_footer. - gateway/run.py: appends footer to response right after reasoning prepend so it lands only on the final message (never tool progress or streaming chunks). When streaming already delivered the body (already_sent), the footer is sent as a small trailing message instead. - agent_result now exposes context_length alongside last_prompt_tokens so the footer can compute the pct; both gateway return paths updated. - /footer [on\|off\|status] slash command, wired in CLI (cli.py) and gateway (gateway/run.py both running-agent bypass and main dispatch). Global toggle only; per-platform overrides via config.yaml. Graceful degradation: - Missing context_length (unknown model) → pct field silently dropped (no '?%' artifact). - Empty final_response → no footer appended. - Unknown field names in config → silently ignored. Tests: 25-case unit suite (tests/gateway/test_runtime_footer.py) plus E2E harness covering streaming vs non-streaming branches, per-platform override, and the exact argument contract gateway/run.py uses. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 06:50:04 -07:00
Teknium	6085d7a93e	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 ) Mechanical cleanup across 43 files — removes 46 unused imports (F401) and 14 unused local variables (F841) detected by `ruff check --select F401,F841`. Net: -49 lines. Also fixes a latent NameError in rl_cli.py where `get_hermes_home()` was called at module line 32 before its import at line 65 — the module never imported successfully on main. The ruff audit surfaced this because it correctly saw the symbol as imported-but-unused (the call happened before the import ran); the fix moves the import to the top of the file alongside other stdlib imports. One `# noqa: F401` kept in hermes_cli/status.py for `subprocess`: tests monkeypatch `hermes_cli.status.subprocess` as a regression guard that systemctl isn't called on Termux, so the name must exist at module scope even though the module body doesn't reference it. Docstring explains the reason. Also fixes an invalid `# noqa:` directive in gateway/platforms/discord.py:308 that lacked a rule code. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 06:46:45 -07:00
Teknium	5f84eac451	feat(gateway): bust cached agent on compression/context_length config edits (#17008 ) The gateway caches one AIAgent per session to preserve prompt-cache hits, keyed by _agent_config_signature(). The signature previously only fingerprinted model/credentials/toolsets/ephemeral-prompt — NOT the compression or context_length config. As a result, users who edited model.context_length or compression.threshold in config.yaml on a long-lived gateway saw no effect until they triggered an unrelated cache eviction (/model switch, /reset, gateway restart). Add a new cache_keys parameter to _agent_config_signature and a _CACHE_BUSTING_CONFIG_KEYS registry listing config values the agent bakes in at construction time. Call sites read the current config and pass it through — next gateway message with an edited config rebuilds the agent. Keys registered: - model.context_length - compression.enabled - compression.threshold - compression.target_ratio - compression.protect_last_n Reported by @OP (Apr 26 feedback bundle). ## Changes - gateway/run.py: new _CACHE_BUSTING_CONFIG_KEYS tuple, _extract_cache_busting_config classmethod, cache_keys kwarg on _agent_config_signature, call site passes the extracted dict - tests/gateway/test_agent_cache.py: 11 new tests (5 on _agent_config_signature behavior, 6 on _extract_cache_busting_config) Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 06:37:42 -07:00
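
For readers skimming the log, the nested-lookup semantics described in commit 2d137074a3 (cfg_get) can be sketched as follows. This is an illustration only, not the code in hermes_cli/config.py: the commit message gives the signature cfg_get(cfg, keys, default=None) but not the type of `keys`, so accepting either a dotted string or a sequence of keys is an assumption made here.

```python
from collections.abc import Mapping


def cfg_get(cfg, keys, default=None):
    """Walk nested mappings; return `default` only when the path is absent.

    ASSUMPTION: `keys` may be a dotted string ("a.b") or a sequence of keys.
    The real helper's accepted key format is not stated in the commit message.
    """
    if isinstance(keys, str):
        keys = keys.split(".")
    node = cfg
    for key in keys:
        # Covers all three gotchas from the commit message: a missing key,
        # a non-dict value at an intermediate position, and cfg being None.
        if not isinstance(node, Mapping) or key not in node:
            return default
        node = node[key]
    # Explicit None and falsy values (0, False, '') are returned as-is,
    # matching dict.get semantics.
    return node
```

With this sketch, a lookup such as cfg_get(cfg, "display.runtime_footer.enabled", False) returns False both when cfg is None and when `display` is accidentally a string in the YAML, rather than raising AttributeError.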


718 commits
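
The caching scheme described in commit df51ad7973 (results keyed on the config file's (mtime_ns, size), with a deepcopy returned on every call) can be sketched roughly as below. The names `load_cached` and `_CACHE` and the pluggable `parse` argument are hypothetical and chosen for illustration; the commit itself applies the idea to load_config() and read_raw_config() with yaml.safe_load.

```python
import copy
import os

# Hypothetical module-level cache: str(path) -> ((mtime_ns, size), parsed value)
_CACHE = {}


def load_cached(path, parse):
    """Return parse(file contents), re-parsing only when the file changed.

    `parse` is any callable taking the file text (for example json.loads).
    """
    st = os.stat(path)
    stamp = (st.st_mtime_ns, st.st_size)
    hit = _CACHE.get(str(path))
    if hit is not None and hit[0] == stamp:
        # Cache hit: hand out a copy so callers that mutate the result
        # cannot corrupt the cached value.
        return copy.deepcopy(hit[1])
    with open(path, "r", encoding="utf-8") as fh:
        value = parse(fh.read())
    _CACHE[str(path)] = (stamp, value)
    return copy.deepcopy(value)
```

No explicit invalidation hook is needed in this sketch: any write that changes the file's mtime or size produces a different stamp, so the next call re-parses, which mirrors the reasoning given in the commit message.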