hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-03 02:11:48 +00:00

Author	SHA1	Message	Date
Rob Moen	0dd373ec43	fix(context): honor model.context_length for Ollama num_ctx and all display paths When a user sets model.context_length in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config — causing OOM on smaller GPUs like the P100 (16GB). Root cause: two separate context values existed independently: - context_compressor.context_length = config value (e.g. 65536) ✓ - _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config Changes: 1. Cap Ollama num_ctx to config context_length (run_agent.py) When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix — it prevents Ollama from allocating more VRAM than the user budgeted. 2. Pass config_context_length through all secondary call sites Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback: - cli.py: @-reference expansion and /model switch display - gateway/run.py: @-reference expansion and /model switch display - tui_gateway/server.py: @-reference expansion - hermes_cli/model_switch.py: resolve_display_context_length() 3. Normalize root-level context_length in config (hermes_cli/config.py) _normalize_root_model_keys() now migrates root-level context_length into the model section, matching existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` had it silently ignored. 4. Fix misleading comments (agent/model_metadata.py) DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated. Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests).	2026-04-30 04:31:23 -07:00
Teknium	8d302e37a8	feat(tts): add Piper as a native local TTS provider (closes #8508 ) (#17885 ) Piper (OHF-Voice/piper1-gpl) is a fast, local neural TTS engine from the Home Assistant project that supports 44 languages with zero API keys. Adds it as a native built-in provider alongside edge/neutts/kittentts, installable via 'hermes tools' with one keystroke. What ships: - New 'piper' built-in provider in tools/tts_tool.py - Lazy import via _import_piper() - Module-level voice cache keyed on (model_path, use_cuda) so switching voices doesn't invalidate older cached voices - _resolve_piper_voice_path() accepts either an absolute .onnx path or a voice name (auto-downloaded on first use via 'python -m piper.download_voices --download-dir <cache>') - Voice cache at ~/.hermes/cache/piper-voices/ (profile-aware via get_hermes_dir) - Optional SynthesisConfig knobs: length_scale, noise_scale, noise_w_scale, volume, normalize_audio, use_cuda — passed through only when configured, so older piper-tts versions aren't broken - WAV output then ffmpeg conversion path (same as neutts/kittentts) so Telegram voice bubbles work when ffmpeg is present - Piper added to BUILTIN_TTS_PROVIDERS so a user's tts.providers.piper.command cannot shadow the native provider (regression test included) - 'hermes tools' wizard entry - Piper appears under Voice and TTS as local free, with 'pip install piper-tts' auto-install via post_setup handler - Prints voice-catalog URL and default-voice info after install - config.yaml defaults - tts.piper.voice defaults to en_US-lessac-medium - Commented advanced knobs for discoverability - Docs - New 'Piper (local, 44 languages)' section in features/tts.md explaining install path, voice switching, pre-downloaded voices, and advanced knobs - Piper listed in the ten-provider table and ffmpeg table - Custom-command-providers section updated to drop the Piper example (now native) and add a piper-custom example for users with their own trained .onnx models - overview.md bumps provider count to ten - Tests (tests/tools/test_tts_piper.py, 16 tests) - Registration (BUILTIN_TTS_PROVIDERS, PROVIDER_MAX_TEXT_LENGTH) - _resolve_piper_voice_path across every branch: direct .onnx path, cached voice name, fresh download with correct CLI args, download failure, successful-exit-but-missing-files, empty voice to default - _generate_piper_tts: loads voice once, reuses cache, voice-name download wiring, advanced knobs flow through SynthesisConfig - text_to_speech_tool end-to-end dispatch and missing-package error - check_tts_requirements: piper availability toggles the return value - Regression guard: piper cannot be shadowed by a command provider with the same name - Pre-existing test_tts_mistral test broadened to mock the new piper/kittentts/command-provider checks (otherwise it false-passes when piper is installed in the test venv) E2E verification (live): Actual pip install piper-tts, config piper + en_US-lessac-low, text_to_speech_tool call, voice auto-downloaded from HuggingFace, WAV synthesized, ffmpeg-converted to Ogg/Opus. Second call hits the cache (~60ms). Cache dir populated with .onnx and .onnx.json. This caught a real bug during development: the first pass used '-d' as the download-dir flag; the actual piper.download_voices CLI wants '--download-dir'. Fixed before PR opened.	2026-04-30 02:53:20 -07:00
Teknium	0da968e521	fix(curator): unify under auxiliary.curator (hermes model, dashboard) (#17868 ) Voscko reported curator.auxiliary.provider/model was advertised in the docs but ignored — the review fork read only model.provider/default. The narrow fix would wire the one-off key through, but that leaves curator as a parallel system: not in `hermes model` → auxiliary picker, not in the dashboard Models tab, missing per-task base_url/api_key/timeout/ extra_body. Unify curator with the rest of the aux task system so `hermes model` and the dashboard configure it like every other aux task. Four sources of truth updated: - hermes_cli/config.py — add 'curator' slot to DEFAULT_CONFIG.auxiliary (timeout=600 since reviews run long), drop the one-off curator.auxiliary block from DEFAULT_CONFIG.curator. - hermes_cli/main.py — add ('curator', 'Curator', 'skill-usage review pass') to _AUX_TASKS so the CLI picker offers it. - hermes_cli/web_server.py — add 'curator' to _AUX_TASK_SLOTS so the dashboard REST endpoint accepts it. - web/src/pages/ModelsPage.tsx — add Curator entry so the dashboard Models tab renders the task. agent/curator.py _resolve_review_model() now reads auxiliary.curator first (canonical), falls back to legacy curator.auxiliary (with an info log asking users to migrate), then falls back to the main chat model. Pre-unification users keep working. Docs updated: docs/user-guide/features/curator.md now points at `hermes model` → auxiliary → Curator and the dashboard Models tab. Tests: 6 unit tests on _resolve_review_model (auto default, canonical slot honored, partial override fallback, legacy fallback with deprecation log assertion, new-wins-over-legacy, empty-config safety) plus a cross-registry test that curator is wired into all four sources of truth. test_aux_tasks_keys_all_exist_in_default_config already covers the DEFAULT_CONFIG ↔ _AUX_TASKS invariant. Reported by Voscko on Discord.	2026-04-30 02:46:01 -07:00
Teknium	71c8ca17dc	chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 Removes drive-by duplication that accumulated during the contributor branch's multiple rebases. All runtime-benign (dict last-wins, redefinition last-wins) but left dead source that would confuse reviewers and maintainers. Surgical in-place de-duplication (kept PR's intentional additions, removed only the doubled copy): * hermes_cli/auth.py: duplicate "gmi" + "azure-foundry" ProviderConfig * hermes_cli/models.py: duplicate "gmi" entry in _PROVIDER_MODELS * hermes_cli/config.py: duplicate NOTION/LINEAR/AIRTABLE/TENOR skill env block + duplicate get_custom_provider_context_length definition * hermes_cli/gateway.py: duplicate _setup_yuanbao * gateway/platforms/base.py: duplicate is_host_excluded_by_no_proxy * gateway/platforms/telegram.py: duplicate delete_message * gateway/stream_consumer.py: duplicate _should_send_fresh_final and _try_fresh_final * gateway/run.py: duplicate _parse_reasoning_command_args / _resolve_session_reasoning_config / _set_session_reasoning_override, duplicate "Drain silently when interrupted" interrupt check * run_agent.py: duplicate HERMES_AGENT_HELP_GUIDANCE append, duplicate codex_message_items capture, duplicate custom_providers resolution * tools/approval.py: duplicate HARDLINE_PATTERNS section and duplicate hardline call in check_dangerous_command * tools/mcp_tool.py: duplicate _orphan_stdio_pids module-level decl * cron/scheduler.py: duplicate "not configured/enabled" check — kept the new early-rejection, removed the stale late-path copy Full-file resets to origin/main (all PR additions were duplicates of content already on main): * ui-tui/packages/hermes-ink/index.d.ts * ui-tui/packages/hermes-ink/src/entry-exports.ts * ui-tui/packages/hermes-ink/src/ink/selection.ts * ui-tui/src/app/interfaces.ts * ui-tui/src/app/slash/commands/core.ts * ui-tui/src/components/thinking.tsx * ui-tui/src/lib/memoryMonitor.ts * ui-tui/src/types.ts * ui-tui/src/types/hermes-ink.d.ts * tests/hermes_cli/test_doctor.py * tests/hermes_cli/test_api_key_providers.py * tests/hermes_cli/test_model_validation.py * tests/plugins/memory/test_hindsight_provider.py * tests/run_agent/test_run_agent.py * tests/gateway/test_email.py * tests/tools/test_dockerfile_pid1_reaping.py * hermes_cli/commands.py (slack_native_slashes block — full duplicate)	2026-04-29 21:56:51 -07:00
Ari Lotter	868bc1c242	feat(irc): add interactive setup feat(gateway): refine Platform._missing_ and platform-connected dispatch Restricts plugin-name acceptance to bundled plugin scan + registry (no arbitrary string -> enum-pollution), pulls per-platform connectivity checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean _is_platform_connected method, and adds tests covering the checker map, plugin platform interface, and IRC setup wizard.	2026-04-29 21:56:51 -07:00
Teknium	8f144fe36b	feat: pluggable platform adapter registry + IRC reference implementation Adds a platform adapter plugin interface so anyone can create new gateway platforms (IRC, Viber, Line, etc.) as drop-in plugins without modifying core gateway code. - PlatformEntry dataclass: name, label, adapter_factory, check_fn, validate_config, required_env, install_hint, source - PlatformRegistry singleton with register/unregister/create_adapter - _create_adapter() in gateway/run.py checks registry first, falls through to existing if/elif chain for built-in platforms - Platform._missing_() accepts unknown string values, creating cached pseudo-members so Platform('irc') is Platform('irc') holds true - GatewayConfig.from_dict() now parses plugin platform names from config.yaml without rejecting them - get_connected_platforms() delegates to registry for unknown platforms - PluginContext.register_platform() for plugin authors - Mirrors the existing register_tool() / register_hook() pattern - Full async IRC adapter using stdlib asyncio (zero external deps) - Connects via TLS, handles PING/PONG, nick collision, NickServ auth - Channel messages require addressing (nick: msg), DMs always dispatch - Markdown stripping for IRC-clean output, message splitting for 512-byte line limit - Config via config.yaml extra dict or IRC_* env vars - Platform enum dynamic members (identity stability, case normalization) - PlatformRegistry (register, unregister, create, validation, factory) - GatewayConfig integration (from_dict parsing, get_connected_platforms) - IRC adapter (init, send, protocol parsing, markdown, requirements) No existing platform adapters were migrated — the if/elif chain is untouched. This is Phase 1: prove the interface with a real plugin.	2026-04-29 21:56:51 -07:00
Teknium	4d7fc0f37c	feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation Reloading MCP servers rebuilds the tool set for the active session, which invalidates the provider prompt cache (tool schemas are baked into the system prompt). The next message re-sends full input tokens — can be expensive on long-context or high-reasoning models. To surface that cost, /reload-mcp now routes through a new slash-confirm primitive with three options: Approve Once / Always Approve / Cancel. 'Always Approve' persists approvals.mcp_reload_confirm: false so future reloads run silently. Coverage: * Classic CLI (cli.py) — interactive numbered prompt. * TUI (tui_gateway + Ink ops.ts) — text warning on first call; `now` / `always` args skip the gate; `always` also persists the opt-out. * Messenger gateway — button UI on Telegram (inline keyboard), Discord (discord.ui.View), Slack (Block Kit actions); text fallback on every other platform via /approve /always /cancel replies intercepted in gateway/run.py _handle_message. * Config key: approvals.mcp_reload_confirm (default true). * Auto-reload paths (CLI file watcher, TUI config-sync mtime poll) pass confirm=true so they do NOT prompt. Implementation: * tools/slash_confirm.py — module-level pending-state store used by all adapters and by the CLI prompt. Thread-safe register/resolve/clear. * gateway/platforms/base.py — send_slash_confirm hook (default 'Not supported' → text fallback). * gateway/run.py — _request_slash_confirm helper + text intercept in _handle_message (yields to in-progress tool-exec approvals so dangerous-command /approve still unblocks the tool thread first). Tests: * tests/tools/test_slash_confirm.py — primitive lifecycle + async resolution + double-click atomicity (16 tests). * tests/hermes_cli/test_mcp_reload_confirm_gate.py — default-config shape + deep-merge preserves user opt-out (5 tests). Targeted runs (hermetic): 89 passed (slash-confirm, config gate, existing agent cache, existing telegram approval buttons).	2026-04-29 21:56:47 -07:00
kshitijk4poor	13c238327e	fix: address self-review findings for Vercel Sandbox salvage - Add vercel_sandbox to hardline blocklist container bypass test - Add vercel_sandbox to skills_tool remote backend parametrize test - Deduplicate runtime set: doctor.py and setup.py now import _SUPPORTED_VERCEL_RUNTIMES from terminal_tool.py - Add docstring to _run_bash explaining timeout/stdin_data discards - Always stop sandbox during cleanup (unconditional, matching Modal/Daytona) - Update security.md: container bypass text, production tip, comparison table - Update environment-variables.md: TERMINAL_ENV list, Vercel auth vars, TERMINAL_VERCEL_RUNTIME - Update inline comments in cli.py and config.py to include vercel_sandbox	2026-04-29 07:22:33 -07:00
Scott Trinh	5a1d4f6804	feat: add Vercel Sandbox backend Adds Vercel Sandbox as a supported Hermes terminal backend alongside existing providers (Local, Docker, Modal, SSH, Daytona, Singularity). Uses the Vercel Python SDK to create/manage cloud microVMs, supports snapshot-based filesystem persistence keyed by task_id, and integrates with the existing BaseEnvironment shell contract and FileSyncManager for credential/skill syncing. Based on #17127 by @scotttrinh, cherry-picked onto current main.	2026-04-29 07:22:33 -07:00
Ben Barclay	58a6171bfb	Merge pull request #17305 from NousResearch/feat/docker-run-as-host-user feat(docker): run container as host user to avoid root-owned bind mounts	2026-04-29 16:41:55 +10:00
Teknium	2d137074a3	refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304 ) The "cfg.get('X', {}).get('Y', default)" pattern appears 50+ times across tools/, gateway/, and plugins/. Each call site manually handles the same three gotchas: 1. Missing intermediate key → empty dict → chain works 2. Non-dict value at intermediate position → AttributeError (uncaught in most sites, so a misconfigured YAML crashes the tool) 3. cfg is None → AttributeError Introduces cfg_get(cfg, keys, default=None) in hermes_cli/config.py as the canonical helper. Handles all three uniformly, returns default only when the final key is absent* (matches dict.get semantics — explicit None values are preserved, falsy values like 0 / False / '' are preserved). Named cfg_get rather than cfg_path to avoid shadowing the existing 'cfg_path = _hermes_home / "config.yaml"' local variable that appears in gateway/run.py, cron/scheduler.py, hermes_cli/main.py, etc. Migrated 20 call sites as the first-batch proof-of-value: gateway/run.py 10 sites (agent/display subtrees) tools/browser_tool.py 3 sites tools/vision_tools.py 2 sites tools/browser_camofox.py 1 site tools/approval.py 1 site tools/skills_tool.py 1 site tools/skill_manager_tool.py 1 site tools/credential_files.py 1 site tools/env_passthrough.py 1 site The remaining ~30 sites across plugins/ and smaller tool files can be migrated opportunistically — the helper is now available and the pattern is established. Fixed a latent bug along the way: tools/vision_tools.py had its cfg_get usage at line 560 inside a function that locally re-imports 'from hermes_cli.config import load_config', but the AST-based migration script wrote the top-level cfg_get import to a different function scope, leaving line 560's cfg_get as a NameError silently swallowed by the surrounding try/except. Test test_vision_uses_configured_temperature_and_timeout caught it. Fixed by including cfg_get in the function-local import. Verified: - 7880/7893 tests/tools/ + tests/gateway/ + tests/hermes_cli/test_config tests pass; all 13 failures pre-existing on main (MCP, delegate, session_split_brain — verified earlier in the sweep). - All 20 migrated sites AST-verified to have cfg_get in scope (either module-level or function-local). - Live 'hermes chat' smoke: 2 turns + /model switch + tool calls + /quit, zero errors. Agent correctly counted 20 cfg_get hits across 8 tool files — matching the migration. Semantic parity verified against the original pattern across 8 edge cases (missing keys, None values, falsy values, empty strings, string instead of dict, None cfg, nested levels).	2026-04-28 23:17:39 -07:00
Ben	5531c0df82	feat(docker): run container as host user to avoid root-owned bind mounts Add opt-in terminal.docker_run_as_host_user config flag that passes --user $(id -u):$(id -g) to the Docker backend so files written into bind-mounted directories (/workspace, /root, docker_volumes entries) are owned by the host user instead of root. When enabled on POSIX platforms, also drops SETUID/SETGID caps since the container no longer needs gosu/su to switch users. Falls back cleanly on platforms without os.getuid (e.g. native Windows Docker) with a warning. Wired through all three config.yaml -> TERMINAL_* env-var bridges: - cli.py env_mappings (CLI + TUI startup) - gateway/run.py _terminal_env_map (gateway / messaging platforms) - hermes_cli/config.py _config_to_env_sync (`hermes config set`) Also fixes docker_mount_cwd_to_workspace silently failing in gateway mode -- it was missing from gateway/run.py's _terminal_env_map. Adds tests/tools/test_terminal_config_env_sync.py to guard against future drift between the three bridges (same bug class shipped twice in one month). Bundled Hermes image won't work with this flag since its entrypoint expects to start as root for the usermod/gosu hermes flow; works with the default nikolaik/python-nodejs image and plain Debian/Ubuntu.	2026-04-29 16:16:43 +10:00
Teknium	a12f7aa8bb	fix(curator): default cycle is every 7 days, not 24 hours Weekly is closer to how skill churn actually works — most agent-created skills don't change multiple times per day, so a daily review is pure cost without benefit. Bumping the default to 7 days reduces aux-model spend while still catching drift and staleness on the timescales that matter (30d stale, 90d archive). Changes: - DEFAULT_INTERVAL_HOURS: 24 -> 168 (7 days) - config.yaml default: interval_hours: 24 -> 24 * 7 - CLI status line renders as '7d' when interval is a whole-day multiple - Test `test_old_run_eligible` decoupled from the exact default: it now uses 2 * get_interval_hours() so future tweaks don't break it	2026-04-28 22:33:33 -07:00
Teknium	bc79e227e6	feat(curator): background skill maintenance (issue #7816 ) Adds the Curator — an auxiliary-model background task that periodically reviews AGENT-CREATED skills and keeps the collection tidy: tracks usage, transitions unused skills through active → stale → archived, and spawns a forked AIAgent to consolidate overlaps and patch drift. Default: enabled, inactivity-triggered (no cron daemon). Runs on CLI startup and gateway boot when the last run is older than interval_hours (default 24) AND the agent has been idle for min_idle_hours (default 2). Invariants (all load-bearing): - Never touches bundled or hub-installed skills (.bundled_manifest + .hub/lock.json double-filter) - Never auto-deletes — archive only. Archives are recoverable via `hermes curator restore <skill>` - Pinned skills bypass all auto-transitions - Uses the aux client; never touches the main session's prompt cache New files: - tools/skill_usage.py — sidecar .usage.json telemetry, atomic writes, provenance filter - agent/curator.py — orchestrator: config, idle gating, state-machine transitions (pure, no LLM), forked-agent review prompt - hermes_cli/curator.py — `hermes curator {status,run,pause,resume, pin,unpin,restore}` subcommand - tests/tools/test_skill_usage.py — 29 tests - tests/agent/test_curator.py — 25 tests Modified files (surgical patches): - tools/skills_tool.py — bump view_count on successful skill_view - tools/skill_manager_tool.py — bump patch_count on skill_manage patch/edit/write_file/remove_file; forget record on delete - hermes_cli/config.py — add curator: section to DEFAULT_CONFIG - hermes_cli/commands.py — add /curator CommandDef with subcommands - hermes_cli/main.py — register `hermes curator` subparser via register_cli() from hermes_cli.curator - cli.py — /curator slash-command dispatch + startup hook - gateway/run.py — gateway-boot hook (mirrors CLI) Validation: - 54 new tests across skill_usage + curator, all passing in 3s - 346 tests across all touched files' neighbors green - 2783 tests across hermes_cli/ + gateway/test_run_progress_topics.py green - CLI smoke: `hermes curator status/pause/resume` work end-to-end Companion to PR #16026 (class-first skill review prompt) — together they form a loop: the review prompt stops near-duplicate skill creation at the source, and the curator prunes/consolidates what still accumulates. Refs #7816.	2026-04-28 22:33:33 -07:00
JackJin	88e07c42b4	fix(cli): prevent .env sanitizer from splitting GLM_API_KEY by LM_API_KEY suffix The known-key splitter in `_sanitize_env_lines` used substring matching to find concatenated KEY=VALUE pairs. When a registered key was a suffix of another (LM_API_KEY is a suffix of GLM_API_KEY), the shorter key's needle would match inside the longer one, causing the sanitizer to rewrite `GLM_API_KEY=...` as `G\nLM_API_KEY=...` and silently break Z.AI/GLM auth (and similarly `GLM_BASE_URL` -> `G\nLM_BASE_URL`). Drop matches whose needle range is fully contained within a longer overlapping match. Two regression tests cover the suffix-collision case and confirm a real concatenation that happens to start with the longer key still splits where it should. Fixes #17138	2026-04-28 22:22:45 -07:00
Teknium	8c892c1453	refactor(redact): canonical mask_secret helper; fix status.py DIM drift (#17207 ) Three modules independently implemented the same "preserve head+tail of a secret, mask the middle" logic with slightly different behaviors that had started to drift: hermes_cli/config.py redact_key — 12-char floor, 4+4, DIM '(not set)' hermes_cli/status.py redact_key — 12-char floor, 4+4, plain '(not set)' ← drift hermes_cli/dump.py _redact — 12-char floor, 4+4, empty string The visible bug: 'hermes status' displayed the '(not set)' placeholder in plain text while 'hermes config' showed it in dim text. Same concept, inconsistent UI. Introduces mask_secret() in agent/redact.py as the canonical helper, with head/tail/floor/placeholder/empty kwargs. The three call sites become one-line wrappers that differ only in the 'empty' handling: config.redact_key → mask_secret(k, empty=color('(not set)', Colors.DIM)) status.redact_key → mask_secret(k, empty=color('(not set)', Colors.DIM)) dump._redact → mask_secret(v) # empty → '' agent.redact._mask_token (log redactor, different policy: 18-char floor, 6+4 visible, '*' on empty) also ports to mask_secret but retains its own empty-case handling to preserve the historical '' return. Net: the three display-time redactors now agree on formatting, the canonical helper lives in one place, and future tweaks (e.g. adding bullet-point masking, changing the head/tail widths) happen once. Verified: - 3/3 tests/hermes_cli/test_web_server.py::TestRedactKey pass - 89/89 agent/tests/test_redact.py + tests/tools/test_browser_secret_exfil.py + tests/hermes_cli/test_redact_config_bridge.py pass - Live 'hermes status', 'hermes config', 'hermes dump' all render the same way they did before (verified against actual env with real keys: OpenRouter, Firecrawl, Browserbase, FAL, Tinker all show 'prefix...suffix'; Kimi shows '**' at <12 chars; unset shows '(not set)' uniformly). Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 21:04:35 -07:00
brooklyn!	7d81d76366	feat(tui): pluggable busy-indicator styles (#13610 ) (#17150 ) * feat(tui): pluggable busy-indicator styles (kaomoji/emoji/unicode/ascii) The status-bar `FaceTicker` rotated through wide-and-variable kaomoji glyphs (`(｡•́︿•̀｡)`, `( ͡° ͜ʖ ͡°)`, …) every 2.5s. Real display widths range from ~5 to ~16 columns, so the rest of the bar (cwd, ctx %, voice, bg counter) shifted on every cycle. Padding the verb alone (#17116) helped but didn't address the dominant jitter source — the glyph itself. Add four indicator styles, configurable + hot-swappable: * `kaomoji` (default — preserves the existing vibe; verb is now pad-stable so the only width churn left is the kaomoji itself). * `emoji` — single 2-col emoji frame (`⚕ 🌀 🤔 ✨ 🍵 🔮`). * `unicode` — `unicode-animations` braille spinner (1-col, smooth). * `ascii` — `\| / - \` (1-col, max compat). Wires: * `display.tui_status_indicator` in `DEFAULT_CONFIG` (default `kaomoji`). * New JSON-RPC `config.set/get indicator` keys, narrow allow-list. * `applyDisplay` reads the field and patches `UiState.indicatorStyle`, so the existing `mtime` poll picks up `~/.hermes/config.yaml` edits within ~5s without a TUI restart. * `/indicator [style]` slash command (alias `/indicator-style`, subcommand completion `kaomoji\|emoji\|unicode\|ascii`). Bare form shows the current style; setter fires `config.set` and optimistically `patchUiState({ indicatorStyle })` so the live TUI swaps immediately, matching the `/skin` UX. * `CommandDef("indicator", ..., subcommands=...)` so classic CLI autocomplete + TUI `complete.slash` both surface it. * `FaceTicker` decouples spinner cadence from verb cadence — the glyph runs at the spinner's authored interval (or `FACE_TICK_MS` for kaomoji), the verb stays on the original 2.5s cycle, and both re-arm cleanly when style changes. Tests: * `normalizeIndicatorStyle` rejects unknown / non-string input. * `applyDisplay → tui_status_indicator` covers fan-out + fallback. * `/indicator <style>` hot-swaps `UiState.indicatorStyle` after a successful `config.set`. * `/indicator sparkle` rejects with the usage hint and never hits the gateway. * Slash-parity matrix gets `'/indicator'` → `config.get`. Validation: cd ui-tui && npm run type-check — clean; npm test --run — 398/398. scripts/run_tests.sh tests/test_tui_gateway_server.py tests/hermes_cli/test_commands.py — 220/220. * chore(tui): drop /indicator-style alias to declutter autocomplete * fix(tui): drop verb-width pad — /indicator handles glyph jitter directly * fix(tui): unicode indicator style hides the verb (cleanest option) * refactor(tui): single source of truth for INDICATOR_STYLES; cleaner error format Round 1 Copilot review on PR #17150: - Exported `INDICATOR_STYLES` const tuple from `interfaces.ts`; `IndicatorStyle` union type is derived from it. `useConfigSync` builds its validation Set from the tuple, and `session.ts` uses it for both the usage hint and the runtime allow-list — adding/removing a style now touches one line. - Backend `config.set indicator` error message: switched `sorted(allowed)` list repr to `pick one of ascii\|emoji\|kaomoji\|unicode` (matches the TUI usage hint), and reports the normalized `raw` instead of the original `value`. Backend allowed tuple now has a comment pointing back at `INDICATOR_STYLES` so the two stay aligned. Note: kept the verb portion unpadded per design intent — fixed-width padding was the exact UX the `/indicator` command was added to remove. Stable width comes from the glyph; verbs cycling is part of the kawaii aesthetic. Reply on the verb thread will explain. * fix(tui): drop type collapse + gate verb timer + DEFAULT_INDICATOR_STYLE Round 2 Copilot review on PR #17150: - `tui_status_indicator?: 'ascii' \| ... \| string` collapses to `string` in TS — consumers got no narrowing. Documented as plain `string` with a comment about runtime validation via `normalizeIndicatorStyle`. - `FaceTicker` always started a 2.5s verb interval, even for the `unicode` style which hides the verb entirely. Now gated on `showVerb` from `renderIndicator` — `unicode` stays calm. Pre-emptive self-review (avoid round 3): - Three call sites duplicated the literal `'kaomoji'` default (uiStore, normalizeIndicatorStyle, slash command). Added `DEFAULT_INDICATOR_STYLE` to interfaces.ts and threaded it through so changing the default touches one line. * fix(tui-gateway): normalize config.get indicator output to match TUI render Round 4 Copilot review on PR #17150: `config.get` for `indicator` returned the raw `display.tui_status_indicator` value without validation, so a hand-edited config.yaml with stray casing or an unknown style would leave `/indicator` printing one thing while the TUI rendered the kaomoji default (frontend's `normalizeIndicatorStyle` does this normalization on receive). Lifted the allow-list to module scope as `_INDICATOR_STYLES` / `_INDICATOR_DEFAULT`, reused by both `config.set` and `config.get`. Comment notes the alignment with `INDICATOR_STYLES` / `DEFAULT_INDICATOR_STYLE` in interfaces.ts so adding/removing a style is a one-line change on each end. Tests cover: known value verbatim, casing/whitespace normalize, unknown→default, unset→default. * fix(tui-gateway): preserve falsy-input diagnostics in config.set indicator error Round 5 Copilot review on PR #17150: `raw = str(value or "").strip().lower()` collapsed any falsy non-string (`0`, `False`, `[]`) to empty string, so the error message read `unknown indicator: ` with nothing after — losing the original input. Switched to `("" if value is None else str(value)).strip().lower()` so only `None` (the genuine 'no value' case) becomes blank. Used `{raw!r}` in the error so the diagnostic is unambiguous (`'0'` vs `0`). Tests: - known-value happy path (`'EMOJI'` → `'emoji'`) - falsy non-string inputs (`0` / `False` / `[]`) surface meaningfully - `None` keeps the blank-repr error	2026-04-28 18:19:16 -05:00
brooklyn!	87d3fa6f1c	feat(tui): opt-in auto-resume of the most recent session (#17130 ) * feat(tui): opt-in auto-resume of the most recent session `hermes --tui` always forges a fresh session at startup unless the user sets `HERMES_TUI_RESUME=<id>`. Disconnects, terminal-window crashes, and accidental Ctrl+D therefore lose every piece of in-flight context even though `state.db` still has the full history a `/resume` away. Add an opt-in path that mirrors classic CLI's `hermes -c` muscle memory: when `display.tui_auto_resume_recent: true` is set in `~/.hermes/config.yaml`, the TUI looks up the most recent human-facing session and resumes it instead of starting fresh. Default off so existing users aren't surprised; explicit `HERMES_TUI_RESUME` always wins. Wires: * New `session.most_recent` JSON-RPC in `tui_gateway/server.py` that returns the first non-`tool` row from `list_sessions_rich`, or `{"session_id": null}` when none. Uses the same deny-list as `session.list` so sub-agent rows can't sneak in. * `createGatewayEventHandler.handleReady` re-ordered: explicit `STARTUP_RESUME_ID` first (unchanged), then conditional auto-resume via `config.get full → display.tui_auto_resume_recent`, then the legacy `newSession()` fallback. Failures of either RPC fall back to `newSession()` so the path is always finite. * Default `display.tui_auto_resume_recent: False` added to `DEFAULT_CONFIG` in `hermes_cli/config.py` (no `_config_version` bump per AGENTS.md — deep-merge handles the additive key). Tests: * 4 new vitest cases in `createGatewayEventHandler.test.ts` cover every gate-and-fallback combination (env wins, config off, config on with hit, config on with miss). * 3 new pytest cases for `session.most_recent` (denied row skip, tool-only → null, db-unavailable → null). Validation: scripts/run_tests.sh tests/test_tui_gateway_server.py — 93/93. cd ui-tui && npm run type-check — clean; npm test --run — 393/393. * review(copilot): fold session.most_recent errors into null + extend ConfigDisplayConfig * review(copilot): cover RPC-rejection fallbacks in auto-resume tests	2026-04-28 16:53:38 -05:00
kshitijk4poor	5d2f9b5d7d	fix: follow-up for salvaged PR #17061 - Remove dead _lmstudio_loaded_context attribute from run_agent.py (set but never read — the loaded context is pushed to context_compressor.update_model which is the actual consumer) - Cache empty reasoning options with 60s TTL to avoid per-turn HTTP probe for non-reasoning LM Studio models. Non-empty results cached permanently. - Extract _lmstudio_server_root(), _lmstudio_request_headers(), and _lmstudio_fetch_raw_models() shared helpers in models.py — eliminates URL-strip + auth-header + HTTP-call duplication across probe_lmstudio_models, ensure_lmstudio_model_loaded, and lmstudio_model_reasoning_options - Revert runtime_provider.py base_url precedence change: preserve the established contract (saved config.base_url > env var > default) for all api_key providers - Remove unnecessary config version bump 22→23 - Fix TUI test: relax target_model assertion to avoid module-cache flake - AUTHOR_MAP: added rugved@lmstudio.ai → rugvedS07	2026-04-28 12:27:36 -07:00
Rugved Somwanshi	214ca943ac	feat(agent): add lmstudio integration	2026-04-28 12:27:36 -07:00
Teknium	df51ad7973	perf(config): mtime-cache load_config() and read_raw_config() (#17041 ) load_config() and read_raw_config() now cache their result keyed on the config file's (mtime_ns, size). On cache hit they return a deepcopy of the cached value, skipping yaml.safe_load + deep-merge + normalize + env-var expansion entirely. save_config() + migrate_config() write via atomic_yaml_write which produces a fresh inode, so stat() sees a new mtime_ns and the next load repopulates automatically — no explicit invalidation hook needed. Measured per-call cost: load_config() cold: 13.3 ms load_config() cached: 0.23 ms (57x faster) read_raw_config() cached: 0.13 ms A single gateway turn hits the config 5-15 times (session context, auxiliary client resolution, memory config, plugin hooks, approval lookups, per-tool settings). That's 65-200 ms/turn of pure YAML re-parsing on main. After this change: 1-3 ms/turn. Also migrates gateway/run.py's 6 direct yaml.safe_load(config.yaml) call sites through _load_gateway_config, which now shares the read_raw_config cache when _hermes_home agrees with the canonical config path. The direct-read fallback is retained for tests that monkeypatch gateway_run._hermes_home without touching HERMES_HOME. Safety: - load_config() returns a deepcopy on every call; the 67+ call sites that mutate the result (cfg["model"]["default"] = ..., etc.) can't corrupt the cache. - save_config() / atomic_yaml_write bump mtime, naturally invalidating the cache for the next reader. - Cache is keyed on str(config_path), so HERMES_HOME profile switches don't collide. Verified: - 112 config tests pass (test_config, test_config_env_expansion, test_config_env_refs, test_config_drift, test_config_validation, test_aux_config). - 87 gateway tests pass (test_verbose_command, test_session_info, test_compress_focus, test_runtime_footer, test_resume_command, test_reasoning_command, test_approve_deny_commands, test_run_progress_interrupt). - Live hermes chat smoke — 2 turns + /model switch + tool calls, zero errors in agent.log. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 07:06:35 -07:00
Teknium	5ed1eb0d0f	docs(config): surface telegram.reactions in DEFAULT_CONFIG (#17028 ) The telegram.reactions key was already wired up (gateway/config.py bridges it to TELEGRAM_REACTIONS at startup) but was undocumented and missing from DEFAULT_CONFIG, so users had no way to discover it. Add it with the existing off-by-default behavior preserved. No behavior change — runtime default stays False. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 07:02:30 -07:00
Teknium	e123f4ecf0	feat(gateway): opt-in runtime-metadata footer on final replies (#17026 ) Append a compact 'model · 68% · ~/projects/hermes' footer to the FINAL message of each turn, disabled by default (display.runtime_footer.enabled). Answers the Telegram-side parity ask: runtime context that the CLI status bar already shows is now available in messaging replies when enabled. Wiring: - gateway/runtime_footer.py: resolve_footer_config + format_runtime_footer + build_footer_line. Pure-function renderer; per-platform overrides under display.platforms.<platform>.runtime_footer. - gateway/run.py: appends footer to response right after reasoning prepend so it lands only on the final message (never tool progress or streaming chunks). When streaming already delivered the body (already_sent), the footer is sent as a small trailing message instead. - agent_result now exposes context_length alongside last_prompt_tokens so the footer can compute the pct; both gateway return paths updated. - /footer [on\|off\|status] slash command, wired in CLI (cli.py) and gateway (gateway/run.py both running-agent bypass and main dispatch). Global toggle only; per-platform overrides via config.yaml. Graceful degradation: - Missing context_length (unknown model) → pct field silently dropped (no '?%' artifact). - Empty final_response → no footer appended. - Unknown field names in config → silently ignored. Tests: 25-case unit suite (tests/gateway/test_runtime_footer.py) plus E2E harness covering streaming vs non-streaming branches, per-platform override, and the exact argument contract gateway/run.py uses. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 06:50:04 -07:00
Teknium	72dea9f4f7	feat(gateway): make hygiene hard message limit configurable (#17000 ) The gateway session-hygiene pre-compression safety valve had a hardcoded 400-message threshold. On long-lived sessions with short turns this was either too high (users with aggressive compression preferences) or too low (users with very large context models who want to keep more history in-flight). Add compression.hygiene_hard_message_limit (default 400) so it can be tuned without forking the gateway. Reported by @OP (Apr 26 feedback bundle). ## Changes - hermes_cli/config.py: new DEFAULT_CONFIG key with 400 default - gateway/run.py: read compression.hygiene_hard_message_limit at hygiene-time, fall back to 400 if missing/invalid - tests/gateway/test_session_hygiene.py: two tests — override fires at the configured limit, default does not fire below 400 Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 05:43:12 -07:00
teknium1	7444e49d4e	fix(gateway): use transcript timestamp for auto-continue freshness Follow-up to PR #16802 (BeliefanX). The original fix read `agent_history[-1].get("timestamp")` for the tool-tail freshness gate, but `gateway/run.py` strips the `timestamp` field off all tool/tool_call rows when building `agent_history` from the raw transcript (see `clean_msg = {k: v for k, v in msg.items() if k != "timestamp"}`). At runtime the tool-tail branch always saw `None` and silently took the legacy-fresh path — the stale-guard never fired for the tool-tail case it was supposed to cover. Changes: - Read the freshness signal from the RAW `history` list (via new `_last_transcript_timestamp()` helper) BEFORE the strip. Both the resume_pending branch and the tool-tail branch use this single signal, replacing the two divergent ones. - Default window bumped 15 min → 1 hour via new `_AUTO_CONTINUE_FRESHNESS_SECS_DEFAULT`. The 15-minute default was shorter than the default `gateway_timeout` of 30 min, so a legitimate long-running turn interrupted near its timeout boundary and resumed shortly after would have been misclassified as stale. - Configurable via `config.yaml` `agent.gateway_auto_continue_freshness` (bridged to `HERMES_AUTO_CONTINUE_FRESHNESS` at gateway startup — same pattern as `gateway_timeout`). Set to 0 to disable the gate. - `_coerce_gateway_timestamp` now explicitly rejects bool (which is a subclass of int and would otherwise coerce to 0.0/1.0). - Tests rewritten to exercise the real production data shape: raw `history` → `_build_agent_history` strip → freshness decision. A regression guard (`test_stale_tool_tail_with_production_data_shape`) asserts `agent_history` tool rows carry NO timestamp, protecting against someone "fixing" the original bug by re-adding the stripped field (which would break the OpenAI tool-result message contract). Add BeliefanX to scripts/release.py AUTHOR_MAP. E2E verified: config.yaml → env var bridge → helper returns configured value; default 1h window; malformed/empty env var falls back to default; ISO-Z timestamps parse; ms-epoch coerced; bool rejected.	2026-04-28 05:20:35 -07:00
Teknium	b61d9b297a	refactor: consolidate symlink-safe atomic replace into shared helper Extract the islink/realpath guard from the 16743 fix into a single atomic_replace() helper in utils.py, then migrate every os.replace() call site in the codebase to use it. The original PR #16777 correctly identified and fixed the bug, but only patched 9 of ~24 call sites. The same bug class (managed deployments that symlink state files silently losing the link on every write) still existed at auth.json, sessions file, gateway config, env_loader, webhook subscriptions, debug store, model catalog, pairing, google OAuth, nous rate guard, and more. Rather than add another 10+ copies of the same three-line guard, consolidate into atomic_replace(tmp, target) which: - resolves symlinks via os.path.realpath before os.replace - returns the resolved real path so callers can re-apply permissions - is a drop-in replacement for os.replace at the use sites Changes: - utils.py: new atomic_replace() helper + atomic_json_write / atomic_yaml_write now call it instead of inlining the guard - 16 files: all os.replace() call sites migrated to atomic_replace() - agent/{google_oauth, nous_rate_guard, shell_hooks}.py - cron/jobs.py - gateway/{pairing, session, platforms/telegram}.py - hermes_cli/{auth, config, debug, env_loader, model_catalog, webhook}.py - tools/{memory_tool, skill_manager_tool, skills_sync}.py Tests: tests/test_atomic_replace_symlinks.py pins the invariant for atomic_replace + atomic_json_write + atomic_yaml_write, covers plain files, first-time creates, broken symlinks, and permission preservation. Refs #16743 Builds on #16777 by @vominh1919.	2026-04-28 04:58:22 -07:00
vominh1919	3ab97a32d1	fix: preserve symlinks during atomic file writes (#16743 ) os.replace(tmp, path) replaces the symlink itself with a regular file, breaking users who symlink config.yaml, SOUL.md, or .env from ~/.hermes/ to a dotfiles repo or managed profile package. Fix: resolve symlinks via os.path.realpath() before os.replace(), so the real file is overwritten in-place while the symlink survives. Fixed in 7 files covering all os.replace call sites: - utils.py (atomic_json_write, atomic_yaml_write — fixes save_config) - hermes_cli/config.py (env sanitizer, save_env_value, remove_env_value) - tools/skill_manager_tool.py (_atomic_write_text — SOUL.md writes) - tools/memory_tool.py (memory file writes) - tools/skills_sync.py (manifest writes) - cron/jobs.py (job state + output file writes) - agent/shell_hooks.py (hook file writes) Fixes NousResearch/hermes-agent#16743	2026-04-28 04:58:22 -07:00
Teknium	bd10acd747	fix(providers): honor key_env/api_key_env on Azure Anthropic + accept alias in normalizer (#16935 ) Three related fixes around custom env-var-name hints for provider entries. 1. Azure Anthropic path: previously hardcoded to look up AZURE_ANTHROPIC_KEY then ANTHROPIC_API_KEY with no way to override. If a user wrote model: provider: anthropic base_url: https://my-resource.services.ai.azure.com/anthropic key_env: MY_CUSTOM_KEY the key_env hint was silently ignored and the resolver raised 'No Azure Anthropic API key found' even when MY_CUSTOM_KEY was set in the environment. The runtime now checks, in order: (1) os.getenv(model_cfg.key_env) (2) os.getenv(model_cfg.api_key_env) # docs alias (3) model_cfg.api_key # inline value (4) AZURE_ANTHROPIC_KEY # historical default (5) ANTHROPIC_API_KEY # historical default Error message updated to mention key_env as an option. 2. Provider entry normalizer (_normalize_custom_provider_entry): accept 'api_key_env' as a snake_case alias for 'key_env', and 'apiKeyEnv' as a camelCase alias. Adds both to the _KNOWN_KEYS set so the 'unknown config keys ignored' warning doesn't fire on valid configs. 3. _VALID_CUSTOM_PROVIDER_FIELDS: add 'key_env'. That set documents supported custom_providers entry fields; it was drifting from reality since key_env has been read at runtime in auxiliary_client.py, runtime_provider.py, and main.py for a while. Docs: website/docs/guides/azure-foundry.md now uses the canonical key_env field and notes that api_key_env / keyEnv / apiKeyEnv are accepted as aliases. Validation: 12 new tests in test_runtime_provider_resolution.py covering all 5 Azure Anthropic resolution paths + 4 normalizer-alias tests. Pass rate across related suites (165 + 46 tests): 100%. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 02:12:08 -07:00
kshitijk4poor	42cc905c13	feat(plugins): add bundled observability/langfuse plugin Opt-in Langfuse tracing for Hermes conversations — LLM calls, tool usage, usage/cost breakdown per span. Hooks into pre/post_api_request, pre/post_llm_call, pre/post_tool_call. SDK is optional; missing SDK or credentials renders the plugin inert. Salvaged from PR #16845 by @kshitijk4poor, who wrote the plugin (~875 LOC, 6 hooks, Langfuse usage-details/cost-details normalization, read_file payload summarization). Salvage scope (why this isn't PR #16845 as-authored): - Lives at plugins/observability/langfuse/ (standalone kind, opt-in via plugins.enabled) instead of a new parallel optional-plugins/ directory. Standalone bundled plugins are already opt-in — only their plugin.yaml is scanned at startup; the Python module is not imported unless the user enables it. The premise of optional-plugins/ (avoid import cost for users who don't want it) is already solved by the existing plugin system. - Dropped the triple activation gate (plugins.enabled + plugins.langfuse.enabled + HERMES_LANGFUSE_ENABLED). The Hermes plugin system's own enable/disable is authoritative; runtime credentials gate whether the hook actually traces. - Rewrote _is_enabled() → cached _get_langfuse() with an _INIT_FAILED sentinel. The original called hermes_cli.config.load_config() from every hook invocation (full yaml parse + deep merge + env expansion on every pre/post_tool_call, potentially 100+ times per turn). The cached version reads env once and returns the cached client or None on every subsequent call with zero further work. - hermes tools → Langfuse Observability post-setup adds observability/langfuse to plugins.enabled directly (via _save_enabled_set) instead of going through an install-copy flow. Enable: hermes tools # interactive hermes plugins enable observability/langfuse # manual Required env (set by `hermes tools` or in ~/.hermes/.env): HERMES_LANGFUSE_PUBLIC_KEY HERMES_LANGFUSE_SECRET_KEY HERMES_LANGFUSE_BASE_URL # optional Co-authored-by: kshitijk4poor <kshitijk4poor@gmail.com>	2026-04-28 01:40:59 -07:00
Surat Srichan	a8f9c56cb4	fix(config): accept fallback_model list (chain) in validator + save Runtime already supports list-form fallback_model (run_agent.py:1459 iterates fallback_chain; fallback_cmd.py migrates legacy single-dict configs to list format). The config validator and save_config comment gate still assumed single-dict form and flagged list-form configs as errors. Fix both: - validate_config_structure: when fallback_model is a list, validate each entry has provider+model; keep the existing single-dict path. - save_config: suppress the "add fallback_model" comment when any list entry is well-formed. Adds 4 list-form validator tests.	2026-04-28 01:40:25 -07:00
vominh1919	0169c51820	fix(config): add request_timeout_seconds and stale_timeout_seconds to provider _KNOWN_KEYS Both keys are documented in cli-config.yaml.example and read at runtime by hermes_cli/timeouts.py (get_provider_request_timeout and get_provider_stale_timeout), but the provider-entry validator in config.py flagged them as unknown, producing noisy warnings on every CLI invocation for users who followed the documented config. Fixes #16779	2026-04-28 01:28:25 -07:00
ztexydt-cqh	1d5e25f353	fix(gateway): persist /sethome home channel to .env across all platforms _handle_set_home_command wrote FEISHU_HOME_CHANNEL / DISCORD_HOME_CHANNEL / etc. as top-level keys into config.yaml, but load_gateway_config() only reads home channels from env vars. After every gateway restart the home channel was lost — on every platform, not just Feishu. Fix: switch /sethome to save_env_value(), which atomically writes to ~/.hermes/.env and updates the current process env in one shot. The handler builds the env key from platform_name.upper(), so one line change repairs /sethome for every platform that has a HOME_CHANNEL env var. Also widen _EXTRA_ENV_KEYS in hermes_cli/config.py so HOME_CHANNEL and HOME_CHANNEL_NAME for every platform are treated as managed env vars: SIGNAL, SLACK, SMS, DINGTALK, BLUEBUBBLES, FEISHU, WECOM, YUANBAO, plus the missing *_NAME variants for DISCORD/TELEGRAM/MATTERMOST. Closes #16806 Co-authored-by: teknium1 <screenmachine@gmail.com>	2026-04-28 01:17:17 -07:00
Teknium	8081425a1c	feat(security): make secret redaction off by default (#16794 ) Flips security.redact_secrets from true to false in DEFAULT_CONFIG, and the HERMES_REDACT_SECRETS env-var fallback in agent/redact.py now requires explicit opt-in ("1"/"true"/"yes"/"on") to enable. New installs and users without a security.redact_secrets key get pass- through tool output. Existing users whose config.yaml explicitly sets redact_secrets: true keep redaction on — the config-yaml -> env-var bridges in hermes_cli/main.py and gateway/run.py still honor their setting. Also updates the inline config comments, website docs, and the hermes-agent skill so /hermes config set security.redact_secrets true is now the documented way to turn it on.	2026-04-27 21:24:08 -07:00
Adam Rummer	1eab5960f0	feat(matrix): add dm_auto_thread config for DM auto-threading Adds MATRIX_DM_AUTO_THREAD env var (default: false) to control auto-threading in DM rooms independently from channel auto-threading. Closes #15398	2026-04-27 21:22:44 -07:00
kshitijk4poor	56724147ef	fix(providers/gmi): post-salvage review fixes - config.py: remove dead ENV_VARS_BY_VERSION[17] entry (current _config_version is 22, so all users are past version 17 and would never be prompted for GMI_API_KEY on upgrade — consistent with how arcee was added) - auxiliary_client.py: use google/gemini-3.1-flash-lite-preview as GMI aux model instead of anthropic/claude-opus-4.6 (matches cheap fast-model pattern used by all other providers: zai→glm-4.5-flash, kimi→kimi-k2-turbo-preview, stepfun→step-3.5-flash, kilocode→google/gemini-3-flash-preview) - test_gmi_provider.py: fix malformed write_text() call in doctor test (was: write_text("GMI_API_KEY=* encoding="utf-8") → missing closing quote, wrote literal string 'GMI_API_KEY=* encoding=' to .env file) - test_gmi_provider.py + test_auxiliary_client.py: update aux model assertions to match new cheaper default - docs/integrations/providers.md: add 'gmi' to inline 'Supported providers' fallback list (was only in the table, not the inline list at line ~1181) - docs/reference/cli-commands.md: add 'gmi' to --provider choices list	2026-04-27 11:17:59 -07:00
Isaac Huang	c53fcb0173	feat(providers): add GMI Cloud as a first-class API-key provider (#11955 ) Add GMI Cloud (api.gmi-serving.com) as a full first-class API-key provider with built-in auth, aliases, model catalog, CLI entry points, auxiliary client routing, context length resolution, doctor checks, env var tracking, and docs. - auth.py: ProviderConfig for 'gmi' (api_key, GMI_API_KEY / GMI_BASE_URL) - providers.py: HermesOverlay with extra_env_vars for models.dev detection - models.py: curated slash-form model catalog; live /v1/models fetch - main.py: 'gmi' in _named_custom_provider_map and --provider choices - model_metadata.py: _URL_TO_PROVIDER, _PROVIDER_PREFIXES, dedicated context-length probe block (GMI's /models has authoritative data) - auxiliary_client.py: alias entries; _compat_model fix for slash-form models on cached aggregator-style clients; gmi aux default model - doctor.py: GMI in provider connectivity checks - config.py: GMI_API_KEY / GMI_BASE_URL in OPTIONAL_ENV_VARS - conftest.py: explicit GMI_BASE_URL clearing (not caught by _API_KEY suffix) - docs: providers.md, environment-variables.md, fallback-providers.md, configuration.md, quickstart.md (expands provider table) Co-authored-by: Isaac Huang <isaachuang@Isaacs-MacBook-Pro.local>	2026-04-27 11:17:59 -07:00
Teknium	ea3c5a14c3	feat(update): make pre-update backup opt-in (off by default) (#16566 ) The zip backup could add minutes to every 'hermes update' on large HERMES_HOME directories. Flip the default to off and add a --backup flag for one-off opt-in runs. - updates.pre_update_backup default: True -> False - hermes update: new --backup flag (opposite of existing --no-backup) - Silent no-op when disabled (no message spam on every update) - Existing --no-backup still works and wins over --backup - Users who explicitly set pre_update_backup: true keep the old behavior - Tests updated to cover default-off, --backup opt-in, and config-enabled paths	2026-04-27 06:36:35 -07:00
Teknium	ec671c4154	feat(image-input): native multimodal routing based on model vision capability (#16506 ) * feat(image-input): native multimodal routing based on model vision capability Attach user-sent images as OpenAI-style content parts on the user turn when the active model supports native vision, so vision-capable models see real pixels instead of a lossy text description from vision_analyze. Routing decision (agent/image_routing.py::decide_image_input_mode): agent.image_input_mode = auto \| native \| text (default: auto) In auto mode: - If auxiliary.vision.provider/model is explicitly configured, keep the text pipeline (user paid for a dedicated vision backend). - Else if models.dev reports supports_vision=True for the active provider/model, attach natively. - Else fall back to text (current behaviour). Call sites updated: gateway/run.py (all messaging platforms), tui_gateway (dashboard/Ink), cli.py (interactive /attach + drag-drop). run_agent.py changes: - _prepare_anthropic_messages_for_api now passes image parts through unchanged when the model supports vision — the Anthropic adapter translates them to native image blocks. Previous behaviour (vision_analyze → text) only runs for non-vision Anthropic models. - New _prepare_messages_for_non_vision_model mirrors the same contract for chat.completions and codex_responses paths, so non-vision models on any provider get text-fallback instead of failing at the provider. - New _model_supports_vision() helper reads models.dev caps. vision_analyze description rewritten: positions it as a tool for images NOT already visible in the conversation (URLs, tool output, deeper inspection). Prevents the model from redundantly calling it on images already attached natively. Config default: agent.image_input_mode = auto. Tests: 35 new (test_image_routing.py + test_vision_aware_preprocessing.py), all existing tests that reference _prepare_anthropic_messages_for_api still pass (198 targeted + new tests green). * feat(image-input): size-cap + resize oversized images, charge image tokens in compressor Two follow-ups that make the native image routing safer for long / heavy sessions: 1) Oversize handling in build_native_content_parts: - 20 MB ceiling per image (matches vision_tools._MAX_BASE64_BYTES, the most restrictive provider — Gemini inline data). - Delegates to vision_tools._resize_image_for_vision (Pillow-based, already battle-tested) to downscale to 5 MB first-try. - If Pillow is missing or resize still overshoots, the image is dropped and reported back in skipped[]; caller falls back to text enrichment for that image. 2) Image-token accounting in context_compressor: - New _IMAGE_TOKEN_ESTIMATE = 1600 (matches Claude Code's constant; within the realistic range for Anthropic/GPT-4o/Gemini billing). - _content_length_for_budget() helper: sums text-part lengths and charges _IMAGE_CHAR_EQUIVALENT (1600 * 4 chars) per image/image_url/ input_image part. Base64 payload inside image_url is NOT counted as chars — dimensions don't matter, only image-presence. - Both tail-cut sites (_prune_old_tool_results L527 and _find_tail_cut_by_tokens L1126) now call the helper so multi-image conversations don't slip past compression budget. Tests: 9 new in test_image_routing.py (oversize triggers resize, resize-fails-returns-None, oversize-skipped-reported), 11 new in test_compressor_image_tokens.py (flat charge per image, multiple images, Responses-API / Anthropic-native / OpenAI-chat shapes, no-inflation on raw base64, bounds-check on the constant, integration test that an image-heavy tail actually gets trimmed). * fix(image-input): replace blanket 20MB ceiling with empirically-verified per-provider limits The previous commit imposed a hardcoded 20 MB base64 ceiling on all providers, triggering auto-resize on anything larger. This was wrong in both directions: * Too loose for Anthropic — actual limit is 5 MB (returns HTTP 400 'image exceeds 5 MB maximum' above that). * Too strict for OpenAI / Codex / OpenRouter — accept 49 MB+ without complaint (empirically verified April 2026 with progressive PNG sizes). New behaviour: * _PROVIDER_BASE64_CEILING table: only anthropic and bedrock have a ceiling (5 MB, since bedrock-on-Claude shares Anthropic's decoder). * Providers NOT in the table get no ceiling — images attach at native size and we trust the provider to return its own error if it disagrees. A provider-specific 400 message is clearer than us guessing wrong and silently degrading image quality. * build_native_content_parts() gains a keyword-only provider arg; gateway/CLI/TUI pass the active provider so Anthropic users get auto-resize protection while OpenAI users don't pay it. * Resize target dropped from 5 MB to 4 MB to slide safely under Anthropic's boundary with header overhead. Empirical measurements (direct API, no Hermes in the loop): image b64 anthropic openrouter/gpt5.5 codex-oauth/gpt5.5 0.19 MB ✓ ✓ ✓ 12.37 MB ✗ 400 5MB ✓ ✓ 23.85 MB ✗ 400 5MB ✓ ✓ 49.46 MB ✗ 413 ✓ ✓ Tests: rewrote TestOversizeHandling (5 tests): no-ceiling pass-through, Anthropic resize fires, Anthropic skip on resize-fail, build_native_parts routes ceiling by provider, unknown provider gets no ceiling. All 52 targeted tests pass. * refactor(image-input): attempt native, shrink-and-retry on provider reject Replace proactive per-provider size ceilings with a reactive shrink path on the provider's actual rejection. All providers now attempt native full-size attachment first; if the provider returns an image-too-large error, the agent silently shrinks and retries once. Why the previous design was wrong: hardcoding provider ceilings (anthropic=5MB, others=unlimited) meant OpenAI users on a 10MB image paid no tax, but Anthropic users lost quality on anything >5MB even though the empirical behaviour at provider-reject time is the same (shrink + retry). Baking the table into the routing layer also requires updating Hermes every time a provider's limit changes. Reactive design: - image_routing.py: _file_to_data_url encodes native size, no ceiling. build_native_content_parts drops its provider kwarg. - error_classifier.py: new FailoverReason.image_too_large + pattern match ("image exceeds", "image too large", etc.) checked BEFORE context_overflow so Anthropic's 5MB rejection lands in the right bucket. - run_agent.py: new _try_shrink_image_parts_in_messages walks api messages in-place, re-encodes oversized data: URL image parts through vision_tools._resize_image_for_vision to fit under 4MB, handles both chat.completions (dict image_url) and Responses (string image_url) shapes, ignores http URLs (provider-fetched). New image_shrink_retry_attempted flag in the retry loop fires the shrink exactly once per turn after credential-pool recovery but before auth retries. E2E verified live against Anthropic claude-sonnet-4-6: - 17.9MB PNG (23.9MB b64) attached at native size - Anthropic returns 400 "image exceeds 5 MB maximum" - Agent logs '📐 Image(s) exceeded provider size limit — shrank and retrying...' - Retry succeeds, correct response delivered in 6.8s total. Tests: 12 new (8 shrink-helper shapes + 4 classifier signals), replaces 5 proactive-ceiling tests with 3 simpler 'native attach works' tests. 181 targeted tests pass. test_enum_members_exist in test_error_classifier.py updated for the new enum value.	2026-04-27 06:27:59 -07:00
Teknium	8ed599dc05	feat(update): auto-backup HERMES_HOME before hermes update (#16539 ) Every 'hermes update' now runs a full backup of ~/.hermes/ first, so users can always roll back to the exact state they had before the update if anything goes wrong (corrupted sessions.db, broken skills, config migrations that don't round-trip, etc.). Changes: - hermes_cli/backup.py: new create_pre_update_backup() helper. Writes to <HERMES_HOME>/backups/pre-update-<stamp>.zip using the same exclusion rules and SQLite safe-copy as 'hermes backup'. Auto-rotates (keep last N, pre-update-*.zip only — hand-dropped zips in backups/ are untouched). Adds 'backups' to _EXCLUDED_DIRS so subsequent backups don't nest prior ones. - hermes_cli/main.py: _run_pre_update_backup() wired into _cmd_update_impl before any git operation. Prints save path, restore command, and how to disable. Swallows failures so a broken backup never blocks the update itself. New --no-backup flag on 'hermes update' for one-off override. - hermes_cli/config.py: new 'updates' section in DEFAULT_CONFIG with pre_update_backup (default true) and backup_keep (default 5). Auto-surfaces in the dashboard config UI. - tests/hermes_cli/test_backup.py: +11 tests covering backup location, content parity with 'hermes backup', no-recursion, rotation, manual file preservation, config gate, --no-backup flag, flag-wins-over-config.	2026-04-27 05:36:19 -07:00
Teknium	478444c262	feat(checkpoints): auto-prune orphan and stale shadow repos at startup (#16303 ) Every working dir hermes ever touches gets its own shadow git repo under ~/.hermes/checkpoints/{sha256(abs_dir)[:16]}/. The per-repo _prune is a no-op (comment in CheckpointManager._prune says so), so abandoned repos from deleted/moved projects or one-off tmp dirs pile up forever. Field reports put the typical offender at 1000+ repos / ~12 GB on active contributor machines. Adds an opt-in startup sweep that mirrors the sessions.auto_prune pattern from #13861 / #16286: - tools/checkpoint_manager.py: new prune_checkpoints() and maybe_auto_prune_checkpoints() helpers. Deletes shadow repos that are orphan (HERMES_WORKDIR marker points to a path that no longer exists) or stale (newest in-repo mtime older than retention_days). Idempotent via a CHECKPOINT_BASE/.last_prune marker file so it only runs once per min_interval_hours regardless of how many hermes processes start up. - hermes_cli/config.py: new checkpoints.auto_prune / retention_days / delete_orphans / min_interval_hours knobs. Default auto_prune: false so users who rely on /rollback against long-ago sessions never lose data silently. - cli.py / gateway/run.py: startup hooks gated on checkpoints.auto_prune, called right next to the existing state.db maintenance block. - Docs updated with the new config knobs. - 11 regression tests: orphan/stale deletion, precedence, byte-freed tracking, non-shadow dir skip, interval gating, corrupt marker recovery. Refs #3015 (session-file disk growth was fixed in #16286; this covers the checkpoint side noted out-of-scope there).	2026-04-26 19:05:52 -07:00
Teknium	55e9329ee6	feat(config): register bundled-skill API keys in OPTIONAL_ENV_VARS Adds NOTION_API_KEY, LINEAR_API_KEY, TENOR_API_KEY, and AIRTABLE_API_KEY to OPTIONAL_ENV_VARS so: - They persist to ~/.hermes/.env via save_env_value like every other key Hermes knows about, instead of being ad-hoc variables the user has to hand-edit the dotfile for. - load_env() / reload_env() populate os.environ from .env on every startup — the user sets the key once, skills keep working across restarts without losing access. - hermes setup / hermes config show surface them as known optional vars with the correct signup URL (linear.app/settings/api, airtable.com/create/tokens, etc.). These four entries use category="skill" (new) rather than "tool". tools/environments/local.py auto-adds every category=tool/messaging entry to _HERMES_PROVIDER_ENV_BLOCKLIST, which stops env passthrough from leaking provider credentials into the execute_code sandbox (GHSA-rhgp-j443-p4rf). Skill API keys are the opposite case — the point is for the agent's subprocess to see them so curl can read Authorization headers — so they must be outside the blocklist. The new category is inert for that check. All four entries are advanced=True: they show up in 'hermes config' and 'hermes status' displays, but do not nag users who have never touched those skills during setup checklists. E2E verified: save_env_value → reload_env → os.environ populated → skill_view reports setup_needed=False → env_passthrough registers the key for subprocess inheritance.	2026-04-26 18:45:15 -07:00
Teknium	635253b918	feat(busy): add 'steer' as a third display.busy_input_mode option (#16279 ) Enter while the agent is busy can now inject the typed text via /steer — arriving at the agent after the next tool call — instead of interrupting (current default) or queueing for the next turn. Changes: - cli.py: keybinding honors busy_input_mode='steer' by calling agent.steer(text) on the UI thread (thread-safe), with automatic fallback to 'queue' when the agent is missing, steer() is unavailable, images are attached, or steer() rejects the payload. /busy accepts 'steer' as a fourth argument alongside queue/interrupt/status. - gateway/run.py: busy-message handler and the PRIORITY running-agent path both route through running_agent.steer() when the mode is 'steer', with the same fallback-to-queue safety net. Ack wording tells users their message was steered into the current run. Restart-drain queueing now also activates for 'steer' so messages aren't lost across restarts. - agent/onboarding.py: first-touch hint has a steer branch for both CLI and gateway. - hermes_cli/commands.py: /busy args_hint updated to include steer, and 'steer' is registered as a subcommand (completions). - hermes_cli/web_server.py: dashboard select widget offers steer. - hermes_cli/config.py, cli-config.yaml.example, hermes_cli/tips.py: inline docs updated. - website/docs/user-guide/cli.md + messaging/index.md: documented. - Tests: steer set/status path for /busy; onboarding hints; _load_busy_input_mode accepts steer; busy-session ack exercises steer success + two fallback-to-queue branches. Requested on X by @CodingAcct. Default is unchanged (interrupt).	2026-04-26 18:21:29 -07:00
Teknium	42c076d349	feat(browser): auto-spawn local Chromium for LAN/localhost URLs in cloud mode (#16136 ) When a cloud browser provider (Browserbase / Browser-Use / Firecrawl) is configured, browser_navigate now transparently spawns a local Chromium sidecar for URLs whose host resolves to a private/loopback/LAN address (localhost, 127.0.0.1, 192.168.x.x, 10.x.x.x, .local, .lan, *.internal, ::1, 169.254.x.x). Public URLs continue to use the cloud provider in the same conversation. Previously, setting BROWSERBASE_API_KEY / cloud_provider: browserbase pinned the whole tool to cloud for the process — localhost URLs were either SSRF-blocked (default) or sent to Browserbase (where they 404'd because the cloud can't reach your LAN). Users who wanted 'cloud for public, local for localhost' had no way to express it short of toggling providers mid-session. Implementation uses a composite session key scheme: the bare task_id serves the cloud session, and a '{task_id}::local' sidecar serves the local Chromium. _last_active_session_key[task_id] tracks which of the two served the most recent nav so snapshot/click/fill/etc. hit the correct one. cleanup_browser(bare_task_id) reaps both. Feature is on by default. Opt out via: browser: auto_local_for_private_urls: false The cloud provider never sees private URLs. Post-redirect SSRF guard is preserved: redirects from public onto private addresses still block.	2026-04-26 09:57:58 -07:00
Teknium	83c1c201f6	feat(onboarding): contextual first-touch hints for /busy and /verbose (#16046 ) Instead of a blocking first-run questionnaire, show a one-time hint the first time the user hits each behavior fork: 1. First message while the agent is working — appends a hint to the busy-ack explaining the /busy queue vs /busy interrupt knob, phrased to match the mode that was just applied (don't tell a queue-mode user to switch to queue). 2. First tool that runs for >= 30s in the noisiest progress mode (tool_progress: all) — prints a hint about /verbose to cycle display modes (all -> new -> off -> verbose). Gated on /verbose actually being usable on the surface: always shown on CLI; on gateway only shown when display.tool_progress_command is enabled. Each hint is latched in config.yaml under onboarding.seen.<flag>, so it fires exactly once per install across CLI, gateway, and cron, then never again. Users can wipe the section to re-see hints. New: - agent/onboarding.py — is_seen / mark_seen / hint strings, shared by both CLI and gateway. - onboarding.seen in DEFAULT_CONFIG (hermes_cli/config.py) and in load_cli_config defaults (cli.py). No _config_version bump — deep merge handles new keys. Wired: - gateway/run.py: _handle_active_session_busy_message appends the hint after building the ack. progress_callback tracks tool.completed duration and queues the tool-progress hint into the progress bubble. - cli.py: CLI input loop appends the busy-input hint on the first busy Enter; _on_tool_progress appends the tool-progress hint on the first >=30s tool completion. In-memory CLI_CONFIG is also updated so subsequent fires in the same process are suppressed immediately. All writes go through atomic_yaml_write and are wrapped in try/except so onboarding can never break the input/busy-ack paths.	2026-04-26 06:06:27 -07:00
Teknium	855366909f	feat(models): remote model catalog manifest for OpenRouter + Nous Portal (#16033 ) OpenRouter and Nous Portal curated picker lists now resolve via a JSON manifest served by the docs site, falling back to the in-repo snapshot when unreachable. Lets us update model lists without shipping a release. Live URL: https://hermes-agent.nousresearch.com/docs/api/model-catalog.json (source at website/static/api/model-catalog.json; auto-deploys via the existing deploy-site.yml GitHub Pages pipeline on every merge to main). Schema (v1) carries id + optional description + free-form metadata at manifest, provider, and model levels. Pricing and context length stay live-fetched via existing machinery (/v1/models endpoints, models.dev). Config (new model_catalog section, default enabled): model_catalog.url master manifest URL model_catalog.ttl_hours disk cache TTL (default 24h) model_catalog.providers.<name>.url optional per-provider override Fetch pipeline: in-process cache -> disk cache (fresh < TTL) -> HTTP fetch -> disk-cache-on-failure fallback -> in-repo snapshot as last resort. Never raises to callers; at worst returns the bundled list. Changes: - website/static/api/model-catalog.json initial manifest (35 OR + 31 Nous) - scripts/build_model_catalog.py regenerator from in-repo lists - hermes_cli/model_catalog.py fetch + validate + cache module - hermes_cli/models.py fetch_openrouter_models() + new get_curated_nous_model_ids() - hermes_cli/main.py, hermes_cli/auth.py Nous flows use the helper - hermes_cli/config.py model_catalog defaults - website/docs/reference/model-catalog.md + sidebars.ts - tests/hermes_cli/test_model_catalog.py 21 tests (validation, fetch success/failure, accessors, disabled, overrides, integration)	2026-04-26 05:46:43 -07:00
TechPrototyper	3a7653dd1f	feat: Add Azure Foundry provider with OpenAI/Anthropic API mode selection Add support for Azure Foundry as a new inference provider. Azure Foundry endpoints can use either OpenAI-style (/v1/chat/completions) or Anthropic-style (/v1/messages) API formats. Changes: - Add azure-foundry to PROVIDER_REGISTRY (auth.py) - Add azure-foundry overlay in HERMES_OVERLAYS (providers.py) - Add empty model list for azure-foundry (models.py) - Add _model_flow_azure_foundry() interactive setup (main.py) - Add azure-foundry runtime resolution with api_mode support (runtime_provider.py) - Add AZURE_FOUNDRY_API_KEY and AZURE_FOUNDRY_BASE_URL env vars (config.py) Usage: hermes model -> More providers -> Azure Foundry The setup wizard prompts for: - Endpoint URL - API format (OpenAI or Anthropic-style) - API key - Model name Configuration is saved to config.yaml (model.provider, model.base_url, model.api_mode, model.default) and ~/.hermes/.env (AZURE_FOUNDRY_API_KEY).	2026-04-25 18:48:43 -07:00
Teknium	125de02056	fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K (#15844 ) Fixes #15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback. ## What changed New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching. `agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe). Wired through five call sites that previously either duplicated the lookup or ignored it entirely: - `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning) - `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch - `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg - `gateway/run.py` /model confirmation (picker callback + text path) - `gateway/run.py` `_format_session_info` (/info) ## Context probe tiers `CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency. ## Repro (from #15779) ```yaml custom_providers: - name: my-custom-endpoint base_url: https://example.invalid/v1 model: gpt-5.5 models: gpt-5.5: context_length: 1050000 ``` `/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000". ## Tests - `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants - `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for #15779 through the display resolver - `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K) - `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier	2026-04-25 18:47:53 -07:00
Teknium	ea01bdcebe	refactor(memory): remove flush_memories entirely (#15696 ) The AIAgent.flush_memories pre-compression save, the gateway _flush_memories_for_session, and everything feeding them are obsolete now that the background memory/skill review handles persistent memory extraction. Problems with flush_memories: - Pre-dates the background review loop. It was the only memory-save path when introduced; the background review now fires every 10 user turns on CLI and gateway alike, which is far more frequent than compression or session reset ever triggered flush. - Blocking and synchronous. Pre-compression flush ran on the live agent before compression, blocking the user-visible response. - Cache-breaking. Flush built a temporary conversation prefix (system prompt + memory-only tool list) that diverged from the live conversation's cached prefix, invalidating prompt caching. The gateway variant spawned a fresh AIAgent with its own clean prompt for each finalized session — still cache-breaking, just in a different process. - Redundant. Background review runs in the live conversation's session context, gets the same content, writes to the same memory store, and doesn't break the cache. Everything flush_memories claimed to preserve is already covered. What this removes: - AIAgent.flush_memories() method (~248 LOC in run_agent.py) - Pre-compression flush call in _compress_context - flush_memories call sites in cli.py (/new + exit) - GatewayRunner._flush_memories_for_session + _async_flush_memories (and the 3 call sites: session expiry watcher, /new, /resume) - 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks, hermes tools UI task list, auxiliary_client docstrings - _memory_flush_min_turns config + init - #15631's headroom-deduction math in _check_compression_model_feasibility (headroom was only needed because flush dragged the full main-agent system prompt along; the compression summariser sends a single user-role prompt so new_threshold = aux_context is safe again) - The dedicated test files and assertions that exercised flush-specific paths What this renames (with read-time backcompat on sessions.json): - SessionEntry.memory_flushed -> SessionEntry.expiry_finalized. The session-expiry watcher still uses the flag to avoid re-running finalize/eviction on the same expired session; the new name reflects what it now actually gates. from_dict() reads 'expiry_finalized' first, falls back to the legacy 'memory_flushed' key so existing sessions.json files upgrade seamlessly. Supersedes #15631 and #15638. Tested: 383 targeted tests pass across run_agent/, agent/, cli/, and gateway/ session-boundary suites. No behavior regressions — background memory review continues to handle persistent memory extraction on both CLI and gateway.	2026-04-25 08:21:14 -07:00
alt-glitch	81987f0350	feat(discord): split discord_server into discord + discord_admin tools Split the monolithic discord_server tool (14 actions) into two: - discord: core actions (fetch_messages, search_members, create_thread) that are useful for the agent's normal operation. Auto-enabled on the discord platform via the pipeline fix. - discord_admin: server management actions (list channels/roles, pins, role assignment) that require explicit opt-in via hermes tools. Added to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS.	2026-04-25 04:50:14 -07:00
Teknium	023b1bff11	fix(delegate): resolve subagent approval prompts without deadlocking parent TUI (#15491 ) Subagents run inside a ThreadPoolExecutor. The CLI's interactive approval callback lives in tools/terminal_tool.py's threading.local(), which worker threads do not inherit. When a subagent hits a dangerous-command guard, prompt_dangerous_approval() falls back to input() from the worker thread, deadlocking against the parent's prompt_toolkit TUI that owns stdin. Fix: install a non-interactive callback into every subagent worker thread via ThreadPoolExecutor(initializer=set_approval_callback, initargs=(cb,)). The callback is config-gated by delegation.subagent_auto_approve: false (default) -> _subagent_auto_deny (safe; matches leaf tool blocklist) true -> _subagent_auto_approve (opt-in YOLO for cron/batch) Both emit a logger.warning audit line. Gateway sessions are unaffected because they resolve approvals via tools/approval.py's per-session queue, not through these TLS callbacks. Diagnosis credit: @MorAlekss (#14685). - hermes_cli/config.py: DEFAULT_CONFIG.delegation.subagent_auto_approve: False - cli-config.yaml.example: documented, commented (default) - tools/delegate_tool.py: _subagent_auto_deny, _subagent_auto_approve, _get_subagent_approval_callback, wired into the child timeout executor - tests/tools/test_delegate.py: 7 tests covering defaults, truthy coercion, and TLS scoping in the worker thread	2026-04-24 22:37:22 -07:00

1 2 3 4 5 ...

358 commits