hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-20 15:33:54 +00:00

Author	SHA1	Message	Date
Teknium	0f597dd127	fix: STT provider-model mismatch — whisper-1 fed to faster-whisper (#7113 ) Legacy flat stt.model config key (from cli-config.yaml.example and older versions) was passed as a model override to transcribe_audio() by the gateway, bypassing provider-specific model resolution. When the provider was 'local' (faster-whisper), this caused: ValueError: Invalid model size 'whisper-1' Changes: - gateway/run.py, discord.py: stop passing model override — let transcribe_audio() handle provider-specific model resolution internally - get_stt_model_from_config(): now provider-aware, reads from the correct nested section (stt.local.model, stt.openai.model, etc.); ignores legacy flat key for local provider to prevent model name mismatch - cli-config.yaml.example: updated STT section to show nested provider config structure instead of legacy flat key - config migration v13→v14: moves legacy stt.model to the correct provider section and removes the flat key Reported by community user on Discord.	2026-04-10 03:27:30 -07:00
maxyangcn	19292eb8bf	feat(cron): support Discord thread_id in deliver targets Add Discord thread support to cron delivery and send_message_tool. - _parse_target_ref: handle discord platform with chat_id:thread_id format - _send_discord: add thread_id param, route to /channels/{thread_id}/messages - _send_to_platform: pass thread_id through for Discord - Discord adapter send(): read thread_id from metadata for gateway path - Update tool schema description to document Discord thread targets Cherry-picked from PR #7046 by pandacooming (maxyangcn). Follow-up fixes: - Restore proxy support (resolve_proxy_url/proxy_kwargs_for_aiohttp) that was accidentally deleted — would have caused NameError at runtime - Remove duplicate _DISCORD_TARGET_RE regex; reuse existing _TELEGRAM_TOPIC_TARGET_RE via _NUMERIC_TOPIC_RE alias (identical pattern) - Fix misleading test comments about Discord negative snowflake IDs (Discord uses positive snowflakes; negative IDs are a Telegram convention) - Rewrite misleading scheduler test that claimed to exercise home channel fallback but actually tested the explicit platform:chat_id parsing path	2026-04-10 03:20:05 -07:00
Teknium	30ae68dd33	fix: apply hidden_div regex newline bypass fix to skills_guard.py The same .* pattern vulnerable to newline bypass that was fixed in prompt_builder.py (PR #6925) also existed in skills_guard.py. Changed to [\s\S]*? to match across newlines.	2026-04-10 03:05:04 -07:00
aaronagent	9afe1784bd	fix: hidden_div regex bypass with newlines, credential config silent failure, webhook route error severity prompt_builder.py: The `hidden_div` detection pattern uses `.` which does not match newlines in Python regex (re.DOTALL is not passed). An attacker can bypass detection by splitting the style attribute across lines: `<div style="color:red;\ndisplay: none">injected content</div>` Replace `.` with `[\s\S]*?` to match across line boundaries. credential_files.py: `_load_config_files()` catches all exceptions at DEBUG level (line 171), making YAML parse failures invisible in production logs. Users whose credential files silently fail to mount into sandboxes have no diagnostic clue. Promote to WARNING to match the severity pattern used by the path validation warnings at lines 150 and 158 in the same function. webhook.py: `_reload_dynamic_routes()` logs JSON parse failures at WARNING (line 265) but the impact — stale/corrupted dynamic routes persisting silently — warrants ERROR level to ensure operator visibility in alerting pipelines. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 03:05:04 -07:00
aaronagent	94f5979cc2	fix(approval,mcp): log silent exception handlers, narrow OAuth catches, close server on error Three silent `except Exception` blocks in approval.py (lines 345, 387, 469) return fallback values with zero logging — making it impossible to debug callback failures, allowlist load errors, or config read issues. Add logger.warning/error calls that match the pattern already used by save_permanent_allowlist() and _smart_approve() in the same file. In mcp_oauth.py, narrow the overly-broad `except Exception` in get_tokens() and get_client_info() to the specific exceptions Pydantic's model_validate() can raise (ValueError, TypeError, KeyError), and include the exception message in the warning. Also wrap the _wait_for_callback() polling loop in try/finally so the HTTPServer is always closed — previously an asyncio.CancelledError or any exception in the loop would leak the server socket. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 03:05:04 -07:00
aaronagent	738f0bac13	fix: align auth-by-message classification with status-code path, decode URLs before secret check error_classifier.py: Message-only auth errors ("invalid api key", "unauthorized", etc.) were classified as retryable=True (line 707), inconsistent with the HTTP 401 path (line 432) which correctly uses retryable=False + should_fallback=True. The mismatch causes 3 wasted retries with the same broken credential before fallback, while 401 errors immediately attempt fallback. Align the message-based path to match: retryable=False, should_fallback=True. web_tools.py: The _PREFIX_RE secret-detection check in web_extract_tool() runs against the raw URL string (line 1196). URL-encoded secrets like %73k-1234... ( sk-1234...) bypass the filter because the regex expects literal ASCII. Add urllib.parse.unquote() before the check so percent-encoded variants are also caught. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 03:05:04 -07:00
alt-glitch	1f1f297528	feat(environments): unified file sync with change tracking and deletion Replace per-backend ad-hoc file sync with a shared FileSyncManager that handles mtime-based change detection, remote deletion of locally-removed files, and transactional state updates. - New FileSyncManager class (tools/environments/file_sync.py) with callbacks for upload/delete, rate limiting, and rollback - Shared iter_sync_files() eliminates 3 duplicate implementations - SSH: replace unconditional rsync with scp + mtime skip - Modal/Daytona: replace inline _synced_files dict with manager - All 3 backends now sync credentials + skills + cache uniformly - Remote deletion: files removed locally are cleaned from remote - HERMES_FORCE_FILE_SYNC=1 env var for debugging - Base class _before_execute() simplified to empty hook - 12 unit tests covering mtime skip, deletion, rollback, rate limiting	2026-04-10 03:01:46 -07:00
Teknium	a420235b66	fix: reject foreground timeout above cap instead of clamping Change behavior from silent clamping to returning an error when the model requests a foreground timeout exceeding FOREGROUND_MAX_TIMEOUT. This forces the model to use background=true for long-running commands rather than silently changing its intent. - Config default timeouts above the cap are NOT rejected (user's choice) - Only explicit model-requested timeouts trigger rejection - Added boundary test for timeout exactly at the limit	2026-04-10 02:58:54 -07:00
kshitijk4poor	6c3565df57	fix(terminal): cap foreground timeout to prevent session deadlocks When the model calls terminal() in foreground mode without background=true (e.g. to start a server), the tool call blocks until the command exits or the timeout expires. Without an upper bound the model can request arbitrarily high timeouts (the schema had minimum=1 but no maximum), blocking the entire agent session for hours until the gateway idle watchdog kills it. Changes: - Add FOREGROUND_MAX_TIMEOUT (600s, configurable via TERMINAL_MAX_FOREGROUND_TIMEOUT env var) that caps foreground timeout - Clamp effective_timeout to the cap when background=false and timeout exceeds the limit - Include a timeout_note in the tool result when clamped, nudging the model to use background=true for long-running processes - Update schema description to show the max timeout value - Remove dead clamping code in the background branch that could never fire (max_timeout was set to effective_timeout, so timeout > max_timeout was always false) - Add 7 tests covering clamping, no-clamping, config-default-exceeds-cap edge case, background bypass, default timeout, constant value, and schema content Self-review fixes: - Fixed bug where timeout_note said 'Requested timeout Nones' when clamping fired from config default exceeding cap (timeout param is None). Now uses unclamped_timeout instead of the raw timeout param. - Removed unused pytest import from test file - Extracted test config dict into _make_env_config() helper - Fixed tautological test_default_value assertion - Added missing test for config default > cap with no model timeout	2026-04-10 02:58:54 -07:00
Teknium	69a0092c38	fix: deduplicate _is_termux() into hermes_constants.is_termux() Replace 6 identical copies of the Termux detection function across cli.py, browser_tool.py, voice_mode.py, status.py, doctor.py, and gateway.py with a single shared implementation in hermes_constants.py. Each call site imports with its original local name to preserve all existing callers (internal references and test monkeypatches).	2026-04-09 16:24:53 -07:00
adybag14-cyber	c3141429b7	fix(termux): tighten voice setup and mobile chat UX	2026-04-09 16:24:53 -07:00
adybag14-cyber	769ec1ee1a	fix(termux): deepen browser, voice, and tui support	2026-04-09 16:24:53 -07:00
adybag14-cyber	3237733ca5	fix(termux): harden execute_code and mobile browser/audio UX	2026-04-09 16:24:53 -07:00
adybag14-cyber	54d5138a54	fix(termux): harden env-backed background jobs	2026-04-09 16:24:53 -07:00
adybag14-cyber	122925a6f2	fix(termux): honor temp dirs for local temp artifacts	2026-04-09 16:24:53 -07:00
adybag14-cyber	e79cc88985	feat: add tested Termux install path and EOF-aware gh auth	2026-04-09 16:24:53 -07:00
Teknium	49d8c9557f	fix: cleanup_all_camofox_sessions respects managed persistence (#6820 ) When managed_persistence is enabled, cleanup_all now only clears local tracking state without sending DELETE requests to the Camofox server. This prevents persistent browser profiles (cookies, logins, localStorage) from being destroyed during process-wide cleanup. Ephemeral sessions still get full server-side deletion as before.	2026-04-09 14:54:07 -07:00
Teknium	6f8e426275	fix: add SOCKS proxy support, DISCORD_PROXY env var, and send_message proxy coverage Follow-up improvements on top of the shared resolver from PR #6562: - Add platform_env_var parameter to resolve_proxy_url() so DISCORD_PROXY takes priority over generic HTTPS_PROXY/ALL_PROXY env vars - Add SOCKS proxy support via aiohttp_socks.ProxyConnector with rdns=True (critical for GFW/Shadowrocket/Clash users — issue #6649) - proxy_kwargs_for_bot() returns connector= for SOCKS, proxy= for HTTP - proxy_kwargs_for_aiohttp() returns split (session_kw, request_kw) for standalone aiohttp sessions - Add proxy support to send_message_tool.py (Discord REST, Slack, SMS) for cron job delivery behind proxies (from PR #2208) - Add proxy support to Discord image/document downloads - Fix duplicate import sys in base.py	2026-04-09 14:19:06 -07:00
dashed	7f7b02b764	fix(slack): comprehensive mrkdwn formatting — 6 bug fixes + 52 tests Fixes blockquote > escaping, edit_message raw markdown, *bold italic* handling, HTML entity double-escaping (&amp;), Wikipedia URL parens truncation, and step numbering format. Also adds format_message to the tool-layer _send_to_platform for consistent formatting across all delivery paths. Changes: - Protect Slack entities (<@user>, <https://...\|label>, <!here>) from escaping passes - Protect blockquote > markers before HTML entity escaping - Unescape-before-escape for idempotent HTML entity handling - *bold italic* → _text_ conversion (before bold pass) - URL regex upgraded to handle balanced parentheses - mrkdwn:True flag on chat_postMessage payloads - format_message applied in edit_message and send_message_tool - 52 new tests (format, edit, streaming, splitting, tool chunking) - Use reversed(dict) idiom for placeholder restoration Based on PR #3715 by dashed, cherry-picked onto current main.	2026-04-09 14:07:32 -07:00
Lumen Radley	e22416dd9b	fix: handle empty sudo password and false prompts	2026-04-09 02:50:07 -07:00
Teknium	d97f6cec7f	feat(gateway): add BlueBubbles iMessage platform adapter (#6437 ) Adds Apple iMessage as a gateway platform via BlueBubbles macOS server. Architecture: - Webhook-based inbound (event-driven, no polling/dedup needed) - Email/phone → chat GUID resolution for user-friendly addressing - Private API safety (checks helper_connected before tapback/typing) - Inbound attachment downloading (images, audio, documents cached locally) - Markdown stripping for clean iMessage delivery - Smart progress suppression for platforms without message editing Based on PR #5869 by @benjaminsehl (webhook architecture, GUID resolution, Private API safety, progress suppression) with inbound attachment downloading from PR #4588 by @1960697431 (attachment cache routing). Integration points: Platform enum, env config, adapter factory, auth maps, cron delivery, send_message routing, channel directory, platform hints, toolset definition, setup wizard, status display. 27 tests covering config, adapter, webhook parsing, GUID resolution, attachment download routing, toolset consistency, and prompt hints.	2026-04-08 23:54:03 -07:00
helix4u	e94008c404	fix(terminal): guard invalid command values	2026-04-08 21:37:51 -07:00
angelos	e7d3e9d767	fix(terminal): persistent sandbox envs survive between turns `_cleanup_task_resources` was unconditionally calling `cleanup_vm()` at the end of every `run_conversation` (i.e. every user turn), tearing down the docker/daytona/modal sandbox container regardless of its `persistent_filesystem` setting. This contradicted the documented intent of `terminal.lifetime_seconds` (idle reaper) and `container_persistent`, and caused per-turn loss of `/workspace`, `~/.config`, agent CLI auth state, and any other content living inside the sandbox. The unconditional teardown was introduced in `fbd3a2fd` ("prevent leakage of morph instances between tasks", 2025-11-04) to plug a Morph backend leak, two days after `lifetime_seconds` shipped in `faecbddd`. It was later refactored into `_cleanup_task_resources` in `70dd3a16` without changing semantics. Code and docs have disagreed since. Fix: introduce `terminal_tool.is_persistent_env(task_id)` and skip the per-turn `cleanup_vm` when the active env is persistent. The idle reaper (`_cleanup_inactive_envs`) still tears persistent envs down once `terminal.lifetime_seconds` is exceeded. Non-persistent backends (Morph) are unchanged — still torn down per turn, preserving the original leak-prevention intent.	2026-04-08 21:31:57 -07:00
alt-glitch	d684d7ee7e	feat(environments): unified spawn-per-call execution layer Replace dual execution model (PersistentShellMixin + per-backend oneshot) with spawn-per-call + session snapshot for all backends except ManagedModal. Core changes: - Every command spawns a fresh bash process; session snapshot (env vars, functions, aliases) captured at init and re-sourced before each command - CWD persists via file-based read (local) or in-band stdout markers (remote) - ProcessHandle protocol + _ThreadedProcessHandle adapter for SDK backends - cancel_fn wired for Modal (sandbox.terminate) and Daytona (sandbox.stop) - Shared utilities extracted: _pipe_stdin, _popen_bash, _load_json_store, _save_json_store, _file_mtime_key, _SYNC_INTERVAL_SECONDS - Rate-limited file sync unified in base _before_execute() with _sync_files() hook - execute_oneshot() removed; all 11 call sites in code_execution_tool.py migrated to execute() - Daytona timeout wrapper replaced with SDK-native timeout parameter - persistent_shell.py deleted (291 lines) Backend-specific: - Local: process-group kill via os.killpg, file-based CWD read - Docker: -e env flags only on init_session, not per-command - SSH: shlex.quote transport, ControlMaster connection reuse - Singularity: apptainer exec with instance://, no forced --pwd - Modal: _AsyncWorker + _ThreadedProcessHandle, cancel_fn -> sandbox.terminate - Daytona: SDK-level timeout (not shell wrapper), cancel_fn -> sandbox.stop - ManagedModal: unchanged (gateway owns execution); docstring added explaining why	2026-04-08 17:23:15 -07:00
Teknium	7156f8d866	fix: CI test failures — metadata key, cli console, docker env, vision order (#6294 ) Fixes 9 test failures on current main, incorporating ideas from PR stack #6219-#6222 by xinbenlv with corrections: - model_metadata: sync HF context length key casing (minimaxai/minimax-m2.5 → MiniMaxAI/MiniMax-M2.5) - cli.py: route quick command error output through self.console instead of creating a new ChatConsole() instance - docker.py: explicit docker_forward_env entries now bypass the Hermes secret blocklist (intentional opt-in wins over generic filter) - auxiliary_client: revert _read_main_provider() to simple provider.strip().lower() — the _normalize_aux_provider() call introduced in `5c03f2e7` stripped the custom: prefix, breaking named custom provider resolution - auxiliary_client: flip vision auto-detection order to active provider → OpenRouter → Nous → stop (was OR → Nous → active) - test: update vision priority test to match new order Based on PR #6219-#6222 by xinbenlv.	2026-04-08 16:37:05 -07:00
jjovalle99	d46db0a1b4	fix(tools): use correct import path for mistralai SDK mistralai v2.x is a namespace package — `Mistral` class lives at `mistralai.client`, not at the top-level `mistralai` module. The previous `from mistralai import Mistral` raises ImportError at runtime. Update both production code and test fixture to use the correct path.	2026-04-08 13:47:08 -07:00
jjovalle99	5f4b93c20f	feat(tools): add Voxtral Transcribe STT provider (Mistral AI)	2026-04-08 13:47:08 -07:00
Teknium	4f467700d4	fix(doctor): only check the active memory provider, not all providers unconditionally (#6285 ) * fix(tools): skip camofox auto-cleanup when managed persistence is enabled When managed_persistence is enabled, cleanup_browser() was calling camofox_close() which destroys the server-side browser context via DELETE /sessions/{userId}, killing login sessions across cron runs. Add camofox_soft_cleanup() — a public wrapper that drops only the in-memory session entry when managed persistence is on, returning True. When persistence is off it returns False so the caller falls back to the full camofox_close(). The inactivity reaper still handles idle resource cleanup. Also surface a logger.warning() when _managed_persistence_enabled() fails to load config, replacing a silent except-and-return-False. Salvaged from #6182 by el-analista (Eduardo Perea Fernandez). Added public API wrapper to avoid cross-module private imports, and test coverage for both persistence paths. Co-authored-by: Eduardo Perea Fernandez <el-analista@users.noreply.github.com> * fix(doctor): only check the active memory provider, not all providers unconditionally hermes doctor had hardcoded Honcho Memory and Mem0 Memory sections that always ran regardless of the user's memory.provider config setting. After the swappable memory provider update (#4623), users with leftover Honcho config but no active provider saw false 'broken' errors. Replaced both sections with a single Memory Provider section that reads memory.provider from config.yaml and only checks the configured provider. Users with no external provider see a green 'Built-in memory active' check. Reported by community user michaelruiz001, confirmed by Eri (Honcho). --------- Co-authored-by: Eduardo Perea Fernandez <el-analista@users.noreply.github.com>	2026-04-08 13:44:58 -07:00
mrshu	19b0ddce40	fix(process): correct detached crash recovery state Previously crash recovery recreated detached sessions as if they were fully managed, so polls and kills could lie about liveness and the checkpoint could forget recovered jobs after the next restart. This commit refreshes recovered host-backed sessions from real PID state, keeps checkpoint data durable, and preserves notify watcher metadata while treating sandbox-only PIDs as non-recoverable. - Persist `pid_scope` in `tools/process_registry.py` and skip recovering sandbox-backed entries without a host-visible PID handle - Refresh detached sessions on access so `get`/`poll`/`wait` and active session queries observe exited processes instead of hanging forever - Allow recovered host PIDs to be terminated honestly and requeue `notify_on_complete` watchers during checkpoint recovery - Add regression tests for durable checkpoints, detached exit/kill behavior, sandbox skip logic, and recovered notify watchers	2026-04-08 03:35:43 -07:00
Vasanthdev2004	085c1c6875	fix(browser): preserve agent-browser paths with spaces	2026-04-08 02:35:48 -07:00
Teknium	3696c74bfb	fix: preserve existing thresholds, remove pre-read byte guard - DEFAULT_RESULT_SIZE_CHARS: 50K -> 100K (match current _LARGE_RESULT_CHARS) - DEFAULT_PREVIEW_SIZE_CHARS: 2K -> 1.5K (match current _LARGE_RESULT_PREVIEW_CHARS) - Per-tool overrides all set to 100K (terminal, execute_code, search_files) - Remove pre-read byte guard (no behavioral regression vs current main) - Revert limit signature change to int=500 (match current default) - Restore original read_file schema description - Update test assertions to match 100K thresholds	2026-04-08 02:24:32 -07:00
alt-glitch	bbcff8dcd0	fix(tools): address PR review — remove _extract_raw_output, BudgetConfig everywhere, read_file hardening - Remove _extract_raw_output: persist content verbatim (fixes size mismatch bug) - Drop import aliases: import from budget_config directly, one canonical name - BudgetConfig param on maybe_persist_tool_result and enforce_turn_budget - read_file: limit=None signature, pre-read guard fires only when limit omitted (256KB) - Unify binary extensions: file_operations.py imports from binary_extensions.py - Exclude .pdf and .svg from binary set (text-based, agents may inspect) - Remove redundant outer try/except in eval path (internal fallback handles it) - Fix broken tests: update assertion strings for new persistence format - Module-level constants: _PRE_READ_MAX_BYTES, _DEFAULT_READ_LIMIT - Remove redundant pathlib import (Path already at module level) - Update spec.md with IMPLEMENTED annotations and design decisions	2026-04-08 02:24:32 -07:00
alt-glitch	77c5bc9da9	feat(budget): make tool result persistence thresholds configurable Add BudgetConfig dataclass to centralize and make overridable the hardcoded constants (50K per-result, 200K per-turn, 2K preview) that control when tool outputs get persisted to sandbox. Configurable at the RL environment level via HermesAgentEnvConfig fields, threaded through HermesAgentLoop to the storage layer. Resolution: pinned (read_file=inf) > env config overrides > registry per-tool > default. CLI override: --env.turn_budget_chars 80000	2026-04-08 02:24:32 -07:00
alt-glitch	65e24c942e	wip: tool result fixes -- persistence	2026-04-08 02:24:32 -07:00
Teknium	fff237e111	feat(cron): track delivery failures in job status (#6042 ) _deliver_result() now returns Optional[str] — None on success, error message on failure. All failure paths (unknown platform, platform disabled, config load error, send failure, unresolvable target) return descriptive error strings. mark_job_run() gains delivery_error param, tracked as last_delivery_error on the job — separate from agent execution errors. A job where the agent succeeded but delivery failed shows last_status='ok' + last_delivery_error='...'. The cronjob list tool now surfaces last_delivery_error so agents and users can see when cron outputs aren't arriving. Inspired by PR #5863 (oxngon) — reimplemented with proper wiring. Tests: 3 new mark_job_run tests + 6 new _deliver_result return tests.	2026-04-07 22:49:01 -07:00
Teknium	b9a5e6e247	fix: use camelCase structuredContent attr, prefer structured over text - The MCP SDK Pydantic model uses camelCase (structuredContent), not snake_case (structured_content). The original getattr was a silent no-op. - When structuredContent is present, return it AS the result instead of alongside text — the structured payload is the machine-readable data. - Move test file to tests/tools/ and fix fake class to use camelCase. - Patch _run_on_mcp_loop in tests so the handler actually executes.	2026-04-07 18:00:01 -07:00
r266-tech	2ad7694874	fix(mcp): preserve structured_content in tool call results MCP CallToolResult may include structured_content (a JSON object) alongside content blocks. The tool handler previously only forwarded concatenated text from content blocks, silently dropping the structured payload. This breaks MCP tools that return a minimal human text in content while putting the actual machine-usable payload in structured_content. Now, when structured_content is present, it is included in the returned JSON under the 'structuredContent' key. Fixes NousResearch/hermes-agent#5874	2026-04-07 18:00:01 -07:00
Teknium	f3c59321af	fix: add _profile_arg tests + move STT language to config.yaml - Add 7 unit tests for _profile_arg: default home, named profile, hash path, nested path, invalid name, systemd integration, launchd integration - Add stt.local.language to config.yaml (empty = auto-detect) - Both STT code paths now read config.yaml first, env var fallback, then default (auto-detect for faster-whisper, 'en' for CLI command) - HERMES_LOCAL_STT_LANGUAGE env var still works as backward-compat fallback	2026-04-07 17:59:16 -07:00
Marc Bickel	6e02fa73c2	fix(discord): discard empty placeholder on voice transcription + force STT language - gateway/run.py: Strip "(The user sent a message with no text content)" placeholder when voice transcription succeeds — it was being appended alongside the transcript, creating duplicate user turns. - tools/transcription_tools.py: Wire HERMES_LOCAL_STT_LANGUAGE env var into the faster-whisper backend. It was only used by the CLI fallback path (_transcribe_local_command), not the primary faster-whisper path.	2026-04-07 17:59:16 -07:00
Teknium	469cd16fe0	fix(security): consolidated security hardening — SSRF, timing attack, tar traversal, credential leakage (#5944 ) Salvaged from PRs #5800 (memosr), #5806 (memosr), #5915 (Ruzzgar), #5928 (Awsh1). Changes: - Use hmac.compare_digest for API key comparison (timing attack prevention) - Apply provider env var blocklist to Docker containers (credential leakage) - Replace tar.extractall() with safe extraction in TerminalBench2 (CVE-2007-4559) - Add SSRF protection via is_safe_url to ALL platform adapters: base.py (cache_image_from_url, cache_audio_from_url), discord, slack, telegram, matrix, mattermost, feishu, wecom (Signal and WhatsApp protected via base.py helpers) - Update tests: mock is_safe_url in Mattermost download tests - Add security tests for tar extraction (traversal, symlinks, safe files)	2026-04-07 17:28:37 -07:00
Teknium	b1a66d55b4	refactor: migrate 10 config.yaml inline loaders to read_raw_config() Replace 10 callsites across 6 files that manually opened config.yaml, called yaml.safe_load(), and handled missing-file/parse-error fallbacks with the new read_raw_config() helper from hermes_cli/config.py. Each migrated site previously had 5-8 lines of boilerplate: config_path = get_hermes_home() / 'config.yaml' if config_path.exists(): import yaml with open(config_path) as f: cfg = yaml.safe_load(f) or {} Now reduced to: from hermes_cli.config import read_raw_config cfg = read_raw_config() Migrated files: - tools/browser_tool.py (4 sites): command_timeout, cloud_provider, allow_private_urls, record_sessions - tools/env_passthrough.py: terminal.env_passthrough - tools/credential_files.py: terminal.credential_files - tools/transcription_tools.py: stt.model - hermes_cli/commands.py: config-gated command resolution - hermes_cli/auth.py (2 sites): model config read + provider reset Skipped (intentionally): - gateway/run.py: 10+ sites with local aliases, critical path - hermes_cli/profiles.py: profile-specific config path - hermes_cli/doctor.py: reads raw then writes fixes back - agent/model_metadata.py: different file (context_length_cache.yaml) - tools/website_policy.py: custom config_path param + error types	2026-04-07 17:28:23 -07:00
Siddharth Balyan	f3006ebef9	refactor(tests): re-architect tests + fix CI failures (#5946 ) * refactor: re-architect tests to mirror the codebase * Update tests.yml * fix: add missing tool_error imports after registry refactor * fix(tests): replace patch.dict with monkeypatch to prevent env var leaks under xdist patch.dict(os.environ) can leak TERMINAL_ENV across xdist workers, causing test_code_execution tests to hit the Modal remote path. * fix(tests): fix update_check and telegram xdist failures - test_update_check: replace patch("hermes_cli.banner.os.getenv") with monkeypatch.setenv("HERMES_HOME") — banner.py no longer imports os directly, it uses get_hermes_home() from hermes_constants. - test_telegram_conflict/approval_buttons: provide real exception classes for telegram.error mock (NetworkError, TimedOut, BadRequest) so the except clause in connect() doesn't fail with "catching classes that do not inherit from BaseException" when xdist pollutes sys.modules. * fix(tests): accept unavailable_models kwarg in _prompt_model_selection mock	2026-04-07 17:19:07 -07:00
Teknium	678a87c477	refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites Add three reusable helpers to eliminate pervasive boilerplate: tools/registry.py — tool_error() and tool_result(): Every tool handler returns JSON strings. The pattern json.dumps({"error": msg}, ensure_ascii=False) appeared 106 times, and json.dumps({"success": False, "error": msg}, ...) another 23. Now: tool_error(msg) or tool_error(msg, success=False). tool_result() handles arbitrary result dicts: tool_result(success=True, data=payload) or tool_result(some_dict). hermes_cli/config.py — read_raw_config(): Lightweight YAML reader that returns the raw config dict without load_config()'s deep-merge + migration overhead. Available for callsites that just need a single config value. Migration (129 callsites across 32 files): - tools/: browser_camofox (18), file_tools (10), homeassistant (8), web_tools (7), skill_manager (7), cronjob (11), code_execution (4), delegate (5), send_message (4), tts (4), memory (7), session_search (3), mcp (2), clarify (2), skills_tool (3), todo (1), vision (1), browser (1), process_registry (2), image_gen (1) - plugins/memory/: honcho (9), supermemory (9), hindsight (8), holographic (7), openviking (7), mem0 (7), byterover (6), retaindb (2) - agent/: memory_manager (2), builtin_memory_provider (1)	2026-04-07 13:36:38 -07:00
Teknium	ca0459d109	refactor: remove 24 confirmed dead functions — 432 lines of unused code Each function was verified to have exactly 1 reference in the entire codebase (its own definition). Zero calls, zero imports, zero string references anywhere including tests. Removed by category: Superseded wrappers (replaced by newer implementations): - agent/anthropic_adapter.py: run_hermes_oauth_login, refresh_hermes_oauth_token - hermes_cli/callbacks.py: sudo_password_callback (superseded by CLI method) - hermes_cli/setup.py: _set_model_provider, _sync_model_from_disk - tools/file_tools.py: get_file_tools (superseded by registry.register) - tools/cronjob_tools.py: get_cronjob_tool_definitions (same) - tools/terminal_tool.py: _check_dangerous_command (_check_all_guards used) Dead private helpers (lost their callers during refactors): - agent/anthropic_adapter.py: _convert_user_content_part_to_anthropic - agent/display.py: honcho_session_line, write_tty - hermes_cli/providers.py: _build_labels (+ dead _labels_cache var) - hermes_cli/tools_config.py: _prompt_yes_no - hermes_cli/models.py: _extract_model_ids - hermes_cli/uninstall.py: log_error - gateway/platforms/feishu.py: _is_loop_ready - tools/file_operations.py: _read_image (64-line method) - tools/process_registry.py: cleanup_expired - tools/skill_manager_tool.py: check_skill_manage_requirements Dead class methods (zero callers): - run_agent.py: _is_anthropic_url (logic duplicated inline at L618) - run_agent.py: _classify_empty_content_response (68-line method, never wired) - cli.py: reset_conversation (callers all use new_session directly) - cli.py: _clear_current_input (added but never wired in) Other: - gateway/delivery.py: build_delivery_context_for_tool - tools/browser_tool.py: get_active_browser_sessions	2026-04-07 11:41:26 -07:00
Teknium	187e90e425	refactor: replace inline HERMES_HOME re-implementations with get_hermes_home() 16 callsites across 14 files were re-deriving the hermes home path via os.environ.get('HERMES_HOME', ...) instead of using the canonical get_hermes_home() from hermes_constants. This breaks profiles — each profile has its own HERMES_HOME, and the inline fallback defaults to ~/.hermes regardless. Fixed by importing and calling get_hermes_home() at each site. For files already inside the hermes process (agent/, hermes_cli/, tools/, gateway/, plugins/), this is always safe. Files that run outside the process context (mcp_serve.py, mcp_oauth.py) already had correct try/except ImportError fallbacks and were left alone. Skipped: hermes_constants.py (IS the implementation), env_loader.py (bootstrap), profiles.py (intentionally manipulates the env var), standalone scripts (optional-skills/, skills/), and tests.	2026-04-07 10:40:34 -07:00
Teknium	d0ffb111c2	refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821 ) Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture) and manual analysis of the entire codebase. Changes by category: Unused imports removed (~95 across 55 files): - Removed genuinely unused imports from all major subsystems - agent/, hermes_cli/, tools/, gateway/, plugins/, cron/ - Includes imports in try/except blocks that were truly unused (vs availability checks which were left alone) Unused variables removed (~25): - Removed dead variables: connected, inner, channels, last_exc, source, new_server_names, verify, pconfig, default_terminal, result, pending_handled, temperature, loop - Dropped unused argparse subparser assignments in hermes_cli/main.py (12 instances of add_parser() where result was never used) Dead code removed: - run_agent.py: Removed dead ternary (None if False else None) and surrounding unreachable branch in identity fallback - run_agent.py: Removed write-only attribute _last_reported_tool - hermes_cli/providers.py: Removed dead @property decorator on module-level function (decorator has no effect outside a class) - gateway/run.py: Removed unused MCP config load before reconnect - gateway/platforms/slack.py: Removed dead SessionSource construction Undefined name bugs fixed (would cause NameError at runtime): - batch_runner.py: Added missing logger = logging.getLogger(__name__) - tools/environments/daytona.py: Added missing Dict and Path imports Unnecessary global statements removed (14): - tools/terminal_tool.py: 5 functions declared global for dicts they only mutated via .pop()/[key]=value (no rebinding) - tools/browser_tool.py: cleanup thread loop only reads flag - tools/rl_training_tool.py: 4 functions only do dict mutations - tools/mcp_oauth.py: only reads the global - hermes_time.py: only reads cached values Inefficient patterns fixed: - startswith/endswith tuple form: 15 instances of x.startswith('a') or x.startswith('b') consolidated to x.startswith(('a', 'b')) - len(x)==0 / len(x)>0: 13 instances replaced with pythonic truthiness checks (not x / bool(x)) - in dict.keys(): 5 instances simplified to in dict - Redefined unused name: removed duplicate _strip_mdv2 import in send_message_tool.py Other fixes: - hermes_cli/doctor.py: Replaced undefined logger.debug() with pass - hermes_cli/config.py: Consolidated chained .endswith() calls Test results: 3934 passed, 17 failed (all pre-existing on main), 19 skipped. Zero regressions.	2026-04-07 10:25:31 -07:00
Ben Barclay	b2f477a30b	feat: switch managed browser provider from Browserbase to Browser Use (#5750 ) * feat: switch managed browser provider from Browserbase to Browser Use The Nous subscription tool gateway now routes browser automation through Browser Use instead of Browserbase. This commit: - Adds managed Nous gateway support to BrowserUseProvider (idempotency keys, X-BB-API-Key auth header, external_call_id persistence) - Removes managed gateway support from BrowserbaseProvider (now direct-only via BROWSERBASE_API_KEY/BROWSERBASE_PROJECT_ID) - Updates browser_tool.py fallback: prefers Browser Use over Browserbase - Updates nous_subscription.py: gateway vendor 'browser-use', auto-config sets cloud_provider='browser-use' for new subscribers - Updates tools_config.py: Nous Subscription entry now uses Browser Use - Updates setup.py, cli.py, status.py, prompt_builder.py display strings - Updates all affected tests to match new behavior Browserbase remains fully functional for users with direct API credentials. The change only affects the managed/subscription path. * chore: remove redundant Browser Use hint from system prompt * fix: upgrade Browser Use provider to v3 API - Base URL: api/v2 -> api/v3 (v2 is legacy) - Unified all endpoints to use native Browser Use paths: - POST /browsers (create session, returns cdpUrl) - PATCH /browsers/{id} with {action: stop} (close session) - Removed managed-mode branching that used Browserbase-style /v1/sessions paths — v3 gateway now supports /browsers directly - Removed unused managed_mode variable in close_session * fix(browser-use): use X-Browser-Use-API-Key header for managed mode The managed gateway expects X-Browser-Use-API-Key, not X-BB-API-Key (which is a Browserbase-specific header). Using the wrong header caused a 401 AUTH_ERROR on every managed-mode browser session create. Simplified _headers() to always use X-Browser-Use-API-Key regardless of direct vs managed mode. * fix(nous_subscription): browserbase explicit provider is direct-only Since managed Nous gateway now routes through Browser Use, the browserbase explicit provider path should not check managed_browser_available (which resolves against the browser-use gateway). Simplified to direct-only with managed=False. * fix(browser-use): port missing improvements from PR #5605 - CDP URL normalization: resolve HTTP discovery URLs to websocket after cloud provider create_session() (prevents agent-browser failures) - Managed session payload: send timeout=5 and proxyCountryCode=us for gateway-backed sessions (prevents billing overruns) - Update prompt builder, browser_close schema, and module docstring to replace remaining Browserbase references with Browser Use - Dynamic /browser status detection via _get_cloud_provider() instead of hardcoded env var checks (future-proof for new providers) - Rename post_setup key from 'browserbase' to 'agent_browser' - Update setup hint to mention Browser Use alongside Browserbase - Add tests: CDP normalization, browserbase direct-only guard, managed browser-use gateway, direct browserbase fallback --------- Co-authored-by: rob-maron <132852777+rob-maron@users.noreply.github.com>	2026-04-07 08:40:22 -04:00
Teknium	8b861b77c1	refactor: remove browser_close tool — auto-cleanup handles it (#5792 ) * refactor: remove browser_close tool — auto-cleanup handles it The browser_close tool was called in only 9% of browser sessions (13/144 navigations across 66 sessions), always redundantly — cleanup_browser() already runs via _cleanup_task_resources() at conversation end, and the background inactivity reaper catches anything else. Removing it saves one tool schema slot in every browser-enabled API call. Also fixes a latent bug: cleanup_browser() now handles Camofox sessions too (previously only Browserbase). Camofox sessions were never auto-cleaned per-task because they live in a separate dict from _active_sessions. Files changed (13): - tools/browser_tool.py: remove function, schema, registry entry; add camofox cleanup to cleanup_browser() - toolsets.py, model_tools.py, prompt_builder.py, display.py, acp_adapter/tools.py: remove browser_close from all tool lists - tests/: remove browser_close test, update toolset assertion - docs/skills: remove all browser_close references * fix: repeat browser_scroll 5x per call for meaningful page movement Most backends scroll ~100px per call — barely visible on a typical viewport. Repeating 5x gives ~500px (~half a viewport), making each scroll tool call actually useful. Backend-agnostic approach: works across all 7+ browser backends without needing to configure each one's scroll amount individually. Breaks early on error for the agent-browser path. * feat: auto-return compact snapshot from browser_navigate Every browser session starts with navigate → snapshot. Now navigate returns the compact accessibility tree snapshot inline, saving one tool call per browser task. The snapshot captures the full page DOM (not viewport-limited), so scroll position doesn't affect it. browser_snapshot remains available for refreshing after interactions or getting full=true content. Both Browserbase and Camofox paths auto-snapshot. If the snapshot fails for any reason, navigation still succeeds — the snapshot is a bonus, not a requirement. Schema descriptions updated to guide models: navigate mentions it returns a snapshot, snapshot mentions it's for refresh/full content. * refactor: slim cronjob tool schema — consolidate model/provider, drop unused params Session data (151 calls across 67 sessions) showed several schema properties were never used by models. Consolidated and cleaned up: Removed from schema (still work via backend/CLI): - skill (singular): use skills array instead - reason: pause-only, unnecessary - include_disabled: now defaults to true - base_url: extreme edge case, zero usage - provider (standalone): merged into model object Consolidated: - model + provider → single 'model' object with {model, provider} fields. If provider is omitted, the current main provider is pinned at creation time so the job stays stable even if the user changes their default. Kept: - script: useful data collection feature - skills array: standard interface for skill loading Schema shrinks from 14 to 10 properties. All backend functionality preserved — the Python function signature and handler lambda still accept every parameter. * fix: remove mixture_of_agents from core toolsets — opt-in only via hermes tools MoA was in _HERMES_CORE_TOOLS and composite toolsets (hermes-cli, hermes-messaging, safe), which meant it appeared in every session for anyone with OPENROUTER_API_KEY set. The _DEFAULT_OFF_TOOLSETS gate only works after running 'hermes tools' explicitly. Now MoA only appears when a user explicitly enables it via 'hermes tools'. The moa toolset definition and check_fn remain unchanged — it just needs to be opted into.	2026-04-07 03:28:44 -07:00
Teknium	e120d2afac	feat: notify_on_complete for background processes (#5779 ) * feat: notify_on_complete for background processes When terminal(background=true, notify_on_complete=true), the system auto-triggers a new agent turn when the process exits — no polling needed. Changes: - ProcessSession: add notify_on_complete field - ProcessRegistry: add completion_queue, populate on _move_to_finished() - Terminal tool: add notify_on_complete parameter to schema + handler - CLI: drain completion_queue after agent turn AND during idle loop - Gateway: enhanced _run_process_watcher injects synthetic MessageEvent on completion, triggering a full agent turn - Checkpoint persistence includes notify_on_complete for crash recovery - code_execution_tool: block notify_on_complete in sandbox scripts - 15 new tests covering queue mechanics, checkpoint round-trip, schema * docs: update terminal tool descriptions for notify_on_complete - background: remove 'ONLY for servers' language, describe both patterns (long-lived processes AND long-running tasks with notify_on_complete) - notify_on_complete: more prescriptive about when to use it - TERMINAL_TOOL_DESCRIPTION: remove 'Do NOT use background for builds' guidance that contradicted the new feature	2026-04-07 02:40:16 -07:00
Teknium	d9e7e42d0b	fix(approval): load permanent command allowlist on startup (#5076 ) Co-authored-by: Timo Karp <timo@timos-macbook-pro.taildbbd26.ts.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 01:00:02 -07:00

1 2 3 4 5 ...

772 commits