hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-27 01:11:40 +00:00

Author	SHA1	Message	Date
Yuhan Lei	f92298fe95	fix(acp): populate usage from top-level result fields	2026-04-10 03:00:12 -07:00
Teknium	a420235b66	fix: reject foreground timeout above cap instead of clamping Change behavior from silent clamping to returning an error when the model requests a foreground timeout exceeding FOREGROUND_MAX_TIMEOUT. This forces the model to use background=true for long-running commands rather than silently changing its intent. - Config default timeouts above the cap are NOT rejected (user's choice) - Only explicit model-requested timeouts trigger rejection - Added boundary test for timeout exactly at the limit	2026-04-10 02:58:54 -07:00
kshitijk4poor	6c3565df57	fix(terminal): cap foreground timeout to prevent session deadlocks When the model calls terminal() in foreground mode without background=true (e.g. to start a server), the tool call blocks until the command exits or the timeout expires. Without an upper bound the model can request arbitrarily high timeouts (the schema had minimum=1 but no maximum), blocking the entire agent session for hours until the gateway idle watchdog kills it. Changes: - Add FOREGROUND_MAX_TIMEOUT (600s, configurable via TERMINAL_MAX_FOREGROUND_TIMEOUT env var) that caps foreground timeout - Clamp effective_timeout to the cap when background=false and timeout exceeds the limit - Include a timeout_note in the tool result when clamped, nudging the model to use background=true for long-running processes - Update schema description to show the max timeout value - Remove dead clamping code in the background branch that could never fire (max_timeout was set to effective_timeout, so timeout > max_timeout was always false) - Add 7 tests covering clamping, no-clamping, config-default-exceeds-cap edge case, background bypass, default timeout, constant value, and schema content Self-review fixes: - Fixed bug where timeout_note said 'Requested timeout Nones' when clamping fired from config default exceeding cap (timeout param is None). Now uses unclamped_timeout instead of the raw timeout param. - Removed unused pytest import from test file - Extracted test config dict into _make_env_config() helper - Fixed tautological test_default_value assertion - Added missing test for config default > cap with no model timeout	2026-04-10 02:58:54 -07:00
kshitijk4poor	51d826f889	fix(gateway): apply /model session overrides so switch persists across messages The gateway /model command stored session overrides in _session_model_overrides but run_sync() never consulted them when resolving the model and runtime for the next message. It always read from config.yaml, so the switch was lost as soon as a new agent was created. Two fixes: 1. In run_sync(), apply _session_model_overrides after resolving from config.yaml/env — the override takes precedence for model, provider, api_key, base_url, and api_mode. 2. In post-run fallback detection, check whether the model mismatch (agent.model != config_model) is due to an intentional /model switch before evicting the cached agent. Without this, the first message after /model would work (cached agent reused) but the fallback detector would evict it, causing the next message to revert. Affects all gateway platforms (Telegram, Discord, Slack, WhatsApp, Signal, Matrix, BlueBubbles, HomeAssistant) since they all share GatewayRunner._run_agent(). Fixes #6213	2026-04-10 02:58:42 -07:00
Young	940237c6fd	fix(cli): prevent stale image attachment on text paste and voice input Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 02:58:18 -07:00
Carlos	7368854398	Refresh OpenRouter model catalog Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-10 02:57:39 -07:00
Carlos	38ccd9eb95	Harden setup provider flows Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-10 02:57:39 -07:00
Teknium	6da952bc50	fix(gateway): /usage now shows rate limits, cost, and token details between turns (#7038 ) The gateway /usage handler only looked in _running_agents for the agent object, which is only populated while the agent is actively processing a message. Between turns (when users actually type /usage), the dict is empty and the handler fell through to a rough message-count estimate. The agent object actually lives in _agent_cache between turns (kept for prompt caching). This fix checks both dicts, with _running_agents taking priority (mid-turn) and _agent_cache as the between-turns fallback. Also brings the gateway output to parity with the CLI /usage: - Model name - Detailed token breakdown (input, output, cache read, cache write) - Cost estimation (estimated amount or 'included' for subscriptions) - Cache token lines hidden when zero (cleaner output) This fixes Nous Portal rate limit headers not showing up for gateway users — the data was being captured correctly but the handler could never see it.	2026-04-10 02:33:01 -07:00
Teknium	8779a268a7	feat: add Anthropic Fast Mode support to /fast command (#7037 ) Extends the /fast command to support Anthropic's Fast Mode beta in addition to OpenAI Priority Processing. When enabled on Claude Opus 4.6, adds speed:"fast" and the fast-mode-2026-02-01 beta header to API requests for ~2.5x faster output token throughput. Changes: - hermes_cli/models.py: Add _ANTHROPIC_FAST_MODE_MODELS registry, model_supports_fast_mode() now recognizes Claude Opus 4.6, resolve_fast_mode_overrides() returns {speed: fast} for Anthropic vs {service_tier: priority} for OpenAI - agent/anthropic_adapter.py: Add _FAST_MODE_BETA constant, build_anthropic_kwargs() accepts fast_mode=True which injects speed:fast + beta header via extra_headers (skipped for third-party Anthropic-compatible endpoints like MiniMax) - run_agent.py: Pass fast_mode to build_anthropic_kwargs in the anthropic_messages path of _build_api_kwargs() - cli.py: Update _handle_fast_command with provider-aware messaging (shows 'Anthropic Fast Mode' vs 'Priority Processing') - hermes_cli/commands.py: Update /fast description to mention both providers - tests: 13 new tests covering Anthropic model detection, override resolution, CLI availability, routing, adapter kwargs, and third-party endpoint safety	2026-04-10 02:32:15 -07:00
Teknium	0848a79476	fix(update): always reset on stash conflict — never leave conflict markers (#7010 ) When `hermes update` stashes local changes and the restore hits merge conflicts, the old code prompted the user to reset or keep conflict markers. If the user declined the reset, git conflict markers (<<<<<<< Updated upstream) were left in source files, making hermes completely unrunnable with a SyntaxError on the next invocation. Additionally, the interactive path called sys.exit(1), which killed the entire update process before pip dependency install, skill sync, and gateway restart could finish — even though the code pull itself had succeeded. Changes: - Always auto-reset to clean state when stash restore conflicts - Remove the "Reset working tree?" prompt (footgun) - Remove sys.exit(1) — return False so cmd_update continues normally - User's changes remain safely in the stash for manual recovery Also fixes a secondary bug where the conflict handling prompt used bare input() instead of the input_fn parameter, which would hang in gateway mode. Tests updated: replaced prompt/sys.exit assertions with auto-reset behavior checks; removed the "user declines reset" test (path no longer exists).	2026-04-10 00:32:20 -07:00
Teknium	871313ae2d	fix: clear conversation_history after mid-loop compression to prevent empty sessions (#7001 ) After mid-loop compression (triggered by 413, context_overflow, or Anthropic long-context tier errors), _compress_context() creates a new session in SQLite and resets _last_flushed_db_idx=0. However, conversation_history was not cleared, so _flush_messages_to_session_db() computed: flush_from = max(len(conversation_history=200), _last_flushed_db_idx=0) = 200 messages[200:] → empty (compressed messages < 200) This resulted in zero messages being written to the new session's SQLite store. On resume, the user would see 'Session found but has no messages.' The preflight compression path (line 7311) already had the fix: conversation_history = None This commit adds the same clearing to the three mid-loop compression sites: - Anthropic long-context tier overflow - HTTP 413 payload too large - Generic context_overflow error Reported by Aaryan (Nous community).	2026-04-10 00:14:59 -07:00
Teknium	8104f400f8	test: disable text batching in existing adapter tests Set _text_batch_delay_seconds = 0 on test adapter fixtures so messages dispatch immediately (bypassing async batching). This preserves the existing synchronous assertion patterns while the batching logic is tested separately in test_text_batching.py.	2026-04-09 23:25:27 -07:00
Teknium	1ed00496f2	test: add text batching tests for Discord, Matrix, WeCom, Telegram, Feishu 22 tests covering: - Single message dispatch after delay - Split message aggregation (2-way and 3-way) - Different chats/rooms not merged - Adaptive delay for near-limit chunks - State cleanup after flush - Split continuation merging All 5 platform adapters tested.	2026-04-09 23:25:27 -07:00
Teknium	f783986f5a	fix: increase stream read timeout default to 120s, auto-raise for local LLMs (#6967 ) Raise the default httpx stream read timeout from 60s to 120s for all providers. Additionally, auto-detect local LLM endpoints (Ollama, llama.cpp, vLLM) and raise the read timeout to HERMES_API_TIMEOUT (1800s) since local models can take minutes for prefill on large contexts before producing the first token. The stale stream timeout already had this local auto-detection pattern; the httpx read timeout was missing it — causing a hard 60s wall that users couldn't find (HERMES_STREAM_READ_TIMEOUT was undocumented). Changes: - Default HERMES_STREAM_READ_TIMEOUT: 60s -> 120s - Auto-detect local endpoints -> raise to 1800s (user override respected) - Document HERMES_STREAM_READ_TIMEOUT and HERMES_STREAM_STALE_TIMEOUT - Add 10 parametrized tests Reported-by: Pavan Srinivas (@pavanandums)	2026-04-09 22:35:30 -07:00
emozilla	bda9aa17cb	fix(streaming): prevent <think> in prose from suppressing response output When the model mentions <think> as literal text in its response (e.g. "(/think not producing <think> tags)"), the streaming display treated it as a reasoning block opener and suppressed everything after it. The response box would close with truncated content and no error — the API response was complete but the display ate it. Root cause: _stream_delta() matched <think> anywhere in the text stream regardless of position. Real reasoning blocks always start at the beginning of a line; mentions in prose appear mid-sentence. Fix: track line position across streaming deltas with a _stream_last_was_newline flag. Only enter reasoning suppression when the tag appears at a block boundary (start of stream, after a newline, or after only whitespace on the current line). Add a _flush_stream() safety net that recovers buffered content if no closing tag is found by end-of-stream. Also fixes three related issues discovered during investigation: - anthropic_adapter: _get_anthropic_max_output() now normalizes dots to hyphens so 'claude-opus-4.6' matches the 'claude-opus-4-6' table key (was returning 32K instead of 128K) - run_agent: send explicit max_tokens for Claude models on Nous Portal, same as OpenRouter — both proxy to Anthropic's API which requires it. Without it the backend defaults to a low limit that truncates responses. - run_agent: reset truncated_tool_call_retries after successful tool execution so a single truncation doesn't poison the entire conversation.	2026-04-09 22:16:36 -07:00
Teknium	8394b5ddd2	feat: expand /fast to all OpenAI Priority Processing models (#6960 ) Previously /fast only supported gpt-5.4 and forced a provider switch to openai-codex. Now supports all 13 models from OpenAI's Priority Processing pricing table (gpt-5.4, gpt-5.4-mini, gpt-5.2, gpt-5.1, gpt-5, gpt-5-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gpt-4o-mini, o3, o4-mini). Key changes: - Replaced _FAST_MODE_BACKEND_CONFIG with _PRIORITY_PROCESSING_MODELS frozenset - Removed provider-forcing logic — service_tier is now injected into whatever API path the user is already on (Codex Responses, Chat Completions, or OpenRouter passthrough) - Added request_overrides support to chat_completions path in run_agent.py - Updated messaging from 'Codex inference tier' to 'Priority Processing' - Expanded test coverage for all supported models	2026-04-09 22:06:30 -07:00
g-guthrie	d416a69288	feat: add Codex fast mode toggle (/fast command) Add /fast slash command to toggle OpenAI Codex service_tier between normal and priority ('fast') inference. Only exposed for models registered in _FAST_MODE_BACKEND_CONFIG (currently gpt-5.4). - Registry-based backend config for extensibility - Dynamic command visibility (hidden from help/autocomplete for non-supported models) via command_filter on SlashCommandCompleter - service_tier flows through request_overrides from route resolution - Omit max_output_tokens for Codex backend (rejects it) - Persists to config.yaml under agent.service_tier Salvage cleanup: removed simple_term_menu/input() menu (banned), bare /fast now shows status like /reasoning. Removed redundant override resolution in _build_api_kwargs — single source of truth via request_overrides from route. Co-authored-by: Hermes Agent <hermes@nousresearch.com>	2026-04-09 21:54:32 -07:00
kshitijk4poor	08e2a1a51e	fix(anthropic): omit tool-streaming beta on MiniMax endpoints MiniMax's Anthropic-compatible endpoints reject requests that include the fine-grained-tool-streaming beta header — every tool-use message triggers a connection error (~18s timeout). Regular chat works fine. Add _common_betas_for_base_url() that filters out the tool-streaming beta for Bearer-auth (MiniMax) endpoints while keeping all other betas. All four client-construction branches now use the filtered list. Based on #6528 by @HiddenPuppy. Original cherry-picked from PR #6688 by kshitijk4poor. Fixes #6510, fixes #6555.	2026-04-09 17:53:52 -07:00
Teknium	9634e20e15	feat: API server model name derived from profile name (#6857 ) * feat: API server model name derived from profile name For multi-user setups (e.g. OpenWebUI), each profile's API server now advertises a distinct model name on /v1/models: - Profile 'lucas' -> model ID 'lucas' - Profile 'admin' -> model ID 'admin' - Default profile -> 'hermes-agent' (unchanged) Explicit override via API_SERVER_MODEL_NAME env var or platforms.api_server.model_name config for custom names. Resolves friction where OpenWebUI couldn't distinguish multiple hermes-agent connections all advertising the same model name. * docs: multi-user setup with profiles for API server + Open WebUI - api-server.md: added Multi-User Setup section, API_SERVER_MODEL_NAME to config table, updated /v1/models description - open-webui.md: added Multi-User Setup with Profiles section with step-by-step guide, updated model name references - environment-variables.md: added API_SERVER_MODEL_NAME entry	2026-04-09 17:07:29 -07:00
AIandI0x1	2d0d05a337	fix(agent): detect truncated streaming tool calls before execution When a streaming response is cut mid-tool-call (connection drop, timeout), the accumulated function.arguments is invalid JSON. The mock response builder defaulted finish_reason to 'stop', so the agent loop treated it as a valid completed turn and tried to execute tools with broken args. Fix: validate tool call arguments with json.loads() during mock response reconstruction. If any are invalid JSON, override finish_reason to 'length'. In the main loop's length handler, if tool calls are present, refuse to execute and return partial=True with a clear error instead of silently failing or wasting retries. Also fixes _thinking_exhausted to not short-circuit when tool calls are present — truncated tool calls are not thinking exhaustion. Original cherry-picked from PR #6776 by AIandI0x1. Closes #6638.	2026-04-09 17:03:54 -07:00
Teknium	3b554bf839	fix: test for suppress_status_output should capture stdout, not mock _vprint The test was mocking _vprint entirely, bypassing the suppress guard. Switch to capturing _print_fn output so the real _vprint runs and the guard suppresses retry noise as intended.	2026-04-09 16:24:53 -07:00
adybag14-cyber	c3141429b7	fix(termux): tighten voice setup and mobile chat UX	2026-04-09 16:24:53 -07:00
adybag14-cyber	769ec1ee1a	fix(termux): deepen browser, voice, and tui support	2026-04-09 16:24:53 -07:00
adybag14-cyber	3237733ca5	fix(termux): harden execute_code and mobile browser/audio UX	2026-04-09 16:24:53 -07:00
adybag14-cyber	54d5138a54	fix(termux): harden env-backed background jobs	2026-04-09 16:24:53 -07:00
adybag14-cyber	6dcb3c4774	fix(termux): compact narrow-screen tui chrome	2026-04-09 16:24:53 -07:00
adybag14-cyber	096b3f9f12	fix(termux): add local image chat route	2026-04-09 16:24:53 -07:00
adybag14-cyber	a3aed1bd26	fix(termux): keep quiet chat output parseable	2026-04-09 16:24:53 -07:00
adybag14-cyber	4970705ed3	fix(termux): silence quiet chat tool previews	2026-04-09 16:24:53 -07:00
adybag14-cyber	2194425918	fix(termux): make setup-hermes use android path	2026-04-09 16:24:53 -07:00
adybag14-cyber	3878495972	fix(termux): disable gateway service flows on android	2026-04-09 16:24:53 -07:00
adybag14-cyber	4e40e93b98	fix(termux): improve status and install UX	2026-04-09 16:24:53 -07:00
adybag14-cyber	122925a6f2	fix(termux): honor temp dirs for local temp artifacts	2026-04-09 16:24:53 -07:00
adybag14-cyber	e79cc88985	feat: add tested Termux install path and EOF-aware gh auth	2026-04-09 16:24:53 -07:00
sprmn24	e053433c84	fix(error_classifier): disambiguate usage-limit patterns in _classify_by_message _classify_by_message had no handling for _USAGE_LIMIT_PATTERNS, so messages like 'usage limit exceeded, try again in 5 minutes' arriving without an HTTP status code fell through to FailoverReason.unknown instead of rate_limit. Apply the same billing/rate-limit disambiguation that _classify_402 already uses: USAGE_LIMIT_PATTERNS + transient signal → rate_limit, USAGE_LIMIT_PATTERNS alone → billing. Add 4 tests covering the no-status-code usage-limit path.	2026-04-09 16:24:13 -07:00
Siddharth Balyan	1789c2699a	feat(nix): shared-state permission model for interactive CLI users (#6796 ) * feat(nix): shared-state permission model for interactive CLI users Enable interactive CLI users in the hermes group to share full read-write state (sessions, memories, logs, cron) with the gateway service via a setgid + group-writable permission model. Changes: nix/nixosModules.nix: - Directories use setgid 2770 (was 0750) so new files inherit the hermes group. home/ stays 0750 (no interactive write needed). - Activation script creates HERMES_HOME subdirs (cron, sessions, logs, memories) — previously Python created them but managed mode now skips mkdir. - Activation migrates existing runtime files to group-writable (chmod g+rw). Nix-managed files (config.yaml, .env, .managed) stay 0640/0644. - Gateway systemd unit gets UMask=0007 so files it creates are 0660. hermes_cli/config.py: - ensure_hermes_home() splits into managed/unmanaged paths. Managed mode verifies dirs exist (raises RuntimeError if not) instead of creating them. Scoped umask(0o007) ensures SOUL.md is created as 0660. hermes_logging.py: - _ManagedRotatingFileHandler subclass applies chmod 0660 after log rotation in managed mode. RotatingFileHandler.doRollover() creates new files via open() which uses the process umask (0022 → 0644), not the scoped umask from ensure_hermes_home(). Verified with a 13-subtest NixOS VM integration test covering setgid, interactive writes, file ownership, migration, and gateway coexistence. Refs: #6044 * Fix managed log file mode on initial open Co-authored-by: Siddharth Balyan <alt-glitch@users.noreply.github.com> * refactor: simplify managed file handler and merge activation loops - Cache is_managed() result in handler __init__ instead of lazy-importing on every _open()/_chmod_if_managed() call. Avoids repeated stat+env checks on log rotation. - Merge two for-loops over the same subdir list in activation script into a single loop (mkdir + chown + chmod + find in one pass). --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Siddharth Balyan <alt-glitch@users.noreply.github.com>	2026-04-10 03:48:42 +05:30
Teknium	6b437f7934	fix: /browser connect auto-launch uses dedicated profile dir (#6821 ) Chrome auto-launch now passes --user-data-dir, --no-first-run, and --no-default-browser-check so the debug instance doesn't conflict with an already-running Chrome using the default profile. The profile dir lives at {hermes_home}/chrome-debug/. Also updates the fallback manual instructions to include the same flags and removes the stale 'close existing Chrome windows' hint.	2026-04-09 14:55:45 -07:00
Teknium	f91fffbe33	Revert "fix: /browser connect auto-launch uses dedicated profile dir" This reverts commit `c3854e0f85`.	2026-04-09 14:54:37 -07:00
Teknium	c3854e0f85	fix: /browser connect auto-launch uses dedicated profile dir Chrome auto-launch now passes --user-data-dir, --no-first-run, and --no-default-browser-check so the debug instance doesn't conflict with an already-running Chrome using the default profile. The profile dir lives at {hermes_home}/chrome-debug/. Also updates the fallback manual instructions to include the same flags and removes the stale 'close existing Chrome windows' hint.	2026-04-09 14:52:58 -07:00
Teknium	c8bbd29aae	fix: update tests for gws migration - Rewrite test_google_workspace_api.py: test bridge token handling and calendar date range instead of removed get_credentials() - Update test_google_oauth_setup.py: partial scopes now accepted with warning instead of rejected with SystemExit	2026-04-09 14:28:35 -07:00
Greer Guthrie	775a46ce75	fix: normalize reasoning effort ordering in UI	2026-04-09 14:20:16 -07:00
Zheng Li	88dbbfe982	feat(gateway): unified proxy support for Discord and Telegram with macOS auto-detection - Add resolve_proxy_url() to base.py — shared by all platform adapters - Check HTTPS_PROXY / HTTP_PROXY / ALL_PROXY env vars first - Fall back to macOS system proxy via scutil --proxy (zero-config) - Pass proxy= to discord.py commands.Bot() for gateway connectivity - Refactor telegram_network.py to use shared resolver - Update test fixtures to accept proxy kwarg	2026-04-09 14:19:06 -07:00
dashed	7f7b02b764	fix(slack): comprehensive mrkdwn formatting — 6 bug fixes + 52 tests Fixes blockquote > escaping, edit_message raw markdown, *bold italic* handling, HTML entity double-escaping (&amp;), Wikipedia URL parens truncation, and step numbering format. Also adds format_message to the tool-layer _send_to_platform for consistent formatting across all delivery paths. Changes: - Protect Slack entities (<@user>, <https://...\|label>, <!here>) from escaping passes - Protect blockquote > markers before HTML entity escaping - Unescape-before-escape for idempotent HTML entity handling - *bold italic* → _text_ conversion (before bold pass) - URL regex upgraded to handle balanced parentheses - mrkdwn:True flag on chat_postMessage payloads - format_message applied in edit_message and send_message_tool - 52 new tests (format, edit, streaming, splitting, tool chunking) - Use reversed(dict) idiom for placeholder restoration Based on PR #3715 by dashed, cherry-picked onto current main.	2026-04-09 14:07:32 -07:00
Doruk Ardahan	7d499c75db	feat(slack): add require_mention and free_response_channels config support Port the mention gating pattern from Telegram, Discord, WhatsApp, and Matrix adapters to the Slack platform adapter. - Add _slack_require_mention() with explicit-false parsing and env var fallback (SLACK_REQUIRE_MENTION) - Add _slack_free_response_channels() with env var fallback (SLACK_FREE_RESPONSE_CHANNELS) - Replace hardcoded mention check with configurable gating logic - Bridge slack config.yaml settings to env vars - Bridge free_response_channels through the generic platform bridging loop - Add 26 tests covering config parsing, env fallback, gating logic Config usage: slack: require_mention: false free_response_channels: - "C0AQWDLHY9M" Default behavior unchanged: channels require @mention (backward compatible). Based on PR #5885 by dorukardahan, cherry-picked and adapted to current main.	2026-04-09 14:07:32 -07:00
Dylan Socolobsky	c6dba918b3	fix(tests): fix several failing/flaky tests on main (#6777 ) * fix(tests): mock is_safe_url in tests that use example.com Tests using example.com URLs were failing because is_safe_url does a real DNS lookup which fails in environments where example.com doesn't resolve, causing the request to be blocked before reaching the already-mocked HTTP client. This should fix around 17 failing tests. These tests test logic, caching, etc. so mocking this method should not modify them in any way. TestMattermostSendUrlAsFile was already doing this so we follow the same pattern. * fix(test): use case-insensitive lookup for model context length check DEFAULT_CONTEXT_LENGTHS uses inconsistent casing (MiniMax keys are lowercase, Qwen keys are mixed-case) so the test was broken in some cases since it couldn't find the model. * fix(test): patch is_linux in systemd gateway restart test The test only patched is_macos to False but didn't patch is_linux to True. On macOS hosts, is_linux() returns False and the systemd restart code path is skipped entirely, making the assertion fail. * fix(test): use non-blocklisted env var in docker forward_env tests GITHUB_TOKEN is in api_key_env_vars and thus in _HERMES_PROVIDER_ENV_BLOCKLIST so the env var is silently dropped, we replace it with a non-blocked one like DATABASE_URL so the tests actually work. * fix(test): fully isolate _has_any_provider_configured from host env _has_any_provider_configured() checks all env vars from PROVIDER_REGISTRY (not just the 5 the tests were clearing) and also calls get_auth_status() which detects gh auth token for Copilot. On machines with any of these set, the function returns True before reaching the code path under test. Clear all registry vars and mock get_auth_status so host credentials don't interfere. * fix(test): correct path to hermes_base_env.py in tool parser tests Path(__file__).parent.parent resolved to tests/, not the project root. The file lives at environments/hermes_base_env.py so we need one more parent level. * fix(test): accept optional HTML fields in Matrix send payload _send_matrix sometimes adds format and formatted_body when the markdown library is installed. The test was doing an exact dict equality check which broke. Check required fields instead. * fix(test): add config.yaml to codex vision requirements test The test only wrote auth.json but not config.yaml, so _read_main_provider() returned empty and vision auto-detect never tried the codex provider. Add a config.yaml pointing at openai-codex so the fallback path actually resolves the client. * fix(test): clear OPENROUTER_API_KEY in _isolate_hermes_home run_agent.py calls load_hermes_dotenv() at import time, which injects API keys from ~/.hermes/.env into os.environ before any test fixture runs. This caused test_agent_loop_tool_calling to make real API calls instead of skipping, which ends up making some tests fail. * fix(test): add get_rate_limit_state to agent mock in usage report tests _show_usage now calls agent.get_rate_limit_state() for rate limit display. The SimpleNamespace mock was missing this method. * fix(test): update expected Camofox config version from 12 to 13 * fix(test): mock _get_enabled_platforms in nous managed defaults test Importing gateway.run leaks DISCORD_BOT_TOKEN into os.environ, which makes _get_enabled_platforms() return ["cli", "discord"] instead of just ["cli"]. tools_command loops per platform, so apply_nous_managed_defaults runs twice: the first call sets config values, the second sees them as already configured and returns an empty set, causing the assertion to fail.	2026-04-09 13:17:06 -07:00
Teknium	3eade90b39	fix: OpenClaw migration now shows dry-run preview before executing (#6769 ) The setup wizard's OpenClaw migration previously ran immediately with aggressive defaults (overwrite=True, preset=full) after a single 'Would you like to import?' prompt. This caused several problems: - Config values with different semantics (e.g. tool_call_execution: 'auto' in OpenClaw vs 'off' for Hermes yolo mode) were imported without translation - Gateway tokens were hijacked from OpenClaw without warning, taking over Telegram/Slack/Discord channels - Instruction files (.md) containing OpenClaw-specific setup/restart procedures were copied, causing Hermes restart failures Now the migration: 1. Asks 'Would you like to see what can be imported?' (softer framing) 2. Runs a dry-run preview showing everything that would be imported 3. Displays categorized warnings for high-impact items (gateway takeover, config value differences, instruction files) 4. Asks for explicit confirmation with default=No 5. Executes with overwrite=False (preserves existing Hermes config) Also extracts _load_openclaw_migration_module() for reuse and adds _print_migration_preview() with keyword-based warning detection. Tests updated for two-phase behavior + new test for decline-after-preview.	2026-04-09 12:15:06 -07:00
KUSH42	34d06a9802	fix(compaction): don't halve context_length on output-cap-too-large errors When the API returns "max_tokens too large given prompt" (input tokens are within the context window, but input + requested output > window), the old code incorrectly routed through the same handler as "prompt too long" errors, calling get_next_probe_tier() and permanently halving context_length. This made things worse: the window was fine, only the requested output size needed trimming for that one call. Two distinct error classes now handled separately: Prompt too long — input itself exceeds context window. Fix: compress history + halve context_length (existing behaviour, unchanged). Output cap too large — input OK, but input + max_tokens > window. Fix: parse available_tokens from the error message, set a one-shot _ephemeral_max_output_tokens override for the retry, and leave context_length completely untouched. Changes: - agent/model_metadata.py: add parse_available_output_tokens_from_error() that detects Anthropic's "available_tokens: N" error format and returns the available output budget, or None for all other error types. - run_agent.py: call the new parser first in the is_context_length_error block; if it fires, set _ephemeral_max_output_tokens (with a 64-token safety margin) and break to retry without touching context_length. _build_api_kwargs consumes the ephemeral value exactly once then clears it so subsequent calls use self.max_tokens normally. - agent/anthropic_adapter.py: expand build_anthropic_kwargs docstring to clearly document the max_tokens (output cap) vs context_length (total window) distinction, which is a persistent source of confusion due to the OpenAI-inherited "max_tokens" name. - cli-config.yaml.example: add inline comments explaining both keys side by side where users are most likely to look. - website/docs/integrations/providers.md: add a callout box at the top of "Context Length Detection" and clarify the troubleshooting entry. - tests/test_ctx_halving_fix.py: 24 tests across four classes covering the parser, build_anthropic_kwargs clamping, ephemeral one-shot consumption, and the invariant that context_length is never mutated on output-cap errors.	2026-04-09 11:27:41 -07:00
Teknium	2772d99085	fix: remove /prompt slash command — footgun via prefix expansion (#6752 ) /pr <anything> silently resolved to /prompt via the shortest-match tiebreaker in prefix expansion, permanently overwriting the system prompt and persisting to config. The command's functionality (setting agent.system_prompt) is available via config.yaml and /personality covers the common use case. Removes: CommandDef, dispatch branch, _handle_prompt_command handler, docs references, and updates subcommand extraction test.	2026-04-09 11:27:27 -07:00
Teknium	3007174a61	fix: prevent 400 format errors from triggering compression loop on Codex Responses API (#6751 ) The error classifier's generic-400 heuristic only extracted err_body_msg from the nested body structure (body['error']['message']), missing the flat body format used by OpenAI's Responses API (body['message']). This caused descriptive 400 errors like 'Invalid input[index].name: string does not match pattern' to appear generic when the session was large, misclassifying them as context overflow and triggering an infinite compression loop. Added flat-body fallback in _classify_400() consistent with the parent classify_api_error() function's existing handling at line 297-298.	2026-04-09 11:11:34 -07:00
Teknium	1a3ae6ac6e	feat: structured API error classification for smart failover (#6514 ) Add agent/error_classifier.py with a priority-ordered classification pipeline that replaces scattered inline string-matching in the retry loop with structured error taxonomy and recovery hints. FailoverReason enum (14 categories): auth, auth_permanent, billing, rate_limit, overloaded, server_error, timeout, context_overflow, payload_too_large, model_not_found, format_error, thinking_signature, long_context_tier, unknown. ClassifiedError dataclass carries reason + recovery action hints (retryable, should_compress, should_rotate_credential, should_fallback). Key improvements over inline matching: - 402 disambiguation: 'insufficient credits' = billing (immediate rotate), 'usage limit, try again' = rate_limit (backoff first) - OpenRouter 403 'key limit exceeded' correctly classified as billing - Error cause chain walking (walks __cause__/__context__ up to 5 levels) - Body message included in pattern matching (SDK str() misses it) - Server disconnect + large session check ordered before generic transport catch so RemoteProtocolError triggers compression when appropriate - Chinese error message support for context overflow run_agent.py: replaced 6 inline detection blocks with classifier calls, net -55 lines. All recovery actions (pool rotation, fallback activation, compression, transport recovery) unchanged. 65 new unit tests + 10 E2E tests + live tests with real SDK error objects. Inspired by OpenClaw's failover error classification system.	2026-04-09 04:10:11 -07:00

1 2 3 4 5 ...

1474 commits