hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
wysie	ff078738ea	fix(skills): load symlinked skill slash commands	2026-05-18 00:34:29 -07:00
Teknium	abf1af5401	feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590 ) * feat(session_search): single-shape tool with discovery, scroll, browse — no LLM Replaces the LLM-summarized session_search with a single-shape tool that returns actual messages from the DB. Three calling shapes inferred from args (no mode parameter): 1. Discovery — pass query. FTS5 + anchored ±5 window + bookends per hit, all in one call. ~20ms on a real DB instead of ~90s for the previous three aux-LLM calls. 2. Scroll — pass session_id + around_message_id. Returns a window centered on the anchor. To paginate, re-anchor on the first/last id of the returned window. Boundary message appears in both windows as the orientation marker. ~1ms per scroll call. 3. Browse — no args. Recent sessions chronologically. Bookend_start (first 3 user+assistant msgs) and bookend_end (last 3) give the agent goal + resolution on every discovery hit, so a single tool call reconstructs a long session's arc without loading the whole transcript. The aux-LLM summary path is gone: it cost ~$0.30/call, took ~30s, and laundered FTS5 hits through a model that could confabulate when the right session wasn't in the hit list. The merged shape returns byte-for-byte content from SQLite. History: - PR #20238 (JabberELF) seeded the fast/summary dual-mode split. - PR #26419 (yoniebans) expanded to fast/guided/summary with bookends, multi-anchor drill-down, default-mode config, and a teaching skill. This PR collapses that toolkit into one shape with explicit scroll support, drops the summary path, drops the mode parameter, drops the config knob, drops the skill. JabberELF's seed work is acknowledged via the AUTHOR_MAP entry. 
Validation: - 38/38 tool tests pass (tests/tools/test_session_search.py) - 12/12 get_messages_around tests pass (tests/hermes_state/) - 11/11 get_anchored_view tests pass (tests/hermes_state/) - Full tests/tools/ run: 5168 passing, 2 failures pre-exist on main (test ordering in test_delegate.py, unrelated) - E2E against live state DB: discovery 20ms, scroll 1ms, browse 280ms; pagination forward+backward works with boundary-message orientation; error paths return clean tool_error responses Co-authored-by: JabberELF <abcdjmm970703@gmail.com> Co-authored-by: yoniebans <jonny@nousresearch.com> * chore(session_search): prune dead LLM-summary config and docs Companion to the single-shape rewrite. The auxiliary.session_search config block, max_concurrency / extra_body tunables, and matching docs sections all referenced the removed LLM summarization path. Removing them so users don't try to tune knobs that nothing reads. - hermes_cli/config.py: drop dead auxiliary.session_search block from DEFAULT_CONFIG. Leftover keys in user config.yaml are harmless and ignored. - hermes_cli/tips.py: drop two tips referencing the removed max_concurrency / extra_body knobs. - website/docs/user-guide/configuration.md: drop 'Session Search Tuning' section and the auxiliary.session_search block from the example. - website/docs/user-guide/features/fallback-providers.md: drop session_search rows from the auxiliary-tasks tables and the dedicated tuning subsection. - website/docs/reference/tools-reference.md: rewrite the session_search entry to describe the new three-shape behaviour. - CONTRIBUTING.md: update the file-tree description. - tests/tools/test_llm_content_none_guard.py: remove TestSessionSearchContentNone class and test_session_search_tool_guarded — both guard against an unguarded .content.strip() call site in _summarize_session() that no longer exists. Validation: 97/97 targeted tests still pass (hermes_state + session_search + llm_content_none_guard). Config tests 55/55. 
--------- Co-authored-by: JabberELF <abcdjmm970703@gmail.com> Co-authored-by: yoniebans <jonny@nousresearch.com>	2026-05-17 23:28:45 -07:00
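The scroll shape described in the commit above (a window centered on an anchor, paginated by re-anchoring on the first or last returned id) can be sketched as follows. This is only an illustration over an in-memory list: `window_around`, the `radius` parameter, and the `(id, text)` message shape are hypothetical and are not the project's `get_messages_around` API.

```python
from typing import List, Tuple

Message = Tuple[int, str]  # (message_id, text); hypothetical shape for this sketch


def window_around(messages: List[Message], anchor_id: int, radius: int = 5) -> List[Message]:
    """Return up to `radius` messages on each side of the anchor, anchor included."""
    ids = [mid for mid, _ in messages]
    if anchor_id not in ids:
        raise KeyError(f"message {anchor_id} not in session")
    pos = ids.index(anchor_id)
    start = max(0, pos - radius)
    return messages[start : pos + radius + 1]


session = [(i, f"msg {i}") for i in range(1, 31)]
first = window_around(session, anchor_id=10)
# Page forward by re-anchoring on the last id of the previous window.
# The boundary message shows up in both windows as the orientation marker.
nxt = window_around(session, anchor_id=first[-1][0])
```

Paging backward works the same way, re-anchoring on `first[0][0]` instead.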
teknium1	4a3f13b47b	perf(prompt-cache): date-only timestamp + loud gateway-DB roundtrip logging The system prompt's 'Conversation started:' line carried minute precision (%I:%M %p), making it byte-unstable across every rebuild path. Within a CLI session the in-memory cache held, but on the gateway path (fresh AIAgent per turn → restore from session DB), any silent failure in the read or write path dropped the cache stem and forced a full re-prefill on every subsequent turn. Local prefix-caching backends (llama.cpp / vLLM) saw this as KV-cache invalidation; remote prefix-caching providers saw it as an Anthropic-style cache miss. Three changes: 1. Date-only timestamp ('Sunday, May 17, 2026' instead of '... 03:42 PM'). System prompt now byte-stable for the full day. The model can still query exact time via tools when it actually needs it. Credit: @iamfoz (PR #20451). 2. Loud logging on session DB write failures. The update_system_prompt call used to log at DEBUG, hiding disk-full / locked-database / schema drift behind a silent fall-through that forced fresh rebuilds on every subsequent turn. Now WARN with the session id and exception so persistent issues show up in agent.log without verbose mode. 3. Three-way stored-state distinction on read. The previous 'session_row.get("system_prompt") or None' collapsed three states into one (missing row / null column / empty string). Now we tell them apart and WARN when a continuing session lands on null/empty (which means the previous turn's write never persisted — every subsequent turn rebuilds and the prefix cache misses every time). The restore block is extracted into _restore_or_build_system_prompt() so the prefix-cache path can be unit-tested in isolation. E2E proof: fresh AIAgent constructed for turn 2 across a minute-boundary sleep restores byte-identical bytes from the session DB. NULL stored prompt fires the new warning. Date-only timestamp survives the rebuild path. All on real SessionDB, no mocks. 
Tests: - tests/agent/test_system_prompt_restore.py (10 new tests) - tests/run_agent/test_run_agent.py::TestBuildSystemPrompt::test_datetime_is_date_only_not_minute_precision Closes #20451 (date-only), #18547 (prefix stabilization), #8689 (stabilize timestamp across compression), #15866 (timestamp caching question), #8687 (compression timestamp), #27339 (claim #3: live timestamp in cached system prompt). Co-authored-by: Martyn Forryan <9133432+iamfoz@users.noreply.github.com>	2026-05-17 23:20:37 -07:00
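Changes 1 and 3 of the prompt-cache commit above can be illustrated with a short sketch. It assumes the session row is a plain dict (or `None` when missing); the function names and the returned state labels are hypothetical and do not reproduce `_restore_or_build_system_prompt()`.

```python
from datetime import datetime
from typing import Optional


def conversation_started_line(now: datetime) -> str:
    # Date only: the text stays byte-stable for the whole day,
    # unlike a format that includes %I:%M %p.
    return "Conversation started: " + now.strftime("%A, %B %d, %Y")


def classify_stored_prompt(session_row: Optional[dict]) -> str:
    """Tell apart the three states that `row.get(...) or None` collapses."""
    if session_row is None:
        return "missing_row"
    value = session_row.get("system_prompt")
    if value is None:
        return "null_column"
    if value == "":
        return "empty_string"
    return "present"
```

In this sketch a caller would warn on `null_column` or `empty_string` for a continuing session, since either one implies the previous turn's write never persisted.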
Teknium	9b91377bec	feat(grok): apply OpenAI execution guidance to xAI Grok / xai-oauth models (#27797 ) Grok models hit the same failure modes that OPENAI_MODEL_EXECUTION_GUIDANCE addresses for GPT/Codex: claiming completion without tool calls ('to be honest, I didn't create the file yet'), suggesting workarounds instead of using existing tools (proposing a folder-based memory system when the memory tool exists), replying with plans instead of executing. TOOL_USE_ENFORCEMENT_GUIDANCE was already injected for any model whose name contains 'grok' (TOOL_USE_ENFORCEMENT_MODELS). This extends the follow-on family-specific block — OPENAI_MODEL_EXECUTION_GUIDANCE (tool_persistence / mandatory_tool_use / act_dont_ask / prerequisite_checks / verification / missing_context) — to grok-named models too. The OPENAI_ prefix is retained for backwards compat with imports/tests; docstring + inline comment now note that the body is family-agnostic and the prefix reflects origin, not exclusivity. Tests cover the OpenRouter slug (x-ai/grok-4.3) and the xai-oauth bare name (grok-4.3), plus a negative control on claude. E2E verified against a real AIAgent build of the system prompt for both xai-oauth and openrouter grok models.	2026-05-17 23:00:37 -07:00
zccyman	a574246837	feat(auxiliary): add configurable fallback chains + main-agent safety net Layered fallback for auxiliary tasks (compression, vision, tts, web_extract, session_search, etc.): 1. Primary aux provider (existing) 2. User-configured auxiliary.<task>.fallback_chain (new) 3. Main agent provider + model (new — last-resort safety net) 4. Warn user + re-raise original error (new) For users on 'auto' (no explicit aux provider), the existing _try_payment_fallback auto-detection chain runs instead — its Step 1 already IS the main agent model, so they get the same behaviour without configuration. The configured fallback_chain config schema comes from #26882 / @zccyman; the main-agent safety net + exhaustion warning were added on top. Closes #26882. Builds on the capacity-error gate fix in the previous commit (#26803 / @Bartok9).	2026-05-17 17:15:31 -07:00
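The layering in the commit above (primary provider, configured fallback chain, main-agent safety net, then warn and re-raise the original error) might look roughly like this. It is a simplified sketch: providers are modeled as zero-argument callables, `call_with_fallbacks` is a hypothetical name, and the real code's distinction between capacity errors and other failures is left out.

```python
import logging
from typing import Callable, Optional, Sequence

logger = logging.getLogger(__name__)


def call_with_fallbacks(
    primary: Callable[[], str],
    fallback_chain: Sequence[Callable[[], str]],
    main_agent: Callable[[], str],
) -> str:
    """Try each layer in order; on total failure, re-raise the first error."""
    original_error: Optional[Exception] = None
    for attempt in (primary, *fallback_chain, main_agent):
        try:
            return attempt()
        except Exception as exc:  # sketch only: real code would filter error types
            if original_error is None:
                original_error = exc
    logger.warning("All auxiliary providers failed; re-raising the original error")
    assert original_error is not None
    raise original_error
```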
Bartok9	24c209f112	fix(auxiliary): detect quota exhaustion as payment error; allow capacity-error fallback for explicit providers Closes #26803 Root causes: 1. _is_payment_error() checked for billing keywords (credits, insufficient funds, billing, payment required) but missed daily token quota exhaustion phrases used by Bedrock, Vertex AI, and LiteLLM proxies — e.g. 'Too many tokens per day', 'quota exceeded', 'resource exhausted', 'daily limit'. These are functionally identical to credit exhaustion (provider cannot serve the request) but don't trigger fallback. 2. The call_llm() fallback chain was gated on resolved_provider == 'auto'. When a task resolves to a specific provider (e.g. 'custom' for a LiteLLM proxy, or 'openrouter'), capacity failures (payment/quota/connection) silently raise instead of trying alternatives. This is overly conservative: capacity errors mean the provider cannot serve the request regardless of user intent, so alternatives should always be tried. Fixes: - Add quota-related keywords to _is_payment_error(): quota_exceeded, too many tokens per day, daily limit, tokens per day, daily quota, resource exhausted (Vertex AI gRPC code). - Allow fallback for capacity errors (payment + connection) even when resolved_provider is not 'auto'. Rate-limit fallback stays gated on is_auto to honour explicit provider constraints for transient limits. - Apply both fixes to sync call_llm() and async acall_llm() paths. - Add 6 targeted tests for the new quota-error detection cases.	2026-05-17 17:15:31 -07:00
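A keyword check of the kind this commit describes could be sketched as below. The keyword lists are copied from the commit message; the function name `is_payment_error` (without the leading underscore) and the choice to match on a lowercased message string are assumptions, since the real helper may inspect exception objects or status codes as well.

```python
# Billing phrases the commit says were already recognised.
BILLING_KEYWORDS = ("credits", "insufficient funds", "billing", "payment required")

# Quota phrases the commit adds.
QUOTA_KEYWORDS = (
    "quota_exceeded",
    "too many tokens per day",
    "daily limit",
    "tokens per day",
    "daily quota",
    "resource exhausted",
)


def is_payment_error(message: str) -> bool:
    """Return True when the error text signals billing or quota exhaustion."""
    text = message.lower()
    return any(keyword in text for keyword in BILLING_KEYWORDS + QUOTA_KEYWORDS)
```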
Robin Fernandes	569bc94b59	fix(auth) fix a few cases where refresh tokens were not rotated.	2026-05-17 16:56:37 -07:00
Robin Fernandes	20bffa5b37	refactor(auth): mostly cleanups and style changes	2026-05-17 16:56:37 -07:00
Robin Fernandes	0bac7dd05b	refactor(auth): collapse Nous inference fallback controls	2026-05-17 16:56:37 -07:00
Robin Fernandes	89a3d038cf	Switch to JWT token for inference against Nous, falling back to old opaque token on failure.	2026-05-17 16:56:37 -07:00
Robin Fernandes	c905562623	fix(auth): stop replaying invalid Nous refresh tokens Quarantine Nous OAuth state when refresh fails with terminal invalid_grant/invalid_token errors. Clear local and shared refresh material across runtime, managed access-token, proxy, and credential-pool paths so Hermes stops retrying revoked refresh sessions.	2026-05-17 16:56:37 -07:00
teknium1	bdc2113b5c	fix(xai): wire schema sanitizer into post-refactor build_api_kwargs Port of the run_agent.py changes from #27219 to current main: the _build_api_kwargs body was extracted into agent/chat_completion_helpers.build_api_kwargs, so wire the xAI tool-schema sanitization there (provider in {'xai', 'xai-oauth'} or base_url=api.x.ai). Logs a warning instead of silently swallowing exceptions, matching the contributor's review-followup fix. Co-authored-by: zccyman <zccyman@163.com>	2026-05-17 13:13:22 -07:00
teknium1	822e92edb3	fix(aux): default OpenRouter auxiliary to gemini-3-flash-preview	2026-05-17 12:44:48 -07:00
Hoang V. Pham	4a7cd2e16d	fix(codex): allow kanban worker board writes	2026-05-17 11:50:43 -07:00
teknium1	55d6a1636b	fix(agent): honor provider timeout config in streaming API calls Closes #25249 (and supersedes PR #25260) in spirit. Two bugs in the streaming chat-completions path caused provider timeout configuration to be silently ignored: 1. Hardcoded connect/pool timeout. The httpx.Timeout for streaming calls used hardcoded connect=30.0 and pool=30.0 regardless of the user's providers.<id>.request_timeout_seconds config. If the custom provider (e.g. Ollama) was unreachable, the call always waited exactly 30s before failing, ignoring any configured timeout. Fix: use min(_base_timeout, 60.0) for connect and pool when a provider timeout is configured, falling back to 30.0 otherwise. The 60s cap addresses review feedback (TCP handshake shouldn't wait the inference timeout — connect/pool cover the connection layer, not model latency). 2. Streaming stale-stream detector ignored provider config. The stale detector read only HERMES_STREAM_STALE_TIMEOUT (env default 180s). The providers.<id>.stale_timeout_seconds key (correctly used in the non-streaming path) was never consulted. Fix: check get_provider_stale_timeout(provider, model) first, then fall back to the env var. Aligns the streaming path with the non-streaming path's priority chain (config > env > default). Salvage shape diverged from PR #25260: the function moved to agent/chat_completion_helpers.py and the contributor's two commits (initial fix + 60s-cap review follow-up) are squashed into one final commit applied at the new location. Original diagnosis, fix shape, AND the 60s-cap review response from @zccyman in PR #25260; credited via Co-authored-by. Co-authored-by: zccyman <16263913+zccyman@users.noreply.github.com>	2026-05-17 11:39:37 -07:00
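The two timeout rules in the commit above reduce to a few lines of arithmetic. The sketch below assumes the configured values arrive as optional floats; the function and constant names are hypothetical, and it does not construct an actual `httpx.Timeout`.

```python
from typing import Optional

DEFAULT_CONNECT_TIMEOUT = 30.0  # used when no provider timeout is configured
CONNECT_TIMEOUT_CAP = 60.0      # connection layer should not wait the full inference timeout
DEFAULT_STALE_TIMEOUT = 180.0   # env default quoted in the commit message


def connect_timeout(provider_timeout: Optional[float]) -> float:
    """Connect/pool timeout: min(configured, 60s), else 30s."""
    if provider_timeout is None:
        return DEFAULT_CONNECT_TIMEOUT
    return min(provider_timeout, CONNECT_TIMEOUT_CAP)


def stale_timeout(config_value: Optional[float], env_value: Optional[str]) -> float:
    """Priority chain: provider config > environment variable > default."""
    if config_value is not None:
        return config_value
    if env_value is not None:
        return float(env_value)
    return DEFAULT_STALE_TIMEOUT
```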
QuenVix	d5a0815c3d	fix(transports): use monotonic deadlines in codex app-server turn loop	2026-05-17 11:37:45 -07:00
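The commit message gives no detail, but the general technique of a monotonic deadline can be sketched as follows. `wait_until` and its parameters are hypothetical and unrelated to the actual codex app-server loop; the point is only that `time.monotonic()` is unaffected by wall-clock adjustments, unlike `time.time()`.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout_s: float, poll_s: float = 0.01) -> bool:
    """Poll `predicate` until it is true or the monotonic deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```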
kshitijk4poor	c74ff2c8ef	fix(browser): self-review pass — dead-import, log levels, future-proofing Addresses findings from two self-review passes pre-merge. First pass (3-agent parallel review): 1. plugins/browser/browser_use/provider.py: drop the ``_ = managed_nous_tools_enabled`` dead-import-hider in _get_config_or_none(). The import was actively misleading — the helper IS used in _get_config() (separate method, separate import), not here. The "keep static analysis happy" comment was wrong about what the helper does in this scope. 2. agent/browser_provider.py: drop ``pragma: no cover`` from is_configured() / provider_name() backward-compat aliases. They ARE covered by ``TestLegacyAbcAliases`` — the pragma would have masked future regressions. 3. tools/browser_tool.py: refactor _is_legacy_provider_registry_overridden() to compare against a module-frozen _DEFAULT_PROVIDER_REGISTRY snapshot instead of hardcoded set of 3 keys. Future maintainers adding a 4th built-in provider now just extend _PROVIDER_REGISTRY; the override detection adapts automatically. Previously the hardcoded ``set(...) != {"browserbase", "browser-use", "firecrawl"}`` would flip True forever on any 4-key registry, silently routing every install onto the legacy fixture path. 4. tools/browser_tool.py: when explicit ``browser.cloud_provider`` is set but the registry has no matching plugin (typo, uninstalled plugin, discovery failure), emit a WARNING with actionable text instead of silently falling through to auto-detect. Legacy code surfaced a typed credentials error via direct class instantiation; this log restores the signal in the post-migration path. 5. agent/browser_registry.py: trim the triple-redundant _LEGACY_PREFERENCE documentation. Module docstring + 13-line block-comment + 5-line inline comment was repeating the same point. Kept the docstring and trimmed the block-comment to 5 lines. 6. 
agent/browser_registry.py: upgrade is_available()-raised logging from DEBUG to WARNING with exc_info=True. A provider's availability check throwing is unusual enough that users debugging "no cloud provider" need the traceback in logs. 7. tests/plugins/browser/check_parity_vs_main.py: drop dead top-level imports (os, shutil, tempfile — only referenced inside the SUBPROCESS_SCRIPT string literal that runs in a child process). Second pass (architecture + claim-verification review): 8. tools/browser_tool.py: rewrite the inline comment in _get_cloud_provider auto-detect branch. Prior text claimed it "routes through the plugin registry's legacy preference walk so third-party plugins still get a chance to be selected when they're explicitly configured" — false on both counts. The branch uses module-level legacy class aliases (BrowserUseProvider / BrowserbaseProvider) directly; third-party plugins are intentionally reachable only via explicit ``browser.cloud_provider``. Corrected comment now matches behaviour and cross-references _LEGACY_PREFERENCE for the firecrawl gate rationale. 9. tools/browser_tool.py + tests/tools/test_managed_browserbase_and_modal.py: drop the unused ``get_active_browser_provider as _registry_get_active_browser_provider`` alias from the ``from agent.browser_registry import ...`` block. It was never referenced; matching test-stub line in the agent.browser_registry SimpleNamespace also dropped. ``get_provider`` is still imported (used by the explicit-config dispatch path at line 535). 10. plugins/browser/firecrawl/provider.py: align emergency_cleanup() with the early-guard pattern used in browserbase + browser_use plugins. Previously firecrawl tried the DELETE and relied on ``_headers()`` raising ValueError to trip a "missing credentials" warning; same final outcome but a different control flow that read like a bug to a maintainer skimming the three modules. 
Now: if is_available() is False, log+return early — identical shape to the other two providers. Verification: 54/54 unit tests + 13/13 parity scenarios still pass.	2026-05-17 04:04:15 -07:00
kshitijk4poor	40fde853fa	refactor(browser): dispatch _get_cloud_provider through agent.browser_registry Switches tools.browser_tool's cloud-provider lookup from the hardcoded _PROVIDER_REGISTRY class-instantiation pattern to the agent.browser_registry singleton registry that plugins self-populate. Changes: - tools/browser_tool.py top imports: pull BrowserProvider from agent.browser_provider (re-exported as CloudBrowserProvider for legacy callers) and the three provider classes from plugins/browser/<vendor>/. Legacy class names (BrowserbaseProvider, BrowserUseProvider, FirecrawlProvider) remain on tools.browser_tool as re-export shims so existing test patches (monkeypatch.setattr(browser_tool, 'BrowserUseProvider', ...)) keep working. - _get_cloud_provider() now consults agent.browser_registry.get_provider() for explicit-config lookups. The auto-detect fallback still uses BrowserUseProvider() / BrowserbaseProvider() at the module level so the cache-policy test fixtures (which patch those names) keep driving the function. Test-time _PROVIDER_REGISTRY overrides are detected by class identity and routed through the legacy factory-call path. - agent/browser_provider.py: BrowserProvider grows is_configured() and provider_name() as thin backward-compat aliases for the legacy CloudBrowserProvider API. Subclasses MUST implement is_available() and name; the aliases delegate. This keeps ~6 caller sites in browser_tool.py working without churning them. - tests/tools/test_managed_browserbase_and_modal.py: _install_fake_tools_package grows stubs for agent.browser_provider / agent.browser_registry / plugins.browser.<vendor>.provider so the test's spec-loader path (sys.modules-reset + reload-tool-from-disk) can satisfy tools.browser_tool's top-level imports. Verified: all 23 existing tests in test_browser_cloud_*.py + test_managed_browserbase_and_modal.py still pass post-cutover. 
The legacy tools/browser_providers/ directory is NOT yet deleted; several tests still _load_tool_module() those files via spec_from_file_location. The deletion + test-path updates land in a later commit.	2026-05-17 04:04:15 -07:00
kshitijk4poor	a15cdfb050	feat(browser): browser-use + firecrawl plugins; drop single-eligible shortcut Migrates the remaining two cloud browser providers to plugins: plugins/browser/browser_use/ — dual auth (direct BROWSER_USE_API_KEY or managed Nous gateway), idempotency- key handling for retried managed-mode creates, x-external-call-id capture. plugins/browser/firecrawl/ — direct FIRECRAWL_API_KEY only; distinct from plugins/web/firecrawl/ (same key, different endpoint). Also drops the 'single-eligible shortcut' rule from agent.browser_registry._resolve(). Was a copy-paste from web_search_registry that would have introduced a real behavior change: a user with only FIRECRAWL_API_KEY set (for web-extract) would silently get routed to a paid Firecrawl cloud browser on a fresh install — not matching origin/main, which only auto-detected between Browser Use and Browserbase. Third-party browser plugins are subject to the same gate: they require explicit `browser.cloud_provider` to take effect. Verified end-to-end via plugin discovery: - 3 plugins register (browser-use, browserbase, firecrawl) - _resolve(None) with no creds: None (local mode) - _resolve(None) with only FIRECRAWL_API_KEY: None (matches main) - _resolve('firecrawl'): firecrawl (explicit wins) - _resolve(None) with BU+firecrawl: browser-use (legacy walk first hit) - _resolve(None) with all three: browser-use (legacy walk order)	2026-05-17 04:04:15 -07:00
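The resolution rules that remain after this commit (explicit config wins; otherwise a legacy walk over browser-use then browserbase, with Firecrawl reachable only explicitly) can be sketched as below. Provider availability is modeled as a name-to-bool dict, and `resolve` is a hypothetical stand-in for `agent.browser_registry._resolve()`, whose real signature is not shown in the log.

```python
from typing import Dict, Optional

# Firecrawl is intentionally absent from the legacy walk.
_LEGACY_PREFERENCE = ("browser-use", "browserbase")


def resolve(explicit: Optional[str], available: Dict[str, bool]) -> Optional[str]:
    """Pick a cloud browser provider name, or None for local mode."""
    if explicit is not None:
        # Explicit config wins even when credentials are missing, so the
        # caller can surface a precise credentials error.
        return explicit if explicit in available else None
    for name in _LEGACY_PREFERENCE:
        if available.get(name):
            return name
    return None
```

The verification cases listed in the commit map directly onto calls to this function, for example `resolve(None, {...})` with only Firecrawl available returning `None`.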
kshitijk4poor	c6e6909e5a	feat(browser): add BrowserProvider ABC mirroring web_search_provider template Foundation commit for the browser-provider plugin migration (#25214). Mirrors the architecture established by PR #25182 (web providers): - agent/browser_provider.py — BrowserProvider ABC. Preserves the legacy CloudBrowserProvider lifecycle contract bit-for-bit (create_session, close_session, emergency_cleanup, session metadata shape) so the dispatcher in tools/browser_tool.py becomes a pure registry lookup. Renames is_configured() → is_available() for parity with WebSearchProvider. - agent/browser_registry.py — selection registry with the same three-rule resolution as web_search_registry: 1. Explicit config wins (returns even if is_available() == False so the dispatcher surfaces a precise credentials error) 2. Single-eligible shortcut 3. Legacy preference walk: browser-use → browserbase, filtered by availability. Firecrawl is intentionally NOT in the legacy walk (matches pre-migration behaviour — Firecrawl was only reachable via explicit browser.cloud_provider: firecrawl). - hermes_cli/plugins.py — adds ctx.register_browser_provider() facade, one-liner mirror of register_web_search_provider(). No plugins registered yet; no dispatcher cutover yet. The next commits move browserbase/browser-use/firecrawl into plugins/browser/<vendor>/ and switch tools/browser_tool.py over to the registry.	2026-05-17 04:04:15 -07:00
hawknewton	c02606a385	chore(deps): lazy-install boto3/botocore for bedrock adapter agent/bedrock_adapter.py now calls lazy_deps to install boto3 and botocore on first import, mirroring how other optional provider adapters defer their heavy AWS dependencies until actually used. Keeps the base install slim for users who don't run on Bedrock.	2026-05-17 02:31:18 -07:00
flamiinngo	dbeaaa47f2	refactor(security): extract _block_message helper to unify block logic in _parse_response Both the `action=block` and `decision=block` branches in _parse_response shared identical field-priority and type-validation logic. Extract it into a single _block_message(primary, secondary) helper so the two branches are one line each and the type guard lives in exactly one place. No functional change: existing tests (TestParseResponse, 14 tests) all pass unchanged, confirming identical behaviour.	2026-05-17 02:31:18 -07:00
flamiinngo	63805965e7	fix(security): restore type safety and extract constant in shell hook block handler Address code review feedback on _parse_response: 1. Restore isinstance(raw, str) guard so non-string message/reason values (e.g. integers, lists) from a malformed hook response fall back to the default rather than being forwarded as-is. This keeps the contract that message in the returned dict is always a string. 2. Extract the repeated literal 'Blocked by shell hook.' into a module-level constant _DEFAULT_BLOCK_MESSAGE to avoid duplication and make it easy to change in one place. Four new unit tests added to tests/agent/test_shell_hooks.py covering: - action block with no message (uses default) - decision block with no reason (uses default) - action block with empty string message (uses default) - action block with non-string message, e.g. integer (uses default)	2026-05-17 02:31:18 -07:00
flamiinngo	aeda146112	fix(security): honor shell hook blocks even when message/reason is absent _parse_response in agent/shell_hooks.py only forwarded a pre_tool_call block directive if the hook also provided a non-empty message or reason. When either field was missing the function returned None, causing Hermes to treat the response as a no-op and execute the tool unconditionally. This means a hook that outputs {"action": "block"} or {"decision": "block"} without a reason string is silently ignored. The security boundary fails open: tools the user intended to gate are executed anyway. Fix: remove the message-presence guard. Honor the block unconditionally and fall back to a default message when none is provided. Existing hooks that already include a message or reason are unaffected.	2026-05-17 02:31:18 -07:00
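Taken together, the three shell-hook commits above describe behaviour along these lines. This is a sketch, not the code in agent/shell_hooks.py: the response is modeled as an already-parsed dict, the returned dict shape is assumed, and the field priority (message first for `action`, reason first for `decision`) is a guess.

```python
from typing import Any, Dict, Optional

_DEFAULT_BLOCK_MESSAGE = "Blocked by shell hook."


def _block_message(primary: Any, secondary: Any) -> str:
    """First non-empty string wins; anything else falls back to the default."""
    for raw in (primary, secondary):
        if isinstance(raw, str) and raw:
            return raw
    return _DEFAULT_BLOCK_MESSAGE


def parse_response(payload: Dict[str, Any]) -> Optional[Dict[str, str]]:
    # A block directive is honored even when no message or reason is given.
    if payload.get("action") == "block":
        return {"action": "block", "message": _block_message(payload.get("message"), payload.get("reason"))}
    if payload.get("decision") == "block":
        return {"action": "block", "message": _block_message(payload.get("reason"), payload.get("message"))}
    return None
```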
haran2001	d9abbe7fa4	fix(metadata): qwen3.6-plus has a 1M context window (#27008) qwen3.6-plus did not have an explicit entry in DEFAULT_CONTEXT_LENGTHS, so the longest-substring fallback matched the generic 'qwen': 131072 catch-all. That dropped the effective context limit from 1,048,576 tokens to 131,072, prematurely lowered the compression threshold, and produced misleading warnings about main/compression context mismatch in long sessions. Add an explicit 'qwen3.6-plus': 1048576 entry before the catch-all and cover it with a regression test (bare, qwen/, and dashscope/ prefixes). Note: PR #6599 also mentions touching model_metadata.py but the actual diff only edits hermes_cli/models.py, so this fix is independent and not duplicated by that PR. Closes #27008	2026-05-17 02:31:18 -07:00
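The failure mode in this commit is easy to reproduce with a toy longest-substring lookup. The table below holds only the two entries named in the message, and `lookup_context_length` is a hypothetical helper; the project's actual matching logic may differ in detail.

```python
from typing import Optional

DEFAULT_CONTEXT_LENGTHS = {
    "qwen3.6-plus": 1048576,  # explicit entry added by the fix
    "qwen": 131072,           # generic catch-all
}


def lookup_context_length(model: str) -> Optional[int]:
    """Return the value of the longest table key contained in the model name."""
    name = model.lower()
    matches = [key for key in DEFAULT_CONTEXT_LENGTHS if key in name]
    if not matches:
        return None
    return DEFAULT_CONTEXT_LENGTHS[max(matches, key=len)]
```

Without the explicit entry, every prefixed variant of `qwen3.6-plus` would fall through to the 131072 catch-all.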
kshitij	5fba236644	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355 ) Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).	2026-05-17 02:29:41 -07:00
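For reference, the mechanical rewrite this commit applies looks like the following. The status names are made up for illustration; the last lines show why the message stresses that every changed value is hashable.

```python
def is_terminal_tuple(status: str) -> bool:
    return status in ("done", "failed", "cancelled")   # before: tuple membership


def is_terminal_set(status: str) -> bool:
    return status in {"done", "failed", "cancelled"}   # after: set membership


# The two agree for hashable scalars, but an unhashable operand behaves
# differently: a tuple compares element by element, a set must hash it first.
tuple_result = [] in (1, 2)
try:
    [] in {1, 2}
    set_raised = False
except TypeError:
    set_raised = True
```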
teknium1	563b4d9e51	fix: strip image parts for non-vision models with provider profiles + getattr-safe _custom_providers Original commit `75e5d0f6b` by hueilau targeted _build_api_kwargs in pre-refactor run_agent.py. The body now lives in agent/chat_completion_helpers.build_api_kwargs — re-applied there. Also: switch the custom_providers forward (from `21078ebce`) to use getattr() — tests build a bare AIAgent via __new__ and would otherwise hit AttributeError on _custom_providers. Co-authored-by: hueilau <33933019+hueilau@users.noreply.github.com>	2026-05-16 23:47:51 -07:00
teknium1	36ad8336f9	fix(run_agent): guard memory provider init against empty/whitespace string Original commit `8d756a421` by austrian_guy targeted __init__ in pre-refactor run_agent.py. The body now lives in agent/agent_init.init_agent — re-applied there. Co-authored-by: austrian_guy <33156212+ether-btc@users.noreply.github.com>	2026-05-16 23:43:09 -07:00
teknium1	4ece521bcf	fix(run_agent): isolate background review fork from external memory plugins (#27190 ) Original commit `973f27e95` by Teknium targeted _spawn_background_review in pre-refactor run_agent.py. The body now lives in agent/background_review._spawn_background_review — re-applied there. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-16 23:42:49 -07:00
teknium1	b5bcffe167	fix(fallback): forward custom_providers to fallback model context-length detection Original commit `21078ebce` by PaTTeeL targeted _try_activate_fallback in pre-refactor run_agent.py. The body now lives in agent/chat_completion_helpers.try_activate_fallback — re-applied there. Co-authored-by: PaTTeeL <9150277+PaTTeeL@users.noreply.github.com>	2026-05-16 23:42:16 -07:00
teknium1	4ab9a06a51	fix(agent): reset _fallback_index at turn start even when no fallback activated Original commit `33528b428` by konsisumer targeted _restore_primary_runtime in pre-refactor run_agent.py. The body now lives in agent/agent_runtime_helpers.restore_primary_runtime — re-applied there. Fixes #20465 Co-authored-by: konsisumer <der@konsi.org>	2026-05-16 23:41:45 -07:00
teknium1	aa05ffba53	fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184) Original commit `2b193907d` by Teknium added a new module-level _StreamErrorEvent class and threaded its raise into _run_codex_create_stream_fallback in pre-refactor run_agent.py. - _StreamErrorEvent class → run_agent.py (module-level, next to _qwen_portal_headers; class needs to be top-level for the codex runtime to import it) - The fallback event-loop's 'type=error' handler → agent/codex_runtime.py where run_codex_create_stream_fallback now lives. Imports _StreamErrorEvent lazily from run_agent to avoid circular import. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-16 23:41:09 -07:00
teknium1	80fa92a491	fix(codex): rotate pool on usage limit 429 — port to extracted modules Original commit `e51d74ab9` by Maxim Esipov targeted _extract_api_error_context and _recover_with_credential_pool in pre-refactor run_agent.py. Both bodies now live in agent/agent_runtime_helpers.py — re-applied to that module: - extract_api_error_context: payload.get('type') added to the reason fallback chain (Codex error bodies use 'type' instead of 'code'/'error') - recover_with_credential_pool: usage_limit_reached detection in the rate_limit branch — skip the retry-once-then-rotate dance and rotate immediately when the body says the per-account usage limit hit. Co-authored-by: Maxim Esipov <maksesipov@gmail.com>	2026-05-16 23:39:41 -07:00
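The two behaviours ported in this commit can be sketched in isolation. The order of the first two keys in the fallback chain and both function names are assumptions; the commit only states that `payload.get('type')` was added to the chain and that `usage_limit_reached` triggers immediate rotation.

```python
from typing import Any, Dict, Optional


def extract_reason(payload: Dict[str, Any]) -> Optional[str]:
    """First non-empty string among 'code', 'error', then 'type'."""
    for key in ("code", "error", "type"):
        value = payload.get(key)
        if isinstance(value, str) and value:
            return value
    return None


def should_rotate_immediately(reason: Optional[str]) -> bool:
    # A per-account usage limit will not clear on a quick retry, so skip the
    # retry-once step and move to the next pooled credential right away.
    return reason == "usage_limit_reached"
```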
teknium1	df22d29522	fix(copilot): GitHub Models 413 hint — port to extracted conversation_loop Original commits `4ded3ede3` (@konsisumer) + `374dc81c2` (Teknium) added a 413 hint to run_agent.py's agent loop. Final-state version (the sharpened `374dc81c2` wording) ported to agent/conversation_loop.py, where the payload_too_large branch now lives. The deprecation detection + _URL_TO_PROVIDER changes from both commits landed in agent/copilot_acp_client.py and agent/model_metadata.py via the prior merge. Closes #10648 Co-authored-by: konsisumer <der@konsi.org> Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-16 23:38:45 -07:00
teknium1	3fbedd732e	feat: add supports_parallel_tool_calls for MCP servers (#26825 ) — port to tool_dispatch_helpers Original commit `395e9dd9e` by Teknium targeted module-level _is_mcp_tool_parallel_safe and _should_parallelize_tool_batch helpers in pre-refactor run_agent.py. Both helpers now live in agent/tool_dispatch_helpers.py — re-applied to that module. The tools/mcp_tool.py portion (the public is_mcp_tool_parallel_safe API + _parallel_safe_servers tracking) merged cleanly from main via the prior merge commit. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-16 23:36:37 -07:00
teknium1	fe4c87eb28	fix(agent): retry malformed anthropic stream parser errors — port to extracted modules Original commit `9c304a7f5` by helix4u targeted _flatten_exception_chain, _summarize_api_error, and the _call streaming retry loop in pre-refactor run_agent.py. Re-applied to: - New _is_provider_stream_parse_error helper → run_agent.py (next to _flatten_exception_chain in the AIAgent class) - _summarize_api_error early-return for the malformed-streaming ValueError → run_agent.py (kept method body) - _call streaming retry: _is_stream_parse_err flag wired into _is_transient AND the post-exhaustion branch + dedicated malformed-streaming user-status string → agent/chat_completion_helpers.py (the _call body now lives there) Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>	2026-05-16 23:35:54 -07:00
teknium1	f885be030c	fix(auxiliary): resolve xai oauth compression from pool — port to conversation_compression Original commit `97a32afdc` by helix4u targeted _check_compression_model_feasibility in pre-refactor run_agent.py. The function body now lives in agent/conversation_compression.py — re-applied the configured-but-unavailable provider message there. Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>	2026-05-16 23:33:59 -07:00
teknium1	6975a2d9ae	fix(xai-oauth): entitlement-403 chain — final state (`ce0e189d3` + `9818b9a1a` + `6784c8079` + `dffb602f3`) Collapses the four-commit xAI entitlement-403 chain to its final on-main state, ported to the post-refactor module layout: - Added _is_entitlement_failure on AIAgent (run_agent.py) — detects Grok subscription-shape 403s on (401|403|None) status codes. - Added entitlement-skip branch to recover_with_credential_pool (agent/agent_runtime_helpers.py) — breaks the refresh-loop that Don's 100-iteration trace exposed when a Premium+ user hit a real entitlement issue. - Removed _decorate_xai_entitlement_error and unwrapped its two _summarize_api_error call sites — xAI's own body text already points users at grok.com/?_s=usage so we surface that verbatim (`dffb602f3` reasoning: X Premium subs DO now work per xAI's 2026-05-16 announcement, so editorialising would misdirect). - grok-4.3 1M context entry landed in agent/model_metadata.py via the prior merge — no additional port needed. Tests already on disk (tests/run_agent/test_codex_xai_oauth_recovery.py) assert _is_entitlement_failure shape and verbatim body surfacing. Closes #27110. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-16 23:33:18 -07:00
teknium1	6362e71973	fix(xai-oauth): recover from prelude SSE errors, gate reasoning replay, surface entitlement 403s Original commit `31ba2b0cb` by Teknium targeted run_codex_stream() at its pre-refactor location in run_agent.py. Re-applied: - Prelude error retry/fallback → agent/codex_runtime.py (in run_codex_stream where the body now lives) - _decorate_xai_entitlement_error helper + _summarize_api_error wrapping → run_agent.py (these methods remained on AIAgent as @staticmethod's; cherry-pick applied them cleanly) The xai-oauth provider gate, encrypted_content drop on replay, etc. landed in agent/codex_responses_adapter.py via the prior merge from main. Closes #8133, #14634 Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-16 23:28:05 -07:00
teknium1	27df249564	feat(nvidia): add NIM billing origin header — port to extracted modules Original commit `13c3d4b4e` by kchantharuan touched __init__ and _apply_client_headers_for_base_url in pre-refactor run_agent.py. Re-applied to: - __init__: agent/agent_init.py (3 hunks — NVIDIA branch + _custom_headers fallback in routed-client and fallback-client paths) - _apply_client_headers_for_base_url: still in run_agent.py (1 hunk) build_nvidia_nim_headers was already present in agent/auxiliary_client.py from the prior merge — no additional port needed. Co-authored-by: kchantharuan <kchantharuan@nvidia.com>	2026-05-16 23:25:11 -07:00
teknium1	b07524e53a	feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider — port to extracted modules Original commit `b62c99797` by Jaaneek targeted six locations in pre-refactor run_agent.py. Re-applied to the extracted post-PR locations: - api_mode dispatch → agent/agent_init.py - is_xai_responses build_api_kwargs → agent/chat_completion_helpers.py - codex_auth_retry block + 401 hint → agent/conversation_loop.py - _try_refresh_codex_client_credentials body → run_agent.py (kept) The non-run_agent.py portions of the commit (auxiliary_client, codex transport, hermes_cli/auth, tools/xai_http, tests, docs) merged cleanly from main via the prior merge commit. Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>	2026-05-16 23:23:38 -07:00
teknium1	7d221aa1f2	fix(langfuse): complete observability fix — port to extracted conversation_loop Original commit `db84a78e6` by kshitij targeted run_conversation()'s pre_api_request and post_api_request hooks in pre-refactor run_agent.py. Re-applied to the extracted location in agent/conversation_loop.py. Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com> Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com> Co-authored-by: Brian Conklin <brian@dralth.com>	2026-05-16 23:21:51 -07:00
teknium1	a77ca9295e	perf(run_agent): accumulate length-continuation prefix via list+join Original commit `4f8aaf104` by InB4DevOps targeted run_conversation() in the pre-refactor run_agent.py. Re-applied to the extracted location in agent/conversation_loop.py. Co-authored-by: InB4DevOps <tolle.lege+github@gmail.com>	2026-05-16 23:20:27 -07:00
teknium1	152d42d1a7	Merge origin/main into pr-27248 (resolving run_agent.py = ours) run_agent.py taken from HEAD (the extracted forwarder structure). The 25 run_agent.py fixes that landed on main during the PR's life need to be ported into the agent/* extracted modules in follow-up commits.	2026-05-16 23:16:52 -07:00
phoenixshen	52c89715a2	fix: respect user-configured vision model for OpenRouter _OPENROUTER_MODEL hardcoded 'google/gemini-3-flash-preview' which returns 404 on OpenRouter, breaking all vision tasks for users who rely on the OpenRouter default. Additionally, _try_openrouter() ignored the user-configured auxiliary.vision.model entirely. Changes: - Update _OPENROUTER_MODEL default to google/gemini-2.5-flash (valid) - Add optional 'model' parameter to _try_openrouter() - Pass configured model from _resolve_strict_vision_backend() through to _try_openrouter() This allows users who set auxiliary.vision.model (e.g. x-ai/grok-4.3) to have it actually used, while maintaining backward compatibility.	2026-05-16 23:11:43 -07:00
zccyman	b389796ae3	fix(auxiliary): resolve api_key_env alias in named custom provider path of resolve_provider_client In resolve_provider_client(), the named custom provider code path at ~line 2914 only checked the ``key_env`` field when looking for an environment-variable-based API key. The documented ``api_key_env`` snake_case alias was silently ignored, causing custom providers configured with ``api_key_env`` to fall through to the ``no-key-required`` placeholder — which produces a confusing 401 (``****ired`` mask) on auth-required remote endpoints. This mirrors the same fix already applied to run_agent.py in commit `6ddc48b05` (fix(fallback): resolve api_key_env in fallback chain entries). Also adds a logger.warning() when the placeholder is reached, so future alias gaps are easier to debug. Closes #25091	2026-05-16 23:11:43 -07:00
teknium1	47823790b0	refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards Four fixes from PR #27248 review: 1. __init__ forwarder is now keyword-forwarded (daimon-nous review). Previously the run_agent.AIAgent.__init__ wrapper forwarded all 64 params positionally to agent.agent_init.init_agent, so adding a 65th param on main would require three lockstep edits (signature, init_agent signature, forwarder call) or silently shift every value. Keyword forwarding makes this trivially safe — adding a param now only needs the two signatures and one extra keyword line. 2. Drop dead _ra() in agent/codex_runtime.py (daimon-nous + Copilot). The lazy run_agent reference was defined but never called inside this module — the codex paths use agent.* accessors only. 3. Drop unused imports in agent/codex_runtime.py (Copilot): contextvars, threading, time, uuid, Optional. Carried over from run_agent.py during the original extraction. 4. Tighten three source-introspection test guards (Copilot): - test_memory_nudge_counter_hydration.py — was scanning the concatenated source of run_agent.py + agent/conversation_loop.py and matching self.X or agent.X form. Now asserts the hydration block lives in agent/conversation_loop.py specifically with the agent.X form — the body never moves back, so if it ever drifts a future re-introduction fails the guard. - test_run_agent.py::TestMemoryNudgeCounterPersistence — anchor on agent.iteration_budget = IterationBudget exactly (was just iteration_budget = IterationBudget) so an unrelated identifier ending in iteration_budget can't match. - test_run_agent.py::TestMemoryProviderTurnStart — assert the agent._user_turn_count form directly (the extracted body uses agent.X, not self.X — accepting either was a transitional fudge). - test_jsondecodeerror_retryable.py — scan agent/conversation_loop.py only, not the concatenation. 
Not addressed in this commit: * Pre-existing bugs in agent/tool_executor.py (heartbeat index mismatch when calls are blocked, _current_tool clobber in result loop, blocked-counted-as-completed in spinner summary, dead result_preview computation). These were preserved byte-for-byte from the original _execute_tool_calls_concurrent — worth a separate follow-up PR with proper tests. * _OpenAIProxy.__instancecheck__ concern — pre-existing, not flagged by any of the original test patches (nothing actually does isinstance(x, OpenAI) against the proxy instance). * agent_init.py:949 mem_config potential NameError — pre-existing; only triggers if _agent_cfg.get('memory', {}) itself raises, which it can't with a stock dict. tests/run_agent/ + tests/agent/: 4313 passed, 1 pre-existing test_auxiliary_client failure (unchanged). run_agent.py: 3821 -> 3937 lines (+116 from the keyword-forwarded init call's verbosity). Final: 16083 -> 3937 (-12146, 75% reduction).	2026-05-16 22:55:49 -07:00
shellybotmoyer	1a4e64ba06	fix(credential_pool): parse ISO-string last_status_at during from_dict rehydration (#25516)	2026-05-16 22:54:22 -07:00
0xchainer	4b17c2411a	fix(skills): return None instead of truthy stub when skill load fails build_skill_invocation_message() returns a non-empty placeholder string ('[Failed to load skill: ...]') when the skill exists in the command cache but loading the actual SKILL.md payload fails. CLI/gateway callers treat any truthy return value as success, so the failure is silently routed into the model as if it were a valid skill prompt. Return None instead, matching the existing behavior for unknown commands, so callers using 'if msg:' can properly detect the failure.	2026-05-16 22:52:22 -07:00
teknium1	94c3e0ab8e	refactor(run_agent): extract 10 more helpers to agent/agent_runtime_helpers.py Final extraction pass — the methods left over after run_conversation and __init__ moved out. Together these 10 cover ~813 LOC of medium- sized helpers: * switch_model (194 LOC) — model switching mid-session * _invoke_tool (87) — central tool dispatch with overrides * _repair_tool_call (72) — argument JSON repair entrypoint * _sanitize_api_messages (71) — role-filter for API send * _looks_like_codex_intermediate_ack (72) — codex transcript heuristic * _copy_reasoning_content_for_api (70) — reasoning preservation * _cleanup_dead_connections (70) — periodic dead-socket sweep * _extract_api_error_context (65) — error-dump context builder * _apply_pending_steer_to_tool_results (63) — /steer injection * _force_close_tcp_sockets (59) — aggressive socket cleanup AIAgent keeps thin forwarder methods for all 10 (staticmethods preserved where present). Names tests patch on run_agent (handle_function_call, AIAgent class attrs, logger) routed through _ra() so the patch surface is preserved. tests/run_agent/ + tests/agent/: 4313 passed (same pre-existing test_auxiliary_client failure as on main). run_agent.py: 4634 -> 3821 lines (-813). Final total: 16083 -> 3821 (-12262, 76% reduction).	2026-05-16 20:35:19 -07:00
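
Note on commit 47823790b0 (keyword-forwarded __init__): the hazard that message describes can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in, not the repository's code: the real init_agent is said to take 64 parameters, while this toy version takes three, and the class names PositionalAgent and KeywordAgent are invented for illustration.

```python
# Hypothetical stand-in for agent.agent_init.init_agent (the real one has 64 params).
# Suppose max_iterations was added between model and verbose after the wrapper was written.
def init_agent(agent, model, max_iterations=10, verbose=False):
    agent.model = model
    agent.max_iterations = max_iterations
    agent.verbose = verbose


class PositionalAgent:
    """Stale wrapper written when init_agent was (agent, model, verbose)."""

    def __init__(self, model, verbose=False):
        # Positional forwarding: verbose now silently lands in max_iterations.
        init_agent(self, model, verbose)


class KeywordAgent:
    """Wrapper that forwards every argument by name."""

    def __init__(self, model, max_iterations=10, verbose=False):
        # Keyword forwarding: parameter order in init_agent no longer matters.
        init_agent(
            self,
            model=model,
            max_iterations=max_iterations,
            verbose=verbose,
        )


stale = PositionalAgent("some-model", verbose=True)
safe = KeywordAgent("some-model", verbose=True)
```

In this sketch, stale.max_iterations ends up holding the verbose flag and stale.verbose stays False, with no error raised. The keyword-forwarded wrapper keeps each value attached to its name, and a wrapper that passes a name the target no longer accepts fails loudly with a TypeError instead of shifting values.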


918 commits