hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-23 10:42:00 +00:00

Author	SHA1	Message	Date
Molvikar	8d363f8d54	fix(bedrock): preserve reasoningContent across converse normalization	2026-05-07 05:17:16 -07:00
Teknium	fb1ce793e6	feat(security): enable secret redaction by default (#17691 , #20785 ) (#21193 ) Flip the default for HERMES_REDACT_SECRETS from off to on so the redactor already wired into send_message_tool, logs, and tool output actually runs on a fresh install. - agent/redact.py: env-var default "" → "true" - hermes_cli/config.py: DEFAULT_CONFIG security.redact_secrets True; two config-template comments rewritten - gateway/run.py + cli.py: startup log / banner warning when the user has explicitly opted out, so the downgrade is visible in agent.log and at CLI banner time - docs/reference/environment-variables.md: description reconciled - tests: flipped the default-pin, restructured the force=True regression test to explicit-false instead of unset Users who need raw credential values (redactor development) can still opt out via security.redact_secrets: false in config.yaml or HERMES_REDACT_SECRETS=false in .env. Closes #17691. Addresses #20785 (short-term output-pipeline recommendation).	2026-05-07 05:10:33 -07:00
teknium1	2e00bcaaab	fix(oauth,gateway): monotonic deadlines for polling/timeout loops Widen PR #20314's fix to the other timeout-polling sites in the codebase that share the same wall-clock-jump bug class. All of these measure elapsed timeout duration, not civil time, so they belong on time.monotonic(). - hermes_cli/auth.py: auth-store file-lock timeout, Spotify OAuth callback wait, Nous portal device-auth token poll. - hermes_cli/copilot_auth.py: Copilot OAuth device-flow token poll. - hermes_cli/gateway.py: gateway systemd restart wait. - hermes_cli/web_server.py: dashboard Codex device-auth user_code wait, dashboard Nous device-auth token poll. (sess["expires_at"] stays on time.time() — it's a persisted absolute timestamp, not a local deadline-polling variable.) - agent/copilot_acp_client.py: Copilot ACP JSON-RPC request timeout.	2026-05-07 05:09:39 -07:00
briandevans	11b9b146f1	fix(image-routing): expose attached image paths in native multimodal text part In native image mode (vision-capable models like gpt-4o, claude-sonnet-4), build_native_content_parts() previously emitted only the user's caption plus image_url parts. The local file path of each attached image never appeared in the conversation text, so the model could see the pixels but had no string handle for tools that take image_url: str (custom MCP tools, vision_analyze on a re-look, attach-to-tracker workflows). The text-mode path already injects an equivalent hint via Runner._enrich_message_with_vision ("...vision_analyze using image_url: <path>..."). This brings native mode to parity by appending one "[Image attached at: <path>]" line per successfully attached image to the user-text part of the multimodal turn. Skipped (unreadable) paths are NOT advertised, so the model is never told a non-existent file is attached. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 04:58:00 -07:00
kshitijk4poor	aa88dcc57b	fix: salvage batch — compaction guidance, memory authority, cache eviction after compression - Fix /compact → /compress in context-overflow tips (closes #20020) - Evict cached agent after session hygiene and /compress so system prompt refreshes with current SOUL.md, memory, and skills - Restore memory authority across compaction: change 'informational background data' to 'authoritative reference data' in memory block and SUMMARY_PREFIX, with backward-compatible regex Based on: - PR #20027 by @LeonSGP43 - PR #18767 by @MacroAnarchy - PR #17380 by @vominh1919 PR #17121 boundary marker fix already merged to main (`2eef395e1`). PR #9262 user-message anchoring already on main via _ensure_last_user_message_in_tail().	2026-05-05 22:33:45 -07:00
etherman-os	985133852a	feat(i18n): add Turkish (tr) locale - Add locales/tr.yaml with Turkish translations for all approval.* and gateway.* keys - Register 'tr' in SUPPORTED_LANGUAGES - Add Turkish aliases: turkish, türkçe, tr-tr	2026-05-05 17:29:12 -07:00
rob-maron	2d4eaed111	arcee temperature + compression	2026-05-05 17:23:45 -07:00
Oleksii Lisikh	c4b287ba53	feat(i18n): add Ukrainian locale	2026-05-05 17:21:59 -07:00
Miniding	0d41e94ca9	feat(i18n): add French (fr) locale support - Add fr.yaml with French translations for approval prompts and gateway messages - Register 'fr' in SUPPORTED_LANGUAGES - Add French aliases: french, français, fr-fr, fr-be, fr-ca, fr-ch - Update locale sync comment in en.yaml	2026-05-05 15:13:57 -07:00
kshitijk4poor	20a4f79ed1	feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path Introduces providers/ package — single source of truth for every inference provider. Adding a simple api-key provider now requires one providers/<name>.py file with zero edits anywhere else. What this PR ships: - providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes) - ProviderProfile declarative fields: name, api_mode, aliases, display_name, env_vars, base_url, models_url, auth_type, fallback_models, hostname, default_headers, fixed_temperature, default_max_tokens, default_aux_model - 4 overridable hooks: prepare_messages, build_extra_body, build_api_kwargs_extras, fetch_models - chat_completions.build_kwargs: profile path via _build_kwargs_from_profile, legacy flag path retained for lmstudio/tencent-tokenhub (which have session-aware reasoning probing that doesn't map cleanly to hooks yet) - run_agent.py: profile path for all registered providers; legacy path variable scoping fixed (all flags defined before branching) - Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS, doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER - GeminiProfile: thinking_config translation (native + openai-compat nested) - New tests/providers/ (79 tests covering profile declarations, transport parity, hook overrides, e2e kwargs assembly) Deltas vs original PR (salvaged onto current main): - Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth (were added to main since original PR) - Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their reasoning_effort probing has no clean hook equivalent yet) - Removed lmstudio alias from custom profile (it's a separate provider now) - Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension (resolve_provider special-cases them; adding breaks runtime resolution) - runtime_provider: profile.api_mode only as fallback when URL detection finds nothing (was breaking minimax /v1 override) - Preserved main's legacy-path improvements: deepseek reasoning_content preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M beta recovery, etc. - Kept agent/copilot_acp_client.py in place (rejected PR's relocation — main has 7 fixes landed since; relocation would revert them) - _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing test imports Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Closes #14418	2026-05-05 13:40:01 -07:00
Bartok9	72c33dfe95	docs(agent): remove stale BuiltinMemoryProvider references from memory module docstrings The BuiltinMemoryProvider class was removed from the codebase but its name lingered in the module-level docstrings of memory_manager.py and memory_provider.py, creating false expectations: - memory_manager.py docstring showed example code doing add_provider(BuiltinMemoryProvider(...)) which ImportError at runtime - memory_provider.py docstring listed BuiltinMemoryProvider as 'always present, not removable' — misleading for new contributors The regression test (test_memory_user_id.py) already passes without any reference to BuiltinMemoryProvider; it uses RecordingProvider instances directly. The stale references were docs-only drift. Update both docstrings to reflect the actual current architecture: MemoryManager accepts external plugin providers only (one at a time). Closes #14402	2026-05-05 13:33:49 -07:00
Zeejay	f8ba265340	fix(aux): trigger fallback on 429 rate-limit errors in auxiliary client When a provider returns a 429 rate-limit error (not billing-related), the auxiliary client's call_llm/async_call_llm previously did NOT trigger the fallback chain. This caused auxiliary tasks like session_search to exhaust all 3 retries against the same rate-limited endpoint, losing session metadata that depended on the summarization completing. Root cause: `_is_payment_error()` only matched 429s containing billing keywords ("credits", "insufficient funds", etc.). Provider-specific rate-limit messages like Nous's "Hold up for a bit, you've exceeded the rate limit on your API key" didn't match, so `_is_payment_error` returned False, `_is_connection_error` returned False, and `should_fallback` was False — all retries hit the same rate-limited provider. Fix: - New `_is_rate_limit_error()` function that detects 429 + rate-limit keywords, generic 429 without billing keywords, and OpenAI SDK `RateLimitError` class instances (which may omit .status_code). - Updated `should_fallback` in both `call_llm` and `async_call_llm` to include `_is_rate_limit_error`. - Updated the max_tokens retry path to also check for rate-limit errors. - Updated the reason string to include "rate limit". This complements the Nous rate guard (PR #10568) which prevents new calls to Nous when already rate-limited — this fix handles the case where a request is already in flight when the 429 arrives. Related: #8023, #12554, #11034 Co-authored-by: Zeejay <zjtan1@gmail.com>	2026-05-05 10:15:57 -07:00
Jonathan Troyer	6430d67569	fix(openrouter): use canonical X-Title attribution header OpenRouter's dashboard attributes usage via the `X-Title` header. Hermes was sending `X-OpenRouter-Title`, which OpenRouter does not recognize, so Hermes usage showed up unlabeled. Rename to `X-Title` to match the canonical header (already used elsewhere in the same file via _AI_GATEWAY_HEADERS). Salvages the core fix from @JTroyerOvermatch's PR #13649. Dropped the PR's `HERMES_OPENROUTER_TITLE` / `HERMES_OPENROUTER_REFERER` env-var override plumbing per the '.env is for secrets only' policy — if per-deployment attribution is needed later it should go under `openrouter.title` / `openrouter.referer` in config.yaml instead.	2026-05-05 10:13:34 -07:00
Teknium	7de3c86c5a	feat(i18n): add display.language for static message translation (zh/ja/de/es) (#20231 ) * revert(gateway): remove stale-code self-check and auto-restart Removes the _detect_stale_code / _trigger_stale_code_restart mechanism introduced in #17648 and iterated in #19740. On every incoming message the gateway compared the boot-time git HEAD SHA to the current SHA on disk, and if they differed it would reply with Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment. and then kick off a graceful restart. This is unwanted behaviour: users who run a long-lived gateway and do their own ad-hoc git operations on the checkout end up with their chat interrupted and the current message dropped every time HEAD moves, with no way to opt out. If an operator really needs the old protection against stale sys.modules after "hermes update", the SIGKILL-survivor sweep in hermes update (hermes_cli/main.py, also tagged #17648) already handles the supervisor-respawn case on its own. Removed: gateway/run.py: - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS - _read_git_head_sha(), _compute_repo_mtime() module helpers - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha / _stale_code_restart_triggered defaults - __init__ boot-snapshot block (_boot_, _cached_current_sha, _repo_root_for_staleness, _stale_code_notified) - _current_git_sha_cached(), _detect_stale_code(), _trigger_stale_code_restart() methods - stale-code check + user-facing restart notice at the top of _handle_message() tests/gateway/test_stale_code_self_check.py (deleted, 412 lines) No new logic added. Zero remaining references to any removed symbol. Gateway test suite passes the same 4589 tests it passed before; the 3 pre-existing unrelated failures (discord free-channel, feishu bot admission, teams typing) are unchanged by this commit. * feat(i18n): add display.language for static message translation (zh/ja/de/es) Adds a thin-slice i18n layer covering the highest-impact static user-facing messages: the CLI dangerous-command approval prompt and a handful of gateway slash-command replies (restart-drain, goal cleared, approval expired, config read/save errors). Out of scope (stays English): agent responses, log lines, tool outputs, slash-command descriptions, error tracebacks. Infrastructure: - agent/i18n.py: catalog loader, t() helper, language resolution (HERMES_LANGUAGE env var > display.language config > en) - locales/{en,zh,ja,de,es}.yaml: ~19 translated strings per language - display.language in DEFAULT_CONFIG (hermes_cli/config.py) Tests: - tests/agent/test_i18n.py: 21 tests covering catalog parity, placeholder parity across locales, fallback behavior, env-var override, alias normalization, missing-key graceful degradation. Docs: - website/docs/user-guide/configuration.md: display.language entry plus a short section explaining scope so users don't expect agent responses to translate via this knob.	2026-05-05 08:03:07 -07:00
vominh1919	96514de472	fix(auxiliary): avoid locking into custom path when api_key is empty When auxiliary.<task> config has base_url set but api_key is empty (common when user expects env var fallback), _resolve_task_provider_model() returned provider="custom" with api_key=None. This caused downstream client construction to make API calls without an Authorization header, resulting in HTTP 401 errors. Fix: only return "custom" when BOTH cfg_base_url AND cfg_api_key are non-empty. When base_url is set without api_key but with a known provider (e.g. "openrouter"), pass through to that provider so it can resolve credentials from environment variables. Fixes #16829	2026-05-05 06:07:07 -07:00
briandevans	9e893d16d1	fix(aux): default Codex reasoning effort to medium when extra_body.reasoning.effort is falsy auxiliary.<task>.extra_body.reasoning, but the new translation path in _CodexCompletionsAdapter.create() reads the effort with ``reasoning_cfg.get("effort", "medium")``. That returns the configured value verbatim when the key is present, so ``effort: null`` / ``effort: ""`` (both common YAML shapes) flow through as ``{"effort": null, "summary": "auto"}`` and Codex rejects the request with "Invalid value for parameter ``reasoning.effort``". agent/transports/codex.py::build_kwargs() — which the new adapter is documented to mirror — uses a truthy check (``elif reasoning_config.get("effort"):``) so the same falsy values keep the "medium" default. Switch the auxiliary adapter to the same ``or "medium"`` truthy form so identical config produces identical requests on both paths. - [x] Two new regression tests cover ``effort: None`` and ``effort: ""`` and assert the request goes out as ``{"effort": "medium", "summary": "auto"}``. - [x] Old behaviour fails the new tests (``{'effort': None} != {'effort': 'medium'}``); fixed behaviour passes all 11 tests in the ``TestCodexAdapterReasoningTranslation`` class. - [x] Adjacent suites green: ``tests/agent/test_auxiliary_client.py`` (108 passed) and ``tests/agent/transports/test_codex_transport.py + test_chat_completions.py`` (73 passed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 05:47:50 -07:00
beardthelion	f15b0fbb4f	fix: add PLATFORM_HINTS entry for api_server platform The API server is a documented, first-class messaging platform with its own gateway adapter, docs pages, and toolset. But it's the only messaging platform missing from PLATFORM_HINTS in agent/prompt_builder.py. Without a platform hint, the agent has no context about the API server's rendering environment and defaults to markdown-heavy document-style outputs (code fences, bold, bullet points) — which break on the plain-text frontends most API server consumers wrap (Open WebUI, custom agents, third-party bridges). Adds a generic api_server entry that describes the medium (unknown rendering, assume plain text) without encoding any specific use case. Individual consumers can layer additional style guidance via ephemeral system prompts. Before (DeepSeek V4 Pro via API server, no hint): Sendblue bridge at /opt/sendblue-bridge - 68MB on disk After (same prompt, with hint): Sendblue bridge at /opt/sendblue-bridge, 68MB on disk No breaking changes — new dict entry only. Existing API server consumers see no behavioral change except for models that previously defaulted to markdown formatting, which now produce cleaner plain-text output.	2026-05-05 05:46:16 -07:00
wmagev	2eef395e1c	fix(compaction): mark end of context summary in role=user fallback When the head ends with assistant/tool and the tail starts with assistant, the summary is inserted as a standalone role="user" message. The body's verbatim "## Active Task" quote then gets read as fresh user input by weak/local models (#11475, #14521). The merge-into-tail path already appends an explicit end-of-summary marker for this reason. Mirror it on the standalone path so both insertion routes give the model the same "summary above, not new input" signal.	2026-05-05 04:51:29 -07:00
revaraver	4a3e3e20e5	fix(compression): preserve iterative summary continuity	2026-05-05 04:42:44 -07:00
Teknium	2a285d5ec2	fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924 ) (#20184 ) * revert(gateway): remove stale-code self-check and auto-restart Removes the _detect_stale_code / _trigger_stale_code_restart mechanism introduced in #17648 and iterated in #19740. On every incoming message the gateway compared the boot-time git HEAD SHA to the current SHA on disk, and if they differed it would reply with Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment. and then kick off a graceful restart. This is unwanted behaviour: users who run a long-lived gateway and do their own ad-hoc git operations on the checkout end up with their chat interrupted and the current message dropped every time HEAD moves, with no way to opt out. If an operator really needs the old protection against stale sys.modules after "hermes update", the SIGKILL-survivor sweep in hermes update (hermes_cli/main.py, also tagged #17648) already handles the supervisor-respawn case on its own. Removed: gateway/run.py: - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS - _read_git_head_sha(), _compute_repo_mtime() module helpers - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha / _stale_code_restart_triggered defaults - __init__ boot-snapshot block (_boot_, _cached_current_sha, _repo_root_for_staleness, _stale_code_notified) - _current_git_sha_cached(), _detect_stale_code(), _trigger_stale_code_restart() methods - stale-code check + user-facing restart notice at the top of _handle_message() tests/gateway/test_stale_code_self_check.py (deleted, 412 lines) No new logic added. Zero remaining references to any removed symbol. Gateway test suite passes the same 4589 tests it passed before; the 3 pre-existing unrelated failures (discord free-channel, feishu bot admission, teams typing) are unchanged by this commit. * fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) Per-delta _strip_think_blocks ran at _fire_stream_delta and destroyed downstream state. When MiniMax-M2.7 / DeepSeek / Qwen3 streamed a tag split across deltas (delta1='<think>', delta2='Let me check'), the regex case-2 match erased delta1 entirely, so CLI/gateway state machines never learned a block was open and leaked delta2 as content. Raw consumers (ACP, api_server, TTS) had no downstream defense at all. Replace the per-delta regex with a stateful StreamingThinkScrubber that survives delta boundaries: - Closed <tag>X</tag> pairs always stripped (matches _strip_think_blocks case 1). - Unterminated open at block boundary enters a block; content discarded until close tag arrives. At end-of-stream, held content is dropped. - Orphan close tags stripped without boundary gating. - Partial tags at delta boundaries held back until resolved. - Block-boundary rule (start-of-stream, after \n, or whitespace-only since last \n) preserves prose that mentions tag names. Reset at turn start alongside the existing context scrubber; flush at turn end so a benign '<' held back at end-of-stream reaches the UI. E2E-verified on live OpenRouter->MiniMax-m2 streams: closed pairs strip cleanly, first word of post-block content is preserved, pure content passes through unchanged. Stefan's screenshot case (#17924) — 'Let me check' getting chopped to ' me check' — no longer happens. Final _strip_think_blocks calls on completed strings (final_response, replay, compression) are preserved; only the streaming per-delta call site switched to the scrubber.	2026-05-05 04:33:38 -07:00
Chris Danis	28f4d6db63	fix(tool-schemas): reactive strip of pattern/format on llama.cpp grammar 400s MCP servers commonly emit JSON Schema `pattern` (e.g. `\\d{4}-\\d{2}-\\d{2}` for date-time params) and `format` keywords. llama.cpp's `json-schema-to-grammar` converter rejects regex escape classes (\\d/\\w/\\s) and most format values, returning HTTP 400 "parse: error parsing grammar: unknown escape at \\d" — the whole request fails. Cloud providers (OpenAI, Anthropic, OpenRouter, Gemini) accept these keywords fine and use them as prompting hints. Stripping unconditionally loses useful hints for every cloud user to fix a llama.cpp-only bug. Approach: classify the llama.cpp grammar-parse 400 in the error classifier, and on match do a one-shot in-place strip of pattern/format from `self.tools`, then retry. Follows the existing `thinking_signature` recovery pattern. Cloud users hit zero overhead; llama.cpp users pay one failed request per session. Changes - agent/error_classifier.py: new `FailoverReason.llama_cpp_grammar_pattern` + narrow HTTP-400 branch matching "error parsing grammar", "json-schema-to-grammar", or "unable to generate parser ... template". - tools/schema_sanitizer.py: new `strip_pattern_and_format()` helper — reactive, walks schema nodes, skips property names (search_files.pattern survives). Returns strip count for logging. - run_agent.py: new one-shot recovery block in the retry loop. Strips, logs, continues. Falls through to normal retry if nothing to strip. - tests: 4 classifier tests (3 variants + 1 non-400 negative), 7 strip tests including the property-name preservation and idempotency checks. Co-authored-by: Chris Danis <cdanis@gmail.com>	2026-05-05 04:25:18 -07:00
EmelyanenkoK	25065283b3	fix: improve telegram topic mode setup	2026-05-04 12:07:17 -07:00
bobashopcashier	d89e7a3cd4	fix(anthropic): restrict fast mode to Opus 4.6 (Anthropic API contract) Per https://platform.claude.com/docs/en/build-with-claude/fast-mode: "Fast mode is currently supported on Opus 4.6 only. Sending speed: fast with an unsupported model returns an error." Pre-fix, _is_anthropic_fast_model() returned True for any claude-* model, so /fast on Opus 4.7 (or Sonnet/Haiku) would persist agent.service_tier=fast in config.yaml and the adapter would inject extra_body["speed"] = "fast" on every subsequent request. Opus 4.7 returns: HTTP 400: 'claude-opus-4-7' does not support the `speed` parameter. This wedged sessions across model upgrades (a user who ran /fast on Opus 4.6 and later switched the default model to 4.7 hit a hard 400 on every turn until they manually edited config.yaml). Changes: - _is_anthropic_fast_model: gate on "opus-4-6" / "opus-4.6" only - anthropic_adapter: add _supports_fast_mode predicate as defensive guard so stale request_overrides on an unsupported model are dropped silently instead of 400'ing - Tests: flip the assertions that mirrored the bug (Sonnet/Haiku/Opus 4.7 asserting fast-mode support) to match the documented API contract	2026-05-04 06:23:52 -07:00
JasonOA888	a7417f8a4a	fix(compressor): skip non-string tool content in summarization pass to prevent AttributeError Commit `408dd8aa` added a non-string guard for Pass 1 (dedup), but the same pattern exists in Pass 2 (summarization/pruning) where content.startswith() and len() are called on potentially non-string tool content. When a provider returns tool results with non-string content (e.g. dict or int from llama.cpp or similar), the pruning pass crashes with AttributeError. Add the same isinstance(content, str) guard to Pass 2 for consistency.	2026-05-04 06:23:52 -07:00
陈运波0668001438	6cf7a9e330	fix(vision): preserve explicit provider auth with custom base_url Keep the configured vision provider when base_url is overridden so credential-pool lookup still resolves provider-specific API keys (e.g. ZAI_API_KEY), and add a regression test for this path.	2026-05-04 05:05:43 -07:00
swithek	b7bbc62503	fix(compressor): _prune_old_tool_results boundary direction	2026-05-04 05:05:18 -07:00
Dejie Guo	d29f90e89d	fix(error_classifier): avoid large-context false overflow heuristics Generic 400 and server-disconnect heuristics used absolute token/message-count fallbacks that are too aggressive for 1M context sessions. Gate those absolute fallbacks to smaller context windows while preserving relative pressure checks. Fixes #16351	2026-05-04 05:04:56 -07:00
ms-alan	6f864f8f94	fix(redact): add code_file param to skip false-positive ENV/JSON patterns ENV-assignment and JSON-field regex patterns in redact_sensitive_text() cause false positives when reading source code files: - MAX_TOKENS=*** triggers the ENV assignment pattern - "apiKey": "test" in test fixtures triggers the JSON field pattern Add code_file=False parameter. When code_file=True, skip only the ENV-assignment and JSON-field regex passes; all other patterns (prefixes, auth headers, private keys, DB connstrings, JWTs, URL secrets) are still applied. Update file_tools.py (read_file and search_files) to pass code_file=True so agent code analysis is not polluted by false-positive redactions. Closes #15934	2026-05-04 04:56:28 -07:00
Grey0202	a219a0a4df	fix(anthropic): strip top-level oneOf/allOf/anyOf from tool input_schema Extends the existing _normalize_tool_input_schema to also drop top-level union keywords that Anthropic's tool schema validator rejects with HTTP 400. Several upstream and plugin tools ship schemas with a top-level oneOf/ allOf/anyOf (common for Pydantic discriminated unions). The existing strip_nullable_unions pass only handles anyOf-with-null patterns; a non-null top-level union keyword sails through and hits the API. Salvage of #16471 — approach folded into the existing normalize helper rather than introducing a parallel _sanitize_input_schema function, to avoid two schema-munging code paths running against the same input. Co-authored-by: Grey0202 <grey0202@users.noreply.github.com>	2026-05-04 03:17:35 -07:00
charliekerfoot	412f2389f1	fix(google_oauth): close TOCTOU window when saving credentials	2026-05-04 03:16:19 -07:00
pander	6b88f46c54	fix(compressor): trigger fallback on timeout errors alongside model-not-found Previously only HTTP 404/503 and specific error strings triggered a fallback to the main model when the summary model was unavailable. Timeout errors (HTTP 408/429/502/504, or error strings containing 'timeout') entered a short cooldown instead, leaving context to grow unbounded for the rest of the session. Add _is_timeout detection alongside _is_model_not_found so that transient timeout errors on the summary model also trigger immediate fallback to the main model, preventing compression failure from cascading. Closes #15935	2026-05-04 03:10:53 -07:00
flobo3	ba8337464d	fix(gemini): extract usageMetadata from streaming chunks for token tracking	2026-05-04 02:33:30 -07:00
B1GGersnow	dc63ad0ad2	fix(anthropic): cap max_tokens at 65536 for Qwen models via DashScope DashScope's Anthropic-compatible endpoint enforces max_tokens ∈ [1, 65536]. Adding "qwen3" to _ANTHROPIC_OUTPUT_LIMITS prevents 400 errors that were misclassified as context overflow, triggering premature compression. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-04 02:31:05 -07:00
nftpoetrist	e2211b2683	fix(compressor): reset _summary_failure_cooldown_until in on_session_reset() on_session_reset() cleared _previous_summary, _last_summary_error, and _ineffective_compression_count but left _summary_failure_cooldown_until intact. When a transient summary error sets a 60 s cooldown (or 600 s for a missing-provider RuntimeError) and the user immediately runs /reset or /new, the cooldown carries into the new session. If the new session reaches the compression threshold before the cooldown expires, _generate_summary() returns None early, middle turns are silently dropped without a summary, and the agent continues with no indication that compaction was skipped. Fix: set _summary_failure_cooldown_until = 0.0 in on_session_reset(), matching the value assigned in __init__ and symmetric with the other per-session fields already cleared there. Fixes #15547	2026-05-04 02:30:31 -07:00
daixin1204	744079ffe6	fix(curator): prevent false-positive consolidation from substring matching _classify_removed_skills used naive 'in' substring matching to detect whether a removed skill's name appeared in skill_manage arguments. Short/common skill names (api, git, test, foo, etc.) matched incorrectly when they appeared as substrings of longer words in file paths (references/api-design.md) or content (latest, testing). Replace with field-aware matching: - file_path: needle must match a complete filename stem or directory name, with -/_ normalised for variant tolerance - content fields: word-boundary regex (\b) prevents embedding in longer words Also add 3 regression tests covering the false-positive scenarios.	2026-05-04 01:21:23 -07:00
nftpoetrist	808fee151d	fix(auxiliary): propagate explicit_api_key to _try_anthropic() _try_anthropic() lacked the explicit_api_key parameter added to _try_openrouter() in #18768. When resolve_provider_client() is called with provider="anthropic" and an explicit key (e.g. from a fallback_model entry with api_key set), the key was silently ignored — _try_anthropic() always fell back to resolve_anthropic_token(), so the fallback returned None,None for users without a default Anthropic credential configured. Fix: add explicit_api_key: str = None to _try_anthropic() and use explicit_api_key or <pool/env fallback> in both the pool-present and no-pool paths. Pass explicit_api_key=explicit_api_key at the call site in resolve_provider_client(). Symmetric with the _try_openrouter() fix. No behavior change when explicit_api_key is None.	2026-05-03 17:00:55 -07:00
Teknium	b58db237e4	fix(kanban): drop worker identity claim from KANBAN_GUIDANCE (#19427 ) KANBAN_GUIDANCE layer 3 of the system prompt started with 'You are a Kanban worker', overriding the profile's SOUL.md identity at layer 1. Profiles with strict role boundaries (e.g. a reviewer profile that never writes code) still executed implementation tasks because the kanban identity claim diluted SOUL's. Drop the identity line. Layer 3 now describes the task-execution protocol only; SOUL.md remains the sole identity slot. Fixes #19351	2026-05-03 16:59:00 -07:00
0xKingBack	3c42024539	fix(curator): pass auxiliary curator api_key/base_url into runtime resolution Curator review fork now forwards per-slot credentials from auxiliary.curator and legacy curator.auxiliary to resolve_runtime_provider, matching the canonical aux task schema. Add regression tests for binding and main fallback.	2026-05-03 16:55:16 -07:00
sprmn24	408dd8aa28	fix(compressor): skip non-string tool content in dedup pass to prevent AttributeError	2026-05-03 15:28:30 -07:00
Zyproth	dfdd7b6e6f	fix(codex-transport): preserve request override headers for xai responses	2026-05-03 15:25:45 -07:00
kshitij	457c7b76cd	feat(openrouter): add response caching support (#19132 ) Enable OpenRouter's response caching feature (beta) via X-OpenRouter-Cache headers. When enabled, identical API requests return cached responses for free (zero billing), reducing both latency and cost. Configuration via config.yaml: openrouter: response_cache: true # default: on response_cache_ttl: 300 # 1-86400 seconds Changes: - Add openrouter config section to DEFAULT_CONFIG (response_cache + TTL) - Add build_or_headers() in auxiliary_client.py that builds attribution headers plus optional cache headers based on config - Replace inline _OR_HEADERS dicts with build_or_headers() at all 5 sites: run_agent.py __init__, _apply_client_headers_for_base_url(), and auxiliary_client.py _try_openrouter() + _to_async_client() - Add _check_openrouter_cache_status() method to AIAgent that reads X-OpenRouter-Cache-Status from streaming response headers and logs HIT/MISS status - Document in cli-config.yaml.example - Add 28 tests (22 unit + 6 integration) Ref: https://openrouter.ai/docs/guides/features/response-caching	2026-05-03 01:54:24 -07:00
liuhao1024	af98122793	fix(auxiliary): propagate explicit_api_key to _try_openrouter() When resolve_provider_client() passes explicit_api_key for OpenRouter auxiliary tasks, _try_openrouter() now accepts and honors this parameter instead of silently ignoring it and falling back to OPENROUTER_API_KEY env var. Root cause: _try_openrouter() had no explicit_api_key parameter, so even when callers wanted to pass a runtime credential pool key, it could not be used. Fix: - Add explicit_api_key: str = None parameter to _try_openrouter() - Prioritize explicit_api_key over pool key and env var - Update resolve_provider_client() call site to pass explicit_api_key Regression coverage: - Test that explicit_api_key is passed to OpenAI client when provided - Test that fallback to OPENROUTER_API_KEY still works when explicit_api_key is None Closes #18338	2026-05-02 02:27:49 -07:00
Frank Song	2ef1ad280b	fix: prefer ~/.hermes/.env over os.environ when seeding credential pool When _seed_from_env() reads API keys to populate the credential pool, it should treat ~/.hermes/.env as the authoritative source — not os.environ. Stale env vars inherited from parent shell processes (Codex CLI, test scripts, etc.) can shadow deliberate changes to the .env file, causing auth.json to cache an outdated key that leads to silent 401 errors. This is especially visible with OpenRouter: if a parent process exported OPENROUTER_API_KEY=test-key-fresh and the user later updates .env with a valid key, restarting Hermes still picks up the stale os.environ value, writes it back to auth.json, and all API calls fail with 401. Fixes #18254	2026-05-02 02:00:32 -07:00
liuhao1024	9bf260472b	fix(tools): deduplicate tool names at API boundary for Vertex/Azure/Bedrock Providers like Google Vertex, Azure, and Amazon Bedrock reject API requests with duplicate tool names (HTTP 400: 'Tool names must be unique'). The upstream injection paths in run_agent.py already dedup after PR #17335, but two API-boundary functions pass tools through without checking: - agent/auxiliary_client.py: _build_call_kwargs() (all non-Anthropic providers in chat_completions mode) - agent/anthropic_adapter.py: convert_tools_to_anthropic() (Anthropic Messages API path) Add defensive dedup guards at both sites. Duplicates are dropped with a warning log, converting a hard 400 failure into a recoverable condition. This is intentionally conservative — the root-cause dedup in run_agent.py is the primary defense; these guards add resilience against future injection-path regressions. Includes 8 new tests covering unique passthrough, duplicate removal, empty/None edge cases. Closes #18478	2026-05-02 01:51:51 -07:00
Teknium	c73594fe41	fix(skills): rescan skill_commands cache when platform scope changes (#18739 ) The process-global `_skill_commands` dict in agent/skill_commands.py was seeded by whichever platform scanned first, and `get_skill_commands()` only rescanned when the cache was empty. In a long-lived gateway process serving multiple platforms (Telegram + Discord + Slack), the first platform's `skills.platform_disabled` view was silently inherited by the others — so a skill disabled for Telegram would also disappear from Discord's slash menu, and vice versa. Track the platform scope the cache was populated for (`_skill_commands_platform`) and rescan in `get_skill_commands()` when the currently-active platform no longer matches. Platform resolution uses the same precedence as `_is_skill_disabled`: `HERMES_PLATFORM` env var then `HERMES_SESSION_PLATFORM` from the gateway session context. Fixes #14536 Salvages #14570 by LeonSGP43. Co-authored-by: LeonSGP <leon@sgp43.com>	2026-05-02 01:36:53 -07:00
Teknium	97acd66b4c	fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671 ) (#18731 ) * fix(curator): authoritative absorbed_into declarations on skill delete Closes #18671. The classification pipeline that feeds cron-ref rewriting used to infer consolidation vs pruning from two brittle signals: the curator model's post-hoc YAML summary block, and a substring heuristic scanning other tool calls for the removed skill's name. Both miss in real consolidations — the model forgets the YAML under reasoning pressure, and the heuristic misses when the umbrella's patch content describes the absorbed behavior abstractly instead of naming the old slug. When both miss, the skill falls through to 'no-evidence fallback' pruned, and #18253's cron rewriter drops the cron ref entirely instead of mapping it to the umbrella. Same observable symptom as pre-#18253: 'Skill(s) not found and skipped' at the next cron run. The fix makes the model declare intent at the moment of deletion. skill_manage(action='delete') now accepts absorbed_into: - absorbed_into='<umbrella>' -> consolidated, target must exist on disk - absorbed_into='' -> explicit prune, no forwarding target - missing -> legacy path, falls through to heuristic/YAML The curator reconciler reads these declarations off llm_meta.tool_calls BEFORE either the YAML block or the substring heuristic. Declaration wins. Fallback logic stays intact for backward compat with any caller (human or older curator conversation) that doesn't populate the arg. Changes - tools/skill_manager_tool.py: add absorbed_into param to skill_manage + _delete_skill. Validate target exists when non-empty. Reject absorbed_into=<self>. Wire through dispatcher + registry + schema. - agent/curator.py: new _extract_absorbed_into_declarations() walks tool calls for skill_manage(delete) with the arg. _reconcile_classification accepts absorbed_declarations= and treats them as authoritative. Curator prompt updated to require the arg on every delete. - Tests: 7 new skill_manager tests covering the tool contract (valid target, empty string, nonexistent target, self-reference, whitespace, backward compat, dispatcher plumbing). 11 new curator tests covering the extractor + authoritative reconciler path + mixed-legacy-and- declared runs. Validation - 307/307 targeted tests pass (curator + cron + skill_manager suites). - E2E #18671 repro: 3 narrow skills, 1 umbrella, cron job referencing all 3. Model emits NO YAML block. Heuristic misses (patch prose doesn't name old slugs). Delete calls carry absorbed_into. Result: both PR skills correctly classified 'consolidated' + cron rewritten ['pr-review-format', 'pr-review-checklist', 'stale-junk'] -> ['hermes-agent-dev']; stale-junk pruned via absorbed_into=''. - E2E backward-compat: delete without absorbed_into, model emits YAML -> routed via existing 'model' source, cron still rewritten correctly. * feat(curator): capture + restore cron skill links across snapshot/rollback Before this, rolling back a curator run restored the skills tree but cron jobs still pointed at the umbrella skills the curator had rewritten them to. The user would see their old narrow skills back on disk but their cron jobs still configured with the merged umbrella — not actually 'back to how it was'. Snapshot side: snapshot_skills() now captures ~/.hermes/cron/jobs.json alongside the skills tarball, as cron-jobs.json. The manifest gets a new 'cron_jobs' block with {backed_up, jobs_count} so rollback (and the CLI confirm dialog) can surface what's in the snapshot. If jobs.json is missing/unreadable/malformed, snapshot proceeds without cron data — the skills backup is the core guarantee; cron is additive. Rollback side: after the skills extract succeeds, the new _restore_cron_skill_links() reconciles the backed-up jobs into the live jobs.json SURGICALLY. Only 'skills' and 'skill' fields are restored, and only on jobs matched by id. Everything else about a cron job — schedule, last_run_at, next_run_at, enabled, prompt, workdir, hooks — is live state the user or scheduler has modified since the snapshot; overwriting it would regress unrelated activity. Reconciliation rules: - Job in backup AND live, skills differ → skills restored. - Job in backup AND live, skills match → no-op. - Job in backup, NOT in live → skipped (user deleted it after snapshot; their choice is later than the snapshot). - Job in live, NOT in backup → untouched (user created it after snapshot). - Snapshot missing cron-jobs.json at all → rollback still succeeds, reports 'not captured' (older pre-feature snapshots keep working). Writes go through cron.jobs.save_jobs under the same _jobs_file_lock the scheduler uses, so rollback doesn't race tick(). Also: - hermes_cli/curator.py: rollback confirm dialog now shows 'cron jobs: N (will be restored for skill-link fields only)' when the snapshot has cron data, or 'not in snapshot (<reason>)' otherwise. - rollback()'s message string includes a 'cron links: ...' clause summarizing the reconciliation outcome. Tests - 9 new cases: snapshot-with-cron, snapshot-without-cron, malformed-json captured-as-raw, full rollback-restores-skills-and-cron, rollback touches only skill fields, rollback skips user-deleted jobs, rollback leaves user-created jobs untouched, rollback still works with pre-feature snapshot that has no cron-jobs.json, standalone unit test on _restore_cron_skill_links exercising the full report shape. Validation - 484/484 targeted tests pass (curator + cron + skill_manager suites). - E2E: real snapshot_skills, real cron rewrite, real rollback. Before: ['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage']. After curator: ['hermes-agent-dev']. After rollback: ['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage']. Non-skill fields (id, name, prompt) preserved across the round trip.	2026-05-02 01:29:57 -07:00
Teknium	77c0bc6b13	fix(curator): defer first run and add --dry-run preview (#18373 ) (#18389 ) * fix(curator): defer first run and add --dry-run preview (#18373) Curator was meant to run 7 days after install, not on the very first gateway tick. On a fresh install (no .curator_state), should_run_now() returned True immediately because last_run_at was None — so the gateway cron ticker fired Curator against a fresh skill library moments after 'hermes update'. Combined with the binary 'agent-created' provenance model (anything not bundled and not hub-installed), this consolidated hand-authored user workflow skills without consent. Changes: - should_run_now(): first observation seeds last_run_at='now' and returns False. The next real pass fires one full interval_hours later (7 days by default), matching the original design intent. - hermes curator run --dry-run: produces the same review report without applying automatic transitions OR permitting the LLM to call skill_manage / terminal mv. A DRY-RUN banner is prepended to the prompt and the caller skips apply_automatic_transitions. State is NOT advanced so a preview doesn't defer the next scheduled real pass. - hermes update: prints a one-liner on fresh installs pointing at --dry-run, pause, and the docs. Silent on steady state. - Docs: curator.md and cli-commands.md explain the deferred first-run behavior and warn that hand-written SKILL.md files share the 'agent-created' bucket, with guidance to pin or preview before the first pass. Tests: - test_first_run_defers replaces the old 'first run always eligible' assertion — same fixture, inverted expectation. - test_maybe_run_curator_defers_on_fresh_install covers the gateway tick path end-to-end. - Three new dry-run tests cover state-advance suppression, prompt banner injection, and apply_automatic_transitions skipping. Fixes #18373. * feat(curator): pre-run backup + rollback (#18373) Every real curator pass now snapshots ~/.hermes/skills/ into ~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz before calling apply_automatic_transitions or the LLM review. If a run consolidates or archives something the user didn't want touched, 'hermes curator rollback' restores the tree in one command. Dry-run is skipped — no mutation means no snapshot needed. Changes: - agent/curator_backup.py (new): tar.gz snapshot + safe rollback. The snapshot excludes .curator_backups/ (would recurse) and .hub/ (managed by the skills hub). Extract refuses absolute paths and .. components, and uses tarfile's filter='data' on Python 3.12+. Rollback takes a pre-rollback safety snapshot FIRST, stages the current tree into .rollback-staging-<ts>/ so the extract lands in an empty dir, and cleans the staging dir on success. A failed extract restores the staged contents. - agent/curator.py: run_curator_review() calls curator_backup. snapshot_skills(reason='pre-curator-run') before apply_automatic_ transitions. Best-effort — a failed snapshot logs at debug and the run continues (a transient disk issue shouldn't silently disable curator forever). - hermes_cli/curator.py: new 'hermes curator backup' and 'hermes curator rollback' subcommands. rollback supports --list, --id <ts>, -y. - hermes_cli/config.py: curator.backup.{enabled, keep} config block with sane defaults (enabled=true, keep=5). - Docs: curator.md gets a 'Backups and rollback' section; cli-commands .md table gets the new rows. Tests (new file tests/agent/test_curator_backup.py, 16 cases): - snapshot creates tarball + manifest with correct counts - snapshot excludes .curator_backups/ (recursion guard) and .hub/ - snapshot disabled via config returns None without creating anything - snapshot uniquifies ids within the same second (-01 suffix) - prune honors keep count, newest-first - list_backups + _resolve_backup cover newest-default and unknown-id - rollback restores a deleted skill with content intact - rollback is itself undoable — safety snapshot shows up in list_backups - rollback with no snapshots returns an error - rollback refuses tarballs with absolute paths or .. components - real curator runs take a 'pre-curator-run' snapshot; dry-runs do not All curator tests: 210 passing locally.	2026-05-01 09:49:59 -07:00
teknium1	2af8b8ff37	fix(moonshot): also strip nullable/enum after anyOf collapse The anyOf collapse in _repair_schema returned early, skipping the nullable-strip and enum-cleanup steps. When a schema had anyOf [{enum: [..., null, '']}, {type: null}] alongside a parent-level 'nullable: true', collapsing to the single non-null branch produced a merged node that still had both 'nullable' and the bad enum values — Moonshot would still 400 on it. Fix: fall through to Rules 1/3 when the collapse produces a single merged node; only return early for the multi-branch case (pure anyOf preservation) or when there was no null branch to remove. Adds a test that locks in the combined-case expectation.	2026-04-30 23:14:31 -07:00
Hendrix	9ca72a69a7	fix(moonshot): fill missing type before enum cleanup to handle anyOf branches without explicit type When a schema node inside anyOf has enum values but no explicit 'type', Rule 3 (enum cleanup) ran before _fill_missing_type, so node_type was None and the enum was never cleaned. Moonshot then rejected the schema with 'enum value (<nil>) does not match any type in [string]'. Fix: reorder operations — fill missing type first, strip nullable, then clean enum. This ensures enum cleanup always has a type to check. Also fixes test expectation: empty string in enum is now correctly stripped (Moonshot rejects it too). Closes #16875	2026-04-30 23:14:31 -07:00
Teknium	e2eb561e8e	fix(curator): rewrite cron job skill refs after consolidation (#18253 ) When the curator consolidates skill X into umbrella Y, any cron job that listed X in its skills field would fail to load X at run time — the scheduler logs a warning and skips it, so the scheduled job runs without the instructions it was scheduled to follow. cron.jobs.rewrite_skill_refs(consolidated, pruned) now updates jobs in-place: consolidated names route to the umbrella target (dedup when umbrella is already present), pruned names are dropped. agent.curator._write_run_report calls it after classification, best-effort so a cron-side failure never breaks the curator itself. Results are recorded in run.json (counts.cron_jobs_rewritten + full cron_rewrites payload), a separate cron_rewrites.json for convenience when jobs were touched, and a section in REPORT.md. Reported by @tombielecki.	2026-04-30 23:04:50 -07:00

1 2 3 4 5 ...

750 commits