hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-29 01:31:41 +00:00

Author	SHA1	Message	Date
helix4u	d8c5573ffe	fix(profiles): migrate Honcho host on rename	2026-04-28 05:37:09 -07:00
revar	052b3449e5	test(cli): regression test for manual /compress system_message Add tests/test_cli_manual_compress.py verifying _manual_compress passes None (not the cached system prompt) to _compress_context, forwards the /compress <topic> focus string, rotates CLI session_id to the new child session, and clears the pending title. Co-authored-by: revar <revar@users.noreply.github.com>	2026-04-28 05:21:49 -07:00
teknium1	7444e49d4e	fix(gateway): use transcript timestamp for auto-continue freshness Follow-up to PR #16802 (BeliefanX). The original fix read `agent_history[-1].get("timestamp")` for the tool-tail freshness gate, but `gateway/run.py` strips the `timestamp` field off all tool/tool_call rows when building `agent_history` from the raw transcript (see `clean_msg = {k: v for k, v in msg.items() if k != "timestamp"}`). At runtime the tool-tail branch always saw `None` and silently took the legacy-fresh path — the stale-guard never fired for the tool-tail case it was supposed to cover. Changes: - Read the freshness signal from the RAW `history` list (via new `_last_transcript_timestamp()` helper) BEFORE the strip. Both the resume_pending branch and the tool-tail branch use this single signal, replacing the two divergent ones. - Default window bumped 15 min → 1 hour via new `_AUTO_CONTINUE_FRESHNESS_SECS_DEFAULT`. The 15-minute default was shorter than the default `gateway_timeout` of 30 min, so a legitimate long-running turn interrupted near its timeout boundary and resumed shortly after would have been misclassified as stale. - Configurable via `config.yaml` `agent.gateway_auto_continue_freshness` (bridged to `HERMES_AUTO_CONTINUE_FRESHNESS` at gateway startup — same pattern as `gateway_timeout`). Set to 0 to disable the gate. - `_coerce_gateway_timestamp` now explicitly rejects bool (which is a subclass of int and would otherwise coerce to 0.0/1.0). - Tests rewritten to exercise the real production data shape: raw `history` → `_build_agent_history` strip → freshness decision. A regression guard (`test_stale_tool_tail_with_production_data_shape`) asserts `agent_history` tool rows carry NO timestamp, protecting against someone "fixing" the original bug by re-adding the stripped field (which would break the OpenAI tool-result message contract). Add BeliefanX to scripts/release.py AUTHOR_MAP. 
E2E verified: config.yaml → env var bridge → helper returns configured value; default 1h window; malformed/empty env var falls back to default; ISO-Z timestamps parse; ms-epoch coerced; bool rejected.	2026-04-28 05:20:35 -07:00
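The freshness gate described in 7444e49d4e can be sketched as follows. This is a minimal illustration, not the code in `gateway/run.py`: the names `coerce_timestamp`, `last_transcript_timestamp`, and `is_fresh` are hypothetical, as are the millisecond threshold, the choice to scan back to the most recent row that carries a timestamp, and treating a missing timestamp as fresh.

```python
from datetime import datetime, timezone

# Assumed default, mirroring the 1-hour window named in the commit message.
FRESHNESS_SECS_DEFAULT = 3600.0


def coerce_timestamp(value):
    """Return epoch seconds as a float, or None when the value is unusable."""
    # bool is a subclass of int, so it must be rejected before the numeric branch.
    if value is None or isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        # Assumption: anything above 1e11 is a millisecond epoch.
        return value / 1000.0 if value > 1e11 else float(value)
    if isinstance(value, str):
        try:
            # Older fromisoformat() versions reject a trailing "Z", so map it to an offset.
            return datetime.fromisoformat(value.replace("Z", "+00:00")).timestamp()
        except ValueError:
            return None
    return None


def last_transcript_timestamp(history):
    """Scan the RAW history (before timestamps are stripped) from the tail."""
    for msg in reversed(history):
        ts = coerce_timestamp(msg.get("timestamp"))
        if ts is not None:
            return ts
    return None


def is_fresh(history, now, window=FRESHNESS_SECS_DEFAULT):
    """Decide whether an interrupted turn is recent enough to auto-continue."""
    if window <= 0:
        return True  # a window of 0 disables the gate
    ts = last_transcript_timestamp(history)
    if ts is None:
        return True  # assumption: no usable timestamp means "treat as fresh"
    return (now - ts) <= window
```

The point of the sketch is the ordering: the timestamp is read from the raw transcript, so the later strip of `timestamp` from tool rows cannot hide it from the gate.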
beliefanx	93feffbcfa	fix(gateway): avoid stale interrupted turn auto-continue	2026-04-28 05:20:35 -07:00
Teknium	b61d9b297a	refactor: consolidate symlink-safe atomic replace into shared helper Extract the islink/realpath guard from the 16743 fix into a single atomic_replace() helper in utils.py, then migrate every os.replace() call site in the codebase to use it. The original PR #16777 correctly identified and fixed the bug, but only patched 9 of ~24 call sites. The same bug class (managed deployments that symlink state files silently losing the link on every write) still existed at auth.json, sessions file, gateway config, env_loader, webhook subscriptions, debug store, model catalog, pairing, google OAuth, nous rate guard, and more. Rather than add another 10+ copies of the same three-line guard, consolidate into atomic_replace(tmp, target) which: - resolves symlinks via os.path.realpath before os.replace - returns the resolved real path so callers can re-apply permissions - is a drop-in replacement for os.replace at the use sites Changes: - utils.py: new atomic_replace() helper + atomic_json_write / atomic_yaml_write now call it instead of inlining the guard - 16 files: all os.replace() call sites migrated to atomic_replace() - agent/{google_oauth, nous_rate_guard, shell_hooks}.py - cron/jobs.py - gateway/{pairing, session, platforms/telegram}.py - hermes_cli/{auth, config, debug, env_loader, model_catalog, webhook}.py - tools/{memory_tool, skill_manager_tool, skills_sync}.py Tests: tests/test_atomic_replace_symlinks.py pins the invariant for atomic_replace + atomic_json_write + atomic_yaml_write, covers plain files, first-time creates, broken symlinks, and permission preservation. Refs #16743 Builds on #16777 by @vominh1919.	2026-04-28 04:58:22 -07:00
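A minimal sketch of the symlink-safe replace that b61d9b297a describes, assuming only what the message states (resolve via `os.path.realpath` before `os.replace`, return the resolved path). The body below is illustrative and may differ from the helper in `utils.py`.

```python
import os
import tempfile


def atomic_replace(tmp_path, target_path):
    """Replace target_path with tmp_path without clobbering a symlink.

    os.replace() on a symlink swaps out the link itself, so a managed
    deployment that symlinks a state file would silently lose the link.
    Resolving the link first makes the write land on the real file.
    """
    if os.path.islink(target_path):
        real_target = os.path.realpath(target_path)
    else:
        real_target = target_path
    os.replace(tmp_path, real_target)
    # Returning the resolved path lets callers re-apply permissions to it.
    return real_target


def demo():
    """Write through a symlink and report (link survived, new content)."""
    d = tempfile.mkdtemp()
    real = os.path.join(d, "real.json")
    link = os.path.join(d, "link.json")
    tmp = os.path.join(d, "tmp.json")
    with open(real, "w") as f:
        f.write("old")
    os.symlink(real, link)
    with open(tmp, "w") as f:
        f.write("new")
    atomic_replace(tmp, link)
    with open(real) as f:
        return os.path.islink(link), f.read()
```

As with plain `os.replace`, the temporary file needs to live on the same filesystem as the resolved target for the rename to be atomic.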
teknium1	1369dae226	test(openclaw-migration): cover alias reverse-lookup for real OpenClaw schema Real OpenClaw configs key agents.defaults.models by full provider/model API ID with an 'alias' field on the value (e.g. {'anthropic/claude-opus-4-6': {'alias': 'Claude Opus 4.6'}}). Add regression tests for issue #16745 covering: - reverse-lookup of alias against real schema (keyed by API ID) - alias resolution when model is a bare string vs {'primary': ...} - passthrough when the value is already a provider/model API ID - passthrough when the alias has no catalog match - string-valued catalog entries (belt-and-suspenders) - no catalog at all	2026-04-28 04:58:13 -07:00
Pony.Ma	aa94883288	fix(mcp): preserve nullable schema coercion	2026-04-28 04:58:03 -07:00
Pony.Ma	1350d12b0b	fix: keep mcp dynamic refresh tasks tracked	2026-04-28 04:58:03 -07:00
Pony.Ma	02ae152222	fix(mcp): normalize nullable tool schemas	2026-04-28 04:58:03 -07:00
Ruda Porto Filgueiras	37551ee53e	test(bedrock): add model picker and region routing tests 25 new tests (all Bedrock API calls mocked, no real AWS creds needed): tests/hermes_cli/test_bedrock_model_picker.py (20 tests): - provider_model_ids("bedrock") uses live discovery, returns regional model IDs, falls back gracefully on empty/exception, resolves all bedrock aliases (aws, aws-bedrock, amazon-bedrock) to live discovery - list_authenticated_providers() section 2: bedrock appears with AWS creds, model list from discover_bedrock_models(), total_models matches, is_current flag works, absent creds hides bedrock, discovery failure does not crash, no duplicate entries - Region routing: botocore profile eu-central-1 yields eu.* model IDs end-to-end; env var takes priority over botocore profile - providers.py overlay: exists with correct transport/auth_type, label is non-empty, all aliases normalize to bedrock tests/agent/test_bedrock_adapter.py (5 tests): - resolve_bedrock_region() botocore profile fallback, botocore failure fallback, us-east-1 hard fallback (with botocore mocked)	2026-04-28 03:53:11 -07:00
Teknium	023f5c74b1	fix(anthropic): remove Claude Code fingerprinting from OAuth Messages API path (#16957 ) * fix(anthropic): remove Claude Code fingerprinting from OAuth Messages API path OAuth requests now identify as Hermes on the wire. Removed: - "You are Claude Code, Anthropic's official CLI for Claude." system prompt prepend - Hermes Agent → Claude Code / Nous Research → Anthropic system-prompt substitutions - mcp_ tool-name prefix on outgoing tool schemas + message history - Matching mcp_ strip on inbound tool_use blocks (strip_tool_prefix path removed from AnthropicTransport.normalize_response, + all 5 call sites in run_agent.py and auxiliary_client.py) - user-agent: claude-cli/<v> (external, cli) and x-app: cli headers on the Messages API client Added: - OAuth path strips context-1m-2025-08-07 — Anthropic rejects OAuth requests carrying it with HTTP 400 'This authentication style is incompatible with the long context beta header.' Kept (auth plumbing, not identity spoofing): - _is_oauth_token classifier and is_oauth flag threading - Bearer vs x-api-key auth routing - _OAUTH_ONLY_BETAS (claude-code-20250219, oauth-2025-04-20) — backend requires these on the OAuth-gated Messages endpoint - _OAUTH_CLIENT_ID (Claude Code's) — Anthropic doesn't issue OAuth creds to third parties; this is the only way the login flow works - claude-cli/<v> User-Agent on the OAuth token exchange + refresh endpoints at platform.claude.com/v1/oauth/token — bare requests get Cloudflare 1010 blocked Verified live against api.anthropic.com with a fresh sk-ant-oat01-* token: - claude-haiku-4-5 simple message: HTTP 200, 'OK' response - claude-haiku-4-5 tool call: HTTP 200, stop_reason=tool_use, tool named 'terminal' (no mcp_ prefix) round-tripped correctly - Outgoing wire: no user-agent, no x-app, real Hermes identity in system prompt, real tool name in schema Closes/supersedes #16820 (mcp_ PascalCase normalization patch — no longer needed since the mcp_ round-trip is gone). 
* fix(anthropic): resolve_anthropic_token() reads credential pool first Close the gap where ~/.hermes/auth.json → credential_pool.anthropic (where hermes login + dashboard PKCE flow write OAuth tokens) was not in resolve_anthropic_token()'s source list. Before: users who authed via hermes login got the token written into the pool, but legacy fallback code paths (auxiliary_client, models catalog fetch, explicit-runtime path) that call resolve_anthropic_token() saw None and raised 'No Anthropic credentials found' — even though the token was sitting in auth.json. New priority 1: pool.select() with env-sourced entries skipped. Skipping env:* entries preserves the existing env-var priority logic further down the chain (static env OAuth → refreshable Claude Code upgrade via _prefer_refreshable_claude_code_token). Surfaced while writing the hermes-agent-dev skill playbook for 'finding a live OAuth token for an E2E test'. --------- Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 03:51:17 -07:00
Teknium	2b728e1274	fix(agent): drop thinking-only assistant turns before provider call (#16959 ) Adds a pre-call sanitizer that detects assistant messages containing only reasoning (reasoning / reasoning_content, no visible content, no tool_calls) and drops them from the API copy. Adjacent user messages left behind are merged so role alternation is preserved for the provider. Mirrors Claude Code's approach in src/utils/messages.ts (filterOrphanedThinkingOnlyMessages + mergeAdjacentUserMessages). We drop the whole turn rather than fabricate stub text (the '.' / '(continued)' pattern from contributor PRs #11098, #13010, #16842 that were rejected because they put words in the model's mouth). The stored conversation history (self.messages) is never mutated — only the per-call api_messages copy. Users still see the reasoning block in the CLI/gateway transcript; only the wire copy is cleaned. Session persistence keeps the full trace. Two call sites covered: - Main agent loop, after _sanitize_api_messages (catches every turn). - Iteration-limit-summary fallback path. Tests: tests/run_agent/test_thinking_only_sanitizer.py — 25 cases covering detection (string/list content, whitespace-only, tool_calls, reasoning_details list form), drop behavior, adjacent-user merge (string+string, list+list, mixed), non-mutation of input dicts, and system-message handling. E2E live-tested against 5 providers with a poisoned history (empty assistant message + reasoning_content): OpenRouter→Anthropic/OpenAI/ DeepSeek-R1/Qwen, native Gemini. All 5 accepted the cleaned request. Happy-path regression (5/5) confirms the sanitizer is a noop when no thinking-only turn exists. Related: #16823 (wontfix — stub-text approach rejected). Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 03:50:51 -07:00
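The drop-and-merge behavior in 2b728e1274 can be illustrated roughly like this. The function names are hypothetical, the separator used when joining two string contents is an assumption, and unlike the commit (which merges the user messages left adjacent by a drop) this sketch merges any adjacent user messages it encounters.

```python
def is_thinking_only(msg):
    """True for an assistant message with reasoning but nothing visible."""
    if msg.get("role") != "assistant" or msg.get("tool_calls"):
        return False
    has_reasoning = bool(msg.get("reasoning") or msg.get("reasoning_content"))
    content = msg.get("content")
    if isinstance(content, str):
        visible = bool(content.strip())
    elif isinstance(content, list):
        # Any non-text part, or a text part with non-whitespace text, is visible.
        visible = any(
            isinstance(p, dict)
            and (p.get("type") != "text" or (p.get("text") or "").strip())
            for p in content
        )
    else:
        visible = False
    return has_reasoning and not visible


def _as_parts(content):
    """Normalize string content to the list-of-parts form."""
    if isinstance(content, list):
        return list(content)
    return [{"type": "text", "text": content or ""}]


def merge_content(a, b):
    if isinstance(a, str) and isinstance(b, str):
        return a + "\n\n" + b  # assumed separator
    return _as_parts(a) + _as_parts(b)


def drop_thinking_only(messages):
    """Return a cleaned copy for the API call; the input list is not mutated."""
    cleaned = []
    for msg in messages:
        if is_thinking_only(msg):
            continue
        if cleaned and msg.get("role") == "user" and cleaned[-1].get("role") == "user":
            merged = dict(cleaned[-1])
            merged["content"] = merge_content(merged.get("content"), msg.get("content"))
            cleaned[-1] = merged
        else:
            cleaned.append(dict(msg))
    return cleaned
```

Working on shallow copies keeps the stored history intact, which matches the commit's stated invariant that only the per-call copy is cleaned.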
simonweng	a6a6cf047d	feat(providers): add tencent-tokenhub provider support Registers tencent-tokenhub (https://tokenhub.tencentmaas.com/v1) as a new API-key provider with model tencent/hy3-preview (256K context). - PROVIDER_REGISTRY entry + TOKENHUB_API_KEY / TOKENHUB_BASE_URL env vars - Aliases: tencent, tokenhub, tencent-cloud, tencentmaas - openai_chat transport with is_tokenhub branch for top-level reasoning_effort (Hy3 is a reasoning model) - tencent/hy3-preview:free added to OpenRouter curated list - 60+ tests (provider registry, aliases, runtime resolution, credentials, model catalog, URL mapping, context length) - Docs: integrations/providers.md, environment-variables.md, model-catalog.json Author: simonweng <simonweng@tencent.com> Salvaged from PR #16860 onto current main (resolved conflicts with #16935 Azure Anthropic env-var hint tests and the --provider choices= list removal in chat_parser).	2026-04-28 03:45:52 -07:00
Teknium	bd10acd747	fix(providers): honor key_env/api_key_env on Azure Anthropic + accept alias in normalizer (#16935) Three related fixes around custom env-var-name hints for provider entries. 1. Azure Anthropic path: previously hardcoded to look up AZURE_ANTHROPIC_KEY then ANTHROPIC_API_KEY with no way to override. If a user wrote model: provider: anthropic base_url: https://my-resource.services.ai.azure.com/anthropic key_env: MY_CUSTOM_KEY the key_env hint was silently ignored and the resolver raised 'No Azure Anthropic API key found' even when MY_CUSTOM_KEY was set in the environment. The runtime now checks, in order: (1) os.getenv(model_cfg.key_env) (2) os.getenv(model_cfg.api_key_env) # docs alias (3) model_cfg.api_key # inline value (4) AZURE_ANTHROPIC_KEY # historical default (5) ANTHROPIC_API_KEY # historical default Error message updated to mention key_env as an option. 2. Provider entry normalizer (_normalize_custom_provider_entry): accept 'api_key_env' as a snake_case alias for 'key_env', and 'apiKeyEnv' as a camelCase alias. Adds both to the _KNOWN_KEYS set so the 'unknown config keys ignored' warning doesn't fire on valid configs. 3. _VALID_CUSTOM_PROVIDER_FIELDS: add 'key_env'. That set documents supported custom_providers entry fields; it was drifting from reality since key_env has been read at runtime in auxiliary_client.py, runtime_provider.py, and main.py for a while. Docs: website/docs/guides/azure-foundry.md now uses the canonical key_env field and notes that api_key_env / keyEnv / apiKeyEnv are accepted as aliases. Validation: 12 new tests in test_runtime_provider_resolution.py covering all 5 Azure Anthropic resolution paths + 4 normalizer-alias tests. Pass rate across related suites (165 + 46 tests): 100%. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 02:12:08 -07:00
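The five-step lookup order listed in bd10acd747 can be written out as a small resolver. This is a sketch under assumptions: `resolve_azure_anthropic_key` is a hypothetical name, `model_cfg` is taken to be a plain dict, the `env` parameter exists only to make the function testable, and the exception type and wording are illustrative.

```python
import os


def resolve_azure_anthropic_key(model_cfg, env=None):
    """Return the first key found, following the order in the commit message."""
    env = os.environ if env is None else env

    # (1) key_env, then (2) its documented alias api_key_env: both name an env var.
    for hint in ("key_env", "api_key_env"):
        var_name = model_cfg.get(hint)
        if var_name and env.get(var_name):
            return env[var_name]

    # (3) inline value in the config itself.
    if model_cfg.get("api_key"):
        return model_cfg["api_key"]

    # (4) and (5): the historical default env vars.
    for var_name in ("AZURE_ANTHROPIC_KEY", "ANTHROPIC_API_KEY"):
        if env.get(var_name):
            return env[var_name]

    raise LookupError(
        "No Azure Anthropic API key found (set key_env, api_key, "
        "AZURE_ANTHROPIC_KEY, or ANTHROPIC_API_KEY)"
    )
```

Note that a `key_env` hint naming an unset variable falls through to the later steps rather than failing immediately; whether the real resolver behaves that way is not stated in the commit.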
墨綠BG	4462b349b2	✨ feat(web): expose search result limit	2026-04-28 02:09:30 -07:00
Teknium	e63364b8df	revert: computer-use cua-driver (PR #16919) (#16927) Reverts PR #16919 (commits `dad10a78d`, `413ee1a28`, `b4a8031b2`, `afb958829`) which was merged prematurely. Restoring the pre-merge state so #14817 and #15328 can be revisited as standing PRs. Reverted commits: - `afb958829` fix(computer-use): harden image-rejection fallback + AUTHOR_MAP - `b4a8031b2` fix(computer-use): unwrap _multimodal tool results - `413ee1a28` feat(computer-use): background focus-safe backend - `dad10a78d` feat(computer-use): cua-driver backend, universal any-model schema Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 01:57:21 -07:00
Teknium	cf0852f92e	feat(claw-migrate): harden OpenClaw import with plan-first apply, redaction, and pre-migration backup (#16911 ) * feat(claw-migrate): harden OpenClaw import with plan-first apply, redaction, and pre-migration backup Adopts four design patterns from OpenClaw's reciprocal migrate-hermes importer so both migration paths have the same safety posture. - Refuse-on-conflict apply. 'hermes claw migrate' now refuses to execute when the plan has any conflict items, unless --overwrite is set. Previously the user could say 'yes, proceed' and end up with a silent partial migration that skipped every conflicting item. - Engine-level secret redaction. The report.json and summary.md written to disk (and --json stdout) run through a redactor that matches OpenClaw's key-name markers and value-shape patterns (sk-, ghp_, xox-, AIza, Bearer ). Prevents accidental API key leakage in bug reports and support channels. - Pre-migration tarball snapshot.* Apply creates one timestamped restore-point archive of ~/.hermes/ at ~/.hermes/migration/pre-migration-backups/ before any mutation, excluding regenerable directories (sessions, logs, cache). Opt out with --no-backup. - Blocked-by-earlier-conflict sequencing. If a config.yaml write hits conflict/error mid-apply, subsequent config-mutating options are marked skipped with reason 'blocked by earlier apply conflict' rather than attempting partial writes. - Structured warnings[] and next_steps[] on the report — actionable guidance surfaces in both JSON output and summary.md. - --json output mode — emits the redacted report on stdout for CI. Also flips --preset full to NOT auto-enable --migrate-secrets. Users now have to opt in to secret import explicitly, mirroring OpenClaw's two-phase posture. Status/kind/action constants are defined (STATUS_MIGRATED etc) with values that match the existing strings the script emits, so the report schema is backward-compatible. 
ItemResult gains a 'sensitive' bool field that redaction and consumers can key off. Validation: 26 new unit tests + 1 updated test in tests/skills/ test_openclaw_migration_hardening.py and test_claw.py cover redaction (key markers, value patterns, recursion, on-disk), warnings/next_steps, blocked-by-earlier sequencing, --json mode, and the preset-flip. Manual E2E against a fake $HERMES_HOME with real-shaped secrets confirmed: (1) secrets never appear in stdout or on disk, (2) _cmd_migrate refuses apply when plan has conflicts, (3) --overwrite proceeds past the guard and the backup tarball is created, (4) --no-backup skips the archive. Related docs: website/docs/guides/migrate-from-openclaw.md and website/docs/reference/cli-commands.md updated to reflect the preset-flip and new --no-backup flag. * refactor(claw-migrate): reuse hermes backup system for pre-migration snapshot Drops the inline tarball in hermes_cli/claw.py in favor of hermes_cli.backup.create_pre_migration_backup(), which shares an implementation with create_pre_update_backup via a new _write_full_zip_backup helper. Benefits: - Consistent exclusion rules with hermes backup (_EXCLUDED_DIRS, _EXCLUDED_SUFFIXES, _EXCLUDED_NAMES — single source of truth). - SQLite safe-copy via _safe_copy_db (state.db restores cleanly). - Zip format restorable with 'hermes import <archive>'. - Lives under ~/.hermes/backups/pre-migration-.zip alongside pre-update-.zip — one place for all snapshot archives. - Auto-prune rotation with separate keep counters (pre-migration keeps 5, pre-update keeps 5, they don't touch each other's files). 7 new tests in tests/hermes_cli/test_backup.py lock the contract: directory location, shared exclusion rules, _validate_backup_zip acceptance (i.e. restorable with 'hermes import'), non-recursive into prior backups, rotation, missing-home handling, and the invariant that pre-migration rotation never touches pre-update backups. 
Help text and docs updated — the restore hint now says 'hermes import <name>' instead of 'tar -xzf <archive> -C ~/'. * chore(claw-migrate): use backup._format_size and drop duplicate output line Minor polish using another existing primitive from hermes_cli.backup: - Show backup archive size with _format_size (e.g. '(245 B)' or '(2.4 MB)') matching the format hermes backup already uses. - Drop the duplicate 'Pre-migration backup saved' line after Migration Results — the earlier 'Pre-migration backup: <path> (<size>)' line already surfaces the path before apply runs. --------- Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 01:50:23 -07:00
ThomassJonax	2f9243c333	fix(session): make SQLite transcript rewrites transactional	2026-04-28 01:49:46 -07:00
crayfish-ai	f3371c39a4	fix(auxiliary): custom provider URL rewrite + main_runtime model for title gen - auxiliary_client: apply _to_openai_base_url() to custom base_url (fixes /anthropic → /v1 rewrite missing for provider="custom") - auxiliary_client: use main_runtime.get("model") instead of _read_main_model() so auxiliary tasks follow system default model changes - title_generator: thread main_runtime through generate_title → auto_title_session → maybe_auto_title - cli.py / gateway/run.py: pass main_runtime to maybe_auto_title - tests: update mock assertions for new main_runtime parameter	2026-04-28 01:47:25 -07:00
westers	1791324604	test(cli): regression coverage for user-provider routing fix (#16767)	2026-04-28 01:47:20 -07:00
Teknium	afb9588298	fix(computer-use): harden image-rejection fallback + AUTHOR_MAP Follow-up to #15328's vision-unsupported retry branch in run_agent.py. _strip_images_from_messages() previously deleted any message whose content was entirely images. That's fine for synthetic user messages injected for attachment delivery, but it breaks providers for tool-role messages — the paired tool_call_id on the preceding assistant message ends up unmatched, which OpenAI-compatible APIs reject with HTTP 400. Fix: tool-role messages whose content becomes empty are replaced with a plaintext placeholder that preserves the tool_call_id linkage. Only non-tool messages are dropped. Added 10 tests covering the role-alternation invariants + image-type coverage. Image-rejection detector: expanded phrase list (image content not supported / multimodal input / vision input / model does not support image) and gated on 4xx status so transient 5xx errors never get misinterpreted as 'server said no to images'. Detection is documented as best-effort English phrase matching. AUTHOR_MAP: mapped 3820588+ddupont808@users.noreply.github.com to ddupont808 so release notes attribute the salvage correctly.	2026-04-28 01:46:36 -07:00
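The tool-message handling in afb9588298 (later reverted by e63364b8df) can be sketched as below. `strip_images` and the placeholder text are hypothetical stand-ins for `_strip_images_from_messages()`, and the set of part types treated as images is an assumption.

```python
# Assumed part types: OpenAI-style "image_url" and Anthropic-style "image".
IMAGE_PART_TYPES = {"image_url", "image"}
PLACEHOLDER = "[image content removed: model does not accept images]"


def strip_images(messages):
    """Remove image parts; keep tool messages so tool_call_id stays paired."""
    out = []
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            out.append(msg)  # plain-string or empty content has no image parts
            continue
        kept = [
            p for p in content
            if not (isinstance(p, dict) and p.get("type") in IMAGE_PART_TYPES)
        ]
        if kept:
            out.append({**msg, "content": kept})
        elif msg.get("role") == "tool":
            # Dropping this row would orphan the assistant's tool_call_id,
            # which OpenAI-compatible APIs reject with HTTP 400.
            out.append({**msg, "content": PLACEHOLDER})
        # Image-only messages from other roles are dropped.
    return out
```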
Teknium	dad10a78d0	feat(computer-use): cua-driver backend, universal any-model schema Background macOS desktop control via cua-driver MCP — does NOT steal the user's cursor or keyboard focus, works with any tool-capable model. Replaces the Anthropic-native `computer_20251124` approach from the abandoned #4562 with a generic OpenAI function-calling schema plus SOM (set-of-mark) captures so Claude, GPT, Gemini, and open models can all drive the desktop via numbered element indices. - `tools/computer_use/` package — swappable ComputerUseBackend ABC + CuaDriverBackend (stdio MCP client to trycua/cua's cua-driver binary). - Universal `computer_use` tool with one schema for all providers. Actions: capture (som/vision/ax), click, double_click, right_click, middle_click, drag, scroll, type, key, wait, list_apps, focus_app. - Multimodal tool-result envelope (`_multimodal=True`, OpenAI-style `content: [text, image_url]` parts) that flows through handle_function_call into the tool message. Anthropic adapter converts into native `tool_result` image blocks; OpenAI-compatible providers get the parts list directly. - Image eviction in convert_messages_to_anthropic: only the 3 most recent screenshots carry real image data; older ones become text placeholders to cap per-turn token cost. - Context compressor image pruning: old multimodal tool results have their image parts stripped instead of being skipped. - Image-aware token estimation: each image counts as a flat 1500 tokens instead of its base64 char length (~1MB would have registered as ~250K tokens before). - COMPUTER_USE_GUIDANCE system-prompt block — injected when the toolset is active. - Session DB persistence strips base64 from multimodal tool messages. - Trajectory saver normalises multimodal messages to text-only. - `hermes tools` post-setup installs cua-driver via the upstream script and prints permission-grant instructions. 
- CLI approval callback wired so destructive computer_use actions go through the same prompt_toolkit approval dialog as terminal commands. - Hard safety guards at the tool level: blocked type patterns (curl\|bash, sudo rm -rf, fork bomb), blocked key combos (empty trash, force delete, lock screen, log out). - Skill `apple/macos-computer-use/SKILL.md` — universal (model-agnostic) workflow guide. - Docs: `user-guide/features/computer-use.md` plus reference catalog entries. 44 new tests in tests/tools/test_computer_use.py covering schema shape (universal, not Anthropic-native), dispatch routing, safety guards, multimodal envelope, Anthropic adapter conversion, screenshot eviction, context compressor pruning, image-aware token estimation, run_agent helpers, and universality guarantees. 469/469 pass across tests/tools/test_computer_use.py + the affected agent/ test suites. - `model_tools.py` provider-gating: the tool is available to every provider. Providers without multi-part tool message support will see text-only tool results (graceful degradation via `text_summary`). - Anthropic server-side `clear_tool_uses_20250919` — deferred; client-side eviction + compressor pruning cover the same cost ceiling without a beta header. - macOS only. cua-driver uses private SkyLight SPIs (SLEventPostToPid, SLPSPostEventRecordTo, _AXObserverAddNotificationAndCheckRemote) that can break on any macOS update. Pin with HERMES_CUA_DRIVER_VERSION. - Requires Accessibility + Screen Recording permissions — the post-setup prints the Settings path. Supersedes PR #4562 (pyautogui/Quartz foreground backend, Anthropic- native schema). Credit @0xbyt4 for the original #3816 groundwork whose context/eviction/token design is preserved here in generic form.	2026-04-28 01:46:36 -07:00
kshitijk4poor	42cc905c13	feat(plugins): add bundled observability/langfuse plugin Opt-in Langfuse tracing for Hermes conversations — LLM calls, tool usage, usage/cost breakdown per span. Hooks into pre/post_api_request, pre/post_llm_call, pre/post_tool_call. SDK is optional; missing SDK or credentials renders the plugin inert. Salvaged from PR #16845 by @kshitijk4poor, who wrote the plugin (~875 LOC, 6 hooks, Langfuse usage-details/cost-details normalization, read_file payload summarization). Salvage scope (why this isn't PR #16845 as-authored): - Lives at plugins/observability/langfuse/ (standalone kind, opt-in via plugins.enabled) instead of a new parallel optional-plugins/ directory. Standalone bundled plugins are already opt-in — only their plugin.yaml is scanned at startup; the Python module is not imported unless the user enables it. The premise of optional-plugins/ (avoid import cost for users who don't want it) is already solved by the existing plugin system. - Dropped the triple activation gate (plugins.enabled + plugins.langfuse.enabled + HERMES_LANGFUSE_ENABLED). The Hermes plugin system's own enable/disable is authoritative; runtime credentials gate whether the hook actually traces. - Rewrote _is_enabled() → cached _get_langfuse() with an _INIT_FAILED sentinel. The original called hermes_cli.config.load_config() from every hook invocation (full yaml parse + deep merge + env expansion on every pre/post_tool_call, potentially 100+ times per turn). The cached version reads env once and returns the cached client or None on every subsequent call with zero further work. - hermes tools → Langfuse Observability post-setup adds observability/langfuse to plugins.enabled directly (via _save_enabled_set) instead of going through an install-copy flow. 
Enable: hermes tools # interactive hermes plugins enable observability/langfuse # manual Required env (set by `hermes tools` or in ~/.hermes/.env): HERMES_LANGFUSE_PUBLIC_KEY HERMES_LANGFUSE_SECRET_KEY HERMES_LANGFUSE_BASE_URL # optional Co-authored-by: kshitijk4poor <kshitijk4poor@gmail.com>	2026-04-28 01:40:59 -07:00
Surat Srichan	4d3e3ff8a2	fix(gateway): coerce plaintext "restart gateway" DMs to /restart Narrow plaintext shortcut that rewrites a tiny set of admin phrases ("restart gateway", "restart the gateway", "restart hermes") into the /restart slash command, but only in DMs. Scope is intentionally tight: - DM text messages only — group chats keep natural-language semantics - Exact restart-style phrases only - Skips anything already starting with "/" Without this, the LLM can receive "restart gateway" as a user turn and try to satisfy it via the terminal tool (systemctl restart ...). That kills the gateway while the originating agent is still running, which leaves systemd in "draining" state waiting on a process it's about to kill. Routing the phrase to the slash-command dispatcher bypasses the agent loop and uses the existing restart machinery (request_restart). Called once, at the adapter level in BasePlatformAdapter.handle_message, so every platform gets it for free and pending-message reinjection is covered by the same call site. Adds 2 Telegram-parametrized e2e tests: DM routes to request_restart, group chats fall through to the normal agent path.	2026-04-28 01:40:28 -07:00
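A minimal sketch of the plaintext shortcut in 4d3e3ff8a2. The phrase set comes from the commit message; the function name `coerce_restart` and the case and whitespace normalization are assumptions, since the message only says "exact restart-style phrases".

```python
RESTART_PHRASES = {"restart gateway", "restart the gateway", "restart hermes"}


def coerce_restart(text, is_dm):
    """Rewrite an exact restart phrase to /restart, but only in DMs."""
    # Group chats keep natural-language semantics; slash commands pass through.
    if not is_dm or text.startswith("/"):
        return text
    # Assumption: matching ignores case and collapses runs of whitespace.
    normalized = " ".join(text.lower().split())
    return "/restart" if normalized in RESTART_PHRASES else text
```

Because the match is on the whole normalized message, a sentence that merely contains one of the phrases still reaches the agent unchanged.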
Surat Srichan	a8f9c56cb4	fix(config): accept fallback_model list (chain) in validator + save Runtime already supports list-form fallback_model (run_agent.py:1459 iterates fallback_chain; fallback_cmd.py migrates legacy single-dict configs to list format). The config validator and save_config comment gate still assumed single-dict form and flagged list-form configs as errors. Fix both: - validate_config_structure: when fallback_model is a list, validate each entry has provider+model; keep the existing single-dict path. - save_config: suppress the "add fallback_model" comment when any list entry is well-formed. Adds 4 list-form validator tests.	2026-04-28 01:40:25 -07:00
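The validator change in a8f9c56cb4 amounts to accepting either shape of `fallback_model`. The sketch below assumes a hypothetical `validate_fallback_model` that returns a list of error strings; the real `validate_config_structure` may report problems differently.

```python
def validate_fallback_model(value):
    """Accept a single {provider, model} dict or a list (chain) of them."""
    errors = []
    # Normalize the legacy single-dict form to a one-element chain.
    entries = value if isinstance(value, list) else [value]
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            errors.append(f"fallback_model[{i}] must be a mapping")
            continue
        for key in ("provider", "model"):
            if not entry.get(key):
                errors.append(f"fallback_model[{i}] is missing '{key}'")
    return errors
```

For example, `validate_fallback_model([{"provider": "openrouter"}])` reports one missing `model`, while a well-formed chain returns an empty list.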
Teknium	0edcc57d9a	fix(acp): wire HERMES_SESSION_KEY per session so sudo cache scope activates PR #16858's session-scoped interactive sudo password cache falls back to a thread-identity scope when no HERMES_SESSION_KEY is bound. ACP never set that contextvar, so two ACP sessions landing on the same reused ThreadPoolExecutor thread still shared the cache — the exact scenario the PR headlined. acp_adapter/server.py now: - binds HERMES_SESSION_KEY=<session_id> via gateway.session_context inside _run_agent() (and clears on exit) - wraps the loop.run_in_executor(_executor, _run_agent) call in a fresh contextvars.copy_context() so concurrent ACP sessions don't stomp on each other's ContextVar writes (executor pool threads would otherwise share a context). Adds tests/acp/test_approval_isolation.py:: test_sudo_password_cache_isolated_across_acp_sessions_on_same_pool_thread which drives two back-to-back sessions through a 1-worker ThreadPoolExecutor and asserts B does not observe A's cached password.	2026-04-28 01:34:16 -07:00
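The ContextVar leak that 0edcc57d9a guards against can be reproduced in a few lines. This sketch uses `ThreadPoolExecutor.submit` directly rather than `loop.run_in_executor`, and `SESSION_KEY`, `set_key`, and `read_key` are hypothetical names standing in for the real session-key plumbing.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

SESSION_KEY = contextvars.ContextVar("session_key", default=None)


def set_key(session_id):
    SESSION_KEY.set(session_id)
    return SESSION_KEY.get()


def read_key():
    return SESSION_KEY.get()


def leaky_pair():
    """Two tasks on one reused worker thread: the second sees the first's value."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(set_key, "session-A").result()
        return pool.submit(read_key).result()


def isolated_pair():
    """Each task runs inside its own copied context, so writes do not persist."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(contextvars.copy_context().run, set_key, "session-A").result()
        return pool.submit(contextvars.copy_context().run, read_key).result()
```

A pool worker keeps one context for its whole lifetime, so a bare `set()` outlives the task that made it. Running each task through `copy_context().run` confines the write to that task's copy.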
hharry11	de03a332f7	fix(security): isolate interactive sudo password cache per session	2026-04-28 01:34:16 -07:00
Teknium	8d76d69d48	fix(state): repair FTS5 delete trigger and add v11 migration for tool-call indexing Follow-up on top of the cherry-picked contributor commit for #16751: 1. Delete triggers: the original PR switched FTS5 from external to inline content mode and concatenated content \|\| tool_name \|\| tool_calls in the insert/update triggers, but left the delete triggers passing old.content to the FTS5 delete-command. FTS5 inline delete requires the content to match what was stored, so every DELETE on messages raised 'SQL logic error'. Replaced with plain DELETE FROM ... WHERE rowid = old.id on all four delete paths (normal + trigram, delete + update-delete). 2. v11 migration: existing DBs have the old external-content FTS tables and triggers. Because CREATE VIRTUAL TABLE IF NOT EXISTS / CREATE TRIGGER IF NOT EXISTS skip when the objects already exist, upgraders would have kept the broken behavior forever. Bumped SCHEMA_VERSION to 11 and added a migration that drops both FTS tables + all 6 old triggers, recreates them via FTS_SQL / FTS_TRIGRAM_SQL, and backfills from messages using the same concatenation expression. 3. Regression tests: 6 new tests cover INSERT / UPDATE / DELETE paths for tool_name + tool_calls indexing plus the full v10 -> v11 upgrade path on a hand-built legacy DB.	2026-04-28 01:33:00 -07:00
crayfish-ai	abefd89059	fix(search): quote underscored terms in FTS5 query sanitization FTS5 default tokenizer splits 'sp_new1' into tokens 'sp' and 'new1'. Without quoting, a search for 'sp_new' becomes an AND query ('sp AND new') that fails to match rows indexed as 'sp_new1'. Fix: add underscore to the character class in Step 5 regex ([.-] -> [._-]) so underscored terms are wrapped in double quotes. Also adds test_sanitize_fts5_quotes_underscored_terms.	2026-04-28 01:31:40 -07:00
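The quoting step in abefd89059 can be approximated with a single regex. The pattern below is an illustrative reconstruction, not the sanitizer's actual "Step 5" expression: it wraps terms joined by `.`, `_`, or `-` in double quotes so FTS5 treats them as one phrase instead of an implicit AND of the split tokens.

```python
import re

# Assumed pattern: alphanumeric runs joined by '.', '_' or '-', not already quoted.
_COMPOUND_TERM = re.compile(r'(?<!")\b[A-Za-z0-9]+(?:[._-][A-Za-z0-9]+)+\b(?!")')


def quote_compound_terms(query):
    """Wrap dotted, underscored, or hyphenated terms in double quotes."""
    return _COMPOUND_TERM.sub(lambda m: f'"{m.group(0)}"', query)
```

With the underscore in the character class, `sp_new1` is sent to FTS5 as the phrase `"sp_new1"` rather than as separate `sp` and `new1` tokens.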
vominh1919	0169c51820	fix(config): add request_timeout_seconds and stale_timeout_seconds to provider _KNOWN_KEYS Both keys are documented in cli-config.yaml.example and read at runtime by hermes_cli/timeouts.py (get_provider_request_timeout and get_provider_stale_timeout), but the provider-entry validator in config.py flagged them as unknown, producing noisy warnings on every CLI invocation for users who followed the documented config. Fixes #16779	2026-04-28 01:28:25 -07:00
Hafiy Zakaria	40bd6d4709	fix: honor agent.disabled_toolsets in gateway sessions Previously, agent.disabled_toolsets in config.yaml only worked for CLI mode (run_agent.py --disabled_toolsets). The gateway always passed enabled_toolsets to AIAgent, and get_tool_definitions() ignored disabled_toolsets when enabled_toolsets was set. Fix: _get_platform_tools() now reads agent.disabled_toolsets from config and excludes those toolsets from the returned set. This runs last so it overrides everything above. Added 3 tests covering cross-platform suppression, explicit platform config override, and empty/missing config no-op behavior.	2026-04-28 01:23:16 -07:00
Teknium	d63abbc329	fix(agent): persist streamed reasoning_content on assistant turns (#16844) (#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. 
Credit @Sanjays2402 for the original diagnosis and patch in #16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs #16844, #16884, #15250, #15353, #15748.	2026-04-28 01:19:18 -07:00
briandevans	66a05e44d6	fix(copilot): require successful exchange when walking credential_pool catalog tokens Address Copilot review on #16868: 1. Tighten pool iteration. ``validate_copilot_token`` only rejects empty strings and classic PATs (``ghp_``); a malformed/unsupported ``gho_`` token at ``credential_pool.copilot[0]`` would pass the gate and short- circuit the loop, hiding a later valid entry. Switch to calling ``exchange_copilot_token`` directly: only entries that actually exchange into a live Copilot API token are returned. Bad/expired entries fall through to the next, and an exhausted pool returns ``""`` so the picker falls back to the curated list (existing behaviour). 2. Reword the docstring + test module docstring to describe the pool seed path accurately — ``hermes auth add copilot`` adds an api-key-typed credential whose ``access_token`` field stores the pasted token, and ``_seed_from_env`` mirrors ``COPILOT_GITHUB_TOKEN`` from ``~/.hermes/.env`` into the pool. The previous wording implied ``auth add copilot`` itself ran the device-code flow, which it does not (the device-code flow lives in ``hermes model``). Two new tests cover the iteration change: - ``test_skips_pool_entry_that_fails_to_exchange`` — pool[0] raises, pool[1] succeeds, picker uses pool[1]. - ``test_all_pool_entries_fail_exchange_returns_empty`` — every entry raises, return ``""``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 01:18:09 -07:00
briandevans	fdfe40a48b	fix(copilot): fall back to credential_pool OAuth access_token for /model picker (#16708) Users whose only Copilot credential is the OAuth `access_token` saved by `hermes auth add copilot` (device-code flow) saw the `/model` picker drop back to a stale hardcoded list. Reason: `_resolve_copilot_catalog_api_key` only consulted env vars (`COPILOT_GITHUB_TOKEN` / `GH_TOKEN` / `GITHUB_TOKEN`) and the `gh auth token` CLI fallback, never the credential pool that Hermes's own login flow writes into `auth.json`. With no token, the live catalog fetch silently 401s and the picker hides current models (claude-opus-4.7, claude-sonnet-4.6, gpt-5.5, grok-code-fast-1) — even though `/model <id>` works fine because runtime inference reads the pool through a different code path. Mirror the Codex catalog resolver pattern: env-var first (unchanged), then walk `read_credential_pool("copilot")` for the first entry with a supported `access_token` (`gho_` / `github_pat_` / `ghu_`). Run it through `get_copilot_api_token()` so the catalog request uses the same exchanged token the runtime path uses. Classic PATs (`ghp_`) are still rejected up-front via `validate_copilot_token` since the Copilot API doesn't accept them. Strictly additive: env still wins, and a missing/locked auth.json (or any exception during pool read) still returns "" so the caller falls through to the curated catalog. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 01:18:09 -07:00
Teknium	9e4d79b17f	fix(tui): `/model` writes HERMES_TUI_PROVIDER unconditionally (#16857) (#16897) `/new` after `/model <custom-provider>:<model>` silently reverted to a native provider whose static catalog happened to contain the same model name (e.g. `deepseek-v4-pro` → native `deepseek` → 401). Root cause at the `/model` writeback site: `HERMES_INFERENCE_PROVIDER` was set unconditionally but `HERMES_TUI_PROVIDER` was only mirrored when it was already set. On sessions launched without `--provider`, `HERMES_TUI_PROVIDER` stayed unset, so `_resolve_startup_runtime()` on `/new` skipped the explicit-provider early return and fell through to `detect_static_provider_for_model()`. Fix: set `HERMES_TUI_PROVIDER` unconditionally alongside `HERMES_INFERENCE_PROVIDER` when `/model` lands. Keeps #15755's invariant intact — `HERMES_TUI_PROVIDER` remains the canonical "explicit this process" carrier, `HERMES_INFERENCE_PROVIDER` remains ambient and does not short-circuit startup resolution. Bug report and diagnosis: @Bartok9 in #16857 / #16873. Fixes #16857	2026-04-28 01:17:04 -07:00
Teknium	9048fd020f	fix(cli): tighten stale-dashboard match to explicit patterns Replace the Linux/macOS pgrep regex ("hermes.*dashboard") with a ps scan + the same explicit patterns list already used on the Windows branch and in hermes_cli.gateway._scan_gateway_pids: "hermes dashboard", "hermes_cli.main dashboard", and "hermes_cli/main.py dashboard". The old greedy regex would match any cmdline containing both words — e.g. a chat session whose argv mentions "dashboard" or an unrelated grafana/dashboard-server process. Added regression tests for both. Follow-up tightening on #16881.	2026-04-28 01:14:44 -07:00
Societus	66b1142384	fix(cli): warn about stale dashboard processes after hermes update The dashboard is a long-lived server process users start and forget. When hermes update replaces files on disk, the running process holds the old Python backend in memory while the JS bundle gets updated, producing a silent frontend/backend mismatch (e.g. v0.11.0 changed the session token header -- old backends reject every API call). Scan for running dashboard processes after a successful update (both git and ZIP paths) and print a warning with their PIDs and restart instructions. Mirrors the existing pattern for gateway processes. Fixes #16872	2026-04-28 01:14:44 -07:00
Teknium	54e24f7758	test(runtime_provider): lock in model-derivation precedence over stale api_mode PR #16888 swaps the opencode-zen/go resolver so that api_mode is always re-derived from the effective model before the persisted api_mode is consulted. That's the point of the fix — a stale anthropic_messages from a previous minimax default must not survive a /model switch to a chat_completions target (or vice versa) and strip /v1 from base_url. The prior test asserted the opposite precedence — that a persisted api_mode won over model-derived mode — and was added in #4508 to lock in escape-hatch behavior. Under the new precedence that escape hatch no longer exists for opencode (only for providers that genuinely support both modes at a single endpoint — and for opencode the model name is the unambiguous signal). Rename + invert the assertion to document the intentional behavior change. Refs #16878.	2026-04-28 01:14:35 -07:00
Teknium	8269f9056c	feat(fast): broaden /fast whitelist to all OpenAI + Anthropic models (#16883) Switch _PRIORITY_PROCESSING_MODELS and _ANTHROPIC_FAST_MODE_MODELS from hardcoded frozensets to prefix-based matching. Any gpt-, o1, o3, o4 (OpenAI) and any claude-* (Anthropic) now exposes /fast. Fixes the case where gpt-5.5 and other post-catalog models silently skipped Priority Processing because they weren't in the frozenset. Future OpenAI/Anthropic releases will work without a catalog bump. Safety: - Codex-series (codex) still excluded — they route through the Codex Responses API which doesn't take service_tier. - Anthropic adapter already gates speed=fast on native endpoints only (_is_third_party_anthropic_endpoint), so claude-sonnet-4.6 on OpenRouter/Bedrock/opencode-zen won't leak the unknown beta. - service_tier=priority is silently dropped by non-OpenAI proxies, so false positives are harmless.	2026-04-28 00:44:43 -07:00
helix4u	6ce796b495	fix(cron): preserve Telegram topic targets	2026-04-28 00:44:12 -07:00
in-liberty420	2dfd73a497	fix(migration): resolve workspace files from agents.defaults.workspace OpenClaw users who started before the rebrand (when the project was clawd/clawdbot) often have a custom workspace directory configured via agents.defaults.workspace in openclaw.json (e.g. ~/clawd/ instead of ~/.openclaw/workspace/). The migration tool only checked hardcoded relative paths (workspace/, workspace-main/, workspace-assistant/) inside the source root, so files like MEMORY.md, skills, and daily memory in custom workspaces were silently skipped. This change: - Reads agents.defaults.workspace from openclaw.json at init time - Uses it as a final fallback in source_candidate() when files aren't found in the standard locations - Standard workspace paths are still preferred (custom is fallback only) - Custom workspace is only used when it's outside the source_root tree (avoids double-matching when workspace/ is the default) Adds two tests: - Custom workspace files are discovered and migrated - Standard workspace location is preferred over custom	2026-04-28 00:39:58 -07:00
Teknium	8081425a1c	feat(security): make secret redaction off by default (#16794) Flips security.redact_secrets from true to false in DEFAULT_CONFIG, and the HERMES_REDACT_SECRETS env-var fallback in agent/redact.py now requires explicit opt-in ("1"/"true"/"yes"/"on") to enable. New installs and users without a security.redact_secrets key get pass-through tool output. Existing users whose config.yaml explicitly sets redact_secrets: true keep redaction on — the config-yaml -> env-var bridges in hermes_cli/main.py and gateway/run.py still honor their setting. Also updates the inline config comments, website docs, and the hermes-agent skill so /hermes config set security.redact_secrets true is now the documented way to turn it on.	2026-04-27 21:24:08 -07:00
Teknium	3d67364b8f	test(matrix): set user_id in approval-reaction test to bypass defensive self-drop MatrixAdapter._is_self_sender returns True defensively when _user_id is empty (whoami not yet resolved) to prevent echo loops — see #15763. The reaction approval test must therefore initialize a user_id so _on_reaction does not drop the inbound test event before reaching the approval handler.	2026-04-27 21:22:44 -07:00
nbot	38a6bada92	feat(matrix): reaction-based exec approval + mention_user_id Add Matrix reaction-based exec approval (✅/❎) and mention_user_id support for push notifications in muted rooms. - matrix.py: _MatrixApprovalPrompt, send_exec_approval, reaction approval handling, bot seed reaction redaction, mention pill in send - base.py: inject mention_user_id into send metadata - run.py: inject mention_user_id into status thread metadata - tests for approval prompt registration and reaction resolution	2026-04-27 21:22:44 -07:00
Andrew Miller	6c70ac8eef	matrix: e2e test for cross-signing auto-bootstrap Self-contained docker-compose harness that exercises the new bootstrap branch against a real Continuwuity homeserver. Three tests: 1. fresh bot → bootstrap fires, /keys/query returns master + ssk with UNPADDED base64 keyids, current device is signed by the new SSK 2. second startup with same crypto store → bootstrap is skipped 3. MATRIX_RECOVERY_KEY set → existing verify_with_recovery_key path takes precedence, no new bootstrap Run via: `docker compose -f tests/e2e/matrix_xsign_bootstrap/docker-compose.yml up -d`, then `python tests/e2e/matrix_xsign_bootstrap/test_bootstrap.py`, then `docker compose -f tests/e2e/matrix_xsign_bootstrap/docker-compose.yml down -v`. The test mirrors the bootstrap snippet from matrix.py inline so it can run without importing the full hermes gateway and its deps. Skipped automatically when mautrix isn't installed or the homeserver is unreachable. All three pass against ghcr.io/continuwuity/continuwuity:latest (Continuwuity 0.5.7). The unpadded-keyid assertion is the load-bearing one — it's exactly the property the PR's bootstrap path provides that the hand-rolled `base64.b64encode().decode()` scripts get wrong.	2026-04-27 21:22:44 -07:00
konsisumer	32d4048c6b	fix: MatrixAdapter respects proxy configuration	2026-04-27 21:22:44 -07:00
Adam Rummer	1eab5960f0	feat(matrix): add dm_auto_thread config for DM auto-threading Adds MATRIX_DM_AUTO_THREAD env var (default: false) to control auto-threading in DM rooms independently from channel auto-threading. Closes #15398	2026-04-27 21:22:44 -07:00
LeonSGP43	74a4832b74	fix(matrix): normalize image-only filenames	2026-04-27 21:22:44 -07:00
Charles Brooks	57f8cf00e9	fix(matrix): reconcile pending invites from sync state	2026-04-27 21:22:44 -07:00
Teknium	6649e7e746	test(matrix): adapt outbound-mention notice test to current _send_simple_message API	2026-04-27 21:22:44 -07:00
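
Note on abefd89059 (FTS5 query sanitization): the message describes widening a character class from `[.-]` to `[._-]` so that underscored terms are wrapped in double quotes before reaching FTS5. The sketch below illustrates that idea only; the function name `quote_compound_terms` and the token-by-token approach are assumptions made for this example, not the project's actual sanitizer.

```python
import re

# Hypothetical pattern: word characters joined by '.', '_' or '-'.
# The '_' in the separator class is the piece the commit message describes adding.
_COMPOUND = re.compile(r"\w+(?:[._-]\w+)+")


def quote_compound_terms(query: str) -> str:
    """Wrap bare compound tokens in double quotes; leave other tokens alone."""
    out = []
    for token in query.split():
        if not token.startswith('"') and _COMPOUND.fullmatch(token):
            # Quoting makes FTS5 treat the token as a phrase rather than
            # as separate terms combined with an implicit AND.
            out.append(f'"{token}"')
        else:
            out.append(token)
    return " ".join(out)
```

For instance, `quote_compound_terms("error sp_new1")` leaves `error` untouched and wraps `sp_new1` in quotes; tokens that already start with a double quote are passed through unchanged.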


2766 commits
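
Note on 0edcc57d9a (ACP session isolation): the message describes wrapping the `loop.run_in_executor(...)` call in a fresh `contextvars.copy_context()` so that ContextVar writes made on a reused pool thread do not carry over to the next session. The following is a minimal standalone sketch of that mechanism using only the standard library. `SESSION_KEY`, `bind_session`, and `read_session` are hypothetical stand-ins, not names from `acp_adapter/server.py`.

```python
import asyncio
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the session-key context variable.
SESSION_KEY = contextvars.ContextVar("SESSION_KEY", default=None)


def bind_session(session_id):
    # Deliberately never reset, to make any leak visible.
    SESSION_KEY.set(session_id)
    return SESSION_KEY.get()


def read_session():
    return SESSION_KEY.get()


async def demo():
    loop = asyncio.get_running_loop()
    # One worker, so every call lands on the same reused thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Isolated: the write happens inside a copied context and is
        # discarded when ctx.run() returns.
        ctx = contextvars.copy_context()
        await loop.run_in_executor(pool, ctx.run, bind_session, "A")
        after_isolated = await loop.run_in_executor(pool, read_session)

        # Not isolated: the write lands in the worker thread's own context
        # and is still there for the next job on that thread.
        await loop.run_in_executor(pool, bind_session, "B")
        after_shared = await loop.run_in_executor(pool, read_session)
    return after_isolated, after_shared
```

Running `asyncio.run(demo())` should show the first read unaffected by session "A", while the second read still observes "B" left behind on the shared thread.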