hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Author	SHA1	Message	Date
emozilla	1cd2b280fd	Merge remote-tracking branch 'origin/main' into feat/dashboard-chat	2026-04-22 21:42:14 -04:00
Teknium	c78a188ddd	refactor: invalidate transport cache when api_mode auto-upgrades to codex_responses Follow-up for #13862 — the post-init api_mode upgrade at __init__ (direct OpenAI / gpt-5-requires-responses path) runs AFTER the eager transport warm. Clear the cache so the stale chat_completions entry is evicted. Cosmetic: correctness was already fine since _get_transport() keys by current api_mode, but this avoids leaving unused cache state behind.	2026-04-22 18:34:25 -07:00
kshitijk4poor	d30ee2e545	refactor: unify transport dispatch + collapse normalize shims Consolidate 4 per-transport lazy singleton helpers (_get_anthropic_transport, _get_codex_transport, _get_chat_completions_transport, _get_bedrock_transport) into one generic _get_transport(api_mode) with a shared dict cache. Collapse the 65-line main normalize block (3 api_mode branches, each with its own SimpleNamespace shim) into 7 lines: one _get_transport() call + one _nr_to_assistant_message() shared shim. The shim extracts provider_data fields (codex_reasoning_items, reasoning_details, call_id, response_item_id) into the SimpleNamespace shape downstream code expects. Wire chat_completions and bedrock_converse normalize through their transports for the first time — these were previously falling into the raw response.choices[0].message else branch. Remove 8 dead codex adapter imports that have zero callers after PRs 1-6. Transport lifecycle improvements: - Eagerly warm transport cache at __init__ (surfaces import errors early) - Invalidate transport cache on api_mode change (switch_model, fallback activation, fallback restore, transport recovery) — prevents stale transport after mid-session provider switch run_agent.py: -32 net lines (11,988 -> 11,956). PR 7 of the provider transport refactor.	2026-04-22 18:34:25 -07:00
Teknium	36730b90c4	fix(gateway): also clear session-scoped approval state on /new Follow-up to the /resume and /branch cleanup in the previous commit: /new is a conversation-boundary operation too, so session-scoped dangerous-command approvals and /yolo state must not survive it. Adds a scoped unit test for _clear_session_boundary_security_state that also covers the /new path (which calls the same helper).	2026-04-22 18:26:59 -07:00
Es1la	050aabe2d4	fix(gateway): reset approval and yolo state on session boundary	2026-04-22 18:26:59 -07:00
Teknium	64c38cc4d0	chore(release): map shushuzn in AUTHOR_MAP	2026-04-22 18:17:37 -07:00
shushuzn	fa2dbd1bb5	fix: use utf-8 encoding when reading .env file in load_env() On Windows, Path.open() defaults to the system ANSI code page (for example cp1252, or gbk on Chinese-locale systems). If the .env file contains non-ASCII UTF-8 characters, decoding can fail with errors such as 'gbk codec can't decode byte 0x94'. Specify encoding='utf-8' explicitly to ensure consistent behavior across platforms.	2026-04-22 18:17:37 -07:00
Teknium	6ad2fab8cf	chore(release): map Dev-Mriganka in AUTHOR_MAP	2026-04-22 18:16:49 -07:00
Dev-Mriganka	a14fb3ab1a	fix(cli): guard fallback_model list format in save_config_value When a user manually sets fallback_model as a YAML list instead of a dict, save_config_value() crashes with: AttributeError: 'list' object has no attribute 'get' at the fb.get('provider') call on hermes_cli/config.py. The fix adds isinstance(fb, dict) so list-format values are treated as unconfigured — the fallback_model comment block is appended to guide correct usage — instead of crashing. Fixes #4091 Co-authored-by: [AI-assisted — Claude Sonnet 4.6 via Milo/Hermes]	2026-04-22 18:16:49 -07:00
Teknium	2c26a80848	chore(release): map projectadmin-dev in AUTHOR_MAP	2026-04-22 18:16:08 -07:00
projectadmin-dev	d67d12b5df	Update whatsapp-bridge package-lock.json	2026-04-22 18:16:08 -07:00
Teknium	86510477f3	chore(release): map NIDNASSER-Abdelmajid in AUTHOR_MAP	2026-04-22 18:15:27 -07:00
Abdelmajid NIDNASSER	ce4214ec94	Normalize claw workspace paths for Windows	2026-04-22 18:15:27 -07:00
Teknium	50387d718e	chore(release): map haimu0x in AUTHOR_MAP	2026-04-22 18:14:49 -07:00
haimu0x	aa75d0a90b	fix(web): remove duplicate skill count in dashboard badge (#12372) skillCount i18n already embeds {count}; the badge also prefixed activeSkills.length, showing duplicated numbers.	2026-04-22 18:14:49 -07:00
Teknium	159061836e	chore(release): map @akhater's Azure VM commit email in AUTHOR_MAP Commits in PRs #13346 and #13349 were authored as Cos_Admin@PTG-COS.lodluvup4uaudnm3ycd14giyug.xx.internal.cloudapp.net (Azure VM default hostname-based identity). Mapping to akhater so check-attribution passes and release notes credit correctly.	2026-04-22 18:13:14 -07:00
Ubuntu	d70f0f1dc0	fix(docker): allow entrypoint to pass-through non-hermes commands Commit `8254b820` ("--init for zombie reaping + sleep infinity for idle-based lifetime") made the Docker terminal backend launch sandbox containers with `sleep infinity` as the command, so the lifetime is controlled by an external idle reaper instead of a fixed timeout. But `docker/entrypoint.sh` unconditionally wraps its args with `hermes`: exec hermes "$@" Result: `hermes sleep infinity` → argparse rejects `sleep` as a subcommand and the container exits immediately with code 2: hermes: error: argument command: invalid choice: 'sleep' (choose from chat, model, gateway, setup, ...) Every sandbox container launched by the docker backend dies at startup, breaking terminal/file tool execution end-to-end. Fix: dispatch at the tail of the entrypoint. If the first arg is an executable on PATH (sleep, bash, sh, etc.) run it raw; otherwise preserve the legacy `hermes <subcommand>` wrapping behavior. Both invocation styles below keep working: docker run <image> -> hermes (interactive) docker run <image> chat -q "hi" -> hermes chat -q "hi" docker run <image> sleep infinity -> sleep infinity docker run <image> bash -> bash Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 18:13:14 -07:00
Ubuntu	a3014a4481	fix(docker): add SETUID/SETGID caps so gosu drop in entrypoint succeeds The Docker terminal backend runs containers with `--cap-drop ALL` and re-adds only DAC_OVERRIDE, CHOWN, FOWNER. Since commit `fee0e0d3` ("run as non-root user, use virtualenv") the image entrypoint drops from root to the `hermes` user via `gosu`, which requires CAP_SETUID and CAP_SETGID. Without them every sandbox container exits immediately with: Dropping root privileges error: failed switching to 'hermes': operation not permitted Breaking every terminal/file tool invocation in `terminal.backend: docker` mode. Fix: add SETUID and SETGID to the cap-add list. The `no-new-privileges` security-opt is kept, so gosu still cannot escalate back to root after the one-way drop — the hardening posture is preserved. Reproduction ------------ With any image whose ENTRYPOINT calls `gosu <user>`, the container exits immediately under the pre-fix cap set. Post-fix, the drop succeeds and the container proceeds normally. docker run --rm \ --cap-drop ALL \ --cap-add DAC_OVERRIDE --cap-add CHOWN --cap-add FOWNER \ --security-opt no-new-privileges \ --entrypoint /usr/local/bin/gosu \ hermes-claude:latest hermes id # -> error: failed switching to 'hermes': operation not permitted # Same command with SETUID+SETGID added: # -> uid=10000(hermes) gid=10000(hermes) groups=10000(hermes) Tests ----- Added `test_security_args_include_setuid_setgid_for_gosu_drop` that asserts both caps are present and the overall hardening posture (cap-drop ALL + no-new-privileges) is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 18:13:14 -07:00
Teknium	c345ec9a63	fix(display): strip standalone tool-call XML tags from visible text Port from openclaw/openclaw#67318. Some open models (notably Gemma variants served via OpenRouter) emit tool calls as XML blocks inside assistant content instead of via the structured tool_calls field: <function name="read_file"><parameter name="path">/tmp/x</parameter></function> <tool_call>{"name":"x"}</tool_call> <function_calls>[{...}]</function_calls> Left unstripped, this raw XML leaked to gateway users (Discord, Telegram, Matrix, Feishu, Signal, WhatsApp, etc.) and the CLI, since hermes-agent's existing reasoning-tag stripper handled only <think>/<thinking>/<thought> variants. Extend _strip_think_blocks (run_agent.py) and _strip_reasoning_tags (cli.py) to cover: * <tool_call>, <tool_calls>, <tool_result> * <function_call>, <function_calls> * <function name="..."> ... </function> (Gemma-style) The <function> variant is boundary-gated (only strips when the tag sits at start-of-line or after sentence punctuation AND carries a name="..." attribute) so prose mentions like 'Use <function> declarations in JS' are preserved. Dangling <function name="..."> with no close is intentionally left visible — matches OpenClaw's asymmetry so a truncated streaming tail still reaches the user. Tests: 9 new cases in TestStripThinkBlocks (run_agent) + 9 in new file tests/run_agent/test_strip_reasoning_tags_cli.py. Covers Qwen-style <tool_call>, Gemma-style <function name="...">, multi-line payloads, prose preservation, stray close tags, dangling open tags, and mixed reasoning+tool_call content. Note: this port covers the post-streaming final-text path, which is what gateway adapters and CLI display consume. Extending the per-delta stream filter in gateway/stream_consumer.py to hide these tags live as they stream is a separate follow-up; for now users may see raw XML briefly during a stream before the final cleaned text replaces it. Refs: openclaw/openclaw#67318	2026-04-22 18:12:42 -07:00
brooklyn!	64b61cc24b	Merge pull request #11887 from liftaris/fix/tui-provider-resolution fix(tui): resolve runtime provider in _make_agent	2026-04-22 20:11:21 -05:00
brooklyn!	e47537e99d	Merge pull request #14135 from helix4u/fix/tui-state-db-optional fix(tui): degrade gracefully when state.db init fails	2026-04-22 20:11:07 -05:00
Teknium	9bd1518425	fix(feishu): correct identity model docs and prefer tenant-scoped user_id Feishu's open_id is app-scoped (same user gets different open_ids per bot app), not a canonical identity. Functionally correct for single-bot mode but semantically misleading. - Add comprehensive Feishu identity model documentation to module docstring - Prefer user_id (tenant-scoped) over open_id (app-scoped) in _resolve_sender_profile when both are available - Document bot_open_id usage for @mention matching - Update user_id_alt comment in SessionSource to be platform-generic Ref: closes analysis from PR #8388 (closed as over-scoped)	2026-04-22 18:06:22 -07:00
Teknium	c9c6182839	fix(anthropic): guard max_tokens against non-positive values Port from openclaw/openclaw#66664. The build_anthropic_kwargs call site used 'max_tokens or _get_anthropic_max_output(model)', which correctly falls back when max_tokens is 0 or None (falsy) but lets negative ints (-1, -500), fractional floats (0.5, 8192.7), NaN, and infinity leak through to the Anthropic API. Anthropic rejects these with HTTP 400 ('max_tokens: must be greater than or equal to 1'), turning a local config error into a surprise mid-conversation failure. Add two resolver helpers matching OpenClaw's: _resolve_positive_anthropic_max_tokens — returns int(value) only if value is a finite positive number; excludes bools, strings, NaN, infinity, sub-one positives (floor to 0). _resolve_anthropic_messages_max_tokens — prefers a positive requested value, else falls back to the model's output ceiling; raises ValueError only if no positive budget can be resolved. The context-window clamp at the call site (max_tokens > context_length) is preserved unchanged — it handles oversized values; the new resolver handles non-positive values. These concerns are now cleanly separated. Tests: 17 new cases covering positive/zero/negative ints, fractional floats (both >1 and <1), NaN, infinity, booleans, strings, None, and integration via build_anthropic_kwargs. Refs: openclaw/openclaw#66664	2026-04-22 18:04:47 -07:00
Teknium	8152de2a84	chore(release): map sicnuyudidi in AUTHOR_MAP	2026-04-22 17:57:13 -07:00
sicnuyudidi	c03858733d	fix: pass correct arguments in summary model fallback retry _generate_summary() takes (turns_to_summarize, focus_topic) but the summary model fallback path passed (messages, summary_budget) — where 'messages' is not even in scope, causing a NameError. Fix the recursive call to pass the correct variables so the fallback to the main model actually works when the summary model is unavailable. Fixes: #10721	2026-04-22 17:57:13 -07:00
Teknium	08089738d8	chore(release): map li0near in AUTHOR_MAP	2026-04-22 17:56:14 -07:00
li0near	82cce3d26c	fix: add base_url_env_var to Anthropic ProviderConfig The Anthropic provider entry in PROVIDER_REGISTRY is the only standard API-key provider missing a base_url_env_var. This causes the credential pool to hardcode base_url to https://api.anthropic.com, ignoring ANTHROPIC_BASE_URL from the environment. When using a proxy (e.g. LiteLLM, custom gateway), subagent delegation fails with 401 because: 1. _seed_from_env() creates pool entries with the hardcoded base_url 2. On error recovery, _swap_credential() overwrites the child agent's proxy URL with the pool entry's api.anthropic.com 3. The proxy API key is sent to real Anthropic → authentication_error Adding base_url_env_var="ANTHROPIC_BASE_URL" aligns Anthropic with the 20+ other providers that already have this field set (alibaba, gemini, deepseek, xai, etc.).	2026-04-22 17:56:14 -07:00
Teknium	e5114298f0	chore(release): map WuTianyi123 in AUTHOR_MAP	2026-04-22 17:55:23 -07:00
WuTianyi123	4c1362884d	fix(local): respect configured cwd in init_session() LocalEnvironment._run_bash() spawned subprocess.Popen without a cwd argument, so init_session()'s pwd -P ran in the gateway process's startup directory and overwrote self.cwd. Pass cwd=self.cwd so the initial snapshot captures the user-configured working directory. Tested: - pytest tests/ -q (255 env-related tests passed) - Full suite: 13,537 passed; 70 pre-existing failures unrelated to local env	2026-04-22 17:55:23 -07:00
Teknium	9ea2d96d73	chore(release): map ms-alan in AUTHOR_MAP	2026-04-22 17:54:23 -07:00
ms-alan	8db5517b4c	fix: add /opt/data/.local/bin to PATH in Docker image (Closes #13739) Running 'hermes profile create' inside the container creates wrappers at /opt/data/.local/bin but that directory isn't on PATH by default. Add ENV PATH so wrappers are discoverable without touching shell configs.	2026-04-22 17:54:23 -07:00
Teknium	54db933667	chore(release): map longsizhuo in AUTHOR_MAP	2026-04-22 17:53:45 -07:00
Siz Long	846b9758d8	Remove Discussions link from README	2026-04-22 17:53:45 -07:00
Teknium	142202910e	chore(release): map ycbai in AUTHOR_MAP	2026-04-22 17:45:56 -07:00
ycbai	db86ed1990	fix(terminal): forward docker_forward_env and docker_env to container_config The container_config builder in terminal_tool.py was missing docker_forward_env and docker_env keys, causing config.yaml's docker_forward_env setting to be silently ignored. Environment variables listed in docker_forward_env were never injected into Docker containers. This fix adds both keys to the container_config dict so they are properly passed to _create_environment().	2026-04-22 17:45:56 -07:00
Teknium	7d8b2eee63	fix(delegate): default inherit_mcp_toolsets=true, drop version bump Follow-up on helix4u's PR #14211: - Flip default to true: narrowing toolsets=['web','browser'] expresses 'I want these extras', not 'silently strip MCP'. Parent MCP tools (registered at runtime) should survive narrowing by default. - Drop _config_version bump (22->23); additive nested key under delegation.* is handled by _deep_merge, no migration needed. - Update tests to reflect new default behavior.	2026-04-22 17:45:48 -07:00
helix4u	3e96c87f37	fix(delegate): make MCP toolset inheritance configurable	2026-04-22 17:45:48 -07:00
Teknium	98e1396b15	chore(release): map yudaiyan in AUTHOR_MAP	2026-04-22 17:45:17 -07:00
yudaiyan	96b0f37001	fix: separate browser_cdp into its own toolset browser_cdp_tool.py registers before browser_tool.py (alphabetical import order), so its stricter check_fn (requires CDP endpoint) becomes the toolset-level check for all 11 browser tools. This causes 'hermes doctor' to report the entire browser toolset as unavailable even when agent-browser is correctly installed. Move browser_cdp to toolset='browser-cdp' so it is evaluated independently. browser_navigate et al. only need agent-browser; browser_cdp additionally requires a reachable CDP endpoint.	2026-04-22 17:45:17 -07:00
Teknium	d74eaef5f9	fix(error_classifier): retry mid-stream SSL/TLS alert errors as transport Mid-stream SSL alerts (bad_record_mac, tls_alert_internal_error, handshake failures) previously fell through the classifier pipeline to the 'unknown' bucket because: - ssl.SSLError type names weren't in _TRANSPORT_ERROR_TYPES (the isinstance(OSError) catch picks up some but not all SDK-wrapped forms) - the message-pattern list had no SSL alert substrings The 'unknown' bucket is still retryable, but: (a) logs tell the user 'unknown' instead of identifying the cause, (b) it bypasses the transport-specific backoff/fallback logic, and (c) if the SSL error happens on a large session with a generic 'connection closed' wrapper, the existing disconnect-on-large-session heuristic would incorrectly trigger context compression — expensive, and never fixes a transport hiccup. Changes: - Add ssl.SSLError and its subclass type names to _TRANSPORT_ERROR_TYPES - New _SSL_TRANSIENT_PATTERNS list (separate from _SERVER_DISCONNECT_PATTERNS so SSL alerts route to timeout, not context_overflow+compress) - New step 5 in the classifier pipeline: SSL pattern check runs BEFORE the disconnect check to pre-empt the large-session-compress path Patterns cover both space-separated ('ssl alert', 'bad record mac') and underscore-separated ('ERR_SSL_SSL/TLS_ALERT_BAD_RECORD_MAC') forms. This is load-bearing because OpenSSL 3.x changed the error-code separator from underscore to slash (e.g. SSLV3_ALERT_BAD_RECORD_MAC → SSL/TLS_ALERT_BAD_RECORD_MAC) and will likely churn again — matching on stable alert reason substrings survives future format changes. Tests (8 new): - BAD_RECORD_MAC in Python ssl.c format - OpenSSL 3.x underscore format - TLSV1_ALERT_INTERNAL_ERROR - ssl handshake failure - [SSL: ...] prefix fallback - Real ssl.SSLError instance - REGRESSION GUARD: SSL on large session does NOT compress - REGRESSION GUARD: plain disconnect on large session STILL compresses	2026-04-22 17:44:50 -07:00
Teknium	b2593c8d4e	chore(release): map brianclemens in AUTHOR_MAP	2026-04-22 17:44:40 -07:00
brianclemens	4009f2edd9	feat(docker): add docker-cli to Docker image	2026-04-22 17:44:40 -07:00
Teknium	c0100dde35	chore(release): map Somme4096 in AUTHOR_MAP	2026-04-22 17:43:59 -07:00
Somme4096	5fbb69989d	fix(docker): add openssh-client for SSH terminal backend	2026-04-22 17:43:59 -07:00
Teknium	6f629a0462	chore(release): map xandersbell in AUTHOR_MAP	2026-04-22 17:43:30 -07:00
Anders Bell	02aba4a728	fix(skills): follow symlinks in iter_skill_index_files os.walk() by default does not follow symlinks, causing skills linked via symlinks to be invisible to the skill discovery system. Add followlinks=True so that symlinked skill directories are scanned.	2026-04-22 17:43:30 -07:00
Teknium	b9463e32c6	fix(usage): read top-level Anthropic cache fields from OAI-compatible proxies Port from cline/cline#10266. When OpenAI-compatible proxies (OpenRouter, Vercel AI Gateway, Cline) route Claude models, they sometimes surface the Anthropic-native cache counters (`cache_read_input_tokens`, `cache_creation_input_tokens`) at the top level of the `usage` object instead of nesting them inside `prompt_tokens_details`. Our chat-completions branch of `normalize_usage()` only read the nested `prompt_tokens_details` fields, so those responses: - reported `cache_write_tokens = 0` even when the model actually did a prompt-cache write, - reported only some of the cache-read tokens when the proxy exposed them top-level only, - overstated `input_tokens` by the missed cache-write amount, which in turn made cost estimation and the status-bar cache-hit percentage wrong for Claude traffic going through these gateways. Now the chat-completions branch tries the OpenAI-standard `prompt_tokens_details` first and falls back to the top-level Anthropic-shape fields only if the nested values are absent/zero. The Anthropic and Codex Responses branches are unchanged. Regression guards added for three shapes: top-level write + nested read, top-level-only, and both-present (nested wins).	2026-04-22 17:40:49 -07:00
Teknium	75221db967	chore(release): map vrinek in AUTHOR_MAP	2026-04-22 17:37:12 -07:00
Konstantinos Karachalios	435d86ce36	fix: use builtin cd in command wrapper to bypass shell aliases Version managers like frum (Ruby), rvm, nvm, and others commonly alias cd to a wrapper function that runs additional logic after directory changes. When Hermes captures the shell environment into a session snapshot, these aliases are preserved. If the wrapper function fails in the subprocess context (e.g. frum not on PATH), every cd fails, causing all terminal commands to exit with code 126. Using builtin cd bypasses any aliases or functions, ensuring the directory change always uses the real bash builtin regardless of what version managers are installed.	2026-04-22 17:37:12 -07:00
Teknium	3e95963bde	chore(release): map niyoh120 in AUTHOR_MAP	2026-04-22 17:36:33 -07:00
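
The max_tokens guard described in commit c9c6182839 can be sketched roughly as follows. This is a minimal illustration based only on the commit message, not the project's actual code: the function names below are hypothetical stand-ins for the two resolver helpers it mentions, and the model ceiling is passed in as a plain argument rather than looked up per model.

```python
import math
from typing import Any, Optional


def resolve_positive_max_tokens(value: Any) -> Optional[int]:
    """Return value floored to an int if it is a finite number >= 1, else None.

    Hypothetical stand-in for the first helper named in the commit message.
    """
    # bool is a subclass of int, so it has to be rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    # NaN and +/- infinity are not usable budgets.
    if isinstance(value, float) and not math.isfinite(value):
        return None
    floored = math.floor(value)
    # Sub-one positives (e.g. 0.5) floor to 0 and are rejected with the negatives.
    return floored if floored >= 1 else None


def resolve_messages_max_tokens(requested: Any, model_ceiling: Any) -> int:
    """Prefer a positive requested value, else fall back to the model ceiling.

    Hypothetical stand-in for the second helper; model_ceiling is an assumed
    parameter standing in for the per-model output limit lookup.
    """
    for candidate in (requested, model_ceiling):
        resolved = resolve_positive_max_tokens(candidate)
        if resolved is not None:
            return resolved
    raise ValueError("no positive max_tokens budget could be resolved")
```

Under these assumptions, a requested value of -1 or 0.5 falls back to the ceiling, while 8192.7 is accepted as 8192. Oversized values are deliberately not handled here, since the commit message says the context-window clamp stays at the call site.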


5459 commits
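
For readers following the usage fix in commit b9463e32c6 listed above, the "nested first, top-level fallback" rule might look something like the sketch below. It is an assumption-laden illustration rather than the project's normalize_usage(): the function name is hypothetical, and the nested key names `cached_tokens` and `cache_write_tokens` are assumed, since the commit message names only `prompt_tokens_details` and the two top-level Anthropic-shape fields.

```python
from typing import Any, Mapping, Tuple


def _positive_or_zero(value: Any) -> int:
    """Return value if it is a positive int, else 0 (missing, None, or zero)."""
    if isinstance(value, int) and not isinstance(value, bool) and value > 0:
        return value
    return 0


def read_cache_tokens(usage: Mapping[str, Any]) -> Tuple[int, int]:
    """Return (cache_read, cache_write) from a chat-completions usage object.

    Hypothetical helper: nested prompt_tokens_details values win; the
    top-level Anthropic-shape counters are used only when the nested
    values are absent or zero.
    """
    details = usage.get("prompt_tokens_details") or {}
    # Nested key names are assumptions made for this sketch.
    read = _positive_or_zero(details.get("cached_tokens")) or _positive_or_zero(
        usage.get("cache_read_input_tokens")
    )
    write = _positive_or_zero(details.get("cache_write_tokens")) or _positive_or_zero(
        usage.get("cache_creation_input_tokens")
    )
    return read, write
```

This covers the three shapes the commit message lists as regression guards: top-level write with nested read, top-level only, and both present with the nested values winning.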