hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-08 03:01:47 +00:00

Author	SHA1	Message	Date
liuhao1024	f9b4b8af34	fix(mcp): include exception type in error messages when str(exc) is empty Some exception classes (e.g. anyio.ClosedResourceError) are raised without a message argument, so str(exc) returns an empty string. The existing error format f'{type(exc).__name__}: {exc}' would produce messages like 'MCP call failed: ClosedResourceError: ' with nothing after the colon. Add _exc_str() helper that falls back to repr(exc) when str(exc) is empty, and apply it to all 6 MCP error formatting sites (5 tool/prompt/resource handlers + 1 sampling handler). Fixes #19417	2026-05-07 06:33:57 -07:00
Alexander Monas	a1f85ef2b9	fix(mcp): retry stale pipe transport failures Treat closed-resource, closed-transport, broken-pipe, and EOF MCP failures as stale session equivalents so the existing reconnect/retry-once path can recover. Add regression coverage for the stale-pipe marker variants. Checks: - python -m py_compile tools/mcp_tool.py tests/tools/test_mcp_tool_session_expired.py - python -m pytest tests/tools/test_mcp_tool_session_expired.py -q -o addopts= - selected secret scan over touched files	2026-05-07 06:32:45 -07:00
paul-tian	4d4807585a	fix(gateway): honor configured goal turn budget	2026-05-07 06:31:08 -07:00
Luciano Pacheco	f7b71aa0da	fix: use configured model for gateway auth fallback	2026-05-07 06:29:27 -07:00
Mason James	80548f9a4f	fix(mcp): report configured timeout in MCP call errors Track elapsed wall time in _run_on_mcp_loop, cancel the in-flight future when a timeout expires, and raise a descriptive TimeoutError that includes the elapsed and configured timeout. Add regression coverage for the new timeout diagnostics.	2026-05-07 06:28:11 -07:00
Hedirman	a9ebee5f02	Fix WhatsApp long message splitting	2026-05-07 06:27:47 -07:00
acc001k	5533ad7644	fix(auxiliary): enforce Codex Responses stream timeout ## Summary - Forwards chat-completions `timeout` into the Codex Responses stream call. - Adds total elapsed-time enforcement while the Responses stream is still yielding events. - Closes the underlying client on timeout to unblock stalled streams, then raises `TimeoutError`. - Adds focused tests for timeout forwarding and total timeout enforcement. ## Why The Codex auxiliary adapter can be used by non-interactive auxiliary work such as context compression. If the stream keeps yielding progress-like events but never completes, SDK socket/read timeouts do not necessarily protect the full operation. This makes the CLI look stuck until the user force-interrupts the whole session. This is a refreshed upstream-ready version of the earlier fork fix around `d3f08e9a0` / PR #3. ## Verification - `python -m py_compile agent/auxiliary_client.py tests/agent/test_auxiliary_client.py` - `python -m pytest -o addopts='' tests/agent/test_auxiliary_client.py::TestCodexAuxiliaryAdapterTimeout -q` - `git diff --check`	2026-05-07 06:21:50 -07:00
nudiltoys-cmyk	498c01406f	fix(docker): chown runtime node_modules trees to hermes user (#18800)	2026-05-07 06:17:49 -07:00
LeonSGP43	4876959a19	fix(auth): shorten credential 401 cooldown	2026-05-07 06:15:33 -07:00
stormhierta	f648c2e3aa	fix: use max_completion_tokens for GitHub Copilot	2026-05-07 06:14:45 -07:00
LeonSGP43	d12be46df8	fix(skills): lock usage telemetry updates	2026-05-07 06:13:37 -07:00
LeonSGP43	7244a1f0d3	fix(weixin): wrap long copy-unfriendly lines	2026-05-07 06:08:06 -07:00
LeonSGP43	31f22890ea	fix(matrix): defer reaction cleanup redactions	2026-05-07 06:05:44 -07:00
Steven Chou	9442a8fa22	fix(update): migrate config in non-interactive updates	2026-05-07 06:04:28 -07:00
LeonSGP43	84287b0de8	fix(docker): refuse root gateway runs in official image	2026-05-07 05:59:25 -07:00
shashwatgokhe	5cf703245b	fix(image-routing): sniff magic bytes for image MIME, ignore misleading suffix Discord (and similar platforms) can serve a PNG image cached as discord_xxx.webp because the CDN reports content_type=image/webp for proxied stickers, custom emoji, and certain bot-uploaded images even when the actual bytes are PNG. Hermes' agent.image_routing._guess_mime trusted the file suffix and declared media_type=image/webp to Anthropic, which strict-validates and returns: HTTP 400 messages.N.content.M.image.source.base64: The image was specified using the image/webp media type, but the image appears to be a image/png image The Discord image attachment never reaches the model; the whole turn fails with no salvage path. Fix: sniff magic bytes in _file_to_data_url before declaring MIME. Suffix-based detection is kept as a fallback when bytes aren't available. New helper _sniff_mime_from_bytes covers PNG, JPEG, GIF, WEBP, BMP, and HEIC/HEIF. Tests: - Two existing tests asserted the old broken behaviour (PNG bytes in a .jpg/.webp file should report jpeg/webp); rewritten with real jpeg/webp magic bytes so they still cover suffix-aligned cases. - New regression test test_mime_sniff_overrides_misleading_extension reproduces the exact Discord scenario (PNG bytes, .webp suffix) and asserts the data URL comes back as image/png. All 28 tests in tests/agent/test_image_routing.py pass.	2026-05-07 05:58:11 -07:00
LeonSGP43	5ead126709	fix(doctor): retry DashScope China endpoint	2026-05-07 05:55:06 -07:00
LeonSGP43	14f38822fa	fix(models): prefer image modalities for vision routing	2026-05-07 05:54:12 -07:00
Teknium	6e46f99e7e	fix(tui): surface backend error as visible text when final_response is empty (#21245) When the provider rejects a request (e.g. invalid model slug like '--provider nous --model kimi-k2.6' where the valid slug is 'moonshotai/kimi-k2.6'), run_conversation() returns {failed: True, error: <detail>, final_response: None}. The TUI gateway and one-shot CLI mode both dropped the error on the floor and emitted an empty turn, so the user saw a blank response with no indication that anything went wrong. Mirror the interactive CLI's existing pattern (cli.py:9832): when final_response is empty AND (failed|partial) is set AND error is populated, surface 'Error: <detail>' as the visible text. Leaves the None-with-no-error path and the '(empty)' sentinel path untouched: an empty successful turn still renders empty, and existing sentinel handlers keep owning their lane. Reported by @counterposition in PR #20873; taking a minimal fix rather than the broader structured-failure refactor proposed there.	2026-05-07 05:53:19 -07:00
LeonSGP43	8dcdc3cbc2	fix(auth): keep Spotify logout from resetting model config	2026-05-07 05:53:14 -07:00
wxst	2021c18655	fix(agent): drop terminal empty-response sentinels	2026-05-07 05:52:10 -07:00
wxst	e73508979f	fix(agent): avoid persisting empty-response recovery scaffolding	2026-05-07 05:52:10 -07:00
Teknium	80717a157f	fix(discord): route DM role-auth opt-in through config.yaml (not env var) Per repo policy, ~/.hermes/.env is for secrets only. Guild IDs are behavioral configuration, not secrets. Replacing the DISCORD_DM_ROLE_AUTH_GUILD env var from the original fix with discord.dm_role_auth_guild in config.yaml. - New module-level _read_dm_role_auth_guild() helper reads hermes_cli.config.read_raw_config()['discord']['dm_role_auth_guild']. Fails closed on any parse error (safe default = DM role-auth off). - DEFAULT_CONFIG['discord'] gains dm_role_auth_guild: '' with a comment documenting the opt-in. - Tests patch hermes_cli.config.read_raw_config directly (via the _set_dm_role_auth_guild helper) instead of setenv/delenv. 12 tests in test_discord_roles_dm_scope pass; no env var involvement. - Docstring + module docstring + comments updated to reference discord.dm_role_auth_guild. - E2E verified with real imports across 6 scenarios: unset, int, string, garbage, zero, and (crucially) env-var-only-no-config all return None except the valid int/string cases. Env var has zero effect — policy compliance confirmed.	2026-05-07 05:51:56 -07:00
Teknium	5c045b8f6c	fix(discord): extend role-scope fix to slash surface + fixture update Sibling-site fix: _evaluate_slash_authorization was the fourth _is_allowed_user caller and didn't pass guild/is_dm through, so slash interactions would take the DM branch regardless of whether they came from a guild channel. Now reads interaction.guild + in_dm and forwards. Also updates test_discord_slash_auth fixture (_make_interaction) so the SimpleNamespace guild mock has a get_member(uid)->None method — required by the new guild-scoped fallback path in _is_allowed_user. Tests exercising positive role paths still work via user.roles. Three new regression tests in test_discord_roles_dm_scope: - Slash DM + role in mutual public guild → rejected - Slash in guild B + role only in guild A → rejected - Slash in guild B + role in guild B → allowed (positive control) 368 Discord tests pass. test_discord_free_channel_skips_auto_thread also fails on clean main (pre-existing, unrelated to this fix).	2026-05-07 05:51:56 -07:00
0xyg3n	ef1e565570	fix(discord): scope DISCORD_ALLOWED_ROLES to originating guild (CVSS 8.1) The initial DISCORD_ALLOWED_ROLES implementation (#11608, merged from #9873) scans every mutual guild when resolving a user's roles. This allows a cross-guild DM bypass: 1. Bot is in both public server A and private server B. 2. User holds the allowed role in server A only. 3. User DMs the bot. The role check finds the role in A and authorizes the DM, granting access as if the user were trusted in server B. Fix: - DMs (no guild context) disable role-based auth by default. Opt-in via DISCORD_DM_ROLE_AUTH_GUILD=<guild_id> restricts role lookup to one explicitly-trusted guild. - Guild messages check roles only in the originating guild (message.guild), never in other mutual guilds. - Reject cached author.roles when the Member came from a different guild than the current message. Backwards compatibility: - DISCORD_ALLOWED_USERS behavior is unchanged (still works in both DMs and guild messages). - Deployments that rely on roles in guild channels continue to work; role checks are now strictly scoped to that guild. - Deployments that intentionally want role-based DM auth can opt into a single trusted guild via DISCORD_DM_ROLE_AUTH_GUILD. Tests: 9 new regression guards in tests/gateway/test_discord_roles_dm_scope.py covering the bypass path, the opt-in path, cross-guild guild-message bypass, and backwards-compat user-ID paths. 47/47 discord-auth tests pass. Refs: #11608 (initial implementation), #7871 (feature request), #9873 (PR author credit @0xyg3n)	2026-05-07 05:51:56 -07:00
altmazza0-star	8308d18339	fix(gateway): preserve max turns after env reload	2026-05-07 05:49:16 -07:00
altmazza0-star	5b24c0fa85	fix: require memory schema fields by action	2026-05-07 05:48:17 -07:00
Teknium	47bf5d7ecb	test+docs: cover transform_llm_output hook + release author map - tests/test_transform_llm_output_hook.py: dispatch semantics (kwargs contract, first-non-empty-string-wins, empty-string pass-through, raising-plugin fail-open, no-plugins = no-op) - tests/hermes_cli/test_plugins.py: assert the new hook name is in VALID_HOOKS alongside the other transform_* hooks - website/docs/user-guide/features/hooks.md: summary-table entry + full section mirroring transform_tool_result / transform_terminal_output - scripts/release.py: map barnacleboy.jezzahehn@agentmail.to -> JezzaHehn (existing entry only covers the gmail address)	2026-05-07 05:46:05 -07:00
Teknium	6e250a55de	fix(openviking): add Bearer auth header and omit empty/legacy tenant headers (#21232) Authenticated remote OpenViking servers derive tenancy from the Bearer key, but the client was always sending X-OpenViking-Account and X-OpenViking-User (defaulted to the literal string "default"), which overrode the key-derived tenant and broke auth. - _headers(): skip X-OpenViking-Account/-User when blank or "default" (treats the legacy default value as unset, so existing installs don't need to touch their .env) - _headers(): send Authorization: Bearer <key> alongside X-API-Key for standard HTTP auth compatibility - health(): include auth headers so /health works against servers that require authentication Tests cover bearer emission, legacy "default" suppression, empty suppression, real tenant passthrough, and authenticated health checks. Fixes the same user report as #20695 (from @ZaynJarvis); that PR could not be merged because its branch was stale against main and would have reverted recent OpenViking work (#15696, local resource uploads, summary URI normalization, fs-stat pre-check).	2026-05-07 05:45:58 -07:00
abhinav11082001-stack	e9685a5cf7	fix: avoid unsupported anthropic context beta by default	2026-05-07 05:43:20 -07:00
Teknium	0214858ef5	fix(browser): enforce cloud-metadata SSRF floor in hybrid routing (#16234) (#21228) Cloud metadata endpoints (169.254.169.254 etc.) are now always blocked by browser_navigate regardless of hybrid routing, allow_private_urls, or backend. Bug: commit `42c076d3` (#16136) added hybrid routing that flips auto_local_this_nav=True for private URLs and short-circuits _is_safe_url(). IMDS endpoints are technically private (169.254/16 link-local), so the sidecar happily routed them to a local Chromium, and the agent could read IAM credentials via browser_snapshot. On EC2/GCP/Azure this is a full SSRF-to-credential-theft. Fix: new is_always_blocked_url() in url_safety.py, a narrow floor that checks _BLOCKED_HOSTNAMES, _ALWAYS_BLOCKED_IPS, _ALWAYS_BLOCKED_NETWORKS only. Applied as an independent gate in browser_navigate's pre-nav and post-redirect checks, BEFORE auto_local_this_nav gets a chance to short-circuit. Ordinary private URLs (localhost, 192.168.x, 10.x, .local, CGNAT) still route to the local sidecar as the #16136 feature intends. Secondary fix (reporter's finding): _url_is_private() now explicitly checks 172.16.0.0/12. ipaddress.is_private only covers that range on Python ≥3.11 (bpo-40791), so on 3.10 runtimes those URLs were routed to cloud instead of the local sidecar. No security impact, just a correctness fix for the hybrid-routing feature. Closes #16234.	2026-05-07 05:38:05 -07:00
Teknium	c4a7992317	fix(mcp-oauth): persist OAuth server metadata across process restarts (#21226) The MCP SDK discovers OAuth server metadata (token_endpoint, etc.) on demand and keeps it in memory only. Without disk persistence, a restart with valid cached refresh tokens forces the SDK to fall back to the guessed '{server_url}/token' path, which returns 404 on most real providers (Notion, Atlassian, GitHub remote MCP, etc.) and triggers a full browser re-authorization even though the refresh token is fine. Add a .meta.json file next to the existing tokens/client_info files: HERMES_HOME/mcp-tokens/<server>.json -- tokens (existing) HERMES_HOME/mcp-tokens/<server>.client.json -- client info (existing) HERMES_HOME/mcp-tokens/<server>.meta.json -- oauth metadata (new) Changes: - HermesTokenStorage.save_oauth_metadata / load_oauth_metadata / _meta_path: disk layer for the discovered OAuthMetadata. - HermesTokenStorage.remove() now also clears .meta.json so 'hermes mcp remove <name>' and the manager's remove() path clean up fully. - HermesMCPOAuthProvider._initialize cold-restores from disk before the existing pre-flight discovery runs. If disk has metadata we skip the discovery HTTP round-trips entirely. - HermesMCPOAuthProvider._prefetch_oauth_metadata now persists ASM as soon as it's discovered, so even the first pre-flight run seeds disk. - HermesMCPOAuthProvider._persist_oauth_metadata_if_changed() is called at the end of async_auth_flow so metadata discovered via the SDK's lazy 401-branch (not pre-flight) is also saved for next time. Tests cover the storage roundtrip (save/load/missing/corrupt/remove) and the manager provider path (cold-load restore, skip-when-in-memory, persist-on-discover, noop-when-unchanged, end-to-end async_auth_flow). Co-authored-by: nocturnum91 <50326054+nocturnum91@users.noreply.github.com>	2026-05-07 05:35:33 -07:00
Teknium	fe4748ede8	test(kanban): regression for CancelledError swallow in stream_events Drives stream_events directly and cancels the task while it is sleeping in the poll loop, asserting the coroutine returns cleanly instead of letting CancelledError bubble. Regression coverage for the Uvicorn application traceback on dashboard Ctrl-C fixed by the preceding commit.	2026-05-07 05:31:07 -07:00
pingchesu	43a6645718	docs: clarify API server tool execution locality	2026-05-07 05:30:37 -07:00
LeonSGP43	6b9f7140bb	fix(curator): make manual runs synchronous	2026-05-07 05:27:47 -07:00
Sofia Yang	f5a232af84	refactor: replace 'cmp' text with 🗜️ emoji in status bar Address review feedback to use the clamp emoji (🗜️) instead of the plain text 'cmp' prefix for the compression count indicator. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 05:27:45 -07:00
Sofia Yang	103e11926f	feat(cli): show context compression count in status bar Display the number of context compressions in the CLI status bar when compressions > 0, helping users understand conversation compression pressure during long sessions. - Wide layout (>=76 cols): shows 'cmp N' between context percent and duration - Medium layout (52-75 cols): shows 'cmp N' between percent and duration - Narrow layout (<52 cols): omitted to save space - Color-coded: dim for 1-4, warn for 5-9, bad for 10+ - Hidden when zero to keep the bar clean for new sessions Closes #18564	2026-05-07 05:27:45 -07:00
Hermes Agent	e38ea38079	fix(credential_pool): resolve key mix-up when custom providers share base_url When multiple custom_providers share the same base_url but have different API keys, get_custom_provider_pool_key() always returned the first match, causing wrong-key unauthorized errors. Add provider_name parameter to prefer exact name matches over base_url-only matching, with fallback for backward compatibility. Fixes #19083	2026-05-07 05:27:41 -07:00
GinWU	6d9b30632d	fix(cli): honor positive tool preview length	2026-05-07 05:26:28 -07:00
Teknium	fdb9e0f6a6	fix(kanban): auto-block workers that exit without completing (#20894) (#21214) When a kanban worker subprocess exits rc=0 but its task is still in status='running', the agent almost certainly answered the task conversationally without calling kanban_complete or kanban_block. The dispatcher used to classify this as a generic crash and respawn, which loops forever on small local models (gemma4-e2b q4 etc.) that keep returning clean but unproductive output. Dispatcher changes: - The waitpid reap loop at the top of dispatch_once now records each reaped child's raw exit status in a bounded module registry (_recent_worker_exits, TTL 600s, size cap 4096). - _classify_worker_exit distinguishes clean_exit / nonzero_exit / signaled / unknown using os.WIFEXITED / WIFSIGNALED. - detect_crashed_workers consults the classification when a worker is found dead. clean_exit → protocol_violation event + immediate circuit-breaker trip (failure_limit=1). Everything else keeps the existing crashed-event + counter behavior. - DispatchResult.auto_blocked now includes protocol-violation trips. Gateway fix (Bug A in #20894): - gateway.run._notify_active_sessions_of_shutdown snapshots self.adapters with list(...) before iterating. adapter.send() can hit a fatal-error path that pops the adapter from the dict, which was raising 'RuntimeError: dictionary changed size during iteration' during shutdown. Regression tests: - test_detect_crashed_workers_protocol_violation_auto_blocks verifies rc=0 + still-running → status=blocked on first occurrence with protocol_violation + gave_up events and NO crashed event. - test_detect_crashed_workers_nonzero_exit_uses_default_limit verifies non-zero exits keep the existing 2-strike behavior. Closes #20894.	2026-05-07 05:24:16 -07:00
Hao Zhe	2b6345cee3	fix(memory): harden OpenViking local path uploads	2026-05-07 05:21:50 -07:00
Hao Zhe	187951ec6b	test(memory): harden OpenViking local upload coverage	2026-05-07 05:21:50 -07:00
nan	7137cccbd1	fix(memory): support OpenViking local resource uploads	2026-05-07 05:21:50 -07:00
0oAstro	abe5a3c937	fix(model_switch): live model discovery for custom_providers in /model picker custom_providers entries (section 4 of list_authenticated_providers) only read the static models: dict from config.yaml, ignoring the live /v1/models endpoint. This means gateways like Bifrost that expose hundreds of models only show the handful explicitly listed in config. Add live discovery via fetch_api_models() for custom_providers entries that have api_key + base_url, matching the existing behavior for user providers: entries (section 3). When the endpoint is reachable and returns models, the live list replaces the static subset. Fixes: /model picker showing only 9 models from a Bifrost gateway that actually exposes 581.	2026-05-07 05:21:26 -07:00
Teknium	e82f3b0c41	test: update send_message_tool mocks for force_document kwarg	2026-05-07 05:20:10 -07:00
leon7609	d34f03c32a	feat(gateway): support [[as_document]] directive for skill media routing Skills that produce large/lossless images (e.g. info-graph, where a rendered JPG is 1-2 MB) currently lose quality in Telegram delivery because `_IMAGE_EXTS` membership routes the file through `send_multiple_images` → `sendMediaGroup`, which Telegram's server re-encodes to JPEG @ 1280px max edge. The original bytes only survive when the file goes through `send_document`, which the dispatch tables in three places (`_process_message_background`, `_deliver_media_from_response`, and the `send_message` tool's telegram path) only reach for files whose extension is NOT in `_IMAGE_EXTS`. This commit adds an `[[as_document]]` directive that mirrors the existing `[[audio_as_voice]]` shape: a skill emits the directive once in its response, and every image-extension MEDIA: file in that response is delivered via `send_document` instead of `send_multiple_images` / `sendPhoto`. The directive is detected at the dispatch sites (which see the raw response) and the directive string is stripped from the user-visible cleaned text in `extract_media` so it never leaks. Granularity is intentionally all-or-nothing per response, matching [[audio_as_voice]]'s scope. Skills that need fine control can split into two responses. Verified the targeted use case: info-graph emits 信息图已生成（...） [[as_document]] MEDIA:/tmp/info-graph-x/infographic.jpg → Telegram receives `infographic.jpg` via sendDocument, original 1MB JPEG bytes preserved, no recompression. Forwarding and download filenames stay clean (`infographic.jpg`). Tests: +3 cases in TestExtractMedia covering directive strip, isolation from voice flag, and coexistence with [[audio_as_voice]]. All 113 pre-existing media/extract/send tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 05:20:10 -07:00
Molvikar	8d363f8d54	fix(bedrock): preserve reasoningContent across converse normalization	2026-05-07 05:17:16 -07:00
badfriend	4f364c4e99	fix(mcp): give 'mcp add --command' a distinct argparse dest The --command flag of `hermes mcp add` shared its argparse dest with the top-level subparser (`dest="command"` in `hermes_cli/_parser.py`). When the flag was omitted, argparse still wrote `args.command = None`, clobbering the top-level value of `"mcp"`. The dispatcher then saw `args.command is None` and fell through to interactive chat, so `hermes mcp add ...` silently launched chat instead of registering the server. `cmd_mcp_add` was never reached. Use `dest="mcp_command"` on the flag and read it from `cmd_mcp_add`. The user-facing CLI flag `--command` is unchanged; only the in-memory namespace attribute moves. Also updates the `_make_args` helper in `tests/hermes_cli/test_mcp_config.py` to populate the new dest, and adds `tests/hermes_cli/test_mcp_add_command_dest.py` with a parser-level regression test. Closes #19785. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 05:17:03 -07:00
teknium1	333598cb0e	fix(gateway): cap cached session sources with LRU eviction Follow-up on top of Zyproth's session-source cache: swap the unbounded dict for an OrderedDict with a 512-entry LRU cap so long-running gateways can't accumulate stale entries for dead sessions forever. - self._session_sources is now an OrderedDict - _cache_session_source() move_to_end + popitem(last=False) above cap - _get_cached_session_source() move_to_end on hit (LRU read bump) - restart_test_helpers.py wires OrderedDict + _session_sources_max	2026-05-07 05:16:38 -07:00
Zyproth	176b93575a	fix(gateway): preserve thread routing from cached live session sources	2026-05-07 05:16:38 -07:00
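
Commit 333598cb0e describes capping the session-source cache with an OrderedDict and LRU eviction. The sketch below illustrates that general technique only; the class name `SessionSourceCache` and its `put`/`get` methods are hypothetical stand-ins, not the gateway's actual code. The 512-entry default is taken from the commit message.

```python
from collections import OrderedDict


class SessionSourceCache:
    """Hypothetical LRU-capped cache; names are illustrative, not from the repo."""

    def __init__(self, max_entries=512):
        self._max = max_entries
        self._sources = OrderedDict()

    def put(self, session_id, source):
        # Insert or overwrite, then mark the entry as most recently used.
        self._sources[session_id] = source
        self._sources.move_to_end(session_id)
        # Evict from the least recently used end once the cap is exceeded.
        while len(self._sources) > self._max:
            self._sources.popitem(last=False)

    def get(self, session_id):
        source = self._sources.get(session_id)
        if source is not None:
            # A read counts as a use, so bump the entry to the fresh end.
            self._sources.move_to_end(session_id)
        return source

    def __len__(self):
        return len(self._sources)
```

Bumping entries on read as well as on write keeps sessions that are still being routed from being evicted ahead of sessions that were merely created more recently.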

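
Commit f9b4b8af34 describes falling back to repr(exc) when str(exc) is empty, so that error messages never end in a dangling colon. A minimal sketch of that idea follows; `exc_str`, `format_mcp_error`, and the local `ClosedResourceError` class are assumptions made for illustration (the real helper is named `_exc_str`, and its exact body is not shown in the log).

```python
class ClosedResourceError(Exception):
    """Local stand-in for an exception class raised without a message."""


def exc_str(exc):
    # str() of an exception raised with no arguments is the empty string,
    # so fall back to repr(), which always includes the class name.
    text = str(exc)
    return text if text else repr(exc)


def format_mcp_error(exc):
    # Hypothetical formatting site, modelled on the format quoted in the commit.
    return f"MCP call failed: {type(exc).__name__}: {exc_str(exc)}"


print(format_mcp_error(ClosedResourceError()))
print(format_mcp_error(ValueError("bad input")))
```

With this fallback, the message-less case renders the repr after the colon instead of leaving it blank, while exceptions that carry a message are formatted as before.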

3328 commits
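
Commit 5cf703245b describes sniffing magic bytes before declaring an image MIME type, with the file suffix kept as a fallback. The sketch below shows one way such a check can look; the function names, the suffix table, and the default return value are assumptions rather than the repository's code, and HEIC/HEIF detection is left out for brevity.

```python
import os

# Assumed suffix fallback table for this sketch only.
SUFFIX_MIME = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".bmp": "image/bmp",
}


def sniff_mime_from_bytes(data):
    """Return a MIME type based on well-known file signatures, or None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "image/gif"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    if data.startswith(b"BM"):
        return "image/bmp"
    return None


def guess_mime(filename, data=None):
    """Prefer the sniffed type; use the suffix only when bytes are unavailable or unrecognized."""
    if data:
        sniffed = sniff_mime_from_bytes(data)
        if sniffed:
            return sniffed
    suffix = os.path.splitext(filename)[1].lower()
    return SUFFIX_MIME.get(suffix, "application/octet-stream")
```

In the Discord scenario from the commit message, PNG bytes stored under a `.webp` name would be reported as `image/png` because the signature check runs before the suffix lookup.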