hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Author	SHA1	Message	Date
teknium1	7c9cdad9fd	test(cli): cover Windows self-lock recovery guard + cmd-quote its hint Add two tests for the self-lock guard in _recover_from_interrupted_install: one asserting it clears the marker and skips install when hermes.exe is a process ancestor (breaking the #52378/#45542 loop), one asserting it falls through to a normal recovery install when the shim is NOT an ancestor. The guard's manual-recovery hint runs only inside the Windows branch, so quote it for cmd.exe (cd /d, double-quoted paths) — the cross-platform fallback hint at the end of the function is left POSIX-correct. Map Icather in scripts/release.py AUTHOR_MAP for the salvage.	2026-06-28 02:40:37 -07:00
teknium1	dddaea0c98	chore(release): map yungchentang author for #53622 salvage	2026-06-28 02:34:17 -07:00
teknium1	86ec979f66	chore(release): map PRATHAMESH75 author for #37550 salvage	2026-06-28 02:05:50 -07:00
Coy Geek	d7a1052424	fix(env-passthrough): fail closed when provider blocklist import fails When tools.environments.local can't be imported (partial install, import-time error), _is_hermes_provider_credential() returned False — fail-open. A skill could then register a Hermes provider credential (ANTHROPIC_API_KEY, etc.) as env passthrough; _scrub_child_env lets passthrough vars bypass the secret-substring net (rule 1), so the operator's real key would land in the execute_code child. Reopens the GHSA-rhgp-j443-p4rf bypass. Fail closed instead: on import failure, treat the name as a protected provider credential and refuse passthrough. Regression test exercises the full register -> scrub path under a simulated import failure. Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-06-28 02:05:43 -07:00
teknium1	58c36b1798	fix(api-server): widen error redaction to cron-endpoint + SSE sites Follow-up to the salvaged #37733 fix. The contributor centralized redaction at _openai_error and the chat/responses failure paths, which covers the OpenAI-compatible envelopes transitively. Two sibling classes crossed the same authenticated HTTP boundary unredacted: - 8x cron-management endpoints returning {"error": str(e)} on 500 - the session-chat SSE error event ({"message": str(exc)}) Route both through the same _redact_api_error_text(force=True) helper. Add AUTHOR_MAP entry for coygeek and a TestRedactApiErrorText guard covering mask/force/limit/passthrough behavior.	2026-06-28 02:05:38 -07:00
teknium1	578e3989d4	fix(agent): route content-filter stream stalls to fallback chain (#32421) When a provider's output-layer safety filter (MiniMax "output new_sensitive (1027)", Azure content_filter, etc.) kills a streaming response after deltas were already sent, interruptible_streaming_api_call swallows the raw error into a finish_reason=length partial-stream stub. The conversation loop then burned 3 continuation retries against the SAME primary — re-hitting the content-deterministic filter every time — and gave up with "Response remained truncated after 3 continuation attempts", never consulting fallback_providers. Builds on @595650661's classifier change (cherry-picked) so error_classifier recognizes the filter; then: - chat_completion_helpers: run the swallowed error through error_classifier at the stub-creation point and stamp _content_filter_terminated on the stub (single source of truth — no parallel pattern list). - conversation_loop: read the tag and activate the fallback chain BEFORE burning any continuation retries; roll partial content back to the last clean turn and re-issue against the new provider (restart_with_rebuilt_messages). Plain network stalls are unaffected (only content_policy_blocked is tagged). Credits #32479 (@sweetcornna) and #33845 (@Tranquil-Flow) which fixed the same issue via the stub-tag and loop-escalation approaches respectively. Live E2E confirmed: before, _try_activate_fallback called 0x; after, fallback fires on the first stub and the fallback provider completes the turn.	2026-06-28 01:15:21 -07:00
teknium1	2e1b48ed31	chore: map kurlyk local email → skabartem for PR #32867 salvage	2026-06-28 01:08:04 -07:00
Rafael Millan	54ea059919	fix: fall back to no-sandbox for desktop launch on restricted Linux hosts	2026-06-27 22:16:20 -07:00
teknium1	97640fd9ad	fix(desktop): reserve WCO width on plain Linux + author map The plain-Linux overlay re-enable (#53185) left nativeOverlayWidth() at 0 for plain Linux, so the native min/max/close buttons painted on top of the app's right-edge titlebar tools. Reserve the fallback width everywhere the WCO overlay is painted (Windows, WSLg, plain Linux); macOS still reserves 0 since it uses traffic lights.	2026-06-27 22:05:33 -07:00
teknium1	c72d68715f	chore(release): map salvaged contributor emails for #49129 and #51488	2026-06-27 21:23:25 -07:00
teknium1	2e7e600eaa	chore(release): map HexLab98 author for PR #53863 salvage	2026-06-27 21:22:49 -07:00
Jack Maloney	f0de4c6a47	fix(pool): re-select from credential pool on primary runtime restore _restore_primary_runtime restored the construction-time api_key snapshot and never consulted the credential pool. After the pool rotated away from a revoked/exhausted entry mid-session, every new turn restored the dead key, re-failed instantly, burned the remaining entries, and fell through to cross-provider fallback. After restoring the snapshot, re-select the pool's current best entry and swap the live credential in via _swap_credential (which already rebuilds the OpenAI/Anthropic client, reapplies base-url headers, and carries the #33163 base_url / OAuth-detection fixes). Falls back to the snapshot key when the pool is absent, empty, or the entry has no usable key. Salvaged from #25206 onto current main: the original targeted the pre-refactor monolithic method in run_agent.py; the logic now lives in agent/agent_runtime_helpers.py and is collapsed onto _swap_credential instead of re-inlining the client rebuild. Fixes #25205	2026-06-27 20:04:45 -07:00
teknium1	926a1b915d	fix(tools): suppress transient check_fn flakes so subagents keep file/terminal tools A flaky external probe in a tool's check_fn (e.g. check_terminal_requirements running `docker version` with a 5s timeout, momentarily timing out under load) would return False for a single get_tool_definitions() call. Because file tools delegate their check_fn to the terminal check, that one flake silently stripped read_file/write_file/patch/search_files AND terminal from whatever agent was being constructed at that instant — most visibly a delegate_task subagent, which then reported "Tool read_file does not exist". This explains both the intermittent (~80% success) user-session failures and the deterministic cron failures in #21658 / #5304. The existing _check_fn TTL cache made this worse: it cached the transient False for the full 30s window, poisoning every subagent spawned in that span. Fix: remember the last time each check_fn returned True; when a fresh probe fails within a short grace window of that success, treat it as a flake — serve the last-good True and do NOT cache the failure (so the next call re-probes). A failure with no recent success, or past the grace window, is honored normally so a backend that genuinely went down stops advertising its tools. Probe failures now log at WARNING regardless of quiet mode, making the previously-silent tool loss diagnosable in subagent (quiet) sessions. Co-authored-by: Stuart Horner <5261694+djstunami@users.noreply.github.com>	2026-06-27 19:29:00 -07:00
Shashwat Gokhe	505bc27d8d	fix(gateway): classify mixed attachments per-attachment + transcode uncommon image formats A document attached alongside an image in the same Discord message was swept into the vision pipeline and 400'd the whole turn ("Could not process image"), and was simultaneously never surfaced to the agent as a readable file. Restores the "any file type works" contract for mixed messages and fixes the HTTP 400. Bug 1 — mixed attachments: the inbound routing loop keyed image/audio/video classification off the message-level type (PHOTO/VOICE/AUDIO), so a doc in a PHOTO message landed in image_paths and poisoned the vision call. The document context-note path was gated on message_type == DOCUMENT, so that same doc never reached the agent at all. Now classification is per-attachment (trust each attachment's own MIME; fall back to the message-level type only when MIME is unknown), via shared _event_media_is_* helpers used by both _build_media_placeholder and the main inbound loop. The document note now fires for any non-image/audio/video attachment regardless of message-level type. Bug 2 — uncommon formats: AVIF/HEIC/BMP/TIFF/ICO produced the same generic 400 because providers only accept PNG/JPEG/GIF/WEBP. image_routing now transcodes those to PNG via Pillow before declaring media_type, skipping cleanly (logged) if Pillow/plugins are missing. SVG is vector — Pillow can't rasterize it — so it's skipped rather than transcoded. Closes #25935. Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com> Co-authored-by: cypres0099 <74935762+cypres0099@users.noreply.github.com>	2026-06-27 19:26:04 -07:00
Chaz Dinkle	1dde7e2f2a	fix(anthropic): adopt Claude Code's already-refreshed token before racing refresh Claude Code OAuth refresh tokens are single-use; Claude Code refreshes on its own schedule, so by the time Hermes notices an expired token Claude Code may have already rotated it. Re-read live credential sources first and adopt a valid token rather than POSTing a possibly-stale refresh token. Ports the _refresh_oauth_token hardening from PR #40107 (chazmaniandinkle) on top of the keychain/file reconciliation from PR #21112 (nodejun). Adds AUTHOR_MAP entry for nodejun.	2026-06-27 19:14:43 -07:00
teknium1	6514be5a28	chore(release): add AUTHOR_MAP entry for linyubin (#50228 salvage)	2026-06-27 19:12:21 -07:00
bykim0119	851f75d4df	fix(discord): honor "*" wildcard in DISCORD_ALLOWED_USERS (#22334) DISCORD_ALLOWED_USERS="*" now means "allow everyone", matching the SIGNAL_ALLOWED_USERS / DISCORD_ALLOWED_CHANNELS wildcard convention and the value `claw migrate` emits. Previously _is_allowed_user did exact ID matching only, so "*" matched no user and blocked every non-self sender — a P1 with no workaround. Three sites, all required for the fix to hold at runtime: - _is_allowed_user: short-circuit when "*" is in the allowlist. - connect(): exclude "*" from the intents.members trigger so the wildcard does not request the privileged Server Members intent (which can block the bot from coming online). - _resolve_allowed_usernames: preserve "*" verbatim; otherwise it lands in the username-resolution bucket, matches no member, and is silently dropped from the set and env var on the first on_ready — quietly undoing the fix. Slash auth delegates to _is_allowed_user (auto-covered); component auth already honors "*" on main.	2026-06-27 19:11:30 -07:00
teknium1	ea8facee81	chore(release): add konsisumer to AUTHOR_MAP for PR #19608 salvage	2026-06-27 19:01:37 -07:00
Dale Nguyen	dbbf102b8e	fix(terminal): strip VIRTUAL_ENV/CONDA_PREFIX from terminal subprocess env The Hermes gateway runs inside its own venv, so its process environment carries VIRTUAL_ENV (and possibly CONDA_PREFIX). The terminal tool spawned subprocesses inheriting those markers. When the agent ran `uv sync`, `uv pip install`, `poetry install`, etc. in ANY other project directory, those tools honored the inherited VIRTUAL_ENV and rebuilt/synced that project's dependencies into the Hermes venv path — wiping Hermes' own runtime deps (and, when the other project pinned a different Python, replacing the interpreter), bricking the gateway on the next restart (#23473). Strip VIRTUAL_ENV/CONDA_PREFIX in both subprocess-env construction points in tools/environments/local.py — `_sanitize_subprocess_env` and `_make_run_env` — via a shared `_ACTIVE_VENV_MARKER_VARS` constant. The Hermes venv stays reachable because its bin dir is already first on PATH, so removing the active-environment markers is safe and only prevents the cross-project clobber. Adds TestActiveVenvMarkerStripping: end-to-end (markers in os.environ don't reach the spawned subprocess) and unit coverage for both functions, plus a guard on the marker constant. Also adds the AUTHOR_MAP entry for the salvaged contributor. Closes #23473	2026-06-28 01:04:20 +05:30
teknium1	f2ca3e3d84	fix(gateway): hold _run_restart on _restart_task + explicit cancel-loop skip Follow-up on the cherry-picked #13173 fix. Holds the _run_restart task in self._restart_task (a bare asyncio.create_task keeps only a weak reference, so a still-pending task can be GC'd mid-flight) and explicitly skips it in the _stop_impl cancel loop alongside _stop_task. Adds AUTHOR_MAP entry for the contributor and a regression test that fails when the task is cancellable. Refs #12875	2026-06-27 03:57:31 -07:00
Teknium	d3db73210c	chore(release): map blaryx@gmail.com → Blaryxoff for PR #32602 salvage	2026-06-27 03:48:18 -07:00
teknium1	3cd4693494	chore: add DiamondEyesFox to AUTHOR_MAP for PR #53351 salvage	2026-06-27 03:04:26 -07:00
kshitijk4poor	5eb108f06c	chore: AUTHOR_MAP — yashiel@skyner.co.za → yashiels PR #53284 salvage (discord markdown table-to-bullet conversion; #21168)	2026-06-27 03:38:29 +05:30
kshitijk4poor	05ba5f3962	chore: add Dr1985 to AUTHOR_MAP for launchd salvage (#42567)	2026-06-26 14:09:11 +05:30
teknium1	fbfccbb3ee	fix(security): align cron invisible-unicode set with install-time scanner The cron runtime tripwire (_scan_cron_prompt) used a 10-char invisible-unicode set while the install-time scanner (threat_patterns.INVISIBLE_CHARS) flags 17. The cron-local set was missing U+2062-U+2064 (invisible math operators) and U+2066-U+2069 (directional isolates), so a directive obfuscated with one of those codepoints (e.g. "ig<U+2063>nore all previous instructions") slipped past the runtime cron gate while being caught at install time. Import the canonical set so the cron tripwire and install scanner can't drift apart again. Emoji-ZWJ protection (_zwj_has_emoji_neighbour) is unchanged. Fixes #35075 Co-authored-by: rlaope <piyrw9754@gmail.com>	2026-06-26 01:11:11 -07:00
kshitijk4poor	fe255ab28b	chore: add Tranquil-Flow to AUTHOR_MAP for auxiliary base_url salvage (#52623)	2026-06-26 11:11:33 +05:30
teknium1	4d04c652f2	fix(curator): make external-skill write guard actually fire during curation The salvaged #51875 added a background-review write guard in skill_manage that refuses mutations to skills.external_dirs skills — but it only fires when is_background_review() is true. The curator's LLM review fork ran with the default _memory_write_origin='assistant_tool', so the guard never triggered during the exact curation pass it exists to protect against (GH-47688). - Set _memory_write_origin='background_review' on the curator review fork so turn_context binds it onto the write-origin ContextVar and the guard fires. - Add a regression test asserting the fork runs under the background_review origin (the invariant linking the fork to the guard). - AUTHOR_MAP: map yu-xin-c for the salvaged commit.	2026-06-25 22:03:02 -07:00
teknium1	eed9bbeb0a	chore(release): add rebel0789 to AUTHOR_MAP for salvaged PR #47308	2026-06-25 22:02:22 -07:00
teknium1	1abfa66ba6	chore(release): add DavidMetcalfe to AUTHOR_MAP for PR #52272 salvage	2026-06-25 19:00:48 -07:00
teknium1	e29823f1e8	chore(release): map agt-user noreply email for #48496 salvage	2026-06-25 18:50:11 -07:00
teknium1	6dfb8326f5	fix(state): exclude delegate/branch/tool children from resume walk + reconcile salvaged fixes Follow-up to the salvage of #45035 + #48682. The two PRs touched different functions (resolve_resume_session_id vs get_compression_tip) but #45035's descendant walk followed ANY parent_session_id child, so a delegate/subagent child could hijack the resume target. Apply the same _branched_from / _delegate_from / source!='tool' exclusion the rest of hermes_state.py uses, so the resume walk only follows genuine compression continuations. Also updates the unrealistic delegation test fixture to carry the real _delegate_from marker, and updates 3 list_sessions_rich test mocks for the order_by_last_active kwarg #48682 added. AUTHOR_MAP: map PINKIIILQWQ + ailang323 salvage authors.	2026-06-25 16:29:09 -07:00
teknium1	92b5987ca2	chore: add herbalizer404 + pyxl-dev to AUTHOR_MAP for auxiliary fallback salvage	2026-06-25 13:08:18 -07:00
kshitijk4poor	0654319644	chore(release): map srojk34 legacy prefix-less noreply in AUTHOR_MAP (#50098)	2026-06-25 12:56:05 -07:00
kshitij	d682f320b3	Merge pull request #52147 from NousResearch/salvage/29184-mcp-osv-nonblocking fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184)	2026-06-25 23:39:44 +05:30
qdaszx	6305ac0e4b	fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184) During stdio MCP server startup, _run_stdio (an async method) called the synchronous check_package_for_malware() inline. That makes a blocking urllib HTTPS POST to api.osv.dev whose own timeout doesn't reliably cover a stalled SSL handshake, so an intermittent network issue froze the entire asyncio event loop for up to ~120s — blowing past the TUI/gateway's 15s startup budget and showing "gateway startup timeout". Run the check via asyncio.to_thread (off the loop) AND bound it with asyncio.wait_for(timeout=_OSV_MALWARE_CHECK_TIMEOUT_S=12s). The malware check is fail-open, so on timeout we log and proceed rather than blocking startup. Salvaged from #29190 by @qdaszx (re-applied on current main — the call site moved since the PR was opened), combining the to_thread approach also proposed in #29192 by @ygd58. Two load-bearing tests: event-loop-not-blocked-during-check and timeout-fails-open — both mutation-verified to fail against the old inline blocking call. Closes #29184. Co-authored-by: ygd58 <buraysandro9@gmail.com>	2026-06-25 23:30:41 +05:30
Teknium	60a2feeebf	chore: add benbenlijie to AUTHOR_MAP for PR #47205 salvage	2026-06-25 00:17:17 -07:00
kshitij	4d589b1e13	Merge pull request #52121 from NousResearch/salvage/43466-strip-cronjob-toolset fix(delegate): strip cronjob toolset from delegated children (#43466)	2026-06-25 01:54:37 +05:30
kshitijk4poor	e25b56fc64	chore: AUTHOR_MAP entry for riyas22 (PR #43687 salvage)	2026-06-25 01:39:11 +05:30
texhy	aacc6bb0a8	fix(agent): trigger preflight compression on few-but-huge sessions (#27405) The preflight-compression gate only ran the (expensive) token estimate when the message COUNT exceeded protect_first_n + protect_last_n + 1. A session with a handful of very large messages never tripped the count condition, so compression was never attempted and the turn eventually hit a hard context-overflow error. Add _should_run_preflight_estimate() with OR semantics: run the estimate when either the message count exceeds the protected ranges (the historical gate) OR a cheap char-based estimate already crosses the configured threshold. The downstream estimate_request_tokens_rough() stays authoritative — this is only a hint that decides whether to pay for the full estimate. Salvaged from #27435 by @texhy (authorship preserved). Re-applied on current main: the preflight gate moved from conversation_loop.py to turn_context.py since the PR was opened, so the helper + gate are placed there; the test imports the real MINIMUM_CONTEXT_LENGTH instead of a hardcoded literal. Closes #27405.	2026-06-25 01:20:23 +05:30
emozilla	6638199c53	fix(install): harden venv-resident process sweep on Windows Follow-up to the salvaged venv-recreate fix. Three changes to the Install-Venv pre-delete sweep: - Match the venv path with a case-insensitive StartsWith instead of the PowerShell -like operator. A venv path containing wildcard metacharacters ('[', ']') — legal in a Windows user name — silently fails to match under -like, which would let the locking process slip through and reintroduce the exact access-denied failure this fix closes. - Retry Remove-Item once after a short pause. A force-killed process can take a moment to release its file handles, so the first delete may still hit a locked .pyd; retry before failing the stage. - Note in a comment that the gateway autostart task runs at LIMITED integrity as the current user, so the installer always runs at equal-or-higher integrity and can read the process executable path, and that Get-CimInstance is preferred over Get-Process because it returns a null path for an uninspectable process instead of throwing. Adds a regression test asserting the recreate branch sweeps by venv path prefix, uses StartsWith rather than -like, and runs the sweep before Remove-Item. Covers issues #47036, #47557, #47910.	2026-06-24 13:25:44 -04:00
kshitijk4poor	fce2af780f	chore(release): add Elshayib to AUTHOR_MAP (PR #48351)	2026-06-24 19:34:33 +05:30
teknium1	98224ce8b6	chore: add chazmaniandinkle to AUTHOR_MAP for PR #43888 salvage	2026-06-24 00:14:25 -07:00
teknium1	ba50787180	test(anthropic-oauth): cover login token-endpoint host + fallback Add two regression tests for the salvaged #48706 fix: - login token exchange targets platform.claude.com first - falls back to console.anthropic.com when the new host is unreachable Also map the salvaged contributor's noreply email in release.py AUTHOR_MAP (CI author-map gate).	2026-06-23 23:59:40 -07:00
teknium1	3dfbc0ad1d	chore(release): map thestral123 author email for PR #42021 salvage	2026-06-23 23:49:22 -07:00
teknium1	901165b5a4	fix(cron): complete plugins.cron_providers rename in 2 missed test files uperLu's #50958 renamed plugins/cron → plugins/cron_providers but left two test files patching the now-gone plugins.cron.chronos.verify path, which would fail collection. Point them at plugins.cron_providers.*. Add uperLu to release.py AUTHOR_MAP.	2026-06-23 23:39:22 -07:00
teknium1	fa2f0bf3da	chore(release): add francescomucio to AUTHOR_MAP for salvaged PR #51357	2026-06-24 16:34:51 +10:00
teknium1	3d56807fbd	fix(gateway): actively reap no-systemd gateway orphan before restart Builds on @wgu9's runtime-tracking fix: now that find_gateway_pids() can see a no-supervisor `gateway restart` runtime, have stop_profile_gateway() fall back to an orphan-aware, profile-scoped reap (SIGTERM then SIGKILL) when the pidfile/runtime record is missing or stale. Closes the duplicate-accumulation path in #51325 — a follow-up restart now kills the prior orphan instead of stacking another listener on :8644. Gated on not supports_systemd_services() so a transient `gateway restart` argv on supervised hosts is never killed. Also adds the AUTHOR_MAP entry for the salvaged contributor.	2026-06-23 23:29:28 -07:00
Teknium	b60260c61a	chore(release): add SidUParis to AUTHOR_MAP for salvaged PR #50071	2026-06-23 21:33:10 -07:00
pefontana	4ea3096a85	chore(release): map jinhyuk9714 to AUTHOR_MAP for attribution check The cherry-picked commit is authored by jinhyuk9714@gmail.com (GitHub sjh9714); the check-attribution CI gate requires every PR commit author to be present in scripts/release.py AUTHOR_MAP.	2026-06-23 18:42:05 -07:00
kshitijk4poor	5511fcf944	chore(release): map manusjs email to manus-use GitHub login Required by contributor-check/check-attribution before salvaging PR #51129 (Discord thread-starter dedup, #51057). The CI step greps AUTHOR_MAP by exact email and does not special-case noreply addresses.	2026-06-24 03:09:23 +05:30
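
The flake suppression described in commit 926a1b915d (serve the last-good result when a probe fails shortly after a success, and do not cache that failure) might look roughly like the sketch below. This is not the repository's code: the class name `CheckCache`, the `clock` parameter, and the 60s grace window are assumptions made for illustration. Only the 30s TTL and the WARNING-level logging come from the commit message.

```python
import logging
import time
from typing import Callable, Dict, Tuple

logger = logging.getLogger("check_cache_sketch")


class CheckCache:
    """Hypothetical TTL cache for availability probes with a flake grace window."""

    def __init__(
        self,
        ttl_s: float = 30.0,     # TTL cited in the commit message
        grace_s: float = 60.0,   # assumed value; the commit only says "short"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl_s = ttl_s
        self.grace_s = grace_s
        self.clock = clock
        self._cached: Dict[str, Tuple[bool, float]] = {}  # name -> (result, cached_at)
        self._last_ok: Dict[str, float] = {}              # name -> time of last True

    def check(self, name: str, check_fn: Callable[[], bool]) -> bool:
        now = self.clock()

        # Serve a still-fresh cached result without probing.
        hit = self._cached.get(name)
        if hit is not None and now - hit[1] < self.ttl_s:
            return hit[0]

        try:
            ok = bool(check_fn())
        except Exception:
            ok = False

        if ok:
            self._last_ok[name] = now
            self._cached[name] = (True, now)
            return True

        # Probe failures are always logged at WARNING so tool loss is diagnosable.
        logger.warning("check_fn for %r failed", name)

        last_ok = self._last_ok.get(name)
        if last_ok is not None and now - last_ok <= self.grace_s:
            # Recent success: treat as a flake. Serve the last-good True and
            # drop any cache entry so the next call probes again.
            self._cached.pop(name, None)
            return True

        # No recent success (or grace window elapsed): honor and cache the failure.
        self._cached[name] = (False, now)
        return False
```

Injecting the clock keeps the sketch testable without sleeping. A backend that stays down is still reported as unavailable once the grace window has passed.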

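The pattern named in commit 6305ac0e4b (run a blocking check via `asyncio.to_thread`, bound it with `asyncio.wait_for`, and fail open on timeout) can be illustrated with a minimal standalone sketch. The function names `check_package_blocking` and `preflight` are hypothetical stand-ins, not the project's API, and the `time.sleep` call merely simulates a slow network request. The 12s default mirrors the bound quoted in the commit message.

```python
import asyncio
import logging
import time
from typing import Callable

logger = logging.getLogger("osv_preflight_sketch")

OSV_CHECK_TIMEOUT_S = 12.0  # bound quoted in the commit message


def check_package_blocking(package: str) -> bool:
    """Hypothetical synchronous check; returns True if the package is flagged."""
    time.sleep(0.2)  # simulates a slow blocking HTTPS request
    return False


async def preflight(
    package: str,
    check: Callable[[str], bool] = check_package_blocking,
    timeout_s: float = OSV_CHECK_TIMEOUT_S,
) -> bool:
    """Return True if startup may proceed, False if the package was flagged."""
    try:
        # to_thread moves the blocking call off the event loop;
        # wait_for bounds how long the caller waits for the result.
        flagged = await asyncio.wait_for(
            asyncio.to_thread(check, package), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        # Fail open: log and proceed rather than blocking startup.
        logger.warning("preflight for %r timed out after %ss", package, timeout_s)
        return True
    return not flagged


if __name__ == "__main__":
    # With a timeout shorter than the simulated request, the check fails open.
    print(asyncio.run(preflight("example-package", timeout_s=0.05)))
```

Note that a timed-out worker thread is not interrupted. It keeps running in the background until the blocking call returns, so the timeout bounds the caller's wait rather than the work itself.
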
1019 commits