hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
Teknium	263e008d6b	feat(skills): add web-pentest optional skill (#32265) Adds optional-skills/security/web-pentest/ — an authorized web app penetration testing skill adapted from Shannon's methodology (concepts only; AGPL-clean fresh implementation). Phased: recon (read-only) → vuln analysis (delegate_task per OWASP class) → proof-based exploitation → report. Guardrails baked in: - Authorization gate before first active scan (templates/authorization.md) - Scope allowlist (scope.txt) consulted by recon-scan.sh and documented as the rule for every active request - Aux-client leakage warning (compression + title gen replay history; payloads/creds must not enter chat verbatim) - Bypass-exhaustion discipline before false-positive classification - L3/L4 (proof-required) for reportable findings; L1/L2 listed as candidates only Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is cheaper and matches the existing optional-skills/security/ pattern).	2026-05-25 14:51:41 -07:00
teknium1	386f245d9d	feat(skills): add optional openhands skill — closes #477 Adds an optional autonomous-ai-agents skill that delegates coding tasks to the OpenHands CLI (https://github.com/All-Hands-AI/OpenHands). Sits alongside claude-code / codex / opencode and is the model-agnostic option in that family — any LiteLLM-supported provider works. This is a ground-truth rewrite of #19325 by @xzessmedia (Tim Koepsel). The original PR's SKILL.md was drafted by the OpenHands agent itself and hallucinated several flags that don't exist in the real CLI (`--model`, `--max-iterations`, `--workspace`, `--sandbox docker`), pointed at the wrong PyPI package (`openhands-ai`, which is the legacy V0 SDK), and claimed native Windows support that the upstream docs explicitly disclaim. Rather than cherry-pick and rewrite half the lines under contributor authorship, the SKILL.md was rebuilt against a verified install (`uv tool install openhands --python 3.12`) and a real end-to-end `--headless --json` run against openrouter/openai/gpt-4o-mini. Authorship credited via the `author:` frontmatter field and an AUTHOR_MAP entry in scripts/release.py. Changes: - optional-skills/autonomous-ai-agents/openhands/SKILL.md (new) - website/docs/user-guide/skills/optional/autonomous-ai-agents/autonomous-ai-agents-openhands.md (auto-gen) - website/docs/reference/optional-skills-catalog.md (one new row) - website/sidebars.ts (one new entry under Optional → Autonomous AI Agents) - scripts/release.py (AUTHOR_MAP entry for xzessmedia) Pitfalls documented in the SKILL came from running the tool, not from the upstream README: LiteLLM bedrock/sagemaker stderr noise on every invocation, banner spam (`OPENHANDS_SUPPRESS_BANNER=1` required), `--override-with-envs` mandatory or the CLI ignores LLM_* env vars entirely, the dashed-vs-undashed Conversation ID footgun for `--resume`, LiteLLM model-slug double-prefix when going through OpenRouter.	2026-05-25 14:49:34 -07:00
Teknium	5671461c0c	feat(skills): add code-wiki skill — closes #486 (#32240 ) * feat(skills): add code-wiki skill — closes #486 Bundled skill at skills/software-development/code-wiki/ that generates comprehensive documentation for any codebase: project overview, architecture walkthrough with Mermaid flowchart, per-module deep-dives, class diagram, sequence diagrams, getting-started guide, and (when applicable) API reference. Output defaults to ~/.hermes/wikis/<repo-name>/ (external to repo, like Google CodeWiki); in-repo output supported when user explicitly requests it. Uses only existing Hermes tools (terminal, read_file, search_files, write_file) — no Docker, no external services, no extra dependencies. Works on local repos and GitHub URLs (shallow-clones to a temp dir). Bounded scope defaults (depth 3, cap 10 modules) keep token cost reasonable on large repos. * refactor(skills): move code-wiki to optional-skills Per the 'when in doubt, optional' rule — wiki generation is a 'I want this big thing right now' capability, not daily-driver behavior. Lines up with finance/research/blockchain skills as install-on-demand rather than always loaded. Install via: hermes skills install official/software-development/code-wiki	2026-05-25 14:48:53 -07:00
Teknium	5caeb65a08	test(tts): regression coverage for #29417 double-[pause] fix Three new tests in tests/tools/test_tts_xai_speech_tags.py: - multi_paragraph_emits_single_pause — the headline #29417 case. Requires a first sentence of 12+ chars to hit the _XAI_FIRST_SENTENCE_RE length floor; the trivial 'Hello.\n\nWorld.' case dodged the bug by accident, which is why the PR's quoted repro didn't reproduce. Uses the longer 'Welcome to the demo of our new product line.\n\nIt has many features.' shape that actually trips the bug. - single_paragraph_still_gets_first_sentence_pause — sanity guard that the fix only suppresses the first-sentence pass when a paragraph pass injected [pause], so plain single-paragraph input still gets its leading pause. - single_newline_still_gets_first_sentence_pause — single newline isn't a paragraph break, no [pause] from the paragraph pass, so the first-sentence pause MUST still fire. Catches over-broad fixes.	2026-05-25 14:30:06 -07:00
EloquentBrush0x	1d73d5facc	fix(tts): prevent double [pause] in xAI auto speech tags for multi-paragraph text _apply_xai_auto_speech_tags runs two independent transformations: 1. paragraph breaks (\n\n) → " [pause] " 2. first-sentence boundary → " [pause] " Both fired unconditionally, so multi-paragraph input produced "Hello world. [pause] [pause] Second paragraph." — an unnatural double pause in the TTS audio. Guard the first-sentence substitution with _XAI_SPEECH_TAG_RE.search(clean): if the paragraph pass already inserted a [pause] tag, skip the first-sentence pass. Single-paragraph behavior is unchanged.	2026-05-25 14:30:06 -07:00
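The guard described in the commit above can be sketched roughly as follows. This is an illustration only, not the code in hermes-agent: the two regexes and the function name `apply_auto_speech_tags` are hypothetical stand-ins for `_XAI_SPEECH_TAG_RE`, `_XAI_FIRST_SENTENCE_RE` and `_apply_xai_auto_speech_tags`, whose real definitions are not shown in this log. The 12-character floor is taken from the regression-test commit.

```python
import re

# Hypothetical stand-ins; the real patterns in hermes-agent may differ.
SPEECH_TAG_RE = re.compile(r"\[pause\]")
FIRST_SENTENCE_RE = re.compile(r"^(.{12,}?[.!?])\s+")  # assumed 12+ char floor


def apply_auto_speech_tags(text: str) -> str:
    """Insert at most one kind of automatic pause into the text."""
    # Pass 1: paragraph breaks (two or more newlines) become a pause tag.
    clean = re.sub(r"\s*\n\n+\s*", " [pause] ", text.strip())
    # Pass 2: add a first-sentence pause only if pass 1 inserted no tag.
    if not SPEECH_TAG_RE.search(clean):
        clean = FIRST_SENTENCE_RE.sub(r"\1 [pause] ", clean, count=1)
    return clean
```

With this shape, multi-paragraph input gets its pause from the paragraph pass alone, while single-paragraph and single-newline input still receive the first-sentence pause.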
alt-glitch	b62af47da8	chore: drop stale line-number reference in PRIORITY path comment The cherry-pick comment referenced 'line ~6771' for the /stop handler, but on current main the handler is at a different offset. Remove the hard-coded line number — the 'above' reference is sufficient.	2026-05-25 16:23:24 +00:00
xxxigm	737ee81167	test(gateway): regression tests for #30170 subagent interrupt protection 17 new tests in tests/gateway/test_subagent_protection_30170.py pin down both the detection helper and the demotion behaviour: * TestAgentHasActiveSubagents — 11 cases covering the precision and defensiveness of _agent_has_active_subagents: - returns False for None, _AGENT_PENDING_SENTINEL, and stub agents that lack the _active_children attribute; - returns False for an empty list (the steady state of an idle AIAgent); - returns True for one or many children; - works when _active_children_lock is None (test stubs); - rejects truthy MagicMock auto-attributes — this is the regression-guard for "every MagicMock-based gateway test suddenly demotes to queue mode" (which is how this was originally found); - accepts list/tuple/set as the children container. * TestBusyHandlerDemotesInterruptForSubagents — 6 cases driving _handle_active_session_busy_message directly: - parent.interrupt is NOT called when subagents are active, message is still merged into the pending queue; - ack copy mentions "Subagent working", "queued", and the /stop escape hatch — and does NOT mention "Interrupting"; - with no subagents, behaviour is byte-identical to the pre-#30170 interrupt path (parent.interrupt called with the user text, ack says "Interrupting"); - configured queue mode keeps its vanilla "Queued for the next turn" ack (the #30170 demotion-specific copy must NOT fire); - configured steer mode still routes to running_agent.steer() even when subagents are active (the guard is interrupt-only); - _AGENT_PENDING_SENTINEL does not trigger demotion. Refs #30170.	2026-05-25 16:23:24 +00:00
xxxigm	99d62f6ba1	fix(gateway): protect in-flight subagents from busy-mode interrupts (#30170 ) When a user sends a conversational follow-up while delegate_task is running, gateway/run.py calls running_agent.interrupt(event.text) on the PARENT agent. AIAgent.interrupt() then cascades synchronously through self._active_children and calls interrupt() on every child subagent, aborting in-flight delegate_task work. The user sees the fallback cascade with no root-cause in the gateway log, and minutes of subagent progress are destroyed — the exact failure mode reported in Add GatewayRunner._agent_has_active_subagents(running_agent) — a static helper that returns True iff the parent is currently driving subagents via delegate_task. The helper is type-defensive: it ignores truthy MagicMock auto-attributes (so this doesn't accidentally fire in every test mock that hits the busy path), the _AGENT_PENDING_SENTINEL placeholder, and missing locks. Wire the helper into both interrupt branches: 1. _handle_active_session_busy_message — the adapter-level busy handler. When busy_input_mode == 'interrupt' AND the parent has active subagents, demote to 'queue' semantics: skip the parent.interrupt() call, merge the message into the pending queue, and surface a dedicated ack ("⏳ Subagent working — your message is queued for when it finishes (use /stop to cancel everything).") so the operator knows the message wasn't lost and discovers the explicit escape hatch. 2. The PRIORITY interrupt branch inside _handle_message — the non-command fast path. Same rationale, same demotion. Routes through _queue_or_replace_pending_event so the next-turn pickup stays unchanged. Explicit /stop and /new commands take a completely different path (_interrupt_and_clear_session in the slash-command dispatch at line ~6771) and are NOT affected by this guard — the operator still has a way to force-cancel everything when they actually mean it. 
Configured 'queue' and 'steer' modes are also untouched: 'queue' already does the right thing, and 'steer' goes through running_agent.steer() which does NOT cascade to children (so subagents survive a steer too). This is Phase 1 of the fix outlined in #30170 — the minimum viable change that stops subagent loss. Phase 2 (delegation-aware steer forwarding to active children) and Phase 3 (async delegation, #11508) are intentionally out of scope. Refs #30170.	2026-05-25 16:23:24 +00:00
brooklyn!	50aaf0c4ad	fix(tui): delineate assistant responses from details (#31087) * fix(tui): delineate assistant responses from details Add a muted Response marker before assistant text when thinking/tool details are visible so reasoning and final output do not visually run together. * fix(tui): account for response separator height Keep virtual transcript estimates aligned with the new response separator and avoid allocating trimmed copies of long assistant text. * fix(tui): gate response separator estimate on details Only add response-separator height when assistant details actually render, and use a non-allocating body-text check. * fix(tui): skip empty detail height estimates Do not add virtual transcript height for assistant details when no thinking or tool detail UI will render. * fix(tui): estimate details by section visibility Pass resolved thinking/tool visibility into virtual height estimates so hidden detail sections do not reserve response-separator rows.	2026-05-25 10:23:03 -05:00
brooklyn!	0ec0cafdd0	Merge pull request #31084 from NousResearch/bb/tui-right-click-copy-selection fix(tui): right-click copies active transcript selection	2026-05-25 10:22:43 -05:00
Savanne Kham	4117fc3645	fix(credential-pool): correct pool rotation when weekly usage limit is reached After key #1 is marked exhausted the retry still called the API with key #1 due to env-var bias in _get_cached_client / resolve_api_key_provider_credentials. Fix: peek the pool and pass the active entry's key as explicit_api_key. Secondary: api_key_hint in mark_exhausted_and_rotate pins the correct entry under concurrent CLI+gateway calls; _is_payment_error matches GoUsageLimitError; extract_api_error_context parses "Resets in Xhr Ymin".	2026-05-25 06:32:30 -07:00
Teknium	8f19485f53	chore(release): map kylekahraman email to GitHub login Required by CI author validation after salvaging PR #29723.	2026-05-25 06:23:18 -07:00
kylekahraman	ab42658dfc	feat: configurable paste collapse thresholds (TUI + CLI) Adds two new config keys: - paste_collapse_threshold (default: 5) — line count threshold for bracketed paste collapse in both TUI and CLI - paste_collapse_threshold_fallback (default: 0, disabled) — same for the fallback heuristic in terminals without bracketed paste support TUI frontend reads these from config.get full via applyDisplay/patchUiState. CLI reads from self.config at paste-handling time. Closes #5626 Related: #5623	2026-05-25 06:23:18 -07:00
zccyman	973bb124a4	fix(credential-pool): rotate immediately when credential already exhausted Closes #26145. When the user interrupts the retry loop between two 429s (Ctrl-C in interactive mode, /new, gateway disconnect), the local has_retried_429 flag dies with the recovery function. On the next user prompt the agent restarts with has_retried_429=False, hits 429 on the exhausted credential, sets the flag, returns 'retry once'. Repeat forever — the second 429 that would trigger rotation is never reached, and healthy entries (priority>0 free/paid accounts) are never tried. Fix: in recover_with_credential_pool's rate_limit branch, pre-check pool.current().last_status before running the retry-once dance. If the current entry is already STATUS_EXHAUSTED, rotate immediately. Uses getattr() for the attribute read so existing tests with SimpleNamespace mocks (which only set 'label') keep working. Co-authored-by: zccyman <16263913+zccyman@users.noreply.github.com>	2026-05-25 06:21:28 -07:00
Teknium	0a6a0ba527	test(skills): widen assertion in PR#6656 regression to accept new validator msg The new install-path validator from this PR raises 'Unsafe install path: ...' earlier in the pipeline than the previous resolve-then-check path. Behavior is identical (ok=False, victim untouched, refused before rmtree) — only the error string changed.	2026-05-25 06:13:36 -07:00
峯岸亮	3b9b9a7ad7	fix(skills): guard uninstall lock paths Validate Skills Hub lock-file install paths at both ends of the lifecycle so a poisoned or malformed lock.json entry cannot drive shutil.rmtree to a location outside SKILLS_DIR: - HubLockFile.record_install rejects empty/'.'/absolute/traversal/ Windows-drive paths at write time, and requires the final path component to match the skill name (shape: '<skill>' or '<category>/<skill>'). - install_from_quarantine resolves its destination through the same validator, catching symlink/junction redirects inside skills/. - uninstall_skill resolves the lock entry through the new validator before rmtree. Refuses anything that resolves to SKILLS_DIR itself (empty/dot paths) or to a target outside SKILLS_DIR (absolute paths, traversal, symlinked dirs in skills/ pointing outward). - 14 focused regression tests covering each rejection class plus a symlink-redirect case. E2E verified: hand-crafted poisoned lock.json entries (absolute path, empty install_path, traversal) all refuse and leave the targeted victim untouched; legitimate uninstall still succeeds. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-25 06:13:36 -07:00
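The write-time rejection classes listed in the commit above (empty, '.', absolute, traversal, Windows-drive, and a final component that must match the skill name) can be illustrated with a purely lexical check. This is a sketch under stated assumptions: `is_safe_install_path` is a hypothetical name, and the real validator also resolves symlinks against SKILLS_DIR, which this example does not attempt.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_safe_install_path(install_path: str, skill_name: str) -> bool:
    """Lexical check only; symlink resolution is out of scope for this sketch."""
    # Empty and current-directory paths would resolve to the skills root itself.
    if install_path.strip() in ("", "."):
        return False
    # Absolute POSIX paths, Windows drives and rooted Windows paths are rejected.
    win = PureWindowsPath(install_path)
    if PurePosixPath(install_path).is_absolute() or win.drive or win.root:
        return False
    parts = PurePosixPath(install_path.replace("\\", "/")).parts
    # Traversal components are never allowed.
    if ".." in parts:
        return False
    # Allowed shapes: '<skill>' or '<category>/<skill>'.
    return len(parts) in (1, 2) and parts[-1] == skill_name
```

For example, `is_safe_install_path("software-development/code-wiki", "code-wiki")` passes, while `"../victim"` and `"/etc"` are refused before any filesystem call is made.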
Teknium	0d137f1039	feat(errors): actionable guidance for Nous OAuth 401s (#32082 ) Nous Portal is OAuth-only (auth_type=oauth_device_code, no API key path), but the non-retryable-401 guidance branch only covered openai-codex and xai-oauth. A Nous 401 fell through to the generic 'Your API key was rejected... run hermes setup' message, which is wrong advice — the user needs hermes auth add nous --type oauth, not an API key. Also flag the case where the failing model slug ends in :free (OpenRouter syntax) while provider is nous. Without that hint, users re-OAuth successfully and then hit the same 401 on the next message because Nous Portal doesn't carry the OpenRouter free-tier slug. Reported by ashh — debug dump showed Nous device_code exhausted + deepseek/deepseek-v4-flash:free as the model.	2026-05-25 06:06:51 -07:00
wysie	dbe5d84972	fix(auxiliary): universal main-model fallback for aux tasks (#31845 ) Aux callers (title generation, vision, session search, etc.) can reach resolve_provider_client() without an explicit model when the user picked their main provider via 'hermes model' and didn't bother configuring a per-task auxiliary.<task>.model override. The expectation in that case is universal: 'use my main model for side tasks too.' Before, the OAuth providers (xai-oauth, openai-codex) silently returned (None, None) on an empty model — both lack a catalog default because their accepted-model lists drift on the backend. That caused _resolve_auto to drop to its Step-2 fallback chain (OpenRouter / Nous / etc.), so aux tasks billed against the wrong subscription without warning. The fix is at the top of resolve_provider_client() — a single 3-step universal fallback that runs before any provider branch, so no provider-specific empty-model guards are needed (now or for any future provider we add): 1. caller-passed model (caller knew what they wanted) 2. provider's catalog default (cheap aux model, if registered) 3. user's main model from config.yaml Behaviour by provider class: - OAuth providers (xai-oauth, openai-codex) — no catalog default, so step 3 applies. Title gen runs on grok-4.3 / gpt-5.4 against the user's actual subscription instead of leaking to OpenRouter. - API-key providers (anthropic, gemini, kimi-coding, etc.) — catalog default wins at step 2, preserving the original 'cheap aux model' behaviour. Anthropic users still get claude-haiku-4-5 for titles, not opus. - Explicit-model callers (auxiliary.<task>.model config, programmatic callers) — caller wins at step 1, no surprise switching. Salvaged from @wysie's PR #31845 which fixed the xai-oauth branch specifically. The universal shape supersedes the per-branch fix and covers openai-codex (same bug class) plus any future OAuth providers. 
4 new tests in TestResolveProviderClientUniversalModelFallback: - empty_model_for_oauth_provider_falls_back_to_main_model - empty_model_for_codex_also_uses_main_model - empty_model_for_catalog_provider_uses_catalog_default - explicit_model_takes_precedence_over_fallbacks 365/365 across tests/agent/test_auxiliary_*, tests/run_agent/test_codex_xai_oauth_recovery.py, tests/hermes_cli/test_auth_xai_oauth_provider.py, and tests/hermes_cli/test_plugin_auxiliary_tasks.py. Co-authored-by: wysie <wysie@users.noreply.github.com>	2026-05-25 05:50:56 -07:00
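The three-step fallback order from the commit above reduces to "first non-empty candidate wins". A minimal sketch, assuming the three candidates have already been looked up; `resolve_aux_model` and its parameters are hypothetical names, not the signature of `resolve_provider_client()`.

```python
from typing import Optional


def resolve_aux_model(
    caller_model: Optional[str],
    catalog_default: Optional[str],
    main_model: Optional[str],
) -> Optional[str]:
    """Return the first non-empty model in priority order."""
    # 1. caller-passed model, 2. provider catalog default, 3. user's main model.
    for candidate in (caller_model, catalog_default, main_model):
        if candidate:
            return candidate
    return None
```

An OAuth provider with no catalog default falls through to the main model (`resolve_aux_model(None, None, "grok-4.3")`), while an API-key provider with a registered cheap default stops at step 2.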
Teknium	46c1ae8b24	fix(tests): four pre-existing flakes from the security cluster merge (#32072 ) All four failures were broken by the security cluster (#10082 / #10133 / #4609 / symlink-reject batch) merging on May 25. They were red on origin/main HEAD when #32042 and #32061 ran, gating PRs that touched unrelated code. 1) tests/hermes_cli/test_update_zip_symlink_reject.py test_update_via_zip_accepts_normal_member called the real _update_via_zip without sandboxing PROJECT_ROOT — so the function's shutil.copytree() actually copied the fake README from the test ZIP over the real repo's README.md, which then made test_readme_mentions_powershell_installer fail in any test run that happened to pick this test up earlier. Mock PROJECT_ROOT to an isolated tmp_path / install_dir, stub subprocess so pip/uv reinstall doesn't actually run, and assert the fake README lands in the sandbox (not the real tree). 2) tests/tools/test_windows_native_support.py test_readme_mentions_powershell_installer was the victim of (1) — nothing wrong with the test itself, the fix in (1) clears it. 3) tests/tools/test_file_read_guards.py test_proc_fd_other_not_blocked called _is_blocked_device('/proc/self/fd/3') expecting False. But _is_blocked_device runs realpath() and on pytest xdist workers fd 3 happens to be dup'd to /dev/urandom (because the worker subprocess inherits open fds from pytest's collection pipe machinery). Switch to the lower-level _is_blocked_device_path which is the path-pattern check the test actually means to exercise; realpath-resolution coverage already lives in test_symlink_to_blocked_device_is_blocked. 4) tests/tools/test_transcription_tools.py Module installed a faster_whisper stub via sys.modules without setting __spec__, then later @pytest.mark.skipif called importlib.util.find_spec('faster_whisper') which raises 'ValueError: __spec__ is None' for modules with a None spec attr. Set __spec__ on the stub to a real ModuleSpec. 
Validation: 195/195 green across the 4 affected files.	2026-05-25 05:50:29 -07:00
alt-glitch	f5bb595d51	chore(release): map 8bit64k + hclsys in AUTHOR_MAP	2026-05-25 12:48:46 +00:00
alt-glitch	85a0b3424e	test(tui): regression test for /q alias resolving to queue (#31983) Adapted from @hclsys's test in PR #31985. Asserts findSlashCommand('q') resolves to the queue command, not quit.	2026-05-25 12:48:46 +00:00
8bit64K	064ac28cbd	fix(tui): remove 'q' alias from /quit, add to /queue The TUI frontend's slash command registry shadowed /queue's 'q' alias with /quit's 'q' alias. Since /quit appeared later in the registry, the flat lookup kept the later entry, making /q always quit instead of queueing a prompt. This mirrors the backend fix in PR #10538 (hermes_cli/commands.py) but applies the same correction to the TUI TypeScript registry. Fixes #10467	2026-05-25 12:48:46 +00:00
Teknium	8191f663dd	feat(mcp-oauth): accept 'skip' at paste prompt to bypass auth without disabling server (#32069 ) When an MCP server triggers OAuth at startup, the user can now type 'skip' (or 'cancel', 's', 'n', 'no', 'q', 'quit') at the paste prompt + Enter to exit the flow cleanly and continue agent startup without that server. Previously the only ways to bypass an unwanted OAuth prompt were: - Wait the full 5-minute paste timeout - Ctrl+C (also kills the whole reload, may leave half-state) - Edit config.yaml to set 'enabled: false' on the server Skip writes a sentinel to result['error'] which _wait_for_callback maps to OAuthNonInteractiveError('user_skipped'). mcp_tool already classifies that as an auth error in _is_auth_error() and the reconnect loop logs it as 'not retrying automatically' — server stays disconnected for the session, other MCP servers continue normally, no infinite retry burn. The skip message tells users how to re-auth later ('hermes mcp login') or disable persistently ('enabled: false'), so they don't have to remember. 14 new tests covering: case-insensitive skip parsing, all 7 skip tokens, skip not stomping an HTTP-listener win, skip routed to skip path rather than URL-parse path, sentinel mapped to OAuthNonInteractiveError, prompt mentions the skip option.	2026-05-25 05:37:30 -07:00
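The case-insensitive skip parsing mentioned in the commit above might look like the following. The seven tokens come from the commit message; the names `SKIP_TOKENS` and `is_skip_input` are hypothetical and chosen only for this illustration.

```python
# The seven skip tokens listed in the commit message.
SKIP_TOKENS = frozenset({"skip", "cancel", "s", "n", "no", "q", "quit"})


def is_skip_input(raw: str) -> bool:
    """True when the pasted line is a skip token, ignoring case and whitespace."""
    return raw.strip().lower() in SKIP_TOKENS
```

Checking for a skip token before attempting to parse the input as a URL keeps the skip path and the URL-parse path separate, which is one of the behaviours the commit's tests cover.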
Teknium	bdf3696705	docs(mcp-oauth): document paste-back flow and SSH options for remote MCP OAuth (#32067) Follow-up to #32053. The OAuth-over-SSH guide and the MCP feature page previously only covered xAI and Spotify. Now that MCP servers can complete OAuth via stdin paste-back on remote/headless hosts, document it. oauth-over-ssh.md: - Add MCP servers to the 'Which Providers Need This' table. - New 'MCP Servers' section covering: paste-back (no setup, works anywhere), SSH port forward (same pattern as xAI/Spotify), and the 30s config-auto-reload race pitfall (use 'hermes mcp login <server>' from a fresh terminal instead of editing config from inside a running session). mcp.md: - New 'OAuth-authenticated HTTP servers' section under HTTP servers, covering auth: oauth config, token cache path, paste-back vs SSH tunnel for headless hosts, and the same reload-race pitfall. - Cross-links to the OAuth-over-SSH guide anchor.	2026-05-25 05:35:47 -07:00
Teknium	1c3c364287	feat(cli): show live background terminal-process count in status bar (#32061 ) The CLI status bar tracked /background agent tasks (▶ N) but not shell processes spawned via terminal(background=true). Both kinds of work can run concurrently and a user has no in-bar signal for shell processes. Add an independent indicator (⚙ N) sourced from tools.process_registry.process_registry._running. The two indicators render side-by-side when both are active (▶ 1 │ ⚙ 2), hidden when their count is zero. Renders at all four status-bar tiers (text fallback + prompt_toolkit fragments, narrow + wide widths). The narrow <52 tier still drops both for space — unchanged. New ProcessRegistry.count_running() returns len(_running) without acquiring _lock; CPython dict len is atomic and we're polling on every status-bar tick, so lock-free is the right tradeoff.	2026-05-25 05:35:02 -07:00
teknium1	2b16de0ec3	chore(release): map adam91holt for PR #31984 salvage	2026-05-25 05:34:42 -07:00
adam91holt	8601c4d44c	fix(codex): add time-to-first-byte watchdog for stalled Codex streams The chatgpt.com/backend-api/codex endpoint has an intermittent failure mode where it accepts the connection but never emits a single stream event — the socket just hangs. Direct sequential probing reproduces it (0 events, no HTTP status), and a fresh reconnect then succeeds in ~2s. Today the only guard is the wall-clock stale timeout in interruptible_api_call, so a dead-on-arrival connection is held for the full stale window (90-900s depending on context / config) before the retry loop can reconnect — minutes of wasted wall time per stall, at a rate of ~20% of calls during affected windows. Add a TTFB watchdog scoped to the codex_responses path: - codex_runtime.run_codex_stream stamps agent._codex_stream_last_event_ts on every stream event (not just output-text deltas), so reasoning-only and tool-call-only turns are not mistaken for a stall. - interruptible_api_call resets that marker before the worker starts and, while it is still None, kills the connection once elapsed exceeds the TTFB cutoff (default 45s, tunable via HERMES_CODEX_TTFB_TIMEOUT_SECONDS, 0 disables). The raised TimeoutError flows through the existing retry path unchanged. Once any event has arrived the stream is healthy and only the existing wall-clock stale timeout applies, so legitimate long generations are never interrupted. Gated to codex_responses; the chat_completions non-stream, anthropic and bedrock branches have no first-event signal and are untouched. Adds tests/agent/test_codex_ttfb_watchdog.py covering the stall kill, the events-flowing pass-through, and the env-disable path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 05:34:42 -07:00
Teknium	a989a79c0c	fix(gateway): allow native delivery of freshly-produced agent files (#32060 ) The gateway's media delivery allowlist required files live inside `~/.hermes/cache/{documents,images,...}`, which is the wrong shape for real agent usage. Agents naturally produce artifacts via terminal tools (`pandoc -o /tmp/report.pdf`, `matplotlib savefig`, etc.) or write_file into project directories — these never land under the cache. Result: users got a raw file path in chat instead of an attachment. This is doubly bad in deployment shapes where the cache directories aren't writable by the agent at all: Hermes running in Docker with a read-only mount, or with a Docker/Modal/SSH terminal backend whose filesystem isn't the gateway host's filesystem. Layered trust model: 1. Cache-dir allowlist (unchanged) — Hermes-managed roots always trusted. 2. Operator allowlist — `HERMES_MEDIA_ALLOW_DIRS` env var, now also surfaced as `gateway.media_delivery_allow_dirs` in config.yaml. 3. Recency-based trust (new, default on) — files whose mtime is within `gateway.trust_recent_files_seconds` (default 600s) of "now" are trusted even outside the cache/operator allowlist. Old host files (`/etc/passwd`, `~/.bashrc`, `~/.ssh/id_rsa`) have mtimes measured in days/months, well outside the window — prompt-injection paths pointing at pre-existing files are still rejected. 4. Hard denylist — `/etc`, `/proc`, `/sys`, `/dev`, `/root`, `/boot`, `/var/{log,lib,run}`, plus `$HOME/.{ssh,aws,gnupg,kube,docker,config, azure,gcloud}` and `Library/Keychains`. Denylist blocks delivery even when recency would trust the file, in case an attacker somehow refreshes a sensitive file's mtime. Operators who want strict-allowlist behavior set `gateway.trust_recent_files: false` and the system reverts to pre-existing behavior. Tests: 6 new cases in test_platform_base.py cover the recency window, disabled mode, system-path denylist, and the motivating PDF-in-project scenario. 
3 existing tests (test_platform_base, test_tts_media_routing, test_send_message_tool) that exercised the strict-allowlist path are updated to disable recency trust explicitly. E2E validation: real `validate_media_delivery_path()` accepts fresh PDFs in /tmp and project dirs, rejects /etc/passwd, ~/.ssh/id_rsa, and files older than the window; config.yaml `gateway.*` keys bridge correctly to the env vars the validator reads.	2026-05-25 05:34:31 -07:00
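The layered trust model from the commit above can be sketched as an ordered series of checks. Everything here is an assumption made for illustration: `is_deliverable`, its parameters, and the shortened denylist are hypothetical and do not reproduce `validate_media_delivery_path()`. In particular, the commit does not say whether the denylist also overrides the allowlists; this sketch lets the allowlists win.

```python
import os
import time
from pathlib import Path

# Illustrative subset of the denylist named in the commit message.
DEFAULT_DENY_DIRS = ("/etc", "/proc", "/sys", "/dev", "/root", "/boot")


def _is_within(path: Path, root: Path) -> bool:
    return path == root or root in path.parents


def is_deliverable(path, allow_dirs=(), deny_dirs=DEFAULT_DENY_DIRS,
                   window_seconds=600.0, now=None) -> bool:
    """Hypothetical layered check: allowlist, then denylist, then recency."""
    real = Path(os.path.realpath(path))  # resolve symlinks before any check
    if not real.is_file():
        return False
    # Layers 1 and 2: cache roots and operator-configured roots are trusted.
    if any(_is_within(real, Path(os.path.realpath(d))) for d in allow_dirs):
        return True
    # Layer 4: the denylist blocks delivery even for freshly modified files.
    if any(_is_within(real, Path(os.path.realpath(d))) for d in deny_dirs):
        return False
    # Layer 3: trust files modified within the recency window.
    now = time.time() if now is None else now
    return (now - real.stat().st_mtime) <= window_seconds
```

Passing `window_seconds=0` with an old file approximates the strict-allowlist behaviour the commit describes for `gateway.trust_recent_files: false`.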
Teknium	0ff7c09e2f	feat(mcp-oauth): stdin paste-back fallback for headless OAuth flow (#32053 ) When the user runs OAuth on a remote/SSH machine without a port forward, the OAuth provider redirects to http://127.0.0.1:<port>/callback which only the listener on the remote machine can receive — the user's browser on another box just shows a connection error. _wait_for_callback() now races the HTTP listener against a stdin reader on interactive TTYs. The user can copy the URL from the browser's address bar after authorization (which contains code=...&state=...) and paste it back at the prompt. Whichever fills the result dict first wins; the HTTP listener remains the primary path for local sessions and SSH tunnels. Accepts any of: - Full local redirect URL: http://127.0.0.1:N/callback?code=...&state=... - Provider URL after redirect: https://mcp.linear.app/callback?code=...&state=... - Just the query string: ?code=...&state=... or code=...&state=... The paste thread only spawns when _is_interactive() is true, preserving the existing 'no input() in headless runs' invariant — verified by TestWaitForCallbackPasteIntegration.test_paste_prompt_NOT_shown_when_noninteractive. The SSH-session hint in _redirect_handler is updated to surface the paste option as the primary remedy, with ssh -L tunneling as the alternative.	2026-05-25 05:20:05 -07:00
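The three accepted paste shapes in the commit above (full redirect URL, provider URL, bare query string) can all be handled by one small parser. This is a sketch using only the standard library; `parse_pasted_callback` is a hypothetical name and the real handling inside `_wait_for_callback()` may differ.

```python
from urllib.parse import parse_qs, urlsplit


def parse_pasted_callback(pasted: str):
    """Extract (code, state) from a full URL or a bare query string."""
    text = pasted.strip()
    # A full URL carries the parameters in its query component;
    # otherwise treat the whole input as the query string.
    query = urlsplit(text).query if "://" in text else text.lstrip("?")
    params = parse_qs(query)
    code = params.get("code", [None])[0]
    state = params.get("state", [None])[0]
    return code, state
```

A caller would still need to compare the returned `state` with the value it generated for the request before exchanging the code.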
teknium1	e9119e0eb8	chore(release): map dsr-restyn + WuKongAI-CMU + codeblackhole1024 for S04 cluster	2026-05-25 05:15:55 -07:00
codeblackhole1024	bd2756dd22	fix(update): reject symlink members in update ZIP _update_via_zip downloads a source ZIP from GitHub and calls zipfile.ZipFile.extractall. The existing zip-slip path guard validates each member's path stays under tmp_dir, but does not check member type — so a ZIP containing a symlink member would still be materialized by extractall, and a symlink target could point outside the extracted tree (or to a sensitive system path). This isn't a high-likelihood threat for hermes-agent's actual GitHub source ZIPs (we don't ship symlinks), but the extractall path runs as the user's account and a compromised mirror could plant arbitrary files via the symlink → target → write chain. Reject any member whose Unix mode bits (upper 16 bits of external_attr) are S_IFLNK before extractall. Hermes source ZIPs contain only regular files and directories; a symlink member is unambiguously suspicious. Regression tests cover: symlink member rejection (raises ValueError, caught by the outer try/except as a clean SystemExit, no extraction), and the happy-path verification that a normal ZIP doesn't trigger the symlink reject message. Salvaged from PR #15881 by @codeblackhole1024. The remaining pieces of that PR were already on main or contradicted explicit design decisions: - config.yaml write-deny: already in agent/file_safety.py's control_file_names denylist (the modern guard); the proposed addition to build_write_denied_paths was the legacy path. - Quick commands danger detection: contradicts the explicit cli.py:8491-8492 comment 'shell=True is intentional: quick_commands are user-defined shell snippets from config.yaml — not agent/LLM controlled.' - Memory plugin shlex.split for dep checks: already on main (hermes_cli/memory_setup.py:133). Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 05:15:55 -07:00
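The member-type check described in the commit above relies on the Unix mode being stored in the upper 16 bits of `ZipInfo.external_attr`. A minimal sketch of that check, run before any extraction; the function name `reject_symlink_members` is hypothetical and the real code in `_update_via_zip` may be structured differently.

```python
import stat
import zipfile


def reject_symlink_members(zf: zipfile.ZipFile) -> None:
    """Raise ValueError if any archive member is a symlink."""
    for info in zf.infolist():
        # Unix mode bits live in the upper 16 bits of external_attr.
        mode = info.external_attr >> 16
        if stat.S_ISLNK(mode):
            raise ValueError(f"symlink member rejected: {info.filename}")
```

Calling this on the opened archive before `extractall` means a rejected ZIP is never partially extracted.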
aaronlab	5f20322d23	fix(tts): reject '..' traversal in output_path text_to_speech_tool accepts an explicit output_path. Without a traversal guard, a path containing '..' components (whether prompt-injection- controlled, from a confused skill, or just a buggy caller) could escape its declared base and write the audio to a system location — e.g. `output_path='audio/../../etc/cron.d/x'` lands the file outside the intended audio cache. Reject '..' components in the user-supplied path. Explicit absolute paths are unchanged (the agent legitimately writes audio wherever the user/caller asks); only traversal-style escapes are blocked. The terminal tool can still write anywhere with approval — this just keeps the unattended TTS surface from materializing files via traversal. Regression tests cover: '..' in the middle (audio/../../etc/...), bare '..' prefix, and the negative cases (absolute paths + relative paths without '..' both pass through unchanged). Salvaged from PR #6693 by @aaronlab. The original PR confined output to DEFAULT_OUTPUT_DIR-or-cwd, which broke 9 existing tests that legitimately write to tmp_path locations. The traversal-only check covers the actual threat (path-escape via '..' from prompt injection) without restricting where users can choose to write their audio. The remaining pieces of #6693 (skill_commands rglob symlink rejection, delegate_tool batch prefix display) are dropped: - skill_commands rglob: breaks the documented design supporting ~/.hermes/skills/<name> as a symlink to a checked-out skill elsewhere (see comment at agent/skill_commands.py:73-75) - delegate_tool batch prefix: pure UX, doesn't belong in a security PR Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 05:15:55 -07:00
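A traversal-only guard of the kind this commit describes might look like the sketch below. The helper name `has_parent_traversal` is hypothetical; the actual check in `text_to_speech_tool` may be written differently.

```python
from pathlib import PurePath


def has_parent_traversal(path: str) -> bool:
    """True if any component of the path is exactly '..'."""
    # PurePath does not collapse '..', so the components can be inspected
    # without touching the filesystem.
    return ".." in PurePath(path).parts


for candidate in ("audio/../../etc/cron.d/x", "../x", "/tmp/out.mp3", "audio/out.mp3"):
    verdict = "rejected" if has_parent_traversal(candidate) else "allowed"
    print(f"{candidate}: {verdict}")
```

Checking components rather than searching for the substring `..` keeps names such as `a..b/file.mp3` valid, and absolute paths pass through unchanged, which matches the behavior the message describes.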
daimon-nous[bot]	ac5359a3f3	fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998 ) (#32012 ) * fix(streaming): route mid-tool-call partial-stream-stub through length continuation (#31998) When a stream stalls mid-tool-call (e.g. a large write_file), the partial-stream-stub recovery used finish_reason='stop' which caused the conversation loop to treat the turn as complete, returning only the warning text. When users said 'continue', the model retried the same large tool call, hit the same stale timeout, and looped indefinitely. Changes: - chat_completion_helpers.py: change _stub_finish_reason from 'stop' to 'length' for mid-tool-call partials. The stub still has tool_calls=None so no tool auto-executes — the model gets a fresh API call through the existing length-continuation machinery (bounded to 3 retries). Also attach _dropped_tool_names to the stub for downstream use. - conversation_loop.py: add a third continuation prompt branch for partial-stream-stubs with dropped tool calls. Instead of the generic 'continue where you left off' (which would retry the same large call), tell the model to break the output into smaller tool calls (~8K tokens each) to avoid stream timeouts. - test_partial_stream_finish_reason.py: update existing test from finish_reason='stop' to 'length', add _dropped_tool_names assertion, add new test_dropped_tool_call_uses_chunking_prompt for the 3-way prompt branching. Safety: tool_calls=None is preserved on the stub, so the conversation loop enters the text-continuation branch (line 1513), NOT the tool-call execution branch (line 3246). No tool auto-executes. The model simply gets another API call with targeted guidance. 
* refactor: extract constants and continuation prompt helper - Move magic strings to hermes_constants.py (PARTIAL_STREAM_STUB_ID, FINISH_REASON_LENGTH) - Extract _get_continuation_prompt() in conversation_loop.py — DRYs the 3-way prompt branching and lets tests import the real function - Trim verbose inline comments in chat_completion_helpers.py - Tests import constants + helper instead of duplicating logic --------- Co-authored-by: alt-glitch <balyan.sid@gmail.com>	2026-05-25 17:43:10 +05:30
nguyen binh	46d8b5dadf	fix(profile): reject symlinks in distributions (#25292)	2026-05-25 05:07:58 -07:00
nguyen binh	0d55315c36	fix(backup): skip symlinked files in zip archives (#25289)	2026-05-25 05:07:52 -07:00
Teknium	79799c80f5	test(approval): patch _YOLO_MODE_FROZEN directly in test_yolo_overrides_cron_deny The test set HERMES_YOLO_MODE=1 via monkeypatch.setenv, expecting check_dangerous_command() to honor yolo and bypass cron_mode=deny. But tools.approval._YOLO_MODE_FROZEN is intentionally frozen at module import time (security: prevents prompt-injection runtime escalation). When CI imports the module BEFORE the test sets the env, the frozen value stays False and the yolo bypass never activates. Local runs missed this because the conftest leaked a non-empty HERMES_YOLO_MODE into the import-time env. CI's clean-env path exposed the bug deterministically on test (3) / test (4) shards. Fix: patch the module attribute directly via mock.patch.object so the test simulates process-startup-with-yolo regardless of import order. The behavior under test (yolo bypasses cron_mode=deny for non-hardline commands) is unchanged; the security invariant (_YOLO_MODE_FROZEN can't be set at runtime by skills) is preserved. Reproduced locally with: env -i HOME=$HOME PATH=$PATH python3 -m pytest tests/tools/test_cron_approval_mode.py -o 'addopts=' -v Without the fix: 1 failed, 23 passed. With the fix: 24 passed.	2026-05-25 05:07:49 -07:00
Peter	95848b1cbc	fix(transcription): reject symlinked audio inputs (#10082 ) * fix(transcription): reject symlinked audio inputs Validation runs before provider selection, so rejecting symbolic-link paths there prevents supported-extension links from being treated as normal audio files. Use os.path.islink to avoid perturbing the existing Path.stat error path and to reject links before resolving targets. Constraint: Keep validation platform-safe and avoid requiring symlink support where unavailable. Rejected: Use Path.is_symlink \| it consumes pathlib stat calls and broke the existing stat error regression. Confidence: high Scope-risk: narrow Directive: Keep path hardening in _validate_audio_file before provider dispatch. Tested: source venv/bin/activate && python -m pytest tests/tools/test_transcription_tools.py::TestValidateAudioFileEdgeCases -q (5 passed) Tested: source venv/bin/activate && python -m pytest tests/tools/test_transcription_tools.py::TestValidateAudioFileEdgeCases tests/tools/test_transcription_tools.py::TestTranscribeAudioDispatch::test_invalid_file_short_circuits -q (6 passed) Tested: source venv/bin/activate && python -m compileall tools/transcription_tools.py tests/tools/test_transcription_tools.py Tested: git diff --check Not-tested: Full tests/tools/test_transcription_tools.py under .[dev] only; existing faster_whisper optional dependency tests fail with ModuleNotFoundError. * Keep transcription tests independent of optional whisper install The transcription suite mocks faster-whisper directly, so a minimal test stub keeps the branch verifiable in environments where the optional package is not installed. This preserves the existing mock-based coverage without adding a dependency. 
Constraint: faster-whisper is an optional local STT dependency and is absent from the current validation environment Rejected: Install faster-whisper just for branch validation \| would add heavyweight environment coupling outside the patch scope Confidence: high Scope-risk: narrow Directive: Keep this as a test-only stub unless production import semantics change Tested: pytest tests/tools/test_transcription_tools.py -q --------- Co-authored-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com>	2026-05-25 05:07:45 -07:00
Peter	ee59ef1946	fix: reject read_file symlinks to blocking devices (#10133 ) * fix: reject read_file symlinks to blocking devices The read_file guard already refused direct device paths such as /dev/zero, but a workspace symlink resolving to one of those devices could still reach the shell-backed read path and hang on wc/head/sed. Keep the literal alias check and add a resolved-path pass so local symlinks to blocked device/fd endpoints are rejected before I/O. Constraint: Preserve literal /dev/stdin handling before terminal-specific realpath resolution Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py tests/tools/test_file_tools.py -q; python -m compileall tools/file_tools.py tests/tools/test_file_read_guards.py; git diff --check Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> * Keep file guard tests off sensitive macOS temp paths The branch now inherits a sensitive-path write guard from upstream main. On macOS, tempfile.mkdtemp() resolves under /private/var/folders, so the new write-path guard fired before the file read dedup assertions could exercise their intended behavior. The tests now create their scratch files inside the worktree temp checkout, outside those system-sensitive prefixes, without changing production behavior. Constraint: Rebased branch must pass the expanded file read guard suite on macOS. Rejected: Loosen the production sensitive-path prefix list \| broader behavior change unrelated to this PR. Confidence: high Scope-risk: narrow Tested: pytest tests/tools/test_file_read_guards.py -q --------- Signed-off-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com> Co-authored-by: WuKongAI-CMU <210765158+WuKongAI-CMU@users.noreply.github.com>	2026-05-25 05:07:38 -07:00
Dakota Secula-Rosell	b7b8bec800	fix(security): block /proc/*/environ, cmdline, maps from file read (#4609) The read_file tool and terminal cat can access /proc/self/environ to recover all process env vars including secrets stripped by the subprocess blocklist. Output redaction partially mitigates (catches known-format tokens) but misses custom/proprietary key formats, especially when values are printed without their key names. Add /proc/*/environ, /proc/*/cmdline, and /proc/*/maps to the blocked device paths in _is_blocked_device(): - /proc/*/environ: leaks full process env (API keys, tokens) - /proc/*/cmdline: leaks command-line args (may contain passwords) - /proc/*/maps: leaks memory layout (ASLR bypass for exploitation) Legitimate /proc reads (cpuinfo, meminfo, uptime, version) remain accessible — the check only blocks per-pid pseudo-files with known sensitive suffixes. Complements PR #4432 (PID namespace isolation for child processes) which prevents children from reading the parent's /proc, but does not prevent the parent process itself from being read via file tools. Partially addresses #4427 Changes: tools/file_tools.py \| +6 tests/tools/test_file_read_guards.py \| +18 -1 Co-authored-by: dsr-restyn <dsr-restyn@users.noreply.github.com>	2026-05-25 05:07:31 -07:00
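The per-pid suffix check can be illustrated with a small sketch. The function name `is_sensitive_proc_path` and the regex are assumptions made for illustration; the real logic lives in `_is_blocked_device()` and may match paths differently.

```python
import posixpath
import re

# Hypothetical pattern: a pid (or "self") followed by a sensitive pseudo-file.
_SENSITIVE_PROC = re.compile(r"^/proc/(?:self|\d+)/(?:environ|cmdline|maps)$")


def is_sensitive_proc_path(path: str) -> bool:
    """True for per-pid pseudo-files that can leak secrets or memory layout."""
    normalized = posixpath.normpath(path)  # collapses '//' and '..' segments
    return _SENSITIVE_PROC.match(normalized) is not None


for candidate in ("/proc/self/environ", "/proc/1234/maps", "/proc/cpuinfo"):
    verdict = "blocked" if is_sensitive_proc_path(candidate) else "allowed"
    print(f"{candidate}: {verdict}")
```

Anchoring the pattern on both ends keeps system-wide files such as `/proc/cpuinfo` and `/proc/meminfo` readable.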
Teknium	4909dd84c1	chore(release): map 66773372+Tranquil-Flow@users.noreply.github.com to Tranquil-Flow (PR #27518)	2026-05-25 05:07:11 -07:00
Evi Nova	1b12cd5241	fix(cli): bracketed-paste timeout prevents permanent input freeze (#16263 ) When the terminal drops the ESC[201~ end mark during a bracketed paste (terminal race, torn write, SSH glitch, macOS sleep/wake), prompt_toolkit's Vt100Parser keeps buffering all later input in _paste_buffer forever. From the user's perspective, the CLI appears frozen — the only recovery was closing the tab/session. This patch monkey-patches Vt100Parser.feed() so that bracketed-paste mode flushes buffered content as a normal BracketedPaste event after 2 seconds without an end marker, then restores normal parsing. Includes 8 regression tests covering normal paste, timeout recovery, torn end marks, and edge cases. Surgical reapply of PR #27518. Original branch was many months stale (1193 files / 172k LOC of unrelated reverts); the substantive ~77 LOC patch in cli.py plus the new 157-line test file were reapplied onto current main with the contributor's authorship preserved via --author.	2026-05-25 05:07:11 -07:00
Teknium	8697471419	test(cli): cover KeyboardInterrupt guard around slash command dispatch 4 tests: KBI during slash command does not set _should_exit; truthy return keeps session alive; falsy return still sets exit (legit /exit path); non-KBI exceptions propagate normally.	2026-05-25 05:06:06 -07:00
ygd58	63d6b9e637	fix(cli): catch KeyboardInterrupt during slash commands to prevent session exit A Ctrl+C during a slow slash command (e.g. /skills browse on a large skill tree, /sessions list against a multi-GB SQLite DB) used to unwind past self.process_command() to the outer prompt_toolkit event loop, which killed the entire session — losing all conversation state. Fix: wrap the slash-command dispatch in try/except KeyboardInterrupt so Ctrl+C aborts the command but the prompt loop continues. Other exceptions still propagate so real bugs aren't silently swallowed. Surgical reapply of PR #5189. Original branch was many months stale (3764 files / 1M+ LOC of unrelated reverts); the substantive ~6 LOC change in cli.py was reapplied by hand onto current main with the contributor's authorship preserved via --author.	2026-05-25 05:06:06 -07:00
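The guard around slash-command dispatch can be sketched as below. `run_slash_command` is a hypothetical wrapper, and the convention that a truthy result keeps the session alive is taken from the accompanying test commit.

```python
def run_slash_command(handler, command: str) -> bool:
    """Run one slash command; return True to keep the session alive."""
    try:
        return bool(handler(command))
    except KeyboardInterrupt:
        # Ctrl+C aborts only this command; the prompt loop keeps running.
        # Other exceptions are deliberately not caught, so real bugs surface.
        return True


def interrupted_handler(command: str) -> bool:
    """Stand-in for a slow command that the user interrupts with Ctrl+C."""
    raise KeyboardInterrupt


print(run_slash_command(interrupted_handler, "/skills browse"))
```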
Teknium	ee7789e547	chore(release): map simo.kiihamaki@gmail.com to SimoKiihamaki (PR #30773)	2026-05-25 05:06:03 -07:00
simokiihamaki	fae815adc2	fix(cli): prevent /reset and /new freeze on Windows by falling back to stdin prompt On Windows (PowerShell/Windows Terminal), the queue-based modal used for destructive slash command confirmations deadlocks because prompt_toolkit's input channel becomes unresponsive when entered from the process_loop daemon thread. Keystrokes never reach the key bindings, so response_queue.get() blocks until the 120-second timeout expires. Fix: fall back to _prompt_text_input (stdin-based) when: 1. sys.platform == 'win32' — Windows console doesn't support the modal reliably 2. Called from non-main thread — key bindings can't fire from daemon threads 3. self._app is not set — existing behavior for tests/non-interactive This mirrors the thread-aware guard from _prompt_text_input (PR #23454). 9 new regression tests covering Windows detection, non-main thread fallback, macOS/Linux modal preservation, and integration with _confirm_destructive_slash. Fixes #30768 Surgical reapply of PR #30773. Original branch was many months stale (911 files / 146k LOC of unrelated reverts); the substantive ~30 LOC change in cli.py plus the new test file were reapplied onto current main with the contributor's authorship preserved via --author.	2026-05-25 05:06:03 -07:00
Tranquil-Flow	b1adb95038	fix(codex): surface actionable hint when stale-call detector fires on known silent-reject pattern The ChatGPT Codex backend (chatgpt.com/backend-api/codex) has historically silently dropped certain model requests: the connection is accepted but no stream events are emitted and no error is raised. PR #31967 lowered the implicit stale-call default from 300s to 90s so fallbacks kick in faster, but users still see an opaque "No response from provider for 90s (non-streaming, ...)" message that gives no path forward. This patch adds a narrow heuristic — gpt-5.5 family on the Codex backend via codex_responses api_mode — that substitutes the generic timeout message with actionable text naming the gpt-5.4-codex workaround and pointing at #21444 for symptom history. Changes: - run_agent.py — new ``AIAgent._codex_silent_hang_hint(model=...)`` method. Returns ``None`` for any request that does not match all three guards (codex_responses api_mode, openai-codex provider or chatgpt.com Codex base URL, gpt-5.5-family model name with word-boundary regex anchoring to avoid false-positives on e.g. ``gpt-5.50``). - agent/chat_completion_helpers.py — the non-stream stale-call site consults the hint via ``getattr(...)`` so the call site stays robust if the helper is ever removed or stubbed in tests. Hint is appended to both the ``_emit_status`` warning and the ``TimeoutError`` message so the user sees it in their terminal AND it lands in any retry-loop diagnostics. - tests/run_agent/test_codex_silent_hang_hint.py — 10 regression tests covering positive cases (bare gpt-5.5, vendor-prefixed openai/gpt-5.5, gpt-5.5-codex SKU, model=None fallback to self.model) and negative cases (gpt-5.4-codex workaround, gpt-5.50 false-positive guard, non-codex api_mode, non-codex provider, empty/None model, unrelated models on Codex). Does NOT fix the backend-side issue (that's an upstream OpenAI/ChatGPT problem we cannot patch from here). 
Only converts an opaque timeout into text that names the workaround so users do not have to dig through logs or wait for a forum post to learn what to do. Closes #22046	2026-05-25 04:49:22 -07:00
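The word-boundary anchoring mentioned for the model-name guard can be illustrated like this. The regex and the name `looks_like_gpt_5_5` are assumptions for illustration; the actual pattern inside `_codex_silent_hang_hint` is not shown in the commit message.

```python
import re
from typing import Optional

# \b after the final "5" prevents a match on names such as "gpt-5.50".
_GPT_5_5 = re.compile(r"\bgpt-5\.5\b")


def looks_like_gpt_5_5(model: Optional[str]) -> bool:
    """True when the model name belongs to the gpt-5.5 family."""
    return bool(model) and _GPT_5_5.search(model) is not None


for name in ("gpt-5.5", "openai/gpt-5.5", "gpt-5.5-codex", "gpt-5.50", "gpt-5.4-codex"):
    print(f"{name}: {looks_like_gpt_5_5(name)}")
```

This covers only the model-name guard; the hint described above also requires the `codex_responses` api_mode and a Codex provider or base URL.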
teknium1	4c64638897	chore(release): map liuhao1024 for PR #20778 salvage	2026-05-25 03:40:47 -07:00
liuhao1024	ba3c450914	fix(security): block read_file on project-local .env files get_read_block_error() only blocked internal Hermes cache files but allowed reading project-local secret-bearing environment files (.env, .env.production, .env.local, etc.) through both read_file and ACP fs/read_text_file paths. Add a basename deny set for common secret-bearing .env variants. .env.example remains readable as documentation. Fixes #20734	2026-05-25 03:40:47 -07:00
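A basename deny set of the kind described here could be sketched as follows. The set contents and the name `is_blocked_env_file` are illustrative only; the commit names `.env`, `.env.production`, and `.env.local` as examples, and the full list is not given.

```python
import os

# Illustrative subset; the real deny set may include more variants.
BLOCKED_ENV_BASENAMES = frozenset({".env", ".env.local", ".env.production"})


def is_blocked_env_file(path: str) -> bool:
    """True when the file name matches a known secret-bearing .env variant."""
    return os.path.basename(path) in BLOCKED_ENV_BASENAMES


for candidate in ("project/.env", "project/.env.production", "project/.env.example"):
    verdict = "blocked" if is_blocked_env_file(candidate) else "readable"
    print(f"{candidate}: {verdict}")
```

Matching exact basenames instead of a `.env*` prefix is what keeps `.env.example` readable as documentation.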
teknium1	51c913caf7	chore(release): map dusterbloom for PR #25726 salvage	2026-05-25 03:40:47 -07:00
dusterbloom	79fc92e9cb	fix(security): tighten .env file permissions to 0600 at all creation sites .env holds API keys and secrets. Multiple creation sites used `cp` / `touch` / `shutil.copy2` which obey the process umask — commonly 0o022, leaving the file at 0o644 (world-readable). Apply chmod 0o600 explicitly at every site that creates or copies .env. Sites covered: - docker/stage2-hook.sh: after the seed_one '.env' call, applied unconditionally (not just on first-seed) so a host-mounted .env with loose perms gets tightened on every container restart - hermes_cli/doctor.py: 'hermes doctor --fix' touches an empty .env when missing - hermes_cli/profiles.py: 'hermes profile create --clone' copies .env from the source profile; shutil.copy2 preserves source mode, so a source .env at 0o644 was being cloned into 0o644 - setup-hermes.sh: in-tree setup script's cp .env.example .env path, plus the already-exists branch (mirror of install.sh which already chmods 600 unconditionally on line 1442) scripts/install.sh was NOT changed — it already chmod 600's the .env unconditionally after the create/already-exists branches (line 1442). Salvaged from PR #25726 by @dusterbloom. The docker/entrypoint.sh portion of the original PR was dropped because main switched to an s6-overlay shim — the .env creation logic moved to stage2-hook.sh, which is where the chmod now lives. Closes #25497 (subset — install.sh + setup-hermes.sh) and #8448 (subset — install.sh only) as superseded. Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-25 03:40:47 -07:00
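The copy-then-tighten pattern applied at the Python creation sites can be sketched as below. `copy_env_private` is a hypothetical helper, not a function from the repository, and the mode check assumes a POSIX filesystem.

```python
import os
import shutil
import stat


def copy_env_private(src: str, dst: str) -> None:
    """Copy an .env file, then restrict it to owner read/write only."""
    shutil.copy2(src, dst)  # copy2 preserves the source mode, e.g. 0o644
    os.chmod(dst, 0o600)    # so tighten explicitly after the copy


def mode_of(path: str) -> str:
    """Return the permission bits of a file as an octal string."""
    return oct(stat.S_IMODE(os.stat(path).st_mode))
```

Calling `os.chmod` unconditionally after the copy means the result no longer depends on the process umask or on the permissions of the source file.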


9546 commits