hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Author	SHA1	Message	Date
teknium1	ea95fdd6d7	chore(release): add nikshepsvn to AUTHOR_MAP for PR #27426 salvage	2026-06-30 03:41:46 -07:00
nikshepsvn	d82a69b624	fix(tools): prune acp_command from delegate_task schema when no ACP CLI is on PATH Defense-in-depth follow-up to the runtime guard added in the previous commit. Models on headless hosts (Railway / Fly / Docker / fresh VPS) without any ACP CLI installed occasionally hallucinate ``acp_command="copilot"`` from the schema description, despite the explicit "Do NOT set" instruction. The runtime guard prevented the crash but the model still wasted a tool turn and got an opaque silent fallback. This commit removes the temptation at its source: ``_build_dynamic_schema_overrides`` now strips ``acp_command`` and ``acp_args`` from both the top-level and per-task schemas when none of the known ACP CLIs (``copilot``, ``claude``, ``codex``) are detectable on PATH. The model literally never sees the fields, so it cannot pass them. The runtime guard from the previous commit stays in place as defense-in-depth for internal callers, tests, and any future code path that bypasses the schema. ``_acp_binary_available`` is intentionally NOT cached: ``shutil.which`` is cheap, and avoiding the cache means the schema reacts to mid-session installs without requiring a process restart. Tests: - ``test_schema_prunes_acp_command_when_no_acp_binary`` - ``test_schema_keeps_acp_command_when_binary_available`` - ``test_acp_binary_available_checks_known_clis`` Full ``test_delegate.py`` suite: 136/136 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-30 03:41:46 -07:00
nikshepsvn	2e0b591076	fix(tools): validate acp_command binary exists before forcing copilot-acp transport When a model passes `acp_command="copilot"` (or any other binary name) in a `delegate_task` tool call, `_build_child_agent` unconditionally sets `effective_provider = "copilot-acp"`, which routes the subagent through `CopilotACPClient`. That client spawns the named binary via subprocess; if it isn't on PATH, every retry raises RuntimeError and an asyncio cleanup race during error delivery can take the entire gateway down. This is a real failure mode on headless deploys (Railway / Fly / VPS / Docker) where `copilot` / `claude` / etc. aren't installed. The schema does say "Do NOT set unless the user explicitly told you an ACP CLI is installed," but models occasionally pass it anyway — particularly for X (Twitter) search prompts where Grok seems to associate ACP with "search assistance." Reproduction: - Headless install (no `copilot` binary on PATH) - Set provider to xai-oauth + model grok-4.3 - Telegram prompt: "Search X for crypto twitter trends" - Grok decides to delegate and passes `acp_command="copilot"` - Subagent crashes 3x, gateway crashes on the 3rd retry teardown Fix: validate the binary exists on PATH via `shutil.which` before honoring the override. If missing, log a warning and fall through to the parent's default transport. No behavior change when the binary IS present (covered by `test_build_child_agent_honors_acp_command_when_binary_present`). Tests: - `test_build_child_agent_ignores_acp_command_when_binary_missing` - `test_build_child_agent_honors_acp_command_when_binary_present` Verified on Python 3.11 (macOS) and 3.12 (Debian 13 container). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-30 03:41:46 -07:00
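A minimal sketch of the PATH check described in the commit above, assuming a hypothetical helper name (`resolve_acp_command`) and an injectable `which` parameter added only for testability; the actual change lives in `_build_child_agent` and may be structured differently.

```python
import logging
import shutil
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def resolve_acp_command(
    acp_command: Optional[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return acp_command only when its binary is found on PATH.

    Hypothetical helper: returning None means "ignore the override and
    keep the parent's default transport".
    """
    if not acp_command:
        return None
    if which(acp_command) is None:
        # Binary is missing: warn and fall through instead of crashing later.
        logger.warning("acp_command %r not found on PATH; ignoring", acp_command)
        return None
    return acp_command
```

A caller would switch to the ACP transport only when the helper returns a non-None value.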
Kong	6d6702ef50	fix(whatsapp-bridge): clarify FIFO outbound-id tracker semantics Rename LRU/refresh wording to match Set insertion-order eviction and reject non-positive maxSize at construction time.	2026-06-30 03:41:43 -07:00
Kong	24aa02179b	test(whatsapp): repoint owner test import after adapter relocation WhatsAppAdapter lives under plugins/platforms/whatsapp/adapter.py on current upstream; the owner-forward test still imported the removed gateway.platforms.whatsapp module.	2026-06-30 03:41:43 -07:00
Keira Voss	db52ad0f07	fix(whatsapp): gate owner-typed forwards on customer chatId allowlist The opt-in WHATSAPP_FORWARD_OWNER_MESSAGES path in bot mode marks fromMe inbound messages as fromOwner: true and forwards them to the Python adapter so plugins can detect "owner just typed in this chat" and trigger handover / sliding TTL flows. The previous implementation bypassed the allowlist for that path: the existing allowlist gate at the bottom of the dispatch loop is guarded by !msg.key.fromMe, so any chat the operator happened to reply to was forwarded — even ones not on WHATSAPP_ALLOWED_USERS. Concretely, on a deployment with a single allowlisted customer, an owner reply in any other chat would still wake Hermes and let the gateway-policy plugin's owner-implicit branch create a stray handover row keyed by the non-allowlisted chatId. Fix: extract the bot-mode fromMe gate into a small pure helper (`owner_message_gate.js`) that returns one of {drop_echo, drop_disabled, drop_allowlist, forward_owner, pass} so the new allowlist branch can be unit-tested without spinning up Baileys. The check runs against the customer chatId (not senderId, which is the owner's own number/LID and won't be on the allowlist by construction). matchesAllowedUser already short-circuits true on an empty allowlist or "*", so deployments without an allowlist see no behavior change. Self-chat mode is untouched — its existing isSelfChat pin is the correct guard there. Tests: scripts/whatsapp-bridge/owner_message_gate.test.mjs covers echo drop, disabled drop, the new allowlist drop, the forward path, the open-allowlist short-circuit, and the precedence of echo/disabled checks over the allowlist check (so logs stay honest).	2026-06-30 03:41:43 -07:00
Keira Voss	a61cf774ce	feat(whatsapp): tag owner-typed inbound text with [owner reply] prefix When WHATSAPP_FORWARD_OWNER_MESSAGES is enabled and the bridge marks an inbound message with fromOwner=true, also prefix MessageEvent.text with "[owner reply] " at construction time. This makes the disambiguation survive any downstream plugin failure (e.g. handover-rule errors that bypass silent_ingest), so transcripts never misattribute owner-typed text to the customer. Idempotent: re-applies are guarded so a future producer that pre-tags text won't be double-prefixed.	2026-06-30 03:41:43 -07:00
keiravoss94	84f350efe0	feat(whatsapp): opt-in forwarding of owner-typed messages in bot mode In `WHATSAPP_MODE=bot` the bridge currently drops every fromMe inbound message — they are all assumed to be echoes of our own /send calls. That makes it impossible for plugins / agents to detect when a human owner has typed directly into a customer chat from the same WhatsApp Business account (e.g. via a linked phone or WhatsApp Web). This adds an opt-in `WHATSAPP_FORWARD_OWNER_MESSAGES` env var. When true, the bridge classifies fromMe inbound by looking up `key.id` in a bounded LRU of recently-sent message IDs (the existing 50-entry echo suppressor, bumped to 512 and extracted to a testable `outbound_ids.js` helper). Hits in the LRU are still dropped (echoes); misses are forwarded to the Python adapter with `fromOwner: true`. The Python adapter lifts that flag onto `MessageEvent.metadata["whatsapp_from_owner"]`. `metadata` is a new free-form dict on the event so future per-platform signals don't each need their own field. Default behaviour is unchanged: with the env flag unset, bot mode still drops every fromMe message exactly as before. Use cases for downstream consumers: - Implicit handover activation when the owner replies manually - Sliding TTL on owner activity (keep an active session alive while the owner is engaged) - Audit trails of owner interventions - Analytics on human-vs-bot reply ratios Heuristic limitation (documented in code): the LRU is in-memory. After a bridge restart, in-flight delivery receipts of pre-restart sends will briefly look like owner-typed for a few seconds until the set is repopulated. Persisting isn't worth the disk churn — downstream consumers should treat the flag as best-effort. Tests: - tests/gateway/test_whatsapp_from_owner.py (new): adapter sets the metadata flag iff the bridge payload has `fromOwner: true`; absent otherwise. - scripts/whatsapp-bridge/outbound_ids.test.mjs (new): LRU bounds, eviction order, falsy-id handling. 
Backwards compatibility: with the env flag unset, every code path is identical to before. No existing deployment is affected.	2026-06-30 03:41:43 -07:00
teknium1	1366f376d6	fix(moa): pin chat_completions on live switch to a MoA preset The gateway/CLI /model switch path (switch_model in agent_runtime_helpers) built the MoAClient facade but left agent.api_mode at the value determine_api_mode / the resolved aggregator transport produced (e.g. codex_responses or anthropic_messages). The conversation loop dispatches on agent.api_mode, so a non-chat_completions value made the primary/acting call go through client.responses.create — which the MoAClient facade has no .responses for — and fall through to the moa://local placeholder, 404 three times, then fall back to a reference model (issues #54259, #54669). agent_init.py already pins api_mode=chat_completions for provider==moa; mirror that in the live switch so the primary call always routes through MoAClient.chat.completions. The aggregator's real transport is resolved and applied inside the reference/aggregator fan-out, not on the outer call.	2026-06-30 03:39:50 -07:00
liuhao1024	d76ca3a7f2	fix(moa): propagate api_mode from slot runtime to call_llm Slot_runtime resolved the provider's real API surface (including api_mode) but only forwarded base_url and api_key to call_llm, dropping api_mode. This caused Copilot GPT-5.x reference slots to hit /chat/completions instead of the Responses API, returning 400 unsupported_api_for_model. - _slot_runtime: forward api_mode from resolve_runtime_provider - call_llm: accept explicit api_mode param, override task config - 4 regression tests for propagation, omission, and signature	2026-06-30 03:39:50 -07:00
sprmn24	da4f15cddc	fix(cron): log and redact on secrets-redaction failure If redact_sensitive_text() raises or fails to import, stdout/stderr were silently left unredacted and could leak API keys or tokens into cron job delivery messages and logs. Replace the silent exception swallow with a warning log and replace both outputs with '[REDACTED - redaction failed]' to prevent leaks. Root cause: silent exception swallow in _run_job_script() Impact: potential secrets leak in cron job output delivery	2026-06-30 03:34:21 -07:00
teknium1	d3d768efb9	test(copilot): update stale get_copilot_api_token mock to tuple signature get_copilot_api_token now returns (api_token, base_url); the auth-remove suppression test still mocked it as a bare string, mis-unpacking into the credential-pool seed path and failing with 'No credential #1'.	2026-06-30 03:27:41 -07:00
teknium1	3ecc58a8da	chore: map trevorgordon981 in AUTHOR_MAP for #50590 co-authorship	2026-06-30 03:27:41 -07:00
teknium1	15e44527ab	fix(copilot): prefer endpoints.api for base URL, guard empty chat base URL Folds @trevorgordon981's #50590 into difujia's #15139: - exchange_copilot_token now prefers the authoritative endpoints.api from the token-exchange response, falling back to the proxy-ep-derived host - resolve_api_key_provider_credentials gains a copilot branch that resolves the account-specific base URL and a non-empty last-resort guard, so chat inference never wedges on an empty base URL (#50252) Co-authored-by: Trevor Gordon <trevorbgordon@gmail.com>	2026-06-30 03:27:41 -07:00
NiuNiu Xia	fb07215844	fix(copilot): recognize enterprise subdomains in host checks The earlier enterprise base URL change (proxy-ep parsing) gave us URLs like `api.enterprise.githubcopilot.com`, but ~15 host-matching call sites still hard-coded `api.githubcopilot.com`. Enterprise users would therefore drop the `Copilot-Integration-Id: vscode-chat` header at client-build time, and upstream rejected requests with: The requested model is not available for integrator "zed" (or "copilot-language-server") — verify the correct Copilot-Integration-Id header is being sent. The header was correct in copilot_default_headers(); it just never made it into default_headers for non-default hostnames because every detector compared against the exact string "api.githubcopilot.com". This commit broadens all those checks to "githubcopilot.com" via base_url_host_matches (which already does proper subdomain matching), so api.enterprise.githubcopilot.com, api.business.githubcopilot.com, etc. all share the same headers, vision routing, max_completion_tokens selection, and reasoning-effort detection as the default endpoint. Also adds ".githubcopilot.com" to _URL_TO_PROVIDER so context-window resolution via models.dev works for enterprise base URLs, and tightens _is_github_copilot_url to use suffix matching instead of strict equality. Tests: - New: enterprise Copilot endpoint preserves Copilot-Integration-Id - New: enterprise endpoint returns max_completion_tokens (not max_tokens) - Existing 333 base_url / copilot / aux-client / credential-pool tests pass Parts 5 of #7731.	2026-06-30 03:27:41 -07:00
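The difference between exact-host comparison and subdomain matching in the commit above can be sketched as follows. The function name `host_matches` is hypothetical; the repository's own `base_url_host_matches` is not shown in this log and may differ.

```python
from urllib.parse import urlparse


def host_matches(base_url: str, domain: str) -> bool:
    """True when the URL's host is `domain` or a subdomain of it.

    Matching on a dot boundary avoids treating look-alike hosts such as
    "evilgithubcopilot.com" as part of the domain.
    """
    host = (urlparse(base_url).hostname or "").lower()
    domain = domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)
```

With this shape, `https://api.enterprise.githubcopilot.com` and `https://api.githubcopilot.com` both match `githubcopilot.com`, while a host that merely contains the string does not.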
NiuNiu Xia	fbd15e285c	fix(copilot): switch to VS Code client ID and derive enterprise base URL Two changes that complete the Copilot auth story (#7731 parts 3 and 4): 1. Switch OAuth client ID from opencode (Ov23li8tweQw6odWQebz) to VS Code (Iv1.b507a08c87ecfe98). The old ID produces gho_* tokens that return 404 on /copilot_internal/v2/token, making token exchange non-functional. The new ID produces ghu_* tokens that support exchange. 2. Derive enterprise API base URL from the proxy-ep field in the exchanged token. Enterprise accounts get tokens containing e.g. "proxy-ep=proxy.enterprise.githubcopilot.com" which is converted to "https://api.enterprise.githubcopilot.com" and stored in the credential pool. Individual accounts (no proxy-ep) continue using the default URL. The COPILOT_API_BASE_URL env var remains as a user escape hatch. Tested on both Individual and Enterprise Copilot accounts: - Individual: device flow works, exchange succeeds, base_url=None (default) - Enterprise: device flow works, exchange succeeds, 39 models returned including claude-opus-4.6-1m (936K), enterprise base URL derived Parts 3 and 4 of #7731.	2026-06-30 03:27:41 -07:00
teknium1	bf2dc18f84	test+chore: real-path regression test for #15157 model_extra guard + AUTHOR_MAP Adds tests/agent/test_model_extra_type_guard.py exercising the real ChatCompletionsTransport.normalize_response path with string/list/None/dict model_extra; adds the AUTHOR_MAP entry for the contributor.	2026-06-30 03:27:12 -07:00
huangxudong663-sys	0df3c12699	fix(agent): guard against non-dict model_extra in tool call normalization Some OpenAI-compatible providers (NVIDIA NIM + qwen3.5) return a string for model_extra instead of a dict. The falsy fallback (x or {}) treats a truthy non-empty string as the value and calls .get() on it, raising AttributeError and turning every tool call into [error]. Replace the falsy fallback with an explicit isinstance(.., dict) guard at both extra_content extraction sites (non-streaming normalize_response and the streaming delta accumulator).	2026-06-30 03:27:12 -07:00
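To illustrate why the falsy fallback fails in the commit above, here is a small sketch contrasting `x or {}` with an explicit `isinstance` guard. The function names are hypothetical and the `"extra_content"` key is used only as an example lookup.

```python
from typing import Any


def extra_content_falsy_fallback(model_extra: Any) -> Any:
    # Buggy shape: a non-empty string is truthy, so .get() is called on a str.
    return (model_extra or {}).get("extra_content")


def extra_content_guarded(model_extra: Any) -> Any:
    # Fixed shape: anything that is not a dict is treated as "no extras".
    extra = model_extra if isinstance(model_extra, dict) else {}
    return extra.get("extra_content")
```

The guarded version returns `None` for strings, lists, and `None`, and only reads the key when `model_extra` is really a dict.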
Teknium	c7e0bdef9a	fix(agent): stop over-cap max_tokens 400s from death-looping into compression (#55570 ) An over-cap model.max_tokens produces a provider 400 that mentions max_tokens, which trips _CONTEXT_OVERFLOW_PATTERNS and is classified as context_overflow. On providers whose wording isn't recognized by parse_available_output_tokens_from_error() (e.g. DashScope/Qwen: "Range of max_tokens should be [1, 65536]") the smart-retry is skipped and the error falls into the compression fallback, which re-sends the same oversized max_tokens, fails identically, and loops until "cannot compress further" on a tiny conversation (#55546). Root-cause fix for the whole class, not just DashScope: - parse_available_output_tokens_from_error(): recognize the DashScope "Range of max_tokens should be [1, N]" form and return N (smart-retry then caps output and retries WITHOUT compressing). - new is_output_cap_error(): broader yes/no gate for output-cap 400s. In the loop, when the error is output-cap-shaped but unparseable, fail fast with an actionable message (lower model.max_tokens) instead of routing into compression. Mirrors the existing GPT-5 max_tokens guard. Real input overflows and GPT-5 unsupported-param 400s are unchanged.	2026-06-30 03:26:41 -07:00
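A rough sketch of recognizing the DashScope-style wording quoted in the commit above. The helper name `parse_max_tokens_range` and the exact regular expression are assumptions; the real `parse_available_output_tokens_from_error()` is not shown here and likely handles more provider formats.

```python
import re
from typing import Optional

# Assumed pattern for messages like "Range of max_tokens should be [1, 65536]".
_RANGE_RE = re.compile(r"Range of max_tokens should be \[\s*1\s*,\s*(\d+)\s*\]")


def parse_max_tokens_range(error_message: str) -> Optional[int]:
    """Return the upper bound N from the range message, or None if absent."""
    match = _RANGE_RE.search(error_message)
    return int(match.group(1)) if match else None
```

A retry loop could then cap the output tokens at the returned value and resend without compressing the conversation.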
georgex8001	62b9fb6623	fix(acp): thread-safe interactive approval via contextvars Concurrent ACP sessions run on a shared ThreadPoolExecutor (max_workers=4). Each _run_agent mutated the process-global os.environ["HERMES_INTERACTIVE"] and restored it in finally, so one session's restore could clobber another's set mid-run — dropping the second session onto the non-interactive auto-approve path, executing a dangerous command without the approval callback firing (GHSA-96vc-wcxf-jjff). Replace the env-var flag with a thread/task-local contextvar in tools.approval. The two HERMES_INTERACTIVE read sites in approval.py now go through _is_interactive_cli() (contextvar-first, env fallback for legacy single-threaded CLI callers). The ACP executor sets the contextvar instead of os.environ; the existing contextvars.copy_context() wrapper isolates each session's write. Co-authored-by: Hermes Agent <127238744+teknium1@users.noreply.github.com>	2026-06-30 03:24:58 -07:00
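The isolation property the commit above relies on can be sketched with the standard library alone. The variable and function names here are hypothetical and do not mirror `tools.approval`; the sketch only shows that a `ContextVar` set inside `copy_context().run(...)` does not leak between sessions sharing a thread pool.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical flag standing in for the HERMES_INTERACTIVE env var.
_interactive = contextvars.ContextVar("interactive", default=False)


def is_interactive() -> bool:
    return _interactive.get()


def run_session(interactive: bool) -> bool:
    # The write is confined to the context this function runs in.
    _interactive.set(interactive)
    return is_interactive()


def run_isolated(interactive: bool) -> bool:
    # Each session gets its own copy of the context, so one session's
    # set() cannot clobber another session's value.
    ctx = contextvars.copy_context()
    return ctx.run(run_session, interactive)


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_isolated, [True, False, True, False]))
```

Unlike a process-global `os.environ` entry, there is no shared value to restore in a `finally` block, so the restore race described above cannot occur.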
teknium1	f5eb4c307b	fix(gateway): stop Matrix upload fallback from leaking host path The Matrix adapter's _upload_file fell back to sending "(file not found: {file_path})" directly into the room — the same host-path leak class fixed for the base adapter and Slack in the previous commit. Replace it with a friendly notice, log the path at WARN for operators, and preserve any caller-supplied caption.	2026-06-30 03:24:36 -07:00
UgwujaGeorge	cb9d18c759	fix(gateway): stop media-send fallbacks from leaking host paths into chat The base BasePlatformAdapter implementations of send_voice, send_video, send_document, and send_image_file forwarded their _path argument verbatim into the chat text (e.g. "🎬 Video: /home/.../hermes/cache/..."). Telegram, Discord, and Slack adapters all fall back to those base methods when their native send raises — so a rejected video on Telegram surfaced the host filesystem layout to the user instead of a useful message. Replace the path-echo with a friendly notice, log the path for operator diagnostics, and keep the user-supplied caption intact. The Slack adapter had three identical sites that fell through to the same path-echo on its own native upload failures; fix those too. send_document still surfaces the caller-provided file_name (or the basename derived from it) since that is the user-facing filename, not a host path. Add regression tests asserting the _path argument never appears in the fallback content while caption text and explicit file_name still do.	2026-06-30 03:24:36 -07:00
teknium1	fee3d4ed04	test(gateway): update startup-restart-race fixtures for current main The salvaged test double predated two main changes: - start() now connects via _connect_adapter_with_timeout, which forwards is_reconnect to adapter.connect(); the StartupRaceAdapter double didn't accept the kwarg. - stop() now awaits _finalize_shutdown_agents (async on main); the fixture stubbed it as a plain MagicMock. Accept is_reconnect in the double and use AsyncMock for the finalize stub.	2026-06-30 03:22:18 -07:00
Disaster-Terminator	f4a54b6292	fix(gateway): abort startup during restart	2026-06-30 03:22:18 -07:00
Kartik	c6eb7f9e72	fix(memory/mem0): recall on the current question + stronger search guidance (#55535)	2026-06-30 15:51:08 +05:30
Tao Yan	b8ebe32866	fix(agent): flatten multi-part user_message in codex intermediate-ack detector Vision requests routed through the OpenAI-compat API server forward the raw multi-part content list ([{type:"text"}, {type:"image_url"}, ...]) straight through as user_message. The codex intermediate-ack detector flattened it with (user_message or "").strip(), so a truthy list survived and .strip() raised AttributeError — killing any Codex-routed vision turn that took the require_workspace path. Route through the existing _summarize_user_message_for_log helper (which already backs the logging/banner previews on main), and widen the param type hint from str to Any to match how the function is actually called. The two logging-preview sites the original PR also touched were fixed independently on main by the conversation-loop refactor. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-30 03:20:11 -07:00
Markus Phan	cd9f5cc671	fix(delegate): route subagent progress lines through _safe_print for ACP stdio delegate_task's per-task completion display emitted lines like "✓ [1/3] Research done (17.92s)" via a bare print(). Under ACP (and any headless JSON-RPC stdio host where AIAgent routes human output to stderr via a custom _print_fn), these landed on stdout and corrupted the protocol frame stream, surfacing as "Failed to parse JSON message: ✓ [3/3] …" in the ACP adapter. Add _emit_parent_console() which prefers parent_agent._safe_print (the same hook AIAgent uses for every other user-facing print) and falls back to print() only when no router is wired up or it raises. CLI behavior is unchanged. The PR's other fix (preset toolset expansion) is already covered on main by _expand_parent_toolsets(), so only the stdio-safe printing change is salvaged here.	2026-06-30 03:16:22 -07:00
teknium1	eeb4735078	test(web_server): assert ws-ping invariant, not frozen 20.0 literal The loopback ws-ping window is now 30s/60s (#48445/#50005), so the hardcoded == 20.0 assertion was a change-detector that broke the moment the loopback tuning landed. Assert the behavioral contract instead: ping stays enabled (positive) and timeout >= interval.	2026-06-30 03:11:13 -07:00
teknium1	db880186f2	chore(release): add AUTHOR_MAP entries for #51841 and #54287 salvage	2026-06-30 03:11:13 -07:00
teknium1	1a0c576813	fix(tui_gateway): drop emit-only session.info from _LONG_HANDLERS session.info is only ever an emitted event (_emit), never a dispatched @method RPC, so listing it in _LONG_HANDLERS is dead weight that can never match a dispatched method name. Remove it from the set and the test's frontend-polled list to keep _LONG_HANDLERS to real RPCs.	2026-06-30 03:11:13 -07:00
Zyxxx-xxxyZ	9d10dcd490	fix(tui_gateway): route frontend-polled inline RPCs to pool under GIL pressure Frontend-polled read-only RPCs (session.list, pet.info, process.list) ran inline in the WS read loop. Under GIL pressure from concurrent agent turns they block the loop, timing out frontend polls and surfacing as a false "needs setup" / dropped session (#50005, #48445). Route them through _LONG_HANDLERS so dispatch() returns immediately, and raise the default RPC pool to 8 workers so the added long handlers don't queue. Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-06-30 03:11:13 -07:00
Peetwan	ebb81f10cb	fix(tui_gateway): prevent WS disconnect under GIL pressure Three targeted fixes for Desktop GUI WebSocket stability when agent turns starve the uvicorn event loop of CPU (GIL contention): 1. Loosen ws_ping_timeout for loopback binds (QW-1) - Loopback (Desktop): ping 30s interval / 60s timeout - Non-loopback (Cloudflare Tunnel): unchanged 20/20 - A GIL-heavy agent turn can stall the event loop past 20s; uvicorn's keepalive ping runs on that same starved loop, so a 20s timeout kills an otherwise-healthy local connection over a recoverable stall. 60s rides out the stall without affecting half-open detection on public binds. 2. Coalesce streaming token frames in WSTransport (CF-2) - Buffer high-frequency delta frames (message.delta, reasoning.delta, thinking.delta) and flush as a batch every ~33ms (~30fps) - Non-streaming frames (RPC responses, control/tool/completion events) flush pending tokens first — wire ordering preserved - Thread-safe via threading.Lock; worker threads return immediately instead of blocking on per-token loop wakeups - Reduces event-loop wakeup churn by orders of magnitude during model streaming, directly cutting GIL pressure 3. Loop heartbeat watchdog (CF-1) - Self-rearming call_later tick (2s) measures drift between expected and actual fire time using loop.time() (monotonic) - Logs 'event loop stalled Ns (GIL pressure suspected)' when drift >5s - Turns mysterious WS drops into diagnosable log entries - Uses call_later chain (not a task) — dies with the loop, nothing to cancel on shutdown Root cause: uvicorn's ws keepalive ping (20/20s) runs on the same starved event loop as agent turns. Under GIL pressure from heavy agent turns or delegation, the loop can't service the ping within 20s, so the websockets protocol declares the connection dead. Reconnects fail with ready_send_failed because the old process's loop is still wedged. 
None of these fixes touch the model-facing message array, prompt caching, message role alternation, or the wire protocol — they are strictly display-transport improvements plus a config tweak and a diagnostic log. Tests: 762 passed, 17 skipped (0 failures) across test_tui_gateway_ws, test_tui_gateway_server, test_web_server, and tui_gateway/ suites.	2026-06-30 03:11:13 -07:00
teknium1	35a0803a3b	fix(delegation): budget subagent summaries against parent context headroom Batch delegation returned each subagent's full final_response verbatim into the parent's context. A fan-out of N children could dump 60k+ tokens at once, blowing the parent's context window and — on rate-limited providers — triggering a compression/429 death spiral (429 misread as context-too-large -> window step-down -> retry loop -> conversation dies). Cap each summary against the parent's remaining context headroom split across the batch (not a magic char count). When trimming, mirror the web_extract convention: spill the full text to cache/delegation (mounted into remote backends via credential_files._CACHE_DIRS) and return a head+tail window (75/25, line-snapped) plus a footer with the exact read_file offset to page the omitted middle. Both the subagent's opening AND its closing (outcomes / files-changed / issues, which live at the end) survive in-context, and nothing is lost — the parent can read_file the full version on any backend. delegation.max_summary_chars (default 24000) is a static ceiling layered on top as belt-and-suspenders for models that ignore 'be concise'; 0 disables it. Child prompt tightened to lead with outcomes / bullets. Co-authored-by: rc-int <rcint@klaith.com>	2026-06-30 03:07:40 -07:00
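The head+tail window from the commit above can be approximated as below. This is a simplified sketch under stated assumptions: it splits by characters at a 75/25 ratio, does not snap to line boundaries, does not spill the full text to a cache file, and uses a made-up footer format. The function name `head_tail_window` is hypothetical.

```python
def head_tail_window(text: str, max_chars: int, head_ratio: float = 0.75) -> str:
    """Keep the opening and closing of `text`, dropping the middle."""
    if len(text) <= max_chars:
        return text
    head_len = int(max_chars * head_ratio)
    tail_len = max_chars - head_len
    omitted = len(text) - head_len - tail_len
    tail = text[-tail_len:] if tail_len else ""
    # Illustrative marker only; the real footer carries a read_file offset.
    return f"{text[:head_len]}\n[... {omitted} chars omitted ...]\n{tail}"
```

Keeping the tail matters because, as the commit notes, outcomes and files-changed summaries tend to sit at the end of a subagent's response.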
MarioYounger	3b2bb30c5d	fix(security): harden heredoc approval, NFKC homograph fold, env-var filter Three independent security-scanner hardenings, re-homed onto the current shared threat-pattern architecture (tools/threat_patterns.py): - approval.py: add bash/sh/zsh/ksh heredoc to DANGEROUS_PATTERNS. The existing heredoc pattern only covered python/perl/ruby/node, so `bash <<'EOF' ... EOF` ran arbitrary shell — including exfil pipelines whose inner commands don't individually match a pattern — with no prompt. - threat_patterns.py: apply unicodedata.normalize("NFKC", ...) before pattern matching so full-width / compatibility homographs (e.g. `ｃａｔ ~/.hermes/.env`) are folded to ASCII and no longer bypass the keyword scanners. Invisible-char detection still runs on the raw content first (NFKC can strip those codepoints). - code_execution_tool.py: add CREDS/BEARER/APIKEY to _SECRET_SUBSTRINGS so vars like HERMES_LLM_CREDS, API_BEARER, MY_APIKEY are scrubbed from the sandbox env. PASS was intentionally dropped from the original proposal — it false-positives on BYPASS_CACHE / COMPASS_DIR / PASSENGER_HOST while PASSWORD/PASSWD already cover the credential cases. The original PR also proposed a 'synonym' injection pattern block (overlook/forget/set aside/bypass/discard + developer-mode); dropped here because it false-positives on ordinary AGENTS.md/SOUL.md prose ("don't forget to follow the rules", "run in developer mode"), exactly the bossy-English class threat_patterns.py is documented to avoid. Salvaged from #9028. Co-authored-by: Hermes Agent <agent@nousresearch.com>	2026-06-30 02:59:46 -07:00
Teknium	c8376e0dc6	fix(auxiliary): stop SDK retries from multiplying compression stall (#54465 ) (#55544 ) The auxiliary OpenAI clients were built without overriding the SDK's default max_retries=2, so every aux call silently made up to 3 attempts against a slow/hung endpoint — a 120s timeout could stall ~360s before Hermes saw a single failure. On the critical compression preflight path, Hermes then added its own same-provider timeout retry on top, roughly doubling the user-visible stall again before fallback. - Build both the sync (_create_openai_client) and async (_to_async_client) aux clients with max_retries=0 (setdefault, so explicit callers still override). Hermes already owns retry + provider/model fallback policy. - For task == compression, skip the same-provider transient retry on a full-budget timeout and fall straight through to fallback. Fast blips (streaming-close, 5xx) still retry, since those are cheap. - Add _is_timeout_error to distinguish a full-budget timeout from a fast connection drop. Addresses the retry-multiplication root cause of #54465 (the resume-wedge persistence half landed in #55499).	2026-06-30 02:54:08 -07:00
0xbyt4	e6f66bc0f0	fix(security): cover Move and no-space headers in patch_tool sensitive path check patch_tool extracts V4A patch paths so _check_sensitive_path can refuse writes to /etc/, /boot/, etc. before they reach the low-level file ops. The extraction regex had two gaps: 1. `*** Move File: src -> dst` was never extracted (regex only matched Update/Add/Delete), so a Move targeting /etc/crontab skipped the pre-check and fell back on the narrower file_operations deny list. 2. The regex required `\\s+` after `***` but patch_parser uses `\\s*`, so `***Update File: /etc/hosts` (no space) parsed + applied while skipping the check. Loosen the leading whitespace to `\\s*` and add a Move regex that checks both endpoints. Move endpoints also run through the same '..' traversal rejection as the other V4A headers (closes the sibling gap on current main, which gained that traversal guard after this PR was opened).	2026-06-30 02:50:24 -07:00
kshitij	26f39f7b90	fix(credentials): prefer ~/.hermes/.env over stale os.environ on key rotation (#55528 ) `_resolve_api_key_provider_secret` resolved API keys via `get_env_value`, which returns the `os.environ` value first and only falls back to `~/.hermes/.env`. After a user rotates a key in `.env`, a stale value still exported in the parent shell (Codex CLI, test runner, login profile) shadows the fresh key on every request, producing persistent 401s. The credential-pool seeding path was already fixed to prefer `.env` (#18254/#18755), but the live request-time resolution path was not — so the pool re-seeded with the fresh key while `_resolve_api_key_provider_secret` kept returning the stale shell export. This closes that remaining path. - config: add `get_env_value_prefer_dotenv()` — checks `~/.hermes/.env` first, then `os.environ`. Distinct from `get_env_value()` (unchanged, os.environ-first) so only Hermes-managed credential resolution flips precedence; the generic helper's many callers are unaffected. - auth: `_resolve_api_key_provider_secret` resolves through the new helper. - tests: regression coverage for both the pool-seeding path and the auth resolution path (a rotated `.env` key must beat a stale shell export). Closes #20591. Co-authored-by: 0xDevNinja <manmit0x@gmail.com>	2026-06-30 09:49:52 +00:00
teknium1	b6045170bb	fix(discord): extend channel-name matching to slash-command auth; clamp flush deadline to disconnect budget Follow-up to the salvaged #8008 fix: - Sibling-site fix: _evaluate_slash_authorization gated DISCORD_ALLOWED_CHANNELS / DISCORD_IGNORED_CHANNELS on numeric IDs only, so name/#name config that now works for on_message still silently failed for slash-command interactions. Refactor the channel-key helper to _discord_channel_keys_from_channel(channel, parent) and reuse it at the interaction gate. Fail-closed on missing channel id is preserved. - The contributor's hardcoded 8s flush deadline could be hard-cancelled mid-flush: _teardown_adapter already wraps cancel_background_tasks() in the per-adapter disconnect budget (HERMES_GATEWAY_ADAPTER_DISCONNECT_TIMEOUT, default 5s). The flush deadline now derives from that budget with headroom so it always completes inside it. - AUTHOR_MAP: map cypher@augmentl.com -> Nickperillo for CI. - Tests: slash-auth name/#name allow + name ignore matching.	2026-06-30 02:48:42 -07:00
Cypher	cb9308f0a6	fix(discord): channel name matching and flush pending sends on shutdown Two related fixes to the Discord gateway adapter: 1. Channel name matching (free-response, allowed, ignored, no-thread channels) Previously these config values only matched against numeric channel IDs. If a user configured free_response_channels: cypher (by name), the adapter would silently ignore it because it only intersected against channel_ids. Now the adapter builds a channel_keys set that includes the channel ID, channel name, and #channel-name form, and checks all three for each gate. 2. Flush pending text-batch tasks before shutdown The Discord adapter uses _pending_text_batch_tasks (its own dict) for merging rapid successive message chunks. These tasks were NOT added to self._background_tasks (the base class list), so the base cancel_background_tasks() never awaited them on restart/shutdown. This caused a race: in-flight response deliveries were cancelled before Discord had a chance to send them, resulting in silent dropped messages visible to users as tool-log-only replies with no text body. Fix: override cancel_background_tasks() in DiscordAdapter to await all pending text-batch tasks (8s deadline) before delegating to the base class.	2026-06-30 02:48:42 -07:00
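The channel-name matching described in cb9308f0a6 amounts to intersecting the configured entries with a set of keys for the channel. The sketch below assumes that shape; `channel_keys` and `is_listed` are hypothetical names, and the adapter's actual helper may normalize entries differently.

```python
from typing import Iterable, Optional, Set


def channel_keys(channel_id: Optional[int], name: Optional[str]) -> Set[str]:
    """Build the keys a config entry may match: numeric ID, bare name, and #name."""
    keys: Set[str] = set()
    if channel_id is not None:
        keys.add(str(channel_id))
    if name:
        keys.add(name)
        keys.add("#" + name)
    return keys


def is_listed(configured: Iterable[object], channel_id: Optional[int], name: Optional[str]) -> bool:
    """True if any configured entry (ID, name, or #name) refers to this channel."""
    normalized = {str(entry).strip() for entry in configured}
    return bool(normalized & channel_keys(channel_id, name))
```

For example, `is_listed(["cypher"], 1234, "cypher")` and `is_listed(["#cypher"], 1234, "cypher")` both match, where an ID-only intersection would have silently ignored both entries.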
Teknium	b03635daea	fix(approval): catch hermes gateway stop/restart behind a profile flag (#55515) The gateway-lifecycle guard's hermes-CLI pattern required `hermes` and `gateway` to be adjacent, so a profile flag slipped the agent past it: `hermes -p ade gateway restart` was not flagged. That is the exact form from the 2026-04-11 ade-profile self-kill loop. Allow an optional run of global flags (`-p ade`, `--profile ade`, multiple flags) between `hermes` and the gateway subcommand. launchctl self-termination is already covered on main by #33071; this narrows the only remaining real gap.	2026-06-30 02:48:30 -07:00
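The gap closed by b03635daea can be illustrated with a regular expression that tolerates a run of global flags between `hermes` and the gateway subcommand. The pattern below is an assumption written for illustration, not the guard's actual pattern, and `is_gateway_lifecycle` is a hypothetical name.

```python
import re

# "hermes", then zero or more flags (each with an optional value),
# then "gateway stop" or "gateway restart".
GATEWAY_LIFECYCLE = re.compile(
    r"\bhermes\b"
    r"(?:\s+-{1,2}[\w-]+(?:[=\s]+[^\s-]\S*)?)*"
    r"\s+gateway\s+(?:stop|restart)\b"
)


def is_gateway_lifecycle(command: str) -> bool:
    """True if the command stops or restarts the gateway, with or without global flags."""
    return GATEWAY_LIFECYCLE.search(command) is not None
```

Under this sketch, `hermes gateway restart`, `hermes -p ade gateway restart`, and `hermes --profile ade gateway stop` are all flagged, while `hermes -p ade gateway status` is not.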
brooklyn!	1d495cfbbf	Merge pull request #55226 from NousResearch/bb/desktop-memory-graph feat(desktop): memory graph — playable timeline of memories + skills over time	2026-06-30 04:36:17 -05:00
brooklyn!	6d20ac4c85	Merge pull request #55500 from NousResearch/bb/desktop-composer-draft perf+refactor(desktop): de-entangle the composer into isolated engine hooks	2026-06-30 04:35:28 -05:00
Brooklyn Nicholson	aa07400e1a	chore(desktop): keep draft persist effect deps clean Replace direct queueEditRef reads in cleanup/pagehide with a mirrored local ref so hook deps stay stable and eslint-clean.	2026-06-30 04:33:08 -05:00
Brooklyn Nicholson	9998ff4cbe	fix(desktop): persist live composer draft before swap/reload Sync the contentEditable text before stash-on-scope-change and pagehide so pending rAF draft flushes cannot drop the newest keystrokes.	2026-06-30 04:32:39 -05:00
brooklyn!	eeb69c7df2	Merge pull request #55547 from NousResearch/bb/54744-windows-bash-spawn fix(desktop): tree-kill Windows terminal descendants	2026-06-30 04:28:33 -05:00
Brooklyn Nicholson	2f46fde3f5	fix(desktop): keep queued composer edit ref in sync Update the shared queued-edit ref synchronously with React state so draft persistence sees the correct edit mode while loading and restoring queued prompts. Also drop the accidental node_modules symlink from the PR.	2026-06-30 04:27:22 -05:00
Brooklyn Nicholson	e5253d852b	fix(desktop): tree-kill Windows terminal descendants Ensure Windows desktop and local terminal teardown kill full process trees so Git Bash descendants cannot survive wrapper exits and accumulate across retries.	2026-06-30 04:23:27 -05:00
Brooklyn Nicholson	94d70dee54	perf(desktop): stop ChatBar re-rendering on cross-session status/queue churn Audit follow-up. ChatBar subscribed to the whole `$statusItemsBySession` (a computed that rebuilds the entire map) + `$previewStatusBySession` maps just to derive a boolean, so every per-item status mutation (a subagent tick, the 5s background poll) and every OTHER session's change re-rendered the ~1.4k component. The queue hook likewise subscribed to the whole `$queuedPromptsBySession` map. - Add `useSessionStatusPresence` — a coarse edge (useSyncExternalStore) that flips only when the stack shows/hides; ChatBar uses it for the styling data-attr instead of the two map subscriptions. - Add generic `useSessionSlice(store, key)` — subscribes to one session's array, bailing out when other sessions churn (the plain atom keeps per-key refs stable). The queue hook now reads its slice through it. Result: ChatBar re-renders only when the stack's presence flips or this session's queue changes — not on background/subagent status streaming or other sessions. Verified: typecheck clean, 0 lint errors, composer tests 39/40 (pre-existing attachments failure unrelated).	2026-06-30 04:19:10 -05:00
Brooklyn Nicholson	33d91029b2	perf+fix(desktop): coalesce composer paste/input flush; scope dock glow to thread Two composer fixes: - Paste/input lag — `flushEditorToDraft` serializes the whole editor (`composerPlainText` is O(n)); running it on every event during a burst (holding a key, or holding Cmd+V into a growing editor) was O(n²). Coalesce the input/paste path to one flush per animation frame. Lossless: the contentEditable DOM is the source of truth and submit + the compositionend / keydown paths re-read it synchronously (those stay immediate). - Detached-composer dock glow — was `fixed inset-x-0` (full viewport, spilled under the sessions sidebar). Switched to `absolute inset-x-0`, so it anchors to the chat-column root the docked composer centers in — the glow now spans only the thread area, matching the actual dock target. Verified: typecheck clean, 0 lint errors, composer DOM repro tests pass.	2026-06-30 04:19:10 -05:00
Brooklyn Nicholson	773a3703bf	refactor(desktop): extract composer submit engine into useComposerSubmit Lift the submit orchestration out of ChatBar into composer/hooks/use-composer-submit.ts: `submitDraft` (the one decision tree — queue-edit save · slash-now-while-busy · queue · drain · send · stop), `dispatchSubmit` (the shared send-with-restore primitive + the external-submit listener), and `steerDraft`. This is the seam where the draft and queue engines meet; it now reads both clean APIs as explicit inputs instead of closing over inline state. ChatBar is left as a thin coordinator that owns the shared `queueEditRef` and wires the four engines (draft · queue · submit · metrics/voice/drop) into render. Behaviour-identical (verbatim move). Verified: typecheck clean, composer DOM repro tests (enter-submit, IME, slash-now, steer, drain) pass.	2026-06-30 04:19:10 -05:00

13702 commits