hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
AhmetArif0	39b8d1d313	fix(dingtalk): finalize open streaming cards before disconnect AI Card "tool progress" cards created with finalize=False were left in streaming state on DingTalk's UI after a gateway restart because disconnect() called _streaming_cards.clear() without first closing them via _close_streaming_siblings. Move the finalization loop before self._http_client.aclose() so the HTTP client is still available when the finalize requests are sent. Adds a regression test that asserts the HTTP client is alive during finalization.	2026-05-23 20:48:56 -07:00
Fewmanism	83f6a83b24	fix(tui): handle images with codex app-server	2026-05-23 20:40:09 -07:00
teknium1	7ce6b504a2	fix(process_registry): use taskkill /T /F for tree-kill on Windows The Windows branch of `_terminate_host_pid` early-returned after `os.kill(pid, SIGTERM)` (which Python maps to `TerminateProcess` for the target handle only), leaving descendant processes — e.g. Chromium renderer/GPU/network helpers spawned by an `agent-browser` daemon — running on Windows even after the preceding commit fixed POSIX. The right Windows primitive is `taskkill /PID <pid> /T /F`: `/T` walks the tree, `/F` force-terminates. Same approach `gateway.status.terminate_pid(force=True)` already uses for the gateway's own shutdown path; reuse the same shape here. Why NOT extend the POSIX psutil tree-walk to Windows: 1. Windows doesn't maintain a Unix-style process tree. `psutil.Process.children(recursive=True)` walks PPID links that go stale when intermediate processes exit, so enumeration is best-effort and silently misses orphaned descendants. The whole bug we're fixing is orphaned descendants. 2. `psutil.Process.terminate()` on Windows is `TerminateProcess()` for one handle — same single-PID scope as the existing `os.kill`. The existing comment in `gateway/status.py::terminate_pid` warns this explicitly: 'os.kill SIGTERM is not equivalent to a tree-killing hard stop' on Windows. 3. Headless Chromium has no GUI window, so the softer `taskkill /T` without `/F` (which sends WM_CLOSE) won't reach it either. `/F` is required. POSIX path is unchanged. The taskkill subprocess uses the same `creationflags=windows_hide_flags()` pattern other Windows shellouts in this codebase use. `FileNotFoundError` / `TimeoutExpired` / `OSError` fall back to bare `os.kill(SIGTERM)` as cheap insurance. Tests cover the Windows branch via the codebase's standard `monkeypatch _IS_WINDOWS` pattern (`references/windows-native-support.md`), plus POSIX tree-walk order, NoSuchProcess swallow, and the OSError fallback path. 7 new tests, all green on Linux CI.	2026-05-23 20:30:29 -07:00
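The taskkill-with-fallback shape described in the commit above can be sketched roughly as follows. This is an illustration only, not the repository's `_terminate_host_pid`: the function names `build_taskkill_command` and `terminate_tree_windows` are hypothetical, and the `creationflags=windows_hide_flags()` argument mentioned in the message is omitted because that helper is project-internal.

```python
import os
import signal
import subprocess


def build_taskkill_command(pid: int) -> list[str]:
    """Return the argv for a forced tree-kill of `pid` (hypothetical helper)."""
    # /T walks the process tree, /F force-terminates (needed for windowless processes).
    return ["taskkill", "/PID", str(pid), "/T", "/F"]


def terminate_tree_windows(pid: int, timeout: float = 10.0) -> bool:
    """Run taskkill; fall back to a single-PID os.kill if taskkill is unusable.

    Returns True if taskkill was invoked, False if the fallback path ran.
    """
    try:
        subprocess.run(
            build_taskkill_command(pid),
            capture_output=True,
            timeout=timeout,
            check=False,  # a non-zero exit (e.g. process already gone) is not an error here
        )
        return True
    except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
        # Cheap insurance: signal only the parent if taskkill cannot be run.
        try:
            os.kill(pid, signal.SIGTERM)
        except OSError:
            pass
        return False
```

Keeping the argv construction in its own function makes the Windows branch testable on Linux CI without actually spawning `taskkill`.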
Yuan Li	22f3f5a75a	fix(browser): use process-tree termination for daemon cleanup os.kill(pid, SIGTERM) only signals the parent, leaving Chromium child processes (renderer, GPU, etc.) orphaned. Reuse the existing ProcessRegistry._terminate_host_pid() helper which walks the process tree leaf-up via psutil, terminating children before the parent.	2026-05-23 20:30:29 -07:00
Teknium	e42fcc5625	fix(provider): make config.yaml model.provider the single source of truth (#31222) Policy: if it ain't a secret it goes in config.yaml. HERMES_INFERENCE_PROVIDER was leaking behavioral config into the .env surface, including from the gateway, which bypassed config.yaml entirely. Behavior: - gateway/run.py: drop HERMES_INFERENCE_PROVIDER read in _resolve_runtime_agent_kwargs. Gateway now flows through resolve_runtime_provider() with no `requested` override, which reads model.provider from config.yaml first. Docs/UX (strip env var from user-facing surface): - --provider help text no longer mentions the env var - cli-config.yaml.example same - reference/environment-variables.md: remove HERMES_INFERENCE_PROVIDER row and the cross-reference from HERMES_INFERENCE_MODEL - reference/cli-commands.md: blank the env-var column for --provider - guides/xai-grok-oauth.md, guides/minimax-oauth.md: replace HERMES_INFERENCE_PROVIDER=x hermes invocations with config.yaml / --provider - developer-guide/adding-providers.md, model-provider-plugin.md: reframe Internal mechanism (kept as-is): - hermes_cli/main.py writes HERMES_INFERENCE_PROVIDER into the TUI subprocess env - tui_gateway/server.py reads it on TUI startup - resolve_requested_provider() / oneshot.py / cli.py still fall through to the env var as a last-resort behind config.yaml, which is what makes the TUI parent->child handoff work This stays. We just stop documenting it as a user knob. Tests: tests/gateway/test_auth_fallback.py — simplify mock to fail on first call, succeed on second; drop monkeypatch.setenv lines that no longer matter. Supersedes #31064 (closed with credit to @novax635 who surfaced the underlying issue but proposed aligning gateway to the env var rather than removing it).	2026-05-23 18:18:41 -07:00
Edison	e752c9454e	feat(plugins): add register_auxiliary_task() to PluginContext API Auxiliary LLM tasks (vision, compression, web_extract, etc.) currently require modifications to core files for any plugin that needs its own task slot — specifically the _AUX_TASKS list in hermes_cli/main.py and the hardcoded env-var bridging dict in gateway/run.py. This violates the 'plugins must not modify core files' rule and forces every memory or context plugin that wants its own auxiliary task to either fork core or open a coupled core+plugin PR. This change adds a generic plugin surface for auxiliary task registration: ctx.register_auxiliary_task( key='memory_retain_filter', display_name='Memory retain filter', description='hindsight pre-retain dedup/extract', defaults={'timeout': 30, 'extra_body': {'reasoning_effort': 'low'}}, ) After registration, the task automatically: - Appears in 'hermes model → Configure auxiliary models' picker via a new _all_aux_tasks() merge of built-in + plugin tasks - Has its provider/model/base_url/api_key bridged from config.yaml to AUXILIARY_<KEY_UPPER>_* env vars at gateway startup (gateway/run.py now uses a dynamic bridged-keys set instead of a hardcoded per-task dict) - Gets plugin-declared defaults (timeout, extra_body, etc.) layered underneath user config so unconfigured plugin tasks still work (agent/auxiliary_client._get_auxiliary_task_config) - Resets to auto via 'Reset all to auto' alongside built-ins Validation: - Rejects shadowing of built-in keys (vision, compression, etc.) - Rejects invalid key shapes (must match [A-Za-z0-9_]+) - Rejects cross-plugin collisions (clear error) - Allows same-plugin re-registration (idempotent updates) Plugin discovery failures (rare) fall back gracefully — the aux config UI still shows built-in tasks if get_plugin_auxiliary_tasks() raises, and gateway env-var bridging keeps working for built-ins. 
Built-in tasks remain hardcoded in _AUX_TASKS for stability — they're the baseline UX, and DEFAULT_CONFIG already ships their defaults. Plugin tasks layer on top. Tests: 15 new tests in test_plugin_auxiliary_tasks.py covering API validation, manager state lifecycle, helper sort order, _all_aux_tasks merge semantics, _reset_aux_to_auto inclusion of plugin tasks, and default-layering in auxiliary_client. Updates the gateway-bridge code-parity test (test_auxiliary_config_bridge) to assert the new dynamic shape rather than the hardcoded literal env var names which no longer appear post-refactor. Motivation: this unblocks PR #20262 (hindsight smart retain pipeline) and similar plugins that need a dedicated aux task slot. The change is non-breaking — built-in env vars (AUXILIARY_VISION_PROVIDER, etc.) keep working since they're produced by the same f-string template that built the hardcoded names.	2026-05-23 17:49:47 -07:00
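The validation rules listed in the commit above (no shadowing of built-ins, key shape `[A-Za-z0-9_]+`, no cross-plugin collisions, idempotent same-plugin re-registration) and the `AUXILIARY_<KEY_UPPER>_*` naming can be sketched as below. The class `AuxTaskRegistry` and its methods are hypothetical stand-ins, not the real `PluginContext` API, and `BUILTIN_KEYS` lists only the three built-ins the message names explicitly.

```python
import re
from typing import Optional

# Only the built-ins named in the commit message; the real list is longer ("etc.").
BUILTIN_KEYS = {"vision", "compression", "web_extract"}
_KEY_RE = re.compile(r"[A-Za-z0-9_]+")


class AuxTaskRegistry:
    """Hypothetical registry illustrating the validation rules."""

    def __init__(self) -> None:
        self._tasks: dict[str, dict] = {}

    def register(self, plugin: str, key: str, display_name: str,
                 defaults: Optional[dict] = None) -> None:
        if not _KEY_RE.fullmatch(key):
            raise ValueError(f"invalid auxiliary task key: {key!r}")
        if key in BUILTIN_KEYS:
            raise ValueError(f"{key!r} shadows a built-in auxiliary task")
        existing = self._tasks.get(key)
        if existing is not None and existing["plugin"] != plugin:
            raise ValueError(f"{key!r} is already registered by {existing['plugin']!r}")
        # Same-plugin re-registration simply overwrites (idempotent update).
        self._tasks[key] = {
            "plugin": plugin,
            "display_name": display_name,
            "defaults": dict(defaults or {}),
        }

    def get(self, key: str) -> dict:
        return self._tasks[key]


def bridged_env_var(key: str, field: str) -> str:
    """Build the env var name from the f-string template described above."""
    return f"AUXILIARY_{key.upper()}_{field.upper()}"
```

Because built-in and plugin names come from the same template, `bridged_env_var("vision", "provider")` yields the existing `AUXILIARY_VISION_PROVIDER` name, which matches the non-breaking claim in the message.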
soynchux	e8fa415a9e	fix(cli): validate runtime token refresh capability in Qwen auth status	2026-05-23 17:47:36 -07:00
teknium1	4254f7dd17	refactor(skills): slim AST diagnostic to single entry point Trim ~600 LOC off the original contribution while keeping the same operator-facing surface and detection coverage. - Collapse three entry points (file / dir / bundle) into one ast_scan_path(path) that handles both files and directories. - Drop AstFinding dataclass + severity field — replaced with plain (file, line, pattern_id, description) tuples. Severity ordering was display-only for a diagnostic that explicitly disclaims security verdicts, so the field added bookkeeping without earning its place. - Replace Rich-markup formatter with plain text grouped by file. - Drop the 'inspect --ast-deep' surface — same scanner, same output as 'audit --deep', single CLI entry is enough. Operators audit after install; pre-install inspection signal isn't worth the second surface. - Trim test file to the cases that earn their place: bypass payload, syntax error survival, RecursionError survival, false-positive guard (importer lookalike), literal-arg false-positive guard, non-.py ignored, directory recursion + cache-dir skipping, missing-path, getattr/__dict__ detection, formatter empty + populated. Net: tools/skills_ast_audit.py 353 -> 133 LOC, tests/tools/test_skills_ast_audit.py 299 -> 103 LOC, full diff +704/-12 -> +264/-6. No change to tools/skills_guard.py — Skills Guard verdicts remain untouched per SECURITY.md §2.4.	2026-05-23 17:47:26 -07:00
Tranquil-Flow	7255050c99	feat(skills): add opt-in AST deep diagnostics Add opt-in AST diagnostics for skill review without making Skills Guard stricter by default. - Add hermes skills inspect --ast-deep to scan fetched skill bundles before installation - Add hermes skills audit --deep to scan already-installed hub skills - Keep AST analysis in tools/skills_ast_audit.py, separate from tools/skills_guard.py - Label output as diagnostic hints, not security verdicts - Cover dynamic import/access patterns: importlib, __import__(computed), getattr(computed), and __dict__[computed] This follows the maintainer guidance from closed PR #7436: useful AST-level analysis belongs in an opt-in diagnostic path, not in Skills Guard's default heuristic scan.	2026-05-23 17:47:26 -07:00
novax635	86871ee25a	fix(cli): synchronize HERMES_SESSION_ID across environment and contextvar during session switches	2026-05-23 17:46:55 -07:00
0z1-ghb	8b2adead78	fix(compressor): ABC compliance — total_tokens, api_mode, logger consistency	2026-05-23 17:38:19 -07:00
Yuan Li	75643a6154	fix(env): strip null bytes from .env before python-dotenv loads Null bytes in API key values (introduced by copy-paste) crash os.environ[k] = v with ValueError: embedded null byte, preventing hermes from starting at all.	2026-05-23 17:17:05 -07:00
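A minimal sketch of the pre-load sanitisation described in the commit above, assuming the file is rewritten in place before python-dotenv reads it. The function name `strip_null_bytes` is hypothetical; the actual change may instead filter the content in memory.

```python
from pathlib import Path


def strip_null_bytes(env_path: Path) -> bool:
    """Remove NUL bytes from the file in place; return True if anything changed."""
    raw = env_path.read_bytes()
    if b"\x00" not in raw:
        return False  # nothing to do, leave the file untouched
    env_path.write_bytes(raw.replace(b"\x00", b""))
    return True
```

Working on bytes rather than decoded text avoids any dependence on the file's encoding.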
Glucksberg	9451087aab	fix(telegram): preserve observed group slash commands	2026-05-23 16:26:28 -07:00
Teknium	3b096d6f6d	ntfy: tighten robustness, dedupe auth/truncation, add docs Robustness: - Surface 401/404 stream failures via _set_fatal_error() so the gateway's runtime status reflects 'fatal: ntfy_unauthorized' / 'ntfy_topic_not_found' instead of staying 'connected' when the reconnect loop halts. Matches the pattern in whatsapp / telegram / sms adapters. - Strip whitespace from auth tokens so pasted tokens with trailing newlines don't produce malformed Authorization headers. Simplicity: - Extract _build_auth_header() and _truncate_body() to module-level helpers, used by both NtfyAdapter and _standalone_send. Removes the duplicated auth/truncation logic between the two paths. Docs: - website/docs/user-guide/messaging/ntfy.md — full setup guide, identity-model warning, self-hosting, cron usage, troubleshooting. - website/docs/reference/environment-variables.md — all 9 NTFY_* vars. - website/docs/user-guide/messaging/index.md — platform comparison row. - website/sidebars.ts — sidebar entry between simplex and open-webui. Tests: 78/78 (+ 10 new robustness tests covering token hygiene, fatal error propagation for 401/404, and the _truncate_body helper).	2026-05-23 16:13:01 -07:00
Teknium	6a8e131a0a	refactor(ntfy): convert built-in adapter to platform plugin ntfy now ships as a self-contained plugin under plugins/platforms/ntfy/ instead of editing 8 core files (gateway/config.py Platform enum, gateway/run.py factory + auth maps, cron/scheduler.py, toolsets.py, hermes_cli/status.py, agent/prompt_builder.py, gateway/channel_directory.py, tools/send_message_tool.py). All routing goes through gateway/platform_registry via register_platform(): - adapter_factory, check_fn, validate_config, is_connected - env_enablement_fn seeds PlatformConfig.extra from NTFY_* env vars so gateway status reflects env-only setups without instantiating httpx - standalone_sender_fn handles deliver=ntfy cron jobs when cron runs out-of-process from the gateway - allowed_users_env / allow_all_env hook into _is_user_authorized - cron_deliver_env_var=NTFY_HOME_CHANNEL for cron home routing - platform_hint surfaces in the system prompt - pii_safe=True (topic names are the only identifier; no PII to redact) Tests moved to tests/gateway/test_ntfy_plugin.py using _plugin_adapter_loader so the module lives under plugin_adapter_ntfy in sys.modules and cannot collide with sibling plugin-adapter tests on the same xdist worker. The core-file grep tests (Platform.NTFY in source, hermes-ntfy in toolsets, etc.) are replaced with plugin-shape tests covering register() metadata, env_enablement_fn output, and standalone_sender_fn behavior. 68 tests pass under scripts/run_tests.sh.	2026-05-23 16:13:01 -07:00
sprmn24	b10f17bf1e	feat(ntfy): add ntfy platform adapter with atomic reconnect, identity fix, and 81 tests	2026-05-23 16:13:01 -07:00
Teknium	ad11327db0	feat(kanban): warn users that scratch workspaces are deleted on completion (#30949) First scratch workspace creation on an install now emits a one-shot warning log + a 'tip_scratch_workspace' event on the task. Sentinel file at ~/.hermes/kanban/.scratch_tip_shown silences subsequent creations across the whole install. Behavior unchanged — scratch is still ephemeral by design. This just makes the design visible to new users (reported in user community: 'progress files vanished, no warning anywhere'). Docs (en + ko) updated to spell out 'Deleted when the task completes' on the scratch bullet and 'Preserved on completion' on worktree/dir.	2026-05-23 11:27:00 -07:00
Nick	39fe4ecee3	fix(kanban): refuse corrupt db auto-init	2026-05-23 05:51:33 -07:00
QuenVix	7245bc77eb	fix(fallback): merge fallback_providers with legacy fallback_model configurations	2026-05-23 05:24:57 -07:00
Teknium	7f1b2b4569	fix(approval): pin 'silence is not consent' contract on timeout/deny (#24912 ) (#30879 ) User incident (Slack, 2026-05-13): user walked away mid-conversation, agent requested approval to run `rm -rf .git`, the prompt timed out after the gateway_timeout (default 300s), and the agent removed the .git folder on its own. Corroborated by an independent report from a Telegram user. The underlying code path was correct — `check_all_command_guards` returns `approved=False` with a BLOCKED message on both timeout and explicit deny, and `terminal_tool` surfaces that as `status=blocked` to the agent. The bug is at the model-interface layer: the message "BLOCKED: Command timed out. Do NOT retry this command." reads to some models as "try a different command achieving the same outcome." This commit changes only the model-facing message + the structured return shape: - Timeout message now explicitly names the three evasion paths the agent must avoid: retry, rephrase, AND achieve the same outcome via a different command. Ends with "Silence is not consent." - Explicit deny gets the same shape minus the silence-is-not-consent line (it WAS an explicit deny, not silence). - New structured fields on the return dict: `outcome` ("timeout" or "denied") and `user_consent` (always False on this branch) so plugins, hooks, and audit pipelines don't have to string-parse the message to distinguish the two cases. The mechanism that should already have prevented the original incident — timeout treated as deny, BLOCKED result, post hook fires with `choice="timeout"` — is unchanged. This commit hardens only the agent's reading of the result. 
Tests: - test_timeout_returns_approved_false_with_no_consent — pins the return shape on the Slack-shaped notify_cb-registered path - test_timeout_message_is_emphatic_against_retry_and_rephrase — pins the exact phrases the message must contain - test_explicit_deny_carries_same_no_consent_shape — same contract on explicit /deny - test_timeout_emits_post_hook_with_timeout_outcome — pins the post_approval_response hook payload so audit plugins can act 329 approval tests passing (4 new + 325 existing). Fixes #24912	2026-05-23 02:59:13 -07:00
Teknium	6855d17753	fix(memory): guard against external drift in MEMORY.md/USER.md (#26045 ) (#30877 ) Reproduction (production, 2026-05-14): two concurrent sessions on the same agent. Session A patches MEMORY.md directly via the patch tool, appending ~8KB of structured content (Vendor Master, Standing Orders, Pin Board) — none of it through the memory tool, so no § delimiters. Session B starts later with stale in-memory state (1 entry, ~331 chars). Session B calls memory(action=replace) on its one known entry. The tool's _read_file parses A's content as a single 8KB 'entry' (no § splits), then replace truncates that entry to B's new 333-byte content. ~8KB of structured content silently destroyed. The atomic-rename write path is fine in isolation. The bug is the implicit contract: the tool assumes MEMORY.md is exclusively a §-delimited list of small entries it wrote, but the v0.13 install runbook itself uses 'cat >> MEMORY.md' for onboarding, the patch tool edits the file directly, and operators do too. Fix: a drift guard in MemoryStore._detect_external_drift that fires on either signal: 1. Re-parse + re-serialize doesn't produce identical bytes (catches oddly-encoded delimiters / partial writes). 2. Any single parsed entry exceeds the store's whole-file char limit. The tool budgets the ENTIRE store against that limit (2200 chars for memory, 1375 for user), so no tool-written entry can legitimately be larger. An entry bigger than the store limit means an external writer dropped free-form content into what the tool will treat as one entry. When drift fires, _reload_target writes a .bak.<ts> snapshot of the on-disk file, then add/replace/remove refuse to flush. The original file stays untouched. The error dict surfaces the .bak path AND a remediation string ('integrate missing entries via memory(add=...) one at a time, then rewrite the file clean') so the model can act on it without escalating to the operator. 
Tests: - test_replace_refuses_on_drift, test_add_refuses_on_drift, test_remove_refuses_on_drift — all three mutators refuse - test_clean_file_does_not_trigger_drift — false-positive check - test_error_message_points_at_remediation — error string shape - test_drift_guard_also_protects_user_target — USER.md too - test_drift_backup_filename_is_unique_per_invocation — bak.<ts> naming pin 144 memory tests passing (was 137; +7). Fixes #26045	2026-05-23 02:51:29 -07:00
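The two drift signals described in the commit above can be sketched as follows. The on-disk format is an assumption: the message says only that entries are §-delimited, so the exact delimiter `"\n§\n"` and the normalising `parse` are illustrative, and `detect_external_drift` is a stand-in for `MemoryStore._detect_external_drift`, not its real code.

```python
DELIM = "\n§\n"  # assumed delimiter; the commit only says entries are §-delimited


def parse(raw: str) -> list[str]:
    """Split into entries, trimming whitespace and dropping empty entries."""
    return [part.strip() for part in raw.split(DELIM) if part.strip()]


def serialize(entries: list[str]) -> str:
    return DELIM.join(entries)


def detect_external_drift(raw: str, store_char_limit: int) -> bool:
    """Return True if the file does not look like something the tool wrote."""
    entries = parse(raw)
    # Signal 1: a parse/serialize round trip must reproduce the exact content.
    if serialize(entries) != raw:
        return True
    # Signal 2: the whole store is budgeted against the limit, so no single
    # tool-written entry can legitimately exceed it.
    return any(len(entry) > store_char_limit for entry in entries)
```

With the 2200-character memory limit from the message, an 8 KB block appended without delimiters parses as one oversized entry and trips the second signal.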
xxxigm	b5ea6a5c80	test(xai-oauth): regression coverage for the bad-credentials disambiguator (#29344 ) Eleven new tests pinning the #29344 fix. Layout mirrors the existing "Fix D" entitlement section so the bad-credentials disambiguator sits alongside the entitlement-block tests it complements. Classifier-level coverage: * ``test_is_entitlement_failure_false_for_bad_credentials_wke_suffix`` — verbatim shape from the reporter's wire capture (``{code: 'caller does not have permission', error: 'OAuth2 access token could not be validated. [WKE=unauthenticated:bad-credentials]'}``) ↦ classifier must return False so the refresh path runs. * ``test_is_entitlement_failure_false_for_wke_suffix_in_normalized_shape`` — same body after ``_extract_api_error_context`` has rewritten it to ``{reason, message}``. The disambiguator must fire in BOTH shapes; without this guard the production call site at ``_recover_with_credential_pool`` (which goes through the normalised extractor) would still misclassify. * ``test_is_entitlement_failure_false_for_any_wke_unauthenticated_variant`` — parametrised forward-compat: ``bad-credentials``, ``expired-token``, ``revoked``, ``some-future-reason``. xAI documents the prefix as stable, the suffix after the colon as a reason code that can grow; every variant under ``unauthenticated:`` must route to refresh. * ``test_is_entitlement_failure_false_via_oauth2_validation_phrase_alone`` — belt-and-braces guard: if a future API revision drops the WKE suffix but keeps "OAuth2 access token could not be validated", we still classify correctly. * ``test_is_entitlement_failure_wke_signal_overrides_entitlement_keywords`` — defensive: if a body ever carries BOTH the WKE suffix and entitlement language, the WKE signal wins. Auth is recoverable; entitlement isn't, and a refreshed token will resurface the entitlement message on the next request. 
* ``test_is_entitlement_failure_case_insensitive_wke_match`` — pins that the classifier lowercases the haystack so a future xAI build that uppercases the prefix doesn't reintroduce the bug. Recovery-path coverage (end-to-end through ``_recover_with_credential_pool``): * ``test_recover_with_credential_pool_refreshes_on_xai_bad_credentials_403`` — the headline test the reporter requested: a bad-credentials 403 with the exact wire body must call ``try_refresh_current()`` exactly once and ``_swap_credential`` once. Pre-fix this returned ``(False, _)`` because the entitlement classifier over-matched and short-circuited the refresh path. * ``test_recover_with_credential_pool_still_blocks_real_entitlement`` — companion regression guard for #26847: a pure unsubscribed- account body (no WKE suffix, no OAuth2-validation phrase) must still surface as entitlement and skip refresh. The new disambiguator must not weaken the original loop-protection it was added to preserve. The scaffolding reuses ``_make_codex_agent``, ``_FakePool``, and the existing ``MagicMock`` patterns from the surrounding tests so the new section reads as a natural extension of "Fix D" rather than a separate test file.	2026-05-23 02:48:13 -07:00
Teknium	9acf949e34	feat(telegram): edit status messages in place instead of appending (#30864) Closes #30045. Based on @qike-ms's PR #30141. Telegram status callbacks (lifecycle, compression, context-pressure) used to append a fresh bubble on every emit. Now adapter tracks {(chat_id, status_key) -> message_id}; first call sends, subsequent calls edit. Failed edits drop the cache entry and fall through to a fresh send. - gateway/platforms/telegram.py: send_or_update_status() (+34 LOC) - gateway/run.py: route _status_callback_sync through it when the adapter supports it; plain adapter.send() otherwise (+15 LOC) - 5 tests covering first send / edit-in-place / edit-failure fallback / distinct key & chat isolation	2026-05-23 02:42:10 -07:00
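The send-then-edit cache described in the commit above can be sketched as below. This is a synchronous illustration with injected `send` and `edit` callables; the class `StatusTracker` is hypothetical, and the real `send_or_update_status()` in `gateway/platforms/telegram.py` is presumably asynchronous and talks to the Telegram API.

```python
from typing import Callable, Hashable


class StatusTracker:
    """Tracks {(chat_id, status_key) -> message_id} so later emits edit in place."""

    def __init__(self,
                 send: Callable[[Hashable, str], int],
                 edit: Callable[[Hashable, int, str], None]) -> None:
        self._send = send
        self._edit = edit
        self._message_ids: dict[tuple, int] = {}

    def send_or_update(self, chat_id: Hashable, status_key: str, text: str) -> int:
        cache_key = (chat_id, status_key)
        message_id = self._message_ids.get(cache_key)
        if message_id is not None:
            try:
                self._edit(chat_id, message_id, text)
                return message_id
            except Exception:
                # Failed edit: drop the cache entry and fall through to a fresh send.
                self._message_ids.pop(cache_key, None)
        message_id = self._send(chat_id, text)
        self._message_ids[cache_key] = message_id
        return message_id
```

Keying on both chat and status key keeps distinct statuses and distinct chats isolated from each other.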
Teknium	4b6d68bd64	test(fast-command): stub _load_gateway_runtime_config too PR `2362cc468` ("fix(gateway): enforce env variable template expansion on runtime config loaders") refactored `_load_service_tier` to read config via the new `_load_gateway_runtime_config` wrapper instead of opening `_hermes_home/config.yaml` directly. The `test_run_agent_passes_priority_processing_to_gateway_agent` test still only stubbed `_load_gateway_config` (the inner loader), so the runtime wrapper saw an empty config and `_load_service_tier` returned None, breaking the test: FAILED tests/gateway/test_fast_command.py::test_run_agent_passes_priority_processing_to_gateway_agent - AssertionError: assert None == 'priority' Fix: also stub `_load_gateway_runtime_config` to return the expected `agent.service_tier=fast` config, so the test once again drives the priority routing path it was written to verify. Confirmed reproducing on current main before the patch and passing after.	2026-05-23 02:40:33 -07:00
Zyrixtrex	61ac118724	fix(webhook): enforce INSECURE_NO_AUTH safety rail on dynamic route reloads	2026-05-23 02:39:12 -07:00
Teknium	6942b1836e	fix(skills_guard): explain why --force is rejected on dangerous verdicts Follow-up to @sprmn24's verdict-logic fix. The previous block-message ended in 'Use --force to override' regardless of verdict — but as of the --force fix above, dangerous community/trusted skills can't be overridden by --force at all. The misleading hint sends users in a loop. Replace it with a specific message that tells them what the documented behavior actually is. Adds two regression tests covering the dangerous-verdict message shape and one that pins the existing --force hint for non-dangerous blocks.	2026-05-23 02:37:30 -07:00
sprmn24	789043b691	fix(security): update tests for verdict and --force changes	2026-05-23 02:37:30 -07:00
Teknium	db489a315f	fix(tests): allowlist tmp_path for kanban_notify artifact delivery (#30852) `_deliver_kanban_artifacts` routes candidates through `BasePlatformAdapter.filter_local_delivery_paths` (added in `41d2c758c`), which rejects paths outside `MEDIA_DELIVERY_SAFE_ROOTS`. The two artifact-delivery tests create fixtures under `tmp_path`, which lives outside the cache roots — so under CI's hermetic HOME the filter silently dropped both fake files and the assertions on `images_uploaded` / `documents_uploaded` failed. Fix: monkeypatch `HERMES_MEDIA_ALLOW_DIRS=str(tmp_path)` in both tests so the safety filter accepts the fixtures. Production behaviour unchanged; test-side fix only. CI fail repro on origin/main: test (6) shard, both test_notifier_uploads_artifacts_on_completion and test_notifier_artifact_delivery_skips_missing_files.	2026-05-23 02:34:34 -07:00
xxxigm	5b6f0b695b	test(tls-fd-recycle): pin shutdown-only + thread-aware close contract (#29507 ) Ten regressions across both prongs of the #29507 fix, organised so each test names exactly which way the bug could come back: Prong 1 — ``force_close_tcp_sockets``: * ``shutdown_only_no_close`` is the smoking-gun assertion. If a future refactor adds back ``sock.close()`` to this helper, the FD-recycling race that wrote TLS bytes on top of ``kanban.db`` is back, and this trips. * ``uses_shut_rdwr`` pins that both halves are shut down (a half-close wouldn't unblock a worker stuck in ``recv``). * ``swallows_oserror_on_shutdown`` covers the already-shutdown case. * ``handles_multiple_pool_entries`` walks all pool connections. Prong 2 — thread-aware ``_close_request_client_once``: * ``stranger_thread_aborts_only_no_close`` simulates the asyncio_0 → Thread-1616 interrupt path: stranger drives abort, holder stays populated for the worker's eventual finally. * ``owner_thread_pops_and_full_close`` is the worker-thread path: pops + full close. * ``stranger_then_owner_close_sequence_runs_full_close_exactly_once`` replays the reporter's exact timeline at object level: abort runs once, full close runs once, holder ends empty. Agent surface: * ``_abort_request_openai_client_does_not_call_client_close`` pins that the new entrypoint shuts sockets and emits the ``deferred_close=stranger_thread`` marker but never calls ``client.close()``. * ``_abort_request_openai_client_null_client_is_noop`` defensive. End-to-end: * ``fd_recycle_window_closed_by_shutdown_only`` reproduces the race at object level — runs the abort path from a stranger thread and asserts that no ``close()`` ever fires, so the kernel can never recycle the FD under the owner's still-active reference.	2026-05-23 02:31:10 -07:00
xxxigm	e2a7d73a66	fix(force_close_tcp_sockets): shutdown only, do not release FD (#29507) The helper used to call ``socket.shutdown(SHUT_RDWR)`` followed by ``socket.close()`` to drop CLOSE-WAIT entries immediately. On its own ``shutdown()`` is safe from any thread — it only sends FIN and breaks pending ``recv``/``send`` — but ``close()`` releases the FD integer to the kernel. When the helper runs on a stranger thread (the interrupt loop, the stale-call detector) the FD release races the owning httpx worker thread that still has the same integer cached inside the SSL BIO. The kernel then recycles that integer to the next ``open()`` call — in production, kanban dispatcher's ``kanban.db`` — and the worker's delayed TLS flush writes a 24-byte TLS application-data record on top of the SQLite header. Restrict the helper to ``shutdown(SHUT_RDWR)`` only. The owning httpx worker's own unwind will close the underlying socket via the same Python ``socket.socket`` object, which atomically swaps ``_fd`` to -1 before issuing ``close(2)`` — no FD-aliasing window. The log field ``tcp_force_closed=N`` is kept (now counts shutdowns) so existing dashboards / log parsers keep working.	2026-05-23 02:31:10 -07:00
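The difference between `shutdown()` and `close()` that the commit above relies on can be shown with a plain socket pair. This is a standalone sketch, not the repository's `force_close_tcp_sockets`; the helper names `shutdown_only` and `demo` are hypothetical.

```python
import socket


def shutdown_only(sock: socket.socket) -> None:
    """Break pending I/O on `sock` without releasing its file descriptor."""
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # already shut down or never connected


def demo() -> tuple[int, int, bytes, int]:
    owner, peer = socket.socketpair()
    fd_before = owner.fileno()
    shutdown_only(owner)             # what a "stranger" thread is allowed to do
    fd_after_shutdown = owner.fileno()
    peer_read = peer.recv(1)         # peer observes end-of-stream
    owner.close()                    # only the owner releases the descriptor
    fd_after_close = owner.fileno()
    peer.close()
    return fd_before, fd_after_shutdown, peer_read, fd_after_close
```

After `shutdown_only` the descriptor number is unchanged, so the kernel cannot hand it to another `open()` call; only the owner's later `close()` sets it to -1.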
walli	0e7448d63a	fix(qqbot): use original attachment filename for cached files Add original_name parameter to _download_and_cache, preferring the attachment metadata filename over the CDN URL path basename. Previously files were cached with meaningless QQ CDN hash names (e.g. qqdownload_...oadftnv5), causing ugly filenames when sent back to users. Aligns with qqbot-agent-sdk's AttachmentDownloader.download_document.	2026-05-23 02:27:17 -07:00
walli	a54f5afc70	fix(qqbot): handle op 7/9 and expand fatal close code set 1. Handle op 7 (Server Reconnect): close WS to trigger reconnect loop while preserving session for Resume 2. Handle op 9 (Invalid Session): check d value to determine if session is resumable; clear session only when not resumable 3. Remove 4009 from session-clearing set (connection timeout is resumable) 4. Expand fatal close codes: 4001/4002/4010-4014 now stop reconnect immediately instead of retrying uselessly 5. Add unit tests	2026-05-23 02:27:17 -07:00
walli	bbd77d165c	fix(qqbot): add INTERACTION intent and expose video/file cached paths 1. Add INTERACTION intent bit (1<<26) to _send_identify, fixing approval button clicks not being received (INTERACTION_CREATE events were never dispatched by the gateway) 2. Include local cached path in video/file attachment descriptions so the LLM can reference files for re-sending to users 3. Add unit tests (TestIdentifyIntents, TestProcessAttachmentsPathExposure)	2026-05-23 02:27:17 -07:00
QuenVix	2362cc4688	fix(gateway): enforce env variable template expansion on runtime config loaders	2026-05-23 02:27:08 -07:00
QuenVix	d21ac579e9	fix(gateway): honor key_env in auth-failure fallback resolution	2026-05-23 02:25:53 -07:00
Teknium	99671a8634	test(kanban): allow tmp_path artifacts past media-delivery validator PR #`41d2c758c` ("Fix unsafe gateway media path delivery") tightened `validate_media_delivery_path` so that artifacts emitted by the agent must live inside `MEDIA_DELIVERY_SAFE_ROOTS` (Hermes-managed cache dirs) or an operator-allowlisted root via `HERMES_MEDIA_ALLOW_DIRS`. Two kanban-notifier tests put their PDFs and PNGs under pytest's `tmp_path`, which is correctly rejected by the new validator. They started failing on main as soon as that PR landed: FAILED tests/hermes_cli/test_kanban_notify.py::test_notifier_uploads_artifacts_on_completion FAILED tests/hermes_cli/test_kanban_notify.py::test_notifier_artifact_delivery_skips_missing_files Symptom in logs: "Skipping unsafe local file path outside allowed roots". The validator is doing exactly what it should — the tests were relying on the looser pre-fix behaviour. Fix: add `HERMES_MEDIA_ALLOW_DIRS=tmp_path` to the `kanban_home` fixture so artifacts under `tmp_path` are recognised as safe. This is the same allowlist mechanism the operator-facing env var documents.	2026-05-23 02:25:09 -07:00
teknium1	70aaa774be	fix(opencode-go): emit Kimi reasoning_effort, match KimiProfile shape The Kimi K2 branch added in the prior commit only emitted extra_body.thinking and dropped reasoning_effort entirely. KimiProfile (api.moonshot.ai/v1) sends both fields, and OpenCode Go proxies to the same Moonshot backend. Mirror that shape on the Go path so /reasoning effort actually reaches Kimi. - low/medium/high pass through verbatim - xhigh/max clamp to high (Moonshot's max supported value) - minimal / unknown effort → omit reasoning_effort, keep thinking on - disabled / no config → unchanged - DeepSeek branch unchanged	2026-05-23 02:20:28 -07:00
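The effort mapping listed in the commit above reduces to a small clamp. This sketch covers only the `reasoning_effort` value; the `thinking` field and the disabled/no-config branch are left out, and the function name `kimi_reasoning_effort` is hypothetical.

```python
from typing import Optional


def kimi_reasoning_effort(effort: Optional[str]) -> Optional[str]:
    """Map a requested effort to the value sent upstream, or None to omit the field."""
    if effort in ("low", "medium", "high"):
        return effort   # passed through verbatim
    if effort in ("xhigh", "max"):
        return "high"   # clamped to the highest value the message says Moonshot supports
    return None         # minimal / unknown: omit reasoning_effort
```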
Harish Kukreja	3589960e03	fix(provider): expose OpenCode Go reasoning controls	2026-05-23 02:20:28 -07:00
helix4u	71291d83cd	test: keep tirith checks hermetic	2026-05-23 02:20:14 -07:00
QuenVix	52a368fa72	fix(gateway): preserve WhatsApp pairing approvals across JID/LID alias flips	2026-05-23 01:46:34 -07:00
Teknium	3127a41cb1	test(acp): pin parse_model_input in slash-command tests The two ACP slash-command tests that exercise `provider:model` routing (`test_set_session_model_accepts_provider_prefixed_choice` and `test_model_switch_uses_requested_provider`) relied on the live `hermes_cli.models._KNOWN_PROVIDER_NAMES` / `_PROVIDER_ALIASES` module state to parse `anthropic:claude-sonnet-4-6` into `("anthropic", "claude-sonnet-4-6")`. If any earlier test in the same xdist worker registers a custom provider that shadows `anthropic` or otherwise mutates those globals, the parser falls into the `detect_provider_for_model` branch and resolves to `custom` instead. Observed once in CI on run 26326728502 / job 77505732299 as `AssertionError: assert 'custom' == 'anthropic'` — could not reproduce locally under per-file isolation, so the failing in-file order was specific to a particular xdist scheduling. Monkeypatching `parse_model_input` + `detect_provider_for_model` for both tests removes the global-catalog dependency, so the tests now only exercise what they were written to verify (the `requested_provider -> runtime -> AIAgent kwargs` plumbing).	2026-05-23 01:44:56 -07:00
xxxigm	da636e982b	test(plugins): regression coverage for project-plugin RCE chain (#29156) 35 new tests across 5 classes covering every layer of the GHSA-5qr3-c538-wm9j defence. Each class corresponds to one chokepoint so a regression in any single layer is caught by the named class: * ``TestProjectPluginsEnvGate`` (13 cases) — parametrised over both the documented truthy values (``1`` / ``true`` / ``yes`` / ``on`` + uppercase variants) and the previously-bypassing falsy strings (``0`` / ``false`` / ``no`` / ``off`` / ``""`` / ``False``). The falsy half is the direct env-bypass repro: pre-fix any non-empty string enabled the project source. * ``TestApiPathSanitizer`` (16 cases) — unit-level coverage of the new ``_safe_plugin_api_relpath`` helper. Absolute paths (``/etc/passwd``, ``/tmp/payload.py``, ``/usr/bin/python``), ``..``-traversal payloads (including nested ``subdir/../../..``), and non-string / empty / whitespace-only values must all return ``None``. Safe relative paths (``api.py``, ``backend/routes.py``) round-trip unchanged so legitimate plugins keep working. * ``TestDiscoveryScrubsApiField`` (3 cases) — end-to-end through ``_discover_dashboard_plugins`` with a real manifest on disk. Verifies that the cached plugin entry's ``_api_file`` is scrubbed at discovery time (``None`` + ``has_api: False``) so any downstream consumer can't be tricked into re-deriving the unsafe path from cache. * ``TestMountApiRoutesRefusesUntrusted`` (3 cases) — pokes synthetic plugin entries with each refusal vector directly into the cache and patches ``importlib.util.spec_from_file_location`` to assert it is not invoked for project-source / traversal payloads, and is invoked normally for bundled / user plugins. * ``TestEndToEndPocBlocked`` (1 case) — reproduces the original advisory PoC: operator sets ``HERMES_ENABLE_PROJECT_PLUGINS=0`` believing project plugins are off, attacker plants a manifest in CWD's ``.hermes/plugins/`` with ``api`` pointing at an absolute payload path. Asserts that the importer is never called against the payload path and that ``hermes_dashboard_plugin_evil`` is not in ``sys.modules`` after the mount routine runs. An autouse fixture busts ``_dashboard_plugins_cache`` before and after each test so the production cache (populated by the import-time ``_mount_plugin_api_routes()`` call) can't bleed in. All 12 pre-existing dashboard-plugin tests in ``test_web_server.py`` still pass unchanged.	2026-05-23 01:43:52 -07:00
Eugeniusz Gilewski	41d2c758c3	Fix unsafe gateway media path delivery	2026-05-23 01:40:35 -07:00
Markus	4a91e36495	fix(gateway): separate observed Telegram group context	2026-05-23 01:33:42 -07:00
Teknium	97e975edd2	fix(file-safety): widen read-deny to .env, mcp-tokens/, webhook secrets, root Extends @briandevans's PR #17659 from {auth.json, auth.lock, .anthropic_oauth.json} to also cover: - HERMES_HOME/.env (provider API keys) - HERMES_HOME/webhook_subscriptions.json (per-route HMAC secrets) - HERMES_HOME/mcp-tokens/ (OAuth token directory; dir + everything inside) …AND iterates over both _hermes_home_path() AND _hermes_root_path() so profile-mode runs (HERMES_HOME = <root>/profiles/<name>) also block <root>/{auth.json, .env, mcp-tokens/, ...}. Same widening shape as the write-deny side already does (#15981, #14157). Explicitly NOT a security boundary. Per the personal-assistant trust model, the terminal tool runs as the same OS user and can `cat auth.json` directly. This read-deny exists as defense-in-depth: - Models that respect tool denials empirically tend to stop rather than reach for the shell. - The denial surfaces an audit trail when something tries to read credentials — easier to spot in logs than a generic `cat`. Docstring + error message both flag this as defense-in-depth so future contributors don't mistake it for a real security boundary and don't re-decline reports that propose the same fix shape. Absorbs the .env and mcp-tokens/ coverage from @tomqiaozc's parallel PR #8055 (closed-as-duplicate, credited). Co-authored-by: Tom Qiao <zqiao@microsoft.com>	2026-05-22 20:15:09 -07:00
briandevans	567ea61298	fix(file-safety): block auth.json read via TERMINAL_CWD relative path read_file_tool resolves relative paths against TERMINAL_CWD (or the task's live terminal cwd), but the prior call passed the original unresolved string to get_read_block_error. That function's own resolve() is anchored at the Python process cwd, so when a task's TERMINAL_CWD pointed at HERMES_HOME and the agent issued read_file on the relative path "auth.json", the credential-store denylist was never reached and the file was read normally. Pass the already-resolved absolute path string at the file_tools call site, document the contract on get_read_block_error, and add a read_file_tool-level regression test that pins the relative-path case under TERMINAL_CWD == HERMES_HOME. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:15:09 -07:00
briandevans	056e00a77e	fix(file-safety): block read_file on HERMES_HOME credential stores (#17656) `get_read_block_error` previously only denied reads inside `${HERMES_HOME}/skills/.hub`, which left `auth.json` (provider OAuth state + plaintext API keys) and `.anthropic_oauth.json` (Anthropic PKCE tokens) directly readable by the agent. A prompt-injection reaching `read_file` could exfiltrate active provider credentials in plaintext. Mode-0600 file permissions only protect against other Unix users — the agent runs as the file's owner, so `read_file` is unaffected. Extend the existing deny list with the three credential paths identified in #17656 (`auth.json`, `auth.lock`, `.anthropic_oauth.json`). The check uses the same `Path.resolve()` pattern as `skills/.hub`, so symlink/path-traversal indirection is caught too. The agent doesn't need to read these directly — `auxiliary_client` and `credential_pool` consume them through process env / OAuth flows that bypass `read_file`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:15:09 -07:00
Teknium	3f78d8073c	fix(skills): make content_hash filename-sensitive too (symmetric with bundle_content_hash) PR #6656 added rel_path + \x00 prefixing to ``bundle_content_hash`` so a filename swap between two files in a bundle changes the digest. But it only patched the in-memory side — ``content_hash`` in ``tools/skills_guard.py`` (the on-disk equivalent) still hashed file contents only. These two functions need to stay symmetric: ``check_for_skill_updates`` compares the disk hash of an installed skill against the bundle hash of the upstream copy. With the asymmetric fix, every clean install showed as drifted because the digests no longer matched (2 existing tests in ``test_skills_hub.py`` started failing as soon as the contributor's change landed). Apply the same ``rel_path + \x00 + content`` shape to the disk-side function. Both functions now produce the same digest for the same skill content laid out two ways. Documented the symmetry invariant in the docstring so a future change to either function knows to touch both. Also adds tests/tools/test_pr_6656_regressions.py with 10 regression tests covering all three fixes salvaged in PR #6656: - uninstall_skill path traversal (4 cases: parent segments, absolute paths, symlink escape, legitimate skill) - bundle_content_hash filename swap detection (4 cases: in-memory swap, identity, disk-side swap, bundle↔disk symmetry) - list_pending lock contract (2 cases: source-grep contract, smoke) Also fixes AUTHOR_MAP entry for @aaronlab — their commit email (1115117931@qq.com) maps to "aaronagent" which isn't a real GitHub login, so changelog @mentions would 404.	2026-05-22 19:59:24 -07:00
teknium1	8cf977c8b1	fix(plugins): widen _sanitize_plugin_name for category-namespaced names Follow-up to PR #28832 — the dashboard plugin routes now accept slashed names like `observability/langfuse` and `image_gen/openai`, but `_sanitize_plugin_name` still rejected forward slash and so dashboard update + remove on those plugins fell through to '404 not found' even though they exist on disk. Adds an opt-in `allow_subdir=True` flag that: - Permits internal forward slashes (category-namespaced plugin keys emitted by `_discover_all_plugins`). - Strips leading and trailing slashes. - Still rejects `..` and backslash, and still asserts the resolved target lives inside `plugins_dir`. Opted in at the two read-paths that operate on installed plugins: `_require_installed_plugin` (CLI update/remove) and `_user_installed_plugin_dir` (dashboard update/remove). The install path keeps the default (`allow_subdir=False`) because freshly-cloned plugins always land top-level under `~/.hermes/plugins/<name>/`. Adds 6 targeted unit tests covering the new flag's allow/reject matrix.	2026-05-22 19:50:32 -07:00
ethernet	f89afdbd17	fix(test): deflake two intermittent CI failures - test_browser_secret_exfil: mock _run_browser_command instead of launching real Chrome (secret check is pre-launch, browser is irrelevant to the assertion) - test_web_server: add time.sleep(0.05) after pub.send_text() to yield the event loop before receive_text(). TestClient's sync mode can race the broadcast handler otherwise, hanging the test.	2026-05-22 19:46:18 -07:00
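
Commit 3f78d8073c above describes hashing each file as `rel_path + \x00 + content` so that the digest depends on file names as well as file bytes. The following is a minimal sketch of that idea, not the repository's code: the function names, the use of SHA-256, and the sorted iteration order are all assumptions made for illustration.

```python
import hashlib


def contents_only_digest(files: dict[str, bytes]) -> str:
    """Hypothetical contents-only scheme: hashes file bytes and ignores names."""
    h = hashlib.sha256()  # SHA-256 is an assumption; the commit does not name the hash
    for rel_path in sorted(files):
        h.update(files[rel_path])
    return h.hexdigest()


def path_sensitive_digest(files: dict[str, bytes]) -> str:
    """Hypothetical sketch of the `rel_path + \\x00 + content` shape from the commit."""
    h = hashlib.sha256()
    for rel_path in sorted(files):  # sorted so dict insertion order does not matter
        h.update(rel_path.encode("utf-8"))
        h.update(b"\x00")  # separator between the path and the file bytes
        h.update(files[rel_path])
    return h.hexdigest()


original = {"SKILL.md": b"doc", "run.sh": b"echo hi"}
renamed = {"SKILL.md": b"doc", "run.py": b"echo hi"}   # same bytes, one file renamed
swapped = {"SKILL.md": b"echo hi", "run.sh": b"doc"}   # contents swapped between names

# The contents-only digest cannot see the rename; the path-sensitive one can.
print(contents_only_digest(original) == contents_only_digest(renamed))
print(path_sensitive_digest(original) == path_sensitive_digest(renamed))
print(path_sensitive_digest(original) == path_sensitive_digest(swapped))
```

If the same function is applied both to the in-memory bundle and to the files read from disk, the two digests agree for identical skills, which is the symmetry the commit message says `content_hash` and `bundle_content_hash` must keep.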

4172 commits