hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
Hinotobi	bba76f3dcd	fix(file-safety): deny reads of Google OAuth tokens (#30972)	2026-05-24 17:45:03 -07:00
flamiinngo	fa957c06cf	fix(security): add missing credential paths to write denylist (#27217) The write denylist already protects SSH keys, AWS, GPG, npm, PyPI, Docker, Azure, and GitHub CLI credentials. Two common credential stores were missing: ~/.git-credentials stores plaintext git tokens in the format https://username:token@github.com when using git credential-store. It is directly analogous to ~/.netrc which was already protected. ~/.config/gcloud/ contains Google Cloud OAuth tokens and service account credentials. It is directly analogous to ~/.aws/ which was already protected. Under prompt injection, an agent could be instructed to overwrite these files, destroying credentials or planting malicious ones. Verified before and after with is_write_denied() on both paths.	2026-05-24 17:44:53 -07:00
Teknium	9c08070703	test(cli): update resume usage-hint assertion for numbered selection PR #9020's salvage changed the /resume list footer from 'Use /resume <session id or title> to continue.' to 'Use /resume <number>, /resume <session id>, or /resume <session title> to continue.\n Example: /resume 2'. test_resume_without_target_lists_recent_sessions still pinned the old string verbatim and failed in CI. Relax to substring assertions that allow both the new numbered footer and any future tweaks while still verifying the hint is shown.	2026-05-24 16:22:48 -07:00
Teknium	c043c86bd7	i18n+tests: add list_item_numbered, list_footer_numbered, out_of_range for 15 locales The numbered /resume feature added new i18n keys to en.yaml; the catalog parity tests require every locale to carry matching keys and placeholders, so add translations to all 15 supported locales. Also unblock tests/cli/test_cli_resume_command.py: - _make_cli stub now sets self.resume_display = 'minimal' since _handle_resume_command (post-#31695) calls _display_resumed_history. - mock_db.resolve_resume_session_id returns the input id (no compression chain) so HERMES_SESSION_ID is set to a real string, not a MagicMock.	2026-05-24 16:22:48 -07:00
Teknium	87580076fd	chore(release): map 490408354@qq.com to daizhonggeng (PR #9020)	2026-05-24 16:22:48 -07:00
daizhonggeng	fef733d56b	feat: support numbered resume selection in cli and gateway	2026-05-24 16:22:48 -07:00
AhmetArif0	4f4e337c47	fix(file-safety): write-deny pairing/ directory to prevent approved-list injection The gateway pairing directory (~/.hermes/pairing/) stores per-platform access-control files (telegram-approved.json, discord-approved.json, etc.). A prompt-injected agent using write_file could add arbitrary user IDs to an approved file, granting persistent gateway access without going through the pairing code flow — the same threat class that motivated protecting webhook_subscriptions.json (#14157). The pairing directory was not included in the original control-plane protection because it postdates PR #14157. PR #30383 introduced the hashed-pending schema and made the approved files the sole source of truth for gateway access, raising the security sensitivity of the directory. Apply the same mcp-tokens pattern: block writes to pairing/ and any path within it, under both the active hermes_home and the root path (for profile-mode parity with the fix in #30382). Regression tests verify denial for pairing/telegram-approved.json, pairing/discord-pending.json, and the directory itself, in both normal and profile-mode layouts.	2026-05-24 16:15:33 -07:00
LeonSGP43	6c44d537cc	fix(cli): show full session titles in /resume list	2026-05-24 16:13:23 -07:00
Teknium	8e68426981	fix(cli): add inline --yes/now skip for destructive slash commands (#30768 ) Issue #30768 reports that on native Windows PowerShell the destructive-slash confirmation modal renders but never registers keypresses, leaving the user unable to confirm or cancel /reset, /new, /clear, or /undo. The modal works on macOS, Linux, and WSL; PR #23907 (merged May 11) replaced the daemon-thread input() pattern with a prompt_toolkit-native keybinding modal but the win32 input pipeline apparently doesn't dispatch keys to the filter-conditioned handlers. The modal investigation is ongoing. This change ships the immediate escape hatch: append `now`, `--yes`, or `-y` to any destructive slash command to bypass the modal and run the action immediately. Works on every platform without touching the broken Windows code path. /reset now -> reset, no modal /new --yes my-session -> new session titled "my-session", no modal /clear -y -> clear, no modal /undo -y -> undo, no modal The default behavior (modal prompts when approvals.destructive_slash_confirm is True) is unchanged for users who don't pass a skip token. Implementation: - New classmethod HermesCLI._split_destructive_skip(text) -> (remainder, skip) parses a destructive-slash command string, strips the leading "/cmd" word and any recognized skip tokens (case-insensitive exact match, not substring), and reports whether a skip was requested. - HermesCLI._confirm_destructive_slash gains an optional cmd_original= arg. When the arg contains a skip token, it returns "once" immediately — before the gate check and before any modal rendering. - The /clear, /new, /undo handlers in process_command pass cmd_original through. /new additionally uses _split_destructive_skip to strip skip tokens from the remaining text before deriving the session title, so "/new now My Session" yields title="My Session" (not "now My Session"). 
Tests: - 7 new unit tests in tests/cli/test_destructive_slash_confirm.py covering the helper (recognized tokens, command-word stripping, case-insensitive exact match, None/empty input) and the modal bypass (now and --yes both skip; no-skip-token still consults the modal). - 3 new integration tests in tests/cli/test_destructive_slash_inline_skip_e2e.py driving HermesCLI.process_command end-to-end and asserting (a) new_session is invoked, (b) the modal is never reached, (c) the skip token does not leak into the session title, and (d) the no-skip-token path still reaches the modal as a sanity check that we haven't accidentally short-circuited the normal flow. All 31 tests across the destructive-slash test surface pass. Docs: - website/docs/reference/slash-commands.md documents the new flags both in the destructive-commands table and the dedicated approval section, with a link back to issue #30768 explaining why the escape hatch exists.	2026-05-24 16:13:03 -07:00
teknium1	99a7ecc335	chore(release): map leeseoki0 for PR #31315 salvage	2026-05-24 15:48:58 -07:00
leeseoki0	ce529d6072	fix(kanban): scratch tasks must not inherit board.default_workdir (#28818 ) Board defaults represent persistent project checkouts. Scratch workspaces are auto-deleted on completion and must stay under the per-board scratch root that resolve_workspace() creates. Inheriting default_workdir for a scratch task pointed the cleanup path at the user's source tree — the data-loss vector documented in #28818. The containment guard in _cleanup_workspace (just added) is the safety rail. This commit prevents the bad state from being created in the first place: only persistent kinds (dir/worktree) inherit board defaults. Tests updated to cover the new semantics: scratch with default_workdir set keeps workspace_path=None; dir/worktree still inherits the board default. Salvaged from PR #31315 by @leeseoki0 — prevention layer on top of the #28819 containment fix by @briandevans. Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-24 15:48:58 -07:00
briandevans	23115b5c0f	fix(kanban): restrict managed-scratch roots to workspaces/ dirs only Copilot review on PR #28819 flagged that `_is_managed_scratch_path` accepted the entire `<kanban_home>/kanban` subtree as managed scratch storage. With that, a task whose `workspace_kind='scratch'` and `workspace_path` was mis-set to `<kanban_home>/kanban`, `.../kanban/logs`, or a board's metadata directory (e.g. `.../kanban/boards/<slug>` without the `workspaces/` child) would pass the containment guard and let task completion `shutil.rmtree` Hermes' own DB, metadata, and log subtrees. Tighten the guard: * Allowed roots are now exclusively `workspaces/` directories — the `HERMES_KANBAN_WORKSPACES_ROOT` override, `<kanban_home>/kanban/workspaces`, and each `<kanban_home>/kanban/boards/<slug>/workspaces` discovered on disk. * Require strict descendancy: a path equal to a root itself is rejected too, because deleting a workspaces root would wipe every task's scratch dir at once. Add a regression test covering the three Copilot-named attack paths (kanban root, kanban/logs, board root without `workspaces/`) plus the workspaces-root-itself case, and confirm the inner task-id dir still matches.	2026-05-24 15:48:58 -07:00
briandevans	80ad1609c8	fix(kanban): refuse to rmtree workspace_path outside managed scratch root (#28818 ) A board's ``default_workdir`` (e.g. ``hermes kanban boards set-default-workdir my-board /path/to/real/source``) is copied into ``tasks.workspace_path`` for tasks created without an explicit ``workspace_kind``. Those tasks default to ``workspace_kind='scratch'``, so completion calls ``_cleanup_workspace`` and unconditionally runs ``shutil.rmtree(wp, ignore_errors=True)`` — deleting the user's real source tree as if it were disposable scratch storage. Add ``_is_managed_scratch_path()`` and gate ``_cleanup_workspace`` on it: only delete paths under ``HERMES_KANBAN_WORKSPACES_ROOT`` (the worker-side override the dispatcher injects) or under the active kanban home's ``kanban/`` subtree (covering both the legacy default-board root and per-board ``kanban/boards/<slug>/workspaces`` roots). Anything else gets a warning log and is left alone, so a misconfigured ``default_workdir`` can no longer destroy user data on task completion.	2026-05-24 15:48:58 -07:00
Teknium	396ee69032	fix(gateway): seed plugin extras before is_connected gate (#31703 ) Follow-up to `54e61f933`. The plugin enablement gate calls ``entry.is_connected(probe_cfg)`` BEFORE ``env_enablement_fn`` runs, and the probe is built as ``existing_cfg or PlatformConfig()`` — empty extras, ``enabled=False``. For plugins whose ``is_connected`` reads ``config.extra`` instead of env vars directly, that probe is a misrepresentation of what the platform will look like after enablement. Google Chat's ``_is_connected`` short-circuits on ``config.enabled`` and inspects ``config.extra["project_id"]`` / ``config.extra["subscription_name"]`` — both False on the default probe even when the user has set ``GOOGLE_CHAT_PROJECT_ID`` and ``GOOGLE_CHAT_SUBSCRIPTION_NAME``. Result: Google Chat silently fails the gate on every env-var-only setup. Build a candidate probe that mirrors what the platform will look like post-enablement: - pre-call ``env_enablement_fn`` and layer its result into the probe's ``extra`` (without mutating any existing platform config) - pass ``enabled=True`` on the probe — we're asking "would this BE configured if we let it in?" not "is it currently enabled?" - reuse the same seeded extras when we commit the platform to ``config.platforms`` (avoids calling ``env_enablement_fn`` twice) Discord/IRC/Teams/LINE/ntfy/Simplex ``_is_connected`` hooks read env vars directly, so they are unaffected. This change only restores Google Chat on env-var-only setups while keeping the original #31116 Discord-no-token block intact. All 6 shipped ``env_enablement_fn`` implementations were audited and are pure reads (no ``os.environ`` writes), so running them earlier in the loop has no observable side effects. Tests: 2 new in tests/gateway/test_platform_registry.py covering extras-seeded-before-is_connected and don't-leak-extras-on-gate-fail. 
693 tests across 11 adjacent suites pass (platform_registry, config, google_chat, matrix, discord_connect, ntfy_plugin, simplex_plugin, line_plugin, irc_adapter, teams, gateway_platform_gating). Refs #31116.	2026-05-24 15:44:26 -07:00
helix4u	514f5020c7	fix(debug): redact BlueBubbles webhook secrets	2026-05-24 15:43:48 -07:00
Teknium	13b85bc646	feat(config): document resume-recap tuning keys in DEFAULT_CONFIG The hardcoded constants in _display_resumed_history were exposed as config in PR #4434; declare them in DEFAULT_CONFIG and the CLI fallback dict so they show up in 'hermes config' diagnostics and the schema validator.	2026-05-24 15:36:37 -07:00
Teknium	5dc10ec3ba	test(cli): reconcile resume-recap tests with skip-tool-only default and compression-chain helper - test_tool_calls_shown_as_summary: explicitly disable resume_skip_tool_only (#4434 made True the default; the legacy assertion relied on tool-only entries being rendered as a summary). - test_tool_only_message_skipped_by_default: add coverage for the new default skip behavior. - test_resume_command_*: mock_db.resolve_resume_session_id now returns the same id (no compression chain) so the post-#15000 redirect block doesn't shove a MagicMock into HERMES_SESSION_ID.	2026-05-24 15:36:37 -07:00
Teknium	27c4ba98c3	chore(release): map zhangsamuel12@gmail.com to SamuelZ12 (PR #7480)	2026-05-24 15:36:37 -07:00
ygd58	cdf4876bfe	fix(cli): skip tool-call-only entries in resume recap, expose limits as config options	2026-05-24 15:36:37 -07:00
Samuel Zhang	961e34a1d3	fix: show recap after in-session resume	2026-05-24 15:36:37 -07:00
Teknium	16eed4f91b	test(telegram): add brand-new-topic regression for #31086 The cherry-picked fix from #28605 inverts an existing test (an unknown non-lobby thread_id no longer rewrites to the most-recent binding), but that test only seeds two bindings and queries a third thread_id. Add a second regression test that more closely mirrors the live failure mode: seed exactly one prior binding, then query a brand-new thread_id and assert recovery returns None — so the new topic is allowed to get its own session row instead of being silently merged into the previous topic's session. Co-authored-by: Fábio Siqueira <fabioxxx@gmail.com> Co-authored-by: dillweed <dillweed@users.noreply.github.com>	2026-05-24 15:28:40 -07:00
Maxim Esipov	bdc9b0eff5	fix(telegram): preserve new DM topic lanes	2026-05-24 15:28:40 -07:00
Teknium	eea9553a9c	fix(anthropic): skip mcp_ prefix on outgoing tool schemas when already prefixed Companion to the GH-25255 incoming-strip fix from @hayka-pacha. Without this, build_anthropic_kwargs unconditionally added 'mcp_' to every tool name in step 3, so a native MCP server tool registered as 'mcp_composio_X' was sent as 'mcp_mcp_composio_X' on the wire. The incoming strip only removes ONE prefix, which still worked on first call, but on subsequent calls the model pattern-matched the single-prefixed form from message history and produced names that stripped to 'composio_X' — registry miss, dispatch fail. The history-rewrite block (#4) already has this guard. Apply the same guard to the schema-rewrite block (#3) so round-trip is symmetric. Added 4 outgoing-side tests. Existing 7 incoming-side tests still pass. Author map: hayka-pacha added for PR #25270 salvage attribution. Refs GH-25255.	2026-05-24 15:27:45 -07:00
HKPA	2f91a8406c	fix(agent): only strip mcp_ prefix for OAuth-injected tools (GH-25255) When strip_tool_prefix=True (Anthropic OAuth path), normalize_response unconditionally stripped the mcp_ prefix from ALL tool names starting with mcp_. This broke Hermes-native MCP server tools (registered under their full mcp_<server>_<tool> name in the registry) because the stripped name doesn't match any registry entry. Fix: check the tool registry before stripping. Only strip when: - The stripped name EXISTS in the registry (OAuth-injected tool) - The full name does NOT exist in the registry This preserves backward compatibility for OAuth-injected tools while protecting native MCP server tools from incorrect prefix removal. 7 new tests covering: OAuth strip, native preserve, no-flag, non-mcp, unknown tools, mixed responses, and dual-registration edge case. Signed-off-by: HKPA <hayka-pacha@users.noreply.github.com>	2026-05-24 15:27:45 -07:00
Yuan Li	476c897439	fix(telegram): gate send() on send-path health after reconnect storms (#31165 ) After sustained Bad Gateway / TimedOut reconnect cycles, the PTB httpx client can enter a state where bot.send_message() returns a valid Message (real message_id) but the message never reaches the recipient. TelegramAdapter.send returns SendResult(success=True) and cron's live-adapter branch marks the run delivered while the message is silently dropped. Add a _send_path_degraded flag. _handle_polling_network_error sets it on reconnect storms; the existing _verify_polling_after_reconnect heartbeat probe clears it once getMe() confirms the Bot client is healthy. While the flag is set, send() short-circuits with SendResult(success=False, retryable=True) so cron falls through to the standalone delivery path (fresh HTTP session). Closes #31165. Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-24 15:27:41 -07:00
Teknium	54e61f9331	fix(matrix,gateway): Matrix E2EE installs full dep set; plugins respect is_connected Fixes #31116 — two distinct bugs in fresh-install Matrix gateway: 1. Matrix E2EE setup installed only mautrix[encryption], leaving asyncpg / aiosqlite / Markdown / aiohttp-socks uninstalled. The first encrypted connect failed with 'No module named asyncpg' deep inside MatrixAdapter.connect(). Root cause: the setup wizard hand-rolled a pip install of one package instead of using lazy_deps.ensure( 'platform.matrix'), and check_matrix_requirements() short-circuited the runtime installer on 'import mautrix' alone — so the other 4 packages were never pulled in. 2. Discord auto-enabled itself on every gateway start, even when the user never selected Discord and had no DISCORD_BOT_TOKEN. Root cause: gateway/config.py plugin-enablement loop gated enablement on entry.check_fn() (just 'is the SDK importable?') and ignored entry.is_connected (the 'did the user configure credentials?' probe). Same bug class as commit `7849a3d73` fixed for _platform_status in the setup wizard; this is the runtime counterpart. Affects Discord, Teams, and Google Chat. Changes: - hermes_cli/setup.py::_setup_matrix — install via lazy_deps.ensure('platform.matrix') to pull the full feature group. - gateway/platforms/matrix.py::_check_e2ee_deps — verify asyncpg + aiosqlite + PgCryptoStore in addition to OlmMachine, so E2EE failures surface at startup instead of at first encrypted-room connect. - gateway/platforms/matrix.py::check_matrix_requirements — use feature_missing('platform.matrix') as the install gate instead of a single 'import mautrix' check, so partial installs trigger the lazy installer correctly. - gateway/config.py plugin-enablement loop — consult entry.is_connected before flipping enabled=True. Explicit YAML enabled=true still wins. 
Tests: 3 new in tests/gateway/test_matrix.py (asyncpg-required, aiosqlite-required, partial-install lazy-runs), 5 new in tests/gateway/test_platform_registry.py (is_connected=False blocks, is_connected=True enables, is_connected=None falls back to check_fn, raising probe doesn't enable, explicit YAML wins). Validation: 310 tests across affected test modules pass.	2026-05-24 15:16:03 -07:00
teknium1	88834baf50	chore: map soju06@users.noreply.github.com for PR #26054 salvage	2026-05-24 15:15:37 -07:00
Soju	6212e9ade8	fix(error-classifier): treat 5xx request-validation errors as non-retryable Standard OpenAI returns request-validation failures (unknown/ unsupported parameter, malformed request) as 4xx. Some OpenAI-compatible gateways return them as 5xx instead — codex.nekos.me returns 502 for an unknown parameter. The generic '5xx -> retryable server_error' rule then misfires: the error is deterministic (every retry gets the identical rejection), so the retry loop burns all 3 attempts, the transport-recovery path resets the counter and burns 3 more, and the result is a request flood against a request that can never succeed. Fix: when a 500/502 body carries an unambiguous request-validation signal — 'unknown parameter' / 'unsupported parameter' / 'invalid_request_error' in the message text, or invalid_request_error / unknown_parameter / unsupported_parameter as the structured error code — classify as a non-retryable format_error so the loop fails fast and falls back. Genuine 502 Bad Gateway with no such signal stays retryable as before. Origin: local-author Upstream-PR: none Patch-State: local-only	2026-05-24 15:15:37 -07:00
Soju	775a17284f	fix(transport): strip Hermes-internal scaffolding keys before chat.completions The empty-response recovery path in run_agent.py appends synthetic messages tagged with _empty_recovery_synthetic (and the agent loop uses _thinking_prefill / _empty_terminal_sentinel similarly). These are internal bookkeeping markers — they must never reach the wire. chat_completions' convert_messages only stripped Codex Responses leak fields (codex_reasoning_items, call_id, etc.), not these _-prefixed markers. Permissive providers (real OpenAI, Anthropic) silently ignore unknown message keys so the bug stayed hidden, but strict OpenAI-compatible gateways reject them outright. Observed against codex.nekos.me: 502: [ObjectParam] [input[617]._empty_recovery_synthetic] [unknown_parameter] Unknown parameter: '_empty_recovery_synthetic' Because the synthetic messages persist in the session, every subsequent request in that session carries the poisoned key and fails identically — a deterministic 502 the retry loop mistakes for a transient server error. Fix: convert_messages now drops any top-level message key starting with '_'. OpenAI's message schema has no '_'-prefixed fields, so this is safe and future-proofs against new internal markers. Origin: local-author Upstream-PR: none Patch-State: local-only	2026-05-24 15:15:37 -07:00
Teknium	7ab1677362	feat(security): on-demand supply-chain audit via OSV.dev (#31460 ) Adds 'hermes security audit' — a one-shot vulnerability scan against OSV.dev covering three surfaces a Hermes user actually controls: 1. The running Python's installed PyPI dists (importlib.metadata) 2. Plugin requirements.txt / pyproject.toml pins under ~/.hermes/plugins/ 3. Pinned npx/uvx MCP servers in config.yaml Zero new dependencies (stdlib urllib + importlib.metadata + tomllib + concurrent.futures). No auth required for OSV's public batch API. Flags: --json, --fail-on {low,moderate,high,critical} (default: critical), --skip-venv, --skip-plugins, --skip-mcp Output groups findings by source, sorts by severity descending, surfaces fixed-versions inline. Exit 1 when any finding meets the --fail-on tier. Deliberately out of scope: globally-installed pip/npm, editor/browser extensions, daily background scans, auto-blocking of installs. The audit is on-demand by design — daily scans become noise the user trains themselves to ignore.	2026-05-24 15:15:16 -07:00
Teknium	8065e70274	fix(agent): abort on HTTP 402 after pool rotation and fallback fail (#31443 ) Closes #31273. HTTP 402 (insufficient credits) was retried up to agent.api_max_retries times (default 3), burning paid requests against an exhausted balance. Real-world impact: ~$40 in 48h on a 24/7 Telegram+Discord gateway. Root cause: FailoverReason.billing was in the is_client_error exclusion set in agent/conversation_loop.py, which prevents the non-retryable-abort branch from firing. By the time control reaches that predicate: * credential-pool rotation has already run for billing and either continued the loop or returned False (pool exhausted/absent) * the eager-fallback branch has also fired on billing and either continued the loop or fell through (no fallback configured) Falling through to the backoff retry from here has no recovery mechanism left — it just burns more paid requests. Removing billing from the exclusion set makes 402 abort cleanly once pool+fallback recovery has failed, mirroring how 401/403 (also should_fallback=True) already behave. Added tests/run_agent/test_31273_402_not_retried.py which mirrors the is_client_error predicate shape from the source and asserts the invariant (plus a source-inspection guard against accidental re-introduction).	2026-05-24 15:14:13 -07:00
teknium1	5b52e26d18	fix(gateway): swallow transient Telegram TimedOut at loop level Closes #31066. Closes #31110. An unhandled `telegram.error.TimedOut` (or peer `NetworkError` / `httpx` connection error) propagating to the asyncio event loop killed the entire gateway process, taking down every profile attached to the same runner. systemd restarted the service after ~5s but the active conversation turn was lost. Public adapter methods (`adapter.send`, `adapter.edit_message`, `adapter.send_voice`, …) are individually try/except-wrapped on current main, but at least one async path was reaching the loop with TimedOut unhandled — the report's traceback ends at the deepest httpx frame and doesn't pinpoint the caller. Rather than audit 30+ call sites blind, install a loop-level safety net: `_gateway_loop_exception_handler` is set as the loop's exception handler in `start_gateway()` after `asyncio.get_running_loop()`. It classifies the exception via `_is_transient_network_error()` (walks the __cause__/__context__ chain, matches on class name so the test suite doesn't need the real telegram/httpx packages installed). Transient errors are logged at WARNING with full traceback so the originating call site stays diagnosable; everything else forwards to `loop.default_exception_handler` so real bugs still surface. Tests cover the classifier (known transients accepted, real bugs rejected, cause/context chain unwrap, cyclic-cause termination) and the handler (swallow + log warning, forward unknowns, missing-exception context). One end-to-end test schedules an orphan task raising TimedOut and asserts `asyncio.run` returns cleanly.	2026-05-24 15:03:27 -07:00
Teknium	3d66787a04	fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452 ) * fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main for vision Fixes #31179. Three coupled fixes so a configured aux vision backend actually serves vision tasks instead of silently routing images to the user's main provider: 1. agent/auxiliary_client.py: `auxiliary.<task>.provider: openai` resolves to `custom` + `https://api.openai.com/v1`. "openai" was not in PROVIDER_REGISTRY (we have `openai-codex` for OAuth and `custom` for manual base_url), so the obvious config name silently failed to build a client. User-supplied base_url is still preserved; only the provider name normalises to `custom` so resolution doesn't hit the PROVIDER_REGISTRY-only path. 2. agent/auxiliary_client.py: the vision auto-detect chain now skips the user's main provider when models.dev reports `supports_vision=False`. Without this guard, a misconfigured aux provider would fall back to `auto`, which happily returned the main-provider client. The caller would then send image content to e.g. api.deepseek.com with model `gpt-4o-mini` and get a cryptic `unknown variant 'image_url', expected 'text'` from the provider's parser. 3. tools/vision_tools.py + tools/browser_tool.py: `check_vision_requirements` now mirrors the runtime fallback chain (explicit provider, then auto), so `vision_analyze` shows up whenever vision is actually serviceable. `browser_vision` gets a new `check_browser_vision_requirements` check_fn that AND-gates browser + vision availability, so it doesn't get advertised to the model when the call would fail at runtime. 
Reproduction (config from the bug report): model.provider: deepseek model.default: deepseek-v4-pro auxiliary.vision.provider: openai auxiliary.vision.model: gpt-4o-mini Before: resolve_vision_provider_client() returns None for the explicit provider, fallback auto returns the deepseek client with model='gpt-4o-mini', image hits api.deepseek.com → 'unknown variant image_url'. vision_analyze hidden from tool list; browser_vision exposed but fails at call time. After: resolves to custom + api.openai.com/v1 with model gpt-4o-mini. vision_analyze and browser_vision both gate correctly on capability. Tests: tests/agent/test_vision_routing_31179.py covers all three fixes (12 cases including the user's exact scenario, base_url preservation, text-only-main skip, capability-unknown permissive fallback, and tool gating parity). Existing 382 tests across auxiliary/vision/image_routing suites still pass. * test(vision): use exact hostname check to silence CodeQL substring-sanitization alert * fix(auxiliary): drop model name from vision-skip debug log to silence CodeQL The new `logger.debug(...)` added in the previous commit interpolated both `main_provider` and `vision_model` (a public model slug, not sensitive). CodeQL's `py/clear-text-logging-sensitive-data` heuristic re-flagged it twice because the rule mis-detects multi-value interpolations near tainted-via-config provider strings. Drop the model from the log args (provider alone is enough to diagnose the skip; the same sibling branch a few lines up already logs provider only). Behavior unchanged; CodeQL false positive cleared.	2026-05-24 15:01:28 -07:00
Hinotoi Agent	d9ec90585c	test(dashboard): send loopback headers for WebSocket sidecar test	2026-05-24 15:00:44 -07:00
hinotoi-agent	2e66eefbc3	fix(dashboard): validate WebSocket Host and Origin	2026-05-24 15:00:44 -07:00
Teknium	186bf25cb1	test(guardrail): assert halt message reaches stream_delta_callback Regression guard for #30770 — verifies the guardrail-halt branch in agent/conversation_loop.py pushes the synthesized halt message through stream_delta_callback before breaking out of the loop. Without the emit, chat-completions SSE writers drain an empty queue and clients (Open WebUI, etc.) see a finish chunk with zero content delta — indistinguishable from a crash. Verified: the test fails when the production fix is reverted.	2026-05-24 07:38:24 -07:00
annguyenNous	38b8d0da85	fix: emit guardrail halt message to client before closing stream When the tool loop guardrail fires (max_tool_failures, etc.), the turn exits with guardrail_halt but no final assistant message was emitted to the client. The SSE stream closed silently — indistinguishable from a crash. The stream_delta_callback(None) before tool execution is a display flush, not a hard close. After generating the halt response, emit it through both _safe_print (CLI) and stream_delta_callback (SSE) so clients see the explanation. Fixes #30770	2026-05-24 07:38:24 -07:00
Teknium	889903f0fa	fix(tests): align CI tests with recent security hardening (#31470 ) Four recent security PRs landed on main with stale/missing test updates, breaking 4 test shards on every subsequent PR's CI run: - test_discord_bot_auth_bypass.py (PR #30742 `c3caca658`): DISCORD_ALLOWED_ROLES no longer bypasses _is_user_authorized. Inverted 3 tests to assert the new (correct) behavior: role config alone does NOT authorize at the gateway layer. - test_msgraph_webhook.py (PR #30169 `4ca77f105`): adapter.is_connected is a @property, not a method. Test was calling it with () after the connect() change; TypeError: 'bool' is not callable. Removed the parens. - test_feishu_approval_buttons.py (PR #30744 `bdb97b857`): Card-action callbacks now go through _allow_group_message authorization. 3 tests in TestCardActionCallbackResponse didn't populate adapter._allowed_group_users so the operator's open_id got rejected. Added the allowlist setup to each test, matching the existing pattern in test_returns_card_for_approve_action. Also raise tolerance on test_wait_for_process_kills_subprocess_on_keyboardinterrupt: the SIGTERM → 3s TimeoutStopSec → SIGKILL → reap chain can exceed 10s under loaded xdist (40 workers). Bumped _wait_for_pgid_exit timeout 10→30s and worker join timeout 5→15s. Passes 100% in isolation already; this just makes it tolerant of CI-host load. Validation: 270/270 tests pass across the 5 affected files.	2026-05-24 06:54:16 -07:00
Hinotoi-agent	3bace071bf	fix(state): restrict sensitive store file permissions response_store.db (api server) holds conversation history including tool payloads, prompts, and results. webhook_subscriptions.json holds per-route HMAC secrets. Under a permissive umask (e.g. 0o022, default on most distros) both files were created mode 0o644 — readable by other local users on shared boxes. - gateway/platforms/api_server.py: ResponseStore tightens itself + WAL/SHM sidecars to 0o600 after __init__, then trusts the inode. (Original contributor patch chmod'd after every _commit() — wasteful on a hot api_server path; chmod-on-create is sufficient since SQLite preserves mode bits across writes.) - hermes_cli/webhook.py: _save_subscriptions writes via tempfile.mkstemp (which itself creates the file with 0o600), chmods the temp before the atomic rename, and re-asserts 0o600 on the destination so an existing permissive file from before this fix gets narrowed. Tests cover (a) creation under permissive umask leaves 0o600 and (b) an existing 0o644 webhook_subscriptions.json gets narrowed on next save. Tests guarded with skipif os.name=='nt' since POSIX mode bits don't apply on Windows. Salvaged from PR #30917 by @Hinotoi-agent. Reworked the api_server.py side from chmod-on-every-commit to chmod-on-create. Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-24 04:55:18 -07:00
m0n3r0	f378f00bfb	fix(feishu): validate verification token before reflecting url_verification challenge When FEISHU_VERIFICATION_TOKEN is configured, an unauthenticated remote could previously prove endpoint control by sending a url_verification payload with any attacker-controlled challenge string — the handler reflected the challenge BEFORE running the token check. Move the verification_token check ahead of the url_verification echo so the challenge response is gated on a valid token. Add a regression test covering the wrong-token case. Also fix the stale test_connect_webhook_mode_starts_local_server fixture to set FEISHU_VERIFICATION_TOKEN (post #30746 webhook mode requires a secret). Salvaged from PR #29663 by @m0n3r0 — kept the url_verification reorder and its regression test; dropped the host-conditional weakening of the #30746 secret guard (we want webhook secrets required regardless of bind host, not only on 0.0.0.0/::). Docs updated to call out the gating. Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-24 04:51:19 -07:00
teknium1	5e6749fbf3	chore(release): map m0n3r0 for PR #29629 salvage	2026-05-24 04:47:45 -07:00
teknium1	15aa6884a2	fix(webhook): use 403 not 500 for missing-secret rejection Operator misconfiguration is a client/setup error, not an internal server exception. 403 "forbidden" more accurately reflects "this route refuses to authenticate" than 500 "internal server error" — the latter triggers incident alerting on operator monitoring and conflates real bugs with config drift. Follow-up tweak to PR #29629 by @m0n3r0.	2026-05-24 04:47:45 -07:00
m0n3r0	dbf73e90fa	fix: fail closed for webhook routes without secrets Reject unsigned webhook requests when a route has no effective HMAC secret, even if the request handler is reached without the normal connect-time validation. Add regression coverage for the direct-handler path.	2026-05-24 04:47:45 -07:00
BaxBit	bbf02c3224	fix(gateway): validate Svix webhook signatures (#30200)	2026-05-24 04:45:13 -07:00
Jiaming Guo	ee002e7fc5	fix(dashboard): require auth for plugin rescan (#27340)	2026-05-24 04:45:07 -07:00
Teknium	5acaeba2bb	fix(mcp): raise ImportError instead of NameError when stdio SDK missing (#31450) When the 'mcp' Python SDK isn't installed, _run_stdio leaked a bare 'NameError: name StdioServerParameters is not defined' because the top-level 'from mcp import ...' fails inside try/except ImportError, leaving the names unbound at module scope. Mirror the _MCP_HTTP_AVAILABLE gate that _run_http already had: raise a clear ImportError with install instructions instead. Fixes #30904	2026-05-24 04:44:59 -07:00
xxxigm	6cafcf9c77	test(streaming): pin partial-stream-stub finish_reason + continuation contract Three test classes lock in the #30963 fix: 1. TestPartialStreamStubFinishReason — drives _interruptible_streaming_api_call through the two recovery branches and asserts: - text-only partial → finish_reason="length" (the new behaviour), - mid-tool-call partial → finish_reason="stop" (unchanged on purpose). 2. TestLengthContinuationPromptBranching — pure-Python check on the branch that picks the continuation prompt by response.id. Locks the network error wording for partial-stream-stub vs. the output-length wording for everything else. 3. TestConversationLoopPartialStreamContinuation — feeds a stub + continuation pair into run_conversation, verifies the loop makes a second API call (instead of exiting with text_response(stop)), confirms the network-error continuation prompt actually reaches the model on call #2, and that final_response stitches both halves. Refs: NousResearch/hermes-agent#30963	2026-05-24 04:35:15 -07:00
xxxigm	20b3703a42	fix(conversation-loop): tailor length-continuation prompt for partial stream The length-continue path's user-facing vprint and continuation prompt both told the model "your response was truncated by the output length limit." That's a lie when the stub came from a partial-stream network error (issue #30963) — and a lie the model can detect, leading to "I wasn't truncated, I'm done" no-op responses that defeat the continuation entirely. Detect the partial-stream-stub via response.id and swap in: - vprint: "Stream interrupted by network error (finish_reason='length' on partial-stream-stub)" - prompt: "[System: The previous response was cut off by a network error mid-stream. Continue exactly where you left off. Do not restart or repeat prior text. Finish the answer directly.]" Real length truncations still see the original "truncated by output length limit" prompt — the model needs to know which class of failure it's recovering from. Same length_continue_retries=3 budget, truncated_response_parts merging, and final-response stitching infrastructure on both branches. Refs: NousResearch/hermes-agent#30963	2026-05-24 04:35:15 -07:00
xxxigm	9140be7c22	fix(streaming): emit finish_reason=length on text-only partial-stream stub When the API connection drops mid-stream after text deltas have already been delivered, chat_completion_helpers returned a stub response with finish_reason=stop. The conversation loop then classified the stub as a clean text completion (text_response(finish_reason=stop)) and exited with iteration budget remaining — even when the goal-judge verdict came back as "continue" milliseconds later (issue #30963). Switch the text-only partial-stream stub to finish_reason=length. The existing length-continuation path (length_continue_retries up to 3, "continue exactly where you left off" prompt, partial parts merged into final_response) then fires automatically: the partial assistant content is persisted, the model is asked to continue from the cut point, and the loop keeps making progress against the goal. The mid-tool-call branch keeps finish_reason=stop on purpose — its user-facing warning ("Ask me to retry if you want to continue") asks the user to drive the retry rather than auto-replaying a tool call with possible side effects. #5544's "no duplicate message" contract is preserved verbatim: the partial content is reused, never re-emitted as a fresh API call, so the user never sees two copies of the same delta. Refs: NousResearch/hermes-agent#30963	2026-05-24 04:35:15 -07:00
teknium1	60d20a37c9	fix(acp): only deliver final_response after streaming when transformed PR #29119 dropped the 'not streamed_message' guard unconditionally so that plugin-transformed responses (transform_llm_output hook) would reach ACP clients. That regressed test_prompt_does_not_duplicate_streamed_final_message: when no transform happened, the streamed text was re-sent as a duplicate final delivery. Tighten the condition to mirror the gateway side: deliver after streaming only when response_transformed=True. Otherwise keep the old guard. Adds test_prompt_delivers_transformed_response_after_streaming so the transformed path stays covered.	2026-05-24 04:31:13 -07:00
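
The write path described in commit 3bace071bf (temp file via tempfile.mkstemp, chmod before the atomic rename, then re-assert 0o600 on the destination) can be sketched roughly as below. This is only an illustration under stated assumptions: save_private_json is a hypothetical helper name, not the project's _save_subscriptions, and the JSON payload shape is made up.

```python
import json
import os
import tempfile


def save_private_json(path: str, payload: dict) -> None:
    """Hypothetical helper: atomically write JSON that only the owner can read."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the file with mode 0o600, independent of the umask.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
        if os.name != "nt":
            # Belt and braces: make the mode explicit before the rename.
            os.chmod(tmp_path, 0o600)
        # Atomic rename; replaces any existing file at the destination.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    if os.name != "nt":
        # Re-assert the mode on the destination path as well.
        os.chmod(path, 0o600)
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is what makes os.replace atomic. The os.name guard mirrors the commit's note that POSIX mode bits do not apply on Windows.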

9401 commits
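
The ordering fix in commit f378f00bfb (check the verification token before echoing the url_verification challenge), combined with the fail-closed 403 behaviour from dbf73e90fa and 15aa6884a2, might look roughly like the sketch below. It is not the project's handler: handle_event, the payload field names, and the response bodies are all assumptions made for illustration.

```python
import hmac
from typing import Optional, Tuple


def handle_event(payload: dict, expected_token: Optional[str]) -> Tuple[int, dict]:
    """Hypothetical handler: return (HTTP status, JSON body) for a webhook event."""
    # Fail closed: a route with no configured secret refuses every request.
    if not expected_token:
        return 403, {"error": "verification token not configured"}

    # Check the token first, using a constant-time comparison.
    supplied = str(payload.get("token", ""))
    if not hmac.compare_digest(supplied.encode(), expected_token.encode()):
        return 403, {"error": "invalid verification token"}

    # Only an authenticated caller gets the challenge reflected back.
    if payload.get("type") == "url_verification":
        return 200, {"challenge": payload.get("challenge", "")}

    return 200, {"ok": True}
```

With this ordering, a request carrying the wrong token gets a 403 and never sees its challenge string echoed, which is the case the commit's regression test is described as covering.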