hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-17 14:42:06 +00:00

Author	SHA1	Message	Date
Teknium	307c85e5c1	fix(goals): auto-pause when judge model returns unparseable output Weak judge models (e.g. deepseek-v4-flash) return empty strings or prose when asked for the strict {done, reason} JSON verdict. The old code failed-open to continue on every such turn, burning the entire turn budget with log lines like judge returned empty response judge reply was not JSON: "Let me analyze whether the goal..." and /goal clear could not stop it mid-loop without /stop. After N=3 consecutive parse failures (transport/API errors don't count — those are transient), the loop auto-pauses and prints: ⏸ Goal paused — the judge model (3 turns) isn't returning the required JSON verdict. Route the judge to a stricter model in ~/.hermes/config.yaml: auxiliary: goal_judge: provider: openrouter model: google/gemini-3-flash-preview Then /goal resume to continue. The counter resets on any usable reply (both "done"/"continue" and API errors) and persists across GoalManager reloads so cross-session resumes carry the correct state. Also fixes test_goal_verdict_send.py sharing a hardcoded session_id across tests — the shared id only worked because the previous _post_turn_goal_continuation was a never-awaited coroutine. Now that PR #19160 made it properly awaited, the xdist test-leakage bug surfaced. Each test gets a unique session_id via uuid suffix.	2026-05-07 17:33:09 -07:00
JC	03ddff8897	fix(gateway): defer goal status notices until after response delivery Route goal status notices through the platform adapter send API and register post-delivery callbacks so completed-goal notices appear after the final assistant response. Also cancel queued synthetic goal continuations on /goal pause and /goal clear while preserving normal queued user messages.	2026-05-07 17:33:09 -07:00
Austin Pickett	7f92e5506e	Merge pull request #20942 from NousResearch/austin/fix/personality fix(tui): preserve session when switching personality	2026-05-07 18:54:29 -04:00
teknium	292f468366	fix(mcp): unwrap platforms key in channels_list channels_list was iterating directory.items() directly, yielding ("updated_at", str) and ("platforms", dict) pairs — neither passed the isinstance(entries_list, list) check, so the inner loop never ran and every call returned count=0 even when channel_directory.json was populated. The writer (gateway/channel_directory.py) wraps the payload as {"updated_at": ..., "platforms": {...}}; every other reader in the codebase unwraps via directory.get("platforms", {}). This aligns channels_list with that convention. Also tightens the existing test_channels_with_directory test, which bypassed the bug by asserting against _load_channel_directory() directly instead of calling channels_list. It now calls the tool end-to-end and a new test_channels_with_directory_platform_filter covers the filter path. Both tests fail against the pre-fix code. Closes #21474 Co-authored-by: chrisworksai <262485129+chrisworksai@users.noreply.github.com>	2026-05-07 13:41:16 -07:00
Blake Johnson	9076a2e74e	fix(agent): keep Nous GPT-5 fallback on chat completions	2026-05-07 13:04:42 -07:00
Teknium	24d48ffb82	feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks (#21435) * feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks The Triage column shipped with a placeholder 'a specifier will flesh out the spec', but the specifier itself was never built. This wires it up as a dedicated CLI verb. `hermes kanban specify <id>` calls the auxiliary LLM (configured under `auxiliary.triage_specifier`) to expand a rough one-liner into a concrete spec — tightened title plus a body with Goal / Approach / Acceptance criteria / Out-of-scope sections — then atomically flips `status: triage -> todo` and recomputes ready so parent-free tasks go straight to the dispatcher on the same tick. Surface: hermes kanban specify <task_id> # single task hermes kanban specify --all [--tenant T] # sweep triage column hermes kanban specify ... --author NAME # audit-comment author hermes kanban specify ... --json # one JSON line per task Design choices: - Parent gating is preserved. specify_triage_task flips to 'todo', then recompute_ready promotes to 'ready' only when parents are done — same rule as a normal parent-gated todo. - No daemon, no background watcher. Every invocation is explicit — keeps cost predictable and doesn't fight the dispatcher loop. - Response parse is lenient: strict JSON preferred, markdown-fence tolerated, raw-body fallback on malformed JSON so the LLM can't strand a task in triage. - All failure modes (no aux client, API error, task moved out of triage mid-call) return SpecifyOutcome(ok=False, reason=...) so --all continues past individual failures.
Changes: hermes_cli/kanban_db.py + specify_triage_task() hermes_cli/kanban_specify.py NEW (~220 LOC — prompt, parse, call) hermes_cli/kanban.py + specify subcommand + _cmd_specify hermes_cli/config.py + auxiliary.triage_specifier task slot website/docs/user-guide/features/kanban.md specify + config notes website/docs/reference/cli-commands.md CLI reference entry tests/hermes_cli/test_kanban_specify_db.py NEW (10 tests) tests/hermes_cli/test_kanban_specify.py NEW (20 tests) Validation: 30/30 targeted tests pass. E2E: triage task -> specify -> ends in 'ready' with events [created, specified, promoted] and the audit comment recorded under the configured author. * feat(kanban): wire specifier into dashboard and gateway slash Follow-ups to the initial PR #21435 — closes the two gaps I'd left as post-merge: dashboard button and first-class gateway surface. Dashboard (plugins/kanban/dashboard/) - POST /tasks/:id/specify NEW endpoint. Thin wrapper around kanban_specify.specify_task(). Returns the CLI outcome shape ({ok, task_id, reason, new_title}); ok=false with a human reason is a 200, not a 4xx, so the UI can render it inline without treating 'no aux client configured' as a crash. - Runs sync in FastAPI's threadpool because the LLM call can take tens of seconds on reasoning models. - Pins HERMES_KANBAN_BOARD around the specify call so the module's argless kb.connect() lands on the right board. - dist/index.js: doSpecify callback threaded through the drawer → TaskDetail → StatusActions prop chain. ✨ Specify button appears ONLY when task.status === 'triage' (elsewhere the backend would reject anyway — hide the button to keep the action row clean). Busy state (Specifying…) + inline success/error banner under the button using the response.reason text. - dist/style.css: tiny hermes-kanban-msg-ok / -err classes using existing --color vars so themes reskin cleanly. 
Gateway slash (/kanban specify) - Already works via the existing run_slash → build_parser → kanban_command pipeline. No code change needed — slash commands inherit the argparse tree automatically. Added coverage: test_run_slash_specify_end_to_end (create --triage, specify, verify promotion + retitle) and test_run_slash_specify_help_is_reachable. Tests - tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the REST endpoint — happy path, non-triage rejection as ok=false 200, missing aux client as ok=false 200. - tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests. Docs - website/docs/user-guide/features/kanban.md: dashboard action row description mentions ✨ Specify + all three surfaces. REST table gains /tasks/:id/specify. Slash examples include /kanban specify. Validation: 340/340 targeted tests pass. E2E via TestClient: create a triage task over REST → POST /specify with mocked aux client → task moves to 'ready' column on /board with new title and body applied.	2026-05-07 13:04:41 -07:00
adybag14-cyber	732a6c45fa	feat: add termux doctor fallback guidance for blocked extras	2026-05-07 13:04:08 -07:00
adybag14-cyber	dc5ef1ac8e	fix: add termux-all install profile and safe fallbacks	2026-05-07 13:04:08 -07:00
adybag14-cyber	da18fd084a	fix: strengthen termux install network prerequisites	2026-05-07 13:04:08 -07:00
adybag14-cyber	54c0b10d14	fix(update): add heartbeat during dependency install	2026-05-07 13:04:08 -07:00
Abd0r	04193cf71c	feat(web): add Brave Search (free tier) and DDGS search providers Both implement WebSearchProvider via tools/web_providers/ — matching the existing SearXNG pattern (PR #`5c906d702`). Search-only; pair with any extract provider via web.extract_backend. - tools/web_providers/brave_free.py — Brave Search API (free tier, 2k queries/mo). Uses BRAVE_SEARCH_API_KEY as X-Subscription-Token. - tools/web_providers/ddgs.py — DuckDuckGo via the ddgs Python package. No API key; gated on package importability. - tools/web_tools.py: both backends added to _get_backend() config list and auto-detect chain (trails paid providers), _is_backend_available, web_search_tool dispatch, web_extract_tool + web_crawl_tool search-only refusals, check_web_api_key, and the __main__ diagnostic. Introduces _ddgs_package_importable() helper so tests can monkeypatch a single symbol for the ddgs availability check. - hermes_cli/tools_config.py: picker entries for both providers; ddgs gets a post_setup handler that runs `pip install ddgs`. - hermes_cli/config.py: BRAVE_SEARCH_API_KEY in OPTIONAL_ENV_VARS. - scripts/release.py: AUTHOR_MAP entry for @Abd0r. - tests: 14 new tests (brave-free) + 15 new tests (ddgs) covering provider unit behavior, backend wiring, and search-only refusals. Salvages the brave-free + ddgs portion of PR #19796. Not included: the in-line helpers in web_tools.py (replaced with provider modules to match the shipped architecture), the lynx-based extract path (these backends should refuse extract with a clear error — users pair with a real extract provider), and scripts/start-llama-server.sh (unrelated). Co-authored-by: Abd0r <223003280+Abd0r@users.noreply.github.com>	2026-05-07 09:59:17 -07:00
xxxigm	cdc0a47dd5	test(hermes_constants): cover parse_reasoning_effort()	2026-05-07 09:59:07 -07:00
Teknium	7e2af0c2e8	feat(acp): pass image file attachments through as image_url parts Extends PR #21400's resource inlining with image-specific handling: ACP resource_link and embedded blob resources with an image/* mime (or image file suffix when mime is missing) now emit an OpenAI image_url part with a base64 data URL, so vision models actually see the image instead of a [Binary file omitted] note. Non-image resources keep the existing text-inlining behavior. Adds 3 tests: local PNG via resource_link, JPEG mime inferred from suffix when client omits mimeType, and embedded blob PNG.	2026-05-07 09:24:32 -07:00
HenkDz	733e297b8a	fix(acp): inline file attachment resources	2026-05-07 09:24:32 -07:00
Teknium	2564132a1f	fix(telegram): preserve thread_id=1 for forum General typing indicator (#21390) The May 5 refactor in `d5357f816` made _message_thread_id_for_typing() symmetric with _message_thread_id_for_send() by mapping the General topic (thread id "1") to None upfront for both. That's correct for sendMessage — Telegram rejects message_thread_id=1 on sends and the topic must be omitted — but it's wrong for sendChatAction. Observed behavior (confirmed via before/after Telegram wire traces): Before `d5357f816`: thread_id=1 → message_thread_id=1 → bubble visible in General After `d5357f816`: thread_id=1 → message_thread_id=None → no visible typing Omitting message_thread_id on sendChatAction does NOT fall back to the General topic's view in a forum-enabled supergroup; the bubble ends up hidden from the client's General-topic pane entirely. For any user on a forum-group, the typing indicator stopped appearing. Fix: drop the symmetric "1 → None" mapping from the typing resolver. sendMessage still maps 1 → None via _message_thread_id_for_send (that side was never broken). The asymmetry is real and required by Telegram's API — document it in the resolver docstring. Partial revert of `d5357f816`; restores the behavior from `0cf7d570e` ("fix(telegram): restore typing indicator and thread routing for forum General topic"). Does not re-introduce the retry-without-thread fallback that `41545f7ec` scoped down for DM topics — with the resolver fixed, the first call already hits the right wire shape. Test updated from test_send_typing_general_topic_uses_none_thread_id (which encoded the broken contract) to test_send_typing_preserves_general_topic_thread_id, asserting the single correct call with message_thread_id=1. 10 other tests in the file untouched and passing.	2026-05-07 08:39:21 -07:00
Teknium	812ce0b987	fix(run_agent): break permanent empty-response loop from orphan tool-tail (#21385) When empty-response terminal scaffolding fires on a tool-result turn, _drop_trailing_empty_response_scaffolding left the live history ending at a bare 'tool' message. The next user input then landed as [...tool, user], a protocol-invalid sequence that OpenRouter/Opus and other providers silently fail on (returns empty content). That retriggered the empty-retry recovery every turn, and recovery flags never hit SQLite (no column for them), so history kept looking broken on every reload. Two fixes: 1. Scaffolding strip rewinds the orphan assistant(tool_calls)+tool pair after popping sentinels. Only fires when scaffolding flags were actually present, so mid-iteration tool loops are untouched. 2. _repair_message_sequence runs right before every API call as a defensive belt: drops stray tool messages with unknown tool_call_ids, merges consecutive user messages so no user input is lost. Does NOT rewind assistant(tool_calls)+tool+user — that pattern is valid when the user redirected before the model got its continuation turn. Repro: session 20260507_044111_fa7e65. Opus-4.7/OpenRouter returned content-less response after a 42KB execute_code output, nudge+retry chain exhausted (no fallback configured), terminal sentinel appended, scaffolding stripped leaving bare tool tail, user typed 'wtf happened..' and landed as tool→user violation. Every subsequent turn collapsed in <50ms with the same 3-retry empty chain because the API request itself was malformed. Verified live via HTTP mock: pre-fix reproduced 5 api_calls/0.15s exit 'empty_response_exhausted'; post-fix 1 api_call/0.10s exit 'text_response(finish_reason=stop)'. Three-turn session flows cleanly through the scenario. Full run_agent suite: 1242 passed (0 regressions, 2 pre-existing concurrent_interrupt failures unrelated).	2026-05-07 08:35:10 -07:00
Teknium	1d2029b2b7	fix(update): reset-failed before every fallback restart so the gateway can't get stranded (#21371) cmd_update's auto-restart path could leave the gateway dead after a transient failure in systemd's own auto-restart window. Reproduced on Ubuntu 25.10 + systemd 257: after update, gateway drains and exits 75, systemd's first respawn 60s later fails (status=200/CHDIR with "No such file or directory" on a WorkingDirectory that demonstrably exists), the unit ends up in RestartMaxDelaySec=300 backoff, and cmd_update's fallback 'systemctl restart' never recovers it — leaving users with a permanently silent gateway until they manually run 'systemctl reset-failed'. The fix mirrors the recovery pattern 'hermes gateway restart' (systemd_restart) got in PR #20949: always reset-failed before restart, on both the initial fallback and the retry. Also rewrites the final failure message to tell the user to reset-failed + restart (not just restart, which is the step that already failed twice).	2026-05-07 08:34:12 -07:00
Teknium	04918345ea	fix(cron): initialize MCP servers before constructing the cron AIAgent (#21354) cron/scheduler.py:run_job() constructed AIAgent(...) without ever calling discover_mcp_tools(). The CLI and gateway paths do this at startup; cron jobs inherited none of it and the user's configured mcp_servers were invisible inside every cron run. Insert discover_mcp_tools() right before AIAgent(), wrapped in try/except so a broken MCP server can't kill an otherwise-working cron job. The call is idempotent: register_mcp_servers() short-circuits on already-connected servers, so subsequent ticks in the same scheduler process pay ~0ms. Scoped to the LLM path only; no_agent script jobs skip it entirely. Closes #4219.	2026-05-07 07:53:03 -07:00
WideLee	4de3ef38b1	feat(qqbot): wire native tool-approval UX via inline keyboards Makes the in-tree QQ inline keyboards actually light up when the agent blocks on a dangerous-command approval. Matches the cross-adapter gateway contract already implemented by Discord, Telegram, Slack, Matrix, and Feishu. Gateway/run.py's _approval_notify_sync checks type(adapter).send_exec_approval and falls back to a text prompt when it's missing. Without this wiring, QQ users stared at plain '/approve' text even though the adapter shipped button primitives. ### send_exec_approval(chat_id, command, session_key, description, metadata) Matches the signature the gateway calls with. Builds an ApprovalRequest (command_preview, description, timeout) and delegates to send_approval_request. Uses the last inbound msg_id as reply_to so QQ accepts the passive message. The 'metadata' parameter is accepted for contract parity but intentionally unused — QQ doesn't have thread_id/DM-targeting overrides. ### send_update_prompt(chat_id, prompt, default, session_key, metadata) Signature updated to match the cross-adapter contract used by 'hermes update --gateway' watcher. Renders a 'Update Needs Your Input' prompt with the optional default hint and a Yes/No keyboard. Replaces the earlier 3-arg helper that wasn't wired anywhere. ### Default interaction dispatcher _default_interaction_dispatch() auto-registered as the adapter's interaction callback in __init__. Routes: - approve:<session_key>:<decision> → tools.approval.resolve_gateway_approval Button → choice mapping: allow-once → 'once' allow-always → 'always' deny → 'deny' (QQ's 3-button mobile layout deliberately collapses 'session' + 'always' into one button; /approve session text fallback remains available.) 
- update_prompt:<answer> → atomic write of y/n to ~/.hermes/.update_response (the detached 'hermes update --gateway' watcher polls this file) - anything else → logged and dropped Resolve exceptions are caught and logged — never propagate into the WS loop. Callers can override via set_interaction_callback() to route clicks elsewhere or pass None to drop them entirely. ### Net effect QQ users now get native tap-to-approve UX on dangerous-command prompts and update-confirmation prompts, without having to type /approve or /deny as text. The adapter hooks into tools.approval the same way every other button-capable platform does. ### Tests 14 new tests cover: - Default callback installed on __init__ - send_exec_approval / send_update_prompt exist as class methods (so the gateway's type-probe detects them) - allow-once/always/deny each map to the correct resolve choice - update_prompt:y / update_prompt:n each write atomically to the response file (via monkeypatched get_hermes_home) - Unknown button_data / empty button_data / resolve exceptions are harmless - send_exec_approval honours last_msg_id reply-to and accepts metadata - send_update_prompt delegates with correct content + keyboard Full qqbot suite: 144 passed (72 pre-existing + 72 from this salvage arc). Also ran tools/test_approval.py alongside — no regressions (276 passed combined). Co-authored-by: WideLee <limkuan24@gmail.com>	2026-05-07 07:48:15 -07:00
Teknium	a1fe5f473d	fix(cron): scan assembled prompt including skill content (#3968) (#21350) _scan_cron_prompt ran at cron create/update time on the user-supplied prompt but skill content loaded inside _build_job_prompt at runtime was never scanned. Combined with non-interactive auto-approval, a malicious skill carrying an injection payload could execute with full tool access every tick. - cron/scheduler.py: new CronPromptInjectionBlocked exception and _scan_assembled_cron_prompt helper. _build_job_prompt now routes both return paths (with skills / without skills) through the helper, raising on match. run_job catches the exception and returns a clean (False, blocked_doc, "", error) tuple so the operator sees a BLOCKED delivery with the scanner result and an audit hint, rather than a scheduler crash or a silent skip. - tests/cron/test_cron_prompt_injection_skill.py: 10 regression tests. Unit coverage on _scan_assembled_cron_prompt (clean/injection/exfil/invisible-unicode). End-to-end coverage via _build_job_prompt with planted skills (injection payload, env exfil, zero-width space, clean control, missing-skill-doesn't-crash). Fixture patches tools.skills_tool.SKILLS_DIR / HERMES_HOME so planted skills are visible. Importantly uses the current cron.scheduler module object (not a top-level import) so tests don't break when other fixtures reload cron.scheduler — CronPromptInjectionBlocked identity depends on which module object defined it.	2026-05-07 07:44:10 -07:00
maciekczech	162ad3dd16	fix(kanban): filter dashboard board by selected tenant	2026-05-07 07:39:57 -07:00
maciekczech	f4de3810ef	test(kanban): cover dashboard select filter wiring	2026-05-07 07:39:57 -07:00
Teknium	74c9c0eec9	fix(mcp): gate utility stubs on server-advertised capabilities (#21347) For every connected MCP server we register four "utility" tool schemas (mcp_<server>_list_resources, read_resource, list_prompts, get_prompt). The existing gate was `hasattr(server.session, method)` — but `mcp.ClientSession` defines all four methods on the class regardless of what the remote server supports, so the gate never filtered anything. Tools-only servers (e.g. @upstash/context7-mcp which advertises only `tools`) ended up with 4 dead stubs; every model call to them returned JSON-RPC -32601 Method not found, which made the model conclude the server was broken even when the real tools worked. Capture the `InitializeResult` returned by `await session.initialize()` on the `MCPServerTask`, then gate each utility schema on the corresponding `capabilities` sub-object (resources / prompts). A legacy `hasattr` fallback runs when `initialize_result` is missing (older test fixtures / not-yet-captured code paths) so pre-existing behavior is preserved. Verified against real `mcp.types.InitializeResult` pydantic models: - Context7 shape (tools only) → 0 utility stubs registered (was 4) - Resources-only server → 2 stubs (list_resources, read_resource) - Prompts-only server → 2 stubs (list_prompts, get_prompt) - Fully capable server → all 4 stubs Closes #18051. Co-authored-by: nikolay-bratanov <nikolay-bratanov@users.noreply.github.com>	2026-05-07 07:39:50 -07:00
teknium1	898b6d7d55	fix(webhook): widen INSECURE_NO_AUTH loopback check + tests + docs Follow-up to the previous commit: - Add _is_loopback_host() helper covering 127.0.0.1, localhost, ::1, ip6-localhost, ip6-loopback (case-insensitive). Empty/None host is treated as non-loopback since unset usually means public default bind. - Fix mixed-indent comment in the safety rail (comment now aligned with the if-block) and collapse the nested-if into one condition. - Add TestInsecureNoAuthSafetyRail covering rejection on 0.0.0.0, a LAN IP, and empty host; allowance on 127.0.0.1/localhost; plus unit-level parametrized coverage of _is_loopback_host for spellings we can't bind in the hermetic test env (::1, ip6-localhost, ip6-loopback). - Pin test_connect_starts_server + test_webhook_deliver_only defaults to 127.0.0.1 so they keep passing under the new rail. - Document the behavior in website/docs/user-guide/messaging/webhooks.md.	2026-05-07 07:38:43 -07:00
WideLee	5b121c6e35	feat(qqbot): process attachments in quoted (reply) messages When a user replies while quoting another message, QQ sets 'message_type = 103' and pushes the referenced message's content + attachments inside 'msg_elements[0]'. The old adapter ignored msg_elements entirely, so: - Bare quote-replies (no user text) surfaced nothing to the LLM. - Quoted images/files/voice were never downloaded or described. - Quoted voice messages specifically produced no transcript — the model had no way to see what the user was referring to when saying 'about this voice note…'. This commit adds _process_quoted_context(d) which extracts msg_elements, unions their attachments, and runs them through the SAME _process_attachments pipeline as the main message body. Quoted voice gets an STT transcript (tried via QQ's asr_refer_text first, then the configured STT provider); quoted images get cached just like main-body images; quoted files surface with their original filename intact (not the CDN URL hash). The quoted content is prepended to the user's text as a '[Quoted message]:' block so the LLM sees the full referential context on one turn. Images-only quotes surface a '[Quoted message]: (image)' marker so the model knows an image was referenced even if no text came with it. All four inbound handlers (_handle_c2c_message, _handle_group_message, _handle_guild_message, _handle_dm_message) now call the helper uniformly — one merge pattern, not four divergent implementations. Filename preservation is carried by _process_attachments' existing '[Attachment: {filename or ct}]' line; nothing else needed for that. 
12 new tests under TestProcessQuotedContext and TestMergeQuoteInto cover: - Non-quote messages short-circuit to empty - message_type=103 with no msg_elements is harmless - Text-only quotes render with '[Quoted message]:' prefix - Voice attachments in the quote flow through STT - File attachments in the quote preserve the original filename - Image attachments surface cached paths + media types - Images-only quote still emits a marker - Multiple msg_elements are concatenated - Malformed message_type values return empty - _merge_quote_into prepends with a blank-line separator Full qqbot suite: 130 passed (72 existing + 19 chunked + 27 keyboards + 12 quoted). Co-authored-by: WideLee <limkuan24@gmail.com>	2026-05-07 07:36:30 -07:00
WideLee	de584cd1dd	feat(qqbot): add inline-keyboard approvals and update prompts The QQ Bot v2 API supports inline keyboards on outbound messages. When a user taps a button, the platform dispatches an INTERACTION_CREATE gateway event; the bot ACKs it via PUT /interactions/{id} and decodes the button's data payload to route the click. This commit adds: New module gateway/platforms/qqbot/keyboards.py - Inline-keyboard dataclasses (InlineKeyboard, KeyboardRow, KeyboardButton, KeyboardButtonAction, KeyboardButtonRenderData, KeyboardButtonPermission) that serialize to the JSON shape the QQ API expects. - build_approval_keyboard(session_key) — 3-button layout: ✅ 允许一次 / ⭐ 始终允许 / ❌ 拒绝, all sharing group_id='approval' so clicking one greys out the rest. - build_update_prompt_keyboard() — Yes/No keyboard for update confirms. - parse_approval_button_data() / parse_update_prompt_button_data() — decode the button_data payload from INTERACTION_CREATE. approve:<session_key>:<decision> (decision = allow-once|allow-always|deny) update_prompt:<answer> (answer = y|n) - build_approval_text(ApprovalRequest) — markdown renderer for the surrounding message body (exec-approval and plugin-approval variants, with severity icons 🔴/🔵/🟡). - parse_interaction_event(raw) → InteractionEvent dataclass — normalizes the nested raw payload (id / scene / openids / button_data / etc.). Adapter changes (gateway/platforms/qqbot/adapter.py) - _dispatch_payload routes INTERACTION_CREATE → _on_interaction. - _on_interaction parses the event, ACKs via PUT /interactions/{id}, then invokes a user-registered interaction callback. Exceptions from the callback are caught and logged (never propagate into the WS loop). - set_interaction_callback(cb) lets gateway wiring register a routing handler that inspects button_data and resolves the corresponding pending approval / update prompt. - _send_c2c_text / _send_group_text now accept an optional keyboard kwarg and append it to the outbound body.
- send_with_keyboard(chat_id, content, keyboard, reply_to=None) — public helper that sends a single short message with a keyboard attached. Does NOT chunk-split (a keyboard message has one interactive surface). Guild chats are rejected non-retryably — they don't support keyboards. - send_approval_request(chat_id, ApprovalRequest, reply_to=None) + send_update_prompt(chat_id, content, reply_to=None) — convenience wrappers over send_with_keyboard. Tests 27 new unit tests under TestApprovalButtonData, TestUpdatePromptButtonData, TestBuildApprovalKeyboard, TestBuildUpdatePromptKeyboard, TestBuildApprovalText, TestInteractionEventParsing, and TestAdapterInteractionDispatch. Cover: - Button-data round-trip (build → parse returns original session/decision) - Keyboard JSON shape + mutual-exclusion group_id - Exec vs plugin approval text templates + severity icons - Interaction event parsing (c2c / group / guild scene codes) - _on_interaction end-to-end: ACK invoked, callback receives parsed event, callback exceptions are swallowed, missing id skips ACK, no registered callback is harmless. Full qqbot suite: 118 passed (72 existing + 19 chunked + 27 keyboards). Co-authored-by: WideLee <limkuan24@gmail.com>	2026-05-07 07:36:30 -07:00
WideLee	9feaeb632b	feat(qqbot): add chunked upload with structured error types The v2 'single POST /v2/{users|groups}/{id}/files' upload path is capped at ~10 MB inline (base64 'file_data' or 'url'). For larger files the QQ platform provides a three-step flow: 1. POST /upload_prepare → upload_id + pre-signed COS part URLs 2. PUT each part to its COS URL → POST /upload_part_finish 3. POST /files with {upload_id} → file_info token This commit adds a new gateway/platforms/qqbot/chunked_upload.py module that implements the flow, wires it into QQAdapter._send_media for local files (URL uploads keep the existing inline path), and introduces structured exceptions so the caller can surface actionable error text: - UploadDailyLimitExceededError (biz_code 40093002, non-retryable) - UploadFileTooLargeError (file exceeds the platform limit) Both carry file_name / file_size_human / limit_human so the model can compose user-friendly replies instead of seeing opaque HTTP codes. The part_finish 40093001 retryable-error loop respects the server-provided retry_timeout (capped at 10 minutes locally) with a 1 s polling interval. COS PUTs retry transient failures up to 2 times with exponential backoff. complete_upload retries up to 2 times. Covers files up to the platform's ~100 MB per-file limit; before this the adapter silently rejected anything over ~10 MB. 19 new unit tests under TestChunkedUpload* cover the happy path, prepare-response parsing, helper functions, part retries, COS PUT retries, group vs c2c routing, and the structured-error mapping. Co-authored-by: WideLee <limkuan24@gmail.com>	2026-05-07 07:36:30 -07:00
Teknium	ac51c4c1ad	feat(kanban): per-task max_retries override (#20263 follow-up, supersedes #20972) (#21330) Adds a per-task override for the consecutive-failure circuit breaker, so individual tasks can opt out of the global ``kanban.failure_limit`` without dragging everyone else with them. Resolution order (now three tiers): 1. per-task ``max_retries`` (new, this commit) 2. caller-supplied ``failure_limit`` — the gateway threads ``kanban.failure_limit`` from config here 3. ``DEFAULT_FAILURE_LIMIT`` (2) Changes: - ``tasks.max_retries INTEGER`` column + migration for existing DBs (NULL = no override, matches pre-column behavior). - ``Task.max_retries`` field + ``from_row`` plumbing. - ``create_task(..., max_retries=N)`` kwarg. - ``_record_task_failure`` reads the per-task value first and records ``limit_source`` + ``effective_limit`` on the ``gave_up`` event so operators can see which tier won. - CLI: ``hermes kanban create --max-retries N`` (rejects ``< 1``). - CLI: ``hermes kanban show`` surfaces the effective threshold + source (``(task)``, ``(config kanban.failure_limit)``, ``(default)``). - CLI: ``_task_to_dict`` includes ``max_retries`` in ``--json`` output. Key design choice vs. the earlier #20972 attempt: - No new config key. The existing ``kanban.failure_limit`` (landed in #21183) is the dispatcher-tier source — no silent break for users who already tuned it. - No ``!=`` sentinel for "is config set" (which would misfire when config equals the default). The tier-winner is determined purely by "is per-task override set" — the dispatcher always wins when per-task is NULL, regardless of whether the caller passed the default or a configured value. E2E verified across four scenarios: default-only (trips at 2), config-only (trips at caller's value), per-task-only beats default (trips at task value), per-task beats larger config (trips at task value). ``gave_up`` event metadata correctly records ``limit_source`` and ``effective_limit`` in all cases.
Tests: - ``test_per_task_max_retries_overrides_dispatcher_limit`` — task=1 beats caller=10. - ``test_per_task_max_retries_allows_more_than_default`` — task=5 does not trip at caller=default of 2. - ``test_max_retries_none_falls_through_to_dispatcher_limit`` — None honors caller's config value (4), records ``limit_source=dispatcher``. Full kanban trio (db + core + cli + tools + dashboard-plugin): 342 passed, no regressions. Supersedes: #20972 (@jelrod27) — credit in PR close comment. Ref: #20263 (tangentially — the reporter asked about adapter API drift, not retry caps, but the CLI discussion there is what surfaced the original ask).	2026-05-07 07:29:02 -07:00
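The three-tier resolution order described in the commit above can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `resolve_failure_limit` and the exact source labels are assumptions, while `DEFAULT_FAILURE_LIMIT = 2` and the tier order come from the commit message.

```python
from typing import Optional, Tuple

DEFAULT_FAILURE_LIMIT = 2  # value stated in the commit message


def resolve_failure_limit(
    task_max_retries: Optional[int],
    caller_failure_limit: Optional[int],
) -> Tuple[int, str]:
    """Return (effective_limit, limit_source) for a task.

    Hypothetical helper: the tier is decided purely by whether the
    per-task override is set, never by comparing against the default.
    """
    if task_max_retries is not None:
        # Tier 1: per-task override always wins when present.
        return task_max_retries, "task"
    if caller_failure_limit is not None:
        # Tier 2: whatever the dispatcher passed in (config or default).
        return caller_failure_limit, "dispatcher"
    # Tier 3: nothing supplied at all.
    return DEFAULT_FAILURE_LIMIT, "default"


print(resolve_failure_limit(1, 10))    # per-task beats a larger caller value
print(resolve_failure_limit(None, 4))  # falls through to the dispatcher tier
```

Deciding the winner by "is the override set" avoids the misfire the commit mentions, where a configured value that happens to equal the default would be mistaken for "not configured".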
Teknium	145e8ec237	fix(pairing): enforce lockout on approve_code, not just generate_code (#10195 ) (#21325 ) PairingStore.approve_code() didn't consult _is_locked_out(), so after MAX_FAILED_ATTEMPTS bad approvals the lockout flag was set but a valid code still got accepted — any pending code (legitimately issued or attacker-obtained) could be approved during the 1-hour lockout window, nullifying the brute-force protection. - gateway/pairing.py: lockout check runs in approve_code() right after _cleanup_expired, before the pending lookup. Returns None on lockout. - tests/gateway/test_pairing.py: test_lockout_blocks_code_approval pins the regression — reporter's exact reproducer (generate valid code, exhaust attempts with WRONGCODE, try to approve valid code) must return None and leave is_approved == False. Also pins recovery: once lockout expires, the still-pending code approves normally. - hermes_cli/pairing.py: _cmd_approve distinguishes the two None cases. On lockout, prints 'Platform locked out... clears in N minutes. To reset sooner, delete the _lockout:<platform> entry from _rate_limits.json' instead of the misleading 'Code not found or expired' message. 29/29 pairing tests pass; E2E-verified with reporter's exact Python reproducer.	2026-05-07 07:18:21 -07:00
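The lockout ordering from the pairing fix above can be illustrated with a small stand-in class. Everything here is a sketch under stated assumptions: `PairingStoreSketch`, `add_pending`, the in-memory fields, and `MAX_FAILED_ATTEMPTS = 5` are hypothetical; only the idea that `approve_code()` checks the lockout before the pending lookup, and the 1-hour window, come from the commit message.

```python
import time
from typing import Callable, Dict, Optional, Set

MAX_FAILED_ATTEMPTS = 5   # assumed value, not taken from the source
LOCKOUT_SECONDS = 3600    # 1-hour window per the commit message


class PairingStoreSketch:
    """In-memory stand-in for a pairing store (hypothetical)."""

    def __init__(self, now: Callable[[], float] = time.time) -> None:
        self._now = now
        self._pending: Dict[str, str] = {}  # code -> user id
        self._approved: Set[str] = set()
        self._failed = 0
        self._locked_until = 0.0

    def add_pending(self, code: str, user_id: str) -> None:
        self._pending[code] = user_id

    def _is_locked_out(self) -> bool:
        return self._now() < self._locked_until

    def is_approved(self, user_id: str) -> bool:
        return user_id in self._approved

    def approve_code(self, code: str) -> Optional[str]:
        # The fix: consult the lockout BEFORE looking up the code, so a
        # valid code cannot be approved during the lockout window.
        if self._is_locked_out():
            return None
        user_id = self._pending.pop(code, None)
        if user_id is None:
            self._failed += 1
            if self._failed >= MAX_FAILED_ATTEMPTS:
                self._locked_until = self._now() + LOCKOUT_SECONDS
                self._failed = 0
            return None
        self._approved.add(user_id)
        return user_id
```

Because both "locked out" and "code not found" return `None`, a caller that wants distinct messages (as the CLI change does) would need to query the lockout state separately.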
qWaitCrypto	62c2f5d8d2	fix(mcp): coerce numeric tool args defensively	2026-05-07 07:17:12 -07:00
Ramón Fernández	44cd79e798	feat(plugins/google_chat): Google Chat platform adapter as a bundled plugin Adds Google Chat as a new gateway platform, shipped under plugins/platforms/google_chat/ following the canonical bundled-plugin pattern (Teams, IRC). Rewired from the original PR #18425 to use the new env_enablement_fn + cron_deliver_env_var plugin interfaces landed in the preceding commit, so the adapter touches ZERO core files. What it does: - Inbound DM + group messages via Cloud Pub/Sub pull subscription (no public URL needed), with attachments (PDFs, images, audio, video) downloaded through an SSRF-guarded Google-host allowlist. - Outbound text replies with the 'Hermes is thinking…' patch-in-place pattern — no tombstones. - Native file attachment delivery via per-user OAuth. Google Chat's media.upload endpoint rejects service-account auth, so each user runs /setup-files once in their own DM to grant chat.messages.create for themselves; the adapter then uploads as them. Tokens stored per email at ~/.hermes/google_chat_user_tokens/<email>.json. - Thread isolation: side-threads get isolated sessions, top-level DM messages share one continuous session. Persistent thread-count store survives gateway restart. - Supervisor reconnect with exponential backoff. - Multi-user out of the box. How it plugs in (no core edits): - env_enablement_fn seeds PlatformConfig.extra with project_id, subscription_name, service_account_json, and the home_channel dict (which the core hook turns into a HomeChannel dataclass). Reads GOOGLE_CHAT_PROJECT_ID (falls back to GOOGLE_CLOUD_PROJECT), GOOGLE_CHAT_SUBSCRIPTION_NAME (falls back to GOOGLE_CHAT_SUBSCRIPTION), GOOGLE_CHAT_SERVICE_ACCOUNT_JSON (falls back to GOOGLE_APPLICATION_CREDENTIALS), GOOGLE_CHAT_HOME_CHANNEL. - cron_deliver_env_var='GOOGLE_CHAT_HOME_CHANNEL' gets cron delivery for free — cron/scheduler.py consults the platform registry for any name not in its hardcoded built-in sets. 
- plugin.yaml's rich requires_env / optional_env blocks auto-populate OPTIONAL_ENV_VARS via the new hermes_cli/config.py injector, so 'hermes config' UI surfaces them with description / url / prompt / password metadata. - Module-level Platform('google_chat') call in adapter.py triggers the Platform._missing_() registration so Platform.GOOGLE_CHAT attribute access works without an enum entry. Distribution: ships inside the existing hermes-agent package. Users opt in via 'pip install hermes-agent[google_chat]' and follow the 8-step GCP walkthrough at website/docs/user-guide/messaging/google_chat.md. Test coverage: 153 tests in tests/gateway/test_google_chat.py, all passing. Spans platform registration, env config loading, Pub/Sub envelope routing, outbound send + chunking + typing patch-in-place, attachment send paths, SSRF guard, thread/session model, supervisor reconnect, authorization, per-user OAuth, and the new plugin-registry cron delivery wiring. Credit: adapter + OAuth + tests + docs authored by @donramon77 (PR #18425). Rewire onto the new plugin hooks + salvage commit by Teknium. Co-Authored-By: Ramón Fernández <112875006+donramon77@users.noreply.github.com>	2026-05-07 07:15:44 -07:00
Teknium	c8e3e39185	fix(mcp): surface image tool results as MEDIA tags instead of dropping them (#21328 ) MCP tool results can include ImageContent blocks (screenshots from Playwright/Blockbench/Puppeteer etc). The tool result handler only extracted block.text, so image blocks were silently dropped and the agent saw an empty or text-only response — losing the actual payload. Add _cache_mcp_image_block() that base64-decodes the block, validates the bytes via gateway.platforms.base.cache_image_from_bytes (which sniffs for PNG/JPEG/WebP signatures and rejects non-images), writes to the shared `~/.hermes/cache/images/` dir, and returns a MEDIA:<path> tag. The handler appends that tag to the result parts so downstream gateway adapters render the image inline. Logs and drops on malformed base64 / non-image payload rather than raising — a single bad block shouldn't kill the tool call. Distilled from #17915 (c3115644151) and #10848 (gnanirahulnutakki), both too stale to cherry-pick (branches diverged enough to revert dozens of unrelated fixes). Went with #10848's approach of plumbing through Hermes' existing MEDIA tag / cache_image_from_bytes infrastructure rather than #17915's raw tempfile path, because it integrates with the remote-backend mount system and messaging adapters that already handle MEDIA tags natively. Co-authored-by: c3115644151 <c3115644151@users.noreply.github.com> Co-authored-by: gnanirahulnutakki <gnanirahulnutakki@users.noreply.github.com>	2026-05-07 07:14:16 -07:00
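The decode, validate, cache, and tag flow described in the commit above might look roughly like this. It is a sketch only: `cache_image_block` and `sniff_extension` are hypothetical names, the hashed file name is an assumption, and the real code delegates validation to `cache_image_from_bytes` rather than sniffing signatures itself.

```python
import base64
import binascii
import hashlib
from pathlib import Path
from typing import Optional


def sniff_extension(data: bytes) -> Optional[str]:
    """Return a file extension for PNG/JPEG/WebP bytes, else None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return ".png"
    if data.startswith(b"\xff\xd8\xff"):
        return ".jpg"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return ".webp"
    return None


def cache_image_block(b64_data: str, cache_dir: Path) -> Optional[str]:
    """Decode one image block and return a MEDIA:<path> tag.

    Returns None (instead of raising) on malformed base64 or a
    non-image payload, so one bad block does not fail the tool call.
    """
    try:
        raw = base64.b64decode(b64_data, validate=True)
    except (binascii.Error, ValueError):
        return None
    ext = sniff_extension(raw)
    if ext is None:
        return None
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Content-addressed name: an assumption made for this sketch.
    path = cache_dir / (hashlib.sha256(raw).hexdigest()[:16] + ext)
    path.write_bytes(raw)
    return f"MEDIA:{path}"
```

A handler would append the returned tag to its list of result parts and skip the block when the helper returns `None`.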
Teknium	dd2dc2bddf	fix(mcp): forward OAuth auth and bump sse_read_timeout on SSE transport (#21323 ) * fix(mcp): re-raise CancelledError explicitly in MCPServerTask.run On Python 3.11+, `asyncio.CancelledError` inherits from `BaseException` (not `Exception`), so the broad `except Exception as exc:` in `MCPServerTask.run`'s transport loop did NOT catch it. Task cancellation from gateway restart / explicit `task.cancel()` silently escaped past the reconnect logic — the MCP server task died without going through the shutdown/reconnect code paths that check `_shutdown_event`. Add an explicit `except asyncio.CancelledError: raise` before the broad catch so cancellation propagation is self-documenting rather than an accident of exception hierarchy, and future sibling-site work (e.g. distinguishing shutdown-cancel from transport-cancel) has an obvious hook. Behavior on pre-3.8 Pythons where CancelledError WAS an Exception subclass is also corrected: the old path would have caught it and treated it as a connection failure worth retrying. Closes #9930. * fix(mcp): forward OAuth auth and bump sse_read_timeout on SSE transport Two surgical correctness bugs in the SSE branch of MCPServerTask._run_http, distilled from @amiller's PR #5981 that couldn't be cherry-picked wholesale (branch too stale). 1. sse_read_timeout was set to the tool timeout (default 60s). That's the wrong dimension — it governs how long sse_client will wait between events on the SSE stream, not per-call latency. SSE servers routinely hold the stream idle for minutes between events; a 60s read timeout drops the connection after the first slow stretch (Router Teamwork, Supermemory on Cloudflare Workers idle-disconnect at ~60s). Bump to 300s to match the Streamable HTTP path's httpx read timeout. 2. OAuth auth was built via get_manager().get_or_build_provider() but never forwarded to sse_client. SSE MCP servers behind OAuth 2.1 PKCE would silently fail with 401s on every request. 
Keepalive (the other half of #5981) intentionally left for a follow-up — it's a real improvement but a bigger change, and these two are obvious corrections to ship now. Credits to @amiller. Co-authored-by: Andrew Miller <socrates1024@gmail.com> --------- Co-authored-by: Andrew Miller <socrates1024@gmail.com>	2026-05-07 07:08:04 -07:00
xxxigm	d5fcc83922	fix(tests): avoid asyncio DeprecationWarning in event loop fixture on 3.12+	2026-05-07 07:05:05 -07:00
Teknium	e0a2b08768	fix(mcp): re-raise CancelledError explicitly in MCPServerTask.run (#21318 ) On Python 3.11+, `asyncio.CancelledError` inherits from `BaseException` (not `Exception`), so the broad `except Exception as exc:` in `MCPServerTask.run`'s transport loop did NOT catch it. Task cancellation from gateway restart / explicit `task.cancel()` silently escaped past the reconnect logic — the MCP server task died without going through the shutdown/reconnect code paths that check `_shutdown_event`. Add an explicit `except asyncio.CancelledError: raise` before the broad catch so cancellation propagation is self-documenting rather than an accident of exception hierarchy, and future sibling-site work (e.g. distinguishing shutdown-cancel from transport-cancel) has an obvious hook. Behavior on pre-3.8 Pythons where CancelledError WAS an Exception subclass is also corrected: the old path would have caught it and treated it as a connection failure worth retrying. Closes #9930.	2026-05-07 07:04:38 -07:00
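The pattern from the commit above, an explicit `except asyncio.CancelledError: raise` ahead of the broad handler, can be shown in a self-contained loop. The function `run_with_reconnect` and its retry policy are hypothetical; only the ordering of the two `except` clauses reflects the commit.

```python
import asyncio
from typing import Awaitable, Callable


async def run_with_reconnect(
    connect: Callable[[], Awaitable[None]],
    max_attempts: int = 3,
) -> str:
    """Run `connect` and retry on ordinary failures (illustrative only)."""
    failures = 0
    while failures < max_attempts:
        try:
            await connect()
            return "finished"
        except asyncio.CancelledError:
            # Cancellation is not a connection failure: propagate it
            # explicitly so it is never mistaken for something to retry.
            raise
        except Exception:
            # Ordinary transport errors count toward the retry budget.
            failures += 1
            await asyncio.sleep(0)  # placeholder for a real backoff delay
    return "gave_up"


async def _demo() -> None:
    async def hang() -> None:
        await asyncio.sleep(3600)

    task = asyncio.create_task(run_with_reconnect(hang))
    await asyncio.sleep(0)  # let the task start and block in hang()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print("task cancelled cleanly")


asyncio.run(_demo())
```

On interpreters where `CancelledError` derives from `BaseException`, the explicit clause does not change behaviour; it documents the intent and leaves a hook for handling shutdown-cancel differently later.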
Teknium	5a3e5b23d2	fix(memory): remove dead allOf schema block at the source PR #21238 introduced top-level `allOf: [{if/then/required}]` blocks in the built-in memory tool's parameters schema as conditional-required hints. Two problems: 1. OpenAI's Codex backend (chatgpt.com/backend-api/codex, gpt-5.x) rejects top-level `allOf`/`anyOf`/`oneOf`/`enum`/`not` outright with a non-retryable 400 — affected every user on openai-codex/gpt-5.x. 2. The `if/then` hints were silently ignored by every other provider (Chat Completions doesn't honour them on function schemas), so they never actually enforced anything anywhere. The runtime handler in `memory_tool()` already validates the per-action required fields and returns actionable error messages, so removing the block changes nothing behaviourally. Paired with the defense-in-depth sanitizer in the previous commit, this closes the bug both at the source (schema no longer emits the forbidden form) and at the wire boundary (sanitizer strips it if anything else re-introduces it). - Rewrites `tests/tools/test_memory_tool_schema.py` to guard against regressing the forbidden-combinator shape instead of asserting it. - Adds AUTHOR_MAP entry for @hrkzogw (author of the sanitizer fix).	2026-05-07 07:03:21 -07:00
Hirokazu Ogawa	3924cb408b	fix: strip Codex-hostile top-level schema combinators	2026-05-07 07:03:21 -07:00
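A wire-boundary sanitizer of the kind the two commits above refer to could be as small as the sketch below. The name `strip_top_level_combinators` is hypothetical and the real sanitizer may handle more cases; the list of rejected top-level keys is taken from the preceding commit message.

```python
from typing import Any, Dict

# Top-level keys the commit message says the Codex backend rejects.
FORBIDDEN_TOP_LEVEL = ("allOf", "anyOf", "oneOf", "enum", "not")


def strip_top_level_combinators(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a tool parameters schema without the forbidden
    top-level keys. Nested occurrences are left untouched."""
    return {k: v for k, v in schema.items() if k not in FORBIDDEN_TOP_LEVEL}


schema = {
    "type": "object",
    "properties": {"action": {"type": "string", "enum": ["add", "remove"]}},
    "required": ["action"],
    "allOf": [{"if": {}, "then": {"required": ["content"]}}],
}
clean = strip_top_level_combinators(schema)
print(sorted(clean))  # ['properties', 'required', 'type']
```

Note that the nested `enum` under `properties.action` survives; only the top level is filtered, which matches the restriction as the commit describes it.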
Teknium	69d025e4a7	feat(gateway): add allowed_{chats,channels,rooms} whitelist to Telegram, Mattermost, Matrix, DingTalk Mirrors the Slack `allowed_channels` feature (PR #7401) and Discord's `allowed_channels` (PR #7044) across the remaining group-capable platforms. All five platforms (Slack + Discord + the four added here) now follow the same pattern: primary config via config.yaml, env-var fallback as an escape hatch — matching the project policy that .env is for secrets only and behavioral settings belong in config.yaml. Also fixes a duplicate `slack` key in DEFAULT_CONFIG introduced by PR #7401 (the later entry silently overwrote `allowed_channels`, `require_mention`, and `free_response_channels` at dict-literal evaluation time). Platforms added: - Telegram: `telegram.allowed_chats` (env alias: `TELEGRAM_ALLOWED_CHATS`) - Mattermost: `mattermost.allowed_channels` (env alias: `MATTERMOST_ALLOWED_CHANNELS`) - Matrix: `matrix.allowed_rooms` (env alias: `MATRIX_ALLOWED_ROOMS`) - DingTalk: `dingtalk.allowed_chats` (env alias: `DINGTALK_ALLOWED_CHATS`) Mattermost and Matrix previously had NO config.yaml bridging for any of their gating settings; this PR adds `load_gateway_config` bridges for them (Mattermost gets require_mention + free_response_channels + allowed_channels; Matrix gets allowed_rooms on top of its existing bridges for require_mention and free_response_rooms). Semantics identical everywhere: - Empty = no restriction (fully backward compatible). - Non-empty = hard whitelist: non-listed chats are silently ignored, even when the bot is @mentioned. - DMs bypass the check entirely. DEFAULT_CONFIG merges the duplicate `slack` block and adds new `mattermost` and `matrix` blocks so all gating settings surface in defaults. 
Not included: Feishu (has its own per-chat `chat_rules` system that covers this use case differently), WhatsApp (already has `group_allow_from` via `group_policy: allowlist`), pure-DM platforms (Signal, SMS, BlueBubbles, Yuanbao — no group concept).	2026-05-07 06:54:29 -07:00
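The whitelist semantics listed in the commit above (empty means unrestricted, non-empty is a hard whitelist, DMs bypass the check) reduce to a short predicate. The function name `should_handle_message` and its parameters are assumptions made for illustration.

```python
from typing import Iterable


def should_handle_message(
    chat_id: str,
    allowed_chats: Iterable[str],
    is_dm: bool,
) -> bool:
    """Decide whether a message passes the allowed-chats gate."""
    if is_dm:
        # DMs bypass the whitelist entirely.
        return True
    allowed = {str(c) for c in allowed_chats}
    if not allowed:
        # Empty list: no restriction (backward compatible).
        return True
    # Non-empty list: hard whitelist. Whether the bot was @mentioned is
    # deliberately not an input, since a mention cannot override it.
    return str(chat_id) in allowed


print(should_handle_message("-1001", [], is_dm=False))         # True
print(should_handle_message("-1001", ["-2002"], is_dm=False))  # False
print(should_handle_message("42", ["-2002"], is_dm=True))      # True
```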
Cash Williams	cd3ef685c4	feat(slack): add allowed_channels whitelist config	2026-05-07 06:54:29 -07:00
LeonSGP43	fc88eec926	fix(compressor): soften summary prompt for content filters	2026-05-07 06:42:32 -07:00
luyao618	e795b7e3ab	fix(delegate): expand composite toolsets before intersection in delegate_task When the parent agent uses a composite toolset like hermes-cli, calling delegate_task with individual toolsets (e.g. web, terminal) resulted in zero tools because the name-based intersection failed: 'web' != 'hermes-cli'. Add _expand_parent_toolsets() which collects all tool names from parent toolsets, then recognises any individual toolset whose tools are a subset of the parent's available tools. This allows delegate_task(toolsets=['web']) to work correctly when the parent has hermes-cli enabled. Fixes #19447	2026-05-07 06:41:42 -07:00
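The subset-based expansion described in the commit above can be sketched with a toy registry. The registry contents, tool names, and both function names are hypothetical; the technique (collect the parent's tool names, then accept any toolset whose tools are a subset of them) follows the commit message.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical registry: toolset name -> tool names it provides.
TOOLSETS: Dict[str, Set[str]] = {
    "web": {"web_search", "web_extract"},
    "terminal": {"terminal"},
    "browser": {"browser_navigate"},
    "hermes-cli": {"web_search", "web_extract", "terminal"},
}


def expand_parent_toolsets(
    parent_toolsets: Iterable[str],
    registry: Dict[str, Set[str]] = TOOLSETS,
) -> Set[str]:
    """Return every toolset name the parent effectively has."""
    parent = list(parent_toolsets)
    available: Set[str] = set()
    for name in parent:
        available |= registry.get(name, set())
    expanded = set(parent)
    for name, tools in registry.items():
        # A toolset counts as available when all of its tools are.
        if tools and tools <= available:
            expanded.add(name)
    return expanded


def allowed_child_toolsets(
    requested: Iterable[str],
    parent_toolsets: Iterable[str],
    registry: Dict[str, Set[str]] = TOOLSETS,
) -> List[str]:
    """Intersect the requested toolsets with the expanded parent set."""
    expanded = expand_parent_toolsets(parent_toolsets, registry)
    return [name for name in requested if name in expanded]


print(allowed_child_toolsets(["web", "browser"], ["hermes-cli"]))  # ['web']
```

A plain name intersection would return an empty list here, because `'web' != 'hermes-cli'`; comparing tool sets instead of names is what makes the composite case work.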
LeonSGP43	a78e622dfe	fix(agent): honor configured model max tokens	2026-05-07 06:40:30 -07:00
Gabriel Lesperance	ec9d0e26d4	fix(tui): render structured content on resume	2026-05-07 06:37:23 -07:00
oluwadareab12	edbbc96b55	fix(cli): replace get_event_loop() with get_running_loop() to silence RuntimeWarning in process_loop thread (#19285)	2026-05-07 06:35:54 -07:00
Contentment003111	2c1921241c	feat(models): add paid tencent/hy3-preview route on OpenRouter (#21077) Add tencent/hy3-preview (without :free suffix) as a paid model route alongside the existing free variant. This allows seamless transition when the model moves from free to paid on OpenRouter — both routes coexist so neither side's timing causes breakage. Changes: - models.py: add ("tencent/hy3-preview", "") to OPENROUTER_MODELS - model-catalog.json: add paid variant entry - tests: add assertions for paid route presence The :free entry can be removed in a follow-up PR once OpenRouter confirms the free route is deprecated. Co-authored-by: simonweng <simonweng@tencent.com>	2026-05-07 06:34:48 -07:00
liuhao1024	f9b4b8af34	fix(mcp): include exception type in error messages when str(exc) is empty Some exception classes (e.g. anyio.ClosedResourceError) are raised without a message argument, so str(exc) returns an empty string. The existing error format f'{type(exc).__name__}: {exc}' would produce messages like 'MCP call failed: ClosedResourceError: ' with nothing after the colon. Add _exc_str() helper that falls back to repr(exc) when str(exc) is empty, and apply it to all 6 MCP error formatting sites (5 tool/prompt/resource handlers + 1 sampling handler). Fixes #19417	2026-05-07 06:33:57 -07:00
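The fallback described in the commit above fits in a few lines. The helper below mirrors the described behaviour of `_exc_str()` (use `repr(exc)` when `str(exc)` is empty); the name `exc_str`, the stand-in exception class, and the message format in the usage lines are assumptions.

```python
def exc_str(exc: BaseException) -> str:
    """Return str(exc), or repr(exc) when the message is empty."""
    text = str(exc)
    return text if text else repr(exc)


class ClosedResourceError(Exception):
    """Local stand-in for an exception raised without a message."""


print(f"MCP call failed: {exc_str(ValueError('bad input'))}")
print(f"MCP call failed: {exc_str(ClosedResourceError())}")
```

With the fallback, the second line carries the exception's class name instead of ending in a bare colon.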
Alexander Monas	a1f85ef2b9	fix(mcp): retry stale pipe transport failures Treat closed-resource, closed-transport, broken-pipe, and EOF MCP failures as stale session equivalents so the existing reconnect/retry-once path can recover. Add regression coverage for the stale-pipe marker variants. Checks: - python -m py_compile tools/mcp_tool.py tests/tools/test_mcp_tool_session_expired.py - python -m pytest tests/tools/test_mcp_tool_session_expired.py -q -o addopts= - selected secret scan over touched files	2026-05-07 06:32:45 -07:00
paul-tian	4d4807585a	fix(gateway): honor configured goal turn budget	2026-05-07 06:31:08 -07:00
Luciano Pacheco	f7b71aa0da	fix: use configured model for gateway auth fallback	2026-05-07 06:29:27 -07:00
Mason James	80548f9a4f	fix(mcp): report configured timeout in MCP call errors Track elapsed wall time in _run_on_mcp_loop, cancel the in-flight future when a timeout expires, and raise a descriptive TimeoutError that includes the elapsed and configured timeout. Add regression coverage for the new timeout diagnostics.	2026-05-07 06:28:11 -07:00
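The timeout diagnostics described in the commit above can be approximated as follows. This is a sketch, assuming the MCP loop runs in a background thread and calls are submitted with `asyncio.run_coroutine_threadsafe`; the function name `run_on_loop` and the exact message wording are hypothetical.

```python
import asyncio
import concurrent.futures
import time
from typing import Any, Coroutine


def run_on_loop(
    loop: asyncio.AbstractEventLoop,
    coro: Coroutine[Any, Any, Any],
    timeout: float,
) -> Any:
    """Run `coro` on a loop owned by another thread, with a timeout."""
    start = time.monotonic()
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Cancel the in-flight work so it does not keep running unseen.
        future.cancel()
        elapsed = time.monotonic() - start
        raise TimeoutError(
            f"MCP call timed out after {elapsed:.1f}s "
            f"(configured timeout: {timeout}s)"
        ) from None
```

Measuring with `time.monotonic()` keeps the elapsed figure immune to wall-clock adjustments, and including the configured value in the message lets a reader tell a too-small setting apart from a genuinely hung server.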


3374 commits