hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-13 09:01:54 +00:00

Author	SHA1	Message	Date
lsaether	9b631e4ae1	fix(acp): suppress cancel interrupt sentinel	2026-06-07 22:20:43 -07:00
Teknium	2789bf4e25	fix(auxiliary): route Codex Responses path through shared converter (#5709 ) The auxiliary Codex adapter maintained its own chat->Responses conversion loop that forwarded every non-system message's role verbatim into Responses input[]. When flush_memories()/compression replayed session history containing assistant tool_calls + role=tool results, those tool messages leaked into the request and the Responses API rejected them with HTTP 400: Invalid value: 'tool'. Route _CodexCompletionsAdapter.create() through the same shared converter the main agent transport uses (_chat_messages_to_responses_input), so tool calls become function_call items and tool results become function_call_output items with a valid call_id. Single conversion path means no future drift. Also remove the now-dead _convert_content_for_responses() helper — its only caller was the private conversion loop this change deletes. Co-authored-by: ProgramCaiCai <techxacm@gmail.com>	2026-06-07 22:18:31 -07:00
teknium1	568e127612	refactor(cli): extract 25 more subcommand parsers into hermes_cli/subcommands/ Batch extraction of every remaining subcommand whose handler is top-level and whose parser block is pure argparse: model, setup, postinstall, whatsapp, slack, login, logout, auth, status, webhook, hooks, doctor, security, dump, debug, backup, import, config, version, update, uninstall, dashboard, gui, logs, prompt-size. Each becomes hermes_cli/subcommands/<name>.py with build_<name>_parser() and an injected handler (no main import). dashboard also injects cmd_dashboard_register for its nested 'register' action. Behavior-neutral: all 25 subcommands' --help output (and nested subaction help) diff-verified byte-identical to pre-extraction. Two RawDescriptionHelpFormatter epilogs (debug, logs) needed their multi-line string interiors preserved at column 0 — caught by the --help diff, not compile. main() 3297 -> 1798 LOC across this PR; add_parser calls in main.py 179 -> 89. Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process isolation; new test_subcommands_batch.py smoke-tests all 25 builders + the dashboard two-handler case.	2026-06-07 22:18:14 -07:00
teknium1	4da45e8727	refactor(cli): extract profile + gateway/proxy parsers into hermes_cli/subcommands/ Follow-on to the cron extraction in the same Phase 2 PR. Same pattern: per-group build_<name>_parser() functions with injected handlers, no main import. - subcommands/profile.py: build_profile_parser (190-line block out of main()). - subcommands/gateway.py: build_gateway_parser (gateway + proxy, 238-line block; they shared one inline section). Imports argparse for SUPPRESS defaults. - main(): two more inline blocks become single builder calls. Behavior-neutral: 'profile [sub] --help' and 'gateway/proxy [sub] --help' byte-identical to pre-extraction (diff-verified). main() now 2723 LOC (was 3297 at Phase 2 start); add_parser calls in main.py 179 -> 141. Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process isolation; new builder unit tests cover subactions, aliases, dispatch, flags.	2026-06-07 22:18:14 -07:00
teknium1	b2e6053243	refactor(cli): extract hermes cron parser into hermes_cli/subcommands/ (god-file Phase 2) Phase 2 of the god-file decomposition plan. main()'s argparse tree is 179 inline add_parser calls in one 3,297-line function. This establishes the hermes_cli/subcommands/ package and extracts the first group (cron) as the proof-of-pattern: - hermes_cli/subcommands/_shared.py: shared parser helpers (add_accept_hooks_flag), re-exported from main.py for backwards compat. - hermes_cli/subcommands/cron.py: build_cron_parser(subparsers, cmd_cron=...). Handler injected so the module never imports main (cycle avoidance). - main()'s ~155-line inline cron block becomes one build_cron_parser() call. Behavior-neutral: 'hermes cron create --help' output is byte-identical to origin/main. main() 3297 -> 3143 LOC. Validation: tests/hermes_cli/ 6466 passed / 0 failed under per-file process isolation; new test_subcommands_cron.py covers subactions, aliases, options, no-agent tristate, injected dispatch, and --accept-hooks.	2026-06-07 22:18:14 -07:00
teknium1	54870847cb	refactor(agent): extract run_conversation prologue into agent/turn_context.py Phase 1 of the god-file decomposition plan. run_conversation's ~470-line once-per-turn setup block (stdio guarding, retry-counter resets, user-message sanitization, todo/nudge hydration, system-prompt restore-or-build, crash-resilience persistence, preflight compression, the pre_llm_call hook, and external-memory prefetch) is moved verbatim into build_turn_context(), which returns a TurnContext dataclass the loop unpacks. Behavior-neutral move-and-name refactor: the builder mutates `agent` exactly as the inline code did; only the locals the loop reads back are returned. - run_conversation: 4602 -> 4217 LOC (-385) - agent/conversation_loop.py: 4965 -> ~4580 LOC - new agent/turn_context.py: focused, dependency-injected, unit-tested in isolation Tests: tests/run_agent/ 1570 passed / 0 failed under per-file process isolation. Relocation follow-ups: 413_compression mocks now patch both module references; nudge/on_turn_start source-inspection guards point at the extracted module.	2026-06-07 22:17:35 -07:00
Teknium	86c537d209	fix(memory): instruct in-turn consolidation + retry on overflow (#41755 ) * fix(memory): make overflow errors instruct in-turn consolidation + retry When bounded memory is full, the add/replace overflow errors now explicitly tell the model to consolidate (merge/remove/shorten) and retry the write in the same turn, matching the documented behavior. The replace-overflow path now also echoes current_entries + usage for parity with add-overflow, so the model has the same context to act on. Closes #23378 (working-as-documented; this sharpens runtime to match docs). * fix(memory): broaden overflow remediation hint beyond 'stale' Say 'stale or less important' — entries don't have to be stale to be the right ones to drop when making room.	2026-06-07 22:16:28 -07:00
teknium1	2a10da3a16	fix(gateway): keep /model + /reasoning overrides on topic recovery & compression splits Session-scoped /model and /reasoning overrides were silently lost on Telegram DM/forum topics and after compression session splits (#30479). Root cause: _handle_message_with_agent rewrites source.thread_id via _recover_telegram_topic_thread_id (lobby/stripped reply -> the user's bound topic) before deriving the session key. The /model and /reasoning handlers derived their override key from the raw inbound event.source, skipping that recovery, so the override was stored under one key and the next message turn read a different key. Fix: add _normalize_source_for_session_key (applies the same recovery a message turn does) and use it in both handlers before deriving the key. session_id rotation on compression was never the cause — overrides are keyed by the durable session_key; the split path preserves it. Author: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-06-07 22:10:32 -07:00
Hariharan Ayappane	b8469a81e3	fix(weixin): add rate-limit circuit breaker	2026-06-07 22:10:17 -07:00
Teknium	2e62862784	fix(telegram): use get_running_loop in polling-conflict retry reschedule (#41716 ) The conflict-retry path called asyncio.get_event_loop() to reschedule itself when a retry's start_polling raised. On Python 3.11+ (our floor) that raises 'RuntimeError: There is no current event loop in thread MainThread' when no loop is attached to the thread, which is what happens when PTB dispatches this error callback. The retry never gets scheduled, the adapter goes silent-but-alive, and gateway --replace keeps spawning fresh instances that hit the same wall — the crash loop reported in #19471 (worse under multi-profile, where two bots hold the same conflict open). We are inside a coroutine here, so asyncio.get_running_loop() is the correct, guaranteed-valid replacement. Only get_event_loop() call in any platform adapter, so no sibling sites. Fixes #19471	2026-06-07 22:10:03 -07:00
Basil Al Shukaili	8513a6aec7	fix(compression): guard against cross-session stale _previous_summary contamination When a cron or background session compacts, it sets _previous_summary for iterative updates. If that session ends without /new or /reset (which calls on_session_reset()), the stale summary survives on the ContextCompressor instance. A subsequent live messaging session's compaction then injects it as 'PREVIOUS SUMMARY:' into the summarizer prompt — contaminating the live session with unrelated content from the prior session. Add an else guard in compress(): when no handoff summary is found in the current messages but _previous_summary is non-empty, discard it so _generate_summary() starts fresh instead of iteratively updating a stale cross-session summary. Fixes #38788	2026-06-07 22:09:45 -07:00
Teknium	5408013369	fix(gateway): isolate DM sessions on user_id when chat_id is absent (#41764 ) build_session_key collapsed every DM that arrived without a chat_id into one shared 'agent:main:<platform>:dm' key. A single cached AIAgent then served multiple users' conversations, bleeding history across senders. DMs now fall back to the sender's user_id_alt/user_id (mirroring the group-path participant precedence and the telegram auth-path fallback) before the bare per-platform sink. Telegram's normal event path always sets chat_id, so this hardens the synthetic-source / non-standard-adapter paths that don't.	2026-06-07 22:07:07 -07:00
Teknium	a77bc2c08d	fix(compression): disable compression on background-review fork to prevent cross-turn stale-parent fork (#41708 ) The per-session compression lock prevents same-window concurrent forks but not cross-turn ones: the background-review fork shares the parent's session_id, so if it won a compression race its new child session was never adopted by the gateway (the fork is single-lifecycle). The next foreground turn then started from the stale parent and compressed it again, leaving the same parent with two sibling children. Set review_agent.compression_enabled = False so the fork never triggers compression. Both trigger sites in conversation_loop.py gate on compression_enabled before calling _compress_context, so the fork can never rotate the shared parent. Review needs full context anyway — compressing would degrade the memory/skill summary. The per-session lock is kept as defense-in-depth for any future shared-session path. Adds a regression test that fails without the flag and passes with it. Closes #38727	2026-06-07 22:06:48 -07:00
Teknium	48ae8029aa	fix(delegate): resolve custom-endpoint subagent pools by endpoint identity (#41730 ) Subagents delegated to a custom endpoint were misrouted when the parent ran on a different custom endpoint. Both runtimes collapse to provider="custom", so _resolve_child_credential_pool() treated them as interchangeable and handed the child the parent's pool. Leasing from it then overwrote the child's delegated base_url with the parent's endpoint via _swap_credential() — the child sent the delegated model name to the wrong endpoint. Custom runtimes now resolve by endpoint identity (the custom:<name> pool key derived from base_url). The parent pool is reused only when both parent and child resolve to the same custom endpoint; unregistered raw endpoints return None so the child keeps its fixed delegated credential. Non-custom provider paths are unchanged. Fixes #7833.	2026-06-07 22:05:14 -07:00
islam666	78e2101cd2	fix: reap zombie subprocesses in web_server action status and meet_bot cleanup - web_server.py: after proc.poll() returns a non-None exit code, call proc.wait() to reap the child and move the entry from _ACTION_PROCS to _ACTION_RESULTS. Previously .poll() alone left <defunct> zombies. - meet_bot.py: terminate and wait on the pcm_pump subprocess (paplay/ ffmpeg) during the finally-block teardown. Previously leaked on every normal bot exit. - tests: add test_action_status_reaps_completed_process and test_action_status_ignores_wait_failure covering both the happy path and the wait()-raises-OSError edge case. Closes #38032	2026-06-07 21:50:57 -07:00
islam666	e53b74c394	fix(dist): stop USER_OWNED_EXCLUDE from filtering nested directories The copytree ignore lambda in _copy_dist_payload applied USER_OWNED_EXCLUDE recursively at every directory depth. This caused nested directories whose names matched exclude entries (bin, logs, cache, etc.) to be silently dropped during distribution install/update. Fix: only apply USER_OWNED_EXCLUDE filtering at the root of the staged tree, matching the two-tier pattern used by _clone_all_copytree_ignore and _default_export_ignore in profiles.py. Add 5 tests covering nested bin/logs/cache preservation and top-level filtering still working. Fixes #37954	2026-06-07 21:50:57 -07:00
islam666	09a5548628	fix(weixin): refresh typing ticket on expiry to prevent stuck indicator (#38085 ) The WeChat iLink typing ticket has a 600-second TTL. When a long-running session exceeds that window, the cached ticket evicts from TypingTicketCache. Both send_typing and stop_typing silently returned early when the ticket was None, meaning the TYPING_STOP=2 signal was never sent to iLink. The WeChat client then showed the typing indicator indefinitely. Fix: add _ensure_typing_ticket() that transparently refreshes the ticket via getConfig when the cached one has expired or is missing. Both send_typing and stop_typing now call this method instead of silently no-oping. Fixes #38085	2026-06-07 21:50:57 -07:00
islam666	2e61de0638	fix(model_metadata): consult DEFAULT_CONTEXT_LENGTHS before 256K fallback on custom endpoints Problem: get_model_context_length() had an early return at the end of the custom-endpoint probe branch (step 3) that returned DEFAULT_FALLBACK_CONTEXT (256K) without ever consulting the hardcoded DEFAULT_CONTEXT_LENGTHS catalog (step 8). Models served through a custom/proxied gateway (e.g. corporate Anthropic proxy) that didn't expose Ollama or local-server endpoints would hit this path and get capped at 256K, even when the model name clearly matched a known entry in the catalog (e.g. claude-opus-4-8 → 1M). Changes: - agent/model_metadata.py: Before returning DEFAULT_FALLBACK_CONTEXT at the end of the custom-endpoint branch, consult DEFAULT_CONTEXT_LENGTHS using the same longest-key-first fuzzy matching as step 8. Only fall through to 256K if no catalog entry matches. - tests/agent/test_model_metadata.py: Updated existing test and added new test covering the custom-endpoint → catalog fallback behavior. Fixes #38865	2026-06-07 21:50:57 -07:00
islam666	9513793ad7	fix(vision): proactive downgrade for providers rejecting list-type tool content (#41072 ) Xiaomi MiMo (and potentially other providers) support multimodal user messages but reject list-type tool message content with 400 'text is not set'. Previously this was handled reactively — the API call would fail, images would be stripped, and the request retried, losing visual info. Fix: add supports_vision_tool_messages field to ProviderProfile (default True). Xiaomi sets it to False. _tool_result_content_for_active_model now checks this field proactively and returns a text summary instead of list content, avoiding the round-trip failure entirely.	2026-06-07 21:50:57 -07:00
islam666	41f0714287	fix(vision): honor custom_providers per-model supports_vision (#41036 ) _supports_vision_override() in image_routing.py checked model.supports_vision and providers.<name>.models, but not the legacy list-style custom_providers config. A custom provider entry like: custom_providers: - name: my-provider models: my-model: supports_vision: true was ignored, causing image_input_mode=auto to route through the auxiliary vision_analyze path instead of natively attaching images. Fix: added a lookup step for custom_providers list entries, matching by provider name (including 'custom:<name>' variants at runtime). providers.<name>.models still takes precedence over custom_providers. 13 new tests covering: true/false override, custom: prefix matching, no-match fallback, non-dict entries, empty lists, models key missing.	2026-06-07 21:50:57 -07:00
islam666	18c085b1a4	fix(gateway): normalize optional systemd directives in stale-check (#41119 ) On older systemd versions that don't support RestartMaxDelaySec / RestartSteps, the installed unit file has those directives silently dropped. systemd_unit_is_current() did a strict text comparison, so the unit was perpetually flagged as outdated. Fix: _strip_optional_systemd_directives() removes RestartMaxDelaySec and RestartSteps from both the installed and expected text before comparison. Units that differ only by these optional directives are now correctly considered current.	2026-06-07 21:50:57 -07:00
islam666	b18490b890	fix(compaction): prevent infinite loop when transcript fits in tail budget When summary_target_ratio is large (e.g. 0.45) and the context_length is moderate (e.g. 96000), the soft_ceiling (token_budget * 1.5) can exceed the total transcript size. _find_tail_cut_by_tokens walks the entire transcript without breaking early, and the resulting compress window is either empty (compress_start >= compress_end) or a single message whose summary-of-one overhead saves ~0 tokens. Both outcomes cause a no-op compression that does not increment _ineffective_compression_count, so should_compress() returns True on every subsequent turn and the loop repeats endlessly. Fix (two layers): 1. _find_tail_cut_by_tokens: when the backward walk consumed the entire transcript without breaking (cut_idx <= head_end and accumulated <= soft_ceiling), re-walk with the raw (non-inflated) token budget to find a meaningful cut that gives the summarizer a useful middle window. 2. compress(): when compress_start >= compress_end, increment _ineffective_compression_count and log a warning so the existing anti-thrashing guard in should_compress() can break the loop. Fixes #40803	2026-06-07 21:50:57 -07:00
Brian D. Evans	ab0a6270c3	fix(slack): align thread_ts check with is_thread_reply invariant (Copilot #15464 ) Two findings from Copilot's review on #15464, both addressed: 1. ``event.get("thread_ts")`` truthy vs ``event_thread_ts != ts``: the new channel branch treated ANY truthy ``thread_ts`` as a real thread reply, but three lines below ``is_thread_reply`` is defined with the stricter ``event_thread_ts and event_thread_ts != ts`` invariant. If Slack ever ships a payload where ``thread_ts == ts`` on a thread root, the stricter check would treat it as a top-level message for the ``is_thread_reply`` path but as a thread reply for session keying — divergent behaviour. Aligned this branch to the same ``and event_thread_ts_raw != ts`` invariant. 2. ``test_top_level_reply_to_id_stays_none_when_shared`` docstring had the ternary logic backwards ("None != ts → reply_to_message_id IS set"). The code reads ``reply_to_message_id = thread_ts if thread_ts != ts else None`` — with ``thread_ts = None``, the condition is True so the expression evaluates to ``thread_ts`` itself (None), meaning the reply stays un-threaded. The test asserted the correct end-state; only the explanatory docstring was wrong. Rewrote the docstring to match the actual code flow, with the note that Copilot caught the reversal. 7/7 tests still pass. No behaviour change for the existing test_thread_reply_scopes_by_thread_even_when_shared case because ``event_thread_ts_raw = "1700000000.000000"`` and ``ts = "1700000000.000005"`` are distinct — the new ``!= ts`` guard is a no-op there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-07 21:19:59 -07:00
Brian D. Evans	133e0271e2	fix(slack): scope top-level channel messages by channel-only when reply_in_thread=false (#15421 ) Top-level Slack channel messages previously fell back to the message's own ``ts`` as a synthetic ``thread_ts``: thread_ts = event.get("thread_ts") or ts # ts fallback for channels That value flows into ``build_source(thread_id=thread_ts)`` at line 1247. The gateway session store keys sessions by ``(platform, channel_id, thread_id)``, so every top-level channel message ended up on a unique session. Operators who set ``reply_in_thread: false`` in ``config.yaml`` expected all top-level channel messages to share one session (the whole point of that flag) — instead each one spawned a fresh conversation with no context carry-over. ### Fix Three explicit cases in the channel branch: \| event.thread_ts \| reply_in_thread \| thread_ts for session keying \| \|---\|---\|---\| \| non-null (real thread reply) \| either \| event.thread_ts \| \| null (top-level) \| true (default) \| ts (legacy: own-thread sessions) \| \| null (top-level) \| false \| None (shared channel session) \| The outbound-reply gate at line 1264 (``reply_to_message_id = thread_ts if thread_ts != ts else None``) still works correctly in all three cases without further changes: ``None != ts`` is True, so shared-channel top-level messages don't get their reply threaded either — matching the operator's ``reply_in_thread=false`` intent end-to-end. Genuine thread replies still scope per-thread under both modes so multi-person threaded conversations can't collide with unrelated channel chatter. ### Tests (7 new in ``tests/gateway/test_slack_channel_session_scope.py``) All drive the real ``SlackAdapter._handle_slack_message`` code path (not a re-implementation) via the standard pytest fixture pattern used by ``tests/gateway/test_slack.py``. Messages @mention the bot so the mention gate doesn't drop them — the tests are specifically about what happens once the handler decides to emit a ``MessageEvent``. * ``TestChannelSessionScopeDefault`` (2 cases): - Explicit ``reply_in_thread: true`` keeps ``thread_id = ts`` (legacy behaviour — regression guard) - Unset config behaves like ``reply_in_thread: true`` (pins the default) * ``TestChannelSessionScopeShared`` (3 cases): - ``reply_in_thread: false`` + top-level → ``thread_id is None`` (the #15421 bug 1 fix) - ``reply_to_message_id is None`` in the same case (no threaded outbound reply) - Genuine thread reply still scopes per-thread when shared mode is on — only TOP-LEVEL messages collapse to the channel session * ``TestThreadReplyAlwaysScopesByThread`` (2 parametrised cases): - Thread replies get ``thread_id = event.thread_ts`` regardless of ``reply_in_thread`` — critical invariant for multi-thread channels; a regression here would leak per-thread context across threads Regression guard verified: reverted the else-branch to the legacy ``thread_ts = event.get("thread_ts") or ts`` one-liner; ``test_top_level_maps_to_none_when_reply_in_thread_false`` correctly failed (asserts ``thread_id is None`` but got ``"1700000000.000003"``). Restored → 182 slack tests pass (175 existing + 7 new). Scope: this fixes #15421 bug 1 only. Bug 2 (sessions.json not persisting across compression) lives elsewhere in the session manager and is left for a separate diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-07 21:19:59 -07:00
Shannon Sands	86e5efb0ae	Preserve Telegram onboarding fallback errors	2026-06-07 19:48:09 -07:00
Shannon Sands	ba29010902	Use httpx for Telegram onboarding worker calls	2026-06-07 19:48:09 -07:00
brooklyn!	fa42ac094d	feat(desktop): Shift+click the status-bar zap to toggle YOLO globally (#41666 ) The status-bar zap currently toggles per-session approval bypass (the same scope as the TUI's Shift+Tab). This adds a global escape hatch: Shift+clicking the zap flips the persistent approvals.mode in config.yaml between "off" (bypass on) and "manual" (bypass off), affecting every session, the CLI, the TUI, and cron — and it survives restarts. - statusbar-controls: thread the click's shiftKey through onSelect via a new StatusbarSelectModifiers arg. - yolo-session: add setGlobalYolo() that calls config.set with scope="global". - use-statusbar-items: branch toggleYolo on modifiers.shiftKey; plain click stays per-session, Shift+click goes global. - tui_gateway config.set "yolo" key: add scope="global" that reads/writes approvals.mode through the gateway's own (mtime-cached) config view, honors an explicit value, and re-emits session.info to every live session so each window's zap reflects the flip immediately. - i18n: tooltip copy in en/ja/zh/zh-hant notes Shift+click toggles globally. Tests: two new tui_gateway tests cover the global toggle and explicit-value paths; existing session/process-scope yolo tests still pass.	2026-06-07 20:57:08 -05:00
Teknium	30c7913617	fix(api_server): report hermes version on /health and /health/detailed (#40620 ) Salvaged from #40479; re-verified on main, tightened, tested. Co-authored-by: tfournet <tfournet@users.noreply.github.com>	2026-06-07 18:38:54 -07:00
Teknium	b97cd81c78	refactor(insights): drop dead pricing/duration wrappers, call usage_pricing directly (#40618 ) Salvaged from #40527; re-verified on main, tightened, tested. Co-authored-by: HeLLGURD <HeLLGURD@users.noreply.github.com>	2026-06-07 18:33:20 -07:00
Teknium	2aa316ec9c	docs(windows): fix Get-Command PATH guidance to venv\Scripts\hermes.exe (#40613 ) Closes #40464. Salvaged from #40488; re-verified on main, tightened, tested. Co-authored-by: gauravsaxena1997 <gauravsaxena1997@users.noreply.github.com>	2026-06-07 18:28:23 -07:00
Teknium	6bdc4c0231	test: skip curses tests on Windows where _curses is unavailable (#40611 ) Salvaged from #40447; re-verified on main, tightened, tested. Co-authored-by: Ganesh0690 <Ganesh0690@users.noreply.github.com>	2026-06-07 18:21:03 -07:00
Teknium	69a293b419	hardening(todo): bound TodoStore item content length and count The todo list is re-injected into the model's context after every context-compression event (TodoStore.format_for_injection), so an oversized todo item or an unbounded number of items defeats the compression it is meant to ride through. TodoStore.write/_validate previously enforced no size or count bounds, so a single 50KB item produced a ~50KB re-injection block on every subsequent turn. Add two caps: - MAX_TODO_CONTENT_CHARS (4000): per-item content is truncated with a marker. Routed through a shared _cap_content() so the merge-update path (which writes content directly, bypassing _validate) is capped too. - MAX_TODO_ITEMS (256): total list length is bounded, keeping the highest-priority head (list order is priority). Both caps are generous relative to real plans — a todo item is a short task description and active lists are a handful of items. NOT a security fix. Raised externally via GHSA-5g4g-6jrg-mw3g, which framed a caller-supplied conversation_history on the authenticated API server replaying into _hydrate_todo_store as a DoS. That path is authenticated (the API server refuses to start without API_SERVER_KEY) and self-scoped (the caller supplies their own entire history and can only inflate their own response chain — forged role=tool entries are never persisted to the session DB), so it is out of scope as a vulnerability under SECURITY.md 3.2. These bounds are footgun containment that also applies to the trusted agent path, where the model itself authors the todos. Credit to the reporter for the observation. Co-authored-by: YLChen-007 <30854794+YLChen-007@users.noreply.github.com>	2026-06-07 18:06:27 -07:00
Gilad Bauman	ae82eed2b1	fix(gateway): use OGG for Telegram auto TTS	2026-06-07 18:05:58 -07:00
Teknium	cb83149dc6	fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (#40607 ) Salvaged from #40421; re-verified on main, tightened, tested. Co-authored-by: maxmilian <maxmilian@users.noreply.github.com>	2026-06-07 17:49:38 -07:00
Teknium	09d66037f8	fix(hindsight): send only new-turn delta on append retains instead of whole session (#40605 ) Closes #40503. Salvaged from #40519; re-verified on main, tightened, tested. Co-authored-by: skylarbpayne <skylarbpayne@users.noreply.github.com>	2026-06-07 17:41:10 -07:00
kshitijk4poor	7df81d0557	fix(web): make _has_env config-aware so SEARXNG_URL auto-detect honors Hermes config Follow-up to #34306. The provider fix made SearXNG usable with a config-only SEARXNG_URL, but tools/web_tools._has_env still read raw os.getenv, so the backend auto-detect cascade and check_web_api_key remained blind to it — SearXNG worked when explicitly selected but was never auto-selected. Route _has_env (and the SearXNG diagnostic print) through a config-aware _env_value helper mirroring the provider's _searxng_url(). Fixing the shared helper covers every provider key in one place. Adds regression tests for config-only auto-detect and check_web_api_key. See #34290.	2026-06-08 01:12:32 +05:30
kshitij	0c0fbf763b	Merge pull request #41430 from helix4u/fix-url-tools-unicode-normalization fix(tools): percent-encode non-ascii URL components	2026-06-07 12:39:30 -07:00
helix4u	333f01bc7f	fix(tools): percent-encode non-ascii URL components	2026-06-07 11:42:26 -06:00
teknium1	16786f3bb3	feat(desktop+gateway): remote media relay — attach images/PDFs and display gateway images over the network Desktop connected to a remote gateway can now attach images and PDFs and display agent-written images. Previously the desktop passed a LOCAL file path to image.attach; on a remote gateway that path doesn't exist, so the image was silently dropped ("skipped unreadable path") and the vision model never saw it. The reverse direction was also broken — images the agent wrote on the gateway rendered as dead links in the remote client. Gateway (tui_gateway/server.py): - image.attach_bytes: base64 byte upload written into the gateway's own images dir and queued via the existing native-image-attach pipeline. Magic-byte extension sniffing, data-URL prefix + whitespace tolerance, 25 MB cap, structured error codes. Accepts content_base64/filename (canonical) and data/ext (older-desktop aliases). - pdf.attach: renders each page to PNG via pdftoppm (poppler-utils) at 150 DPI and queues the pages as images; 50 MB / 25-page caps. Accepts host path or base64 upload. - Shared helpers (_decode_attach_base64, _sniff_image_ext, _queue_attached_image) so the two methods and the existing image.attach don't duplicate logic. Gateway (hermes_cli/web_server.py): - GET /api/media: returns a gateway-local image as a base64 data URL so remote clients can display it. Auth-gated like every /api route, extension allowlist + size cap, AND confined to the gateway's own media roots (images/screenshots/cache, resolved symlink-safe) so an authed caller can't read image-extension files anywhere on disk. Desktop (apps/desktop): - syncImageAttachmentsForSubmit uploads bytes via image.attach_bytes when the connection mode is 'remote'; the local fast path is unchanged. - media.ts gains isRemoteGateway() + gatewayMediaDataUrl(); directive-text and markdown-text fetch images over /api/media in remote mode. Consolidates the competing remote-media PRs (#38876, #40317, #21908, #39437) into one coherent implementation, taking the strongest parts of each and adding shared-helper cleanup plus the /api/media root-confinement hardening on top. The per-profile gateway switching from #38876 is intentionally left out as a separable feature. TUI file uploads (#40492) remain a separate surface. Tested: 11 new tui_gateway tests + 5 /api/media endpoint tests + desktop media.remote unit tests; full tui_gateway + web_server suites green (472 passed); tsc -b clean; E2E verified the full attach→disk→queue and gateway-path→data-URL display round-trip plus the out-of-root security block. Co-authored-by: Max Mitcham <maxmitcham@mac.home> Co-authored-by: Justlrnal4 <Justlrnal4@users.noreply.github.com> Co-authored-by: Chris Cook <ccook@nvms.com> Co-authored-by: Thomas Paquette <thomas.paquette@gmail.com>	2026-06-07 10:05:53 -07:00
Teknium	0c48b7165d	hardening(api-server): scan cron prompts on REST create/update for parity with the agent tool The agent-facing cronjob tool scans the user prompt with _scan_cron_prompt() before creating/updating a job (tools/cronjob_tools.py); the REST cron endpoints (POST /api/jobs, PATCH /api/jobs/{id}) validated length but not content. This adds the same scan to both handlers so an exfiltration/injection prompt is rejected the same way regardless of which surface created the job. NOT a security boundary, defense-in-depth / parity only: the REST cron endpoints are authenticated (every handler runs _check_auth, and connect() refuses to start without API_SERVER_KEY), and _scan_cron_prompt is a documented in-process heuristic, not a containment boundary (SECURITY.md 3.2). Raised externally via GHSA-fr3q-rjg3-x6mf (DNS-rebinding pre-auth RCE). The report's load-bearing 'no auth by default' premise was already closed three weeks after it was filed by the API_SERVER_KEY-required guard (commit `1a9ef8314`); this lands the create/update prompt-validation parity the report also pointed at. Scanner imported defensively so a missing scanner cannot disable the cron REST API.	2026-06-07 10:04:57 -07:00
Teknium	af08c43f3e	fix: skip MCP preflight content-type probe on reconnect when already ready (#40604 ) Closes #40366. Salvaged from #40548; re-verified on main, tightened, tested. Co-authored-by: mohamedorigami-jpg <mohamedorigami-jpg@users.noreply.github.com>	2026-06-07 09:51:11 -07:00
teknium1	76f01780f0	fix(kanban): sweep deferred scratch parent on non-scratch child completion + tests Follow-up on the deferred-cleanup salvage (#33774): _cleanup_workspace returned early for a non-scratch ('dir'/'worktree') task and never ran the parent sweep, so a scratch parent waiting on a 'dir' child would leak its deferred workspace forever. Run the parent sweep before the early return. Adds regression tests: deferred-while-child-active, swept-after-last-child, and dir-child-unblocks-scratch-parent.	2026-06-07 09:50:44 -07:00
Teknium	cb3e41e2fd	feat(onboarding): opt-in structured profile-build path on first contact (#41114 ) * feat(onboarding): opt-in structured profile-build path on first contact On a user's very first gateway message, Hermes now optionally offers to build a short profile of them — then, only with consent, gathers durable facts and persists them to the user-profile memory store (memory tool, target="user") so future sessions start already knowing who they are. Inspired by Poke's zero-input onboarding, but consent-first by design: - The agent OFFERS, never assumes. Declining stops it immediately. - Before ANY external lookup it states what it will look up and asks. - It never reads connected accounts (email/calendar) silently — the exact privacy concern that made naive implementations feel invasive. Wiring reuses existing infrastructure end-to-end: - gateway/run.py first-message hook (was a plain self-intro) now swaps in the profile-build directive when enabled and not yet offered. - agent/onboarding.py gains profile_build_mode()/profile_build_directive() + PROFILE_BUILD_FLAG, latched once via the existing onboarding.seen mechanism so the offer fires at most once per install. - config default onboarding.profile_build: "ask" (set "off" to disable). Added to an existing section, so no _config_version bump needed. No new storage layer, no new injection path, no prompt-cache impact. * fix(dashboard): fold onboarding into agent tab to avoid 1-field category onboarding.profile_build is the only schema-surfaced onboarding field (onboarding.seen is an internal latch dict), so the dashboard CONFIG_SCHEMA single-field-category invariant rejected it. Merge onboarding -> agent like the other small categories.	2026-06-07 08:36:48 -07:00
Teknium	d87f293972	feat(compression): temporal anchoring in compaction summaries (#41102 ) Compaction summaries now receive the current date and instruct the summarizer to rewrite completed actions as absolute, dated, past-tense facts (e.g. "email John about the proposal" -> "Sent the proposal email to John on 2026-06-07"). A resumed conversation no longer re-issues work that already happened or treats a finished action as still pending. The date is resolved via hermes_time.now() (date-only, user-configured timezone) inside _generate_summary. The compaction summary is a mid-conversation message that is never part of the cached prefix, so the date does not affect prompt-cache stability. Date resolution is best-effort: a clock failure omits the rule rather than blocking compaction. The rule rides the shared template, so both first-compaction and iterative-update prompts carry it. Inspired by Poke's summarization (temporal anchoring + semantic preservation).	2026-06-07 08:36:45 -07:00
Teknium	9dbad1990b	test(discord): align clarify/model-picker tests with fail-closed component auth (#41338 ) Three gateway tests broke on main after the component-auth security hardening (test_discord_component_auth.py) made empty Discord component allowlists fail-closed: a view built with allowed_user_ids=set() now rejects every click instead of allowing anyone. The clarify and model-picker BEHAVIOR tests still constructed their views with an empty allowlist and expected the click to succeed — a stale assumption from before the hardening. Fixed by giving each view an allowlist containing the clicking user (the interaction's own id), which is the realistic shape and what the security model requires. Production code unchanged — this only updates the test fixtures to match the intended (and separately pinned) fail-closed contract. The security regression suite and these behavior suites now both pass. Fixes: - test_discord_clarify_buttons.py: test_choice_falls_back_to_label_text_when_entry_missing, test_other_flips_entry_to_awaiting_text - test_discord_model_picker.py: test_model_picker_clears_controls_before_running_switch_callback	2026-06-07 08:27:40 -07:00
LaPhilosophie	f6f363662e	fix(discord): fail closed for component button auth when no allowlist set Salvage of the Discord half of PR #30964 by @LaPhilosophie. Discord component button callbacks (ExecApprovalView, SlashConfirmView, UpdatePromptView, ModelPickerView) bypass the normal message dispatch authorization path. _component_check_auth previously returned True when both the user and role allowlists were empty, so any guild member who could see an approval prompt could click Approve on a dangerous command. Fail closed instead: require DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES / GATEWAY_ALLOWED_USERS membership, or an explicit DISCORD_ALLOW_ALL_USERS / GATEWAY_ALLOW_ALL_USERS opt-in for deliberately-open deployments. Mirrors the Telegram (#24457) and Matrix fail-closed precedent. The Slack half of #30964 is superseded by PR #33844's helper. Reported via GHSA-mc26-p6fw-7pp6 (@whyiug). Co-authored-by: LaPhilosophie <804436395@qq.com>	2026-06-07 06:21:37 -07:00
Dusk1e	3fa15b33dd	fix(feishu): fail closed for update prompt card actions	2026-06-07 06:21:37 -07:00
Dusk1e	410cb743bf	fix(slack): re-check gateway auth on approval and slash-confirm buttons	2026-06-07 06:21:37 -07:00
synapsesx	f10a330aee	fix(research): keep tool_call/tool_response pairs intact when compressing trajectories ## What does this PR do? The trajectory compressor could corrupt training trajectories by cutting a conversation in the middle of a tool-call/tool-response pair. In the from/value trajectory format a `tool` turn (carrying `<tool_response>` markers) is always emitted immediately after the `gpt` turn whose `<tool_call>` it answers, so the two turns must stay together. The compressible region's end boundary, however, was chosen purely by token accumulation: the loop stopped at the first turn where the accumulated tokens met the savings target, with no regard for turn roles. For any over-budget trajectory whose savings boundary happened to land between a `gpt` turn and its `tool` turn, the `gpt` (with its `<tool_call>`) was summarised away into the replacement `human` message while the now-orphaned `tool` turn (with its `<tool_response>`) was kept verbatim in the tail — producing an unmatched marker and silently corrupting the training signal. The head boundary had the mirror problem when the first tool turn was not protected. This change snaps both compression boundaries to a clean turn boundary before the region is extracted and replaced, so the summary always covers whole gpt+tool blocks and a `tool` turn is never separated from the `gpt` turn that precedes it. The boundary is moved forward when possible (folding an orphaned tool turn into the region that already holds its gpt) and falls back to moving backward when no clean boundary exists ahead, such as when the protected tail itself begins on a tool turn. ## Related Issue N/A ## Type of Change - [x] 🐛 Bug fix (non-breaking change that fixes an issue) ## Changes Made - `trajectory_compressor.py`: added `_is_boundary_clean()` and `_snap_boundary()` helpers on `TrajectoryCompressor`, and applied them to both the head and tail compression boundaries in `compress_trajectory()` and `compress_trajectory_async()`. When snapping collapses the region to nothing safe to compress, the trajectory is returned unchanged and flagged as still over the limit rather than being corrupted. - `tests/test_trajectory_compressor.py`: added `TestCompressionToolPairIntegrity` covering the sync and async paths plus direct unit tests for the boundary snapping (forward skip and backward fallback). ## How to Test 1. Run the focused tests: `pytest tests/test_trajectory_compressor.py -q`. 2. The new sync/async cases build a trajectory of gpt/tool pairs with an oversized middle gpt turn and choose a token target that forces the accumulation boundary to stop between a `<tool_call>` and its `<tool_response>`. They assert that `<tool_call>` and `<tool_response>` markers stay balanced after compression and that every kept `tool` turn is immediately preceded by a `gpt` turn (never the inserted summary or another tool turn). ## Checklist ### Code - [x] I've read the [Contributing Guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md) - [x] My commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) (`fix(scope):`, `feat(scope):`, etc.) - [x] I searched for [existing PRs](https://github.com/NousResearch/hermes-agent/pulls) to make sure this isn't a duplicate - [x] My PR contains only changes related to this fix/feature (no unrelated commits) - [x] I've run `pytest tests/ -q` and all tests pass - [x] I've added tests for my changes (required for bug fixes, strongly encouraged for features) - [x] I've tested on my platform: macOS 15 (Darwin 25.5) ### Documentation & Housekeeping - [x] I've updated relevant documentation (README, `docs/`, docstrings) — or N/A - [x] I've updated `cli-config.yaml.example` if I added/changed config keys — or N/A - [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows — or N/A - [x] I've considered cross-platform impact (Windows, macOS) per the [compatibility guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md#cross-platform-compatibility) — or N/A - [x] I've updated tool descriptions/schemas if I changed tool behavior — or N/A	2026-06-07 05:01:27 -07:00
manishbyatroy	490c486ff6	fix(simplex): accept display name in SIMPLEX_ALLOWED_USERS SIMPLEX_ALLOWED_USERS silently denied every contact when operators listed display names instead of numeric contactIds. The SimpleX UI never surfaces the numeric id, so display names are what operators naturally put in the env var. _is_user_authorized only compared source.user_id (the contactId), so the allowlist never matched. Expand check_ids to include source.user_name for the simplex platform, mirroring the existing WhatsApp phone-LID aliasing pattern. Adds doc + setup-prompt clarification and three regression tests. Salvaged from PR #40393. Adds manishbyatroy to release.py AUTHOR_MAP.	2026-06-07 04:53:22 -07:00

1 2 3 4 5 ...

5072 commits