hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-10 08:32:09 +00:00

Author	SHA1	Message	Date
liuhao1024	6459b3d991	fix(terminal): collapse CWD-only overrides to shared container When register_task_env_overrides is called with only a 'cwd' key (ACP adapter workspace tracking), the task_id should collapse to 'default' so all interactive surfaces (TUI, gateway, dashboard) share one long-lived container. Previously, any override registration — even CWD-only — caused _resolve_container_task_id to return the session key unchanged, spinning up a separate container per session. This made it impossible to authenticate into external services once and have that auth available across all surfaces. Now only overrides containing isolation keys (docker_image, modal_image, singularity_image, daytona_image, env_type) trigger per-task container isolation. Fixes #37361	2026-06-07 23:04:54 -07:00
teknium1	1a626470ca	refactor(cli): promote 9 closure handlers to top-level + extract their parsers (god-file Phase 2 follow-up) Subcommands whose handler was a closure defined inside main() — memory, acp, tools, insights, skills, pairing, plugins, mcp, claw — have their handler promoted to a top-level function and their parser block extracted into hermes_cli/subcommands/<name>.py (build_<name>_parser, injected handler). These 9 had zero closure-over-main-locals, so promotion is a pure relocation. acp/mcp parser blocks use the shared add_accept_hooks_flag helper. main() 1798 -> 954 LOC (71% below the 3297 Phase-2 starting point); add_parser calls in main.py 89 -> 28. Deferred: sessions, computer-use, secrets handlers reference <name>_parser (for a no-subcommand print_help fallback) — left in place to avoid the _self_parser indirection; minority, low value. Behavior-neutral: all 9 subcommands' --help (incl nested subactions) byte-identical to pre-extraction (diff-verified). tests/hermes_cli/ 6519 passed / 0 failed; new test_subcommands_followup.py covers the 9 builders.	2026-06-07 22:56:23 -07:00
teknium1	524453dab5	refactor(agent): consolidate inner-retry-loop recovery flags into TurnRetryState (god-file Phase 1b) run_conversation's inner retry loop tracked recovery state in ~15 scattered bare booleans (per-provider OAuth refresh guards, format-recovery guards, restart signals). They are now fields on a single TurnRetryState dataclass the loop mutates in place (_retry.<flag>), giving the recovery bookkeeping a named, testable home. Loop-control vars (retry_count, max_retries, max_compression_attempts) stay as plain locals — they're while-mechanics, not recovery bookkeeping. Behavior-neutral: pure local→attribute rewrite of 42 references; kwarg NAMES preserved (e.g. has_retried_429=_retry.has_retried_429). Live simple + tool turns OK. Validation: tests/run_agent/ 1615 passed / 0 failed under per-file process isolation; new test_turn_retry_state.py pins the field contract.	2026-06-07 22:42:05 -07:00
Rod Boev	648706936d	test(gateway): add compression session_id rotation integration tests (#34089)	2026-06-07 22:39:51 -07:00
JimStenstrom	cb5c24e37d	fix(agent): sync logging session context on compaction id rotation When context compaction rotates agent.session_id, it updates the gateway/tools session context (set_current_session_id -> HERMES_SESSION_ID env + ContextVar) but never updates the separate logging session context. The [session_id] tag on log lines comes from hermes_logging._session_context (set once per turn in conversation_loop.py), so post-compaction log lines in the same turn carry the STALE old id while the message/DB/gateway state carry the new one — breaking log correlation exactly at the compaction boundary. Call hermes_logging.set_session_context(agent.session_id) alongside the existing set_current_session_id, guarded so a logging failure can't regress the routing update. Logs-only; no runtime or caching impact. Refs #34089	2026-06-07 22:30:02 -07:00
Teknium	8e223b36ed	fix(curator): protect load-bearing built-in skills from archival/consolidation (#41817) The curator's idle-archival path (apply_automatic_transitions under prune_builtins) could archive the bundled `plan` skill, killing the /plan slash command silently — typing /plan then returned 'Unknown command' with no signal that a skill had vanished. The archived skill's hash stays in .bundled_manifest, so 'hermes update' wouldn't re-seed it. Add PROTECTED_BUILTIN_SKILLS ({plan}) enforced at the master gate is_curation_eligible() (covers archive_skill + the transition walk) and in the candidate enumerator (so the LLM consolidation pass never sees them). Immune to prune_builtins, pin state, and LLM judgment.	2026-06-07 22:23:29 -07:00
Teknium	777dc9da62	feat(acp): emit session provenance metadata for compression rotation (#41724) Closes #33617. Adds additive _meta.hermes.sessionProvenance to ACP session surfaces so clients can detect compression-driven internal session rotation without parsing status text, guessing from token drops, or reading state.db. Derived on demand from the existing compression chain (parent_session_id / end_reason) — no new persisted state, no schema change, no ACP protocol change. ACP session_id stays the stable client handle. - acp_adapter/provenance.py: derive provenance from SessionDB - server.py: attach _meta to new/load/resume responses; emit a session_info_update when the internal head rotates during a prompt	2026-06-07 22:22:21 -07:00
Martín Alcalá Rubí	132d6fe6d6	fix(volcengine): strip XML attribute fragments from tool_use.name (#33007) VolcEngine's api/plan endpoint occasionally leaks raw XML attribute fragments into tool_use.name when its protocol-translation layer converts the model's native XML-style tool emission to Anthropic Messages tool_use blocks, producing names like: terminal" parameter="command" string="true execute_code" parameter="code" string="true session_search" parameter="session_id" string="true The corruption happens server-side at the provider, but it breaks every tool call for affected users — no normalization rule in repair_tool_call can rescue them, so each request runs through three retries and then aborts as partial. Add an early sanitizer in agent_runtime_helpers.repair_tool_call that trims at the first ' " ', " ' ", '<', or '>' character (idx > 0 only) so the rest of the existing repair pipeline (lowercase / snake_case / fuzzy match) can resolve the cleaned name normally. Whitespace is deliberately NOT a separator — the legitimate "write file" -> write_file repair path (covered by test_space_to_underscore) must keep working. Tests: 11 new regression cases in TestVolcEngineXmlPollution covering all three observed polluted names, CamelCase + pollution mix, single-quote variants, angle-bracket variants, clean-name passthrough, and the whitespace-preservation guard. All 18 pre-existing repair tests still pass (29 total in the file).	2026-06-07 22:22:01 -07:00
lsaether	9b631e4ae1	fix(acp): suppress cancel interrupt sentinel	2026-06-07 22:20:43 -07:00
Teknium	2789bf4e25	fix(auxiliary): route Codex Responses path through shared converter (#5709) The auxiliary Codex adapter maintained its own chat->Responses conversion loop that forwarded every non-system message's role verbatim into Responses input[]. When flush_memories()/compression replayed session history containing assistant tool_calls + role=tool results, those tool messages leaked into the request and the Responses API rejected them with HTTP 400: Invalid value: 'tool'. Route _CodexCompletionsAdapter.create() through the same shared converter the main agent transport uses (_chat_messages_to_responses_input), so tool calls become function_call items and tool results become function_call_output items with a valid call_id. Single conversion path means no future drift. Also remove the now-dead _convert_content_for_responses() helper — its only caller was the private conversion loop this change deletes. Co-authored-by: ProgramCaiCai <techxacm@gmail.com>	2026-06-07 22:18:31 -07:00
teknium1	568e127612	refactor(cli): extract 25 more subcommand parsers into hermes_cli/subcommands/ Batch extraction of every remaining subcommand whose handler is top-level and whose parser block is pure argparse: model, setup, postinstall, whatsapp, slack, login, logout, auth, status, webhook, hooks, doctor, security, dump, debug, backup, import, config, version, update, uninstall, dashboard, gui, logs, prompt-size. Each becomes hermes_cli/subcommands/<name>.py with build_<name>_parser() and an injected handler (no main import). dashboard also injects cmd_dashboard_register for its nested 'register' action. Behavior-neutral: all 25 subcommands' --help output (and nested subaction help) diff-verified byte-identical to pre-extraction. Two RawDescriptionHelpFormatter epilogs (debug, logs) needed their multi-line string interiors preserved at column 0 — caught by the --help diff, not compile. main() 3297 -> 1798 LOC across this PR; add_parser calls in main.py 179 -> 89. Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process isolation; new test_subcommands_batch.py smoke-tests all 25 builders + the dashboard two-handler case.	2026-06-07 22:18:14 -07:00
teknium1	4da45e8727	refactor(cli): extract profile + gateway/proxy parsers into hermes_cli/subcommands/ Follow-on to the cron extraction in the same Phase 2 PR. Same pattern: per-group build_<name>_parser() functions with injected handlers, no main import. - subcommands/profile.py: build_profile_parser (190-line block out of main()). - subcommands/gateway.py: build_gateway_parser (gateway + proxy, 238-line block; they shared one inline section). Imports argparse for SUPPRESS defaults. - main(): two more inline blocks become single builder calls. Behavior-neutral: 'profile [sub] --help' and 'gateway/proxy [sub] --help' byte-identical to pre-extraction (diff-verified). main() now 2723 LOC (was 3297 at Phase 2 start); add_parser calls in main.py 179 -> 141. Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process isolation; new builder unit tests cover subactions, aliases, dispatch, flags.	2026-06-07 22:18:14 -07:00
teknium1	b2e6053243	refactor(cli): extract hermes cron parser into hermes_cli/subcommands/ (god-file Phase 2) Phase 2 of the god-file decomposition plan. main()'s argparse tree is 179 inline add_parser calls in one 3,297-line function. This establishes the hermes_cli/subcommands/ package and extracts the first group (cron) as the proof-of-pattern: - hermes_cli/subcommands/_shared.py: shared parser helpers (add_accept_hooks_flag), re-exported from main.py for backwards compat. - hermes_cli/subcommands/cron.py: build_cron_parser(subparsers, cmd_cron=...). Handler injected so the module never imports main (cycle avoidance). - main()'s ~155-line inline cron block becomes one build_cron_parser() call. Behavior-neutral: 'hermes cron create --help' output is byte-identical to origin/main. main() 3297 -> 3143 LOC. Validation: tests/hermes_cli/ 6466 passed / 0 failed under per-file process isolation; new test_subcommands_cron.py covers subactions, aliases, options, no-agent tristate, injected dispatch, and --accept-hooks.	2026-06-07 22:18:14 -07:00
teknium1	54870847cb	refactor(agent): extract run_conversation prologue into agent/turn_context.py Phase 1 of the god-file decomposition plan. run_conversation's ~470-line once-per-turn setup block (stdio guarding, retry-counter resets, user-message sanitization, todo/nudge hydration, system-prompt restore-or-build, crash-resilience persistence, preflight compression, the pre_llm_call hook, and external-memory prefetch) is moved verbatim into build_turn_context(), which returns a TurnContext dataclass the loop unpacks. Behavior-neutral move-and-name refactor: the builder mutates `agent` exactly as the inline code did; only the locals the loop reads back are returned. - run_conversation: 4602 -> 4217 LOC (-385) - agent/conversation_loop.py: 4965 -> ~4580 LOC - new agent/turn_context.py: focused, dependency-injected, unit-tested in isolation Tests: tests/run_agent/ 1570 passed / 0 failed under per-file process isolation. Relocation follow-ups: 413_compression mocks now patch both module references; nudge/on_turn_start source-inspection guards point at the extracted module.	2026-06-07 22:17:35 -07:00
Teknium	86c537d209	fix(memory): instruct in-turn consolidation + retry on overflow (#41755) * fix(memory): make overflow errors instruct in-turn consolidation + retry When bounded memory is full, the add/replace overflow errors now explicitly tell the model to consolidate (merge/remove/shorten) and retry the write in the same turn, matching the documented behavior. The replace-overflow path now also echoes current_entries + usage for parity with add-overflow, so the model has the same context to act on. Closes #23378 (working-as-documented; this sharpens runtime to match docs). * fix(memory): broaden overflow remediation hint beyond 'stale' Say 'stale or less important' — entries don't have to be stale to be the right ones to drop when making room.	2026-06-07 22:16:28 -07:00
teknium1	2a10da3a16	fix(gateway): keep /model + /reasoning overrides on topic recovery & compression splits Session-scoped /model and /reasoning overrides were silently lost on Telegram DM/forum topics and after compression session splits (#30479). Root cause: _handle_message_with_agent rewrites source.thread_id via _recover_telegram_topic_thread_id (lobby/stripped reply -> the user's bound topic) before deriving the session key. The /model and /reasoning handlers derived their override key from the raw inbound event.source, skipping that recovery, so the override was stored under one key and the next message turn read a different key. Fix: add _normalize_source_for_session_key (applies the same recovery a message turn does) and use it in both handlers before deriving the key. session_id rotation on compression was never the cause — overrides are keyed by the durable session_key; the split path preserves it. Author: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-06-07 22:10:32 -07:00
Hariharan Ayappane	b8469a81e3	fix(weixin): add rate-limit circuit breaker	2026-06-07 22:10:17 -07:00
Teknium	2e62862784	fix(telegram): use get_running_loop in polling-conflict retry reschedule (#41716) The conflict-retry path called asyncio.get_event_loop() to reschedule itself when a retry's start_polling raised. On Python 3.11+ (our floor) that raises 'RuntimeError: There is no current event loop in thread MainThread' when no loop is attached to the thread, which is what happens when PTB dispatches this error callback. The retry never gets scheduled, the adapter goes silent-but-alive, and gateway --replace keeps spawning fresh instances that hit the same wall — the crash loop reported in #19471 (worse under multi-profile, where two bots hold the same conflict open). We are inside a coroutine here, so asyncio.get_running_loop() is the correct, guaranteed-valid replacement. Only get_event_loop() call in any platform adapter, so no sibling sites. Fixes #19471	2026-06-07 22:10:03 -07:00
Basil Al Shukaili	8513a6aec7	fix(compression): guard against cross-session stale _previous_summary contamination When a cron or background session compacts, it sets _previous_summary for iterative updates. If that session ends without /new or /reset (which calls on_session_reset()), the stale summary survives on the ContextCompressor instance. A subsequent live messaging session's compaction then injects it as 'PREVIOUS SUMMARY:' into the summarizer prompt — contaminating the live session with unrelated content from the prior session. Add an else guard in compress(): when no handoff summary is found in the current messages but _previous_summary is non-empty, discard it so _generate_summary() starts fresh instead of iteratively updating a stale cross-session summary. Fixes #38788	2026-06-07 22:09:45 -07:00
Teknium	5408013369	fix(gateway): isolate DM sessions on user_id when chat_id is absent (#41764) build_session_key collapsed every DM that arrived without a chat_id into one shared 'agent:main:<platform>:dm' key. A single cached AIAgent then served multiple users' conversations, bleeding history across senders. DMs now fall back to the sender's user_id_alt/user_id (mirroring the group-path participant precedence and the telegram auth-path fallback) before the bare per-platform sink. Telegram's normal event path always sets chat_id, so this hardens the synthetic-source / non-standard-adapter paths that don't.	2026-06-07 22:07:07 -07:00
Teknium	a77bc2c08d	fix(compression): disable compression on background-review fork to prevent cross-turn stale-parent fork (#41708) The per-session compression lock prevents same-window concurrent forks but not cross-turn ones: the background-review fork shares the parent's session_id, so if it won a compression race its new child session was never adopted by the gateway (the fork is single-lifecycle). The next foreground turn then started from the stale parent and compressed it again, leaving the same parent with two sibling children. Set review_agent.compression_enabled = False so the fork never triggers compression. Both trigger sites in conversation_loop.py gate on compression_enabled before calling _compress_context, so the fork can never rotate the shared parent. Review needs full context anyway — compressing would degrade the memory/skill summary. The per-session lock is kept as defense-in-depth for any future shared-session path. Adds a regression test that fails without the flag and passes with it. Closes #38727	2026-06-07 22:06:48 -07:00
Teknium	48ae8029aa	fix(delegate): resolve custom-endpoint subagent pools by endpoint identity (#41730) Subagents delegated to a custom endpoint were misrouted when the parent ran on a different custom endpoint. Both runtimes collapse to provider="custom", so _resolve_child_credential_pool() treated them as interchangeable and handed the child the parent's pool. Leasing from it then overwrote the child's delegated base_url with the parent's endpoint via _swap_credential() — the child sent the delegated model name to the wrong endpoint. Custom runtimes now resolve by endpoint identity (the custom:<name> pool key derived from base_url). The parent pool is reused only when both parent and child resolve to the same custom endpoint; unregistered raw endpoints return None so the child keeps its fixed delegated credential. Non-custom provider paths are unchanged. Fixes #7833.	2026-06-07 22:05:14 -07:00
islam666	78e2101cd2	fix: reap zombie subprocesses in web_server action status and meet_bot cleanup - web_server.py: after proc.poll() returns a non-None exit code, call proc.wait() to reap the child and move the entry from _ACTION_PROCS to _ACTION_RESULTS. Previously .poll() alone left <defunct> zombies. - meet_bot.py: terminate and wait on the pcm_pump subprocess (paplay/ffmpeg) during the finally-block teardown. Previously leaked on every normal bot exit. - tests: add test_action_status_reaps_completed_process and test_action_status_ignores_wait_failure covering both the happy path and the wait()-raises-OSError edge case. Closes #38032	2026-06-07 21:50:57 -07:00
islam666	e53b74c394	fix(dist): stop USER_OWNED_EXCLUDE from filtering nested directories The copytree ignore lambda in _copy_dist_payload applied USER_OWNED_EXCLUDE recursively at every directory depth. This caused nested directories whose names matched exclude entries (bin, logs, cache, etc.) to be silently dropped during distribution install/update. Fix: only apply USER_OWNED_EXCLUDE filtering at the root of the staged tree, matching the two-tier pattern used by _clone_all_copytree_ignore and _default_export_ignore in profiles.py. Add 5 tests covering nested bin/logs/cache preservation and top-level filtering still working. Fixes #37954	2026-06-07 21:50:57 -07:00
islam666	09a5548628	fix(weixin): refresh typing ticket on expiry to prevent stuck indicator (#38085) The WeChat iLink typing ticket has a 600-second TTL. When a long-running session exceeds that window, the cached ticket evicts from TypingTicketCache. Both send_typing and stop_typing silently returned early when the ticket was None, meaning the TYPING_STOP=2 signal was never sent to iLink. The WeChat client then showed the typing indicator indefinitely. Fix: add _ensure_typing_ticket() that transparently refreshes the ticket via getConfig when the cached one has expired or is missing. Both send_typing and stop_typing now call this method instead of silently no-oping. Fixes #38085	2026-06-07 21:50:57 -07:00
islam666	2e61de0638	fix(model_metadata): consult DEFAULT_CONTEXT_LENGTHS before 256K fallback on custom endpoints Problem: get_model_context_length() had an early return at the end of the custom-endpoint probe branch (step 3) that returned DEFAULT_FALLBACK_CONTEXT (256K) without ever consulting the hardcoded DEFAULT_CONTEXT_LENGTHS catalog (step 8). Models served through a custom/proxied gateway (e.g. corporate Anthropic proxy) that didn't expose Ollama or local-server endpoints would hit this path and get capped at 256K, even when the model name clearly matched a known entry in the catalog (e.g. claude-opus-4-8 → 1M). Changes: - agent/model_metadata.py: Before returning DEFAULT_FALLBACK_CONTEXT at the end of the custom-endpoint branch, consult DEFAULT_CONTEXT_LENGTHS using the same longest-key-first fuzzy matching as step 8. Only fall through to 256K if no catalog entry matches. - tests/agent/test_model_metadata.py: Updated existing test and added new test covering the custom-endpoint → catalog fallback behavior. Fixes #38865	2026-06-07 21:50:57 -07:00
islam666	9513793ad7	fix(vision): proactive downgrade for providers rejecting list-type tool content (#41072) Xiaomi MiMo (and potentially other providers) support multimodal user messages but reject list-type tool message content with 400 'text is not set'. Previously this was handled reactively — the API call would fail, images would be stripped, and the request retried, losing visual info. Fix: add supports_vision_tool_messages field to ProviderProfile (default True). Xiaomi sets it to False. _tool_result_content_for_active_model now checks this field proactively and returns a text summary instead of list content, avoiding the round-trip failure entirely.	2026-06-07 21:50:57 -07:00
islam666	41f0714287	fix(vision): honor custom_providers per-model supports_vision (#41036) _supports_vision_override() in image_routing.py checked model.supports_vision and providers.<name>.models, but not the legacy list-style custom_providers config. A custom provider entry like: custom_providers: - name: my-provider models: my-model: supports_vision: true was ignored, causing image_input_mode=auto to route through the auxiliary vision_analyze path instead of natively attaching images. Fix: added a lookup step for custom_providers list entries, matching by provider name (including 'custom:<name>' variants at runtime). providers.<name>.models still takes precedence over custom_providers. 13 new tests covering: true/false override, custom: prefix matching, no-match fallback, non-dict entries, empty lists, models key missing.	2026-06-07 21:50:57 -07:00
islam666	18c085b1a4	fix(gateway): normalize optional systemd directives in stale-check (#41119) On older systemd versions that don't support RestartMaxDelaySec / RestartSteps, the installed unit file has those directives silently dropped. systemd_unit_is_current() did a strict text comparison, so the unit was perpetually flagged as outdated. Fix: _strip_optional_systemd_directives() removes RestartMaxDelaySec and RestartSteps from both the installed and expected text before comparison. Units that differ only by these optional directives are now correctly considered current.	2026-06-07 21:50:57 -07:00
islam666	b18490b890	fix(compaction): prevent infinite loop when transcript fits in tail budget When summary_target_ratio is large (e.g. 0.45) and the context_length is moderate (e.g. 96000), the soft_ceiling (token_budget * 1.5) can exceed the total transcript size. _find_tail_cut_by_tokens walks the entire transcript without breaking early, and the resulting compress window is either empty (compress_start >= compress_end) or a single message whose summary-of-one overhead saves ~0 tokens. Both outcomes cause a no-op compression that does not increment _ineffective_compression_count, so should_compress() returns True on every subsequent turn and the loop repeats endlessly. Fix (two layers): 1. _find_tail_cut_by_tokens: when the backward walk consumed the entire transcript without breaking (cut_idx <= head_end and accumulated <= soft_ceiling), re-walk with the raw (non-inflated) token budget to find a meaningful cut that gives the summarizer a useful middle window. 2. compress(): when compress_start >= compress_end, increment _ineffective_compression_count and log a warning so the existing anti-thrashing guard in should_compress() can break the loop. Fixes #40803	2026-06-07 21:50:57 -07:00
Brian D. Evans	ab0a6270c3	fix(slack): align thread_ts check with is_thread_reply invariant (Copilot #15464) Two findings from Copilot's review on #15464, both addressed: 1. ``event.get("thread_ts")`` truthy vs ``event_thread_ts != ts``: the new channel branch treated ANY truthy ``thread_ts`` as a real thread reply, but three lines below ``is_thread_reply`` is defined with the stricter ``event_thread_ts and event_thread_ts != ts`` invariant. If Slack ever ships a payload where ``thread_ts == ts`` on a thread root, the stricter check would treat it as a top-level message for the ``is_thread_reply`` path but as a thread reply for session keying — divergent behaviour. Aligned this branch to the same ``and event_thread_ts_raw != ts`` invariant. 2. ``test_top_level_reply_to_id_stays_none_when_shared`` docstring had the ternary logic backwards ("None != ts → reply_to_message_id IS set"). The code reads ``reply_to_message_id = thread_ts if thread_ts != ts else None`` — with ``thread_ts = None``, the condition is True so the expression evaluates to ``thread_ts`` itself (None), meaning the reply stays un-threaded. The test asserted the correct end-state; only the explanatory docstring was wrong. Rewrote the docstring to match the actual code flow, with the note that Copilot caught the reversal. 7/7 tests still pass. No behaviour change for the existing test_thread_reply_scopes_by_thread_even_when_shared case because ``event_thread_ts_raw = "1700000000.000000"`` and ``ts = "1700000000.000005"`` are distinct — the new ``!= ts`` guard is a no-op there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-07 21:19:59 -07:00
Brian D. Evans	133e0271e2	fix(slack): scope top-level channel messages by channel-only when reply_in_thread=false (#15421) Top-level Slack channel messages previously fell back to the message's own ``ts`` as a synthetic ``thread_ts``: thread_ts = event.get("thread_ts") or ts # ts fallback for channels That value flows into ``build_source(thread_id=thread_ts)`` at line 1247. The gateway session store keys sessions by ``(platform, channel_id, thread_id)``, so every top-level channel message ended up on a unique session. Operators who set ``reply_in_thread: false`` in ``config.yaml`` expected all top-level channel messages to share one session (the whole point of that flag) — instead each one spawned a fresh conversation with no context carry-over. Fix: Three explicit cases in the channel branch (event.thread_ts / reply_in_thread / thread_ts for session keying): non-null (real thread reply) / either / event.thread_ts; null (top-level) / true (default) / ts (legacy: own-thread sessions); null (top-level) / false / None (shared channel session). The outbound-reply gate at line 1264 (``reply_to_message_id = thread_ts if thread_ts != ts else None``) still works correctly in all three cases without further changes: ``None != ts`` is True, so shared-channel top-level messages don't get their reply threaded either — matching the operator's ``reply_in_thread=false`` intent end-to-end. Genuine thread replies still scope per-thread under both modes so multi-person threaded conversations can't collide with unrelated channel chatter. Tests (7 new in ``tests/gateway/test_slack_channel_session_scope.py``): All drive the real ``SlackAdapter._handle_slack_message`` code path (not a re-implementation) via the standard pytest fixture pattern used by ``tests/gateway/test_slack.py``. Messages @mention the bot so the mention gate doesn't drop them — the tests are specifically about what happens once the handler decides to emit a ``MessageEvent``. * ``TestChannelSessionScopeDefault`` (2 cases): - Explicit ``reply_in_thread: true`` keeps ``thread_id = ts`` (legacy behaviour — regression guard) - Unset config behaves like ``reply_in_thread: true`` (pins the default) * ``TestChannelSessionScopeShared`` (3 cases): - ``reply_in_thread: false`` + top-level → ``thread_id is None`` (the #15421 bug 1 fix) - ``reply_to_message_id is None`` in the same case (no threaded outbound reply) - Genuine thread reply still scopes per-thread when shared mode is on — only TOP-LEVEL messages collapse to the channel session * ``TestThreadReplyAlwaysScopesByThread`` (2 parametrised cases): - Thread replies get ``thread_id = event.thread_ts`` regardless of ``reply_in_thread`` — critical invariant for multi-thread channels; a regression here would leak per-thread context across threads Regression guard verified: reverted the else-branch to the legacy ``thread_ts = event.get("thread_ts") or ts`` one-liner; ``test_top_level_maps_to_none_when_reply_in_thread_false`` correctly failed (asserts ``thread_id is None`` but got ``"1700000000.000003"``). Restored → 182 slack tests pass (175 existing + 7 new). Scope: this fixes #15421 bug 1 only. Bug 2 (sessions.json not persisting across compression) lives elsewhere in the session manager and is left for a separate diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-07 21:19:59 -07:00
Shannon Sands	86e5efb0ae	Preserve Telegram onboarding fallback errors	2026-06-07 19:48:09 -07:00
Shannon Sands	ba29010902	Use httpx for Telegram onboarding worker calls	2026-06-07 19:48:09 -07:00
brooklyn!	fa42ac094d	feat(desktop): Shift+click the status-bar zap to toggle YOLO globally (#41666) The status-bar zap currently toggles per-session approval bypass (the same scope as the TUI's Shift+Tab). This adds a global escape hatch: Shift+clicking the zap flips the persistent approvals.mode in config.yaml between "off" (bypass on) and "manual" (bypass off), affecting every session, the CLI, the TUI, and cron — and it survives restarts. - statusbar-controls: thread the click's shiftKey through onSelect via a new StatusbarSelectModifiers arg. - yolo-session: add setGlobalYolo() that calls config.set with scope="global". - use-statusbar-items: branch toggleYolo on modifiers.shiftKey; plain click stays per-session, Shift+click goes global. - tui_gateway config.set "yolo" key: add scope="global" that reads/writes approvals.mode through the gateway's own (mtime-cached) config view, honors an explicit value, and re-emits session.info to every live session so each window's zap reflects the flip immediately. - i18n: tooltip copy in en/ja/zh/zh-hant notes Shift+click toggles globally. Tests: two new tui_gateway tests cover the global toggle and explicit-value paths; existing session/process-scope yolo tests still pass.	2026-06-07 20:57:08 -05:00
Teknium	30c7913617	fix(api_server): report hermes version on /health and /health/detailed (#40620) Salvaged from #40479; re-verified on main, tightened, tested. Co-authored-by: tfournet <tfournet@users.noreply.github.com>	2026-06-07 18:38:54 -07:00
Teknium	b97cd81c78	refactor(insights): drop dead pricing/duration wrappers, call usage_pricing directly (#40618) Salvaged from #40527; re-verified on main, tightened, tested. Co-authored-by: HeLLGURD <HeLLGURD@users.noreply.github.com>	2026-06-07 18:33:20 -07:00
Teknium	2aa316ec9c	docs(windows): fix Get-Command PATH guidance to venv\Scripts\hermes.exe (#40613 ) Closes #40464. Salvaged from #40488; re-verified on main, tightened, tested. Co-authored-by: gauravsaxena1997 <gauravsaxena1997@users.noreply.github.com>	2026-06-07 18:28:23 -07:00
Teknium	6bdc4c0231	test: skip curses tests on Windows where _curses is unavailable (#40611 ) Salvaged from #40447; re-verified on main, tightened, tested. Co-authored-by: Ganesh0690 <Ganesh0690@users.noreply.github.com>	2026-06-07 18:21:03 -07:00
Teknium	69a293b419	hardening(todo): bound TodoStore item content length and count The todo list is re-injected into the model's context after every context-compression event (TodoStore.format_for_injection), so an oversized todo item or an unbounded number of items defeats the compression it is meant to ride through. TodoStore.write/_validate previously enforced no size or count bounds, so a single 50KB item produced a ~50KB re-injection block on every subsequent turn. Add two caps: - MAX_TODO_CONTENT_CHARS (4000): per-item content is truncated with a marker. Routed through a shared _cap_content() so the merge-update path (which writes content directly, bypassing _validate) is capped too. - MAX_TODO_ITEMS (256): total list length is bounded, keeping the highest-priority head (list order is priority). Both caps are generous relative to real plans — a todo item is a short task description and active lists are a handful of items. NOT a security fix. Raised externally via GHSA-5g4g-6jrg-mw3g, which framed a caller-supplied conversation_history on the authenticated API server replaying into _hydrate_todo_store as a DoS. That path is authenticated (the API server refuses to start without API_SERVER_KEY) and self-scoped (the caller supplies their own entire history and can only inflate their own response chain — forged role=tool entries are never persisted to the session DB), so it is out of scope as a vulnerability under SECURITY.md 3.2. These bounds are footgun containment that also applies to the trusted agent path, where the model itself authors the todos. Credit to the reporter for the observation. Co-authored-by: YLChen-007 <30854794+YLChen-007@users.noreply.github.com>	2026-06-07 18:06:27 -07:00
Gilad Bauman	ae82eed2b1	fix(gateway): use OGG for Telegram auto TTS	2026-06-07 18:05:58 -07:00
Teknium	cb83149dc6	fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (#40607) Salvaged from #40421; re-verified on main, tightened, tested. Co-authored-by: maxmilian <maxmilian@users.noreply.github.com>	2026-06-07 17:49:38 -07:00
Teknium	09d66037f8	fix(hindsight): send only new-turn delta on append retains instead of whole session (#40605) Closes #40503. Salvaged from #40519; re-verified on main, tightened, tested. Co-authored-by: skylarbpayne <skylarbpayne@users.noreply.github.com>	2026-06-07 17:41:10 -07:00
kshitijk4poor	7df81d0557	fix(web): make _has_env config-aware so SEARXNG_URL auto-detect honors Hermes config Follow-up to #34306. The provider fix made SearXNG usable with a config-only SEARXNG_URL, but tools/web_tools._has_env still read raw os.getenv, so the backend auto-detect cascade and check_web_api_key remained blind to it — SearXNG worked when explicitly selected but was never auto-selected. Route _has_env (and the SearXNG diagnostic print) through a config-aware _env_value helper mirroring the provider's _searxng_url(). Fixing the shared helper covers every provider key in one place. Adds regression tests for config-only auto-detect and check_web_api_key. See #34290.	2026-06-08 01:12:32 +05:30
kshitij	0c0fbf763b	Merge pull request #41430 from helix4u/fix-url-tools-unicode-normalization fix(tools): percent-encode non-ascii URL components	2026-06-07 12:39:30 -07:00
helix4u	333f01bc7f	fix(tools): percent-encode non-ascii URL components	2026-06-07 11:42:26 -06:00
teknium1	16786f3bb3	feat(desktop+gateway): remote media relay — attach images/PDFs and display gateway images over the network Desktop connected to a remote gateway can now attach images and PDFs and display agent-written images. Previously the desktop passed a LOCAL file path to image.attach; on a remote gateway that path doesn't exist, so the image was silently dropped ("skipped unreadable path") and the vision model never saw it. The reverse direction was also broken — images the agent wrote on the gateway rendered as dead links in the remote client. Gateway (tui_gateway/server.py): - image.attach_bytes: base64 byte upload written into the gateway's own images dir and queued via the existing native-image-attach pipeline. Magic-byte extension sniffing, data-URL prefix + whitespace tolerance, 25 MB cap, structured error codes. Accepts content_base64/filename (canonical) and data/ext (older-desktop aliases). - pdf.attach: renders each page to PNG via pdftoppm (poppler-utils) at 150 DPI and queues the pages as images; 50 MB / 25-page caps. Accepts host path or base64 upload. - Shared helpers (_decode_attach_base64, _sniff_image_ext, _queue_attached_image) so the two methods and the existing image.attach don't duplicate logic. Gateway (hermes_cli/web_server.py): - GET /api/media: returns a gateway-local image as a base64 data URL so remote clients can display it. Auth-gated like every /api route, extension allowlist + size cap, AND confined to the gateway's own media roots (images/screenshots/cache, resolved symlink-safe) so an authed caller can't read image-extension files anywhere on disk. Desktop (apps/desktop): - syncImageAttachmentsForSubmit uploads bytes via image.attach_bytes when the connection mode is 'remote'; the local fast path is unchanged. - media.ts gains isRemoteGateway() + gatewayMediaDataUrl(); directive-text and markdown-text fetch images over /api/media in remote mode. 
Consolidates the competing remote-media PRs (#38876, #40317, #21908, #39437) into one coherent implementation, taking the strongest parts of each and adding shared-helper cleanup plus the /api/media root-confinement hardening on top. The per-profile gateway switching from #38876 is intentionally left out as a separable feature. TUI file uploads (#40492) remain a separate surface. Tested: 11 new tui_gateway tests + 5 /api/media endpoint tests + desktop media.remote unit tests; full tui_gateway + web_server suites green (472 passed); tsc -b clean; E2E verified the full attach→disk→queue and gateway-path→data-URL display round-trip plus the out-of-root security block. Co-authored-by: Max Mitcham <maxmitcham@mac.home> Co-authored-by: Justlrnal4 <Justlrnal4@users.noreply.github.com> Co-authored-by: Chris Cook <ccook@nvms.com> Co-authored-by: Thomas Paquette <thomas.paquette@gmail.com>	2026-06-07 10:05:53 -07:00
Teknium	0c48b7165d	hardening(api-server): scan cron prompts on REST create/update for parity with the agent tool The agent-facing cronjob tool scans the user prompt with _scan_cron_prompt() before creating/updating a job (tools/cronjob_tools.py); the REST cron endpoints (POST /api/jobs, PATCH /api/jobs/{id}) validated length but not content. This adds the same scan to both handlers so an exfiltration/injection prompt is rejected the same way regardless of which surface created the job. NOT a security boundary, defense-in-depth / parity only: the REST cron endpoints are authenticated (every handler runs _check_auth, and connect() refuses to start without API_SERVER_KEY), and _scan_cron_prompt is a documented in-process heuristic, not a containment boundary (SECURITY.md 3.2). Raised externally via GHSA-fr3q-rjg3-x6mf (DNS-rebinding pre-auth RCE). The report's load-bearing 'no auth by default' premise was already closed three weeks after it was filed by the API_SERVER_KEY-required guard (commit `1a9ef8314`); this lands the create/update prompt-validation parity the report also pointed at. Scanner imported defensively so a missing scanner cannot disable the cron REST API.	2026-06-07 10:04:57 -07:00
Teknium	af08c43f3e	fix: skip MCP preflight content-type probe on reconnect when already ready (#40604) Closes #40366. Salvaged from #40548; re-verified on main, tightened, tested. Co-authored-by: mohamedorigami-jpg <mohamedorigami-jpg@users.noreply.github.com>	2026-06-07 09:51:11 -07:00
teknium1	76f01780f0	fix(kanban): sweep deferred scratch parent on non-scratch child completion + tests Follow-up on the deferred-cleanup salvage (#33774): _cleanup_workspace returned early for a non-scratch ('dir'/'worktree') task and never ran the parent sweep, so a scratch parent waiting on a 'dir' child would leak its deferred workspace forever. Run the parent sweep before the early return. Adds regression tests: deferred-while-child-active, swept-after-last-child, and dir-child-unblocks-scratch-parent.	2026-06-07 09:50:44 -07:00


5080 commits