Commit graph

5080 commits

Author SHA1 Message Date
liuhao1024
6459b3d991 fix(terminal): collapse CWD-only overrides to shared container
When register_task_env_overrides is called with only a 'cwd' key
(ACP adapter workspace tracking), the task_id should collapse to
'default' so all interactive surfaces (TUI, gateway, dashboard)
share one long-lived container.

Previously, any override registration — even CWD-only — caused
_resolve_container_task_id to return the session key unchanged,
spinning up a separate container per session. This made it
impossible to authenticate into external services once and have
that auth available across all surfaces.

Now only overrides containing isolation keys (docker_image,
modal_image, singularity_image, daytona_image, env_type) trigger
per-task container isolation.

Fixes #37361
2026-06-07 23:04:54 -07:00
teknium1
1a626470ca refactor(cli): promote 9 closure handlers to top-level + extract their parsers (god-file Phase 2 follow-up)
Subcommands whose handler was a closure defined inside main() — memory, acp,
tools, insights, skills, pairing, plugins, mcp, claw — have their handler
promoted to a top-level function and their parser block extracted into
hermes_cli/subcommands/<name>.py (build_<name>_parser, injected handler).

These 9 had zero closure-over-main-locals, so promotion is a pure relocation.
acp/mcp parser blocks use the shared add_accept_hooks_flag helper.

main() 1798 -> 954 LOC (71% below the 3297 Phase-2 starting point);
add_parser calls in main.py 89 -> 28.

Deferred: sessions, computer-use, secrets handlers reference <name>_parser
(for a no-subcommand print_help fallback) — left in place to avoid the
_self_parser indirection; minority, low value.

Behavior-neutral: all 9 subcommands' --help (incl nested subactions) byte-
identical to pre-extraction (diff-verified). tests/hermes_cli/ 6519 passed /
0 failed; new test_subcommands_followup.py covers the 9 builders.
2026-06-07 22:56:23 -07:00
teknium1
524453dab5 refactor(agent): consolidate inner-retry-loop recovery flags into TurnRetryState (god-file Phase 1b)
run_conversation's inner retry loop tracked recovery state in ~15 scattered
bare booleans (per-provider OAuth refresh guards, format-recovery guards,
restart signals). They are now fields on a single TurnRetryState dataclass the
loop mutates in place (_retry.<flag>), giving the recovery bookkeeping a named,
testable home.

Loop-control vars (retry_count, max_retries, max_compression_attempts) stay as
plain locals — they're while-mechanics, not recovery bookkeeping.

Behavior-neutral: pure local→attribute rewrite of 42 references; kwarg NAMES
preserved (e.g. has_retried_429=_retry.has_retried_429). Live simple + tool
turns OK.

Validation: tests/run_agent/ 1615 passed / 0 failed under per-file process
isolation; new test_turn_retry_state.py pins the field contract.
2026-06-07 22:42:05 -07:00
Rod Boev
648706936d test(gateway): add compression session_id rotation integration tests (#34089) 2026-06-07 22:39:51 -07:00
JimStenstrom
cb5c24e37d fix(agent): sync logging session context on compaction id rotation
When context compaction rotates agent.session_id, it updates the gateway/tools
session context (set_current_session_id -> HERMES_SESSION_ID env + ContextVar)
but never updates the separate logging session context. The [session_id] tag on
log lines comes from hermes_logging._session_context (set once per turn in
conversation_loop.py), so post-compaction log lines in the same turn carry the
STALE old id while the message/DB/gateway state carry the new one — breaking log
correlation exactly at the compaction boundary.

Call hermes_logging.set_session_context(agent.session_id) alongside the existing
set_current_session_id, guarded so a logging failure can't regress the routing
update. Logs-only; no runtime or caching impact.

Refs #34089
2026-06-07 22:30:02 -07:00
Teknium
8e223b36ed
fix(curator): protect load-bearing built-in skills from archival/consolidation (#41817)
The curator's idle-archival path (apply_automatic_transitions under
prune_builtins) could archive the bundled `plan` skill, killing the
/plan slash command silently — typing /plan then returned 'Unknown
command' with no signal that a skill had vanished. The archived skill's
hash stays in .bundled_manifest, so 'hermes update' wouldn't re-seed it.

Add PROTECTED_BUILTIN_SKILLS ({plan}) enforced at the master gate
is_curation_eligible() (covers archive_skill + the transition walk) and
in the candidate enumerator (so the LLM consolidation pass never sees
them). Immune to prune_builtins, pin state, and LLM judgment.
2026-06-07 22:23:29 -07:00
Teknium
777dc9da62
feat(acp): emit session provenance metadata for compression rotation (#41724)
Closes #33617. Adds additive _meta.hermes.sessionProvenance to ACP session
surfaces so clients can detect compression-driven internal session rotation
without parsing status text, guessing from token drops, or reading state.db.

Derived on demand from the existing compression chain (parent_session_id /
end_reason) — no new persisted state, no schema change, no ACP protocol change.
ACP session_id stays the stable client handle.

- acp_adapter/provenance.py: derive provenance from SessionDB
- server.py: attach _meta to new/load/resume responses; emit a
  session_info_update when the internal head rotates during a prompt
2026-06-07 22:22:21 -07:00
Martín Alcalá Rubí
132d6fe6d6 fix(volcengine): strip XML attribute fragments from tool_use.name (#33007)
VolcEngine's api/plan endpoint occasionally leaks raw XML attribute
fragments into tool_use.name when its protocol-translation layer
converts the model's native XML-style tool emission to Anthropic
Messages tool_use blocks, producing names like:

  terminal" parameter="command" string="true
  execute_code" parameter="code" string="true
  session_search" parameter="session_id" string="true

The corruption happens server-side at the provider, but it breaks
every tool call for affected users — no normalization rule in
repair_tool_call can rescue them, so each request runs through three
retries and then aborts as partial.

Add an early sanitizer in agent_runtime_helpers.repair_tool_call that
trims at the first ' " ', " ' ", '<', or '>' character (idx > 0
only) so the rest of the existing repair pipeline (lowercase /
snake_case / fuzzy match) can resolve the cleaned name normally.

Whitespace is deliberately NOT a separator — the legitimate
"write file" -> write_file repair path (covered by
test_space_to_underscore) must keep working.

Tests: 11 new regression cases in TestVolcEngineXmlPollution
covering all three observed polluted names, CamelCase + pollution
mix, single-quote variants, angle-bracket variants, clean-name
passthrough, and the whitespace-preservation guard. All 18 pre-
existing repair tests still pass (29 total in the file).
2026-06-07 22:22:01 -07:00
lsaether
9b631e4ae1 fix(acp): suppress cancel interrupt sentinel 2026-06-07 22:20:43 -07:00
Teknium
2789bf4e25 fix(auxiliary): route Codex Responses path through shared converter (#5709)
The auxiliary Codex adapter maintained its own chat->Responses conversion
loop that forwarded every non-system message's role verbatim into
Responses input[]. When flush_memories()/compression replayed session
history containing assistant tool_calls + role=tool results, those tool
messages leaked into the request and the Responses API rejected them with
HTTP 400: Invalid value: 'tool'.

Route _CodexCompletionsAdapter.create() through the same shared converter
the main agent transport uses (_chat_messages_to_responses_input), so tool
calls become function_call items and tool results become function_call_output
items with a valid call_id. Single conversion path means no future drift.

Also remove the now-dead _convert_content_for_responses() helper — its only
caller was the private conversion loop this change deletes.

Co-authored-by: ProgramCaiCai <techxacm@gmail.com>
2026-06-07 22:18:31 -07:00
teknium1
568e127612 refactor(cli): extract 25 more subcommand parsers into hermes_cli/subcommands/
Batch extraction of every remaining subcommand whose handler is top-level and
whose parser block is pure argparse: model, setup, postinstall, whatsapp, slack,
login, logout, auth, status, webhook, hooks, doctor, security, dump, debug,
backup, import, config, version, update, uninstall, dashboard, gui, logs,
prompt-size.

Each becomes hermes_cli/subcommands/<name>.py with build_<name>_parser() and an
injected handler (no main import). dashboard also injects cmd_dashboard_register
for its nested 'register' action.

Behavior-neutral: all 25 subcommands' --help output (and nested subaction help)
diff-verified byte-identical to pre-extraction. Two RawDescriptionHelpFormatter
epilogs (debug, logs) needed their multi-line string interiors preserved at
column 0 — caught by the --help diff, not compile.

main() 3297 -> 1798 LOC across this PR; add_parser calls in main.py 179 -> 89.

Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process
isolation; new test_subcommands_batch.py smoke-tests all 25 builders + the
dashboard two-handler case.
2026-06-07 22:18:14 -07:00
teknium1
4da45e8727 refactor(cli): extract profile + gateway/proxy parsers into hermes_cli/subcommands/
Follow-on to the cron extraction in the same Phase 2 PR. Same pattern:
per-group build_<name>_parser() functions with injected handlers, no main
import.

- subcommands/profile.py: build_profile_parser (190-line block out of main()).
- subcommands/gateway.py: build_gateway_parser (gateway + proxy, 238-line block;
  they shared one inline section). Imports argparse for SUPPRESS defaults.
- main(): two more inline blocks become single builder calls.

Behavior-neutral: 'profile [sub] --help' and 'gateway/proxy [sub] --help'
byte-identical to pre-extraction (diff-verified).

main() now 2723 LOC (was 3297 at Phase 2 start); add_parser calls in main.py
179 -> 141.

Validation: tests/hermes_cli/ 6476 passed / 0 failed under per-file process
isolation; new builder unit tests cover subactions, aliases, dispatch, flags.
2026-06-07 22:18:14 -07:00
teknium1
b2e6053243 refactor(cli): extract hermes cron parser into hermes_cli/subcommands/ (god-file Phase 2)
Phase 2 of the god-file decomposition plan. main()'s argparse tree is 179
inline add_parser calls in one 3,297-line function. This establishes the
hermes_cli/subcommands/ package and extracts the first group (cron) as the
proof-of-pattern:

- hermes_cli/subcommands/_shared.py: shared parser helpers (add_accept_hooks_flag),
  re-exported from main.py for backwards compat.
- hermes_cli/subcommands/cron.py: build_cron_parser(subparsers, cmd_cron=...).
  Handler injected so the module never imports main (cycle avoidance).
- main()'s ~155-line inline cron block becomes one build_cron_parser() call.

Behavior-neutral: 'hermes cron create --help' output is byte-identical to
origin/main. main() 3297 -> 3143 LOC.

Validation: tests/hermes_cli/ 6466 passed / 0 failed under per-file process
isolation; new test_subcommands_cron.py covers subactions, aliases, options,
no-agent tristate, injected dispatch, and --accept-hooks.
2026-06-07 22:18:14 -07:00
teknium1
54870847cb refactor(agent): extract run_conversation prologue into agent/turn_context.py
Phase 1 of the god-file decomposition plan. run_conversation's ~470-line
once-per-turn setup block (stdio guarding, retry-counter resets, user-message
sanitization, todo/nudge hydration, system-prompt restore-or-build,
crash-resilience persistence, preflight compression, the pre_llm_call hook, and
external-memory prefetch) is moved verbatim into build_turn_context(), which
returns a TurnContext dataclass the loop unpacks.

Behavior-neutral move-and-name refactor: the builder mutates `agent` exactly as
the inline code did; only the locals the loop reads back are returned.

- run_conversation: 4602 -> 4217 LOC (-385)
- agent/conversation_loop.py: 4965 -> ~4580 LOC
- new agent/turn_context.py: focused, dependency-injected, unit-tested in isolation

Tests: tests/run_agent/ 1570 passed / 0 failed under per-file process isolation.
Relocation follow-ups: 413_compression mocks now patch both module references;
nudge/on_turn_start source-inspection guards point at the extracted module.
2026-06-07 22:17:35 -07:00
Teknium
86c537d209
fix(memory): instruct in-turn consolidation + retry on overflow (#41755)
* fix(memory): make overflow errors instruct in-turn consolidation + retry

When bounded memory is full, the add/replace overflow errors now explicitly
tell the model to consolidate (merge/remove/shorten) and retry the write in
the same turn, matching the documented behavior. The replace-overflow path
now also echoes current_entries + usage for parity with add-overflow, so the
model has the same context to act on.

Closes #23378 (working-as-documented; this sharpens runtime to match docs).

* fix(memory): broaden overflow remediation hint beyond 'stale'

Say 'stale or less important' — entries don't have to be stale to be the
right ones to drop when making room.
2026-06-07 22:16:28 -07:00
teknium1
2a10da3a16 fix(gateway): keep /model + /reasoning overrides on topic recovery & compression splits
Session-scoped /model and /reasoning overrides were silently lost on
Telegram DM/forum topics and after compression session splits (#30479).

Root cause: _handle_message_with_agent rewrites source.thread_id via
_recover_telegram_topic_thread_id (lobby/stripped reply -> the user's
bound topic) before deriving the session key. The /model and /reasoning
handlers derived their override key from the raw inbound event.source,
skipping that recovery, so the override was stored under one key and the
next message turn read a different key.

Fix: add _normalize_source_for_session_key (applies the same recovery a
message turn does) and use it in both handlers before deriving the key.
session_id rotation on compression was never the cause — overrides are
keyed by the durable session_key; the split path preserves it.

Author: teknium1 <127238744+teknium1@users.noreply.github.com>
2026-06-07 22:10:32 -07:00
Hariharan Ayappane
b8469a81e3 fix(weixin): add rate-limit circuit breaker 2026-06-07 22:10:17 -07:00
Teknium
2e62862784
fix(telegram): use get_running_loop in polling-conflict retry reschedule (#41716)
The conflict-retry path called asyncio.get_event_loop() to reschedule
itself when a retry's start_polling raised. On Python 3.11+ (our floor)
that raises 'RuntimeError: There is no current event loop in thread
MainThread' when no loop is attached to the thread, which is what
happens when PTB dispatches this error callback. The retry never gets
scheduled, the adapter goes silent-but-alive, and gateway --replace
keeps spawning fresh instances that hit the same wall — the crash loop
reported in #19471 (worse under multi-profile, where two bots hold the
same conflict open).

We are inside a coroutine here, so asyncio.get_running_loop() is the
correct, guaranteed-valid replacement. Only get_event_loop() call in
any platform adapter, so no sibling sites.

Fixes #19471
2026-06-07 22:10:03 -07:00
Basil Al Shukaili
8513a6aec7 fix(compression): guard against cross-session stale _previous_summary contamination
When a cron or background session compacts, it sets _previous_summary for
iterative updates. If that session ends without /new or /reset (which calls
on_session_reset()), the stale summary survives on the ContextCompressor
instance. A subsequent live messaging session's compaction then injects it as
'PREVIOUS SUMMARY:' into the summarizer prompt — contaminating the live
session with unrelated content from the prior session.

Add an else guard in compress(): when no handoff summary is found in the
current messages but _previous_summary is non-empty, discard it so
_generate_summary() starts fresh instead of iteratively updating a stale
cross-session summary.

Fixes #38788
2026-06-07 22:09:45 -07:00
Teknium
5408013369
fix(gateway): isolate DM sessions on user_id when chat_id is absent (#41764)
build_session_key collapsed every DM that arrived without a chat_id into
one shared 'agent:main:<platform>:dm' key. A single cached AIAgent then
served multiple users' conversations, bleeding history across senders.

DMs now fall back to the sender's user_id_alt/user_id (mirroring the
group-path participant precedence and the telegram auth-path fallback)
before the bare per-platform sink. Telegram's normal event path always
sets chat_id, so this hardens the synthetic-source / non-standard-adapter
paths that don't.
2026-06-07 22:07:07 -07:00
Teknium
a77bc2c08d
fix(compression): disable compression on background-review fork to prevent cross-turn stale-parent fork (#41708)
The per-session compression lock prevents same-window concurrent forks but
not cross-turn ones: the background-review fork shares the parent's
session_id, so if it won a compression race its new child session was never
adopted by the gateway (the fork is single-lifecycle). The next foreground
turn then started from the stale parent and compressed it again, leaving the
same parent with two sibling children.

Set review_agent.compression_enabled = False so the fork never triggers
compression. Both trigger sites in conversation_loop.py gate on
compression_enabled before calling _compress_context, so the fork can never
rotate the shared parent. Review needs full context anyway — compressing
would degrade the memory/skill summary.

The per-session lock is kept as defense-in-depth for any future shared-session
path. Adds a regression test that fails without the flag and passes with it.

Closes #38727
2026-06-07 22:06:48 -07:00
Teknium
48ae8029aa
fix(delegate): resolve custom-endpoint subagent pools by endpoint identity (#41730)
Subagents delegated to a custom endpoint were misrouted when the parent
ran on a different custom endpoint. Both runtimes collapse to
provider="custom", so _resolve_child_credential_pool() treated them as
interchangeable and handed the child the parent's pool. Leasing from it
then overwrote the child's delegated base_url with the parent's endpoint
via _swap_credential() — the child sent the delegated model name to the
wrong endpoint.

Custom runtimes now resolve by endpoint identity (the custom:<name> pool
key derived from base_url). The parent pool is reused only when both
parent and child resolve to the same custom endpoint; unregistered raw
endpoints return None so the child keeps its fixed delegated credential.
Non-custom provider paths are unchanged.

Fixes #7833.
2026-06-07 22:05:14 -07:00
islam666
78e2101cd2 fix: reap zombie subprocesses in web_server action status and meet_bot cleanup
- web_server.py: after proc.poll() returns a non-None exit code, call
  proc.wait() to reap the child and move the entry from _ACTION_PROCS
  to _ACTION_RESULTS. Previously .poll() alone left <defunct> zombies.
- meet_bot.py: terminate and wait on the pcm_pump subprocess (paplay/
  ffmpeg) during the finally-block teardown. Previously leaked on every
  normal bot exit.
- tests: add test_action_status_reaps_completed_process and
  test_action_status_ignores_wait_failure covering both the happy path
  and the wait()-raises-OSError edge case.

Closes #38032
2026-06-07 21:50:57 -07:00
islam666
e53b74c394 fix(dist): stop USER_OWNED_EXCLUDE from filtering nested directories
The copytree ignore lambda in _copy_dist_payload applied USER_OWNED_EXCLUDE
recursively at every directory depth. This caused nested directories whose
names matched exclude entries (bin, logs, cache, etc.) to be silently dropped
during distribution install/update.

Fix: only apply USER_OWNED_EXCLUDE filtering at the root of the staged tree,
matching the two-tier pattern used by _clone_all_copytree_ignore and
_default_export_ignore in profiles.py.

Add 5 tests covering nested bin/logs/cache preservation and top-level
filtering still working.

Fixes #37954
2026-06-07 21:50:57 -07:00
islam666
09a5548628 fix(weixin): refresh typing ticket on expiry to prevent stuck indicator (#38085)
The WeChat iLink typing ticket has a 600-second TTL. When a long-running
session exceeds that window, the cached ticket evicts from TypingTicketCache.
Both send_typing and stop_typing silently returned early when the ticket was
None, meaning the TYPING_STOP=2 signal was never sent to iLink. The WeChat
client then showed the typing indicator indefinitely.

Fix: add _ensure_typing_ticket() that transparently refreshes the ticket
via getConfig when the cached one has expired or is missing. Both send_typing
and stop_typing now call this method instead of silently no-oping.

Fixes #38085
2026-06-07 21:50:57 -07:00
islam666
2e61de0638 fix(model_metadata): consult DEFAULT_CONTEXT_LENGTHS before 256K fallback on custom endpoints
Problem: get_model_context_length() had an early return at the end of the
custom-endpoint probe branch (step 3) that returned DEFAULT_FALLBACK_CONTEXT
(256K) without ever consulting the hardcoded DEFAULT_CONTEXT_LENGTHS catalog
(step 8). Models served through a custom/proxied gateway (e.g. corporate
Anthropic proxy) that didn't expose Ollama or local-server endpoints would
hit this path and get capped at 256K, even when the model name clearly
matched a known entry in the catalog (e.g. claude-opus-4-8 → 1M).

Changes:
- agent/model_metadata.py: Before returning DEFAULT_FALLBACK_CONTEXT at the
  end of the custom-endpoint branch, consult DEFAULT_CONTEXT_LENGTHS using
  the same longest-key-first fuzzy matching as step 8. Only fall through
  to 256K if no catalog entry matches.
- tests/agent/test_model_metadata.py: Updated existing test and added new
  test covering the custom-endpoint → catalog fallback behavior.

Fixes #38865
2026-06-07 21:50:57 -07:00
islam666
9513793ad7 fix(vision): proactive downgrade for providers rejecting list-type tool content (#41072)
Xiaomi MiMo (and potentially other providers) support multimodal user
messages but reject list-type tool message content with 400 'text is not
set'. Previously this was handled reactively — the API call would fail,
images would be stripped, and the request retried, losing visual info.

Fix: add supports_vision_tool_messages field to ProviderProfile (default
True). Xiaomi sets it to False. _tool_result_content_for_active_model
now checks this field proactively and returns a text summary instead of
list content, avoiding the round-trip failure entirely.
2026-06-07 21:50:57 -07:00
islam666
41f0714287 fix(vision): honor custom_providers per-model supports_vision (#41036)
_supports_vision_override() in image_routing.py checked model.supports_vision
and providers.<name>.models, but not the legacy list-style custom_providers
config. A custom provider entry like:

  custom_providers:
    - name: my-provider
      models:
        my-model:
          supports_vision: true

was ignored, causing image_input_mode=auto to route through the auxiliary
vision_analyze path instead of natively attaching images.

Fix: added a lookup step for custom_providers list entries, matching by
provider name (including 'custom:<name>' variants at runtime).
providers.<name>.models still takes precedence over custom_providers.

13 new tests covering: true/false override, custom: prefix matching,
no-match fallback, non-dict entries, empty lists, models key missing.
2026-06-07 21:50:57 -07:00
islam666
18c085b1a4 fix(gateway): normalize optional systemd directives in stale-check (#41119)
On older systemd versions that don't support RestartMaxDelaySec /
RestartSteps, the installed unit file has those directives silently
dropped. systemd_unit_is_current() did a strict text comparison, so
the unit was perpetually flagged as outdated.

Fix: _strip_optional_systemd_directives() removes RestartMaxDelaySec
and RestartSteps from both the installed and expected text before
comparison. Units that differ only by these optional directives are
now correctly considered current.
2026-06-07 21:50:57 -07:00
islam666
b18490b890 fix(compaction): prevent infinite loop when transcript fits in tail budget
When summary_target_ratio is large (e.g. 0.45) and the context_length is
moderate (e.g. 96000), the soft_ceiling (token_budget * 1.5) can exceed
the total transcript size.  _find_tail_cut_by_tokens walks the entire
transcript without breaking early, and the resulting compress window is
either empty (compress_start >= compress_end) or a single message whose
summary-of-one overhead saves ~0 tokens.

Both outcomes cause a no-op compression that does not increment
_ineffective_compression_count, so should_compress() returns True on
every subsequent turn and the loop repeats endlessly.

Fix (two layers):
1. _find_tail_cut_by_tokens: when the backward walk consumed the entire
   transcript without breaking (cut_idx <= head_end and accumulated <=
   soft_ceiling), re-walk with the raw (non-inflated) token budget to
   find a meaningful cut that gives the summarizer a useful middle window.
2. compress(): when compress_start >= compress_end, increment
   _ineffective_compression_count and log a warning so the existing
   anti-thrashing guard in should_compress() can break the loop.

Fixes #40803
2026-06-07 21:50:57 -07:00
Brian D. Evans
ab0a6270c3 fix(slack): align thread_ts check with is_thread_reply invariant (Copilot #15464)
Two findings from Copilot's review on #15464, both addressed:

1. ``event.get("thread_ts")`` truthy vs
   ``event_thread_ts != ts``: the new channel branch treated ANY
   truthy ``thread_ts`` as a real thread reply, but three lines below
   ``is_thread_reply`` is defined with the stricter
   ``event_thread_ts and event_thread_ts != ts`` invariant.  If Slack
   ever ships a payload where ``thread_ts == ts`` on a thread root,
   the stricter check would treat it as a top-level message for the
   ``is_thread_reply`` path but as a thread reply for session keying
   — divergent behaviour.  Aligned this branch to the same
   ``and event_thread_ts_raw != ts`` invariant.

2. ``test_top_level_reply_to_id_stays_none_when_shared`` docstring
   had the ternary logic backwards ("None != ts → reply_to_message_id
   IS set").  The code reads
   ``reply_to_message_id = thread_ts if thread_ts != ts else None`` —
   with ``thread_ts = None``, the condition is True so the expression
   evaluates to ``thread_ts`` itself (None), meaning the reply stays
   un-threaded.  The test asserted the correct end-state; only the
   explanatory docstring was wrong.  Rewrote the docstring to match
   the actual code flow, with the note that Copilot caught the
   reversal.

7/7 tests still pass.  No behaviour change for the existing
test_thread_reply_scopes_by_thread_even_when_shared case because
``event_thread_ts_raw = "1700000000.000000"`` and ``ts =
"1700000000.000005"`` are distinct — the new
``!= ts`` guard is a no-op there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-07 21:19:59 -07:00
Brian D. Evans
133e0271e2 fix(slack): scope top-level channel messages by channel-only when reply_in_thread=false (#15421)
Top-level Slack channel messages previously fell back to the message's
own ``ts`` as a synthetic ``thread_ts``:

    thread_ts = event.get("thread_ts") or ts  # ts fallback for channels

That value flows into ``build_source(thread_id=thread_ts)`` at
line 1247.  The gateway session store keys sessions by
``(platform, channel_id, thread_id)``, so every top-level channel
message ended up on a unique session.  Operators who set
``reply_in_thread: false`` in ``config.yaml`` expected all top-level
channel messages to share one session (the whole point of that flag)
— instead each one spawned a fresh conversation with no context
carry-over.

### Fix

Three explicit cases in the channel branch:

| event.thread_ts | reply_in_thread | thread_ts for session keying |
|---|---|---|
| non-null (real thread reply) | either | event.thread_ts |
| null (top-level) | true (default) | ts (legacy: own-thread sessions) |
| null (top-level) | false | **None** (shared channel session) |

The outbound-reply gate at line 1264 (``reply_to_message_id =
thread_ts if thread_ts != ts else None``) still works correctly in
all three cases without further changes: ``None != ts`` is True, so
shared-channel top-level messages don't get their reply threaded
either — matching the operator's ``reply_in_thread=false`` intent
end-to-end.

Genuine thread replies still scope per-thread under both modes so
multi-person threaded conversations can't collide with unrelated
channel chatter.

### Tests (7 new in ``tests/gateway/test_slack_channel_session_scope.py``)

All drive the real ``SlackAdapter._handle_slack_message`` code path
(not a re-implementation) via the standard pytest fixture pattern
used by ``tests/gateway/test_slack.py``.  Messages @mention the bot
so the mention gate doesn't drop them — the tests are specifically
about what happens once the handler decides to emit a ``MessageEvent``.

* ``TestChannelSessionScopeDefault`` (2 cases):
  - Explicit ``reply_in_thread: true`` keeps ``thread_id = ts``
    (legacy behaviour — regression guard)
  - Unset config behaves like ``reply_in_thread: true`` (pins the
    default)
* ``TestChannelSessionScopeShared`` (3 cases):
  - ``reply_in_thread: false`` + top-level → ``thread_id is None``
    (the #15421 bug 1 fix)
  - ``reply_to_message_id is None`` in the same case (no threaded
    outbound reply)
  - Genuine thread reply still scopes per-thread when shared mode is
    on — only TOP-LEVEL messages collapse to the channel session
* ``TestThreadReplyAlwaysScopesByThread`` (2 parametrised cases):
  - Thread replies get ``thread_id = event.thread_ts`` regardless of
    ``reply_in_thread`` — critical invariant for multi-thread
    channels; a regression here would leak per-thread context across
    threads

**Regression guard verified**: reverted the else-branch to the legacy
``thread_ts = event.get("thread_ts") or ts`` one-liner;
``test_top_level_maps_to_none_when_reply_in_thread_false`` correctly
failed (asserts ``thread_id is None`` but got ``"1700000000.000003"``).
Restored → 182 slack tests pass (175 existing + 7 new).

Scope: this fixes #15421 bug 1 only.  Bug 2 (sessions.json not
persisting across compression) lives elsewhere in the session
manager and is left for a separate diff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-07 21:19:59 -07:00
Shannon Sands
86e5efb0ae Preserve Telegram onboarding fallback errors 2026-06-07 19:48:09 -07:00
Shannon Sands
ba29010902 Use httpx for Telegram onboarding worker calls 2026-06-07 19:48:09 -07:00
brooklyn!
fa42ac094d
feat(desktop): Shift+click the status-bar zap to toggle YOLO globally (#41666)
The status-bar zap currently toggles per-session approval bypass (the same
scope as the TUI's Shift+Tab). This adds a global escape hatch: Shift+clicking
the zap flips the persistent approvals.mode in config.yaml between "off"
(bypass on) and "manual" (bypass off), affecting every session, the CLI, the
TUI, and cron — and it survives restarts.

- statusbar-controls: thread the click's shiftKey through onSelect via a new
  StatusbarSelectModifiers arg.
- yolo-session: add setGlobalYolo() that calls config.set with scope="global".
- use-statusbar-items: branch toggleYolo on modifiers.shiftKey; plain click
  stays per-session, Shift+click goes global.
- tui_gateway config.set "yolo" key: add scope="global" that reads/writes
  approvals.mode through the gateway's own (mtime-cached) config view, honors
  an explicit value, and re-emits session.info to every live session so each
  window's zap reflects the flip immediately.
- i18n: tooltip copy in en/ja/zh/zh-hant notes Shift+click toggles globally.

Tests: two new tui_gateway tests cover the global toggle and explicit-value
paths; existing session/process-scope yolo tests still pass.
2026-06-07 20:57:08 -05:00
Teknium
30c7913617
fix(api_server): report hermes version on /health and /health/detailed (#40620)
Salvaged from #40479; re-verified on main, tightened, tested.

Co-authored-by: tfournet <tfournet@users.noreply.github.com>
2026-06-07 18:38:54 -07:00
Teknium
b97cd81c78
refactor(insights): drop dead pricing/duration wrappers, call usage_pricing directly (#40618)
Salvaged from #40527; re-verified on main, tightened, tested.

Co-authored-by: HeLLGURD <HeLLGURD@users.noreply.github.com>
2026-06-07 18:33:20 -07:00
Teknium
2aa316ec9c
docs(windows): fix Get-Command PATH guidance to venv\Scripts\hermes.exe (#40613)
Closes #40464.

Salvaged from #40488; re-verified on main, tightened, tested.

Co-authored-by: gauravsaxena1997 <gauravsaxena1997@users.noreply.github.com>
2026-06-07 18:28:23 -07:00
Teknium
6bdc4c0231
test: skip curses tests on Windows where _curses is unavailable (#40611)
Salvaged from #40447; re-verified on main, tightened, tested.

Co-authored-by: Ganesh0690 <Ganesh0690@users.noreply.github.com>
2026-06-07 18:21:03 -07:00
Teknium
69a293b419 hardening(todo): bound TodoStore item content length and count
The todo list is re-injected into the model's context after every
context-compression event (TodoStore.format_for_injection), so an oversized
todo item or an unbounded number of items defeats the compression it is meant
to ride through. TodoStore.write/_validate previously enforced no size or count
bounds, so a single 50KB item produced a ~50KB re-injection block on every
subsequent turn.

Add two caps:
- MAX_TODO_CONTENT_CHARS (4000): per-item content is truncated with a marker.
  Routed through a shared _cap_content() so the merge-update path (which writes
  content directly, bypassing _validate) is capped too.
- MAX_TODO_ITEMS (256): total list length is bounded, keeping the
  highest-priority head (list order is priority).

Both caps are generous relative to real plans — a todo item is a short task
description and active lists are a handful of items.

NOT a security fix. Raised externally via GHSA-5g4g-6jrg-mw3g, which framed a
caller-supplied conversation_history on the authenticated API server replaying
into _hydrate_todo_store as a DoS. That path is authenticated (the API server
refuses to start without API_SERVER_KEY) and self-scoped (the caller supplies
their own entire history and can only inflate their own response chain — forged
role=tool entries are never persisted to the session DB), so it is out of scope
as a vulnerability under SECURITY.md 3.2. These bounds are footgun containment
that also applies to the trusted agent path, where the model itself authors the
todos. Credit to the reporter for the observation.

Co-authored-by: YLChen-007 <30854794+YLChen-007@users.noreply.github.com>
2026-06-07 18:06:27 -07:00
Gilad Bauman
ae82eed2b1 fix(gateway): use OGG for Telegram auto TTS 2026-06-07 18:05:58 -07:00
Teknium
cb83149dc6
fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (#40607)
Salvaged from #40421; re-verified on main, tightened, tested.

Co-authored-by: maxmilian <maxmilian@users.noreply.github.com>
2026-06-07 17:49:38 -07:00
Teknium
09d66037f8
fix(hindsight): send only new-turn delta on append retains instead of whole session (#40605)
Closes #40503.

Salvaged from #40519; re-verified on main, tightened, tested.

Co-authored-by: skylarbpayne <skylarbpayne@users.noreply.github.com>
2026-06-07 17:41:10 -07:00
kshitijk4poor
7df81d0557 fix(web): make _has_env config-aware so SEARXNG_URL auto-detect honors Hermes config
Follow-up to #34306. The provider fix made SearXNG *usable* with a
config-only SEARXNG_URL, but tools/web_tools._has_env still read raw
os.getenv, so the backend auto-detect cascade and check_web_api_key
remained blind to it — SearXNG worked when explicitly selected but was
never auto-selected. Route _has_env (and the SearXNG diagnostic print)
through a config-aware _env_value helper mirroring the provider's
_searxng_url(). Fixing the shared helper covers every provider key in
one place. Adds regression tests for config-only auto-detect and
check_web_api_key. See #34290.
2026-06-08 01:12:32 +05:30
kshitij
0c0fbf763b
Merge pull request #41430 from helix4u/fix-url-tools-unicode-normalization
fix(tools): percent-encode non-ascii URL components
2026-06-07 12:39:30 -07:00
helix4u
333f01bc7f fix(tools): percent-encode non-ascii URL components 2026-06-07 11:42:26 -06:00
teknium1
16786f3bb3 feat(desktop+gateway): remote media relay — attach images/PDFs and display gateway images over the network
Desktop connected to a remote gateway can now attach images and PDFs and
display agent-written images. Previously the desktop passed a LOCAL file path
to image.attach; on a remote gateway that path doesn't exist, so the image was
silently dropped ("skipped unreadable path") and the vision model never saw it.
The reverse direction was also broken — images the agent wrote on the gateway
rendered as dead links in the remote client.

Gateway (tui_gateway/server.py):
- image.attach_bytes: base64 byte upload written into the gateway's own images
  dir and queued via the existing native-image-attach pipeline. Magic-byte
  extension sniffing, data-URL prefix + whitespace tolerance, 25 MB cap,
  structured error codes. Accepts content_base64/filename (canonical) and
  data/ext (older-desktop aliases).
- pdf.attach: renders each page to PNG via pdftoppm (poppler-utils) at 150 DPI
  and queues the pages as images; 50 MB / 25-page caps. Accepts host path or
  base64 upload.
- Shared helpers (_decode_attach_base64, _sniff_image_ext, _queue_attached_image)
  so the two methods and the existing image.attach don't duplicate logic.

Gateway (hermes_cli/web_server.py):
- GET /api/media: returns a gateway-local image as a base64 data URL so remote
  clients can display it. Auth-gated like every /api route, extension
  allowlist + size cap, AND confined to the gateway's own media roots
  (images/screenshots/cache, resolved symlink-safe) so an authed caller can't
  read image-extension files anywhere on disk.

Desktop (apps/desktop):
- syncImageAttachmentsForSubmit uploads bytes via image.attach_bytes when the
  connection mode is 'remote'; the local fast path is unchanged.
- media.ts gains isRemoteGateway() + gatewayMediaDataUrl(); directive-text and
  markdown-text fetch images over /api/media in remote mode.

Consolidates the competing remote-media PRs (#38876, #40317, #21908, #39437)
into one coherent implementation, taking the strongest parts of each and adding
shared-helper cleanup plus the /api/media root-confinement hardening on top.
The per-profile gateway switching from #38876 is intentionally left out as a
separable feature. TUI file uploads (#40492) remain a separate surface.

Tested: 11 new tui_gateway tests + 5 /api/media endpoint tests + desktop
media.remote unit tests; full tui_gateway + web_server suites green (472
passed); tsc -b clean; E2E verified the full attach→disk→queue and
gateway-path→data-URL display round-trip plus the out-of-root security block.

Co-authored-by: Max Mitcham <maxmitcham@mac.home>
Co-authored-by: Justlrnal4 <Justlrnal4@users.noreply.github.com>
Co-authored-by: Chris Cook <ccook@nvms.com>
Co-authored-by: Thomas Paquette <thomas.paquette@gmail.com>
2026-06-07 10:05:53 -07:00
Teknium
0c48b7165d hardening(api-server): scan cron prompts on REST create/update for parity with the agent tool
The agent-facing cronjob tool scans the user prompt with _scan_cron_prompt()
before creating/updating a job (tools/cronjob_tools.py); the REST cron
endpoints (POST /api/jobs, PATCH /api/jobs/{id}) validated length but not
content. This adds the same scan to both handlers so an exfiltration/injection
prompt is rejected the same way regardless of which surface created the job.

NOT a security boundary, defense-in-depth / parity only: the REST cron
endpoints are authenticated (every handler runs _check_auth, and connect()
refuses to start without API_SERVER_KEY), and _scan_cron_prompt is a documented
in-process heuristic, not a containment boundary (SECURITY.md 3.2).

Raised externally via GHSA-fr3q-rjg3-x6mf (DNS-rebinding pre-auth RCE). The
report's load-bearing 'no auth by default' premise was already closed three
weeks after it was filed by the API_SERVER_KEY-required guard (commit
1a9ef8314); this lands the create/update prompt-validation parity the report
also pointed at. Scanner imported defensively so a missing scanner cannot
disable the cron REST API.
2026-06-07 10:04:57 -07:00
Teknium
af08c43f3e
fix: skip MCP preflight content-type probe on reconnect when already ready (#40604)
Closes #40366.

Salvaged from #40548; re-verified on main, tightened, tested.

Co-authored-by: mohamedorigami-jpg <mohamedorigami-jpg@users.noreply.github.com>
2026-06-07 09:51:11 -07:00
teknium1
76f01780f0 fix(kanban): sweep deferred scratch parent on non-scratch child completion + tests
Follow-up on the deferred-cleanup salvage (#33774): _cleanup_workspace
returned early for a non-scratch ('dir'/'worktree') task and never ran the
parent sweep, so a scratch parent waiting on a 'dir' child would leak its
deferred workspace forever. Run the parent sweep before the early return.

Adds regression tests: deferred-while-child-active, swept-after-last-child,
and dir-child-unblocks-scratch-parent.
2026-06-07 09:50:44 -07:00