Pass the user's configured api_key through local-server detection and
context-length probes (detect_local_server_type, _query_local_context_length,
query_ollama_num_ctx) and use LM Studio's native /api/v1/models endpoint in
fetch_endpoint_model_metadata when a loaded instance is present — so the
probed context length is the actual runtime value the user loaded the model
at, not just the model's theoretical max.
Helps local-LLM users whose auto-detected context length was wrong, causing
compression failures and context-overrun crashes.
When /steer is sent during an API call (model thinking), the steer text
sits in _pending_steer until after the next tool batch — which may never
come if the model returns a final response. In that case the steer is
only delivered as a post-run follow-up, defeating the purpose.
Add a pre-API-call drain at the top of the main loop: before building
api_messages, check _pending_steer and inject into the last tool result
in the messages list. This ensures steers sent during model thinking are
visible on the very next API call.
If no tool result exists yet (first iteration), the steer is restashed
for the post-tool drain to pick up — injecting into a user message would
break role alternation.
Three new tests cover the pre-API-call drain: injection into last tool
result, restash when no tool message exists, and backward scan past
non-tool messages.
- Fix duplicate 'timezone' import in e2e conftest
- Fix test_text_before_command_not_detected asserting send() is awaited
when no agent is present in mock setup (text messages don't produce
command output)
Cherry-picked from PR #13159 by @cdanis.
Adds native media attachment delivery to Signal via signal-cli JSON-RPC
attachments param. Signal messages with media now follow the same
early-return pattern as Telegram/Discord/Matrix — attachments are sent
only with the last chunk to avoid duplicates.
Follow-up fixes on top of the original PR:
- Moved Signal into its own early-return block above the restriction
check (matches Telegram/Discord/Matrix pattern)
- Fixed media_files being sent on every chunk in the generic loop
- Restored restriction/warning guards to simple form (Signal exits early)
- Fixed non-hermetic test writing to /tmp instead of tmp_path
Kimi's gateway selects the correct temperature server-side based on the
active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any
temperature value — even the previously "correct" one — conflicts with
gateway-managed defaults.
Replaces the old approach of forcing specific temperature values (0.6
for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel
that tells all call sites to strip the temperature key from API kwargs
entirely.
Changes:
- agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model()
prefix check (covers all kimi-* models), _fixed_temperature_for_model()
returns sentinel for kimi models. _build_call_kwargs() strips temp.
- run_agent.py: _build_api_kwargs, flush_memories, and summary generation
paths all handle the sentinel by popping/omitting temperature.
- trajectory_compressor.py: _effective_temperature_for_model returns None
for kimi (sentinel mapped), direct client calls use kwargs dict to
conditionally include temperature.
- mini_swe_runner.py: same sentinel handling via wrapper function.
- 6 test files updated: all 'forces temperature X' assertions replaced
with 'temperature not in kwargs' assertions.
Net: -76 lines (171 added, 247 removed).
Inspired by PR #13137 (@kshitijk4poor).
Add dm_policy and group_policy to the WhatsApp adapter, bringing parity
with WeCom/Weixin/QQ. Allows independent control of DM and group access:
disable DMs entirely, allowlist specific senders/groups, or keep open.
- dm_policy: open (default) | allowlist | disabled
- group_policy: open (default) | allowlist | disabled
- Config bridging for YAML → env vars
- 22 tests covering all policy combinations
Backward compatible — defaults preserve existing behavior.
Cherry-picked from PR #11597 by @MassiveMassimo.
Dropped the run.py group auth bypass (would have skipped user auth
for ALL platforms, not just WhatsApp).
Replaces the serial for-loop in tick() with ThreadPoolExecutor so all
jobs due in a single tick run concurrently. A slow job no longer blocks
others from executing, fixing silent job skipping (issue #9086).
Thread safety:
- Session/delivery env vars migrated from os.environ to ContextVars
(gateway/session_context.py) so parallel jobs can't clobber each
other's delivery targets. Each thread gets its own copied context.
- jobs.json read-modify-write cycles (advance_next_run, mark_job_run)
protected by threading.Lock to prevent concurrent save clobber.
- send_message_tool reads delivery vars via get_session_env() for
ContextVar-aware resolution with os.environ fallback.
Configuration:
- cron.max_parallel_jobs in config.yaml (null = unbounded, 1 = serial)
- HERMES_CRON_MAX_PARALLEL env var override
Based on PR #9169 by @VenomMoth1.
Fixes#9086
* feat(security): URL query param + userinfo + form body redaction
Port from nearai/ironclaw#2529.
Hermes already has broad value-shape coverage in agent/redact.py
(30+ vendor prefixes, JWTs, DB connstrs, etc.) but missed three
key-name-based patterns that catch opaque tokens without recognizable
prefixes:
1. URL query params - OAuth callback codes (?code=...),
access_token, refresh_token, signature, etc. These are opaque and
won't match any prefix regex. Now redacted by parameter NAME.
2. URL userinfo (https://user:pass@host) - for non-DB schemes. DB
schemes were already handled by _DB_CONNSTR_RE.
3. Form-urlencoded body (k=v pairs joined by ampersands) -
conservative, only triggers on clean pure-form inputs with no
other text.
Sensitive key allowlist matches ironclaw's (exact case-insensitive,
NOT substring - so token_count and session_id pass through).
Tests: +20 new test cases across 3 test classes. All 75 redact tests
pass; gateway/test_pii_redaction and tools/test_browser_secret_exfil
also green.
Known pre-existing limitation: _ENV_ASSIGN_RE greedy match swallows
whole all-caps ENV-style names + trailing text when followed by
another assignment. Left untouched here (out of scope); URL query
redaction handles the lowercase case.
* feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal
Update model catalogs for OpenRouter (fallback snapshot), Nous Portal,
and NVIDIA NIM to reference moonshotai/kimi-k2.6. Add kimi-k2.6 to
the fixed-temperature frozenset in auxiliary_client.py so the 0.6
contract is enforced on aggregator routings.
Native Moonshot provider lists (kimi-coding, kimi-coding-cn, moonshot,
opencode-zen, opencode-go) are unchanged — those use Moonshot's own
model IDs which are unaffected.
Cherry-picked from PR #2545 by @Mibayy.
The setup wizard could leave stt.model: "whisper-1" in config.yaml.
When using the local faster-whisper provider, this crashed with
"Invalid model size 'whisper-1'". Voice messages were silently ignored.
_normalize_local_model() now detects cloud-only names (whisper-1,
gpt-4o-transcribe, etc.) and maps them to the default local model
with a warning. Valid local sizes (tiny, base, small, medium, large-v3)
pass through unchanged.
- Renamed _normalize_local_command_model -> _normalize_local_model
(backward-compat wrapper preserved)
- 6 new tests including integration test
- Added lowercase AUTHOR_MAP alias for @Mibayy
Closes#2544
Prefer session_store origin over _parse_session_key() for shutdown
notifications. Fixes misrouting when chat identifiers contain colons
(e.g. Matrix room IDs like !room123:example.org).
Falls back to session-key parsing when no persisted origin exists.
Co-authored-by: Ruzzgar <ruzzgarcn@gmail.com>
Ref: #12766
Follow-up for PR #12252 salvage:
- Extract 75-line inline repair block to _repair_tool_call_arguments()
module-level helper for testability and readability
- Remove redundant 'import re as _re' (re already imported at line 33)
- Bound the while-True excess-delimiter removal loop to 50 iterations
- Add 17 tests covering all 6 repair stages
- Add sirEven to AUTHOR_MAP in release.py
Cherry-picked from PR #12481 by @Sanjays2402.
Reasoning models (GLM-5.1, QwQ, DeepSeek R1) inflate completion_tokens
with internal thinking tokens. The compression trigger summed
prompt_tokens + completion_tokens, causing premature compression at ~42%
actual context usage instead of the configured 50% threshold.
Now uses only prompt_tokens — completion tokens don't consume context
window space for the next API call.
- 3 new regression tests
- Added AUTHOR_MAP entry for @Sanjays2402
Closes#12026
The opt-in-by-default change (70111eea) requires plugins to be listed
in plugins.enabled. The cherry-picked test fixtures didn't write this
config, so two tests failed on current main.
- discover plugin commands before building Telegram command menus
- make plugin command and context engine accessors lazy-load plugins
- add regression coverage for Telegram menu and plugin lookup paths
Replaces global id +/- 1 context lookup with CTE-based same-session
neighbor queries. When multiple sessions write concurrently, id adjacency
does not imply session adjacency — the old query missed real neighbors.
Co-authored-by: Junass1 <ysfalweshcan@gmail.com>
Custom Claude proxies fronted by Cloudflare with Browser Integrity Check
enabled (e.g. `packyapi.com`) reject requests with the default
`Python-urllib/*` signature, returning HTTP 403 "error code: 1010".
`probe_api_models` swallowed that in its blanket `except Exception:
continue`, so `validate_requested_model` returned the misleading
"Could not reach the <provider> API to validate `<model>`" error even
though the endpoint is reachable and lists the requested model.
Advertise the probe request as `hermes-cli/<version>` so Cloudflare
treats it as a first-party client. This mirrors the pattern already used
by `agent/gemini_native_adapter.py` and `agent/anthropic_adapter.py`,
which set a descriptive UA for the same reason.
Reproduction (pre-fix):
python3 -c "
import urllib.request
req = urllib.request.Request(
'https://www.packyapi.com/v1/models',
headers={'Authorization': 'Bearer sk-...'})
urllib.request.urlopen(req).read()
"
urllib.error.HTTPError: HTTP Error 403: Forbidden
(body: b'error code: 1010')
Any non-urllib UA (Mozilla, curl, reqwest) returns 200 with the
OpenAI-compatible models listing.
Tested on macOS (Python 3.11). No cross-platform concerns — the change
is a single header addition to an existing `urllib.request.Request`.
Cherry-picked from PR #10005 by @houziershi.
Discarded prompts (has_any_reasoning=False) were skipped by `continue`
before being added to completed_in_batch. On --resume they were retried
forever. Now they are added to completed_in_batch before the continue.
- Added AUTHOR_MAP entry for @houziershi
Closes#9950
Cherry-picked from PR #9359 by @luyao618.
- Accept camelCase aliases (apiKey, baseUrl, apiMode, keyEnv, defaultModel,
contextLength, rateLimitDelay) with auto-mapping to snake_case + warning
- Validate URL field values with urlparse (scheme + netloc check) — reject
non-URL strings like 'openai-reverse-proxy' that were silently accepted
- Warn on unknown keys in provider config entries
- Re-order URL field priority: base_url > url > api (was api > url > base_url)
- 12 new tests covering all scenarios
Closes#9332
Plugins now require explicit consent to load. Discovery still finds every
plugin — user-installed, bundled, and pip — so they all show up in
`hermes plugins` and `/plugins`, but the loader only instantiates
plugins whose name appears in `plugins.enabled` in config.yaml. This
removes the previous ambient-execution risk where a newly-installed or
bundled plugin could register hooks, tools, and commands on first run
without the user opting in.
The three-state model is now explicit:
enabled — in plugins.enabled, loads on next session
disabled — in plugins.disabled, never loads (wins over enabled)
not enabled — discovered but never opted in (default for new installs)
`hermes plugins install <repo>` prompts "Enable 'name' now? [y/N]"
(defaults to no). New `--enable` / `--no-enable` flags skip the prompt
for scripted installs. `hermes plugins enable/disable` manage both lists
so a disabled plugin stays explicitly off even if something later adds
it to enabled.
Config migration (schema v20 → v21): existing user plugins already
installed under ~/.hermes/plugins/ (minus anything in plugins.disabled)
are auto-grandfathered into plugins.enabled so upgrades don't silently
break working setups. Bundled plugins are NOT grandfathered — even
existing users have to opt in explicitly.
Also: HERMES_DISABLE_BUNDLED_PLUGINS env var removed (redundant with
opt-in default), cmd_list now shows bundled + user plugins together with
their three-state status, interactive UI tags bundled entries
[bundled], docs updated across plugins.md and built-in-plugins.md.
Validation: 442 plugin/config tests pass. E2E: fresh install discovers
disk-cleanup but does not load it; `hermes plugins enable disk-cleanup`
activates hooks; migration grandfathers existing user plugins correctly
while leaving bundled plugins off.
The original name was cute but non-obvious; disk-cleanup says what it
does. Plugin directory, script, state path, log lines, slash command,
and test module all renamed. No user-visible state exists yet, so no
migration path is needed.
New website page "Built-in Plugins" documents the <repo>/plugins/<name>/
source, how discovery interacts with user/project plugins, the
HERMES_DISABLE_BUNDLED_PLUGINS escape hatch, disk-cleanup's hook
behaviour and deletion rules, and guidance on when a plugin belongs
bundled vs. user-installable. Added to the Features → Core sidebar next
to the main Plugins page, with a cross-reference from plugins.md.
Rewires @LVT382009's disk-guardian (PR #12212) from a skill-plus-script
into a plugin that runs entirely via hooks — no agent compliance needed.
- post_tool_call hook auto-tracks files created by write_file / terminal
/ patch when they match test_/tmp_/*.test.* patterns under HERMES_HOME
- on_session_end hook runs cmd_quick cleanup when test files were
auto-tracked during the turn; stays quiet otherwise
- /disk-guardian slash command keeps status / dry-run / quick / deep /
track / forget for manual use
- Deterministic cleanup rules, path safety, atomic writes, and audit
logging preserved from the original contribution
- Protect well-known top-level state dirs (logs/, memories/, sessions/,
cron/, cache/, etc.) from empty-dir removal so fresh installs don't
get gutted on first session end
The plugin system gains a bundled-plugin discovery path (<repo>/plugins/
<name>/) alongside user/project/entry-point sources. Memory and
context_engine subdirs are skipped — they keep their own discovery
paths. HERMES_DISABLE_BUNDLED_PLUGINS=1 suppresses the scan; the test
conftest sets it by default so existing plugin tests stay clean.
Co-authored-by: LVT382009 <levantam.98.2324@gmail.com>
OpenAI-compatible clients (Open WebUI, LobeChat, etc.) can now send vision
requests to the API server. Both endpoints accept the canonical OpenAI
multimodal shape:
Chat Completions: {type: text|image_url, image_url: {url, detail?}}
Responses: {type: input_text|input_image, image_url: <str>, detail?}
The server validates and converts both into a single internal shape that the
existing agent pipeline already handles (Anthropic adapter converts,
OpenAI-wire providers pass through). Remote http(s) URLs and data:image/*
URLs are supported.
Uploaded files (file, input_file, file_id) and non-image data: URLs are
rejected with 400 unsupported_content_type.
Changes:
- gateway/platforms/api_server.py
- _normalize_multimodal_content(): validates + normalizes both Chat and
Responses content shapes. Returns a plain string for text-only content
(preserves prompt-cache behavior on existing callers) or a canonical
[{type:text|image_url,...}] list when images are present.
- _content_has_visible_payload(): replaces the bare truthy check so a
user turn with only an image no longer rejects as 'No user message'.
- _handle_chat_completions and _handle_responses both call the new helper
for user/assistant content; system messages continue to flatten to text.
- Codex conversation_history, input[], and inline history paths all share
the same validator. No duplicated normalizers.
- run_agent.py
- _summarize_user_message_for_log(): produces a short string summary
('[1 image] describe this') from list content for logging, spinner
previews, and trajectory writes. Fixes AttributeError when list
user_message hit user_message[:80] + '...' / .replace().
- _chat_content_to_responses_parts(): module-level helper that converts
chat-style multimodal content to Responses 'input_text'/'input_image'
parts. Used in _chat_messages_to_responses_input for Codex routing.
- _preflight_codex_input_items() now validates and passes through list
content parts for user/assistant messages instead of stringifying.
- tests/gateway/test_api_server_multimodal.py (new, 38 tests)
- Unit coverage for _normalize_multimodal_content, including both part
formats, data URL gating, and all reject paths.
- Real aiohttp HTTP integration on /v1/chat/completions and /v1/responses
verifying multimodal payloads reach _run_agent intact.
- 400 coverage for file / input_file / non-image data URL.
- tests/run_agent/test_run_agent_multimodal_prologue.py (new)
- Regression coverage for the prologue no-crash contract.
- _chat_content_to_responses_parts round-trip coverage.
- website/docs/user-guide/features/api-server.md
- Inline image examples for both endpoints.
- Updated Limitations: files still unsupported, images now supported.
Validated live against openrouter/anthropic/claude-opus-4.6:
POST /v1/chat/completions → 200, vision-accurate description
POST /v1/responses → 200, same image, clean output_text
POST /v1/chat/completions [file] → 400 unsupported_content_type
POST /v1/responses [input_file] → 400 unsupported_content_type
POST /v1/responses [non-image data URL] → 400 unsupported_content_type
Closes#5621, #8253, #4046, #6632.
Co-authored-by: Paul Bergeron <paul@gamma.app>
Co-authored-by: zhangxicen <zhangxicen@example.com>
Co-authored-by: Manuel Schipper <manuelschipper@users.noreply.github.com>
Co-authored-by: pradeep7127 <pradeep7127@users.noreply.github.com>
Closes#8933 more fully, extending the per-tool transform_terminal_output
hook from #12929 to a generic seam that fires after every tool dispatch.
Plugins can rewrite any tool's result string (normalize formats, redact
fields, summarize verbose output) without wrapping individual tools.
Changes
- hermes_cli/plugins.py: add "transform_tool_result" to VALID_HOOKS
- model_tools.py: invoke the hook in handle_function_call after
post_tool_call (which remains observational); first valid str return
replaces the result; fail-open
- tests/test_transform_tool_result_hook.py: 9 new tests covering no-op,
None return, non-string return, first-match wins, kwargs, hook
exception fallback, post_tool_call observation invariant, ordering
vs post_tool_call, and an end-to-end real-plugin integration
- tests/hermes_cli/test_plugins.py: assert new hook in VALID_HOOKS
- tests/test_model_tools.py: extend the hook-call-sequence assertion
to include the new hook
Design
- transform_tool_result runs AFTER post_tool_call so observers always
see the original (untransformed) result. This keeps post_tool_call's
observational contract.
- transform_terminal_output (from #12929) still runs earlier, inside
terminal_tool, so plugins can canonicalize BEFORE the 50k truncation
drops middle content. Both hooks coexist; they target different layers.
SessionStore.prune_old_entries was calling
self._has_active_processes_fn(entry.session_id) but the callback wired
up in gateway/run.py is process_registry.has_active_for_session, which
compares against session_key, not session_id. Every other caller in
session.py (_is_session_expired, _should_reset) already passes
session_key, so prune was the only outlier — and because session_id and
session_key live in different namespaces, the guard never fired.
Result in production: sessions with live background processes (queued
cron output, detached agents, long-running Bash) were pruned out of
_entries despite the docstring promising they'd be preserved. When the
process finished and tried to deliver output, the session_key to
session_id mapping was gone and the work was effectively orphaned.
Also update the existing test_prune_skips_entries_with_active_processes,
which was checking the wrong interface (its mock callback took session_id
so it agreed with the buggy implementation). The test now uses a
session_key-based mock, matching the production callback's real contract,
and a new regression guard test pins the behaviour.
Swallowed exceptions inside the prune loop now log at debug level instead
of silently disappearing.
After a conversation gets compressed, run_agent's _compress_context ends
the parent session and creates a continuation child with the same logical
conversation. Every list affordance in the codebase (list_sessions_rich
with its default include_children=False, plus the CLI/TUI/gateway/ACP
surfaces on top of it) hid those children, and resume-by-ID on the old
root landed on a dead parent with no messages.
Fix: lineage-aware projection on the read path.
- hermes_state.py::get_compression_tip(session_id) — walk the chain
forward using parent.end_reason='compression' AND
child.started_at >= parent.ended_at. The timing guard separates
compression continuations from delegate subagents (which were created
while the parent was still live) without needing a schema migration.
- hermes_state.py::list_sessions_rich — new project_compression_tips
flag (default True). For each compressed root in the result, replace
surfaced fields (id, ended_at, end_reason, message_count,
tool_call_count, title, last_active, preview, model, system_prompt)
with the tip's values. Preserve the root's started_at so chronological
ordering stays stable. Projected rows carry _lineage_root_id for
downstream consumers. Pass False to get raw roots (admin/debug).
- hermes_cli/main.py::_resolve_session_by_name_or_id — project forward
after ID/title resolution, so users who remember an old root ID (from
notes, or from exit summaries produced before the sibling Bug 1 fix)
land on the live tip.
All downstream callers of list_sessions_rich benefit automatically:
- cli.py _list_recent_sessions (/resume, show_history affordance)
- hermes_cli/main.py sessions list / sessions browse
- tui_gateway session.list picker
- gateway/run.py /resume titled session listing
- tools/session_search_tool.py
- acp_adapter/session.py
Tests: 7 new in TestCompressionChainProjection covering full-chain walks,
delegate-child exclusion, tip surfacing with lineage tracking, raw-root
mode, chronological ordering, and broken-chain graceful fallback.
Verified live: ran a real _compress_context on a live Gemini-backed
session, confirmed the DB split, then verified
- db.list_sessions_rich surfaces tip with _lineage_root_id set
- hermes sessions list shows the tip, not the ended parent
- _resolve_session_by_name_or_id(old_root_id) -> tip_id
- _resolve_last_session -> tip_id
Addresses #10373.
On macOS, Unix domain socket paths are capped at 104 bytes (sun_path).
SSH appends a 16-byte random suffix to the ControlPath when operating
in ControlMaster mode. With an IPv6 host embedded literally in the
filename and a deeply-nested macOS $TMPDIR like
/var/folders/XX/YYYYYYYYYYYY/T/, the full path reliably exceeds the
limit — every terminal/file-op tool call then fails immediately with
``unix_listener: path "…" too long for Unix domain socket``.
Swap the ``user@host:port.sock`` filename for a sha256-derived 16-char
hex digest. The digest is deterministic for a given (user, host, port)
triple, so ControlMaster reuse across reconnects is preserved, and the
full path fits comfortably under the limit even after SSH's random
suffix. Collision space is 2^64 — effectively unreachable for the
handful of concurrent connections any single Hermes process holds.
Regression tests cover: path length under realistic macOS $TMPDIR with
the IPv6 host from the issue report, determinism for reconnects, and
distinctness across different (user, host, port) triples.
Closes#11840
Four parametrized cases that pin down the running-agent guard behavior:
/yolo and /verbose dispatch mid-run; /fast and /reasoning get the
"can't run mid-turn" catch-all. Prevents the allowlist from silently
drifting in either direction.
Follow-up to #12704. The SignalAdapter can resolve +E164 numbers to
UUIDs via listContacts, but _parse_target_ref() in the send_message
tool rejected '+' as non-digit and fell through to channel-name
resolution — which fails for contacts without a prior session entry.
Adds an E.164 branch in _parse_target_ref for phone-based platforms
(signal, sms, whatsapp) that preserves the leading '+' so downstream
adapters keep the format they expect. Non-phone platforms are
unaffected.
Reported by @qdrop17 on Discord after pulling #12704.
Adds regression tests for list-typed, int-typed, and None-typed message
fields on top of the dict-typed coverage from #11496. Guards against
other provider quirks beyond the original Pydantic validation case.
Credit to @elmatadorgh (#11264) for the broader type coverage idea.
When API providers return Pydantic-style validation errors where
body['message'] or body['error']['message'] is a dict (e.g.
{"detail": [...]}), the error classifier was crashing with
AttributeError: 'dict' object has no attribute 'lower'.
The 'or ""' fallback only handles None/falsy values. A non-empty
dict is truthy and passes through to .lower(), which fails.
Fix: Wrap all 5 call sites with str() before calling .lower().
This is a no-op for strings and safely converts dicts to their
repr for pattern matching (no false positives on classification
patterns like 'rate limit', 'context length', etc.).
Closes#11233
The streaming translator in agent/gemini_cloudcode_adapter.py keyed OpenAI
tool-call indices by function name, so when the model emitted multiple
parallel functionCall parts with the same name in a single turn (e.g.
three read_file calls in one response), they all collapsed onto index 0.
Downstream aggregators that key chunks by index would overwrite or drop
all but the first call.
Replace the name-keyed dict with a per-stream counter that persists across
SSE events. Each functionCall part now gets a fresh, unique index,
matching the non-streaming path which already uses enumerate(parts).
Add TestTranslateStreamEvent covering parallel-same-name calls, index
persistence across events, and finish-reason promotion to tool_calls.
Replaces the permanent "OK" receipt reaction with a 3-phase visual
lifecycle:
- Typing animation appears when the agent starts processing.
- Cleared when processing succeeds — the reply message is the signal.
- Replaced with CrossMark when processing fails.
- Cleared when processing is cancelled or interrupted.
When Feishu rejects the reaction-delete call, we keep the Typing in
place and skip adding CrossMark. Showing both at once would leave the
user seeing both "still working" and "done/failed" simultaneously,
which is worse than a stuck Typing.
A FEISHU_REACTIONS env var (default on) disables the whole lifecycle.
User-added reactions with the same emoji still route through to the
agent; only bot-origin reactions are filtered to break the feedback
loop.
Change-Id: I527081da31f0f9d59b451f45de59df4ddab522ba
After context compression (manual /compress or auto), run_agent's
_compress_context ends the current session and creates a new continuation
child session, mutating agent.session_id. The classic CLI held its own
self.session_id that never resynced, so /status showed the ended parent,
the exit-summary --resume hint pointed at a closed row, and any later
end_session() call (from /resume <other> or /branch) targeted the wrong
row AND overwrote the parent's 'compression' end_reason.
This only affected the classic prompt_toolkit CLI. The gateway path was
already fixed in PR #1160 (March 2026); --tui and ACP use different
session plumbing and were unaffected.
Changes:
- cli.py::_manual_compress — sync self.session_id from self.agent.session_id
after _compress_context, clear _pending_title
- cli.py chat loop — same sync post-run_conversation for auto-compression
- cli.py hermes -q single-query mode — same sync so stderr session_id
output points at the continuation
- hermes_state.py::end_session — guard UPDATE with 'ended_at IS NULL' so
the first end_reason wins; reopen_session() remains the explicit
escape hatch for re-ending a closed row
Tests:
- 3 new in tests/cli/test_manual_compress.py (split sync, no-op guard,
pending_title behavior)
- 2 new in tests/test_hermes_state.py (preserve compression end_reason
on double-end; reopen-then-re-end still works)
Closes#12483. Credits @steve5636 for the same-day bug report and
@dieutx for PR #3529 which proposed the CLI sync approach.
file_tools._get_file_ops() built a container_config dict for Docker/
Singularity/Modal/Daytona backends but omitted docker_mount_cwd_to_workspace
and docker_forward_env. Both are read by _create_environment() from
container_config, so file tools (read_file, write_file, patch, search)
silently ignored those config values when running in Docker.
Add the two missing keys to match the container_config already built by
terminal_tool.terminal_tool().
Fixes#2672.