Commit graph

9380 commits

Author SHA1 Message Date
Maxim Esipov
bdc9b0eff5 fix(telegram): preserve new DM topic lanes 2026-05-24 15:28:40 -07:00
Teknium
eea9553a9c fix(anthropic): skip mcp_ prefix on outgoing tool schemas when already prefixed
Companion to the GH-25255 incoming-strip fix from @hayka-pacha. Without
this, build_anthropic_kwargs unconditionally added 'mcp_' to every tool
name in step 3, so a native MCP server tool registered as
'mcp_composio_X' was sent as 'mcp_mcp_composio_X' on the wire. The
incoming strip only removes ONE prefix, which still worked on first
call, but on subsequent calls the model pattern-matched the
single-prefixed form from message history and produced names that
stripped to 'composio_X' — registry miss, dispatch fail.

The history-rewrite block (#4) already has this guard. Apply the same
guard to the schema-rewrite block (#3) so round-trip is symmetric.

Added 4 outgoing-side tests. Existing 7 incoming-side tests still pass.

Author map: hayka-pacha added for PR #25270 salvage attribution.

Refs GH-25255.
2026-05-24 15:27:45 -07:00
HKPA
2f91a8406c fix(agent): only strip mcp_ prefix for OAuth-injected tools (GH-25255)
When strip_tool_prefix=True (Anthropic OAuth path), normalize_response
unconditionally stripped the mcp_ prefix from ALL tool names starting
with mcp_. This broke Hermes-native MCP server tools (registered under
their full mcp_<server>_<tool> name in the registry) because the stripped
name doesn't match any registry entry.

Fix: check the tool registry before stripping. Only strip when:
- The stripped name EXISTS in the registry (OAuth-injected tool)
- The full name does NOT exist in the registry

This preserves backward compatibility for OAuth-injected tools while
protecting native MCP server tools from incorrect prefix removal.

7 new tests covering: OAuth strip, native preserve, no-flag, non-mcp,
unknown tools, mixed responses, and dual-registration edge case.

Signed-off-by: HKPA <hayka-pacha@users.noreply.github.com>
2026-05-24 15:27:45 -07:00
Yuan Li
476c897439 fix(telegram): gate send() on send-path health after reconnect storms (#31165)
After sustained Bad Gateway / TimedOut reconnect cycles, the PTB httpx
client can enter a state where bot.send_message() returns a valid
Message (real message_id) but the message never reaches the recipient.
TelegramAdapter.send returns SendResult(success=True) and cron's
live-adapter branch marks the run delivered while the message is
silently dropped.

Add a _send_path_degraded flag. _handle_polling_network_error sets it
on reconnect storms; the existing _verify_polling_after_reconnect
heartbeat probe clears it once getMe() confirms the Bot client is
healthy. While the flag is set, send() short-circuits with
SendResult(success=False, retryable=True) so cron falls through to
the standalone delivery path (fresh HTTP session).

Closes #31165.

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
2026-05-24 15:27:41 -07:00
Teknium
54e61f9331 fix(matrix,gateway): Matrix E2EE installs full dep set; plugins respect is_connected
Fixes #31116 — two distinct bugs in fresh-install Matrix gateway:

1. Matrix E2EE setup installed only mautrix[encryption], leaving asyncpg
   / aiosqlite / Markdown / aiohttp-socks uninstalled. The first encrypted
   connect failed with 'No module named asyncpg' deep inside
   MatrixAdapter.connect(). Root cause: the setup wizard hand-rolled a
   pip install of one package instead of using lazy_deps.ensure(
   'platform.matrix'), and check_matrix_requirements() short-circuited the
   runtime installer on 'import mautrix' alone — so the other 4 packages
   were never pulled in.

2. Discord auto-enabled itself on every gateway start, even when the user
   never selected Discord and had no DISCORD_BOT_TOKEN. Root cause:
   gateway/config.py plugin-enablement loop gated enablement on
   entry.check_fn() (just 'is the SDK importable?') and ignored
   entry.is_connected (the 'did the user configure credentials?' probe).
   Same bug class as commit 7849a3d73 fixed for _platform_status in the
   setup wizard; this is the runtime counterpart. Affects Discord, Teams,
   and Google Chat.

Changes:
- hermes_cli/setup.py::_setup_matrix — install via
  lazy_deps.ensure('platform.matrix') to pull the full feature group.
- gateway/platforms/matrix.py::_check_e2ee_deps — verify asyncpg +
  aiosqlite + PgCryptoStore in addition to OlmMachine, so E2EE failures
  surface at startup instead of at first encrypted-room connect.
- gateway/platforms/matrix.py::check_matrix_requirements — use
  feature_missing('platform.matrix') as the install gate instead of a
  single 'import mautrix' check, so partial installs trigger the lazy
  installer correctly.
- gateway/config.py plugin-enablement loop — consult entry.is_connected
  before flipping enabled=True. Explicit YAML enabled=true still wins.

Tests: 3 new in tests/gateway/test_matrix.py (asyncpg-required,
aiosqlite-required, partial-install lazy-runs), 5 new in
tests/gateway/test_platform_registry.py (is_connected=False blocks,
is_connected=True enables, is_connected=None falls back to check_fn,
raising probe doesn't enable, explicit YAML wins).

Validation: 310 tests across affected test modules pass.
2026-05-24 15:16:03 -07:00
teknium1
88834baf50 chore: map soju06@users.noreply.github.com for PR #26054 salvage 2026-05-24 15:15:37 -07:00
Soju
6212e9ade8 fix(error-classifier): treat 5xx request-validation errors as non-retryable
Standard OpenAI returns request-validation failures (unknown/
unsupported parameter, malformed request) as 4xx. Some
OpenAI-compatible gateways return them as 5xx instead — codex.nekos.me
returns 502 for an unknown parameter.

The generic '5xx -> retryable server_error' rule then misfires: the
error is deterministic (every retry gets the identical rejection), so
the retry loop burns all 3 attempts, the transport-recovery path
resets the counter and burns 3 more, and the result is a request
flood against a request that can never succeed.

Fix: when a 500/502 body carries an unambiguous request-validation
signal — 'unknown parameter' / 'unsupported parameter' /
'invalid_request_error' in the message text, or invalid_request_error
/ unknown_parameter / unsupported_parameter as the structured error
code — classify as a non-retryable format_error so the loop fails
fast and falls back. Genuine 502 Bad Gateway with no such signal
stays retryable as before.

Origin: local-author
Upstream-PR: none
Patch-State: local-only
2026-05-24 15:15:37 -07:00
Soju
775a17284f fix(transport): strip Hermes-internal scaffolding keys before chat.completions
The empty-response recovery path in run_agent.py appends synthetic
messages tagged with _empty_recovery_synthetic (and the agent loop uses
_thinking_prefill / _empty_terminal_sentinel similarly). These are
internal bookkeeping markers — they must never reach the wire.

chat_completions' convert_messages only stripped Codex Responses leak
fields (codex_reasoning_items, call_id, etc.), not these _-prefixed
markers. Permissive providers (real OpenAI, Anthropic) silently ignore
unknown message keys so the bug stayed hidden, but strict
OpenAI-compatible gateways reject them outright. Observed against
codex.nekos.me:

  502: [ObjectParam] [input[617]._empty_recovery_synthetic]
       [unknown_parameter] Unknown parameter:
       '_empty_recovery_synthetic'

Because the synthetic messages persist in the session, every
subsequent request in that session carries the poisoned key and
fails identically — a deterministic 502 the retry loop mistakes for
a transient server error.

Fix: convert_messages now drops any top-level message key starting
with '_'. OpenAI's message schema has no '_'-prefixed fields, so this
is safe and future-proofs against new internal markers.

Origin: local-author
Upstream-PR: none
Patch-State: local-only
2026-05-24 15:15:37 -07:00
Teknium
7ab1677362
feat(security): on-demand supply-chain audit via OSV.dev (#31460)
Adds 'hermes security audit' — a one-shot vulnerability scan against
OSV.dev covering three surfaces a Hermes user actually controls:

  1. The running Python's installed PyPI dists (importlib.metadata)
  2. Plugin requirements.txt / pyproject.toml pins under ~/.hermes/plugins/
  3. Pinned npx/uvx MCP servers in config.yaml

Zero new dependencies (stdlib urllib + importlib.metadata + tomllib +
concurrent.futures). No auth required for OSV's public batch API.

Flags: --json, --fail-on {low,moderate,high,critical} (default: critical),
       --skip-venv, --skip-plugins, --skip-mcp

Output groups findings by source, sorts by severity descending, surfaces
fixed-versions inline. Exit 1 when any finding meets the --fail-on tier.

Deliberately out of scope: globally-installed pip/npm, editor/browser
extensions, daily background scans, auto-blocking of installs. The audit
is on-demand by design — daily scans become noise the user trains
themselves to ignore.
2026-05-24 15:15:16 -07:00
Teknium
8065e70274
fix(agent): abort on HTTP 402 after pool rotation and fallback fail (#31443)
Closes #31273.

HTTP 402 (insufficient credits) was retried up to agent.api_max_retries
times (default 3), burning paid requests against an exhausted balance.
Real-world impact: ~$40 in 48h on a 24/7 Telegram+Discord gateway.

Root cause: FailoverReason.billing was in the is_client_error
exclusion set in agent/conversation_loop.py, which prevents the
non-retryable-abort branch from firing.

By the time control reaches that predicate:
  * credential-pool rotation has already run for billing and either
    continued the loop or returned False (pool exhausted/absent)
  * the eager-fallback branch has also fired on billing and either
    continued the loop or fell through (no fallback configured)

Falling through to the backoff retry from here has no recovery
mechanism left — it just burns more paid requests.  Removing billing
from the exclusion set makes 402 abort cleanly once pool+fallback
recovery has failed, mirroring how 401/403 (also should_fallback=True)
already behave.

Added tests/run_agent/test_31273_402_not_retried.py which mirrors the
is_client_error predicate shape from the source and asserts the
invariant (plus a source-inspection guard against accidental
re-introduction).
2026-05-24 15:14:13 -07:00
teknium1
5b52e26d18 fix(gateway): swallow transient Telegram TimedOut at loop level
Closes #31066. Closes #31110.

An unhandled `telegram.error.TimedOut` (or peer `NetworkError` /
`httpx` connection error) propagating to the asyncio event loop killed
the entire gateway process, taking down every profile attached to the
same runner. systemd restarted the service after ~5s but the active
conversation turn was lost.

Public adapter methods (`adapter.send`, `adapter.edit_message`,
`adapter.send_voice`, …) are individually try/except-wrapped on
current main, but at least one async path was reaching the loop with
TimedOut unhandled — the report's traceback ends at the deepest httpx
frame and doesn't pinpoint the caller.

Rather than audit 30+ call sites blind, install a loop-level safety net:
`_gateway_loop_exception_handler` is set as the loop's exception handler
in `start_gateway()` after `asyncio.get_running_loop()`. It classifies
the exception via `_is_transient_network_error()` (walks the
__cause__/__context__ chain, matches on class name so the test suite
doesn't need the real telegram/httpx packages installed). Transient
errors are logged at WARNING with full traceback so the originating
call site stays diagnosable; everything else forwards to
`loop.default_exception_handler` so real bugs still surface.

Tests cover the classifier (known transients accepted, real bugs
rejected, cause/context chain unwrap, cyclic-cause termination) and the
handler (swallow + log warning, forward unknowns, missing-exception
context). One end-to-end test schedules an orphan task raising TimedOut
and asserts `asyncio.run` returns cleanly.
2026-05-24 15:03:27 -07:00
Teknium
3d66787a04
fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452)
* fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main for vision

Fixes #31179. Three coupled fixes so a configured aux vision backend
actually serves vision tasks instead of silently routing images to the
user's main provider:

1. agent/auxiliary_client.py: `auxiliary.<task>.provider: openai` resolves
   to `custom` + `https://api.openai.com/v1`. "openai" was not in
   PROVIDER_REGISTRY (we have `openai-codex` for OAuth and `custom` for
   manual base_url), so the obvious config name silently failed to build a
   client. User-supplied base_url is still preserved; only the provider
   name normalises to `custom` so resolution doesn't hit the
   PROVIDER_REGISTRY-only path.

2. agent/auxiliary_client.py: the vision auto-detect chain now skips the
   user's main provider when models.dev reports `supports_vision=False`.
   Without this guard, a misconfigured aux provider would fall back to
   `auto`, which happily returned the main-provider client. The caller
   would then send image content to e.g. api.deepseek.com with model
   `gpt-4o-mini` and get a cryptic `unknown variant 'image_url',
   expected 'text'` from the provider's parser.

3. tools/vision_tools.py + tools/browser_tool.py: `check_vision_requirements`
   now mirrors the runtime fallback chain (explicit provider, then auto),
   so `vision_analyze` shows up whenever vision is actually serviceable.
   `browser_vision` gets a new `check_browser_vision_requirements` check_fn
   that AND-gates browser + vision availability, so it doesn't get
   advertised to the model when the call would fail at runtime.

Reproduction (config from the bug report):
  model.provider: deepseek
  model.default: deepseek-v4-pro
  auxiliary.vision.provider: openai
  auxiliary.vision.model: gpt-4o-mini

Before: resolve_vision_provider_client() returns None for the explicit
provider, fallback auto returns the deepseek client with model='gpt-4o-mini',
image hits api.deepseek.com → 'unknown variant image_url'. vision_analyze
hidden from tool list; browser_vision exposed but fails at call time.

After: resolves to custom + api.openai.com/v1 with model gpt-4o-mini.
vision_analyze and browser_vision both gate correctly on capability.

Tests: tests/agent/test_vision_routing_31179.py covers all three fixes
(12 cases including the user's exact scenario, base_url preservation,
text-only-main skip, capability-unknown permissive fallback, and tool
gating parity). Existing 382 tests across auxiliary/vision/image_routing
suites still pass.

* test(vision): use exact hostname check to silence CodeQL substring-sanitization alert

* fix(auxiliary): drop model name from vision-skip debug log to silence CodeQL

The new `logger.debug(...)` added in the previous commit interpolated
both `main_provider` and `vision_model` (a public model slug \u2014 not
sensitive). CodeQL's `py/clear-text-logging-sensitive-data` heuristic
re-flagged it twice because the rule mis-detects multi-value
interpolations near tainted-via-config provider strings.

Drop the model from the log args (provider alone is enough to diagnose
the skip; the same sibling branch a few lines up already logs provider
only). Behavior unchanged; CodeQL false positive cleared.
2026-05-24 15:01:28 -07:00
Hinotoi Agent
d9ec90585c test(dashboard): send loopback headers for WebSocket sidecar test 2026-05-24 15:00:44 -07:00
hinotoi-agent
2e66eefbc3 fix(dashboard): validate WebSocket Host and Origin 2026-05-24 15:00:44 -07:00
Teknium
186bf25cb1 test(guardrail): assert halt message reaches stream_delta_callback
Regression guard for #30770 — verifies the guardrail-halt branch in
agent/conversation_loop.py pushes the synthesized halt message through
stream_delta_callback before breaking out of the loop.  Without the
emit, chat-completions SSE writers drain an empty queue and clients
(Open WebUI, etc.) see a finish chunk with zero content delta —
indistinguishable from a crash.

Verified: the test fails when the production fix is reverted.
2026-05-24 07:38:24 -07:00
annguyenNous
38b8d0da85 fix: emit guardrail halt message to client before closing stream
When the tool loop guardrail fires (max_tool_failures, etc.), the
turn exits with guardrail_halt but no final assistant message was
emitted to the client. The SSE stream closed silently —
indistinguishable from a crash.

The stream_delta_callback(None) before tool execution is a display
flush, not a hard close. After generating the halt response, emit
it through both _safe_print (CLI) and stream_delta_callback (SSE)
so clients see the explanation.

Fixes #30770
2026-05-24 07:38:24 -07:00
Teknium
889903f0fa
fix(tests): align CI tests with recent security hardening (#31470)
Four recent security PRs landed on main with stale/missing test updates,
breaking 4 test shards on every subsequent PR's CI run:

- test_discord_bot_auth_bypass.py (PR #30742 c3caca658):
  DISCORD_ALLOWED_ROLES no longer bypasses _is_user_authorized.
  Inverted 3 tests to assert the new (correct) behavior: role config
  alone does NOT authorize at the gateway layer.

- test_msgraph_webhook.py (PR #30169 4ca77f105):
  adapter.is_connected is a @property, not a method. Test was calling
  it with () after the connect() change; TypeError: 'bool' is not
  callable. Removed the parens.

- test_feishu_approval_buttons.py (PR #30744 bdb97b857):
  Card-action callbacks now go through _allow_group_message
  authorization. 3 tests in TestCardActionCallbackResponse didn't
  populate adapter._allowed_group_users so the operator's open_id got
  rejected. Added the allowlist setup to each test, matching the
  existing pattern in test_returns_card_for_approve_action.

Also raise tolerance on test_wait_for_process_kills_subprocess_on_keyboardinterrupt:
the SIGTERM → 3s TimeoutStopSec → SIGKILL → reap chain can exceed 10s
under loaded xdist (40 workers). Bumped _wait_for_pgid_exit timeout
10→30s and worker join timeout 5→15s. Passes 100% in isolation
already; this just makes it tolerant of CI-host load.

Validation: 270/270 tests pass across the 5 affected files.
2026-05-24 06:54:16 -07:00
Hinotoi-agent
3bace071bf fix(state): restrict sensitive store file permissions
response_store.db (api server) holds conversation history including tool
payloads, prompts, and results. webhook_subscriptions.json holds per-route
HMAC secrets. Under a permissive umask (e.g. 0o022, default on most
distros) both files were created mode 0o644 — readable by other local
users on shared boxes.

- gateway/platforms/api_server.py: ResponseStore tightens itself + WAL/SHM
  sidecars to 0o600 after __init__, then trusts the inode. (Original
  contributor patch chmod'd after every _commit() — wasteful on a hot
  api_server path; chmod-on-create is sufficient since SQLite preserves
  mode bits across writes.)

- hermes_cli/webhook.py: _save_subscriptions writes via tempfile.mkstemp
  (which itself creates the file with 0o600), chmods the temp before the
  atomic rename, and re-asserts 0o600 on the destination so an existing
  permissive file from before this fix gets narrowed.

Tests cover (a) creation under permissive umask leaves 0o600 and (b) an
existing 0o644 webhook_subscriptions.json gets narrowed on next save.
Tests guarded with skipif os.name=='nt' since POSIX mode bits don't apply
on Windows.

Salvaged from PR #30917 by @Hinotoi-agent. Reworked the api_server.py
side from chmod-on-every-commit to chmod-on-create.

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
2026-05-24 04:55:18 -07:00
m0n3r0
f378f00bfb fix(feishu): validate verification token before reflecting url_verification challenge
When FEISHU_VERIFICATION_TOKEN is configured, an unauthenticated remote
could previously prove endpoint control by sending a url_verification
payload with any attacker-controlled challenge string — the handler
reflected the challenge BEFORE running the token check.

Move the verification_token check ahead of the url_verification echo so
the challenge response is gated on a valid token. Add a regression test
covering the wrong-token case. Also fix the stale
test_connect_webhook_mode_starts_local_server fixture to set
FEISHU_VERIFICATION_TOKEN (post #30746 webhook mode requires a secret).

Salvaged from PR #29663 by @m0n3r0 — kept the url_verification reorder
and its regression test; dropped the host-conditional weakening of the
#30746 secret guard (we want webhook secrets required regardless of
bind host, not only on 0.0.0.0/::).

Docs updated to call out the gating.

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
2026-05-24 04:51:19 -07:00
teknium1
5e6749fbf3 chore(release): map m0n3r0 for PR #29629 salvage 2026-05-24 04:47:45 -07:00
teknium1
15aa6884a2 fix(webhook): use 403 not 500 for missing-secret rejection
Operator misconfiguration is a client/setup error, not an internal server
exception. 403 "forbidden" more accurately reflects "this route refuses
to authenticate" than 500 "internal server error" — the latter triggers
incident alerting on operator monitoring and conflates real bugs with
config drift.

Follow-up tweak to PR #29629 by @m0n3r0.
2026-05-24 04:47:45 -07:00
m0n3r0
dbf73e90fa fix: fail closed for webhook routes without secrets
Reject unsigned webhook requests when a route has no effective HMAC secret, even if the request handler is reached without the normal connect-time validation. Add regression coverage for the direct-handler path.
2026-05-24 04:47:45 -07:00
BaxBit
bbf02c3224
fix(gateway): validate Svix webhook signatures (#30200) 2026-05-24 04:45:13 -07:00
Jiaming Guo
ee002e7fc5
fix(dashboard): require auth for plugin rescan (#27340) 2026-05-24 04:45:07 -07:00
Teknium
5acaeba2bb
fix(mcp): raise ImportError instead of NameError when stdio SDK missing (#31450)
When the 'mcp' Python SDK isn't installed, _run_stdio leaked a bare
'NameError: name StdioServerParameters is not defined' because the
top-level 'from mcp import ...' fails inside try/except ImportError,
leaving the names unbound at module scope.

Mirror the _MCP_HTTP_AVAILABLE gate that _run_http already had: raise
a clear ImportError with install instructions instead.

Fixes #30904
2026-05-24 04:44:59 -07:00
xxxigm
6cafcf9c77 test(streaming): pin partial-stream-stub finish_reason + continuation contract
Three test classes lock in the #30963 fix:

1. TestPartialStreamStubFinishReason — drives _interruptible_streaming_api_call
   through the two recovery branches and asserts:
     - text-only partial → finish_reason="length" (the new behaviour),
     - mid-tool-call partial → finish_reason="stop" (unchanged on purpose).

2. TestLengthContinuationPromptBranching — pure-Python check on the branch
   that picks the continuation prompt by response.id. Locks the network
   error wording for partial-stream-stub vs. the output-length wording
   for everything else.

3. TestConversationLoopPartialStreamContinuation — feeds a stub +
   continuation pair into run_conversation, verifies the loop makes a
   second API call (instead of exiting with text_response(stop)),
   confirms the network-error continuation prompt actually reaches the
   model on call #2, and that final_response stitches both halves.

Refs: NousResearch/hermes-agent#30963
2026-05-24 04:35:15 -07:00
xxxigm
20b3703a42 fix(conversation-loop): tailor length-continuation prompt for partial stream
The length-continue path's user-facing vprint and continuation prompt
both told the model "your response was truncated by the output length
limit." That's a lie when the stub came from a partial-stream network
error (issue #30963) — and a lie the model can detect, leading to "I
wasn't truncated, I'm done" no-op responses that defeat the
continuation entirely.

Detect the partial-stream-stub via response.id and swap in:

- vprint:   "Stream interrupted by network error
             (finish_reason='length' on partial-stream-stub)"
- prompt:   "[System: The previous response was cut off by a network
             error mid-stream. Continue exactly where you left off.
             Do not restart or repeat prior text. Finish the answer
             directly.]"

Real length truncations still see the original "truncated by output
length limit" prompt — the model needs to know which class of failure
it's recovering from. Same length_continue_retries=3 budget,
truncated_response_parts merging, and final-response stitching
infrastructure on both branches.

Refs: NousResearch/hermes-agent#30963
2026-05-24 04:35:15 -07:00
xxxigm
9140be7c22 fix(streaming): emit finish_reason=length on text-only partial-stream stub
When the API connection drops mid-stream after text deltas have already
been delivered, chat_completion_helpers returned a stub response with
finish_reason=stop. The conversation loop then classified the stub as a
clean text completion (text_response(finish_reason=stop)) and exited
with iteration budget remaining — even when the goal-judge verdict
came back as "continue" milliseconds later (issue #30963).

Switch the text-only partial-stream stub to finish_reason=length. The
existing length-continuation path (length_continue_retries up to 3,
"continue exactly where you left off" prompt, partial parts merged
into final_response) then fires automatically: the partial assistant
content is persisted, the model is asked to continue from the cut
point, and the loop keeps making progress against the goal.

The mid-tool-call branch keeps finish_reason=stop on purpose — its
user-facing warning ("Ask me to retry if you want to continue") asks
the user to drive the retry rather than auto-replaying a tool call
with possible side effects.

#5544's "no duplicate message" contract is preserved verbatim: the
partial content is reused, never re-emitted as a fresh API call, so
the user never sees two copies of the same delta.

Refs: NousResearch/hermes-agent#30963
2026-05-24 04:35:15 -07:00
teknium1
60d20a37c9 fix(acp): only deliver final_response after streaming when transformed
PR #29119 dropped the 'not streamed_message' guard unconditionally so
that plugin-transformed responses (transform_llm_output hook) would
reach ACP clients. That regressed test_prompt_does_not_duplicate_streamed_final_message:
when no transform happened, the streamed text was re-sent as a duplicate
final delivery.

Tighten the condition to mirror the gateway side: deliver after streaming
only when response_transformed=True. Otherwise keep the old guard.

Adds test_prompt_delivers_transformed_response_after_streaming so the
transformed path stays covered.
2026-05-24 04:31:13 -07:00
teknium1
26088ca669 chore: map kenyon1977@gmail.com for PR #29119 salvage 2026-05-24 04:31:13 -07:00
teknium1
b9f533af0a test(gateway): regression for plugin-transformed response after streaming
Adds a test that fails without the gateway fix, exercising the
response_transformed=True branch in _finalize_response: a streamed
response whose final text was modified by a transform_llm_output
plugin hook must be edit_message'd in place (not duplicate-sent),
with already_sent=True so the normal final-send is skipped.

Also drops two minor leftovers from the salvaged PR #29119:

  * accumulated_text property on GatewayStreamConsumer (unused)
  * duplicate _response_transformed=False inside the hook try block
2026-05-24 04:31:13 -07:00
kenyonxu
5cb21e3fb5 fix(gateway): edit streamed message instead of sending duplicate when response_transformed
When a transform_llm_output hook appends content after streaming, the previous
fix skipped the final-send suppression which caused the full response to be
sent as a NEW message (duplicate). Instead, edit the existing streamed message
in-place to append the transformed content, then set already_sent=True.

Added stream_consumer.message_id and .accumulated_text public properties.
2026-05-24 04:31:13 -07:00
kenyonxu
a4ceead796 fix(gateway): propagate response_transformed flag through run_sync return dict
run_sync() cherry-picks fields from the run_conversation result dict into
a new response dict for the gateway. response_transformed was missing from
the cherry-pick list, so the gateway always saw it as False and suppressed
the final send even though a transform_llm_output hook had modified the content.
2026-05-24 04:31:13 -07:00
kenyonxu
8edeebe6d7 fix: propagate response_transformed flag — plugin hook output survives streaming suppression
When a transform_llm_output hook modifies final_response after streaming,
the gateway was silently discarding the transformed content because
streamed=True / content_delivered=True triggered the final-send
suppression. Three changes:

1. conversation_loop: set `_response_transformed=True` when a
   transform_llm_output hook returns a non-empty string, and expose it
   as `response_transformed` in the result dict.

2. gateway/run: skip the final-send suppression when
   `response_transformed` is True — the transformed response must
   reach the client even if streaming already sent the original text.

3. acp_adapter/server: remove `not streamed_message` guard so
   final_response is always delivered (ACP path fixed separately).
2026-05-24 04:31:13 -07:00
kenyonxu
7eb6c7f489 fix(acp): deliver final_response after streaming — transform_llm_output hook now visible
When streaming is active, streamed_message=True skipped the final_response
update, causing plugin hooks like transform_llm_output to be silently
invisible. Remove the `not streamed_message` guard so the final response
(possibly transformed by plugins) is always delivered to the ACP client.
2026-05-24 04:31:13 -07:00
Teknium
197f63f454
fix(feishu): require webhook auth secret and honor config extras (#30746) 2026-05-24 04:27:28 -07:00
Teknium
bdb97b8573
fix(feishu): enforce auth and chat binding for approval buttons (#30744) 2026-05-24 04:27:17 -07:00
Teknium
485292ac7d
fix(feishu): authorize interactive exec approval callbacks (#30739) 2026-05-24 04:26:57 -07:00
Teknium
be27bfed01
security: harden API server key placeholder handling (#30738) 2026-05-24 04:25:32 -07:00
Teknium
2df2f9190b
fix(docker): keep dashboard side-process loopback by default (#30740) 2026-05-24 04:25:28 -07:00
Teknium
4ca77f1059
Harden msgraph webhook auth requirements (#30169) 2026-05-24 04:25:20 -07:00
Teknium
3e78e353d7
fix(qqbot): authorize approval button interactions by session owner (#30737) 2026-05-24 04:25:12 -07:00
Teknium
e4a1220f83
security: restrict default webhook toolset capabilities (#30745) 2026-05-24 04:24:54 -07:00
Teknium
c3caca6584
fix(gateway): remove discord role allowlist auth bypass (#30742) 2026-05-24 04:24:49 -07:00
Teknium
1f897b0dc9
fix(gateway): stop enabling dingtalk allow-all during setup (#30743) 2026-05-24 04:24:44 -07:00
Teknium
9732559864
fix(security): restrict dashboard websockets to loopback clients (#30741) 2026-05-24 04:24:40 -07:00
Teknium
bc3f1f4f34
feat(secrets/bitwarden): EU Cloud + self-hosted server URL support (#31378)
Closes #31370.

bws defaults to the US identity endpoint, so EU Cloud and self-hosted
machine-account tokens fail with [400 Bad Request] {"error":"invalid_client"}
during 'hermes secrets bitwarden setup'. The token is valid — it's just
being checked against the wrong region.

Add a Bitwarden region step to the wizard between the access-token and
project-list steps:

  Step 1  Install bws
  Step 2  Provide access token
  Step 3  Pick region   <-- new (US / EU / self-hosted-custom-URL)
  Step 4  Pick project  (now talks to the right endpoint)
  Step 5  Test fetch

Region is stored in config.yaml as secrets.bitwarden.server_url and
plumbed into every bws subprocess as BWS_SERVER_URL (project list,
secret list, test fetch, and the env_loader startup pull).

Also:
- Non-interactive: 'hermes secrets bitwarden setup --server-url ...'
- Pre-existing BWS_SERVER_URL in the shell is detected and reused
- Cache key includes server_url so EU/US fetches don't collide
- 'hermes secrets bitwarden status' shows the configured region
- 'invalid_client' / '400 Bad Request' from bws now triggers a hint
  pointing at the region setting instead of looking like a bad token
2026-05-24 02:19:57 -07:00
Teknium
c9b3eeabdc
fix(cli): decouple tool_progress=verbose from global DEBUG logging (#31379)
PR #6a1aa420e coupled `display.tool_progress: verbose` (a per-tool display
toggle for full args / results / think blocks) to `self.verbose` — which
controls root-logger DEBUG level. Result: setting tool_progress: verbose
in config silently flipped every module in the process to DEBUG and
flooded the terminal with internal logging, far beyond just full tool
calls.

The two concepts are separate:
- `tool_progress_mode == 'verbose'` → display behavior (tool rendering)
- `self.verbose` → logging behavior (root logger → DEBUG, line 9795)

This change keeps PR #6a1aa420e's argparse.SUPPRESS / config-fallback
plumbing but severs the verbose-display → debug-logging link.

Changes:
- cli.py:2868 — `self.verbose` only follows explicit `verbose=` arg; no
  longer auto-True when tool_progress_mode == 'verbose'.
- cli.py:_toggle_verbose — slash-cycle through tool progress modes no
  longer flips `self.verbose` / `agent.verbose_logging` / `agent.quiet_mode`.
- cli.py:9355 — fix misleading label (drop 'and debug logs').
- tui_gateway/server.py:_make_agent — same decoupling on the TUI side
  (verbose_logging no longer derived from tool_progress_mode).
- tests/cli/test_tool_progress_scrollback.py — invert the test that
  asserted the broken coupling; add coverage for explicit `--verbose`
  still enabling DEBUG independent of tool_progress.

Live verified:
- tool_progress: verbose, no --verbose flag → 0 DEBUG/INFO log lines
- --verbose flag explicit → 32 DEBUG/INFO log lines (as expected)
2026-05-24 02:19:20 -07:00
AhmetArif0
5848174374 fix(wecom): guard flush task against cancel-delivery race to prevent message loss
When asyncio.sleep() fires just before Task.cancel() is called, CPython
sets _must_cancel=True but cannot cancel the already-completed sleep
future, so CancelledError is delivered at the next await (handle_message)
rather than at the sleep.  By that point the superseded task has already
popped the merged event from _pending_text_batches, so the superseding
task sees an empty batch and silently drops the message.

Fix: add a synchronous task-registry check between the sleep and the pop.
No await between the check and the pop means no other coroutine can
interleave, so the guard is race-free.
2026-05-24 01:33:40 -07:00
Teknium
1bed4e8eed fix(gateway): drop text snippet from debounce debug log (CodeQL)
CodeQL py/clear-text-logging-sensitive-data flagged the candidate-accept
debug log including event.text[:60]. Log text_len instead — sufficient for
debugging burst behavior without surfacing message contents.

Co-authored-by: Paulo Nascimento <pnascimento9596@gmail.com>
2026-05-24 01:31:45 -07:00