Fresh Windows installs were failing on first run with:
⚠ uv python install error: Downloading cpython-3.11.15-windows-x86_64-none (24.5MiB)
✗ Installation failed: Python was not found; run without arguments
to install from the Microsoft Store...
Two bugs compounding:
1) EAP=Stop swallows uv's stderr progress as an exception. uv writes
download progress ("Downloading cpython-3.11.15-windows-x86_64-none
(24.5MiB)") to stderr. With $ErrorActionPreference = "Stop" set at
the top of the script plus 2>&1 capture, PowerShell wraps each stderr
line as an ErrorRecord and throws on the first one — even though uv
exits 0 and Python was installed successfully. This was previously
fixed in commit ec1714e71 (May 8) but lost in the May 12 release
squash (413990c94). Reapply the EAP=Continue + verify-via
'uv python find' pattern.
2) System-python fallback invokes the Microsoft Store stub. When the uv
paths fall through, the legacy 'python --version' check invokes
%LOCALAPPDATA%\\Microsoft\\WindowsApps\\python.exe, a 0-byte
reparse-point stub that prints 'Python was not found...' to stdout
and exits non-zero. Get-Command matches it. The resulting error
message is what the user sees as the final installer crash. Detect
and skip the stub by checking for the \\WindowsApps\\ path
component or a 0-byte file size before invoking python.
Also save/restore EAP defensively in the catch blocks so a throw before
the assignment can't leave EAP in 'Continue'.
Wraps every sync->async coroutine-scheduling site in the codebase with a
new agent.async_utils.safe_schedule_threadsafe() helper that closes the
coroutine on scheduling failure (closed loop, shutdown race, etc.)
instead of leaking it as 'coroutine was never awaited' RuntimeWarnings
plus reference leaks.
22 production call sites migrated across the codebase:
- acp_adapter/events.py, acp_adapter/permissions.py
- agent/lsp/manager.py
- cron/scheduler.py (media + text delivery paths)
- gateway/platforms/feishu.py (5 sites, via existing _submit_on_loop helper
which now delegates to safe_schedule_threadsafe)
- gateway/run.py (10 sites: telegram rename, agent:step hook, status
callback, interim+bg-review, clarify send, exec-approval button+text,
temp-bubble cleanup, channel-directory refresh)
- plugins/memory/hindsight, plugins/platforms/google_chat
- tools/browser_supervisor.py (3), browser_cdp_tool.py,
computer_use/cua_backend.py, slash_confirm.py
- tools/environments/modal.py (_AsyncWorker)
- tools/mcp_tool.py (2 + 8 _run_on_mcp_loop callers converted to
factory-style so the coroutine is never constructed on a dead loop)
- tui_gateway/ws.py
Tests: new tests/agent/test_async_utils.py covers helper behavior under
live loop, dead loop, None loop, and scheduling exceptions. Regression
tests added at three PR-original sites (acp events, acp permissions,
mcp loop runner) mirroring contributor's intent.
Live-tested end-to-end:
- Helper stress test: 1500 schedules across live/dead/race scenarios,
zero leaked coroutines
- Race exercised: 5000 schedules with loop killed mid-flight, 100 ok /
4900 None returns, zero leaks
- hermes chat -q with terminal tool call (exercises step_callback bridge)
- MCP probe against failing subprocess servers + factory path
- Real gateway daemon boot + SIGINT shutdown across multiple platform
adapter inits
- WSTransport 100 live + 50 dead-loop writes
- Cron delivery path live + dead loop
Salvages PR #2657 — adopts contributor's intent over a much wider site
list and a single centralized helper instead of inline try/except at
each site. 3 of the original PR's 6 sites no longer exist on main
(environments/patches.py deleted, DingTalk refactored to native async);
the equivalent fix lives in tools/environments/modal.py instead.
Co-authored-by: JithendraNara <jithendranaidunara@gmail.com>
Build on @aydnOktay's cronjob fix by routing the cronjob check through
the shared 'env_var_enabled' helper in utils.py (same truthy set:
1/true/yes/on) and applying the same semantics to the 8 sibling call
sites that read HERMES_INTERACTIVE / HERMES_GATEWAY_SESSION /
HERMES_EXEC_ASK / HERMES_CRON_SESSION with bare os.getenv() truthy
checks:
- tools/approval.py: _is_gateway_approval_context (2), check_command_safety (2),
check_all_command_guards (3) -- 7 sites total
- tools/terminal_tool.py: _handle_sudo_failure, sudo password prompt -- 2 sites
- tools/skills_tool.py: _is_gateway_surface -- 1 site
Without this, a user who exports HERMES_INTERACTIVE=0 in their shell
still gets interactive sudo prompts, approval prompts, and gateway
skill-install paths -- only the cronjob tool was hardened. Now all
consumers agree on the same false-like values.
Also drops the duplicate _is_truthy_env helper from cronjob_tools.py
in favour of the existing canonical utils.env_var_enabled.
Tests: extend the parametrized regression coverage to all three
session env vars (HERMES_INTERACTIVE / HERMES_GATEWAY_SESSION /
HERMES_EXEC_ASK) symmetrically. tests/tools/test_cronjob_tools.py:
60/60 pass; tests/tools/{approval,terminal_tool,skills_tool,
cron_approval_mode,hardline_blocklist}.py: 378/378 pass.
Follow-up to #26534 (xai-oauth provider). The new guide and integrations
page were shipped with the salvage, but four reference/enumeration pages
still listed every other OAuth provider without xai-oauth:
- reference/cli-commands.md — `--provider` choices list
- reference/environment-variables.md — HERMES_INFERENCE_PROVIDER values
- user-guide/configuration.md — auxiliary-task provider list, OAuth
tip block (mirrored from MiniMax OAuth),
and provider table row
- user-guide/features/fallback-providers.md — provider table
Drop accounts.mouseion.dev and localhost:20000 / 127.0.0.1:20000 from
the loopback callback CORS allowlist — leftover dev origins. The
redirect_uri is bound to 127.0.0.1 and gated by PKCE + state, so only
xAI's own auth origins are needed.
Co-Authored-By: Jaaneek <Jaaneek@users.noreply.github.com>
Per @mark-xai's review on PR #26457 and the xAI model retirement on
2026-05-15: grok-code-fast-1 is being retired today and aliases redirect
to grok-4.3 (already pinned to the top of the xAI model list by this
PR). Update the two xAI Responses-API test fixtures Mark flagged plus
the picker fallback default in hermes_cli/main.py that uses the same
literal.
The previous "Logging Out" section showed `hermes auth remove xai-oauth`
with no positional target — argparse rejects that and the command does
not clear the singleton OAuth state anyway. The correct command for the
"clear everything" intent is `hermes auth logout xai-oauth`. Also point
users at `hermes auth remove xai-oauth <target>` for single-pool-row
deletion.
The xAI prompt_cache_key block carried two long comment paragraphs
that either restated setdefault semantics, narrated the SDK
type-validation mechanism, or recapped the historical motivation for
the extra_body indirection — all already covered by the test
docstring at test_xai_responses_sends_cache_key_via_extra_body
(which links to the xAI docs). Also restored the truncated link in
the body-injection comment.
No behavior change.
The new resolve_xai_http_credentials() resolver was using os.getenv()
for the XAI_API_KEY/XAI_BASE_URL fallback path, which dropped the
~/.hermes/.env contract guarded by PR #17140 / #17163. Users with
XAI_API_KEY in dotenv only would see "No xAI credentials found" even
though the key was configured.
Separately, _transcribe_xai started consulting creds["base_url"] (which
always returns at least the default https://api.x.ai/v1) ahead of the
public XAI_STT_BASE_URL env override, so the per-tool override stopped
working.
- tools/xai_http.py: add module-level get_env_value() wrapper that
reads ~/.hermes/.env first (via hermes_cli.config.get_env_value),
then os.environ. Resolver uses it for the API-key/base-url fallback.
- tools/transcription_tools.py: restore precedence so XAI_STT_BASE_URL
wins over creds["base_url"].
- tests/tools/test_transcription_dotenv_fallback.py +
tests/tools/test_tts_dotenv_fallback.py: repoint the per-call-site
patches at the new resolution point (tools.xai_http.get_env_value).
The end-to-end regression-guard test (which patches load_env) is
unchanged and still passes.
The contributor's commit author email is the legacy GitHub noreply
form (no leading numeric "id+"), so it doesn't match the
check-attribution workflow's auto-resolve regex
(\+.*@users\.noreply\.github\.com). Register it explicitly in
AUTHOR_MAP so the PR #26457 attribution check passes.
Two bugs in the `hermes tools` reconfigure flow caused picking xAI Grok
Imagine for video_gen (or image_gen) to feel like a no-op:
1. `_is_provider_active()` had a branch for `image_gen_plugin_name` but
none for `video_gen_plugin_name`, so a row marked as the active xAI
video provider was never recognized as active. The picker fell through
to the env-var fallback in `_detect_active_provider_index()`, which
matched the FAL row (because `FAL_KEY` is set), so the picker visually
defaulted to FAL even though the user had selected xAI.
2. `_plugin_video_gen_providers()` and `_plugin_image_gen_providers()`
built picker rows from the plugin's `get_setup_schema()` but only
copied `name`, `badge`, `tag`, `env_vars`. The xAI plugins declare
`post_setup: "xai_grok"` so the picker should run the OAuth /
API-key prompt hook after selection — that key was silently dropped,
so the hook never fired from the picker rows.
Adds the missing `video_gen_plugin_name` branch (placed before the
`managed_nous_feature` block, mirroring the existing image_gen branch)
and propagates `post_setup` from the plugin schema into both picker-row
builders. Adds focused tests in `test_video_gen_picker.py` and
`test_image_gen_picker.py`.
Adds a new authentication provider that lets SuperGrok subscribers sign
in to Hermes with their xAI account via the standard OAuth 2.0 PKCE
loopback flow, instead of pasting a raw API key from console.x.ai.
Highlights
----------
* OAuth 2.0 PKCE loopback login against accounts.x.ai with discovery,
state/nonce, and a strict CORS-origin allowlist on the callback.
* Authorize URL carries `plan=generic` (required for non-allowlisted
loopback clients) and `referrer=hermes-agent` for best-effort
attribution in xAI's OAuth server logs.
* Token storage in `auth.json` with file-locked atomic writes; JWT
`exp`-based expiry detection with skew; refresh-token rotation
synced both ways between the singleton store and the credential
pool so multi-process / multi-profile setups don't tear each other's
refresh tokens.
* Reactive 401 retry: on a 401 from the xAI Responses API, the agent
refreshes the token, swaps it back into `self.api_key`, and retries
the call once. Guarded against silent account swaps when the active
key was sourced from a different (manual) pool entry.
* Auxiliary tasks (curator, vision, embeddings, etc.) route through a
dedicated xAI Responses-mode auxiliary client instead of falling back
to OpenRouter billing.
* Direct HTTP tools (`tools/xai_http.py`, transcription, TTS, image-gen
plugin) resolve credentials through a unified runtime → singleton →
env-var fallback chain so xai-oauth users get them for free.
* `hermes auth add xai-oauth` and `hermes auth remove xai-oauth N` are
wired through the standard auth-commands surface; remove cleans up
the singleton loopback_pkce entry so it doesn't silently reinstate.
* `hermes model` provider picker shows
"xAI Grok OAuth (SuperGrok Subscription)" and the model-flow falls
back to pool credentials when the singleton is missing.
Hardening
---------
* Discovery and refresh responses validate the returned
`token_endpoint` host against the same `*.x.ai` allowlist as the
authorization endpoint, blocking MITM persistence of a hostile
endpoint.
* Discovery / refresh / token-exchange `response.json()` calls are
wrapped to raise typed `AuthError` on malformed bodies (captive
portals, proxy error pages) instead of leaking JSONDecodeError
tracebacks.
* `prompt_cache_key` is routed through `extra_body` on the codex
transport (sending it as a top-level kwarg trips xAI's SDK with a
TypeError).
* Credential-pool sync-back preserves `active_provider` so refreshing
an OAuth entry doesn't silently flip the active provider out from
under the running agent.
Testing
-------
* New `tests/hermes_cli/test_auth_xai_oauth_provider.py` (~63 tests)
covers JWT expiry, OAuth URL params (plan + referrer), CORS origins,
redirect URI validation, singleton↔pool sync, concurrency races,
refresh error paths, runtime resolution, and malformed-JSON guards.
* Extended `test_credential_pool.py`, `test_codex_transport.py`, and
`test_run_agent_codex_responses.py` cover the pool sync-back,
`extra_body` routing, and 401 reactive refresh paths.
* 165 tests passing on this branch via `scripts/run_tests.sh`.
* fix(tui): restrict fast-echo bypass to ASCII so Vietnamese/CJK/IME input renders correctly
The composer's fast-echo path (canFastAppend / canFastBackspace) writes
characters straight to stdout to skip an Ink re-render on the hot
typing path. The previous guard only checked
'stringWidth(text) === text.length', which lets a lot of non-ASCII
through:
- Vietnamese precomposed letters (ề, ắ, ờ, ự, ...) report width 1 and
length 1, but a Vietnamese Telex / IME stack produces them across
multiple keystrokes; the intermediate composition state must be
drawn by Ink so the rendered cell, the stored value, and the
cursor column stay in lockstep when the final commit replaces the
preview.
- NFD combining marks (U+0300..U+036F) are zero-width but length 1,
so even a passing equality lets them slip and silently desync the
cell column.
- CJK/East-Asian wide and emoji rejected only because their length
differs, but the boundary was shape-shaped, not intent-shaped.
User-visible bug from the original report:
Example: eê noiói nge neène
-> the bypass committed the IME preview char before the diacritic
replaced it, leaving doubled letters on screen.
Fix: gate fast-echo on pure printable ASCII (0x20-0x7e). The
performance-critical English typing path is unchanged; everything else
goes through the normal Ink render path so layout stays accurate.
Also extracts the shape preconditions as pure exported helpers
(canFastAppendShape / canFastBackspaceShape) so the regression matrix
is testable without spinning up a TextInput.
Tests: ui-tui/src/__tests__/textInputFastEcho.test.ts adds 20 cases
covering ASCII still works, Vietnamese precomposed + NFD, CJK, emoji,
NBSP / Latin-1, ANSI / control bytes, multi-line, and end-of-line
preconditions. Verified RED on the previous guard (11 of 20 fail) and
GREEN on the new guard.
Refs: #5221, #7443, #17602, #17603 (similar wide-char rendering bugs).
* docs(tui): clarify Vietnamese char terminology in regression comment
Address Copilot review: 'single byte width' implied UTF-8 byte semantics,
but the relevant property is JS code units (`text.length === 1`) and
display width (`stringWidth === 1`). Reworded to match.
* fix(langfuse): reject placeholder credentials with one-shot warning
When operators leave HERMES_LANGFUSE_PUBLIC_KEY / HERMES_LANGFUSE_SECRET_KEY
at a template value like 'placeholder', 'test-key', or 'your-langfuse-key',
the Langfuse SDK silently accepts the credentials at construction time and
drops every trace at flush time. No warning, no error — just an empty
Langfuse dashboard the operator only notices hours later.
Add prefix-based validation in _get_langfuse() against the documented
'pk-lf-' / 'sk-lf-' prefixes that Langfuse always issues server-side.
Anything else fires a single warning naming the offending env var(s)
with a log-safe value preview (full string for short placeholders so the
operator knows which template they left in place; truncated for long
values so a real secret pasted into the wrong field never hits the log),
then short-circuits via the existing _INIT_FAILED cache so the warning
fires once per process, not once per hook invocation.
The check sits after the 'Langfuse is None' SDK-installed guard so hosts
without the optional langfuse SDK don't see misleading 'set real keys'
hints when the actionable fix is 'pip install langfuse'. Missing
credentials remains the documented opt-out path and stays silent — no
log noise for unconfigured installs.
Fixes#22763Fixes#23823
* fix(langfuse): use actual API request messages for generation input
on_pre_llm_request previously used the messages kwarg alone, which
could be None when Hermes passes the payload via request_messages,
conversation_history, or user_message instead. Add _coerce_request_messages
to pick the first available list across all variants, falling back to a
synthetic user message. Generations now show the real outbound payload
rather than an empty input.
* fix(langfuse): record tool call outputs in traces
Tool observations showed input (arguments) but output was always
undefined. Root cause: when tool_call_id is empty, pre_tool_call stored
observations under a unique time-based key that post_tool_call could
never reconstruct, so every tool span was closed without output by the
_finish_trace sweep.
Fix pre/post matching by routing empty-tool_call_id tools through a
per-name FIFO queue (pending_tools_by_name) instead of the time-based
key. Tools with a tool_call_id continue to use the id-keyed dict.
Also:
- Preserve OpenAI-style nested function shape in serialized tool calls
so Langfuse renders name/arguments correctly
- Keep name + tool_call_id on role:tool messages for proper pairing
- Backfill tool results onto the matching turn_tool_calls entry so the
generation's tool-call record carries the result alongside arguments
- Coerce request messages from whichever field the runtime provides
(request_messages, messages, conversation_history, user_message)
* fix(langfuse): salvage-review polish — drop dead is_first_turn, shallow-copy request_messages, real threaded FIFO test
Self-review of the combined #22345 + #23831 salvage surfaced three issues
worth fixing in the same PR rather than as follow-ups:
1. Drop is_first_turn from the pre_api_request hook. The boolean expression
`not bool(conversation_history)` was wrong: conversation_history is
reassigned to None mid-run after compression (5 sites in run_agent.py),
so the value flips False -> True mid-conversation on every post-compression
API call. The langfuse plugin never consumed it, so the kwarg was both
misleading AND dead.
2. Replace copy.deepcopy(request_messages) with shallow list() copy. The
pre_api_request hook contract discards return values (invoke_hook never
writes back to api_kwargs), and the langfuse plugin's _serialize_messages
already builds its own snapshot dicts via _safe_value. A deepcopy on every
API call would walk every tool result and base64 image — significant
overhead for no real isolation benefit. Shallow copy of the outer list
protects against later mutations of api_messages without paying for the
inner-dict walk.
3. Rename test_empty_tool_call_id_concurrent_fifo_order ->
test_empty_tool_call_id_observations_are_fifo_within_tool_name and add a
real test_threaded_post_calls_preserve_fifo_under_lock that spawns 8
threads behind a barrier to actually exercise _STATE_LOCK on the
pending_tools_by_name queue. The original test was sequential and only
validated Python list semantics; this one validates the lock discipline.
4. Fix stale 'Cleared by reset_cache_for_tests()' comment on _INIT_FAILED —
that function does not exist. Tests reload the module via sys.modules.pop
+ importlib.import_module instead.
Tests: 37 langfuse plugin tests pass, 658 plugin tests overall pass.
---------
Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>
Co-authored-by: Brian Conklin <brian@dralth.com>
PR #22345 by @btorresgil authors commits as 'Brian Conklin
<brian@dralth.com>' (git config carries a different name/email than the
GitHub account). GitHub's commit-author mapping correctly attributes these
commits to @btorresgil based on the public-key registration, but Hermes'
release attribution audit reads the raw commit email, not the GitHub
mapping. Without this AUTHOR_MAP entry, salvaging #22345 would fail
`scripts/contributor_audit.py` strict mode at release time.
Prerequisite for the langfuse trace fix salvage that cherry-picks
@btorresgil's commits onto current main.
Builds on @steezkelly's Bug A fix (#25857, top-level default_permissions
via _insert_managed_block_at_top_level) by addressing the other two
config-corruption bugs described in #26250:
Bug B (duplicate [plugins.X] tables)
- Codex itself writes [plugins."<name>@<marketplace>"] tables to
config.toml when the user runs `codex plugins enable` directly,
before hermes-agent's managed block exists. On the next migrate run,
_query_codex_plugins() re-discovers the same plugins via plugin/list
and render_codex_toml_section() re-emits them inside the managed
block. Codex's strict TOML parser then rejects the duplicate table
header on startup.
- Add _strip_unmanaged_plugin_tables() that drops [plugins.*] tables
from the user-content portion of the file. Only run it when
plugin/list succeeded — if the RPC failed we can't re-emit and
must preserve the user's tables. plugin/list is the source of
truth when it answers.
Bug C (HERMES_HOME pytest-tempdir leak into ~/.codex/config.toml)
- _build_hermes_tools_mcp_entry() read HERMES_HOME directly from
os.environ, so a sibling pytest's monkeypatch.setenv("HERMES_HOME",
tmp_path) silently burned a transient pytest tempdir into the
user's real ~/.codex/config.toml. After pytest reaped the tempdir,
every codex-routed hermes-tools tool call failed silently.
- Derive HERMES_HOME from get_hermes_home() (the canonical resolver
that goes through the profile-aware path) and refuse to emit
obvious test-tempdir paths via _looks_like_test_tempdir() as
belt-and-suspenders for any other callsite that forgets to patch
migrate().
- test_enable_succeeds_when_codex_present in test_codex_runtime_switch.py
invoked the real migrate() (no mock), writing to Path.home() / .codex
using whatever HERMES_HOME the running pytest session had set. Add
the same migrate patch the other apply() tests already use, so the
suite stops touching the user's real ~/.codex/config.toml.
E2E verification (replicating the issue's repro):
- Pre-state config.toml with user [mcp_servers.omx_team_run] +
codex-installed [plugins."tasks@openai-curated"],
HERMES_HOME="/private/var/folders/.../pytest-of-.../..."
- On origin/main: tomllib refuses to load the result with
"Cannot declare ('plugins', 'tasks@openai-curated') twice" AND
the pytest-tempdir HERMES_HOME is burned in.
- On this branch: file parses cleanly, default_permissions is
top-level, exactly one [plugins."tasks@openai-curated"] table
inside the managed block, no HERMES_HOME in the MCP env.
7 new regression tests covering all three bugs + the test-leak guard.
`bash scripts/run_tests.sh tests/hermes_cli/test_codex_runtime_*.py` —
95 passed, 0 failed.
Closes#26250
Wrap requests.post() in create_session() for browser_use, browserbase,
and firecrawl providers with requests.RequestException handling.
Connection timeouts and DNS resolution failures now surface as clean
RuntimeError messages instead of raw requests exception tracebacks.
Browser Use managed-gateway mode preserves raw exception propagation
so the existing idempotency-key retry semantics keep working.
Closes#2746
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
When a user sends a Slack message like '/hermes ' (trailing whitespace
after the slash) the legacy subcommand router hit `text.split()[0]` with
a truthy-but-whitespace-only `text`. `' '.split()` returns `[]` →
IndexError, blowing up the slash handler before fallthrough to `/help`.
Switch to a two-step guard that materializes the parts list first and
indexes only if non-empty.
Salvaged from PR #2752 by @nidhi-singh02. The PR's other two hunks
(`tools/file_operations.py`, `agent/anthropic_adapter.py`) are
unreachable in current code — `LINTERS` is a hardcoded constant dict
with no empty values, and the anthropic version-detection site is
already guarded by a `result.stdout.strip()` truthy check — so only the
slack hunk is taken.
Closes#2745
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
Three asyncio.gather() calls in tools/web_tools.py ran without
return_exceptions=True. A single failing task (e.g. LLM rate limit on
one URL) would raise out of gather() and discard every other
successfully fetched/summarized result.
Pass return_exceptions=True and filter BaseException entries with a
warning log before unpacking. Affects:
- chunk summarization gather (large web_extract pages)
- firecrawl per-result LLM post-processing
- tavily crawl per-result LLM post-processing
Closes#2744
Replaces bare `except Exception: pass` with debug-level logging
so failures in local endpoint model discovery are diagnosable
instead of silently hidden.
Remove redundant inner `import re` and regex recompilation on every call in
_interpolate_env_vars. Add module-level _ENV_VAR_PATTERN compiled once.
Replace the separate _interpolate_value() in mcp_config.py (which used \w+
and would silently fail on env vars containing hyphens or dots) with the
shared _ENV_VAR_PATTERN from mcp_tool.py. Remove now-unused import re.
Some catalog endpoints (OpenCode Zen, etc.) sit behind a WAF that
returns 403 for the default Python-urllib/<ver> User-Agent. The
generic profile-based live fetch in providers/base.py was silently
failing for any such provider — falling through to the static catalog
and missing newly-launched models.
Set a generic 'hermes-cli/<version>' UA on the catalog probe so every
api_key provider profile benefits. Verified live against opencode-zen:
before this change, profile.fetch_models() raised HTTP 403; after, it
returns 42 models including gpt-5.5, gpt-5.5-pro, kimi-k2.6, glm-5.1
and the *-free variants the static catalog doesn't list.
Also strip the now-stale comment in validate_requested_model() claiming
opencode-zen's /models returns 404 against the HTML marketing site —
the API endpoint at /zen/v1/models returns 200 with valid JSON.
Surfaced by #2651 (@aashizpoudel) — fixes the same user-facing gap
their PR targeted, applied at the right layer so all api_key provider
profiles get live catalogs through the same code path.
Co-authored-by: Aashish Poudel <mr.aashiz@gmail.com>
Replace O(n²) string concatenation of truncated_response_prefix in the
length-continuation retry loop with a list + ''.join(). Functionally
equivalent: same partial response on early return, same prepend on
final assembly. The legacy retry path is capped at 3 iterations, so
the practical wall-clock win is small, but the new idiom matches the
rest of the codebase and removes a needless repeated allocation.
Salvaged from PR #2717 (the run_conversation portion only — trajectory
refactor dropped because it silently rewrote </tool_response> to </think>).
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
When running with --yolo, all dangerous command approvals are bypassed.
Make this state visible so users don't forget:
- Banner: '⚠ YOLO mode — all approval prompts bypassed' line in red, only
shown when YOLO is active. Default case is silent (no extra line, no
always-on 'restricted' label).
- Status bar: '⚠ YOLO' fragment appended in red (#FF4444 bold) across all
three width tiers (<52, <76, ≥76) in both the plain-text fallback and
the fragments builder.
Closes#2663
Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>
- Adds plugins/platforms/simplex docs page to the messaging sidebar
between LINE and Open WebUI.
- Maps louismichalot@hotmail.com -> Mibayy in scripts/release.py so the
attribution check on the salvage PR passes.
SimpleX Chat (https://simplex.chat) is a private, decentralised messenger
with no persistent user IDs — every contact is identified by an opaque
internal ID generated at connection time. This adds it as a Hermes
gateway platform via the plugin system.
The adapter connects to a local simplex-chat daemon via WebSocket,
listens for inbound messages, and sends replies. Originally proposed in
PR #2558 as a core-modifying integration; reshaped here as a self-
contained plugin under plugins/platforms/simplex/ with no edits to any
core file. Discovery is filesystem-based (scanned by gateway.config),
and the platform identity is resolved on demand via Platform("simplex").
Plugin contract:
- check_requirements() requires SIMPLEX_WS_URL AND the websockets package
- validate_config() / is_connected() accept env or config.yaml input
- _env_enablement() seeds PlatformConfig.extra (ws_url + home_channel)
- _standalone_send() supports out-of-process cron delivery
- interactive_setup() provides a stdin wizard for hermes gateway setup
- register() wires the adapter into the registry with required_env,
install_hint, cron_deliver_env_var, allowed_users_env, and a
platform_hint for the LLM.
Lazy dependency: the websockets Python package is imported inside the
functions that need it. The plugin is importable and discoverable even
when websockets is missing — check_requirements() simply returns False
until `pip install websockets` is run. No new pyproject extras are
introduced.
Environment variables:
SIMPLEX_WS_URL WebSocket URL of the daemon (required)
SIMPLEX_ALLOWED_USERS Comma-separated allowed contact IDs
SIMPLEX_ALLOW_ALL_USERS Set true to allow all contacts
SIMPLEX_HOME_CHANNEL Default contact for cron delivery
SIMPLEX_HOME_CHANNEL_NAME Human label for the home channel
Closes#2557.
The Zed ACP Registry path (uvx --from 'hermes-agent[acp]==X' hermes-acp)
gets a Python-only install. Browser tools depend on the agent-browser npm
package + Chromium, neither of which are in the wheel. Without an
explicit bootstrap, registry users have no path to working browser tools.
Ship a bundled, idempotent bootstrap script (Linux/macOS bash + Windows
PowerShell) inside acp_adapter/bootstrap/ as wheel package-data. New
entry points:
hermes acp --setup-browser # interactive; prompts before Chromium download
hermes acp --setup-browser --yes # non-interactive
hermes-acp --setup-browser
The terminal-auth flow (hermes acp --setup) also offers the browser
bootstrap as a follow-up after model selection, so first-run registry
users get the option without knowing the flag exists.
Key design choices:
- npm install -g --prefix $NODE_PREFIX so we never need sudo. System Node
on PATH is respected; only the install target is redirected to the
user-writable Hermes-managed Node prefix.
- tools/browser_tool.py::_browser_candidate_path_dirs() already walks
$HERMES_HOME/node/bin, so installed binaries are discovered with no
agent-side code change.
- System Chrome/Chromium detection short-circuits the ~400 MB Playwright
download when a suitable browser already exists.
- Bash + PowerShell live as ONE copy each under acp_adapter/bootstrap/.
Not duplicated under scripts/. install.sh and install.ps1 keep their
inline browser blocks for the source-checkout path.
E2E validated end-to-end:
bash bootstrap_browser_tools.sh --skip-chromium
→ installs agent-browser into ~/.hermes/node/bin/
tools.browser_tool._find_agent_browser()
→ returns the installed path
check_browser_requirements()
→ returns True (browser tools register)
Tests:
- tests/acp/test_entry.py: 11 tests covering --setup-browser dispatch
(linux + windows + --yes forwarding + failure propagation), the
terminal-auth follow-up prompt path, and a package-data wheel-shipping
assertion that catches any future pyproject.toml regression.
Docs: website/docs/user-guide/features/acp.md gains a 'Browser tools
(optional)' subsection with the two-line install + what-it-does.
Cron mutation operations (run/pause/resume/remove) and 'hermes cron edit'
now accept a job name in addition to the hex ID, with case-insensitive
matching. Before this, 'hermes cron run my_job_name' died with
'Job with ID my_job_name not found' and forced the user to look up the
hex ID first.
The original PR matched by name but silently picked the first match when
two jobs shared a name. This version refuses to act on an ambiguous name
and surfaces every matching job (id, name, schedule, next_run_at) so the
caller can pick a specific ID.
- cron/jobs.py:
- get_job() stays ID-only (preserves existing call-site semantics for
web_server/api_server/curator/scheduler/test code that always passes
real IDs).
- resolve_job_ref() is the new name-or-ID resolver, used by pause/
resume/trigger/remove_job. Exact ID match wins over a name match
even if a different job's name happens to equal that ID. Ambiguous
name match raises AmbiguousJobReference with all candidate IDs.
- tools/cronjob_tools.py: dispatch site uses resolve_job_ref, surfaces
ambiguous matches as a structured error with the matching IDs.
- hermes_cli/cron.py: 'cron edit' uses resolve_job_ref so editing by
name works and ambiguous names are reported with IDs.
- tests/cron/test_jobs.py: new TestResolveJobRef covering ID match,
case-insensitive name match, ID-wins-over-name, ambiguous refusal,
and that pause/resume/trigger/remove all refuse on ambiguity.
Closes#2627
Adds three pre-run gate recipes to the cron docs:
- file-change gate (stat + mtime + state file)
- external-flag gate (file presence)
- SQL-count gate (user's own database, not state.db)
These are the use cases @iankar8 proposed adding as a parallel
'trigger' subsystem in #2654. The existing `script` + `wakeAgent`
gate already covers all three at $0 — this lands the patterns as
documentation so users can find them, instead of adding a second
gating mechanism to the cron subsystem.
When the in-tree FAL path has no API key (and no managed gateway), the
handler used to return a bare 'FAL_KEY environment variable not set'
error. Users had no idea where to get a key, that a managed Nous
gateway exists, or that plugin-registered providers are an option.
Now `image_generate_tool` returns a structured multi-line message:
- signup link (https://fal.ai)
- managed-gateway status (if Nous tools are enabled)
- pointer to `hermes tools` / `hermes plugins list` for alternate
backends, so users on a stale `image_gen.provider` know where to look
The schema is untouched — `check_fn` still gates the tool out of the
schema when no backend is reachable at startup, consistent with every
other conditional tool. This patch fixes the call-time failure modes:
managed-gateway 5xx, plugin provider disappearing mid-session, etc.
Inspired by #2546 / @Mibayy. The PR was ~5700 commits stale against
the new plugin-aware image_gen architecture, so this is a forward port
of the actionable-error idea rather than a cherry-pick.
Closes#2543
Co-authored-by: Mibayy <mibayy@users.noreply.github.com>
After the Mini Shai-Hulud supply chain campaign (May 2026) and the litellm
compromise (March 2026), codify the dependency pinning policy that was
established in PRs #2810 and #9801 but never written down for contributors.
Changes:
- pyproject.toml: Add tight upper bounds to the 5 deps that slipped
through as review escapes from external contributor PRs:
- hindsight-client>=0.4.22,<0.5 (was >=0.4.22)
- aiosqlite>=0.20,<0.23 (was >=0.20)
- asyncpg>=0.29,<0.32 (was >=0.29)
- alibabacloud-dingtalk>=2.0.0,<3 (was >=2.0.0)
- youtube-transcript-api>=1.2.0,<2 (was >=1.2.0)
Pre-1.0 packages get <0.(current_minor+2) — tight enough to block
hostile minor releases but loose enough to not require bumps every week.
- CONTRIBUTING.md: Add 'Dependency pinning policy' section under Security
with the full rationale, table of source types + treatments, and examples.
- AGENTS.md: Add concise 'Dependency Pinning Policy' section for AI coding
agents with the decision table and step-by-step checklist.
- supply-chain-audit.yml: Add dep-bounds job that fails PRs introducing
PyPI deps without <ceiling upper bounds. Fires on pyproject.toml changes.
Posts a PR comment with the specific unbounded specs found.
Refs: #2796#2810#9801#24205
Baileys' sock.sendMessage() can hang indefinitely while uploading
media to WhatsApp servers (and, less often, on text sends), pinning
the bridge's Express handler until the gateway's aiohttp timeout
fires — surfacing to the user as a 120s wait followed by an empty
error from the TTS/voice path.
Wrap every sock.sendMessage() call inside the bridge in a
sendWithTimeout() helper that rejects after WHATSAPP_SEND_TIMEOUT_MS
(default 60s) via Promise.race. The four call sites are /send,
/edit, and /send-media's primary send. Express handlers catch the
rejection in their existing try/catch and return a real 500 to the
gateway, which can then surface a retryable error.
Salvaged from #2608 — wysie diagnosed the hang and the
Promise.race shape; the other two parts of that PR (gateway HTTP
session pooling, base.py metadata kwarg removal) already landed on
main via separate routes and are no longer needed.
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
ResponseStore.put() and .delete() now remove conversations rows that
reference evicted or deleted response IDs, preventing 404 errors when
a conversation name is reused after its backing response was purged.
Adds regression tests for delete, eviction, and handler-level reuse.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a non-Anthropic provider (e.g. Morpheus proxy) returns a 429 with
`{"error": "Too Many Requests"}` instead of the expected
`{"error": {"type": ...}}` dict, _err_body.json().get("error", {})
returns the raw string and the next .get("type") line crashes with
AttributeError, taking down the message handler.
Guard with isinstance(_err_json, dict) so non-dict error bodies fall
through to the generic rate-limit hint.
Salvaged from PR #2587 by @KiraKatana. The PR's fallback-config
`base_url`/`api_key_env` fix was already implemented independently
on main (run_agent.py:8759-8780) with additional aliases and Ollama
Cloud host handling, so only the gateway guard is cherry-picked.
Co-authored-by: KiraKatana <kira.ops@proton.me>
was_auto_reset, auto_reset_reason, and reset_had_activity were not
included in SessionEntry.to_dict() / from_dict(), so a gateway restart
between session expiry and the user's next message would silently drop
the auto-reset notification and context note.
Add the three fields to the serialization roundtrip with safe defaults
(False / None / False) so existing sessions.json files load cleanly.
Add three roundtrip tests to test_session_reset_notify.py.