Previously a packaged macOS/Linux app with no Hermes install hit a
dead-end ("first-launch install is not yet automated -- run install.sh
manually") because install.sh lacked the staged protocol install.ps1
exposes. Now both platforms bootstrap on first launch with the same
structured, per-step progress UI as Windows.
- install.sh: add --manifest / --stage / --json / --non-interactive plus
a stage dispatcher (prerequisites, repository, venv, python-deps,
node-deps, path, config, setup, gateway, complete). User-input stages
(setup, gateway) are skipped under --non-interactive; the in-app
onboarding overlay owns API keys/model, matching the Windows flow.
Each stage runs inside the install dir (its own process) and a new
--commit flag pins the checkout to the build-stamp SHA.
- bootstrap-runner.cjs: drive the staged manifest/stage/JSON protocol for
both install.ps1 (PowerShell) and install.sh (bash), selected by
installer kind; removed the single-blob POSIX shim.
- main.cjs: drop the macOS/Linux unsupported-platform dead-end so the
bootstrap-needed path runs the installer on every platform.
Two related first-launch bugs on machines with a legacy ~/.hermes:
- install.ps1 hardcoded $HermesHome/$InstallDir to %LOCALAPPDATA%\hermes
and ignored the HERMES_HOME the desktop passes through. The desktop
freezes HERMES_HOME at module load and prefers a legacy ~/.hermes when
%LOCALAPPDATA%\hermes is absent, so the installer wrote to a different
home than the shell read -> "Could not connect to Hermes gateway". Honor
$env:HERMES_HOME in the param defaults.
- isBootstrapComplete() trusted the marker + checkout without verifying a
runnable venv, so an interrupted/split install spawned a dead backend
instead of re-bootstrapping. Also require the venv python to exist.
MEDIA:<path> tags for .md/.json/.yaml/.xml/.html and other document
extensions were silently dropped. extract_media() carried a narrow
extension allowlist that omitted them, while extract_local_files()
had a broad one. The dispatch sites then ran an unconditional
re.sub(r'MEDIA:\\s*\\S+', '') that stripped the tag from the body even
when extract_media had not matched it — so extract_local_files (broad
list) ran on text where the path was already gone, and the file was
delivered by neither path.
- Add MEDIA_DELIVERY_EXTS in gateway/platforms/base.py as the single
source of truth; extract_media and extract_local_files both derive
their extension set from it (no more drift).
- Replace the loose MEDIA cleanup at the non-streaming dispatch site
(base.py) and the streaming consumer (stream_consumer.py) with the
shared, extension-anchored MEDIA_TAG_CLEANUP_RE. A MEDIA: tag with an
unknown extension is left in the body so the bare-path detector can
still pick it up instead of being black-holed.
- Chain cleaned text through extract_media -> extract_images ->
extract_local_files in run.py's post-stream media delivery (it was
dropping the cleaned text and rescanning raw text with MEDIA: tags).
- Regression tests covering both halves: previously-dropped extensions
now extract, and unknown-ext paths survive the cleanup.
Consolidates the MEDIA extension-allowlist PR cluster.
Co-authored-by: Bartok9 <259807879+Bartok9@users.noreply.github.com>
Co-authored-by: banditburai <123342691+banditburai@users.noreply.github.com>
Co-authored-by: Kyzcreig <9063726+Kyzcreig@users.noreply.github.com>
Adds tests for the echo-loop fix (outgoing X-Tags header, inbound skip
on tagged events, genuine tags pass through) and extends the tag to the
out-of-process _standalone_send() path so cron / send_message deliveries
to a self-subscribed topic are also skipped. Maps both contributors in
release.py AUTHOR_MAP.
Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
Config-driven platform policies (dm_policy / group_policy / allow_from /
group_allow_from) for WeCom, Weixin, Yuanbao, and QQBot now work without
also setting a PLATFORM_ALLOWED_USERS env var.
These adapters enforce their access policy at intake — a message is dropped
inside the adapter and never dispatched unless it already passed the policy.
The gateway's env-based check (_is_user_authorized) ran afterward and, with
no env allowlist set, fell through to an env-only default-deny — silently
rejecting `dm_policy: open` and config-only allowlists the adapter had
already authorized.
Rather than re-implement each adapter's policy a second time in run.py
(which would drift), adapters that own their gate now declare it via a new
BasePlatformAdapter.enforces_own_access_policy property (default False). The
gateway trusts that flag and skips the env-only default-deny for those
platforms. Env allowlists still take precedence when set.
Also resolves unauthorized DM behavior from config dm_policy so allowlist /
disabled policies drop unauthorized DMs silently instead of leaking pairing
codes, while an explicit pairing policy opts back in.
Co-authored-by: Frowtek <frowte3k@gmail.com>
The live harness runs against a real OpenRouter key; record['error'] is a
full traceback that, on an auth failure, could echo a request header or URL
containing the key. _redact_secrets() now masks the live OPENROUTER_API_KEY,
any sk-/sk-or- bearer token, and Authorization/Bearer headers before
final_response and error enter the transcript or the console print. Addresses
the CodeQL clear-text-storage/logging findings at the source.
Brings in the tool_search live-test harness from the original PR but leaves
out the 11 checked-in scripts/out/*.json transcript files — those are
non-deterministic model output that goes stale the moment the model changes
and were the bulk of the diff. scripts/out/ is now gitignored so a harness
run never re-commits them.
Fixes on top:
- API-key loading goes through hermes_cli.env_loader.load_hermes_dotenv
instead of hand-parsing ~/.hermes/.env and assigning the value to a local.
The canonical loader never materializes the secret in a local variable in
this module, which clears the four CodeQL high alerts
(py/clear-text-storage / py/clear-text-logging-sensitive-data at the
transcript write/print sites — they were tracing the key from the
hand-rolled parser into the records) and removes a hand-rolled parser.
- encoding='utf-8' on every write_text/read_text in both harness scripts
(Windows-footgun hygiene).
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
Reporter diagnosed three independent gaps that together allowed infinite
'unblock → re-stuck' loops with no surfacing or escalation:
GAP 1: `_rule_stuck_in_blocked` resets timer on any `commented`/`unblocked`
event, so a task that cycles every few minutes is invisible to it
regardless of how many times it cycles.
Fix: new `_rule_block_unblock_cycling` rule (`hermes_cli/kanban_diagnostics.py`)
that counts block→unblock cycles in a sliding window. Default threshold
3 cycles within 24h, configurable via `block_cycle_threshold` /
`block_cycle_window_seconds`. Walks events in arrival order (event id)
since multiple events can share the same `created_at` second. Fires as a
warning with a CLI hint to inspect the block reasons.
GAP 2: Iteration-budget-exhausted runs in kanban workers map to
`kanban_block` (status=blocked, but a clean exit from the kernel's
perspective). `_rule_repeated_failures` reads `consecutive_failures`,
which `_record_task_failure` increments only for crashed/timed_out/
spawn_failed — `blocked` outcome bypasses the failure counter, so the
`kanban.failure_limit` circuit breaker never trips on budget-exhaustion
loops.
Fix: `agent/conversation_loop.py` budget-exhaustion path now calls
`_record_task_failure(outcome="timed_out")` instead of `kanban_block`.
Budget exhaustion is genuinely a timeout-shaped failure (the task ran out
of allowed iterations), so this is more honest semantics; it also routes
through the unified failure counter, so repeated budget exhaustions trip
the circuit breaker and the task auto-blocks with `gave_up` after
`failure_limit` retries.
GAP 3: `release_stale_claims` uses `_pid_alive(worker_pid)` only and
ignores `last_heartbeat_at`. Reporter observed a 91-min run that held
its claim with frozen heartbeat because the worker entered a logic loop
with no tool calls — `_pid_alive` kept returning True so the claim was
extended every 15 minutes indefinitely.
Fix: heartbeat-stale backstop. If `last_heartbeat_at` is set AND older
than `DEFAULT_CLAIM_HEARTBEAT_MAX_STALE_SECONDS` (default 1h), reclaim
even if the PID is alive. NULL `last_heartbeat_at` preserves backward
compatibility (no heartbeat yet = extend, as before). The reclaim event
payload now includes a `heartbeat_stale` boolean so operators see why a
live-PID worker was reclaimed.
This works cleanly in concert with PR #34418 (#31752 runtime → heartbeat
bridge): once `_touch_activity` keeps `last_heartbeat_at` fresh as a
side effect of normal API traffic, the backstop only fires for genuinely
wedged workers (no chunks, no tool results, no progress at all).
Co-authored-by: baofuen <45189813+baofuen@users.noreply.github.com>
The dispatcher watchdog (release_stale_claims) reads tasks.last_heartbeat_at
to decide whether to reclaim a running task. The agent maintains its own
in-process `_last_activity_ts` for every chunk/tool result, but those
liveness ticks never reach the board unless the model explicitly calls
the `kanban_heartbeat` tool — so a worker actively executing a long run
without tool-level heartbeats can be reclaimed mid-flight as 'stale',
returning the task to ready and orphaning the in-flight worker's progress.
Fix: in `_touch_activity` (the canonical 'we just did work' hook in
run_agent.py), call a new `heartbeat_current_worker_from_env` helper
in `tools/kanban_tools.py` that:
- No-ops outside dispatcher-spawned worker context (no HERMES_KANBAN_TASK).
- Rate-limited to one DB write per 60s (runtime activity ticks too often
to faithfully mirror; we just need the watchdog to see liveness).
- Best-effort: never raises. heartbeat_claim + heartbeat_worker calls are
individually try/except'd; any DB error logs at debug and returns.
- Uses worker env identity: HERMES_KANBAN_TASK + HERMES_KANBAN_RUN_ID +
HERMES_KANBAN_CLAIM_LOCK (all pinned by the dispatcher at spawn time).
- No durable note on auto-heartbeats — that's reserved for the explicit
`kanban_heartbeat` tool which carries a model-supplied note.
The explicit `kanban_heartbeat` tool stays available unchanged for
workers that want to attach a note or pre-emptively extend a claim
across a known-long single tool call.
Co-authored-by: faisfamilytravel <223516181+faisfamilytravel@users.noreply.github.com>
Kanban workers run headless — no live user is on the other side of `clarify`,
so the call times out (~120s default) and the task sits silently in `running`
with no signal to the operator that input is needed. Reporter observed a real
incident where a worker asked 'promote to production, or check staging first?'
via clarify, the call timed out, the agent hallucinated a fallback, and the
task sat 'running' for hours.
Fix: explicit 'do not call clarify' bullet in two surfaces every kanban worker
sees —
- `agent/prompt_builder.py` KANBAN_GUIDANCE `## Do NOT` section (auto-injected
into every dispatcher-spawned worker run).
- `skills/devops/kanban-worker/SKILL.md` `## Do NOT` section (the bundled
worker skill).
Both point at the right pattern: `kanban_comment` (context) + `kanban_block`
(decision needed) — the task surfaces on the board as blocked, the operator
sees it, unblocks with their answer in a comment, and the worker respawns
with the thread.
Co-authored-by: kweiner <17778+kweiner@users.noreply.github.com>
`hermes kanban unblock <id> review-required: ...` parsed every trailing word
as another task_id (since `task_ids` is `nargs='+'`), then quietly failed on
each non-existent id with "cannot unblock review-required: (not blocked/scheduled?)".
Reporter saw this as asymmetric with `block <id> <reason...>` which accepts
positional reason words.
Fix: add a `--reason "..."` flag that, when provided, is appended as a
`UNBLOCK: <reason>` comment before the unblock transition. Bulk syntax
(`unblock t_a t_b t_c`) is preserved unchanged.
Co-authored-by: julio-cloudvisor <211828103+julio-cloudvisor@users.noreply.github.com>
The xAI tool-schema sanitizers (strip_slash_enum, strip_pattern_and_format)
mutate their input in place — that's their documented contract. The two
call sites (chat_completion_helpers.build_api_kwargs and the auxiliary
client) were passing agent.tools straight through, so the first xAI
request would permanently strip slash-containing enum constraints and
pattern/format keywords from the per-agent tool registry.
Effect: any subsequent non-xAI call from the same agent (auxiliary task
routed to Anthropic, OpenRouter fallback, mid-session model switch) saw
the already-stripped schema with no way for the user to notice from
their config.
Fix: deepcopy tools_for_api before sanitizing at both call sites.
The slash-enum bug itself (xAI 400ing on enums with '/') was fixed
earlier by #32443 (Nami4D) — that PR landed the strip but used the
sanitizers directly without copying. This salvages #27907's correctness
contribution (the deepcopy) while skipping its redundant parallel
sanitizer (strip_xai_incompatible_enum_values is functionally
equivalent to the existing strip_slash_enum) and its preflight-
neutrality argument (we chose model-gated preflight in #32443).
3 new tests in tests/run_agent/test_run_agent_codex_responses.py:
- strips_slash_enum_from_outgoing_request — outgoing kwargs has no
slash-containing enum values (functional contract preserved).
- does_not_mutate_agent_tools — headline #27907 regression. Snapshot
agent.tools before build_api_kwargs, assert it survives intact
after. Pre-fix this assertion would have caught the mutation.
- is_idempotent_across_repeated_calls — three xAI requests in a row
each strip cleanly AND don't progressively erode the source schema.
344/344 across tests/agent/test_auxiliary_client.py,
tests/agent/transports/test_codex_transport.py,
tests/run_agent/test_run_agent_codex_responses.py, and
tests/tools/test_schema_sanitizer.py.
Co-authored-by: Gabor Barany <barany.gabor@gmail.com>
Remove unused imports (F401) and duplicate/shadowed import
redefinitions (F811) across the codebase using ruff's safe
autofixes. No behavioral changes -- imports only.
- ~1400 safe autofixes applied across 644 files (net -1072 lines)
- __init__.py re-exports preserved (excluded from F401 removal so
public re-export surfaces stay intact)
- Re-exports that are imported or monkeypatched by tests but look
unused in their defining module are kept with explicit # noqa:
F401 (gateway/run.py load_dotenv; run_agent re-exports from
agent.message_sanitization, agent.context_compressor,
agent.retry_utils, agent.prompt_builder, agent.process_bootstrap,
agent.codex_responses_adapter)
- Unsafe F841 (unused-variable) fixes deliberately skipped -- those
can change behavior when the RHS has side effects
- ruff lints remain disabled in pyproject.toml (only PLW1514 is
selected); this is a one-time cleanup, not a config change
Verification:
- python -m compileall: clean
- pytest --collect-only: all 27161 tests collect (zero import errors)
- core entry points import clean (run_agent, model_tools, cli,
toolsets, hermes_state, batch_runner, gateway)
- static scan: every name any test imports directly from an edited
module still resolves
The unpacked Hermes.exe showed the stock Electron icon + name in the
taskbar because build.win.signAndEditExecutable=false disables BOTH
electron-builder's signing AND its rcedit metadata/icon stamping. That
flag is load-bearing: enabling it re-triggers signtool -> winCodeSign,
whose macOS symlinks crash 7-Zip on non-admin Windows (unfixable dead end).
Decouple identity-stamping from signing entirely: after npm run pack,
run rcedit ourselves on the produced exe.
- Add rcedit as a direct devDependency of apps/desktop (the transitive
electron-winstaller copy is fragile).
- apps/desktop/scripts/set-exe-identity.cjs: Node helper that calls
rcedit's named export to set icon + ProductName/FileDescription/
CompanyName. Node builds argv natively — avoids the PowerShell->exe
->JSON double-escaping that broke the app-builder rcedit path.
- install.ps1 Set-DesktopExeIdentity invokes the script after the build,
before shortcuts. Best-effort: failure keeps the stock icon, never
fails the install. rcedit is a pure PE editor — no signtool, no
winCodeSign, no symlinks.
Verified locally: stamping a copy of the built Hermes.exe embeds the
32x32 icon and sets ProductName=Hermes.
Also fix update-path success-screen flash: in update mode the installer
hands off + exits in ~600ms, so don't route to the 'launch Hermes'
success view (it flashed before the window closed).
Converge update on the same principle as bootstrap: one driver owns all
repo mutation. The desktop becomes a pure consumer that hands off to
Hermes-Setup.exe --update instead of re-implementing git/pip in Electron.
- hermes desktop --build-only: build without launching, so the installer
owns the post-update launch (CLI keeps build logic single-sourced).
- Installer AppMode {Install,Update} from argv; get_mode exposed to the UI.
- Installer self-copies to HERMES_HOME/hermes-setup.exe on install success
(no-op guard during --update re-invocation to avoid the locked-exe copy).
- Installer --update flow (update.rs): wait for the desktop to release the
venv shim, run 'hermes update --yes --gateway' (branch on exit 0/2/other),
then 'hermes desktop --build-only', then launch the rebuilt desktop. Reuses
the bootstrap event channel + progress UI via a synthetic two-stage manifest.
- Desktop applyUpdates() gutted (~105 lines of git/stash/pull/pyproject/pip
removed) -> thin handoff: spawn updater, app.quit() to free the shim.
Detection (checkUpdates, commit changelog, behind-count) kept intact.
- install.ps1 creates Start Menu + Desktop shortcuts to the packed Hermes.exe
(never bare 'hermes desktop', which would rebuild every launch).
The skills.sh source was returning ~858 unique skills from a hardcoded
list of 28 popular keyword searches (each capped at 50 results). The
real catalog is ~20k — exposed via sitemap-skills-{1,2}.xml linked from
the site's sitemap index.
Switch the empty-query path in SkillsShSource.search() to walk the
sitemap instead of scraping the homepage's curated featured strip.
Falls back to the homepage scrape if the sitemap is unreachable.
build_skills_index.crawl_skills_sh() now just calls search("", limit=0)
instead of running 28 keyword searches — same result in one HTTP round
instead of 28.
Also handle a httpx + brotlicffi interaction: the per-skill sitemaps
are ~900 KB brotli-compressed and the cffi backend's streaming decode
chokes on them. Forcing Accept-Encoding to gzip dodges the bug without
requiring a brotli library upgrade.
E2E against live skills.sh: 19,932 unique skills walked in 0.7s.
Tests: 137 pass (+1 new regression test exercising the sitemap path).
Floor for skills.sh raised 100 → 10,000 in EXPECTED_FLOORS so a future
regression hard-fails the build.
The desktop app's main.cjs resolver ladder has a 'bootstrap-needed' rung
that fires when .hermes-bootstrap-complete is missing from
ACTIVE_HERMES_ROOT. Pre-Hermes-Setup, this marker was written by the
packaged-desktop's own bootstrap-runner.cjs at the end of its install
flow. Now that Hermes-Setup.exe runs install.ps1 directly, install.ps1
needs to own the marker — otherwise the desktop sees no marker on first
launch and triggers its legacy first-launch bootstrap (re-running
install.ps1 from inside Electron, the exact recursion Hermes-Setup.exe
was supposed to obviate).
Implementation:
* New Stage-BootstrapMarker (worker) → Write-BootstrapMarker (helper)
* Slotted in the manifest right after platform-sdks, before the
interactive configure/gateway stages, so it runs unconditionally
when the install reaches the finalize phase
* Schema mirrors apps/desktop/electron/main.cjs writeBootstrapMarker /
isBootstrapComplete EXACTLY: {schemaVersion: 1, pinnedCommit,
pinnedBranch, completedAt}. Schema version stays at 1 so old
desktops that read marker files written by future install.ps1s
can still parse them.
* pinnedCommit comes from -Commit flag (Hermes-Setup.exe passes it)
or falls back to 'git rev-parse HEAD' in InstallDir
* pinnedBranch from -Branch flag, defaults to 'main' matching
install.ps1's own param default
Two PS-5.1 gotchas baked into comments:
* The ?. null-conditional operator doesn't exist pre-PS7; use
explicit if-checks on Get-Command results
* Set-Content -Encoding UTF8 emits a BOM in 5.1 and Node's plain
JSON.parse rejects BOM — write via .NET's UTF8Encoding(false)
to produce BOM-less JSON the desktop's readJson() can parse
VM run 5 diagnosis: the pre-extract from 3b29e65c1 ran (extracted 83
files, 24MB) but produced ZERO files at the expected sentinel path
'/winCodeSign-2.6.0/windows-10/x64/signtool.exe'.
Cause: the .7z archive's root entries are 'windows-10/', 'darwin/',
'linux/', etc. — not 'winCodeSign-2.6.0/<arch>'. Extracting with
'-o$cacheRoot' put files at $cacheRoot/windows-10/..., NOT at
$cacheRoot/winCodeSign-2.6.0/windows-10/.... I had the directory
nesting wrong from the start.
And then we observed: electron-builder downloads winCodeSign-2.6.0.7z
under a random numeric filename ('384387955.7z') regardless of what's
already extracted in the parent dir. The cache key isn't the dirname;
it's content-addressed. So the pre-extract approach was doomed even
if the path nesting had been right.
Actual fix: signtoolOptions.sign=null in apps/desktop/package.json's
win build config. electron-builder honors this and skips the bundled-
prebuild signing entirely — no signtool invocation, no winCodeSign
fetch, no symlink-privilege crash. The previous failures all stemmed
from electron-builder pre-signing node-pty's bundled .exes
(winpty-agent.exe, OpenConsole.exe) which are already author-signed
upstream; re-signing with our nonexistent cert was overwriting good
sigs with nothing useful anyway.
Cost: when we DO get a real cert later, we'll add it back with the
sign function pointing at the cert chain. Until then, all-null is
the correct config and unblocks every non-admin Windows user.
Removed Initialize-ElectronBuilderCache (the dead pre-extract).
Removed the call site. Kept the CSC_IDENTITY_AUTO_DISCOVERY env
vars as belt-and-suspenders against a future electron-builder
change that might revive cert auto-discovery.
VM run 4 diagnosis: even with CSC_IDENTITY_AUTO_DISCOVERY=false set,
electron-builder still fetches winCodeSign and signs bundled binaries.
The log shows the signing happens BEFORE the cache extraction:
• signing with signtool.exe ...\winpty-agent.exe
• signing with signtool.exe ...\OpenConsole.exe
• downloading winCodeSign-2.6.0.7z
• <symlink privilege error>
Cause: node-pty's bundled prebuilds are listed in apps/desktop's
asarUnpack ['**/*.node', '**/prebuilds/**']. electron-builder
re-signs anything unpacked from asar, regardless of whether OUR
binary gets signed. The signtool invocation needs winCodeSign on
disk, which needs the .7z extracted, which hits the macOS-symlink
crash on non-admin Windows.
The CSC env vars I added in d5fe46727 only kill IDENTITY DISCOVERY
(so OUR Hermes.exe stays unsigned, which is fine — we have no cert).
They don't prevent the toolchain fetch for the bundled-prebuild
re-sign. I removed the pre-extract in d5fe46727 thinking the env
vars subsumed it; that was wrong. Both are needed.
Restoring Initialize-ElectronBuilderCache verbatim from c7e46f9f3
and keeping the CSC env vars. Wrote a clearer doc-comment at the
call site explaining the two-knob interaction so future maintainers
don't drop one half again.
VM run 3 diagnosis: node-deps stage skipped on the VM (logged
'Skipping Node.js dependencies (Node not installed)') and then
desktop's npm install failed with exit 1 and zero diagnostic detail.
Two root causes:
1. $HasNode false-skip in Stage-NodeDeps — same cross-process bug
pattern we fixed for Stage-Desktop in c7e46f9f3. Stage-Node ran
in process A and set $script:HasNode = $true, then exited. Stage-
NodeDeps ran in fresh process B (Hermes-Setup.exe -Stage NAME
spawns each stage independently), where that variable doesn't
exist. Re-probe via Get-Command npm instead of trusting the
stale script-scope global. The previous stage already verified
Node so the re-probe succeeds.
2. npm install --silent + Tee to TEMP file hid the real error.
When the workspace install failed on the VM, the actual reason
was buffered in $env:TEMP\hermes-npm-desktop-install-*.log and
the user saw only 'exit 1'. Drop --silent so npm streams its
full output, drop the TEMP-file dance — the Tauri installer's
streaming sink already tees every stdout/stderr line to the
rolling bootstrap-installer.log, so a side log file is dead
weight that hides the very error we need.
After this, the bootstrap log on a failure will contain npm's full
output (deprecation warnings, ETARGET, native-module compile errors,
whatever) tagged with stage=desktop, making the actual cause
diagnosable instead of an opaque exit code.
The previous commit (c7e46f9f3) worked around the winCodeSign-symlinks-
on-Windows extraction crash by pre-extracting the archive ourselves with
-snl + -x!darwin. That fix was correct but addressed the wrong layer.
The deeper question: why was electron-builder fetching winCodeSign at all
when we have no signing cert configured? Answer: electron-builder
unconditionally pre-warms the toolchain assuming any build MIGHT sign.
The cert auto-discovery never finds anything (we never set CSC_LINK
or anything else), so the signing never happens — but the 100MB fetch
of winCodeSign and its broken-on-Windows symlink extraction does.
Set CSC_IDENTITY_AUTO_DISCOVERY=false (with WIN_CSC_LINK and
WIN_CSC_KEY_PASSWORD also explicitly cleared as belt-and-suspenders)
before invoking npm run pack, and electron-builder skips the entire
winCodeSign apparatus. No download, no extraction, no privilege check.
Env vars are saved/restored around the invocation so we don't leak
the override into Stage-PlatformSdks etc.
Net: removes the 100-line Initialize-ElectronBuilderCache helper that
manually downloaded + extracted winCodeSign-2.6.0.7z. Replaced with
3 env-var assignments. The produced Hermes.exe is functionally
identical — just no longer carries a code-signing-machinery dependency
we never used.
* fix(agent): fallback immediately on provider content-policy blocks
Provider safety-filter refusals (e.g. OpenAI Codex 'flagged for possible
cybersecurity risk', OpenAI moderation 'violates our usage policies',
Anthropic safety-system rejections, Azure content_filter) are
deterministic decisions about a specific prompt. Retrying the same
prompt up to api_max_retries times just reproduces the same refusal and
burns paid attempts before surfacing the generic 'API failed after 3
retries — <provider message>' to Telegram / cron with no indication that
the failure came from the model provider rather than Hermes itself.
Classify these as a new FailoverReason.content_policy_blocked
(non-retryable, should_fallback=True) and route them through the
existing is_client_error path so the loop:
- skips the 3x retry backoff
- activates a configured fallback model immediately
- emits a clear provider-safety message to the user (not the generic
'Non-retryable error (HTTP None)') and surfaces actionable guidance
when no fallback is configured (rephrase, narrow context, or set
fallback_model in hermes config)
- returns a final_response that explicitly tells the user this came
from the model provider, so gateway delivery is unambiguous and
cron last_status reflects the safety block rather than a vague
'agent reported failure'
Patterns are intentionally narrow — verbatim refusal phrasings keyed to
specific provider safety pipelines, not generic words like 'policy' or
'violation' that would collide with billing / format / auth errors.
Regression guards in test_18028_content_policy_blocked.py verify
billing 402s, generic 400s, and OpenRouter account-level
provider_policy_blocked remain distinct classifications.
Salvaged from #18164 onto current main (file restructure: loop logic
moved from run_agent.py to agent/conversation_loop.py, _emit_status →
_buffer_status), broadened patterns beyond the original OpenAI Codex
cybersecurity case to cover OpenAI moderation, Anthropic safety system,
and Azure content_filter; added user-actionable guidance and a clear
final_response so cron/gateway surfaces the policy block instead of a
generic non-retryable error, and added a regression-guard test module
mirroring the is_client_error predicate.
Addresses #18028.
Co-authored-by: Kuan-Chieh Huang <kchuang1015@users.noreply.github.com>
* chore: add kchuang1015 to AUTHOR_MAP
---------
Co-authored-by: Kuan-Chieh Huang <kchuang1015@users.noreply.github.com>
Discord threads where the bot has already participated bypass mention gating by default, but the backfill check was still tied to the mention-needed condition. That meant follow-up thread messages could trigger a response without providing recent thread history to the session.
Run history backfill for thread messages whenever backfill is enabled, while keeping DMs skipped and channel mention backfill behavior unchanged. Add a regression test for a known thread follow-up without an explicit mention.
Fixes#33666
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(skills): pull full ClawHub catalog into the skills index
The website was showing 200 ClawHub skills out of 20k+ because
`ClawHubSource.search("")` for empty queries went straight to a single
unpaginated request. ClawHub's API caps any single page at 200 items and
returns a `nextCursor`; we grabbed page 1 and stopped, so the cached
index served from hermes-agent.nousresearch.com had a silent 99%
truncation.
End users never hit clawhub.ai directly (the index is rebuilt twice
daily by .github/workflows/skills-index.yml and served as a static JSON
on the docs site), so the cap-and-cache architecture is correct — it
just wasn't being filled.
Changes:
- `ClawHubSource.search(query="")` now routes through the existing
`_load_catalog_index()` paginating walker instead of the unpaginated
listing fallback (non-empty queries still hit the fast catalog search).
- `_load_catalog_index()` max_pages 50 → 250 (50k-skill ceiling; live
catalog is ~20k as of May 2026, with headroom for growth).
- `build_skills_index.py`: per-source crawl limits split out — ClawHub
and LobeHub get 100k, others keep their effective caps.
- `EXPECTED_FLOORS["clawhub"]` 50 → 5000 so the next pagination
regression hard-fails the CI build instead of silently shipping a
degenerate index.
Test plan:
- New unit test `test_search_empty_query_paginates_full_catalog`
exercises the cursor-following path with three mocked pages (450
total items) and asserts all pages are walked.
- Existing 9 ClawHub tests + 127 broader skills_hub tests all pass.
- E2E against live ClawHub API: walker reached 9700+ skills across 49
pages before this commit landed, paginating well past the previous
50-page cap.
* fix(skills): raise ClawHub ceilings — live catalog is 50k, not 20k
E2E walk against live ClawHub API hit my initial 250-page cap at 49,698
skills with cursor=yes still pending. The catalog is roughly 2.5x larger
than the docstring estimate.
- max_pages 250 → 750 (150k ceiling, walks terminate on cursor=None
well before this in practice)
- SOURCE_LIMITS['clawhub'] 100k → 200k
- EXPECTED_FLOORS['clawhub'] 5000 → 20000
Two bugs caught in the second VM end-to-end run:
1. electron-builder's winCodeSign extraction fails on grandma-class
Windows boxes because the .7z archive contains macOS symlinks
(darwin/10.12/lib/libcrypto.dylib and libssl.dylib pointing at
versioned siblings). Creating symlinks on Windows requires
SeCreateSymbolicLinkPrivilege, a per-user right that non-admin
accounts don't have on stock Windows. Result: every fresh install
on a non-admin user fails Stage-Desktop with a 7-Zip 'cannot create
symbolic link' error, retried four times, then bails.
Fix: Initialize-ElectronBuilderCache pre-extracts winCodeSign-2.6.0.7z
ourselves with -snl (don't preserve symlinks, store as resolved file
content) AND -x!darwin (skip the entire macOS subtree — irrelevant
on Windows). Writes to electron-builder's expected cache dir before
electron-builder gets a chance to try its own broken extraction.
Idempotent — fast-paths via signtool.exe sentinel check.
2. Install-Desktop's first guard was 'if (-not $HasNode) skip'.
$HasNode is set by Stage-Node into $script:HasNode, but in
cross-process driver mode (each -Stage NAME is a fresh powershell.exe
spawned by Hermes-Setup.exe), that script-scope variable from the
PREVIOUS process is invisible — so the guard always fired and
Install-Desktop returned in 900ms with a misleading
'Node.js not available' reason. The real npm probe below it never
got to run. Fix: re-probe npm directly via Get-Command when $HasNode
is empty/false, since by that point Stage-Node has already verified
Node is installed and the only question is whether *this* process
can see it on PATH (it can — installer-wide PATH update from Stage-Node).
The new Hermes-Setup.exe (Tauri bootstrap installer) passes -IncludeDesktop
so users who install via the GUI end up with a launchable Hermes.exe at
apps/desktop/release/<os>-unpacked/. Existing flows are unchanged:
* The 'irm install.ps1 | iex' CLI one-liner omits the flag — terminal
users don't need a prebuilt desktop binary; 'hermes desktop' builds
on demand.
* The Electron desktop's bootstrap-runner.cjs also omits the flag —
rebuilding apps/desktop from inside a running Hermes.exe would try
to overwrite the live binary on disk and fail.
Stage-Desktop runs after Stage-NodeDeps so workspace npm is already
installed when electron-builder fires. It does:
1. 'npm install' at repo root so apps/* workspaces resolve their deps
(Electron itself arrives via npm here, ~150MB)
2. 'npm run pack' in apps/desktop (tsc + vite + electron-builder --dir)
3. Probes apps/desktop/release/{win-unpacked,win-arm64-unpacked}/Hermes.exe
The --dir mode produces an unpacked launchable binary without an NSIS/MSI
installer artifact — we don't need one because Hermes-Setup.exe spawns the
unpacked binary directly via launch_hermes_desktop.
The reaper hoist in the prior commit adds an extra
`asyncio.to_thread(_kb.reap_worker_zombies)` call at the top of every
dispatcher tick (before the per-board work). The existing
`test_gateway_dispatcher_disables_corrupt_board_without_traceback`
mocks `to_thread` with a 4-call cap that previously matched 2 full
dispatch ticks. With the reaper hoist each tick is now 3
`to_thread` calls instead of 2, so the cap is raised to 6 to preserve
the same number of dispatch ticks. The `connect == 5` assertion is
unchanged.
Also add the contributor's `steveonjava@gmail.com` to AUTHOR_MAP
alongside `steve@steveonjava.com` so contributor-audit passes for
both identities used across the salvaged commits.
Salvage follow-up for PR #32857.
Reaper now runs at the top of every dispatcher tick regardless of per-board connect() failures. Previously the reaper sat inside dispatch_once after the kanban_db.connect() call — any EIO during connect would skip reaping for that tick, accumulating zombie workers and stale claim_lock rows.
Also: reap_worker_zombies now returns the list of reaped pids (the dispatcher logs them) and a test indentation fix.
Squashes three sibling commits from PR #32301 into one logical change for batch review.
When installing as root on Linux with the default FHS layout
(/usr/local/lib/hermes-agent), `uv python install` placed the managed
Python under /root/.local/share/uv/python/, which non-root users cannot
traverse. The shared /usr/local/bin/hermes wrapper then failed for them
with "bad interpreter: Permission denied" when execing the venv python.
Export UV_PYTHON_INSTALL_DIR and UV_PYTHON_BIN_DIR to /usr/local/share/uv/
in the root-FHS branch of resolve_install_layout so the managed Python
is world-readable and the shared wrapper works for any user.
Closes#21457