Commit graph

9948 commits

Author SHA1 Message Date
Niels Kaspers
6891e05e78 docs: fix session recap image baseUrl 2026-05-29 12:06:22 -07:00
hllqkb
0673638560 fix(docs): correct GitHub org links in memory-providers.md
hermes-ai/hermes-agent → NousResearch/hermes-agent (2 occurrences).
The old org name leads to 404 pages.
2026-05-29 12:06:22 -07:00
Hashclaw
ae9dfa510e docs: fix separate typo; hyphenate built-in trust wording
- ACL LaTeX template comment: seperate -> separate
- CONTRIBUTING and docs site: builtin trust -> built-in trust (prose/table cells)

Made-with: Cursor
2026-05-29 12:06:22 -07:00
kshitij
7379f17556
fix(gateway): only fire planned-stop watcher for self-targeting markers + fix Windows consume (#34749)
* fix(gateway): only fire planned-stop watcher for markers targeting self

Salvaged from #34599 — rebased onto current main.

The planned-stop watcher now only fires shutdown for a marker that targets
the current process, instead of any marker that exists on disk. Fixes the
Windows crash loop (#34597) where a stale marker from a previous Gateway
instance kills a freshly booted Gateway ~400ms after start with a false
"Received UNKNOWN — initiating shutdown".

Co-authored-by: Bartok9 <danielrpike9@gmail.com>

* fix(gateway): match planned-stop/takeover markers by PID alone when start_time is unavailable

Follow-up to the #34599 salvage. The watcher's non-destructive probe
(planned_stop_marker_targets_self) already falls back to PID equality when
a process start_time is unavailable, but the authoritative consume it gates
(_consume_pid_marker_for_self) still required a non-None start_time match.

_get_process_start_time reads /proc/<pid>/stat and returns None on macOS and
native Windows — the only platform the planned-stop watcher exists for. So on
Windows the probe would fire the shutdown handler (PID matches) but the
handler's consume_planned_stop_marker_for_self() would return False, and a
legitimate 'hermes gateway stop' was still misclassified as an unexpected
UNKNOWN exit (exit 1) and revived by the service manager — a residual half of
the #34597 crash loop on the legitimate-stop path.

Align the consume with the probe: when both start_times are known they must
match (PID-reuse guard preserved on Linux); when either is unavailable, fall
back to PID equality alone, bounded by the existing short marker TTL. This
also fixes the parallel --replace takeover consume on Windows, which shares
the same helper.

Adds regression tests for the Windows (None start_time) path, the foreign-PID
rejection under that fallback, and confirmation the start_time-mismatch guard
still rejects when both are known.

---------

Co-authored-by: Bartok9 <danielrpike9@gmail.com>
2026-05-29 17:36:58 +00:00
alt-glitch
0563ab0652 fix(test): add fal_client.submit stub to surface matrix test
The plugin switched from fal_client.subscribe() to submit()+handle.get().
The test mock only had subscribe, causing CI failures.
2026-05-29 22:26:24 +05:30
alt-glitch
e46e4bcf47 fix(video_gen): parse duration suffix in success_response
int(payload["duration"]) blows up on "4s" (veo3.1 format).
Strip non-digit chars before int conversion in the response builder.
2026-05-29 22:26:24 +05:30
alt-glitch
3183b2e28c fix(video_gen): veo3.1 duration format and 4k resolution
FAL veo3.1 API expects duration as "4s"/"6s"/"8s" (with unit suffix),
not bare "4"/"6"/"8" like other families. Add per-family duration_suffix
field and apply it in _build_payload. Also add "4k" to veo3.1 resolutions
per FAL API docs.

Note: the managed gateway currently rejects the "4s" format (expects
integer duration). Gateway-side fix needed for veo3.1 to work through
the Nous subscription path.
2026-05-29 22:26:24 +05:30
alt-glitch
a4c18f65d4 feat(video_gen): wire Nous subscription override into hermes tools UX
Add the same managed-gateway UX that image_gen already has:

- TOOL_CATEGORIES['video_gen'] gets a 'Nous Subscription' provider row
  with managed_nous_feature='video_gen' + video_gen_plugin_name='fal'
- NousSubscriptionFeatures gains a video_gen property + feature state
  computation (managed/active/available using the fal-queue gateway)
- _GATEWAY_TOOL_LABELS, _GATEWAY_DIRECT_LABELS, _ALL_GATEWAY_KEYS,
  _get_gateway_direct_credentials, opted_in all include video_gen
- apply_nous_managed_defaults and apply_gateway_defaults handle video_gen
- _is_toolset_satisfied checks Nous features for video_gen
- _is_provider_active detects managed video_gen (use_gateway + fal provider)
- _select_plugin_video_gen_provider accepts use_gateway kwarg, propagated
  from all 4 call sites in _configure_provider when managed_feature is set
- hermes setup status shows 'Video Generation (FAL via Nous subscription)'

Users on a Nous subscription can now pick 'Nous Subscription' under
hermes tools → Video Generation, which sets video_gen.provider=fal +
video_gen.use_gateway=true. The FAL plugin's _resolve_managed_fal_video_gateway
then routes through the managed queue gateway — no FAL_KEY needed.
2026-05-29 22:26:24 +05:30
alt-glitch
b6294ea9f1 test(video_gen): cover gateway decision matrix gaps and 4xx error path
- Add test for 4xx ValueError with actionable remediation message
- Add test for is_available() returning True via managed gateway
- Add test for prefers_gateway overriding direct FAL_KEY
- Add test for is_available() via gateway in plugin test file
2026-05-29 22:26:24 +05:30
alt-glitch
d04b3c193e feat(video_gen): route FAL video gen through managed Nous gateway
Wire plugins/video_gen/fal/__init__.py to use the same
_ManagedFalSyncClient pattern that image gen already uses.

Changes:
- Add managed gateway resolution, client caching, and
  _submit_fal_video_request() that routes between direct FAL_KEY
  and Nous gateway modes
- Update is_available() to return True when either FAL_KEY or the
  managed gateway is reachable
- Update generate() to use submit+get handle pattern instead of
  fal_client.subscribe() directly
- Fix happy-horse endpoint namespace: fal-ai/ → alibaba/ (matches
  the tool-gateway allowlist from fal-video-gen branch)
- Surface actionable error on 4xx gateway rejections

Tests:
- 4 new tests in test_managed_media_gateways.py (gateway routing,
  client reuse, direct mode fallback, alibaba namespace)
- Updated existing test_fal_plugin.py fixture to use submit/handle
  pattern and patch _resolve_managed_fal_video_gateway for isolation
2026-05-29 22:26:24 +05:30
kshitijk4poor
5cd0673217 ci: harden supply-chain gate jobs against changes-job failure
The scan-gate / dep-bounds-gate jobs use needs.changes; if the changes
job itself fails, its dependents would be skipped via a failed dependency
(not a conditional skip), leaving the required check unreported — the same
"pending forever" failure this PR fixes. Add always() and switch the gate
condition from == 'false' to != 'true' so the gate still fires (and reports
SUCCESS) when changes fails and its output is empty.
2026-05-29 09:17:01 -07:00
ethernet
6bc309baf2 ci: ensure required checks always report status
Remove paths filters from contributor-check and supply-chain-audit
workflows. When no matching files changed, the workflows never ran and
the required checks (check-attribution, supply chain scan, dep bounds)
stayed "pending" forever, blocking merge.

Now both workflows always trigger. A path-check step/job determines
whether the real work should run; gate jobs with matching names report
success when the real job was skipped, so branch protection always
gets a check status.

Also fixes dep-bounds: the old condition
  if: contains(github.event.pull_request.changed_files_url, 'pyproject.toml') || true
was always true (the || true made it unconditional). Now uses the
proper changes.deps output from the shared filter job.
2026-05-29 09:17:01 -07:00
ethernet
6928692cec
Merge pull request #33773 from dvir-pashut/fix/nix-full-drop-stale-vercel-group
fix(nix): drop stale "vercel" group from #full variant
2026-05-29 11:16:25 -04:00
teknium1
75cd420b3b docs(skills): move antigravity-cli to autonomous-ai-agents in catalog + sidebar 2026-05-29 05:21:48 -07:00
teknium1
78d7fa1b5c refactor(skills/antigravity-cli): move to autonomous-ai-agents (it's an AI agent CLI) 2026-05-29 05:21:48 -07:00
teknium1
904c0b479b refactor(state): return FTS index count from vacuum()
Have vacuum() return optimize_fts()'s count so the CLI 'sessions optimize'
summary uses the real merged-index count instead of probing the private
_FTS_TABLES / _fts_table_exists() members.
2026-05-29 05:09:56 -07:00
kshitijk4poor
38695254f8 perf(state): merge FTS5 segments on VACUUM + add 'hermes sessions optimize'
The FTS5 indexes (messages_fts, messages_fts_trigram) grow as a series of
incremental b-tree segments — one per trigger-driven insert batch. SQLite's
automerge caps at ~16 segments, so a long-lived store keeps scanning many
segments per MATCH and never collapses them unless the special 'optimize'
command runs. Nothing in the codebase ever ran it: vacuum() only fired after
a prune that deleted rows, and even then never merged FTS segments.

Changes:
- SessionDB.optimize_fts(): merges each FTS5 index to a single segment,
  probing for the (optional/lazy) trigram table first so it is safe to call
  unconditionally. Layout-only — search results and snippet() are unchanged.
- vacuum() now calls optimize_fts() before VACUUM so freed index pages are
  returned to the OS in the same pass.
- 'hermes sessions optimize' CLI subcommand for on-demand reclamation +
  segment compaction (previously there was no way to compact the store
  without a prune deleting rows), with before/after size reporting.

Benchmark (8000 msgs, fragmented to 8 segments/index):
- segments 8 -> 1 on both indexes
- porter MATCH 5.5x faster (0.449 -> 0.081 ms/q)
- trigram MATCH 3.0x faster (0.632 -> 0.207 ms/q)
- 8000 matches before == 8000 after, identical row ids (no functional change)

Orthogonal to the structural FTS-size PRs (#20239 external-content,
#27770 optional trigram) — segment merge helps regardless of those.

Tests: TestOptimizeFts covers index count, search+snippet preservation,
missing-trigram path, and idempotency. Full test_hermes_state.py green (227).
2026-05-29 05:09:56 -07:00
Teknium
2159d2a729
docs(credential-pools): document immediate rotation on usage-limit 429 (#34580)
The rotation flowchart only described the generic 'retry once, rotate on
second 429' path. ChatGPT/Codex plan-limit 429s carry a usage_limit_reached
reason and rotate to the next pool key immediately (no retry, since the cap
won't clear on retry). Document that case so the docs match the code.
2026-05-29 04:50:14 -07:00
teknium1
0dba60f73b docs(skills): regen catalog + sidebar for optional antigravity-cli skill 2026-05-29 04:49:42 -07:00
teknium1
632a7088a3 chore(skills/antigravity-cli): make optional, frame through Hermes tools, tighten frontmatter 2026-05-29 04:49:42 -07:00
Tony Simons
1bba5f27ab feat(skills): add antigravity-cli operator skill 2026-05-29 04:49:42 -07:00
teknium1
d6f2bdabda docs(skills): regen catalog + sidebar for optional grok skill 2026-05-29 04:49:38 -07:00
teknium1
99ddba94ed chore(skills/grok): make optional + tighten SKILL.md to modern format 2026-05-29 04:49:38 -07:00
Matt Maximo
10cd4138cc feat(skills): add grok skill for xAI Grok Build CLI
Adds a `grok` skill under `skills/autonomous-ai-agents/`, a third coding-agent orchestration guide alongside `codex` and `claude-code`. It teaches Hermes to delegate coding tasks to Grok Build (xAI's `grok` CLI).

- Headless `-p` one-shots (preferred)
- Interactive TUI via pty + tmux
- Session resume, background tasks, structured JSON output
- PR review and parallel worktree patterns
- Auth via SuperGrok / X Premium+ (`grok login`)
- Full pitfalls and config notes
2026-05-29 04:49:38 -07:00
Teknium
5e7c2ffa9f
chore(models): gemini-3.5-flash replaces gemini-3-flash-preview in OpenRouter + Nous lists (#34581)
* chore(models): swap gemini-3-flash-preview for gemini-3.5-flash in OpenRouter + Nous lists

* chore(models): regenerate model-catalog.json for gemini-3.5-flash swap
2026-05-29 04:27:58 -07:00
teknium1
1c53d39eaa test: deflake process-registry kill + PTY resize tests
Two CI flakes surfaced on PR #34572 (both in files this PR doesn't touch;
pre-existing host-dependent flakes):

1. test_process_registry::TestPopenLeakOnSetupFailure — the failure-cleanup
   tests use a fake proc.pid (8888/9999) and assert proc.kill() runs. But
   spawn_local's primary cleanup is os.killpg(os.getpgid(pid), SIGKILL),
   falling back to proc.kill() only on ProcessLookupError/PermissionError/
   OSError. When the fake PID happens to exist on a busy host, os.getpgid
   succeeds, os.killpg fires against an UNRELATED real process group, and
   proc.kill() is never reached -> flaky AssertionError (and a real risk of
   SIGKILLing an innocent process group from a unit test). Patch os.getpgid
   to raise ProcessLookupError so the fallback path runs deterministically
   and no real killpg is ever issued.

2. test_web_server::test_resize_escape_is_forwarded — the receive loop calls
   the blocking conn.receive_bytes() with no exception guard. Once the child
   prints its winsize and exits, the PTY closes; on a missed-marker run the
   next recv blocks until the 30s pytest-timeout instead of failing fast.
   Add a try/except break (matching the working sibling tests) and bump the
   child's pre-read sleep 0.15s -> 0.5s so the resize reliably lands first.

Verified: 4/4 pass across 3 consecutive runs; root cause for #1 reproduced
(os.getpgid(1) succeeds -> old code skips proc.kill).
2026-05-29 04:22:41 -07:00
teknium1
6a2e3c2d26 fix(gateway): guard adapter-trust check against bare GatewayRunner in tests
_adapter_enforces_own_access_policy accessed self.adapters directly, but
several auth tests build a bare GatewayRunner via object.__new__ without
setting .adapters (pitfalls.md #17). Read it defensively with getattr so a
missing/empty adapter map means "no adapter owns the policy" instead of
raising AttributeError.

Fixes 4 tests: test_feishu_bot_auth_bypass, test_discord_bot_auth_bypass (x2),
test_signal::test_signal_in_allowlist_maps.
2026-05-29 04:22:41 -07:00
teknium1
fd09b2c55e fix(gateway): trust adapter-owned access policy over env default-deny (#34515)
Config-driven platform policies (dm_policy / group_policy / allow_from /
group_allow_from) for WeCom, Weixin, Yuanbao, and QQBot now work without
also setting a PLATFORM_ALLOWED_USERS env var.

These adapters enforce their access policy at intake — a message is dropped
inside the adapter and never dispatched unless it already passed the policy.
The gateway's env-based check (_is_user_authorized) ran afterward and, with
no env allowlist set, fell through to an env-only default-deny — silently
rejecting `dm_policy: open` and config-only allowlists the adapter had
already authorized.

Rather than re-implement each adapter's policy a second time in run.py
(which would drift), adapters that own their gate now declare it via a new
BasePlatformAdapter.enforces_own_access_policy property (default False). The
gateway trusts that flag and skips the env-only default-deny for those
platforms. Env allowlists still take precedence when set.

Also resolves unauthorized DM behavior from config dm_policy so allowlist /
disabled policies drop unauthorized DMs silently instead of leaking pairing
codes, while an explicit pairing policy opts back in.

Co-authored-by: Frowtek <frowte3k@gmail.com>
2026-05-29 04:22:41 -07:00
teknium1
ddaf2f6712 style: restore PEP8 blank-line separation after dead-code removal
The deletions in the salvaged commit left some top-level defs/classes
separated by a single blank line. Restore the 2-blank-line separation.
2026-05-29 04:22:27 -07:00
kshitijk4poor
dc235e93cb chore: remove dead code — 28 unused functions/classes across 16 files
Vulture + per-symbol verification (whole-repo grep incl. tests, string
literals, getattr, decorator/registry/argparse dispatch) confirmed each of
these has zero callers anywhere — not reachable via any dynamic-dispatch path,
not referenced by tests, not re-exported.

Removed:
- acp_adapter/tools.py: _build_patch_mode_content
- agent/anthropic_adapter.py: read_claude_managed_key (diagnostics-only, never called)
- agent/bedrock_adapter.py: get_bedrock_model_ids
- agent/browser_registry.py: get_active_browser_provider
- agent/chat_completion_helpers.py: _take_request_client (x2 nested closures, never invoked)
- gateway/platforms/weixin.py: _rewrite_headers_for_weixin, _rewrite_table_block_for_weixin
- hermes_cli/banner.py: _skin_branding
- hermes_cli/debug.py: _delete_hint
- hermes_cli/gateway.py: _setup_email, _setup_sms, _setup_yuanbao
  (platform keys absent from the _builtin_setup_fn dispatch dict; handled by
  the _setup_standard_platform fallback)
- hermes_cli/kanban_db.py: set_max_runtime, active_run
- hermes_cli/kanban_diagnostics.py: severity_of_highest, _latest_clean_event_ts
- hermes_cli/main.py: _build_provider_choices, cmd_portal
  (portal subcommand is wired via portal_cli.add_parser, not this wrapper)
- hermes_cli/model_switch.py: CustomAutoResult (orphaned by the switch_model() extraction)
- hermes_cli/models.py: format_model_pricing_table, fetch_nous_account_tier
- hermes_cli/portal_cli.py: _nous_portal_base_url
- hermes_cli/proxy/server.py: handle_models_fallback (defined but never registered on the router)
- tools/computer_use/cua_backend.py: _parse_element, _is_arm_mac
- tools/file_operations.py: _get_safe_write_root (prod uses the imported
  agent.file_safety.get_safe_write_root directly)
- tools/skills_tool.py: _load_category_description

Also dropped two imports left unused by the removals:
- tools/file_operations.py: get_safe_write_root alias
- tools/computer_use/cua_backend.py: import platform

Pure deletion: -551 LOC. No behavior change. Test files covering the edited
modules pass (640/640); the broader suite's pre-existing/env-dependent
failures reproduce unchanged on origin/main.
2026-05-29 04:22:27 -07:00
teknium1
0aa9f6acfa docs(nav): wire multi-profile-gateways guide into sidebar
Follow-up for #30240 — the new page was not referenced in sidebars.ts,
leaving it orphaned (unreachable via nav and flagged as a broken relative
link to ./profiles.md). Added under Using Hermes after profile-distributions.
2026-05-29 04:11:10 -07:00
William Chen
0c0a905011 docs(gateway): add multi-profile gateways operations guide
Covers running multiple Hermes profiles as managed services on one host:

- A shell-loop wrapper pattern for start/stop/restart/status across every
  profile (the per-profile CLI commands stay unchanged).
- Per-platform service file locations (LaunchAgent on macOS, systemd user
  unit on Linux), plus the rules around clashes.
- Log paths per profile and how to tail every gateway at once.
- Config file layout per profile and the restart-after-edit workflow.
- Keeping the host awake: caffeinate flags on macOS,
  systemd-inhibit + loginctl enable-linger on Linux.
- Token-conflict auditing across .env files.
- Troubleshooting for the common "Could not find service in domain for
  user gui: 501" message and stale PIDs after a crash.

Tested locally with five profiles on macOS launchd.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 04:11:10 -07:00
Teknium
e4b9532c18
feat: embedder environment-hint hook for the system prompt (#34574)
* fix(security): block AWS SDK creds from subprocess env

* fix(security): narrow Bedrock subprocess strip to inference bearer token only

Scopes the AWS_SDK subprocess strip down from the full AWS credential chain
to just AWS_BEARER_TOKEN_BEDROCK — the only Hermes-managed *inference* secret
(analogous to OPENAI_API_KEY). The general AWS credential chain
(AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN / AWS_PROFILE
/ config + role pointers) is intentionally left inheritable.

Why: per SECURITY.md §3.2 the local terminal is the user's trusted operator
shell. Hard-blocklisting the general chain would (a) regress *every* user who
runs aws/terraform/cdk/boto3 in the agent terminal — not just Bedrock users,
since PROVIDER_REGISTRY is iterated unconditionally at import — and (b) be
unrecoverable, because env_passthrough.py refuses to re-allow anything in
_HERMES_PROVIDER_ENV_BLOCKLIST (GHSA-rhgp-j443-p4rf). The narrow strip closes
the reported leak (opencode enumerating the Bedrock catalog off the leaked
bearer token) with no capability loss.

Keeps zapabob's self-healing auth_type=="aws_sdk" mechanism so any future
SDK-cred provider is covered automatically.

Tests: bearer token stripped + general chain preserved (no-regression guard),
on both the runtime strip path and the blocklist-membership path.

Co-authored-by: zapabob <1920071390@campus.ouj.ac.jp>

* feat: embedder environment-hint hook for the system prompt

Adds HERMES_ENVIRONMENT_HINT env var (and config.yaml agent.environment_hint)
so a host wrapping Hermes (sandbox runner, managed platform) can describe the
runtime environment — proxy, credential handling, mount layout — in the system
prompt's environment-hints block, without editing the identity slot (SOUL.md).

Read once at prompt-build time, so it lands in the stable, cache-safe portion
of the system prompt. Env var overrides the config key (build-time/container
mechanism). Empty by default — no behavior change for existing installs.

---------

Co-authored-by: zapabob <1920071390@campus.ouj.ac.jp>
2026-05-29 04:10:05 -07:00
Hariharan Ayappane
c0b17b3c0c docs(weixin): clarify allowed users setup 2026-05-29 04:01:06 -07:00
Dave Tist
2520c9ad68 docs(skills): clarify Reminders alarm timing 2026-05-29 04:01:01 -07:00
LeonSGP43
62e81b2d9b docs(windows): add WSL desktop shortcut guide 2026-05-29 04:00:57 -07:00
SHL0MS
fe7e0a8c1d docs(feishu): add permission scopes, event subscription, and publish steps
The setup guide was missing the specific Feishu permission scopes to
configure and the event subscription (im.message.receive_v1) needed
for the bot to receive messages. Users had to reference external
OpenClaw documentation to complete the setup.

Adds:
- Required permissions table (im:message, im:message:send_as_bot,
  im:resource, im:chat, im:chat:readonly)
- Recommended permissions (reactions, app info, contact)
- Event subscription step (im.message.receive_v1)
- App version publish reminder (permissions require published version)
2026-05-29 04:00:52 -07:00
briandevans
6e179c44b1 fix(web): ensure plugin discovery before web_*_tool registry lookups
Web search/extract dispatch read agent.web_search_registry before plugin
discovery had run, so in any process that hadn't imported model_tools.py
(subprocess agent runs, delegate children, standalone scripts) the registry
was empty: get_provider('firecrawl') returned None and the dispatcher emitted
the misleading 'No web extract provider configured' error even with
web.extract_backend set and FIRECRAWL_API_KEY exported.

Adds an idempotent _ensure_web_plugins_loaded() helper (mirrors
tools.browser_tool._ensure_browser_plugins_loaded) and calls it at the top of
both the web_search_tool and web_extract_tool dispatch sites before the
registry lookup.

Fixes #27580.

Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>
2026-05-29 04:00:00 -07:00
teknium1
58e1b04665 chore(release): map tillfalko to GitHub login for PR #29987 salvage 2026-05-29 03:58:56 -07:00
teknium1
c77a697fa4 refactor(vision): consolidate native fast-path gate into one shared helper
The fast-path decision (native routing + provider allowlist OR
supports_vision override) lived inline in vision_analyze and was copied
into browser_vision. Extract it to _should_use_native_vision_fast_path()
so both tools share one source of truth.

- vision_tools: gate logic now one helper; vision_analyze calls it in 3 lines
- browser_tool: thin envelope decoration over the shared helper, not a copy
- browser_vision typed Union[str, Dict] to match its real return shape
- tests slimmed to target the override path + text-mode-wins invariant
2026-05-29 03:58:56 -07:00
tillfalko
c3f28c651d docs(browser): update browser_vision tool description for native vision routing 2026-05-29 03:58:56 -07:00
tillfalko
2402ec5e7b test: extend test coverage to native image routing 2026-05-29 03:58:56 -07:00
tillfalko
f8b8dffccf fix(browser): add native image support to browser_vision and respect supports_vision 2026-05-29 03:58:56 -07:00
tillfalko
f05353397d fix(vision): respect supports_vision in vision_analyze 2026-05-29 03:58:56 -07:00
EloquentBrush0x
784d8dd2c2 fix(matrix): fail-closed approval reaction auth when MATRIX_ALLOWED_USERS is empty
The _on_reaction approval handler used:

    if self._allowed_user_ids and sender not in self._allowed_user_ids:

When MATRIX_ALLOWED_USERS is not configured, _allowed_user_ids is an
empty set. The short-circuit on the empty set caused the deny block to
never execute, allowing any Matrix room member to approve or deny tool
calls via / reactions — even users that run.py's _is_user_authorized
would reject for regular messages.

Fix mirrors the Telegram _is_callback_user_authorized fix (commit
89d32052e, PR #28494): deny by default when no allowlist is configured,
unless GATEWAY_ALLOW_ALL_USERS=true is explicitly set.
2026-05-29 03:58:45 -07:00
teknium1
3171845479 fix(code-exec): make dropped HERMES_* env vars diagnosable in sandbox scrub
Follow-up mitigation for the #27303 env-scrub tightening. Dropping the
broad HERMES_ prefix in favor of a 4-var operational allowlist is correct
hardening, but a sandbox script that imports a repo module reading a
non-allowlisted HERMES_* var at import time would otherwise see it
silently unset. _scrub_child_env now emits a one-shot debug log naming the
dropped non-secret HERMES_* vars and pointing at the env_passthrough
opt-in escape hatch. Secret-shaped vars are never named in the log.

Tests: dropped vars are logged + env_passthrough named; no log when
nothing is dropped; secret vars excluded from the diagnostic.
2026-05-29 03:44:49 -07:00
firefly
4bdae34771 test(code-exec): regression suite for the approval-bypass cluster
Cover context+callback propagation and teardown-clears, a source guard that both RPC threads stay wrapped, the check_execute_code_guard decision matrix (isolated backend, headless-local, cron-deny, gateway approve/deny/timeout/missing-notify, smart mode, session-yolo), the env-scrub allowlist/secret rules, and a behavioral test that execute_code() blocks before spawning on denial.

Refs #4146, #27303, #30882, #33057
2026-05-29 03:44:49 -07:00
firefly
655090b3d3 feat(gateway): warn at startup on manual approvals with no risk assessor
When approvals.mode=manual with security.tirith_enabled off and no auxiliary.approval model, dangerous commands and execute_code scripts can only be gated by live in-chat approval; with routing fixed they now fail closed (block) rather than silently auto-run. Surface that at startup so operators knowingly enable tirith or auxiliary.approval for unattended gateways.

Refs #30882
2026-05-29 03:44:49 -07:00
firefly
1083977261 fix(code-exec): restore approval context in execute_code RPC threads + guard entry
Wrap both execute_code RPC threads (local UDS + remote file-RPC) with propagate_context_to_thread so gateway sessions no longer fall into check_dangerous_command's non-interactive auto-approve branch and the CLI approval prompt stays reachable. Add check_execute_code_guard: one-shot fail-closed approval of the whole script in gateway/ask/cron-deny before the child spawns (skips isolated backends; command-string built only past the early returns). Drop the broad HERMES_ env passthrough for an explicit operational allowlist plus DSN/WEBHOOK secret substrings, and update the POSIX-equivalence oracle.

Refs #4146, #27303, #30882, #33057
2026-05-29 03:44:49 -07:00
firefly
21aeefe5fd fix(code-exec): propagate agent-turn context into tool worker threads
Worker threads that dispatch Hermes tools started with an empty contextvars.Context and no thread-local approval/sudo callbacks. Add tools/thread_context.propagate_context_to_thread factoring that capture/install/clear lifecycle (mirrors the GHSA-qg5c-hvr5-hjgr pattern), and refactor agent/tool_executor onto it so the security-critical logic lives in one audited place. Update the contextvar-propagation source guard for the new call shape.

Refs #33057
2026-05-29 03:44:49 -07:00