Commit graph

12496 commits

Author SHA1 Message Date
Hermes Agent
4c1934dd87 docs: repoint remaining stale gateway/platforms adapter refs to plugins/platforms
Sibling-site follow-up to the AGENTS.md token-lock fix (#50481). Platform
adapters migrated from gateway/platforms/<name>.py to
plugins/platforms/<name>/adapter.py; a handful (signal, weixin, bluebubbles,
qqbot, yuanbao, msgraph_webhook, webhook, api_server) still live in
gateway/platforms/.

- adding-platform-adapters.md: new-adapter creation path + reference-impl table
- gateway-internals.md: rewrite the adapter tree to reflect the actual split
- zh-Hans mirrors of both kept in parity
- scripts/release.py: add TutkuEroglu to AUTHOR_MAP (CI gate)
2026-06-21 19:59:50 -07:00
TutkuEroglu
0768ed3b33 docs(agents): fix stale platform adapter path in token-lock note
gateway/platforms/telegram.py no longer exists (adapters moved to
plugins/platforms/<name>/adapter.py) and telegram no longer uses the
scoped-lock pattern. Point the token-lock canonical-pattern reference to
plugins/platforms/irc/adapter.py, which acquires the lock in connect()
and releases it in disconnect() — and is already cited as a canonical
example in ADDING_A_PLATFORM.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 19:59:50 -07:00
Teknium
7130d60861
feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492)
* feat(providers): remove google-gemini-cli + google-antigravity OAuth providers

Google now actively bans accounts for third-party tools that piggyback on
Gemini CLI / Antigravity / Code Assist OAuth, and because abuse prevention
sits at a backend layer the ban can extend to the entire Google account
(Gmail/Drive), with a second violation being permanent.
Ref: https://github.com/google-gemini/gemini-cli/discussions/20632

Removes both OAuth inference providers entirely (modules, provider profiles,
auth/runtime/config/models wiring, the /gquota Code Assist quota command,
the antigravity-cli optional skill, desktop + docs surface in en + zh-Hans).
The API-key 'gemini' provider (GOOGLE_API_KEY/GEMINI_API_KEY against
generativelanguage.googleapis.com) is unaffected and stays fully supported.

* fix(skills): keep the antigravity-cli skill — only the OAuth provider is removed

The antigravity-cli optional skill orchestrates the external `agy` binary as
a coding-agent tool via the terminal tool — it does NOT wrap Hermes inference
through the banned google-antigravity OAuth provider, so it carries none of
the account-ban risk that motivated removing that provider. Restore the skill,
its docs page, the sidebar entry, and the optional-skills catalog row. The
google-antigravity / google-gemini-cli inference providers stay fully removed.
2026-06-21 19:53:27 -07:00
Teknium
5bf23ff251
fix(banner): don't advertise toolsets/skills the agent wasn't given (#50497)
The welcome banner's 'Available Tools' merged in every toolset from the
global check_tool_availability() registry walk, regardless of whether it
was enabled for the current platform. On a Blank Slate CLI (file +
terminal only) that surfaced discord / feishu / kanban tools the agent
was never actually given — they are not in the agent's tool schema, but
the banner displayed them, making it look like they were exposed.

- Filter the unavailable-toolset merge to toolsets actually in
  enabled_toolsets (a toolset that's enabled but has unmet deps still
  legitimately shows as disabled/lazy).
- Gate the 'Available Skills' section on the skills toolset being
  enabled — when it's off, the agent can't load any skill, so show
  'Skills toolset disabled' instead of the on-disk catalog.

When enabled_toolsets is empty (older callers), behavior is unchanged.

Validation: blank-slate banner now shows only file + terminal and
'Skills toolset disabled'; a skills-enabled banner still lists the
catalog. Added regression tests; full banner suite green (15/15).
2026-06-21 19:08:54 -07:00
teknium1
8cfcbd327d fix(process): SIGKILL the whole tree on escalation, not just wait_procs survivors
Live testing against a real SIGTERM-ignoring process TREE (parent + children,
the agent-browser daemon + renderer shape) revealed psutil.wait_procs's
gone/alive partition mis-handles a parent/child tree: it reaps via
Process.wait() and could mark targets gone/alive inconsistently across the
tree, leaving survivors un-killed (flaky — sometimes the parent lived,
sometimes a child). Replace it with: sleep out the grace window, then
directly re-probe every captured target (_proc_alive, treating zombies as
dead) and SIGKILL any that's still running. Add a multi-child-tree regression
test. 6/6 escalation tests green across repeated runs; the real-tree E2E now
kills the full tree 6/6 runs.
2026-06-21 19:08:52 -07:00
teknium1
8cbb34b2bf chore: map tkwong co-author email for #15008 SIGKILL-escalation credit 2026-06-21 19:08:52 -07:00
teknium1
8cecaf0b29 feat(process): escalate SIGTERM->SIGKILL on host-pid termination after grace
A daemon that ignores or stalls in its SIGTERM handler currently survives the
process-registry reap and leaks until reboot (observed as agent-browser
daemons accumulating to EMFILE on long-running gateways). _terminate_host_pid
now snapshots the tree, SIGTERMs it, waits a bounded grace window
(terminal.daemon_term_grace_seconds, default 2.0s, 0 disables), then SIGKILLs
any survivor. The recycled-PID identity guard still gates the whole path, so
escalation never reaches a stranger; Windows is unchanged (taskkill /F is
already a hard kill).

Config lives in config.yaml (terminal.daemon_term_grace_seconds), NOT an env
var, per the .env-secrets-only policy.

Implements the SIGKILL-escalation idea from @tkwong's #15008, reworked onto the
current _terminate_host_pid tree-kill path (the original predated it) and
config-gated instead of env-var-gated.

Co-authored-by: Benjamin Wong <tkwong@inspiresynergy.com>
2026-06-21 19:08:52 -07:00
teknium1
41fe086eb6 style(security-audit): add explicit encoding to read_text calls (ruff PLW1514) 2026-06-21 19:05:27 -07:00
teknium1
f45ace9318 feat(security): startup security posture audit (warn-on-load)
Surface dangerous host/deployment posture at gateway startup so operators get
the 'you're exposed' signal the June 2026 MCP-config persistence campaign
victims never had. Warn-only — never blocks startup, never raises.

Checks (each independently fail-safe):
- Running as root (POSIX uid 0)
- SSH daemon with PasswordAuthentication enabled (incl. the 'yes' default)
- Running in a container with no persistent volume mount over HERMES_HOME
- Network-accessible API server with no API_SERVER_KEY

New module hermes_cli/security_audit_startup.py; invoked once per process from
start_gateway() right after setup_logging(). Cross-platform (root/SSH checks
no-op on Windows). Idea: @Cthulhu.
2026-06-21 19:05:27 -07:00
teknium1
eb51c180e6 fix(docker): replace dashboard --insecure with basic-auth provider
The s6 dashboard entrypoint and docker integration tests relied on
HERMES_DASHBOARD_INSECURE=1 to bring up a 0.0.0.0 dashboard with no auth
provider. With --insecure now a no-op (auth gate mandatory on non-loopback
binds), that path fails closed.

- s6 dashboard/run: drop --insecure derivation; warn that the env is a no-op
  and point operators at HERMES_DASHBOARD_BASIC_AUTH_* / OAuth.
- docker tests: supervision tests now register the bundled basic password
  provider (HERMES_DASHBOARD_BASIC_AUTH_USERNAME/_PASSWORD) so the gate has a
  provider and the dashboard binds. Rewrote the insecure-opt-out test to
  assert fail-closed (dashboard does NOT serve) instead of gate-bypass.
- docs (en + zh-Hans): HERMES_DASHBOARD_INSECURE documented as deprecated
  no-op; basic-auth is the zero-infra way to authenticate a containerized
  public dashboard.
2026-06-21 19:05:27 -07:00
teknium1
7726ce3040 fix(security): close hermes-0day MCP-persistence attack surface
Remove the dashboard --insecure auth-bypass, add an MCP persistence guard +
IOC blocklist, and raise the API-server key entropy floor.

Driven by the June 2026 hermes-0day campaign (r/hermesagent, live 854.media
instance): scanners find exposed Hermes dashboards/API servers, drive the
root agent to plant a 'command: bash' MCP entry that appends an attacker SSH
key to authorized_keys, which cron + startup then re-execute every tick.

- dashboard: --insecure no longer disables the auth gate. should_require_auth
  returns True for every non-loopback bind; a public bind ALWAYS requires an
  auth provider (bundled password provider or OAuth). --insecure kept as a
  warned no-op for backward compat. Fail-closed error now points at the
  password provider, not at --insecure.
- mcp_security: validate_mcp_server_entry now also rejects shell payloads that
  write to OS persistence surfaces (authorized_keys/.ssh/pam.d/sudoers/cron/
  rc files) and hard-rejects a hermes-0day IOC blocklist (attacker SSH key +
  source IPs) anywhere in command/args/env. Runs at save AND spawn time.
- api_server: raise network-bind API_SERVER_KEY entropy floor 8->16 chars;
  warn when a network-accessible API server runs an unsandboxed local backend.
2026-06-21 19:05:27 -07:00
teknium1
9bf9a9f1f1 fix(swe-runner): move logging.basicConfig out of Runner __init__ into main
Same library-code anti-pattern as the compressor fix: MiniSWERunner.__init__
called logging.basicConfig(), overriding the application's root logger config
every time a runner was instantiated. Moved the call into main() (the CLI
entry point) where it belongs; __init__ now only does getLogger(__name__).
Standalone verbose logging is preserved.
2026-06-21 19:02:06 -07:00
annguyenNous
0a7ae28ebc fix(compressor): remove logging.basicConfig from library class __init__
logging.basicConfig() in TrajectoryCompressor.__init__ overrides the
root logger configuration every time the class is instantiated. Library
code should use logging.getLogger(__name__) and let the application
entry point configure the root logger.

Fixes inconsistent log formatting when the compressor is used alongside
other logging configuration in the gateway.
2026-06-21 19:02:06 -07:00
Teknium
2b3a4f0af8
fix(agent): strip stale reasoning_content when falling back to a strict provider (#50480)
* fix(agent): strip stale reasoning_content when falling back to a strict provider

A reasoning primary (DeepSeek/Kimi/MiMo thinking mode) pins reasoning_content
on every assistant tool-call turn (a single space " " pad). api_messages is
built once under the primary; on a mid-session fallback to a strict
OpenAI-compatible provider (Mistral, Cerebras, Groq, SambaNova), those stale
pads were replayed verbatim and rejected with HTTP 400/422:

    body.messages.2.assistant.reasoning_content: Extra inputs are not
    permitted  (input: ' ')

reapply_reasoning_echo_for_provider() only ever ADDED pads, so it never
reconciled history built under a reasoning primary against a strict fallback.
copy_reasoning_content_for_api() also leaked empty-string and 'reasoning'-only
shapes to non-pad providers.

Fix both sites: when the active provider does not enforce echo-back, strip
reasoning_content (empty, space-pad, or non-empty) entirely. Re-padding when
switching TO a reasoning provider is preserved. Covers the Cerebras 400 from
#45655 and the DeepSeek->Mistral 422 fallback report.

Refs #45655.

* test: update reasoning-replay tests for strict-provider stripping

test_explicit_reasoning_content_beats_normalized_reasoning_on_replay was
implicitly running on the OpenRouter fixture (non-pad); pin it to a reasoning
provider so the precedence it checks is observable. Add a positive
strict-provider test asserting reasoning_content is stripped on replay.
2026-06-21 18:05:07 -07:00
teknium1
73340d8be6 chore: add buihongduc132 to AUTHOR_MAP for mem0 salvage 2026-06-21 17:28:02 -07:00
buihongduc132
452a725ae1 fix(mem0): address PR review — restore docstrings, keep api_key required
Addresses reviewer feedback on #13377:
1. Restore all stripped docstrings (_load_config, _is_breaker_open,
   sync_turn, register, _get_client, _read_filters, _write_filters,
   _unwrap_results, save_config) and section dividers
2. Revert api_key to required:true in schema — self-hosted Mem0 also
   requires auth by default; validation in _get_client() handles the
   either/or logic separately from the schema
3. Confirm secret:true remains on api_key (already correct)
2026-06-21 17:28:02 -07:00
buihongduc132
b6d2ac176e feat(mem0): add self-hosted support via MEM0_HOST / host config
The mem0 plugin previously hardcoded api.mem0.ai as the endpoint.
This adds a `host` config key and MEM0_HOST env var so users can
point the plugin at a self-hosted Mem0 instance.

Changes:
- _load_config(): read MEM0_HOST env var
- is_available(): accept host OR api_key (self-hosted may not need a real key)
- get_config_schema(): add host field
- initialize(): read host from config
- _get_client(): pass host kwarg to MemoryClient when set
- system_prompt_block(): show target (cloud vs URL)
- README: document self-hosted setup
2026-06-21 17:28:02 -07:00
teknium1
012f40c98c fix(status): cross-platform start-time fingerprint via psutil fallback
The PID-reuse guard (#43846) reads /proc/<pid>/stat field 22, which only
exists on Linux — on macOS/Windows it returned None and the guard silently
degraded to a bare liveness check (a no-op, safety-wise). Add a
psutil.create_time() fallback (psutil is a hard dep, cross-platform),
quantized to centiseconds for stable equality, so the recycled-PID guard
actually protects macOS/Windows too. /proc always wins first on Linux and
always misses on macOS/Windows, so the two sources never mix on one host and
same-source equality is all the guard needs.
2026-06-21 17:23:33 -07:00
teknium1
1cefc2a24e test(whatsapp): fix port-spares-client test race (listen before announce + retry connect)
The salvaged test spawned a listener subprocess that printed its port
immediately after bind() but BEFORE listen(), so under CI's loaded 8-worker
box the parent connected before the socket was listening -> ConnectionRefused
(flaked on test slice 2/6). Reorder the child to listen() then print the port,
and make the client connect with a short bounded retry to absorb scheduler
jitter. 15/15 green locally including direct hammering.
2026-06-21 17:23:33 -07:00
teknium1
0fb3b13b00 chore: add valentt to AUTHOR_MAP for #43846 salvage 2026-06-21 17:23:33 -07:00
teknium1
615a8e6516 fix(whatsapp): add missing re import + fix test import path after adapter relocation
Follow-up to the salvaged #43846 commits: the WhatsApp adapter moved from
gateway/platforms/whatsapp.py to plugins/platforms/whatsapp/adapter.py since the
PR was authored. The cherry-pick brought _listener_pids_on_port's `re.finditer`
ss-fallback and the new test's import, but the new module location doesn't import
`re` (latent NameError on the lsof-absent fallback path) and the test imported the
old module path. Add `import re` to the adapter and repoint the test import.
2026-06-21 17:23:33 -07:00
valentt
069ab40c5f fix(whatsapp): only kill LISTENers when freeing the bridge port, never clients
This is the bug that was actually closing Firefox. `_kill_port_process`, run on
every bridge (re)start to free the port, used `lsof -ti :PORT` / `fuser PORT/tcp`
— both of which match a process whose socket merely *involves* that port number
in ANY state, including ESTABLISHED client connections. It then SIGTERMed every
match.

The bridge defaults to port 3000 — a ubiquitous local dev-server port. With a
browser tab open on localhost:3000, `lsof -ti :3000` returned Firefox's PID, so
each restart of the (crash-looping) WhatsApp bridge SIGTERMed Firefox, closing
the whole browser at irregular intervals with no crash and no coredump.

Proven live with the kernel `signal:signal_generate` tracepoint:
  hermes-gateway(3396516) -> sig=15 (code=0/SI_USER) -> comm=firefox pid=3371585
captured immediately after a gateway start, while Firefox held a socket on the
bridge port. Demonstrated over-match: `lsof -ti :8080` returns the listener AND
the gateway's own client connection; `lsof -ti tcp:8080 -sTCP:LISTEN` returns
only the listener.

Fix: `_listener_pids_on_port` resolves only LISTEN-state sockets
(`lsof -ti tcp:PORT -sTCP:LISTEN`, with an `ss -ltnp` fallback) and
`_kill_port_process` signals just those. A client whose connection happens to
involve the port number is never touched — which is also more correct, since a
client never blocks the new bridge from binding. Windows already filtered
LISTENING; the broad `fuser -k` path is removed.

Adds TestKillPortProcess: real-socket tests proving a separate client process
is excluded from the listener lookup and survives port cleanup. 9 tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:23:33 -07:00
valentt
77fdbbfe81 fix(whatsapp): validate bridge PID identity before killing stale pidfile entry
`_kill_stale_bridge_by_pidfile` SIGTERMed the PID recorded in `bridge.pid`
after only a bare liveness check. Once the bridge exits and is reaped the
kernel recycles that PID onto an unrelated process; because the WhatsApp bridge
crash-loops ("Bridge process died (exit code 1)" repeating), this cleanup ran
on every restart and could SIGTERM a recycled PID that had landed on the user's
browser — closing Firefox at irregular intervals with no crash and no coredump
(a clean kill of a stranger).

Same PID-recycling class as the MCP reaper (7bd1f8a2d) and the process-registry
host-PID guard (e6a99cef2); this was the third, and most actively-fired, path.

Fix: `_write_bridge_pidfile` now also records the leader's kernel start time
(line 2). `_kill_stale_bridge_by_pidfile` re-validates identity via
`_bridge_pid_is_ours` before signalling — the (pid, start time) pair must match,
or for legacy single-line pidfiles the live cmdline must name `node` + this
session's unique path. A recycled PID (different start time / cmdline) is logged
and skipped, never signalled. Legacy pidfiles stay readable.

Adds TestWhatsappBridgePidfile: real-process tests proving a genuine bridge is
reaped while a recycled PID (start-time mismatch, or non-bridge cmdline) is
spared. 7 new + 108 gateway/registry tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:23:33 -07:00
valentt
e447723149 fix(process-registry): re-validate PID identity before killing host processes
The background-process registry signalled host PIDs (recovery adoption,
detached-session kill, tree-kill) using a number captured at spawn, guarded
only by a bare liveness check. Once a session's process exits and is reaped the
kernel recycles that PID onto an unrelated process, so an alive-but-different
PID passed the check and got tree-killed.

Observed in the wild: a recycled background-session PID landed on Firefox's
session leader; a later kill/refresh walked its process tree and SIGTERMed
every tab — Firefox "closing" at irregular intervals with no crash/coredump.

This is the same PID/PGID-recycling class fixed for the MCP orphan reaper in
7bd1f8a2d, but the process_registry subsystem was never guarded — so the bug
persisted.

Fix: record each host process's kernel start time (/proc/<pid>/stat field 22)
at spawn, persist it in the checkpoint, and re-validate it before every signal
via `_host_pid_is_ours`. A PID whose start time no longer matches — or that is
gone — is never signalled:
  - recover_from_checkpoint: a recycled PID is not adopted as a session.
  - _refresh_detached_session: a recycled detached PID is marked exited.
  - kill_process / _terminate_host_pid: refuse to tree-kill a stranger.
Legacy checkpoints and platforms without /proc (no baseline) degrade to the
prior best-effort liveness behaviour, so nothing else changes.

Adds TestPidReuseGuard: real-process tests proving a mismatched start time
refuses termination while a matching one still kills, plus recovery/refresh
recycling paths. 74 registry + 22 MCP-stability tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 17:23:33 -07:00
Teknium
84e1d31e54
refactor(kanban): fold worker/orchestrator skills into injected guidance (#50473)
The kanban-worker and kanban-orchestrator bundled skills existed only to
be force-loaded into dispatcher-spawned workers, gated by
environments:[kanban] so they wouldn't leak into normal CLI listings.
That gating was fragile (the leak that #50443 patched) and the
--skills auto-load was already best-effort — most workers ran without it
because the bundled skill isn't present in profile-scoped skills dirs.

Remove the skills entirely and promote their load-bearing content
(workspace kinds, deliverable artifacts, created-card integrity, profile
discovery) into KANBAN_GUIDANCE, which is already injected into every
kanban worker's system prompt. Net result: every worker reliably gets
the guidance, nothing can leak into a CLI/blank-slate session, and the
gating machinery is gone.

- agent/prompt_builder.py: promote the 4 load-bearing rules into KANBAN_GUIDANCE
- hermes_cli/kanban_db.py: drop --skills kanban-worker auto-injection + _kanban_worker_skill_available probe
- hermes_cli/kanban_swarm.py: drop skills=[kanban-orchestrator] on the root card
- hermes_cli/kanban.py: drop kanban-init skill seeding; fix help text
- delete skills/devops/kanban-{worker,orchestrator}
- docs: delete the two skill pages (EN+zh), fix sidebars/catalog/kanban.md/kanban-worker-lanes.md and the video-orchestrator + codex-lane references
- tests: update spawn-argv expectations; re-bound the guidance-size guard

Supersedes the skill-leak half of #50443 (credit @helix4u for flagging the area).
2026-06-21 17:06:48 -07:00
Carl
e5e2583635 fix(desktop): relaunch on Linux after in-app update instead of hanging (#45205)
On a Linux source install the in-app updater ran the full backend update +
desktop rebuild successfully but never restarted the app — it hung forever on
the applying overlay with no close button. Two causes:

- applyUpdatesPosixInApp() only handled the macOS .app bundle swap;
  runningAppBundle() is null off macOS, so Linux fell through to
  { ok: true, backendUpdated: true } without ever relaunching.
- The renderer store had no terminal state for that result shape, so
  $updateApply stayed { applying: true } and the overlay's close button
  (hidden while applying) never appeared.

Fix (new electron/update-relaunch.cjs, pure + unit-tested):
- Decide the Linux outcome from whether the *running* binary is the one we
  just rebuilt (execPath under release/<plat>-unpacked, path-segment-aware so
  linux-unpacked-evil can't masquerade) and whether its chrome-sandbox helper
  is launchable (root:root + setuid, or an --no-sandbox / ELECTRON_DISABLE_SANDBOX
  opt-out):
    relaunch — detached watcher waits for this PID to exit (graceful, then
      SIGKILL), self-deletes, and re-execs the rebuilt binary with the original
      launch context (filtered args + HERMES_*/sandbox env + cwd) restored.
    guiSkew  — AppImage/.deb/.rpm/dev: backend updated but this GUI package was
      NOT changed; surface an honest closeable 'reinstall the desktop app'
      terminal state instead of lying that it loads next launch (#37541 skew).
    manual   — rebuilt binary but sandbox helper not launchable: keep the
      working window, don't quit into a dead app.
- store/updates.ts lands a terminal, closeable state for EVERY resolved apply
  outcome (handedOff / guiSkew / manualRestart / updated-not-relaunched / error)
  so the hang is impossible regardless of platform or result.
- New DesktopUpdateStage values (update/rebuild/done/guiSkew) + GuiSkewView so
  progress reads correctly and the skew state is closeable. i18n in all four
  locales (en/ja/zh/zh-hant) in parity.
- electron/update-relaunch.test.cjs (16 tests) + store outcome tests.

Salvaged from #45205 onto current main. Linux quit dwell uses the shared
UPDATE_HANDOFF_DWELL_MS (2.5s) from #50448 for consistency. Four-locale i18n
parity, AUTHOR_MAP entry, and the test wiring added on top.

Closes #45205.
2026-06-21 17:04:52 -07:00
teknium1
1f6994d1ee chore(release): add AUTHOR_MAP entry for #45205 salvage (EtherAura) 2026-06-21 17:04:52 -07:00
brooklyn!
1ec4fcf614
Merge pull request #50466 from NousResearch/bb/composer-popout-bounds
fix(desktop): keep the floating composer in-bounds (can't be lost off-screen)
2026-06-21 18:58:14 -05:00
Flownium
13ce811906
fix: show desktop approval fallback (#46548) 2026-06-21 18:57:18 -05:00
Dusk1e
84fcbbf6a9 fix(security): quote HERMES_TIMEZONE in remote code execution to prevent shell injection 2026-06-21 16:55:12 -07:00
liuhao1024
bef1d3e4ff
fix(desktop): filter undefined entries in AttachmentList to prevent refText crash on session switch (#49624)
* fix(desktop): filter undefined entries in AttachmentList to prevent refText crash on session switch

When switching sessions, the attachments array can contain stale/undefined
entries from the previous session's state. Accessing attachment.refText on
an undefined entry throws TypeError, breaking session switching entirely.

Fix: add .filter(Boolean) before .map() to skip undefined/null entries.

Fixes #49614

* fix(desktop): update I18nConfigClient usage in attachment test

The i18n config API changed from getLocale/saveLocale to
getConfig/saveConfig. Update the test fixture to match.
2026-06-21 18:54:09 -05:00
Brooklyn Nicholson
16aeba1707 fix(desktop): clamp composer peel-off under cursor
Keep the floating composer bounded from the first peel-off frame and leave titlebar clearance when recovering bad persisted positions.
2026-06-21 18:52:01 -05:00
Teknium
c768c4b71c fix(antigravity): move model flow to model_setup_flows + stop bare-alias hijack
CI on the salvage caught two issues the stale PR base masked:

1. The model-setup flows were extracted from main.py into
   hermes_cli/model_setup_flows.py after @pmos69 forked. The cherry-pick
   re-introduced a stale _model_flow_custom into main.py (duplicating the
   one main.py now imports) and put _model_flow_google_antigravity there too.
   Move the antigravity flow into model_setup_flows.py alongside its siblings
   and drop the stale _model_flow_custom dup. Fixes the getpass/stdin OSError
   in tests/cli/test_cli_provider_resolution.py.

2. google-antigravity re-exposes Claude/Gemini/GPT-OSS models, so its catalog
   was hijacking bare short aliases (`sonnet` -> google-antigravity instead of
   anthropic) in detect_static_provider_for_model via dict insertion order.
   Add _BORROWED_MODEL_PROVIDERS and defer those providers to a last-resort
   pass so a model's native vendor always wins alias/direct-catalog detection.
   Fixes tests/hermes_cli/test_models.py::test_short_alias_resolves_to_static_model.
2026-06-21 16:41:30 -07:00
Teknium
37c37c9dc5 fix(antigravity): register google-antigravity ProviderProfile + AUTHOR_MAP
The salvaged PR wired auth.py / providers.py / runtime_provider.py for
google-antigravity but never registered a ProviderProfile, so the provider
was invisible to list_providers() / the model picker / alias resolution.
Register it in the gemini model-provider plugin (alongside gemini and
google-gemini-cli) with the antigravity-pa:// scheme and aliases. Also add
@pmos69 to release.py AUTHOR_MAP (CI gate).
2026-06-21 16:41:30 -07:00
Teknium
b7a912ea45 fix(antigravity): bake in public OAuth client + default project fallback
Salvage follow-up on top of @pmos69's #29474. The PR resolved the
Antigravity OAuth client purely by discovering it from an installed `agy`
binary or HERMES_ANTIGRAVITY_CLIENT_ID/SECRET env vars, so users without
agy installed hit a hard 'client ID not available' error.

Antigravity's desktop OAuth client is a public, non-confidential installed-app
client (PKCE provides the security), baked into every copy of the Antigravity
CLI — same posture as the gemini-cli credentials Hermes already ships in
google_oauth.py. Bake it in as the final fallback (env -> discovery -> public
default) and add the public default Code Assist project as the discovery
fallback, matching the reference Antigravity flow. Now consumers can
authenticate directly without agy installed.
2026-06-21 16:41:30 -07:00
pmos69
8baa4e9976 feat(cli): add native Antigravity OAuth provider 2026-06-21 16:41:30 -07:00
xxxigm
29176ffecf test(gateway): cover no eager platform install on startup sweep
Pin the contract that ``_apply_env_overrides`` consults ``is_connected``
before the install-triggering ``check_fn``: an unconfigured platform is
skipped without calling ``check_fn`` (no lazy install), while a configured
platform still has ``check_fn`` run and is auto-enabled. The first assertion
fails on the pre-fix unconditional sweep.
2026-06-21 16:41:17 -07:00
xxxigm
242ec45f45 fix(gateway): don't lazy-install SDKs for unconfigured platforms on startup
For adapter plugins, ``PlatformEntry.check_fn`` doubles as a lazy installer:
calling it pip-installs the platform SDK as a side effect (see e.g.
``plugins/platforms/discord/adapter.py::check_discord_requirements``). The
enablement sweep in ``_apply_env_overrides`` called ``check_fn`` for every
registered plugin platform unconditionally, so a single
``load_gateway_config()`` — which the desktop/dashboard readiness probe
``GET /api/status`` awaits synchronously — pip-installed Discord, Telegram,
Slack, Feishu and Dingtalk even when the user configured none of them
(``platforms: none``). On a slow or restricted network the installs ran long
enough to block the event loop past the desktop's readiness timeouts, so the
app timed out, killed and re-spawned the backend, and boot-looped (stuck at
94%).

Consult the cheap ``is_connected`` credential check FIRST and only run the
install-triggering ``check_fn`` for platforms that are already enabled or
actually configured. Auto-enable-by-credentials is unchanged: a platform with
its token set still gets its SDK installed and enabled.
2026-06-21 16:41:17 -07:00
Dusk1e
8fcb8136bb fix(security): harden smart approval guard against prompt injection
# Conflicts:
#	tools/approval.py
2026-06-21 16:39:48 -07:00
JP Lew
c11ae8261b fix(codex): seed app-server sessions with configured cwd 2026-06-21 16:39:02 -07:00
Brooklyn Nicholson
7785655b4e fix(desktop): keep the floating composer in-bounds so it can't be lost off-screen
The pop-out position is a bottom-right corner inset; the old clamp only floored
it and capped each inset by a flat constant, so dragging left/up (or restoring a
position saved on a larger/other monitor) could push the box's width/height past
the left/top edges and strand it off-screen — unrecoverable since the bad spot
persisted to localStorage.

Now the clamp bounds the WHOLE box (accounting for its measured width/height plus
an edge margin) on all four sides. Applied on drag (measured size), on load
(clamped in readPosition), and via a mount + window-resize reclamp so a shrunk
window or stale persisted value always pulls the box back into view.
2026-06-21 18:35:33 -05:00
Teknium
745c4db235
feat(desktop/windows): show update-in-progress feedback before the desktop exits (#50419) (#50448)
Follow-up to #50238/#50381. The restart-loop is now SAFE (marker + launch
gate), but the trigger that lured users into relaunching mid-update remained:
on the in-app update hand-off the desktop window vanished almost immediately
(app.quit() 600ms after spawning the detached updater), before the updater's
own window appeared — a blank-screen gap that looks like a crash.

- Linger on the update overlay for UPDATE_HANDOFF_DWELL_MS (2.5s, was 600ms)
  before quitting, on BOTH hand-off paths (in-app update + Windows bootstrap
  recovery), so the message lands and bridges to the updater window.
- Strengthen the restart-stage copy and the overlay's applyingBody/applyingClose
  to explicitly tell the user the window will reopen automatically and NOT to
  reopen Hermes themselves while it updates. All four locales (en/ja/zh/zh-hant)
  updated in parity.

Pure UX; does not touch the #50381 marker/gate mutual-exclusion safety net.
2026-06-21 15:34:52 -07:00
teknium1
624580e836 fix(browser): verify daemon identity before orphan reaper kills a PID (#14073)
The browser orphan reaper reads a daemon PID from a `.pid` file in a
world-writable, predictably-named temp dir (`/tmp/agent-browser-h_*`) it
does not write itself, then tree-kills that PID via `_terminate_host_pid`
after only a liveness check. A same-user actor could plant a fake socket
dir whose `.pid` points at an arbitrary victim process, and OS PID reuse
after the real daemon exits could land the recorded PID on an unrelated
process — either way an arbitrary same-user process (and its whole tree)
gets SIGTERMed. Local DoS.

Add `_verify_reapable_browser_daemon()`, gated before the kill: via psutil
(a hard dep, fine cross-platform for the same-user processes the reaper can
signal) require both (1) identity — `agent-browser` in the process
name/cmdline — and (2) binding — the live process references *this* session's
socket dir in its cmdline or `AGENT_BROWSER_SOCKET_DIR`. The binding check is
the real spoof defense: a planted/recycled PID won't embed our exact socket
path. Fail-closed on any ambiguity (unreadable cmdline, no match), leaving the
process and its socket dir untouched for a later sweep.

Builds on @sgaofen's fix in #14394 (cmdline identity check); rewritten to use
psutil instead of `/proc`+`ps` (cross-platform, Windows-covered) and to add
the session-socket-dir binding check for recycled-PID / spoof resistance.

Co-authored-by: sgaofen <135070653+sgaofen@users.noreply.github.com>
2026-06-21 15:23:47 -07:00
teknium1
4d4ba0831e refactor(session): simplify traversal guard to a helper + logger, harden non-leading separators
Follow-up to the salvaged #9560 fix:
- Replace the _TRAVERSAL_RE regex with an explicit _is_path_unsafe() helper
  (drops the now-unused `import re`); catches a path separator ANYWHERE,
  not just leading, so a non-leading Windows backslash can't slip through.
- Switch the per-entry skip in _ensure_loaded_locked from print() to
  logger.warning to match the module's logging conventions.
- Add AUTHOR_MAP entry for the contributor.
- Add regression tests for the non-leading-separator case.
2026-06-21 15:23:36 -07:00
OrbisAI Security
aa2aac68b0 fix(V-009): reject Windows drive-letter paths in session field validation
Extends the CWE-22 path traversal guard to cover Windows absolute paths
of the form C:/... and D:\... — previously only leading / and \ were
checked, which missed drive-letter prefixes. Replaces the inline
startswith check with a compiled module-level regex (_TRAVERSAL_RE) that
covers all three attack patterns: .., leading /\, and leading X: drives.
Adds two regression tests for C:/windows/system32 and D:\\path\\to\\file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-21 15:23:36 -07:00
OrbisAI Security
3a6a43cb81 fix(V-009): reject path traversal in SessionEntry.from_dict and harden _ensure_loaded
Addresses PR #9560 review comments: applies the CWE-22 fix to current main
(post-PR #458 rebase) and adds the requested regression tests.

- SessionEntry.from_dict now raises ValueError for session_key or session_id
  containing '..' or starting with '/' or '\' (directory traversal guard)
- SessionStore._ensure_loaded moves per-entry validation inside the loop so
  one malicious/corrupt entry is skipped with a warning instead of aborting
  the entire sessions.json load
- Adds TestSessionEntryFromDictTraversalValidation (5 cases) and
  TestEnsureLoadedSkipsInvalidEntries covering the skip-not-abort behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-21 15:23:36 -07:00
orbisai0security
c8eb7cf843 fix: V-009 security vulnerability
Automated security fix generated by Orbis Security AI
2026-06-21 15:23:36 -07:00
ethernet
bb59075b25
Merge pull request #50398 from helix4u/fix/windows-npm-path-fallback
fix(windows): prefer cmd npm shim on PATH fallback
2026-06-21 18:55:02 -03:00
devorun
6f0ecf37da fix(redact): mask all Authorization schemes and x-api-key style headers
Secret redaction only matched `Authorization: Bearer <token>`. Other auth
headers passed through verbatim into logs, tool output, and transcripts:

- `Authorization: Basic <base64>` — leaks base64(user:password)
- `Authorization: token <pat>` / any non-Bearer scheme
- `Proxy-Authorization: ...`
- `x-api-key: <key>` (Anthropic and many providers) and `api-key`,
  `x-goog-api-key`, `x-auth-token`, `x-access-token`, ... — opaque values with
  no known vendor prefix were caught by nothing

A logged request or an echoed `curl -H "x-api-key: ..."` command therefore
leaked live credentials.

Generalize the Authorization rule to mask the credential for any scheme (and
Proxy-Authorization) while preserving the header name and scheme word for
debuggability, and add an api-key header rule for the single-opaque-value
headers. Bearer behavior is unchanged; plain prose containing the word
"authorization" (no colon-delimited value) is left untouched.

Adds regression tests for Basic/token/Proxy auth and the x-api-key/api-key
headers, including inside a curl command.
2026-06-21 14:08:06 -07:00
teknium1
87ab373381 test(url-safety): cover IPv6 scope-ID strip + fail-closed in URL guards
Follow-up to the salvaged #25961 fix: regression tests asserting that
scope-bearing IPv6 addresses (fe80::1%eth0, ::1%lo) are blocked by
is_safe_url after the scope is stripped, that a still-unparseable address
fails closed, and that a scoped IPv4-mapped IMDS address is caught by the
always-blocked floor.
2026-06-21 13:56:35 -07:00