hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-23 10:42:00 +00:00

Author	SHA1	Message	Date
Eri Barrett	ba9e3a491b	feat(memory): Honcho OAuth connect — desktop and CLI flows + token refresh (#44335 ) * feat(memory): OAuth token storage and refresh for the Honcho provider * feat(memory): refresh the Honcho OAuth token in the client and session * feat(memory): zero-CLI loopback OAuth authorization flow * feat(memory): generic memory-provider OAuth connect endpoints * feat(desktop): memory-provider OAuth connect link * feat(memory): CLI OAuth sign-in with source-tagged authorize links * fix(memory): IP-literal loopback redirect and consent config_path on the authorize link * fix(memory): profile-scope the memory-provider OAuth endpoints * refactor(desktop): generic memory-provider OAuth client functions * docs(memory): trim OAuth module docstrings to the invariants * docs(memory): document OAuth connect as an optional auth method * fix(memory): send home-relative display path to consent, not the absolute path * perf(memory): cache OAuth token expiry in memory to skip the hot-path disk read * fix(memory): log OAuth refresh failures at warning, not debug * feat(memory): fall back to an OS-assigned loopback port when 8765 is taken * test(memory): cover the desktop Connect launcher, status, and provider dispatch * fix(desktop): keep the memory-provider dropdown one size regardless of connect state * fix(desktop): move the memory connect link to the description line, leaving the dropdown untouched * refactor(memory): move OAuth connect routes out of web_server into a memory-layer router * refactor(desktop): import MemoryConnect directly, drop the single-export barrel * fix(memory): launch CLI OAuth sign-in right after the auth choice, not after the wizard * fix(desktop): auto-clear the OAuth error state instead of leaving it sticky * test(honcho): isolate auth-method prompt from deployment-shape wizard tests main's wizard suite scripts the cloud prompts without the OAuth auth-method step; auto-answer it in the shared helper so the answer lists stay shape-only. 
* docs(honcho): document query-adaptive reasoning level (reasoningHeuristic) README never mentioned reasoningHeuristic and listed reasoningLevelCap as an orphaned cap with the wrong default (— vs "high"). Add the query-adaptive scaling note + the reasoningHeuristic/reasoningLevelCap rows (grouped under Dialectic & Reasoning), matching the wording already on the hosted honcho.md page, and add a pointer from the memory-providers overview. * fix(honcho): default the CLI peer prompt to the OAuth consent name The CLI runs the grant with apply_config=False, so the peerName the user just entered at consent was dropped and the wizard's 'Your name' prompt fell back to $USER. Surface it as a transient OAuthCredential.consent_peer_name (set even when config isn't merged) and seed the prompt default from it. * feat(honcho): split OAuth client_id by surface (cli=hermes-agent, desktop=hermes-desktop) resolve_endpoints now picks the client_id from the initiating surface and threads it through authorize -> token exchange -> persisted grant -> refresh, so the CLI and desktop register as distinct OAuth clients. Surface-specific env overrides (HONCHO_OAUTH_CLIENT_ID_CLI/_DESKTOP) win over the generic HONCHO_OAUTH_CLIENT_ID, which still overrides every surface. * feat(honcho): show OAuth vs API key in status; detect existing OAuth in setup status now prints 'Auth: OAuth (clientId, token valid Xm/expired)' instead of masking the OAuth access token as a generic API key; setup notes an existing OAuth grant when re-run. * docs(honcho): drop 'shared pool' wording from unified observation mode help * fix(honcho): cross-process lock around OAuth refresh to prevent grant revocation The in-process threading lock can't stop a sibling process (another profile or the desktop app sharing honcho.json) from replaying the single-use refresh token and tripping reuse-detection, which revokes the whole grant. 
Guard the read-refresh-persist section with an OS file lock on <config>.lock so only one process rotates at a time; the others re-read the freshly-persisted token. Best-effort: platforms without flock degrade to in-process serialization. * refactor(honcho): one OAuth client (hermes-agent) for all surfaces Collapse the per-surface client_id split. CLI and desktop now use a single client_id (hermes-agent); consent branding/UI still adapt via the source query param. One grant identity means no clientId-vs-refresh-token desync that could get the grant revoked. HONCHO_OAUTH_CLIENT_ID still overrides for self-hosting. * fix(honcho): per-session resolves to session_id, never remapped by title Reorder resolve_session_name so stable identifiers win over labels: gateway per-chat key first, then the per-session session_id, then the cwd map / title. A (possibly auto-generated) title can no longer remap a live per-session conversation onto a second Honcho session mid-stream — fixes the desktop, which is per-conversation via session_id. Consequence: a gateway's per-chat key now also wins over a title (titles never remap a stable id).	2026-06-22 19:16:47 -05:00
brooklyn!	672ea1f894	Merge pull request #50994 from NousResearch/hermes/hermes-9fb04abd fix(computer-use): working vision capture + whole-screen/desktop target on Windows	2026-06-22 19:02:04 -05:00
Brooklyn Nicholson	833710d33e	Merge remote-tracking branch 'origin/main' into pr-50994 # Conflicts: # tools/computer_use/cua_backend.py	2026-06-22 18:48:07 -05:00
brooklyn!	116331dd3f	Merge pull request #51094 from NousResearch/bb/desktop-thread-timeline feat(desktop): conversation timeline rail for long threads	2026-06-22 18:41:13 -05:00
brooklyn!	760fd9513e	Merge pull request #51078 from NousResearch/bb/fix-vision-capture fix(computer-use): vision capture returns an image on cua-driver >=0.5.x	2026-06-22 18:37:18 -05:00
brooklyn!	6780cee679	Merge pull request #51072 from NousResearch/bb/desktop-computer-use feat(computer-use): add a cross-platform readiness preflight to the desktop	2026-06-22 18:37:07 -05:00
Brooklyn Nicholson	3fffecbdaf	feat(desktop): add timeline rail for long chat threads Adds a compact right-edge prompt timeline for long desktop chat sessions, with hover previews, click-to-jump, active/hover row states, and pane hover-reveal suppression so the rail can live at the hard edge without opening side panels.	2026-06-22 18:34:07 -05:00
brooklyn!	9bacd7d4bb	Merge pull request #51096 from NousResearch/bb/desktop-oversized-image-replay fix(agent): shrink anthropic-native image history	2026-06-22 18:30:18 -05:00
brooklyn!	b90f1e4ac0	Merge pull request #51093 from NousResearch/bb/desktop-string-stack-overflow fix(desktop): avoid stack overflow on embedded image replay	2026-06-22 18:26:34 -05:00
Brooklyn Nicholson	88e136448d	fix(agent): shrink anthropic-native image history Retry image-size rejections by rewriting Anthropic base64 image source blocks, not just OpenAI-style image_url parts.	2026-06-22 18:23:21 -05:00
Brooklyn Nicholson	a6b670d4a2	fix(desktop): avoid stack overflow on embedded image replay Replace the giant embedded-image regex with a bounded scanner so opening sessions with multi-megabyte data URLs does not crash the renderer.	2026-06-22 18:19:36 -05:00
Brooklyn Nicholson	3c1058e2e9	fix(computer-use): set stdin=DEVNULL on cua-driver subprocess calls The subprocess-stdin guard (TUI gateway fd-inheritance protection) flagged the `permissions grant` call. None of the cua-driver probes/grant read stdin, so DEVNULL is correct; apply it to the shared `_run` helper and the grant call.	2026-06-22 17:59:18 -05:00
Brooklyn Nicholson	2dfcead683	feat(computer-use): make the preflight cross-platform (win/linux) The card was macOS-only. cua-driver also runs on Windows and Linux, so fold `cua-driver doctor` (cross-platform binary/health probes) into a single OS-aware `ready` signal: - macOS: ready == both TCC grants; keeps the permission rows + grant flow. - Windows/Linux: no TCC toggles, so ready == driver health, with a per-OS note (SmartScreen/UIAccess on Windows; X11/XWayland on Linux). `computer_use_status()` replaces the macOS-only `permissions_status()` and surfaces `platform`, `ready`, `can_grant`, and the doctor `checks` (non-ok ones render as warnings). CLI `permissions status`, the REST endpoint, and the desktop card all key off the one payload. Grant stays macOS-only (400 elsewhere — nothing to grant).	2026-06-22 17:48:43 -05:00
Brooklyn Nicholson	807b696295	fix(computer-use): vision capture returns an image on cua-driver >=0.5.x Vision mode called a `screenshot` MCP tool that cua-driver dropped in 0.5.x (full-window PNG capture was folded into `get_window_state`). The driver replied "Unknown tool: screenshot", so `images` came back empty, `png_b64` stayed None, and capture returned a 0x0 result with no image on every call. `som`/`ax` were unaffected because they already use `get_window_state`, which masked the regression. Route vision by capability: - driver advertises `screenshot` (older builds) -> use it (no AX walk) - otherwise -> call `get_window_state` but discard the AX tree/elements, returning only the PNG so vision stays free of element noise - capabilities not yet discovered -> try `screenshot`, fall back to `get_window_state` on an empty image, so the path self-heals Add `_image_from_tool_result` to pull the PNG from either an MCP image content-part or `structuredContent.screenshot_png_b64`, and use it on the som path too so the image won't silently drop on driver builds that deliver it via structuredContent instead of a content part. Verified live (vision: 1568x954, 0 elements; som: image + 527 elements) and with unit coverage of all four routing cases.	2026-06-22 17:41:42 -05:00
Brooklyn Nicholson	0223ea5f59	feat(computer-use): surface macOS permission preflight in the desktop Computer Use already worked through the desktop backend (the cua-driver toolset enables + installs via Settings -> Skills & Tools), but there was no in-app way to see or grant the two macOS permissions it needs, so "give a model my Mac" was tribal knowledge. The grants attach to cua-driver's OWN TCC identity (com.trycua.driver / the installed CuaDriver.app), not Hermes -- so no app entitlement is involved. cua-driver 0.5+ exposes `permissions status/grant`, which we wrap: - tools/computer_use/permissions.py: thin client over the two subcommands - hermes computer-use permissions {status,grant}: CLI parity - GET /api/tools/computer-use/status, POST .../permissions/grant: desktop REST - ComputerUsePanel: live Accessibility + Screen Recording state with a Grant button (dialog attributed to CuaDriver), shown in the expanded Computer Use toolset row. Binary install stays in the existing provider post-setup runner. Follow-ups: i18n the card copy; a "Stop driver" control (cua-driver stop) for the runaway-`serve` case.	2026-06-22 17:33:52 -05:00
Teknium	87c4a5ebb8	feat(background-review): aux-model selector for the self-improvement review (#49252 ) Adds auxiliary.background_review.{provider,model} (default auto = main chat model — unchanged). Set it to a different, cheaper model and the post-turn self-improvement review runs there for ~3-5x lower cost. Cache-aware by design: the main chat is warm in the prompt cache, so the default full-history replay on the main model is cheap cache reads — left exactly as-is. A different model can't reuse that cache (different key), so when (and only when) routed to a different model the fork replays a compact digest instead of the full transcript, minimising what it cold-writes on the aux model. Same model -> full replay; different model -> digest. Quality holds in benchmarks: memory capture identical, skill near-identical. Nothing changes unless you opt in by naming a different model. Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-06-22 14:54:53 -07:00
Teknium	660e36f097	fix(cron): scope job execution to its owning profile (#32091 follow-up) (#50993 ) The #32091 fix moved every profile's cron jobs into one shared root store, but never wired the execution-scoping half it recommended: a job still ran under whichever profile's ticker picked it up, not its owning profile. So a job created under `hermes -p donna` could execute with the root profile's .env / config.yaml / credentials. - jobs.py: create_job auto-captures the active profile (explicit profile= override available) and stores it on the job; resolve_profile_home() maps a profile name to its HERMES_HOME; legacy jobs backfill to 'default'. - scheduler.py: run_job applies the job's profile via a scoped HERMES_HOME override (env var + in-process ContextVar) before any .env/config/script load, restored in finally. tick() routes profile-mismatched jobs to the single-worker sequential pool so the env mutation can't race. - cronjob tool threads profile through (NOT exposed in the model schema, to avoid cross-profile privilege escalation); hermes cron add gains --profile. E2E verified against a temp HERMES_HOME with a real profile dir: a root-profile ticker runs a profile='donna' job with HERMES_HOME=donna during execution and restores the ticker env afterward.	2026-06-22 14:54:28 -07:00
Tranquil-Flow	15880da8bb	fix(file_tools): resolve tilde using profile home for file operations (#48552) File tools (read_file, write_file, patch, list_directory, etc.) used os.path.expanduser(), which reads the gateway process's HOME env var. In Docker/systemd/s6 deployments where the gateway HOME differs from interactive sessions, tilde expanded to the wrong directory. Add an _expand_tilde() helper that delegates to get_subprocess_home() when available, falling back to os.path.expanduser(). Replace all 9 expanduser() call sites in file_tools.py with _expand_tilde().	2026-06-23 03:17:47 +05:30
kshitijk4poor	c080b2dc3e	fix(gateway): redact credentials from TUI approval prompts (#48456 ) Follow-up to #50767, which redacted the chat-platform (_approval_notify_sync) and SSE/API (_approval_notify) approval transports. The TUI JSON-RPC transport is the third egress and was missed: three register_gateway_notify callbacks in tui_gateway/server.py emitted the raw approval_data — including the unredacted command Tirith flagged — straight to the TUI client via _emit. Route all three registrations through a new module-level _emit_approval_request() helper that redacts payload['command'] via the shared gateway.run._redact_approval_command seam before emitting, matching the pattern used for the other two transports. Completes the whole-bug-class fix for #48456. Tests: assert the helper emits a redacted command (real credential pattern), handles missing/None command, and a wiring guard that no registration emits the raw payload directly (only the helper may). Both mutation-checked. The #48456 fix series originated from @liuhao1024's #48462 — credit to them for the original report and chat-platform fix; this completes the remaining transport. Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-23 03:14:18 +05:30
kshitijk4poor	0e69cd4b37	fix(memory): honor configured char limits in the no-agent on-disk store Follow-up to the /memory approve fresh-store fix. Both the CLI fallback and the messaging-gateway handler built a bare MemoryStore() with the hardcoded default char limits (2200/1375), ignoring the user's configured memory.memory_char_limit / user_char_limit. A live agent honors those overrides (agent/agent_init.py), so an approval applied without a live agent could accept a write the user's lower cap would reject, or vice versa. Extract a shared tools.memory_tool.load_on_disk_store() factory that reads the configured limits (falling back to defaults if config can't load) and wire both the CLI and gateway handlers to it, closing the gap on both surfaces and de-duplicating the construction block.	2026-06-23 03:10:53 +05:30
Max Hsu	3147cbb136	fix(memory): apply /memory approve against a fresh store when no live agent The CLI /memory slash handler (cli_commands_mixin._handle_memory_command) passed self.agent._memory_store straight through, which is None when the command runs without a live agent — e.g. /memory approve from the Desktop GUI. The shared write-approval handler then returns "memory store unavailable" and applies nothing, even with built-in memory enabled and pending writes present. Fall back to a freshly loaded on-disk MemoryStore when no live store is available, mirroring the gateway path (gateway/slash_commands.py). It persists to the same MEMORY/USER.md and creates MEMORY.md on the first approved write. Fixes #46783 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-23 03:10:53 +05:30
kshitijk4poor	100e7be20e	fix(security): deny root-level credential stores in media delivery The media-delivery denylist in gateway/platforms/base.py enumerated only .env/auth.json/credentials/config.yaml under HERMES_HOME, so other credential stores that live at the root fell through and could be auto-attached to chat replies. The reported case: the Google Workspace skill's google_token.json refreshes every turn, bumping its mtime to 'now', which kept passing the strict-mode recency window and re-sent the OAuth token on every reply. Extend the explicit per-file denylist to mirror the canonical credential set already enforced by the read/write guards in agent/file_safety.py: google_token.json, google_oauth_pending.json, auth/google_oauth.json, .anthropic_oauth.json, webhook_subscriptions.json, cache/bws_cache.json, auth.lock, and the pairing/ token directory. Targeted per-file additions (not a blanket ~/.hermes deny, which was declined in #32090/#34425 because it would block skills/, logs/, and ad-hoc agent-written deliverables). mcp-tokens/ (#37222) and state.db/kanban.db (#41071) are left to their sibling targeted PRs. Reported-by: xxxigm (#50912)	2026-06-23 02:56:48 +05:30
Teknium	e9b86f352f	fix(discord): delete obsolete slash commands before creating new ones Discord enforces a hard 100-command limit per app and rejects an upsert that would push the live total over 100 (error 30032), which silently breaks ALL slash commands. The sync deleted obsolete commands AFTER creating new ones, so an app already at the cap momentarily exceeded it and the whole sync failed. Reorder: delete no-longer-desired commands up front, then create/update. Removes the now-redundant trailing delete loop. Adapts @infinitycrew39 PR #50890 to current main (the original adapter diff no longer applied after the platform refactor); test commit cherry-picked with authorship preserved.	2026-06-22 13:58:33 -07:00
infinitycrew39	91c465f6e7	test(discord): add regression test for 100-command sync limit Add a test to verify that _safe_sync_slash_commands deletes obsolete commands before creating new ones. This ensures we never temporarily exceed Discord's 100-command limit during sync, which would trigger error 30032 and break all slash commands. This test guards against the regression where sync could fail even though the registration cap was properly enforced.	2026-06-22 13:58:33 -07:00
helix4u	ae7e857420	fix(cron): deliver max-iteration fallback reports	2026-06-22 13:57:59 -07:00
helix4u	3972701424	fix(agent): complete final text on last turn	2026-06-22 13:57:59 -07:00
Teknium	0f741cef28	fix(tests): update cua install tests for cross-platform support f-trycua's #50855 test file predated the cross-platform PR (#50552) and reintroduced two stale tests asserting Linux is unsupported (test__non_macos_, patching platform.system="Linux" and expecting a no-op/warn). Linux + Windows are supported now, so install proceeds on those platforms. Restore main's cross-platform-correct versions: test__on_unsupported_platform_ using FreeBSD as the genuinely unsupported case.	2026-06-22 13:41:03 -07:00
Francesco Bonacci	5f1d23cfb2	fix(computer-use): delete broken pre-install asset probe; trust the upstream installer `hermes computer-use install` refused to install on Linux, Windows, and macOS x86_64 because the pre-install asset probe was hitting the wrong GitHub endpoint AND duplicating tag-resolution logic the upstream installer already does correctly. `_check_cua_driver_asset_for_arch()` queried `https://api.github.com/repos/trycua/cua/releases/latest`. On trycua/cua: - cua-driver-rs releases (the binary the installer fetches) are marked prerelease on every cut. GitHub's `/releases/latest` explicitly skips prereleases. - The Python package releases (`cua-agent`, `cua-computer`, `cua-train`) are non-prerelease and end up as the "latest" instead. Live API check today: $ curl -sf https://api.github.com/repos/trycua/cua/releases/latest | jq '{tag:.tag_name, asset_count: (.assets|length)}' { "tag": "agent-v0.8.3", "asset_count": 0 } The probe sees zero assets, prints "Latest CUA release has no Linux x86_64 asset", and skips install on every Linux / Windows / macOS-x86_64 host, even though the cua-driver-rs-v0.6.0 release ships 19 binary assets covering all those platforms. Filtering `/releases?per_page=N` for the `cua-driver-rs-v` prefix fixes the bug, but it duplicates tag-resolution logic the upstream `_install-rust.sh` already does correctly via `CUA_DRIVER_RS_BAKED_VERSION` (auto-baked by CD on every release, with a `/releases?per_page=N` API fallback for dev checkouts). The right answer is to trust that contract instead of mirroring it in Python where it can drift. Two paths get the same outcome without the probe: 1. Fresh install: run `install.sh` directly. It has the baked release tag, fetches the right asset, and errors with a clear message on missing-arch downloads. No preflight needed. 2. Upgrade path: `cua_driver_update_check()` (separately added) shells out to `cua-driver check-update --json` against the installed binary, which returns the canonical update answer from the same source the installer uses. - `hermes_cli/tools_config.py`: delete `_check_cua_driver_asset_for_arch` and its two call sites in `install_cua_driver`. Replace with an inline comment near the top of the module explaining the rationale. - `tests/hermes_cli/test_install_cua_driver.py`: drop the `TestCheckCuaDriverAssetForArch` block. Add `TestArchProbeRemoval` with three regressions: - `test_probe_function_is_gone`: asserts the deleted helpers stay deleted. - `test_fresh_install_does_not_call_github_api`: asserts the install path doesn't hit GitHub directly from Python anymore. - `test_upgrade_with_binary_does_not_call_github_api_directly`: same for the upgrade path. All 9 `test_install_cua_driver` tests pass. Reported by @teknium1 while testing on a headed Ubuntu host.	2026-06-22 13:41:03 -07:00
Teknium	f721d2cda9	fix(image/video gen): make schema delivery instruction platform-neutral (#51031 ) * chore: re-trigger CI (workflows did not dispatch on prior head) * fix(image/video gen): make schema delivery instruction platform-neutral The image_generate and video_generate tool schema descriptions hardcoded a gateway-only delivery instruction ('display it with markdown ![description](url-or-path) and the gateway will deliver it'). That schema is sent on every platform, so on CLI it directly contradicted the CLI platform hint ('Do NOT emit MEDIA:/path tags ... state its absolute path in plain text'), and on messaging platforms it was also wrong about the mechanism (local file paths are delivered via MEDIA: tags, not markdown image syntax — markdown ![]() only works for URLs). The per-platform file-delivery convention is already owned correctly by the platform hints in prompt_builder.py. The tool schema now just describes the result shape (URL or absolute path in the image/video field) and defers 'how to deliver' to the active platform's guidance. Provider/model injection already works via _build_dynamic_image_schema() (the 'Active backend: <provider> · model: <model>' line); no change there.	2026-06-22 13:40:42 -07:00
Austin Pickett	2a58fee1a1	fix(api): allow dashboard updates for git checkouts in containers (#51005) Salvages #50469 by @libre-7. _dashboard_local_update_managed_externally() previously blocked every containerized dashboard from the local update API, even when the running install was a bind-mounted git checkout that can be updated with hermes update. Allow the dashboard updater only for git installs inside containers, while keeping hosted /opt/data, docker, and pip installs managed externally. Pip remains blocked because its apply path mutates the running container filesystem and is not the self-managed checkout case. Adds regression coverage for docker, git, and pip install-method handling inside containers, and maps the contributor email for release attribution. Co-authored-by: libre-7 <libre-7@users.noreply.github.com>	2026-06-22 15:55:33 -04:00
Teknium	6681f28d5b	fix(telegram): disable DM topic mode when last binding is pruned Follow-up to #31501. When the send-fallback prune removes a chat's final telegram_dm_topic_bindings row, also flip telegram_dm_topic_mode.enabled to 0 in the same transaction. Without this, a user who turns topics off in the Telegram client (rather than via /topic off) leaves enabled=1 with zero lanes: _recover_telegram_topic_thread_id keeps treating the chat as topic-enabled and lobby messages keep hunting for bindings that no longer exist. Clearing the flag makes recovery fully stand down once the dead topics are gone. Adds 3 regression tests covering the last-binding clear, the multi-binding no-op, and the unmatched-prune no-op.	2026-06-22 12:29:05 -07:00
xxxigm	11246dbe21	tests: regression coverage for stale topic-binding prune (#31501 ) Thirteen tests across four layers: * ``SessionDB.delete_telegram_topic_binding`` — pin the new helper's contract: removes only the (chat_id, thread_id) row it was asked about, leaves siblings alone, returns 0 silently when the row never existed, and is a no-op on a pristine database whose topic-mode tables haven't been migrated yet. * ``TelegramAdapter._prune_stale_dm_topic_binding`` — the glue must drop the binding when ``self._session_store._db`` exposes the helper, swallow exceptions so a failed cleanup never breaks the user-facing send, and refuse to issue a DELETE for ``chat_id=None`` / ``thread_id=None`` so a bookkeeping miss can't accidentally null-match every row. * Source-level guards on ``TelegramAdapter.send`` and ``_send_message_with_thread_fallback`` — the prune call must sit beside the two existing "Thread X not found, retrying without message_thread_id" warnings, before the retry runs, so a future refactor can't silently drop the cleanup wire. * End-to-end semantic — once a topic is pruned, the ``GatewayRunner._recover_telegram_topic_thread_id`` walk steers future inbound messages to the surviving binding instead of the dead one. This is the exact behaviour change the bug report's reproduction asks for: no more landings in the wrong topic until the operator hand-edits ``state.db``. Refs #31501	2026-06-22 12:29:05 -07:00
xxxigm	142a5751a2	gateway/telegram: prune stale DM topic binding on Thread-not-found (#31501 ) Both fallback sites that currently log "Thread X not found, retrying without message_thread_id" now also drop the ``telegram_dm_topic_bindings`` row keyed on ``(chat_id, thread_id)``: * The streaming send loop (``send`` body) — fires on the second failure, after the same-thread one-shot retry confirms the thread really is gone (the first attempt is left alone because Bot API has been observed to return a transient "Thread not found" that recovers on immediate retry). * The control-message helper ``_send_message_with_thread_fallback`` (approval prompts, model picker, update prompts) — single-shot retry, prune unconditionally on the BadRequest match. Without this prune, a user who deletes a Telegram DM topic in the client keeps getting their next inbound message recovered back to the dead thread by ``_recover_telegram_topic_thread_id`` in ``gateway/run.py``, which walks the per-user binding list newest-first and treats the deleted thread as authoritative. The reproduction in the bug report is exactly this: tool progress, approvals, activity messages and replies all land in the wrong place until the user manually runs DELETE on state.db. Cleanup is best-effort — we log at INFO when it succeeds, swallow any exception from the SessionDB call, and the user-facing send proceeds either way. Refs #31501	2026-06-22 12:29:05 -07:00
xxxigm	4849a8e555	hermes_state: add SessionDB.delete_telegram_topic_binding (#31501 ) Targeted ``(chat_id, thread_id)`` prune for the ``telegram_dm_topic_bindings`` table — the missing piece for #31501, where the Telegram adapter detects a topic the user deleted out-of-band but the binding row keeps living in state.db. The recovery logic in ``gateway.run._recover_telegram_topic_thread_id`` then steers every future inbound message back to the dead topic, dropping tool progress, approvals and replies into the wrong place. Returns the number of rows deleted; silently no-ops when the topic-mode tables haven't been migrated yet (read-only / pristine profile) so the helper is safe to call from a send-fallback hot path before the schema has run.	2026-06-22 12:29:05 -07:00
Teknium	30e5d0092d	feat(computer-use): add whole-screen/desktop capture target capture(app='screen'|'desktop') now resolves to the OS shell/desktop window (Windows Progman/WorkerW desktop or Shell_TrayWnd taskbar, macOS Finder/Dock) so 'show me my screen' and 'click the taskbar' work. Previously capture() only matched application windows, and the schema advertised 'or the whole screen' without any code path delivering it. cua-driver is window-oriented (no virtual-desktop or per-monitor MCP tool), so a single image still cannot span multiple monitors; the schema now states this, and the no-desktop-window path returns a clear message instead of silently grabbing the frontmost app.	2026-06-22 12:21:58 -07:00
jeeves-assistant	5250335863	fix(computer-use): route CuaDriver vision capture via get_window_state cua-driver 0.6.x removed the standalone screenshot MCP tool, so capture(mode='vision') hit 'Unknown tool: screenshot' and returned a 0x0 image with no PNG while som/ax (which use get_window_state) still worked. Route vision through get_window_state(capture_mode='vision'). Salvaged from PR #50771; same fix submitted earlier as #39262 by @Tranquil-Flow.	2026-06-22 12:21:58 -07:00
Teknium	2ba1cfeb2e	feat(goals): completion contracts for /goal — evidence-based judging (#50501) Adds an optional structured completion contract to the standing-goal loop, adapted from OpenAI Codex's /goal guidance (a durable objective works best when it names what done means, how to prove it, what not to break, what's in scope, and when to stop). A contract has five optional fields — outcome, verification, constraints, boundaries, stop_when. When set, the continuation prompt tells the agent to target the verification surface and respect constraints, and the judge marks the goal done only when the verification criterion is met with concrete evidence (command result, file excerpt, test output) instead of a loose "looks done" claim. This tightens the most common /goal failure mode: premature completion / endless over-continuation on an underspecified goal. Two ways to set a contract, both backward compatible (bare /goal <text> behaves exactly as before): - /goal draft <objective> — expands plain text into a full contract via the goal_judge aux model (cache-safe side call), falls back to a free-form goal if the model is unavailable. - /goal <text> with inline 'field: value' lines (verify:, constraints:, boundaries:, stop when:, ...). Plain goals with an incidental colon are not mangled — only known field prefixes are pulled out. - /goal show prints the active contract. Contracts persist in SessionDB.state_meta alongside the goal (survive /resume), compose with /subgoal criteria, and old goal rows load unchanged. CLI + every gateway platform via the shared GoalManager engine; zero new model tools. Tests: +18 in tests/hermes_cli/test_goals.py (parse/serialize/judge-prompt/draft/fallback), 73/73 green; 42/42 across the broader goal test surface; live E2E roundtrip (set -> persist -> reload -> contract-aware prompts) green.	2026-06-22 12:20:09 -07:00
Teknium	ff08e60c63	feat(skills): add cloudflare-temporary-deploy optional skill (#50849) * chore: re-trigger CI (workflows did not dispatch on prior head) * feat(skills): add cloudflare-temporary-deploy optional skill Optional web-development skill teaching the agent to deploy a Worker to a live workers.dev URL with no Cloudflare account via 'wrangler deploy --temporary' (Wrangler 4.102.0+). Cloudflare provisions a throwaway, claimable account valid for 60 minutes — ideal for an autonomous write->deploy->verify loop with no OAuth/signup hard stop. - SKILL.md: when/when-not, prereqs (unauth requirement, version floor), step-by-step deploy + verify flow, product limits table, pitfalls (hidden flag, stale global wrangler, auth-present error, rate limits, workers.dev edge cache), verification. - scripts/parse_deploy_output.py: stdlib-only parser extracting live URL, claim URL, account name/state, expiry, deploy status from wrangler output. - tests/skills/test_cloudflare_temporary_deploy_skill.py: 16 tests incl. a real-output regression case. Verified live end-to-end: temporary account created with no creds, deployed to a live URL, curl confirmed body, redeploy reused the account.	2026-06-22 12:14:30 -07:00
brooklyn!	7dece1d933	Merge pull request #50977 from NousResearch/bb/composer-fixed-portal fix(desktop): keep floating composer on-screen, scoped to the thread area	2026-06-22 14:12:02 -05:00
Brooklyn Nicholson	de7ad8b78e	fix(desktop): guarantee out-of-bounds composer is reclamped on load Re-clamp once more on the next frame after pop-out so layout (sidebar widths, fonts) has settled, and treat a degenerate pre-layout bounds rect as "unknown" (fall back to the window) so we never clamp the box into a collapsed area. Net: anyone who loads in with a stranded position is pulled back on-screen and the fix is persisted, even if the first measure was premature.	2026-06-22 13:59:26 -05:00
Brooklyn Nicholson	ea5fa505d9	fix(desktop): clamp floating composer to the thread area, not the whole window Now that the popped-out composer is fixed to the viewport, clamping against the window let it slide under a pinned sidebar. Confine it to the thread region (data-slot="composer-bounds") instead — its rect already excludes a pinned sidebar and the header — falling back to the full window before it's measured. This subsumes the old titlebar top-margin (the thread rect starts below the header).	2026-06-22 13:57:53 -05:00
Brooklyn Nicholson	aff5ae692f	fix(desktop): move composer out of contain wrapper instead of portaling Replaces the body-portal approach: render ChatBar as a sibling of the contain:[layout paint] chat wrapper (inside the same runtime boundary) rather than portaling the floating instance to <body>. The wrapper is a containing block for — and clips — position:fixed descendants, which is what stranded the popped-out composer off-screen. As a sibling it anchors to the outer relative container: docked stays absolute (identical placement), floating resolves against the viewport. Both states stay mounted, so dock<->float no longer remounts the editor (the portal toggle did).	2026-06-22 13:41:53 -05:00
Brooklyn Nicholson	79f270f549	fix(desktop): portal floating composer to body so it can't be clipped off-screen The popped-out composer is position:fixed, but the chat content wrapper sets `contain: layout paint`, which makes it a containing block for — and clips — fixed descendants. Inline, the floating composer was positioned/clipped relative to the chat column (which shifts with the sidebars), not the viewport, so the viewport-based bounds clamp from #50466 couldn't keep it reachable: users still lost it off-screen. Portal it to <body> when popped out so fixed positioning and the clamp finally share the viewport as their reference. Docked stays inline (it's absolute within the chat column by design).	2026-06-22 13:37:31 -05:00
kshitij	5937b95192	Merge pull request #50773 from NousResearch/salvage/43719-dashboard-plugin-rce fix(security): restrict dashboard plugin backend auto-import to bundled plugins — defense-in-depth (#43719)	2026-06-22 22:57:33 +05:30
kshitijk4poor	e2bea0abe6	refactor(security): centralize non-bundled plugin sources in one constant /simplify-code (LOW, flagged by two reviewers): the source tags 'user' / 'project' / 'bundled' were bare string literals scattered across the discovery scrub and the two mount-time refuse guards. A typo in any one site (e.g. 'users') would SILENTLY disable a security gate with no error — the exact failure mode this RCE boundary must not have. Introduce a shared module-level _NON_BUNDLED_PLUGIN_SOURCES frozenset referenced by both the discovery scrub and the (now single) mount guard, so the auto-import policy lives in one place. The two mount guards collapse into one gate that still emits the distinct per-source operator message via a map (no loss of guidance). Behavior unchanged: 39 RCE-bypass tests pass, and the constant is mutation-checked (typo'ing it fails the bypass tests). Defence-in-depth (discovery scrub + mount refuse) is retained intentionally.	2026-06-22 22:48:37 +05:30
Teknium	f1e6d39a74	feat(computer_use): disable cua-driver telemetry by default, add opt-in (#50842) * feat(computer_use): disable cua-driver telemetry by default, add opt-in cua-driver ships anonymous PostHog usage telemetry ENABLED by default upstream (fires cua_driver_install / cua_driver_doctor events to eu.i.posthog.com). Hermes now disables it for our users unless they explicitly opt in. - New config key `computer_use.cua_telemetry` (default false) in DEFAULT_CONFIG. - `cua_backend.cua_driver_child_env()` injects `CUA_DRIVER_RS_TELEMETRY_ENABLED=0` into the child env when telemetry is disabled (the default); leaves the var untouched on opt-in so the driver uses its own default. Reads config fail-safe — any error defaults to telemetry off. - Routed every cua-driver spawn site through the policy: MCP backend (StdioServerParameters env), `cua_driver_update_check`, doctor's health_report Popen, the install.sh/install.ps1 runner, and the `--version` / status probes. - Docs: new Telemetry subsection in computer-use.md (EN). - Tests: tests/computer_use/test_cua_telemetry.py — default disables, explicit-false disables, opt-in leaves var untouched, config-failure fails safe, inherited-enabled is overridden off. Verified live on Linux against the real cua-driver-rs 0.6.0 binary: with the var=0 the driver reports "telemetry: disabled via CUA_DRIVER_RS_TELEMETRY_ENABLED" and sends no event; with it unset it logs "sending event: cua_driver_doctor". 213 computer_use + install tests green. * fix(dashboard): fold computer_use config category into agent tab The new computer_use.cua_telemetry key created a single-field dashboard config category, tripping test_no_single_field_categories (web_server's invariant that categories with <2 fields must be merged to avoid tab sprawl). Add computer_use -> agent to _CATEGORY_MERGE, matching the existing onboarding/telegram single-field folds.	2026-06-22 09:57:16 -07:00
Teknium	ed711e1c2c	chore: add iaji to AUTHOR_MAP for salvaged Slack mention_patterns fix	2026-06-22 09:44:52 -07:00
iaji	441bd6d8db	fix(slack): split csv mention pattern fallback	2026-06-22 09:44:52 -07:00
devorun	4966268764	fix(slack): honor documented `mention_patterns` wake words The Slack docs document `slack.mention_patterns` as custom wake words that trigger the bot alongside `@mention`, and the config layer bridges the key into the Slack adapter's `config.extra` — but the adapter never read it. With `require_mention` on, a channel message containing a configured wake word (and no literal `<@BOTUID>`) was silently ignored. Every other adapter that documents `mention_patterns` (Telegram, DingTalk, Mattermost, WhatsApp, BlueBubbles, Photon) implements it; Slack was the odd one out. Add `_slack_mention_patterns()` (compiled, cached; reads `slack.mention_patterns` as a list/string or `SLACK_MENTION_PATTERNS` as a JSON/CSV/newline list, invalid regexes warned and skipped) and `_slack_message_matches_mention_patterns()`, mirroring the existing adapters. Channel mention detection now also triggers on a wake-word match, so the documented field works as described. Adds tests for pattern compilation (list/string/env/invalid-regex) and for the channel-trigger gating with a wake word under require_mention.	2026-06-22 09:44:52 -07:00
Teknium	2617946397	fix(delegation): emit high-concurrency cost warning once per process (#50848) * chore: re-trigger CI (workflows did not dispatch on prior head) * fix(delegation): emit high-concurrency cost warning once per process _get_max_concurrent_children() runs on every get_definitions() schema rebuild (via _build_top_level_description / _build_tasks_param_description), not just on actual delegate_task calls. With max_concurrent_children>10 the cost advisory fired on every turn / agent spawn across every session, spamming the log even when delegate_task was never used. Gate it behind a module-level _HIGH_CONCURRENCY_WARNED flag so it warns at most once per process.	2026-06-22 09:44:30 -07:00

12617 commits