hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Author	SHA1	Message	Date
Teknium	290fa7fd2b	fix(gateway): skip confirmed-dead delivery targets (deleted groups, blocked bots) (#55115 ) * fix(gateway): skip confirmed-dead delivery targets (deleted groups, blocked bots) A deleted Telegram group, kicked/blocked bot, or deactivated user keeps throwing Forbidden/not_found on every cron tick and fan-out delivery. Each retry burns a send against the platform's flood-control envelope and spams the logs, making the whole session feel broken even when the model call completed. Add a small persistent DeadTargetRegistry (per-profile JSON under HERMES_HOME) that records a target the moment a send reports a whole-chat death (forbidden / chat-level not_found), and have DeliveryRouter.deliver() short-circuit it on subsequent attempts. Self-healing: any successful send clears the flag, so a user re-adding the bot recovers with no manual cleanup. Thread/topic-level not_found is NOT recorded (adapters already self-heal that by retrying without reply_to). Transient/timeout errors are never marked dead. * infographic: dead delivery target skipping	2026-06-29 13:23:29 -07:00
brooklyn!	d417ffb363	Merge pull request #55114 from NousResearch/bb/pet-roam feat(desktop): roaming pet (opt-in)	2026-06-29 15:00:03 -05:00
Brooklyn Nicholson	a1e699ae55	feat(desktop): roaming pet patrols the base of an open overlay When a full-screen route overlay (settings/profiles/cron/agents/command-center) is up, the pet's walkable surface swaps to a single ledge at the overlay card's bottom edge — derived from OverlayView's shared inset, not measured — so it patrols there; closing the overlay restores the normal surfaces and it drops back down.	2026-06-29 14:57:26 -05:00
Brooklyn Nicholson	0e2a5a3206	feat(desktop): ground the roaming pet — sprite-paced walk + feet on surface Walk speed is derived from the sprite's animation loop + on-screen size (one body-width per loop) instead of a fixed px/s, so it steps rather than glides; the pet also sinks a few px so its feet meet the surface instead of hovering.	2026-06-29 14:47:37 -05:00
Austin Pickett	75d4aa9325	fix(web): confirm sidebar Update Hermes before running Match the Restart Gateway flow with a confirm dialog that fetches cached update metadata so users see commit-behind context before applying. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-29 12:30:24 -07:00
Austin Pickett	dbe92b9ed1	fix(web): confirm sidebar gateway restart and use DS checkboxes Prompt before restarting from the sidebar system menu, and replace native checkboxes on the System page with the design-system Checkbox component. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-29 12:30:24 -07:00
Austin Pickett	1abf0c6cbf	fix(web): polish dashboard sidebar chrome and model card menus Use momentum easing for sidebar transitions, switch sidebar typography to sans-serif, replace the profile native select with the DS Select, and stop clipping the Models page Use-as dropdown inside model cards. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-29 12:30:24 -07:00
Austin Pickett	10374bb7a2	fix(web): theme terminal foreground and restore backdrop plugin slot Make Nous Blue terminal text readable without the inversion layer, re-mount the backdrop plugin slot, and drop unused backdrop CSS vars from theme apply. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-29 12:30:24 -07:00
Austin Pickett	57d98ebed7	fix(web): remove marketing backdrop stack for lighter dashboard shell Drop the CSS lens overlay (blend modes, noise, inversion) and backdrop-blur from the ops dashboard so compositing no longer competes with xterm on /chat. Use flat theme backgrounds and direct Nous Blue palette colors instead of FG-inversion authoring. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-29 12:30:24 -07:00
Brooklyn Nicholson	b72c9e1b2c	feat(desktop): add pet roam opt-in toggle + i18n	2026-06-29 14:26:02 -05:00
Brooklyn Nicholson	4da744ef9b	feat(desktop): let the pet perch on the status bar and profile rail Tag both bars with data-slots; the roam loop stands on the status bar's top edge (not over it) and treats the profile rail as a climbable ledge.	2026-06-29 14:26:02 -05:00
Brooklyn Nicholson	7d3c1d55f4	feat(desktop): wire roaming into the floating pet	2026-06-29 14:26:02 -05:00
Brooklyn Nicholson	a8f1d9cc76	feat(desktop): add surface-aware pet wander loop usePetRoam re-measures ledges from the live DOM each beat and walks/hops/falls between them, driving DOM position imperatively (no per-frame re-render).	2026-06-29 14:26:02 -05:00
Brooklyn Nicholson	964ec680cc	feat(desktop): pick directional run row from travel direction roamWalkRow() prefers running-left/running-right rows, falling back to the generic running row with a mirror for pets that lack them.	2026-06-29 14:26:02 -05:00
Brooklyn Nicholson	c6d6a1c30d	feat(desktop): add pet roam + motion/direction store signals Opt-in $petRoam (localStorage), $petMotion (run/jump pose) and $petRoamDir (-1/0/1) feed the shared $petState only while the agent is at rest ($petAtRest), so a wander never overrides real activity.	2026-06-29 14:26:02 -05:00
Ben Barclay	b963d3238b	feat(gateway): suppress home-channel shutdown broadcast on flagged drains (#54824 ) Add a generic suppress_notification flag to the drain-request marker. When a drain that ends in process exit (e.g. a NAS auto-update image migration on the always-on Hermes Cloud fleet) is flagged, the gateway skips ONLY the home-channel 'gateway shutting down' broadcast — the operator-flavoured ping that would otherwise fire on every routine auto-update, dozens of times a day. The per-active-session interrupt ping is ALWAYS kept: on a drained shutdown it's empty by construction, and in the force-interrupt (deadline-exceeded) case it carries the user-valuable 'your task was cut off, message me to resume' hint. The gateway stays agnostic about WHY a drain is quiet (generic boolean, not a kind enum); the policy of which drain causes set the flag lives in the caller (NAS). Default-false so legacy/operator drains behave exactly as before. The reader reuses the NS-570 epoch-staleness check so an orphaned marker on the durable volume can never silence a fresh gateway's legitimate broadcast. - drain_control.py: write_drain_request gains suppress_notification; new drain_notification_suppressed() reader (current-epoch + truthy flag). - web_server.py: /api/gateway/drain reads + echoes the flag. - run.py: _notify_active_sessions_of_shutdown skips the home-channel loop only. Tests prove: flag round-trips; home-channel suppressed when set, kept when unset; active-session ping always fires; stale/legacy/corrupt markers never suppress.	2026-06-29 12:18:11 -07:00
brooklyn!	ccc92c5213	Merge pull request #55086 from NousResearch/fix/gateway-statusbar-tooltip fix(desktop): show Gateway statusbar tooltip via composed trigger Slots	2026-06-29 13:50:50 -05:00
Brooklyn Nicholson	7a6b3cb923	fix(desktop): show Gateway statusbar tooltip via composed trigger Slots The Gateway item is the only statusbar entry with variant === 'menu'. Since `da73223f4` wrapped every render branch in `Tip`, the menu branch nested `<DropdownMenu>` (a Radix Root that renders no DOM node) inside `Tip`'s `<TooltipTrigger asChild>`. With no element to attach to, Radix could never wire hover listeners, so the tooltip silently never showed. `Tip` also can't be moved inside `DropdownMenuTrigger asChild` (the shape proposed in #54859): it's a plain component, not a Slot-forwarding one, so the trigger's injected ref/handlers would land on `TooltipContent` instead of the button and break the menu's click + popper anchoring. Fix by composing both trigger Slots directly onto a single <button> (`TooltipTrigger asChild` over `DropdownMenuTrigger asChild`), the pattern already used in profile-switcher.tsx, and skip the tooltip wrapper entirely when the item has no title. Supersedes #54859. Co-authored-by: wnuuee1 <wnuuee1@users.noreply.github.com>	2026-06-29 13:48:56 -05:00
brooklyn!	929dd9c0d7	Merge pull request #55033 from NousResearch/bb/subagent-watch-readonly feat(desktop): read-only spectator transcript for subagent watch windows	2026-06-29 12:09:53 -05:00
Brooklyn Nicholson	7cf6758e33	feat(desktop): read-only spectator transcript for subagent watch windows Subagent session pop-outs (`watch=1`) spectate a run driven elsewhere, so editing/steering the transcript from there makes no sense. Gate the composer and the user-bubble mutations on `isWatchWindow()`: - hide the composer (folds into `showChatBar`) - user prompts become a read-only button that toggles the 2-line clamp so long prompts stay fully readable, instead of opening the edit composer - drop the stop/restore actions and the checkpoint branch-picker Keyed off the narrow `isWatchWindow()` (not `isSecondaryWindow()`), so the new-session and cmd-click pop-outs are unaffected.	2026-06-29 12:06:25 -05:00
Teknium	ee8cbfdc03	feat(web_extract): truncate-and-store instead of LLM summarization (#54843 ) * feat(web_extract): truncate-and-store instead of LLM summarization web_extract no longer runs an auxiliary LLM over scraped pages. The extract backends (Firecrawl/Tavily/Exa/Parallel) already return clean, boilerplate- stripped markdown, so we return it directly: pages within a char budget (default 15000, web.extract_char_limit) come back whole; larger pages get a head+tail window plus an explicit footer giving the stored full-text path and the read_file call to page through the omitted middle. The full clean text is written to cache/web (mounted read-only into remote backends like the other cache dirs), so nothing is lost. Inline base64 images are converted to [IMAGE: alt] placeholders (token bombs dropped) while real http(s) image URLs are preserved as links so the agent can still web_extract/vision_analyze them. Removes process_content_with_llm + the chunked summarizer + check_auxiliary_model + _resolve_web_extract_auxiliary. context_references._default_url_fetcher is updated to the truncate path and its stale data.documents shape read is fixed to results (it was silently returning empty). Live before/after eval (firecrawl, 4 URLs): 11.7x faster overall (176.6s -> 15.1s); 10-60x on large pages. Quality identical; findability 4/4 (answer recoverable from stored full text on every truncated page). web_search is unchanged. No own scraper added; no changes to web_search. * fix(web_extract): add char_limit to execute_code web_extract stub The new web_extract char_limit param must appear in the code_execution_tool _TOOL_STUBS signature (and doc line) or test_stubs_cover_all_schema_params fails — the stub schema must cover every real schema param.	2026-06-29 10:00:49 -07:00
Teknium	c6c1fd8b6b	docs: create dev venv outside the source tree (root-cause fix for #7779 ) (#54862 ) A manually-installed venv inside the cloned repo can be destroyed by the agent running a relative-path command against its own checkout (rm -rf venv, uv venv venv, etc.), silently wiping the running runtime mid-session. Moving the canonical manual-install venv to ~/.hermes/venvs/hermes-dev means no relative path from the agent's workspace resolves to its own runtime, making the bug class impossible without any command-detection code. Closes the root cause of #7779. The managed install.sh layout is unchanged.	2026-06-29 10:00:37 -07:00
brooklyn!	3bbeb9e008	Merge pull request #54907 from NousResearch/austin/feat/context-usage-popover feat(desktop): add context usage breakdown popover	2026-06-29 11:45:23 -05:00
Austin Pickett	fd324562d3	feat(desktop): add context usage breakdown popover Let users click the status bar context indicator to see how tokens are split across system prompt, tools, rules, skills, MCP, and conversation. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-29 09:18:10 -04:00
HexLab98	f1345290ed	test(auxiliary): cover NVIDIA NIM max_tokens in _build_call_kwargs	2026-06-29 18:04:39 +05:30
HexLab98	88e6f9b98c	fix(auxiliary): preserve max_tokens for NVIDIA NIM aux calls NVIDIA integrate.api.nvidia.com models such as minimaxai/minimax-m3 can return HTTP 200 with empty choices when max_tokens is omitted. Keep the output cap on auxiliary chat-completions routes, matching the main NVIDIA provider profile behavior.	2026-06-29 18:04:39 +05:30
Ben Barclay	f53ba9bb54	fix(s6): dot-prefix gateway staging dir so svscan ignores it mid-build (#54834) The register path builds each profile-gateway slot in a sibling staging dir under /run/service (the scandir s6-svscan watches), then atomically renames it to the live gateway-<profile> name. The staging dir was named gateway-<profile>.tmp — a NON-dotfile — so a concurrent `s6-svscanctl -a` rescan (fired by the cont-init reconciler registering gateway-default, or by a sibling register) would supervise the half-built slot the moment it had a valid type/run: s6-supervise spawns AS ROOT and mkdirs supervise/ root-owned 0700, then the in-flight _seed_supervise_skeleton early-returns on the now-existing supervise/ and the next `mkdir supervise/event` hits PermissionError.
That is the arm64-only CI flake on test_s6_unregister_removes_service_dir_in_live_container (PermissionError: /run/service/gateway-phase3test.tmp/supervise/event) — arm64-only because the native-arm runner's wider scheduling jitter lets the rescan land inside the ~ms seed window; amd64 ran 30/30 clean. Fix: dot-prefix the staging dir (.gateway-<profile>.tmp) in both register paths (S6ServiceManager.register_profile_gateway and container_boot._register_service). s6-svscan skips any scandir entry whose name begins with '.', so the half-built slot can never be supervised mid-build. The atomic rename to the dotless live name is unchanged. Verified on a real s6 image (amd64): a non-dotted staging dir is picked up by an svscanctl -a rescan (SUPERVISED owner=root) while a dot-prefixed one is ignored (NOT-SUPERVISED). Added a docker-harness regression test that asserts both, plus a unit test that the staging dir is dot-prefixed.	2026-06-29 21:33:00 +10:00
Teknium	dbad6d47d3	fix(gateway): also neutralize untrusted Matrix room name in prompt Widen #5961's _format_untrusted_prompt_value coverage to the Matrix room display name (Matrix Room:), a sibling attacker-controllable field the original fix missed. chat_name is user-settable, so an injected room name could render as literal markdown in the system prompt. Adds a regression test.	2026-06-29 04:25:51 -07:00
Xowiek	09666ceb76	fix(gateway): neutralize untrusted session metadata in prompts	2026-06-29 04:25:51 -07:00
teknium1	ea1372d2af	fix(security): wire session-id sanitizer into artifact paths + API boundary Defense-in-depth on top of _safe_session_filename_component (#5958): Sink (makes the bad write impossible regardless of entry point): - run_agent._save_session_log: sanitize session_id before building the session_{sid}.json snapshot path. - agent_runtime_helpers.dump_api_request_debug: sanitize before building the request_dump_{sid}_{ts}.json path. Boundary (clean 400 instead of a silently-hashed filename): - api_server rejects path-traversal-shaped X-Hermes-Session-Id on the session-continuation path and the explicit /api/sessions create path, reusing gateway.session._is_path_unsafe (mirrors the native gateway's entry-boundary guard). Also enforces the session-header length cap on the continuation path. Tests: traversal session_id stays contained at the write site; sanitizer always yields a traversal-free segment; the API header rejects ../, absolute, and Windows-traversal IDs with 400.	2026-06-29 04:25:45 -07:00
Xowiek	1debd5e8f9	fix(security): add session-id filename sanitizer to prevent path traversal Session IDs can originate from untrusted input (e.g. the X-Hermes-Session-Id API header) and are interpolated raw into on-disk artifact filenames under ~/.hermes/sessions/. A traversal-shaped ID (../../../../etc/pwned) would let a caller write the session snapshot or request dump outside the sessions directory. _safe_session_filename_component() collapses every non [A-Za-z0-9_-] character to _, caps the length, and appends a short content hash when sanitization changed the string, always yielding a single traversal-free path segment. Closes #5958.	2026-06-29 04:25:45 -07:00
teknium1	cdd8e0a271	test(gateway): exercise last_prompt_tokens in reset-activity tests The reset-had-activity tests set total_tokens (dead state) to simulate activity; production records activity via last_prompt_tokens. Update the fixtures to match the field the fix and runtime actually use.	2026-06-29 04:25:37 -07:00
Mibayy	0fe9755016	fix(gateway): use last_prompt_tokens for session-reset activity check reset_had_activity gated on entry.total_tokens, which is never written (token counts migrated to agent-direct persistence) so it was always 0. That suppressed session-reset notifications for sessions that genuinely had activity. Switch to last_prompt_tokens, which is updated on every turn.	2026-06-29 04:25:37 -07:00
Mibayy	9e490138a0	fix(security): fail-closed feishu webhook rate limiter + whatsapp bridge path guard Salvages the two still-valid hardenings from #5381 onto the relocated plugin adapters (the discord/feishu/whatsapp adapters moved to plugins/platforms/ since the PR was opened, and 4 of its 6 hunks are already on main or superseded). - feishu: rate limiter now denies untracked keys when the tracking table is at capacity after pruning stale entries (was: allow through without tracking). At-capacity-with-all-fresh-entries only happens under abuse, so allowing untracked requests let an attacker who flooded the table bypass the limiter entirely. Already-tracked keys and post-prune room are unaffected. - whatsapp: absolute file paths handed back by the Baileys bridge are now validated to resolve inside a known media cache dir before being attached. A compromised/buggy bridge could otherwise return an arbitrary path (e.g. /etc/passwd) that would be sent verbatim to the model. Guard resolves symlinks and accepts both the canonical cache/<kind> and legacy <kind>_cache layouts.	2026-06-29 04:25:31 -07:00
Ruzzgar	576424cc1c	fix(security): redact browser CDP endpoint logs	2026-06-29 04:25:26 -07:00
teknium1	23c03ced75	fix(session-db): enrich NULL session metadata via upsert instead of INSERT OR IGNORE The gateway's get_or_create_session() creates a bare session row (source + user_id) before the agent exists. The agent's later create_session() carries the real model/model_config/system_prompt, but _insert_session_row used INSERT OR IGNORE — silently dropping that enrichment. Gateway sessions were left with NULL model and NULL billing metadata. Switch to INSERT ... ON CONFLICT(id) DO UPDATE with COALESCE so NULL columns get backfilled while values an earlier writer already set are never overwritten (a later bare write with source='unknown' can't clobber a real source/model). Credit: original report and fix direction by @LucidPaths (#5048).	2026-06-29 04:25:23 -07:00
teknium1	61f56d27db	refactor(dashboard-auth): drop redundant _interactive_providers helper list_session_providers() already filters on supports_session=True, so the new helper re-filtered an already-filtered list. Call it directly at the single auto-SSO call site.	2026-06-29 04:25:18 -07:00
Ben	f5ecbe1ec6	feat(dashboard): auto-initiate portal SSO redirect on unauthenticated load When the dashboard gateway has no local session cookie, it rendered a click-through /login interstitial — even though the Nous portal's /oauth/authorize auto-approves any current member of the dashboard's org and is a silent 302 when the user already holds a portal session. For the common case (clicking a hosted-agent dashboard link while signed in to the portal) that interstitial click is pure friction. This makes the gate auto-initiate the OAuth redirect on an unauthenticated HTML document load instead of rendering the interstitial, when exactly one interactive provider is registered. A one-shot loop-guard cookie (hermes_sso_attempt, 60s TTL) ensures that a genuinely absent portal session (the portal bounces back still-unauthenticated) falls back to the /login page after exactly one bounce rather than ping-ponging forever. The marker is cleared on a successful callback and whenever the gate falls back to /login. Security: this removes a human CLICK, not a security check. The redirect lands on the existing /auth/login route and runs the unchanged PKCE auth-code flow; token verification, audience checks, redirect-URI match, and org-membership checks are all untouched. /api/* fetches still get the 401 JSON envelope (never a 302 a fetch() would follow opaquely), and with two or more providers the /login chooser still renders. Phase 1 of the cloud-auto-discovery work.	2026-06-29 04:25:18 -07:00
Teknium	dc5ef20d89	test(reasoning-floor): isolate stale-timeout floor tests from config-module reload races (#54775 ) The five _resolved_api_call_stale_timeout_base integration tests reloaded hermes_cli.config + hermes_cli.timeouts via importlib.reload to clear cached config. Under xdist that mutates module-global state shared across the worker process, so a sibling test could leave the config cache in a state that made get_provider_stale_timeout return a leaked value — intermittently failing test_reasoning_floor_applies_to_opus_4_thinking (shard 6 flake, #52217 area). Patch run_agent.get_provider_stale_timeout per-test instead: floor-path tests get None (resolver falls through to the reasoning floor / env var / default), the explicit-config test gets 60.0 (priority-1 short-circuit). Same assertions, no shared-module mutation, deterministic under parallel execution.	2026-06-29 02:42:54 -07:00
sgaofen	194bff0687	fix(gateway): confirm final delivery before suppressing send Fixes #14238. During a compression/session split at the response boundary, the interim callback delivered unrelated commentary, setting response_previewed=True. The suppression logic treated that as proof the final reply had been delivered and skipped the normal send — the response was persisted to the child session but never sent to chat. Only suppress the normal final send when the stream consumer confirms final delivery (final_response_sent / final_content_delivered) or the exact final response text was delivered as a preview.	2026-06-29 02:37:11 -07:00
teknium1	fa3dba4b30	docs(infographic): add list_profiles perf-fix infographic	2026-06-29 02:35:57 -07:00
teknium1	10c9eafde2	chore(attribution): map mango001@126.com -> max-chen for salvaged #51194	2026-06-29 02:35:57 -07:00
Sahil-SS9	1bb7b59c5d	fix: offload blocking profiles endpoints from asyncio event loop (#54523) (cherry picked from commit `09f10e2b77`)	2026-06-29 02:35:57 -07:00
chenxiang	d5eee133eb	perf(profiles): fix list_profiles O(N*M) wrapper rescan (6.4s -> 0.4s) find_alias_for_profile re-scanned the whole wrapper dir (~/.local/bin) and read_text every file for EACH profile — including large unrelated binaries (ffmpeg etc.) read 15x over. With 16 profiles this took ~6.4s, long enough that the desktop's per-request backend calls timed out (15s) and the sidebar rendered '全部智能体 0 / 会话 0' ("All agents 0 / Sessions 0"). - Add build_alias_map(): single-pass {profile -> alias} reverse map, reads only an 8KB head slice per wrapper, skips binaries via UnicodeDecodeError. - find_alias_for_profile now delegates to it (behavior preserved). - Cache _count_skills by skills-dir mtime signature (+30s TTL). list_profiles: 6.37s -> 0.84s cold / 0.44s warm. 138 profile tests pass. (cherry picked from commit `89e593749a`)	2026-06-29 02:35:57 -07:00
teknium1	2f5950a83a	chore(release): add telos-oc to AUTHOR_MAP for PR #14353 salvage	2026-06-29 02:25:48 -07:00
Telos	fa11b11cf5	fix: propagate key_env from custom_providers into ProviderDef resolve_custom_provider() previously returned api_key_env_vars=() for every custom provider entry, silently dropping the configured key_env field. This caused 401 errors for any custom provider that required an API key via environment variable (e.g. Xiaomi MiMo Token Plan, self-hosted OpenAI-compatible servers). The key_env field is already documented in _VALID_CUSTOM_PROVIDER_FIELDS and normalized by normalize_custom_provider_entry(), so this was just an oversight in the ProviderDef construction. Also adds a regression test that verifies key_env is properly propagated into the resolved ProviderDef.	2026-06-29 02:25:48 -07:00
teknium1	9f97915163	fix(browser): route open-timeout base through _safe_command_timeout Wire the salvaged _safe_command_timeout() guard into the surviving open-timeout call site. _get_open_command_timeout() feeds the browser_navigate 'open' path; this closes the last call site that could observe a None timeout from a torn cache (#14331), since the original PR's max(_get_command_timeout(), 60) site no longer exists on main (now routed through _get_open_command_timeout).	2026-06-29 02:24:57 -07:00
Sanjay Santhanam	c79e6bceae	fix(browser_tool): resolve race in _get_command_timeout cache returning None (#14331) # Conflicts: # tools/browser_tool.py	2026-06-29 02:24:57 -07:00
Teknium	bf0d8fed8e	fix(config): v32 migration flips baked-in verify_on_stop=true to false (#54740 ) The first ship of verify-on-stop (config v30) defaulted DEFAULT_CONFIG agent.verify_on_stop to a literal True, and migrate_config persists defaults with strip_defaults=False — so every install that updated through v30 had verify_on_stop: true written into config.yaml as a literal. The v30->v31 migration only flipped missing/'auto' values to false and deliberately preserved an explicit bool, so it skipped that entire population and left verify-on-stop ON for everyone who had updated. A literal true was never a user choice: the feature had no off-switch worth setting it against until v31 introduced one, so a true persisted before v32 is always the old machine default. v32 migration flips a literal true -> false once, for both v30 (skipped v31) and v31 (preserved-by-bug) installs. A true the user sets AFTER v32 is a deliberate opt-in and is never touched.	2026-06-29 01:51:08 -07:00
teknium1	75317d82d0	fix(vision): narrow the fan-out cap to the CPU encode burst only The original cap held a process-global slot across the WHOLE vision analysis (image load + encode + LLM call) with a default of min(CPUs, 4). That serialized legitimate multi-image workflows — "compare these 6 screenshots", "read this 10-page scan", "analyze every frame" — behind a 4-wide gate, and on the native fast path it even throttled calls that make no LLM request at all. Excess calls queued (blocking acquire, nothing dropped), but the latency hit on real fan-out was the wrong tradeoff. The incident was CPU exhaustion, not call count: concurrent base64/resize bursts saturated every core and left none to service the shared event loop serving /api/status. So cap ONLY that: - A dedicated, bounded ThreadPoolExecutor (_vision_cpu_executor) runs the encode/resize/dimension-check off the caller's loop, sized to the host's usable core count with NO fixed ceiling — the cap tracks the actual exhausted resource (cores), not a magic number. Excess encodes queue on the executor; cores stay free for the loop. - The LLM call is deliberately OUTSIDE the executor, so multi-image workflows keep full request concurrency. - Override via auxiliary.vision.max_concurrency / HERMES_VISION_MAX_CONCURRENCY (honored verbatim, including above core count); sub-1 ignored. - _vision_concurrency_slot() is now a no-op shim for back-compat. Tests assert: resolver defaults to host cores with no ceiling; env/config override (incl. above cores); sub-1 rejection; the executor is dedicated and core-sized; encode runs on a vision-encode thread; and crucially that encode bursts are bounded to the cap while the analyses themselves stay fully concurrent (calls_peak > cap).	2026-06-29 01:27:10 -07:00
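
The shape described in commit 75317d82d0 (cap only the CPU-bound encode step, leave the model call uncapped) can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the commit: every name here (resolve_encode_workers, encode_image, fake_llm_call, analyze, analyze_many) is hypothetical, os.cpu_count() stands in for the "usable core count", and the LLM request is replaced by a short sleep.

```python
import asyncio
import base64
import os
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def resolve_encode_workers(override: Optional[int] = None) -> int:
    """Hypothetical resolver: honor an override >= 1 verbatim, else use host cores."""
    if override is not None and override >= 1:
        return override
    # Assumption: os.cpu_count() approximates the usable core count.
    return os.cpu_count() or 1


# Dedicated, bounded pool used only for the CPU-heavy encode step.
_encode_executor = ThreadPoolExecutor(
    max_workers=resolve_encode_workers(),
    thread_name_prefix="vision-encode",
)


def encode_image(raw: bytes) -> str:
    """Stand-in for the resize + base64 work that saturates cores."""
    return base64.b64encode(raw).decode("ascii")


async def fake_llm_call(payload: str) -> str:
    """Placeholder for the network-bound model request; deliberately not capped."""
    await asyncio.sleep(0.01)
    return f"analyzed {len(payload)} chars"


async def analyze(raw: bytes) -> str:
    loop = asyncio.get_running_loop()
    # Excess encodes queue on the bounded executor instead of blocking the loop.
    payload = await loop.run_in_executor(_encode_executor, encode_image, raw)
    # The model call runs outside the executor, so requests stay concurrent.
    return await fake_llm_call(payload)


async def analyze_many(images: List[bytes]) -> List[str]:
    # gather() preserves input order in its results.
    return await asyncio.gather(*(analyze(img) for img in images))
```

Calling asyncio.run(analyze_many([...])) with several images queues the encodes on the pool while the placeholder model calls overlap freely, which is the tradeoff the commit message argues for: the bound follows the exhausted resource (cores) rather than the number of analyses in flight.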


13550 commits