Commit graph

13603 commits

Author SHA1 Message Date
brooklyn!
b69c2d2fcd
Merge pull request #55456 from NousResearch/bb/desktop-split-controller
refactor(desktop): thin desktop-controller by extracting session-list actions
2026-06-30 02:19:03 -05:00
brooklyn!
116acf3821
Merge pull request #55455 from NousResearch/bb/desktop-split-composer
refactor(desktop): extract composer pure helpers into composer-utils
2026-06-30 02:18:41 -05:00
brooklyn!
61211967e1
Merge pull request #55453 from NousResearch/bb/desktop-split-sidebar
refactor(desktop): split sidebar/index.tsx god file into focused modules
2026-06-30 02:18:32 -05:00
brooklyn!
374d38f09d
Merge pull request #55451 from NousResearch/bb/desktop-split-thread
refactor(desktop): split thread.tsx god file into focused modules
2026-06-30 02:17:56 -05:00
brooklyn!
ddf0d980b6
Merge pull request #55473 from NousResearch/bb/cmdk-drag-region
fix(desktop): make ⌘K / session-switcher HUDs ignore the titlebar drag band
2026-06-30 02:11:03 -05:00
Brooklyn Nicholson
03311abe49 fix(desktop): make ⌘K / session-switcher HUDs ignore titlebar drag band
The top-center floating HUDs (command palette + session switcher) pin at
top-3, overlapping the titlebar's `[-webkit-app-region:drag]` bands. Drag
regions win hit-testing over the DOM regardless of z-index, so the top of
each surface — the search input — swallowed clicks, leaving only a ~2px
strip focusable. Add `[-webkit-app-region:no-drag]` to the shared
HUD_SURFACE so the whole surface is interactive.
2026-06-30 02:06:30 -05:00
brooklyn!
d6396e6a41
Merge pull request #55449 from NousResearch/bb/verify-on-stop-auto-default
feat(agent): restore surface-aware "auto" default for verify_on_stop
2026-06-30 01:47:59 -05:00
Brooklyn Nicholson
4dbd869ab3 feat(agent): restore surface-aware "auto" default for verify_on_stop
#53552 flipped verify_on_stop to default OFF because the guard fired on
doc/markdown/skill edits and felt like noise. That doc/markdown/skill
suppression already shipped in the same change (_filter_verifiable_paths in
agent/verification_stop.py), so the original noise rationale no longer holds:
the guard already skips prose-only turns.

Restore the surface-aware "auto" default — ON for interactive coding surfaces
(CLI, TUI, desktop) and programmatic callers, OFF for conversational messaging
surfaces (Telegram, Discord, etc.) where the verification narrative would reach
a human as chat noise. The missing/unrecognized fallback in
verify_on_stop_enabled now resolves to the same surface-aware default instead of
hard OFF, so both the DEFAULT_CONFIG value and the resolver agree.

Scope: this changes the shipped default for fresh installs and configs without
an explicit verify_on_stop key. Existing configs that #53552/#54740 migrated to
an explicit `false` are respected and unchanged — this PR does not add a
force-migration of those values back to auto.
2026-06-30 01:43:08 -05:00
brooklyn!
aa3c1d6679
Merge pull request #55457 from NousResearch/bb/fix-windows-subprocess-test-flake
test: fix flaky windows no-window-flag tests vs update-check daemon
2026-06-30 01:42:48 -05:00
Brooklyn Nicholson
0ea318a7d4 test: make windows no-window-flag assertions immune to update-check daemon
These tests patch `<module>.subprocess.run`, which is the shared `subprocess`
module singleton, so the patch is process-wide. Importing `tui_gateway.server`
runs `prefetch_update_check()` at import time, spawning an unnamed daemon thread
(`Thread-N (_run)`) that shells out to `git ... origin` (`text=True, timeout=5`).
That call races the test and lands in the captured list, intermittently failing
`test_tui_gateway_fuzzy_file_listing_hides_git_windows` with either
`KeyError: 'creationflags'` (the daemon's git call has no creationflags) or a
call-count mismatch (3 git calls captured, not 2). It only reproduced under the
parallel test harness because of the extra concurrency/timing.

Filter captured calls to the distinctive argv tokens of the call under test
(`--show-toplevel`, `ls-files`, `branch --show-current`, `diff`, `rg`,
`taskkill`) and read `creationflags` via `.get`, mirroring the existing
hardening on `test_gateway_pid_scan_hides_wmic_and_powershell_windows`. The
production code is unchanged; this is a test-isolation fix.
2026-06-30 01:35:55 -05:00
Brooklyn Nicholson
25c7900fb5 refactor(desktop): thin desktop-controller by extracting session-list actions
DesktopController is a route root that had grown a controller's worth of
session-list plumbing inline. Extract the cohesive fetch/paging cluster into
a focused hook and a tested pure helper, per AGENTS.md's "keep route roots
thin" guidance:

- use-session-list-actions.ts: refreshSessions / loadMoreSessions /
  loadMoreSessionsForProfile / loadMoreMessagingForPlatform / refreshCronJobs
  (plus the private cron/messaging refreshers, sessionsToKeep, and the
  excluded-source constants)
- desktop-controller-utils.ts: pure sameCronSignature helper (+ unit tests)

Pure restructuring, no behavior change. desktop-controller.tsx: 1,441 -> 1,233.
2026-06-30 01:34:12 -05:00
Brooklyn Nicholson
dd659c8d17 refactor(desktop): extract composer pure helpers into composer-utils
Pull ChatBar's module-level pure helpers, constants, and the QueueEditState
type out of the 2.3k-line composer/index.tsx into a focused, testable
composer-utils.ts sibling:

- constants: COMPOSER_STACK_BREAKPOINT_PX, COMPOSER_SINGLE_LINE_MAX_PX,
  COMPOSER_FADE_BACKGROUND, DRAFT_PERSIST_DEBOUNCE_MS
- helpers: pickPlaceholder, COMPLETION_ACTIONS, slashChipKindForItem,
  slashArgStage, slashCommandToken, cloneAttachments
- type: QueueEditState

Pure restructuring, no behavior change; adds unit tests for the slash helpers.
(The ChatBar component itself is a single tightly-coupled megacomponent; a
deeper hook-based decomposition is left for a dedicated follow-up.)
2026-06-30 01:30:52 -05:00
Brooklyn Nicholson
88e29e35bd refactor(desktop): split sidebar/index.tsx god file into focused modules
Behavior-preserving extraction of the 1,963-line ChatSidebar file into the
existing sidebar/ sibling-module convention:

- order.ts: add pure orderByIds / reconcileOrderIds / sameIds helpers (+ tests)
- reorderable-list.tsx: the generic ReorderableList + useSortableBindings DnD
  primitive
- section-states.tsx: SidebarSessionSkeletons / SidebarBlankState /
  SidebarPinnedEmptyState
- sessions-section.tsx: SidebarSectionHeader + the large SidebarSessionsSection
  renderer + its sortable row wrappers

index.tsx now holds only the ChatSidebar component (1,963 -> 1,416 lines).
2026-06-30 01:26:37 -05:00
Brooklyn Nicholson
7ff6908a59 refactor(desktop): split thread.tsx god file into focused modules
Behavior-preserving extraction of the 1,942-line thread.tsx transcript
renderer into co-located sibling modules, matching the existing flat
assistant-ui/ convention:

- thread-content.ts / thread-timestamp.ts: pure helpers (+ unit tests)
- thread-types.ts: shared RestoreMessageTarget
- thread-status.tsx: loading / stall / background-resume indicators
- thread-message-parts.tsx: reasoning + tool part components
- assistant-message.tsx, system-message.tsx, user-message.tsx,
  user-edit-composer.tsx: the message renderers

thread.tsx now holds only the Thread route component (1,942 -> 119 lines).
Also drops a dead readAloudAudio module variable (no references).
2026-06-30 01:21:55 -05:00
brooklyn!
a81c5922a2
Merge pull request #55413 from NousResearch/bb/pre-stop-hook
feat(agent): add pre_verify hook and coding guidance config
2026-06-30 01:10:08 -05:00
Brooklyn Nicholson
821d9f709f feat(agent): add configurable coding_instructions
agent.coding_instructions (a string or list) is appended to the coding brief as
its own stable system block, so users can pin project-wide workflow rules
without editing the shipped brief. Coding-posture only and cache-safe (resolved
once per session; takes effect next session). Empty by default.
2026-06-30 00:59:59 -05:00
Brooklyn Nicholson
a10113658b feat(agent): add pre_verify hook and verify-on-stop coding guidance
Add a `pre_verify` user/plugin/shell hook fired once per turn when the agent
edited code and is about to finish, after the existing verify-on-stop guard. A
hook can keep the agent going one more turn (run a check, defer it, tidy the
diff) by returning {"action":"continue","message":...} (the Claude-Code Stop
shape {"decision":"block","reason":...} is accepted too). Hooks receive coding,
attempt, final_response, and sorted changed_paths so they can self-scope and
self-throttle; the path is bounded by agent.max_verify_nudges and preserves
message-role alternation.

Hermes still ships its default coding guidance (agent.verify_guidance, on by
default), but it now rides the evidence-based verify-on-stop missing-evidence
nudge instead of a separate default pre_verify continuation, so it costs no
extra model turn of its own. Guidance reuses the shared utils.is_truthy_value
parser rather than a local copy.
2026-06-30 00:59:29 -05:00
beardthelion
14c4a849b7 fix(kanban): make goal_mode judge gate truly fail-open
Follow-up to the judge gate. judge_goal() is fail-open at the source:
when no auxiliary model is reachable it returns a "continue" verdict
that is indistinguishable from a real "not done yet" judgment. The gate
treated any non-"done" verdict as a rejection, so an unconfigured or
degraded auxiliary model would wedge every goal_mode worker — it could
never close its own task. That contradicted the gate's own "fail-open"
comment.

Probe judge availability before enforcing (the same auxiliary client
lookup judge_goal performs) and only gate when a judge is actually
reachable. When none is, completion proceeds.

Also fix the rejection guidance: kanban_create takes parents=[...], not
parent=.

Add test_complete_goal_mode_allows_when_judge_unavailable covering the
fail-open path; update the rejection test to force the availability probe.
2026-06-29 22:20:19 -07:00
beardthelion
b3c1b3b3f3 fix(kanban): address review feedback on goal_mode judge gate
Apply naqerl's review comments on PR #38388:

- Hoist `from hermes_cli.goals import judge_goal` to module-level
  imports so an import failure surfaces at module init, not lazily
  on the first goal-mode completion (no circular import: hermes_cli
  package init is trivial and does not load tools.kanban_tools).
- Narrow the fail-open `try` to wrap only the judge_goal() call.
  The verdict check and its rejection `return tool_error(...)` now
  live outside the handler, so a failure there can no longer be
  swallowed by the broad except.
- Pass `exc_info=True` to the logger.warning call per CONTRIBUTING.md.

Update the test mock target to tools.kanban_tools.judge_goal, since
the hoisted import rebinds the name into this module's namespace.
2026-06-29 22:20:19 -07:00
beardthelion
0b33bc5396 fix(kanban): gate goal_mode task completion with auxiliary judge
Prevents workers in goal_mode from bypassing the auxiliary judge by
calling kanban_complete before acceptance criteria are met. The tool
handler now synchronously invokes the goal judge against the task's
title/body and the completion summary. If the verdict is not "done",
the completion is rejected with actionable guidance for the agent.

This keeps kanban_db.py as a pure SQLite wrapper while intercepting
the bypass exactly at the agent tool-call boundary, aligning with
Hermes separation of concerns.

Fixes #38367

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-29 22:20:19 -07:00
brooklyn!
972b162090
Merge pull request #55410 from NousResearch/bb/install-theme-dedupe
feat(desktop): flag already-installed themes in the install pickers
2026-06-29 23:34:10 -05:00
Brooklyn Nicholson
04639adace feat(desktop): flag already-installed themes in the install pickers
The Cmd-K "Install theme…" palette listed Marketplace themes with no hint
that you already had them, and clicking one re-downloaded + re-installed a
theme you owned. The Appearance settings grid already detected this, but by
parsing theme descriptions inline on every render — plumbing that never made
it to the palette.

Lift it into one reactive source and reuse it everywhere:
- $marketplaceInstalls (computed over $userThemes): extensionId -> installed
  theme, derived once via marketplaceIdOf and memoized, instead of rebuilding
  a Set per render.
- Both install surfaces now mark owned rows installed and, on click,
  re-activate the installed theme rather than re-fetching it.
- Drops the duplicated description-parsing in settings and the per-session
  "installed here" state in both surfaces (the store is the source of truth,
  so previously-installed themes show correctly too).
2026-06-29 23:29:20 -05:00
Ben Barclay
05ac16778b feat(gateway): per-platform typing_indicator toggle
Add a generic per-platform PlatformConfig.typing_indicator flag (default
True) that gates the _keep_typing refresh loop in
_process_message_background. When false, the loop is never spawned, so no
typing/"is thinking…" status is shown on that platform — message delivery
is otherwise unchanged.

Mirrors the gateway_restart_notification contract exactly: dataclass field
+ to_dict/from_dict (with extra-fallback resolution) + shared-key bridge in
load_gateway_config, so 'slack: typing_indicator: false' under platforms
works without a separate block. Generic by design — the same key works for
every platform (Slack 'is thinking…', Telegram/Discord/Signal typing).

Motivated by users who find Slack's assistant 'is thinking…' status noisy
(it also briefly disables the compose box, via the Assistant API).
2026-06-29 21:12:57 -07:00
teknium1
463b1dfa9c fix(container-boot): also autostart a gateway stranded in 'degraded'
degraded is the same wedge class as draining: the gateway came up with
some platforms queued for retry, fell through to the running state
(gateway/run.py #5196), and is serving. A hard-kill there strands
gateway_state=degraded, which (like draining) is not in _AUTOSTART_STATES
and is not an operator stop or a failed boot — so it would stay DOWN
forever on every recreate. Add degraded to _TRANSIENT_RUNNING_STATES so
the fallback path normalises it to running-intent too.
2026-06-29 21:12:36 -07:00
Ben
d3f2931b8c fix(container-boot): autostart a gateway stranded in 'draining' state
A gateway hard-killed while draining (a container/VM recreate SIGTERMs it
before _stop_impl reaches its terminal-state persist) leaves
gateway_state.json frozen at 'draining'. With no explicit desired_state to
fall back to, container_boot read that transient value literally, found it
not in _AUTOSTART_STATES, and left the gateway DOWN on every subsequent
boot — dashboard up, messaging silently dark. Observed on a relay-opted-in
staging instance (2026-06): the s6 gateway-default slot kept its 'down'
marker across recreates and the gateway never came back.

'draining' is a transient sub-state of RUNNING (written by the drain
watcher / scale-to-zero go-dormant path), never an operator stop and never
a failed boot. Normalise it to 'running' in the gateway_state fallback so a
stranded drain marker reads as the run-intent it represents. This extends
gateway/run.py's #42675 handling (persist 'running' on an unexpected signal)
to the case where the gateway died before persisting anything at all.

'starting'/'startup_failed' are deliberately NOT normalised — those mean a
mid-boot death and must stay down to avoid the crash-loop the down-marker
guard prevents. An explicit desired_state still wins verbatim, so an
operator stop survives a transient 'draining' runtime value.

Tests: draining named-profile + default-root autostart (both fail without
the fix), plus a guard that an explicit desired_state=stopped still blocks a
draining runtime.
2026-06-29 21:12:36 -07:00
Teknium
d4c14011eb
feat(claude-design): add surface-first conditioning + slop diagnostic (#55399)
Port the two genuinely-novel ideas from Command Code's /design skill into
our existing claude-design skill (skill-only, zero model-tool footprint):

- Surface-First: commit to one of 7 surface archetypes (Monitor/Operate/
  Compare/Configure/Decide/Explore/Command) before any visual tokens. Most
  AI design slop is compositional, not cosmetic — conditioning generation on
  a surface choice collapses entropy the way a CoT step does. Workflow step 3.
- Slop Diagnostic: the ~10 tells that account for ~90% of the 'this is AI'
  signal, as a score-out-of-10 self-audit. Diagnose-then-treat: the report is
  context not a to-do list; repair only what fired, matched to the tell
  (re-layout vs recolor vs de-decorate). Workflow step 7 (Verify).

Did NOT clone /design's 16-mode CLI, proprietary reference corpus, or make it
a core tool. Docs page regenerated via generate-skill-docs.py.
2026-06-29 21:12:29 -07:00
teknium1
5a3d7fb99d fix(xai): suppress false-positive windows-footgun on binary image read
open(..., "rb") is binary mode and needs no encoding=; the checker's
regex doesn't recognize the mode. Add the documented suppression comment.
2026-06-29 21:11:58 -07:00
Jaaneek
9ce79cd642 feat(xai): Imagine public-URL storage, chaining & video edit/extend
Add durable public-URL output and URL-based chaining to xAI Grok Imagine:

- Store generated media on files-cdn with permanent public HTTPS URLs
  (public_url: true, no expiry by default).
- Chain by URL: generate -> edit -> extend each take a prior result's
  public HTTPS URL (or a data URI / local file for inputs).
- Add provider-specific xai_video_edit and xai_video_extend tools.
- Image generation: public-URL/storage output, multi-reference edits,
  and ~/ local-path support for image edits.

Credentials use xAI Grok device-code OAuth (separate PR).
2026-06-29 21:11:58 -07:00
Ben
184c10cf97 fix(slack): warn when configured token is a user token, not a bot token
A Slack user/legacy token (xoxp-...) makes auth.test resolve to the
installing human's member ID with no bot_id, so the adapter binds its
identity (_bot_user_id / _team_bot_user_ids) to that human. Every
"is this the bot?" check then misfires: that person's <@...> mentions
wake the bot and are stripped as the bot's own mention, so the agent is
genuinely told it was @mentioned and replies to messages merely
addressed to that human (symptom: bot responds to "@trevor ..." and
insists it was explicitly mentioned).

There is no runtime API error to catch — a user token still
sends/receives — so the only detectable moment is connect time. Add a
warning-only nudge (_warn_if_not_bot_token) alongside the existing
group-DM scope nudge: when auth.test resolves a user_id but no bot_id,
log that the token is a user token and to use the xoxb-... Bot User
OAuth Token. Warning-only: does not block a working-but-misconfigured
install. Fires once per workspace per process.
2026-06-29 20:57:43 -07:00
brooklyn!
33d044c3af
Merge pull request #55400 from NousResearch/bb/pet-roam-calmer
feat(desktop): calmer, more realistic pet roam + split roam modules
2026-06-29 22:55:30 -05:00
Brooklyn Nicholson
30cd39dc56 refactor(desktop): collapse stroll-direction coin to a single draw
DRY: the roomier-side bias computed its probability two ways
(STROLL_TOWARD_ROOM and 1 - STROLL_TOWARD_ROOM). One draw XNOR'd against
the roomier side says the same thing more plainly.
2026-06-29 22:53:38 -05:00
Brooklyn Nicholson
b7322f946d feat(desktop): calmer, more realistic pet roam + split roam modules
The floating pet wandered almost constantly: every idle beat picked a new
walk and hops fired ~45% of the time, so it read as nervous rather than
alive. Make movement the exception, not the default, and split the
overgrown roam hook into focused modules.

Behavior (per ambient game-AI: GameAIPro ch.36 + idle/wander state
machines):
- Loaf, don't pace: most decision beats just keep resting (REST_CHANCE
  0.62) instead of always re-walking.
- Memoryless dwell: pauses now draw from an exponential distribution
  (mostly short rests, the occasional long loaf) instead of a uniform
  1.8-5.2s window, so the cadence never reads as a metronome.
- Hops dialed back 0.45 -> 0.2 (the jumpiest, noisiest motion).

Structure (no god-file; a hook should own one narrow job):
- roam-behavior.ts - what to do & when (dwellMs, chooseMove,
  pickStrollTarget) + tuning. Pure, rng-injectable.
- roam-geometry.ts - where it can stand (snapshotLedges, overlayLedge,
  resolveLedge, overlapsX, groundTop). DOM measurement + pure ledge math.
- use-pet-roam.ts - the physics/RAF loop only.

Tests: deterministic, rng-seeded unit coverage for the decision + geometry
helpers (behavior contracts, not snapshots).
2026-06-29 22:51:09 -05:00
Teknium
6aefc9d925
feat(gateway): show per-category context breakdown in /usage (#55204)
Channel users get the same context split the desktop popover shows
(PR #54907) — system prompt, tools, rules, skills, MCP, subagents,
memory, conversation — under the existing Context line in /usage.

Reuses agent.context_breakdown.compute_session_context_breakdown, so
there is no new tool and no new engine. The slices are estimates
(chars/4) and the block is labelled _(estimated)_; the headline
Context line keeps using the provider-measured last_prompt_tokens.
Rendering is fail-open: any engine error returns no breakdown and the
rest of /usage is unaffected.

- gateway/slash_commands.py: _context_breakdown_lines() helper + wire
  into _handle_usage_command
- locales/*.yaml: breakdown_header, breakdown_line, and 8 category
  labels across all 16 locales (parity gate)
- tests/gateway/test_usage_command.py: render + fail-open coverage
2026-06-29 20:42:19 -07:00
Ben Barclay
53a75f147f
feat(dashboard_auth): support confidential clients (client_secret) in self-hosted OIDC (#55344)
The self-hosted OIDC dashboard provider was public-client + PKCE only, with
two `# TODO(confidential-client)` seams. Authentik and Keycloak commonly
default a new OIDC client to *confidential*, whose token endpoint rejects an
unauthenticated exchange (`invalid_client`) — so a self-hoster who accepts
their IDP's default could not complete dashboard login without manually
flipping the client to public.

Add optional confidential-client support:

- New optional `client_secret` (env `HERMES_DASHBOARD_OIDC_CLIENT_SECRET`,
  or `dashboard.oauth.self_hosted.client_secret`; env-wins-config, empty
  treated as unset). It is a credential, so docs steer operators to the
  `.env` file; config.yaml is supported only for precedence symmetry.
- `_token_endpoint_auth()` selects `client_secret_basic` (HTTP Basic header)
  vs `client_secret_post` (form body) from the IDP's advertised
  `token_endpoint_auth_methods_supported`, defaulting to basic (the OIDC
  default) when absent. Applied to complete_login, refresh_session, and
  revoke_session (RFC 7009 §2.1).
- PKCE is sent in BOTH modes — the secret is client authentication layered
  on top, never a replacement (OAuth 2.1 / RFC 9700 keep PKCE mandatory).
- Basic header url-encodes client_id/secret before base64 per RFC 6749
  §2.3.1, so reserved chars (`:`, `@`, space) round-trip correctly.

Non-breaking: with no secret configured the provider is a pure public PKCE
client, byte-identical to prior behaviour (no Authorization header, no
client_secret in the body). The secret is never logged — register() reports
only a `confidential=<bool>` flag.

Tests: 16 new cases covering basic/post selection, default-when-absent,
public-unchanged contract, PKCE-preserved, reserved-char url-encoding,
blank-secret-is-public, refresh + revoke auth, no-secret-in-logs, and
env/config register wiring. Full dashboard-auth suite (nous provider,
middleware, gate, cookies, WS, 401-reauth, status endpoint) — 396 tests —
green, proving no existing auth path regressed.
2026-06-30 13:32:51 +10:00
Teknium
481caa66f2
feat(display): friendly human-phrased tool labels for built-in tools (#55166)
* feat(display): friendly human-phrased tool labels for built-in tools

Built-in tools now render ChatGPT-style status verbs ('Searching the web
for ...', 'Reading <file>', 'Browsing <url>') on the CLI spinner and
gateway/desktop tool-progress instead of the raw tool name.

- agent/display.py: _TOOL_VERBS map + build_tool_label() + set/get
  friendly-labels flag (default on). Custom/plugin/MCP tools fall back to
  the raw preview; verbose gateway mode left untouched (debug surface).
- tool_executor.py / tui_gateway / gateway: route the three spinner sites,
  the TUI _tool_ctx, and the gateway all/new progress line through the label.
- config: display.friendly_tool_labels (default True, per-platform aware).

Zero new core tool / schema footprint — pure display layer.

* docs: add PR infographic for friendly tool labels

* fix(display): preserve arg preview in gateway friendly labels + update tests

The first gateway pass re-derived the label from the callback's `args`, which
is empty ({}) at the gateway tool.started callsite — the command/query lives in
the `preview` string, so terminal rendered as a bare '💻 Running' and dedup
collapsed consecutive commands. Now the gateway prefixes the verb onto the
already-computed preview via get_tool_verb/tool_verb_connector/verb_drops_preview,
preserving the command/url/query. CLI spinner path (real args) keeps build_tool_label.

Tests: update test_run_progress_topics exact-format assertions to the friendly
form ('💻 Running pwd'), add a format-agnostic preview extractor for the
truncation tests (works for both quoted-legacy and verb-prefixed output).

* test(tui): update resume-display context to friendly tool label

_tool_ctx now uses build_tool_label, so the desktop resume-view context for a
search_files turn reads 'Searching files for resume' instead of the bare
'resume' preview — consistent with live tool-progress. Update the assertion.

* test(tui): harden no-race worker test against sibling shard leakage

test_session_create_no_race_keeps_worker_alive flaked under -j 8: a daemon
build thread leaked from a prior session.create test in the same shard process
fires close/unregister against its own (foreign) session_key after this test
patches the global approval hooks, polluting the captured lists. Scope the
assertions to this session's own session_key so the regression intent
(this session's worker/notify must survive) is preserved while the test
becomes immune to shard composition. Not related to friendly-tool-labels.
2026-06-29 20:31:17 -07:00
ethernet
41c85fb946 fix(agents.md): fix documentation on subprocess isolation in tests 2026-06-29 19:17:04 -07:00
ethernet
cca8b4ef4e fix(ci): unify amd64/arm64 docker pipelines 2026-06-29 19:17:04 -07:00
ethernet
66ba9e06d9 change(ci): remove lint PR comment
it's already in the job summary.
having it as a comment just makes people ignore it. don't waste sapce.
2026-06-29 19:07:00 -07:00
ethernet
808ba82125 feat(ci): add CI timing report 2026-06-29 19:07:00 -07:00
Ben Barclay
3a55f66602
refactor(relay): adopt scope_id wire key (guild_id → scope_id dual-read/write) (#55289)
Gateway half of relay-platform-parity Phase 2.5 (D-Q2.5). The relay wire's
platform-neutral scope discriminator is renamed guild_id → scope_id; this is the
hermes-agent side of the cross-repo wire-compatible migration.

- SessionSource: scope_id is canonical; guild_id kept as @deprecated alias.
  __post_init__ mirrors the two so all existing SessionSource(guild_id=...)
  constructors across native adapters keep working unchanged. to_dict dual-WRITES
  scope_id+guild_id; from_dict dual-READS scope_id ?? guild_id.
- relay/adapter.py: capture + outbound metadata dual-read/write scope_id.
- relay/ws_transport.py: _frame_to_event dual-reads scope_id ?? guild_id.
- docs/relay-connector-contract.md: document scope_id (canonical) + guild_id
  (deprecated alias) in the §3 SessionSource field table (conformance test).

250 relay+session+contract tests green. Solo lane (relay).
2026-06-30 11:16:53 +10:00
Joey Kerper
f3d2dfbec6
fix(dashboard_auth): allow any http:// host in self-hosted OIDC redirect_uri (#55099)
The self-hosted OIDC dashboard login rejected any http:// redirect_uri
whose host was not localhost/127.0.0.1, surfacing "redirect_uri may only use http:// for localhost/127.0.0.1" before reaching the IDP. This broke self-hosted dashboards reached over plain HTTP (including LAN IPs, internal hostnames, and reverse proxies that terminate TLS upstream).

#38827 already dropped this check from the nous provider, but the generic self-hosted provider  copied the old localhost-only
branch and reintroduced the bug for HERMES_DASHBOARD_OIDC_ISSUER setups.

The IDP's own allowlist is authoritative on which redirect_uris are
permitted; this client-side _validate_redirect_uri is only a fast-fail for
obvious operator error and should not second-guess valid http:// deployments.

Fix: drop the localhost-only branch on the http scheme. Validation now enforces only that the scheme is http(s) and the path ends with
/auth/callback. Updated the docstring to explain the relaxed contract,
and added test_allows_http_with_arbitrary_host covering an internal
hostname and a LAN IP alongside the existing localhost case.
2026-06-30 09:45:11 +10:00
yoniebans
d2ce2c852d test(gateway): assert interleaving safety of concurrent offloaded DB calls 2026-06-29 15:51:57 -07:00
yoniebans
6735162531 fix(gateway): offload the Telegram topic-recovery helper tree off the loop
The topic-mode helpers (_telegram_topic_mode_enabled,
_recover_telegram_topic_thread_id, _record/_sync_telegram_topic_binding,
_is_telegram_topic_lane/_root_lobby, _normalize_source_for_session_key,
_telegram_topic_new_header, _schedule_telegram_topic_title_rename, and the
base.py _apply_topic_recovery hook) each run a synchronous SessionDB read or
write. They reach the event loop through async handlers, so a contended
state.db froze the loop the same way the handoff watcher did.

These helpers already run off-loop in the run_sync thread-pool closure, so
they are proven thread-safe there. Rather than colour them async, loop-side
callers now invoke them via asyncio.to_thread(...); the executor callers are
unchanged. Inside the helpers the SessionDB handle is unwrapped to the sync
door (getattr(db, '_db', db)) since they always run on a worker thread, and
AIAgent construction + query_session_listing are handed the sync SessionDB
directly. base.py wraps its single _apply_topic_recovery call in to_thread.

The guard is now alias-aware (catches db = getattr(self, '_session_db', None);
db.method(...)) and enforces the offload contract: the offloaded sync helpers
may never be called bare on the loop. Sibling test fixtures wrap their injected
SessionDB in AsyncSessionDB to match how the gateway holds it.
2026-06-29 15:51:57 -07:00
yoniebans
0a997aabbc fix(gateway): route aliased SessionDB calls through AsyncSessionDB
The migration's call-site sweep keyed on the literal self._session_db.
spelling and missed calls bound to a local first
(db = getattr(self, '_session_db', None); db.method(...)). Convert the
three in async contexts: get_telegram_topic_binding in the topic-rename
coroutine, and the two update_session_model sites on the model-switch path.
2026-06-29 15:51:57 -07:00
yoniebans
0896facce8 fix(gateway): route SessionDB calls through AsyncSessionDB 2026-06-29 15:51:57 -07:00
yoniebans
ea26f22710 feat(gateway): add AsyncSessionDB offload facade 2026-06-29 15:51:57 -07:00
yoniebans
89daacb454 test(gateway): cover AsyncSessionDB offload + raw-call guard (failing) 2026-06-29 15:51:57 -07:00
brooklyn!
f171842f0d
Merge pull request #55154 from NousResearch/bb/desktop-auto-speak-replies
feat(desktop): read replies aloud (auto-TTS) composer toggle
2026-06-29 15:25:27 -05:00
Teknium
290fa7fd2b
fix(gateway): skip confirmed-dead delivery targets (deleted groups, blocked bots) (#55115)
* fix(gateway): skip confirmed-dead delivery targets (deleted groups, blocked bots)

A deleted Telegram group, kicked/blocked bot, or deactivated user keeps
throwing Forbidden/not_found on every cron tick and fan-out delivery. Each
retry burns a send against the platform's flood-control envelope and spams
the logs, making the whole session feel broken even when the model call
completed.

Add a small persistent DeadTargetRegistry (per-profile JSON under
HERMES_HOME) that records a target the moment a send reports a whole-chat
death (forbidden / chat-level not_found), and have DeliveryRouter.deliver()
short-circuit it on subsequent attempts. Self-healing: any successful send
clears the flag, so a user re-adding the bot recovers with no manual cleanup.
Thread/topic-level not_found is NOT recorded (adapters already self-heal that
by retrying without reply_to). Transient/timeout errors are never marked dead.

* infographic: dead delivery target skipping
2026-06-29 13:23:29 -07:00
Brooklyn Nicholson
596b813c9b feat(desktop): add read-replies-aloud toggle and wire auto-speak 2026-06-29 15:22:37 -05:00