hermes-agent/tests
brooklyn! d62979a6f3
feat(desktop): composer status stack, live subagent windows, editable prompts (#44630)
* feat(desktop): session-scoped status stack + kill new-window theme flash

Stack subagents, background tasks, and the queue into one collapsible
"sink" above the composer, reusing the queue's chrome so every status
reads as one piece. Extracts shared StatusSection / StatusRow /
TerminalOutput primitives and a unified $statusItemsBySession store
(subagents mirrored, background owned here, merged + grouped for render).
Renames BrailleSpinner → GlyphSpinner now that it drives more than braille.

Separately, fix the white flash on every new/cmd-clicked window: macOS
`vibrancy` paints an NSVisualEffectView that follows the OS appearance and
ignores `backgroundColor`, so a dark app on a light-mode Mac flashed white
until the renderer painted over it. Pin `nativeTheme.themeSource` to the
app theme (persisted to userData so cold launches paint right before the
renderer loads), hold windows with `show:false` until `ready-to-show`, and
pre-paint the themed background via an inline script before the bundle runs.

* feat(desktop): dock the slash popover to the composer via one shared fill var

The slash·@ popover (and ? help) now docks onto the composer's edge with the
same chrome as the queue/status stack — rounded outer corners, fused borderless
edge, no shadow — but keeps its own narrow width.

Surface + drawer paint a single --composer-fill var; the state ladder
(rest / scrolled / focused / drawer-open) lives once in styles.css on
[data-slot='composer-root']. The :has() drawer-open rule is last and forces an
opaque fill, since translucent glass sampling different backdrops (thread vs
fade gradient) can never match. This replaces the focus-within !important
override that repainted the surface behind every previous matching attempt.

Also drop the chevron column from the project file tree — the folder open/closed
icon already carries the expand state.

* feat(desktop): base inset for file tree rows (post-chevron alignment)

* feat(desktop): wire the status stack's background tasks to the real process registry

The background group was UI-only (dev-mock seeded). Now it's live e2e:

- tui_gateway: new session-scoped `process.list` (registry snapshot filtered
  by the session's session_key, plus a 4KB output tail for the inline
  terminal viewer) and `process.kill` (single process, ownership-checked —
  unlike process.stop's kill_all).
- Renderer: `reconcileBackgroundProcesses` syncs snapshots into the store
  layout-stably — rows keep their position when state flips (never re-sort),
  new processes append, unchanged rows keep object identity so memoised rows
  skip re-rendering, and a dismissed-set stops the registry's retained
  finished procs from resurrecting X-ed rows.
- Refresh triggers: session open, terminal/process tool.complete,
  status.update(kind=process) from the gateway's notification poller, and a
  5s poll armed only while a running row is visible (catches silent exits).
- Stop = real `process.kill` + optimistic dismiss; Dismiss = client-side
  with resurrection guard.
- Re-keyed the stack to the RUNTIME session id: it was keyed by the stored
  session id, where neither subagent events nor process.list would ever land.
- Deleted dev-status-mocks.ts (__hermesStatusMocks) — no more seed shit.

Reconcile invariants covered in store/composer-status.test.ts.

* feat(desktop): todos + openable subagents in the status stack, self-healing file tree

- todo lists move out of the inline chat panel into the composer status stack
  (checklist icon, dashed ring = pending, spinner = in progress, check = done),
  fed live from todo tool events and seeded from history on session open
- subagent rows carry the child's real session id end-to-end
  (delegate_tool → gateway → renderer) so clicking one opens ITS session window
- status stack publishes its measured height so the thread's bottom clearance
  grows with it; card paints the shared --composer-fill so focused/scrolled
  states match the composer exactly
- file tree self-heals: ENOENT roots retry on a 3s cadence + Try again button,
  and the main process expands ~ in IPC paths (gateway cwds arrive as ~/...)
- composer drag-drop of tree entries inserts inline refs instead of attachments

* fix(desktop): file tree falls back to the workspace dir when a session's cwd is gone

Sessions record their launch cwd; deleted worktrees leave that path dead,
so opening such a session swapped the tree from the default workspace to a
directory that ENOENTs forever — the 3s retry just spun on it. On a root
read error the tree now asks main to sanitize the cwd (prefers the
configured default project dir), displays that fallback, and quietly
re-probes the original path so it switches back if the dir reappears.

* feat(desktop): working restore-checkpoint button on past user prompts

The discard icon on hover of a past user bubble was decorative — clicking
did nothing. It's now a real control: a confirmation dialog explains that
everything after the prompt is removed, then the session rewinds to that
turn and reruns the same prompt (prompt.submit with
truncate_before_user_ordinal, the same mechanism the edit composer uses).
Failures rethrow into the dialog's inline error instead of toasting.

* fix(desktop): show the restore-checkpoint button on the latest user prompt too

Restoring the most recent prompt is just 'retry this turn' — no reason to
exclude it. Stop still takes the slot while the turn is running.

* fix(desktop): finished todo lists clear themselves out of the status stack

A list whose every item is completed/cancelled lingers ~4s so the final
checkmark is visible, then the todo group drops out of the stack. A fresh
active list arriving within the linger cancels the scheduled clear.

* chore(desktop): drop dead editableCheckpoint copy, terser restore confirm

* fix(desktop): rewind clears the abandoned timeline's todos + background

Restoring to (or editing) an earlier prompt rewinds the conversation, but
the todos and background processes spawned by the now-discarded turns kept
showing in the status stack — and the real background processes kept
running. Both rewind paths now clear the session's todo rows and kill +
drop its background processes before the fresh run repopulates them. Also
drops the click-to-edit clamp transition, which flashed a half-expanded
bubble on the way into the edit composer.

* feat(desktop): user messages are always editable; edit/restore revert mid-stream

The bubble is now always click-to-edit — even while a turn streams — instead
of going inert during a run. Sending an edit acts like restore: it rewinds to
that prompt and re-runs with the new text. Both edit and restore can fire
mid-stream now; the gateway refuses prompt.submit while a turn runs (4009
"session busy"), so they interrupt the live turn first and retry the submit
until the cooperative interrupt winds it down. Restore (re-run as-is) shows on
every prompt except the latest running one, which keeps the Stop button.

* fix(desktop): label preview-pane ⌘L selections with the filename, not "zsh"

The terminal owns a global ⌘/Ctrl+L "send selection to composer" shortcut, so
selecting text in the file preview pane and hitting it fell through to the
terminal handler — which imported the right text but labelled the composer ref
"zsh:N lines" off the shell name. When the selection isn't an xterm selection,
label it with the previewed file instead.

* fix(desktop): ⌘L on a preview line selection inserts the @line ref, like dragging

The source preview lets you select lines in the gutter and drag them into the
composer as an @line:path:start-end ref. ⌘/Ctrl+L now does the same when a line
selection is active — it drops the identical ref instead of falling through to
the terminal's global handler (which grabbed the native text selection and sent
a bogus terminal block). Capture-phase + stopPropagation so it wins; with a line
selection there's no native selection, so the terminal handler stays out of it.

* chore: gitignore apps/desktop/demo/ scratch output

The desktop demo prompt writes demo/*.txt during recorded walkthroughs; it's
throwaway, never part of the app. Ignore it so it stops cluttering git status.

* feat(desktop): subagent watch windows, hard stop, sidebar hygiene

Child-session mirror for live subagent windows, delegate sessions tagged
and excluded from the sidebar, composer focus/stop polish, and WS stall
resilience on the gateway transport.

* refactor: DRY delegate SQL + trim status-stack noise

Extract shared listable-child and delegate-delete helpers in hermes_state,
collapse cancelRun busy release, and cut comment bloat in resume/status paths.

* fix(desktop): hide orphaned subagent sessions in sidebar

Cascade-delete all ephemeral children on parent delete (not just tagged rows),
run v16 backfill to tag legacy orphans, and record new delegates as source=subagent.

* fix: restore orphan contract for untagged children + lazy session eviction

Cascade-delete only _delegate_from-tagged rows (v16 backfill covers legacy),
walk marker chains recursively with FK-safe orphaning, gate lazy watch
sessions out of the still-starting eviction exemption via an explicit flag,
pass session_id to _make_agent only when resuming, and hide source=subagent
from session search.

* fix(gateway): gate child mirror off upgraded sessions + age out stale run entries

Review findings: the mirror could interleave synthetic events with a real
native stream once a watch window upgrades (prompt.submit builds an agent),
and a lost subagent.complete left _active_child_runs pinning running=true
forever. Mirror now stops when the live session owns an agent; liveness
reads ignore entries older than an hour.

* fix(gateway): reject prompt.submit into a watch session while its child runs

A lazy watch session's running flag is False (the run lives in the parent
turn), so typing mid-run sailed past the busy guard and built a second agent
racing the in-flight child on the same stored session. Busy error until the
run completes; afterwards the submit upgrades into a normal conversation.

* refactor(gateway): DRY watch-resume payload + compose listable-child SQL

Fold the duplicated child-run busy overlay into one _reuse_live_payload
helper across both resume reuse paths, collapse the twin mirror early-returns,
and build _LISTABLE_CHILD_SQL from _BRANCH_CHILD_SQL instead of restating it.

* fix(desktop): clip horizontal overflow on sidebar scroll areas

Add overflow-x-hidden alongside overflow-y-auto on session list scrollers
and the shared SidebarContent primitive — vertical scroll unchanged.
2026-06-12 08:30:06 -05:00
..
acp feat(acp): emit session provenance metadata for compression rotation (#41724) 2026-06-07 22:22:21 -07:00
acp_adapter feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
agent feat(billing): /credits command — balance + portal top-up handoff (#44776) 2026-06-12 08:51:10 +00:00
cli feat(cli): persist resolved approval/clarify prompts in scrollback (#44702) 2026-06-12 01:14:35 -07:00
cron Merge pull request #44331 from NousResearch/hermes/hermes-6b48295e 2026-06-11 22:48:06 -07:00
docker fix(gateway): auto-start after container restart via planned-stop marker (#42675) (#43236) 2026-06-10 14:01:34 +10:00
e2e chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
fakes
fixtures/plugins/example-dashboard/dashboard feat(dashboard): nous-blue theme, bulk sessions, schedule picker (#37383) 2026-06-02 12:37:40 -04:00
gateway fix(teams): cache document/video/audio attachments and classify as DOCUMENT (#44778) 2026-06-12 02:05:41 -07:00
hermes_cli fix(dashboard): profile-scope Channels endpoints and seed per-profile .env (#44792) 2026-06-12 02:09:28 -07:00
hermes_state fix(sessions): archive compressed conversation lineages 2026-06-11 12:31:10 +05:30
honcho_plugin fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150) 2026-06-08 09:35:22 -07:00
integration refactor(gateway): migrate Home Assistant adapter to bundled plugin 2026-06-06 11:46:24 -07:00
openviking_plugin fix(openviking): add missing /agent/{agent}/ segment to memory URI — fixes #36969 2026-06-04 17:40:33 -07:00
plugins fix(photon): stop gateway restarts from orphaning the sidecar on its port 2026-06-12 01:07:38 -07:00
providers fix(openrouter): route reasoning_effort to verbosity for adaptive Anthropic models (#43436) 2026-06-10 15:03:01 +05:30
run_agent fix(credits): suppress usage gauge when top-up funds exist + add display.credits_notices toggle (#44716) 2026-06-12 01:06:46 -07:00
scripts fix(skills-hub): stop shipping a degenerate index when GitHub taps collapse (#42347) 2026-06-08 15:21:28 -07:00
skills fix(google-workspace): fall back to uv when venv has no pip (#39516) 2026-06-05 13:30:02 +10:00
stress chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
tools feat(messaging): expose action='unreact' in send_message + react dispatch tests 2026-06-12 01:07:38 -07:00
tui_gateway feat(desktop): composer status stack, live subagent windows, editable prompts (#44630) 2026-06-12 08:30:06 -05:00
website feat(skills): fix browse cap, add source links + copy buttons + category cleanup (#37143) 2026-06-01 19:52:28 -07:00
__init__.py
conftest.py fix: batch of small robustness/correctness fixes from @kyssta-exe 2026-06-01 19:51:03 -07:00
run_interrupt_test.py
test_account_usage.py
test_atomic_replace_symlinks.py
test_base_url_hostname.py
test_batch_runner_checkpoint.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_bitwarden_secrets.py fix(bitwarden): prevent zip-slip path traversal when extracting bws binary (#40569) 2026-06-06 18:33:44 -07:00
test_cli_file_drop.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_cli_manual_compress.py fix(tests): catch up six stale tests after compression/aux/kanban changes (#28465) 2026-05-18 21:43:59 -07:00
test_cli_skin_integration.py
test_ctx_halving_fix.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_dashboard_sidecar_close_on_disconnect.py fix(tui-gateway): reap leaked slash_worker sessions on disconnect + active_list liveness (re-scoped onto current main) 2026-06-08 10:02:05 -07:00
test_desktop_mac_entitlements.py test(desktop): assert macOS device entitlements are inherited 2026-06-03 07:32:00 +07:00
test_docker_home_override_scripts.py Repair cron ownership on container restart (#41976) 2026-06-10 15:32:34 +10:00
test_docker_stage2_browser_discovery.py fix(docker): discover Playwright headless_shell browser (#35717) 2026-06-01 16:06:44 +10:00
test_dockerfile_tini_compat_shim.py fix(docker): add /usr/bin/tini compatibility shim for legacy wrappers (#34192) (#34382) 2026-06-01 13:32:55 +10:00
test_empty_model_fallback.py test(models): guard Nous silent default against expensive-flagship escalation 2026-06-05 02:54:34 -07:00
test_empty_session_hygiene.py fix: in-memory transcript blocks empty-session prune 2026-06-10 17:37:34 -07:00
test_env_loader_secret_sources.py fix(secrets): only apply external secrets once per HERMES_HOME per process (#32271) 2026-05-25 15:18:55 -07:00
test_evidence_store.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_gateway_streaming_nested_config.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_get_tool_definitions_cache_isolation.py fix(gateway): close residual memory-leak sites under heavy scheduled workload 2026-06-08 06:32:42 -07:00
test_hermes_bootstrap.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_hermes_constants.py fix(constants): use windows native default hermes home 2026-06-03 19:37:29 -07:00
test_hermes_home_profile_warning.py
test_hermes_logging.py fix(gateway): tolerate Unicode in stderr log handlers on Windows 2026-06-06 19:57:44 -07:00
test_hermes_state.py feat(desktop): composer status stack, live subagent windows, editable prompts (#44630) 2026-06-12 08:30:06 -05:00
test_hermes_state_compression_locks.py fix(compression): prevent session-id fork from concurrent compressions (#34351) 2026-05-28 21:40:39 -07:00
test_hermes_state_wal_fallback.py fix(kanban): skip redundant WAL pragma on already-WAL connections 2026-05-27 14:31:55 -07:00
test_honcho_client_concurrency.py fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150) 2026-06-08 09:35:22 -07:00
test_honcho_client_config.py fix(honcho): harden self-hosted setup paths 2026-05-29 22:29:48 -07:00
test_honcho_session_context.py fix(honcho): align user context peer perspective 2026-05-27 10:49:33 -07:00
test_honcho_startup_fail_open.py fix: make Honcho startup fail open 2026-06-01 20:13:42 -07:00
test_install_no_initial_commit.py fix(install): move broken checkout aside instead of deleting it 2026-06-08 02:18:21 -07:00
test_install_sh_browser_install.py fix(install): support non-sudo service-user installs on apt distros (#25814) 2026-05-14 09:05:31 -07:00
test_install_sh_pythonpath_sanitization.py
test_install_sh_root_fhs_uv_python_path.py test(install): harden uv-python-path regression test against future drift 2026-05-27 13:55:51 -07:00
test_install_sh_setup_wizard_tty_probe.py
test_install_sh_symlink_stomp.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_install_sh_termux_network_prereqs.py
test_ipv4_preference.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_lazy_session_regressions.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_lint_config.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_live_system_guard_self_test.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_mcp_serve.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minimax_oauth.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_minisweagent_path.py
test_model_forces_max_completion_tokens.py fix(params): send max_completion_tokens for newer OpenAI families on custom endpoints 2026-06-09 23:22:10 -07:00
test_model_picker_scroll.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_model_tools.py feat(middleware): add adaptive execution intercepts 2026-06-03 11:22:06 -07:00
test_model_tools_async_bridge.py fix(web): run URL SSRF checks off the event loop in async paths 2026-06-04 18:04:47 -07:00
test_ollama_num_ctx.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_output_cap_parsing.py test(agent): cover char-based output-cap overflow parsing (#42741) 2026-06-09 03:17:12 -07:00
test_package_json_lazy_deps.py fix(update): make Camofox lazy-installed instead of eager (#27055) 2026-05-16 12:15:45 -07:00
test_packaging_metadata.py fix(packaging): ship optional-mcps catalog in wheel and sdist (#39859) 2026-06-09 14:03:20 -04:00
test_plugin_skills.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_plugin_utils.py fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150) 2026-06-08 09:35:22 -07:00
test_process_loop_event_loop_warning.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_project_metadata.py fix(deps): align anthropic extra pin with lazy pin + guard whole pin surface (#42335) 2026-06-08 12:11:54 -07:00
test_retry_utils.py
test_run_tests_parallel.py fix(ci): make parallel runner's exit-4 retry robust for newly-added test files (#42994) 2026-06-09 21:39:09 -07:00
test_sanitize_tool_error.py security: sanitize tool error strings before injecting into model context (#26823) 2026-05-16 00:57:39 -07:00
test_slash_worker_watchdog.py feat(slash-worker): self-terminate on parent death via create_time watchdog 2026-06-08 07:03:12 -07:00
test_sql_injection.py
test_state_db_malformed_repair.py fix(state.db): recover from malformed sqlite_master so hidden sessions reappear (#43149) 2026-06-09 18:49:08 -05:00
test_subprocess_home_isolation.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_termux_all_extra_compat.py
test_timezone.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_toolset_distributions.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_toolsets.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_trajectory_compressor.py fix(research): keep tool_call/tool_response pairs intact when compressing trajectories 2026-06-07 05:01:27 -07:00
test_trajectory_compressor_async.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_transform_llm_output_hook.py
test_transform_tool_result_hook.py test: stub has_hook in transform_tool_result hook tests 2026-06-03 06:36:46 -07:00
test_tui_gateway_server.py fix(desktop): discover MCP tools for dashboard /api/ws backends (#44512) 2026-06-11 22:45:45 +00:00
test_tui_gateway_ws.py feat(desktop): composer status stack, live subagent windows, editable prompts (#44630) 2026-06-12 08:30:06 -05:00
test_utils_truthy_values.py
test_web_server.py fix(tui-gateway): reap leaked slash_worker sessions on disconnect + active_list liveness (re-scoped onto current main) 2026-06-08 10:02:05 -07:00
test_wheel_locales_e2e.py fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix (#38383) 2026-06-03 12:00:27 -07:00
test_yuanbao_integration.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_yuanbao_markdown.py
test_yuanbao_pipeline.py feat(Yuanbao): support wechat forward msg (#43508) 2026-06-12 02:06:47 -07:00
test_yuanbao_proto.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_yuanbao_shutdown.py fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (#40607) 2026-06-07 17:49:38 -07:00