Pull the top-level + chat parser construction out of main() into
hermes_cli/_parser.py so relaunch.py can introspect parser._actions to
discover which flags exist and whether they take values, instead of
maintaining a parallel hand-rolled (flag, takes_value) tuple list.
- _parser.py: build_top_level_parser() returns (parser, subparsers,
chat_parser); side-effect-free import.
- main.py: ~290 lines of inline parser construction collapsed to a
helper call. Other subparsers stay inline (dispatch is bound to
module-level cmd_* functions).
- _parser._inherited_flag(parser, ...): wraps parser.add_argument and
sets action.inherit_on_relaunch = True. Used in place of
parser.add_argument for the 25 flags (top-level + chat) that need to
carry over.
- _parser.PRE_ARGPARSE_INHERITED_FLAGS: holds --profile/-p, which
isn't on argparse (consumed earlier by main._apply_profile_override).
- relaunch.py: drops _CRITICAL_DESTS and _PRE_ARGPARSE_FLAGS; the table
builder now filters by getattr(action, 'inherit_on_relaunch', False).
- test_ignore_user_config_flags.py: brittle inspect.getsource grep
replaced with proper parser introspection.
- test_relaunch.py: introspection sanity tests added.
Salvaged from PR #17549; added top-level -t/--toolsets flag to
_parser.py so #17623 (fix(tui): honor launch toolsets) behavior is
preserved on current main.
Co-authored-by: ethernet <arilotter@gmail.com>
Extract all os.execvp('hermes', ...) calls into a utility so flags like
--tui, --dev, --profile, --model, --provider, et al. survive session
resume and post-setup relaunch.
- resolve_hermes_bin: prefers sys.argv[0] when callable, then PATH,
then falls back to '${sys.executable} -m hermes_cli.main' (fixes nix
run relaunches)
- build_relaunch_argv: allowlists critical flags so they carry over
- cmd_sessions browse now calls relaunch(['--resume', <id>])
- _apply_profile_override skips redundant work when HERMES_HOME is
already set (child inherits parent profile)
- setup.py replaces _resolve_hermes_chat_argv with relaunch_chat()
- added comprehensive tests for flag extraction and binary resolution
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
If a concurrent RPC mutates _sessions while session.delete is iterating
it (e.g. a parallel session.create on the thread pool), the bare except
swallowed the RuntimeError and let the delete proceed against a row
that may still be live. Snapshot via list(_sessions.values()) and
return an error when even that raises, instead of treating "couldn't
check" as "no active sessions."
Single-key confirm matches how the picker already accepts 1-9 to
resume — no separate y/n keymap to learn — and "press d again" is
self-documenting next to the cursor.
Pressing `d` on the highlighted row in the resume picker prompts
`delete? y/n`; `y` deletes the session (DB row + on-disk transcript
files), anything else cancels. The active session is excluded from
deletion server-side.
Adds a new `session.delete` JSON-RPC handler that wraps
`SessionDB.delete_session`, forwarding the per-profile `sessions/`
directory so transcripts get cleaned up alongside the row.
vision_analyze used Path('./temp_vision_images') — a relative path that
resolved against cwd. Under Docker the image's WORKDIR is /opt/hermes,
which is root-owned and only chmoded a+rX (read + traversal). Since
#5811 landed (run as non-root hermes UID 10000, Apr 12), remote-URL
vision calls fail with PermissionError on mkdir.
Switch to get_hermes_dir('cache/vision', 'temp_vision_images'): resolves
to $HERMES_HOME/cache/vision/ (= /opt/data/cache/vision/ in Docker —
the user-owned volume mount). Existing installs with the old dir keep
using it via the get_hermes_dir back-compat path; no migration needed.
Only site in the codebase that stored runtime files via Path('./...').
Reported via Discord: https://juick.com/i/p/3089079.jpg → Telegram →
gateway → [Errno 13] Permission denied: 'temp_vision_images'.
CI Tests workflow has been red on main for 40+ consecutive runs. This
commit recovers every failure visible in run 25130722163 (most recent
completed run prior to this PR).
Root causes, by group:
Test-mock drift after product landed (fix: update mocks)
- test_mcp_structured_content / test_mcp_dynamic_discovery (6 tests):
product added _rpc_lock (#02ae15222) and _schedule_tools_refresh
(#1350d12b0) without updating sibling test files. Install a real
asyncio.Lock inside the fake run-loop and patch at _schedule_tools_refresh.
- test_session.py: renamed normalize_whatsapp_identifier → canonical_
whatsapp_identifier upstream; keep a local alias so the legacy tests
keep working.
- test_run_progress_topics Slack DM test: PR #8006 made Slack default
tool_progress=off; explicitly set it to 'all' in the test fixture so
the progress-callback path still runs. Also read tool_progress_callback
at call time rather than freezing it in FakeAgent.__init__ — production
assigns it AFTER construction.
- test_tui_gateway_server session-create/close race: session.create now
defers _start_agent_build behind a 50ms timer — wait for the build
thread to enter _make_agent before closing, otherwise the orphan-
cleanup path never runs.
- test_protocol session.resume: product get_messages_as_conversation now
takes include_ancestors kwarg; accept **_kwargs in the test stub.
- test_copilot_acp_client redaction: redactor is OFF by default (snapshots
HERMES_REDACT_SECRETS at import); patch agent.redact._REDACT_ENABLED=True
for the duration of the test.
- test_minimax_provider: after #17171, dots in non-Anthropic model names
stay dots even with preserve_dots=False. Assert the new invariant
rather than the old 'broken for MiniMax' behavior.
- test_update_autostash: updater now scans `ps -A` for dashboard PIDs;
the test's catch-all subprocess.run stub needed stdout/stderr fields.
- test_accretion_caps: read_timestamps dict is populated lazily when
os.path.getmtime succeeds. Use .get("read_timestamps", {}) to tolerate
CI filesystems where the stat races file creation.
Change-detector tests (fix: rewrite as structural invariants)
- test_credential_sources_registry_has_expected_steps: was a frozen set
comparison that broke when minimax-oauth was added. Rewrite as an
invariant check (every step has description, no dupes, core steps
present) per AGENTS.md 'don't write change-detector tests'.
xdist ordering / test pollution (fix: reset state, use module-local patches)
- test_setup vercel: sibling test saved VERCEL_PROJECT_ID='project' to
os.environ via save_env_value() and never cleared it. monkeypatch.delenv
the VERCEL_* vars in the link-file test.
- test_clipboard TestIsWsl: GitHub Actions is on Azure VMs whose real
/proc/version often contains 'microsoft'. Patching builtins.open with
mock_open didn't reliably intercept hermes_constants.is_wsl's call in
xdist workers that had already cached _wsl_detected=True from an
earlier test. Patch hermes_constants.open directly and add
teardown_method to reset the cache after each test.
Pytest-asyncio cancellation hangs (fix: bound product await with timeout)
- test_session_split_brain_11016 (3 params) + test_gateway_shutdown
cancel-inflight: under pytest-asyncio 1.3.0, 'await task' and
'asyncio.gather(cancelled_tasks)' can stall for 30s when the cancelled
task's finally block awaits typing-task cleanup. Bound both with
asyncio.wait_for(..., timeout=5.0) and asyncio.shield — the stragglers
are released from adapter tracking and allowed to finish unwinding in
the background. This is also a legitimate hardening: a wedged finally
shouldn't stall the caller's dispatch or a gateway shutdown.
Orphan UI config (fix: merge tiny tab into messaging category)
- test_web_server test_no_single_field_categories: the telegram.reactions
config field lived in its own 'telegram' schema category with no
siblings. Fold it under 'discord' via _CATEGORY_MERGE so the dashboard
doesn't render an orphan single-field tab.
Local verification: 38/38 originally-failing tests pass; 4044/4044
gateway tests pass; 684/684 targeted subset (all 16 touched test files)
passes.
Reset sticky mouse/focus/paste terminal modes before the TUI starts and during graceful shutdown paths so stale tab state from prior crashes cannot poison the next session.
Detect leaked SGR mouse-report fragments in CLI input, strip them, and reset terminal modes in-place so scroll and typing recover without reopening the tab. Add regression tests for escaped, visible, and bare leak forms.
Route Option/Alt or Ctrl wheel input through a gated precision path that scrolls at most one row per short interval, while preserving the existing accelerated behavior for plain wheel input. Keep precision active briefly after modifier release so queued wheel events from the same gesture do not jump into acceleration mid-stream.
curl is a ubiquitous tool both for users running ad-hoc commands inside
the container (debugging, health checks, quick HTTP probes) and for
agent workflows — many bundled skills and hub skills lean on curl for
HTTP calls, API exploration, and installer bootstrapping. Its absence
causes silent workflow failures with "curl: command not found" until
the user manually apt-installs it.
Add curl to the single apt-get install layer alongside the other base
utilities (build-essential, nodejs, git, openssh-client, etc.) so it
ships in the image with zero extra layers and negligible size impact
(~400 KB).
- Dockerfile: add curl to the apt-get install list
Decode Shift, Meta, and Ctrl bits from SGR and legacy X10 wheel event button bytes so TUI input handlers can distinguish modified wheel gestures from plain scrolling.
check_for_updates() looked at __file__.parent.parent for a .git dir to
diff against origin/main. A nix-built hermes lives in /nix/store with
no .git there, so the check fell through to whatever editable-install
dev checkout last populated ~/.hermes/.update_check, producing stale
"X commits behind" warnings right after a fresh `nix run --refresh`.
Embed the locked flake rev into the wrapper as HERMES_REVISION (only
on
clean builds — dirty refs don't represent any upstream commit). When
set, banner.py compares it to upstream main via `git ls-remote`
instead
of inspecting a local checkout, and the cache key includes the rev so
nix updates invalidate immediately. Without local history we can't
count commits, so the message is a plain "update available" with no
suggested command — nix users may install via `nix run`, profile,
system flake, or home-manager, and we don't know which.
Also bump web/package-lock.json npmDepsHash via `nix run
.#fix-lockfiles`.
* fix(tui): offload manual compaction RPC
Route TUI session compression through the existing long-handler pool so slow compaction does not block other gateway RPCs.
* fix(tui): show compaction progress immediately
Print a local status line before the compress RPC starts so slow manual compaction does not look like a no-op.
* feat(tui): rich /compress feedback parity with CLI
Show pre-compaction message count and rough token estimate immediately, emit a status update so the bottom bar reflects ongoing compaction, and report a multi-line summary (headline + token delta + optional note) using the shared summarize_manual_compression helper.
* fix(tui): show live compaction estimate in transcript
Mirror compression progress status into the transcript so users see the backend message count and token estimate while /compress is still running.
* fix(tui): single live compaction line with spinner glyph
Drop the redundant local "compressing context..." placeholder and prefix the live backend status line with a braille spinner glyph so /compress reads as a single in-progress row.
* fix(tui): address review nits on /compress feedback
Reuse the precomputed token estimate inside _compress_session_history so the gateway does not redo the O(n) work while holding history_lock, keep the status bar pinned during long manual compactions instead of auto-restoring after 4s, and drop the redundant noop bullet that doubled with the system role glyph.
* fix(tui): release history_lock during compaction LLM call
Move the snapshot/commit pattern into _compress_session_history so the lock is held only across the in-memory bookkeeping, not during agent._compress_context. Also emit a final neutral status update from session.compress so the pinned compressing indicator clears even on errors.
* fix(tui): rebuild prompt cleanly + sync session_key after compress
Pass system_message=None so AIAgent._compress_context rebuilds the system prompt without nesting the cached identity block. Reuse the handler's pre-snapshotted history inside _compress_session_history to avoid a second O(n) copy under the lock. After compaction, when AIAgent._compress_context rotates session_id, sync the gateway session_key, migrate approval notify + yolo state, restart the slash worker, and clear the stale pending title. Mirrors HermesCLI._manual_compress.
* Avoid /compress lock re-entry in slash side effects.
Stop pre-locking history before _compress_session_history in slash command mirroring, keep session-key sync parity with manual compression, and add a regression test that asserts /compress is invoked without holding history_lock.
* fix(tui): word-wrap composer input
Wrap composer input at word boundaries and anchor the good-vibes heart to the full composer row.
* test(tui): cover composer word wrap edge
Add regression coverage for moving the next word instead of splitting it at the composer edge.
* fix(tui): honor launch toolsets
Carry chat --toolsets through the TUI launcher so TUI sessions use the same per-session tool scope as the classic CLI.
* fix(tui): parse top-level toolsets flag
Allow top-level hermes --tui --toolsets to reach the implicit chat session, matching chat subcommand behavior.
* fix(tui): validate launch toolsets
Filter invalid HERMES_TUI_TOOLSETS entries and fall back to configured CLI toolsets when the override contains no valid toolsets.
* fix(tui): avoid config load for builtin toolsets
Honor built-in HERMES_TUI_TOOLSETS values before loading config and treat all/* as the all-toolsets sentinel.
* fix(cli): honor toolsets in oneshot mode
Forward top-level --toolsets into oneshot agent construction so the flag is not silently ignored outside the TUI path.
* fix(cli): validate oneshot toolsets
Reject invalid-only oneshot toolset overrides before output redirection and clarify TUI fallback warnings.
* fix(cli): preserve all-toolsets sentinel
Map explicit all/* oneshot toolset overrides to the all-toolsets sentinel and replace locals() checks in TUI toolset loading.
* fix(cli): warn on extra all-toolset entries
Warn when all/* toolset overrides include additional ignored entries so typos are still visible.
* fix(tui): honor plugin toolset overrides
Discover plugin toolsets before rejecting unresolved explicit toolset overrides and read raw config for MCP name validation.
* fix(tui): reuse toolset argument normalizer
Share top-level TUI toolset argument parsing with the oneshot path to avoid duplicate normalization logic.
* fix(cli): reject disabled mcp toolsets
Validate explicit toolset overrides against enabled MCP servers only and clarify top-level toolset flag help.
* fix(cli): distinguish disabled mcp from unknown toolsets
Report disabled MCP servers separately from unknown toolset entries and stub plugin discovery in invalid-name tests for determinism.
shutil.copytree from default ~/.hermes duplicated ~/.hermes/profiles into
the new profile, causing nested profiles/.../profiles/... and huge disk use.
Match export behavior (_DEFAULT_EXPORT_EXCLUDE_ROOT) by ignoring the sibling
profiles tree at the source root.
Made-with: Cursor
Intended placement per PR #17610 discussion — comfyui belongs in
skills/creative/ alongside other creative built-ins (touchdesigner-mcp,
pretext, sketch), not in optional-skills/.
Pure directory rename, no content changes. History preserved via git mv.
The skip_pre_tool_call_hook flag was added to prevent double-firing of
pre_tool_call when run_agent._invoke_tool pre-checks for a block
directive and then dispatches via handle_function_call. But the
implementation added an else: branch that fired invoke_hook again for
'observers', without noticing that get_pre_tool_call_block_message() in
hermes_cli.plugins already fires invoke_hook('pre_tool_call', ...) as
part of its block-directive poll.
Result: every tool call ran through the run_agent loop fired the hook
twice — reported by community users whose observer / audit plugins
logged each tool invocation twice with identical timestamps.
Fix: delete the else: branch. The single-fire contract is now:
- skip=False (direct handle_function_call): hook fires once inside
get_pre_tool_call_block_message().
- skip=True (run_agent._invoke_tool path): caller fires the hook
once via get_pre_tool_call_block_message(); handle_function_call
must not fire it again.
Tightened the existing skip-flag test (renamed to
test_skip_flag_prevents_double_fire) to assert pre_tool_call fires
zero times when skip=True, and added
test_run_agent_pattern_fires_pre_tool_call_exactly_once to lock in
end-to-end that the full block-check + dispatch sequence fires the
hook exactly once.
Adds Step 0 'Ask Local vs Cloud' as the very first onboarding step, with a
scripted question that spells out the hardware requirements for local
(6 GB VRAM NVIDIA, ROCm AMD on Linux, or M1+ Mac with 16 GB unified)
and routes Cloud users straight to Path A without a hardware check.
Hardware check becomes Step 1, run only when the user picked local.
Layers a programmatic hardware-feasibility check on top of the v4 skill
so the agent doesn't silently push users toward a local install they
can't actually run. The official comfy-cli supports --nvidia / --amd /
--m-series / --cpu, but has no guard against "4 GB laptop GPU on SDXL"
or "Intel Mac falling back to CPU" — both route to comfy-cli paths in
the original table and then fail on first workflow.
- scripts/hardware_check.py: detect OS/arch/GPU (NVIDIA nvidia-smi,
AMD rocm-smi, Apple M1+ via arm64+sysctl, Intel Arc via clinfo),
VRAM, system/unified RAM. Emits JSON
{verdict: ok|marginal|cloud, recommended_install_path, comfy_cli_flag}
with practical thresholds: discrete GPU >=6 GB VRAM minimum,
Apple Silicon >=16 GB unified memory minimum, Intel Mac -> cloud,
no accelerator -> cloud. comfy_cli_flag maps directly to
`comfy install` so the agent can stitch the whole flow together.
- scripts/comfyui_setup.sh: runs hardware_check.py first when no
explicit flag is passed. If verdict=cloud, refuses to install
locally, prints Comfy Cloud URL + an override command, exits 2.
Otherwise auto-selects the right --nvidia/--amd/--m-series flag
for `comfy install`. Surfaces marginal-verdict notes to the user.
- SKILL.md Setup & Onboarding: adds mandatory Step 0 "Check If This
Machine Can Run ComfyUI Locally" ahead of the Path A-E selection.
Documents the verdict thresholds inline, ties verdict + comfy_cli_flag
to the install paths, and updates the path-choice table so
"verdict: cloud" is the first row. Quick-Start "Detect Environment"
block extended to include the hardware check. Verification
checklist gains a hardware-check gate.
- Frontmatter setup.help rewritten to point at hardware_check.py
first. Version bumped 4.0.0 -> 4.1.0.
Capture the reusable layout and animation lessons from the advanced Pretext demo so the skill teaches measured obstacle fields, morphing geometry, and polished browser examples.