_get_platform_tools() correctly fell back to f"hermes-{platform}" for
unknown (plugin) platforms when building toolset_names, but then
unconditionally used PLATFORMS[platform] again for platform_tool_universe,
causing KeyError for any plugin-registered platform like Teams.
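A minimal sketch of the intended lookup, assuming a static table keyed by platform name. The table contents and the helper name below are illustrative stand-ins, not the actual tools_config code:

```python
# Hypothetical stand-in for the static PLATFORMS table; the real shape may differ.
PLATFORMS = {
    "telegram": {"default_toolset": "hermes-telegram"},
    "discord": {"default_toolset": "hermes-discord"},
}


def default_toolset_for(platform: str) -> str:
    """Return the static toolset name, or the hermes-<name> fallback."""
    entry = PLATFORMS.get(platform)
    if entry is None:
        # Plugin-registered platform: no static entry, so derive the name.
        return f"hermes-{platform}"
    return entry["default_toolset"]
```

Resolving the name once and reusing it for both toolset_names and platform_tool_universe avoids a second PLATFORMS[platform] lookup that can raise.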
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dashboard Models page was analytics-only — no way to pick a model as main
for new sessions or override an auxiliary task slot without hand-editing
config.yaml or running a /model slash command inside a chat.
Changes:
- hermes_cli/web_server.py: three REST endpoints (GET /api/model/options,
GET /api/model/auxiliary, POST /api/model/set). Reuses
list_authenticated_providers() from model_switch.py so the REST path
surfaces the same curated model lists as the TUI-gateway model.options
JSON-RPC. POST /api/model/set writes model.provider + model.default for
scope=main, and auxiliary.<task>.{provider,model} for scope=auxiliary
(with task="" meaning 'all 8 slots' and task="__reset__" resetting them
to auto).
- web/src/components/ModelPickerDialog.tsx: accepts an optional loader +
onApply pair so it works without an open chat PTY. ChatSidebar's
gw-WebSocket path still works unchanged (back-compat).
- web/src/pages/ModelsPage.tsx: Model Settings panel at the top showing
main model + collapsible list of 8 auxiliary tasks with per-row Change
buttons and Reset all to auto. Every existing model card gets a
'Use as' dropdown for one-click assignment to main or any aux slot.
Cards badged 'main' or 'aux · <task>' when currently assigned.
- website/docs/user-guide/configuring-models.md: new docs page walking
through both UI paths, aux task override patterns, troubleshooting,
plus REST/CLI alternatives.
- Screenshots under website/static/img/docs/dashboard-models/.
Applies to new sessions only — running sessions keep their model (use
/model slash command to hot-swap a live session). No prompt-cache
invalidation on existing sessions.
Dashboard plugin API routes (web_server._mount_plugin_api_routes) and
gateway event hooks (gateway.hooks.HookRegistry.discover_and_load) both
loaded Python files via importlib.util.spec_from_file_location +
exec_module without registering the resulting module in sys.modules.
That breaks any plugin or hook handler that uses `from __future__ import
annotations` together with a Pydantic BaseModel / dataclass / anything
that introspects `__module__`: at first request Pydantic tries to
resolve string-form type hints against the defining module's namespace,
can't find it by name, and raises:
PydanticUserError: TypeAdapter[...] is not fully defined;
you should define ... and all referenced types,
then call `.rebuild()` on the instance.
This is what broke the kanban dashboard's 'triage' button — POST
/api/plugins/kanban/tasks validated against CreateTaskBody (a Pydantic
model in a file using `from __future__ import annotations`) and
returned 500 on every click.
The fix, applied symmetrically to both loaders:
1. Compute module_name once.
2. Register the module in sys.modules BEFORE exec_module.
3. On exec_module failure, pop the half-initialized stub so subsequent
reloads don't pick up broken state.
GETs were unaffected because they don't build a body TypeAdapter, which
is why this only surfaced when users started POSTing.
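The three steps can be sketched as follows. This is an illustration under assumptions, not the code in either loader; the function name load_plugin_module is hypothetical:

```python
import importlib.util
import sys
from pathlib import Path


def load_plugin_module(module_name: str, path: Path):
    """Load a file as a module, registering it in sys.modules before execution."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build a module spec for {path}")
    module = importlib.util.module_from_spec(spec)
    # Register BEFORE exec_module so code that resolves string annotations
    # via sys.modules[cls.__module__] can find the defining namespace.
    sys.modules[module_name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Drop the half-initialized stub so a later reload starts clean.
        sys.modules.pop(module_name, None)
        raise
    return module
```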
Keep context-1m-2025-08-07 in OAuth requests by default so 1M-capable
subscriptions retain full context. When Anthropic rejects a request with
400 'long context beta is not yet available for this subscription',
disable the beta for the rest of the session, rebuild the client, and
retry once.
Addresses #17680 (thanks @JayGwod for the clean reproduction) without
forcing every OAuth user off the 1M context window.
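A rough sketch of the reactive path, assuming a caller-supplied send(drop_context_1m_beta=...) callable. ApiError, request_with_beta_fallback and the state dict are hypothetical names for illustration; only the three matching conditions and the drop_context_1m_beta kwarg come from this change:

```python
class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status and message."""

    def __init__(self, status_code: int, message: str):
        super().__init__(message)
        self.status_code = status_code
        self.message = message


def is_long_context_beta_rejection(status_code: int, message: str) -> bool:
    """Match only the 400 rejection, leaving the 429 tier-gate case alone."""
    text = message.lower()
    return (
        status_code == 400
        and "long context beta" in text
        and "not yet available" in text
    )


def request_with_beta_fallback(send, state: dict):
    """Call send(); on the 400 rejection, disable the beta and retry once."""
    try:
        return send(drop_context_1m_beta=state["disabled"])
    except ApiError as err:
        already_disabled = state["disabled"]
        if already_disabled or not is_long_context_beta_rejection(
            err.status_code, err.message
        ):
            raise
        state["disabled"] = True  # sticky for the rest of the session
        return send(drop_context_1m_beta=True)
```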
Changes:
- agent/error_classifier.py: new FailoverReason.oauth_long_context_beta_forbidden;
pattern matches 400 + 'long context beta' + 'not yet available'. Narrow
enough that the existing 429 tier-gate pattern keeps its own reason.
- agent/anthropic_adapter.py: _common_betas_for_base_url,
build_anthropic_client, build_anthropic_kwargs gain drop_context_1m_beta
kwarg. Default=False (1M stays). OAuth OAUTH_ONLY_BETAS unchanged.
- agent/transports/anthropic.py: build_kwargs forwards the flag.
- run_agent.py: self._oauth_1m_beta_disabled flag, retry-once guard,
recovery branch next to the image-shrink path. _rebuild_anthropic_client
honors the flag. The main build_kwargs call site threads it through for
fast-mode extra_headers.
- hermes_cli/doctor.py, hermes_cli/models.py: sibling OAuth /v1/models
probes get the same reactive retry — previously they'd falsely report
the Anthropic API as unreachable for affected subscriptions.
Tests: 2190 tests/agent/ + 94 adjacent integration tests pass. New unit
tests cover the classifier pattern (including the collision guard against
the 429 tier-gate) and the drop_context_1m_beta adapter behavior (default
keeps 1M, flag strips only 1M while preserving every other beta).
Platform plugins shipped in-repo under plugins/platforms/ should be
available out of the box — users shouldn't have to add 'irc-platform'
to plugins.enabled before they can pick IRC from the gateway setup menu.
Adds a new ``kind: platform`` plugin type that mirrors the existing
``kind: backend`` auto-load semantics:
- Bundled (shipped in the hermes-agent repo): auto-load unconditionally.
- User-installed (~/.hermes/plugins/): still opt-in via plugins.enabled
so untrusted code doesn't silently run.
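The rule can be sketched as a single predicate. The function name and argument shape are assumptions for illustration, as is the choice that bundled plugins of other kinds stay opt-in:

```python
# Kinds that auto-load when bundled, per the rule described above.
AUTO_LOAD_KINDS = {"backend", "platform"}


def should_load(name: str, kind: str, bundled: bool, enabled: set) -> bool:
    """Decide whether a discovered plugin is loaded."""
    if bundled and kind in AUTO_LOAD_KINDS:
        return True
    # Everything else, including all user-installed plugins, is opt-in.
    return name in enabled
```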
Changes:
* hermes_cli/plugins.py: add 'platform' to _VALID_PLUGIN_KINDS, document
the new kind in the PluginManifest docstring, extend the bundled auto-
load rule from 'backend only' to 'backend or platform'.
* plugins/platforms/irc/plugin.yaml: declare kind: platform.
* hermes_cli/gateway.py: remove the now-redundant
_load_bundled_platform_plugins_for_enumeration() helper and the
_enable_plugin_for_platform() helper. The setup menu's _all_platforms()
just calls discover_plugins() and reads the registry — bundled
platforms are already loaded at that point. Drops the 'needs_enable'
flag and the 'plugin disabled — select to enable' status string.
* hermes_cli/setup.py: relax the "gateway is configured" detector used
during OpenClaw migration. Switching to _platform_status() in an
earlier commit tightened the check to require an exact "configured"
match, dropping platforms whose status is "enabled, not paired",
"partially configured", "configured + E2EE", etc. Now any non-"not
configured" status counts — the user has already started setup there
and we shouldn't force the section to rerun.
* tests/hermes_cli/test_setup_irc.py: drop the TestIRCPluginDisabledFlow
class and test_configure_platform_enables_disabled_plugin_first; the
flow they tested no longer exists.
* tests/hermes_cli/test_setup_openclaw_migration.py: patch both
setup.get_env_value and gateway.get_env_value in the 4 gateway-section
tests that reach _platform_status() through the unified setup flow;
switch WHATSAPP_ENABLED to the literal "true" in the registry-parity
test so WhatsApp's value-shape validator matches.
Verified via fresh-install smoke (empty plugins.enabled, no env vars):
IRC plugin loads, Platform('irc') resolves, _all_platforms() lists IRC
with status 'not configured'. 160 targeted tests pass.
feat(gateway): refine Platform._missing_ and platform-connected dispatch
Restricts plugin-name acceptance to bundled plugin scan + registry
(no arbitrary string -> enum-pollution), pulls per-platform connectivity
checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean
_is_platform_connected method, and adds tests covering the checker map,
plugin platform interface, and IRC setup wizard.
Merge the two gateway setup paths (hermes setup gateway + hermes gateway
setup) to use a single _unified_platforms() list that merges built-in
_PLATFORMS with dynamically registered plugin entries from
platform_registry.
- Add setup_fn field to PlatformEntry for plugin setup flows
- _unified_platforms() merges built-ins with registry entries by key
- setup_gateway() now uses unified list instead of hardcoded
_GATEWAY_PLATFORMS tuple list
- gateway_setup() uses same unified list, plugin entries appear
alongside built-ins with no [plugin] suffix
- _platform_status() handles plugin platforms via registry check_fn
- Plugin platforms with a setup_fn have it called directly; those without
one get a generic env-var display fallback
IRC and other plugin platforms now appear automatically in the setup
menu when registered via platform_registry.register().
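A minimal sketch of a merge by key, assuming built-ins win on a key collision (the message does not say which side wins). The Entry dataclass is a simplified stand-in for PlatformEntry:

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Entry:
    """Simplified stand-in for PlatformEntry."""
    key: str
    label: str
    setup_fn: Optional[Callable[[], None]] = None


def unified_platforms(builtins: Iterable[Entry], registry: Iterable[Entry]) -> List[Entry]:
    """Merge built-in and registry entries, one entry per key."""
    merged = {entry.key: entry for entry in builtins}
    for entry in registry:
        # Assumption: a built-in with the same key takes precedence.
        merged.setdefault(entry.key, entry)
    return list(merged.values())
```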
feat(gateway): surface disabled platform plugins in setup and auto-enable on select
Platform plugins under plugins/platforms/* (IRC, etc.) were gated behind
plugins.enabled, so `hermes gateway setup` wouldn't list them until the
user ran `hermes plugins enable <name>` first. Now the setup menu always
surfaces them as "plugin disabled — select to enable", and picking one
adds it to plugins.enabled before running its setup flow.
Along the way, unify the two gateway setup flows so `hermes setup gateway`
and `hermes gateway setup` both read from the same platform list (built-in
_PLATFORMS + platform_registry entries), dispatch through a single
_configure_platform() helper, and share _platform_status(). Deletes the
dead bespoke wrappers in setup.py (_setup_whatsapp, _setup_weixin,
_setup_email, etc.) that duplicated logic now covered by the registry
path or _setup_standard_platform.
Also:
- PlatformEntry gains a plugin_name field so the registry knows which
plugin owns each entry (required for auto-enable).
- PluginContext.register_platform auto-stamps plugin_name from the
manifest so plugins don't have to pass it explicitly.
- PluginManager now scans plugins/platforms/* as its own category root,
one level below the bundled plugin scan.
- Fix IRC plugin discovery: rename PLUGIN.yaml → plugin.yaml (the
scanner is case-sensitive) and add the missing __init__.py that
_load_directory_module requires.
Plugin platforms now get full toolset support without any entries in
toolsets.py.
tools_config._get_platform_tools(): Falls back to 'hermes-<name>'
when the platform isn't in the static PLATFORMS dict. No more
KeyError for plugin platforms.
toolsets.resolve_toolset(): Auto-generates a toolset for plugin
platforms (hermes-<name>) containing _HERMES_CORE_TOOLS plus any
tools the plugin registered into a matching toolset name. This means
a plugin can call ctx.register_tool(toolset='irc', ...) and those
tools will be included in the hermes-irc toolset automatically.
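A rough sketch of the auto-generated toolset, assuming plugin tools are tracked as a mapping from toolset name to tool names. CORE_TOOLS and the function name are placeholders, not the real _HERMES_CORE_TOOLS or resolve_toolset():

```python
from typing import Dict, List

# Placeholder for _HERMES_CORE_TOOLS; the real list is not shown here.
CORE_TOOLS = ["terminal", "read_file"]


def resolve_plugin_toolset(name: str, plugin_tools: Dict[str, List[str]]) -> List[str]:
    """Build hermes-<platform> from core tools plus the plugin's own tools."""
    prefix = "hermes-"
    if not name.startswith(prefix):
        raise KeyError(name)
    platform = name[len(prefix):]
    tools = list(CORE_TOOLS)
    # Tools registered with toolset='<platform>' are appended, without dupes.
    for tool in plugin_tools.get(platform, []):
        if tool not in tools:
            tools.append(tool)
    return tools
```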
webhook.py: Registry-aware cross-platform delivery.
run_agent.py: Platform hints from plugin registry.
IRC adapter: Token lock + platform hint.
Removed dead token-empty-warning extension.
Updated docs.
Extends the platform plugin interface from Phase 1 to cover every
touchpoint where built-in platforms have hardcoded behavior.
- allowed_users_env / allow_all_env: per-platform auth env vars
- max_message_length: smart-chunking for send_message tool
- pii_safe: session PII redaction flag
- emoji: CLI/gateway display
- allow_update_command: /update access control
send_message tool (tools/send_message_tool.py):
- Replaced hardcoded platform_map dict with Platform() call
- Added _send_via_adapter() for plugin platforms — routes through
live gateway adapter when available
- Registry-aware max message length for smart chunking
Cron delivery (cron/scheduler.py):
- Replaced hardcoded 15-entry platform_map with Platform() call
- Plugin platforms now work as cron delivery targets
User authorization (gateway/run.py _is_user_authorized):
- Registry fallback: checks PlatformEntry.allowed_users_env and
allow_all_env when platform not in hardcoded maps
- Plugin platforms get per-platform auth support
_UPDATE_ALLOWED_PLATFORMS: checks registry allow_update_command flag
Channel directory: includes plugin platforms in session enumeration
Orphaned config warning: descriptive message when plugin platform is
in config but no plugin registered it
Gateway weakref: _gateway_runner_ref for cross-module adapter access
hermes status: shows plugin platforms with (plugin) tag
hermes gateway setup: plugin platforms appear in menu with setup hints
hermes_cli/platforms.py: get_all_platforms() merges with registry,
platform_label() falls back to registry for plugin names
- 8 new tests (extended fields, cron resolution, platforms merge)
- Updated 3 tests for new Platform() based resolution
- 2829 passed, 24 pre-existing failures, zero new failures
Adds a platform adapter plugin interface so anyone can create new gateway
platforms (IRC, Viber, Line, etc.) as drop-in plugins without modifying
core gateway code.
- PlatformEntry dataclass: name, label, adapter_factory, check_fn,
validate_config, required_env, install_hint, source
- PlatformRegistry singleton with register/unregister/create_adapter
- _create_adapter() in gateway/run.py checks registry first, falls
through to existing if/elif chain for built-in platforms
- Platform._missing_() accepts unknown string values, creating cached
pseudo-members so Platform('irc') is Platform('irc') holds true
- GatewayConfig.from_dict() now parses plugin platform names from
config.yaml without rejecting them
- get_connected_platforms() delegates to registry for unknown platforms
- PluginContext.register_platform() for plugin authors
- Mirrors the existing register_tool() / register_hook() pattern
- Full async IRC adapter using stdlib asyncio (zero external deps)
- Connects via TLS, handles PING/PONG, nick collision, NickServ auth
- Channel messages require addressing (nick: msg), DMs always dispatch
- Markdown stripping for IRC-clean output, message splitting for
512-byte line limit
- Config via config.yaml extra dict or IRC_* env vars
- Platform enum dynamic members (identity stability, case normalization)
- PlatformRegistry (register, unregister, create, validation, factory)
- GatewayConfig integration (from_dict parsing, get_connected_platforms)
- IRC adapter (init, send, protocol parsing, markdown, requirements)
No existing platform adapters were migrated — the if/elif chain is
untouched. This is Phase 1: prove the interface with a real plugin.
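One way the cached pseudo-member behavior could look, as a sketch only. It writes to the private _value2member_map_ attribute of the standard-library enum, which is an implementation detail, and the two built-in members are illustrative:

```python
from enum import Enum


class Platform(Enum):
    TELEGRAM = "telegram"
    DISCORD = "discord"

    @classmethod
    def _missing_(cls, value):
        """Accept unknown string values by creating a cached pseudo-member."""
        if not isinstance(value, str) or not value.strip():
            return None  # Enum turns this into ValueError
        key = value.strip().lower()
        existing = cls._value2member_map_.get(key)
        if existing is not None:
            return existing  # case-normalized hit on a known member
        member = object.__new__(cls)
        member._name_ = key.upper()
        member._value_ = key
        # Cache so later lookups return the same object (identity stability).
        cls._value2member_map_[key] = member
        return member
```

With this sketch, Platform('irc') is Platform('IRC') holds, while iterating over Platform still yields only the declared members.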
Reloading MCP servers rebuilds the tool set for the active session, which
invalidates the provider prompt cache (tool schemas are baked into the
system prompt). The next message re-sends full input tokens — can be
expensive on long-context or high-reasoning models.
To surface that cost, /reload-mcp now routes through a new slash-confirm
primitive with three options: Approve Once / Always Approve / Cancel.
'Always Approve' persists approvals.mcp_reload_confirm: false so future
reloads run silently.
Coverage:
* Classic CLI (cli.py) — interactive numbered prompt.
* TUI (tui_gateway + Ink ops.ts) — text warning on first call; `now` /
`always` args skip the gate; `always` also persists the opt-out.
* Messenger gateway — button UI on Telegram (inline keyboard), Discord
(discord.ui.View), Slack (Block Kit actions); text fallback on every
other platform via /approve /always /cancel replies intercepted in
gateway/run.py _handle_message.
* Config key: approvals.mcp_reload_confirm (default true).
* Auto-reload paths (CLI file watcher, TUI config-sync mtime poll) pass
confirm=true so they do NOT prompt.
Implementation:
* tools/slash_confirm.py — module-level pending-state store used by all
adapters and by the CLI prompt. Thread-safe register/resolve/clear.
* gateway/platforms/base.py — send_slash_confirm hook (default 'Not
supported' → text fallback).
* gateway/run.py — _request_slash_confirm helper + text intercept in
_handle_message (yields to in-progress tool-exec approvals so
dangerous-command /approve still unblocks the tool thread first).
Tests:
* tests/tools/test_slash_confirm.py — primitive lifecycle + async
resolution + double-click atomicity (16 tests).
* tests/hermes_cli/test_mcp_reload_confirm_gate.py — default-config
shape + deep-merge preserves user opt-out (5 tests).
Targeted runs (hermetic): 89 passed (slash-confirm, config gate,
existing agent cache, existing telegram approval buttons).
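A minimal sketch of a pending-state store with atomic resolution, which is what makes a double click harmless. The function signatures and the session_key parameter are assumptions, not the tools/slash_confirm.py API:

```python
import threading
from typing import Callable, Dict

_lock = threading.Lock()
_pending: Dict[str, Callable[[str], None]] = {}


def register(session_key: str, on_choice: Callable[[str], None]) -> None:
    """Record a pending confirmation for a session."""
    with _lock:
        _pending[session_key] = on_choice


def resolve(session_key: str, choice: str) -> bool:
    """Run the handler at most once; False when nothing is pending."""
    with _lock:
        # pop() under the lock means only one caller can win.
        handler = _pending.pop(session_key, None)
    if handler is None:
        return False
    handler(choice)  # invoked outside the lock
    return True


def clear(session_key: str) -> None:
    """Forget a pending confirmation without running it."""
    with _lock:
        _pending.pop(session_key, None)
```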
Adds a public reload path for the in-process skill caches so newly
installed (or removed) skills become visible mid-session without a
gateway restart. Mirrors the shape of /reload-mcp.
Three surfaces:
* /reload-skills slash command — CLI (cli.py) and gateway (gateway/run.py),
with /reload_skills alias for Telegram autocomplete and an explicit
Discord registration.
* skills_reload agent tool (tools/skills_tool.py) — lets agents/subagents
pick up freshly-installed skills via tool call.
* agent.skill_commands.reload_skills() — shared helper that clears
_skill_commands, _SKILLS_PROMPT_CACHE (in-process LRU), and the
on-disk .skills_prompt_snapshot.json, then returns an added/removed
diff plus the new total count.
Tested:
* tests/agent/test_skill_commands_reload.py (9 cases)
* tests/cli/test_cli_reload_skills.py (3 cases)
* tests/gateway/test_reload_skills_command.py (4 cases)
Use case: NemoClaw / OpenShell-style sandboxed orchestrators that drop
skills into ~/.hermes/skills mid-session, plus agentic flows where the
agent itself installs a skill via the shell tool and needs it bound
without a gateway restart. The Python helper
clear_skills_system_prompt_cache(clear_snapshot=True) already exists
internally — this PR just exposes it via slash command and tool.
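The added/removed diff can be sketched with set arithmetic over skill names captured before and after the caches are cleared. The function name and the returned dict keys are illustrative:

```python
from typing import Dict, Iterable


def diff_skills(before: Iterable[str], after: Iterable[str]) -> Dict[str, object]:
    """Summarize which skill names appeared or disappeared across a reload."""
    old, new = set(before), set(after)
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
        "total": len(new),
    }
```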
- SQL: add `model != ''` to both queries in /api/analytics/models so
sessions with empty-string model (pre-existing data integrity,
confirmed in production DB: ~107 sessions) no longer render as
blank-header cards.
- ModelsPage: drop the arbitrary slashIdx < 20 length gate in
shortModelName / modelProvider. The gate was fragile for longer
vendor prefixes (e.g. `deepseek-ai/...`). Strip on the first /
unconditionally. Rename modelProvider -> modelVendor to avoid
confusion with the billing provider column.
- scripts/release.py: add AUTHOR_MAP entry for yatesjalex.
- New /models page in left nav (after Analytics)
- New /api/analytics/models endpoint with per-model token/cost/session
breakdown, cache read/reasoning tokens, tool calls, avg tokens/session,
and capabilities from models.dev (vision/tools/reasoning/context window)
- Model cards with stacked token distribution bar, capability badges,
provider badges, cost info, and relative time
- Summary stats bar (models used, total tokens, est. cost, sessions)
- Period selector (7d/30d/90d) with refresh
- i18n support (en + zh)
Pull the top-level + chat parser construction out of main() into
hermes_cli/_parser.py so relaunch.py can introspect parser._actions to
discover which flags exist and whether they take values, instead of
maintaining a parallel hand-rolled (flag, takes_value) tuple list.
- _parser.py: build_top_level_parser() returns (parser, subparsers,
chat_parser); side-effect-free import.
- main.py: ~290 lines of inline parser construction collapsed to a
helper call. Other subparsers stay inline (dispatch is bound to
module-level cmd_* functions).
- _parser._inherited_flag(parser, ...): wraps parser.add_argument and
sets action.inherit_on_relaunch = True. Used in place of
parser.add_argument for the 25 flags (top-level + chat) that need to
carry over.
- _parser.PRE_ARGPARSE_INHERITED_FLAGS: holds --profile/-p, which
isn't on argparse (consumed earlier by main._apply_profile_override).
- relaunch.py: drops _CRITICAL_DESTS and _PRE_ARGPARSE_FLAGS; the table
builder now filters by getattr(action, 'inherit_on_relaunch', False).
- test_ignore_user_config_flags.py: brittle inspect.getsource grep
replaced with proper parser introspection.
- test_relaunch.py: introspection sanity tests added.
Salvaged from PR #17549; added top-level -t/--toolsets flag to
_parser.py so #17623 (fix(tui): honor launch toolsets) behavior is
preserved on current main.
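A small sketch of the marker-plus-introspection approach using argparse. The helper names here are simplified and the flags are examples; parser._actions is a private argparse attribute, as the message itself notes:

```python
import argparse
from typing import Dict


def inherited_flag(parser: argparse.ArgumentParser, *args, **kwargs) -> argparse.Action:
    """Add an argument and mark it as one that should survive a relaunch."""
    action = parser.add_argument(*args, **kwargs)
    action.inherit_on_relaunch = True
    return action


def inherited_flag_table(parser: argparse.ArgumentParser) -> Dict[str, bool]:
    """Map each inherited option string to whether it consumes a value."""
    table = {}
    for action in parser._actions:
        if not getattr(action, "inherit_on_relaunch", False):
            continue
        takes_value = action.nargs != 0  # store_true/store_false use nargs=0
        for option in action.option_strings:
            table[option] = takes_value
    return table
```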
Co-authored-by: ethernet <arilotter@gmail.com>
Extract all os.execvp('hermes', ...) calls into a utility so flags like
--tui, --dev, --profile, --model, --provider, et al. survive session
resume and post-setup relaunch.
- resolve_hermes_bin: prefers sys.argv[0] when callable, then PATH,
then falls back to '${sys.executable} -m hermes_cli.main' (fixes nix
run relaunches)
- build_relaunch_argv: allowlists critical flags so they carry over
- cmd_sessions browse now calls relaunch(['--resume', <id>])
- _apply_profile_override skips redundant work when HERMES_HOME is
already set (child inherits parent profile)
- setup.py replaces _resolve_hermes_chat_argv with relaunch_chat()
- added comprehensive tests for flag extraction and binary resolution
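The resolution order for the binary can be sketched like this. The signature is hypothetical and "callable" is interpreted here as "an existing executable file", which is an assumption:

```python
import os
import shutil
import sys
from typing import Callable, List, Optional


def resolve_bin(
    argv0: str,
    which: Callable[[str], Optional[str]] = shutil.which,
    executable: str = sys.executable,
) -> List[str]:
    """Return the argv prefix used to relaunch hermes."""
    if argv0 and os.path.isfile(argv0) and os.access(argv0, os.X_OK):
        return [argv0]
    found = which("hermes")
    if found:
        return [found]
    # Last resort: run the module with the current interpreter.
    return [executable, "-m", "hermes_cli.main"]
```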
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI Tests workflow has been red on main for 40+ consecutive runs. This
commit recovers every failure visible in run 25130722163 (most recent
completed run prior to this PR).
Root causes, by group:
Test-mock drift after product landed (fix: update mocks)
- test_mcp_structured_content / test_mcp_dynamic_discovery (6 tests):
product added _rpc_lock (#02ae15222) and _schedule_tools_refresh
(#1350d12b0) without updating sibling test files. Install a real
asyncio.Lock inside the fake run-loop and patch at _schedule_tools_refresh.
- test_session.py: normalize_whatsapp_identifier was renamed upstream to
canonical_whatsapp_identifier; keep a local alias so the legacy tests
keep working.
- test_run_progress_topics Slack DM test: PR #8006 made Slack default
tool_progress=off; explicitly set it to 'all' in the test fixture so
the progress-callback path still runs. Also read tool_progress_callback
at call time rather than freezing it in FakeAgent.__init__ — production
assigns it AFTER construction.
- test_tui_gateway_server session-create/close race: session.create now
defers _start_agent_build behind a 50ms timer — wait for the build
thread to enter _make_agent before closing, otherwise the orphan-
cleanup path never runs.
- test_protocol session.resume: product get_messages_as_conversation now
takes include_ancestors kwarg; accept **_kwargs in the test stub.
- test_copilot_acp_client redaction: redactor is OFF by default (snapshots
HERMES_REDACT_SECRETS at import); patch agent.redact._REDACT_ENABLED=True
for the duration of the test.
- test_minimax_provider: after #17171, dots in non-Anthropic model names
stay dots even with preserve_dots=False. Assert the new invariant
rather than the old 'broken for MiniMax' behavior.
- test_update_autostash: updater now scans `ps -A` for dashboard PIDs;
the test's catch-all subprocess.run stub needed stdout/stderr fields.
- test_accretion_caps: read_timestamps dict is populated lazily when
os.path.getmtime succeeds. Use .get("read_timestamps", {}) to tolerate
CI filesystems where the stat races file creation.
Change-detector tests (fix: rewrite as structural invariants)
- test_credential_sources_registry_has_expected_steps: was a frozen set
comparison that broke when minimax-oauth was added. Rewrite as an
invariant check (every step has description, no dupes, core steps
present) per AGENTS.md 'don't write change-detector tests'.
xdist ordering / test pollution (fix: reset state, use module-local patches)
- test_setup vercel: sibling test saved VERCEL_PROJECT_ID='project' to
os.environ via save_env_value() and never cleared it. monkeypatch.delenv
the VERCEL_* vars in the link-file test.
- test_clipboard TestIsWsl: GitHub Actions is on Azure VMs whose real
/proc/version often contains 'microsoft'. Patching builtins.open with
mock_open didn't reliably intercept hermes_constants.is_wsl's call in
xdist workers that had already cached _wsl_detected=True from an
earlier test. Patch hermes_constants.open directly and add
teardown_method to reset the cache after each test.
Pytest-asyncio cancellation hangs (fix: bound product await with timeout)
- test_session_split_brain_11016 (3 params) + test_gateway_shutdown
cancel-inflight: under pytest-asyncio 1.3.0, 'await task' and
'asyncio.gather(cancelled_tasks)' can stall for 30s when the cancelled
task's finally block awaits typing-task cleanup. Bound both with
asyncio.wait_for(..., timeout=5.0) and asyncio.shield — the stragglers
are released from adapter tracking and allowed to finish unwinding in
the background. This is also a legitimate hardening: a wedged finally
shouldn't stall the caller's dispatch or a gateway shutdown.
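A minimal sketch of bounding the wait on a cancelled task, assuming Python 3.8 or later. The helper name is hypothetical and this is not the gateway code:

```python
import asyncio


async def cancel_with_deadline(task: asyncio.Task, timeout: float = 5.0) -> bool:
    """Cancel a task and wait briefly; True if it finished unwinding in time."""
    task.cancel()
    try:
        # shield() keeps the timeout from cancelling the task a second time;
        # on timeout only the outer wait is abandoned.
        await asyncio.wait_for(asyncio.shield(task), timeout=timeout)
    except asyncio.TimeoutError:
        return False  # straggler keeps unwinding in the background
    except asyncio.CancelledError:
        if not task.done():
            raise  # the caller itself was cancelled
    except Exception:
        pass  # the task failed during cleanup; it is still finished
    return True
```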
Orphan UI config (fix: merge tiny tab into messaging category)
- test_web_server test_no_single_field_categories: the telegram.reactions
config field lived in its own 'telegram' schema category with no
siblings. Fold it under 'discord' via _CATEGORY_MERGE so the dashboard
doesn't render an orphan single-field tab.
Local verification: 38/38 originally-failing tests pass; 4044/4044
gateway tests pass; 684/684 targeted subset (all 16 touched test files)
passes.
check_for_updates() looked at __file__.parent.parent for a .git dir to
diff against origin/main. A nix-built hermes lives in /nix/store with
no .git there, so the check fell through to whatever editable-install
dev checkout last populated ~/.hermes/.update_check, producing stale
"X commits behind" warnings right after a fresh `nix run --refresh`.
Embed the locked flake rev into the wrapper as HERMES_REVISION (only on
clean builds; dirty refs don't represent any upstream commit). When set,
banner.py compares it to upstream main via `git ls-remote` instead of
inspecting a local checkout, and the cache key includes the rev so
nix updates invalidate immediately. Without local history we can't
count commits, so the message is a plain "update available" with no
suggested command — nix users may install via `nix run`, profile,
system flake, or home-manager, and we don't know which.
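The comparison can be sketched as a pure function over `git ls-remote` output, which lists one "<sha><TAB><ref>" pair per line. The function name and the None return for "ref not found" are assumptions:

```python
from typing import Optional


def update_available(
    local_rev: str, ls_remote_output: str, ref: str = "refs/heads/main"
) -> Optional[bool]:
    """True if upstream differs from local_rev, None if the ref is absent."""
    for line in ls_remote_output.splitlines():
        sha, _, name = line.partition("\t")
        if name.strip() == ref:
            return sha.strip() != local_rev
    return None
```

Without local history the result is only "same" or "different", which matches the plain "update available" message.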
Also bump web/package-lock.json npmDepsHash via `nix run
.#fix-lockfiles`.
* fix(tui): honor launch toolsets
Carry chat --toolsets through the TUI launcher so TUI sessions use the same per-session tool scope as the classic CLI.
* fix(tui): parse top-level toolsets flag
Allow top-level hermes --tui --toolsets to reach the implicit chat session, matching chat subcommand behavior.
* fix(tui): validate launch toolsets
Filter invalid HERMES_TUI_TOOLSETS entries and fall back to configured CLI toolsets when the override contains no valid toolsets.
* fix(tui): avoid config load for builtin toolsets
Honor built-in HERMES_TUI_TOOLSETS values before loading config and treat all/* as the all-toolsets sentinel.
* fix(cli): honor toolsets in oneshot mode
Forward top-level --toolsets into oneshot agent construction so the flag is not silently ignored outside the TUI path.
* fix(cli): validate oneshot toolsets
Reject invalid-only oneshot toolset overrides before output redirection and clarify TUI fallback warnings.
* fix(cli): preserve all-toolsets sentinel
Map explicit all/* oneshot toolset overrides to the all-toolsets sentinel and replace locals() checks in TUI toolset loading.
* fix(cli): warn on extra all-toolset entries
Warn when all/* toolset overrides include additional ignored entries so typos are still visible.
* fix(tui): honor plugin toolset overrides
Discover plugin toolsets before rejecting unresolved explicit toolset overrides and read raw config for MCP name validation.
* fix(tui): reuse toolset argument normalizer
Share top-level TUI toolset argument parsing with the oneshot path to avoid duplicate normalization logic.
* fix(cli): reject disabled mcp toolsets
Validate explicit toolset overrides against enabled MCP servers only and clarify top-level toolset flag help.
* fix(cli): distinguish disabled mcp from unknown toolsets
Report disabled MCP servers separately from unknown toolset entries and stub plugin discovery in invalid-name tests for determinism.
shutil.copytree from default ~/.hermes duplicated ~/.hermes/profiles into
the new profile, causing nested profiles/.../profiles/... and huge disk use.
Match export behavior (_DEFAULT_EXPORT_EXCLUDE_ROOT) by ignoring the sibling
profiles tree at the source root.
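A sketch of a root-only exclusion with shutil.copytree's ignore callback. The function name and the exclude_root default are illustrative stand-ins for the real helper and _DEFAULT_EXPORT_EXCLUDE_ROOT:

```python
import os
import shutil
from typing import Iterable, List


def copy_profile_root(src: str, dst: str, exclude_root: Iterable[str] = ("profiles",)) -> None:
    """Copy src to dst, skipping excluded names only at the top level."""
    root = os.path.abspath(src)
    excluded = set(exclude_root)

    def _ignore(directory: str, names: List[str]) -> List[str]:
        # Nested directories that happen to share a name are still copied.
        if os.path.abspath(directory) == root:
            return [name for name in names if name in excluded]
        return []

    shutil.copytree(src, dst, ignore=_ignore)
```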
Made-with: Cursor
Close integration gaps discovered by auditing qwen-oauth's file coverage.
These are surfaces the original salvage missed — they all existed on
main and were added in the 747 commits since PR #15203 was opened.
Coverage added:
- agent/credential_pool.py: seed pool from auth.json providers.minimax-oauth
so `hermes auth list` reflects logged-in state and
`hermes auth remove minimax-oauth <N>` works through the standard flow.
- agent/credential_sources.py: register RemovalStep for minimax-oauth
with suppression-aware `_clear_auth_store_provider`.
- agent/models_dev.py: PROVIDER_TO_MODELS_DEV mapping (-> 'minimax' family).
- hermes_cli/providers.py: HermesOverlay entry (anthropic_messages transport,
oauth_external auth_type, api.minimax.io/anthropic base).
- hermes_cli/model_normalize.py: add to _MATCHING_PREFIX_STRIP_PROVIDERS so
`minimax-oauth/MiniMax-M2.7` in config.yaml gets correctly repaired.
- hermes_cli/status.py: render MiniMax OAuth block in `hermes doctor`
(logged-in / region / expires_at / error).
- hermes_cli/web_server.py: register in OAUTH_PROVIDER_REGISTRY + dispatch
branch in _resolve_provider_status so the dashboard auth page shows it.
- website/docs/integrations/providers.md: full 'MiniMax (OAuth)' section.
- website/docs/reference/cli-commands.md: --provider enum.
- website/docs/user-guide/features/fallback-providers.md: fallback table row.
- scripts/release.py AUTHOR_MAP: amanning3390 mapping (CI gate).
Wire MiniMax-M2.7 and MiniMax-M2.7-highspeed into the model catalog,
CLI model picker, and agent auxiliary/metadata subsystems.
Changes:
- hermes_cli/models.py:
- Add 'minimax-oauth' to _PROVIDER_MODELS with MiniMax-M2.7 and
MiniMax-M2.7-highspeed
- Add ProviderEntry('minimax-oauth', 'MiniMax (OAuth)', ...) to
CANONICAL_PROVIDERS near existing minimax entries
- Add aliases: minimax-portal, minimax-global, minimax_oauth in
_PROVIDER_ALIASES
- hermes_cli/main.py:
- Add 'minimax-oauth' to provider_labels dict
- Insert 'minimax-oauth' into providers list in
select_provider_and_model() near the other minimax entries
- Add 'minimax-oauth' to --provider argparse choices
- Add _model_flow_minimax_oauth() function: ensures login via
_login_minimax_oauth(), resolves runtime credentials, prompts for
model selection, saves model choice and config
- Add dispatch elif branch for selected_provider == 'minimax-oauth'
- agent/auxiliary_client.py:
- Add 'minimax-oauth': 'MiniMax-M2.7-highspeed' to
_API_KEY_PROVIDER_AUX_MODELS
- Add 'minimax-oauth' to _ANTHROPIC_COMPAT_PROVIDERS set
- agent/model_metadata.py:
- Add 'minimax-oauth' to _PROVIDER_PREFIXES frozenset
- MiniMax-M2.7 context length (200_000) already covered by the
existing 'minimax' substring match in DEFAULT_CONTEXT_LENGTHS
When a user authenticates a built-in provider via env var (e.g. DASHSCOPE_API_KEY
triggers the built-in 'alibaba' row) AND defines a custom_providers entry
pointing at the same endpoint, the picker previously emitted two rows for one
endpoint. The built-in row already carries the canonical slug, curated model
list, and correct auth wiring, so the shadow custom entry is redundant.
Adds a _builtin_endpoints set populated as sections 1/2/2b emit rows. Each
entry is the provider's effective base URL (env override via base_url_env_var
wins over the static inference_base_url, so DASHSCOPE_BASE_URL-overridden
endpoints dedup correctly). Section 4 skips any grouped custom entry whose
base_url matches.
Intentionally does NOT repurpose model_catalog.enabled as a 'hide built-ins'
flag. That config controls the remote curated-manifest fetch (documented on
the model-catalog reference page) and overloading it would silently change
behavior for users who disable it for network/privacy reasons.
Three new tests:
- shadow dedup fires when endpoint matches static inference_base_url
- dedup does NOT hide custom entries on genuinely distinct endpoints
- dedup honors the base_url_env_var override path
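A rough sketch of the dedup, assuming custom entries are dicts with a base_url key. The helper names, the example URLs, and the trailing-slash normalization are illustrative assumptions:

```python
import os
from typing import Dict, List, Mapping, Optional, Set


def _normalize(url: str) -> str:
    # Assumption: endpoints are compared ignoring a trailing slash.
    return url.strip().rstrip("/")


def effective_base_url(
    static_url: str,
    env_var: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """The env override wins over the static inference base URL."""
    env = os.environ if env is None else env
    override = env.get(env_var, "") if env_var else ""
    return _normalize(override or static_url)


def drop_shadowed(custom: List[Dict[str, str]], builtin_endpoints: Set[str]) -> List[Dict[str, str]]:
    """Skip custom entries whose endpoint a built-in row already covers."""
    return [e for e in custom if _normalize(e["base_url"]) not in builtin_endpoints]
```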
Adds Vercel Sandbox as a supported Hermes terminal backend alongside
existing providers (Local, Docker, Modal, SSH, Daytona, Singularity).
Uses the Vercel Python SDK to create/manage cloud microVMs, supports
snapshot-based filesystem persistence keyed by task_id, and integrates
with the existing BaseEnvironment shell contract and FileSyncManager
for credential/skill syncing.
Based on #17127 by @scotttrinh, cherry-picked onto current main.
Pass encoding='utf-8', errors='ignore' and guard against result.stdout
being None so _scan_gateway_pids() no longer crashes with
UnicodeDecodeError + AttributeError on Windows systems whose default
code page is not UTF-8 (e.g. cp936 on zh-CN). The parser only matches
the ASCII prefixes CommandLine= and ProcessId=, so dropping undecodable
bytes is safe.
Closes #17049.
Two fix-ups for #17123:
1. Reword the inline comment in `_warn_stale_dashboard_processes` to
accurately describe the failure mode (locale-dependent decoder, not a
"default UTF-8 decoder") and identify `errors="ignore"` as the
load-bearing protection. Per Copilot's review.
2. Switch `TestWindowsWmicEncoding` from `patch("hermes_cli.main.sys")`
to `monkeypatch.setattr(sys, "platform", "win32")` — the codebase's
canonical pattern (e.g. `tests/hermes_cli/test_auth_ssl_macos.py`).
The MagicMock-replacement approach passed locally on Python 3.12 but
the platform-equality check failed under CI's xdist+Python 3.11,
leaving both new tests red despite the fix being present.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`hermes update` calls `_warn_stale_dashboard_processes()` to warn about
dashboard processes still running the pre-update Python backend. On
Windows, that scan shells out to `wmic process get ProcessId,CommandLine
/FORMAT:LIST` with `text=True` and no explicit encoding.
`wmic` emits text in the system code page (e.g. cp936 on zh-CN locales),
not UTF-8. Without an explicit `encoding=`, Python picks a
locale-dependent decoder that can crash the subprocess reader thread with
`UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 ...`. In
Python 3.11 that crash is silently absorbed: `subprocess.run()` returns
a `CompletedProcess` with `result.stdout = None`, the next line calls
`result.stdout.split("\n")`, and `hermes update` aborts with the
exact `AttributeError: 'NoneType' object has no attribute 'split'`
trace reported in #17049.
Fix: pass `encoding="utf-8", errors="ignore"` so undecodable bytes
cannot take down the reader thread (the parsing only matches the ASCII
prefixes `CommandLine=` and `ProcessId=`, so dropping non-UTF-8 bytes
is safe), and short-circuit when `result.stdout is None` as a defensive
guard for environments where the reader thread still fails for other
reasons.
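A minimal sketch of the pattern, under stated assumptions: `parse_wmic_list` and `scan_processes` are hypothetical names, the timeout is illustrative, and the parser assumes `CommandLine=` precedes `ProcessId=` within each record. Only the `wmic` arguments, the `encoding`/`errors` kwargs, and the `None` guard come from the description above.

```python
import subprocess
from typing import List, Optional, Tuple


def parse_wmic_list(stdout: Optional[str]) -> List[Tuple[int, str]]:
    """Parse `wmic ... /FORMAT:LIST` output into (pid, command line) pairs."""
    if stdout is None:  # defensive guard: the reader thread may have failed
        return []
    results = []
    cmdline = ""
    for raw in stdout.split("\n"):
        line = raw.strip()
        # Only ASCII prefixes are matched, so dropped bytes cannot matter here.
        if line.startswith("CommandLine="):
            cmdline = line[len("CommandLine="):]
        elif line.startswith("ProcessId="):
            pid_text = line[len("ProcessId="):]
            if pid_text.isdigit():
                results.append((int(pid_text), cmdline))
            cmdline = ""
    return results


def scan_processes() -> List[Tuple[int, str]]:
    """Run wmic (Windows only) with an explicit, error-tolerant decoder."""
    result = subprocess.run(
        ["wmic", "process", "get", "ProcessId,CommandLine", "/FORMAT:LIST"],
        capture_output=True,
        encoding="utf-8",
        errors="ignore",  # undecodable code-page bytes are dropped, not fatal
        timeout=10,       # illustrative value
    )
    return parse_wmic_list(result.stdout)
```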
This is the same root cause as #17074 (which patches
`hermes_cli/gateway._scan_gateway_pids` for the `hermes setup` path).
That PR does not touch `_warn_stale_dashboard_processes`, so
`hermes update` remains broken on the same locales until this lands.
Regression test in `tests/hermes_cli/test_update_stale_dashboard.py`:
- `test_wmic_invoked_with_utf8_ignore_errors` asserts the explicit
encoding/errors kwargs reach `subprocess.run`.
- `test_wmic_returns_none_stdout_does_not_crash` simulates the
reader-thread-crashed `result.stdout=None` aftermath and asserts the
function returns silently instead of raising AttributeError.
Both new tests fail against clean origin/main (7d4648461) reproducing
the original AttributeError; both pass with this patch. The remaining
3 failures in `tests/hermes_cli/test_cmd_update.py` and
`test_update_autostash.py` are pre-existing baselines on origin/main —
they reproduce identically without this change and are unrelated to
the wmic scan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
QR-login connects an iLink bot identity (...@im.bot), not a scriptable
personal WeChat account. iLink typically does not deliver ordinary WeChat
group events to these bots, so WEIXIN_GROUP_POLICY / WEIXIN_GROUP_ALLOWED_USERS
often have no effect regardless of value.
- Setup wizard: print iLink-bot caveat before the group-policy prompt; relabel
the allowlist input as 'group chat IDs (not member user IDs)'; note that
'open' / 'allowlist' only take effect if iLink delivers group events.
- Adapter: log a WARNING at connect() when WEIXIN_GROUP_POLICY is non-disabled
so the limitation is surfaced in gateway logs, not just docs.
- Docs: add a top-of-page warning callout to weixin.md explaining the iLink
bot identity, narrow the 'DM and group messaging' feature line to DM-only
with a group caveat, tighten the Group Policy section and troubleshooting
row, and clarify WEIXIN_GROUP_ALLOWED_USERS as group IDs (not user IDs)
in weixin.md and environment-variables.md.
Closes #17094
Completes the cfg_get migration started in PR #17304. Covers the
remaining hermes_cli/ and plugins/ config-access sites that the first
PR intentionally left opportunistic.
Migrated (33 sites across 14 files):
hermes_cli/setup.py 13 sites (terminal.*, agent.*, display.*, compression.*, tts.*)
hermes_cli/tools_config.py 7 sites (tts.*, browser.*, web.*, platform_toolsets.*)
hermes_cli/plugins_cmd.py 3 sites (plugins.*, memory.*, context.*)
plugins/memory/honcho/cli.py 3 sites (hosts.*)
hermes_cli/web_server.py 1 site (dashboard.*)
hermes_cli/skills_config.py 1 site (platform_disabled)
hermes_cli/plugins.py 1 site (plugins.disabled)
hermes_cli/status.py 1 site (terminal.backend)
hermes_cli/mcp_config.py 1 site (mcp_servers.*)
hermes_cli/webhook.py 1 site (platforms.webhook)
plugins/memory/__init__.py 1 site (memory.provider)
plugins/memory/hindsight/ 1 site (banks.hermes)
plugins/memory/holographic/ 1 site (plugins.hermes-memory-store)
run_agent.py 1 site (auxiliary.compression)
The helper supports non-literal keys too, so e.g.
cfg.get('hosts', {}).get(HOST, {})
becomes
cfg_get(cfg, 'hosts', HOST, default={})
Migration bugs caught and fixed during this PR:
1. An AST-based batch rewrite naïvely captured the first word token in
a chain, which corrupted 'self._config.get(...).get(...)' into
'self.cfg_get(_config, ...)' (dropping 'self.', creating a broken
method call). The plugins/memory/hindsight test suite caught it.
Fixed manually to 'cfg_get(self._config, ...)'.
2. Import-extension heuristic rewrote multi-line parenthesized imports
('from X import (\n A,\n B,\n)') as
'from X import cfg_get, (' — syntactically broken. Fixed by inserting
cfg_get as the first name inside the parentheses.
Combined with PR #17304, the cfg_get migration now covers:
PR #17304 (first batch): 20 sites in tools/ + gateway/
PR #17317 (this one): 33 sites in hermes_cli/ + plugins/ + run_agent.py
Total: 53 sites migrated. Remaining ~8 sites are either:
- Function-call chains (e.g. '_load_stt_config().get(...).get(...)')
that would need double-evaluation or a local binding to migrate
cleanly — intentionally deferred.
- JSON response-navigation (e.g. 'response_data.get("data", {}).get("web")')
which is unrelated to config access and shouldn't use cfg_get.
Verified:
- 412/412 tests/plugins/ pass (including the hindsight test that caught
the self.X rewrite bug before commit)
- 3181/3189 tests/hermes_cli/ pass (8 pre-existing failures on main,
verified by git-stash comparison)
- Live 'hermes status' and 'hermes config' render correctly (exercise
the migrated terminal.backend, tts.provider, browser.cloud_provider,
compression.threshold, display.tool_progress sites)
- Live 'hermes chat': 1 turn + /quit, zero errors in 11-line log window
No semantic changes — cfg_get was already proven to be a 1:1 match for
the original .get("X",{}).get("Y",default) pattern in PR #17304.
Every curator pass now emits a dated report directory under
`~/.hermes/logs/curator/{YYYYMMDD-HHMMSS}/` with two files:
- `run.json` — machine-readable full record (before/after snapshot,
state transitions, all tool calls, model/provider, timing, full LLM
final response untruncated, error if any)
- `REPORT.md` — human-readable markdown: model + duration header,
auto-transition counts, LLM consolidation stats, archived-this-run
list, new-skills-this-run list, state transitions, the full LLM
final summary, and a recovery footer pointing at the archive + the
`hermes curator restore` command
Reports live under `logs/curator/`, not inside `skills/` — they're
operational telemetry, not user-authored skill data, and belong
alongside `agent.log` / `gateway.log`.
Internals:
- `_run_llm_review()` now returns a dict (final, summary, model,
provider, tool_calls, error) instead of a bare truncated string so
the reporter has full fidelity
- Report writer is fully best-effort — any failure logs at DEBUG and
never breaks the curator itself. Same-second rerun gets a numeric
suffix so reports can't clobber each other
- Report path stamped into `.curator_state` as `last_report_path`
- `hermes curator status` surfaces a "last report:" line so users
can immediately open the latest run
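The same-second collision handling could look roughly like the sketch below. The function name and the exact suffix format (`-2`, `-3`, ...) are assumptions for illustration; the commit only states that a rerun gets a numeric suffix.

```python
from datetime import datetime
from pathlib import Path


def make_report_dir(root: Path, now: datetime) -> Path:
    """Create a unique, timestamped report directory under root."""
    stamp = now.strftime("%Y%m%d-%H%M%S")
    candidate = root / stamp
    suffix = 1
    while True:
        try:
            # exist_ok=False makes creation itself the collision check.
            candidate.mkdir(parents=True, exist_ok=False)
            return candidate
        except FileExistsError:
            suffix += 1
            candidate = root / f"{stamp}-{suffix}"  # hypothetical suffix style
```

Letting `mkdir` fail on collision avoids a separate exists-then-create check, so two runs cannot claim the same directory.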
Tests (all green):
- 7 new tests in tests/agent/test_curator_reports.py covering: report
location (logs not skills), both files written, run.json shape and
diff accuracy, markdown structure, error path still writes, state
transitions captured, same-second runs get unique dirs
- Existing test_run_review_synchronous_invokes_llm_stub updated to
stub the new dict-returning _run_llm_review signature
Live E2E: ran a synchronous pass against a 1-skill test collection
with a stubbed LLM; report written correctly, state stamped with
last_report_path, markdown human-readable, run.json machine-parseable.
The "cfg.get('X', {}).get('Y', default)" pattern appears 50+ times
across tools/, gateway/, and plugins/. Each call site manually handles
the same three gotchas:
1. Missing intermediate key → empty dict → chain works
2. Non-dict value at intermediate position → AttributeError
(uncaught in most sites, so a misconfigured YAML crashes the tool)
3. cfg is None → AttributeError
Introduces cfg_get(cfg, *keys, default=None) in hermes_cli/config.py
as the canonical helper. Handles all three uniformly, returns default
only when the final key is *absent* (matches dict.get semantics —
explicit None values are preserved, falsy values like 0 / False / ''
are preserved).
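For illustration, a helper with these semantics might be written as below. This is a sketch consistent with the description, not the code in hermes_cli/config.py; the `_MISSING` sentinel is an assumption.

```python
_MISSING = object()  # sentinel so an explicit None value is not mistaken for absence


def cfg_get(cfg, *keys, default=None):
    """Walk nested dicts; return default if any step is absent or not a dict."""
    node = cfg
    for key in keys:
        if not isinstance(node, dict):  # covers cfg=None and non-dict intermediates
            return default
        node = node.get(key, _MISSING)
        if node is _MISSING:
            return default
    return node


# Usage: falsy and None values survive, broken shapes fall back to the default.
print(cfg_get({"tts": {"speed": 0}}, "tts", "speed", default=1))
print(cfg_get({"tts": "oops"}, "tts", "speed", default=1))
```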
Named cfg_get rather than cfg_path to avoid shadowing the existing
'cfg_path = _hermes_home / "config.yaml"' local variable that appears
in gateway/run.py, cron/scheduler.py, hermes_cli/main.py, etc.
Migrated 20 call sites as the first-batch proof-of-value:
gateway/run.py 10 sites (agent/display subtrees)
tools/browser_tool.py 3 sites
tools/vision_tools.py 2 sites
tools/browser_camofox.py 1 site
tools/approval.py 1 site
tools/skills_tool.py 1 site
tools/skill_manager_tool.py 1 site
tools/credential_files.py 1 site
tools/env_passthrough.py 1 site
The remaining ~30 sites across plugins/ and smaller tool files can be
migrated opportunistically — the helper is now available and the
pattern is established.
Fixed a latent bug along the way: tools/vision_tools.py had its
cfg_get usage at line 560 inside a function that locally re-imports
'from hermes_cli.config import load_config', but the AST-based
migration script wrote the top-level cfg_get import to a different
function scope, leaving line 560's cfg_get as a NameError silently
swallowed by the surrounding try/except. Test
test_vision_uses_configured_temperature_and_timeout caught it. Fixed
by including cfg_get in the function-local import.
Verified:
- 7880/7893 tests/tools/ + tests/gateway/ + tests/hermes_cli/test_config
tests pass; all 13 failures pre-existing on main (MCP, delegate,
session_split_brain — verified earlier in the sweep).
- All 20 migrated sites AST-verified to have cfg_get in scope (either
module-level or function-local).
- Live 'hermes chat' smoke: 2 turns + /model switch + tool calls +
/quit, zero errors. Agent correctly counted 20 cfg_get hits across
8 tool files — matching the migration.
Semantic parity verified against the original pattern across 8 edge
cases (missing keys, None values, falsy values, empty strings, string
instead of dict, None cfg, nested levels).
Add opt-in terminal.docker_run_as_host_user config flag that passes
--user $(id -u):$(id -g) to the Docker backend so files written into
bind-mounted directories (/workspace, /root, docker_volumes entries) are
owned by the host user instead of root.
When enabled on POSIX platforms, also drops SETUID/SETGID caps since the
container no longer needs gosu/su to switch users. Falls back cleanly on
platforms without os.getuid (e.g. native Windows Docker) with a warning.
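A rough sketch of how such a flag could translate into `docker run` arguments. The function name is hypothetical, and expressing the capability drop as `--cap-drop` flags is an assumption about how the backend does it.

```python
import os
import warnings
from typing import List


def host_user_docker_args(enabled: bool) -> List[str]:
    """Build extra `docker run` args that run the container as the host user."""
    if not enabled:
        return []
    if not hasattr(os, "getuid"):
        # e.g. native Windows: no POSIX uid/gid to pass through.
        warnings.warn("docker_run_as_host_user is not supported on this platform")
        return []
    return [
        "--user", f"{os.getuid()}:{os.getgid()}",
        # The container no longer switches users, so these caps are unneeded.
        "--cap-drop", "SETUID",
        "--cap-drop", "SETGID",
    ]
```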
Wired through all three config.yaml -> TERMINAL_* env-var bridges:
- cli.py env_mappings (CLI + TUI startup)
- gateway/run.py _terminal_env_map (gateway / messaging platforms)
- hermes_cli/config.py _config_to_env_sync (`hermes config set`)
Also fixes docker_mount_cwd_to_workspace silently failing in gateway
mode -- it was missing from gateway/run.py's _terminal_env_map.
Adds tests/tools/test_terminal_config_env_sync.py to guard against
future drift between the three bridges (same bug class shipped twice
in one month).
Bundled Hermes image won't work with this flag since its entrypoint
expects to start as root for the usermod/gosu hermes flow; works with
the default nikolaik/python-nodejs image and plain Debian/Ubuntu.
Weekly is closer to how skill churn actually works — most agent-created
skills don't change multiple times per day, so a daily review is pure
cost without benefit. Bumping the default to 7 days reduces aux-model
spend while still catching drift and staleness on the timescales that
matter (30d stale, 90d archive).
Changes:
- DEFAULT_INTERVAL_HOURS: 24 -> 168 (7 days)
- config.yaml default: interval_hours: 24 -> 24 * 7
- CLI status line renders as '7d' when interval is a whole-day multiple
- Test `test_old_run_eligible` decoupled from the exact default: it now
uses 2 * get_interval_hours() so future tweaks don't break it
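The status-line rendering rule might look like this sketch. `format_interval` is a hypothetical name, and the `h` fallback for non-whole-day intervals is an assumption.

```python
def format_interval(hours: int) -> str:
    """Render whole-day multiples as days, anything else as hours."""
    if hours > 0 and hours % 24 == 0:
        return f"{hours // 24}d"
    return f"{hours}h"  # assumed fallback format


print(format_interval(168))  # the new default interval
```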
Previous invariants only gated the primary entry points
(apply_automatic_transitions, archive_skill, CLI pin). Several paths
were unprotected:
- bump_view / bump_use / bump_patch / set_state / set_pinned wrote
usage records unconditionally, which is confusing noise in
.usage.json even though the review list filtered them out
- restore_skill did not check whether a bundled skill now shadows
the archived name
- CLI unpin was asymmetric with CLI pin — it had no gate
Fixes:
- _mutate() (the shared counter / state writer) now drops silently
when the skill is not agent-created. .usage.json never gains a
record for a bundled or hub-installed skill.
- restore_skill() refuses to restore under a name that is now
bundled or hub-installed (would shadow upstream).
- CLI unpin gate matches CLI pin.
New tests:
- 5 provenance-guard tests on skill_usage (one per mutator)
- 1 end-to-end test that hammers every mutator at a bundled skill
and a hub skill, asserts both are untouched on disk, and asserts
the sidecar stays clean
- 2 CLI tests proving pin/unpin refuse bundled skills symmetrically
64/64 tests passing (29 skill_usage + 27 curator + 8 new guards).
Adds the Curator — an auxiliary-model background task that periodically
reviews AGENT-CREATED skills and keeps the collection tidy: tracks usage,
transitions unused skills through active → stale → archived, and spawns
a forked AIAgent to consolidate overlaps and patch drift.
Default: enabled, inactivity-triggered (no cron daemon). Runs on CLI
startup and gateway boot when the last run is older than interval_hours
(default 24) AND the agent has been idle for min_idle_hours (default 2).
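The inactivity trigger amounts to two time comparisons. A minimal sketch, assuming a hypothetical `curator_due` helper and treating a missing last-run timestamp as overdue:

```python
from datetime import datetime, timedelta
from typing import Optional


def curator_due(
    now: datetime,
    last_run: Optional[datetime],
    last_activity: datetime,
    interval_hours: float = 24,
    min_idle_hours: float = 2,
) -> bool:
    """True when the last run is old enough AND the agent has been idle."""
    # Assumption: no recorded run counts as overdue.
    overdue = last_run is None or now - last_run > timedelta(hours=interval_hours)
    idle = now - last_activity >= timedelta(hours=min_idle_hours)
    return overdue and idle
```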
Invariants (all load-bearing):
- Never touches bundled or hub-installed skills (.bundled_manifest +
.hub/lock.json double-filter)
- Never auto-deletes — archive only. Archives are recoverable
via `hermes curator restore <skill>`
- Pinned skills bypass all auto-transitions
- Uses the aux client; never touches the main session's prompt cache
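The pure state-machine step can be sketched as below. The function name, the day-based thresholds (30 and 90 days are used here as illustrative defaults), and the choice to let a recently used stale skill return to active are all assumptions.

```python
def next_state(
    state: str,
    days_unused: int,
    pinned: bool,
    stale_after: int = 30,
    archive_after: int = 90,
) -> str:
    """Compute the next lifecycle state; never deletes, only archives."""
    if pinned or state == "archived":
        return state  # pinned skills bypass transitions; restore is manual
    if days_unused >= archive_after:
        return "archived"
    if days_unused >= stale_after:
        return "stale"
    return "active"  # assumption: renewed use revives a stale skill
```

Keeping this step free of LLM calls means it can be unit-tested exhaustively, separate from the forked-agent review.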
New files:
- tools/skill_usage.py — sidecar .usage.json telemetry, atomic writes,
provenance filter
- agent/curator.py — orchestrator: config, idle gating, state-machine
transitions (pure, no LLM), forked-agent review prompt
- hermes_cli/curator.py — `hermes curator {status,run,pause,resume,
pin,unpin,restore}` subcommand
- tests/tools/test_skill_usage.py — 29 tests
- tests/agent/test_curator.py — 25 tests
Modified files (surgical patches):
- tools/skills_tool.py — bump view_count on successful skill_view
- tools/skill_manager_tool.py — bump patch_count on skill_manage
patch/edit/write_file/remove_file; forget record on delete
- hermes_cli/config.py — add curator: section to DEFAULT_CONFIG
- hermes_cli/commands.py — add /curator CommandDef with subcommands
- hermes_cli/main.py — register `hermes curator` subparser via
register_cli() from hermes_cli.curator
- cli.py — /curator slash-command dispatch + startup hook
- gateway/run.py — gateway-boot hook (mirrors CLI)
Validation:
- 54 new tests across skill_usage + curator, all passing in 3s
- 346 tests across all touched files' neighbors green
- 2783 tests across hermes_cli/ + gateway/test_run_progress_topics.py green
- CLI smoke: `hermes curator status/pause/resume` work end-to-end
Companion to PR #16026 (class-first skill review prompt) — together
they form a loop: the review prompt stops near-duplicate skill creation
at the source, and the curator prunes/consolidates what still accumulates.
Refs #7816.
The known-key splitter in `_sanitize_env_lines` used substring matching
to find concatenated KEY=VALUE pairs. When a registered key was a suffix
of another (LM_API_KEY is a suffix of GLM_API_KEY), the shorter key's
needle would match inside the longer one, causing the sanitizer to
rewrite `GLM_API_KEY=...` as `G\nLM_API_KEY=...` and silently break
Z.AI/GLM auth (and similarly `GLM_BASE_URL` -> `G\nLM_BASE_URL`).
Drop matches whose needle range is fully contained within a longer
overlapping match. Two regression tests cover the suffix-collision case
and confirm a real concatenation that happens to start with the longer
key still splits where it should.
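The containment filter can be illustrated with the sketch below. It is not the code in `_sanitize_env_lines`; the function names and the match representation are assumptions.

```python
from typing import List, Tuple


def find_split_points(line: str, known_keys: List[str]) -> List[int]:
    """Offsets where a new KEY= starts mid-line, ignoring suffix collisions."""
    matches: List[Tuple[int, int]] = []
    for key in known_keys:
        needle = key + "="
        start = line.find(needle)
        while start != -1:
            matches.append((start, start + len(needle)))
            start = line.find(needle, start + 1)
    # Drop any match fully contained within a different (longer) match.
    kept = [
        m for m in matches
        if not any(o != m and o[0] <= m[0] and m[1] <= o[1] for o in matches)
    ]
    return sorted({start for start, _ in kept if start > 0})


def split_concatenated(line: str, known_keys: List[str]) -> List[str]:
    """Split a line holding several concatenated KEY=VALUE pairs."""
    pieces, prev = [], 0
    for point in find_split_points(line, known_keys):
        pieces.append(line[prev:point])
        prev = point
    pieces.append(line[prev:])
    return pieces
```

With `LM_API_KEY` and `GLM_API_KEY` both registered, `GLM_API_KEY=abc` stays on one line, while a genuine concatenation such as `GLM_API_KEY=abcLM_API_KEY=def` still splits at the second key.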
Fixes #17138
Classic CLI exposes ``/reload`` (re-reads ~/.hermes/.env into
``os.environ`` via ``hermes_cli.config.reload_env``) so newly added API
keys take effect without restarting the session. The TUI was missing
the parity command, so users had to Ctrl+C out and ``hermes --tui``
again whenever they added or rotated a credential.
Three small wires:
* New ``reload.env`` JSON-RPC method in ``tui_gateway/server.py`` that
delegates to ``hermes_cli.config.reload_env`` and returns the count
of vars updated.
* New ``/reload`` slash command in ``ui-tui/src/app/slash/commands/ops.ts``
matching the existing ``/reload-mcp`` pattern (native RPC, no slash
worker).
* Drop ``cli_only=True`` from the ``reload`` ``CommandDef`` in
``hermes_cli/commands.py`` so help/menus surface it in the TUI too.
``reload_env`` itself is environment-agnostic.
Same caveat as classic CLI: the *currently constructed* agent's
credential pool / provider routing does not auto-rebuild. Users who
want a brand-new credential resolution should follow with ``/new``.
Tests:
* New ``test_reload_env_rpc_calls_hermes_cli_reload_env`` confirms
RPC delegates and reports the count.
* New ``test_reload_env_rpc_surfaces_errors`` confirms exceptions are
rendered as JSON-RPC errors.
* ``createSlashHandler.test.ts`` slash-parity matrix extended with
``['/reload', 'reload.env', {}]`` so we can't regress the routing.
Validation:
scripts/run_tests.sh tests/test_tui_gateway_server.py — 92/92.
scripts/run_tests.sh tests/hermes_cli/test_commands.py — 128/128.
cd ui-tui && npm run type-check — clean; npm test --run — 390/390.
Fixes from Copilot's two passes on PR #17238:
* Validate parsed URL once: reject missing host, invalid port, and
unsupported scheme up front so malformed inputs (e.g. http://:9222
or http://localhost:abc) don't fall through to a generic 5031.
* Tighten _is_default_local_cdp to require a discovery-style path so
ws://127.0.0.1:9222/devtools/browser/<id> is not collapsed to bare
http://127.0.0.1:9222 (which would lose the path and break the
connect).
* Move browser.manage into _LONG_HANDLERS so the up-to-10s
launch-and-retry loop runs on the RPC pool instead of blocking the
main dispatcher.
* try_launch_chrome_debug uses Windows-appropriate detach kwargs
(creationflags=DETACHED_PROCESS|CREATE_NEW_PROCESS_GROUP) instead
of POSIX-only start_new_session=True.
* manual_chrome_debug_command uses subprocess.list2cmdline on
Windows so the printed instruction is cmd.exe-compatible.
* Mirror host/port validation in cli.py /browser connect so the
classic CLI never persists an invalid BROWSER_CDP_URL.
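The up-front URL validation from the first bullet could be sketched with the standard library as follows. The function name and the allowed-scheme set are assumptions; the three rejection cases are the ones listed above.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https", "ws", "wss"}  # assumed set of CDP schemes


def validate_cdp_url(url: str) -> str:
    """Return url unchanged if well-formed, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing host")
    try:
        parts.port  # accessing .port raises ValueError for non-numeric ports
    except ValueError as exc:
        raise ValueError("invalid port") from exc
    return url
```

Validating once at the entry point lets both the TUI handler and the classic CLI refuse to persist a malformed BROWSER_CDP_URL.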
Split browser.manage into a small dispatcher with named connect/disconnect
helpers, fold _http_ok / _probe_urls / _normalize_cdp_url out of the nested
probe loop, collapse the failure-message scaffolding, and DRY the chrome
candidate path tables. Behaviour and event shape unchanged.
Detect an actual Chrome/Chromium executable before printing a manual CDP launch command, including common WSL-mounted Windows browser paths, so /browser connect does not suggest google-chrome when it is unavailable.
Share Chrome CDP launch helpers between the classic CLI and TUI so default /browser connect uses loopback consistently, retries local Chrome launch, and reports a copyable manual-start command instead of claiming a dead connection.
Three modules independently implemented the same "preserve head+tail of
a secret, mask the middle" logic with slightly different behaviors that
had started to drift:
hermes_cli/config.py redact_key — 12-char floor, 4+4, DIM '(not set)'
hermes_cli/status.py redact_key — 12-char floor, 4+4, plain '(not set)' ← drift
hermes_cli/dump.py _redact — 12-char floor, 4+4, empty string
The visible bug: 'hermes status' displayed the '(not set)' placeholder
in plain text while 'hermes config' showed it in dim text. Same concept,
inconsistent UI.
Introduces mask_secret() in agent/redact.py as the canonical helper,
with head/tail/floor/placeholder/empty kwargs. The three call sites
become one-line wrappers that differ only in the 'empty' handling:
config.redact_key → mask_secret(k, empty=color('(not set)', Colors.DIM))
status.redact_key → mask_secret(k, empty=color('(not set)', Colors.DIM))
dump._redact → mask_secret(v) # empty → ''
agent.redact._mask_token (log redactor, different policy: 18-char floor,
6+4 visible, '***' on empty) also ports to mask_secret but retains its
own empty-case handling to preserve the historical '***' return.
Net: the three display-time redactors now agree on formatting, the
canonical helper lives in one place, and future tweaks (e.g. adding
bullet-point masking, changing the head/tail widths) happen once.
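A minimal sketch of a helper with the listed kwargs, assuming the `prefix...suffix` format and `'***'` short-value behavior described in this message. It is not the code in agent/redact.py, and the exact defaults are assumptions.

```python
def mask_secret(
    value,
    *,
    head: int = 4,
    tail: int = 4,
    floor: int = 12,
    placeholder: str = "***",
    empty: str = "",
) -> str:
    """Keep the head and tail of a secret and mask the middle."""
    if not value:
        return empty  # call sites choose their own 'not set' rendering
    if len(value) < floor:
        return placeholder  # too short to reveal any characters safely
    tail_part = value[-tail:] if tail else ""  # value[-0:] would leak everything
    return f"{value[:head]}...{tail_part}"


# Usage: a display wrapper differs only in its empty-case handling.
print(mask_secret("sk-abcdefghijklmnop"))
print(mask_secret("", empty="(not set)"))
```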
Verified:
- 3/3 tests/hermes_cli/test_web_server.py::TestRedactKey pass
- 89/89 agent/tests/test_redact.py + tests/tools/test_browser_secret_exfil.py
+ tests/hermes_cli/test_redact_config_bridge.py pass
- Live 'hermes status', 'hermes config', 'hermes dump' all render the
same way they did before (verified against actual env with real
keys: OpenRouter, Firecrawl, Browserbase, FAL, Tinker all show
'prefix...suffix'; Kimi shows '***' at <12 chars; unset shows
'(not set)' uniformly).
Co-authored-by: teknium1 <teknium@users.noreply.github.com>