hermes-agent/hermes_cli/toolset_validation.py
lEWFkRAD 41ede84b93 fix(config): surface invalid platform_toolsets instead of silently dropping tools (#38798)
A config migration (or hand-edit) that leaves an invalid toolset name in
`platform_toolsets` — e.g. the #38798 corruption that rewrote `hermes-cli` to
the non-existent `hermes` — silently disabled all affected tools:
resolve_toolset() returns [] for an unknown name, so the agent quietly lost its
tools with no error, warning, or log entry and degraded to text-only replies.

Surface it loudly at two points:
- After migration (migrate_config): validate platform_toolsets and record/print
  a warning per unknown name, with a `hermes-<platform>` suggestion when that
  would have been valid (the exact #38798 shape).
- At runtime (_get_platform_tools): if a platform was explicitly configured but
  every toolset name is invalid, log a warning when tools are resolved for a
  session — so an ALREADY-corrupted config is caught at startup, not only on the
  next `hermes update`.

Logic lives in a new pure, side-effect-free helper (toolset_validation.py) with
validate_toolset injected, so it is unit-testable without the tool registry.

Note: the original v25→v26 migration that caused the corruption no longer
exists (config format is now v30; no migration step rewrites toolset names).
This change is the durable defense against the silent-failure mode regardless
of cause, matching the issue's "Expected: log a warning".

Salvaged from #39207 by @lEWFkRAD (authorship preserved via cherry-pick).
Tests: 9 helper cases (incl. the #38798 corruption shape, mixed valid/invalid,
zero-tools state, non-dict/scalar/non-string) + a runtime caplog test — both the
helper warning and the runtime guard mutation-verified to fail without the fix.

Closes #38798. Supersedes #39581 (prevent-in-v25→v26 — that path is gone),
#41006 / #40208 (repair-migration for already-corrupted configs).
2026-06-26 14:07:43 +05:30

74 lines
2.9 KiB
Python

"""Validation for the ``platform_toolsets`` config section.
Pure, side-effect-free helpers so the logic is unit-testable without importing
the tool registry or launching Hermes (mirrors the decoupled-helper pattern used
elsewhere in the CLI).
Motivated by #38798: a config migration silently rewrote the valid toolset name
``hermes-cli`` to the non-existent ``hermes``. ``resolve_toolset('hermes')``
returns an empty list, so every tool silently disappeared with no error, warning,
or log entry — the agent degraded to text-only replies and the cause took
significant debugging to find. Surfacing invalid toolset names (and the
zero-tools end state) loudly turns that silent failure into an actionable one.
"""
from typing import Callable, Dict, List
def validate_platform_toolsets(
platform_toolsets: object,
is_valid_toolset: Callable[[str], bool],
) -> List[str]:
"""Return human-readable warnings for a ``platform_toolsets`` mapping.
Two failure modes are reported:
1. A toolset name that ``is_valid_toolset`` rejects — usually a corrupted or
renamed entry. When ``hermes-<platform>`` would have been valid (the exact
#38798 shape, where ``cli`` held ``hermes`` instead of ``hermes-cli``),
the warning includes that as a suggestion.
2. The mapping is non-empty but resolves to *zero* valid toolsets, so the
agent would start with no tools at all.
``is_valid_toolset`` is injected (normally :func:`toolsets.validate_toolset`)
so this function performs no imports or I/O and is testable in isolation.
Args:
platform_toolsets: The raw ``platform_toolsets`` value from config. Only
``dict`` values carry toolset entries; anything else yields no
warnings (nothing to validate).
is_valid_toolset: Predicate returning ``True`` for a known toolset name.
Returns:
A list of warning strings (empty when everything is valid).
"""
warnings: List[str] = []
if not isinstance(platform_toolsets, dict) or not platform_toolsets:
return warnings
valid_count = 0
for platform, raw in platform_toolsets.items():
names = raw if isinstance(raw, list) else [raw]
for name in names:
if not isinstance(name, str) or not name:
continue
if is_valid_toolset(name):
valid_count += 1
continue
suggestion = f"hermes-{platform}"
hint = (
f" — did you mean '{suggestion}'?"
if is_valid_toolset(suggestion)
else ""
)
warnings.append(
f"platform '{platform}' references unknown toolset "
f"'{name}'{hint}"
)
if valid_count == 0:
warnings.append(
"platform_toolsets resolves to zero valid toolsets — the agent will "
"have no tools. Run `hermes tools` to reconfigure."
)
return warnings