Three tests in tests/test_plugin_skills.py and tests/hermes_cli/test_plugins.py
used caplog.at_level(logging.WARNING) without specifying a logger. When another
test earlier in the same xdist worker touched propagation on tools.skills_tool
or hermes_cli.plugins, caplog would miss the warning and the assertion would
fail intermittently in CI.
These three tests accounted for 15 of the last ~30 Tests workflow failures
(5 each), including the recent main failure on commit 436a7359 (PR #11398).
Fix: pass logger="tools.skills_tool" / logger="hermes_cli.plugins" to
caplog.at_level() so the handler attaches directly to the logger under test
and capture is independent of global propagation state.
Affected tests:
- tests/test_plugin_skills.py::TestSkillViewPluginGuards::test_injection_logged_but_served
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_empty_name_rejected
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_builtin_conflict_rejected
No production code change. Verified passing under xdist (-n 4) alongside
test_hermes_logging.py (the test most likely to poison the logger state).
Expands the plugin interface so slash command handlers can dispatch tool
calls through the registry with parent agent context wired up automatically.
This is the public API for plugins that need to orchestrate tools like
delegate_task — they call ctx.dispatch_tool() instead of reaching into
framework internals. The parent agent is resolved lazily from _cli_ref
when available (CLI mode) and omitted in gateway mode (tools degrade
gracefully).
Enables the hermes-deliver-plugin pattern where /deliver and /fanout
slash commands spawn subagents via delegate_task without touching the
agent conversation loop.
7 new tests covering: registry delegation, parent_agent injection from
cli_ref, gateway mode (no cli_ref), uninitialized agent, explicit
parent_agent override, kwargs forwarding, return value passthrough.
Complete the half-built plugin slash command system. The dispatch
code in cli.py and gateway/run.py already called
get_plugin_command_handler() but the registration side was never
implemented.
Changes:
- Add register_command() to PluginContext — stores handler,
description, and plugin name; normalizes names; rejects conflicts
with built-in commands
- Add _plugin_commands dict to PluginManager
- Add commands_registered tracking on LoadedPlugin
- Add get_plugin_command_handler() and get_plugin_commands()
module-level convenience functions
- Fix commands.py to use actual plugin description in Telegram
bot menu (was hardcoded 'Plugin command')
- Add plugin commands to SlashCommandCompleter autocomplete
- Show command count in /plugins display
- 12 new tests covering registration, conflict detection,
normalization, handler dispatch, and introspection
Closes#10495
Plugins can now return {"action": "block", "message": "reason"} from
their pre_tool_call hook to prevent a tool from executing. The error
message is returned to the model as a tool result so it can adjust.
Covers both execution paths: handle_function_call (model_tools.py) and
agent-level tools (run_agent.py _invoke_tool + sequential/concurrent).
Blocked tools skip all side effects (counter resets, checkpoints,
callbacks, read-loop tracker).
Adds skip_pre_tool_call_hook flag to avoid double-firing the hook when
run_agent.py already checked and then calls handle_function_call.
Salvaged from PR #5385 (gianfrancopiana) and PR #4610 (oredsecurity).
* refactor: re-architect tests to mirror the codebase
* Update tests.yml
* fix: add missing tool_error imports after registry refactor
* fix(tests): replace patch.dict with monkeypatch to prevent env var leaks under xdist
patch.dict(os.environ) can leak TERMINAL_ENV across xdist workers,
causing test_code_execution tests to hit the Modal remote path.
* fix(tests): fix update_check and telegram xdist failures
- test_update_check: replace patch("hermes_cli.banner.os.getenv") with
monkeypatch.setenv("HERMES_HOME") — banner.py no longer imports os
directly, it uses get_hermes_home() from hermes_constants.
- test_telegram_conflict/approval_buttons: provide real exception classes
for telegram.error mock (NetworkError, TimedOut, BadRequest) so the
except clause in connect() doesn't fail with "catching classes that do
not inherit from BaseException" when xdist pollutes sys.modules.
* fix(tests): accept unavailable_models kwarg in _prompt_model_selection mock