Commit graph

43 commits

Author SHA1 Message Date
Teknium
70768665a4
fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383)
* feat(mcp-oauth): scaffold MCPOAuthManager

Central manager for per-server MCP OAuth state. Provides
get_or_build_provider (cached), remove (evicts cache + deletes
disk), invalidate_if_disk_changed (mtime watch, core fix for
external-refresh workflow), and handle_401 (dedup'd recovery).

No behavior change yet — existing call sites still use
build_oauth_auth directly. Task 1 of 8 in the MCP OAuth
consolidation (fixes Cthulhu's BetterStack reliability issues).

* feat(mcp-oauth): add HermesMCPOAuthProvider with pre-flow disk watch

Subclasses the MCP SDK's OAuthClientProvider to inject a disk
mtime check before every async_auth_flow, via the central
manager. When a subclass instance is used, external token
refreshes (cron, another CLI instance) are picked up before
the next API call.

Still dead code: the manager's _build_provider still delegates
to build_oauth_auth and returns the plain OAuthClientProvider.
Task 4 wires this subclass in. Task 2 of 8.

* refactor(mcp-oauth): extract build_oauth_auth helpers

Decomposes build_oauth_auth into _configure_callback_port,
_build_client_metadata, _maybe_preregister_client, and
_parse_base_url. Public API preserved. These helpers let
MCPOAuthManager._build_provider reuse the same logic in Task 4
instead of duplicating the construction dance.

Also updates the SDK version hint in the warning from 1.10.0 to
1.26.0 (which is what we actually require for the OAuth types
used here). Task 3 of 8.

* feat(mcp-oauth): manager now builds HermesMCPOAuthProvider directly

_build_provider constructs the disk-watching subclass using the
helpers from Task 3, instead of delegating to the plain
build_oauth_auth factory. Any consumer using the manager now gets
pre-flow disk-freshness checks automatically.

build_oauth_auth is preserved as the public API for backwards
compatibility. The code path is now:

    MCPOAuthManager.get_or_build_provider  ->
      _build_provider  ->
        _configure_callback_port
        _build_client_metadata
        _maybe_preregister_client
        _parse_base_url
        HermesMCPOAuthProvider(...)

Task 4 of 8.

* feat(mcp): wire OAuth manager + add _reconnect_event

MCPServerTask gains _reconnect_event alongside _shutdown_event.
When set, _run_http / _run_stdio exit their async-with blocks
cleanly (no exception), and the outer run() loop re-enters the
transport to rebuild the MCP session with fresh credentials.
This is the recovery path for OAuth failures that the SDK's
in-place httpx.Auth cannot handle (e.g. cron externally consumed
the refresh_token, or server-side session invalidation).

_run_http now asks MCPOAuthManager for the OAuth provider
instead of calling build_oauth_auth directly. Config-time,
runtime, and reconnect paths all share one provider instance
with pre-flow disk-watch active.

shutdown() defensively sets both events so there is no race
between reconnect and shutdown signalling.

Task 5 of 8.

* feat(mcp): detect auth failures in tool handlers, trigger reconnect

All 5 MCP tool handlers (tool call, list_resources, read_resource,
list_prompts, get_prompt) now detect auth failures and route
through MCPOAuthManager.handle_401:

  1. If the manager says recovery is viable (disk has fresh tokens,
     or SDK can refresh in-place), signal MCPServerTask._reconnect_event
     to tear down and rebuild the MCP session with fresh credentials,
     then retry the tool call once.

  2. If no recovery path exists, return a structured needs_reauth
     JSON error so the model stops hallucinating manual refresh
     attempts (the 'let me curl the token endpoint' loop Cthulhu
     pasted from Discord).

_is_auth_error catches OAuthFlowError, OAuthTokenError,
OAuthNonInteractiveError, and httpx.HTTPStatusError(401). Non-auth
exceptions still surface via the generic error path unchanged.

Task 6 of 8.

* feat(mcp-cli): route add/remove through manager, add 'hermes mcp login'

cmd_mcp_add and cmd_mcp_remove now go through MCPOAuthManager
instead of calling build_oauth_auth / remove_oauth_tokens
directly. This means CLI config-time state and runtime MCP
session state are backed by the same provider cache — removing
a server evicts the live provider, adding a server populates
the same cache the MCP session will read from.

New 'hermes mcp login <name>' command:
  - Wipes both the on-disk tokens file and the in-memory
    MCPOAuthManager cache
  - Triggers a fresh OAuth browser flow via the existing probe
    path
  - Intended target for the needs_reauth error Task 6 returns
    to the model

Task 7 of 8.

* test(mcp-oauth): end-to-end integration tests

Five new tests exercising the full consolidation with real file
I/O and real imports (no transport mocks):

  1. external_refresh_picked_up_without_restart — Cthulhu's cron
     workflow. External process writes fresh tokens to disk;
     on the next auth flow the manager's mtime-watch flips
     _initialized and the SDK re-reads from storage.

  2. handle_401_deduplicates_concurrent_callers — 10 concurrent
     handlers for the same failed token fire exactly ONE recovery
     attempt (thundering-herd protection).

  3. handle_401_returns_false_when_no_provider — defensive path
     for unknown servers.

  4. invalidate_if_disk_changed_handles_missing_file — pre-auth
     state returns False cleanly.

  5. provider_is_reused_across_reconnects — cache stickiness so
     reconnects preserve the disk-watch baseline mtime.

Task 8 of 8 — consolidation complete.
2026-04-16 21:57:10 -07:00
Teknium
3ff18ffe14
fix: add circuit breaker to MCP tool handler to prevent retry burn loops (#10447) (#10776)
When an MCP server returns errors consistently (crashed, disconnected,
auth expired), the model sees each error and retries the tool call.
With no circuit breaker, this burned through all 90 iterations — each
one a full LLM API call plus failed MCP call — producing 15-45 minutes
of zero useful output while the gateway inactivity timeout never fired
(because the agent WAS active, just uselessly).

Fix: track consecutive error counts per MCP server. After 3 consecutive
failures (connection errors, MCP-level errors, or transport exceptions),
the handler short-circuits with a message telling the model to stop
retrying and use alternative approaches. The counter resets to 0 on
any successful call.

Closes #10447
2026-04-15 22:33:48 -07:00
Teknium
861efe274b
fix: add ensure_ascii=False to all MCP json.dumps calls (#10234) (#10512)
Python's json.dumps() defaults to ensure_ascii=True, escaping non-ASCII
characters to \uXXXX sequences.  For CJK characters this inflates
token count 3-4x — a single Chinese character like '中' becomes
'\u4e2d' (6 chars vs 3 bytes, ~6 tokens vs ~1 token).

Since MCP tool results feed directly into the model's conversation
context, this silently multiplied API costs for Chinese, Japanese,
and Korean users.

Fix: add ensure_ascii=False to all 20 json.dumps calls in mcp_tool.py.
Raw UTF-8 is valid JSON per RFC 8259 and all downstream consumers
(LLM APIs, display) handle it correctly.

Closes #10234
2026-04-15 13:59:57 -07:00
Teknium
ba24f058ed docs: fix stale docstring reference to _discover_tools in mcp_tool.py 2026-04-14 21:12:29 -07:00
Greer Guthrie
c10fea8d26 fix(mcp): make server aliases explicit 2026-04-14 17:19:20 -07:00
Greer Guthrie
cda64a5961 fix(mcp): resolve toolsets from live registry 2026-04-14 17:19:20 -07:00
Teknium
eed891f1bb
security: supply chain hardening — CI pinning, dep pinning, and code fixes (#9801)
CI/CD Hardening:
- Pin all 12 GitHub Actions to full commit SHAs (was mutable @vN tags)
- Add explicit permissions: {contents: read} to 4 workflows
- Pin CI pip installs to exact versions (pyyaml==6.0.2, httpx==0.28.1)
- Extend supply-chain-audit.yml to scan workflow, Dockerfile, dependency
  manifest, and Actions version changes

Dependency Pinning:
- Pin git-based Python deps to commit SHAs (atroposlib, tinker, yc-bench)
- Pin WhatsApp Baileys from mutable branch to commit SHA

Tool Registry:
- Reject tool name shadowing from different tool families (plugins/MCP
  cannot overwrite built-in tools). MCP-to-MCP overwrites still allowed.

MCP Security:
- Add tool description content scanning for prompt injection patterns
- Log detailed change diff on dynamic tool refresh at WARNING level

Skill Manager:
- Fix dangerous verdict bug: agent-created skills with dangerous
  findings were silently allowed (ask->None->allow). Now blocked.
2026-04-14 14:23:37 -07:00
helix4u
e08590888a fix: honor interrupts during MCP tool waits 2026-04-13 22:14:55 -07:00
Teknium
f324222b79
fix: add vLLM/local server error patterns + MCP initial connection retry (#9281)
Port two improvements inspired by Kilo-Org/kilocode analysis:

1. Error classifier: add context overflow patterns for vLLM, Ollama,
   and llama.cpp/llama-server. These local inference servers return
   different error formats than cloud providers (e.g., 'exceeds the
   max_model_len', 'context length exceeded', 'slot context'). Without
   these patterns, context overflow errors from local servers are
   misclassified as format errors, causing infinite retries instead
   of triggering compression.

2. MCP initial connection retry: previously, if the very first
   connection attempt to an MCP server failed (e.g., transient DNS
   blip at startup), the server was permanently marked as failed with
   no retry. Post-connect reconnection had 5 retries with exponential
   backoff, but initial connection had zero. Now initial connections
   retry up to 3 times with backoff before giving up, matching the
   resilience of post-connect reconnection.
   (Inspired by Kilo Code's MCP server disappearing fix in v1.3.3)

Tests: 6 new error classifier tests, 4 new MCP retry tests, 1
updated existing test. All 276 affected tests pass.
2026-04-13 18:46:14 -07:00
Awsh1
6f63ba9c8f fix(mcp): fall back when SIGKILL is unavailable 2026-04-10 16:47:44 -07:00
Teknium
04baab5422
fix(mcp): combine content and structuredContent when both present (#7118)
When an MCP server returns both content (model-oriented text) and
structuredContent (machine-oriented JSON), the client now combines
them instead of discarding content.  The text content becomes the
primary result (what the agent reads), and structuredContent is
included as supplementary metadata.

Previously, structuredContent took full precedence — causing data
loss for servers like Desktop Commander that put the actual file
text in content and metadata in structuredContent.

MCP spec guidance: for conversational/agent UX, prefer content.
2026-04-10 03:44:35 -07:00
Teknium
b9a5e6e247 fix: use camelCase structuredContent attr, prefer structured over text
- The MCP SDK Pydantic model uses camelCase (structuredContent), not
  snake_case (structured_content). The original getattr was a silent no-op.
- When structuredContent is present, return it AS the result instead of
  alongside text — the structured payload is the machine-readable data.
- Move test file to tests/tools/ and fix fake class to use camelCase.
- Patch _run_on_mcp_loop in tests so the handler actually executes.
2026-04-07 18:00:01 -07:00
r266-tech
2ad7694874 fix(mcp): preserve structured_content in tool call results
MCP CallToolResult may include structured_content (a JSON object) alongside
content blocks. The tool handler previously only forwarded concatenated text
from content blocks, silently dropping the structured payload.

This breaks MCP tools that return a minimal human text in content while
putting the actual machine-usable payload in structured_content.

Now, when structured_content is present, it is included in the returned
JSON under the 'structuredContent' key.

Fixes NousResearch/hermes-agent#5874
2026-04-07 18:00:01 -07:00
Siddharth Balyan
f3006ebef9
refactor(tests): re-architect tests + fix CI failures (#5946)
* refactor: re-architect tests to mirror the codebase

* Update tests.yml

* fix: add missing tool_error imports after registry refactor

* fix(tests): replace patch.dict with monkeypatch to prevent env var leaks under xdist

patch.dict(os.environ) can leak TERMINAL_ENV across xdist workers,
causing test_code_execution tests to hit the Modal remote path.

* fix(tests): fix update_check and telegram xdist failures

- test_update_check: replace patch("hermes_cli.banner.os.getenv") with
  monkeypatch.setenv("HERMES_HOME") — banner.py no longer imports os
  directly, it uses get_hermes_home() from hermes_constants.

- test_telegram_conflict/approval_buttons: provide real exception classes
  for telegram.error mock (NetworkError, TimedOut, BadRequest) so the
  except clause in connect() doesn't fail with "catching classes that do
  not inherit from BaseException" when xdist pollutes sys.modules.

* fix(tests): accept unavailable_models kwarg in _prompt_model_selection mock
2026-04-07 17:19:07 -07:00
Teknium
678a87c477
refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites
Add three reusable helpers to eliminate pervasive boilerplate:

tools/registry.py — tool_error() and tool_result():
  Every tool handler returns JSON strings. The pattern
  json.dumps({"error": msg}, ensure_ascii=False) appeared 106 times,
  and json.dumps({"success": False, "error": msg}, ...) another 23.
  Now: tool_error(msg) or tool_error(msg, success=False).

  tool_result() handles arbitrary result dicts:
  tool_result(success=True, data=payload) or tool_result(some_dict).

hermes_cli/config.py — read_raw_config():
  Lightweight YAML reader that returns the raw config dict without
  load_config()'s deep-merge + migration overhead. Available for
  callsites that just need a single config value.

Migration (129 callsites across 32 files):
- tools/: browser_camofox (18), file_tools (10), homeassistant (8),
  web_tools (7), skill_manager (7), cronjob (11), code_execution (4),
  delegate (5), send_message (4), tts (4), memory (7), session_search (3),
  mcp (2), clarify (2), skills_tool (3), todo (1), vision (1),
  browser (1), process_registry (2), image_gen (1)
- plugins/memory/: honcho (9), supermemory (9), hindsight (8),
  holographic (7), openviking (7), mem0 (7), byterover (6), retaindb (2)
- agent/: memory_manager (2), builtin_memory_provider (1)
2026-04-07 13:36:38 -07:00
Teknium
38d8446011
feat: implement MCP OAuth 2.1 PKCE client support (#5420)
Implement tools/mcp_oauth.py — the OAuth adapter that mcp_tool.py's
existing auth: oauth hook has been waiting for.

Components:
- HermesTokenStorage: persists tokens + client registration to
  HERMES_HOME/mcp-tokens/<server>.json with 0o600 permissions
- Callback handler factory: per-flow isolated HTTP handlers (safe for
  concurrent OAuth flows across multiple MCP servers)
- OAuthClientProvider integration: wraps the MCP SDK's httpx.Auth
  subclass which handles discovery, DCR, PKCE, token exchange,
  refresh, and step-up auth (403 insufficient_scope) automatically
- Non-interactive detection: warns when gateway/cron environments
  try to OAuth without cached tokens
- Pre-registered client support: injects client_id/secret from config
  for servers that don't support Dynamic Client Registration (e.g. Slack)
- Path traversal protection on server names
- remove_oauth_tokens() for cleanup

Config format:
  mcp_servers:
    sentry:
      url: 'https://mcp.sentry.dev/mcp'
      auth: oauth
      oauth:                          # all optional
        client_id: '...'              # skip DCR
        client_secret: '...'          # confidential client
        scope: 'read write'           # server-provided by default

Also passes oauth config dict through from mcp_tool.py (was passing
only server_name and url before).

E2E verified: full OAuth flow (401 → discovery → DCR → authorize →
token exchange → authenticated request → tokens persisted) against
local test servers. 23 unit tests + 186 MCP suite tests pass.
2026-04-05 22:08:00 -07:00
Teknium
4494fba140
feat: OSV malware check for MCP extension packages (#5305)
Before launching an MCP server via npx/uvx, queries the OSV (Open Source
Vulnerabilities) API to check if the package has known malware advisories
(MAL-* IDs). Regular CVEs are ignored — only confirmed malware is blocked.

- Free, public API (Google-maintained), ~300ms per query
- Runs once per MCP server launch, inside _run_stdio() before subprocess spawn
- Parallel with other MCP servers (asyncio.gather already in place)
- Fail-open: network errors, timeouts, unrecognized commands → allow
- Parses npm (scoped @scope/pkg@version) and PyPI (name[extras]==version)

Inspired by Block/goose extension malware check.
2026-04-05 12:46:07 -07:00
Teknium
cc54818d26
fix(mcp): stability fix pack — reload timeout, shutdown cleanup, event loop handler, OAuth non-blocking (#4757)
Four fixes for MCP server stability issues reported by community member
(terminal lockup, zombie processes, escape sequence pollution, startup hang):

1. MCP reload timeout guard (cli.py): _check_config_mcp_changes now runs
   _reload_mcp in a separate daemon thread with a 30s hard timeout. Previously,
   a hung MCP server could block the process_loop thread indefinitely, freezing
   the entire TUI (user can type but nothing happens, only Ctrl+D/Ctrl+\ work).

2. MCP stdio subprocess PID tracking (mcp_tool.py): Tracks child PIDs spawned
   by stdio_client via before/after snapshots of /proc children. On shutdown,
   _stop_mcp_loop force-kills any tracked PIDs that survived the SDK's graceful
   SIGTERM→SIGKILL cleanup. Prevents zombie MCP server processes from
   accumulating across sessions.

3. MCP event loop exception handler (mcp_tool.py): Installs
   _mcp_loop_exception_handler on the MCP background event loop — same pattern
   as the existing _suppress_closed_loop_errors on prompt_toolkit's loop.
   Suppresses benign 'Event loop is closed' RuntimeError from httpx transport
   __del__ during MCP shutdown. Salvaged from PR #2538 (acsezen).

4. MCP OAuth non-blocking (mcp_oauth.py): Replaces blocking input() call in
   _wait_for_callback with OAuthNonInteractiveError raise. Adds _is_interactive()
   TTY detection. In non-interactive environments, build_oauth_auth() still
   returns a provider (cached tokens + refresh work), but the callback handler
   raises immediately instead of blocking the MCP event loop for 120s. Re-raises
   OAuth setup failures in _run_http so failed servers are reported cleanly
   without blocking others. Salvaged from PRs #4521 (voidborne-d) and #4465
   (heathley).

Closes #2537, closes #4462
Related: #4128, #3436
2026-04-03 02:29:20 -07:00
Teknium
21c2d32471 fix(gateway): normalize step_callback prev_tools for backward compat
The PR changed prev_tools from list[str] to list[dict] with name/result
keys.  The gateway's _step_callback_sync passed this directly to hooks
as 'tool_names', breaking user-authored hooks that call
', '.join(tool_names).

Now:
- 'tool_names' always contains strings (backward-compatible)
- 'tools' carries the enriched dicts for hooks that want results

Also adds summary logging to register_mcp_servers() and comprehensive
tests for all three PR changes:
- sanitize_mcp_name_component edge cases
- register_mcp_servers public API
- _register_session_mcp_servers ACP integration
- step_callback result forwarding
- gateway normalization backward compat
2026-04-02 20:54:27 -07:00
Jack
9b2fb1cc2e feat(acp): register client-provided MCP servers as agent tools
ACP clients pass MCP server definitions in session/new, load_session,
resume_session, and fork_session. Previously these were accepted but
silently ignored — the agent never connected to them.

This wires the mcp_servers parameter into the existing MCP registration
pipeline (tools/mcp_tool.py) so client-provided servers are connected,
their tools discovered, and the agent's tool surface refreshed before
the first prompt.

Changes:

tools/mcp_tool.py:
- Extract sanitize_mcp_name_component() to replace all non-[A-Za-z0-9_]
  characters (fixes crash when server names contain / or other chars
  that violate provider tool-name validation rules)
- Use it in _convert_mcp_schema, _sync_mcp_toolsets, _build_utility_schemas
- Extract register_mcp_servers(servers: dict) as a public API that takes
  an explicit {name: config} map. discover_mcp_tools() becomes a thin
  wrapper that loads config.yaml and calls register_mcp_servers()

acp_adapter/server.py:
- Add _register_session_mcp_servers() which converts ACP McpServerStdio /
  McpServerHttp / McpServerSse objects to Hermes MCP config dicts,
  registers them via asyncio.to_thread (avoids blocking the ACP event
  loop), then rebuilds agent.tools, valid_tool_names, and invalidates
  the cached system prompt
- Call it from new_session, load_session, resume_session, fork_session

Tested with Eden (theproxycompany.com) as ACP client — 5 MCP servers
(HTTP + stdio) registered successfully, 110 tools available to the agent.
2026-04-02 20:54:27 -07:00
Teknium
d5d22fe7ba
feat(mcp): dynamic tool discovery via notifications/tools/list_changed (#3812)
When a connected MCP server sends a ToolListChangedNotification (per the
MCP spec), Hermes now automatically re-fetches the tool list, deregisters
removed tools, and registers new ones — without requiring a restart.

This enables MCP servers with dynamic toolsets (e.g. GitHub MCP with
GITHUB_DYNAMIC_TOOLSETS=1) to add/remove tools at runtime.

Changes:
- registry.py: add ToolRegistry.deregister() for nuke-and-repave refresh
- mcp_tool.py: extract _register_server_tools() from
  _discover_and_register_server() as a shared helper for both initial
  discovery and dynamic refresh
- mcp_tool.py: add _make_message_handler() and _refresh_tools() on
  MCPServerTask, wired into all 3 ClientSession sites (stdio, new HTTP,
  deprecated HTTP)
- Graceful degradation: silently falls back to static discovery when the
  MCP SDK lacks notification types or message_handler support
- 8 new tests covering registration, refresh, handler dispatch, and
  deregister

Salvaged from PR #1794 by shivvor2.
2026-03-29 15:52:54 -07:00
Teknium
3e1157080a
fix(tools): use non-deprecated streamable_http_client for MCP HTTP transport (#3646)
Switch MCP HTTP transport from the deprecated streamablehttp_client()
(mcp < 1.24.0) to the new streamable_http_client() API that accepts a
pre-built httpx.AsyncClient.

Changes vs the original PR #3391:
- Separate try/except imports so mcp < 1.24.0 doesn't break (graceful
  fallback to deprecated API instead of losing HTTP MCP entirely)
- Wrap httpx.AsyncClient in async-with for proper lifecycle management
  (the new SDK API explicitly skips closing caller-provided clients)
- Match SDK's own create_mcp_http_client defaults: follow_redirects=True,
  Timeout(connect_timeout, read=300.0)
- Keep deprecated code path as fallback for older SDK versions

Co-authored-by: HenkDz <HenkDz@users.noreply.github.com>
2026-03-28 18:20:49 -07:00
Teknium
be416cdfa9
fix: guard config.get() against YAML null values to prevent AttributeError (#3377)
dict.get(key, default) returns None — not the default — when the key IS
present but explicitly set to null/~ in YAML.  Calling .lower() on that
raises AttributeError.

Use (config.get(key) or fallback) so both missing keys and explicit nulls
coalesce to the intended default.

Files fixed:
- tools/tts_tool.py — _get_provider()
- tools/web_tools.py — _get_backend()
- tools/mcp_tool.py — MCPServerTask auth config
- trajectory_compressor.py — _detect_provider() and config loading

Co-authored-by: dieutx <dangtc94@gmail.com>
2026-03-27 04:03:00 -07:00
Teknium
0cfc1f88a3
fix: add MCP tool name collision protection (#3077)
- Registry now warns when a tool name is overwritten by a different
  toolset (silent dict overwrite was the previous behavior)
- MCP tool registration checks for collisions with non-MCP (built-in)
  tools before registering. If an MCP tool's prefixed name matches an
  existing built-in, the MCP tool is skipped and a warning is logged.
  MCP-to-MCP collisions are allowed (last server wins).
- Both regular MCP tools and utility tools (resources/prompts) are
  guarded.
- Adds 5 tests covering: registry overwrite warning, same-toolset
  re-registration silence, built-in collision skip, normal registration,
  and MCP-to-MCP collision pass-through.

Reported by k_sze (KONG) — MiniMax MCP server's web_search tool could
theoretically shadow Hermes's built-in web_search if prefixing failed.
2026-03-25 16:52:04 -07:00
Teknium
b7091f93b1
feat(cli): MCP server management CLI + OAuth 2.1 PKCE auth
Add hermes mcp add/remove/list/test/configure CLI for managing MCP
server connections interactively. Discovery-first 'add' flow connects,
discovers tools, and lets users select which to enable via curses checklist.

Add OAuth 2.1 PKCE authentication for MCP HTTP servers (RFC 7636).
Supports browser-based and manual (headless) authorization, token
caching with 0600 permissions, automatic refresh. Zero external deps.

Add ${ENV_VAR} interpolation in MCP server config values, resolved
from os.environ + ~/.hermes/.env at load time.

Core OAuth module from PR #2021 by @imnotdev25. CLI and mcp_tool
wiring rewritten against current main. Closes #497, #690.
2026-03-22 04:52:52 -07:00
hermes
4d2c93a04f fix: normalize MCP object schemas without properties 2026-03-19 16:23:45 -07:00
Test
4b53b89f09 feat(mcp): expose MCP servers as standalone toolsets
Each configured MCP server now registers as its own toolset in TOOLSETS
(e.g. TOOLSETS['github'] = {tools: ['mcp_github_list_files', ...]}),
making raw server names resolvable in platform_toolsets overrides.

Previously MCP tools were only injected into hermes-* umbrella toolsets,
so gateway sessions using raw toolset names like ['terminal', 'github']
in platform_toolsets couldn't resolve MCP tools.

Skips server names that collide with built-in toolsets. Also handles
idempotent reloads (syncs toolsets even when no new servers connect).

Inspired by PR #1876 by @kshitijk4poor.
Adds 2 tests (standalone toolset creation + built-in collision guard).
2026-03-18 03:04:17 -07:00
Teknium
ce7418e274
feat: interactive MCP tool configuration in hermes tools (#1694)
Add the ability to selectively enable/disable individual MCP server
tools through the interactive 'hermes tools' TUI.

Changes:
- tools/mcp_tool.py: Add probe_mcp_server_tools() — lightweight function
  that temporarily connects to configured MCP servers, discovers their
  tools (names + descriptions), and disconnects. No registry side effects.

- hermes_cli/tools_config.py: Add 'Configure MCP tools' option to the
  interactive menu. When selected:
  1. Probes all enabled MCP servers for their available tools
  2. Shows a per-server curses checklist with tool descriptions
  3. Pre-selects tools based on existing include/exclude config
  4. Writes changes back as tools.exclude entries in config.yaml
  5. Reports which servers failed to connect

The existing CLI commands (hermes tools enable/disable server:tool)
continue to work unchanged. This adds the interactive TUI counterpart
so users can browse and toggle MCP tools visually.

Tests: 22 new tests covering probe function edge cases and interactive
flow (pre-selection, exclude/include modes, description truncation,
multi-server handling, error paths).
2026-03-17 03:48:44 -07:00
teknium1
04e151714f feat(mcp): make selective tool loading capability-aware
Extend the salvaged MCP filtering work so utility tools are also governed by policy and server capabilities. Store the registered tool subset per server so rediscovery and status reporting stay accurate after filtering.
2026-03-14 06:22:02 -07:00
teyrebaz33
3198cc8fd9 feat(mcp): per-server tool filtering via include/exclude and enabled flag
Add optional config keys under each mcp_servers entry:
- tools.include: whitelist, only listed tools are registered
- tools.exclude: blacklist, all tools except listed are registered
- enabled: false: skip server entirely, no connection attempt

Backward-compatible: no config keys = all tools registered as before.

Tests: TestMCPSelectiveToolLoading (4 tests), 134 passed total.
2026-03-14 06:12:17 -07:00
Teknium
b646440ca0
fix(mcp): resolve npx stdio connection failures (#1291)
Salvaged from PR #977 onto current main.
Preserves the MCP stdio command resolution and improved error diagnostics,
with deterministic regression tests for the npx/node PATH cases.

Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-14 05:44:00 -07:00
teknium1
2192b17670 merge: resolve conflicts with origin/main
- gateway/run.py: Take main's _resolve_gateway_model() helper
- hermes_cli/setup.py: Re-apply nous-api removal after merge brought
  it back. Fix provider_idx offset (Custom is now index 3, not 4).
- tests/hermes_cli/test_setup.py: Fix custom setup test index (3→4)
2026-03-12 00:29:04 -07:00
teknium1
0aa31cd3cb feat: call_llm/async_call_llm + config slots + migrate all consumers
Add centralized call_llm() and async_call_llm() functions that own the
full LLM request lifecycle:
  1. Resolve provider + model from task config or explicit args
  2. Get or create a cached client for that provider
  3. Format request args (max_tokens handling, provider extra_body)
  4. Make the API call with max_tokens/max_completion_tokens retry
  5. Return the response

Config: expanded auxiliary section with provider:model slots for all
tasks (compression, vision, web_extract, session_search, skills_hub,
mcp, flush_memories). Config version bumped to 7.

Migrated all auxiliary consumers:
- context_compressor.py: uses call_llm(task='compression')
- vision_tools.py: uses async_call_llm(task='vision')
- web_tools.py: uses async_call_llm(task='web_extract')
- session_search_tool.py: uses async_call_llm(task='session_search')
- browser_tool.py: uses call_llm(task='vision'/'web_extract')
- mcp_tool.py: uses call_llm(task='mcp')
- skills_guard.py: uses call_llm(provider='openrouter')
- run_agent.py flush_memories: uses call_llm(task='flush_memories')

Tests updated for context_compressor and MCP tool. Some test mocks
still need updating (15 remaining failures from mock pattern changes,
2 pre-existing).
2026-03-11 20:52:19 -07:00
0xbyt4
4a8f23eddf fix: correctly track failed MCP server connections in discovery
_discover_one() caught all exceptions and returned [], making
asyncio.gather(return_exceptions=True) redundant. The
isinstance(result, Exception) branch in _discover_all() was dead
code, so failed_count was always 0. This caused:
- No summary printed when all servers fail (silent failure)
- ok_servers always equaling total_servers (misleading count)
- Unused variables transport_desc and transport_type

Fix: let exceptions propagate to gather() so failed_count increments
correctly. Move per-server failure logging to _discover_all(). Remove
dead variables.
2026-03-11 18:24:45 +03:00
0xbyt4
4e3a8a0637 fix: handle empty choices in MCP sampling callback
SamplingHandler.__call__ accessed response.choices[0] without checking
if the list was non-empty. LLM APIs can return empty choices on content
filtering, provider errors, or rate limits, causing an unhandled
IndexError that propagates to the MCP SDK and may crash the connection.

Add a defensive guard that returns a proper ErrorData when choices is
empty, None, or missing. Includes three test cases covering all
variants.
2026-03-10 02:24:53 +03:00
Teknium
654e16187e
feat(mcp): add sampling support — server-initiated LLM requests (#753)
Add MCP sampling/createMessage capability via SamplingHandler class.

Text-only sampling + tool use in sampling with governance (rate limits,
model whitelist, token caps, tool loop limits). Per-server audit metrics.

Based on concept from PR #366 by eren-karakus0. Restructured as class-based
design with bug fixes and tests using real MCP SDK types.

50 new tests, 2600 total passing.
2026-03-09 03:37:38 -07:00
teknium1
7df14227a9 feat(mcp): banner integration, /reload-mcp command, resources & prompts
Banner integration:
- MCP Servers section in CLI startup banner between Tools and Skills
- Shows each server with transport type, tool count, connection status
- Failed servers shown in red; section hidden when no MCP configured
- Summary line includes MCP server count
- Removed raw print() calls from discovery (banner handles display)

/reload-mcp command:
- New slash command in both CLI and gateway
- Disconnects all MCP servers, re-reads config.yaml, reconnects
- Reports what changed (added/removed/reconnected servers)
- Allows adding/removing MCP servers without restarting

Resources & Prompts support:
- 4 utility tools registered per server: list_resources, read_resource,
  list_prompts, get_prompt
- Exposes MCP Resources (data sources) and Prompts (templates) as tools
- Proper parameter schemas (uri for read_resource, name for get_prompt)
- Handles text and binary resource content
- 23 new tests covering schemas, handlers, and registration

Test coverage: 74 MCP tests total, 1186 tests pass overall.
2026-03-02 19:15:59 -08:00
teknium1
60effcfc44 fix(mcp): parallel discovery, user-visible logging, config validation
- Discovery is now parallel (asyncio.gather) instead of sequential,
  fixing the 60s shared timeout issue with multiple servers
- Startup messages use print() so users see connection status even
  with default log levels (the 'tools' logger is set to ERROR)
- Summary line shows total tools and failed servers count
- Validate conflicting config: warn if both 'url' and 'command' are
  present (HTTP takes precedence)
- Update TODO.md: mark MCP as implemented, list remaining work
- Add test for conflicting config detection (51 tests total)

All 1163 tests pass.
2026-03-02 19:02:28 -08:00
teknium1
64ff8f065b feat(mcp): add HTTP transport, reconnection, security hardening
Upgrades the MCP client implementation from PR #291 with:

- HTTP/Streamable HTTP transport: support 'url' key in config for remote
  MCP servers (Notion, Slack, Sentry, Supabase, etc.)
- Automatic reconnection with exponential backoff (1s-60s, 5 retries)
  when a server connection drops unexpectedly
- Environment variable filtering: only pass safe vars (PATH, HOME, etc.)
  plus user-specified env to stdio subprocesses (prevents secret leaks)
- Credential stripping: sanitize error messages before returning to the
  LLM (strips GitHub PATs, OpenAI keys, Bearer tokens, etc.)
- Configurable per-server timeouts: 'timeout' and 'connect_timeout' keys
- Fix shutdown race condition in servers_snapshot variable scoping

Test coverage: 50 tests (up from 30), including new tests for env
filtering, credential sanitization, HTTP config detection, reconnection
logic, and configurable timeouts.

All 1162 tests pass (1162 passed, 3 skipped, 0 failed).
2026-03-02 18:40:03 -08:00
0xbyt4
11a2ecb936 fix: resolve thread safety issues and shutdown deadlock in MCP client
- Add threading.Lock protecting all shared state (_servers, _mcp_loop, _mcp_thread)
- Fix deadlock in shutdown_mcp_servers: _stop_mcp_loop was called inside
  a _lock block but also acquires _lock (non-reentrant)
- Fix race condition in _ensure_mcp_loop with concurrent callers
- Change idempotency to per-server (retry failed servers, skip connected)
- Dynamic toolset injection via startswith("hermes-") instead of hardcoded list
- Parallel shutdown via asyncio.gather instead of sequential loop
- Add tests for partial failure retry, parallel shutdown, dynamic injection
2026-03-02 22:08:32 +03:00
0xbyt4
593c549bc4 fix: make discover_mcp_tools idempotent to prevent duplicate connections
When discover_mcp_tools() is called multiple times (e.g. direct call
then model_tools import), return existing tool names instead of opening
new connections that would orphan the previous ones.
2026-03-02 21:34:21 +03:00
0xbyt4
aa2ecaef29 fix: resolve orphan subprocess leak on MCP server shutdown
Refactor MCP connections from AsyncExitStack to task-per-server
architecture. Each server now runs as a long-lived asyncio Task
with `async with stdio_client(...)`, ensuring anyio cancel-scope
cleanup happens in the same Task that opened the connection.
2026-03-02 21:22:00 +03:00
0xbyt4
3c252ae44b feat: add MCP (Model Context Protocol) client support
Connect to external MCP servers via stdio transport, discover their tools
at startup, and register them into the hermes-agent tool registry.

- New tools/mcp_tool.py: config loading, server connection via background
  event loop, tool handler factories, discovery, and graceful shutdown
- model_tools.py: trigger MCP discovery after built-in tool imports
- cli.py: call shutdown_mcp_servers in _run_cleanup
- pyproject.toml: add mcp>=1.2.0 as optional dependency
- 27 unit tests covering config, schema conversion, handlers, registration,
  SDK interaction, toolset injection, graceful fallback, and shutdown

Config format (in ~/.hermes/config.yaml):
  mcp_servers:
    filesystem:
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
2026-03-02 21:03:14 +03:00