mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-08 03:01:47 +00:00

docs: pluggable surfaces coverage — model-provider guide, full plugin map, opt-in fix (#20749 )

* docs(providers): add model-provider-plugin authoring guide + fix stale refs

New docs:
- website/docs/developer-guide/model-provider-plugin.md — full authoring
  guide (directory layout, minimal example, ProviderProfile fields,
  overridable hooks, user overrides, api_mode selection, auth types,
  testing, pip distribution)
- Wired into website/sidebars.ts under 'Extending'
- Cross-references added in:
  - guides/build-a-hermes-plugin.md (tip block)
  - developer-guide/adding-providers.md
  - developer-guide/provider-runtime.md

User guide:
- user-guide/features/plugins.md: Plugin types table grows from 3 to 4
  with 'Model providers' row

Stale comment cleanup (providers/*.py → plugins/model-providers/<name>/):
- hermes_cli/main.py:_is_profile_api_key_provider docstring
- hermes_cli/doctor.py:_build_apikey_providers_list docstring
- hermes_cli/auth.py: PROVIDER_REGISTRY + alias auto-extension comments
- hermes_cli/models.py: CANONICAL_PROVIDERS auto-extension comment

AGENTS.md:
- Project-structure tree: added plugins/model-providers/ row
- New section: 'Model-provider plugins' explaining discovery, override
  semantics, PluginManager integration, kind auto-coerce heuristic

Verified: docusaurus build succeeds, new page renders, all 3 cross-links
resolve. 347/347 targeted tests pass (tests/providers/,
tests/hermes_cli/test_plugins.py, tests/hermes_cli/test_runtime_provider_resolution.py,
tests/run_agent/test_provider_parity.py).

* docs(plugins): add 'pluggable interfaces at a glance' maps to plugins.md + build-a-hermes-plugin

Devs landing on either the user-guide plugin page or the build-a-plugin
guide now get an upfront table of every distinct pluggable surface with
a link to the right authoring doc. Previously they'd have to read the
full general-plugin guide to discover that model providers / platforms
/ memory / context engines are separate systems.

user-guide/features/plugins.md:
- New 'Pluggable interfaces — where to go for each' section below the
  existing 4-kinds table
- 10 rows covering every register_* surface (tool, hook, slash command,
  CLI subcommand, skill, model provider, platform, memory, context
  engine, image-gen)
- Explicit note: TTS/STT are NOT plugin-extensible yet — documented
  with a pointer to the current config.yaml 'command providers' pattern
  and a note that register_tts_provider()/register_stt_provider() may
  come later

guides/build-a-hermes-plugin.md:
- New :::info 'Not sure which guide you need?' map at the top so devs
  see all pluggable interfaces before investing in this 737-line
  general-plugin walkthrough
- Existing bottom :::tip expanded to include platform adapters alongside
  model/memory/context plugins

Verified:
- All 8 cross-doc links in the new plugins.md table resolve in a
  docusaurus build (SUCCESS, no new broken links)
- TTS link corrected (features/voice → features/tts; latter exists)
- Pre-existing broken links/anchors (cron-script-only, llms.txt,
  adding-platform-adapters#step-by-step-checklist) are unchanged

* docs(plugins): correct TTS/STT pluggability - they ARE plugins (command-providers)

Previous commit incorrectly said TTS/STT 'aren't plugin-extensible'. They
are, via the config-driven command-provider pattern: any CLI that reads
text and writes audio (or vice versa for STT) is automatically a plugin
with zero Python. The tts.md docs cover this extensively and I missed it.

plugins.md:
- TTS row: 'Config-driven (not a Python plugin)', points at
  tts.md#custom-command-providers
- STT row: points at tts.md#voice-message-transcription-stt (STT docs
  live in tts.md despite the filename)
- Expanded note: TTS/STT use config-driven shell-command templates as
  their plugin surface (full tts.providers.<name> registry for TTS;
  HERMES_LOCAL_STT_COMMAND escape hatch for STT)
- Any CLI that reads/writes files is automatically a plugin; no Python
  register_* API needed
- Future register_tts_provider()/register_stt_provider() hooks mentioned
  as nice-to-have for SDK/streaming cases, not as the primary story

build-a-hermes-plugin.md:
- Same map update: TTS/STT rows explicit, footer note corrected

Verified:
- tts.md anchors (custom-command-providers, voice-message-transcription-stt)
  exist and resolve in docusaurus build (SUCCESS, no new broken links)

* docs(plugins): expand pluggable interfaces table with MCP / event hooks / shell hooks / skill taps

Broadened the scope beyond Python register_* hooks. Hermes has MULTIPLE
plugin-style extension surfaces; they're now all in one table instead of
being scattered across feature docs.

Added rows for:
- **MCP servers** — config.yaml mcp_servers.<name> auto-registers external
  tools from any MCP server. Huge extensibility surface, previously not
  linked from the plugin map.
- **Gateway event hooks** — drop HOOK.yaml + handler.py into
  ~/.hermes/hooks/<name>/ to fire on gateway:startup, session:*, agent:*,
  command:* events. Separate from Python plugin hooks.
- **Shell hooks** — hooks: block in config.yaml runs shell commands on
  events (notifications, auditing, etc.).
- **Skill sources (taps)** — hermes skills tap add <repo> to pull in new
  skill registries beyond the built-in sources.

Both docs updated:
- user-guide/features/plugins.md: table column renamed to 'How' (mixes
  Python API + config-driven + drop-in-dir surfaces accurately)
- guides/build-a-hermes-plugin.md: :::info map at top mirrors the new
  surfaces with a forward-link to the consolidated table

Note block rewritten: instead of singling out TTS/STT as the 'different
style' exception, now honestly describes that Hermes deliberately
supports three plugin styles — Python APIs, config-driven commands, and
drop-in manifest directories — and devs should pick the one that fits
their integration.

Not included (considered and rejected):
- Transport layer (register_transport) — internal, not user-facing
- Tool-call parsers — internal, VLLM phase-2 thing
- Cloud browser providers — hardcoded registry, not drop-in yet
- Terminal backends — hardcoded if/elif, not drop-in yet
- Skill sources (the ABC) — hardcoded list, only taps are user-extensible

Verified:
- All 5 new anchors resolve (gateway-event-hooks, shell-hooks, skills-hub,
  custom-command-providers, voice-message-transcription-stt)
- Docusaurus build SUCCESS, zero new broken links
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
  adding-platform-adapters#step-by-step-checklist)

* docs(plugins): cover every pluggable surface in both the overview and how-to

Both plugins.md and build-a-hermes-plugin.md now cover every extension
surface end-to-end (general plugin APIs, specialized plugin types,
config-driven surfaces) with concrete authoring patterns for each.

plugins.md:
- 'What plugins can do' table grows from 9 rows (general ctx.register_*
  only) to 14 rows covering register_platform, register_image_gen_provider,
  register_context_engine, MemoryProvider subclass, register_provider
  (model). Each row links to its full authoring guide.
- New 'Plugin sub-categories' section under Plugin Discovery explains
  how plugins/platforms/, plugins/image_gen/, plugins/memory/,
  plugins/context_engine/, plugins/model-providers/ are routed to
  different loaders: PluginManager vs the per-category own-loader
  systems.
- Explicit mention of user-override semantics at
  ~/.hermes/plugins/model-providers/ and ~/.hermes/plugins/memory/.

build-a-hermes-plugin.md:
- New '## Specialized plugin types' section (5 sub-sections):
  - Model provider plugins: ProviderProfile + plugin.yaml example,
    auto-wiring summary, link to full guide
  - Platform plugins: BasePlatformAdapter + register_platform() skeleton
  - Memory provider plugins: MemoryProvider subclass example
  - Context engine plugins: ContextEngine subclass example
  - Image-generation backends: ImageGenProvider + kind: backend example
- New '## Non-Python extension surfaces' section (5 sub-sections):
  - MCP servers: config.yaml mcp_servers.<name> example
  - Gateway event hooks: HOOK.yaml + handler.py example
  - Shell hooks: hooks: block in config.yaml example
  - Skill sources (taps): hermes skills tap add example
  - TTS / STT command templates: tts.providers.<name> with type: command
- Distribute via pip / NixOS promoted from ### to ## (they were orphaned
  after the reorganization)

Each specialized / non-Python section has a concrete, copy-pasteable
example plus a 'Full guide:' link to the authoritative doc. Devs arriving
at the build-a-hermes-plugin guide now see every extension surface at
their disposal, not just the general tool/hook/slash-command surface.

Verified:
- Docusaurus build SUCCESS, zero new broken links
- All new cross-links (developer-guide/model-provider-plugin,
  adding-platform-adapters, memory-provider-plugin, context-engine-plugin,
  user-guide/features/mcp, skills#skills-hub, hooks#gateway-event-hooks,
  hooks#shell-hooks, tts#custom-command-providers,
  tts#voice-message-transcription-stt) resolve
- Same 3 pre-existing broken links on main (cron-script-only, llms.txt,
  adding-platform-adapters#step-by-step-checklist)

* docs(plugins): fix opt-in inconsistency — not every plugin is gated

The 'Every plugin is disabled by default' statement was wrong. Several
plugin categories intentionally bypass plugins.enabled:

- Bundled platform plugins (IRC, Teams) auto-load so shipped gateway
  channels are available out of the box. Activation per channel is via
  gateway.platforms.<name>.enabled.
- Bundled backends (plugins/image_gen/*) auto-load so the default
  backend 'just works'. Selection via <category>.provider config.
- Memory providers are all discovered; one is active via memory.provider.
- Context engines are all discovered; one is active via context.engine.
- Model providers: all 33 discovered at first get_provider_profile();
  user picks via --provider / config.

The plugins.enabled allow-list specifically gates:
- Standalone plugins (general tools/hooks/slash commands)
- User-installed backends
- User-installed platforms (third-party gateway adapters)
- Pip entry-point backends

Which matches the actual code in hermes_cli/plugins.py:737 where the
bundled+backend/platform check bypasses the allow-list.

Rewrote '## Plugins are opt-in' to:
- Retitle to 'Plugins are opt-in (with a few exceptions)'
- Narrow opening claim to 'General plugins and user-installed backends
  are disabled by default'
- Added 'What the allow-list does NOT gate' subsection with a full
  table of which bypass the gate and how they're activated instead
- Fixed migration section wording (bundled platform/backend plugins
  never needed grandfathering)

Verified: docusaurus build SUCCESS, zero new broken links.

2026-05-06 07:24:42 -07:00

| sidebar_position | title | description |
|---|---|---|
| 10 | Model Provider Plugins | How to build a model provider (inference backend) plugin for Hermes Agent |

Building a Model Provider Plugin

Model provider plugins declare an inference backend — an OpenAI-compatible endpoint, an Anthropic Messages server, a Codex-style Responses API, or a Bedrock-native surface — that Hermes can route AIAgent calls through. Every built-in provider (OpenRouter, Anthropic, GMI, DeepSeek, Nvidia, …) ships as one of these plugins. Third parties can add their own by dropping a directory under $HERMES_HOME/plugins/model-providers/ with zero changes to the repo.

:::tip
Model provider plugins are the third kind of provider plugin. The others are Memory Provider Plugins (cross-session knowledge) and Context Engine Plugins (context compression strategies). All three follow the same "drop a directory, declare a profile, no repo edits" pattern.
:::

How discovery works

providers/__init__.py._discover_providers() runs lazily the first time any code calls get_provider_profile() or list_providers(). Discovery order:

1. Bundled plugins: <repo>/plugins/model-providers/<name>/ (ship with Hermes)
2. User plugins: $HERMES_HOME/plugins/model-providers/<name>/ (drop in any directory; it is picked up by the next session)
3. Legacy single-file: <repo>/providers/<name>.py (back-compat for out-of-tree editable installs)

User plugins override bundled plugins of the same name because register_provider() is last-writer-wins. Drop a $HERMES_HOME/plugins/model-providers/gmi/ directory to replace the built-in GMI profile without touching the repo.
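
The override rule above is ordinary dictionary behavior: whichever registration runs last owns the name. The sketch below is a minimal illustration, not the Hermes implementation. `Profile`, `register`, and `get` are hypothetical stand-ins for `ProviderProfile`, `register_provider()`, and `get_provider_profile()`, and the URLs are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical stand-in for ProviderProfile, reduced to three fields."""
    name: str
    base_url: str
    aliases: tuple[str, ...] = ()


_REGISTRY: dict[str, Profile] = {}
_ALIASES: dict[str, str] = {}


def register(profile: Profile) -> None:
    # Last writer wins: a later call with the same name replaces the earlier one.
    _REGISTRY[profile.name] = profile
    for alias in profile.aliases:
        _ALIASES[alias] = profile.name


def get(name: str) -> Profile | None:
    # Resolve an alias to its canonical name first, then look up the profile.
    return _REGISTRY.get(_ALIASES.get(name, name))


# Simulated discovery order: the bundled plugin registers first, the user plugin second.
register(Profile("gmi", "https://bundled.example.com/v1", aliases=("gmi-cloud",)))
register(Profile("gmi", "https://staging.example.com/v1", aliases=("gmi-cloud",)))
```

After both calls, `get("gmi")` and `get("gmi-cloud")` both return the second (user) profile, which is why discovery order matters.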

Directory structure

plugins/model-providers/my-provider/
├── __init__.py       # Calls register_provider(profile) at module-level
├── plugin.yaml       # kind: model-provider + metadata (optional but recommended)
└── README.md         # Setup instructions (optional)

The only required file is __init__.py. plugin.yaml is used by hermes plugins for introspection and by the general PluginManager to route the plugin to the right loader; without it, the general loader falls back to a source-text heuristic.

Minimal example — a simple API-key provider

# plugins/model-providers/acme-inference/__init__.py
from providers import register_provider
from providers.base import ProviderProfile

acme = ProviderProfile(
    name="acme-inference",
    aliases=("acme",),
    display_name="Acme Inference",
    description="Acme — OpenAI-compatible direct API",
    signup_url="https://acme.example.com/keys",
    env_vars=("ACME_API_KEY", "ACME_BASE_URL"),
    base_url="https://api.acme.example.com/v1",
    auth_type="api_key",
    default_aux_model="acme-small-fast",
    fallback_models=(
        "acme-large-v3",
        "acme-medium-v3",
        "acme-small-fast",
    ),
)

register_provider(acme)

# plugins/model-providers/acme-inference/plugin.yaml
name: acme-inference
kind: model-provider
version: 1.0.0
description: Acme Inference — OpenAI-compatible direct API
author: Your Name

That's it. After dropping these two files, the following auto-wire with no other edits:

| Integration | Where | What it gets |
|---|---|---|
| Credential resolution | `hermes_cli/auth.py` | `PROVIDER_REGISTRY["acme-inference"]` populated from profile |
| `--provider` CLI flag | `hermes_cli/main.py` | Accepts `acme-inference` |
| `hermes model` picker | `hermes_cli/models.py` | Appears in `CANONICAL_PROVIDERS`, model list fetched from `{base_url}/models` |
| `hermes doctor` | `hermes_cli/doctor.py` | Health check for `ACME_API_KEY` + `{base_url}/models` probe |
| `hermes setup` | `hermes_cli/config.py` | `ACME_API_KEY` appears in `OPTIONAL_ENV_VARS` and the setup wizard |
| URL reverse-mapping | `agent/model_metadata.py` | Hostname → provider name for auto-detection |
| Auxiliary model | `agent/auxiliary_client.py` | Uses `default_aux_model` for compression / summarization |
| Runtime resolution | `hermes_cli/runtime_provider.py` | Returns correct `base_url`, `api_key`, `api_mode` |
| Transport | `agent/transports/chat_completions.py` | Profile path generates kwargs via `prepare_messages` / `build_extra_body` / `build_api_kwargs_extras` |

ProviderProfile fields

Full definition in providers/base.py. The most useful ones:

| Field | Type | Purpose |
|---|---|---|
| `name` | `str` | Canonical id; matches `--provider` choices and `HERMES_INFERENCE_PROVIDER` |
| `aliases` | `tuple[str, ...]` | Alternative names resolved by `get_provider_profile()` (e.g. `grok` → `xai`) |
| `api_mode` | `str` | `chat_completions` \| `codex_responses` \| `anthropic_messages` \| `bedrock_converse` |
| `display_name` | `str` | Human label shown in `hermes model` picker |
| `description` | `str` | Picker subtitle |
| `signup_url` | `str` | Shown during first-run setup ("get an API key here") |
| `env_vars` | `tuple[str, ...]` | API-key env vars in priority order; a final `*_BASE_URL` entry is used as the user base-URL override |
| `base_url` | `str` | Default inference endpoint |
| `models_url` | `str` | Explicit catalog URL (falls back to `{base_url}/models`) |
| `auth_type` | `str` | `api_key` \| `oauth_device_code` \| `oauth_external` \| `copilot` \| `aws_sdk` \| `external_process` |
| `fallback_models` | `tuple[str, ...]` | Curated list shown when live catalog fetch fails |
| `default_headers` | `dict[str, str]` | Sent on every request (e.g. Copilot's `Editor-Version`) |
| `fixed_temperature` | `Any` | `None` = use caller's value; `OMIT_TEMPERATURE` sentinel = don't send temperature at all (Kimi) |
| `default_max_tokens` | `int \| None` | Provider-level max_tokens cap (Nvidia: 16384) |
| `default_aux_model` | `str` | Cheap model for auxiliary tasks (compression, vision, summarization) |
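
The `env_vars` convention packs two things into one tuple: API-key variables in priority order, and an optional trailing `*_BASE_URL` override. The sketch below is one plausible reading of that convention, not the code in `hermes_cli/auth.py`. The function name `resolve_key_and_url` and the choice to skip empty values are assumptions.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional, Sequence


def resolve_key_and_url(
    env_vars: Sequence[str],
    default_base_url: str,
    environ: Optional[Mapping[str, str]] = None,
) -> tuple[Optional[str], str]:
    """Return (api_key, base_url) following the env_vars convention."""
    environ = os.environ if environ is None else environ
    names = list(env_vars)
    base_url = default_base_url

    # Assumption: only a *final* entry ending in _BASE_URL is the URL override.
    if names and names[-1].endswith("_BASE_URL"):
        override = environ.get(names.pop(), "").strip()
        if override:
            base_url = override

    # The remaining names are API-key variables; the first non-empty one wins.
    api_key = next((environ[name] for name in names if environ.get(name)), None)
    return api_key, base_url
```

With `env_vars=("ACME_API_KEY", "ACME_BASE_URL")`, setting only `ACME_API_KEY` keeps the profile's default endpoint, and setting `ACME_BASE_URL` as well redirects requests without touching the plugin.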

Overridable hooks

Subclass ProviderProfile for non-trivial quirks:

from typing import Any
from providers.base import ProviderProfile

class AcmeProfile(ProviderProfile):
    def prepare_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
        """Provider-specific message preprocessing. Runs after codex
        sanitization, before developer-role swap. Default: pass-through."""
        # Example: Qwen normalizes plain-text content to a list-of-parts
        # array and injects cache_control; Kimi rewrites tool-call JSON
        return messages

    def build_extra_body(self, *, session_id=None, **context) -> dict:
        """Provider-specific extra_body fields merged into the API call.
        Context includes: session_id, provider_preferences, model, base_url,
        reasoning_config. Default: empty dict."""
        # Example: OpenRouter's provider-preferences block,
        # Gemini's thinking_config translation.
        return {}

    def build_api_kwargs_extras(self, *, reasoning_config=None, **context):
        """Returns (extra_body_additions, top_level_kwargs). Needed when some
        fields go top-level (Kimi's reasoning_effort) and some go in extra_body
        (OpenRouter's reasoning dict). Default: ({}, {})."""
        return {}, {}

    def fetch_models(self, *, api_key=None, timeout=8.0) -> list[str] | None:
        """Live catalog fetch. Default hits models_url, or {base_url}/models
        when models_url is unset, with Bearer auth. Override for: custom auth
        (Anthropic), no REST endpoint (Bedrock → None), or
        public/unauthenticated catalogs (OpenRouter)."""
        return super().fetch_models(api_key=api_key, timeout=timeout)
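
To see how the three request-shaping hooks fit together, it can help to picture the caller. The sketch below is a hypothetical transport helper, not the code in `agent/transports/chat_completions.py`. `DemoProfile` and `build_request_kwargs` are illustrative names, and the merge order is an assumption based on the docstrings above.

```python
from typing import Any


class DemoProfile:
    """Hypothetical profile exposing the three hook names described above."""

    def prepare_messages(self, messages):
        return messages

    def build_extra_body(self, *, session_id=None, **context):
        # Illustrative only: forward the session id inside extra_body.
        return {"session": session_id} if session_id else {}

    def build_api_kwargs_extras(self, *, reasoning_config=None, **context):
        if not reasoning_config:
            return {}, {}
        # Illustrative only: expose the effort level as a top-level kwarg.
        return {}, {"reasoning_effort": reasoning_config.get("effort", "medium")}


def build_request_kwargs(profile, model, messages, *, session_id=None, reasoning_config=None):
    """Assemble request kwargs by calling the hooks in sequence."""
    kwargs: dict[str, Any] = {
        "model": model,
        "messages": profile.prepare_messages(messages),
    }
    extra_body = dict(
        profile.build_extra_body(
            session_id=session_id, model=model, reasoning_config=reasoning_config
        )
    )
    body_additions, top_level = profile.build_api_kwargs_extras(
        reasoning_config=reasoning_config, model=model, session_id=session_id
    )
    extra_body.update(body_additions)  # fields that belong inside extra_body
    kwargs.update(top_level)           # fields that belong at the top level
    if extra_body:
        kwargs["extra_body"] = extra_body
    return kwargs
```

The two-tuple returned by `build_api_kwargs_extras` is what lets one provider place a field at the top level while another nests its equivalent under `extra_body`.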

Hook reference examples

Look at these bundled plugins for idioms:

| Plugin | Why look |
|---|---|
| `plugins/model-providers/openrouter/` | Aggregator with provider preferences, public model catalog |
| `plugins/model-providers/gemini/` | `thinking_config` translation (native + OpenAI-compat nested forms) |
| `plugins/model-providers/kimi-coding/` | `OMIT_TEMPERATURE`, `extra_body.thinking`, top-level `reasoning_effort` |
| `plugins/model-providers/qwen-oauth/` | Message normalization, `cache_control` injection, VL high-res |
| `plugins/model-providers/nous/` | Attribution tags, "omit reasoning when disabled" |
| `plugins/model-providers/custom/` | Ollama `num_ctx` + `think: false` quirks |
| `plugins/model-providers/bedrock/` | `api_mode="bedrock_converse"`, `fetch_models` returns None (no REST endpoint) |
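
The `OMIT_TEMPERATURE` idiom relies on a sentinel object so that "no opinion" (`None`) and "never send this field" stay distinguishable. The sketch below illustrates the pattern with a locally defined sentinel. It does not import the real one from `providers/base.py`, and treating any other value as a forced temperature is an assumption drawn from the field name `fixed_temperature`.

```python
# Hypothetical stand-in for the OMIT_TEMPERATURE sentinel named in the field table.
OMIT_TEMPERATURE = object()


def apply_temperature(kwargs, caller_temperature, fixed_temperature=None):
    """Return a copy of kwargs with the temperature rule applied."""
    out = dict(kwargs)
    if fixed_temperature is OMIT_TEMPERATURE:
        # The provider rejects the field entirely, so never send it.
        out.pop("temperature", None)
    elif fixed_temperature is not None:
        # Assumption: any other non-None value overrides the caller's choice.
        out["temperature"] = fixed_temperature
    elif caller_temperature is not None:
        # None on the profile means "use the caller's value".
        out["temperature"] = caller_temperature
    return out
```

Comparing with `is` rather than `==` is what makes the sentinel safe: no legitimate temperature value can be mistaken for it.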

User overrides — replace a built-in without editing the repo

Say you want to point gmi at your private staging endpoint for testing. Create ~/.hermes/plugins/model-providers/gmi/__init__.py:

from providers import register_provider
from providers.base import ProviderProfile

register_provider(ProviderProfile(
    name="gmi",
    aliases=("gmi-cloud", "gmicloud"),
    env_vars=("GMI_API_KEY",),
    base_url="https://gmi-staging.internal.example.com/v1",
    auth_type="api_key",
    default_aux_model="google/gemini-3.1-flash-lite-preview",
))

Next session, get_provider_profile("gmi").base_url returns the staging URL. No repo patch, no rebuild. Because user plugins are discovered after bundled ones, the user register_provider() call wins.

api_mode selection

Four values are recognized. Hermes checks the following sources in order and uses the first one that yields a value:

1. User explicit override (config.yaml model.api_mode when set)
2. OpenCode's per-model dispatch (opencode_model_api_mode for Zen and Go)
3. URL auto-detection: /anthropic suffix → anthropic_messages, api.openai.com → codex_responses, api.x.ai → codex_responses, /coding on Kimi domains → chat_completions
4. Profile api_mode as a fallback when URL detection finds nothing
5. Default chat_completions

Set profile.api_mode to match the default your provider ships — it acts as a hint. User URL overrides still win.
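
The precedence described in this section can be sketched as a small function. This is an illustration of the ordering only, not the resolver in `hermes_cli/runtime_provider.py`. The function names are hypothetical, and because the text does not list the Kimi domains, they are passed in as the assumed `kimi_hosts` parameter. Matching `/coding` anywhere in the path is likewise an assumption.

```python
from __future__ import annotations

from typing import Optional
from urllib.parse import urlparse


def detect_from_url(base_url: str, kimi_hosts: tuple[str, ...] = ()) -> Optional[str]:
    """URL auto-detection rules listed above; returns None when nothing matches."""
    parsed = urlparse(base_url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.rstrip("/")
    if path.endswith("/anthropic"):
        return "anthropic_messages"
    if host in ("api.openai.com", "api.x.ai"):
        return "codex_responses"
    if host in kimi_hosts and "/coding" in path:
        return "chat_completions"
    return None


def pick_api_mode(
    base_url: str,
    *,
    user_api_mode: Optional[str] = None,
    per_model_api_mode: Optional[str] = None,
    profile_api_mode: Optional[str] = None,
    kimi_hosts: tuple[str, ...] = (),
) -> str:
    """Walk the sources in priority order and return the first value found."""
    if user_api_mode:
        return user_api_mode
    if per_model_api_mode:
        return per_model_api_mode
    detected = detect_from_url(base_url, kimi_hosts)
    if detected:
        return detected
    return profile_api_mode or "chat_completions"
```

Note that the profile's `api_mode` is consulted only after URL detection comes up empty, which is why the guide calls it a hint rather than a guarantee.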

Auth types

| `auth_type` | Meaning | Who uses it |
|---|---|---|
| `api_key` | Single env var carries a static API key | Most providers |
| `oauth_device_code` | Device-code OAuth flow | - |
| `oauth_external` | User signs in elsewhere, tokens land in `auth.json` | Anthropic OAuth, MiniMax OAuth, Gemini Cloud Code, Qwen Portal, Nous Portal |
| `copilot` | GitHub Copilot token refresh cycle | `copilot` plugin only |
| `aws_sdk` | AWS SDK credential chain (IAM role, profile, env) | `bedrock` plugin only |
| `external_process` | Auth handled by a subprocess the agent spawns | `copilot-acp` plugin only |

auth_type gates which codepaths treat your provider as a "simple api-key provider" — if it's not api_key, the PluginManager still records the manifest but Hermes' CLI-level automation (doctor checks, --provider flag, setup wizard delegation) may skip over it.

Discovery timing

Provider discovery is lazy — triggered by the first get_provider_profile() or list_providers() call in the process. In practice this happens early at startup (auth.py module load extends PROVIDER_REGISTRY eagerly). If you need to verify your plugin loaded, run:

hermes doctor

A successfully loaded auth_type="api_key" profile appears under the Provider Connectivity section with a /models probe.

For programmatic inspection:

from providers import list_providers
for p in list_providers():
    print(p.name, p.base_url, p.api_mode)

Testing your plugin

Point HERMES_HOME at a temp directory so you don't pollute your real config:

export HERMES_HOME=/tmp/hermes-plugin-test
mkdir -p "$HERMES_HOME/plugins/model-providers/my-provider"
cat > "$HERMES_HOME/plugins/model-providers/my-provider/__init__.py" <<'EOF'
from providers import register_provider
from providers.base import ProviderProfile
register_provider(ProviderProfile(
    name="my-provider",
    env_vars=("MY_API_KEY",),
    base_url="https://api.my-provider.example.com/v1",
    auth_type="api_key",
))
EOF

export MY_API_KEY=your-test-key
hermes -z "hello" --provider my-provider -m some-model
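
The same isolation can be set up from Python, which is convenient inside a test fixture. The sketch below only writes the plugin file into a throwaway directory and prepares an environment mapping. It does not invoke Hermes, and the helper name `scaffold_plugin` is hypothetical.

```python
import os
import tempfile
from pathlib import Path

# Same plugin body as the shell heredoc above.
PLUGIN_SOURCE = '''\
from providers import register_provider
from providers.base import ProviderProfile
register_provider(ProviderProfile(
    name="my-provider",
    env_vars=("MY_API_KEY",),
    base_url="https://api.my-provider.example.com/v1",
    auth_type="api_key",
))
'''


def scaffold_plugin(home: Path, name: str, source: str) -> Path:
    """Write <home>/plugins/model-providers/<name>/__init__.py and return its path."""
    plugin_dir = home / "plugins" / "model-providers" / name
    plugin_dir.mkdir(parents=True, exist_ok=True)
    init_file = plugin_dir / "__init__.py"
    init_file.write_text(source, encoding="utf-8")
    return init_file


# A fresh temp directory keeps the experiment away from the real ~/.hermes.
home = Path(tempfile.mkdtemp(prefix="hermes-plugin-test-"))
init_file = scaffold_plugin(home, "my-provider", PLUGIN_SOURCE)

# Environment for a child process; the parent environment is left untouched.
env = dict(os.environ, HERMES_HOME=str(home), MY_API_KEY="your-test-key")
```

Passing `env` to `subprocess.run([...], env=env)` with the `hermes` command line shown above would then exercise the plugin without modifying the current shell.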

General PluginManager integration

The general PluginManager (the thing hermes plugins operates on) sees model-provider plugins but does not import them — providers/__init__.py owns their lifecycle. The manager records the manifest for introspection and categorizes by kind: model-provider. When you drop an unlabeled user plugin into $HERMES_HOME/plugins/ that happens to call register_provider with a ProviderProfile, the manager auto-coerces it to kind: model-provider via a source-text heuristic — so the plugin still routes correctly even without plugin.yaml.
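
A source-text heuristic of this kind can be as simple as two pattern checks. The sketch below is a guess at the general shape, not the rule in `hermes_cli/plugins.py`. The function name `guess_kind`, the regular expressions, and the `None` result for unrecognized sources are all assumptions made for illustration.

```python
import re

# Assumed signals: a register_provider(...) call plus a ProviderProfile reference.
_REGISTER_CALL = re.compile(r"\bregister_provider\s*\(")
_PROFILE_REF = re.compile(r"\bProviderProfile\b")


def guess_kind(source, declared_kind=None):
    """Return the manifest kind if declared, else infer it from source text."""
    if declared_kind:
        # An explicit plugin.yaml kind always takes priority over guessing.
        return declared_kind
    if _REGISTER_CALL.search(source) and _PROFILE_REF.search(source):
        return "model-provider"
    return None  # no opinion; the caller falls back to its default handling
```

Because a text match can misfire on comments or unrelated helpers that share a name, declaring `kind: model-provider` in plugin.yaml remains the more reliable route.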

Distribute via pip

Like any Hermes plugin, model providers can ship as a pip package. Add an entry point to your pyproject.toml:

[project.entry-points."hermes.plugins"]
acme-inference = "acme_hermes_plugin:register"

…where acme_hermes_plugin:register is a function that calls register_provider(profile). The general PluginManager picks up entry-point plugins during discover_and_load(). For kind: model-provider pip plugins, you still need to declare the kind in your manifest (or rely on the source-text heuristic).

See Building a Hermes Plugin for the full entry-points setup.
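
To confirm that an installed package actually exposes its entry point, the standard library can list the group directly. The sketch below uses `importlib.metadata.entry_points` with the `hermes.plugins` group name from the pyproject.toml snippet above. The function name `find_hermes_entry_points` is hypothetical, and this only inspects packaging metadata without loading any plugin.

```python
from importlib.metadata import entry_points


def find_hermes_entry_points(group="hermes.plugins"):
    """Map entry-point name -> 'module:attr' target for the given group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}
```

If the package from the example were installed, the returned mapping would be expected to contain an `acme-inference` key pointing at `acme_hermes_plugin:register`. An empty mapping means no installed distribution declares the group.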

Provider Runtime — resolution precedence + where each layer reads the profile
Adding Providers — end-to-end checklist for new inference backends (covers both the fast plugin path and the full CLI/auth integration)
Memory Provider Plugins
Context Engine Plugins
Building a Hermes Plugin — general plugin authoring
