---
sidebar_position: 5
title: Adding Providers
description: How to add a new inference provider to Hermes Agent — auth, runtime resolution, CLI flows, adapters, tests, and docs
---

# Adding Providers

Hermes can already talk to any OpenAI-compatible endpoint through the custom provider path. Do not add a built-in provider unless you want first-class UX for that service:

- provider-specific auth or token refresh
- a curated model catalog
- `setup` / `hermes model` menu entries
- provider aliases for `provider:model` syntax
- a non-OpenAI API shape that needs an adapter

If the provider is just "another OpenAI-compatible base URL and API key", a named custom provider may be enough.

## The mental model

A built-in provider has to line up across a few layers:

1. `hermes_cli/auth.py` decides how credentials are found.
2. `hermes_cli/runtime_provider.py` turns that into runtime data:
   - `provider`
   - `api_mode`
   - `base_url`
   - `api_key`
   - `source`
3. `run_agent.py` uses `api_mode` to decide how requests are built and sent.
4. `hermes_cli/models.py` and `hermes_cli/main.py` make the provider show up in the CLI. (`hermes_cli/setup.py` delegates to `main.py` automatically — no changes needed there.)
5. `agent/auxiliary_client.py` and `agent/model_metadata.py` keep side tasks and token budgeting working.

The important abstraction is `api_mode`.

- Most providers use `chat_completions`.
- Codex uses `codex_responses`.
- Anthropic uses `anthropic_messages`.
- A new non-OpenAI protocol usually means adding a new adapter and a new `api_mode` branch.

## Choose the implementation path first

### Path A — OpenAI-compatible provider

Use this when the provider accepts standard chat-completions style requests.

Typical work:

- add auth metadata
- add model catalog / aliases
- add runtime resolution
- add CLI menu wiring
- add aux-model defaults
- add tests and user docs

You usually do not need a new adapter or a new `api_mode`.

### Path B — Native provider

Use this when the provider does not behave like OpenAI chat completions.

Examples in-tree today:

- `codex_responses`
- `anthropic_messages`

This path includes everything from Path A plus:

- a provider adapter in `agent/`
- `run_agent.py` branches for request building, dispatch, usage extraction, interrupt handling, and response normalization
- adapter tests

## File checklist

### Required for every built-in provider

1. `hermes_cli/auth.py`
2. `hermes_cli/models.py`
3. `hermes_cli/runtime_provider.py`
4. `hermes_cli/main.py`
5. `agent/auxiliary_client.py`
6. `agent/model_metadata.py`
7. tests
8. user-facing docs under `website/docs/`

:::tip
`hermes_cli/setup.py` does not need changes. The setup wizard delegates provider/model selection to `select_provider_and_model()` in `main.py` — any provider added there is automatically available in `hermes setup`.
:::

### Additional for native / non-OpenAI providers

1. `agent/<provider>_adapter.py`
2. `run_agent.py`
3. `pyproject.toml` if a provider SDK is required

## Fast path: Simple API-key providers

If your provider is just an OpenAI-compatible endpoint that authenticates with a single API key, you do not need to touch `auth.py`, `runtime_provider.py`, `main.py`, or any of the other files in the full checklist below.

All you need is:

1. A plugin directory under `plugins/model-providers/<your-provider>/` containing:
   - `__init__.py` — calls `register_provider(profile)` at module level
   - `plugin.yaml` — manifest (`name`, `kind: model-provider`, `version`, `description`)
2. That's it. Provider plugins auto-load the first time anything calls `get_provider_profile()` or `list_providers()` — bundled plugins (this repo) and user plugins at `$HERMES_HOME/plugins/model-providers/` both get picked up.

When you add a plugin and it calls `register_provider()`, the following wire up automatically:

1. `PROVIDER_REGISTRY` entry in `auth.py` (credential resolution, env-var lookup)
2. `api_mode` set to `chat_completions`
3. `base_url` sourced from the config or the declared env var
4. `env_vars` checked in priority order for the API key
5. `fallback_models` list registered for the provider
6. `--provider` CLI flag accepts the provider id
7. `hermes model` menu includes the provider
8. `hermes setup` wizard delegates to `main.py` automatically
9. `provider:model` alias syntax works
10. Runtime resolver returns the correct `base_url` and `api_key`
11. `HERMES_INFERENCE_PROVIDER` env-var override accepts the provider id
12. Fallback-model activation can switch into the provider cleanly

User plugins at `$HERMES_HOME/plugins/model-providers/<name>/` override bundled plugins of the same name (last-writer-wins in `register_provider()`) — so third parties can monkey-patch or replace any built-in profile without editing the repo.

See `plugins/model-providers/nvidia/` or `plugins/model-providers/gmi/` as a template, and the full Model Provider Plugin guide for field reference, hook idioms, and end-to-end examples.

## Full path: OAuth and complex providers

Use the full checklist below when your provider needs any of the following:

- OAuth or token refresh (Nous Portal, Codex, Google Gemini, Qwen Portal, Copilot)
- A non-OpenAI API shape that requires a new adapter (Anthropic Messages, Codex Responses)
- Custom endpoint detection or multi-region probing (z.ai, Kimi)
- A curated static model catalog or live `/models` fetch
- Provider-specific `hermes model` menu entries with bespoke auth flows

## Step 1: Pick one canonical provider id

Choose a single provider id and use it everywhere.

Examples from the repo:

- `openai-codex`
- `kimi-coding`
- `minimax-cn`

That same id should appear in:

- `PROVIDER_REGISTRY` in `hermes_cli/auth.py`
- `_PROVIDER_LABELS` in `hermes_cli/models.py`
- `_PROVIDER_ALIASES` in both `hermes_cli/auth.py` and `hermes_cli/models.py`
- CLI `--provider` choices in `hermes_cli/main.py`
- setup / model selection branches
- auxiliary-model defaults
- tests

If the id differs between those files, the provider will feel half-wired: auth may work while `/model`, setup, or runtime resolution silently misses it.

## Step 2: Add auth metadata in `hermes_cli/auth.py`

For API-key providers, add a `ProviderConfig` entry to `PROVIDER_REGISTRY` with:

- `id`
- `name`
- `auth_type="api_key"`
- `inference_base_url`
- `api_key_env_vars`
- optional `base_url_env_var`

Also add aliases to `_PROVIDER_ALIASES`.

Use the existing providers as templates:

- simple API-key path: Z.AI, MiniMax
- API-key path with endpoint detection: Kimi, Z.AI
- native token resolution: Anthropic
- OAuth / auth-store path: Nous, OpenAI Codex

Questions to answer here:

- What env vars should Hermes check, and in what priority order?
- Does the provider need base-URL overrides?
- Does it need endpoint probing or token refresh?
- What should the auth error say when credentials are missing?

If the provider needs something more than "look up an API key", add a dedicated credential resolver instead of shoving logic into unrelated branches.
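A minimal sketch of the field shape listed above, using a stand-in dataclass (the real `ProviderConfig` lives in `hermes_cli/auth.py` and may differ in detail; the `acme` provider and its env vars are hypothetical):

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Illustrative stand-in mirroring the ProviderConfig fields described
# in this step; not the real class from hermes_cli/auth.py.
@dataclass
class ProviderConfig:
    id: str
    name: str
    auth_type: str
    inference_base_url: str
    api_key_env_vars: list[str] = field(default_factory=list)
    base_url_env_var: str | None = None

acme = ProviderConfig(
    id="acme",
    name="Acme AI",
    auth_type="api_key",
    inference_base_url="https://api.acme.example/v1",
    api_key_env_vars=["ACME_API_KEY", "ACME_KEY"],  # checked in priority order
    base_url_env_var="ACME_BASE_URL",               # optional override
)
```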

## Step 3: Add model catalog and aliases in `hermes_cli/models.py`

Update the provider catalog so the provider works in menus and in `provider:model` syntax.

Typical edits:

- `_PROVIDER_MODELS`
- `_PROVIDER_LABELS`
- `_PROVIDER_ALIASES`
- provider display order inside `list_available_providers()`
- `provider_model_ids()` if the provider supports a live `/models` fetch

If the provider exposes a live model list, prefer that first and keep `_PROVIDER_MODELS` as the static fallback.

This file is also what makes inputs like these work:

```text
anthropic:claude-sonnet-4-6
kimi:model-name
```

If aliases are missing here, the provider may authenticate correctly but still fail in `/model` parsing.

## Step 4: Resolve runtime data in `hermes_cli/runtime_provider.py`

`resolve_runtime_provider()` is the shared path used by CLI, gateway, cron, ACP, and helper clients.

Add a branch that returns a dict with at least:

```python
{
    "provider": "your-provider",
    "api_mode": "chat_completions",  # or your native mode
    "base_url": "https://...",
    "api_key": "...",
    "source": "env|portal|auth-store|explicit",
    "requested_provider": requested_provider,
}
```

If the provider is OpenAI-compatible, `api_mode` should usually stay `chat_completions`.

Be careful with API-key precedence. Hermes already contains logic to avoid leaking an OpenRouter key to unrelated endpoints. A new provider should be equally explicit about which key goes to which base URL.
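A sketch of what "explicit about which key goes to which base URL" means in a resolver branch. The function, env-var names, and provider id are assumptions; the real branch lives in `hermes_cli/runtime_provider.py`:

```python
# Hypothetical resolver branch. Note it only accepts the provider's own
# env vars and errors out otherwise, rather than falling back to an
# unrelated key such as OPENROUTER_API_KEY.
def resolve_acme(env: dict) -> dict:
    api_key = env.get("ACME_API_KEY") or env.get("ACME_KEY")  # priority order
    if not api_key:
        raise RuntimeError("set ACME_API_KEY to use provider 'acme'")
    return {
        "provider": "acme",
        "api_mode": "chat_completions",
        "base_url": env.get("ACME_BASE_URL", "https://api.acme.example/v1"),
        "api_key": api_key,
        "source": "env",
    }

assert resolve_acme({"ACME_API_KEY": "sk-1"})["api_key"] == "sk-1"
```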

## Step 5: Wire the CLI in `hermes_cli/main.py`

A provider is not discoverable until it shows up in the interactive `hermes model` flow.

Update these in `hermes_cli/main.py`:

- `provider_labels` dict
- `providers` list in `select_provider_and_model()`
- provider dispatch (`if selected_provider == ...`)
- `--provider` argument choices
- login/logout choices if the provider supports those flows
- a `_model_flow_<provider>()` function, or reuse `_model_flow_api_key_provider()` if it fits

:::tip
`hermes_cli/setup.py` does not need changes — it calls `select_provider_and_model()` from `main.py`, so your new provider appears in both `hermes model` and `hermes setup` automatically.
:::
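The `--provider` choices item above is the simplest of these edits. As a hedged sketch (the actual argument setup in `hermes_cli/main.py` may be structured differently; the choices list and `acme` id here are illustrative):

```python
import argparse

# Extending the accepted --provider values so the new id parses.
PROVIDER_CHOICES = ["openrouter", "anthropic", "acme"]  # add your id here

parser = argparse.ArgumentParser()
parser.add_argument("--provider", choices=PROVIDER_CHOICES)

args = parser.parse_args(["--provider", "acme"])
assert args.provider == "acme"
```

If the id is missing from the choices, argparse rejects it before any of the auth or runtime wiring is ever reached.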

## Step 6: Keep auxiliary calls working

Two files matter here:

### `agent/auxiliary_client.py`

Add a cheap / fast default aux model to `_API_KEY_PROVIDER_AUX_MODELS` if this is a direct API-key provider.

Auxiliary tasks include things like:

- vision summarization
- web extraction summarization
- context compression summaries
- session-search summaries
- memory flushes

If the provider has no sensible aux default, side tasks may fall back badly or use an expensive main model unexpectedly.

### `agent/model_metadata.py`

Add context lengths for the provider's models so token budgeting, compression thresholds, and limits stay sane.

## Step 7: If the provider is native, add an adapter and `run_agent.py` support

If the provider is not plain chat completions, isolate the provider-specific logic in `agent/<provider>_adapter.py`.

Keep `run_agent.py` focused on orchestration. It should call adapter helpers, not hand-build provider payloads inline all over the file.

A native provider usually needs work in these places:

### New adapter file

Typical responsibilities:

- build the SDK / HTTP client
- resolve tokens
- convert OpenAI-style conversation messages to the provider's request format
- convert tool schemas if needed
- normalize provider responses back into what `run_agent.py` expects
- extract usage and finish-reason data
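One of those responsibilities, message conversion, sketched against a purely hypothetical native request shape (real adapters such as the Anthropic one have their own formats; nothing here is the actual adapter API):

```python
# Convert OpenAI-style messages into an invented "native" shape:
# system text is pulled out into its own field, other turns are renamed.
def to_native_messages(messages: list[dict]) -> dict:
    system = [m["content"] for m in messages if m["role"] == "system"]
    turns = [
        {"speaker": m["role"], "text": m["content"]}
        for m in messages
        if m["role"] != "system"
    ]
    return {"system": "\n".join(system), "turns": turns}

req = to_native_messages([
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "hi"},
])
assert req["system"] == "You are helpful."
assert req["turns"] == [{"speaker": "user", "text": "hi"}]
```

The inverse direction (normalizing responses back into what `run_agent.py` expects) follows the same pattern and belongs in the same adapter file.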

### `run_agent.py`

Search for `api_mode` and audit every switch point. At minimum, verify:

- `__init__` chooses the new `api_mode`
- client construction works for the provider
- `_build_api_kwargs()` knows how to format requests
- `_interruptible_api_call()` dispatches to the right client call
- interrupt / client rebuild paths work
- response validation accepts the provider's shape
- finish-reason extraction is correct
- token-usage extraction is correct
- fallback-model activation can switch into the new provider cleanly
- summary-generation and memory-flush paths still work

Also search `run_agent.py` for `self.client.`. Any code path that assumes the standard OpenAI client exists can break when a native provider uses a different client object or `self.client = None`.

### Prompt caching and provider-specific request fields

Prompt caching and provider-specific knobs are easy to regress.

Examples already in-tree:

- Anthropic has a native prompt-caching path
- OpenRouter gets provider-routing fields
- not every provider should receive every request-side option

When you add a native provider, double-check that Hermes is only sending fields that provider actually understands.

## Step 8: Tests

At minimum, touch the tests that guard provider wiring.

Common places:

- `tests/test_runtime_provider_resolution.py`
- `tests/test_cli_provider_resolution.py`
- `tests/test_cli_model_command.py`
- `tests/test_setup_model_selection.py`
- `tests/test_provider_parity.py`
- `tests/test_run_agent.py`
- `tests/test_<provider>_adapter.py` for a native provider

The exact file set may differ over time. The point is to cover:

- auth resolution
- CLI menu / provider selection
- runtime provider resolution
- agent execution path
- `provider:model` parsing
- any adapter-specific message conversion
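The runtime-resolution coverage, as a hedged sketch of the test shape only. The stub resolver stands in for the real `resolve_runtime_provider()`, and the `acme` provider and env var are invented:

```python
# Shape of a runtime-resolution test; real tests call the actual
# resolver from hermes_cli rather than this stub.
def resolve_runtime_provider_stub(env: dict) -> dict:
    return {
        "provider": "acme",
        "api_mode": "chat_completions",
        "api_key": env["ACME_API_KEY"],
        "source": "env",
    }

def test_env_key_resolves() -> None:
    result = resolve_runtime_provider_stub({"ACME_API_KEY": "sk-test"})
    assert result["source"] == "env"
    assert result["api_mode"] == "chat_completions"
    assert result["api_key"] == "sk-test"

test_env_key_resolves()
```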

Run tests with xdist disabled:

```bash
source venv/bin/activate
python -m pytest tests/test_runtime_provider_resolution.py tests/test_cli_provider_resolution.py tests/test_cli_model_command.py tests/test_setup_model_selection.py -n0 -q
```

For deeper changes, run the full suite before pushing:

```bash
source venv/bin/activate
python -m pytest tests/ -n0 -q
```

## Step 9: Live verification

After tests, run a real smoke test.

```bash
source venv/bin/activate
python -m hermes_cli.main chat -q "Say hello" --provider your-provider --model your-model
```

Also test the interactive flows if you changed menus:

```bash
source venv/bin/activate
python -m hermes_cli.main model
python -m hermes_cli.main setup
```

For native providers, verify at least one tool call too, not just a plain text response.

## Step 10: Update user-facing docs

If the provider is meant to ship as a first-class option, update the user docs too:

- `website/docs/getting-started/quickstart.md`
- `website/docs/user-guide/configuration.md`
- `website/docs/reference/environment-variables.md`

A developer can wire the provider perfectly and still leave users unable to discover the required env vars or setup flow.

## OpenAI-compatible provider checklist

Use this if the provider is standard chat completions.

- `ProviderConfig` added in `hermes_cli/auth.py`
- aliases added in `hermes_cli/auth.py` and `hermes_cli/models.py`
- model catalog added in `hermes_cli/models.py`
- runtime branch added in `hermes_cli/runtime_provider.py`
- CLI wiring added in `hermes_cli/main.py` (`setup.py` inherits automatically)
- aux model added in `agent/auxiliary_client.py`
- context lengths added in `agent/model_metadata.py`
- runtime / CLI tests updated
- user docs updated

## Native provider checklist

Use this when the provider needs a new protocol path.

- everything in the OpenAI-compatible checklist
- adapter added in `agent/<provider>_adapter.py`
- new `api_mode` supported in `run_agent.py`
- interrupt / rebuild path works
- usage and finish-reason extraction works
- fallback path works
- adapter tests added
- live smoke test passes

## Common pitfalls

### 1. Adding the provider to auth but not to model parsing

That makes credentials resolve correctly while `/model` and `provider:model` inputs fail.

### 2. Forgetting that `config["model"]` can be a string or a dict

A lot of provider-selection code has to normalize both forms.
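A minimal normalization sketch. The dict keys (`provider`, `name`) are assumptions for illustration; check the real config schema before relying on them:

```python
# Accept either "acme-large" or {"provider": "acme", "name": "acme-large"}
# and return a uniform (provider, model) pair. Keys are hypothetical.
def normalize_model(value):
    if isinstance(value, str):
        return None, value
    if isinstance(value, dict):
        return value.get("provider"), value["name"]
    raise TypeError(f"unsupported model config: {type(value).__name__}")

assert normalize_model("acme-large") == (None, "acme-large")
assert normalize_model({"provider": "acme", "name": "acme-large"}) == ("acme", "acme-large")
```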

### 3. Assuming a built-in provider is required

If the service is just OpenAI-compatible, a custom provider may already solve the user's problem with less maintenance.

### 4. Forgetting auxiliary paths

The main chat path can work while summarization, memory flushes, or vision helpers fail because aux routing was never updated.

### 5. Native-provider branches hiding in `run_agent.py`

Search for `api_mode` and `self.client.`. Do not assume the obvious request path is the only one.

### 6. Sending OpenRouter-only knobs to other providers

Fields like provider routing belong only on the providers that support them.

### 7. Updating `hermes model` but not `hermes setup`

Both flows need to know about the provider.

## Good search targets while implementing

If you are hunting for all the places a provider touches, search these symbols:

- `PROVIDER_REGISTRY`
- `_PROVIDER_ALIASES`
- `_PROVIDER_MODELS`
- `resolve_runtime_provider`
- `_model_flow_`
- `select_provider_and_model`
- `api_mode`
- `_API_KEY_PROVIDER_AUX_MODELS`
- `self.client.`