mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-18 04:41:56 +00:00

feat(video_gen): unified video_generate tool with pluggable provider backends (#25126)

* feat(video_gen): unified video_generate tool with pluggable provider backends

One core video_generate tool, every backend a plugin. Mirrors the
image_gen + memory_provider + context_engine architecture: ABC, registry,
plugin-context registration hook, and per-plugin model catalogs surfaced
through hermes tools.

Surface (one schema, every backend):
- operation: generate / edit / extend
- modalities: text-to-video (prompt only), image-to-video (prompt +
  image_url), video edit (prompt + video_url), video extend (video_url)
- reference_image_urls, duration, aspect_ratio, resolution,
  negative_prompt, audio, seed, model override
- Providers ignore unknown kwargs and declare what they support via
  VideoGenProvider.capabilities() — backend-specific quirks stay in the
  backend, the agent learns one tool

Backends shipped:
- plugins/video_gen/xai/  — Grok-Imagine, full generate/edit/extend +
  image-to-video + reference images (salvaged from PR #10600 by
  @Jaaneek, reshaped into the plugin interface)
- plugins/video_gen/fal/  — Veo 3.1 (t2v + i2v), Kling O3 i2v,
  Pixverse v6 i2v with model-aware payload building that drops keys a
  model doesn't declare

Wiring:
- agent/video_gen_provider.py — VideoGenProvider ABC, normalize_operation,
  success_response / error_response, save_b64_video / save_bytes_video,
  $HERMES_HOME/cache/videos/
- agent/video_gen_registry.py — thread-safe register/get/list +
  get_active_provider() reading video_gen.provider from config.yaml
- hermes_cli/plugins.py — PluginContext.register_video_gen_provider()
- hermes_cli/tools_config.py — Video Generation category in
  hermes tools, plugin-only providers list, model picker per plugin,
  config write to video_gen.{provider,model}
- toolsets.py — new video_gen toolset
- tests: 31 new tests covering ABC, registry, tool dispatch, both plugins
- docs: developer-guide/video-gen-provider-plugin.md (parallel to the
  image-gen guide), sidebar + toolsets-reference + plugin guides updated

Supersedes: #25035 (FAL), #17972 (FAL), #14543 (xAI), #13847 (HappyHorse),
#10458 (provider categories), #10786 (xAI media+search bundle), #2984
(FAL duplicate), #19086 (Google Veo standalone — easy port to plugin
interface).

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>

* feat(video_gen): dynamic schema reflects active backend's capabilities

Address the 'capability variance' question — instead of one tool with a
static schema that lies about what every backend supports, the
video_generate tool now rebuilds its description at get_definitions()
time based on the configured video_gen.provider and video_gen.model.

The agent sees backend-specific guidance up-front:
- 'fal-ai/veo3.1/image-to-video': 'image-to-video only — image_url is
  REQUIRED; text-only prompts will be rejected'
- 'fal-ai/veo3.1' (t2v): no image_url restriction shown
- xAI grok-imagine-video: 'operations: generate, edit, extend; up to 7
  reference_image_urls'
- Backends without edit/extend: 'not supported on this backend — surface
  that they need to switch backends via hermes tools'

This is the same pattern PR #22694 used for delegate_task self-capping —
documented in the dynamic-tool-schemas skill. Cache invalidation is
free: get_tool_definitions() already memoizes on config.yaml mtime, so a
mid-session backend swap rebuilds the schema automatically.

Tested:
- Empirical FAL OpenAPI schema check confirms image-to-video models
  require image_url (FAL returns HTTP 422 otherwise) — client-side
  rejection in FALVideoGenProvider.generate() now prevents the wasted
  round-trip
- Live E2E: fal-ai/veo3.1/image-to-video + prompt-only → clean
  missing_image_url error; fal-ai/veo3.1 + prompt-only → dispatches
- 6 new tests cover the builder (no config / image-only / full-surface /
  text-only / unknown provider / registry wiring), all passing
- 37/37 in the slice, 134/134 in the broader regression set

* test(video_gen/xai): full surface integration tests + cleaner schema

Verified end-to-end that the xAI plugin handles every documented mode
from PR #10600's surface: text-to-video, image-to-video,
reference-images-to-video, video edit, video extend (with and without
prompt). All five modes route to the correct xAI endpoint
(/videos/generations, /videos/edits, /videos/extensions) with the right
payload shape (image / reference_images / video keys), and all five
client-side rejections fire before the network: edit-without-prompt,
extend-without-video_url, image+refs conflict, >7 references, and
duration/aspect_ratio clamping.

15 new integration tests grouped into four classes (endpoint routing,
modalities, validation, clamping). httpx is stubbed via a small fake
AsyncClient that records POSTs so the tests assert the actual payload
the plugin would send to xAI — not just the success/error envelope.

Also cleaned up a description redundancy: when a model's operations
match the backend's overall set, we no longer print the duplicate
'operations supported by this model' line. xAI's description now reads:

    Active backend: xAI . model: grok-imagine-video
    - operations supported by this backend: edit, extend, generate
    - modalities supported by this backend: image, reference_images, text
    - aspect_ratio choices: 16:9, 1:1, 2:3, 3:2, 3:4, 4:3, 9:16
    - resolution choices: 480p, 720p
    - duration range: 1-15s
    - reference_image_urls: up to 7 images

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>

* feat(video_gen): collapse surface to t2v + i2v, family-based auto-routing

Two design changes per Teknium:

1) Drop edit/extend from the tool surface entirely. Only text-to-video
and image-to-video remain. The agent sees a clean tool with two
modalities; backend-specific quirks like xAI's edit/extend endpoints
stay out of the unified schema.

2) FAL: pick a model FAMILY once, the plugin routes between the
family's text-to-video and image-to-video endpoints based on whether
image_url was passed. Users no longer pick 'fal-ai/veo3.1' AND
'fal-ai/veo3.1/image-to-video' as separate options — they pick
'veo3.1', and the plugin handles the rest.

Catalog rewritten as families:

    veo3.1            fal-ai/veo3.1                                /  fal-ai/veo3.1/image-to-video
    pixverse-v6       fal-ai/pixverse/v6/text-to-video             /  fal-ai/pixverse/v6/image-to-video
    kling-o3-standard fal-ai/kling-video/o3/standard/text-to-video /  fal-ai/kling-video/o3/standard/image-to-video

xAI uses a single endpoint (/videos/generations) for both modes,
routed by the presence of the 'image' field in the payload — no
edit/extend exposure.

Schema changes:
- VIDEO_GENERATE_SCHEMA: drop operation, drop video_url. Final params:
  prompt (required), image_url, reference_image_urls, duration,
  aspect_ratio, resolution, negative_prompt, audio, seed, model.
- VideoGenProvider ABC: drop normalize_operation, VALID_OPERATIONS,
  DEFAULT_OPERATION. capabilities() drops 'operations' key.
- success_response: add 'modality' field ('text' | 'image') so the
  agent and logs can see which endpoint was actually hit.

Dynamic schema builder simplified — no operations bullet, no
'switch backends if you need edit/extend' guidance. When the active
backend supports both modalities (the common case), description reads:

    Active backend: FAL . model: pixverse-v6
    - supports both text-to-video (omit image_url) and image-to-video
      (pass image_url) - routes automatically
    - aspect_ratio choices: 16:9, 9:16, 1:1
    - resolution choices: 360p, 540p, 720p, 1080p
    - duration range: 1-15s
    - audio: pass audio=true to enable native audio (pricing tier)
    - negative_prompt: supported

Tests: 51 in the video_gen slice, 216 across the broader image+video
sweep, all passing. New FAL routing tests prove pixverse-v6 + no image
hits text-to-video endpoint, pixverse-v6 + image_url hits
image-to-video endpoint, same for veo3.1 and kling-o3-standard.

Docs updated: developer-guide page rewrites the 'model families' pattern
as a first-class section so external plugin authors know the convention.
toolsets-reference and toolsets.py descriptions match the new surface.

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>

* feat(video_gen/fal): expand catalog to 6 families, cheap + premium tiers

Catalog now covers everything Teknium specced from FAL:

  Cheap tier:
    ltx-2.3        fal-ai/ltx-2.3-22b/text-to-video       / image-to-video
    pixverse-v6    fal-ai/pixverse/v6/text-to-video       / image-to-video

  Premium tier:
    veo3.1         fal-ai/veo3.1                          / fal-ai/veo3.1/image-to-video
    seedance-2.0   bytedance/seedance-2.0/text-to-video   / image-to-video
    kling-v3-4k    fal-ai/kling-video/v3/4k/text-to-video / image-to-video
    happy-horse    fal-ai/happy-horse/text-to-video       / image-to-video

DEFAULT_MODEL moved from veo3.1 (premium) to pixverse-v6 (cheap, sane
defaults, both modalities) — better first-run UX for users who haven't
explicitly picked a model.

New family-entry knob: image_param_key. Kling v3 4K's image-to-video
endpoint expects start_image_url instead of image_url; declaring
image_param_key='start_image_url' on the family lets _build_payload
remap correctly. Other families default to plain image_url.
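
A rough sketch of the family routing and the image_param_key remap described above. This is an illustration, not the plugin's actual code: the dictionary layout and the `route` helper are hypothetical. Only the names FAL_FAMILIES, text_endpoint, image_endpoint, image_param_key and the endpoint strings come from this message, and the Kling image endpoint is assumed by analogy because the catalog listing abbreviates it.

```python
# Hypothetical layout; only the key names and endpoint strings are taken
# from the commit message.
FAL_FAMILIES = {
    "veo3.1": {
        "text_endpoint": "fal-ai/veo3.1",
        "image_endpoint": "fal-ai/veo3.1/image-to-video",
    },
    "pixverse-v6": {
        "text_endpoint": "fal-ai/pixverse/v6/text-to-video",
        "image_endpoint": "fal-ai/pixverse/v6/image-to-video",
    },
    "kling-v3-4k": {
        "text_endpoint": "fal-ai/kling-video/v3/4k/text-to-video",
        # Assumed by analogy; the catalog listing abbreviates this path.
        "image_endpoint": "fal-ai/kling-video/v3/4k/image-to-video",
        "image_param_key": "start_image_url",
    },
}


def route(family, prompt, image_url=None):
    """Return (endpoint, payload, modality) for one request."""
    entry = FAL_FAMILIES[family]
    payload = {"prompt": prompt}
    if image_url is None:
        # No image passed: use the family's text-to-video endpoint.
        return entry["text_endpoint"], payload, "text"
    # Image passed: use the image endpoint and the family's image key.
    key = entry.get("image_param_key", "image_url")
    payload[key] = image_url
    return entry["image_endpoint"], payload, "image"


print(route("pixverse-v6", "a cat surfing"))
print(route("kling-v3-4k", "a cat surfing", "https://example.com/cat.png"))
```

Keeping the image key as per-family data rather than a branch in the payload builder means a new family with a different key needs only a catalog entry.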

Per-family capability flags reflect each model's docs:
- LTX 2.3 + Happy Horse: minimal payloads (no duration/aspect/resolution
  enum exposed by FAL — let endpoint apply defaults)
- Seedance: 6 aspect ratios incl 21:9, durations 4-15, audio supported,
  negative prompts NOT supported per docs
- Kling v3 4K: 16:9/9:16/1:1, 3-15s, audio + negative
- Veo 3.1: unchanged, 16:9/9:16, 4/6/8s

Tests: +5 covering the new families (full catalog, Kling 4K
start_image_url remap, Seedance routing, LTX payload minimality, Happy
Horse minimality). 56/56 in the slice green.

Note: I did NOT add the FAL-hosted xAI Grok-Imagine variant. Hermes
already has a direct xAI plugin that talks to xAI's own API; routing
the same model through FAL's wrapper would duplicate the surface
without adding capabilities. Users on FAL who want Grok-Imagine should
use the xAI plugin directly; flag if you want both routes available.

* test(video_gen): tool-surface routing matrix — every model x modality

End-to-end matrix test driven through _handle_video_generate() — the
actual function the agent's video_generate tool call lands in. Writes
config.yaml, invokes the registered handler with a raw args dict, then
asserts the outbound HTTP/SDK call hit the right endpoint with the right
payload shape.

Parametrized over FAL_FAMILIES.keys() so the matrix auto-discovers new
families as they're added (add a family to FAL_FAMILIES and you get
both modalities tested for free).

Coverage:
- All 6 FAL families x {text-only, text+image} = 12 cases
- xAI x {text-only, text+image} = 2 cases
- tool-level model= arg overrides config = 2 cases

For each case, verifies:
- result['success'] is True
- result['modality'] matches input shape ('text' if no image_url, 'image' otherwise)
- outbound endpoint URL matches the family's text_endpoint or image_endpoint
- text-only payloads carry no image-shaped keys
- text+image payloads carry the family's image key (image_url for most,
  start_image_url for kling-v3-4k, wrapped 'image' object for xAI)

All 16 cases passing. Confirms the tool surface routes every
(provider, model, modality) combination correctly with zero leakage.

* feat(video_gen): keep video_gen out of first-run setup, surface in status

Two changes:

1. video_gen joins _DEFAULT_OFF_TOOLSETS, so it is NOT pre-selected in
   the first-run toolset checklist. Video gen is niche, paid, and slow —
   most users don't want it nagging them during initial setup. Anyone
   who wants it opts in via 'hermes tools' -> Video Generation, which
   already routes to the provider+model picker.

2. The 'hermes setup' status panel learns about video_gen — but only
   shows the row when a plugin reports available. Users without
   FAL_KEY/XAI_API_KEY see nothing about video gen; users with one of
   those keys see 'Video Generation (FAL) ✓' as confirmation it's wired.

Verified live:
- Fresh install (no creds): zero video_gen mentions in wizard.
- With FAL_KEY: status row appears with active backend name.
- 160/160 in the setup + tools_config + video_gen test slice.

Rationale: image_gen is on by default because it's a featured creative
tool used in casual chat (Telegram, etc.). Video gen is heavier: long
wait, paid per-second pricing. Default-off matches user intent better.

---------

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>

2026-05-13 16:39:41 -07:00


sidebar_position	sidebar_label	title	description
11	Plugins	Plugins	Extend Hermes with custom tools, hooks, and integrations via the plugin system

Plugins

Hermes has a plugin system for adding custom tools, hooks, and integrations without modifying core code.

If you want to create a custom tool for yourself, your team, or one project, this is usually the right path. The developer guide's Adding Tools page is for built-in Hermes core tools that live in tools/ and toolsets.py.

→ Build a Hermes Plugin — step-by-step guide with a complete working example.

Quick overview

Drop a directory into ~/.hermes/plugins/ with a plugin.yaml and Python code:

~/.hermes/plugins/my-plugin/
├── plugin.yaml      # manifest
├── __init__.py      # register() — wires schemas to handlers
├── schemas.py       # tool schemas (what the LLM sees)
└── tools.py         # tool handlers (what runs when called)

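Enable the plugin with hermes plugins enable my-plugin (user plugins are opt-in, see below), then start Hermes. Your tools appear alongside built-in tools and the model can call them immediately.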

Minimal working example

Here is a complete plugin that adds a hello_world tool and logs every tool call via a hook.

~/.hermes/plugins/hello-world/plugin.yaml

name: hello-world
version: "1.0"
description: A minimal example plugin

~/.hermes/plugins/hello-world/__init__.py

"""Minimal Hermes plugin — registers a tool and a hook."""

import json


def register(ctx):
    # --- Tool: hello_world ---
    schema = {
        "name": "hello_world",
        "description": "Returns a friendly greeting for the given name.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {
                    "type": "string",
                    "description": "Name to greet",
                }
            },
            "required": ["name"],
        },
    }

    def handle_hello(params, **kwargs):
        del kwargs
        name = params.get("name", "World")
        return json.dumps({"success": True, "greeting": f"Hello, {name}!"})

    ctx.register_tool(
        name="hello_world",
        toolset="hello_world",
        schema=schema,
        handler=handle_hello,
        description="Return a friendly greeting for the given name.",
    )

    # --- Hook: log every tool call ---
    def on_tool_call(tool_name, params, result):
        print(f"[hello-world] tool called: {tool_name}")

    ctx.register_hook("post_tool_call", on_tool_call)

Drop both files into ~/.hermes/plugins/hello-world/, enable the plugin with hermes plugins enable hello-world, and restart Hermes. The model can then call hello_world, and the hook prints a log line after every tool invocation.

Project-local plugins under ./.hermes/plugins/ are disabled by default. Enable them only for trusted repositories by setting HERMES_ENABLE_PROJECT_PLUGINS=true before starting Hermes.

What plugins can do

Every ctx.* API below is available inside a plugin's register(ctx) function.

Capability	How
Add tools	`ctx.register_tool(name=..., toolset=..., schema=..., handler=...)`
Add hooks	`ctx.register_hook("post_tool_call", callback)`
Add slash commands	`ctx.register_command(name, handler, description)` — adds `/name` in CLI and gateway sessions
Dispatch tools from commands	`ctx.dispatch_tool(name, args)` — invokes a registered tool with parent-agent context auto-wired
Add CLI commands	`ctx.register_cli_command(name, help, setup_fn, handler_fn)` — adds `hermes <plugin> <subcommand>`
Inject messages	`ctx.inject_message(content, role="user")` — see Injecting Messages
Ship data files	`Path(__file__).parent / "data" / "file.yaml"`
Bundle skills	`ctx.register_skill(name, path)` — namespaced as `plugin:skill`, loaded via `skill_view("plugin:skill")`
Gate on env vars	`requires_env: [API_KEY]` in plugin.yaml — prompted during `hermes plugins install`
Distribute via pip	`[project.entry-points."hermes_agent.plugins"]`
Register a gateway platform (Discord, Telegram, IRC, …)	`ctx.register_platform(name, label, adapter_factory, check_fn, ...)` — see Adding Platform Adapters
Register an image-generation backend	`ctx.register_image_gen_provider(provider)` — see Image Generation Provider Plugins
Register a video-generation backend	`ctx.register_video_gen_provider(provider)` — see Video Generation Provider Plugins
Register a context-compression engine	`ctx.register_context_engine(engine)` — see Context Engine Plugins
Register a memory backend	Subclass `MemoryProvider` in `plugins/memory/<name>/__init__.py` — see Memory Provider Plugins (uses a separate discovery system)
Run a host-owned LLM call	`ctx.llm.complete(...)` / `ctx.llm.complete_structured(...)` — borrow the user's active model + auth for a one-shot completion with optional JSON schema validation. See Plugin LLM Access
Register an inference backend (LLM provider)	`register_provider(ProviderProfile(...))` in `plugins/model-providers/<name>/__init__.py` — see Model Provider Plugins (uses a separate discovery system)
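
As a rough illustration of how two rows of this table combine, the sketch below registers a slash command whose handler dispatches a tool. The `FakeContext` class is a hypothetical stand-in so the snippet runs on its own; it is not Hermes' real plugin context. The `/greet` command and the handler's argument shape (the raw text typed after the command) are also assumptions. Only the call shapes `ctx.register_command(name, handler, description)` and `ctx.dispatch_tool(name, args)` come from the table.

```python
import json


class FakeContext:
    """Hypothetical stand-in for the plugin context, for local runs only."""

    def __init__(self):
        self.commands = {}
        self.tools = {}

    def register_tool(self, name, toolset, schema, handler, description=""):
        self.tools[name] = handler

    def register_command(self, name, handler, description):
        self.commands[name] = handler

    def dispatch_tool(self, name, args):
        return self.tools[name](args)


def register(ctx):
    def handle_hello(params, **kwargs):
        name = params.get("name", "World")
        return json.dumps({"success": True, "greeting": f"Hello, {name}!"})

    ctx.register_tool(
        name="hello_world",
        toolset="hello_world",
        schema={"name": "hello_world", "parameters": {"type": "object"}},
        handler=handle_hello,
    )

    # Assumption: the command handler receives the text typed after /greet.
    def greet_command(raw_args):
        name = raw_args.strip() or "World"
        result = ctx.dispatch_tool("hello_world", {"name": name})
        return json.loads(result)["greeting"]

    ctx.register_command("greet", greet_command, "Greet someone by name")


ctx = FakeContext()
register(ctx)
print(ctx.commands["greet"]("Ada"))  # Hello, Ada!
```

Routing the command through the tool keeps one implementation of the greeting logic, so the slash command and the LLM-facing tool cannot drift apart.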

Plugin discovery

Source	Path	Use case
Bundled	`<repo>/plugins/`	Ships with Hermes — see Built-in Plugins
User	`~/.hermes/plugins/`	Personal plugins
Project	`.hermes/plugins/`	Project-specific plugins (requires `HERMES_ENABLE_PROJECT_PLUGINS=true`)
pip	`hermes_agent.plugins` entry_points	Distributed packages
Nix	`services.hermes-agent.extraPlugins` / `extraPythonPackages`	NixOS declarative installs — see Nix Setup

Later sources override earlier ones on name collision, so a user plugin with the same name as a bundled plugin replaces it.
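
The override rule can be pictured as a dictionary that later sources overwrite. The sketch below is a hypothetical model, not the loader's code: `resolve_plugins` and its tuple format are invented for illustration, and the precedence order is passed in explicitly rather than assumed.

```python
def resolve_plugins(discovered, order):
    """Pick one winner per plugin name.

    discovered: list of (source, plugin_name, path) tuples in any order.
    order: source names from lowest to highest precedence.
    """
    rank = {source: i for i, source in enumerate(order)}
    winners = {}
    # Visit low-precedence sources first so later ones overwrite them.
    for source, name, path in sorted(discovered, key=lambda item: rank[item[0]]):
        winners[name] = (source, path)
    return winners


found = [
    ("user", "disk-cleanup", "~/.hermes/plugins/disk-cleanup"),
    ("bundled", "disk-cleanup", "<repo>/plugins/disk-cleanup"),
    ("bundled", "other", "<repo>/plugins/other"),
]
active = resolve_plugins(found, order=["bundled", "user"])
print(active["disk-cleanup"][0])  # user
print(active["other"][0])         # bundled
```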

Plugin sub-categories

Within each source, Hermes also recognizes sub-category directories that route plugins to specialized discovery systems:

Sub-directory	What it holds	Discovery system
`plugins/` (root)	General plugins — tools, hooks, slash commands, CLI commands, bundled skills	`PluginManager` (kind: `standalone` or `backend`)
`plugins/platforms/<name>/`	Gateway channel adapters (`ctx.register_platform()`)	`PluginManager` (kind: `platform`, one level deeper)
`plugins/image_gen/<name>/`	Image-generation backends (`ctx.register_image_gen_provider()`)	`PluginManager` (kind: `backend`, one level deeper)
`plugins/memory/<name>/`	Memory providers (subclass `MemoryProvider`)	Own loader in `plugins/memory/__init__.py` (kind: `exclusive` — one active at a time)
`plugins/context_engine/<name>/`	Context-compression engines (`ctx.register_context_engine()`)	Own loader in `plugins/context_engine/__init__.py` (one active at a time)
`plugins/model-providers/<name>/`	LLM provider profiles (`register_provider(ProviderProfile(...))`)	Own loader in `providers/__init__.py` (lazily scanned on first `get_provider_profile()` call)

User plugins at ~/.hermes/plugins/model-providers/<name>/ and ~/.hermes/plugins/memory/<name>/ override bundled plugins of the same name — last-writer-wins in register_provider() / register_memory_provider(). Drop a directory in, and it replaces the built-in without any repo edits.

Plugins are opt-in (with a few exceptions)

General plugins and user-installed backends are disabled by default — discovery finds them (so they show up in hermes plugins and /plugins), but nothing with hooks or tools loads until you add the plugin's name to plugins.enabled in ~/.hermes/config.yaml. This stops third-party code from running without your explicit consent.

plugins:
  enabled:
    - my-tool-plugin
    - disk-cleanup
  disabled:       # optional deny-list — always wins if a name appears in both
    - noisy-plugin

Three ways to flip state:

hermes plugins                    # interactive toggle (space to check/uncheck)
hermes plugins enable <name>      # add to allow-list
hermes plugins disable <name>     # remove from allow-list + add to disabled

After hermes plugins install owner/repo, you're asked Enable 'name' now? [y/N] — defaults to no. Skip the prompt for scripted installs with --enable or --no-enable.

What the allow-list does NOT gate

Several categories of plugin bypass plugins.enabled — they're part of Hermes' built-in surface and would break basic functionality if gated off by default:

Plugin kind	How it's activated instead
Bundled platform plugins (IRC, Teams, etc. under `plugins/platforms/`)	Auto-loaded so every shipped gateway channel is available. The actual channel turns on via `gateway.platforms.<name>.enabled` in `config.yaml`.
Bundled backends (image-gen providers under `plugins/image_gen/`, etc.)	Auto-loaded so the default backend "just works". Selection happens via `<category>.provider` in `config.yaml` (e.g. `image_gen.provider: openai`).
Memory providers (`plugins/memory/`)	All discovered; exactly one is active, chosen by `memory.provider` in `config.yaml`.
Context engines (`plugins/context_engine/`)	All discovered; one is active, chosen by `context.engine` in `config.yaml`.
Model providers (`plugins/model-providers/`)	All bundled providers under `plugins/model-providers/` discover and register at the first `get_provider_profile()` call. The user picks one at a time via `--provider` or `config.yaml`.
Pip-installed `backend` plugins	Opt-in via `plugins.enabled` (same as general plugins).
User-installed platforms (under `~/.hermes/plugins/platforms/`)	Opt-in via `plugins.enabled` — third-party gateway adapters need explicit consent.

In short: bundled "always-works" infrastructure loads automatically; third-party general plugins are opt-in. The plugins.enabled allow-list is the gate specifically for arbitrary code a user drops into ~/.hermes/plugins/.

Migration for existing users

When you upgrade to a version of Hermes that has opt-in plugins (config schema v21+), any user plugins already installed under ~/.hermes/plugins/ that weren't already in plugins.disabled are automatically grandfathered into plugins.enabled. Your existing setup keeps working. Bundled standalone plugins are NOT grandfathered — even existing users have to opt in explicitly. (Bundled platform/backend plugins never needed grandfathering because they were never gated.)

Available hooks

Plugins can register callbacks for these lifecycle events. See the Event Hooks page for full details, callback signatures, and examples.

Hook	Fires when
`pre_tool_call`	Before any tool executes
`post_tool_call`	After any tool returns
`pre_llm_call`	Once per turn, before the LLM loop — can return `{"context": "..."}` to inject context into the user message
`post_llm_call`	Once per turn, after the LLM loop (successful turns only)
`on_session_start`	New session created (first turn only)
`on_session_end`	End of every `run_conversation` call + CLI exit handler
`on_session_finalize`	CLI/gateway tears down an active session (`/new`, GC, CLI quit)
`on_session_reset`	Gateway swaps in a new session key (`/new`, `/reset`, `/clear`, idle rotation)
`subagent_stop`	Once per child after `delegate_task` finishes
`pre_gateway_dispatch`	Gateway received a user message, before auth + dispatch. Return `{"action": "skip" \| "rewrite" \| "allow", ...}` to influence flow.
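
A minimal sketch of a `pre_llm_call` hook that injects context, assuming hooks are invoked with keyword arguments. The exact callback signature is documented on the Event Hooks page and is not reproduced here, so the callback accepts `**kwargs`. `FakeContext` and the `user_message` argument are hypothetical and exist only to make the snippet runnable on its own.

```python
from datetime import date


def register(ctx):
    # Assumption: the host passes keyword arguments; **kwargs keeps the
    # callback tolerant of whatever names are actually used.
    def add_date_context(**kwargs):
        return {"context": f"Today's date is {date.today().isoformat()}."}

    ctx.register_hook("pre_llm_call", add_date_context)


class FakeContext:
    """Hypothetical stand-in that only records registered hooks."""

    def __init__(self):
        self.hooks = {}

    def register_hook(self, event, callback):
        self.hooks.setdefault(event, []).append(callback)


ctx = FakeContext()
register(ctx)
result = ctx.hooks["pre_llm_call"][0](user_message="hi")
print(result["context"])
```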

Plugin types

Hermes has four kinds of plugins:

Type	What it does	Selection	Location
General plugins	Add tools, hooks, slash commands, CLI commands	Multi-select (enable/disable)	`~/.hermes/plugins/`
Memory providers	Replace or augment built-in memory	Single-select (one active)	`plugins/memory/`
Context engines	Replace the built-in context compressor	Single-select (one active)	`plugins/context_engine/`
Model providers	Declare an inference backend (OpenRouter, Anthropic, …)	Multi-register, picked by `--provider` / `config.yaml`	`plugins/model-providers/`

Memory providers and context engines are provider plugins — only one of each type can be active at a time. Model providers are also plugins, but many load simultaneously; the user picks one at a time via --provider or config.yaml. General plugins can be enabled in any combination.

Pluggable interfaces — where to go for each

The table above shows the four plugin categories, but within "General plugins" the PluginContext exposes several distinct extension points — and Hermes also accepts extensions outside the Python plugin system (config-driven backends, shell-hooked commands, external servers, etc.). Use this table to find the right doc for what you want to build:

Want to add…	How	Authoring guide
A tool the LLM can call	Python plugin — `ctx.register_tool()`	Build a Hermes Plugin · Adding Tools
A lifecycle hook (pre/post LLM, session start/end, tool filter)	Python plugin — `ctx.register_hook()`	Hooks reference · Build a Hermes Plugin
A slash command for the CLI / gateway	Python plugin — `ctx.register_command()`	Build a Hermes Plugin · Extending the CLI
A subcommand for `hermes <thing>`	Python plugin — `ctx.register_cli_command()`	Extending the CLI
A bundled skill that your plugin ships	Python plugin — `ctx.register_skill()`	Creating Skills
An inference backend (LLM provider: OpenAI-compat, Codex, Anthropic-Messages, Bedrock)	Provider plugin — `register_provider(ProviderProfile(...))` in `plugins/model-providers/<name>/`	Model Provider Plugins · Adding Providers
A gateway channel (Discord / Telegram / IRC / Teams / etc.)	Platform plugin — `ctx.register_platform()` in `plugins/platforms/<name>/`	Adding Platform Adapters
A memory backend (Honcho, Mem0, Supermemory, …)	Memory plugin — subclass `MemoryProvider` in `plugins/memory/<name>/`	Memory Provider Plugins
A context-compression strategy	Context-engine plugin — `ctx.register_context_engine()`	Context Engine Plugins
An image-generation backend (DALL·E, SDXL, …)	Backend plugin — `ctx.register_image_gen_provider()`	Image Generation Provider Plugins
A video-generation backend (Veo, Kling, Pixverse, Grok-Imagine, Runway, …)	Backend plugin — `ctx.register_video_gen_provider()`	Video Generation Provider Plugins
A TTS backend (any CLI — Piper, VoxCPM, Kokoro, xtts, voice-cloning scripts, …)	Config-driven — declare under `tts.providers.<name>` with `type: command` in `config.yaml`	TTS setup
An STT backend (custom whisper binary, local ASR CLI)	Config-driven — set `HERMES_LOCAL_STT_COMMAND` env var to a shell template	Voice Message Transcription (STT)
External tools via MCP (filesystem, GitHub, Linear, Notion, any MCP server)	Config-driven — declare `mcp_servers.<name>` with `command:` / `url:` in `config.yaml`. Hermes auto-discovers the server's tools and registers them alongside built-ins.	MCP
Additional skill sources (custom GitHub repos, private skill indexes)	CLI — `hermes skills tap add <repo>`	Skills Hub · Publishing a custom tap
Gateway event hooks (fire on `gateway:startup`, `session:start`, `agent:end`, `command:*`)	Drop `HOOK.yaml` + `handler.py` into `~/.hermes/hooks/<name>/`	Event Hooks
Shell hooks (run a shell command on events — notifications, audit logs, desktop alerts)	Config-driven — declare under `hooks:` in `config.yaml`	Shell Hooks

:::note
Not everything is a Python plugin. Some extension surfaces intentionally use config-driven shell commands (TTS, STT, shell hooks) so any CLI you already have becomes a plugin without writing Python. Others are external servers (MCP) the agent connects to and auto-registers tools from. And some are drop-in directories (gateway hooks) with their own manifest format. Pick the right surface for the integration style that fits your use case; the authoring guides in the table above each cover placeholders, discovery, and examples.
:::

NixOS declarative plugins

On NixOS, plugins can be installed declaratively via the module options — no hermes plugins install needed. See the Nix Setup guide for full details.

services.hermes-agent = {
  # Directory plugin (source tree with plugin.yaml)
  extraPlugins = [ (pkgs.fetchFromGitHub { ... }) ];
  # Entry-point plugin (pip package)
  extraPythonPackages = [ (pkgs.python312Packages.buildPythonPackage { ... }) ];
  # Enable in config
  settings.plugins.enabled = [ "my-plugin" ];
};

Declarative plugins are symlinked with a nix-managed- prefix — they coexist with manually installed plugins and are cleaned up automatically when removed from the Nix config.

Managing plugins

hermes plugins                               # unified interactive UI
hermes plugins list                          # table: enabled / disabled / not enabled
hermes plugins install user/repo             # install from Git, then prompt Enable? [y/N]
hermes plugins install user/repo --enable    # install AND enable (no prompt)
hermes plugins install user/repo --no-enable # install but leave disabled (no prompt)
hermes plugins update my-plugin              # pull latest
hermes plugins remove my-plugin              # uninstall
hermes plugins enable my-plugin              # add to allow-list
hermes plugins disable my-plugin             # remove from allow-list + add to disabled

Interactive UI

Running hermes plugins with no arguments opens a composite interactive screen:

Plugins
  ↑↓ navigate  SPACE toggle  ENTER configure/confirm  ESC done

  General Plugins
 → [✓] my-tool-plugin — Custom search tool
   [ ] webhook-notifier — Event hooks
   [ ] disk-cleanup — Auto-cleanup of ephemeral files [bundled]

  Provider Plugins
     Memory Provider          ▸ honcho
     Context Engine           ▸ compressor

- General Plugins section: checkboxes, toggle with SPACE. Checked = in plugins.enabled, unchecked = in plugins.disabled (explicit off).
- Provider Plugins section: shows current selection. Press ENTER to drill into a radio picker where you choose one active provider.
- Bundled plugins appear in the same list with a [bundled] tag.

Provider plugin selections are saved to config.yaml:

memory:
  provider: "honcho"      # empty string = built-in only

context:
  engine: "compressor"    # default built-in compressor

Enabled vs. disabled vs. neither

Plugins occupy one of three states:

State	Meaning	In `plugins.enabled`?	In `plugins.disabled`?
`enabled`	Loaded on next session	Yes	No
`disabled`	Explicitly off — won't load even if also in `enabled`	(irrelevant)	Yes
`not enabled`	Discovered but never opted in	No	No

The default for a newly-installed or bundled plugin is not enabled. hermes plugins list shows all three distinct states so you can tell what's been explicitly turned off vs. what's just waiting to be enabled.
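
The three states reduce to two membership checks, with the deny-list checked first. The helper below is a hypothetical sketch of that rule, not Hermes' implementation; the plugin names mirror the config example earlier on this page.

```python
def plugin_state(name, enabled, disabled):
    """Classify a plugin from the two config lists."""
    if name in disabled:
        return "disabled"      # deny-list wins even if also in enabled
    if name in enabled:
        return "enabled"
    return "not enabled"       # discovered but never opted in


enabled = ["my-tool-plugin", "disk-cleanup", "noisy-plugin"]
disabled = ["noisy-plugin"]

for name in ["my-tool-plugin", "noisy-plugin", "webhook-notifier"]:
    print(name, "->", plugin_state(name, enabled, disabled))
```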

In a running session, /plugins shows which plugins are currently loaded.

Injecting Messages

Plugins can inject messages into the active conversation using ctx.inject_message():

ctx.inject_message("New data arrived from the webhook", role="user")

Signature: ctx.inject_message(content: str, role: str = "user") -> bool

How it works:

- If the agent is idle (waiting for user input), the message is queued as the next input and starts a new turn.
- If the agent is mid-turn (actively running), the message interrupts the current operation, the same as a user typing a new message and pressing Enter.
- For non-"user" roles, the content is prefixed with [role] (e.g. [system] ...).
- Returns True if the message was queued successfully, False if no CLI reference is available (e.g. in gateway mode).

This enables plugins like remote control viewers, messaging bridges, or webhook receivers to feed messages into the conversation from external sources.
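
Because `inject_message` returns False when no CLI is attached, a receiver should check the result rather than assume delivery. The sketch below shows one way to do that. `forward_event` and the two stub context classes are hypothetical; only the `inject_message(content, role="user") -> bool` signature comes from this page.

```python
def forward_event(ctx, text):
    """Try to inject text; return False when it could not be queued."""
    delivered = ctx.inject_message(f"Webhook event: {text}", role="user")
    if not delivered:
        # For example in gateway mode: log instead of failing.
        print(f"[my-plugin] could not inject message: {text}")
    return delivered


class CliContext:
    """Hypothetical stub that behaves like CLI mode."""

    def __init__(self):
        self.queue = []

    def inject_message(self, content, role="user"):
        self.queue.append((role, content))
        return True


class GatewayContext:
    """Hypothetical stub that behaves like gateway mode."""

    def inject_message(self, content, role="user"):
        return False


cli = CliContext()
print(forward_event(cli, "build finished"))               # True
print(forward_event(GatewayContext(), "build finished"))  # False
```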

:::note
inject_message is only available in CLI mode. In gateway mode, there is no CLI reference and the method returns False.
:::

See the full guide, Build a Hermes Plugin, for handler contracts, schema format, hook behavior, error handling, and common mistakes.
