hermes-agent/website/docs/developer-guide/context-engine-plugin.md
Teknium 79198eb3a0 docs: context engine plugin system + unified hermes plugins UI
New page:
- developer-guide/context-engine-plugin.md — full guide for building
  context engine plugins (ABC contract, lifecycle, tools, registration)

Updated pages (11 files):
- plugins.md — plugin types table, composite UI documentation with
  screenshot-style example, provider plugin config format
- cli-commands.md — hermes plugins section rewritten for composite UI
  with provider plugin config keys documented
- context-compression-and-caching.md — new 'Pluggable Context Engine'
  section explaining the ABC, config-driven selection, resolution order
- configuration.md — new 'Context Engine' config section with examples
- architecture.md — context_engine.py and plugins/context_engine/ added
  to directory trees, plugin system description updated
- memory-provider-plugin.md — cross-reference tip to context engines
- memory-providers.md — hermes plugins as alternative setup path
- agent-loop.md — context_engine.py added to file reference table
- overview.md — plugins description expanded to cover all 3 types
- build-a-hermes-plugin.md — tip box linking to specialized plugin guides
- sidebars.ts — context-engine-plugin added to Extending category
2026-04-10 19:15:50 -07:00

6.8 KiB

sidebar_position title description
9 Context Engine Plugins How to build a context engine plugin that replaces the built-in ContextCompressor

Building a Context Engine Plugin

Context engine plugins replace the built-in ContextCompressor with an alternative strategy for managing conversation context. For example, a Lossless Context Management (LCM) engine that builds a knowledge DAG instead of lossy summarization.

How it works

The agent's context management is built on the ContextEngine ABC (agent/context_engine.py). The built-in ContextCompressor is the default implementation. Plugin engines must implement the same interface.

Only one context engine can be active at a time. Selection is config-driven:

# config.yaml
context:
  engine: "compressor"    # default built-in
  engine: "lcm"           # activates a plugin engine named "lcm"

Plugin engines are never auto-activated — the user must explicitly set context.engine to the plugin's name.

Directory structure

Each context engine lives in plugins/context_engine/<name>/:

plugins/context_engine/lcm/
├── __init__.py      # exports the ContextEngine subclass
├── plugin.yaml      # metadata (name, description, version)
└── ...              # any other modules your engine needs

The ContextEngine ABC

Your engine must implement these required methods:

from agent.context_engine import ContextEngine

class LCMEngine(ContextEngine):

    @property
    def name(self) -> str:
        """Short identifier, e.g. 'lcm'. Must match config.yaml value."""
        return "lcm"

    def update_from_response(self, usage: dict) -> None:
        """Called after every LLM call with the usage dict.

        Update self.last_prompt_tokens, self.last_completion_tokens,
        self.last_total_tokens from the response.
        """

    def should_compress(self, prompt_tokens: int = None) -> bool:
        """Return True if compaction should fire this turn."""

    def compress(self, messages: list, current_tokens: int = None) -> list:
        """Compact the message list and return a new (possibly shorter) list.

        The returned list must be a valid OpenAI-format message sequence.
        """

Class attributes your engine must maintain

The agent reads these directly for display and logging:

last_prompt_tokens: int = 0
last_completion_tokens: int = 0
last_total_tokens: int = 0
threshold_tokens: int = 0        # when compression triggers
context_length: int = 0          # model's full context window
compression_count: int = 0       # how many times compress() has run

Optional methods

These have sensible defaults in the ABC. Override as needed:

Method Default Override when
on_session_start(session_id, **kwargs) No-op You need to load persisted state (DAG, DB)
on_session_end(session_id, messages) No-op You need to flush state, close connections
on_session_reset() Resets token counters You have per-session state to clear
update_model(model, context_length, ...) Updates context_length + threshold You need to recalculate budgets on model switch
get_tool_schemas() Returns [] Your engine provides agent-callable tools (e.g., lcm_grep)
handle_tool_call(name, args, **kwargs) Returns error JSON You implement tool handlers
should_compress_preflight(messages) Returns False You can do a cheap pre-API-call estimate
get_status() Standard token/threshold dict You have custom metrics to expose

Engine tools

Context engines can expose tools the agent calls directly. Return schemas from get_tool_schemas() and handle calls in handle_tool_call():

def get_tool_schemas(self):
    return [{
        "name": "lcm_grep",
        "description": "Search the context knowledge graph",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"],
        },
    }]

def handle_tool_call(self, name, args, **kwargs):
    if name == "lcm_grep":
        results = self._search_dag(args["query"])
        return json.dumps({"results": results})
    return json.dumps({"error": f"Unknown tool: {name}"})

Engine tools are injected into the agent's tool list at startup and dispatched automatically — no registry registration needed.

Registration

Place your engine in plugins/context_engine/<name>/. The __init__.py must export a ContextEngine subclass. The discovery system finds and instantiates it automatically.

Via general plugin system

A general plugin can also register a context engine:

def register(ctx):
    engine = LCMEngine(context_length=200000)
    ctx.register_context_engine(engine)

Only one engine can be registered. A second plugin attempting to register is rejected with a warning.

Lifecycle

1. Engine instantiated (plugin load or directory discovery)
2. on_session_start() — conversation begins
3. update_from_response() — after each API call
4. should_compress() — checked each turn
5. compress() — called when should_compress() returns True
6. on_session_end() — session boundary (CLI exit, /reset, gateway expiry)

on_session_reset() is called on /new or /reset to clear per-session state without a full shutdown.

Configuration

Users select your engine via hermes plugins → Provider Plugins → Context Engine, or by editing config.yaml:

context:
  engine: "lcm"   # must match your engine's name property

The compression config block (compression.threshold, compression.protect_last_n, etc.) is specific to the built-in ContextCompressor. Your engine should define its own config format if needed, reading from config.yaml during initialization.

Testing

from agent.context_engine import ContextEngine

def test_engine_satisfies_abc():
    engine = YourEngine(context_length=200000)
    assert isinstance(engine, ContextEngine)
    assert engine.name == "your-name"

def test_compress_returns_valid_messages():
    engine = YourEngine(context_length=200000)
    msgs = [{"role": "user", "content": "hello"}]
    result = engine.compress(msgs)
    assert isinstance(result, list)
    assert all("role" in m for m in result)

See tests/agent/test_context_engine.py for the full ABC contract test suite.

See also