hermes-agent/website/docs/developer-guide/context-engine-plugin.md
Teknium 1d5deac346
fix(website): cross-locale doc links + drop empty ko locale (#31895)
The locale switcher appeared broken because hardcoded markdown links
(`](/docs/X)`) got double-prefixed by Docusaurus to `/docs/<locale>/docs/X`
(404) in non-English locales, and the MDX hero `<a href>` on the index
page escaped locale routing entirely.

Changes:
- Rewrite 922 `](/docs/X)` -> `](/X)` across 166 docs files (strip trailing
  .md too). Docusaurus prepends locale + baseUrl itself.
- docs/index.md -> index.mdx; hero "Get Started" anchor -> Docusaurus
  <Link> so it stays inside the active locale.
- Drop `ko` locale entirely from docusaurus.config.ts + delete i18n/ko/
  (4 stale auto-translated kanban pages, <2% coverage, misleading).

Verified `npm run build` succeeds for both en and zh-Hans; `build/zh-Hans/
index.html` has no /docs/zh-Hans/docs/... double-prefixed paths.

PR2 will translate the 335 English docs into i18n/zh-Hans/.
2026-05-24 23:16:20 -07:00

7 KiB

sidebar_position title description
9 Context Engine Plugins How to build a context engine plugin that replaces the built-in ContextCompressor

Building a Context Engine Plugin

Context engine plugins replace the built-in ContextCompressor with an alternative strategy for managing conversation context. For example, a Lossless Context Management (LCM) engine that builds a knowledge DAG instead of lossy summarization.

How it works

The agent's context management is built on the ContextEngine ABC (agent/context_engine.py). The built-in ContextCompressor is the default implementation. Plugin engines must implement the same interface.

Only one context engine can be active at a time. Selection is config-driven:

# config.yaml
context:
  engine: "compressor"    # default built-in
  engine: "lcm"           # activates a plugin engine named "lcm"

Plugin engines are never auto-activated — the user must explicitly set context.engine to the plugin's name.

Directory structure

Each context engine lives in plugins/context_engine/<name>/:

plugins/context_engine/lcm/
├── __init__.py      # exports the ContextEngine subclass
├── plugin.yaml      # metadata (name, description, version)
└── ...              # any other modules your engine needs

The ContextEngine ABC

Your engine must implement these required methods:

from agent.context_engine import ContextEngine

class LCMEngine(ContextEngine):

    @property
    def name(self) -> str:
        """Short identifier, e.g. 'lcm'. Must match config.yaml value."""
        return "lcm"

    def update_from_response(self, usage: dict) -> None:
        """Called after every LLM call with the usage dict.

        Update self.last_prompt_tokens, self.last_completion_tokens,
        self.last_total_tokens from the response.
        """

    def should_compress(self, prompt_tokens: int = None) -> bool:
        """Return True if compaction should fire this turn."""

    def compress(self, messages: list, current_tokens: int = None,
                 focus_topic: str = None) -> list:
        """Compact the message list and return a new (possibly shorter) list.

        The returned list must be a valid OpenAI-format message sequence.

        ``focus_topic`` is an optional topic string from manual
        ``/compress <focus>``; engines that support guided compression should
        prioritise preserving information related to it, others may ignore it.
        """

Class attributes your engine must maintain

The agent reads these directly for display and logging:

last_prompt_tokens: int = 0
last_completion_tokens: int = 0
last_total_tokens: int = 0
threshold_tokens: int = 0        # when compression triggers
context_length: int = 0          # model's full context window
compression_count: int = 0       # how many times compress() has run

Optional methods

These have sensible defaults in the ABC. Override as needed:

Method Default Override when
on_session_start(session_id, **kwargs) No-op You need to load persisted state (DAG, DB)
on_session_end(session_id, messages) No-op You need to flush state, close connections
on_session_reset() Resets token counters You have per-session state to clear
update_model(model, context_length, ...) Updates context_length + threshold You need to recalculate budgets on model switch
get_tool_schemas() Returns [] Your engine provides agent-callable tools (e.g., lcm_grep)
handle_tool_call(name, args, **kwargs) Returns error JSON You implement tool handlers
should_compress_preflight(messages) Returns False You can do a cheap pre-API-call estimate
get_status() Standard token/threshold dict You have custom metrics to expose

Engine tools

Context engines can expose tools the agent calls directly. Return schemas from get_tool_schemas() and handle calls in handle_tool_call():

def get_tool_schemas(self):
    return [{
        "name": "lcm_grep",
        "description": "Search the context knowledge graph",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"],
        },
    }]

def handle_tool_call(self, name, args, **kwargs):
    if name == "lcm_grep":
        results = self._search_dag(args["query"])
        return json.dumps({"results": results})
    return json.dumps({"error": f"Unknown tool: {name}"})

Engine tools are injected into the agent's tool list at startup and dispatched automatically — no registry registration needed.

Registration

Place your engine in plugins/context_engine/<name>/. The __init__.py must export a ContextEngine subclass. The discovery system finds and instantiates it automatically.

Via general plugin system

A general plugin can also register a context engine:

def register(ctx):
    engine = LCMEngine(context_length=200000)
    ctx.register_context_engine(engine)

Only one engine can be registered. A second plugin attempting to register is rejected with a warning.

Lifecycle

1. Engine instantiated (plugin load or directory discovery)
2. on_session_start() — conversation begins
3. update_from_response() — after each API call
4. should_compress() — checked each turn
5. compress() — called when should_compress() returns True
6. on_session_end() — session boundary (CLI exit, /reset, gateway expiry)

on_session_reset() is called on /new or /reset to clear per-session state without a full shutdown.

Configuration

Users select your engine via hermes plugins → Provider Plugins → Context Engine, or by editing config.yaml:

context:
  engine: "lcm"   # must match your engine's name property

The compression config block (compression.threshold, compression.protect_last_n, etc.) is specific to the built-in ContextCompressor. Your engine should define its own config format if needed, reading from config.yaml during initialization.

Testing

from agent.context_engine import ContextEngine

def test_engine_satisfies_abc():
    engine = YourEngine(context_length=200000)
    assert isinstance(engine, ContextEngine)
    assert engine.name == "your-name"

def test_compress_returns_valid_messages():
    engine = YourEngine(context_length=200000)
    msgs = [{"role": "user", "content": "hello"}]
    result = engine.compress(msgs)
    assert isinstance(result, list)
    assert all("role" in m for m in result)

See tests/agent/test_context_engine.py for the full ABC contract test suite.

See also