mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-02 02:01:47 +00:00
Broad drift audit against origin/main (b52b63396).
Reference pages (most user-visible drift):
- slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer
that were missing; drop non-existent /terminal-setup; fix /q footnote
(resolves to /queue, not /quit); extend CLI-only list with all 24
CLI-only commands in the registry
- cli-commands: add dedicated sections for hermes curator / fallback /
hooks (new subcommands not previously documented); remove stale
hermes honcho standalone section (the plugin registers dynamically
via hermes memory); list curator/fallback/hooks in top-level table;
fix completion to include fish
- toolsets-reference: document the real 52-toolset count; split browser
vs browser-cdp; add discord / discord_admin / spotify / yuanbao;
correct hermes-cli tool count from 36 to 38; fix misleading claim
that hermes-homeassistant adds tools (it's identical to hermes-cli)
- tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao,
2 Discord toolsets; move browser_cdp/browser_dialog to their own
browser-cdp toolset section
- environment-variables: add 40+ user-facing HERMES_* vars that were
undocumented (--yolo, --accept-hooks, --ignore-*, inference model
override, agent/stream/checkpoint timeouts, OAuth trace, per-platform
batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs,
gateway restart/connect timeouts); dedupe the Cron Scheduler section;
replace stale QQ_SANDBOX with QQ_PORTAL_HOST
User-guide (top level):
- cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20)
- configuration.md: display.platforms is the canonical per-platform
override key; tool_progress_overrides is deprecated and auto-migrated
- profiles.md: model.default is the config key, not model.model
- sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8
- checkpoints-and-rollback.md: destructive-command list now matches
_DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd)
- docker.md: the container runs as non-root hermes (UID 10000) via
gosu; fix install command (uv pip); add missing --insecure on the
dashboard compose example (required for non-loopback bind)
- security.md: systemctl danger pattern also matches 'restart'
- index.md: built-in tool count 47 -> 68
- integrations/index.md: 6 STT providers, 8 memory providers
- integrations/providers.md: drop fictional dashscope/qwen aliases
Features:
- overview.md: 9 image models (not 8), 9 TTS providers (not 5),
8 memory providers (Supermemory was missing)
- tool-gateway.md: 9 image models
- tools.md: extend common-toolsets list with search / messaging /
spotify / discord / debugging / safe
- fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY
(lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan,
tencent-tokenhub, azure-foundry)
- plugins.md: Available Hooks table now includes on_session_finalize,
on_session_reset, subagent_stop
- built-in-plugins.md: add the 7 bundled plugins the page didn't
mention (spotify, google_meet, three image_gen providers, two
dashboard examples)
- web-dashboard.md: add --insecure and --tui flags
- cron.md: hermes cron create takes positional schedule/prompt, not
flags
Messaging:
- telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when
TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it
per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch.
- discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default
is 2.0, not 0.1
- dingtalk.md: document DINGTALK_REQUIRE_MENTION /
FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL /
ALLOW_ALL_USERS that the adapter supports
- bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env
var; the setting lives in platforms.bluebubbles.extra only
- qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and
QQ_GROUP_ALLOWED_USERS
- wecom-callback.md: replace 'hermes gateway start' (service-only)
with 'hermes gateway' for first-time setup
Developer-guide:
- architecture.md: refresh tool/toolset counts (61/52), terminal
backend count (7), line counts for run_agent.py (~13.7k), cli.py
(~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py
(~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform
adapter count 18 -> 20
- agent-loop.md: run_agent.py line count 10.7k -> 13.7k
- tools-runtime.md: add vercel_sandbox backend
- adding-tools.md: remove stale 'Discovery import added to
model_tools.py' checklist item (registry auto-discovery)
- adding-platform-adapters.md: mark send_typing / get_chat_info as
concrete base methods; only connect/disconnect/send are abstract
- acp-internals.md: ACP sessions now persist to SessionDB
(~/.hermes/state.db); acp.run_agent call uses
use_unstable_protocol=True
- cron-internals.md: gateway runs scheduler in a dedicated background
thread via _start_cron_ticker, not on a maintenance cycle; locking
is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows)
- gateway-internals.md: gateway/run.py ~12k lines
- provider-runtime.md: cron DOES support fallback (run_job reads
fallback_providers from config)
- session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations
10 and 11 (trigram FTS, inline-mode FTS5 re-index); add
api_call_count column to Sessions DDL; document messages_fts_trigram
and state_meta in the architecture tree
- context-compression-and-caching.md: remove the obsolete 'context
pressure warnings' section (warnings were removed for causing
models to give up early)
- context-engine-plugin.md: compress() signature now includes
focus_topic param
- extending-the-cli.md: _build_tui_layout_children signature now
includes model_picker_widget; add to default layout
Also fixed three pre-existing broken links/anchors the build warned
about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and
tips#background-tasks, nix-setup.md -> #container-aware-cli).
Regenerated per-skill pages via website/scripts/generate-skill-docs.py
so catalog tables and sidebar are consistent with current SKILL.md
frontmatter.
docusaurus build: clean, no broken links or anchors.
---
sidebar_position: 9
title: "Context Engine Plugins"
description: "How to build a context engine plugin that replaces the built-in ContextCompressor"
---

# Building a Context Engine Plugin

Context engine plugins replace the built-in `ContextCompressor` with an alternative strategy for managing conversation context. For example, a Lossless Context Management (LCM) engine could build a knowledge DAG instead of relying on lossy summarization.

## How it works

The agent's context management is built on the `ContextEngine` ABC (`agent/context_engine.py`). The built-in `ContextCompressor` is the default implementation. Plugin engines must implement the same interface.

Only **one** context engine can be active at a time. Selection is config-driven:

```yaml
# config.yaml
context:
  engine: "compressor"    # default built-in
  # engine: "lcm"         # use this value instead to activate a plugin engine named "lcm"
```

Plugin engines are **never auto-activated**: the user must explicitly set `context.engine` to the plugin's name.

## Directory structure

Each context engine lives in `plugins/context_engine/<name>/`:

```
plugins/context_engine/lcm/
├── __init__.py      # exports the ContextEngine subclass
├── plugin.yaml      # metadata (name, description, version)
└── ...              # any other modules your engine needs
```

## The ContextEngine ABC

Your engine must implement these **required** methods:

```python
from agent.context_engine import ContextEngine

class LCMEngine(ContextEngine):

    @property
    def name(self) -> str:
        """Short identifier, e.g. 'lcm'. Must match config.yaml value."""
        return "lcm"

    def update_from_response(self, usage: dict) -> None:
        """Called after every LLM call with the usage dict.

        Update self.last_prompt_tokens, self.last_completion_tokens,
        self.last_total_tokens from the response.
        """

    def should_compress(self, prompt_tokens: int = None) -> bool:
        """Return True if compaction should fire this turn."""

    def compress(self, messages: list, current_tokens: int = None,
                 focus_topic: str = None) -> list:
        """Compact the message list and return a new (possibly shorter) list.

        The returned list must be a valid OpenAI-format message sequence.

        ``focus_topic`` is an optional topic string from manual
        ``/compress <focus>``; engines that support guided compression should
        prioritise preserving information related to it, others may ignore it.
        """
```

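To make the contract concrete, here is a minimal sketch of an engine that keeps system messages plus the most recent turns. The `ContextEngine` class below is a simplified stand-in so the snippet runs on its own; in a real plugin you would import it from `agent.context_engine`. `TailEngine`, `keep_last`, and the 75% threshold are illustrative assumptions, and the `usage` key names assume an OpenAI-style usage dict.

```python
from abc import ABC, abstractmethod
from typing import Optional


class ContextEngine(ABC):
    """Simplified stand-in for agent.context_engine.ContextEngine (assumption)."""

    last_prompt_tokens: int = 0
    last_completion_tokens: int = 0
    last_total_tokens: int = 0
    threshold_tokens: int = 0
    context_length: int = 0
    compression_count: int = 0

    @property
    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def update_from_response(self, usage: dict) -> None: ...

    @abstractmethod
    def should_compress(self, prompt_tokens: Optional[int] = None) -> bool: ...

    @abstractmethod
    def compress(self, messages: list, current_tokens: Optional[int] = None,
                 focus_topic: Optional[str] = None) -> list: ...


class TailEngine(ContextEngine):
    """Hypothetical engine: keep system messages plus the last N others."""

    def __init__(self, context_length: int, keep_last: int = 4):
        self.context_length = context_length
        self.threshold_tokens = int(context_length * 0.75)  # assumed ratio
        self.keep_last = keep_last

    @property
    def name(self) -> str:
        return "tail"

    def update_from_response(self, usage: dict) -> None:
        # Key names assume an OpenAI-style usage dict.
        self.last_prompt_tokens = usage.get("prompt_tokens", 0)
        self.last_completion_tokens = usage.get("completion_tokens", 0)
        self.last_total_tokens = usage.get("total_tokens", 0)

    def should_compress(self, prompt_tokens: Optional[int] = None) -> bool:
        tokens = self.last_prompt_tokens if prompt_tokens is None else prompt_tokens
        return tokens >= self.threshold_tokens

    def compress(self, messages: list, current_tokens: Optional[int] = None,
                 focus_topic: Optional[str] = None) -> list:
        # focus_topic is accepted but ignored, which the contract allows.
        system = [m for m in messages if m["role"] == "system"]
        rest = [m for m in messages if m["role"] != "system"]
        self.compression_count += 1
        return system + rest[-self.keep_last:]
```

This sketch slices purely by position, so it can separate a tool result from the assistant message that requested it. A real engine needs to keep such pairs together for the output to remain a valid message sequence.
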
### Class attributes your engine must maintain

The agent reads these directly for display and logging:

```python
last_prompt_tokens: int = 0
last_completion_tokens: int = 0
last_total_tokens: int = 0
threshold_tokens: int = 0        # when compression triggers
context_length: int = 0          # model's full context window
compression_count: int = 0       # how many times compress() has run
```

### Optional methods

These have sensible defaults in the ABC. Override as needed:

| Method | Default | Override when |
|--------|---------|--------------|
| `on_session_start(session_id, **kwargs)` | No-op | You need to load persisted state (DAG, DB) |
| `on_session_end(session_id, messages)` | No-op | You need to flush state, close connections |
| `on_session_reset()` | Resets token counters | You have per-session state to clear |
| `update_model(model, context_length, ...)` | Updates context_length + threshold | You need to recalculate budgets on model switch |
| `get_tool_schemas()` | Returns `[]` | Your engine provides agent-callable tools (e.g., `lcm_grep`) |
| `handle_tool_call(name, args, **kwargs)` | Returns error JSON | You implement tool handlers |
| `should_compress_preflight(messages)` | Returns `False` | You can do a cheap pre-API-call estimate |
| `get_status()` | Standard token/threshold dict | You have custom metrics to expose |

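As an illustration of `should_compress_preflight`, a preflight check can estimate the prompt size without calling a tokenizer. The sketch below assumes roughly four characters per token, which is a common rule of thumb and not something the ABC prescribes. `estimate_tokens` and `PreflightMixin` are hypothetical names.

```python
import json


def estimate_tokens(messages: list) -> int:
    """Rough size estimate: about 4 characters per token (heuristic assumption)."""
    chars = sum(len(json.dumps(m, ensure_ascii=False)) for m in messages)
    return chars // 4


class PreflightMixin:
    """Hypothetical mixin; expects the host class to set threshold_tokens."""

    threshold_tokens: int = 0

    def should_compress_preflight(self, messages: list) -> bool:
        # Runs before the API call, so no usage data from the provider exists yet.
        return estimate_tokens(messages) >= self.threshold_tokens
```

Serializing each message with `json.dumps` counts role names and tool-call payloads as well as the text content, which makes the estimate slightly pessimistic. That is usually the safer direction for a trigger.
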
## Engine tools

Context engines can expose tools the agent calls directly. Return schemas from `get_tool_schemas()` and handle calls in `handle_tool_call()`:

```python
import json

def get_tool_schemas(self):
    return [{
        "name": "lcm_grep",
        "description": "Search the context knowledge graph",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"],
        },
    }]

def handle_tool_call(self, name, args, **kwargs):
    # Results are returned to the agent as a JSON string.
    if name == "lcm_grep":
        results = self._search_dag(args["query"])
        return json.dumps({"results": results})
    return json.dumps({"error": f"Unknown tool: {name}"})
```

Engine tools are injected into the agent's tool list at startup and dispatched automatically, so no registry registration is needed.

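The `_search_dag` helper in the snippet above is left undefined. The following sketch fills it in with a flat in-memory list standing in for the knowledge graph, and adds a guard for a missing `query` argument. `ToyToolEngine` and its `_notes` store are hypothetical and exist only to show the call and return shapes.

```python
import json


class ToyToolEngine:
    """Hypothetical: only the tool-handling half of an engine."""

    def __init__(self):
        self._notes = ["user prefers tabs", "project uses Python 3.11"]

    def _search_dag(self, query: str) -> list:
        # Case-insensitive substring match over a flat list.
        q = query.lower()
        return [n for n in self._notes if q in n.lower()]

    def handle_tool_call(self, name, args, **kwargs):
        if name == "lcm_grep":
            query = args.get("query")
            if not isinstance(query, str):
                return json.dumps({"error": "Missing required argument: query"})
            return json.dumps({"results": self._search_dag(query)})
        return json.dumps({"error": f"Unknown tool: {name}"})


engine = ToyToolEngine()
print(engine.handle_tool_call("lcm_grep", {"query": "python"}))
print(engine.handle_tool_call("nope", {}))
```

Returning an error payload instead of raising keeps a malformed tool call from the model from interrupting the turn.
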
## Registration

### Via directory (recommended)

Place your engine in `plugins/context_engine/<name>/`. The `__init__.py` must export a `ContextEngine` subclass. The discovery system finds and instantiates it automatically.

### Via general plugin system

A general plugin can also register a context engine:

```python
def register(ctx):
    engine = LCMEngine(context_length=200000)
    ctx.register_context_engine(engine)
```

Only one engine can be registered. A second plugin attempting to register is rejected with a warning.

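When unit-testing a `register()` function, a small test double for `ctx` can be handy. The sketch below is a hypothetical stand-in, not the real plugin context: it only mimics the single-engine rule described above, and the boolean return value is an assumption added for convenience in tests.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("plugins")


class FakePluginContext:
    """Hypothetical stand-in for the ctx object passed to register()."""

    def __init__(self):
        self.context_engine = None

    def register_context_engine(self, engine) -> bool:
        # First registration wins; later ones are logged and ignored.
        if self.context_engine is not None:
            logger.warning("Context engine %r already registered; ignoring %r",
                           self.context_engine.name, engine.name)
            return False
        self.context_engine = engine
        return True


ctx = FakePluginContext()
first = ctx.register_context_engine(SimpleNamespace(name="lcm"))
second = ctx.register_context_engine(SimpleNamespace(name="other"))
```
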
## Lifecycle

```
1. Engine instantiated (plugin load or directory discovery)
2. on_session_start()      - conversation begins
3. update_from_response()  - after each API call
4. should_compress()       - checked each turn
5. compress()              - called when should_compress() returns True
6. on_session_end()        - session boundary (CLI exit, /reset, gateway expiry)
```

`on_session_reset()` is called on `/new` or `/reset` to clear per-session state without a full shutdown.

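The ordering above can be traced with a small simulation. `run_session` is a deliberately simplified stand-in for the agent loop and not the real implementation, and `RecordingEngine` is a hypothetical engine that only records which hooks fire.

```python
class RecordingEngine:
    """Hypothetical engine that records hook calls in order."""

    def __init__(self, threshold_tokens: int):
        self.threshold_tokens = threshold_tokens
        self.last_prompt_tokens = 0
        self.calls = []

    def on_session_start(self, session_id, **kwargs):
        self.calls.append("on_session_start")

    def update_from_response(self, usage):
        self.calls.append("update_from_response")
        self.last_prompt_tokens = usage.get("prompt_tokens", 0)

    def should_compress(self, prompt_tokens=None):
        self.calls.append("should_compress")
        return self.last_prompt_tokens >= self.threshold_tokens

    def compress(self, messages, current_tokens=None, focus_topic=None):
        self.calls.append("compress")
        return messages[-2:]  # keep only the latest user/assistant pair

    def on_session_end(self, session_id, messages):
        self.calls.append("on_session_end")


def run_session(engine, usages):
    """Simplified stand-in for the agent loop: one usage dict per turn."""
    messages = []
    engine.on_session_start("demo")
    for i, usage in enumerate(usages):
        messages.append({"role": "user", "content": f"turn {i}"})
        messages.append({"role": "assistant", "content": f"reply {i}"})
        engine.update_from_response(usage)
        if engine.should_compress():
            messages = engine.compress(messages)
    engine.on_session_end("demo", messages)
    return messages


engine = RecordingEngine(threshold_tokens=100)
remaining = run_session(engine, [{"prompt_tokens": 10}, {"prompt_tokens": 150}])
print(engine.calls)
```

With these inputs, `compress()` fires only on the second turn, once the reported prompt size crosses the threshold.
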
## Configuration

Users select your engine via `hermes plugins` → Provider Plugins → Context Engine, or by editing `config.yaml`:

```yaml
context:
  engine: "lcm"   # must match your engine's name property
```

The `compression` config block (`compression.threshold`, `compression.protect_last_n`, etc.) is specific to the built-in `ContextCompressor`. Your engine should define its own config format if needed, reading from `config.yaml` during initialization.

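One way to handle engine-specific settings is to merge a dedicated block over defaults and reject unknown keys early. The sketch below assumes the already-parsed `config.yaml` is available as a plain dict; the top-level `lcm` section and the `max_nodes` and `summary_model` keys are hypothetical and not part of Hermes.

```python
DEFAULTS = {"max_nodes": 5000, "summary_model": None}  # hypothetical keys


def load_engine_settings(config: dict, section: str = "lcm") -> dict:
    """Merge a hypothetical engine-specific config block over DEFAULTS."""
    raw = config.get(section) or {}
    unknown = set(raw) - set(DEFAULTS)
    if unknown:
        # Fail loudly on typos instead of silently ignoring them.
        raise ValueError(f"Unknown {section} settings: {sorted(unknown)}")
    return {**DEFAULTS, **raw}


settings = load_engine_settings({"context": {"engine": "lcm"}, "lcm": {"max_nodes": 100}})
```
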
## Testing

```python
from agent.context_engine import ContextEngine

# YourEngine is a placeholder: import your ContextEngine subclass from your
# plugin package and replace "your-name" with its name property.

def test_engine_satisfies_abc():
    engine = YourEngine(context_length=200000)
    assert isinstance(engine, ContextEngine)
    assert engine.name == "your-name"

def test_compress_returns_valid_messages():
    engine = YourEngine(context_length=200000)
    msgs = [{"role": "user", "content": "hello"}]
    result = engine.compress(msgs)
    assert isinstance(result, list)
    assert all("role" in m for m in result)
```

See `tests/agent/test_context_engine.py` for the full ABC contract test suite.

## See also

- [Context Compression and Caching](/docs/developer-guide/context-compression-and-caching): how the built-in compressor works
- [Memory Provider Plugins](/docs/developer-guide/memory-provider-plugin): analogous single-select plugin system for memory
- [Plugins](/docs/user-guide/features/plugins): general plugin system overview