mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-08 08:11:38 +00:00
feat(context-engine): host contract for external context engines
Condenses the substance of PRs #16453, #17453, #16451, #17600, and #13373 into a minimal generic host contract that external context engine plugins (e.g. hermes-lcm) need to integrate cleanly. Drops scaffolding that duplicated existing infrastructure or had marginal value.

Five concrete changes:

1. `_transition_context_engine_session()` on AIAgent: generic lifecycle helper that fires on_session_end → on_session_reset → on_session_start → optional carry_over_new_session_context. Engines implement only the hooks they need; missing hooks are skipped. Built-in compressor keeps its existing reset-only behavior because callers default to no metadata. `reset_session_state()` now optionally accepts previous_messages / old_session_id / carry_over_context and delegates to the transition helper when provided. (#16453)

2. `conversation_id` passed to `on_session_start()`: both the agent-init call site and the compression-boundary call site now forward `self._gateway_session_key` so plugin engines have a stable conversation identity that survives session_id rotation (compression splits, /new, resume). The key already existed on AIAgent; it just wasn't reaching engines. (#16453)

3. Canonical cache buckets forwarded to engines: the usage dict passed to `update_from_response()` now includes input_tokens, output_tokens, cache_read_tokens, cache_write_tokens, and reasoning_tokens on top of the legacy prompt/completion/total keys. Engines can make decisions on cache-hit ratios and reasoning costs instead of only aggregates. ABC docstring updated. (#17453)

4. Plugin-registered context engines visible in the picker: `_discover_context_engines()` in plugins_cmd.py now also includes engines registered via `ctx.register_context_engine()` from plugin manifests, deduplicating by name so repo-shipped descriptions win on collision. (#16451)

5. `_EngineCollector.register_command()`: context engines using the standard `register(ctx)` pattern can now expose slash commands (e.g. `/lcm`). Routes to the global plugin command registry with the same conflict-rejection policy regular plugins use (no shadowing built-ins, no clobbering other plugins). Previously these calls hit a no-op and the slash commands silently never appeared. (#17600)

Dropped from the original 5 PRs:

- Compression boundary signal (`boundary_reason="compression"`) from #16453: already on main at `agent/conversation_compression.py:412-424`, landed via the bg-review extraction.
- `discover_plugins()` before fallback in run_agent.py from #16451: redundant, because `get_plugin_context_engine()` already routes through `_ensure_plugins_discovered()`, which is idempotent.
- Runtime identity diagnostics method + helpers from #13373 (+251 LOC): operators can already read engine state via `engine.get_status()`; the diagnostics view added marginal value relative to its surface area.
- The 553-LOC slash-command machinery from #17600: replaced with a 20-LOC `register_command` method on the collector that reuses the existing plugin command registry instead of building a parallel one.

Net: ~215 LOC of host-contract changes + 282 LOC of focused tests, vs ~1,176 LOC across the original 5 PRs.

Co-authored-by: Tosko4 <1294707+Tosko4@users.noreply.github.com>

Closes #16453. Closes #17453. Closes #16451. Closes #17600. Closes #13373.
Related: stephenschoettler/hermes-lcm#68.
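The hook ordering described in change 1 can be illustrated with a minimal sketch. This is not the code from the commit: the function name `transition_engine_session`, its parameters, and the arguments passed to each hook are assumptions made for illustration. Only the hook names and their order come from the message above.

```python
from typing import Any, Dict, List, Optional


def transition_engine_session(
    engine: Any,
    new_session_id: str,
    *,
    old_session_id: Optional[str] = None,
    previous_messages: Optional[List[Dict[str, Any]]] = None,
    carry_over_context: bool = False,
) -> List[str]:
    """Fire lifecycle hooks in order, skipping any the engine lacks.

    Hypothetical stand-in for the helper described in the commit message.
    The hook argument lists are assumptions. Returns the names of the
    hooks that actually ran.
    """
    steps = [
        ("on_session_end", (old_session_id, previous_messages or [])),
        ("on_session_reset", ()),
        ("on_session_start", (new_session_id,)),
    ]
    if carry_over_context:
        steps.append(
            ("carry_over_new_session_context", (old_session_id, new_session_id))
        )

    fired: List[str] = []
    for name, args in steps:
        hook = getattr(engine, name, None)
        if not callable(hook):
            continue  # engines implement only the hooks they need
        hook(*args)
        fired.append(name)
    return fired


class ResetOnlyEngine:
    """Engine with a single hook, similar in spirit to a reset-only compressor."""

    def __init__(self) -> None:
        self.resets = 0

    def on_session_reset(self) -> None:
        self.resets += 1


engine = ResetOnlyEngine()
fired_hooks = transition_engine_session(engine, "s2", old_session_id="s1")
```

Looking hooks up with `getattr` keeps the contract opt-in: an engine that defines only `on_session_reset` sees exactly one call, which matches the "missing hooks are skipped" behavior the message describes.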
parent
fb9f3a4ef9
commit
9b5dae17a5
8 changed files with 491 additions and 14 deletions
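The conflict-rejection policy mentioned in change 5 of the commit message (no shadowing built-ins, no clobbering other plugins) might look roughly like the sketch below. The `CommandRegistry` class, the `BUILTIN_COMMANDS` set, and the `owner` parameter are hypothetical names chosen for illustration; the real plugin command registry is not shown in this diff and may differ.

```python
from typing import Callable, Dict, Tuple

# Hypothetical set of built-in slash commands (assumption for this sketch).
BUILTIN_COMMANDS = {"new", "help"}


class CommandRegistry:
    """Toy registry that rejects conflicting slash-command registrations."""

    def __init__(self) -> None:
        # command name -> (owning plugin, handler)
        self._commands: Dict[str, Tuple[str, Callable[[str], str]]] = {}

    def register(self, name: str, handler: Callable[[str], str], owner: str) -> bool:
        key = name.lstrip("/").lower()
        if key in BUILTIN_COMMANDS:
            return False  # never shadow a built-in command
        existing = self._commands.get(key)
        if existing is not None and existing[0] != owner:
            return False  # never clobber another plugin's command
        self._commands[key] = (owner, handler)
        return True

    def dispatch(self, name: str, args: str) -> str:
        _owner, handler = self._commands[name.lstrip("/").lower()]
        return handler(args)


registry = CommandRegistry()
accepted = registry.register("/lcm", lambda args: f"lcm:{args}", owner="hermes-lcm")
shadowing = registry.register("/new", lambda args: "nope", owner="hermes-lcm")
clobbering = registry.register("/lcm", lambda args: "nope", owner="other-plugin")
```

Returning a boolean rather than raising lets a collector log the rejection and continue loading the rest of the engine, which is one plausible way to avoid the silent no-op the commit message describes.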
@@ -1522,6 +1522,7 @@ def init_agent(
         platform=agent.platform or "cli",
         model=agent.model,
         context_length=getattr(agent.context_compressor, "context_length", 0),
+        conversation_id=getattr(agent, "_gateway_session_key", None),
     )
 except Exception as _ce_err:
     _ra().logger.debug("Context engine on_session_start: %s", _ce_err)
@@ -71,7 +71,12 @@ class ContextEngine(ABC):
     def update_from_response(self, usage: Dict[str, Any]) -> None:
         """Update tracked token usage from an API response.
 
-        Called after every LLM call with the usage dict from the response.
+        Called after every LLM call with a normalized usage dict. The legacy
+        keys ``prompt_tokens``, ``completion_tokens``, and ``total_tokens``
+        are always present. Newer hosts also include canonical buckets:
+        ``input_tokens``, ``output_tokens``, ``cache_read_tokens``,
+        ``cache_write_tokens``, and ``reasoning_tokens``. Engines should
+        treat those fields as optional for compatibility with older hosts.
         """
 
     @abstractmethod
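The docstring above asks engines to treat the canonical buckets as optional. A minimal sketch of an engine-side reader that follows that advice is shown here; the `TokenTracker` class and its attribute names are hypothetical, and only the dictionary keys come from the docstring.

```python
from typing import Any, Dict


class TokenTracker:
    """Hypothetical engine fragment that tolerates older hosts."""

    def __init__(self) -> None:
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.cache_read_tokens = 0
        self.cache_write_tokens = 0
        self.reasoning_tokens = 0

    def update_from_response(self, usage: Dict[str, Any]) -> None:
        # Legacy keys are documented as always present.
        self.prompt_tokens = usage["prompt_tokens"]
        self.completion_tokens = usage["completion_tokens"]
        # Canonical buckets may be absent (older host) or None, so fall back to 0.
        self.cache_read_tokens = usage.get("cache_read_tokens") or 0
        self.cache_write_tokens = usage.get("cache_write_tokens") or 0
        self.reasoning_tokens = usage.get("reasoning_tokens") or 0


old_host = TokenTracker()
old_host.update_from_response(
    {"prompt_tokens": 120, "completion_tokens": 30, "total_tokens": 150}
)

new_host = TokenTracker()
new_host.update_from_response(
    {
        "prompt_tokens": 120,
        "completion_tokens": 30,
        "total_tokens": 150,
        "input_tokens": 20,
        "output_tokens": 30,
        "cache_read_tokens": 100,
        "cache_write_tokens": 0,
        "reasoning_tokens": 12,
    }
)
```

Using `usage.get(...) or 0` covers both a missing key and an explicit `None`, so the same engine code can run against hosts from before and after this change.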
@@ -421,6 +421,7 @@ def compress_context(
         agent.session_id or "",
         boundary_reason="compression",
         old_session_id=_old_sid,
+        conversation_id=getattr(agent, "_gateway_session_key", None),
     )
 except Exception as _ce_err:
     logger.debug("context engine on_session_start (compression): %s", _ce_err)
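As an illustration of why a stable `conversation_id` matters at this call site, the sketch below shows an engine that groups rotating session IDs under one conversation. The class name, the `session_id, **kwargs` signature, and the fallback to `session_id` when no conversation ID is supplied are assumptions; only the keyword names mirror the call shown in the hunk above.

```python
from typing import Any, Dict, List


class ConversationScopedEngine:
    """Hypothetical engine that keys its state by conversation, not session."""

    def __init__(self) -> None:
        self.sessions_by_conversation: Dict[str, List[str]] = {}

    def on_session_start(self, session_id: str, **kwargs: Any) -> None:
        # conversation_id may be None (e.g. no gateway session key is set),
        # so fall back to the session id. This fallback is an assumption.
        conversation_id = kwargs.get("conversation_id") or session_id
        self.sessions_by_conversation.setdefault(conversation_id, []).append(
            session_id
        )


engine = ConversationScopedEngine()
# Initial session for a gateway conversation.
engine.on_session_start("s1", platform="cli", conversation_id="gw-1")
# Compression rotates the session id; the conversation id stays the same.
engine.on_session_start(
    "s2",
    boundary_reason="compression",
    old_session_id="s1",
    conversation_id="gw-1",
)
# A host that supplies no conversation id.
engine.on_session_start("s3", conversation_id=None)
```

Accepting `**kwargs` lets the engine ignore keywords it does not use, such as `platform` or `boundary_reason`, while still picking up `conversation_id` when the host provides it.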
@@ -1769,10 +1769,19 @@ def run_conversation(
 prompt_tokens = canonical_usage.prompt_tokens
 completion_tokens = canonical_usage.output_tokens
 total_tokens = canonical_usage.total_tokens
+# Forward canonical token + cache buckets so context engines
+# can make decisions on cache hit ratios / reasoning costs,
+# not just legacy aggregate tokens. Legacy keys stay for
+# back-compat with engines that only read prompt/completion/total.
 usage_dict = {
     "prompt_tokens": prompt_tokens,
     "completion_tokens": completion_tokens,
     "total_tokens": total_tokens,
+    "input_tokens": canonical_usage.input_tokens,
+    "output_tokens": canonical_usage.output_tokens,
+    "cache_read_tokens": canonical_usage.cache_read_tokens,
+    "cache_write_tokens": canonical_usage.cache_write_tokens,
+    "reasoning_tokens": canonical_usage.reasoning_tokens,
 }
 agent.context_compressor.update_from_response(usage_dict)
 
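One way an engine could turn the forwarded buckets into a cache-hit ratio is sketched below. The helper name `cache_hit_ratio` is hypothetical, and the denominator rests on an assumption the diff does not confirm: that `input_tokens`, `cache_read_tokens`, and `cache_write_tokens` are disjoint buckets that together make up the prompt.

```python
from typing import Any, Dict, Optional


def cache_hit_ratio(usage: Dict[str, Any]) -> Optional[float]:
    """Return the fraction of prompt tokens served from cache, or None.

    Assumption: input, cache-read, and cache-write tokens are disjoint
    buckets. Returns None when the host did not forward cache buckets or
    when there were no prompt-side tokens at all.
    """
    cache_read = usage.get("cache_read_tokens")
    if cache_read is None:
        return None  # older host: canonical buckets not forwarded
    uncached = usage.get("input_tokens") or 0
    cache_write = usage.get("cache_write_tokens") or 0
    denominator = uncached + cache_read + cache_write
    if denominator <= 0:
        return None
    return cache_read / denominator


ratio = cache_hit_ratio(
    {
        "prompt_tokens": 1000,
        "completion_tokens": 50,
        "total_tokens": 1050,
        "input_tokens": 200,
        "output_tokens": 50,
        "cache_read_tokens": 700,
        "cache_write_tokens": 100,
        "reasoning_tokens": 0,
    }
)
legacy_ratio = cache_hit_ratio(
    {"prompt_tokens": 1000, "completion_tokens": 50, "total_tokens": 1050}
)
```

Returning `None` instead of `0.0` for legacy-only dicts lets an engine distinguish "no cache hits" from "no cache data", which matters if the ratio feeds a compression decision.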