feat(context-engine): host contract for external context engines

Condenses the substance of PRs #16453, #17453, #16451, #17600, and #13373
into a minimal generic host contract that external context engine plugins
(e.g. hermes-lcm) need to integrate cleanly. Drops scaffolding that
duplicated existing infrastructure or had marginal value.

Five concrete changes:

1. `_transition_context_engine_session()` on AIAgent — generic lifecycle
   helper that fires on_session_end → on_session_reset → on_session_start
   → optional carry_over_new_session_context. Engines implement only the
   hooks they need; missing hooks are skipped. Built-in compressor keeps
   its existing reset-only behavior because callers default to no
   metadata. `reset_session_state()` now optionally accepts
   previous_messages / old_session_id / carry_over_context and delegates
   to the transition helper when provided. (#16453)

2. `conversation_id` passed to `on_session_start()` — both the
   agent-init call site and the compression-boundary call site now
   forward `self._gateway_session_key` so plugin engines have a stable
   conversation identity that survives session_id rotation (compression
   splits, /new, resume). The key already existed on AIAgent; it just
   wasn't reaching engines. (#16453)

3. Canonical cache buckets forwarded to engines — the usage dict passed
   to `update_from_response()` now includes input_tokens, output_tokens,
   cache_read_tokens, cache_write_tokens, and reasoning_tokens on top of
   the legacy prompt/completion/total keys. Engines can make decisions on
   cache-hit ratios and reasoning costs instead of only aggregates. ABC
   docstring updated. (#17453)

4. Plugin-registered context engines visible in the picker —
   `_discover_context_engines()` in plugins_cmd.py now also includes
   engines registered via `ctx.register_context_engine()` from plugin
   manifests, deduplicating by name so repo-shipped descriptions win on
   collision. (#16451)

5. `_EngineCollector.register_command()` — context engines using the
   standard `register(ctx)` pattern can now expose slash commands (e.g.
   `/lcm`). Routes to the global plugin command registry with the same
   conflict-rejection policy regular plugins use (no shadowing built-ins,
   no clobbering other plugins). Previously these calls hit a no-op and
   the slash commands silently never appeared. (#17600)

Dropped from the original 5 PRs:

- Compression boundary signal (`boundary_reason="compression"`) from
  #16453 — already on main at `agent/conversation_compression.py:412-424`,
  landed via the bg-review extraction.

- `discover_plugins()` before fallback in run_agent.py from #16451 —
  redundant: `get_plugin_context_engine()` already routes through
  `_ensure_plugins_discovered()` which is idempotent.

- Runtime identity diagnostics method + helpers from #13373 (+251 LOC) —
  operators can already read engine state via `engine.get_status()`;
  the diagnostics view added marginal value relative to its surface area.

- The 553-LOC slash-command machinery from #17600 — replaced with a
  20-LOC `register_command` method on the collector that reuses the
  existing plugin command registry instead of building a parallel one.

Net: ~215 LOC of host-contract changes + 282 LOC of focused tests, vs
~1,176 LOC across the original 5 PRs.

Co-authored-by: Tosko4 <1294707+Tosko4@users.noreply.github.com>

Closes #16453.
Closes #17453.
Closes #16451.
Closes #17600.
Closes #13373.
Related: stephenschoettler/hermes-lcm#68.
This commit is contained in:
teknium1 2026-05-28 01:38:13 -07:00 committed by Teknium
parent fb9f3a4ef9
commit 9b5dae17a5
8 changed files with 491 additions and 14 deletions

View file

@ -527,7 +527,81 @@ class AIAgent:
"Session DB creation failed (will retry next turn): %s", e
)
def reset_session_state(self):
def _transition_context_engine_session(
self,
*,
old_session_id: Optional[str] = None,
new_session_id: Optional[str] = None,
previous_messages: Optional[list] = None,
carry_over_context: bool = False,
reset_engine: bool = True,
**extra_context,
) -> None:
"""Notify the active context engine about a host session transition.
Generic host-side lifecycle helper. The built-in compressor keeps its
existing reset behavior; plugin engines that implement richer hooks
(``on_session_end``, ``on_session_reset``, ``on_session_start``,
``carry_over_new_session_context``) can flush old-session state,
reset runtime counters, bind to the new session, and optionally
carry retained context forward.
"""
engine = getattr(self, "context_compressor", None)
if not engine:
return
if old_session_id and previous_messages is not None and hasattr(engine, "on_session_end"):
try:
engine.on_session_end(old_session_id, previous_messages)
except Exception as exc:
logger.debug("context engine on_session_end during transition: %s", exc)
if reset_engine and hasattr(engine, "on_session_reset"):
try:
engine.on_session_reset()
except Exception as exc:
logger.debug("context engine on_session_reset during transition: %s", exc)
should_start = bool(
old_session_id
or previous_messages is not None
or carry_over_context
or extra_context
)
target_session_id = new_session_id or getattr(self, "session_id", "") or ""
if should_start and target_session_id and hasattr(engine, "on_session_start"):
start_context = {
"old_session_id": old_session_id,
"carry_over_context": carry_over_context,
"platform": getattr(self, "platform", None) or os.environ.get("HERMES_SESSION_SOURCE", "cli"),
"model": getattr(self, "model", ""),
"context_length": getattr(engine, "context_length", None),
"conversation_id": getattr(self, "_gateway_session_key", None),
}
start_context.update(extra_context)
start_context = {k: v for k, v in start_context.items() if v not in (None, "")}
try:
engine.on_session_start(target_session_id, **start_context)
except Exception as exc:
logger.debug("context engine on_session_start during transition: %s", exc)
if (
carry_over_context
and old_session_id
and target_session_id
and hasattr(engine, "carry_over_new_session_context")
):
try:
engine.carry_over_new_session_context(old_session_id, target_session_id)
except Exception as exc:
logger.debug("context engine carry_over_new_session_context during transition: %s", exc)
def reset_session_state(
self,
previous_messages: Optional[list] = None,
old_session_id: Optional[str] = None,
carry_over_context: bool = False,
):
"""Reset all session-scoped token counters to 0 for a fresh session.
This method encapsulates the reset logic for all session-level metrics
@ -541,9 +615,12 @@ class AIAgent:
The method safely handles optional attributes (e.g., context compressor)
using ``hasattr`` checks.
This keeps the counter reset logic DRY and maintainable in one place
rather than scattering it across multiple methods.
When ``previous_messages`` / ``old_session_id`` / ``carry_over_context``
are provided, the active context engine is notified through the
full transition lifecycle (``_transition_context_engine_session``)
instead of a bare reset. Default callers pass nothing and keep the
existing reset-only behavior.
"""
# Token usage counters
self.session_total_tokens = 0
@ -562,9 +639,14 @@ class AIAgent:
# Turn counter (added after reset_session_state was first written — #2635)
self._user_turn_count = 0
# Context engine reset (works for both built-in compressor and plugins)
if hasattr(self, "context_compressor") and self.context_compressor:
self.context_compressor.on_session_reset()
# Context engine reset/transition (works for built-in compressor and plugins)
self._transition_context_engine_session(
old_session_id=old_session_id,
new_session_id=getattr(self, "session_id", None),
previous_messages=previous_messages,
carry_over_context=carry_over_context,
reset_engine=True,
)
def _ensure_lmstudio_runtime_loaded(self, config_context_length: Optional[int] = None) -> None:
"""