fix(compress): make abort-on-summary-failure opt-in via config flag (#28117)

PR #28102 made the summary-failure abort path the unconditional default,
changing established behavior. Gate it behind config.yaml flag
`compression.abort_on_summary_failure` (default False = historical
fallback-placeholder behavior).

- hermes_cli/config.py: new `compression.abort_on_summary_failure` key,
  default False, documented inline.
- agent/agent_init.py: read the flag from compression config and pass to
  ContextCompressor.
- agent/context_compressor.py: `__init__` accepts `abort_on_summary_failure`
  (default False). `compress()` failure branch gates the abort on the
  flag; when False, falls through to the restored legacy fallback path
  (static "summary unavailable" placeholder + drop middle window).
- tests: restore original fallback expectations as default; add new
  TestAbortOnSummaryFailure class for the opt-in mode.

Gateway/CLI plumbing (force=True on /compress, hygiene/handler abort
detection, locale `gateway.compress.aborted` key) from PR #28102 stays
intact — those paths only fire when `_last_compress_aborted` is True,
which now only happens when the flag is enabled.
This commit is contained in:
Teknium 2026-05-18 10:28:20 -07:00 committed by GitHub
parent 5e40f83cb7
commit 9aae59feab
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
4 changed files with 150 additions and 84 deletions

View file

@ -803,6 +803,17 @@ DEFAULT_CONFIG = {
# 0 for long-running rolling-compaction sessions
# where you want nothing pinned except the
# system prompt + rolling summary + recent tail.
"abort_on_summary_failure": False, # When True, auto-compression that fails
# to generate a summary (aux LLM errored / returned
# non-JSON / timed out) aborts entirely instead of
# dropping the middle window with a static
# "summary unavailable" placeholder. Messages are
# preserved unchanged and the session "freezes" at
# its current size until the user runs /compress
# (which bypasses the failure cooldown) or /new.
# Default False matches historical behavior; set to
# True if you'd rather pause than silently lose
# context turns when your aux model is flaky.
},
# Anthropic prompt caching (Claude via OpenRouter or native Anthropic API).