hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-17 09:41:58 +00:00

History

Teknium f0dc919f92 fix(compression): include system prompt + tool schemas in token estimates (#18265 ) The user-visible /compress banner and the post-compression last_prompt_tokens writeback both counted only the raw message transcript (chars/4). With a 15KB system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of request pressure — a 234x gap. Two user-facing consequences: - Banner shows 'Compressing … (~45 tokens)…' while compression is actually firing on 10K+ tokens of real pressure, confusing users about why compression triggered (reported by @codecovenant on X; #6217). - Post-compression last_prompt_tokens writeback omits tool schemas, so the next should_compress() check compares real usage against a stale underestimate — compression triggers late, potentially past the model's context limit on small-context models (#14695). Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough() at every user-visible banner and at the post-compression writeback. estimate_request_tokens_rough() already existed for exactly this purpose and includes system prompt + tool schemas. Touched call sites: - run_agent.py: post-compression last_prompt_tokens writeback, post-tool call should_compress() fallback when provider usage is missing - cli.py: /compress banner + summary - gateway/run.py: gateway /compress banner + summary - tui_gateway/server.py: TUI /compress status + summary - acp_adapter/server.py: ACP /compact before/after Left intentionally alone: - Session-hygiene fallback and the 'no agent' /status path in gateway/run.py — no agent instance is in scope to query for system prompt/tools, and the existing 30-50% overestimate wobble on hygiene is safety-accepted. - Verbose-mode 'Request size' logging — informational only, already counts system prompt via api_messages[0]. Also relabels the feedback line from 'Rough transcript estimate' to 'Approx request size' so the metric label matches what it actually measures. Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217); user report @codecovenant on X (2026-04-30). Closes #14695 Closes #6217		2026-04-30 23:03:54 -07:00
..
builtin_hooks	remove: BOOT.md built-in hook (#17093 )	2026-04-28 09:50:27 -07:00
platforms	fix(gateway): snapshot callback generation after agent binds it, not before	2026-04-30 20:41:18 -07:00
__init__.py	Enhance CLI with multi-platform messaging integration and configuration management	2026-02-02 19:01:51 -08:00
channel_directory.py	feat: complete plugin platform parity — all 12 integration points	2026-04-29 21:56:51 -07:00
config.py	fix(gateway): coerce StreamingConfig booleans and malformed numerics safely	2026-04-30 20:37:49 -07:00
delivery.py	refactor: remove dead code — 1,784 lines across 77 files (#9180 )	2026-04-13 16:32:04 -07:00
display_config.py	fix(gateway): default Slack tool_progress to off	2026-04-26 18:33:35 -07:00
hooks.py	fix(plugins): register dynamically-loaded modules in sys.modules before exec	2026-04-29 23:34:35 -07:00
mirror.py	fix(gateway): avoid cross-user mirror writes in per-user group sessions	2026-04-26 18:31:24 -07:00
pairing.py	refactor: consolidate symlink-safe atomic replace into shared helper	2026-04-28 04:58:22 -07:00
platform_registry.py	feat(irc): add interactive setup	2026-04-29 21:56:51 -07:00
restart.py	fix(gateway): address restart review feedback	2026-04-10 21:18:34 -07:00
run.py	fix(compression): include system prompt + tool schemas in token estimates (#18265 )	2026-04-30 23:03:54 -07:00
runtime_footer.py	feat(gateway): opt-in runtime-metadata footer on final replies (#17026 )	2026-04-28 06:50:04 -07:00
session.py	fix(gateway): re-inject topic-bound skill after /new or /reset	2026-04-30 20:29:19 -07:00
session_context.py	fix(cron): run due jobs in parallel to prevent serial tick starvation (#13021 )	2026-04-20 11:53:07 -07:00
status.py	fix(gateway): write restart markers atomically and fix Windows lock collisions	2026-04-30 19:58:16 -07:00
sticker_cache.py	chore: remove ~100 unused imports across 55 files (#3016 )	2026-03-25 15:02:03 -07:00
stream_consumer.py	chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664	2026-04-29 21:56:51 -07:00
whatsapp_identity.py	fix(whatsapp_identity): pin identifier regex to ASCII, clarify it's defense-in-depth	2026-04-26 20:48:31 -07:00