hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-18 09:51:59 +00:00

History

Teknium f0dc919f92 fix(compression): include system prompt + tool schemas in token estimates (#18265 ) The user-visible /compress banner and the post-compression last_prompt_tokens writeback both counted only the raw message transcript (chars/4). With a 15KB system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of request pressure — a 234x gap. Two user-facing consequences: - Banner shows 'Compressing … (~45 tokens)…' while compression is actually firing on 10K+ tokens of real pressure, confusing users about why compression triggered (reported by @codecovenant on X; #6217). - Post-compression last_prompt_tokens writeback omits tool schemas, so the next should_compress() check compares real usage against a stale underestimate — compression triggers late, potentially past the model's context limit on small-context models (#14695). Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough() at every user-visible banner and at the post-compression writeback. estimate_request_tokens_rough() already existed for exactly this purpose and includes system prompt + tool schemas. Touched call sites: - run_agent.py: post-compression last_prompt_tokens writeback, post-tool call should_compress() fallback when provider usage is missing - cli.py: /compress banner + summary - gateway/run.py: gateway /compress banner + summary - tui_gateway/server.py: TUI /compress status + summary - acp_adapter/server.py: ACP /compact before/after Left intentionally alone: - Session-hygiene fallback and the 'no agent' /status path in gateway/run.py — no agent instance is in scope to query for system prompt/tools, and the existing 30-50% overestimate wobble on hygiene is safety-accepted. - Verbose-mode 'Request size' logging — informational only, already counts system prompt via api_messages[0]. Also relabels the feedback line from 'Rough transcript estimate' to 'Approx request size' so the metric label matches what it actually measures. Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217); user report @codecovenant on X (2026-04-30). Closes #14695 Closes #6217		2026-04-30 23:03:54 -07:00
..
transports	fix(deepseek): preserve v4 reasoning_content on replay	2026-04-30 11:18:39 -07:00
__init__.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
account_usage.py	feat(account-usage): add per-provider account limits module	2026-04-21 01:56:35 -07:00
anthropic_adapter.py	fix(anthropic): reactive recovery for OAuth 1M-context beta rejection (#17752 )	2026-04-29 21:56:54 -07:00
auxiliary_client.py	fix(fallback): let custom_providers shadow built-in aliases	2026-04-30 20:18:44 -07:00
bedrock_adapter.py	fix(bedrock): add live model discovery and region resolution for non-US regions	2026-04-28 03:53:11 -07:00
codex_responses_adapter.py	fix(agent): preserve Codex message items for replay	2026-04-25 18:22:06 -07:00
context_compressor.py	fix(context_compressor): off-by-one in tail protection for short conversations	2026-04-30 20:00:01 -07:00
context_engine.py	fix(compress): don't reach into ContextCompressor privates from /compress (#15039 )	2026-04-24 02:55:43 -07:00
context_references.py	fix(agent): fall back when rg is blocked for @folder references	2026-04-20 01:56:41 -07:00
copilot_acp_client.py	fix(ci): stabilize main test suite regressions (#17660 )	2026-04-29 23:18:55 -07:00
credential_pool.py	feat(minimax-oauth): full integration with peer OAuth providers	2026-04-29 09:53:42 -07:00
credential_sources.py	feat(minimax-oauth): full integration with peer OAuth providers	2026-04-29 09:53:42 -07:00
curator.py	fix(curator): preserve last_report_path in state	2026-04-30 19:45:59 -07:00
display.py	fix(guardrails): preserve display _detect_tool_failure semantics	2026-04-30 20:43:15 -07:00
error_classifier.py	fix(anthropic): reactive recovery for OAuth 1M-context beta rejection (#17752 )	2026-04-29 21:56:54 -07:00
file_safety.py	fix(security): apply file safety to copilot acp fs	2026-04-21 01:31:58 -07:00
gemini_cloudcode_adapter.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 )	2026-04-28 06:46:45 -07:00
gemini_native_adapter.py	fix(gemini): fail fast on missing API key + surface it in hermes dump (#15133 )	2026-04-24 05:35:17 -07:00
gemini_schema.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 )	2026-04-28 06:46:45 -07:00
google_code_assist.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 )	2026-04-28 06:46:45 -07:00
google_oauth.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 )	2026-04-28 06:46:45 -07:00
image_gen_provider.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799 )	2026-04-21 21:30:10 -07:00
image_gen_registry.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799 )	2026-04-21 21:30:10 -07:00
image_routing.py	feat(image-input): native multimodal routing based on model vision capability (#16506 )	2026-04-27 06:27:59 -07:00
insights.py	Merge branch 'main' into feat/dashboard-skill-analytics	2026-04-20 05:25:49 -07:00
lmstudio_reasoning.py	feat(agent): add lmstudio integration	2026-04-28 12:27:36 -07:00
manual_compression_feedback.py	fix(compression): include system prompt + tool schemas in token estimates (#18265 )	2026-04-30 23:03:54 -07:00
memory_manager.py	feat(memory): notify providers on mid-process session_id rotation (#17409 )	2026-04-29 04:57:22 -07:00
memory_provider.py	feat(memory): notify providers on mid-process session_id rotation (#17409 )	2026-04-29 04:57:22 -07:00
model_metadata.py	fix(context): honor model.context_length for Ollama num_ctx and all display paths	2026-04-30 04:31:23 -07:00
models_dev.py	feat(minimax-oauth): full integration with peer OAuth providers	2026-04-29 09:53:42 -07:00
moonshot_schema.py	fix(kimi,mcp): Moonshot schema sanitizer + MCP schema robustness (#14805 )	2026-04-23 16:11:57 -07:00
nous_rate_guard.py	refactor: consolidate symlink-safe atomic replace into shared helper	2026-04-28 04:58:22 -07:00
onboarding.py	docs(onboarding): lead OpenClaw residue banner with migrate, warn that cleanup breaks OpenClaw (#17507 )	2026-04-29 08:08:36 -07:00
prompt_builder.py	feat(kanban): durable multi-profile collaboration board (#17805 )	2026-04-30 13:36:47 -07:00
prompt_caching.py	fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter	2026-03-21 16:54:43 -07:00
rate_limit_tracker.py	refactor: remove dead code — 1,784 lines across 77 files (#9180 )	2026-04-13 16:32:04 -07:00
redact.py	fix(ci): stabilize main test suite regressions (#17660 )	2026-04-29 23:18:55 -07:00
retry_utils.py	feat(agent): add jittered retry backoff	2026-04-08 00:41:36 -07:00
shell_hooks.py	refactor: consolidate symlink-safe atomic replace into shared helper	2026-04-28 04:58:22 -07:00
skill_commands.py	fix(skills): wire bump_use() into skill invocation and preload paths (#17782 )	2026-04-30 05:07:34 -07:00
skill_preprocessing.py	fix(skills): apply inline shell in skill_view	2026-04-24 15:15:07 -07:00
skill_utils.py	fix(skills): exclude .archive from skill index walk	2026-04-30 04:59:22 -07:00
subdirectory_hints.py	fix(agent): catch PermissionError in subdirectory hint discovery	2026-04-09 03:10:30 -07:00
title_generator.py	fix(auxiliary): custom provider URL rewrite + main_runtime model for title gen	2026-04-28 01:47:25 -07:00
tool_guardrails.py	fix(guardrails): preserve display _detect_tool_failure semantics	2026-04-30 20:43:15 -07:00
trajectory.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
usage_pricing.py	fix(usage_pricing): add MiniMax-M2.7 pricing for minimax and minimax-cn providers	2026-04-29 04:56:50 -07:00