mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-03 02:11:48 +00:00
The user-visible /compress banner and the post-compression last_prompt_tokens writeback both counted only the raw message transcript (chars/4). With a 15KB system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of request pressure — a 234x gap. Two user-facing consequences: - Banner shows 'Compressing … (~45 tokens)…' while compression is actually firing on 10K+ tokens of real pressure, confusing users about why compression triggered (reported by @codecovenant on X; #6217). - Post-compression last_prompt_tokens writeback omits tool schemas, so the next should_compress() check compares real usage against a stale underestimate — compression triggers late, potentially past the model's context limit on small-context models (#14695). Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough() at every user-visible banner and at the post-compression writeback. estimate_request_tokens_rough() already existed for exactly this purpose and includes system prompt + tool schemas. Touched call sites: - run_agent.py: post-compression last_prompt_tokens writeback, post-tool call should_compress() fallback when provider usage is missing - cli.py: /compress banner + summary - gateway/run.py: gateway /compress banner + summary - tui_gateway/server.py: TUI /compress status + summary - acp_adapter/server.py: ACP /compact before/after Left intentionally alone: - Session-hygiene fallback and the 'no agent' /status path in gateway/run.py — no agent instance is in scope to query for system prompt/tools, and the existing 30-50% overestimate wobble on hygiene is safety-accepted. - Verbose-mode 'Request size' logging — informational only, already counts system prompt via api_messages[0]. Also relabels the feedback line from 'Rough transcript estimate' to 'Approx request size' so the metric label matches what it actually measures. Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217); user report @codecovenant on X (2026-04-30). Closes #14695 Closes #6217 |
||
|---|---|---|
| .. | ||
| __init__.py | ||
| test_branch_command.py | ||
| test_busy_input_mode_command.py | ||
| test_cli_approval_ui.py | ||
| test_cli_background_tui_refresh.py | ||
| test_cli_bracketed_paste_sanitizer.py | ||
| test_cli_browser_connect.py | ||
| test_cli_context_warning.py | ||
| test_cli_copy_command.py | ||
| test_cli_extension_hooks.py | ||
| test_cli_external_editor.py | ||
| test_cli_file_drop.py | ||
| test_cli_force_redraw.py | ||
| test_cli_image_command.py | ||
| test_cli_init.py | ||
| test_cli_interrupt_subagent.py | ||
| test_cli_loading_indicator.py | ||
| test_cli_markdown_rendering.py | ||
| test_cli_mcp_config_watch.py | ||
| test_cli_new_session.py | ||
| test_cli_prefix_matching.py | ||
| test_cli_preloaded_skills.py | ||
| test_cli_provider_resolution.py | ||
| test_cli_reload_skills.py | ||
| test_cli_retry.py | ||
| test_cli_save_config_value.py | ||
| test_cli_secret_capture.py | ||
| test_cli_shutdown_memory_messages.py | ||
| test_cli_skin_integration.py | ||
| test_cli_status_bar.py | ||
| test_cli_status_command.py | ||
| test_cli_steer_busy_path.py | ||
| test_cli_terminal_response_sanitizer.py | ||
| test_cli_tools_command.py | ||
| test_cli_user_message_preview.py | ||
| test_compress_focus.py | ||
| test_cprint_bg_thread.py | ||
| test_cwd_env_respect.py | ||
| test_fast_command.py | ||
| test_gquota_command.py | ||
| test_manual_compress.py | ||
| test_personality_none.py | ||
| test_quick_commands.py | ||
| test_reasoning_command.py | ||
| test_resume_display.py | ||
| test_save_conversation_location.py | ||
| test_session_boundary_hooks.py | ||
| test_stream_delta_think_tag.py | ||
| test_surrogate_sanitization.py | ||
| test_tool_progress_scrollback.py | ||
| test_worktree.py | ||
| test_worktree_security.py | ||