hermes-agent/tests/agent
Teknium 9231a335d4
fix(compression): replace dead summary_target_tokens with ratio-based scaling (#2554)
The summary_target_tokens parameter was accepted in the constructor,
stored on the instance, and never used — the summary budget was always
computed from hardcoded module constants (_SUMMARY_RATIO=0.20,
_MAX_SUMMARY_TOKENS=8000). This caused two compounding problems:

1. The config value was silently ignored, giving users no control
   over post-compression size.
2. Fixed budgets (20K tail, 8K summary cap) didn't scale with
   context window size. Switching from a 1M-context model to a
   200K model would trigger compression that nuked 350K tokens
   of conversation history down to ~30K.

Changes:
- Replace summary_target_tokens with summary_target_ratio (default 0.40)
  which sets the post-compression target as a fraction of context_length.
  Tail token budget and summary cap now scale proportionally:
    MiniMax 200K → ~80K post-compression
    GPT-5   1M  → ~400K post-compression
- Change threshold_percent default: 0.50 → 0.80 (don't fire until
  80% of context is consumed)
- Change protect_last_n default: 4 → 20 (preserve ~10 full turns)
- Summary token cap scales to 5% of context (was fixed 8K), capped
  at 32K ceiling
- Read target_ratio and protect_last_n from config.yaml compression
  section (both are now configurable)
- Remove hardcoded summary_target_tokens=500 from run_agent.py
- Add 5 new tests for ratio scaling, clamping, and new defaults
2026-03-24 17:45:49 -07:00
..
__init__.py test: add unit tests for 8 modules (batch 2) 2026-02-26 13:54:20 +03:00
test_auxiliary_client.py fix: auxiliary client skips expired Codex JWT and propagates Anthropic OAuth flag 2026-03-21 17:36:25 -07:00
test_context_compressor.py fix(compression): replace dead summary_target_tokens with ratio-based scaling (#2554) 2026-03-24 17:45:49 -07:00
test_display_emoji.py feat(tools): centralize tool emoji metadata in registry + skin integration 2026-03-15 20:21:21 -07:00
test_model_metadata.py feat: overhaul context length detection with models.dev and provider-aware resolution (#2158) 2026-03-20 06:04:33 -07:00
test_models_dev.py feat: overhaul context length detection with models.dev and provider-aware resolution (#2158) 2026-03-20 06:04:33 -07:00
test_prompt_builder.py feat: priority-based context file selection + CLAUDE.md support (#2301) 2026-03-21 06:26:20 -07:00
test_prompt_caching.py fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter 2026-03-21 16:54:43 -07:00
test_redact.py fix(tests): resolve all consistently failing tests 2026-03-22 05:58:26 -07:00
test_skill_commands.py fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897) 2026-03-18 03:17:37 -07:00
test_smart_model_routing.py fix: hermes update causes dual gateways on macOS (launchd) (#1567) 2026-03-16 12:36:29 -07:00
test_subagent_progress.py fix(display): fix subagent progress tree-view visual nits 2026-02-28 23:29:49 -08:00
test_title_generator.py feat: auto-generate session titles after first exchange 2026-03-17 04:14:40 -07:00
test_usage_pricing.py feat: use endpoint metadata for custom model context and pricing (#1906) 2026-03-18 03:04:07 -07:00