hermes-agent/tests/run_agent
Teknium f92006ce1c
fix(compression): reserve system+tools headroom when aux binds threshold (#15631)
When the auxiliary compression model's context is smaller than the main
model's compression threshold, _check_compression_model_feasibility
auto-lowers the session threshold. Previously it set:

    new_threshold = aux_context

This let the raw message list grow to exactly aux_context tokens. But
compression and flush_memories actually send system_prompt + tool_schemas
+ messages to the aux model. With 50+ tools that overhead is 25-30K
tokens, so the full request exceeded the aux context window and failed
with HTTP 400.

Subtract a headroom estimate from aux_context before setting the new
threshold: the actual tool-schema token count (from
estimate_request_tokens_rough) plus a 12K allowance for the system
prompt (not yet built at __init__ time) and flush-instruction overhead.
Clamp to MINIMUM_CONTEXT_LENGTH so the session still starts even with
an unusually heavy tool schema.
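
A minimal sketch of the headroom arithmetic described above, not the
actual implementation. The value of MINIMUM_CONTEXT_LENGTH, the helper
rough_tool_schema_tokens (a stand-in for estimate_request_tokens_rough,
whose real signature is not shown here) and the function name
lowered_threshold are assumptions made for illustration only.

```python
import json

# Assumed values for illustration; the real constants live in the repo.
MINIMUM_CONTEXT_LENGTH = 64_000   # hypothetical floor
STATIC_HEADROOM_TOKENS = 12_000   # the 12K allowance named in the commit message


def rough_tool_schema_tokens(tool_schemas):
    """Hypothetical stand-in for estimate_request_tokens_rough.

    Uses a crude ~4 characters per token heuristic on the serialized schemas.
    """
    if not tool_schemas:
        return 0
    return len(json.dumps(tool_schemas)) // 4


def lowered_threshold(aux_context, tool_schemas):
    """Return the session threshold when the aux context is the binding limit."""
    # Reserve room for tool schemas plus the static system-prompt/flush allowance.
    headroom = rough_tool_schema_tokens(tool_schemas) + STATIC_HEADROOM_TOKENS
    # Never drop below the floor, so the session can still start.
    return max(aux_context - headroom, MINIMUM_CONTEXT_LENGTH)
```

With an empty tool list the sketch still deducts the 12K static
allowance, and a very large tool schema bottoms out at the floor rather
than producing a tiny or negative threshold.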

This fixes the 'flush_memories overflow on busy toolsets' path that
Teknium flagged, where main and aux can be nominally the same model
but the request still fails with HTTP 400 because the threshold left no
room for the request overhead. The same fix also protects the normal
compression summarisation request sent to that same threshold-binding
aux model.

Tests: two new regression tests cover the headroom reservation and the
MINIMUM_CONTEXT_LENGTH floor. Two existing tests are updated for the new
(lower) threshold values, since an empty tool list still incurs the 12K
static headroom deduction.
2026-04-25 05:41:56 -07:00
__init__.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
conftest.py test: speed up slow tests (backoff + subprocess + IMDS network) (#11797) 2026-04-17 14:21:22 -07:00
test_413_compression.py test: speed up slow tests (backoff + subprocess + IMDS network) (#11797) 2026-04-17 14:21:22 -07:00
test_860_dedup.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_1630_context_overflow_loop.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_agent_guardrails.py fix(delegate): make max_concurrent_children configurable + error on excess 2026-04-10 13:38:14 -07:00
test_agent_loop.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_agent_loop_tool_calling.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_agent_loop_vllm.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_anthropic_error_handling.py feat(providers): extend request_timeout_seconds to all client paths 2026-04-19 11:23:00 -07:00
test_anthropic_prompt_cache_policy.py fix(cache): enable prompt caching for Qwen on OpenCode/OpenCode-Go/Alibaba (#13528) 2026-04-21 06:40:58 -07:00
test_anthropic_third_party_oauth_guard.py fix(anthropic): complete third-party Anthropic-compatible provider support (#12846) 2026-04-19 22:43:09 -07:00
test_anthropic_truncation_continuation.py refactor: remove _nr_to_assistant_message shim + fix flush_memories guard 2026-04-23 02:30:05 -07:00
test_api_max_retries_config.py feat(agent): make API retry count configurable via agent.api_max_retries (#14730) 2026-04-23 13:59:32 -07:00
test_async_httpx_del_neuter.py fix: bound auxiliary client cache to prevent fd exhaustion in long-running gateways (#10200) (#10470) 2026-04-15 13:16:28 -07:00
test_background_review_summary.py fix(agent): exclude prior-history tool messages from background review summary 2026-04-24 03:10:19 -07:00
test_compress_focus_plugin_fallback.py fix(compress): don't reach into ContextCompressor privates from /compress (#15039) 2026-04-24 02:55:43 -07:00
test_compression_boundary.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_compression_feasibility.py fix(compression): reserve system+tools headroom when aux binds threshold (#15631) 2026-04-25 05:41:56 -07:00
test_compression_persistence.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_compression_trigger_excludes_reasoning.py fix(compression): exclude completion tokens from compression trigger (#12026) 2026-04-20 05:12:10 -07:00
test_compressor_fallback_update.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_concurrent_interrupt.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_context_token_tracking.py feat(providers): extend request_timeout_seconds to all client paths 2026-04-19 11:23:00 -07:00
test_create_openai_client_kwargs_isolation.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_create_openai_client_proxy_env.py test(proxy): regression tests for NO_PROXY bypass on keepalive client 2026-04-24 03:04:42 -07:00
test_create_openai_client_reuse.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_deepseek_reasoning_content_echo.py refactor(deepseek-reasoning): consolidate detection into helpers + regression tests 2026-04-24 16:38:29 -07:00
test_dict_tool_call_args.py fix(tests): fix 78 CI test failures and remove dead test (#9036) 2026-04-13 10:50:24 -07:00
test_exit_cleanup_interrupt.py test: speed up slow tests (backoff + subprocess + IMDS network) (#11797) 2026-04-17 14:21:22 -07:00
test_fallback_model.py test: speed up slow tests (backoff + subprocess + IMDS network) (#11797) 2026-04-17 14:21:22 -07:00
test_flush_memories_codex.py fix(flush_memories): strip temperature from codex_responses fallback (#15620) 2026-04-25 05:01:25 -07:00
test_interactive_interrupt.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_interrupt_propagation.py test: stop testing mutable data — convert change-detectors to invariants (#13363) 2026-04-20 23:20:33 -07:00
test_invalid_context_length_warning.py fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation 2026-04-15 22:05:21 -07:00
test_jsondecodeerror_retryable.py fix(agent): retry on json.JSONDecodeError instead of treating it as a local validation error (#15107) 2026-04-24 05:02:58 -07:00
test_long_context_tier_429.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_memory_provider_init.py fix(memory): keep Honcho provider opt-in 2026-04-18 22:50:55 -07:00
test_memory_sync_interrupted.py fix(memory): skip external-provider sync on interrupted turns (#15218) 2026-04-24 15:30:18 -07:00
test_openai_client_lifecycle.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_percentage_clamp.py fix: update 6 test files broken by dead code removal 2026-04-10 03:44:43 -07:00
test_plugin_context_engine_init.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_primary_runtime_restore.py fix(agent): only set rate-limit cooldown when leaving primary; add tests 2026-04-24 05:35:43 -07:00
test_provider_attribution_headers.py fix(providers): send user agent to routermint endpoints 2026-04-24 03:02:16 -07:00
test_provider_fallback.py fix(agent): fall back on rate limit when pool has no rotation room 2026-04-24 05:20:05 -07:00
test_provider_parity.py feat: add ResponsesApiTransport + wire all Codex transport paths 2026-04-21 19:48:56 -07:00
test_real_interrupt_subagent.py fix(tests): fix 78 CI test failures and remove dead test (#9036) 2026-04-13 10:50:24 -07:00
test_redirect_stdout_issue.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_repair_tool_call_arguments.py fix(run_agent): handle unescaped control chars in tool_call arguments (#15356) 2026-04-24 15:06:41 -07:00
test_repair_tool_call_name.py fix(agent): repair CamelCase + _tool suffix tool-call emissions (#15124) 2026-04-24 05:32:08 -07:00
test_run_agent.py feat: read prompt caching cache_ttl from config 2026-04-24 03:21:29 -07:00
test_run_agent_codex_responses.py fix(codex): detect leaked tool-call text in assistant content (#15347) 2026-04-24 14:39:59 -07:00
test_run_agent_multimodal_prologue.py refactor: unify transport dispatch + collapse normalize shims 2026-04-22 18:34:25 -07:00
test_sequential_chats_live.py test: regression guards for the keepalive/transport bug class (#10933) (#11266) 2026-04-16 16:36:33 -07:00
test_session_meta_filtering.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_session_reset_fix.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_steer.py refactor(steer): simplify injection marker to 'User guidance:' prefix (#13340) 2026-04-20 22:18:49 -07:00
test_streaming.py fix(streaming): silent retry when stream dies mid tool-call (#14151) 2026-04-22 13:47:33 -07:00
test_streaming_tool_call_repair.py fix: repair malformed tool call args in streaming assembly before flagging as truncated 2026-04-24 15:03:07 -07:00
test_strict_api_validation.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_strip_reasoning_tags_cli.py fix(display): strip standalone tool-call XML tags from visible text 2026-04-22 18:12:42 -07:00
test_switch_model_context.py fix: pass config_context_length to switch_model context compressor 2026-04-10 05:52:45 -07:00
test_switch_model_fallback_prune.py fix(agent): default missing fallback chain on switch 2026-04-24 05:35:43 -07:00
test_token_persistence_non_cli.py fix(tests): make AIAgent constructor calls self-contained (#11755) 2026-04-17 12:32:03 -07:00
test_tool_arg_coercion.py test,chore: cover stringified array/object coercion + AUTHOR_MAP entry 2026-04-23 16:38:38 -07:00
test_tool_call_args_sanitizer.py fix(run_agent): repair corrupted tool_call arguments before sending to provider 2026-04-24 14:55:47 -07:00
test_unicode_ascii_codec.py fix: always retry on ASCII codec UnicodeEncodeError — don't gate on per-component sanitization 2026-04-15 15:03:28 -07:00