hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00


Teknium 6cb9917c73 perf(compression): defer feasibility check to first compression attempt (#28957)

`AIAgent.__init__` was eagerly calling `_check_compression_model_feasibility()`, which probes the auxiliary provider chain and runs `get_model_context_length()` (potentially network-bound) to decide whether the configured auxiliary model can fit a full compression-threshold window. That cost ~440ms cold on every agent construction.

Most `chat -q` invocations finish in 1-5 seconds and never accumulate enough context to trip the compression threshold, so the feasibility check is pure overhead. The result is also only consumed when compression actually fires (the function adjusts the live threshold downward if the aux model can't fit; absent that mutation, the gate in `conversation_loop.py:442` would never fire anyway).

Defer to the first `compress_context()` call via the `agent._compression_feasibility_checked` sentinel. The check runs at most once per agent lifetime, just before the first compression pass. The warning storage (`_compression_warning`) and gateway replay machinery are unchanged: the warning still emits to status_callback on the first turn that actually needs compression.

E2E timing (chat -q 'hi', 3 runs each):

                BEFORE   AFTER   delta
  median wall   2.03s    1.86s   -8%  (-169ms)
  min wall      1.92s    1.63s   -15% (-293ms)

Real cold-start observation (synthetic 31-turn agent loop): identical behavior, since the feasibility check fires once on first compression and caches. No semantic difference for sessions that DO compress.

UX trade-off: users with broken auxiliary-provider config no longer see the warning at session start. They see it when compression first fires, which is exactly when it matters. For users with working config (the vast majority), the warning never fires anyway, so the deferral is invisible.

Tests:
- tests/run_agent/test_compression_feasibility.py: 16/16 pass (the one test that asserted call-at-init was updated to drive the lazy check explicitly via agent._check_compression_model_feasibility())
- Live tmux session: 2-turn conversation + tool call completes clean, zero errors in agent.log

2026-05-19 17:27:17 -07:00
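The deferral described in the commit message above can be illustrated with a minimal sketch. This is not the repository's code: `LazyFeasibilityAgent`, the `probe` callable, the `probe_calls` counter, and the placeholder compression logic are all hypothetical. Only the names `_compression_feasibility_checked`, `_compression_warning`, `_check_compression_model_feasibility`, and `compress_context` are taken from the commit message, and their real signatures may differ.

```python
class LazyFeasibilityAgent:
    """Hypothetical stand-in for AIAgent; only the deferral pattern is shown."""

    def __init__(self, probe):
        # `probe` is a hypothetical callable standing in for the expensive
        # auxiliary-provider / context-length lookup. It is NOT called here,
        # so construction stays cheap.
        self._probe = probe
        self._compression_feasibility_checked = False  # sentinel named in the commit
        self._compression_warning = None
        self.probe_calls = 0  # illustration only: counts how often the probe ran

    def _check_compression_model_feasibility(self):
        # Stand-in for the real check: run the probe and store any warning.
        self.probe_calls += 1
        self._compression_warning = self._probe()

    def compress_context(self, messages):
        # Run the feasibility check at most once, just before the first pass.
        if not self._compression_feasibility_checked:
            self._compression_feasibility_checked = True
            self._check_compression_model_feasibility()
        # Placeholder "compression" (an assumption): keep the last two messages.
        return messages[-2:]


if __name__ == "__main__":
    agent = LazyFeasibilityAgent(probe=lambda: "aux model context too small")
    print(agent.probe_calls)            # probe has not run at construction
    agent.compress_context(["a", "b", "c"])
    agent.compress_context(["a", "b", "c"])
    print(agent.probe_calls)            # probe ran once across two compressions
    print(agent._compression_warning)
```

Setting the sentinel before running the check keeps the "at most once per agent lifetime" property even if the probe raises; whether the real implementation orders it this way is not stated in the commit message.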
..
__init__.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
conftest.py	test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)	2026-04-17 14:21:22 -07:00
test_413_compression.py	fix: show context compaction status	2026-05-13 23:11:43 -07:00
test_860_dedup.py	fix: lazy session creation — defer DB row until first message (#18370 )	2026-05-01 18:39:12 +05:30
test_1630_context_overflow_loop.py	fix(tests): make AIAgent constructor calls self-contained (#11755)	2026-04-17 12:32:03 -07:00
test_agent_guardrails.py	fix(agent): include name field on every role:tool message for Gemini compatibility (#16478)	2026-05-04 05:06:33 -07:00
test_anthropic_error_handling.py	fix(tests): catch up 25 stale tests after recent merges (#28626)	2026-05-19 01:28:32 -07:00
test_anthropic_prompt_cache_policy.py	fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778 )	2026-05-12 20:46:04 -07:00
test_anthropic_third_party_oauth_guard.py	fix(anthropic): complete third-party Anthropic-compatible provider support (#12846)	2026-04-19 22:43:09 -07:00
test_anthropic_truncation_continuation.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355 )	2026-05-17 02:29:41 -07:00
test_api_max_retries_config.py	feat(agent): make API retry count configurable via agent.api_max_retries (#14730)	2026-04-23 13:59:32 -07:00
test_async_httpx_del_neuter.py	fix(dashboard): UI polish — modals, layout, consistency, test fixes	2026-05-12 13:59:22 -04:00
test_background_review.py	fix(run_agent): isolate background review fork from external memory plugins (#27190)	2026-05-16 20:33:38 -07:00
test_background_review_cache_parity.py	test(memory): cover cache-parity + runtime whitelist on background review fork	2026-05-13 22:12:47 -07:00
test_background_review_summary.py	fix(agent): exclude prior-history tool messages from background review summary	2026-04-24 03:10:19 -07:00
test_background_review_toolset_restriction.py	test(memory): cover cache-parity + runtime whitelist on background review fork	2026-05-13 22:12:47 -07:00
test_callable_api_key.py	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
test_codex_app_server_integration.py	fix(codex-runtime): retire wedged sessions + post-tool watchdog + OAuth refresh classify (#25769)	2026-05-14 07:55:09 -07:00
test_codex_multimodal_tool_result.py	feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955)	2026-05-09 21:06:19 -07:00
test_codex_xai_oauth_recovery.py	test(xai-oauth): pin tier-denied 403 behavior + docs warning for #26847	2026-05-18 20:08:09 -07:00
test_commit_memory_session_context_engine.py	fix(agent): notify context engine on commit_memory_session (#22764)	2026-05-09 12:28:42 -07:00
test_compress_focus_plugin_fallback.py	refactor(memory): remove flush_memories entirely (#15696)	2026-04-25 08:21:14 -07:00
test_compression_boundary.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_compression_boundary_hook.py	fix(tests): catch up six stale tests after compression/aux/kanban changes (#28465)	2026-05-18 21:43:59 -07:00
test_compression_feasibility.py	perf(compression): defer feasibility check to first compression attempt (#28957)	2026-05-19 17:27:17 -07:00
test_compression_persistence.py	fix(tests): make AIAgent constructor calls self-contained (#11755)	2026-04-17 12:32:03 -07:00
test_compression_trigger_excludes_reasoning.py	fix(compression): exclude completion tokens from compression trigger (#12026)	2026-04-20 05:12:10 -07:00
test_compressor_fallback_update.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_concurrent_interrupt.py	test: remove 50 stale/broken tests to unblock CI (#22098)	2026-05-08 14:55:40 -07:00
test_context_token_tracking.py	feat(providers): extend request_timeout_seconds to all client paths	2026-04-19 11:23:00 -07:00
test_copilot_native_vision_headers.py	fix(copilot): mark native image requests as vision	2026-04-27 08:35:50 -07:00
test_create_openai_client_kwargs_isolation.py	fix(tests): make AIAgent constructor calls self-contained (#11755)	2026-04-17 12:32:03 -07:00
test_create_openai_client_proxy_env.py	test(proxy): regression tests for NO_PROXY bypass on keepalive client	2026-04-24 03:04:42 -07:00
test_create_openai_client_reuse.py	fix(tests): make AIAgent constructor calls self-contained (#11755)	2026-04-17 12:32:03 -07:00
test_deepseek_reasoning_content_echo.py	fix(deepseek): use non-empty reasoning_content placeholder for V4 Pro thinking mode	2026-04-30 23:04:23 -07:00
test_deepseek_v4_thinking_live.py	fix(deepseek): preserve v4 reasoning_content on replay	2026-04-30 11:18:39 -07:00
test_dict_tool_call_args.py	fix(tests): fix 78 CI test failures and remove dead test (#9036)	2026-04-13 10:50:24 -07:00
test_empty_response_recovery_persistence.py	fix(run_agent): break permanent empty-response loop from orphan tool-tail (#21385)	2026-05-07 08:35:10 -07:00
test_exit_cleanup_interrupt.py	test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)	2026-04-17 14:21:22 -07:00
test_fallback_model.py	fix(fallback): resolve api_key_env in fallback chain entries (carve-out of #22665)	2026-05-09 17:53:56 -07:00
test_file_mutation_verifier.py	fix: classify landed file mutations with diagnostics	2026-05-13 06:46:23 -07:00
test_image_rejection_fallback.py	fix(agent): catch ChatGPT-account Codex data-URL rejection so images are stripped instead of cascading to compression (#23602)	2026-05-11 07:37:22 -07:00
test_image_shrink_recovery.py	feat(image-input): native multimodal routing based on model vision capability (#16506)	2026-04-27 06:27:59 -07:00
test_init_fallback_on_exhausted_pool.py	fix(agent): try fallback providers at init when primary credential pool is exhausted (#17929)	2026-05-02 02:09:46 -07:00
test_interactive_interrupt.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_interrupt_propagation.py	test: stop testing mutable data — convert change-detectors to invariants (#13363 )	2026-04-20 23:20:33 -07:00
test_invalid_context_length_warning.py	fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation	2026-04-15 22:05:21 -07:00
test_iteration_budget_race.py	fix(run_agent): acquire lock in IterationBudget.used property	2026-05-04 12:37:28 -07:00
test_jsondecodeerror_retryable.py	refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards	2026-05-16 22:55:49 -07:00
test_last_reasoning_per_turn.py	test: pin per-turn reasoning extraction semantics	2026-05-05 05:00:05 -07:00
test_long_context_tier_429.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_materialize_data_url_cleanup.py	fix(misc): three small defensive fixes from PR #1974	2026-05-10 22:28:01 -07:00
test_memory_nudge_counter_hydration.py	refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards	2026-05-16 22:55:49 -07:00
test_memory_provider_init.py	fix(memory): keep Honcho provider opt-in	2026-04-18 22:50:55 -07:00
test_memory_sync_interrupted.py	feat(memory): notify providers on mid-process session_id rotation (#17409)	2026-04-29 04:57:22 -07:00
test_message_sequence_repair.py	fix(run_agent): break permanent empty-response loop from orphan tool-tail (#21385)	2026-05-07 08:35:10 -07:00
test_openai_client_lifecycle.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_percentage_clamp.py	fix: update 6 test files broken by dead code removal	2026-04-10 03:44:43 -07:00
test_plugin_context_engine_init.py	fix(tests): make AIAgent constructor calls self-contained (#11755)	2026-04-17 12:32:03 -07:00
test_primary_runtime_restore.py	fix(agent): reset _fallback_index at turn start even when no fallback activated	2026-05-16 17:12:48 -07:00
test_provider_attribution_headers.py	feat(nvidia): add NIM billing origin header	2026-05-15 14:06:51 -07:00
test_provider_fallback.py	fix(fallback): skip chain entries matching current provider/model/base_url (#22780)	2026-05-09 12:48:19 -07:00
test_provider_parity.py	fix(tests): stabilize xai env and provider parity	2026-05-17 11:55:25 -07:00
test_real_interrupt_subagent.py	fix(tests): fix 78 CI test failures and remove dead test (#9036)	2026-04-13 10:50:24 -07:00
test_redirect_stdout_issue.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_repair_tool_call_arguments.py	fix(run_agent): handle unescaped control chars in tool_call arguments (#15356)	2026-04-24 15:06:41 -07:00
test_repair_tool_call_name.py	fix(agent): repair CamelCase + _tool suffix tool-call emissions (#15124)	2026-04-24 05:32:08 -07:00
test_review_prompt_class_first.py	fix(review): tell background reviewer not to capture transient env failures as skills (#23004)	2026-05-09 22:51:25 -07:00
test_run_agent.py	fix(agent): add qwen and deepseek to TOOL_USE_ENFORCEMENT_MODELS	2026-05-18 20:06:49 -07:00
test_run_agent_codex_responses.py	test(xai-oauth): use grok-4.3 instead of retiring grok-code-fast-1	2026-05-15 12:11:32 -07:00
test_run_agent_multimodal_prologue.py	refactor: unify transport dispatch + collapse normalize shims	2026-04-22 18:34:25 -07:00
test_sequential_chats_live.py	test: regression guards for the keepalive/transport bug class (#10933) (#11266)	2026-04-16 16:36:33 -07:00
test_session_id_env.py	feat: expose HERMES_SESSION_ID to agent tools via ContextVar + env (#23847)	2026-05-12 00:16:45 +05:30
test_session_meta_filtering.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_session_reset_fix.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_steer.py	refactor(steer): simplify injection marker to 'User guidance:' prefix (#13340)	2026-04-20 22:18:49 -07:00
test_stream_drop_logging.py	feat(stream-retry): add upstream + timing diagnostics to drop log (#23005)	2026-05-09 22:49:35 -07:00
test_stream_interrupt_retry.py	fix: /stop now immediately aborts streaming retry loop	2026-04-25 09:51:39 -07:00
test_streaming.py	fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184)	2026-05-16 17:09:41 -07:00
test_streaming_tool_call_repair.py	chore: remove Atropos RL environments and tinker-atropos integration (#26106)	2026-05-15 10:36:38 +05:30
test_strict_api_validation.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_strip_reasoning_tags_cli.py	fix(display): strip standalone tool-call XML tags from visible text	2026-04-22 18:12:42 -07:00
test_switch_model_context.py	test(ci): stabilize shared optional dependency baselines	2026-05-13 17:32:22 -07:00
test_switch_model_fallback_prune.py	fix(agent): default missing fallback chain on switch	2026-04-24 05:35:43 -07:00
test_thinking_only_sanitizer.py	fix(agent): drop thinking-only assistant turns before provider call (#16959)	2026-04-28 03:50:51 -07:00
test_token_persistence_non_cli.py	fix: make session search initialize session db	2026-05-09 14:36:58 -07:00
test_tool_arg_coercion.py	fix(tools): wrap bare scalars in single-element list for array-typed args	2026-05-04 05:00:37 -07:00
test_tool_call_args_sanitizer.py	fix(agent): include name field on every role:tool message for Gemini compatibility (#16478)	2026-05-04 05:06:33 -07:00
test_tool_call_guardrail_runtime.py	fix: add recovery hints to loop guard warnings	2026-05-19 00:12:12 -07:00
test_tool_executor_contextvar_propagation.py	refactor(run_agent): extract tool execution to agent/tool_executor.py	2026-05-16 18:24:05 -07:00
test_tool_name_db_persistence.py	fix(agent): set tool_name on tool-result messages at construction time	2026-05-19 20:49:11 +01:00
test_unicode_ascii_codec.py	fix: always retry on ASCII codec UnicodeEncodeError — don't gate on per-component sanitization	2026-04-15 15:03:28 -07:00
test_vision_aware_preprocessing.py	feat(image-input): native multimodal routing based on model vision capability (#16506)	2026-04-27 06:27:59 -07:00