hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Teknium 6cb9917c73 perf(compression): defer feasibility check to first compression attempt (#28957)

`AIAgent.__init__` was eagerly calling `_check_compression_model_feasibility()`, which probes the auxiliary provider chain and runs `get_model_context_length()` (potentially network-bound) to decide whether the configured auxiliary model can fit a full compression-threshold window. That cost ~440ms cold on every agent construction.

Most `chat -q` invocations finish in 1-5 seconds and never accumulate enough context to trip the compression threshold, so the feasibility check is pure overhead. The result is also only consumed when compression actually fires (the function adjusts the live threshold downward if the aux model can't fit; absent that mutation, the gate in `conversation_loop.py:442` would never fire anyway).

Defer to first `compress_context()` call via `agent._compression_feasibility_checked` sentinel. Runs at most once per agent lifetime, just before the first compression pass. The warning storage (`_compression_warning`) and gateway replay machinery are unchanged: the warning is still emitted to status_callback on the first turn that actually needs compression.

E2E timing (chat -q 'hi', 3 runs each):

              BEFORE   AFTER   delta
median wall   2.03s    1.86s   -8% (-169ms)
min wall      1.92s    1.63s   -15% (-293ms)

Real cold-start observation (synthetic 31-turn agent loop): identical behavior since feasibility check fires once on first compression and caches. No semantic difference for sessions that DO compress.

UX trade-off: users with broken auxiliary-provider config no longer see the warning at session start. They see it when compression first fires, which is exactly when it matters. For users with working config (the vast majority), the warning never fires anyway, so the deferral is invisible.

Tests:
- tests/run_agent/test_compression_feasibility.py: 16/16 pass (the one test that asserted call-at-init was updated to drive the lazy check explicitly via agent._check_compression_model_feasibility())
- Live tmux session: 2-turn conversation + tool call completes clean, zero errors in agent.log

2026-05-19 17:27:17 -07:00
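The deferral described in the commit message can be pictured with a small sketch. This is not the actual hermes-agent code: `LazyFeasibilityAgent`, `probe_calls`, and the placeholder bodies are hypothetical, and only the names `_compression_feasibility_checked`, `_compression_warning`, `_check_compression_model_feasibility()`, and `compress_context()` are taken from the message. Setting the sentinel before running the check is an assumption made here so that the check runs at most once even if it raises.

```python
class LazyFeasibilityAgent:
    """Hypothetical stand-in for AIAgent; only the deferral pattern is shown."""

    def __init__(self):
        # Nothing expensive happens at construction time any more.
        self._compression_feasibility_checked = False
        self._compression_warning = None
        self.probe_calls = 0  # illustration only: counts how often the check ran

    def _check_compression_model_feasibility(self):
        # Placeholder for the real probe (provider chain lookup, context-length
        # query). Here it only records that it was invoked.
        self.probe_calls += 1

    def compress_context(self, messages):
        # Run the feasibility check lazily, just before the first compression pass.
        if not self._compression_feasibility_checked:
            # Assumption: flip the sentinel first so the check is never retried.
            self._compression_feasibility_checked = True
            self._check_compression_model_feasibility()
        # Placeholder "compression": keep only the two most recent messages.
        return messages[-2:]


agent = LazyFeasibilityAgent()
first = agent.compress_context(["a", "b", "c"])
second = agent.compress_context(["d", "e", "f", "g"])
```

Under these assumptions, constructing the agent triggers no probe, and two compression passes trigger exactly one.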
..
lsp	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)	2026-05-17 02:29:41 -07:00
transports	fix(codex): allow kanban worker board writes	2026-05-17 11:50:43 -07:00
__init__.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
account_usage.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
agent_init.py	perf(compression): defer feasibility check to first compression attempt (#28957)	2026-05-19 17:27:17 -07:00
agent_runtime_helpers.py	fix(agent): set tool_name on tool-result messages at construction time	2026-05-19 20:49:11 +01:00
anthropic_adapter.py	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
async_utils.py	fix(async): close unscheduled coroutines in all threadsafe bridges (#26584)	2026-05-15 14:00:01 -07:00
auxiliary_client.py	fix(xai-oauth): pin inference base_url to x.ai origin (#28952)	2026-05-19 14:51:21 -07:00
azure_identity_adapter.py	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
background_review.py	feat(bg-review): add bundled/pinned skill protection rules to review prompts (#27644)	2026-05-18 20:02:22 -07:00
bedrock_adapter.py	chore(deps): lazy-install boto3/botocore for bedrock adapter	2026-05-17 02:31:18 -07:00
browser_provider.py	fix(browser): self-review pass — dead-import, log levels, future-proofing	2026-05-17 04:04:15 -07:00
browser_registry.py	fix(browser): self-review pass — dead-import, log levels, future-proofing	2026-05-17 04:04:15 -07:00
chat_completion_helpers.py	fix(xai-responses): strip enum values containing '/' from tool schemas	2026-05-18 10:37:35 -07:00
codex_responses_adapter.py	fix(xai-oauth): recover from prelude SSE errors, gate reasoning replay, surface entitlement 403s (#26644)	2026-05-15 16:35:12 -07:00
codex_runtime.py	fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184)	2026-05-16 23:41:09 -07:00
context_compressor.py	fix(compress): make abort-on-summary-failure opt-in via config flag (#28117)	2026-05-18 10:28:20 -07:00
context_engine.py	fix(compression): keep default protect_first_n at 3 + align ABC	2026-05-13 22:25:16 -07:00
context_references.py	fix(agent): fall back when rg is blocked for @folder references	2026-04-20 01:56:41 -07:00
conversation_compression.py	perf(compression): defer feasibility check to first compression attempt (#28957)	2026-05-19 17:27:17 -07:00
conversation_loop.py	fix: wrap _pool_may_recover_from_rate_limit call through run_agent namespace	2026-05-18 20:04:57 -07:00
copilot_acp_client.py	fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes	2026-05-19 00:12:41 -07:00
credential_pool.py	fix(codex-oauth): quarantine terminal refresh errors so dead tokens are not replayed across sessions	2026-05-18 10:31:40 -07:00
credential_sources.py	feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider	2026-05-15 12:11:32 -07:00
curator.py	feat(curator): hint at `hermes curator pin` in the rename block (#23212)	2026-05-10 06:44:53 -07:00
curator_backup.py	fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)	2026-05-02 01:29:57 -07:00
display.py	chore: remove Atropos RL environments and tinker-atropos integration (#26106)	2026-05-15 10:36:38 +05:30
error_classifier.py	fix(error_classifier): classify xAI Grok entitlement SSE errors as auth	2026-05-18 10:24:13 -07:00
file_safety.py	fix(security): apply file safety to copilot acp fs	2026-04-21 01:31:58 -07:00
gemini_cloudcode_adapter.py	fix(agent/gemini-cloudcode): seed delta defaults for reasoning-only stream chunks	2026-05-14 08:03:56 -07:00
gemini_native_adapter.py	fix(auxiliary): evict async wrappers on poisoned client (follow-up to #23482)	2026-05-11 11:13:20 -07:00
gemini_schema.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010)	2026-04-28 06:46:45 -07:00
google_code_assist.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010)	2026-04-28 06:46:45 -07:00
google_oauth.py	fix(google_oauth): close TOCTOU window when saving credentials	2026-05-04 03:16:19 -07:00
i18n.py	feat(i18n): localize all gateway commands + web dashboard, add 8 new locales (16 total) (#22914)	2026-05-10 07:14:14 -07:00
image_gen_provider.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)	2026-04-21 21:30:10 -07:00
image_gen_registry.py	fix(plugins): filter resolution by is_available() in web + image_gen registries	2026-05-13 22:31:28 -07:00
image_routing.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937 )	2026-05-11 11:13:25 -07:00
insights.py	Merge branch 'main' into feat/dashboard-skill-analytics	2026-04-20 05:25:49 -07:00
iteration_budget.py	refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget	2026-05-16 17:59:32 -07:00
lmstudio_reasoning.py	feat(agent): add lmstudio integration	2026-04-28 12:27:36 -07:00
manual_compression_feedback.py	fix(compression): include system prompt + tool schemas in token estimates (#18265)	2026-04-30 23:03:54 -07:00
markdown_tables.py	fix(cli): vertical fallback for markdown tables wider than terminal (#23948)	2026-05-11 16:49:13 -07:00
memory_manager.py	🐛 fix(memory): require newline after context tag	2026-05-18 10:53:08 -07:00
memory_provider.py	docs(agent): remove stale BuiltinMemoryProvider references from memory module docstrings	2026-05-05 13:33:49 -07:00
message_sanitization.py	refactor(run_agent): extract message sanitization to agent/message_sanitization.py	2026-05-16 17:41:09 -07:00
model_metadata.py	fix(metadata): qwen3.6-plus has a 1M context window (#27008)	2026-05-17 02:31:18 -07:00
models_dev.py	feat: add NovitaAI as LLM provider	2026-05-13 23:51:15 -07:00
moonshot_schema.py	fix(moonshot): strip $ref siblings and collapse tuple items in tool schemas (#27104)	2026-05-16 13:02:19 -07:00
nous_rate_guard.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
onboarding.py	docs(onboarding): lead OpenClaw residue banner with migrate, warn that cleanup breaks OpenClaw (#17507)	2026-04-29 08:08:36 -07:00
plugin_llm.py	feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194)	2026-05-10 07:09:28 -07:00
portal_tags.py	feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779)	2026-05-12 20:49:20 -07:00
process_bootstrap.py	refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget	2026-05-16 17:59:32 -07:00
prompt_builder.py	fix(kanban): stale reclaim must not tick failure counter (#28680)	2026-05-19 03:15:18 -07:00
prompt_caching.py	fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778)	2026-05-12 20:46:04 -07:00
rate_limit_tracker.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
redact.py	perf(agent-loop): cut 47% of per-conversation function calls via 3 targeted hot-path optimizations (#28866)	2026-05-19 14:25:10 -07:00
retry_utils.py	feat(agent): add jittered retry backoff	2026-04-08 00:41:36 -07:00
shell_hooks.py	fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes	2026-05-19 00:12:41 -07:00
skill_bundles.py	feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373)	2026-05-18 21:38:05 -07:00
skill_commands.py	fix(skills): load symlinked skill slash commands	2026-05-18 00:34:29 -07:00
skill_preprocessing.py	fix: treat inline-shell timeout guard as timeout	2026-05-18 19:36:04 -07:00
skill_utils.py	perf(cli): cut ~19s from 'hermes' cold start (skills cache + lazy Feishu + no Nous HTTP) (#22138)	2026-05-08 16:39:32 -07:00
stream_diag.py	refactor(run_agent): extract stream diagnostics to agent/stream_diag.py	2026-05-16 18:28:17 -07:00
subdirectory_hints.py	fix(agent): catch PermissionError in subdirectory hint discovery	2026-04-09 03:10:30 -07:00
system_prompt.py	perf(prompt): cache kanban worker guidance at session init	2026-05-18 20:56:44 -07:00
think_scrubber.py	fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184)	2026-05-05 04:33:38 -07:00
title_generator.py	fix: improve telegram topic mode setup	2026-05-04 12:07:17 -07:00
tool_dispatch_helpers.py	fix(agent): set tool_name on tool-result messages at construction time	2026-05-19 20:49:11 +01:00
tool_executor.py	fix(agent): set tool_name on tool-result messages at construction time	2026-05-19 20:49:11 +01:00
tool_guardrails.py	fix: add recovery hints to loop guard warnings	2026-05-19 00:12:12 -07:00
tool_result_classification.py	fix: classify landed file mutations with diagnostics	2026-05-13 06:46:23 -07:00
trajectory.py	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
usage_pricing.py	fix(pricing): add deepseek-v4-pro to official docs pricing table	2026-05-12 16:32:57 -07:00
video_gen_provider.py	feat(video_gen): unified video_generate tool with pluggable provider backends (#25126)	2026-05-13 16:39:41 -07:00
video_gen_registry.py	feat(video_gen): unified video_generate tool with pluggable provider backends (#25126)	2026-05-13 16:39:41 -07:00
web_search_provider.py	fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup	2026-05-13 22:31:28 -07:00
web_search_registry.py	fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup	2026-05-13 22:31:28 -07:00