hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-31 06:51:29 +00:00

Author	SHA1	Message	Date
Teknium	1634397ddb	fix(compress): abort instead of dropping messages when summary LLM fails (#28102 ) When auxiliary compression's summary generation returns None (aux model errored, returned non-JSON, timed out, etc.) the compressor previously still dropped every middle message between compress_start..compress_end and replaced them with a static 'Summary generation was unavailable' placeholder. The session kept going but the user silently lost N turns of context for nothing. New behavior: on summary failure, compress() aborts entirely — returns the input messages unchanged and sets _last_compress_aborted=True. The existing _summary_failure_cooldown_until gate (30-60s) keeps the aux model from being burned on every turn. Auto-compress callers detect the no-op (len(after) == len(before)) and stop looping. The chat is 'frozen' at its current size until the next /compress or /new. Manual /compress (CLI + gateway) now passes force=True which clears the cooldown so users can retry immediately after an auto-abort. If the manual retry also fails, the user gets a visible warning telling them nothing was dropped and how to retry. - agent/context_compressor.py: compress() gains force= kwarg; failure branch sets _last_compress_aborted and returns messages unchanged instead of inserting placeholder. - run_agent.py: _compress_context() detects abort, surfaces warning, skips session-rotation entirely, returns messages unchanged. - cli.py + gateway/run.py: manual /compress paths pass force=True. - gateway/run.py: hygiene + /compress handlers detect _last_compress_aborted and emit the new 'Compression aborted' warning (gateway.compress.aborted) instead of the old 'N historical messages were removed' message. - locales/*.yaml: new gateway.compress.aborted key in all 16 locales. - tests: updated to assert the abort contract (messages preserved, compression_count not incremented, abort flag set, no placeholder leaked). New test_force_true_bypasses_failure_cooldown covers the manual-retry path.	2026-05-18 10:19:40 -07:00
glennc	9df9816dab	feat(azure-foundry): add Microsoft Entra ID auth Use azure-identity DefaultAzureCredential for keyless Foundry auth. Preserve refreshable callable credentials through OpenAI and Anthropic client paths. Add setup, doctor, auth status, docs, and tests for Entra auth. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-18 10:14:38 -07:00
teknium1	f885be030c	fix(auxiliary): resolve xai oauth compression from pool — port to conversation_compression Original commit `97a32afdc` by helix4u targeted _check_compression_model_feasibility in pre-refactor run_agent.py. The function body now lives in agent/conversation_compression.py — re-applied the configured-but-unavailable provider message there. Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>	2026-05-16 23:33:59 -07:00
teknium1	5311d9959e	refactor(run_agent): extract context compression to agent/conversation_compression.py Move four compression-related methods to a dedicated module: * check_compression_model_feasibility — startup probe + auto-lowered threshold + hard floor * replay_compression_warning — re-emit stored warning through gateway status_callback * compress_context — run compressor, split SQLite session, notify plugins+memory * try_shrink_image_parts_in_messages — image-too-large recovery via re-encode AIAgent keeps thin forwarder methods so existing call sites and tests that patch run_agent.AIAgent methods keep working. tests/run_agent/ + tests/agent/: 4313 passed (same pre-existing test_auxiliary_client failure as before). run_agent.py: 15013 -> 14535 lines (-478).	2026-05-16 18:09:33 -07:00

4 commits