hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-19 15:18:03 +00:00

Author	SHA1	Message	Date
Teknium	2dc5f9d2d3	fix: light mode link/primary colors unreadable on white background (#10457 ) Gold #FFD700 has 1.4:1 contrast ratio on white — barely visible. Replace with dark amber palette (#8B6508 primary, #7A5800 links) that passes WCAG AA (5.3:1 and 6.5:1 respectively). Changes: - :root primary palette → dark amber tones for light mode - Explicit light mode link colors (#7A5800 / #5A4100 hover) - Light mode sidebar active state with amber accent - Light mode table header/border styling - Footer hover color split by theme (gold for dark, amber for light) Dark mode is completely unchanged. Reported by @AbrahamMat7632	2026-04-15 11:17:44 -07:00
Teknium	f61cc464f0	fix: include thread_id in _parse_session_key and fix stale parts reference _parse_session_key() now extracts the optional 6th part (thread_id) from session keys, and _notify_active_sessions_of_shutdown uses _parsed.get() instead of the removed 'parts' variable. Without this, shutdown notifications silently failed (NameError caught by try/except) and forum topic routing was lost.	2026-04-15 11:16:01 -07:00
kshitijk4poor	2276b72141	fix: follow-up improvements for watch notification routing (#9537 ) - Populate watcher_* routing fields for watch-only processes (not just notify_on_complete), so watch-pattern events carry direct metadata instead of relying solely on session_key parsing fallback - Extract _parse_session_key() helper to dedupe session key parsing at two call sites in gateway/run.py - Add negative test proving cross-thread leakage doesn't happen - Add edge-case tests for _build_process_event_source returning None (empty evt, invalid platform, short session_key) - Add unit tests for _parse_session_key helper	2026-04-15 11:16:01 -07:00
etcircle	dee592a0b1	fix(gateway): route synthetic background events by session	2026-04-15 11:16:01 -07:00
kshitij	da448d4fce	test(cron): add regression test for credential_files ContextVar propagation (#10462 ) Follow-up to #10459 (salvage of #7527). The copy_context() fix propagates ALL ContextVars into the cron worker thread, including credential_files. This test verifies that skill-declared required_credential_files are visible inside the worker thread, matching the existing env_passthrough regression test.	2026-04-15 11:11:08 -07:00
helix4u	aa398ad655	fix(cron): preserve skill env passthrough in worker thread	2026-04-15 11:03:49 -07:00
WideLee	422f2866e6	docs: restore sidebar entries removed by PR #9931 Re-add 'qqbot' and 'automation-templates' doc indexes to sidebars.ts that were accidentally dropped in https://github.com/NousResearch/hermes-agent/pull/9931.	2026-04-15 09:39:12 -07:00
Teknium	722331a57d	fix: replace hardcoded ~/.hermes with display_hermes_home() in agent-facing text (#10285 ) Tool schema descriptions and tool return values contained hardcoded ~/.hermes paths that the model sees and uses. When HERMES_HOME is set to a custom path (Docker containers, profiles), the agent would still reference ~/.hermes — looking at the wrong directory. Fixes 6 locations across 5 files: - tools/tts_tool.py: output_path schema description - tools/cronjob_tools.py: script path schema description - tools/skill_manager_tool.py: skill_manage schema description - tools/skills_tool.py: two tool return messages - agent/skill_commands.py: skill config injection text All now use display_hermes_home() which resolves to the actual HERMES_HOME path (e.g. /opt/data for Docker, ~/.hermes/profiles/X for profiles, ~/.hermes for default). Reported by: Sandeep Narahari (PrithviDevs)	2026-04-15 04:57:55 -07:00
sprmn24	41e2d61b3f	feat(discord): add native send_animation for inline GIF playback	2026-04-15 04:51:27 -07:00
Teknium	4da598b48a	docs: clarify hermes model vs /model — two commands, two purposes (#10276 ) Users are confused about the difference between `hermes model` (terminal command for full provider setup) and `/model` (session command for switching between already-configured providers). This distinction was not documented anywhere. Changes across 4 doc pages: - cli-commands.md: Added warning callout explaining the difference, added --global flag docs, added 'only see OpenRouter models?' info box - slash-commands.md: Added notes on both TUI and messaging /model entries that /model only switches between configured providers - providers.md: Added 'Two Commands for Model Management' comparison table near top of page, added warning callout in switching section - faq.md: Added new FAQ entry '/model only shows one provider' with quick reference table Prompted by user feedback in Discord — new users consistently hit this confusion when trying to add providers from inside a session.	2026-04-15 04:39:34 -07:00
asheriif	33ae403890	fix(gateway): fix matrix lingering typing indicator	2026-04-15 04:16:16 -07:00
Teknium	47e6ea84bb	fix: file handle bug, warning text, and tests for Discord media send - Fix file handle closed before POST: nest session.post() inside the 'with open()' block so aiohttp can read the file during upload - Update warning text to include weixin (also supports media delivery) - Add 8 unit tests covering: text+media, media-only, missing files, upload failures, multiple files, and _send_to_platform routing	2026-04-15 04:16:06 -07:00
sprmn24	4bcb2f2d26	feat(send_message): add native media attachment support for Discord Previously send_message only supported media delivery for Telegram. Discord users received a warning that media was omitted. - Add media_files parameter to _send_discord() - Upload media via Discord multipart/form-data API (files[0] field) - Handle Discord in _send_to_platform() same way as Telegram block - Remove Discord from generic chunk loop (now handled above) - Update error/warning strings to mention telegram and discord	2026-04-15 04:16:06 -07:00
Teknium	1c4d3216d3	fix(cron): include job_id in delivery and guide models on removal workflow (#10242 ) * fix(gateway): suppress duplicate replies on interrupt and streaming flood control Three fixes for the duplicate reply bug affecting all gateway platforms: 1. base.py: Suppress stale response when the session was interrupted by a new message that hasn't been consumed yet. Checks both interrupt_event and _pending_messages to avoid false positives. (#8221, #2483) 2. run.py (return path): Remove response_previewed guard from already_sent check. Stream consumer's already_sent alone is authoritative — if content was delivered via streaming, the duplicate send must be suppressed regardless of the agent's response_previewed flag. (#8375) 3. run.py (queued-message path): Same fix — already_sent without response_previewed now correctly marks the first response as already streamed, preventing re-send before processing the queued message. The response_previewed field is still produced by the agent (run_agent.py) but is no longer required as a gate for duplicate suppression. The stream consumer's already_sent flag is the delivery-level truth about what the user actually saw. Concepts from PR #8380 (konsisumer). Closes #8375, #8221, #2483. * fix(cron): include job_id in delivery and guide models on removal workflow Users reported cron reminders keep firing after asking the agent to stop. Root cause: the conversational agent didn't know the job_id (not in delivery) and models don't reliably do the list→remove two-step without guidance. 1. Include job_id in the cron delivery wrapper so users and agents can reference it when requesting removal. 2. Replace confusing footer ('The agent cannot see this message') with actionable guidance ('To stop or manage this job, send me a new message'). 3. Add explicit list→remove guidance in the cronjob tool schema so models know to list first and never guess job IDs.	2026-04-15 03:46:58 -07:00
Misturi	dedc4600dd	fix(skills): handle missing fields in Google Workspace token file gracefully instead of crashing with KeyError	2026-04-15 03:45:09 -07:00
Misturi	8bc9b5a0b4	fix(skills): use `is None` check for coordinates in find-nearby to avoid dropping valid 0.0 values	2026-04-15 03:45:09 -07:00
Teknium	2546b7acea	fix(gateway): suppress duplicate replies on interrupt and streaming flood control Three fixes for the duplicate reply bug affecting all gateway platforms: 1. base.py: Suppress stale response when the session was interrupted by a new message that hasn't been consumed yet. Checks both interrupt_event and _pending_messages to avoid false positives. (#8221, #2483) 2. run.py (return path): Remove response_previewed guard from already_sent check. Stream consumer's already_sent alone is authoritative — if content was delivered via streaming, the duplicate send must be suppressed regardless of the agent's response_previewed flag. (#8375) 3. run.py (queued-message path): Same fix — already_sent without response_previewed now correctly marks the first response as already streamed, preventing re-send before processing the queued message. The response_previewed field is still produced by the agent (run_agent.py) but is no longer required as a gate for duplicate suppression. The stream consumer's already_sent flag is the delivery-level truth about what the user actually saw. Concepts from PR #8380 (konsisumer). Closes #8375, #8221, #2483.	2026-04-15 03:42:24 -07:00
Teknium	7b2700c9af	fix(browser): use 127.0.0.1 instead of localhost for CDP default (#10231 ) /browser connect set BROWSER_CDP_URL to http://localhost:9222, but Chrome's --remote-debugging-port only binds to 127.0.0.1 (IPv4). On macOS, 'localhost' can resolve to ::1 (IPv6) first, causing both _resolve_cdp_override's /json/version fetch and agent-browser's --cdp connection to fail when Chrome isn't listening on IPv6. The socket check in the connect handler already used 127.0.0.1 explicitly and succeeded, masking the mismatch. Use 127.0.0.1 in the default CDP URL to match what Chrome actually binds to.	2026-04-15 03:29:37 -07:00
Teknium	a4e1842f12	fix: strip reasoning item IDs from Responses API input when store=False (#10217 ) With store=False (our default for the Responses API), the API does not persist response items. When reasoning items with 'id' fields were replayed on subsequent turns, the API attempted a server-side lookup for those IDs and returned 404: Item with id 'rs_...' not found. Items are not persisted when store is set to false. The encrypted_content blob is self-contained for reasoning chain continuity — the id field is unnecessary and triggers the failed lookup. Fix: strip 'id' from reasoning items in both _chat_messages_to_responses_input (message conversion) and _preflight_codex_input_items (normalization layer). The id is still used for local deduplication but never sent to the API. Reported by @zuogl448 on GPT-5.4.	2026-04-15 03:19:43 -07:00
Teknium	e69526be79	fix(send_message): URL-encode Matrix room IDs and add Matrix to schema examples (#10151 ) Matrix room IDs contain ! and : which must be percent-encoded in URI path segments per the Matrix C-S spec. Without encoding, some homeservers reject the PUT request. Also adds 'matrix:!roomid:server.org' and 'matrix:@user:server.org' to the tool schema examples so models know the correct target format.	2026-04-15 00:10:59 -07:00
Teknium	180b14442f	test: add _parse_target_ref Matrix coverage for salvaged PR #6144	2026-04-15 00:08:14 -07:00
bkadish	03446e06bb	fix(send_message): accept Matrix room IDs and user MXIDs as explicit targets `_parse_target_ref` has explicit-reference branches for Telegram, Feishu, and numeric IDs, but none for Matrix. As a result, callers of `send_message(target="matrix:!roomid:server")` or `send_message(target="matrix:@user:server")` fall through to `(None, None, False)` and the tool errors out with a resolution failure — even though a raw Matrix room ID or MXID is the most unambiguous possible target. Three-line fix: recognize `!…` as a room ID and `@…` as a user MXID when platform is `matrix`, and return them as explicit targets. Alias-based targets (`#…`) continue to go through the normal resolve path.	2026-04-15 00:08:14 -07:00
Teknium	df7be3d8ae	fix(cli): /model picker shows curated models instead of full catalog (#10146 ) The /model picker called provider_model_ids() which fetches the FULL live API catalog (hundreds of models for Anthropic, Copilot, etc.) and only fell back to the curated list when the live fetch failed. This flips the priority: use the curated model list from list_authenticated_providers() (same lists as `hermes model` and gateway pickers), falling back to provider_model_ids() only when the curated list is empty (e.g. user-defined endpoints).	2026-04-15 00:07:50 -07:00
Ubuntu	da8bab77fb	fix(cli): restore messaging toolset for gateway platforms	2026-04-14 23:13:35 -07:00
Teknium	9932366f3c	feat(doctor): add Command Installation check for hermes bin symlink hermes doctor now checks whether the ~/.local/bin/hermes symlink exists and points to the correct venv entry point. With --fix, it creates or repairs the symlink automatically. Covers: - Missing symlink at ~/.local/bin/hermes (or $PREFIX/bin on Termux) - Symlink pointing to wrong target - Missing venv entry point (venv/bin/hermes or .venv/bin/hermes) - PATH warning when ~/.local/bin is not on PATH - Skipped on Windows (different mechanism) Addresses user report: 'python -m hermes_cli.main doesn't have an option to fix the local bin/install' 10 new tests covering all scenarios.	2026-04-14 23:13:11 -07:00
Teknium	029938fbed	fix(cli): defensive subparser routing for argparse bpo-9338 (#10113 ) On some Python versions, argparse fails to route subcommand tokens when the parent parser has nargs='?' optional arguments (--continue). The symptom: 'hermes model' produces 'unrecognized arguments: model' even though 'model' is a registered subcommand. Fix: when argv contains a token matching a known subcommand, set subparsers.required=True to force deterministic routing. If that fails (e.g. 'hermes -c model' where 'model' is consumed as the session name for --continue), fall back to the default optional-subparsers behaviour. Adds 13 tests covering all key argument combinations. Reported via user screenshot showing the exact error on an installed version with the model subcommand listed in usage but rejected at parse time.	2026-04-14 23:13:02 -07:00
Teknium	772cfb6c4e	fix: stale agent timeout, uv venv detection, empty response after tools, compression model fallback (#9051 , #8620 , #9400 ) (#10093 ) Four independent fixes: 1. Reset activity timestamp on cached agent reuse (#9051) When the gateway reuses a cached AIAgent for a new turn, the _last_activity_ts from the previous turn (possibly hours ago) carried over. The inactivity timeout handler immediately saw the agent as idle for hours and killed it. Fix: reset _last_activity_ts, _last_activity_desc, and _api_call_count when retrieving an agent from the cache. 2. Detect uv-managed virtual environments (#8620 sub-issue 1) The systemd unit generator fell back to sys.executable (uv's standalone Python) when running under 'uv run', because sys.prefix == sys.base_prefix. The generated ExecStart pointed to a Python binary without site-packages. Fix: check VIRTUAL_ENV env var before falling back to sys.executable. uv sets VIRTUAL_ENV even when sys.prefix doesn't reflect the venv. 3. Nudge model to continue after empty post-tool response (#9400) Weaker models sometimes return empty after tool calls. The agent silently abandoned the remaining work. Fix: append assistant('(empty)') + user nudge message and retry once. Resets after each successful tool round. 4. Compression model fallback on permanent errors (#8620 sub-issue 4) When the default summary model (gemini-3-flash) returns 503 'model_not_found' on custom proxies, the compressor entered a 600s cooldown, leaving context growing unbounded. Fix: detect permanent model-not-found errors (503, 404, 'model_not_found', 'no available channel') and fall back to using the main model for compression instead of entering cooldown. One-time fallback with immediate retry. Test plan: 40 compressor tests + 97 gateway/CLI tests + 9 venv tests pass	2026-04-14 22:38:17 -07:00
Teknium	5d5d21556e	fix: sync client.api_key during UnicodeEncodeError ASCII recovery (#10090 ) The existing recovery block sanitized self.api_key and self._client_kwargs['api_key'] but did not update self.client.api_key. The OpenAI SDK stores its own copy of api_key and reads it dynamically via the auth_headers property on every request. Without this fix, the retry after sanitization would still send the corrupted key in the Authorization header, causing the same UnicodeEncodeError. The bug manifests when an API key contains Unicode lookalike characters (e.g. ʋ U+028B instead of v) from copy-pasting out of PDFs, rich-text editors, or web pages with decorative fonts. httpx hard-encodes all HTTP headers as ASCII, so the non-ASCII char in the Authorization header triggers the error. Adds TestApiKeyClientSync with two tests verifying: - All three key locations are synced after sanitization - Recovery handles client=None (pre-init) without crashing	2026-04-14 22:37:45 -07:00
kshitijk4poor	9855190f23	feat(compressor): smart collapse, dedup, anti-thrashing, template upgrade, hardening Combined salvage of PRs #9661, #9663, #9674, #9677, #9678 by kshitijk4poor. - Smart tool output collapse: informative 1-line summaries replace generic placeholder - Dedup identical tool results via MD5 hash, truncate large tool_call arguments - Anti-thrashing: skip compression after 2 consecutive <10% savings passes - Structured action-log summary template with numbered actions and Active State - Hardening: max_tokens 1.3x cap, multimodal safety, note idempotency, adaptive cooldown Follow-up fixes applied during salvage: - web_extract: reads 'urls' (list) not 'url' (original PR bug) - Multimodal list content guards in dedup and prune passes - Kept 'Relevant Files' section in template (original PR removed it) Skipped PRs #9665 (user msg preservation — duplication risk) and #9675 (dead code).	2026-04-14 22:21:25 -07:00
Teknium	50c35dcabe	fix: stale agent timeout, uv venv detection, empty response after tools (#9051 , #8620 , #9400 ) Three independent fixes: 1. Reset activity timestamp on cached agent reuse (#9051) When the gateway reuses a cached AIAgent for a new turn, the _last_activity_ts from the previous turn (possibly hours ago) carried over. The inactivity timeout handler immediately saw the agent as idle for hours and killed it. Fix: reset _last_activity_ts, _last_activity_desc, and _api_call_count when retrieving an agent from the cache. 2. Detect uv-managed virtual environments (#8620 sub-issue 1) The systemd unit generator fell back to sys.executable (uv's standalone Python) when running under 'uv run', because sys.prefix == sys.base_prefix (uv doesn't set up traditional venv activation). The generated ExecStart pointed to a Python binary without site-packages, crashing the service on startup. Fix: check VIRTUAL_ENV env var before falling back to sys.executable. uv sets VIRTUAL_ENV even when sys.prefix doesn't reflect the venv. 3. Nudge model to continue after empty post-tool response (#9400) Weaker models (GLM-5, mimo-v2-pro) sometimes return empty responses after tool calls instead of continuing to the next step. The agent silently abandoned the remaining work with '(empty)' or used prior-turn fallback text. Fix: when the model returns empty after tool calls AND there's no prior-turn content to fall back on, inject a one-time user nudge message telling the model to process the tool results and continue. The flag resets after each successful tool round so it can fire again on later rounds. Test plan: 97 gateway + CLI tests pass, 9 venv detection tests pass	2026-04-14 22:16:02 -07:00
Teknium	93fe4ead83	fix: warn on invalid context_length format in config.yaml (#10067 ) Previously, non-integer context_length values (e.g. '256K') in config.yaml were silently ignored, causing the agent to fall back to 128K auto-detection with no user feedback. This was confusing for users with custom LiteLLM endpoints expecting larger context. Now prints a clear stderr warning and logs at WARNING level when model.context_length or custom_providers[].models.<model>.context_length cannot be parsed as an integer, telling users to use plain integers (e.g. 256000 instead of '256K'). Reported by community user ChFarhan via Discord.	2026-04-14 22:14:27 -07:00
Teknium	a8b7db35b2	fix: interrupt agent immediately when user messages during active run (#10068 ) When a user sends a message while the agent is executing a task on the gateway, the agent is now interrupted immediately — not silently queued. Previously, messages were stored in _pending_messages with zero feedback to the user, potentially leaving them waiting 1+ hours. Root cause: Level 1 guard (base.py) intercepted all messages for active sessions and returned with no response. Level 2 (gateway/run.py) which calls agent.interrupt() was never reached. Fix: Expand _handle_active_session_busy_message to handle the normal (non-draining) case: 1. Call running_agent.interrupt(text) to abort in-flight tool calls and exit the agent loop at the next check point 2. Store the message as pending so it becomes the next turn once the interrupted run returns 3. Send a brief ack: 'Interrupting current task (10 min elapsed, iteration 21/60, running: terminal). I'll respond shortly.' 4. Debounce acks to once per 30s to avoid spam on rapid messages Reported by @Lonely__MH.	2026-04-14 22:07:28 -07:00
Teknium	8548893d14	feat: entry-level Podman support — find_docker() + rootless entrypoint (#10066 ) - find_docker() now checks HERMES_DOCKER_BINARY env var first, then docker on PATH, then podman on PATH, then macOS known locations - Entrypoint respects HERMES_HOME env var (was hardcoded to /opt/data) - Entrypoint uses groupmod -o to tolerate non-unique GIDs (fixes macOS GID 20 conflict with Debian's dialout group) - Entrypoint makes chown best-effort so rootless Podman continues instead of failing with 'Operation not permitted' - 5 new tests covering env var override, podman fallback, precedence Based on work by alanjds (PR #3996) and malaiwah (PR #8115). Closes #4084.	2026-04-14 21:20:37 -07:00
Teknium	c5688e7c8b	fix(gateway): break compression-exhaustion infinite loop and auto-reset session (#9893 ) When compression fails after max attempts, the agent returns {completed: False, partial: True} but was missing the 'failed' flag. The gateway's agent_failed_early guard checked for 'failed' AND 'not final_response', but _run_agent_blocking always converts errors to final_response — making the guard dead code. This caused the oversized session to persist, creating an infinite fail loop where every subsequent message hits the same compression failure. Changes: - run_agent.py: add 'failed: True' and 'compression_exhausted: True' to all 5 compression-exhaustion return paths - gateway/run.py (_run_agent_blocking): forward 'failed' and 'compression_exhausted' flags through to the caller - gateway/run.py (_handle_message_with_agent): fix agent_failed_early to check bool(failed) without the broken 'not final_response' clause; auto-reset the session when compression is exhausted so the next message starts fresh - Update tests to match new guard logic and add TestCompressionExhaustedFlag test class Closes #9893	2026-04-14 21:18:17 -07:00
Teknium	ba24f058ed	docs: fix stale docstring reference to _discover_tools in mcp_tool.py	2026-04-14 21:12:29 -07:00
Teknium	ef04de3e98	docs: update tool-adding instructions for auto-discovery - AGENTS.md: 3 files → 2 files, remove _discover_tools() step - adding-tools.md: remove Step 3, note auto-discovery - architecture.md: update discovery description - tools-runtime.md: replace manual list with discover_builtin_tools() docs - hermes-agent skill: remove manual import step	2026-04-14 21:12:29 -07:00
Teknium	fc6cb5b970	fix: tighten AST check to module-level only The original tree-wide ast.walk() would match registry.register() calls inside functions too. Restrict to top-level ast.Expr statements so helper modules that call registry.register() inside a function are never picked up as tool modules.	2026-04-14 21:12:29 -07:00
Greer Guthrie	4b2a1a4337	fix(tools): auto-discover built-in tool modules	2026-04-14 21:12:29 -07:00
Teknium	2871ef1807	docs: note session continuity for previous_response_id chains (#10060 )	2026-04-14 21:07:37 -07:00
Teknium	5cbb45d93e	fix: preserve session_id across previous_response_id chains in /v1/responses (#10059 ) The /v1/responses endpoint generated a new UUID session_id for every request, even when previous_response_id was provided. This caused each turn of a multi-turn conversation to appear as a separate session on the web dashboard, despite the conversation history being correctly chained. Fix: store session_id alongside the response in the ResponseStore, and reuse it when a subsequent request chains via previous_response_id. Applies to both the non-streaming /v1/responses path and the streaming SSE path. The /v1/runs endpoint also gains session continuity from stored responses (explicit body.session_id still takes priority). Adds test verifying session_id is preserved across chained requests.	2026-04-14 21:06:32 -07:00
Teknium	ca0ae56ccb	fix: add 402 billing error hint to gateway error handler (#5220 ) (#10057 ) * fix: hermes gateway restart waits for service to come back up (#8260) Previously, systemd_restart() sent SIGUSR1 to the gateway, printed 'restart requested', and returned immediately. The gateway still needed to drain active agents, exit with code 75, wait for systemd's RestartSec=30, and start the new process. The user saw 'success' but the gateway was actually down for 30-60 seconds. Now the SIGUSR1 path blocks with progress feedback: Phase 1 — wait for old process to die: ⏳ User service draining active work... Polls os.kill(pid, 0) until ProcessLookupError (up to 90s) Phase 2 — wait for new process to become active: ⏳ Waiting for hermes-gateway to restart... Polls systemctl is-active + verifies new PID (up to 60s) Success: ✓ User service restarted (PID 12345) Timeout: ⚠ User service did not become active within 60s. Check status: hermes gateway status Check logs: journalctl --user -u hermes-gateway --since '2 min ago' The reload-or-restart fallback path (line 1189) already blocks because systemctl reload-or-restart is synchronous. Test plan: - Updated test to verify wait-for-restart behavior - All 118 gateway CLI tests pass * fix: add 402 billing error hint to gateway error handler (#5220) The gateway's exception handler for agent errors had specific hints for HTTP 401, 429, 529, 400, 500 — but not 402 (Payment Required / quota exhausted). Users hitting billing limits from custom proxy providers got a generic error with no guidance. Added: 'Your API balance or quota is exhausted. Check your provider dashboard.' The underlying billing classification (error_classifier.py) already correctly handles 402 as FailoverReason.billing with credential rotation and fallback. The original issue (#5220) where 402 killed the entire gateway was from an older version — on current main, 402 is excluded from the is_client_error abort path (line 9460) and goes through the proper retry/fallback/fail flow. Combined with PR #9875 (auto-recover from unexpected SIGTERM), even edge cases where the gateway dies are now survivable.	2026-04-14 21:03:05 -07:00
Teknium	23b87c8ca8	chore: add zons-zhaozhy to AUTHOR_MAP	2026-04-14 21:01:40 -07:00
阿泥豆	92385679b6	fix: reset retry counters after compression and stop poisoning conversation history Three bugfixes in the agent loop: 1. Reset retry counters after context compression. Without this, pre-compression retry counts carry over, causing the model to hit empty-response recovery immediately after a compression- induced context loss, wasting API calls on a now-valid context. 2. Unmute output in the final-response (no-tool-call) branch. _mute_post_response could be left True from a prior housekeeping turn, silently suppressing empty-response warnings and recovery status that the user should see. 3. Stop injecting 'Calling the X tools...' into assistant message content when falling back to prior-turn content. This mutated conversation history with synthetic text that the model never produced, poisoning subsequent turns.	2026-04-14 21:01:40 -07:00
Teknium	82f364ffd1	feat: add --all flag to gateway start and restart commands (#10043 ) - gateway start --all: kills all stale gateway processes across all profiles before starting the current profile's service - gateway restart --all: stops all gateway processes across all profiles, then starts the current profile's service fresh - gateway stop --all: already existed, unchanged The --all flag was only available on 'stop' but not on 'start' or 'restart', causing 'unrecognized arguments' errors for users.	2026-04-14 20:52:18 -07:00
Teknium	31d0620663	chore: add simon-marcus to AUTHOR_MAP	2026-04-14 20:51:52 -07:00
Teknium	cf1d718823	fix: keep batch-path function_call_output.output as string per OpenAI spec The streaming path emits output as content-part arrays for Open WebUI compatibility, but the batch (non-streaming) Responses API path must return output as a plain string per the OpenAI Responses API spec. Reverts the _extract_output_items change from the cherry-picked commits while preserving the streaming path's array format.	2026-04-14 20:51:52 -07:00
simon-marcus	302554b158	fix(api-server): format responses tool outputs for open webui	2026-04-14 20:51:52 -07:00
simon-marcus	d6c09ab94a	feat(api-server): stream /v1/responses SSE tool events	2026-04-14 20:51:52 -07:00
Teknium	da528a8207	fix: detect and strip non-ASCII characters from API keys (#6843 ) API keys containing Unicode lookalike characters (e.g. ʋ U+028B instead of v) cause UnicodeEncodeError when httpx encodes the Authorization header as ASCII. This commonly happens when users copy-paste keys from PDFs, rich-text editors, or web pages with decorative fonts. Three layers of defense: 1. Save-time validation (hermes_cli/config.py): _check_non_ascii_credential() strips non-ASCII from credential values when saving to .env, with a clear warning explaining the issue. 2. Load-time sanitization (hermes_cli/env_loader.py): _sanitize_loaded_credentials() strips non-ASCII from credential env vars (those ending in _API_KEY, _TOKEN, _SECRET, _KEY) after dotenv loads them, so the rest of the codebase never sees non-ASCII keys. 3. Runtime recovery (run_agent.py): The UnicodeEncodeError recovery block now also sanitizes self.api_key and self._client_kwargs['api_key'], fixing the gap where message/tool sanitization succeeded but the API key still caused httpx to fail on the Authorization header. Also: hermes_logging.py RotatingFileHandler now explicitly sets encoding='utf-8' instead of relying on locale default (defensive hardening for ASCII-locale systems).	2026-04-14 20:20:31 -07:00
kshitijk4poor	677f1227c3	fix: remove @staticmethod from _context_completions — crashes on @ mention PR #9467 added a call to self._fuzzy_file_completions() inside _context_completions(), but the method was still decorated with @staticmethod and didn't receive self. Every @ mention in the input triggers 'name self is not defined' from prompt_toolkit's async completer, spamming the error on every keystroke. Fix: remove @staticmethod, add self parameter. The method already uses self._fuzzy_file_completions() and self._get_project_files() via that call chain, so it was never meant to stay static after the fuzzy search feature was added.	2026-04-14 19:43:42 -07:00

1 2 3 4 5 ...

4221 commits