hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Author	SHA1	Message	Date
Teknium	1af2e18d40	chore: release v0.9.0 (v2026.4.13) (#9182 ) The everywhere release — Hermes goes mobile with Termux/Android, adds iMessage and WeChat, ships Fast Mode for OpenAI and Anthropic, introduces background process monitoring, launches a local web dashboard, and delivers the deepest security hardening pass yet across 16 supported platforms. 487 commits, 269 merged PRs, 167 resolved issues, 24 contributors.	2026-04-13 11:52:09 -07:00
Teknium	0e60a9dc25	fix: add kimi-coding-cn to remaining provider touchpoints Follow-up for salvaged PR #7637. Adds kimi-coding-cn to: - model_normalize.py (prefix strip) - providers.py (models.dev mapping) - runtime_provider.py (credential resolution) - setup.py (model list + setup label) - doctor.py (health check) - trajectory_compressor.py (URL detection) - models_dev.py (registry mapping) - integrations/providers.md (docs)	2026-04-13 11:20:37 -07:00
hcshen0111	2b3aa36242	feat(providers): add kimi-coding-cn provider for mainland China users Cherry-picked from PR #7637 by hcshen0111. Adds kimi-coding-cn provider with dedicated KIMI_CN_API_KEY env var and api.moonshot.cn/v1 endpoint for China-region Moonshot users.	2026-04-13 11:20:37 -07:00
Teknium	ef180880aa	fix: guard anthropic_adapter import + use canonical authorize URL - Wrap module-level import from agent.anthropic_adapter in try/except so hermes web still starts if the adapter is unavailable; Phase 2 PKCE endpoints return 501 in that case. - Change authorize URL from console.anthropic.com to claude.ai to match the canonical adapter code.	2026-04-13 11:18:18 -07:00
kshitijk4poor	247929b0dd	feat: dashboard OAuth provider management Add OAuth provider management to the Hermes dashboard with full lifecycle support for Anthropic (PKCE), Nous and OpenAI Codex (device-code) flows. ## Backend (hermes_cli/web_server.py) - 6 new API endpoints: GET /api/providers/oauth — list providers with connection status POST /api/providers/oauth/{id}/start — initiate PKCE or device-code POST /api/providers/oauth/{id}/submit — exchange PKCE auth code GET /api/providers/oauth/{id}/poll/{session} — poll device-code DELETE /api/providers/oauth/{id} — disconnect provider DELETE /api/providers/oauth/sessions/{id} — cancel pending session - OAuth constants imported from anthropic_adapter (no duplication) - Blocking I/O wrapped in run_in_executor for async safety - In-memory session store with 15-minute TTL and automatic GC - Auth token required on all mutating endpoints ## Frontend - OAuthLoginModal — PKCE (paste auth code) and device-code (poll) flows - OAuthProvidersCard — status, token preview, connect/disconnect actions - Toast fix: createPortal to document.body for correct z-index - App.tsx: skip animation key bump on initial mount (prevent double-mount) - Integrated into the Env/Keys page	2026-04-13 11:18:18 -07:00
yongtenglei	2773b18b56	fix(run_agent): refresh activity during streaming responses Previously, long-running streamed responses could be incorrectly treated as idle by the gateway/cron inactivity timeout even while tokens were actively arriving. The _touch_activity() call (which feeds get_activity_summary() polled by the external timeout) was either called only on the first chunk (chat completions) or not at all (Anthropic, Codex, Codex fallback). Add _touch_activity() on every chunk/event in all four streaming paths so the inactivity monitor knows data is still flowing. Fixes #8760	2026-04-13 10:55:51 -07:00
Teknium	ba50fa3035	docs: fix 30+ inaccuracies across documentation (#9023) Cross-referenced all docs pages against the actual codebase and fixed: Reference docs (cli-commands.md, slash-commands.md, profile-commands.md): - Fix: hermes web -> hermes dashboard (correct subparser name) - Fix: Wrong provider list (removed deepseek, ai-gateway, opencode-zen, opencode-go, alibaba; added gemini) - Fix: Missing tts in hermes setup section choices - Add: Missing --image flag for hermes chat - Add: Missing --component flag for hermes logs - Add: Missing CLI commands: debug, backup, import - Fix: /status incorrectly marked as messaging-only (available everywhere) - Fix: /statusbar moved from Session to Configuration category - Add: Missing slash commands: /fast, /snapshot, /image, /debug - Add: Missing /restart from messaging commands table - Fix: /compress description to match COMMAND_REGISTRY - Add: --no-alias flag to profile create docs Configuration docs (configuration.md, environment-variables.md): - Fix: Vision timeout default 30s -> 120s - Fix: TTS providers missing minimax and mistral - Fix: STT providers missing mistral - Fix: TTS openai base_url shown with wrong default - Fix: Compression config showing stale summary_model/provider/base_url keys (migrated out in config v17) -> target_ratio/protect_last_n Getting-started docs: - Fix: Redundant faster-whisper install (already in voice extra) - Fix: Messaging extra description missing Slack Developer guide: - Fix: architecture.md tool count 48 -> 47, toolset count 40 -> 19 - Fix: run_agent.py line count 9,200 -> 10,700 - Fix: cli.py line count 8,500 -> 10,000 - Fix: main.py line count 5,500 -> 6,000 - Fix: gateway/run.py line count 7,500 -> 9,000 - Fix: Browser tools count 11 -> 10 - Fix: Platform adapter count 15 -> 18 (add wecom_callback, api_server) - Fix: agent-loop.md wrong budget sharing (not shared, independent) - Fix: agent-loop.md non-existent _get_budget_warning() reference - Fix: context-compression-and-caching.md non-existent function name - Fix: toolsets-reference.md safe toolset includes mixture_of_agents (it doesn't) - Fix: toolsets-reference.md hermes-cli tool count 38 -> 36 Guides: - Fix: automate-with-cron.md claims daily at 9am is valid (it's not) - Fix: delegation-patterns.md Max 3 presented as hard cap (configurable) - Fix: sessions.md group thread key format (shared by default, not per-user) - Fix: cron-internals.md job ID format and JSON structure	2026-04-13 10:53:10 -07:00
Teknium	4ca6668daf	docs: comprehensive update for recent merged PRs (#9019 ) Audit and update documentation across 12 files to match changes from ~50 recently merged PRs. Key updates: Slash commands (slash-commands.md): - Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart - Fix /status incorrectly labeled as messaging-only (available in both) - Add --global flag to /model docs - Add [focus topic] arg to /compress docs CLI commands (cli-commands.md): - Add hermes debug share section with options and examples - Add hermes backup section with --quick and --label flags - Add hermes import section Feature docs: - TTS: document global tts.speed and per-provider speed for Edge/OpenAI - Web dashboard: add docs for 5 missing pages (Sessions, Logs, Analytics, Cron, Skills) and 15+ API endpoints - WhatsApp: add streaming, 4K chunking, and markdown formatting docs - Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip - Budget: document CLI notification on iteration budget exhaustion Config migration (compression.summary_* → auxiliary.compression.*): - Update configuration.md, environment-variables.md, fallback-providers.md, cli.md, and context-compression-and-caching.md - Replace legacy compression.summary_model/provider/base_url references with auxiliary.compression.model/provider/base_url - Add legacy migration info boxes explaining auto-migration Minor fixes: - wecom-callback.md: clarify 'text only' limitation (input only) - Escape {session_id}/{job_id} in web-dashboard.md headings for MDX	2026-04-13 10:50:59 -07:00
墨綠BG	c449cd1af5	fix(config): restore custom providers after v11→v12 migration The v11→v12 migration converts custom_providers (list) into providers (dict), then deletes the list. But all runtime resolvers read from custom_providers — after migration, named custom endpoints silently stop resolving and fallback chains fail with AuthError. Add get_compatible_custom_providers() that reads from both config schemas (legacy custom_providers list + v12+ providers dict), normalizes entries, deduplicates, and returns a unified list. Update ALL consumers: - hermes_cli/runtime_provider.py: _get_named_custom_provider() + key_env - hermes_cli/auth_commands.py: credential pool provider names - hermes_cli/main.py: model picker + _model_flow_named_custom() - agent/auxiliary_client.py: key_env + custom_entry model fallback - agent/credential_pool.py: _iter_custom_providers() - cli.py + gateway/run.py: /model switch custom_providers passthrough - run_agent.py + gateway/run.py: per-model context_length lookup Also: use config.pop() instead of del for safer migration, fix stale _config_version assertions in tests, add pool mock to codex test. Co-authored-by: 墨綠BG <s5460703@gmail.com> Closes #8776, salvaged from PR #8814	2026-04-13 10:50:52 -07:00
Teknium	0dd26c9495	fix(tests): fix 78 CI test failures and remove dead test (#9036 ) Production fixes: - voice_mode.py: add is_recording property to AudioRecorder (parity with TermuxAudioRecorder) - cronjob_tools.py: add sms example to deliver description Test fixes: - test_real_interrupt_subagent: add missing _execution_thread_id (fixes 19 cascading failures from leaked _build_system_prompt patch) - test_anthropic_error_handling: add _FakeMessages, override _interruptible_streaming_api_call (6 fixes) - test_ctx_halving_fix: add missing request_overrides attribute (4 fixes) - test_context_token_tracking: set _disable_streaming=True for non-streaming test path (4 fixes) - test_dict_tool_call_args: set _disable_streaming=True (1 fix) - test_provider_parity: add model='gpt-4o' for AIGateway tests to meet 64K minimum context (4 fixes) - test_session_race_guard: add user_id to SessionSource (5 fixes) - test_restart_drain/helpers: add user_id to SessionSource (2 fixes) - test_telegram_photo_interrupts: add user_id to SessionSource - test_interrupt: target thread_id for per-thread interrupt system (2 fixes) - test_zombie_process_cleanup: rewrite with object.__new__ for refactored GatewayRunner.stop() (1 fix) - test_browser_camofox_state: update config version 15->17 (1 fix) - test_trajectory_compressor_async: widen lookback window 10->20 for line-shifted AsyncOpenAI (1 fix) - test_voice_mode: fixed by production is_recording addition (5 fixes) - test_voice_cli_integration: add _attached_images to CLI stub (2 fixes) - test_hermes_logging: explicit propagation/level reset for cross-test pollution defense (1 fix) - test_run_agent: add base_url for OpenRouter detection tests (2 fixes) Deleted: - test_inline_think_blocks_reasoning_only_accepted: tested unimplemented inline <think> handling	2026-04-13 10:50:24 -07:00
kimsr96	b909a9efef	fix: extend ASCII-locale UnicodeEncodeError recovery to full request payload The existing ASCII codec handler only sanitized conversation messages, leaving tool schemas, system prompts, ephemeral prompts, prefill messages, and HTTP headers as unhandled sources of non-ASCII content. On systems with LANG=C or non-UTF-8 locale, Unicode symbols in tool descriptions (e.g. arrows, em-dashes from prompt_builder) and system prompt content would cause UnicodeEncodeError that fell through to the error path. Changes: - Add _sanitize_structure_non_ascii() generic recursive walker for nested dict/list payloads - Add _sanitize_tools_non_ascii() thin wrapper for tool schemas - Add _force_ascii_payload flag: once ASCII locale is detected, all subsequent API calls get proactively sanitized (prevents recurring failures from new tool results bringing fresh Unicode each turn) - Extend the ASCII codec error handler to sanitize: prefill_messages, tool schemas (self.tools), system prompt, ephemeral system prompt, and default HTTP headers - Update stale comment that acknowledged the gap Cherry-picked from PR #8834 (credential pool changes dropped as separate concern).	2026-04-13 05:16:35 -07:00
Teknium	28a9c43f81	fix: resolve key_env to actual API key value instead of env var name The cherry-picked code passed the env var NAME (e.g. 'MY_API_KEY') as the api_key value. The caller's has_usable_secret() check would reject the var name, so the actual key was never used. Now we os.getenv() the key_env value to get the real API key before returning it.	2026-04-13 05:16:21 -07:00
Geoff	76eecf3819	fix(model): Support providers: dict for custom endpoints in /model Two fixes for user-defined providers in config.yaml: 1. list_authenticated_providers() - now includes full models list from providers.*.models array, not just default_model. This fixes /model showing only one model when multiple are configured. 2. _get_named_custom_provider() - now checks providers: dict (new-style) in addition to custom_providers: list (legacy). This fixes credential resolution errors when switching models via /model command. Both changes are backwards compatible with existing custom_providers list format. Fixes: Only one model appears for custom providers in /model selection	2026-04-13 05:16:21 -07:00
konsisumer	311dac1971	fix(file_tools): block /private/etc writes on macOS symlink bypass On macOS, /etc is a symlink to /private/etc, so os.path.realpath() resolves /etc/hosts to /private/etc/hosts. The sensitive path check only matched /etc/ prefixes against the resolved path, allowing writes to system files on macOS. - Add /private/etc/ and /private/var/ to _SENSITIVE_PATH_PREFIXES - Check both realpath-resolved and normpath-normalized paths - Add regression tests for macOS symlink bypass Closes #8734 Co-authored-by: ElhamDevelopmentStudio (PR #8829)	2026-04-13 05:15:05 -07:00
Teknium	587eeb56b9	chore: remove duplicate dead _try_gh_cli_token / _gh_cli_candidates from auth.py These functions were duplicated between auth.py and copilot_auth.py. The auth.py copies had zero production callers — only copilot_auth.py's versions are used. Redirect the test import to the live copy and update monkeypatch targets accordingly.	2026-04-13 05:12:36 -07:00
HearthCore	2a9e50c104	fix(copilot): resolve GHE token poisoning when GITHUB_TOKEN is set When GITHUB_TOKEN is present in the environment (e.g. for gh CLI or GitHub Actions), two issues broke Copilot authentication against GitHub Enterprise (GHE) instances: 1. The copilot provider had no base_url_env_var, so COPILOT_API_BASE_URL was silently ignored — requests always went to public GitHub. 2. `gh auth token` (the CLI fallback) treats GITHUB_TOKEN as an override and echoes it back instead of reading from its credential store (hosts.yml). This caused the same rejected token to be used even after env var priority correctly skipped it. Fix: - Add base_url_env_var="COPILOT_API_BASE_URL" to copilot ProviderConfig - Strip GITHUB_TOKEN/GH_TOKEN from the subprocess env when calling `gh auth token` so it reads from hosts.yml - Pass --hostname from COPILOT_GH_HOST when set so gh returns the GHE-specific OAuth token	2026-04-13 05:12:36 -07:00
luyao618	8ec1608642	fix(agent): propagate api_mode to vision provider resolution resolve_vision_provider_client() computed resolved_api_mode from config but never passed it to downstream resolve_provider_client() or _get_cached_client() calls, causing custom providers with api_mode: anthropic_messages to crash when used for vision tasks. Also remove the for_vision special case in _normalize_aux_provider() that incorrectly discarded named custom provider identifiers. Fixes #8857 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 05:02:54 -07:00
Teknium	e3ffe5b75f	fix: remove legacy compression.summary_* config and env var fallbacks (#8992 ) Remove the backward-compat code paths that read compression provider/model settings from legacy config keys and env vars, which caused silent failures when auto-detection resolved to incompatible backends. What changed: - Remove compression.summary_model, summary_provider, summary_base_url from DEFAULT_CONFIG and cli.py defaults - Remove backward-compat block in _resolve_task_provider_model() that read from the legacy compression section - Remove _get_auxiliary_provider() and _get_auxiliary_env_override() helper functions (AUXILIARY_/CONTEXT_ env var readers) - Remove env var fallback chain for per-task overrides - Update hermes config show to read from auxiliary.compression - Add config migration (v16→17) that moves non-empty legacy values to auxiliary.compression and strips the old keys - Update example config and openclaw migration script - Remove/update tests for deleted code paths Compression model/provider is now configured exclusively via: auxiliary.compression.provider / auxiliary.compression.model Closes #8923	2026-04-13 04:59:26 -07:00
WorldInnovationsDepartment	c1809e85e7	fix(gateway): handle stale lock files in acquire_scoped_lock Updated the acquire_scoped_lock function to treat empty or corrupt lock files as stale. This change ensures that if a lock file exists but is invalid, it will be removed to prevent issues with stale locks. Added tests to verify recovery from both empty and corrupt lock files.	2026-04-13 04:59:25 -07:00
Teknium	23f668d66e	fix: extract Gemma 4 <thought> reasoning in _extract_reasoning() (#8991) Add <thought>(.*?)</thought> to inline_patterns so Gemma 4 reasoning content is captured for /reasoning display, not just stripped from visible output. Closes #8891 Co-authored-by: RhushabhVaghela <rhushabhvaghela@users.noreply.github.com>	2026-04-13 04:59:06 -07:00
flobo3	d8a521092b	fix(weixin): rename send_document parameter to match base class	2026-04-13 04:58:30 -07:00
Teknium	a5bd56eae3	fix: eliminate provider hang dead zones in retry/timeout architecture (#8985 ) Three targeted changes to close the gaps between retry layers that caused users to experience 'No response from provider for 580s' and 'No activity for 15 minutes' despite having 5 layers of retry: 1. Remove non-streaming fallback from streaming path Previously, when all 3 stream retries exhausted, the code fell back to _interruptible_api_call() which had no stale detection and no activity tracking — a black hole that could hang for up to 1800s. Now errors propagate to the main retry loop which has richer recovery (credential rotation, provider fallback, backoff). For 'stream not supported' errors, sets _disable_streaming flag so the main retry loop automatically switches to non-streaming on the next attempt. 2. Add _touch_activity to recovery dead zones The gateway inactivity monitor relies on _touch_activity() to know the agent is alive, but activity was never touched during: - Stale stream detection/kill cycles (180-300s gaps) - Stream retry connection rebuilds - Main retry backoff sleeps (up to 120s) - Error recovery classification Now all these paths touch activity every ~30s, keeping the gateway informed during recovery cycles. 3. Add stale-call detector to non-streaming path _interruptible_api_call() now has the same stale detection pattern as the streaming path: kills hung connections after 300s (default, configurable via HERMES_API_CALL_STALE_TIMEOUT), scaled for large contexts (450s for 50K+ tokens, 600s for 100K+ tokens), disabled for local providers. Also touches activity every ~30s during the wait so the gateway monitor stays informed. 
Env vars: - HERMES_API_CALL_STALE_TIMEOUT: non-streaming stale timeout (default 300s) - HERMES_STREAM_STALE_TIMEOUT: unchanged (default 180s) Before: worst case ~2+ hours of sequential retries with no feedback After: worst case bounded by gateway inactivity timeout (default 1800s) with continuous activity reporting	2026-04-13 04:55:20 -07:00
Teknium	acdff020b7	test: add multi-word query tests for truncation match strategy Tests phrase matching, proximity co-occurrence, and sliding window coverage maximisation — the three new tiers from the truncation fix.	2026-04-13 04:54:42 -07:00
Al Sayed Hoota	a5bc698b9a	fix(session_search): improve truncation to center on actual query matches Three-tier match strategy for _truncate_around_matches(): 1. Full-phrase search (exact query string positions) 2. Proximity co-occurrence (all terms within 200 chars) 3. Individual terms (fallback, preserves existing behavior) Sliding window picks the start offset covering the most matches. Moved inline import re to module level. Co-authored-by: Al Sayed Hoota <78100282+AlsayedHoota@users.noreply.github.com>	2026-04-13 04:54:42 -07:00
landy	dbed40f39b	fix: reopen resumed gateway sessions in sqlite	2026-04-13 04:54:07 -07:00
flobo3	d945cf6b1a	fix(docker): add .venv to .dockerignore	2026-04-13 04:52:00 -07:00
twilwa	3a64348772	fix(discord): voice session continuity and signal handler thread safety - Store source metadata on /voice channel join so voice input shares the same session as the linked text channel conversation - Treat voice-linked text channels as free-response (skip @mention and auto-thread) while voice is active - Scope the voice-linked exemption to the exact bound channel, not sibling threads - Guard signal handler registration in start_gateway() for non-main threads (prevents RuntimeError when gateway runs in a daemon thread) - Clean up _voice_sources on leave_voice_channel Salvaged from PR #3475 by twilwa (Modal runtime portions excluded).	2026-04-13 04:49:21 -07:00
Teknium	381810ad50	feat: fix SQLite safety in hermes backup + add --quick snapshots + /snapshot command (#8971 ) Three changes consolidated into the existing backup system: 1. Fix: hermes backup now uses sqlite3.Connection.backup() for .db files instead of raw file copy. Raw copy of a WAL-mode database can produce a corrupted backup — the backup() API handles this correctly. 2. hermes backup --quick: fast snapshot of just critical state files (config.yaml, state.db, .env, auth.json, cron/jobs.json, etc.) stored in ~/.hermes/state-snapshots/. Auto-prunes to 20 snapshots. 3. /snapshot slash command (alias /snap): in-session interface for quick state snapshots. create/list/restore/prune subcommands. Restore by ID or number. Powered by the same backup module. No new modules — everything lives in hermes_cli/backup.py alongside the existing full backup/import code. No hooks in run_agent.py — purely on-demand, zero runtime overhead. Closes the use case from PRs #8406 and #7813 with ~200 lines of new logic instead of a 1090-line content-addressed storage engine.	2026-04-13 04:46:13 -07:00
Richard Li	82901695ff	feat(wecom): add platform hint for native media sending	2026-04-13 04:46:04 -07:00
Teknium	3365abdddf	fix: use correct 'completed' state in status badge map, clean up blank lines The cron backend uses 'completed' (not 'exhausted') when repeat count is reached. Also removes extra blank lines from cherry-pick.	2026-04-13 04:45:29 -07:00
jonny	70f490a12a	fix(web): CronPage crash when rendering schedule object The cron API returns schedule as {kind, expr, display} object but CronPage.tsx rendered it directly as a React child, crashing with 'Objects are not valid as a React child'. - Update CronJob interface in api.ts to match actual API response - Use schedule_display (string) instead of schedule (object) - Use state instead of status for job state - Use last_error instead of error for error display	2026-04-13 04:45:29 -07:00
Teknium	8dfee98d06	fix: clean up description escaping, add string-data tests Follow-up for cherry-picked PR #8918.	2026-04-13 04:45:07 -07:00
dippwho	bca22f3090	fix(homeassistant): #8912 resolve XML tool calling loop by casting nested object to JSON string	2026-04-13 04:45:07 -07:00
MaybeRichard	11e2e04667	fix(telegram): pass proxy URL explicitly to HTTPXRequest when proxy env vars are set When HTTPS_PROXY / HTTP_PROXY / ALL_PROXY env vars are set (or macOS system proxy is detected), pass the proxy URL explicitly via HTTPXRequest(proxy=proxy_url) instead of relying on httpx's trust_env mechanism, which is unreliable for HTTP CONNECT proxies (e.g. Clash / ClashMac in fake-ip mode). Uses the shared resolve_proxy_url() from base.py (handles env vars + macOS system proxy detection) instead of duplicating env var reading inline. Consolidates the proxy_configured boolean into a single proxy_url = resolve_proxy_url() call that serves as both the gate for skipping fallback-IP transport and the value passed to HTTPXRequest. Co-authored-by: Hermes Agent <hermes@nousresearch.com> Salvaged from PR #8931 by MaybeRichard.	2026-04-13 04:45:05 -07:00
XiaoXiao0221	860489600a	fix(cli): sanitize surrogate characters in handle_paste Prevents UTF-8 encoding crash when pasting text from Word or Google Docs, which may contain lone surrogate code points (U+D800-U+DFFF). Reuses existing _sanitize_surrogates() from run_agent module.	2026-04-13 04:42:45 -07:00
Teknium	0998a57007	refactor: remove 5 dead utility functions from utils.py (#8975 ) Remove read_json_file, read_jsonl, append_jsonl, env_str, env_lower — all added in #7917 but never imported anywhere in the codebase. Also remove unused List and Optional typing imports. env_int, env_bool, and the other helpers that have real consumers are kept.	2026-04-13 04:39:59 -07:00
Teknium	cea34dc7ef	fix: follow-up for salvaged PR #8939 - Move test file to tests/hermes_cli/ (consistent with test layout) - Remove unused imports (os, pytest) from test file - Update _sanitize_env_lines docstring: now used on read + write paths	2026-04-13 04:35:37 -07:00
Mil Wang (from Dev Box)	e469f3f3db	fix: sanitize .env before loading to prevent token duplication (#8908 ) When .env files become corrupted (e.g. concatenated KEY=VALUE pairs on a single line due to concurrent writes or encoding issues), both python-dotenv and load_env() would parse the entire concatenated string as a single value. This caused bot tokens to appear duplicated up to 8×, triggering InvalidToken errors from the Telegram API. Root cause: _sanitize_env_lines() — which correctly splits concatenated lines — was only called during save_env_value() writes, not during reads. Fix: - load_env() now calls _sanitize_env_lines() before parsing - env_loader.load_hermes_dotenv() sanitizes the .env file on disk before python-dotenv reads it, so os.getenv() also returns clean values - Added tests reproducing the exact corruption pattern from #8908 Closes #8908	2026-04-13 04:35:37 -07:00
ismell0992-afk	e77f135ed8	fix(cli): narrow Nous Hermes non-agentic warning to actual hermes-3/-4 models The startup warning that Nous Research Hermes 3 & 4 models are not agentic fired on any model whose name contained "hermes" anywhere, via a plain substring check. That false-positived on unrelated local Modelfiles such as `hermes-brain:qwen3-14b-ctx16k` — a tool-capable Qwen3 wrapper that happens to live under a custom "hermes" tag namespace — making the warning noise for legitimate setups. Replace the substring check with a narrow regex anchored on `^`, `/`, or `:` boundaries that only matches the real Hermes-3 / Hermes-4 chat family (e.g. `NousResearch/Hermes-3-Llama-3.1-70B`, `hermes-4-405b`, `openrouter/hermes3:70b`). Consolidate into a single helper `is_nous_hermes_non_agentic()` in `hermes_cli.model_switch` so the CLI and the canonical check don't drift, and route the duplicate inline site in `cli.HermesCLI._print_warnings()` through the helper. Add a parametrized test covering positive matches (real Hermes-3/-4 names) and a broad set of negatives (custom Modelfiles, Qwen/Claude/GPT, older Nous-Hermes-2 families, bare "hermes", empty string, and the "brain-hermes-3-impostor" boundary case).	2026-04-13 04:33:52 -07:00
ismell0992-afk	3e99964789	fix(agent): prefer Ollama Modelfile num_ctx over GGUF training max _query_local_context_length was checking model_info.context_length (the GGUF training max) before num_ctx (the Modelfile runtime override), inverse to query_ollama_num_ctx. The two helpers therefore disagreed on the same model: hermes-brain:qwen3-14b-ctx32k # Modelfile: num_ctx 32768 underlying qwen3:14b GGUF # qwen3.context_length: 40960 query_ollama_num_ctx correctly returned 32768 (the value Ollama will actually allocate KV cache for). _query_local_context_length returned 40960, which let ContextCompressor grow conversations past 32768 before triggering compression — at which point Ollama silently truncated the prefix, corrupting context. Swap the order so num_ctx is checked first, matching query_ollama_num_ctx. Adds a parametrized test that seeds both values and asserts num_ctx wins. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-13 04:24:07 -07:00
Teknium	39b83f3443	fix: remove sandbox language from tool descriptions The terminal and execute_code tool schemas unconditionally mentioned 'cloud sandboxes' in their descriptions sent to the model. This caused agents running on local backends to believe they were in a sandboxed environment, refusing networking tasks and other operations. Worse, agents sometimes saved this false belief to persistent memory, making it persist across sessions. Reported by multiple users (XLion, 林泽).	2026-04-13 04:23:27 -07:00
Teknium	67fece1176	feat(cli): show notification when iteration budget is reached Displays a dim warning after the response panel when the agent hit its max iterations, so the user knows the response may be incomplete.	2026-04-13 03:40:47 -07:00
Teknium	934318ba3a	fix: budget-exhausted conversations now get a summary instead of empty response The post-loop grace call mechanism was broken: it injected a user message and set _budget_grace_call=True, but could never re-enter the while loop (already exited). Worse, the flag blocked the fallback _handle_max_iterations from running, so final_response stayed None. Users saw empty/no response when the agent hit max iterations. Fix: remove the dead grace block and let _handle_max_iterations handle it directly — it already injects a summary request and makes one extra toolless API call.	2026-04-13 03:36:20 -07:00
Teknium	3804556cd9	fix: restore clarify toolset row removed in cherry-pick	2026-04-13 02:49:11 -07:00
Haoqing Wang	8e0ae66520	fix(skills): correct TTS/STT providers, add missing platforms/commands in hermes-agent skill Fixes verified via 5-container parallel testing against v0.8.0 codebase. Critical fixes: - TTS providers: replace nonexistent kokoro/fish with actual minimax/mistral/neutts - STT providers: add missing mistral (Voxtral Transcribe) - Testing section: remove `source venv/bin/activate` (no venv dir in project) Expanded coverage: - Provider table: 13 → 22 entries (add Gemini, xAI, Xiaomi, Qwen OAuth, MiniMax CN, etc.) - Platform list: add BlueBubbles (iMessage) and Weixin (WeChat), clarify Open WebUI - Slash commands: add 14 undocumented commands (/approve, /deny, /branch, /fast, etc.) - Toolsets: add 4 missing (messaging, search, todo, rl) - Troubleshooting: expand from 6 to 10 sections with practical deployment fixes (Copilot OAuth 403, gateway linger, WSL2 systemd, Discord intents, etc.) Minor fixes: - agent/ directory description expanded - delegation config keys completed - /restart noted as gateway-only - hermes honcho noted as plugin-dependent	2026-04-13 02:49:11 -07:00
Teknium	397eae5d93	fix: recover partial streamed content on connection failure When streaming fails after partial content delivery (e.g. OpenRouter timeout kills connection mid-response), the stub response now carries the accumulated streamed text instead of content=None. Two fixes: 1. The partial-stream stub response includes recovered content from _current_streamed_assistant_text — the text that was already delivered to the user via stream callbacks before the connection died. 2. The empty response recovery chain now checks for partial stream content BEFORE falling back to _last_content_with_tools (prior turn content) or wasting API calls on retries. This prevents: - Showing wrong content from a prior turn - Burning 3+ unnecessary retry API calls - Falling through to '(empty)' when the user already saw content The root cause: OpenRouter has a ~125s inactivity timeout. When Anthropic's SSE stream goes silent during extended reasoning, the proxy kills the connection. The model's text was already partially streamed but the stub discarded it, triggering the empty recovery chain which would show stale prior-turn content or waste retries.	2026-04-13 02:12:01 -07:00
Teknium	35b11f48a5	docs: add web dashboard documentation (#8864) - New docs page: user-guide/features/web-dashboard.md covering quick start, prerequisites, all three pages (Status, Config, API Keys), the /reload slash command, REST API endpoints, CORS config, and development workflow - Added 'Management' category in sidebar for web-dashboard - Added 'hermes web' to CLI commands reference with options table - Added '/reload' to slash commands reference (both CLI and gateway tables)	2026-04-13 01:15:27 -07:00
Ubuntu	73ed09e145	fix(gateway): keep venv python symlink unresolved when remapping paths _remap_path_for_user was calling .resolve() on the Python path, which followed venv/bin/python into the base interpreter. On uv-managed venvs this swaps the systemd ExecStart to a bare Python that has none of the venv's site-packages, so the service crashes on first import. Classical python -m venv installs were unaffected by accident: the resolved target /usr/bin/python3.x lives outside $HOME so the path-remap branch was skipped and the system Python's packages silently worked. Remove .resolve() calls on both current_home and the path; use .expanduser() for lexical tilde expansion only. The function does lexical prefix substitution, which is all it needs to do for its actual purpose (remapping /root/.hermes -> /home/<user>/.hermes when installing system services as root for a different user). Repro: on a uv-managed venv install, `sudo hermes gateway install --system` writes ExecStart=.../uv/python/cpython-3.11.15-.../bin/python3.11 instead of .../hermes-agent/venv/bin/python, and the service crashes on ModuleNotFoundError: yaml. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 00:49:22 -07:00
Teknium	964ef681cf	fix(gateway): improve /restart response with fallback instructions	2026-04-12 22:34:23 -07:00
Teknium	276d20e62c	fix(gateway): /restart uses service restart under systemd instead of detached subprocess The detached bash subprocess spawned by /restart gets killed by systemd's KillMode=mixed cgroup cleanup, leaving the gateway dead. Under systemd (detected via INVOCATION_ID env var), /restart now uses via_service=True which exits with code 75 — RestartForceExitStatus=75 in the unit file makes systemd auto-restart the service. The detached subprocess approach is preserved as fallback for non-systemd environments (Docker, tmux, foreground mode).	2026-04-12 22:32:19 -07:00
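
Commit 381810ad50 above notes that a raw file copy of a WAL-mode database can produce a corrupted backup, and that `sqlite3.Connection.backup()` handles this correctly. A minimal sketch of that technique follows. The helper name `backup_sqlite` and the file names are hypothetical and are not taken from `hermes_cli/backup.py`.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(src: Path, dest: Path) -> None:
    """Copy a SQLite database through the online backup API (hypothetical helper)."""
    source = sqlite3.connect(str(src))
    target = sqlite3.connect(str(dest))
    try:
        # Connection.backup() copies a consistent snapshot of the database,
        # including committed pages that still live in the WAL file, which a
        # plain copy of the .db file alone would miss.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

Calling `backup_sqlite(Path("state.db"), Path("state-copy.db"))` would write a standalone copy that can be opened without the source's `-wal` or `-shm` side files.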

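Commit e77f135ed8 above replaces a plain substring check with a regex anchored on `^`, `/`, or `:` boundaries so that only real Hermes-3 / Hermes-4 names match. The actual pattern is not shown in the log, so the one below is an assumption built only from the examples in that commit message, and `looks_like_hermes_3_or_4` is a hypothetical name, not the project's `is_nous_hermes_non_agentic()`.

```python
import re

# Assumed pattern: "hermes", an optional hyphen, then 3 or 4, starting either
# at the beginning of the name or right after "/" or ":". The lookahead keeps
# a longer number such as "405" from being read as version 4.
_HERMES_3_4 = re.compile(r"(?:^|[/:])hermes-?[34](?![0-9])", re.IGNORECASE)


def looks_like_hermes_3_or_4(model_name: str) -> bool:
    """Return True when the name carries a boundary-anchored Hermes-3/-4 tag."""
    return bool(_HERMES_3_4.search(model_name))
```

Under this assumed pattern, `hermes-brain:qwen3-14b-ctx16k` is rejected because `hermes-` is followed by `b` rather than a version digit, and `brain-hermes-3-impostor` is rejected because `hermes` follows a hyphen rather than one of the three boundaries.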

4049 commits