hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Author	SHA1	Message	Date
Lumen Radley	e22416dd9b	fix: handle empty sudo password and false prompts	2026-04-09 02:50:07 -07:00
Teknium	a94099908a	fix(state): orphan children instead of cascade-deleting in prune/delete (#6513 ) prune_sessions and delete_session only handled direct children when satisfying the parent_session_id FK constraint. Multi-level chains (A -> B -> C) caused IntegrityError because deleting B while C still referenced it was blocked by the FK. Fix: NULL out parent_session_id for any session whose parent is about to be deleted. This orphans children instead of cascade-deleting them, which also respects the prune retention window — newer child sessions are no longer deleted just because an ancestor is old. Reported by Aaryan2304 in PR #6463.	2026-04-09 02:41:56 -07:00
cokemine	851857e413	fix(models): correct probed_url selection logic Updated the logic for determining the probed_url in the probe_api_models function to use the first tried URL instead of the last. This change ensures that the most relevant URL is returned when probing for models. Additionally, improved the output message in the _model_flow_custom function to provide clearer guidance based on the suggested_base_url.	2026-04-09 02:38:09 -07:00
Teknium	b408379e9d	fix: reduce credential exhaustion TTL from 24 hours to 1 hour (#6504 ) The 24-hour default cooldown for 402-exhausted credentials was far too aggressive — if a user tops up credits or the 402 was caused by an oversized max_tokens request rather than true billing exhaustion, they shouldn't have to wait a full day. Reduce to 1 hour (matching the existing 429 TTL). Inspired by PR #6493 (michalkomar).	2026-04-09 02:37:23 -07:00
Kira	e1b0b135cb	fix(discord): accept .log attachments and raise document size limit	2026-04-09 02:26:33 -07:00
Teknium	1eabbe905e	fix: retry 3 times when model returns truly empty response (#6488 ) When a model returns no content, no structured reasoning, and no tool calls (common with open models), the agent now silently retries up to 3 times before falling through to (empty). Silent retry (no synthetic messages) keeps the conversation history clean, preserves prompt caching, and respects the no-synthetic-user- injection invariant. Most empty responses from open models are transient (provider hiccups, rate limits, sampling flukes) so a simple retry is sufficient. This fills the last gap in the empty-response recovery chain: 1. _last_content_with_tools fallback (prior tool turn had content) 2. Thinking-only prefill continuation (#5931 — structured reasoning) 3. Empty response silent retry (NEW — truly empty, no reasoning) 4. (empty) terminal (last resort after all retries exhausted) Inline <think> blocks are excluded — the model chose to reason, it just produced no visible text. That differs from truly empty. Tests: - Updated test_truly_empty to expect 4 API calls (1 + 3 retries) - Added test_truly_empty_response_succeeds_on_nudge	2026-04-09 02:06:12 -07:00
Teknium	b962801f6a	fix(bluebubbles): add setup wizard integration and OPTIONAL_ENV_VARS (#6494 ) The BlueBubbles adapter was merged but missing setup wizard support: - Add _setup_bluebubbles() guided setup (server URL, password, allowlist, home channel, webhook port) - Add to _GATEWAY_PLATFORMS registry so it appears in 'hermes setup gateway' - Add to any_messaging check and home channel missing warning - Add to gateway status display in 'hermes setup' - Add BLUEBUBBLES_SERVER_URL, BLUEBUBBLES_PASSWORD, BLUEBUBBLES_ALLOWED_USERS to OPTIONAL_ENV_VARS with descriptions and categories Previously the only way to configure BlueBubbles was manually editing .env.	2026-04-09 02:05:41 -07:00
Cherif Yaya	5cf4fac2aa	fix: restore codex fallback auth-store lookup	2026-04-09 01:56:10 -07:00
Hunter B	894e8c8a8f	fix: resolve opencode.ai context window to 1M and clean up display formatting Two issues resolved: 1. Add opencode.ai to _URL_TO_PROVIDER mapping so base_url routes through models.dev lookup (which has mimo-v2-pro at 1M context) instead of falling back to probing /models (404) and defaulting to 128K. 2. Fix _format_context_length to round cleanly: 1048576 → '1M' instead of '1.048576M'. Applies same rounding logic to K values.	2026-04-09 01:43:22 -07:00
Teknium	18140199c3	fix(ci): build and push multi-arch Docker image (amd64 + arm64) (#6124 ) Add QEMU cross-compilation and multi-arch manifest support so Apple Silicon (M1/M2/M3) and other ARM-based systems get native images. - Add docker/setup-qemu-action for arm64 emulation on amd64 runners - Smoke test stays amd64-only (load:true can't export multi-arch) - Both push steps (main + release) now build linux/amd64,linux/arm64 - Bump timeout 30->60min for QEMU cross-compilation overhead - Add permissions: contents: read (least-privilege hardening) Salvaged from PR #3998 by Mibayy. Also addresses #5005 and #3913. Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>	2026-04-09 00:29:45 -07:00
Teknium	7120d6cdd6	fix(bluebubbles): add missing integration points and documentation (#6460 ) - hermes_cli/skills_config.py: add platform label for per-platform skill config - gateway/session.py: add to PII-safe platforms (no mention system) - website/docs/user-guide/messaging/bluebubbles.md: full setup guide - website/sidebars.ts: sidebar navigation entry - 10 docs pages: add BlueBubbles to all platform enumerations (env vars, toolsets, cron delivery, gateway internals, etc.)	2026-04-09 00:19:05 -07:00
Teknium	d40264d53b	test: add coverage for token-budget tail protection Tests for the new behavior paths: - Large tool outputs no longer block compaction (motivating scenario) - Hard minimum of 3 tail messages always protected - 1.5x soft ceiling for oversized messages - Small conversations still compress (min 8 messages) - Token-budget prune path in _prune_old_tool_results - Fallback to message-count when no token budget	2026-04-08 23:54:23 -07:00
BongSuCHOI	c506126123	fix(tests): update context_compressor tests for min_tail=3 PR #6240 changed tail protection from protect_last_n to min(3, ...) which increased the minimum compressible message count and shifted tail boundaries. Three tests broke: - test_summary_role_avoids_consecutive_user_messages: 6→8 msgs - test_double_collision_user_head_assistant_tail: 7→8 msgs - test_no_collision_scenarios_still_work: 6→8 msgs All tests now exceed the new min_for_compress threshold (6) and maintain proper role alternation in both head and tail sections.	2026-04-08 23:54:23 -07:00
BongSuCHOI	d12f8db0b8	fix(compaction): token-budget primary tail protection Tail protection was effectively message-count based despite having a token budget, because protect_last_n=20 acted as a hard floor. A single 50K-token tool output would cause all 20 recent messages to be preserved regardless of budget, leaving little room for summarization. Changes: - _find_tail_cut_by_tokens: min_tail reduced from protect_last_n (20) to 3; token budget is now the primary criterion - Soft ceiling at 1.5x budget to avoid cutting mid-oversized-message - _prune_old_tool_results: accepts optional protect_tail_tokens so pruning also respects the token budget instead of a fixed count - compress() minimum message check relaxed from protect_first_n + protect_last_n + 1 to protect_first_n + 3 + 1 - Tool group alignment (no splitting tool_call/result) preserved	2026-04-08 23:54:23 -07:00
Nicolò Boschi	25757d631b	feat(hindsight): feature parity, setup wizard, and config improvements Port missing features from the hindsight-hermes external integration package into the native plugin. Only touches plugin files — no core changes. Features: - Tags on retain/recall (tags, recall_tags, recall_tags_match) - Recall config (recall_max_tokens, recall_max_input_chars, recall_types, recall_prompt_preamble) - Retain controls (retain_every_n_turns, auto_retain, auto_recall, retain_async via aretain_batch, retain_context) - Bank config via Banks API (bank_mission, bank_retain_mission) - Structured JSON retain with per-message timestamps - Full session accumulation with document_id for dedup - Custom post_setup() wizard with curses picker - Mode-aware dep install (hindsight-client for cloud, hindsight-all for local) - local_external mode and openai_compatible LLM provider - OpenRouter support with auto base URL - Auto-upgrade of hindsight-client to >=0.4.22 on session start - Comprehensive debug logging across all operations - 46 unit tests - Updated README and website docs	2026-04-08 23:54:15 -07:00
Teknium	d97f6cec7f	feat(gateway): add BlueBubbles iMessage platform adapter (#6437 ) Adds Apple iMessage as a gateway platform via BlueBubbles macOS server. Architecture: - Webhook-based inbound (event-driven, no polling/dedup needed) - Email/phone → chat GUID resolution for user-friendly addressing - Private API safety (checks helper_connected before tapback/typing) - Inbound attachment downloading (images, audio, documents cached locally) - Markdown stripping for clean iMessage delivery - Smart progress suppression for platforms without message editing Based on PR #5869 by @benjaminsehl (webhook architecture, GUID resolution, Private API safety, progress suppression) with inbound attachment downloading from PR #4588 by @1960697431 (attachment cache routing). Integration points: Platform enum, env config, adapter factory, auth maps, cron delivery, send_message routing, channel directory, platform hints, toolset definition, setup wizard, status display. 27 tests covering config, adapter, webhook parsing, GUID resolution, attachment download routing, toolset consistency, and prompt hints.	2026-04-08 23:54:03 -07:00
Teknium	241bd4fc7e	fix: add size cap to assistant thread metadata cache Prevents unbounded memory growth in _assistant_threads dict. Evicts oldest entries when exceeding _ASSISTANT_THREADS_MAX (5000), matching the pattern used by _mentioned_threads and _seen_messages.	2026-04-08 23:53:50 -07:00
helix4u	30a0fcaec8	fix(slack): handle assistant thread lifecycle events	2026-04-08 23:53:50 -07:00
Teknium	5449c01d26	fix: clean env vars in pairing regression test The test_non_internal_event_without_user_triggers_pairing test relied on no Discord auth env vars being set, but gateway/run.py loads dotenv at module level. In environments with DISCORD_ALLOW_ALL_USERS=True in .env, the auth check passed instead of triggering the pairing flow. Clear DISCORD_ALLOW_ALL_USERS, DISCORD_ALLOWED_USERS, GATEWAY_ALLOW_ALL_USERS, and GATEWAY_ALLOWED_USERS via monkeypatch to ensure test isolation.	2026-04-08 23:01:04 -07:00
xingkongliang	1d8d4f28ae	fix(gateway): prevent background process notifications from triggering false pairing requests When a background process with notify_on_complete=True finishes, the gateway injects a synthetic MessageEvent to notify the session. This event was constructed without user_id, causing _is_user_authorized() to reject it and — for DM-origin sessions — trigger the pairing flow, sending "Hi~ I don't recognize you yet!" with a pairing code to the chat owner. Add an `internal` flag to MessageEvent that bypasses authorization checks for system-generated synthetic events. Only the process watcher sets this flag; no external/adapter code path can produce it. Includes 4 regression tests covering the fix and the normal pairing path.	2026-04-08 23:01:04 -07:00
helix4u	e94008c404	fix(terminal): guard invalid command values	2026-04-08 21:37:51 -07:00
angelos	e7d3e9d767	fix(terminal): persistent sandbox envs survive between turns `_cleanup_task_resources` was unconditionally calling `cleanup_vm()` at the end of every `run_conversation` (i.e. every user turn), tearing down the docker/daytona/modal sandbox container regardless of its `persistent_filesystem` setting. This contradicted the documented intent of `terminal.lifetime_seconds` (idle reaper) and `container_persistent`, and caused per-turn loss of `/workspace`, `~/.config`, agent CLI auth state, and any other content living inside the sandbox. The unconditional teardown was introduced in `fbd3a2fd` ("prevent leakage of morph instances between tasks", 2025-11-04) to plug a Morph backend leak, two days after `lifetime_seconds` shipped in `faecbddd`. It was later refactored into `_cleanup_task_resources` in `70dd3a16` without changing semantics. Code and docs have disagreed since. Fix: introduce `terminal_tool.is_persistent_env(task_id)` and skip the per-turn `cleanup_vm` when the active env is persistent. The idle reaper (`_cleanup_inactive_envs`) still tears persistent envs down once `terminal.lifetime_seconds` is exceeded. Non-persistent backends (Morph) are unchanged — still torn down per turn, preserving the original leak-prevention intent.	2026-04-08 21:31:57 -07:00
Teknium	54db7cbbe1	fix(agent): tiered context pressure warnings + gateway dedup (#6411 ) Combines the approaches from PR #6309 (duan78) and PR #5963 (KUSH42): Tiered warnings (from #5963): - Replaces boolean _context_pressure_warned with float _context_pressure_warned_at - Fires at 85% (orange) and re-fires at 95% (red/critical) - Adds 'compacting context...' status message before compression Gateway dedup (from #6309): - Class-level dict _context_pressure_last_warned survives across AIAgent instances (gateway creates a new instance per message) - 5-minute cooldown per session prevents warning spam - Higher-tier warnings bypass the cooldown (85% → 95% always fires) - Compression reset clears the dedup entry for the session - Stale entries evicted (older than 2x cooldown) to prevent memory leak Does NOT inject into messages — purely user-facing via _safe_print (CLI) and status_callback (gateway). Zero prompt cache impact. Fixes #6309. Fixes #5963.	2026-04-08 21:31:44 -07:00
Hermes Agent	ffeaf6ffae	feat(discord): inherit forum channel topic in thread sessions ORIGINAL INCIDENT: Discord forum descriptions (the topic field on ForumChannel) were invisible to the agent. When a user set project instructions in a forum's description (e.g. tool-evaluations), threads created in that forum had no Channel Topic in their session context. Discovered while evaluating per-forum auto-context injection for web-tap-terminal development threads. ISSUE IN THE CODE: In gateway/platforms/discord.py, all three session entry points (_handle_message, _build_slash_event, _dispatch_thread_session) read chat_topic via getattr(channel, 'topic', None). Discord Thread objects don't carry a topic — only the parent ForumChannel does. So chat_topic was always None for forum threads, and the Channel Topic line was never injected into build_session_context_prompt output. The infrastructure to handle this was already in place — _is_forum_parent() detects forum channels, _format_thread_chat_name() traverses to the parent, and build_session_context_prompt() renders Channel Topic when present. The forum parent was being identified; its topic just wasn't being read. HOW THIS COMMIT FIXES IT: Adds _get_effective_topic(channel, is_thread) helper that reads channel.topic first, then falls back to the parent forum's topic when the channel is a thread inside a forum. All three session entry points now call this helper instead of inlining getattr(channel, 'topic', None). Existing tests pass unchanged. Co-authored-by: dhabibi <9087935+dhabibi@users.noreply.github.com>	2026-04-08 21:29:04 -07:00
Teknium	989d4ea43d	fix: set compression_count on mock to avoid TypeError in test The new degradation warning reads compression_count as an int, but the existing test's MagicMock returns a MagicMock object for that attribute, causing '>=' comparison to fail.	2026-04-08 20:54:23 -07:00
SHL0MS	8567031433	fix: improve context compression quality — named constants, tool tracking, degradation warning Three targeted improvements to the compression system: 1. Replace hardcoded truncation limits with named class constants (_CONTENT_MAX=6000, _CONTENT_HEAD=4000, _CONTENT_TAIL=1500, _TOOL_ARGS_MAX=1500, _TOOL_ARGS_HEAD=1200). Previous limits (3000/500) heavily truncated the summarizer's input — a 200-line edit got cut to 3000 chars before the summarizer ever saw it. 2. Add '## Tools & Patterns' section to both compression prompt templates (first-pass and iterative). Preserves working tool invocations, preferred flags, and tool-specific discoveries across compaction boundaries. 3. Warn users on 2nd+ compression: 'Session compressed N times — accuracy may degrade. Consider /new to start fresh.' Ref #499	2026-04-08 20:54:23 -07:00
Teknium	af4abd2f22	fix: correct unbound exception variable and remaining-time math in warning - Bind exception in warning send handler (was using stale _ne from outer scope) - Calculate remaining time until timeout correctly: (timeout - warning) // 60 instead of warning // 60 (which equals elapsed time, not remaining)	2026-04-08 20:01:06 -07:00
Helmi	092061711e	fix(gateway): add staged inactivity warning before timeout escalation Introduce gateway_timeout_warning (default 900s) as a pre-timeout alert layer. When inactivity reaches the warning threshold, a single notification is sent to the user offering to wait or reset. If inactivity continues to the gateway_timeout (default 1800s), the full timeout fires as before. This gives users a chance to intervene before work is lost on slow API providers without disabling the safety timeout entirely. Config: agent.gateway_timeout_warning in config.yaml, or HERMES_AGENT_TIMEOUT_WARNING env var (0 = disable warning).	2026-04-08 20:01:06 -07:00
Teknium	980fadfea9	fix(models): preserve OpenRouter variant tags (:free, :extended, :fast) during model switch (#6383 ) Step c in switch_model() blindly converted the first colon to a slash for aggregator providers, even when the model name already contained a slash (vendor/model format). This mangled variant tags like :free into /free, causing 400 Bad Request from the API. Fix: skip the colon→slash conversion when the model already has a slash, since the colon is a variant tag, not a vendor separator. The module docstring already documented this intent (line 17-18) but the implementation didn't enforce it. Reported via Discord. Related to PR #6088 (which identified the same bug but placed the fix in model_normalize.py instead of model_switch.py where the actual mangling occurs).	2026-04-08 19:58:16 -07:00
Teknium	ae4a884e8d	fix(agent): disable stale stream timeout for local providers (#6368 ) Local inference providers (Ollama, oMLX, llama-cpp) can take 300+ seconds for prefill on large contexts. The 180s stale stream detector was killing these connections while the provider was still processing. Uses the existing is_local_endpoint() (proper URL parsing with RFC-1918, localhost, WSL detection) instead of ad-hoc substring matching. The stale timeout is only disabled when the user hasn't explicitly set HERMES_STREAM_STALE_TIMEOUT — explicit user config is always honored. Fixes #5889	2026-04-08 19:53:39 -07:00
Teknium	6e3f7f3610	docs: add tool_progress_overrides to configuration reference (#6364 ) Documents the per-platform tool_progress_overrides config key added in PR #6348. Shows example YAML with Signal set to 'off' while Telegram stays on 'verbose'. Lists all valid platform keys.	2026-04-08 19:04:21 -07:00
konsisumer	42e366f27b	fix(agent): respect config timeout for flush_memories instead of hardcoded 30s The _call_llm() and direct OpenAI fallback paths in flush_memories() both hardcoded timeout=30.0, ignoring the user-configurable value at auxiliary.flush_memories.timeout in config.yaml. Remove the explicit timeout from the auxiliary _call_llm() call so that _get_task_timeout('flush_memories') reads from config. For the direct OpenAI fallback, import and use _get_task_timeout() instead of the hardcoded value. Add two regression tests verifying both code paths respect the config. Fixes #6154	2026-04-08 18:55:33 -07:00
Teknium	3baafea380	fix(tools): skip camofox auto-cleanup when managed persistence is enabled (#6233 ) When managed_persistence is enabled, cleanup_browser() was calling camofox_close() which destroys the server-side browser context via DELETE /sessions/{userId}, killing login sessions across cron runs. Add camofox_soft_cleanup() — a public wrapper that drops only the in-memory session entry when managed persistence is on, returning True. When persistence is off it returns False so the caller falls back to the full camofox_close(). The inactivity reaper still handles idle resource cleanup. Also surface a logger.warning() when _managed_persistence_enabled() fails to load config, replacing a silent except-and-return-False. Salvaged from #6182 by el-analista (Eduardo Perea Fernandez). Added public API wrapper to avoid cross-module private imports, and test coverage for both persistence paths. Co-authored-by: Eduardo Perea Fernandez <el-analista@users.noreply.github.com>	2026-04-08 18:07:18 -07:00
Teknium	e26393ffc2	fix: Signal duplicate replies with streaming + per-platform tool_progress (#6348 ) Fixes #4647 — Signal replies duplicated when gateway streaming is enabled. Root cause: stream_consumer.py did not handle the case where send() returns success=True but no message_id (Signal behavior). Every stream delta produced a separate send() call (7+ messages instead of 2), plus the gateway sent another full duplicate since already_sent was never set. Changes: - stream_consumer.py: Add elif branch for success-without-message_id — enters fallback mode (sets already_sent, disables editing, sends only continuation) - signal.py send(): Extract timestamp from signal-cli RPC result as message_id so stream consumer follows normal edit→fallback path - signal.py: Add public stop_typing() delegating to _stop_typing_indicator() so base adapter's _keep_typing finally block can clean up typing tasks - gateway/run.py: Per-platform tool_progress_overrides (#6164) — lets users set e.g. signal: off while keeping telegram: all - hermes_cli/config.py: Add tool_progress_overrides to DEFAULT_CONFIG Refs: #4647, #6164	2026-04-08 17:39:45 -07:00
Teknium	e19252afc4	fix: update tests for unified spawn-per-call execution model - Docker env tests: verify _build_init_env_args() instead of per-execute Popen flags (env forwarding is now init-time only) - Docker: preserve explicit forward_env bypass of blocklist from main - Daytona tests: adapt to SDK-native timeout, _ThreadedProcessHandle, base.py interrupt handling, HERMES_STDIN_ heredoc prefix - Modal tests: fix _load_module to include _ThreadedProcessHandle stub, check ensurepip in _resolve_modal_image instead of __init__ - SSH tests: mock time.sleep on base module instead of removed ssh import - Add missing BaseEnvironment attributes to __new__()-based test fixtures	2026-04-08 17:23:15 -07:00
alt-glitch	d684d7ee7e	feat(environments): unified spawn-per-call execution layer Replace dual execution model (PersistentShellMixin + per-backend oneshot) with spawn-per-call + session snapshot for all backends except ManagedModal. Core changes: - Every command spawns a fresh bash process; session snapshot (env vars, functions, aliases) captured at init and re-sourced before each command - CWD persists via file-based read (local) or in-band stdout markers (remote) - ProcessHandle protocol + _ThreadedProcessHandle adapter for SDK backends - cancel_fn wired for Modal (sandbox.terminate) and Daytona (sandbox.stop) - Shared utilities extracted: _pipe_stdin, _popen_bash, _load_json_store, _save_json_store, _file_mtime_key, _SYNC_INTERVAL_SECONDS - Rate-limited file sync unified in base _before_execute() with _sync_files() hook - execute_oneshot() removed; all 11 call sites in code_execution_tool.py migrated to execute() - Daytona timeout wrapper replaced with SDK-native timeout parameter - persistent_shell.py deleted (291 lines) Backend-specific: - Local: process-group kill via os.killpg, file-based CWD read - Docker: -e env flags only on init_session, not per-command - SSH: shlex.quote transport, ControlMaster connection reuse - Singularity: apptainer exec with instance://, no forced --pwd - Modal: _AsyncWorker + _ThreadedProcessHandle, cancel_fn -> sandbox.terminate - Daytona: SDK-level timeout (not shell wrapper), cancel_fn -> sandbox.stop - ManagedModal: unchanged (gateway owns execution); docstring added explaining why	2026-04-08 17:23:15 -07:00
Teknium	7d26feb9a3	feat(discord): add DISCORD_REPLY_TO_MODE setting (#6333 ) Add configurable reply-reference behavior for Discord, matching the existing Telegram (TELEGRAM_REPLY_TO_MODE) and Mattermost (MATTERMOST_REPLY_MODE) implementations. Modes: - 'off': never reply-reference the original message - 'first': reply-reference on first chunk only (default, current behavior) - 'all': reply-reference on every chunk Set DISCORD_REPLY_TO_MODE=off in .env to disable reply-to messages. Changes: - gateway/config.py: parse DISCORD_REPLY_TO_MODE env var - gateway/platforms/discord.py: read reply_to_mode from config, respect it in send() — skip fetch_message entirely when 'off' - hermes_cli/config.py: add to OPTIONAL_ENV_VARS for hermes setup - 23 tests covering config, send behavior, env var override - docs: discord.md env var table + environment-variables.md reference Closes community request from Stuart on Discord.	2026-04-08 17:08:40 -07:00
kshitijk4poor	875a72e4c8	fix: normalize httpx.URL base_url + strip thinking signatures for third-party endpoints Two linked fixes for MiniMax Anthropic-compatible fallback: 1. Normalize httpx.URL to str before calling .rstrip() in auth/provider detection helpers. Some client objects expose base_url as httpx.URL, not str — crashed with AttributeError in _requires_bearer_auth() and _is_third_party_anthropic_endpoint(). Also fixes _try_activate_fallback() to use the already-stringified fb_base_url instead of raw httpx.URL. 2. Strip Anthropic-proprietary thinking block signatures when targeting third-party Anthropic-compatible endpoints (MiniMax, Azure AI Foundry, self-hosted proxies). These endpoints cannot validate Anthropic's signatures and reject them with HTTP 400 'Invalid signature in thinking block'. Now threads base_url through convert_messages_to_anthropic() → build_anthropic_kwargs() so signature management is endpoint-aware. Based on PR #4945 by kshitijk4poor (rstrip fix). Fixes #4944.	2026-04-08 16:39:29 -07:00
Teknium	20a5e589c6	docs: clarify that provider "main" is for auxiliary tasks only (#6291 ) Users were setting model.provider to "main" after reading the auxiliary provider docs, causing "Unknown provider" errors. The "main" alias is only valid inside auxiliary:, compression:, and fallback_model: configs where it means "use the same provider as my main agent chat." Added warning admonitions and inline clarifications to: - configuration.md: Auxiliary Models provider list and Provider Options table - fallback-providers.md: Provider Options for Auxiliary Tasks table Reported by community member cn on Discord.	2026-04-08 16:39:17 -07:00
Teknium	7156f8d866	fix: CI test failures — metadata key, cli console, docker env, vision order (#6294 ) Fixes 9 test failures on current main, incorporating ideas from PR stack #6219-#6222 by xinbenlv with corrections: - model_metadata: sync HF context length key casing (minimaxai/minimax-m2.5 → MiniMaxAI/MiniMax-M2.5) - cli.py: route quick command error output through self.console instead of creating a new ChatConsole() instance - docker.py: explicit docker_forward_env entries now bypass the Hermes secret blocklist (intentional opt-in wins over generic filter) - auxiliary_client: revert _read_main_provider() to simple provider.strip().lower() — the _normalize_aux_provider() call introduced in `5c03f2e7` stripped the custom: prefix, breaking named custom provider resolution - auxiliary_client: flip vision auto-detection order to active provider → OpenRouter → Nous → stop (was OR → Nous → active) - test: update vision priority test to match new order Based on PR #6219-#6222 by xinbenlv.	2026-04-08 16:37:05 -07:00
Siddharth Balyan	8de91ce9d2	fix(nix): make addToSystemPackages fully functional for interactive CLI (#6317 ) * fix(nix): export HERMES_HOME system-wide when addToSystemPackages is true The `addToSystemPackages` option's documentation (and the `:::tip` block in `website/docs/getting-started/nix-setup.md`) promises that enabling it both puts the `hermes` CLI on PATH and sets `HERMES_HOME` system-wide so interactive shells share state with the gateway service. The module only did the former, so running `hermes` in a user shell silently created a separate `~/.hermes/` directory instead of the managed `${stateDir}/.hermes`. Implement the documented behavior by also setting `environment.variables.HERMES_HOME = "${cfg.stateDir}/.hermes"` in the same mkIf block, and update the option description to match. Fixes #6044 * fix(nix): preserve group-readable permissions in managed mode The NixOS module sets HERMES_HOME directories to 0750 and files to 0640 so interactive users in the hermes group can share state with the gateway service. Two issues prevented this from working: 1. hermes_cli/config.py: _secure_dir() unconditionally chmod'd HERMES_HOME to 0700 on every startup, overwriting the NixOS module's 0750. Similarly, _secure_file() forced 0600 on config files. Both now skip in managed mode (detected via .managed marker or HERMES_MANAGED env var). 2. nix/nixosModules.nix: the .env file was created with 0600 (owner-only), while config.yaml was already 0640 (group-readable). Changed to 0640 for consistency — users granted hermes group membership should be able to read the managed .env. Verified with a NixOS VM integration test: a normal user in the hermes group can now run `hermes version` and `hermes config` against the managed HERMES_HOME without PermissionError. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: zerone0x <zerone0x@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 04:09:53 +05:30
yyovil	8385f54e98	fix(nix): preserve voice deps on aarch64-darwin via nixpkgs (#5079 ) * Fixes the nix profile installation for hermes agent (cherry picked from commit c822a082a8c0ce33f3d406e6b2ae1b2833071df0) * Update nix/python.nix Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Applied gating for aarch64-darwin platform Entire-Checkpoint: 1ab2074bd4f1 --------- Co-authored-by: yyovil <tanishq231003@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-04-09 03:39:39 +05:30
Teknium	105caa001b	chore: regenerate uv.lock against current main	2026-04-08 13:47:08 -07:00
jjovalle99	d46db0a1b4	fix(tools): use correct import path for mistralai SDK mistralai v2.x is a namespace package — `Mistral` class lives at `mistralai.client`, not at the top-level `mistralai` module. The previous `from mistralai import Mistral` raises ImportError at runtime. Update both production code and test fixture to use the correct path.	2026-04-08 13:47:08 -07:00
jjovalle99	5f4b93c20f	feat(tools): add Voxtral Transcribe STT provider (Mistral AI)	2026-04-08 13:47:08 -07:00
Teknium	5d2fc6d928	fix: cleanup Qwen OAuth provider gaps - Add HERMES_QWEN_BASE_URL to OPTIONAL_ENV_VARS in config.py (was missing despite being referenced in code) - Remove redundant qwen-oauth entry from _API_KEY_PROVIDER_AUX_MODELS (non-aggregator providers use their main model for aux tasks automatically)	2026-04-08 13:46:30 -07:00
kshitijk4poor	3377017eb4	feat(qwen): add Qwen OAuth provider with portal request support Based on #6079 by @tunamitom with critical fixes and comprehensive tests. Changes from #6079: - Fix: sanitization overwrite bug — Qwen message prep now runs AFTER codex field sanitization, not before (was silently discarding Qwen transforms) - Fix: missing try/except AuthError in runtime_provider.py — stale Qwen credentials now fall through to next provider on auto-detect - Fix: 'qwen' alias conflict — bare 'qwen' stays mapped to 'alibaba' (DashScope); use 'qwen-portal' or 'qwen-cli' for the OAuth provider - Fix: hardcoded ['coder-model'] replaced with live API fetch + curated fallback list (qwen3-coder-plus, qwen3-coder) - Fix: extract _is_qwen_portal() helper + _qwen_portal_headers() to replace 5 inline 'portal.qwen.ai' string checks and share headers between init and credential swap - Fix: add Qwen branch to _apply_client_headers_for_base_url for mid-session credential swaps - Fix: remove suspicious TypeError catch blocks around _prompt_provider_choice - Fix: handle bare string items in content lists (were silently dropped) - Fix: remove redundant dict() copies after deepcopy in message prep - Revert: unrelated ai-gateway test mock removal and model_switch.py comment deletion New tests (30 test functions): - _qwen_cli_auth_path, _read_qwen_cli_tokens (success + 3 error paths) - _save_qwen_cli_tokens (roundtrip, parent creation, permissions) - _qwen_access_token_is_expiring (5 edge cases: fresh, expired, within skew, None, non-numeric) - _refresh_qwen_cli_tokens (success, preserve old refresh, 4 error paths, default expires_in, disk persistence) - resolve_qwen_runtime_credentials (fresh, auto-refresh, force-refresh, missing token, env override) - get_qwen_auth_status (logged in, not logged in) - Runtime provider resolution (direct, pool entry, alias) - _build_api_kwargs (metadata, vl_high_resolution_images, message formatting, max_tokens suppression)	2026-04-08 13:46:30 -07:00
Teknium	a1213d06bd	fix(hindsight): correct config key mismatch and add base URL support (#6282 ) Fixes #6259. Three bugs fixed: 1. Config key mismatch: _get_client() and _start_daemon() read 'llmApiKey' (camelCase) but save_config() stores 'llm_api_key' (snake_case). The config value was never read — only the env var fallback worked. 2. Missing base URL support: users on OpenRouter or custom endpoints had no way to configure HINDSIGHT_API_LLM_BASE_URL through setup. Added llm_base_url to config schema with empty default, passed conditionally to HindsightEmbedded constructor. 3. Daemon config change detection: config_changed now also checks HINDSIGHT_API_LLM_BASE_URL, and the daemon profile .env includes the base URL when set. Keeps HINDSIGHT_API_LLM_API_KEY (with double API) in the daemon profile .env — this matches the upstream hindsight .env.example convention.	2026-04-08 13:46:14 -07:00
Teknium	1631895d5a	docs(telegram): add proxy support section Documents the proxy env var support added in PR #3591 (salvage of #3411 by @kufufu9). Covers HTTPS_PROXY/HTTP_PROXY/ALL_PROXY precedence, configuration methods, and scope.	2026-04-08 13:45:14 -07:00
Teknium	4f467700d4	fix(doctor): only check the active memory provider, not all providers unconditionally (#6285 ) * fix(tools): skip camofox auto-cleanup when managed persistence is enabled When managed_persistence is enabled, cleanup_browser() was calling camofox_close() which destroys the server-side browser context via DELETE /sessions/{userId}, killing login sessions across cron runs. Add camofox_soft_cleanup() — a public wrapper that drops only the in-memory session entry when managed persistence is on, returning True. When persistence is off it returns False so the caller falls back to the full camofox_close(). The inactivity reaper still handles idle resource cleanup. Also surface a logger.warning() when _managed_persistence_enabled() fails to load config, replacing a silent except-and-return-False. Salvaged from #6182 by el-analista (Eduardo Perea Fernandez). Added public API wrapper to avoid cross-module private imports, and test coverage for both persistence paths. Co-authored-by: Eduardo Perea Fernandez <el-analista@users.noreply.github.com> * fix(doctor): only check the active memory provider, not all providers unconditionally hermes doctor had hardcoded Honcho Memory and Mem0 Memory sections that always ran regardless of the user's memory.provider config setting. After the swappable memory provider update (#4623), users with leftover Honcho config but no active provider saw false 'broken' errors. Replaced both sections with a single Memory Provider section that reads memory.provider from config.yaml and only checks the configured provider. Users with no external provider see a green 'Built-in memory active' check. Reported by community user michaelruiz001, confirmed by Eri (Honcho). --------- Co-authored-by: Eduardo Perea Fernandez <el-analista@users.noreply.github.com>	2026-04-08 13:44:58 -07:00

1 2 3 4 5 ...

3596 commits