hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Author	SHA1	Message	Date
Teknium	69a0092c38	fix: deduplicate _is_termux() into hermes_constants.is_termux() Replace 6 identical copies of the Termux detection function across cli.py, browser_tool.py, voice_mode.py, status.py, doctor.py, and gateway.py with a single shared implementation in hermes_constants.py. Each call site imports with its original local name to preserve all existing callers (internal references and test monkeypatches).	2026-04-09 16:24:53 -07:00
adybag14-cyber	c3141429b7	fix(termux): tighten voice setup and mobile chat UX	2026-04-09 16:24:53 -07:00
adybag14-cyber	769ec1ee1a	fix(termux): deepen browser, voice, and tui support	2026-04-09 16:24:53 -07:00
adybag14-cyber	3237733ca5	fix(termux): harden execute_code and mobile browser/audio UX	2026-04-09 16:24:53 -07:00
adybag14-cyber	54d5138a54	fix(termux): harden env-backed background jobs	2026-04-09 16:24:53 -07:00
adybag14-cyber	6dcb3c4774	fix(termux): compact narrow-screen tui chrome	2026-04-09 16:24:53 -07:00
adybag14-cyber	096b3f9f12	fix(termux): add local image chat route	2026-04-09 16:24:53 -07:00
adybag14-cyber	a3aed1bd26	fix(termux): keep quiet chat output parseable	2026-04-09 16:24:53 -07:00
adybag14-cyber	4970705ed3	fix(termux): silence quiet chat tool previews	2026-04-09 16:24:53 -07:00
adybag14-cyber	2194425918	fix(termux): make setup-hermes use android path	2026-04-09 16:24:53 -07:00
adybag14-cyber	3878495972	fix(termux): disable gateway service flows on android	2026-04-09 16:24:53 -07:00
adybag14-cyber	4e40e93b98	fix(termux): improve status and install UX	2026-04-09 16:24:53 -07:00
adybag14-cyber	122925a6f2	fix(termux): honor temp dirs for local temp artifacts	2026-04-09 16:24:53 -07:00
adybag14-cyber	e79cc88985	feat: add tested Termux install path and EOF-aware gh auth	2026-04-09 16:24:53 -07:00
sprmn24	e053433c84	fix(error_classifier): disambiguate usage-limit patterns in _classify_by_message _classify_by_message had no handling for _USAGE_LIMIT_PATTERNS, so messages like 'usage limit exceeded, try again in 5 minutes' arriving without an HTTP status code fell through to FailoverReason.unknown instead of rate_limit. Apply the same billing/rate-limit disambiguation that _classify_402 already uses: USAGE_LIMIT_PATTERNS + transient signal → rate_limit, USAGE_LIMIT_PATTERNS alone → billing. Add 4 tests covering the no-status-code usage-limit path.	2026-04-09 16:24:13 -07:00
Siddharth Balyan	1789c2699a	feat(nix): shared-state permission model for interactive CLI users (#6796) * feat(nix): shared-state permission model for interactive CLI users Enable interactive CLI users in the hermes group to share full read-write state (sessions, memories, logs, cron) with the gateway service via a setgid + group-writable permission model. Changes: nix/nixosModules.nix: - Directories use setgid 2770 (was 0750) so new files inherit the hermes group. home/ stays 0750 (no interactive write needed). - Activation script creates HERMES_HOME subdirs (cron, sessions, logs, memories) — previously Python created them but managed mode now skips mkdir. - Activation migrates existing runtime files to group-writable (chmod g+rw). Nix-managed files (config.yaml, .env, .managed) stay 0640/0644. - Gateway systemd unit gets UMask=0007 so files it creates are 0660. hermes_cli/config.py: - ensure_hermes_home() splits into managed/unmanaged paths. Managed mode verifies dirs exist (raises RuntimeError if not) instead of creating them. Scoped umask(0o007) ensures SOUL.md is created as 0660. hermes_logging.py: - _ManagedRotatingFileHandler subclass applies chmod 0660 after log rotation in managed mode. RotatingFileHandler.doRollover() creates new files via open() which uses the process umask (0022 → 0644), not the scoped umask from ensure_hermes_home(). Verified with a 13-subtest NixOS VM integration test covering setgid, interactive writes, file ownership, migration, and gateway coexistence. Refs: #6044 * Fix managed log file mode on initial open Co-authored-by: Siddharth Balyan <alt-glitch@users.noreply.github.com> * refactor: simplify managed file handler and merge activation loops - Cache is_managed() result in handler __init__ instead of lazy-importing on every _open()/_chmod_if_managed() call. Avoids repeated stat+env checks on log rotation.
- Merge two for-loops over the same subdir list in activation script into a single loop (mkdir + chown + chmod + find in one pass). --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Siddharth Balyan <alt-glitch@users.noreply.github.com>	2026-04-10 03:48:42 +05:30
dangelo352	aed9b90ae3	fix(stream_consumer): handle overflow when no message exists yet The overflow split loop required _message_id to be set, but on the first streamed message (or after a segment break) _message_id is None. Oversized text fell through to _send_or_edit → adapter.send(), which split internally — but subsequent edits hit Telegram's 'message too long' and were silently truncated with '…', cutting off the response. Add a new code path for the _message_id is None case that uses truncate_message() (same as the non-streaming path) to split with proper word/code-fence boundaries and chunk indicators. Each chunk is sent as a new message via _send_new_chunk(). Properly handles got_done (returns immediately after sending chunks instead of continuing into an infinite loop) and got_segment_break. Original cherry-picked from PR #6816 by dangelo352. Fixes silent message truncation on Telegram for long streamed responses.	2026-04-09 15:07:21 -07:00
Teknium	6b437f7934	fix: /browser connect auto-launch uses dedicated profile dir (#6821) Chrome auto-launch now passes --user-data-dir, --no-first-run, and --no-default-browser-check so the debug instance doesn't conflict with an already-running Chrome using the default profile. The profile dir lives at {hermes_home}/chrome-debug/. Also updates the fallback manual instructions to include the same flags and removes the stale 'close existing Chrome windows' hint.	2026-04-09 14:55:45 -07:00
Teknium	f91fffbe33	Revert "fix: /browser connect auto-launch uses dedicated profile dir" This reverts commit `c3854e0f85`.	2026-04-09 14:54:37 -07:00
Teknium	49d8c9557f	fix: cleanup_all_camofox_sessions respects managed persistence (#6820) When managed_persistence is enabled, cleanup_all now only clears local tracking state without sending DELETE requests to the Camofox server. This prevents persistent browser profiles (cookies, logins, localStorage) from being destroyed during process-wide cleanup. Ephemeral sessions still get full server-side deletion as before.	2026-04-09 14:54:07 -07:00
Teknium	c3854e0f85	fix: /browser connect auto-launch uses dedicated profile dir Chrome auto-launch now passes --user-data-dir, --no-first-run, and --no-default-browser-check so the debug instance doesn't conflict with an already-running Chrome using the default profile. The profile dir lives at {hermes_home}/chrome-debug/. Also updates the fallback manual instructions to include the same flags and removes the stale 'close existing Chrome windows' hint.	2026-04-09 14:52:58 -07:00
Teknium	97308707e9	fix: insert static fallback when compression summary fails When _generate_summary() failed (no provider, timeout, model error), the compressor silently dropped all middle turns with just a debug log. The agent would then see head + tail with no explanation of the gap, causing total context amnesia (generic greetings instead of continuing the conversation). Now generates a static fallback marker that tells the model context was lost and to continue from the recent tail messages. The fallback flows through the same role-alternation logic as a real summary so message structure stays valid.	2026-04-09 14:28:56 -07:00
Teknium	e9168f917e	fix: handle HTTP errors gracefully in gws_bridge token refresh Instead of crashing with a raw urllib traceback on refresh failure, print a clean error message and suggest re-running setup.py.	2026-04-09 14:28:35 -07:00
Teknium	c8bbd29aae	fix: update tests for gws migration - Rewrite test_google_workspace_api.py: test bridge token handling and calendar date range instead of removed get_credentials() - Update test_google_oauth_setup.py: partial scopes now accepted with warning instead of rejected with SystemExit	2026-04-09 14:28:35 -07:00
Teknium	73eb59db8d	fix: follow-up fixes for google-workspace gws migration - Fix npm package name: @anthropic -> @googleworkspace/cli - Add Homebrew install option - Fix calendar_list to respect --start/--end args (uses raw Calendar API for date ranges, +agenda helper for default 7-day view) - Improve check_auth partial scope output (list missing scopes) - Add output format documentation with key JSON shapes - Use npm install in troubleshooting (no Rust toolchain needed) Follow-up to cherry-picked PR #6713	2026-04-09 14:28:35 -07:00
spideystreet	127b4caf0d	feat(skills): migrate google-workspace to gws CLI backend Migrate the google-workspace skill from custom Python API wrappers (google-api-python-client) to Google's official Rust CLI gws (googleworkspace/cli). Add gws_bridge.py for headless-compatible token refresh. Fix partial OAuth scope handling. Co-authored-by: spideystreet <dhicham.pro@gmail.com> Cherry-picked from PR #6713	2026-04-09 14:28:35 -07:00
Teknium	1780ad24b1	fix: normalize remaining reasoning effort orderings and add missing 'minimal' Follow-up to cherry-picked PR #6698. Fixes spots the original PR missed: - hermes_constants.py: VALID_REASONING_EFFORTS tuple ordering - gateway/run.py: _load_reasoning_config docstring + validation tuple - configuration.md and batch-processing.md: docs ordering - hermes-agent skill: /reasoning usage hint was missing 'minimal'	2026-04-09 14:20:16 -07:00
Greer Guthrie	775a46ce75	fix: normalize reasoning effort ordering in UI	2026-04-09 14:20:16 -07:00
Teknium	6f8e426275	fix: add SOCKS proxy support, DISCORD_PROXY env var, and send_message proxy coverage Follow-up improvements on top of the shared resolver from PR #6562: - Add platform_env_var parameter to resolve_proxy_url() so DISCORD_PROXY takes priority over generic HTTPS_PROXY/ALL_PROXY env vars - Add SOCKS proxy support via aiohttp_socks.ProxyConnector with rdns=True (critical for GFW/Shadowrocket/Clash users — issue #6649) - proxy_kwargs_for_bot() returns connector= for SOCKS, proxy= for HTTP - proxy_kwargs_for_aiohttp() returns split (session_kw, request_kw) for standalone aiohttp sessions - Add proxy support to send_message_tool.py (Discord REST, Slack, SMS) for cron job delivery behind proxies (from PR #2208) - Add proxy support to Discord image/document downloads - Fix duplicate import sys in base.py	2026-04-09 14:19:06 -07:00
Zheng Li	88dbbfe982	feat(gateway): unified proxy support for Discord and Telegram with macOS auto-detection - Add resolve_proxy_url() to base.py — shared by all platform adapters - Check HTTPS_PROXY / HTTP_PROXY / ALL_PROXY env vars first - Fall back to macOS system proxy via scutil --proxy (zero-config) - Pass proxy= to discord.py commands.Bot() for gateway connectivity - Refactor telegram_network.py to use shared resolver - Update test fixtures to accept proxy kwarg	2026-04-09 14:19:06 -07:00
jarvisxyz	88845b99d2	fix(slack): add rate-limit retry and TTL cache to thread context fetching - Add _ThreadContextCache dataclass for caching fetched context (60s TTL) - Add exponential backoff retry for conversations.replies 429 rate limits (Tier 3, ~50 req/min) - Only fetch context when no active session exists (guard at call site) to prevent duplication across turns - Hoist bot_uid lookup outside the per-message loop - Clearer header text for injected thread context Based on PR #6162 by jarvisxyz, cherry-picked onto current main.	2026-04-09 14:07:32 -07:00
gunpowder-client-vm	18d8e91a5a	fix(slack): treat group DMs (mpim) like DMs + smart reaction guard - Treat mpim (multi-party IM / group DM) channels as DMs — no @mention required, continuous session like 1:1 DMs - Only add 👀/✅ reactions when bot is directly addressed (DM or @mention). In listen-all channels (require_mention=false) reacting to every message would be noisy. Based on PR #4633 by gunpowder-client-vm, adapted to current main.	2026-04-09 14:07:32 -07:00
Mibayy	1773e3d647	feat(slack): add allow_bots config for bot-to-bot communication Three modes: "none" (default, backward-compatible), "mentions" (accept bot messages only when they @mention us), "all" (accept all bot messages except our own, to prevent echo loops). Configurable via: slack: allow_bots: mentions Or env var: SLACK_ALLOW_BOTS=mentions Self-message guard always active regardless of mode. Based on PR #3200 by Mibayy, adapted to current main with config.yaml bridging support.	2026-04-09 14:07:32 -07:00
dashed	7f7b02b764	fix(slack): comprehensive mrkdwn formatting — 6 bug fixes + 52 tests Fixes blockquote > escaping, edit_message raw markdown, *bold italic* handling, HTML entity double-escaping (&amp;), Wikipedia URL parens truncation, and step numbering format. Also adds format_message to the tool-layer _send_to_platform for consistent formatting across all delivery paths. Changes: - Protect Slack entities (<@user>, <https://...|label>, <!here>) from escaping passes - Protect blockquote > markers before HTML entity escaping - Unescape-before-escape for idempotent HTML entity handling - *bold italic* → _text_ conversion (before bold pass) - URL regex upgraded to handle balanced parentheses - mrkdwn:True flag on chat_postMessage payloads - format_message applied in edit_message and send_message_tool - 52 new tests (format, edit, streaming, splitting, tool chunking) - Use reversed(dict) idiom for placeholder restoration Based on PR #3715 by dashed, cherry-picked onto current main.	2026-04-09 14:07:32 -07:00
Doruk Ardahan	7d499c75db	feat(slack): add require_mention and free_response_channels config support Port the mention gating pattern from Telegram, Discord, WhatsApp, and Matrix adapters to the Slack platform adapter. - Add _slack_require_mention() with explicit-false parsing and env var fallback (SLACK_REQUIRE_MENTION) - Add _slack_free_response_channels() with env var fallback (SLACK_FREE_RESPONSE_CHANNELS) - Replace hardcoded mention check with configurable gating logic - Bridge slack config.yaml settings to env vars - Bridge free_response_channels through the generic platform bridging loop - Add 26 tests covering config parsing, env fallback, gating logic Config usage: slack: require_mention: false free_response_channels: - "C0AQWDLHY9M" Default behavior unchanged: channels require @mention (backward compatible). Based on PR #5885 by dorukardahan, cherry-picked and adapted to current main.	2026-04-09 14:07:32 -07:00
Teknium	997e219c14	fix(security): enforce user authorization on approval button clicks Approval button clicks (Block Kit actions in Slack, CallbackQuery in Telegram) bypass the normal message authorization flow in gateway/run.py. Any workspace/group member who can see the approval message could click Approve to authorize dangerous commands. Read SLACK_ALLOWED_USERS / TELEGRAM_ALLOWED_USERS env vars directly in the approval handlers. When an allowlist is configured and the clicking user is not in it, the click is silently ignored (Slack) or answered with an error (Telegram). Wildcard '*' permits all users. When no allowlist is configured, behavior is unchanged (open access). Based on the idea from PR #6735 by maymuneth, reimplemented to use the existing env-var-based authorization system rather than a nonexistent _allowed_user_ids adapter attribute.	2026-04-09 14:07:32 -07:00
aaronagent	ab7b407224	fix: atomic Slack approval guard, safe JSON deserialization fallbacks 1. gateway/platforms/slack.py: Replace check-then-set TOCTOU race on _approval_resolved with atomic dict.pop(). Two concurrent button clicks could both pass the guard before either set it to True, causing double resolve_gateway_approval — which can resolve the WRONG queued approval when multiple are pending for the same session. 2. hermes_state.py: Add WARNING log and proper fallbacks when json.loads fails on tool_calls (→ []), reasoning_details (→ None), and codex_reasoning_items (→ None). Previously, failures were silently swallowed: tool_calls stayed as a raw string (iterating yields characters, not objects), and reasoning fields were simply missing from the dict. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:07:32 -07:00
Teknium	c6974fd108	fix: allow custom endpoint users to use main model for auxiliary tasks Step 1 of _resolve_auto() explicitly excluded 'custom' providers, forcing custom endpoint users through the fragile fallback chain instead of using their known-working main model credentials. This caused silent compression failures for users on local OpenAI- compatible endpoints — the summary generation would fail, middle turns would be silently dropped, and the agent would lose all conversation context. Remove 'custom' from the exclusion list so custom endpoint users get the same main-model-first treatment as DeepSeek, Anthropic, Gemini, and other direct providers.	2026-04-09 13:23:56 -07:00
Dylan Socolobsky	c6dba918b3	fix(tests): fix several failing/flaky tests on main (#6777) * fix(tests): mock is_safe_url in tests that use example.com Tests using example.com URLs were failing because is_safe_url does a real DNS lookup which fails in environments where example.com doesn't resolve, causing the request to be blocked before reaching the already-mocked HTTP client. This should fix around 17 failing tests. These tests test logic, caching, etc. so mocking this method should not modify them in any way. TestMattermostSendUrlAsFile was already doing this so we follow the same pattern. * fix(test): use case-insensitive lookup for model context length check DEFAULT_CONTEXT_LENGTHS uses inconsistent casing (MiniMax keys are lowercase, Qwen keys are mixed-case) so the test was broken in some cases since it couldn't find the model. * fix(test): patch is_linux in systemd gateway restart test The test only patched is_macos to False but didn't patch is_linux to True. On macOS hosts, is_linux() returns False and the systemd restart code path is skipped entirely, making the assertion fail. * fix(test): use non-blocklisted env var in docker forward_env tests GITHUB_TOKEN is in api_key_env_vars and thus in _HERMES_PROVIDER_ENV_BLOCKLIST so the env var is silently dropped, we replace it with a non-blocked one like DATABASE_URL so the tests actually work. * fix(test): fully isolate _has_any_provider_configured from host env _has_any_provider_configured() checks all env vars from PROVIDER_REGISTRY (not just the 5 the tests were clearing) and also calls get_auth_status() which detects gh auth token for Copilot. On machines with any of these set, the function returns True before reaching the code path under test. Clear all registry vars and mock get_auth_status so host credentials don't interfere. * fix(test): correct path to hermes_base_env.py in tool parser tests Path(__file__).parent.parent resolved to tests/, not the project root.
The file lives at environments/hermes_base_env.py so we need one more parent level. * fix(test): accept optional HTML fields in Matrix send payload _send_matrix sometimes adds format and formatted_body when the markdown library is installed. The test was doing an exact dict equality check which broke. Check required fields instead. * fix(test): add config.yaml to codex vision requirements test The test only wrote auth.json but not config.yaml, so _read_main_provider() returned empty and vision auto-detect never tried the codex provider. Add a config.yaml pointing at openai-codex so the fallback path actually resolves the client. * fix(test): clear OPENROUTER_API_KEY in _isolate_hermes_home run_agent.py calls load_hermes_dotenv() at import time, which injects API keys from ~/.hermes/.env into os.environ before any test fixture runs. This caused test_agent_loop_tool_calling to make real API calls instead of skipping, which ends up making some tests fail. * fix(test): add get_rate_limit_state to agent mock in usage report tests _show_usage now calls agent.get_rate_limit_state() for rate limit display. The SimpleNamespace mock was missing this method. * fix(test): update expected Camofox config version from 12 to 13 * fix(test): mock _get_enabled_platforms in nous managed defaults test Importing gateway.run leaks DISCORD_BOT_TOKEN into os.environ, which makes _get_enabled_platforms() return ["cli", "discord"] instead of just ["cli"]. tools_command loops per platform, so apply_nous_managed_defaults runs twice: the first call sets config values, the second sees them as already configured and returns an empty set, causing the assertion to fail.	2026-04-09 13:17:06 -07:00
Teknium	3eade90b39	fix: OpenClaw migration now shows dry-run preview before executing (#6769) The setup wizard's OpenClaw migration previously ran immediately with aggressive defaults (overwrite=True, preset=full) after a single 'Would you like to import?' prompt. This caused several problems: - Config values with different semantics (e.g. tool_call_execution: 'auto' in OpenClaw vs 'off' for Hermes yolo mode) were imported without translation - Gateway tokens were hijacked from OpenClaw without warning, taking over Telegram/Slack/Discord channels - Instruction files (.md) containing OpenClaw-specific setup/restart procedures were copied, causing Hermes restart failures Now the migration: 1. Asks 'Would you like to see what can be imported?' (softer framing) 2. Runs a dry-run preview showing everything that would be imported 3. Displays categorized warnings for high-impact items (gateway takeover, config value differences, instruction files) 4. Asks for explicit confirmation with default=No 5. Executes with overwrite=False (preserves existing Hermes config) Also extracts _load_openclaw_migration_module() for reuse and adds _print_migration_preview() with keyword-based warning detection. Tests updated for two-phase behavior + new test for decline-after-preview.	2026-04-09 12:15:06 -07:00
KUSH42	34d06a9802	fix(compaction): don't halve context_length on output-cap-too-large errors When the API returns "max_tokens too large given prompt" (input tokens are within the context window, but input + requested output > window), the old code incorrectly routed through the same handler as "prompt too long" errors, calling get_next_probe_tier() and permanently halving context_length. This made things worse: the window was fine, only the requested output size needed trimming for that one call. Two distinct error classes now handled separately: Prompt too long — input itself exceeds context window. Fix: compress history + halve context_length (existing behaviour, unchanged). Output cap too large — input OK, but input + max_tokens > window. Fix: parse available_tokens from the error message, set a one-shot _ephemeral_max_output_tokens override for the retry, and leave context_length completely untouched. Changes: - agent/model_metadata.py: add parse_available_output_tokens_from_error() that detects Anthropic's "available_tokens: N" error format and returns the available output budget, or None for all other error types. - run_agent.py: call the new parser first in the is_context_length_error block; if it fires, set _ephemeral_max_output_tokens (with a 64-token safety margin) and break to retry without touching context_length. _build_api_kwargs consumes the ephemeral value exactly once then clears it so subsequent calls use self.max_tokens normally. - agent/anthropic_adapter.py: expand build_anthropic_kwargs docstring to clearly document the max_tokens (output cap) vs context_length (total window) distinction, which is a persistent source of confusion due to the OpenAI-inherited "max_tokens" name. - cli-config.yaml.example: add inline comments explaining both keys side by side where users are most likely to look. - website/docs/integrations/providers.md: add a callout box at the top of "Context Length Detection" and clarify the troubleshooting entry. 
- tests/test_ctx_halving_fix.py: 24 tests across four classes covering the parser, build_anthropic_kwargs clamping, ephemeral one-shot consumption, and the invariant that context_length is never mutated on output-cap errors.	2026-04-09 11:27:41 -07:00
Teknium	2772d99085	fix: remove /prompt slash command — footgun via prefix expansion (#6752) /pr <anything> silently resolved to /prompt via the shortest-match tiebreaker in prefix expansion, permanently overwriting the system prompt and persisting to config. The command's functionality (setting agent.system_prompt) is available via config.yaml and /personality covers the common use case. Removes: CommandDef, dispatch branch, _handle_prompt_command handler, docs references, and updates subcommand extraction test.	2026-04-09 11:27:27 -07:00
Teknium	ee16416c7b	fix(cli): prefer auth.py env vars over models.dev in provider detection (#6755) list_authenticated_providers() was using env var names from the external models.dev registry to detect credentials. This registry has incorrect mappings for 5 providers: minimax-cn, zai, opencode-zen, opencode-go, and kilocode — causing them to not appear in /model even when the correct API key is set. Now checks PROVIDER_REGISTRY from auth.py first (our source of truth), falling back to models.dev only for providers not in our registry. Fixes #6620. Based on devorun's investigation in PR #6625.	2026-04-09 11:13:11 -07:00
Teknium	3007174a61	fix: prevent 400 format errors from triggering compression loop on Codex Responses API (#6751) The error classifier's generic-400 heuristic only extracted err_body_msg from the nested body structure (body['error']['message']), missing the flat body format used by OpenAI's Responses API (body['message']). This caused descriptive 400 errors like 'Invalid input[index].name: string does not match pattern' to appear generic when the session was large, misclassifying them as context overflow and triggering an infinite compression loop. Added flat-body fallback in _classify_400() consistent with the parent classify_api_error() function's existing handling at line 297-298.	2026-04-09 11:11:34 -07:00
Yang Zhi	2f0a83dd12	fix(cli): update TUI status bar model name on provider fallback The status bar reads self.model from the CLI class, which is set once at init and never updated when _try_activate_fallback() switches to a backup provider/model in run_agent.py. This causes the TUI to display the original model name while context_length_max changes, creating a confusing mismatch. Read the model name from agent.model (live, updated by fallback) with self.model as fallback before the agent is created. Remove the redundant getattr(self, 'agent') call that was already done above.	2026-04-09 11:11:25 -07:00
Yang Zhi	110cdd573a	fix(auxiliary_client): inject KimiCLI User-Agent for custom endpoint sync clients When is explicitly set to , the custom-endpoint path in creates a plain client without provider-specific headers. This means sync vision calls (e.g. ) use the generic User-Agent and get rejected by Kimi's coding endpoint with a 403: 'Kimi For Coding is currently only available for Coding Agents such as Kimi CLI...' The async converter already injects , and the auto-detected API-key provider path also injects it, but the explicit custom endpoint shortcut was missing it entirely. This patch adds the same injection to the custom endpoint branch, and updates all existing Kimi header sites to for consistency. Fixes <issue number to be filled in>	2026-04-09 11:11:25 -07:00
Yang Zhi	4d1b988070	fix(credential_pool): use _resolve_kimi_base_url when seeding kimi-coding pool The credential pool seeder (_seed_from_env) hardcoded the base URL for API-key providers without running provider-specific auto-detection. For kimi-coding, this caused sk-kimi- prefixed keys to be seeded with the legacy api.moonshot.ai/v1 endpoint instead of api.kimi.com/coding/v1, resulting in HTTP 401 on the first request. Import and call _resolve_kimi_base_url for kimi-coding so the pool uses the correct endpoint based on the key prefix, matching the runtime credential resolver behavior. Also fix a comment: sk-kimi- keys are issued by kimi.com/code, not platform.kimi.ai. Fixes #5561	2026-04-09 11:11:25 -07:00
Yang Zhi	019c11d07e	fix(fallback): preserve provider-specific headers when activating fallback When _try_activate_fallback() swaps to a new provider (e.g. kimi-coding), resolve_provider_client() correctly injects provider-specific default_headers (like KimiCLI User-Agent) into the returned OpenAI client. However, _client_kwargs was saved with only api_key and base_url, dropping those headers. Every subsequent API call rebuilds the client from _client_kwargs via _create_request_openai_client(), producing a bare OpenAI client without the required headers. Kimi Coding rejects this with 403; Copilot would lose its auth headers similarly. This patch reads _custom_headers from the fallback client (where the OpenAI SDK stores the default_headers kwarg) and includes them in _client_kwargs so any client rebuild preserves provider-specific headers. Fixes #6075	2026-04-09 11:11:25 -07:00
MustafaKara7	fce23e8024	fix(docker): #6197 enable unbuffered stdout for live logs	2026-04-09 10:59:31 -07:00
Teknium	1ec1f6a68a	fix: model fallback — stale model on Nous login + connection error fallback (#6554) Two bugs in the model fallback system: 1. Nous login leaves stale model in config (provider=nous, model=opus from previous OpenRouter setup). Fixed by deferring the config.yaml provider write until AFTER model selection completes, and passing the selected model atomically via _update_config_for_provider's default_model parameter. Previously, _update_config_for_provider was called before model selection — if selection failed (free tier, no models, exception), config stayed as nous+opus permanently. 2. Codex/stale providers in auxiliary fallback can't connect but block the auto-detection chain. Added _is_connection_error() detection (APIConnectionError, APITimeoutError, DNS failures, connection refused) alongside the existing _is_payment_error() check in call_llm(). When a provider endpoint is unreachable, the system now falls back to the next available provider instead of crashing.	2026-04-09 10:38:53 -07:00
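
Commit ab7b407224 above replaces a check-then-set guard on Slack approval clicks with an atomic dict.pop(). The sketch below illustrates that general pattern only; it is not the code from gateway/platforms/slack.py. The names pending_approvals, resolved, and on_button_click are hypothetical, and the threads merely simulate concurrent button clicks. It assumes CPython, where a single dict.pop() on string keys is not interleaved with another thread's pop of the same key.

```python
import threading

# Hypothetical registry of approvals awaiting a click (message id -> session id).
# The real data structure in the Slack adapter is not shown in this log.
pending_approvals = {"msg-1": "session-42"}
resolved = []  # sessions that were actually resolved


def on_button_click(message_id: str) -> bool:
    """Resolve the approval at most once, however many clicks arrive."""
    # pop() removes and returns the entry in one step, so there is no window
    # between "check whether it is resolved" and "mark it resolved".
    session = pending_approvals.pop(message_id, None)
    if session is None:
        return False  # already resolved, or unknown message: ignore the click
    resolved.append(session)
    return True


# Simulate several users clicking the same Approve button at once.
threads = [
    threading.Thread(target=on_button_click, args=("msg-1",)) for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(resolved)
```

Only the first click finds an entry to pop, so the approval is resolved exactly once and later clicks are ignored.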

3609 commits