hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-09 03:11:58 +00:00

Author	SHA1	Message	Date
briandevans	c1a2710a32	test(aux): cover effort: 0 fallback in Codex reasoning translation Copilot review on PR #17012 noted the docstring/comment lists `0` among the falsy effort values that fall back to `medium`, but the existing regression tests only cover `None` and `""`. Add the third case to lock in the full contract. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 05:47:50 -07:00
briandevans	9e893d16d1	fix(aux): default Codex reasoning effort to medium when extra_body.reasoning.effort is falsy auxiliary.<task>.extra_body.reasoning, but the new translation path in _CodexCompletionsAdapter.create() reads the effort with ``reasoning_cfg.get("effort", "medium")``. That returns the configured value verbatim when the key is present, so ``effort: null`` / ``effort: ""`` (both common YAML shapes) flow through as ``{"effort": null, "summary": "auto"}`` and Codex rejects the request with "Invalid value for parameter ``reasoning.effort``". agent/transports/codex.py::build_kwargs() — which the new adapter is documented to mirror — uses a truthy check (``elif reasoning_config.get("effort"):``) so the same falsy values keep the "medium" default. Switch the auxiliary adapter to the same ``or "medium"`` truthy form so identical config produces identical requests on both paths. - [x] Two new regression tests cover ``effort: None`` and ``effort: ""`` and assert the request goes out as ``{"effort": "medium", "summary": "auto"}``. - [x] Old behaviour fails the new tests (``{'effort': None} != {'effort': 'medium'}``); fixed behaviour passes all 11 tests in the ``TestCodexAdapterReasoningTranslation`` class. - [x] Adjacent suites green: ``tests/agent/test_auxiliary_client.py`` (108 passed) and ``tests/agent/transports/test_codex_transport.py + test_chat_completions.py`` (73 passed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 05:47:50 -07:00
nftpoetrist	808fee151d	fix(auxiliary): propagate explicit_api_key to _try_anthropic() _try_anthropic() lacked the explicit_api_key parameter added to _try_openrouter() in #18768. When resolve_provider_client() is called with provider="anthropic" and an explicit key (e.g. from a fallback_model entry with api_key set), the key was silently ignored — _try_anthropic() always fell back to resolve_anthropic_token(), so the fallback returned None,None for users without a default Anthropic credential configured. Fix: add explicit_api_key: str = None to _try_anthropic() and use explicit_api_key or <pool/env fallback> in both the pool-present and no-pool paths. Pass explicit_api_key=explicit_api_key at the call site in resolve_provider_client(). Symmetric with the _try_openrouter() fix. No behavior change when explicit_api_key is None.	2026-05-03 17:00:55 -07:00
liuhao1024	af98122793	fix(auxiliary): propagate explicit_api_key to _try_openrouter() When resolve_provider_client() passes explicit_api_key for OpenRouter auxiliary tasks, _try_openrouter() now accepts and honors this parameter instead of silently ignoring it and falling back to OPENROUTER_API_KEY env var. Root cause: _try_openrouter() had no explicit_api_key parameter, so even when callers wanted to pass a runtime credential pool key, it could not be used. Fix: - Add explicit_api_key: str = None parameter to _try_openrouter() - Prioritize explicit_api_key over pool key and env var - Update resolve_provider_client() call site to pass explicit_api_key Regression coverage: - Test that explicit_api_key is passed to OpenAI client when provided - Test that fallback to OPENROUTER_API_KEY still works when explicit_api_key is None Closes #18338	2026-05-02 02:27:49 -07:00
liuhao1024	9bf260472b	fix(tools): deduplicate tool names at API boundary for Vertex/Azure/Bedrock Providers like Google Vertex, Azure, and Amazon Bedrock reject API requests with duplicate tool names (HTTP 400: 'Tool names must be unique'). The upstream injection paths in run_agent.py already dedup after PR #17335, but two API-boundary functions pass tools through without checking: - agent/auxiliary_client.py: _build_call_kwargs() (all non-Anthropic providers in chat_completions mode) - agent/anthropic_adapter.py: convert_tools_to_anthropic() (Anthropic Messages API path) Add defensive dedup guards at both sites. Duplicates are dropped with a warning log, converting a hard 400 failure into a recoverable condition. This is intentionally conservative — the root-cause dedup in run_agent.py is the primary defense; these guards add resilience against future injection-path regressions. Includes 8 new tests covering unique passthrough, duplicate removal, empty/None edge cases. Closes #18478	2026-05-02 01:51:51 -07:00
Teknium	ce0c3ae493	fix(aux): remove hardcoded Codex fallback model, drop Codex from auto chain (#17765 ) The _CODEX_AUX_MODEL constant had already rotated twice in 6 weeks (gpt-5.3-codex -> gpt-5.2-codex -> now broken again at gpt-5.2-codex) because ChatGPT-account Codex gates which models it accepts via an undocumented, shifting allow-list that OpenAI publishes no changelog for. Any pinned default will keep going stale. Issue #17533 reports the current breakage: every ChatGPT-account auxiliary fallback fails with HTTP 400 "model is not supported" and the 60s pause loop degrades long sessions. Rather than reset the clock with another stale pin (PR #17544 proposes gpt-5.2-codex -> gpt-5.4), remove the hardcoded second-order Codex fallback entirely: - Delete `_CODEX_AUX_MODEL`. - Drop `_try_codex` from `_get_provider_chain()` (the auto chain now ends at api-key providers; 4 rungs instead of 5). - Rename `_try_codex() -> _build_codex_client(model)` and require an explicit model from the caller. No more guessing. - `resolve_provider_client("openai-codex", model=None)` now warns and returns (None, None) instead of silently guessing a stale model ID. - Remove `_try_codex` from the `provider="custom"` fallback ladder (same stale-constant trap). - `_resolve_strict_vision_backend("openai-codex")` routes through `resolve_provider_client` so the caller's explicit model is honored. Codex-main users are unaffected: Step 1 of `_resolve_auto` already uses `main_provider` + `main_model` directly and passes the user's configured Codex model through `resolve_provider_client`, which never touched `_CODEX_AUX_MODEL`. Per-task overrides (`auxiliary.<task>.provider/model`) continue to work and are the supported way to route specific aux tasks through Codex. Users whose main provider fails with a payment/connection error and who have ONLY ChatGPT-account Codex auth will now see the 60s pause without a stale-model-rejection noise line in between -- same outcome, cleaner failure. Closes #17533. Supersedes #17544 (which resets the clock on the same stale-constant problem).	2026-04-29 23:23:50 -07:00
Teknium	ff687c019e	fix(aux): skip kimi-coding in vision auto-detect (closes #17076 ) (#17451 ) * docs(anthropic): correct OAuth scope to Max plan + extra usage credits only The previous docs pass (#17399) overstated what Anthropic OAuth works with. In practice Hermes can only route against a Claude Max plan that has purchased extra usage credits — the base Max allowance is not consumed, and Claude Pro is not supported at all. Without Max + extra credits, users must fall back to an ANTHROPIC_API_KEY (pay-per-token). Updates the four pages touched in #17399: - integrations/providers.md - user-guide/features/credential-pools.md - reference/environment-variables.md - getting-started/quickstart.md * fix(aux): skip kimi-coding in vision auto-detect (closes #17076) Kimi Coding Plan's /coding endpoint (Anthropic Messages wire) has no image_in capability — Kimi's own docs confirm and suggest switching to a vision-capable model. Vision lives on the separate Kimi Platform (api.moonshot.ai, OpenAI-wire, pay-as-you-go). When the user has kimi-coding as main provider and auxiliary.vision.provider=auto, resolve_vision_provider_client was handing back an AnthropicAuxiliaryClient wrapped around /coding which 404'd on every vision request. Add a _PROVIDERS_WITHOUT_VISION frozenset ({kimi-coding, kimi-coding-cn}) and gate the main-provider vision branch on membership. On a skip the auto-detect falls through to OpenRouter → Nous like any other main-provider-unavailable case. Explicit per-task overrides (auxiliary.vision.provider=kimi-coding) are unaffected — the skip only applies when the caller is in auto mode. Tests: 4 new targeted tests in TestVisionAutoSkipsKimiCoding covering the skip path, CN variant, explicit-override passthrough, and a guard against accidental skip-list widening.	2026-04-29 06:10:23 -07:00
Teknium	391f1ca1f4	feat(aux): translate extra_body.reasoning into Codex Responses API (#17004 ) Auxiliary callers that configure reasoning via auxiliary.<task>.extra_body.reasoning were having that config silently dropped by the Codex Responses adapter — it only forwarded messages/model/tools through to responses.stream(), never translating chat.completions-shaped reasoning hints into the Responses API's top-level reasoning + include fields. Mirror the main-agent translation from agent/transports/codex.py: - extra_body.reasoning.effort → resp_kwargs.reasoning.{effort, summary:"auto"} - 'minimal' → 'low' clamp (Codex backend rejects 'minimal') - Always include ['reasoning.encrypted_content'] when reasoning is enabled - {'enabled': False} → omit reasoning and include entirely - Non-dict reasoning values are ignored defensively Reported by @OP (Apr 26 feedback bundle). ## Changes - agent/auxiliary_client.py: _CodexCompletionsAdapter.create() now reads and translates extra_body.reasoning before calling responses.stream() - tests/agent/test_auxiliary_client.py: 9 new tests covering all effort levels, the minimal→low clamp, the disabled path, the no-op paths, and defensive handling of wrong-shape inputs Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 05:47:42 -07:00
kshitijk4poor	56724147ef	fix(providers/gmi): post-salvage review fixes - config.py: remove dead ENV_VARS_BY_VERSION[17] entry (current _config_version is 22, so all users are past version 17 and would never be prompted for GMI_API_KEY on upgrade — consistent with how arcee was added) - auxiliary_client.py: use google/gemini-3.1-flash-lite-preview as GMI aux model instead of anthropic/claude-opus-4.6 (matches cheap fast-model pattern used by all other providers: zai→glm-4.5-flash, kimi→kimi-k2-turbo-preview, stepfun→step-3.5-flash, kilocode→google/gemini-3-flash-preview) - test_gmi_provider.py: fix malformed write_text() call in doctor test (was: write_text("GMI_API_KEY=* encoding="utf-8") → missing closing quote, wrote literal string 'GMI_API_KEY=* encoding=' to .env file) - test_gmi_provider.py + test_auxiliary_client.py: update aux model assertions to match new cheaper default - docs/integrations/providers.md: add 'gmi' to inline 'Supported providers' fallback list (was only in the table, not the inline list at line ~1181) - docs/reference/cli-commands.md: add 'gmi' to --provider choices list	2026-04-27 11:17:59 -07:00
Isaac Huang	c53fcb0173	feat(providers): add GMI Cloud as a first-class API-key provider (#11955 ) Add GMI Cloud (api.gmi-serving.com) as a full first-class API-key provider with built-in auth, aliases, model catalog, CLI entry points, auxiliary client routing, context length resolution, doctor checks, env var tracking, and docs. - auth.py: ProviderConfig for 'gmi' (api_key, GMI_API_KEY / GMI_BASE_URL) - providers.py: HermesOverlay with extra_env_vars for models.dev detection - models.py: curated slash-form model catalog; live /v1/models fetch - main.py: 'gmi' in _named_custom_provider_map and --provider choices - model_metadata.py: _URL_TO_PROVIDER, _PROVIDER_PREFIXES, dedicated context-length probe block (GMI's /models has authoritative data) - auxiliary_client.py: alias entries; _compat_model fix for slash-form models on cached aggregator-style clients; gmi aux default model - doctor.py: GMI in provider connectivity checks - config.py: GMI_API_KEY / GMI_BASE_URL in OPTIONAL_ENV_VARS - conftest.py: explicit GMI_BASE_URL clearing (not caught by _API_KEY suffix) - docs: providers.md, environment-variables.md, fallback-providers.md, configuration.md, quickstart.md (expands provider table) Co-authored-by: Isaac Huang <isaachuang@Isaacs-MacBook-Pro.local>	2026-04-27 11:17:59 -07:00
Wooseong Kim	be6b83562d	fix(aux): force anthropic oauth refresh after 401 Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-04-24 07:14:00 -07:00
Nicecsh	2e2de124af	fix(aux): normalize GitHub Copilot provider slugs Keep auxiliary provider resolution aligned with the switch and persisted main-provider paths when models.dev returns github-copilot slugs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 03:33:29 -07:00
sgaofen	07046096d9	fix(agent): clarify exhausted OpenRouter auxiliary credentials	2026-04-23 14:04:31 -07:00
emozilla	29693f9d8e	feat(aux): use Portal /api/nous/recommended-models for auxiliary models Wire the auxiliary client (compaction, vision, session search, web extract) to the Nous Portal's curated recommended-models endpoint when running on Nous Portal, with a TTL-cached fetch that mirrors how we pull /models for pricing. hermes_cli/models.py - fetch_nous_recommended_models(portal_base_url, force_refresh=False) 10-minute TTL cache, keyed per portal URL (staging vs prod don't collide). Public endpoint, no auth required. Returns {} on any failure so callers always get a dict. - get_nous_recommended_aux_model(vision, free_tier=None, ...) Tier-aware pick from the payload: - Paid tier → paidRecommended{Vision,Compaction}Model, falling back to freeRecommended* when the paid field is null (common during staged rollouts of new paid models). - Free tier → freeRecommended* only, never leaks paid models. When free_tier is None, auto-detects via the existing check_nous_free_tier() helper (already cached 3 min against /api/oauth/account). Detection errors default to paid so we never silently downgrade a paying user. agent/auxiliary_client.py — _try_nous() - Replaces the hardcoded xiaomi/mimo free-tier branch with a single call to get_nous_recommended_aux_model(vision=vision). - Falls back to _NOUS_MODEL (google/gemini-3-flash-preview) when the Portal is unreachable or returns a null recommendation. - The Portal is now the source of truth for aux model selection; the xiaomi allowlist we used to carry is effectively dead. Tests (15 new) - tests/hermes_cli/test_models.py::TestNousRecommendedModels Fetch caching, per-portal keying, network failure, force_refresh; paid-prefers-paid, paid-falls-to-free, free-never-leaks-paid, auto-detect, detection-error → paid default, null/blank modelName handling. - tests/agent/test_auxiliary_client.py::TestNousAuxiliaryRefresh _try_nous honors Portal recommendation for text + vision, falls back to google/gemini-3-flash-preview on None or exception. Behavior won't visibly change today — both tier recommendations currently point at google/gemini-3-flash-preview — but the moment the Portal ships a better paid recommendation, subscribers pick it up within 10 minutes without a Hermes release.	2026-04-21 20:35:16 -07:00
helix4u	392b2bb17b	fix(auxiliary): refresh Nous runtime credentials after aux 401s	2026-04-21 14:25:57 -07:00
Teknium	3cba81ebed	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157 ) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR #13137 (@kshitijk4poor).	2026-04-20 12:23:05 -07:00
helix4u	6ab78401c9	fix(aux): add session_search extra_body and concurrency controls Adds auxiliary.<task>.extra_body config passthrough so reasoning-heavy OpenAI-compatible providers can receive provider-specific request fields (e.g. enable_thinking: false on GLM) on auxiliary calls, and bounds session_search summary fan-out with auxiliary.session_search.max_concurrency (default 3, clamped 1-5) to avoid 429 bursts on small providers. - agent/auxiliary_client.py: extract _get_auxiliary_task_config helper, add _get_task_extra_body, merge config+explicit extra_body with explicit winning - hermes_cli/config.py: extra_body defaults on all aux tasks + session_search.max_concurrency; _config_version 19 -> 20 - tools/session_search_tool.py: semaphore around _summarize_all gather - tests: coverage in test_auxiliary_client, test_session_search, test_aux_config - docs: user-guide/configuration.md + fallback-providers.md Co-authored-by: Teknium <teknium@nousresearch.com>	2026-04-20 00:47:39 -07:00
kagura-agent	9b60ffc47f	fix: include api.moonshot.cn in public API temperature override (#12745 ) kimi-k2.5 on api.moonshot.cn/v1 rejects temperature=0.6 with HTTP 400, same as api.moonshot.ai. The public API check now matches both domains.	2026-04-20 00:32:06 -07:00
taeng0204	6f79b8f01d	fix(kimi): route temperature override by base_url — kimi-k2.5 needs 1.0 on api.moonshot.ai Follow-up to #12144. That PR standardized the kimi-k2.* temperature lock against the Coding Plan endpoint (api.kimi.com/coding/v1) docs, where non-thinking models require 0.6. Verified empirically against Moonshot (April 2026) that the public chat endpoint (api.moonshot.ai/v1) has a different contract for kimi-k2.5: it only accepts temperature=1, and rejects 0.6 with: HTTP 400 "invalid temperature: only 1 is allowed for this model" Users hit the public endpoint when KIMI_API_KEY is a legacy sk-* key (the sk-kimi-* prefix routes to Coding Plan — see hermes_cli/auth.py). So for Coding Plan subscribers the fix from #12144 is correct, but for public-API users it reintroduces the exact 400 reported in #9125. Reproduction on api.moonshot.ai/v1 + kimi-k2.5: temperature=1.0 → 200 OK temperature=0.6 → 400 "only 1 is allowed" ← #12144 default temperature=None → 200 OK Other kimi-k2.* models are unaffected empirically — turbo-preview accepts 0.6 and thinking-turbo accepts 1.0 on both endpoints — so only kimi-k2.5 diverges. Fix: thread the client's actual base_url through _build_call_kwargs (the parameter already existed but callers passed config-level resolved_base_url; for auto-detected routes that was often empty). _fixed_temperature_for_model now checks api.moonshot.ai first via an explicit _KIMI_PUBLIC_API_OVERRIDES map, then falls back to the Coding Plan defaults. Tests parametrize over endpoint + model to lock both contracts. Closes #9125.	2026-04-19 18:54:35 -07:00
kshitij	c14b3b5880	fix(kimi): force fixed temperature on kimi-k2.* models (k2.5, thinking, turbo) (#12144 ) * fix(kimi): force fixed temperature on kimi-k2.* models (k2.5, thinking, turbo) The prior override only matched the literal model name "kimi-for-coding", but Moonshot's coding endpoint is hit with real model IDs such as `kimi-k2.5`, `kimi-k2-turbo-preview`, `kimi-k2-thinking`, etc. Those requests bypassed the override and kept the caller's temperature, so Moonshot returns HTTP 400 "invalid temperature: only 0.6 is allowed for this model" (or 1.0 for thinking variants). Match the whole kimi-k2.* family: * kimi-k2-thinking / kimi-k2-thinking-turbo -> 1.0 (thinking mode) * all other kimi-k2.* -> 0.6 (non-thinking / instant mode) Also accept an optional vendor prefix (e.g. `moonshotai/kimi-k2.5`) so aggregator routings are covered. * refactor(kimi): whitelist-match kimi coding models instead of prefix Addresses review feedback on PR #12144. - Replace `startswith("kimi-k2")` with explicit frozensets sourced from Moonshot's kimi-for-coding model list. The prefix match would have also clamped `kimi-k2-instruct` / `kimi-k2-instruct-0905`, which are the separate non-coding K2 family with variable temperature (recommended 0.6 but not enforced — see huggingface.co/moonshotai/Kimi-K2-Instruct). - Confirmed via platform.kimi.ai docs that all five coding models (k2.5, k2-turbo-preview, k2-0905-preview, k2-thinking, k2-thinking-turbo) share the fixed-temperature lock, so the preview-model mapping is no longer an assumption. - Drop the fragile `"thinking" in bare` substring test for a set lookup. - Log a debug line on each override so operators can see when Hermes silently rewrites temperature. - Update class docstring. Extend the negative test to parametrize over kimi-k2-instruct, Kimi-K2-Instruct-0905, and a hypothetical future kimi-k2-experimental name — all must keep the caller's temperature.	2026-04-18 09:35:51 -07:00
helix4u	2b60478fc2	fix(kimi): force kimi-for-coding temperature to 0.6	2026-04-17 15:49:14 -07:00
Teknium	2367c6ffd5	test: remove 169 change-detector tests across 21 files (#11472 ) First pass of test-suite reduction to address flaky CI and bloat. Removed tests that fall into these change-detector patterns: 1. Source-grep tests (tests/gateway/test_feishu.py, test_email.py): tests that call inspect.getsource() on production modules and grep for string literals. Break on any refactor/rename even when behavior is correct. 2. Platform enum tautologies (every gateway/test_X.py): assertions like `Platform.X.value == 'x'` duplicated across ~9 adapter test files. 3. Toolset/PLATFORM_HINTS/setup-wizard registry-presence checks: tests that only verify a key exists in a dict. Data-layout tests, not behavior. 4. Argparse wiring tests (test_argparse_flag_propagation, test_subparser_routing _fallback): tests that do parser.parse_args([...]) then assert args.field. Tests Python's argparse, not our code. 5. Pure dispatch tests (test_plugins_cmd.TestPluginsCommandDispatch): patch cmd_X, call plugins_command with matching action, assert mock called. Tests the if/elif chain, not behavior. 6. Kwarg-to-mock verification (test_auxiliary_client ~45 tests, test_web_tools_config, test_gemini_cloudcode, test_retaindb_plugin): tests that mock the external API client, call our function, and assert exact kwargs. Break on refactor even when behavior is preserved. 7. Schedule-internal "function-was-called" tests (acp/test_server scheduling tests): tests that patch own helper method, then assert it was called. Kept behavioral tests throughout: error paths (pytest.raises), security tests (path traversal, SSRF, redaction), message alternation invariants, provider API format conversion, streaming logic, memory contract, real config load/merge tests. Net reduction: 169 tests removed. 38 empty classes cleaned up. Collected before: 12,522 tests Collected after: 12,353 tests	2026-04-17 01:05:09 -07:00
kshitijk4poor	ff5bf0d6c8	fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation Salvaged from PR #10643 by kshitijk4poor, updated for current main. Root causes fixed: 1. Telegram xdist mock pollution — new tests/gateway/conftest.py with shared mock that runs at collection time (prevents ChatType=None caching) 2. VIRTUAL_ENV env var leak — monkeypatch.delenv in _detect_venv_dir tests 3. Copilot base_url missing — add fallback in _resolve_runtime_from_pool_entry 4. Stale vision model assertion — zai now uses glm-5v-turbo 5. Reasoning item id intentionally stripped — assert 'id' not in (store=False) 6. Context length warning unreachable — pass base_url to AIAgent in test 7. Kimi provider label updated — 'Kimi / Kimi Coding Plan' matches models.py 8. Google Workspace calendar tests — rewritten for current production code, properly mock subprocess on api_module, removed stale +agenda assertions 9. Credential pool auto-seeding — mock _select_pool_entry / _resolve_auto / _import_codex_cli_tokens to prevent real credentials from leaking into tests	2026-04-15 22:05:21 -07:00
Teknium	2558d28a9b	fix: resolve CI test failures — add missing functions, fix stale tests (#9483 ) Production fixes: - Add clear_session_context() to hermes_logging.py (fixes 48 teardown errors) - Add clear_session() to tools/approval.py (fixes 9 setup errors) - Add SyncError M_UNKNOWN_TOKEN check to Matrix _sync_loop (bug fix) - Fall back to inline api_key in named custom providers when key_env is absent (runtime_provider.py) Test fixes: - test_memory_user_id: use builtin+external provider pair, fix honcho peer_name override test to match production behavior - test_display_config: remove TestHelpers for non-existent functions - test_auxiliary_client: fix OAuth tokens to match _is_oauth_token patterns, replace get_vision_auxiliary_client with resolve_vision_provider_client - test_cli_interrupt_subagent: add missing _execution_thread_id attr - test_compress_focus: add model/provider/api_key/base_url/api_mode to mock compressor - test_auth_provider_gate: add autouse fixture to clean Anthropic env vars that leak from CI secrets - test_opencode_go_in_model_list: accept both 'built-in' and 'hermes' source (models.dev API unavailable in CI) - test_email: verify email Platform enum membership instead of source inspection (build_channel_directory now uses dynamic enum loop) - test_feishu: add bot_added/bot_deleted handler mocks to _Builder - test_ws_auth_retry: add AsyncMock for sync_store.get_next_batch, add _pending_megolm and _joined_rooms to Matrix adapter mocks - test_restart_drain: monkeypatch-delete INVOCATION_ID (systemd sets this in CI, changing the restart call signature) - test_session_hygiene: add user_id to SessionSource - test_session_env: use relative baseline for contextvar clear check (pytest-xdist workers share context)	2026-04-14 01:43:45 -07:00
helix4u	8680f61f8b	fix(copilot-acp): keep acp runtime off responses path	2026-04-13 16:17:43 -07:00
Teknium	e3ffe5b75f	fix: remove legacy compression.summary_* config and env var fallbacks (#8992 ) Remove the backward-compat code paths that read compression provider/model settings from legacy config keys and env vars, which caused silent failures when auto-detection resolved to incompatible backends. What changed: - Remove compression.summary_model, summary_provider, summary_base_url from DEFAULT_CONFIG and cli.py defaults - Remove backward-compat block in _resolve_task_provider_model() that read from the legacy compression section - Remove _get_auxiliary_provider() and _get_auxiliary_env_override() helper functions (AUXILIARY_/CONTEXT_ env var readers) - Remove env var fallback chain for per-task overrides - Update hermes config show to read from auxiliary.compression - Add config migration (v16→17) that moves non-empty legacy values to auxiliary.compression and strips the old keys - Update example config and openclaw migration script - Remove/update tests for deleted code paths Compression model/provider is now configured exclusively via: auxiliary.compression.provider / auxiliary.compression.model Closes #8923	2026-04-13 04:59:26 -07:00
Teknium	078dba015d	fix: three provider-related bugs (#8161 , #8181 , #8147 ) (#8243 ) - Add openai/openai-codex -> openai mapping to PROVIDER_TO_MODELS_DEV so context-length lookups use models.dev data instead of 128k fallback. Fixes #8161. - Set api_mode from custom_providers entry when switching via hermes model, and clear stale api_mode when the entry has none. Also extract api_mode in _named_custom_provider_map(). Fixes #8181. - Convert OpenAI image_url content blocks to Anthropic image blocks when the endpoint is Anthropic-compatible (MiniMax, MiniMax-CN, or any URL containing /anthropic). Fixes #8147.	2026-04-12 01:44:18 -07:00
Harish Kukreja	b1f13a8c5f	fix(agent): route compression aux through live session runtime	2026-04-12 01:34:52 -07:00
Teknium	424b62aa16	fix: update async fallback test mock to 5-tuple for api_mode	2026-04-11 01:52:58 -07:00
kshitijk4poor	c89719ad9c	fix: warn and clear stale OPENAI_BASE_URL on provider switch (#5161 )	2026-04-11 01:52:58 -07:00
kshitijk4poor	eeb8b4b00f	fix(auxiliary): harden fallback behavior for non-OpenRouter users Four fixes to auxiliary_client.py: 1. Respect explicit provider as hard constraint (#7559) When auxiliary.{task}.provider is explicitly set (not 'auto'), connection/payment errors no longer silently fallback to cloud providers. Local-only users (Ollama, vLLM) will no longer get unexpected OpenRouter billing from auxiliary tasks. 2. Eliminate model='default' sentinel (#7512) _resolve_api_key_provider() no longer sends literal 'default' as model name to APIs. Providers without a known aux model in _API_KEY_PROVIDER_AUX_MODELS are skipped instead of producing model_not_supported errors. 3. Add payment/connection fallback to async_call_llm (#7512) async_call_llm now mirrors sync call_llm's fallback logic for payment (402) and connection errors. Previously, async consumers (session_search, web_tools, vision) got hard failures with no recovery. Also fixes hardcoded 'openrouter' fallback to use the full auto-detection chain. 4. Use accurate error reason in fallback logs (#7512) _try_payment_fallback() now accepts a reason parameter and uses it in log messages. Connection timeouts are no longer misleadingly logged as 'payment error'. Closes #7559 Closes #7512	2026-04-11 01:52:58 -07:00
kshitijk4poor	ffbd80f5fc	fix(auxiliary): honor api_mode in auxiliary client (#6800 ) The auxiliary client always calls client.chat.completions.create(), ignoring the api_mode config flag. This breaks codex-family models (e.g. gpt-5.3-codex) on direct OpenAI API keys, which need the /v1/responses endpoint. Changes: - Expand _resolve_task_provider_model to return api_mode (5-tuple) - Read api_mode from auxiliary.{task}.api_mode config and env vars (AUXILIARY_{TASK}_API_MODE) - Pass api_mode through _get_cached_client to resolve_provider_client - Add _needs_codex_wrap/_wrap_if_needed helpers that wrap plain OpenAI clients in CodexAuxiliaryClient when api_mode=codex_responses or when auto-detection finds api.openai.com + codex model pattern - Apply wrapping at all custom endpoint, named custom provider, and API-key provider return paths - Update test mocks for the new 5-tuple return format Users can now set: auxiliary: compression: model: gpt-5.3-codex base_url: https://api.openai.com/v1 api_mode: codex_responses Closes #6800	2026-04-11 01:52:58 -07:00
hermes-agent-dhabibi	c1af614289	fix: wrap copilot Responses-API models in CodexAuxiliaryClient for auxiliary tasks GPT-5+ models (except gpt-5-mini) are only accessible via the Responses API on Copilot. When these models were configured as the compression summary_model (or any auxiliary task), the plain OpenAI client sent them to /chat/completions which returned a 400 error: model "gpt-5.4-mini" is not accessible via the /chat/completions endpoint resolve_provider_client() now checks _should_use_copilot_responses_api() for the copilot provider and wraps the client in CodexAuxiliaryClient when needed, routing calls through responses.stream() transparently. Adds tests for both the wrapping (gpt-5.4-mini) and non-wrapping (gpt-4.1-mini) paths.	2026-04-10 21:16:53 -07:00
Teknium	a4fc38c5b1	test: remove dead TestResolveForcedProvider tests (function doesn't exist on main)	2026-04-10 16:47:44 -07:00
Billard	475cbce775	fix(aux): honor api_mode for custom auxiliary endpoints	2026-04-10 16:47:44 -07:00
xwp	5a1cce53e4	fix(auxiliary): skip anthropic in fallback chain when not explicitly configured _resolve_api_key_provider() now checks is_provider_explicitly_configured before calling _try_anthropic(). Previously, any auxiliary fallback (e.g. when kimi-coding key was invalid) would silently discover and use Claude Code OAuth tokens — consuming the user's Claude Max subscription without their knowledge. This is the auxiliary-client counterpart of the setup-wizard gate in PR #4210. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 05:19:21 -07:00
alt-glitch	96c060018a	fix: remove 115 verified dead code symbols across 46 production files Automated dead code audit using vulture + coverage.py + ast-grep intersection, confirmed by Opus deep verification pass. Every symbol verified to have zero production callers (test imports excluded from reachability analysis). Removes ~1,534 lines of dead production code across 46 files and ~1,382 lines of stale test code. 3 entire files deleted (agent/builtin_memory_provider.py, hermes_cli/checklist.py, tests/hermes_cli/test_setup_model_selection.py). Co-authored-by: alt-glitch <balyan.sid@gmail.com>	2026-04-10 03:44:43 -07:00
Cherif Yaya	5cf4fac2aa	fix: restore codex fallback auth-store lookup	2026-04-09 01:56:10 -07:00
Teknium	7156f8d866	fix: CI test failures — metadata key, cli console, docker env, vision order (#6294 ) Fixes 9 test failures on current main, incorporating ideas from PR stack #6219-#6222 by xinbenlv with corrections: - model_metadata: sync HF context length key casing (minimaxai/minimax-m2.5 → MiniMaxAI/MiniMax-M2.5) - cli.py: route quick command error output through self.console instead of creating a new ChatConsole() instance - docker.py: explicit docker_forward_env entries now bypass the Hermes secret blocklist (intentional opt-in wins over generic filter) - auxiliary_client: revert _read_main_provider() to simple provider.strip().lower() — the _normalize_aux_provider() call introduced in `5c03f2e7` stripped the custom: prefix, breaking named custom provider resolution - auxiliary_client: flip vision auto-detection order to active provider → OpenRouter → Nous → stop (was OR → Nous → active) - test: update vision priority test to match new order Based on PR #6219-#6222 by xinbenlv.	2026-04-08 16:37:05 -07:00
Mibayy	ab271ebe10	fix(vision): simplify vision auto-detection to openrouter → nous → active provider Simplify the vision auto-detection chain from 5 backends (openrouter, nous, codex, anthropic, custom) down to 3: 1. OpenRouter (known vision-capable default model) 2. Nous Portal (known vision-capable default model) 3. Active provider + model (whatever the user is running) 4. Stop This is simpler and more predictable. The active provider step uses resolve_provider_client() which handles all provider types including named custom providers (from #5978). Removed the complex preferred-provider promotion logic and API-level fallback — the chain is short enough that it doesn't need them. Based on PR #5376 by Mibay. Closes #5366.	2026-04-08 01:21:54 -07:00
Teknium	5c03f2e7cc	fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983 ) Salvaged fixes from community PRs: - fix(model_switch): _read_auth_store → _load_auth_store + fix auth store key lookup (was checking top-level dict instead of store['providers']). OAuth providers now correctly detected in /model picker. Cherry-picked from PR #5911 by Xule Lin (linxule). - fix(ollama): pass num_ctx to override 2048 default context window. Ollama defaults to 2048 context regardless of model capabilities. Now auto-detects from /api/show metadata and injects num_ctx into every request. Config override via model.ollama_num_ctx. Fixes #2708. Cherry-picked from PR #5929 by kshitij (kshitijk4poor). - fix(aux): normalize provider aliases for vision/auxiliary routing. Adds _normalize_aux_provider() with 17 aliases (google→gemini, claude→anthropic, glm→zai, etc). Fixes vision routing failure when provider is set to 'google' instead of 'gemini'. Cherry-picked from PR #5793 by e11i (Elizabeth1979). - fix(aux): rewrite MiniMax /anthropic base URLs to /v1 for OpenAI SDK. MiniMax's inference_base_url ends in /anthropic (Anthropic Messages API), but auxiliary client uses OpenAI SDK which appends /chat/completions → 404 at /anthropic/chat/completions. Generic _to_openai_base_url() helper rewrites terminal /anthropic to /v1 for OpenAI-compatible endpoint. Inspired by PR #5786 by Lempkey. Added debug logging to silent exception blocks across all fixes. Co-authored-by: Hermes Agent <hermes@nousresearch.com>	2026-04-07 22:23:28 -07:00
Teknium	da02a4e283	fix: auxiliary client payment fallback — retry with next provider on 402 (#5599 ) When a user runs out of OpenRouter credits and switches to Codex (or any other provider), auxiliary tasks (compression, vision, web_extract) would still try OpenRouter first and fail with 402. Two fixes: 1. Payment fallback in call_llm(): When a resolved provider returns HTTP 402 or a credit-related error, automatically retry with the next available provider in the auto-detection chain. Skips the depleted provider and tries Nous → Custom → Codex → API-key providers. 2. Remove hardcoded OpenRouter fallback: The old code fell back specifically to OpenRouter when auto/custom resolution returned no client. Now falls back to the full auto-detection chain, which handles any available provider — not just OpenRouter. Also extracts _get_provider_chain() as a shared function (replaces inline tuple in _resolve_auto and the new fallback), built at call time so test patches on _try_* functions remain visible. Adds 16 tests covering _is_payment_error(), _get_provider_chain(), _try_payment_fallback(), and call_llm() integration with 402 retry.	2026-04-06 12:41:40 -07:00
Teknium	acea9ee20b	fix(tests): fix 11 real test failures + major cascade poisoner (#4570 ) Three root causes addressed: 1. AIAgent no longer defaults base_url to OpenRouter (9 tests) Tests that assert OpenRouter-specific behavior (prompt caching, reasoning extra_body, provider preferences) need explicit base_url and model set on the agent. Updated test_run_agent.py and test_provider_parity.py. 2. Credential pool auto-seeding from host env (2 tests) test_auxiliary_client.py tests for Anthropic OAuth and custom endpoint fallback were not mocking _select_pool_entry, so the host's credential pool interfered. Added pool + codex mocks. 3. sys.modules corruption cascade (major - ~250 tests) test_managed_modal_environment.py replaced sys.modules entries (tools, hermes_cli, agent packages) with SimpleNamespace stubs but had NO cleanup fixture. Every subsequent test in the process saw corrupted imports: 'cannot import get_config_path from <unknown module name>' and 'module tools has no attribute environments'. Added _restore_tool_and_agent_modules autouse fixture matching the pattern in test_managed_browserbase_and_modal.py. This was also the root cause of CI failures (104 failed on main).	2026-04-02 08:43:06 -07:00
Teknium	8d59881a62	feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647 ) * feat(auth): add same-provider credential pools and rotation UX Add same-provider credential pooling so Hermes can rotate across multiple credentials for a single provider, recover from exhausted credentials without jumping providers immediately, and configure that behavior directly in hermes setup. - agent/credential_pool.py: persisted per-provider credential pools - hermes auth add/list/remove/reset CLI commands - 429/402/401 recovery with pool rotation in run_agent.py - Setup wizard integration for pool strategy configuration - Auto-seeding from env vars and existing OAuth state Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Salvaged from PR #2647 * fix(tests): prevent pool auto-seeding from host env in credential pool tests Tests for non-pool Anthropic paths and auth remove were failing when host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials were present. The pool auto-seeding picked these up, causing unexpected pool entries in tests. - Mock _select_pool_entry in auxiliary_client OAuth flag tests - Clear Anthropic env vars and mock _seed_from_singletons in auth remove test * feat(auth): add thread safety, least_used strategy, and request counting - Add threading.Lock to CredentialPool for gateway thread safety (concurrent requests from multiple gateway sessions could race on pool state mutations without this) - Add 'least_used' rotation strategy that selects the credential with the lowest request_count, distributing load more evenly - Add request_count field to PooledCredential for usage tracking - Add mark_used() method to increment per-credential request counts - Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current() with lock acquisition - Add tests: least_used selection, mark_used counting, concurrent thread safety (4 threads × 20 selects with no corruption) * feat(auth): add interactive mode for bare 'hermes auth' command When 'hermes auth' is called without a subcommand, it now launches an interactive wizard that: 1. Shows full credential pool status across all providers 2. Offers a menu: add, remove, reset cooldowns, set strategy 3. For OAuth-capable providers (anthropic, nous, openai-codex), the add flow explicitly asks 'API key or OAuth login?' — making it clear that both auth types are supported for the same provider 4. Strategy picker shows all 4 options (fill_first, round_robin, least_used, random) with the current selection marked 5. Remove flow shows entries with indices for easy selection The subcommand paths (hermes auth add/list/remove/reset) still work exactly as before for scripted/non-interactive use. * fix(tests): update runtime_provider tests for config.yaml source of truth (#4165) Tests were using OPENAI_BASE_URL env var which is no longer consulted after #4165. Updated to use model config (provider, base_url, api_key) which is the new single source of truth for custom endpoint URLs. * feat(auth): support custom endpoint credential pools keyed by provider name Custom OpenAI-compatible endpoints all share provider='custom', making the provider-keyed pool useless. Now pools for custom endpoints are keyed by 'custom:<normalized_name>' where the name comes from the custom_providers config list (auto-generated from URL hostname). - Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)' - load_pool('custom:name') seeds from custom_providers api_key AND model.api_key when base_url matches - hermes auth add/list now shows custom endpoints alongside registry providers - _resolve_openrouter_runtime and _resolve_named_custom_runtime check pool before falling back to single config key - 6 new tests covering custom pool keying, seeding, and listing * docs: add Excalidraw diagram of full credential pool flow Comprehensive architecture diagram showing: - Credential sources (env vars, auth.json OAuth, config.yaml, CLI) - Pool storage and auto-seeding - Runtime resolution paths (registry, custom, OpenRouter) - Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh) - CLI management commands and strategy configuration Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g * fix(tests): update setup wizard pool tests for unified select_provider_and_model flow The setup wizard now delegates to select_provider_and_model() instead of using its own prompt_choice-based provider picker. Tests needed: - Mock select_provider_and_model as no-op (provider pre-written to config) - Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it) - Pre-write model.provider to config so the pool step is reached * docs: add comprehensive credential pool documentation - New page: website/docs/user-guide/features/credential-pools.md Full guide covering quick start, CLI commands, rotation strategies, error recovery, custom endpoint pools, auto-discovery, thread safety, architecture, and storage format. - Updated fallback-providers.md to reference credential pools as the first layer of resilience (same-provider rotation before cross-provider) - Added hermes auth to CLI commands reference with usage examples - Added credential_pool_strategies to configuration guide * chore: remove excalidraw diagram from repo (external link only) * refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns - _load_config_safe(): replace 4 identical try/except/import blocks - _iter_custom_providers(): shared generator for custom provider iteration - PooledCredential.extra dict: collapse 11 round-trip-only fields (token_type, scope, client_id, portal_base_url, obtained_at, expires_in, agent_key_id, agent_key_expires_in, agent_key_reused, agent_key_obtained_at, tls) into a single extra dict with __getattr__ for backward-compatible access - _available_entries(): shared exhaustion-check between select and peek - Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical) - SimpleNamespace replaces class _Args boilerplate in auth_commands - _try_resolve_from_custom_pool(): shared pool-check in runtime_provider Net -17 lines. All 383 targeted tests pass. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-31 03:10:01 -07:00
Teknium	f890a94c12	refactor: make config.yaml the single source of truth for endpoint URLs (#4165 ) OPENAI_BASE_URL was written to .env AND config.yaml, creating a dual-source confusion. Users (especially Docker) would see the URL in .env and assume that's where all config lives, then wonder why LLM_MODEL in .env didn't work. Changes: - Remove all 27 save_env_value("OPENAI_BASE_URL", ...) calls across main.py, setup.py, and tools_config.py - Remove OPENAI_BASE_URL env var reading from runtime_provider.py, cli.py, models.py, and gateway/run.py - Remove LLM_MODEL/HERMES_MODEL env var reading from gateway/run.py and auxiliary_client.py — config.yaml model.default is authoritative - Vision base URL now saved to config.yaml auxiliary.vision.base_url (both setup wizard and tools_config paths) - Tests updated to set config values instead of env vars Convention enforced: .env is for SECRETS only (API keys). All other configuration (model names, base URLs, provider selection) lives exclusively in config.yaml.	2026-03-30 22:02:53 -07:00
Teknium	d30ea65c9b	fix: URL-based auth for third-party Anthropic endpoints + CI test fixes (#4148 ) * fix(tests): mock sys.stdin.isatty for cmd_model TTY guard * fix(tests): update camofox snapshot format + trajectory compressor mock path - test_browser_camofox: mock response now uses snapshot format (accessibility tree) - test_trajectory_compressor: mock _get_async_client instead of setting async_client directly * fix: URL-based auth detection for third-party Anthropic endpoints + test fixes Reverts the key-prefix approach from #4093 which broke JWT and managed key OAuth detection. Instead, detects third-party endpoints by URL: if base_url is set and isn't anthropic.com, it's a proxy (Azure AI Foundry, AWS Bedrock, etc.) that uses x-api-key regardless of key format. Auth decision chain is now: 1. _requires_bearer_auth(url) → MiniMax → Bearer 2. _is_third_party_anthropic_endpoint(url) → Azure/Bedrock → x-api-key 3. _is_oauth_token(key) → OAuth on direct Anthropic → Bearer 4. else → x-api-key Also includes test fixes from PR #4051 by @erosika: - Mock sys.stdin.isatty for cmd_model TTY guard - Update camofox snapshot format mock - Fix trajectory compressor async client mock path --------- Co-authored-by: Erosika <eri@plasticlabs.ai>	2026-03-30 20:36:56 -07:00
Teknium	ffd5d37f9b	fix: treat non-sk-ant- keys as regular API keys, not OAuth tokens (#4093 ) * fix: treat non-sk-ant- prefixed keys (Azure AI Foundry) as regular API keys, not OAuth tokens * fix: treat non-sk-ant- keys as regular API keys, not OAuth tokens _is_oauth_token() returned True for any key not starting with sk-ant-api, misclassifying Azure AI Foundry keys as OAuth tokens and sending Bearer auth instead of x-api-key → 401 rejection. Real Anthropic OAuth tokens all start with sk-ant-oat (confirmed from live .credentials.json). Non-sk-ant- keys are third-party provider keys that should use x-api-key. Test fixtures updated to use realistic sk-ant-oat01- prefixed tokens instead of fake strings. Salvaged from PR #4075 by @HangGlidersRule. --------- Co-authored-by: Clawdbot <clawdbot@openclaw.ai>	2026-03-30 17:41:13 -07:00
Teknium	3cc50532d1	fix: auxiliary client uses placeholder key for local servers without auth (#3842 ) Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't require API keys, but the auxiliary client's _resolve_custom_runtime() rejected endpoints with empty keys — causing the auto-detection chain to skip the user's local server entirely. This broke compression, summarization, and memory flush for users running local models without an OpenRouter/cloud API key. The main CLI already had this fix (PR #2556, 'no-key-required' placeholder), but the auxiliary client's resolution path was missed. Two fixes: - _resolve_custom_runtime(): use 'no-key-required' placeholder instead of returning None when base_url is present but key is empty - resolve_provider_client() custom branch: same placeholder fallback for explicit_base_url without explicit_api_key Updates 2 tests that expected the old (broken) behavior.	2026-03-29 21:05:36 -07:00
Teknium	5a1e2a307a	perf(ttft): salvage easy-win startup optimizations from #3346 (#3395 ) * perf(ttft): dedupe shared tool availability checks * perf(ttft): short-circuit vision auto-resolution * perf(ttft): make Claude Code version detection lazy * perf(ttft): reuse loaded toolsets for skills prompt --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-27 07:49:44 -07:00
Teknium	a8e02c7d49	fix: align Nous Portal model slugs with OpenRouter naming (#3253 ) Nous Portal now passes through OpenRouter model names and routes from there. Update the static fallback model list and auxiliary client default to use OpenRouter-format slugs (provider/model) instead of bare names. - _PROVIDER_MODELS['nous']: full OpenRouter catalog - _NOUS_MODEL: google/gemini-3-flash-preview (was gemini-3-flash) - Updated 4 test assertions for the new default model name	2026-03-26 13:49:43 -07:00

1 2

70 commits