Follow-up for #3171 cherry-pick — the contributor's validation block
called get_provider_credentials() which doesn't exist on current main.
Replaces it with get_auth_status() limited to API-key providers in
PROVIDER_REGISTRY so providers without a registry entry (openrouter,
anthropic, custom) don't trigger false 'not authenticated' failures.
Also runs the provider name through resolve_provider() so aliases like
'glm'/'moonshot' validate correctly.
Adds StefanIsMe to AUTHOR_MAP.
Third-party gateways that speak the native Anthropic protocol (MiniMax,
Zhipu GLM, Alibaba DashScope, Kimi, LiteLLM proxies) now work end-to-end
with the same feature set as direct api.anthropic.com callers. Synthesizes
eight stale community PRs into one consolidated change.
Five fixes:
- URL detection: consolidate three inline `endswith("/anthropic")`
checks in runtime_provider.py into the shared _detect_api_mode_for_url
helper. Third-party /anthropic endpoints now auto-resolve to
api_mode=anthropic_messages via one code path instead of three.
- OAuth leak-guard: all five sites that assign `_is_anthropic_oauth`
(__init__, switch_model, _try_refresh_anthropic_client_credentials,
_swap_credential, _try_activate_fallback) now gate on
`provider == "anthropic"` so a stale ANTHROPIC_TOKEN never trips
Claude-Code identity injection on third-party endpoints. Previously
only 2 of 5 sites were guarded.
- Prompt caching: new method `_anthropic_prompt_cache_policy()` returns
`(should_cache, use_native_layout)` per endpoint. Replaces three
inline conditions and the `native_anthropic=(api_mode=='anthropic_messages')`
call-site flag. Native Anthropic and third-party Anthropic gateways
both get the native cache_control layout; OpenRouter gets envelope
layout. Layout is persisted in `_primary_runtime` so fallback
restoration preserves the per-endpoint choice.
- Auxiliary client: `_try_custom_endpoint` honors
`api_mode=anthropic_messages` and builds `AnthropicAuxiliaryClient`
instead of silently downgrading to an OpenAI-wire client. Degrades
gracefully to OpenAI-wire when the anthropic SDK isn't installed.
- Config hygiene: `_update_config_for_provider` (hermes_cli/auth.py)
clears stale `api_key`/`api_mode` when switching to a built-in
provider, so a previous MiniMax custom endpoint's credentials can't
leak into a later OpenRouter session.
- Truncation continuation: length-continuation and tool-call-truncation
retry now cover `anthropic_messages` in addition to `chat_completions`
and `bedrock_converse`. Reuses the existing `_build_assistant_message`
path via `normalize_anthropic_response()` so the interim message
shape is byte-identical to the non-truncated path.
Tests: 6 new files, 42 test cases. Targeted run + tests/run_agent,
tests/agent, tests/hermes_cli all pass (4554 passed).
Synthesized from (credits preserved via Co-authored-by trailers):
#7410 @nocoo — URL detection helper
#7393 @keyuyuan — OAuth 5-site guard
#7367 @n-WN — OAuth guard (narrower cousin, kept comment)
#8636 @sgaofen — caching helper + native-vs-proxy layout split
#10954 @Only-Code-A — caching on anthropic_messages+Claude
#7648 @zhongyueming1121 — aux client anthropic_messages branch
#6096 @hansnow — /model switch clears stale api_mode
#9691 @TroyMitchell911 — anthropic_messages truncation continuation
Closes: #7366, #8294 (third-party Anthropic identity + caching).
Supersedes: #7410, #7367, #7393, #8636, #10954, #7648, #6096, #9691.
Rejects: #9621 (OpenAI-wire caching with incomplete blocklist — risky),
#7242 (superseded by #9691, stale branch),
#8321 (targets smart_model_routing which was removed in #12732).
Co-authored-by: nocoo <nocoo@users.noreply.github.com>
Co-authored-by: Keyu Yuan <leoyuan0099@gmail.com>
Co-authored-by: Zoee <30841158+n-WN@users.noreply.github.com>
Co-authored-by: sgaofen <135070653+sgaofen@users.noreply.github.com>
Co-authored-by: Only-Code-A <bxzt2006@163.com>
Co-authored-by: zhongyueming <mygamez@163.com>
Co-authored-by: Xiaohan Li <hansnow@users.noreply.github.com>
Co-authored-by: Troy Mitchell <i@troy-y.org>
The example-skin.yaml was removed as part of the stale docs cleanup.
Docusaurus features/skins.md covers the same material.
Also update AUTHOR_MAP for balyan.sid@gmail.com → alt-glitch (actual
GitHub login; balyansid returns 404).
Salvaged commit 0c652e9b in this branch is authored by taeng02@icloud.com.
check-attribution CI blocks PRs whose new author emails aren't in
AUTHOR_MAP, so add the mapping to unblock #12680's salvage PR.
GitHub username confirmed via `gh api users/taeng0204` (Taein Lim).
* [verified] fix(mcp-oauth): bridge httpx auth_flow bidirectional generator
HermesMCPOAuthProvider.async_auth_flow wrapped the SDK's auth_flow with
'async for item in super().async_auth_flow(request): yield item', which
discards httpx's .asend(response) values and resumes the inner generator
with None. This broke every OAuth MCP server on the first HTTP response
with 'NoneType' object has no attribute 'status_code' crashing at
mcp/client/auth/oauth2.py:505.
Replace with a manual bridge that forwards .asend() values into the
inner generator, preserving httpx's bidirectional auth_flow contract.
Add tests/tools/test_mcp_oauth_bidirectional.py with two regression
tests that drive the flow through real .asend() round-trips. These
catch the bug at the unit level; prior tests only exercised
_initialize() and disk-watching, never the full generator protocol.
Verified against BetterStack MCP:
Before: 'Connection failed (11564ms): NoneType...' after 3 retries
After: 'Connected (2416ms); Tools discovered: 83'
Regression from #11383.
* [verified] fix(mcp-oauth): seed token_expiry_time + pre-flight AS discovery on cold-load
PR #11383's consolidation fixed external-refresh reloading and 401 dedup
but left two latent bugs that surfaced on BetterStack and any other OAuth
MCP with a split-origin authorization server:
1. HermesTokenStorage persisted only a relative 'expires_in', which is
meaningless after a process restart. The MCP SDK's OAuthContext
does NOT seed token_expiry_time in _initialize, so is_token_valid()
returned True for any reloaded token regardless of age. Expired
tokens shipped to servers, and app-level auth failures (e.g.
BetterStack's 'No teams found. Please check your authentication.')
were invisible to the transport-layer 401 handler.
2. Even once preemptive refresh did fire, the SDK's _refresh_token
falls back to {server_url}/token when oauth_metadata isn't cached.
For providers whose AS is at a different origin (BetterStack:
mcp.betterstack.com for MCP, betterstack.com/oauth/token for the
token endpoint), that fallback 404s and drops into full browser
re-auth on every process restart.
Fix set:
- HermesTokenStorage.set_tokens persists an absolute wall-clock
expires_at alongside the SDK's OAuthToken JSON (time.time() + TTL
at write time).
- HermesTokenStorage.get_tokens reconstructs expires_in from
max(expires_at - now, 0), clamping expired tokens to zero TTL.
Legacy files without expires_at fall back to file-mtime as a
best-effort wall-clock proxy, self-healing on the next set_tokens.
- HermesMCPOAuthProvider._initialize calls super(), then
update_token_expiry on the reloaded tokens so token_expiry_time
reflects actual remaining TTL. If tokens are loaded but
oauth_metadata is missing, pre-flight PRM + ASM discovery runs
via httpx.AsyncClient using the MCP SDK's own URL builders and
response handlers (build_protected_resource_metadata_discovery_urls,
handle_auth_metadata_response, etc.) so the SDK sees the correct
token_endpoint before the first refresh attempt. Pre-flight is
skipped when there are no stored tokens to keep fresh-install
paths zero-cost.
Test coverage (tests/tools/test_mcp_oauth_cold_load_expiry.py):
- set_tokens persists absolute expires_at
- set_tokens skips expires_at when token has no expires_in
- get_tokens round-trips expires_at -> remaining expires_in
- expired tokens reload with expires_in=0
- legacy files without expires_at fall back to mtime proxy
- _initialize seeds token_expiry_time from stored tokens
- _initialize flags expired-on-disk tokens as is_token_valid=False
- _initialize pre-flights PRM + ASM discovery with mock transport
- _initialize skips pre-flight when no tokens are stored
Verified against BetterStack MCP:
hermes mcp test betterstack -> Connected (2508ms), 83 tools
mcp_betterstack_telemetry_list_teams_tool -> real team data, not
'No teams found. Please check your authentication.'
Reference: mcp-oauth-token-diagnosis skill, Fix A.
* chore: map hermes@noushq.ai to benbarclay in AUTHOR_MAP
Needed for CI attribution check on cherry-picked commits from PR #12025.
---------
Co-authored-by: Hermes Agent <hermes@noushq.ai>
Adds a regression guard for the #11277 → proxy-bypass regression fixed in
42b394c3. With HTTPS_PROXY / HTTP_PROXY / ALL_PROXY set, the custom httpx
transport used for TCP keepalives must still route requests through an
HTTPProxy pool; without proxy env, no HTTPProxy mount should exist.
Also maps zrc <zhurongcheng@rcrai.com> → heykb in scripts/release.py
AUTHOR_MAP so the salvage PR passes the author-attribution CI check.
On top of the salvaged PR #12505 (Jason/farion1231, which adds dict-format
models: enumeration to both sections), three section-3 refinements from
competing PR #11534 (YangManBOBO):
- accept base_url as canonical (matches Hermes's writer and custom_providers
entries); keep api/url as fallbacks for legacy/hand-edited configs
- accept singular model as a default_model synonym, matching custom_providers
- add seen_slugs guard so the same provider slug appearing in both
providers: dict and custom_providers: list emits exactly one picker row
(providers: dict wins since section 3 runs first)
Two regression tests cover the new behavior. AUTHOR_MAP entry added for
farion1231 so CI doesn't reject the cherry-picked commit.
Follow-up for the helix4u easy-fix salvage batch:
- route remaining context-engine quiet-mode output through
_should_emit_quiet_tool_messages() so non-CLI/library callers stay
silent consistently
- drop the extra senderAliases computation from WhatsApp allowlist-drop
logging and remove the now-unused import
This keeps the batch scoped to the intended fixes while avoiding
leaked quiet-mode output and unnecessary duplicate work in the bridge.
The cherry-picked commit from #11434 uses the 154585401+ prefixed
noreply format. Add it alongside the existing bare entry so the
contributor audit passes.
Based on #12152 by @LVT382009.
Two fixes to run_agent.py:
1. _ephemeral_max_output_tokens consumption in chat_completions path:
The error-recovery ephemeral override was only consumed in the
anthropic_messages branch of _build_api_kwargs. All chat_completions
providers (OpenRouter, NVIDIA NIM, Qwen, Alibaba, custom, etc.)
silently ignored it. Now consumed at highest priority, matching the
anthropic pattern.
2. NVIDIA NIM max_tokens default (16384):
NVIDIA NIM falls back to a very low internal default when max_tokens
is omitted, causing models like GLM-4.7 to truncate immediately
(thinking tokens exhaust the budget before the response starts).
3. Progressive length-continuation boost:
When finish_reason='length' triggers a continuation retry, the output
budget now grows progressively (2x base on retry 1, 3x on retry 2,
capped at 32768) via _ephemeral_max_output_tokens. Previously the
retry loop just re-sent the same token limit on all 3 attempts.
Based on #11984 by @maxchernin. Fixes#8259.
Some providers (MiniMax M2.7 via NVIDIA NIM) resend the full function
name in every streaming chunk instead of only the first. The old
accumulator used += which concatenated them into 'read_fileread_file'.
Changed to simple assignment (=), matching the OpenAI Node SDK, LiteLLM,
and Vercel AI SDK patterns. Function names are atomic identifiers
delivered complete — no provider splits them across chunks, so
concatenation was never correct semantics.
Twelve tests under TestCJKSearchFallback guarding:
- CJK detection across Chinese/Japanese/Korean/Hiragana/Katakana ranges
(including the full Hangul syllables block \uac00-\ud7af, to catch
the shorter-range typo from one of the duplicate PRs)
- Substring match for multi-char Chinese, Japanese, Korean queries
- Filter preservation (source_filter, exclude_sources, role_filter)
in the LIKE path — guards against the SQL-builder bug from another
duplicate PR where filter clauses landed after LIMIT/OFFSET
- Snippet centered on the matched term (instr-based substr window),
not the leading 200 chars of content
- English fast-path untouched
- Empty/no-match cases
- Mixed CJK+English queries
Also:
- hermes_state.py: LIKE-fallback snippet is now
`substr(content, max(1, instr(content, ?) - 40), 120)`, centered on
the match instead of the whole-content default. Credit goes to
@iamagenius00 for the snippet idea in PR #11517.
- scripts/release.py: add @iamagenius00 to AUTHOR_MAP so future
release attribution resolves cleanly.
Refs #11511, #11516, #11517, #11541.
Co-authored-by: iamagenius00 <iamagenius00@users.noreply.github.com>
Follow-up polish on top of the cherry-picked #11023 commit.
- feishu_comment_rules.py: replace import-time "~/.hermes" expanduser fallback
with get_hermes_home() from hermes_constants (canonical, profile-safe).
- tools/feishu_doc_tool.py, tools/feishu_drive_tool.py: drop the
asyncio.get_event_loop().run_until_complete(asyncio.to_thread(...)) dance.
Tool handlers run synchronously in a worker thread with no running loop, so
the RuntimeError branch was always the one that executed. Calls client.request
directly now. Unused asyncio import removed.
- tests/gateway/test_feishu.py: add register_p2_customized_event to the mock
EventDispatcher builder so the existing adapter test matches the new handler
registration for drive.notice.comment_add_v1.
- scripts/release.py: map liujinkun@bytedance.com -> liujinkun2025 for
contributor attribution on release notes.
Follow-up on the native NVIDIA NIM provider salvage. The original PR wired
PROVIDER_REGISTRY + HERMES_OVERLAYS correctly but missed several touchpoints
required for full parity with other OpenAI-compatible providers (xai,
huggingface, deepseek, zai).
Gaps closed:
- hermes_cli/main.py:
- Add 'nvidia' to the _model_flow_api_key_provider dispatch tuple so
selecting 'NVIDIA NIM' in `hermes model` actually runs the api-key
provider flow (previously fell through silently).
- Add 'nvidia' to `hermes chat --provider` argparse choices so the
documented test command (`hermes chat --provider nvidia --model ...`)
parses successfully.
- hermes_cli/config.py: Register NVIDIA_API_KEY and NVIDIA_BASE_URL in
OPTIONAL_ENV_VARS so setup wizard can prompt for them and they're
auto-added to the subprocess env blocklist.
- hermes_cli/doctor.py: Add NVIDIA NIM row to `_apikey_providers` so
`hermes doctor` probes https://integrate.api.nvidia.com/v1/models.
- hermes_cli/dump.py: Add NVIDIA_API_KEY → 'nvidia' mapping for
`hermes dump` credential masking.
- tests/tools/test_local_env_blocklist.py: Extend registry_vars fixture
with NVIDIA_API_KEY to verify it's blocked from leaking into subprocesses.
- agent/model_metadata.py: Add 'nemotron' → 131072 context-length entry
so all Nemotron variants get 128K context via substring match (rather
than falling back to MINIMUM_CONTEXT_LENGTH).
- hermes_cli/models.py: Fix hallucinated model ID
'nvidia/nemotron-3-nano-8b-a4b' → 'nvidia/nemotron-3-nano-30b-a3b'
(verified against live integrate.api.nvidia.com/v1/models catalog).
Expand curated list from 5 to 9 agentic models mapping to OpenRouter
defaults per provider-guide convention: add qwen3.5-397b-a17b,
deepseek-v3.2, llama-3.3-nemotron-super-49b-v1.5, gpt-oss-120b.
- cli-config.yaml.example: Document 'nvidia' provider option.
- scripts/release.py: Map asurla@nvidia.com → anniesurla in AUTHOR_MAP
for CI attribution.
E2E verified: `hermes chat --provider nvidia ...` now reaches NVIDIA's
endpoint (returns 401 with bogus key instead of argparse error);
`hermes doctor` detects NVIDIA NIM when NVIDIA_API_KEY is set.
Follow-ups to the salvaged commits in this PR:
* gateway/config.py — strip trailing whitespace from youngDoo's diff
(line 315 had ~140 trailing spaces).
* hermes_cli/tools_config.py — replace `config.get("platform_toolsets", {})`
with `config.get("platform_toolsets") or {}`. Handles the case where the
YAML key is present but explicitly null (parses as None, previously
crashed with AttributeError on the next line's .get(platform)).
Cherry-picked from yyq4193's #9003 with attribution.
* tests/gateway/test_config.py — 4 new tests for TestGetConnectedPlatforms
covering DingTalk via extras, via env vars, disabled, and missing creds.
* tests/hermes_cli/test_tools_config.py — regression test for the null
platform_toolsets edge case.
* scripts/release.py — add kagura-agent, youngDoo, yyq4193 to AUTHOR_MAP.
Co-authored-by: yyq4193 <39405770+yyq4193@users.noreply.github.com>
* test: make test env hermetic; enforce CI parity via scripts/run_tests.sh
Fixes the recurring 'works locally, fails in CI' (and vice versa) class
of flakes by making tests hermetic and providing a canonical local runner
that matches CI's environment.
## Layer 1 — hermetic conftest.py (tests/conftest.py)
Autouse fixture now unsets every credential-shaped env var before every
test, so developer-local API keys can't leak into tests that assert
'auto-detect provider when key present'.
Pattern: unset any var ending in _API_KEY, _TOKEN, _SECRET, _PASSWORD,
_CREDENTIALS, _ACCESS_KEY, _PRIVATE_KEY, etc. Plus an explicit list of
credential names that don't fit the suffix pattern (AWS_ACCESS_KEY_ID,
FAL_KEY, GH_TOKEN, etc.) and all the provider BASE_URL overrides that
change auto-detect behavior.
Also unsets HERMES_* behavioral vars (HERMES_YOLO_MODE, HERMES_QUIET,
HERMES_SESSION_*, etc.) that mutate agent behavior.
Also:
- Redirects HOME to a per-test tempdir (not just HERMES_HOME), so
code reading ~/.hermes/* directly can't touch the real dir.
- Pins TZ=UTC, LANG=C.UTF-8, LC_ALL=C.UTF-8, PYTHONHASHSEED=0 to
match CI's deterministic runtime.
The old _isolate_hermes_home fixture name is preserved as an alias so
any test that yields it explicitly still works.
## Layer 2 — scripts/run_tests.sh canonical runner
'Always use scripts/run_tests.sh, never call pytest directly' is the
new rule (documented in AGENTS.md). The script:
- Unsets all credential env vars (belt-and-suspenders for callers
who bypass conftest — e.g. IDE integrations)
- Pins TZ/LANG/PYTHONHASHSEED
- Uses -n 4 xdist workers (matches GHA ubuntu-latest; -n auto on
a 20-core workstation surfaces test-ordering flakes CI will never
see, causing the infamous 'passes in CI, fails locally' drift)
- Finds the venv in .venv, venv, or main checkout's venv
- Passes through arbitrary pytest args
Installs pytest-split on demand so the script can also be used to run
matrix-split subsets locally for debugging.
## Remove 3 module-level dotenv stubs that broke test isolation
tests/hermes_cli/test_{arcee,xiaomi,api_key}_provider.py each had a
module-level:
if 'dotenv' not in sys.modules:
fake_dotenv = types.ModuleType('dotenv')
fake_dotenv.load_dotenv = lambda *a, **kw: None
sys.modules['dotenv'] = fake_dotenv
This patches sys.modules['dotenv'] to a fake at import time with no
teardown. Under pytest-xdist LoadScheduling, whichever worker collected
one of these files first poisoned its sys.modules; subsequent tests in
the same worker that imported load_dotenv transitively (e.g.
test_env_loader.py via hermes_cli.env_loader) got the no-op lambda and
saw their assertions fail.
dotenv is a required dependency (python-dotenv>=1.2.1 in pyproject.toml),
so the defensive stub was never needed. Removed.
## Validation
- tests/hermes_cli/ alone: 2178 passed, 1 skipped, 0 failed (was 4
failures in test_env_loader.py before this fix)
- tests/test_plugin_skills.py, tests/hermes_cli/test_plugins.py,
tests/test_hermes_logging.py combined: 123 passed (the caplog
regression tests from PR #11453 still pass)
- Local full run shows no F/E clusters in the 0-55% range that were
previously present before the conftest hardening
## Background
See AGENTS.md 'Testing' section for the full list of drift sources
this closes. Matrix split (closed as #11566) will be re-attempted
once this foundation lands — cross-test pollution was the root cause
of the shard-3 hang in that PR.
* fix(conftest): don't redirect HOME — it broke CI subprocesses
PR #11577's autouse fixture was setting HOME to a per-test tempdir.
CI started timing out at 97% complete with dozens of E/F markers and
orphan python processes at cleanup — tests (or transitive deps)
spawn subprocesses that expect a stable HOME, and the redirect broke
them in non-obvious ways.
Env-var unsetting and TZ/LANG/hashseed pinning (the actual CI-drift
fixes) are unchanged and still in place. HERMES_HOME redirection is
also unchanged — that's the canonical way to isolate tests from
~/.hermes/, not HOME.
Any code in the codebase reading ~/.hermes/* via `Path.home() / ".hermes"`
instead of `get_hermes_home()` is a bug to fix at the callsite, not
something to paper over in conftest.
Adds 15 regression tests for hermes_cli/dingtalk_auth.py covering:
* _api_post — network error mapping, errcode-nonzero mapping, success path
* begin_registration — 2-step chain, missing-nonce/device_code/uri
error cases
* wait_for_registration_success — success path, missing-creds guard,
on_waiting callback invocation
* render_qr_to_terminal — returns False when qrcode missing, prints
when available
* Configuration — BASE_URL default + override, SOURCE default
Also adds a one-line disclosure in dingtalk_qr_auth() telling users
the scan page will be OpenClaw-branded. Interim measure: DingTalk's
registration portal is hardcoded to route all sources to /openapp/
registration/openClaw, so users see OpenClaw branding regardless of
what 'source' value we send. We keep 'openClaw' as the source token
until DingTalk-Real-AI registers a Hermes-specific template.
Also adds meng93 to scripts/release.py AUTHOR_MAP.