Fixes#24809
The docs site uses baseUrl='/docs/' but the <img> tags in sessions.md
and cli.md referenced images at /img/docs/... which resolves to a 404.
The static files are served at /docs/img/docs/... instead.
Before: <img src="/img/docs/session-recap.svg"> → 404
After: <img src="/docs/img/docs/session-recap.svg"> → 200
Also fixes cli-layout.svg which had the same issue.
Use the current top-level fallback_providers list in fallback docs and keep fallback_model documented only as the legacy compatibility shape. Also align cron and delegation fallback coverage with current runtime behavior.
Closes#19691
Co-authored-by: Codex <codex@openai.com>
- Replace outdated linear ordering in prompt-assembly guide with
current stable/context/volatile tier contract from system_prompt.py
- Clarify where memory/profile snapshots live versus skills guidance
- Document that pre_llm_call context is user-message injection, not
cached system-prompt mutation
- Update architecture guide wording to reference system_prompt.py +
prompt_builder.py tiered assembly
Closes#34118
The rotation flowchart only described the generic 'retry once, rotate on
second 429' path. ChatGPT/Codex plan-limit 429s carry a usage_limit_reached
reason and rotate to the next pool key immediately (no retry, since the cap
won't clear on retry). Document that case so the docs match the code.
* chore(models): swap gemini-3-flash-preview for gemini-3.5-flash in OpenRouter + Nous lists
* chore(models): regenerate model-catalog.json for gemini-3.5-flash swap
Follow-up for #30240 — the new page was not referenced in sidebars.ts,
leaving it orphaned (unreachable via nav and flagged as a broken relative
link to ./profiles.md). Added under Using Hermes after profile-distributions.
Covers running multiple Hermes profiles as managed services on one host:
- A shell-loop wrapper pattern for start/stop/restart/status across every
profile (the per-profile CLI commands stay unchanged).
- Per-platform service file locations (LaunchAgent on macOS, systemd user
unit on Linux), plus the rules around clashes.
- Log paths per profile and how to tail every gateway at once.
- Config file layout per profile and the restart-after-edit workflow.
- Keeping the host awake: caffeinate flags on macOS,
systemd-inhibit + loginctl enable-linger on Linux.
- Token-conflict auditing across .env files.
- Troubleshooting for the common "Could not find service in domain for
user gui: 501" message and stale PIDs after a crash.
Tested locally with five profiles on macOS launchd.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The setup guide was missing the specific Feishu permission scopes to
configure and the event subscription (im.message.receive_v1) needed
for the bot to receive messages. Users had to reference external
OpenClaw documentation to complete the setup.
Adds:
- Required permissions table (im:message, im:message:send_as_bot,
im:resource, im:chat, im:chat:readonly)
- Recommended permissions (reactions, app info, contact)
- Event subscription step (im.message.receive_v1)
- App version publish reminder (permissions require published version)
Tool Search read its catalog from the global registry (get_tool_definitions
with no toolset scope = 'start with everything'), so a restricted-toolset
session — subagent, kanban worker, curated gateway session — could:
1. tool_search the entire process registry, not just its granted tools, and
2. tool_call any registered plugin/MCP tool it was never given, because
registry.dispatch() has no enabled_tools gate for non-execute_code tools.
A scoped session (enabled_toolsets=['mcp-github']) reported total_available=26
and successfully invoked an out-of-scope plugin tool via tool_call.
Fix:
- handle_function_call gains enabled_toolsets/disabled_toolsets; the bridge
dispatch scopes get_tool_definitions to them (also stops polluting the
process-global _last_resolved_tool_names with out-of-scope tools, which
leaked into execute_code's sandbox-tool fallback).
- A defense-in-depth gate rejects any tool_call'd name not in the scoped
deferrable catalog.
- tool_executor's unwrap (both concurrent + sequential paths) enforces the
same scope before dispatch, since it unwraps tool_call -> underlying name
and bypasses the bridge branch. New _tool_search_scoped_names() helper,
cached per-agent on registry generation + toolset scope.
- New scoped_deferrable_names() helper in tool_search.py shared by both sites.
Tests: 4 new regression tests in TestRegression_ToolsetScoping (scoped
catalog, out-of-scope tool_call rejection, no global pollution, helper).
Adds Tool Search, a structured-tools progressive-disclosure layer that
replaces MCP and non-core plugin tools in the model-visible tools array
with three bridge tools (tool_search / tool_describe / tool_call) when
the deferrable surface would consume more than a configurable percentage
of the active model's context window. Core Hermes tools are never deferred.
Default mode is 'auto' with a 10% context threshold, so small toolsets
pay no overhead. Set tools.tool_search.enabled to 'on' to force or 'off'
to disable.
Design carefully reflects the OpenClaw production failure modes
documented in the openclaw-tool-search-report:
- Core tools never defer (toolsets._HERMES_CORE_TOOLS). Addresses the
'tools silently missing from isolated cron turns' regression class
(openclaw#84141) by construction: there is no code path that can
drop a core tool.
- Catalog is stateless across turns — rebuilt from the live tool-defs
list on every assembly. No session-keyed Map that can drift out of
sync with the registry.
- tool_call unwraps the bridge call before any hook fires, so plugin
pre/post hooks, guardrails, approval flows, and the activity feed
all see the underlying tool name, not the bridge (addresses
openclaw#85588 and the verbose-mode complaint on openclaw#79823).
- The unwrap happens in both the parallel and sequential paths of
agent/tool_executor.py and also in handle_function_call, so direct
callers (sandboxed code, eval harnesses) are covered too.
- Bridge tools cannot invoke each other (recursion guard) and cannot
invoke core tools (those must be called directly).
- Tools mode only — no JS-sandbox code-mode. Keeps the surface small.
- Token estimation via cheap char/4 heuristic; precision isn't needed
for the threshold decision.
Files:
- tools/tool_search.py — new module (BM25 retrieval, classification,
threshold gate, bridge dispatch, unwrap helper).
- tests/tools/test_tool_search.py — 35 tests including the OpenClaw
#84141 regression guard.
- model_tools.py — wires assembly into _compute_tool_definitions as the
final step, adds skip_tool_search_assembly kwarg so the bridge can
see the real catalog, dispatches the three bridge tools.
- agent/tool_executor.py — unwraps tool_call in both parallel and
sequential parsing loops so checkpointing, guardrails, plugin hooks,
and tool-progress callbacks all observe the underlying tool name.
- hermes_cli/config.py — DEFAULT_CONFIG['tools']['tool_search'] block.
- website/docs/user-guide/features/tool-search.md — user docs.
Validation:
- 35/35 new tests pass.
- Existing tool/registry/model_tools/config/coercion/executor tests
(82 + 74 + small adjacents) green.
- Live E2E: 20 fake MCP tools registered, get_tool_definitions returns
3 bridges, tool_search returns top 3 hits, tool_describe returns
full schema, tool_call dispatches to the real underlying handler
and the underlying result is what the model sees.
- Reserved-name recursion guard verified live.
- Core-tool refusal via tool_call verified live.
The original PR diff updated two guides (oauth-over-ssh.md and
xai-grok-oauth.md) but only the oauth-over-ssh.md edit landed in the
PR's actual commit. Mirror the note to the primary xai-grok-oauth.md
guide too so users reading the main entry point don't miss the
bare-code form that already shipped in #33880.
The docs site (Vercel) serves /docs/api/model-catalog.json behind a bot
mitigation rule that returns HTTP 403 + x-vercel-mitigated: challenge for
non-browser User-Agents — including urllib (what the CLI uses) and curl.
When that happens, get_catalog() falls back to the stale disk cache and
new model releases (Opus 4.8, etc.) never reach the /model picker even
though they're already in OPENROUTER_MODELS and the live OpenRouter API.
Adds a fallback URL chain: when the primary catalog URL fails, walk
DEFAULT_CATALOG_FALLBACK_URLS — currently the raw.githubusercontent.com
copy of the same file. GitHub raw doesn't bot-gate, so the manifest stays
reachable through Vercel firewall hiccups. Per-provider override URLs
keep their direct-fetch semantics (operators configure those specifically,
no implicit fallback).
Also swaps stepfun/step-3.5-flash for stepfun/step-3.7-flash in the
OpenRouter + Nous Portal curated picker lists. Native stepfun provider
configuration (api.stepfun.ai) is left alone — that depends on what
stepfun.ai itself serves, not what OpenRouter routes.
Test plan: 5 new TestFallbackChain tests cover primary-success,
primary-failure-fallback-success, all-fail, primary==fallback-dedup, and
end-to-end get_catalog routing through the new helper. Existing 23 tests
in test_model_catalog.py still pass (28 total). Wider tests/hermes_cli/
sweep: 5701/5701 pass.
Salvages #25872 by @konsisumer against current main.
NAS users (UGOS, Synology, unRAID) expect the LinuxServer.io
PUID/PGID convention and bind-mount /opt/data from a host directory
owned by their own UID. Without this alias those vars are silently
ignored and the s6-setuidgid drop to UID 10000 leaves the runtime
unable to read the volume. HERMES_UID/HERMES_GID still take
precedence when both are set.
The original PR targeted docker/entrypoint.sh, which is now a 27-line
deprecation shim under s6-overlay (the May 2026 rework moved all
bootstrap logic to docker/stage2-hook.sh, installed as
/etc/cont-init.d/01-hermes-setup). Re-applied the same 2-line
alias resolution at the equivalent spot in stage2-hook.sh just
before the existing UID/GID remap block. Test was retargeted at
docker/stage2-hook.sh; docs hunk adapted to current main's wording
("stage2 hook" + s6-setuidgid, not the obsolete "entrypoint drops
via gosu") with the NAS bind-mount example preserved verbatim.
Test-first regression verification: reverted just docker/stage2-hook.sh
to origin/main and re-ran the new tests. Result:
FAILED test_stage2_hook_resolves_puid_pgid_aliases
FAILED test_puid_pgid_populate_hermes_uid_gid
AssertionError: assert ':' == '1000:10'
That's the exact bug shape — PUID=1000 PGID=10 silently ignored,
HERMES_UID/HERMES_GID stay empty. With the salvage applied, all 4
tests pass.
Closes#25872
Co-authored-by: konsisumer <11262660+konsisumer@users.noreply.github.com>
Remove unused imports (F401) and duplicate/shadowed import
redefinitions (F811) across the codebase using ruff's safe
autofixes. No behavioral changes -- imports only.
- ~1400 safe autofixes applied across 644 files (net -1072 lines)
- __init__.py re-exports preserved (excluded from F401 removal so
public re-export surfaces stay intact)
- Re-exports that are imported or monkeypatched by tests but look
unused in their defining module are kept with explicit # noqa:
F401 (gateway/run.py load_dotenv; run_agent re-exports from
agent.message_sanitization, agent.context_compressor,
agent.retry_utils, agent.prompt_builder, agent.process_bootstrap,
agent.codex_responses_adapter)
- Unsafe F841 (unused-variable) fixes deliberately skipped -- those
can change behavior when the RHS has side effects
- ruff lints remain disabled in pyproject.toml (only PLW1514 is
selected); this is a one-time cleanup, not a config change
Verification:
- python -m compileall: clean
- pytest --collect-only: all 27161 tests collect (zero import errors)
- core entry points import clean (run_agent, model_tools, cli,
toolsets, hermes_state, batch_runner, gateway)
- static scan: every name any test imports directly from an edited
module still resolves
NVIDIA's verified skills catalog (https://github.com/NVIDIA/skills) ships
NVIDIA-signed skills for CUDA-X, AIQ, cuOpt, cuPyNumeric, DeepStream, NeMo,
NemoClaw and the Skill Card Generator — each bundle carrying a detached
`skill.oms.sig` signature, a governance `skill-card.md`, and `evals/`. The
sync pipeline drops any skill missing those artifacts before publishing.
Changes:
- tools/skills_hub.py: add NVIDIA/skills to GitHubSource.DEFAULT_TAPS so
it lights up in `hermes skills browse`, `hermes skills search <q>`, the
twice-daily skills-index build, and the docs-site Skills Hub page
(https://hermes-agent.nousresearch.com/docs/skills) automatically.
- tools/skills_guard.py: add NVIDIA/skills to TRUSTED_REPOS so installs
resolve to trust_level="trusted" (looser install policy than community).
- website/scripts/extract-skills.py: map the `github` source id to a
friendly "NVIDIA" pill label for the docs hub page.
- website/src/pages/skills/index.tsx: register the NVIDIA pill (green
#76b900) and slot it into SOURCE_ORDER after HuggingFace.
- website/docs/user-guide/features/skills.md (+ zh-Hans i18n): document
the new default tap and the expanded trusted-repos list.
- tests/tools/test_skills_guard.py: assert NVIDIA/skills resolves to
"trusted" (including the skills-sh-wrapped form).
- tests/tools/test_skills_hub.py: invariant — every TRUSTED_REPOS entry
must be reachable via GitHubSource.DEFAULT_TAPS (prevents future
trusted repos from being declared but never browseable).
Validation:
- Live GitHub fetch: `src.fetch('NVIDIA/skills/skills/aiq-deploy')` pulled
17 files including SKILL.md (13 KB), skill-card.md, skill.oms.sig, and
the full references/ + evals/ tree. trust_level="trusted".
- Live inspect resolved name, description, and trust correctly.
- All 193 existing skills_guard + skills_hub tests still pass.
The page was last meaningfully rewritten in the pre-s6 (tini) era and had
drifted on five points that no longer matched the image:
1. "Running the dashboard" claimed the entrypoint backgrounds
`hermes dashboard` and prefixes its output with `[dashboard]`. That
was the pre-s6 entrypoint.sh path; under s6 the dashboard is a
supervised s6-rc service (`docker/s6-rc.d/dashboard/run`) with no
sed-prefix pipeline. Rewrote the section accordingly.
2. The default for `HERMES_DASHBOARD_HOST` was documented as
`127.0.0.1`. The s6 run script defaults it to `0.0.0.0`
(`dash_host="${HERMES_DASHBOARD_HOST:-0.0.0.0}"`). Fixed the table
and the surrounding prose.
3. Multi-profile was documented as "not recommended in Docker — run
one container per profile." That advice was load-bearing when
there was no in-container supervisor, but the s6 architecture
explicitly adds per-profile gateway supervision: each profile
created via `hermes profile create <name>` gets a slot under
`/run/service/gateway-<name>/`, the `02-reconcile-profiles`
cont-init script restores them across `docker restart` from
`gateway_state.json`, and `hermes gateway start/stop/restart` is
intercepted by `_dispatch_via_service_manager_if_s6` to route
through `s6-svc`. Pivoted the section to "one container, many
supervised profile gateways" as the default, with a comparison
table and a "When you DO want a separate container" escape
hatch for the genuine resource-isolation / network-segmentation
cases.
4. The Compose example trailer also claimed `[dashboard]` log
prefixing. Replaced with the actual log routing.
5. Added a new "Where the logs go" section covering all four log
surfaces: per-profile gateways (tee'd to `docker logs` AND
`${HERMES_HOME}/logs/gateways/<profile>/current` since PR
b34532319), dashboard (`docker logs`, no prefix), boot reconciler
(`container-boot.log`), and `hermes logs`. The gateway-mode and
Compose sections cross-reference this rather than each carrying
their own routing prose.
Added a new "docker exec automatically drops to the hermes user"
subsection under "What the Dockerfile does", next to the existing
Privilege model warning. Documents the `/opt/hermes/bin/hermes` shim
(landed via the docker-exec privilege-drop work) — operators don't
need to remember `--user hermes` for `docker exec hermes login`,
`docker exec hermes profile create …`, etc. The historical footgun
(`auth.json` written as `root:root`, supervised gateway then can't
read its own auth file) is mentioned only as context for what the
fail-loud `exit 126` is protecting against, not as a problem the
reader needs to solve. The `HERMES_DOCKER_EXEC_AS_ROOT=1` opt-out is
documented for diagnostic sessions.
The "Permission denied" troubleshooting subsection now carries a
single-line pointer to the new section instead of duplicating it.
The `--insecure` framing reflects PR #fb5125362 (opt-in via
`HERMES_DASHBOARD_INSECURE`, not derived from bind host): the OAuth
gate is the authority, the bind host alone never implies
`--insecure`, and opting out is an explicit security trade-off.
Anchors verified resolve. i18n zh-Hans mirror left for the
translation flow to catch up.
Updates the Docker Backend section of the user-guide configuration page
to match the actual behavior shipped in PR #33645. Pre-PR the docs
claimed "container is stopped and removed on shutdown," which was
never quite true for the documented happy path and is now actively
wrong: in default mode the container survives across Hermes processes
so background processes (npm watchers, dev servers, long-running
pytest) carry over the way the "ONE long-lived container shared
across sessions" promise requires.
Changes to `website/docs/user-guide/configuration.md`:
* Reworked the intro paragraph at the top of the Docker Backend
section to describe the actual cross-process reuse contract.
* Expanded the YAML example with the new keys
`docker_persist_across_processes` and `docker_orphan_reaper`, plus
the pre-existing-but-undocumented `docker_env`, `timeout`, and
`lifetime_seconds`. Clarified the `container_persistent` comment
to disambiguate from `docker_persist_across_processes`.
* Added a `docker_env` vs `docker_forward_env` explainer (one
injects literal KEY=value, the other forwards values from the
host/.env — easy to confuse).
* Replaced the one-line "Container lifecycle" paragraph with a full
subsection covering:
- the three labels Hermes tags every container with
(hermes-agent, hermes-task-id, hermes-profile)
- the label-probe reuse mechanism on startup
- a teardown-trigger table with four rows for every situation
that destroys the container in default mode
- edge cases (OOM kill, profile switching)
* Added an "Environment variable overrides" table covering all
TERMINAL_* env vars relevant to the Docker backend, including the
previously-undocumented `TERMINAL_DOCKER_ENV` and
`HERMES_DOCKER_BINARY`.
Changes to `website/docs/user-guide/docker.md`:
* Extended the cross-link admonition (around l.227) so the
Hermes-in-Docker page points at the new terminal-backend keys
(`docker_env`, `docker_persist_across_processes`,
`docker_orphan_reaper`) alongside the ones already mentioned.
No code changes. Behavior already covered by tests added in earlier
commits on this branch (#33645 commits 1-5).
Refs #20561
Regression from PR #33809 (lazy-fetch refactor). The `sources` and
`categoryEntries` useMemo blocks were derived from `allSkillsLocal`
but had empty/incomplete deps arrays — so they computed once at mount
when the catalog was still `[]`, then never recomputed when the fetch
resolved.
Symptom: live site shows only the "All 87,639" source button and
"All Skills 87,639" category — no per-source pills (ClawHub, skills.sh,
LobeHub, etc.) and no category breakdown. Filtering by source/category
is unusable.
Fix: add `allSkillsLocal` to both deps arrays so they recompute when
data arrives. Local build green on en + zh-Hans.
The s6 dashboard run script flipped `--insecure` on whenever
`HERMES_DASHBOARD_HOST` was anything other than 127.0.0.1 / localhost.
That comment ("the dashboard refuses otherwise") predates the OAuth
auth gate: back when it was written, `start_server` would SystemExit
on any non-loopback bind, so the run script's `--insecure` was the
only way to make in-container deployments work at all.
The gate has since been replaced by `should_require_auth(host,
allow_public)`, which engages the OAuth flow when a
`DashboardAuthProvider` is registered (the bundled `dashboard_auth/nous`
provider auto-registers on `HERMES_DASHBOARD_OAUTH_CLIENT_ID`) and
fails closed with a specific operator-facing error when none is. The
host-derived `--insecure` ran upstream of all that and silently
disabled the gate on every container-deployed dashboard.
Most visible under the portal's wildcard-subdomain rollout: every Fly
machine binds 0.0.0.0 so the edge can reach Flycast, every machine
boots with the correct `HERMES_DASHBOARD_OAUTH_CLIENT_ID`, the nous
provider registers — and `/api/status` still returns
`{"auth_required": false, "auth_providers": ["nous"]}` because the
run script disabled the gate before `start_server` ever saw the
request. The dashboard SPA was served to anyone, no `/login` redirect,
no OAuth challenge.
Fix: derive `--insecure` from an explicit opt-in env var,
`HERMES_DASHBOARD_INSECURE` (truthy values matching the rest of the
s6 boolean envs: 1, true, TRUE, True, yes, YES, Yes). Operators on
trusted LANs behind a reverse proxy without the OAuth contract
(the existing `docker-compose.windows.yml` use case) opt in
explicitly; portal-managed agent deployments leave it unset and let
the gate engage.
`docker-compose.windows.yml` already passes `--insecure` on the
`command:` array directly (line 38), so it doesn't depend on the s6
auto-injection. No compose-file change required.
Tests:
* `tests/test_docker_home_override_scripts.py` — extends the existing
static-text guard with a regression assertion that the legacy
host-derived case-statement is gone and the new env-var opt-in is
present (locks against accidental revert).
* `tests/docker/test_dashboard.py` — adds two Docker-in-Docker tests
exercising the actual `/api/status` round-trip:
- 0.0.0.0 bind + `HERMES_DASHBOARD_OAUTH_CLIENT_ID` → gate engaged
- 0.0.0.0 bind + `HERMES_DASHBOARD_INSECURE=1` → gate disabled
Docs:
* `website/docs/user-guide/docker.md` + zh-Hans i18n — adds the new
env var to the table, replaces the stale prose ("the entrypoint
no longer auto-enables insecure mode" — which until this PR was
flat-out wrong) with an accurate description of the gate's
trigger conditions and the explicit opt-out.
shellcheck clean. Python static-text test passes locally. Behavioural
test will run against any future image build (CI's Docker harness).
Adds an optional `messages` keyword to the `MemoryProvider.sync_turn`
contract so external/community memory plugins can receive the OpenAI-style
conversation message list for the completed turn — including assistant tool
calls and tool result content — not just the final assistant text.
Dispatch uses signature inspection (`_provider_sync_accepts_messages`): only
providers that declare a `messages` parameter (or `**kwargs`) receive it; all
existing in-tree providers keep their legacy text-only signature and are
called unchanged. No structured-trace envelope is added to core — providers
reconstruct whatever they need from the standard message list.
Also documents Memori as a standalone community memory provider.
Salvaged from #28065 — rebased onto current main.
Co-authored-by: Dave Heritage <david@memorilabs.ai>
Anthropic released Claude Opus 4.8 on 2026-05-27, available on
OpenRouter, Anthropic, Amazon Bedrock, and Claude Platform on AWS:
- https://openrouter.ai/anthropic/claude-opus-4.8
- https://openrouter.ai/anthropic/claude-opus-4.8-fast
The fast-mode variant is a separate model ID (anthropic/claude-opus-4.8-fast)
priced at 2x of the base model — a notable improvement over the 6x premium
on older Opus generations (4.6/4.7). It is NOT a `speed: "fast"` request
parameter like Opus 4.6; Anthropic's native fast-mode beta still only
covers Opus 4.6.
Changes:
hermes_cli/models.py
- Add anthropic/claude-opus-4.8 + anthropic/claude-opus-4.8-fast to
the OpenRouter fallback snapshot and the Nous Portal curated list
(live catalogs surface them automatically when reachable; the
fallback list matters when the manifest fetch fails).
- Add claude-opus-4-8 to the Anthropic-native picker list.
agent/model_metadata.py
- Register claude-opus-4-8 / claude-opus-4.8 in DEFAULT_CONTEXT_LENGTHS
with 1M tokens (matches 4.6/4.7).
agent/anthropic_adapter.py
- Extend _XHIGH_EFFORT_SUBSTRINGS, _ADAPTIVE_THINKING_SUBSTRINGS, and
_NO_SAMPLING_PARAMS_SUBSTRINGS with "4-8"/"4.8". 4.8 inherits the
Opus 4.7 API contract: adaptive thinking only, xhigh effort level
supported, sampling parameters (temperature/top_p/top_k) return 400.
- Add claude-opus-4-8 to _ANTHROPIC_OUTPUT_LIMITS (128k max output,
same as 4.7). Matches by substring so claude-opus-4-8-fast and
date-stamped variants resolve correctly.
agent/usage_pricing.py
- Add anthropic/claude-opus-4-8: $5/$25 per MTok input/output, $0.50
cache read, $6.25 cache write (same as 4.6/4.7).
- Add anthropic/claude-opus-4-8-fast: $10/$50 per MTok (2x), $1.00
cache read, $12.50 cache write. Per OpenRouter, the 2x premium is
the only differentiator from regular Opus 4.8.
- OpenRouter routes still pull pricing from the live /models API, so
no static OpenRouter entry is needed.
tests/agent/test_model_metadata.py
- Extend the Claude 4.6+ context-length tag list with 4.8/4-8.
website/static/api/model-catalog.json
- Regenerated via `python scripts/build_model_catalog.py` to pick up
the new entries in the OpenRouter and Nous Portal fallback lists.
E2E verification (isolated sys.path import against the worktree):
- _supports_adaptive_thinking, _supports_xhigh_effort, _forbids_sampling_params
all return True for claude-opus-4.8 and claude-opus-4.8-fast.
- _supports_fast_mode (the `speed: "fast"` request-parameter gate) stays
False for 4.8 — fast mode is a separate model ID on OpenRouter, not a
parameter Anthropic accepts on the base model.
- DEFAULT_CONTEXT_LENGTHS resolves 1M for both notations.
- resolve_billing_route + _lookup_official_docs_pricing resolve the
correct $5/$25 (regular) and $10/$50 (fast) pricing for both
dot-notation and dash-notation inputs.
- 4.7 and 4.6 regression: behavior unchanged.
Unit tests: 305 passed across tests/agent/test_usage_pricing.py,
test_model_metadata.py, tests/hermes_cli/test_model_catalog.py,
test_models.py, test_model_validation.py, test_models_dev_preferred_merge.py.
The web_crawl_tool() function was an orphan — no model schema registered
it, no skill or CLI command called it, and the agent had no way to invoke
it. PR #32608 proposed wiring it up as a model-callable tool; we've
decided not to expose crawl as a separate capability since web_search +
web_extract cover the use cases we want models to have.
Removed:
- tools/web_tools.py: web_crawl_tool() (~230 LOC)
- plugins/web/firecrawl/provider.py: supports_crawl() + crawl()
- plugins/web/tavily/provider.py: supports_crawl() + crawl()
- plugins/web/xai/provider.py: supports_crawl() override
- agent/web_search_provider.py: supports_crawl() + crawl() ABC methods
- agent/web_search_registry.py: get_active_crawl_provider() +
the 'crawl' branch in _resolve()
- agent/display.py: web_crawl tool-progress rendering
- hermes_cli/config.py: 'web_crawl' from TAVILY_API_KEY.tools
- tools/website_policy.py: stale comment reference
- Tests: removed TestWebCrawlTavily class, the two website-policy
web_crawl tests, the searxng/ddgs/brave-free crawl-error tests,
the integration test_web_crawl method, and the
test_unconfigured_crawl_emits_top_level_error test. Trimmed the
capability-flag parametrize list and the WebSearchProvider ABC
conformance tests.
- Docs: trimmed the Crawl column from capability tables in both EN
and zh-Hans, updated the developer-guide ABC table.
Net: 25 files, +115/-1067.
Closes#33762 (the schema-text bug only existed if #32608 landed).
Supersedes #32608.
PR #33748 grew the live skills index from ~2k skills to ~69k, which made
the previous build-time bundling strategy untenable: the skills page's
JS chunk was about to balloon from ~1MB to ~35MB. Initial page load
on mobile became unusable, search lagged on every keystroke against the
68k-item array, and JSON.parse blocked the main thread at startup.
Three changes:
1. extract-skills.py writes skills.json + skills-meta.json into
website/static/api/ instead of website/src/data/. Static-served by
Vercel as /docs/api/skills.json (gzipped on the wire), same CDN that
already serves skills-index.json.
2. skills/index.tsx drops the static import and fetches both files in
parallel on mount. Loading state shows '…' for the count; failures
surface a small error pill instead of blanking the page.
3. Search is debounced 150ms and runs against a precomputed lowercase
haystack stamped onto each row at load time. Before: array-join +
toLowerCase per row per keystroke on a 68k array. After: single
.includes() per row, deferred until typing settles.
Validation:
| | before | after |
|---|---|---|
| skills.json location | src/data/ (bundled) | static/api/ (CDN) |
| Largest JS chunk | would be ~35MB at 68k skills | 659 KB |
| Initial page render | wait for full parse | immediate, fetch async |
| Per-keystroke filter | join+lowercase x 68k rows | single includes x 68k rows |
| Debounce | none | 150ms |
Built locally for both en and zh-Hans locales; the 34MB skills.json now
lives in build/api/ and is served separately rather than inlined into
the page's bundle.
skills.json and skills-meta.json added to .gitignore — they were already
build artifacts, but the gitignore only listed skills-index.json before.
Kimi K2.6 is natively multimodal — flagged by Shengyuan from the Kimi
growth team. Replace the named-vendor example with a model-agnostic
phrasing so the row doesn't go stale as more vendors ship vision.
Adds first-class `client_cert` / `client_key` config keys so MCP servers
behind mTLS work without an external TLS-terminating proxy. Resolves
inbound community question (Jeremy W.).
Schema (per `mcp_servers.<name>`, HTTP/SSE only):
- `client_cert: "/path/to/combined.pem"` — single PEM with cert + key
- `client_cert: "/path/to/cert"` + `client_key: "/path/to/key"` — separate
- `client_cert: [cert, key]` or `[cert, key, password]` — list form,
with optional passphrase for encrypted keys
Paths support `~` expansion. Missing files raise a server-scoped
`FileNotFoundError` at connect time rather than failing later with an
opaque TLS handshake error.
Wiring:
- New SDK HTTP path (mcp >= 1.24): `cert=` on the user-owned
`httpx.AsyncClient` alongside the existing `verify=` handling.
- SSE path: routed through an `httpx_client_factory` that wraps the
SDK's defaults (follow_redirects=True) and layers `verify` + `cert`
on top. The factory is only injected when needed, so the SDK's
built-in `create_mcp_http_client` keeps being used in the default
case.
- Deprecated mcp<1.24 path left untouched — that SDK's
`streamablehttp_client` signature doesn't expose `cert`, and adding
it would be dead code.
Also documents the previously-undocumented `ssl_verify` key (bool or
CA bundle path) in the MCP config reference.
Tests:
- `tests/tools/test_mcp_client_cert.py` (new, 19 tests):
- `_resolve_client_cert` helper: all three input forms, `~` expansion,
missing-file and validation errors.
- HTTP transport: `cert=` forwarded into `httpx.AsyncClient` for
string and tuple forms; absent when unset; missing-file error
propagates.
- SSE transport: factory only injected when cert or non-default
verify is set; factory applies cert, custom CA bundle, and
preserves `follow_redirects=True` + forwarded headers/auth.
- Existing tests: 200/200 in `test_mcp_tool.py` + `test_mcp_sse_transport.py`
still pass.