Commit graph

166 commits

Author SHA1 Message Date
H-Ali13381
697d38a3f4 feat: auto-launch Chromium-family browser for CDP
Add browser CDP launch candidates for Chrome, Chromium, Brave, and Edge while preserving Chrome-first selection. Retry candidate launch failures instead of giving up after the first executable.

Update /browser CLI and TUI messaging, docs, and tool descriptions from Chrome-only wording to Chromium-family browser support. Add regression coverage for Brave/Edge paths, Chrome-first precedence, fallback launches, and CDP endpoint probing.
2026-05-19 22:34:05 -07:00
Teknium
340d2b6de0
docs(xai-oauth): note X Premium+ also unlocks Grok OAuth (#29055)
The xAI Grok OAuth page only mentioned SuperGrok subscribers. An X
Premium+ subscription on the X account you sign in with also unlocks
Grok access via accounts.x.ai (xAI links the X subscription status to
the xAI session automatically — see https://docs.x.ai/grok/faq).

Updates the OAuth page title, prereqs, and overview table, plus the
provider/configuration/x-search docs that reference the OAuth flow.
2026-05-19 22:28:26 -07:00
Teknium
22120ef00f Revert "feat(telegram): support quick-command-only menus"
This reverts commit b1acf80e17.
2026-05-18 23:59:57 -07:00
Teknium
eacce70a35
docs: comprehensive 2-week sweep of feature/PR coverage gaps (#28497)
Catch the website docs up to two weeks of merged work (May 4 – May 18, 2026,
roughly 1,080 PRs). The audit found ~50 user-visible features that had landed
in code with no docs footprint, plus a handful of stale pages. This PR closes
every gap the scan turned up.

New pages
- user-guide/features/deliverable-mode.md — extension list, agent triggers,
  kanban_complete artifacts pattern, [[as_document]] override (PR #27813).
- developer-guide/web-search-provider-plugin.md — authoring guide modeled on
  image-gen-provider-plugin, covering brave_free / ddgs / etc. (PR #25448).

Providers / auth
- Rename "Alibaba Cloud" → "Qwen Cloud (Alibaba DashScope)" everywhere the
  display label shows up; provider id stays `alibaba` (PR #24835).
- Document OAuth refresh-token quarantine for xAI / MiniMax / Codex (PRs
  #28116 / #28118 / #28119).
- Document Nous JWT minting from refresh token + invalid-refresh quarantine
  + cross-profile shared token store (PRs #27663 / #19712).
- Add `## Microsoft Entra ID authentication (keyless)` section to
  azure-foundry guide — DefaultAzureCredential, RBAC, OpenAI + Anthropic
  routing details (PR #28101 / #9df9816da).
- Custom providers `api_mode` is now prompted-and-persisted, not just URL
  autodetected (PR #25068).
- Delegation honours `api_mode` + auto-detects anthropic_messages base URLs
  (PR #26824).
- `x_search` auto-enables when xAI credentials are present (PR #27376).
- Add `xAI Grok OAuth (SuperGrok)` row to providers headline table (PR
  #26534).
- NVIDIA NIM billing-origin header is set automatically (PR #26585).

Windows / installer
- `install.ps1`: document `-Commit <sha>` and `-Tag <v>` pin params plus
  the BOM-strip / git-retry hardening (PR #28169).
- Document Hermes Desktop thin installer + first-launch bootstrap (PR
  #27822).
- Document `dep_ensure` Windows bootstrap (PR #27845).
- Document install-method auto-detection (pip / git / homebrew / nixos) and
  the matching update command (PR #27843).

Gateway / messaging
- `/platform list|pause|resume` full description + circuit-breaker
  semantics (PR #26600).
- Slack / Matrix / Mattermost get parallel `allowed_channels` /
  `allowed_rooms` allowlist sections matching Telegram/Discord/DingTalk
  (PR #21251).
- Discord `allow_any_attachment` + `max_attachment_bytes` (config and env
  vars) (PR #27245).
- Discord clarify-choice button rendering (PR #25485).
- Telegram `guest_mode` @mention bypass for allowlisted groups (PR
  #22759).
- Telegram `notifications` mode (`important` vs `all`) (PR #22793).
- `[[as_document]]` skill / response directive for forcing
  document-style media delivery (PR #21210).

CLI / TUI
- `/new [name]` argument (PR #19637).
- `/subgoal` user-supplied criteria appended to `/goal` (PR #25449).
- `/exit --delete` flag confirmation prompts for destructive slash
  commands (PR #22687).
- Status-bar additions: ▶ N background indicator (PR #27175), context
  compression count (PR #21218), YOLO mode banner+statusbar warning (PR
  #26238).
- `display.timestamps` + `docker_extra_args` config keys (PR #23599).
- TUI collapsible startup banner sections (PR #20625).
- `HERMES_SESSION_ID` exported to tool subprocesses (PR #23847).

i18n
- Refresh display.language locale list from 8 → 16 (en, zh, zh-hant, ja,
  de, es, fr, tr, uk, af, ko, it, ga, pt, ru, hu) — matches
  `agent/i18n.py:SUPPORTED_LANGUAGES`.

Tools / features
- `vision_analyze` native-pixel passthrough for vision-capable callers,
  with auxiliary text-describer fallback (PR #22955).
- `session_search` rewrite to the single-shape tool (discovery / scroll /
  browse modes) (PRs #27590 / #27840).
- Clarify MCP transport scope: client supports stdio + SSE; embedded
  `hermes mcp serve` is stdio-only (PR #21227).
- Web search backends table: add Brave Search (free tier) and DDGS rows
  (PR #21337).
- ACP session-scoped edit auto-approval modes (PR #27862).
- Curator rename map in the user-visible per-run summary (PR #22910).
- Prompt caching feature page reference in features/overview.md — Claude
  cross-session 1-hour prefix cache on native Anthropic / OpenRouter /
  Nous Portal (PR #23828).
- Cron per-job profile parameter (PR #28124).
- `--no-skills` flag for `hermes profile create` (PR #20986).

Build
- Verified with `npm run build` in `website/`; both `en` and `zh-Hans`
  locales compile. Remaining broken-link/anchor warnings are pre-existing
  (`rl-training.md` from learning-path / overview; the
  zh-Hans translation lag the docs skill already calls out).
2026-05-18 23:55:25 -07:00
stevehq26-bot
b1acf80e17 feat(telegram): support quick-command-only menus 2026-05-18 22:48:42 -07:00
Teknium
abf1af5401
feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590)
* feat(session_search): single-shape tool with discovery, scroll, browse — no LLM

Replaces the LLM-summarized session_search with a single-shape tool that
returns actual messages from the DB. Three calling shapes inferred from
args (no mode parameter):

  1. Discovery — pass query. FTS5 + anchored ±5 window + bookends per hit,
     all in one call. ~20ms on a real DB instead of ~90s for the previous
     three aux-LLM calls.
  2. Scroll — pass session_id + around_message_id. Returns a window
     centered on the anchor. To paginate, re-anchor on the first/last id
     of the returned window. Boundary message appears in both windows
     as the orientation marker. ~1ms per scroll call.
  3. Browse — no args. Recent sessions chronologically.

Bookend_start (first 3 user+assistant msgs) and bookend_end (last 3) give
the agent goal + resolution on every discovery hit, so a single tool call
reconstructs a long session's arc without loading the whole transcript.

The aux-LLM summary path is gone: it cost ~$0.30/call, took ~30s, and
laundered FTS5 hits through a model that could confabulate when the right
session wasn't in the hit list. The merged shape returns byte-for-byte
content from SQLite.

History:
- PR #20238 (JabberELF) seeded the fast/summary dual-mode split.
- PR #26419 (yoniebans) expanded to fast/guided/summary with bookends,
  multi-anchor drill-down, default-mode config, and a teaching skill.

This PR collapses that toolkit into one shape with explicit scroll
support, drops the summary path, drops the mode parameter, drops the
config knob, drops the skill. JabberELF's seed work is acknowledged via
the AUTHOR_MAP entry.

Validation:
- 38/38 tool tests pass (tests/tools/test_session_search.py)
- 12/12 get_messages_around tests pass (tests/hermes_state/)
- 11/11 get_anchored_view tests pass (tests/hermes_state/)
- Full tests/tools/ run: 5168 passing, 2 failures pre-exist on main
  (test ordering in test_delegate.py, unrelated)
- E2E against live state DB: discovery 20ms, scroll 1ms, browse 280ms;
  pagination forward+backward works with boundary-message orientation;
  error paths return clean tool_error responses

Co-authored-by: JabberELF <abcdjmm970703@gmail.com>
Co-authored-by: yoniebans <jonny@nousresearch.com>

* chore(session_search): prune dead LLM-summary config and docs

Companion to the single-shape rewrite. The auxiliary.session_search config
block, max_concurrency / extra_body tunables, and matching docs sections
all referenced the removed LLM summarization path. Removing them so users
don't try to tune knobs that nothing reads.

- hermes_cli/config.py: drop dead auxiliary.session_search block from
  DEFAULT_CONFIG. Leftover keys in user config.yaml are harmless and
  ignored.
- hermes_cli/tips.py: drop two tips referencing the removed
  max_concurrency / extra_body knobs.
- website/docs/user-guide/configuration.md: drop 'Session Search Tuning'
  section and the auxiliary.session_search block from the example.
- website/docs/user-guide/features/fallback-providers.md: drop session_search
  rows from the auxiliary-tasks tables and the dedicated tuning subsection.
- website/docs/reference/tools-reference.md: rewrite the session_search
  entry to describe the new three-shape behaviour.
- CONTRIBUTING.md: update the file-tree description.
- tests/tools/test_llm_content_none_guard.py: remove TestSessionSearchContentNone
  class and test_session_search_tool_guarded — both guard against an
  unguarded .content.strip() call site in _summarize_session() that no
  longer exists.

Validation: 97/97 targeted tests still pass (hermes_state + session_search +
llm_content_none_guard). Config tests 55/55.

---------

Co-authored-by: JabberELF <abcdjmm970703@gmail.com>
Co-authored-by: yoniebans <jonny@nousresearch.com>
2026-05-17 23:28:45 -07:00
Teknium
1345dda0cf
feat(kanban): orchestrator-driven auto-decomposition on triage (#27572)
* feat(kanban): orchestrator-driven auto-decomposition on triage

Closes the core gap in the kanban system: dropping a one-liner into Triage
now decomposes it into a graph of child tasks routed to specialist
profiles by description, matching teknium's original vision ("main
orchestrator splits/creates actual tasks, doles them out to each agent").

The build
---------
- hermes_cli/profiles.py: new `description` + `description_auto` fields
  on ProfileInfo, persisted in <profile_dir>/profile.yaml. Helpers
  read_profile_meta / write_profile_meta. `create_profile` accepts
  optional description.
- hermes_cli/profile_describer.py: new module — auto-generate a 1-2
  sentence description from a profile's skills + model + name via the
  auxiliary LLM (`auxiliary.profile_describer`).
- hermes_cli/main.py: new `hermes profile create --description ...`
  flag; new `hermes profile describe [name] [--text ... | --auto |
  --all --auto]` subcommand.
- hermes_cli/kanban_db.py: new `decompose_triage_task` atomic helper —
  creates N child tasks, links the root as a child of every leaf
  (root waits for the whole graph), flips root `triage -> todo` with
  orchestrator assignee, records an audit comment + `decomposed` event
  in a single write_txn.
- hermes_cli/kanban_decompose.py: new module — calls the auxiliary LLM
  (`auxiliary.kanban_decomposer`) with the profile roster + descriptions
  to produce a JSON task graph, then invokes the DB helper. Rewrites
  unknown assignees to the configured `kanban.default_assignee` (or
  the active default profile) so a task NEVER lands with assignee=None.
  Falls back to specify-style single-task promotion when the LLM
  returns `fanout: false`.
- hermes_cli/kanban.py: new `hermes kanban decompose [task_id | --all]`
  CLI verb.
- hermes_cli/config.py: new DEFAULT_CONFIG keys —
  kanban.orchestrator_profile, kanban.default_assignee,
  kanban.auto_decompose (default True), kanban.auto_decompose_per_tick
  (default 3), auxiliary.kanban_decomposer, auxiliary.profile_describer.
- gateway/run.py: kanban dispatcher watcher now runs auto-decompose
  before each `_tick_once`, capped by `auto_decompose_per_tick` so a
  bulk-load of triage tasks doesn't burst-spend the aux LLM.
- plugins/kanban/dashboard/plugin_api.py: new endpoints —
  GET /profiles (list roster + descriptions),
  PATCH /profiles/<name> (set description, user-authored),
  POST /profiles/<name>/describe-auto (LLM-generate),
  POST /tasks/<id>/decompose (run decomposer),
  GET/PUT /orchestration (orchestrator/default-assignee/auto-decompose
  pickers, with resolved fallbacks echoed back).
- plugins/kanban/dashboard/dist/index.js: new OrchestrationPanel
  collapsible — dropdowns for orchestrator profile and default
  assignee, auto-decompose toggle, per-profile description editor with
  Save and Auto-generate buttons. New ⚗ Decompose button next to
   Specify on triage-column task drawers.

Behavior
--------
- A task in Triage gets fanned out into a small DAG of child tasks.
  Children with no internal parents flip to `ready` immediately
  (parallel dispatch). Children with sibling parents wait. The root
  stays alive as a parent of every child — when the whole graph
  finishes, it promotes to `ready` and the orchestrator profile wakes
  back up to judge completion (the "adds more tasks until done" part
  of the original vision).
- `kanban.orchestrator_profile` unset -> falls back to the default
  profile (whichever `hermes` launches with no -p flag).
- `kanban.default_assignee` unset -> same fallback. Tasks NEVER end
  up unassigned.
- `kanban.auto_decompose=true` (default) runs the decomposer
  automatically on dispatcher ticks; manual `hermes kanban decompose`
  is always available.

Tests
-----
- tests/hermes_cli/test_kanban_decompose_db.py — 7 tests for the
  atomic DB helper (status transitions, dep graph, audit trail,
  validation errors).
- tests/hermes_cli/test_kanban_decompose.py — 6 tests for the
  decomposer module (fanout, no-fanout fallback, unknown-assignee
  rewrite, malformed-JSON resilience, no-aux-client path).
- tests/hermes_cli/test_profile_describer.py — 10 tests for
  profile.yaml r/w + the LLM auto-describer (yaml corrupt tolerance,
  user-vs-auto description protection, --overwrite, fallback parsing).

E2E
---
- CLI end-to-end: created profiles with descriptions, dropped a triage
  task, mocked the aux LLM with a 3-task graph -> verified all three
  children were created with the right assignees, the dependency
  edges matched the LLM's graph, root flipped to todo gated by every
  child, audit comment + `decomposed` event recorded.
- Dashboard end-to-end: started the dashboard against an isolated
  HERMES_HOME, verified all four new endpoints via curl (profile
  listing, PATCH for description, PUT for orchestration settings,
  POST for decompose). Opened the UI in the browser, confirmed the
  OrchestrationPanel renders with all three pickers + the per-profile
  description editor, typed a description, clicked Save, verified
  ~/.hermes/profile.yaml was written. Clicked Decompose on the triage
  card and confirmed the inline error message surfaced as designed
  ("no auxiliary client configured").

* feat(kanban): surface decompose mode (Auto/Manual) as a one-click pill

The auto/manual toggle already existed as kanban.auto_decompose (default
true), but it was buried inside the collapsed Orchestration settings
panel — users couldn't tell at a glance which mode they were in. This
hoists it to a pill at the top of the kanban page so the state is always
visible and one click flips it.

UX
- New "⚗ Decompose: AUTO|MANUAL" pill in the kanban header. Emerald
  styling when Auto is on (the default), muted/gray when Manual.
- Pill is visible both in the collapsed AND expanded Orchestration
  settings views so context is preserved when the user opens the panel.
- Tooltip explains both states + what clicking does.
- Renamed the in-panel "Auto-decompose on triage / Enabled" checkbox
  to "Decompose mode / Auto (default) | Manual" for language parity
  with the pill.

Behavior preserved
- Default remains Auto (kanban.auto_decompose=true).
- Manual mode restores pre-PR behavior: triage tasks stay in triage
  until the user clicks ⚗ Decompose on each card (or runs
  `hermes kanban decompose <id>`).

Implementation
- plugins/kanban/dashboard/dist/index.js: load /orchestration on mount
  (not just on expand) so the collapsed pill reflects real state.
  Render mode pill in both collapsed and expanded headers. Reuses the
  existing PUT /api/plugins/kanban/orchestration endpoint — no new
  backend, no new tests required.

E2E verified
- Pill renders as "⚗ Decompose: AUTO" on page load (default).
- One click flips to "⚗ Decompose: MANUAL" with muted styling.
- config.yaml on disk shows auto_decompose: false after the flip.
- Second click round-trips back to Auto; config.yaml flips to true.

* feat(kanban): rename mode pill to "Orchestration: Auto/Manual"

Per Teknium feedback — "Decompose" was too implementation-specific.
"Orchestration" is the user-facing concept (the whole pitch is the
orchestrator profile routing work), and the pill is the front door to it.

- Pill text: "Orchestration: Auto" / "Orchestration: Manual" (title case,
  no ⚗ prefix, no SHOUTY-CAPS for the mode value)
- In-panel checkbox label: "Orchestration mode" (was "Decompose mode")
- Tooltips updated to match
- No behavior change

* docs(kanban): document decompose, profile descriptions, orchestration mode

Brings the docs site up to parity with the PR. English build verified
locally (npx docusaurus build --locale en) — clean, no new broken links
or anchors. Pre-existing broken-link warnings (rl-training, llms.txt,
step-by-step-checklist, fallback-model) untouched.

- website/docs/reference/cli-commands.md
    + `hermes kanban decompose` action row in the action table, with
      pointer to the Auto vs Manual orchestration section.

- website/docs/reference/profile-commands.md
    + `--description "<text>"` flag on `hermes profile create`.
    + Full `hermes profile describe` section: read, --text, --auto,
      --overwrite, --all flags with examples.

- website/docs/user-guide/features/kanban.md (the big one)
    + Triage column intro rewritten around the Auto-decompose default
      behavior, with pointer to the new Auto vs Manual section.
    + Status action row updated to mention both ⚗ Decompose and
       Specify on triage cards.
    + New "Auto vs Manual orchestration" section explaining the two
      modes, how to flip them (pill, config), how routing-by-description
      works, the no-None-assignee guarantee, plus a config knob table
      (auto_decompose, auto_decompose_per_tick, orchestrator_profile,
      default_assignee) and the two new auxiliary slots
      (kanban_decomposer, profile_describer).
    + REST surface table gains 6 new endpoint rows: /tasks/:id/decompose,
      /profiles (GET), /profiles/:name (PATCH), /profiles/:name/describe-auto,
      /orchestration (GET + PUT).

- website/docs/user-guide/features/kanban-tutorial.md
    + Triage column blurb updated for Auto by default + Manual via the
      pill, with cross-link to the Auto vs Manual orchestration section.

- website/docs/user-guide/profiles.md
    + Blank-profile flow now mentions --description and points to the
      kanban routing model for context.

- website/docs/user-guide/configuration.md
    + `kanban_decomposer` and `profile_describer` added to the
      `hermes model -> Configure auxiliary models` menu listing.
2026-05-17 13:54:12 -07:00
r266-tech
86f3776a72 docs(delegation): document api_mode wire-protocol override from #26824 2026-05-16 20:32:43 -07:00
Teknium
233d4170cf
docs(xai): link OAuth-over-SSH guide from xAI provider surfaces (#26610)
Follow-up to #26592. The new docs/guides/oauth-over-ssh.md page was
linked from the two SSH-specific sections of the xAI Grok OAuth guide
but was missing from the surfaces a user is more likely to hit first:

- guides/xai-grok-oauth.md 'See Also' — add the SSH guide at the top
  with a short qualifier so remote users notice it before clicking
  through.
- integrations/providers.md xAI Grok OAuth callout — append the SSH
  guide link alongside the existing xAI OAuth guide link.
- user-guide/configuration.md xai-oauth tip — same.

Docs build: zero warnings on touched files.
2026-05-15 14:45:59 -07:00
Teknium
4ad5fa702f
docs(xai-oauth): add xai-oauth to provider enumeration pages (#26542)
Follow-up to #26534 (xai-oauth provider). The new guide and integrations
page were shipped with the salvage, but four reference/enumeration pages
still listed every other OAuth provider without xai-oauth:

- reference/cli-commands.md     — `--provider` choices list
- reference/environment-variables.md — HERMES_INFERENCE_PROVIDER values
- user-guide/configuration.md   — auxiliary-task provider list, OAuth
                                  tip block (mirrored from MiniMax OAuth),
                                  and provider table row
- user-guide/features/fallback-providers.md — provider table
2026-05-15 12:33:12 -07:00
domtriola
796c8a2d63 docs(user-guide): point tirith link to correct repo 2026-05-13 23:04:57 -07:00
Dan Benyamin
62fd905340 feat(browser): support externally managed Camofox sessions
Allow integrations to share a visible Camofox identity with Hermes and recover existing tabs without carrying local patches.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 15:14:49 -07:00
Teknium
c594a23047
feat(agent): per-turn file-mutation verifier footer (#24498)
Detect when write_file / patch calls fail during a turn and are never
superseded by a successful write to the same path.  When the final
text response is delivered, append an advisory footer listing the
files that did NOT change — so models that over-claim 'patched 5 files'
after 4 silent failures can't hide the lie.

Catches the failure mode reported in Ben Eng's llm-wiki session:
grok-4.1-fast issued batches of parallel patches, half failed with
'Could not find old_string', and the agent summarised the turn
claiming every file was edited.  The user had to manually run
'git status' each turn to catch it.

The verifier is a pure post-hoc check on tool results — no new LLM
calls, no synthetic messages injected into history (prompt cache
preserved), no changes to tool argument dispatch.  Per-turn state is
keyed by path; a later successful write to the same path clears the
failure entry so single-file retry recovery is not flagged.

Wired into both _execute_tool_calls_concurrent and
_execute_tool_calls_sequential, so batched parallel patches and one-at-
a-time edits are both covered.  Footer emission happens after the
agent loop exits, before transform_llm_output / post_llm_call plugin
hooks run, so plugins still see (and can modify) the augmented text.

Config: display.file_mutation_verifier (bool, default true) +
HERMES_FILE_MUTATION_VERIFIER env override.

31 unit tests in tests/run_agent/test_file_mutation_verifier.py cover
target extraction (write_file, patch-replace, patch-v4a single and
multi-file), error-preview extraction (JSON .error field and plain
string), per-turn state transitions (first-error-wins on repeated
failure, success supersedes failure), footer rendering (truncation
at 10 entries, user-actionable hint), and env/config precedence.

Companion docs updated: user-guide/configuration.md +
reference/environment-variables.md.
2026-05-12 11:54:13 -07:00
Teknium
0bcc327cab
docs(openrouter): document auxiliary.<task>.extra_body for OR routing and Pareto (#22844)
The plumbing for setting OpenRouter provider preferences and the Pareto Code
router on auxiliary tasks already exists — auxiliary.<task>.extra_body is
forwarded verbatim by call_llm() / async_call_llm(). It just wasn't documented,
so users who wanted (e.g.) Pareto Code routing for compression but the strongest
coder for the main agent had no way to discover the escape hatch.

- hermes_cli/config.py: expand the auxiliary section header with a YAML
  example showing provider routing plus plugins under extra_body, and an
  explicit note that main-agent provider_routing / openrouter.min_coding_score
  do NOT propagate to aux calls (each task is independent by design)
- website/docs/user-guide/configuration.md: new 'OpenRouter routing and
  Pareto Code for auxiliary tasks' subsection with worked example
- website/docs/integrations/providers.md: cross-link from the Pareto Code
  Router section to the aux-side doc

E2E verified that auxiliary.<task>.extra_body reaches the OpenRouter API with
the configured provider routing and plugins blocks intact.
2026-05-09 14:51:20 -07:00
Teknium
252d68fd45
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784)
* docs: deep audit — fix stale config keys, missing commands, and registry drift

Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level
user-guide, user-guide/features) against the live registries:

  hermes_cli/commands.py    COMMAND_REGISTRY (slash commands)
  hermes_cli/auth.py        PROVIDER_REGISTRY (providers)
  hermes_cli/config.py      DEFAULT_CONFIG (config keys)
  toolsets.py               TOOLSETS (toolsets)
  tools/registry.py         get_all_tool_names() (tools)
  python -m hermes_cli.main <subcmd> --help (CLI args)

reference/
- cli-commands.md: drop duplicate hermes fallback row + duplicate section,
  add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand
  lists to match --help output (status/logout/spotify, login, archive/prune/
  list-archived).
- slash-commands.md: add missing /sessions and /reload-skills entries +
  correct the cross-platform Notes line.
- tools-reference.md: drop bogus '68 tools' headline, drop fictional
  'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated),
  add missing 'kanban' and 'video' toolset sections, fix MCP example to use
  the real mcp_<server>_<tool> prefix.
- toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser'
  row, add missing 'kanban' and 'video' toolset rows, drop the stale
  '38 tools' count for hermes-cli.
- profile-commands.md: add missing install/update/info subcommands, document
  fish completion.
- environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the
  one with the correct gmi-serving.com default).
- faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just
  via OpenRouter), refresh the OpenAI model list.

getting-started/
- installation.md: PortableGit (not MinGit) is what the Windows installer
  fetches; document the 32-bit MinGit fallback.
- installation.md / termux.md: installer prefers .[termux-all] then falls
  back to .[termux].
- nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid
  'nix flake update --flake' invocation.
- updating.md: 'hermes backup restore --state pre-update' doesn't exist —
  point at the snapshot/quick-snapshot flow; correct config key
  'updates.pre_update_backup' (was 'update.backup').

user-guide/
- configuration.md: api_max_retries default 3 (not 2); display.runtime_footer
  is the real key (not display.runtime_metadata_footer); checkpoints defaults
  enabled=false / max_snapshots=20 (not true / 50).
- configuring-models.md: 'hermes model list' / 'hermes model set ...' don't
  exist — hermes model is interactive only.
- tui.md: busy_indicator -> tui_status_indicator with values
  kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none).
- security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env,
  not config.yaml.
- windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the
  OpenAI-compatible API server runs inside hermes gateway.

user-guide/features/
- computer-use.md: approvals.mode (not security.approval_level); fix broken
  ./browser-use.md link to ./browser.md.
- fallback-providers.md: top-level fallback_providers (not
  model.fallback_providers); the picker is subcommand-based, not modal.
- api-server.md: API_SERVER_* are env vars — write to per-profile .env,
  not 'hermes config set' which targets YAML.
- web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl
  modes are exposed through web_extract.
- kanban.md: failure_limit default is 2, not '~5'.
- plugins.md: drop hard-coded '33 providers' count.
- honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document
  that 'hermes honcho' subcommand is gated on memory.provider=honcho;
  reconcile subcommand list with actual --help output.
- memory-providers.md: legacy 'hermes honcho setup' redirect documented.

Verified via 'npm run build' — site builds cleanly; broken-link count went
from 149 to 146 (no regressions, fixed a few in passing).

* docs: round 2 audit fixes + regenerate skill catalogs

Follow-up to the previous commit on this branch:

Round 2 manual fixes:
- quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY;
  voice-mode and ACP install commands rewritten — bare 'pip install ...'
  doesn't work for curl-installed setups (no pip on PATH, not in repo
  dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e
  ".[voice]"'. ACP already ships in [all] so the curl install includes it.
- cli.md / configuration.md: 'auxiliary.compression.model' shown as
  'google/gemini-3-flash-preview' (the doc's own claimed default);
  actual default is empty (= use main model). Reworded as 'leave empty
  (default) or pin a cheap model'.
- built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row
  that was missing from the table.

Regenerated skill catalogs:
- ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill
  pages and both reference catalogs (skills-catalog.md,
  optional-skills-catalog.md). This adds the entries that were genuinely
  missing — productivity/teams-meeting-pipeline (bundled),
  optional/finance/* (entire category — 7 skills:
  3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model,
  merger-model, pptx-author), creative/hyperframes,
  creative/kanban-video-orchestrator, devops/watchers,
  productivity/shop-app, research/searxng-search,
  apple/macos-computer-use — and rewrites every other per-skill page from
  the current SKILL.md. Most diffs are tiny (one line of refreshed
  metadata).

Validation:
- 'npm run build' succeeded.
- Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation
  shells that lag every newly-added skill page (pre-existing pattern).
  No regressions on any en/ page.
2026-05-09 13:19:51 -07:00
Teknium
cff821e2dc
docs: register triage_specifier in the aux-models enumerations (#21494)
The kanban specifier landed in #21435 with feature-page docs (the
kanban page itself + the CLI reference table), but three other docs
pages enumerate every auxiliary task slot and were missed:

  user-guide/configuration.md            Auxiliary Models section —
                                         interactive picker example
                                         + full auxiliary config
                                         reference YAML block.
  user-guide/features/fallback-providers.md
                                         Both 'Auxiliary Tasks' and
                                         'Fallback Reference' tables.
  user-guide/features/kanban-tutorial.md
                                         Triage-column bullet now
                                         mentions the  Specify
                                         button + CLI + slash command.

No other docs enumerate the aux task slots (verified with
grep -r 'title_generation\|auxiliary.session_search' website/docs/).
2026-05-07 13:07:18 -07:00
kshitij
48c241840a
docs: add Web Search + Extract feature page with SearXNG setup guide 2026-05-06 10:20:05 -07:00
kshitij
94016dd1aa
docs+skill: add searxng-search optional skill and documentation
Closes the remaining gaps from PR #11562 that weren't covered by the
core SearXNG integration landed in #20823.

- optional-skills/research/searxng-search/ — installable skill with
  SKILL.md (curl-based usage, category support, Python example) and
  searxng.sh helper script for health checks and instance queries
- website/docs/user-guide/configuration.md — SearXNG added to the
  Web Search Backends section (5 backends, backend table, per-capability
  split config example, correct search-only note)
- website/docs/reference/environment-variables.md — SEARXNG_URL row
- website/docs/reference/optional-skills-catalog.md — searxng-search entry

The core SearXNG code, OPTIONAL_ENV_VARS, hermes tools picker, and tests
were already on main via #20823.  This commit is purely additive docs +
the optional skill scaffold.

Credits from #11562 salvage:
  @w4rum — original _searxng_search structure
  @nathansdev — tools_config.py integration
  @moyomartin — category support and result formatting
  @0xMihai — config/env var approach
  @nicobailon — skill and documentation structure
  @searxng-fan — error handling patterns
  @local-first — self-hosted-first philosophy and docs
2026-05-06 10:15:56 -07:00
etherman-os
39f451f5ad fix: add Turkish locale references in config, tests, and docs
- hermes_cli/config.py: add tr to supported languages comment
- locales/en.yaml: add tr to locale file list comment
- tests/agent/test_i18n.py: add Turkish alias tests + explicit lang test
- website/docs/user-guide/configuration.md: add tr to supported values
2026-05-05 17:29:12 -07:00
Oleksii Lisikh
c4b287ba53 feat(i18n): add Ukrainian locale 2026-05-05 17:21:59 -07:00
jani
0df80f4391 docs: align terminal-backend count and naming across docs and code
README:24 claimed "Six terminal backends" while tools/environments/ exposes
seven top-level backend choices through TERMINAL_ENV: local, docker, ssh,
singularity, modal, daytona, vercel_sandbox. Modal additionally has direct
and Nous-managed modes selected via terminal.modal_mode (the
ManagedModalEnvironment class is a Modal sub-mode, not a separate top-level
backend).

The same drift appeared in five other doc and code-comment sites with
inconsistent counts (six, seven, or implicit) and varying lists. Updated
all sites to a consistent seven-backend list in canonical order. The
configuration guide also clarifies how Modal's two modes are selected so
operators do not search for a non-existent backend: managed_modal value.

CONTRIBUTING.md:160 lists six backend filenames in a code tree but does
not carry the "Six terminal" prose; left out of scope per cohesion sweep
guidance to bundle only identical wording.

Files updated:
- README.md (line 24, marketing copy)
- website/docs/index.md (line 49, landing page)
- website/docs/user-guide/configuration.md (line 86, config guide)
- tools/environments/__init__.py (lines 3-6, package docstring)
- tools/file_operations.py (line 6, module docstring)
- environments/README.md (line 43, RL training docs — TERMINAL_ENV list)
2026-05-05 13:44:09 -07:00
Teknium
7de3c86c5a
feat(i18n): add display.language for static message translation (zh/ja/de/es) (#20231)
* revert(gateway): remove stale-code self-check and auto-restart

Removes the _detect_stale_code / _trigger_stale_code_restart mechanism
introduced in #17648 and iterated in #19740. On every incoming message
the gateway compared the boot-time git HEAD SHA to the current SHA on
disk, and if they differed it would reply with

    Gateway code was updated in the background --
    restarting this gateway so your next message runs
    on the new code. Please retry in a moment.

and then kick off a graceful restart. This is unwanted behaviour:
users who run a long-lived gateway and do their own ad-hoc git
operations on the checkout end up with their chat interrupted and
the current message dropped every time HEAD moves, with no way to
opt out.

If an operator really needs the old protection against stale
sys.modules after "hermes update", the SIGKILL-survivor sweep in
hermes update (hermes_cli/main.py, also tagged #17648) already
handles the supervisor-respawn case on its own.

Removed:
  gateway/run.py:
    - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
    - _read_git_head_sha(), _compute_repo_mtime() module helpers
    - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha /
      _stale_code_restart_triggered defaults
    - __init__ boot-snapshot block (_boot_*, _cached_current_sha*,
      _repo_root_for_staleness, _stale_code_notified)
    - _current_git_sha_cached(), _detect_stale_code(),
      _trigger_stale_code_restart() methods
    - stale-code check + user-facing restart notice at the top of
      _handle_message()
  tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed
symbol. Gateway test suite passes the same 4589 tests it passed
before; the 3 pre-existing unrelated failures (discord free-channel,
feishu bot admission, teams typing) are unchanged by this commit.

* feat(i18n): add display.language for static message translation (zh/ja/de/es)

Adds a thin-slice i18n layer covering the highest-impact static user-facing
messages: the CLI dangerous-command approval prompt and a handful of gateway
slash-command replies (restart-drain, goal cleared, approval expired, config
read/save errors).

Out of scope (stays English): agent responses, log lines, tool outputs,
slash-command descriptions, error tracebacks.

Infrastructure:
- agent/i18n.py: catalog loader, t() helper, language resolution
  (HERMES_LANGUAGE env var > display.language config > en)
- locales/{en,zh,ja,de,es}.yaml: ~19 translated strings per language
- display.language in DEFAULT_CONFIG (hermes_cli/config.py)

Tests:
- tests/agent/test_i18n.py: 21 tests covering catalog parity, placeholder
  parity across locales, fallback behavior, env-var override, alias
  normalization, missing-key graceful degradation.

Docs:
- website/docs/user-guide/configuration.md: display.language entry plus a
  short section explaining scope so users don't expect agent responses to
  translate via this knob.
2026-05-05 08:03:07 -07:00
Teknium
a1bed18194
docs: clarify that the Docker terminal backend is a single persistent container (#20003)
The docs were ambiguous about whether the Docker terminal backend spins up
a fresh container per command or reuses a long-lived one. It's the latter
— Hermes starts one container on first use and routes every terminal,
file, and execute_code call through docker exec into that same container
for the life of the process (across /new, /reset, and delegate_task
subagents). Working-directory changes, installed packages, and files in
/workspace persist from one tool call to the next, like a local shell.

- configuration.md: lead the Docker Backend section with the persistence
  model before the YAML example; sharpen the Backend Overview table row.
- features/tools.md: expand the Docker Backend block (previously just a
  2-line YAML stub) with a clear statement of the persistent-container
  semantics and a pointer to the full lifecycle section.
- docker.md: tighten the 'Docker as a terminal backend' bullet and the
  'Skills and credential files' paragraph to call out the single-container
  model explicitly.
2026-05-04 20:09:31 -07:00
Siddharth Balyan
a11aed1acc
fix(cli): local backend CLI always uses launch directory, stops .env sync of TERMINAL_CWD (#19334)
The old CWD heuristic was fooled by:
1. TERMINAL_CWD persisted to .env by `hermes config set terminal.cwd`
2. Inherited TERMINAL_CWD from parent hermes processes
3. Only resolved when config had a placeholder value (not explicit paths)

Fix:
- load_cli_config() unconditionally uses os.getcwd() for local backend
- TERMINAL_CWD always force-exported in CLI mode (overrides stale values)
- Gateway sets _HERMES_GATEWAY=1 marker so lazy cli.py imports don't clobber
- Remove terminal.cwd from config-set .env sync map (prevents re-poisoning)
- Clarify setup wizard label as 'Gateway working directory'

Closes #19214
2026-05-04 11:36:19 +05:30
Siddharth Balyan
167b5648ea
Revert "fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)" (#19329)
This reverts commit 9eaddfafa3.
2026-05-04 00:43:58 +05:30
Siddharth Balyan
9eaddfafa3
fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)
CLI/TUI sessions on the local backend now unconditionally use
os.getcwd() as the working directory. The terminal.cwd config value is
only consumed by gateway/cron/delegation modes (where there's no shell
to cd from).

Previously, 'hermes setup' would write an absolute path (e.g. $HOME)
into terminal.cwd which then pinned the CLI to that directory regardless
of where the user launched hermes from. This was a silent foot-gun —
the user's 'cd' was being ignored.

Changes:

1. cli.py: Restructured CWD resolution — if TERMINAL_CWD is not already
   set by the gateway, and the backend is local, always use os.getcwd().
   Config terminal.cwd is irrelevant for interactive CLI/TUI sessions.

2. setup.py: Moved the cwd prompt from setup_terminal_backend() to
   setup_gateway(). It now only appears when configuring messaging
   platforms and is labeled 'Gateway working directory'.

3. Tests: Rewrote test_cwd_env_respect.py to validate the new behavior:
   explicit config paths are ignored for CLI, gateway pre-set values are
   preserved, non-local backends keep their config paths.

4. Docs: Updated configuration.md, profiles.md, and
   environment-variables.md to clarify that terminal.cwd only affects
   gateway/cron mode on local backend.

Closes #19214
2026-05-04 00:14:36 +05:30
Teknium
289cc47631
docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738)
Broad drift audit against origin/main (b52b63396).

Reference pages (most user-visible drift):
- slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer
  that were missing; drop non-existent /terminal-setup; fix /q footnote
  (resolves to /queue, not /quit); extend CLI-only list with all 24
  CLI-only commands in the registry
- cli-commands: add dedicated sections for hermes curator / fallback /
  hooks (new subcommands not previously documented); remove stale
  hermes honcho standalone section (the plugin registers dynamically
  via hermes memory); list curator/fallback/hooks in top-level table;
  fix completion to include fish
- toolsets-reference: document the real 52-toolset count; split browser
  vs browser-cdp; add discord / discord_admin / spotify / yuanbao;
  correct hermes-cli tool count from 36 to 38; fix misleading claim
  that hermes-homeassistant adds tools (it's identical to hermes-cli)
- tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao,
  2 Discord toolsets; move browser_cdp/browser_dialog to their own
  browser-cdp toolset section
- environment-variables: add 40+ user-facing HERMES_* vars that were
  undocumented (--yolo, --accept-hooks, --ignore-*, inference model
  override, agent/stream/checkpoint timeouts, OAuth trace, per-platform
  batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs,
  gateway restart/connect timeouts); dedupe the Cron Scheduler section;
  replace stale QQ_SANDBOX with QQ_PORTAL_HOST

User-guide (top level):
- cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20)
- configuration.md: display.platforms is the canonical per-platform
  override key; tool_progress_overrides is deprecated and auto-migrated
- profiles.md: model.default is the config key, not model.model
- sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8
- checkpoints-and-rollback.md: destructive-command list now matches
  _DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd)
- docker.md: the container runs as non-root hermes (UID 10000) via
  gosu; fix install command (uv pip); add missing --insecure on the
  dashboard compose example (required for non-loopback bind)
- security.md: systemctl danger pattern also matches 'restart'
- index.md: built-in tool count 47 -> 68
- integrations/index.md: 6 STT providers, 8 memory providers
- integrations/providers.md: drop fictional dashscope/qwen aliases

Features:
- overview.md: 9 image models (not 8), 9 TTS providers (not 5),
  8 memory providers (Supermemory was missing)
- tool-gateway.md: 9 image models
- tools.md: extend common-toolsets list with search / messaging /
  spotify / discord / debugging / safe
- fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY
  (lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan,
  tencent-tokenhub, azure-foundry)
- plugins.md: Available Hooks table now includes on_session_finalize,
  on_session_reset, subagent_stop
- built-in-plugins.md: add the 7 bundled plugins the page didn't
  mention (spotify, google_meet, three image_gen providers, two
  dashboard examples)
- web-dashboard.md: add --insecure and --tui flags
- cron.md: hermes cron create takes positional schedule/prompt, not
  flags

Messaging:
- telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when
  TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it
  per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch.
- discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default
  is 2.0, not 0.1
- dingtalk.md: document DINGTALK_REQUIRE_MENTION /
  FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL /
  ALLOW_ALL_USERS that the adapter supports
- bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env
  var; the setting lives in platforms.bluebubbles.extra only
- qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and
  QQ_GROUP_ALLOWED_USERS
- wecom-callback.md: replace 'hermes gateway start' (service-only)
  with 'hermes gateway' for first-time setup

Developer-guide:
- architecture.md: refresh tool/toolset counts (61/52), terminal
  backend count (7), line counts for run_agent.py (~13.7k), cli.py
  (~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py
  (~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform
  adapter count 18 -> 20
- agent-loop.md: run_agent.py line count 10.7k -> 13.7k
- tools-runtime.md: add vercel_sandbox backend
- adding-tools.md: remove stale 'Discovery import added to
  model_tools.py' checklist item (registry auto-discovery)
- adding-platform-adapters.md: mark send_typing / get_chat_info as
  concrete base methods; only connect/disconnect/send are abstract
- acp-internals.md: ACP sessions now persist to SessionDB
  (~/.hermes/state.db); acp.run_agent call uses
  use_unstable_protocol=True
- cron-internals.md: gateway runs scheduler in a dedicated background
  thread via _start_cron_ticker, not on a maintenance cycle; locking
  is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows)
- gateway-internals.md: gateway/run.py ~12k lines
- provider-runtime.md: cron DOES support fallback (run_job reads
  fallback_providers from config)
- session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations
  10 and 11 (trigram FTS, inline-mode FTS5 re-index); add
  api_call_count column to Sessions DDL; document messages_fts_trigram
  and state_meta in the architecture tree
- context-compression-and-caching.md: remove the obsolete 'context
  pressure warnings' section (warnings were removed for causing
  models to give up early)
- context-engine-plugin.md: compress() signature now includes
  focus_topic param
- extending-the-cli.md: _build_tui_layout_children signature now
  includes model_picker_widget; add to default layout

Also fixed three pre-existing broken links/anchors the build warned
about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and
tips#background-tasks, nix-setup.md -> #container-aware-cli).

Regenerated per-skill pages via website/scripts/generate-skill-docs.py
so catalog tables and sidebar are consistent with current SKILL.md
frontmatter.

docusaurus build: clean, no broken links or anchors.
2026-04-29 20:55:59 -07:00
Teknium
22ff6ca32b
docs: two-week gap sweep — platforms, CLI, config, TUI, hooks, providers (#17727)
Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior
without docs coverage. No functional code changes; docs + static manifest
regeneration only.

Highlights:

Stale / incorrect:
- configuration.md: auxiliary auto-routing line was wrong since #11900;
  now correctly states auto routes to the main model, with a note on the
  cost trade-off and per-task override pattern.
- integrations/providers.md + configuration.md compression intro:
  removed stale 'Gemini Flash via OpenRouter' claim.
- website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py
  so the live manifest picks up tencent/hy3-preview (and remains in sync
  for future model-catalog PRs).

Platform messaging (#17417 #16997 #16193 #14315 #13151 #11794 #10610
#10283 #10246 #11564 #13178):
- Signal: native formatting (bodyRanges), reply quotes, reactions.
- Telegram: table rendering (bullets + code-block fallback),
  disable_link_previews, group_allowed_chats.
- Slack: strict_mention config.
- Discord: slash_commands disable, send_animation GIF, send_message
  native media attachments.
- DingTalk: require_mention + allowed_users.

CLI (#16052 #16539 #16566 #15841 #14798 #10043):
- New 'hermes fallback' interactive manager.
- New 'hermes update --check', '--backup' flag, and pre-update pairing
  snapshot behavior.
- 'hermes gateway start/restart --all' multi-profile flag.
- cron.md: 'hermes tools' as a platform, per-job enabled_toolsets,
  wakeAgent gate, context_from chaining.

Config keys / env vars (#17305 #17026 #17000 #15077 #14557 #14227
#14166 #14730 #17008):
- terminal.docker_run_as_host_user, display.runtime_metadata_footer,
  compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT,
  skills.guard_agent_created, TAVILY_BASE_URL,
  security.allow_private_urls, agent.api_max_retries,
  gateway hot-reload of compression/context_length config edits.

TUI / CLI UX (#17130 #17113 #17175 #17150 #16707 #12312 #12305 #12934
#14810 #14045 #17286 #17126):
- HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator
  styles, ctrl-x queued-message delete, git branch in status bar, per-
  prompt elapsed stopwatch, external-editor keybind, markdown stripping,
  TUI voice-mode parity, /agents overlay, /reload + /mouse.

Gateway features (#16506 #15027 #13428 #12116):
- Native multimodal image routing based on vision capability.
- /usage account-limits section.
- /steer slash command (added to reference + explanation in CLI).

Plugins / hooks (#12929 #12972 #10763 #16364):
- transform_tool_result, transform_terminal_output plugin hooks.
- PluginContext.dispatch_tool() documented with slash-command example.
- google_meet bundled plugin entry under built-in-plugins.md.

Other (#16576 #16572 #16383 #15878 #15608 #15606 #14809 #14767 #14231
#14232 #14307 #13683 #12373 #11891 #11291 #10066):
- hermes backup exclusions (WAL/SHM/journal + checkpoints/).
- security.md hardline blocklist (floor below --yolo).
- FHS install layout for root installs.
- openssh-client + docker-cli baked into the Docker image.
- MEDIA: tag supported extensions table (docs/office/archives/pdf).
- Remote-to-host file sync on SSH/Modal/Daytona teardown.
- 'hermes model' -> Configure Auxiliary Models interactive picker.
- Podman support via HERMES_DOCKER_BINARY.

Providers / STT / one-shot (#15045 #14473 #15704):
- alibaba-coding-plan first-class provider entry.
- xAI Grok STT as a 6th transcription option.
- 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL.

Build: 'docusaurus build' succeeds. No new broken links/anchors;
pre-existing warnings unchanged.
2026-04-29 20:32:37 -07:00
Adam Manning
eafa637287 docs: document MiniMax OAuth login flow
Add comprehensive documentation for the minimax-oauth provider.

New file: website/docs/guides/minimax-oauth.md
  - Overview table (provider ID, auth type, models, endpoints)
  - Quick start via 'hermes model'
  - Manual login via 'hermes auth add minimax-oauth'
  - --region global|cn flag reference
  - The PKCE OAuth flow explained step-by-step
  - hermes doctor output example
  - Configuration reference (config.yaml shape, region table, aliases)
  - Environment variables note: MINIMAX_API_KEY is NOT used by
    minimax-oauth (OAuth path uses browser login)
  - Models table with context length note
  - Troubleshooting section: expired token, timeout, state mismatch,
    headless/remote sessions, not logged in
  - Logout command

Updated: website/docs/getting-started/quickstart.md
  - Add MiniMax (OAuth) to provider picker table as the recommended
    path for users who want MiniMax models without an API key

Updated: website/docs/user-guide/configuration.md
  - Add 'minimax-oauth' to the auxiliary providers list
  - Add MiniMax OAuth tip callout in the providers section
  - Add minimax-oauth row to the provider table (auxiliary tasks)
  - Add MiniMax OAuth config.yaml example in Common Setups

Updated: website/docs/reference/environment-variables.md
  - Annotate MINIMAX_API_KEY, MINIMAX_BASE_URL, MINIMAX_CN_API_KEY,
    MINIMAX_CN_BASE_URL as NOT used by minimax-oauth
  - Add minimax-oauth to HERMES_INFERENCE_PROVIDER allowed values
2026-04-29 09:53:42 -07:00
Scott Trinh
5a1d4f6804 feat: add Vercel Sandbox backend
Adds Vercel Sandbox as a supported Hermes terminal backend alongside
existing providers (Local, Docker, Modal, SSH, Daytona, Singularity).

Uses the Vercel Python SDK to create/manage cloud microVMs, supports
snapshot-based filesystem persistence keyed by task_id, and integrates
with the existing BaseEnvironment shell contract and FileSyncManager
for credential/skill syncing.

Based on #17127 by @scotttrinh, cherry-picked onto current main.
2026-04-29 07:22:33 -07:00
helix4u
a3c27b5cd1 docs: clarify quick commands config shape 2026-04-28 11:07:07 -07:00
Teknium
185ecc71f1 docs: document agent.disabled_toolsets config + AUTHOR_MAP
Follow-up to the salvaged PR #16867 that added the read path for
agent.disabled_toolsets in _get_platform_tools():

- Document the new config key under a "Global Toolset Disable" section
  in website/docs/user-guide/configuration.md, including the precedence
  note (global disable overrides per-platform platform_toolsets).
- Map nazirulhafiy@gmail.com -> nazirulhafiy in scripts/release.py
  AUTHOR_MAP so release-notes CI attributes the cherry-picked commit.
2026-04-28 01:23:16 -07:00
Teknium
8081425a1c
feat(security): make secret redaction off by default (#16794)
Flips security.redact_secrets from true to false in DEFAULT_CONFIG, and
the HERMES_REDACT_SECRETS env-var fallback in agent/redact.py now
requires explicit opt-in ("1"/"true"/"yes"/"on") to enable.

New installs and users without a security.redact_secrets key get pass-
through tool output. Existing users whose config.yaml explicitly sets
redact_secrets: true keep redaction on — the config-yaml -> env-var
bridges in hermes_cli/main.py and gateway/run.py still honor their
setting.

Also updates the inline config comments, website docs, and the
hermes-agent skill so /hermes config set security.redact_secrets true
is now the documented way to turn it on.
2026-04-27 21:24:08 -07:00
Isaac Huang
c53fcb0173 feat(providers): add GMI Cloud as a first-class API-key provider (#11955)
Add GMI Cloud (api.gmi-serving.com) as a full first-class API-key provider
with built-in auth, aliases, model catalog, CLI entry points, auxiliary client
routing, context length resolution, doctor checks, env var tracking, and docs.

- auth.py: ProviderConfig for 'gmi' (api_key, GMI_API_KEY / GMI_BASE_URL)
- providers.py: HermesOverlay with extra_env_vars for models.dev detection
- models.py: curated slash-form model catalog; live /v1/models fetch
- main.py: 'gmi' in _named_custom_provider_map and --provider choices
- model_metadata.py: _URL_TO_PROVIDER, _PROVIDER_PREFIXES, dedicated
  context-length probe block (GMI's /models has authoritative data)
- auxiliary_client.py: alias entries; _compat_model fix for slash-form
  models on cached aggregator-style clients; gmi aux default model
- doctor.py: GMI in provider connectivity checks
- config.py: GMI_API_KEY / GMI_BASE_URL in OPTIONAL_ENV_VARS
- conftest.py: explicit GMI_BASE_URL clearing (not caught by _API_KEY suffix)
- docs: providers.md, environment-variables.md, fallback-providers.md,
  configuration.md, quickstart.md (expands provider table)

Co-authored-by: Isaac Huang <isaachuang@Isaacs-MacBook-Pro.local>
2026-04-27 11:17:59 -07:00
Teknium
b16f9d438b
feat(telegram): send fresh finals for stale preview streams (port openclaw#72038) (#16261)
Some checks are pending
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-and-push (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
Tests / test (push) Waiting to run
Tests / e2e (push) Waiting to run
Ports openclaw/openclaw#72038 to hermes-agent.

Telegram's `editMessageText` preserves the original message timestamp,
so a long-running streamed reply (reasoning models that take 60+ seconds
to finish) would keep the first-token timestamp even after completion.
Users can't tell how long a task actually took.

When a preview message has been visible for >= 60s (configurable via
`streaming.fresh_final_after_seconds`), finalize by sending a fresh
message instead of editing in place, then best-effort delete the stale
preview. Short previews still edit in place (the existing fast path).

Implementation notes adapted from OpenClaw's TypeScript original:
- `StreamConsumerConfig` gains `fresh_final_after_seconds` (default 0 =
  legacy edit-in-place). Gateway-level `StreamingConfig` defaults to 60.
- `GatewayStreamConsumer` tracks `_message_created_ts` at first-send and
  checks it in `_send_or_edit` on `finalize=True`. New helpers
  `_should_send_fresh_final` + `_try_fresh_final`.
- `BasePlatformAdapter` gains optional `delete_message(chat_id, message_id)`
  returning False by default. `TelegramAdapter` implements it via
  `_bot.delete_message`.
- `gateway/run.py` only enables fresh-final for `Platform.TELEGRAM`;
  other platforms ignore the setting (they don't have the stale-edit
  timestamp problem or edit-then-read works cheaply).
- Fallback to normal edit on any fresh-send failure — no user-visible
  regression if Telegram rate-limits a send or the message is gone.

Tests: 15 new cases in tests/gateway/test_stream_consumer_fresh_final.py
covering short/long previews, config plumbing, delete-support absent,
send-failure fallback, __no_edit__ sentinel safety, and StreamingConfig
round-trip.

Co-authored-by: Hermes Agent <agent@nousresearch.com>
2026-04-26 17:26:37 -07:00
Teknium
5b2c59559a
feat(terminal): collapse subagent task_ids to shared container (#16177)
Before: delegate_task children each allocated their own terminal
sandbox keyed by child task_id. Starting extra containers (or Modal
sandboxes / Daytona workspaces) is expensive, and the subagent's work
is invisible to the parent — files written by the child in its
container don't exist in the parent's when the subagent returns.

After: a single `_resolve_container_task_id` helper maps any
tool-call task_id to "default" UNLESS an env override is registered
for it. The parent agent and all delegate_task children therefore
share one long-lived sandbox — installed packages, cwd, /workspace
files, and /tmp scratch carry over freely between them.

RL and benchmark environments (TerminalBench2, HermesSweEnv, ...)
opt in to isolation via `register_task_env_overrides(task_id, {...})`;
those task_ids survive the collapse and get their own sandbox,
preserving the per-task Docker image behavior these benchmarks rely on.

file_state / active-subagents registry / TUI events still key off the
original child task_id, so the 'subagent wrote a file the parent read'
warning and UI per-subagent panels keep working.

Tradeoff: parallel delegate_task children (tasks=[...]) now share one
bash/container. Concurrent cd, env-var mutations, and writes to the
same path will collide. If that bites a specific workflow, the
subagent can opt back into isolation via register_task_env_overrides.

Applied at four lookup sites:
- tools/terminal_tool.py terminal_tool() and get_active_env()
- tools/file_tools.py _get_file_ops() and _get_live_tracking_cwd()
- tools/code_execution_tool.py _get_or_create_environment()

Docs: website/docs/user-guide/configuration.md updated to reflect the
shared-container reality and document the RL/benchmark carve-out.
Tests: tests/tools/test_shared_container_task_id.py (9 cases).
2026-04-26 11:55:02 -07:00
Teknium
9be83728a6
docs(docker-backend): clarify container is shared across sessions, not per-session (#16158)
The Docker terminal-backend docs said 'each session starts a long-lived
container', implying a fresh container per chat session. That hasn't been
true for a while: for the top-level agent, task_id defaults to 'default'
and the container is cached in _active_environments for the lifetime of
the Hermes process. /new, /reset, and switching sessions all reuse the
same container. Only delegate_task subagents and RL rollouts get isolated
containers keyed by their own task_id.
2026-04-26 10:46:08 -07:00
Teknium
7c50ed707c docs(azure-foundry): add provider guide, env vars, release AUTHOR_MAP
- New website/docs/guides/azure-foundry.md covering both OpenAI-style
  and Anthropic-style endpoints, auto-detection behaviour, gpt-5.x
  routing, /v1 stripping, api-version query forwarding, and the
  provider: anthropic + Azure URL alternative setup.
- environment-variables.md picks up AZURE_FOUNDRY_API_KEY,
  AZURE_FOUNDRY_BASE_URL, AZURE_ANTHROPIC_KEY.
- cli-commands.md includes azure-foundry in the provider choices list.
- configuration.md lists azure-foundry among auxiliary-task providers.
- sidebars.ts wires the new guide into the Guides section.
- scripts/release.py AUTHOR_MAP entries for TechPrototyper,
  HangGlidersRule (noreply), and pein892 so the contributor-attribution
  CI check does not reject the salvage.
2026-04-25 18:48:43 -07:00
Teknium
dc4d92f131
docs: embed tutorial videos on webhooks + auxiliary models pages (#15809)
- webhooks.md: adds a Video Tutorial section under the intro with a
  responsive YouTube iframe (WNYe5mD4fY8).
- configuration.md: adds a Video Tutorial subsection under Auxiliary
  Models with a responsive YouTube iframe (NoF-YajElIM).

Both use a 16:9 aspect-ratio wrapper so the embeds scale cleanly on
mobile. Verified with `npm run build` — MDX parses clean, no new
warnings or broken links introduced.
2026-04-25 16:44:53 -07:00
Teknium
ea01bdcebe
refactor(memory): remove flush_memories entirely (#15696)
The AIAgent.flush_memories pre-compression save, the gateway
_flush_memories_for_session, and everything feeding them are
obsolete now that the background memory/skill review handles
persistent memory extraction.

Problems with flush_memories:

- Pre-dates the background review loop.  It was the only memory-save
  path when introduced; the background review now fires every 10 user
  turns on CLI and gateway alike, which is far more frequent than
  compression or session reset ever triggered flush.
- Blocking and synchronous.  Pre-compression flush ran on the live agent
  before compression, blocking the user-visible response.
- Cache-breaking.  Flush built a temporary conversation prefix
  (system prompt + memory-only tool list) that diverged from the live
  conversation's cached prefix, invalidating prompt caching.  The
  gateway variant spawned a fresh AIAgent with its own clean prompt
  for each finalized session — still cache-breaking, just in a
  different process.
- Redundant.  Background review runs in the live conversation's
  session context, gets the same content, writes to the same memory
  store, and doesn't break the cache.  Everything flush_memories
  claimed to preserve is already covered.

What this removes:

- AIAgent.flush_memories() method (~248 LOC in run_agent.py)
- Pre-compression flush call in _compress_context
- flush_memories call sites in cli.py (/new + exit)
- GatewayRunner._flush_memories_for_session + _async_flush_memories
  (and the 3 call sites: session expiry watcher, /new, /resume)
- 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks,
  hermes tools UI task list, auxiliary_client docstrings
- _memory_flush_min_turns config + init
- #15631's headroom-deduction math in
  _check_compression_model_feasibility (headroom was only needed
  because flush dragged the full main-agent system prompt along;
  the compression summariser sends a single user-role prompt so
  new_threshold = aux_context is safe again)
- The dedicated test files and assertions that exercised
  flush-specific paths

What this renames (with read-time backcompat on sessions.json):

- SessionEntry.memory_flushed -> SessionEntry.expiry_finalized.
  The session-expiry watcher still uses the flag to avoid re-running
  finalize/eviction on the same expired session; the new name
  reflects what it now actually gates.  from_dict() reads
  'expiry_finalized' first, falls back to the legacy 'memory_flushed'
  key so existing sessions.json files upgrade seamlessly.

Supersedes #15631 and #15638.

Tested: 383 targeted tests pass across run_agent/, agent/, cli/,
and gateway/ session-boundary suites.  No behavior regressions —
background memory review continues to handle persistent memory
extraction on both CLI and gateway.
2026-04-25 08:21:14 -07:00
Teknium
5a1c599412
feat(browser): CDP supervisor — dialog detection + response + cross-origin iframe eval (#14540)
* docs: browser CDP supervisor design (for upcoming PR)

Design doc ahead of implementation — dialog + iframe detection/interaction
via a persistent CDP supervisor. Covers backend capability matrix (verified
live 2026-04-23), architecture, lifecycle, policy, agent surface, PR split,
non-goals, and test plan.

Supersedes #12550.

No code changes in this commit.

* feat(browser): add persistent CDP supervisor for dialog + frame detection

Single persistent CDP WebSocket per Hermes task_id that subscribes to
Page/Runtime/Target events and maintains thread-safe state for pending
dialogs, frame tree, and console errors.

Supervisor lives in its own daemon thread running an asyncio loop;
external callers use sync API (snapshot(), respond_to_dialog()) that
bridges onto the loop.

Auto-attaches to OOPIF child targets via Target.setAutoAttach{flatten:true}
and enables Page+Runtime on each so iframe-origin dialogs surface through
the same supervisor.

Dialog policies: must_respond (default, 300s safety timeout),
auto_dismiss, auto_accept.

Frame tree capped at 30 entries + OOPIF depth 2 to keep snapshot
payloads bounded on ad-heavy pages.

E2E verified against real Chrome via smoke test — detects + responds
to main-frame alerts, iframe-contentWindow alerts, preserves frame
tree, graceful no-dialog error path, clean shutdown.

No agent-facing tool wiring in this commit (comes next).

* feat(browser): add browser_dialog tool wired to CDP supervisor

Agent-facing response-only tool. Schema:
  action: 'accept' | 'dismiss' (required)
  prompt_text: response for prompt() dialogs (optional)
  dialog_id: disambiguate when multiple dialogs queued (optional)

Handler:
  SUPERVISOR_REGISTRY.get(task_id).respond_to_dialog(...)

check_fn shares _browser_cdp_check with browser_cdp so both surface and
hide together. When no supervisor is attached (Camofox, default
Playwright, or no browser session started yet), tool is hidden; if
somehow invoked it returns a clear error pointing the agent to
browser_navigate / /browser connect.

Registered in _HERMES_CORE_TOOLS and the browser / hermes-acp /
hermes-api-server toolsets alongside browser_cdp.

* feat(browser): wire CDP supervisor into session lifecycle + browser_snapshot

Supervisor lifecycle:
  * _get_session_info lazy-starts the supervisor after a session row is
    materialized — covers every backend code path (Browserbase, cdp_url
    override, /browser connect, future providers) with one hook.
  * cleanup_browser(task_id) stops the supervisor for that task first
    (before the backend tears down CDP).
  * cleanup_all_browsers() calls SUPERVISOR_REGISTRY.stop_all().
  * /browser connect eagerly starts the supervisor for task 'default'
    so the first snapshot already shows pending_dialogs.
  * /browser disconnect stops the supervisor.

CDP URL resolution for the supervisor:
  1. BROWSER_CDP_URL / browser.cdp_url override.
  2. Fallback: session_info['cdp_url'] from cloud providers (Browserbase).

browser_snapshot merges supervisor state (pending_dialogs + frame_tree)
into its JSON output when a supervisor is active — the agent reads
pending_dialogs from the snapshot it already requests, then calls
browser_dialog to respond. No extra tool surface.

Config defaults:
  * browser.dialog_policy: 'must_respond' (new)
  * browser.dialog_timeout_s: 300 (new)
No version bump — new keys deep-merge into existing browser section.

Deadlock fix in supervisor event dispatch:
  * _on_dialog_opening and _on_target_attached used to await CDP calls
    while the reader was still processing an event — but only the reader
    can set the response Future, so the call timed out.
  * Both now fire asyncio.create_task(...) so the reader stays pumping.
  * auto_dismiss/auto_accept now actually close the dialog immediately.

Tests (tests/tools/test_browser_supervisor.py, 11 tests, real Chrome):
  * supervisor start/snapshot
  * main-frame alert detection + dismiss
  * iframe.contentWindow alert
  * prompt() with prompt_text reply
  * respond with no pending dialog -> clean error
  * auto_dismiss clears on event
  * registry idempotency
  * registry stop -> snapshot reports inactive
  * browser_dialog tool no-supervisor error
  * browser_dialog invalid action
  * browser_dialog end-to-end via tool handler

xdist-safe: chrome_cdp fixture uses a per-worker port.
Skipped when google-chrome/chromium isn't installed.

* docs(browser): document browser_dialog tool + CDP supervisor

- user-guide/features/browser.md: new browser_dialog section with
  workflow, availability gate, and dialog_policy table
- reference/tools-reference.md: row for browser_dialog, tool count
  bumped 53 -> 54, browser tools count 11 -> 12
- reference/toolsets-reference.md: browser_dialog added to browser
  toolset row with note on pending_dialogs / frame_tree snapshot fields

Full design doc lives at
developer-guide/browser-supervisor.md (committed earlier).

* fix(browser): reconnect loop + recent_dialogs for Browserbase visibility

Found via Browserbase E2E test that revealed two production-critical issues:

1. **Supervisor WebSocket drops when other clients disconnect.** Browserbase's
   CDP proxy tears down our long-lived WebSocket whenever a short-lived
   client (e.g. agent-browser CLI's per-command CDP connection) disconnects.
   Fixed with a reconnecting _run loop that re-attaches with exponential
   backoff on drops. _page_session_id and _child_sessions are reset on each
   reconnect; pending_dialogs and frames are preserved across reconnects.

2. **Browserbase auto-dismisses dialogs server-side within ~10ms.** Their
   Playwright-based CDP proxy dismisses alert/confirm/prompt before our
   Page.handleJavaScriptDialog call can respond. So pending_dialogs is
   empty by the time the agent reads a snapshot on Browserbase.

   Added a recent_dialogs ring buffer (capacity 20) that retains a
   DialogRecord for every dialog that opened, with a closed_by tag:
     * 'agent'       — agent called browser_dialog
     * 'auto_policy' — local auto_dismiss/auto_accept fired
     * 'watchdog'    — must_respond timeout auto-dismissed (300s default)
     * 'remote'      — browser/backend closed it on us (Browserbase)

   Agents on Browserbase now see the dialog history with closed_by='remote'
   so they at least know a dialog fired, even though they couldn't respond.

3. **Page.javascriptDialogClosed matching bug.** The event doesn't include a
   'message' field (CDP spec has only 'result' and 'userInput') but our
   _on_dialog_closed was matching on message. Fixed to match by session_id
   + oldest-first, with a safety assumption that only one dialog is in
   flight per session (the JS thread is blocked while a dialog is up).

Docs + tests updated:
  * browser.md: new availability matrix showing the three backends and
    which mode (pending / recent / response) each supports
  * developer-guide/browser-supervisor.md: three-field snapshot schema
    with closed_by semantics
  * test_browser_supervisor.py: +test_recent_dialogs_ring_buffer (12/12
    passing against real Chrome)

E2E verified both backends:
  * Local Chrome via /browser connect: detect + respond full workflow
    (smoke_supervisor.py all 7 scenarios pass)
  * Browserbase: detect via recent_dialogs with closed_by='remote'
    (smoke_supervisor_browserbase_v2.py passes)

Camofox remains out of scope (REST-only, no CDP) — tracked for
upstream PR 3.

* feat(browser): XHR bridge for dialog response on Browserbase (FIXED)

Browserbase's CDP proxy auto-dismisses native JS dialogs within ~10ms, so
Page.handleJavaScriptDialog calls lose the race. Solution: bypass native
dialogs entirely.

The supervisor now injects Page.addScriptToEvaluateOnNewDocument with a
JavaScript override for window.alert/confirm/prompt. Those overrides
perform a synchronous XMLHttpRequest to a magic host
('hermes-dialog-bridge.invalid'). We intercept those XHRs via Fetch.enable
with a requestStage=Request pattern.

Flow when a page calls alert('hi'):
  1. window.alert override intercepts, builds XHR GET to
     http://hermes-dialog-bridge.invalid/?kind=alert&message=hi
  2. Sync XHR blocks the page's JS thread (mirrors real dialog semantics)
  3. Fetch.requestPaused fires on our WebSocket; supervisor surfaces
     it as a pending dialog with bridge_request_id set
  4. Agent reads pending_dialogs from browser_snapshot, calls browser_dialog
  5. Supervisor calls Fetch.fulfillRequest with JSON body:
     {accept: true|false, prompt_text: '...', dialog_id: 'd-N'}
  6. The injected script parses the body, returns the appropriate value
     from the override (undefined for alert, bool for confirm, string|null
     for prompt)

This works identically on Browserbase AND local Chrome — no native dialog
ever fires, so Browserbase's auto-dismiss has nothing to race. Dialog
policies (must_respond / auto_dismiss / auto_accept) all still work.

Bridge is installed on every attached session (main page + OOPIF child
sessions) so iframe dialogs are captured too.

Native-dialog path kept as a fallback for backends that don't auto-dismiss
(so a page that somehow bypasses our override — e.g. iframes that load
after Fetch.enable but before the init-script runs — still gets observed
via Page.javascriptDialogOpening).

E2E VERIFIED:
  * Local Chrome: 13/13 pytest tests green (12 original + new
    test_bridge_captures_prompt_and_returns_reply_text that asserts
    window.__ret === 'AGENT-SUPPLIED-REPLY' after agent responds)
  * Browserbase: smoke_bb_bridge_v2.py runs 4/4 PASS:
    - alert('BB-ALERT-MSG') dismiss → page.alert_ret = undefined ✓
    - prompt('BB-PROMPT-MSG', 'default-xyz') accept with 'AGENT-REPLY'
      → page.prompt_ret === 'AGENT-REPLY' ✓
    - confirm('BB-CONFIRM-MSG') accept → page.confirm_ret === true ✓
    - confirm('BB-CONFIRM-MSG') dismiss → page.confirm_ret === false ✓

Docs updated in browser.md and developer-guide/browser-supervisor.md —
availability matrix now shows Browserbase at full parity with local
Chrome for both detection and response.

* feat(browser): cross-origin iframe interaction via browser_cdp(frame_id=...)

Adds iframe interaction to the CDP supervisor PR (was queued as PR 2).

Design: browser_cdp gets an optional frame_id parameter. When set, the
tool looks up the frame in the supervisor's frame_tree, grabs its child
cdp_session_id (OOPIF session), and dispatches the CDP call through the
supervisor's already-connected WebSocket via run_coroutine_threadsafe.

Why not stateless: on Browserbase, each fresh browser_cdp WebSocket
must re-negotiate against a signed connectUrl. The session info carries
a specific URL that can expire while the supervisor's long-lived
connection stays valid. Routing via the supervisor sidesteps this.

Agent workflow:
  1. browser_snapshot → frame_tree.children[] shows OOPIFs with is_oopif=true
  2. browser_cdp(method='Runtime.evaluate', frame_id=<OOPIF frame_id>,
                 params={'expression': 'document.title', 'returnByValue': True})
  3. Supervisor dispatches the call on the OOPIF's child session

Supervisor state fixes needed along the way:
  * _on_frame_detached now skips reason='swap' (frame migrating processes)
  * _on_frame_detached also skips when the frame is an OOPIF with a live
    child session — Browserbase fires spurious remove events when a
    same-origin iframe gets promoted to OOPIF
  * _on_target_detached clears cdp_session_id but KEEPS the frame record
    so the agent still sees the OOPIF in frame_tree during transient
    session flaps

E2E VERIFIED on Browserbase (smoke_bb_iframe_agent_path.py):
  browser_cdp(method='Runtime.evaluate',
              params={'expression': 'document.title', 'returnByValue': True},
              frame_id=<OOPIF>)
  → {'success': True, 'result': {'value': 'Example Domain'}}

  The iframe is <iframe src='https://example.com/'> inside a top-level
  data: URL page on a real Browserbase session. The agent Runtime.evaluates
  INSIDE the cross-origin iframe and gets example.com's title back.

Tests (tests/tools/test_browser_supervisor.py — 16 pass total):
  * test_browser_cdp_frame_id_routes_via_supervisor — injects fake OOPIF,
    verifies routing via supervisor, Runtime.evaluate returns 1+1=2
  * test_browser_cdp_frame_id_missing_supervisor — clean error when no
    supervisor attached
  * test_browser_cdp_frame_id_not_in_frame_tree — clean error on bad
    frame_id

Docs (browser.md and developer-guide/browser-supervisor.md) updated with
the iframe workflow, availability matrix now shows OOPIF eval as shipped
for local Chrome + Browserbase.

* test(browser): real-OOPIF E2E verified manually + chrome_cdp uses --site-per-process

When asked 'did you test the iframe stuff' I had only done a mocked
pytest (fake injected OOPIF) plus a Browserbase E2E. Closed the
local-Chrome real-OOPIF gap by writing /tmp/dialog-iframe-test/
smoke_local_oopif.py:

  * 2 http servers on different hostnames (localhost:18905 + 127.0.0.1:18906)
  * Chrome with --site-per-process so the cross-origin iframe becomes a
    real OOPIF in its own process
  * Navigate, find OOPIF in supervisor.frame_tree, call
    browser_cdp(method='Runtime.evaluate', frame_id=<OOPIF>) which routes
    through the supervisor's child session
  * Asserts iframe document.title === 'INNER-FRAME-XYZ' (from the
    inner page, retrieved via OOPIF eval)

PASSED on 2026-04-23.

Tried to embed this as a pytest but hit an asyncio version quirk between
venv (3.11) and the system python (3.13) — Page.navigate hangs in the
pytest harness but works in standalone. Left a self-documenting skip
test that points to the smoke script + describes the verification.

chrome_cdp fixture now passes --site-per-process so future iframe tests
can rely on OOPIF behavior.

Result: 16 pass + 1 documented-skip = 17 tests in
tests/tools/test_browser_supervisor.py.

* docs(browser): add dialog_policy + dialog_timeout_s to configuration.md, fix tool count

Pre-merge docs audit revealed two gaps:

1. user-guide/configuration.md browser config example was missing the
   two new dialog_* knobs. Added with a short table explaining
   must_respond / auto_dismiss / auto_accept semantics and a link to
   the feature page for the full workflow.

2. reference/tools-reference.md header said '54 built-in tools' — real
   count on main is 54, this branch adds browser_dialog so it's 55.
   Fixed the header.  (browser count was already correctly bumped
   11 -> 12 in the earlier docs commit.)

No code changes.
2026-04-23 22:23:37 -07:00
Teknium
983bbe2d40
feat(skills): add design-md skill for Google's DESIGN.md spec (#14876)
* feat(config): make tool output truncation limits configurable

Port from anomalyco/opencode#23770: expose a new `tool_output` config
section so users can tune the hardcoded truncation caps that apply to
terminal output and read_file pagination.

Three knobs under `tool_output`:
- max_bytes (default 50_000) — terminal stdout/stderr cap
- max_lines (default 2000) — read_file pagination cap
- max_line_length (default 2000) — per-line cap in line-numbered view

All three keep their existing hardcoded values as defaults, so behaviour
is unchanged when the section is absent. Power users on big-context
models can raise them; small-context local models can lower them.

Implementation:
- New `tools/tool_output_limits.py` reads the section with defensive
  fallback (missing/invalid values → defaults, never raises).
- `tools/terminal_tool.py` MAX_OUTPUT_CHARS now comes from
  get_max_bytes().
- `tools/file_operations.py` normalize_read_pagination() and
  _add_line_numbers() now pull the limits at call time.
- `hermes_cli/config.py` DEFAULT_CONFIG gains the `tool_output` section
  so `hermes setup` writes defaults into fresh configs.
- Docs page `user-guide/configuration.md` gains a "Tool Output
  Truncation Limits" section with large-context and small-context
  example configs.

Tests (18 new in tests/tools/test_tool_output_limits.py):
- Default resolution with missing / malformed / non-dict config.
- Full and partial user overrides.
- Coercion of bad values (None, negative, wrong type, str int).
- Shortcut accessors delegate correctly.
- DEFAULT_CONFIG exposes the section with the right defaults.
- Integration: normalize_read_pagination clamps to the configured
  max_lines.

* feat(skills): add design-md skill for Google's DESIGN.md spec

Built-in skill under skills/creative/ that teaches the agent to author,
lint, diff, and export DESIGN.md files — Google's open-source
(Apache-2.0) format for describing a visual identity to coding agents.

Covers:
- YAML front matter + markdown body anatomy
- Full token schema (colors, typography, rounded, spacing, components)
- Canonical section order + duplicate-heading rejection
- Component property whitelist + variants-as-siblings pattern
- CLI workflow via 'npx @google/design.md' (lint/diff/export/spec)
- Lint rule reference including WCAG contrast checks
- Common YAML pitfalls (quoted hex, negative dimensions, dotted refs)
- Starter template at templates/starter.md

Package verified live on npm (@google/design.md@0.1.1).
2026-04-23 21:51:19 -07:00
10ishq
e038677ef6 docs: add Exa web search backend setup guide and details
Adds an Exa-specific setup note next to the Parallel search-modes line
documenting EXA_API_KEY, category filtering (company, research paper,
news, people, personal site, pdf), and domain/date filters.

Reapplied onto current main from @10ishq's PR #6697 — the original branch
was too far behind main to cherry-pick directly (touched 1,456 unrelated
files from deleted/renamed paths).

Co-authored-by: 10ishq <tanishq@exa.ai>
2026-04-22 20:00:29 -07:00
Teknium
bf73ced4f5
docs: document delegation width + depth knobs (#13745)
Fills the three gaps left by the orchestrator/width-depth salvage:

- configuration.md §Delegation: max_concurrent_children, max_spawn_depth,
  orchestrator_enabled are now in the canonical config.yaml reference
  with a paragraph covering defaults, clamping, role-degradation, and
  the 3x3x3=27-leaf cost scaling.
- environment-variables.md: adds DELEGATION_MAX_CONCURRENT_CHILDREN to
  the Agent Behavior table.
- features/delegation.md: corrects stale 'default 5, cap 8' wording
  (that was from the original PR; the salvage landed on default 3 with
  no ceiling and a tool error on excess instead of truncation).
2026-04-21 17:54:39 -07:00
helix4u
b48ea41d27 feat(voice): add cli beep toggle 2026-04-21 00:29:29 -07:00
helix4u
03e3c22e86 fix(config): add stale timeout settings 2026-04-20 00:52:50 -07:00
helix4u
afba54364e docs(config): document session_search auxiliary controls 2026-04-20 00:47:39 -07:00
Teknium
611657487f docs(providers): call out Bedrock as not covered by request_timeout_seconds
AWS Bedrock paths (bedrock_converse + AnthropicBedrock SDK) use boto3
with its own timeout config and are not wired to the per-provider knob.
Documented in cli-config.yaml.example and website configuration.md so
users don't expect it to take effect there.
2026-04-19 11:23:00 -07:00
Teknium
c11ab6f64d feat(providers): enforce request_timeout_seconds on OpenAI-wire primary calls
Live test with timeout_seconds: 0.5 on claude-sonnet-4.6 proved the
initial wiring was insufficient: run_agent.py was overriding the
client-level timeout on every call via hardcoded per-request kwargs.

Root cause: run_agent.py had two sites that pass an explicit timeout=
kwarg into chat.completions.create() — api_kwargs['timeout'] at line
7075 (HERMES_API_TIMEOUT=1800s default) and the streaming path's
_httpx.Timeout(..., read=HERMES_STREAM_READ_TIMEOUT=120s, ...) at line
5760. Both override the per-provider config value the client was
constructed with, so a 0.5s config timeout would silently not enforce.

This commit:
- Adds AIAgent._resolved_api_call_timeout() — config > HERMES_API_TIMEOUT env > 1800s default.
- Uses it for the non-streaming api_kwargs['timeout'] field.
- Uses it for the streaming path's httpx.Timeout(connect, read, write, pool)
  so both connect and read respect the configured value when set.
  Local-provider auto-bump (Ollama/vLLM cold-start) only applies when
  no explicit config value is set.
- New test: test_resolved_api_call_timeout_priority covers all three
  precedence cases (config, env, default).

Live verified: 0.5s config on claude-sonnet-4.6 now triggers
APITimeoutError at ~3s per retry, exhausts 3 retries in ~15s total
(was: 29-47s success with timeout ignored). Positive case (60s config
+ gpt-4o-mini) still succeeds at 1.3s.
2026-04-19 11:23:00 -07:00
Teknium
f1fe29d1c3 feat(providers): extend request_timeout_seconds to all client paths
Follow-up on top of mvanhorn's cherry-picked commit. Original PR only
wired request_timeout_seconds into the explicit-creds OpenAI branch at
run_agent.py init; router-based implicit auth, native Anthropic, and the
fallback chain were still hardcoded to SDK defaults.

- agent/anthropic_adapter.py: build_anthropic_client() accepts an optional
  timeout kwarg (default 900s preserved when unset/invalid).
- run_agent.py: resolve per-provider/per-model timeout once at init; apply
  to Anthropic native init + post-refresh rebuild + stale/interrupt
  rebuilds + switch_model + _restore_primary_runtime + the OpenAI
  implicit-auth path + _try_activate_fallback (with immediate client
  rebuild so the first fallback request carries the configured timeout).
- tests: cover anthropic adapter kwarg honoring; widen mock signatures
  to accept the new timeout kwarg.
- docs/example: clarify that the knob now applies to every transport,
  the fallback chain, and rebuilds after credential rotation.
2026-04-19 11:23:00 -07:00