diff --git a/website/docs/developer-guide/acp-internals.md b/website/docs/developer-guide/acp-internals.md index 968b2b906ad..2ef552e266c 100644 --- a/website/docs/developer-guide/acp-internals.md +++ b/website/docs/developer-guide/acp-internals.md @@ -76,9 +76,8 @@ The manager is thread-safe and supports: Bridged callbacks: - `tool_progress_callback` -- `thinking_callback` +- `thinking_callback` (currently set to `None` in the ACP bridge — reasoning is forwarded through `step_callback` instead) - `step_callback` -- `message_callback` Because `AIAgent` runs in a worker thread while ACP I/O lives on the main event loop, the bridge uses: diff --git a/website/docs/developer-guide/agent-loop.md b/website/docs/developer-guide/agent-loop.md index 4ca66b56283..cf9cb1c1efd 100644 --- a/website/docs/developer-guide/agent-loop.md +++ b/website/docs/developer-guide/agent-loop.md @@ -6,7 +6,7 @@ description: "Detailed walkthrough of AIAgent execution, API modes, tools, callb # Agent Loop Internals -The core orchestration engine is `run_agent.py`'s `AIAgent` class — roughly 13,700 lines that handle everything from prompt assembly to tool dispatch to provider failover. +The core orchestration engine is `run_agent.py`'s `AIAgent` class — a large file (15k+ lines) that handles everything from prompt assembly to tool dispatch to provider failover. ## Core Responsibilities @@ -222,7 +222,7 @@ After each turn: | File | Purpose | |------|---------| -| `run_agent.py` | AIAgent class — the complete agent loop (~13,700 lines) | +| `run_agent.py` | AIAgent class — the complete agent loop | | `agent/prompt_builder.py` | System prompt assembly from memory, skills, context files, personality | | `agent/context_engine.py` | ContextEngine ABC — pluggable context management | | `agent/context_compressor.py` | Default engine — lossy summarization algorithm | diff --git a/website/docs/developer-guide/architecture.md b/website/docs/developer-guide/architecture.md index c8901934199..af2b0a2fd4b 100644 --- a/website/docs/developer-guide/architecture.md +++ b/website/docs/developer-guide/architecture.md @@ -32,8 +32,8 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours │ ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐ │ │ │ Compression │ │ 3 API Modes │ │ Tool Registry│ │ │ │ & Caching │ │ chat_compl. │ │ (registry.py)│ │ -│ │ │ │ codex_resp. │ │ 61 tools │ │ -│ │ │ │ anthropic │ │ 52 toolsets │ │ +│ │ │ │ codex_resp. │ │ 70+ tools │ │ +│ │ │ │ anthropic │ │ 28 toolsets │ │ │ └──────────────┘ └──────────────┘ └──────────────┘ │ └─────────┴─────────────────┴─────────────────┴───────────────────────┘ │ │ @@ -52,8 +52,8 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours ```text hermes-agent/ -├── run_agent.py # AIAgent — core conversation loop (~13,700 lines) -├── cli.py # HermesCLI — interactive terminal UI (~11,500 lines) +├── run_agent.py # AIAgent — core conversation loop (large file) +├── cli.py # HermesCLI — interactive terminal UI (large file) ├── model_tools.py # Tool discovery, schema collection, dispatch ├── toolsets.py # Tool groupings and platform presets ├── hermes_state.py # SQLite session/state database with FTS5 @@ -76,14 +76,14 @@ hermes-agent/ │ └── trajectory.py # Trajectory saving helpers │ ├── hermes_cli/ # CLI subcommands and setup -│ ├── main.py # Entry point — all `hermes` subcommands (~10,400 lines) +│ ├── main.py # Entry point — all `hermes` subcommands (large file) │ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration │ ├── commands.py # COMMAND_REGISTRY — central slash command definitions │ ├── auth.py # PROVIDER_REGISTRY, credential resolution │ ├── runtime_provider.py # Provider → api_mode + credentials │ ├── models.py # Model catalog, provider model lists │ ├── model_switch.py # /model command logic (CLI + gateway shared) -│ ├── setup.py # Interactive setup wizard (~3,500 lines) +│ ├── setup.py # Interactive setup wizard (large file) │ ├── skin_engine.py # CLI theming engine │ ├── skills_config.py # hermes skills — enable/disable per platform │ ├── skills_hub.py # /skills slash command @@ -102,14 +102,14 @@ hermes-agent/ │ ├── browser_tool.py # 10 browser automation tools │ ├── code_execution_tool.py # execute_code sandbox │ ├── delegate_tool.py # Subagent delegation -│ ├── mcp_tool.py # MCP client (~3,100 lines) +│ ├── mcp_tool.py # MCP client (large file) │ ├── credential_files.py # File-based credential passthrough │ ├── env_passthrough.py # Env var passthrough for sandboxes │ ├── ansi_strip.py # ANSI escape stripping │ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity) │ ├── gateway/ # Messaging platform gateway -│ ├── run.py # GatewayRunner — message dispatch (~12,200 lines) +│ ├── run.py # GatewayRunner — message dispatch (large file) │ ├── session.py # SessionStore — conversation persistence │ ├── delivery.py # Outbound message delivery │ ├── pairing.py # DM pairing authorization @@ -213,7 +213,7 @@ A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls. ### Tool System -Central tool registry (`tools/registry.py`) with 61 registered tools across 52 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 7 backends (local, Docker, SSH, Daytona, Modal, Singularity, Vercel Sandbox). +Central tool registry (`tools/registry.py`) with 70+ registered tools across ~28 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 7 backends (local, Docker, SSH, Daytona, Modal, Singularity, Vercel Sandbox). → [Tools Runtime](./tools-runtime.md) diff --git a/website/docs/developer-guide/browser-supervisor.md b/website/docs/developer-guide/browser-supervisor.md index d0aa34dbb2b..ba26d579bbb 100644 --- a/website/docs/developer-guide/browser-supervisor.md +++ b/website/docs/developer-guide/browser-supervisor.md @@ -217,7 +217,6 @@ Issue planned against `jo-inc/camofox-browser` adding: Unit tests use an asyncio mock CDP server that speaks enough of the protocol to exercise all state transitions: attach, enable, navigate, dialog fire, dialog dismiss, frame attach/detach, child target attach, session teardown. -Real-backend E2E (Browserbase + local Chrome) is manual; probe scripts from -the 2026-04-23 investigation kept in-repo under -`scripts/browser_supervisor_e2e.py` so anyone can re-verify on new backend -versions. +Real-backend E2E (Browserbase + local Chrome) is manual — exercise via +`/browser connect` to a live Chrome and run the dialog/frame test cases +described above. diff --git a/website/docs/developer-guide/contributing.md b/website/docs/developer-guide/contributing.md index 9b2cc9b3037..6e00e367330 100644 --- a/website/docs/developer-guide/contributing.md +++ b/website/docs/developer-guide/contributing.md @@ -50,6 +50,8 @@ export VIRTUAL_ENV="$(pwd)/venv" # Install with all extras (messaging, cron, CLI menus, dev tools) uv pip install -e ".[all,dev]" +# tinker-atropos is a git submodule — needs `git submodule update --init` first +# if you didn't clone with `--recurse-submodules` uv pip install -e "./tinker-atropos" # Optional: browser tools diff --git a/website/docs/developer-guide/environments.md b/website/docs/developer-guide/environments.md index 3409f304736..0a5aa00ffff 100644 --- a/website/docs/developer-guide/environments.md +++ b/website/docs/developer-guide/environments.md @@ -172,7 +172,7 @@ parser = get_parser("hermes") # or "mistral", "llama3_json", "qwen", "deepseek_ content, tool_calls = parser.parse(raw_model_output) ``` -Available parsers: `hermes`, `mistral`, `llama3_json`, `qwen`, `qwen3_coder`, `deepseek_v3`, `deepseek_v3_1`, `kimi_k2`, `longcat`, `glm45`, `glm47`. +Available parsers: `hermes`, `mistral`, `llama3_json`, `llama4_json`, `qwen`, `qwen3_coder`, `deepseek_v3`, `deepseek_v3_1` (alias `deepseek_v31`), `kimi_k2`, `longcat`, `glm45`, `glm47`. In Phase 1 (OpenAI server type), parsers are not needed — the server handles tool call parsing natively. diff --git a/website/docs/developer-guide/gateway-internals.md b/website/docs/developer-guide/gateway-internals.md index e10fe6821f0..d0521d4816d 100644 --- a/website/docs/developer-guide/gateway-internals.md +++ b/website/docs/developer-guide/gateway-internals.md @@ -6,13 +6,13 @@ description: "How the messaging gateway boots, authorizes users, routes sessions # Gateway Internals -The messaging gateway is the long-running process that connects Hermes to 14+ external messaging platforms through a unified architecture. +The messaging gateway is the long-running process that connects Hermes to 20+ external messaging platforms through a unified architecture. ## Key Files | File | Purpose | |------|---------| -| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (~12,000 lines) | +| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (large file; check git for current LOC) | | `gateway/session.py` | `SessionStore` — conversation persistence and session key construction | | `gateway/delivery.py` | Outbound message delivery to target platforms/channels | | `gateway/pairing.py` | DM pairing flow for user authorization | @@ -162,7 +162,10 @@ gateway/platforms/ ├── wecom.py # WeCom (WeChat Work) callback ├── weixin.py # Weixin (personal WeChat) via iLink Bot API ├── bluebubbles.py # Apple iMessage via BlueBubbles macOS server -├── qqbot.py # QQ Bot (Tencent QQ) via Official API v2 +├── qqbot/ # QQ Bot (Tencent QQ) via Official API v2 (sub-package: adapter.py, crypto.py, keyboards.py, …) +├── yuanbao.py # Yuanbao (Tencent) DM/group adapter +├── feishu_comment.py # Feishu document/drive comment-reply handler +├── msgraph_webhook.py # Microsoft Graph change-notification webhook (Teams, Outlook, etc.) ├── webhook.py # Inbound/outbound webhook adapter ├── api_server.py # REST API server adapter └── homeassistant.py # Home Assistant conversation integration @@ -205,7 +208,7 @@ Gateway hooks are Python modules that respond to lifecycle events: | `agent:end` | Agent finishes and returns response | | `command:*` | Any slash command is executed | -Hooks are discovered from `gateway/builtin_hooks/` (always active) and `~/.hermes/hooks/` (user-installed). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`. +Hooks are discovered from `gateway/builtin_hooks/` (an extension point — currently empty in the shipped distribution; `_register_builtin_hooks()` is a no-op stub) and `~/.hermes/hooks/` (user-installed). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`. ## Memory Provider Integration diff --git a/website/docs/developer-guide/provider-runtime.md b/website/docs/developer-guide/provider-runtime.md index 492a213e1f6..830382479ff 100644 --- a/website/docs/developer-guide/provider-runtime.md +++ b/website/docs/developer-guide/provider-runtime.md @@ -40,7 +40,7 @@ That ordering matters because Hermes treats the saved model/provider choice as t ## Providers -Current provider families include: +Current provider families include (see `plugins/model-providers/` for the complete bundled set): - AI Gateway (Vercel) - OpenRouter @@ -48,16 +48,27 @@ Current provider families include: - OpenAI Codex - Copilot / Copilot ACP - Anthropic (native) -- Google / Gemini -- Alibaba / DashScope +- Google / Gemini (`gemini`, `google-gemini-cli`) +- Alibaba / DashScope (`alibaba`, `alibaba-coding-plan`) - DeepSeek - Z.AI -- Kimi / Moonshot -- MiniMax -- MiniMax China +- Kimi / Moonshot (`kimi-coding`, `kimi-coding-cn`) +- MiniMax (`minimax`, `minimax-cn`, `minimax-oauth`) - Kilo Code - Hugging Face - OpenCode Zen / OpenCode Go +- AWS Bedrock +- Azure Foundry +- NVIDIA NIM +- xAI (Grok) +- Arcee +- GMI Cloud +- StepFun +- Qwen OAuth +- Xiaomi +- Ollama Cloud +- LM Studio +- Tencent TokenHub - Custom (`provider: custom`) — first-class provider for any OpenAI-compatible endpoint - Named custom providers (`custom_providers` list in config.yaml) @@ -154,7 +165,7 @@ When an auxiliary task is configured with provider `main`, Hermes resolves that ## Fallback models -Hermes supports a configured fallback model/provider pair, allowing runtime failover when the primary model encounters errors. +Hermes supports a configured fallback provider chain — a list of `(provider, model)` entries tried in order when the primary model encounters errors. The legacy single-pair `fallback_model` dict is still accepted for back-compat (and migrated on first write). ### How it works internally diff --git a/website/docs/guides/automation-templates.md b/website/docs/guides/automation-templates.md index a4f47e0bda8..2a6a125aa97 100644 --- a/website/docs/guides/automation-templates.md +++ b/website/docs/guides/automation-templates.md @@ -74,7 +74,7 @@ Review for: - Missing tests for new behavior Post a concise review. If the PR is a trivial docs/typo change, say so briefly." \ - --skills "github-code-review" \ + --skill github-code-review \ --deliver github_comment ``` @@ -296,7 +296,7 @@ Focus on: Skip routine dependency bumps and CI fixes. If nothing notable, respond with [SILENT]. If there are findings, organize by repo with brief analysis of each item." \ - --skills "competitive-pr-scout" \ + --skill competitive-pr-scout \ --name "Competitor scout" \ --deliver telegram ``` @@ -335,7 +335,7 @@ Daily arXiv scan that saves summaries to your note-taking system. ```bash hermes cron create "0 8 * * *" \ "Search arXiv for the 3 most interesting papers on 'language model reasoning' OR 'tool-use agents' from the past day. For each paper, create an Obsidian note with the title, authors, abstract summary, key contribution, and potential relevance to Hermes Agent development." \ - --skills "arxiv,obsidian" \ + --skill arxiv --skill obsidian \ --name "Paper digest" \ --deliver local ``` @@ -430,7 +430,7 @@ If action is 'closed' and pull_request.merged is true: 5. Reference the original PR in the new PR description If action is not 'closed' or not merged, respond with [SILENT]." \ - --skills "github-pr-workflow" \ + --skill github-pr-workflow \ --deliver log ``` @@ -514,7 +514,7 @@ hermes cron create "0 3 * * 0" \ Write a security report with findings categorized by severity (Critical, High, Medium, Low). If nothing found, report a clean bill of health." \ - --skills "codebase-security-audit" \ + --skill codebase-security-audit \ --name "Weekly security audit" \ --deliver telegram ``` diff --git a/website/docs/guides/cron-script-only.md b/website/docs/guides/cron-script-only.md index 06fa2880067..5863412f565 100644 --- a/website/docs/guides/cron-script-only.md +++ b/website/docs/guides/cron-script-only.md @@ -231,16 +231,15 @@ Silent when both filesystems are under 90%; fires exactly one line per over-thre | Approach | What runs | When to use | |----------|-----------|-------------| -| `hermes send` (one-shot) | Any shell command piping into it | Ad-hoc delivery or as the action of an external scheduler (systemd, launchd) | | `cronjob --no-agent` (this page) | Your script on Hermes' schedule | Recurring watchdogs / alerts / metrics that don't need reasoning | | `cronjob` (default, LLM) | Agent with optional pre-check script | When the message content requires reasoning over data | -| OS cron + `hermes send` | Your script on the OS schedule | When Hermes might be unhealthy (the thing you're monitoring) | +| OS cron + `curl` to a [webhook subscription](/docs/user-guide/features/webhooks) | Your script on the OS schedule | When Hermes might be unhealthy (the thing you're monitoring) | -For critical system-health watchdogs that must fire *even when the gateway is down*, keep using OS-level cron + a plain `curl` or `hermes send` call — those run as independent OS processes and don't depend on Hermes being up. The in-gateway scheduler is the right choice when the thing being monitored is external. +For critical system-health watchdogs that must fire *even when the gateway is down*, use OS-level cron with a plain `curl` to a Hermes webhook subscription (or any external alerting endpoint) — those run as independent OS processes and don't depend on Hermes being up. The in-gateway scheduler is the right choice when the thing being monitored is external. ## Related - [Automate Anything with Cron](/docs/guides/automate-with-cron) — LLM-driven cron patterns. - [Scheduled Tasks (Cron) reference](/docs/user-guide/features/cron) — full schedule syntax, lifecycle, delivery routing. -- [Pipe Script Output with `hermes send`](/docs/guides/pipe-script-output) — the one-shot counterpart for ad-hoc scripts. +- [Webhook Subscriptions](/docs/user-guide/features/webhooks) — fire-and-forget HTTP entry points for external schedulers. - [Gateway Internals](/docs/developer-guide/gateway-internals) — delivery-router internals. diff --git a/website/docs/guides/cron-troubleshooting.md b/website/docs/guides/cron-troubleshooting.md index d85a1530909..0db25044bca 100644 --- a/website/docs/guides/cron-troubleshooting.md +++ b/website/docs/guides/cron-troubleshooting.md @@ -38,7 +38,7 @@ If the job fires once and then disappears from the list, it's a one-shot schedul Cron jobs are fired by the gateway's background ticker thread, which ticks every 60 seconds. A regular CLI chat session does **not** automatically fire cron jobs. -If you're expecting jobs to fire automatically, you need a running gateway (`hermes gateway` or `hermes serve`). For one-off debugging, you can manually trigger a tick with `hermes cron tick`. +If you're expecting jobs to fire automatically, you need a running gateway (`hermes gateway` for foreground, or `hermes gateway start` for the installed service). For one-off debugging, you can manually trigger a tick with `hermes cron tick`. ### Check 4: Check the system clock and timezone diff --git a/website/docs/guides/local-ollama-setup.md b/website/docs/guides/local-ollama-setup.md index ae0cc445a82..9e2fab5e5de 100644 --- a/website/docs/guides/local-ollama-setup.md +++ b/website/docs/guides/local-ollama-setup.md @@ -31,11 +31,11 @@ By the end, you'll have: | **GPU** | Not required | NVIDIA GPU with 8+ GB VRAM speeds things up significantly | :::tip CPU-only works, but expect slower responses -Ollama runs on CPU-only servers. A 9B model on a modern 8-core CPU gives ~10 tokens/sec. A 31B model on CPU is slower (~2–5 tokens/sec) — each response takes 30–120 seconds, but it works. A GPU dramatically improves this. For CPU-only setups, increase the API timeout in config: +Ollama runs on CPU-only servers. A 9B model on a modern 8-core CPU gives ~10 tokens/sec. A 31B model on CPU is slower (~2–5 tokens/sec) — each response takes 30–120 seconds, but it works. A GPU dramatically improves this. For CPU-only setups, widen the API timeout via the env var (it's not a `config.yaml` key): -```yaml -agent: - api_timeout: 1800 # 30 minutes — generous for slow local models +```bash +# ~/.hermes/.env +HERMES_API_TIMEOUT=1800 # 30 minutes — generous for slow local models ``` ::: diff --git a/website/docs/guides/minimax-oauth.md b/website/docs/guides/minimax-oauth.md index 2bc1ef3683c..2914c4c1979 100644 --- a/website/docs/guides/minimax-oauth.md +++ b/website/docs/guides/minimax-oauth.md @@ -56,10 +56,12 @@ hermes auth add minimax-oauth ### China region -If your account is on the China platform (`minimaxi.com`), pass `--region cn`: +If your account is on the China platform (`minimaxi.com`), use the China-region OAuth provider id `minimax-cn` instead, or skip OAuth and configure `MINIMAX_CN_API_KEY` / `MINIMAX_CN_BASE_URL` directly. The `--region cn` flag described in older docs is **not** wired through the CLI's argument parser; use the `minimax-cn` provider instead: ```bash -hermes auth add minimax-oauth --region cn +hermes auth add minimax-cn --type oauth # if OAuth is supported on your CN account +# or simpler: +echo 'MINIMAX_CN_API_KEY=your-key' >> ~/.hermes/.env ``` ### Remote / headless sessions @@ -128,12 +130,12 @@ model: base_url: https://api.minimax.io/anthropic ``` -### `--region` flag +### Region endpoints -| Value | Portal | Inference endpoint | -|-------|--------|-------------------| -| `global` (default) | `https://api.minimax.io` | `https://api.minimax.io/anthropic` | -| `cn` | `https://api.minimaxi.com` | `https://api.minimaxi.com/anthropic` | +| Provider id | Portal | Inference endpoint | +|-------------|--------|-------------------| +| `minimax-oauth` (global) | `https://api.minimax.io` | `https://api.minimax.io/anthropic` | +| `minimax-cn` (China) | `https://api.minimaxi.com` | `https://api.minimaxi.com/anthropic` | ### Provider aliases diff --git a/website/docs/guides/operate-teams-meeting-pipeline.md b/website/docs/guides/operate-teams-meeting-pipeline.md index 1e32e74c1a7..78c25e6d0ab 100644 --- a/website/docs/guides/operate-teams-meeting-pipeline.md +++ b/website/docs/guides/operate-teams-meeting-pipeline.md @@ -54,21 +54,32 @@ You MUST run `maintain-subscriptions` on a schedule. Pick one of these three opt #### Option 1: Hermes cron (recommended if you already run the Hermes gateway) -Hermes ships a built-in cron scheduler. Add a script-only cron job that runs every 12 hours (gives 6x headroom against the 72h expiry window): +Hermes ships a built-in cron scheduler. The `--no-agent` mode runs a script as the job (rather than using an LLM), and `--script` must point at a file under `~/.hermes/scripts/`. First create the script: ```bash -hermes cron add \ +mkdir -p ~/.hermes/scripts +cat > ~/.hermes/scripts/maintain-teams-subscriptions.sh <<'EOF' +#!/usr/bin/env bash +exec hermes teams-pipeline maintain-subscriptions +EOF +chmod +x ~/.hermes/scripts/maintain-teams-subscriptions.sh +``` + +Then register a script-only cron job that runs every 12 hours (gives 6x headroom against the 72h expiry window): + +```bash +hermes cron create "0 */12 * * *" \ --name "teams-pipeline-maintain-subscriptions" \ - --schedule "0 */12 * * *" \ - --script-only \ - --command "hermes teams-pipeline maintain-subscriptions" + --no-agent \ + --script maintain-teams-subscriptions.sh \ + --deliver local ``` Verify it was registered and inspect the next run time: ```bash hermes cron list -hermes cron show teams-pipeline-maintain-subscriptions +hermes cron status # scheduler status ``` #### Option 2: systemd timer (recommended for Linux production deployments) diff --git a/website/docs/guides/python-library.md b/website/docs/guides/python-library.md index 3e857f7dd11..3bb08645ac9 100644 --- a/website/docs/guides/python-library.md +++ b/website/docs/guides/python-library.md @@ -81,7 +81,8 @@ print(f"Messages exchanged: {len(result['messages'])}") The returned dictionary contains: - **`final_response`** — The agent's final text reply - **`messages`** — The complete message history (system, user, assistant, tool calls) -- **`task_id`** — The task identifier used for VM isolation + +(The `task_id` you pass in is stored on the agent instance for VM isolation but isn't echoed back in the return dict.) You can also pass a custom system message that overrides the ephemeral system prompt for that call: diff --git a/website/docs/guides/use-mcp-with-hermes.md b/website/docs/guides/use-mcp-with-hermes.md index 6d86eea1eef..5fa43bbcde5 100644 --- a/website/docs/guides/use-mcp-with-hermes.md +++ b/website/docs/guides/use-mcp-with-hermes.md @@ -143,7 +143,7 @@ Use `chrome-devtools-mcp`. If your Windows Chrome already has live remote debugging enabled from `chrome://inspect/#remote-debugging`, add it like this from WSL: ```bash -hermes mcp add chrome-devtools-win --command cmd.exe --args /c "npx -y chrome-devtools-mcp@latest --autoConnect --no-usage-statistics" +hermes mcp add chrome-devtools-win --command cmd.exe --args /c npx -y chrome-devtools-mcp@latest --autoConnect --no-usage-statistics ``` After saving the server: diff --git a/website/docs/index.md b/website/docs/index.md index 86abf444037..bab06f634d5 100644 --- a/website/docs/index.md +++ b/website/docs/index.md @@ -47,7 +47,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl | 🗺️ **[Learning Path](/docs/getting-started/learning-path)** | Find the right docs for your experience level | | ⚙️ **[Configuration](/docs/user-guide/configuration)** | Config file, providers, models, and options | | 💬 **[Messaging Gateway](/docs/user-guide/messaging)** | Set up Telegram, Discord, Slack, WhatsApp, Teams, or more | -| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 68 built-in tools and how to configure them | +| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 70+ built-in tools and how to configure them | | 🧠 **[Memory System](/docs/user-guide/features/memory)** | Persistent memory that grows across sessions | | 📚 **[Skills System](/docs/user-guide/features/skills)** | Procedural memory the agent creates and reuses | | 🔌 **[MCP Integration](/docs/user-guide/features/mcp)** | Connect to MCP servers, filter their tools, and extend Hermes safely | @@ -65,7 +65,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl - **A closed learning loop** — Agent-curated memory with periodic nudges, autonomous skill creation, skill self-improvement during use, FTS5 cross-session recall with LLM summarization, and [Honcho](https://github.com/plastic-labs/honcho) dialectic user modeling - **Runs anywhere, not just your laptop** — 6 terminal backends: local, Docker, SSH, Daytona, Singularity, Modal. Daytona and Modal offer serverless persistence — your environment hibernates when idle, costing nearly nothing -- **Lives where you do** — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu, WeCom, BlueBubbles, Home Assistant, Microsoft Teams — 15+ platforms from one gateway +- **Lives where you do** — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu, WeCom, Weixin, QQ Bot, Yuanbao, BlueBubbles, Home Assistant, Microsoft Teams, Google Chat, and more — 20+ platforms from one gateway - **Built by model trainers** — Created by [Nous Research](https://nousresearch.com), the lab behind Hermes, Nomos, and Psyche. Works with [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai), OpenAI, or any endpoint - **Scheduled automations** — Built-in cron with delivery to any platform - **Delegates & parallelizes** — Spawn isolated subagents for parallel workstreams. Programmatic Tool Calling via `execute_code` collapses multi-step pipelines into single inference calls diff --git a/website/docs/integrations/providers.md b/website/docs/integrations/providers.md index df8701778da..93e4ba630d3 100644 --- a/website/docs/integrations/providers.md +++ b/website/docs/integrations/providers.md @@ -378,8 +378,8 @@ bedrock: # profile: "myprofile" # or set AWS_PROFILE # discovery: true # auto-discover region from IAM # guardrail: # optional Bedrock Guardrails - # id: "your-guardrail-id" - # version: "DRAFT" + # guardrail_identifier: "your-guardrail-id" + # guardrail_version: "DRAFT" ``` Authentication uses the standard boto3 chain: explicit `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY`, `AWS_PROFILE` from `~/.aws/credentials`, IAM role on EC2/ECS/Lambda, IMDS, or SSO. No env var is required if you're already authenticated with the AWS CLI. @@ -484,7 +484,7 @@ For on-prem deployments (DGX Spark, local GPU), set `NVIDIA_BASE_URL=http://loca ### GMI Cloud -Open and reasoning models via [GMI Cloud](https://inference.gmi.ai) — OpenAI-compatible API, API key authentication. +Open and reasoning models via [GMI Cloud](https://www.gmicloud.ai/) — OpenAI-compatible API, API key authentication. ```bash # GMI Cloud @@ -499,7 +499,7 @@ model: default: "deepseek-ai/DeepSeek-R1" ``` -The base URL can be overridden with `GMI_BASE_URL` (default: `https://api.gmi.ai/v1`). +The base URL can be overridden with `GMI_BASE_URL` (default: `https://api.gmi-serving.com/v1`). ### StepFun @@ -1393,24 +1393,34 @@ Notes: - See OpenRouter's [Pareto Router docs](https://openrouter.ai/docs/guides/routing/routers/pareto-router) for the full router behavior. - To use the Pareto Code router for a specific **auxiliary task** (compression, vision, etc.) instead of the main agent, set `extra_body.plugins` under that task — see [Auxiliary Models → OpenRouter routing & Pareto Code for auxiliary tasks](/docs/user-guide/configuration#openrouter-routing--pareto-code-for-auxiliary-tasks). -## Fallback Model +## Fallback Providers -Configure a backup provider:model that Hermes switches to automatically when your primary model fails (rate limits, server errors, auth failures): +Configure a chain of backup providers Hermes tries in order when the primary model fails (rate limits, server errors, auth failures). The canonical format is a top-level `fallback_providers:` list: + +```yaml +fallback_providers: + - provider: openrouter + model: anthropic/claude-sonnet-4 + - provider: anthropic + model: claude-sonnet-4 + # base_url: http://localhost:8000/v1 # optional, for custom endpoints + # api_mode: chat_completions # optional override +``` + +The legacy single-pair `fallback_model:` dict is still accepted for back-compat: ```yaml fallback_model: - provider: openrouter # required - model: anthropic/claude-sonnet-4 # required - # base_url: http://localhost:8000/v1 # optional, for custom endpoints - # key_env: MY_CUSTOM_KEY # optional, env var name for custom endpoint API key + provider: openrouter + model: anthropic/claude-sonnet-4 ``` -When activated, the fallback swaps the model and provider mid-session without losing your conversation. It fires **at most once** per session. +When activated, the fallback swaps the model and provider mid-session without losing your conversation. The chain is tried entry-by-entry; activation is one-shot per session. -Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `alibaba`, `tencent-tokenhub`, `custom`. +Supported providers: `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `huggingface`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `ollama-cloud`, `bedrock`, `ai-gateway`, `azure-foundry`, `opencode-zen`, `opencode-go`, `kilocode`, `xiaomi`, `arcee`, `gmi`, `stepfun`, `lmstudio`, `alibaba`, `alibaba-coding-plan`, `tencent-tokenhub`, `custom`. :::tip -Fallback is configured exclusively through `config.yaml` — there are no environment variables for it. For full details on when it triggers, supported providers, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers). +Fallback is configured exclusively through `config.yaml` — or interactively via `hermes fallback`. For full details on when it triggers, how the chain advances, and how it interacts with auxiliary tasks and delegation, see [Fallback Providers](/docs/user-guide/features/fallback-providers). ::: --- diff --git a/website/docs/user-guide/messaging/google_chat.md b/website/docs/user-guide/messaging/google_chat.md index 6fda2b179a8..8cf2d01d7a3 100644 --- a/website/docs/user-guide/messaging/google_chat.md +++ b/website/docs/user-guide/messaging/google_chat.md @@ -164,10 +164,10 @@ GOOGLE_CHAT_MAX_BYTES=16777216 # 16 MiB — cap on in-flight me The project ID also falls back to `GOOGLE_CLOUD_PROJECT`, and the SA path falls back to `GOOGLE_APPLICATION_CREDENTIALS` — use whichever convention you prefer. -Install Hermes with the optional dependencies: +Install the dependencies the Google Chat adapter needs (no Hermes extra is currently published — install them directly): ```bash -pip install 'hermes-agent[google_chat]' +pip install google-cloud-pubsub google-api-python-client google-auth google-auth-oauthlib ``` Start the gateway: diff --git a/website/docs/user-guide/messaging/index.md b/website/docs/user-guide/messaging/index.md index 24970ac235d..b6ed2796c10 100644 --- a/website/docs/user-guide/messaging/index.md +++ b/website/docs/user-guide/messaging/index.md @@ -386,7 +386,7 @@ Each platform has its own toolset: | Discord | `hermes-discord` | Full tools including terminal | | WhatsApp | `hermes-whatsapp` | Full tools including terminal | | Slack | `hermes-slack` | Full tools including terminal | -| Google Chat | `hermes-google-chat` | Full tools including terminal | +| Google Chat | `hermes-google_chat` | Full tools including terminal | | Signal | `hermes-signal` | Full tools including terminal | | SMS | `hermes-sms` | Full tools including terminal | | Email | `hermes-email` | Full tools including terminal | @@ -402,7 +402,7 @@ Each platform has its own toolset: | QQBot | `hermes-qqbot` | Full tools including terminal | | Yuanbao | `hermes-yuanbao` | Full tools including terminal | | Microsoft Teams | `hermes-teams` | Full tools including terminal | -| API Server | `hermes` (default) | Full tools including terminal | +| API Server | `hermes-api-server` | Full tools (drops `clarify`, `send_message`, `text_to_speech` — programmatic access doesn't have an interactive user) | | Webhooks | `hermes-webhook` | Full tools including terminal | ## Next Steps diff --git a/website/docs/user-guide/messaging/open-webui.md b/website/docs/user-guide/messaging/open-webui.md index 175276eb084..e75517e79b3 100644 --- a/website/docs/user-guide/messaging/open-webui.md +++ b/website/docs/user-guide/messaging/open-webui.md @@ -275,16 +275,22 @@ To run separate Hermes instances per user — each with their own config, memory ### 1. Create profiles and configure API servers +`API_SERVER_*` are env vars, not YAML config keys, so write them to each profile's `.env`. Pick ports outside the default-platform range (`8644` is the webhook adapter, `8645` is wecom-callback, `8646` is msgraph-webhook), e.g. `8650+`: + ```bash hermes profile create alice -hermes -p alice config set API_SERVER_ENABLED true -hermes -p alice config set API_SERVER_PORT 8643 -hermes -p alice config set API_SERVER_KEY alice-secret +cat >> ~/.hermes/profiles/alice/.env <> ~/.hermes/profiles/bob/.env <