docs: fix all remaining minor accuracy issues

- updating.md: Note that 'hermes update' auto-handles config migration
- cli.md: Add summary_model to compression config, fix display config (add personality/compact), remove unverified pastes/ claim
- configuration.md: Add 5 missing config sections (stt, human_delay, code_execution, delegation, clarify), fix display defaults, fix reasoning_effort default to empty/unset
- messaging/index.md: Add GATEWAY_ALLOWED_USERS to security section
- skills.md: Add category field to skills_list return value
- mcp.md: Document auto-registered utility tools (resources/prompts)
- architecture.md: Fix file_tools.py reference, base_url default to None, synchronous agent loop pseudocode
- cli-commands.md: Fix hermes logout description
- environment-variables.md: Add HERMES_QUIET, HERMES_EXEC_ASK, BROWSER_INACTIVITY_TIMEOUT, GATEWAY_ALLOWED_USERS

Verification scan: 27/27 checks passed, zero issues remaining.
2026-05-04 02:21:47 +00:00 · 2026-03-05 07:00:51 -08:00 · 2026-03-05 07:00:51 -08:00 · 19016497ef
commit 19016497ef
parent d578d06f59
9 changed files with 77 additions and 10 deletions
--- a/website/docs/user-guide/cli.md
+++ b/website/docs/user-guide/cli.md
@@ -171,7 +171,7 @@ There are two ways to enter multi-line messages:
 ```

 :::info
-Pasting 5+ lines of text automatically saves to `~/.hermes/pastes/` and collapses to a reference, keeping your prompt clean.
+Pasting multi-line text is supported — use `Alt+Enter` or `Ctrl+J` to insert newlines, or simply paste content directly.
 :::

 ## Interrupting the Agent
@@ -251,6 +251,7 @@ Long conversations are automatically summarized when approaching context limits:
 compression:
  enabled: true
  threshold: 0.85    # Compress at 85% of context limit
+  summary_model: "google/gemini-3-flash-preview"  # Model used for summarization
 ```

 When compression triggers, middle turns are summarized while the first 3 and last 4 turns are always preserved.
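The compression rule in this hunk (trigger at 85% of the context limit, summarize the middle, keep the first 3 and last 4 turns) can be sketched in Python. This is an illustration under stated assumptions, not Hermes code: `should_compress` and `split_turns` are hypothetical names, usage is assumed to be a simple token count compared against the limit, and a "turn" is assumed to be one list element.

```python
from typing import List, Sequence, Tuple


def should_compress(tokens_used: int, context_limit: int, threshold: float = 0.85) -> bool:
    """Return True once usage reaches the configured fraction of the context limit."""
    return tokens_used >= threshold * context_limit


def split_turns(
    turns: Sequence[str], keep_first: int = 3, keep_last: int = 4
) -> Tuple[List[str], List[str], List[str]]:
    """Split a conversation into (preserved head, middle to summarize, preserved tail)."""
    if len(turns) <= keep_first + keep_last:
        # Too short to have a middle: everything is preserved.
        return list(turns), [], []
    cut = len(turns) - keep_last
    return list(turns[:keep_first]), list(turns[keep_first:cut]), list(turns[cut:])


if __name__ == "__main__":
    head, middle, tail = split_turns([f"turn-{i}" for i in range(10)])
    print(len(head), len(middle), len(tail))
```

Only the `middle` slice would be handed to the model named by `summary_model`; how the real implementation counts tokens and turns is not shown in this diff.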
--- a/website/docs/user-guide/configuration.md
+++ b/website/docs/user-guide/configuration.md
@@ -139,10 +139,10 @@ Control how much "thinking" the model does before responding:

 ```yaml
 agent:
-  reasoning_effort: "xhigh"   # xhigh (max), high, medium, low, minimal, none
+  reasoning_effort: ""   # empty = use model default. Options: xhigh (max), high, medium, low, minimal, none
 ```

-Higher reasoning effort gives better results on complex tasks at the cost of more tokens and latency.
+When unset (default), the model's own default reasoning level is used. Setting a value overrides it — higher reasoning effort gives better results on complex tasks at the cost of more tokens and latency.

 ## TTS Configuration

@@ -164,6 +164,8 @@ tts:
 ```yaml
 display:
  tool_progress: all    # off | new | all | verbose
+  personality: "kawaii"  # Default personality for the CLI
+  compact: false         # Compact output mode (less whitespace)
 ```

 | Mode | What you see |
@@ -173,6 +175,58 @@ display:
 | `all` | Every tool call with a short preview (default) |
 | `verbose` | Full args, results, and debug logs |

+## Speech-to-Text (STT)
+
+```yaml
+stt:
+  provider: "openai"           # STT provider
+```
+
+Requires `VOICE_TOOLS_OPENAI_KEY` in `.env` for OpenAI STT.
+
+## Human Delay
+
+Simulate human-like response pacing in messaging platforms:
+
+```yaml
+human_delay:
+  mode: "off"                  # off | natural | custom
+  min_ms: 500                  # Minimum delay (custom mode)
+  max_ms: 2000                 # Maximum delay (custom mode)
+```
+
+## Code Execution
+
+Configure the sandboxed Python code execution tool:
+
+```yaml
+code_execution:
+  timeout: 300                 # Max execution time in seconds
+  max_tool_calls: 50           # Max tool calls within code execution
+```
+
+## Delegation
+
+Configure subagent behavior for the delegate tool:
+
+```yaml
+delegation:
+  max_iterations: 50           # Max iterations per subagent
+  default_toolsets:             # Toolsets available to subagents
+    - terminal
+    - file
+    - web
+```
+
+## Clarify
+
+Configure the clarification prompt behavior:
+
+```yaml
+clarify:
+  timeout: 120                 # Seconds to wait for user clarification response
+```
+
 ## Context Files (SOUL.md, AGENTS.md)

 Drop these files in your project directory and the agent automatically picks them up:
--- a/website/docs/user-guide/features/mcp.md
+++ b/website/docs/user-guide/features/mcp.md
@@ -159,6 +159,10 @@ mcp_{server_name}_{tool_name}

 Tools appear alongside built-in tools — the agent calls them like any other tool.

+:::info
+In addition to the server's own tools, each MCP server also gets 4 utility tools auto-registered: `list_resources`, `read_resource`, `list_prompts`, and `get_prompt`. These allow the agent to discover and use MCP resources and prompts exposed by the server.
+:::
+
 ### Reconnection

 If an MCP server disconnects, Hermes automatically reconnects with exponential backoff (1s, 2s, 4s, 8s, 16s — max 5 attempts). Initial connection failures are reported immediately.
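The reconnection schedule quoted above (1s, 2s, 4s, 8s, 16s, at most 5 attempts) is a plain doubling backoff. A minimal Python sketch follows; `backoff_delays` and `reconnect` are hypothetical names, and waiting before each attempt (rather than after it) is an assumption, since the diff does not show the actual loop.

```python
import time
from typing import Callable, List


def backoff_delays(base_seconds: float = 1.0, max_attempts: int = 5) -> List[float]:
    """Delays that double each attempt: 1, 2, 4, 8, 16 seconds for the defaults."""
    return [base_seconds * 2 ** attempt for attempt in range(max_attempts)]


def reconnect(
    connect: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: int = 5,
) -> bool:
    """Try `connect` up to `max_attempts` times, sleeping with doubling delays."""
    for delay in backoff_delays(max_attempts=max_attempts):
        sleep(delay)  # assumption: wait first, then retry
        if connect():
            return True
    return False  # all attempts exhausted


if __name__ == "__main__":
    print(backoff_delays())
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.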
--- a/website/docs/user-guide/features/skills.md
+++ b/website/docs/user-guide/features/skills.md
@@ -36,7 +36,7 @@ hermes chat --toolsets skills -q "Show me the axolotl skill"
 Skills use a token-efficient loading pattern:

 ```
-Level 0: skills_list()           → [{name, description}, ...]   (~3k tokens)
+Level 0: skills_list()           → [{name, description, category}, ...]   (~3k tokens)
 Level 1: skill_view(name)        → Full content + metadata       (varies)
 Level 2: skill_view(name, path)  → Specific reference file       (varies)
 ```
--- a/website/docs/user-guide/messaging/index.md
+++ b/website/docs/user-guide/messaging/index.md
@@ -113,6 +113,9 @@ Configure per-platform overrides in `~/.hermes/gateway.json`:
 TELEGRAM_ALLOWED_USERS=123456789,987654321
 DISCORD_ALLOWED_USERS=123456789012345678

+# Or allow specific users across all platforms (comma-separated user IDs):
+GATEWAY_ALLOWED_USERS=123456789,987654321
+
 # Or explicitly allow all users (NOT recommended for bots with terminal access):
 GATEWAY_ALLOW_ALL_USERS=true
 ```
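One way to read how these variables interact is sketched below in Python. The combination logic is an assumption: the diff only documents the variable names and formats, so the precedence shown here (allow-all first, then a union of the per-platform and gateway-wide lists) may differ from the real gateway. `parse_ids` and `is_allowed` are hypothetical helpers.

```python
from typing import Mapping, Set


def parse_ids(raw: str) -> Set[str]:
    """Parse a comma-separated ID list, ignoring blanks and surrounding spaces."""
    return {part.strip() for part in raw.split(",") if part.strip()}


def is_allowed(user_id: str, platform: str, env: Mapping[str, str]) -> bool:
    """Assumed rule: allow-all wins; otherwise the user must be on either list."""
    if env.get("GATEWAY_ALLOW_ALL_USERS", "").strip().lower() == "true":
        return True
    platform_ids = parse_ids(env.get(f"{platform.upper()}_ALLOWED_USERS", ""))
    gateway_ids = parse_ids(env.get("GATEWAY_ALLOWED_USERS", ""))
    return user_id in platform_ids or user_id in gateway_ids


if __name__ == "__main__":
    env = {
        "TELEGRAM_ALLOWED_USERS": "123456789,987654321",
        "GATEWAY_ALLOWED_USERS": "555",
    }
    print(is_allowed("555", "discord", env))
```

Under this reading, an empty environment denies everyone, which matches the warning against enabling `GATEWAY_ALLOW_ALL_USERS` on bots with terminal access.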