Merge branch 'main' into rewbs/tool-use-charge-to-subscription

This commit is contained in:
Robin Fernandes 2026-03-31 08:48:54 +09:00
commit 6e4598ce1e
269 changed files with 33678 additions and 2273 deletions

.dockerignore Normal file

@@ -0,0 +1,13 @@
# Git
.git
.gitignore
.gitmodules
# Dependencies
node_modules
# CI/CD
.github
# Environment files
.env


@@ -59,12 +59,25 @@ OPENCODE_ZEN_API_KEY=
# OpenCode Go provides access to open models (GLM-5, Kimi K2.5, MiniMax M2.5)
# $10/month subscription. Get your key at: https://opencode.ai/auth
OPENCODE_GO_API_KEY=
# =============================================================================
# LLM PROVIDER (Hugging Face Inference Providers)
# =============================================================================
# Hugging Face routes to 20+ open models via unified OpenAI-compatible endpoint.
# Free tier included ($0.10/month), no markup on provider rates.
# Get your token at: https://huggingface.co/settings/tokens
# Required permission: "Make calls to Inference Providers"
HF_TOKEN=
# OPENCODE_GO_BASE_URL=https://opencode.ai/zen/go/v1 # Override default base URL
# =============================================================================
# TOOL API KEYS
# =============================================================================
# Exa API Key - AI-native web search and contents
# Get at: https://exa.ai
EXA_API_KEY=
# Parallel API Key - AI-native web search and extract
# Get at: https://parallel.ai
PARALLEL_API_KEY=
@@ -85,7 +98,7 @@ FAL_KEY=
HONCHO_API_KEY=
# =============================================================================
# TERMINAL TOOL CONFIGURATION (mini-swe-agent backend)
# TERMINAL TOOL CONFIGURATION
# =============================================================================
# Backend type: "local", "singularity", "docker", "modal", or "ssh"
# Terminal backend is configured in ~/.hermes/config.yaml (terminal.backend).

.github/workflows/docker-publish.yml vendored Normal file

@@ -0,0 +1,61 @@
name: Docker Build and Publish
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
concurrency:
  group: docker-${{ github.ref }}
  cancel-in-progress: true
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          submodules: recursive
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build image
        uses: docker/build-push-action@v6
        with:
          context: .
          file: Dockerfile
          load: true
          tags: nousresearch/hermes-agent:test
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Test image starts
        run: |
          docker run --rm \
            -v /tmp/hermes-test:/opt/data \
            --entrypoint /opt/hermes/docker/entrypoint.sh \
            nousresearch/hermes-agent:test --help
      - name: Log in to Docker Hub
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Push image
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        uses: docker/build-push-action@v6
        with:
          context: .
          file: Dockerfile
          push: true
          tags: |
            nousresearch/hermes-agent:latest
            nousresearch/hermes-agent:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max


@@ -210,6 +210,10 @@ registry.register(
The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
**Path references in tool schemas**: If the schema description mentions file paths (e.g. default output directories), use `display_hermes_home()` to make them profile-aware. The schema is generated at import time, which is after `_apply_profile_override()` sets `HERMES_HOME`.
**State files**: If a tool stores persistent state (caches, logs, checkpoints), use `get_hermes_home()` for the base directory — never `Path.home() / ".hermes"`. This ensures each profile gets its own state.
**Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `todo_tool.py` for the pattern.
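The handler contract above can be sketched as follows — `handle_example_tool` and the inline `get_hermes_home` stand-in are hypothetical illustrations of the rules (JSON-string return, profile-aware state paths), not actual hermes-agent code:

```python
import json
import os
from pathlib import Path

def get_hermes_home() -> Path:
    # Stand-in for hermes_constants.get_hermes_home: resolve the active
    # profile's home from the HERMES_HOME env var, falling back to ~/.hermes.
    return Path(os.environ.get("HERMES_HOME", str(Path.home() / ".hermes")))

def handle_example_tool(query: str) -> str:
    # Persistent state lives under get_hermes_home(), never a hardcoded
    # ~/.hermes, so each profile gets its own cache directory.
    cache_dir = get_hermes_home() / "example_tool"
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Handlers MUST return a JSON string, not a dict.
    return json.dumps({"query": query, "cache": str(cache_dir)})
```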
---
@@ -358,8 +362,69 @@ in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):
---
## Profiles: Multi-Instance Support
Hermes supports **profiles** — multiple fully isolated instances, each with its own
`HERMES_HOME` directory (config, API keys, memory, sessions, skills, gateway, etc.).
The core mechanism: `_apply_profile_override()` in `hermes_cli/main.py` sets
`HERMES_HOME` before any module imports. All 119+ references to `get_hermes_home()`
automatically scope to the active profile.
### Rules for profile-safe code
1. **Use `get_hermes_home()` for all HERMES_HOME paths.** Import from `hermes_constants`.
NEVER hardcode `~/.hermes` or `Path.home() / ".hermes"` in code that reads/writes state.
```python
# GOOD
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
# BAD — breaks profiles
config_path = Path.home() / ".hermes" / "config.yaml"
```
2. **Use `display_hermes_home()` for user-facing messages.** Import from `hermes_constants`.
This returns `~/.hermes` for default or `~/.hermes/profiles/<name>` for profiles.
```python
# GOOD
from hermes_constants import display_hermes_home
print(f"Config saved to {display_hermes_home()}/config.yaml")
# BAD — shows wrong path for profiles
print("Config saved to ~/.hermes/config.yaml")
```
3. **Module-level constants are fine** — they cache `get_hermes_home()` at import time,
which is AFTER `_apply_profile_override()` sets the env var. Just use `get_hermes_home()`,
not `Path.home() / ".hermes"`.
4. **Tests that mock `Path.home()` must also set `HERMES_HOME`** — since code now uses
`get_hermes_home()` (reads env var), not `Path.home() / ".hermes"`:
```python
with patch.object(Path, "home", return_value=tmp_path), \
     patch.dict(os.environ, {"HERMES_HOME": str(tmp_path / ".hermes")}):
    ...
```
5. **Gateway platform adapters should use token locks** — if the adapter connects with
a unique credential (bot token, API key), call `acquire_scoped_lock()` from
`gateway.status` in the `connect()`/`start()` method and `release_scoped_lock()` in
`disconnect()`/`stop()`. This prevents two profiles from using the same credential.
See `gateway/platforms/telegram.py` for the canonical pattern.
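A minimal sketch of the scoped-lock pattern, under the assumption that the lock functions take a scope name and the credential (the actual `gateway.status` signatures and lock storage may differ):

```python
import hashlib

_LOCKS = set()  # stand-in for the real cross-profile lock store

def acquire_scoped_lock(scope, token):
    # Fingerprint the credential so two profiles holding the same bot
    # token collide, while distinct tokens coexist.
    key = f"{scope}:{hashlib.sha256(token.encode()).hexdigest()[:16]}"
    if key in _LOCKS:
        return False
    _LOCKS.add(key)
    return True

def release_scoped_lock(scope, token):
    key = f"{scope}:{hashlib.sha256(token.encode()).hexdigest()[:16]}"
    _LOCKS.discard(key)

class ExampleAdapter:
    """Hypothetical platform adapter following the lock discipline."""

    def __init__(self, token):
        self.token = token

    def connect(self):
        if not acquire_scoped_lock("example", self.token):
            raise RuntimeError("credential already in use by another profile")

    def disconnect(self):
        release_scoped_lock("example", self.token)
```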
6. **Profile operations are HOME-anchored, not HERMES_HOME-anchored** — `_get_profiles_root()`
returns `Path.home() / ".hermes" / "profiles"`, NOT `get_hermes_home() / "profiles"`.
This is intentional — it lets `hermes -p coder profile list` see all profiles regardless
of which one is active.
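In sketch form (a behavioral illustration of the rule above, not the actual implementation):

```python
from pathlib import Path

def _get_profiles_root():
    # HOME-anchored on purpose: every profile lives under the *default*
    # ~/.hermes, so listing works no matter which profile is active.
    return Path.home() / ".hermes" / "profiles"

def list_profiles():
    root = _get_profiles_root()
    if not root.exists():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```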
## Known Pitfalls
### DO NOT hardcode `~/.hermes` paths
Use `get_hermes_home()` from `hermes_constants` for code paths. Use `display_hermes_home()`
for user-facing print/log messages. Hardcoding `~/.hermes` breaks profiles — each profile
has its own `HERMES_HOME` directory. This was the source of 5 bugs fixed in PR #3575.
### DO NOT use `simple_term_menu` for interactive menus
Rendering bugs in tmux/iTerm2 — ghosting on scroll. Use `curses` (stdlib) instead. See `hermes_cli/tools_config.py` for the pattern.
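A minimal curses menu sketch (hypothetical, not the `tools_config.py` code) — the key handling is kept in a pure function so it can be tested without a terminal:

```python
import curses

def move_selection(index, key, count):
    # Pure selection logic: arrow keys or vi-style j/k, wrapping around.
    if key in (curses.KEY_UP, ord("k")):
        return (index - 1) % count
    if key in (curses.KEY_DOWN, ord("j")):
        return (index + 1) % count
    return index

def run_menu(stdscr, options):
    curses.curs_set(0)  # hide the cursor
    index = 0
    while True:
        for i, opt in enumerate(options):
            attr = curses.A_REVERSE if i == index else curses.A_NORMAL
            stdscr.addstr(i, 0, opt, attr)
        key = stdscr.getch()
        if key in (curses.KEY_ENTER, 10, 13):
            return options[index]
        index = move_selection(index, key, len(options))

# usage: choice = curses.wrapper(run_menu, ["alpha", "beta", "gamma"])
```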
@@ -375,6 +440,19 @@ Tool schema descriptions must not mention tools from other toolsets by name (e.g
### Tests must not write to `~/.hermes/`
The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.
**Profile tests**: When testing profile features, also mock `Path.home()` so that
`_get_profiles_root()` and `_get_default_hermes_home()` resolve within the temp dir.
Use the pattern from `tests/hermes_cli/test_profiles.py`:
```python
@pytest.fixture
def profile_env(tmp_path, monkeypatch):
    home = tmp_path / ".hermes"
    home.mkdir()
    monkeypatch.setattr(Path, "home", lambda: tmp_path)
    monkeypatch.setenv("HERMES_HOME", str(home))
    return home
```
---
## Testing

Dockerfile Normal file

@@ -0,0 +1,20 @@
FROM debian:13.4
RUN apt-get update
RUN apt-get install -y nodejs npm python3 python3-pip ripgrep ffmpeg gcc python3-dev libffi-dev
COPY . /opt/hermes
WORKDIR /opt/hermes
RUN pip install -e ".[all]" --break-system-packages
RUN npm install
RUN npx playwright install --with-deps chromium
WORKDIR /opt/hermes/scripts/whatsapp-bridge
RUN npm install
WORKDIR /opt/hermes
RUN chmod +x /opt/hermes/docker/entrypoint.sh
ENV HERMES_HOME=/opt/data
VOLUME [ "/opt/data" ]
ENTRYPOINT [ "/opt/hermes/docker/entrypoint.sh" ]

RELEASE_v0.5.0.md Normal file

@@ -0,0 +1,348 @@
# Hermes Agent v0.5.0 (v2026.3.28)
**Release Date:** March 28, 2026
> The hardening release — Hugging Face provider, /model command overhaul, Telegram Private Chat Topics, native Modal SDK, plugin lifecycle hooks, tool-use enforcement for GPT models, Nix flake, 50+ security and reliability fixes, and a comprehensive supply chain audit.
---
## ✨ Highlights
- **Nous Portal now supports 400+ models** — The Nous Research inference portal has expanded dramatically, giving Hermes Agent users access to over 400 models through a single provider endpoint
- **Hugging Face as a first-class inference provider** — Full integration with HF Inference API including curated agentic model picker that maps to OpenRouter analogues, live `/models` endpoint probe, and setup wizard flow ([#3419](https://github.com/NousResearch/hermes-agent/pull/3419), [#3440](https://github.com/NousResearch/hermes-agent/pull/3440))
- **Telegram Private Chat Topics** — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat ([#3163](https://github.com/NousResearch/hermes-agent/pull/3163))
- **Native Modal SDK backend** — Replaced swe-rex dependency with native Modal SDK (`Sandbox.create.aio` + `exec.aio`), eliminating tunnels and simplifying the Modal terminal backend ([#3538](https://github.com/NousResearch/hermes-agent/pull/3538))
- **Plugin lifecycle hooks activated** — `pre_llm_call`, `post_llm_call`, `on_session_start`, and `on_session_end` hooks now fire in the agent loop and CLI/gateway, completing the plugin hook system ([#3542](https://github.com/NousResearch/hermes-agent/pull/3542))
- **Improved OpenAI Model Reliability** — Added `GPT_TOOL_USE_GUIDANCE` to prevent GPT models from describing intended actions instead of making tool calls, plus automatic stripping of stale budget warnings from conversation history that caused models to avoid tools across turns ([#3528](https://github.com/NousResearch/hermes-agent/pull/3528))
- **Nix flake** — Full uv2nix build, NixOS module with persistent container mode, auto-generated config keys from Python source, and suffix PATHs for agent-friendliness ([#20](https://github.com/NousResearch/hermes-agent/pull/20), [#3274](https://github.com/NousResearch/hermes-agent/pull/3274), [#3061](https://github.com/NousResearch/hermes-agent/pull/3061)) by @alt-glitch
- **Supply chain hardening** — Removed compromised `litellm` dependency, pinned all dependency version ranges, regenerated `uv.lock` with hashes, added CI workflow scanning PRs for supply chain attack patterns, and bumped deps to fix CVEs ([#2796](https://github.com/NousResearch/hermes-agent/pull/2796), [#2810](https://github.com/NousResearch/hermes-agent/pull/2810), [#2812](https://github.com/NousResearch/hermes-agent/pull/2812), [#2816](https://github.com/NousResearch/hermes-agent/pull/2816), [#3073](https://github.com/NousResearch/hermes-agent/pull/3073))
- **Anthropic output limits fix** — Replaced hardcoded 16K `max_tokens` with per-model native output limits (128K for Opus 4.6, 64K for Sonnet 4.6), fixing "Response truncated" and thinking-budget exhaustion on direct Anthropic API ([#3426](https://github.com/NousResearch/hermes-agent/pull/3426), [#3444](https://github.com/NousResearch/hermes-agent/pull/3444))
---
## 🏗️ Core Agent & Architecture
### New Provider: Hugging Face
- First-class Hugging Face Inference API integration with auth, setup wizard, and model picker ([#3419](https://github.com/NousResearch/hermes-agent/pull/3419))
- Curated model list mapping OpenRouter agentic defaults to HF equivalents — providers with 8+ curated models skip live `/models` probe for speed ([#3440](https://github.com/NousResearch/hermes-agent/pull/3440))
- Added glm-5-turbo to Z.AI provider model list ([#3095](https://github.com/NousResearch/hermes-agent/pull/3095))
### Provider & Model Improvements
- `/model` command overhaul — extracted shared `switch_model()` pipeline for CLI and gateway, custom endpoint support, provider-aware routing ([#2795](https://github.com/NousResearch/hermes-agent/pull/2795), [#2799](https://github.com/NousResearch/hermes-agent/pull/2799))
- Removed `/model` slash command from CLI and gateway in favor of `hermes model` subcommand ([#3080](https://github.com/NousResearch/hermes-agent/pull/3080))
- Preserve `custom` provider instead of silently remapping to `openrouter` ([#2792](https://github.com/NousResearch/hermes-agent/pull/2792))
- Read root-level `provider` and `base_url` from config.yaml into model config ([#3112](https://github.com/NousResearch/hermes-agent/pull/3112))
- Align Nous Portal model slugs with OpenRouter naming ([#3253](https://github.com/NousResearch/hermes-agent/pull/3253))
- Fix Alibaba provider default endpoint and model list ([#3484](https://github.com/NousResearch/hermes-agent/pull/3484))
- Allow MiniMax users to override `/v1` → `/anthropic` auto-correction ([#3553](https://github.com/NousResearch/hermes-agent/pull/3553))
- Migrate OAuth token refresh to `platform.claude.com` with fallback ([#3246](https://github.com/NousResearch/hermes-agent/pull/3246))
### Agent Loop & Conversation
- **Improved OpenAI model reliability** — `GPT_TOOL_USE_GUIDANCE` prevents GPT models from describing actions instead of calling tools + automatic budget warning stripping from history ([#3528](https://github.com/NousResearch/hermes-agent/pull/3528))
- **Surface lifecycle events** — All retry, fallback, and compression events now surface to the user as formatted messages ([#3153](https://github.com/NousResearch/hermes-agent/pull/3153))
- **Anthropic output limits** — Per-model native output limits instead of hardcoded 16K `max_tokens` ([#3426](https://github.com/NousResearch/hermes-agent/pull/3426))
- **Thinking-budget exhaustion detection** — Skip useless continuation retries when model uses all output tokens on reasoning ([#3444](https://github.com/NousResearch/hermes-agent/pull/3444))
- Always prefer streaming for API calls to prevent hung subagents ([#3120](https://github.com/NousResearch/hermes-agent/pull/3120))
- Restore safe non-streaming fallback after stream failures ([#3020](https://github.com/NousResearch/hermes-agent/pull/3020))
- Give subagents independent iteration budgets ([#3004](https://github.com/NousResearch/hermes-agent/pull/3004))
- Update `api_key` in `_try_activate_fallback` for subagent auth ([#3103](https://github.com/NousResearch/hermes-agent/pull/3103))
- Graceful return on max retries instead of crashing thread ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Count compression restarts toward retry limit ([#3070](https://github.com/NousResearch/hermes-agent/pull/3070))
- Include tool tokens in preflight estimate, guard context probe persistence ([#3164](https://github.com/NousResearch/hermes-agent/pull/3164))
- Update context compressor limits after fallback activation ([#3305](https://github.com/NousResearch/hermes-agent/pull/3305))
- Validate empty user messages to prevent Anthropic API 400 errors ([#3322](https://github.com/NousResearch/hermes-agent/pull/3322))
- GLM reasoning-only and max-length handling ([#3010](https://github.com/NousResearch/hermes-agent/pull/3010))
- Increase API timeout default from 900s to 1800s for slow-thinking models ([#3431](https://github.com/NousResearch/hermes-agent/pull/3431))
- Send `max_tokens` for Claude/OpenRouter + retry SSE connection errors ([#3497](https://github.com/NousResearch/hermes-agent/pull/3497))
- Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode ([#2701](https://github.com/NousResearch/hermes-agent/pull/2701)) by @ctlst
### Streaming & Reasoning
- **Persist reasoning across gateway session turns** with new schema v6 columns (`reasoning`, `reasoning_details`, `codex_reasoning_items`) ([#2974](https://github.com/NousResearch/hermes-agent/pull/2974))
- Detect and kill stale SSE connections ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Fix stale stream detector race causing spurious `RemoteProtocolError` ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Skip duplicate callback for `<think>`-extracted reasoning during streaming ([#3116](https://github.com/NousResearch/hermes-agent/pull/3116))
- Preserve reasoning fields in `rewrite_transcript` ([#3311](https://github.com/NousResearch/hermes-agent/pull/3311))
- Preserve Gemini thought signatures in streamed tool calls ([#2997](https://github.com/NousResearch/hermes-agent/pull/2997))
- Ensure first delta is fired during reasoning updates ([untagged commit](https://github.com/NousResearch/hermes-agent))
### Session & Memory
- **Session search recent sessions mode** — Omit query to browse recent sessions with titles, previews, and timestamps ([#2533](https://github.com/NousResearch/hermes-agent/pull/2533))
- **Session config surfacing** on `/new`, `/reset`, and auto-reset ([#3321](https://github.com/NousResearch/hermes-agent/pull/3321))
- **Third-party session isolation** — `--source` flag for isolating sessions by origin ([#3255](https://github.com/NousResearch/hermes-agent/pull/3255))
- Add `/resume` CLI handler, session log truncation guard, `reopen_session` API ([#3315](https://github.com/NousResearch/hermes-agent/pull/3315))
- Clear compressor summary and turn counter on `/clear` and `/new` ([#3102](https://github.com/NousResearch/hermes-agent/pull/3102))
- Surface silent SessionDB failures that cause session data loss ([#2999](https://github.com/NousResearch/hermes-agent/pull/2999))
- Session search fallback preview on summarization failure ([#3478](https://github.com/NousResearch/hermes-agent/pull/3478))
- Prevent stale memory overwrites by flush agent ([#2687](https://github.com/NousResearch/hermes-agent/pull/2687))
### Context Compression
- Replace dead `summary_target_tokens` with ratio-based scaling ([#2554](https://github.com/NousResearch/hermes-agent/pull/2554))
- Expose `compression.target_ratio`, `protect_last_n`, and `threshold` in `DEFAULT_CONFIG` ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Restore sane defaults and cap summary at 12K tokens ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Preserve transcript on `/compress` and hygiene compression ([#3556](https://github.com/NousResearch/hermes-agent/pull/3556))
- Update context pressure warnings and token estimates after compaction ([untagged commit](https://github.com/NousResearch/hermes-agent))
### Architecture & Dependencies
- **Remove mini-swe-agent dependency** — Inline Docker and Modal backends directly ([#2804](https://github.com/NousResearch/hermes-agent/pull/2804))
- **Replace swe-rex with native Modal SDK** for Modal backend ([#3538](https://github.com/NousResearch/hermes-agent/pull/3538))
- **Plugin lifecycle hooks** — `pre_llm_call`, `post_llm_call`, `on_session_start`, `on_session_end` now fire in the agent loop ([#3542](https://github.com/NousResearch/hermes-agent/pull/3542))
- Fix plugin toolsets invisible in `hermes tools` and standalone processes ([#3457](https://github.com/NousResearch/hermes-agent/pull/3457))
- Consolidate `get_hermes_home()` and `parse_reasoning_effort()` ([#3062](https://github.com/NousResearch/hermes-agent/pull/3062))
- Remove unused Hermes-native PKCE OAuth flow ([#3107](https://github.com/NousResearch/hermes-agent/pull/3107))
- Remove ~100 unused imports across 55 files ([#3016](https://github.com/NousResearch/hermes-agent/pull/3016))
- Fix 154 f-strings, simplify getattr/URL patterns, remove dead code ([#3119](https://github.com/NousResearch/hermes-agent/pull/3119))
---
## 📱 Messaging Platforms (Gateway)
### Telegram
- **Private Chat Topics** — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat ([#3163](https://github.com/NousResearch/hermes-agent/pull/3163))
- **Auto-discover fallback IPs via DNS-over-HTTPS** when `api.telegram.org` is unreachable ([#3376](https://github.com/NousResearch/hermes-agent/pull/3376))
- **Configurable reply threading mode** ([#2907](https://github.com/NousResearch/hermes-agent/pull/2907))
- Fall back to no `thread_id` on "Message thread not found" BadRequest ([#3390](https://github.com/NousResearch/hermes-agent/pull/3390))
- Self-reschedule reconnect when `start_polling` fails after 502 ([#3268](https://github.com/NousResearch/hermes-agent/pull/3268))
### Discord
- Stop phantom typing indicator after agent turn completes ([#3003](https://github.com/NousResearch/hermes-agent/pull/3003))
### Slack
- Send tool call progress messages to correct Slack thread ([#3063](https://github.com/NousResearch/hermes-agent/pull/3063))
- Scope progress thread fallback to Slack only ([#3488](https://github.com/NousResearch/hermes-agent/pull/3488))
### WhatsApp
- Download documents, audio, and video media from messages ([#2978](https://github.com/NousResearch/hermes-agent/pull/2978))
### Matrix
- Add missing Matrix entry in `PLATFORMS` dict ([#3473](https://github.com/NousResearch/hermes-agent/pull/3473))
- Harden e2ee access-token handling ([#3562](https://github.com/NousResearch/hermes-agent/pull/3562))
- Add backoff for `SyncError` in sync loop ([#3280](https://github.com/NousResearch/hermes-agent/pull/3280))
### Signal
- Track SSE keepalive comments as connection activity ([#3316](https://github.com/NousResearch/hermes-agent/pull/3316))
### Email
- Prevent unbounded growth of `_seen_uids` in EmailAdapter ([#3490](https://github.com/NousResearch/hermes-agent/pull/3490))
### Gateway Core
- **Config-gated `/verbose` command** for messaging platforms — toggle tool output verbosity from chat ([#3262](https://github.com/NousResearch/hermes-agent/pull/3262))
- **Background review notifications** delivered to user chat ([#3293](https://github.com/NousResearch/hermes-agent/pull/3293))
- **Retry transient send failures** and notify user on exhaustion ([#3288](https://github.com/NousResearch/hermes-agent/pull/3288))
- Recover from hung agents — `/stop` hard-kills session lock ([#3104](https://github.com/NousResearch/hermes-agent/pull/3104))
- Thread-safe `SessionStore` — protect `_entries` with `threading.Lock` ([#3052](https://github.com/NousResearch/hermes-agent/pull/3052))
- Fix gateway token double-counting with cached agents — use absolute set instead of increment ([#3306](https://github.com/NousResearch/hermes-agent/pull/3306), [#3317](https://github.com/NousResearch/hermes-agent/pull/3317))
- Fingerprint full auth token in agent cache signature ([#3247](https://github.com/NousResearch/hermes-agent/pull/3247))
- Silence background agent terminal output ([#3297](https://github.com/NousResearch/hermes-agent/pull/3297))
- Include per-platform `ALLOW_ALL` and `SIGNAL_GROUP` in startup allowlist check ([#3313](https://github.com/NousResearch/hermes-agent/pull/3313))
- Include user-local bin paths in systemd unit PATH ([#3527](https://github.com/NousResearch/hermes-agent/pull/3527))
- Track background task references in `GatewayRunner` ([#3254](https://github.com/NousResearch/hermes-agent/pull/3254))
- Add request timeouts to HA, Email, Mattermost, SMS adapters ([#3258](https://github.com/NousResearch/hermes-agent/pull/3258))
- Add media download retry to Mattermost, Slack, and base cache ([#3323](https://github.com/NousResearch/hermes-agent/pull/3323))
- Detect virtualenv path instead of hardcoding `venv/` ([#2797](https://github.com/NousResearch/hermes-agent/pull/2797))
- Use `TERMINAL_CWD` for context file discovery, not process cwd ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Stop loading hermes repo AGENTS.md into gateway sessions (~10k wasted tokens) ([#2891](https://github.com/NousResearch/hermes-agent/pull/2891))
---
## 🖥️ CLI & User Experience
### Interactive CLI
- **Configurable busy input mode** + fix `/queue` always working ([#3298](https://github.com/NousResearch/hermes-agent/pull/3298))
- **Preserve user input on multiline paste** ([#3065](https://github.com/NousResearch/hermes-agent/pull/3065))
- **Tool generation callback** — streaming "preparing terminal…" updates during tool argument generation ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Show tool progress for substantive tools, not just "preparing" ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Buffer reasoning preview chunks and fix duplicate display ([#3013](https://github.com/NousResearch/hermes-agent/pull/3013))
- Prevent reasoning box from rendering 3x during tool-calling loops ([#3405](https://github.com/NousResearch/hermes-agent/pull/3405))
- Eliminate "Event loop is closed" / "Press ENTER to continue" during idle — three-layer fix with `neuter_async_httpx_del()`, custom exception handler, and stale client cleanup ([#3398](https://github.com/NousResearch/hermes-agent/pull/3398))
- Fix status bar showing 26K instead of 260K for token counts with trailing zeros ([#3024](https://github.com/NousResearch/hermes-agent/pull/3024))
- Fix status bar duplication and degradation during long sessions ([#3291](https://github.com/NousResearch/hermes-agent/pull/3291))
- Refresh TUI before background task output to prevent status bar overlap ([#3048](https://github.com/NousResearch/hermes-agent/pull/3048))
- Suppress KawaiiSpinner animation under `patch_stdout` ([#2994](https://github.com/NousResearch/hermes-agent/pull/2994))
- Skip KawaiiSpinner when TUI handles tool progress ([#2973](https://github.com/NousResearch/hermes-agent/pull/2973))
- Guard `isatty()` against closed streams via `_is_tty` property ([#3056](https://github.com/NousResearch/hermes-agent/pull/3056))
- Ensure single closure of streaming boxes during tool generation ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Cap context pressure percentage at 100% in display ([#3480](https://github.com/NousResearch/hermes-agent/pull/3480))
- Clean up HTML error messages in CLI display ([#3069](https://github.com/NousResearch/hermes-agent/pull/3069))
- Show HTTP status code and 400 body in API error output ([#3096](https://github.com/NousResearch/hermes-agent/pull/3096))
- Extract useful info from HTML error pages, dump debug on max retries ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Prevent TypeError on startup when `base_url` is None ([#3068](https://github.com/NousResearch/hermes-agent/pull/3068))
- Prevent update crash in non-TTY environments ([#3094](https://github.com/NousResearch/hermes-agent/pull/3094))
- Handle EOFError in sessions delete/prune confirmation prompts ([#3101](https://github.com/NousResearch/hermes-agent/pull/3101))
- Catch KeyboardInterrupt during `flush_memories` on exit and in exit cleanup handlers ([#3025](https://github.com/NousResearch/hermes-agent/pull/3025), [#3257](https://github.com/NousResearch/hermes-agent/pull/3257))
- Guard `.strip()` against None values from YAML config ([#3552](https://github.com/NousResearch/hermes-agent/pull/3552))
- Guard `config.get()` against YAML null values to prevent AttributeError ([#3377](https://github.com/NousResearch/hermes-agent/pull/3377))
- Store asyncio task references to prevent GC mid-execution ([#3267](https://github.com/NousResearch/hermes-agent/pull/3267))
### Setup & Configuration
- Use explicit key mapping for returning-user menu dispatch instead of positional index ([#3083](https://github.com/NousResearch/hermes-agent/pull/3083))
- Use `sys.executable` for pip in update commands to fix PEP 668 ([#3099](https://github.com/NousResearch/hermes-agent/pull/3099))
- Harden `hermes update` against diverged history, non-main branches, and gateway edge cases ([#3492](https://github.com/NousResearch/hermes-agent/pull/3492))
- OpenClaw migration overwrites defaults and setup wizard skips imported sections — fixed ([#3282](https://github.com/NousResearch/hermes-agent/pull/3282))
- Stop recursive AGENTS.md walk, load top-level only ([#3110](https://github.com/NousResearch/hermes-agent/pull/3110))
- Add macOS Homebrew paths to browser and terminal PATH resolution ([#2713](https://github.com/NousResearch/hermes-agent/pull/2713))
- YAML boolean handling for `tool_progress` config ([#3300](https://github.com/NousResearch/hermes-agent/pull/3300))
- Reset default SOUL.md to baseline identity text ([#3159](https://github.com/NousResearch/hermes-agent/pull/3159))
- Reject relative cwd paths for container terminal backends ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Add explicit `hermes-api-server` toolset for API server platform ([#3304](https://github.com/NousResearch/hermes-agent/pull/3304))
- Reorder setup wizard providers — OpenRouter first ([untagged commit](https://github.com/NousResearch/hermes-agent))
---
## 🔧 Tool System
### API Server
- **Idempotency-Key support**, body size limit, and OpenAI error envelope ([#2903](https://github.com/NousResearch/hermes-agent/pull/2903))
- Allow Idempotency-Key in CORS headers ([#3530](https://github.com/NousResearch/hermes-agent/pull/3530))
- Cancel orphaned agent + true interrupt on SSE disconnect ([#3427](https://github.com/NousResearch/hermes-agent/pull/3427))
- Fix streaming breaks when agent makes tool calls ([#2985](https://github.com/NousResearch/hermes-agent/pull/2985))
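The Idempotency-Key mechanism mentioned above boils down to caching the first response per key and replaying it on retries — a generic sketch of the pattern, not the hermes-agent server code:

```python
from typing import Optional

_responses = {}  # first response seen for each idempotency key

def handle_request(idempotency_key: Optional[str], body: dict) -> dict:
    # Replay the stored response for a repeated key instead of
    # re-executing the request, so client retries are safe.
    if idempotency_key is not None and idempotency_key in _responses:
        return _responses[idempotency_key]
    response = {"echo": body}  # stand-in for real request processing
    if idempotency_key is not None:
        _responses[idempotency_key] = response
    return response
```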
### Terminal & File Operations
- Handle addition-only hunks in V4A patch parser ([#3325](https://github.com/NousResearch/hermes-agent/pull/3325))
- Exponential backoff for persistent shell polling ([#2996](https://github.com/NousResearch/hermes-agent/pull/2996))
- Add timeout to subprocess calls in `context_references` ([#3469](https://github.com/NousResearch/hermes-agent/pull/3469))
### Browser & Vision
- Handle 402 insufficient credits error in vision tool ([#2802](https://github.com/NousResearch/hermes-agent/pull/2802))
- Fix `browser_vision` ignores `auxiliary.vision.timeout` config ([#2901](https://github.com/NousResearch/hermes-agent/pull/2901))
- Make browser command timeout configurable via config.yaml ([#2801](https://github.com/NousResearch/hermes-agent/pull/2801))
### MCP
- MCP toolset resolution for runtime and config ([#3252](https://github.com/NousResearch/hermes-agent/pull/3252))
- Add MCP tool name collision protection ([#3077](https://github.com/NousResearch/hermes-agent/pull/3077))
### Auxiliary LLM
- Guard aux LLM calls against None content + reasoning fallback + retry ([#3449](https://github.com/NousResearch/hermes-agent/pull/3449))
- Catch ImportError from `build_anthropic_client` in vision auto-detection ([#3312](https://github.com/NousResearch/hermes-agent/pull/3312))
### Other Tools
- Add request timeouts to `send_message_tool` HTTP calls ([#3162](https://github.com/NousResearch/hermes-agent/pull/3162)) by @memosr
- Auto-repair `jobs.json` with invalid control characters ([#3537](https://github.com/NousResearch/hermes-agent/pull/3537))
- Enable fine-grained tool streaming for Claude/OpenRouter ([#3497](https://github.com/NousResearch/hermes-agent/pull/3497))
---
## 🧩 Skills Ecosystem
### Skills System
- **Env var passthrough** for skills and user config — skills can declare environment variables to pass through ([#2807](https://github.com/NousResearch/hermes-agent/pull/2807))
- Cache skills prompt with shared `skill_utils` module for faster TTFT ([#3421](https://github.com/NousResearch/hermes-agent/pull/3421))
- Avoid redundant file re-read for skill conditions ([#2992](https://github.com/NousResearch/hermes-agent/pull/2992))
- Use Git Trees API to prevent silent subdirectory loss during install ([#2995](https://github.com/NousResearch/hermes-agent/pull/2995))
- Fix skills-sh install for deeply nested repo structures ([#2980](https://github.com/NousResearch/hermes-agent/pull/2980))
- Handle null metadata in skill frontmatter ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Preserve trust for skills-sh identifiers + reduce resolution churn ([#3251](https://github.com/NousResearch/hermes-agent/pull/3251))
- Agent-created skills were incorrectly treated as untrusted community content — fixed ([untagged commit](https://github.com/NousResearch/hermes-agent))
### New Skills
- **G0DM0D3 godmode jailbreaking skill** + docs ([#3157](https://github.com/NousResearch/hermes-agent/pull/3157))
- **Docker management skill** added to optional-skills ([#3060](https://github.com/NousResearch/hermes-agent/pull/3060))
- **OpenClaw migration v2** — 17 new modules, terminal recap for migrating from OpenClaw to Hermes ([#2906](https://github.com/NousResearch/hermes-agent/pull/2906))
---
## 🔒 Security & Reliability
### Security Hardening
- **SSRF protection** added to `browser_navigate` ([#3058](https://github.com/NousResearch/hermes-agent/pull/3058))
- **SSRF protection** added to `vision_tools` and `web_tools` (hardened) ([#2679](https://github.com/NousResearch/hermes-agent/pull/2679))
- **Restrict subagent toolsets** to parent's enabled set ([#3269](https://github.com/NousResearch/hermes-agent/pull/3269))
- **Prevent zip-slip path traversal** in self-update ([#3250](https://github.com/NousResearch/hermes-agent/pull/3250))
- **Prevent shell injection** in `_expand_path` via `~user` path suffix ([#2685](https://github.com/NousResearch/hermes-agent/pull/2685))
- **Normalize input** before dangerous command detection ([#3260](https://github.com/NousResearch/hermes-agent/pull/3260))
- Make tirith block verdicts approvable instead of hard-blocking ([#3428](https://github.com/NousResearch/hermes-agent/pull/3428))
- Remove compromised `litellm`/`typer`/`platformdirs` from deps ([#2796](https://github.com/NousResearch/hermes-agent/pull/2796))
- Pin all dependency version ranges ([#2810](https://github.com/NousResearch/hermes-agent/pull/2810))
- Regenerate `uv.lock` with hashes, use lockfile in setup ([#2812](https://github.com/NousResearch/hermes-agent/pull/2812))
- Bump dependencies to fix CVEs + regenerate `uv.lock` ([#3073](https://github.com/NousResearch/hermes-agent/pull/3073))
- Supply chain audit CI workflow for PR scanning ([#2816](https://github.com/NousResearch/hermes-agent/pull/2816))
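The zip-slip fix above hardens self-update's archive extraction. As a generic illustration of that class of guard (this is a hedged sketch, not the actual code from #3250), the idea is to resolve each entry's destination path and reject any that escapes the extraction root:

```python
import os
import zipfile


def safe_extract(zf: zipfile.ZipFile, dest: str) -> None:
    """Extract an archive, rejecting entries whose resolved path
    escapes *dest* (the classic zip-slip traversal, e.g. "../evil")."""
    dest_root = os.path.realpath(dest)
    for member in zf.namelist():
        target = os.path.realpath(os.path.join(dest_root, member))
        if not (target == dest_root or target.startswith(dest_root + os.sep)):
            raise ValueError(f"blocked zip-slip path: {member}")
    zf.extractall(dest_root)
```
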
### Reliability
- **SQLite WAL write-lock contention** causing 15-20s TUI freeze — fixed ([#3385](https://github.com/NousResearch/hermes-agent/pull/3385))
- **SQLite concurrency hardening** + session transcript integrity ([#3249](https://github.com/NousResearch/hermes-agent/pull/3249))
- Prevent recurring cron job re-fire on gateway crash/restart loop ([#3396](https://github.com/NousResearch/hermes-agent/pull/3396))
- Mark cron session as ended after job completes ([#2998](https://github.com/NousResearch/hermes-agent/pull/2998))
---
## ⚡ Performance
- **TTFT startup optimizations** — salvaged easy-win startup improvements ([#3395](https://github.com/NousResearch/hermes-agent/pull/3395))
- Cache skills prompt with shared `skill_utils` module ([#3421](https://github.com/NousResearch/hermes-agent/pull/3421))
- Avoid redundant file re-read for skill conditions in prompt builder ([#2992](https://github.com/NousResearch/hermes-agent/pull/2992))
---
## 🐛 Notable Bug Fixes
- Fix gateway token double-counting with cached agents ([#3306](https://github.com/NousResearch/hermes-agent/pull/3306), [#3317](https://github.com/NousResearch/hermes-agent/pull/3317))
- Fix "Event loop is closed" / "Press ENTER to continue" during idle sessions ([#3398](https://github.com/NousResearch/hermes-agent/pull/3398))
- Fix reasoning box rendering 3x during tool-calling loops ([#3405](https://github.com/NousResearch/hermes-agent/pull/3405))
- Fix status bar showing 26K instead of 260K for token counts ([#3024](https://github.com/NousResearch/hermes-agent/pull/3024))
- Fix `/queue` always working regardless of config ([#3298](https://github.com/NousResearch/hermes-agent/pull/3298))
- Fix phantom Discord typing indicator after agent turn ([#3003](https://github.com/NousResearch/hermes-agent/pull/3003))
- Fix Slack progress messages appearing in wrong thread ([#3063](https://github.com/NousResearch/hermes-agent/pull/3063))
- Fix WhatsApp media downloads (documents, audio, video) ([#2978](https://github.com/NousResearch/hermes-agent/pull/2978))
- Fix Telegram "Message thread not found" killing progress messages ([#3390](https://github.com/NousResearch/hermes-agent/pull/3390))
- Fix OpenClaw migration overwriting defaults ([#3282](https://github.com/NousResearch/hermes-agent/pull/3282))
- Fix returning-user setup menu dispatching wrong section ([#3083](https://github.com/NousResearch/hermes-agent/pull/3083))
- Fix `hermes update` PEP 668 "externally-managed-environment" error ([#3099](https://github.com/NousResearch/hermes-agent/pull/3099))
- Fix subagents hitting `max_iterations` prematurely via shared budget ([#3004](https://github.com/NousResearch/hermes-agent/pull/3004))
- Fix YAML boolean handling for `tool_progress` config ([#3300](https://github.com/NousResearch/hermes-agent/pull/3300))
- Fix `config.get()` crashes on YAML null values ([#3377](https://github.com/NousResearch/hermes-agent/pull/3377))
- Fix `.strip()` crash on None values from YAML config ([#3552](https://github.com/NousResearch/hermes-agent/pull/3552))
- Fix hung agents on gateway — `/stop` now hard-kills session lock ([#3104](https://github.com/NousResearch/hermes-agent/pull/3104))
- Fix `_custom` provider silently remapped to `openrouter` ([#2792](https://github.com/NousResearch/hermes-agent/pull/2792))
- Fix Matrix missing from `PLATFORMS` dict ([#3473](https://github.com/NousResearch/hermes-agent/pull/3473))
- Fix Email adapter unbounded `_seen_uids` growth ([#3490](https://github.com/NousResearch/hermes-agent/pull/3490))
---
## 🧪 Testing
- Pin `agent-client-protocol` < 0.9 to handle breaking upstream release ([#3320](https://github.com/NousResearch/hermes-agent/pull/3320))
- Catch anthropic ImportError in vision auto-detection tests ([#3312](https://github.com/NousResearch/hermes-agent/pull/3312))
- Update retry-exhaust test for new graceful return behavior ([#3320](https://github.com/NousResearch/hermes-agent/pull/3320))
- Add regression tests for null metadata frontmatter ([untagged commit](https://github.com/NousResearch/hermes-agent))
---
## 📚 Documentation
- Update all docs for `/model` command overhaul and custom provider support ([#2800](https://github.com/NousResearch/hermes-agent/pull/2800))
- Fix stale and incorrect documentation across 18 files ([#2805](https://github.com/NousResearch/hermes-agent/pull/2805))
- Document 9 previously undocumented features ([#2814](https://github.com/NousResearch/hermes-agent/pull/2814))
- Add missing skills, CLI commands, and messaging env vars to docs ([#2809](https://github.com/NousResearch/hermes-agent/pull/2809))
- Fix api-server response storage documentation — SQLite, not in-memory ([#2819](https://github.com/NousResearch/hermes-agent/pull/2819))
- Quote pip install extras to fix zsh glob errors ([#2815](https://github.com/NousResearch/hermes-agent/pull/2815))
- Unify hooks documentation — add plugin hooks to hooks page, add `session:end` event ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Clarify two-mode behavior in `session_search` schema description ([untagged commit](https://github.com/NousResearch/hermes-agent))
- Fix Discord Public Bot setting for Discord-provided invite link ([#3519](https://github.com/NousResearch/hermes-agent/pull/3519)) by @mehmoodosman
- Revise v0.4.0 changelog — fix feature attribution, reorder sections ([untagged commit](https://github.com/NousResearch/hermes-agent))
---
## 👥 Contributors
### Core
- **@teknium1** — 157 PRs covering the full scope of this release
### Community Contributors
- **@alt-glitch** (Siddharth Balyan) — 2 PRs: Nix flake with uv2nix build, NixOS module, and persistent container mode ([#20](https://github.com/NousResearch/hermes-agent/pull/20)); auto-generated config keys and suffix PATHs for Nix builds ([#3061](https://github.com/NousResearch/hermes-agent/pull/3061), [#3274](https://github.com/NousResearch/hermes-agent/pull/3274))
- **@ctlst** — 1 PR: Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode ([#2701](https://github.com/NousResearch/hermes-agent/pull/2701))
- **@memosr** (memosr.eth) — 1 PR: Add request timeouts to `send_message_tool` HTTP calls ([#3162](https://github.com/NousResearch/hermes-agent/pull/3162))
- **@mehmoodosman** (Osman Mehmood) — 1 PR: Fix Discord docs for Public Bot setting ([#3519](https://github.com/NousResearch/hermes-agent/pull/3519))
### All Contributors
@alt-glitch, @ctlst, @mehmoodosman, @memosr, @teknium1
---
**Full Changelog**: [v2026.3.23...v2026.3.28](https://github.com/NousResearch/hermes-agent/compare/v2026.3.23...v2026.3.28)

View file

@ -74,7 +74,7 @@ def main() -> None:
agent = HermesACPAgent()
try:
asyncio.run(acp.run_agent(agent))
asyncio.run(acp.run_agent(agent, use_unstable_protocol=True))
except KeyboardInterrupt:
logger.info("Shutting down (KeyboardInterrupt)")
except Exception:

View file

@ -25,6 +25,9 @@ from acp.schema import (
NewSessionResponse,
PromptResponse,
ResumeSessionResponse,
SetSessionConfigOptionResponse,
SetSessionModelResponse,
SetSessionModeResponse,
ResourceContentBlock,
SessionCapabilities,
SessionForkCapabilities,
@ -94,11 +97,14 @@ class HermesACPAgent(acp.Agent):
async def initialize(
self,
protocol_version: int,
protocol_version: int | None = None,
client_capabilities: ClientCapabilities | None = None,
client_info: Implementation | None = None,
**kwargs: Any,
) -> InitializeResponse:
resolved_protocol_version = (
protocol_version if isinstance(protocol_version, int) else acp.PROTOCOL_VERSION
)
provider = detect_provider()
auth_methods = None
if provider:
@ -111,7 +117,11 @@ class HermesACPAgent(acp.Agent):
]
client_name = client_info.name if client_info else "unknown"
logger.info("Initialize from %s (protocol v%s)", client_name, protocol_version)
logger.info(
"Initialize from %s (protocol v%s)",
client_name,
resolved_protocol_version,
)
return InitializeResponse(
protocol_version=acp.PROTOCOL_VERSION,
@ -471,7 +481,7 @@ class HermesACPAgent(acp.Agent):
async def set_session_model(
self, model_id: str, session_id: str, **kwargs: Any
):
) -> SetSessionModelResponse | None:
"""Switch the model for a session (called by ACP protocol)."""
state = self.session_manager.get_session(session_id)
if state:
@ -489,4 +499,37 @@ class HermesACPAgent(acp.Agent):
)
self.session_manager.save_session(session_id)
logger.info("Session %s: model switched to %s", session_id, model_id)
return SetSessionModelResponse()
logger.warning("Session %s: model switch requested for missing session", session_id)
return None
async def set_session_mode(
self, mode_id: str, session_id: str, **kwargs: Any
) -> SetSessionModeResponse | None:
"""Persist the editor-requested mode so ACP clients do not fail on mode switches."""
state = self.session_manager.get_session(session_id)
if state is None:
logger.warning("Session %s: mode switch requested for missing session", session_id)
return None
setattr(state, "mode", mode_id)
self.session_manager.save_session(session_id)
logger.info("Session %s: mode switched to %s", session_id, mode_id)
return SetSessionModeResponse()
async def set_config_option(
self, config_id: str, session_id: str, value: str, **kwargs: Any
) -> SetSessionConfigOptionResponse | None:
"""Accept ACP config option updates even when Hermes has no typed ACP config surface yet."""
state = self.session_manager.get_session(session_id)
if state is None:
logger.warning("Session %s: config update requested for missing session", session_id)
return None
options = getattr(state, "config_options", None)
if not isinstance(options, dict):
options = {}
options[str(config_id)] = value
setattr(state, "config_options", options)
self.session_manager.save_session(session_id)
logger.info("Session %s: config option %s updated", session_id, config_id)
return SetSessionConfigOptionResponse(config_options=[])

View file

@ -35,6 +35,54 @@ ADAPTIVE_EFFORT_MAP = {
"minimal": "low",
}
# ── Max output token limits per Anthropic model ───────────────────────
# Source: Anthropic docs + Cline model catalog. Anthropic's API requires
# max_tokens as a mandatory field. Previously we hardcoded 16384, which
# starves thinking-enabled models (thinking tokens count toward the limit).
_ANTHROPIC_OUTPUT_LIMITS = {
# Claude 4.6
"claude-opus-4-6": 128_000,
"claude-sonnet-4-6": 64_000,
# Claude 4.5
"claude-opus-4-5": 64_000,
"claude-sonnet-4-5": 64_000,
"claude-haiku-4-5": 64_000,
# Claude 4
"claude-opus-4": 32_000,
"claude-sonnet-4": 64_000,
# Claude 3.7
"claude-3-7-sonnet": 128_000,
# Claude 3.5
"claude-3-5-sonnet": 8_192,
"claude-3-5-haiku": 8_192,
# Claude 3
"claude-3-opus": 4_096,
"claude-3-sonnet": 4_096,
"claude-3-haiku": 4_096,
}
# For any model not in the table, assume the highest current limit.
# Future Anthropic models are unlikely to have *less* output capacity.
_ANTHROPIC_DEFAULT_OUTPUT_LIMIT = 128_000
def _get_anthropic_max_output(model: str) -> int:
"""Look up the max output token limit for an Anthropic model.
Uses substring matching against _ANTHROPIC_OUTPUT_LIMITS so date-stamped
model IDs (claude-sonnet-4-5-20250929) and variant suffixes (:1m, :fast)
resolve correctly. The longest matching key wins, so the more specific
"claude-3-5-sonnet" takes precedence over the shorter "claude-3-5".
"""
m = model.lower()
best_key = ""
best_val = _ANTHROPIC_DEFAULT_OUTPUT_LIMIT
for key, val in _ANTHROPIC_OUTPUT_LIMITS.items():
if key in m and len(key) > len(best_key):
best_key = key
best_val = val
return best_val
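The table-driven lookup above can be distilled into a standalone sketch (table abbreviated, names hypothetical) showing why date-stamped IDs and variant suffixes still resolve to the right entry:

```python
# Abbreviated illustration of the longest-substring-match lookup.
_EXAMPLE_LIMITS = {
    "claude-sonnet-4-5": 64_000,
    "claude-3-5-sonnet": 8_192,
}


def lookup_output_limit(model: str, default: int = 128_000) -> int:
    """Longest matching key wins; unknown models fall back to *default*."""
    m = model.lower()
    best_key, best_val = "", default
    for key, val in _EXAMPLE_LIMITS.items():
        # Substring match tolerates date stamps and ":fast"/":1m" suffixes.
        if key in m and len(key) > len(best_key):
            best_key, best_val = key, val
    return best_val
```
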
def _supports_adaptive_thinking(model: str) -> bool:
"""Return True for Claude 4.6 models that support adaptive thinking."""
@ -59,6 +107,7 @@ _OAUTH_ONLY_BETAS = [
# The version must stay reasonably current — Anthropic rejects OAuth requests
# when the spoofed user-agent version is too far behind the actual release.
_CLAUDE_CODE_VERSION_FALLBACK = "2.1.74"
_claude_code_version_cache: Optional[str] = None
def _detect_claude_code_version() -> str:
@ -86,11 +135,18 @@ def _detect_claude_code_version() -> str:
return _CLAUDE_CODE_VERSION_FALLBACK
_CLAUDE_CODE_VERSION = _detect_claude_code_version()
_CLAUDE_CODE_SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."
_MCP_TOOL_PREFIX = "mcp_"
def _get_claude_code_version() -> str:
"""Lazily detect the installed Claude Code version when OAuth headers need it."""
global _claude_code_version_cache
if _claude_code_version_cache is None:
_claude_code_version_cache = _detect_claude_code_version()
return _claude_code_version_cache
def _is_oauth_token(key: str) -> bool:
"""Check if the key is an OAuth/setup token (not a regular Console API key).
@ -132,7 +188,7 @@ def build_anthropic_client(api_key: str, base_url: str = None):
kwargs["auth_token"] = api_key
kwargs["default_headers"] = {
"anthropic-beta": ",".join(all_betas),
"user-agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
"user-agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
"x-app": "cli",
}
else:
@ -241,7 +297,7 @@ def _refresh_oauth_token(creds: Dict[str, Any]) -> Optional[str]:
headers = {
"Content-Type": "application/json",
"User-Agent": f"claude-cli/{_CLAUDE_CODE_VERSION} (external, cli)",
"User-Agent": f"claude-cli/{_get_claude_code_version()} (external, cli)",
}
for endpoint in token_endpoints:
@ -706,14 +762,21 @@ def convert_messages_to_anthropic(
result.append({"role": "user", "content": [tool_result]})
continue
# Regular user message
# Regular user message — validate non-empty content (Anthropic rejects empty)
if isinstance(content, list):
converted_blocks = _convert_content_to_anthropic(content)
result.append({
"role": "user",
"content": converted_blocks or [{"type": "text", "text": ""}],
})
# Treat the message as empty only when every block is a blank text
# block; image/document blocks count as real content.
has_content = any(
b.get("type") != "text" or b.get("text", "").strip()
for b in converted_blocks
if isinstance(b, dict)
)
if not converted_blocks or not has_content:
converted_blocks = [{"type": "text", "text": "(empty message)"}]
result.append({"role": "user", "content": converted_blocks})
else:
# Validate string content is non-empty
if not content or (isinstance(content, str) and not content.strip()):
content = "(empty message)"
result.append({"role": "user", "content": content})
# Strip orphaned tool_use blocks (no matching tool_result follows)
@ -803,9 +866,15 @@ def build_anthropic_kwargs(
tool_choice: Optional[str] = None,
is_oauth: bool = False,
preserve_dots: bool = False,
context_length: Optional[int] = None,
) -> Dict[str, Any]:
"""Build kwargs for anthropic.messages.create().
When *max_tokens* is None, the model's native output limit is used
(e.g. 128K for Opus 4.6, 64K for Sonnet 4.6). If *context_length*
is provided, the effective limit is clamped so it doesn't exceed
the context window.
When *is_oauth* is True, applies Claude Code compatibility transforms:
system prompt prefix, tool name prefixing, and prompt sanitization.
@ -816,7 +885,12 @@ def build_anthropic_kwargs(
anthropic_tools = convert_tools_to_anthropic(tools) if tools else []
model = normalize_model_name(model, preserve_dots=preserve_dots)
effective_max_tokens = max_tokens or 16384
effective_max_tokens = max_tokens or _get_anthropic_max_output(model)
# Clamp to context window if the user set a lower context_length
# (e.g. custom endpoint with limited capacity).
if context_length and effective_max_tokens > context_length:
effective_max_tokens = max(context_length - 1, 1)
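The fallback-and-clamp rule can be isolated as a small sketch (hypothetical helper name, not part of the module):

```python
def effective_max_tokens(max_tokens, model_limit, context_length=None):
    """Explicit max_tokens if given, else the model's native output limit,
    clamped below the context window when one is configured."""
    eff = max_tokens or model_limit
    if context_length and eff > context_length:
        eff = max(context_length - 1, 1)
    return eff
```
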
# ── OAuth: Claude Code identity ──────────────────────────────────
if is_oauth:

View file

@ -47,8 +47,7 @@ from typing import Any, Dict, List, Optional, Tuple
from openai import OpenAI
from hermes_cli.config import get_hermes_home
from hermes_constants import OPENROUTER_BASE_URL
from hermes_constants import OPENROUTER_BASE_URL, get_hermes_home
logger = logging.getLogger(__name__)
@ -627,8 +626,6 @@ def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str]]:
custom_key = runtime.get("api_key")
if not isinstance(custom_base, str) or not custom_base.strip():
return None, None
if not isinstance(custom_key, str) or not custom_key.strip():
return None, None
custom_base = custom_base.strip().rstrip("/")
if "openrouter.ai" in custom_base.lower():
@ -636,6 +633,13 @@ def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str]]:
# configured. Treat that as "no custom endpoint" for auxiliary routing.
return None, None
# Local servers (Ollama, llama.cpp, vLLM, LM Studio) don't require auth.
# Use a placeholder key — the OpenAI SDK requires a non-empty string but
# local servers ignore the Authorization header. Same fix as cli.py
# _ensure_runtime_credentials() (PR #2556).
if not isinstance(custom_key, str) or not custom_key.strip():
custom_key = "no-key-required"
return custom_base, custom_key.strip()
@ -693,7 +697,13 @@ def _try_anthropic() -> Tuple[Optional[Any], Optional[str]]:
is_oauth = _is_oauth_token(token)
model = _API_KEY_PROVIDER_AUX_MODELS.get("anthropic", "claude-haiku-4-5-20251001")
logger.debug("Auxiliary client: Anthropic native (%s) at %s (oauth=%s)", model, base_url, is_oauth)
real_client = build_anthropic_client(token, base_url)
try:
real_client = build_anthropic_client(token, base_url)
except ImportError:
# The anthropic_adapter module imports fine but the SDK itself is
# missing — build_anthropic_client raises ImportError at call time
# when _anthropic_sdk is None. Treat as unavailable.
return None, None
return AnthropicAuxiliaryClient(real_client, model, token, base_url, is_oauth=is_oauth), model
@ -731,16 +741,37 @@ def _resolve_forced_provider(forced: str) -> Tuple[Optional[OpenAI], Optional[st
return None, None
_AUTO_PROVIDER_LABELS = {
"_try_openrouter": "openrouter",
"_try_nous": "nous",
"_try_custom_endpoint": "local/custom",
"_try_codex": "openai-codex",
"_resolve_api_key_provider": "api-key",
}
def _resolve_auto() -> Tuple[Optional[OpenAI], Optional[str]]:
"""Full auto-detection chain: OpenRouter → Nous → custom → Codex → API-key → None."""
global auxiliary_is_nous
auxiliary_is_nous = False # Reset — _try_nous() will set True if it wins
tried = []
for try_fn in (_try_openrouter, _try_nous, _try_custom_endpoint,
_try_codex, _resolve_api_key_provider):
fn_name = getattr(try_fn, "__name__", "unknown")
label = _AUTO_PROVIDER_LABELS.get(fn_name, fn_name)
client, model = try_fn()
if client is not None:
if tried:
logger.info("Auxiliary auto-detect: using %s (%s) — skipped: %s",
label, model or "default", ", ".join(tried))
else:
logger.info("Auxiliary auto-detect: using %s (%s)", label, model or "default")
return client, model
logger.debug("Auxiliary client: none available")
tried.append(label)
logger.warning("Auxiliary auto-detect: no provider available (tried: %s). "
"Compression, summarization, and memory flush will not work. "
"Set OPENROUTER_API_KEY or configure a local model in config.yaml.",
", ".join(tried))
return None, None
@ -891,11 +922,12 @@ def resolve_provider_client(
custom_key = (
(explicit_api_key or "").strip()
or os.getenv("OPENAI_API_KEY", "").strip()
or "no-key-required" # local servers don't need auth
)
if not custom_base or not custom_key:
if not custom_base:
logger.warning(
"resolve_provider_client: explicit custom endpoint requested "
"but no API key was found (set explicit_api_key or OPENAI_API_KEY)"
"but base_url is empty"
)
return None, None
final_model = model or _read_main_model() or "gpt-4o-mini"
@ -1131,7 +1163,13 @@ def resolve_vision_provider_client(
return "custom", client, final_model
if requested == "auto":
for candidate in get_available_vision_backends():
ordered = list(_VISION_AUTO_PROVIDER_ORDER)
preferred = _preferred_main_vision_provider()
if preferred in ordered:
ordered.remove(preferred)
ordered.insert(0, preferred)
for candidate in ordered:
sync_client, default_model = _resolve_strict_vision_backend(candidate)
if sync_client is not None:
return _finalize(candidate, sync_client, default_model)
@ -1204,6 +1242,39 @@ _client_cache: Dict[tuple, tuple] = {}
_client_cache_lock = threading.Lock()
def neuter_async_httpx_del() -> None:
"""Monkey-patch ``AsyncHttpxClientWrapper.__del__`` to be a no-op.
The OpenAI SDK's ``AsyncHttpxClientWrapper.__del__`` schedules
``self.aclose()`` via ``asyncio.get_running_loop().create_task()``.
When an ``AsyncOpenAI`` client is garbage-collected while
prompt_toolkit's event loop is running (the common CLI idle state),
the ``aclose()`` task runs on prompt_toolkit's loop but the
underlying TCP transport is bound to a *different* loop (the worker
thread's loop that the client was originally created on). If that
loop is closed or its thread is dead, the transport's
``self._loop.call_soon()`` raises ``RuntimeError("Event loop is
closed")``, which prompt_toolkit surfaces as "Unhandled exception
in event loop ... Press ENTER to continue...".
Neutering ``__del__`` is safe because:
- Cached clients are explicitly cleaned via ``_force_close_async_httpx``
on stale-loop detection and ``shutdown_cached_clients`` on exit.
- Uncached clients' TCP connections are cleaned up by the OS when the
process exits.
- The OpenAI SDK itself marks this as a TODO (``# TODO(someday):
support non asyncio runtimes here``).
Call this once at CLI startup, before any ``AsyncOpenAI`` clients are
created.
"""
try:
from openai._base_client import AsyncHttpxClientWrapper
AsyncHttpxClientWrapper.__del__ = lambda self: None # type: ignore[assignment]
except (ImportError, AttributeError):
pass # Graceful degradation if the SDK changes its internals
def _force_close_async_httpx(client: Any) -> None:
"""Mark the httpx AsyncClient inside an AsyncOpenAI client as closed.
@ -1251,6 +1322,25 @@ def shutdown_cached_clients() -> None:
_client_cache.clear()
def cleanup_stale_async_clients() -> None:
"""Force-close cached async clients whose event loop is closed.
Call this after each agent turn to proactively clean up stale clients
before GC can trigger ``AsyncHttpxClientWrapper.__del__`` on them.
This is defense-in-depth: the primary fix is ``neuter_async_httpx_del``,
which disables ``__del__`` entirely.
"""
with _client_cache_lock:
stale_keys = []
for key, entry in _client_cache.items():
client, _default, cached_loop = entry
if cached_loop is not None and cached_loop.is_closed():
_force_close_async_httpx(client)
stale_keys.append(key)
for key in stale_keys:
del _client_cache[key]
def _get_cached_client(
provider: str,
model: str = None,
@ -1394,6 +1484,29 @@ def _resolve_task_provider_model(
return "auto", resolved_model, None, None
_DEFAULT_AUX_TIMEOUT = 30.0
def _get_task_timeout(task: str, default: float = _DEFAULT_AUX_TIMEOUT) -> float:
"""Read timeout from auxiliary.{task}.timeout in config, falling back to *default*."""
if not task:
return default
try:
from hermes_cli.config import load_config
config = load_config()
except ImportError:
return default
aux = config.get("auxiliary", {}) if isinstance(config, dict) else {}
task_config = aux.get(task, {}) if isinstance(aux, dict) else {}
raw = task_config.get("timeout")
if raw is not None:
try:
return float(raw)
except (ValueError, TypeError):
pass
return default
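The defensive resolution above is worth isolating as a pattern. This sketch (hypothetical names; the config dict is passed in rather than loaded via `load_config`) shows the same tolerant `auxiliary.{task}.timeout` lookup:

```python
DEFAULT_AUX_TIMEOUT = 30.0


def get_task_timeout(config: dict, task: str,
                     default: float = DEFAULT_AUX_TIMEOUT) -> float:
    """Resolve auxiliary.{task}.timeout, tolerating missing sections,
    YAML nulls, and malformed values at every level."""
    aux = config.get("auxiliary", {}) if isinstance(config, dict) else {}
    task_cfg = aux.get(task, {}) if isinstance(aux, dict) else {}
    raw = task_cfg.get("timeout") if isinstance(task_cfg, dict) else None
    if raw is not None:
        try:
            return float(raw)
        except (ValueError, TypeError):
            pass  # malformed value: fall back rather than crash
    return default
```
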
def _build_call_kwargs(
provider: str,
model: str,
@ -1451,7 +1564,7 @@ def call_llm(
temperature: float = None,
max_tokens: int = None,
tools: list = None,
timeout: float = 30.0,
timeout: float = None,
extra_body: dict = None,
) -> Any:
"""Centralized synchronous LLM call.
@ -1469,7 +1582,7 @@ def call_llm(
temperature: Sampling temperature (None = provider default).
max_tokens: Max output tokens (handles max_tokens vs max_completion_tokens).
tools: Tool definitions (for function calling).
timeout: Request timeout in seconds.
timeout: Request timeout in seconds (None = read from auxiliary.{task}.timeout config).
extra_body: Additional request body fields.
Returns:
@ -1525,8 +1638,8 @@ def call_llm(
)
# For auto/custom, fall back to OpenRouter
if not resolved_base_url:
logger.warning("Provider %s unavailable, falling back to openrouter",
resolved_provider)
logger.info("Auxiliary %s: provider %s unavailable, falling back to openrouter",
task or "call", resolved_provider)
client, final_model = _get_cached_client(
"openrouter", resolved_model or _OPENROUTER_MODEL)
if client is None:
@ -1534,10 +1647,19 @@ def call_llm(
f"No LLM provider configured for task={task} provider={resolved_provider}. "
f"Run: hermes setup")
effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
# Log what we're about to do — makes auxiliary operations visible
_base_info = str(getattr(client, "base_url", resolved_base_url) or "")
if task:
logger.info("Auxiliary %s: using %s (%s)%s",
task, resolved_provider or "auto", final_model or "default",
f" at {_base_info}" if _base_info and "openrouter" not in _base_info else "")
kwargs = _build_call_kwargs(
resolved_provider, final_model, messages,
temperature=temperature, max_tokens=max_tokens,
tools=tools, timeout=timeout, extra_body=extra_body,
tools=tools, timeout=effective_timeout, extra_body=extra_body,
base_url=resolved_base_url)
# Handle max_tokens vs max_completion_tokens retry
@ -1552,6 +1674,62 @@ def call_llm(
raise
def extract_content_or_reasoning(response) -> str:
"""Extract content from an LLM response, falling back to reasoning fields.
Mirrors the main agent loop's behavior when a reasoning model (DeepSeek-R1,
Qwen-QwQ, etc.) returns ``content=None`` with reasoning in structured fields.
Resolution order:
1. ``message.content``: strip inline think/reasoning blocks, check for
remaining non-whitespace text.
2. ``message.reasoning`` / ``message.reasoning_content``: direct
structured reasoning fields (DeepSeek, Moonshot, Novita, etc.).
3. ``message.reasoning_details``: OpenRouter unified array format.
Returns the best available text, or ``""`` if nothing found.
"""
import re
msg = response.choices[0].message
content = (msg.content or "").strip()
if content:
# Strip inline think/reasoning blocks (mirrors _strip_think_blocks)
cleaned = re.sub(
r"<(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>"
r".*?"
r"</(?:think|thinking|reasoning|REASONING_SCRATCHPAD)>",
"", content, flags=re.DOTALL | re.IGNORECASE,
).strip()
if cleaned:
return cleaned
# Content is empty or reasoning-only — try structured reasoning fields
reasoning_parts: list[str] = []
for field in ("reasoning", "reasoning_content"):
val = getattr(msg, field, None)
if val and isinstance(val, str) and val.strip() and val not in reasoning_parts:
reasoning_parts.append(val.strip())
details = getattr(msg, "reasoning_details", None)
if details and isinstance(details, list):
for detail in details:
if isinstance(detail, dict):
summary = (
detail.get("summary")
or detail.get("content")
or detail.get("text")
)
if summary and summary not in reasoning_parts:
reasoning_parts.append(summary.strip() if isinstance(summary, str) else str(summary))
if reasoning_parts:
return "\n\n".join(reasoning_parts)
return ""
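A condensed, runnable sketch of that resolution order (abbreviated tag list, hypothetical names; the real function also walks `reasoning_details` arrays):

```python
import re
from types import SimpleNamespace

_THINK_RE = re.compile(
    r"<(?:think|thinking|reasoning)>.*?</(?:think|thinking|reasoning)>",
    re.DOTALL | re.IGNORECASE,
)


def extract_sketch(msg) -> str:
    """Content first (minus think blocks), then structured reasoning fields."""
    content = (getattr(msg, "content", None) or "").strip()
    if content:
        cleaned = _THINK_RE.sub("", content).strip()
        if cleaned:
            return cleaned
    for field in ("reasoning", "reasoning_content"):
        val = getattr(msg, field, None)
        if isinstance(val, str) and val.strip():
            return val.strip()
    return ""
```
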
async def async_call_llm(
task: str = None,
*,
@ -1563,7 +1741,7 @@ async def async_call_llm(
temperature: float = None,
max_tokens: int = None,
tools: list = None,
timeout: float = 30.0,
timeout: float = None,
extra_body: dict = None,
) -> Any:
"""Centralized asynchronous LLM call.
@ -1624,10 +1802,12 @@ async def async_call_llm(
f"No LLM provider configured for task={task} provider={resolved_provider}. "
f"Run: hermes setup")
effective_timeout = timeout if timeout is not None else _get_task_timeout(task)
kwargs = _build_call_kwargs(
resolved_provider, final_model, messages,
temperature=temperature, max_tokens=max_tokens,
tools=tools, timeout=timeout, extra_body=extra_body,
tools=tools, timeout=effective_timeout, extra_body=extra_body,
base_url=resolved_base_url)
try:

View file

@ -141,7 +141,7 @@ class ContextCompressor:
"last_prompt_tokens": self.last_prompt_tokens,
"threshold_tokens": self.threshold_tokens,
"context_length": self.context_length,
"usage_percent": (self.last_prompt_tokens / self.context_length * 100) if self.context_length else 0,
"usage_percent": min(100, (self.last_prompt_tokens / self.context_length * 100)) if self.context_length else 0,
"compression_count": self.compression_count,
}
@ -347,7 +347,7 @@ Write only the summary body. Do not include any preamble or prefix."""
"messages": [{"role": "user", "content": prompt}],
"temperature": 0.3,
"max_tokens": summary_budget * 2,
"timeout": 45.0,
# timeout resolved from auxiliary.compression.timeout config by call_llm
}
if self.summary_model:
call_kwargs["model"] = self.summary_model

View file

@ -286,12 +286,16 @@ def _expand_git_reference(
args: list[str],
label: str,
) -> tuple[str | None, str | None]:
result = subprocess.run(
["git", *args],
cwd=cwd,
capture_output=True,
text=True,
)
try:
result = subprocess.run(
["git", *args],
cwd=cwd,
capture_output=True,
text=True,
timeout=30,
)
except subprocess.TimeoutExpired:
return f"{ref.raw}: git command timed out (30s)", None
if result.returncode != 0:
stderr = (result.stderr or "").strip() or "git command failed"
return f"{ref.raw}: {stderr}", None
@ -449,9 +453,12 @@ def _rg_files(path: Path, cwd: Path, limit: int) -> list[Path] | None:
cwd=cwd,
capture_output=True,
text=True,
timeout=10,
)
except FileNotFoundError:
return None
except subprocess.TimeoutExpired:
return None
if result.returncode != 0:
return None
files = [Path(line.strip()) for line in result.stdout.splitlines() if line.strip()]

View file

@ -17,6 +17,23 @@ _RESET = "\033[0m"
logger = logging.getLogger(__name__)
# =========================================================================
# Configurable tool preview length (0 = no limit)
# Set once at startup by CLI or gateway from display.tool_preview_length config.
# =========================================================================
_tool_preview_max_len: int = 0 # 0 = unlimited
def set_tool_preview_max_len(n: int) -> None:
"""Set the global max length for tool call previews. 0 = no limit."""
global _tool_preview_max_len
_tool_preview_max_len = max(int(n), 0) if n else 0
def get_tool_preview_max_len() -> int:
"""Return the configured max preview length (0 = unlimited)."""
return _tool_preview_max_len
# =========================================================================
# Skin-aware helpers (lazy import to avoid circular deps)
@@ -94,8 +111,14 @@ def _oneline(text: str) -> str:
return " ".join(text.split())
def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str | None:
"""Build a short preview of a tool call's primary argument for display."""
def build_tool_preview(tool_name: str, args: dict, max_len: int | None = None) -> str | None:
"""Build a short preview of a tool call's primary argument for display.
*max_len* controls truncation. ``None`` (default) defers to the global
``_tool_preview_max_len`` set via config; ``0`` means unlimited.
"""
if max_len is None:
max_len = _tool_preview_max_len
if not args:
return None
primary_args = {
@@ -190,7 +213,7 @@ def build_tool_preview(tool_name: str, args: dict, max_len: int = 40) -> str | N
preview = _oneline(str(value))
if not preview:
return None
if len(preview) > max_len:
if max_len > 0 and len(preview) > max_len:
preview = preview[:max_len - 3] + "..."
return preview
@@ -231,7 +254,7 @@ class KawaiiSpinner:
"analyzing", "computing", "synthesizing", "formulating", "brainstorming",
]
def __init__(self, message: str = "", spinner_type: str = 'dots'):
def __init__(self, message: str = "", spinner_type: str = 'dots', print_fn=None):
self.message = message
self.spinner_frames = self.SPINNERS.get(spinner_type, self.SPINNERS['dots'])
self.running = False
@@ -239,12 +262,26 @@
self.frame_idx = 0
self.start_time = None
self.last_line_len = 0
# Optional callable to route all output through (e.g. a no-op for silent
# background agents). When set, bypasses self._out entirely so that
# agents with _print_fn overridden remain fully silent.
self._print_fn = print_fn
# Capture stdout NOW, before any redirect_stdout(devnull) from
# child agents can replace sys.stdout with a black hole.
self._out = sys.stdout
def _write(self, text: str, end: str = '\n', flush: bool = False):
"""Write to the stdout captured at spinner creation time."""
"""Write to the stdout captured at spinner creation time.
If a print_fn was supplied at construction, all output is routed through
it instead, allowing callers to silence the spinner with a no-op lambda.
"""
if self._print_fn is not None:
try:
self._print_fn(text)
except Exception:
pass
return
try:
self._out.write(text + end)
if flush:
@@ -270,11 +307,11 @@
The CLI already drives a TUI widget (_spinner_text) for spinner display,
so KawaiiSpinner's \\r-based animation is redundant under StdoutProxy.
"""
out = self._out
# StdoutProxy has a 'raw' attribute (bool) that plain file objects lack.
if hasattr(out, 'raw') and type(out).__name__ == 'StdoutProxy':
return True
return False
try:
from prompt_toolkit.patch_stdout import StdoutProxy
return isinstance(self._out, StdoutProxy)
except ImportError:
return False
def _animate(self):
# When stdout is not a real terminal (e.g. Docker, systemd, pipe),
@@ -470,10 +507,14 @@ def get_cute_tool_message(
def _trunc(s, n=40):
s = str(s)
if _tool_preview_max_len == 0:
return s # no limit
return (s[:n-3] + "...") if len(s) > n else s
def _path(p, n=35):
p = str(p)
if _tool_preview_max_len == 0:
return p # no limit
return ("..." + p[-(n-3):]) if len(p) > n else p
def _wrap(line: str) -> str:
@@ -685,7 +726,7 @@ def format_context_pressure(
threshold_percent: Compaction threshold as a fraction of context window.
compression_enabled: Whether auto-compression is active.
"""
pct_int = int(compaction_progress * 100)
pct_int = min(int(compaction_progress * 100), 100)
filled = min(int(compaction_progress * _BAR_WIDTH), _BAR_WIDTH)
bar = _BAR_FILLED * filled + _BAR_EMPTY * (_BAR_WIDTH - filled)
@@ -715,7 +756,7 @@ def format_context_pressure_gateway(
No ANSI; just Unicode and plain text suitable for Telegram/Discord/etc.
The percentage shows progress toward the compaction threshold.
"""
pct_int = int(compaction_progress * 100)
pct_int = min(int(compaction_progress * 100), 100)
filled = min(int(compaction_progress * _BAR_WIDTH), _BAR_WIDTH)
bar = _BAR_FILLED * filled + _BAR_EMPTY * (_BAR_WIDTH - filled)


@@ -113,6 +113,15 @@ DEFAULT_CONTEXT_LENGTHS = {
"glm": 202752,
# Kimi
"kimi": 262144,
# Hugging Face Inference Providers — model IDs use org/name format
"Qwen/Qwen3.5-397B-A17B": 131072,
"Qwen/Qwen3.5-35B-A3B": 131072,
"deepseek-ai/DeepSeek-V3.2": 65536,
"moonshotai/Kimi-K2.5": 262144,
"moonshotai/Kimi-K2-Thinking": 262144,
"MiniMaxAI/MiniMax-M2.5": 204800,
"XiaomiMiMo/MiMo-V2-Flash": 32768,
"zai-org/GLM-5": 202752,
}
_CONTEXT_LENGTH_KEYS = (

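The table above mixes bare substrings ("glm", "kimi") with full `org/name` IDs, and the `_CONTEXT_LENGTH_KEYS` tuple that follows suggests key-based resolution. A plausible lookup, exact match first and substring fallback second, might look like this (the resolver function and its default are assumptions, not the repo's actual code):

```python
DEFAULT_CONTEXT_LENGTHS = {
    "glm": 202752,
    "kimi": 262144,
    "moonshotai/Kimi-K2.5": 262144,
    "zai-org/GLM-5": 202752,
}

def resolve_context_length(model_id: str, default: int = 32768) -> int:
    """Exact model ID wins; otherwise fall back to substring keys."""
    if model_id in DEFAULT_CONTEXT_LENGTHS:
        return DEFAULT_CONTEXT_LENGTHS[model_id]
    lowered = model_id.lower()
    for key, length in DEFAULT_CONTEXT_LENGTHS.items():
        if key.lower() in lowered:
            return length
    return default

print(resolve_context_length("moonshotai/Kimi-K2.5"))        # 262144
print(resolve_context_length("some-provider/glm-5-preview"))  # 202752
```

Checking exact IDs before substrings matters: otherwise "moonshotai/Kimi-K2.5" would hit the fuzzy "kimi" entry, which happens to agree here but would not for a family whose members differ in window size.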

@@ -15,6 +15,8 @@ import time
from pathlib import Path
from typing import Any, Dict, Optional
from utils import atomic_json_write
import requests
logger = logging.getLogger(__name__)
@@ -64,12 +66,10 @@ def _load_disk_cache() -> Dict[str, Any]:
def _save_disk_cache(data: Dict[str, Any]) -> None:
"""Save models.dev data to disk cache."""
"""Save models.dev data to disk cache atomically."""
try:
cache_path = _get_cache_path()
cache_path.parent.mkdir(parents=True, exist_ok=True)
with open(cache_path, "w", encoding="utf-8") as f:
json.dump(data, f, separators=(",", ":"))
atomic_json_write(cache_path, data, indent=None, separators=(",", ":"))
except Exception as e:
logger.debug("Failed to save models.dev disk cache: %s", e)

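`atomic_json_write` is imported from `utils` but its body is not part of this diff. The usual shape such a helper takes, and presumably what it does here, is write to a temp file in the same directory and `os.replace` it into place so readers never see a half-written cache:

```python
import json
import os
import tempfile
from pathlib import Path

def atomic_json_write(path: Path, data, indent=None, separators=(",", ":")) -> None:
    """Sketch of an atomic JSON write: temp file + os.replace."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=indent, separators=separators)
        os.replace(tmp, path)  # atomic on both POSIX and Windows
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise
```

This is why the docstring change from "Save ... to disk cache" to "atomically" is more than wording: a crash mid-`json.dump` with the old open/write code could leave a truncated cache that every later load would fail to parse.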

@@ -4,14 +4,28 @@ All functions are stateless. AIAgent._build_system_prompt() calls these to
assemble pieces, then combines them with memory and ephemeral prompts.
"""
import json
import logging
import os
import re
import threading
from collections import OrderedDict
from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Optional
from agent.skill_utils import (
extract_skill_conditions,
extract_skill_description,
get_all_skills_dirs,
get_disabled_skill_names,
iter_skill_index_files,
parse_frontmatter,
skill_matches_platform,
)
from utils import atomic_json_write
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
@@ -156,6 +170,25 @@ SKILLS_GUIDANCE = (
"Skills that aren't maintained become liabilities."
)
TOOL_USE_ENFORCEMENT_GUIDANCE = (
"# Tool-use enforcement\n"
"You MUST use your tools to take action — do not describe what you would do "
"or plan to do without actually doing it. When you say you will perform an "
"action (e.g. 'I will run the tests', 'Let me check the file', 'I will create "
"the project'), you MUST immediately make the corresponding tool call in the same "
"response. Never end your turn with a promise of future action — execute it now.\n"
"Keep working until the task is actually complete. Do not stop with a summary of "
"what you plan to do next time. If you have tools available that can accomplish "
"the task, use them instead of telling the user what you would do.\n"
"Every response should either (a) contain tool calls that make progress, or "
"(b) deliver a final result to the user. Responses that only describe intentions "
"without acting are not acceptable."
)
# Model name substrings that trigger tool-use enforcement guidance.
# Add new patterns here when a model family needs explicit steering.
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex")
PLATFORM_HINTS = {
"whatsapp": (
"You are on a text messaging communication platform, WhatsApp. "
@@ -230,6 +263,111 @@ CONTEXT_TRUNCATE_HEAD_RATIO = 0.7
CONTEXT_TRUNCATE_TAIL_RATIO = 0.2
# =========================================================================
# Skills prompt cache
# =========================================================================
_SKILLS_PROMPT_CACHE_MAX = 8
_SKILLS_PROMPT_CACHE: OrderedDict[tuple, str] = OrderedDict()
_SKILLS_PROMPT_CACHE_LOCK = threading.Lock()
_SKILLS_SNAPSHOT_VERSION = 1
def _skills_prompt_snapshot_path() -> Path:
return get_hermes_home() / ".skills_prompt_snapshot.json"
def clear_skills_system_prompt_cache(*, clear_snapshot: bool = False) -> None:
"""Drop the in-process skills prompt cache (and optionally the disk snapshot)."""
with _SKILLS_PROMPT_CACHE_LOCK:
_SKILLS_PROMPT_CACHE.clear()
if clear_snapshot:
try:
_skills_prompt_snapshot_path().unlink(missing_ok=True)
except OSError as e:
logger.debug("Could not remove skills prompt snapshot: %s", e)
def _build_skills_manifest(skills_dir: Path) -> dict[str, list[int]]:
"""Build an mtime/size manifest of all SKILL.md and DESCRIPTION.md files."""
manifest: dict[str, list[int]] = {}
for filename in ("SKILL.md", "DESCRIPTION.md"):
for path in iter_skill_index_files(skills_dir, filename):
try:
st = path.stat()
except OSError:
continue
manifest[str(path.relative_to(skills_dir))] = [st.st_mtime_ns, st.st_size]
return manifest
def _load_skills_snapshot(skills_dir: Path) -> Optional[dict]:
"""Load the disk snapshot if it exists and its manifest still matches."""
snapshot_path = _skills_prompt_snapshot_path()
if not snapshot_path.exists():
return None
try:
snapshot = json.loads(snapshot_path.read_text(encoding="utf-8"))
except Exception:
return None
if not isinstance(snapshot, dict):
return None
if snapshot.get("version") != _SKILLS_SNAPSHOT_VERSION:
return None
if snapshot.get("manifest") != _build_skills_manifest(skills_dir):
return None
return snapshot
def _write_skills_snapshot(
skills_dir: Path,
manifest: dict[str, list[int]],
skill_entries: list[dict],
category_descriptions: dict[str, str],
) -> None:
"""Persist skill metadata to disk for fast cold-start reuse."""
payload = {
"version": _SKILLS_SNAPSHOT_VERSION,
"manifest": manifest,
"skills": skill_entries,
"category_descriptions": category_descriptions,
}
try:
atomic_json_write(_skills_prompt_snapshot_path(), payload)
except Exception as e:
logger.debug("Could not write skills prompt snapshot: %s", e)
def _build_snapshot_entry(
skill_file: Path,
skills_dir: Path,
frontmatter: dict,
description: str,
) -> dict:
"""Build a serialisable metadata dict for one skill."""
rel_path = skill_file.relative_to(skills_dir)
parts = rel_path.parts
if len(parts) >= 2:
skill_name = parts[-2]
category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
else:
category = "general"
skill_name = skill_file.parent.name
platforms = frontmatter.get("platforms") or []
if isinstance(platforms, str):
platforms = [platforms]
return {
"skill_name": skill_name,
"category": category,
"frontmatter_name": str(frontmatter.get("name", skill_name)),
"description": description,
"platforms": [str(p).strip() for p in platforms if str(p).strip()],
"conditions": extract_skill_conditions(frontmatter),
}
# =========================================================================
# Skills index
# =========================================================================
@@ -241,22 +379,13 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
(True, {}, "") to err on the side of showing the skill.
"""
try:
from tools.skills_tool import _parse_frontmatter, skill_matches_platform
raw = skill_file.read_text(encoding="utf-8")[:2000]
frontmatter, _ = _parse_frontmatter(raw)
frontmatter, _ = parse_frontmatter(raw)
if not skill_matches_platform(frontmatter):
return False, {}, ""
return False, frontmatter, ""
desc = ""
raw_desc = frontmatter.get("description", "")
if raw_desc:
desc = str(raw_desc).strip().strip("'\"")
if len(desc) > 60:
desc = desc[:57] + "..."
return True, frontmatter, desc
return True, frontmatter, extract_skill_description(frontmatter)
except Exception as e:
logger.debug("Failed to parse skill file %s: %s", skill_file, e)
return True, {}, ""
@@ -265,16 +394,9 @@ def _parse_skill_file(skill_file: Path) -> tuple[bool, dict, str]:
def _read_skill_conditions(skill_file: Path) -> dict:
"""Extract conditional activation fields from SKILL.md frontmatter."""
try:
from tools.skills_tool import _parse_frontmatter
raw = skill_file.read_text(encoding="utf-8")[:2000]
frontmatter, _ = _parse_frontmatter(raw)
hermes = frontmatter.get("metadata", {}).get("hermes", {})
return {
"fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
"requires_toolsets": hermes.get("requires_toolsets", []),
"fallback_for_tools": hermes.get("fallback_for_tools", []),
"requires_tools": hermes.get("requires_tools", []),
}
frontmatter, _ = parse_frontmatter(raw)
return extract_skill_conditions(frontmatter)
except Exception as e:
logger.debug("Failed to read skill conditions from %s: %s", skill_file, e)
return {}
@@ -317,109 +439,210 @@ def build_skills_system_prompt(
) -> str:
"""Build a compact skill index for the system prompt.
Scans ~/.hermes/skills/ for SKILL.md files grouped by category.
Includes per-skill descriptions from frontmatter so the model can
match skills by meaning, not just name.
Filters out skills incompatible with the current OS platform.
Two-layer cache:
1. In-process LRU dict keyed by (skills_dir, tools, toolsets)
2. Disk snapshot (``.skills_prompt_snapshot.json``) validated by
mtime/size manifest; survives process restarts
Falls back to a full filesystem scan when both layers miss.
External skill directories (``skills.external_dirs`` in config.yaml) are
scanned alongside the local ``~/.hermes/skills/`` directory. External dirs
are read-only: they appear in the index but new skills are always created
in the local dir. Local skills take precedence when names collide.
"""
hermes_home = get_hermes_home()
skills_dir = hermes_home / "skills"
external_dirs = get_all_skills_dirs()[1:] # skip local (index 0)
if not skills_dir.exists():
if not skills_dir.exists() and not external_dirs:
return ""
# Collect skills with descriptions, grouped by category.
# Each entry: (skill_name, description)
# Supports sub-categories: skills/mlops/training/axolotl/SKILL.md
# -> category "mlops/training", skill "axolotl"
# Load disabled skill names once for the entire scan
try:
from tools.skills_tool import _get_disabled_skill_names
disabled = _get_disabled_skill_names()
except Exception:
disabled = set()
# ── Layer 1: in-process LRU cache ─────────────────────────────────
cache_key = (
str(skills_dir.resolve()),
tuple(str(d) for d in external_dirs),
tuple(sorted(str(t) for t in (available_tools or set()))),
tuple(sorted(str(ts) for ts in (available_toolsets or set()))),
)
with _SKILLS_PROMPT_CACHE_LOCK:
cached = _SKILLS_PROMPT_CACHE.get(cache_key)
if cached is not None:
_SKILLS_PROMPT_CACHE.move_to_end(cache_key)
return cached
disabled = get_disabled_skill_names()
# ── Layer 2: disk snapshot ────────────────────────────────────────
snapshot = _load_skills_snapshot(skills_dir)
skills_by_category: dict[str, list[tuple[str, str]]] = {}
for skill_file in skills_dir.rglob("SKILL.md"):
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
if not is_compatible:
continue
rel_path = skill_file.relative_to(skills_dir)
parts = rel_path.parts
if len(parts) >= 2:
skill_name = parts[-2]
category = "/".join(parts[:-2]) if len(parts) > 2 else parts[0]
else:
category = "general"
skill_name = skill_file.parent.name
# Respect user's disabled skills config
fm_name = frontmatter.get("name", skill_name)
if fm_name in disabled or skill_name in disabled:
continue
# Extract conditions inline from already-parsed frontmatter
# (avoids redundant file re-read that _read_skill_conditions would do)
hermes_meta = (frontmatter.get("metadata") or {}).get("hermes") or {}
conditions = {
"fallback_for_toolsets": hermes_meta.get("fallback_for_toolsets", []),
"requires_toolsets": hermes_meta.get("requires_toolsets", []),
"fallback_for_tools": hermes_meta.get("fallback_for_tools", []),
"requires_tools": hermes_meta.get("requires_tools", []),
category_descriptions: dict[str, str] = {}
if snapshot is not None:
# Fast path: use pre-parsed metadata from disk
for entry in snapshot.get("skills", []):
if not isinstance(entry, dict):
continue
skill_name = entry.get("skill_name") or ""
category = entry.get("category") or "general"
frontmatter_name = entry.get("frontmatter_name") or skill_name
platforms = entry.get("platforms") or []
if not skill_matches_platform({"platforms": platforms}):
continue
if frontmatter_name in disabled or skill_name in disabled:
continue
if not _skill_should_show(
entry.get("conditions") or {},
available_tools,
available_toolsets,
):
continue
skills_by_category.setdefault(category, []).append(
(skill_name, entry.get("description", ""))
)
category_descriptions = {
str(k): str(v)
for k, v in (snapshot.get("category_descriptions") or {}).items()
}
if not _skill_should_show(conditions, available_tools, available_toolsets):
continue
skills_by_category.setdefault(category, []).append((skill_name, desc))
else:
# Cold path: full filesystem scan + write snapshot for next time
skill_entries: list[dict] = []
for skill_file in iter_skill_index_files(skills_dir, "SKILL.md"):
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
entry = _build_snapshot_entry(skill_file, skills_dir, frontmatter, desc)
skill_entries.append(entry)
if not is_compatible:
continue
skill_name = entry["skill_name"]
if entry["frontmatter_name"] in disabled or skill_name in disabled:
continue
if not _skill_should_show(
extract_skill_conditions(frontmatter),
available_tools,
available_toolsets,
):
continue
skills_by_category.setdefault(entry["category"], []).append(
(skill_name, entry["description"])
)
if not skills_by_category:
return ""
# Read category-level descriptions from DESCRIPTION.md
# Checks both the exact category path and parent directories
category_descriptions = {}
for category in skills_by_category:
cat_path = Path(category)
desc_file = skills_dir / cat_path / "DESCRIPTION.md"
if desc_file.exists():
# Read category-level DESCRIPTION.md files
for desc_file in iter_skill_index_files(skills_dir, "DESCRIPTION.md"):
try:
content = desc_file.read_text(encoding="utf-8")
match = re.search(r"^---\s*\n.*?description:\s*(.+?)\s*\n.*?^---", content, re.MULTILINE | re.DOTALL)
if match:
category_descriptions[category] = match.group(1).strip()
fm, _ = parse_frontmatter(content)
cat_desc = fm.get("description")
if not cat_desc:
continue
rel = desc_file.relative_to(skills_dir)
cat = "/".join(rel.parts[:-1]) if len(rel.parts) > 1 else "general"
category_descriptions[cat] = str(cat_desc).strip().strip("'\"")
except Exception as e:
logger.debug("Could not read skill description %s: %s", desc_file, e)
index_lines = []
for category in sorted(skills_by_category.keys()):
cat_desc = category_descriptions.get(category, "")
if cat_desc:
index_lines.append(f" {category}: {cat_desc}")
else:
index_lines.append(f" {category}:")
# Deduplicate and sort skills within each category
seen = set()
for name, desc in sorted(skills_by_category[category], key=lambda x: x[0]):
if name in seen:
continue
seen.add(name)
if desc:
index_lines.append(f" - {name}: {desc}")
else:
index_lines.append(f" - {name}")
_write_skills_snapshot(
skills_dir,
_build_skills_manifest(skills_dir),
skill_entries,
category_descriptions,
)
return (
"## Skills (mandatory)\n"
"Before replying, scan the skills below. If one clearly matches your task, "
"load it with skill_view(name) and follow its instructions. "
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"After difficult/iterative tasks, offer to save as a skill. "
"If a skill you loaded was missing steps, had wrong commands, or needed "
"pitfalls you discovered, update it before finishing.\n"
"\n"
"<available_skills>\n"
+ "\n".join(index_lines) + "\n"
"</available_skills>\n"
"\n"
"If none match, proceed normally without loading a skill."
)
# ── External skill directories ─────────────────────────────────────
# Scan external dirs directly (no snapshot caching — they're read-only
# and typically small). Local skills already in skills_by_category take
# precedence: we track seen names and skip duplicates from external dirs.
seen_skill_names: set[str] = set()
for cat_skills in skills_by_category.values():
for name, _desc in cat_skills:
seen_skill_names.add(name)
for ext_dir in external_dirs:
if not ext_dir.exists():
continue
for skill_file in iter_skill_index_files(ext_dir, "SKILL.md"):
try:
is_compatible, frontmatter, desc = _parse_skill_file(skill_file)
if not is_compatible:
continue
entry = _build_snapshot_entry(skill_file, ext_dir, frontmatter, desc)
skill_name = entry["skill_name"]
if skill_name in seen_skill_names:
continue
if entry["frontmatter_name"] in disabled or skill_name in disabled:
continue
if not _skill_should_show(
extract_skill_conditions(frontmatter),
available_tools,
available_toolsets,
):
continue
seen_skill_names.add(skill_name)
skills_by_category.setdefault(entry["category"], []).append(
(skill_name, entry["description"])
)
except Exception as e:
logger.debug("Error reading external skill %s: %s", skill_file, e)
# External category descriptions
for desc_file in iter_skill_index_files(ext_dir, "DESCRIPTION.md"):
try:
content = desc_file.read_text(encoding="utf-8")
fm, _ = parse_frontmatter(content)
cat_desc = fm.get("description")
if not cat_desc:
continue
rel = desc_file.relative_to(ext_dir)
cat = "/".join(rel.parts[:-1]) if len(rel.parts) > 1 else "general"
category_descriptions.setdefault(cat, str(cat_desc).strip().strip("'\""))
except Exception as e:
logger.debug("Could not read external skill description %s: %s", desc_file, e)
if not skills_by_category:
result = ""
else:
index_lines = []
for category in sorted(skills_by_category.keys()):
cat_desc = category_descriptions.get(category, "")
if cat_desc:
index_lines.append(f" {category}: {cat_desc}")
else:
index_lines.append(f" {category}:")
# Deduplicate and sort skills within each category
seen = set()
for name, desc in sorted(skills_by_category[category], key=lambda x: x[0]):
if name in seen:
continue
seen.add(name)
if desc:
index_lines.append(f" - {name}: {desc}")
else:
index_lines.append(f" - {name}")
result = (
"## Skills (mandatory)\n"
"Before replying, scan the skills below. If one clearly matches your task, "
"load it with skill_view(name) and follow its instructions. "
"If a skill has issues, fix it with skill_manage(action='patch').\n"
"After difficult/iterative tasks, offer to save as a skill. "
"If a skill you loaded was missing steps, had wrong commands, or needed "
"pitfalls you discovered, update it before finishing.\n"
"\n"
"<available_skills>\n"
+ "\n".join(index_lines) + "\n"
"</available_skills>\n"
"\n"
"If none match, proceed normally without loading a skill."
)
# ── Store in LRU cache ────────────────────────────────────────────
with _SKILLS_PROMPT_CACHE_LOCK:
_SKILLS_PROMPT_CACHE[cache_key] = result
_SKILLS_PROMPT_CACHE.move_to_end(cache_key)
while len(_SKILLS_PROMPT_CACHE) > _SKILLS_PROMPT_CACHE_MAX:
_SKILLS_PROMPT_CACHE.popitem(last=False)
return result
def build_nous_subscription_prompt(valid_tool_names: "set[str] | None" = None) -> str:

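The disk-snapshot layer above is validated by a cheap `stat()` manifest: if any SKILL.md was added, removed, or edited, the recorded `[mtime_ns, size]` pairs no longer match and the expensive re-parse runs. A standalone sketch of that idea (function names here are illustrative):

```python
import os
from pathlib import Path

def build_manifest(root: Path, filename: str) -> dict[str, list[int]]:
    """Map relative path -> [mtime_ns, size] for every matching file."""
    manifest: dict[str, list[int]] = {}
    for dirpath, _dirs, files in os.walk(root):
        if filename in files:
            p = Path(dirpath) / filename
            st = p.stat()
            manifest[str(p.relative_to(root))] = [st.st_mtime_ns, st.st_size]
    return manifest

def snapshot_valid(snapshot: dict, root: Path, filename: str) -> bool:
    """Any add, delete, or edit changes the manifest and invalidates it."""
    return snapshot.get("manifest") == build_manifest(root, filename)
```

Comparing whole manifests with `==` handles additions and deletions for free, since a new or missing key makes the dicts unequal; no per-file diffing logic is needed.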

@@ -128,7 +128,11 @@ def _build_skill_message(
supporting.append(rel)
if supporting and skill_dir:
skill_view_target = str(skill_dir.relative_to(SKILLS_DIR))
try:
skill_view_target = str(skill_dir.relative_to(SKILLS_DIR))
except ValueError:
# Skill is from an external dir — use the skill name instead
skill_view_target = skill_dir.name
parts.append("")
parts.append("[This skill has supporting files you can load with the skill_view tool:]")
for sf in supporting:
@@ -158,38 +162,49 @@ def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
_skill_commands = {}
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, skill_matches_platform, _get_disabled_skill_names
if not SKILLS_DIR.exists():
return _skill_commands
from agent.skill_utils import get_external_skills_dirs
disabled = _get_disabled_skill_names()
for skill_md in SKILLS_DIR.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
try:
content = skill_md.read_text(encoding='utf-8')
frontmatter, body = _parse_frontmatter(content)
# Skip skills incompatible with the current OS platform
if not skill_matches_platform(frontmatter):
seen_names: set = set()
# Scan local dir first, then external dirs
dirs_to_scan = []
if SKILLS_DIR.exists():
dirs_to_scan.append(SKILLS_DIR)
dirs_to_scan.extend(get_external_skills_dirs())
for scan_dir in dirs_to_scan:
for skill_md in scan_dir.rglob("SKILL.md"):
if any(part in ('.git', '.github', '.hub') for part in skill_md.parts):
continue
name = frontmatter.get('name', skill_md.parent.name)
# Respect user's disabled skills config
if name in disabled:
try:
content = skill_md.read_text(encoding='utf-8')
frontmatter, body = _parse_frontmatter(content)
# Skip skills incompatible with the current OS platform
if not skill_matches_platform(frontmatter):
continue
name = frontmatter.get('name', skill_md.parent.name)
if name in seen_names:
continue
# Respect user's disabled skills config
if name in disabled:
continue
description = frontmatter.get('description', '')
if not description:
for line in body.strip().split('\n'):
line = line.strip()
if line and not line.startswith('#'):
description = line[:80]
break
seen_names.add(name)
cmd_name = name.lower().replace(' ', '-').replace('_', '-')
_skill_commands[f"/{cmd_name}"] = {
"name": name,
"description": description or f"Invoke the {name} skill",
"skill_md_path": str(skill_md),
"skill_dir": str(skill_md.parent),
}
except Exception:
continue
description = frontmatter.get('description', '')
if not description:
for line in body.strip().split('\n'):
line = line.strip()
if line and not line.startswith('#'):
description = line[:80]
break
cmd_name = name.lower().replace(' ', '-').replace('_', '-')
_skill_commands[f"/{cmd_name}"] = {
"name": name,
"description": description or f"Invoke the {name} skill",
"skill_md_path": str(skill_md),
"skill_dir": str(skill_md.parent),
}
except Exception:
continue
except Exception:
pass
return _skill_commands

agent/skill_utils.py (new file, 270 lines)

@@ -0,0 +1,270 @@
"""Lightweight skill metadata utilities shared by prompt_builder and skills_tool.
This module intentionally avoids importing the tool registry, CLI config, or any
heavy dependency chain. It is safe to import at module level without triggering
tool registration or provider resolution.
"""
import logging
import os
import re
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional, Set, Tuple
from hermes_constants import get_hermes_home
logger = logging.getLogger(__name__)
# ── Platform mapping ──────────────────────────────────────────────────────
PLATFORM_MAP = {
"macos": "darwin",
"linux": "linux",
"windows": "win32",
}
EXCLUDED_SKILL_DIRS = frozenset((".git", ".github", ".hub"))
# ── Lazy YAML loader ─────────────────────────────────────────────────────
_yaml_load_fn = None
def yaml_load(content: str):
"""Parse YAML with lazy import and CSafeLoader preference."""
global _yaml_load_fn
if _yaml_load_fn is None:
import yaml
loader = getattr(yaml, "CSafeLoader", None) or yaml.SafeLoader
def _load(value: str):
return yaml.load(value, Loader=loader)
_yaml_load_fn = _load
return _yaml_load_fn(content)
# ── Frontmatter parsing ──────────────────────────────────────────────────
def parse_frontmatter(content: str) -> Tuple[Dict[str, Any], str]:
"""Parse YAML frontmatter from a markdown string.
Uses yaml with CSafeLoader for full YAML support (nested metadata, lists)
with a fallback to simple key:value splitting for robustness.
Returns:
(frontmatter_dict, remaining_body)
"""
frontmatter: Dict[str, Any] = {}
body = content
if not content.startswith("---"):
return frontmatter, body
end_match = re.search(r"\n---\s*\n", content[3:])
if not end_match:
return frontmatter, body
yaml_content = content[3 : end_match.start() + 3]
body = content[end_match.end() + 3 :]
try:
parsed = yaml_load(yaml_content)
if isinstance(parsed, dict):
frontmatter = parsed
except Exception:
# Fallback: simple key:value parsing for malformed YAML
for line in yaml_content.strip().split("\n"):
if ":" not in line:
continue
key, value = line.split(":", 1)
frontmatter[key.strip()] = value.strip()
return frontmatter, body
# ── Platform matching ─────────────────────────────────────────────────────
def skill_matches_platform(frontmatter: Dict[str, Any]) -> bool:
"""Return True when the skill is compatible with the current OS.
Skills declare platform requirements via a top-level ``platforms`` list
in their YAML frontmatter::
platforms: [macos] # macOS only
platforms: [macos, linux] # macOS and Linux
If the field is absent or empty the skill is compatible with **all**
platforms (backward-compatible default).
"""
platforms = frontmatter.get("platforms")
if not platforms:
return True
if not isinstance(platforms, list):
platforms = [platforms]
current = sys.platform
for platform in platforms:
normalized = str(platform).lower().strip()
mapped = PLATFORM_MAP.get(normalized, normalized)
if current.startswith(mapped):
return True
return False
# ── Disabled skills ───────────────────────────────────────────────────────
def get_disabled_skill_names() -> Set[str]:
"""Read disabled skill names from config.yaml.
Resolves platform from ``HERMES_PLATFORM`` env var, falls back to
the global disabled list. Reads the config file directly (no CLI
config imports) to stay lightweight.
"""
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return set()
try:
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
except Exception as e:
logger.debug("Could not read skill config %s: %s", config_path, e)
return set()
if not isinstance(parsed, dict):
return set()
skills_cfg = parsed.get("skills")
if not isinstance(skills_cfg, dict):
return set()
resolved_platform = os.getenv("HERMES_PLATFORM")
if resolved_platform:
platform_disabled = (skills_cfg.get("platform_disabled") or {}).get(
resolved_platform
)
if platform_disabled is not None:
return _normalize_string_set(platform_disabled)
return _normalize_string_set(skills_cfg.get("disabled"))
def _normalize_string_set(values) -> Set[str]:
if values is None:
return set()
if isinstance(values, str):
values = [values]
return {str(v).strip() for v in values if str(v).strip()}
# ── External skills directories ──────────────────────────────────────────
def get_external_skills_dirs() -> List[Path]:
"""Read ``skills.external_dirs`` from config.yaml and return validated paths.
Each entry is expanded (``~`` and ``${VAR}``) and resolved to an absolute
path. Only directories that actually exist are returned. Duplicates and
paths that resolve to the local ``~/.hermes/skills/`` are silently skipped.
"""
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return []
try:
parsed = yaml_load(config_path.read_text(encoding="utf-8"))
except Exception:
return []
if not isinstance(parsed, dict):
return []
skills_cfg = parsed.get("skills")
if not isinstance(skills_cfg, dict):
return []
raw_dirs = skills_cfg.get("external_dirs")
if not raw_dirs:
return []
if isinstance(raw_dirs, str):
raw_dirs = [raw_dirs]
if not isinstance(raw_dirs, list):
return []
local_skills = (get_hermes_home() / "skills").resolve()
seen: Set[Path] = set()
result: List[Path] = []
for entry in raw_dirs:
entry = str(entry).strip()
if not entry:
continue
# Expand ~ and environment variables
expanded = os.path.expanduser(os.path.expandvars(entry))
p = Path(expanded).resolve()
if p == local_skills:
continue
if p in seen:
continue
if p.is_dir():
seen.add(p)
result.append(p)
else:
logger.debug("External skills dir does not exist, skipping: %s", p)
return result
def get_all_skills_dirs() -> List[Path]:
"""Return all skill directories: local ``~/.hermes/skills/`` first, then external.
The local dir is always first (and always included even if it doesn't exist
yet; callers handle that). External dirs follow in config order.
"""
dirs = [get_hermes_home() / "skills"]
dirs.extend(get_external_skills_dirs())
return dirs
# ── Condition extraction ──────────────────────────────────────────────────
def extract_skill_conditions(frontmatter: Dict[str, Any]) -> Dict[str, List]:
"""Extract conditional activation fields from parsed frontmatter."""
hermes = (frontmatter.get("metadata") or {}).get("hermes") or {}
return {
"fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
"requires_toolsets": hermes.get("requires_toolsets", []),
"fallback_for_tools": hermes.get("fallback_for_tools", []),
"requires_tools": hermes.get("requires_tools", []),
}
# ── Description extraction ────────────────────────────────────────────────
def extract_skill_description(frontmatter: Dict[str, Any]) -> str:
"""Extract a truncated description from parsed frontmatter."""
raw_desc = frontmatter.get("description", "")
if not raw_desc:
return ""
desc = str(raw_desc).strip().strip("'\"")
if len(desc) > 60:
return desc[:57] + "..."
return desc
# ── File iteration ────────────────────────────────────────────────────────
def iter_skill_index_files(skills_dir: Path, filename: str):
"""Walk skills_dir yielding sorted paths matching *filename*.
Excludes ``.git``, ``.github``, ``.hub`` directories.
"""
matches = []
for root, dirs, files in os.walk(skills_dir):
dirs[:] = [d for d in dirs if d not in EXCLUDED_SKILL_DIRS]
if filename in files:
matches.append(Path(root) / filename)
for path in sorted(matches, key=lambda p: str(p.relative_to(skills_dir))):
yield path


@@ -19,7 +19,7 @@ _TITLE_PROMPT = (
)
def generate_title(user_message: str, assistant_response: str, timeout: float = 15.0) -> Optional[str]:
def generate_title(user_message: str, assistant_response: str, timeout: float = 30.0) -> Optional[str]:
"""Generate a session title from the first exchange.
Uses the auxiliary LLM client (cheapest/fastest available model).


@@ -7,17 +7,33 @@
# =============================================================================
model:
# Default model to use (can be overridden with --model flag)
# Both "default" and "model" work as the key name here.
default: "anthropic/claude-opus-4.6"
# Inference provider selection:
# "auto" - Use Nous Portal if logged in, otherwise OpenRouter/env vars (default)
# "nous-api" - Use Nous Portal via API key (requires: NOUS_API_KEY)
# "openrouter" - Always use OpenRouter API key from OPENROUTER_API_KEY
# "nous" - Always use Nous Portal (requires: hermes login)
# "zai" - Use z.ai / ZhipuAI GLM models (requires: GLM_API_KEY)
# "kimi-coding"- Use Kimi / Moonshot AI models (requires: KIMI_API_KEY)
# "minimax" - Use MiniMax global endpoint (requires: MINIMAX_API_KEY)
# "minimax-cn" - Use MiniMax China endpoint (requires: MINIMAX_CN_API_KEY)
# "auto" - Auto-detect from credentials (default)
# "openrouter" - OpenRouter (requires: OPENROUTER_API_KEY or OPENAI_API_KEY)
# "nous" - Nous Portal OAuth (requires: hermes login)
# "nous-api" - Nous Portal API key (requires: NOUS_API_KEY)
# "anthropic" - Direct Anthropic API (requires: ANTHROPIC_API_KEY)
# "openai-codex" - OpenAI Codex (requires: hermes login --provider openai-codex)
# "copilot" - GitHub Copilot / GitHub Models (requires: GITHUB_TOKEN)
# "zai" - z.ai / ZhipuAI GLM (requires: GLM_API_KEY)
# "kimi-coding" - Kimi / Moonshot AI (requires: KIMI_API_KEY)
# "minimax" - MiniMax global (requires: MINIMAX_API_KEY)
# "minimax-cn" - MiniMax China (requires: MINIMAX_CN_API_KEY)
# "huggingface" - Hugging Face Inference (requires: HF_TOKEN)
# "kilocode" - KiloCode gateway (requires: KILOCODE_API_KEY)
# "ai-gateway" - Vercel AI Gateway (requires: AI_GATEWAY_API_KEY)
#
# Local servers (LM Studio, Ollama, vLLM, llama.cpp):
# "custom" - Any OpenAI-compatible endpoint. Set base_url below.
# Aliases: "lmstudio", "ollama", "vllm", "llamacpp" all map to "custom".
# Example for LM Studio:
# provider: "lmstudio"
# base_url: "http://localhost:1234/v1"
# No API key needed — local servers typically ignore auth.
#
# Can also be overridden with --provider flag or HERMES_INFERENCE_PROVIDER env var.
provider: "auto"
@@ -401,6 +417,15 @@ skills:
# Set to 0 to disable.
creation_nudge_interval: 15
# External skill directories — share skills across tools/agents without
# copying them into ~/.hermes/skills/. Each path is expanded (~ and ${VAR})
# and resolved to an absolute path. External dirs are read-only: skill
# creation always writes to ~/.hermes/skills/. Local skills take precedence
# when names collide.
# external_dirs:
# - ~/.agents/skills
# - /home/shared/team-skills
# =============================================================================
# Agent Behavior
# =============================================================================
@@ -688,6 +713,12 @@ display:
# Toggle at runtime with /verbose in the CLI
tool_progress: all
# What Enter does when Hermes is already busy in the CLI.
# interrupt: Interrupt the current run and redirect Hermes (default)
# queue: Queue your message for the next turn
# Ctrl+C always interrupts regardless of this setting.
busy_input_mode: interrupt
# Background process notifications (gateway/messaging only).
# Controls how chatty the process watcher is when you use
# terminal(background=true, check_interval=...) from Telegram/Discord/etc.

cli.py

@@ -70,7 +70,7 @@ _COMMAND_SPINNER_FRAMES = ("⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧
# Load .env from ~/.hermes/.env first, then project root as dev fallback.
# User-managed env files should override stale shell exports on restart.
from hermes_constants import get_hermes_home, OPENROUTER_BASE_URL
from hermes_constants import get_hermes_home, display_hermes_home, OPENROUTER_BASE_URL
from hermes_cli.env_loader import load_hermes_dotenv
_hermes_home = get_hermes_home()
@@ -205,6 +205,7 @@ def load_cli_config() -> Dict[str, Any]:
"resume_display": "full",
"show_reasoning": False,
"streaming": True,
"busy_input_mode": "interrupt",
"skin": "default",
},
@@ -448,6 +449,25 @@ try:
except Exception:
pass # Skin engine is optional — default skin used if unavailable
# Initialize tool preview length from config
try:
from agent.display import set_tool_preview_max_len
_tpl = CLI_CONFIG.get("display", {}).get("tool_preview_length", 0)
set_tool_preview_max_len(int(_tpl) if _tpl else 0)
except Exception:
pass
# Neuter AsyncHttpxClientWrapper.__del__ before any AsyncOpenAI clients are
# created. The SDK's __del__ schedules aclose() on asyncio.get_running_loop()
# which, during CLI idle time, finds prompt_toolkit's event loop and tries to
# close TCP transports bound to dead worker loops — producing
# "Event loop is closed" / "Press ENTER to continue..." errors.
try:
from agent.auxiliary_client import neuter_async_httpx_del
neuter_async_httpx_del()
except Exception:
pass
from rich import box as rich_box
from rich.console import Console
from rich.markup import escape as _escape
@@ -1035,13 +1055,18 @@ class HermesCLI:
self.config = CLI_CONFIG
self.compact = compact if compact is not None else CLI_CONFIG["display"].get("compact", False)
# tool_progress: "off", "new", "all", "verbose" (from config.yaml display section)
self.tool_progress_mode = CLI_CONFIG["display"].get("tool_progress", "all")
# YAML 1.1 parses bare `off` as boolean False — normalise to string.
_raw_tp = CLI_CONFIG["display"].get("tool_progress", "all")
self.tool_progress_mode = "off" if _raw_tp is False else str(_raw_tp)
# resume_display: "full" (show history) | "minimal" (one-liner only)
self.resume_display = CLI_CONFIG["display"].get("resume_display", "full")
# bell_on_complete: play terminal bell (\a) when agent finishes a response
self.bell_on_complete = CLI_CONFIG["display"].get("bell_on_complete", False)
# show_reasoning: display model thinking/reasoning before the response
self.show_reasoning = CLI_CONFIG["display"].get("show_reasoning", False)
# busy_input_mode: "interrupt" (Enter interrupts current run) or "queue" (Enter queues for next turn)
_bim = CLI_CONFIG["display"].get("busy_input_mode", "interrupt")
self.busy_input_mode = "queue" if str(_bim).strip().lower() == "queue" else "interrupt"
self.verbose = verbose if verbose is not None else (self.tool_progress_mode == "verbose")
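The `tool_progress` normalisation above exists because PyYAML implements YAML 1.1 boolean resolution, where a bare `off` (or `on`/`yes`/`no`) in config.yaml arrives as a Python bool rather than a string. A quick demonstration (requires PyYAML, which the project already depends on):

```python
import yaml

# YAML 1.1: a bare `off` parses as boolean False, not the string "off".
parsed = yaml.safe_load("tool_progress: off")
assert parsed == {"tool_progress": False}

# The normalisation used in the CLI: map False back to "off",
# everything else to its string form.
raw = parsed["tool_progress"]
mode = "off" if raw is False else str(raw)
assert mode == "off"

# Quoting in the config file sidesteps the quirk entirely.
assert yaml.safe_load("tool_progress: 'off'") == {"tool_progress": "off"}
```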
@@ -1061,12 +1086,12 @@ class HermesCLI:
# authoritative. This avoids conflicts in multi-agent setups where
# env vars would stomp each other.
_model_config = CLI_CONFIG.get("model", {})
_config_model = _model_config.get("default", "") if isinstance(_model_config, dict) else (_model_config or "")
_FALLBACK_MODEL = "anthropic/claude-opus-4.6"
self.model = model or _config_model or _FALLBACK_MODEL
# Auto-detect model from local server if still on fallback
if self.model == _FALLBACK_MODEL:
_base_url = _model_config.get("base_url", "") if isinstance(_model_config, dict) else ""
_config_model = (_model_config.get("default") or _model_config.get("model") or "") if isinstance(_model_config, dict) else (_model_config or "")
_DEFAULT_CONFIG_MODEL = "anthropic/claude-opus-4.6"
self.model = model or _config_model or _DEFAULT_CONFIG_MODEL
# Auto-detect model from local server if still on default
if self.model == _DEFAULT_CONFIG_MODEL:
_base_url = (_model_config.get("base_url") or "") if isinstance(_model_config, dict) else ""
if "localhost" in _base_url or "127.0.0.1" in _base_url:
from hermes_cli.runtime_provider import _auto_detect_local_model
_detected = _auto_detect_local_model(_base_url)
@@ -1079,7 +1104,7 @@ class HermesCLI:
# explicit choice — the user just never changed it. But a config model
# like "gpt-5.3-codex" IS explicit and must be preserved.
self._model_is_default = not model and (
not _config_model or _config_model == _FALLBACK_MODEL
not _config_model or _config_model == _DEFAULT_CONFIG_MODEL
)
self._explicit_api_key = api_key
@@ -1165,9 +1190,13 @@ class HermesCLI:
self._provider_require_params = pr.get("require_parameters", False)
self._provider_data_collection = pr.get("data_collection")
# Fallback model config — tried when primary provider fails after retries
fb = CLI_CONFIG.get("fallback_model") or {}
self._fallback_model = fb if fb.get("provider") and fb.get("model") else None
# Fallback provider chain — tried in order when primary fails after retries.
# Supports new list format (fallback_providers) and legacy single-dict (fallback_model).
fb = CLI_CONFIG.get("fallback_providers") or CLI_CONFIG.get("fallback_model") or []
# Normalize legacy single-dict to a one-element list
if isinstance(fb, dict):
fb = [fb] if fb.get("provider") and fb.get("model") else []
self._fallback_model = fb
# Optional cheap-vs-strong routing for simple turns
self._smart_model_routing = CLI_CONFIG.get("smart_model_routing", {}) or {}
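The fallback normalisation above can be expressed as a small pure function, which makes both accepted config shapes easy to see (the function name `normalize_fallbacks` is an illustration, not the code's actual helper):

```python
from typing import Any, Dict, List

def normalize_fallbacks(cfg: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a fallback provider chain from either config shape.

    New format:    fallback_providers: [{provider, model}, ...]
    Legacy format: fallback_model: {provider, model}
    """
    fb = cfg.get("fallback_providers") or cfg.get("fallback_model") or []
    if isinstance(fb, dict):
        # Legacy single-dict form: keep only if both keys are present.
        fb = [fb] if fb.get("provider") and fb.get("model") else []
    return fb

# No config, incomplete legacy config, and the new list form:
assert normalize_fallbacks({}) == []
assert normalize_fallbacks({"fallback_model": {"provider": "openrouter"}}) == []
chain = normalize_fallbacks(
    {"fallback_providers": [{"provider": "zai", "model": "glm-5"}]}
)
assert chain[0]["model"] == "glm-5"
```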
@@ -1329,7 +1358,12 @@ class HermesCLI:
def _build_status_bar_text(self, width: Optional[int] = None) -> str:
try:
snapshot = self._get_status_bar_snapshot()
width = width or shutil.get_terminal_size((80, 24)).columns
if width is None:
try:
from prompt_toolkit.application import get_app
width = get_app().output.get_size().columns
except Exception:
width = shutil.get_terminal_size((80, 24)).columns
percent = snapshot["context_percent"]
percent_label = f"{percent}%" if percent is not None else "--"
duration_label = snapshot["duration"]
@@ -1359,7 +1393,16 @@ class HermesCLI:
return []
try:
snapshot = self._get_status_bar_snapshot()
width = shutil.get_terminal_size((80, 24)).columns
# Use prompt_toolkit's own terminal width when running inside the
# TUI — shutil.get_terminal_size() can return stale or fallback
# values (especially on SSH) that differ from what prompt_toolkit
# actually renders, causing the fragments to overflow to a second
# line and produce duplicated status bar rows over long sessions.
try:
from prompt_toolkit.application import get_app
width = get_app().output.get_size().columns
except Exception:
width = shutil.get_terminal_size((80, 24)).columns
duration_label = snapshot["duration"]
if width < 52:
@@ -1594,6 +1637,7 @@ class HermesCLI:
if not text:
return
self._reasoning_stream_started = True
self._reasoning_shown_this_turn = True
if getattr(self, "_stream_box_opened", False):
return
@@ -2929,6 +2973,82 @@ class HermesCLI:
if not silent:
print("(^_^)v New session started!")
def _handle_resume_command(self, cmd_original: str) -> None:
"""Handle /resume <session_id_or_title> — switch to a previous session mid-conversation."""
parts = cmd_original.split(None, 1)
target = parts[1].strip() if len(parts) > 1 else ""
if not target:
_cprint(" Usage: /resume <session_id_or_title>")
_cprint(" Tip: Use /history or `hermes sessions list` to find sessions.")
return
if not self._session_db:
_cprint(" Session database not available.")
return
# Resolve title or ID
from hermes_cli.main import _resolve_session_by_name_or_id
resolved = _resolve_session_by_name_or_id(target)
target_id = resolved or target
session_meta = self._session_db.get_session(target_id)
if not session_meta:
_cprint(f" Session not found: {target}")
_cprint(" Use /history or `hermes sessions list` to see available sessions.")
return
if target_id == self.session_id:
_cprint(" Already on that session.")
return
# End current session
try:
self._session_db.end_session(self.session_id, "resumed_other")
except Exception:
pass
# Switch to the target session
self.session_id = target_id
self._resumed = True
self._pending_title = None
# Load conversation history
restored = self._session_db.get_messages_as_conversation(target_id)
self.conversation_history = restored or []
# Re-open the target session so it's not marked as ended
try:
self._session_db.reopen_session(target_id)
except Exception:
pass
# Sync the agent if already initialised
if self.agent:
self.agent.session_id = target_id
self.agent.reset_session_state()
if hasattr(self.agent, "_last_flushed_db_idx"):
self.agent._last_flushed_db_idx = len(self.conversation_history)
if hasattr(self.agent, "_todo_store"):
try:
from tools.todo_tool import TodoStore
self.agent._todo_store = TodoStore()
except Exception:
pass
if hasattr(self.agent, "_invalidate_system_prompt"):
self.agent._invalidate_system_prompt()
title_part = f" \"{session_meta['title']}\"" if session_meta.get("title") else ""
msg_count = len([m for m in self.conversation_history if m.get("role") == "user"])
if self.conversation_history:
_cprint(
f" ↻ Resumed session {target_id}{title_part}"
f" ({msg_count} user message{'s' if msg_count != 1 else ''},"
f" {len(self.conversation_history)} total)"
)
else:
_cprint(f" ↻ Resumed session {target_id}{title_part} — no messages, starting fresh.")
def reset_conversation(self):
"""Reset the conversation by starting a new session."""
self.new_session()
@@ -3486,7 +3606,7 @@ class HermesCLI:
print(" To start the gateway:")
print(" python cli.py --gateway")
print()
print(" Configuration file: ~/.hermes/config.yaml")
print(f" Configuration file: {display_hermes_home()}/config.yaml")
print()
except Exception as e:
@@ -3496,7 +3616,7 @@ class HermesCLI:
print(" 1. Set environment variables:")
print(" TELEGRAM_BOT_TOKEN=your_token")
print(" DISCORD_BOT_TOKEN=your_token")
print(" 2. Or configure settings in ~/.hermes/config.yaml")
print(f" 2. Or configure settings in {display_hermes_home()}/config.yaml")
print()
def process_command(self, command: str) -> bool:
@@ -3647,6 +3767,8 @@ class HermesCLI:
_cprint(" Session database not available.")
elif canonical == "new":
self.new_session()
elif canonical == "resume":
self._handle_resume_command(cmd_original)
elif canonical == "provider":
self._show_model_and_providers()
elif canonical == "prompt":
@@ -3701,7 +3823,7 @@ class HermesCLI:
plugins = mgr.list_plugins()
if not plugins:
print("No plugins installed.")
print("Drop plugin directories into ~/.hermes/plugins/ to get started.")
print(f"Drop plugin directories into {display_hermes_home()}/plugins/ to get started.")
else:
print(f"Plugins ({len(plugins)}):")
for p in plugins:
@@ -3722,17 +3844,17 @@ class HermesCLI:
elif canonical == "background":
self._handle_background_command(cmd_original)
elif canonical == "queue":
if not self._agent_running:
_cprint(" /queue only works while Hermes is busy. Just type your message normally.")
# Extract prompt after "/queue " or "/q "
parts = cmd_original.split(None, 1)
payload = parts[1].strip() if len(parts) > 1 else ""
if not payload:
_cprint(" Usage: /queue <prompt>")
else:
# Extract prompt after "/queue " or "/q "
parts = cmd_original.split(None, 1)
payload = parts[1].strip() if len(parts) > 1 else ""
if not payload:
_cprint(" Usage: /queue <prompt>")
else:
self._pending_input.put(payload)
self._pending_input.put(payload)
if self._agent_running:
_cprint(f" Queued for the next turn: {payload[:80]}{'...' if len(payload) > 80 else ''}")
else:
_cprint(f" Queued: {payload[:80]}{'...' if len(payload) > 80 else ''}")
elif canonical == "skin":
self._handle_skin_command(cmd_original)
elif canonical == "voice":
@@ -3924,6 +4046,17 @@ class HermesCLI:
provider_data_collection=self._provider_data_collection,
fallback_model=self._fallback_model,
)
# Silence raw spinner; route thinking through TUI widget when no foreground agent is active.
bg_agent._print_fn = lambda *_a, **_kw: None
def _bg_thinking(text: str) -> None:
# Concurrent bg tasks may race on _spinner_text; acceptable for best-effort UI.
if not self._agent_running:
self._spinner_text = text
if self._app:
self._app.invalidate()
bg_agent.thinking_callback = _bg_thinking
result = bg_agent.run_conversation(
user_message=prompt,
@@ -3986,6 +4119,9 @@ class HermesCLI:
_cprint(f" ❌ Background task #{task_num} failed: {e}")
finally:
self._background_tasks.pop(task_id, None)
# Clear spinner only if no foreground agent owns it
if not self._agent_running:
self._spinner_text = ""
if self._app:
self._invalidate(min_interval=0)
@@ -4216,7 +4352,7 @@ class HermesCLI:
source = f" ({s['source']})" if s["source"] == "user" else ""
print(f" {marker} {s['name']}{source}{s['description']}")
print("\n Usage: /skin <name>")
print(" Custom skins: drop a YAML file in ~/.hermes/skins/\n")
print(f" Custom skins: drop a YAML file in {display_hermes_home()}/skins/\n")
return
new_skin = parts[1].strip().lower()
@@ -4396,7 +4532,7 @@ class HermesCLI:
compressor = agent.context_compressor
last_prompt = compressor.last_prompt_tokens
ctx_len = compressor.context_length
pct = (last_prompt / ctx_len * 100) if ctx_len else 0
pct = min(100, (last_prompt / ctx_len * 100)) if ctx_len else 0
compressions = compressor.compression_count
msg_count = len(self.conversation_history)
@@ -4654,8 +4790,10 @@ class HermesCLI:
from agent.display import get_tool_emoji
emoji = get_tool_emoji(function_name)
label = preview or function_name
if len(label) > 50:
label = label[:47] + "..."
from agent.display import get_tool_preview_max_len
_pl = get_tool_preview_max_len()
if _pl > 0 and len(label) > _pl:
label = label[:_pl - 3] + "..."
self._spinner_text = f"{emoji} {label}"
self._invalidate()
@@ -5424,6 +5562,13 @@ class HermesCLI:
except Exception as e:
logging.debug("@ context reference expansion failed: %s", e)
# Sanitize surrogate characters that can arrive via clipboard paste from
# rich-text editors (Google Docs, Word, etc.). Lone surrogates are invalid
# UTF-8 and crash JSON serialization in the OpenAI SDK.
if isinstance(message, str):
from run_agent import _sanitize_surrogates
message = _sanitize_surrogates(message)
# Add user message to history
self.conversation_history.append({"role": "user", "content": message})
@@ -5436,6 +5581,10 @@ class HermesCLI:
# Reset streaming display state for this turn
self._reset_stream_state()
# Separate from _reset_stream_state because this must persist
# across intermediate turn boundaries (tool-calling loops) — only
# reset at the start of each user turn.
self._reasoning_shown_this_turn = False
# --- Streaming TTS setup ---
# When ElevenLabs is the TTS provider and sounddevice is available,
@@ -5580,6 +5729,16 @@ class HermesCLI:
agent_thread.join() # Ensure agent thread completes
# Proactively clean up async clients whose event loop is dead.
# The agent thread may have created AsyncOpenAI clients bound
# to a per-thread event loop; if that loop is now closed, those
# clients' __del__ would crash prompt_toolkit's loop on GC.
try:
from agent.auxiliary_client import cleanup_stale_async_clients
cleanup_stale_async_clients()
except Exception:
pass
# Flush any remaining streamed text and close the box
self._flush_stream()
@@ -5640,8 +5799,13 @@ class HermesCLI:
response_previewed = result.get("response_previewed", False) if result else False
# Display reasoning (thinking) box if enabled and available.
# Skip when streaming already showed reasoning live.
if self.show_reasoning and result and not self._reasoning_stream_started:
# Skip when streaming already showed reasoning live. Use the
# turn-persistent flag (_reasoning_shown_this_turn) instead of
# _reasoning_stream_started — the latter gets reset during
# intermediate turn boundaries (tool-calling loops), which caused
# the reasoning box to re-render after the final response.
_reasoning_already_shown = getattr(self, '_reasoning_shown_this_turn', False)
if self.show_reasoning and result and not _reasoning_already_shown:
reasoning = result.get("last_reasoning")
if reasoning:
w = shutil.get_terminal_size().columns
@@ -5762,10 +5926,22 @@ class HermesCLI:
else:
duration_str = f"{seconds}s"
# Look up session title for resume-by-name hint
session_title = None
if self._session_db:
try:
session_title = self._session_db.get_session_title(self.session_id)
except Exception:
pass
print("Resume this session with:")
print(f" hermes --resume {self.session_id}")
if session_title:
print(f" hermes -c \"{session_title}\"")
print()
print(f"Session: {self.session_id}")
if session_title:
print(f"Title: {session_title}")
print(f"Duration: {duration_str}")
print(f"Messages: {msg_count} ({user_msgs} user, {tool_calls} tool calls)")
else:
@@ -5782,6 +5958,9 @@ class HermesCLI:
``normal_prompt`` is the full ``branding.prompt_symbol``.
``state_suffix`` is what special states (sudo/secret/approval/agent)
should render after their leading icon.
When a profile is active (not "default"), the profile name is
prepended to the prompt symbol: ``coder `` instead of ````.
"""
try:
from hermes_cli.skin_engine import get_active_prompt_symbol
@@ -5790,6 +5969,15 @@ class HermesCLI:
symbol = " "
symbol = (symbol or " ").rstrip() + " "
# Prepend profile name when not default
try:
from hermes_cli.profiles import get_active_profile_name
profile = get_active_profile_name()
if profile not in ("default", "custom"):
symbol = f"{profile} {symbol}"
except Exception:
pass
stripped = symbol.rstrip()
if not stripped:
return " ", " "
@@ -5941,7 +6129,7 @@ class HermesCLI:
from honcho_integration.client import HonchoClientConfig
from agent.display import honcho_session_line, write_tty
hcfg = HonchoClientConfig.from_global_config()
if hcfg.enabled and hcfg.api_key and hcfg.explicitly_configured:
if hcfg.enabled and (hcfg.api_key or hcfg.base_url) and hcfg.explicitly_configured:
sname = hcfg.resolve_session_name(session_id=self.session_id)
if sname:
write_tty(honcho_session_line(hcfg.workspace_id, sname) + "\n")
@@ -6028,10 +6216,18 @@ class HermesCLI:
set_approval_callback(self._approval_callback)
set_secret_capture_callback(self._secret_capture_callback)
# Ensure tirith security scanner is available (downloads if needed)
# Ensure tirith security scanner is available (downloads if needed).
# Warn the user if tirith is enabled in config but not available,
# so they know command security scanning is degraded.
try:
from tools.tirith_security import ensure_installed
ensure_installed(log_failures=False)
tirith_path = ensure_installed(log_failures=False)
if tirith_path is None:
security_cfg = self.config.get("security", {}) or {}
tirith_enabled = security_cfg.get("tirith_enabled", True)
if tirith_enabled:
_cprint(f" {_DIM}⚠ tirith security scanner enabled but not available "
f"— command scanning will use pattern matching only{_RST}")
except Exception:
pass # Non-fatal — fail-open at scan time if unavailable
@@ -6112,16 +6308,22 @@ class HermesCLI:
# Bundle text + images as a tuple when images are present
payload = (text, images) if images else text
if self._agent_running and not (text and text.startswith("/")):
self._interrupt_queue.put(payload)
# Debug: log to file when message enters interrupt queue
try:
_dbg = _hermes_home / "interrupt_debug.log"
with open(_dbg, "a") as _f:
import time as _t
_f.write(f"{_t.strftime('%H:%M:%S')} ENTER: queued interrupt msg={str(payload)[:60]!r}, "
f"agent_running={self._agent_running}\n")
except Exception:
pass
if self.busy_input_mode == "queue":
# Queue for the next turn instead of interrupting
self._pending_input.put(payload)
preview = text if text else f"[{len(images)} image{'s' if len(images) != 1 else ''} attached]"
_cprint(f" Queued for the next turn: {preview[:80]}{'...' if len(preview) > 80 else ''}")
else:
self._interrupt_queue.put(payload)
# Debug: log to file when message enters interrupt queue
try:
_dbg = _hermes_home / "interrupt_debug.log"
with open(_dbg, "a") as _f:
import time as _t
_f.write(f"{_t.strftime('%H:%M:%S')} ENTER: queued interrupt msg={str(payload)[:60]!r}, "
f"agent_running={self._agent_running}\n")
except Exception:
pass
else:
self._pending_input.put(payload)
event.app.current_buffer.reset(append_to_history=True)
@@ -6312,6 +6514,24 @@ class HermesCLI:
self._should_exit = True
event.app.exit()
@kb.add('c-z')
def handle_ctrl_z(event):
"""Handle Ctrl+Z - suspend process to background (Unix only)."""
import sys
if sys.platform == 'win32':
_cprint(f"\n{_DIM}Suspend (Ctrl+Z) is not supported on Windows.{_RST}")
event.app.invalidate()
return
import os, signal as _sig
from prompt_toolkit.application import run_in_terminal
from hermes_cli.skin_engine import get_active_skin
agent_name = get_active_skin().get_branding("agent_name", "Hermes Agent")
msg = f"\n{agent_name} has been suspended. Run `fg` to bring {agent_name} back."
def _suspend():
os.write(1, msg.encode())
os.kill(0, _sig.SIGTSTP)
run_in_terminal(_suspend)
# Voice push-to-talk key: configurable via config.yaml (voice.record_key)
# Default: Ctrl+B (avoids conflict with Ctrl+R readline reverse-search)
# Config uses "ctrl+b" format; prompt_toolkit expects "c-b" format.
@@ -6501,6 +6721,7 @@ class HermesCLI:
# Paste collapsing: detect large pastes and save to temp file
_paste_counter = [0]
_prev_text_len = [0]
_prev_newline_count = [0]
_paste_just_collapsed = [False]
def _on_text_changed(buf):
@@ -6509,18 +6730,27 @@ class HermesCLI:
When bracketed paste is available, handle_paste collapses
large pastes directly. This handler is a fallback for
terminals without bracketed paste support.
Two heuristics (either triggers collapse):
1. Many characters added at once (chars_added > 1) works
when the terminal delivers the paste in one event-loop tick.
2. Newline count jumped by 4+ in a single text-change event
catches terminals that feed characters individually but
still batch newlines. Alt+Enter only adds 1 newline per
event so it never triggers this.
"""
text = buf.text
chars_added = len(text) - _prev_text_len[0]
_prev_text_len[0] = len(text)
if _paste_just_collapsed[0]:
_paste_just_collapsed[0] = False
_prev_newline_count[0] = text.count('\n')
return
line_count = text.count('\n')
# Heuristic: a real paste adds many characters at once (not just a
# single newline from Alt+Enter) AND the result has 5+ lines.
# Fallback for terminals without bracketed paste support.
if line_count >= 5 and chars_added > 1 and not text.startswith('/'):
newlines_added = line_count - _prev_newline_count[0]
_prev_newline_count[0] = line_count
is_paste = chars_added > 1 or newlines_added >= 4
if line_count >= 5 and is_paste and not text.startswith('/'):
_paste_counter[0] += 1
# Save to temp file
paste_dir = _hermes_home / "pastes"
@@ -6528,6 +6758,7 @@ class HermesCLI:
paste_file = paste_dir / f"paste_{_paste_counter[0]}_{datetime.now().strftime('%H%M%S')}.txt"
paste_file.write_text(text, encoding="utf-8")
# Replace buffer with compact reference
_paste_just_collapsed[0] = True
buf.text = f"[Pasted text #{_paste_counter[0]}: {line_count + 1} lines \u2192 {paste_file}]"
buf.cursor_position = len(buf.text)
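The two paste heuristics in the handler above can be isolated as a pure predicate, which makes the Alt+Enter exemption easy to verify (a sketch mirroring the thresholds in the diff; the function name is illustrative):

```python
def looks_like_paste(chars_added: int, newlines_added: int,
                     line_count: int, text: str) -> bool:
    """Fallback paste detection for terminals without bracketed paste.

    Heuristic 1: many characters arrived in one text-change event.
    Heuristic 2: the newline count jumped by 4+ at once (some terminals
    feed characters individually but batch newlines). Alt+Enter adds a
    single character and a single newline per event, so it never trips
    either check. Slash commands are never collapsed.
    """
    if line_count < 5 or text.startswith("/"):
        return False
    return chars_added > 1 or newlines_added >= 4

# A 6-line paste delivered in a single event-loop tick:
assert looks_like_paste(200, 5, 5, "line1\nline2\n...")
# Alt+Enter while composing a long message: one char, one newline.
assert not looks_like_paste(1, 1, 5, "line1\nline2\n...")
# A pasted slash command stays in the buffer untouched.
assert not looks_like_paste(200, 5, 5, "/help\nline2\n...")
```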
@@ -6894,6 +7125,15 @@ class HermesCLI:
Window(
content=FormattedTextControl(lambda: cli_ref._get_status_bar_fragments()),
height=1,
# Prevent fragments that overflow the terminal width from
# wrapping onto a second line, which causes the status bar to
# appear duplicated (one full + one partial row) during long
# sessions, especially on SSH where shutil.get_terminal_size
# may return stale values. _get_status_bar_fragments now reads
# width from prompt_toolkit's own output object, so fragments
# will always fit; wrap_lines=False is the belt-and-suspenders
# guard against any future width mismatch.
wrap_lines=False,
),
filter=Condition(lambda: cli_ref._status_bar_visible),
)
@@ -7128,9 +7368,28 @@ class HermesCLI:
# Register atexit cleanup so resources are freed even on unexpected exit
atexit.register(_run_cleanup)
# Install a custom asyncio exception handler that suppresses the
# "Event loop is closed" RuntimeError from httpx transport cleanup.
# This is defense-in-depth — the primary fix is neuter_async_httpx_del
# which disables __del__ entirely, but older clients or SDK upgrades
# could bypass it.
def _suppress_closed_loop_errors(loop, context):
exc = context.get("exception")
if isinstance(exc, RuntimeError) and "Event loop is closed" in str(exc):
return # silently suppress
# Fall back to default handler for everything else
loop.default_exception_handler(context)
# Run the application with patch_stdout for proper output handling
try:
with patch_stdout():
# Set the custom handler on prompt_toolkit's event loop
try:
import asyncio as _aio
_loop = _aio.get_event_loop()
_loop.set_exception_handler(_suppress_closed_loop_errors)
except Exception:
pass
app.run()
except (EOFError, KeyboardInterrupt):
pass


@@ -327,7 +327,20 @@ def load_jobs() -> List[Dict[str, Any]]:
with open(JOBS_FILE, 'r', encoding='utf-8') as f:
data = json.load(f)
return data.get("jobs", [])
except (json.JSONDecodeError, IOError):
except json.JSONDecodeError:
# Retry with strict=False to handle bare control chars in string values
try:
with open(JOBS_FILE, 'r', encoding='utf-8') as f:
data = json.loads(f.read(), strict=False)
jobs = data.get("jobs", [])
if jobs:
# Auto-repair: rewrite with proper escaping
save_jobs(jobs)
logger.warning("Auto-repaired jobs.json (had invalid control characters)")
return jobs
except Exception:
return []
except IOError:
return []
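The repair path above leans on the one rule `strict=False` relaxes in Python's `json` module: control characters (such as raw newlines) become legal inside string literals. A quick check of the round trip:

```python
import json

# A literal newline inside a JSON string is invalid per the spec...
broken = '{"jobs": [{"id": "j1", "prompt": "line one\nline two"}]}'
try:
    json.loads(broken)
    raise AssertionError("expected JSONDecodeError")
except json.JSONDecodeError:
    pass

# ...but strict=False accepts it, letting us recover the data.
data = json.loads(broken, strict=False)
assert data["jobs"][0]["prompt"] == "line one\nline two"

# json.dumps always escapes control characters, so re-saving the
# jobs (the auto-repair step) yields a file that parses strictly.
repaired = json.dumps(data)
assert json.loads(repaired) == data
```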
@@ -598,6 +611,34 @@ def mark_job_run(job_id: str, success: bool, error: Optional[str] = None):
save_jobs(jobs)
def advance_next_run(job_id: str) -> bool:
"""Preemptively advance next_run_at for a recurring job before execution.
Call this BEFORE run_job() so that if the process crashes mid-execution,
the job won't re-fire on the next gateway restart. This converts the
scheduler from at-least-once to at-most-once for recurring jobs; missing
one run is far better than firing dozens of times in a crash loop.
One-shot jobs are left unchanged so they can still retry on restart.
Returns True if next_run_at was advanced, False otherwise.
"""
jobs = load_jobs()
for job in jobs:
if job["id"] == job_id:
kind = job.get("schedule", {}).get("kind")
if kind not in ("cron", "interval"):
return False
now = _hermes_now().isoformat()
new_next = compute_next_run(job["schedule"], now)
if new_next and new_next != job.get("next_run_at"):
job["next_run_at"] = new_next
save_jobs(jobs)
return True
return False
return False
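The at-most-once ordering the docstring describes reduces to "persist the next schedule, then execute". A sketch of the intended call order in a scheduler loop (the `advance`/`execute` callables stand in for `advance_next_run` and the actual job runner):

```python
def run_due_jobs(due_jobs, advance, execute):
    """Advance next_run_at BEFORE executing each recurring job.

    If the process dies inside execute(), the rescheduled time is
    already on disk, so a restart skips the job rather than re-firing
    it (at-most-once). advance() returns False for one-shot jobs,
    leaving them free to retry on restart (at-least-once).
    """
    for job in due_jobs:
        advance(job["id"])  # no-op for one-shot jobs
        execute(job)

calls = []
run_due_jobs(
    [{"id": "daily-report"}],
    advance=lambda jid: calls.append(("advance", jid)),
    execute=lambda job: calls.append(("run", job["id"])),
)
assert calls == [("advance", "daily-report"), ("run", "daily-report")]
```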
def get_due_jobs() -> List[Dict[str, Any]]:
"""Get all jobs that are due to run now.


@@ -26,6 +26,7 @@ except ImportError:
msvcrt = None
from pathlib import Path
from hermes_constants import get_hermes_home
from hermes_cli.config import load_config
from typing import Optional
from hermes_time import now as _hermes_now
@@ -35,7 +36,7 @@ logger = logging.getLogger(__name__)
# Add parent directory to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent))
from cron.jobs import get_due_jobs, mark_job_run, save_job_output
from cron.jobs import get_due_jobs, mark_job_run, save_job_output, advance_next_run
# Sentinel: when a cron agent has nothing new to report, it can start its
# response with this marker to suppress delivery. Output is still saved
@@ -86,6 +87,22 @@ def _resolve_delivery_target(job: dict) -> Optional[dict]:
chat_id, thread_id = rest.split(":", 1)
else:
chat_id, thread_id = rest, None
# Resolve human-friendly labels like "Alice (dm)" to real IDs.
# send_message(action="list") shows labels with display suffixes
# that aren't valid platform IDs (e.g. WhatsApp JIDs).
try:
from gateway.channel_directory import resolve_channel_name
target = chat_id
# Strip display suffix like " (dm)" or " (group)"
if target.endswith(")") and " (" in target:
target = target.rsplit(" (", 1)[0].strip()
resolved = resolve_channel_name(platform_name.lower(), target)
if resolved:
chat_id = resolved
except Exception:
pass
return {
"platform": platform_name,
"chat_id": chat_id,
@ -145,6 +162,8 @@ def _deliver_result(job: dict, content: str) -> None:
"mattermost": Platform.MATTERMOST,
"homeassistant": Platform.HOMEASSISTANT,
"dingtalk": Platform.DINGTALK,
"feishu": Platform.FEISHU,
"wecom": Platform.WECOM,
"email": Platform.EMAIL,
"sms": Platform.SMS,
}
@ -164,18 +183,29 @@ def _deliver_result(job: dict, content: str) -> None:
logger.warning("Job '%s': platform '%s' not configured/enabled", job["id"], platform_name)
return
# Wrap the content so the user knows this is a cron delivery and that
# the interactive agent has no visibility into it.
task_name = job.get("name", job["id"])
wrapped = (
f"Cronjob Response: {task_name}\n"
f"-------------\n\n"
f"{content}\n\n"
f"Note: The agent cannot see this message, and therefore cannot respond to it."
)
# Optionally wrap the content with a header/footer so the user knows this
# is a cron delivery. Wrapping is on by default; set cron.wrap_response: false
# in config.yaml for clean output.
wrap_response = True
try:
user_cfg = load_config()
wrap_response = user_cfg.get("cron", {}).get("wrap_response", True)
except Exception:
pass
if wrap_response:
task_name = job.get("name", job["id"])
delivery_content = (
f"Cronjob Response: {task_name}\n"
f"-------------\n\n"
f"{content}\n\n"
f"Note: The agent cannot see this message, and therefore cannot respond to it."
)
else:
delivery_content = content
# Run the async send in a fresh event loop (safe from any thread)
coro = _send_to_platform(platform, pconfig, chat_id, wrapped, thread_id=thread_id)
coro = _send_to_platform(platform, pconfig, chat_id, delivery_content, thread_id=thread_id)
try:
result = asyncio.run(coro)
except RuntimeError:
@ -186,7 +216,7 @@ def _deliver_result(job: dict, content: str) -> None:
coro.close()
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, wrapped, thread_id=thread_id))
future = pool.submit(asyncio.run, _send_to_platform(platform, pconfig, chat_id, delivery_content, thread_id=thread_id))
result = future.result(timeout=30)
except Exception as e:
logger.error("Job '%s': delivery to %s:%s failed: %s", job["id"], platform_name, chat_id, e)
@ -308,7 +338,7 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
if delivery_target.get("thread_id") is not None:
os.environ["HERMES_CRON_AUTO_DELIVER_THREAD_ID"] = str(delivery_target["thread_id"])
model = job.get("model") or os.getenv("HERMES_MODEL") or "anthropic/claude-opus-4.6"
model = job.get("model") or os.getenv("HERMES_MODEL") or ""
# Load config.yaml for model, reasoning, prefill, toolsets, provider routing
_cfg = {}
@ -524,6 +554,12 @@ def tick(verbose: bool = True) -> int:
executed = 0
for job in due_jobs:
try:
# For recurring jobs (cron/interval), advance next_run_at to the
# next future occurrence BEFORE execution. This way, if the
# process crashes mid-run, the job won't re-fire on restart.
# One-shot jobs are left alone so they can retry on restart.
advance_next_run(job["id"])
success, output, final_response, error = run_job(job)
output_file = save_job_output(job["id"], output)

15
docker/SOUL.md Normal file

@ -0,0 +1,15 @@
# Hermes Agent Persona
<!--
This file defines the agent's personality and tone.
The agent will embody whatever you write here.
Edit this to customize how Hermes communicates with you.
Examples:
- "You are a warm, playful assistant who uses kaomoji occasionally."
- "You are a concise technical expert. No fluff, just facts."
- "You speak like a friendly coworker who happens to know everything."
This file is loaded fresh each message -- no restart needed.
Delete the contents (or this file) to use the default personality.
-->

34
docker/entrypoint.sh Normal file

@ -0,0 +1,34 @@
#!/bin/bash
# Docker entrypoint: bootstrap config files into the mounted volume, then run hermes.
set -e
HERMES_HOME="/opt/data"
INSTALL_DIR="/opt/hermes"
# Create essential directory structure. Cache and platform directories
# (cache/images, cache/audio, platforms/whatsapp, etc.) are created on
# demand by the application — don't pre-create them here so new installs
# get the consolidated layout from get_hermes_dir().
mkdir -p "$HERMES_HOME"/{cron,sessions,logs,hooks,memories,skills}
# .env
if [ ! -f "$HERMES_HOME/.env" ]; then
cp "$INSTALL_DIR/.env.example" "$HERMES_HOME/.env"
fi
# config.yaml
if [ ! -f "$HERMES_HOME/config.yaml" ]; then
cp "$INSTALL_DIR/cli-config.yaml.example" "$HERMES_HOME/config.yaml"
fi
# SOUL.md
if [ ! -f "$HERMES_HOME/SOUL.md" ]; then
cp "$INSTALL_DIR/docker/SOUL.md" "$HERMES_HOME/SOUL.md"
fi
# Sync bundled skills (manifest-based so user edits are preserved)
if [ -d "$INSTALL_DIR/skills" ]; then
python3 "$INSTALL_DIR/tools/skills_sync.py"
fi
exec hermes "$@"


@ -101,21 +101,11 @@ Available methods:
### Patches (`patches.py`)
**Problem**: Some hermes-agent tools use `asyncio.run()` internally (e.g., the Modal backend via SWE-ReX). This crashes when called from inside Atropos's event loop because `asyncio.run()` cannot be nested.
**Problem**: Some hermes-agent tools use `asyncio.run()` internally (e.g., the Modal backend). This crashes when called from inside Atropos's event loop because `asyncio.run()` cannot be nested.
**Solution**: `patches.py` monkey-patches `SwerexModalEnvironment` to use a dedicated background thread (`_AsyncWorker`) with its own event loop. The calling code sees the same sync interface, but internally the async work happens on a separate thread that doesn't conflict with Atropos's loop.
**Solution**: `ModalEnvironment` uses a dedicated `_AsyncWorker` background thread with its own event loop. The calling code sees a sync interface, but internally all async Modal SDK calls happen on the worker thread so they don't conflict with Atropos's loop. This is built directly into `tools/environments/modal.py` — no monkey-patching required.
What gets patched:
- `SwerexModalEnvironment.__init__` -- creates Modal deployment on a background thread
- `SwerexModalEnvironment.execute` -- runs commands on the same background thread
- `SwerexModalEnvironment.stop` -- stops deployment on the background thread
The patches are:
- **Idempotent** -- calling `apply_patches()` multiple times is safe
- **Transparent** -- same interface and behavior, only the internal async execution changes
- **Universal** -- works identically in normal CLI use (no running event loop)
Applied automatically at import time by `hermes_base_env.py`.
`patches.py` is now a no-op (kept for backward compatibility with imports).
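The dedicated worker-thread approach described above can be sketched as follows. Names here are illustrative (the real implementation lives in `tools/environments/modal.py` and wraps Modal SDK calls rather than a bare coroutine):

```python
import asyncio
import threading

class AsyncWorker:
    """Own a private event loop on a background thread and expose a
    sync run() that submits coroutines to it. Safe to call from code
    that is already running inside some other event loop."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro, timeout=None):
        # run_coroutine_threadsafe schedules onto the worker's loop and
        # returns a concurrent.futures.Future we can block on here.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def stop(self):
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()

worker = AsyncWorker()

async def ping():
    await asyncio.sleep(0)
    return "pong"

result = worker.run(ping())  # no nested asyncio.run(), so this is safe
worker.stop()                # even when the caller owns an event loop
```

Because the blocking wait happens on a `concurrent.futures.Future` rather than via `asyncio.run()`, there is no nested-loop conflict with Atropos's own event loop.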
### Tool Call Parsers (`tool_call_parsers/`)


@ -209,7 +209,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
# Agent settings -- TB2 tasks are complex, need many turns
max_agent_turns=60,
max_token_length=***
max_token_length=16000,
agent_temperature=0.6,
system_prompt=None,
@ -233,7 +233,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
steps_per_eval=1,
total_steps=1,
tokenizer_name="NousRe...1-8B",
tokenizer_name="NousResearch/Hermes-3-Llama-3.1-8B",
use_wandb=True,
wandb_name="terminal-bench-2",
ensure_scores_are_not_same=False, # Binary rewards may all be 0 or 1
@ -245,7 +245,7 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
base_url="https://openrouter.ai/api/v1",
model_name="anthropic/claude-sonnet-4",
server_type="openai",
api_key=os.get...EY", ""),
api_key=os.getenv("OPENROUTER_API_KEY", ""),
health_check=False,
)
]
@ -513,3 +513,446 @@ class TerminalBench2EvalEnv(HermesAgentBaseEnv):
reward = 0.0
else:
# Run tests in a thread so the blocking ctx.terminal() calls
# don't freeze the entire event loop (which would stall all
# other tasks, tqdm updates, and timeout timers).
ctx = ToolContext(task_id)
try:
loop = asyncio.get_event_loop()
reward = await loop.run_in_executor(
None, # default thread pool
self._run_tests, eval_item, ctx, task_name,
)
except Exception as e:
logger.error("Task %s: test verification failed: %s", task_name, e)
reward = 0.0
finally:
ctx.cleanup()
passed = reward == 1.0
status = "PASS" if passed else "FAIL"
elapsed = time.time() - task_start
tqdm.write(f" [{status}] {task_name} (turns={result.turns_used}, {elapsed:.0f}s)")
logger.info(
"Task %s: reward=%.1f, turns=%d, finished=%s",
task_name, reward, result.turns_used, result.finished_naturally,
)
out = {
"passed": passed,
"reward": reward,
"task_name": task_name,
"category": category,
"turns_used": result.turns_used,
"finished_naturally": result.finished_naturally,
"messages": result.messages,
}
self._save_result(out)
return out
except Exception as e:
elapsed = time.time() - task_start
logger.error("Task %s: rollout failed: %s", task_name, e, exc_info=True)
tqdm.write(f" [ERROR] {task_name}: {e} ({elapsed:.0f}s)")
out = {
"passed": False, "reward": 0.0,
"task_name": task_name, "category": category,
"error": str(e),
}
self._save_result(out)
return out
finally:
# --- Cleanup: clear overrides, sandbox, and temp files ---
clear_task_env_overrides(task_id)
try:
cleanup_vm(task_id)
except Exception as e:
logger.debug("VM cleanup for %s: %s", task_id[:8], e)
if task_dir and task_dir.exists():
shutil.rmtree(task_dir, ignore_errors=True)
def _run_tests(
self, item: Dict[str, Any], ctx: ToolContext, task_name: str
) -> float:
"""
Upload and execute the test suite in the agent's sandbox, then
download the verifier output locally to read the reward.
Follows Harbor's verification pattern:
1. Upload tests/ directory into the sandbox
2. Execute test.sh inside the sandbox
3. Download /logs/verifier/ directory to a local temp dir
4. Read reward.txt locally with native Python I/O
Downloading locally avoids issues with the file_read tool on
the Modal VM and matches how Harbor handles verification.
TB2 test scripts (test.sh) typically:
1. Install pytest via uv/pip
2. Run pytest against the test files in /tests/
3. Write results to /logs/verifier/reward.txt
Args:
item: The TB2 task dict (contains tests_tar, test_sh)
ctx: ToolContext scoped to this task's sandbox
task_name: For logging
Returns:
1.0 if tests pass, 0.0 otherwise
"""
tests_tar = item.get("tests_tar", "")
test_sh = item.get("test_sh", "")
if not test_sh:
logger.warning("Task %s: no test_sh content, reward=0", task_name)
return 0.0
# Create required directories in the sandbox
ctx.terminal("mkdir -p /tests /logs/verifier")
# Upload test files into the sandbox (binary-safe via base64)
if tests_tar:
tests_temp = Path(tempfile.mkdtemp(prefix=f"tb2-tests-{task_name}-"))
try:
_extract_base64_tar(tests_tar, tests_temp)
ctx.upload_dir(str(tests_temp), "/tests")
except Exception as e:
logger.warning("Task %s: failed to upload test files: %s", task_name, e)
finally:
shutil.rmtree(tests_temp, ignore_errors=True)
# Write the test runner script (test.sh)
ctx.write_file("/tests/test.sh", test_sh)
ctx.terminal("chmod +x /tests/test.sh")
# Execute the test suite
logger.info(
"Task %s: running test suite (timeout=%ds)",
task_name, self.config.test_timeout,
)
test_result = ctx.terminal(
"bash /tests/test.sh",
timeout=self.config.test_timeout,
)
exit_code = test_result.get("exit_code", -1)
output = test_result.get("output", "")
# Download the verifier output directory locally, then read reward.txt
# with native Python I/O. This avoids issues with file_read on the
# Modal VM and matches Harbor's verification pattern.
reward = 0.0
local_verifier_dir = Path(tempfile.mkdtemp(prefix=f"tb2-verifier-{task_name}-"))
try:
ctx.download_dir("/logs/verifier", str(local_verifier_dir))
reward_file = local_verifier_dir / "reward.txt"
if reward_file.exists() and reward_file.stat().st_size > 0:
content = reward_file.read_text().strip()
if content == "1":
reward = 1.0
elif content == "0":
reward = 0.0
else:
# Unexpected content -- try parsing as float
try:
reward = float(content)
except (ValueError, TypeError):
logger.warning(
"Task %s: reward.txt content unexpected (%r), "
"falling back to exit_code=%d",
task_name, content, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
else:
# reward.txt not written -- fall back to exit code
logger.warning(
"Task %s: reward.txt not found after download, "
"falling back to exit_code=%d",
task_name, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
except Exception as e:
logger.warning(
"Task %s: failed to download verifier dir: %s, "
"falling back to exit_code=%d",
task_name, e, exit_code,
)
reward = 1.0 if exit_code == 0 else 0.0
finally:
shutil.rmtree(local_verifier_dir, ignore_errors=True)
# Log test output for debugging failures
if reward == 0.0:
output_preview = output[-500:] if output else "(no output)"
logger.info(
"Task %s: FAIL (exit_code=%d)\n%s",
task_name, exit_code, output_preview,
)
return reward
# =========================================================================
# Evaluate -- main entry point for the eval subcommand
# =========================================================================
async def _eval_with_timeout(self, item: Dict[str, Any]) -> Dict:
"""
Wrap rollout_and_score_eval with a per-task wall-clock timeout.
If the task exceeds task_timeout seconds, it's automatically scored
as FAIL. This prevents any single task from hanging indefinitely.
"""
task_name = item.get("task_name", "unknown")
category = item.get("category", "unknown")
try:
return await asyncio.wait_for(
self.rollout_and_score_eval(item),
timeout=self.config.task_timeout,
)
except asyncio.TimeoutError:
from tqdm import tqdm
elapsed = self.config.task_timeout
tqdm.write(f" [TIMEOUT] {task_name} (exceeded {elapsed}s wall-clock limit)")
logger.error("Task %s: wall-clock timeout after %ds", task_name, elapsed)
out = {
"passed": False, "reward": 0.0,
"task_name": task_name, "category": category,
"error": f"timeout ({elapsed}s)",
}
self._save_result(out)
return out
async def evaluate(self, *args, **kwargs) -> None:
"""
Run Terminal-Bench 2.0 evaluation over all tasks.
This is the main entry point when invoked via:
python environments/terminalbench2_env.py evaluate
Runs all tasks through rollout_and_score_eval() via asyncio.gather()
(same pattern as GPQA and other Atropos eval envs). Each task is
wrapped with a wall-clock timeout so hung tasks auto-fail.
Suppresses noisy Modal/terminal output (HERMES_QUIET) so the tqdm
bar stays visible.
"""
start_time = time.time()
# Route all logging through tqdm.write() so the progress bar stays
# pinned at the bottom while log lines scroll above it.
from tqdm import tqdm
class _TqdmHandler(logging.Handler):
def emit(self, record):
try:
tqdm.write(self.format(record))
except Exception:
self.handleError(record)
handler = _TqdmHandler()
handler.setFormatter(logging.Formatter(
"%(asctime)s [%(name)s] %(levelname)s: %(message)s",
datefmt="%H:%M:%S",
))
root = logging.getLogger()
root.handlers = [handler] # Replace any existing handlers
root.setLevel(logging.INFO)
# Silence noisy third-party loggers that flood the output
logging.getLogger("httpx").setLevel(logging.WARNING) # Every HTTP request
logging.getLogger("openai").setLevel(logging.WARNING) # OpenAI client retries
logging.getLogger("rex-deploy").setLevel(logging.WARNING) # Swerex deployment
logging.getLogger("rex_image_builder").setLevel(logging.WARNING) # Image builds
print(f"\n{'='*60}")
print("Starting Terminal-Bench 2.0 Evaluation")
print(f"{'='*60}")
print(f" Dataset: {self.config.dataset_name}")
print(f" Total tasks: {len(self.all_eval_items)}")
print(f" Max agent turns: {self.config.max_agent_turns}")
print(f" Task timeout: {self.config.task_timeout}s")
print(f" Terminal backend: {self.config.terminal_backend}")
print(f" Tool thread pool: {self.config.tool_pool_size}")
print(f" Terminal timeout: {self.config.terminal_timeout}s/cmd")
print(f" Terminal lifetime: {self.config.terminal_lifetime}s (auto: task_timeout + 120)")
print(f" Max concurrent tasks: {self.config.max_concurrent_tasks}")
print(f"{'='*60}\n")
# Semaphore to limit concurrent Modal sandbox creations.
# Without this, all 86 tasks fire simultaneously, each creating a Modal
# sandbox via asyncio.run() inside a thread pool worker. Modal's blocking
# calls (App.lookup, etc.) deadlock when too many are created at once.
semaphore = asyncio.Semaphore(self.config.max_concurrent_tasks)
async def _eval_with_semaphore(item):
async with semaphore:
return await self._eval_with_timeout(item)
# Fire all tasks with wall-clock timeout, track live accuracy on the bar
total_tasks = len(self.all_eval_items)
eval_tasks = [
asyncio.ensure_future(_eval_with_semaphore(item))
for item in self.all_eval_items
]
results = []
passed_count = 0
pbar = tqdm(total=total_tasks, desc="Evaluating TB2", dynamic_ncols=True)
try:
for coro in asyncio.as_completed(eval_tasks):
result = await coro
results.append(result)
if result and result.get("passed"):
passed_count += 1
done = len(results)
pct = (passed_count / done * 100) if done else 0
pbar.set_postfix_str(f"pass={passed_count}/{done} ({pct:.1f}%)")
pbar.update(1)
except (KeyboardInterrupt, asyncio.CancelledError):
pbar.close()
print(f"\n\nInterrupted! Cleaning up {len(eval_tasks)} tasks...")
# Cancel all pending tasks
for task in eval_tasks:
task.cancel()
# Let cancellations propagate (finally blocks run cleanup_vm)
await asyncio.gather(*eval_tasks, return_exceptions=True)
# Belt-and-suspenders: clean up any remaining sandboxes
from tools.terminal_tool import cleanup_all_environments
cleanup_all_environments()
print("All sandboxes cleaned up.")
return
finally:
pbar.close()
end_time = time.time()
# Filter out None results (shouldn't happen, but be safe)
valid_results = [r for r in results if r is not None]
if not valid_results:
print("Warning: No valid evaluation results obtained")
return
# ---- Compute metrics ----
total = len(valid_results)
passed = sum(1 for r in valid_results if r.get("passed"))
overall_pass_rate = passed / total if total > 0 else 0.0
# Per-category breakdown
cat_results: Dict[str, List[Dict]] = defaultdict(list)
for r in valid_results:
cat_results[r.get("category", "unknown")].append(r)
# Build metrics dict
eval_metrics = {
"eval/pass_rate": overall_pass_rate,
"eval/total_tasks": total,
"eval/passed_tasks": passed,
"eval/evaluation_time_seconds": end_time - start_time,
}
# Per-category metrics
for category, cat_items in sorted(cat_results.items()):
cat_passed = sum(1 for r in cat_items if r.get("passed"))
cat_total = len(cat_items)
cat_pass_rate = cat_passed / cat_total if cat_total > 0 else 0.0
cat_key = category.replace(" ", "_").replace("-", "_").lower()
eval_metrics[f"eval/pass_rate_{cat_key}"] = cat_pass_rate
# Store metrics for wandb_log
self.eval_metrics = [(k, v) for k, v in eval_metrics.items()]
# ---- Print summary ----
print(f"\n{'='*60}")
print("Terminal-Bench 2.0 Evaluation Results")
print(f"{'='*60}")
print(f"Overall Pass Rate: {overall_pass_rate:.4f} ({passed}/{total})")
print(f"Evaluation Time: {end_time - start_time:.1f} seconds")
print("\nCategory Breakdown:")
for category, cat_items in sorted(cat_results.items()):
cat_passed = sum(1 for r in cat_items if r.get("passed"))
cat_total = len(cat_items)
cat_rate = cat_passed / cat_total if cat_total > 0 else 0.0
print(f" {category}: {cat_rate:.1%} ({cat_passed}/{cat_total})")
# Print individual task results
print("\nTask Results:")
for r in sorted(valid_results, key=lambda x: x.get("task_name", "")):
status = "PASS" if r.get("passed") else "FAIL"
turns = r.get("turns_used", "?")
error = r.get("error", "")
extra = f" (error: {error})" if error else ""
print(f" [{status}] {r['task_name']} (turns={turns}){extra}")
print(f"{'='*60}\n")
# Build sample records for evaluate_log (includes full conversations)
samples = [
{
"task_name": r.get("task_name"),
"category": r.get("category"),
"passed": r.get("passed"),
"reward": r.get("reward"),
"turns_used": r.get("turns_used"),
"error": r.get("error"),
"messages": r.get("messages"),
}
for r in valid_results
]
# Log evaluation results
try:
await self.evaluate_log(
metrics=eval_metrics,
samples=samples,
start_time=start_time,
end_time=end_time,
generation_parameters={
"temperature": self.config.agent_temperature,
"max_tokens": self.config.max_token_length,
"max_agent_turns": self.config.max_agent_turns,
"terminal_backend": self.config.terminal_backend,
},
)
except Exception as e:
print(f"Error logging evaluation results: {e}")
# Close streaming file
if hasattr(self, "_streaming_file") and not self._streaming_file.closed:
self._streaming_file.close()
print(f" Live results saved to: {self._streaming_path}")
# Kill all remaining sandboxes. Timed-out tasks leave orphaned thread
# pool workers still executing commands -- cleanup_all stops them.
from tools.terminal_tool import cleanup_all_environments
print("\nCleaning up all sandboxes...")
cleanup_all_environments()
# Shut down the tool thread pool so orphaned workers from timed-out
# tasks are killed immediately instead of retrying against dead
# sandboxes and spamming the console with TimeoutError warnings.
from environments.agent_loop import _tool_executor
_tool_executor.shutdown(wait=False, cancel_futures=True)
print("Done.")
# =========================================================================
# Wandb logging
# =========================================================================
async def wandb_log(self, wandb_metrics: Optional[Dict] = None):
"""Log TB2-specific metrics to wandb."""
if wandb_metrics is None:
wandb_metrics = {}
# Add stored eval metrics
for metric_name, metric_value in self.eval_metrics:
wandb_metrics[metric_name] = metric_value
self.eval_metrics = []
await super().wandb_log(wandb_metrics)
if __name__ == "__main__":
TerminalBench2EvalEnv.cli()


@ -0,0 +1 @@
"""Built-in gateway hooks that are always registered."""


@ -0,0 +1,86 @@
"""Built-in boot-md hook — run ~/.hermes/BOOT.md on gateway startup.
This hook is always registered. It silently skips if no BOOT.md exists.
To activate, create ``~/.hermes/BOOT.md`` with instructions for the
agent to execute on every gateway restart.
Example BOOT.md::
# Startup Checklist
1. Check if any cron jobs failed overnight
2. Send a status update to Discord #general
3. If there are errors in /opt/app/deploy.log, summarize them
The agent runs in a background thread so it doesn't block gateway
startup. If nothing needs attention, it replies with [SILENT] to
suppress delivery.
"""
import logging
import os
import threading
from pathlib import Path
logger = logging.getLogger("hooks.boot-md")
HERMES_HOME = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
BOOT_FILE = HERMES_HOME / "BOOT.md"
def _build_boot_prompt(content: str) -> str:
"""Wrap BOOT.md content in a system-level instruction."""
return (
"You are running a startup boot checklist. Follow the BOOT.md "
"instructions below exactly.\n\n"
"---\n"
f"{content}\n"
"---\n\n"
"Execute each instruction. If you need to send a message to a "
"platform, use the send_message tool.\n"
"If nothing needs attention and there is nothing to report, "
"reply with ONLY: [SILENT]"
)
def _run_boot_agent(content: str) -> None:
"""Spawn a one-shot agent session to execute the boot instructions."""
try:
from run_agent import AIAgent
prompt = _build_boot_prompt(content)
agent = AIAgent(
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
max_iterations=20,
)
result = agent.run_conversation(prompt)
response = result.get("final_response", "")
if response and "[SILENT]" not in response:
logger.info("boot-md completed: %s", response[:200])
else:
logger.info("boot-md completed (nothing to report)")
except Exception as e:
logger.error("boot-md agent failed: %s", e)
async def handle(event_type: str, context: dict) -> None:
"""Gateway startup handler — run BOOT.md if it exists."""
if not BOOT_FILE.exists():
return
content = BOOT_FILE.read_text(encoding="utf-8").strip()
if not content:
return
logger.info("Running BOOT.md (%d chars)", len(content))
# Run in a background thread so we don't block gateway startup.
thread = threading.Thread(
target=_run_boot_agent,
args=(content,),
name="boot-md",
daemon=True,
)
thread.start()


@ -52,6 +52,8 @@ class Platform(Enum):
DINGTALK = "dingtalk"
API_SERVER = "api_server"
WEBHOOK = "webhook"
FEISHU = "feishu"
WECOM = "wecom"
@dataclass
@ -269,6 +271,12 @@ class GatewayConfig:
# Webhook uses enabled flag only (secrets are per-route)
elif platform == Platform.WEBHOOK:
connected.append(platform)
# Feishu uses extra dict for app credentials
elif platform == Platform.FEISHU and config.extra.get("app_id"):
connected.append(platform)
# WeCom uses extra dict for bot credentials
elif platform == Platform.WECOM and config.extra.get("bot_id"):
connected.append(platform)
return connected
def get_home_channel(self, platform: Platform) -> Optional[HomeChannel]:
@ -596,6 +604,14 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.TELEGRAM] = PlatformConfig()
config.platforms[Platform.TELEGRAM].reply_to_mode = telegram_reply_mode
telegram_fallback_ips = os.getenv("TELEGRAM_FALLBACK_IPS", "")
if telegram_fallback_ips:
if Platform.TELEGRAM not in config.platforms:
config.platforms[Platform.TELEGRAM] = PlatformConfig()
config.platforms[Platform.TELEGRAM].extra["fallback_ips"] = [
ip.strip() for ip in telegram_fallback_ips.split(",") if ip.strip()
]
telegram_home = os.getenv("TELEGRAM_HOME_CHANNEL")
if telegram_home and Platform.TELEGRAM in config.platforms:
config.platforms[Platform.TELEGRAM].home_channel = HomeChannel(
@ -634,14 +650,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.SLACK] = PlatformConfig()
config.platforms[Platform.SLACK].enabled = True
config.platforms[Platform.SLACK].token = slack_token
# Home channel
slack_home = os.getenv("SLACK_HOME_CHANNEL")
if slack_home:
config.platforms[Platform.SLACK].home_channel = HomeChannel(
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
)
slack_home = os.getenv("SLACK_HOME_CHANNEL")
if slack_home and Platform.SLACK in config.platforms:
config.platforms[Platform.SLACK].home_channel = HomeChannel(
platform=Platform.SLACK,
chat_id=slack_home,
name=os.getenv("SLACK_HOME_CHANNEL_NAME", ""),
)
# Signal
signal_url = os.getenv("SIGNAL_HTTP_URL")
@ -655,13 +670,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
"account": signal_account,
"ignore_stories": os.getenv("SIGNAL_IGNORE_STORIES", "true").lower() in ("true", "1", "yes"),
})
signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
if signal_home:
config.platforms[Platform.SIGNAL].home_channel = HomeChannel(
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
)
signal_home = os.getenv("SIGNAL_HOME_CHANNEL")
if signal_home and Platform.SIGNAL in config.platforms:
config.platforms[Platform.SIGNAL].home_channel = HomeChannel(
platform=Platform.SIGNAL,
chat_id=signal_home,
name=os.getenv("SIGNAL_HOME_CHANNEL_NAME", "Home"),
)
# Mattermost
mattermost_token = os.getenv("MATTERMOST_TOKEN")
@ -674,13 +689,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.MATTERMOST].enabled = True
config.platforms[Platform.MATTERMOST].token = mattermost_token
config.platforms[Platform.MATTERMOST].extra["url"] = mattermost_url
mattermost_home = os.getenv("MATTERMOST_HOME_CHANNEL")
if mattermost_home:
config.platforms[Platform.MATTERMOST].home_channel = HomeChannel(
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
)
mattermost_home = os.getenv("MATTERMOST_HOME_CHANNEL")
if mattermost_home and Platform.MATTERMOST in config.platforms:
config.platforms[Platform.MATTERMOST].home_channel = HomeChannel(
platform=Platform.MATTERMOST,
chat_id=mattermost_home,
name=os.getenv("MATTERMOST_HOME_CHANNEL_NAME", "Home"),
)
# Matrix
matrix_token = os.getenv("MATRIX_ACCESS_TOKEN")
@ -702,13 +717,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.MATRIX].extra["password"] = matrix_password
matrix_e2ee = os.getenv("MATRIX_ENCRYPTION", "").lower() in ("true", "1", "yes")
config.platforms[Platform.MATRIX].extra["encryption"] = matrix_e2ee
matrix_home = os.getenv("MATRIX_HOME_ROOM")
if matrix_home:
config.platforms[Platform.MATRIX].home_channel = HomeChannel(
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
)
matrix_home = os.getenv("MATRIX_HOME_ROOM")
if matrix_home and Platform.MATRIX in config.platforms:
config.platforms[Platform.MATRIX].home_channel = HomeChannel(
platform=Platform.MATRIX,
chat_id=matrix_home,
name=os.getenv("MATRIX_HOME_ROOM_NAME", "Home"),
)
# Home Assistant
hass_token = os.getenv("HASS_TOKEN")
@ -735,13 +750,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
"imap_host": email_imap,
"smtp_host": email_smtp,
})
email_home = os.getenv("EMAIL_HOME_ADDRESS")
if email_home:
config.platforms[Platform.EMAIL].home_channel = HomeChannel(
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
)
email_home = os.getenv("EMAIL_HOME_ADDRESS")
if email_home and Platform.EMAIL in config.platforms:
config.platforms[Platform.EMAIL].home_channel = HomeChannel(
platform=Platform.EMAIL,
chat_id=email_home,
name=os.getenv("EMAIL_HOME_ADDRESS_NAME", "Home"),
)
# SMS (Twilio)
twilio_sid = os.getenv("TWILIO_ACCOUNT_SID")
@ -750,13 +765,13 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
config.platforms[Platform.SMS] = PlatformConfig()
config.platforms[Platform.SMS].enabled = True
config.platforms[Platform.SMS].api_key = os.getenv("TWILIO_AUTH_TOKEN", "")
sms_home = os.getenv("SMS_HOME_CHANNEL")
if sms_home:
config.platforms[Platform.SMS].home_channel = HomeChannel(
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
)
sms_home = os.getenv("SMS_HOME_CHANNEL")
if sms_home and Platform.SMS in config.platforms:
config.platforms[Platform.SMS].home_channel = HomeChannel(
platform=Platform.SMS,
chat_id=sms_home,
name=os.getenv("SMS_HOME_CHANNEL_NAME", "Home"),
)
# API Server
api_server_enabled = os.getenv("API_SERVER_ENABLED", "").lower() in ("true", "1", "yes")
@ -798,6 +813,55 @@ def _apply_env_overrides(config: GatewayConfig) -> None:
if webhook_secret:
config.platforms[Platform.WEBHOOK].extra["secret"] = webhook_secret
# Feishu / Lark
feishu_app_id = os.getenv("FEISHU_APP_ID")
feishu_app_secret = os.getenv("FEISHU_APP_SECRET")
if feishu_app_id and feishu_app_secret:
if Platform.FEISHU not in config.platforms:
config.platforms[Platform.FEISHU] = PlatformConfig()
config.platforms[Platform.FEISHU].enabled = True
config.platforms[Platform.FEISHU].extra.update({
"app_id": feishu_app_id,
"app_secret": feishu_app_secret,
"domain": os.getenv("FEISHU_DOMAIN", "feishu"),
"connection_mode": os.getenv("FEISHU_CONNECTION_MODE", "websocket"),
})
feishu_encrypt_key = os.getenv("FEISHU_ENCRYPT_KEY", "")
if feishu_encrypt_key:
config.platforms[Platform.FEISHU].extra["encrypt_key"] = feishu_encrypt_key
feishu_verification_token = os.getenv("FEISHU_VERIFICATION_TOKEN", "")
if feishu_verification_token:
config.platforms[Platform.FEISHU].extra["verification_token"] = feishu_verification_token
feishu_home = os.getenv("FEISHU_HOME_CHANNEL")
if feishu_home:
config.platforms[Platform.FEISHU].home_channel = HomeChannel(
platform=Platform.FEISHU,
chat_id=feishu_home,
name=os.getenv("FEISHU_HOME_CHANNEL_NAME", "Home"),
)
# WeCom (Enterprise WeChat)
wecom_bot_id = os.getenv("WECOM_BOT_ID")
wecom_secret = os.getenv("WECOM_SECRET")
if wecom_bot_id and wecom_secret:
if Platform.WECOM not in config.platforms:
config.platforms[Platform.WECOM] = PlatformConfig()
config.platforms[Platform.WECOM].enabled = True
config.platforms[Platform.WECOM].extra.update({
"bot_id": wecom_bot_id,
"secret": wecom_secret,
})
wecom_ws_url = os.getenv("WECOM_WEBSOCKET_URL", "")
if wecom_ws_url:
config.platforms[Platform.WECOM].extra["websocket_url"] = wecom_ws_url
wecom_home = os.getenv("WECOM_HOME_CHANNEL")
if wecom_home:
config.platforms[Platform.WECOM].home_channel = HomeChannel(
platform=Platform.WECOM,
chat_id=wecom_home,
name=os.getenv("WECOM_HOME_CHANNEL_NAME", "Home"),
)
# Session settings
idle_minutes = os.getenv("SESSION_IDLE_MINUTES")
if idle_minutes:


@ -51,14 +51,33 @@ class HookRegistry:
"""Return metadata about all loaded hooks."""
return list(self._loaded_hooks)
def _register_builtin_hooks(self) -> None:
"""Register built-in hooks that are always active."""
try:
from gateway.builtin_hooks.boot_md import handle as boot_md_handle
self._handlers.setdefault("gateway:startup", []).append(boot_md_handle)
self._loaded_hooks.append({
"name": "boot-md",
"description": "Run ~/.hermes/BOOT.md on gateway startup",
"events": ["gateway:startup"],
"path": "(builtin)",
})
except Exception as e:
print(f"[hooks] Could not load built-in boot-md hook: {e}", flush=True)
def discover_and_load(self) -> None:
"""
Scan the hooks directory for hook directories and load their handlers.
Also registers built-in hooks that are always active.
Each hook directory must contain:
- HOOK.yaml with at least 'name' and 'events' keys
- handler.py with a top-level 'handle' function (sync or async)
"""
self._register_builtin_hooks()
if not HOOKS_DIR.exists():
return
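The directory contract above (a `HOOK.yaml` plus a `handler.py` exposing a top-level `handle`) can be satisfied by a two-file hook. A hypothetical sketch, where the `greet` name and the payload field are illustrative rather than taken from the repo:

```python
# handler.py for a hypothetical hook at ~/.hermes/hooks/greet/
# The sibling HOOK.yaml would declare, at minimum:
#   name: greet
#   events: ["gateway:startup"]

async def handle(event: dict) -> None:
    # Handlers may be sync or async; the registry awaits async ones.
    print(f"[greet] gateway started (payload keys: {sorted(event)})")
```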


@ -25,7 +25,7 @@ import time
from pathlib import Path
from typing import Optional
from hermes_cli.config import get_hermes_home
from hermes_constants import get_hermes_dir
# Unambiguous alphabet -- excludes 0/O, 1/I to prevent confusion
@ -41,7 +41,7 @@ LOCKOUT_SECONDS = 3600 # Lockout duration after too many failures
MAX_PENDING_PER_PLATFORM = 3 # Max pending codes per platform
MAX_FAILED_ATTEMPTS = 5 # Failed approvals before lockout
PAIRING_DIR = get_hermes_home() / "pairing"
PAIRING_DIR = get_hermes_dir("platforms/pairing", "pairing")
def _secure_write(path: Path, data: str) -> None:


@ -166,7 +166,7 @@ class ResponseStore:
_CORS_HEADERS = {
"Access-Control-Allow-Methods": "GET, POST, DELETE, OPTIONS",
"Access-Control-Allow-Headers": "Authorization, Content-Type",
"Access-Control-Allow-Headers": "Authorization, Content-Type, Idempotency-Key",
}
@ -223,6 +223,23 @@ if AIOHTTP_AVAILABLE:
else:
body_limit_middleware = None # type: ignore[assignment]
_SECURITY_HEADERS = {
"X-Content-Type-Options": "nosniff",
"Referrer-Policy": "no-referrer",
}
if AIOHTTP_AVAILABLE:
@web.middleware
async def security_headers_middleware(request, handler):
"""Add security headers to all responses (including errors)."""
response = await handler(request)
for k, v in _SECURITY_HEADERS.items():
response.headers.setdefault(k, v)
return response
else:
security_headers_middleware = None # type: ignore[assignment]
class _IdempotencyCache:
"""In-memory idempotency cache with TTL and basic LRU semantics."""
@ -307,6 +324,7 @@ class APIServerAdapter(BasePlatformAdapter):
if "*" in self._cors_origins:
headers = dict(_CORS_HEADERS)
headers["Access-Control-Allow-Origin"] = "*"
headers["Access-Control-Max-Age"] = "600"
return headers
if origin not in self._cors_origins:
@ -315,6 +333,7 @@ class APIServerAdapter(BasePlatformAdapter):
headers = dict(_CORS_HEADERS)
headers["Access-Control-Allow-Origin"] = origin
headers["Vary"] = "Origin"
headers["Access-Control-Max-Age"] = "600"
return headers
def _origin_allowed(self, origin: str) -> bool:
@ -366,14 +385,20 @@ class APIServerAdapter(BasePlatformAdapter):
Create an AIAgent instance using the gateway's runtime config.
Uses _resolve_runtime_agent_kwargs() to pick up model, api_key,
base_url, etc. from config.yaml / env vars.
base_url, etc. from config.yaml / env vars. Toolsets are resolved
from config.yaml platform_toolsets.api_server (same as all other
gateway platforms), falling back to the hermes-api-server default.
"""
from run_agent import AIAgent
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model
from gateway.run import _resolve_runtime_agent_kwargs, _resolve_gateway_model, _load_gateway_config
from hermes_cli.tools_config import _get_platform_tools
runtime_kwargs = _resolve_runtime_agent_kwargs()
model = _resolve_gateway_model()
user_config = _load_gateway_config()
enabled_toolsets = sorted(_get_platform_tools(user_config, "api_server"))
max_iterations = int(os.getenv("HERMES_MAX_ITERATIONS", "90"))
agent = AIAgent(
@ -383,6 +408,7 @@ class APIServerAdapter(BasePlatformAdapter):
quiet_mode=True,
verbose_logging=False,
ephemeral_system_prompt=ephemeral_system_prompt or None,
enabled_toolsets=enabled_toolsets,
session_id=session_id,
platform="api_server",
stream_delta_callback=stream_delta_callback,
@ -488,17 +514,21 @@ class APIServerAdapter(BasePlatformAdapter):
if delta is not None:
_stream_q.put(delta)
# Start agent in background
# Start agent in background. agent_ref is a mutable container
# so the SSE writer can interrupt the agent on client disconnect.
agent_ref = [None]
agent_task = asyncio.ensure_future(self._run_agent(
user_message=user_message,
conversation_history=history,
ephemeral_system_prompt=system_prompt,
session_id=session_id,
stream_delta_callback=_on_delta,
agent_ref=agent_ref,
))
return await self._write_sse_chat_completion(
request, completion_id, model_name, created, _stream_q, agent_task
request, completion_id, model_name, created, _stream_q,
agent_task, agent_ref,
)
# Non-streaming: run the agent (with optional Idempotency-Key)
@ -561,80 +591,107 @@ class APIServerAdapter(BasePlatformAdapter):
async def _write_sse_chat_completion(
self, request: "web.Request", completion_id: str, model: str,
created: int, stream_q, agent_task,
created: int, stream_q, agent_task, agent_ref=None,
) -> "web.StreamResponse":
"""Write real streaming SSE from agent's stream_delta_callback queue."""
"""Write real streaming SSE from agent's stream_delta_callback queue.
If the client disconnects mid-stream (network drop, browser tab close),
the agent is interrupted via ``agent.interrupt()`` so it stops making
LLM API calls, and the asyncio task wrapper is cancelled.
"""
import queue as _q
response = web.StreamResponse(
status=200,
headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
)
sse_headers = {"Content-Type": "text/event-stream", "Cache-Control": "no-cache"}
# CORS middleware can't inject headers into StreamResponse after
# prepare() flushes them, so resolve CORS headers up front.
origin = request.headers.get("Origin", "")
cors = self._cors_headers_for_origin(origin) if origin else None
if cors:
sse_headers.update(cors)
response = web.StreamResponse(status=200, headers=sse_headers)
await response.prepare(request)
# Role chunk
role_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
# Stream content chunks as they arrive from the agent
loop = asyncio.get_event_loop()
while True:
try:
delta = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
except _q.Empty:
if agent_task.done():
# Drain any remaining items
while True:
try:
delta = stream_q.get_nowait()
if delta is None:
break
content_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
except _q.Empty:
break
break
continue
if delta is None: # End of stream sentinel
break
content_chunk = {
try:
# Role chunk
role_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
await response.write(f"data: {json.dumps(role_chunk)}\n\n".encode())
# Get usage from completed agent
usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
try:
result, agent_usage = await agent_task
usage = agent_usage or usage
except Exception:
pass
# Stream content chunks as they arrive from the agent
loop = asyncio.get_event_loop()
while True:
try:
delta = await loop.run_in_executor(None, lambda: stream_q.get(timeout=0.5))
except _q.Empty:
if agent_task.done():
# Drain any remaining items
while True:
try:
delta = stream_q.get_nowait()
if delta is None:
break
content_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
except _q.Empty:
break
break
continue
# Finish chunk
finish_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
"usage": {
"prompt_tokens": usage.get("input_tokens", 0),
"completion_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
await response.write(f"data: {json.dumps(finish_chunk)}\n\n".encode())
await response.write(b"data: [DONE]\n\n")
if delta is None: # End of stream sentinel
break
content_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {"content": delta}, "finish_reason": None}],
}
await response.write(f"data: {json.dumps(content_chunk)}\n\n".encode())
# Get usage from completed agent
usage = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
try:
result, agent_usage = await agent_task
usage = agent_usage or usage
except Exception:
pass
# Finish chunk
finish_chunk = {
"id": completion_id, "object": "chat.completion.chunk",
"created": created, "model": model,
"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
"usage": {
"prompt_tokens": usage.get("input_tokens", 0),
"completion_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
await response.write(f"data: {json.dumps(finish_chunk)}\n\n".encode())
await response.write(b"data: [DONE]\n\n")
except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError, OSError):
# Client disconnected mid-stream. Interrupt the agent so it
# stops making LLM API calls at the next loop iteration, then
# cancel the asyncio task wrapper.
agent = agent_ref[0] if agent_ref else None
if agent is not None:
try:
agent.interrupt("SSE client disconnected")
except Exception:
pass
if not agent_task.done():
agent_task.cancel()
try:
await agent_task
except (asyncio.CancelledError, Exception):
pass
logger.info("SSE client disconnected; interrupted agent task %s", completion_id)
return response
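The role/content/finish chunks written above follow the OpenAI streaming format, so any SSE-aware client can reassemble the reply. A minimal consumer sketch (the helper name is ours; it assumes an iterable of already-decoded `data:` lines):

```python
import json

def collect_sse_content(lines):
    """Join assistant text out of OpenAI-style chat.completion.chunk SSE lines."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # ignore comments / keep-alives
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break  # terminal sentinel written after the finish chunk
        delta = json.loads(payload)["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```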
@ -1137,12 +1194,18 @@ class APIServerAdapter(BasePlatformAdapter):
ephemeral_system_prompt: Optional[str] = None,
session_id: Optional[str] = None,
stream_delta_callback=None,
agent_ref: Optional[list] = None,
) -> tuple:
"""
Create an agent and run a conversation in a thread executor.
Returns ``(result_dict, usage_dict)`` where *usage_dict* contains
``input_tokens``, ``output_tokens`` and ``total_tokens``.
If *agent_ref* is a one-element list, the AIAgent instance is stored
at ``agent_ref[0]`` before ``run_conversation`` begins. This allows
callers (e.g. the SSE writer) to call ``agent.interrupt()`` from
another thread to stop in-progress LLM calls.
"""
loop = asyncio.get_event_loop()
@ -1152,6 +1215,8 @@ class APIServerAdapter(BasePlatformAdapter):
session_id=session_id,
stream_delta_callback=stream_delta_callback,
)
if agent_ref is not None:
agent_ref[0] = agent
result = agent.run_conversation(
user_message=user_message,
conversation_history=conversation_history,
@ -1176,10 +1241,11 @@ class APIServerAdapter(BasePlatformAdapter):
return False
try:
mws = [mw for mw in (cors_middleware, body_limit_middleware) if mw is not None]
mws = [mw for mw in (cors_middleware, body_limit_middleware, security_headers_middleware) if mw is not None]
self._app = web.Application(middlewares=mws)
self._app["api_server_adapter"] = self
self._app.router.add_get("/health", self._handle_health)
self._app.router.add_get("/v1/health", self._handle_health)
self._app.router.add_get("/v1/models", self._handle_models)
self._app.router.add_post("/v1/chat/completions", self._handle_chat_completions)
self._app.router.add_post("/v1/responses", self._handle_responses)
@ -1195,6 +1261,17 @@ class APIServerAdapter(BasePlatformAdapter):
self._app.router.add_post("/api/jobs/{job_id}/resume", self._handle_resume_job)
self._app.router.add_post("/api/jobs/{job_id}/run", self._handle_run_job)
# Port conflict detection — fail fast if port is already in use
import socket as _socket
try:
with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
_s.settimeout(1)
_s.connect(('127.0.0.1', self._port))
logger.error('[%s] Port %d already in use. Set a different port in config.yaml: platforms.api_server.port', self.name, self._port)
return False
except (ConnectionRefusedError, OSError):
pass # port is free
self._runner = web.AppRunner(self._app)
await self._runner.setup()
self._site = web.TCPSite(self._runner, self._host, self._port)


@ -8,6 +8,7 @@ and implement the required methods.
import asyncio
import logging
import os
import random
import re
import uuid
from abc import ABC, abstractmethod
@ -26,6 +27,7 @@ sys.path.insert(0, str(_Path(__file__).resolve().parents[2]))
from gateway.config import Platform, PlatformConfig
from gateway.session import SessionSource, build_session_key
from hermes_cli.config import get_hermes_home
from hermes_constants import get_hermes_dir
GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
@ -43,8 +45,8 @@ GATEWAY_SECRET_CAPTURE_UNSUPPORTED_MESSAGE = (
# (e.g. Telegram file URLs expire after ~1 hour).
# ---------------------------------------------------------------------------
# Default location: {HERMES_HOME}/image_cache/
IMAGE_CACHE_DIR = get_hermes_home() / "image_cache"
# Default location: {HERMES_HOME}/cache/images/ (legacy: image_cache/)
IMAGE_CACHE_DIR = get_hermes_dir("cache/images", "image_cache")
def get_image_cache_dir() -> Path:
@ -71,31 +73,51 @@ def cache_image_from_bytes(data: bytes, ext: str = ".jpg") -> str:
return str(filepath)
async def cache_image_from_url(url: str, ext: str = ".jpg") -> str:
async def cache_image_from_url(url: str, ext: str = ".jpg", retries: int = 2) -> str:
"""
Download an image from a URL and save it to the local cache.
Uses httpx for async download with a reasonable timeout.
Retries on transient failures (timeouts, 429, 5xx) with a linearly
increasing backoff so a single slow CDN response doesn't lose the media.
Args:
url: The HTTP/HTTPS URL to download from.
ext: File extension including the dot (e.g. ".jpg", ".png").
retries: Number of retry attempts on transient failures.
Returns:
Absolute path to the cached image file as a string.
"""
import asyncio
import httpx
import logging as _logging
_log = _logging.getLogger(__name__)
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "image/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_image_from_bytes(response.content, ext)
for attempt in range(retries + 1):
try:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "image/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_image_from_bytes(response.content, ext)
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
last_exc = exc
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
raise
if attempt < retries:
wait = 1.5 * (attempt + 1)
_log.debug("Media cache retry %d/%d for %s (%.1fs): %s",
attempt + 1, retries, url[:80], wait, exc)
await asyncio.sleep(wait)
continue
raise
raise last_exc
def cleanup_image_cache(max_age_hours: int = 24) -> int:
@ -126,7 +148,7 @@ def cleanup_image_cache(max_age_hours: int = 24) -> int:
# here so the STT tool (OpenAI Whisper) can transcribe them from local files.
# ---------------------------------------------------------------------------
AUDIO_CACHE_DIR = get_hermes_home() / "audio_cache"
AUDIO_CACHE_DIR = get_hermes_dir("cache/audio", "audio_cache")
def get_audio_cache_dir() -> Path:
@ -153,29 +175,51 @@ def cache_audio_from_bytes(data: bytes, ext: str = ".ogg") -> str:
return str(filepath)
async def cache_audio_from_url(url: str, ext: str = ".ogg") -> str:
async def cache_audio_from_url(url: str, ext: str = ".ogg", retries: int = 2) -> str:
"""
Download an audio file from a URL and save it to the local cache.
Retries on transient failures (timeouts, 429, 5xx) with a linearly
increasing backoff so a single slow CDN response doesn't lose the media.
Args:
url: The HTTP/HTTPS URL to download from.
ext: File extension including the dot (e.g. ".ogg", ".mp3").
retries: Number of retry attempts on transient failures.
Returns:
Absolute path to the cached audio file as a string.
"""
import asyncio
import httpx
import logging as _logging
_log = _logging.getLogger(__name__)
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "audio/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_audio_from_bytes(response.content, ext)
for attempt in range(retries + 1):
try:
response = await client.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (compatible; HermesAgent/1.0)",
"Accept": "audio/*,*/*;q=0.8",
},
)
response.raise_for_status()
return cache_audio_from_bytes(response.content, ext)
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
last_exc = exc
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
raise
if attempt < retries:
wait = 1.5 * (attempt + 1)
_log.debug("Audio cache retry %d/%d for %s (%.1fs): %s",
attempt + 1, retries, url[:80], wait, exc)
await asyncio.sleep(wait)
continue
raise
raise last_exc
# ---------------------------------------------------------------------------
@ -185,7 +229,7 @@ async def cache_audio_from_url(url: str, ext: str = ".ogg") -> str:
# here so the agent can reference them by local file path.
# ---------------------------------------------------------------------------
DOCUMENT_CACHE_DIR = get_hermes_home() / "document_cache"
DOCUMENT_CACHE_DIR = get_hermes_dir("cache/documents", "document_cache")
SUPPORTED_DOCUMENT_TYPES = {
".pdf": "application/pdf",
@ -312,7 +356,10 @@ class MessageEvent:
return None
# Split on space and get first word, strip the /
parts = self.text.split(maxsplit=1)
return parts[0][1:].lower() if parts else None
raw = parts[0][1:].lower() if parts else None
if raw and "@" in raw:
raw = raw.split("@", 1)[0]
return raw
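The `@` handling added here mirrors Telegram-style group commands such as `/start@MyBot`. The same parsing, as a standalone sketch:

```python
def parse_command(text: str):
    """Return the bare command name from '/cmd', '/cmd@Bot', or '/cmd args'."""
    if not text or not text.startswith("/"):
        return None
    parts = text.split(maxsplit=1)
    raw = parts[0][1:].lower() if parts else None
    if raw and "@" in raw:
        raw = raw.split("@", 1)[0]  # strip the bot-name suffix
    return raw
```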
def get_command_args(self) -> str:
"""Get the arguments after a command."""
@ -329,6 +376,24 @@ class SendResult:
message_id: Optional[str] = None
error: Optional[str] = None
raw_response: Any = None
retryable: bool = False # True for transient errors (network, timeout) — base will retry automatically
# Error substrings that indicate a transient network failure worth retrying
_RETRYABLE_ERROR_PATTERNS = (
"connecterror",
"connectionerror",
"connectionreset",
"connectionrefused",
"timeout",
"timed out",
"network",
"broken pipe",
"remotedisconnected",
"eoferror",
"readtimeout",
"writetimeout",
)
# Type for message handlers
@ -833,6 +898,91 @@ class BasePlatformAdapter(ABC):
except Exception:
pass
@staticmethod
def _is_retryable_error(error: Optional[str]) -> bool:
"""Return True if the error string looks like a transient network failure."""
if not error:
return False
lowered = error.lower()
return any(pat in lowered for pat in _RETRYABLE_ERROR_PATTERNS)
async def _send_with_retry(
self,
chat_id: str,
content: str,
reply_to: Optional[str] = None,
metadata: Any = None,
max_retries: int = 2,
base_delay: float = 2.0,
) -> "SendResult":
"""
Send a message with automatic retry for transient network errors.
On permanent failures (e.g. formatting / permission errors) falls back
to a plain-text version before giving up. If all attempts fail due to
network errors, sends the user a brief delivery-failure notice so they
know to retry rather than waiting indefinitely.
"""
result = await self.send(
chat_id=chat_id,
content=content,
reply_to=reply_to,
metadata=metadata,
)
if result.success:
return result
error_str = result.error or ""
is_network = result.retryable or self._is_retryable_error(error_str)
if is_network:
# Retry with exponential backoff for transient errors
for attempt in range(1, max_retries + 1):
delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 1)
logger.warning(
"[%s] Send failed (attempt %d/%d, retrying in %.1fs): %s",
self.name, attempt, max_retries, delay, error_str,
)
await asyncio.sleep(delay)
result = await self.send(
chat_id=chat_id,
content=content,
reply_to=reply_to,
metadata=metadata,
)
if result.success:
logger.info("[%s] Send succeeded on retry %d", self.name, attempt)
return result
error_str = result.error or ""
if not (result.retryable or self._is_retryable_error(error_str)):
break # error switched to non-transient — fall through to plain-text fallback
else:
# All retries exhausted (loop completed without break) — notify user
logger.error("[%s] Failed to deliver response after %d retries: %s", self.name, max_retries, error_str)
notice = (
"\u26a0\ufe0f Message delivery failed after multiple attempts. "
"Please try again \u2014 your request was processed but the response could not be sent."
)
try:
await self.send(chat_id=chat_id, content=notice, reply_to=reply_to, metadata=metadata)
except Exception as notify_err:
logger.debug("[%s] Could not send delivery-failure notice: %s", self.name, notify_err)
return result
# Non-network / post-retry formatting failure: try plain text as fallback
logger.warning("[%s] Send failed: %s — trying plain-text fallback", self.name, error_str)
fallback_result = await self.send(
chat_id=chat_id,
content=f"(Response formatting failed, plain text:)\n\n{content[:3500]}",
reply_to=reply_to,
metadata=metadata,
)
if not fallback_result.success:
logger.error("[%s] Fallback send also failed: %s", self.name, fallback_result.error)
return fallback_result
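The retry delays above work out to `base_delay * 2 ** (attempt - 1)` plus up to one second of jitter. Isolated as a hypothetical helper (not part of the adapter), the schedule is easy to inspect:

```python
import random

def retry_delays(max_retries: int = 2, base_delay: float = 2.0):
    """Yield the backoff delay used before each retry attempt (1-indexed)."""
    for attempt in range(1, max_retries + 1):
        # Exponential growth plus jitter, as in _send_with_retry above
        yield base_delay * (2 ** (attempt - 1)) + random.uniform(0, 1)
```

With the defaults this waits roughly 2-3s before the first retry and 4-5s before the second.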
async def handle_message(self, event: MessageEvent) -> None:
"""
Process an incoming message.
@ -855,7 +1005,7 @@ class BasePlatformAdapter(ABC):
# simultaneous messages. Queue them without interrupting the active run,
# then process them immediately after the current task finishes.
if event.message_type == MessageType.PHOTO:
print(f"[{self.name}] 🖼️ Queuing photo follow-up for session {session_key} without interrupt")
logger.debug("[%s] Queuing photo follow-up for session %s without interrupt", self.name, session_key)
existing = self._pending_messages.get(session_key)
if existing and existing.message_type == MessageType.PHOTO:
existing.media_urls.extend(event.media_urls)
@ -870,7 +1020,7 @@ class BasePlatformAdapter(ABC):
return # Don't interrupt now - will run after current task completes
# Default behavior for non-photo follow-ups: interrupt the running agent
print(f"[{self.name}] ⚡ New message while session {session_key} is active - triggering interrupt")
logger.debug("[%s] New message while session %s is active — triggering interrupt", self.name, session_key)
self._pending_messages[session_key] = event
# Signal the interrupt (the processing task checks this)
self._active_sessions[session_key].set()
@ -982,26 +1132,13 @@ class BasePlatformAdapter(ABC):
# Send the text portion
if text_content:
logger.info("[%s] Sending response (%d chars) to %s", self.name, len(text_content), event.source.chat_id)
result = await self.send(
result = await self._send_with_retry(
chat_id=event.source.chat_id,
content=text_content,
reply_to=event.message_id,
metadata=_thread_metadata,
)
# Log send failures (don't raise - user already saw tool progress)
if not result.success:
print(f"[{self.name}] Failed to send response: {result.error}")
# Try sending without markdown as fallback
fallback_result = await self.send(
chat_id=event.source.chat_id,
content=f"(Response formatting failed, plain text:)\n\n{text_content[:3500]}",
reply_to=event.message_id,
metadata=_thread_metadata,
)
if not fallback_result.success:
print(f"[{self.name}] Fallback send also failed: {fallback_result.error}")
# Human-like pacing delay between text and media
human_delay = self._get_human_delay()
@ -1069,9 +1206,9 @@ class BasePlatformAdapter(ABC):
)
if not media_result.success:
print(f"[{self.name}] Failed to send media ({ext}): {media_result.error}")
logger.warning("[%s] Failed to send media (%s): %s", self.name, ext, media_result.error)
except Exception as media_err:
print(f"[{self.name}] Error sending media: {media_err}")
logger.warning("[%s] Error sending media: %s", self.name, media_err)
# Send auto-detected local files as native attachments
for file_path in local_files:
@ -1103,7 +1240,7 @@ class BasePlatformAdapter(ABC):
# Check if there's a pending message that was queued during our processing
if session_key in self._pending_messages:
pending_event = self._pending_messages.pop(session_key)
print(f"[{self.name}] 📨 Processing queued message from interrupt")
logger.debug("[%s] Processing queued message from interrupt", self.name)
# Clean up current session before processing pending
if session_key in self._active_sessions:
del self._active_sessions[session_key]
@ -1117,9 +1254,7 @@ class BasePlatformAdapter(ABC):
return # Already cleaned up
except Exception as e:
print(f"[{self.name}] Error handling message: {e}")
import traceback
traceback.print_exc()
logger.error("[%s] Error handling message: %s", self.name, e, exc_info=True)
# Send the error to the user so they aren't left with radio silence
try:
error_type = type(e).__name__


@ -486,6 +486,17 @@ class DiscordAdapter(BasePlatformAdapter):
return False
try:
# Acquire scoped lock to prevent duplicate bot token usage
from gateway.status import acquire_scoped_lock
self._token_lock_identity = self.config.token
acquired, existing = acquire_scoped_lock('discord-bot-token', self._token_lock_identity, metadata={'platform': 'discord'})
if not acquired:
owner_pid = existing.get('pid') if isinstance(existing, dict) else None
message = f'Discord bot token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
logger.error('[%s] %s', self.name, message)
self._set_fatal_error('discord_token_lock', message, retryable=False)
return False
# Set up intents -- members intent needed for username-to-ID resolution
intents = Intents.default()
intents.message_content = True
@ -550,6 +561,22 @@ class DiscordAdapter(BasePlatformAdapter):
return
# "all" falls through to handle_message
# If the message @mentions other users but NOT the bot, the
# sender is talking to someone else — stay silent. Only
# applies in server channels; in DMs the user is always
# talking to the bot (mentions are just references).
# Controlled by DISCORD_IGNORE_NO_MENTION (default: true).
_ignore_no_mention = os.getenv(
"DISCORD_IGNORE_NO_MENTION", "true"
).lower() in ("true", "1", "yes")
if _ignore_no_mention and message.mentions and not isinstance(message.channel, discord.DMChannel):
_bot_mentioned = (
self._client.user is not None
and self._client.user in message.mentions
)
if not _bot_mentioned:
return # Talking to someone else, don't interrupt
await self._handle_message(message)
@self._client.event
@ -622,6 +649,16 @@ class DiscordAdapter(BasePlatformAdapter):
self._running = False
self._client = None
self._ready_event.clear()
# Release the token lock
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('discord-bot-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
logger.info("[%s] Disconnected", self.name)
async def send(
@ -1413,15 +1450,23 @@ class DiscordAdapter(BasePlatformAdapter):
command_text: str,
followup_msg: str | None = None,
) -> None:
"""Common handler for simple slash commands that dispatch a command string."""
"""Common handler for simple slash commands that dispatch a command string.
Defers the interaction (shows "thinking..."), dispatches the command,
then cleans up the deferred response. If *followup_msg* is provided
the "thinking..." indicator is replaced with that text; otherwise it
is deleted so the channel isn't cluttered.
"""
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, command_text)
await self.handle_message(event)
if followup_msg:
try:
await interaction.followup.send(followup_msg, ephemeral=True)
except Exception as e:
logger.debug("Discord followup failed: %s", e)
try:
if followup_msg:
await interaction.edit_original_response(content=followup_msg)
else:
await interaction.delete_original_response()
except Exception as e:
logger.debug("Discord interaction cleanup failed: %s", e)
def _register_slash_commands(self) -> None:
"""Register Discord slash commands on the command tree."""
@ -1446,9 +1491,7 @@ class DiscordAdapter(BasePlatformAdapter):
@tree.command(name="reasoning", description="Show or change reasoning effort")
@discord.app_commands.describe(effort="Reasoning effort: xhigh, high, medium, low, minimal, or none.")
async def slash_reasoning(interaction: discord.Interaction, effort: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/reasoning {effort}".strip())
await self.handle_message(event)
await self._run_simple_slash(interaction, f"/reasoning {effort}".strip())
@tree.command(name="personality", description="Set a personality")
@discord.app_commands.describe(name="Personality name. Leave empty to list available.")
@ -1521,9 +1564,7 @@ class DiscordAdapter(BasePlatformAdapter):
discord.app_commands.Choice(name="status — show current mode", value="status"),
])
async def slash_voice(interaction: discord.Interaction, mode: str = ""):
await interaction.response.defer(ephemeral=True)
event = self._build_slash_event(interaction, f"/voice {mode}".strip())
await self.handle_message(event)
await self._run_simple_slash(interaction, f"/voice {mode}".strip())
@tree.command(name="update", description="Update Hermes Agent to the latest version")
async def slash_update(interaction: discord.Interaction):
@ -2096,6 +2137,11 @@ class DiscordAdapter(BasePlatformAdapter):
if pending_text_injection:
event_text = f"{pending_text_injection}\n\n{event_text}" if event_text else pending_text_injection
# Defense-in-depth: prevent empty user messages from entering session
# (can happen when user sends @mention-only with no other text)
if not event_text or not event_text.strip():
event_text = "(The user sent a message with no text content)"
event = MessageEvent(
text=event_text,
message_type=msg_type,


@ -43,6 +43,20 @@ from gateway.platforms.base import (
from gateway.config import Platform, PlatformConfig
logger = logging.getLogger(__name__)
# Automated sender patterns — emails from these are silently ignored
_NOREPLY_PATTERNS = (
"noreply", "no-reply", "no_reply", "donotreply", "do-not-reply",
"mailer-daemon", "postmaster", "bounce", "notifications@",
"automated@", "auto-confirm", "auto-reply", "automailer",
)
# RFC headers that indicate bulk/automated mail
_AUTOMATED_HEADERS = {
"Auto-Submitted": lambda v: v.lower() != "no",
"Precedence": lambda v: v.lower() in ("bulk", "list", "junk"),
"X-Auto-Response-Suppress": lambda v: bool(v),
"List-Unsubscribe": lambda v: bool(v),
}
# Gmail-safe max length per email body
MAX_MESSAGE_LENGTH = 50_000
@ -50,7 +64,17 @@ MAX_MESSAGE_LENGTH = 50_000
# Supported image extensions for inline detection
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
def _is_automated_sender(address: str, headers: dict) -> bool:
"""Return True if this email is from an automated/noreply source."""
addr = address.lower()
if any(pattern in addr for pattern in _NOREPLY_PATTERNS):
return True
for header, check in _AUTOMATED_HEADERS.items():
value = headers.get(header, "")
if value and check(value):
return True
return False
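The noreply filter above combines substring matching on the sender address with RFC-header checks. A standalone sketch of the same heuristic (patterns abbreviated for illustration; the adapter's lists are longer):

```python
# Minimal sketch of the automated-sender heuristic above.
# Pattern and header lists abbreviated; names mirror the adapter's.
NOREPLY_PATTERNS = ("noreply", "no-reply", "mailer-daemon", "bounce")

AUTOMATED_HEADERS = {
    "Auto-Submitted": lambda v: v.lower() != "no",  # RFC 3834
    "Precedence": lambda v: v.lower() in ("bulk", "list", "junk"),
    "List-Unsubscribe": lambda v: bool(v),
}

def is_automated_sender(address: str, headers: dict) -> bool:
    """True if the address or any bulk-mail header marks this as automated."""
    addr = address.lower()
    if any(p in addr for p in NOREPLY_PATTERNS):
        return True
    return any(
        headers.get(h, "") and check(headers[h])
        for h, check in AUTOMATED_HEADERS.items()
    )
```

Note the `Auto-Submitted` check: RFC 3834 defines `no` as the explicit human-origin value, so only a non-`no` value counts as automated.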
def check_email_requirements() -> bool:
"""Check if email platform dependencies are available."""
addr = os.getenv("EMAIL_ADDRESS")
@ -213,6 +237,7 @@ class EmailAdapter(BasePlatformAdapter):
# Track message IDs we've already processed to avoid duplicates
self._seen_uids: set = set()
self._seen_uids_max: int = 2000 # cap to prevent unbounded memory growth
self._poll_task: Optional[asyncio.Task] = None
# Map chat_id (sender email) -> last subject + message-id for threading
@ -220,6 +245,26 @@ class EmailAdapter(BasePlatformAdapter):
logger.info("[Email] Adapter initialized for %s", self._address)
def _trim_seen_uids(self) -> None:
"""Keep only the most recent UIDs to prevent unbounded memory growth.
IMAP UIDs are monotonically increasing integers. When the set grows
beyond the cap, we keep only the highest half; old UIDs are safe to
drop because new messages always have higher UIDs and IMAP's UNSEEN
flag prevents re-delivery regardless.
"""
if len(self._seen_uids) <= self._seen_uids_max:
return
try:
# UIDs are bytes like b'1234' — sort numerically and keep top half
sorted_uids = sorted(self._seen_uids, key=lambda u: int(u))
keep = self._seen_uids_max // 2
self._seen_uids = set(sorted_uids[-keep:])
logger.debug("[Email] Trimmed seen UIDs to %d entries", len(self._seen_uids))
except (ValueError, TypeError):
# Fallback: just clear old entries if sort fails
self._seen_uids = set(list(self._seen_uids)[-self._seen_uids_max // 2:])
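The trimming rule can be exercised in isolation. A sketch with the cap as a parameter (the adapter uses 2000) and UIDs as the bytes values IMAP actually returns:

```python
def trim_seen_uids(seen: set, cap: int) -> set:
    """Keep the numerically-highest half of IMAP UIDs once the cap is hit.

    Mirrors _trim_seen_uids above: UIDs only grow, so dropping the low
    half can never cause a new message to be reprocessed.
    """
    if len(seen) <= cap:
        return seen
    # UIDs arrive as bytes like b'1234'; sort numerically, keep top half
    sorted_uids = sorted(seen, key=lambda u: int(u))
    return set(sorted_uids[-(cap // 2):])
```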
async def connect(self) -> bool:
"""Connect to the IMAP server and start polling for new messages."""
try:
@ -232,6 +277,8 @@ class EmailAdapter(BasePlatformAdapter):
if status == "OK" and data and data[0]:
for uid in data[0].split():
self._seen_uids.add(uid)
# Keep only the most recent UIDs to prevent unbounded growth
self._trim_seen_uids()
imap.logout()
logger.info("[Email] IMAP connection test passed. %d existing messages skipped.", len(self._seen_uids))
except Exception as e:
@ -290,52 +337,63 @@ class EmailAdapter(BasePlatformAdapter):
results = []
try:
imap = imaplib.IMAP4_SSL(self._imap_host, self._imap_port, timeout=30)
imap.login(self._address, self._password)
imap.select("INBOX")
try:
imap.login(self._address, self._password)
imap.select("INBOX")
status, data = imap.uid("search", None, "UNSEEN")
if status != "OK" or not data or not data[0]:
imap.logout()
return results
status, data = imap.uid("search", None, "UNSEEN")
if status != "OK" or not data or not data[0]:
return results
for uid in data[0].split():
if uid in self._seen_uids:
continue
self._seen_uids.add(uid)
for uid in data[0].split():
if uid in self._seen_uids:
continue
self._seen_uids.add(uid)
# Trim periodically to prevent unbounded memory growth
if len(self._seen_uids) > self._seen_uids_max:
self._trim_seen_uids()
status, msg_data = imap.uid("fetch", uid, "(RFC822)")
if status != "OK":
continue
status, msg_data = imap.uid("fetch", uid, "(RFC822)")
if status != "OK":
continue
raw_email = msg_data[0][1]
msg = email_lib.message_from_bytes(raw_email)
raw_email = msg_data[0][1]
msg = email_lib.message_from_bytes(raw_email)
sender_raw = msg.get("From", "")
sender_addr = _extract_email_address(sender_raw)
sender_name = _decode_header_value(sender_raw)
# Remove email from name if present
if "<" in sender_name:
sender_name = sender_name.split("<")[0].strip().strip('"')
sender_raw = msg.get("From", "")
sender_addr = _extract_email_address(sender_raw)
sender_name = _decode_header_value(sender_raw)
# Remove email from name if present
if "<" in sender_name:
sender_name = sender_name.split("<")[0].strip().strip('"')
subject = _decode_header_value(msg.get("Subject", "(no subject)"))
message_id = msg.get("Message-ID", "")
in_reply_to = msg.get("In-Reply-To", "")
body = _extract_text_body(msg)
attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)
subject = _decode_header_value(msg.get("Subject", "(no subject)"))
message_id = msg.get("Message-ID", "")
in_reply_to = msg.get("In-Reply-To", "")
# Skip automated/noreply senders before any processing
msg_headers = dict(msg.items())
if _is_automated_sender(sender_addr, msg_headers):
logger.debug("[Email] Skipping automated sender: %s", sender_addr)
continue
body = _extract_text_body(msg)
attachments = _extract_attachments(msg, skip_attachments=self._skip_attachments)
results.append({
"uid": uid,
"sender_addr": sender_addr,
"sender_name": sender_name,
"subject": subject,
"message_id": message_id,
"in_reply_to": in_reply_to,
"body": body,
"attachments": attachments,
"date": msg.get("Date", ""),
})
imap.logout()
results.append({
"uid": uid,
"sender_addr": sender_addr,
"sender_name": sender_name,
"subject": subject,
"message_id": message_id,
"in_reply_to": in_reply_to,
"body": body,
"attachments": attachments,
"date": msg.get("Date", ""),
})
finally:
try:
imap.logout()
except Exception:
pass
except Exception as e:
logger.error("[Email] IMAP fetch error: %s", e)
return results
@ -348,6 +406,11 @@ class EmailAdapter(BasePlatformAdapter):
if sender_addr == self._address.lower():
return
# Never reply to automated senders
if _is_automated_sender(sender_addr, {}):
logger.debug("[Email] Dropping automated sender at dispatch: %s", sender_addr)
return
subject = msg_data["subject"]
body = msg_data["body"].strip()
attachments = msg_data["attachments"]
@ -443,10 +506,15 @@ class EmailAdapter(BasePlatformAdapter):
msg.attach(MIMEText(body, "plain", "utf-8"))
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port, timeout=30)
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
try:
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
finally:
try:
smtp.quit()
except Exception:
smtp.close()
logger.info("[Email] Sent reply to %s (subject: %s)", to_addr, subject)
return msg_id
@ -530,10 +598,15 @@ class EmailAdapter(BasePlatformAdapter):
msg.attach(part)
smtp = smtplib.SMTP(self._smtp_host, self._smtp_port, timeout=30)
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
smtp.quit()
try:
smtp.starttls(context=ssl.create_default_context())
smtp.login(self._address, self._password)
smtp.send_message(msg)
finally:
try:
smtp.quit()
except Exception:
smtp.close()
return msg_id

3255
gateway/platforms/feishu.py Normal file

File diff suppressed because it is too large


@ -40,7 +40,9 @@ logger = logging.getLogger(__name__)
MAX_MESSAGE_LENGTH = 4000
# Store directory for E2EE keys and sync state.
_STORE_DIR = Path.home() / ".hermes" / "matrix" / "store"
# Uses get_hermes_dir() so each profile gets its own Matrix store.
from hermes_constants import get_hermes_dir as _get_hermes_dir
_STORE_DIR = _get_hermes_dir("platforms/matrix/store", "matrix/store")
# Grace period: ignore messages older than this many seconds before startup.
_STARTUP_GRACE_SECONDS = 5
@ -161,22 +163,49 @@ class MatrixAdapter(BasePlatformAdapter):
# Authenticate.
if self._access_token:
client.access_token = self._access_token
# Resolve user_id if not set.
if not self._user_id:
resp = await client.whoami()
if isinstance(resp, nio.WhoamiResponse):
self._user_id = resp.user_id
client.user_id = resp.user_id
logger.info("Matrix: authenticated as %s", self._user_id)
else:
logger.error(
"Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER"
# With access-token auth, always resolve whoami so we validate the
# token and learn the device_id. The device_id matters for E2EE:
# without it, matrix-nio can send plain messages but may fail to
# decrypt inbound encrypted events or encrypt outbound room sends.
resp = await client.whoami()
if isinstance(resp, nio.WhoamiResponse):
resolved_user_id = getattr(resp, "user_id", "") or self._user_id
resolved_device_id = getattr(resp, "device_id", "")
if resolved_user_id:
self._user_id = resolved_user_id
# restore_login() is the matrix-nio path that binds the access
# token to a specific device and loads the crypto store.
if resolved_device_id and hasattr(client, "restore_login"):
client.restore_login(
self._user_id or resolved_user_id,
resolved_device_id,
self._access_token,
)
await client.close()
return False
else:
if self._user_id:
client.user_id = self._user_id
if resolved_device_id:
client.device_id = resolved_device_id
client.access_token = self._access_token
if self._encryption:
logger.warning(
"Matrix: access-token login did not restore E2EE state; "
"encrypted rooms may fail until a device_id is available"
)
logger.info(
"Matrix: using access token for %s%s",
self._user_id or "(unknown user)",
f" (device {resolved_device_id})" if resolved_device_id else "",
)
else:
client.user_id = self._user_id
logger.info("Matrix: using access token for %s", self._user_id)
logger.error(
"Matrix: whoami failed — check MATRIX_ACCESS_TOKEN and MATRIX_HOMESERVER"
)
await client.close()
return False
elif self._password and self._user_id:
resp = await client.login(
self._password,
@ -194,13 +223,18 @@ class MatrixAdapter(BasePlatformAdapter):
return False
# If E2EE is enabled, load the crypto store.
if self._encryption and hasattr(client, "olm"):
if self._encryption and getattr(client, "olm", None):
try:
if client.should_upload_keys:
await client.keys_upload()
logger.info("Matrix: E2EE crypto initialized")
except Exception as exc:
logger.warning("Matrix: crypto init issue: %s", exc)
elif self._encryption:
logger.warning(
"Matrix: E2EE requested but crypto store is not loaded; "
"encrypted rooms may fail"
)
# Register event callbacks.
client.add_event_callback(self._on_room_message, nio.RoomMessageText)
@ -230,6 +264,7 @@ class MatrixAdapter(BasePlatformAdapter):
)
# Build DM room cache from m.direct account data.
await self._refresh_dm_cache()
await self._run_e2ee_maintenance()
else:
logger.warning("Matrix: initial sync returned %s", type(resp).__name__)
@ -301,13 +336,48 @@ class MatrixAdapter(BasePlatformAdapter):
relates_to["m.in_reply_to"] = {"event_id": reply_to}
msg_content["m.relates_to"] = relates_to
resp = await self._client.room_send(
chat_id,
"m.room.message",
msg_content,
)
async def _room_send_once(*, ignore_unverified_devices: bool = False):
return await asyncio.wait_for(
self._client.room_send(
chat_id,
"m.room.message",
msg_content,
ignore_unverified_devices=ignore_unverified_devices,
),
timeout=45,
)
try:
resp = await _room_send_once(ignore_unverified_devices=False)
except Exception as exc:
retryable = isinstance(exc, asyncio.TimeoutError)
olm_unverified = getattr(nio, "OlmUnverifiedDeviceError", None)
send_retry = getattr(nio, "SendRetryError", None)
if isinstance(olm_unverified, type) and isinstance(exc, olm_unverified):
retryable = True
if isinstance(send_retry, type) and isinstance(exc, send_retry):
retryable = True
if not retryable:
logger.error("Matrix: failed to send to %s: %s", chat_id, exc)
return SendResult(success=False, error=str(exc))
logger.warning(
"Matrix: initial encrypted send to %s failed (%s); "
"retrying after E2EE maintenance with ignored unverified devices",
chat_id,
exc,
)
await self._run_e2ee_maintenance()
try:
resp = await _room_send_once(ignore_unverified_devices=True)
except Exception as retry_exc:
logger.error("Matrix: failed to send to %s after retry: %s", chat_id, retry_exc)
return SendResult(success=False, error=str(retry_exc))
if isinstance(resp, nio.RoomSendResponse):
last_event_id = resp.event_id
logger.info("Matrix: sent event %s to %s", last_event_id, chat_id)
else:
err = getattr(resp, "message", str(resp))
logger.error("Matrix: failed to send to %s: %s", chat_id, err)
@ -551,9 +621,23 @@ class MatrixAdapter(BasePlatformAdapter):
async def _sync_loop(self) -> None:
"""Continuously sync with the homeserver."""
import nio
while not self._closing:
try:
await self._client.sync(timeout=30000)
resp = await self._client.sync(timeout=30000)
if isinstance(resp, nio.SyncError):
if self._closing:
return
logger.warning(
"Matrix: sync returned %s: %s — retrying in 5s",
type(resp).__name__,
getattr(resp, "message", resp),
)
await asyncio.sleep(5)
continue
await self._run_e2ee_maintenance()
except asyncio.CancelledError:
return
except Exception as exc:
@ -562,6 +646,38 @@ class MatrixAdapter(BasePlatformAdapter):
logger.warning("Matrix: sync error: %s — retrying in 5s", exc)
await asyncio.sleep(5)
async def _run_e2ee_maintenance(self) -> None:
"""Run matrix-nio E2EE housekeeping between syncs.
Hermes uses a custom sync loop instead of matrix-nio's sync_forever(),
so we need to explicitly drive the key management work that sync_forever()
normally handles for encrypted rooms.
"""
client = self._client
if not client or not self._encryption or not getattr(client, "olm", None):
return
tasks = [asyncio.create_task(client.send_to_device_messages())]
if client.should_upload_keys:
tasks.append(asyncio.create_task(client.keys_upload()))
if client.should_query_keys:
tasks.append(asyncio.create_task(client.keys_query()))
if client.should_claim_keys:
users = client.get_users_for_key_claiming()
if users:
tasks.append(asyncio.create_task(client.keys_claim(users)))
for task in asyncio.as_completed(tasks):
try:
await task
except asyncio.CancelledError:
raise
except Exception as exc:
logger.warning("Matrix: E2EE maintenance task failed: %s", exc)
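The maintenance helper fans the key-management calls out as tasks and shields the sync loop from any single failure. Stripped of the nio-specific calls, the pattern reduces to this (illustrative coroutines, not the matrix-nio API):

```python
import asyncio

async def run_all_isolated(coros):
    """Await every task; collect failures instead of aborting the batch,
    but always propagate cancellation (mirroring the loop above)."""
    results, errors = [], []
    tasks = [asyncio.create_task(c) for c in coros]
    for task in asyncio.as_completed(tasks):
        try:
            results.append(await task)
        except asyncio.CancelledError:
            raise
        except Exception as exc:
            errors.append(exc)
    return results, errors

# Demo: one failing task must not stop the others.
async def ok(n):
    return n

async def boom():
    raise RuntimeError("key upload failed")

results, errors = asyncio.run(run_all_isolated([ok(1), boom(), ok(2)]))
```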
# ------------------------------------------------------------------
# Event callbacks
# ------------------------------------------------------------------


@ -407,18 +407,38 @@ class MattermostAdapter(BasePlatformAdapter):
kind: str = "file",
) -> SendResult:
"""Download a URL and upload it as a file attachment."""
import asyncio
import aiohttp
try:
async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
if resp.status >= 400:
# Fall back to sending the URL as text.
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
file_data = await resp.read()
ct = resp.content_type or "application/octet-stream"
# Derive filename from URL.
fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
except Exception as exc:
logger.warning("Mattermost: failed to download %s: %s", url, exc)
last_exc = None
file_data = None
ct = "application/octet-stream"
fname = url.rsplit("/", 1)[-1].split("?")[0] or f"{kind}.png"
for attempt in range(3):
try:
async with self._session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
if resp.status >= 500 or resp.status == 429:
if attempt < 2:
logger.debug("Mattermost download retry %d/2 for %s (status %d)",
attempt + 1, url[:80], resp.status)
await asyncio.sleep(1.5 * (attempt + 1))
continue
if resp.status >= 400:
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
file_data = await resp.read()
ct = resp.content_type or "application/octet-stream"
break
except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
last_exc = exc
if attempt < 2:
await asyncio.sleep(1.5 * (attempt + 1))
continue
logger.warning("Mattermost: failed to download %s after %d attempts: %s", url, attempt + 1, exc)
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
if file_data is None:
logger.warning("Mattermost: download returned no data for %s", url)
return await self.send(chat_id, f"{caption or ''}\n{url}".strip(), reply_to)
file_id = await self._upload_file(chat_id, file_data, fname, ct)
@ -583,9 +603,19 @@ class MattermostAdapter(BasePlatformAdapter):
# For DMs, user_id is sufficient. For channels, check for @mention.
message_text = post.get("message", "")
# Mention-only mode: skip channel messages that don't @mention the bot.
# DMs (type "D") are always processed.
# Mention-gating for non-DM channels.
# Config (env vars):
# MATTERMOST_REQUIRE_MENTION: Require @mention in channels (default: true)
# MATTERMOST_FREE_RESPONSE_CHANNELS: Channel IDs where bot responds without mention
if channel_type_raw != "D":
require_mention = os.getenv(
"MATTERMOST_REQUIRE_MENTION", "true"
).lower() not in ("false", "0", "no")
free_channels_raw = os.getenv("MATTERMOST_FREE_RESPONSE_CHANNELS", "")
free_channels = {ch.strip() for ch in free_channels_raw.split(",") if ch.strip()}
is_free_channel = channel_id in free_channels
mention_patterns = [
f"@{self._bot_username}",
f"@{self._bot_user_id}",
@ -594,13 +624,21 @@ class MattermostAdapter(BasePlatformAdapter):
pattern.lower() in message_text.lower()
for pattern in mention_patterns
)
if not has_mention:
if require_mention and not is_free_channel and not has_mention:
logger.debug(
"Mattermost: skipping non-DM message without @mention (channel=%s)",
channel_id,
)
return
# Strip @mention from the message text so the agent sees clean input.
if has_mention:
for pattern in mention_patterns:
message_text = re.sub(
re.escape(pattern), "", message_text, flags=re.IGNORECASE
).strip()
# Resolve sender info.
sender_id = post.get("user_id", "")
sender_name = data.get("sender_name", "").lstrip("@") or sender_id
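The gating decision above is driven entirely by two environment variables. A standalone sketch of the same parse-and-decide step, with the environment injected for testing:

```python
import os

def should_process(channel_type: str, channel_id: str, has_mention: bool,
                   env=os.environ) -> bool:
    """DMs always pass; channels need an @mention unless gating is
    disabled or the channel is allow-listed, matching the logic above."""
    if channel_type == "D":
        return True
    require_mention = env.get(
        "MATTERMOST_REQUIRE_MENTION", "true"
    ).lower() not in ("false", "0", "no")
    free = {c.strip()
            for c in env.get("MATTERMOST_FREE_RESPONSE_CHANNELS", "").split(",")
            if c.strip()}
    return has_mention or not require_mention or channel_id in free
```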


@ -22,7 +22,7 @@ import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Optional, Any
from urllib.parse import unquote
from urllib.parse import quote, unquote
import httpx
@ -184,6 +184,8 @@ class SignalAdapter(BasePlatformAdapter):
self._recent_sent_timestamps: set = set()
self._max_recent_timestamps = 50
self._phone_lock_identity: Optional[str] = None
logger.info("Signal adapter initialized: url=%s account=%s groups=%s",
self.http_url, _redact_phone(self.account),
"enabled" if self.group_allow_from else "disabled")
@ -198,6 +200,29 @@ class SignalAdapter(BasePlatformAdapter):
logger.error("Signal: SIGNAL_HTTP_URL and SIGNAL_ACCOUNT are required")
return False
# Acquire scoped lock to prevent duplicate Signal listeners for the same phone
try:
from gateway.status import acquire_scoped_lock
self._phone_lock_identity = self.account
acquired, existing = acquire_scoped_lock(
"signal-phone",
self._phone_lock_identity,
metadata={"platform": self.platform.value},
)
if not acquired:
owner_pid = existing.get("pid") if isinstance(existing, dict) else None
message = (
"Another local Hermes gateway is already using this Signal account"
+ (f" (PID {owner_pid})." if owner_pid else ".")
+ " Stop the other gateway before starting a second Signal listener."
)
logger.error("Signal: %s", message)
self._set_fatal_error("signal_phone_lock", message, retryable=False)
return False
except Exception as e:
logger.warning("Signal: Could not acquire phone lock (non-fatal): %s", e)
self.client = httpx.AsyncClient(timeout=30.0)
# Health check — verify signal-cli daemon is reachable
@ -245,6 +270,14 @@ class SignalAdapter(BasePlatformAdapter):
await self.client.aclose()
self.client = None
if self._phone_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("signal-phone", self._phone_lock_identity)
except Exception as e:
logger.warning("Signal: Error releasing phone lock: %s", e, exc_info=True)
self._phone_lock_identity = None
logger.info("Signal: disconnected")
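`acquire_scoped_lock`/`release_scoped_lock` live in `gateway.status` and are not shown in this diff; the contract the adapter relies on is roughly the following (illustrative in-memory stand-in, not the real implementation, which must survive across processes):

```python
import os

# Illustrative in-memory stand-in for gateway.status's scoped locks.
_LOCKS: dict = {}

def acquire_scoped_lock(scope: str, identity: str, metadata=None):
    """Return (acquired, existing_metadata). Second caller for the same
    (scope, identity) pair is refused and told who holds the lock."""
    key = (scope, identity)
    if key in _LOCKS:
        return False, _LOCKS[key]
    _LOCKS[key] = {"pid": os.getpid(), **(metadata or {})}
    return True, None

def release_scoped_lock(scope: str, identity: str) -> None:
    """Idempotent release, safe to call on an already-released lock."""
    _LOCKS.pop((scope, identity), None)
```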
# ------------------------------------------------------------------
@ -253,7 +286,7 @@ class SignalAdapter(BasePlatformAdapter):
async def _sse_listener(self) -> None:
"""Listen for SSE events from signal-cli daemon."""
url = f"{self.http_url}/api/v1/events?account={self.account}"
url = f"{self.http_url}/api/v1/events?account={quote(self.account, safe='')}"
backoff = SSE_RETRY_DELAY_INITIAL
while self._running:
@ -279,6 +312,12 @@ class SignalAdapter(BasePlatformAdapter):
line = line.strip()
if not line:
continue
# SSE keepalive comments (":") prove the connection
# is alive — update activity so the health monitor
# doesn't report false idle warnings.
if line.startswith(":"):
self._last_sse_activity = time.time()
continue
# Parse SSE data lines
if line.startswith("data:"):
data_str = line[5:].strip()
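SSE frames are line-oriented: comment lines start with `:` (signal-cli uses them as keepalives) and payload lines start with `data:`. A sketch of the classification step the listener performs:

```python
def classify_sse_line(line: str):
    """Return ('keepalive', None), ('data', payload), ('empty', None),
    or ('other', line) for a single SSE line, as in the loop above."""
    line = line.strip()
    if not line:
        return ("empty", None)
    if line.startswith(":"):
        # Comment frame: proves the connection is alive, carries no data.
        return ("keepalive", None)
    if line.startswith("data:"):
        return ("data", line[5:].strip())
    return ("other", line)
```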
@ -515,7 +554,7 @@ class SignalAdapter(BasePlatformAdapter):
"""Fetch an attachment via JSON-RPC and cache it. Returns (path, ext)."""
result = await self._rpc("getAttachment", {
"account": self.account,
"attachmentId": attachment_id,
"id": attachment_id,
})
if not result:


@ -93,6 +93,17 @@ class SlackAdapter(BasePlatformAdapter):
return False
try:
# Acquire scoped lock to prevent duplicate app token usage
from gateway.status import acquire_scoped_lock
self._token_lock_identity = app_token
acquired, existing = acquire_scoped_lock('slack-app-token', app_token, metadata={'platform': 'slack'})
if not acquired:
owner_pid = existing.get('pid') if isinstance(existing, dict) else None
message = 'Slack app token already in use' + (f' (PID {owner_pid})' if owner_pid else '') + '. Stop the other gateway first.'
logger.error('[%s] %s', self.name, message)
self._set_fatal_error('slack_token_lock', message, retryable=False)
return False
self._app = AsyncApp(token=bot_token)
# Get our own bot user ID for mention detection
@ -138,6 +149,16 @@ class SlackAdapter(BasePlatformAdapter):
except Exception as e: # pragma: no cover - defensive logging
logger.warning("[Slack] Error while closing Socket Mode handler: %s", e, exc_info=True)
self._running = False
# Release the token lock (use stored identity, not re-read env)
try:
from gateway.status import release_scoped_lock
if getattr(self, '_token_lock_identity', None):
release_scoped_lock('slack-app-token', self._token_lock_identity)
self._token_lock_identity = None
except Exception:
pass
logger.info("[Slack] Disconnected")
async def send(
@ -819,33 +840,65 @@ class SlackAdapter(BasePlatformAdapter):
await self.handle_message(event)
async def _download_slack_file(self, url: str, ext: str, audio: bool = False) -> str:
"""Download a Slack file using the bot token for auth."""
"""Download a Slack file using the bot token for auth, with retry."""
import asyncio
import httpx
bot_token = self.config.token
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
last_exc = None
if audio:
from gateway.platforms.base import cache_audio_from_bytes
return cache_audio_from_bytes(response.content, ext)
else:
from gateway.platforms.base import cache_image_from_bytes
return cache_image_from_bytes(response.content, ext)
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
for attempt in range(3):
try:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
if audio:
from gateway.platforms.base import cache_audio_from_bytes
return cache_audio_from_bytes(response.content, ext)
else:
from gateway.platforms.base import cache_image_from_bytes
return cache_image_from_bytes(response.content, ext)
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
last_exc = exc
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
raise
if attempt < 2:
logger.debug("Slack file download retry %d/2 for %s: %s",
attempt + 1, url[:80], exc)
await asyncio.sleep(1.5 * (attempt + 1))
continue
raise
raise last_exc
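The retry condition here treats timeouts and any HTTP status of 429 or higher as transient, and everything below 429 as a permanent client error to raise immediately. That cutoff and the sleep schedule reduce to two small helpers:

```python
def is_retryable(status_code, timed_out: bool = False) -> bool:
    """Mirror of the condition above: retry on timeouts and any status
    of 429 or higher; other 4xx errors are permanent and raised at once."""
    return timed_out or (status_code is not None and status_code >= 429)

def backoff_schedule(attempts: int = 3):
    """Sleeps between attempts: 1.5s after the first failure, 3.0s after
    the second; no sleep follows the final attempt."""
    return [1.5 * (i + 1) for i in range(attempts - 1)]
```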
async def _download_slack_file_bytes(self, url: str) -> bytes:
"""Download a Slack file and return raw bytes."""
"""Download a Slack file and return raw bytes, with retry."""
import asyncio
import httpx
bot_token = self.config.token
last_exc = None
async with httpx.AsyncClient(timeout=30.0, follow_redirects=True) as client:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
return response.content
for attempt in range(3):
try:
response = await client.get(
url,
headers={"Authorization": f"Bearer {bot_token}"},
)
response.raise_for_status()
return response.content
except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
last_exc = exc
if isinstance(exc, httpx.HTTPStatusError) and exc.response.status_code < 429:
raise
if attempt < 2:
logger.debug("Slack file download retry %d/2 for %s: %s",
attempt + 1, url[:80], exc)
await asyncio.sleep(1.5 * (attempt + 1))
continue
raise
raise last_exc


@ -11,7 +11,7 @@ import asyncio
import logging
import os
import re
from typing import Dict, Optional, Any
from typing import Dict, List, Optional, Any
logger = logging.getLogger(__name__)
@ -25,6 +25,7 @@ try:
filters,
)
from telegram.constants import ParseMode, ChatType
from telegram.request import HTTPXRequest
TELEGRAM_AVAILABLE = True
except ImportError:
TELEGRAM_AVAILABLE = False
@ -34,6 +35,7 @@ except ImportError:
Application = Any
CommandHandler = Any
TelegramMessageHandler = Any
HTTPXRequest = Any
filters = None
ParseMode = None
ChatType = None
@ -59,6 +61,11 @@ from gateway.platforms.base import (
cache_document_from_bytes,
SUPPORTED_DOCUMENT_TYPES,
)
from gateway.platforms.telegram_network import (
TelegramFallbackTransport,
discover_fallback_ips,
parse_fallback_ip_env,
)
def check_telegram_requirements() -> bool:
@ -138,6 +145,13 @@ class TelegramAdapter(BasePlatformAdapter):
# DM Topics config from extra.dm_topics
self._dm_topics_config: List[Dict[str, Any]] = self.config.extra.get("dm_topics", [])
def _fallback_ips(self) -> list[str]:
"""Return validated fallback IPs from config (populated by _apply_env_overrides)."""
configured = self.config.extra.get("fallback_ips", []) if getattr(self.config, "extra", None) else []
if isinstance(configured, str):
configured = configured.split(",")
return parse_fallback_ip_env(",".join(str(v) for v in configured) if configured else None)
@staticmethod
def _looks_like_polling_conflict(error: Exception) -> bool:
text = str(error).lower()
@ -331,7 +345,8 @@ class TelegramAdapter(BasePlatformAdapter):
def _persist_dm_topic_thread_id(self, chat_id: int, topic_name: str, thread_id: int) -> None:
"""Save a newly created thread_id back into config.yaml so it persists across restarts."""
try:
config_path = _Path.home() / ".hermes" / "config.yaml"
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
logger.warning("[%s] Config file not found at %s, cannot persist thread_id", self.name, config_path)
return
@ -474,7 +489,26 @@ class TelegramAdapter(BasePlatformAdapter):
return False
# Build the application
self._app = Application.builder().token(self.config.token).build()
builder = Application.builder().token(self.config.token)
fallback_ips = self._fallback_ips()
if not fallback_ips:
fallback_ips = await discover_fallback_ips()
logger.info(
"[%s] Auto-discovered Telegram fallback IPs: %s",
self.name,
", ".join(fallback_ips),
)
if fallback_ips:
logger.warning(
"[%s] Telegram fallback IPs active: %s",
self.name,
", ".join(fallback_ips),
)
transport = TelegramFallbackTransport(fallback_ips)
request = HTTPXRequest(httpx_kwargs={"transport": transport})
get_updates_request = HTTPXRequest(httpx_kwargs={"transport": transport})
builder = builder.request(request).get_updates_request(get_updates_request)
self._app = builder.build()
self._bot = self._app.bot
# Register handlers
@ -674,9 +708,15 @@ class TelegramAdapter(BasePlatformAdapter):
except ImportError:
_NetErr = OSError # type: ignore[misc,assignment]
try:
from telegram.error import BadRequest as _BadReq
except ImportError:
_BadReq = None # type: ignore[assignment,misc]
for i, chunk in enumerate(chunks):
should_thread = self._should_thread_reply(reply_to, i)
reply_to_id = int(reply_to) if should_thread else None
effective_thread_id = int(thread_id) if thread_id else None
msg = None
for _send_attempt in range(3):
@ -688,7 +728,7 @@ class TelegramAdapter(BasePlatformAdapter):
text=chunk,
parse_mode=ParseMode.MARKDOWN_V2,
reply_to_message_id=reply_to_id,
message_thread_id=int(thread_id) if thread_id else None,
message_thread_id=effective_thread_id,
)
except Exception as md_error:
# Markdown parsing failed, try plain text
@ -700,12 +740,40 @@ class TelegramAdapter(BasePlatformAdapter):
text=plain_chunk,
parse_mode=None,
reply_to_message_id=reply_to_id,
message_thread_id=int(thread_id) if thread_id else None,
message_thread_id=effective_thread_id,
)
else:
raise
break # success
except _NetErr as send_err:
# BadRequest is a subclass of NetworkError in
# python-telegram-bot but represents permanent errors
# (not transient network issues). Detect and handle
# specific cases instead of blindly retrying.
if _BadReq and isinstance(send_err, _BadReq):
err_lower = str(send_err).lower()
if "thread not found" in err_lower and effective_thread_id is not None:
# Thread doesn't exist — retry without
# message_thread_id so the message still
# reaches the chat.
logger.warning(
"[%s] Thread %s not found, retrying without message_thread_id",
self.name, effective_thread_id,
)
effective_thread_id = None
continue
if "message to be replied not found" in err_lower and reply_to_id is not None:
# Original message was deleted before we
# could reply — clear reply target and retry
# so the response is still delivered.
logger.warning(
"[%s] Reply target deleted, retrying without reply_to: %s",
self.name, send_err,
)
reply_to_id = None
continue
# Other BadRequest errors are permanent — don't retry
raise
if _send_attempt < 2:
wait = 2 ** _send_attempt
logger.warning("[%s] Network error on send (attempt %d/3), retrying in %ds: %s",
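The branch above distinguishes two recoverable BadRequest messages from the rest by substring matching. The decision table reduces to a small classifier (substrings match the python-telegram-bot error messages as used above):

```python
def classify_bad_request(message: str, has_thread: bool, has_reply: bool) -> str:
    """Return which send parameter to drop before retrying, or 'permanent'
    when the BadRequest cannot be recovered by adjusting the request."""
    m = message.lower()
    if "thread not found" in m and has_thread:
        # Topic was deleted: deliver to the chat without a thread id.
        return "drop_thread"
    if "message to be replied not found" in m and has_reply:
        # Reply target was deleted: deliver without the reply reference.
        return "drop_reply"
    return "permanent"
```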
@ -1700,7 +1768,8 @@ class TelegramAdapter(BasePlatformAdapter):
recognized without a gateway restart.
"""
try:
config_path = _Path.home() / ".hermes" / "config.yaml"
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
if not config_path.exists():
return


@ -0,0 +1,245 @@
"""Telegram-specific network helpers.
Provides a hostname-preserving fallback transport for networks where
api.telegram.org resolves to an endpoint that is unreachable from the current
host. The transport keeps the logical request host and TLS SNI as
api.telegram.org while retrying the TCP connection against one or more fallback
IPv4 addresses.
"""
from __future__ import annotations
import asyncio
import ipaddress
import logging
import os
import socket
from typing import Iterable, Optional
import httpx
logger = logging.getLogger(__name__)
_TELEGRAM_API_HOST = "api.telegram.org"
# DNS-over-HTTPS providers used to discover Telegram API IPs that may differ
# from the (potentially unreachable) IP returned by the local system resolver.
_DOH_TIMEOUT = 4.0 # seconds — bounded so connect() isn't noticeably delayed
_DOH_PROVIDERS: list[dict] = [
{
"url": "https://dns.google/resolve",
"params": {"name": _TELEGRAM_API_HOST, "type": "A"},
"headers": {},
},
{
"url": "https://cloudflare-dns.com/dns-query",
"params": {"name": _TELEGRAM_API_HOST, "type": "A"},
"headers": {"Accept": "application/dns-json"},
},
]
# Last-resort IPs when DoH is also blocked. These are stable Telegram Bot API
# endpoints in the 149.154.160.0/20 block (same seed used by OpenClaw).
_SEED_FALLBACK_IPS: list[str] = ["149.154.167.220"]
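Both DoH providers above return the same JSON shape for A queries. The parsing step, separated from the network call and fed a response of the documented form (type 1 is an A record; other answer types such as CNAME are skipped):

```python
import ipaddress
import json

def extract_a_records(doh_json: str):
    """Pull valid IPv4 answers out of a DNS-over-HTTPS JSON response."""
    data = json.loads(doh_json)
    ips = []
    for answer in data.get("Answer", []):
        if answer.get("type") != 1:  # 1 = A record
            continue
        candidate = answer.get("data", "")
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # drop anything that is not a clean IPv4 literal
        ips.append(candidate)
    return ips

# Sample response in the shape dns.google / cloudflare-dns.com return.
sample = json.dumps({
    "Status": 0,
    "Answer": [
        {"name": "api.telegram.org.", "type": 1, "data": "149.154.167.220"},
        {"name": "api.telegram.org.", "type": 5, "data": "cdn.example."},
    ],
})
```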
def _resolve_proxy_url() -> str | None:
for key in ("HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY", "https_proxy", "http_proxy", "all_proxy"):
value = (os.environ.get(key) or "").strip()
if value:
return value
return None
class TelegramFallbackTransport(httpx.AsyncBaseTransport):
"""Retry Telegram Bot API requests via fallback IPs while preserving TLS/SNI.
Requests continue to target https://api.telegram.org/... logically, but on
connect failures the underlying TCP connection is retried against a known
reachable IP. This is effectively the programmatic equivalent of
``curl --resolve api.telegram.org:443:<ip>``.
"""
def __init__(self, fallback_ips: Iterable[str], **transport_kwargs):
self._fallback_ips = [ip for ip in dict.fromkeys(_normalize_fallback_ips(fallback_ips))]
proxy_url = _resolve_proxy_url()
if proxy_url and "proxy" not in transport_kwargs:
transport_kwargs["proxy"] = proxy_url
self._primary = httpx.AsyncHTTPTransport(**transport_kwargs)
self._fallbacks = {
ip: httpx.AsyncHTTPTransport(**transport_kwargs) for ip in self._fallback_ips
}
self._sticky_ip: Optional[str] = None
self._sticky_lock = asyncio.Lock()
async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
if request.url.host != _TELEGRAM_API_HOST or not self._fallback_ips:
return await self._primary.handle_async_request(request)
sticky_ip = self._sticky_ip
attempt_order: list[Optional[str]] = [sticky_ip] if sticky_ip else [None]
for ip in self._fallback_ips:
if ip != sticky_ip:
attempt_order.append(ip)
last_error: Exception | None = None
for ip in attempt_order:
candidate = request if ip is None else _rewrite_request_for_ip(request, ip)
transport = self._primary if ip is None else self._fallbacks[ip]
try:
response = await transport.handle_async_request(candidate)
if ip is not None and self._sticky_ip != ip:
async with self._sticky_lock:
if self._sticky_ip != ip:
self._sticky_ip = ip
logger.warning(
"[Telegram] Primary api.telegram.org path unreachable; using sticky fallback IP %s",
ip,
)
return response
except Exception as exc:
last_error = exc
if not _is_retryable_connect_error(exc):
raise
if ip is None:
logger.warning(
"[Telegram] Primary api.telegram.org connection failed (%s); trying fallback IPs %s",
exc,
", ".join(self._fallback_ips),
)
continue
logger.warning("[Telegram] Fallback IP %s failed: %s", ip, exc)
continue
assert last_error is not None
raise last_error
async def aclose(self) -> None:
await self._primary.aclose()
for transport in self._fallbacks.values():
await transport.aclose()
def _normalize_fallback_ips(values: Iterable[str]) -> list[str]:
normalized: list[str] = []
for value in values:
raw = str(value).strip()
if not raw:
continue
try:
addr = ipaddress.ip_address(raw)
except ValueError:
logger.warning("Ignoring invalid Telegram fallback IP: %r", raw)
continue
if addr.version != 4:
logger.warning("Ignoring non-IPv4 Telegram fallback IP: %s", raw)
continue
normalized.append(str(addr))
return normalized
def parse_fallback_ip_env(value: str | None) -> list[str]:
if not value:
return []
parts = [part.strip() for part in value.split(",")]
return _normalize_fallback_ips(parts)
def _resolve_system_dns() -> set[str]:
"""Return the IPv4 addresses that the OS resolver gives for api.telegram.org."""
try:
results = socket.getaddrinfo(_TELEGRAM_API_HOST, 443, socket.AF_INET)
return {addr[4][0] for addr in results}
except Exception:
return set()
async def _query_doh_provider(
client: httpx.AsyncClient, provider: dict
) -> list[str]:
"""Query one DoH provider and return A-record IPs."""
try:
resp = await client.get(
provider["url"], params=provider["params"], headers=provider["headers"]
)
resp.raise_for_status()
data = resp.json()
ips: list[str] = []
for answer in data.get("Answer", []):
if answer.get("type") != 1: # A record
continue
raw = answer.get("data", "").strip()
try:
ipaddress.ip_address(raw)
ips.append(raw)
except ValueError:
continue
return ips
except Exception as exc:
logger.debug("DoH query to %s failed: %s", provider["url"], exc)
return []
async def discover_fallback_ips() -> list[str]:
"""Auto-discover Telegram API IPs via DNS-over-HTTPS.
Resolves api.telegram.org through Google and Cloudflare DoH, collects all
unique IPs, and excludes the system-DNS-resolved IP (which is presumably
unreachable on this network). Falls back to a hardcoded seed list when DoH
is also unavailable.
"""
async with httpx.AsyncClient(timeout=httpx.Timeout(_DOH_TIMEOUT)) as client:
doh_tasks = [_query_doh_provider(client, p) for p in _DOH_PROVIDERS]
system_dns_task = asyncio.to_thread(_resolve_system_dns)
results = await asyncio.gather(system_dns_task, *doh_tasks, return_exceptions=True)
# results[0] = system DNS IPs (set), results[1:] = DoH IP lists
system_ips: set[str] = results[0] if isinstance(results[0], set) else set()
doh_ips: list[str] = []
for r in results[1:]:
if isinstance(r, list):
doh_ips.extend(r)
# Deduplicate preserving order, exclude system-DNS IPs
seen: set[str] = set()
candidates: list[str] = []
for ip in doh_ips:
if ip not in seen and ip not in system_ips:
seen.add(ip)
candidates.append(ip)
# Validate through existing normalization
validated = _normalize_fallback_ips(candidates)
if validated:
logger.debug("Discovered Telegram fallback IPs via DoH: %s", ", ".join(validated))
return validated
logger.info(
"DoH discovery yielded no new IPs (system DNS: %s); using seed fallback IPs %s",
", ".join(system_ips) or "unknown",
", ".join(_SEED_FALLBACK_IPS),
)
return list(_SEED_FALLBACK_IPS)
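Both DoH providers return the same JSON wire format, so extracting A records is a few lines of stdlib work. A sketch against a hand-written sample response (the sample data is illustrative, not a live answer):

```python
import ipaddress

# Illustrative dns.google-style JSON answer (not a live response)
sample = {
    "Status": 0,
    "Answer": [
        {"name": "api.telegram.org.", "type": 1, "data": "149.154.167.220"},
        {"name": "api.telegram.org.", "type": 28, "data": "2001:db8::9"},  # AAAA, skipped
    ],
}

def extract_a_records(data: dict) -> list[str]:
    """Pull valid A-record IPs out of a DoH JSON answer."""
    ips: list[str] = []
    for answer in data.get("Answer", []):
        if answer.get("type") != 1:  # 1 = A record in DNS RR-type numbering
            continue
        raw = answer.get("data", "").strip()
        try:
            ipaddress.ip_address(raw)
        except ValueError:
            continue
        ips.append(raw)
    return ips

print(extract_a_records(sample))  # ['149.154.167.220']
```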
def _rewrite_request_for_ip(request: httpx.Request, ip: str) -> httpx.Request:
original_host = request.url.host or _TELEGRAM_API_HOST
url = request.url.copy_with(host=ip)
headers = request.headers.copy()
headers["host"] = original_host
extensions = dict(request.extensions)
extensions["sni_hostname"] = original_host
return httpx.Request(
method=request.method,
url=url,
headers=headers,
stream=request.stream,
extensions=extensions,
)
def _is_retryable_connect_error(exc: Exception) -> bool:
return isinstance(exc, (httpx.ConnectTimeout, httpx.ConnectError))


@ -27,6 +27,7 @@ import hashlib
import hmac
import json
import logging
import os
import re
import subprocess
import time
@ -53,6 +54,7 @@ logger = logging.getLogger(__name__)
DEFAULT_HOST = "0.0.0.0"
DEFAULT_PORT = 8644
_INSECURE_NO_AUTH = "INSECURE_NO_AUTH"
_DYNAMIC_ROUTES_FILENAME = "webhook_subscriptions.json"
def check_webhook_requirements() -> bool:
@ -68,7 +70,10 @@ class WebhookAdapter(BasePlatformAdapter):
self._host: str = config.extra.get("host", DEFAULT_HOST)
self._port: int = int(config.extra.get("port", DEFAULT_PORT))
self._global_secret: str = config.extra.get("secret", "")
self._routes: Dict[str, dict] = config.extra.get("routes", {})
self._static_routes: Dict[str, dict] = config.extra.get("routes", {})
self._dynamic_routes: Dict[str, dict] = {}
self._dynamic_routes_mtime: float = 0.0
self._routes: Dict[str, dict] = dict(self._static_routes)
self._runner = None
# Delivery info keyed by session chat_id — consumed by send()
@ -96,6 +101,9 @@ class WebhookAdapter(BasePlatformAdapter):
# ------------------------------------------------------------------
async def connect(self) -> bool:
# Load agent-created subscriptions before validating
self._reload_dynamic_routes()
# Validate routes at startup — secret is required per route
for name, route in self._routes.items():
secret = route.get("secret", self._global_secret)
@ -110,6 +118,17 @@ class WebhookAdapter(BasePlatformAdapter):
app.router.add_get("/health", self._handle_health)
app.router.add_post("/webhooks/{route_name}", self._handle_webhook)
# Port conflict detection — fail fast if port is already in use
import socket as _socket
try:
with _socket.socket(_socket.AF_INET, _socket.SOCK_STREAM) as _s:
_s.settimeout(1)
_s.connect(('127.0.0.1', self._port))
logger.error('[webhook] Port %d already in use. Set a different port in config.yaml: platforms.webhook.port', self._port)
return False
except (ConnectionRefusedError, OSError):
pass # port is free
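The pre-flight probe above amounts to a bounded `connect()` attempt against loopback: success means something is already listening, a refused connection means the port is free. A self-contained sketch of the same check:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        try:
            s.connect((host, port))
            return True
        except (ConnectionRefusedError, OSError):
            return False

# Demo: grab an ephemeral port, listen on it, probe, release, probe again
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
print(port_in_use(port))   # True while the listener is open
listener.close()
print(port_in_use(port))   # False once it is released
```

Note the probe is racy by nature: a port that is free at probe time can still be grabbed before `TCPSite` binds, so this is a fail-fast convenience, not a guarantee.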
self._runner = web.AppRunner(app)
await self._runner.setup()
site = web.TCPSite(self._runner, self._host, self._port)
@ -182,8 +201,46 @@ class WebhookAdapter(BasePlatformAdapter):
"""GET /health — simple health check."""
return web.json_response({"status": "ok", "platform": "webhook"})
def _reload_dynamic_routes(self) -> None:
"""Reload agent-created subscriptions from disk if the file changed."""
from pathlib import Path as _Path
hermes_home = _Path(
os.getenv("HERMES_HOME", str(_Path.home() / ".hermes"))
).expanduser()
subs_path = hermes_home / _DYNAMIC_ROUTES_FILENAME
if not subs_path.exists():
if self._dynamic_routes:
self._dynamic_routes = {}
self._routes = dict(self._static_routes)
logger.debug("[webhook] Dynamic subscriptions file removed, cleared dynamic routes")
return
try:
mtime = subs_path.stat().st_mtime
if mtime <= self._dynamic_routes_mtime:
return # No change
data = json.loads(subs_path.read_text(encoding="utf-8"))
if not isinstance(data, dict):
return
# Merge: static routes take precedence over dynamic ones
self._dynamic_routes = {
k: v for k, v in data.items()
if k not in self._static_routes
}
self._routes = {**self._dynamic_routes, **self._static_routes}
self._dynamic_routes_mtime = mtime
logger.info(
"[webhook] Reloaded %d dynamic route(s): %s",
len(self._dynamic_routes),
", ".join(self._dynamic_routes.keys()) or "(none)",
)
except Exception as e:
logger.warning("[webhook] Failed to reload dynamic routes: %s", e)
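The merge semantics deserve a second look: dynamic routes are first filtered against static names and then merged with static routes last, so a config.yaml route always wins over an agent-created one of the same name. A minimal sketch (the route payloads are hypothetical):

```python
static_routes = {"github": {"secret": "s1"}}
dynamic_routes_on_disk = {
    "github": {"secret": "hijack"},   # dropped: name collides with a static route
    "stripe": {"secret": "s2"},       # accepted
}

dynamic = {k: v for k, v in dynamic_routes_on_disk.items() if k not in static_routes}
routes = {**dynamic, **static_routes}  # static entries override on any overlap

print(sorted(routes))              # ['github', 'stripe']
print(routes["github"]["secret"])  # 's1' (the static secret survives)
```

The double defence (filter plus merge order) means a writable subscriptions file cannot be used to shadow an operator-configured route's secret.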
async def _handle_webhook(self, request: "web.Request") -> "web.Response":
"""POST /webhooks/{route_name} — receive and process a webhook event."""
# Hot-reload dynamic subscriptions on each request (mtime-gated, cheap)
self._reload_dynamic_routes()
route_name = request.match_info.get("route_name", "")
route_config = self._routes.get(route_name)

gateway/platforms/wecom.py: new file, 1338 lines (diff suppressed because it is too large)


@ -26,6 +26,7 @@ from pathlib import Path
from typing import Dict, Optional, Any
from hermes_cli.config import get_hermes_home
from hermes_constants import get_hermes_dir
logger = logging.getLogger(__name__)
@ -134,13 +135,15 @@ class WhatsAppAdapter(BasePlatformAdapter):
)
self._session_path: Path = Path(config.extra.get(
"session_path",
get_hermes_home() / "whatsapp" / "session"
get_hermes_dir("platforms/whatsapp/session", "whatsapp/session")
))
self._reply_prefix: Optional[str] = config.extra.get("reply_prefix")
self._message_queue: asyncio.Queue = asyncio.Queue()
self._bridge_log_fh = None
self._bridge_log: Optional[Path] = None
self._poll_task: Optional[asyncio.Task] = None
self._http_session: Optional["aiohttp.ClientSession"] = None
self._session_lock_identity: Optional[str] = None
async def connect(self) -> bool:
"""
@ -159,6 +162,29 @@ class WhatsAppAdapter(BasePlatformAdapter):
logger.info("[%s] Bridge found at %s", self.name, bridge_path)
# Acquire scoped lock to prevent duplicate sessions
try:
from gateway.status import acquire_scoped_lock
self._session_lock_identity = str(self._session_path)
acquired, existing = acquire_scoped_lock(
"whatsapp-session",
self._session_lock_identity,
metadata={"platform": self.platform.value},
)
if not acquired:
owner_pid = existing.get("pid") if isinstance(existing, dict) else None
message = (
"Another local Hermes gateway is already using this WhatsApp session"
+ (f" (PID {owner_pid})." if owner_pid else ".")
+ " Stop the other gateway before starting a second WhatsApp bridge."
)
logger.error("[%s] %s", self.name, message)
self._set_fatal_error("whatsapp_session_lock", message, retryable=False)
return False
except Exception as e:
logger.warning("[%s] Could not acquire session lock (non-fatal): %s", self.name, e)
# Auto-install npm dependencies if node_modules doesn't exist
bridge_dir = bridge_path.parent
if not (bridge_dir / "node_modules").exists():
@ -199,6 +225,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
print(f"[{self.name}] Using existing bridge (status: {bridge_status})")
self._mark_connected()
self._bridge_process = None # Not managed by us
self._http_session = aiohttp.ClientSession()
self._poll_task = asyncio.create_task(self._poll_messages())
return True
else:
@ -304,6 +331,9 @@ class WhatsAppAdapter(BasePlatformAdapter):
print(f"[{self.name}] Bridge log: {self._bridge_log}")
print(f"[{self.name}] If session expired, re-pair: hermes whatsapp")
# Create a persistent HTTP session for all bridge communication
self._http_session = aiohttp.ClientSession()
# Start message polling task
self._poll_task = asyncio.create_task(self._poll_messages())
@ -312,6 +342,12 @@ class WhatsAppAdapter(BasePlatformAdapter):
return True
except Exception as e:
if self._session_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("whatsapp-session", self._session_lock_identity)
except Exception:
pass
logger.error("[%s] Failed to start bridge: %s", self.name, e, exc_info=True)
self._close_bridge_log()
return False
@ -369,10 +405,32 @@ class WhatsAppAdapter(BasePlatformAdapter):
else:
# Bridge was not started by us, don't kill it
print(f"[{self.name}] Disconnecting (external bridge left running)")
# Cancel the poll task explicitly
if self._poll_task and not self._poll_task.done():
self._poll_task.cancel()
try:
await self._poll_task
except (asyncio.CancelledError, Exception):
pass
self._poll_task = None
# Close the persistent HTTP session
if self._http_session and not self._http_session.closed:
await self._http_session.close()
self._http_session = None
if self._session_lock_identity:
try:
from gateway.status import release_scoped_lock
release_scoped_lock("whatsapp-session", self._session_lock_identity)
except Exception as e:
logger.warning("[%s] Error releasing WhatsApp session lock: %s", self.name, e, exc_info=True)
self._mark_disconnected()
self._bridge_process = None
self._close_bridge_log()
self._session_lock_identity = None
print(f"[{self.name}] Disconnected")
async def send(
@ -383,7 +441,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
metadata: Optional[Dict[str, Any]] = None
) -> SendResult:
"""Send a message via the WhatsApp bridge."""
if not self._running:
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
@ -391,36 +449,29 @@ class WhatsAppAdapter(BasePlatformAdapter):
try:
import aiohttp
payload = {
"chatId": chat_id,
"message": content,
}
if reply_to:
payload["replyTo"] = reply_to
async with aiohttp.ClientSession() as session:
payload = {
"chatId": chat_id,
"message": content,
}
if reply_to:
payload["replyTo"] = reply_to
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except ImportError:
return SendResult(
success=False,
error="aiohttp not installed. Run: pip install aiohttp"
)
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send",
json=payload,
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@ -431,28 +482,27 @@ class WhatsAppAdapter(BasePlatformAdapter):
content: str,
) -> SendResult:
"""Edit a previously sent message via the WhatsApp bridge."""
if not self._running:
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
return SendResult(success=False, error=bridge_exit)
try:
import aiohttp
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/edit",
json={
"chatId": chat_id,
"messageId": message_id,
"message": content,
},
timeout=aiohttp.ClientTimeout(total=15)
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=message_id)
else:
error = await resp.text()
return SendResult(success=False, error=error)
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/edit",
json={
"chatId": chat_id,
"messageId": message_id,
"message": content,
},
timeout=aiohttp.ClientTimeout(total=15)
) as resp:
if resp.status == 200:
return SendResult(success=True, message_id=message_id)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@ -465,7 +515,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
file_name: Optional[str] = None,
) -> SendResult:
"""Send any media file via bridge /send-media endpoint."""
if not self._running:
if not self._running or not self._http_session:
return SendResult(success=False, error="Not connected")
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
@ -486,22 +536,21 @@ class WhatsAppAdapter(BasePlatformAdapter):
if file_name:
payload["fileName"] = file_name
async with aiohttp.ClientSession() as session:
async with session.post(
f"http://127.0.0.1:{self._bridge_port}/send-media",
json=payload,
timeout=aiohttp.ClientTimeout(total=120),
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data,
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
async with self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/send-media",
json=payload,
timeout=aiohttp.ClientTimeout(total=120),
) as resp:
if resp.status == 200:
data = await resp.json()
return SendResult(
success=True,
message_id=data.get("messageId"),
raw_response=data,
)
else:
error = await resp.text()
return SendResult(success=False, error=error)
except Exception as e:
return SendResult(success=False, error=str(e))
@ -526,6 +575,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
image_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a local image file natively via bridge."""
return await self._send_media_to_bridge(chat_id, image_path, "image", caption)
@ -536,6 +586,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
video_path: str,
caption: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a video natively via bridge — plays inline in WhatsApp."""
return await self._send_media_to_bridge(chat_id, video_path, "video", caption)
@ -547,6 +598,7 @@ class WhatsAppAdapter(BasePlatformAdapter):
caption: Optional[str] = None,
file_name: Optional[str] = None,
reply_to: Optional[str] = None,
**kwargs,
) -> SendResult:
"""Send a document/file as a downloadable attachment via bridge."""
return await self._send_media_to_bridge(
@ -556,45 +608,43 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def send_typing(self, chat_id: str, metadata=None) -> None:
"""Send typing indicator via bridge."""
if not self._running:
if not self._running or not self._http_session:
return
if await self._check_managed_bridge_exit():
return
try:
import aiohttp
async with aiohttp.ClientSession() as session:
await session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
await self._http_session.post(
f"http://127.0.0.1:{self._bridge_port}/typing",
json={"chatId": chat_id},
timeout=aiohttp.ClientTimeout(total=5)
)
except Exception:
pass # Ignore typing indicator failures
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
"""Get information about a WhatsApp chat."""
if not self._running:
if not self._running or not self._http_session:
return {"name": "Unknown", "type": "dm"}
if await self._check_managed_bridge_exit():
return {"name": chat_id, "type": "dm"}
try:
import aiohttp
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/chat/{chat_id}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"name": data.get("name", chat_id),
"type": "group" if data.get("isGroup") else "dm",
"participants": data.get("participants", []),
}
async with self._http_session.get(
f"http://127.0.0.1:{self._bridge_port}/chat/{chat_id}",
timeout=aiohttp.ClientTimeout(total=10)
) as resp:
if resp.status == 200:
data = await resp.json()
return {
"name": data.get("name", chat_id),
"type": "group" if data.get("isGroup") else "dm",
"participants": data.get("participants", []),
}
except Exception as e:
logger.debug("Could not get WhatsApp chat info for %s: %s", chat_id, e)
@ -602,29 +652,26 @@ class WhatsAppAdapter(BasePlatformAdapter):
async def _poll_messages(self) -> None:
"""Poll the bridge for incoming messages."""
try:
import aiohttp
except ImportError:
print(f"[{self.name}] aiohttp not installed, message polling disabled")
return
import aiohttp
while self._running:
if not self._http_session:
break
bridge_exit = await self._check_managed_bridge_exit()
if bridge_exit:
print(f"[{self.name}] {bridge_exit}")
break
try:
async with aiohttp.ClientSession() as session:
async with session.get(
f"http://127.0.0.1:{self._bridge_port}/messages",
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
messages = await resp.json()
for msg_data in messages:
event = await self._build_message_event(msg_data)
if event:
await self.handle_message(event)
async with self._http_session.get(
f"http://127.0.0.1:{self._bridge_port}/messages",
timeout=aiohttp.ClientTimeout(total=30)
) as resp:
if resp.status == 200:
messages = await resp.json()
for msg_data in messages:
event = await self._build_message_event(msg_data)
if event:
await self.handle_message(event)
except asyncio.CancelledError:
break
except Exception as e:


@ -77,6 +77,7 @@ sys.path.insert(0, str(Path(__file__).parent.parent))
# Resolve Hermes home directory (respects HERMES_HOME override)
from hermes_constants import get_hermes_home
from utils import atomic_yaml_write
_hermes_home = get_hermes_home()
# Load environment variables from ~/.hermes/.env first.
@ -224,6 +225,49 @@ from gateway.session import (
from gateway.delivery import DeliveryRouter
from gateway.platforms.base import BasePlatformAdapter, MessageEvent, MessageType
def _normalize_whatsapp_identifier(value: str) -> str:
"""Strip WhatsApp JID/LID syntax down to its stable numeric identifier."""
return (
str(value or "")
.strip()
.replace("+", "", 1)
.split(":", 1)[0]
.split("@", 1)[0]
)
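A few worked inputs make the normalization order concrete: strip whitespace, drop one leading `+`, then cut the `:device` suffix, then the `@server` suffix. Reproducing the same pipeline standalone:

```python
def normalize_whatsapp_identifier(value: str) -> str:
    # Same pipeline as _normalize_whatsapp_identifier above
    return (
        str(value or "")
        .strip()
        .replace("+", "", 1)
        .split(":", 1)[0]
        .split("@", 1)[0]
    )

print(normalize_whatsapp_identifier("+4917012345678:12@s.whatsapp.net"))  # '4917012345678'
print(normalize_whatsapp_identifier("123456789@lid"))                     # '123456789'
print(normalize_whatsapp_identifier(""))                                  # ''
```

This is what lets a bare phone number in an allowlist match the full JID/LID forms the bridge reports.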
def _expand_whatsapp_auth_aliases(identifier: str) -> set:
"""Resolve WhatsApp phone/LID aliases using bridge session mapping files."""
normalized = _normalize_whatsapp_identifier(identifier)
if not normalized:
return set()
session_dir = _hermes_home / "whatsapp" / "session"
resolved = set()
queue = [normalized]
while queue:
current = queue.pop(0)
if not current or current in resolved:
continue
resolved.add(current)
for suffix in ("", "_reverse"):
mapping_path = session_dir / f"lid-mapping-{current}{suffix}.json"
if not mapping_path.exists():
continue
try:
mapped = _normalize_whatsapp_identifier(
json.loads(mapping_path.read_text(encoding="utf-8"))
)
except Exception:
continue
if mapped and mapped not in resolved:
queue.append(mapped)
return resolved
logger = logging.getLogger(__name__)
# Sentinel placed into _running_agents immediately when a session starts
@ -279,16 +323,16 @@ def _resolve_gateway_model(config: dict | None = None) -> str:
"""Read model from env/config — mirrors the resolution in _run_agent_sync.
Without this, temporary AIAgent instances (memory flush, /compress) fall
back to the hardcoded default ("anthropic/claude-opus-4.6") which fails
when the active provider is openai-codex.
back to the hardcoded default which fails when the active provider is
openai-codex.
"""
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or "anthropic/claude-opus-4.6"
model = os.getenv("HERMES_MODEL") or os.getenv("LLM_MODEL") or ""
cfg = config if config is not None else _load_gateway_config()
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, str):
model = model_cfg
elif isinstance(model_cfg, dict):
model = model_cfg.get("default", model)
model = model_cfg.get("default") or model_cfg.get("model") or model
return model
@ -432,7 +476,7 @@ class GatewayRunner:
from honcho_integration.session import HonchoSessionManager
hcfg = HonchoClientConfig.from_global_config()
if not hcfg.enabled or not hcfg.api_key:
if not hcfg.enabled or not (hcfg.api_key or hcfg.base_url):
return None, hcfg
client = get_honcho_client(hcfg)
@ -573,6 +617,10 @@ class GatewayRunner:
session_id=old_session_id,
honcho_session_key=honcho_session_key,
)
# Fully silence the flush agent — quiet_mode only suppresses init
# messages; tool call output still leaks to the terminal through
# _safe_print → _print_fn. Set a no-op to prevent that.
tmp_agent._print_fn = lambda *a, **kw: None
# Build conversation history from transcript
msgs = [
@ -741,10 +789,22 @@ class GatewayRunner:
logger.error("No connected messaging platforms remain. Shutting down gateway cleanly.")
await self.stop()
elif not self.adapters and self._failed_platforms:
logger.warning(
"No connected messaging platforms remain, but %d platform(s) queued for reconnection",
len(self._failed_platforms),
)
# All platforms are down and queued for background reconnection.
# If the error is retryable, exit with failure so systemd Restart=on-failure
# can restart the process. Otherwise stay alive and keep retrying in background.
if adapter.fatal_error_retryable:
self._exit_reason = adapter.fatal_error_message or "All messaging platforms failed with retryable errors"
self._exit_with_failure = True
logger.error(
"All messaging platforms failed with retryable errors. "
"Shutting down gateway for service restart (systemd will retry)."
)
await self.stop()
else:
logger.warning(
"No connected messaging platforms remain, but %d platform(s) queued for reconnection",
len(self._failed_platforms),
)
def _request_clean_exit(self, reason: str) -> None:
self._exit_cleanly = True
@ -902,11 +962,12 @@ class GatewayRunner:
return {}
@staticmethod
def _load_fallback_model() -> dict | None:
"""Load fallback model config from config.yaml.
def _load_fallback_model() -> list | dict | None:
"""Load fallback provider chain from config.yaml.
Returns a dict with 'provider' and 'model' keys, or None if
not configured / both fields empty.
Returns a list of provider dicts (``fallback_providers``), a single
dict (legacy ``fallback_model``), or None if not configured.
AIAgent.__init__ normalizes both formats into a chain.
"""
try:
import yaml as _y
@ -914,8 +975,8 @@ class GatewayRunner:
if cfg_path.exists():
with open(cfg_path, encoding="utf-8") as _f:
cfg = _y.safe_load(_f) or {}
fb = cfg.get("fallback_model", {}) or {}
if fb.get("provider") and fb.get("model"):
fb = cfg.get("fallback_providers") or cfg.get("fallback_model") or None
if fb:
return fb
except Exception:
pass
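The "both formats" contract the docstring mentions can be pinned down with a tiny normalizer. How `AIAgent.__init__` actually does it is not shown in this diff, so this is an assumed sketch: a legacy `fallback_model` dict becomes a one-element chain, a `fallback_providers` list passes through, and empty values become an empty chain:

```python
def normalize_fallback_chain(fb) -> list[dict]:
    """Assumed sketch: coerce legacy dict / list config into a provider chain."""
    if not fb:
        return []
    if isinstance(fb, dict):
        return [fb]  # legacy single fallback_model entry
    return list(fb)  # already a fallback_providers list

legacy = {"provider": "p0", "model": "m0"}          # hypothetical placeholder values
chain = [{"provider": "p1", "model": "m1"}, {"provider": "p2", "model": "m2"}]

print(normalize_fallback_chain(legacy))  # [{'provider': 'p0', 'model': 'm0'}]
print(len(normalize_fallback_chain(chain)))  # 2
print(normalize_fallback_chain(None))    # []
```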
@ -943,6 +1004,13 @@ class GatewayRunner:
"""
logger.info("Starting Hermes Gateway...")
logger.info("Session storage: %s", self.config.sessions_dir)
try:
from hermes_cli.profiles import get_active_profile_name
_profile = get_active_profile_name()
if _profile and _profile != "default":
logger.info("Active profile: %s", _profile)
except Exception:
pass
try:
from gateway.status import write_runtime_status
write_runtime_status(gateway_state="starting", exit_reason=None)
@ -954,12 +1022,24 @@ class GatewayRunner:
os.getenv(v)
for v in ("TELEGRAM_ALLOWED_USERS", "DISCORD_ALLOWED_USERS",
"WHATSAPP_ALLOWED_USERS", "SLACK_ALLOWED_USERS",
"SIGNAL_ALLOWED_USERS", "EMAIL_ALLOWED_USERS",
"SIGNAL_ALLOWED_USERS", "SIGNAL_GROUP_ALLOWED_USERS",
"EMAIL_ALLOWED_USERS",
"SMS_ALLOWED_USERS", "MATTERMOST_ALLOWED_USERS",
"MATRIX_ALLOWED_USERS", "DINGTALK_ALLOWED_USERS",
"FEISHU_ALLOWED_USERS",
"WECOM_ALLOWED_USERS",
"GATEWAY_ALLOWED_USERS")
)
_allow_all = os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in ("true", "1", "yes")
_allow_all = os.getenv("GATEWAY_ALLOW_ALL_USERS", "").lower() in ("true", "1", "yes") or any(
os.getenv(v, "").lower() in ("true", "1", "yes")
for v in ("TELEGRAM_ALLOW_ALL_USERS", "DISCORD_ALLOW_ALL_USERS",
"WHATSAPP_ALLOW_ALL_USERS", "SLACK_ALLOW_ALL_USERS",
"SIGNAL_ALLOW_ALL_USERS", "EMAIL_ALLOW_ALL_USERS",
"SMS_ALLOW_ALL_USERS", "MATTERMOST_ALLOW_ALL_USERS",
"MATRIX_ALLOW_ALL_USERS", "DINGTALK_ALLOW_ALL_USERS",
"FEISHU_ALLOW_ALL_USERS",
"WECOM_ALLOW_ALL_USERS")
)
if not _any_allowlist and not _allow_all:
logger.warning(
"No user allowlists configured. All unauthorized users will be denied. "
@ -1401,6 +1481,20 @@ class GatewayRunner:
return None
return DingTalkAdapter(config)
elif platform == Platform.FEISHU:
from gateway.platforms.feishu import FeishuAdapter, check_feishu_requirements
if not check_feishu_requirements():
logger.warning("Feishu: lark-oapi not installed or FEISHU_APP_ID/SECRET not set")
return None
return FeishuAdapter(config)
elif platform == Platform.WECOM:
from gateway.platforms.wecom import WeComAdapter, check_wecom_requirements
if not check_wecom_requirements():
logger.warning("WeCom: aiohttp not installed or WECOM_BOT_ID/SECRET not set")
return None
return WeComAdapter(config)
elif platform == Platform.MATTERMOST:
from gateway.platforms.mattermost import MattermostAdapter, check_mattermost_requirements
if not check_mattermost_requirements():
@ -1467,6 +1561,8 @@ class GatewayRunner:
Platform.MATTERMOST: "MATTERMOST_ALLOWED_USERS",
Platform.MATRIX: "MATRIX_ALLOWED_USERS",
Platform.DINGTALK: "DINGTALK_ALLOWED_USERS",
Platform.FEISHU: "FEISHU_ALLOWED_USERS",
Platform.WECOM: "WECOM_ALLOWED_USERS",
}
platform_allow_all_map = {
Platform.TELEGRAM: "TELEGRAM_ALLOW_ALL_USERS",
@ -1479,6 +1575,8 @@ class GatewayRunner:
Platform.MATTERMOST: "MATTERMOST_ALLOW_ALL_USERS",
Platform.MATRIX: "MATRIX_ALLOW_ALL_USERS",
Platform.DINGTALK: "DINGTALK_ALLOW_ALL_USERS",
Platform.FEISHU: "FEISHU_ALLOW_ALL_USERS",
Platform.WECOM: "WECOM_ALLOW_ALL_USERS",
}
# Per-platform allow-all flag (e.g., DISCORD_ALLOW_ALL_USERS=true)
@ -1506,10 +1604,23 @@ class GatewayRunner:
if global_allowlist:
allowed_ids.update(uid.strip() for uid in global_allowlist.split(",") if uid.strip())
# WhatsApp JIDs have @s.whatsapp.net suffix — strip it for comparison
check_ids = {user_id}
if "@" in user_id:
check_ids.add(user_id.split("@")[0])
# WhatsApp: resolve phone↔LID aliases from bridge session mapping files
if source.platform == Platform.WHATSAPP:
normalized_allowed_ids = set()
for allowed_id in allowed_ids:
normalized_allowed_ids.update(_expand_whatsapp_auth_aliases(allowed_id))
if normalized_allowed_ids:
allowed_ids = normalized_allowed_ids
check_ids.update(_expand_whatsapp_auth_aliases(user_id))
normalized_user_id = _normalize_whatsapp_identifier(user_id)
if normalized_user_id:
check_ids.add(normalized_user_id)
return bool(check_ids & allowed_ids)
def _get_unauthorized_dm_behavior(self, platform: Optional[Platform]) -> str:
@ -1970,6 +2081,12 @@ class GatewayRunner:
f"Use /resume to browse and restore a previous session.\n"
f"Adjust reset timing in config.yaml under session_reset."
)
try:
session_info = self._format_session_info()
if session_info:
notice = f"{notice}\n\n{session_info}"
except Exception:
pass
await adapter.send(
source.chat_id, notice,
metadata=getattr(event, 'metadata', None),
@ -2063,7 +2180,7 @@ class GatewayRunner:
if isinstance(_model_cfg, str):
_hyg_model = _model_cfg
elif isinstance(_model_cfg, dict):
_hyg_model = _model_cfg.get("default", _hyg_model)
_hyg_model = _model_cfg.get("default") or _model_cfg.get("model") or _hyg_model
# Read explicit context_length override from model config
# (same as run_agent.py lines 995-1005)
_raw_ctx = _model_cfg.get("context_length")
@ -2175,6 +2292,7 @@ class GatewayRunner:
enabled_toolsets=["memory"],
session_id=session_entry.session_id,
)
_hyg_agent._print_fn = lambda *a, **kw: None
loop = asyncio.get_event_loop()
_compressed, _ = await loop.run_in_executor(
@ -2185,6 +2303,15 @@ class GatewayRunner:
),
)
# _compress_context ends the old session and creates
# a new session_id. Write compressed messages into
# the NEW session so the old transcript stays intact
# and searchable via session_search.
_hyg_new_sid = _hyg_agent.session_id
if _hyg_new_sid != session_entry.session_id:
session_entry.session_id = _hyg_new_sid
self.session_store._save()
self.session_store.rewrite_transcript(
session_entry.session_id, _compressed
)
@ -2736,6 +2863,85 @@ class GatewayRunner:
# Clear session env
self._clear_session_env()
def _format_session_info(self) -> str:
"""Resolve current model config and return a formatted info block.
Surfaces model, provider, context length, and endpoint so gateway
users can immediately see if context detection went wrong (e.g.
local models falling back to the 128K default).
"""
from agent.model_metadata import get_model_context_length, DEFAULT_FALLBACK_CONTEXT
model = _resolve_gateway_model()
config_context_length = None
provider = None
base_url = None
api_key = None
try:
cfg_path = _hermes_home / "config.yaml"
if cfg_path.exists():
import yaml as _info_yaml
with open(cfg_path, encoding="utf-8") as f:
data = _info_yaml.safe_load(f) or {}
model_cfg = data.get("model", {})
if isinstance(model_cfg, dict):
raw_ctx = model_cfg.get("context_length")
if raw_ctx is not None:
try:
config_context_length = int(raw_ctx)
except (TypeError, ValueError):
pass
provider = model_cfg.get("provider") or None
base_url = model_cfg.get("base_url") or None
except Exception:
pass
# Resolve runtime credentials for probing
try:
runtime = _resolve_runtime_agent_kwargs()
provider = provider or runtime.get("provider")
base_url = base_url or runtime.get("base_url")
api_key = runtime.get("api_key")
except Exception:
pass
context_length = get_model_context_length(
model,
base_url=base_url or "",
api_key=api_key or "",
config_context_length=config_context_length,
provider=provider or "",
)
# Format context source hint
if config_context_length is not None:
ctx_source = "config"
elif context_length == DEFAULT_FALLBACK_CONTEXT:
ctx_source = "default — set model.context_length in config to override"
else:
ctx_source = "detected"
# Format context length for display
if context_length >= 1_000_000:
ctx_display = f"{context_length / 1_000_000:.1f}M"
elif context_length >= 1_000:
ctx_display = f"{context_length // 1_000}K"
else:
ctx_display = str(context_length)
lines = [
f"◆ Model: `{model}`",
f"◆ Provider: {provider or 'openrouter'}",
f"◆ Context: {ctx_display} tokens ({ctx_source})",
]
# Show endpoint for local/custom setups
if base_url and ("localhost" in base_url or "127.0.0.1" in base_url or "0.0.0.0" in base_url):
lines.append(f"◆ Endpoint: {base_url}")
return "\n".join(lines)
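The display thresholds in `_format_session_info` can be sanity-checked on their own. This is a standalone sketch mirroring the same logic (the helper name is illustrative, not part of the diff):

```python
def format_context_length(n: int) -> str:
    # Mirrors the display thresholds used by _format_session_info:
    # millions get one decimal, thousands are floored to whole K.
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n // 1_000}K"
    return str(n)

print(format_context_length(1_048_576))  # 1.0M
print(format_context_length(131_072))    # 131K
print(format_context_length(512))        # 512
```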
async def _handle_reset_command(self, event: MessageEvent) -> str:
"""Handle /new or /reset command."""
source = event.source
@ -2776,12 +2982,22 @@ class GatewayRunner:
"session_key": session_key,
})
# Resolve session config info to surface to the user
try:
session_info = self._format_session_info()
except Exception:
session_info = ""
if new_entry:
return "✨ Session reset! I've started fresh with no memory of our previous conversation."
header = "✨ Session reset! Starting fresh."
else:
# No existing session, just create one
self.session_store.get_or_create_session(source, force_new=True)
return "✨ New session started!"
header = "✨ New session started!"
if session_info:
return f"{header}\n\n{session_info}"
return header
async def _handle_status_command(self, event: MessageEvent) -> str:
"""Handle /status command."""
@ -2959,8 +3175,7 @@ class GatewayRunner:
if "agent" not in config or not isinstance(config.get("agent"), dict):
config["agent"] = {}
config["agent"]["system_prompt"] = ""
with open(config_path, "w") as f:
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
atomic_yaml_write(config_path, config)
except Exception as e:
return f"⚠️ Failed to save personality change: {e}"
self._ephemeral_system_prompt = ""
@ -2973,8 +3188,7 @@ class GatewayRunner:
if "agent" not in config or not isinstance(config.get("agent"), dict):
config["agent"] = {}
config["agent"]["system_prompt"] = new_prompt
with open(config_path, 'w', encoding="utf-8") as f:
yaml.dump(config, f, default_flow_style=False, sort_keys=False)
atomic_yaml_write(config_path, config)
except Exception as e:
return f"⚠️ Failed to save personality change: {e}"
@ -3064,8 +3278,7 @@ class GatewayRunner:
with open(config_path, encoding="utf-8") as f:
user_config = yaml.safe_load(f) or {}
user_config[env_key] = chat_id
with open(config_path, 'w', encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False)
atomic_yaml_write(config_path, user_config)
# Also set in the current environment so it takes effect immediately
os.environ[env_key] = str(chat_id)
except Exception as e:
@ -3733,8 +3946,7 @@ class GatewayRunner:
current[k] = {}
current = current[k]
current[keys[-1]] = value
with open(config_path, "w", encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
atomic_yaml_write(config_path, user_config)
return True
except Exception as e:
logger.error("Failed to save config key %s: %s", key_path, e)
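The `atomic_yaml_write` helper replacing the plain `open(..., "w")` + `yaml.dump` calls throughout these hunks is defined elsewhere in the commit. A minimal sketch of the write-to-temp-then-rename pattern such a helper typically follows (names and details here are assumptions, not the commit's actual implementation):

```python
import os
import tempfile

import yaml


def atomic_yaml_write(path, data):
    # Write to a temp file in the same directory, then atomically replace
    # the target so readers never observe a half-written config.yaml.
    dirname = os.path.dirname(str(path)) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            yaml.dump(data, f, default_flow_style=False, sort_keys=False)
        os.replace(tmp, path)  # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise
```

Staging the temp file in the destination directory matters: `os.replace` is only atomic within a single filesystem, so a temp file in `/tmp` could silently degrade to a copy.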
@ -3842,8 +4054,7 @@ class GatewayRunner:
if "display" not in user_config or not isinstance(user_config.get("display"), dict):
user_config["display"] = {}
user_config["display"]["tool_progress"] = new_mode
with open(config_path, "w", encoding="utf-8") as f:
yaml.dump(user_config, f, default_flow_style=False, sort_keys=False)
atomic_yaml_write(config_path, user_config)
return f"{descriptions[new_mode]}\n_(saved to config — takes effect on next message)_"
except Exception as e:
logger.warning("Failed to save tool_progress mode: %s", e)
@ -3885,17 +4096,27 @@ class GatewayRunner:
enabled_toolsets=["memory"],
session_id=session_entry.session_id,
)
tmp_agent._print_fn = lambda *a, **kw: None
loop = asyncio.get_event_loop()
compressed, _ = await loop.run_in_executor(
None,
lambda: tmp_agent._compress_context(msgs, "", approx_tokens=approx_tokens),
lambda: tmp_agent._compress_context(msgs, "", approx_tokens=approx_tokens)
)
self.session_store.rewrite_transcript(session_entry.session_id, compressed)
# _compress_context already calls end_session() on the old session
# (preserving its full transcript in SQLite) and creates a new
# session_id for the continuation. Write the compressed messages
# into the NEW session so the original history stays searchable.
new_session_id = tmp_agent.session_id
if new_session_id != session_entry.session_id:
session_entry.session_id = new_session_id
self.session_store._save()
self.session_store.rewrite_transcript(new_session_id, compressed)
# Reset stored token count — transcript changed, old value is stale
self.session_store.update_session(
session_entry.session_key, last_prompt_tokens=0,
session_entry.session_key, last_prompt_tokens=0
)
new_count = len(compressed)
new_tokens = estimate_messages_tokens_rough(compressed)
@ -4051,7 +4272,7 @@ class GatewayRunner:
]
ctx = agent.context_compressor
if ctx.last_prompt_tokens:
pct = ctx.last_prompt_tokens / ctx.context_length * 100 if ctx.context_length else 0
pct = min(100, ctx.last_prompt_tokens / ctx.context_length * 100) if ctx.context_length else 0
lines.append(f"Context: {ctx.last_prompt_tokens:,} / {ctx.context_length:,} ({pct:.0f}%)")
if ctx.compression_count:
lines.append(f"Compressions: {ctx.compression_count}")
@ -4798,10 +5019,23 @@ class GatewayRunner:
from hermes_cli.tools_config import _get_platform_tools
enabled_toolsets = sorted(_get_platform_tools(user_config, platform_key))
# Apply tool preview length config (0 = no limit)
try:
from agent.display import set_tool_preview_max_len
_tpl = user_config.get("display", {}).get("tool_preview_length", 0)
set_tool_preview_max_len(int(_tpl) if _tpl else 0)
except Exception:
pass
# Tool progress mode from config.yaml: "all", "new", "verbose", "off"
# Falls back to env vars for backward compatibility
# Falls back to env vars for backward compatibility.
# YAML 1.1 parses bare `off` as boolean False — normalise before
# the `or` chain so it doesn't silently fall through to "all".
_raw_tp = user_config.get("display", {}).get("tool_progress")
if _raw_tp is False:
_raw_tp = "off"
progress_mode = (
user_config.get("display", {}).get("tool_progress")
_raw_tp
or os.getenv("HERMES_TOOL_PROGRESS_MODE")
or "all"
)
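The YAML 1.1 quirk the comment above guards against is easy to reproduce in isolation (standalone sketch, assuming PyYAML's default YAML 1.1 boolean resolvers):

```python
import yaml

cfg = yaml.safe_load("display:\n  tool_progress: off\n")
raw = cfg["display"]["tool_progress"]
print(raw)  # False — YAML 1.1 resolves bare `off` to a boolean

# Without normalisation, the `or` chain silently drops the user's setting:
print(raw or "all")  # all

# Mapping False back to the string "off" preserves the intent:
mode = ("off" if raw is False else raw) or "all"
print(mode)  # off
```

Quoting the value (`tool_progress: "off"`) in config.yaml avoids the coercion entirely, but the normalisation above keeps the unquoted form working too.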
@ -4838,9 +5072,11 @@ class GatewayRunner:
return
if preview:
# Truncate preview to keep messages clean
if len(preview) > 80:
preview = preview[:77] + "..."
# Truncate preview unless config says unlimited
from agent.display import get_tool_preview_max_len
_pl = get_tool_preview_max_len()
if _pl > 0 and len(preview) > _pl:
preview = preview[:_pl - 3] + "..."
msg = f"{emoji} {tool_name}: \"{preview}\""
else:
msg = f"{emoji} {tool_name}..."
@ -4860,12 +5096,17 @@ class GatewayRunner:
progress_queue.put(msg)
# Background task to send progress messages
# Accumulates tool lines into a single message that gets edited
# For DM top-level Slack messages, source.thread_id is None but the
# final reply will be threaded under the original message via reply_to.
# Use event_message_id as fallback so progress messages land in the
# same thread as the final response instead of going to the DM root.
_progress_thread_id = source.thread_id or event_message_id
# Accumulates tool lines into a single message that gets edited.
#
# Threading metadata is platform-specific:
# - Slack DM threading needs event_message_id fallback (reply thread)
# - Telegram uses message_thread_id only for forum topics; passing a
# normal DM/group message id as thread_id causes send failures
# - Other platforms should use explicit source.thread_id only
if source.platform == Platform.SLACK:
_progress_thread_id = source.thread_id or event_message_id
else:
_progress_thread_id = source.thread_id
_progress_metadata = {"thread_id": _progress_thread_id} if _progress_thread_id else None
async def send_progress_messages():
@ -5128,7 +5369,25 @@ class GatewayRunner:
agent.stream_delta_callback = _stream_delta_cb
agent.status_callback = _status_callback_sync
agent.reasoning_config = reasoning_config
# Background review delivery — send "💾 Memory updated" etc. to user
def _bg_review_send(message: str) -> None:
if not _status_adapter:
return
try:
asyncio.run_coroutine_threadsafe(
_status_adapter.send(
_status_chat_id,
message,
metadata=_status_thread_metadata,
),
_loop_for_step,
)
except Exception as _e:
logger.debug("background_review_callback error: %s", _e)
agent.background_review_callback = _bg_review_send
# Store agent reference for interrupt support
agent_holder[0] = agent
# Capture the full tool definitions for transcript logging

View file

@ -762,14 +762,16 @@ class SessionStore:
if session_key in self._entries:
entry = self._entries[session_key]
entry.updated_at = _now()
entry.input_tokens += input_tokens
entry.output_tokens += output_tokens
entry.cache_read_tokens += cache_read_tokens
entry.cache_write_tokens += cache_write_tokens
# Direct assignment — the gateway receives cumulative totals
# from the cached agent, not per-call deltas.
entry.input_tokens = input_tokens
entry.output_tokens = output_tokens
entry.cache_read_tokens = cache_read_tokens
entry.cache_write_tokens = cache_write_tokens
if last_prompt_tokens is not None:
entry.last_prompt_tokens = last_prompt_tokens
if estimated_cost_usd is not None:
entry.estimated_cost_usd += estimated_cost_usd
entry.estimated_cost_usd = estimated_cost_usd
if cost_status:
entry.cost_status = cost_status
entry.total_tokens = (
@ -783,7 +785,7 @@ class SessionStore:
if self._db and db_session_id:
try:
self._db.update_token_counts(
self._db.set_token_counts(
db_session_id,
input_tokens=input_tokens,
output_tokens=output_tokens,
@ -795,6 +797,7 @@ class SessionStore:
billing_provider=provider,
billing_base_url=base_url,
model=model,
absolute=True,
)
except Exception as e:
logger.debug("Session DB operation failed: %s", e)
@ -955,13 +958,17 @@ class SessionStore:
try:
self._db.clear_messages(session_id)
for msg in messages:
role = msg.get("role", "unknown")
self._db.append_message(
session_id=session_id,
role=msg.get("role", "unknown"),
role=role,
content=msg.get("content"),
tool_name=msg.get("tool_name"),
tool_calls=msg.get("tool_calls"),
tool_call_id=msg.get("tool_call_id"),
reasoning=msg.get("reasoning") if role == "assistant" else None,
reasoning_details=msg.get("reasoning_details") if role == "assistant" else None,
codex_reasoning_items=msg.get("codex_reasoning_items") if role == "assistant" else None,
)
except Exception as e:
logger.debug("Failed to rewrite transcript in DB: %s", e)

View file

@ -11,5 +11,5 @@ Provides subcommands for:
- hermes cron - Manage cron jobs
"""
__version__ = "0.4.0"
__release_date__ = "2026.3.23"
__version__ = "0.5.0"
__release_date__ = "2026.3.28"

View file

@ -160,7 +160,7 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
id="alibaba",
name="Alibaba Cloud (DashScope)",
auth_type="api_key",
inference_base_url="https://dashscope-intl.aliyuncs.com/apps/anthropic",
inference_base_url="https://coding-intl.dashscope.aliyuncs.com/v1",
api_key_env_vars=("DASHSCOPE_API_KEY",),
base_url_env_var="DASHSCOPE_BASE_URL",
),
@ -212,6 +212,14 @@ PROVIDER_REGISTRY: Dict[str, ProviderConfig] = {
api_key_env_vars=("KILOCODE_API_KEY",),
base_url_env_var="KILOCODE_BASE_URL",
),
"huggingface": ProviderConfig(
id="huggingface",
name="Hugging Face",
auth_type="api_key",
inference_base_url="https://router.huggingface.co/v1",
api_key_env_vars=("HF_TOKEN",),
base_url_env_var="HF_BASE_URL",
),
}
@ -685,8 +693,13 @@ def resolve_provider(
"github-copilot-acp": "copilot-acp", "copilot-acp-agent": "copilot-acp",
"aigateway": "ai-gateway", "vercel": "ai-gateway", "vercel-ai-gateway": "ai-gateway",
"opencode": "opencode-zen", "zen": "opencode-zen",
"hf": "huggingface", "hugging-face": "huggingface", "huggingface-hub": "huggingface",
"go": "opencode-go", "opencode-go-sub": "opencode-go",
"kilo": "kilocode", "kilo-code": "kilocode", "kilo-gateway": "kilocode",
# Local server aliases — route through the generic custom provider
"lmstudio": "custom", "lm-studio": "custom", "lm_studio": "custom",
"ollama": "custom", "vllm": "custom", "llamacpp": "custom",
"llama.cpp": "custom", "llama-cpp": "custom",
}
normalized = _PROVIDER_ALIASES.get(normalized, normalized)
@ -733,7 +746,12 @@ def resolve_provider(
if has_usable_secret(os.getenv(env_var, "")):
return pid
return "openrouter"
raise AuthError(
"No inference provider configured. Run 'hermes model' to choose a "
"provider and model, or set an API key (OPENROUTER_API_KEY, "
"OPENAI_API_KEY, etc.) in ~/.hermes/.env.",
code="no_provider_configured",
)
# =============================================================================
@ -2095,7 +2113,8 @@ def _login_openai_codex(args, pconfig: ProviderConfig) -> None:
config_path = _update_config_for_provider("openai-codex", creds.get("base_url", DEFAULT_CODEX_BASE_URL))
print()
print("Login successful!")
print(" Auth state: ~/.hermes/auth.json")
from hermes_constants import display_hermes_home as _dhh
print(f" Auth state: {_dhh()}/auth.json")
print(f" Config updated: {config_path} (model.provider=openai-codex)")

View file

@ -258,7 +258,7 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
get_toolset_for_tool: Callable to map tool name -> toolset name.
context_length: Model's context window size in tokens.
"""
from model_tools import check_tool_availability
from model_tools import check_tool_availability, TOOLSET_REQUIREMENTS
if get_toolset_for_tool is None:
from model_tools import get_toolset_for_tool
@ -267,8 +267,18 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
_, unavailable_toolsets = check_tool_availability(quiet=True)
disabled_tools = set()
# Tools whose toolset has a check_fn are lazy-initialized (e.g. honcho,
# homeassistant) — they show as unavailable at banner time because the
# check hasn't run yet, but they aren't misconfigured.
lazy_tools = set()
for item in unavailable_toolsets:
disabled_tools.update(item.get("tools", []))
toolset_name = item.get("name", "")
ts_req = TOOLSET_REQUIREMENTS.get(toolset_name, {})
tools_in_ts = item.get("tools", [])
if ts_req.get("check_fn"):
lazy_tools.update(tools_in_ts)
else:
disabled_tools.update(tools_in_ts)
layout_table = Table.grid(padding=(0, 2))
layout_table.add_column("left", justify="center")
@ -328,6 +338,8 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
for name in sorted(tool_names):
if name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
elif name in lazy_tools:
colored_names.append(f"[yellow]{name}[/]")
else:
colored_names.append(f"[{text}]{name}[/]")
@ -347,6 +359,8 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
colored_names.append("[dim]...[/]")
elif name in disabled_tools:
colored_names.append(f"[red]{name}[/]")
elif name in lazy_tools:
colored_names.append(f"[yellow]{name}[/]")
else:
colored_names.append(f"[{text}]{name}[/]")
tools_str = ", ".join(colored_names)
@ -403,6 +417,15 @@ def build_welcome_banner(console: Console, model: str, cwd: str,
if mcp_connected:
summary_parts.append(f"{mcp_connected} MCP servers")
summary_parts.append("/help for commands")
# Show active profile name when not 'default'
try:
from hermes_cli.profiles import get_active_profile_name
_profile_name = get_active_profile_name()
if _profile_name and _profile_name != "default":
right_lines.append(f"[bold {accent}]Profile:[/] [{text}]{_profile_name}[/]")
except Exception:
pass # Never break the banner over a profiles.py bug
right_lines.append(f"[dim {dim}]{' · '.join(summary_parts)}[/]")
# Update check — use prefetched result if available

View file

@ -12,6 +12,7 @@ import getpass
from hermes_cli.banner import cprint, _DIM, _RST
from hermes_cli.config import save_env_value_secure
from hermes_constants import display_hermes_home
def clarify_callback(cli, question, choices):
@ -131,7 +132,8 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
}
stored = save_env_value_secure(var_name, value)
cprint(f"\n{_DIM} ✓ Stored secret in ~/.hermes/.env as {var_name}{_RST}")
_dhh = display_hermes_home()
cprint(f"\n{_DIM} ✓ Stored secret in {_dhh}/.env as {var_name}{_RST}")
return {
**stored,
"skipped": False,
@ -183,7 +185,8 @@ def prompt_for_secret(cli, var_name: str, prompt: str, metadata=None) -> dict:
}
stored = save_env_value_secure(var_name, value)
cprint(f"\n{_DIM} ✓ Stored secret in ~/.hermes/.env as {var_name}{_RST}")
_dhh = display_hermes_home()
cprint(f"\n{_DIM} ✓ Stored secret in {_dhh}/.env as {var_name}{_RST}")
return {
**stored,
"skipped": False,

View file

@ -12,6 +12,8 @@ import os
logger = logging.getLogger(__name__)
DEFAULT_CODEX_MODELS: List[str] = [
"gpt-5.4-mini",
"gpt-5.4",
"gpt-5.3-codex",
"gpt-5.2-codex",
"gpt-5.1-codex-max",
@ -19,8 +21,9 @@ DEFAULT_CODEX_MODELS: List[str] = [
]
_FORWARD_COMPAT_TEMPLATE_MODELS: List[tuple[str, tuple[str, ...]]] = [
("gpt-5.3-codex", ("gpt-5.2-codex",)),
("gpt-5.4-mini", ("gpt-5.3-codex", "gpt-5.2-codex")),
("gpt-5.4", ("gpt-5.3-codex", "gpt-5.2-codex")),
("gpt-5.3-codex", ("gpt-5.2-codex",)),
("gpt-5.3-codex-spark", ("gpt-5.3-codex", "gpt-5.2-codex")),
]

View file

@ -36,6 +36,8 @@ _EXTRA_ENV_KEYS = frozenset({
"SIGNAL_ACCOUNT", "SIGNAL_HTTP_URL",
"SIGNAL_ALLOWED_USERS", "SIGNAL_GROUP_ALLOWED_USERS",
"DINGTALK_CLIENT_ID", "DINGTALK_CLIENT_SECRET",
"FEISHU_APP_ID", "FEISHU_APP_SECRET", "FEISHU_ENCRYPT_KEY", "FEISHU_VERIFICATION_TOKEN",
"WECOM_BOT_ID", "WECOM_SECRET",
"TERMINAL_ENV", "TERMINAL_SSH_KEY", "TERMINAL_SSH_PORT",
"WHATSAPP_MODE", "WHATSAPP_ENABLED",
"MATTERMOST_HOME_CHANNEL", "MATTERMOST_REPLY_MODE",
@ -136,9 +138,16 @@ def ensure_hermes_home():
DEFAULT_CONFIG = {
"model": "anthropic/claude-opus-4.6",
"fallback_providers": [],
"toolsets": ["hermes-cli"],
"agent": {
"max_turns": 90,
# Tool-use enforcement: injects system prompt guidance that tells the
# model to actually call tools instead of describing intended actions.
# Values: "auto" (default — applies to gpt/codex models), true/false
# (force on/off for all models), or a list of model-name substrings
# to match (e.g. ["gpt", "codex", "gemini", "qwen"]).
"tool_use_enforcement": "auto",
},
"terminal": {
@ -223,42 +232,49 @@ DEFAULT_CONFIG = {
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30, # seconds — increase for slow local models
},
"compression": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 120, # seconds — compression summarises large contexts; increase for local models
},
"session_search": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
"skills_hub": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
"approval": {
"provider": "auto",
"model": "", # fast/cheap model recommended (e.g. gemini-flash, haiku)
"base_url": "",
"api_key": "",
"timeout": 30,
},
"mcp": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
"flush_memories": {
"provider": "auto",
"model": "",
"base_url": "",
"api_key": "",
"timeout": 30,
},
},
@ -266,12 +282,14 @@ DEFAULT_CONFIG = {
"compact": False,
"personality": "kawaii",
"resume_display": "full",
"busy_input_mode": "interrupt",
"bell_on_complete": False,
"show_reasoning": False,
"streaming": False,
"show_cost": False, # Show $ cost in the status bar (off by default)
"skin": "default",
"tool_progress_command": False, # Enable /verbose command in messaging gateway
"tool_preview_length": 0, # Max chars for tool call previews (0 = no limit, show full paths/commands)
},
# Privacy settings
@ -354,6 +372,13 @@ DEFAULT_CONFIG = {
# Never saved to sessions, logs, or trajectories.
"prefill_messages_file": "",
# Skills — external skill directories for sharing skills across tools/agents.
# Each path is expanded (~, ${VAR}) and resolved. Read-only — skill creation
# always goes to ~/.hermes/skills/.
"skills": {
"external_dirs": [], # e.g. ["~/.agents/skills", "/shared/team-skills"]
},
# Honcho AI-native memory -- reads ~/.honcho/config.json as single source of truth.
# This section is only needed for hermes-specific overrides; everything else
# (apiKey, workspace, peerName, sessions, enabled) comes from the global config.
@ -409,6 +434,12 @@ DEFAULT_CONFIG = {
},
},
"cron": {
# Wrap delivered cron responses with a header (task name) and footer
# ("The agent cannot see this message"). Set to false for clean output.
"wrap_response": True,
},
# Config schema version - bump this when adding new required fields
"_config_version": 11,
}
@ -549,14 +580,14 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
},
"DASHSCOPE_API_KEY": {
"description": "Alibaba Cloud DashScope API key for Qwen models",
"description": "Alibaba Cloud DashScope API key (Qwen + multi-provider models)",
"prompt": "DashScope API Key",
"url": "https://modelstudio.console.alibabacloud.com/",
"password": True,
"category": "provider",
},
"DASHSCOPE_BASE_URL": {
"description": "Custom DashScope base URL (default: international endpoint)",
"description": "Custom DashScope base URL (default: coding-intl OpenAI-compat endpoint)",
"prompt": "DashScope Base URL",
"url": "",
"password": False,
@ -595,8 +626,31 @@ OPTIONAL_ENV_VARS = {
"category": "provider",
"advanced": True,
},
"HF_TOKEN": {
"description": "Hugging Face token for Inference Providers (20+ open models via router.huggingface.co)",
"prompt": "Hugging Face Token",
"url": "https://huggingface.co/settings/tokens",
"password": True,
"category": "provider",
},
"HF_BASE_URL": {
"description": "Hugging Face Inference Providers base URL override",
"prompt": "HF base URL (leave empty for default)",
"url": None,
"password": False,
"category": "provider",
"advanced": True,
},
# ── Tool API keys ──
"EXA_API_KEY": {
"description": "Exa API key for AI-native web search and contents",
"prompt": "Exa API key",
"url": "https://exa.ai/",
"tools": ["web_search", "web_extract"],
"password": True,
"category": "tool",
},
"PARALLEL_API_KEY": {
"description": "Parallel API key for AI-native web search and extract",
"prompt": "Parallel API key",
@ -815,6 +869,20 @@ OPTIONAL_ENV_VARS = {
"password": False,
"category": "messaging",
},
"MATTERMOST_REQUIRE_MENTION": {
"description": "Require @mention in Mattermost channels (default: true). Set to false to respond to all messages.",
"prompt": "Require @mention in channels",
"url": None,
"password": False,
"category": "messaging",
},
"MATTERMOST_FREE_RESPONSE_CHANNELS": {
"description": "Comma-separated Mattermost channel IDs where bot responds without @mention",
"prompt": "Free-response channel IDs (comma-separated)",
"url": None,
"password": False,
"category": "messaging",
},
"MATRIX_HOMESERVER": {
"description": "Matrix homeserver URL (e.g. https://matrix.example.org)",
"prompt": "Matrix homeserver URL",
@ -1694,6 +1762,7 @@ def show_config():
keys = [
("OPENROUTER_API_KEY", "OpenRouter"),
("VOICE_TOOLS_OPENAI_KEY", "OpenAI (STT/TTS)"),
("EXA_API_KEY", "Exa"),
("PARALLEL_API_KEY", "Parallel"),
("FIRECRAWL_API_KEY", "Firecrawl"),
("TAVILY_API_KEY", "Tavily"),
@ -1853,7 +1922,7 @@ def set_config_value(key: str, value: str):
# Check if it's an API key (goes to .env)
api_keys = [
'OPENROUTER_API_KEY', 'OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'VOICE_TOOLS_OPENAI_KEY',
'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL',
'EXA_API_KEY', 'PARALLEL_API_KEY', 'FIRECRAWL_API_KEY', 'FIRECRAWL_API_URL',
'FIRECRAWL_GATEWAY_URL', 'TOOL_GATEWAY_DOMAIN', 'TOOL_GATEWAY_SCHEME',
'TOOL_GATEWAY_USER_TOKEN', 'TAVILY_API_KEY',
'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID', 'BROWSER_USE_API_KEY',

View file

@ -4,7 +4,7 @@ Used by `hermes tools` and `hermes skills` for interactive checklists.
Provides a curses multi-select with keyboard navigation, plus a
text-based numbered fallback for terminals without curses support.
"""
from typing import List, Set
from typing import Callable, List, Optional, Set
from hermes_cli.colors import Colors, color
@ -15,6 +15,7 @@ def curses_checklist(
selected: Set[int],
*,
cancel_returns: Set[int] | None = None,
status_fn: Optional[Callable[[Set[int]], str]] = None,
) -> Set[int]:
"""Curses multi-select checklist. Returns set of selected indices.
@ -23,6 +24,9 @@ def curses_checklist(
items: Display labels for each row.
selected: Indices that start checked (pre-selected).
cancel_returns: Returned on ESC/q. Defaults to the original *selected*.
status_fn: Optional callback ``f(chosen_indices) -> str`` whose return
value is rendered on the bottom row of the terminal. Use this for
live aggregate info (e.g. estimated token counts).
"""
if cancel_returns is None:
cancel_returns = set(selected)
@ -47,6 +51,9 @@ def curses_checklist(
stdscr.clear()
max_y, max_x = stdscr.getmaxyx()
# Reserve bottom row for status bar when status_fn provided
footer_rows = 1 if status_fn else 0
# Header
try:
hattr = curses.A_BOLD
@ -62,7 +69,7 @@ def curses_checklist(
pass
# Scrollable item list
visible_rows = max_y - 3
visible_rows = max_y - 3 - footer_rows
if cursor < scroll_offset:
scroll_offset = cursor
elif cursor >= scroll_offset + visible_rows:
@ -72,7 +79,7 @@ def curses_checklist(
range(scroll_offset, min(len(items), scroll_offset + visible_rows))
):
y = draw_i + 3
if y >= max_y - 1:
if y >= max_y - 1 - footer_rows:
break
check = "✓" if i in chosen else " "
arrow = "❯" if i == cursor else " "
@ -87,6 +94,20 @@ def curses_checklist(
except curses.error:
pass
# Status bar (bottom row, right-aligned)
if status_fn:
try:
status_text = status_fn(chosen)
if status_text:
# Right-align on the bottom row
sx = max(0, max_x - len(status_text) - 1)
sattr = curses.A_DIM
if curses.has_colors():
sattr |= curses.color_pair(3)
stdscr.addnstr(max_y - 1, sx, status_text, max_x - sx - 1, sattr)
except curses.error:
pass
stdscr.refresh()
key = stdscr.getch()
@ -107,7 +128,7 @@ def curses_checklist(
return result_holder[0] if result_holder[0] is not None else cancel_returns
except Exception:
return _numbered_fallback(title, items, selected, cancel_returns)
return _numbered_fallback(title, items, selected, cancel_returns, status_fn)
def _numbered_fallback(
@ -115,6 +136,7 @@ def _numbered_fallback(
items: List[str],
selected: Set[int],
cancel_returns: Set[int],
status_fn: Optional[Callable[[Set[int]], str]] = None,
) -> Set[int]:
"""Text-based toggle fallback for terminals without curses."""
chosen = set(selected)
@ -125,6 +147,10 @@ def _numbered_fallback(
for i, label in enumerate(items):
marker = color("[✓]", Colors.GREEN) if i in chosen else "[ ]"
print(f" {marker} {i + 1:>2}. {label}")
if status_fn:
status_text = status_fn(chosen)
if status_text:
print(color(f"\n {status_text}", Colors.DIM))
print()
try:
val = input(color(" Toggle # (or Enter to confirm): ", Colors.DIM)).strip()

View file

@ -10,9 +10,11 @@ import subprocess
import shutil
from hermes_cli.config import get_project_root, get_hermes_home, get_env_path
from hermes_constants import display_hermes_home
PROJECT_ROOT = get_project_root()
HERMES_HOME = get_hermes_home()
_DHH = display_hermes_home() # user-facing display path (e.g. ~/.hermes or ~/.hermes/profiles/coder)
# Load environment variables from ~/.hermes/.env so API key checks work
from dotenv import load_dotenv
@ -56,7 +58,7 @@ def _honcho_is_configured_for_doctor() -> bool:
from honcho_integration.client import HonchoClientConfig
cfg = HonchoClientConfig.from_global_config()
return bool(cfg.enabled and cfg.api_key)
return bool(cfg.enabled and (cfg.api_key or cfg.base_url))
except Exception:
return False
@ -209,14 +211,14 @@ def run_doctor(args):
# Check ~/.hermes/.env (primary location for user config)
env_path = HERMES_HOME / '.env'
if env_path.exists():
check_ok("~/.hermes/.env file exists")
check_ok(f"{_DHH}/.env file exists")
# Check for common issues
content = env_path.read_text()
if _has_provider_env_config(content):
check_ok("API key or custom endpoint configured")
else:
check_warn("No API key found in ~/.hermes/.env")
check_warn(f"No API key found in {_DHH}/.env")
issues.append("Run 'hermes setup' to configure API keys")
else:
# Also check project root as fallback
@ -224,11 +226,11 @@ def run_doctor(args):
if fallback_env.exists():
check_ok(".env file exists (in project directory)")
else:
check_fail("~/.hermes/.env file missing")
check_fail(f"{_DHH}/.env file missing")
if should_fix:
env_path.parent.mkdir(parents=True, exist_ok=True)
env_path.touch()
check_ok("Created empty ~/.hermes/.env")
check_ok(f"Created empty {_DHH}/.env")
check_info("Run 'hermes setup' to configure API keys")
fixed_count += 1
else:
@ -238,7 +240,7 @@ def run_doctor(args):
# Check ~/.hermes/config.yaml (primary) or project cli-config.yaml (fallback)
config_path = HERMES_HOME / 'config.yaml'
if config_path.exists():
check_ok("~/.hermes/config.yaml exists")
check_ok(f"{_DHH}/config.yaml exists")
else:
fallback_config = PROJECT_ROOT / 'cli-config.yaml'
if fallback_config.exists():
@ -248,11 +250,11 @@ def run_doctor(args):
if should_fix and example_config.exists():
config_path.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(str(example_config), str(config_path))
check_ok("Created ~/.hermes/config.yaml from cli-config.yaml.example")
check_ok(f"Created {_DHH}/config.yaml from cli-config.yaml.example")
fixed_count += 1
elif should_fix:
check_warn("config.yaml not found and no example to copy from")
manual_issues.append("Create ~/.hermes/config.yaml manually")
manual_issues.append(f"Create {_DHH}/config.yaml manually")
else:
check_warn("config.yaml not found", "(using defaults)")
@ -294,28 +296,28 @@ def run_doctor(args):
hermes_home = HERMES_HOME
if hermes_home.exists():
check_ok("~/.hermes directory exists")
check_ok(f"{_DHH} directory exists")
else:
if should_fix:
hermes_home.mkdir(parents=True, exist_ok=True)
check_ok("Created ~/.hermes directory")
check_ok(f"Created {_DHH} directory")
fixed_count += 1
else:
check_warn("~/.hermes not found", "(will be created on first use)")
check_warn(f"{_DHH} not found", "(will be created on first use)")
# Check expected subdirectories
expected_subdirs = ["cron", "sessions", "logs", "skills", "memories"]
for subdir_name in expected_subdirs:
subdir_path = hermes_home / subdir_name
if subdir_path.exists():
check_ok(f"~/.hermes/{subdir_name}/ exists")
check_ok(f"{_DHH}/{subdir_name}/ exists")
else:
if should_fix:
subdir_path.mkdir(parents=True, exist_ok=True)
check_ok(f"Created ~/.hermes/{subdir_name}/")
check_ok(f"Created {_DHH}/{subdir_name}/")
fixed_count += 1
else:
check_warn(f"~/.hermes/{subdir_name}/ not found", "(will be created on first use)")
check_warn(f"{_DHH}/{subdir_name}/ not found", "(will be created on first use)")
# Check for SOUL.md persona file
soul_path = hermes_home / "SOUL.md"
@ -324,11 +326,11 @@ def run_doctor(args):
# Check if it's just the template comments (no real content)
lines = [l for l in content.splitlines() if l.strip() and not l.strip().startswith(("<!--", "-->", "#"))]
if lines:
check_ok("~/.hermes/SOUL.md exists (persona configured)")
check_ok(f"{_DHH}/SOUL.md exists (persona configured)")
else:
check_info("~/.hermes/SOUL.md exists but is empty — edit it to customize personality")
check_info(f"{_DHH}/SOUL.md exists but is empty — edit it to customize personality")
else:
check_warn("~/.hermes/SOUL.md not found", "(create it to give Hermes a custom personality)")
check_warn(f"{_DHH}/SOUL.md not found", "(create it to give Hermes a custom personality)")
if should_fix:
soul_path.parent.mkdir(parents=True, exist_ok=True)
soul_path.write_text(
@ -337,13 +339,13 @@ def run_doctor(args):
"You are Hermes, a helpful AI assistant.\n",
encoding="utf-8",
)
check_ok("Created ~/.hermes/SOUL.md with basic template")
check_ok(f"Created {_DHH}/SOUL.md with basic template")
fixed_count += 1
# Check memory directory
memories_dir = hermes_home / "memories"
if memories_dir.exists():
check_ok("~/.hermes/memories/ directory exists")
check_ok(f"{_DHH}/memories/ directory exists")
memory_file = memories_dir / "MEMORY.md"
user_file = memories_dir / "USER.md"
if memory_file.exists():
@ -357,10 +359,10 @@ def run_doctor(args):
else:
check_info("USER.md not created yet (will be created when the agent first writes a memory)")
else:
check_warn("~/.hermes/memories/ not found", "(will be created on first use)")
check_warn(f"{_DHH}/memories/ not found", "(will be created on first use)")
if should_fix:
memories_dir.mkdir(parents=True, exist_ok=True)
check_ok("Created ~/.hermes/memories/")
check_ok(f"Created {_DHH}/memories/")
fixed_count += 1
# Check SQLite session store
@ -372,11 +374,11 @@ def run_doctor(args):
cursor = conn.execute("SELECT COUNT(*) FROM sessions")
count = cursor.fetchone()[0]
conn.close()
check_ok(f"~/.hermes/state.db exists ({count} sessions)")
check_ok(f"{_DHH}/state.db exists ({count} sessions)")
except Exception as e:
check_warn(f"~/.hermes/state.db exists but has issues: {e}")
check_warn(f"{_DHH}/state.db exists but has issues: {e}")
else:
check_info("~/.hermes/state.db not created yet (will be created on first session)")
check_info(f"{_DHH}/state.db not created yet (will be created on first session)")
_check_gateway_service_linger(issues)
@ -691,7 +693,7 @@ def run_doctor(args):
if github_token:
check_ok("GitHub token configured (authenticated API access)")
else:
check_warn("No GITHUB_TOKEN", "(60 req/hr rate limit — set in ~/.hermes/.env for better rates)")
check_warn("No GITHUB_TOKEN", f"(60 req/hr rate limit — set in {_DHH}/.env for better rates)")
# =========================================================================
# Honcho memory
@ -708,8 +710,8 @@ def run_doctor(args):
check_warn("Honcho config not found", "run: hermes honcho setup")
elif not hcfg.enabled:
check_info(f"Honcho disabled (set enabled: true in {_honcho_cfg_path} to activate)")
elif not hcfg.api_key:
check_fail("Honcho API key not set", "run: hermes honcho setup")
elif not (hcfg.api_key or hcfg.base_url):
check_fail("Honcho API key or base URL not set", "run: hermes honcho setup")
issues.append("No Honcho API key — run 'hermes honcho setup'")
else:
from honcho_integration.client import get_honcho_client, reset_honcho_client
@ -728,6 +730,53 @@ def run_doctor(args):
except Exception as _e:
check_warn("Honcho check failed", str(_e))
# =========================================================================
# Profiles
# =========================================================================
try:
from hermes_cli.profiles import list_profiles, _get_wrapper_dir, profile_exists
import re as _re
named_profiles = [p for p in list_profiles() if not p.is_default]
if named_profiles:
print()
print(color("◆ Profiles", Colors.CYAN, Colors.BOLD))
check_ok(f"{len(named_profiles)} profile(s) found")
wrapper_dir = _get_wrapper_dir()
for p in named_profiles:
parts = []
if p.gateway_running:
parts.append("gateway running")
if p.model:
parts.append(p.model[:30])
if not (p.path / "config.yaml").exists():
parts.append("⚠ missing config")
if not (p.path / ".env").exists():
parts.append("no .env")
wrapper = wrapper_dir / p.name
if not wrapper.exists():
parts.append("no alias")
status = ", ".join(parts) if parts else "configured"
check_ok(f" {p.name}: {status}")
# Check for orphan wrappers
if wrapper_dir.is_dir():
for wrapper in wrapper_dir.iterdir():
if not wrapper.is_file():
continue
try:
content = wrapper.read_text()
if "hermes -p" in content:
_m = _re.search(r"hermes -p (\S+)", content)
if _m and not profile_exists(_m.group(1)):
check_warn(f"Orphan alias: {wrapper.name} → profile '{_m.group(1)}' no longer exists")
except Exception:
pass
except ImportError:
pass
except Exception as _e:
logger.debug("Profile health check failed: %s", _e)
# =========================================================================
# Summary
# =========================================================================

View file

@ -15,6 +15,8 @@ from pathlib import Path
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
from hermes_cli.config import get_env_value, get_hermes_home, save_env_value, is_managed, managed_error
# display_hermes_home is imported lazily at call sites to avoid ImportError
# when hermes_constants is cached from a pre-update version during `hermes update`.
from hermes_cli.setup import (
print_header, print_info, print_success, print_warning, print_error,
prompt, prompt_choice, prompt_yes_no,
@ -125,20 +127,43 @@ _SERVICE_BASE = "hermes-gateway"
SERVICE_DESCRIPTION = "Hermes Agent Gateway - Messaging Platform Integration"
def _profile_suffix() -> str:
"""Derive a service-name suffix from the current HERMES_HOME.
Returns ``""`` for the default ``~/.hermes``, the profile name for
``~/.hermes/profiles/<name>``, or a short hash for any other custom
HERMES_HOME path.
"""
import hashlib
import re
from pathlib import Path as _Path
home = get_hermes_home().resolve()
default = (_Path.home() / ".hermes").resolve()
if home == default:
return ""
# Detect ~/.hermes/profiles/<name> pattern → use the profile name
profiles_root = (default / "profiles").resolve()
try:
rel = home.relative_to(profiles_root)
parts = rel.parts
if len(parts) == 1 and re.match(r"^[a-z0-9][a-z0-9_-]{0,63}$", parts[0]):
return parts[0]
except ValueError:
pass
# Fallback: short hash for arbitrary HERMES_HOME paths
return hashlib.sha256(str(home).encode()).hexdigest()[:8]
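The suffix derivation above can be sketched standalone. This is an illustrative mirror of `_profile_suffix` (the helper name `suffix_for_home` and the paths are hypothetical; the real function resolves `HERMES_HOME` itself):

```python
import hashlib
import re
from pathlib import PurePosixPath

PROFILE_ID_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")

def suffix_for_home(home: str, default: str = "/home/u/.hermes") -> str:
    """"" for the default home, the profile name under
    <default>/profiles/<name>, else a short stable hash."""
    home_p = PurePosixPath(home)
    if home_p == PurePosixPath(default):
        return ""
    profiles_root = PurePosixPath(default) / "profiles"
    try:
        rel = home_p.relative_to(profiles_root)
        # Exactly one path component that looks like a valid profile id
        if len(rel.parts) == 1 and PROFILE_ID_RE.match(rel.parts[0]):
            return rel.parts[0]
    except ValueError:
        pass  # home is not under profiles/ at all
    return hashlib.sha256(str(home_p).encode()).hexdigest()[:8]

assert suffix_for_home("/home/u/.hermes") == ""
assert suffix_for_home("/home/u/.hermes/profiles/coder") == "coder"
assert len(suffix_for_home("/opt/custom-home")) == 8  # arbitrary path → hash
```

With this mapping, the default install keeps the legacy `hermes-gateway` service name, while each profile gets a predictable, human-readable suffix.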
def get_service_name() -> str:
"""Derive a systemd service name scoped to this HERMES_HOME.
Default ``~/.hermes`` returns ``hermes-gateway`` (backward compatible).
Any other HERMES_HOME appends a short hash so multiple installations
can each have their own systemd service without conflicting.
Profile ``~/.hermes/profiles/coder`` returns ``hermes-gateway-coder``.
Any other HERMES_HOME appends a short hash for uniqueness.
"""
import hashlib
from pathlib import Path as _Path # local import to avoid monkeypatch interference
home = get_hermes_home().resolve()
default = (_Path.home() / ".hermes").resolve()
if home == default:
suffix = _profile_suffix()
if not suffix:
return _SERVICE_BASE
suffix = hashlib.sha256(str(home).encode()).hexdigest()[:8]
return f"{_SERVICE_BASE}-{suffix}"
@ -369,7 +394,14 @@ def print_systemd_linger_guidance() -> None:
print(" sudo loginctl enable-linger $USER")
def get_launchd_plist_path() -> Path:
return Path.home() / "Library" / "LaunchAgents" / "ai.hermes.gateway.plist"
"""Return the launchd plist path, scoped per profile.
Default ``~/.hermes`` → ``ai.hermes.gateway.plist`` (backward compatible).
Profile ``~/.hermes/profiles/coder`` → ``ai.hermes.gateway-coder.plist``.
"""
suffix = _profile_suffix()
name = f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
return Path.home() / "Library" / "LaunchAgents" / f"{name}.plist"
def _detect_venv_dir() -> Path | None:
"""Detect the active virtualenv directory.
@ -420,6 +452,17 @@ def get_hermes_cli_path() -> str:
# Systemd (Linux)
# =============================================================================
def _build_user_local_paths(home: Path, path_entries: list[str]) -> list[str]:
"""Return user-local bin dirs that exist and aren't already in *path_entries*."""
candidates = [
str(home / ".local" / "bin"), # uv, uvx, pip-installed CLIs
str(home / ".cargo" / "bin"), # Rust/cargo tools
str(home / "go" / "bin"), # Go tools
str(home / ".npm-global" / "bin"), # npm global packages
]
return [p for p in candidates if p not in path_entries and Path(p).exists()]
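The filter above only admits candidate bin dirs that actually exist and aren't already listed. A standalone rerun of the same filter (the helper name here is a hypothetical mirror of `_build_user_local_paths`):

```python
import tempfile
from pathlib import Path

def build_user_local_paths(home: Path, path_entries: list) -> list:
    # Keep candidate user-local bin dirs that exist on disk and
    # aren't already present in path_entries.
    candidates = [
        str(home / ".local" / "bin"),
        str(home / ".cargo" / "bin"),
        str(home / "go" / "bin"),
        str(home / ".npm-global" / "bin"),
    ]
    return [p for p in candidates if p not in path_entries and Path(p).exists()]

with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    (home / ".local" / "bin").mkdir(parents=True)  # only this candidate exists
    assert build_user_local_paths(home, []) == [str(home / ".local" / "bin")]
    # Already-listed entries are filtered out too:
    assert build_user_local_paths(home, [str(home / ".local" / "bin")]) == []
```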
def generate_systemd_unit(system: bool = False, run_as_user: str | None = None) -> str:
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
@ -434,13 +477,16 @@ def generate_systemd_unit(system: bool = False, run_as_user: str | None = None)
resolved_node_dir = str(Path(resolved_node).resolve().parent)
if resolved_node_dir not in path_entries:
path_entries.append(resolved_node_dir)
path_entries.extend(["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"])
sane_path = ":".join(path_entries)
hermes_home = str(get_hermes_home().resolve())
common_bin_paths = ["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"]
if system:
username, group_name, home_dir = _system_service_identity(run_as_user)
path_entries.extend(_build_user_local_paths(Path(home_dir), path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network-online.target
@ -472,6 +518,9 @@ StandardError=journal
WantedBy=multi-user.target
"""
path_entries.extend(_build_user_local_paths(Path.home(), path_entries))
path_entries.extend(common_bin_paths)
sane_path = ":".join(path_entries)
return f"""[Unit]
Description={SERVICE_DESCRIPTION}
After=network.target
@ -752,18 +801,46 @@ def systemd_status(deep: bool = False, system: bool = False):
# Launchd (macOS)
# =============================================================================
def get_launchd_label() -> str:
"""Return the launchd service label, scoped per profile."""
suffix = _profile_suffix()
return f"ai.hermes.gateway-{suffix}" if suffix else "ai.hermes.gateway"
def generate_launchd_plist() -> str:
python_path = get_python_path()
working_dir = str(PROJECT_ROOT)
hermes_home = str(get_hermes_home().resolve())
log_dir = get_hermes_home() / "logs"
log_dir.mkdir(parents=True, exist_ok=True)
label = get_launchd_label()
# Build a sane PATH for the launchd plist. launchd provides only a
# minimal default (/usr/bin:/bin:/usr/sbin:/sbin) which misses Homebrew,
# nvm, cargo, etc. We prepend venv/bin and node_modules/.bin (matching
# the systemd unit), then capture the user's full shell PATH so every
# user-installed tool (node, ffmpeg, …) is reachable.
detected_venv = _detect_venv_dir()
venv_bin = str(detected_venv / "bin") if detected_venv else str(PROJECT_ROOT / "venv" / "bin")
venv_dir = str(detected_venv) if detected_venv else str(PROJECT_ROOT / "venv")
node_bin = str(PROJECT_ROOT / "node_modules" / ".bin")
# Resolve the directory containing the node binary (e.g. Homebrew, nvm)
# so it's explicitly in PATH even if the user's shell PATH changes later.
priority_dirs = [venv_bin, node_bin]
resolved_node = shutil.which("node")
if resolved_node:
resolved_node_dir = str(Path(resolved_node).resolve().parent)
if resolved_node_dir not in priority_dirs:
priority_dirs.append(resolved_node_dir)
sane_path = ":".join(
dict.fromkeys(priority_dirs + [p for p in os.environ.get("PATH", "").split(":") if p])
)
return f"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>ai.hermes.gateway</string>
<string>{label}</string>
<key>ProgramArguments</key>
<array>
@ -778,6 +855,16 @@ def generate_launchd_plist() -> str:
<key>WorkingDirectory</key>
<string>{working_dir}</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>{sane_path}</string>
<key>VIRTUAL_ENV</key>
<string>{venv_dir}</string>
<key>HERMES_HOME</key>
<string>{hermes_home}</string>
</dict>
<key>RunAtLoad</key>
<true/>
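The `sane_path` assembly above relies on `dict.fromkeys` for order-preserving deduplication (dicts keep insertion order as of Python 3.7), so the venv and node priority dirs always win over duplicates from the user's shell PATH. A minimal illustration with made-up paths:

```python
# Priority dirs stay first; later duplicates from the shell PATH are dropped.
priority = ["/proj/venv/bin", "/proj/node_modules/.bin"]
shell_path = "/usr/local/bin:/proj/venv/bin:/usr/bin:/usr/local/bin"

merged = ":".join(dict.fromkeys(priority + [p for p in shell_path.split(":") if p]))
assert merged == "/proj/venv/bin:/proj/node_modules/.bin:/usr/local/bin:/usr/bin"
```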
@ -850,7 +937,8 @@ def launchd_install(force: bool = False):
print()
print("Next steps:")
print(" hermes gateway status # Check status")
print(" tail -f ~/.hermes/logs/gateway.log # View logs")
from hermes_constants import display_hermes_home as _dhh
print(f" tail -f {_dhh()}/logs/gateway.log # View logs")
def launchd_uninstall():
plist_path = get_launchd_plist_path()
@ -863,20 +951,33 @@ def launchd_uninstall():
print("✓ Service uninstalled")
def launchd_start():
refresh_launchd_plist_if_needed()
plist_path = get_launchd_plist_path()
label = get_launchd_label()
# Self-heal if the plist is missing entirely (e.g., manual cleanup, failed upgrade)
if not plist_path.exists():
print("↻ launchd plist missing; regenerating service definition")
plist_path.parent.mkdir(parents=True, exist_ok=True)
plist_path.write_text(generate_launchd_plist(), encoding="utf-8")
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
subprocess.run(["launchctl", "start", label], check=True)
print("✓ Service started")
return
refresh_launchd_plist_if_needed()
try:
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
subprocess.run(["launchctl", "start", label], check=True)
except subprocess.CalledProcessError as e:
if e.returncode != 3 or not plist_path.exists():
if e.returncode != 3:
raise
print("↻ launchd job was unloaded; reloading service definition")
subprocess.run(["launchctl", "load", str(plist_path)], check=True)
subprocess.run(["launchctl", "start", "ai.hermes.gateway"], check=True)
subprocess.run(["launchctl", "start", label], check=True)
print("✓ Service started")
def launchd_stop():
subprocess.run(["launchctl", "stop", "ai.hermes.gateway"], check=True)
label = get_launchd_label()
subprocess.run(["launchctl", "stop", label], check=True)
print("✓ Service stopped")
def _wait_for_gateway_exit(timeout: float = 10.0, force_after: float = 5.0):
@ -931,8 +1032,9 @@ def launchd_restart():
def launchd_status(deep: bool = False):
plist_path = get_launchd_plist_path()
label = get_launchd_label()
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
["launchctl", "list", label],
capture_output=True,
text=True
)
@ -1220,6 +1322,59 @@ _PLATFORMS = [
"help": "The AppSecret from your DingTalk application credentials."},
],
},
{
"key": "feishu",
"label": "Feishu / Lark",
"emoji": "🪽",
"token_var": "FEISHU_APP_ID",
"setup_instructions": [
"1. Go to https://open.feishu.cn/ (or https://open.larksuite.com/ for Lark)",
"2. Create an app and copy the App ID and App Secret",
"3. Enable the Bot capability for the app",
"4. Choose WebSocket (recommended) or Webhook connection mode",
"5. Add the bot to a group chat or message it directly",
"6. Restrict access with FEISHU_ALLOWED_USERS for production use",
],
"vars": [
{"name": "FEISHU_APP_ID", "prompt": "App ID", "password": False,
"help": "The App ID from your Feishu/Lark application."},
{"name": "FEISHU_APP_SECRET", "prompt": "App Secret", "password": True,
"help": "The App Secret from your Feishu/Lark application."},
{"name": "FEISHU_DOMAIN", "prompt": "Domain — feishu or lark (default: feishu)", "password": False,
"help": "Use 'feishu' for Feishu China, or 'lark' for Lark international."},
{"name": "FEISHU_CONNECTION_MODE", "prompt": "Connection mode — websocket or webhook (default: websocket)", "password": False,
"help": "websocket is recommended unless you specifically need webhook mode."},
{"name": "FEISHU_ALLOWED_USERS", "prompt": "Allowed user IDs (comma-separated, or empty)", "password": False,
"is_allowlist": True,
"help": "Restrict which Feishu/Lark users can interact with the bot."},
{"name": "FEISHU_HOME_CHANNEL", "prompt": "Home chat ID (optional, for cron/notifications)", "password": False,
"help": "Chat ID for scheduled results and notifications."},
],
},
{
"key": "wecom",
"label": "WeCom (Enterprise WeChat)",
"emoji": "💬",
"token_var": "WECOM_BOT_ID",
"setup_instructions": [
"1. Go to WeCom Admin Console → Applications → Create AI Bot",
"2. Copy the Bot ID and Secret from the bot's credentials page",
"3. The bot connects via WebSocket — no public endpoint needed",
"4. Add the bot to a group chat or message it directly in WeCom",
"5. Restrict access with WECOM_ALLOWED_USERS for production use",
],
"vars": [
{"name": "WECOM_BOT_ID", "prompt": "Bot ID", "password": False,
"help": "The Bot ID from your WeCom AI Bot."},
{"name": "WECOM_SECRET", "prompt": "Secret", "password": True,
"help": "The secret from your WeCom AI Bot."},
{"name": "WECOM_ALLOWED_USERS", "prompt": "Allowed user IDs (comma-separated, or empty)", "password": False,
"is_allowlist": True,
"help": "Restrict which WeCom users can interact with the bot."},
{"name": "WECOM_HOME_CHANNEL", "prompt": "Home chat ID (optional, for cron/notifications)", "password": False,
"help": "Chat ID for scheduled results and notifications."},
],
},
]
@ -1437,7 +1592,7 @@ def _is_service_running() -> bool:
return False
elif is_macos() and get_launchd_plist_path().exists():
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
["launchctl", "list", get_launchd_label()],
capture_output=True, text=True
)
return result.returncode == 0

File diff suppressed because it is too large Load diff

View file

@ -24,6 +24,7 @@ from hermes_cli.config import (
get_hermes_home, # noqa: F401 — used by test mocks
)
from hermes_cli.colors import Colors, color
from hermes_constants import display_hermes_home
logger = logging.getLogger(__name__)
@ -244,7 +245,7 @@ def cmd_mcp_add(args):
api_key = _prompt("API key / Bearer token", password=True)
if api_key:
save_env_value(env_key, api_key)
_success(f"Saved to ~/.hermes/.env as {env_key}")
_success(f"Saved to {display_hermes_home()}/.env as {env_key}")
# Set header with env var interpolation
if api_key or existing_key:
@ -332,7 +333,7 @@ def cmd_mcp_add(args):
_save_mcp_server(name, server_config)
print()
_success(f"Saved '{name}' to ~/.hermes/config.yaml ({tool_count}/{total} tools enabled)")
_success(f"Saved '{name}' to {display_hermes_home()}/config.yaml ({tool_count}/{total} tools enabled)")
_info("Start a new session to use these tools.")
@ -607,6 +608,11 @@ def mcp_command(args):
"""Main dispatcher for ``hermes mcp`` subcommands."""
action = getattr(args, "mcp_action", None)
if action == "serve":
from mcp_serve import run_mcp_server
run_mcp_server(verbose=getattr(args, "verbose", False))
return
handlers = {
"add": cmd_mcp_add,
"remove": cmd_mcp_remove,
@ -625,6 +631,7 @@ def mcp_command(args):
# No subcommand — show list
cmd_mcp_list()
print(color(" Commands:", Colors.CYAN))
_info("hermes mcp serve Run as MCP server")
_info("hermes mcp add <name> --url <endpoint> Add an MCP server")
_info("hermes mcp add <name> --command <cmd> Add a stdio server")
_info("hermes mcp remove <name> Remove a server")

View file

@ -35,6 +35,8 @@ OPENROUTER_MODELS: list[tuple[str, str]] = [
("openai/gpt-5.3-codex", ""),
("google/gemini-3-pro-preview", ""),
("google/gemini-3-flash-preview", ""),
("google/gemini-3.1-pro-preview", ""),
("google/gemini-3.1-flash-lite-preview", ""),
("qwen/qwen3.5-plus-02-15", ""),
("qwen/qwen3.5-35b-a3b", ""),
("stepfun/step-3.5-flash", ""),
@ -62,6 +64,8 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"openai/gpt-5.3-codex",
"google/gemini-3-pro-preview",
"google/gemini-3-flash-preview",
"google/gemini-3.1-pro-preview",
"google/gemini-3.1-flash-lite-preview",
"qwen/qwen3.5-plus-02-15",
"qwen/qwen3.5-35b-a3b",
"stepfun/step-3.5-flash",
@ -208,14 +212,31 @@ _PROVIDER_MODELS: dict[str, list[str]] = {
"google/gemini-3-pro-preview",
"google/gemini-3-flash-preview",
],
# Alibaba DashScope Coding platform (coding-intl) — default endpoint.
# Supports Qwen models + third-party providers (GLM, Kimi, MiniMax).
# Users with classic DashScope keys should override DASHSCOPE_BASE_URL
# to https://dashscope-intl.aliyuncs.com/compatible-mode/v1 (OpenAI-compat)
# or https://dashscope-intl.aliyuncs.com/apps/anthropic (Anthropic-compat).
"alibaba": [
"qwen3.5-plus",
"qwen3-max",
"qwen3-coder-plus",
"qwen3-coder-next",
"qwen-plus-latest",
"qwen3.5-flash",
"qwen-vl-max",
# Third-party models available on coding-intl
"glm-5",
"glm-4.7",
"kimi-k2.5",
"MiniMax-M2.5",
],
# Curated HF model list — only agentic models that map to OpenRouter defaults.
"huggingface": [
"Qwen/Qwen3.5-397B-A17B",
"Qwen/Qwen3.5-35B-A3B",
"deepseek-ai/DeepSeek-V3.2",
"moonshotai/Kimi-K2.5",
"MiniMaxAI/MiniMax-M2.5",
"zai-org/GLM-5",
"XiaomiMiMo/MiMo-V2-Flash",
"moonshotai/Kimi-K2-Thinking",
],
}
@ -236,6 +257,7 @@ _PROVIDER_LABELS = {
"ai-gateway": "AI Gateway",
"kilocode": "Kilo Code",
"alibaba": "Alibaba Cloud (DashScope)",
"huggingface": "Hugging Face",
"custom": "Custom endpoint",
}
@ -271,6 +293,9 @@ _PROVIDER_ALIASES = {
"aliyun": "alibaba",
"qwen": "alibaba",
"alibaba-cloud": "alibaba",
"hf": "huggingface",
"hugging-face": "huggingface",
"huggingface-hub": "huggingface",
}
@ -304,7 +329,7 @@ def list_available_providers() -> list[dict[str, str]]:
# Canonical providers in display order
_PROVIDER_ORDER = [
"openrouter", "nous", "openai-codex", "copilot", "copilot-acp",
"zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
"huggingface", "zai", "kimi-coding", "minimax", "minimax-cn", "kilocode", "anthropic", "alibaba",
"opencode-zen", "opencode-go",
"ai-gateway", "deepseek", "custom",
]

View file

@ -14,6 +14,7 @@ from tools.tool_backend_helpers import (
managed_nous_tools_enabled,
normalize_browser_cloud_provider,
normalize_modal_mode,
resolve_modal_backend_state,
resolve_openai_audio_api_key,
)
@ -185,6 +186,7 @@ def get_nous_subscription_features(
else None
)
direct_exa = bool(get_env_value("EXA_API_KEY"))
direct_firecrawl = bool(get_env_value("FIRECRAWL_API_KEY") or get_env_value("FIRECRAWL_API_URL"))
direct_parallel = bool(get_env_value("PARALLEL_API_KEY"))
direct_tavily = bool(get_env_value("TAVILY_API_KEY"))
@ -200,19 +202,25 @@ def get_nous_subscription_features(
managed_tts_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("openai-audio")
managed_browser_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("browserbase")
managed_modal_available = managed_tools_flag and nous_auth_present and is_managed_tool_gateway_ready("modal")
modal_state = resolve_modal_backend_state(
modal_mode,
has_direct=direct_modal,
managed_ready=managed_modal_available,
)
web_managed = web_backend == "firecrawl" and managed_web_available and not direct_firecrawl
web_active = bool(
web_tool_enabled
and (
web_managed
or (web_backend == "exa" and direct_exa)
or (web_backend == "firecrawl" and direct_firecrawl)
or (web_backend == "parallel" and direct_parallel)
or (web_backend == "tavily" and direct_tavily)
)
)
web_available = bool(
managed_web_available or direct_firecrawl or direct_parallel or direct_tavily
managed_web_available or direct_exa or direct_firecrawl or direct_parallel or direct_tavily
)
image_managed = image_tool_enabled and managed_image_available and not direct_fal
@ -260,25 +268,31 @@ def get_nous_subscription_features(
modal_available = True
modal_active = bool(modal_tool_enabled)
modal_direct_override = False
elif modal_state["selected_backend"] == "managed":
modal_managed = bool(modal_tool_enabled)
modal_available = True
modal_active = bool(modal_tool_enabled)
modal_direct_override = False
elif modal_state["selected_backend"] == "direct":
modal_managed = False
modal_available = True
modal_active = bool(modal_tool_enabled)
modal_direct_override = bool(modal_tool_enabled)
elif modal_mode == "managed":
modal_managed = bool(modal_tool_enabled and managed_modal_available)
modal_managed = False
modal_available = bool(managed_modal_available)
modal_active = bool(modal_tool_enabled and managed_modal_available)
modal_active = False
modal_direct_override = False
elif modal_mode == "direct":
modal_managed = False
modal_available = bool(direct_modal)
modal_active = bool(modal_tool_enabled and direct_modal)
modal_direct_override = bool(direct_modal)
modal_active = False
modal_direct_override = False
else:
modal_managed = bool(
modal_tool_enabled
and managed_modal_available
and not direct_modal
)
modal_managed = False
modal_available = bool(managed_modal_available or direct_modal)
modal_active = bool(modal_tool_enabled and (direct_modal or managed_modal_available))
modal_direct_override = bool(direct_modal)
modal_active = False
modal_direct_override = False
tts_explicit_configured = False
raw_tts_cfg = config.get("tts")

View file

@ -70,6 +70,17 @@ def _env_enabled(name: str) -> bool:
return env_var_enabled(name)
def _get_disabled_plugins() -> set:
"""Read the disabled plugins list from config.yaml."""
try:
from hermes_cli.config import load_config
config = load_config()
disabled = config.get("plugins", {}).get("disabled", [])
return set(disabled) if isinstance(disabled, list) else set()
except Exception:
return set()
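For reference, the shape this helper reads from `config.yaml` would look like the following (the plugin names are illustrative):

```yaml
plugins:
  disabled:
    - weather
    - old-notifier
```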
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@ -201,8 +212,15 @@ class PluginManager:
# 3. Pip / entry-point plugins
manifests.extend(self._scan_entry_points())
# Load each manifest
# Load each manifest (skip user-disabled plugins)
disabled = _get_disabled_plugins()
for manifest in manifests:
if manifest.name in disabled:
loaded = LoadedPlugin(manifest=manifest, enabled=False)
loaded.error = "disabled via config"
self._plugins[manifest.name] = loaded
logger.debug("Skipping disabled plugin '%s'", manifest.name)
continue
self._load_plugin(manifest)
if manifests:
@ -387,16 +405,23 @@ class PluginManager:
# Hook invocation
# -----------------------------------------------------------------------
def invoke_hook(self, hook_name: str, **kwargs: Any) -> None:
def invoke_hook(self, hook_name: str, **kwargs: Any) -> List[Any]:
"""Call all registered callbacks for *hook_name*.
Each callback is wrapped in its own try/except so a misbehaving
plugin cannot break the core agent loop.
Returns a list of non-``None`` return values from callbacks.
This allows hooks like ``pre_llm_call`` to contribute context
that the agent core can collect and inject.
"""
callbacks = self._hooks.get(hook_name, [])
results: List[Any] = []
for cb in callbacks:
try:
cb(**kwargs)
ret = cb(**kwargs)
if ret is not None:
results.append(ret)
except Exception as exc:
logger.warning(
"Hook '%s' callback %s raised: %s",
@ -404,6 +429,7 @@ class PluginManager:
getattr(cb, "__name__", repr(cb)),
exc,
)
return results
# -----------------------------------------------------------------------
# Introspection
@ -448,9 +474,12 @@ def discover_plugins() -> None:
get_plugin_manager().discover_and_load()
def invoke_hook(hook_name: str, **kwargs: Any) -> None:
"""Invoke a lifecycle hook on all loaded plugins."""
get_plugin_manager().invoke_hook(hook_name, **kwargs)
def invoke_hook(hook_name: str, **kwargs: Any) -> List[Any]:
"""Invoke a lifecycle hook on all loaded plugins.
Returns a list of non-``None`` return values from plugin callbacks.
"""
return get_plugin_manager().invoke_hook(hook_name, **kwargs)
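The collect-non-`None`-returns pattern that `invoke_hook` now uses can be shown in isolation. A minimal sketch (the hook name and callbacks are hypothetical, not part of the real plugin API):

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)

class Hooks:
    def __init__(self) -> None:
        self._hooks: Dict[str, List[Callable[..., Any]]] = {}

    def register(self, name: str, cb: Callable[..., Any]) -> None:
        self._hooks.setdefault(name, []).append(cb)

    def invoke(self, name: str, **kwargs: Any) -> List[Any]:
        # Each callback is isolated in try/except; non-None returns
        # are collected so hooks can contribute context upstream.
        results: List[Any] = []
        for cb in self._hooks.get(name, []):
            try:
                ret = cb(**kwargs)
                if ret is not None:
                    results.append(ret)
            except Exception as exc:
                logger.warning("hook %s raised: %s", name, exc)
        return results

hooks = Hooks()
hooks.register("pre_llm_call", lambda **kw: "context-from-plugin-a")
hooks.register("pre_llm_call", lambda **kw: None)   # ignored
hooks.register("pre_llm_call", lambda **kw: 1 / 0)  # failure stays isolated
assert hooks.invoke("pre_llm_call", prompt="hi") == ["context-from-plugin-a"]
```

One failing or silent plugin neither breaks the loop nor pollutes the collected results.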
def get_plugin_tool_names() -> Set[str]:

View file

@ -374,6 +374,73 @@ def cmd_remove(name: str) -> None:
_display_removed(name, plugins_dir)
def _get_disabled_set() -> set:
"""Read the disabled plugins set from config.yaml."""
try:
from hermes_cli.config import load_config
config = load_config()
disabled = config.get("plugins", {}).get("disabled", [])
return set(disabled) if isinstance(disabled, list) else set()
except Exception:
return set()
def _save_disabled_set(disabled: set) -> None:
"""Write the disabled plugins list to config.yaml."""
from hermes_cli.config import load_config, save_config
config = load_config()
if "plugins" not in config:
config["plugins"] = {}
config["plugins"]["disabled"] = sorted(disabled)
save_config(config)
def cmd_enable(name: str) -> None:
"""Enable a previously disabled plugin."""
from rich.console import Console
console = Console()
plugins_dir = _plugins_dir()
# Verify the plugin exists
target = plugins_dir / name
if not target.is_dir():
console.print(f"[red]Plugin '{name}' is not installed.[/red]")
sys.exit(1)
disabled = _get_disabled_set()
if name not in disabled:
console.print(f"[dim]Plugin '{name}' is already enabled.[/dim]")
return
disabled.discard(name)
_save_disabled_set(disabled)
console.print(f"[green]✓[/green] Plugin [bold]{name}[/bold] enabled. Takes effect on next session.")
def cmd_disable(name: str) -> None:
"""Disable a plugin without removing it."""
from rich.console import Console
console = Console()
plugins_dir = _plugins_dir()
# Verify the plugin exists
target = plugins_dir / name
if not target.is_dir():
console.print(f"[red]Plugin '{name}' is not installed.[/red]")
sys.exit(1)
disabled = _get_disabled_set()
if name in disabled:
console.print(f"[dim]Plugin '{name}' is already disabled.[/dim]")
return
disabled.add(name)
_save_disabled_set(disabled)
console.print(f"[yellow]⊘[/yellow] Plugin [bold]{name}[/bold] disabled. Takes effect on next session.")
def cmd_list() -> None:
"""List installed plugins."""
from rich.console import Console
@ -393,8 +460,11 @@ def cmd_list() -> None:
console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
return
disabled = _get_disabled_set()
table = Table(title="Installed Plugins", show_lines=False)
table.add_column("Name", style="bold")
table.add_column("Status")
table.add_column("Version", style="dim")
table.add_column("Description")
table.add_column("Source", style="dim")
@ -420,11 +490,86 @@ def cmd_list() -> None:
if (d / ".git").exists():
source = "git"
table.add_row(name, str(version), description, source)
is_disabled = name in disabled or d.name in disabled
status = "[red]disabled[/red]" if is_disabled else "[green]enabled[/green]"
table.add_row(name, status, str(version), description, source)
console.print()
console.print(table)
console.print()
console.print("[dim]Interactive toggle:[/dim] hermes plugins")
console.print("[dim]Enable/disable:[/dim] hermes plugins enable/disable <name>")
def cmd_toggle() -> None:
"""Interactive curses checklist to enable/disable installed plugins."""
from rich.console import Console
try:
import yaml
except ImportError:
yaml = None
console = Console()
plugins_dir = _plugins_dir()
dirs = sorted(d for d in plugins_dir.iterdir() if d.is_dir())
if not dirs:
console.print("[dim]No plugins installed.[/dim]")
console.print("[dim]Install with:[/dim] hermes plugins install owner/repo")
return
disabled = _get_disabled_set()
# Build items list: "name — description" for display
names = []
labels = []
selected = set()
for i, d in enumerate(dirs):
manifest_file = d / "plugin.yaml"
name = d.name
description = ""
if manifest_file.exists() and yaml:
try:
with open(manifest_file) as f:
manifest = yaml.safe_load(f) or {}
name = manifest.get("name", d.name)
description = manifest.get("description", "")
except Exception:
pass
names.append(name)
label = f"{name} — {description}" if description else name
labels.append(label)
if name not in disabled and d.name not in disabled:
selected.add(i)
from hermes_cli.curses_ui import curses_checklist
result = curses_checklist(
title="Plugins — toggle enabled/disabled",
items=labels,
selected=selected,
)
# Compute new disabled set from deselected items
new_disabled = set()
for i, name in enumerate(names):
if i not in result:
new_disabled.add(name)
if new_disabled != disabled:
_save_disabled_set(new_disabled)
enabled_count = len(names) - len(new_disabled)
console.print(
f"\n[green]✓[/green] {enabled_count} enabled, {len(new_disabled)} disabled. "
f"Takes effect on next session."
)
else:
console.print("\n[dim]No changes.[/dim]")
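The toggle flow above derives the new disabled set purely from the checklist result: anything the user deselected becomes disabled. Sketched in isolation (hypothetical helper name, not part of the CLI):

```python
def compute_disabled(names: list[str], selected: set[int]) -> set[str]:
    """Plugins whose index is not in the selected set are disabled."""
    return {name for i, name in enumerate(names) if i not in selected}
```

Comparing the result against the previously persisted set, as cmd_toggle does, avoids rewriting config when nothing changed.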
def plugins_command(args) -> None:
@@ -437,8 +582,14 @@ def plugins_command(args) -> None:
cmd_update(args.name)
elif action in ("remove", "rm", "uninstall"):
cmd_remove(args.name)
elif action == "enable":
cmd_enable(args.name)
elif action == "disable":
cmd_disable(args.name)
elif action in ("list", "ls"):
cmd_list()
elif action is None:
cmd_toggle()
else:
from rich.console import Console

hermes_cli/profiles.py (new file, +906 lines)
@@ -0,0 +1,906 @@
"""
Profile management for multiple isolated Hermes instances.
Each profile is a fully independent HERMES_HOME directory with its own
config.yaml, .env, memory, sessions, skills, gateway, cron, and logs.
Profiles live under ``~/.hermes/profiles/<name>/`` by default.
The "default" profile is ``~/.hermes`` itself backward compatible,
zero migration needed.
Usage::
hermes profile create coder # fresh profile + bundled skills
hermes profile create coder --clone # also copy config, .env, SOUL.md
hermes profile create coder --clone-all # full copy of source profile
coder chat # use via wrapper alias
hermes -p coder chat # or via flag
hermes profile use coder # set as sticky default
hermes profile delete coder # remove profile + alias + service
"""
import json
import os
import re
import shutil
import stat
import subprocess
import sys
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional
_PROFILE_ID_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
# Directories bootstrapped inside every new profile
_PROFILE_DIRS = [
"memories",
"sessions",
"skills",
"skins",
"logs",
"plans",
"workspace",
"cron",
]
# Files copied during --clone (if they exist in the source)
_CLONE_CONFIG_FILES = [
"config.yaml",
".env",
"SOUL.md",
]
# Runtime files stripped after --clone-all (shouldn't carry over)
_CLONE_ALL_STRIP = [
"gateway.pid",
"gateway_state.json",
"processes.json",
]
# Names that cannot be used as profile aliases
_RESERVED_NAMES = frozenset({
"hermes", "default", "test", "tmp", "root", "sudo",
})
# Hermes subcommands that cannot be used as profile names/aliases
_HERMES_SUBCOMMANDS = frozenset({
"chat", "model", "gateway", "setup", "whatsapp", "login", "logout",
"status", "cron", "doctor", "config", "pairing", "skills", "tools",
"mcp", "sessions", "insights", "version", "update", "uninstall",
"profile", "plugins", "honcho", "acp",
})
# ---------------------------------------------------------------------------
# Path helpers
# ---------------------------------------------------------------------------
def _get_profiles_root() -> Path:
"""Return the directory where named profiles are stored.
Always ``~/.hermes/profiles/`` anchored to the user's home,
NOT to the current HERMES_HOME (which may itself be a profile).
This ensures ``coder profile list`` can see all profiles.
"""
return Path.home() / ".hermes" / "profiles"
def _get_default_hermes_home() -> Path:
"""Return the default (pre-profile) HERMES_HOME path."""
return Path.home() / ".hermes"
def _get_active_profile_path() -> Path:
"""Return the path to the sticky active_profile file."""
return _get_default_hermes_home() / "active_profile"
def _get_wrapper_dir() -> Path:
"""Return the directory for wrapper scripts."""
return Path.home() / ".local" / "bin"
# ---------------------------------------------------------------------------
# Validation
# ---------------------------------------------------------------------------
def validate_profile_name(name: str) -> None:
"""Raise ``ValueError`` if *name* is not a valid profile identifier."""
if name == "default":
return # special alias for ~/.hermes
if not _PROFILE_ID_RE.match(name):
raise ValueError(
f"Invalid profile name {name!r}. Must match "
f"[a-z0-9][a-z0-9_-]{{0,63}}"
)
def get_profile_dir(name: str) -> Path:
"""Resolve a profile name to its HERMES_HOME directory."""
if name == "default":
return _get_default_hermes_home()
return _get_profiles_root() / name
def profile_exists(name: str) -> bool:
"""Check whether a profile directory exists."""
if name == "default":
return True
return get_profile_dir(name).is_dir()
# ---------------------------------------------------------------------------
# Alias / wrapper script management
# ---------------------------------------------------------------------------
def check_alias_collision(name: str) -> Optional[str]:
"""Return a human-readable collision message, or None if the name is safe.
Checks: reserved names, hermes subcommands, existing binaries in PATH.
"""
if name in _RESERVED_NAMES:
return f"'{name}' is a reserved name"
if name in _HERMES_SUBCOMMANDS:
return f"'{name}' conflicts with a hermes subcommand"
# Check existing commands in PATH
wrapper_dir = _get_wrapper_dir()
try:
result = subprocess.run(
["which", name], capture_output=True, text=True, timeout=5,
)
if result.returncode == 0:
existing_path = result.stdout.strip()
# Allow overwriting our own wrappers
if existing_path == str(wrapper_dir / name):
try:
content = (wrapper_dir / name).read_text()
if "hermes -p" in content:
return None # it's our wrapper, safe to overwrite
except Exception:
pass
return f"'{name}' conflicts with an existing command ({existing_path})"
except (FileNotFoundError, subprocess.TimeoutExpired):
pass
return None # safe
def _is_wrapper_dir_in_path() -> bool:
"""Check if ~/.local/bin is in PATH."""
wrapper_dir = str(_get_wrapper_dir())
return wrapper_dir in os.environ.get("PATH", "").split(os.pathsep)
def create_wrapper_script(name: str) -> Optional[Path]:
"""Create a shell wrapper script at ~/.local/bin/<name>.
Returns the path to the created wrapper, or None if creation failed.
"""
wrapper_dir = _get_wrapper_dir()
try:
wrapper_dir.mkdir(parents=True, exist_ok=True)
except OSError as e:
print(f"⚠ Could not create {wrapper_dir}: {e}")
return None
wrapper_path = wrapper_dir / name
try:
wrapper_path.write_text(f'#!/bin/sh\nexec hermes -p {name} "$@"\n')
wrapper_path.chmod(wrapper_path.stat().st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
return wrapper_path
except OSError as e:
print(f"⚠ Could not create wrapper at {wrapper_path}: {e}")
return None
def remove_wrapper_script(name: str) -> bool:
"""Remove the wrapper script for a profile. Returns True if removed."""
wrapper_path = _get_wrapper_dir() / name
if wrapper_path.exists():
try:
# Verify it's our wrapper before removing
content = wrapper_path.read_text()
if "hermes -p" in content:
wrapper_path.unlink()
return True
except Exception:
pass
return False
# ---------------------------------------------------------------------------
# ProfileInfo
# ---------------------------------------------------------------------------
@dataclass
class ProfileInfo:
"""Summary information about a profile."""
name: str
path: Path
is_default: bool
gateway_running: bool
model: Optional[str] = None
provider: Optional[str] = None
has_env: bool = False
skill_count: int = 0
alias_path: Optional[Path] = None
def _read_config_model(profile_dir: Path) -> tuple:
"""Read model/provider from a profile's config.yaml. Returns (model, provider)."""
config_path = profile_dir / "config.yaml"
if not config_path.exists():
return None, None
try:
import yaml
with open(config_path, "r") as f:
cfg = yaml.safe_load(f) or {}
model_cfg = cfg.get("model", {})
if isinstance(model_cfg, str):
return model_cfg, None
if isinstance(model_cfg, dict):
return model_cfg.get("model"), model_cfg.get("provider")
return None, None
except Exception:
return None, None
def _check_gateway_running(profile_dir: Path) -> bool:
"""Check if a gateway is running for a given profile directory."""
pid_file = profile_dir / "gateway.pid"
if not pid_file.exists():
return False
try:
raw = pid_file.read_text().strip()
if not raw:
return False
data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
pid = int(data["pid"])
os.kill(pid, 0) # existence check
return True
except (json.JSONDecodeError, KeyError, ValueError, TypeError,
ProcessLookupError, PermissionError, OSError):
return False
def _count_skills(profile_dir: Path) -> int:
"""Count installed skills in a profile."""
skills_dir = profile_dir / "skills"
if not skills_dir.is_dir():
return 0
count = 0
for md in skills_dir.rglob("SKILL.md"):
if "/.hub/" not in str(md) and "/.git/" not in str(md):
count += 1
return count
# ---------------------------------------------------------------------------
# CRUD operations
# ---------------------------------------------------------------------------
def list_profiles() -> List[ProfileInfo]:
"""Return info for all profiles, including the default."""
profiles = []
wrapper_dir = _get_wrapper_dir()
# Default profile
default_home = _get_default_hermes_home()
if default_home.is_dir():
model, provider = _read_config_model(default_home)
profiles.append(ProfileInfo(
name="default",
path=default_home,
is_default=True,
gateway_running=_check_gateway_running(default_home),
model=model,
provider=provider,
has_env=(default_home / ".env").exists(),
skill_count=_count_skills(default_home),
))
# Named profiles
profiles_root = _get_profiles_root()
if profiles_root.is_dir():
for entry in sorted(profiles_root.iterdir()):
if not entry.is_dir():
continue
name = entry.name
if not _PROFILE_ID_RE.match(name):
continue
model, provider = _read_config_model(entry)
alias_path = wrapper_dir / name
profiles.append(ProfileInfo(
name=name,
path=entry,
is_default=False,
gateway_running=_check_gateway_running(entry),
model=model,
provider=provider,
has_env=(entry / ".env").exists(),
skill_count=_count_skills(entry),
alias_path=alias_path if alias_path.exists() else None,
))
return profiles
def create_profile(
name: str,
clone_from: Optional[str] = None,
clone_all: bool = False,
clone_config: bool = False,
no_alias: bool = False,
) -> Path:
"""Create a new profile directory.
Parameters
----------
name:
Profile identifier (lowercase, alphanumeric, hyphens, underscores).
clone_from:
Source profile to clone from. If ``None`` and clone_config/clone_all
is True, defaults to the currently active profile.
clone_all:
If True, do a full copytree of the source (all state).
clone_config:
If True, copy only config files (config.yaml, .env, SOUL.md).
no_alias:
If True, skip wrapper script creation.
Returns
-------
Path
The newly created profile directory.
"""
validate_profile_name(name)
if name == "default":
raise ValueError(
"Cannot create a profile named 'default' — it is the built-in profile (~/.hermes)."
)
profile_dir = get_profile_dir(name)
if profile_dir.exists():
raise FileExistsError(f"Profile '{name}' already exists at {profile_dir}")
# Resolve clone source
source_dir = None
if clone_from is not None or clone_all or clone_config:
if clone_from is None:
# Default: clone from active profile
from hermes_constants import get_hermes_home
source_dir = get_hermes_home()
else:
validate_profile_name(clone_from)
source_dir = get_profile_dir(clone_from)
if not source_dir.is_dir():
raise FileNotFoundError(
f"Source profile '{clone_from or 'active'}' does not exist at {source_dir}"
)
if clone_all and source_dir:
# Full copy of source profile
shutil.copytree(source_dir, profile_dir)
# Strip runtime files
for stale in _CLONE_ALL_STRIP:
(profile_dir / stale).unlink(missing_ok=True)
else:
# Bootstrap directory structure
profile_dir.mkdir(parents=True, exist_ok=True)
for subdir in _PROFILE_DIRS:
(profile_dir / subdir).mkdir(parents=True, exist_ok=True)
# Clone config files from source
if source_dir is not None:
for filename in _CLONE_CONFIG_FILES:
src = source_dir / filename
if src.exists():
shutil.copy2(src, profile_dir / filename)
return profile_dir
def seed_profile_skills(profile_dir: Path, quiet: bool = False) -> Optional[dict]:
"""Seed bundled skills into a profile via subprocess.
Uses subprocess because sync_skills() caches HERMES_HOME at module level.
Returns the sync result dict, or None on failure.
"""
project_root = Path(__file__).parent.parent.resolve()
try:
result = subprocess.run(
[sys.executable, "-c",
"import json; from tools.skills_sync import sync_skills; "
"r = sync_skills(quiet=True); print(json.dumps(r))"],
env={**os.environ, "HERMES_HOME": str(profile_dir)},
cwd=str(project_root),
capture_output=True, text=True, timeout=60,
)
if result.returncode == 0 and result.stdout.strip():
return json.loads(result.stdout.strip())
if not quiet:
print(f"⚠ Skill seeding returned exit code {result.returncode}")
if result.stderr.strip():
print(f" {result.stderr.strip()[:200]}")
return None
except subprocess.TimeoutExpired:
if not quiet:
print("⚠ Skill seeding timed out (60s)")
return None
except Exception as e:
if not quiet:
print(f"⚠ Skill seeding failed: {e}")
return None
def delete_profile(name: str, yes: bool = False) -> Path:
"""Delete a profile, its wrapper script, and its gateway service.
Stops the gateway if running. Disables systemd/launchd service first
to prevent auto-restart.
Returns the path that was removed.
"""
validate_profile_name(name)
if name == "default":
raise ValueError(
"Cannot delete the default profile (~/.hermes).\n"
"To remove everything, use: hermes uninstall"
)
profile_dir = get_profile_dir(name)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
# Show what will be deleted
model, provider = _read_config_model(profile_dir)
gw_running = _check_gateway_running(profile_dir)
skill_count = _count_skills(profile_dir)
print(f"\nProfile: {name}")
print(f"Path: {profile_dir}")
if model:
print(f"Model: {model}" + (f" ({provider})" if provider else ""))
if skill_count:
print(f"Skills: {skill_count}")
items = [
"All config, API keys, memories, sessions, skills, cron jobs",
]
# Check for service
from hermes_cli.gateway import _profile_suffix, get_service_name
wrapper_path = _get_wrapper_dir() / name
has_wrapper = wrapper_path.exists()
if has_wrapper:
items.append(f"Command alias ({wrapper_path})")
print(f"\nThis will permanently delete:")
for item in items:
print(f"  • {item}")
if gw_running:
print(f" ⚠ Gateway is running — it will be stopped.")
# Confirmation
if not yes:
print()
try:
confirm = input(f"Type '{name}' to confirm: ").strip()
except (KeyboardInterrupt, EOFError):
print("\nCancelled.")
return profile_dir
if confirm != name:
print("Cancelled.")
return profile_dir
# 1. Disable service (prevents auto-restart)
_cleanup_gateway_service(name, profile_dir)
# 2. Stop running gateway
if gw_running:
_stop_gateway_process(profile_dir)
# 3. Remove wrapper script
if has_wrapper:
if remove_wrapper_script(name):
print(f"✓ Removed {wrapper_path}")
# 4. Remove profile directory
try:
shutil.rmtree(profile_dir)
print(f"✓ Removed {profile_dir}")
except Exception as e:
print(f"⚠ Could not remove {profile_dir}: {e}")
# 5. Clear active_profile if it pointed to this profile
try:
active = get_active_profile()
if active == name:
set_active_profile("default")
print("✓ Active profile reset to default")
except Exception:
pass
print(f"\nProfile '{name}' deleted.")
return profile_dir
def _cleanup_gateway_service(name: str, profile_dir: Path) -> None:
"""Disable and remove systemd/launchd service for a profile."""
import platform as _platform
# Derive service name for this profile
# Temporarily set HERMES_HOME so _profile_suffix resolves correctly
old_home = os.environ.get("HERMES_HOME")
try:
os.environ["HERMES_HOME"] = str(profile_dir)
from hermes_cli.gateway import get_service_name, get_launchd_plist_path
if _platform.system() == "Linux":
svc_name = get_service_name()
svc_file = Path.home() / ".config" / "systemd" / "user" / f"{svc_name}.service"
if svc_file.exists():
subprocess.run(
["systemctl", "--user", "disable", svc_name],
capture_output=True, check=False, timeout=10,
)
subprocess.run(
["systemctl", "--user", "stop", svc_name],
capture_output=True, check=False, timeout=10,
)
svc_file.unlink(missing_ok=True)
subprocess.run(
["systemctl", "--user", "daemon-reload"],
capture_output=True, check=False, timeout=10,
)
print(f"✓ Service {svc_name} removed")
elif _platform.system() == "Darwin":
plist_path = get_launchd_plist_path()
if plist_path.exists():
subprocess.run(
["launchctl", "unload", str(plist_path)],
capture_output=True, check=False, timeout=10,
)
plist_path.unlink(missing_ok=True)
print(f"✓ Launchd service removed")
except Exception as e:
print(f"⚠ Service cleanup: {e}")
finally:
if old_home is not None:
os.environ["HERMES_HOME"] = old_home
elif "HERMES_HOME" in os.environ:
del os.environ["HERMES_HOME"]
def _stop_gateway_process(profile_dir: Path) -> None:
"""Stop a running gateway process via its PID file."""
import signal as _signal
import time as _time
pid_file = profile_dir / "gateway.pid"
if not pid_file.exists():
return
try:
raw = pid_file.read_text().strip()
data = json.loads(raw) if raw.startswith("{") else {"pid": int(raw)}
pid = int(data["pid"])
os.kill(pid, _signal.SIGTERM)
# Wait up to 10s for graceful shutdown
for _ in range(20):
_time.sleep(0.5)
try:
os.kill(pid, 0)
except ProcessLookupError:
print(f"✓ Gateway stopped (PID {pid})")
return
# Force kill
try:
os.kill(pid, _signal.SIGKILL)
except ProcessLookupError:
pass
print(f"✓ Gateway force-stopped (PID {pid})")
except (ProcessLookupError, PermissionError):
print("✓ Gateway already stopped")
except Exception as e:
print(f"⚠ Could not stop gateway: {e}")
# ---------------------------------------------------------------------------
# Active profile (sticky default)
# ---------------------------------------------------------------------------
def get_active_profile() -> str:
"""Read the sticky active profile name.
Returns ``"default"`` if no active_profile file exists or it's empty.
"""
path = _get_active_profile_path()
try:
name = path.read_text().strip()
if not name:
return "default"
return name
except (FileNotFoundError, UnicodeDecodeError, OSError):
return "default"
def set_active_profile(name: str) -> None:
"""Set the sticky active profile.
Writes to ``~/.hermes/active_profile``. Use ``"default"`` to clear.
"""
validate_profile_name(name)
if name != "default" and not profile_exists(name):
raise FileNotFoundError(
f"Profile '{name}' does not exist. "
f"Create it with: hermes profile create {name}"
)
path = _get_active_profile_path()
path.parent.mkdir(parents=True, exist_ok=True)
if name == "default":
# Remove the file to indicate default
path.unlink(missing_ok=True)
else:
# Atomic write
tmp = path.with_suffix(".tmp")
tmp.write_text(name + "\n")
tmp.replace(path)
def get_active_profile_name() -> str:
"""Infer the current profile name from HERMES_HOME.
Returns ``"default"`` if HERMES_HOME is not set or points to ``~/.hermes``.
Returns the profile name if HERMES_HOME points into ``~/.hermes/profiles/<name>``.
Returns ``"custom"`` if HERMES_HOME is set to an unrecognized path.
"""
from hermes_constants import get_hermes_home
hermes_home = get_hermes_home()
resolved = hermes_home.resolve()
default_resolved = _get_default_hermes_home().resolve()
if resolved == default_resolved:
return "default"
profiles_root = _get_profiles_root().resolve()
try:
rel = resolved.relative_to(profiles_root)
parts = rel.parts
if len(parts) == 1 and _PROFILE_ID_RE.match(parts[0]):
return parts[0]
except ValueError:
pass
return "custom"
# ---------------------------------------------------------------------------
# Export / Import
# ---------------------------------------------------------------------------
def export_profile(name: str, output_path: str) -> Path:
"""Export a profile to a tar.gz archive.
Returns the output file path.
"""
validate_profile_name(name)
profile_dir = get_profile_dir(name)
if not profile_dir.is_dir():
raise FileNotFoundError(f"Profile '{name}' does not exist.")
output = Path(output_path)
# shutil.make_archive wants the base name without extension
base = str(output).removesuffix(".tar.gz").removesuffix(".tgz")
result = shutil.make_archive(base, "gztar", str(profile_dir.parent), name)
return Path(result)
def import_profile(archive_path: str, name: Optional[str] = None) -> Path:
"""Import a profile from a tar.gz archive.
If *name* is not given, infers it from the archive's top-level directory.
Returns the imported profile directory.
"""
import tarfile
archive = Path(archive_path)
if not archive.exists():
raise FileNotFoundError(f"Archive not found: {archive}")
# Peek at the archive to find the top-level directory name
with tarfile.open(archive, "r:gz") as tf:
top_dirs = {m.name.split("/")[0] for m in tf.getmembers() if "/" in m.name}
if not top_dirs:
top_dirs = {m.name for m in tf.getmembers() if m.isdir()}
inferred_name = name or (top_dirs.pop() if len(top_dirs) == 1 else None)
if not inferred_name:
raise ValueError(
"Cannot determine profile name from archive. "
"Specify it explicitly: hermes profile import <archive> --name <name>"
)
validate_profile_name(inferred_name)
profile_dir = get_profile_dir(inferred_name)
if profile_dir.exists():
raise FileExistsError(f"Profile '{inferred_name}' already exists at {profile_dir}")
profiles_root = _get_profiles_root()
profiles_root.mkdir(parents=True, exist_ok=True)
shutil.unpack_archive(str(archive), str(profiles_root))
# If the archive extracted under a different name, rename
extracted = profiles_root / (top_dirs.pop() if top_dirs else inferred_name)
if extracted != profile_dir and extracted.exists():
extracted.rename(profile_dir)
return profile_dir
# ---------------------------------------------------------------------------
# Rename
# ---------------------------------------------------------------------------
def rename_profile(old_name: str, new_name: str) -> Path:
"""Rename a profile: directory, wrapper script, service, active_profile.
Returns the new profile directory.
"""
validate_profile_name(old_name)
validate_profile_name(new_name)
if old_name == "default":
raise ValueError("Cannot rename the default profile.")
if new_name == "default":
raise ValueError("Cannot rename to 'default' — it is reserved.")
old_dir = get_profile_dir(old_name)
new_dir = get_profile_dir(new_name)
if not old_dir.is_dir():
raise FileNotFoundError(f"Profile '{old_name}' does not exist.")
if new_dir.exists():
raise FileExistsError(f"Profile '{new_name}' already exists.")
# 1. Stop gateway if running
if _check_gateway_running(old_dir):
_cleanup_gateway_service(old_name, old_dir)
_stop_gateway_process(old_dir)
# 2. Rename directory
old_dir.rename(new_dir)
print(f"✓ Renamed {old_dir.name}{new_dir.name}")
# 3. Update wrapper script
remove_wrapper_script(old_name)
collision = check_alias_collision(new_name)
if not collision:
create_wrapper_script(new_name)
print(f"✓ Alias updated: {new_name}")
else:
print(f"⚠ Cannot create alias '{new_name}': {collision}")
# 4. Update active_profile if it pointed to old name
try:
if get_active_profile() == old_name:
set_active_profile(new_name)
print(f"✓ Active profile updated: {new_name}")
except Exception:
pass
return new_dir
# ---------------------------------------------------------------------------
# Tab completion
# ---------------------------------------------------------------------------
def generate_bash_completion() -> str:
"""Generate a bash completion script for hermes profile names."""
return '''# Hermes Agent profile completion
# Add to ~/.bashrc: eval "$(hermes completion bash)"
_hermes_profiles() {
local profiles_dir="$HOME/.hermes/profiles"
local profiles="default"
if [ -d "$profiles_dir" ]; then
profiles="$profiles $(ls "$profiles_dir" 2>/dev/null)"
fi
echo "$profiles"
}
_hermes_completion() {
local cur prev
cur="${COMP_WORDS[COMP_CWORD]}"
prev="${COMP_WORDS[COMP_CWORD-1]}"
# Complete profile names after -p / --profile
if [[ "$prev" == "-p" || "$prev" == "--profile" ]]; then
COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
return
fi
# Complete profile subcommands
if [[ "${COMP_WORDS[1]}" == "profile" ]]; then
case "$prev" in
profile)
COMPREPLY=($(compgen -W "list use create delete show alias rename export import" -- "$cur"))
return
;;
use|delete|show|alias|rename|export)
COMPREPLY=($(compgen -W "$(_hermes_profiles)" -- "$cur"))
return
;;
esac
fi
# Top-level subcommands
if [[ "$COMP_CWORD" == 1 ]]; then
local commands="chat model gateway setup status cron doctor config skills tools mcp sessions profile update version"
COMPREPLY=($(compgen -W "$commands" -- "$cur"))
fi
}
complete -F _hermes_completion hermes
'''
def generate_zsh_completion() -> str:
"""Generate a zsh completion script for hermes profile names."""
return '''#compdef hermes
# Hermes Agent profile completion
# Add to ~/.zshrc: eval "$(hermes completion zsh)"
_hermes() {
local -a profiles
profiles=(default)
if [[ -d "$HOME/.hermes/profiles" ]]; then
profiles+=("${(@f)$(ls $HOME/.hermes/profiles 2>/dev/null)}")
fi
_arguments \\
'-p[Profile name]:profile:($profiles)' \\
'--profile[Profile name]:profile:($profiles)' \\
'1:command:(chat model gateway setup status cron doctor config skills tools mcp sessions profile update version)' \\
'*::arg:->args'
case $words[1] in
profile)
_arguments '1:action:(list use create delete show alias rename export import)' \\
'2:profile:($profiles)'
;;
esac
}
_hermes "$@"
'''
# ---------------------------------------------------------------------------
# Profile env resolution (called from _apply_profile_override)
# ---------------------------------------------------------------------------
def resolve_profile_env(profile_name: str) -> str:
"""Resolve a profile name to a HERMES_HOME path string.
Called early in the CLI entry point, before any hermes modules
are imported, to set the HERMES_HOME environment variable.
"""
validate_profile_name(profile_name)
profile_dir = get_profile_dir(profile_name)
if profile_name != "default" and not profile_dir.is_dir():
raise FileNotFoundError(
f"Profile '{profile_name}' does not exist. "
f"Create it with: hermes profile create {profile_name}"
)
return str(profile_dir)
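To summarize the path logic this module implements, here is a standalone sketch of the name-to-HERMES_HOME mapping (re-derived from validate_profile_name/get_profile_dir above for illustration; function name is assumed, it is not the module itself):

```python
import re
from pathlib import Path

_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")

def resolve_profile_home(name: str, home: Path) -> Path:
    # "default" is ~/.hermes itself; named profiles nest under profiles/.
    if name == "default":
        return home / ".hermes"
    if not _NAME_RE.match(name):
        raise ValueError(f"invalid profile name: {name!r}")
    return home / ".hermes" / "profiles" / name
```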


@@ -63,8 +63,11 @@ def _get_model_config() -> Dict[str, Any]:
model_cfg = config.get("model")
if isinstance(model_cfg, dict):
cfg = dict(model_cfg)
# Accept "model" as alias for "default" (users intuitively write model.model)
if not cfg.get("default") and cfg.get("model"):
cfg["default"] = cfg["model"]
default = (cfg.get("default") or "").strip()
base_url = (cfg.get("base_url") or "").strip()
is_local = "localhost" in base_url or "127.0.0.1" in base_url
is_fallback = not default or default == "anthropic/claude-opus-4.6"
if is_local and is_fallback and base_url:
@@ -203,7 +206,7 @@ def _resolve_named_custom_runtime(
or _detect_api_mode_for_url(base_url)
or "chat_completions",
"base_url": base_url,
"api_key": api_key,
"api_key": api_key or "no-key-required",
"source": f"custom_provider:{custom_provider.get('name', requested_provider)}",
}
@@ -407,12 +410,6 @@ def resolve_runtime_provider(
# (e.g. https://api.minimax.io/anthropic, https://dashscope.../anthropic)
elif base_url.rstrip("/").endswith("/anthropic"):
api_mode = "anthropic_messages"
# MiniMax providers always use Anthropic Messages API.
# Auto-correct stale /v1 URLs (from old .env or config) to /anthropic.
elif provider in ("minimax", "minimax-cn"):
api_mode = "anthropic_messages"
if base_url.rstrip("/").endswith("/v1"):
base_url = base_url.rstrip("/")[:-3] + "/anthropic"
return {
"provider": provider,
"api_mode": api_mode,


@@ -98,6 +98,11 @@ _DEFAULT_PROVIDER_MODELS = {
"minimax-cn": ["MiniMax-M2.7", "MiniMax-M2.7-highspeed", "MiniMax-M2.5", "MiniMax-M2.5-highspeed", "MiniMax-M2.1"],
"ai-gateway": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5", "google/gemini-3-flash"],
"kilocode": ["anthropic/claude-opus-4.6", "anthropic/claude-sonnet-4.6", "openai/gpt-5.4", "google/gemini-3-pro-preview", "google/gemini-3-flash-preview"],
"huggingface": [
"Qwen/Qwen3.5-397B-A17B", "Qwen/Qwen3-235B-A22B-Thinking-2507",
"Qwen/Qwen3-Coder-480B-A35B-Instruct", "deepseek-ai/DeepSeek-R1-0528",
"deepseek-ai/DeepSeek-V3.2", "moonshotai/Kimi-K2.5",
],
}
@@ -302,6 +307,7 @@ from hermes_cli.config import (
get_env_value,
ensure_hermes_home,
)
# display_hermes_home imported lazily at call sites (stale-module safety during hermes update)
from hermes_cli.colors import Colors, color
@@ -599,7 +605,7 @@ def _print_setup_summary(config: dict, hermes_home):
else:
tool_status.append(("Mixture of Agents", False, "OPENROUTER_API_KEY"))
# Web tools (Exa, Parallel, Firecrawl, or Tavily)
if subscription_features.web.managed_by_nous:
tool_status.append(("Web Search & Extract (Nous subscription)", True, None))
elif subscription_features.web.available:
@@ -608,7 +614,7 @@ def _print_setup_summary(config: dict, hermes_home):
label = f"Web Search & Extract ({subscription_features.web.current_provider})"
tool_status.append((label, True, None))
else:
tool_status.append(("Web Search & Extract", False, "EXA_API_KEY, PARALLEL_API_KEY, FIRECRAWL_API_KEY/FIRECRAWL_API_URL, or TAVILY_API_KEY"))
# Browser tools (local Chromium or Browserbase cloud)
import shutil
@@ -720,7 +726,8 @@ def _print_setup_summary(config: dict, hermes_home):
print_warning(
"Some tools are disabled. Run 'hermes setup tools' to configure them,"
)
from hermes_constants import display_hermes_home as _dhh
print_warning(f"or edit {_dhh()}/.env directly to add the missing API keys.")
print()
# Done banner
@@ -743,7 +750,8 @@ def _print_setup_summary(config: dict, hermes_home):
print()
# Show file locations prominently
from hermes_constants import display_hermes_home as _dhh
print(color(f"📁 All your files are in {_dhh()}/:", Colors.CYAN, Colors.BOLD))
print()
print(f" {color('Settings:', Colors.YELLOW)} {get_config_path()}")
print(f" {color('API Keys:', Colors.YELLOW)} {get_env_path()}")
@@ -926,6 +934,7 @@ def setup_model_provider(config: dict):
"OpenCode Go (open models, $10/month subscription)",
"GitHub Copilot (uses GITHUB_TOKEN or gh auth token)",
"GitHub Copilot ACP (spawns `copilot --acp --stdio`)",
"Hugging Face Inference Providers (20+ open models)",
]
if keep_label:
provider_choices.append(keep_label)
@@ -1574,7 +1583,26 @@ def setup_model_provider(config: dict):
_set_model_provider(config, "copilot-acp", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
elif provider_idx == 16: # Hugging Face Inference Providers
selected_provider = "huggingface"
print()
print_header("Hugging Face API Token")
pconfig = PROVIDER_REGISTRY["huggingface"]
print_info(f"Provider: {pconfig.name}")
print_info("Get your token at: https://huggingface.co/settings/tokens")
print_info("Required permission: 'Make calls to Inference Providers'")
print()
api_key = prompt(" HF Token", password=True)
if api_key:
save_env_value("HF_TOKEN", api_key)
# Clear OpenRouter env vars to prevent routing confusion
save_env_value("OPENAI_BASE_URL", "")
save_env_value("OPENAI_API_KEY", "")
_set_model_provider(config, "huggingface", pconfig.inference_base_url)
selected_base_url = pconfig.inference_base_url
# else: provider_idx == 17 (Keep current) — only shown when a provider already exists
# Normalize "keep current" to an explicit provider so downstream logic
# doesn't fall back to the generic OpenRouter/static-model path.
if selected_provider is None:
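The Hugging Face branch above only persists HF_TOKEN and the provider's base URL. For context, a request against the router's OpenAI-compatible endpoint can be assembled like this (the router URL is an assumption based on Hugging Face's documented OpenAI-compatible contract, not taken from this diff; authentication is a `Bearer HF_TOKEN` header at send time):

```python
def build_hf_request(model: str, prompt: str) -> tuple[str, dict]:
    """Build URL and JSON body for a chat completion via the assumed
    Hugging Face Inference Providers router endpoint."""
    url = "https://router.huggingface.co/v1/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, body
```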
@@ -2178,11 +2206,11 @@ def setup_terminal_backend(config: dict):
config["terminal"]["modal_mode"] = "direct"
print_info("Requires a Modal account: https://modal.com")
# Check if modal SDK is installed
try:
__import__("modal")
except ImportError:
print_info("Installing modal SDK...")
import subprocess
uv_bin = shutil.which("uv")
@@ -2194,23 +2222,21 @@
"install",
"--python",
sys.executable,
"swe-rex[modal]",
"modal",
],
capture_output=True,
text=True,
)
else:
result = subprocess.run(
[sys.executable, "-m", "pip", "install", "swe-rex[modal]"],
[sys.executable, "-m", "pip", "install", "modal"],
capture_output=True,
text=True,
)
if result.returncode == 0:
print_success("swe-rex[modal] installed")
print_success("modal SDK installed")
else:
print_warning(
"Install failed — run manually: pip install 'swe-rex[modal]'"
)
print_warning("Install failed — run manually: pip install modal")
# Modal token
print()
@ -2925,7 +2951,8 @@ def setup_gateway(config: dict):
save_env_value("WEBHOOK_ENABLED", "true")
print()
print_success("Webhooks enabled! Next steps:")
print_info(" 1. Define webhook routes in ~/.hermes/config.yaml")
from hermes_constants import display_hermes_home as _dhh
print_info(f" 1. Define webhook routes in {_dhh()}/config.yaml")
print_info(" 2. Point your service (GitHub, GitLab, etc.) at:")
print_info(" http://your-server:8644/webhooks/<route-name>")
print()
@ -3081,6 +3108,95 @@ def setup_tools(config: dict, first_install: bool = False):
tools_command(first_install=first_install, config=config)
# =============================================================================
# Post-Migration Section Skip Logic
# =============================================================================
def _get_section_config_summary(config: dict, section_key: str) -> Optional[str]:
"""Return a short summary if a setup section is already configured, else None.
Used after OpenClaw migration to detect which sections can be skipped.
``get_env_value`` is the module-level import from hermes_cli.config
so that test patches on ``setup_mod.get_env_value`` take effect.
"""
if section_key == "model":
has_key = bool(
get_env_value("OPENROUTER_API_KEY")
or get_env_value("OPENAI_API_KEY")
or get_env_value("ANTHROPIC_API_KEY")
)
if not has_key:
# Check for OAuth providers
try:
from hermes_cli.auth import get_active_provider
if get_active_provider():
has_key = True
except Exception:
pass
if not has_key:
return None
model = config.get("model")
if isinstance(model, str) and model.strip():
return model.strip()
if isinstance(model, dict):
return str(model.get("default") or model.get("model") or "configured")
return "configured"
elif section_key == "terminal":
backend = config.get("terminal", {}).get("backend", "local")
return f"backend: {backend}"
elif section_key == "agent":
max_turns = config.get("agent", {}).get("max_turns", 90)
return f"max turns: {max_turns}"
elif section_key == "gateway":
platforms = []
if get_env_value("TELEGRAM_BOT_TOKEN"):
platforms.append("Telegram")
if get_env_value("DISCORD_BOT_TOKEN"):
platforms.append("Discord")
if get_env_value("SLACK_BOT_TOKEN"):
platforms.append("Slack")
if get_env_value("WHATSAPP_PHONE_NUMBER_ID"):
platforms.append("WhatsApp")
if get_env_value("SIGNAL_ACCOUNT"):
platforms.append("Signal")
if platforms:
return ", ".join(platforms)
return None # No platforms configured — section must run
elif section_key == "tools":
tools = []
if get_env_value("ELEVENLABS_API_KEY"):
tools.append("TTS/ElevenLabs")
if get_env_value("BROWSERBASE_API_KEY"):
tools.append("Browser")
if get_env_value("FIRECRAWL_API_KEY"):
tools.append("Firecrawl")
if tools:
return ", ".join(tools)
return None
return None
def _skip_configured_section(
config: dict, section_key: str, label: str
) -> bool:
"""Show an already-configured section summary and offer to skip.
Returns True if the user chose to skip, False if the section should run.
"""
summary = _get_section_config_summary(config, section_key)
if not summary:
return False
print()
print_success(f" {label}: {summary}")
return not prompt_yes_no(f" Reconfigure {label.lower()}?", default=False)
# =============================================================================
# OpenClaw Migration
# =============================================================================
@ -3152,7 +3268,7 @@ def _offer_openclaw_migration(hermes_home: Path) -> bool:
target_root=hermes_home.resolve(),
execute=True,
workspace_target=None,
overwrite=False,
overwrite=True,
migrate_secrets=True,
output_dir=None,
selected_options=selected,
@ -3319,6 +3435,8 @@ def run_setup_wizard(args):
)
)
migration_ran = False
if is_existing:
# ── Returning User Menu ──
print()
@ -3387,7 +3505,8 @@ def run_setup_wizard(args):
return
# Offer OpenClaw migration before configuration begins
if _offer_openclaw_migration(hermes_home):
migration_ran = _offer_openclaw_migration(hermes_home)
if migration_ran:
# Reload config in case migration wrote to it
config = load_config()
@ -3400,20 +3519,31 @@ def run_setup_wizard(args):
print()
print_info("You can edit these files directly or use 'hermes config edit'")
if migration_ran:
print()
print_info("Settings were imported from OpenClaw.")
print_info("Each section below will show what was imported — press Enter to keep,")
print_info("or choose to reconfigure if needed.")
# Section 1: Model & Provider
setup_model_provider(config)
if not (migration_ran and _skip_configured_section(config, "model", "Model & Provider")):
setup_model_provider(config)
# Section 2: Terminal Backend
setup_terminal_backend(config)
if not (migration_ran and _skip_configured_section(config, "terminal", "Terminal Backend")):
setup_terminal_backend(config)
# Section 3: Agent Settings
setup_agent_settings(config)
if not (migration_ran and _skip_configured_section(config, "agent", "Agent Settings")):
setup_agent_settings(config)
# Section 4: Messaging Platforms
setup_gateway(config)
if not (migration_ran and _skip_configured_section(config, "gateway", "Messaging Platforms")):
setup_gateway(config)
# Section 5: Tools
setup_tools(config, first_install=not is_existing)
if not (migration_ran and _skip_configured_section(config, "tools", "Tools")):
setup_tools(config, first_install=not is_existing)
# Save and show summary
save_config(config)

View file

@ -24,6 +24,12 @@ PLATFORMS = {
"whatsapp": "📱 WhatsApp",
"signal": "📡 Signal",
"email": "📧 Email",
"homeassistant": "🏠 Home Assistant",
"mattermost": "💬 Mattermost",
"matrix": "💬 Matrix",
"dingtalk": "💬 DingTalk",
"feishu": "🪽 Feishu",
"wecom": "💬 WeCom",
}
# ─── Config Helpers ───────────────────────────────────────────────────────────

View file

@ -21,6 +21,7 @@ from rich.table import Table
# Lazy imports to avoid circular dependencies and slow startup.
# tools.skills_hub and tools.skills_guard are imported inside functions.
from hermes_constants import display_hermes_home
_console = Console()
@ -304,7 +305,8 @@ def do_browse(page: int = 1, page_size: int = 20, source: str = "all",
def do_install(identifier: str, category: str = "", force: bool = False,
console: Optional[Console] = None, skip_confirm: bool = False) -> None:
console: Optional[Console] = None, skip_confirm: bool = False,
invalidate_cache: bool = True) -> None:
"""Fetch, quarantine, scan, confirm, and install a skill."""
from tools.skills_hub import (
GitHubAuth, create_source_router, ensure_hub_dirs,
@ -387,7 +389,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
"[bold bright_cyan]This is an official optional skill maintained by Nous Research.[/]\n\n"
"It ships with hermes-agent but is not activated by default.\n"
"Installing will copy it to your skills directory where the agent can use it.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
f"Files will be at: [cyan]{display_hermes_home()}/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Official Skill",
border_style="bright_cyan",
))
@ -397,7 +399,7 @@ def do_install(identifier: str, category: str = "", force: bool = False,
"External skills can contain instructions that influence agent behavior,\n"
"shell commands, and scripts. Even after automated scanning, you should\n"
"review the installed files before use.\n\n"
f"Files will be at: [cyan]~/.hermes/skills/{category + '/' if category else ''}{bundle.name}/[/]",
f"Files will be at: [cyan]{display_hermes_home()}/skills/{category + '/' if category else ''}{bundle.name}/[/]",
title="Disclaimer",
border_style="yellow",
))
@ -417,6 +419,17 @@ def do_install(identifier: str, category: str = "", force: bool = False,
c.print(f"[bold green]Installed:[/] {install_dir.relative_to(SKILLS_DIR)}")
c.print(f"[dim]Files: {', '.join(bundle.files.keys())}[/]\n")
if invalidate_cache:
# Invalidate the skills prompt cache so the new skill appears immediately
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
else:
c.print("[dim]Skill will be available in your next session.[/]")
c.print("[dim]Use /reset to start a new session now, or --now to activate immediately (invalidates prompt cache).[/]\n")
def do_inspect(identifier: str, console: Optional[Console] = None) -> None:
"""Preview a skill's SKILL.md content without installing."""
@ -603,7 +616,8 @@ def do_audit(name: Optional[str] = None, console: Optional[Console] = None) -> N
def do_uninstall(name: str, console: Optional[Console] = None,
skip_confirm: bool = False) -> None:
skip_confirm: bool = False,
invalidate_cache: bool = True) -> None:
"""Remove a hub-installed skill with confirmation."""
from tools.skills_hub import uninstall_skill
@ -623,6 +637,15 @@ def do_uninstall(name: str, console: Optional[Console] = None,
success, msg = uninstall_skill(name)
if success:
c.print(f"[bold green]{msg}[/]\n")
if invalidate_cache:
try:
from agent.prompt_builder import clear_skills_system_prompt_cache
clear_skills_system_prompt_cache(clear_snapshot=True)
except Exception:
pass
else:
c.print("[dim]Change will take effect in your next session.[/]")
c.print("[dim]Use /reset to start a new session now, or --now to apply immediately (invalidates prompt cache).[/]\n")
else:
c.print(f"[bold red]Error:[/] {msg}\n")
@ -722,7 +745,7 @@ def do_publish(skill_path: str, target: str = "github", repo: str = "",
auth = GitHubAuth()
if not auth.is_authenticated():
c.print("[bold red]Error:[/] GitHub authentication required.\n"
"Set GITHUB_TOKEN in ~/.hermes/.env or run 'gh auth login'.\n")
f"Set GITHUB_TOKEN in {display_hermes_home()}/.env or run 'gh auth login'.\n")
return
c.print(f"[bold]Publishing '{name}' to {repo}...[/]")
@ -865,10 +888,15 @@ def do_snapshot_export(output_path: str, console: Optional[Console] = None) -> N
"taps": tap_list,
}
out = Path(output_path)
out.write_text(json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n")
c.print(f"[bold green]Snapshot exported:[/] {out}")
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
payload = json.dumps(snapshot, indent=2, ensure_ascii=False) + "\n"
if output_path == "-":
import sys
sys.stdout.write(payload)
else:
out = Path(output_path)
out.write_text(payload)
c.print(f"[bold green]Snapshot exported:[/] {out}")
c.print(f"[dim]{len(installed)} skill(s), {len(tap_list)} tap(s)[/]\n")
def do_snapshot_import(input_path: str, force: bool = False,
@ -1059,19 +1087,23 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
elif action == "install":
if not args:
c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force|--yes]\n")
c.print("[bold red]Usage:[/] /skills install <identifier> [--category <cat>] [--force] [--now]\n")
return
identifier = args[0]
category = ""
# --yes / -y bypasses confirmation prompt (needed in TUI mode)
# --force handles reinstall override
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
# Slash commands run inside prompt_toolkit where input() hangs.
# Always skip confirmation — the user typing the command is implicit consent.
skip_confirm = True
force = "--force" in args
# --now invalidates prompt cache immediately (costs more money).
# Default: defer to next session to preserve cache.
invalidate_cache = "--now" in args
for i, a in enumerate(args):
if a == "--category" and i + 1 < len(args):
category = args[i + 1]
do_install(identifier, category=category, force=force,
skip_confirm=skip_confirm, console=c)
skip_confirm=skip_confirm, invalidate_cache=invalidate_cache,
console=c)
elif action == "inspect":
if not args:
@ -1101,10 +1133,13 @@ def handle_skills_slash(cmd: str, console: Optional[Console] = None) -> None:
elif action == "uninstall":
if not args:
c.print("[bold red]Usage:[/] /skills uninstall <name> [--yes]\n")
c.print("[bold red]Usage:[/] /skills uninstall <name> [--now]\n")
return
skip_confirm = any(flag in args for flag in ("--yes", "-y"))
do_uninstall(args[0], console=c, skip_confirm=skip_confirm)
# Slash commands run inside prompt_toolkit where input() hangs.
skip_confirm = True
invalidate_cache = "--now" in args
do_uninstall(args[0], console=c, skip_confirm=skip_confirm,
invalidate_cache=invalidate_cache)
elif action == "publish":
if not args:

View file

@ -281,6 +281,9 @@ def show_status(args):
"Slack": ("SLACK_BOT_TOKEN", None),
"Email": ("EMAIL_ADDRESS", "EMAIL_HOME_ADDRESS"),
"SMS": ("TWILIO_ACCOUNT_SID", "SMS_HOME_CHANNEL"),
"DingTalk": ("DINGTALK_CLIENT_ID", None),
"Feishu": ("FEISHU_APP_ID", "FEISHU_HOME_CHANNEL"),
"WeCom": ("WECOM_BOT_ID", "WECOM_HOME_CHANNEL"),
}
for name, (token_var, home_var) in platforms.items():
@ -319,8 +322,9 @@ def show_status(args):
print(" Manager: systemd (user)")
elif sys.platform == 'darwin':
from hermes_cli.gateway import get_launchd_label
result = subprocess.run(
["launchctl", "list", "ai.hermes.gateway"],
["launchctl", "list", get_launchd_label()],
capture_output=True,
text=True
)

View file

@ -9,6 +9,8 @@ Saves per-platform tool configuration to ~/.hermes/config.yaml under
the `platform_toolsets` key.
"""
import json as _json
import logging
import sys
from pathlib import Path
from typing import Dict, List, Optional, Set
@ -24,6 +26,8 @@ from hermes_cli.nous_subscription import (
)
from tools.tool_backend_helpers import managed_nous_tools_enabled
logger = logging.getLogger(__name__)
PROJECT_ROOT = Path(__file__).parent.parent.resolve()
@ -113,7 +117,8 @@ def _get_effective_configurable_toolsets():
"""
result = list(CONFIGURABLE_TOOLSETS)
try:
from hermes_cli.plugins import get_plugin_toolsets
from hermes_cli.plugins import discover_plugins, get_plugin_toolsets
discover_plugins() # idempotent — ensures plugins are loaded
result.extend(get_plugin_toolsets())
except Exception:
pass
@ -123,7 +128,8 @@ def _get_effective_configurable_toolsets():
def _get_plugin_toolset_keys() -> set:
"""Return the set of toolset keys provided by plugins."""
try:
from hermes_cli.plugins import get_plugin_toolsets
from hermes_cli.plugins import discover_plugins, get_plugin_toolsets
discover_plugins() # idempotent — ensures plugins are loaded
return {ts_key for ts_key, _, _ in get_plugin_toolsets()}
except Exception:
return set()
@ -138,7 +144,12 @@ PLATFORMS = {
"signal": {"label": "📡 Signal", "default_toolset": "hermes-signal"},
"homeassistant": {"label": "🏠 Home Assistant", "default_toolset": "hermes-homeassistant"},
"email": {"label": "📧 Email", "default_toolset": "hermes-email"},
"dingtalk": {"label": "💬 DingTalk", "default_toolset": "hermes-dingtalk"},
"matrix": {"label": "💬 Matrix", "default_toolset": "hermes-matrix"},
"dingtalk": {"label": "💬 DingTalk", "default_toolset": "hermes-dingtalk"},
"feishu": {"label": "🪽 Feishu", "default_toolset": "hermes-feishu"},
"wecom": {"label": "💬 WeCom", "default_toolset": "hermes-wecom"},
"api_server": {"label": "🌐 API Server", "default_toolset": "hermes-api-server"},
"mattermost": {"label": "💬 Mattermost", "default_toolset": "hermes-mattermost"},
}
@ -208,6 +219,14 @@ TOOL_CATEGORIES = {
{"key": "FIRECRAWL_API_KEY", "prompt": "Firecrawl API key", "url": "https://firecrawl.dev"},
],
},
{
"name": "Exa",
"tag": "AI-native search and contents",
"web_backend": "exa",
"env_vars": [
{"key": "EXA_API_KEY", "prompt": "Exa API key", "url": "https://exa.ai"},
],
},
{
"name": "Parallel",
"tag": "AI-native search and extract",
@ -354,7 +373,8 @@ def _run_post_setup(post_setup_key: str):
if result.returncode == 0:
_print_success(" Node.js dependencies installed")
else:
_print_warning(" npm install failed - run manually: cd ~/.hermes/hermes-agent && npm install")
from hermes_constants import display_hermes_home
_print_warning(f" npm install failed - run manually: cd {display_hermes_home()}/hermes-agent && npm install")
elif not node_modules.exists():
_print_warning(" Node.js not found - browser tools require: npm install (in hermes-agent directory)")
@ -689,9 +709,61 @@ def _prompt_choice(question: str, choices: list, default: int = 0) -> int:
return default
# ─── Token Estimation ────────────────────────────────────────────────────────
# Module-level cache so discovery + tokenization runs at most once per process.
_tool_token_cache: Optional[Dict[str, int]] = None
def _estimate_tool_tokens() -> Dict[str, int]:
"""Return estimated token counts per individual tool name.
Uses tiktoken (cl100k_base) to count tokens in the JSON-serialised
OpenAI-format tool schema. Triggers tool discovery on first call,
then caches the result for the rest of the process.
Returns an empty dict when tiktoken or the registry is unavailable.
"""
global _tool_token_cache
if _tool_token_cache is not None:
return _tool_token_cache
try:
import tiktoken
enc = tiktoken.get_encoding("cl100k_base")
except Exception:
logger.debug("tiktoken unavailable; skipping tool token estimation")
_tool_token_cache = {}
return _tool_token_cache
try:
# Trigger full tool discovery (imports all tool modules).
import model_tools # noqa: F401
from tools.registry import registry
except Exception:
logger.debug("Tool registry unavailable; skipping token estimation")
_tool_token_cache = {}
return _tool_token_cache
counts: Dict[str, int] = {}
for name in registry.get_all_tool_names():
schema = registry.get_schema(name)
if schema:
# Mirror what gets sent to the API:
# {"type": "function", "function": <schema>}
text = _json.dumps({"type": "function", "function": schema})
counts[name] = len(enc.encode(text))
_tool_token_cache = counts
return _tool_token_cache
def _prompt_toolset_checklist(platform_label: str, enabled: Set[str]) -> Set[str]:
"""Multi-select checklist of toolsets. Returns set of selected toolset keys."""
from hermes_cli.curses_ui import curses_checklist
from toolsets import resolve_toolset
# Pre-compute per-tool token counts (cached after first call).
tool_tokens = _estimate_tool_tokens()
effective = _get_effective_configurable_toolsets()
@ -707,11 +779,27 @@ def _prompt_toolset_checklist(platform_label: str, enabled: Set[str]) -> Set[str
if ts_key in enabled
}
# Build a live status function that shows deduplicated total token cost.
status_fn = None
if tool_tokens:
ts_keys = [ts_key for ts_key, _, _ in effective]
def status_fn(chosen: set) -> str:
# Collect unique tool names across all selected toolsets
all_tools: set = set()
for idx in chosen:
all_tools.update(resolve_toolset(ts_keys[idx]))
total = sum(tool_tokens.get(name, 0) for name in all_tools)
if total >= 1000:
return f"Est. tool context: ~{total / 1000:.1f}k tokens"
return f"Est. tool context: ~{total} tokens"
chosen = curses_checklist(
f"Tools for {platform_label}",
labels,
pre_selected,
cancel_returns=pre_selected,
status_fn=status_fn,
)
return {effective[i][0] for i in chosen}
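The deduplicated total that `status_fn` computes can be sketched in isolation. This is a standalone illustration with hypothetical toolset names and token counts, not the real registry data:

```python
from typing import Dict, List, Set

def total_tool_tokens(
    selected: Set[str],
    toolsets: Dict[str, List[str]],
    tool_tokens: Dict[str, int],
) -> int:
    # Same dedup rule as status_fn above: a tool shared by several
    # selected toolsets is only counted once toward the context cost.
    tools: Set[str] = set()
    for key in selected:
        tools.update(toolsets[key])
    return sum(tool_tokens.get(name, 0) for name in tools)

toolsets = {"web": ["search", "fetch"], "chat": ["search", "send"]}
tokens = {"search": 300, "fetch": 200, "send": 150}
print(total_tool_tokens({"web", "chat"}, toolsets, tokens))  # 650, not 950
```

Without the set union, `search` would be charged twice and the estimate would overstate the real prompt cost.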
@ -1399,7 +1487,8 @@ def tools_command(args=None, first_install: bool = False, config: dict = None):
platform_choices[idx] = f"Configure {pinfo['label']} ({new_count}/{total} enabled)"
print()
print(color(" Tool configuration saved to ~/.hermes/config.yaml", Colors.DIM))
from hermes_constants import display_hermes_home
print(color(f" Tool configuration saved to {display_hermes_home()}/config.yaml", Colors.DIM))
print(color(" Changes take effect on next 'hermes' or gateway restart.", Colors.DIM))
print()

260
hermes_cli/webhook.py Normal file
View file

@ -0,0 +1,260 @@
"""hermes webhook — manage dynamic webhook subscriptions from the CLI.
Usage:
hermes webhook subscribe <name> [options]
hermes webhook list
hermes webhook remove <name>
hermes webhook test <name> [--payload '{"key": "value"}']
Subscriptions persist to ~/.hermes/webhook_subscriptions.json and are
hot-reloaded by the webhook adapter without a gateway restart.
"""
import json
import os
import re
import secrets
import time
from pathlib import Path
from typing import Dict, Optional
from hermes_constants import display_hermes_home
_SUBSCRIPTIONS_FILENAME = "webhook_subscriptions.json"
def _hermes_home() -> Path:
return Path(
os.getenv("HERMES_HOME", str(Path.home() / ".hermes"))
).expanduser()
def _subscriptions_path() -> Path:
return _hermes_home() / _SUBSCRIPTIONS_FILENAME
def _load_subscriptions() -> Dict[str, dict]:
path = _subscriptions_path()
if not path.exists():
return {}
try:
data = json.loads(path.read_text(encoding="utf-8"))
return data if isinstance(data, dict) else {}
except Exception:
return {}
def _save_subscriptions(subs: Dict[str, dict]) -> None:
path = _subscriptions_path()
path.parent.mkdir(parents=True, exist_ok=True)
tmp_path = path.with_suffix(".tmp")
tmp_path.write_text(
json.dumps(subs, indent=2, ensure_ascii=False),
encoding="utf-8",
)
os.replace(str(tmp_path), str(path))
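The write-then-rename pattern used by `_save_subscriptions` can be shown on its own. A minimal sketch, using a temporary directory and a made-up subscription name:

```python
import json
import os
import tempfile
from pathlib import Path

def atomic_write_json(path: Path, data: dict) -> None:
    # Write-then-rename, as in _save_subscriptions above: os.replace is
    # atomic on POSIX, so a concurrent reader sees either the old file
    # or the new one, never a partially written state.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    os.replace(tmp, path)

path = Path(tempfile.mkdtemp()) / "webhook_subscriptions.json"
atomic_write_json(path, {"ci-alerts": {"deliver": "log"}})
print(json.loads(path.read_text(encoding="utf-8"))["ci-alerts"]["deliver"])  # log
```

This is what lets the webhook adapter hot-reload the file safely while the CLI writes it.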
def _get_webhook_config() -> dict:
"""Load webhook platform config. Returns {} if not configured."""
try:
from hermes_cli.config import load_config
cfg = load_config()
return cfg.get("platforms", {}).get("webhook", {})
except Exception:
return {}
def _is_webhook_enabled() -> bool:
return bool(_get_webhook_config().get("enabled"))
def _get_webhook_base_url() -> str:
wh = _get_webhook_config().get("extra", {})
host = wh.get("host", "0.0.0.0")
port = wh.get("port", 8644)
display_host = "localhost" if host == "0.0.0.0" else host
return f"http://{display_host}:{port}"
def _setup_hint() -> str:
_dhh = display_hermes_home()
return f"""
Webhook platform is not enabled. To set it up:
1. Run the gateway setup wizard:
hermes gateway setup
2. Or manually add to {_dhh}/config.yaml:
platforms:
webhook:
enabled: true
extra:
host: "0.0.0.0"
port: 8644
secret: "your-global-hmac-secret"
3. Or set environment variables in {_dhh}/.env:
WEBHOOK_ENABLED=true
WEBHOOK_PORT=8644
WEBHOOK_SECRET=your-global-secret
Then start the gateway: hermes gateway run
"""
def _require_webhook_enabled() -> bool:
"""Check webhook is enabled. Print setup guide and return False if not."""
if _is_webhook_enabled():
return True
print(_setup_hint())
return False
def webhook_command(args):
"""Entry point for 'hermes webhook' subcommand."""
sub = getattr(args, "webhook_action", None)
if not sub:
print("Usage: hermes webhook {subscribe|list|remove|test}")
print("Run 'hermes webhook --help' for details.")
return
if not _require_webhook_enabled():
return
if sub in ("subscribe", "add"):
_cmd_subscribe(args)
elif sub in ("list", "ls"):
_cmd_list(args)
elif sub in ("remove", "rm"):
_cmd_remove(args)
elif sub == "test":
_cmd_test(args)
def _cmd_subscribe(args):
name = args.name.strip().lower().replace(" ", "-")
if not re.match(r'^[a-z0-9][a-z0-9_-]*$', name):
print(f"Error: Invalid name '{name}'. Use lowercase alphanumeric with hyphens/underscores.")
return
subs = _load_subscriptions()
is_update = name in subs
secret = args.secret or secrets.token_urlsafe(32)
events = [e.strip() for e in args.events.split(",")] if args.events else []
route = {
"description": args.description or f"Agent-created subscription: {name}",
"events": events,
"secret": secret,
"prompt": args.prompt or "",
"skills": [s.strip() for s in args.skills.split(",")] if args.skills else [],
"deliver": args.deliver or "log",
"created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
}
if args.deliver_chat_id:
route["deliver_extra"] = {"chat_id": args.deliver_chat_id}
subs[name] = route
_save_subscriptions(subs)
base_url = _get_webhook_base_url()
status = "Updated" if is_update else "Created"
print(f"\n {status} webhook subscription: {name}")
print(f" URL: {base_url}/webhooks/{name}")
print(f" Secret: {secret}")
if events:
print(f" Events: {', '.join(events)}")
else:
print(" Events: (all)")
print(f" Deliver: {route['deliver']}")
if route.get("prompt"):
prompt_preview = route["prompt"][:80] + ("..." if len(route["prompt"]) > 80 else "")
print(f" Prompt: {prompt_preview}")
print(f"\n Configure your service to POST to the URL above.")
print(f" Use the secret for HMAC-SHA256 signature validation.")
print(f" The gateway must be running to receive events (hermes gateway run).\n")
def _cmd_list(args):
subs = _load_subscriptions()
if not subs:
print(" No dynamic webhook subscriptions.")
print(" Create one with: hermes webhook subscribe <name>")
return
base_url = _get_webhook_base_url()
print(f"\n {len(subs)} webhook subscription(s):\n")
for name, route in subs.items():
events = ", ".join(route.get("events", [])) or "(all)"
deliver = route.get("deliver", "log")
desc = route.get("description", "")
print(f"{name}")
if desc:
print(f" {desc}")
print(f" URL: {base_url}/webhooks/{name}")
print(f" Events: {events}")
print(f" Deliver: {deliver}")
print()
def _cmd_remove(args):
name = args.name.strip().lower()
subs = _load_subscriptions()
if name not in subs:
print(f" No subscription named '{name}'.")
print(" Note: Static routes from config.yaml cannot be removed here.")
return
del subs[name]
_save_subscriptions(subs)
print(f" Removed webhook subscription: {name}")
def _cmd_test(args):
"""Send a test POST to a webhook route."""
name = args.name.strip().lower()
subs = _load_subscriptions()
if name not in subs:
print(f" No subscription named '{name}'.")
return
route = subs[name]
secret = route.get("secret", "")
base_url = _get_webhook_base_url()
url = f"{base_url}/webhooks/{name}"
payload = args.payload or '{"test": true, "event_type": "test", "message": "Hello from hermes webhook test"}'
import hmac
import hashlib
sig = "sha256=" + hmac.new(
secret.encode(), payload.encode(), hashlib.sha256
).hexdigest()
print(f" Sending test POST to {url}")
try:
import urllib.request
req = urllib.request.Request(
url,
data=payload.encode(),
headers={
"Content-Type": "application/json",
"X-Hub-Signature-256": sig,
"X-GitHub-Event": "test",
},
method="POST",
)
with urllib.request.urlopen(req, timeout=10) as resp:
body = resp.read().decode()
print(f" Response ({resp.status}): {body}")
except Exception as e:
print(f" Error: {e}")
print(" Is the gateway running? (hermes gateway run)")

View file

@ -17,6 +17,47 @@ def get_hermes_home() -> Path:
return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
def get_hermes_dir(new_subpath: str, old_name: str) -> Path:
"""Resolve a Hermes subdirectory with backward compatibility.
New installs get the consolidated layout (e.g. ``cache/images``).
Existing installs that already have the old path (e.g. ``image_cache``)
keep using it; no migration is required.
Args:
new_subpath: Preferred path relative to HERMES_HOME (e.g. ``"cache/images"``).
old_name: Legacy path relative to HERMES_HOME (e.g. ``"image_cache"``).
Returns:
Absolute ``Path``: the old location if it exists on disk, otherwise the new one.
"""
home = get_hermes_home()
old_path = home / old_name
if old_path.exists():
return old_path
return home / new_subpath
def display_hermes_home() -> str:
"""Return a user-friendly display string for the current HERMES_HOME.
Uses ``~/`` shorthand for readability::
default: ``~/.hermes``
profile: ``~/.hermes/profiles/coder``
custom: ``/opt/hermes-custom``
Use this in **user-facing** print/log messages instead of hardcoding
``~/.hermes``. For code that needs a real ``Path``, use
:func:`get_hermes_home` instead.
"""
home = get_hermes_home()
try:
return "~/" + str(home.relative_to(Path.home()))
except ValueError:
return str(home)
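The `~/` shorthand logic can be exercised standalone. This mirror takes the home directories as explicit arguments (the real function reads them from the environment); the paths below are hypothetical:

```python
from pathlib import Path

def display_home(home: str, user_home: str) -> str:
    # Standalone mirror of display_hermes_home(): prefer the ~/ shorthand,
    # fall back to the absolute path for locations outside the user's home.
    try:
        return "~/" + str(Path(home).relative_to(user_home))
    except ValueError:
        return home

print(display_home("/home/alice/.hermes", "/home/alice"))                 # ~/.hermes
print(display_home("/home/alice/.hermes/profiles/coder", "/home/alice"))  # ~/.hermes/profiles/coder
print(display_home("/opt/hermes-custom", "/home/alice"))                  # /opt/hermes-custom
```

`Path.relative_to` raises `ValueError` when the path is not inside the base, which is exactly the custom-location fallback case.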
VALID_REASONING_EFFORTS = ("xhigh", "high", "medium", "low", "minimal")

View file

@ -15,15 +15,20 @@ Key design decisions:
"""
import json
import logging
import os
import random
import re
import sqlite3
import threading
import time
from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Dict, Any, List, Optional
from typing import Any, Callable, Dict, List, Optional, TypeVar
logger = logging.getLogger(__name__)
T = TypeVar("T")
DEFAULT_DB_PATH = get_hermes_home() / "state.db"
@ -116,18 +121,38 @@ class SessionDB:
single writer via WAL mode). Each method opens its own cursor.
"""
# ── Write-contention tuning ──
# With multiple hermes processes (gateway + CLI sessions + worktree agents)
# all sharing one state.db, WAL write-lock contention causes visible TUI
# freezes. SQLite's built-in busy handler uses a deterministic sleep
# schedule that causes convoy effects under high concurrency.
#
# Instead, we keep the SQLite timeout short (1s) and handle retries at the
# application level with random jitter, which naturally staggers competing
# writers and avoids the convoy.
_WRITE_MAX_RETRIES = 15
_WRITE_RETRY_MIN_S = 0.020 # 20ms
_WRITE_RETRY_MAX_S = 0.150 # 150ms
# Attempt a PASSIVE WAL checkpoint every N successful writes.
_CHECKPOINT_EVERY_N_WRITES = 50
def __init__(self, db_path: Path = None):
self.db_path = db_path or DEFAULT_DB_PATH
self.db_path.parent.mkdir(parents=True, exist_ok=True)
self._lock = threading.Lock()
self._write_count = 0
self._conn = sqlite3.connect(
str(self.db_path),
check_same_thread=False,
# 30s gives the WAL writer (CLI or gateway) time to finish a batch
# flush before the concurrent reader/writer gives up. 10s was too
# short when the CLI is doing frequent memory flushes.
timeout=30.0,
# Short timeout — application-level retry with random jitter
# handles contention instead of sitting in SQLite's internal
# busy handler for up to 30s.
timeout=1.0,
# Autocommit mode: Python's default isolation_level="" auto-starts
# transactions on DML, which conflicts with our explicit
# BEGIN IMMEDIATE. None = we manage transactions ourselves.
isolation_level=None,
)
self._conn.row_factory = sqlite3.Row
self._conn.execute("PRAGMA journal_mode=WAL")
@ -135,6 +160,96 @@ class SessionDB:
self._init_schema()
# ── Core write helper ──
def _execute_write(self, fn: Callable[[sqlite3.Connection], T]) -> T:
"""Execute a write transaction with BEGIN IMMEDIATE and jitter retry.
*fn* receives the connection and should perform INSERT/UPDATE/DELETE
statements. The caller must NOT call ``commit()``; that is handled
here after *fn* returns.
BEGIN IMMEDIATE acquires the WAL write lock at transaction start
(not at commit time), so lock contention surfaces immediately.
On ``database is locked``, we release the Python lock, sleep a
random 20-150 ms, and retry, breaking the convoy pattern that
SQLite's built-in deterministic backoff creates.
Returns whatever *fn* returns.
"""
last_err: Optional[Exception] = None
for attempt in range(self._WRITE_MAX_RETRIES):
try:
with self._lock:
self._conn.execute("BEGIN IMMEDIATE")
try:
result = fn(self._conn)
self._conn.commit()
except BaseException:
try:
self._conn.rollback()
except Exception:
pass
raise
# Success — periodic best-effort checkpoint.
self._write_count += 1
if self._write_count % self._CHECKPOINT_EVERY_N_WRITES == 0:
self._try_wal_checkpoint()
return result
except sqlite3.OperationalError as exc:
err_msg = str(exc).lower()
if "locked" in err_msg or "busy" in err_msg:
last_err = exc
if attempt < self._WRITE_MAX_RETRIES - 1:
jitter = random.uniform(
self._WRITE_RETRY_MIN_S,
self._WRITE_RETRY_MAX_S,
)
time.sleep(jitter)
continue
# Non-lock error or retries exhausted — propagate.
raise
# Retries exhausted (shouldn't normally reach here).
raise last_err or sqlite3.OperationalError(
"database is locked after max retries"
)
def _try_wal_checkpoint(self) -> None:
"""Best-effort PASSIVE WAL checkpoint. Never blocks, never raises.
Flushes committed WAL frames back into the main DB file for any
frames that no other connection currently needs. Keeps the WAL
from growing unbounded when many processes hold persistent
connections.
"""
try:
with self._lock:
result = self._conn.execute(
"PRAGMA wal_checkpoint(PASSIVE)"
).fetchone()
if result and result[1] > 0:
logger.debug(
"WAL checkpoint: %d/%d pages checkpointed",
result[2], result[1],
)
except Exception:
pass # Best effort — never fatal.
def close(self):
"""Close the database connection.
Attempts a PASSIVE WAL checkpoint first so that exiting processes
help keep the WAL file from growing unbounded.
"""
with self._lock:
if self._conn:
try:
self._conn.execute("PRAGMA wal_checkpoint(PASSIVE)")
except Exception:
pass
self._conn.close()
self._conn = None
def _init_schema(self):
"""Create tables and FTS if they don't exist, run migrations."""
cursor = self._conn.cursor()
@@ -256,8 +371,8 @@ class SessionDB:
parent_session_id: str = None,
) -> str:
"""Create a new session record. Returns the session_id."""
with self._lock:
self._conn.execute(
def _do(conn):
conn.execute(
"""INSERT OR IGNORE INTO sessions (id, source, user_id, model, model_config,
system_prompt, parent_session_id, started_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)""",
@@ -272,26 +387,35 @@ class SessionDB:
time.time(),
),
)
self._conn.commit()
self._execute_write(_do)
return session_id
def end_session(self, session_id: str, end_reason: str) -> None:
"""Mark a session as ended."""
with self._lock:
self._conn.execute(
def _do(conn):
conn.execute(
"UPDATE sessions SET ended_at = ?, end_reason = ? WHERE id = ?",
(time.time(), end_reason, session_id),
)
self._conn.commit()
self._execute_write(_do)
def reopen_session(self, session_id: str) -> None:
"""Clear ended_at/end_reason so a session can be resumed."""
def _do(conn):
conn.execute(
"UPDATE sessions SET ended_at = NULL, end_reason = NULL WHERE id = ?",
(session_id,),
)
self._execute_write(_do)
def update_system_prompt(self, session_id: str, system_prompt: str) -> None:
"""Store the full assembled system prompt snapshot."""
with self._lock:
self._conn.execute(
def _do(conn):
conn.execute(
"UPDATE sessions SET system_prompt = ? WHERE id = ?",
(system_prompt, session_id),
)
self._conn.commit()
self._execute_write(_do)
def update_token_counts(
self,
@@ -310,11 +434,39 @@ class SessionDB:
billing_provider: Optional[str] = None,
billing_base_url: Optional[str] = None,
billing_mode: Optional[str] = None,
absolute: bool = False,
) -> None:
"""Increment token counters and backfill model if not already set."""
with self._lock:
self._conn.execute(
"""UPDATE sessions SET
"""Update token counters and backfill model if not already set.
When *absolute* is False (default), values are **incremented**; use
this for per-API-call deltas (CLI path).
When *absolute* is True, values are **set directly**; use this when
the caller already holds cumulative totals (gateway path, where the
cached agent accumulates across messages).
"""
if absolute:
sql = """UPDATE sessions SET
input_tokens = ?,
output_tokens = ?,
cache_read_tokens = ?,
cache_write_tokens = ?,
reasoning_tokens = ?,
estimated_cost_usd = COALESCE(?, 0),
actual_cost_usd = CASE
WHEN ? IS NULL THEN actual_cost_usd
ELSE ?
END,
cost_status = COALESCE(?, cost_status),
cost_source = COALESCE(?, cost_source),
pricing_version = COALESCE(?, pricing_version),
billing_provider = COALESCE(billing_provider, ?),
billing_base_url = COALESCE(billing_base_url, ?),
billing_mode = COALESCE(billing_mode, ?),
model = COALESCE(model, ?)
WHERE id = ?"""
else:
sql = """UPDATE sessions SET
input_tokens = input_tokens + ?,
output_tokens = output_tokens + ?,
cache_read_tokens = cache_read_tokens + ?,
@@ -332,6 +484,94 @@ class SessionDB:
billing_base_url = COALESCE(billing_base_url, ?),
billing_mode = COALESCE(billing_mode, ?),
model = COALESCE(model, ?)
WHERE id = ?"""
params = (
input_tokens,
output_tokens,
cache_read_tokens,
cache_write_tokens,
reasoning_tokens,
estimated_cost_usd,
actual_cost_usd,
actual_cost_usd,
cost_status,
cost_source,
pricing_version,
billing_provider,
billing_base_url,
billing_mode,
model,
session_id,
)
def _do(conn):
conn.execute(sql, params)
self._execute_write(_do)
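The incremental-versus-absolute distinction matters because mixing the two double-counts tokens. A toy demonstration of the two SQL shapes, with the schema trimmed to a single counter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, input_tokens INTEGER DEFAULT 0)")
conn.execute("INSERT INTO sessions (id) VALUES ('s1')")

# Incremental (CLI path): each call adds a per-API-call delta.
conn.execute("UPDATE sessions SET input_tokens = input_tokens + ? WHERE id = ?",
             (100, "s1"))
conn.execute("UPDATE sessions SET input_tokens = input_tokens + ? WHERE id = ?",
             (50, "s1"))

# Absolute (gateway path): the caller already holds the running total.
conn.execute("UPDATE sessions SET input_tokens = ? WHERE id = ?",
             (500, "s1"))

total = conn.execute(
    "SELECT input_tokens FROM sessions WHERE id = 's1'").fetchone()[0]
# total is 500, not 650: the absolute write replaces rather than adds.
```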
def ensure_session(
self,
session_id: str,
source: str = "unknown",
model: str = None,
) -> None:
"""Ensure a session row exists, creating it with minimal metadata if absent.
Used by _flush_messages_to_session_db to recover from a failed
create_session() call (e.g. transient SQLite lock at agent startup).
INSERT OR IGNORE is safe to call even when the row already exists.
"""
def _do(conn):
conn.execute(
"""INSERT OR IGNORE INTO sessions
(id, source, model, started_at)
VALUES (?, ?, ?, ?)""",
(session_id, source, model, time.time()),
)
self._execute_write(_do)
def set_token_counts(
self,
session_id: str,
input_tokens: int = 0,
output_tokens: int = 0,
model: str = None,
cache_read_tokens: int = 0,
cache_write_tokens: int = 0,
reasoning_tokens: int = 0,
estimated_cost_usd: Optional[float] = None,
actual_cost_usd: Optional[float] = None,
cost_status: Optional[str] = None,
cost_source: Optional[str] = None,
pricing_version: Optional[str] = None,
billing_provider: Optional[str] = None,
billing_base_url: Optional[str] = None,
billing_mode: Optional[str] = None,
) -> None:
"""Set token counters to absolute values (not increment).
Use this when the caller provides cumulative totals from a completed
conversation run (e.g. the gateway, where the cached agent's
session_prompt_tokens already reflects the running total).
"""
def _do(conn):
conn.execute(
"""UPDATE sessions SET
input_tokens = ?,
output_tokens = ?,
cache_read_tokens = ?,
cache_write_tokens = ?,
reasoning_tokens = ?,
estimated_cost_usd = ?,
actual_cost_usd = CASE
WHEN ? IS NULL THEN actual_cost_usd
ELSE ?
END,
cost_status = COALESCE(?, cost_status),
cost_source = COALESCE(?, cost_source),
pricing_version = COALESCE(?, pricing_version),
billing_provider = COALESCE(billing_provider, ?),
billing_base_url = COALESCE(billing_base_url, ?),
billing_mode = COALESCE(billing_mode, ?),
model = COALESCE(model, ?)
WHERE id = ?""",
(
input_tokens,
@@ -352,28 +592,7 @@ class SessionDB:
session_id,
),
)
self._conn.commit()
def ensure_session(
self,
session_id: str,
source: str = "unknown",
model: str = None,
) -> None:
"""Ensure a session row exists, creating it with minimal metadata if absent.
Used by _flush_messages_to_session_db to recover from a failed
create_session() call (e.g. transient SQLite lock at agent startup).
INSERT OR IGNORE is safe to call even when the row already exists.
"""
with self._lock:
self._conn.execute(
"""INSERT OR IGNORE INTO sessions
(id, source, model, started_at)
VALUES (?, ?, ?, ?)""",
(session_id, source, model, time.time()),
)
self._conn.commit()
self._execute_write(_do)
def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
"""Get a session by ID."""
@@ -467,10 +686,10 @@ class SessionDB:
Empty/whitespace-only strings are normalized to None (clearing the title).
"""
title = self.sanitize_title(title)
with self._lock:
def _do(conn):
if title:
# Check uniqueness (allow the same session to keep its own title)
cursor = self._conn.execute(
cursor = conn.execute(
"SELECT id FROM sessions WHERE title = ? AND id != ?",
(title, session_id),
)
@@ -479,12 +698,12 @@ class SessionDB:
raise ValueError(
f"Title '{title}' is already in use by session {conflict['id']}"
)
cursor = self._conn.execute(
cursor = conn.execute(
"UPDATE sessions SET title = ? WHERE id = ?",
(title, session_id),
)
self._conn.commit()
rowcount = cursor.rowcount
return cursor.rowcount
rowcount = self._execute_write(_do)
return rowcount > 0
def get_session_title(self, session_id: str) -> Optional[str]:
@@ -656,17 +875,24 @@ class SessionDB:
Also increments the session's message_count (and tool_call_count
if role is 'tool' or tool_calls is present).
"""
with self._lock:
# Serialize structured fields to JSON for storage
reasoning_details_json = (
json.dumps(reasoning_details)
if reasoning_details else None
)
codex_items_json = (
json.dumps(codex_reasoning_items)
if codex_reasoning_items else None
)
cursor = self._conn.execute(
# Serialize structured fields to JSON before entering the write txn
reasoning_details_json = (
json.dumps(reasoning_details)
if reasoning_details else None
)
codex_items_json = (
json.dumps(codex_reasoning_items)
if codex_reasoning_items else None
)
tool_calls_json = json.dumps(tool_calls) if tool_calls else None
# Pre-compute tool call count
num_tool_calls = 0
if tool_calls is not None:
num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
def _do(conn):
cursor = conn.execute(
"""INSERT INTO messages (session_id, role, content, tool_call_id,
tool_calls, tool_name, timestamp, token_count, finish_reason,
reasoning, reasoning_details, codex_reasoning_items)
@@ -676,7 +902,7 @@ class SessionDB:
role,
content,
tool_call_id,
json.dumps(tool_calls) if tool_calls else None,
tool_calls_json,
tool_name,
time.time(),
token_count,
@@ -689,25 +915,20 @@ class SessionDB:
msg_id = cursor.lastrowid
# Update counters
# Count actual tool calls from the tool_calls list (not from tool responses).
# A single assistant message can contain multiple parallel tool calls.
num_tool_calls = 0
if tool_calls is not None:
num_tool_calls = len(tool_calls) if isinstance(tool_calls, list) else 1
if num_tool_calls > 0:
self._conn.execute(
conn.execute(
"""UPDATE sessions SET message_count = message_count + 1,
tool_call_count = tool_call_count + ? WHERE id = ?""",
(num_tool_calls, session_id),
)
else:
self._conn.execute(
conn.execute(
"UPDATE sessions SET message_count = message_count + 1 WHERE id = ?",
(session_id,),
)
return msg_id
self._conn.commit()
return msg_id
return self._execute_write(_do)
def get_messages(self, session_id: str) -> List[Dict[str, Any]]:
"""Load all messages for a session, ordered by timestamp."""
@@ -1001,54 +1222,53 @@ class SessionDB:
def clear_messages(self, session_id: str) -> None:
"""Delete all messages for a session and reset its counters."""
with self._lock:
self._conn.execute(
def _do(conn):
conn.execute(
"DELETE FROM messages WHERE session_id = ?", (session_id,)
)
self._conn.execute(
conn.execute(
"UPDATE sessions SET message_count = 0, tool_call_count = 0 WHERE id = ?",
(session_id,),
)
self._conn.commit()
self._execute_write(_do)
def delete_session(self, session_id: str) -> bool:
"""Delete a session and all its messages. Returns True if found."""
with self._lock:
cursor = self._conn.execute(
def _do(conn):
cursor = conn.execute(
"SELECT COUNT(*) FROM sessions WHERE id = ?", (session_id,)
)
if cursor.fetchone()[0] == 0:
return False
self._conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
self._conn.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
self._conn.commit()
conn.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
conn.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
return True
return self._execute_write(_do)
def prune_sessions(self, older_than_days: int = 90, source: str = None) -> int:
"""
Delete sessions older than N days. Returns count of deleted sessions.
Only prunes ended sessions (not active ones).
"""
import time as _time
cutoff = _time.time() - (older_than_days * 86400)
cutoff = time.time() - (older_than_days * 86400)
with self._lock:
def _do(conn):
if source:
cursor = self._conn.execute(
cursor = conn.execute(
"""SELECT id FROM sessions
WHERE started_at < ? AND ended_at IS NOT NULL AND source = ?""",
(cutoff, source),
)
else:
cursor = self._conn.execute(
cursor = conn.execute(
"SELECT id FROM sessions WHERE started_at < ? AND ended_at IS NOT NULL",
(cutoff,),
)
session_ids = [row["id"] for row in cursor.fetchall()]
for sid in session_ids:
self._conn.execute("DELETE FROM messages WHERE session_id = ?", (sid,))
self._conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))
conn.execute("DELETE FROM messages WHERE session_id = ?", (sid,))
conn.execute("DELETE FROM sessions WHERE id = ?", (sid,))
return len(session_ids)
self._conn.commit()
return len(session_ids)
return self._execute_write(_do)


@@ -270,7 +270,7 @@ def cmd_status(args) -> None:
print(f" {peer}: {mode}")
print(f" Write freq: {hcfg.write_frequency}")
if hcfg.enabled and hcfg.api_key:
if hcfg.enabled and (hcfg.api_key or hcfg.base_url):
print("\n Connection... ", end="", flush=True)
try:
get_honcho_client(hcfg)
@@ -278,7 +278,7 @@ def cmd_status(args) -> None:
except Exception as e:
print(f"FAILED ({e})\n")
else:
reason = "disabled" if not hcfg.enabled else "no API key"
reason = "disabled" if not hcfg.enabled else "no API key or base URL"
print(f"\n Not connected ({reason})\n")


@@ -417,9 +417,18 @@ def get_honcho_client(config: HonchoClientConfig | None = None) -> Honcho:
else:
logger.info("Initializing Honcho client (host: %s, workspace: %s)", config.host, config.workspace_id)
# Local Honcho instances don't require an API key, but the SDK
# expects a non-empty string. Use a placeholder for local URLs.
_is_local = resolved_base_url and (
"localhost" in resolved_base_url
or "127.0.0.1" in resolved_base_url
or "::1" in resolved_base_url
)
effective_api_key = config.api_key or ("local" if _is_local else None)
kwargs: dict = {
"workspace_id": config.workspace_id,
"api_key": config.api_key,
"api_key": effective_api_key,
"environment": config.environment,
}
if resolved_base_url:

868 mcp_serve.py Normal file
@@ -0,0 +1,868 @@
"""
Hermes MCP Server: exposes messaging conversations as MCP tools.
Starts a stdio MCP server that lets any MCP client (Claude Code, Cursor, Codex,
etc.) list conversations, read message history, send messages, poll for live
events, and manage approval requests across all connected platforms.
Matches OpenClaw's 9-tool MCP channel bridge surface:
conversations_list, conversation_get, messages_read, attachments_fetch,
events_poll, events_wait, messages_send, permissions_list_open,
permissions_respond
Plus: channels_list (Hermes-specific extra)
Usage:
hermes mcp serve
hermes mcp serve --verbose
MCP client config (e.g. claude_desktop_config.json):
{
"mcpServers": {
"hermes": {
"command": "hermes",
"args": ["mcp", "serve"]
}
}
}
"""
from __future__ import annotations
import json
import logging
import os
import re
import sys
import threading
import time
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional
logger = logging.getLogger("hermes.mcp_serve")
# ---------------------------------------------------------------------------
# Lazy MCP SDK import
# ---------------------------------------------------------------------------
_MCP_SERVER_AVAILABLE = False
try:
from mcp.server.fastmcp import FastMCP
_MCP_SERVER_AVAILABLE = True
except ImportError:
FastMCP = None # type: ignore[assignment,misc]
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _get_sessions_dir() -> Path:
"""Return the sessions directory using HERMES_HOME."""
try:
from hermes_constants import get_hermes_home
return get_hermes_home() / "sessions"
except ImportError:
return Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes")) / "sessions"
def _get_session_db():
"""Get a SessionDB instance for reading message transcripts."""
try:
from hermes_state import SessionDB
return SessionDB()
except Exception as e:
logger.debug("SessionDB unavailable: %s", e)
return None
def _load_sessions_index() -> dict:
"""Load the gateway sessions.json index directly.
Returns a dict of session_key -> entry_dict with platform routing info.
This avoids importing the full SessionStore which needs GatewayConfig.
"""
sessions_file = _get_sessions_dir() / "sessions.json"
if not sessions_file.exists():
return {}
try:
with open(sessions_file, "r", encoding="utf-8") as f:
return json.load(f)
except Exception as e:
logger.debug("Failed to load sessions.json: %s", e)
return {}
def _load_channel_directory() -> dict:
"""Load the cached channel directory for available targets."""
try:
from hermes_constants import get_hermes_home
directory_file = get_hermes_home() / "channel_directory.json"
except ImportError:
directory_file = Path(
os.environ.get("HERMES_HOME", Path.home() / ".hermes")
) / "channel_directory.json"
if not directory_file.exists():
return {}
try:
with open(directory_file, "r", encoding="utf-8") as f:
return json.load(f)
except Exception as e:
logger.debug("Failed to load channel_directory.json: %s", e)
return {}
def _extract_message_content(msg: dict) -> str:
"""Extract text content from a message, handling multi-part content."""
content = msg.get("content", "")
if isinstance(content, list):
text_parts = [
p.get("text", "") for p in content
if isinstance(p, dict) and p.get("type") == "text"
]
return "\n".join(text_parts)
return str(content) if content else ""
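As a quick illustration of the multi-part handling, here is a trimmed copy of the helper applied to hypothetical OpenAI-style message content (the sample URLs and text are made up):

```python
def extract(content):
    """Trimmed copy of _extract_message_content: join text parts, skip the rest."""
    if isinstance(content, list):
        return "\n".join(
            p.get("text", "")
            for p in content
            if isinstance(p, dict) and p.get("type") == "text"
        )
    return str(content) if content else ""

# Multi-part content: text blocks survive, the image block is dropped.
multi = [
    {"type": "text", "text": "hello"},
    {"type": "image_url", "image_url": {"url": "https://example.com/img.png"}},
    {"type": "text", "text": "world"},
]
joined = extract(multi)  # "hello\nworld"
```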
def _extract_attachments(msg: dict) -> List[dict]:
"""Extract non-text attachments from a message.
Finds: multi-part image/file content blocks, MEDIA: tags in text,
image URLs, and file references.
"""
attachments = []
content = msg.get("content", "")
# Multi-part content blocks (image_url, file, etc.)
if isinstance(content, list):
for part in content:
if not isinstance(part, dict):
continue
ptype = part.get("type", "")
if ptype == "image_url":
url = part.get("image_url", {}).get("url", "") if isinstance(part.get("image_url"), dict) else ""
if url:
attachments.append({"type": "image", "url": url})
elif ptype == "image":
url = part.get("url", part.get("source", {}).get("url", ""))
if url:
attachments.append({"type": "image", "url": url})
elif ptype not in ("text",):
# Unknown non-text content type
attachments.append({"type": ptype, "data": part})
# MEDIA: tags in text content
text = _extract_message_content(msg)
if text:
media_pattern = re.compile(r'MEDIA:\s*(\S+)')
for match in media_pattern.finditer(text):
path = match.group(1)
attachments.append({"type": "media", "path": path})
return attachments
# ---------------------------------------------------------------------------
# Event Bridge — polls SessionDB for new messages, maintains event queue
# ---------------------------------------------------------------------------
QUEUE_LIMIT = 1000
POLL_INTERVAL = 0.2 # seconds between DB polls (200ms)
@dataclass
class QueueEvent:
"""An event in the bridge's in-memory queue."""
cursor: int
type: str # "message", "approval_requested", "approval_resolved"
session_key: str = ""
data: dict = field(default_factory=dict)
class EventBridge:
"""Background poller that watches SessionDB for new messages and
maintains an in-memory event queue with waiter support.
This is the Hermes equivalent of OpenClaw's WebSocket gateway bridge.
Instead of WebSocket events, we poll the SQLite database for changes.
"""
def __init__(self):
self._queue: List[QueueEvent] = []
self._cursor = 0
self._lock = threading.Lock()
self._new_event = threading.Event()
self._running = False
self._thread: Optional[threading.Thread] = None
self._last_poll_timestamps: Dict[str, float] = {} # session_key -> unix timestamp
# In-memory approval tracking (populated from events)
self._pending_approvals: Dict[str, dict] = {}
# mtime cache — skip expensive work when files haven't changed
self._sessions_json_mtime: float = 0.0
self._state_db_mtime: float = 0.0
self._cached_sessions_index: dict = {}
def start(self):
"""Start the background polling thread."""
if self._running:
return
self._running = True
self._thread = threading.Thread(target=self._poll_loop, daemon=True)
self._thread.start()
logger.debug("EventBridge started")
def stop(self):
"""Stop the background polling thread."""
self._running = False
self._new_event.set() # Wake any waiters
if self._thread:
self._thread.join(timeout=5)
logger.debug("EventBridge stopped")
def poll_events(
self,
after_cursor: int = 0,
session_key: Optional[str] = None,
limit: int = 20,
) -> dict:
"""Return events since after_cursor, optionally filtered by session_key."""
with self._lock:
events = [
e for e in self._queue
if e.cursor > after_cursor
and (not session_key or e.session_key == session_key)
][:limit]
next_cursor = events[-1].cursor if events else after_cursor
return {
"events": [
{"cursor": e.cursor, "type": e.type,
"session_key": e.session_key, **e.data}
for e in events
],
"next_cursor": next_cursor,
}
def wait_for_event(
self,
after_cursor: int = 0,
session_key: Optional[str] = None,
timeout_ms: int = 30000,
) -> Optional[dict]:
"""Block until a matching event arrives or timeout expires."""
deadline = time.monotonic() + (timeout_ms / 1000.0)
while time.monotonic() < deadline:
with self._lock:
for e in self._queue:
if e.cursor > after_cursor and (
not session_key or e.session_key == session_key
):
return {
"cursor": e.cursor, "type": e.type,
"session_key": e.session_key, **e.data,
}
remaining = deadline - time.monotonic()
if remaining <= 0:
break
self._new_event.clear()
self._new_event.wait(timeout=min(remaining, POLL_INTERVAL))
return None
def list_pending_approvals(self) -> List[dict]:
"""List approval requests observed during this bridge session."""
with self._lock:
return sorted(
self._pending_approvals.values(),
key=lambda a: a.get("created_at", ""),
)
def respond_to_approval(self, approval_id: str, decision: str) -> dict:
"""Resolve a pending approval (best-effort without gateway IPC)."""
with self._lock:
approval = self._pending_approvals.pop(approval_id, None)
if not approval:
return {"error": f"Approval not found: {approval_id}"}
self._enqueue(QueueEvent(
cursor=0, # Will be set by _enqueue
type="approval_resolved",
session_key=approval.get("session_key", ""),
data={"approval_id": approval_id, "decision": decision},
))
return {"resolved": True, "approval_id": approval_id, "decision": decision}
def _enqueue(self, event: QueueEvent) -> None:
"""Add an event to the queue and wake any waiters."""
with self._lock:
self._cursor += 1
event.cursor = self._cursor
self._queue.append(event)
# Trim queue to limit
while len(self._queue) > QUEUE_LIMIT:
self._queue.pop(0)
self._new_event.set()
def _poll_loop(self):
"""Background loop: poll SessionDB for new messages."""
db = _get_session_db()
if not db:
logger.warning("EventBridge: SessionDB unavailable, event polling disabled")
return
while self._running:
try:
self._poll_once(db)
except Exception as e:
logger.debug("EventBridge poll error: %s", e)
time.sleep(POLL_INTERVAL)
def _poll_once(self, db):
"""Check for new messages across all sessions.
Uses mtime checks on sessions.json and state.db to skip work
when nothing has changed, which makes 200ms polling essentially free.
"""
# Check if sessions.json has changed (mtime check is ~1μs)
sessions_file = _get_sessions_dir() / "sessions.json"
try:
sj_mtime = sessions_file.stat().st_mtime if sessions_file.exists() else 0.0
except OSError:
sj_mtime = 0.0
if sj_mtime != self._sessions_json_mtime:
self._sessions_json_mtime = sj_mtime
self._cached_sessions_index = _load_sessions_index()
# Check if state.db has changed
try:
from hermes_constants import get_hermes_home
db_file = get_hermes_home() / "state.db"
except ImportError:
db_file = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes")) / "state.db"
try:
db_mtime = db_file.stat().st_mtime if db_file.exists() else 0.0
except OSError:
db_mtime = 0.0
if db_mtime == self._state_db_mtime and sj_mtime == self._sessions_json_mtime:
return # Nothing changed since last poll — skip entirely
self._state_db_mtime = db_mtime
entries = self._cached_sessions_index
for session_key, entry in entries.items():
session_id = entry.get("session_id", "")
if not session_id:
continue
last_seen = self._last_poll_timestamps.get(session_key, 0.0)
try:
messages = db.get_messages(session_id)
except Exception:
continue
if not messages:
continue
# Normalize timestamps to float for comparison
def _ts_float(ts) -> float:
if isinstance(ts, (int, float)):
return float(ts)
if isinstance(ts, str) and ts:
try:
return float(ts)
except ValueError:
# ISO string — parse to epoch
try:
from datetime import datetime
return datetime.fromisoformat(ts).timestamp()
except Exception:
return 0.0
return 0.0
# Find messages newer than our last seen timestamp
new_messages = []
for msg in messages:
ts = _ts_float(msg.get("timestamp", 0))
role = msg.get("role", "")
if role not in ("user", "assistant"):
continue
if ts > last_seen:
new_messages.append(msg)
for msg in new_messages:
content = _extract_message_content(msg)
if not content:
continue
self._enqueue(QueueEvent(
cursor=0,
type="message",
session_key=session_key,
data={
"role": msg.get("role", ""),
"content": content[:500],
"timestamp": str(msg.get("timestamp", "")),
"message_id": str(msg.get("id", "")),
},
))
# Update last seen to the most recent message timestamp
all_ts = [_ts_float(m.get("timestamp", 0)) for m in messages]
if all_ts:
latest = max(all_ts)
if latest > last_seen:
self._last_poll_timestamps[session_key] = latest
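The cursor contract the bridge exposes (monotonic cursors, bounded queue, poll-after-cursor) can be sketched independently of the SQLite polling; `MiniQueue` below is a hypothetical reduction of EventBridge, not the shipped class:

```python
import threading
from dataclasses import dataclass, field

@dataclass
class Ev:
    cursor: int
    type: str
    session_key: str = ""
    data: dict = field(default_factory=dict)

class MiniQueue:
    """Monotonic-cursor event queue: each event gets the next cursor value."""
    def __init__(self, limit=1000):
        self._q, self._cursor, self._limit = [], 0, limit
        self._lock = threading.Lock()

    def enqueue(self, type, session_key="", **data):
        with self._lock:
            self._cursor += 1
            self._q.append(Ev(self._cursor, type, session_key, data))
            while len(self._q) > self._limit:  # bounded queue, drop oldest
                self._q.pop(0)

    def poll(self, after_cursor=0, limit=20):
        with self._lock:
            events = [e for e in self._q if e.cursor > after_cursor][:limit]
        # Re-polling with the returned cursor yields only newer events.
        next_cursor = events[-1].cursor if events else after_cursor
        return events, next_cursor

q = MiniQueue()
q.enqueue("message", "telegram:1", role="user")
q.enqueue("message", "telegram:1", role="assistant")
events, cur = q.poll(after_cursor=0)
# cur == 2; polling again with after_cursor=cur returns nothing new.
```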
# ---------------------------------------------------------------------------
# MCP Server
# ---------------------------------------------------------------------------
def create_mcp_server(event_bridge: Optional[EventBridge] = None) -> "FastMCP":
"""Create and return the Hermes MCP server with all tools registered."""
if not _MCP_SERVER_AVAILABLE:
raise ImportError(
"MCP server requires the 'mcp' package. "
"Install with: pip install 'hermes-agent[mcp]'"
)
mcp = FastMCP(
"hermes",
instructions=(
"Hermes Agent messaging bridge. Use these tools to interact with "
"conversations across Telegram, Discord, Slack, WhatsApp, Signal, "
"Matrix, and other connected platforms."
),
)
bridge = event_bridge or EventBridge()
# -- conversations_list ------------------------------------------------
@mcp.tool()
def conversations_list(
platform: Optional[str] = None,
limit: int = 50,
search: Optional[str] = None,
) -> str:
"""List active messaging conversations across connected platforms.
Returns conversations with their session keys (needed for messages_read),
platform, chat type, display name, and last activity time.
Args:
platform: Filter by platform name (telegram, discord, slack, etc.)
limit: Maximum number of conversations to return (default 50)
search: Optional text to filter conversations by name
"""
entries = _load_sessions_index()
conversations = []
for key, entry in entries.items():
origin = entry.get("origin", {})
entry_platform = entry.get("platform") or origin.get("platform", "")
if platform and entry_platform.lower() != platform.lower():
continue
display_name = entry.get("display_name", "")
chat_name = origin.get("chat_name", "")
if search:
search_lower = search.lower()
if (search_lower not in display_name.lower()
and search_lower not in chat_name.lower()
and search_lower not in key.lower()):
continue
conversations.append({
"session_key": key,
"session_id": entry.get("session_id", ""),
"platform": entry_platform,
"chat_type": entry.get("chat_type", origin.get("chat_type", "")),
"display_name": display_name,
"chat_name": chat_name,
"user_name": origin.get("user_name", ""),
"updated_at": entry.get("updated_at", ""),
})
conversations.sort(key=lambda c: c.get("updated_at", ""), reverse=True)
conversations = conversations[:limit]
return json.dumps({
"count": len(conversations),
"conversations": conversations,
}, indent=2)
# -- conversation_get --------------------------------------------------
@mcp.tool()
def conversation_get(session_key: str) -> str:
"""Get detailed info about one conversation by its session key.
Args:
session_key: The session key from conversations_list
"""
entries = _load_sessions_index()
entry = entries.get(session_key)
if not entry:
return json.dumps({"error": f"Conversation not found: {session_key}"})
origin = entry.get("origin", {})
return json.dumps({
"session_key": session_key,
"session_id": entry.get("session_id", ""),
"platform": entry.get("platform") or origin.get("platform", ""),
"chat_type": entry.get("chat_type", origin.get("chat_type", "")),
"display_name": entry.get("display_name", ""),
"user_name": origin.get("user_name", ""),
"chat_name": origin.get("chat_name", ""),
"chat_id": origin.get("chat_id", ""),
"thread_id": origin.get("thread_id"),
"updated_at": entry.get("updated_at", ""),
"created_at": entry.get("created_at", ""),
"input_tokens": entry.get("input_tokens", 0),
"output_tokens": entry.get("output_tokens", 0),
"total_tokens": entry.get("total_tokens", 0),
}, indent=2)
# -- messages_read -----------------------------------------------------
@mcp.tool()
def messages_read(
session_key: str,
limit: int = 50,
) -> str:
"""Read recent messages from a conversation.
Returns the message history in chronological order with role, content,
and timestamp for each message.
Args:
session_key: The session key from conversations_list
limit: Maximum number of messages to return (default 50, most recent)
"""
entries = _load_sessions_index()
entry = entries.get(session_key)
if not entry:
return json.dumps({"error": f"Conversation not found: {session_key}"})
session_id = entry.get("session_id", "")
if not session_id:
return json.dumps({"error": "No session ID for this conversation"})
db = _get_session_db()
if not db:
return json.dumps({"error": "Session database unavailable"})
try:
all_messages = db.get_messages(session_id)
except Exception as e:
return json.dumps({"error": f"Failed to read messages: {e}"})
filtered = []
for msg in all_messages:
role = msg.get("role", "")
if role in ("user", "assistant"):
content = _extract_message_content(msg)
if content:
filtered.append({
"id": str(msg.get("id", "")),
"role": role,
"content": content[:2000],
"timestamp": msg.get("timestamp", ""),
})
messages = filtered[-limit:]
return json.dumps({
"session_key": session_key,
"count": len(messages),
"total_in_session": len(filtered),
"messages": messages,
}, indent=2)
# -- attachments_fetch -------------------------------------------------
@mcp.tool()
def attachments_fetch(
session_key: str,
message_id: str,
) -> str:
"""List non-text attachments for a message in a conversation.
Extracts images, media files, and other non-text content blocks
from the specified message.
Args:
session_key: The session key from conversations_list
message_id: The message ID from messages_read
"""
entries = _load_sessions_index()
entry = entries.get(session_key)
if not entry:
return json.dumps({"error": f"Conversation not found: {session_key}"})
session_id = entry.get("session_id", "")
if not session_id:
return json.dumps({"error": "No session ID for this conversation"})
db = _get_session_db()
if not db:
return json.dumps({"error": "Session database unavailable"})
try:
all_messages = db.get_messages(session_id)
except Exception as e:
return json.dumps({"error": f"Failed to read messages: {e}"})
# Find the target message
target_msg = None
for msg in all_messages:
if str(msg.get("id", "")) == message_id:
target_msg = msg
break
if not target_msg:
return json.dumps({"error": f"Message not found: {message_id}"})
attachments = _extract_attachments(target_msg)
return json.dumps({
"message_id": message_id,
"count": len(attachments),
"attachments": attachments,
}, indent=2)
# -- events_poll -------------------------------------------------------
@mcp.tool()
def events_poll(
after_cursor: int = 0,
session_key: Optional[str] = None,
limit: int = 20,
) -> str:
"""Poll for new conversation events since a cursor position.
Returns events that have occurred since the given cursor. Use the
returned next_cursor value for subsequent polls.
Event types: message, approval_requested, approval_resolved
Args:
after_cursor: Return events after this cursor (0 for all)
session_key: Optional filter to one conversation
limit: Maximum events to return (default 20)
"""
result = bridge.poll_events(
after_cursor=after_cursor,
session_key=session_key,
limit=limit,
)
return json.dumps(result, indent=2)
# -- events_wait -------------------------------------------------------
@mcp.tool()
def events_wait(
after_cursor: int = 0,
session_key: Optional[str] = None,
timeout_ms: int = 30000,
) -> str:
"""Wait for the next conversation event (long-poll).
Blocks until a matching event arrives or the timeout expires.
Use this for near-real-time event delivery without polling.
Args:
after_cursor: Wait for events after this cursor
session_key: Optional filter to one conversation
timeout_ms: Maximum wait time in milliseconds (default 30000)
"""
event = bridge.wait_for_event(
after_cursor=after_cursor,
session_key=session_key,
timeout_ms=min(timeout_ms, 300000), # Cap at 5 minutes
)
if event:
return json.dumps({"event": event}, indent=2)
return json.dumps({"event": None, "reason": "timeout"}, indent=2)
# -- messages_send -----------------------------------------------------
@mcp.tool()
def messages_send(
target: str,
message: str,
) -> str:
"""Send a message to a platform conversation.
The target format is "platform:chat_id", the same format used by the
channels_list tool. You can also use human-friendly channel names
that will be resolved automatically.
Examples:
target="telegram:6308981865"
target="discord:#general"
target="slack:#engineering"
Args:
target: Platform target in "platform:identifier" format
message: The message text to send
"""
if not target or not message:
return json.dumps({"error": "Both target and message are required"})
try:
from tools.send_message_tool import send_message_tool
result_str = send_message_tool(
{"action": "send", "target": target, "message": message}
)
return result_str
except ImportError:
return json.dumps({"error": "Send message tool not available"})
except Exception as e:
return json.dumps({"error": f"Send failed: {e}"})
# -- channels_list -----------------------------------------------------
@mcp.tool()
def channels_list(platform: Optional[str] = None) -> str:
"""List available messaging channels and targets across platforms.
Returns channels that you can send messages to. The target strings
returned here can be used directly with the messages_send tool.
Args:
platform: Filter by platform name (telegram, discord, slack, etc.)
"""
directory = _load_channel_directory()
if not directory:
entries = _load_sessions_index()
targets = []
seen = set()
for key, entry in entries.items():
origin = entry.get("origin", {})
p = entry.get("platform") or origin.get("platform", "")
chat_id = origin.get("chat_id", "")
if not p or not chat_id:
continue
if platform and p.lower() != platform.lower():
continue
target_str = f"{p}:{chat_id}"
if target_str in seen:
continue
seen.add(target_str)
targets.append({
"target": target_str,
"platform": p,
"name": entry.get("display_name") or origin.get("chat_name", ""),
"chat_type": entry.get("chat_type", origin.get("chat_type", "")),
})
return json.dumps({"count": len(targets), "channels": targets}, indent=2)
channels = []
for plat, entries_list in directory.items():
if platform and plat.lower() != platform.lower():
continue
if isinstance(entries_list, list):
for ch in entries_list:
if isinstance(ch, dict):
chat_id = ch.get("id", ch.get("chat_id", ""))
channels.append({
"target": f"{plat}:{chat_id}" if chat_id else plat,
"platform": plat,
"name": ch.get("name", ch.get("display_name", "")),
"chat_type": ch.get("type", ""),
})
return json.dumps({"count": len(channels), "channels": channels}, indent=2)
# -- permissions_list_open ---------------------------------------------
@mcp.tool()
def permissions_list_open() -> str:
"""List pending approval requests observed during this bridge session.
Returns exec and plugin approval requests that the bridge has seen
since it started. Approvals are live-session only; older approvals
from before the bridge connected are not included.
"""
approvals = bridge.list_pending_approvals()
return json.dumps({
"count": len(approvals),
"approvals": approvals,
}, indent=2)
# -- permissions_respond -----------------------------------------------
@mcp.tool()
def permissions_respond(
id: str,
decision: str,
) -> str:
"""Respond to a pending approval request.
Args:
id: The approval ID from permissions_list_open
decision: One of "allow-once", "allow-always", or "deny"
"""
if decision not in ("allow-once", "allow-always", "deny"):
return json.dumps({
"error": f"Invalid decision: {decision}. "
f"Must be allow-once, allow-always, or deny"
})
result = bridge.respond_to_approval(id, decision)
return json.dumps(result, indent=2)
return mcp
# ---------------------------------------------------------------------------
# Entry point
# ---------------------------------------------------------------------------
def run_mcp_server(verbose: bool = False) -> None:
"""Start the Hermes MCP server on stdio."""
if not _MCP_SERVER_AVAILABLE:
print(
"Error: MCP server requires the 'mcp' package.\n"
"Install with: pip install 'hermes-agent[mcp]'",
file=sys.stderr,
)
sys.exit(1)
if verbose:
logging.basicConfig(level=logging.DEBUG, stream=sys.stderr)
else:
logging.basicConfig(level=logging.WARNING, stream=sys.stderr)
bridge = EventBridge()
bridge.start()
server = create_mcp_server(event_bridge=bridge)
import asyncio
async def _run():
try:
await server.run_stdio_async()
finally:
bridge.stop()
try:
asyncio.run(_run())
except KeyboardInterrupt:
bridge.stop()

@ -10,6 +10,12 @@
# container recreation. Environment variables are written to $HERMES_HOME/.env
# and read by hermes at startup — no container recreation needed for env changes.
#
# Tool resolution: the hermes wrapper uses --suffix PATH for nix store tools,
# so apt/uv-installed versions take priority. The container entrypoint provisions
# extensible tools on first boot: nodejs/npm via apt, uv via curl, and a Python
# 3.11 venv (bootstrapped entirely by uv) at ~/.venv with pip seeded. Agents get
# writable tool prefixes for npm i -g, pip install, uv tool install, etc.
#
# Usage:
# services.hermes-agent = {
# enable = true;
@ -105,22 +111,52 @@
fi
mkdir -p "$TARGET_HOME"
chown "$HERMES_UID:$HERMES_GID" "$TARGET_HOME"
chmod 0750 "$TARGET_HOME"
# Ensure HERMES_HOME is owned by the target user
if [ -n "''${HERMES_HOME:-}" ] && [ -d "$HERMES_HOME" ]; then
chown -R "$HERMES_UID:$HERMES_GID" "$HERMES_HOME"
fi
# Install sudo on Debian/Ubuntu if missing (first boot only, cached in writable layer)
if command -v apt-get >/dev/null 2>&1 && ! command -v sudo >/dev/null 2>&1; then
apt-get update -qq >/dev/null 2>&1 && apt-get install -y -qq sudo >/dev/null 2>&1 || true
# ── Provision apt packages (first boot only, cached in writable layer) ──
# sudo: agent self-modification
# nodejs/npm: writable node so npm i -g works (nix store copies are read-only)
# curl: needed for uv installer
if [ ! -f /var/lib/hermes-tools-provisioned ] && command -v apt-get >/dev/null 2>&1; then
echo "First boot: provisioning agent tools..."
apt-get update -qq
apt-get install -y -qq sudo nodejs npm curl
touch /var/lib/hermes-tools-provisioned
fi
if command -v sudo >/dev/null 2>&1 && [ ! -f /etc/sudoers.d/hermes ]; then
mkdir -p /etc/sudoers.d
echo "$TARGET_USER ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/hermes
chmod 0440 /etc/sudoers.d/hermes
fi
# uv (Python manager) — not in Ubuntu repos, retry-safe outside the sentinel
if ! command -v uv >/dev/null 2>&1 && [ ! -x "$TARGET_HOME/.local/bin/uv" ] && command -v curl >/dev/null 2>&1; then
su -s /bin/sh "$TARGET_USER" -c 'curl -LsSf https://astral.sh/uv/install.sh | sh' || true
fi
# Python 3.11 venv — gives the agent a writable Python with pip.
# Uses uv to install Python 3.11 (Ubuntu 24.04 ships 3.12).
# --seed includes pip/setuptools so bare `pip install` works.
_UV_BIN="$TARGET_HOME/.local/bin/uv"
if [ ! -d "$TARGET_HOME/.venv" ] && [ -x "$_UV_BIN" ]; then
su -s /bin/sh "$TARGET_USER" -c "
export PATH=\"\$HOME/.local/bin:\$PATH\"
uv python install 3.11
uv venv --python 3.11 --seed \"\$HOME/.venv\"
" || true
fi
# Put the agent venv first on PATH so python/pip resolve to writable copies
if [ -d "$TARGET_HOME/.venv/bin" ]; then
export PATH="$TARGET_HOME/.venv/bin:$PATH"
fi
if command -v setpriv >/dev/null 2>&1; then
exec setpriv --reuid="$HERMES_UID" --regid="$HERMES_GID" --init-groups "$@"
elif command -v su >/dev/null 2>&1; then
@ -516,8 +552,8 @@
# ── Directories ───────────────────────────────────────────────────
{
systemd.tmpfiles.rules = [
"d ${cfg.stateDir} 0755 ${cfg.user} ${cfg.group} - -"
"d ${cfg.stateDir}/.hermes 0755 ${cfg.user} ${cfg.group} - -"
"d ${cfg.stateDir} 0750 ${cfg.user} ${cfg.group} - -"
"d ${cfg.stateDir}/.hermes 0750 ${cfg.user} ${cfg.group} - -"
"d ${cfg.stateDir}/home 0750 ${cfg.user} ${cfg.group} - -"
"d ${cfg.workingDirectory} 0750 ${cfg.user} ${cfg.group} - -"
];
@ -531,21 +567,23 @@
mkdir -p ${cfg.stateDir}/home
mkdir -p ${cfg.workingDirectory}
chown ${cfg.user}:${cfg.group} ${cfg.stateDir} ${cfg.stateDir}/.hermes ${cfg.stateDir}/home ${cfg.workingDirectory}
chmod 0750 ${cfg.stateDir} ${cfg.stateDir}/.hermes ${cfg.stateDir}/home ${cfg.workingDirectory}
# Merge Nix settings into existing config.yaml.
# Preserves user-added keys (skills, streaming, etc.); Nix keys win.
# If configFile is user-provided (not generated), overwrite instead of merge.
${if cfg.configFile != null then ''
install -o ${cfg.user} -g ${cfg.group} -m 0644 -D ${configFile} ${cfg.stateDir}/.hermes/config.yaml
install -o ${cfg.user} -g ${cfg.group} -m 0640 -D ${configFile} ${cfg.stateDir}/.hermes/config.yaml
'' else ''
${configMergeScript} ${generatedConfigFile} ${cfg.stateDir}/.hermes/config.yaml
chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/config.yaml
chmod 0644 ${cfg.stateDir}/.hermes/config.yaml
chmod 0640 ${cfg.stateDir}/.hermes/config.yaml
''}
# Managed mode marker (so interactive shells also detect NixOS management)
touch ${cfg.stateDir}/.hermes/.managed
chown ${cfg.user}:${cfg.group} ${cfg.stateDir}/.hermes/.managed
chmod 0644 ${cfg.stateDir}/.hermes/.managed
# Seed auth file if provided
${lib.optionalString (cfg.authFile != null) ''
@ -577,7 +615,7 @@ HERMES_NIX_ENV_EOF
# Link documents into workspace
${lib.concatStringsSep "\n" (lib.mapAttrsToList (name: _value: ''
install -o ${cfg.user} -g ${cfg.group} -m 0644 ${documentDerivation}/${name} ${cfg.workingDirectory}/${name}
install -o ${cfg.user} -g ${cfg.group} -m 0640 ${documentDerivation}/${name} ${cfg.workingDirectory}/${name}
'') cfg.documents)}
'';
}

@ -35,7 +35,7 @@
${pkgs.lib.concatMapStringsSep "\n" (name: ''
makeWrapper ${hermesVenv}/bin/${name} $out/bin/${name} \
--prefix PATH : "${runtimePath}" \
--suffix PATH : "${runtimePath}" \
--set HERMES_BUNDLED_SKILLS $out/share/hermes-agent/skills
'') [ "hermes" "hermes-agent" "hermes-acp" ]}

@ -0,0 +1 @@
Communication and decision-making frameworks — structured response formats for proposals, trade-off analysis, and stakeholder-ready recommendations.

@ -0,0 +1,103 @@
---
name: one-three-one-rule
description: >
Structured decision-making framework for technical proposals and trade-off analysis.
When the user faces a choice between multiple approaches (architecture decisions,
tool selection, refactoring strategies, migration paths), this skill produces a
1-3-1 format: one clear problem statement, three distinct options with pros/cons,
and one concrete recommendation with definition of done and implementation plan.
Use when the user asks for a "1-3-1", says "give me options", or needs help
choosing between competing approaches.
version: 1.0.0
author: Willard Moore
license: MIT
category: communication
metadata:
hermes:
tags: [communication, decision-making, proposals, trade-offs]
---
# 1-3-1 Communication Rule
Structured decision-making format for when a task has multiple viable approaches and the user needs a clear recommendation. Produces a concise problem framing, three options with trade-offs, and an actionable plan for the recommended path.
## When to Use
- The user explicitly asks for a "1-3-1" response.
- The user says "give me options" or "what are my choices" for a technical decision.
- A task has multiple viable approaches with meaningful trade-offs (architecture, tooling, migration strategy).
- The user needs a proposal they can forward to a team or stakeholder.
Do NOT use for simple questions with one obvious answer, debugging sessions, or tasks where the user has already decided on an approach.
## Procedure
1. **Problem** (one sentence)
- State the core decision or desired outcome in a single concise sentence.
- Focus on the *what*, not the *how* — no implementation details, tool names, or specific technologies.
- Keep it tight. If you need "and", you're describing two problems.
2. **Options** (exactly three)
- Present three distinct, viable approaches labeled A, B, C.
- Each option gets a brief description, pros, and cons.
- Options should represent genuinely different strategies, not minor variations of the same approach.
3. **Recommendation** (one option)
- State which option you recommend and why, based on the user's context and priorities.
- Be direct — this is your professional judgment, not a hedge.
4. **Definition of Done**
- List the specific success criteria for the recommended option.
- These are concrete, verifiable outcomes — not vague aspirations.
- If the user picks a different option, revise this section to match.
5. **Implementation Plan**
- Concrete steps to execute the recommended option.
- Include specific commands, tools, or actions where applicable.
- If the user picks a different option, revise this section to match.
## Verification
- Response has exactly one Problem sentence.
- Response has exactly three Options (A, B, C) with pros and cons for each.
- Response has a single Recommendation that picks one option with reasoning.
- Definition of Done and Implementation Plan align with the recommended option.
- If the user selects a different option, Recommendation, DoD, and Implementation Plan update accordingly.
## Example
User: "Give me a 1-3-1 for adding retry logic to our API client."
Agent:
**Problem:** API calls to external services fail intermittently under load and need graceful retry handling.
**Options:**
- **Option A: Exponential backoff with jitter in a shared utility.**
Pros: Simple, well-understood pattern; single implementation shared across all callers.
Cons: No circuit-breaking; callers still block during retries.
- **Option B: Retry library (tenacity) as a decorator on each endpoint call.**
Pros: Configurable per-endpoint; built-in support for circuit breakers and rate limits.
Cons: Adds a dependency; decorator pattern can obscure control flow.
- **Option C: Queue-based async retry with dead letter handling.**
Pros: Non-blocking; failed calls are tracked and replayable; scales well.
Cons: Significant complexity increase; requires a queue backend; overkill for low-volume services.
**Recommendation:** Option A. The call volume doesn't justify queue infrastructure, and a shared utility with exponential backoff + jitter covers the failure mode cleanly without new dependencies.
**Definition of Done:**
- `api_retry()` utility handles transient HTTP errors (429, 502, 503, 504).
- Retries up to 3 times with exponential backoff and jitter.
- All external API calls route through the retry utility.
- Logs each retry attempt with endpoint and attempt number.
- Tests cover retry exhaustion, immediate success, and transient-then-success scenarios.
**Implementation Plan:**
1. Create `utils/api_retry.py` with configurable max retries, base delay, and retryable status codes.
2. Add jitter using `random.uniform(0, base_delay)` to prevent thundering herd.
3. Wrap existing API calls in `api_client.py` with the retry utility.
4. Add unit tests mocking HTTP responses for each retry scenario.
5. Verify under load with a simple stress test against a flaky endpoint mock.
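As a concrete illustration of Step 1, here is a minimal `api_retry` sketch matching the Definition of Done above (the function name, signature, and retryable-status set come from this example, not from an existing codebase):

```python
import random
import time

RETRYABLE = {429, 502, 503, 504}  # transient HTTP statuses from the DoD

def api_retry(call, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on transient HTTP statuses with exponential backoff plus jitter.

    `call` returns a (status, body) tuple; `sleep` is injectable for testing.
    """
    for attempt in range(max_retries + 1):
        status, body = call()
        if status not in RETRYABLE or attempt == max_retries:
            return status, body
        # Exponential backoff with jitter to avoid thundering-herd retries.
        delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
        sleep(delay)
```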

@ -0,0 +1,97 @@
---
name: canvas
description: Canvas LMS integration — fetch enrolled courses and assignments using API token authentication.
version: 1.0.0
author: community
license: MIT
prerequisites:
env_vars: [CANVAS_API_TOKEN, CANVAS_BASE_URL]
metadata:
hermes:
tags: [Canvas, LMS, Education, Courses, Assignments]
---
# Canvas LMS — Course & Assignment Access
Read-only access to Canvas LMS for listing courses and assignments.
## Scripts
- `scripts/canvas_api.py` — Python CLI for Canvas API calls
## Setup
1. Log in to your Canvas instance in a browser
2. Go to **Account → Settings** (click your profile icon, then Settings)
3. Scroll to **Approved Integrations** and click **+ New Access Token**
4. Name the token (e.g., "Hermes Agent"), set an optional expiry, and click **Generate Token**
5. Copy the token and add to `~/.hermes/.env`:
```
CANVAS_API_TOKEN=your_token_here
CANVAS_BASE_URL=https://yourschool.instructure.com
```
The base URL is whatever appears in your browser when you're logged into Canvas (no trailing slash).
## Usage
```bash
CANVAS="python $HERMES_HOME/skills/productivity/canvas/scripts/canvas_api.py"
# List all active courses
$CANVAS list_courses --enrollment-state active
# List all courses (any state)
$CANVAS list_courses
# List assignments for a specific course
$CANVAS list_assignments 12345
# List assignments ordered by due date
$CANVAS list_assignments 12345 --order-by due_at
```
## Output Format
**list_courses** returns:
```json
[{"id": 12345, "name": "Intro to CS", "course_code": "CS101", "workflow_state": "available", "start_at": "...", "end_at": "..."}]
```
**list_assignments** returns:
```json
[{"id": 67890, "name": "Homework 1", "due_at": "2025-02-15T23:59:00Z", "points_possible": 100, "submission_types": ["online_upload"], "html_url": "...", "description": "...", "course_id": 12345}]
```
Note: Assignment descriptions are truncated to 500 characters. The `html_url` field links to the full assignment page in Canvas.
## API Reference (curl)
```bash
# List courses
curl -s -H "Authorization: Bearer $CANVAS_API_TOKEN" \
"$CANVAS_BASE_URL/api/v1/courses?enrollment_state=active&per_page=10"
# List assignments for a course
curl -s -H "Authorization: Bearer $CANVAS_API_TOKEN" \
"$CANVAS_BASE_URL/api/v1/courses/COURSE_ID/assignments?per_page=10&order_by=due_at"
```
Canvas uses `Link` headers for pagination. The Python script handles pagination automatically.
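If you are writing your own client, note that `requests` already parses the `Link` header into `Response.links`, so the next-page URL can be read directly rather than split by hand. A sketch (the injectable `get` parameter exists only to make the helper testable without a network):

```python
def fetch_all(url, headers, params=None, max_items=200, get=None):
    """Follow rel="next" pages using requests' parsed Link header (resp.links)."""
    if get is None:
        import requests  # deferred so the helper can be tested with a stub
        get = requests.get
    results = []
    while url and len(results) < max_items:
        resp = get(url, headers=headers, params=params, timeout=30)
        resp.raise_for_status()
        results.extend(resp.json())
        url = resp.links.get("next", {}).get("url")
        params = None  # the next URL already carries the query string
    return results[:max_items]
```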
## Rules
- This skill is **read-only** — it only fetches data, never modifies courses or assignments
- On first use, verify auth by running `$CANVAS list_courses` — if it fails with 401, guide the user through setup
- Canvas rate-limits to ~700 requests per 10 minutes; check `X-Rate-Limit-Remaining` header if hitting limits
## Troubleshooting
| Problem | Fix |
|---------|-----|
| 401 Unauthorized | Token invalid or expired — regenerate in Canvas Settings |
| 403 Forbidden | Token lacks permission for this course |
| Empty course list | Try `--enrollment-state active` or omit the flag to see all states |
| Wrong institution | Verify `CANVAS_BASE_URL` matches the URL in your browser |
| Timeout errors | Check network connectivity to your Canvas instance |

@ -0,0 +1,157 @@
#!/usr/bin/env python3
"""Canvas LMS API CLI for Hermes Agent.
A thin CLI wrapper around the Canvas REST API.
Authenticates using a personal access token from environment variables.
Usage:
python canvas_api.py list_courses [--per-page N] [--enrollment-state STATE]
python canvas_api.py list_assignments COURSE_ID [--per-page N] [--order-by FIELD]
"""
import argparse
import json
import os
import sys
import requests
CANVAS_API_TOKEN = os.environ.get("CANVAS_API_TOKEN", "")
CANVAS_BASE_URL = os.environ.get("CANVAS_BASE_URL", "").rstrip("/")
def _check_config():
"""Validate required environment variables are set."""
missing = []
if not CANVAS_API_TOKEN:
missing.append("CANVAS_API_TOKEN")
if not CANVAS_BASE_URL:
missing.append("CANVAS_BASE_URL")
if missing:
print(
f"Missing required environment variables: {', '.join(missing)}\n"
"Set them in ~/.hermes/.env or export them in your shell.\n"
"See the canvas skill SKILL.md for setup instructions.",
file=sys.stderr,
)
sys.exit(1)
def _headers():
return {"Authorization": f"Bearer {CANVAS_API_TOKEN}"}
def _paginated_get(url, params=None, max_items=200):
"""Fetch all pages up to max_items, following Canvas Link headers."""
results = []
while url and len(results) < max_items:
resp = requests.get(url, headers=_headers(), params=params, timeout=30)
resp.raise_for_status()
results.extend(resp.json())
params = None # params are included in the Link URL for subsequent pages
url = None
link = resp.headers.get("Link", "")
for part in link.split(","):
if 'rel="next"' in part:
url = part.split(";")[0].strip().strip("<>")
return results[:max_items]
# =========================================================================
# Commands
# =========================================================================
def list_courses(args):
"""List enrolled courses."""
_check_config()
url = f"{CANVAS_BASE_URL}/api/v1/courses"
params = {"per_page": args.per_page}
if args.enrollment_state:
params["enrollment_state"] = args.enrollment_state
try:
courses = _paginated_get(url, params)
except requests.HTTPError as e:
print(f"API error: {e.response.status_code} {e.response.text}", file=sys.stderr)
sys.exit(1)
output = [
{
"id": c["id"],
"name": c.get("name", ""),
"course_code": c.get("course_code", ""),
"enrollment_term_id": c.get("enrollment_term_id"),
"start_at": c.get("start_at"),
"end_at": c.get("end_at"),
"workflow_state": c.get("workflow_state", ""),
}
for c in courses
]
print(json.dumps(output, indent=2))
def list_assignments(args):
"""List assignments for a course."""
_check_config()
url = f"{CANVAS_BASE_URL}/api/v1/courses/{args.course_id}/assignments"
params = {"per_page": args.per_page}
if args.order_by:
params["order_by"] = args.order_by
try:
assignments = _paginated_get(url, params)
except requests.HTTPError as e:
print(f"API error: {e.response.status_code} {e.response.text}", file=sys.stderr)
sys.exit(1)
output = [
{
"id": a["id"],
"name": a.get("name", ""),
"description": (a.get("description") or "")[:500],
"due_at": a.get("due_at"),
"points_possible": a.get("points_possible"),
"submission_types": a.get("submission_types", []),
"html_url": a.get("html_url", ""),
"course_id": a.get("course_id"),
}
for a in assignments
]
print(json.dumps(output, indent=2))
# =========================================================================
# CLI parser
# =========================================================================
def main():
parser = argparse.ArgumentParser(
description="Canvas LMS API CLI for Hermes Agent"
)
sub = parser.add_subparsers(dest="command", required=True)
# --- list_courses ---
p = sub.add_parser("list_courses", help="List enrolled courses")
p.add_argument("--per-page", type=int, default=50, help="Results per page (default 50)")
p.add_argument(
"--enrollment-state",
default="",
help="Filter by enrollment state (active, invited_or_pending, completed)",
)
p.set_defaults(func=list_courses)
# --- list_assignments ---
p = sub.add_parser("list_assignments", help="List assignments for a course")
p.add_argument("course_id", help="Canvas course ID")
p.add_argument("--per-page", type=int, default=50, help="Results per page (default 50)")
p.add_argument(
"--order-by",
default="",
help="Order by field (due_at, name, position)",
)
p.set_defaults(func=list_assignments)
args = parser.parse_args()
args.func(args)
if __name__ == "__main__":
main()

@ -0,0 +1,324 @@
---
name: memento-flashcards
description: >-
Spaced-repetition flashcard system. Create cards from facts or text,
chat with flashcards using free-text answers graded by the agent,
generate quizzes from YouTube transcripts, review due cards with
adaptive scheduling, and export/import decks as CSV.
version: 1.0.0
author: Memento AI
license: MIT
platforms: [macos, linux]
metadata:
hermes:
tags: [Education, Flashcards, Spaced Repetition, Learning, Quiz, YouTube]
requires_toolsets: [terminal]
category: productivity
---
# Memento Flashcards — Spaced-Repetition Flashcard Skill
## Overview
Memento gives you a local, file-based flashcard system with spaced-repetition scheduling.
Users can chat with their flashcards by answering in free text and having the agent grade the response before scheduling the next review.
Use it whenever the user wants to:
- **Remember a fact** — turn any statement into a Q/A flashcard
- **Study with spaced repetition** — review due cards with adaptive intervals and agent-graded free-text answers
- **Quiz from a YouTube video** — fetch a transcript and generate a 5-question quiz
- **Manage decks** — organise cards into collections, export/import CSV
All card data lives in a single JSON file. No external API keys are required — you (the agent) generate flashcard content and quiz questions directly.
User-facing response style for Memento Flashcards:
- Use plain text only. Do not use Markdown formatting in replies to the user.
- Keep review and quiz feedback brief and neutral. Avoid extra praise, pep, or long explanations.
## When to Use
Use this skill when the user wants to:
- Save facts as flashcards for later review
- Review due cards with spaced repetition
- Generate a quiz from a YouTube video transcript
- Import, export, inspect, or delete flashcard data
Do not use this skill for general Q&A, coding help, or non-memory tasks.
## Quick Reference
| User intent | Action |
|---|---|
| "Remember that X" / "save this as a flashcard" | Generate a Q/A card, call `memento_cards.py add` |
| Sends a fact without mentioning flashcards | Ask "Want me to save this as a Memento flashcard?" — only create if confirmed |
| "Create a flashcard" | Ask for Q, A, collection; call `memento_cards.py add` |
| "Review my cards" | Call `memento_cards.py due`, present cards one-by-one |
| "Quiz me on [YouTube URL]" | Call `youtube_quiz.py fetch VIDEO_ID`, generate 5 questions, call `memento_cards.py add-quiz` |
| "Export my cards" | Call `memento_cards.py export --output PATH` |
| "Import cards from CSV" | Call `memento_cards.py import --file PATH --collection NAME` |
| "Show my stats" | Call `memento_cards.py stats` |
| "Delete a card" | Call `memento_cards.py delete --id ID` |
| "Delete a collection" | Call `memento_cards.py delete-collection --collection NAME` |
## Card Storage
Cards are stored in a JSON file at:
```
~/.hermes/skills/productivity/memento-flashcards/data/cards.json
```
**Never edit this file directly.** Always use `memento_cards.py` subcommands. The script handles atomic writes (write to temp file, then rename) to prevent corruption.
The file is created automatically on first use.
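For reference, the write-to-temp-then-rename pattern looks like this (an illustrative helper, not the script's actual implementation):

```python
import json
import os
import tempfile

def atomic_write_json(path, data):
    """Write JSON to a temp file in the target directory, then rename into place.

    os.replace is atomic on POSIX, so readers never observe a half-written file.
    """
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)  # clean up the temp file on failure
        raise
```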
## Procedure
### Creating Cards from Facts
### Activation Rules
Not every factual statement should become a flashcard. Use this three-tier check:
1. **Explicit intent** — the user mentions "memento", "flashcard", "remember this", "save this card", "add a card", or similar phrasing that clearly requests a flashcard → **create the card directly**, no confirmation needed.
2. **Implicit intent** — the user sends a factual statement without mentioning flashcards (e.g. "The speed of light is 299,792 km/s") → **ask first**: "Want me to save this as a Memento flashcard?" Only create the card if the user confirms.
3. **No intent** — the message is a coding task, a question, instructions, normal conversation, or anything that is clearly not a fact to memorize → **do NOT activate this skill at all**. Let other skills or default behavior handle it.
When activation is confirmed (tier 1 directly, tier 2 after confirmation), generate a flashcard:
**Step 1:** Turn the statement into a Q/A pair. Use this format internally:
```
Turn the factual statement into a front-back pair.
Return exactly two lines:
Q: <question text>
A: <answer text>
Statement: "{statement}"
```
Rules:
- The question should test recall of the key fact
- The answer should be concise and direct
**Step 2:** Call the script to store the card:
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py add \
--question "What year did World War 2 end?" \
--answer "1945" \
--collection "History"
```
If the user doesn't specify a collection, use `"General"` as the default.
The script outputs JSON confirming the created card.
### Manual Card Creation
When the user explicitly asks to create a flashcard, ask them for:
1. The question (front of card)
2. The answer (back of card)
3. The collection name (optional — default to `"General"`)
Then call `memento_cards.py add` as above.
### Reviewing Due Cards
When the user wants to review, fetch all due cards:
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py due
```
This returns a JSON array of cards where `next_review_at <= now`. If a collection filter is needed:
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py due --collection "History"
```
**Review flow (free-text grading):**
Here is an example of the EXACT interaction pattern you must follow. The user answers, you grade them, tell them the correct answer, then rate the card.
**Example interaction:**
> **Agent:** What year did the Berlin Wall fall?
>
> **User:** 1991
>
> **Agent:** Not quite. The Berlin Wall fell in 1989. Next review is tomorrow.
> *(agent calls: memento_cards.py rate --id ABC --rating hard --user-answer "1991")*
>
> Next question: Who was the first person to walk on the moon?
**The rules:**
1. Show only the question. Wait for the user to answer.
2. After receiving their answer, compare it to the expected answer and grade it:
- **correct** → user got the key fact right (even if worded differently)
- **partial** → right track but missing the core detail
- **incorrect** → wrong or off-topic
3. **You MUST tell the user the correct answer and how they did.** Keep it short and plain-text. Use this format:
- correct: "Correct. Answer: {answer}. Next review in 7 days."
- partial: "Close. Answer: {answer}. {what they missed}. Next review in 3 days."
- incorrect: "Not quite. Answer: {answer}. Next review tomorrow."
4. Then call the rate command: correct→easy, partial→good, incorrect→hard.
5. Then show the next question.
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py rate \
--id CARD_ID --rating easy --user-answer "what the user said"
```
**Never skip step 3.** The user must always see the correct answer and feedback before you move on.
If no cards are due, tell the user: "No cards due for review right now. Check back later!"
**Retire override:** At any point the user can say "retire this card" to permanently remove it from reviews. Use `--rating retire` for this.
### Spaced Repetition Algorithm
The rating determines the next review interval:
| Rating | Interval | ease_streak | Status change |
|---|---|---|---|
| **hard** | +1 day | reset to 0 | stays learning |
| **good** | +3 days | reset to 0 | stays learning |
| **easy** | +7 days | +1 | if ease_streak >= 3 → retired |
| **retire** | permanent | reset to 0 | → retired |
- **learning**: card is actively in rotation
- **retired**: card won't appear in reviews (user has mastered it or manually retired it)
- Three consecutive "easy" ratings automatically retire a card
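The table translates into a small update function. A sketch, assuming cards are plain dicts with the fields named above (the exact storage shape is internal to `memento_cards.py`):

```python
from datetime import datetime, timedelta, timezone

INTERVAL_DAYS = {"hard": 1, "good": 3, "easy": 7}

def apply_rating(card, rating, now=None):
    """Apply a review rating: set the next interval, ease_streak, and status."""
    now = now or datetime.now(timezone.utc)
    if rating == "retire":
        card.update(status="retired", ease_streak=0)
        return card
    if rating == "easy":
        card["ease_streak"] = card.get("ease_streak", 0) + 1
        if card["ease_streak"] >= 3:
            card["status"] = "retired"  # three consecutive easies retire the card
            return card
    else:
        card["ease_streak"] = 0  # hard/good reset the streak
    card["status"] = "learning"
    card["next_review_at"] = (now + timedelta(days=INTERVAL_DAYS[rating])).isoformat()
    return card
```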
### YouTube Quiz Generation
When the user sends a YouTube URL and wants a quiz:
**Step 1:** Extract the video ID from the URL (e.g. `dQw4w9WgXcQ` from `https://www.youtube.com/watch?v=dQw4w9WgXcQ`).
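The extraction can be done with the standard library alone; a sketch covering the common URL shapes (`watch?v=`, `youtu.be`, `/embed/`, `/shorts/`):

```python
import re
from urllib.parse import urlparse, parse_qs

def extract_video_id(url):
    """Return the 11-character YouTube video ID, or None if not found."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/") or None
    qs = parse_qs(parsed.query)
    if "v" in qs:
        return qs["v"][0]
    m = re.search(r"/(?:embed|shorts|live)/([A-Za-z0-9_-]{11})", parsed.path)
    return m.group(1) if m else None
```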
**Step 2:** Fetch the transcript:
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/youtube_quiz.py fetch VIDEO_ID
```
This returns `{"title": "...", "transcript": "..."}` or an error.
If the script reports `missing_dependency`, tell the user to install it:
```bash
pip install youtube-transcript-api
```
**Step 3:** Generate 5 quiz questions from the transcript. Use these rules:
```
You are creating a 5-question quiz for a podcast episode.
Return ONLY a JSON array with exactly 5 objects.
Each object must contain keys 'question' and 'answer'.
Selection criteria:
- Prioritize important, surprising, or foundational facts.
- Skip filler, obvious details, and facts that require heavy context.
- Never return true/false questions.
- Never ask only for a date.
Question rules:
- Each question must test exactly one discrete fact.
- Use clear, unambiguous wording.
- Prefer What, Who, How many, Which.
- Avoid open-ended Describe or Explain prompts.
Answer rules:
- Each answer must be under 240 characters.
- Lead with the answer itself, not preamble.
- Add only minimal clarifying detail if needed.
```
Use the first 15,000 characters of the transcript as context. Generate the questions yourself (you are the LLM).
**Step 4:** Validate the output is valid JSON with exactly 5 items, each having non-empty `question` and `answer` strings. If validation fails, retry once.
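That validation is a plain shape check; one way to write it (helper name is illustrative):

```python
import json

def validate_quiz(raw):
    """Return the parsed quiz list if it matches the required shape, else None."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(items, list) or len(items) != 5:
        return None
    for item in items:
        if not isinstance(item, dict):
            return None
        q, a = item.get("question"), item.get("answer")
        if not (isinstance(q, str) and q.strip() and isinstance(a, str) and a.strip()):
            return None
    return items
```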
**Step 5:** Store quiz cards:
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py add-quiz \
--video-id "VIDEO_ID" \
--questions '[{"question":"...","answer":"..."},...]' \
--collection "Quiz - Episode Title"
```
The script deduplicates by `video_id` — if cards for that video already exist, it skips creation and reports the existing cards.
**Step 6:** Present questions one-by-one using the same free-text grading flow:
1. Show "Question 1/5: ..." without the answer or any hint that could reveal it
2. Wait for the user to answer in their own words
3. Grade their answer using the grading prompt (see "Reviewing Due Cards" section)
4. **IMPORTANT: You MUST reply to the user with feedback before doing anything else.** Show the grade, the correct answer, and when the card is next due. Do NOT silently skip to the next question. Keep it short and plain-text. Example: "Not quite. Answer: {answer}. Next review tomorrow."
5. **After showing feedback**, call the rate command and then show the next question in the same message:
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py rate \
    --id CARD_ID --rating GRADE --user-answer "what the user said"
```
where `GRADE` is the grade from step 3: `easy`, `good`, or `hard`.
6. Repeat. Every answer MUST receive visible feedback before the next question.
### Export/Import CSV
**Export:**
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py export \
--output ~/flashcards.csv
```
Produces a 3-column CSV: `question,answer,collection` (no header row).
**Import:**
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py import \
--file ~/flashcards.csv \
--collection "Imported"
```
Reads a CSV with columns: question, answer, and optionally collection (column 3). If the collection column is missing, uses the `--collection` argument.
### Statistics
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py stats
```
Returns JSON with:
- `total`: total card count
- `learning`: cards in active rotation
- `retired`: mastered cards
- `due_now`: cards due for review right now
- `collections`: breakdown by collection name
## Pitfalls
- **Never edit `cards.json` directly** — always use the script subcommands to avoid corruption
- **Transcript failures** — some YouTube videos have no English transcript or have transcripts disabled; inform the user and suggest another video
- **Optional dependency** — `youtube_quiz.py` needs `youtube-transcript-api`; if missing, tell the user to run `pip install youtube-transcript-api`
- **Large imports** — CSV imports with thousands of rows work fine but the JSON output may be verbose; summarize the result for the user
- **Video ID extraction** — support both `youtube.com/watch?v=ID` and `youtu.be/ID` URL formats
## Verification
Verify the helper scripts directly:
```bash
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py stats
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py add --question "Capital of France?" --answer "Paris" --collection "General"
python3 ~/.hermes/skills/productivity/memento-flashcards/scripts/memento_cards.py due
```
If you are testing from the repo checkout, run:
```bash
pytest tests/skills/test_memento_cards.py tests/skills/test_youtube_quiz.py -q
```
Agent-level verification:
- Start a review and confirm feedback is plain text, brief, and always includes the correct answer before the next card
- Run a YouTube quiz flow and confirm each answer receives visible feedback before the next question


@ -0,0 +1,353 @@
#!/usr/bin/env python3
"""Memento card storage, spaced-repetition engine, and CSV I/O.
Stdlib-only. All output is JSON for agent parsing.
Data file: $HERMES_HOME/skills/productivity/memento-flashcards/data/cards.json
"""
import argparse
import csv
import json
import os
import sys
import tempfile
import uuid
from datetime import datetime, timedelta, timezone
from pathlib import Path
_HERMES_HOME = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
DATA_DIR = _HERMES_HOME / "skills" / "productivity" / "memento-flashcards" / "data"
CARDS_FILE = DATA_DIR / "cards.json"
RETIRED_SENTINEL = "9999-12-31T23:59:59+00:00"
def _now() -> datetime:
return datetime.now(timezone.utc)
def _iso(dt: datetime) -> str:
return dt.isoformat()
def _parse_iso(s: str) -> datetime:
return datetime.fromisoformat(s)
def _empty_store() -> dict:
return {"cards": [], "version": 1}
def _load() -> dict:
if not CARDS_FILE.exists():
return _empty_store()
try:
with open(CARDS_FILE, "r", encoding="utf-8") as f:
data = json.load(f)
if not isinstance(data, dict) or "cards" not in data:
return _empty_store()
return data
except (json.JSONDecodeError, OSError):
return _empty_store()
def _save(data: dict) -> None:
DATA_DIR.mkdir(parents=True, exist_ok=True)
fd, tmp = tempfile.mkstemp(dir=DATA_DIR, suffix=".tmp")
try:
with os.fdopen(fd, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2, ensure_ascii=False)
f.write("\n")
os.replace(tmp, CARDS_FILE)
except BaseException:
try:
os.unlink(tmp)
except OSError:
pass
raise
def _out(obj: object) -> None:
json.dump(obj, sys.stdout, indent=2, ensure_ascii=False)
sys.stdout.write("\n")
# ── Subcommands ──────────────────────────────────────────────────────────────
def cmd_add(args: argparse.Namespace) -> None:
data = _load()
now = _now()
card = {
"id": str(uuid.uuid4()),
"question": args.question,
"answer": args.answer,
"collection": args.collection or "General",
"status": "learning",
"ease_streak": 0,
"next_review_at": _iso(now),
"created_at": _iso(now),
"video_id": None,
"last_user_answer": None,
}
data["cards"].append(card)
_save(data)
_out({"ok": True, "card": card})
def cmd_add_quiz(args: argparse.Namespace) -> None:
data = _load()
now = _now()
try:
questions = json.loads(args.questions)
except json.JSONDecodeError as exc:
_out({"ok": False, "error": f"Invalid JSON for --questions: {exc}"})
sys.exit(1)
# Dedup: skip if cards with this video_id already exist
existing_ids = {c["video_id"] for c in data["cards"] if c.get("video_id")}
if args.video_id in existing_ids:
existing = [c for c in data["cards"] if c.get("video_id") == args.video_id]
_out({"ok": True, "skipped": True, "reason": "duplicate_video_id", "existing_count": len(existing), "cards": existing})
return
created = []
for qa in questions:
card = {
"id": str(uuid.uuid4()),
"question": qa["question"],
"answer": qa["answer"],
"collection": args.collection or "Quiz",
"status": "learning",
"ease_streak": 0,
"next_review_at": _iso(now),
"created_at": _iso(now),
"video_id": args.video_id,
"last_user_answer": None,
}
data["cards"].append(card)
created.append(card)
_save(data)
_out({"ok": True, "created_count": len(created), "cards": created})
def cmd_due(args: argparse.Namespace) -> None:
data = _load()
now = _now()
due = []
for card in data["cards"]:
if card["status"] == "retired":
continue
review_at = _parse_iso(card["next_review_at"])
if review_at <= now:
if args.collection and card["collection"] != args.collection:
continue
due.append(card)
_out({"ok": True, "count": len(due), "cards": due})
def cmd_rate(args: argparse.Namespace) -> None:
data = _load()
now = _now()
card = None
for c in data["cards"]:
if c["id"] == args.id:
card = c
break
if not card:
_out({"ok": False, "error": f"Card not found: {args.id}"})
sys.exit(1)
rating = args.rating
user_answer = getattr(args, "user_answer", None)
if user_answer is not None:
card["last_user_answer"] = user_answer
if rating == "retire":
card["status"] = "retired"
card["next_review_at"] = RETIRED_SENTINEL
card["ease_streak"] = 0
elif rating == "hard":
card["next_review_at"] = _iso(now + timedelta(days=1))
card["ease_streak"] = 0
elif rating == "good":
card["next_review_at"] = _iso(now + timedelta(days=3))
card["ease_streak"] = 0
elif rating == "easy":
card["next_review_at"] = _iso(now + timedelta(days=7))
card["ease_streak"] = card.get("ease_streak", 0) + 1
if card["ease_streak"] >= 3:
card["status"] = "retired"
_save(data)
_out({"ok": True, "card": card})
def cmd_list(args: argparse.Namespace) -> None:
data = _load()
cards = data["cards"]
if args.collection:
cards = [c for c in cards if c["collection"] == args.collection]
if args.status:
cards = [c for c in cards if c["status"] == args.status]
_out({"ok": True, "count": len(cards), "cards": cards})
def cmd_stats(args: argparse.Namespace) -> None:
data = _load()
now = _now()
total = len(data["cards"])
learning = sum(1 for c in data["cards"] if c["status"] == "learning")
retired = sum(1 for c in data["cards"] if c["status"] == "retired")
due_now = 0
for c in data["cards"]:
if c["status"] != "retired" and _parse_iso(c["next_review_at"]) <= now:
due_now += 1
collections: dict[str, int] = {}
for c in data["cards"]:
name = c["collection"]
collections[name] = collections.get(name, 0) + 1
_out({
"ok": True,
"total": total,
"learning": learning,
"retired": retired,
"due_now": due_now,
"collections": collections,
})
def cmd_export(args: argparse.Namespace) -> None:
data = _load()
output_path = Path(args.output).expanduser()
with open(output_path, "w", newline="", encoding="utf-8") as f:
writer = csv.writer(f, lineterminator="\n")
for card in data["cards"]:
writer.writerow([card["question"], card["answer"], card["collection"]])
_out({"ok": True, "exported": len(data["cards"]), "path": str(output_path)})
def cmd_import(args: argparse.Namespace) -> None:
data = _load()
now = _now()
file_path = Path(args.file).expanduser()
if not file_path.exists():
_out({"ok": False, "error": f"File not found: {file_path}"})
sys.exit(1)
created = 0
with open(file_path, "r", encoding="utf-8") as f:
reader = csv.reader(f)
for row in reader:
if len(row) < 2:
continue
question = row[0].strip()
answer = row[1].strip()
collection = row[2].strip() if len(row) >= 3 and row[2].strip() else (args.collection or "Imported")
if not question or not answer:
continue
card = {
"id": str(uuid.uuid4()),
"question": question,
"answer": answer,
"collection": collection,
"status": "learning",
"ease_streak": 0,
"next_review_at": _iso(now),
"created_at": _iso(now),
"video_id": None,
"last_user_answer": None,
}
data["cards"].append(card)
created += 1
_save(data)
_out({"ok": True, "imported": created})
def cmd_delete(args: argparse.Namespace) -> None:
data = _load()
original = len(data["cards"])
data["cards"] = [c for c in data["cards"] if c["id"] != args.id]
removed = original - len(data["cards"])
if removed == 0:
_out({"ok": False, "error": f"Card not found: {args.id}"})
sys.exit(1)
_save(data)
_out({"ok": True, "deleted": args.id})
def cmd_delete_collection(args: argparse.Namespace) -> None:
data = _load()
original = len(data["cards"])
data["cards"] = [c for c in data["cards"] if c["collection"] != args.collection]
removed = original - len(data["cards"])
_save(data)
_out({"ok": True, "deleted_count": removed, "collection": args.collection})
# ── CLI ──────────────────────────────────────────────────────────────────────
def main() -> None:
parser = argparse.ArgumentParser(description="Memento flashcard manager")
sub = parser.add_subparsers(dest="command", required=True)
p_add = sub.add_parser("add", help="Create one card")
p_add.add_argument("--question", required=True)
p_add.add_argument("--answer", required=True)
p_add.add_argument("--collection", default="General")
p_quiz = sub.add_parser("add-quiz", help="Batch-add quiz cards")
p_quiz.add_argument("--video-id", required=True)
p_quiz.add_argument("--questions", required=True, help="JSON array of {question, answer}")
p_quiz.add_argument("--collection", default="Quiz")
p_due = sub.add_parser("due", help="List due cards")
p_due.add_argument("--collection", default=None)
p_rate = sub.add_parser("rate", help="Rate a card")
p_rate.add_argument("--id", required=True)
p_rate.add_argument("--rating", required=True, choices=["easy", "good", "hard", "retire"])
p_rate.add_argument("--user-answer", default=None)
p_list = sub.add_parser("list", help="List cards")
p_list.add_argument("--collection", default=None)
p_list.add_argument("--status", default=None, choices=["learning", "retired"])
sub.add_parser("stats", help="Show statistics")
p_export = sub.add_parser("export", help="Export cards to CSV")
p_export.add_argument("--output", required=True)
p_import = sub.add_parser("import", help="Import cards from CSV")
p_import.add_argument("--file", required=True)
p_import.add_argument("--collection", default="Imported")
p_del = sub.add_parser("delete", help="Delete one card")
p_del.add_argument("--id", required=True)
p_delcol = sub.add_parser("delete-collection", help="Delete all cards in a collection")
p_delcol.add_argument("--collection", required=True)
args = parser.parse_args()
cmd_map = {
"add": cmd_add,
"add-quiz": cmd_add_quiz,
"due": cmd_due,
"rate": cmd_rate,
"list": cmd_list,
"stats": cmd_stats,
"export": cmd_export,
"import": cmd_import,
"delete": cmd_delete,
"delete-collection": cmd_delete_collection,
}
cmd_map[args.command](args)
if __name__ == "__main__":
main()


@ -0,0 +1,88 @@
#!/usr/bin/env python3
"""Fetch YouTube transcripts for Memento quiz generation.
Requires: pip install youtube-transcript-api
The quiz question *generation* is done by the agent's LLM — this script only fetches transcripts.
"""
import argparse
import json
import re
import sys
def _out(obj: object) -> None:
json.dump(obj, sys.stdout, indent=2, ensure_ascii=False)
sys.stdout.write("\n")
def _normalize_segments(segments: list) -> str:
parts = []
for seg in segments:
text = str(seg.get("text", "")).strip()
if text:
parts.append(text)
return re.sub(r"\s+", " ", " ".join(parts)).strip()
def cmd_fetch(args: argparse.Namespace) -> None:
try:
        import youtube_transcript_api
except ImportError:
_out({
"ok": False,
"error": "missing_dependency",
"message": "Run: pip install youtube-transcript-api",
})
sys.exit(1)
video_id = args.video_id
languages = ["en", "en-US", "en-GB", "en-CA", "en-AU"]
api = youtube_transcript_api.YouTubeTranscriptApi()
try:
raw = api.fetch(video_id, languages=languages)
except Exception as exc:
error_type = type(exc).__name__
_out({
"ok": False,
"error": "transcript_unavailable",
"error_type": error_type,
"message": f"Could not fetch transcript for {video_id}: {exc}",
})
sys.exit(1)
segments = raw
if hasattr(raw, "to_raw_data"):
segments = raw.to_raw_data()
text = _normalize_segments(segments)
if not text:
_out({
"ok": False,
"error": "empty_transcript",
"message": f"Transcript for {video_id} contained no usable text.",
})
sys.exit(1)
_out({
"ok": True,
"video_id": video_id,
"transcript": text,
})
def main() -> None:
parser = argparse.ArgumentParser(description="Memento YouTube transcript fetcher")
sub = parser.add_subparsers(dest="command", required=True)
p_fetch = sub.add_parser("fetch", help="Fetch transcript for a video")
p_fetch.add_argument("video_id", help="YouTube video ID")
args = parser.parse_args()
if args.command == "fetch":
cmd_fetch(args)
if __name__ == "__main__":
main()


@ -0,0 +1,297 @@
---
name: siyuan
description: SiYuan Note API for searching, reading, creating, and managing blocks and documents in a self-hosted knowledge base via curl.
version: 1.0.0
author: FEUAZUR
license: MIT
metadata:
hermes:
tags: [SiYuan, Notes, Knowledge Base, PKM, API]
related_skills: [obsidian, notion]
homepage: https://github.com/siyuan-note/siyuan
prerequisites:
env_vars: [SIYUAN_TOKEN]
commands: [curl, jq]
required_environment_variables:
- name: SIYUAN_TOKEN
prompt: SiYuan API token
help: "Settings > About in SiYuan desktop app"
- name: SIYUAN_URL
prompt: SiYuan instance URL (default http://127.0.0.1:6806)
required_for: remote instances
---
# SiYuan Note API
Use the [SiYuan](https://github.com/siyuan-note/siyuan) kernel API via curl to search, read, create, update, and delete blocks and documents in a self-hosted knowledge base. No extra tools needed -- just curl (plus jq for filtering output) and an API token.
## Prerequisites
1. Install and run SiYuan (desktop or Docker)
2. Get your API token: **Settings > About > API token**
3. Store it in `~/.hermes/.env`:
```
SIYUAN_TOKEN=your_token_here
SIYUAN_URL=http://127.0.0.1:6806
```
`SIYUAN_URL` defaults to `http://127.0.0.1:6806` if not set.
## API Basics
All SiYuan API calls are **POST with JSON body**. Every request follows this pattern:
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/..." \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"param": "value"}'
```
Responses are JSON with this structure:
```json
{"code": 0, "msg": "", "data": { ... }}
```
`code: 0` means success. Any other value is an error -- check `msg` for details.
**ID format:** SiYuan IDs look like `20210808180117-6v0mkxr` (14-digit timestamp + 7 alphanumeric chars).
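If you script these calls from Python instead of raw curl, the two checks above (non-zero `code` and ID shape) can be sketched like this; the helper names are illustrative:

```python
import json
import re

# 14-digit timestamp + 7 lowercase alphanumeric chars, e.g. 20210808180117-6v0mkxr
SIYUAN_ID = re.compile(r"^\d{14}-[0-9a-z]{7}$")

def check_response(body: str) -> object:
    """Return .data from a SiYuan API response, raising if code != 0."""
    resp = json.loads(body)
    if resp.get("code") != 0:
        raise RuntimeError(f"SiYuan API error: {resp.get('msg')!r}")
    return resp.get("data")

print(bool(SIYUAN_ID.match("20210808180117-6v0mkxr")))  # True
print(check_response('{"code": 0, "msg": "", "data": {"ok": true}}'))  # {'ok': True}
```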
## Quick Reference
| Operation | Endpoint |
|-----------|----------|
| Full-text search | `/api/search/fullTextSearchBlock` |
| SQL query | `/api/query/sql` |
| Read block | `/api/block/getBlockKramdown` |
| Read children | `/api/block/getChildBlocks` |
| Get path | `/api/filetree/getHPathByID` |
| Get attributes | `/api/attr/getBlockAttrs` |
| List notebooks | `/api/notebook/lsNotebooks` |
| List documents | `/api/filetree/listDocsByPath` |
| Create notebook | `/api/notebook/createNotebook` |
| Create document | `/api/filetree/createDocWithMd` |
| Append block | `/api/block/appendBlock` |
| Update block | `/api/block/updateBlock` |
| Rename document | `/api/filetree/renameDocByID` |
| Set attributes | `/api/attr/setBlockAttrs` |
| Delete block | `/api/block/deleteBlock` |
| Delete document | `/api/filetree/removeDocByID` |
| Export as Markdown | `/api/export/exportMdContent` |
## Common Operations
### Search (Full-Text)
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/search/fullTextSearchBlock" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"query": "meeting notes", "page": 0}' | jq '.data.blocks[:5]'
```
### Search (SQL)
Query the blocks database directly. Only SELECT statements are safe.
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/query/sql" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"stmt": "SELECT id, content, type, box FROM blocks WHERE content LIKE '\''%keyword%'\'' AND type='\''p'\'' LIMIT 20"}' | jq '.data'
```
Useful columns: `id`, `parent_id`, `root_id`, `box` (notebook ID), `path`, `content`, `type`, `subtype`, `created`, `updated`.
### Read Block Content
Returns block content in Kramdown (Markdown-like) format.
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/getBlockKramdown" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "20210808180117-6v0mkxr"}' | jq '.data.kramdown'
```
### Read Child Blocks
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/getChildBlocks" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "20210808180117-6v0mkxr"}' | jq '.data'
```
### Get Human-Readable Path
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/filetree/getHPathByID" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "20210808180117-6v0mkxr"}' | jq '.data'
```
### Get Block Attributes
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/attr/getBlockAttrs" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "20210808180117-6v0mkxr"}' | jq '.data'
```
### List Notebooks
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/notebook/lsNotebooks" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{}' | jq '.data.notebooks[] | {id, name, closed}'
```
### List Documents in a Notebook
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/filetree/listDocsByPath" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"notebook": "NOTEBOOK_ID", "path": "/"}' | jq '.data.files[] | {id, name}'
```
### Create a Document
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/filetree/createDocWithMd" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"notebook": "NOTEBOOK_ID",
"path": "/Meeting Notes/2026-03-22",
"markdown": "# Meeting Notes\n\n- Discussed project timeline\n- Assigned tasks"
}' | jq '.data'
```
### Create a Notebook
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/notebook/createNotebook" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"name": "My New Notebook"}' | jq '.data.notebook.id'
```
### Append Block to Document
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/appendBlock" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"parentID": "DOCUMENT_OR_BLOCK_ID",
"data": "New paragraph added at the end.",
"dataType": "markdown"
}' | jq '.data'
```
Also available: `/api/block/prependBlock` (same params, inserts at the beginning) and `/api/block/insertBlock` (uses `previousID` instead of `parentID` to insert after a specific block).
### Update Block Content
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/updateBlock" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"id": "BLOCK_ID",
"data": "Updated content here.",
"dataType": "markdown"
}' | jq '.data'
```
### Rename a Document
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/filetree/renameDocByID" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "DOCUMENT_ID", "title": "New Title"}'
```
### Set Block Attributes
Custom attributes must be prefixed with `custom-`:
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/attr/setBlockAttrs" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"id": "BLOCK_ID",
"attrs": {
"custom-status": "reviewed",
"custom-priority": "high"
}
}'
```
### Delete a Block
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/block/deleteBlock" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "BLOCK_ID"}'
```
To delete a whole document: use `/api/filetree/removeDocByID` with `{"id": "DOC_ID"}`.
To delete a notebook: use `/api/notebook/removeNotebook` with `{"notebook": "NOTEBOOK_ID"}`.
### Export Document as Markdown
```bash
curl -s -X POST "${SIYUAN_URL:-http://127.0.0.1:6806}/api/export/exportMdContent" \
-H "Authorization: Token $SIYUAN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "DOCUMENT_ID"}' | jq -r '.data.content'
```
## Block Types
Common `type` values in SQL queries:
| Type | Description |
|------|-------------|
| `d` | Document (root block) |
| `p` | Paragraph |
| `h` | Heading |
| `l` | List |
| `i` | List item |
| `c` | Code block |
| `m` | Math block |
| `t` | Table |
| `b` | Blockquote |
| `s` | Super block |
| `html` | HTML block |
## Pitfalls
- **All endpoints are POST** -- even read-only operations. Do not use GET.
- **SQL safety**: only use SELECT queries. INSERT/UPDATE/DELETE/DROP are dangerous and should never be sent.
- **ID validation**: IDs match the pattern `YYYYMMDDHHmmss-xxxxxxx`. Reject anything else.
- **Error responses**: always check `code != 0` in responses before processing `data`.
- **Large documents**: block content and export results can be very large. Use `LIMIT` in SQL and pipe through `jq` to extract only what you need.
- **Notebook IDs**: when working with a specific notebook, get its ID first via `lsNotebooks`.
## Alternative: MCP Server
If you prefer a native integration instead of curl, install the SiYuan MCP server:
```yaml
# In ~/.hermes/config.yaml under mcp_servers:
mcp_servers:
siyuan:
command: npx
args: ["-y", "@porkll/siyuan-mcp"]
env:
SIYUAN_TOKEN: "your_token"
SIYUAN_URL: "http://127.0.0.1:6806"
```


@ -0,0 +1,335 @@
---
name: scrapling
description: Web scraping with Scrapling - HTTP fetching, stealth browser automation, Cloudflare bypass, and spider crawling via CLI and Python.
version: 1.0.0
author: FEUAZUR
license: MIT
metadata:
hermes:
tags: [Web Scraping, Browser, Cloudflare, Stealth, Crawling, Spider]
related_skills: [duckduckgo-search, domain-intel]
homepage: https://github.com/D4Vinci/Scrapling
prerequisites:
commands: [scrapling, python]
---
# Scrapling
[Scrapling](https://github.com/D4Vinci/Scrapling) is a web scraping framework with anti-bot bypass, stealth browser automation, and a spider framework. It provides three fetching strategies (HTTP, dynamic JS, stealth/Cloudflare) and a full CLI.
**This skill is for educational and research purposes only.** Users must comply with local/international data scraping laws and respect website Terms of Service.
## When to Use
- Scraping static HTML pages (faster than browser tools)
- Scraping JS-rendered pages that need a real browser
- Bypassing Cloudflare Turnstile or bot detection
- Crawling multiple pages with a spider
- When the built-in `web_extract` tool does not return the data you need
## Installation
```bash
pip install "scrapling[all]"
scrapling install
```
Minimal install (HTTP only, no browser):
```bash
pip install scrapling
```
With browser automation only:
```bash
pip install "scrapling[fetchers]"
scrapling install
```
## Quick Reference
| Approach | Class | Use When |
|----------|-------|----------|
| HTTP | `Fetcher` / `FetcherSession` | Static pages, APIs, fast bulk requests |
| Dynamic | `DynamicFetcher` / `DynamicSession` | JS-rendered content, SPAs |
| Stealth | `StealthyFetcher` / `StealthySession` | Cloudflare, anti-bot protected sites |
| Spider | `Spider` | Multi-page crawling with link following |
## CLI Usage
### Extract Static Page
```bash
scrapling extract get 'https://example.com' output.md
```
With CSS selector and browser impersonation:
```bash
scrapling extract get 'https://example.com' output.md \
--css-selector '.content' \
--impersonate 'chrome'
```
### Extract JS-Rendered Page
```bash
scrapling extract fetch 'https://example.com' output.md \
--css-selector '.dynamic-content' \
--disable-resources \
--network-idle
```
### Extract Cloudflare-Protected Page
```bash
scrapling extract stealthy-fetch 'https://protected-site.com' output.html \
--solve-cloudflare \
--block-webrtc \
--hide-canvas
```
### POST Request
```bash
scrapling extract post 'https://example.com/api' output.json \
--json '{"query": "search term"}'
```
### Output Formats
The output format is determined by the file extension:
- `.html` -- raw HTML
- `.md` -- converted to Markdown
- `.txt` -- plain text
- `.json` / `.jsonl` -- JSON
## Python: HTTP Scraping
### Single Request
```python
from scrapling.fetchers import Fetcher
page = Fetcher.get('https://quotes.toscrape.com/')
quotes = page.css('.quote .text::text').getall()
for q in quotes:
print(q)
```
### Session (Persistent Cookies)
```python
from scrapling.fetchers import FetcherSession
with FetcherSession(impersonate='chrome') as session:
page = session.get('https://example.com/', stealthy_headers=True)
links = page.css('a::attr(href)').getall()
for link in links[:5]:
sub = session.get(link)
print(sub.css('h1::text').get())
```
### POST / PUT / DELETE
```python
page = Fetcher.post('https://api.example.com/data', json={"key": "value"})
page = Fetcher.put('https://api.example.com/item/1', data={"name": "updated"})
page = Fetcher.delete('https://api.example.com/item/1')
```
### With Proxy
```python
page = Fetcher.get('https://example.com', proxy='http://user:pass@proxy:8080')
```
## Python: Dynamic Pages (JS-Rendered)
For pages that require JavaScript execution (SPAs, lazy-loaded content):
```python
from scrapling.fetchers import DynamicFetcher
page = DynamicFetcher.fetch('https://example.com', headless=True)
data = page.css('.js-loaded-content::text').getall()
```
### Wait for Specific Element
```python
page = DynamicFetcher.fetch(
'https://example.com',
wait_selector=('.results', 'visible'),
network_idle=True,
)
```
### Disable Resources for Speed
Blocks fonts, images, media, stylesheets (~25% faster):
```python
from scrapling.fetchers import DynamicSession
with DynamicSession(headless=True, disable_resources=True, network_idle=True) as session:
page = session.fetch('https://example.com')
items = page.css('.item::text').getall()
```
### Custom Page Automation
```python
from playwright.sync_api import Page
from scrapling.fetchers import DynamicFetcher
def scroll_and_click(page: Page):
page.mouse.wheel(0, 3000)
page.wait_for_timeout(1000)
page.click('button.load-more')
page.wait_for_selector('.extra-results')
page = DynamicFetcher.fetch('https://example.com', page_action=scroll_and_click)
results = page.css('.extra-results .item::text').getall()
```
## Python: Stealth Mode (Anti-Bot Bypass)
For Cloudflare-protected or heavily fingerprinted sites:
```python
from scrapling.fetchers import StealthyFetcher
page = StealthyFetcher.fetch(
'https://protected-site.com',
headless=True,
solve_cloudflare=True,
block_webrtc=True,
hide_canvas=True,
)
content = page.css('.protected-content::text').getall()
```
### Stealth Session
```python
from scrapling.fetchers import StealthySession
with StealthySession(headless=True, solve_cloudflare=True) as session:
page1 = session.fetch('https://protected-site.com/page1')
page2 = session.fetch('https://protected-site.com/page2')
```
## Element Selection
All fetchers return a `Selector` object with these methods:
### CSS Selectors
```python
page.css('h1::text').get() # First h1 text
page.css('a::attr(href)').getall() # All link hrefs
page.css('.quote .text::text').getall() # Nested selection
```
### XPath
```python
page.xpath('//div[@class="content"]/text()').getall()
page.xpath('//a/@href').getall()
```
### Find Methods
```python
page.find_all('div', class_='quote') # By tag + attribute
page.find_by_text('Read more', tag='a') # By text content
page.find_by_regex(r'\$\d+\.\d{2}') # By regex pattern
```
### Similar Elements
Find elements with similar structure (useful for product listings, etc.):
```python
first_product = page.css('.product')[0]
all_similar = first_product.find_similar()
```
### Navigation
```python
el = page.css('.target')[0]
el.parent # Parent element
el.children # Child elements
el.next_sibling # Next sibling
el.prev_sibling # Previous sibling
```
## Python: Spider Framework
For multi-page crawling with link following:
```python
from scrapling.spiders import Spider, Request, Response
class QuotesSpider(Spider):
name = "quotes"
start_urls = ["https://quotes.toscrape.com/"]
concurrent_requests = 10
download_delay = 1
async def parse(self, response: Response):
for quote in response.css('.quote'):
yield {
"text": quote.css('.text::text').get(),
"author": quote.css('.author::text').get(),
"tags": quote.css('.tag::text').getall(),
}
next_page = response.css('.next a::attr(href)').get()
if next_page:
yield response.follow(next_page)
result = QuotesSpider().start()
print(f"Scraped {len(result.items)} quotes")
result.items.to_json("quotes.json")
```
### Multi-Session Spider
Route requests to different fetcher types:
```python
from scrapling.fetchers import FetcherSession, AsyncStealthySession
class SmartSpider(Spider):
name = "smart"
start_urls = ["https://example.com/"]
def configure_sessions(self, manager):
manager.add("fast", FetcherSession(impersonate="chrome"))
manager.add("stealth", AsyncStealthySession(headless=True), lazy=True)
async def parse(self, response: Response):
for link in response.css('a::attr(href)').getall():
if "protected" in link:
yield Request(link, sid="stealth")
else:
yield Request(link, sid="fast", callback=self.parse)
```
### Pause/Resume Crawling
```python
spider = QuotesSpider(crawldir="./crawl_checkpoint")
spider.start() # Ctrl+C to pause, re-run to resume from checkpoint
```
## Pitfalls
- **Browser install required**: run `scrapling install` after `pip install` -- without it, `DynamicFetcher` and `StealthyFetcher` will fail
- **Timeouts**: `DynamicFetcher`/`StealthyFetcher` take `timeout` in **milliseconds** (default 30000), while `Fetcher` takes it in **seconds**
- **Cloudflare bypass**: `solve_cloudflare=True` adds 5-15 seconds to fetch time -- only enable when needed
- **Resource usage**: StealthyFetcher runs a real browser -- limit concurrent usage
- **Legal**: always check robots.txt and website ToS before scraping. This library is for educational and research purposes
- **Python version**: requires Python 3.10+


@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent"
version = "0.4.0"
version = "0.5.0"
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
readme = "README.md"
requires-python = ">=3.11"
@ -26,6 +26,7 @@ dependencies = [
# Interactive CLI (prompt_toolkit is used directly by cli.py)
"prompt_toolkit>=3.0.52,<4",
# Tools
"exa-py>=2.9.0,<3",
"firecrawl-py>=4.16.0,<5",
"parallel-web>=0.4.2,<1",
"fal-client>=0.13.1,<1",
@ -37,7 +38,7 @@ dependencies = [
]
[project.optional-dependencies]
modal = ["swe-rex[modal]>=1.4.0,<2"]
modal = ["modal>=1.0.0,<2"]
daytona = ["daytona>=0.148.0,<1"]
dev = ["debugpy>=1.8.0,<2", "pytest>=9.0.2,<10", "pytest-asyncio>=1.3.0,<2", "pytest-xdist>=3.0,<4", "mcp>=1.2.0,<2"]
messaging = ["python-telegram-bot>=22.6,<23", "discord.py[voice]>=2.7.1,<3", "aiohttp>=3.13.3,<4", "slack-bolt>=1.18.0,<2", "slack-sdk>=3.27.0,<4"]
@ -55,8 +56,9 @@ honcho = ["honcho-ai>=2.0.1,<3"]
mcp = ["mcp>=1.2.0,<2"]
homeassistant = ["aiohttp>=3.9.0,<4"]
sms = ["aiohttp>=3.9.0,<4"]
acp = ["agent-client-protocol>=0.8.1,<1.0"]
acp = ["agent-client-protocol>=0.8.1,<0.9"]
dingtalk = ["dingtalk-stream>=0.1.0,<1"]
feishu = ["lark-oapi>=1.5.3,<2"]
rl = [
    "atroposlib @ git+https://github.com/NousResearch/atropos.git",
    "tinker @ git+https://github.com/thinking-machines-lab/tinker.git",
@ -82,6 +84,7 @@ all = [
"hermes-agent[acp]",
"hermes-agent[voice]",
"hermes-agent[dingtalk]",
"hermes-agent[feishu]",
]
[project.scripts]

File diff suppressed because it is too large


@ -2,7 +2,7 @@
# Kill all running Modal apps (sandboxes, deployments, etc.)
#
# Usage:
# bash scripts/kill_modal.sh # Stop swe-rex (the sandbox app)
# bash scripts/kill_modal.sh # Stop hermes-agent sandboxes
# bash scripts/kill_modal.sh --all # Stop ALL Modal apps
set -uo pipefail
@ -17,10 +17,10 @@ if [[ "${1:-}" == "--all" ]]; then
modal app stop "$app_id" 2>/dev/null || true
done
else
echo "Stopping swe-rex sandboxes..."
APPS=$(echo "$APP_LIST" | grep 'swe-rex' | grep -oE 'ap-[A-Za-z0-9]+' || true)
echo "Stopping hermes-agent sandboxes..."
APPS=$(echo "$APP_LIST" | grep 'hermes-agent' | grep -oE 'ap-[A-Za-z0-9]+' || true)
if [[ -z "$APPS" ]]; then
echo " No swe-rex apps found."
echo " No hermes-agent apps found."
else
echo "$APPS" | while read app_id; do
echo " Stopping $app_id"
@ -30,5 +30,5 @@ else
fi
echo ""
echo "Current swe-rex status:"
modal app list 2>/dev/null | grep -E 'State|swe-rex' || echo " (none)"
echo "Current hermes-agent status:"
modal app list 2>/dev/null | grep -E 'State|hermes-agent' || echo " (none)"


@ -0,0 +1,79 @@
import path from 'path';
import { existsSync, readFileSync } from 'fs';

export function normalizeWhatsAppIdentifier(value) {
  return String(value || '')
    .trim()
    .replace(/:.*@/, '@')
    .replace(/@.*/, '')
    .replace(/^\+/, '');
}

export function parseAllowedUsers(rawValue) {
  return new Set(
    String(rawValue || '')
      .split(',')
      .map((value) => normalizeWhatsAppIdentifier(value))
      .filter(Boolean)
  );
}

function readMappingFile(sessionDir, identifier, suffix = '') {
  const filePath = path.join(sessionDir, `lid-mapping-${identifier}${suffix}.json`);
  if (!existsSync(filePath)) {
    return null;
  }
  try {
    const parsed = JSON.parse(readFileSync(filePath, 'utf8'));
    const normalized = normalizeWhatsAppIdentifier(parsed);
    return normalized || null;
  } catch {
    return null;
  }
}

export function expandWhatsAppIdentifiers(identifier, sessionDir) {
  const normalized = normalizeWhatsAppIdentifier(identifier);
  if (!normalized) {
    return new Set();
  }
  // Walk both phone->LID and LID->phone mapping files so allowlists can use
  // either form transparently in bot mode.
  const resolved = new Set();
  const queue = [normalized];
  while (queue.length > 0) {
    const current = queue.shift();
    if (!current || resolved.has(current)) {
      continue;
    }
    resolved.add(current);
    for (const suffix of ['', '_reverse']) {
      const mapped = readMappingFile(sessionDir, current, suffix);
      if (mapped && !resolved.has(mapped)) {
        queue.push(mapped);
      }
    }
  }
  return resolved;
}

export function matchesAllowedUser(senderId, allowedUsers, sessionDir) {
  if (!allowedUsers || allowedUsers.size === 0) {
    return true;
  }
  const aliases = expandWhatsAppIdentifiers(senderId, sessionDir);
  for (const alias of aliases) {
    if (allowedUsers.has(alias)) {
      return true;
    }
  }
  return false;
}


@ -0,0 +1,47 @@
import test from 'node:test';
import assert from 'node:assert/strict';
import os from 'node:os';
import path from 'node:path';
import { mkdtempSync, rmSync, writeFileSync } from 'node:fs';
import {
  expandWhatsAppIdentifiers,
  matchesAllowedUser,
  normalizeWhatsAppIdentifier,
  parseAllowedUsers,
} from './allowlist.js';

test('normalizeWhatsAppIdentifier strips jid syntax and plus prefix', () => {
  assert.equal(normalizeWhatsAppIdentifier('+19175395595@s.whatsapp.net'), '19175395595');
  assert.equal(normalizeWhatsAppIdentifier('267383306489914@lid'), '267383306489914');
  assert.equal(normalizeWhatsAppIdentifier('19175395595:12@s.whatsapp.net'), '19175395595');
});

test('expandWhatsAppIdentifiers resolves phone and lid aliases from session files', () => {
  const sessionDir = mkdtempSync(path.join(os.tmpdir(), 'hermes-wa-allowlist-'));
  try {
    writeFileSync(path.join(sessionDir, 'lid-mapping-19175395595.json'), JSON.stringify('267383306489914'));
    writeFileSync(path.join(sessionDir, 'lid-mapping-267383306489914_reverse.json'), JSON.stringify('19175395595'));
    const aliases = expandWhatsAppIdentifiers('267383306489914@lid', sessionDir);
    assert.deepEqual([...aliases].sort(), ['19175395595', '267383306489914']);
  } finally {
    rmSync(sessionDir, { recursive: true, force: true });
  }
});

test('matchesAllowedUser accepts mapped lid sender when allowlist only contains phone number', () => {
  const sessionDir = mkdtempSync(path.join(os.tmpdir(), 'hermes-wa-allowlist-'));
  try {
    writeFileSync(path.join(sessionDir, 'lid-mapping-19175395595.json'), JSON.stringify('267383306489914'));
    writeFileSync(path.join(sessionDir, 'lid-mapping-267383306489914_reverse.json'), JSON.stringify('19175395595'));
    const allowedUsers = parseAllowedUsers('+19175395595');
    assert.equal(matchesAllowedUser('267383306489914@lid', allowedUsers, sessionDir), true);
    assert.equal(matchesAllowedUser('188012763865257@lid', allowedUsers, sessionDir), false);
  } finally {
    rmSync(sessionDir, { recursive: true, force: true });
  }
});


@ -26,6 +26,7 @@ import path from 'path';
import { mkdirSync, readFileSync, writeFileSync, existsSync, readdirSync } from 'fs';
import { randomBytes } from 'crypto';
import qrcode from 'qrcode-terminal';
import { matchesAllowedUser, parseAllowedUsers } from './allowlist.js';
// Parse CLI args
const args = process.argv.slice(2);
@ -47,7 +48,7 @@ const DOCUMENT_CACHE_DIR = path.join(process.env.HOME || '~', '.hermes', 'docume
const AUDIO_CACHE_DIR = path.join(process.env.HOME || '~', '.hermes', 'audio_cache');
const PAIR_ONLY = args.includes('--pair-only');
const WHATSAPP_MODE = getArg('mode', process.env.WHATSAPP_MODE || 'self-chat'); // "bot" or "self-chat"
const ALLOWED_USERS = (process.env.WHATSAPP_ALLOWED_USERS || '').split(',').map(s => s.trim()).filter(Boolean);
const ALLOWED_USERS = parseAllowedUsers(process.env.WHATSAPP_ALLOWED_USERS || '');
const DEFAULT_REPLY_PREFIX = '⚕ *Hermes Agent*\n────────────\n';
const REPLY_PREFIX = process.env.WHATSAPP_REPLY_PREFIX === undefined
? DEFAULT_REPLY_PREFIX
@ -190,10 +191,9 @@ async function startSocket() {
if (!isSelfChat) continue;
}
// Check allowlist for messages from others (resolve LID → phone if needed)
if (!msg.key.fromMe && ALLOWED_USERS.length > 0) {
const resolvedNumber = lidToPhone[senderNumber] || senderNumber;
if (!ALLOWED_USERS.includes(resolvedNumber)) continue;
// Check allowlist for messages from others (resolve LID ↔ phone aliases)
if (!msg.key.fromMe && !matchesAllowedUser(senderId, ALLOWED_USERS, SESSION_DIR)) {
continue;
}
// Extract message body
@ -515,8 +515,8 @@ if (PAIR_ONLY) {
app.listen(PORT, '127.0.0.1', () => {
console.log(`🌉 WhatsApp bridge listening on port ${PORT} (mode: ${WHATSAPP_MODE})`);
console.log(`📁 Session stored in: ${SESSION_DIR}`);
if (ALLOWED_USERS.length > 0) {
console.log(`🔒 Allowed users: ${ALLOWED_USERS.join(', ')}`);
if (ALLOWED_USERS.size > 0) {
console.log(`🔒 Allowed users: ${Array.from(ALLOWED_USERS).join(', ')}`);
} else {
console.log(`⚠️ No WHATSAPP_ALLOWED_USERS set — all messages will be processed`);
}


@ -0,0 +1,289 @@
---
name: songwriting-and-ai-music
description: >
Songwriting craft, AI music generation prompts (Suno focus), parody/adaptation
techniques, phonetic tricks, and lessons learned. These are tools and ideas,
not rules. Break any of them when the art calls for it.
tags: [songwriting, music, suno, parody, lyrics, creative]
triggers:
- writing a song
- song lyrics
- music prompt
- suno prompt
- parody song
- adapting a song
- AI music generation
---
# Songwriting & AI Music Generation
Everything here is a GUIDELINE, not a rule. Art breaks rules on purpose.
Use what serves the song. Ignore what doesn't.
---
## 1. Song Structure (Pick One or Invent Your Own)
Common skeletons — mix, modify, or throw out as needed:
```
ABABCB   Verse/Chorus/Verse/Chorus/Bridge/Chorus     (most pop/rock)
AABA     Verse/Verse/Bridge/Verse (refrain-based)    (jazz standards, ballads)
ABAB     Verse/Chorus alternating                    (simple, direct)
AAA      Verse/Verse/Verse (strophic, no chorus)     (folk, storytelling)
```
The six building blocks:
- Intro — set the mood, pull the listener in
- Verse — the story, the details, the world-building
- Pre-Chorus — optional tension ramp before the payoff
- Chorus — the emotional core, the part people remember
- Bridge — a detour, a shift in perspective or key
- Outro — the farewell, can echo or subvert the rest
You don't need all of these. Some great songs are just one section
that evolves. Structure serves the emotion, not the other way around.
---
## 2. Rhyme, Meter, and Sound
RHYME TYPES (from tight to loose):
- Perfect: lean/mean
- Family: crate/braid
- Assonance: had/glass (same vowels, different endings)
- Consonance: scene/when (different vowels, similar endings)
- Near/slant: enough to suggest connection without locking it down
Mix them. Nothing but perfect rhymes can sound like a nursery rhyme;
nothing but slant rhymes can sound lazy. The blend is where it lives.
INTERNAL RHYME: Rhyming within a line, not just at the ends.
"We pruned the lies from bleeding trees / Distilled the storm
from entropy" — "lies/flies," "trees/entropy" create internal echoes.
METER: The rhythm of stressed vs unstressed syllables.
- Matching syllable counts between parallel lines helps singability
- The STRESSED syllables matter more than total count
- Say it out loud. If you stumble, the meter needs work.
- Intentionally breaking meter can create emphasis or surprise
---
## 3. Emotional Arc and Dynamics
Think of a song as a journey, not a flat road.
ENERGY MAPPING (rough idea, not prescription):
Intro: 2-3 | Verse: 5-6 | Pre-Chorus: 7
Chorus: 8-9 | Bridge: varies | Final Chorus: 9-10
The most powerful dynamic trick: CONTRAST.
- Whisper before a scream hits harder than just screaming
- Sparse before dense. Slow before fast. Low before high.
- The drop only works because of the buildup
- Silence is an instrument
"Whisper to roar to whisper" — start intimate, build to full power,
strip back to vulnerability. Works for ballads, epics, anthems.
---
## 4. Writing Lyrics That Work
SHOW, DON'T TELL (usually):
- "I was sad" = flat
- "Your hoodie's still on the hook by the door" = alive
- But sometimes "I give my life" said plainly IS the power
THE HOOK:
- The line people remember, hum, repeat
- Usually the title or core phrase
- Works best when melody + lyric + emotion all align
- Place it where it lands hardest (often first/last line of chorus)
PROSODY — lyrics and music supporting each other:
- Stable feelings (resolution, peace) pair with settled melodies,
perfect rhymes, resolved chords
- Unstable feelings (longing, doubt) pair with wandering melodies,
near-rhymes, unresolved chords
- Verse melody typically sits lower, chorus goes higher
- But flip this if it serves the song
AVOID (unless you're doing it on purpose):
- Cliches on autopilot ("heart of gold" without earning it)
- Forcing word order to hit a rhyme ("Yoda-speak")
- Same energy in every section (flat dynamics)
- Treating your first draft as sacred — revision is creation
---
## 5. Parody and Adaptation
When rewriting an existing song with new lyrics:
THE SKELETON: Map the original's structure first.
- Count syllables per line
- Mark the rhyme scheme (ABAB, AABB, etc.)
- Identify which syllables are STRESSED
- Note where held/sustained notes fall
FITTING NEW WORDS:
- Match stressed syllables to the same beats as the original
- Total syllable count can flex by 1-2 unstressed syllables
- On long held notes, try to match the VOWEL SOUND of the original
(if original holds "LOOOVE" with an "oo" vowel, "FOOOD" fits
better than "LIFE")
- Monosyllabic swaps in key spots keep rhythm intact
(Crime -> Code, Snake -> Noose)
- Sing your new words over the original — if you stumble, revise
CONCEPT:
- Pick a concept strong enough to sustain the whole song
- Start from the title/hook and build outward
- Generate lots of raw material (puns, phrases, images) FIRST,
then fit the best ones into the structure
- If you need a specific line somewhere, reverse-engineer the
rhyme scheme backward to set it up
KEEP SOME ORIGINALS: Leaving a few original lines or structures
intact adds recognizability and lets the audience feel the connection.
---
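A quick way to sanity-check the skeleton mapping above is a rough syllable counter. The sketch below uses a naive vowel-group heuristic, not real prosodic analysis (a pronunciation dictionary would do better), and the helper names are illustrative:

```python
import re

def count_syllables(word: str) -> int:
    """Naive syllable estimate: count groups of consecutive vowels.

    A crude heuristic -- good enough for a first pass when mapping
    a parody line onto the original's syllable skeleton.
    """
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Common correction: trailing silent 'e' ("crate", "noose")
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)

def line_syllables(line: str) -> int:
    """Total estimated syllables for one lyric line."""
    return sum(count_syllables(w) for w in re.findall(r"[a-zA-Z']+", line))

original = "I did it my way"
parody = "I ate it all day"
# Flag parody lines that drift more than a couple of syllables
assert abs(line_syllables(original) - line_syllables(parody)) <= 2
```

Stressed-syllable placement still has to be checked by singing the line aloud; this only catches gross count mismatches.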
## 6. Suno AI Prompt Engineering
### Style/Genre Description Field
FORMULA (adapt as needed):
Genre + Mood + Era + Instruments + Vocal Style + Production + Dynamics
```
BAD: "sad rock song"
GOOD: "Cinematic orchestral spy thriller, 1960s Cold War era, smoky
sultry female vocalist, big band jazz, brass section with
trumpets and french horns, sweeping strings, minor key,
vintage analog warmth"
```
DESCRIBE THE JOURNEY, not just the genre:
```
"Begins as a haunting whisper over sparse piano. Gradually layers
in muted brass. Builds through the chorus with full orchestra.
Second verse erupts with raw belting intensity. Outro strips back
to a lone piano and a fragile whisper fading to silence."
```
TIPS:
- V4.5+ supports up to 1,000 chars in Style field — use them
- NO artist names or trademarks. Describe the sound instead.
"1960s Cold War spy thriller brass" not "James Bond style"
"90s grunge" not "Nirvana-style"
- Specify BPM and key when you have a preference
- Use Exclude Styles field for what you DON'T want
- Unexpected genre combos can be gold: "bossa nova trap",
"Appalachian gothic", "chiptune jazz"
- Build a vocal PERSONA, not just a gender:
"A weathered torch singer with a smoky alto, slight rasp,
who starts vulnerable and builds to devastating power"
### Metatags (place in [brackets] inside lyrics field)
STRUCTURE:
[Intro] [Verse] [Verse 1] [Pre-Chorus] [Chorus]
[Post-Chorus] [Hook] [Bridge] [Interlude]
[Instrumental] [Instrumental Break] [Guitar Solo]
[Breakdown] [Build-up] [Outro] [Silence] [End]
VOCAL PERFORMANCE:
[Whispered] [Spoken Word] [Belted] [Falsetto] [Powerful]
[Soulful] [Raspy] [Breathy] [Smooth] [Gritty]
[Staccato] [Legato] [Vibrato] [Melismatic]
[Harmonies] [Choir] [Harmonized Chorus]
DYNAMICS:
[High Energy] [Low Energy] [Building Energy] [Explosive]
[Emotional Climax] [Gradual swell] [Orchestral swell]
[Quiet arrangement] [Falling tension] [Slow Down]
GENDER:
[Female Vocals] [Male Vocals]
ATMOSPHERE:
[Melancholic] [Euphoric] [Nostalgic] [Aggressive]
[Dreamy] [Intimate] [Dark Atmosphere]
SFX:
[Vinyl Crackle] [Rain] [Applause] [Static] [Thunder]
Put tags in BOTH style field AND lyrics for reinforcement.
Keep to 5-8 tags per section max — too many confuses the AI.
Don't contradict yourself ([Calm] + [Aggressive] in same section).
### Custom Mode
- Always use Custom Mode for serious work (separate Style + Lyrics)
- Lyrics field limit: ~3,000 chars (~40-60 lines)
- Always add structural tags — without them Suno defaults to
flat verse/chorus/verse with no emotional arc
---
## 7. Phonetic Tricks for AI Singers
AI vocalists don't read — they pronounce. Help them:
PHONETIC RESPELLING:
- Spell words as they SOUND: "through" -> "thru"
- Proper nouns are highest failure rate — test early
- "Nous" -> "Noose" (forces correct pronunciation)
- Hyphenate to guide syllables: "Re-search", "bio-engineering"
DELIVERY CONTROL:
- ALL CAPS = louder, more intense
- Vowel extension: "lo-o-o-ove" = sustained/melisma
- Ellipses: "I... need... you" = dramatic pauses
- Hyphenated stretch: "ne-e-ed" = emotional stretch
ALWAYS:
- Spell out numbers: "24/7" -> "twenty four seven"
- Space acronyms: "AI" -> "A I" or "A-I"
- Test proper nouns/unusual words in a short 30-second clip first
- Once generated, pronunciation is baked in — fix in lyrics BEFORE
---
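The number and acronym rules above can be applied mechanically before generation. This is a toy preprocessor with a demo lookup table (a real one would use a number-to-words library such as num2words, and a curated acronym list rather than a bare all-caps regex):

```python
import re

ACRONYM = re.compile(r"\b([A-Z]{2,5})\b")
SPOKEN_NUMBERS = {  # demo mapping only; extend as needed
    "24/7": "twenty four seven",
}

def prep_lyrics_for_ai(line: str) -> str:
    """Pre-spell numbers and space out acronyms for an AI vocalist."""
    for raw, spoken in SPOKEN_NUMBERS.items():
        line = line.replace(raw, spoken)
    # "AI" -> "A I": insert spaces between the letters of an acronym
    return ACRONYM.sub(lambda m: " ".join(m.group(1)), line)

print(prep_lyrics_for_ai("My AI runs 24/7"))
```

Note the regex will also split ordinary all-caps words, so run it only on lines you have checked, or swap in an explicit acronym list.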
## 8. Workflow
1. Write the concept/hook first — what's the emotional core?
2. If adapting, map the original structure (syllables, rhyme, stress)
3. Generate raw material — brainstorm freely before structuring
4. Draft lyrics into the structure
5. Read/sing aloud — catch stumbles, fix meter
6. Build the Suno style description — paint the dynamic journey
7. Add metatags to lyrics for performance direction
8. Generate 3-5 variations minimum — treat them like recording takes
9. Pick the best, use Extend/Continue to build on promising sections
10. If something great happens by accident, keep it
EXPECT: ~3-5 generations per 1 good result. Revision is normal.
Style can drift in extensions — restate genre/mood when extending.
---
## 9. Lessons Learned
- Describing the dynamic ARC in the style field matters way more
than just listing genres. "Whisper to roar to whisper" gives
Suno a performance map.
- Keeping some original lines intact in a parody adds recognizability
and emotional weight — the audience feels the ghost of the original.
- The bridge slot in a song is where you can transform imagery.
Swap the original's specific references for your theme's metaphors
while keeping the emotional function (reflection, shift, revelation).
- Monosyllabic word swaps in hooks/tags are the cleanest way to
maintain rhythm while changing meaning.
- A strong vocal persona description in the style field makes a
bigger difference than any single metatag.
- Don't be precious about rules. If a line breaks meter but hits
harder, keep it. The feeling is what matters. Craft serves art,
not the other way around.


@ -0,0 +1,180 @@
---
name: webhook-subscriptions
description: Create and manage webhook subscriptions for event-driven agent activation. Use when the user wants external services to trigger agent runs automatically.
version: 1.0.0
metadata:
hermes:
tags: [webhook, events, automation, integrations]
---
# Webhook Subscriptions
Create dynamic webhook subscriptions so external services (GitHub, GitLab, Stripe, CI/CD, IoT sensors, monitoring tools) can trigger Hermes agent runs by POSTing events to a URL.
## Setup (Required First)
The webhook platform must be enabled before subscriptions can be created. Check with:
```bash
hermes webhook list
```
If it says "Webhook platform is not enabled", set it up:
### Option 1: Setup wizard
```bash
hermes gateway setup
```
Follow the prompts to enable webhooks, set the port, and set a global HMAC secret.
### Option 2: Manual config
Add to `~/.hermes/config.yaml`:
```yaml
platforms:
  webhook:
    enabled: true
    extra:
      host: "0.0.0.0"
      port: 8644
      secret: "generate-a-strong-secret-here"
```
### Option 3: Environment variables
Add to `~/.hermes/.env`:
```bash
WEBHOOK_ENABLED=true
WEBHOOK_PORT=8644
WEBHOOK_SECRET=generate-a-strong-secret-here
```
After configuration, start (or restart) the gateway:
```bash
hermes gateway run
# Or if using systemd:
systemctl --user restart hermes-gateway
```
Verify it's running:
```bash
curl http://localhost:8644/health
```
## Commands
All management is via the `hermes webhook` CLI command:
### Create a subscription
```bash
hermes webhook subscribe <name> \
  --prompt "Prompt template with {payload.fields}" \
  --events "event1,event2" \
  --description "What this does" \
  --skills "skill1,skill2" \
  --deliver telegram \
  --deliver-chat-id "12345" \
  --secret "optional-custom-secret"
```
Returns the webhook URL and HMAC secret. The user configures their service to POST to that URL.
### List subscriptions
```bash
hermes webhook list
```
### Remove a subscription
```bash
hermes webhook remove <name>
```
### Test a subscription
```bash
hermes webhook test <name>
hermes webhook test <name> --payload '{"key": "value"}'
```
## Prompt Templates
Prompts support `{dot.notation}` for accessing nested payload fields:
- `{issue.title}` — GitHub issue title
- `{pull_request.user.login}` — PR author
- `{data.object.amount}` — Stripe payment amount
- `{sensor.temperature}` — IoT sensor reading
If no prompt is specified, the full JSON payload is dumped into the agent prompt.
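For illustration, a minimal resolver for this placeholder style might look like the sketch below. This is not Hermes' actual template engine (its behavior around missing keys, for instance, may differ); unknown paths are simply left in place here:

```python
import re

def render_prompt(template: str, payload: dict) -> str:
    """Resolve {dot.notation} placeholders against a nested payload dict."""
    def resolve(match: re.Match) -> str:
        value = payload
        for key in match.group(1).split("."):
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                return match.group(0)  # leave unresolved placeholders intact
        return str(value)
    return re.sub(r"\{([\w.]+)\}", resolve, template)

payload = {"issue": {"number": 42, "title": "Crash on startup", "user": {"login": "alice"}}}
print(render_prompt("New GitHub issue #{issue.number}: {issue.title} by {issue.user.login}", payload))
```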
## Common Patterns
### GitHub: new issues
```bash
hermes webhook subscribe github-issues \
  --events "issues" \
  --prompt "New GitHub issue #{issue.number}: {issue.title}\n\nAction: {action}\nAuthor: {issue.user.login}\nBody:\n{issue.body}\n\nPlease triage this issue." \
  --deliver telegram \
  --deliver-chat-id "-100123456789"
```
Then in GitHub repo Settings → Webhooks → Add webhook:
- Payload URL: the returned webhook_url
- Content type: application/json
- Secret: the returned secret
- Events: "Issues"
### GitHub: PR reviews
```bash
hermes webhook subscribe github-prs \
  --events "pull_request" \
  --prompt "PR #{pull_request.number} {action}: {pull_request.title}\nBy: {pull_request.user.login}\nBranch: {pull_request.head.ref}\n\n{pull_request.body}" \
  --skills "github-code-review" \
  --deliver github_comment
```
### Stripe: payment events
```bash
hermes webhook subscribe stripe-payments \
  --events "payment_intent.succeeded,payment_intent.payment_failed" \
  --prompt "Payment {data.object.status}: {data.object.amount} cents from {data.object.receipt_email}" \
  --deliver telegram \
  --deliver-chat-id "-100123456789"
```
### CI/CD: build notifications
```bash
hermes webhook subscribe ci-builds \
  --events "pipeline" \
  --prompt "Build {object_attributes.status} on {project.name} branch {object_attributes.ref}\nCommit: {commit.message}" \
  --deliver discord \
  --deliver-chat-id "1234567890"
```
### Generic monitoring alert
```bash
hermes webhook subscribe alerts \
  --prompt "Alert: {alert.name}\nSeverity: {alert.severity}\nMessage: {alert.message}\n\nPlease investigate and suggest remediation." \
  --deliver origin
```
## Security
- Each subscription gets an auto-generated HMAC-SHA256 secret (or provide your own with `--secret`)
- The webhook adapter validates signatures on every incoming POST
- Static routes from config.yaml cannot be overwritten by dynamic subscriptions
- Subscriptions persist to `~/.hermes/webhook_subscriptions.json`
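Conceptually, HMAC-SHA256 signature validation works like the sketch below (this mirrors the GitHub `X-Hub-Signature-256` scheme; the actual Hermes adapter code may differ in details):

```python
import hashlib
import hmac

def verify_github_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Check a GitHub-style X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position via timing
    return hmac.compare_digest(expected, signature_header)
```

The important details are signing the *raw* bytes of the body (not a re-serialized JSON) and using a constant-time comparison.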
## How It Works
1. `hermes webhook subscribe` writes to `~/.hermes/webhook_subscriptions.json`
2. The webhook adapter hot-reloads this file on each incoming request (mtime-gated, negligible overhead)
3. When a POST arrives matching a route, the adapter formats the prompt and triggers an agent run
4. The agent's response is delivered to the configured target (Telegram, Discord, GitHub comment, etc.)
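The mtime-gated hot reload in step 2 can be sketched as follows. This is an illustration of the pattern, not the adapter's actual code:

```python
import json
import os

class SubscriptionStore:
    """Re-parse the subscriptions file only when its mtime changes."""

    def __init__(self, path: str):
        self.path = path
        self._mtime = 0.0
        self._subs = {}

    def get(self) -> dict:
        try:
            mtime = os.path.getmtime(self.path)
        except OSError:
            return self._subs  # file missing: keep last known state
        if mtime != self._mtime:  # cheap stat on every request, parse only on change
            with open(self.path) as f:
                self._subs = json.load(f)
            self._mtime = mtime
        return self._subs
```

Each incoming request pays only for one `stat()` call unless the file actually changed, which is why the overhead is negligible.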
## Troubleshooting
If webhooks aren't working:
1. **Is the gateway running?** Check with `systemctl --user status hermes-gateway` or `ps aux | grep gateway`
2. **Is the webhook server listening?** `curl http://localhost:8644/health` should return `{"status": "ok"}`
3. **Check gateway logs:** `grep webhook ~/.hermes/logs/gateway.log | tail -20`
4. **Signature mismatch?** Verify the secret in your service matches the one from `hermes webhook list`. GitHub sends `X-Hub-Signature-256`, GitLab sends `X-Gitlab-Token`.
5. **Firewall/NAT?** The webhook URL must be reachable from the service. For local development, use a tunnel (ngrok, cloudflared).
6. **Wrong event type?** Check `--events` filter matches what the service sends. Use `hermes webhook test <name>` to verify the route works.


@ -219,6 +219,9 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null; then
echo "AUTH_METHOD=gh"
elif [ -n "$GITHUB_TOKEN" ]; then
echo "AUTH_METHOD=curl"
elif [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
export GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
echo "AUTH_METHOD=curl"
elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
export GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
echo "AUTH_METHOD=curl"


@ -23,6 +23,11 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null 2>&1; then
GH_USER=$(gh api user --jq '.login' 2>/dev/null)
elif [ -n "$GITHUB_TOKEN" ]; then
GH_AUTH_METHOD="curl"
elif [ -f "$HOME/.hermes/.env" ] && grep -q "^GITHUB_TOKEN=" "$HOME/.hermes/.env" 2>/dev/null; then
GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" "$HOME/.hermes/.env" | head -1 | cut -d= -f2 | tr -d '\n\r')
if [ -n "$GITHUB_TOKEN" ]; then
GH_AUTH_METHOD="curl"
fi
elif [ -f "$HOME/.git-credentials" ] && grep -q "github.com" "$HOME/.git-credentials" 2>/dev/null; then
GITHUB_TOKEN=$(grep "github.com" "$HOME/.git-credentials" | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
if [ -n "$GITHUB_TOKEN" ]; then


@ -27,7 +27,11 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null; then
else
AUTH="git"
if [ -z "$GITHUB_TOKEN" ]; then
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
fi
fi
fi


@ -27,7 +27,11 @@ if command -v gh &>/dev/null && gh auth status &>/dev/null; then
else
AUTH="git"
if [ -z "$GITHUB_TOKEN" ]; then
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
if [ -f ~/.hermes/.env ] && grep -q "^GITHUB_TOKEN=" ~/.hermes/.env; then
GITHUB_TOKEN=$(grep "^GITHUB_TOKEN=" ~/.hermes/.env | head -1 | cut -d= -f2 | tr -d '\n\r')
elif grep -q "github.com" ~/.git-credentials 2>/dev/null; then
GITHUB_TOKEN=$(grep "github.com" ~/.git-credentials 2>/dev/null | head -1 | sed 's|https://[^:]*:\([^@]*\)@.*|\1|')
fi
fi
fi

Some files were not shown because too many files have changed in this diff.