Commit graph

2670 commits

Author SHA1 Message Date
Teknium
2ecb4e62bb
Merge remote-tracking branch 'origin/main' into hermes/hermes-6b48295e 2026-06-11 07:38:25 -07:00
Teknium
9c051f57c3
fix(dashboard): Anthropic API Key entry checks ANTHROPIC_API_KEY, not Claude Code creds; hide deprecated tool-progress env vars (#44286)
Two dashboard fixes:

1. The 'Anthropic API Key' OAuth catalog entry's status fn read
   ~/.claude/.credentials.json (which has its own dedicated claude-code
   entry) and never checked ANTHROPIC_API_KEY at all. It now checks the
   Hermes PKCE file, then the registry env-var order (ANTHROPIC_API_KEY
   -> ANTHROPIC_TOKEN -> CLAUDE_CODE_OAUTH_TOKEN) via get_env_value, so
   keys from .env, the shell, or Bitwarden (injected into the process
   env by load_hermes_dotenv) are all reported, with a '(from Bitwarden)'
   source suffix when applicable.

2. Deprecated HERMES_TOOL_PROGRESS / HERMES_TOOL_PROGRESS_MODE removed
   from OPTIONAL_ENV_VARS so the keys page and setup checklists stop
   offering them. Moved to _EXTRA_ENV_KEYS so .env sanitization and
   reload_env still recognize them for existing users (gateway back-compat
   fallback unchanged).
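
A minimal sketch of the env-var precedence described in item 1, for illustration only. The variable order comes from the message above; the helper name `first_configured_key` is hypothetical, and the real status function goes through get_env_value, which is not reproduced here.

```python
import os
from typing import Mapping, Optional, Tuple

# Order as stated in the commit message.
ANTHROPIC_ENV_ORDER = (
    "ANTHROPIC_API_KEY",
    "ANTHROPIC_TOKEN",
    "CLAUDE_CODE_OAUTH_TOKEN",
)


def first_configured_key(
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, str]]:
    """Return (variable name, value) for the first non-empty variable, else None.

    Hypothetical helper: it only illustrates the lookup order.
    """
    for name in ANTHROPIC_ENV_ORDER:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None
```

Whitespace-only values are treated as unset, so an empty placeholder in .env does not shadow a later variable in the order.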
2026-06-11 07:18:15 -07:00
Teknium
a09343cc96
feat(dashboard): SKILL.md editor on Skills page + attach-skill selector in cron modals (#44231)
Headless/VPS users (dashboard-over-Tailscale, no comfortable SSH) could
list/toggle/install skills and create/edit cron jobs, but not author a
custom skill or link one to a cron job — the UI set WHEN a job runs, but
not WHICH skill it uses.

- Skills page: 'New skill' button + per-row edit pencil open a SKILL.md
  editor dialog (frontmatter + body, server-side validation via the same
  _create_skill/_edit_skill path as the agent's skill_manage tool).
- New endpoints: GET /api/skills/content, POST /api/skills,
  PUT /api/skills/content — all profile-scoped via _profile_scope(),
  which now also retargets tools.skill_manager_tool's import-time
  SKILLS_DIR binding.
- Cron page: skills multi-select in both create and edit modals (parity
  with hermes cron --skill / edit --add-skill); CronJobCreate gains a
  skills field; job cards show an attached-skills badge. update_job
  already accepted skills in updates.
- Tests: 17 new endpoint tests (content read, create/edit validation +
  profile scoping + auth gate, cron skills round-trip).
2026-06-11 06:10:27 -07:00
Teknium
f456f302df
fix(gateway): refuse to write service definitions with a temp-dir HERMES_HOME (#44267)
* fix(gateway): refuse to write service definitions with a temp-dir HERMES_HOME

A test/E2E harness that exports HERMES_HOME=/tmp/... and touches any
gateway service write path (install, start self-heal, restart's
refresh_systemd_unit_if_needed) bakes the throwaway home into the
production systemd unit / launchd plist. The gateway then restarts
'healthy' but pointed at an empty temp home — no platforms enabled,
deaf to every message (live incident 2026-06-11: /tmp/hermes-e2e-41264
poisoned the unit during a PR-review E2E probe; the post-update restart
produced a 7-hour zombie gateway).

The existing safety belt only sniffed pytest-shaped markers
(/pytest-of-, /hermes_test). Add a structural guard:
_temp_home_in_service_definition() extracts HERMES_HOME from the
generated systemd unit or launchd plist and refuses the write (with
actionable guidance) when it resolves under tempfile.gettempdir(),
/tmp, /var/tmp, or the macOS /private variants. Wired into all five
write sites: systemd refresh + install, launchd refresh + install +
start self-heal.
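
The structural check could look roughly like the sketch below. It covers only the "does this path resolve under a temp directory" part; extracting HERMES_HOME from a unit or plist is not shown. The function names are hypothetical, and the list of temp roots is taken from the message above.

```python
import os
import tempfile
from pathlib import Path
from typing import Set


def _temp_roots() -> Set[Path]:
    """Temp roots named in the commit message, both as written and resolved."""
    candidates = [
        tempfile.gettempdir(),
        "/tmp",
        "/var/tmp",
        "/private/tmp",      # macOS variants
        "/private/var/tmp",
    ]
    roots: Set[Path] = set()
    for candidate in candidates:
        roots.add(Path(candidate))
        roots.add(Path(os.path.realpath(candidate)))
    return roots


def is_under_temp_dir(home: str) -> bool:
    """True when `home` is a temp root or lives below one (hypothetical helper)."""
    raw = Path(home).expanduser()
    resolved = Path(os.path.realpath(raw))
    roots = _temp_roots()
    for path in (raw, resolved):
        if path in roots or any(parent in roots for parent in path.parents):
            return True
    return False
```

Checking both the literal and the resolved path means a symlinked /tmp (as on macOS) is caught either way.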

* test: patch unit generator in install tests tripped by temp-home guard

CI runs hermetic with HERMES_HOME under a tmp dir, so the real
generate_systemd_unit() output now (correctly) trips the new temp-home
write guard in three install tests. Patch the generator with synthetic
non-temp content — same pattern the existing pytest-marker guard tests
use.
2026-06-11 06:10:08 -07:00
Teknium
9c16ca8790
fix(dashboard): normalize model assignments + confirm-modal for backup import (#44237)
Two beta-reported dashboard bugs:

1. Models page: 'Use as -> Main model' on an analytics card sends
   entry.provider, which falls back to the model's VENDOR prefix
   (modelVendor('anthropic/claude-opus-4.6') == 'anthropic') when the
   session row has no billing_provider. That persisted
   provider: anthropic + default: anthropic/claude-opus-4.6 — a
   vendor-prefixed OpenRouter slug on the NATIVE Anthropic provider.
   New sessions then 400 against api.anthropic.com and the user reads
   it as 'changing models does nothing'. Unknown vendors (moonshotai,
   poolside, ...) were worse: a provider that can never resolve
   credentials.

   Fix: _normalize_main_model_assignment() at the single write
   chokepoint — maps non-provider vendor names back to the user's
   current aggregator (else openrouter), and runs the model through
   normalize_model_for_provider() so the persisted name matches the
   target provider's API format. Wired into both /api/model/set and
   the profile-scoped _write_profile_model.

2. System page: 'Restore from backup' spawns hermes import with
   stdin=DEVNULL, so the CLI's interactive 'Continue? [y/N]' overwrite
   prompt hits EOF and auto-aborts whenever a config already exists
   (always, when the dashboard is running). Fix: ConfirmDialog in the
   dashboard owns the consent, then the endpoint passes --force so the
   restore runs non-interactively.

Validated live: dashboard on a temp HERMES_HOME, repro'd both failure
modes pre-fix (vendor-slug write verified via config.yaml + tui
session.create; import 'Aborted.' in action-import.log), then verified
post-fix (normalized writes, modal -> --force -> restored marker file).
2026-06-11 05:07:58 -07:00
Teknium
73dd584995
fix(mcp): propagate HERMES_HOME override onto the MCP event loop (#44220)
* fix(mcp): propagate HERMES_HOME override onto the MCP event loop

Closes the known limit documented in #44007: tasks scheduled via
run_coroutine_threadsafe are created INSIDE the MCP loop thread, so they
copy that thread's context — a per-request profile scope (dashboard
?profile= endpoints, e.g. the MCP 'Test server' probe) silently vanished
for anything resolving get_hermes_home() inside the coroutine. Most
visible symptom: OAuth token-store paths (HERMES_HOME/mcp-tokens/)
resolved against the process home instead of the selected profile, so
testing an OAuth MCP cross-profile read the wrong tokens.

_run_on_mcp_loop now wraps scheduled coroutines with the caller's
context-local override (_wrap_with_home_override): set inside the task's
own context on the loop, reset on completion — task-local, so concurrent
calls carrying different scopes don't interfere, and the loop thread's
default context stays untouched. No-op (coroutine passes through
unwrapped) when no override is active, i.e. every non-dashboard caller.

web_server's probe comment updated from 'known limit' to 'covered'.

Tests: override propagation (direct + factory form), OAuth token-path
resolution on the loop, loop-context cleanliness after scoped calls,
no-op passthrough. 225 green across mcp_tool + unification suites.

* test(mcp): concurrent different-scope calls don't interfere
2026-06-11 04:37:01 -07:00
Teknium
875aa8f162
feat(dashboard): unify multi-profile management — one machine dashboard, global profile switcher (#44007)
* feat(dashboard): unify multi-profile management — one machine dashboard, global profile switcher

The dashboard becomes a machine-level management surface with one
write-target selector, replacing per-profile dashboard fragmentation.

Backend:
- profile param (query or body) on /api/config (get/put/raw), /api/env
  (get/put/delete/reveal), /api/mcp/servers (list/add/remove/test/enabled),
  /api/mcp/catalog (list/install), /api/model/info, /api/model/set —
  all scoped through the existing _profile_scope() context manager
- model/set restructured: expensive-model warning (await) runs before the
  scope; the config write runs sync inside the scope in a worker thread
- MCP catalog installs + git-bootstrap entries spawn 'hermes -p <profile>'
- chat PTY: ?profile= on /api/pty points the child's HERMES_HOME at the
  profile dir (its own gateway subprocess, config/skills/memory/state.db
  all profile-bound); in-process gateway attach skipped when scoped

CLI launch unification:
- '<profile> dashboard' routes to the machine dashboard: attach (open
  browser at ?profile=) when one is listening, else re-exec pinned to the
  default profile with --open-profile preselecting the launcher
- --isolated preserves the old dedicated per-profile server behavior
- start_server(initial_profile=...) appends ?profile= to the auto-open URL

Frontend:
- ProfileProvider + sidebar ProfileSwitcher: ONE global selector, URL-
  persisted (?profile=), mirrored into fetchJSON which auto-appends the
  param to the scoped endpoint families (explicit params win)
- app-wide amber banner names the managed profile
- SkillsPage's page-local selector (from the skills-scoping PR) folded
  into the global context — single source of truth
- ChatPage threads the scope into the PTY WS URL; switching profiles
  remounts the terminal into a fresh scoped session

Omitted profile keeps legacy behavior everywhere.

* docs(dashboard): document machine-level multi-profile management

- web-dashboard.md: 'Managing multiple profiles' section (switcher, URL
  deep-links, unified launch, --isolated, scoped Chat, what stays
  per-profile) + --isolated in the options table
- profiles.md: 'From the dashboard' subsection + set-as-active vs
  switcher clarification
- cli-commands.md: --isolated flag + profile-alias launch example

* fix(dashboard): address profile-unification review findings

Review findings (dev review on PR #44007):

1. HIGH — stale page state on profile switch: pages load data on mount
   and didn't consume the profile scope, so a page opened under profile A
   kept showing A's state while writes silently targeted the newly
   selected B. Fixed structurally: ProfileKeyedRoutes wraps the routed
   page tree and keys it by the selected profile, remounting every page
   (fresh state + refetch) on switch. ChatPage keeps its own remount
   (channel keyed on scopedProfile).

2. HIGH — /api/model/auxiliary read was unscoped while /api/model/set
   wrote scoped (Models page could show default's aux pins while editing
   worker's). Endpoint now takes profile + _profile_scope, added to
   PROFILE_SCOPED_PREFIXES, HTTPException re-raise so ghost profiles 404
   instead of 500. Regression test asserts read/write symmetry with
   differing worker/default aux config.

3. MEDIUM — tools post-setup spawned unscoped from the profile-aware
   drawer. Now spawns 'hermes -p <profile> tools post-setup <key>'
   (same mechanism as hub installs); drawer threads its profile prop.
   Most hooks install machine-level artifacts where the scope is inert,
   but hooks reading config/env now see the drawer's HERMES_HOME.

4. LOW — ty warnings: env Optional asserts before subscript/membership,
   fastapi import replaced with web_server.HTTPException re-use.

298 tests green across the four affected suites; tsc -b + vite build
green; aux scoping E2E-verified with real imports.

* fix(dashboard): address second profile-unification review (gille)

1. BLOCKER — profile scope dropped on sidebar navigation: ProfileProvider
   derived the selection from the current URL, and nav links are bare
   paths, so clicking Config from /skills?profile=worker silently reset
   the write target. State is now the source of truth; an effect
   re-asserts ?profile= onto the new location after every navigation
   (URL stays a synchronized projection for deep links/refresh), and an
   incoming URL param (e.g. 'Manage skills & tools' links) still wins.

2. BLOCKER — /api/model/options unscoped while model/set wrote scoped:
   the picker context (current model/provider, custom providers,
   per-profile .env auth state) now loads inside _profile_scope; added
   to PROFILE_SCOPED_PREFIXES. Test: a worker-only current-model pin
   appears in the scoped payload and not the unscoped one.

3. BLOCKER — MCP test-server probe escaped the scope after the config
   read: the probe now re-enters _profile_scope inside the worker thread
   so env-placeholder expansion resolves against the selected profile's
   .env. Known limit (documented): the probe's dedicated MCP event-loop
   thread doesn't inherit the contextvar (OAuth token paths). Test
   asserts get_hermes_home() inside the probe == the worker profile dir.

4. BLOCKER — broad excepts swallowed unknown-profile 404s: /api/model/info
   degraded to 200-with-empty-model-info and /api/mcp/catalog to a
   silently-empty catalog. Both re-raise HTTPException; 404 regression
   tests added for info/options/catalog.

Polish: scope banner clears the fixed mobile header (mt-14 lg:mt-0);
--open-profile hidden via argparse.SUPPRESS (internal re-exec flag);
attach-path test now asserts the opened ?profile= URL.

(Stale-page-state + /api/model/auxiliary findings from this review were
already fixed in 92bcd1568 — the review ran against e600f6951.)

35 tests in the two new suites + 274 in the adjacent ones, all green;
tsc -b + vite build green; scoping E2E-verified with real imports.

* docs(dashboard)+fix: self-review pass — Profiles page section, REST profile-param tip, body-beats-query precedence

Docs:
- web-dashboard.md: add the missing 'Profiles' subsection to Pages
  (cards, create/builder, manage-skills jump, set-as-active vs switcher
  distinction, editors); REST API section gets a profile-scoped-endpoints
  tip documenting ?profile= / body profile / 404 semantics / /api/pty
- (profiles.md + cli-commands.md were already updated in e600f6951)

Precedence fix: scoped endpoints taking BOTH a query param and a body
field now resolve body.profile first. The SPA's fetchJSON injects the
query param from the GLOBAL switcher; an explicit body.profile (e.g.
Profile Builder flows writing into a specific new profile) is the more
specific intent and must not be overridden by whatever the sidebar
happens to be set to. Matches the documented 'explicit beats global'
contract in api.ts.
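
The precedence rule can be sketched as a small resolver. This is an illustration only; `resolve_profile` is a hypothetical name and the real endpoints may structure this differently.

```python
from typing import Optional


def resolve_profile(
    body_profile: Optional[str],
    query_profile: Optional[str],
) -> Optional[str]:
    """Explicit body.profile beats the globally injected ?profile= param.

    Returns None when neither is set, which keeps the legacy
    (unscoped) behaviour.
    """
    for candidate in (body_profile, query_profile):
        if candidate and candidate.strip():
            return candidate.strip()
    return None
```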

Verified: 304 tests green across the four suites; tsc -b + vite build
green; docusaurus build green (only pre-existing broken-link warnings,
none from this PR's pages).
2026-06-11 03:29:33 -07:00
kshitij
955fa40062
Merge pull request #44085 from kshitijk4poor/review/pr-43754-ssh-update
fix(update): avoid SSH auth for passive official checks
2026-06-11 01:12:03 -07:00
kshitijk4poor
ed2b9e43c8 fix(backup): stage SQLite snapshots beside output zip in pre-update path too
The pre-update / pre-migration backup path (_write_full_zip_backup) had the
same /tmp staging bug as run_backup: a small tmpfs at the default tempfile
location silently drops large *.db files from the archive. Route its SQLite
staging temp files to the output zip's directory as well, and add regression
tests (mutation-verified) for both staging paths.
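
A minimal sketch of staging a SQLite snapshot beside the output zip, assuming the standard-library sqlite3 backup API is an acceptable way to take the copy. The function name `add_sqlite_snapshot` is hypothetical and the project's own safe-copy routine may differ.

```python
import sqlite3
import tempfile
import zipfile
from pathlib import Path


def add_sqlite_snapshot(
    zf: zipfile.ZipFile, db_path: Path, out_path: Path, arcname: str
) -> None:
    """Copy db_path into the archive via a temp file next to the output zip."""
    # dir=out_path.parent keeps the staging file on the same filesystem as
    # the archive instead of the system temp directory.
    with tempfile.NamedTemporaryFile(
        dir=out_path.parent, suffix=".db", delete=False
    ) as tmp:
        staged = Path(tmp.name)
    try:
        src = sqlite3.connect(str(db_path))
        dst = sqlite3.connect(str(staged))
        try:
            src.backup(dst)  # consistent copy even if the source is in use
        finally:
            dst.close()
            src.close()
        zf.write(staged, arcname)
    finally:
        staged.unlink(missing_ok=True)  # never leave the staging file behind
```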

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
2026-06-11 12:45:40 +05:30
helix4u
cedd9b6d47 fix(update): avoid SSH auth for passive official checks 2026-06-11 12:45:07 +05:30
liuhao1024
dd40600e0a fix(backup): stage SQLite snapshots alongside output zip and stop excluding nested hermes-agent skill dirs
Two bugs in the backup routine:

1. SQLite safe-copy used tempfile.NamedTemporaryFile() which defaults to
   the system temp directory (/tmp).  When /tmp is a small tmpfs and the
   database is large, the copy silently fails and the resulting zip is
   missing state.db, kanban.db, and response_store.db.

   Fix: pass dir=out_path.parent so the temp file is staged alongside the
   output zip on the same filesystem.

2. _EXCLUDED_DIRS contained "hermes-agent" which matched at ANY path
   depth, accidentally excluding the Hermes Agent skill directory at
   skills/autonomous-ai-agents/hermes-agent/.

   Fix: special-case "hermes-agent" to only match when it is the first
   path component (the root-level code checkout).  All other excluded dir
   names continue to match at any depth.
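
The exclusion rule from item 2 can be sketched as follows. The set contents other than "hermes-agent" are illustrative placeholders, and `is_excluded` is a hypothetical name.

```python
from pathlib import PurePosixPath

# Illustrative placeholders: the real _EXCLUDED_DIRS contents are not shown above.
EXCLUDED_ANY_DEPTH = {"__pycache__", "node_modules"}
# Matches only as the first path component (the root-level code checkout).
EXCLUDED_ROOT_ONLY = {"hermes-agent"}


def is_excluded(rel_path: str) -> bool:
    """Decide whether a file path (relative to the backup root) is skipped."""
    dirs = PurePosixPath(rel_path).parts[:-1]  # directory components only
    if not dirs:
        return False
    if dirs[0] in EXCLUDED_ROOT_ONLY:
        return True
    return any(part in EXCLUDED_ANY_DEPTH for part in dirs)
```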

Regression tests added for both fixes.
2026-06-11 12:43:39 +05:30
Shannon Sands
fa7f24e898 Enable webhooks from dashboard page 2026-06-10 22:55:06 -07:00
brooklyn!
975edd4140
fix(cli): omit --workspace when subpackage has its own package-lock.json (#42973) (#43986)
* fix(cli): omit --workspace when subpackage has its own package-lock.json

When ui-tui/ (or web/) contains its own package-lock.json, _workspace_root()
returns the subpackage directory itself.  Passing --workspace ui-tui in that
case fails because npm cannot find a workspace named 'ui-tui' inside ui-tui/.

Fix: skip the --workspace flag when npm_cwd equals the target directory,
running a plain 'npm install' from the standalone project root instead.

Applies the same fix to both _make_tui_argv (TUI) and _build_web_ui (web).
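
The flag decision can be sketched like this. It is a simplified illustration: `npm_install_argv` is a hypothetical helper, and the real _make_tui_argv / _build_web_ui functions build more than this.

```python
from pathlib import Path
from typing import List


def npm_install_argv(npm_cwd: Path, target_dir: Path) -> List[str]:
    """Build the install command, scoping to a workspace only when needed."""
    argv = ["npm", "install"]
    # When the subpackage is its own project root, there is no workspace
    # with that name to select, so the flag is dropped.
    if npm_cwd.resolve() != target_dir.resolve():
        argv += ["--workspace", target_dir.name]
    return argv
```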

Fixes #42973

* test(cli): fix web workspace-scope fixture + cover own-lockfile fallback (#42973)

The web half of the #42977 fix broke test_npm_install_uses_workspace_web_scope,
which built its fixture with no lockfile anywhere. Without a root lockfile,
_workspace_root(web_dir) already returns web_dir, so the new
"() if npm_cwd == web_dir" branch correctly drops --workspace and the
assertion failed. Model a real workspace checkout instead: the single
package-lock.json lives at the root, so --workspace web scopes the install.

Also add the symmetric web regression test (web/ carrying its own lockfile =>
--workspace must be dropped and the install runs plainly from web_dir via
npm ci), matching the TUI coverage already in test_tui_npm_install.py.

---------

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
2026-06-11 05:01:25 +00:00
brooklyn!
3e74f75e41
feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316)
* feat(agent): coding-context posture with per-model edit-format tuning

Hermes detects when it's running in a coding context — an interactive
surface (CLI, TUI, ACP, desktop) sitting in a code workspace (git repo or
recognised project root) — and shifts into a coding posture. Outside that
(chat platforms, non-workspaces) nothing changes.

The posture is modelled as a frozen RuntimeMode selected from a small
ContextProfile registry (coding/general). A profile is data: the toolset to
collapse to, the operating brief to inject, and seams for model routing and
memory. Every domain reads the same resolved object instead of re-probing
git/config on its own:

- System prompt — RuntimeMode.system_blocks(): an operating brief (gather
  context before editing, edit through tools not chat, verify with terminal,
  cap retry loops) plus a live git/workspace snapshot, built once and baked
  into the stable prompt tier so per-conversation caching is preserved.
- Per-model edit-format tuning — the brief nudges each model family toward
  the patch mode it handles best: OpenAI/Codex toward mode='patch' (V4A
  multi-file diffs), Anthropic toward mode='replace' (string replacement).
  The model id rides on RuntimeMode; unknown families keep neutral wording.
- Skill index — non-coding skill categories are pruned from the prompt's
  skill index (discovery-only; skills_list/skill_view still reach the full
  catalog, with a disclosure note).
- Toolset — only under the opt-in 'focus' mode does the posture collapse to
  the coding toolset + enabled MCP servers; the default posture is
  prompt-only and never overrides configured toolsets.

Activation via agent.coding_context: auto (default), focus, on, off.
Subagents inherit the posture for free via toolset inheritance + the shared
prompt builder. Detection is not memoized so a long-lived gateway/TUI
process can't pin a stale posture across working directories.

* feat(agent): cover new-file authoring in the coding edit-format nudge

The per-model edit-format guidance only addressed editing existing code
(patch mode='patch' vs 'replace'), but authoring a brand-new file —
write_file, not patch — is a large fraction of real coding work and the
nudge was silent on it. Surfaced when building a single-file artifact where
the dominant operation was write_file and the steering offered no guidance.

Both family lines now lead with "author new files with write_file; for
edits to existing code prefer ...". Tests assert write_file appears in each
family's brief; unknown families still get neutral wording.

* docs(agent): correct memoization docstring + clarify TUI config-load asymmetry

* feat(agent): sharpen the coding posture — verify-loop facts, wider edit steering, $HOME guard

Tuning pass on the coding posture from dogfooding it as a harness:

- Workspace snapshot now hands the model its verify loop up front:
  detected manifests + package manager (lockfile sniff), the exact
  verify commands (package.json scripts, Makefile targets,
  scripts/run_tests.sh, pytest config), and which context files
  (AGENTS.md / CLAUDE.md / .cursorrules) exist at the root. Marker-only
  (non-git) projects get the snapshot too instead of nothing. The
  "verify before claiming done" brief line was the highest-value piece
  in evals — this turns it from advice into an executable loop instead
  of making the model rediscover the test command every session. Still
  stat-cheap, size-guarded reads, built once at prompt time.

- Edit-format steering covers the families Hermes actually serves:
  Gemini and open-weight coding models (DeepSeek, Qwen, Kimi, GLM,
  Grok, Hermes, Llama, Mistral, Devstral, MiniMax) steer to
  mode='replace' — their RL scaffolds use str_replace-style editors.
  Previously only GPT/Codex and Claude families got steering; the
  models Hermes users disproportionately run all fell to neutral.

- Operating brief gains four behaviors elite harnesses encode: batch
  independent reads/searches in one turn; fix root causes and the bug
  class (sibling call paths), not the reported site; no drive-by
  refactors/renames/reformatting; never read, print, or commit secrets.
  Plus a patch-failure escalation ladder: after the same region fails
  twice, rewrite the enclosing function/file with write_file instead of
  a third patch attempt.

- $HOME dotfiles guard: a git repo rooted exactly at the home directory
  (or a marker sitting in it, e.g. a global ~/AGENTS.md) is user config,
  not a code workspace — without the guard, every session anywhere under
  a dotfiles-managed home silently flipped to the coding posture. Real
  projects under such a home still detect via their own markers/repos;
  'on' mode bypasses the guard.
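
A rough sketch of the $HOME guard, assuming detection walks upward from the working directory looking for markers. The marker list and the name `find_workspace_root` are illustrative; the real detection also handles the 'on' bypass, which is omitted here.

```python
from pathlib import Path
from typing import Optional

# Illustrative marker set, not the project's actual list.
MARKERS = (".git", "pyproject.toml", "package.json", "AGENTS.md")


def find_workspace_root(cwd: Path, home: Path) -> Optional[Path]:
    """Return the nearest marked ancestor of cwd, ignoring the home directory."""
    cwd = cwd.resolve()
    home = home.resolve()
    for directory in (cwd, *cwd.parents):
        if directory == home:
            # Markers in $HOME itself are user config, not a workspace,
            # and nothing above $HOME is considered either.
            return None
        if any((directory / marker).exists() for marker in MARKERS):
            return directory
    return None
```

A project under a dotfiles-managed home is still found, because the walk reaches the project's own marker before it reaches the home directory.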
2026-06-10 23:06:44 -05:00
Teknium
7d8d000b19
revert(cron): remove per-job profile support (PR #28124) (#43956)
Fully removes the cron per-job 'profile' arg added in #28124: the
cronjob tool schema field, CLI --profile flags on cron create/edit,
job-record storage/validation, the scheduler's _job_profile_context
wrapper, and the script-runner env override. Sequential-partition
logic reverts to workdir-only.

The context-local HERMES_HOME override in hermes_constants and the
subprocess bridging in tools/environments/local.py are kept — they
now have other consumers (dashboard multi-profile, TUI gateway).
2026-06-10 20:46:17 -07:00
Teknium
914befa9aa feat(dashboard): profile-scoped skills & toolsets management
'Set as active' on the Profiles page only flips the sticky active_profile
file (future CLI/gateway runs) — it never retargets the running dashboard
process. The skills/toolsets endpoints called bare load_config()/
save_config(), so after 'activating' a profile in the web UI, deactivating
a skill silently wrote into the dashboard's own profile and the activated
profile was untouched.

Backend:
- _profile_scope() context manager on the skills/toolsets endpoints:
  context-local HERMES_HOME override for call-time config resolution +
  cron-style locked swap of tools.skills_tool's import-time SKILLS_DIR
- profile param on /api/skills, /api/skills/toggle, /api/tools/toolsets*
  (list/toggle/config/provider/env), hub sources/search installed-state
- hub install/uninstall/update spawn 'hermes -p <profile> skills ...' so
  the child rebinds skills_hub.SKILLS_DIR at import (the override cannot
  reach import-time globals); profile validated -> 404/400 before spawn

Frontend:
- Skills page: profile selector (deep-linkable /skills?profile=<name>),
  amber banner naming the managed profile, threaded through skill toggles,
  toolset drawer, and hub browser
- Profiles page: 'Manage skills & tools' action per card; 'Set as active'
  toast now says it applies to new CLI/gateway runs only

Omitted profile keeps legacy behavior (dashboard's own profile).
2026-06-10 20:34:53 -07:00
Matt Harris
e0e2571711 feat(web): Parallel-backed web search & extract — free Search MCP when keyless, v1 REST when keyed
Make Parallel the web search/extract backend with a zero-setup free tier:

- Keyless (no PARALLEL_API_KEY): web_search/web_extract work out of the box via
  Parallel's free hosted Search MCP (search.parallel.ai/mcp), and parallel
  becomes the default backend when no other web credentials are configured
  (ahead of ddgs, which is search-only). A small hand-rolled Streamable-HTTP
  JSON-RPC client speaks the MCP's web_search/web_fetch tools; the existing
  web_search/web_extract tools are the only tools registered.
- Keyed (PARALLEL_API_KEY set): uses the Parallel v1 REST endpoints
  (client.search / client.extract with advanced_settings.full_content) — no beta.
  Bumps parallel-web 0.4.2 -> 0.6.0.
- Attribution: on the free path only, results carry provider/attribution and the
  CLI tool line reads "Parallel search" / "Parallel fetch"; the paid path is
  unbranded.
- Selection/registration: web tools register unconditionally (free MCP backstop)
  while check_web_api_key remains a real usability probe; explicit per-capability
  backends are honored (so misconfig surfaces) rather than masked by the fallback.

Tested: live web_search/web_extract against search.parallel.ai in keyless and
keyed modes; unit suites for the MCP client, backend selection, and display
labeling; full agent run shows the "Parallel search" label on the free path.
2026-06-10 19:54:38 -07:00
brooklyn!
3ffbdfbcc0
desktop: registry-driven slash commands + first-class /resume & /handoff (#42351)
* desktop: surface /tools, /save, /personality and fix /help skill count

Move /tools and /save out of TERMINAL_ONLY_COMMANDS and /personality out of
ADVANCED_COMMANDS so they appear in the desktop slash palette and execute via
the existing slash.exec → command.dispatch fallback. The backend gateway already
accepts these through slash.exec (none are in _PENDING_INPUT_COMMANDS or the
skill list), so no backend change is required.

Recompute skill_count in filterDesktopCommandsCatalog from the filtered pairs.
Previously the /help footer echoed the unfiltered backend total — e.g. "60
skill commands available" while only ~29 actually appeared in the rendered
list, because the desktop hides terminal-only, picker-owned, and advanced
commands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* desktop: keep slash popover live while typing args

The trigger regex `(?:^|[\s])([@/])([^\s@/]*)$` stopped matching the moment
the user typed a space after a slash command, so the popover never showed arg
completions for `/personality`, `/tools`, etc. — even though the backend's
`complete.slash` already returns them with a `replace_from` indicator.

Split the trigger detection so `/` allows args (`/cmd arg1 arg2`) while `@`
keeps the strict no-space behavior. Restrict the slash command name to
`[a-zA-Z][\w-]*` so file paths like `src/foo/bar` don't accidentally trigger
the popover.
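
The split can be approximated in Python as below. This is a hedged translation for illustration, not the desktop app's code: the two patterns and `detect_trigger` are assumptions built from the rules stated above, and handling of a bare `/` with no name yet is left out.

```python
import re
from typing import Optional, Tuple

# "/" trigger: a restricted command name, then optional space-separated args.
SLASH_TRIGGER = re.compile(r"(?:^|\s)/([a-zA-Z][\w-]*)(?:\s+(.*))?$", re.S)
# "@" trigger: strict, no whitespace allowed after the sigil.
AT_TRIGGER = re.compile(r"(?:^|\s)@([^\s@/]*)$")


def detect_trigger(text_before_cursor: str) -> Optional[Tuple[str, str, str]]:
    """Return (sigil, name, args) when a completion popover should stay open."""
    match = SLASH_TRIGGER.search(text_before_cursor)
    if match:
        return "/", match.group(1), match.group(2) or ""
    match = AT_TRIGGER.search(text_before_cursor)
    if match:
        return "@", match.group(1), ""
    return None
```

With these patterns, `/personality al` yields the command plus a partial argument, while a path such as `src/foo/bar` yields nothing because the slash is not preceded by whitespace or the start of the input.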

Rewrite arg-completion items in useSlashCompletions to insert the full
`/personality alice` token instead of stranding `/alice`: when `replace_from`
is past the command base, prepend the existing prefix to each item's text so
the chip serializer produces a coherent replacement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* cli: complete toolset names after /tools enable|disable

SlashCommandCompleter previously only auto-derived the first subcommand level
from args_hint, so `/tools enable <tab>` yielded nothing — the user had to
remember every toolset key (web, file, spotify, …) and every MCP server prefix.

Add `_tools_completions` that handles both stages: subcommand (list|disable|enable)
and tool name. Filter by current enable state so `/tools enable <tab>` only
offers disabled toolsets and `/tools disable <tab>` only offers enabled ones —
no point suggesting a no-op. MCP server prefixes (server:) come from the
saved mcp_servers config; per-tool completion under a server would require
runtime MCP introspection and is left as follow-up.
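
The two-stage completion can be sketched as follows. This is a simplified stand-in for `_tools_completions`: the function name, its signature, and the `toolsets` mapping (name to enabled flag) are assumptions, and MCP server prefixes are omitted.

```python
from typing import Dict, List

SUBCOMMANDS = ("list", "disable", "enable")


def tools_completions(args: str, toolsets: Dict[str, bool]) -> List[str]:
    """Complete the text typed after '/tools '.

    `toolsets` maps toolset name -> currently enabled.
    """
    parts = args.split()
    trailing_space = args.endswith(" ")

    # Stage 1: still typing the subcommand.
    if not parts or (len(parts) == 1 and not trailing_space):
        prefix = parts[0] if parts else ""
        return [sub for sub in SUBCOMMANDS if sub.startswith(prefix)]

    # Stage 2: toolset names, filtered so no suggestion is a no-op.
    sub = parts[0]
    if sub not in ("enable", "disable"):
        return []
    prefix = "" if trailing_space else parts[-1]
    want_enabled = sub == "disable"  # 'disable' offers enabled sets, and vice versa
    return sorted(
        name
        for name, enabled in toolsets.items()
        if enabled == want_enabled and name.startswith(prefix)
    )
```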

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* desktop: registry-driven slash commands with first-class pickers

Collapse the if/else slash dispatch into one DESKTOP_COMMAND_SPECS table
that drives popover suggestions, per-type composer pills, and execution.

- /resume, /sessions, /switch: inline session completions (like /skin) plus
  a "Browse all sessions…" entry that opens a dedicated session picker overlay
- /handoff: inline platform completion + handoff.request/handoff.state
  gateway bridge so desktop reaches CLI parity
- colored per-type pills (command/skill/theme) in the composer
- strip ANSI and fix width/alignment of slash output in the chat panel

* desktop: fold repeated slash session/output boilerplate into one helper

runExec, /title, /help and the unavailable case each re-derived the same
ensure-session → bail-with-notify → build-renderSlashOutput dance.
withSlashOutput() returns {sessionId, render} or null, so each handler is
a two-line resolve instead of an eight-line preamble.

* desktop: keep backend meta on slash arg completions

Arg suggestions (/personality <name>, /tools enable <toolset>, /handoff
<platform>) were having their meta overwritten with the parent command's
registry description: desktopSlashDescription("/personality none") canonicalizes
back to /personality and returns its blurb. Skip the lookup for arg rows so the
backend's own display_meta ("clear personality overlay", etc.) survives.

* cli: list real personalities in /personality completion

_personality_completions resolved load_config().agent.personalities — but that
schema has no agent.personalities key, so completion always returned just
`none` even though the runtime (load_cli_config().agent.personalities) ships a
dozen built-ins (helpful, kawaii, pirate, …). Read from the same source the
command actually applies, so `/personality ` surfaces the real options.

* desktop: expand bare arg-commands to their options on pick

Picking a command like /personality from the slash popover committed it
immediately instead of advancing to its argument list. Mark arg-taking
commands (/skin, /resume, /handoff, /personality, /tools) in the registry
and, when one is picked bare, insert "/cmd " as plain text and re-open the
popover on its inline options — mirroring typing "/cmd " by hand. Arg picks
(serialized text already contains a space) still commit a single pill.

Also realign trigger-popover loading test with the redesigned popover (the
/help empty-state hint shows when resolved, not while the spinner is up);
the merge from main reintroduced the pre-redesign expectation.

* tui_gateway: fold session-db close into a context manager

Both handoff RPCs repeated the same `db, close_db = _session_db_handle()`
+ `finally: if close_db: db.close()` dance. Turn the helper into a
`_session_db` contextmanager that owns the close, so callers just
`with _session_db(session) as db:`.
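
The shape of that refactor can be sketched with `contextlib.contextmanager`. This is an illustration only: the `existing` and `path` parameters and the use of `sqlite3` are assumptions, not the gateway's actual helper.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def session_db(existing=None, path=":memory:"):
    """Yield a DB handle, closing it only if this helper opened it."""
    owns = existing is None
    db = sqlite3.connect(path) if owns else existing
    try:
        yield db
    finally:
        # The old `finally: if close_db: db.close()` dance now lives here once.
        if owns:
            db.close()
```

Callers that already hold a shared handle pass it in and keep ownership; everyone else gets a handle that is closed when the `with` block exits.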

* desktop: unblock handoff retries and exact resume ids

Clear timed-out desktop handoffs through the gateway so retries are not stuck behind a pending row, and let typed /resume session ids bypass the loaded sidebar cache.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-11 01:49:24 +00:00
emozilla
bfcc9f92b4 Merge commit '6110aed9b' into feat/whatsapp-cloud-api 2026-06-10 21:39:22 -04:00
xxxigm
acd4f34e65 fix(cron): resolve per-job provider "custom" to providers.custom instead of codex
A cron job stored with `provider: "custom"` and a matching `providers.custom`
entry in config failed at execution with `auth_unavailable: providers=codex`.
Two layers conspired:

- `_get_named_custom_provider` returned None for bare "custom" *before*
  scanning config, so a literal `providers.custom` entry was never matched and
  resolution fell through to the global default (codex). Now it scans config
  for an entry literally named "custom"; with none it still returns None,
  preserving the legacy model.base_url trust path.
- `_resolve_model_override` blindly stripped bare "custom" at job creation and
  pinned `model.provider` (e.g. codex). It now keeps "custom" when a configured
  custom endpoint resolves, pinning the main provider only when it doesn't.
2026-06-10 14:39:03 -07:00
Tranquil-Flow
a8f404b29f fix(gateway): probe launchd domain instead of hardcoding user/<uid> (#40831)
The previous fix for #23387 changed _launchd_domain() from gui/<uid> to
user/<uid> to support Background/SSH sessions on macOS 26+. However, this
broke Aqua sessions where gui/<uid> is the only working domain and
user/<uid> cannot bootstrap or manage the service.

Now _launchd_domain() probes which domain actually contains the loaded
service:
1. Try gui/<uid> first (Aqua sessions)
2. Fall back to user/<uid> (Background/SSH sessions)
3. Use launchctl managername as heuristic when neither has the service
4. Cache the result for the process lifetime

Regression tests cover all four paths plus caching behavior.
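
A rough sketch of the four-step probe order, with the launchctl calls injected as callables so the logic can be exercised off macOS. `has_service` and `manager_name` are hypothetical stand-ins, and treating a manager name of "Aqua" as the GUI case is an assumption about the heuristic.

```python
_domain_cache = {}  # uid -> chosen domain, kept for the process lifetime


def pick_launchd_domain(uid, has_service, manager_name):
    """Return the launchd domain to target.

    has_service(domain) -> bool and manager_name() -> str are injected
    probes (hypothetical); the real code would shell out to launchctl.
    """
    gui, user = f"gui/{uid}", f"user/{uid}"
    if has_service(gui):       # 1. Aqua sessions
        return gui
    if has_service(user):      # 2. Background/SSH sessions
        return user
    # 3. Neither domain holds the service: guess from the session manager.
    return gui if manager_name() == "Aqua" else user


def cached_launchd_domain(uid, has_service, manager_name):
    """4. Probe once, then reuse the answer."""
    if uid not in _domain_cache:
        _domain_cache[uid] = pick_launchd_domain(uid, has_service, manager_name)
    return _domain_cache[uid]
```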
2026-06-10 12:39:48 -07:00
Shannon Sands
6fe4821926 Add dashboard file browser paths 2026-06-10 09:53:12 -07:00
Teknium
d986bb0c6d
feat(dashboard): full-featured profile builder (model + skills + MCPs) (#39084)
* feat(profiles): extend create endpoint for full profile-builder (model + MCPs + skills)

Backend foundation for the dashboard profile builder. Extends POST /api/profiles
to accept, in one call, everything a profile needs beyond name/clone:

- mcp_servers[]  -> written into the new profile's config.yaml
- keep_skills[]  -> replace-semantics: disable every seeded skill not kept
- hub_skills[]   -> async install via 'hermes -p <name> skills install <id>'

All applied best-effort AFTER the profile dir exists, so a hiccup in any one
never 500s the create. Model/MCP/keep-skills writes are profile-scoped via the
HERMES_HOME context override (same mechanism as the existing _write_profile_model).
Hub installs go through a subprocess scoped with -p because skills_hub.SKILLS_DIR
is import-time-bound and the runtime override can't redirect it.

Adds two helpers (_write_profile_mcp_servers, _disable_unselected_skills) and a
TestClient test asserting all four paths land in the NEW profile's config and
the hub spawn is scoped to it. Design doc at docs/design/profile-builder.md.

* feat(dashboard): full-featured profile builder page

Adds a dedicated /profiles/new builder that composes everything a profile
needs into one stepped create flow, reusing the existing Models/Skills/MCP
data paths instead of duplicating them:

- Identity   name + description
- Model      provider+model picker (api.getModelOptions)
- Skills     keep-which-built-in/optional (replace semantics, default = full
             bundle) + skills-hub search/add (api.getSkills, searchSkillsHub)
- MCPs       add HTTP/stdio servers inline
- Review     blueprint -> single POST /api/profiles create

Nothing writes until Create; the one call commits model+MCPs+skill selection
and spawns hub-skill installs (reported in the success toast). ProfilesPage
header gets a 'Build' button (full builder) alongside 'Create' (quick modal).
Route is page-only (not in the sidebar nav). Verified with vite build (2258
modules, green).
2026-06-10 09:18:32 -07:00
Barron Roth
2c19208224 feat(tts): add Gemini audio tag rewrite 2026-06-10 02:57:39 -07:00
Barron Roth
5718811de0 feat(tts): add Gemini persona prompt file 2026-06-10 02:57:39 -07:00
Teknium
70d5d7e39b
fix(memory,skills): repair write-approval inline prompt, gateway staging, and gateway /skills review (#43452)
Follow-ups to #38199/#43354 found in post-merge review:

- Inline CLI memory approval never worked: the per-thread approval callback
  was not passed to prompt_dangerous_approval, so the prompt_toolkit
  fail-closed guard (#15216) denied every gated foreground write without
  showing a prompt. Now invokes the registered callback directly; a crashed
  prompt falls back to staging instead of a silent deny.
- Gateway sessions claimed inline support but prompt_dangerous_approval has
  no gateway round-trip (that lives in the pending-approval queue), so gated
  gateway memory writes hit the input() fallback and denied. Gateway
  contexts now stage for /memory pending review.
- /skills pending|approve|reject|diff|approval now works on the gateway
  (gateway_config_gate on skills.write_approval), so skills staged from a
  messaging session can be reviewed there. Diff output truncated for chat.
- memory_tool validates required params before the gate so invalid writes
  are rejected immediately instead of staged and failing at approve time.
- Stale tri-state write_mode docstrings updated to the boolean gate; docs
  table corrected (inline prompt is interactive-CLI-only).
- 6 new tests covering the interactive approve/deny/error paths, gateway
  staging, skills never-prompt invariant, and pre-gate validation.
2026-06-10 02:57:15 -07:00
Teknium
a5c32cdf30
fix(update): self-heal a venv left half-built by an interrupted install (#42172)
* fix(update): self-heal a venv left half-built by an interrupted install

An update killed mid dependency-install (Ctrl-C, terminal close, WSL OOM)
could leave the venv with pip wiped and core deps (e.g. Pillow) missing,
with no automatic recovery — the user had to manually run ensurepip +
reinstall.

Drop an install-scoped .update-incomplete breadcrumb right before the dep
install and clear it only after core-dependency verification passes. On the
next launch (any command except 'update' itself), if the marker is present,
unconditionally bootstrap pip via ensurepip then re-run the .[all] install +
verification, then clear the marker. Failure leaves the marker for retry and
prints the manual recovery command. Never raises — recovery cannot block
launch.

* fix(update): address review — stderr-only recovery output, single-flight lock, gitignore marker

- Route all recovery output (status lines + streamed pip/uv install via
  fd-level dup2) to stderr so protocol-on-stdout launches (hermes acp)
  never get install noise on the JSON-RPC stream.
- Single-flight O_EXCL lockfile (.update-incomplete.lock) so a gateway
  start + CLI launch (or two profiles) can't run concurrent installs
  into the shared venv; stale locks (>1h) are broken for the next launch.
- gitignore .update-incomplete + lock so source-tree installs keep a
  clean git status and update's autostash skips them.
- Document why the loose 'update' argv substring match is intentional
  (over-match defers one launch; under-match would race the real update).
- 4 new tests: lock held → skip, stale lock broken, lock released,
  output lands on stderr only.
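
The single-flight lock can be sketched with `os.open` and `O_EXCL`. The function names are hypothetical, and this version breaks a stale lock and retries immediately, which may differ from the behaviour described above (breaking it for the next launch).

```python
import os
import time

STALE_AFTER = 3600  # seconds; the message says locks older than 1h are broken


def try_acquire(lock_path, now=None):
    """Return True if this process now holds the lock, False if another does."""
    now = time.time() if now is None else now
    for _ in range(2):
        try:
            # O_EXCL makes creation fail if the file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            try:
                age = now - os.path.getmtime(lock_path)
            except FileNotFoundError:
                continue  # holder released it in between; retry once
            if age <= STALE_AFTER:
                return False
            try:
                os.unlink(lock_path)  # stale: break it, then retry once
            except FileNotFoundError:
                pass
            continue
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return True
    return False


def release(lock_path):
    """Drop the lock; a missing file is not an error."""
    try:
        os.unlink(lock_path)
    except FileNotFoundError:
        pass
```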
2026-06-10 02:57:05 -07:00
Ben Barclay
15813336cc
fix(config): preserve original .env file mode in remove_env_value too (#43349)
#33699 fixed save_env_value so an operator-set .env mode (e.g. 0640 on a
Docker bind-mount) survives a config write instead of being re-tightened
to 0600 by the unconditional _secure_file() call. The sibling
remove_env_value() had the identical bug: it restored original_mode and
then unconditionally called _secure_file(env_path), clobbering the mode
back to 0600 on every `hermes config remove KEY`.

Apply the same fix: move _secure_file() into the else branch so it only
runs when no original mode was captured (a freshly created .env still
gets 0600 hardening; existing operator-set modes survive).

Added test_remove_env_value_preserves_existing_file_mode_on_posix, which
fails on the unfixed remove path (expected 0o640, got 0o600) and passes
with the fix.
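
The preserve-or-harden branch can be illustrated as follows. This is a sketch under POSIX assumptions: `rewrite_preserving_mode` is a hypothetical helper, and a plain `os.chmod(path, 0o600)` stands in for `_secure_file()`.

```python
import os
import stat
import tempfile


def rewrite_preserving_mode(path, text):
    """Atomically rewrite path, keeping an existing mode; new files get 0600."""
    try:
        original_mode = stat.S_IMODE(os.stat(path).st_mode)
    except FileNotFoundError:
        original_mode = None
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as fh:
        fh.write(text)
    os.replace(tmp, path)
    if original_mode is not None:
        os.chmod(path, original_mode)  # operator-set mode survives
    else:
        os.chmod(path, 0o600)          # hardening only for a brand-new file
```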
2026-06-10 19:53:07 +10:00
kshitij
2f19512341
fix(cli): repair non-UTF-8 stdout/stderr on all platforms, not just Windows (#43439)
`hermes setup` (and other banner-printing commands) crash with an unhandled
UnicodeEncodeError on Linux hosts whose locale selects a non-UTF-8 codec —
e.g. a fresh Raspberry Pi / minimal Debian with a latin-1 or C/POSIX locale.
The setup wizard prints box-drawing characters (┌│├└─) and the ⚕ glyph before
any stream repair runs, so the command dies before it can start.

The existing _ensure_utf8() shim already knew how to re-wrap the standard
streams as UTF-8, but it returned early on `sys.platform != "win32"`, so the
identical crash class on Linux was never covered.

- Drop the win32 gate: repair any stdout/stderr whose encoding is not UTF-8.
- Prefer TextIOWrapper.reconfigure() so the stream object is fixed in place
  (cached sys.stdout references keep working); fall back to reopening the fd
  with closefd=False (the CPython-recommended safe variant).
- Use errors="replace" — matching the sibling hermes_cli/stdio.py shim — so a
  stray un-encodable byte degrades gracefully instead of crashing.
- Only set the PYTHONUTF8/PYTHONIOENCODING child-process hints when a repair
  actually happened, so a healthy UTF-8 host sees zero footprint (no stream
  swap, no env mutation).

This is intentionally the earliest, platform-agnostic guard, running at import
time before any banner prints. hermes_cli/stdio.py::configure_windows_stdio()
still runs later from the entry points for the Windows-only extras (console
code-page flip, EDITOR default, PATH augmentation); it early-returns on
non-Windows and its stream reconfigure is an idempotent no-op once we've
already repaired the streams here.

Add regression tests covering latin-1 and ascii/POSIX streams, the reconfigure
fallback, already-UTF-8 no-op (identity preserved + no env mutation), the
repair-sets-env and respects-explicit-env contracts, and hostile/None streams.
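
A minimal sketch of the repair order (in-place `reconfigure()` first, fd reopen as fallback). It is not the real `_ensure_utf8()`: the function below is hypothetical, takes the stream as an argument, and leaves out the environment-variable hints.

```python
def ensure_utf8(stream):
    """Return a stream that encodes as UTF-8, repairing in place when possible."""
    encoding = (getattr(stream, "encoding", None) or "").lower().replace("-", "")
    if stream is None or encoding == "utf8":
        return stream  # healthy or absent stream: leave it untouched
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is not None:
        # Same object, new codec: cached references to the stream keep working.
        reconfigure(encoding="utf-8", errors="replace")
        return stream
    # No reconfigure(): reopen the descriptor without taking ownership of it.
    return open(stream.fileno(), "w", encoding="utf-8",
                errors="replace", closefd=False)
```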
2026-06-10 02:21:00 -07:00
teknium1
fa32af886f fix: dedupe concurrent gateway restarts + surface restart outcome in onboarding UI
Follow-ups to the salvaged Telegram QR onboarding auto-restart:

- _spawn_gateway_restart() reuses a live in-flight 'hermes gateway restart'
  child instead of spawning a second racing one (stale cached frontend +
  new backend both requesting a restart, or restart-button double-click).
  Both /api/gateway/restart and the onboarding apply path go through it.
- ChannelsPage polls /api/actions/gateway-restart/status after a
  server-initiated restart and surfaces a non-zero exit (e.g. systemd
  linger missing) via the manual-restart banner, since restart_started
  only means the child spawned.
- Test for the reuse path + _ACTION_PROCS isolation in existing tests.
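
The reuse path can be sketched as a small registry keyed by action name. The `_ACTION_PROCS` name comes from the message above, but its shape here (a dict of `Popen` objects) and the `spawn_once` helper are assumptions.

```python
import subprocess
import sys

_ACTION_PROCS = {}  # action name -> most recent child process (assumed shape)


def spawn_once(name, argv):
    """Start argv unless a child for this action is still running.

    Returns (process, reused).
    """
    proc = _ACTION_PROCS.get(name)
    if proc is not None and proc.poll() is None:
        return proc, True  # still in flight: hand back the live child
    proc = subprocess.Popen(argv)
    _ACTION_PROCS[name] = proc
    return proc, False
```

A caller that gets `reused=True` still only knows the child was spawned, so the exit code has to be polled separately, as the status endpoint above does.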
2026-06-10 01:35:12 -07:00
Shannon Sands
984e69ff62 Auto-restart gateway after Telegram QR onboarding 2026-06-10 01:35:12 -07:00
Teknium
298bb93d39
feat(skills): show live per-source progress while browsing (#43398)
do_browse waited on a frozen 'Fetching skills...' spinner while sources
resolved, so a slow source looked like a hang. parallel_search_sources
already exposes an on_source_done(sid, count) callback fired as each source
completes — wire it into the status line so it ticks off sources live
(official (12), + github (4), + clawhub (500)). The page is still rendered
once, after the full set is merged and trust-sorted, so browse's
official-first ordering and pagination contract are untouched.
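
A sketch of wiring a per-source callback into a status line. `search_all` and `progress_tracker` are hypothetical stand-ins for `parallel_search_sources` and the status update; only the `on_source_done(sid, count)` callback shape is taken from the message.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def search_all(sources, on_source_done=None):
    """Run every source concurrently; report each one as it finishes."""
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {pool.submit(fn): sid for sid, fn in sources.items()}
        for fut in as_completed(futures):
            sid = futures[fut]
            results[sid] = fut.result()
            if on_source_done is not None:
                on_source_done(sid, len(results[sid]))
    return results


def progress_tracker(emit):
    """Build a callback that re-emits the status line after each source."""
    done = []

    def on_source_done(sid, count):
        done.append(f"{sid} ({count})")
        emit("Fetching skills... " + " + ".join(done))

    return on_source_done
```

Rendering still happens once, after `search_all` returns, so ordering and pagination do not depend on which source finishes first.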
2026-06-10 01:02:40 -07:00
Teknium
243cada157 fix(model): cover typed gateway /model path + async-safe pricing lookups
Follow-ups on top of #26016's expensive-model guard:

- gateway/slash_commands.py: typed '/model <name>' now routes through the
  expensive-model confirmation gate (slash-confirm buttons / text fallback)
  instead of bypassing the guard the pickers enforce. Cancel leaves the
  session override and --global config untouched.
- telegram/discord/web_server: run expensive_model_warning() via
  asyncio.to_thread — it can hit models.dev or a /models endpoint on a
  cache miss, which would otherwise block the event loop.
- telegram: picker callback no longer toasts 'Model switched!' when the
  switch callback raised (both mm: and mc: paths).
- tests: new tests/gateway/test_model_command_expensive_confirm.py pins
  the typed-path gate (prompt, confirm-once, cancel, cheap-model no-op).
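
To illustrate why the lookup is pushed through `asyncio.to_thread`, here is a small self-contained demo. `slow_pricing_lookup` is a hypothetical stand-in for the blocking pricing call; the heartbeat task shows the event loop still making progress while it runs.

```python
import asyncio
import time


def slow_pricing_lookup(model):
    """Stand-in for a blocking network lookup (hypothetical)."""
    time.sleep(0.2)
    return f"{model} is expensive"


async def heartbeat(ticks):
    """Something else the event loop needs to keep servicing."""
    for _ in range(5):
        await asyncio.sleep(0.02)
        ticks.append(time.monotonic())


async def main():
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    # Runs in a worker thread, so the loop keeps scheduling heartbeat().
    warning = await asyncio.to_thread(slow_pricing_lookup, "some-model")
    returned_at = time.monotonic()
    await hb
    ticks_during_lookup = sum(t < returned_at for t in ticks)
    return warning, ticks_during_lookup
```

Calling `slow_pricing_lookup` directly inside `main()` would instead stall every other coroutine for the full duration of the lookup.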
2026-06-10 00:24:06 -07:00
Robin Fernandes
af978ecb17 fix(model): require confirmation for expensive model selections
Rebased onto current main and re-ported across the restructured
surfaces: model flows now thread confirm_provider/base_url/api_key
through hermes_cli/model_setup_flows.py, the Discord picker lives in
plugins/platforms/discord/adapter.py, and the web dashboard picker
applies chat-mode switches via config.set so the expensive-model
confirmation can ride the response.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 00:24:06 -07:00
Ondrej Drapalik
1c055a4c58 fix(xai): accept Grok Build code during loopback wait + tiny screenshot guard
xAI's consent page renders the authorization code in-page instead of
redirecting to the loopback callback, so the listener just hangs and the
manual-paste flow demands a callback URL that never contains the token.

- auth.py: poll stdin non-blockingly while waiting for the xAI loopback
  callback; accept a pasted bare Grok Build code and substitute the locally
  generated state (PKCE code_verifier still binds the exchange). No need to
  wait for timeout or re-run with --manual-paste.
- computer_use: parse PNG/JPEG dimensions from base64 and fall back to the
  text/AX/SOM payload when the screenshot is below the provider minimum
  (8x8), which xAI rejects with HTTP 400.
- model_setup_flows.py: xAI credential reuse prompt uses the standard radio
  picker via a shared _prompt_auth_credentials_choice helper.
- main.py: thread a title through _prompt_provider_choice; re-home the helper
  import (flows live in model_setup_flows.py post-decomposition).

Salvaged from #36781 onto current main (contributor's main.py edits re-homed
to model_setup_flows.py, where the flows were extracted since the PR opened).
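
The screenshot guard can be sketched for the PNG case by reading the IHDR chunk directly. This is an illustration only: the helper names are hypothetical, JPEG parsing is omitted, and the 8x8 minimum is taken from the message above.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(b64_data):
    """Return (width, height) from a base64 PNG, or None if it is not a PNG."""
    raw = base64.b64decode(b64_data)
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", width, height.
    if len(raw) < 24 or raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", raw[16:24])


def too_small(b64_data, minimum=8):
    """True when either dimension is below the provider minimum."""
    size = png_size(b64_data)
    return size is not None and min(size) < minimum
```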
2026-06-09 23:21:24 -07:00
Teknium
095f526b11
refactor(memory,skills): replace tri-state write_mode with boolean write_approval (default off) (#43354)
The shipped tri-state write_mode (on|off|approve) conflated two concepts —
whether writes are enabled and whether they're gated — so 'on' (writes flow
freely, gate inactive) read like 'gating is on'. Replace it with a single
clear boolean gate that defaults off.

  memory.write_approval / skills.write_approval:
    false (default) — write freely; the approval gate is off (pre-gate behaviour)
    true            — require approval: memory foreground prompts inline, memory
                      background-review + all skill writes stage for review

The old 'off = block all writes' mode is dropped; memory_enabled: false already
disables memory entirely, so a third 'block' state was redundant.

- tools/write_approval.py: get_write_mode/MODE_* → write_approval_enabled() bool;
  evaluate_gate() loses the config-driven 'blocked' path (blocked now only comes
  from an interactive user denial).
- tools/memory_tool.py, tools/skill_manager_tool.py: comment + behaviour follow.
- hermes_cli/config.py: memory/skills write_mode → write_approval (False);
  _config_version 28→29 with a 28→29 migration that renames any persisted
  write_mode (approve→true, on/off/unset→false) and drops the old key.
- slash commands: '/memory|/skills mode <on|off|approve>' → 'approval <on|off>'
  ('mode' kept as a back-compat alias); set_mode_fn callback now takes a bool.
- write_approval_commands.py, cli_commands_mixin.py, gateway/slash_commands.py,
  commands.py: handlers + registry args/subcommands updated.
- docs + tests rewritten for the boolean model; added migration tests.
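
The 28→29 rename rule can be sketched on a plain dict. The function name and the config shape are assumptions; letting an already-present write_approval value win over the old key is also a choice made for this sketch.

```python
def migrate_28_to_29(config):
    """Rename write_mode -> write_approval for the memory and skills sections."""
    for section in ("memory", "skills"):
        settings = config.setdefault(section, {})
        old = settings.pop("write_mode", None)  # drop the old key
        # approve -> True; on / off / unset -> False
        settings.setdefault("write_approval", old == "approve")
    config["_config_version"] = 29
    return config
```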
2026-06-09 23:21:14 -07:00
Ben Barclay
63a421d4c0
fix(dashboard): _require_token endpoints all 401 behind the OAuth gate (#42578)
* fix(dashboard): let _require_token endpoints work behind the OAuth gate

In gated/OAuth mode (non-loopback bind without --insecure) the dashboard
authenticates the SPA via a session cookie and deliberately does NOT inject
the legacy ephemeral _SESSION_TOKEN into index.html. gated_auth_middleware
verifies the cookie and attaches request.state.session before any non-public
/api/ route runs; the legacy auth_middleware short-circuits in this mode too.

But several handlers call _require_token() directly, which only validated the
(absent) _SESSION_TOKEN header. So every cookie-authenticated request to those
endpoints 401'd — making plugin install/enable/disable, /api/dashboard/plugins/hub,
and the other _require_token routes permanently unreachable behind the gate.
In the UI this surfaced as a 401: {"detail":"Unauthorized"} popup on plugin
install for any publicly-bound (e.g. Fly-hosted NAS) dashboard.

Fix: _require_token now defers to the active gate. When auth_required is True it
accepts the request iff the gate attached a verified session (and 401s otherwise);
loopback/--insecure behavior is unchanged (still validates the session token).

Adds two regression tests driving the full in-process stub OAuth round trip:
the install endpoint must NOT 401 a logged-in request, and must still 401 with
no cookie. Verified the accept-test fails on the pre-fix code.

* test(dashboard): cover the whole _require_token route class under the gate

The install popup was one symptom of a class-wide bug: all 14 endpoints that
call _require_token directly (API-key reveal, provider validation, the
OAuth-provider connect/disconnect flow, and plugin enable/disable/update/
delete/visibility/providers) 401'd cookie-authenticated requests in gated mode.

Add a parametrized test hitting a representative spread (plugins/hub, env/reveal,
providers/validate, an oauth provider route, agent-plugin enable) asserting a
logged-in caller is never 401'd — proving the fix covers the class, not just
agent-plugins/install.
2026-06-09 22:57:49 -07:00
Ben Barclay
e4a1b35a39
fix(config): preserve original .env file mode instead of unconditionally tightening to 0600 (#33699)
`save_env_value()` captures the original .env file mode (e.g. 0640 for Docker
volume mounts) and restores it via `os.chmod` — but then unconditionally calls
`_secure_file(env_path)` on the next line, which re-tightens the mode to 0600
and defeats the entire preservation logic. The intent (preserve when
`original_mode` is captured, secure otherwise) was already in the code but
got short-circuited.

Move `_secure_file()` into the `else` branch so it only runs when no original
mode was captured — fresh `.env` files written for the first time still get
the 0600 hardening treatment, but operator-set modes survive subsequent writes.

Salvages #31518 by @blut-agent (config.py portion only). Their PR also bundled
unrelated lowercase-lookup changes in `hermes_cli/commands.py`; this salvage
takes only the focused config fix. The commands.py changes are reasonable on
their own merits but belong in a separate PR.

Co-authored-by: blut-agent <278569635+blut-agent@users.noreply.github.com>
2026-06-10 15:42:16 +10:00
Teknium
96af61b6ef
feat(memory,skills): approve/deny gate for memory + skill writes (#38199)
Adds memory.write_mode and skills.write_mode (on|off|approve), applied to
both foreground turns and the background self-improvement review fork — the
source of the unprompted 'wrong assumption' saves users reported.

- on (default): write freely, unchanged behaviour
- off: never write; the tool returns a clean disabled result
- approve: don't commit. Memory foreground writes prompt inline (small,
  reviewable in a chat bubble); background memory writes and ALL skill writes
  stage to a pending store instead (a SKILL.md is too large to review inline,
  and a daemon thread can't block on a prompt)

Review staged writes from CLI or any messaging platform:
  /memory pending|approve|reject|mode
  /skills pending|approve|reject|diff|mode

Skill review respects the size asymmetry: inline you see a one-line gist;
the full unified diff stays out-of-band (/skills diff, dashboard, or the
staged JSON file).

New: tools/write_approval.py (gate + pending store), hermes_cli/
write_approval_commands.py (shared CLI+gateway handlers). Gates wired at the
single entry points memory_tool() and skill_manage(), using the existing
write-origin ContextVar to distinguish foreground from background_review.
2026-06-09 21:51:43 -07:00
Ben Barclay
5cf6e28a2f
fix(gateway): auto-start after container restart via planned-stop marker (#42675) (#43236)
* fix(gateway): auto-start after container restart via planned-stop marker

On Docker (s6-overlay), the gateway runs as a dynamically-registered s6
service. When the container stops/restarts/upgrades, s6 sends the gateway
a plain SIGTERM. The shutdown path (_stop_impl) ended with an
unconditional _update_runtime_status("stopped"), persisting
gateway_state=stopped to the volume. container_boot.py reads that on the
next boot and only auto-starts gateways whose last state was "running"
(_AUTOSTART_STATES) — so after a routine `docker compose up
--force-recreate` the gateway stays down and messaging channels silently
go dark, with no error surfaced (issue #42675).

The codebase already distinguishes intentional stops from unexpected
signals via the planned-stop marker (write_planned_stop_marker /
consume_planned_stop_marker_for_self): `hermes gateway stop`,
systemd/launchd ExecStop, and Ctrl+C write a marker before signalling,
so the handler classifies them as planned. An unmarked SIGTERM
(container/s6 restart, OOM, bare kill) is signal-initiated.

This wires that existing classification through to the state persist,
rather than adding unreliable signal-source inference:

- run.py: GatewayRunner._signal_initiated_shutdown, set in
  shutdown_signal_handler's unmarked-signal branch. In _stop_impl, a
  signal-initiated (non-restart) teardown now persists "running" instead
  of "stopped" — preserving the operator's run-intent and overwriting the
  mid-shutdown "draining" marker so _AUTOSTART_STATES matches on reboot.
  Operator stops and restarts persist "stopped" as before.

- service_manager.py: S6ServiceManager.stop() now writes the planned-stop
  marker for the supervised PID (read from s6-svstat) before `s6-svc -d`,
  so an in-container `hermes gateway stop` is correctly classified as
  intentional (parity with the systemd/launchd/host stop paths, which
  already mark). Best-effort: a marker-write failure falls back to the
  safe signal-initiated path.

Tests: shutdown persist-decision table (signal→running, operator→stopped,
restart→stopped), s6 stop marker write + svstat PID parse + failure
tolerance. The signal→running and s6-marker tests fail without the
respective source change. Verified end-to-end against a container built
from this branch: an unmarked SIGTERM to the live gateway leaves
gateway_state=running (shutdown-context log confirms signal path);
existing real container-restart suite still green.
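
The persist-decision table from the tests above reduces to one branch. The function and parameter names below are hypothetical; the three outcomes are the ones listed in the message.

```python
def state_to_persist(signal_initiated, restarting):
    """Which gateway_state to write at the end of shutdown."""
    if signal_initiated and not restarting:
        # Unmarked SIGTERM (container/s6 restart, OOM, bare kill):
        # keep the operator's run intent so autostart matches on reboot.
        return "running"
    # Operator stops and restarts persist "stopped" as before.
    return "stopped"
```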

* docs(docker): clarify gateway autostart distinguishes operator-stop from container-kill

The per-profile-supervision section described the autostart-across-restart
contract as "running gateways come back, stopped stay stopped" without
spelling out what records 'stopped'. That contract was the source of
#42675 confusion: users expected a restart to bring the gateway back and
it didn't. With the write-side fix, only an explicit `hermes gateway stop`
records 'stopped'; container/s6 restart SIGTERMs (incl. image upgrades and
unexpected exits) leave the state 'running' so the gateway auto-starts.
Make that distinction explicit in both the multi-profile and
per-profile-supervision sections.

* test(docker): real-restart autostart E2E for #42675

Adds test_live_gateway_autostarts_after_real_restart_without_manual_state_stamp:
a live s6-supervised gateway is killed by an actual `docker restart`
SIGTERM (no manual gateway_state stamp, no planned-stop marker) and must
auto-start on the next boot. Exercises the WRITE side of the fix that the
existing stamp-based tests bypass.

Verified to FAIL against an origin/main image (reconciler logs
prior_state=stopped action=registered — the #42675 bug) and PASS against
the fixed image (prior_state=running action=started).
2026-06-10 14:01:34 +10:00
Ben Barclay
7df3aa34b1
fix(dashboard-auth): warn when public_url override is silently rejected (#43214)
A non-empty HERMES_DASHBOARD_PUBLIC_URL / dashboard.public_url value that
fails URL validation (overwhelmingly: a missing http(s):// scheme, e.g.
"hermes.domain.com") was silently discarded by resolve_public_url(),
falling back to reconstructing the OAuth redirect_uri from request
headers. Behind a reverse proxy that doesn't forward X-Forwarded-Proto
reliably, that yields an http:// callback even though the operator
explicitly set the public URL — with no signal as to why (#42780).

Emit a deduplicated operator-facing WARNING (once per distinct value,
since resolve_public_url runs per request) naming the offending value
and the required scheme. Turns a silent footgun into a self-diagnosing
one; behaviour is otherwise unchanged.

Tests assert the warning fires for a scheme-less value, is deduplicated
across repeated calls, and stays silent for a valid value — all three
fail without the fix.
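
A sketch of the once-per-value warning. The validation rule (http or https scheme plus a host) and the logger name are assumptions; the real `resolve_public_url()` takes different inputs.

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger("dashboard_auth_sketch")  # hypothetical logger name
_warned = set()  # rejected values already reported


def resolve_public_url(raw):
    """Return a validated public URL, or None so the caller falls back to headers."""
    value = (raw or "").strip()
    if not value:
        return None
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return value.rstrip("/")
    if value not in _warned:
        # Runs per request, so report each distinct bad value only once.
        _warned.add(value)
        logger.warning(
            "Ignoring public URL %r: it must start with http:// or https://",
            value,
        )
    return None
```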
2026-06-10 12:14:57 +10:00
brooklyn!
218452b050
fix(state.db): recover from malformed sqlite_master so hidden sessions reappear (#43149)
* fix(state.db): recover from malformed sqlite_master so hidden sessions reappear

The corruption class behind "Desktop/Dashboard show no sessions while
hundreds of session files sit on disk" is a malformed sqlite_master — most
often a duplicate object row, e.g. two CREATE VIRTUAL TABLE messages_fts
entries — surfacing as:

    sqlite3.DatabaseError: malformed database schema (messages_fts) -
    table messages_fts already exists

SQLite parses the whole schema while preparing the FIRST statement on a
connection, so on this class every statement fails before it runs: PRAGMA
journal_mode (which is where SessionDB.__init__ actually trips, in
apply_wal_with_fallback, BEFORE _init_schema), PRAGMA integrity_check, and
even DROP TABLE. The only operations that still work are
PRAGMA writable_schema=ON plus direct sqlite_master surgery. A plain
FTS-index rebuild at the _init_schema layer therefore cannot reach or fix
this; the canonical sessions/messages rows are intact — only the derived
schema is broken.

Add a dedicated recovery that operates where the failure actually happens:

- hermes_state.repair_state_db_schema(): backs up the raw file first, then a
  least-destructive ladder — (1) de-duplicate sqlite_master keeping the
  lowest rowid per object (preserves the existing FTS index), escalating to
  (2) drop every messages_fts* schema object + VACUUM and let the next open
  rebuild the FTS index from messages. sessions/messages are never modified.
  Plus is_malformed_db_error() to discriminate this class.
- SessionDB.__init__ auto-heals: on a malformed-schema open error it repairs
  once (process-guarded against loops / concurrent web_server opens) and
  reopens, so Desktop/Dashboard recover on their own instead of silently
  showing "no sessions".
- hermes doctor --fix detects the malformed class and repairs it (reporting
  the recovered session count + backup name).
- hermes sessions repair [--check-only] [--no-backup] runs on the raw file
  path, since SessionDB() itself cannot open a malformed DB.
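
One way the malformed-schema class could be discriminated is by exception type plus message text. The body below is an assumption, not the actual `is_malformed_db_error()` implementation; it only matches the error string quoted above.

```python
import sqlite3


def is_malformed_db_error(exc):
    """True for the 'malformed database schema' class of open-time failures."""
    return (
        isinstance(exc, sqlite3.DatabaseError)
        and "malformed database schema" in str(exc).lower()
    )
```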

Supersedes #32589 and #33869: both targeted FTS corruption but gated their
repair behind statements (integrity_check / SELECT / DROP TABLE) that
themselves fail on this class, and neither addressed the apply_wal_with_fallback
open-time failure. Credit preserved via Co-authored-by.

Closes #33865.

Co-authored-by: João Vitor Cunha <145560011+plcunha@users.noreply.github.com>
Co-authored-by: Tuna Dev <273476039+tuancookiez-hub@users.noreply.github.com>

* test(state.db): cover strat-B escalation + unrepairable safe-fail paths

---------

Co-authored-by: João Vitor Cunha <145560011+plcunha@users.noreply.github.com>
Co-authored-by: Tuna Dev <273476039+tuancookiez-hub@users.noreply.github.com>
2026-06-09 18:49:08 -05:00
Teknium
57c6714995
fix(models): keep curated Anthropic aliases in /model picker (#43103)
The Anthropic picker returned the live /v1/models dump verbatim whenever
credentials were configured. Anthropic's API lags newly-routed curated
aliases (e.g. claude-fable-5, reachable on Anthropic before the models
endpoint enumerates it), so the curated entry vanished from the picker.

Merge curated _PROVIDER_MODELS["anthropic"] with the live catalog —
curated first, live-only appended, deduped — mirroring the OpenAI
curated-merge path. Live failure / no creds falls back to curated verbatim.
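
The merge order described above (curated first, live-only appended, deduped) can be sketched as follows. `merge_models` is a hypothetical helper, and a `None` or empty live list stands in for a failed fetch or missing credentials.

```python
def merge_models(curated, live):
    """Curated ids first, then live-only ids, each id kept once."""
    merged, seen = [], set()
    for model in list(curated) + list(live or []):
        if model not in seen:
            seen.add(model)
            merged.append(model)
    return merged
```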
2026-06-09 14:45:19 -07:00
emozilla
d7886da08c add Fable 5 to model list for Anthropic provider 2026-06-09 15:33:42 -04:00
brooklyn!
ba44de06da
fix(install): self-heal a stuck Electron download (salvage of #42894) (#42998)
* fix(install): self-heal a stuck Electron download on the desktop build

The desktop build downloads Electron (~114MB) from GitHub. A corrupt cached
zip, or a blocked/throttled GitHub release host (the repeating "retrying" log),
hard-failed the install — and install.sh had no recovery at all while
install.ps1 / `hermes desktop` only purged the cache.

All three build paths now escalate on a failed `npm run pack`:
GitHub → purge corrupt electron-*.zip + stale *-unpacked and retry → one retry
via a public Electron mirror (npmmirror.com). @electron/get SHASUM-verifies the
download, and a user-pinned ELECTRON_MIRROR is always respected (never
overridden). Adds a bash clear_electron_build_cache()/_desktop_pack() to mirror
the existing PowerShell/Python helpers.
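
The escalation order above might look roughly like this in Python. It is a sketch under assumptions: `pack_with_escalation`, the `run_pack`/`purge_cache` callables and the exact mirror URL are illustrative and are not taken from the installer.

```python
import os
from typing import Callable, Dict, Mapping, Optional

# Assumed mirror URL for illustration; the commit only names npmmirror.com.
FALLBACK_MIRROR = "https://npmmirror.com/mirrors/electron/"


def pack_with_escalation(
    run_pack: Callable[[Dict[str, str]], bool],
    purge_cache: Callable[[], bool],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Try the build, then purge + retry, then one retry via the mirror."""
    base_env = dict(os.environ if env is None else env)

    # 1. Normal attempt (canonical GitHub download).
    if run_pack(base_env):
        return True

    # 2. Retry only if the purge actually removed something.
    if purge_cache() and run_pack(base_env):
        return True

    # 3. A user-pinned ELECTRON_MIRROR is never overridden: no extra attempt.
    if base_env.get("ELECTRON_MIRROR"):
        return False

    # 4. One final attempt through the public mirror.
    mirror_env = {**base_env, "ELECTRON_MIRROR": FALLBACK_MIRROR}
    return run_pack(mirror_env)
```

Passing the build and purge steps in as callables keeps the ordering logic testable without running `npm run pack`.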

* test(install): cover the Electron mirror fallback

Verify `hermes desktop` falls back to a mirror when the cache purge finds
nothing, and that a user-pinned ELECTRON_MIRROR is respected (no extra attempt,
not overridden).

* docs(desktop): troubleshoot a stuck Electron download

Document the automatic cache-purge + mirror fallback, how to pin your own
ELECTRON_MIRROR, and how to clear a corrupt cached zip by hand.

* docs(install): correct the Electron mirror trust framing

The mirror-fallback comments and the desktop troubleshooting doc implied
`@electron/get`'s SHASUM check makes the npmmirror.com download safe against
tampering. It doesn't: the SHASUMS256.txt is fetched from the same mirror, so
the check guards against a corrupt/partial download, not a compromised mirror.

Reframe all four surfaces (install.sh, install.ps1, `hermes desktop`, and the
docs) to state the trust trade-off honestly — npmmirror.com is the de-facto
Electron community mirror, we only fall back to it after the canonical GitHub
download fails, and a user-pinned ELECTRON_MIRROR is never overridden. No
behavior change.

---------

Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>
2026-06-09 18:19:14 +00:00
Teknium
f6f573ebaa
feat(plugins): install from a subdirectory within a repo (#42963)
Support installing a plugin that lives in a subdirectory of a larger
repo (docs/tests at root, plugin in a subdir) without forcing a
dedicated single-plugin repo.

Identifier syntax:
  owner/repo/path/to/plugin        (shorthand + subpath)
  <url>.git/path/to/plugin         (.git boundary on GitHub-style URLs)
  <url>#path/to/plugin             (explicit fragment, any scheme)

_resolve_git_url now returns (git_url, subdir); _install_plugin_core
reads the manifest from and moves only the subdir, so root-level docs
and tests no longer leak into ~/.hermes/plugins. _resolve_subdir_within
guards against path traversal, missing dirs, and non-directories.

Both the CLI (hermes plugins install) and the dashboard install endpoint
inherit this for free since they share _install_plugin_core. Dashboard
install hint + placeholder updated to advertise the subdir syntax.
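
As an illustration of the identifier forms and the traversal guard described above, a simplified sketch follows. `split_identifier` and `resolve_subdir_within` are hypothetical stand-ins for `_resolve_git_url` and `_resolve_subdir_within`, and expanding the shorthand to a github.com URL is an assumption made for the example.

```python
from pathlib import Path
from typing import Optional, Tuple


def split_identifier(identifier: str) -> Tuple[str, Optional[str]]:
    """Split a plugin identifier into (git_url, subdir or None)."""
    # Form 3: explicit "#fragment" works for any scheme.
    base, _, fragment = identifier.partition("#")
    subdir = fragment.strip("/") or None

    if "://" in base or base.startswith("git@"):
        # Form 2: "<url>.git/path" uses the .git suffix as the boundary.
        if subdir is None:
            head, sep, tail = base.partition(".git/")
            if sep:
                base, subdir = head + ".git", tail.strip("/") or None
        return base, subdir

    # Form 1: "owner/repo[/path/to/plugin]" shorthand.
    parts = base.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"not a plugin identifier: {identifier!r}")
    if subdir is None and len(parts) > 2:
        subdir = "/".join(parts[2:])
    return f"https://github.com/{parts[0]}/{parts[1]}.git", subdir


def resolve_subdir_within(root: Path, subdir: str) -> Path:
    """Resolve subdir under root; reject traversal, missing dirs, non-dirs."""
    root = root.resolve()
    candidate = (root / subdir).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"subdir escapes the repository: {subdir!r}")
    if not candidate.is_dir():
        raise ValueError(f"subdir is not a directory: {subdir!r}")
    return candidate
```

Resolving both paths before comparing them means `..` segments and symlinks are judged by where they actually point, not by how the string looks.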

Co-authored-by: Austin Pickett <pickett.austin@gmail.com>
2026-06-09 13:42:51 -04:00
Teknium
ff9c110d5a
feat(models): add anthropic/claude-fable-5 to openrouter + nous curated lists (#42979)
Adds the model above claude-opus-4.8 in both the OPENROUTER_MODELS and
_PROVIDER_MODELS['nous'] curated picker lists used by /model and
`hermes model`. Regenerated website/static/api/model-catalog.json to match.
2026-06-09 10:20:37 -07:00
Gille
c6dc2fcd21
fix(desktop): release profile backends before delete (#42613) 2026-06-09 10:52:02 -05:00
helix4u
f8adefdebf fix(tui): apply terminal backend config before launch
2026-06-09 00:31:27 -07:00
Teknium
e687292eb4
feat(models): persist Nous recommended-models to disk; fall back on Portal failure (#42628)
The Portal's /api/nous/recommended-models endpoint is the source of truth for
which models are free/paid right now, but its result was cached in-process
only. When the live fetch failed (network, parse, non-2xx), the function
returned {} and the model picker silently dropped the free/paid
recommendations — free models would vanish with no indication anything went
wrong.

Add a per-base disk cache at $HERMES_HOME/cache/nous_recommended_cache.json:
a successful live fetch is persisted as last-known-good, and a failed fetch
with an empty in-process cache falls back to the disk copy instead of {}.
Self-heals on the next successful fetch. With no disk copy, still degrades to
{} (callers already handle that). Keyed by portal base URL so staging/prod
don't collide.

E2E: live fetch writes disk; simulated Portal failure returns the cached free
models from disk; no-disk + failure returns {}.
2026-06-09 00:03:43 -07:00