fix(desktop+gateway): full multi-profile support over one global-remote dashboard (#39921)

* fix(desktop): cross-profile session history in app-global remote mode

#39894 made remote-profile sessions first-class for PER-PROFILE remote
overrides. But the common setup (Settings → Gateway → "All profiles" →
Remote) writes app-GLOBAL remote mode (connection.json top-level
mode:'remote', empty profiles map), which the intercept didn't recognize.
Switching to a non-launch profile then 404'd every session read, so no
history showed for it.

In global remote mode a SINGLE backend serves every profile via ?profile=
(it reads each profile's state.db off the remote host's own disk; verified:
one dashboard returns /api/profiles and /api/profiles/sessions?profile=all
across all profiles).

The fix: when no per-profile override matches but global remote mode is
active, route per-session reads/mutations to that one backend and KEEP the
?profile= param so it opens the right state.db (instead of bailing to the
local path and dropping the profile scope).

- new globalRemoteActive(): true for connection.json mode:'remote' or the
  HERMES_DESKTOP_REMOTE_URL env override.
- per-session branch: per-profile override → route sans profile (own db);
  global mode → route to the single backend WITH ?profile= preserved.
- unified list is unchanged in global mode: it already passes through to
  the one backend, which aggregates all profiles natively.

Verified live against a one-dashboard / multi-profile remote (Austin's
topology): cross-profile transcript reads load (was 404), rename/delete
route to the right profile, unified list spans both profiles.

Known limitation (architectural, not fixed here): LIVE chat as a non-launch
profile still needs a per-profile dashboard on the remote. The dashboard
binds HERMES_HOME once at process start, so one global backend can't run an
agent turn as another profile. Session history/read/mutate now work
regardless.
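
The per-session routing decision described above can be sketched as a small pure function. This is only an illustration, written in Python rather than the JavaScript of main.cjs: `route_session_request` and its parameters are hypothetical names, and the two booleans stand in for `profileHasRemoteOverride(profile)` and `globalRemoteActive()`.

```python
from typing import Optional, Tuple
from urllib.parse import quote


def route_session_request(
    pathname: str,
    profile: Optional[str],
    has_override: bool,
    global_remote: bool,
) -> Optional[Tuple[Optional[str], str]]:
    """Return (target_profile, path) for a remote route, or None for the local path.

    target_profile is None when the request goes to the single global backend.
    """
    name = (profile or "").strip()
    if not name:
        return None  # no owner known: fall through to the local path
    if has_override:
        # Per-profile remote: it serves its own state.db, so drop the param.
        return (name, pathname)
    if global_remote:
        # One backend for every profile: keep ?profile= so it opens the right db.
        return (None, f"{pathname}?profile={quote(name, safe='')}")
    return None
```

A per-profile override takes precedence over global mode, which matches the order of the two branches in the hunk for `interceptSessionRequestForRemote`.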

* fix(gateway): resume + chat any profile over one global-remote dashboard

The REST half of this branch made cross-profile session history visible in
app-global remote mode, but resume + chat still went over the WebSocket
gateway, which was hard-bound to the dashboard's launch profile. Resuming a
non-launch profile's session 404'd ("session not found") and sending
spawned a new session, because session.resume/prompt.submit had no profile
concept and the live agent + state.db were process-global to the launch
profile's HERMES_HOME.

Make the WS gateway per-session profile-aware so ONE dashboard can serve
every local profile on its host (the app-global remote topology):

- session.resume accepts an optional `profile`. _profile_home() resolves
  that profile's home on this host; resume opens THAT profile's state.db,
  binds its HERMES_HOME (ContextVar override) while building the agent so
  config/skills/model resolve to it, and passes the profile db to the agent
  so turns persist to the right state.db. The owning profile_home is stored
  on the session.
- prompt.submit re-binds the stored profile_home for the turn thread
  (mid-turn home reads, such as memory and skills, resolve to the resumed
  profile), reset in finally.
- _make_agent gains an optional session_db param (defaults to _get_db()).
- _load_cfg honors the home override (falls back to _hermes_home) so a
  resumed profile loads its own config; cache keyed on resolved path.
- desktop: session.resume now sends the owning profile.

Omitted/launch profile → unchanged (single-profile and per-profile-remote
setups are byte-for-byte the same path).

Verified live against a one-dashboard / multi-profile remote: resuming a
non-launch profile's session loads its history, runs a real turn against
THAT profile's home/env, and persists to its state.db.

tests/tui_gateway/test_protocol.py: _make_agent mocks updated for the new
param.
2026-06-09 08:21:50 +00:00 · 2026-06-05 12:22:55 -05:00 · 2026-06-05 12:22:55 -05:00 · 02d6bf1c39
commit 02d6bf1c39
parent e837856ecd
4 changed files with 119 additions and 20 deletions
--- a/apps/desktop/electron/main.cjs
+++ b/apps/desktop/electron/main.cjs
@ -3922,6 +3922,16 @@ function configuredRemoteProfileNames() {
  return Object.keys(config.profiles || {}).filter(name => profileRemoteOverride(config, name))
 }

+// True when the app is in app-global remote mode (Settings → "All profiles" →
+// Remote, or the env override): a SINGLE remote backend serves every profile via
+// ?profile=. Distinct from per-profile overrides — here there's one host for all.
+function globalRemoteActive() {
+  if (process.env.HERMES_DESKTOP_REMOTE_URL) {
+    return true
+  }
+  return readDesktopConnectionConfig().mode === 'remote'
+}
+
 // GET a profile's resolved backend (remote pool or local primary), parsed JSON.
 async function fetchJsonForProfile(profile, path) {
  return requestJsonForProfile(profile, path, 'GET')
@ -4762,19 +4772,35 @@ async function interceptSessionRequestForRemote(request) {
  }

  // Per-session read/mutation. Owner is in ?profile= (reads) or request.profile
-  // (mutations); route to the remote sans profile param — it serves its own
-  // state.db, with no cross-profile semantics.
+  // (mutations). Two remote shapes:
+  //  - per-profile override: route to that profile's own remote, sans profile
+  //    param (it serves its own state.db natively).
+  //  - global remote mode: ONE backend serves every profile via ?profile=, so
+  //    route there and KEEP the profile param so it opens the right state.db.
  if (/^\/api\/sessions\/[^/]+(\/messages)?$/.test(pathname)) {
    const profile = (searchParams.get('profile') || request.profile || '').trim()
-    if (!profile || !profileHasRemoteOverride(profile)) {
+    if (!profile) {
      return undefined
    }
-    if (method === 'GET') {
-      return fetchJsonForProfile(profile, pathname)
+    if (profileHasRemoteOverride(profile)) {
+      if (method === 'GET') {
+        return fetchJsonForProfile(profile, pathname)
+      }
+      const body = request.body && typeof request.body === 'object' ? { ...request.body } : request.body
+      if (body) delete body.profile
+      return requestJsonForProfile(profile, pathname, method, body)
    }
-    const body = request.body && typeof request.body === 'object' ? { ...request.body } : request.body
-    if (body) delete body.profile
-    return requestJsonForProfile(profile, pathname, method, body)
+    if (globalRemoteActive()) {
+      // Single global backend: keep ?profile= so it opens the right state.db.
+      const sep = pathname.includes('?') ? '&' : '?'
+      const path = `${pathname}${sep}profile=${encodeURIComponent(profile)}`
+      if (method === 'GET') {
+        return fetchJsonForProfile(null, path)
+      }
+      const body = request.body && typeof request.body === 'object' ? { ...request.body, profile } : { profile }
+      return requestJsonForProfile(null, path, method, body)
+    }
+    return undefined
  }

  return undefined
--- a/apps/desktop/src/app/session/hooks/use-session-actions.ts
+++ b/apps/desktop/src/app/session/hooks/use-session-actions.ts
@ -529,7 +529,11 @@ export function useSessionActions({

        const resumed = await requestGateway<SessionResumeResponse>('session.resume', {
          session_id: storedSessionId,
-          cols: 96
+          cols: 96,
+          // Owning profile: in app-global remote mode one backend serves every
+          // profile, so the gateway opens this profile's state.db + home to
+          // resume + persist the right session (no-op for single/launch profile).
+          ...(sessionProfile ? { profile: sessionProfile } : {})
        })

        if (!isCurrentResume()) {
--- a/tests/tui_gateway/test_protocol.py
+++ b/tests/tui_gateway/test_protocol.py
@ -315,7 +315,7 @@ def test_session_resume_returns_hydrated_messages(server, monkeypatch):
            ]

    monkeypatch.setattr(server, "_get_db", lambda: _DB())
-    monkeypatch.setattr(server, "_make_agent", lambda sid, key, session_id=None: object())
+    monkeypatch.setattr(server, "_make_agent", lambda sid, key, session_id=None, session_db=None: object())
    monkeypatch.setattr(server, "_init_session", lambda sid, key, agent, history, cols=80: None)
    monkeypatch.setattr(server, "_session_info", lambda _agent, _session=None: {"model": "test/model"})

@ -366,7 +366,7 @@ def test_session_resume_handles_multimodal_list_content(server, monkeypatch):
            return [multimodal_user, text_only_assistant]

    monkeypatch.setattr(server, "_get_db", lambda: _DB())
-    monkeypatch.setattr(server, "_make_agent", lambda sid, key, session_id=None: object())
+    monkeypatch.setattr(server, "_make_agent", lambda sid, key, session_id=None, session_db=None: object())
    monkeypatch.setattr(server, "_init_session", lambda sid, key, agent, history, cols=80: None)
    monkeypatch.setattr(server, "_session_info", lambda _agent, _session=None: {"model": "test/model"})

@ -432,7 +432,7 @@ def test_session_resume_reuses_existing_live_session(server, monkeypatch):
        def close(self):
            closed_sids.append(self.sid)

-    def make_agent(sid, key, session_id=None):
+    def make_agent(sid, key, session_id=None, session_db=None):
        created_sids.append(sid)
        first_agent_started.set()
        assert agent_can_finish.wait(timeout=1)
@ -547,7 +547,7 @@ def test_session_resume_live_payload_uses_current_history_with_ancestors(server,
    monkeypatch.setattr(
        server,
        "_make_agent",
-        lambda _sid, key, session_id=None: types.SimpleNamespace(
+        lambda _sid, key, session_id=None, session_db=None: types.SimpleNamespace(
            model="test/model", session_id=session_id or key
        ),
    )
@ -647,7 +647,7 @@ def test_session_branch_persists_branched_from_marker(server, monkeypatch):
    monkeypatch.setattr(
        server,
        "_make_agent",
-        lambda _sid, key, session_id=None: types.SimpleNamespace(
+        lambda _sid, key, session_id=None, session_db=None: types.SimpleNamespace(
            model="test/model", session_id=session_id or key
        ),
    )
--- a/tui_gateway/server.py
+++ b/tui_gateway/server.py
@ -16,7 +16,12 @@ from datetime import datetime
 from pathlib import Path
 from typing import Any, Optional

-from hermes_constants import get_hermes_home
+from hermes_constants import (
+    get_hermes_home,
+    get_hermes_home_override,
+    reset_hermes_home_override,
+    set_hermes_home_override,
+)
 from hermes_cli.env_loader import load_hermes_dotenv
 from utils import is_truthy_value
 from tui_gateway.transport import (
@ -458,6 +463,31 @@ def _db_unavailable_error(rid, *, code: int):
    return _err(rid, code, f"state.db unavailable: {detail}")


+# ── per-session profile scoping (global remote mode) ───────────────────────────
+# One dashboard normally serves its launch profile. But the desktop's app-global
+# remote mode points every profile at this single backend, so resume/prompt must
+# be able to act on ANOTHER local profile's state.db + home. The desktop passes
+# ``profile`` on those calls; we open that profile's db and bind its HERMES_HOME
+# (a ContextVar override) for the duration of the call so config/skills/model and
+# message persistence all resolve to the right profile. Omitted/own profile → the
+# launch profile (unchanged for single-profile and per-profile-remote setups).
+def _profile_home(profile: str | None) -> Path | None:
+    """Resolve a named profile's home on THIS host, or None for the launch profile."""
+    name = (profile or "").strip()
+    if not name:
+        return None
+    try:
+        from hermes_cli import profiles as profiles_mod
+
+        home = Path(profiles_mod.get_profile_dir(name))
+    except Exception:
+        return None
+    # Already the launch profile? No override needed.
+    if home.resolve() == Path(_hermes_home).resolve():
+        return None
+    return home if (home / "state.db").exists() or home.exists() else None
+
+
 def write_json(obj: dict) -> bool:
    """Emit one JSON frame. Routes via the most-specific transport available.

@ -873,7 +903,13 @@ def _load_cfg() -> dict:
    try:
        import yaml

-        p = _hermes_home / "config.yaml"
+        # Honor a per-session profile override (see session.resume) so a resumed
+        # remote profile loads ITS config (model, skills, prompt); otherwise the
+        # launch profile's _hermes_home. Cache is keyed on the resolved path, so
+        # profiles don't clobber each other.
+        override = get_hermes_home_override()
+        home = override if isinstance(override, str) and override else _hermes_home
+        p = Path(home) / "config.yaml"
        mtime = p.stat().st_mtime if p.exists() else None
        with _cfg_lock:
            if _cfg_cache is not None and _cfg_mtime == mtime and _cfg_path == p:
@ -2434,7 +2470,7 @@ def _reset_session_agent(sid: str, session: dict) -> dict:
    return info


-def _make_agent(sid: str, key: str, session_id: str | None = None):
+def _make_agent(sid: str, key: str, session_id: str | None = None, session_db=None):
    from run_agent import AIAgent
    from hermes_cli.runtime_provider import resolve_runtime_provider

@ -2494,7 +2530,7 @@ def _make_agent(sid: str, key: str, session_id: str | None = None):
        enabled_toolsets=_load_enabled_toolsets(),
        platform="tui",
        session_id=session_id or key,
-        session_db=_get_db(),
+        session_db=session_db if session_db is not None else _get_db(),
        ephemeral_system_prompt=system_prompt or None,
        checkpoints_enabled=is_truthy_value(os.environ.get("HERMES_TUI_CHECKPOINTS")),
        pass_session_id=is_truthy_value(os.environ.get("HERMES_TUI_PASS_SESSION_ID")),
@ -3094,9 +3130,22 @@ def _(rid, params: dict) -> dict:
        cols = int(params.get("cols", 80))
    except (TypeError, ValueError):
        cols = 80
-    db = _get_db()
+    # ``profile`` (app-global remote mode): resume a session that lives in another
+    # local profile's state.db. None/own profile → the launch profile (unchanged).
+    profile = (params.get("profile") or "").strip() or None
+    profile_home = _profile_home(profile)
+
+    # In a profile scope, the agent OWNS a long-lived db handle bound to that
+    # profile (do NOT auto-close it here). Otherwise reuse the shared launch db.
+    if profile_home is not None:
+        from hermes_state import SessionDB
+
+        db = SessionDB(db_path=profile_home / "state.db")
+    else:
+        db = _get_db()
    if db is None:
        return _db_unavailable_error(rid, code=5000)
+
    found = db.get_session(target)
    if not found:
        found = db.get_session_by_title(target)
@ -3125,6 +3174,9 @@ def _(rid, params: dict) -> dict:
    # dispatch thread (it's not a _LONG_HANDLER), blocking fast-path RPCs.
    sid = uuid.uuid4().hex[:8]
    _enable_gateway_prompts()
+    home_token = (
+        set_hermes_home_override(str(profile_home)) if profile_home is not None else None
+    )
    try:
        db.reopen_session(target)
        history = db.get_messages_as_conversation(target)
@ -3137,11 +3189,17 @@ def _(rid, params: dict) -> dict:
        messages = _history_to_messages(display_history)
        tokens = _set_session_context(target)
        try:
-            agent = _make_agent(sid, target, session_id=target)
+            # Pass the profile's db so the agent persists turns to the right
+            # state.db; home override is active here so config/skills/model
+            # resolve to the profile too.
+            agent = _make_agent(sid, target, session_id=target, session_db=db)
        finally:
            _clear_session_context(tokens)
    except Exception as e:
        return _err(rid, 5000, f"resume failed: {e}")
+    finally:
+        if home_token is not None:
+            reset_hermes_home_override(home_token)

    # Double-checked locking: another concurrent resume may have created the
    # live session while we were building. Re-check under the lock; if it won,
@ -3168,6 +3226,11 @@ def _(rid, params: dict) -> dict:
            _init_session(sid, target, agent, history, cols=cols)
            if sid in _sessions:
                _sessions[sid]["display_history_prefix"] = display_history_prefix
+                # Remember the profile home so each turn re-binds HERMES_HOME (the
+                # agent persists to its own db, but mid-turn home reads — memory,
+                # skills — must resolve to the resumed profile too).
+                if profile_home is not None:
+                    _sessions[sid]["profile_home"] = str(profile_home)
        except Exception as e:
            return _err(rid, 5000, f"resume failed: {e}")
        session = _sessions.get(sid) or {}
@ -4381,6 +4444,7 @@ def _run_prompt_submit(rid, sid: str, session: dict, text: Any) -> None:
    def run():
        approval_token = None
        session_tokens = []
+        home_token = None  # per-turn HERMES_HOME override for a resumed remote profile
        goal_followup = None  # set by the post-turn goal hook below
        try:
            from tools.approval import (
@ -4390,6 +4454,9 @@ def _run_prompt_submit(rid, sid: str, session: dict, text: Any) -> None:

            approval_token = set_current_session_key(session["session_key"])
            session_tokens = _set_session_context(session["session_key"])
+            _profile_home_str = session.get("profile_home")
+            if _profile_home_str:
+                home_token = set_hermes_home_override(_profile_home_str)
            # The sudo password callback is thread-local (tools.terminal_tool
            # _callback_tls), so wiring it on the build thread doesn't reach this
            # turn thread — terminal sudo prompts would fall through to /dev/tty
@ -4718,6 +4785,8 @@ def _run_prompt_submit(rid, sid: str, session: dict, text: Any) -> None:
                    reset_current_session_key(approval_token)
            except Exception:
                pass
+            if home_token is not None:
+                reset_hermes_home_override(home_token)
            _clear_session_context(session_tokens)
            with session["history_lock"]:
                session["running"] = False