hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-21 10:22:18 +00:00

Author	SHA1	Message	Date
Teknium	b266ad748c	chore(deps): npm audit fix — bump transitive undici to clear advisories (#49113) Resolves the 2 npm audit advisories (1 high, 1 moderate), both from transitive undici: - undici 6.26.0 -> 6.27.0 (high: TLS bypass / header injection / response queue poisoning class, via node-gyp + ui-tui) - jsdom's undici 7.27.2 -> 7.28.0 (moderate, via jsdom test dep) Both are in-range bumps (no --force). Lockfile also reconciled two pre-existing manifest drifts during the install: dompurify 3.4.10 -> 3.4.11 (in-range patch) and the web workspace's already-declared vitest ^4.1.5 devDep. No package.json changes. npm audit reports 0 vulnerabilities in root, ui-tui, and apps/desktop after.	2026-06-19 08:20:03 -07:00
brooklyn!	0e8b76532e	fix(desktop): rename "Restart messaging" → "Restart gateway", surface restarts in the statusbar, make logs selectable (#49094 ) * fix(desktop): rename "Restart messaging" -> "Restart gateway" The Command Center control restarts the whole messaging gateway, yet was labelled "Restart messaging" while the status line above it reads "Messaging gateway running/stopped". Rename the i18n key to match what it does, across all 4 locales. * feat(desktop): restart the gateway from Cmd+K, with statusbar spinner feedback Add a shared runGatewayRestart() (store/system-actions.ts) and wire it to a new Cmd+K "Restart gateway" action. While a restart is in flight the statusbar "Gateway" item swaps its icon for the TUI glyph spinner and reads "restarting…", returning to its real state on completion — driven by a $gatewayRestarting atom, not a transient toast or the generic "Agents running" counter. The helper owns its error handling so fire-and-forget callers can't leak an unhandled rejection; only a failure toasts. * fix(desktop): offer a Restart gateway action on messaging save/toggle toasts The "setup saved" and "platform enabled/disabled" toasts told users their change needs a gateway restart but left it a separate hunt. Attach a "Restart gateway" action (the shared runGatewayRestart), and reword the copy to state the pending consequence ("...takes effect after a gateway restart") now that the button carries the verb. Updated all 4 locales. * fix(desktop): make rendered logs selectable so they can be copied The global body { user-select: none } left log surfaces unselectable. Opt them back in via the existing data-selectable-text convention — at the shared LogView primitive (boot-failure + bootstrap install overlays) plus Command Center recent logs, toolset post-setup output, notification detail, and subagent stream/file lines.	2026-06-19 10:09:15 -05:00
Brooklyn Nicholson	929dbf7778	fix(desktop): make rendered logs selectable so they can be copied The global body { user-select: none } left log surfaces unselectable. Opt them back in via the existing data-selectable-text convention — at the shared LogView primitive (boot-failure + bootstrap install overlays) plus Command Center recent logs, toolset post-setup output, notification detail, and subagent stream/file lines.	2026-06-19 10:03:46 -05:00
Brooklyn Nicholson	a1639921ac	fix(desktop): offer a Restart gateway action on messaging save/toggle toasts The "setup saved" and "platform enabled/disabled" toasts told users their change needs a gateway restart but left it a separate hunt. Attach a "Restart gateway" action (the shared runGatewayRestart), and reword the copy to state the pending consequence ("...takes effect after a gateway restart") now that the button carries the verb. Updated all 4 locales.	2026-06-19 10:03:24 -05:00
Brooklyn Nicholson	553cf4f977	feat(desktop): restart the gateway from Cmd+K, with statusbar spinner feedback Add a shared runGatewayRestart() (store/system-actions.ts) and wire it to a new Cmd+K "Restart gateway" action. While a restart is in flight the statusbar "Gateway" item swaps its icon for the TUI glyph spinner and reads "restarting…", returning to its real state on completion — driven by a $gatewayRestarting atom, not a transient toast or the generic "Agents running" counter. The helper owns its error handling so fire-and-forget callers can't leak an unhandled rejection; only a failure toasts.	2026-06-19 10:02:54 -05:00
Brooklyn Nicholson	6308d3416a	fix(desktop): rename "Restart messaging" -> "Restart gateway" The Command Center control restarts the whole messaging gateway, yet was labelled "Restart messaging" while the status line above it reads "Messaging gateway running/stopped". Rename the i18n key to match what it does, across all 4 locales.	2026-06-19 10:02:21 -05:00
Teknium	0d7abd555c	fix(dashboard): sort chat session switcher by most-recent activity (#49104) The Chat-tab session switcher rendered rows in the API's default order="created" (original start time) while each row displays last_active — so a session you just messaged in could sit below an older one, and the list looked unsorted against its own timestamps. Pass order="recent" from ChatSessionList so the switcher sorts by latest activity across the compression chain (most-recently-used at top, ChatGPT-style; long conversations that auto-compressed into a new continuation id stay on the first page). Adds an optional, defaulted `order` arg to api.getSessions; the paginated Sessions page keeps the stable created order.	2026-06-19 07:58:56 -07:00
Teknium	1b04e4ede5	fix(cli): status bar no longer stays hidden after resize during idle (#49105) The classic CLI status bar could vanish for the rest of a session: any terminal reflow (SIGWINCH from a tmux pane change, SSH window restore, font zoom) set _status_bar_suppressed_after_resize=True, but the flag was ONLY cleared on the next submitted user input. Resize then sit idle and the bottom chrome rendered at height 0 on every repaint — even with the refresh clock ticking — so the bar was gone until you typed and hit enter. Fix: _recover_after_resize now schedules a debounced unsuppress timer that clears the flag and repaints once the reflow settles (~0.35s), so the bar returns on its own during idle. The next-submit clear stays as a fast path. Fails open: any error in scheduling clears the flag immediately rather than leaving the bar stuck hidden.	2026-06-19 07:53:58 -07:00
teknium1	7d86178cf5	fix(raft): set stdin=DEVNULL on bridge subprocess Satisfies the repo-wide subprocess-stdin guard (tests/tools/test_subprocess_stdin_guard.py); the long-lived bridge child should not inherit the gateway's stdin.	2026-06-19 07:52:37 -07:00
teknium1	22ccb12c30	chore(release): map skyzh@mail.build to xxchan for Raft salvage CI blocks PRs with unmapped commit-author emails.	2026-06-19 07:52:37 -07:00
skyzh	9026a8c789	feat(gateway): add Raft bundled platform plugin with activity hooks Adds a Raft platform adapter as a bundled plugin (plugins/platforms/raft/) connecting Hermes to Raft as an external agent via a wake-channel bridge. The adapter starts a loopback HTTP endpoint, spawns 'raft agent bridge' as a child process, and injects content-free wake hints into the gateway session pipeline. The agent reads/sends messages through the Raft CLI; the adapter never touches message bodies or delivery cursors. Activity observer hooks report tool/LLM/session lifecycle events via a bounded at-most-once queue. Auto-enables when RAFT_PROFILE is set. Cherry-picked from PR #47629. Authored by skyzh (@xxchan).	2026-06-19 07:52:37 -07:00
Teknium	2a5e9d994a	Merge pull request #48275 from NousResearch/feat/cron-scheduler-provider-chronos feat(cron): pluggable CronScheduler interface + Chronos managed-cron provider (scale-to-zero)	2026-06-19 07:51:59 -07:00
Ben	1928aa0443	fix(managed-scope): honor managed scope in config→env bridges too Manual verification surfaced a second bypass class beyond the standalone config loaders: several code paths bridge config.yaml values into os.environ (HERMES_TIMEZONE, HERMES_REDACT_SECRETS, HERMES_MAX_ITERATIONS, TERMINAL_*, network.force_ipv4, ...) by reading the raw user YAML, so the env the whole process reads carried the USER's value even when an administrator pinned it — e.g. a managed timezone was overridden because gateway/run.py wrote the user's timezone into HERMES_TIMEZONE, and _resolve_timezone_name() checks the env var first. Wired the shared apply_managed_overlay() into every config→env bridge: - gateway/run.py module-level startup bridge (timezone, redact_secrets, max_turns, terminal, display, gateway.strict, ...) - gateway/run.py _reload_runtime_env_preserving_config_authority (the per-turn re-bridge that keeps config authoritative over reloaded .env — must keep MANAGED authoritative on every turn, not just startup) - hermes_cli/main.py early security.redact_secrets / network.force_ipv4 bridge (runs before load_config is usable, at import time) - hermes_cli/send_cmd.py top-level scalar config→env bridge Verified end-to-end against a writable managed dir (12/12 checks incl. timezone, logging, model, skin, gateway settings, write-guard) and in a clean process the gateway per-turn bridge writes HERMES_TIMEZONE=<managed>. Adds an order-independent regression test for the bridge overlay.	2026-06-19 07:46:33 -07:00
Ben	b0e47a98f9	fix(managed-scope): honor managed scope in all standalone config loaders The skin bug was one instance of a class: several subsystems build their config dict directly from config.yaml instead of routing through hermes_cli.config.load_config (which carries the managed merge), so they silently ignored administrator-pinned values. Audited every config.yaml reader and fixed the behavioral-read bypasses: - gateway/config.py load_gateway_config (messaging gateway: session_reset, quick_commands, stt, model, ...) - gateway/run.py _load_gateway_config (its read_raw_config fast path also skipped the merge — read_raw_config returns raw user YAML) - tui_gateway/server.py _load_cfg (new TUI + desktop backend: skin, reasoning_effort, service_tier, provider_routing) - cron/scheduler.py (scheduled-job model/reasoning/toolsets/provider_routing) - hermes_logging.py (logging.level/max_size_mb/backup_count) - hermes_time.py (timezone) - hermes_cli/doctor.py (memory-provider diagnostic reads effective config) All route through a new shared managed_scope.apply_managed_overlay() helper that mirrors _load_config_impl (env-only expansion so a user ${VAR} can't shadow a managed literal, root-model-string normalization, leaf-merge) and is fail-open. cli.py's earlier inline fix is refactored onto the same helper. Write-back paths (slash_commands, telegram/yuanbao dm_topics, profile distribution) are deliberately left reading raw user YAML — overlaying managed values there would persist them into the user file. The dashboard (web_server.py) already routes through load_config and needed no change. TUI loader caches the RAW config so _save_cfg never writes managed values to disk. Adds test_managed_scope_overlay.py (helper) and test_managed_scope_loaders.py (per-surface integration); mutation-checked.	2026-06-19 07:46:33 -07:00
Ben	732293cf87	fix(managed-scope): apply managed layer in cli.py's standalone config loader cli.py's load_cli_config() builds CLI_CONFIG independently of hermes_cli.config._load_config_impl (it reads config.yaml directly and merges into hardcoded defaults), so the Phase 2 managed merge never reached the interactive CLI/TUI surface. Symptom: a managed display.skin (and any other display/CLI pref read from CLI_CONFIG) was silently ignored by the TUI while `hermes config`/`doctor`/write-guards — which go through load_config — correctly honored it. Found via manual testing: the skin engine kept using 'default'. Fix: overlay the managed config last in load_cli_config(), mirroring _load_config_impl — expand against the process env only (so a user ${VAR} can't shadow a managed literal), normalize the root model key so a managed `model: x/y` string can't clobber the dict shape callers expect, then leaf-merge. Fail-open so managed scope can never block CLI startup. Adds tests/hermes_cli/test_managed_scope_cli_config.py locking that CLI_CONFIG honors managed values, preserves user siblings, and is inert with no scope.	2026-06-19 07:46:33 -07:00
Ben	9a24e41d0f	docs: add managed scope admin guide + cross-link from configuration	2026-06-19 07:46:33 -07:00
Ben	ddd519ea70	feat(managed-scope): surface managed scope in config show and doctor - show_config prints an administrator header naming the managed source and lists the pinned config/env keys when a scope is active (silent otherwise). - hermes doctor gains a managed_scope_check under Configuration Files that reports the resolved managed dir + pinned key counts, and flags a HERMES_MANAGED_DIR redirect (the documented foot-gun).	2026-06-19 07:46:33 -07:00
Ben	4f9e15df97	feat(managed-scope): guard writes to managed config/env keys - set_config_value hard-rejects a managed config key (D2) and names the source, exiting non-zero. - save_env_value / remove_env_value refuse a managed env key. - save_config strips managed leaves from a bulk write (mechanical safety net) with a warning, so the unmanaged remainder still persists. New _strip_dotted_keys helper drives the bulk-save pruning. All guards are distinct from and layered after the existing is_managed() package-manager write-lock.	2026-06-19 07:46:33 -07:00
Ben	81a663abea	feat(managed-scope): apply managed .env last with override load_hermes_dotenv now loads the managed-scope .env after user/project .env and external secret sources, with override=True, so managed env values beat the user .env and any pre-existing shell export. Reuses the existing dotenv fallback + credential-sanitization path. Fail-open: no managed dir/.env is a no-op and any error is swallowed so managed scope never blocks startup.	2026-06-19 07:46:33 -07:00
Ben	b5ddd6e719	feat(managed-scope): managed config layer wins over user config _load_config_impl now deep-merges the managed config.yaml on top of the expanded user config so managed leaves win while sibling keys stay user-controlled (leaf-level merge, D3). Managed values are expanded against the process env only, never user-defined ${VAR}, so a user can't shadow a managed literal. The managed file's (mtime,size) is folded into the load cache key so editing it invalidates the cache. This inverts the usual env-over-config precedence for pinned keys by design (see design doc §4.1).	2026-06-19 07:46:33 -07:00
Ben	9cbcc0c9c8	feat(managed-scope): add managed_scope module (resolver, loaders, key helpers) New hermes_cli/managed_scope.py resolves a system-level managed directory (HERMES_MANAGED_DIR override > /etc/hermes), parses managed config.yaml/.env with fail-open semantics, and exposes is_key_managed/is_env_managed helpers. The system default is ignored under pytest and HERMES_MANAGED_DIR is added to the conftest env scrub so a real managed scope can't leak into the suite. Not wired into the load paths yet (Phases 2-3).	2026-06-19 07:46:33 -07:00
Ben	bf9a0481fa	test(config): pin config/env load behavior before managed scope	2026-06-19 07:46:33 -07:00
teknium1	a58287afcb	Merge remote-tracking branch 'origin/main' into pr48275-rebase # Conflicts: # cron/scheduler.py	2026-06-19 07:40:29 -07:00
Teknium	35e7ca03d5	fix(kanban): treat already-gone worker as terminated, not survived _terminate_reclaimed_worker early-returned on ProcessLookupError with terminated=False. The new reclaim-defer guard reads that as 'worker survived the kill' and defers the reclaim forever, so a stale task whose worker is already dead never lands in result.stale. ProcessLookupError means the process is gone — that IS a successful termination. Split it from the generic OSError branch and set terminated=True.	2026-06-19 07:38:10 -07:00
Sahil Saghir	b9e521da23	fix(kanban): hold reclaim while the worker is still alive release_stale_claims and detect_stale_running call _terminate_reclaimed_worker and then release the task claim unconditionally, even when the termination did not actually kill the worker. _terminate_reclaimed_worker already reports this via its "terminated" flag, but the callers ignore it. When a worker is parked in uninterruptible (D) state — for example throttled by a cgroup memory.high limit — a pending SIGTERM/SIGKILL cannot be delivered until the throttle lifts, so the kill is a no-op. The dispatcher then frees the claim and spawns a fresh worker beside the still-alive one. Repeated every dispatch tick this accumulates duplicate workers without bound, deepening the memory pressure that caused the throttle in the first place — a self-reinforcing runaway. Fix: gate both automatic reclaim paths on _worker_survived_termination(). When we attempted to kill our own host-local worker and it is still alive, defer the reclaim (_defer_reclaim_for_live_worker extends the claim a short grace and emits a reclaim_deferred event) instead of releasing. This guarantees at most one live worker per task and is self-correcting: not spawning a duplicate is what relieves the pressure so the pending signal lands and the worker dies, and the next tick reclaims cleanly. Non-host-local claims and the operator-driven reclaim_task() path keep their existing force-release behaviour. Related: #41448 (concurrent dispatchers amplify this by doubling reclaim frequency); #42858 (kill the worker rather than orphan it on archive). Tests: defer-when-worker-survives, reclaim-when-killed, release-when-not-host-local, and the detect_stale_running path.	2026-06-19 07:38:10 -07:00
teknium1	13d4b5fe2f	fix(hindsight): align client version to 0.6.1 across all sources The lazy_deps pin (memory.hindsight -> hindsight-client==0.6.1) was newer than the plugin's stated floor (>=0.4.22). Align _MIN_CLIENT_VERSION, the setup wizard dep string, plugin.yaml, and the README to 0.6.1 so the floor check, auto-upgrade target, and runtime lazy-install all agree. Also drops the redundant local _MIN_CLIENT_VERSION redefinition in post_setup.	2026-06-19 07:36:28 -07:00
Ben	6c44471bfd	fix(hindsight): lazy-install cloud client dependency	2026-06-19 07:36:28 -07:00
Sahil Saghir	db744e7d1e	feat(simplify-code): add risk-tiered application, Chesterton's Fence, slop + silent failure detection Five targeted enhancements to the upstream simplify-code skill: 1. Risk-tiered application (SAFE/CAREFUL/RISKY) — safe changes auto-applied, careful changes verified per-file, risky changes flagged for human review. Prevents auto-applying N+1 restructures and public API renames. 2. Chesterton's Fence — before flagging anything for removal, reviewers run 'git blame' to understand why it exists. Low-confidence findings are escalated rather than guessed. 3. AI slop detection — Quality reviewer now catches: extra comments restating obvious code, unnecessary defensive null-checks on validated inputs, 'as any' casts, and patterns inconsistent with the rest of the file. 4. Silent failure detection — Efficiency reviewer now catches: empty catch blocks, ignored error returns, except:pass, .catch(()=>{}) with no handling, and error propagation gaps. 5. Structured reviewer output with confidence+risk tags — reviewers report in 'file:line → problem → fix \| confidence: H/M/L \| risk: SAFE/CAREFUL/RISKY' format, enabling the orchestrator to tier the application. Plus 3 new pitfalls: over-trusting dead code tools, public contract awareness, and preserving intentional error handling. Total: +45/-8 lines. Keeps the 212-line compact spirit. Ref: #379	2026-06-19 07:35:36 -07:00
Teknium	ba50e86563	fix: open dispatcher lock file with explicit utf-8 encoding ruff (unspecified-encoding) and the Windows-footgun checker both flag open() in text mode without encoding=. Keep text mode (the Windows lock path in _try_acquire_file_lock writes a str newline) and pass encoding='utf-8'.	2026-06-19 07:35:33 -07:00
Sahil Saghir	226e9322e1	fix(kanban): cross-platform dispatcher lock + explicit release Two robustness gaps from community review (#44919): 1. Windows dead-path: replaced bespoke fcntl.flock with gateway.status _try_acquire_file_lock / _release_file_lock — already cross-platform (msvcrt on Windows, fcntl on POSIX). Added _release_singleton_lock helper. 2. Lock fd never released: stored handle is now released explicitly in both exit paths — CancelledError handler and normal while-loop exit. Allows in-process stop/restart (tests, embedded use). Also tightened docstrings — 'corrupt the SQLite DBs' is now specific (wal_autocheckpoint=0 + concurrent manual WAL checkpoints can corrupt index pages), matching the module's own concurrency claims.	2026-06-19 07:35:33 -07:00
Sahil Saghir	dfa561092a	fix(kanban): machine-global singleton lock for the embedded dispatcher (#41448 ) The gateway's embedded dispatcher has no guard against more than one dispatcher running concurrently. dispatch_in_gateway defaults to true, so a second gateway for the same profile (a restart race where the old process is slow to exit) — or any deployment that runs multiple profile gateways with the default — starts a second dispatcher loop. As #41448 describes, concurrent dispatchers each run release_stale_claims() against the same boards, double reclaim frequency, and re-dispatch slow workers before they finish. In practice they also corrupt the shared kanban SQLite DBs under concurrent write load. Add _acquire_singleton_lock(): an exclusive, non-blocking fcntl.flock at the machine-global kanban root (kanban_home()/kanban/.dispatcher.lock — the board is shared across profiles by design, so this serialises every gateway, not just one profile). The first gateway to start its dispatcher holds the lock for its process lifetime; any other gateway finds it contended, logs, and skips dispatching while still running for messaging. Falls back to config-only control on non-POSIX or filesystems without flock. This is more robust than a per-profile guard because the documented model is "one dispatcher sweeps all boards" — the contention is across profiles, not just within one. Closes #41448. Test: lock is exclusive (held, then contended while held, then held again after release).	2026-06-19 07:35:33 -07:00
Sahil Saghir	a5e06078b2	fix(cron): compact cron failure messages + repair bare repo dirs after git gc Two small, focused fixes for the cron scheduler and checkpoint manager. 1. _summarize_cron_failure_for_delivery (cron/scheduler.py): Replaces the raw error dump in _process_job with a compact pattern-matched summary. Provider rate limits, timeouts, and authentication errors now produce a short human-readable message instead of dumping multi-KB provider JSON into the delivery channel. 2. _repair_bare_repo_dirs (tools/checkpoint_manager.py): Recreates refs/heads/ and branches/ directories after git gc --prune=now, which can remove empty dirs from bare repos and cause subsequent git add -A to fail with 'fatal: not a git repository'. Called after all four git gc call sites. Both fixes use only standard library imports and plug into existing call sites with no architectural changes.	2026-06-19 07:35:29 -07:00
Teknium	1958208744	chore(release): add Sahil-SS9 to AUTHOR_MAP for PRs #48466/#44919/#44909/#42209	2026-06-19 07:35:29 -07:00
Teknium	d7bff949af	fix(cli): default cli_refresh_interval to 1.0 to keep status bar alive (#49087) PR #49056 set the default to 0, which reverts the #45592 idle-clock fix: without a periodic invalidate, prompt_toolkit stops repainting the bottom chrome during idle and the status bar goes stale/disappears after a turn. Restore 1.0 as the default for everyone. The config knob stays — users on emulators where the per-second redraw fights auto-scroll (#48309) can set display.cli_refresh_interval: 0 to opt out.	2026-06-19 07:35:06 -07:00
Ben Barclay	2dd285f9b3	docs(gateway): document multiplexing opt-in + contract changes Extend the 'Running Many Gateways at Once' user-guide page with a 'one gateway for all profiles (multiplexing)' section, kept to a single page: - How to opt in (gateway.multiplex_profiles on the default profile) and when to prefer it vs one-process-per-profile. - Every contract change a user sees when the flag is on: 1. secondary-profile 'gateway start' is a hard error (--force escape hatch), 2. HTTP-inbound reached via /p/<profile>/ prefix; secondary profiles must NOT enable a port-binding platform (webhook/api_server/msgraph_webhook/feishu/ wecom_callback/bluebubbles/sms) — config error at startup, 3. per-credential platforms still need their own token per profile, 4. session keys namespaced agent:<profile>: (default stays agent:main:), 5. single PID/lock + aggregated hermes status, per-profile runtime_status.json. - What does NOT change: per-profile .env credential isolation (stricter, incl. MCP/Kanban subprocess env), Kanban, profile-scoped skills/memory/SOUL, routing. All inert when the flag is off.	2026-06-19 07:34:15 -07:00
Ben Barclay	1e70df5fdd	feat(gateway): multiplex phase 4 — lifecycle guard + per-profile observability - _guard_named_profile_under_multiplexer: when the default gateway is running with gateway.multiplex_profiles=on, a named-profile 'hermes gateway run' hard-errors (pointing at the multiplexer) instead of double-binding that profile's platforms. Inert unless all hold: this invocation is a named profile, a default-profile gateway is alive, and its config has multiplexing on. --force overrides. Wired into run_gateway's guard chain. - write_runtime_status gains served_profiles: the secondary-adapter startup records [active] + multiplexed profiles into runtime_status.json so 'hermes status' can show per-profile coverage without a second probe. Absent for single-profile gateways. Tests: served_profiles round-trips and is absent by default; guard is inert for the default profile / under --force / when no default gateway is running.	2026-06-19 07:34:15 -07:00
Ben Barclay	d5d02eabb0	feat(gateway): multiplex phase 3 — secondary-profile adapter registry + conflict detection Bring up adapters for every profile the gateway serves, not just the active one. Keeps self.adapters as the default/active profile's map (the ~93 existing self.adapters[...] sites are untouched) and adds secondary profiles under self._profile_adapters[profile][platform]. - _start_secondary_profile_adapters loops profiles_to_serve(multiplex=True), skips the active profile (handled by the primary startup loop), and for each other profile loads its gateway config and creates+connects its enabled adapters under that profile's _profile_runtime_scope (home + secret scope). - Each secondary adapter gets _make_profile_message_handler(profile): stamps source.profile (when unset) before delegating to the shared _handle_message, so the agent turn and session key resolve to that profile. - Same-platform credential-conflict detection: _adapter_credential_fingerprint hashes the adapter's bot token (salted, truncated — never logs the token); two profiles claiming the same (platform, token) refuse the duplicate with a clear error naming both, since one token can't be polled twice. - Port-binding hard-error: a SECONDARY profile that enables a port-binding platform (webhook, api_server, msgraph_webhook, feishu, wecom_callback, bluebubbles, sms) is a config error and aborts startup via MultiplexConfigError — the default profile owns the single shared HTTP listener and serves every profile through the /p/<profile>/ prefix, so a second bind can only collide. Distinct from a transient connect failure (which logs + stays alive to retry): a config error writes gateway_state=startup_failed and exits cleanly with an actionable message (names the profile, the platform, and the fix). There is no valid reason to bind a second port once you've opted into a multiplexer. - Shutdown tears down secondary adapters alongside the primary ones. 
- Defensive getattr guards keep partial-construction unit tests (stop(), _run_agent on bare instances) working. No-op when multiplex_profiles is off (self._profile_adapters stays empty). Tests: fingerprint stability/log-safety/distinctness, profile message-handler stamping (and not overriding an already-stamped source), port-binding hard-error raises + names the profile/platform, non-binding platform is not rejected, and the guard set covers every TCP-binding adapter.	2026-06-19 07:34:15 -07:00
Ben Barclay	f35abb122a	feat(gateway): multiplex phase 1 — HTTP-inbound /p/<profile>/ routing (webhook) Serve webhook inbound for multiple profiles off the one shared listener via a URL prefix, with no second port bound. - SessionSource gains a 'profile' field (round-trips through to_dict/from_dict; omitted when unset so existing serialization is unchanged). It carries which profile an inbound message was routed to. - WebhookAdapter registers /p/{profile}/webhooks/{route_name} alongside the existing /webhooks/{route_name}. _resolve_request_profile validates the prefix against profiles_to_serve(): None when absent or multiplexing is off (ignored, handled as default — no spurious 404), the profile name when valid, _PROFILE_REJECTED (→ 404) when the profile isn't served. The resolved profile is stamped onto the SessionSource. - session-key namespacing and the per-turn home/credential scope now prefer source.profile: SessionStore._resolve_profile_for_key(source), _session_key_for_source fallback, and _resolve_profile_home_for_source all honor it (→ the agent turn resolves that profile's config/skills/credentials via the Phase 2 _profile_runtime_scope). Constraint: routing inbound needs no per-profile platform credential, but the agent still needs the routed profile's provider key — delivered by Phase 2's secret scope. api_server (OpenAI-compatible surface) profile routing is a focused follow-on; its source-construction path differs from webhook's. Tests: SessionSource.profile round-trip + namespace drive; _resolve_request_profile accept/reject/ignore matrix.	2026-06-19 07:34:15 -07:00
Ben Barclay	f538470cf4	feat(gateway): multiplex phase 2 — fail-closed profile credential isolation (Workstream A) The credential gate. When multiplexing is active, a profile's secrets resolve from a context-local scope, never the process-global os.environ (which in a multiplexer may hold another profile's keys, and is inherited by every subprocess spawned with env=dict(os.environ)). - agent/secret_scope.py: get_secret() backed by a secret-scope contextvar. FAIL-CLOSED: when multiplex is active and no scope is installed, an unscoped read RAISES UnscopedSecretError instead of falling back to os.environ — a missed/new call site crashes loudly at that line rather than leaking a cross-profile value. Genuinely-global vars (HERMES_*, PATH, kanban paths, …) keep reading os.environ via an allowlist. load_env_file/build_profile_secret_scope parse a profile .env into an isolated dict WITHOUT mutating os.environ. Off by default => transparent os.getenv behavior. - hermes_cli/runtime_provider.py: all credential/provider/base-url reads go through _getenv -> get_secret. - agent/credential_pool.py: env fallbacks route through get_secret (the ~/.hermes/.env-first preference is preserved and already profile-correct via the home override). - tools/mcp_tool.py: MCP config interpolation resolves through get_secret, so a server's picks up the routed profile's value. - gateway/run.py: set_multiplex_active() at GatewayRunner init; per-turn .env reload is a no-op for credentials in multiplex mode (secrets come from the scope, not global env); _profile_runtime_scope context manager combines the HERMES_HOME override + secret scope; _run_agent wraps _run_agent_inner in that scope (resolved via _resolve_profile_home_for_source) when multiplexing. Propagates into the agent worker thread for free via the existing copy_context() in _run_in_executor_with_context.
Tests: 13 unit (fail-closed, scope isolation, global allowlist, .env parsing without environ mutation) + 7 E2E (runtime_provider + MCP interpolation prove two profiles isolated, unscoped read raises, globals still read environ).	2026-06-19 07:34:15 -07:00
Ben Barclay	d82f9fa7f7	feat(gateway): multiplex phase 0 — config flag, profile enumeration, profile-stamped session keys Foundations for serving multiple profiles from one gateway process, inert when off: - gateway.multiplex_profiles config flag (default false), round-trips through GatewayConfig and load_gateway_config (top-level + nested gateway.* form). - hermes_cli.profiles.profiles_to_serve(multiplex): the single chokepoint for which (profile, HERMES_HOME) pairs the gateway serves. Lightweight dir scan; active-profile-only when off, default + all named profiles when on. - build_session_key gains a profile= namespace slot. Default/None reuse the historical 'agent:main:...' literal BYTE-IDENTICALLY (no session migration, positional parsers unaffected); a named profile becomes 'agent:<profile>:...' so two profiles on the same platform/chat never collide. - SessionStore._resolve_profile_for_key + _session_key_for_source fallback resolve the namespace from the flag (legacy when off, active profile when on). Tests: byte-identical-when-off (parametrized), namespace isolation, positional layout preserved, config round-trip, profiles_to_serve enumeration.	2026-06-19 07:34:15 -07:00
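The profile namespace slot from d82f9fa7f7 might look roughly like the sketch below. Only the 'agent:main:...' and 'agent:<profile>:...' prefixes are taken from the commit message; the platform and chat_id parameters, the trailing key layout, and the treatment of a profile literally named "default" are assumptions.

```python
from typing import Optional


def build_session_key(
    platform: str, chat_id: str, profile: Optional[str] = None
) -> str:
    # None or the default profile keeps the historical 'main' literal,
    # so existing session keys stay unchanged.
    namespace = "main" if profile in (None, "", "default") else profile
    # Hypothetical tail layout: the real key may carry more fields.
    return f"agent:{namespace}:{platform}:{chat_id}"
```

With this shape, build_session_key("telegram", "42") and build_session_key("telegram", "42", profile="work") differ only in the second field, so two profiles on the same chat cannot collide while positional parsers still find each part in the same slot.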
alt-glitch	9e1f616136	fix(clarify): docstring — put options in choices[] only, never enumerate in question text The model was enumerating options inside the question string (dead prose the UI can't render as pickable rows). Schema description now spells out: choices[] is REQUIRED for selectable options; question holds ONLY the question.	2026-06-19 07:34:02 -07:00
teknium1	df2420f571	fix(gateway): keep non-Discord home-channel startup send byte-identical The salvaged non_conversational marking made the home-channel startup no-metadata branch always pass metadata= explicitly; for non-Discord platforms _non_conversational_metadata returns None, so Telegram/etc. went from adapter.send(chat_id, message) to adapter.send(..., metadata=None). Behaviorally identical but broke test_restart_notification's exact assert_called_once_with. Only attach metadata when the marker applies (Discord), restoring the original call shape elsewhere.	2026-06-19 07:29:27 -07:00
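The call-shape fix in df2420f571 comes down to passing the metadata keyword only when there is something to pass. A minimal sketch follows, assuming a hypothetical send_startup_notice wrapper, a stand-in non_conversational_metadata helper, and a marker value of True; only the metadata["non_conversational"] key and the adapter.send(chat_id, message) shape come from the commit messages.

```python
from typing import Any, Dict, Optional


def non_conversational_metadata(platform: str) -> Optional[Dict[str, Any]]:
    # Stand-in for the gateway helper: only Discord gets the marker.
    if platform == "discord":
        return {"non_conversational": True}
    return None


def send_startup_notice(adapter: Any, platform: str, chat_id: str, message: str) -> None:
    metadata = non_conversational_metadata(platform)
    if metadata is not None:
        adapter.send(chat_id, message, metadata=metadata)
    else:
        # Keep the original two-argument call so exact-call assertions still pass.
        adapter.send(chat_id, message)
```

Passing metadata=None would behave the same at runtime, but a test using assert_called_once_with(chat_id, message) compares the exact arguments and would fail on the extra keyword.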
snav	caaa916289	fix(gateway): don't let delayed Discord status messages partition history backfill Discord channel-history backfill partitions on Hermes' last self-authored message. Asynchronous, non-conversational status sends (self-improvement review bubbles, heartbeats, background-process notifications, update status, gateway restart/online notices) land as ordinary bot messages, so a delayed status bump becomes the history boundary and swallows real messages that arrived after Hermes' actual reply. Mark these sends at the source via metadata["non_conversational"] (Discord only; other platforms' metadata is unchanged). The adapter no longer advances the history-boundary cache for marked sends and persists their IDs to a sidecar JSON so the cold-start scan can skip them by ID after a restart. A narrow regex recognizer remains only as an upgrade bridge for status bumps emitted by an older gateway that pre-dates the marking.	2026-06-19 07:29:27 -07:00
Teknium	b936f92b25	fix(desktop): render send/prefill directive notices (/goal, /undo) (#49073 ) The desktop slash dispatcher dropped the `notice` field on `send` and never handled `prefill` directives at all. `/goal <text>` returns {type: send, notice: "⊙ Goal set …", message} from command.dispatch — the desktop submitted the goal text as a plain prompt with no feedback, so the goal looked like it did nothing. `/undo` returns a prefill directive that fell through to "invalid response". - types: add `notice?` to SendCommandDispatchResponse; add PrefillCommandDispatchResponse to the union. - parseCommandDispatch: keep `notice` on send, parse prefill. - runExec dispatcher: render the notice as a system line before acting, and handle prefill by dropping the message into the composer for editing (mirrors the TUI's createSlashHandler). Tests: parseCommandDispatch send-notice / prefill cases.	2026-06-19 07:28:50 -07:00
Carlos Diosdado	e00b965406	feat(tts): add xAI TTS speed and optimize_streaming_latency config knobs The xAI TTS REST endpoint (POST /v1/tts) accepts 'speed' (0.7-1.5) and 'optimize_streaming_latency' (0/1/2) parameters, but the Hermes built-in xAI provider was reading neither from config nor sending either in the request body. Add them as tts.xai.speed and tts.xai.optimize_streaming_latency config knobs (with global tts.speed / tts.optimize_streaming_latency fallbacks). - speed: float, clamped to 0.7-1.5. 1.0 (the API default) is omitted from the request body to preserve the existing minimal-payload contract. - optimize_streaming_latency: int, clamped to 0-2. 0 (best quality, the API default) is omitted from the request body. Resolver order: tts.xai.<knob> overrides the global tts.<knob>.	2026-06-19 07:26:56 -07:00
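The resolver order and omit-the-default rule from e00b965406 can be sketched as below. The config keys, clamping ranges and omitted defaults follow the commit message; the function names and the "text" request field are hypothetical and are not taken from the Hermes provider or the xAI API reference.

```python
from typing import Any, Dict, Optional


def _clamp(value, low, high):
    return max(low, min(high, value))


def resolve_knob(config: Dict[str, Any], name: str) -> Optional[Any]:
    # tts.xai.<knob> overrides the global tts.<knob>.
    tts = config.get("tts", {})
    provider_value = tts.get("xai", {}).get(name)
    return provider_value if provider_value is not None else tts.get(name)


def build_xai_tts_body(config: Dict[str, Any], text: str) -> Dict[str, Any]:
    body: Dict[str, Any] = {"text": text}  # "text" is an assumed field name

    speed = resolve_knob(config, "speed")
    if speed is not None:
        speed = _clamp(float(speed), 0.7, 1.5)
        if speed != 1.0:  # API default: leave it out of the payload
            body["speed"] = speed

    latency = resolve_knob(config, "optimize_streaming_latency")
    if latency is not None:
        latency = _clamp(int(latency), 0, 2)
        if latency != 0:  # API default: leave it out of the payload
            body["optimize_streaming_latency"] = latency

    return body
```

Omitting values equal to the API defaults means a config that sets nothing, or sets only defaults, produces the same minimal request body as before the knobs existed.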
Teknium	8b7c89bff2	feat(dashboard): session switcher panel on the Chat tab (#49077 ) Add a ChatGPT-style conversation list beside the embedded TUI on the dashboard Chat tab so users can swap sessions without leaving the page. - New ChatSessionList component: lists recent sessions for the active profile (title/preview, last-active, message count, source), a New chat button, and a refresh control. Best-effort like ChatSidebar. - Selecting a row drives /chat?resume=<id>, which ChatPage already treats as part of the PTY identity, so the terminal respawns resuming that conversation. Active row is highlighted; New chat clears resume. - Wired into ChatPage as a dedicated right-side column (desktop) and into the existing slide-over panel above model/tools (narrow screens). - i18n: new sessions.newChat key across all locales. - Read-only switcher by design — delete/rename/export stay on Sessions. Docs: web-dashboard.md Chat section documents the switcher.	2026-06-19 07:26:53 -07:00
teknium1	06c7c2577f	test(desktop): lock generic OAuth status fallthrough for catalog-only providers	2026-06-19 07:26:46 -07:00
teknium1	1d59d2dcae	feat(desktop): resolve OAuth status for catalog-only account providers Accounts-tab cards derived from the unified provider_catalog() carry status_fn=None and had no hardcoded branch in _resolve_provider_status, so any future OAuth/account provider plugin rendered permanently logged-out. Fall through to the canonical hermes_cli.auth.get_auth_status slug dispatcher and adapt its shape, so membership AND status both auto-extend with the hermes model universe.	2026-06-19 07:26:46 -07:00
Austin Pickett	d91b8d8368	test(desktop): make keyVar a typed EnvVarInfo factory Address review feedback on the keyVar test helper: it mocks one /api/env row (an EnvVarInfo), so type it as such and mirror the sibling provider() factory's base-plus-Partial-override shape instead of hardcoding positional args and fabricated fields (description='X direct API', url=''). Route the WidgetAI test through it too, removing the inline duplicate of the same object shape.	2026-06-19 07:26:46 -07:00
Austin Pickett	ee0de638d7	feat(desktop): add API-keys search; keep provider lists priority-sorted - API-keys tab: a SearchField filters provider cards by name / env-var key / description, with a 'no providers match' empty state. Card order stays priority-then-name (curated PROVIDER_GROUPS priority floats recommended providers up; equal priority falls back to alphabetical). - Accounts tab: 'Other providers' keep sortProviders order (priority, then name) — unchanged. Adds searchKeys/noKeysMatch i18n strings across all four locales. Vitest covers priority/name ordering + live filtering + empty state.	2026-06-19 07:26:46 -07:00

12172 commits