hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-23 10:42:00 +00:00

Author	SHA1	Message	Date
Ben	1928aa0443	fix(managed-scope): honor managed scope in config→env bridges too Manual verification surfaced a second bypass class beyond the standalone config loaders: several code paths bridge config.yaml values into os.environ (HERMES_TIMEZONE, HERMES_REDACT_SECRETS, HERMES_MAX_ITERATIONS, TERMINAL_*, network.force_ipv4, ...) by reading the raw user YAML, so the env the whole process reads carried the USER's value even when an administrator pinned it — e.g. a managed timezone was overridden because gateway/run.py wrote the user's timezone into HERMES_TIMEZONE, and _resolve_timezone_name() checks the env var first. Wired the shared apply_managed_overlay() into every config→env bridge: - gateway/run.py module-level startup bridge (timezone, redact_secrets, max_turns, terminal, display, gateway.strict, ...) - gateway/run.py _reload_runtime_env_preserving_config_authority (the per-turn re-bridge that keeps config authoritative over reloaded .env — must keep MANAGED authoritative on every turn, not just startup) - hermes_cli/main.py early security.redact_secrets / network.force_ipv4 bridge (runs before load_config is usable, at import time) - hermes_cli/send_cmd.py top-level scalar config→env bridge Verified end-to-end against a writable managed dir (12/12 checks incl. timezone, logging, model, skin, gateway settings, write-guard) and in a clean process the gateway per-turn bridge writes HERMES_TIMEZONE=<managed>. Adds an order-independent regression test for the bridge overlay.	2026-06-19 07:46:33 -07:00
Ben	b0e47a98f9	fix(managed-scope): honor managed scope in all standalone config loaders The skin bug was one instance of a class: several subsystems build their config dict directly from config.yaml instead of routing through hermes_cli.config.load_config (which carries the managed merge), so they silently ignored administrator-pinned values. Audited every config.yaml reader and fixed the behavioral-read bypasses: - gateway/config.py load_gateway_config (messaging gateway: session_reset, quick_commands, stt, model, ...) - gateway/run.py _load_gateway_config (its read_raw_config fast path also skipped the merge — read_raw_config returns raw user YAML) - tui_gateway/server.py _load_cfg (new TUI + desktop backend: skin, reasoning_effort, service_tier, provider_routing) - cron/scheduler.py (scheduled-job model/reasoning/toolsets/provider_routing) - hermes_logging.py (logging.level/max_size_mb/backup_count) - hermes_time.py (timezone) - hermes_cli/doctor.py (memory-provider diagnostic reads effective config) All route through a new shared managed_scope.apply_managed_overlay() helper that mirrors _load_config_impl (env-only expansion so a user ${VAR} can't shadow a managed literal, root-model-string normalization, leaf-merge) and is fail-open. cli.py's earlier inline fix is refactored onto the same helper. Write-back paths (slash_commands, telegram/yuanbao dm_topics, profile distribution) are deliberately left reading raw user YAML — overlaying managed values there would persist them into the user file. The dashboard (web_server.py) already routes through load_config and needed no change. TUI loader caches the RAW config so _save_cfg never writes managed values to disk. Adds test_managed_scope_overlay.py (helper) and test_managed_scope_loaders.py (per-surface integration); mutation-checked.	2026-06-19 07:46:33 -07:00
Ben Barclay	1e70df5fdd	feat(gateway): multiplex phase 4 — lifecycle guard + per-profile observability - _guard_named_profile_under_multiplexer: when the default gateway is running with gateway.multiplex_profiles=on, a named-profile 'hermes gateway run' hard-errors (pointing at the multiplexer) instead of double-binding that profile's platforms. Inert unless all hold: this invocation is a named profile, a default-profile gateway is alive, and its config has multiplexing on. --force overrides. Wired into run_gateway's guard chain. - write_runtime_status gains served_profiles: the secondary-adapter startup records [active] + multiplexed profiles into runtime_status.json so 'hermes status' can show per-profile coverage without a second probe. Absent for single-profile gateways. Tests: served_profiles round-trips and is absent by default; guard is inert for the default profile / under --force / when no default gateway is running.	2026-06-19 07:34:15 -07:00
Ben Barclay	d5d02eabb0	feat(gateway): multiplex phase 3 — secondary-profile adapter registry + conflict detection Bring up adapters for every profile the gateway serves, not just the active one. Keeps self.adapters as the default/active profile's map (the ~93 existing self.adapters[...] sites are untouched) and adds secondary profiles under self._profile_adapters[profile][platform]. - _start_secondary_profile_adapters loops profiles_to_serve(multiplex=True), skips the active profile (handled by the primary startup loop), and for each other profile loads its gateway config and creates+connects its enabled adapters under that profile's _profile_runtime_scope (home + secret scope). - Each secondary adapter gets _make_profile_message_handler(profile): stamps source.profile (when unset) before delegating to the shared _handle_message, so the agent turn and session key resolve to that profile. - Same-platform credential-conflict detection: _adapter_credential_fingerprint hashes the adapter's bot token (salted, truncated — never logs the token); two profiles claiming the same (platform, token) refuse the duplicate with a clear error naming both, since one token can't be polled twice. - Port-binding hard-error: a SECONDARY profile that enables a port-binding platform (webhook, api_server, msgraph_webhook, feishu, wecom_callback, bluebubbles, sms) is a config error and aborts startup via MultiplexConfigError — the default profile owns the single shared HTTP listener and serves every profile through the /p/<profile>/ prefix, so a second bind can only collide. Distinct from a transient connect failure (which logs + stays alive to retry): a config error writes gateway_state=startup_failed and exits cleanly with an actionable message (names the profile, the platform, and the fix). There is no valid reason to bind a second port once you've opted into a multiplexer. - Shutdown tears down secondary adapters alongside the primary ones. - Defensive getattr guards keep partial-construction unit tests (stop(), _run_agent on bare instances) working. No-op when multiplex_profiles is off (self._profile_adapters stays empty). Tests: fingerprint stability/log-safety/distinctness, profile message-handler stamping (and not overriding an already-stamped source), port-binding hard-error raises + names the profile/platform, non-binding platform is not rejected, and the guard set covers every TCP-binding adapter.	2026-06-19 07:34:15 -07:00
Ben Barclay	f35abb122a	feat(gateway): multiplex phase 1 — HTTP-inbound /p/<profile>/ routing (webhook) Serve webhook inbound for multiple profiles off the one shared listener via a URL prefix, with no second port bound. - SessionSource gains a 'profile' field (round-trips through to_dict/from_dict; omitted when unset so existing serialization is unchanged). It carries which profile an inbound message was routed to. - WebhookAdapter registers /p/{profile}/webhooks/{route_name} alongside the existing /webhooks/{route_name}. _resolve_request_profile validates the prefix against profiles_to_serve(): None when absent or multiplexing is off (ignored, handled as default — no spurious 404), the profile name when valid, _PROFILE_REJECTED (→ 404) when the profile isn't served. The resolved profile is stamped onto the SessionSource. - session-key namespacing and the per-turn home/credential scope now prefer source.profile: SessionStore._resolve_profile_for_key(source), _session_key_for_source fallback, and _resolve_profile_home_for_source all honor it (→ the agent turn resolves that profile's config/skills/credentials via the Phase 2 _profile_runtime_scope). Constraint: routing inbound needs no per-profile platform credential, but the agent still needs the routed profile's provider key — delivered by Phase 2's secret scope. api_server (OpenAI-compatible surface) profile routing is a focused follow-on; its source-construction path differs from webhook's. Tests: SessionSource.profile round-trip + namespace drive; _resolve_request_profile accept/reject/ignore matrix.	2026-06-19 07:34:15 -07:00
Ben Barclay	f538470cf4	feat(gateway): multiplex phase 2 — fail-closed profile credential isolation (Workstream A) The credential gate. When multiplexing is active, a profile's secrets resolve from a context-local scope, never the process-global os.environ (which in a multiplexer may hold another profile's keys, and is inherited by every subprocess spawned with env=dict(os.environ)). - agent/secret_scope.py: get_secret() backed by a secret-scope contextvar. FAIL-CLOSED: when multiplex is active and no scope is installed, an unscoped read RAISES UnscopedSecretError instead of falling back to os.environ — a missed/new call site crashes loudly at that line rather than leaking a cross-profile value. Genuinely-global vars (HERMES_*, PATH, kanban paths, …) keep reading os.environ via an allowlist. load_env_file/build_profile_secret_scope parse a profile .env into an isolated dict WITHOUT mutating os.environ. Off by default => transparent os.getenv behavior. - hermes_cli/runtime_provider.py: all credential/provider/base-url reads go through _getenv -> get_secret. - agent/credential_pool.py: env fallbacks route through get_secret (the ~/.hermes/.env-first preference is preserved and already profile-correct via the home override). - tools/mcp_tool.py: MCP config interpolation resolves through get_secret, so a server's picks up the routed profile's value. - gateway/run.py: set_multiplex_active() at GatewayRunner init; per-turn .env reload is a no-op for credentials in multiplex mode (secrets come from the scope, not global env); _profile_runtime_scope context manager combines the HERMES_HOME override + secret scope; _run_agent wraps _run_agent_inner in that scope (resolved via _resolve_profile_home_for_source) when multiplexing. Propagates into the agent worker thread for free via the existing copy_context() in _run_in_executor_with_context. Tests: 13 unit (fail-closed, scope isolation, global allowlist, .env parsing without environ mutation) + 7 E2E (runtime_provider + MCP interpolation prove two profiles isolated, unscoped read raises, globals still read environ).	2026-06-19 07:34:15 -07:00
Ben Barclay	d82f9fa7f7	feat(gateway): multiplex phase 0 — config flag, profile enumeration, profile-stamped session keys Foundations for serving multiple profiles from one gateway process, inert when off: - gateway.multiplex_profiles config flag (default false), round-trips through GatewayConfig and load_gateway_config (top-level + nested gateway.* form). - hermes_cli.profiles.profiles_to_serve(multiplex): the single chokepoint for which (profile, HERMES_HOME) pairs the gateway serves. Lightweight dir scan; active-profile-only when off, default + all named profiles when on. - build_session_key gains a profile= namespace slot. Default/None reuse the historical 'agent:main:...' literal BYTE-IDENTICALLY (no session migration, positional parsers unaffected); a named profile becomes 'agent:<profile>:...' so two profiles on the same platform/chat never collide. - SessionStore._resolve_profile_for_key + _session_key_for_source fallback resolve the namespace from the flag (legacy when off, active profile when on). Tests: byte-identical-when-off (parametrized), namespace isolation, positional layout preserved, config round-trip, profiles_to_serve enumeration.	2026-06-19 07:34:15 -07:00
teknium1	df2420f571	fix(gateway): keep non-Discord home-channel startup send byte-identical The salvaged non_conversational marking made the home-channel startup no-metadata branch always pass metadata= explicitly; for non-Discord platforms _non_conversational_metadata returns None, so Telegram/etc. went from adapter.send(chat_id, message) to adapter.send(..., metadata=None). Behaviorally identical but broke test_restart_notification's exact assert_called_once_with. Only attach metadata when the marker applies (Discord), restoring the original call shape elsewhere.	2026-06-19 07:29:27 -07:00
snav	caaa916289	fix(gateway): don't let delayed Discord status messages partition history backfill Discord channel-history backfill partitions on Hermes' last self-authored message. Asynchronous, non-conversational status sends (self-improvement review bubbles, heartbeats, background-process notifications, update status, gateway restart/online notices) land as ordinary bot messages, so a delayed status bump becomes the history boundary and swallows real messages that arrived after Hermes' actual reply. Mark these sends at the source via metadata["non_conversational"] (Discord only; other platforms' metadata is unchanged). The adapter no longer advances the history-boundary cache for marked sends and persists their IDs to a sidecar JSON so the cold-start scan can skip them by ID after a restart. A narrow regex recognizer remains only as an upgrade bridge for status bumps emitted by an older gateway that pre-dates the marking.	2026-06-19 07:29:27 -07:00
infinitycrew39	ca92e9a362	fix(gateway): refresh cached agent max_iterations from current config When a gateway agent is reused from cache, it retains the max_iterations from its initial creation. If config.yaml agent.max_turns or HERMES_MAX_ITERATIONS changed between turns, the cached agent's budget becomes stale. Before reusing a cached agent, refresh agent.max_iterations from the freshly-resolved value (read from env/config at line 14585). Fixes partial issue from PR #48127: handles fresh agent creation + cached agent reuse.	2026-06-19 06:31:13 -07:00
infinitycrew39	460b1e50e5	fix(gateway): refresh max_turns before resolving runtime budget	2026-06-19 06:31:13 -07:00
Ben Barclay	2c6e266e88	fix(relay): trigger self-provision on relay-config + NAS token, not is_managed() (#48724) self_provision_if_managed() gated on is_managed(), but is_managed() means "NixOS/package-manager-managed" (it keys on HERMES_MANAGED or a ~/.hermes/.managed marker) — NOT "NAS-hosted". A NAS-provisioned Fly agent sets NEITHER, so the gate was always False and relay self-provision SILENTLY no-oped on exactly the hosted agents it was built for. Caught live: a staging agent with GATEWAY_RELAY_URL correctly stamped logged "No messaging platforms enabled" and never dialed the connector; HERMES_MANAGED was unset on the machine. The unit tests had mocked is_managed()->True, so they passed while the real trigger never fired (mocked-trigger blind spot). Fix: drop the is_managed() gate and rename self_provision_if_managed -> self_provision_relay. The real trigger is now "relay_url() set + no pinned secret + a resolvable NAS token", which is both NAS-independent and self-guarding: - NAS-hosted agent: GATEWAY_RELAY_URL + no pinned secret + bootstrapped NAS token -> self-provisions. - Self-hosted + `hermes gateway enroll`: pinned GATEWAY_RELAY_SECRET -> skipped (existing secret-present guard). - Self-hosted, unenrolled, no NAS identity: resolve_nous_access_token() fails -> graceful no-op (existing fail-soft path). Security: unchanged trust model. The connector still derives tenant from the validated NAS token; this only broadens WHEN the provision attempt fires, and every broadened case is still guarded by token-resolution + pinned-secret-skip. Tests: replaced the (wrong) "skips when not managed" test with a regression test proving a NAS host where is_managed()==False STILL provisions; renamed all call sites; added a "no NAS token -> non-fatal skip" test for the self-hosted branch. 88 relay tests pass. Relay-adapter lane. EXPERIMENTAL.	2026-06-19 01:01:24 +00:00
Ben Barclay	0ddd21c74e	feat(relay): managed-boot self-provision client (Phase 3, gateway side) (#48242) The gateway half of relay Phase 3. On a MANAGED boot with relay configured and no secret pinned, the runtime self-provisions its relay credentials IN-PROCESS: resolve the agent's own Nous access token (resolve_nous_access_token) -> POST the connector's /relay/provision asserting its own endpoint + route keys -> set GATEWAY_RELAY_ID/SECRET/DELIVERY_KEY into os.environ so the immediately-following register_relay_adapter() reads them and dials out authenticated. No human, no enrollment token, no disk write — the creds live only in process memory (save_env_value refuses under managed anyway, and keeping the secret off any volume is the stronger posture). Stateless: process-env creds don't survive a restart, so a managed container re-provisions every boot; the connector's rotation window covers a still-connected prior instance. An explicitly-pinned GATEWAY_RELAY_SECRET is respected (skip). Self-hosted is unchanged: humans keep using `hermes gateway enroll`. Endpoint provenance is gateway-asserted (GATEWAY_RELAY_ENDPOINT + GATEWAY_RELAY_ROUTE_KEYS, env or gateway.relay_* config) — uniform code path whether the operator sets it (self-hosted) or NAS stamps it (hosted, the only case NAS knows the public URL). Both absent -> outbound-only provisioning (credentials, no inbound routes). The connector scopes the asserted endpoint to the verified tenant, so it stays within the security model. - gateway/relay/__init__.py: relay_endpoint(), relay_route_keys(), _provision_url(), _post_provision(), self_provision_if_managed() (never raises — a provision failure logs and boots without relay auth). - gateway/run.py: call self_provision_if_managed() immediately before register_relay_adapter() in the startup path. Tests: 12 unit (trigger logic, respect-pinned-secret, in-process env wiring, endpoint+routes vs outbound-only, fail-soft on token/connector failure); mutation-checked (drop is_managed guard / pinned-secret guard -> tests fail). Cross-repo live E2E driver lands on the connector side (depends on this). EXPERIMENTAL: relay auth scheme may change until >=2 Class-1 platforms validate.	2026-06-18 15:25:29 +10:00
Ben	237fa7d29c	feat(gateway): register relay adapter from config; drop HERMES_GATEWAY_RELAY gate Wire the relay adapter into gateway startup and make activation config-driven instead of a dark-launch flag. - gateway/relay/__init__.py: replace relay_enabled()/HERMES_GATEWAY_RELAY with relay_url() (GATEWAY_RELAY_URL env or gateway.relay_url in config.yaml) — the same shape as gateway.proxy_url. register_relay_adapter() registers when a URL is configured and builds a live WebSocketRelayTransport; with no URL it's a no-op (direct/single-tenant deployments unaffected). force=True keeps the transport-less adapter for unit tests. relay_platform_identity() reads the hello platform/botId from GATEWAY_RELAY_PLATFORM/GATEWAY_RELAY_BOT_ID. - gateway/run.py: call register_relay_adapter() during GatewayRunner.start(), right after plugin discovery, so a configured connector relay is registered on every boot. Failures are logged, never block startup. This removes the dark-launch posture: the relay is on whenever it's configured, shipping the production end state rather than hiding it behind a flag.	2026-06-17 16:37:45 -07:00
teknium	36ae958473	feat(gateway): gate message timestamps behind opt-in (default off) Follow-up to salvaged PR #41633: the timestamp prefix injection was unconditional. Gate the in-context render behind gateway.message_timestamps.enabled (default false) at both the live-message and history-replay sites; timestamp metadata is still captured + persisted regardless so the toggle can be flipped on later. Add DEFAULT_CONFIG entry, docs, and gate tests.	2026-06-16 15:49:59 -07:00
Wolfram Ravenwolf	bd7fc8fdcd	feat(gateway): inject stable human-readable message timestamps Consolidates these related Amy fork patches: - 429830f39 feat(gateway): inject message timestamps into user messages for LLM context - 3c3d6fac0 fix: handle both ISO string and epoch float timestamps in history replay - 2874f7725 feat: human-friendly timestamp format with weekday and timezone name - 3735f4c8b fix: render gateway message timestamps once	2026-06-16 15:49:59 -07:00
Sierra (Hermes Agent)	01ae9b853e	fix(telegram): resolve replies to rich (sendRichMessage) messages Telegram does not echo a sendRichMessage's content back in reply_to_message (.text/.caption empty, .api_kwargs None), so replies to rich sends (briefings, the gateway's own rich finals) arrived with no quotable text and the [Replying to: ...] injection was skipped. Remember message_id -> text at send time in a best-effort JSON index (gateway/rich_sent_store.py), and recover it on inbound when text and caption are both empty. Best-effort and no-throw throughout: any failure degrades to prior behavior and never breaks a send or message. Salvaged from #47375 by @x1erra. Dropped the cross-platform run.py reply-prefix rewrite (out of scope; bloated every reply on every platform) and scrubbed a docstring reference to an out-of-repo script. Kept the inbound reply_to logging enrichment used to verify the fix.	2026-06-16 13:04:20 -07:00
Wolfram Ravenwolf	e76e7b5073	feat(hooks): session:compress event_callback for MemPalace sync	2026-06-16 11:45:36 -07:00
Wolfram Ravenwolf	16fc717091	fix(mattermost): harden delivery hygiene PROBLEM: Mattermost threads can become invalid or enormous, exposing two failure modes: internal scratch/reasoning/commentary displays could leak into persistent Mattermost threads via global display toggles, while rejected threaded user-visible replies could disappear unless every failed send fell back flat. A broad flat fallback would pollute channels with tool/status/progress noise. SOLUTION: Require explicit Mattermost platform opt-in for scratch displays, keep using the existing notify=True metadata marker for user-visible final text/media/file replies, and allow the Mattermost plugin adapter to flat-fallback only notify-worthy sends whose threaded POST failure looks like a broken root/thread. Keep tool/status/progress and other non-notify sends thread-strict. Add regression tests for display opt-in, notify-only broken-thread fallback, generic API failure suppression, and stream notify metadata. Verification: tests/gateway/test_mattermost.py tests/gateway/test_stream_consumer.py tests/gateway/test_stream_consumer_thread_routing.py tests/gateway/test_stream_consumer_fresh_final.py tests/gateway/test_stream_consumer_draft.py; tests/gateway/test_session_api.py tests/gateway/test_status_command.py tests/gateway/test_resume_command.py tests/hermes_cli/test_commands.py; py_compile touched gateway files; git diff --check. Session: Mattermost thread 6qg8e9dd1pd9pkhi74xyaa1mry, 2026-06-01.	2026-06-16 06:34:54 -07:00
teknium	6373aba80f	feat(gateway): rename to tool_progress_grouping, add config/docs/tests Follow-up to salvaged PR #41620: - Rename tool_progress_style -> tool_progress_grouping (clearer intent) - Add display.tool_progress_grouping to DEFAULT_CONFIG (accumulate default) - Document in messaging docs incl. 'separate is noisier, only where progress enabled' - Add resolver tests (default/global/override/invalid/case)	2026-06-16 05:49:24 -07:00
Wolfram Ravenwolf	fc956b9db6	feat: add tool_progress_style config (accumulate vs separate) Add display.tool_progress_style setting to control how tool progress messages are displayed in chat platforms: - 'accumulate' (default): Edit a single message with all tool calls (new v0.9.0 behavior) - 'separate': Send each tool call as its own message, interleaved with thinking messages (pre-v0.9 behavior, better readability) The setting participates in the per-platform display override system and can be set globally or per-platform. Files: gateway/display_config.py, gateway/run.py	2026-06-16 05:49:24 -07:00
Wolfram Ravenwolf	20b1f4f3fb	feat(memory): configurable background memory update notifications Background memory reviews now support three notification modes, configured via display.memory_notifications in config.yaml: off — no chat notification (still logged to stdout/HA log) on — generic '💾 Memory updated' (default, unchanged behavior) verbose — content preview with action indicators: 💾 Memory ➕ Hermes Repo liegt unter /config/amy/hermes-agent/... 💾 Memory ✏️ Updated repo path from claude-code to hermes-agent... 💾 Memory ➖ old entry about claude-code path... Previews are truncated to 120 chars for adds/replaces, 60 for removes. Each action gets its own line in verbose mode for readability. Files: run_agent.py, gateway/run.py	2026-06-16 05:45:40 -07:00
Teknium	5a0e0d35b9	fix(mattermost): preserve thread-local delivery hygiene Salvage the valid thread-routing pieces from #41640: - route Mattermost progress/status sends through metadata thread IDs - treat top-level Mattermost channel posts as thread roots for progress - preserve thread metadata through media/file sends - allow flat fallback only for final notify-worthy replies on confirmed broken roots Co-authored-by: Wolfram Ravenwolf <github.com@wolfram.ravenwolf.de>	2026-06-15 15:06:23 -07:00
Teknium	c66ecf0bc3	feat(delegation): async background subagents via delegate_task(background=true) (#40946) * feat(delegation): async background subagents via delegate_task(background=true) delegate_task(background=true) dispatches a subagent that runs in the background and returns a handle immediately, so the user and model keep working while it runs. The full result — plus the original task source — re-enters the conversation as a new turn when the subagent finishes, riding the same completion-queue rail as terminal background processes. - tools/async_delegation.py: daemon-executor registry, capacity cap, rich self-contained completion event pushed onto the shared process_registry.completion_queue (type='async_delegation'). - delegate_tool.py: background param + single-task dispatch branch; batch async rejected (v1). - process_registry.py: format_process_notification renders the rich task-source block (goal/context/toolsets/model/status/result). - gateway/run.py: dedicated _async_delegation_watcher drains + injects results into the originating session (idle + post-turn), session_key routing enrichment, shutdown interrupt of dangling delegations. - config: delegation.max_async_children (default 3). Reuses the existing idle-drain wiring rather than mutating a running agent loop, preserving message-role alternation and prompt-cache invariants. 13 targeted tests; CLI + gateway paths E2E-verified. * test(delegation): make async non-blocking tests environment-independent CI 'test (5)' flaked on a cold, 8-worker runner: the first delegate_task(background=true) call measured 2.27s of one-time setup (config load + child-agent construction + imports), tripping the elapsed < 1.0 wall-clock assertion. That assertion was testing setup overhead, not blocking. Replace the wall-clock thresholds with the real invariant: dispatch returns while the child is still gated (active_count == 1, completion queue empty), which a synchronous impl could not do. Keep only a loose 4s sanity backstop well under the runner's 5s gate. * fix(delegation): harden async background delegation Follow-up review fixes: - Detach background child from parent._active_children at dispatch — otherwise parent-turn interrupts (Ctrl+C, mid-turn steering), cache evicts (release_clients), and session close (/new) kill/close the detached subagent mid-run, defeating the point of background mode. Lifecycle is owned by the async registry's interrupt_fn. - Make the capacity check atomic with the record insert (TOCTOU: two concurrent dispatches could both pass active_count() and exceed the cap). - TUI dedup: key async_delegation events by delegation_id — the fallthrough keyed them all as ("", type), suppressing every completion after the first in the desktop/TUI status feed. - CLI /stop now interrupts running background delegations and /agents lists them (they live outside the process registry and were invisible). - Drop stray unbalanced ']' line from the re-injection block and the unused _ASYNC_DEFAULT import. Tests: detach-at-dispatch + concurrent-capacity race added (15 total in test_async_delegation.py); 137 delegate + 140 process-registry/notify/watch + 7 TUI dedup tests pass. * fix(delegation): harden async background completion drains	2026-06-15 13:33:12 -07:00
Teknium	3e7e9b24d4	fix: harden salvaged session and browser improvements Polish salvaged contributor work before PR review: - read browser inactivity timeout from config with documented fallback - skip redundant v10 trigram backfill before v11 FTS rebuild - show delegate_task goals safely in progress previews - show gateway status model/context without redundant token wording - wire gateway /sessions to shared session-listing helpers - map Ravenwolf author emails for release attribution Co-authored-by: Wolfram Ravenwolf <github.com@wolfram.ravenwolf.de> Co-authored-by: Amy Ravenwolf <amy@ravenwolf.de>	2026-06-15 07:46:34 -07:00
Teknium	be7c919bf9	fix(process): label background completion causes (#46659) Track why a background process finished and include that source in notify-on-complete messages so SIGTERM from process.kill, kill_all, backend loss, and ordinary exits are distinguishable.	2026-06-15 07:08:24 -07:00
kshitijk4poor	3bc4a2ff78	fix(gateway): re-baseline agent-cache message_count after each turn The #45966 cross-process coherence guard snapshots a session's on-disk message_count next to the cached agent and rebuilds the agent when the count changes. But the snapshot is taken at agent-BUILD time — before the turn writes its own user + assistant (+ tool) rows — and the cache entry is never rewritten on a reuse. So this process's OWN turn grows message_count, and the very next turn sees a mismatch and rebuilds the agent. That happens every turn, for every conversation, silently destroying the per-conversation prompt caching the cache exists to protect (AGENTS.md: prompt caching is sacred). Add _refresh_agent_cache_message_count(): after a turn completes and the agent has flushed its rows to the SessionDB, re-baseline the stored count to the now-current value. The guard then fires ONLY when a DIFFERENT process changes the transcript — preserving the #45966 fix while keeping the cache warm for normal single-process operation. Tests drive the real SessionDB + the real guard condition: 5 consecutive same-process turns now all REUSE the cached agent (0 before the fix); a cross-process append still invalidates; and the re-baseline is fail-safe (no DB, falsy session_id, raising probe, legacy 2-tuple, pending sentinel all no-op).	2026-06-14 22:58:55 +05:30
kyssta-exe	7f245b0035	fix(gateway): invalidate agent cache on cross-process session writes (#45966) (cherry picked from commit `6d0f79defe`)	2026-06-14 22:54:39 +05:30
Teknium	2c174bce24	fix(gateway): preserve new input on interrupted replay cleanup	2026-06-14 05:10:39 -07:00
Arnaud L	5191c1c2ce	fix(gateway): stop replaying interrupted tool-call tails and auto-continue notes Three changes to prevent infinite re-execution loops when a user sends a new message while long-running tools are executing: 1. Filter interrupted tool results in _build_gateway_agent_history: skip tool messages whose content contains [Command interrupted] or exit_code 130 — they represent partial execution, not valid results. 2. Don't replay auto-continue notes as user messages: detect gateway-injected [System note: ...] / [IMPORTANT: ...] prefixes and skip them in _build_gateway_agent_history so the LLM doesn't see 4+ messages from 'the user' telling it to finish old work. 3. Fix the wording: the system note now instructs the model to address the user's NEW message FIRST, IGNORE pending results, and NOT re-execute old tool calls. Closes #45230	2026-06-14 05:10:39 -07:00
Aldo	293c04fef6	fix(gateway): suppress exact silence tokens without mutating history	2026-06-14 03:25:08 -07:00
Teknium	10bad2faf1	fix(gateway): serialize startup auto-resume before inbound (#46074) Gateway startup now queues real inbound messages until restart-interrupted auto-resume turns have completed, preventing duplicate agents for the same session after a restart.	2026-06-14 03:21:06 -07:00
Teknium	723c2331bd	fix: make profile subprocess HOME policy explicit	2026-06-14 03:20:21 -07:00
Teknium	dc90ca4e17	fix(ssl): run CA guard during agent initialization	2026-06-13 21:14:32 -07:00
Teknium	af5b526472	fix(ssl): validate CA bundle paths before provider calls	2026-06-13 21:14:32 -07:00
chromalinx	a218a0f156	fix(agent,gateway,doctor): add SSL CA cert bundle fail-fast guard A stale certifi CA bundle after a partial `hermes update` used to crash the agent on the first outbound HTTPS call with a raw traceback and trap the gateway in a retry loop. This patch: * Adds `agent/errors.py` with a typed `SSLConfigurationError` * Adds `agent/ssl_guard.py` with a `verify_ca_bundle()` pre-flight that asserts the bundle exists, is non-trivial in size, and can build a working SSLContext. On macOS, it falls back to the system trust store when the bundle is empty but the system store is healthy (covers corporate proxies / MDM setups). * Wires the guard into `run_agent.py` and `gateway/run.py` right after the `hermes_bootstrap` import, inside a try/except so a bug in the guard itself can never prevent startup. * Adds a `SSL / CA Certificates` section to `hermes_cli doctor` so users can detect the failure with one command. * Adds unit tests covering the healthy, missing, empty, skip-env, and macOS-fallback paths. * Adds an RCA document describing the failure mode and the recovery path (`pip install -e .`). When the bundle is broken the user sees: ⚠️ SSL certificate bundle issue detected. Run: pip install -e . `HERMES_SKIP_SSL_GUARD=1` disables the check for sandboxed environments that ship their own trust store.	2026-06-13 21:14:32 -07:00
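A pre-flight of the kind a218a0f156 describes might be sketched as follows. The names `SSLConfigurationError`, `verify_ca_bundle`, and `HERMES_SKIP_SSL_GUARD` come from the commit message; the signature, the `MIN_BUNDLE_BYTES` threshold, and the error wording are assumptions, and the macOS system-store fallback is left out.

```python
import os
import ssl


class SSLConfigurationError(RuntimeError):
    """Raised when the CA bundle cannot be used for TLS verification."""


MIN_BUNDLE_BYTES = 1024  # assumed threshold for "non-trivial in size"


def verify_ca_bundle(path: str) -> None:
    """Check that the bundle exists, is not trivially small, and loads into an SSLContext."""
    if os.environ.get("HERMES_SKIP_SSL_GUARD") == "1":
        return  # sandboxed environments that ship their own trust store
    if not os.path.isfile(path):
        raise SSLConfigurationError(f"CA bundle not found: {path}")
    if os.path.getsize(path) < MIN_BUNDLE_BYTES:
        raise SSLConfigurationError(f"CA bundle is empty or truncated: {path}")
    try:
        ssl.create_default_context(cafile=path)
    except (ssl.SSLError, OSError) as exc:
        raise SSLConfigurationError(f"CA bundle failed to load: {path}") from exc
```

Callers would wrap the check in their own `try`/`except` at startup, as the commit notes, so that a bug in the guard cannot itself block the process from starting.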
kshitijk4poor	63097ee0d7	test(gateway): cover auto-resume full-path no-regression; clarify guard docstring The salvaged fix's two regression tests mock adapter.handle_message, so they only assert the pre-claimed sentinel is set/cleaned around a stub — they never drive the real dispatch chain. Add a full-path test that exercises _schedule_resume_pending_sessions -> _guarded_handle_message -> adapter.handle_message -> _process_message_background -> _handle_message and asserts the resumed session's agent runs EXACTLY ONCE: not zero (the pre-claim must not self-bounce the resume into a queued no-op) and not twice (the duplicate-agent bug #45456 the fix targets). Also assert no leaked sentinel and no orphaned pending event after the drain settles. Tighten the _guarded_handle_message docstring: on current main the real sentinel is taken over inside _handle_message (not _process_message_background), and note the `is _AGENT_PENDING_SENTINEL` guard only releases the slot we ourselves placed, never one a live run owns.	2026-06-13 23:39:35 +05:30
liuhao1024	6e2fd955ca	fix(gateway): claim session slot before auto-resume task to prevent duplicate agents When the gateway restarts and auto-resumes an interrupted session, an inbound message arriving in the window between `asyncio.create_task()` and the task's first await could spin up a second AIAgent for the same session. Both agents would then process messages concurrently, producing interleaved duplicate responses (#45456). Fix: set `_AGENT_PENDING_SENTINEL` in `_running_agents` immediately after the "already running" check, before creating the task. This closes the race window — any inbound message sees the slot as occupied and queues behind the auto-resume. A `_guarded_handle_message` wrapper ensures the pre-claimed sentinel is always released, even if `handle_message` raises before reaching `_process_message_background` (whose `finally` block handles normal cleanup). (cherry picked from commit `85150c976b`)	2026-06-13 23:36:51 +05:30
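The race fix in 6e2fd955ca comes down to claiming the slot synchronously, before `asyncio.create_task()` returns control to the event loop. A minimal sketch, with `_PENDING`, `running`, `run_agent`, `guarded`, and `schedule_resume` as hypothetical stand-ins for the gateway's own names:

```python
import asyncio

_PENDING = object()  # stand-in for _AGENT_PENDING_SENTINEL

running = {}   # session_id -> sentinel or live agent marker
started = []   # records every agent start, for inspection


async def run_agent(session_id):
    running[session_id] = "agent"  # the real run takes over the slot
    started.append(session_id)
    await asyncio.sleep(0)
    running.pop(session_id, None)


async def guarded(session_id):
    """Release the pre-claimed slot if the run never took it over."""
    try:
        await run_agent(session_id)
    finally:
        # Only release the sentinel we placed, never a slot a live run owns.
        if running.get(session_id) is _PENDING:
            running.pop(session_id, None)


def schedule_resume(session_id):
    if session_id in running:
        return None  # already running or already claimed
    running[session_id] = _PENDING  # claim BEFORE creating the task
    return asyncio.create_task(guarded(session_id))


async def main():
    task = schedule_resume("s1")
    duplicate = schedule_resume("s1")  # arrives before the task's first await
    await task
    return duplicate
```

Running `asyncio.run(main())` returns `None` for the second scheduling attempt, and `started` records a single run for the session.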
konsisumer	16fb573bae	fix(gateway): clear bloated compression binding on compression-exhaustion auto-reset After compression exhaustion the auto-reset created a fresh session but discarded reset_session()'s return value and left the Telegram topic binding pointing at the oversized compressed child. The next inbound message in that topic healed the binding forward and switch_session'd the freshly-reset lane back onto the bloated transcript, re-triggering compression exhaustion in a loop with a new session id each time. Capture the fresh entry and re-sync the topic binding to it so the next message starts clean. No-op on non-topic lanes. Regression of the #9893/#10063 auto-reset fix. Fixes #35809	2026-06-13 06:38:29 -07:00
Black-Kylin	202e318cb1	fix(gateway): sync compression session splits before failures Salvages PR #25747 by preserving gateway session rotation even when a post-compression model call fails before returning final content. Co-authored-by: Hermes <127238744+teknium1@users.noreply.github.com>	2026-06-13 04:51:59 -07:00
Teknium	2a5dc0ef3d	fix(slack): make video attachments available to agents (#45512)	2026-06-13 03:33:27 -07:00
Siddharth Balyan	7ba5df0d52	feat(billing): /credits command — balance + portal top-up handoff (#44776) * feat(billing): /usage → portal top-up browser handoff Add the terminal side of the billing slice (phase 2a): start a top-up by throwing the user to the portal billing page with the top-up modal open. The terminal does not confirm, poll, or track payment — checkout completes in the browser and the next /usage shows the new balance. - nous_account.py: parse organisation.slug/name from /api/oauth/account into NousPortalAccountInfo; add nous_portal_topup_url() building the org-pinned {base}/orgs/{slug}/billing?topup=open with a null-slug fallback to the legacy {base}/billing?topup=open (never /orgs/None/...). - portal_cli.py: 'hermes portal topup' — fresh account fetch, identity line (Topping up as <email> / org <name>), browser open with printed-URL fallback, no-wait closing copy. No polling/confirmation (deferred to 2b). - account_usage.py: the shared /usage credits block now links the org-pinned top-up URL (auto-opens the modal) + points to the command. Depends on NAS #409 (organisation.slug/name + ?topup=open). Do not merge until that is live on the target env; until then /api/oauth/account returns organisation: { id } only and the URL falls back to legacy. * feat(billing): /credits command for balance + top-up handoff Replace the standalone `hermes portal topup` subcommand with an in-session /credits slash command — a focused money surface (balance in, top-up out) that works in the CLI, TUI, and every messaging platform from one registry entry. - commands.py: register /credits (Info category). Slack is at its 50-slash cap, so /credits is routed via /hermes credits on Slack only (new _SLACK_VIA_HERMES_ONLY set) to avoid clamping a canonical command off the native list and breaking Telegram parity; native everywhere else. 
- account_usage.py: build_credits_view() — one portal fetch → balance lines + identity line + org-pinned top-up URL + depleted flag, consumed by all surfaces. Reuses the same snapshot/URL builder as /usage so numbers match. - cli.py: _show_credits() — balance block + identity line + 3-button panel (Open top-up / Copy link / Cancel) via the existing prompt_toolkit modal. ASK, never auto-launch; headless falls back to printing the URL. - gateway/slash_commands.py: _handle_credits_command() — renders the block + tappable top-up URL + no-wait copy; works on button and plain-text platforms. - /usage credits line now points to /credits. - Retire `hermes portal topup` (portal_cli.py back to baseline); the engine (slug/name parse + nous_portal_topup_url) stays as the shared core. No polling, no payment confirmation (billing phase 2a). Depends on NAS #409. * fix(credits): /credits works in the TUI slash-worker (non-interactive) In the TUI, /credits runs in the slash-worker subprocess where there is no live prompt_toolkit app and stdin is the JSON-RPC pipe. _show_credits called the 3-button modal unconditionally, which fell back to reading stdin → exception → slash.exec rejected → the command produced no output (only the pre-existing 'Credit access paused' banner showed). - _show_credits: when self._app is None (TUI worker / piped / non-interactive), render the text variant — balance block + tappable top-up URL + no-wait line, same affordance as the messaging surfaces — and skip the modal entirely. The 3-button panel still renders in the interactive CLI. - Depleted banner copy: 'run /usage for balance' → 'run /credits to top up' now that /credits is the dedicated money surface (+ tests). - Regression tests: _show_credits with self._app=None renders text and never invokes the modal; logged-out path. 
* feat(tui): credits.view RPC for the /credits tappable top-up button Add a credits.view JSON-RPC method returning the structured CreditsView (logged_in, balance_lines, identity_line, topup_url, depleted) so the TUI can render a clickable <Link> top-up button instead of plain text. Account- independent (portal fetch gated on a logged-in Nous account), fail-open to {logged_in: false} on any hiccup. Mirrors session.usage's credits-block pattern. Frontend (TUI-local /credits command + Ink component) lands separately. * feat(tui): /credits command with keyboard-driven top-up confirm TUI-local /credits: fetches the structured balance via the credits.view RPC, prints the balance + identity + top-up URL, then arms the EXISTING confirm overlay (Enter = open top-up in browser via openExternalUrl, Esc = cancel). Reuses ConfirmReq — no new overlay component/state/input handler. Headless (openExternalUrl returns false) falls back to printing the URL. - gatewayTypes.ts: CreditsViewResponse. - commands/credits.ts: the command (mirrors /status's rpc+guarded pattern). - registry.ts: register creditsCommands. - test: balance+overlay armed, headless fallback, no-url, logged-out (4 cases). Matches the CLI /credits 'Enter to open' affordance. Phase 2a: no polling.	2026-06-12 08:51:10 +00:00
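The org-pinned URL with its null-slug fallback, as described for `nous_portal_topup_url()` in 7ba5df0d52, might be built like this. The two URL shapes are taken from the commit message; the signature and the percent-encoding of the slug are assumptions.

```python
from typing import Optional
from urllib.parse import quote


def nous_portal_topup_url(base: str, slug: Optional[str]) -> str:
    """Build the top-up URL, falling back to the legacy path when no slug is known."""
    base = base.rstrip("/")
    if slug:
        return f"{base}/orgs/{quote(slug, safe='')}/billing?topup=open"
    # Never emit /orgs/None/...: a missing slug uses the legacy billing page.
    return f"{base}/billing?topup=open"
```

Treating an empty string the same as `None` keeps a half-populated account payload from producing a `/orgs//billing` path.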
Teknium	db7714d5f1	Merge pull request #44331 from NousResearch/hermes/hermes-6b48295e feat(whatsapp): WhatsApp Business Cloud API adapter (salvage #43921)	2026-06-11 22:48:06 -07:00
Kyssta	a942bfd9cc	fix(gateway): reset _last_flushed_db_idx when reusing cached agent (#44327) (#44518) Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>	2026-06-11 22:41:34 -07:00
Teknium	13650ab7f8	fix(gateway): audio attachment note no longer steers the agent into punting Sibling site of the PDF/DOCX note fixed in PR #44175: the audio file attachment context note led with "Ask the user what they'd like you to do with it", steering the model into asking instead of transcribing. Rewritten to instruct the agent to transcribe/process the file itself when the request involves its content, only asking when intent is genuinely unclear. Contract assertion added to the existing audio attachment note test.	2026-06-11 11:58:19 -07:00
xxxigm	e7ae145ac4	fix(gateway): guide the agent to read attached PDF/DOCX instead of punting When a user attached a binary document (PDF, DOCX, XLSX, …) in chat, the context note prepended to the turn said "Ask the user what they'd like you to do with it." That steered the model into asking the user to paste the contents rather than extracting the text it is fully capable of reading — so attached PDFs/DOCX appeared "unreadable" to the agent. Rewrite the binary-document note to tell the agent the file is a non-text format saved at the given path and to extract its text itself (e.g. via the terminal tool or the ocr-and-documents skill) before answering. Text documents (whose content is already inlined by the platform adapter) keep their existing note. The note construction is pulled into a small `_build_document_context_note` helper so it is unit-testable.	2026-06-11 11:58:19 -07:00
Teknium	cb29e8a82e	refactor(cron): rebrand Cron Recipes -> Automation Blueprints Product rename across every surface: module/file names (blueprint_catalog, tools/blueprints, blueprint_cmd), slash command /cron-recipe -> /blueprint (alias /bp), dashboard API /api/cron/blueprints, desktop deep-link hermes://blueprint/<key>, docs catalog page + extract script, and the skill frontmatter block metadata.hermes.blueprint. No behavior change.	2026-06-11 10:49:47 -07:00
Teknium	e8b757845d	fix(cron-recipes): pre-release hardening — honest cadences, strict slot names, surface-aware UX Review fixes for the Cron Recipes stack before release: - hydration-move: /90 in the cron minute field silently wraps to hourly (croniter-verified) — 90/120-minute options never fired at their stated cadence. Replaced with an hour-field step (0 9-17/2 * 1-5) and an interval_hours slot whose options (1/2/3h) all fire as labeled. - fill_recipe: reject unknown slot names. A typo'd 'tiem=07:15' used to silently create the job at the 08:00 default; now it 422s on the dashboard form and errors on the slash/deep-link paths with the valid slot list. - deliver slot: non-strict enum (options are suggestions, scheduler validates downstream) so slack/whatsapp/etc. users aren't locked out; GET /api/cron/recipes rewrites its options from cron_delivery_targets() so the dashboard form only offers configured platforms; help text no longer claims dashboard-created jobs deliver to 'the chat you set this up from' (the endpoint strips origin — they go to the home channel). - gateway: success/accept messages no longer point at /cron (cli_only); surface-aware hint instead. Conversational fill now sends the 'Setting up X — I'll ask you a couple of things…' ack before the agent turn, matching the CLI experience. - important-mail catalog entry: reference the urgency classifier by module path (python3 -m cron.scripts.classify_items) instead of baking an absolute host path into the job prompt — stale after relocation and nonexistent on remote terminal backends. cron/scripts is now a real package and ships in the wheel (pyproject packages.find). - export_recipe: interval schedules round-trip again — parse_schedule stores 'minutes' but the renderer only read 'seconds', so every interval job exported as the silent '0 9 * * *' fallback. - skills_hub install: say so when a recipe suggestion is dropped (latched dedup or pending cap) instead of printing nothing. 
Targeted tests: 58 cron/recipe + 261 web_server pass; E2E-validated all 14 recipes fill+parse, hydration cadences via croniter, typo rejection on slash + endpoint paths, surface-aware hints, and interval export round-trip.	2026-06-11 10:49:47 -07:00
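The hydration cadence problem in e8b757845d follows from how a step is applied inside a field's own range. The sketch below is a textbook-style expansion, not croniter, and `expand_step` is a hypothetical helper; it assumes the offending minute field was a wildcard with a step of 90.

```python
def expand_step(field: str, lo: int, hi: int) -> list:
    """Expand 'range/step' (or '*/step') into the values it matches within [lo, hi]."""
    rng, _, step_s = field.partition("/")
    step = int(step_s) if step_s else 1
    if rng == "*":
        start, end = lo, hi
    else:
        start_s, _, end_s = rng.partition("-")
        start = int(start_s)
        end = int(end_s) if end_s else start
    return list(range(start, end + 1, step))


# A 90-minute step cannot fit inside the 0-59 minute range, so only minute 0
# matches and the job fires once every hour.
print(expand_step("*/90", 0, 59))    # [0]

# Stepping the hour field instead gives the intended two-hour cadence.
print(expand_step("9-17/2", 0, 23))  # [9, 11, 13, 15, 17]
```

This is why the fix moves the interval into the hour field: every option offered there matches the cadence its label states.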
teknium1	e976faac7a	feat(cron-recipes): /cron-recipe <name> seeds a conversational fill Reworks the chat-line UX: pick a recipe by name and the agent asks you for what it needs, one question at a time, instead of forcing you to hand-type a slot=val command line. - /cron-recipe -> lists the catalog - /cron-recipe <name> -> forgiving name match (exact/prefix/substring/ fuzzy; ambiguous lists candidates), then seeds the agent with a natural-language fill request built from the recipe's typed slots + schedule and prompt templates. The agent asks for each value one at a time and calls the EXISTING cronjob tool. No new tool. - /cron-recipe <name> slot=val -> unchanged deterministic path (fill_recipe -> create_job) for the dashboard/docs/power user. Mechanism (no new plumbing, invariant-safe — the seed enters as a normal user turn, never a synthetic injection): - shared handler returns RecipeCommandResult{text, agent_seed}; match_recipe() and build_recipe_seed() are the new shared pieces. - gateway: dispatch rewrites event.text to the seed and falls through to the agent (the same pattern /steer uses). - CLI: handler sets a one-shot self._pending_agent_seed; the interactive loop consumes it right after process_command() and runs it as the next turn. The typed-slot schema stays the single source of truth (still validates the form/inline path via fill_recipe); the agent path just renders those slots into the questions to ask. Docs updated to lead with the name-then-ask flow.	2026-06-11 10:49:47 -07:00
teknium1	1593ca5406	feat(cron): Cron Recipes — parameterized automation templates across every surface A 'recipe' is a one-place definition of an automation that every surface renders natively. The slot schema (cron/recipe_catalog.py) is the single source of truth; four renderers consume it, and all paths end at the same cron.jobs.create_job — no second job engine. Form where there's a screen, conversation where there's a chat line: - Dashboard / GUI app: a Recipes sub-tab on the Cron page renders each recipe's typed slots as a form (time-picker, enum dropdown, free-text); submit POSTs /api/cron/recipes/instantiate which fills + creates the job. - CLI / TUI / messengers: /cron-recipe lists the catalog, shows a recipe's fields, or fills + creates from a pasted 'key slot=val' command. The shared handler (hermes_cli/cron_recipe_cmd.py) names any missing/invalid slot so the agent can ask a targeted follow-up. - Docs: a generated Cron Recipes catalog page (website, .mdx + React cards) shows each recipe with a copy-paste command and a 'Send to App' button. - Desktop: a hermes:// URL scheme (Electron single-instance lock + setAsDefaultProtocolClient + open-url/second-instance) routes hermes://cron-recipe/<key>?slot=val into the chat composer pre-filled. Typed slots (time/enum/text/weekdays) with defaults: users never type raw cron — recipes parameterize time-of-day and weekday sets and translate to cron expressions; a free-text 'schedule' slot is the full-flexibility escape hatch. Consent-first throughout: nothing schedules without an explicit submit or send. Core: - cron/recipe_catalog.py — CronRecipe + RecipeSlot, 5 curated recipes, recipe_form_schema / recipe_slash_command / recipe_deeplink / recipe_catalog_entry renderers, fill_recipe (validate + translate to create_job kwargs). - hermes_cli/cron_recipe_cmd.py — shared /cron-recipe handler (CLI + TUI + gateway never drift). CommandDef + dispatch in commands.py / cli.py / gateway/run.py. 
Dashboard: GET /api/cron/recipes + POST /api/cron/recipes/instantiate (web_server.py), CronRecipes.tsx gallery+form, Segmented sub-tab on CronPage, api.ts methods + types. Desktop: hermes:// scheme end to end (main.cjs deep-link router + ready-queue, preload onDeepLink/signalDeepLinkReady, global.d.ts types, desktop-controller composer prefill, electron-builder protocols key). Docs: extract-cron-recipes.py generator wired into prebuild.mjs, cron-recipes-catalog.mdx + CronRecipesCatalog React component, sidebar entry. Generated index json gitignored like skills.json. Tests: 23 core (catalog/slots/schedule-resolution/validation/renderers/command handler/generator) + 5 web_server endpoint tests. E2E verified end to end: slot fill -> create_job -> persisted job with correct schedule/deliver/origin.	2026-06-11 10:49:47 -07:00
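Translating typed slots into a cron expression, as 1593ca5406 describes for `fill_recipe`, could be sketched as below. `slots_to_cron`, the `WEEKDAYS` table, and the two-slot schema are hypothetical; the 08:00 default and the rejection of unknown slot names follow the later hardening commit e8b757845d.

```python
WEEKDAYS = {"sun": 0, "mon": 1, "tue": 2, "wed": 3, "thu": 4, "fri": 5, "sat": 6}
ALLOWED_SLOTS = ("time", "weekdays")


def slots_to_cron(slots: dict) -> str:
    """Validate slot names and values, then render a five-field cron expression."""
    unknown = sorted(set(slots) - set(ALLOWED_SLOTS))
    if unknown:
        # A typo'd slot name is an error, not a silent fall-through to defaults.
        raise ValueError(f"unknown slot(s) {unknown}; valid slots: {list(ALLOWED_SLOTS)}")
    hour_s, _, minute_s = str(slots.get("time", "08:00")).partition(":")
    hour, minute = int(hour_s), int(minute_s)
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError(f"invalid time: {slots.get('time')!r}")
    days = slots.get("weekdays")
    dow = ",".join(str(WEEKDAYS[d]) for d in days) if days else "*"
    return f"{minute} {hour} * * {dow}"
```

For example, `slots_to_cron({"time": "07:15", "weekdays": ["mon", "wed", "fri"]})` yields `15 7 * * 1,3,5`, while a misspelled key such as `tiem` raises `ValueError` that lists the valid slots.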

1065 commits