hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-18 09:51:59 +00:00

Author	SHA1	Message	Date
Teknium	c66ecf0bc3	feat(delegation): async background subagents via delegate_task(background=true) (#40946) * feat(delegation): async background subagents via delegate_task(background=true) delegate_task(background=true) dispatches a subagent that runs in the background and returns a handle immediately, so the user and model keep working while it runs. The full result — plus the original task source — re-enters the conversation as a new turn when the subagent finishes, riding the same completion-queue rail as terminal background processes. - tools/async_delegation.py: daemon-executor registry, capacity cap, rich self-contained completion event pushed onto the shared process_registry.completion_queue (type='async_delegation'). - delegate_tool.py: background param + single-task dispatch branch; batch async rejected (v1). - process_registry.py: format_process_notification renders the rich task-source block (goal/context/toolsets/model/status/result). - gateway/run.py: dedicated _async_delegation_watcher drains + injects results into the originating session (idle + post-turn), session_key routing enrichment, shutdown interrupt of dangling delegations. - config: delegation.max_async_children (default 3). Reuses the existing idle-drain wiring rather than mutating a running agent loop, preserving message-role alternation and prompt-cache invariants. 13 targeted tests; CLI + gateway paths E2E-verified. * test(delegation): make async non-blocking tests environment-independent CI 'test (5)' flaked on a cold, 8-worker runner: the first delegate_task(background=true) call measured 2.27s of one-time setup (config load + child-agent construction + imports), tripping the elapsed < 1.0 wall-clock assertion. That assertion was testing setup overhead, not blocking. Replace the wall-clock thresholds with the real invariant: dispatch returns while the child is still gated (active_count == 1, completion queue empty), which a synchronous impl could not do.
Keep only a loose 4s sanity backstop well under the runner's 5s gate. * fix(delegation): harden async background delegation Follow-up review fixes: - Detach background child from parent._active_children at dispatch — otherwise parent-turn interrupts (Ctrl+C, mid-turn steering), cache evicts (release_clients), and session close (/new) kill/close the detached subagent mid-run, defeating the point of background mode. Lifecycle is owned by the async registry's interrupt_fn. - Make the capacity check atomic with the record insert (TOCTOU: two concurrent dispatches could both pass active_count() and exceed the cap). - TUI dedup: key async_delegation events by delegation_id — the fallthrough keyed them all as ("", type), suppressing every completion after the first in the desktop/TUI status feed. - CLI /stop now interrupts running background delegations and /agents lists them (they live outside the process registry and were invisible). - Drop stray unbalanced ']' line from the re-injection block and the unused _ASYNC_DEFAULT import. Tests: detach-at-dispatch + concurrent-capacity race added (15 total in test_async_delegation.py); 137 delegate + 140 process-registry/notify/watch + 7 TUI dedup tests pass. * fix(delegation): harden async background completion drains	2026-06-15 13:33:12 -07:00
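The TOCTOU fix described in c66ecf0bc3 (capacity check made atomic with the record insert) can be illustrated with a minimal sketch. The class and function names below are hypothetical and are not taken from tools/async_delegation.py; the only assumption carried over from the commit message is a cap of 3 concurrent children.

```python
import threading


class AsyncDelegationRegistrySketch:
    """Hypothetical registry: the capacity check and the insert share one lock."""

    def __init__(self, max_children=3):
        self._max_children = max_children
        self._records = {}
        self._lock = threading.Lock()

    def try_register(self, delegation_id, record):
        # Check and insert happen under the same lock, so two concurrent
        # dispatches cannot both observe a free slot and exceed the cap.
        with self._lock:
            if len(self._records) >= self._max_children:
                return False
            self._records[delegation_id] = record
            return True

    def active_count(self):
        with self._lock:
            return len(self._records)


def dispatch_many(registry, n):
    """Race n dispatches against the registry and collect accept/reject results."""
    results = []
    results_lock = threading.Lock()

    def worker(i):
        accepted = registry.try_register(f"d{i}", {"goal": f"task {i}"})
        with results_lock:
            results.append(accepted)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A separate `active_count()` call followed by an unlocked insert would leave a window between the two steps; folding both into `try_register` closes it.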
Teknium	3e7e9b24d4	fix: harden salvaged session and browser improvements Polish salvaged contributor work before PR review: - read browser inactivity timeout from config with documented fallback - skip redundant v10 trigram backfill before v11 FTS rebuild - show delegate_task goals safely in progress previews - show gateway status model/context without redundant token wording - wire gateway /sessions to shared session-listing helpers - map Ravenwolf author emails for release attribution Co-authored-by: Wolfram Ravenwolf <github.com@wolfram.ravenwolf.de> Co-authored-by: Amy Ravenwolf <amy@ravenwolf.de>	2026-06-15 07:46:34 -07:00
Teknium	be7c919bf9	fix(process): label background completion causes (#46659) Track why a background process finished and include that source in notify-on-complete messages so SIGTERM from process.kill, kill_all, backend loss, and ordinary exits are distinguishable.	2026-06-15 07:08:24 -07:00
kshitijk4poor	3bc4a2ff78	fix(gateway): re-baseline agent-cache message_count after each turn The #45966 cross-process coherence guard snapshots a session's on-disk message_count next to the cached agent and rebuilds the agent when the count changes. But the snapshot is taken at agent-BUILD time — before the turn writes its own user + assistant (+ tool) rows — and the cache entry is never rewritten on a reuse. So this process's OWN turn grows message_count, and the very next turn sees a mismatch and rebuilds the agent. That happens every turn, for every conversation, silently destroying the per-conversation prompt caching the cache exists to protect (AGENTS.md: prompt caching is sacred). Add _refresh_agent_cache_message_count(): after a turn completes and the agent has flushed its rows to the SessionDB, re-baseline the stored count to the now-current value. The guard then fires ONLY when a DIFFERENT process changes the transcript — preserving the #45966 fix while keeping the cache warm for normal single-process operation. Tests drive the real SessionDB + the real guard condition: 5 consecutive same-process turns now all REUSE the cached agent (0 before the fix); a cross-process append still invalidates; and the re-baseline is fail-safe (no DB, falsy session_id, raising probe, legacy 2-tuple, pending sentinel all no-op).	2026-06-14 22:58:55 +05:30
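The re-baselining described in 3bc4a2ff78 can be sketched as follows. This is an illustrative model only: `AgentCacheSketch`, `refresh_message_count`, and `run_turns` are hypothetical names, and the "database" is a plain dict standing in for the SessionDB.

```python
class AgentCacheSketch:
    """Hypothetical cache: session_id -> (agent, message_count snapshot)."""

    def __init__(self, count_messages):
        self._count_messages = count_messages  # callable: session_id -> int
        self._entries = {}
        self.builds = 0

    def get_agent(self, session_id):
        current = self._count_messages(session_id)
        entry = self._entries.get(session_id)
        if entry is not None and entry[1] == current:
            return entry[0]  # transcript unchanged since the snapshot: reuse
        # Count changed (or no entry yet): rebuild and snapshot at build time.
        self.builds += 1
        agent = object()
        self._entries[session_id] = (agent, current)
        return agent

    def refresh_message_count(self, session_id):
        # Re-baseline after this process's own turn has written its rows.
        entry = self._entries.get(session_id)
        if entry is None:
            return
        self._entries[session_id] = (entry[0], self._count_messages(session_id))


def run_turns(n, rebaseline):
    """Simulate n same-process turns; return how many times the agent was built."""
    db = {"s1": 0}
    cache = AgentCacheSketch(lambda sid: db[sid])
    for _ in range(n):
        cache.get_agent("s1")
        db["s1"] += 2  # the turn writes its own user + assistant rows
        if rebaseline:
            cache.refresh_message_count("s1")
    return cache.builds
```

Without the re-baseline, every turn after the first sees a count mismatch caused by the process's own writes; with it, only a write from elsewhere changes the count between turns.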
kyssta-exe	7f245b0035	fix(gateway): invalidate agent cache on cross-process session writes (#45966) (cherry picked from commit `6d0f79defe`)	2026-06-14 22:54:39 +05:30
Teknium	2c174bce24	fix(gateway): preserve new input on interrupted replay cleanup	2026-06-14 05:10:39 -07:00
Arnaud L	5191c1c2ce	fix(gateway): stop replaying interrupted tool-call tails and auto-continue notes Three changes to prevent infinite re-execution loops when a user sends a new message while long-running tools are executing: 1. Filter interrupted tool results in _build_gateway_agent_history: skip tool messages whose content contains [Command interrupted] or exit_code 130 — they represent partial execution, not valid results. 2. Don't replay auto-continue notes as user messages: detect gateway-injected [System note: ...] / [IMPORTANT: ...] prefixes and skip them in _build_gateway_agent_history so the LLM doesn't see 4+ messages from 'the user' telling it to finish old work. 3. Fix the wording: the system note now instructs the model to address the user's NEW message FIRST, IGNORE pending results, and NOT re-execute old tool calls. Closes #45230	2026-06-14 05:10:39 -07:00
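A rough sketch of the two history filters described in 5191c1c2ce follows. The message shape (dicts with `role` and `content`) and the exact textual form of the exit code are assumptions made for illustration; the real `_build_gateway_agent_history` may differ, and a real implementation would also need to keep tool calls and tool results paired.

```python
INTERRUPT_MARKER = "[Command interrupted]"
# Assumed serialization of the exit code inside tool output (hypothetical).
EXIT_130_MARKER = '"exit_code": 130'
NOTE_PREFIXES = ("[System note:", "[IMPORTANT:")


def is_interrupted_tool_result(message):
    """True for tool messages that record a partial, interrupted execution."""
    if message.get("role") != "tool":
        return False
    content = str(message.get("content", ""))
    return INTERRUPT_MARKER in content or EXIT_130_MARKER in content


def is_injected_note(message):
    """True for gateway-injected notes that were stored as user messages."""
    if message.get("role") != "user":
        return False
    return str(message.get("content", "")).lstrip().startswith(NOTE_PREFIXES)


def filter_history(messages):
    """Drop interrupted tool results and injected notes before replay."""
    return [
        m for m in messages
        if not (is_interrupted_tool_result(m) or is_injected_note(m))
    ]
```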
Aldo	293c04fef6	fix(gateway): suppress exact silence tokens without mutating history	2026-06-14 03:25:08 -07:00
Teknium	10bad2faf1	fix(gateway): serialize startup auto-resume before inbound (#46074) Gateway startup now queues real inbound messages until restart-interrupted auto-resume turns have completed, preventing duplicate agents for the same session after a restart.	2026-06-14 03:21:06 -07:00
Teknium	723c2331bd	fix: make profile subprocess HOME policy explicit	2026-06-14 03:20:21 -07:00
Teknium	dc90ca4e17	fix(ssl): run CA guard during agent initialization	2026-06-13 21:14:32 -07:00
Teknium	af5b526472	fix(ssl): validate CA bundle paths before provider calls	2026-06-13 21:14:32 -07:00
chromalinx	a218a0f156	fix(agent,gateway,doctor): add SSL CA cert bundle fail-fast guard A stale certifi CA bundle after a partial `hermes update` used to crash the agent on the first outbound HTTPS call with a raw traceback and trap the gateway in a retry loop. This patch: * Adds `agent/errors.py` with a typed `SSLConfigurationError` * Adds `agent/ssl_guard.py` with a `verify_ca_bundle()` pre-flight that asserts the bundle exists, is non-trivial in size, and can build a working SSLContext. On macOS, it falls back to the system trust store when the bundle is empty but the system store is healthy (covers corporate proxies / MDM setups). * Wires the guard into `run_agent.py` and `gateway/run.py` right after the `hermes_bootstrap` import, inside a try/except so a bug in the guard itself can never prevent startup. * Adds a `SSL / CA Certificates` section to `hermes_cli doctor` so users can detect the failure with one command. * Adds unit tests covering the healthy, missing, empty, skip-env, and macOS-fallback paths. * Adds an RCA document describing the failure mode and the recovery path (`pip install -e .`). When the bundle is broken the user sees: ⚠️ SSL certificate bundle issue detected. Run: pip install -e . `HERMES_SKIP_SSL_GUARD=1` disables the check for sandboxed environments that ship their own trust store.	2026-06-13 21:14:32 -07:00
kshitijk4poor	63097ee0d7	test(gateway): cover auto-resume full-path no-regression; clarify guard docstring The salvaged fix's two regression tests mock adapter.handle_message, so they only assert the pre-claimed sentinel is set/cleaned around a stub — they never drive the real dispatch chain. Add a full-path test that exercises _schedule_resume_pending_sessions -> _guarded_handle_message -> adapter.handle_message -> _process_message_background -> _handle_message and asserts the resumed session's agent runs EXACTLY ONCE: not zero (the pre-claim must not self-bounce the resume into a queued no-op) and not twice (the duplicate-agent bug #45456 the fix targets). Also assert no leaked sentinel and no orphaned pending event after the drain settles. Tighten the _guarded_handle_message docstring: on current main the real sentinel is taken over inside _handle_message (not _process_message_background), and note the `is _AGENT_PENDING_SENTINEL` guard only releases the slot we ourselves placed, never one a live run owns.	2026-06-13 23:39:35 +05:30
liuhao1024	6e2fd955ca	fix(gateway): claim session slot before auto-resume task to prevent duplicate agents When the gateway restarts and auto-resumes an interrupted session, an inbound message arriving in the window between `asyncio.create_task()` and the task's first await could spin up a second AIAgent for the same session. Both agents would then process messages concurrently, producing interleaved duplicate responses (#45456). Fix: set `_AGENT_PENDING_SENTINEL` in `_running_agents` immediately after the "already running" check, before creating the task. This closes the race window — any inbound message sees the slot as occupied and queues behind the auto-resume. A `_guarded_handle_message` wrapper ensures the pre-claimed sentinel is always released, even if `handle_message` raises before reaching `_process_message_background` (whose `finally` block handles normal cleanup). (cherry picked from commit `85150c976b`)	2026-06-13 23:36:51 +05:30
konsisumer	16fb573bae	fix(gateway): clear bloated compression binding on compression-exhaustion auto-reset After compression exhaustion the auto-reset created a fresh session but discarded reset_session()'s return value and left the Telegram topic binding pointing at the oversized compressed child. The next inbound message in that topic healed the binding forward and switch_session'd the freshly-reset lane back onto the bloated transcript, re-triggering compression exhaustion in a loop with a new session id each time. Capture the fresh entry and re-sync the topic binding to it so the next message starts clean. No-op on non-topic lanes. Regression of the #9893/#10063 auto-reset fix. Fixes #35809	2026-06-13 06:38:29 -07:00
Black-Kylin	202e318cb1	fix(gateway): sync compression session splits before failures Salvages PR #25747 by preserving gateway session rotation even when a post-compression model call fails before returning final content. Co-authored-by: Hermes <127238744+teknium1@users.noreply.github.com>	2026-06-13 04:51:59 -07:00
Teknium	2a5dc0ef3d	fix(slack): make video attachments available to agents (#45512)	2026-06-13 03:33:27 -07:00
Siddharth Balyan	7ba5df0d52	feat(billing): /credits command — balance + portal top-up handoff (#44776) * feat(billing): /usage → portal top-up browser handoff Add the terminal side of the billing slice (phase 2a): start a top-up by throwing the user to the portal billing page with the top-up modal open. The terminal does not confirm, poll, or track payment — checkout completes in the browser and the next /usage shows the new balance. - nous_account.py: parse organisation.slug/name from /api/oauth/account into NousPortalAccountInfo; add nous_portal_topup_url() building the org-pinned {base}/orgs/{slug}/billing?topup=open with a null-slug fallback to the legacy {base}/billing?topup=open (never /orgs/None/...). - portal_cli.py: 'hermes portal topup' — fresh account fetch, identity line (Topping up as <email> / org <name>), browser open with printed-URL fallback, no-wait closing copy. No polling/confirmation (deferred to 2b). - account_usage.py: the shared /usage credits block now links the org-pinned top-up URL (auto-opens the modal) + points to the command. Depends on NAS #409 (organisation.slug/name + ?topup=open). Do not merge until that is live on the target env; until then /api/oauth/account returns organisation: { id } only and the URL falls back to legacy. * feat(billing): /credits command for balance + top-up handoff Replace the standalone `hermes portal topup` subcommand with an in-session /credits slash command — a focused money surface (balance in, top-up out) that works in the CLI, TUI, and every messaging platform from one registry entry. - commands.py: register /credits (Info category). Slack is at its 50-slash cap, so /credits is routed via /hermes credits on Slack only (new _SLACK_VIA_HERMES_ONLY set) to avoid clamping a canonical command off the native list and breaking Telegram parity; native everywhere else.
- account_usage.py: build_credits_view() — one portal fetch → balance lines + identity line + org-pinned top-up URL + depleted flag, consumed by all surfaces. Reuses the same snapshot/URL builder as /usage so numbers match. - cli.py: _show_credits() — balance block + identity line + 3-button panel (Open top-up / Copy link / Cancel) via the existing prompt_toolkit modal. ASK, never auto-launch; headless falls back to printing the URL. - gateway/slash_commands.py: _handle_credits_command() — renders the block + tappable top-up URL + no-wait copy; works on button and plain-text platforms. - /usage credits line now points to /credits. - Retire `hermes portal topup` (portal_cli.py back to baseline); the engine (slug/name parse + nous_portal_topup_url) stays as the shared core. No polling, no payment confirmation (billing phase 2a). Depends on NAS #409. * fix(credits): /credits works in the TUI slash-worker (non-interactive) In the TUI, /credits runs in the slash-worker subprocess where there is no live prompt_toolkit app and stdin is the JSON-RPC pipe. _show_credits called the 3-button modal unconditionally, which fell back to reading stdin → exception → slash.exec rejected → the command produced no output (only the pre-existing 'Credit access paused' banner showed). - _show_credits: when self._app is None (TUI worker / piped / non-interactive), render the text variant — balance block + tappable top-up URL + no-wait line, same affordance as the messaging surfaces — and skip the modal entirely. The 3-button panel still renders in the interactive CLI. - Depleted banner copy: 'run /usage for balance' → 'run /credits to top up' now that /credits is the dedicated money surface (+ tests). - Regression tests: _show_credits with self._app=None renders text and never invokes the modal; logged-out path. 
* feat(tui): credits.view RPC for the /credits tappable top-up button Add a credits.view JSON-RPC method returning the structured CreditsView (logged_in, balance_lines, identity_line, topup_url, depleted) so the TUI can render a clickable <Link> top-up button instead of plain text. Account- independent (portal fetch gated on a logged-in Nous account), fail-open to {logged_in: false} on any hiccup. Mirrors session.usage's credits-block pattern. Frontend (TUI-local /credits command + Ink component) lands separately. * feat(tui): /credits command with keyboard-driven top-up confirm TUI-local /credits: fetches the structured balance via the credits.view RPC, prints the balance + identity + top-up URL, then arms the EXISTING confirm overlay (Enter = open top-up in browser via openExternalUrl, Esc = cancel). Reuses ConfirmReq — no new overlay component/state/input handler. Headless (openExternalUrl returns false) falls back to printing the URL. - gatewayTypes.ts: CreditsViewResponse. - commands/credits.ts: the command (mirrors /status's rpc+guarded pattern). - registry.ts: register creditsCommands. - test: balance+overlay armed, headless fallback, no-url, logged-out (4 cases). Matches the CLI /credits 'Enter to open' affordance. Phase 2a: no polling.	2026-06-12 08:51:10 +00:00
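The org-pinned top-up URL with its null-slug fallback, as described in 7ba5df0d52, might look roughly like this. `topup_url_sketch` is a hypothetical stand-in written for illustration; it is not the actual `nous_portal_topup_url()` and the base URL in the usage note is a placeholder.

```python
from typing import Optional
from urllib.parse import quote


def topup_url_sketch(base: str, org_slug: Optional[str]) -> str:
    """Build the billing URL that opens the top-up modal.

    Falls back to the legacy path when the slug is missing, so the
    result never contains a literal '/orgs/None/'.
    """
    base = base.rstrip("/")
    if org_slug:
        # Percent-encode the slug so it stays a single path segment.
        return f"{base}/orgs/{quote(org_slug, safe='')}/billing?topup=open"
    return f"{base}/billing?topup=open"
```

For example, `topup_url_sketch("https://portal.example", "acme")` yields the org-pinned form, while passing `None` or an empty slug yields the legacy `/billing?topup=open` form.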
Teknium	db7714d5f1	Merge pull request #44331 from NousResearch/hermes/hermes-6b48295e feat(whatsapp): WhatsApp Business Cloud API adapter (salvage #43921)	2026-06-11 22:48:06 -07:00
Kyssta	a942bfd9cc	fix(gateway): reset _last_flushed_db_idx when reusing cached agent (#44327) (#44518) Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>	2026-06-11 22:41:34 -07:00
Teknium	13650ab7f8	fix(gateway): audio attachment note no longer steers the agent into punting Sibling site of the PDF/DOCX note fixed in PR #44175: the audio file attachment context note led with "Ask the user what they'd like you to do with it", steering the model into asking instead of transcribing. Rewritten to instruct the agent to transcribe/process the file itself when the request involves its content, only asking when intent is genuinely unclear. Contract assertion added to the existing audio attachment note test.	2026-06-11 11:58:19 -07:00
xxxigm	e7ae145ac4	fix(gateway): guide the agent to read attached PDF/DOCX instead of punting When a user attached a binary document (PDF, DOCX, XLSX, …) in chat, the context note prepended to the turn said "Ask the user what they'd like you to do with it." That steered the model into asking the user to paste the contents rather than extracting the text it is fully capable of reading — so attached PDFs/DOCX appeared "unreadable" to the agent. Rewrite the binary-document note to tell the agent the file is a non-text format saved at the given path and to extract its text itself (e.g. via the terminal tool or the ocr-and-documents skill) before answering. Text documents (whose content is already inlined by the platform adapter) keep their existing note. The note construction is pulled into a small `_build_document_context_note` helper so it is unit-testable.	2026-06-11 11:58:19 -07:00
Teknium	cb29e8a82e	refactor(cron): rebrand Cron Recipes -> Automation Blueprints Product rename across every surface: module/file names (blueprint_catalog, tools/blueprints, blueprint_cmd), slash command /cron-recipe -> /blueprint (alias /bp), dashboard API /api/cron/blueprints, desktop deep-link hermes://blueprint/<key>, docs catalog page + extract script, and the skill frontmatter block metadata.hermes.blueprint. No behavior change.	2026-06-11 10:49:47 -07:00
Teknium	e8b757845d	fix(cron-recipes): pre-release hardening — honest cadences, strict slot names, surface-aware UX Review fixes for the Cron Recipes stack before release: - hydration-move: /90 in the cron minute field silently wraps to hourly (croniter-verified) — 90/120-minute options never fired at their stated cadence. Replaced with an hour-field step (0 9-17/2 * 1-5) and an interval_hours slot whose options (1/2/3h) all fire as labeled. - fill_recipe: reject unknown slot names. A typo'd 'tiem=07:15' used to silently create the job at the 08:00 default; now it 422s on the dashboard form and errors on the slash/deep-link paths with the valid slot list. - deliver slot: non-strict enum (options are suggestions, scheduler validates downstream) so slack/whatsapp/etc. users aren't locked out; GET /api/cron/recipes rewrites its options from cron_delivery_targets() so the dashboard form only offers configured platforms; help text no longer claims dashboard-created jobs deliver to 'the chat you set this up from' (the endpoint strips origin — they go to the home channel). - gateway: success/accept messages no longer point at /cron (cli_only); surface-aware hint instead. Conversational fill now sends the 'Setting up X — I'll ask you a couple of things…' ack before the agent turn, matching the CLI experience. - important-mail catalog entry: reference the urgency classifier by module path (python3 -m cron.scripts.classify_items) instead of baking an absolute host path into the job prompt — stale after relocation and nonexistent on remote terminal backends. cron/scripts is now a real package and ships in the wheel (pyproject packages.find). - export_recipe: interval schedules round-trip again — parse_schedule stores 'minutes' but the renderer only read 'seconds', so every interval job exported as the silent '0 9 * * *' fallback. - skills_hub install: say so when a recipe suggestion is dropped (latched dedup or pending cap) instead of printing nothing. 
Targeted tests: 58 cron/recipe + 261 web_server pass; E2E-validated all 14 recipes fill+parse, hydration cadences via croniter, typo rejection on slash + endpoint paths, surface-aware hints, and interval export round-trip.	2026-06-11 10:49:47 -07:00
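The cadence problem fixed in e8b757845d follows from how a cron step is evaluated inside a single field's range. The sketch below models that with plain Python ranges rather than croniter; `expand_step` is a hypothetical helper written for illustration.

```python
def expand_step(start, end, step):
    """Values a cron range with a step matches, e.g. 9-17/2 -> 9, 11, ..., 17."""
    return list(range(start, end + 1, step))


# Minute field spans 0-59, so a step of 90 can only ever match minute 0,
# which means the job fires at the top of every hour.
minute_every_90 = expand_step(0, 59, 90)

# Hour-field step over a workday window fires every two hours as labeled.
hours_every_2 = expand_step(9, 17, 2)
```

Because a step never carries over from one field into the next, intervals longer than 59 minutes have to be expressed in the hour field (or as a true interval schedule), which is the direction the fix takes.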
teknium1	e976faac7a	feat(cron-recipes): /cron-recipe <name> seeds a conversational fill Reworks the chat-line UX: pick a recipe by name and the agent asks you for what it needs, one question at a time, instead of forcing you to hand-type a slot=val command line. - /cron-recipe -> lists the catalog - /cron-recipe <name> -> forgiving name match (exact/prefix/substring/ fuzzy; ambiguous lists candidates), then seeds the agent with a natural-language fill request built from the recipe's typed slots + schedule and prompt templates. The agent asks for each value one at a time and calls the EXISTING cronjob tool. No new tool. - /cron-recipe <name> slot=val -> unchanged deterministic path (fill_recipe -> create_job) for the dashboard/docs/power user. Mechanism (no new plumbing, invariant-safe — the seed enters as a normal user turn, never a synthetic injection): - shared handler returns RecipeCommandResult{text, agent_seed}; match_recipe() and build_recipe_seed() are the new shared pieces. - gateway: dispatch rewrites event.text to the seed and falls through to the agent (the same pattern /steer uses). - CLI: handler sets a one-shot self._pending_agent_seed; the interactive loop consumes it right after process_command() and runs it as the next turn. The typed-slot schema stays the single source of truth (still validates the form/inline path via fill_recipe); the agent path just renders those slots into the questions to ask. Docs updated to lead with the name-then-ask flow.	2026-06-11 10:49:47 -07:00
teknium1	1593ca5406	feat(cron): Cron Recipes — parameterized automation templates across every surface A 'recipe' is a one-place definition of an automation that every surface renders natively. The slot schema (cron/recipe_catalog.py) is the single source of truth; four renderers consume it, and all paths end at the same cron.jobs.create_job — no second job engine. Form where there's a screen, conversation where there's a chat line: - Dashboard / GUI app: a Recipes sub-tab on the Cron page renders each recipe's typed slots as a form (time-picker, enum dropdown, free-text); submit POSTs /api/cron/recipes/instantiate which fills + creates the job. - CLI / TUI / messengers: /cron-recipe lists the catalog, shows a recipe's fields, or fills + creates from a pasted 'key slot=val' command. The shared handler (hermes_cli/cron_recipe_cmd.py) names any missing/invalid slot so the agent can ask a targeted follow-up. - Docs: a generated Cron Recipes catalog page (website, .mdx + React cards) shows each recipe with a copy-paste command and a 'Send to App' button. - Desktop: a hermes:// URL scheme (Electron single-instance lock + setAsDefaultProtocolClient + open-url/second-instance) routes hermes://cron-recipe/<key>?slot=val into the chat composer pre-filled. Typed slots (time/enum/text/weekdays) with defaults: users never type raw cron — recipes parameterize time-of-day and weekday sets and translate to cron expressions; a free-text 'schedule' slot is the full-flexibility escape hatch. Consent-first throughout: nothing schedules without an explicit submit or send. Core: - cron/recipe_catalog.py — CronRecipe + RecipeSlot, 5 curated recipes, recipe_form_schema / recipe_slash_command / recipe_deeplink / recipe_catalog_entry renderers, fill_recipe (validate + translate to create_job kwargs). - hermes_cli/cron_recipe_cmd.py — shared /cron-recipe handler (CLI + TUI + gateway never drift). CommandDef + dispatch in commands.py / cli.py / gateway/run.py. 
Dashboard: GET /api/cron/recipes + POST /api/cron/recipes/instantiate (web_server.py), CronRecipes.tsx gallery+form, Segmented sub-tab on CronPage, api.ts methods + types. Desktop: hermes:// scheme end to end (main.cjs deep-link router + ready-queue, preload onDeepLink/signalDeepLinkReady, global.d.ts types, desktop-controller composer prefill, electron-builder protocols key). Docs: extract-cron-recipes.py generator wired into prebuild.mjs, cron-recipes-catalog.mdx + CronRecipesCatalog React component, sidebar entry. Generated index json gitignored like skills.json. Tests: 23 core (catalog/slots/schedule-resolution/validation/renderers/command handler/generator) + 5 web_server endpoint tests. E2E verified end to end: slot fill -> create_job -> persisted job with correct schedule/deliver/origin.	2026-06-11 10:49:47 -07:00
teknium1	9a09ea69fb	feat(cron): Suggested Cron Jobs — one surface for proposed automations Hermes can propose automations and let the user accept them with one tap via /suggestions, instead of making them assemble cron jobs by hand. Every proposal — wherever it originates — flows through one surface. Sources (the 'where suggestions come from'): - catalog: curated starter automations (daily briefing, important-mail monitor, weekly review, workday-start reminder) via /suggestions catalog - recipe: installing a skill that carries a metadata.hermes.recipe block registers a suggestion instead of auto-scheduling - usage / integration: reserved for the background-review detector and account-connect triggers (sources defined; emitters land next) Pieces: - cron/suggestions.py — the store. add/list/accept/dismiss, dedup+latch by key (dismissed proposals never re-offered), pending cap so it can't become a nag wall. Accepting calls the existing cron.jobs.create_job — there is NO second job engine. Mirrors jobs.py storage (atomic writes, lock, 0600). - cron/suggestion_catalog.py — the curated set. The important-mail monitor entry is where the old proactive-monitor poll->classify->surface engine lives now (cron/scripts/classify_items.py + the 'monitor' aux task), as ONE catalog automation rather than a standalone feature. - tools/recipes.py — recipe<->job bridge; register_recipe_suggestion() makes a recipe source 'recipe' of this surface. recipe_to_job_spec() is the single translation both the direct and suggestion paths share. - hermes_cli/suggestions_cmd.py — shared /suggestions handler (CLI + gateway never drift); /suggestions [accept N|dismiss N|catalog|clear]. - Wired: CommandDef + CLI dispatch (cli.py) + gateway dispatch (gateway/run.py) + aux 'monitor' task (config.py) + recipe-install hook (skills_hub.py). Consent-first throughout: nothing auto-schedules; acceptance is always explicit; dismissals latch.
Supersedes #41122 (proactive-monitor) and #41127 (recipes): both fold in here as a catalog entry and a suggestion source respectively. Tests: store (dedup/cap/accept/dismiss/latch), catalog seeding+idempotency, recipe->suggestion bridge, command handler, aux config. E2E: recipe SKILL.md -> parsed -> suggested -> accepted -> real cron job persisted to jobs.json.	2026-06-11 10:49:47 -07:00
Teknium	2ecb4e62bb	Merge remote-tracking branch 'origin/main' into hermes/hermes-6b48295e	2026-06-11 07:38:25 -07:00
teknium1	cb2c13055e	fix(gateway): scrub _HERMES_GATEWAY from POSIX detached restart watcher too Follow-up to the salvaged #41264 (Windows watcher): the setsid/bash detached restart watcher on Linux/macOS inherits _HERMES_GATEWAY=1 the same way, so the CLI's self-restart loop guard silently refuses 'hermes gateway restart' and the gateway never comes back. Scrub the marker from the watcher env on the POSIX branch as well, and extend the setsid test to assert it.	2026-06-10 23:22:43 -07:00
鼬君夏纪	264ac72b67	fix(gateway,windows): preserve restart watcher env	2026-06-10 23:22:43 -07:00
Teknium	13f1efdd15	fix(gateway): collapse repeated terminal headers in consecutive tool progress blocks (#43968) When the agent runs several terminal commands back-to-back, each progress line repeated the '💻 terminal' header above its fenced code block, cluttering the progress bubble. Now only the first terminal call in a streak emits the header; subsequent consecutive terminal calls render adjacent code blocks. Any other tool (or non-block preview) resets the streak so the next terminal call gets a fresh header.	2026-06-10 22:30:27 -07:00
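The streak rule in 13f1efdd15 can be sketched as a small renderer. The function name, the `(tool_name, preview)` input shape, and the non-terminal line format are all hypothetical; only the header text and the reset-on-other-tool behavior come from the commit message.

```python
FENCE = "`" * 3  # built at runtime to avoid nesting literal fences here
TERMINAL_HEADER = "💻 terminal"


def render_progress(calls):
    """Render (tool_name, preview) pairs, one header per terminal streak."""
    lines = []
    in_terminal_streak = False
    for tool, preview in calls:
        if tool == "terminal":
            if not in_terminal_streak:
                lines.append(TERMINAL_HEADER)  # first call of the streak only
            lines.append(f"{FENCE}\n{preview}\n{FENCE}")
            in_terminal_streak = True
        else:
            # Any other tool breaks the streak, so the next terminal
            # call gets a fresh header.
            lines.append(f"{tool}: {preview}")
            in_terminal_streak = False
    return lines
```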
emozilla	bfcc9f92b4	Merge commit '`6110aed9b`' into feat/whatsapp-cloud-api	2026-06-10 21:39:22 -04:00
Teknium	70d5d7e39b	fix(memory,skills): repair write-approval inline prompt, gateway staging, and gateway /skills review (#43452) Follow-ups to #38199/#43354 found in post-merge review: - Inline CLI memory approval never worked: the per-thread approval callback was not passed to prompt_dangerous_approval, so the prompt_toolkit fail-closed guard (#15216) denied every gated foreground write without showing a prompt. Now invokes the registered callback directly; a crashed prompt falls back to staging instead of a silent deny. - Gateway sessions claimed inline support but prompt_dangerous_approval has no gateway round-trip (that lives in the pending-approval queue), so gated gateway memory writes hit the input() fallback and denied. Gateway contexts now stage for /memory pending review. - /skills pending|approve|reject|diff|approval now works on the gateway (gateway_config_gate on skills.write_approval), so skills staged from a messaging session can be reviewed there. Diff output truncated for chat. - memory_tool validates required params before the gate so invalid writes are rejected immediately instead of staged and failing at approve time. - Stale tri-state write_mode docstrings updated to the boolean gate; docs table corrected (inline prompt is interactive-CLI-only). - 6 new tests covering the interactive approve/deny/error paths, gateway staging, skills never-prompt invariant, and pre-gate validation.	2026-06-10 02:57:15 -07:00
Teknium	cd9a9cd8e5	fix(gateway): Slack approval UX in threads — block-size overflow + typed-prefix instruction text (#43444) Two fixes for the reported Slack thread approval UX: 1. Slack Block Kit approval/confirm sends silently overflowed the 3000-char section-block cap (flat 2900-char truncation + header + reason), so long execute_code approvals failed with invalid_blocks and fell back to the plain-text prompt with no buttons. Budget the command preview against the rendered fixed parts so blocks never exceed the cap (send_exec_approval + send_slash_confirm). 2. The text fallbacks told users to reply /approve — which Slack blocks inside threads and Matrix clients reserve client-side. Add a typed_command_prefix capability flag on BasePlatformAdapter (default "/"; Slack and Matrix set "!" to match their existing bang-prefix rewrite) and use it in the shared fallback prompt builders (exec approval, update prompt, destructive slash confirm, expensive-model confirm) plus Matrix's reaction-prompt text. The slash-confirm text-intercept now also accepts bang-prefixed replies (!always, !cancel) since those keywords aren't registered commands and the adapters' rewrite doesn't touch them.	2026-06-10 02:30:01 -07:00
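Budgeting the command preview against the fixed parts of the block, as described in the first fix of cd9a9cd8e5, could be sketched like this. The function name and the three-part layout (header, command, reason joined by newlines) are assumptions for illustration; the real send_exec_approval rendering is not shown in the log.

```python
SECTION_TEXT_LIMIT = 3000  # section-block cap named in the commit message
ELLIPSIS = "…"


def budget_preview(header, command, reason, limit=SECTION_TEXT_LIMIT):
    """Truncate only the command so the rendered text fits within the cap.

    Assumes the fixed parts (header, reason, separators) alone fit
    under the limit; this sketch does not handle the case where they do not.
    """
    fixed = len(header) + len(reason) + 2  # two newline separators
    room = limit - fixed
    if len(command) > room:
        # Leave one character for the ellipsis that marks the truncation.
        command = command[: max(room - len(ELLIPSIS), 0)] + ELLIPSIS
    return f"{header}\n{command}\n{reason}"
```

A flat truncation of the command to a constant length ignores how long the header and reason are, which is how the combined text could still exceed the cap.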
xxxigm	311900842e	fix(discord): don't auto-disconnect voice when reply mode is off The voice inactivity timer (VOICE_TIMEOUT) only counted the bot's OWN audio playback as activity. Under /voice off (text-only replies, but still in the channel — leaving is /voice leave) nothing ever reset it, so every 300s the bot disconnected and spammed "Left voice channel (inactivity timeout)." The adapter now learns the live voice-reply mode via a getter wired from run.py and skips the auto-disconnect while mode is off. It also resets the timer when a user actually speaks to the bot, so an active listener (incl. voice-on text-only sessions that never play audio) isn't dropped mid-conversation.	2026-06-09 23:24:26 -07:00
synapsesx	9ca9697342	fix(gateway): return tuple from voice transcription on placeholder caption (#42090 ) ## What does this PR do? The voice-during-active-run feature (#41984) changed `_enrich_message_with_transcription` so that it returns a `(enriched_text, successful_transcripts)` tuple instead of a bare string, which lets callers echo the raw transcript back to the user. The signature and every other return path were updated to match, but one branch was missed: when a successfully transcribed clip arrives with the Discord "empty content" placeholder as its caption, the method still returned the prefix string on its own. All four call sites unpack the result with `text, transcripts = await self._enrich_message_with_transcription(...)`, so that path raised `ValueError: too many values to unpack (expected 2)` and the inbound voice message was dropped instead of reaching the agent. This is a real user-facing path rather than a corner case: a Discord voice note sent without a caption is delivered as exactly that placeholder, so a captionless voice message that transcribed correctly would crash the handler precisely when transcription had worked. The fix returns the proper tuple from that branch so the placeholder is still stripped while the transcripts continue to flow back to the caller for the echo. ## Related Issue N/A ## Type of Change - [x] 🐛 Bug fix (non-breaking change that fixes an issue) - [ ] ✨ New feature (non-breaking change that adds functionality) - [ ] 🔒 Security fix - [ ] 📝 Documentation update - [ ] ✅ Tests (adding or improving test coverage) - [ ] ♻️ Refactor (no behavior change) - [ ] 🎯 New skill (bundled or hub) ## Changes Made - `gateway/run.py`: in `_enrich_message_with_transcription`, return `(prefix, successful_transcripts)` instead of a bare `prefix` from the empty-content-placeholder branch, so the contract matches the signature and the other return paths. 
- `tests/gateway/test_stt_config.py`: add `test_enrich_message_with_transcription_returns_tuple_for_empty_content_placeholder`, which drives a successful transcription with the placeholder caption and asserts the placeholder is stripped while the transcript is still returned. ## How to Test 1. Check out `main` and run the new test — it fails with `ValueError: too many values to unpack (expected 2)`, reproducing the crash a captionless Discord voice note would trigger. 2. Apply this change and re-run `pytest tests/gateway/test_stt_config.py -q` — all tests pass. 3. `ruff check gateway/run.py tests/gateway/test_stt_config.py` and `python scripts/check-windows-footguns.py gateway/run.py tests/gateway/test_stt_config.py` both pass. ## Checklist ### Code - [x] I've read the [Contributing Guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md) - [x] My commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) (`fix(scope):`, `feat(scope):`, etc.) 
- [x] I searched for [existing PRs](https://github.com/NousResearch/hermes-agent/pulls) to make sure this isn't a duplicate - [x] My PR contains only changes related to this fix/feature (no unrelated commits) - [x] I've run `pytest tests/ -q` and all tests pass - [x] I've added tests for my changes (required for bug fixes, strongly encouraged for features) - [x] I've tested on my platform: macOS 15 (Darwin 25.5) ### Documentation & Housekeeping - [x] I've updated relevant documentation (README, `docs/`, docstrings) — or N/A - [x] I've updated `cli-config.yaml.example` if I added/changed config keys — or N/A - [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows — or N/A - [x] I've considered cross-platform impact (Windows, macOS) per the [compatibility guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md#cross-platform-compatibility) — or N/A - [x] I've updated tool descriptions/schemas if I changed tool behavior — or N/A	2026-06-09 23:16:23 -07:00
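The failure described in this commit can be reproduced in isolation. The sketch below is not the gateway code: `enrich_buggy`, `enrich_fixed`, `handle`, the `[voice]` prefix format, and the placeholder string are hypothetical stand-ins that only mirror the return shapes the message describes.

```python
import asyncio

PLACEHOLDER = "(empty content)"  # hypothetical stand-in for Discord's placeholder caption


async def enrich_buggy(caption: str, transcripts: list):
    prefix = "\n".join(f'[voice] "{t}"' for t in transcripts)
    if caption == PLACEHOLDER:
        return prefix  # bug: bare string on this one branch
    return f"{prefix}\n\n{caption}", transcripts


async def enrich_fixed(caption: str, transcripts: list):
    prefix = "\n".join(f'[voice] "{t}"' for t in transcripts)
    if caption == PLACEHOLDER:
        return prefix, transcripts  # fix: same tuple shape as every other path
    return f"{prefix}\n\n{caption}", transcripts


async def handle(enrich):
    # Same unpacking shape as the call sites quoted in the commit message.
    text, transcripts = await enrich(PLACEHOLDER, ["hello there"])
    return text, transcripts
```

With `enrich_buggy`, the two-name unpacking iterates over the characters of the returned string and raises `ValueError`; with `enrich_fixed`, the placeholder is dropped and the transcript list still reaches the caller.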
Teknium	96af61b6ef	feat(memory,skills): approve/deny gate for memory + skill writes (#38199) Adds memory.write_mode and skills.write_mode (on|off|approve), applied to both foreground turns and the background self-improvement review fork, the source of the unprompted 'wrong assumption' saves users reported. - on (default): write freely, unchanged behaviour - off: never write; the tool returns a clean disabled result - approve: don't commit. Memory foreground writes prompt inline (small, reviewable in a chat bubble); background memory writes and ALL skill writes stage to a pending store instead (a SKILL.md is too large to review inline, and a daemon thread can't block on a prompt). Review staged writes from CLI or any messaging platform: /memory pending|approve|reject|mode /skills pending|approve|reject|diff|mode Skill review respects the size asymmetry: inline you see a one-line gist; the full unified diff stays out-of-band (/skills diff, dashboard, or the staged JSON file). New: tools/write_approval.py (gate + pending store), hermes_cli/write_approval_commands.py (shared CLI+gateway handlers). Gates wired at the single entry points memory_tool() and skill_manage(), using the existing write-origin ContextVar to distinguish foreground from background_review.	2026-06-09 21:51:43 -07:00
Ben Barclay	5cf6e28a2f	fix(gateway): auto-start after container restart via planned-stop marker (#42675 ) (#43236 ) * fix(gateway): auto-start after container restart via planned-stop marker On Docker (s6-overlay), the gateway runs as a dynamically-registered s6 service. When the container stops/restarts/upgrades, s6 sends the gateway a plain SIGTERM. The shutdown path (_stop_impl) ended with an unconditional _update_runtime_status("stopped"), persisting gateway_state=stopped to the volume. container_boot.py reads that on the next boot and only auto-starts gateways whose last state was "running" (_AUTOSTART_STATES) — so after a routine `docker compose up --force-recreate` the gateway stays down and messaging channels silently go dark, with no error surfaced (issue #42675). The codebase already distinguishes intentional stops from unexpected signals via the planned-stop marker (write_planned_stop_marker / consume_planned_stop_marker_for_self): `hermes gateway stop`, systemd/launchd ExecStop, and Ctrl+C write a marker before signalling, so the handler classifies them as planned. An unmarked SIGTERM (container/s6 restart, OOM, bare kill) is signal-initiated. This wires that existing classification through to the state persist, rather than adding unreliable signal-source inference: - run.py: GatewayRunner._signal_initiated_shutdown, set in shutdown_signal_handler's unmarked-signal branch. In _stop_impl, a signal-initiated (non-restart) teardown now persists "running" instead of "stopped" — preserving the operator's run-intent and overwriting the mid-shutdown "draining" marker so _AUTOSTART_STATES matches on reboot. Operator stops and restarts persist "stopped" as before. - service_manager.py: S6ServiceManager.stop() now writes the planned-stop marker for the supervised PID (read from s6-svstat) before `s6-svc -d`, so an in-container `hermes gateway stop` is correctly classified as intentional (parity with the systemd/launchd/host stop paths, which already mark). 
Best-effort: a marker-write failure falls back to the safe signal-initiated path. Tests: shutdown persist-decision table (signal→running, operator→stopped, restart→stopped), s6 stop marker write + svstat PID parse + failure tolerance. The signal→running and s6-marker tests fail without the respective source change. Verified end-to-end against a container built from this branch: an unmarked SIGTERM to the live gateway leaves gateway_state=running (shutdown-context log confirms signal path); existing real container-restart suite still green. * docs(docker): clarify gateway autostart distinguishes operator-stop from container-kill The per-profile-supervision section described the autostart-across-restart contract as "running gateways come back, stopped stay stopped" without spelling out what records 'stopped'. That contract was the source of #42675 confusion: users expected a restart to bring the gateway back and it didn't. With the write-side fix, only an explicit `hermes gateway stop` records 'stopped'; container/s6 restart SIGTERMs (incl. image upgrades and unexpected exits) leave the state 'running' so the gateway auto-starts. Make that distinction explicit in both the multi-profile and per-profile-supervision sections. * test(docker): real-restart autostart E2E for #42675 Adds test_live_gateway_autostarts_after_real_restart_without_manual_state_stamp: a live s6-supervised gateway is killed by an actual `docker restart` SIGTERM (no manual gateway_state stamp, no planned-stop marker) and must auto-start on the next boot. Exercises the WRITE side of the fix that the existing stamp-based tests bypass. Verified to FAIL against an origin/main image (reconciler logs prior_state=stopped action=registered — the #42675 bug) and PASS against the fixed image (prior_state=running action=started).	2026-06-10 14:01:34 +10:00
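The persist-decision table from this commit (signal→running, operator→stopped, restart→stopped) can be written as a small pure function. This is a sketch under the assumption that the decision depends only on the two flags named here; `state_to_persist`, `should_autostart`, and `AUTOSTART_STATES` are illustrative names, not the real `_stop_impl` or `container_boot.py` code.

```python
AUTOSTART_STATES = {"running"}  # plays the role the message gives _AUTOSTART_STATES


def state_to_persist(signal_initiated: bool, restarting: bool) -> str:
    """State written at shutdown, per the decision table in the commit message."""
    if signal_initiated and not restarting:
        # Unmarked SIGTERM (container/s6 restart, OOM, bare kill):
        # keep the operator's run-intent so the next boot auto-starts.
        return "running"
    # Operator stops and restarts persist "stopped" as before.
    return "stopped"


def should_autostart(prior_state: str) -> bool:
    """Boot-side check: only gateways last recorded as running come back."""
    return prior_state in AUTOSTART_STATES
```

Keeping the classification on the write side means the boot-side check stays unchanged; only what gets recorded at shutdown differs.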
Teknium	8d99b5bc4f	fix(gateway): cap terminal code-block preview in non-verbose mode (#42729) The markdown code-block change rendered args['command'] in full in both verbose AND non-verbose (all/new) modes, so a long or multi-line terminal command bypassed the tool_preview_length cap (default 40) and rendered as a huge block. Non-verbose now collapses to a single line capped at the preview length while keeping the fence; verbose keeps the full command.	2026-06-09 02:28:47 -07:00
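A rough sketch of the behaviour this commit describes, assuming whitespace collapsing and a plain slice are acceptable ways to produce the single capped line. The default of 40 is the `tool_preview_length` default quoted above; `render_terminal_progress` is a hypothetical name and the real renderer may truncate differently.

```python
FENCE = "`" * 3  # bare fence with no language tag


def render_terminal_progress(command: str, verbose: bool,
                             preview_length: int = 40) -> str:
    """Return the fenced progress block for a terminal tool call."""
    if not verbose:
        # Collapse newlines and runs of whitespace, then cap the length.
        command = " ".join(command.split())[:preview_length]
    return f"{FENCE}\n{command}\n{FENCE}"
```

Verbose mode passes the command through untouched, so multi-line scripts still render in full there.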
Teknium	9351cbafab	fix(gateway): auto-deliver image_generate output as native media (#42616 ) image_generate returns its artifact as JSON ({"image": "/abs/path.png"}) with no MEDIA: tag, so the gateway auto-append path (which only recognized text_to_speech MEDIA: tags) never delivered it — image delivery silently depended on the model restating the path in its reply. Add image_generate to the producer allowlist and extract the local path from its JSON result (host_image > image > agent_visible_image), reusing the existing extension-anchored matcher and history-dedupe so remote URLs, unknown extensions, failures, and already-sent paths are rejected. Closes the remaining unfixed path from #19105.	2026-06-08 22:51:03 -07:00
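The extraction order named in this commit (host_image > image > agent_visible_image) might look roughly like the sketch below. The extension allowlist, the `"://"` remote-URL check, and the `extract_image_path` name are assumptions made for illustration; the gateway reuses its own extension-anchored matcher and history dedupe, and failure handling is left out here.

```python
import json
import os
from typing import Optional, Set

PATH_KEYS = ("host_image", "image", "agent_visible_image")  # priority from the commit
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}     # assumed allowlist


def extract_image_path(result: str, already_sent: Set[str]) -> Optional[str]:
    """Pick the local image path to deliver from a tool result, or None."""
    try:
        data = json.loads(result)
    except (TypeError, ValueError):
        return None
    if not isinstance(data, dict):
        return None
    for key in PATH_KEYS:
        value = data.get(key)
        if isinstance(value, str) and value:
            break
    else:
        return None  # none of the known keys carried a path
    if "://" in value:
        return None  # remote URL, not a local artifact
    if os.path.splitext(value)[1].lower() not in IMAGE_EXTS:
        return None  # unknown extension
    if value in already_sent:
        return None  # already delivered earlier in this conversation
    return value
```

Returning `None` for every rejected case lets the caller fall through to its existing behaviour without special handling.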
Teknium	3705625b74	feat(gateway): render terminal commands as bare fenced code blocks in chat (#42576 ) Terminal tool progress on markdown-capable gateways (Telegram, Slack, Discord, WhatsApp, Matrix, Weixin, Feishu) renders the full command in a fenced code block again, in all/new AND verbose modes — gated on the adapter's supports_code_blocks capability. Plain-text platforms keep the short truncated preview. No language tag is emitted: Slack mrkdwn renders a '```bash' fence with 'bash' as a literal first code line, so a bare '```' fence is used, which renders correctly on every platform that supports blocks. This restores the #41215 feature (removed in #41950 due to the command showing in group chats) as the default. For a personal assistant the command display is desired; the group-chat concern is a preference, not a vulnerability.	2026-06-08 21:19:05 -07:00
ruangraung	6d2732e786	fix(gateway): apply MarkdownV2 formatting on progress message edits When a platform adapter sets REQUIRES_EDIT_FINALIZE=True (e.g. TelegramAdapter), tool progress edits now pass finalize=True so format_message() is applied before sending to the platform. Previously, the initial send() formatted the message correctly via MarkdownV2, but subsequent edit_message() calls skipped formatting (finalize=False), causing raw markdown (e.g. triple backticks for bash code blocks) to render as plain text on Telegram. Refs: #41955, #41732	2026-06-08 15:53:16 -07:00
GodsBoy	421226e404	fix(gateway): stop terminal progress from posting the full command to messaging chats #41215 rendered a terminal tool call as a native ```bash fenced block on markdown platforms (Telegram, WhatsApp, Slack, and others), showing the full command with no truncation, in both all/new and verbose modes. That posted complete shell commands (heredocs, internal paths, destructive commands) into the chat before the final answer, visible to everyone in it. This restores the prior behavior: terminal progress shows the short, truncated preview line that every other tool already uses, capped at tool_preview_length. The supports_code_blocks capability flag is left in place for future use. CLI/TUI rendering is a separate path and was unaffected. Adds a regression test asserting terminal progress renders as a truncated preview, not a fenced bash block, even on a markdown-capable gateway. Fixes #41955	2026-06-08 15:53:00 -07:00
Robin Fernandes	639c1e3636	feat(sessions): add optional max session cap	2026-06-08 15:12:12 -07:00
liuhao1024	8e4c447e5f	fix(gateway): prevent duplicate user messages in state.db When the agent has its own SessionDB reference (_session_db is not None), _flush_messages_to_session_db() persists user messages to SQLite during the agent run. Two gateway fallback paths also wrote the same user message without skip_db=True, creating duplicate entries in state.db: 1. agent_failed_early path (transient 429/timeout failures) 2. not-new-messages path (history_offset >= len(messages) edge case) Move agent_persisted flag definition to before the if/elif/else block so all paths can use it, and pass skip_db=agent_persisted to every fallback append_to_transcript() call. Fixes #42039	2026-06-08 11:29:53 -07:00
teknium1	a706a349b5	refactor(gateway): extract authorization cluster into GatewayAuthorizationMixin (god-file Phase 3) Lift the 4 inbound-message authorization methods out of GatewayRunner into gateway/authz_mixin.py:GatewayAuthorizationMixin. Behavior-neutral; gateway/run.py 16200 -> 15812 LOC. Methods moved (~389 LOC): _is_user_authorized, _get_unauthorized_dm_behavior, _adapter_dm_policy, _adapter_enforces_own_access_policy. The two adapter-policy helpers are private to _is_user_authorized, so the cluster is fully self-contained (zero outside-cluster self.method calls after the lift). All self.* calls resolve unchanged via the MRO (GatewayRunner(GatewayAuthorizationMixin, ...)). Import split: 6 neutral deps (os, Optional, Platform, SessionSource, the two whatsapp_identity helpers) at the mixin module top; the module-level logger is imported lazily inside _is_user_authorized (from gateway.run import logger) so the mixin never imports gateway.run at module scope -> no cycle. The lazy import preserves the exact logger name (gateway.run) so log records are unchanged.	2026-06-08 09:42:02 -07:00
Teknium	e9c1e757fe	fix(gateway): release evicted agent clients to stop RSS leak (#29298) (#41974) _evict_cached_agent (the chokepoint for /new, /model, /undo, session resets; 17 call sites) only popped the cache entry, dropping the AIAgent reference without releasing its httpx client pool. AIAgent holds reference cycles (callbacks, tool state) so CPython refcounting does not free the client promptly; under steady gateway traffic the held sockets + buffers accumulate and RSS climbs (the leak class behind …). Now the chokepoint pops AND schedules a soft release_clients() on a daemon thread (mirrors the cap-enforcer / idle-sweeper). Soft release frees the client pool + per-turn child subagents but preserves the session's terminal sandbox / browser / bg processes for resumption. Mid-turn agents are skipped so a running request is never torn down. Also fixes the no-lock branch which previously never popped at all.	2026-06-08 06:44:51 -07:00
Michael Steuer	3d029a53ec	fix(gateway): close residual memory-leak sites under heavy scheduled workload Long-lived gateways under heavy cron/build workloads grow steadily (~18 MB/hr post-phantom-dispatch-fix) and eventually need a restart-or-OOM. Four retention sites, all confirmed live on current main: 1. _evict_cached_agent() (/model, /reasoning, codex-runtime, /undo, etc.) popped the cache entry without releasing the agent's OpenAI client, httpx transport, SSL context, or conversation history. Only /new cleaned up first. Now releases clients on a daemon thread, matching _enforce_agent_cache_cap. 2. _release_evicted_agent_soft() now clears _session_messages after release_clients() — tool outputs (file reads, terminal output, search results) can be tens of MB per 100+-tool-call session; the list is rebuilt from persisted session JSON on resume, so dropping it on soft eviction is safe. 3. The session-expiry watcher (permanent finalization) now drops the session's per-session control dicts (_session_model_overrides, _session_reasoning_overrides, _pending_approvals, _update_prompt_pending, _pending_model_notes). These leaked one entry per session per gateway lifetime. NOTE: this is the session-finalize path, NOT idle agent-cache eviction — an idle-evicted session is still alive and rebuilds its agent from these overrides, so pruning them there would silently reset a user's /model choice. 4. _tool_defs_cache is now bounded (_TOOL_DEFS_CACHE_MAX=8) with oldest-first eviction instead of growing unboundedly across the distinct toolset/config fingerprints a gateway sees over its lifetime. Salvaged from #25318 by Michael Steuer (@mssteuer); fix 3 redirected from the idle-sweep to the session-finalize lifecycle, magic number 8 lifted to a named constant, test ported. Fixes #19251 Co-authored-by: Michael Steuer <michael@make.software>	2026-06-08 06:32:42 -07:00
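Item 4 above (a cache bounded at `_TOOL_DEFS_CACHE_MAX=8` with oldest-first eviction) can be illustrated with an `OrderedDict`. This is a sketch of the technique only; the `BoundedCache` class and its methods are hypothetical, and the real `_tool_defs_cache` may be structured differently.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional

_TOOL_DEFS_CACHE_MAX = 8  # named constant from the commit message


class BoundedCache:
    """Insertion-ordered cache that evicts its oldest entry when full."""

    def __init__(self, max_entries: int = _TOOL_DEFS_CACHE_MAX) -> None:
        self._max = max_entries
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        return self._data.get(key)

    def put(self, key: Hashable, value: Any) -> None:
        # Re-assigning an existing key keeps its original position,
        # so eviction order stays oldest-inserted-first (not LRU).
        self._data[key] = value
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # drop the oldest entry

    def __len__(self) -> int:
        return len(self._data)
```

Oldest-first eviction is simpler than LRU and is enough when the goal is only to stop unbounded growth across distinct fingerprints.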
Kristian Vastveit	d55304c39f	fix(gateway): transcribe voice messages during active agent runs Salvaged from #6600 (@kristianvast) — re-scoped to the voice half only and rebased onto current main. The cascading-interrupt hang half of the original PR landed independently in `dd0d1222a`, so this carries ONLY Problem 1. When a voice/audio message arrives while the agent is busy on the same session, it hit the interrupt path with empty text because STT only ran after the running-agent guard — the voice was effectively lost. Now we transcribe audio BEFORE signaling the agent (and on the fresh-message path), echo the raw transcript back to the user (🎙️), and _enrich_message_with_transcription returns (text, transcripts) so callers can echo. A new _dequeue_pending_with_transcription drives the post-agent drain the same way. Reapplied onto _prepare_inbound_message_text (inbound enrichment was extracted from the inline dispatch block since the original PR). Co-authored-by: Kristian Vastveit <kristian@agrointel.no>	2026-06-08 15:16:20 +05:30

1042 commits