hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-27 17:58:07 +00:00

Author	SHA1	Message	Date
Brooklyn Nicholson	4dd9732a94	feat(desktop): hoisted todo widget, JSON tool summaries, history grouping & timer fixes - Hoist todo to first-class widget (shadcn checkboxes, brand colors, no tool-accordion). Header derives label from active task; non-active rows fade. - Replace raw JSON dumps with structured key/value summaries via formatToolResultSummary; nested error extraction for clearer failures. - Fix loaded-session grouping: stitch interleaved assistant/tool iterations into one bubble instead of orphaned synthetic messages. - Stable tool/thinking timers via keyed registry so unmount/scroll doesn't reset elapsed counts; gate "running" on real live thread state. - Reorganize chat-only assistant-ui components under components/chat/.	2026-05-11 16:34:25 -04:00
emozilla	4b3839a8ee	fix(cli): seed bundled skills on dashboard + gateway entrypoints `sync_skills(quiet=True)` was only being called from inside `cmd_chat`, which meant `hermes dashboard` (the desktop GUI's backend) and `hermes gateway` (Telegram/Discord/Slack/etc daemons) never seeded the bundled skill library into ~/.hermes/skills/. This surfaced as "No skills found" in the desktop GUI's skills panel on fresh installs, despite the agent having access to the full bundled library when invoked via `hermes chat`. scripts/install.ps1 worked around it by running skills_sync.py as part of Copy-ConfigTemplates, but that's not part of the desktop installer's bootstrap chain. Fix - Extract the skills-sync block from cmd_chat into a module-level `_sync_bundled_skills_quietly()` helper. - Call the helper from cmd_chat (preserving existing behavior), cmd_dashboard (after the --status/--stop early-return paths and fastapi import check, so we don't run skills_sync on management commands or when deps aren't installed), and cmd_gateway. Why these three entrypoints - cmd_chat: the user's primary CLI entrypoint - cmd_dashboard: the desktop GUI's backend; this is what `hermes dashboard --tui` invokes when the desktop bootstrapper spawns Hermes - cmd_gateway: long-running daemons where the user expects the agent to have full skill access Other entrypoints (cmd_config, cmd_doctor, cmd_login, cmd_status, etc.) are management commands that don't need skill discovery and were never running skills_sync in the first place — leaving them alone. Idempotence - tools/skills_sync.py is manifest-based: skipped skills cost milliseconds. Calling it from multiple entrypoints adds no real cost, and users running `hermes chat` then `hermes dashboard` get two fast no-ops on the second call. Failure handling - Helper wraps skills_sync in try/except. Skills are an enhancement, not a hard dependency — Hermes runs fine with an empty skills/ dir. 
Files - hermes_cli/main.py: + new helper `_sync_bundled_skills_quietly()` at module level + cmd_chat: replace inline block with helper call + cmd_dashboard: add helper call after fastapi import succeeds + cmd_gateway: add helper call before delegating to gateway_command	2026-05-11 15:53:50 -04:00
Brooklyn Nicholson	50a9d6333f	Merge branch 'bb/gui' of github.com:NousResearch/hermes-agent into bb/gui	2026-05-11 15:28:51 -04:00
Brooklyn Nicholson	8d465a5732	feat: theme changes, composer tweaks, in app update ux, finesse	2026-05-11 15:28:45 -04:00
emozilla	c8c8c53a0c	feat(desktop): NSIS prereq detection page + auto-install via winget The packaged Windows installer now detects Python 3.11+ and Git for Windows at install time and offers to install missing prereqs via winget. Mirrors the prereq logic scripts/install.ps1 already runs for CLI installs, so desktop installer users get the same out-of-the-box experience as install.ps1 users. Why - Hermes' terminal tool calls bash.exe directly (tools/environments/ local.py); on Windows that's Git Bash from Git for Windows. Without it, the agent fails on the first terminal() call. - Hermes' Python runtime needs 3.11+. Without it, the desktop bootstrapper errors out at venv creation. - Both gaps surfaced on a fresh Windows 11 VM smoke test: VM had Python pre-installed but no Git, so the agent's first terminal call failed with "Git Bash isn't installed." - install.ps1 has had Install-Git + Install-Uv functions for ages. The desktop installer was the asymmetric outlier. How — NSIS prereq page - New file: apps/desktop/installer/prereq-check.nsh (plugged into electron-builder via build.nsis.include) - Real Wizard page using nsDialogs, inserted via customPageAfterChangeDir hook (between the Directory page and InstFiles). - Group boxes for Python and Git, each showing detection status. - Pre-checked install checkboxes when winget is available. - Auto-skips silently if both prereqs are already installed. - Falls back to manual download URLs when winget itself is missing. - Detection: - Python: probes `py -3.11`/`-3.12`/`-3.13`/`-3.14` via the Python launcher. Microsoft Store "Python stub" (no py.exe) is correctly classified as not-installed. - Git: `where git`. - winget: `where winget` (Win10 1809+ / Win11 with App Installer). - Install execution (in customInstall macro): - Python: nsExec::ExecToLog with `--scope user --silent`. Per-user install, no UAC prompt, output streams to install log. - Git: ExecShellWait via Windows ShellExecute. 
Critical because Git always installs per-machine and triggers UAC; ShellExecute preserves the foreground focus chain across non-elevated → elevated process spawns, so UAC actually comes to the foreground. nsExec::ExecToLog breaks the chain because winget runs hidden. - Both pass `--disable-interactivity --accept-package-agreements --accept-source-agreements` to suppress winget's own dialogs. - Verification: probes Git's standard install locations via FileExists rather than `where git`. NSIS's process inherits PATH at startup, so a freshly-installed Git won't be visible to `where` until restart. - Silent installs (/S) skip the prompts; managed deploys handle prereqs out-of-band via Group Policy / Intune. How — Electron-side safety net - New findGitBash() in main.cjs, parallel to findSystemPython(). Probes the same locations as tools/environments/local.py:_find_bash() so a positive result here means the agent's terminal tool will work. - ensureRuntime now throws a clear, actionable error on Windows when Git Bash isn't found, matching the existing "Python 3.11+ is required" error path. - Catches users the NSIS page doesn't: .msi installer users (NSIS prereq page doesn't run for MSI), `npm run dev` users, manual installers, anyone who unchecked the install boxes on the NSIS prereq page. - All gated on `IS_WINDOWS`; macOS / Linux unaffected. NSIS build issue (resolved) - electron-builder defaults to `-WX` (warnings as errors). NSIS optimizer emits "warning 6010: function not referenced" for our page functions because Page custom directives don't count as references in its static-analysis pass. The functions ARE called at runtime when NSIS invokes the page; the optimizer just can't see it statically. - Set `build.nsis.warningsAsErrors=false` in package.json so this spurious warning doesn't fail the build. (Documented option from electron-builder's nsisOptions.) 
Out of scope (filed for future work) - MSI prereq detection: Windows Installer custom actions are a different mechanism. Enterprise deploys typically handle prereqs via GP/Intune. - Bundle PortableGit + python-build-standalone in extraResources for zero-network installs. ~80MB increase. - Mac / Linux GUI prereq flows (different installer formats; Xcode CLT covers most macOS prereqs already; Linux is per-distro hard). Files - apps/desktop/installer/prereq-check.nsh (new, ~290 lines NSIS) - apps/desktop/package.json (build.nsis.include + warningsAsErrors) - apps/desktop/electron/main.cjs (findGitBash + preflight) - apps/desktop/README.md (Runtime prerequisites section) Cross-platform impact - macOS / Linux builds (dist:mac, dist:mac:dmg, dist:mac:zip): nsis config is ignored entirely; .nsh is dormant. - npm run dev: .nsh dormant; main.cjs preflight gated on IS_WINDOWS. - scripts/install.ps1, scripts/install.sh: no reference to any new files; CLI install paths untouched. - Hermes CLI / dashboard / gateway: no reference; runtime untouched. - All checks: node --check on main.cjs and test-desktop.mjs pass; npm run test:desktop:platforms 4/4 passing; node --test green. Tested - npm run dist:win produces signed .exe and .msi without errors. - Fresh Win11 VM (Python pre-installed, no Git): prereq page renders, Python check shows detected, Git checkbox pre-checked. Click Next → Git installs via winget with UAC prompt in foreground. - After install completes, Hermes launches and the agent's terminal tool can run bash commands. Verified Git Bash is detected at `C:\Program Files\Git\bin\bash.exe` by ensureRuntime's preflight.	2026-05-11 11:13:49 -04:00
Brooklyn Nicholson	bff052d61f	feat(desktop): theme polish, prose chat typography, composer chrome - DS tokens/midground, Backdrop, scoped scrollbars, typography plugin + prose - Composer liquid/radius utilities, thread font parity, tool/thinking cues - File tree label scale, preview flex, thread retry loading + streaming tests	2026-05-11 10:25:23 -04:00
emozilla	61fb5a48b7	refactor(desktop): align install layout with install.ps1 / install.sh Make the desktop app's runtime layout match what scripts/install.ps1 and scripts/install.sh produce, so a desktop-only user and a CLI-only user end up with the same files in the same places and can share one install. Layout - ACTIVE_HERMES_ROOT = HERMES_HOME/hermes-agent (was: process.resourcesPath/hermes-agent, read-only) - VENV_ROOT = HERMES_HOME/hermes-agent/venv (was: userData/hermes-runtime) - desktop.log = HERMES_HOME/logs/desktop.log (was: userData/desktop.log) - HERMES_HOME default: %LOCALAPPDATA%\hermes on Windows, ~/.hermes elsewhere The packaged .app/.exe still ships a read-only payload at process.resourcesPath/hermes-agent (FACTORY_HERMES_ROOT). On first launch or after an installer-driven upgrade we sync factory -> active, then provision the venv and run pip install -e . against the active root. Key behaviors - Pin HERMES_HOME in the spawned Python's env so get_hermes_home() resolves to the same path resolveHermesHome() picked. Without this, Python falls back to ~/.hermes on every platform - fine on mac/linux, a split-state bug on Windows where our default is %LOCALAPPDATA%\hermes. - Detect developer installs by .git presence at ACTIVE; never overwrite a user's checkout via factory sync. - Marker at ACTIVE/.hermes-desktop-runtime.json (schema v4) tracks pyproject hash + factory version + runtime schema version. depsFresh fast-paths when nothing changed. - Dev (npm run dev) prefers SOURCE_REPO_ROOT over ACTIVE so devs run their local edits, not whatever's under HERMES_HOME. - Better error messages distinguish "no payload" from "no Python". - Preserve a legacy ~/.hermes on Windows when no %LOCALAPPDATA%\hermes exists, so users with prior pip/manual installs aren't orphaned. pyproject.toml - Promote fastapi, uvicorn[standard], ptyprocess (non-Windows), and pywinpty (Windows) to main dependencies. 
The dashboard backend (hermes dashboard) needs them at runtime; the previous lazy-import fallback was a footgun for fresh installs. - Empty the [pty] optional-extra; kept as a no-op back-compat alias for any existing pip install hermes-agent[pty] invocations. Drops the hardcoded BUNDLED_RUNTIME_REQUIREMENTS list in main.cjs - the desktop now installs whatever pyproject.toml says, single source of truth. Files - apps/desktop/electron/main.cjs: runtime layout, HERMES_HOME pin, factory->active sync, marker v4 - apps/desktop/scripts/test-desktop.mjs: track new venv location - apps/desktop/README.md: new Setup, Runtime Bootstrap, and Debugging sections - pyproject.toml: fastapi/uvicorn/pty backends in main dependencies; [pty] extra emptied Tested locally on Windows: npm run dev boots cleanly, sessions land at the new location, type-check + lint + test:desktop:platforms all pass. Verified end-to-end on a fresh Win11 VM via dist:win installer. Known gaps (filed as follow-ups, not in this PR): - Skills not seeded on packaged installs (sync_skills only runs in cmd_chat, not cmd_dashboard). Need to move to shared pre-dispatch. - Git Bash not bundled or detected; agent's terminal tool errors out with a useful message but desktop bootstrapper should pre-flight it. - install.ps1 / install.sh should be decomposed into composable phase libraries so the desktop bootstrapper can reuse them as a single source of truth across all install surfaces.	2026-05-11 00:43:46 -04:00
Brooklyn Nicholson	cb7f1d7e0e	Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/gui	2026-05-10 07:05:16 -04:00
kshitijk4poor	44cdf555a8	fix(codex-spark): defensive 128k entry in DEFAULT_CONTEXT_LENGTHS + clarify validation test docstring Two follow-ups from self-review: 1. Add gpt-5.3-codex-spark to DEFAULT_CONTEXT_LENGTHS at 128k. The primary resolution path for Spark goes through provider='openai-codex' → _CODEX_OAUTH_CONTEXT_FALLBACK (already correct). But if any future code path resolves Spark's context with a different provider (custom proxy, generic fallthrough), the longest-substring-first lookup in step 8 would match 'gpt-5' and report 400k, which is wrong by ~3x. Adding the explicit override is a cheap defensive correctness fix matching how gpt-5.4-mini and gpt-5.4-nano already shadow the generic gpt-5 entry. 2. Update test_openai_codex_model_validation_fallback.py docstring. The bug it was originally written for (gpt-5.3-codex-spark missing from listing) is now resolved by this PR's catalog restoration. The test still validly exercises the soft-accept code path for any future entitlement-gated Codex slug that ships before Hermes catalogs it, but the framing was stale — clarified.	2026-05-09 23:17:25 -07:00
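The shadowing behaviour this commit relies on can be sketched as follows. This is a minimal illustration, not the Hermes implementation: the table contents, the numeric values chosen for "400k" and "128k", and the function name `lookup_context_length` are assumptions made for the example.

```python
# Hypothetical table; the real DEFAULT_CONTEXT_LENGTHS in Hermes may differ.
CONTEXT_LENGTHS = {
    "gpt-5": 400_000,                # generic family entry
    "gpt-5.3-codex-spark": 128_000,  # explicit entry that shadows "gpt-5"
}


def lookup_context_length(model: str, table: dict, default: int = 0) -> int:
    """Return the value of the longest key that occurs as a substring of `model`."""
    # Trying longer keys first lets a specific entry win over a generic one.
    for key in sorted(table, key=len, reverse=True):
        if key in model:
            return table[key]
    return default
```

Without the explicit Spark entry, the same lookup would fall through to the generic `gpt-5` key, which is the misreport the commit guards against.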
kshitijk4poor	826e7171e9	test(codex-spark): add live-API regression and make picker test deterministic Two follow-ups from self-review: 1. Add unit test for _fetch_models_from_api covering the live HTTP path. The salvaged PR #19530 dropped the supported_in_api:false filter in both _fetch_models_from_api and _read_cache_models, but only the cache path had a regression test. This adds the symmetric live-fetch test (mocked httpx) so a future drive-by change to the HTTP path can't silently re-introduce the filter. 2. Pin test_codex_picker_uses_live_codex_catalog to the cache fallback. The test wrote a fake JWT and a CODEX_HOME cache, but provider_model_ids ('openai-codex') still issued a real 10s HTTP probe to chatgpt.com/backend-api/codex/models before falling back to the cache. That made the test slow and non-deterministic in restricted/CI networks. Patch _fetch_models_from_api to return [] so we go straight to the cache path the test actually means to exercise.	2026-05-09 23:17:25 -07:00
kshitij	9ee9a4297d	docs(codex-spark): document ChatGPT Pro entitlement gating PR #12994 stripped gpt-5.3-codex-spark on the assumption that it was unsupported. It's actually research-preview, ChatGPT-Pro-only, exposed via the Codex OAuth backend at chatgpt.com/backend-api/codex/models — not via the public OpenAI API. Add explanatory comments in: - DEFAULT_CODEX_MODELS / _FORWARD_COMPAT_TEMPLATE_MODELS (codex_models.py) - _CODEX_OAUTH_CONTEXT_FALLBACK (model_metadata.py) - list_authenticated_providers' live-discovery branch (model_switch.py) so future maintainers don't strip the entry again. Also documents the intentional asymmetry that Spark stays out of the "openai" provider catalog (it isn't on the public API) and why the supported_in_api filter is not applied for the openai-codex route.	2026-05-09 23:17:25 -07:00
kshitij	6b5e0119b3	chore: add codex-spark salvage contributors to AUTHOR_MAP Maps olegwn@gmail.com → nederev (PR #18286) and vesper@askclaw.dev → askclaw-vesper (PR #19530) so the contributor attribution check passes when their commits land via this salvage.	2026-05-09 23:17:25 -07:00
Vesper 🌙	9457644390	fix: surface Codex CLI-only models	2026-05-09 23:17:25 -07:00
olegdater	c6dc295a35	fix(model-metadata): set codex-spark fallback context to 128k	2026-05-09 23:17:25 -07:00
olegdater	2a6f3deb50	fix(model-metadata): restore gpt-5.3-codex-spark fallback context	2026-05-09 23:17:25 -07:00
olegdater	dcc8de83a9	feat(codex): add gpt-5.3-codex-spark model	2026-05-09 23:17:25 -07:00
Teknium	e5af1dd633	fix(review): tell background reviewer not to capture transient env failures as skills (#23004 ) Closes #6051. Reported failure mode: agent migrated to WSL2, browser launch failed because Playwright wasn't installed yet. Background reviewer captured the failure as a durable skill (`browser-tool-launch-issue`) and the agent kept refusing the browser tool for weeks after Playwright was installed and verified working. Negative claims also propagated into unrelated skills ("browser tools do not work", "cannot use Y from execute_code"). Root cause: `_SKILL_REVIEW_PROMPT` and `_COMBINED_REVIEW_PROMPT` both lean hard on "be active, save things, a pass that does nothing is a missed learning opportunity." Neither distinguished durable knowledge from transient environment state. The reviewer was doing what it was told. Fix at the write site — both prompts now carry a "Do NOT capture" section calling out: • Environment-dependent failures (missing binaries, fresh-install errors, post-migration path mismatches, 'command not found', unconfigured credentials, uninstalled packages) • Negative claims about tools or features ("X does not work") that harden into self-cited refusals • Session-specific transient errors that resolved before the conversation ended • One-off task narratives ("summarize today's market", "analyze this PR") — also addresses the #12812 / #4538 family Plus a positive-reframing line: when a tool fails because of setup state, capture the FIX (install command, config step, env var) under an existing setup/troubleshooting skill — never "this tool doesn't work" as a standalone constraint. Targeted tests: 24/24 passing in tests/run_agent/test_review_prompt_class_first.py (2 new + all existing review-prompt assertions). Substring-based checks so future prompt edits don't false-fail.	2026-05-09 22:51:25 -07:00
Teknium	126cbffb8a	feat(stream-retry): add upstream + timing diagnostics to drop log (#23005 ) The previous PR (#22993) gave us a structured WARNING per stream drop but the only diagnostic was 'error_type=APIError error=Network connection lost.' — same nothing the user started with. To actually diagnose why subagents drop streams disproportionately we need to know WHERE the drop happened. Adds three breadcrumbs to the agent.log WARNING: 1. Inner exception chain. openai SDK wraps httpx errors as APIConnectionError / APIError so the catch site only sees the wrapper. _flatten_exception_chain walks __cause__/__context__ up to 4 levels deep and renders 'Outer(msg) <- Inner(msg)' so we can tell ConnectError vs RemoteProtocolError vs ReadError vs ProxyError without enabling verbose mode. 2. Upstream HTTP headers. Snapshots cf-ray, x-openrouter-provider, x-openrouter-model, x-openrouter-id, x-request-id, server, via, etc. from stream.response immediately after open (so they survive even when the stream dies before the first chunk). These answer 'is one CF edge / one downstream provider responsible, or random?' 3. Per-attempt counters. bytes streamed, chunk count, elapsed time on the dying attempt, and time-to-first-byte. Distinguishes 'couldn't connect at all' (0s, 0 bytes) from 'died after 30s mid-stream' (very different root causes — first is auth/routing, second is upstream idle-kill or proxy timeout). Plumbing: - _stream_diag_init / _stream_diag_capture_response live on AIAgent and produce a per-attempt dict held on request_client_holder['diag'] for closure access from the retry block. - _call_chat_completions and _call_anthropic both initialize the diag and increment counters per chunk/event (best-effort, never raises in the streaming hot path). - _log_stream_retry / _emit_stream_drop accept an optional diag and render the new fields. Final-exhaustion log goes through the same helper so it gets the same diagnostic dump. 
- UI status line gains a brief 'after Xs' suffix when timing is available — distinguishes 'connect failed' from 'died mid-stream' at a glance without grepping logs. Sample WARNING after this change: Stream drop mid tool-call on attempt 2/3 — retrying. subagent_id=sa-2-cafef00d depth=1 provider=openrouter base_url=https://openrouter.ai/api/v1 error_type=APIError error=Connection error. chain=APIError(Connection error.) <- RemoteProtocolError(peer closed connection without sending complete message body) http_status=200 bytes=12400 chunks=47 elapsed=12.00s ttfb=0.83s upstream=[cf-ray=8f1a2b3c4d5e6f7g-LAX x-openrouter-provider=Anthropic x-openrouter-id=gen-abc123 server=cloudflare] Tests: 10 covering diag init, header capture (whitelist enforced for PII), exception-chain walking + depth cap, log content with full diag, log content without diag (placeholders), UI elapsed-suffix on/off.	2026-05-09 22:49:35 -07:00
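The exception-chain breadcrumb described in item 1 can be sketched roughly as below. This is an illustrative stand-in for `_flatten_exception_chain`, not its actual source: the function name without the underscore, the preference for `__cause__` over `__context__`, and the reading of "4 levels" as four rendered entries in total are all assumptions.

```python
def flatten_exception_chain(exc: BaseException, max_depth: int = 4) -> str:
    """Render 'Outer(msg) <- Inner(msg)' by following __cause__, then __context__."""
    parts = []
    seen = set()  # guards against cycles in the chain
    current = exc
    while current is not None and len(parts) < max_depth and id(current) not in seen:
        seen.add(id(current))
        parts.append(f"{type(current).__name__}({current})")
        # Prefer the explicit cause ("raise ... from ..."); fall back to implicit context.
        current = current.__cause__ or current.__context__
    return " <- ".join(parts)
```

A wrapper raised with `raise APIError("Connection error.") from inner` would then render with the inner transport error visible after the `<-`, which is the information the WARNING line is meant to expose.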
Teknium	5a70d9b6be	chore: AUTHOR_MAP entry for tymrtn (#21794)	2026-05-09 22:49:29 -07:00
tymrtn	d1fc748def	fix(kanban): /kanban slash command emits argparse garbage instead of help Closes #21794. `/kanban`, `/kanban help`, `/kanban --help`, and `/kanban <sub> -h` all returned broken output to the gateway and interactive CLI. Three underlying bugs in `hermes_cli.kanban.run_slash`: 1. argparse writes help to stdout but `run_slash` only captured stderr at parse time, so `-h` text was silently swallowed and replaced with the `(usage error: 0)` sentinel. 2. The wrapping parser used `prog="/"` and routed via a synthetic "_top → kanban" subparser, producing `usage: / kanban …` (stray space) and `usage: /kanban kanban …` (doubled token) in error text. 3. Bare `/kanban` and `/kanban help` dumped argparse's full ~3KB usage tree, which reads as visual garbage in a chat bubble. Fix: drive the kanban_parser directly (no double-wrap), rewrite prog strings on every leaf subparser, capture stdout AND stderr around parse_args, distinguish SystemExit(0) (help — return captured stdout) from SystemExit(2) (error — return single-line ⚠-prefixed message), and add an explicit chat-friendly short-help block returned for bare invocation and the help aliases (`help`, `--help`, `-h`, `?`). Added 5 regression tests covering bare invocation, every help alias, subcommand help, unknown action, and missing required arg. Affects every chat platform via gateway/run.py::_handle_kanban_command and the interactive CLI via cli.py::_handle_kanban_command. Co-Authored-By: Nagatha (Claude Opus 4.7) <noreply@anthropic.com>	2026-05-09 22:49:29 -07:00
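The capture-and-classify approach in the fix can be sketched with the standard library alone. This is a hedged illustration, not the code in `hermes_cli.kanban.run_slash`: the helper name `run_parser`, the `ok:` return format, and the choice of the last stderr line as the one-line error are assumptions.

```python
import argparse
import contextlib
import io


def run_parser(parser: argparse.ArgumentParser, argv: list) -> str:
    """Parse argv; return help text, a one-line warning, or a summary of the result."""
    out, err = io.StringIO(), io.StringIO()
    try:
        # argparse prints help to stdout and errors to stderr, so capture both.
        with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
            args = parser.parse_args(argv)
    except SystemExit as exc:
        if exc.code in (0, None):
            # Exit code 0 means help was requested: return the captured help text.
            return out.getvalue().strip()
        # Non-zero exit means a usage error: keep only the final error line.
        lines = err.getvalue().strip().splitlines()
        return "⚠ " + (lines[-1] if lines else "usage error")
    return f"ok: {vars(args)}"
```

Constructing the parser with `prog="/kanban"` directly, rather than wrapping it in a second parser, is what keeps the usage line free of the stray space and doubled token the commit describes.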
Teknium	3d2bfc502e	chore(models): refresh OpenRouter + Nous fallback lists (#23001) Reorder Anthropic Opus 4.7/4.6 + Sonnet 4.6 to the top, cluster free models at the bottom of the OpenRouter list, and mirror the same ordering into the Nous portal list (paid models only). - Add inclusionai/ring-2.6-1t:free - Drop minimax-m2.5, minimax-m2.5:free, sonnet-4.5, mimo-v2.5, glm-5v-turbo, glm-5-turbo, trinity-large-preview:free, trinity-large-thinking, qwen3.5-plus-02-15 - Replace qwen3.5-35b-a3b with qwen3.6-35b-a3b - Drop x-ai/grok-4.20-beta from the Nous list	2026-05-09 22:47:38 -07:00
emozilla	767736ff1e	fix(desktop): keep composer contenteditable mounted across stacked toggle The composer rendered {input} inside two different parent fragments depending on `stacked`. When auto-expand flipped `stacked` (e.g. the moment typed text wrapped past two lines), React reconciled the two branches as different positions and unmounted/remounted the contenteditable. The fresh mount started empty, so any in-flight characters — most reliably reproduced by holding a key — were lost. Replace the conditional with a single CSS Grid whose template-areas swap on `stacked`. The three children (menu, input, controls) keep stable identities across the toggle; only their grid placement changes, which the browser handles without React tearing down the editor.	2026-05-10 01:43:52 -04:00
Teknium	e2ce89a8aa	chore: AUTHOR_MAP entry for li0near gmail (#21378)	2026-05-09 22:38:01 -07:00
li0near	6f2d60559e	fix(kanban): drop redundant init_db() in gateway watchers (#21378 ) Both `_kanban_notifier_watcher` and `_kanban_dispatcher_watcher`'s `_tick_once_for_board` called `_kb.connect(board=slug)` immediately followed by `_kb.init_db(board=slug)`. Since `connect()` already runs the schema + idempotent migration on first open per process, the explicit `init_db()` was redundant — and worse, `init_db()` deliberately busts the per-process `_INITIALIZED_PATHS` cache and re-runs the migration on a second connection that races the first. On every cold gateway start against a legacy DB this surfaced as either `sqlite3.OperationalError: duplicate column name: <col>` or intermittent `database is locked` errors logged at the first tick. The duplicate-column case is now tolerated by `_add_column_if_missing` (commit `78698381a`), but the wasted second migration plus the database-is-locked race remain fixable by skipping the redundant call entirely. Drops `_kb.init_db(board=slug)` at both call sites and adds a regression test in `tests/hermes_cli/test_kanban_notify.py` that pins the absence via source inspection plus a runtime spy. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-09 22:38:01 -07:00
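For context, an idempotent column migration of the kind this commit mentions might look like the sketch below. It is not the source of `_add_column_if_missing`: the function name, signature, and return value are hypothetical, and the table and column names are assumed to be trusted identifiers because they are interpolated into SQL.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str, column: str, decl: str) -> bool:
    """Add `column` to `table` unless it already exists; return True if it was added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    try:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    except sqlite3.OperationalError as exc:
        # A racing connection may add the column between the check and the ALTER.
        if "duplicate column name" not in str(exc).lower():
            raise
        return False
    return True
```

Tolerating the duplicate-column error makes a repeated migration harmless, but it does not remove the lock contention from running the migration twice, which is why the commit drops the second call instead.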
Teknium	68e44642c8	fix(stream-retry): collapse two-line drop status, name provider, and let agent.log capture diagnostics (#22993 ) Subagent stream drops were spamming the parent terminal with two lines per blip ('Connection dropped...' + 'Reconnected...') while leaving zero breadcrumb in agent.log to debug them. Two underlying bugs, fixed together: 1. quiet_mode raised the run_agent/tools/etc. loggers to ERROR, which filters records before root-logger file handlers see them. The comment claimed 'File handlers still capture everything' — that was wrong. Removed in both run_agent.py and cli.py; console quietness already comes from hermes_logging not installing a console StreamHandler in non-verbose mode. 2. The stream-retry blocks emitted two _emit_status calls per drop ('⚠️ Connection dropped... Reconnecting...' + '🔄 Reconnected — resuming…') with no provider name, so multi-provider sessions had to dig through agent.log to attribute a drop. Replaced both call sites with a single _emit_stream_drop helper that emits ONE line naming the provider and error class, and always writes a structured WARNING to agent.log with subagent_id, depth, provider, base_url, error_type. Net UX change: 6 lines per triple-subagent drop → 3 lines, each naming the provider. agent.log now has a structured breadcrumb per retry that didn't exist before. Tests: 6 new tests in tests/run_agent/test_stream_drop_logging.py covering the logger-level guard, structured WARNING content, single status line per drop (no Reconnected follow-up), and provider naming.	2026-05-09 22:35:35 -07:00
emozilla	eaab34e57e	interpret compactPreview for non-string values as JSON or an empty string	2026-05-10 01:23:25 -04:00
emozilla	4d14a1479a	hide application menu on non-mac systems	2026-05-10 00:35:35 -04:00
Teknium	3800972dd0	feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955 ) When the active main model has native vision and the provider supports multimodal tool results (Anthropic, OpenAI Chat, Codex Responses, Gemini 3, OpenRouter, Nous), vision_analyze loads the image bytes and returns them to the model as a multimodal tool-result envelope. The model then sees the pixels directly on its next turn instead of receiving a lossy text description from an auxiliary LLM. Falls back to the legacy aux-LLM text path for non-vision models and unverified providers. Mirrors the architecture used in OpenCode, Claude Code, Codex CLI, and Cline. All four converge on the same pattern: tool results carry image content blocks for vision-capable provider/model combinations. Changes - tools/vision_tools.py: _vision_analyze_native fast path + provider capability table (_supports_media_in_tool_results). Schema description updated to reflect new behaviour. - agent/codex_responses_adapter.py: function_call_output.output now accepts the array form for multimodal tool results (was string-only). Preflight validates input_text/input_image parts. - agent/auxiliary_client.py: _RUNTIME_MAIN_PROVIDER/_MODEL globals so tools see the live CLI/gateway override, not the stale config.yaml default. set_runtime_main()/clear_runtime_main() helpers. - run_agent.py: AIAgent.run_conversation calls set_runtime_main at turn start so vision_analyze's fast-path check sees the actual runtime. - tests/conftest.py: clear runtime-main override between tests. Tests - tests/tools/test_vision_native_fast_path.py: provider capability table, envelope shape, fast-path gating (vision-capable model uses fast path; non-vision model falls through to aux). - tests/run_agent/test_codex_multimodal_tool_result.py: list tool content becomes function_call_output.output array; preflight preserves arrays and drops unknown part types. 
Live verified - Opus 4.6 + Sonnet 4.6 on OpenRouter: model calls vision_analyze on a typed filepath, gets pixels back, reads exact text from images that no aux description could capture (font color irony, multi-line fruit-count list, etc.). PR replaces the closed prior efforts (#16506 shipped the inbound user-attached path; this PR closes the gap for tool-discovered images).	2026-05-09 21:06:19 -07:00
Teknium	e62250453b	docs(user-stories): add 18 verified social entries (99 → 117) (#22920 ) Found 18 real Hermes-Agent stories from HN, X, and Reddit not yet captured on the page. All URLs HTTP-verified to return 200 with matching titles. Reddit (15): r/hermesagent (Obsidian-as-memory writeup at 794 upvotes, LLM cheatsheet at 635 upvotes, Kanban game-changer post, OpenRouter #1 ranking, AMA from the Nous team, etc.); r/LocalLLaMA, r/Rag, r/openclaw, r/SideProject, r/LocalLLM threads where users describe their actual setups (Qwen3.5-9b on 16gb VRAM, 5060Ti + Telegram, smart routing tiers). X (3): @vmiss33's 'what I use Hermes for' guide, @HeyYanvi's X-to-NotebookLM podcast workflow, @ExileAI_0's spare-laptop Iris running RenPy + ComfyUI, @brucexu_eth's Hermes Inc. Telegram startup sim from the hackathon, Hype's deep-dive blog. HN (1): 'I'm using Hermes — sandbox it like any agent.' No component changes — all new entries fit the existing schema (real URL, real author, real date).	2026-05-09 20:58:09 -07:00
Clooooode	998676dd0c	chore(test): comment of test case rewrite to english	2026-05-09 19:31:41 -07:00
Clooooode	a4036654f1	fix(kanban): remove blocked kind from unsub	2026-05-09 19:31:41 -07:00
Clooooode	dd49d50389	test(kanban): assert re-block notification is delivered after unblock cycle Adds test_notifier_second_blocked_delivers to cover the case where a task is blocked, unblocked, then blocked again — the second blocked event must still deliver a gateway notification. Currently fails because blocked is treated as a terminal event kind, causing the subscription to be dropped after the first block.	2026-05-09 19:31:41 -07:00
Tranquil-Flow	8954537f95	fix(kanban): request default board explicitly (#21819)	2026-05-09 19:31:32 -07:00
Teknium	eb3db231dc	chore: AUTHOR_MAP entry for eloklam (#22898)	2026-05-09 19:31:14 -07:00
eloklam	d04a0b81ee	docs(skills): clarify kanban fan-out decomposition	2026-05-09 19:31:14 -07:00
emozilla	edc015886b	pin electron version	2026-05-09 22:18:56 -04:00
Teknium	08ec602770	fix(tool-result-storage): persist via stdin to bypass 128 KB exec-arg cap (#22913 ) Linux's MAX_ARG_STRLEN caps any single argv element at 128 KB (32 * PAGE_SIZE). The previous heredoc-in-the-command-string approach in _write_to_sandbox put the entire tool result inside the 'bash -c' arg, so any result over ~128 KB raised OSError [Errno 7] 'Argument list too long' before the heredoc ever ran. The caller logged a warning, but quiet_mode (CLI default) sets tools.* to ERROR — so the warning never reached agent.log either, and the agent saw a 1.5 KB preview tagged 'Full output could not be saved to sandbox'. Hits delegate_task with 3+ subagent outputs routinely now. Switch to passing content via env.execute(stdin_data=...). cmd is now just 'mkdir -p X && cat > Y' (under 1 KB), and the heavyweight payload travels through stdin where there is no argv-element limit. E2E reproduced the user's exact 144,778-char delegate_task envelope: old code OSError'd, new code round-trips cleanly to disk with all three task summaries intact.	2026-05-09 18:44:58 -07:00
Teknium	ded194eb6a	chore(skills): move heavy training skills + outlines to optional-skills (#22912 ) These skills require heavy GPU/CUDA stacks or are niche enough that they shouldn't be active by default. Moved to optional-skills/ where users opt-in via `hermes skills install official/...`. Moved: - mlops/training/axolotl - mlops/training/trl-fine-tuning - mlops/training/unsloth - mlops/inference/outlines Counts: 91 -> 87 built-in, 72 -> 76 optional. Auto-regenerated docs (per-skill pages + catalogs) reflect the move.	2026-05-09 18:44:12 -07:00
Teknium	4375b82cd9	feat(curator): show rename map in user-visible summary (#22910 ) * feat(curator): show rename map (where skills went) in user-visible summary The full data has always been on disk in REPORT.md, but the user-visible curator summary (gateway 💾 line, CLI session-start panel, `hermes curator status`) was counts-only — "consolidated 4 into 2 umbrellas" with no names. Users only discovered renames when something they expected was gone. New `_build_rename_summary()` formats the rename map and appends it to `final_summary`: auto: 1 marked stale; llm: consolidated 2 into 1, pruned 1 archived 3 skill(s): • docx-extraction → document-tools • pdf-extraction → document-tools • old-stale-thing — pruned (stale) full report: hermes curator status Empty on no-op ticks (no archives), so most ticks add zero log noise. Cap of 10 entries keeps agent.log readable when a 50-skill consolidation lands; the full list is always in REPORT.md. `hermes curator status` indents continuation lines so the multi-line summary reads as one logical field. 5 new tests in tests/agent/test_curator_classification.py covering empty / consolidation / pruning / cap / mixed cases. * feat(curator): show recent run summary once on `hermes update` The rename map is now visible from where users actually look — the update flow they explicitly run, instead of just the live gateway log or transient CLI session-start panel. Behavior: - After `hermes update`, if the most recent curator run produced a rename map (multi-line summary) that the user hasn't seen yet, print it once with a 'last run Xh ago' header and a one-time-message footer. - Stamp `last_run_summary_shown_at = last_run_at` after printing so subsequent `hermes update` invocations are silent until a newer curator run lands. - Silent on no-op runs (single-line summary like 'auto: no changes; llm: no change'). Still stamps shown so we don't reconsider on every update. 
- Silent when the curator has never run (the existing first-run notice handles that case). Output: ℹ Skill curator — last run 4h ago auto: 1 marked stale; llm: consolidated 2 into 1, pruned 1 archived 3 skill(s): • docx-extraction → document-tools • pdf-extraction → document-tools • old-stale-thing — pruned (stale) full report: hermes curator status (This message shows once per curator run. View anytime: hermes curator status) State migration: - `_default_state()` gains `last_run_summary_shown_at: None`. Existing state files lack the field; `.get()` returns None; the comparison treats any prior run as 'not yet shown' and prints once on next update. Self-healing. Wiring: - Both `hermes update` paths in main.py call the new `_print_curator_recent_run_notice()` right after the existing first-run notice. Best-effort try/except so a state-load bug never breaks the update flow. 6 tests in tests/hermes_cli/test_curator_recent_run_notice.py: no-run / single-line / multi-line / show-once / new-run-resets / time-formatter buckets.	2026-05-09 18:43:40 -07:00
Teknium	b67ea7ff47	perf(cli): skip welcome banner on `chat -q` single-query mode (#22904 ) `hermes chat -q "..."` printed the full welcome banner before running the query — kawaii ASCII logo, available toolsets list, available skills list, model name, session ID, working directory, update-available notice. Building it took ~420 ms on cold start (~200 ms version-update probe, the rest is toolset / skill enumeration plus Rich panel rendering). For a one-shot `-q` query the banner is noise: the user already picked the prompt, doesn't need a toolset reference, and gets the session ID + resume hint from `_print_exit_summary()` after the response prints. The fully-quiet `-Q` / `--quiet` machine-readable path was already banner-free; this brings the human-facing single-query path in line so all non-interactive invocations are fast. Measured impact (`hermes chat -q "ok" --max-turns 1`, 10-run percentiles, 9950X3D): median: 1.90 → 1.75 s (-150 ms) min: 1.80 → 1.73 s ( -70 ms) P25: 1.82 → 1.74 s ( -80 ms) Wider variance than expected; the banner cost overlaps with API latency on real `chat -q` runs. Min-time delta of 70 ms is the cleanest signal — that's the deterministic banner-build cost gone. The 150 ms median delta picks up cases where the version-update probe also finishes during the wait. Interactive mode (`hermes` with no `-q`) and the `--list-tools` / `--list-toolsets` one-shot listing commands still show the banner — those are the contexts where it's actually wanted. Tests: 656/656 `tests/cli/` pass on top of latest main (modulo 5 pre- existing flakes in `test_cli_save_config_value.py` that fail with `No module named 'ruamel'` both with and without this change).	2026-05-09 18:20:28 -07:00
Teknium	5971a4e092	feat(docs): richer info panels on the Skills Hub for built-in + optional skills (#22905) The Skills Hub at /skills had cards that, when expanded, showed only the one-line description, tags, author, version, and an install command. For the 163 bundled and optional skills shipped with the repo, this was thinner than the data we already have on disk. Three changes, all under website/: 1. extract-skills.py now pulls four extra fields per local skill: - 'overview' — first non-heading body paragraph from SKILL.md (stripped of admonitions/code fences, capped at ~500 chars at a sentence boundary) - 'envVars' / 'commands' — from the prerequisites: block in frontmatter - 'license' — from the top-level frontmatter - 'docsPath' — slug to the per-skill /docs/user-guide/skills/.../* page, computed with the same logic as generate-skill-docs.py 162 of 163 local skills get a non-empty overview automatically. The remaining one (media/heartmula) has only headings/code in its body and falls through to the description. 2. Skill TS interface + SkillCard expanded-panel render the new fields: - Overview paragraph at the top of the panel - Prerequisites box (env vars + required commands) when frontmatter declares them - License row alongside author/version - 'View full documentation →' link to the per-skill docs page Search now covers the overview text too, so users can find skills by matching content from inside SKILL.md, not just the one-line description. 3. styles.module.css gains six new classes (overviewBlock, detailLabel, overviewText, prereqBlock/Row/Kind/List/Item, docsLink) styled to match the existing dark panel aesthetic. External / community skills (Anthropic, LobeHub, Claude Marketplace cached indexes) keep the old behavior — overview is empty, no prereqs, no docsPath. Validation: 'npm run build' clean (exit 0); broken-link count unchanged at 155 baseline; all 163 generated docsPath values resolve to existing pages under website/docs/user-guide/skills/.	2026-05-09 18:17:39 -07:00
Teknium	da086a0154	chore: add ming1523 to AUTHOR_MAP	2026-05-09 17:55:12 -07:00
ming	85383c6363	fix(cli): preserve config comments on setting writes	2026-05-09 17:55:12 -07:00
Teknium	de54618720	chore: add v1b3coder to AUTHOR_MAP	2026-05-09 17:54:58 -07:00
v1b3coder	4fdaf0b4d8	fix: use credential_pool for custom endpoint model listing probes Same-provider /model switches on a 'custom' endpoint kept stale credentials because (a) _resolve_named_custom_runtime's bare-custom + explicit_base_url path went straight to OPENAI_API_KEY/OPENROUTER_API_KEY env fallbacks without consulting the credential pool, and (b) switch_model() guarded against custom-provider re-resolution to preserve base_url, locking in the prior api_key. Now the bare-custom path queries the credential pool first (mirroring the named-custom-provider branch behavior), and the same-provider switch guard is removed since resolve_runtime_provider has since grown a robust custom-resolution path that preserves base_url from model_cfg. Refs #18681 (the gateway-side api_key wiring is still separate), #16254, #12919.	2026-05-09 17:54:58 -07:00
Teknium	f93b8c28e3	chore: add DanielLSM to AUTHOR_MAP	2026-05-09 17:54:44 -07:00
Daniel Marta	1fb9f7c68c	fix(gateway): pass max_total_size_mb and max_file_size_mb to CheckpointManager The /rollback command handler in gateway/run.py was constructing CheckpointManager with only enabled and max_snapshots, omitting max_total_size_mb and max_file_size_mb that the __init__ expects. This caused a TypeError on every /rollback invocation when checkpoints were enabled. Fixes: NousResearch/hermes-agent#18841	2026-05-09 17:54:44 -07:00
Teknium	4ca7c2104d	test(gateway): stub /proc unavailability in find_gateway_pids fallback test Follow-up test fix for #22693 — the existing test for ps-failure + pid-file fallback needed the /proc walk path stubbed too since /proc is now consulted first.	2026-05-09 17:54:17 -07:00
Wesley Simplicio	6bf7ac3185	fix(gateway): detect gateway process via /proc in Docker without procps Salvage of NousResearch/hermes-agent#7622. Docker images often lack procps so `ps` is unavailable. Try reading /proc/*/cmdline first (works in any Linux container) and fall back to `ps -A eww` only when /proc is not present. PermissionError on individual PIDs is silently skipped. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-09 17:54:17 -07:00
Teknium	2ffef15675	fix(test_gateway): stop run_gateway() tests from rewriting the dev's installed systemd unit (#22900 ) run_gateway() calls refresh_systemd_unit_if_needed() on every invocation so restart settings stay current after exit-code-75 respawns. The user-scope unit path resolves under Path.home() (NOT sandboxed by conftest, only HERMES_HOME is), and generate_systemd_unit() bakes the current HERMES_HOME into the unit's Environment= line. Result: any test that exercises run_gateway() end-to-end on a real Linux dev box silently rewrites the developer's installed ~/.config/systemd/user/hermes-gateway.service with a polluted HERMES_HOME pointing at /tmp/pytest-of-<user>/.../hermes_test. On the next reboot, systemd loads that unit, the gateway starts looking at an empty tmp dir, and Telegram/Discord/etc. all show as 'No messaging platforms enabled' even though the user's real config is fine. Three tests in tests/hermes_cli/test_gateway.py hit this path: test_run_gateway_exits_cleanly_on_keyboard_interrupt, test_run_gateway_exits_nonzero_when_start_gateway_reports_failure, and test_run_gateway_root_guard_has_escape_hatch. Two-layer fix: 1. _install_fake_gateway_run helper (covers all four run_gateway() call sites in test_gateway.py and any future ones) now also stubs supports_systemd_services and refresh_systemd_unit_if_needed. 2. refresh_systemd_unit_if_needed() itself sniffs the generated unit body for /pytest-of- and /hermes_test markers and refuses to write when present. Defense in depth so a future test that bypasses the helper still can't corrupt the dev's gateway. Tests that legitimately exercise the refresh flow (test_run_gateway_refreshes_outdated_unit_on_boot) patch generate_systemd_unit to return synthetic content that doesn't carry those markers, so they keep working. Adds test_refresh_refuses_to_bake_pytest_tmpdir_into_real_user_unit as a regression test for the source-side guard.	2026-05-09 17:54:09 -07:00
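The source-side guard described in 2ffef15675 (refuse to write a systemd unit whose body carries pytest tmpdir markers) might look roughly like the sketch below. Only the two marker strings come from the commit message; the function names, parameters, and write logic are hypothetical and are not the actual refresh_systemd_unit_if_needed() implementation.

```python
from pathlib import Path

# Marker substrings named in the commit message. Everything else in this
# sketch (names, signatures, return values) is an illustrative assumption.
PYTEST_MARKERS = ("/pytest-of-", "/hermes_test")


def looks_like_pytest_unit(unit_body: str) -> bool:
    """Return True if the generated unit text references a pytest tmpdir."""
    return any(marker in unit_body for marker in PYTEST_MARKERS)


def write_unit_if_safe(unit_path: Path, unit_body: str) -> bool:
    """Write the unit file unless its body looks test-polluted.

    Returns True when the file was written and False when the write was
    refused because a pytest marker was found in the body.
    """
    if looks_like_pytest_unit(unit_body):
        return False
    unit_path.parent.mkdir(parents=True, exist_ok=True)
    unit_path.write_text(unit_body, encoding="utf-8")
    return True
```

Checking the generated body rather than the environment means the guard still fires when a test bypasses the stubbing helper, which is the defense-in-depth point the commit makes.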


8013 commits