hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-31 19:16:29 +00:00

Author	SHA1	Message	Date
Austin Pickett	cb0e2e2f36	Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-04-29 15:23:30 -04:00
vincez-hms-coder	4c0cc77e94	fix(dashboard): keep ui imports browser-safe after rebase	2026-04-29 01:47:13 -04:00
vincez-hms-coder	9b62c98170	chore(dashboard): restore package lock metadata	2026-04-29 01:43:21 -04:00
vincez-hms-coder	469e4df3c2	fix(profiles): preserve skills on dashboard profile creation	2026-04-29 01:42:51 -04:00
vincez-hms-coder	ae11a31058	feat(profiles): add profile setup command endpoint and wrapper creation	2026-04-29 01:42:51 -04:00
vincez-hms-coder	3e200b64fb	fix(profiles): update terminal command for copying based on profile name Co-authored-by: Copilot <copilot@github.com>	2026-04-29 01:42:51 -04:00
vincez-hms-coder	1745cfc6d7	fix(dashboard): avoid node-only ui imports in browser	2026-04-29 01:42:50 -04:00
vincez-hms-coder	58c07867e3	fix(dashboard): keep profiles list resilient	2026-04-29 01:39:52 -04:00
vincez-hms-coder	4523965de9	feat(dashboard): add profiles management page Copy profile dashboard changes onto a fresh branch under the vincez-hms-coder account. Includes: - Profiles dashboard route and sidebar entry - Profile lifecycle REST endpoints - SOUL.md read/write support - i18n labels and helper text updates - Targeted profile API tests Test plan: - pytest tests/hermes_cli/test_web_server.py -k profile -q - cd web && npm run build	2026-04-29 01:39:51 -04:00
teknium1	fa9383d27b	feat(curator): umbrella-first prompt, inherit parent config, unbounded iterations Based on three live test runs against 346 agent-created skills on the author's own setup (~6.5 min, opus-4.7, 86 API calls), the curator prompt needed three sharpenings before it consistently produced real umbrella consolidation instead of passive audit output: Umbrella-first framing. The original 'decide keep/patch/archive/ consolidate' framing lets opus default to 'keep' whenever two skills aren't byte-identical. The new prompt explicitly tells the reviewer that pairwise distinctness is the wrong bar — the right question is 'would a human maintainer write this as N separate skills, or one skill with N labeled subsections?' Expect 10-25 prefix clusters; merge each into an umbrella via one of three methods. Three concrete consolidation methods. (a) Merge into an existing umbrella (patch the broadest skill, archive siblings); (b) Create a new umbrella SKILL.md (skill_manage action=create); (c) Demote session-specific detail into references/, templates/, or scripts/ under the umbrella via skill_manage action=write_file, then archive the narrow sibling. This matches the support-file vocabulary the review-prompt side already uses (PR #17213). Two observed bailouts pre-empted: 'usage counters are zero so I can't judge' (rule 4: judge on content, not use_count) and 'each has a distinct trigger' (rule 5: pairwise distinctness is the wrong bar). Config-aware parent inheritance. _run_llm_review() was building AIAgent() without explicit provider/model, hitting an auto-resolve path that returned empty credentials → HTTP 400 'No models provided' against OpenRouter. Fork now inherits the user's main provider and model (via load_config + resolve_runtime_provider) before spawning — runs on whatever the user is currently on, OAuth-backed or pool-backed included. Unbounded iteration ceiling. max_iterations=8 was way too low for an umbrella-build pass over hundreds of skills. 
A live pass takes 50-100 API calls (scanning, clustering, skill_view'ing candidates, patching umbrellas, mv'ing siblings). Raised to 9999 — the natural stopping criterion is 'no more clusters worth processing', not an arbitrary tool-call budget. Tests updated: test_curator_review_prompt_has_invariants accepts DO NOT / MUST NOT and drops 'keep' from the required-verb set (the umbrella-first prompt correctly deemphasizes 'keep' as a first-class decision label since passive keep-everything is the failure mode being prevented). Added test_curator_review_prompt_is_umbrella_first asserting the umbrella framing, class-level thinking, references/ + templates/ + scripts/ support-file mentions, and the 'use_count is not evidence of value' pre-emption. Added test_curator_review_prompt_offers_support_file_actions asserting skill_manage action=create and action=write_file are both named. Live validation on author's setup: - Run 1 (old prompt): 3 archives, stopped after surveying — typical passive outcome - Run 2 (consolidation prompt): 44 archives, 3 patches, surfaced the 50-skill mlops reorg duplicate bug but didn't umbrella - Run 3 (this prompt): 249 archives + 18 new class-level umbrellas created, reducing agent-created skills from 346 → 118 with every archived skill's content preserved as references/ under its umbrella. Pinned skill untouched. Full report in PR description.	2026-04-28 22:33:33 -07:00
Teknium	019d4c1c3f	feat(curator): hook into the gateway's cron-ticker thread Long-running gateways need the curator to fire on cadence without restarts. Piggy-back on the existing cron ticker thread (which already runs image/document cache cleanup every hour on the same pattern) instead of spawning a dedicated timer thread. - New CURATOR_EVERY = 60 ticks (poll hourly at default 60s interval). The inner config.interval_hours gate controls the real cadence, so all but one of the hourly pokes in each interval are cheap no-ops and one runs the review. - Removed the boot-time call added in the prior commit; the ticker covers boot + every hour thereafter. Avoids double-running. Handles the weekly-default-on-24/7-gateway gap flagged in review.	2026-04-28 22:33:33 -07:00
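The two-level gate described in 019d4c1c3f (a cheap hourly poke, with the real cadence decided by `interval_hours`) might look roughly like the sketch below. `CURATOR_EVERY` and `interval_hours` come from the commit message; `should_poke`, `review_is_due`, `on_tick`, and the timestamp arguments are hypothetical names for illustration, not the gateway's actual code. Poking at tick 0 is an assumption.

```python
CURATOR_EVERY = 60  # ticks between pokes; hourly at the default 60 s tick interval


def should_poke(tick_count):
    """Outer gate: true once every CURATOR_EVERY ticks (assumed to include tick 0)."""
    return tick_count % CURATOR_EVERY == 0


def review_is_due(now, last_run, interval_hours):
    """Inner gate: true when no run is recorded or interval_hours have elapsed."""
    if last_run is None:
        return True
    return (now - last_run) >= interval_hours * 3600


def on_tick(tick_count, now, last_run, interval_hours, run_review):
    """Run the review only when both gates pass; return the updated last-run time."""
    if should_poke(tick_count) and review_is_due(now, last_run, interval_hours):
        run_review()
        return now
    return last_run
```

With a 168-hour interval, every poke except the one that crosses the interval boundary returns immediately without running a review.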
Teknium	a12f7aa8bb	fix(curator): default cycle is every 7 days, not 24 hours Weekly is closer to how skill churn actually works — most agent-created skills don't change multiple times per day, so a daily review is pure cost without benefit. Bumping the default to 7 days reduces aux-model spend while still catching drift and staleness on the timescales that matter (30d stale, 90d archive). Changes: - DEFAULT_INTERVAL_HOURS: 24 -> 168 (7 days) - config.yaml default: interval_hours: 24 -> 24 * 7 - CLI status line renders as '7d' when interval is a whole-day multiple - Test `test_old_run_eligible` decoupled from the exact default: it now uses 2 * get_interval_hours() so future tweaks don't break it	2026-04-28 22:33:33 -07:00
Teknium	0d31864e3b	fix(curator): defense-in-depth gates against bundled/hub skills Previous invariants only gated the primary entry points (apply_automatic_transitions, archive_skill, CLI pin). Several paths were unprotected: - bump_view / bump_use / bump_patch / set_state / set_pinned wrote usage records unconditionally, which is confusing noise in .usage.json even though the review list filtered them out - restore_skill did not check whether a bundled skill now shadows the archived name - CLI unpin was asymmetric with CLI pin — it had no gate Fixes: - _mutate() (the shared counter / state writer) now drops silently when the skill is not agent-created. .usage.json never gains a record for a bundled or hub-installed skill. - restore_skill() refuses to restore under a name that is now bundled or hub-installed (would shadow upstream). - CLI unpin gate matches CLI pin. New tests: - 5 provenance-guard tests on skill_usage (one per mutator) - 1 end-to-end test that hammers every mutator at a bundled skill and a hub skill, asserts both are untouched on disk, and asserts the sidecar stays clean - 2 CLI tests proving pin/unpin refuse bundled skills symmetrically 64/64 tests passing (29 skill_usage + 27 curator + 8 new guards).	2026-04-28 22:33:33 -07:00
Teknium	c8b7e7268a	refactor(curator): point review prompt at existing tools The LLM review prompt mentioned bespoke `archive_skill` and `pin_skill` tools that are not registered as model tools. Swap the prompt to rely on the real surface: - skill_manage action=patch — for patching and consolidation - terminal — to `mv` skill dirs into .archive/ Also drop `pin` from the model's decision list — pinning is a user opt-out for `hermes curator pin <skill>`, not something the model should do autonomously. Decision list is now: keep / patch / consolidate / archive. Tests updated: prompt-invariant test now asserts the existing tools are referenced and that bespoke tool names do NOT appear. New test prevents `pin` from being re-added as a model decision.	2026-04-28 22:33:33 -07:00
Teknium	bc79e227e6	feat(curator): background skill maintenance (issue #7816 ) Adds the Curator — an auxiliary-model background task that periodically reviews AGENT-CREATED skills and keeps the collection tidy: tracks usage, transitions unused skills through active → stale → archived, and spawns a forked AIAgent to consolidate overlaps and patch drift. Default: enabled, inactivity-triggered (no cron daemon). Runs on CLI startup and gateway boot when the last run is older than interval_hours (default 24) AND the agent has been idle for min_idle_hours (default 2). Invariants (all load-bearing): - Never touches bundled or hub-installed skills (.bundled_manifest + .hub/lock.json double-filter) - Never auto-deletes — archive only. Archives are recoverable via `hermes curator restore <skill>` - Pinned skills bypass all auto-transitions - Uses the aux client; never touches the main session's prompt cache New files: - tools/skill_usage.py — sidecar .usage.json telemetry, atomic writes, provenance filter - agent/curator.py — orchestrator: config, idle gating, state-machine transitions (pure, no LLM), forked-agent review prompt - hermes_cli/curator.py — `hermes curator {status,run,pause,resume, pin,unpin,restore}` subcommand - tests/tools/test_skill_usage.py — 29 tests - tests/agent/test_curator.py — 25 tests Modified files (surgical patches): - tools/skills_tool.py — bump view_count on successful skill_view - tools/skill_manager_tool.py — bump patch_count on skill_manage patch/edit/write_file/remove_file; forget record on delete - hermes_cli/config.py — add curator: section to DEFAULT_CONFIG - hermes_cli/commands.py — add /curator CommandDef with subcommands - hermes_cli/main.py — register `hermes curator` subparser via register_cli() from hermes_cli.curator - cli.py — /curator slash-command dispatch + startup hook - gateway/run.py — gateway-boot hook (mirrors CLI) Validation: - 54 new tests across skill_usage + curator, all passing in 3s - 346 tests across all touched 
files' neighbors green - 2783 tests across hermes_cli/ + gateway/test_run_progress_topics.py green - CLI smoke: `hermes curator status/pause/resume` work end-to-end Companion to PR #16026 (class-first skill review prompt) — together they form a loop: the review prompt stops near-duplicate skill creation at the source, and the curator prunes/consolidates what still accumulates. Refs #7816.	2026-04-28 22:33:33 -07:00
Mil Wang (from Dev Box)	88602376d4	fix: resolve external_dirs relative to HERMES_HOME instead of cwd (#9949 ) Relative entries in skills.external_dirs were resolved against the process cwd via Path.resolve(), making them silently fail when Hermes was launched from a different directory. Resolve relative paths against get_hermes_home() for consistent behavior across CLI, gateway, and cron contexts. Absolute paths and env-var/tilde expansion are unchanged.	2026-04-28 22:29:09 -07:00
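The resolution rule from 88602376d4 can be illustrated as follows. This is a minimal sketch, assuming the Hermes home directory is passed in explicitly; `resolve_external_dir` is a hypothetical helper name, and the real code obtains the base directory from `get_hermes_home()`.

```python
import os
from pathlib import Path


def resolve_external_dir(entry, hermes_home):
    """Resolve one skills.external_dirs entry (hypothetical helper).

    Env-var and tilde expansion happen first; only entries that are still
    relative afterwards are anchored to hermes_home instead of the cwd.
    """
    expanded = Path(os.path.expandvars(entry)).expanduser()
    if not expanded.is_absolute():
        expanded = Path(hermes_home) / expanded
    return expanded.resolve()
```

Because the base is passed in rather than read from the process state, the result is the same whether the caller is the CLI, the gateway, or a cron job.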
teknium1	ded12f0968	chore(release): map LyleLengyel@gmail.com -> mcndjxlefnd	2026-04-28 22:26:09 -07:00
Lyle Lengyel	80e474f11f	fix(gateway,terminal): expand shell tilde in terminal.cwd before subprocess Commit `3c42064e` made config.yaml the single source of truth for TERMINAL_CWD, but the config bridge passes cwd values verbatim to os.environ. When a user sets terminal.cwd: ~/ in config.yaml, the literal string '~/' reaches subprocess.Popen, which fails because tilde expansion is a shell feature that the kernel does not perform. This patch adds three defensive layers: 1. gateway/run.py: expanduser at config bridge time so TERMINAL_CWD is always an absolute path. 2. tools/terminal_tool.py: expanduser when reading TERMINAL_CWD in _get_env_config(), guarding against stale or manually-set env vars. 3. tools/environments/local.py: expanduser in LocalEnvironment before passing cwd to subprocess.Popen, the final safety net. Includes regression tests in test_config_cwd_bridge.py for nested terminal.cwd, top-level cwd alias, and precedence ordering. Refs: `3c42064e`	2026-04-28 22:26:09 -07:00
JackJin	88e07c42b4	fix(cli): prevent .env sanitizer from splitting GLM_API_KEY by LM_API_KEY suffix The known-key splitter in `_sanitize_env_lines` used substring matching to find concatenated KEY=VALUE pairs. When a registered key was a suffix of another (LM_API_KEY is a suffix of GLM_API_KEY), the shorter key's needle would match inside the longer one, causing the sanitizer to rewrite `GLM_API_KEY=...` as `G\nLM_API_KEY=...` and silently break Z.AI/GLM auth (and similarly `GLM_BASE_URL` -> `G\nLM_BASE_URL`). Drop matches whose needle range is fully contained within a longer overlapping match. Two regression tests cover the suffix-collision case and confirm a real concatenation that happens to start with the longer key still splits where it should. Fixes #17138	2026-04-28 22:22:45 -07:00
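The containment rule from 88e07c42b4 (drop a match whose needle lies entirely inside a longer overlapping match) can be sketched as below. This is not the actual `_sanitize_env_lines` code: `KNOWN_KEYS`, `find_key_matches`, and `split_concatenated` are hypothetical names, and the key list is a small illustrative sample.

```python
# Hypothetical sample of registered keys; the real registry is larger.
KNOWN_KEYS = ["GLM_API_KEY", "LM_API_KEY", "GLM_BASE_URL", "LM_BASE_URL"]


def find_key_matches(line, keys=KNOWN_KEYS):
    """Return sorted (start, end, key) spans for every 'KEY=' needle,
    dropping spans fully contained within a strictly longer span."""
    matches = []
    for key in keys:
        needle = key + "="
        pos = line.find(needle)
        while pos != -1:
            matches.append((pos, pos + len(needle), key))
            pos = line.find(needle, pos + 1)
    kept = [
        m for m in matches
        if not any(
            o[0] <= m[0] and m[1] <= o[1] and (o[1] - o[0]) > (m[1] - m[0])
            for o in matches
        )
    ]
    return sorted(kept)


def split_concatenated(line):
    """Split a line at the start of each surviving KEY= match."""
    starts = {start for start, _end, _key in find_key_matches(line)}
    cuts = sorted(starts | {0}) + [len(line)]
    return [line[a:b] for a, b in zip(cuts, cuts[1:]) if a != b]
```

In `GLM_API_KEY=abc`, the `LM_API_KEY=` needle at offset 1 is contained in the `GLM_API_KEY=` span at offset 0, so it is discarded and the line stays whole. A genuine concatenation such as `GLM_API_KEY=abcLM_API_KEY=def` still splits at the second key.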
Brooklyn Nicholson	97a2474b39	review(copilot): point reload.env docstring at hermes_cli.config.reload_env	2026-04-28 22:22:30 -07:00
Brooklyn Nicholson	6b4ef00a2c	review(copilot): keep /reload cli_only since gateway has no handler	2026-04-28 22:22:30 -07:00
Brooklyn Nicholson	4858e26eaa	feat(tui): port classic CLI /reload (.env hot-reload) to TUI Classic CLI exposes ``/reload`` (re-reads ~/.hermes/.env into ``os.environ`` via ``hermes_cli.config.reload_env``) so newly added API keys take effect without restarting the session. The TUI was missing the parity command, so users had to Ctrl+C out and ``hermes --tui`` again whenever they added or rotated a credential. Three small wires: * New ``reload.env`` JSON-RPC method in ``tui_gateway/server.py`` that delegates to ``hermes_cli.config.reload_env`` and returns the count of vars updated. * New ``/reload`` slash command in ``ui-tui/src/app/slash/commands/ops.ts`` matching the existing ``/reload-mcp`` pattern (native RPC, no slash worker). * Drop ``cli_only=True`` from the ``reload`` ``CommandDef`` in ``hermes_cli/commands.py`` so help/menus surface it in the TUI too. ``reload_env`` itself is environment-agnostic. Same caveat as classic CLI: the currently constructed agent's credential pool / provider routing does not auto-rebuild. Users who want a brand-new credential resolution should follow with ``/new``. Tests: * New ``test_reload_env_rpc_calls_hermes_cli_reload_env`` confirms RPC delegates and reports the count. * New ``test_reload_env_rpc_surfaces_errors`` confirms exceptions are rendered as JSON-RPC errors. * ``createSlashHandler.test.ts`` slash-parity matrix extended with ``['/reload', 'reload.env', {}]`` so we can't regress the routing. Validation: scripts/run_tests.sh tests/test_tui_gateway_server.py — 92/92. scripts/run_tests.sh tests/hermes_cli/test_commands.py — 128/128. cd ui-tui && npm run type-check — clean; npm test --run — 390/390.	2026-04-28 22:22:30 -07:00
Teknium	dcd7b717f8	fix(gateway): linearize tool-progress bubbles with content messages (#17280 ) After PR #7885 (`97b0cd51e`) added content-side segment breaks for natural mid-turn assistant messages, the tool-progress task in gateway/run.py was not updated to match. progress_msg_id and progress_lines persisted for the whole run, so after a tool batch produced bubble B1 followed by content bubble C1, the next tool.started kept editing the OLD bubble B1 above C1 — making the chat appear out of order on Telegram, Discord, and Slack. Add on_new_message callback to GatewayStreamConsumer, fired at the four sites where a fresh content bubble lands on the platform: - _send_or_edit first-send branch (NOT edits) - _send_commentary - _send_new_chunk (overflow split) - each successful chunk of _send_fallback_final Gateway supplies a lambda that enqueues ('__reset__',) into the progress_queue. send_progress_messages() handles the marker in both the main loop and the CancelledError drain path, clearing progress_msg_id, progress_lines, and the dedup state so the next tool.started opens a fresh bubble below the new content. Result: each tool batch appears in chronological order below the preceding content. When no content appears between tool batches, tools still group in one bubble (CLI-style compactness). Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 22:17:33 -07:00
Tranquil-Flow	ac855bba0e	fix(cli): respect terminal.cwd config in local terminal backend init_session() runs a login shell bootstrap that sources profile scripts (.bashrc, .bash_profile, etc.) before capturing pwd. If any profile script changes the working directory, the captured cwd overwrites the configured terminal.cwd value — so terminal commands run in the wrong directory despite the TUI banner showing the configured path. Add an explicit 'builtin cd' to the configured cwd in the bootstrap script, after profile sourcing but before pwd capture, ensuring the configured terminal.cwd is always what gets recorded. Fixes #14044	2026-04-28 22:16:08 -07:00
Brooklyn Nicholson	f95c34f415	fix(browser): address Copilot round-4 on /browser connect * Reject unsupported schemes (anything outside http/https/ws/wss) in cli.py /browser connect before probing or persisting, matching the gateway's existing 4015 path. * Defend gateway browser.manage against `{"url": null}` and non-string urls: empty/null falls back to DEFAULT_BROWSER_CDP_URL, non-string returns a 4015 instead of slipping into the generic 5031 catch via TypeError on `"://" in url`. * Add regression tests for both null-url fallback and non-string rejection.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	679a27498d	fix(browser): address Copilot round-3 on /browser connect * Gate `browser.progress` emit on truthy `session_id`. The TUI prints `messages` from the response when there's no session, so emitting events too would double-render. Now: with a session → events stream live; without one → bundled messages only. * Resolve `system = platform.system()` once in `_browser_connect` and thread it through `try_launch_chrome_debug` and `_failure_messages` → `manual_chrome_debug_command`, so the generated hint is consistent (and tests are deterministic) on any host. * Add `test_browser_manage_connect_no_session_skips_progress_events` to lock in the gating behavior.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	d1ee4915f3	fix(browser): address Copilot review on /browser connect Fixes from Copilot's two passes on PR #17238: * Validate parsed URL once: reject missing host, invalid port, and unsupported scheme up front so malformed inputs (e.g. http://:9222 or http://localhost:abc) don't fall through to a generic 5031. * Tighten _is_default_local_cdp to require a discovery-style path so ws://127.0.0.1:9222/devtools/browser/<id> is not collapsed to bare http://127.0.0.1:9222 (which would lose the path and break the connect). * Move browser.manage into _LONG_HANDLERS so the up-to-10s launch-and-retry loop runs on the RPC pool instead of blocking the main dispatcher. * try_launch_chrome_debug uses Windows-appropriate detach kwargs (creationflags=DETACHED_PROCESS|CREATE_NEW_PROCESS_GROUP) instead of POSIX-only start_new_session=True. * manual_chrome_debug_command uses subprocess.list2cmdline on Windows so the printed instruction is cmd.exe-compatible. * Mirror host/port validation in cli.py /browser connect so the classic CLI never persists an invalid BROWSER_CDP_URL.	2026-04-28 22:11:10 -07:00
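The up-front URL checks listed in the /browser connect review fixes (non-string input, unsupported scheme, missing host, invalid port) can be expressed with the standard library alone. A minimal sketch: `validate_cdp_url` and its error strings are hypothetical, and the real handlers return gateway error codes rather than messages.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https", "ws", "wss"}


def validate_cdp_url(url):
    """Return None if the URL looks usable, else a short reason (hypothetical helper)."""
    if not isinstance(url, str):
        return "url must be a string"
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        return "unsupported scheme"
    if not parts.hostname:
        return "missing host"
    try:
        parts.port  # accessing .port raises ValueError for non-numeric or out-of-range ports
    except ValueError:
        return "invalid port"
    return None
```

Validating once on the parsed result keeps inputs such as `http://:9222` or `http://localhost:abc` from reaching the probe loop at all.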
Brooklyn Nicholson	26816d1f77	refactor(tui): tighten /browser connect plumbing Split browser.manage into a small dispatcher with named connect/disconnect helpers, fold _http_ok / _probe_urls / _normalize_cdp_url out of the nested probe loop, collapse the failure-message scaffolding, and DRY the chrome candidate path tables. Behaviour and event shape unchanged.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	e750829015	fix(tui): stream /browser connect progress as gateway events Emit browser.progress JSON-RPC notifications during the connect work and render them in the TUI as system transcript lines, so users see the same step-by-step status the base CLI prints instead of nothing for ~1m followed by a final result.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	7d39a45749	fix(tui): show /browser connect progress like CLI Return CLI-style browser connect status messages from the gateway and render them in the TUI so local Chrome launch attempts are visible instead of ending in a silent delayed failure.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	69ff114ee2	fix(browser): avoid bogus Chrome launch fallback Detect an actual Chrome/Chromium executable before printing a manual CDP launch command, including common WSL-mounted Windows browser paths, so /browser connect does not suggest google-chrome when it is unavailable.	2026-04-28 22:11:10 -07:00
Brooklyn Nicholson	f10a3df632	fix(tui): align /browser connect local CDP handling Share Chrome CDP launch helpers between the classic CLI and TUI so default /browser connect uses loopback consistently, retries local Chrome launch, and reports a copyable manual-start command instead of claiming a dead connection.	2026-04-28 22:11:10 -07:00
Teknium	1d4218be56	feat(review): active-update bias, loaded-skill-first, support-file variants (#17213 ) The background skill-review prompts (_SKILL_REVIEW_PROMPT and the Skills half of _COMBINED_REVIEW_PROMPT) steered the reviewer toward passive behavior — most passes concluded 'Nothing to save.' even when the session produced real lessons. User-preference corrections (style, format, legibility, verbosity) were especially lost: they were read as memory signals only, so skills never carried the fix. This rewrite changes the stance: - Active-update bias. The reviewer now treats inaction as a missed learning opportunity. 'Nothing to save.' remains an explicit escape but is no longer framed as the most-common outcome. - User-preference corrections are first-class skill signals. Style, tone, format, legibility, verbosity complaints — and the actual phrasings users use ('stop doing X', 'this is too verbose', 'I hate when you Y', 'remember this') — now warrant patching the skill that governs the task, not just writing to memory. - Loaded-skill-first preference order. When a skill was loaded via /skill-name or skill_view during the session, the reviewer patches THAT one first. It was in play; it's the right place. - Four-step ladder: patch-loaded → patch-umbrella → support-file → create. Support files are explicitly enumerated as three kinds: * references/<topic>.md — session-specific detail OR condensed knowledge banks (quoted research, API docs excerpts, domain notes) * templates/<name>.<ext> — starter files to copy and modify * scripts/<name>.<ext> — statically re-runnable actions - Name-veto for CREATE. New skill names MUST be class-level — no PR numbers, error strings, codenames, library-alone names, or session artifacts ('fix-X / debug-Y / audit-Z-today'). If the proposed name only fits today's task, fall back to one of the patch/support-file options. - Memory scope clarified. 
'who the user is and what the current situation and state of your operations are' — MEMORY.md is situational/state, USER.md is identity/preferences. - Curator handoff. Reviewer flags overlap; the background curator handles consolidation at scale. Single-session reviewer doesn't attempt umbrella-rebalancing. Tests: tests/run_agent/test_review_prompt_class_first.py upgraded to assert the new behavioral contracts (active bias, user-correction signals, loaded-skill-first, support-file kinds, name-veto, memory framing, curator handoff). 17 tests, all pass. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 21:11:48 -07:00
Teknium	8c892c1453	refactor(redact): canonical mask_secret helper; fix status.py DIM drift (#17207 ) Three modules independently implemented the same "preserve head+tail of a secret, mask the middle" logic with slightly different behaviors that had started to drift: hermes_cli/config.py redact_key — 12-char floor, 4+4, DIM '(not set)' hermes_cli/status.py redact_key — 12-char floor, 4+4, plain '(not set)' ← drift hermes_cli/dump.py _redact — 12-char floor, 4+4, empty string The visible bug: 'hermes status' displayed the '(not set)' placeholder in plain text while 'hermes config' showed it in dim text. Same concept, inconsistent UI. Introduces mask_secret() in agent/redact.py as the canonical helper, with head/tail/floor/placeholder/empty kwargs. The three call sites become one-line wrappers that differ only in the 'empty' handling: config.redact_key → mask_secret(k, empty=color('(not set)', Colors.DIM)) status.redact_key → mask_secret(k, empty=color('(not set)', Colors.DIM)) dump._redact → mask_secret(v) # empty → '' agent.redact._mask_token (log redactor, different policy: 18-char floor, 6+4 visible, '*' on empty) also ports to mask_secret but retains its own empty-case handling to preserve the historical '' return. Net: the three display-time redactors now agree on formatting, the canonical helper lives in one place, and future tweaks (e.g. adding bullet-point masking, changing the head/tail widths) happen once. Verified: - 3/3 tests/hermes_cli/test_web_server.py::TestRedactKey pass - 89/89 agent/tests/test_redact.py + tests/tools/test_browser_secret_exfil.py + tests/hermes_cli/test_redact_config_bridge.py pass - Live 'hermes status', 'hermes config', 'hermes dump' all render the same way they did before (verified against actual env with real keys: OpenRouter, Firecrawl, Browserbase, FAL, Tinker all show 'prefix...suffix'; Kimi shows '**' at <12 chars; unset shows '(not set)' uniformly). 
Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 21:04:35 -07:00
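A canonical head/tail masker along the lines of 8c892c1453 might look like the sketch below. The keyword names (`head`, `tail`, `floor`, `placeholder`, `empty`) are taken from the commit message, but their exact semantics, the `"***"` default, and the `"..."` separator are assumptions; this is not the code in agent/redact.py.

```python
def mask_secret(value, *, head=4, tail=4, floor=12, placeholder="***", empty=""):
    """Show the first `head` and last `tail` characters of a secret.

    Assumed behavior: falsy input returns `empty`; values shorter than
    `floor` are fully masked with `placeholder` so short keys leak nothing.
    """
    if not value:
        return empty
    if len(value) < floor:
        return placeholder
    return value[:head] + "..." + value[len(value) - tail:]
```

Under this shape, the display wrappers differ only in what they pass for `empty`, for example a dimmed `(not set)` string for status and config output versus the default empty string for dumps.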
brooklyn!	6e9691ff12	Merge pull request #17237 from NousResearch/bb/tui-paste-watchdog fix(tui): stabilize sticky prompts and paste recovery	2026-04-28 20:22:44 -07:00
Brooklyn Nicholson	10ad7006b6	fix(tui): use paste timeout when rearming paste watchdog Match the buffered-stdin rearm cadence to IN_PASTE state so large pastes do not spin the normal escape timeout while waiting for readable data to drain.	2026-04-28 22:21:44 -05:00
Brooklyn Nicholson	f542d17b00	style(tui): apply npm run fix Run the TUI lint autofix and formatter on the PR branch after the sticky prompt and paste recovery changes.	2026-04-28 22:18:26 -05:00
Brooklyn Nicholson	d7ae8dfd0a	style(tui): remove steer queued emoji Keep the /steer acknowledgement plain text so it reads like the rest of the TUI status copy.	2026-04-28 22:15:57 -05:00
Brooklyn Nicholson	ce2cc7302e	fix(tui): stabilize sticky prompt tracking Keep the latest prompt sticky while the viewport is in live assistant output beyond history, and clear stale sticky state at the real bottom using fresh scroll height.	2026-04-28 22:10:40 -05:00
Brooklyn Nicholson	afb20a1d67	fix(tui): recover from stuck paste mode Prevent unterminated bracketed paste input from swallowing future keystrokes, and avoid rendering an empty Thinking panel before reasoning arrives.	2026-04-28 22:06:27 -05:00
Teknium	cd7150a195	perf(approval): precompile DANGEROUS_PATTERNS and HARDLINE_PATTERNS (#17206) detect_dangerous_command() and detect_hardline_command() were calling re.search(pattern, text, re.IGNORECASE | re.DOTALL) inline. Python's re._cache (512 patterns) amortizes compile cost on the warm path, but: 1. The first terminal() call per process pays the full compile fan-out for all 59 patterns (12 HARDLINE + 47 DANGEROUS). Measured at ~2.6 ms per detect_dangerous_command() call after re.purge(). 2. The re._cache is LRU: unrelated regex work elsewhere in the agent (response parsing, text normalization, etc.) can evict our patterns and silently re-compile them on the next terminal() call. Precompiling at module load eliminates both costs: detect_dangerous_command: cold 2.613 ms → 0.298 ms (-88%) warm 0.042 ms → 0.004 ms (-90%) detect_hardline_command: cold ~0.6 ms → 0.006 ms warm 0.011 ms → 0.002 ms Savings are per terminal() call. Agents with heavy terminal use see compound savings; the bigger value is the stability guarantee (no re._cache eviction can silently re-introduce the 2.6 ms cold cost mid-session). Implementation: - HARDLINE_PATTERNS_COMPILED and DANGEROUS_PATTERNS_COMPILED built at module load from the existing (pattern, description) tuples, using shared _RE_FLAGS = re.IGNORECASE | re.DOTALL. - detect_* functions now iterate the compiled list and call pattern_re.search(text). - Original HARDLINE_PATTERNS and DANGEROUS_PATTERNS lists kept as-is (other code in the file uses them for key derivation / _PATTERN_KEY_ALIASES). Verified: - 160/161 tests/tools/test_approval*.py pass (1 pre-existing heartbeat test flake on main). - 349/349 tests/tools/ 'approval or terminal or dangerous' pass. - Live hermes chat smoke: 3 benign terminal commands + 1 rm -rf /tmp/ (clarify prompt fired, so the approval path still works) + 1 sudo (sudo password prompt fired, so the DANGEROUS pattern match still works). 23 log lines in the smoke window, zero errors.
Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 18:44:14 -07:00
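The precompile-at-import pattern described in cd7150a195 reduces to a few lines. In this sketch the names `DANGEROUS_PATTERNS`, `DANGEROUS_PATTERNS_COMPILED`, `_RE_FLAGS`, and `detect_dangerous_command` follow the commit message, but the two sample patterns and the return value (the matched description, or `None`) are assumptions for illustration only.

```python
import re

_RE_FLAGS = re.IGNORECASE | re.DOTALL

# Hypothetical sample entries; the real list holds dozens of (pattern, description) tuples.
DANGEROUS_PATTERNS = [
    (r"\brm\s+-\w*r", "recursive delete"),
    (r"\bsudo\b", "privilege escalation"),
]

# Compiled once at module load, so no call depends on re's internal LRU cache.
DANGEROUS_PATTERNS_COMPILED = [
    (re.compile(pattern, _RE_FLAGS), description)
    for pattern, description in DANGEROUS_PATTERNS
]


def detect_dangerous_command(text):
    """Return the description of the first matching pattern, or None (assumed shape)."""
    for pattern_re, description in DANGEROUS_PATTERNS_COMPILED:
        if pattern_re.search(text):
            return description
    return None
```

Keeping the original tuple list alongside the compiled list lets other code continue to derive keys from the raw pattern strings.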
Teknium	adef1f33ab	chore(release): map scott@scotttrinh.com -> scotttrinh (#17203) Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 18:28:49 -07:00
Teknium	fe295f9836	docs(hooks): tutorial — build a BOOT.md startup checklist (#17202 ) Replace the removed built-in boot-md hook (#17093) with a how-to that shows users how to wire up the same behavior themselves via the hooks system. Uses _resolve_gateway_model() + _resolve_runtime_agent_kwargs() so the example works against custom endpoints and OAuth providers, not just the aggregator defaults that the old built-in silently assumed. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 18:27:48 -07:00
Scott Trinh	fd943461ca	fix(doctor): accept catalog provider aliases Validate configured providers against both Hermes runtime provider ids and catalog-normalized provider ids. This keeps providers like ai-gateway from being rejected after catalog resolution maps them to models.dev ids. Keep credential checks and vendor-slug warnings anchored to the runtime id so doctor reports actionable provider names in follow-up diagnostics.	2026-04-28 18:27:42 -07:00
Teknium	9f004b6d94	perf(tools): memoize get_tool_definitions + TTL-cache check_fn results (#17098) Two amplifying optimizations to per-turn overhead in the gateway: 1. get_tool_definitions() memoization (model_tools.py) Keyed on (frozenset(enabled), frozenset(disabled), registry._generation, config.yaml mtime+size). Only active when quiet_mode=True (which is every hot-path caller: gateway, AIAgent.__init__); quiet_mode=False keeps the existing print side effects. Cached path returns a shallow-copy list sharing read-only schema dicts. Measured: 7.5 ms → 0.01 ms per call (~750× speedup). Gateway constructs fresh AIAgent per message, so this saves ~7 ms/turn before any LLM work. 2. check_fn() TTL cache (tools/registry.py) check_fn callables like check_terminal_requirements probe external state (Docker daemon, Modal SDK, playwright binary). For a long-lived process, hitting them on every get_definitions() pass was pure waste; external state changes on human timescales. 30 s TTL so env-var flips (hermes tools enable X) propagate within a turn or two without explicit invalidation. Measured: first call 7.5ms → 1.6ms (check_fn probes now dominate); subsequent calls ~0.01ms via the upstream memoization. Invalidation surface: - registry._generation bumps on register/deregister/register_toolset_alias, invalidating the memoized definitions automatically. - config.yaml mtime in the cache key captures user-visible config edits affecting dynamic schemas (execute_code mode, discord allowlist). - invalidate_check_fn_cache() exposed for explicit flushes (e.g. after hermes tools enable/disable). - tests/conftest.py autouse fixture clears both caches before every test so env-var monkeypatches don't see stale results. Also fixes a regression from PR #17046 that I missed: - tools/web_tools.py: Firecrawl was removed from module scope by the lazy import, breaking 8 tests that patch 'tools.web_tools.Firecrawl'. Applied the same _FirecrawlProxy pattern used in auxiliary_client/ run_agent for OpenAI (module-level proxy that looks like the class but imports the SDK on first call/isinstance; patch() replaces the attribute as usual). Verified: - 49/49 tests/tools/test_web_tools_config.py pass (was 8 failing on main) - 68/68 tests/tools/test_homeassistant_tool.py pass (was 1 failing in the full suite due to check_fn TTL cross-test pollution; fixed by the autouse fixture) - 3887/3895 tests/tools/ (8 pre-existing fails: 2 delegate, 1 mcp dynamic discovery, 5 mcp structured content; all confirmed on main) - 2973/2976 tests/agent/ + tests/run_agent/ (3 pre-existing fails) - 868/868 tests/run_agent/ (excluding test_run_agent.py which has pre-existing suite-level issues) - Live smoke: 2 turns + /model switch + tool calls, zero errors in agent.log session window. Co-authored-by: teknium1 <teknium@users.noreply.github.com>	2026-04-28 18:20:17 -07:00
brooklyn!	188eaa57c4	fix(tui): honor documented mouse_tracking config key (#17188) * fix(tui): honor documented mouse_tracking config key The TUI runtime was reading display.tui_mouse while docs and user-facing examples pointed users at display.mouse_tracking. That made persistent mouse-disable config look like a no-op for users trying to restore native terminal selection/copy behavior on Linux/SSH/tmux terminals. Use display.mouse_tracking as the canonical key, keep display.tui_mouse as a legacy fallback, and have /mouse write the documented key. Both gateway config.get and client-side config sync now share the same precedence: the canonical key wins, then the legacy key, then default on. * review(copilot): align mouse tracking config coercion - Load gateway config once before deriving display.mouse_tracking state. - Use key-presence precedence on the TUI client too, so canonical mouse_tracking wins over legacy tui_mouse even when the value is null. - Treat numeric 0 as disabled on both gateway and client, matching the existing string "0" handling. - Widen ConfigDisplayConfig mouse fields because config.get full returns raw YAML, not normalized booleans.	2026-04-28 17:39:07 -07:00
brooklyn!	6b09df39be	fix(tui): restore macOS copy behavior and theme polish (#17131) This PR groups the TUI fixes that restore macOS Terminal usability and clean up the theme/composer regressions: - copy transcript selections on macOS drag-release so Terminal.app users can copy while mouse tracking is enabled - copy composer selections on macOS drag-release; composer selection is internal to TextInput and does not use the global Ink selection bus - keep IDE Cmd+C forwarding setup macOS-only, and make keybinding conflict checks respect simple when-clause overlap/negation - force truecolor before chalk initializes (unless NO_COLOR / FORCE_COLOR / HERMES_TUI_TRUECOLOR opt-outs apply) so the default banner keeps its gold/amber/bronze gradient in Terminal.app - move TUI surfaces onto semantic theme tokens and preserve skin prompt symbols as bare tokens with renderer-owned spacing - render focused placeholders as dim hint text in TTY mode instead of inverse/selected-looking synthetic cursor text	2026-04-28 18:47:14 -05:00
brooklyn!	a9efa46b69	Merge pull request #17174 from NousResearch/bb/nix-web-hash-refresh fix(nix): refresh web/ npm-deps hash to unblock main builds	2026-04-28 16:45:57 -07:00
Brooklyn Nicholson	b2f936fd37	fix(nix): treat transient magic-cache throttling as skip in fix-lockfiles Round 1 of #17174 hit `nix-lockfile-check` failure. Root cause was NOT a stale hash — the primary `nix (ubuntu-latest)` and `nix (macos-latest)` builds passed. GitHub's Magic Nix Cache returned HTTP 418 (rate-limited / throttled) mid-run, so the rebuild bailed with `some outputs of '/nix/store/...-npm-deps.drv' are not valid, so checking is not possible` — no `got:` line for the script to extract. The script then incorrectly treated this as 'build failed with no hash mismatch' and exited 1, breaking the lint on every PR whenever the cache is throttled. Now we recognize the throttling/cache-disabled signature and skip that entry with a warning. A real stale hash still surfaces in the primary `.#$ATTR` build (separate CI job), so we don't lose coverage.	2026-04-28 18:39:35 -05:00
Brooklyn Nicholson	ec11aa64ee	fix(nix): refresh web/ npm-deps hash to unblock main builds `web/package-lock.json` was updated by the design-system refactor (merged via #17007 + follow-ups: spinner / select / badges / buttons) without bumping `nix/web.nix::npmDeps.hash`, breaking nix builds on every PR + main since 2026-04-28T18:46. Hash sourced from the actual `Check flake` failure output: specified: sha256-AahWmJ9gDQ9pMPa1FYwUjYdO2mOi6JM9Mst27E0vp68= got: sha256-+B2+Fe4djPzHHcUXRx+m0cuyaopAhW0PcHsMgYfV5VE= Standalone single-file fix so it can land fast and clear nix on every other open PR.	2026-04-28 18:21:09 -05:00
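
The check_fn() TTL cache described in commit 9f004b6d94 can be illustrated with a minimal sketch. This is not the code from tools/registry.py: the names `cached_check`, `invalidate_check_cache`, `_check_cache`, and the injectable `now` clock are all hypothetical, chosen only to show the idea of re-probing external state at most once per 30 s window while still allowing an explicit flush.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical names throughout; an illustration, not the Hermes implementation.
CHECK_FN_TTL_SECONDS = 30.0  # the 30 s TTL mentioned in the commit message

# Maps a tool name to (timestamp of last probe, cached probe result).
_check_cache: Dict[str, Tuple[float, bool]] = {}


def cached_check(
    name: str,
    check_fn: Callable[[], bool],
    ttl: float = CHECK_FN_TTL_SECONDS,
    now: Callable[[], float] = time.monotonic,
) -> bool:
    """Return check_fn()'s result, re-probing only once the TTL has expired."""
    current = now()
    entry = _check_cache.get(name)
    if entry is not None and current - entry[0] < ttl:
        return entry[1]  # still fresh: skip the expensive external probe
    result = bool(check_fn())
    _check_cache[name] = (current, result)
    return result


def invalidate_check_cache() -> None:
    """Explicit flush, e.g. after a tool is enabled or disabled."""
    _check_cache.clear()
```

Passing the clock in as a parameter is a design choice for this sketch only: it lets a test advance time without sleeping, which mirrors why the commit adds a fixture that clears the caches before every test.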


6527 commits
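
The key precedence described in commit 188eaa57c4 (canonical display.mouse_tracking first, then legacy display.tui_mouse, then default on) can be sketched as follows. The function names are hypothetical and this is not the gateway or TUI client code. Two behaviors here are assumptions beyond what the commit message states: a null value is treated as "use the default", and the strings "false", "off", and "no" are treated as disabled alongside the documented "0".

```python
from typing import Any, Mapping


def _coerce_enabled(value: Any, default: bool = True) -> bool:
    """Coerce a raw YAML value to a boolean (hypothetical helper)."""
    if value is None:
        return default  # assumption: null means "fall back to the default"
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0  # numeric 0 is disabled, per the commit message
    if isinstance(value, str):
        # "0" is documented; the other spellings are an assumption.
        return value.strip().lower() not in {"0", "false", "off", "no"}
    return default


def mouse_tracking_enabled(display: Mapping[str, Any]) -> bool:
    """Resolve mouse tracking by key presence: canonical, legacy, default on."""
    if "mouse_tracking" in display:  # canonical key wins even if its value is null
        return _coerce_enabled(display["mouse_tracking"])
    if "tui_mouse" in display:  # legacy fallback
        return _coerce_enabled(display["tui_mouse"])
    return True  # default on
```

Checking key presence rather than truthiness is what lets a canonical key with a null value still take priority over a legacy key that is set to false.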