hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-04 12:33:08 +00:00

Author	SHA1	Message	Date
Teknium	016c772e7f	feat(plugins): tool override flag for replacing built-in tools (closes #11049 ) (#26759 ) Plugins can now replace a built-in tool by passing override=True to ctx.register_tool(). Without it, the registry rejects any registration that would shadow an existing tool from a different toolset (unchanged default behavior). Unlocks the use case from #11049: drop-in replacement of browser/web backends without forking core. Composes with the existing pre_tool_call hook for runtime interception of any implementation. The override is audit-logged at INFO so it surfaces in agent.log.	2026-05-15 22:12:57 -07:00
helix4u	9c304a7f56	fix(agent): retry malformed anthropic stream parser errors	2026-05-15 22:12:42 -07:00
teknium1	c9b32a654c	feat(skill): darwinian-evolver optional skill Thin wrapper around Imbue's darwinian_evolver (AGPL-3.0, subprocess-only). Ships a working OpenRouter driver (parrot_openrouter.py), a snapshot inspector (show_snapshot.py), and a custom-problem template. SKILL.md has 58-char description, Pitfalls sourced from actually running the loop: non-viable seed trap, Azure content filter killing runs, loop.run() being a generator, nested-pickle snapshots, and aggressive default concurrency. Salvaged from #12719 by @Bihruze — original PR shipped 12,289 LOC across 61 files (29 Python modules, FastAPI dashboard, VS Code extension, benchmark hub, marketplace, etc.) which was far beyond the scope of the underlying issue (#336). This version stays at the ~700-LOC scope that issue actually asked for. Authorship of the original effort credited via AUTHOR_MAP entry and the SKILL.md author field. Verified end-to-end: seed 'Say {{ phrase }}' (score 0.000) evolved into 'Please repeat the following phrase exactly as it is, without any modifications or additional formatting: {{ phrase }}' (score 0.750) across 3 iterations on gpt-4o-mini via OpenRouter. Co-authored-by: Bihruze <98262967+Bihruze@users.noreply.github.com>	2026-05-15 21:56:07 -07:00
Teknium	c5dc9700eb	fix(windows): silence tirith-unavailable banner + skip install/spawn attempts on unsupported platforms (#26718 ) Some checks are pending Deploy Site / deploy-vercel (push) Waiting to run Details Deploy Site / deploy-docs (push) Waiting to run Details Docker Build and Publish / build-amd64 (push) Waiting to run Details Docker Build and Publish / build-arm64 (push) Waiting to run Details Docker Build and Publish / merge (push) Blocked by required conditions Details Docker Build and Publish / move-main (push) Blocked by required conditions Details Docker Build and Publish / move-latest (push) Blocked by required conditions Details Lint (ruff + ty) / ruff + ty diff (push) Waiting to run Details Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run Details Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run Details Nix / nix (macos-latest) (push) Waiting to run Details Nix / nix (ubuntu-latest) (push) Waiting to run Details OSV-Scanner / Scan lockfiles (push) Waiting to run Details Tests / test (push) Waiting to run Details Tests / e2e (push) Waiting to run Details uv.lock check / uv lock --check (push) Waiting to run Details Tirith ships no Windows binary, so on every Windows CLI startup users saw a scary 'tirith security scanner enabled but not available' banner they could not act on. The banner suggested degraded security; in reality pattern-matching guards still run and the message was pure noise. Fix: - New public is_platform_supported() helper in tools/tirith_security.py that returns False when _detect_target() doesn't resolve (Windows, any non-x86_64/aarch64 arch). - ensure_installed(), _resolve_tirith_path(), and check_command_security() short-circuit on unsupported platforms: cache _resolved_path = _INSTALL_FAILED with reason 'unsupported_platform', skip PATH probes, skip the background download thread, skip the disk failure marker, and return allow with an empty summary from check_command_security so the spawn loop never fires. - Explicit user-configured tirith_path is still honored everywhere (a user who built tirith themselves under WSL keeps that path). - CLI banner in cli.py gated on is_platform_supported() — fires only on platforms where tirith should work but isn't installed. - Docs note tirith's supported-platform list and point Windows users at WSL. Tests: tests/tools/test_tirith_security.py +8 tests covering Linux x86_64, Darwin arm64, Windows, and unknown-arch verdicts plus the silent ensure_installed / check_command_security / _resolve_tirith_path fast-paths and the explicit-path override. test_tirith_security.py 75 passed (8 new + 67 pre-existing) test_command_guards.py 19 passed	2026-05-15 20:29:28 -07:00
helix4u	97a32afdc4	fix(auxiliary): resolve xai oauth compression from pool	2026-05-15 19:53:37 -07:00
Teknium	6784c80794	fix(xai-oauth): lead entitlement-403 hint with X Premium+ gotcha (#26672 ) The #1 confusing cause of the xAI 403 (per Teknium): X Premium+ subscribers see Grok inside the X app and assume API access is included. It is NOT — only standalone SuperGrok subscribers can use xai-oauth with Hermes today. Without calling this out, every Premium+ user hits the 403 with no idea why. PR #26666's neutral 4-cause list was correct but buried the most common cause. Lead with the Premium+ gotcha, then list the other possibilities (no subscription, wrong tier, exhausted quota) as fallbacks. Same neutral framing — does not accuse anyone of being unsubscribed.	2026-05-15 17:23:33 -07:00
Teknium	9818b9a1ac	fix(xai-oauth): rewrite entitlement-403 hint to not accuse subscribers (#26666 ) PR #26644 confidently told users "xAI OAuth account lacks SuperGrok / X Premium entitlement" on any 403 from xAI's permission-denied surface. But that body is returned for at least four distinct causes that Hermes cannot distinguish from the wire: * Account has no Grok subscription at all * Account has SuperGrok but the tier doesn't include the requested model (e.g. grok-4.3 needs SuperGrok Heavy) * Monthly quota for the subscribed tier is exhausted * SuperGrok is active but the API access add-on isn't enabled Don Piedro pushed back that he IS subscribed yet still hit this. Picking the worst-case interpretation ("you're not subscribed") reads as wrong and insulting to subscribers, and points them at a fix they already did. New wording lists all 4 possibilities and points at https://grok.com/?_s=usage where the user can check which applies. The detection logic and credential-pool short-circuit (PR #26664) are unchanged — only the user-facing wording is rephrased.	2026-05-15 17:15:22 -07:00
Teknium	ce0e189d3e	fix(xai-oauth): break entitlement-403 credential-refresh loop, bump grok-4.3 context to 1M (#26664 ) Don Piedro's 18-minute hang on grok-4.3 traced to two issues PR #26644 didn't cover: - _recover_with_credential_pool classifies 403 as FailoverReason.auth and calls pool.try_refresh_current(). For xAI OAuth on an unsubscribed account, refresh succeeds (mints a new token from the same account) but the next API call 403s with the same entitlement error. Result: infinite refresh → retry → 403 loop until Ctrl+C (1133s in Don's log). New _is_entitlement_failure(error_context, status_code) detects the subscription-shape body ("do not have an active Grok subscription" / "out of available resources" + grok / "does not have permission" + grok) and short-circuits recovery so _summarize_api_error surfaces PR #26644's friendly hint. - grok-4.3 resolved to 256k via the grok-4 catch-all in DEFAULT_CONTEXT_LENGTHS. Per docs.x.ai/developers/models/grok-4.3 the model ships with 1M context. Add explicit grok-4.3 entry before the grok-4 fallback (longest-first substring matching ensures grok-4.3 and grok-4.3-latest both land on the new value). Tests: 8 new (23 total in test_codex_xai_oauth_recovery.py). E2E verified Don's 100-iteration loop bails out with 0 refresh calls while genuine auth failures still refresh once and recover.	2026-05-15 17:11:06 -07:00
teknium1	cd9470f416	fix(deepseek): wire thinking-mode via DeepSeekProfile, not legacy fallback The cherry-picked PR #15251 from @tw2818 correctly identified the DeepSeek 400 root cause but placed the fix in the legacy fallback path of `build_kwargs`, which DeepSeek never reaches — DeepSeek has a registered ProviderProfile and goes through `_build_kwargs_from_profile` instead. The legacy-path block was therefore dead code. This commit pivots the fix to where it actually fires: - New `DeepSeekProfile` in `plugins/model-providers/deepseek/__init__.py` overrides `build_api_kwargs_extras` to emit DeepSeek's expected wire format (mirrors `KimiProfile`): {"reasoning_effort": "<low\|medium\|high\|max>", "extra_body": {"thinking": {"type": "enabled" \| "disabled"}}} - Model gating: only `deepseek-v4-*` and `deepseek-reasoner` emit thinking control. `deepseek-chat` (V3) is untouched — current behavior. - Effort mapping: low/medium/high passthrough, xhigh/max → max, unset → omitted (DeepSeek server applies its own default). - Revert the legacy-path additions from PR #15251 — they were dead code, and the `_copy_reasoning_content_for_api` strip block specifically would have nullified the existing reasoning_content padding machinery (`_needs_deepseek_tool_reasoning` → space-pad on replay) that the active provider already relies on for replay correctness. - Unit tests pin the wire-shape contract and the model gating rules (26 tests, all passing). Existing transport + provider profile suites (321 tests) continue to pass. - AUTHOR_MAP: map twebefy@gmail.com → tw2818 for release notes credit. Closes #15700, #17212, #17825. Co-authored-by: tw2818 <twebefy@gmail.com>	2026-05-15 17:03:26 -07:00
Teknium	31ba2b0cbc	fix(xai-oauth): recover from prelude SSE errors, gate reasoning replay, surface entitlement 403s (#26644 ) Three fixes for the May 2026 xAI OAuth (SuperGrok / X Premium) rollout failures: - _run_codex_stream: when openai SDK raises RuntimeError("Expected to have received `response.created` before `<type>`"), retry once then fall back to responses.create(stream=True) — same path used for missing-response.completed postlude. Fallback surfaces the real provider error with body+status_code intact. Also fixes #8133 (response.in_progress prelude on custom relays) and #14634 (codex.rate_limits prelude on codex-lb). - _summarize_api_error: when error body matches xAI's entitlement shape, append a one-line hint pointing to https://grok.com and /model. Once-only, applies to both auxiliary warnings and main-loop error surfacing. - _chat_messages_to_responses_input: new is_xai_responses kwarg drops replayed codex_reasoning_items (encrypted_content) before they reach xAI. Also drops reasoning.encrypted_content from the xAI include array. Native Codex behavior unchanged. Grok still reasons natively each turn; coherence rides on visible message text alone. Closes #8133, #14634.	2026-05-15 16:35:12 -07:00
teknium1	4aec25bc44	fix(windows): stop spamming cwd-missing + tirith-spawn warnings on every terminal call Two log-spam fixes surfaced by a Windows user (Git Bash + Python 3.11.9): 1. LocalEnvironment cwd warn spam ============================ Git Bash's `pwd -P` emits paths like `/c/Users/x`. The base-class `_extract_cwd_from_output` was assigning this verbatim to `self.cwd` without validation, then `_resolve_safe_cwd`'s `os.path.isdir(/c/...)` returned False on Windows, triggering: LocalEnvironment cwd '/c/Users/NVIDIA' is missing on disk; falling back to '/' so terminal commands keep working. ...on every terminal call. The pre-existing Windows-path translation inside `_run_bash` ran AFTER the safe-cwd check, so it could never prevent the warning. Fix: - New `_msys_to_windows_path` helper (idempotent, no-op off Windows). - `_resolve_safe_cwd` normalizes before `isdir`, so a valid MSYS path is recognized as the real directory it points at. - `LocalEnvironment._update_cwd` and a new override of `_extract_cwd_from_output` translate + validate before mutating `self.cwd`. Stale / non-existent marker paths roll back to the previous cwd instead of clobbering it. - The fallback warning still fires when the directory really is gone (deletion-recovery scenario from #17558 still covered). 2. tirith spawn-failed warn spam ============================= When tirith isn't installed (background install in flight, or marked failed for the day) and the configured path stays as the bare string `tirith`, every `subprocess.run([tirith_path, ...])` raises OSError and logged: tirith spawn failed: [WinError 2] The system cannot find the file specified ...on every command. fail_open=True means behaviour is correct, but the log noise is severe. Fix: - `_warn_once(key, ...)` thread-safe dedupe helper. - Three hot-path warnings (`tirith path resolved to None`, `tirith spawn failed: ...`, `tirith timed out after Ns`) now log once per (exception class, errno) / timeout-value / path-none key. - Dedupe set is cleared on `_clear_install_failed` so a successful install lets a subsequent failure surface again. Tests ===== - `tests/tools/test_local_env_windows_msys.py`: 12 tests covering the MSYS→Windows translator, the resolve fast-path, update_cwd validation, and extract_cwd_from_output rollback. - `tests/tools/test_tirith_security.py`: 4 new dedupe tests (15 spawn failures → 1 log line; distinct exc types → 2 lines; timeout dedupe; path-None dedupe). Targeted runs: test_local_env_windows_msys.py 12 passed test_local_env_cwd_recovery.py 7 passed (pre-existing, no regressions) test_tirith_security.py 67 passed (63 pre-existing + 4 new) test_base_environment + local_* 37 passed (no regressions) test_local_env_blocklist + neighbours 114 passed Reported via Hermes log capture: 19× cwd warnings + 15× tirith warnings in a single short session.	2026-05-15 16:25:31 -07:00
alt-glitch	47c0efe1c0	refactor: DRY cleanup from code review - dep_ensure.py: use get_hermes_home() instead of hand-rolled env var - dep_ensure.py: add "chrome" to browser name list (was inconsistent with browser_tool.py) - main.py _cmd_update_check: use detect_install_method() directly instead of redundant .git check - main.py _cmd_update_pip: build command list directly instead of fragile split() on display string - banner.py: rename _check_via_pypi → check_via_pypi (cross-module public API)	2026-05-15 14:45:43 -07:00
alt-glitch	99b81cd54b	feat: add `hermes postinstall` command for pip users One-shot bootstrap that installs non-Python deps (node, browser, ripgrep, ffmpeg) via ensure_dependency(), then runs setup if no provider is configured. Closes the gap between `pip install` and the full user-facing experience. Also fixes 3 pre-existing test regressions caused by earlier commits: - test_recommended_update_command: mock detect_install_method for git env - test_check_for_updates_no_git_dir: now falls back to PyPI, not None - test_plist_path_includes_node_modules_bin: skip when dir absent	2026-05-15 14:45:43 -07:00
alt-glitch	259ae846c8	feat: add ensure_dependency() wrapper + ship install.sh in wheel Includes paired change: browser tool now searches ~/.hermes/node_modules/.bin/ for agent-browser installed via install.sh --ensure browser.	2026-05-15 14:45:43 -07:00
alt-glitch	624ce11ee8	feat(config): detect pip install method and recommend correct update command Adds detect_install_method() to identify nixos/homebrew/git/pip installs, and recommended_update_command_for_method() to return the right upgrade command for each method. Updates recommended_update_command() to use these for pip-installed instances (no .git dir, not managed).	2026-05-15 14:45:43 -07:00
alt-glitch	b2bf658442	feat(tui): find bundled entry.js from wheel before falling back to npm build Add _find_bundled_tui() that checks for hermes_cli/tui_dist/entry.js (present in wheel installs) and wire it into _make_tui_argv() between the HERMES_TUI_DIR prebuilt path and the npm install fallback.	2026-05-15 14:45:43 -07:00
alt-glitch	d69eab1efd	fix(gateway): build service PATH from existing dirs only, include ~/.hermes/node_modules Extract PATH building into _build_service_path_dirs() that skips directories which don't exist on disk (e.g. node_modules/.bin for pip installs) and also includes ~/.hermes/node/bin and ~/.hermes/node_modules/.bin for agent-browser.	2026-05-15 14:45:43 -07:00
alt-glitch	384ec9684e	feat(banner): check PyPI for updates when not a git install For pip-installed hermes-agent (no .git directory), fall back to querying PyPI's JSON API to compare __version__ against the latest published release, using stdlib only (urllib + json, no packaging dep).	2026-05-15 14:45:43 -07:00
Teknium	518f39557b	fix(gateway): keep running when platforms fail; add per-platform circuit breaker + /platform (#26600 ) Stop the gateway from exiting (or systemd-restart-looping) when a single messaging adapter fails at startup or runtime. A misconfigured WhatsApp (npm install timeout, unpaired bridge, missing creds.json) used to take the entire gateway down, killing cron jobs and any other connected platforms with it. Changes: • Startup (gateway/run.py): when connected_count==0 but the only errors are retryable, log a degraded-state warning and keep the gateway alive instead of returning False. Reconnect watcher then recovers platforms as their underlying problem clears. • Runtime (gateway/run.py _handle_adapter_fatal_error): when the last adapter goes down with a retryable error and is queued for reconnection, stay alive instead of exit-with-failure. Previously this triggered systemd Restart=on-failure, which created infinite restart loops on persistent retryable failures (proxy outage, repeated bridge crashes). • Reconnect watcher (gateway/run.py _platform_reconnect_watcher): replace the 20-attempt hard drop with a circuit-breaker pause. After _PAUSE_AFTER_FAILURES (10) consecutive retryable failures, the platform stays in _failed_platforms with paused=True so the watcher skips it but the operator can still see and resume it. Non-retryable errors still drop out of the queue immediately. Resolves #17063 (gateway giving up on Telegram after 20 attempts). • WhatsApp preflight (gateway/platforms/whatsapp.py): refuse to start the Node bridge when creds.json is missing. Sets a non-retryable whatsapp_not_paired fatal error so the watcher drops it cleanly with a single 'run hermes whatsapp' log line instead of paying the 30s bridge bootstrap timeout on every gateway start. • WhatsApp setup ordering (hermes_cli/main.py cmd_whatsapp): only set WHATSAPP_ENABLED=true once pairing actually succeeds. Previously the wizard wrote the env var at step 2 (before npm install and QR pairing), so any Ctrl+C left .env claiming WhatsApp was ready when the bridge had no creds.json. Also propagate the env var when the user keeps an existing pairing on a re-run. • /platform slash command (hermes_cli/commands.py + gateway/run.py): new gateway-only command for manual circuit-breaker control. /platform list — show connected + failed/paused platforms /platform pause <name> — silence a known-broken platform /platform resume <name> — re-queue a paused platform Tests: • New: pause/resume helpers, /platform list\|pause\|resume command, WhatsApp creds.json preflight, WhatsApp setup ordering. • Updated: stale assertions that codified the old 'exit and let systemd restart' behavior in test_runner_fatal_adapter.py, test_runner_startup_failures.py, and test_platform_reconnect.py (the 20-attempt give-up test became a circuit-breaker pause test). 5488 tests pass in tests/gateway/.	2026-05-15 14:32:14 -07:00
Teknium	3b9368a0c4	fix(auth): point SSH OAuth users at the tunnel they actually need (#26592 ) Two loopback-redirect OAuth flows (xAI Grok, Spotify) silently fail when Hermes runs on a remote host: the auth server redirects to 127.0.0.1:<port> on the user's laptop, not on the remote box. The --no-browser flag only suppresses webbrowser.open() — it doesn't change the bind address. Symptom xAI surfaces is 'Could not establish connection. We couldn't reach your app.', followed by a 'xAI authorization timed out waiting for the local callback' on the CLI side. Changes - hermes_cli/auth.py: new _print_loopback_ssh_hint() helper, called from _xai_oauth_loopback_login() and _spotify_login() right after they print the redirect URI. Silent off SSH; on SSH prints the exact 'ssh -N -L <port>:127.0.0.1:<port>' command using the actually-bound port (not the hardcoded constant — the listener auto-bumps when the preferred port is busy), a provider-specific docs URL, and a link to the new shared guide. - website/docs/guides/oauth-over-ssh.md (new): single source of truth for the tunnel pattern — TL;DR command, jump-box / ProxyJump variant, mosh+tmux+ControlMaster gotchas, troubleshooting. - website/docs/guides/xai-grok-oauth.md: fix the two sections that claimed --no-browser alone was enough; link to the shared guide. - website/docs/user-guide/features/spotify.md: expand the existing one-liner; link to the shared guide. - website/sidebars.ts: register the new page. - tests/hermes_cli/test_auth_loopback_ssh_hint.py: 7 unit tests covering SSH-vs-not, loopback-vs-not, malformed URIs, port echo, with and without provider docs URL.	2026-05-15 14:27:50 -07:00
ethernet	9e67c8e8be	Merge pull request #26048 from stephenschoettler/fix/discord-e2e-history-mock test: unblock post-25957 shared CI	2026-05-15 17:21:07 -04:00
HenkDz	bd3a5873e1	fix(acp): replay native todo plans	2026-05-15 14:07:53 -07:00
HenkDz	4444d5fe4f	fix(acp): emit native plan updates for todo	2026-05-15 14:07:53 -07:00
kchantharuan	13c3d4b4ef	feat(nvidia): add NIM billing origin header	2026-05-15 14:06:51 -07:00
Teknium	4e89c53082	fix(async): close unscheduled coroutines in all threadsafe bridges (#26584 ) Wraps every sync->async coroutine-scheduling site in the codebase with a new agent.async_utils.safe_schedule_threadsafe() helper that closes the coroutine on scheduling failure (closed loop, shutdown race, etc.) instead of leaking it as 'coroutine was never awaited' RuntimeWarnings plus reference leaks. 22 production call sites migrated across the codebase: - acp_adapter/events.py, acp_adapter/permissions.py - agent/lsp/manager.py - cron/scheduler.py (media + text delivery paths) - gateway/platforms/feishu.py (5 sites, via existing _submit_on_loop helper which now delegates to safe_schedule_threadsafe) - gateway/run.py (10 sites: telegram rename, agent:step hook, status callback, interim+bg-review, clarify send, exec-approval button+text, temp-bubble cleanup, channel-directory refresh) - plugins/memory/hindsight, plugins/platforms/google_chat - tools/browser_supervisor.py (3), browser_cdp_tool.py, computer_use/cua_backend.py, slash_confirm.py - tools/environments/modal.py (_AsyncWorker) - tools/mcp_tool.py (2 + 8 _run_on_mcp_loop callers converted to factory-style so the coroutine is never constructed on a dead loop) - tui_gateway/ws.py Tests: new tests/agent/test_async_utils.py covers helper behavior under live loop, dead loop, None loop, and scheduling exceptions. Regression tests added at three PR-original sites (acp events, acp permissions, mcp loop runner) mirroring contributor's intent. Live-tested end-to-end: - Helper stress test: 1500 schedules across live/dead/race scenarios, zero leaked coroutines - Race exercised: 5000 schedules with loop killed mid-flight, 100 ok / 4900 None returns, zero leaks - hermes chat -q with terminal tool call (exercises step_callback bridge) - MCP probe against failing subprocess servers + factory path - Real gateway daemon boot + SIGINT shutdown across multiple platform adapter inits - WSTransport 100 live + 50 dead-loop writes - Cron delivery path live + dead loop Salvages PR #2657 — adopts contributor's intent over a much wider site list and a single centralized helper instead of inline try/except at each site. 3 of the original PR's 6 sites no longer exist on main (environments/patches.py deleted, DingTalk refactored to native async); the equivalent fix lives in tools/environments/modal.py instead. Co-authored-by: JithendraNara <jithendranaidunara@gmail.com>	2026-05-15 14:00:01 -07:00
teknium1	931caf2b2d	fix(env-flags): widen truthy-only session env checks to sibling sites Build on @aydnOktay's cronjob fix by routing the cronjob check through the shared 'env_var_enabled' helper in utils.py (same truthy set: 1/true/yes/on) and applying the same semantics to the 8 sibling call sites that read HERMES_INTERACTIVE / HERMES_GATEWAY_SESSION / HERMES_EXEC_ASK / HERMES_CRON_SESSION with bare os.getenv() truthy checks: - tools/approval.py: _is_gateway_approval_context (2), check_command_safety (2), check_all_command_guards (3) -- 7 sites total - tools/terminal_tool.py: _handle_sudo_failure, sudo password prompt -- 2 sites - tools/skills_tool.py: _is_gateway_surface -- 1 site Without this, a user who exports HERMES_INTERACTIVE=0 in their shell still gets interactive sudo prompts, approval prompts, and gateway skill-install paths -- only the cronjob tool was hardened. Now all consumers agree on the same false-like values. Also drops the duplicate _is_truthy_env helper from cronjob_tools.py in favour of the existing canonical utils.env_var_enabled. Tests: extend the parametrized regression coverage to all three session env vars (HERMES_INTERACTIVE / HERMES_GATEWAY_SESSION / HERMES_EXEC_ASK) symmetrically. tests/tools/test_cronjob_tools.py: 60/60 pass; tests/tools/{approval,terminal_tool,skills_tool, cron_approval_mode,hardline_blocklist}.py: 378/378 pass.	2026-05-15 12:35:07 -07:00
aydnOktay	734aa0f367	fix(cronjob): require explicit truthy session env values	2026-05-15 12:35:07 -07:00
Jaaneek	7d7cdd48e0	test(xai-oauth): use grok-4.3 instead of retiring grok-code-fast-1 Per @mark-xai's review on PR #26457 and the xAI model retirement on 2026-05-15: grok-code-fast-1 is being retired today and aliases redirect to grok-4.3 (already pinned to the top of the xAI model list by this PR). Update the two xAI Responses-API test fixtures Mark flagged plus the picker fallback default in hermes_cli/main.py that uses the same literal.	2026-05-15 12:11:32 -07:00
Jaaneek	7fdc16dd4a	refactor(transports/codex): trim duplicated cache-key comments The xAI prompt_cache_key block carried two long comment paragraphs that either restated setdefault semantics, narrated the SDK type-validation mechanism, or recapped the historical motivation for the extra_body indirection — all already covered by the test docstring at test_xai_responses_sends_cache_key_via_extra_body (which links to the xAI docs). Also restored the truncated link in the body-injection comment. No behavior change.	2026-05-15 12:11:32 -07:00
Jaaneek	e13c1b8060	fix(xai-http): preserve ~/.hermes/.env fallback and XAI_STT_BASE_URL precedence The new resolve_xai_http_credentials() resolver was using os.getenv() for the XAI_API_KEY/XAI_BASE_URL fallback path, which dropped the ~/.hermes/.env contract guarded by PR #17140 / #17163. Users with XAI_API_KEY in dotenv only would see "No xAI credentials found" even though the key was configured. Separately, _transcribe_xai started consulting creds["base_url"] (which always returns at least the default https://api.x.ai/v1) ahead of the public XAI_STT_BASE_URL env override, so the per-tool override stopped working. - tools/xai_http.py: add module-level get_env_value() wrapper that reads ~/.hermes/.env first (via hermes_cli.config.get_env_value), then os.environ. Resolver uses it for the API-key/base-url fallback. - tools/transcription_tools.py: restore precedence so XAI_STT_BASE_URL wins over creds["base_url"]. - tests/tools/test_transcription_dotenv_fallback.py + tests/tools/test_tts_dotenv_fallback.py: repoint the per-call-site patches at the new resolution point (tools.xai_http.get_env_value). The end-to-end regression-guard test (which patches load_env) is unchanged and still passes.	2026-05-15 12:11:32 -07:00
Jaaneek	e4d7a5dffa	fix(tools): video_gen picker reflects active xAI selection and runs xai_grok post_setup Two bugs in the `hermes tools` reconfigure flow caused picking xAI Grok Imagine for video_gen (or image_gen) to feel like a no-op: 1. `_is_provider_active()` had a branch for `image_gen_plugin_name` but none for `video_gen_plugin_name`, so a row marked as the active xAI video provider was never recognized as active. The picker fell through to the env-var fallback in `_detect_active_provider_index()`, which matched the FAL row (because `FAL_KEY` is set), so the picker visually defaulted to FAL even though the user had selected xAI. 2. `_plugin_video_gen_providers()` and `_plugin_image_gen_providers()` built picker rows from the plugin's `get_setup_schema()` but only copied `name`, `badge`, `tag`, `env_vars`. The xAI plugins declare `post_setup: "xai_grok"` so the picker should run the OAuth / API-key prompt hook after selection — that key was silently dropped, so the hook never fired from the picker rows. Adds the missing `video_gen_plugin_name` branch (placed before the `managed_nous_feature` block, mirroring the existing image_gen branch) and propagates `post_setup` from the plugin schema into both picker-row builders. Adds focused tests in `test_video_gen_picker.py` and `test_image_gen_picker.py`.	2026-05-15 12:11:32 -07:00
Jaaneek	b62c997973	feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider Adds a new authentication provider that lets SuperGrok subscribers sign in to Hermes with their xAI account via the standard OAuth 2.0 PKCE loopback flow, instead of pasting a raw API key from console.x.ai. Highlights ---------- * OAuth 2.0 PKCE loopback login against accounts.x.ai with discovery, state/nonce, and a strict CORS-origin allowlist on the callback. * Authorize URL carries `plan=generic` (required for non-allowlisted loopback clients) and `referrer=hermes-agent` for best-effort attribution in xAI's OAuth server logs. * Token storage in `auth.json` with file-locked atomic writes; JWT `exp`-based expiry detection with skew; refresh-token rotation synced both ways between the singleton store and the credential pool so multi-process / multi-profile setups don't tear each other's refresh tokens. * Reactive 401 retry: on a 401 from the xAI Responses API, the agent refreshes the token, swaps it back into `self.api_key`, and retries the call once. Guarded against silent account swaps when the active key was sourced from a different (manual) pool entry. * Auxiliary tasks (curator, vision, embeddings, etc.) route through a dedicated xAI Responses-mode auxiliary client instead of falling back to OpenRouter billing. * Direct HTTP tools (`tools/xai_http.py`, transcription, TTS, image-gen plugin) resolve credentials through a unified runtime → singleton → env-var fallback chain so xai-oauth users get them for free. * `hermes auth add xai-oauth` and `hermes auth remove xai-oauth N` are wired through the standard auth-commands surface; remove cleans up the singleton loopback_pkce entry so it doesn't silently reinstate. * `hermes model` provider picker shows "xAI Grok OAuth (SuperGrok Subscription)" and the model-flow falls back to pool credentials when the singleton is missing. Hardening --------- * Discovery and refresh responses validate the returned `token_endpoint` host against the same `.x.ai` allowlist as the authorization endpoint, blocking MITM persistence of a hostile endpoint. Discovery / refresh / token-exchange `response.json()` calls are wrapped to raise typed `AuthError` on malformed bodies (captive portals, proxy error pages) instead of leaking JSONDecodeError tracebacks. * `prompt_cache_key` is routed through `extra_body` on the codex transport (sending it as a top-level kwarg trips xAI's SDK with a TypeError). * Credential-pool sync-back preserves `active_provider` so refreshing an OAuth entry doesn't silently flip the active provider out from under the running agent. Testing ------- * New `tests/hermes_cli/test_auth_xai_oauth_provider.py` (~63 tests) covers JWT expiry, OAuth URL params (plan + referrer), CORS origins, redirect URI validation, singleton↔pool sync, concurrency races, refresh error paths, runtime resolution, and malformed-JSON guards. * Extended `test_credential_pool.py`, `test_codex_transport.py`, and `test_run_agent_codex_responses.py` cover the pool sync-back, `extra_body` routing, and 401 reactive refresh paths. * 165 tests passing on this branch via `scripts/run_tests.sh`.	2026-05-15 12:11:32 -07:00
Siddharth Balyan	d5416284f1	fix(tui): autonomous background process completion notifications (#26071 ) (#26327 ) * feat(process-registry): add format_process_notification shared helper * feat(process-registry): add drain_notifications method * refactor(cli): use shared drain_notifications and format_process_notification * feat(tui): add background notification poller for completion_queue * feat(tui): wire notification poller into session init/finalize * refactor(tui): add post-turn drain using shared helper as safety net	2026-05-15 19:31:00 +05:30
kshitij	db84a78e61	fix(langfuse): complete observability fix — trace I/O, tool outputs, placeholder credentials (closes #22342 , #22763 ) (#26320 ) * fix(langfuse): reject placeholder credentials with one-shot warning When operators leave HERMES_LANGFUSE_PUBLIC_KEY / HERMES_LANGFUSE_SECRET_KEY at a template value like 'placeholder', 'test-key', or 'your-langfuse-key', the Langfuse SDK silently accepts the credentials at construction time and drops every trace at flush time. No warning, no error — just an empty Langfuse dashboard the operator only notices hours later. Add prefix-based validation in _get_langfuse() against the documented 'pk-lf-' / 'sk-lf-' prefixes that Langfuse always issues server-side. Anything else fires a single warning naming the offending env var(s) with a log-safe value preview (full string for short placeholders so the operator knows which template they left in place; truncated for long values so a real secret pasted into the wrong field never hits the log), then short-circuits via the existing _INIT_FAILED cache so the warning fires once per process, not once per hook invocation. The check sits after the 'Langfuse is None' SDK-installed guard so hosts without the optional langfuse SDK don't see misleading 'set real keys' hints when the actionable fix is 'pip install langfuse'. Missing credentials remains the documented opt-out path and stays silent — no log noise for unconfigured installs. Fixes #22763 Fixes #23823 * fix(langfuse): use actual API request messages for generation input on_pre_llm_request previously used the messages kwarg alone, which could be None when Hermes passes the payload via request_messages, conversation_history, or user_message instead. Add _coerce_request_messages to pick the first available list across all variants, falling back to a synthetic user message. Generations now show the real outbound payload rather than an empty input. * fix(langfuse): record tool call outputs in traces Tool observations showed input (arguments) but output was always undefined. Root cause: when tool_call_id is empty, pre_tool_call stored observations under a unique time-based key that post_tool_call could never reconstruct, so every tool span was closed without output by the _finish_trace sweep. Fix pre/post matching by routing empty-tool_call_id tools through a per-name FIFO queue (pending_tools_by_name) instead of the time-based key. Tools with a tool_call_id continue to use the id-keyed dict. Also: - Preserve OpenAI-style nested function shape in serialized tool calls so Langfuse renders name/arguments correctly - Keep name + tool_call_id on role:tool messages for proper pairing - Backfill tool results onto the matching turn_tool_calls entry so the generation's tool-call record carries the result alongside arguments - Coerce request messages from whichever field the runtime provides (request_messages, messages, conversation_history, user_message) * fix(langfuse): salvage-review polish — drop dead is_first_turn, shallow-copy request_messages, real threaded FIFO test Self-review of the combined #22345 + #23831 salvage surfaced three issues worth fixing in the same PR rather than as follow-ups: 1. Drop is_first_turn from the pre_api_request hook. The boolean expression `not bool(conversation_history)` was wrong: conversation_history is reassigned to None mid-run after compression (5 sites in run_agent.py), so the value flips False -> True mid-conversation on every post-compression API call. The langfuse plugin never consumed it, so the kwarg was both misleading AND dead. 2. Replace copy.deepcopy(request_messages) with shallow list() copy. The pre_api_request hook contract discards return values (invoke_hook never writes back to api_kwargs), and the langfuse plugin's _serialize_messages already builds its own snapshot dicts via _safe_value. A deepcopy on every API call would walk every tool result and base64 image — significant overhead for no real isolation benefit. Shallow copy of the outer list protects against later mutations of api_messages without paying for the inner-dict walk. 3. Rename test_empty_tool_call_id_concurrent_fifo_order -> test_empty_tool_call_id_observations_are_fifo_within_tool_name and add a real test_threaded_post_calls_preserve_fifo_under_lock that spawns 8 threads behind a barrier to actually exercise _STATE_LOCK on the pending_tools_by_name queue. The original test was sequential and only validated Python list semantics; this one validates the lock discipline. 4. Fix stale 'Cleared by reset_cache_for_tests()' comment on _INIT_FAILED — that function does not exist. Tests reload the module via sys.modules.pop + importlib.import_module instead. Tests: 37 langfuse plugin tests pass, 658 plugin tests overall pass. --------- Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com> Co-authored-by: Brian Conklin <brian@dralth.com>	2026-05-15 05:04:02 -07:00
kshitijk4poor	77276070f5	fix(codex-runtime): de-dup [plugins.X] tables and stop leaking HERMES_HOME into config.toml Builds on @steezkelly's Bug A fix (#25857, top-level default_permissions via _insert_managed_block_at_top_level) by addressing the other two config-corruption bugs described in #26250: Bug B (duplicate [plugins.X] tables) - Codex itself writes [plugins."<name>@<marketplace>"] tables to config.toml when the user runs `codex plugins enable` directly, before hermes-agent's managed block exists. On the next migrate run, _query_codex_plugins() re-discovers the same plugins via plugin/list and render_codex_toml_section() re-emits them inside the managed block. Codex's strict TOML parser then rejects the duplicate table header on startup. - Add _strip_unmanaged_plugin_tables() that drops [plugins.] tables from the user-content portion of the file. Only run it when plugin/list succeeded — if the RPC failed we can't re-emit and must preserve the user's tables. plugin/list is the source of truth when it answers. Bug C (HERMES_HOME pytest-tempdir leak into ~/.codex/config.toml) - _build_hermes_tools_mcp_entry() read HERMES_HOME directly from os.environ, so a sibling pytest's monkeypatch.setenv("HERMES_HOME", tmp_path) silently burned a transient pytest tempdir into the user's real ~/.codex/config.toml. After pytest reaped the tempdir, every codex-routed hermes-tools tool call failed silently. - Derive HERMES_HOME from get_hermes_home() (the canonical resolver that goes through the profile-aware path) and refuse to emit obvious test-tempdir paths via _looks_like_test_tempdir() as belt-and-suspenders for any other callsite that forgets to patch migrate(). - test_enable_succeeds_when_codex_present in test_codex_runtime_switch.py invoked the real migrate() (no mock), writing to Path.home() / .codex using whatever HERMES_HOME the running pytest session had set. Add the same migrate patch the other apply() tests already use, so the suite stops touching the user's real ~/.codex/config.toml. E2E verification (replicating the issue's repro): - Pre-state config.toml with user [mcp_servers.omx_team_run] + codex-installed [plugins."tasks@openai-curated"], HERMES_HOME="/private/var/folders/.../pytest-of-.../..." - On origin/main: tomllib refuses to load the result with "Cannot declare ('plugins', 'tasks@openai-curated') twice" AND the pytest-tempdir HERMES_HOME is burned in. - On this branch: file parses cleanly, default_permissions is top-level, exactly one [plugins."tasks@openai-curated"] table inside the managed block, no HERMES_HOME in the MCP env. 7 new regression tests covering all three bugs + the test-leak guard. `bash scripts/run_tests.sh tests/hermes_cli/test_codex_runtime_.py` — 95 passed, 0 failed. Closes #26250	2026-05-15 02:31:30 -07:00
Steve Kelly	274217316e	fix(codex-runtime): keep migrated root keys top-level	2026-05-15 02:31:30 -07:00
aydnOktay	6af9942327	fix(url-safety): allow only http and https schemes	2026-05-15 01:52:48 -07:00
InB4DevOps	4f8aaf1046	perf(run_agent): accumulate length-continuation prefix via list+join Replace O(n²) string concatenation of truncated_response_prefix in the length-continuation retry loop with a list + ''.join(). Functionally equivalent: same partial response on early return, same prepend on final assembly. The legacy retry path is capped at 3 iterations, so the practical wall-clock win is small, but the new idiom matches the rest of the codebase and removes a needless repeated allocation. Salvaged from PR #2717 (the run_conversation portion only — trajectory refactor dropped because it silently rewrote </tool_response> to </think>). Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-15 01:42:08 -07:00
Mibayy	09d9724a09	feat(gateway): add SimpleX Chat platform plugin SimpleX Chat (https://simplex.chat) is a private, decentralised messenger with no persistent user IDs — every contact is identified by an opaque internal ID generated at connection time. This adds it as a Hermes gateway platform via the plugin system. The adapter connects to a local simplex-chat daemon via WebSocket, listens for inbound messages, and sends replies. Originally proposed in PR #2558 as a core-modifying integration; reshaped here as a self- contained plugin under plugins/platforms/simplex/ with no edits to any core file. Discovery is filesystem-based (scanned by gateway.config), and the platform identity is resolved on demand via Platform("simplex"). Plugin contract: - check_requirements() requires SIMPLEX_WS_URL AND the websockets package - validate_config() / is_connected() accept env or config.yaml input - _env_enablement() seeds PlatformConfig.extra (ws_url + home_channel) - _standalone_send() supports out-of-process cron delivery - interactive_setup() provides a stdin wizard for hermes gateway setup - register() wires the adapter into the registry with required_env, install_hint, cron_deliver_env_var, allowed_users_env, and a platform_hint for the LLM. Lazy dependency: the websockets Python package is imported inside the functions that need it. The plugin is importable and discoverable even when websockets is missing — check_requirements() simply returns False until `pip install websockets` is run. No new pyproject extras are introduced. Environment variables: SIMPLEX_WS_URL WebSocket URL of the daemon (required) SIMPLEX_ALLOWED_USERS Comma-separated allowed contact IDs SIMPLEX_ALLOW_ALL_USERS Set true to allow all contacts SIMPLEX_HOME_CHANNEL Default contact for cron delivery SIMPLEX_HOME_CHANNEL_NAME Human label for the home channel Closes #2557.	2026-05-15 01:41:30 -07:00
teknium1	85782a4ed7	feat(acp): hermes acp --setup-browser bootstraps browser tools for registry installs The Zed ACP Registry path (uvx --from 'hermes-agent[acp]==X' hermes-acp) gets a Python-only install. Browser tools depend on the agent-browser npm package + Chromium, neither of which are in the wheel. Without an explicit bootstrap, registry users have no path to working browser tools. Ship a bundled, idempotent bootstrap script (Linux/macOS bash + Windows PowerShell) inside acp_adapter/bootstrap/ as wheel package-data. New entry points: hermes acp --setup-browser # interactive; prompts before Chromium download hermes acp --setup-browser --yes # non-interactive hermes-acp --setup-browser The terminal-auth flow (hermes acp --setup) also offers the browser bootstrap as a follow-up after model selection, so first-run registry users get the option without knowing the flag exists. Key design choices: - npm install -g --prefix $NODE_PREFIX so we never need sudo. System Node on PATH is respected; only the install target is redirected to the user-writable Hermes-managed Node prefix. - tools/browser_tool.py::_browser_candidate_path_dirs() already walks $HERMES_HOME/node/bin, so installed binaries are discovered with no agent-side code change. - System Chrome/Chromium detection short-circuits the ~400 MB Playwright download when a suitable browser already exists. - Bash + PowerShell live as ONE copy each under acp_adapter/bootstrap/. Not duplicated under scripts/. install.sh and install.ps1 keep their inline browser blocks for the source-checkout path. E2E validated end-to-end: bash bootstrap_browser_tools.sh --skip-chromium → installs agent-browser into ~/.hermes/node/bin/ tools.browser_tool._find_agent_browser() → returns the installed path check_browser_requirements() → returns True (browser tools register) Tests: - tests/acp/test_entry.py: 11 tests covering --setup-browser dispatch (linux + windows + --yes forwarding + failure propagation), the terminal-auth follow-up prompt path, and a package-data wheel-shipping assertion that catches any future pyproject.toml regression. Docs: website/docs/user-guide/features/acp.md gains a 'Browser tools (optional)' subsection with the two-line install + what-it-does.	2026-05-15 01:38:24 -07:00
buntingszn	6682f91b80	feat(cron): support name-based lookup for job operations Cron mutation operations (run/pause/resume/remove) and 'hermes cron edit' now accept a job name in addition to the hex ID, with case-insensitive matching. Before this, 'hermes cron run my_job_name' died with 'Job with ID my_job_name not found' and forced the user to look up the hex ID first. The original PR matched by name but silently picked the first match when two jobs shared a name. This version refuses to act on an ambiguous name and surfaces every matching job (id, name, schedule, next_run_at) so the caller can pick a specific ID. - cron/jobs.py: - get_job() stays ID-only (preserves existing call-site semantics for web_server/api_server/curator/scheduler/test code that always passes real IDs). - resolve_job_ref() is the new name-or-ID resolver, used by pause/ resume/trigger/remove_job. Exact ID match wins over a name match even if a different job's name happens to equal that ID. Ambiguous name match raises AmbiguousJobReference with all candidate IDs. - tools/cronjob_tools.py: dispatch site uses resolve_job_ref, surfaces ambiguous matches as a structured error with the matching IDs. - hermes_cli/cron.py: 'cron edit' uses resolve_job_ref so editing by name works and ambiguous names are reported with IDs. - tests/cron/test_jobs.py: new TestResolveJobRef covering ID match, case-insensitive name match, ID-wins-over-name, ambiguous refusal, and that pause/resume/trigger/remove all refuse on ambiguity. Closes #2627	2026-05-15 01:36:03 -07:00
Teknium	9329e06696	feat(image-gen): actionable setup message when no FAL backend is reachable (#26222 ) When the in-tree FAL path has no API key (and no managed gateway), the handler used to return a bare 'FAL_KEY environment variable not set' error. Users had no idea where to get a key, that a managed Nous gateway exists, or that plugin-registered providers are an option. Now `image_generate_tool` returns a structured multi-line message: - signup link (https://fal.ai) - managed-gateway status (if Nous tools are enabled) - pointer to `hermes tools` / `hermes plugins list` for alternate backends, so users on a stale `image_gen.provider` know where to look The schema is untouched — `check_fn` still gates the tool out of the schema when no backend is reachable at startup, consistent with every other conditional tool. This patch fixes the call-time failure modes: managed-gateway 5xx, plugin provider disappearing mid-session, etc. Inspired by #2546 / @Mibayy. The PR was ~5700 commits stale against the new plugin-aware image_gen architecture, so this is a forward port of the actionable-error idea rather than a cherry-pick. Closes #2543 Co-authored-by: Mibayy <mibayy@users.noreply.github.com>	2026-05-15 01:33:13 -07:00
CoinTheHat	814c60092b	fix: clean stale conversation mappings on response eviction/deletion ResponseStore.put() and .delete() now remove conversations rows that reference evicted or deleted response IDs, preventing 404 errors when a conversation name is reused after its backing response was purged. Adds regression tests for delete, eviction, and handler-level reuse. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 01:27:43 -07:00
teyrebaz33	e0e7397c32	fix(session): persist auto-reset state across gateway restarts was_auto_reset, auto_reset_reason, and reset_had_activity were not included in SessionEntry.to_dict() / from_dict(), so a gateway restart between session expiry and the user's next message would silently drop the auto-reset notification and context note. Add the three fields to the serialization roundtrip with safe defaults (False / None / False) so existing sessions.json files load cleanly. Add three roundtrip tests to test_session_reset_notify.py.	2026-05-15 01:25:42 -07:00
Teknium	965ae7fa97	revert(cli): drop scrollback box width clamp (#25975 ), restore full-width borders (#26163 ) #25975 (salvaging #24403) clamped decorative scrollback Panels and streaming box rules to `max(32, min(width, 56))` as a defense against terminal-emulator reflow when columns shrink. On any modern wide terminal this made the response/reasoning borders look stubby — 56 cols inside a 200-col viewport. #26137 (salvaging #25981, by @OutThisLife) landed a more fundamental fix: prompt_toolkit's `_output_screen_diff` is monkey-patched so its reserve-vertical-space cursor move no longer pushes chrome into scrollback at all. With that in place, the clamp is no longer load-bearing for the chrome-into-scrollback class of bugs — the remaining risk is purely cosmetic reflow of already stamped Panel borders during an aggressive column shrink, which we now accept as a tradeoff for restoring proper full-width rendering. Changes: - `_scrollback_box_width()` returns `max(32, width)` (just the floor, no upper cap). All 10 call sites stay valid. - Updated `test_scrollback_box_width_caps_to_resize_safe_value` to the new `test_scrollback_box_width_returns_viewport_width` asserting full-width passthrough above the 32-col floor. Floor of 32 is kept so `'─' * (w - 2)` math stays positive on tiny terminals. Refs #18449 #19280 #22976 (the original reflow class) and #25975 (the clamp this reverts).	2026-05-14 23:30:16 -07:00
teknium1	cbd1f8e4be	test(cli): cover light-mode detection + SkinConfig.get_color remap Adds 16 unit tests covering the light/dark terminal detection path introduced in the previous commit: - Env override priority (HERMES_LIGHT, HERMES_TUI_LIGHT, HERMES_TUI_THEME, HERMES_TUI_BACKGROUND, COLORFGBG) - Detection cache stickiness - _maybe_remap_for_light_mode() no-op in dark mode - Known dark-mode color remap (#FFF8DC -> #1A1A1A etc) - Case-insensitive lookup - Unknown color passthrough - Status-bar paired colors (#C0C0C0, #888888, #555555, #8B8682) are intentionally NOT remapped — regression guard for the patch-11 fix, since remapping them would produce dark-on-dark on the status bar's navy bg - SkinConfig.get_color() wrapper is installed and idempotent - SkinConfig.get_color() does remap in light mode and passes through in dark mode We don't try to fake an OSC 11 reply — that path is exercised end-to-end in real Terminal.app; the env-override path covers the algorithmic logic.	2026-05-14 23:23:32 -07:00
teknium1	c8c6ce1731	feat(acp-registry): switch to uvx distribution, drop npm launcher The ACP Registry schema supports uvx as a first-class distribution method alongside npx and binary. Pointing the registry directly at the existing hermes-agent PyPI release removes: - the @nousresearch npm scope (we don't own it) - a separate npm publish step on every weekly release - 90 lines of Node launcher + tests in packages/hermes-agent-acp/ The Zed registry now installs Hermes via: uvx --from 'hermes-agent[acp]==<version>' hermes-acp This is the same command the npm launcher was shelling out to anyway, so end-user behavior is unchanged. Registry CI validates the PyPI URL + version-pin exact match automatically. Changes: - acp_registry/agent.json: distribution.npx -> distribution.uvx - delete packages/hermes-agent-acp/ entirely - scripts/release.py: drop npm-launcher bump paths, keep manifest lockstep - tests/acp/test_registry_manifest.py: assert uvx shape + version pin - tests/scripts/test_release_acp_registry.py: rewrite for uvx-only shape - docs (user-guide + dev-guide): drop all npm-launcher references - delete docs/plans/acp-registry-zed-integration.md (stale, npm-shaped) Validated against agentclientprotocol/registry agent.schema.json via jsonschema. hermes-agent==0.13.0 is already live on PyPI.	2026-05-14 22:27:09 -07:00
Siddharth Balyan	5af672c753	chore: remove Atropos RL environments and tinker-atropos integration (#26106 ) * chore: remove Atropos RL environments, tools, tests, skill, and tinker-atropos submodule Delete: - environments/ (43 files — base env, agent loop, tool call parsers, benchmarks) - rl_cli.py (standalone RL training CLI) - tools/rl_training_tool.py (all 10 rl_* tools) - tests: test_rl_training_tool, test_tool_call_parsers, test_managed_server_tool_support, test_agent_loop, test_agent_loop_vllm, test_agent_loop_tool_calling, test_terminalbench2_env_security - optional-skills/mlops/hermes-atropos-environments/ - tinker-atropos git submodule + .gitmodules * chore: remove RL/Atropos references from Python source - toolsets.py: remove rl toolset block + update comment - model_tools.py: remove rl_tools group + update async bridging comment - hermes_cli/tools_config.py: remove RL display entry, _DEFAULT_OFF_TOOLSETS, setup block, and rl_training post-setup handler - tools/budget_config.py: remove RL environment reference in docstring - tests/test_model_tools.py: remove rl_tools from expected groups - tests/run_agent/test_streaming_tool_call_repair.py: fix stale cross-reference * chore: remove rl/yc-bench extras and tinker-atropos refs from pyproject.toml - Remove rl extra (atroposlib, tinker, fastapi, uvicorn, wandb) - Remove yc-bench extra - Remove rl_cli from py-modules - Remove [tool.ty.src] exclude for tinker-atropos - Remove [tool.ruff] exclude for tinker-atropos - Regenerate uv.lock * chore: remove tinker-atropos from install/setup scripts - setup-hermes.sh: remove entire tinker-atropos submodule install block - scripts/install.sh: remove both tinker-atropos blocks (Termux + standard) - scripts/install.ps1: remove tinker-atropos block - nix/hermes-agent.nix: remove tinker-atropos pip install line * chore: remove RL references from cli-config.yaml.example * docs: remove Atropos/RL references from README, CONTRIBUTING, AGENTS.md * docs: remove RL/Atropos references from website - Delete: environments.md, rl-training.md, mlops-hermes-atropos-environments.md - sidebars.ts: remove rl-training and environments sidebar entries - optional-skills-catalog.md: remove hermes-atropos-environments row - tools-reference.md: remove entire rl toolset section - toolsets-reference.md: remove rl row + update example - integrations/index.md: remove RL Training bullet - architecture.md: remove environments/ from tree + RL section - contributing.md: remove tinker-atropos setup - updating.md: remove tinker-atropos install + stale submodule update * chore: remove remaining RL/Atropos stragglers - hermes_cli/config.py: remove TINKER_API_KEY + WANDB_API_KEY env var defs - hermes_cli/doctor.py: remove Submodules check section (tinker-atropos) - hermes_cli/setup.py: remove RL Training status check - hermes_cli/status.py: remove Tinker + WandB from API key status display - agent/display.py: remove both rl_* tool preview/activity blocks - website/docs: remove RL references from providers.md + env-variables.md - tests: remove TINKER_API_KEY from conftest, set_config_value, setup_script * chore: remove RL training section from .env.example	2026-05-15 10:36:38 +05:30
teknium1	d364132114	chore(release): bump ACP Registry assets in lockstep with pyproject Some checks failed Deploy Site / deploy-vercel (push) Waiting to run Details Deploy Site / deploy-docs (push) Waiting to run Details Docker Build and Publish / build-amd64 (push) Waiting to run Details Docker Build and Publish / build-arm64 (push) Waiting to run Details Docker Build and Publish / merge (push) Blocked by required conditions Details Docker Build and Publish / move-main (push) Blocked by required conditions Details Docker Build and Publish / move-latest (push) Blocked by required conditions Details Lint (ruff + ty) / ruff + ty diff (push) Waiting to run Details Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run Details Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run Details Nix / nix (macos-latest) (push) Waiting to run Details Nix / nix (ubuntu-latest) (push) Waiting to run Details OSV-Scanner / Scan lockfiles (push) Waiting to run Details Tests / test (push) Waiting to run Details Tests / e2e (push) Waiting to run Details uv.lock check / uv lock --check (push) Waiting to run Details Nix Lockfile Fix / auto-fix-main (push) Has been cancelled Details Nix Lockfile Fix / fix (push) Has been cancelled Details The ACP Registry manifest (acp_registry/agent.json), the npm launcher package.json, and the launcher's HERMES_AGENT_VERSION constant must all match pyproject.toml exactly — tests/acp/test_registry_manifest.py enforces this lockstep. Without a release-script hook, the next weekly version bump fails that test until someone hand-edits four files. Extend update_version_files() to drive the ACP bump alongside __init__.py and pyproject.toml, and add tests covering the lockstep and the missing-files no-op path. Also map adam.manning@gmail.com -> am423 for the salvage commit.	2026-05-14 20:26:02 -07:00
mr-r0b0t	4c94396206	feat: add ACP registry metadata for Zed	2026-05-14 20:26:02 -07:00

1 2 3 4 5 ...

3756 commits