hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-03 12:23:08 +00:00

Author	SHA1	Message	Date
kshitij	db84a78e61	fix(langfuse): complete observability fix — trace I/O, tool outputs, placeholder credentials (closes #22342, #22763) (#26320) * fix(langfuse): reject placeholder credentials with one-shot warning When operators leave HERMES_LANGFUSE_PUBLIC_KEY / HERMES_LANGFUSE_SECRET_KEY at a template value like 'placeholder', 'test-key', or 'your-langfuse-key', the Langfuse SDK silently accepts the credentials at construction time and drops every trace at flush time. No warning, no error — just an empty Langfuse dashboard the operator only notices hours later. Add prefix-based validation in _get_langfuse() against the documented 'pk-lf-' / 'sk-lf-' prefixes that Langfuse always issues server-side. Anything else fires a single warning naming the offending env var(s) with a log-safe value preview (full string for short placeholders so the operator knows which template they left in place; truncated for long values so a real secret pasted into the wrong field never hits the log), then short-circuits via the existing _INIT_FAILED cache so the warning fires once per process, not once per hook invocation. The check sits after the 'Langfuse is None' SDK-installed guard so hosts without the optional langfuse SDK don't see misleading 'set real keys' hints when the actionable fix is 'pip install langfuse'. Missing credentials remains the documented opt-out path and stays silent — no log noise for unconfigured installs. Fixes #22763 Fixes #23823 * fix(langfuse): use actual API request messages for generation input on_pre_llm_request previously used the messages kwarg alone, which could be None when Hermes passes the payload via request_messages, conversation_history, or user_message instead. Add _coerce_request_messages to pick the first available list across all variants, falling back to a synthetic user message. Generations now show the real outbound payload rather than an empty input.
* fix(langfuse): record tool call outputs in traces Tool observations showed input (arguments) but output was always undefined. Root cause: when tool_call_id is empty, pre_tool_call stored observations under a unique time-based key that post_tool_call could never reconstruct, so every tool span was closed without output by the _finish_trace sweep. Fix pre/post matching by routing empty-tool_call_id tools through a per-name FIFO queue (pending_tools_by_name) instead of the time-based key. Tools with a tool_call_id continue to use the id-keyed dict. Also: - Preserve OpenAI-style nested function shape in serialized tool calls so Langfuse renders name/arguments correctly - Keep name + tool_call_id on role:tool messages for proper pairing - Backfill tool results onto the matching turn_tool_calls entry so the generation's tool-call record carries the result alongside arguments - Coerce request messages from whichever field the runtime provides (request_messages, messages, conversation_history, user_message) * fix(langfuse): salvage-review polish — drop dead is_first_turn, shallow-copy request_messages, real threaded FIFO test Self-review of the combined #22345 + #23831 salvage surfaced three issues worth fixing in the same PR rather than as follow-ups: 1. Drop is_first_turn from the pre_api_request hook. The boolean expression `not bool(conversation_history)` was wrong: conversation_history is reassigned to None mid-run after compression (5 sites in run_agent.py), so the value flips False -> True mid-conversation on every post-compression API call. The langfuse plugin never consumed it, so the kwarg was both misleading AND dead. 2. Replace copy.deepcopy(request_messages) with shallow list() copy. The pre_api_request hook contract discards return values (invoke_hook never writes back to api_kwargs), and the langfuse plugin's _serialize_messages already builds its own snapshot dicts via _safe_value. 
A deepcopy on every API call would walk every tool result and base64 image — significant overhead for no real isolation benefit. Shallow copy of the outer list protects against later mutations of api_messages without paying for the inner-dict walk. 3. Rename test_empty_tool_call_id_concurrent_fifo_order -> test_empty_tool_call_id_observations_are_fifo_within_tool_name and add a real test_threaded_post_calls_preserve_fifo_under_lock that spawns 8 threads behind a barrier to actually exercise _STATE_LOCK on the pending_tools_by_name queue. The original test was sequential and only validated Python list semantics; this one validates the lock discipline. 4. Fix stale 'Cleared by reset_cache_for_tests()' comment on _INIT_FAILED — that function does not exist. Tests reload the module via sys.modules.pop + importlib.import_module instead. Tests: 37 langfuse plugin tests pass, 658 plugin tests overall pass. --------- Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com> Co-authored-by: Brian Conklin <brian@dralth.com>	2026-05-15 05:04:02 -07:00
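The pre/post matching described in the tool-output fix above can be sketched as follows. This is a minimal illustration, not the plugin's code: the function names, the dict-shaped observations, and the module-level lock are assumptions made for the example. Only the idea of an id-keyed dict plus a per-name FIFO queue for calls without a tool_call_id comes from the commit message.

```python
import threading
from collections import defaultdict, deque

_lock = threading.Lock()                  # stands in for _STATE_LOCK
pending_by_id = {}                        # tool_call_id -> observation
pending_by_name = defaultdict(deque)      # tool name -> FIFO of observations


def pre_tool_call(name, tool_call_id, observation):
    """Remember an open observation until the matching post call arrives."""
    with _lock:
        if tool_call_id:
            pending_by_id[tool_call_id] = observation
        else:
            # No id to key on: queue under the tool name, oldest first.
            pending_by_name[name].append(observation)


def post_tool_call(name, tool_call_id, output):
    """Attach the output to the matching observation, or return None."""
    with _lock:
        if tool_call_id:
            observation = pending_by_id.pop(tool_call_id, None)
        else:
            queue = pending_by_name.get(name)
            observation = queue.popleft() if queue else None
    if observation is not None:
        observation["output"] = output
    return observation
```

With a time-based key the post hook has nothing to reconstruct the key from; with a FIFO queue per tool name it only needs the name, and calls to the same tool are closed in the order they were opened.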
kshitij	f199cd9f84	chore(release): map brian@dralth.com to btorresgil for #22345 salvage (#26319 ) PR #22345 by @btorresgil authors commits as 'Brian Conklin <brian@dralth.com>' (git config carries a different name/email than the GitHub account). GitHub's commit-author mapping correctly attributes these commits to @btorresgil based on the public-key registration, but Hermes' release attribution audit reads the raw commit email, not the GitHub mapping. Without this AUTHOR_MAP entry, salvaging #22345 would fail `scripts/contributor_audit.py` strict mode at release time. Prerequisite for the langfuse trace fix salvage that cherry-picks @btorresgil's commits onto current main.	2026-05-15 05:03:43 -07:00
kshitijk4poor	77276070f5	fix(codex-runtime): de-dup [plugins.X] tables and stop leaking HERMES_HOME into config.toml Builds on @steezkelly's Bug A fix (#25857, top-level default_permissions via _insert_managed_block_at_top_level) by addressing the other two config-corruption bugs described in #26250: Bug B (duplicate [plugins.X] tables) - Codex itself writes [plugins."<name>@<marketplace>"] tables to config.toml when the user runs `codex plugins enable` directly, before hermes-agent's managed block exists. On the next migrate run, _query_codex_plugins() re-discovers the same plugins via plugin/list and render_codex_toml_section() re-emits them inside the managed block. Codex's strict TOML parser then rejects the duplicate table header on startup. - Add _strip_unmanaged_plugin_tables() that drops [plugins.] tables from the user-content portion of the file. Only run it when plugin/list succeeded — if the RPC failed we can't re-emit and must preserve the user's tables. plugin/list is the source of truth when it answers. Bug C (HERMES_HOME pytest-tempdir leak into ~/.codex/config.toml) - _build_hermes_tools_mcp_entry() read HERMES_HOME directly from os.environ, so a sibling pytest's monkeypatch.setenv("HERMES_HOME", tmp_path) silently burned a transient pytest tempdir into the user's real ~/.codex/config.toml. After pytest reaped the tempdir, every codex-routed hermes-tools tool call failed silently. - Derive HERMES_HOME from get_hermes_home() (the canonical resolver that goes through the profile-aware path) and refuse to emit obvious test-tempdir paths via _looks_like_test_tempdir() as belt-and-suspenders for any other callsite that forgets to patch migrate(). - test_enable_succeeds_when_codex_present in test_codex_runtime_switch.py invoked the real migrate() (no mock), writing to Path.home() / .codex using whatever HERMES_HOME the running pytest session had set. 
Add the same migrate patch the other apply() tests already use, so the suite stops touching the user's real ~/.codex/config.toml. E2E verification (replicating the issue's repro): - Pre-state config.toml with user [mcp_servers.omx_team_run] + codex-installed [plugins."tasks@openai-curated"], HERMES_HOME="/private/var/folders/.../pytest-of-.../..." - On origin/main: tomllib refuses to load the result with "Cannot declare ('plugins', 'tasks@openai-curated') twice" AND the pytest-tempdir HERMES_HOME is burned in. - On this branch: file parses cleanly, default_permissions is top-level, exactly one [plugins."tasks@openai-curated"] table inside the managed block, no HERMES_HOME in the MCP env. 7 new regression tests covering all three bugs + the test-leak guard. `bash scripts/run_tests.sh tests/hermes_cli/test_codex_runtime_.py` — 95 passed, 0 failed. Closes #26250	2026-05-15 02:31:30 -07:00
Steve Kelly	274217316e	fix(codex-runtime): keep migrated root keys top-level	2026-05-15 02:31:30 -07:00
nidhi-singh02	13c72fb486	fix(tools): wrap browser provider network calls with error handling Wrap requests.post() in create_session() for browser_use, browserbase, and firecrawl providers with requests.RequestException handling. Connection timeouts and DNS resolution failures now surface as clean RuntimeError messages instead of raw requests exception tracebacks. Browser Use managed-gateway mode preserves raw exception propagation so the existing idempotency-key retry semantics keep working. Closes #2746 Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-15 01:53:06 -07:00
aydnOktay	6af9942327	fix(url-safety): allow only http and https schemes	2026-05-15 01:52:48 -07:00
nidhi-singh02	8373956850	fix(slack): guard split()[0] against whitespace-only command text When a user sends a Slack message like '/hermes ' (trailing whitespace after the slash) the legacy subcommand router hit `text.split()[0]` with a truthy-but-whitespace-only `text`. `' '.split()` returns `[]` → IndexError, blowing up the slash handler before fallthrough to `/help`. Switch to a two-step guard that materializes the parts list first and indexes only if non-empty. Salvaged from PR #2752 by @nidhi-singh02. The PR's other two hunks (`tools/file_operations.py`, `agent/anthropic_adapter.py`) are unreachable in current code — `LINTERS` is a hardcoded constant dict with no empty values, and the anthropic version-detection site is already guarded by a `result.stdout.strip()` truthy check — so only the slack hunk is taken. Closes #2745 Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-15 01:50:56 -07:00
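The failure mode in the Slack fix above is easy to reproduce in isolation. The sketch below assumes a hypothetical helper name, first_subcommand, and shows the two-step guard: build the parts list first, index only if it is non-empty.

```python
from typing import Optional


def first_subcommand(text: str) -> Optional[str]:
    """Return the first whitespace-separated token, or None if there is none."""
    parts = text.split()          # ' '.split() is [], so parts[0] would raise
    return parts[0] if parts else None
```

A caller can treat None as "no subcommand" and fall through to its help path instead of catching IndexError.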
teknium1	94bdc63ff5	chore(release): add AUTHOR_MAP entry for nidhi-singh02 PR #2751 salvage. CI requires AUTHOR_MAP coverage for all contributor commit emails.	2026-05-15 01:50:41 -07:00
Nidhi Singh	eacb398f75	fix(tools): add return_exceptions to asyncio.gather in web_tools Three asyncio.gather() calls in tools/web_tools.py ran without return_exceptions=True. A single failing task (e.g. LLM rate limit on one URL) would raise out of gather() and discard every other successfully fetched/summarized result. Pass return_exceptions=True and filter BaseException entries with a warning log before unpacking. Affects: - chunk summarization gather (large web_extract pages) - firecrawl per-result LLM post-processing - tavily crawl per-result LLM post-processing Closes #2744	2026-05-15 01:50:41 -07:00
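The gather change above follows a common asyncio pattern. A minimal sketch, assuming a hypothetical per-URL coroutine named summarize that fails for one input to mimic a rate limit; none of these names are from tools/web_tools.py.

```python
import asyncio
import logging

logger = logging.getLogger("gather_sketch")


async def summarize(url: str) -> str:
    """Hypothetical per-URL task; raises for inputs containing 'bad'."""
    if "bad" in url:
        raise RuntimeError(f"rate limited on {url}")
    await asyncio.sleep(0)
    return f"summary of {url}"


async def summarize_all(urls):
    """Run every task, keep the successes, and log the failures."""
    results = await asyncio.gather(
        *(summarize(u) for u in urls), return_exceptions=True
    )
    kept = []
    for url, result in zip(urls, results):
        if isinstance(result, BaseException):
            logger.warning("skipping %s: %s", url, result)
            continue
        kept.append(result)
    return kept
```

Without return_exceptions=True the first exception propagates out of gather() and the caller never sees the results that did complete.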
teknium1	5301cc212b	chore(release): add AUTHOR_MAP entry for nidhi-singh02	2026-05-15 01:50:07 -07:00
nidhi-singh02	c4a21d7831	fix(cli): log swallowed exception in runtime model auto-detection Replaces bare `except Exception: pass` with debug-level logging so failures in local endpoint model discovery are diagnosable instead of silently hidden.	2026-05-15 01:50:07 -07:00
teknium1	59c7cc64f0	chore(release): add AUTHOR_MAP entry for amethystani	2026-05-15 01:43:54 -07:00
Animesh Mishra	55f3262e78	fix(mcp): pre-compile env-var regex and unify interpolation Remove redundant inner `import re` and regex recompilation on every call in _interpolate_env_vars. Add module-level _ENV_VAR_PATTERN compiled once. Replace the separate _interpolate_value() in mcp_config.py (which used \w+ and would silently fail on env vars containing hyphens or dots) with the shared _ENV_VAR_PATTERN from mcp_tool.py. Remove now-unused import re.	2026-05-15 01:43:54 -07:00
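The difference between a \w+ pattern and a more permissive one can be shown in a few lines. The ${NAME} placeholder syntax, the exact pattern, and the choice to leave unknown variables untouched are all assumptions for illustration; the commit message does not quote the real _ENV_VAR_PATTERN.

```python
import os
import re

# Compiled once at import time instead of on every call.
# Assumed syntax: ${NAME}, where NAME is anything up to the closing brace.
ENV_VAR_PATTERN = re.compile(r"\$\{([^}]+)\}")


def interpolate(value: str, env=None) -> str:
    """Replace ${NAME} with env[NAME]; unknown names are left as written."""
    env = os.environ if env is None else env

    def replace(match):
        return env.get(match.group(1), match.group(0))

    return ENV_VAR_PATTERN.sub(replace, value)
```

A pattern built on \w+ stops at the first hyphen or dot, so a placeholder such as ${my-api.key} never matches and is passed through unexpanded without any error.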
teknium1	5360b54244	fix(providers): set User-Agent on ProviderProfile.fetch_models Some catalog endpoints (OpenCode Zen, etc.) sit behind a WAF that returns 403 for the default Python-urllib/<ver> User-Agent. The generic profile-based live fetch in providers/base.py was silently failing for any such provider — falling through to the static catalog and missing newly-launched models. Set a generic 'hermes-cli/<version>' UA on the catalog probe so every api_key provider profile benefits. Verified live against opencode-zen: before this change, profile.fetch_models() raised HTTP 403; after, it returns 42 models including gpt-5.5, gpt-5.5-pro, kimi-k2.6, glm-5.1 and the *-free variants the static catalog doesn't list. Also strip the now-stale comment in validate_requested_model() claiming opencode-zen's /models returns 404 against the HTML marketing site — the API endpoint at /zen/v1/models returns 200 with valid JSON. Surfaced by #2651 (@aashizpoudel) — fixes the same user-facing gap their PR targeted, applied at the right layer so all api_key provider profiles get live catalogs through the same code path. Co-authored-by: Aashish Poudel <mr.aashiz@gmail.com>	2026-05-15 01:42:21 -07:00
teknium1	647cc0bb0d	chore(release): add AUTHOR_MAP entries for InB4DevOps	2026-05-15 01:42:08 -07:00
InB4DevOps	4f8aaf1046	perf(run_agent): accumulate length-continuation prefix via list+join Replace O(n²) string concatenation of truncated_response_prefix in the length-continuation retry loop with a list + ''.join(). Functionally equivalent: same partial response on early return, same prepend on final assembly. The legacy retry path is capped at 3 iterations, so the practical wall-clock win is small, but the new idiom matches the rest of the codebase and removes a needless repeated allocation. Salvaged from PR #2717 (the run_conversation portion only — trajectory refactor dropped because it silently rewrote </tool_response> to </think>). Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-15 01:42:08 -07:00
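The list-plus-join idiom mentioned above, next to the concatenation it replaces. Both function names are hypothetical, and the example only shows that the two forms yield the same string.

```python
def collect_with_concat(chunks):
    """Repeated += may copy the growing prefix on each iteration."""
    prefix = ""
    for chunk in chunks:
        prefix += chunk
    return prefix


def collect_with_join(chunks):
    """Append to a list and join once at the end."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return "".join(parts)
```

An early return can call "".join(parts) at any point to get the partial response accumulated so far.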
Mibayy	b6e07417c5	feat(cli): show YOLO mode warning in banner and status bar When running with --yolo, all dangerous command approvals are bypassed. Make this state visible so users don't forget: - Banner: '⚠ YOLO mode — all approval prompts bypassed' line in red, only shown when YOLO is active. Default case is silent (no extra line, no always-on 'restricted' label). - Status bar: '⚠ YOLO' fragment appended in red (#FF4444 bold) across all three width tiers (<52, <76, ≥76) in both the plain-text fallback and the fragments builder. Closes #2663 Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>	2026-05-15 01:41:59 -07:00
teknium1	47614dbfca	chore: wire simplex docs into sidebar + AUTHOR_MAP - Adds plugins/platforms/simplex docs page to the messaging sidebar between LINE and Open WebUI. - Maps louismichalot@hotmail.com -> Mibayy in scripts/release.py so the attribution check on the salvage PR passes.	2026-05-15 01:41:30 -07:00
Mibayy	09d9724a09	feat(gateway): add SimpleX Chat platform plugin SimpleX Chat (https://simplex.chat) is a private, decentralised messenger with no persistent user IDs — every contact is identified by an opaque internal ID generated at connection time. This adds it as a Hermes gateway platform via the plugin system. The adapter connects to a local simplex-chat daemon via WebSocket, listens for inbound messages, and sends replies. Originally proposed in PR #2558 as a core-modifying integration; reshaped here as a self- contained plugin under plugins/platforms/simplex/ with no edits to any core file. Discovery is filesystem-based (scanned by gateway.config), and the platform identity is resolved on demand via Platform("simplex"). Plugin contract: - check_requirements() requires SIMPLEX_WS_URL AND the websockets package - validate_config() / is_connected() accept env or config.yaml input - _env_enablement() seeds PlatformConfig.extra (ws_url + home_channel) - _standalone_send() supports out-of-process cron delivery - interactive_setup() provides a stdin wizard for hermes gateway setup - register() wires the adapter into the registry with required_env, install_hint, cron_deliver_env_var, allowed_users_env, and a platform_hint for the LLM. Lazy dependency: the websockets Python package is imported inside the functions that need it. The plugin is importable and discoverable even when websockets is missing — check_requirements() simply returns False until `pip install websockets` is run. No new pyproject extras are introduced. Environment variables: SIMPLEX_WS_URL WebSocket URL of the daemon (required) SIMPLEX_ALLOWED_USERS Comma-separated allowed contact IDs SIMPLEX_ALLOW_ALL_USERS Set true to allow all contacts SIMPLEX_HOME_CHANNEL Default contact for cron delivery SIMPLEX_HOME_CHANNEL_NAME Human label for the home channel Closes #2557.	2026-05-15 01:41:30 -07:00
teknium1	85782a4ed7	feat(acp): hermes acp --setup-browser bootstraps browser tools for registry installs The Zed ACP Registry path (uvx --from 'hermes-agent[acp]==X' hermes-acp) gets a Python-only install. Browser tools depend on the agent-browser npm package + Chromium, neither of which are in the wheel. Without an explicit bootstrap, registry users have no path to working browser tools. Ship a bundled, idempotent bootstrap script (Linux/macOS bash + Windows PowerShell) inside acp_adapter/bootstrap/ as wheel package-data. New entry points: hermes acp --setup-browser # interactive; prompts before Chromium download hermes acp --setup-browser --yes # non-interactive hermes-acp --setup-browser The terminal-auth flow (hermes acp --setup) also offers the browser bootstrap as a follow-up after model selection, so first-run registry users get the option without knowing the flag exists. Key design choices: - npm install -g --prefix $NODE_PREFIX so we never need sudo. System Node on PATH is respected; only the install target is redirected to the user-writable Hermes-managed Node prefix. - tools/browser_tool.py::_browser_candidate_path_dirs() already walks $HERMES_HOME/node/bin, so installed binaries are discovered with no agent-side code change. - System Chrome/Chromium detection short-circuits the ~400 MB Playwright download when a suitable browser already exists. - Bash + PowerShell live as ONE copy each under acp_adapter/bootstrap/. Not duplicated under scripts/. install.sh and install.ps1 keep their inline browser blocks for the source-checkout path. 
E2E validated end-to-end: bash bootstrap_browser_tools.sh --skip-chromium → installs agent-browser into ~/.hermes/node/bin/ tools.browser_tool._find_agent_browser() → returns the installed path check_browser_requirements() → returns True (browser tools register) Tests: - tests/acp/test_entry.py: 11 tests covering --setup-browser dispatch (linux + windows + --yes forwarding + failure propagation), the terminal-auth follow-up prompt path, and a package-data wheel-shipping assertion that catches any future pyproject.toml regression. Docs: website/docs/user-guide/features/acp.md gains a 'Browser tools (optional)' subsection with the two-line install + what-it-does.	2026-05-15 01:38:24 -07:00
teknium1	9f57f2286d	chore(release): add AUTHOR_MAP entry for buntingszn	2026-05-15 01:36:03 -07:00
buntingszn	6682f91b80	feat(cron): support name-based lookup for job operations Cron mutation operations (run/pause/resume/remove) and 'hermes cron edit' now accept a job name in addition to the hex ID, with case-insensitive matching. Before this, 'hermes cron run my_job_name' died with 'Job with ID my_job_name not found' and forced the user to look up the hex ID first. The original PR matched by name but silently picked the first match when two jobs shared a name. This version refuses to act on an ambiguous name and surfaces every matching job (id, name, schedule, next_run_at) so the caller can pick a specific ID. - cron/jobs.py: - get_job() stays ID-only (preserves existing call-site semantics for web_server/api_server/curator/scheduler/test code that always passes real IDs). - resolve_job_ref() is the new name-or-ID resolver, used by pause/ resume/trigger/remove_job. Exact ID match wins over a name match even if a different job's name happens to equal that ID. Ambiguous name match raises AmbiguousJobReference with all candidate IDs. - tools/cronjob_tools.py: dispatch site uses resolve_job_ref, surfaces ambiguous matches as a structured error with the matching IDs. - hermes_cli/cron.py: 'cron edit' uses resolve_job_ref so editing by name works and ambiguous names are reported with IDs. - tests/cron/test_jobs.py: new TestResolveJobRef covering ID match, case-insensitive name match, ID-wins-over-name, ambiguous refusal, and that pause/resume/trigger/remove all refuse on ambiguity. Closes #2627	2026-05-15 01:36:03 -07:00
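The resolution rules listed above (exact ID first, then case-insensitive name, refuse on ambiguity) can be sketched like this. The names resolve_job_ref and AmbiguousJobReference appear in the commit message, but the signature, the list-of-dicts job shape, and the None return for "not found" are assumptions for this example.

```python
class AmbiguousJobReference(Exception):
    """Raised when a name matches more than one job."""

    def __init__(self, ref, matches):
        super().__init__(f"{ref!r} matches {len(matches)} jobs")
        self.matches = matches


def resolve_job_ref(ref, jobs):
    """Resolve ref against jobs (dicts with 'id' and 'name'); None if absent."""
    # 1. An exact ID match always wins, even if another job is named like it.
    for job in jobs:
        if job["id"] == ref:
            return job
    # 2. Otherwise match names case-insensitively.
    wanted = ref.casefold()
    matches = [job for job in jobs if job["name"].casefold() == wanted]
    # 3. Never guess between several jobs with the same name.
    if len(matches) > 1:
        raise AmbiguousJobReference(ref, matches)
    return matches[0] if matches else None
```

Carrying the candidate jobs on the exception lets the caller print each one so the user can retry with a specific ID.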
Teknium	05d9f641c0	docs(cron): worked recipes for the wakeAgent pre-run gate (#26229) Adds three pre-run gate recipes to the cron docs: - file-change gate (stat + mtime + state file) - external-flag gate (file presence) - SQL-count gate (user's own database, not state.db) These are the use cases @iankar8 proposed adding as a parallel 'trigger' subsystem in #2654. The existing `script` + `wakeAgent` gate already covers all three at $0 — this lands the patterns as documentation so users can find them, instead of adding a second gating mechanism to the cron subsystem.	2026-05-15 01:34:15 -07:00
Teknium	9329e06696	feat(image-gen): actionable setup message when no FAL backend is reachable (#26222) When the in-tree FAL path has no API key (and no managed gateway), the handler used to return a bare 'FAL_KEY environment variable not set' error. Users had no idea where to get a key, that a managed Nous gateway exists, or that plugin-registered providers are an option. Now `image_generate_tool` returns a structured multi-line message: - signup link (https://fal.ai) - managed-gateway status (if Nous tools are enabled) - pointer to `hermes tools` / `hermes plugins list` for alternate backends, so users on a stale `image_gen.provider` know where to look The schema is untouched — `check_fn` still gates the tool out of the schema when no backend is reachable at startup, consistent with every other conditional tool. This patch fixes the call-time failure modes: managed-gateway 5xx, plugin provider disappearing mid-session, etc. Inspired by #2546 / @Mibayy. The PR was ~5700 commits stale against the new plugin-aware image_gen architecture, so this is a forward port of the actionable-error idea rather than a cherry-pick. Closes #2543 Co-authored-by: Mibayy <mibayy@users.noreply.github.com>	2026-05-15 01:33:13 -07:00
Siddharth Balyan	04b1fdaecf	security(deps): add upper bounds to 5 loose deps + document supply chain policy (#24226) After the Mini Shai-Hulud supply chain campaign (May 2026) and the litellm compromise (March 2026), codify the dependency pinning policy that was established in PRs #2810 and #9801 but never written down for contributors. Changes: - pyproject.toml: Add tight upper bounds to the 5 deps that slipped through as review escapes from external contributor PRs: - hindsight-client>=0.4.22,<0.5 (was >=0.4.22) - aiosqlite>=0.20,<0.23 (was >=0.20) - asyncpg>=0.29,<0.32 (was >=0.29) - alibabacloud-dingtalk>=2.0.0,<3 (was >=2.0.0) - youtube-transcript-api>=1.2.0,<2 (was >=1.2.0) Pre-1.0 packages get <0.(current_minor+2) — tight enough to block hostile minor releases but loose enough to not require bumps every week. - CONTRIBUTING.md: Add 'Dependency pinning policy' section under Security with the full rationale, table of source types + treatments, and examples. - AGENTS.md: Add concise 'Dependency Pinning Policy' section for AI coding agents with the decision table and step-by-step checklist. - supply-chain-audit.yml: Add dep-bounds job that fails PRs introducing PyPI deps without <ceiling upper bounds. Fires on pyproject.toml changes. Posts a PR comment with the specific unbounded specs found. Refs: #2796 #2810 #9801 #24205	2026-05-15 01:33:08 -07:00
Wysie	681778a0b7	fix(whatsapp): fail fast when Baileys sendMessage hangs Baileys' sock.sendMessage() can hang indefinitely while uploading media to WhatsApp servers (and, less often, on text sends), pinning the bridge's Express handler until the gateway's aiohttp timeout fires — surfacing to the user as a 120s wait followed by an empty error from the TTS/voice path. Wrap every sock.sendMessage() call inside the bridge in a sendWithTimeout() helper that rejects after WHATSAPP_SEND_TIMEOUT_MS (default 60s) via Promise.race. The four call sites are /send, /edit, and /send-media's primary send. Express handlers catch the rejection in their existing try/catch and return a real 500 to the gateway, which can then surface a retryable error. Salvaged from #2608 — wysie diagnosed the hang and the Promise.race shape; the other two parts of that PR (gateway HTTP session pooling, base.py metadata kwarg removal) already landed on main via separate routes and are no longer needed. Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-05-15 01:30:48 -07:00
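The bridge patched above is JavaScript and uses Promise.race. The same fail-fast shape in Python, shown here only as an analogue, uses asyncio.wait_for. The helper name, its signature, and the RuntimeError are assumptions for this sketch and do not mirror the bridge's sendWithTimeout().

```python
import asyncio


async def send_with_timeout(send, timeout_s: float = 60.0):
    """Await send() but give up after timeout_s seconds.

    send is any zero-argument coroutine function (hypothetical here).
    """
    try:
        return await asyncio.wait_for(send(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Surface a clear, retryable error instead of hanging the handler.
        raise RuntimeError(f"send timed out after {timeout_s}s") from None
```

One difference from Promise.race is worth noting: asyncio.wait_for cancels the pending coroutine on timeout, whereas the losing promise in a race keeps running.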
teknium1	0161d4bb6c	chore(release): add AUTHOR_MAP entry for CoinTheHat	2026-05-15 01:29:31 -07:00
CoinTheHat	814c60092b	fix: clean stale conversation mappings on response eviction/deletion ResponseStore.put() and .delete() now remove conversations rows that reference evicted or deleted response IDs, preventing 404 errors when a conversation name is reused after its backing response was purged. Adds regression tests for delete, eviction, and handler-level reuse. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-15 01:27:43 -07:00
KiraKatana	23ac522d37	fix(gateway): isinstance-guard string-form 429 error body When a non-Anthropic provider (e.g. Morpheus proxy) returns a 429 with `{"error": "Too Many Requests"}` instead of the expected `{"error": {"type": ...}}` dict, _err_body.json().get("error", {}) returns the raw string and the next .get("type") line crashes with AttributeError, taking down the message handler. Guard with isinstance(_err_json, dict) so non-dict error bodies fall through to the generic rate-limit hint. Salvaged from PR #2587 by @KiraKatana. The PR's fallback-config `base_url`/`api_key_env` fix was already implemented independently on main (run_agent.py:8759-8780) with additional aliases and Ollama Cloud host handling, so only the gateway guard is cherry-picked. Co-authored-by: KiraKatana <kira.ops@proton.me>	2026-05-15 01:26:11 -07:00
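The crash described above comes from calling .get() on a string. A minimal sketch of the guard, with a hypothetical helper name and a parsed JSON body passed in directly:

```python
def error_type(body):
    """Return error.type from a parsed 429 body, or None if the shape differs."""
    err = body.get("error", {}) if isinstance(body, dict) else None
    if isinstance(err, dict):
        return err.get("type")
    # String-form errors such as {"error": "Too Many Requests"} land here,
    # so the caller can fall back to a generic rate-limit hint.
    return None
```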
teyrebaz33	e0e7397c32	fix(session): persist auto-reset state across gateway restarts was_auto_reset, auto_reset_reason, and reset_had_activity were not included in SessionEntry.to_dict() / from_dict(), so a gateway restart between session expiry and the user's next message would silently drop the auto-reset notification and context note. Add the three fields to the serialization roundtrip with safe defaults (False / None / False) so existing sessions.json files load cleanly. Add three roundtrip tests to test_session_reset_notify.py.	2026-05-15 01:25:42 -07:00
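A serialization roundtrip with safe defaults, as described above, might look like the following. SessionEntrySketch and its session_key field are hypothetical stand-ins; only the three field names and their defaults (False / None / False) come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionEntrySketch:
    """Hypothetical stand-in for SessionEntry with the three new fields."""

    session_key: str
    was_auto_reset: bool = False
    auto_reset_reason: Optional[str] = None
    reset_had_activity: bool = False

    def to_dict(self) -> dict:
        return {
            "session_key": self.session_key,
            "was_auto_reset": self.was_auto_reset,
            "auto_reset_reason": self.auto_reset_reason,
            "reset_had_activity": self.reset_had_activity,
        }

    @classmethod
    def from_dict(cls, data: dict) -> "SessionEntrySketch":
        # .get() with defaults lets files written before the change still load.
        return cls(
            session_key=data["session_key"],
            was_auto_reset=data.get("was_auto_reset", False),
            auto_reset_reason=data.get("auto_reset_reason"),
            reset_had_activity=data.get("reset_had_activity", False),
        )
```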
kshitijk4poor	e0e4856d46	feat(skills-hub): add huggingface/skills as trusted default tap (#2549) Adds Hugging Face's official skill catalog to the default GitHub taps and classifies it as a trusted source alongside openai/skills and anthropics/skills. - tools/skills_guard.py: huggingface/skills -> TRUSTED_REPOS - tools/skills_hub.py: GitHubSource.DEFAULT_TAPS += huggingface/skills (skills/) - website/docs: list it under default taps + trusted-source examples Closes #2549. Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-15 01:25:33 -07:00
libo1106	0086cdaf93	refactor(yuanbao): improve quote media fallback — move to DispatchMiddleware, tighten conditions	2026-05-15 01:17:50 -07:00
libo1106	fc2754dbdf	fix(yuanbao): resolve quoted file/image via transcript lookup when quote desc lacks ybres When a user quotes a file message (type=3) and @bot, the quote's desc field only contains the filename without a ybres:// resource reference. The existing QuoteContextMiddleware only extracted media refs from desc using the ybres regex, which always returned empty for file quotes. Fix: add a transcript lookup fallback in QuoteContextMiddleware.handle() — when quote_media_refs is empty but reply_to_message_id is set, search the session transcript for the quoted message_id and extract ybres anchors from its content. Also fix message_type classification: when quote media resolves non-image files, override message_type to DOCUMENT so gateway/run.py's document injection logic properly prepends the file path and content for the agent.	2026-05-15 01:17:50 -07:00
libo1106	3df26b925c	feat(yuanbao): prioritize quote media refs over history backfill in DispatchMiddleware	2026-05-15 01:17:50 -07:00
libo1106	80efe664ce	feat(yuanbao): add quote_media_refs extraction to QuoteContextMiddleware	2026-05-15 01:17:50 -07:00
libo1106	d57a4b3eb5	feat(yuanbao): add _parse_resource_id and update _extract_text for ybres anchors	2026-05-15 01:17:50 -07:00
Siddharth Balyan	6bdad1f3b2	ci: add PyPI publish workflow (salvaged from #25901) (#26148) * ci(pypi): add publish workflow for automated PyPI releases Triggered by CalVer tag pushes from scripts/release.py (v20* pattern). Three jobs: build (uv build) → publish (OIDC trusted publishing) → sign (Sigstore + attach to existing GitHub Release). - workflow_dispatch as manual escape hatch - skip-existing for safe re-runs - Graceful skip when GitHub Release not found (sign job) - Top-level permissions: contents: read (CodeQL compliant) Requires one-time setup: PyPI trusted publisher + GitHub pypi environment. Co-authored-by: dmahan93 <44207705+dmahan93@users.noreply.github.com> * fix(release): address review findings - Stage acp_registry/agent.json in version bump commit (was silently left unstaged) - Add missing return when no previous tags found without --first-release - Fix get_pr_number return type annotation (str -> str | None) - Prefer uv build over python -m build (matches CI workflow), with fallback - Use unit separator (%x1f) in git log format to handle | in author names - Add explicit encoding='utf-8' to .release_notes.md write Workflow hardening: - Gracefully skip signing when GitHub Release not found (env var gate instead of exit 1, so PyPI publish still shows green) * fix(ci): harden PyPI workflow — SHA-pin actions, guard workflow_dispatch, explicit build flags - Pin all actions to commit SHAs (supply-chain hardening for id-token:write) - workflow_dispatch now requires confirm_tag input + checks out that tag - Both uv build paths explicitly pass --sdist --wheel --------- Co-authored-by: dmahan93 <44207705+dmahan93@users.noreply.github.com>	2026-05-15 13:21:48 +05:30
teknium1	f9ad7400e3	fix(goals): raise judge max_tokens 200 → 4096, make configurable The freeform /goal judge was capped at max_tokens=200, which reliably truncated the JSON verdict on reasoning-heavy models (deepseek-v4-pro, qwq, etc.) — the model burns tokens on hidden reasoning before emitting visible content, and the first /goal turn's prompt is larger than later turns, blowing past 200. Symptom: agent.log shows `judge reply was not JSON: '{"done": true, "reason": "The agent successfully'` followed by repeated `judge returned empty response` lines, then the goal pauses with a misleading 'judge model isn't returning the required JSON verdict' message. Diagnosed live by @helix4u — empirically verified that raising the budget on an unmodified worktree makes the failures go away on the exact configs users were hitting on Nous Plus subscription paths. Changes: - DEFAULT_JUDGE_MAX_TOKENS = 4096 (up from 200) - New auxiliary.goal_judge.max_tokens config knob for tuning in specifically constrained setups - _goal_judge_max_tokens() resolves the value with fail-open semantics (non-int / non-positive / load failure → default). load_config() is mtime-cached so per-turn lookup is cheap. Scoped narrowly to the verified root cause — does not introduce a submit_verdict tool-call schema (see #26162 / #23671 for that direction; they can land separately if we want them). Tests: tests/hermes_cli/test_goals.py + tests/cli/test_cli_goal_interrupt.py + tests/gateway/test_goal_verdict_send.py — 62/62 passing. E2E verified: config override honored (8192), missing/garbage/zero values fall back to 4096, no-auxiliary-section falls back to 4096. Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com> Credits: - @helix4u (Gille) — diagnosed the max_tokens=200 truncation via live testing on an unmodified worktree, drafted the original fix shape in #26162. - @AhmetArif0 — flagged the freeform judge fragility in #23671 from the tool-call angle. 
- @0xharryriddle (HarryRiddle.eth) — reported the issue from a Nous Plus subscription setup in #23876 with full debug reports. Closes #23876 Supersedes #26162, #23671, #23881	2026-05-14 23:44:06 -07:00
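The fail-open resolution described in the commit above might look roughly like the sketch below. It assumes the config is a nested dict and takes it as a parameter; the helper name `resolve_judge_max_tokens` is hypothetical, and treating booleans as invalid is a choice made here, not something the commit states.

```python
DEFAULT_JUDGE_MAX_TOKENS = 4096  # default named in the commit (previously 200)


def resolve_judge_max_tokens(config) -> int:
    """Return auxiliary.goal_judge.max_tokens, or the default on any problem."""
    try:
        value = config["auxiliary"]["goal_judge"]["max_tokens"]
    except (KeyError, TypeError):
        # Missing section, missing key, or a config that is not a mapping.
        return DEFAULT_JUDGE_MAX_TOKENS
    # bool is a subclass of int in Python; rejecting it is an assumption here.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        return DEFAULT_JUDGE_MAX_TOKENS
    return value
```

A valid override is returned as-is, while a missing section, a garbage string, or a zero all resolve to 4096, which mirrors the fallback cases listed in the commit's E2E notes.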
Teknium	965ae7fa97	revert(cli): drop scrollback box width clamp (#25975 ), restore full-width borders (#26163 ) #25975 (salvaging #24403) clamped decorative scrollback Panels and streaming box rules to `max(32, min(width, 56))` as a defense against terminal-emulator reflow when columns shrink. On any modern wide terminal this made the response/reasoning borders look stubby — 56 cols inside a 200-col viewport. #26137 (salvaging #25981, by @OutThisLife) landed a more fundamental fix: prompt_toolkit's `_output_screen_diff` is monkey-patched so its reserve-vertical-space cursor move no longer pushes chrome into scrollback at all. With that in place, the clamp is no longer load-bearing for the chrome-into-scrollback class of bugs — the remaining risk is purely cosmetic reflow of already stamped Panel borders during an aggressive column shrink, which we now accept as a tradeoff for restoring proper full-width rendering. Changes: - `_scrollback_box_width()` returns `max(32, width)` (just the floor, no upper cap). All 10 call sites stay valid. - Updated `test_scrollback_box_width_caps_to_resize_safe_value` to the new `test_scrollback_box_width_returns_viewport_width` asserting full-width passthrough above the 32-col floor. Floor of 32 is kept so `'─' * (w - 2)` math stays positive on tiny terminals. Refs #18449 #19280 #22976 (the original reflow class) and #25975 (the clamp this reverts).	2026-05-14 23:30:16 -07:00
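The width change in the revert above comes down to two expressions quoted in the commit message. The sketch below places them side by side; the function names are hypothetical stand-ins for the old and new bodies of `_scrollback_box_width()`.

```python
def clamped_box_width(width: int) -> int:
    """Behavior before the revert: floor of 32, upper cap of 56."""
    return max(32, min(width, 56))


def full_box_width(width: int) -> int:
    """Behavior after the revert: floor of 32, no upper cap."""
    return max(32, width)


def horizontal_rule(width: int) -> str:
    """Inner rule of a box; the 32-column floor keeps (w - 2) positive."""
    w = full_box_width(width)
    return "─" * (w - 2)
```

On a 200-column viewport the old clamp yields 56 while the new function passes 200 through; on a 10-column terminal both return 32, so the rule is still 30 characters long.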
teknium1	cbd1f8e4be	test(cli): cover light-mode detection + SkinConfig.get_color remap Adds 16 unit tests covering the light/dark terminal detection path introduced in the previous commit: - Env override priority (HERMES_LIGHT, HERMES_TUI_LIGHT, HERMES_TUI_THEME, HERMES_TUI_BACKGROUND, COLORFGBG) - Detection cache stickiness - _maybe_remap_for_light_mode() no-op in dark mode - Known dark-mode color remap (#FFF8DC -> #1A1A1A etc) - Case-insensitive lookup - Unknown color passthrough - Status-bar paired colors (#C0C0C0, #888888, #555555, #8B8682) are intentionally NOT remapped — regression guard for the patch-11 fix, since remapping them would produce dark-on-dark on the status bar's navy bg - SkinConfig.get_color() wrapper is installed and idempotent - SkinConfig.get_color() does remap in light mode and passes through in dark mode We don't try to fake an OSC 11 reply — that path is exercised end-to-end in real Terminal.app; the env-override path covers the algorithmic logic.	2026-05-14 23:23:32 -07:00
Brooklyn Nicholson	f8745f59c2	fix(cli): kill resize scrollback duplication + light-mode visibility Two long-standing prompt_toolkit bugs in the base hermes CLI: 1. Resize duplication. Column-shrink resize used to push 40+ rows of duplicate chrome (status bar, input rules) into terminal scrollback every resize. Same wall as pt issues #29 (open since 2014), #1675, #1933 — aider/xonsh/ipython all use alt-screen to dodge it. Root cause (verified by reading prompt_toolkit/renderer.py): _output_screen_diff (renderer.py L232-242) deliberately moves the cursor to the bottom of the canvas after every paint 'to make sure the terminal scrolls up'. In non-fullscreen mode this scrolls chrome content into terminal scrollback on every render — not just on resize. Fix: monkey-patch prompt_toolkit.renderer._output_screen_diff to bypass the reserve-vertical-space cursor move. When pt's logic checks 'if current_height > previous_screen.height', we inflate the previous screen height so the branch falls through. ~30-line wrapper, no fork of pt, no alt-screen, no DECSTBM scroll region. Verified empirically in real Terminal.app: 10 resizes (mixed shrinks/widens 1300→500→1400) during streaming produced ZERO scrollback delta, full agent response preserved, status bar pinned at bottom, no visible duplicates. pt is pinned to ==3.0.52 so the private-function patch is safe; future pt bumps will need to re-verify the signature matches. 2. Light-mode terminal visibility. Hardcoded skin colors (#FFF8DC cornsilk, #FFD700 gold, #B8860B dark goldenrod) are tuned for dark Terminal.app — invisible on light/cream backgrounds. Port ui-tui/src/theme.ts detectLightMode() to Python so the base CLI adapts. Detection priority: HERMES_LIGHT/HERMES_TUI_LIGHT env → HERMES_TUI_THEME=light|dark → HERMES_TUI_BACKGROUND=#RRGGBB → COLORFGBG env (xterm/Konsole/urxvt) → OSC 11 query (\x1b]11;?\x1b\\) with 100ms timeout → default dark. 
OSC 11 is tty-gated so gateway/cron/batch/subagent code paths don't pay the timeout cost. When light mode is detected, dark-mode colors auto-remap to readable equivalents (#FFF8DC → #1A1A1A, #FFD700 → #9A6B00, etc). Hooked at three points: - _hex_to_ansi() — auto-remaps any color emitted via the ANSI helper - _build_tui_style_dict() — rewrites pt style strings (chrome bg/fg) - SkinConfig.get_color() — wrapped at module load so Rich Panel borders/body text get the remap too Status-bar foreground colors (#C0C0C0, #888888, etc.) are explicitly skipped because they're paired with a dark navy bg — remapping them would make them invisible in dark mode. 3. Other visibility fixes: [thinking] reasoning preview now uses ANSI dim+italic (\x1b[2;3m) instead of #B8860B so it inherits terminal default fg color. Input/prompt area defaults to terminal default fg (was #FFF8DC cornsilk → invisible on cream). Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com>	2026-05-14 23:23:32 -07:00
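The detection priority and color remap described in the commit above can be sketched as follows, covering only the environment-variable steps (the OSC 11 query is omitted). Several details are assumptions not stated in the commit: the accepted truthy and falsy spellings, the 0.5 luminance threshold, and reading COLORFGBG background indices 7 and 15 as light. The function names are hypothetical.

```python
import re
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}   # assumed spellings
FALSY = {"0", "false", "no", "off"}   # assumed spellings

# Only the two remap pairs spelled out in the commit message.
LIGHT_REMAP = {"#fff8dc": "#1A1A1A", "#ffd700": "#9A6B00"}


def _luminance(hex_color: str) -> float:
    """Approximate relative luminance of '#RRGGBB' on a 0..1 scale."""
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (1, 3, 5))
    return (0.2126 * r + 0.7152 * g + 0.0722 * b) / 255


def detect_light_mode(env: Mapping[str, str]) -> bool:
    """Walk the env-based priority chain; fall through to dark."""
    for name in ("HERMES_LIGHT", "HERMES_TUI_LIGHT"):
        value = env.get(name, "").strip().lower()
        if value in TRUTHY:
            return True
        if value in FALSY:
            return False
    theme = env.get("HERMES_TUI_THEME", "").strip().lower()
    if theme in ("light", "dark"):
        return theme == "light"
    background = env.get("HERMES_TUI_BACKGROUND", "").strip()
    if re.fullmatch(r"#[0-9a-fA-F]{6}", background):
        return _luminance(background) > 0.5  # threshold is an assumption
    # COLORFGBG looks like "fg;bg" or "fg;default;bg"; the last field is bg.
    last = env.get("COLORFGBG", "").rsplit(";", 1)[-1]
    if last.isdigit():
        return int(last) in (7, 15)  # assumed light palette indices
    return False  # default dark


def remap_for_light_mode(color: str, light: bool) -> str:
    """Case-insensitive remap in light mode; unknown colors pass through."""
    if not light:
        return color
    return LIGHT_REMAP.get(color.lower(), color)
```

Status-bar colors such as #C0C0C0 are simply absent from the table in this sketch, so they pass through unchanged, which matches the skip behavior the commit describes.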
teknium1	bcca5ed34d	fix(deps): pin brotlicffi so aiohttp can decode Discord's Brotli attachments Discord's CDN serves attachments with Content-Encoding: br. aiohttp's compression_utils tries 'import brotlicffi as brotli' first and falls back to google's Brotli, but Brotli<1.2.0's Decompressor.process() is 1-arg while aiohttp calls it with 2 args (data, max_length). Result: every .txt/.md/.doc uploaded to a Discord-gateway session fails to decode at att.read() with 'Can not decode content-encoding: br' / 'TypeError: process() takes exactly 1 argument (2 given)', the agent never sees the bytes, and falls back to filesystem guessing. Pin brotlicffi==1.2.0.1 in both surfaces: - tools/lazy_deps.py 'platform.discord' tuple: Discord users on the lazy-install path get it on first discord.py import. - pyproject.toml [messaging] extra: users who explicitly install hermes-agent[messaging] (skipping the lazy path) get it eagerly. brotlicffi wins aiohttp's import race regardless of what else is installed (try brotlicffi / except: import brotli), so existing setups that already pulled google's Brotli transitively don't change behavior beyond the bug fix. ~1.5 MB wheel, manylinux/macOS/Windows coverage. E2E verified: round-trip decode of Brotli-compressed payload via aiohttp.compression_utils.brotli succeeds with brotlicffi pinned; same test against Brotli==1.1.0 alone reproduces the reported TypeError. Credit to @Korkyzer for the original diagnosis and fix shape in #15744; the lazy-deps gating layer was added on top to keep brotlicffi out of the install path for users who don't run a Discord gateway. Fixes #12511. Closes #15744. Co-authored-by: Korky <korkyzer@gmail.com>	2026-05-14 22:36:46 -07:00
teknium1	c8c6ce1731	feat(acp-registry): switch to uvx distribution, drop npm launcher The ACP Registry schema supports uvx as a first-class distribution method alongside npx and binary. Pointing the registry directly at the existing hermes-agent PyPI release removes: - the @nousresearch npm scope (we don't own it) - a separate npm publish step on every weekly release - 90 lines of Node launcher + tests in packages/hermes-agent-acp/ The Zed registry now installs Hermes via: uvx --from 'hermes-agent[acp]==<version>' hermes-acp This is the same command the npm launcher was shelling out to anyway, so end-user behavior is unchanged. Registry CI validates the PyPI URL + version-pin exact match automatically. Changes: - acp_registry/agent.json: distribution.npx -> distribution.uvx - delete packages/hermes-agent-acp/ entirely - scripts/release.py: drop npm-launcher bump paths, keep manifest lockstep - tests/acp/test_registry_manifest.py: assert uvx shape + version pin - tests/scripts/test_release_acp_registry.py: rewrite for uvx-only shape - docs (user-guide + dev-guide): drop all npm-launcher references - delete docs/plans/acp-registry-zed-integration.md (stale, npm-shaped) Validated against agentclientprotocol/registry agent.schema.json via jsonschema. hermes-agent==0.13.0 is already live on PyPI.	2026-05-14 22:27:09 -07:00
Siddharth Balyan	5af672c753	chore: remove Atropos RL environments and tinker-atropos integration (#26106 ) * chore: remove Atropos RL environments, tools, tests, skill, and tinker-atropos submodule Delete: - environments/ (43 files — base env, agent loop, tool call parsers, benchmarks) - rl_cli.py (standalone RL training CLI) - tools/rl_training_tool.py (all 10 rl_* tools) - tests: test_rl_training_tool, test_tool_call_parsers, test_managed_server_tool_support, test_agent_loop, test_agent_loop_vllm, test_agent_loop_tool_calling, test_terminalbench2_env_security - optional-skills/mlops/hermes-atropos-environments/ - tinker-atropos git submodule + .gitmodules * chore: remove RL/Atropos references from Python source - toolsets.py: remove rl toolset block + update comment - model_tools.py: remove rl_tools group + update async bridging comment - hermes_cli/tools_config.py: remove RL display entry, _DEFAULT_OFF_TOOLSETS, setup block, and rl_training post-setup handler - tools/budget_config.py: remove RL environment reference in docstring - tests/test_model_tools.py: remove rl_tools from expected groups - tests/run_agent/test_streaming_tool_call_repair.py: fix stale cross-reference * chore: remove rl/yc-bench extras and tinker-atropos refs from pyproject.toml - Remove rl extra (atroposlib, tinker, fastapi, uvicorn, wandb) - Remove yc-bench extra - Remove rl_cli from py-modules - Remove [tool.ty.src] exclude for tinker-atropos - Remove [tool.ruff] exclude for tinker-atropos - Regenerate uv.lock * chore: remove tinker-atropos from install/setup scripts - setup-hermes.sh: remove entire tinker-atropos submodule install block - scripts/install.sh: remove both tinker-atropos blocks (Termux + standard) - scripts/install.ps1: remove tinker-atropos block - nix/hermes-agent.nix: remove tinker-atropos pip install line * chore: remove RL references from cli-config.yaml.example * docs: remove Atropos/RL references from README, CONTRIBUTING, AGENTS.md * docs: remove RL/Atropos 
references from website - Delete: environments.md, rl-training.md, mlops-hermes-atropos-environments.md - sidebars.ts: remove rl-training and environments sidebar entries - optional-skills-catalog.md: remove hermes-atropos-environments row - tools-reference.md: remove entire rl toolset section - toolsets-reference.md: remove rl row + update example - integrations/index.md: remove RL Training bullet - architecture.md: remove environments/ from tree + RL section - contributing.md: remove tinker-atropos setup - updating.md: remove tinker-atropos install + stale submodule update * chore: remove remaining RL/Atropos stragglers - hermes_cli/config.py: remove TINKER_API_KEY + WANDB_API_KEY env var defs - hermes_cli/doctor.py: remove Submodules check section (tinker-atropos) - hermes_cli/setup.py: remove RL Training status check - hermes_cli/status.py: remove Tinker + WandB from API key status display - agent/display.py: remove both rl_* tool preview/activity blocks - website/docs: remove RL references from providers.md + env-variables.md - tests: remove TINKER_API_KEY from conftest, set_config_value, setup_script * chore: remove RL training section from .env.example	2026-05-15 10:36:38 +05:30
teknium1	d364132114	chore(release): bump ACP Registry assets in lockstep with pyproject The ACP Registry manifest (acp_registry/agent.json), the npm launcher package.json, and the launcher's HERMES_AGENT_VERSION constant must all match pyproject.toml exactly — tests/acp/test_registry_manifest.py enforces this lockstep. Without a release-script hook, the next weekly version bump fails that test until someone hand-edits four files. Extend update_version_files() to drive the ACP bump alongside __init__.py and pyproject.toml, and add tests covering the lockstep and the missing-files no-op path. Also map adam.manning@gmail.com -> am423 for the salvage commit.	2026-05-14 20:26:02 -07:00
mr-r0b0t	4c94396206	feat: add ACP registry metadata for Zed	2026-05-14 20:26:02 -07:00
Harry Riddle	e8b9f5ff9a	fix(aux): surface Nous auth-unavailable warning in auxiliary client When the auxiliary client falls through Nous (e.g. no stored auth, or runtime credential mint failed), users currently see only `debug`-level lines, so the next provider in the fallback chain takes over silently. Promote the no-auth path to a warning that tells operators to run `hermes auth`, and add a debug breadcrumb on the rarer mint-failed-but-stored-auth-still-present fallback path so the existing behavior (use the raw stored token) is preserved while staying investigable. Salvaged from #23881 by @0xharryriddle. The contributor's original patch also short-circuited the second branch with a return, which broke the pool-entry fallback path covered by `test_try_nous_uses_pool_entry` — kept the warning intent, dropped the return so the fallback still works. Dropped the contributor's changes to `hermes_cli/goals.py` because the goal-pause path is unreachable when the auxiliary client is None (`judge_goal` returns `parse_failed=False`, which resets `consecutive_parse_failures`), so the reason string they added never surfaces in the pause message. Refs #23876	2026-05-14 20:15:29 -07:00
teknium1	d3d5916089	chore(release): add AUTHOR_MAP entry for outdoorsea	2026-05-14 20:14:40 -07:00
Jeremy Irish	eabd8c1fd1	fix(cli): fall back to SelectSelector when kqueue can't watch stdin On macOS with uv-managed cPython 3.11, the default kqueue selector cannot register fd 0, so prompt_toolkit's loop.add_reader raises OSError(EINVAL) ("[Errno 22] Invalid argument") from kqueue.control() and the agent crashes immediately on startup (#5884, also reported in #6393). Probe KqueueSelector.register(0, EVENT_READ) before launching prompt_toolkit. If it fails, install an event-loop policy that returns a SelectorEventLoop backed by SelectSelector — select() works fine on stdin in this Python build, so add_reader succeeds and the agent launches normally. Also extend the existing #6393 fallback handler to recognize EINVAL / EBADF / "Invalid argument" so that any future selector failure on stdin shows the friendly "reinstall Python via pyenv or Homebrew" guidance instead of an opaque traceback. Verified on macOS (Darwin 24.6.0) with uv-managed cPython 3.11.15: the kqueue probe fails, the policy switch fires, and `hermes` launches cleanly. No effect on platforms where kqueue can register fd 0.	2026-05-14 20:14:40 -07:00
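The probe-then-fallback approach in the commit above can be sketched with the standard `selectors` and `asyncio` modules. The helper names are hypothetical, and where the commit installs an event-loop policy, this sketch builds the SelectSelector-backed loop directly to stay short.

```python
import asyncio
import selectors


def stdin_needs_select_fallback() -> bool:
    """Return True if a kqueue selector exists but cannot register fd 0."""
    kqueue_cls = getattr(selectors, "KqueueSelector", None)
    if kqueue_cls is None:
        return False  # not a kqueue platform, nothing to work around
    selector = kqueue_cls()
    try:
        selector.register(0, selectors.EVENT_READ)
        selector.unregister(0)
        return False
    except (OSError, ValueError):
        # e.g. OSError(EINVAL) raised from kqueue.control(), as in the report
        return True
    finally:
        selector.close()


def make_select_loop() -> asyncio.AbstractEventLoop:
    """Build an event loop that polls with select() instead of kqueue."""
    return asyncio.SelectorEventLoop(selectors.SelectSelector())


# Usage sketch: switch loops only when the probe fails.
loop = make_select_loop() if stdin_needs_select_fallback() else asyncio.new_event_loop()
result = loop.run_until_complete(asyncio.sleep(0, result="ok"))
loop.close()
```

On platforms where kqueue can register fd 0, or where kqueue does not exist, the probe returns False and the default loop is used.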
teknium1	4695d2716f	fix(browser): honor pre-set AGENT_BROWSER_ARGS and document the bypass Follow-up to the sandbox-bypass env-var fix: - Update the opt-out gate so a user-provided AGENT_BROWSER_ARGS is also respected, not just the legacy AGENT_BROWSER_CHROME_FLAGS. Previously the gate only checked the broken legacy var, so a user who pre-set AGENT_BROWSER_ARGS would still get clobbered by Hermes's auto-injection. - Document AGENT_BROWSER_ARGS in .env.example, the browser feature page, and the env var reference, with notes about the auto-injection on AppArmor-restricted systems (Ubuntu 23.10+, DGX Spark, containers). - Add Anadi Jaggia to AUTHOR_MAP.	2026-05-14 19:02:17 -07:00

8389 commits