hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
Ben	1bbfed70c4	test(dashboard-auth): cover registry register/get/list/clear semantics Phase 1, Task 1.2. Verifies registration order is preserved, duplicate names are rejected with ValueError, and non-compliant providers fail at register time (not later when the middleware tries to dispatch).	2026-05-27 02:12:27 -07:00
Ben	2dc6d03a3d	feat(dashboard-auth): define DashboardAuthProvider ABC + Session dataclass Phase 1, Task 1.1. New package hermes_cli/dashboard_auth/ contains: base.py - DashboardAuthProvider ABC with 5 abstract methods (start_login, complete_login, verify_session, refresh_session, revoke_session), Session + LoginStart frozen dataclasses, three exception types (ProviderError / InvalidCodeError / RefreshExpiredError), and assert_protocol_compliance() for plugins to call in their own tests. registry.py - Module-level register/get/list/clear with a lock. Nothing reads the registry yet — Phase 2 adds the StubAuthProvider and Phase 3 wires the gate middleware. The plugin hook lands in Task 1.3.	2026-05-27 02:12:27 -07:00
Ben	949ad95e4b	feat(dashboard): stash auth_required flag on app.state Phase 0, Task 0.3. start_server now computes should_require_auth(host, allow_public) and records it on app.state.auth_required BEFORE the existing legacy SystemExit guard fires. This gives middleware, the SPA token-injection path, and WS endpoints a consistent read source for 'is the gate active'. The flag is set but no one reads it yet — Phase 3 registers the gate middleware. Note: 4 pre-existing test failures in tests/hermes_cli/test_web_server.py (PtyWebSocket) + test_update_hangup_protection.py reproduce on pristine HEAD and are unrelated to this change (starlette TestClient WS regression).	2026-05-27 02:12:27 -07:00
Ben	8773bbf186	feat(dashboard): add should_require_auth predicate for OAuth gate Phase 0, Task 0.2. Single source of truth for 'is the auth gate active?'. Reuses the existing _LOOPBACK_HOST_VALUES frozenset so this stays in sync with the DNS-rebinding host-header check. RFC1918/CGNAT/link-local are treated as public — exact threat model the gate exists for.	2026-05-27 02:12:27 -07:00
Ben	f2b479e7a2	test(dashboard): pin current loopback auth behavior as regression harness Phase 0, Task 0.1 of the dashboard-oauth plan. Establishes a baseline for the loopback dashboard's auth surface so future phases can prove they didn't regress the existing _SESSION_TOKEN flow when adding the OAuth gate.	2026-05-27 02:12:27 -07:00
Teknium	249534e472	plugins: add security-guidance — pattern-matched warnings on dangerous code writes (#33131 ) New opt-in plugin that scans the content passed to write_file / patch / skill_manage for 25 known-dangerous code patterns — pickle.load, yaml.load, eval(, os.system, subprocess(shell=True), child_process.exec, dangerouslySetInnerHTML, innerHTML/outerHTML/document.write/ insertAdjacentHTML, crypto.createCipher (no IV), AES ECB, TLS verification disabled, XXE-prone xml.etree/minidom parsers, <script src=//...> without SRI, torch.load without weights_only=True, GitHub Actions ${{ github.event.* }} injection — and appends a "Security guidance" warning block to the tool result via the transform_tool_result hook. Default behaviour is non-blocking: the file is written and the warning rides back to the model in the next turn so it can self-correct or document why the construct is safe. SECURITY_GUIDANCE_BLOCK=1 upgrades to refusing the write entirely; SECURITY_GUIDANCE_DISABLE=1 is the kill switch. Pattern data (patterns.py) is a verbatim Apache-2.0 fork of Anthropic's claude-plugins-official/plugins/security-guidance/hooks/ patterns.py at commit 0bde168 (2026-05-26). LICENSE and NOTICE preserve attribution. The Hermes-side plugin glue (__init__.py, plugin.yaml, README.md, tests) is original work. Plugin is opt-in like all bundled plugins: hermes plugins enable security-guidance Inspired by https://x.com/ClaudeDevs/status/1927108527247... — Anthropic shipped this as their security-guidance plugin for Claude Code on 2026-05-26 with a measured 30-40% reduction in security-related PR comments on internal rollout. What's NOT ported (deferred): * Layer 2 (LLM diff review on turn end) — would route through main model by default on Hermes, real money on reasoning models. A follow-up can wire it to a cheap aux model with explicit opt-in. * Layer 3 (agentic commit-time review) — agent can run this on demand via delegate_task today. * .hermes/security-guidance.md project-rules file — only used by layers 2/3 upstream.	2026-05-27 02:07:21 -07:00
Teknium	c752205635	chore(release): map superearn-fisher noreply for #33122 salvage	2026-05-27 02:06:21 -07:00
SuperEarn	4920f8437f	test(codex): cover null output stream terminal events	2026-05-27 02:06:21 -07:00
orcool	f0fdb5e67d	feat(catalog): add qwen3.7-max to alibaba + alibaba-coding-plan model lists Alibaba's latest flagship Qwen model is released but not yet present in the DashScope (alibaba) or Alibaba Coding Plan curated catalogs. Add it so it shows up in the /model picker and setup wizard for those providers. OpenCode Go routing for qwen3.7-max already landed via #32780 (commit `2fc77c53f`). OpenRouter + Nous catalog entries already landed via #32809 (commit `ccd3d04fc`). This salvage picks up the remaining alibaba / alibaba-coding-plan entries from #32806 — the AI Gateway entry is dropped because Vercel AI Gateway was removed in #33067.	2026-05-27 02:05:58 -07:00
Teknium	96223265b9	chore(api-server): mark skills_api capability True now that /v1/skills shipped #33016 added GET /v1/skills + /v1/toolsets on the API server; the capability flag introduced in this branch was placeholder-False. Flip to True so capability probers see the truth.	2026-05-27 01:56:55 -07:00
Jonathan	464b51d455	Support media in session chat API	2026-05-27 01:56:55 -07:00
Bailey Dixon	f7527b0fdb	feat: add API server session controls	2026-05-27 01:56:55 -07:00
Teknium	f0be32232d	chore(release): map EvilHumphrey noreply for #33034 salvage	2026-05-27 01:52:34 -07:00
EvilHumphrey	4243b6dc45	fix(codex): update silent-hang workaround hint	2026-05-27 01:52:34 -07:00
Siddharth Balyan	976979489a	feat(nix): add #messaging and #full package variants (#33108 ) * fix(plugins/discord): correct install_hint extra to [messaging] The Discord platform registered install_hint pointing at 'hermes-agent[discord]', but pyproject.toml has no [discord] extra — the deps live in [messaging] alongside Telegram and Slack. Users hitting "Platform 'Discord' requirements not met" were directed at a pip command that installs nothing. * feat(nix): add #messaging and #full package variants Make Discord/Telegram/Slack work out of the box for `nix profile install` users. Messaging deps were dropped from [all] on 2026-05-12 in favor of lazy-install, but lazy-install can't write to the read-only /nix/store — users hit "No adapter available for discord" with no actionable guidance. - #messaging: pre-built with discord.py/telegram/slack (+33 MB venv) - #full: all 18 platform-portable extras + matrix on Linux only (python-olm lacks Darwin PyPI wheels) (+738 MB venv) Also adds a `messaging-variant` flake check that verifies `import discord` succeeds in the sealed venv — regression guard for the lazy-install migration. Docs updated: Quick Start callout, extraDependencyGroups rewrite with messaging as primary example + full extras table, troubleshooting row, cheatsheet row. Closure size deltas (measured x86_64-linux): default 1792 MB pkg / 512 MB venv messaging 1826 MB pkg / 546 MB venv (+33 MB) full 2530 MB pkg / 1250 MB venv (+738 MB) * chore(nix): trim variant comments + alphabetize full extras Drop the date-stamped changelog from messaging-variant's comment and the "+33 MB / +704 MB" numbers from the variant defs — those drift and belong in the PR description, not source. Alphabetize the 18-extra list in #full so future additions produce clean one-line diffs. No semantic change. messaging-variant check still passes.	2026-05-27 14:15:39 +05:30
Teknium	25f43d38de	feat(api-server): add GET /v1/skills and /v1/toolsets (#33016 ) Lets external clients enumerate the agent's skills and resolved toolsets deterministically over the OpenAI-compatible API server, without standing up the dashboard web server or sending a chat message and asking the model to list them. - GET /v1/skills — list installed skills (name, description, category) - GET /v1/toolsets — list toolsets resolved for the api_server platform, with enabled/configured state and the concrete tool names each expands to - Both gated by API_SERVER_KEY (same Bearer scheme as every other /v1/* endpoint) - /v1/capabilities advertises both new endpoints Closes the gap a community user just hit asking how to list skills over REST when only the OpenAI-compatible server is running. Test plan - python -m pytest tests/gateway/test_api_server.py -k "Skills or Toolsets or Capabilities" -o 'addopts=' -q → 9/9 pass - python -m pytest tests/gateway/test_api_server.py -o 'addopts=' -q → 156/156 pass, no regressions - E2E: started a real adapter on an isolated HERMES_HOME with a fake skill installed; curl-equivalent calls to /v1/capabilities, /v1/skills, /v1/toolsets returned the expected JSON; unauthenticated calls returned 401 with the configured API_SERVER_KEY.	2026-05-27 01:27:26 -07:00
Teknium	febc4cfec0	remove Vercel AI Gateway and Vercel Sandbox (#33067 ) * remove Vercel AI Gateway provider and Vercel Sandbox terminal backend Both Vercel-hosted integrations are removed end-to-end. Users on the AI Gateway should switch to OpenRouter or one of the other aggregators (Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should switch to Docker, Modal, Daytona, or SSH. What's removed: - `plugins/model-providers/ai-gateway/` provider plugin - `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper - `tools/environments/vercel_sandbox.py` terminal backend - `ai-gateway` provider wiring across auth, doctor, setup, models, config, status, providers, main, web_server, model_normalize, dump - `vercel_sandbox` backend wiring across terminal_tool, file_tools, code_execution_tool, file_operations, approval, skills_tool, environments/local, credential_files, lazy_deps, prompt_builder, cli, gateway/run - `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client header set, run_agent base-URL header/reasoning special-cases - `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock - env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`, `VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`, `TERMINAL_VERCEL_RUNTIME` - Tests: deletes test_ai_gateway_models.py and test_vercel_sandbox_environment.py; scrubs references across 23 surviving test files (no entire tests deleted unless they were dedicated to AI Gateway / Sandbox) - Docs: provider tables, env-var reference, setup guides, security notes, tool config, terminal-backend tables — English plus zh-Hans i18n parity - `hermes-agent` skill: provider table entry and remote-backend list What stays (intentional): - `popular-web-designs/templates/vercel.md` — CSS design reference, unrelated to Vercel-the-AI-product - `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN response header, useful diag signal on any Vercel-hosted endpoint - `vercel-labs/agent-browser` URL in browser config — lightpanda browser project, different OSS effort - `userStories.json` historical contributor entry mentioning Vercel Sandbox — archive, not active docs Validation: - 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`) - Full repo `py_compile` clean - Live import of every touched module + invariant check (no `ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no `vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`) * test: convert profile-count check from change-detector to invariant The hardcoded "== 34" assertion broke when ai-gateway was removed. Per AGENTS.md change-detector-test guidance, assert the relationship (registry count >= number of plugin dirs) instead of a literal count. Counts shift when providers are added/removed; that's expected.	2026-05-27 00:43:32 -07:00
Teknium	cb38ce28cb	refactor(codex): drop SDK responses.stream() helper; consume events directly (#33042 ) * refactor(codex): drop SDK responses.stream() helper; consume events directly The OpenAI Python SDK's high-level `client.responses.stream(...)` helper does post-hoc typed reconstruction from the terminal `response.completed.response.output` field. The chatgpt.com Codex backend has been observed (today, gpt-5.5) to ship `response.output = null` on terminal frames, which crashes the SDK with `TypeError: 'NoneType' object is not iterable` mid-iteration. Carlton's #32963 patched the symptom by wrapping the helper in try/except and recovering from the same per-event accumulator the SDK was supposed to populate. This PR removes the helper from the call path entirely: we now use `client.responses.create(stream=True)` (raw AsyncIterable of SSE events) and assemble the final response object ourselves from `response.output_item.done` events as they arrive. The terminal event's `output` field is never read for content. Same strategy OpenClaw uses for the same backend. This makes Hermes structurally immune to the bug class, not patched. The next time OpenAI ships a shape change to chatgpt.com's terminal frame, our consumer keeps working because it doesn't read that frame for content — only for usage/status/id. Changes - `agent/codex_runtime.py`: new `_consume_codex_event_stream()` shared consumer; `run_codex_stream()` uses `responses.create(stream=True)`; `run_codex_create_stream_fallback()` collapses into a thin alias since the primary path now does what the fallback used to do. - `agent/auxiliary_client.py`: `_CodexCompletionsAdapter` uses the same consumer; old null-output recovery helpers deleted as unreferenced. - Tests migrated: fixtures that mocked `responses.stream` now mock `responses.create` returning a raw iterable. New regression test asserts the auxiliary path returns streamed items even when the terminal event's `output` is literally `null`. Validation - Live: tested against fresh OAuth on `chatgpt.com/backend-api/codex` with `gpt-5.5` — response built correctly with `response.output=null` on the terminal frame, all events consumed, usage/reasoning tokens propagated. - `tests/run_agent/test_run_agent_codex_responses.py` + `tests/agent/test_auxiliary_client.py`: 242 passed. * test+fix(codex): migrate streaming tests, raise on truncated streams CI surfaced 10 test failures across tests/run_agent/test_streaming.py and tests/run_agent/test_codex_xai_oauth_recovery.py — both files had their own `responses.stream(...)` mocks I missed in the first sweep. agent/codex_runtime.py: _consume_codex_event_stream() now raises "Codex Responses stream did not emit a terminal response" when the stream ends without any terminal frame AND no usable content. This preserves the signal callers used to get from the SDK's high-level helper, which they distinguished from "completed with empty body" in error handling. Tests migrated: - test_streaming.py: text-delta callback, activity-touch, and remote-protocol-error tests all switch from mocking responses.stream to responses.create returning an iterable of events. - test_codex_xai_oauth_recovery.py: prelude-error tests are recast as wire-error-event tests (the new path raises _StreamErrorEvent directly when the wire emits type=error, which is strictly better than the old two-phase "SDK RuntimeError → retry → fallback"). The retry-on-transport-error test moves from responses.stream side-effect to responses.create side-effect. Verified live against chatgpt.com Codex with gpt-5.5 — AIAgent.chat() through the full codex_responses path returns correctly, 319/319 targeted tests passing.	2026-05-27 00:30:06 -07:00
Ben	fb298a958c	fix(docker): mkdir HERMES_HOME as root in stage2 before chown / privilege drop (#18488 ) When HERMES_HOME points at a custom path whose parent directories only root can create (e.g. HERMES_HOME=/home/hermes/.hermes in a Compose file, or any path under a fresh / not pre-populated by the image), stage2-hook.sh fails on first boot: [stage2] Warning: chown failed (rootless container?) - continuing mkdir: cannot create directory '/custom': Permission denied mkdir: cannot create directory '/custom': Permission denied ... (one per s6-setuidgid hermes mkdir invocation) cont-init: info: /etc/cont-init.d/01-hermes-setup exited 1 The mkdirs fail because s6-setuidgid drops to hermes (UID 10000) before invoking mkdir -p, and the runtime user has no permission to create root-owned ancestor directories. 02-reconcile-profiles then crashes with FileNotFoundError, .install_method never lands, and the container limps on in a half-initialized state. Bootstrap HERMES_HOME with mkdir -p while still root, before the ownership normalization. Idempotent on the default /opt/data path (directory already exists from the Dockerfile RUN mkdir -p) and on any subsequent restart. (#18482) Retargeted from the original PR's docker/entrypoint.sh (now a deprecated shim) to docker/stage2-hook.sh where the related chown logic moved during the s6-overlay rework. Co-authored-by: wpengpeng168 <133926080+wpengpeng168@users.noreply.github.com>	2026-05-27 17:16:40 +10:00
Ben	c3bdb2af37	ci(docker): add shellcheck shell=sh directive to main-wrapper.sh shellcheck doesn't recognize the s6-overlay `#!/command/with-contenv sh` shebang and aborts with SC1008 ("This shebang was unrecognized. ShellCheck only supports sh/bash/dash/ksh/'busybox sh'. Add a 'shell' directive to specify."). The error fires at --severity=error too, so it fails the "Docker / shell lint" CI job on every PR that touches docker/. Add the canonical `# shellcheck shell=sh` directive — same fix already applied to the sibling cont-init.d scripts (`02-reconcile-profiles` and `015-supervise-perms`) when they adopted the with-contenv shebang. The shebang was changed from `#!/bin/sh` → `#!/command/with-contenv sh` in PR #32412 (commit `29c71e9`) to fix env-propagation through s6's PID 1. The shellcheck-directive line was missed in that PR; this patches it. Reproduces locally: docker run --rm -v "$PWD:/mnt" -w /mnt koalaman/shellcheck:stable \ --severity=error --format=gcc docker/main-wrapper.sh Before: docker/main-wrapper.sh:1:1: error: [SC1008] (rc=1) After: (no output) (rc=0) Script behavior is unchanged — the directive is a comment, and `sh -n` / `bash -n` parse the file cleanly either way.	2026-05-27 16:32:35 +10:00
Ben	27a29ee54e	feat(docker): upgrade Node to 22 LTS via multi-stage from node:22-bookworm-slim (#4977 ) Debian trixie's bundled `nodejs` package is pinned to 20.19.2, which reached LTS EOL in April 2026. Trixie won't upgrade in place; Debian 14 (forky) — where the apt nodejs is 24.x — isn't released until ~mid-2027. To stay on a supported LTS without waiting for Debian 14, copy node + npm + corepack from the upstream `node:22-bookworm-slim` image as a multi-stage source, matching the existing `uv_source` and `gosu_source` patterns in the Dockerfile. Bookworm-based slim image is used so the produced binary links against glibc 2.36, which runs cleanly on Debian 13 (trixie, glibc 2.41). Changes: - Add `FROM node:22-bookworm-slim@sha256:... AS node_source` stage - Remove `nodejs npm` from `apt-get install` (now sourced from node_source) - Add `ca-certificates` explicitly to apt install (was a transitive of the apt nodejs package; removing nodejs broke the chain and curl inside the build failed with "error setting certificate file") - COPY node binary + npm + corepack from node_source; recreate the symlinks at /usr/local/bin/{npm,npx,corepack} - Update the npm_config_install_links=false comment block — npm 10's default is already `install-links=false`, but we keep the env as defense-in-depth against future Node-source-version regressions Future bumps to Node 24/26 are a one-line ARG change. Validation: - Built --no-cache against current origin/main; build succeeds in 1m42s - Image size: 3.27 GB (pre-salvage-1 baseline) → 3.14 GB (this PR); net 130 MiB savings (60 MiB from this change alone vs current main — removing apt nodejs+transitive deps that duplicated what node bundles) - Node 22.22.3 / npm 10.9.8 / esbuild 0.27.7 all run cleanly under trixie's glibc 2.41 - Standard image smoke (6/6), Node-version E2E (8/8), chown E2E from #19788 (6/6), TUI UID-remap E2E from #28851 (4/4) — 24 checks total Co-authored-by: Prithvi Monangi <8312237+Prithvi1994@users.noreply.github.com>	2026-05-27 16:22:21 +10:00
Ben	22eb4d13f7	fix(docker): chown ui-tui and node_modules on UID remap so TUI esbuild works (#28851 ) Some checks failed Deploy Site / deploy-vercel (push) Waiting to run Details Deploy Site / deploy-docs (push) Waiting to run Details Docker / shell lint / Lint Dockerfile (hadolint) (push) Waiting to run Details Docker / shell lint / Lint docker/ shell scripts (shellcheck) (push) Waiting to run Details Docker Build and Publish / build-amd64 (push) Waiting to run Details Docker Build and Publish / build-arm64 (push) Waiting to run Details Docker Build and Publish / merge (push) Blocked by required conditions Details Docker Build and Publish / move-latest (push) Blocked by required conditions Details Lint (ruff + ty) / ruff + ty diff (push) Waiting to run Details Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run Details Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run Details Nix / nix (macos-latest) (push) Waiting to run Details Nix / nix (ubuntu-latest) (push) Waiting to run Details OSV-Scanner / Scan lockfiles (push) Waiting to run Details Tests / test (1) (push) Waiting to run Details Tests / test (2) (push) Waiting to run Details Tests / test (3) (push) Waiting to run Details Tests / test (4) (push) Waiting to run Details Tests / test (5) (push) Waiting to run Details Tests / test (6) (push) Waiting to run Details Tests / save-durations (push) Blocked by required conditions Details Tests / e2e (push) Waiting to run Details uv.lock check / uv lock --check (push) Waiting to run Details Build Skills Index / build-index (push) Has been cancelled Details Build Skills Index / trigger-deploy (push) Has been cancelled Details When HERMES_UID remaps the hermes user from 10000 to another UID (e.g. matching the host user's UID for bind-mount ergonomics), the TUI launcher's esbuild step fails: ✘ [ERROR] Failed to write to output file: open /opt/hermes/ui-tui/dist/entry.js: permission denied TUI build failed. This is because the Dockerfile's build-time `chown -R hermes:hermes` on `/opt/hermes/{.venv,ui-tui,node_modules}` (line 154) wrote UID 10000, and stage2-hook.sh only re-chowned `.venv` on UID remap — leaving the TUI build trees still owned by the old UID. Extend the stage2 re-chown to include the same set as the build-time chown: `.venv`, `ui-tui`, `node_modules`. These are the runtime-writable trees under $INSTALL_DIR; everything else under /opt/hermes is read-only at runtime so keeping it root-owned is fine. Original fix targeted docker/entrypoint.sh which is now a deprecated shim; retargeted to docker/stage2-hook.sh where the .venv chown moved during the s6-overlay rework. Co-authored-by: Andreas Steffan <623481+deas@users.noreply.github.com>	2026-05-27 15:41:48 +10:00
Ben	9eadb6805c	fix(docker): targeted chown to preserve host file ownership in HERMES_HOME (#19795 ) Replaces the recursive chown of $HERMES_HOME in stage2-hook.sh with a targeted approach: chown the top-level dir (so hermes can create new subdirs) plus the specific hermes-owned subdirectories (cron/, sessions/, logs/, hooks/, memories/, skills/, skins/, plans/, workspace/, home/, profiles/) — the same canonical list seeded by the s6-setuidgid mkdir -p block below. Avoids clobbering host-side file ownership when $HERMES_HOME is a bind mount that contains user-owned files not managed by hermes (issue #19788). Original fix targeted docker/entrypoint.sh which is now a deprecated shim; retargeted to docker/stage2-hook.sh where the recursive chown moved during the s6-overlay rework. Co-authored-by: Ptichalouf <1809721+ptichalouf@users.noreply.github.com>	2026-05-27 15:08:41 +10:00
Teknium	b6ca56f651	fix(codex-responses): gracefully recover from invalid_encrypted_content (salvage #10144 ) (#33035 ) * fix(codex-responses): gracefully recover from invalid_encrypted_content (salvage #10144) When an OpenAI-compatible Responses API surface accepts an initial request but later rejects the replayed `codex_reasoning_items` encrypted blob with HTTP 400 `invalid_encrypted_content`, the session previously got stuck retrying the same poisoned payload. Recovery: classify the error as a dedicated FailoverReason, and on the first hit disable encrypted reasoning replay for the rest of the session, strip cached items from message history, and retry once. Changes: * error_classifier: add FailoverReason.invalid_encrypted_content branch in _classify_400 (before context_overflow so the messages that mention 'encrypted content … could not be verified' don't trip context heuristics), in _classify_by_error_code, and extend _extract_error_code to peek inside wrapped JSON in error.message and ignore the bare '400' as a code. * agent_init: initialize `_codex_reasoning_replay_enabled = True` on every agent. * run_agent: add AIAgent._disable_codex_reasoning_replay() helper that flips the flag and pops cached items. * codex_responses_adapter: thread a `replay_encrypted_reasoning` kwarg through _chat_messages_to_responses_input so that when the flag is False we don't replay codex_reasoning_items. * transports/codex.py: read `replay_encrypted_reasoning` from params, thread it into the adapter, and gate the `include=['reasoning.encrypted_content']` request hint on it. * chat_completion_helpers: pass the agent's replay flag through to the transport. * conversation_loop: in the retry loop, add an invalid_encrypted_content recovery branch that fires once per session, only when api_mode == codex_responses, only when replay is still enabled, and only when at least one assistant message in history actually carries cached reasoning items (otherwise the 400 has nothing to do with our cache and the normal retry path handles it). Tests: * test_error_classifier: new wrapped-JSON _extract_error_code case; new TestClassifyApiError cases proving the 400 is retryable with no fallback, that the broad message match doesn't catch a generic 'parsed' message, and that the error code match is case-insensitive. * test_run_agent_codex_responses: end-to-end test of the recovery branch firing once and disabling replay, plus a sibling test that proves the branch does not fire (and the flag stays True) when history has no cached reasoning items. Salvages PR #10144 onto the post-refactor module layout (error_classifier / codex_responses_adapter / transports/codex / conversation_loop / agent_init) since the original diff was written against the pre-refactor monolithic run_agent.py. * chore(release): map victorGPT in AUTHOR_MAP for #10144 salvage --------- Co-authored-by: victorGPT <wuxuebin1993@gmail.com>	2026-05-26 22:01:17 -07:00
Jeffrey Quesnelle	9d3e9316f4	Merge pull request #29591 from NousResearch/jq/hermes-update-branch-flag feat(cli): add --branch flag to `hermes update`	2026-05-27 00:57:37 -04:00
emozilla	3d9a26afad	Merge remote-tracking branch 'origin/main' into jq/hermes-update-branch-flag	2026-05-27 00:48:25 -04:00
Ben	1e5884e38f	refactor(docker): drop build-essential from apt install (#27507 ) build-essential is a Debian metapackage (libc6-dev + gcc + g++ + make + dpkg-dev). The Dockerfile already installs gcc + python3-dev + libffi-dev explicitly, which covers the C-ext compile cases lazy_deps may hit at first boot. g++/make/dpkg-dev aren't reached by the resolved [all]+[messaging] tree on current main — verified via uv sync --dry-run on cp313-linux. Co-authored-by: Monty Taylor <mordred@inaugust.com>	2026-05-27 14:35:19 +10:00
Ben Barclay	81a4f280d2	Merge pull request #22534 from wesleysimplicio/fix/voice-mode-docker-respect-pulse-pipewire fix(voice): honor PULSE_SERVER/PIPEWIRE_REMOTE inside Docker (#21203)	2026-05-27 13:59:12 +10:00
teknium1	9feadc2734	chore(release): map ticketclosed-wontfix noreply to GitHub login	2026-05-26 20:51:59 -07:00
Nick	0a83247e9f	feat: add TUI session orchestrator Add a first-class active-session orchestrator for the Ink TUI: - list, activate, close, and launch live process-local TUI sessions - hydrate committed and in-flight output when switching sessions - dispatch a new prompt session from the +new row with session-scoped model picks - expose a clickable live-session count in the status chrome - preserve stable row order while initially focusing the current session - support mouse hit-testing for floating orchestrator overlays - add backend and frontend regression coverage for the lifecycle and UI helpers	2026-05-26 20:51:59 -07:00
beardthelion	2fc77c53f0	feat(opencode-go): route qwen3.7-max via anthropic_messages qwen3.7-max on OpenCode Go rejects the OpenAI-compatible (oa-compat) format with HTTP 401 but works correctly via the Anthropic Messages endpoint (/v1/messages with x-api-key auth). Route it the same way MiniMax models are routed: anthropic_messages api_mode. Changes: - hermes_cli/models.py: add qwen3.7-max routing + curated list - hermes_cli/setup.py: add to setup wizard model list - hermes_cli/auth.py: update provider comment - tests: add assertions for qwen3.7-max api_mode routing	2026-05-26 20:44:43 -07:00
Ben Barclay	3c7f786ade	Merge pull request #31557 from yu-xin-c/codex/docs-xurl-docker-home-29108 docs: clarify xurl auth HOME in Docker	2026-05-27 13:42:51 +10:00
Ben Barclay	7d94eee0a9	Merge pull request #32122 from yu-xin-c/codex/docs-docker-audio-bridge-32009 docs: add Docker audio bridge notes	2026-05-27 13:42:36 +10:00
Ben Barclay	628aaea63a	Merge pull request #32412 from jonpol01/fix/docker-env-propagation fix(docker): propagate env through s6 to cont-init and main CMD	2026-05-27 13:42:26 +10:00
Ben Barclay	840f79ed12	Merge pull request #31031 from Sunil123135/feat/windows-docker-desktop feat(docker): add Windows Docker Desktop compatible compose file	2026-05-27 13:41:16 +10:00
Will Falcon	bba50977bc	fix: parse Codex image generation SSE directly	2026-05-26 20:40:29 -07:00
Teknium	16e86ce6a7	chore(release): map wangpuv contributor email for #32933 (#33005 ) Pre-stages the AUTHOR_MAP entry so the contributor-check workflow passes when Will Falcon's image-gen SSE fix lands.	2026-05-26 20:40:17 -07:00
Ben Barclay	1e267c4859	Merge pull request #29025 from slowtokki0409/codex/ignore-local-runtime-files Ignore local Hermes runtime files	2026-05-27 13:20:01 +10:00
Teknium	2a8d217417	chore(release): map carltonawong noreply to GitHub login Added AUTHOR_MAP entry for the cherry-picked fix in the preceding commit so the release contributor audit can resolve Carlton's noreply email.	2026-05-26 19:37:37 -07:00
Carlton	43a3f119fc	fix(agent): recover Codex streams with null output	2026-05-26 19:37:37 -07:00
Teknium	bb4703c761	docs(auth): replace stale 'hermes login' references with 'hermes auth add' 'hermes login' was removed (the command now just prints a deprecation message and exits). The bundled hermes-agent SKILL.md, in-code error messages, the tip rotation, the proxy adapters, and the docs site still pointed agents and users at the dead command — so models loading the skill kept running 'hermes login --provider openai-codex' and getting a dead-end print. Replacements use the canonical 'hermes auth add <provider>' surface (or bare 'hermes auth' for the interactive manager). Files: - skills/autonomous-ai-agents/hermes-agent/SKILL.md (+ regenerated docs page) - hermes_cli/tips.py (tip rotation) - agent/google_oauth.py (gemini-cli error message) - agent/conversation_loop.py (nous re-auth troubleshooting line) - agent/credential_sources.py (docstring) - hermes_cli/proxy/cli.py + hermes_cli/proxy/adapters/nous_portal.py (proxy auth hints) - tests/hermes_cli/test_proxy.py (updated assertions) - website/docs/reference/faq.md, website/docs/user-guide/features/subscription-proxy.md - zh-Hans i18n mirrors for the above 'hermes logout' is still a live command and is left untouched. The 'hermes login' stub in hermes_cli/auth.py:login_command() and the cli-commands.md 'Deprecated' rows are intentionally kept as the discoverable deprecation surface.	2026-05-26 15:41:11 -07:00
teknium1	f05a47309e	fix(gateway): refresh cached agent tools on /reload-mcp When the gateway processes /reload-mcp, it reconnects MCP servers and updates the global _servers registry, but cached AIAgent instances in _agent_cache keep the tools list they were built with. The user had to also run /new (discarding conversation history) before the agent could see the new tools — even though /reload-mcp had succeeded. This patch refreshes each cached agent's .tools and .valid_tool_names in _execute_mcp_reload after discovery returns, so existing sessions pick up new MCP tools on their next turn. The slash-confirm gate in _handle_reload_mcp_command already obtains user consent for the implied prompt-cache invalidation before this code runs. Mirrors the equivalent behaviour the CLI already does in cli.py _reload_mcp. Per-agent enabled_toolsets and disabled_toolsets are preserved so an agent that was scoped to a subset of toolsets does not silently gain disabled tools after the reload. Original diagnosis + initial implementation in #23812 from @fujinice. The auto-reload watcher half of that PR is intentionally dropped — users want /reload-mcp to remain explicit. Co-authored-by: fujinice <45688690+fujinice@users.noreply.github.com>	2026-05-26 14:28:51 -07:00
teknium1	556bf7c5c1	test(cron): guard schedule-required description text on CRONJOB_SCHEMA	2026-05-26 14:09:37 -07:00
ygd58	51013268cf	fix(cron): clarify schedule is required for create in tool schema Grok models (and other LLMs) sometimes omit the schedule parameter when calling the cronjob tool with action=create because the schema only listed 'action' in required[] and the schedule description did not explicitly state it was mandatory (issue #32427). Fix: update schema descriptions to clearly state schedule is REQUIRED for action=create, making this explicit for models that rely on description text for parameter compliance. Fixes #32427	2026-05-26 14:09:37 -07:00
Teknium	ccd3d04fc5	chore(models): swap qwen3.6-plus → qwen3.7-max in openrouter+nous lists (#32809 ) Updates curated picker lists for both the OpenRouter fallback snapshot (`OPENROUTER_MODELS`) and the Nous Portal list (`_PROVIDER_MODELS['nous']`). Regenerates website/static/api/model-catalog.json via `scripts/build_model_catalog.py` to keep the docs-hosted manifest in sync (drift guard in `test_in_repo_lists_match_manifest`). tests/hermes_cli/test_models.py fixtures updated — they pinned the old model id as their live-fetch sample.	2026-05-26 14:01:47 -07:00
Teknium	8b69ec03af	feat(mcp): Nous-approved MCP catalog with interactive picker (#30870 ) * feat(mcp): Nous-approved MCP catalog with interactive picker Adds an optional-mcps/ directory mirroring optional-skills/: curated, Nous-approved MCP servers shipped with the repo but disabled by default. Presence in optional-mcps/ = approval. No community tier, no trust signals. Entries are added by merging a PR. New surface: hermes mcp Interactive catalog picker (default) hermes mcp catalog Plain-text list, scriptable hermes mcp install <name> Install a catalog entry Picker behavior: not installed -> install (clone/bootstrap if needed, prompt for creds) installed/off -> enable installed/on -> menu (disable / uninstall / reinstall) Manifest schema (manifest_version: 1) supports: - transport: stdio (command/args, ${INSTALL_DIR} substitution) or http (url) - install: optional git clone + bootstrap commands (for repos that need local venv setup, like the n8n bridge); omit for npx/uvx servers - auth: api_key (prompts -> ~/.hermes/.env), oauth (provider-mediated or native MCP), or none Catalog entries are never auto-updated. Users re-run `hermes mcp install` to refresh. Credentials always go to ~/.hermes/.env (the .env-is-for-secrets rule), never to per-server env blocks. Ships n8n as the reference manifest (https://github.com/CyberSamuraiX/hermes-n8n-mcp). Tests: 19 catalog tests + E2E install/uninstall round-trip via the shipped manifest. * feat(mcp): tool-selection checklist + Linear catalog entry Adds install-time tool selection so users only enable the MCP tools they actually want, and ships Linear as a second reference catalog entry to demonstrate the http+oauth path alongside n8n's stdio+api_key+git-bootstrap. Tool selection flow: install (clone/auth/credentials) -> probe server for available tools -> curses checklist with pre-checked rows -> write mcp_servers.<name>.tools.include Pre-check priority: 1. user's prior tools.include (reinstall preserves selection) 2. manifest's tools.default_enabled (curated subset) 3. all probed tools (default) Probe-failure fallback (server unreachable, OAuth not yet complete, backing service offline): - manifest declared default_enabled -> applied directly - no default declared -> no filter written (all-on when reachable) - both cases point user at hermes mcp configure <name> Manifest schema additions: tools: default_enabled: [list, of, tool, names] # optional Updates: - optional-mcps/linear/manifest.yaml -- new reference entry (http+oauth) - optional-mcps/n8n/manifest.yaml -- tools.default_enabled set to the 8 read-mostly tools; mutating tools (activate/deactivate, container_logs) pruned by default - docs: new 'Tool selection at install time' section in features/mcp.md Tests: 7 new tests in TestToolSelection covering probe-success / probe-fail matrix, manifest-default filtering, reinstall-preserves-selection, and invalid-default-enabled rejection. 26 catalog tests + 32 existing mcp_config tests passing. * feat(mcp): polish — picker unification, include-mode convergence, hardening Addresses review findings on PR #30870. Lands all improvements that belong in this PR before merge; defers separate cleanup (consolidating two probe implementations, change-detector tests) to follow-ups. Picker UX (mcp_picker.py) - Unifies catalog + custom (user-added) MCPs in one view with distinct status badges (available / enabled / installed (disabled) / custom — enabled / custom — disabled) - Adds 'Configure tools (probe server + re-pick)' action to both the catalog-installed and custom-row submenus — the existing hermes mcp configure flow was previously unreachable from the picker - Loops until ESC/q so the user can manage several entries in one session instead of having to re-launch - Uninstall message now mentions .env credentials are preserved with a pointer to clean them up manually if no longer needed - Surfaces a 'requires a newer Hermes' warning per future-manifest entry instead of silently hiding it Catalog (mcp_catalog.py) - catalog_diagnostics() exposes which manifests were skipped and why (future_manifest vs invalid) so UIs can give actionable feedback - _do_git_install detects SHA-shaped refs (regex /[0-9a-f]{7,40}/) and skips the doomed 'git clone --branch <sha>' attempt — clone --branch only accepts branches/tags, so SHAs always failed noisily before falling back to the full-clone path - Probe-success all-tools-enabled message now mentions that new tools the server adds later will be auto-enabled (no-filter mode) Convergence (tools_config.py) - _configure_mcp_tools_interactive now writes tools.include (whitelist) instead of tools.exclude (blacklist), matching the catalog flow and hermes mcp configure. The on-disk config shape no longer depends on which UI the user touched last - Two existing tests updated to assert the new include-mode contract Discoverability - Setup wizard final step now prints 'Browse curated MCPs: hermes mcp' - Three tip-corpus entries pointing at the new catalog - Docs updated with: trust model (manifests run code locally, gated by PR review, but read before installing), runtime ${ENV_VAR} substitution semantics, and the manifest_version forward-compat behavior Tests - 7 new tests covering future-manifest diagnostics, custom MCP picker rows, SHA-ref git-install path, branch-ref git-install path, and the tools_config include-mode write contract - 80 MCP-related tests passing across test_mcp_catalog.py, test_mcp_config.py, test_mcp_tools_config.py * fix(mcp): drop setup-wizard catalog hint to satisfy supply-chain scanner The wizard line 'Browse curated MCPs: hermes mcp' triggered the CI supply-chain scanner because it pattern-matches on edits to any file named hermes_cli/setup.py — that filename matches the Python 'install-hook file' heuristic even though this setup.py is the user-facing 'hermes setup' wizard, not a packaging install hook. The catalog is already surfaced via three tip-corpus entries in hermes_cli/tips.py (which the scanner doesn't flag), so dropping the wizard mention loses no discoverability. Worth revisiting after a scanner allowlist for this specific file lands.	2026-05-26 12:48:14 -07:00
Teknium	2517917de3	fix(cli): restore fallback paste collapse + handle long single-line pastes (#32447 ) Follow-up to #32087 after community report from @ethernet that 8000-char single-line pastes get dumped raw into the input box. A) Fallback regression revert paste_collapse_threshold_fallback default: 0 -> 5 #32087 disabled the fallback handler by default. The fallback path has been always-on with line_count >= 5 since #3065 (March 2026); the previous shape was the salvaged contributor's design and didn't match pre-existing behavior for terminals without bracketed paste support (Windows terminals, some SSH setups). Restoring the original on-by-default. B) Long single-line paste guard New config key: paste_collapse_char_threshold (default 2000) Bracketed-paste handler and fallback handler now BOTH collapse when line count >= line threshold OR total char length >= char threshold. Catches the case ethernet hit: ~8000 chars of minified JSON / log output on a single line dumped raw into the buffer. TUI mirrors the same config via uiStore.pasteCollapseChars. Set 0 to disable. Defaults verified: paste_collapse_threshold: 5 paste_collapse_threshold_fallback: 5 paste_collapse_char_threshold: 2000 Tests: tests/hermes_cli/test_config.py: 87/87 pass ui-tui useConfigSync.test.ts: 34/34 pass ui-tui useComposerState.test.ts: 9/9 pass tsc: 0 new errors in touched files	2026-05-25 23:49:01 -07:00
Teknium	31c8d5ff5f	chore(wecom): make defusedxml dep acquireable and tolerant of absence Follow-up on top of @TheOnlyMika's #32155 cherry-pick. The defusedxml hardening import was unconditional, which would break the gateway for anyone running a WeComCallback adapter without the (transitive-only) defusedxml present. - Wrap the import in the same try/except pattern as aiohttp/httpx in the same file. Sets DEFUSEDXML_AVAILABLE flag. - Extend check_wecom_callback_requirements() to gate on the flag, so the gateway logs the actual missing dep and skips the adapter instead of crashing. - Add [wecom] extra to pyproject.toml with defusedxml==0.7.1. - Register platform.wecom_callback in tools/lazy_deps.py so users get prompted to install it on first WeComCallback configuration, same pattern as discord/slack/matrix. defusedxml is still the right call for pre-auth XML parsing — this commit just makes the dep declarative and recoverable instead of a hard import-time crash.	2026-05-25 23:30:43 -07:00
TheOnlyMika	5744b17579	harden: restrict markdown link schemes; parse untrusted XML with defusedxml Two small defensive-hardening changes: - web/src/components/Markdown.tsx: render links only for http(s)/mailto schemes; other schemes (javascript:, data:, vbscript:) are dropped to plain text so a crafted link in rendered content can't execute on click. - gateway/platforms/wecom_callback.py: parse the untrusted, pre-auth WeCom callback request body with defusedxml instead of xml.etree, blocking entity-expansion / billion-laughs (and XXE) on the parse path. defusedxml is already a dependency (uv.lock); response-building XML in wecom_crypto.py is unchanged (it is not parsed from untrusted input). Verified: dashboard typechecks and builds; defusedxml blocks an entity-expansion payload while valid WeCom envelopes still parse.	2026-05-25 23:30:43 -07:00
dearmayo	f4953bc648	fix(subdirectory_hints): prevent loading AGENTS.md outside workspace SubdirectoryHintTracker was scanning directories outside the active working directory, allowing files like ~/.codex/AGENTS.md or ~/.claude/CLAUDE.md to be loaded and injected into the agent context. This causes cross-agent context contamination and instruction mixup. Add _is_ancestor_or_same() helper and a path boundary check in _is_valid_subdir(): only directories within the working directory tree (i.e. path.is_relative_to(working_dir)) are allowed. Also add exist_ok=True to mkdir() calls in new tests to prevent pytest-xdist race conditions when workers share the same tmp_path parent. Tests added: - test_outside_working_dir_rejected: verifies sibling dirs are blocked - test_outside_working_dir_absolute_path_rejected: verifies ~/.codex paths blocked - test_inside_workspace_subdir_allowed: verifies normal subdir access unaffected - test_sibling_repo_not_loaded_via_ancestor_walk: ancestor walk stays within workspace	2026-05-25 23:17:33 -07:00

1 2 3 4 5 ...

9629 commits