hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-23 10:42:00 +00:00

Author	SHA1	Message	Date
Teknium	524cbabd89	chore(release): add dandacompany to AUTHOR_MAP for salvaged PR #20503	2026-05-08 17:01:12 -07:00
teknium1	d606df8126	docs(cli): call out Ctrl+Enter for Windows Terminal users Windows Terminal captures Alt+Enter at the terminal layer (fullscreen toggle), so documenting 'Alt+Enter or Ctrl+J' without qualification leaves stock Windows Terminal users with no working newline key they can discover from the docs alone. - Main keybindings row: note Alt+Enter is intercepted on WT and direct users to Ctrl+Enter / Ctrl+J instead. - Shift+Enter compatibility table: split 'stock Windows Terminal' from Windows Terminal Preview 1.25+ (which added Kitty protocol support and works with the keybinding from this PR once enabled). - Add AUTHOR_MAP entry for ra2157218@gmail.com -> Abd0r so the salvage commit passes the email-mapping CI gate.	2026-05-08 16:26:51 -07:00
Teknium	59fbcd5ccb	fix(install.ps1): strip UTF-8 BOM that broke [scriptblock]::Create Commit `3dfb35700` accidentally saved scripts/install.ps1 with a UTF-8 BOM (EF BB BF) at byte 0. PowerShell's normal file-execution path (`& .\install.ps1`) handles BOMs fine, but the curl-and-iex one-liner documented in the README uses `[scriptblock]::Create((irm ...))` which does NOT strip BOMs — the BOM lands inside the param() block and fails with 'The assignment expression is not valid' on $Branch and $HermesHome. teknium1 hit this trying to reinstall from the PR branch after Brooklyn's commits landed. Every user trying the PR branch install-one-liner hit it too until we notice. Saved without BOM, verified via xxd: file now starts with '# =====' at byte 0 instead of EF BB BF.	2026-05-08 14:27:40 -07:00
Teknium	0548facc50	fix(windows): gateway status dedup + install.ps1 platform-SDK bootstrap ## Two residual Windows fixes that were hanging from earlier commits. ### 1. `hermes gateway status` reported 2 PIDs per gateway — TWO bugs compounded Diagnosed with psutil parent/child walk against live gateway PIDs: Bug A (the real one): `_get_parent_pid` silently failed on Windows. The helper shelled out to `ps -o ppid= -p <pid>`, which doesn't exist on Windows — `FileNotFoundError` → returns `None` → the ancestor walk terminated at `os.getpid()` alone. Consequence: the PID table scan in `_scan_gateway_pids` couldn't filter out `hermes gateway status`'s own launcher stub (a venv `pythonw.exe`/`python.exe` that matches the same `-m hermes_cli.main gateway` pattern as the gateway). Every status call saw "itself" as a second gateway. Fix: `_get_parent_pid` now calls `psutil.Process(pid).ppid()` first (psutil is a core dependency since `3dfb35700`) and falls back to `ps` only when `shutil.which("ps")` succeeds — matching the Windows-footgun checker's "always guard `ps` / `wmic` / etc. with `shutil.which`" rule. Before: `Gateway process running (PID: 21952, 46880)` — 46880 changing on every call (the status invocation's own launcher, which died by the time the next status call looked). After (5 consecutive calls): ``` ✓ Gateway process running (PID: 21952) ✓ Gateway process running (PID: 21952) ✓ Gateway process running (PID: 21952) ✓ Gateway process running (PID: 21952) ✓ Gateway process running (PID: 21952) ``` Ancestor walk on the fix: 14 PIDs (full chain through bash/explorer) instead of the broken 1-PID set. Bug B (the cosmetic one): venv-launcher dedup. Standard Windows CPython venv behaviour is that `<venv>/Scripts/pythonw.exe` is a ~5 MB launcher stub that spawns the base Python (`C:\\Program Files\\Python311 \\pythonw.exe`) with the same command line and waits. Our process scanner sees two PIDs for every gateway: launcher + interpreter, same cmdline. Bug A masked this by accidentally counting the status call AS one of them; with Bug A fixed, we see both the real launcher and real interpreter for the gateway process itself. Fix: `_filter_venv_launcher_stubs` at the tail of `_scan_gateway_pids` walks each matched PID's ppid via psutil. Any PID that's the PARENT of another matched PID is a launcher stub — drop it, keep the child. Scoped to Windows (`is_windows() and len(pids) > 1`) and no-ops when psutil isn't importable. Net effect: `gateway status` now reports one PID per gateway — the interpreter — matching POSIX behaviour and user expectations. ### 2. `install.ps1`: bootstrap pip + auto-install platform SDKs New `Install-PlatformSdks` function wired between `Invoke-SetupWizard` and `Start-GatewayIfConfigured`. Fixes two related issues on fresh Windows installs: 1. The tiered `uv pip install` cascade (introduced in `87fca8342`) correctly falls through when tier 1 `.[all]` fails on the RL git deps, but the fallback tiers can silently skip SDKs from `[messaging]` when there's a partial-resolve. Result: user sets `DISCORD_BOT_TOKEN` in `.env`, fires up gateway, hits "discord module not installed". 2. `uv` creates venvs WITHOUT pip by default, so the user's escape hatch (`pip install discord.py` in the venv) doesn't exist either. The new function: - Skips if `-NoVenv` (nothing to bootstrap into). - Scans `~/.hermes/.env` for messaging tokens (TELEGRAM_BOT_TOKEN, DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN, SLACK_APP_TOKEN, WHATSAPP_ENABLED), filtering placeholder values. - For each token that's set, runs `python -c "import <sdk>"` to verify. - If any import fails: runs `python -m ensurepip --upgrade` to bootstrap pip into the venv (idempotent — no-ops if pip is already present), then `pip install <spec>` for each missing SDK with specs mirroring pyproject.toml's `[messaging]` extra to avoid version drift. The `$ErrorActionPreference = "SilentlyContinue"` spans are not cosmetic — PowerShell wraps native-stderr from a non-zero-exit subprocess as a `NativeCommandError` that prints even through `*> $null` / `2>$null`. Save + restore EAP over the import-probe and pip-install blocks keeps the output clean. Verified on this Windows 10 box: - Initial state: telegram+fastapi+psutil present, discord+slack_sdk missing (tier 1 `.[all]` had failed — `.tirith-install-failed` marker in `%LOCALAPPDATA%\\hermes`). - First run with discord+slack tokens in .env: detects both missing, ensurepip (skipped — pip was already bootstrapped earlier this session for telegram), installs `discord.py[voice]==2.7.1` + `PyNaCl` + `davey`, installs `slack-sdk==3.41.0`. All imports succeed on verify. - Second run: all three SDKs report OK, function no-ops. Pip spec strings mirror pyproject.toml's `[messaging]` extra verbatim so a bump to the extra picks up here automatically — no drift. ### Files - `hermes_cli/gateway.py`: `_get_parent_pid` rewritten (psutil-first); `_filter_venv_launcher_stubs` added; `_scan_gateway_pids` dedups launchers on Windows when it finds >1 match. - `scripts/install.ps1`: new `Install-PlatformSdks` function (~85 lines); wired into the main flow at line 1438. ### Verification - `venv/Scripts/python.exe scripts/check-windows-footguns.py --all` → `✓ No Windows footguns found (380 file(s) scanned).` - `ast.parse` passes on gateway.py. - `[System.Management.Automation.Language.Parser]::ParseFile` passes on install.ps1. - Live gateway (PID 21952, running since 12:33 today) survived 5x stress loop of `hermes gateway status` without dying.	2026-05-08 14:27:40 -07:00
Teknium	cc38282b04	feat(cross-platform): psutil for PID/process management + Windows footgun checker ## Why Hermes supports Linux, macOS, and native Windows, but the codebase grew up POSIX-first and has accumulated patterns that silently break (or worse, silently kill!) on Windows: - `os.kill(pid, 0)` as a liveness probe — on Windows this maps to CTRL_C_EVENT and broadcasts Ctrl+C to the target's entire console process group (bpo-14484, open since 2012). - `os.killpg` — doesn't exist on Windows at all (AttributeError). - `os.setsid` / `os.getuid` / `os.geteuid` — same. - `signal.SIGKILL` / `signal.SIGHUP` / `signal.SIGUSR1` — module-attr errors at runtime on Windows. - `open(path)` / `open(path, "r")` without explicit encoding= — inherits the platform default, which is cp1252/mbcs on Windows (UTF-8 on POSIX), causing mojibake round-tripping between hosts. - `wmic` — removed from Windows 10 21H1+. This commit does three things: 1. Makes `psutil` a core dependency and migrates critical callsites to it. 2. Adds a grep-based CI gate (`scripts/check-windows-footguns.py`) that blocks new instances of any of the above patterns. 3. Fixes every existing instance in the codebase so the baseline is clean. ## What changed ### 1. psutil as a core dependency (pyproject.toml) Added `psutil>=5.9.0,<8` to core deps. psutil is the canonical cross-platform answer for "is this PID alive" and "kill this process tree" — its `pid_exists()` uses `OpenProcess + GetExitCodeProcess` on Windows (NOT a signal call), and its `Process.children(recursive=True)` + `.kill()` combo replaces `os.killpg()` portably. ### 2. `gateway/status.py::_pid_exists` Rewrote to call `psutil.pid_exists()` first, falling back to the hand-rolled ctypes `OpenProcess + WaitForSingleObject` dance on Windows (and `os.kill(pid, 0)` on POSIX) only if psutil is somehow missing — e.g. during the scaffold phase of a fresh install before pip finishes. ### 3. `os.killpg` migration to psutil (7 callsites, 5 files) - `tools/code_execution_tool.py` - `tools/process_registry.py` - `tools/tts_tool.py` - `tools/environments/local.py` (3 sites kept as-is, suppressed with `# windows-footgun: ok` — the pgid semantics psutil can't replicate, and the calls are already Windows-guarded at the outer branch) - `gateway/platforms/whatsapp.py` ### 4. `scripts/check-windows-footguns.py` (NEW, 500 lines) Grep-based checker with 11 rules covering every Windows cross-platform footgun we've hit so far: 1. `os.kill(pid, 0)` — the silent killer 2. `os.setsid` without guard 3. `os.killpg` (recommends psutil) 4. `os.getuid` / `os.geteuid` / `os.getgid` 5. `os.fork` 6. `signal.SIGKILL` 7. `signal.SIGHUP/SIGUSR1/SIGUSR2/SIGALRM/SIGCHLD/SIGPIPE/SIGQUIT` 8. `subprocess` shebang script invocation 9. `wmic` without `shutil.which` guard 10. Hardcoded `~/Desktop` (OneDrive trap) 11. `asyncio.add_signal_handler` without try/except 12. `open()` without `encoding=` on text mode Features: - Triple-quoted-docstring aware (won't flag prose inside docstrings) - Trailing-comment aware (won't flag mentions in `# os.kill(pid, 0)` comments) - Guard-hint aware (skips lines with `hasattr(os, ...)`, `shutil.which(...)`, `if platform.system() != 'Windows'`, etc.) - Inline suppression with `# windows-footgun: ok — <reason>` - `--list` to print all rules with fixes - `--all` / `--diff <ref>` / staged-files (default) modes - Scans 380 files in under 2 seconds ### 5. CI integration A GitHub Actions workflow that runs the checker on every PR and push is staged at `/tmp/hermes-stash/windows-footguns.yml` — not included in this commit because the GH token on the push machine lacks `workflow` scope. A maintainer with `workflow` permissions should add it as `.github/workflows/windows-footguns.yml` in a follow-up. Content: ```yaml name: Windows footgun check on: push: branches: [main] pull_request: branches: [main] jobs: check: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: actions/setup-python@v5 with: {python-version: "3.11"} - run: python scripts/check-windows-footguns.py --all ``` ### 6. CONTRIBUTING.md — "Cross-Platform Compatibility" expansion Expanded from 5 to 16 rules, each with message, example, and fix. Recommends psutil as the preferred API for PID / process-tree operations. ### 7. Baseline cleanup (91 → 0 findings) - 14 `open()` sites → added `encoding='utf-8'` (internal logs/caches) or `encoding='utf-8-sig'` (user-editable files that Notepad may BOM) - 23 POSIX-only callsites in systemd helpers, pty_bridge, and plugin tool subprocess management → annotated with `# windows-footgun: ok — <reason>` - 7 `os.killpg` sites → migrated to psutil (see §3 above) ## Verification ``` $ python scripts/check-windows-footguns.py --all ✓ No Windows footguns found (380 file(s) scanned). $ python -c "from gateway.status import _pid_exists; import os > print('self:', _pid_exists(os.getpid())); print('bogus:', _pid_exists(999999))" self: True bogus: False ``` Proof-of-repro that `os.kill(pid, 0)` was actually killing processes before this fix — see commit ``1cbe39914`` and bpo-14484. This commit removes the last hand-rolled ctypes path from the hot liveness-check path and defers to the best-maintained cross-platform answer.	2026-05-08 14:27:40 -07:00
Teknium	52e497ce7f	fix(windows installer): UTF-8 BOM, tiered extras, skip tinker-atropos by default install.ps1 had three related problems that compounded into `hermes dashboard` failing to boot on Windows with 'No module named fastapi': 1. UTF-8 BOM missing. Windows PowerShell 5.1 (the default on Windows 10/11, which is what `irm \| iex` runs under) reads files without a BOM as cp1252. install.ps1 has em-dashes, arrows, check marks, etc. — PS 5.1 mangled them and the file failed to parse. Added UTF-8 BOM so PS 5.1, PS 7, and the in-memory `irm \| iex` path all read the file identically. 2. `uv pip install -e .[all]` had a single-tier silent fallback to bare `.` on any failure, with `2>&1 \| Out-Null` swallowing the error. Any transient extras install failure (network hiccup, wheel build issue, etc.) would drop every optional extra including [web], and the installer would still print 'Main package installed'. Replaced with a four-tier fallback (.[all] -> PyPI-only extras -> dashboard+core -> bare) that prints output at every step and a targeted [web] verify+repair at the end so `hermes dashboard` specifically is never silently broken. 3. tinker-atropos was installed unconditionally after the main install. tinker-atropos/pyproject.toml pulls atroposlib and tinker from git+https://github.com/... which can fail on locked-down networks, flaky DNS, or rate-limited github.com and would half-install the venv. install.sh already skipped it by default with a one-liner for users who actually do RL training — install.ps1 now matches that behavior. Parse-checked clean under Windows PowerShell 5.1.26100.8115 (5318 tokens, 0 parse errors).	2026-05-08 14:27:40 -07:00
Teknium	03566e5124	fix(windows): auto-install Playwright Chromium + surface it in doctor scripts/install.sh runs 'npx playwright install --with-deps chromium' on every Linux distro after the npm-install step, which is why browser tools Just Work on Linux. scripts/install.ps1 never did the equivalent step, so on native Windows installs check_browser_requirements() in tools/browser_tool.py would return False (no Chromium under %LOCALAPPDATA%\ms-playwright) and every browser_* tool got silently filtered out of the agent's tool schema — no error, no log entry, user just wondered why the tools didn't exist. Two-part fix: 1. scripts/install.ps1: after 'npm install' in InstallDir succeeds, run 'npx playwright install chromium'. Resolves npx via the same execution-policy-aware logic already used for npm (prefer npx.cmd next to npmExe, fall back to Get-Command). Surfaces a warning + manual-recovery hint when the install fails, matching install.sh behaviour for distros. 2. hermes_cli/doctor.py: after the agent-browser check, lazily import tools.browser_tool and reuse the exact same _chromium_installed() predicate check_browser_requirements() uses, so the doctor signal cannot drift from the runtime gate. Skip the check when Camofox / CDP override / a cloud provider / Lightpanda is configured (those bypass local Chromium). On missing Chromium, the hint is platform-correct: '--with-deps' on POSIX, plain 'install chromium' on win32. Verified on Windows 10: - 'npx playwright install chromium' completes successfully, drops Chrome Headless Shell under %LOCALAPPDATA%\ms-playwright - check_browser_requirements() flips from False -> True - 'hermes doctor' now prints either '✓ Playwright Chromium (browser engine)' or '⚠ Playwright Chromium not installed' + fix command - tests/hermes_cli/test_doctor.py: 38/38 pass - tests/tools/test_browser_chromium_check.py: 16/16 pass	2026-05-08 14:27:40 -07:00
Teknium	b63f9645f0	docs: add Windows-Specific Quirks section to hermes-agent skill + keystroke diagnostic Adds a dedicated '## Windows-Specific Quirks' section to the hermes-agent skill so Windows pitfalls have one discoverable place to evolve. Inaugural entries cover: - Input / keybindings — Alt+Enter intercepted by Windows Terminal, Ctrl+Enter as the Windows newline keystroke, mintty/git-bash behavior, pointer to scripts/keystroke_diagnostic.py for investigation. - Config / files — UTF-8 BOM HTTP-400 trap. - execute_code / sandbox — WinError 10106 SYSTEMROOT root cause + _WINDOWS_ESSENTIAL_ENV_VARS fix location. - Testing / contributing — scripts/run_tests.sh POSIX-venv limitation and the system-Python workaround, POSIX-only test skip-guard patterns. - Path / filesystem — line-ending warnings (cosmetic), forward-slash portability. Collapses the old scattered Windows bullets under 'Platform-specific issues' into a single pointer at the new dedicated section so there's only one place to maintain this content. Also adds the scripts/keystroke_diagnostic.py the skill now references — a small prompt_toolkit Application that prints the Keys.* identifier and raw escape bytes for every keystroke. Used to establish the Ctrl+Enter = c-j fact on Windows Terminal; generally useful for anyone adding a platform-aware keybinding.	2026-05-08 14:27:40 -07:00
Teknium	cbce5e93fc	codebase: add encoding='utf-8' to all bare open() calls (PLW1514) Closes the last Python-on-Windows UTF-8 exposure by making every text-mode open() call explicit about its encoding. Before: on Windows, bare open(path, 'r') defaults to the system locale encoding (cp1252 on US-locale installs). That means reading any config/yaml/markdown/json file with non-ASCII content either crashes with UnicodeDecodeError or silently mis-decodes bytes. After: all 89 affected call sites in production code now pass encoding='utf-8' explicitly. Works identically on every platform and every locale, no surprise behavior. Mechanical sweep via: ruff check --preview --extend-select PLW1514 --unsafe-fixes --fix --exclude 'tests,venv,.venv,node_modules,website,optional-skills, skills,tinker-atropos,plugins' . All 89 fixes have the same shape: open(x) or open(x, mode) became open(x, encoding='utf-8') or open(x, mode, encoding='utf-8'). Nothing else changed. Every modified file still parses and the Windows/sandbox test suite is still green (85 passed, 14 skipped, 0 failed across tests/tools/test_code_execution_windows_env.py + tests/tools/test_code_execution_modes.py + tests/tools/test_env_passthrough.py + tests/test_hermes_bootstrap.py). Scope notes: - tests/ excluded: test fixtures can use locale encoding intentionally (exercising edge cases). If we want to tighten tests later that's a separate PR. - plugins/ excluded: plugin-specific conventions may differ; plugin authors own their code. - optional-skills/ and skills/ excluded: skill scripts are user-authored and we don't want to mass-edit them. - website/ and tinker-atropos/ excluded: vendored / generated content. 46 files touched, 89 +/- lines (symmetric replacement). No behavior change on POSIX or on Windows when the file is ASCII; bug fix on Windows when the file contains non-ASCII.	2026-05-08 14:27:40 -07:00
Teknium	a2efad6bea	fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch Two fixes from teknium1's next install run: 1. npm install: "npm.ps1 cannot be loaded because running scripts is disabled on this system." Get-Command's default PATHEXT ordering picked up ``npm.ps1`` (the PowerShell shim) ahead of ``npm.cmd`` (the batch shim). Most Windows users have PowerShell's execution policy set to Restricted or RemoteSigned, which blocks unsigned ``.ps1`` files. ``npm.cmd`` has no such restriction and works universally. Install-NodeDeps now detects when Get-Command returned npm.ps1, looks for a sibling npm.cmd in the same directory, and prefers it. Prints an info line so the user sees why. Emits a warning + hint if only npm.ps1 is available. 2. "Launch hermes chat now? Y" crashes with "%1 is not a valid Win32 application" on Windows installs. The setup wizard calls ``relaunch(["chat"])``; ``resolve_hermes_bin()`` returned ``sys.argv[0]`` which was ``...\\hermes_cli\\main.py`` (because hermes was launched via ``python -m hermes_cli.main`` during setup). On Windows, ``os.access(script.py, os.X_OK)`` returns True because PATHEXT lists ``.py`` when the Python launcher is registered — but ``subprocess.run([script.py, ...])`` can't actually execute a ``.py`` directly. CreateProcessW needs a real PE file. Fixed ``resolve_hermes_bin`` to reject ``.py``/``.pyc`` argv0 values on Windows specifically. Falls through to ``shutil.which("hermes")`` (hermes.exe in the venv Scripts dir) or, as a final fallback, lets build_relaunch_argv build ``[sys.executable, "-m", "hermes_cli.main"]`` which is bulletproof. POSIX behaviour unchanged — ``.py`` argv0 with a shebang + chmod+x is still a valid exec target there. 3 new tests cover the Windows paths: .py argv0 + hermes.exe on PATH → returns hermes.exe; .py argv0 + no PATH → returns None (caller uses python -m); POSIX + executable .py → still accepted. 26 relaunch tests pass, no POSIX regressions.	2026-05-08 14:27:40 -07:00
Teknium	8f91d7bfa9	fix(windows): %1 install error, patch CRLF false-negative, SOUL.md BOM Three bugs from teknium1's successful install + diagnostic chat on Windows: 1. Start-Process -FilePath npm.cmd fails with "%1 is not a valid Win32 application". Start-Process bypasses cmd.exe and PATHEXT to call CreateProcessW directly, which refuses .cmd batch shims. Switched Install-NodeDeps to use PowerShell's invocation operator (``& $npmExe install --silent > $log``) which DOES honour PATHEXT. Extracted a ``_Run-NpmInstall`` helper so the browser + TUI paths share the same logic. Captures $LASTEXITCODE correctly, still surfaces the real stderr on failure with a log-file pointer for the full output. 2. patch tool returns false-negative on Windows due to CRLF round-trip.* Root cause was upstream of patch: ``subprocess.Popen(..., text=True, stdin=PIPE)`` on Windows translates ``\\n`` → ``\\r\\n`` when data flows through the stdin pipe. ``_pipe_stdin()`` was writing the patch's new_content string through a text-mode pipe, bash then wrote those CRLF bytes to disk, and patch's post-write verify compared the on-disk CRLF bytes against the original LF-only string — fail. Fixed in two places for defense in depth: - ``_pipe_stdin()`` now writes through ``proc.stdin.buffer`` with explicit UTF-8 encoding, bypassing Python's newline translation on every platform. No behaviour change on POSIX (bytes are identical) but stops the CRLF injection on Windows. - ``patch_replace``'s post-write verify normalizes CRLF→LF on both sides before comparing, so even if some future backend still translates newlines the patch tool won't report a bogus failure. 3. SOUL.md gets a UTF-8 BOM on Windows PowerShell 5.1. ``Set-Content -Encoding UTF8`` on PS5.1 writes UTF-8 WITH a byte-order-mark (changed in PS7 via ``utf8NoBOM``). Hermes's prompt-injection scanner sees the BOM (U+FEFF invisible char) and refuses to load the file, so SOUL.md's persona instructions never get applied. Fixed by writing the file via ``[System.IO.File]::WriteAllText`` with an explicit ``UTF8Encoding($false)`` — BOM-free on every PowerShell version. All POSIX behaviour verified unchanged: 198 tests pass across test_file_operations, test_local_env_cwd_recovery, test_code_execution, test_windows_native_support, test_windows_compat.	2026-05-08 14:27:40 -07:00
Teknium	d52e54170a	fix(install.ps1): step out of $InstallDir before touching it + harden repo probe User hit 'fatal: not in a git directory' on re-install because: 1. They ran Remove-Item -Force $env:LOCALAPPDATA\hermes -ErrorAction SilentlyContinue WHILE cd'd inside the install dir. Windows silently refuses to delete a directory any shell is currently cd'd inside and leaves the skeleton intact, but the -ErrorAction SilentlyContinue swallowed every partial-delete failure so they thought the wipe succeeded. 2. The installer then walked into Install-Repository, saw $InstallDir still exists with a partial .git stub, my repo-validity probe returned success (the probe's git rev-parse may have exit-code-zeroed in a way I didn't expect), and the real git fetch died with three 'fatal: not a git repository' errors. Two fixes belt-and-braces: - Main() now cds to $env:USERPROFILE at start if the current shell is inside $InstallDir. Harmless when the user ran from elsewhere; critical when they didn't. This alone fixes the user's case. - Install-Repository's 'is this a valid repo' probe now runs BOTH git rev-parse --is-inside-work-tree AND git status, resets $LASTEXITCODE before each to avoid picking up a stale 0, and requires BOTH to succeed. Also requires rev-parse's output to match 'true' (not just exit 0) to rule out exit-0-with-empty-output edge cases.	2026-05-08 14:27:40 -07:00
Teknium	c469a05ce5	fix(install.ps1): validate existing repo via git itself + clean up broken stubs teknium1 hit "fatal: not in a git directory" on re-install when the previous install left a $InstallDir\.git stub that Test-Path matched but git didn't recognize (three "fatal: not a git repository" lines, then the script exited before touching anything). Two bugs: 1. Test-Path "$InstallDir\.git" was a weak gate — it matches .git whether it's a directory, file, symlink, submodule gitfile, OR a broken stub from a failed previous Remove-Item. Replaced with a real repo probe: Push-Location + git rev-parse --is-inside-work-tree + $LASTEXITCODE check. If git itself can't see a repo, we treat the directory as not-a-repo and fall through to fresh clone. 2. The original update path ignored $LASTEXITCODE. fetch/checkout/pull all emitted fatals but the script kept going. Now each command checks $LASTEXITCODE and throws with an explicit message. Also: when the directory exists but isn't a valid repo, the new code wipes it (Remove-Item -ErrorAction Stop) and falls through to fresh clone, instead of dying with the old "Directory exists but is not a git repository" error. If the wipe itself fails (file locked, hermes still running), we throw with a user-readable "close any programs using files in <dir>" hint. Refactored the function to use a $didUpdate flag instead of my earlier draft's early `return` — that was skipping the submodule init block at the bottom of the function. Both the update and fresh-clone paths now fall through to the submodule init step, which is correct (git pull doesn't auto-update submodules). PowerShell structural check: 21 functions defined, braces balanced.	2026-05-08 14:27:40 -07:00
Teknium	3601e20f47	fix(windows): use PortableGit (not MinGit), fix relaunch os.execvp crash, surface npm errors Three real bugs from teknium1's first Windows install run: 1. MinGit has no bash.exe. MinGit is the minimal-automation Git for Windows distribution — it ships git.exe but deliberately strips bash and the POSIX coreutils. Installer logged "Could not locate bash.exe" and Hermes would fail to run any shell command. Switched to PortableGit — the full Git for Windows minus the installer UI. PortableGit ships bash.exe at <root>\bin\bash.exe plus sh, awk, sed, grep, curl, ssh in usr\bin\. ARM64 variant is detected separately (PortableGit--arm64.7z.exe). 32-bit falls back to MinGit-32-bit with a warning (PortableGit is 64-bit only). PortableGit ships as a 7z self-extractor (56MB vs MinGit's 38MB). We invoke it with `-o<target> -y` to extract silently — no 7z install needed, it's self-contained. Updated tools/environments/local.py::_find_bash candidate order to prefer the PortableGit layout (<root>\bin\bash.exe) with the MinGit layout (<root>\usr\bin\bash.exe) as a fallback so existing installs keep working. 2. os.execvp "Exec format error" on Windows.* Setup wizard's "Launch hermes chat now? Y" called `os.execvp(["hermes", "chat"])` which on Windows can only swap to real Win32 .exe files — chokes with OSError(8) on .cmd batch shims and Python console-script wrappers. Added a win32 branch in hermes_cli/relaunch.py::relaunch() that uses subprocess.run + sys.exit — functionally identical (user sees "hermes exited, then new hermes started") with one extra PID in play. POSIX path is UNCHANGED — still uses os.execvp for in-place replacement. Catches OSError in the Windows branch and surfaces a "open a new terminal so PATH picks up, then re-run hermes" hint instead of a cryptic traceback. 3. npm install failures silent on Windows. The install.ps1 was invoking `npm install --silent 2>&1 \| Out-Null` inside a try/catch. PowerShell's try/catch does NOT trigger on non-zero process exit codes — only on unhandled .NET exceptions — so npm failing printed a generic "npm install failed" with zero information about WHY. The silent pipe ate the stderr. Rewrote Install-NodeDeps to: - Resolve npm.cmd via Get-Command (respects PATHEXT) instead of relying on bare `npm` name resolution. - Use Start-Process with -PassThru to capture the actual exit code. - Redirect stderr to a temp log and surface the first ~800 chars of the real npm error when install fails, plus the log path for the full text. - Fail loudly with the right exit code instead of a misleading success. - Bail cleanly with a helpful message when npm isn't on PATH at all. 4. "True" printing to console after Node check. `Test-Node` returns $true; installer called it as a bare statement (no assignment, no cast). PowerShell prints bare return values. Wrapped the call in `[void](Test-Node)`. ## Tests - Added 3 new tests in tests/hermes_cli/test_relaunch.py covering the Windows branch: subprocess is called (not execvp), child exit code propagates, OSError surfaces a helpful message. All 23 tests pass (20 existing + 3 new). - 77 Windows-compat tests still pass, POSIX behaviour unchanged.	2026-05-08 14:27:40 -07:00
Teknium	b7fe7ed7bd	feat(windows-install): bundle portable MinGit instead of relying on winget User hit a real failure case: their system Git was in a half-installed state (can neither uninstall nor reinstall) and winget refused to work around it. We were one step away from shipping an installer that would have left users with exactly the problem he already had. What other agents do (reality check): - Claude Code: requires pre-installed Git; breaks if user doesn't have it. - OpenCode, Codex: don't need bash at all — PowerShell-first design. - Cline: uses whatever shell VSCode is configured with; installs nothing. None of them solve the "broken system Git" problem. We need to own our Git. Changes: - scripts/install.ps1::Install-Git: dropped winget path entirely. Now: (1) use existing git if present; (2) download portable MinGit from the official git-for-windows GitHub release to %LOCALAPPDATA%\hermes\git. No winget, no admin, no Windows installer registry, no system impact. - Added %LOCALAPPDATA%\hermes\git\{cmd,usr\bin} to User PATH so git + bash + POSIX coreutils (which, env, grep, …) resolve in fresh shells. - tools/environments/local.py::_find_bash: reorder so Hermes' portable MinGit install is checked BEFORE falling through to shutil.which("bash") or system install locations. This way a broken system Git can't hijack the bash lookup. - README + installation docs reworded to reflect the new story: "portable Git Bash, isolated from any system install, recoverable via rm -rf if it ever breaks." Recoverability: if Hermes' Git install ever breaks, ``Remove-Item %LOCALAPPDATA%\hermes\git`` and re-run the installer — no system impact, no uninstall drama, no winget to fight with.	2026-05-08 14:27:40 -07:00
Teknium	d0aad4b021	fix(computer-use): harden image-rejection fallback + AUTHOR_MAP Follow-up to #15328's vision-unsupported retry branch in run_agent.py. _strip_images_from_messages() previously deleted any message whose content was entirely images. That's fine for synthetic user messages injected for attachment delivery, but it breaks providers for tool-role messages — the paired tool_call_id on the preceding assistant message ends up unmatched, which OpenAI-compatible APIs reject with HTTP 400. Fix: tool-role messages whose content becomes empty are replaced with a plaintext placeholder that preserves the tool_call_id linkage. Only non-tool messages are dropped. Added 10 tests covering the role-alternation invariants + image-type coverage. Image-rejection detector: expanded phrase list (image content not supported / multimodal input / vision input / model does not support image) and gated on 4xx status so transient 5xx errors never get misinterpreted as 'server said no to images'. Detection is documented as best-effort English phrase matching. AUTHOR_MAP: mapped 3820588+ddupont808@users.noreply.github.com to ddupont808 so release notes attribute the salvage correctly.	2026-05-08 11:07:38 -07:00
Teknium	839cdd1b05	fix(approval): cron jobs must not be treated as gateway context The new _is_gateway_approval_context() widened the gateway classification to any call with HERMES_SESSION_PLATFORM bound via contextvars. But cron/scheduler.py binds that same contextvar for delivery routing on cron jobs that originate from a gateway platform (telegram/discord/etc.), so those jobs were getting routed through submit_pending with no listener — blocking indefinitely instead of honoring approvals.cron_mode. Short-circuit on HERMES_CRON_SESSION before any gateway check. Cron is always governed by cron_mode config, regardless of where the job was scheduled from. Adds regression coverage in TestCronWithGatewayOrigin and records the contributor email mapping for scripts/release.py.	2026-05-08 07:30:14 -07:00
Teknium	83c23e8861	fix(google-workspace): cleanup for --check-live salvage Small follow-ups on top of #19643: - check_auth() takes quiet kwarg to suppress its AUTHENTICATED print when called from check_auth_live(), so the final status line reflects the live-call outcome only. - Drop redundant _ensure_deps() call in check_auth_live() (check_auth() already calls it). - Add AUTHOR_MAP entry for ygd58 so release attribution script works.	2026-05-08 04:50:43 -07:00
Isaac Huang	5d1bdf11b6	Add AUTHOR_MAP entry for Isaac Huang	2026-05-08 03:22:11 -07:00
Teknium	1bdacb697c	chore(release): add BennetYrWang to AUTHOR_MAP	2026-05-07 17:47:22 -07:00
Teknium	307c85e5c1	fix(goals): auto-pause when judge model returns unparseable output Weak judge models (e.g. deepseek-v4-flash) return empty strings or prose when asked for the strict {done, reason} JSON verdict. The old code failed-open to continue on every such turn, burning the entire turn budget with log lines like judge returned empty response judge reply was not JSON: "Let me analyze whether the goal..." and /goal clear could not stop it mid-loop without /stop. After N=3 consecutive parse failures (transport/API errors don't count — those are transient), the loop auto-pauses and prints: ⏸ Goal paused — the judge model (3 turns) isn't returning the required JSON verdict. Route the judge to a stricter model in ~/.hermes/config.yaml: auxiliary: goal_judge: provider: openrouter model: google/gemini-3-flash-preview Then /goal resume to continue. The counter resets on any usable reply (both "done"/"continue" and API errors) and persists across GoalManager reloads so cross-session resumes carry the correct state. Also fixes test_goal_verdict_send.py sharing a hardcoded session_id across tests — the shared id only worked because the previous _post_turn_goal_continuation was a never-awaited coroutine. Now that PR #19160 made it properly awaited, the xdist test-leakage bug surfaced. Each test gets a unique session_id via uuid suffix.	2026-05-07 17:33:09 -07:00
teknium1	7f369bfe55	chore(release): add hllqkb to AUTHOR_MAP for PR #21288 salvage	2026-05-07 15:21:34 -07:00
hllqkb	c80fa728bd	fix(installer): set UV_NO_CONFIG=1 to avoid permission denied under sudo -u When the installer is run via , uv resolves config file paths against the process owner's (root) home directory rather than the effective user's, causing a Permission denied error when trying to read /root/uv.toml. Setting UV_NO_CONFIG=1 prevents uv from discovering any config files (uv.toml, pyproject.toml) during installation, which is the correct behavior for a bootstrap script that manages its own environment. Fixes #21269	2026-05-07 15:21:34 -07:00
teknium1	2214ab1073	chore: fix AUTHOR_MAP for johnsonblake1@gmail.com → voteblake The existing mapping pointed to the wrong GitHub user (blakejohnson, id 866695, IBM) — the email actually belongs to voteblake (id 5585957), confirmed via search/commits?author-email. Mis-credited since `323ca7084`.	2026-05-07 13:04:42 -07:00
adybag14-cyber	dc5ef1ac8e	fix: add termux-all install profile and safe fallbacks	2026-05-07 13:04:08 -07:00
adybag14-cyber	da18fd084a	fix: strengthen termux install network prerequisites	2026-05-07 13:04:08 -07:00
Abd0r	04193cf71c	feat(web): add Brave Search (free tier) and DDGS search providers Both implement WebSearchProvider via tools/web_providers/ — matching the existing SearXNG pattern (PR #`5c906d702`). Search-only; pair with any extract provider via web.extract_backend. - tools/web_providers/brave_free.py — Brave Search API (free tier, 2k queries/mo). Uses BRAVE_SEARCH_API_KEY as X-Subscription-Token. - tools/web_providers/ddgs.py — DuckDuckGo via the ddgs Python package. No API key; gated on package importability. - tools/web_tools.py: both backends added to _get_backend() config list and auto-detect chain (trails paid providers), _is_backend_available, web_search_tool dispatch, web_extract_tool + web_crawl_tool search-only refusals, check_web_api_key, and the __main__ diagnostic. Introduces _ddgs_package_importable() helper so tests can monkeypatch a single symbol for the ddgs availability check. - hermes_cli/tools_config.py: picker entries for both providers; ddgs gets a post_setup handler that runs `pip install ddgs`. - hermes_cli/config.py: BRAVE_SEARCH_API_KEY in OPTIONAL_ENV_VARS. - scripts/release.py: AUTHOR_MAP entry for @Abd0r. - tests: 14 new tests (brave-free) + 15 new tests (ddgs) covering provider unit behavior, backend wiring, and search-only refusals. Salvages the brave-free + ddgs portion of PR #19796. Not included: the in-line helpers in web_tools.py (replaced with provider modules to match the shipped architecture), the lynx-based extract path (these backends should refuse extract with a clear error — users pair with a real extract provider), and scripts/start-llama-server.sh (unrelated). Co-authored-by: Abd0r <223003280+Abd0r@users.noreply.github.com>	2026-05-07 09:59:17 -07:00
Teknium	498bfc7bc1	chore: release v0.13.0 (2026.5.7) (#21406 ) The Tenacity Release — Hermes Agent now finishes what it starts. - Durable multi-agent Kanban with heartbeat, reclaim, zombie detection, retry budgets, hallucination gate - /goal persistent cross-turn goals (Ralph loop) - Checkpoints v2 single-store rewrite with real pruning - Gateway auto-resume interrupted sessions after restart - no_agent cron watchdog mode - Post-write delta lint on write_file + patch - 8 P0 security closures — redaction ON by default, CVSS 8.1 Discord fix, WhatsApp stranger rejection, MCP/auth TOCTOU, SSRF floor, cron prompt-injection skill scanning - Google Chat (20th platform) + generic platform-plugin hooks - ProviderProfile ABC + plugins/model-providers/ - 7 i18n locales (zh/ja/de/es/fr/uk/tr) + display.language - video_analyze tool, xAI Custom Voices, SearXNG, OpenRouter caching - MCP SSE transport + OAuth + image MEDIA surfacing - 864 commits, 588 merged PRs, 295 contributors	2026-05-07 09:22:48 -07:00
Teknium	bbff2f6345	chore(release): map maciekczech noreply email	2026-05-07 07:39:57 -07:00
Teknium	1baab8771a	chore(release): add qWaitCrypto to AUTHOR_MAP for PR #21055 salvage	2026-05-07 07:17:12 -07:00
Teknium	43cf72a458	chore(release): map donramon77 to AUTHOR_MAP for PR #18425 salvage	2026-05-07 07:15:44 -07:00
teknium1	4ee6c3349a	chore(release): map tuancanhnguyen706@gmail.com → xxxigm	2026-05-07 07:05:05 -07:00
Teknium	5a3e5b23d2	fix(memory): remove dead allOf schema block at the source PR #21238 introduced top-level `allOf: [{if/then/required}]` blocks in the built-in memory tool's parameters schema as conditional-required hints. Two problems: 1. OpenAI's Codex backend (chatgpt.com/backend-api/codex, gpt-5.x) rejects top-level `allOf`/`anyOf`/`oneOf`/`enum`/`not` outright with a non-retryable 400 — affected every user on openai-codex/gpt-5.x. 2. The `if/then` hints were silently ignored by every other provider (Chat Completions doesn't honour them on function schemas), so they never actually enforced anything anywhere. The runtime handler in `memory_tool()` already validates the per-action required fields and returns actionable error messages, so removing the block changes nothing behaviourally. Paired with the defense-in-depth sanitizer in the previous commit, this closes the bug both at the source (schema no longer emits the forbidden form) and at the wire boundary (sanitizer strips it if anything else re-introduces it). - Rewrites `tests/tools/test_memory_tool_schema.py` to guard against regressing the forbidden-combinator shape instead of asserting it. - Adds AUTHOR_MAP entry for @hrkzogw (author of the sanitizer fix).	2026-05-07 07:03:21 -07:00
Teknium	f5c9bb582c	chore(release): add CashWilliams to AUTHOR_MAP	2026-05-07 06:54:29 -07:00
Teknium	6a4ecc0a9f	fix(whatsapp): reject strangers by default, never respond in self-chat (#8389 ) (#21291 ) Self-chat mode (default) previously replied to ANY incoming DM with a Python-side pairing-code message. Two compounding defaults: 1. allowlist.js::matchesAllowedUser returned true for an empty allowlist — so WHATSAPP_ALLOWED_USERS unset → everyone passes the JS bridge gate → messages reach Python gateway → _is_user_authorized returns False but _get_unauthorized_dm_behavior falls back to 'pair' → stranger gets a pairing code reply. 2. bridge.js had no mode check on !fromMe messages, so self-chat mode (where the operator only wants to talk to themselves) forwarded everything anyway. Fix: - allowlist.js: empty allowlist now returns false. Operators who want an open bot must set WHATSAPP_ALLOWED_USERS=* explicitly (the existing wildcard behaviour, consistent with SIGNAL_GROUP_ALLOWED_USERS). - bridge.js: self-chat mode hard-rejects all !fromMe messages at the bridge, before they ever reach the Python gateway. Bot mode still enforces the allowlist. - Startup log message updated to reflect the new per-mode behaviour (was '⚠️ No WHATSAPP_ALLOWED_USERS set — all messages will be processed', which was both inaccurate post-fix and a bad default signal pre-fix). - allowlist.test.mjs: new regression test pinning the empty-rejects contract, + null/undefined defensive cases. Behaviour delta for existing users: - self-chat mode, no allowlist: strangers got pairing codes, now silently dropped. Strictly better. - bot mode, no allowlist: strangers got pairing codes via the Python-side pairing flow, now silently dropped at the JS bridge. Operators who genuinely want an open bot set WHATSAPP_ALLOWED_USERS=*.	2026-05-07 06:53:04 -07:00
Teknium	6769060ae2	chore: AUTHOR_MAP entry for @glesperance	2026-05-07 06:37:23 -07:00
Teknium	30c9990175	chore: correct AUTHOR_MAP for oluwadareab12 (was mismapped to bennytimz)	2026-05-07 06:35:54 -07:00
Teknium	f481395d4c	chore(release): add subtract0 to AUTHOR_MAP for PR #19935 salvage	2026-05-07 06:32:45 -07:00
Teknium	33563df027	chore: AUTHOR_MAP entry for @paul-tian	2026-05-07 06:31:08 -07:00
Teknium	755b74fc2d	chore: AUTHOR_MAP entry for @LucianoSP	2026-05-07 06:29:27 -07:00
Teknium	8aa30407c2	chore(release): add masonjames to AUTHOR_MAP for PR #10439 salvage	2026-05-07 06:28:11 -07:00
Teknium	25187ca05c	chore: AUTHOR_MAP entry for @hedirman	2026-05-07 06:27:47 -07:00
Hedirman	a9ebee5f02	Fix WhatsApp long message splitting	2026-05-07 06:27:47 -07:00
Teknium	46d1fc16ab	chore(release): add AJV20 to AUTHOR_MAP for PR #10287 salvage	2026-05-07 06:25:35 -07:00
Teknium	b7a97cd44f	chore: AUTHOR_MAP entry for wabrent	2026-05-07 06:25:03 -07:00
Teknium	fcd619cae4	chore: AUTHOR_MAP entry for @kowenhaoai	2026-05-07 06:24:24 -07:00
Teknium	cfe019c782	chore: AUTHOR_MAP entry for @acc001k	2026-05-07 06:21:50 -07:00
Teknium	fd13b7d2b9	chore: AUTHOR_MAP entry for @agilejava	2026-05-07 06:19:58 -07:00
Teknium	8cef149131	chore: AUTHOR_MAP entry for @stevenchouai	2026-05-07 06:04:28 -07:00
Teknium	afbcca0f06	chore: AUTHOR_MAP entry for @shashwatgokhe	2026-05-07 05:58:11 -07:00

1 2 3 4 5 ...

643 commits