hermes-agent/hermes_cli
Teknium c594a23047
feat(agent): per-turn file-mutation verifier footer (#24498)
Detect when write_file / patch calls fail during a turn and are never
superseded by a successful write to the same path.  When the final
text response is delivered, append an advisory footer listing the
files that did NOT change — so models that over-claim 'patched 5 files'
after 4 silent failures can't hide the lie.

Catches the failure mode reported in Ben Eng's llm-wiki session:
grok-4.1-fast issued batches of parallel patches, half failed with
'Could not find old_string', and the agent summarised the turn
claiming every file was edited.  The user had to manually run
'git status' each turn to catch it.

The verifier is a pure post-hoc check on tool results — no new LLM
calls, no synthetic messages injected into history (prompt cache
preserved), no changes to tool argument dispatch.  Per-turn state is
keyed by path; a later successful write to the same path clears the
failure entry so single-file retry recovery is not flagged.

Wired into both _execute_tool_calls_concurrent and
_execute_tool_calls_sequential, so batched parallel patches and one-at-
a-time edits are both covered.  Footer emission happens after the
agent loop exits, before transform_llm_output / post_llm_call plugin
hooks run, so plugins still see (and can modify) the augmented text.

Config: display.file_mutation_verifier (bool, default true) +
HERMES_FILE_MUTATION_VERIFIER env override.

31 unit tests in tests/run_agent/test_file_mutation_verifier.py cover
target extraction (write_file, patch-replace, patch-v4a single and
multi-file), error-preview extraction (JSON .error field and plain
string), per-turn state transitions (first-error-wins on repeated
failure, success supersedes failure), footer rendering (truncation
at 10 entries, user-actionable hint), and env/config precedence.

Companion docs updated: user-guide/configuration.md +
reference/environment-variables.md.
2026-05-12 11:54:13 -07:00
..
__init__.py chore: release v0.13.0 (2026.5.7) (#21406) 2026-05-07 09:22:48 -07:00
_parser.py fix: add dashboard to CLI help epilogue and Docker CI smoke test 2026-05-07 06:16:23 -07:00
_subprocess_compat.py feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags 2026-05-08 14:27:40 -07:00
auth.py fix(minimax): harden OAuth dashboard and runtime 2026-05-11 22:15:16 -07:00
auth_commands.py fix(minimax): harden OAuth dashboard and runtime 2026-05-11 22:15:16 -07:00
azure_detect.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
backup.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
banner.py fix(banner): resolve update-check repo from running code, not profile-scoped path 2026-05-09 04:10:35 -07:00
browser_connect.py fix(browser): address Copilot review on /browser connect 2026-04-28 22:11:10 -07:00
callbacks.py fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902) 2026-04-14 16:11:37 -07:00
checkpoints.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
claw.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
cli_output.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
clipboard.py feat: fix img pasting in new ink plus newline after tools 2026-04-11 13:14:32 -05:00
codex_models.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
colors.py feat: respect NO_COLOR env var and TERM=dumb (#4079) 2026-03-30 17:07:21 -07:00
commands.py chore: ruff auto-fixes — collapsible-else-if, if-stmt-min-max, dict.fromkeys (#23926) 2026-05-11 11:03:29 -07:00
completion.py fix(completion): use valid zsh _arguments exclusion-group syntax 2026-05-09 13:36:44 -07:00
config.py feat(agent): per-turn file-mutation verifier footer (#24498) 2026-05-12 11:54:13 -07:00
copilot_auth.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
cron.py feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709) 2026-05-04 12:31:01 -07:00
curator.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
curses_ui.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
debug.py fix(debug): redact log content at upload time in hermes debug share 2026-05-03 11:42:20 -07:00
default_soul.py fix: reset default SOUL.md to baseline identity text (#3159) 2026-03-26 01:34:27 -07:00
dingtalk_auth.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
doctor.py feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220) 2026-05-12 01:02:25 -07:00
dump.py refactor(env): use shared Hermes dotenv loader 2026-05-05 10:13:13 -07:00
env_loader.py feat(cross-platform): psutil for PID/process management + Windows footgun checker 2026-05-08 14:27:40 -07:00
fallback_cmd.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
gateway.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
gateway_windows.py fix(gateway): preserve Ctrl+C for Windows foreground runs 2026-05-09 14:34:18 -07:00
goals.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
hooks.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
kanban.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
kanban_db.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
kanban_diagnostics.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
kanban_specify.py feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435) 2026-05-07 13:04:41 -07:00
logs.py feat: component-separated logging with session context and filtering (#7991) 2026-04-11 17:23:36 -07:00
main.py feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220) 2026-05-12 01:02:25 -07:00
mcp_config.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
memory_setup.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
model_catalog.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
model_normalize.py fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802) 2026-05-06 09:08:33 -07:00
model_switch.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
models.py fix(nous): surface Portal-flagged free models in picker even when curated list is stale (#24082) 2026-05-11 18:08:16 -07:00
nous_subscription.py feat(web): add SearXNG as a native search-only backend 2026-05-06 10:05:29 -07:00
oneshot.py fix: make session search initialize session db 2026-05-09 14:36:58 -07:00
pairing.py fix(pairing): enforce lockout on approve_code, not just generate_code (#10195) (#21325) 2026-05-07 07:18:21 -07:00
platforms.py feat: complete plugin platform parity — all 12 integration points 2026-04-29 21:56:51 -07:00
plugins.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
plugins_cmd.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
profile_distribution.py feat(profile): shareable profile distributions via git (#20831) 2026-05-08 10:04:32 -07:00
profiles.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
providers.py fix: prevent bare 'custom' slug in model.provider (#17478) 2026-04-30 04:32:11 -07:00
pt_input_extras.py fix(cli): make Ctrl+Enter insert newline on WSL/SSH/Windows Terminal (#22777) 2026-05-09 12:48:14 -07:00
pty_bridge.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
relaunch.py fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch 2026-05-08 14:27:40 -07:00
runtime_provider.py fix(minimax): harden OAuth dashboard and runtime 2026-05-11 22:15:16 -07:00
security_advisories.py feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220) 2026-05-12 01:02:25 -07:00
setup.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
skills_config.py refactor(config): migrate remaining 33 cfg_get call sites (#17311) 2026-04-29 04:03:03 -07:00
skills_hub.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
skin_engine.py fix(tui): honor skin highlight colors (#20895) 2026-05-06 14:01:56 -07:00
slack_cli.py fix(slack): enable writable app home DMs in manifest 2026-05-08 17:01:12 -07:00
status.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
stdio.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
timeouts.py refactor(timeouts): drop redundant ImportError in except clause 2026-04-26 20:48:20 -07:00
tips.py feat: Ctrl+Enter inserts newline on Windows Terminal 2026-05-08 14:27:40 -07:00
tools_config.py fix(deps): unbreak [all] install — drop mistralai while PyPI quarantined (#24205) 2026-05-11 23:02:15 -07:00
uninstall.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00
vercel_auth.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
voice.py fix(tui): restore voice push-to-talk parity (#20897) 2026-05-06 15:49:59 -07:00
web_server.py feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220) 2026-05-12 01:02:25 -07:00
webhook.py chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937) 2026-05-11 11:13:25 -07:00