hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-13 14:02:16 +00:00


Teknium a4d8f0f62a feat(prompt): universal task-completion guidance + local Python toolchain probe (#34340)

* fix(codex): surface error code in Responses 'failed' status errors

When a Codex Responses turn ends with status=failed, the response carries the failure details under `response.error` as `{code, message, param, ...}`. The previous extractor pulled only `message`, so users seeing a rate-limit failure got a bare "Slow down" string indistinguishable from a generic stream truncation; an internal_error with empty message degraded to a dict dump ("{'code': 'internal_error', 'message': ''}").

Extract a `_format_responses_error()` helper that:
- prefixes `code` when both code and message are present (e.g. 'rate_limit_exceeded: Slow down')
- falls back to the bare `code` when message is empty
- accepts both dict and attribute-style payloads (SDK and JSON-RPC paths)
- preserves the prior status-only fallback when no error payload exists

Apply the same helper at the sibling site in `codex_app_server_session.run_turn()` so codex-CLI subprocess turn failures get the same treatment.

Tests:
- 8 new unit tests for `_format_responses_error` covering both shapes, empty/missing fields, non-string fields, and the status-only fallback.
- 2 regression tests on `_normalize_codex_response` for failed status with and without a code, asserting the exact RuntimeError message.
- All 3603 tests in tests/agent/ pass.

Adapted from anomalyco/opencode#28757.

* feat(prompt): universal task-completion guidance + local Python toolchain probe

Two cross-model failure modes get a single-line answer in the cached system prompt. Both gated by config (default on), both add zero overhead when not needed, both verified via real AIAgent prompt builds.

## What changed

`TASK_COMPLETION_GUIDANCE` — short prompt block applied to ALL models.
Targets two failure modes observed on a real Sarasota real-estate build task: (1) Opus stopped after writing an 85-byte stub and gave a prose response with finish_reason=stop on call #3 of 90; (2) DeepSeek pushed through a PEP-668 wall, then returned fabricated listings instead of admitting the blocker. Both behaviors are model-family-agnostic, so the guidance lives outside the existing tool_use_enforcement gate (~192 tokens, paid once per session via prefix cache).

`tools/env_probe.py` — local Python toolchain probe. Detects python3/pip/uv/PEP-668 state and emits ONE short line in the system prompt when something is non-default. Emits NOTHING when the env is clean (zero token cost for normal users). Skipped entirely for remote terminal backends (docker/modal/ssh) — they have their own probe.

Example output on a broken environment (the actual case):

    Python toolchain: python3=3.11.15 (no pip module), python=missing (use python3), pip→python3.12 (mismatch), PEP 668=yes (use venv or uv).

## Config

Both flags live under `agent.` in config.yaml, default True:

    agent:
      task_completion_guidance: true   # universal "finish the job" block
      environment_probe: true          # local Python toolchain hints

Neither addition required a `_config_version` bump — deep-merge fills defaults in for existing user configs.
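The deep-merge behaviour mentioned in the Config section can be illustrated roughly as follows. This is a minimal sketch, not the project's implementation: `deep_merge`, `DEFAULTS`, and the `max_turns` key are hypothetical names chosen for illustration, and only the two `agent.` flags come from the commit message.

```python
from copy import deepcopy

# Hypothetical defaults table; only these two agent.* flags are named in the commit message.
DEFAULTS = {
    "agent": {
        "task_completion_guidance": True,
        "environment_probe": True,
    }
}


def deep_merge(defaults: dict, user: dict) -> dict:
    """Return a new dict: user values win, nested dicts are merged key by key."""
    merged = deepcopy(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings, so recurse instead of overwriting wholesale.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# A user config written before the new flags existed ("max_turns" is a made-up key).
old_config = {"agent": {"max_turns": 90}}
print(deep_merge(DEFAULTS, old_config))

# An explicit opt-out in the user config survives the merge.
print(deep_merge(DEFAULTS, {"agent": {"environment_probe": False}}))
```

Under this scheme an older config file picks up both new flags without being rewritten, which is presumably why no version bump was needed.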
## Validation

| Test surface | Result |
|---|---|
| tests/tools/test_env_probe.py | 10/10 pass (probe unit) |
| tests/run_agent/test_run_agent.py — new classes | 8/8 pass (integration) |
| TestToolUseEnforcementConfig | 17/17 pass (no regression) |
| TestBuildSystemPrompt | 9/9 pass (no regression) |
| TestInvalidateSystemPrompt | 2/2 pass (no regression) |
| tests/agent/test_prompt_builder.py | 124/124 pass (no regression) |
| tests/hermes_cli/ | 5662/5662 pass (config defaults) |
| E2E AIAgent build (broken env) | Both blocks present, 2,178 chars |
| E2E AIAgent build (clean env) | 771-char net overhead, env probe silent |

2026-05-28 22:26:09 -07:00
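The four formatting rules listed for `_format_responses_error()` in the commit message can be sketched as below. The real helper's signature and body are not shown on this page, so everything here is an assumption: the function name `format_responses_error_sketch`, the `status` parameter, and the wording of the status-only fallback are all hypothetical.

```python
from types import SimpleNamespace
from typing import Any, Optional


def _field(payload: Any, name: str) -> Optional[str]:
    """Read `name` from a dict or attribute-style payload; ignore non-string or blank values."""
    value = payload.get(name) if isinstance(payload, dict) else getattr(payload, name, None)
    if isinstance(value, str) and value.strip():
        return value.strip()
    return None


def format_responses_error_sketch(error: Any, status: str = "failed") -> str:
    """Hypothetical stand-in for the helper described in the commit message."""
    code = _field(error, "code")
    message = _field(error, "message")
    if code and message:
        return f"{code}: {message}"  # prefix the code when both are present
    if code:
        return code  # bare code when the message is empty
    if message:
        return message
    # Assumed wording for the status-only fallback; the real text is not given.
    return f"response ended with status={status}"


print(format_responses_error_sketch({"code": "rate_limit_exceeded", "message": "Slow down"}))
print(format_responses_error_sketch(SimpleNamespace(code="internal_error", message="")))
print(format_responses_error_sketch(None))
```

Reading fields through one small accessor keeps the dict and attribute-style paths on the same code path, so the ordering of the four rules only has to be written once.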
..
dashboard_auth	fix(dashboard-auth): share /api/* public allowlist between legacy and OAuth gates	2026-05-29 12:17:12 +10:00
proxy	fix(xai-proxy): handle 429 rate-limit responses in proxy retry path	2026-05-28 02:36:37 -07:00
__init__.py	chore: release v0.15.1 (2026.5.29) (#34222)	2026-05-28 18:11:49 -07:00
_parser.py	Fix CLI verbose tool progress config fallback	2026-05-23 21:03:51 -07:00
_subprocess_compat.py	feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags	2026-05-08 14:27:40 -07:00
auth.py	fix(xai-oauth): accept bare-code manual paste (state=None) (#26923) (#33880)	2026-05-28 05:47:30 -07:00
auth_commands.py	fix(cli): show masked feedback for secret prompts	2026-05-25 01:20:33 -07:00
azure_detect.py	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
backup.py	fix: limit pre-update state snapshots	2026-05-28 02:45:25 -07:00
banner.py	fix(docker): bake build-time git SHA into the image	2026-05-28 15:14:05 +10:00
browser_connect.py	feat: auto-launch Chromium-family browser for CDP	2026-05-19 22:34:05 -07:00
build_info.py	fix(docker): bake build-time git SHA into the image	2026-05-28 15:14:05 +10:00
bundles.py	feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373)	2026-05-18 21:38:05 -07:00
callbacks.py	fix(cli): show masked feedback for secret prompts	2026-05-25 01:20:33 -07:00
checkpoints.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
claw.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
cli_output.py	fix(cli): show masked feedback for secret prompts	2026-05-25 01:20:33 -07:00
clipboard.py	fix(clipboard): only read PNG signature bytes, not entire file	2026-05-13 22:54:21 -07:00
codex_models.py	fix(codex): drop dead model slugs that HTTP 400 on ChatGPT Pro (#33424)	2026-05-27 12:16:15 -07:00
codex_runtime_plugin_migration.py	fix(codex-runtime): de-dup [plugins.X] tables and stop leaking HERMES_HOME into config.toml	2026-05-15 02:31:30 -07:00
codex_runtime_switch.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)	2026-05-17 02:29:41 -07:00
colors.py	feat: respect NO_COLOR env var and TERM=dumb (#4079)	2026-03-30 17:07:21 -07:00
commands.py	fix(model picker): unify /model and `hermes model` lists, add disk cache (#33867)	2026-05-28 11:33:16 -07:00
completion.py	test(cli): strengthen zsh completion regression coverage	2026-05-13 09:34:15 -07:00
config.py	feat(prompt): universal task-completion guidance + local Python toolchain probe (#34340)	2026-05-28 22:26:09 -07:00
container_boot.py	fix(docker): make s6 lifecycle work for the unprivileged hermes user	2026-05-25 12:23:23 +10:00
copilot_auth.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
cron.py	feat: add cron job profile support	2026-05-18 17:39:50 +00:00
curator.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
curses_ui.py	fix(cli): clamp curses color 8 for 8-color terminals (Docker)	2026-05-21 23:40:58 -07:00
debug.py	fix(debug): redact BlueBubbles webhook secrets	2026-05-24 15:43:48 -07:00
default_soul.py	fix: reset default SOUL.md to baseline identity text (#3159)	2026-03-26 01:34:27 -07:00
dep_ensure.py	feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection (#27845)	2026-05-18 16:34:24 +05:30
dingtalk_auth.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
doctor.py	remove Vercel AI Gateway and Vercel Sandbox (#33067)	2026-05-27 00:43:32 -07:00
dump.py	fix(docker): bake build-time git SHA into the image	2026-05-28 15:14:05 +10:00
env_loader.py	fix(secrets): only apply external secrets once per HERMES_HOME per process (#32271)	2026-05-25 15:18:55 -07:00
fallback_cmd.py	fix(fallback): merge fallback_providers with legacy fallback_model configurations	2026-05-23 05:24:57 -07:00
fallback_config.py	fix(fallback): merge fallback_providers with legacy fallback_model configurations	2026-05-23 05:24:57 -07:00
gateway.py	feat(docker): auto-redirect `gateway run` to supervised mode inside s6 image	2026-05-28 12:42:13 +10:00
gateway_windows.py	fix(gateway): drain on Windows `hermes gateway stop` so sessions survive restart (#33798)	2026-05-28 03:25:32 -07:00
goals.py	feat: inject current time into goal judge prompt	2026-05-16 23:05:27 -07:00
hooks.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
inventory.py	refactor(inventory): extract shared ConfigContext + build_models_payload	2026-05-13 22:31:11 -07:00
kanban.py	fix(kanban): CLI dispatch honors max_in_progress/max_spawn from config; swap missing 'avoid-ai-writing' skill for bundled humanizer (#33488, #29415) (#34337)	2026-05-28 21:00:46 -07:00
kanban_db.py	feat(kanban): default_assignee fallback + per-profile concurrency cap (#27145, #21582) (#34244)	2026-05-28 19:02:55 -07:00
kanban_decompose.py	fix(kanban): close kanban.db FD after every connect() in long-lived processes	2026-05-27 22:07:49 -07:00
kanban_diagnostics.py	fix(kanban): honor severity thresholds in diagnostics	2026-05-18 20:47:01 -07:00
kanban_specify.py	fix(kanban): close kanban.db FD after every connect() in long-lived processes	2026-05-27 22:07:49 -07:00
kanban_swarm.py	fix(kanban): CLI dispatch honors max_in_progress/max_spawn from config; swap missing 'avoid-ai-writing' skill for bundled humanizer (#33488, #29415) (#34337)	2026-05-28 21:00:46 -07:00
logs.py	feat: component-separated logging with session context and filtering (#7991)	2026-04-11 17:23:36 -07:00
main.py	fix(model picker): unify /model and `hermes model` lists, add disk cache (#33867)	2026-05-28 11:33:16 -07:00
mcp_catalog.py	feat(mcp): Nous-approved MCP catalog with interactive picker (#30870)	2026-05-26 12:48:14 -07:00
mcp_config.py	feat(mcp): Nous-approved MCP catalog with interactive picker (#30870)	2026-05-26 12:48:14 -07:00
mcp_picker.py	feat(mcp): Nous-approved MCP catalog with interactive picker (#30870)	2026-05-26 12:48:14 -07:00
memory_setup.py	fix(cli): show masked feedback for secret prompts	2026-05-25 01:20:33 -07:00
migrate.py	feat(cli): hermes migrate xai [--apply] [--no-backup]	2026-05-20 09:18:23 -07:00
model_catalog.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
model_normalize.py	remove Vercel AI Gateway and Vercel Sandbox (#33067)	2026-05-27 00:43:32 -07:00
model_switch.py	fix(model picker): unify /model and `hermes model` lists, add disk cache (#33867)	2026-05-28 11:33:16 -07:00
models.py	fix(model picker): unify /model and `hermes model` lists, add disk cache (#33867)	2026-05-28 11:33:16 -07:00
nous_account.py	feat(auth) normalise the way in which we check whether a user has free/paid access to nous portal so we can expose behaviour and error messages accordingly.	2026-05-28 00:19:31 -07:00
nous_subscription.py	fix(auth): refresh Nous entitlement in tool menus	2026-05-28 00:19:31 -07:00
oneshot.py	fix(provider): make config.yaml model.provider the single source of truth (#31222)	2026-05-23 18:18:41 -07:00
pairing.py	fix(pairing): enforce lockout on approve_code, not just generate_code (#10195) (#21325)	2026-05-07 07:18:21 -07:00
platforms.py	feat: complete plugin platform parity — all 12 integration points	2026-04-29 21:56:51 -07:00
plugins.py	feat(plugins): add register_dashboard_auth_provider hook on PluginContext	2026-05-27 02:12:27 -07:00
plugins_cmd.py	feat(context-engine): host contract for external context engines	2026-05-28 01:45:30 -07:00
portal_cli.py	feat(portal): one-shot setup, status CLI, and Nous-included markers (#30860)	2026-05-23 02:39:09 -07:00
profile_describer.py	fix(skills): prune dependency/venv dirs from all skill scanners (#30042)	2026-05-21 14:18:02 -07:00
profile_distribution.py	fix(profile): reject symlinks in distributions (#25292)	2026-05-25 05:07:58 -07:00
profiles.py	fix(security): tighten .env file permissions to 0600 at all creation sites	2026-05-25 03:40:47 -07:00
providers.py	remove Vercel AI Gateway and Vercel Sandbox (#33067)	2026-05-27 00:43:32 -07:00
psutil_android.py	fix(android): reject unsafe tar members in psutil compatibility installer	2026-05-28 02:36:09 -07:00
pt_input_extras.py	fix(cli): make Ctrl+Enter insert newline on WSL/SSH/Windows Terminal (#22777)	2026-05-09 12:48:14 -07:00
pty_bridge.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
relaunch.py	fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch	2026-05-08 14:27:40 -07:00
runtime_provider.py	fix(custom): pass custom provider extra body	2026-05-21 07:48:53 -07:00
secret_prompt.py	fix(cli): show masked feedback for secret prompts	2026-05-25 01:20:33 -07:00
secrets_cli.py	fix(cli): show masked feedback for secret prompts	2026-05-25 01:20:33 -07:00
security_advisories.py	feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220)	2026-05-12 01:02:25 -07:00
security_audit.py	feat(security): on-demand supply-chain audit via OSV.dev (#31460)	2026-05-24 15:15:16 -07:00
send_cmd.py	fix(review): address Copilot follow-up on sanitizer and file decode errors	2026-05-16 23:00:58 -05:00
service_manager.py	fix(docker): align HOME for dashboard and s6 gateway services (#33481)	2026-05-28 13:42:27 +10:00
session_recap.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)	2026-05-17 02:29:41 -07:00
setup.py	remove Vercel AI Gateway and Vercel Sandbox (#33067)	2026-05-27 00:43:32 -07:00
skills_config.py	refactor(config): migrate remaining 33 cfg_get call sites (#17311)	2026-04-29 04:03:03 -07:00
skills_hub.py	fix(skills-hub): stop ellipsis-truncating the Identifier column (#33810)	2026-05-28 04:53:13 -07:00
skin_engine.py	fix(tui): improve charizard completion menu contrast	2026-05-18 20:05:23 -07:00
slack_cli.py	fix(slack): enable writable app home DMs in manifest	2026-05-08 17:01:12 -07:00
status.py	feat(auth) normalise the way in which we check whether a user has free/paid access to nous portal so we can expose behaviour and error messages accordingly.	2026-05-28 00:19:31 -07:00
stdio.py	chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)	2026-05-11 11:13:25 -07:00
timeouts.py	perf(agent-loop): cut 47% of per-conversation function calls via 3 targeted hot-path optimizations (#28866)	2026-05-19 14:25:10 -07:00
tips.py	docs(auth): replace stale 'hermes login' references with 'hermes auth add'	2026-05-26 15:41:11 -07:00
tools_config.py	fix: expose context engine tools with saved toolsets	2026-05-28 00:28:42 -07:00
uninstall.py	docs(windows): avoid piping installer directly into iex	2026-05-18 20:05:47 -07:00
voice.py	fix(tui): restore voice push-to-talk parity (#20897)	2026-05-06 15:49:59 -07:00
web_server.py	fix(dashboard-auth): share /api/* public allowlist between legacy and OAuth gates	2026-05-29 12:17:12 +10:00
webhook.py	fix(state): restrict sensitive store file permissions	2026-05-24 04:55:18 -07:00
xai_retirement.py	fix(xai): align migrate retirement map with docs	2026-05-20 09:18:23 -07:00