hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-26 17:38:36 +00:00

brooklyn! 3e74f75e41 feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316)

* feat(agent): coding-context posture with per-model edit-format tuning

Hermes detects when it's running in a coding context, meaning an interactive surface (CLI, TUI, ACP, desktop) sitting in a code workspace (git repo or recognised project root), and shifts into a coding posture. Outside that (chat platforms, non-workspaces) nothing changes.

The posture is modelled as a frozen RuntimeMode selected from a small ContextProfile registry (coding/general). A profile is data: the toolset to collapse to, the operating brief to inject, and seams for model routing and memory. Every domain reads the same resolved object instead of re-probing git/config on its own:

- System prompt: RuntimeMode.system_blocks() supplies an operating brief (gather context before editing, edit through tools not chat, verify with terminal, cap retry loops) plus a live git/workspace snapshot, built once and baked into the stable prompt tier so per-conversation caching is preserved.
- Per-model edit-format tuning: the brief nudges each model family toward the patch mode it handles best: OpenAI/Codex toward mode='patch' (V4A multi-file diffs), Anthropic toward mode='replace' (string replacement). The model id rides on RuntimeMode; unknown families keep neutral wording.
- Skill index: non-coding skill categories are pruned from the prompt's skill index (discovery-only; skills_list/skill_view still reach the full catalog, with a disclosure note).
- Toolset: only under the opt-in 'focus' mode does the posture collapse to the coding toolset + enabled MCP servers; the default posture is prompt-only and never overrides configured toolsets.

Activation via agent.coding_context: auto (default), focus, on, off. Subagents inherit the posture for free via toolset inheritance + the shared prompt builder. Detection is not memoized so a long-lived gateway/TUI process can't pin a stale posture across working directories.
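A minimal sketch of how a frozen RuntimeMode might be resolved from a small profile registry, as the commit message above describes. Only the names RuntimeMode, ContextProfile, system_blocks and the four agent.coding_context values come from the commit message; resolve_mode, the field names, the tool names, and the readings of 'on' (force the coding posture) and 'focus' (still requires detection) are assumptions made for illustration, not the repository's actual code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ContextProfile:
    """A profile is plain data: a brief to inject and a toolset to collapse to."""
    name: str
    brief: str
    toolset: Optional[Tuple[str, ...]] = None


# Hypothetical registry contents; the tool names are illustrative only.
PROFILES = {
    "coding": ContextProfile(
        name="coding",
        brief="Gather context before editing; edit through tools; verify with the terminal.",
        toolset=("read_file", "write_file", "patch", "terminal"),
    ),
    "general": ContextProfile(name="general", brief=""),
}


@dataclass(frozen=True)
class RuntimeMode:
    """Resolved once, then read by every consumer instead of re-probing git/config."""
    profile: ContextProfile
    collapse_toolset: bool = False
    model_id: str = ""

    def system_blocks(self) -> Tuple[str, ...]:
        # The general profile contributes nothing to the prompt.
        return (self.profile.brief,) if self.profile.brief else ()


def resolve_mode(setting: str, interactive: bool, in_workspace: bool,
                 model_id: str = "") -> RuntimeMode:
    """Map the config value plus detection results to a RuntimeMode.

    Assumed semantics: 'off' never activates, 'on' always activates,
    'auto' and 'focus' activate only when a coding context is detected,
    and only 'focus' collapses the toolset.
    """
    detected = interactive and in_workspace
    if setting == "on" or (setting in ("auto", "focus") and detected):
        return RuntimeMode(PROFILES["coding"],
                           collapse_toolset=(setting == "focus"),
                           model_id=model_id)
    return RuntimeMode(PROFILES["general"], model_id=model_id)
```

Because both dataclasses are frozen, a resolved mode can be handed to the prompt builder, the skill index, and subagents without any of them mutating it. Calling resolve_mode on each request, rather than caching its result, matches the note that detection is not memoized.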
* feat(agent): cover new-file authoring in the coding edit-format nudge

The per-model edit-format guidance only addressed editing existing code (patch mode='patch' vs 'replace'), but authoring a brand-new file (write_file, not patch) is a large fraction of real coding work and the nudge was silent on it. Surfaced when building a single-file artifact where the dominant operation was write_file and the steering offered no guidance. Both family lines now lead with "author new files with write_file; for edits to existing code prefer ...". Tests assert write_file appears in each family's brief; unknown families still get neutral wording.

* docs(agent): correct memoization docstring + clarify TUI config-load asymmetry

* feat(agent): sharpen the coding posture: verify-loop facts, wider edit steering, $HOME guard

Tuning pass on the coding posture from dogfooding it as a harness:

- Workspace snapshot now hands the model its verify loop up front: detected manifests + package manager (lockfile sniff), the exact verify commands (package.json scripts, Makefile targets, scripts/run_tests.sh, pytest config), and which context files (AGENTS.md / CLAUDE.md / .cursorrules) exist at the root. Marker-only (non-git) projects get the snapshot too instead of nothing. The "verify before claiming done" brief line was the highest-value piece in evals; this turns it from advice into an executable loop instead of making the model rediscover the test command every session. Still stat-cheap, size-guarded reads, built once at prompt time.
- Edit-format steering covers the families Hermes actually serves: Gemini and open-weight coding models (DeepSeek, Qwen, Kimi, GLM, Grok, Hermes, Llama, Mistral, Devstral, MiniMax) steer to mode='replace' because their RL scaffolds use str_replace-style editors. Previously only GPT/Codex and Claude families got steering; the models Hermes users disproportionately run all fell to neutral.
- Operating brief gains four behaviors elite harnesses encode: batch independent reads/searches in one turn; fix root causes and the bug class (sibling call paths), not the reported site; no drive-by refactors/renames/reformatting; never read, print, or commit secrets. Plus a patch-failure escalation ladder: after the same region fails twice, rewrite the enclosing function/file with write_file instead of a third patch attempt.
- $HOME dotfiles guard: a git repo rooted exactly at the home directory (or a marker sitting in it, e.g. a global ~/AGENTS.md) is user config, not a code workspace. Without the guard, every session anywhere under a dotfiles-managed home silently flipped to the coding posture. Real projects under such a home still detect via their own markers/repos; 'on' mode bypasses the guard.

2026-06-10 23:06:44 -05:00
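The $HOME dotfiles guard described above can be sketched as a walk up from the working directory that refuses to treat the home directory itself as a workspace. This is an illustrative sketch only: find_workspace_root, the MARKERS tuple, the walk-up strategy, and the force flag (standing in for 'on' mode) are assumptions, not the repository's implementation.

```python
from pathlib import Path
from typing import Optional

# Assumed marker set; the real list of recognised project markers may differ.
MARKERS = (".git", "pyproject.toml", "package.json", "AGENTS.md")


def find_workspace_root(cwd: Path, home: Path, force: bool = False) -> Optional[Path]:
    """Return the nearest ancestor of cwd that looks like a code workspace.

    A marker found exactly at the home directory is treated as user config
    (for example a dotfiles repo or a global ~/AGENTS.md) and ignored unless
    force is set. Projects nested under home are found first, because the
    walk starts at cwd and moves upward.
    """
    cwd = cwd.resolve()
    home = home.resolve()
    for candidate in (cwd, *cwd.parents):
        if any((candidate / marker).exists() for marker in MARKERS):
            if candidate == home and not force:
                return None  # dotfiles-managed home, not a workspace
            return candidate
    return None
```

The function does no caching, so a long-lived process that changes working directory gets a fresh answer on each call.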
acp	feat(acp): emit session provenance metadata for compression rotation (#41724)	2026-06-07 22:22:21 -07:00
acp_adapter
agent	feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316)	2026-06-10 23:06:44 -05:00
cli	fix(cli): prevent duplicate one-shot finalize on interrupted cleanup (#43320)	2026-06-09 22:41:04 -07:00
cron	revert(cron): remove per-job profile support (PR #28124) (#43956)	2026-06-10 20:46:17 -07:00
docker	fix(gateway): auto-start after container restart via planned-stop marker (#42675) (#43236)	2026-06-10 14:01:34 +10:00
e2e	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
fakes
fixtures/plugins/example-dashboard/dashboard	feat(dashboard): nous-blue theme, bulk sessions, schedule picker (#37383)	2026-06-02 12:37:40 -04:00
gateway	fix(telegram): stripped-text fallbacks, re-finalize skip, and tail-only delete guard	2026-06-10 15:09:35 -07:00
hermes_cli	revert(cron): remove per-job profile support (PR #28124) (#43956)	2026-06-10 20:46:17 -07:00
hermes_state
honcho_plugin	fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150)	2026-06-08 09:35:22 -07:00
integration	refactor(gateway): migrate Home Assistant adapter to bundled plugin	2026-06-06 11:46:24 -07:00
openviking_plugin	fix(openviking): add missing /agent/{agent}/ segment to memory URI — fixes #36969	2026-06-04 17:40:33 -07:00
plugins	fix(web): genericize free-MCP client identity per telemetry policy	2026-06-10 19:54:38 -07:00
providers	fix(openrouter): route reasoning_effort to verbosity for adaptive Anthropic models (#43436)	2026-06-10 15:03:01 +05:30
run_agent	fix(agent): strip api_messages in thinking-signature recovery so the retry actually omits thinking blocks	2026-06-10 12:39:44 -07:00
scripts	fix(skills-hub): stop shipping a degenerate index when GitHub taps collapse (#42347)	2026-06-08 15:21:28 -07:00
skills	fix(google-workspace): fall back to uv when venv has no pip (#39516)	2026-06-05 13:30:02 +10:00
stress	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
tools	feat(web): Parallel-backed web search & extract — free Search MCP when keyless, v1 REST when keyed	2026-06-10 19:54:38 -07:00
tui_gateway	fix(desktop): rebind sessions after websocket reconnect (salvage of #41740) (#43004)	2026-06-09 19:01:00 +00:00
website	feat(skills): fix browse cap, add source links + copy buttons + category cleanup (#37143)	2026-06-01 19:52:28 -07:00
__init__.py
conftest.py	fix: batch of small robustness/correctness fixes from @kyssta-exe	2026-06-01 19:51:03 -07:00
run_interrupt_test.py
test_account_usage.py
test_atomic_replace_symlinks.py
test_base_url_hostname.py	security(runtime_provider): close OLLAMA_API_KEY substring-leak sweep miss (#13522)	2026-04-21 06:06:16 -07:00
test_batch_runner_checkpoint.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_bitwarden_secrets.py	fix(bitwarden): prevent zip-slip path traversal when extracting bws binary (#40569)	2026-06-06 18:33:44 -07:00
test_cli_file_drop.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_cli_manual_compress.py
test_cli_skin_integration.py
test_ctx_halving_fix.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_dashboard_sidecar_close_on_disconnect.py	fix(tui-gateway): reap leaked slash_worker sessions on disconnect + active_list liveness (re-scoped onto current main)	2026-06-08 10:02:05 -07:00
test_desktop_mac_entitlements.py	test(desktop): assert macOS device entitlements are inherited	2026-06-03 07:32:00 +07:00
test_docker_home_override_scripts.py	Repair cron ownership on container restart (#41976)	2026-06-10 15:32:34 +10:00
test_docker_stage2_browser_discovery.py	fix(docker): discover Playwright headless_shell browser (#35717)	2026-06-01 16:06:44 +10:00
test_dockerfile_tini_compat_shim.py	fix(docker): add /usr/bin/tini compatibility shim for legacy wrappers (#34192) (#34382)	2026-06-01 13:32:55 +10:00
test_empty_model_fallback.py	test(models): guard Nous silent default against expensive-flagship escalation	2026-06-05 02:54:34 -07:00
test_env_loader_secret_sources.py	fix(secrets): only apply external secrets once per HERMES_HOME per process (#32271)	2026-05-25 15:18:55 -07:00
test_evidence_store.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_gateway_streaming_nested_config.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_get_tool_definitions_cache_isolation.py	fix(gateway): close residual memory-leak sites under heavy scheduled workload	2026-06-08 06:32:42 -07:00
test_hermes_bootstrap.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_hermes_constants.py	fix(constants): use windows native default hermes home	2026-06-03 19:37:29 -07:00
test_hermes_home_profile_warning.py
test_hermes_logging.py	fix(gateway): tolerate Unicode in stderr log handlers on Windows	2026-06-06 19:57:44 -07:00
test_hermes_state.py	fix(cron): bound the desktop run-history query to one job (#41088)	2026-06-07 02:41:01 -07:00
test_hermes_state_compression_locks.py	fix(compression): prevent session-id fork from concurrent compressions (#34351)	2026-05-28 21:40:39 -07:00
test_hermes_state_wal_fallback.py	fix(kanban): skip redundant WAL pragma on already-WAL connections	2026-05-27 14:31:55 -07:00
test_honcho_client_concurrency.py	fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150)	2026-06-08 09:35:22 -07:00
test_honcho_client_config.py	fix(honcho): harden self-hosted setup paths	2026-05-29 22:29:48 -07:00
test_honcho_session_context.py	fix(honcho): align user context peer perspective	2026-05-27 10:49:33 -07:00
test_honcho_startup_fail_open.py	fix: make Honcho startup fail open	2026-06-01 20:13:42 -07:00
test_install_no_initial_commit.py	fix(install): move broken checkout aside instead of deleting it	2026-06-08 02:18:21 -07:00
test_install_sh_browser_install.py
test_install_sh_pythonpath_sanitization.py
test_install_sh_root_fhs_uv_python_path.py	test(install): harden uv-python-path regression test against future drift	2026-05-27 13:55:51 -07:00
test_install_sh_setup_wizard_tty_probe.py	fix(install): widen /dev/tty open-probe to sibling gates (#16746)	2026-04-28 06:45:55 -07:00
test_install_sh_symlink_stomp.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_install_sh_termux_network_prereqs.py
test_ipv4_preference.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_lazy_session_regressions.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_lint_config.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_live_system_guard_self_test.py
test_mcp_serve.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minimax_oauth.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_minisweagent_path.py
test_model_forces_max_completion_tokens.py	fix(params): send max_completion_tokens for newer OpenAI families on custom endpoints	2026-06-09 23:22:10 -07:00
test_model_picker_scroll.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_model_tools.py	feat(middleware): add adaptive execution intercepts	2026-06-03 11:22:06 -07:00
test_model_tools_async_bridge.py	fix(web): run URL SSRF checks off the event loop in async paths	2026-06-04 18:04:47 -07:00
test_ollama_num_ctx.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_output_cap_parsing.py	test(agent): cover char-based output-cap overflow parsing (#42741)	2026-06-09 03:17:12 -07:00
test_package_json_lazy_deps.py
test_packaging_metadata.py	fix(packaging): ship optional-mcps catalog in wheel and sdist (#39859)	2026-06-09 14:03:20 -04:00
test_plugin_skills.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_plugin_utils.py	fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150)	2026-06-08 09:35:22 -07:00
test_process_loop_event_loop_warning.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_project_metadata.py	fix(deps): align anthropic extra pin with lazy pin + guard whole pin surface (#42335)	2026-06-08 12:11:54 -07:00
test_retry_utils.py
test_run_tests_parallel.py	fix(ci): make parallel runner's exit-4 retry robust for newly-added test files (#42994)	2026-06-09 21:39:09 -07:00
test_sanitize_tool_error.py
test_slash_worker_watchdog.py	feat(slash-worker): self-terminate on parent death via create_time watchdog	2026-06-08 07:03:12 -07:00
test_sql_injection.py
test_state_db_malformed_repair.py	fix(state.db): recover from malformed sqlite_master so hidden sessions reappear (#43149)	2026-06-09 18:49:08 -05:00
test_subprocess_home_isolation.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_termux_all_extra_compat.py
test_timezone.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_toolset_distributions.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_toolsets.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_trajectory_compressor.py	fix(research): keep tool_call/tool_response pairs intact when compressing trajectories	2026-06-07 05:01:27 -07:00
test_trajectory_compressor_async.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_transform_llm_output_hook.py
test_transform_tool_result_hook.py	test: stub has_hook in transform_tool_result hook tests	2026-06-03 06:36:46 -07:00
test_tui_gateway_server.py	desktop: registry-driven slash commands + first-class /resume & /handoff (#42351)	2026-06-11 01:49:24 +00:00
test_tui_gateway_ws.py	fix: resolve rebase conflict in _teardown_session worker cleanup	2026-06-08 10:02:05 -07:00
test_utils_truthy_values.py
test_web_server.py	fix(tui-gateway): reap leaked slash_worker sessions on disconnect + active_list liveness (re-scoped onto current main)	2026-06-08 10:02:05 -07:00
test_wheel_locales_e2e.py	fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix (#38383)	2026-06-03 12:00:27 -07:00
test_yuanbao_integration.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_yuanbao_markdown.py
test_yuanbao_pipeline.py	test(yuanbao): add missing patch import to pipeline tests	2026-06-09 03:17:00 -07:00
test_yuanbao_proto.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_yuanbao_shutdown.py	fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (#40607)	2026-06-07 17:49:38 -07:00