feat(execute_code): add project/strict execution modes, default to project (#11971)

Weaker models (Gemma-class) repeatedly rediscover and forget that execute_code uses a different CWD and Python interpreter than terminal(), causing them to flip-flop on whether user files exist and to hit import errors on project dependencies like pandas.

Adds a new 'code_execution.mode' config key (default 'project') that brings execute_code into line with terminal()'s filesystem/interpreter:

  project (new default):
    - cwd = session's TERMINAL_CWD (falls back to os.getcwd())
    - python = active VIRTUAL_ENV/bin/python or CONDA_PREFIX/bin/python with a Python 3.8+ version check; falls back cleanly to sys.executable if no venv or the candidate fails
    - result: 'import pandas' works, '.env' resolves, matches terminal()

  strict (opt-in):
    - cwd = staging tmpdir (today's behavior)
    - python = sys.executable (today's behavior)
    - result: maximum reproducibility and isolation; project deps won't resolve

Security-critical invariants are identical across both modes and covered by explicit regression tests:

  - env scrubbing (strips *_API_KEY, *_TOKEN, *_SECRET, *_PASSWORD, *_CREDENTIAL, *_PASSWD, *_AUTH substrings)
  - SANDBOX_ALLOWED_TOOLS whitelist (no execute_code recursion, no delegate_task, no MCP from inside scripts)
  - resource caps (5-min timeout, 50KB stdout, 50 tool calls)

Deliberately avoids 'sandbox'/'isolated'/'cloud' language in tool descriptions (regression from commit 39b83f34 where agents on local backends falsely believed they were sandboxed and refused networking).

Override via env var: HERMES_EXECUTE_CODE_MODE=strict|project
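
The mode semantics and the shared env-scrubbing invariant described above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the function names (`resolve_mode`, `resolve_python`, `resolve_cwd`, `scrub_env`, `is_python_38_plus`) are hypothetical, and three behaviors are assumptions: that the env var takes precedence over the config key, that unrecognized mode values fall back to 'project', and that scrubbing matches on the variable name case-insensitively.

```python
import os
import subprocess
import sys

# Substrings listed in the commit message; a variable is dropped when its
# name contains any of them (case-insensitive matching is an assumption).
SECRET_SUBSTRINGS = (
    "_API_KEY", "_TOKEN", "_SECRET", "_PASSWORD",
    "_CREDENTIAL", "_PASSWD", "_AUTH",
)


def resolve_mode(config, environ):
    """Pick 'project' or 'strict'. Assumed precedence: env var, then config."""
    raw = environ.get("HERMES_EXECUTE_CODE_MODE") or config.get(
        "code_execution", {}
    ).get("mode", "project")
    mode = str(raw).strip().lower()
    # Assumption: anything unrecognized falls back to the default.
    return mode if mode in ("project", "strict") else "project"


def is_python_38_plus(candidate):
    """Ask the candidate interpreter for its version; any failure means no."""
    try:
        out = subprocess.run(
            [candidate, "-c", "import sys; print(sys.version_info >= (3, 8))"],
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return out.returncode == 0 and out.stdout.strip() == "True"


def resolve_python(mode, environ):
    """project: venv or conda interpreter if usable; otherwise sys.executable."""
    if mode == "project":
        for var in ("VIRTUAL_ENV", "CONDA_PREFIX"):
            prefix = environ.get(var)
            if not prefix:
                continue
            candidate = os.path.join(prefix, "bin", "python")
            if os.path.isfile(candidate) and is_python_38_plus(candidate):
                return candidate
    return sys.executable


def resolve_cwd(mode, environ, staging_dir):
    """project: the session's TERMINAL_CWD; strict: the staging tmpdir."""
    if mode == "project":
        return environ.get("TERMINAL_CWD") or os.getcwd()
    return staging_dir


def scrub_env(environ):
    """Return a copy of the environment without secret-looking variables."""
    return {
        key: value
        for key, value in environ.items()
        if not any(s in key.upper() for s in SECRET_SUBSTRINGS)
    }
```

A caller would resolve the mode once, then pass `resolve_cwd(...)`, `resolve_python(...)`, and `scrub_env(os.environ)` to the subprocess that runs the script. Note that the `bin/python` layout taken from the commit message is the POSIX one; a Windows venv places the interpreter under `Scripts\python.exe`, which this sketch does not handle.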
2026-04-25 00:51:20 +00:00 · 2026-04-18 01:46:25 -07:00 · 285bb2b915
commit 285bb2b915
parent 54e0eb24c0
5 changed files with 643 additions and 14 deletions
--- a/tests/hermes_cli/test_config.py
+++ b/tests/hermes_cli/test_config.py
@@ -459,7 +459,7 @@ class TestCustomProviderCompatibility:
            migrate_config(interactive=False, quiet=True)
            raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))

-        assert raw["_config_version"] == 18
+        assert raw["_config_version"] == 19
        assert raw["providers"]["openai-direct"] == {
            "api": "https://api.openai.com/v1",
            "api_key": "test-key",
@@ -606,7 +606,7 @@ class TestInterimAssistantMessageConfig:
            migrate_config(interactive=False, quiet=True)
            raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))

-        assert raw["_config_version"] == 18
+        assert raw["_config_version"] == 19
        assert raw["display"]["tool_progress"] == "off"
        assert raw["display"]["interim_assistant_messages"] is True

@@ -626,6 +626,6 @@ class TestDiscordChannelPromptsConfig:
            migrate_config(interactive=False, quiet=True)
            raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))

-        assert raw["_config_version"] == 18
+        assert raw["_config_version"] == 19
        assert raw["discord"]["auto_thread"] is True
        assert raw["discord"]["channel_prompts"] == {}