feat(execute_code): add project/strict execution modes, default to project (#11971)

Weaker models (Gemma-class) repeatedly rediscover and forget that
execute_code uses a different CWD and Python interpreter than terminal(),
causing them to flip-flop on whether user files exist and to hit import
errors on project dependencies like pandas.

Adds a new 'code_execution.mode' config key (default 'project') that
brings execute_code into line with terminal()'s filesystem/interpreter:

  project (new default):
    - cwd       = session's TERMINAL_CWD (falls back to os.getcwd())
    - python    = active VIRTUAL_ENV/bin/python or CONDA_PREFIX/bin/python
                  with a Python 3.8+ version check; falls back cleanly to
                  sys.executable if no venv or the candidate fails
    - result    : 'import pandas' works, '.env' resolves, matches terminal()

  strict (opt-in):
    - cwd       = staging tmpdir (today's behavior)
    - python    = sys.executable (today's behavior)
    - result    : maximum reproducibility and isolation; project deps
                  won't resolve
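
A minimal sketch of how the two modes could pick a working directory and
interpreter, assuming the rules listed above. The helper names
(resolve_python, resolve_cwd) and the probe command are hypothetical and not
taken from this commit; only the environment variable names and the fallback
order come from the description.

```python
import os
import subprocess
import sys
import tempfile

# One-liner run inside the candidate interpreter: exit 0 only on Python 3.8+.
VERSION_PROBE = "import sys; raise SystemExit(0 if sys.version_info >= (3, 8) else 1)"


def resolve_python(env):
    """Project mode: prefer the active venv/conda interpreter, else sys.executable."""
    for var in ("VIRTUAL_ENV", "CONDA_PREFIX"):
        prefix = env.get(var)
        if not prefix:
            continue
        candidate = os.path.join(prefix, "bin", "python")
        if not os.path.isfile(candidate):
            continue
        try:
            probe = subprocess.run(
                [candidate, "-c", VERSION_PROBE],
                capture_output=True,
                timeout=10,
            )
        except (OSError, subprocess.SubprocessError):
            continue  # candidate could not be run; try the next one
        if probe.returncode == 0:
            return candidate
    return sys.executable


def resolve_cwd(mode, env):
    """Strict mode stages in a fresh tmpdir; project mode follows TERMINAL_CWD."""
    if mode == "strict":
        return tempfile.mkdtemp(prefix="execute_code_")
    return env.get("TERMINAL_CWD") or os.getcwd()
```

With no venv or conda prefix in the environment, resolve_python returns
sys.executable, which is the same interpreter strict mode uses.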

Security-critical invariants are identical across both modes and covered by
explicit regression tests:

  - env scrubbing (strips variables whose names contain *_API_KEY, *_TOKEN,
    *_SECRET, *_PASSWORD, *_CREDENTIAL, *_PASSWD, or *_AUTH)
  - SANDBOX_ALLOWED_TOOLS whitelist (no execute_code recursion, no
    delegate_task, no MCP from inside scripts)
  - resource caps (5-min timeout, 50KB stdout, 50 tool calls)
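
The env scrubbing invariant can be illustrated with a short sketch. It
assumes a plain, case-insensitive substring match on the variable name; the
function name scrub_env and the exact matching rule are assumptions, and the
real implementation may differ.

```python
# Substrings named in the commit message; matching rule is an assumption.
SECRET_MARKERS = (
    "_API_KEY",
    "_TOKEN",
    "_SECRET",
    "_PASSWORD",
    "_CREDENTIAL",
    "_PASSWD",
    "_AUTH",
)


def scrub_env(env):
    """Return a copy of env without variables whose names look like secrets."""
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in SECRET_MARKERS)
    }
```

Under this matching rule a name such as GIT_AUTHOR_NAME would also be
dropped, because it contains _AUTH. Erring toward removal seems like the
safer side for a child process that runs model-written code.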

Deliberately avoids 'sandbox'/'isolated'/'cloud' language in tool
descriptions (regression from commit 39b83f34 where agents on local
backends falsely believed they were sandboxed and refused networking).

Override via env var: HERMES_EXECUTE_CODE_MODE=strict|project
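
A sketch of how the mode could be resolved, assuming the env var takes
precedence over the 'code_execution.mode' config key and that unrecognized
values fall back to the default. The function name resolve_mode and the
handling of invalid values are assumptions, not taken from this commit.

```python
import os

VALID_MODES = ("project", "strict")
DEFAULT_MODE = "project"


def resolve_mode(config, env=None):
    """Env var override first, then config, then the 'project' default."""
    env = os.environ if env is None else env
    override = env.get("HERMES_EXECUTE_CODE_MODE", "").strip().lower()
    if override in VALID_MODES:
        return override
    # Tolerate a missing or empty 'code_execution' section in the config.
    section = config.get("code_execution") or {}
    configured = str(section.get("mode", DEFAULT_MODE)).strip().lower()
    return configured if configured in VALID_MODES else DEFAULT_MODE
```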
Teknium 2026-04-18 01:46:25 -07:00 committed by GitHub
parent 54e0eb24c0
commit 285bb2b915
5 changed files with 643 additions and 14 deletions

@@ -459,7 +459,7 @@ class TestCustomProviderCompatibility:
 migrate_config(interactive=False, quiet=True)
 raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))
-assert raw["_config_version"] == 18
+assert raw["_config_version"] == 19
 assert raw["providers"]["openai-direct"] == {
 "api": "https://api.openai.com/v1",
 "api_key": "test-key",
@@ -606,7 +606,7 @@ class TestInterimAssistantMessageConfig:
 migrate_config(interactive=False, quiet=True)
 raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))
-assert raw["_config_version"] == 18
+assert raw["_config_version"] == 19
 assert raw["display"]["tool_progress"] == "off"
 assert raw["display"]["interim_assistant_messages"] is True
@@ -626,6 +626,6 @@ class TestDiscordChannelPromptsConfig:
 migrate_config(interactive=False, quiet=True)
 raw = yaml.safe_load(config_path.read_text(encoding="utf-8"))
-assert raw["_config_version"] == 18
+assert raw["_config_version"] == 19
 assert raw["discord"]["auto_thread"] is True
 assert raw["discord"]["channel_prompts"] == {}