mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
feat(execute_code): add project/strict execution modes, default to project (#11971)
Weaker models (Gemma-class) repeatedly rediscover and forget that
execute_code uses a different CWD and Python interpreter than terminal(),
causing them to flip-flop on whether user files exist and to hit import
errors on project dependencies like pandas.
Adds a new 'code_execution.mode' config key (default 'project') that
brings execute_code into line with terminal()'s filesystem/interpreter:
project (new default):
- cwd = session's TERMINAL_CWD (falls back to os.getcwd())
- python = active VIRTUAL_ENV/bin/python or CONDA_PREFIX/bin/python
with a Python 3.8+ version check; falls back cleanly to
sys.executable if no venv or the candidate fails
- result : 'import pandas' works, '.env' resolves, matches terminal()
strict (opt-in):
- cwd = staging tmpdir (today's behavior)
- python = sys.executable (today's behavior)
- result : maximum reproducibility and isolation; project deps
won't resolve
Security-critical invariants are identical across both modes and covered by
explicit regression tests:
- env scrubbing (strips *_API_KEY, *_TOKEN, *_SECRET, *_PASSWORD,
*_CREDENTIAL, *_PASSWD, *_AUTH substrings)
- SANDBOX_ALLOWED_TOOLS whitelist (no execute_code recursion, no
delegate_task, no MCP from inside scripts)
- resource caps (5-min timeout, 50KB stdout, 50 tool calls)
Deliberately avoids 'sandbox'/'isolated'/'cloud' language in tool
descriptions (regression from commit 39b83f34 where agents on local
backends falsely believed they were sandboxed and refused networking).
Override via env var: HERMES_EXECUTE_CODE_MODE=strict|project
This commit is contained in:
parent
54e0eb24c0
commit
285bb2b915
5 changed files with 643 additions and 14 deletions
|
|
@ -274,9 +274,9 @@ def get_tool_definitions(
|
|||
# execute_code" even when the API key isn't configured or the toolset is
|
||||
# disabled (#560-discord).
|
||||
if "execute_code" in available_tool_names:
|
||||
from tools.code_execution_tool import SANDBOX_ALLOWED_TOOLS, build_execute_code_schema
|
||||
from tools.code_execution_tool import SANDBOX_ALLOWED_TOOLS, build_execute_code_schema, _get_execution_mode
|
||||
sandbox_enabled = SANDBOX_ALLOWED_TOOLS & available_tool_names
|
||||
dynamic_schema = build_execute_code_schema(sandbox_enabled)
|
||||
dynamic_schema = build_execute_code_schema(sandbox_enabled, mode=_get_execution_mode())
|
||||
for i, td in enumerate(filtered_tools):
|
||||
if td.get("function", {}).get("name") == "execute_code":
|
||||
filtered_tools[i] = {"type": "function", "function": dynamic_schema}
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue