mirror of https://github.com/NousResearch/hermes-agent.git
synced 2026-05-09 03:11:58 +00:00

execute_code: write sandbox files as UTF-8 on Windows

Second Windows-specific sandbox bug (WinError 10106 was the first):
after the env-scrub fix let the child start, it immediately failed to
import hermes_tools with:

    SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x97
    in position 154: invalid start byte

Root cause: _execute_local wrote the generated hermes_tools.py stub and
the user's script.py via open(path, 'w') without encoding=. On Windows
the default text-mode encoding follows the system locale (typically
cp1252), which encodes em-dashes (used in the stub's docstrings) as the
single byte 0x97. Python decodes source files as UTF-8 by default
(PEP 3120), so the import fails on 0x97, which is not a valid UTF-8
start byte, and the sandbox dies before any tool call.

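The mismatch described above can be reproduced in memory, without touching the filesystem or needing a Windows host. This is a minimal sketch: the docstring text is made up for illustration, and encoding a string with "cp1252" stands in for what open(path, 'w') would do under a cp1252 locale.

```python
# Illustrative only: reproduce the encode/decode mismatch in memory.
EM_DASH = "\u2014"
source = f'"""Call a tool {EM_DASH} returns the result."""\n'  # made-up docstring

# What a cp1252-locale open(path, "w") would put on disk.
as_cp1252 = source.encode("cp1252")
# What open(path, "w", encoding="utf-8") puts on disk.
as_utf8 = source.encode("utf-8")

# Under cp1252 the em-dash becomes the single byte 0x97.
assert b"\x97" in as_cp1252

# Decoding those bytes as UTF-8 (what Python does for source files) fails.
try:
    as_cp1252.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as exc:
    decoded_ok = False
    failing_byte = as_cp1252[exc.start]  # the byte the decoder rejected

# The UTF-8 bytes round-trip cleanly.
assert as_utf8.decode("utf-8") == source
```

The byte 0x97 has the bit pattern of a UTF-8 continuation byte, so it can never begin a character, which is why the decoder reports an invalid start byte.
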
Fix: pass encoding='utf-8' to all four file opens in the code_execution
path: the two staging writes in _execute_local (hermes_tools.py and
script.py), and the request write and response read of the RPC file
transport in the generated remote stub. JSON is ASCII-safe for most
payloads, but tool results (terminal output, web_extract content)
routinely carry non-ASCII.

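As a rough sketch of the staging side of the fix, the helper below writes a module with an explicit encoding and then compiles the raw bytes, which mirrors how Python decodes a source file on import. The names stage_source and hermes_tools_demo.py are hypothetical and chosen for illustration; they are not taken from the repository.

```python
import os
import tempfile


def stage_source(directory, filename, source):
    """Write source text as UTF-8 regardless of the platform's locale encoding."""
    path = os.path.join(directory, filename)
    with open(path, "w", encoding="utf-8") as f:
        f.write(source)
    return path


# A tiny stand-in for a generated stub that contains an em-dash.
stub_src = 'DOC = "tool stub \u2014 generated"\n'

with tempfile.TemporaryDirectory() as tmpdir:
    path = stage_source(tmpdir, "hermes_tools_demo.py", stub_src)
    with open(path, "rb") as f:
        raw = f.read()

    # compile() decodes bytes as UTF-8 by default, as the import system does,
    # so this only succeeds if the file on disk is valid UTF-8.
    namespace = {}
    exec(compile(raw, path, "exec"), namespace)
```

Passing encoding= at each call site keeps the behaviour independent of the host locale, rather than relying on an environment setting being present in the child process.
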
Tests added (4):

- test_stub_and_script_writes_specify_utf8: source grep guard
- test_file_rpc_stub_uses_utf8: generated remote stub check
- test_stub_source_roundtrips_through_utf8: concrete round-trip
- test_windows_default_encoding_would_have_failed: negative control
  (skipped on Python builds whose default encoding is already UTF-8
  compatible, but retained for platforms where the regression could
  return)

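The commit describes the first test only as a source grep guard. One way such a guard could be written is sketched below; it swaps plain text matching for an ast walk, and the function name opens_missing_encoding is hypothetical, so it should not be read as the test that was actually added.

```python
import ast


def opens_missing_encoding(source):
    """Return line numbers of text-mode open() calls that lack encoding=."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        is_open_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "open"
        )
        if not is_open_call:
            continue

        # Work out the mode: second positional argument or mode= keyword.
        mode = "r"
        if len(node.args) >= 2 and isinstance(node.args[1], ast.Constant):
            mode = node.args[1].value
        for kw in node.keywords:
            if kw.arg == "mode" and isinstance(kw.value, ast.Constant):
                mode = kw.value.value

        has_encoding = any(kw.arg == "encoding" for kw in node.keywords)
        # Binary mode takes no encoding, so only text-mode opens are flagged.
        if "b" not in str(mode) and not has_encoding:
            missing.append(node.lineno)
    return sorted(missing)


before = 'with open(p, "w") as f:\n    f.write(src)\n'
after = 'with open(p, "w", encoding="utf-8") as f:\n    f.write(src)\n'
binary = 'with open(p, "rb") as f:\n    data = f.read()\n'
```

A guard of this kind fails as soon as someone adds a new text-mode open() without an encoding, which catches the regression on any platform instead of only on hosts with a non-UTF-8 locale.
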
24/25 tests pass on Windows with Python 3.11. The negative control is
skipped because this Python build handles em-dashes via a cp1252
subset; the fix is still correct, but the corruption path isn't always
triggerable.
parent 3b9cd58208
commit da184439db
2 changed files with 179 additions and 6 deletions
@@ -390,9 +390,12 @@ def _call(tool_name, args):
     req_file = os.path.join(_RPC_DIR, f"req_{seq_str}")
     res_file = os.path.join(_RPC_DIR, f"res_{seq_str}")
 
-    # Write request atomically (write to .tmp, then rename)
+    # Write request atomically (write to .tmp, then rename).
+    # encoding="utf-8" is critical: on Windows-hosted remote backends
+    # (or any non-UTF-8 locale) the default open() mode would mangle
+    # non-ASCII chars in tool args when encoding them as JSON.
     tmp = req_file + ".tmp"
-    with open(tmp, "w") as f:
+    with open(tmp, "w", encoding="utf-8") as f:
         json.dump({"tool": tool_name, "args": args, "seq": seq}, f)
     os.rename(tmp, req_file)
 
@@ -405,7 +408,7 @@ def _call(tool_name, args):
         time.sleep(poll_interval)
         poll_interval = min(poll_interval * 1.2, 0.25) # Back off to 250ms
 
-    with open(res_file) as f:
+    with open(res_file, encoding="utf-8") as f:
         raw = f.read()
 
     # Clean up response file
@@ -1111,15 +1114,22 @@ def execute_code(
     server_sock = None
 
     try:
-        # Write the auto-generated hermes_tools module
+        # Write the auto-generated hermes_tools module.
+        # encoding="utf-8" is required on Windows — the stub and user code
+        # both contain non-ASCII characters (em-dashes in docstrings, plus
+        # whatever the user script carries). Python's default open() uses
+        # the system locale on Windows (cp1252 typically), which corrupts
+        # those bytes; the child then fails to import with a SyntaxError
+        # ("'utf-8' codec can't decode byte 0x97 in position ...") because
+        # Python source files are decoded as UTF-8 by default (PEP 3120).
         # sandbox_tools is already the correct set (intersection with session
         # tools, or SANDBOX_ALLOWED_TOOLS as fallback — see lines above).
         tools_src = generate_hermes_tools_module(list(sandbox_tools))
-        with open(os.path.join(tmpdir, "hermes_tools.py"), "w") as f:
+        with open(os.path.join(tmpdir, "hermes_tools.py"), "w", encoding="utf-8") as f:
             f.write(tools_src)
 
         # Write the user's script
-        with open(os.path.join(tmpdir, "script.py"), "w") as f:
+        with open(os.path.join(tmpdir, "script.py"), "w", encoding="utf-8") as f:
             f.write(code)
 
         # --- Start RPC server ---
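
The request/response file handling touched by the first two hunks can be approximated as follows. This is a sketch under stated assumptions, not the stub's actual code: write_json_atomic and read_json are hypothetical helpers, the "terminal" tool name and its arguments are invented, os.replace is used instead of os.rename because it also overwrites an existing destination on Windows, and ensure_ascii=False is set only so that non-ASCII bytes really reach the file (the diff calls json.dump with its defaults).

```python
import json
import os
import tempfile


def write_json_atomic(path, payload):
    """Write JSON to path via a temporary file, always encoded as UTF-8."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(payload, f, ensure_ascii=False)
    # Readers polling for `path` never observe a half-written file.
    os.replace(tmp, path)


def read_json(path):
    """Read JSON from path, decoding as UTF-8 regardless of locale."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as rpc_dir:
    req_file = os.path.join(rpc_dir, "req_000001")
    request = {
        "tool": "terminal",  # hypothetical tool name
        "args": {"command": "echo caf\u00e9 \u2014 ok"},
        "seq": 1,
    }
    write_json_atomic(req_file, request)
    roundtrip = read_json(req_file)
    tmp_left_behind = os.path.exists(req_file + ".tmp")
```

Both ends of a file transport need to agree on the encoding, so the read side specifies UTF-8 as well as the write side.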