hermes_bootstrap: Windows-only UTF-8 stdio shim for all entry points

Codebase-wide fix for Python-on-Windows UTF-8 footguns, complementing
the earlier execute_code sandbox fixes (which remain load-bearing for
when the sandbox explicitly scrubs child env).

Problem: Python on Windows has two long-standing text-encoding pitfalls:

  1. sys.stdout/stderr are bound to the console code page (cp1252 on
     US-locale installs) — print('café') crashes with UnicodeEncodeError.
  2. Subprocess children don't know to use UTF-8 unless PYTHONUTF8 and/or
     PYTHONIOENCODING are set in their env — so any Python we spawn
     (linters, sandbox children, delegation workers) hits the same bug.

Solution: A tiny bootstrap module (hermes_bootstrap.py) imported as the
first statement of every Hermes entry point:

  - hermes_cli/main.py   (hermes / hermes-agent console_script)
  - run_agent.py         (hermes-agent direct)
  - acp_adapter/entry.py (hermes-acp)
  - gateway/run.py       (messaging gateway)
  - batch_runner.py      (parallel batch mode)
  - cli.py               (legacy direct-launch CLI)

On Windows, the bootstrap:
  - os.environ.setdefault('PYTHONUTF8', '1')       (PEP 540 UTF-8 mode)
  - os.environ.setdefault('PYTHONIOENCODING', 'utf-8')
  - sys.stdout/stderr/stdin.reconfigure(encoding='utf-8', errors='replace')

Children inherit the env vars → they run in UTF-8 mode.
Current process's stdio is reconfigured → print('café') works now.

On POSIX (Linux/macOS), the bootstrap is a complete no-op.  We don't
touch LANG, LC_*, or anything else — users who have intentionally
configured a non-UTF-8 locale aren't affected.  POSIX systems are
already UTF-8 by default in 99% of modern setups, so there's nothing
to fix.

setdefault() (not overwrite) means users who explicitly set PYTHONUTF8=0
or PYTHONIOENCODING=cp1252 in their environment are respected.

What this does NOT fix: bare open(path, 'w') calls in the *parent*
process still default to locale encoding because PYTHONUTF8 is only
read at interpreter init.  A ruff PLW1514 sweep (separate follow-up)
will add explicit encoding='utf-8' at those ~219 call sites for
belt-and-suspenders.

Tests (17): 16 passed, 1 skipped on Windows.
  - Windows: env vars set, stdio reconfigured, child inherits UTF-8 mode
  - POSIX: complete no-op (verified on fake POSIX + skipped on real
    POSIX since we don't have a Linux box in this session)
  - Idempotence: multiple calls safe
  - Graceful degradation: non-reconfigurable streams don't crash
  - User opt-out: explicit PYTHONUTF8=0 is respected
  - Load order: every entry point's FIRST top-level import is
    hermes_bootstrap, enforced by an AST-level parametrized test

pyproject.toml: added hermes_bootstrap to py-modules so it ships with
pip installs.
This commit is contained in:
Teknium 2026-05-07 19:09:40 -07:00
parent bf43f6cfdd
commit 6098272454
9 changed files with 452 additions and 2 deletions

View file

@ -0,0 +1,297 @@
"""Tests for hermes_bootstrap — Windows UTF-8 stdio shim.
The bootstrap module is imported at the top of every Hermes entry point
(hermes, hermes-agent, hermes-acp, gateway, batch_runner, cli.py). It
fixes Python's Windows UTF-8 defaults so print("café") doesn't crash and
subprocess children inherit UTF-8 mode.
Key invariants covered by these tests:
1. Windows: env vars get set, stdio reconfigured, non-ASCII print works
2. POSIX: complete no-op (we don't touch LANG/LC_* or anything else)
3. Idempotent: safe to call multiple times
4. Respects user opt-out: if the user explicitly sets PYTHONUTF8=0 or
PYTHONIOENCODING=something-else, we leave those alone
5. Load order: every Hermes entry point imports hermes_bootstrap as its
first non-docstring import (before anything that might do file I/O
or print to stdout)
"""
from __future__ import annotations
import io
import os
import subprocess
import sys
import textwrap
import unittest.mock as mock
import pytest
# Import the module under test via an import-time side-effect check path.
# We need to be able to reset its state between tests, so we import it
# fresh in each test that manipulates _IS_WINDOWS.
def _fresh_import():
"""Return a freshly-imported hermes_bootstrap module.
Drops any cached copy from sys.modules first so module-level code
runs again and the platform check re-evaluates.
"""
sys.modules.pop("hermes_bootstrap", None)
import hermes_bootstrap # noqa: WPS433
return hermes_bootstrap
class TestWindowsBehavior:
"""Windows: the bootstrap does its job."""
@pytest.mark.skipif(
sys.platform != "win32",
reason="Windows-specific behavior",
)
def test_env_vars_set_on_windows(self, monkeypatch):
# Clear any pre-existing values and re-run bootstrap.
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
hb = _fresh_import()
# Module-level apply_windows_utf8_bootstrap() ran during import.
assert os.environ.get("PYTHONUTF8") == "1"
assert os.environ.get("PYTHONIOENCODING") == "utf-8"
assert hb._bootstrap_applied is True
@pytest.mark.skipif(
sys.platform != "win32",
reason="Windows-specific behavior",
)
def test_stdout_reconfigured_to_utf8_on_windows(self):
# The live process's stdout should now be UTF-8 (the Hermes CLI
# runs on Windows with a pytest console that's cp1252 by default).
# If reconfigure succeeded, sys.stdout.encoding is 'utf-8'.
_fresh_import()
# pytest may capture stdout, which makes encoding check flaky —
# so instead verify the reconfigure call succeeded on the real
# stream by attempting the failure case.
out = sys.stdout
reconfigure = getattr(out, "reconfigure", None)
if reconfigure is None:
pytest.skip("pytest replaced sys.stdout with a non-reconfigurable stream")
# After bootstrap, encoding should be utf-8 (or the reconfigure
# skipped because pytest's capture already set it to utf-8).
assert out.encoding.lower() in {"utf-8", "utf8"}, (
f"stdout encoding is {out.encoding!r} — bootstrap should have "
"reconfigured it to UTF-8"
)
@pytest.mark.skipif(
sys.platform != "win32",
reason="Windows-specific behavior",
)
def test_child_process_inherits_utf8_mode(self):
"""A subprocess spawned from this process should inherit
PYTHONUTF8=1 and be able to print non-ASCII to stdout."""
_fresh_import()
# Non-ASCII chars that would crash under cp1252: arrow, emoji.
script = textwrap.dedent("""
import sys
print("em-dash \\u2014 arrow \\u2192 emoji \\U0001f680")
sys.exit(0)
""").strip()
# Don't pass env= — let the child inherit os.environ, which
# now contains PYTHONUTF8=1 courtesy of the bootstrap.
result = subprocess.run(
[sys.executable, "-c", script],
capture_output=True,
timeout=15,
)
assert result.returncode == 0, (
f"Child crashed printing non-ASCII despite UTF-8 bootstrap:\n"
f" stdout: {result.stdout!r}\n"
f" stderr: {result.stderr!r}"
)
decoded = result.stdout.decode("utf-8")
assert "\u2014" in decoded
assert "\u2192" in decoded
assert "\U0001f680" in decoded
class TestUserOptOut:
"""If the user has explicitly set PYTHONUTF8 / PYTHONIOENCODING in
their environment, we respect that (setdefault, not overwrite)."""
@pytest.mark.skipif(
sys.platform != "win32",
reason="Only meaningful on Windows where we'd otherwise set these",
)
def test_user_pythonutf8_zero_preserved(self, monkeypatch):
monkeypatch.setenv("PYTHONUTF8", "0")
_fresh_import()
assert os.environ["PYTHONUTF8"] == "0", (
"bootstrap must not overwrite an explicit user setting"
)
@pytest.mark.skipif(
sys.platform != "win32",
reason="Only meaningful on Windows where we'd otherwise set these",
)
def test_user_pythonioencoding_preserved(self, monkeypatch):
monkeypatch.setenv("PYTHONIOENCODING", "latin-1")
_fresh_import()
assert os.environ["PYTHONIOENCODING"] == "latin-1"
class TestPosixNoOp:
"""POSIX: zero behavior change. We don't touch LANG, LC_*, or any
stdio. The goal is that Linux/macOS behave identically before and
after this module is imported."""
def test_noop_on_fake_posix(self, monkeypatch):
"""Even when imported, the bootstrap function must return False
and leave env untouched when _IS_WINDOWS is False."""
hb = _fresh_import()
# Reset + fake POSIX
hb._IS_WINDOWS = False
hb._bootstrap_applied = False
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
result = hb.apply_windows_utf8_bootstrap()
assert result is False
assert "PYTHONUTF8" not in os.environ
assert "PYTHONIOENCODING" not in os.environ
assert hb._bootstrap_applied is False
@pytest.mark.skipif(
sys.platform == "win32",
reason="Real POSIX required for this check",
)
def test_real_posix_bootstrap_is_noop(self, monkeypatch):
"""On actual Linux/macOS, importing the module must not set
PYTHONUTF8 or reconfigure stdio."""
monkeypatch.delenv("PYTHONUTF8", raising=False)
monkeypatch.delenv("PYTHONIOENCODING", raising=False)
hb = _fresh_import()
assert hb._bootstrap_applied is False
assert "PYTHONUTF8" not in os.environ
assert "PYTHONIOENCODING" not in os.environ
class TestIdempotence:
"""Calling apply_windows_utf8_bootstrap() multiple times must be safe."""
def test_second_call_returns_false(self):
hb = _fresh_import()
# First call already happened at import time.
result = hb.apply_windows_utf8_bootstrap()
assert result is False, (
"Second call should return False (idempotent no-op)"
)
def test_no_exceptions_on_repeated_calls(self):
hb = _fresh_import()
for _ in range(5):
hb.apply_windows_utf8_bootstrap()
class TestStdioReconfigureErrorHandling:
"""If sys.stdout/stderr/stdin have been replaced with streams that
don't support reconfigure (e.g. by a test harness), the bootstrap
must degrade gracefully rather than crash."""
def test_non_reconfigurable_stream_does_not_crash(self, monkeypatch):
"""Replace sys.stdout with a BytesIO (no reconfigure method),
then run the bootstrap and make sure it doesn't raise."""
hb = _fresh_import()
hb._IS_WINDOWS = True
hb._bootstrap_applied = False
fake = io.BytesIO() # no .reconfigure attribute
monkeypatch.setattr(sys, "stdout", fake)
try:
# Must not raise.
hb.apply_windows_utf8_bootstrap()
except Exception as exc:
pytest.fail(f"bootstrap raised on non-reconfigurable stdout: {exc}")
def test_reconfigure_oserror_is_caught(self, monkeypatch):
"""If reconfigure() itself raises (closed stream, etc.), swallow
the error the env-var half of the fix still applies."""
hb = _fresh_import()
hb._IS_WINDOWS = True
hb._bootstrap_applied = False
class _BrokenStream:
encoding = "utf-8"
def reconfigure(self, **kwargs):
raise OSError("simulated: stream already closed")
monkeypatch.setattr(sys, "stdout", _BrokenStream())
monkeypatch.setattr(sys, "stderr", _BrokenStream())
# Must not raise.
hb.apply_windows_utf8_bootstrap()
class TestEntryPointsImportBootstrap:
"""Every Hermes entry point must import hermes_bootstrap as its
first non-docstring import. We check this by scanning source files
rather than invoking the entry points (which would require a full
agent context)."""
# Entry points that invoke Hermes as a process. Each one must
# import hermes_bootstrap before doing any file I/O or stdout writes.
ENTRY_POINTS = [
"hermes_cli/main.py", # hermes CLI (console_script)
"run_agent.py", # hermes-agent (console_script)
"acp_adapter/entry.py", # hermes-acp (console_script)
"gateway/run.py", # gateway
"batch_runner.py", # batch mode
"cli.py", # legacy direct-launch CLI
]
@pytest.mark.parametrize("path", ENTRY_POINTS)
def test_entry_point_imports_bootstrap(self, path):
"""The file must contain 'import hermes_bootstrap' and that
line must appear before the first 'import' of anything else.
We're lenient about the docstring (can be arbitrarily long) and
about comment lines just need to verify the first import
statement is the bootstrap.
"""
# Resolve relative to the hermes-agent repo root. Tests live
# at tests/test_hermes_bootstrap.py, so go up one dir.
import pathlib
here = pathlib.Path(__file__).resolve()
repo_root = here.parent.parent # tests/ -> repo root
full_path = repo_root / path
assert full_path.exists(), f"entry point missing: {full_path}"
source = full_path.read_text(encoding="utf-8")
# Find the first non-comment, non-blank line that starts with
# 'import ' or 'from '. It must be 'import hermes_bootstrap'.
import tokenize
import ast
tree = ast.parse(source)
first_import_node = None
for node in ast.iter_child_nodes(tree):
if isinstance(node, (ast.Import, ast.ImportFrom)):
first_import_node = node
break
assert first_import_node is not None, (
f"{path}: no top-level imports found at all"
)
if isinstance(first_import_node, ast.Import):
first_import_name = first_import_node.names[0].name
else: # ImportFrom
first_import_name = first_import_node.module or ""
assert first_import_name == "hermes_bootstrap", (
f"{path}: first top-level import is {first_import_name!r}, "
f"but it must be 'hermes_bootstrap' so UTF-8 stdio is "
f"configured before anything else initializes. Move the "
f"'import hermes_bootstrap' line to be the first import."
)