hermes-agent/tools/environments
Teknium f336ae3d7d fix(environments): use incremental UTF-8 decoder in select-based drain
The first draft of the fix called `chunk.decode("utf-8")` directly on
each 4096-byte `os.read()` result, which corrupts output whenever a
multi-byte UTF-8 character straddles a read boundary:

  * `UnicodeDecodeError` fires on the valid-but-truncated byte sequence.
  * The except handler clears ALL previously-decoded output and replaces
    the whole buffer with `[binary output detected ...]`.

Empirically: 10000 '日' chars (30001 bytes) through the wrapper loses
all 10000 characters on the first draft; the baseline TextIOWrapper
drain (which uses `encoding='utf-8', errors='replace'` on Popen)
preserves them all. This regression affects any command emitting
non-ASCII output larger than one chunk — CJK/Arabic/emoji in
`npm install`, `pip install`, `docker logs`, `kubectl logs`, etc.

Fix: swap to `codecs.getincrementaldecoder('utf-8')(errors='replace')`,
which buffers partial multi-byte sequences across chunks and substitutes
U+FFFD for genuinely invalid bytes. Flush on drain exit via
`decoder.decode(b'', final=True)` to emit any trailing replacement
character for a dangling partial sequence.

Adds two regression tests:
  * test_utf8_multibyte_across_read_boundary — 10000 U+65E5 chars,
    verifies count round-trips and no fallback fires.
  * test_invalid_utf8_uses_replacement_not_fallback — deliberate
    \xff\xfe between valid ASCII, verifies surrounding text survives.
2026-04-19 11:27:50 -07:00
..
__init__.py feat(environments): add Daytona cloud sandbox backend 2026-03-05 10:02:21 -08:00
base.py fix(environments): use incremental UTF-8 decoder in select-based drain 2026-04-19 11:27:50 -07:00
daytona.py fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards 2026-04-16 19:39:21 -07:00
docker.py feat: entry-level Podman support — find_docker() + rootless entrypoint (#10066) 2026-04-14 21:20:37 -07:00
file_sync.py fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards 2026-04-16 19:39:21 -07:00
local.py fix: per-profile subprocess HOME isolation (#4426) (#7357) 2026-04-10 13:37:45 -07:00
managed_modal.py feat(environments): unified spawn-per-call execution layer 2026-04-08 17:23:15 -07:00
modal.py feat(file-sync): sync remote changes back to host on teardown 2026-04-16 19:39:21 -07:00
modal_utils.py fix: follow-up for salvaged PR #10854 2026-04-16 06:42:45 -07:00
singularity.py feat(environments): unified spawn-per-call execution layer 2026-04-08 17:23:15 -07:00
ssh.py feat(file-sync): sync remote changes back to host on teardown 2026-04-16 19:39:21 -07:00