fix: handle UnicodeEncodeError with ASCII codec (#6843)

Broaden the UnicodeEncodeError recovery to handle systems with ASCII-only
locale (LANG=C, Chromebooks) where ANY non-ASCII character causes encoding
failure, not just lone surrogates.

Changes:
- Add _strip_non_ascii() and _sanitize_messages_non_ascii() helpers that
  strip all non-ASCII characters from message content, name, and tool_calls
- Update the UnicodeEncodeError handler to detect ASCII codec errors and
  fall back to non-ASCII sanitization after surrogate check fails
- Sanitize tool_calls arguments and name fields (not just content)
- Fix bare .encode() in cli.py suspend handler to use explicit utf-8
- Add comprehensive test suite (17 tests)
This commit is contained in:
Hermes Audit 2026-04-09 23:21:42 +00:00 committed by Teknium
parent 7e28b7b5d5
commit 71036a7a75
3 changed files with 205 additions and 11 deletions

2
cli.py
View file

@ -7999,7 +7999,7 @@ class HermesCLI:
agent_name = get_active_skin().get_branding("agent_name", "Hermes Agent")
msg = f"\n{agent_name} has been suspended. Run `fg` to bring {agent_name} back."
def _suspend():
os.write(1, msg.encode())
os.write(1, msg.encode("utf-8", errors="replace"))
os.kill(0, _sig.SIGTSTP)
run_in_terminal(_suspend)