hermes-agent/tests/cli
Teknium 2ff1ef6ae6
fix(surrogates): sanitize reasoning/reasoning_content/reasoning_details fields (#11628)
Byte-level reasoning models (xiaomi/mimo-v2-pro, kimi, glm) can emit lone
surrogates in reasoning output. The proactive sanitizer walked content/
name/tool_calls but not extra fields like reasoning or the nested
reasoning_details array. Surrogates in those fields survived the
proactive pass, crashed json.dumps() in the OpenAI SDK, and the recovery
block's _sanitize_messages_surrogates(messages) call also didn't check
those fields — so 'found' was False, no retry happened, and after 3
attempts the user saw:

  API call failed after 3 retries. 'utf-8' codec can't encode characters
  in position N-M: surrogates not allowed

Changes:
- _sanitize_messages_surrogates: walk any extra string fields (reasoning,
  reasoning_content, etc.) and recurse into nested dict/list values
  (reasoning_details). Mirrors _sanitize_messages_non_ascii coverage
  added in PR #10537.
- _sanitize_structure_surrogates: new recursive walker, mirror of
  _sanitize_structure_non_ascii but for surrogate recovery.
- UnicodeEncodeError recovery block: also sanitize api_messages,
  api_kwargs, and prefill_messages (not just the canonical messages
  list — the API-copy carries reasoning_content transformed from
  reasoning and that's what the SDK actually serializes). Always
  retry on detected surrogate errors, not only when we found
  something to strip — gate on error type per PR #10537's pattern.

Tests: extended tests/cli/test_surrogate_sanitization.py with
coverage for reasoning, reasoning_content, reasoning_details (flat
and deeply nested), structure walker, and an integration case that
reproduces the exact api_messages shape that was crashing.
2026-04-17 13:30:47 -07:00
..
__init__.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_branch_command.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_approval_ui.py fix(cli): stop approval panel from clipping approve/deny off-screen (#11260) 2026-04-16 16:36:07 -07:00
test_cli_background_tui_refresh.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_browser_connect.py fix: /browser connect auto-launch uses dedicated profile dir (#6821) 2026-04-09 14:55:45 -07:00
test_cli_context_warning.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_extension_hooks.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_file_drop.py fix(termux): add local image chat route 2026-04-09 16:24:53 -07:00
test_cli_image_command.py fix(termux): harden execute_code and mobile browser/audio UX 2026-04-09 16:24:53 -07:00
test_cli_init.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_interrupt_subagent.py fix: resolve CI test failures — add missing functions, fix stale tests (#9483) 2026-04-14 01:43:45 -07:00
test_cli_loading_indicator.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_mcp_config_watch.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_new_session.py fix(tests): resolve remaining CI failures — commit_memory_session, already_sent, timezone leak, session env (#10785) 2026-04-16 02:26:14 -07:00
test_cli_plan_command.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_prefix_matching.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_preloaded_skills.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_provider_resolution.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_cli_retry.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_save_config_value.py fix: remove legacy compression.summary_* config and env var fallbacks (#8992) 2026-04-13 04:59:26 -07:00
test_cli_secret_capture.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_cli_skin_integration.py fix(termux): deepen browser, voice, and tui support 2026-04-09 16:24:53 -07:00
test_cli_status_bar.py fix(termux): tighten voice setup and mobile chat UX 2026-04-09 16:24:53 -07:00
test_cli_status_command.py fix(profile): use existing get_active_profile_name() for /profile command 2026-04-15 17:52:03 -07:00
test_cli_tools_command.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_compress_focus.py feat: /compress <focus> — guided compression with focus topic (#8017) 2026-04-11 19:23:29 -07:00
test_cwd_env_respect.py fix: enforce config.yaml as sole CWD source + deprecate .env CWD vars + add hermes memory reset (#11029) 2026-04-16 06:48:33 -07:00
test_fast_command.py fix(anthropic): send fast mode speed via extra_body 2026-04-13 22:32:39 -07:00
test_manual_compress.py fix(gateway): make manual compression feedback truthful 2026-04-10 21:16:53 -07:00
test_personality_none.py fix(gateway): use profile-aware Hermes paths in runtime hints 2026-04-15 17:52:03 -07:00
test_quick_commands.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00
test_reasoning_command.py fix: clean up stale test references to removed attributes 2026-04-10 03:44:43 -07:00
test_resume_display.py fix: show full last assistant response when resuming a session (#8724) 2026-04-12 19:07:14 -07:00
test_session_boundary_hooks.py fix: add gateway coverage for session boundary hooks, move test to tests/cli/ 2026-04-08 04:27:34 -07:00
test_stream_delta_think_tag.py fix(streaming): prevent <think> in prose from suppressing response output 2026-04-09 22:16:36 -07:00
test_surrogate_sanitization.py fix(surrogates): sanitize reasoning/reasoning_content/reasoning_details fields (#11628) 2026-04-17 13:30:47 -07:00
test_tool_progress_scrollback.py fix(cli): restore stacked tool progress scrollback in TUI (#8201) 2026-04-11 23:22:34 -07:00
test_worktree.py fix: aggressive worktree and branch cleanup to prevent accumulation (#6134) 2026-04-08 04:44:49 -07:00
test_worktree_security.py refactor(tests): re-architect tests + fix CI failures (#5946) 2026-04-07 17:19:07 -07:00