hermes-agent/gateway
Teknium 45735e71a2
fix(telegram): use UTF-16 code units for message length splitting
Port from nearai/ironclaw#2304: Telegram's 4096 character limit is
measured in UTF-16 code units, not Unicode codepoints. Characters
outside the Basic Multilingual Plane (emoji like 😀, CJK Extension B,
musical symbols) are encoded as surrogate pairs: 1 Python char but 2
UTF-16 code units.

Previously, truncate_message() used Python's len(), which counts
codepoints. This could produce chunks exceeding Telegram's actual limit
when messages contain many astral-plane characters.
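
To make the mismatch concrete, here is a minimal sketch of how a UTF-16 length could be computed in Python. The name utf16_len comes from this commit, but the body below is an assumption for illustration, not necessarily the code that was merged.

```python
def utf16_len(text: str) -> int:
    """Count UTF-16 code units (illustrative body, not the merged code).

    Codepoints above U+FFFF lie outside the Basic Multilingual Plane and
    take two code units (a surrogate pair); everything else takes one.
    """
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text)


sample = "hi😀"
print(len(sample))        # 3 codepoints
print(utf16_len(sample))  # 4 UTF-16 code units
```

A message of 4096 emoji therefore passes a len()-based check while being twice the size Telegram counts.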

Changes:
- Add utf16_len() helper and _prefix_within_utf16_limit() for
  UTF-16-aware string measurement and truncation
- Add _custom_unit_to_cp() binary-search helper that maps a custom-unit
  budget to the largest safe codepoint slice position
- Update truncate_message() to accept optional len_fn parameter
- Telegram adapter now passes len_fn=utf16_len when splitting messages
- Fix fallback truncation in Telegram error handler to use
  _prefix_within_utf16_limit instead of codepoint slicing
- Update send_message_tool.py to use utf16_len for Telegram platform
- Add comprehensive tests: utf16_len, _prefix_within_utf16_limit,
  truncate_message with len_fn (emoji splitting, content preservation,
  code block handling)
- Update mock lambdas in reply_mode tests to accept **kw for len_fn
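
The binary-search idea behind the truncation helpers can be sketched as below. This is a hedged illustration, not the repository's code: largest_prefix_index is a hypothetical name, the utf16_len body is an assumption, and the search presumes len_fn never shrinks as the prefix grows.

```python
from typing import Callable


def utf16_len(text: str) -> int:
    # Illustrative body: astral codepoints cost two UTF-16 code units.
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text)


def largest_prefix_index(
    text: str, budget: int, len_fn: Callable[[str], int] = len
) -> int:
    """Return the largest i such that len_fn(text[:i]) <= budget.

    Hypothetical helper. Assumes len_fn is monotonic in prefix length,
    which holds for len and for a UTF-16 code-unit count.
    """
    lo, hi = 0, len(text)
    while lo < hi:
        mid = (lo + hi + 1) // 2  # bias upward so the loop terminates
        if len_fn(text[:mid]) <= budget:
            lo = mid              # prefix fits; try a longer one
        else:
            hi = mid - 1          # prefix too long; shrink
    return lo


text = "😀" * 5  # 5 codepoints, 10 UTF-16 code units
cut = largest_prefix_index(text, 5, utf16_len)
print(cut, utf16_len(text[:cut]))  # 2 4
```

Because Python slices on codepoints, text[:cut] never splits a surrogate pair; the budget of 5 yields two emoji (4 units) rather than an invalid half character.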
2026-04-12 17:43:05 -07:00
builtin_hooks refactor: replace inline HERMES_HOME re-implementations with get_hermes_home() 2026-04-07 10:40:34 -07:00
platforms fix(telegram): use UTF-16 code units for message length splitting 2026-04-12 17:43:05 -07:00
__init__.py Enhance CLI with multi-platform messaging integration and configuration management 2026-02-02 19:01:51 -08:00
channel_directory.py fix(gateway): derive channel directory platforms from enum instead of hardcoded list (#7450) 2026-04-10 17:27:32 -07:00
config.py feat(gateway): add WeCom callback-mode adapter for self-built apps 2026-04-11 15:22:49 -07:00
delivery.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
display_config.py feat: per-platform display verbosity configuration (#8006) 2026-04-11 17:20:34 -07:00
hooks.py feat: built-in boot-md hook — run BOOT.md on gateway startup (#3733) 2026-03-29 10:19:54 -07:00
mirror.py chore: remove ~100 unused imports across 55 files (#3016) 2026-03-25 15:02:03 -07:00
pairing.py fix: multiple platform adaptors concurrency 2026-04-06 16:49:54 -07:00
restart.py fix(gateway): address restart review feedback 2026-04-10 21:18:34 -07:00
run.py fix(weixin): streaming cursor, media uploads, markdown links, blank messages (#8665) 2026-04-12 16:43:25 -07:00
session.py fix(matrix): replace pickle crypto store with SQLite, fix E2EE decryption (#7981) 2026-04-12 07:24:46 +05:30
session_context.py fix(gateway): add HERMES_SESSION_KEY to session_context contextvars 2026-04-11 15:35:04 -07:00
status.py fix(discord): decouple readiness from slash sync 2026-04-11 19:22:14 -07:00
sticker_cache.py chore: remove ~100 unused imports across 55 files (#3016) 2026-03-25 15:02:03 -07:00
stream_consumer.py feat(gateway): surface natural mid-turn assistant messages in chat platforms 2026-04-11 16:21:39 -07:00