fix: recognize emoji and caret as natural response endings

GLM models via Ollama report finish_reason='stop' even when the
response was truncated by max_tokens. The continuation mechanism
uses _has_natural_response_ending() as one of the heuristics to
detect whether the response was genuinely finished.

Currently only ASCII punctuation and CJK punctuation are recognized.
This means any response ending with an emoji (e.g. 👍) or the
caret character ^ (as in the ^^ smiley common in French) is not
recognized as naturally ended. That triggers a false-positive
continuation: the model receives 'Continue where you left off' and
produces garbled output.

Add:
- ^ (caret) to the punctuation set
- Unicode emoji range (codepoint >= 0x1F300) as natural ending
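
As a rough illustration of the two additions above, the standalone
sketch below compares the check before and after the change. The names
has_natural_ending_old, has_natural_ending_new, FENCE and
ASCII_AND_CJK_ENDINGS are hypothetical; the real method is
_has_natural_response_ending() on AIAgent, and the rstrip() and
empty-string handling shown here are assumptions about the part of the
method that falls outside the diff.

```python
# Punctuation recognized before this change (copied from the diff).
ASCII_AND_CJK_ENDINGS = '.!?:)"\']}。!?:)】」』》'
# Markdown code-fence marker, built indirectly to keep this example fence-safe.
FENCE = "`" * 3


def has_natural_ending_old(text: str) -> bool:
    """Sketch of the check before the fix: punctuation and code fences only."""
    stripped = text.rstrip()  # assumption: trailing whitespace is ignored
    if not stripped:
        return False
    if stripped.endswith(FENCE):
        return True
    return stripped[-1] in ASCII_AND_CJK_ENDINGS


def has_natural_ending_new(text: str) -> bool:
    """Sketch of the check after the fix: also accepts ^ and emoji."""
    stripped = text.rstrip()  # assumption: trailing whitespace is ignored
    if not stripped:
        return False
    if stripped.endswith(FENCE):
        return True
    last = stripped[-1]
    if last in ASCII_AND_CJK_ENDINGS + "^":
        return True
    # Code points at or above U+1F300 are treated as emoji.
    return ord(last) >= 0x1F300
```

With this sketch, "Done 👍" and "merci ^^" are rejected by the old
check and accepted by the new one, while a cut-off fragment such as
"and then the" is rejected by both. Note that under the stated
threshold a symbol below U+1F300, such as ❤ (U+2764), would still not
count as a natural ending.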

This only affects GLM/Ollama users, but the fix is safe for all
backends since _has_natural_response_ending() is only consulted
inside the continuation flow.

oseftg 2026-05-18 19:37:34 -07:00 committed by Teknium
parent 6d495d9e7c
commit 700f3b13e7

@@ -1019,7 +1019,15 @@ class AIAgent:
             return False
         if stripped.endswith("```"):
             return True
-        return stripped[-1] in '.!?:)"\']}。!?:)】」』》'
+        if stripped.endswith('^'):
+            return True
+        last = stripped[-1]
+        if last in '.!?:)"\']}。!?:)】」』》^':
+            return True
+        # Emoji ranges (Misc Symbols, Dingbats, Emoticons, Supplemental, etc.)
+        if ord(last) >= 0x1F300:
+            return True
+        return False
 
     def _is_ollama_glm_backend(self) -> bool:
         """Detect the narrow backend family affected by Ollama/GLM stop misreports."""