hermes-agent/agent
emozilla bda9aa17cb fix(streaming): prevent <think> in prose from suppressing response output
When the model mentions <think> as literal text in its response (e.g.
"(/think not producing <think> tags)"), the streaming display treated it
as a reasoning block opener and suppressed everything after it. The
response box would close with truncated content and no error — the API
response was complete but the display ate it.

Root cause: _stream_delta() matched <think> anywhere in the text stream
regardless of position. Real reasoning blocks always start at the
beginning of a line; mentions in prose appear mid-sentence.

Fix: track line position across streaming deltas with a
_stream_last_was_newline flag. Only enter reasoning suppression when
the tag appears at a block boundary (start of stream, after a newline,
or after only whitespace on the current line). Add a _flush_stream()
safety net that recovers buffered content if no closing tag is found
by end-of-stream.

Also fixes three related issues discovered during investigation:

- anthropic_adapter: _get_anthropic_max_output() now normalizes dots to
  hyphens so 'claude-opus-4.6' matches the 'claude-opus-4-6' table key
  (was returning 32K instead of 128K)

- run_agent: send explicit max_tokens for Claude models on Nous Portal,
  same as OpenRouter — both proxy to Anthropic's API which requires it.
  Without it the backend defaults to a low limit that truncates responses.

- run_agent: reset truncated_tool_call_retries after successful tool
  execution so a single truncation doesn't poison the entire conversation.
2026-04-09 22:16:36 -07:00
..
__init__.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
anthropic_adapter.py fix(streaming): prevent <think> in prose from suppressing response output 2026-04-09 22:16:36 -07:00
auxiliary_client.py fix: allow custom endpoint users to use main model for auxiliary tasks 2026-04-09 13:23:56 -07:00
builtin_memory_provider.py refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites 2026-04-07 13:36:38 -07:00
context_compressor.py fix: insert static fallback when compression summary fails 2026-04-09 14:28:56 -07:00
context_references.py refactor: replace inline HERMES_HOME re-implementations with get_hermes_home() 2026-04-07 10:40:34 -07:00
copilot_acp_client.py fix: bridge tool-calls in copilot-acp adapter 2026-04-06 01:47:57 -07:00
credential_pool.py fix: add auth.json write-back for Codex retry and valid-token early-return paths 2026-04-09 21:48:50 -07:00
display.py refactor: remove 24 confirmed dead functions — 432 lines of unused code 2026-04-07 11:41:26 -07:00
error_classifier.py fix(error_classifier): disambiguate usage-limit patterns in _classify_by_message 2026-04-09 16:24:13 -07:00
insights.py fix(insights): show cache tokens in overview so total adds up (#4428) 2026-04-01 03:06:47 -07:00
memory_manager.py refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites 2026-04-07 13:36:38 -07:00
memory_provider.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
model_metadata.py fix(compaction): don't halve context_length on output-cap-too-large errors 2026-04-09 11:27:41 -07:00
models_dev.py feat(qwen): add Qwen OAuth provider with portal request support 2026-04-08 13:46:30 -07:00
prompt_builder.py feat(gateway): add BlueBubbles iMessage platform adapter (#6437) 2026-04-08 23:54:03 -07:00
prompt_caching.py fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter 2026-03-21 16:54:43 -07:00
rate_limit_tracker.py feat: capture provider rate limit headers and show in /usage (#6541) 2026-04-09 03:43:14 -07:00
redact.py fix: mem0 API v2 compat, prefetch context fencing, secret redaction (#5423) 2026-04-05 22:43:33 -07:00
retry_utils.py feat(agent): add jittered retry backoff 2026-04-08 00:41:36 -07:00
skill_commands.py feat(skills): add skill config interface + llm-wiki skill (#5635) 2026-04-06 13:49:13 -07:00
skill_utils.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
smart_model_routing.py Merge branch 'main' into rewbs/tool-use-charge-to-subscription 2026-04-02 11:00:35 +11:00
subdirectory_hints.py fix(agent): catch PermissionError in subdirectory hint discovery 2026-04-09 03:10:30 -07:00
title_generator.py feat(agent): configurable timeouts for auxiliary LLM calls via config.yaml (#3597) 2026-03-28 14:35:28 -07:00
trajectory.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
usage_pricing.py fix: status bar shows 26K instead of 260K for token counts with trailing zeros (#3024) 2026-03-25 12:45:58 -07:00