mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-08 03:01:47 +00:00
fix(goals): auto-pause when judge model returns unparseable output
Weak judge models (e.g. deepseek-v4-flash) return empty strings or prose
when asked for the strict {done, reason} JSON verdict. The old code
failed-open to continue on every such turn, burning the entire turn
budget with log lines like
judge returned empty response
judge reply was not JSON: "Let me analyze whether the goal..."
and /goal clear could not stop it mid-loop without /stop.
After N=3 consecutive *parse* failures (transport/API errors don't
count — those are transient), the loop auto-pauses and prints:
⏸ Goal paused — the judge model (3 turns) isn't returning the
required JSON verdict. Route the judge to a stricter model in
~/.hermes/config.yaml:
auxiliary:
goal_judge:
provider: openrouter
model: google/gemini-3-flash-preview
Then /goal resume to continue.
The counter resets on any usable reply (both "done"/"continue" and
API errors) and persists across GoalManager reloads so cross-session
resumes carry the correct state.
Also fixes test_goal_verdict_send.py sharing a hardcoded session_id
across tests — the shared id only worked because the previous
_post_turn_goal_continuation was a never-awaited coroutine. Now that
PR #19160 made it properly awaited, the xdist test-leakage bug
surfaced. Each test gets a unique session_id via uuid suffix.
This commit is contained in:
parent
03ddff8897
commit
307c85e5c1
4 changed files with 270 additions and 49 deletions
|
|
@ -58,6 +58,7 @@ AUTHOR_MAP = {
|
|||
"223003280+Abd0r@users.noreply.github.com": "Abd0r",
|
||||
"abdielv@proton.me": "AJV20",
|
||||
"mason@growagainorchids.com": "masonjames",
|
||||
"ytchen0719@gmail.com": "liquidchen",
|
||||
"am@studio1.tailb672fe.ts.net": "subtract0",
|
||||
"axmaiqiu@gmail.com": "qWaitCrypto",
|
||||
"159539633+MottledShadow@users.noreply.github.com": "MottledShadow",
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue