fix(goals): auto-pause when judge model returns unparseable output

Weak judge models (e.g. deepseek-v4-flash) return empty strings or prose
when asked for the strict {done, reason} JSON verdict. The old code
failed open and continued on every such turn, burning the entire turn
budget with log lines like

  judge returned empty response
  judge reply was not JSON: "Let me analyze whether the goal..."

and /goal clear could not stop it mid-loop without /stop.

After N=3 consecutive *parse* failures (transport/API errors don't
count — those are transient), the loop auto-pauses and prints:

  ⏸ Goal paused — the judge model (3 turns) isn't returning the
  required JSON verdict. Route the judge to a stricter model in
  ~/.hermes/config.yaml:
    auxiliary:
      goal_judge:
        provider: openrouter
        model: google/gemini-3-flash-preview
  Then /goal resume to continue.

The counter resets on any turn that is not a parse failure (a usable
"done"/"continue" verdict, or an API error) and persists across
GoalManager reloads so cross-session resumes carry the correct state.

Also fixes test_goal_verdict_send.py sharing a hardcoded session_id
across tests — the shared id only worked because the previous
_post_turn_goal_continuation was a never-awaited coroutine. Now that
PR #19160 made it properly awaited, the xdist test-leakage bug
surfaced. Each test gets a unique session_id via uuid suffix.
Teknium 2026-05-07 17:19:47 -07:00
parent 03ddff8897
commit 307c85e5c1
4 changed files with 270 additions and 49 deletions

@@ -58,6 +58,7 @@ AUTHOR_MAP = {
"223003280+Abd0r@users.noreply.github.com": "Abd0r",
"abdielv@proton.me": "AJV20",
"mason@growagainorchids.com": "masonjames",
"ytchen0719@gmail.com": "liquidchen",
"am@studio1.tailb672fe.ts.net": "subtract0",
"axmaiqiu@gmail.com": "qWaitCrypto",
"159539633+MottledShadow@users.noreply.github.com": "MottledShadow",