mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-24 10:52:21 +00:00
feat(goals): /goal wait <pid> — park the loop on a background process (#50503)
* feat(goals): add /goal wait <pid> barrier to park the loop on a background process
The /goal loop re-pokes the agent every turn via the post-turn judge. When a
goal is gated on a long-running background process (CI poller, build, test
matrix, deploy) that produces nothing to judge yet, this spins the agent into
'is it done?' busy-work and burns the turn budget.
/goal wait <pid> [reason] parks the loop: while the PID is alive, the judge is
skipped, no turn is consumed, no continuation fires, and /goal status shows a
parked indicator. The barrier auto-clears the moment the process exits (the
agent's notify_on_complete watcher is the natural wake signal), then the next
turn resumes normal judging. /goal unwait clears it manually; pause/resume/clear
drop it; a dead/stale PID can never wedge the loop.
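The parking behaviour described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the trimmed `GoalState`, the helper names `park_on_pid`, `clear_wait` and `is_parked`, and the injected `pid_alive` callable are all assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GoalState:
    """Hypothetical, trimmed-down goal state; only the wait fields are shown."""
    goal: str
    waiting_on_pid: Optional[int] = None
    waiting_reason: Optional[str] = None
    waiting_since: Optional[float] = None


def park_on_pid(state: GoalState, pid: int, reason: str = "") -> None:
    """Set the barrier, roughly what `/goal wait <pid> [reason]` is described to do."""
    state.waiting_on_pid = pid
    state.waiting_reason = reason or None
    state.waiting_since = time.time()


def clear_wait(state: GoalState) -> None:
    """Drop the barrier (manual unwait, or auto-clear on process exit)."""
    state.waiting_on_pid = None
    state.waiting_reason = None
    state.waiting_since = None


def is_parked(state: GoalState, pid_alive: Callable[[int], bool]) -> bool:
    """True while the judge should be skipped; a dead or stale PID auto-clears."""
    if state.waiting_on_pid is None:
        return False
    if pid_alive(state.waiting_on_pid):
        return True
    clear_wait(state)  # process is gone: the next turn resumes normal judging
    return False
```

Injecting the liveness check keeps the sketch testable without spawning a real process; a driver would pass its own platform-safe probe.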
Wired across CLI, gateway, and the mid-run command guard for parity. Barrier
persists in SessionDB.state_meta (survives /resume); GoalState gains
backward-compatible waiting_on_pid/waiting_reason/waiting_since fields. 12 new
tests; docs updated.
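One way the backward-compatible fields could load from previously persisted state is sketched below. The JSON encoding, the `turns_used` field, and the `to_json`/`from_json` names are assumptions for illustration; the commit only states that the barrier persists in SessionDB.state_meta and that the new fields are backward-compatible.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class GoalState:
    """Hypothetical shape; the three wait fields default to None."""
    goal: str
    turns_used: int = 0  # illustrative pre-existing field
    waiting_on_pid: Optional[int] = None
    waiting_reason: Optional[str] = None
    waiting_since: Optional[float] = None

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "GoalState":
        data = json.loads(raw)
        known = {f.name for f in fields(cls)}
        # Older payloads lack the wait fields (defaults apply); unknown keys
        # written by a newer version are ignored rather than raising.
        return cls(**{k: v for k, v in data.items() if k in known})
```

With this shape, a record saved before the wait fields existed loads as "not waiting", so an old session resumes unparked.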
* fix(goals): use gateway.status._pid_exists for liveness, not os.kill(pid,0)
The Windows-footguns CI guard flagged os.kill(pid, 0) in _pid_alive. On
Windows that call is not a no-op: it routes to CTRL_C_EVENT and hard-kills the
target's console process group (bpo-14484). Delegate to the canonical
footgun-safe gateway.status._pid_exists (psutil + ctypes/POSIX fallback)
instead, with a direct-psutil last resort.
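A liveness probe that avoids the Windows footgun might look like the sketch below. It is not `gateway.status._pid_exists`; the function name `pid_alive` and the choice to report "not alive" on Windows when psutil is missing are assumptions for this example (the real helper is described as having a ctypes fallback, which is omitted here).

```python
import os
import sys


def pid_alive(pid: int) -> bool:
    """Best-effort liveness check that never signals the target on Windows."""
    if not isinstance(pid, int) or pid <= 0:
        return False
    try:
        import psutil  # optional dependency
    except ImportError:
        psutil = None
    if psutil is not None:
        return psutil.pid_exists(pid)
    if sys.platform == "win32":
        # Assumption for this sketch: without psutil, report "not alive"
        # rather than call os.kill(pid, 0), which is unsafe on Windows.
        return False
    try:
        os.kill(pid, 0)  # POSIX only: signal 0 performs an existence check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Erring toward "not alive" when no safe probe is available matches the stated goal that a stale PID can never wedge the loop.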
* feat(goals): judge-driven auto-wait — the loop parks itself, no manual /goal wait
Makes the wait barrier automatic. Every turn the judge is shown the agent's
live background processes (pid, command, uptime, output tail from the
process_registry) alongside the goal + response, and can return a new 'wait'
verdict instead of continue:
{"verdict":"wait","wait_on_pid":N} → park until that process exits
{"verdict":"wait","wait_for_seconds":N} → park until the deadline passes
evaluate_after_turn acts on the directive (sets the barrier, parks the loop)
so the agent isn't re-poked into busy-work while CI/builds/deploys run. Adds a
time-based waiting_until barrier alongside the pid barrier; both auto-clear and
can never wedge the loop. Drivers (CLI, gateway, tui_gateway) feed the live
registry in via gather_background_processes(). Manual /goal wait stays as an
override. Judge verdict contract widened to (verdict, reason, parse_failed,
wait_directive); legacy {"done":bool} shape still accepted.
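The widened verdict contract could be parsed along these lines. This is a sketch under assumptions: the function names, the "done" verdict string, the fallback to "continue" on bad input, and the dictionary used for the barrier are illustrative; only the two wait shapes, the 4-tuple order, and the legacy {"done":bool} shape come from the message above.

```python
import json
from typing import Optional, Tuple

Verdict = Tuple[str, str, bool, Optional[dict]]


def _positive_number(value, allow_float: bool) -> bool:
    kinds = (int, float) if allow_float else (int,)
    return isinstance(value, kinds) and not isinstance(value, bool) and value > 0


def parse_judge_verdict(raw: str) -> Verdict:
    """Return (verdict, reason, parse_failed, wait_directive)."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return ("continue", "unparseable judge reply", True, None)
    if not isinstance(data, dict):
        return ("continue", "unparseable judge reply", True, None)
    reason = str(data.get("reason", ""))
    # Legacy shape: {"done": bool}
    if "verdict" not in data and isinstance(data.get("done"), bool):
        return ("done" if data["done"] else "continue", reason, False, None)
    verdict = data.get("verdict")
    if verdict == "wait":
        if _positive_number(data.get("wait_on_pid"), allow_float=False):
            return ("wait", reason, False, {"wait_on_pid": data["wait_on_pid"]})
        if _positive_number(data.get("wait_for_seconds"), allow_float=True):
            return ("wait", reason, False,
                    {"wait_for_seconds": data["wait_for_seconds"]})
        return ("continue", reason, False, None)  # wait without a usable target
    if verdict in ("done", "continue"):
        return (verdict, reason, False, None)
    return ("continue", reason, True, None)


def barrier_from_directive(directive: dict, now: float) -> dict:
    """Translate a wait directive into the barrier fields to store."""
    if "wait_on_pid" in directive:
        return {"waiting_on_pid": directive["wait_on_pid"]}
    return {"waiting_until": now + directive["wait_for_seconds"]}
```

Rejecting a "wait" verdict that names no valid target keeps a malformed reply from parking the loop indefinitely.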
* test(goals): update kanban _fake_judge to the 4-tuple judge contract
CI test(3) caught it: test_kanban_goal_mode's _fake_judge still returned the
3-tuple (verdict, reason, parse_failed), but the kanban loop now unpacks the
4-tuple (+ wait_directive). Update the fake to return None for the directive
and accept the background_processes kwarg.
* feat(goals): trigger-based wait — park on a process's own signal, not just exit
Addresses two gaps in the judge-driven wait: (1) the judge could only express
'wait until PID exits' or 'wait N seconds', so a long-lived watcher/server that
fires a trigger MID-RUN (and may never exit) couldn't be waited on; (2) the
process's own watch_patterns/notify_on_complete trigger was invisible to the judge.
Adds a session-based barrier (waiting_on_session) that releases on the process's
OWN trigger via process_registry.is_session_waiting(): the session exits, OR (if
started with watch_patterns) its pattern matches — even while the process keeps
running. list_sessions() now surfaces session_id + watch_patterns/watch_hit/
notify_on_complete so the judge sees the trigger and is told to prefer
wait_on_session for trigger processes. Judge verdict gains a {wait_on_session}
directive (preferred over pid). Backward-compatible GoalState field; pid + time
barriers unchanged.
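The release rules for the session barrier can be illustrated with a small stand-in registry. `FakeSession`, `FakeRegistry`, and `loop_is_parked` are hypothetical names for this sketch; only the semantics (release on exit, on a watch-pattern hit, or for an unknown session) follow the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FakeSession:
    """Stand-in for a registry session; not the real session class."""
    exited: bool = False
    watch_patterns: List[str] = field(default_factory=list)
    watch_hits: int = 0


class FakeRegistry:
    def __init__(self) -> None:
        self.sessions: Dict[str, FakeSession] = {}

    def is_session_waiting(self, session_id: str) -> bool:
        session = self.sessions.get(session_id)
        if session is None or session.exited:
            return False  # unknown or finished: never wedge the loop
        if session.watch_patterns and session.watch_hits > 0:
            return False  # trigger fired while the process keeps running
        return True


def loop_is_parked(registry: FakeRegistry, waiting_on_session: Optional[str]) -> bool:
    """The loop stays parked only while the registry says the wait is live."""
    return bool(waiting_on_session) and registry.is_session_waiting(waiting_on_session)
```

A long-lived watcher therefore releases the loop on its first pattern match, while a plain process releases it on exit.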
Tests: TestSessionTriggerBarrier (release on mid-run pattern match while alive,
release on exit, unknown-session, full park→trigger→resume, parse, validation,
backcompat load). 105 goal-surface + 85 process_registry tests green.
parent d4fa2db1c5
commit ff85af3fc7
13 changed files with 1139 additions and 104 deletions
@@ -1055,6 +1055,42 @@ class ProcessRegistry:
         """Check if a completion notification was already consumed via wait/log."""
         return session_id in self._completion_consumed

+    def is_session_waiting(self, session_id: str) -> bool:
+        """Whether a goal loop parked on this session should still be parked.
+
+        Used by the goal-loop wait barrier (``hermes_cli.goals``) to support
+        waiting on a process's OWN trigger, not just its exit. A session is
+        "still waiting" when:
+          - it is still running, AND
+          - if it has ``watch_patterns``, none has matched yet (so a
+            long-lived watcher that fires a trigger mid-run — and may never
+            exit — unblocks the moment its pattern hits, not on exit).
+
+        Returns False (don't wait) when the session has exited, its watch
+        pattern has already fired, or the session is unknown — so a stale or
+        already-triggered barrier can never wedge the loop.
+        """
+        if not session_id:
+            return False
+        with self._lock:
+            session = self._running.get(session_id) or self._finished.get(session_id)
+        if session is None:
+            return False
+        # Refresh detached/remote state so .exited is current.
+        try:
+            self._refresh_detached_session(session)
+        except Exception:
+            pass
+        if session.exited:
+            return False
+        # Watch-pattern process: the trigger is a pattern match, not exit.
+        # Once any match has been delivered, the wait is satisfied even though
+        # the process keeps running (server/daemon/watcher case).
+        if session.watch_patterns and not session._watch_disabled:
+            if session._watch_hits > 0:
+                return False
+        return True
+
     def _drain_should_skip(self, session_id: str) -> bool:
         """Whether the CLI drain should skip a completion event for this session.

@@ -1500,6 +1536,14 @@ class ProcessRegistry:
                 "status": "exited" if s.exited else "running",
                 "output_preview": s.output_buffer[-200:] if s.output_buffer else "",
             }
+            # Trigger metadata so a goal-loop judge can decide to wait on this
+            # process's OWN signal (a watch-pattern match or completion), not
+            # just its exit. A watcher with watch_patterns may never exit.
+            if s.watch_patterns and not s._watch_disabled:
+                entry["watch_patterns"] = list(s.watch_patterns)
+                entry["watch_hit"] = s._watch_hits > 0
+            if s.notify_on_complete:
+                entry["notify_on_complete"] = True
             if s.exited:
                 entry["exit_code"] = s.exit_code
             if s.detached:
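How a driver might render one of these entries for the judge is sketched below. `describe_process` and its output format are hypothetical; the dictionary keys follow the hunk above, plus the `session_id` that the commit message says list_sessions() now surfaces.

```python
from typing import Any, Dict


def describe_process(entry: Dict[str, Any]) -> str:
    """Render one list_sessions()-style entry as a line for the judge prompt."""
    parts = [
        f"session={entry.get('session_id', '?')}",
        f"status={entry.get('status', '?')}",
    ]
    if "watch_patterns" in entry:
        fired = "hit" if entry.get("watch_hit") else "pending"
        parts.append(f"watch={entry['watch_patterns']} ({fired})")
    if entry.get("notify_on_complete"):
        parts.append("notify_on_complete")
    if "exit_code" in entry:
        parts.append(f"exit_code={entry['exit_code']}")
    return " ".join(parts)
```

Showing whether a watch pattern is still pending gives the judge what it needs to choose a session wait over a PID wait for a watcher that may never exit.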