Commit graph

7 commits

Author SHA1 Message Date
Teknium
187cf0f257
tools(terminal): nudge homebrewed CI pollers at the tool surface (#33142)
Background processes whose command contains `gh pr view --json
statusCheckRollup` or `gh pr checks | jq` now get a runtime hint in
the result pointing at the canonical green-ci-policy snippets. The
homebrew shape has caused at least seven silent CI-watcher failures
in the past two weeks (#31329, #31448, #31695, #31709, #31745,
#32264, #33131) — each one a different jq/awk/grep variation of the
same fundamental problem (stdout buffering, jq null-key edge cases,
conclusion-vs-status confusion, TTY-only banner grepping).

The skill that documents this anti-pattern is excellent, but a skill
only fires if the agent loads it. The tool surface fires on every
misuse. This is the embed-footguns-in-tool-surface pattern from
PR #31289 applied to a recurring failure mode that's outgrown
skill-only enforcement.

Detector is deliberately narrow — flags two specific shapes:

  1. Any command containing `statusCheckRollup` (the JSON-API path —
     conclusion vs status field semantics keep burning us).
  2. `gh pr view` / `gh pr checks` combined with `jq` (gh pr
     checks doesn't emit JSON, so any `| jq` here is confused intent;
     the canonical column-2 poller uses awk-on-tabs, not jq).

Does NOT flag the blessed column-2 awk-on-tabs poller (which uses
`awk -F"\t" "\==\"pending\""`) or the exit-code-driven
`gh pr checks $PR >/dev/null` snippet.

Hint composes with the existing background-without-notify_on_complete
hint — both can fire on the same call. Each is independently
actionable.

Tests:
- 4 new cases in tests/tools/test_notify_on_complete.py
- test_homebrew_ci_poller_via_statusCheckRollup_emits_hint (positive)
- test_homebrew_ci_poller_via_gh_pr_checks_piped_to_jq_emits_hint (positive)
- test_canonical_column2_awk_poller_does_not_emit_homebrew_hint (negative)
- test_canonical_gh_pr_checks_exit_code_loop_does_not_emit_hint (negative)
- test_non_ci_background_command_does_not_emit_homebrew_hint (negative)
- 30/30 passing (was 26)
2026-05-27 02:22:08 -07:00
Teknium
d97c324473
fix(terminal): warn at call time when background=true runs silently (#31289)
`terminal(background=true)` without `notify_on_complete=true` or
`watch_patterns` runs the process SILENTLY — the agent has no way
to learn it finished short of calling `process(action='poll')`
explicitly. That's correct for genuine long-lived processes (servers,
watchers, daemons) but is a footgun for every bounded task (tests,
builds, deploys, CI pollers, batch jobs), which is the vast majority
of background uses.

Hit on May 23, 2026 (PR #31231 incident): agent launched a CI-watch
loop with `background=true` only. The poller ran fine, exited green
6 minutes later, agent never noticed. User had to surface 'we are
green CI, you can merge.' Memory and skill docs said *what* to do
(poll in background) but not *how* to receive the result. The
`notify_on_complete=true` flag exists and works, but is easy to
forget when bg seems sufficient on its own.

Two changes here, mutually reinforcing:

1. Runtime nudge: tool result for `background=true` w/o notify or
   watch_patterns now includes a `hint` field explaining the silent-
   process failure mode and pointing at the corrective flag. Agent
   sees it on the same turn and self-corrects without needing the
   user to surface anything. Cost for legitimate server cases is one
   ignored read (~50 tokens); cost for forgot-notify cases is
   prevented blindness (potentially many turns, or a user nudge).
   False positives << false negatives.

2. Schema/description rewrite: top-level TERMINAL_TOOL_DESCRIPTION
   and the `background` field description now lead with 'Almost
   always pair with notify_on_complete=true' instead of presenting
   it as one of two equally-likely patterns. The two legitimate
   non-notify shapes (long-lived servers; watch_patterns mid-process
   signals) are still documented, but as the minority case.

Tests cover all four shapes: bg-only emits hint, bg+notify doesn't,
bg+watch_patterns doesn't, foreground doesn't. 4 new tests; full
suite of background/process tests stays green (160/160 across the
relevant 6 test files).
2026-05-23 21:02:14 -07:00
Teknium
f53a5a7fe1
fix: suppress duplicate completion notifications when agent already consumed output via wait/poll/log (#8228)
When the agent calls process(action='wait') or process(action='poll')
and gets the exited status, the completion_queue notification is
redundant — the agent already has the output from the tool return.
Previously, the drain loops in CLI and gateway would still inject
the [SYSTEM: Background process completed] message, causing the
agent to receive the same information twice.

Fix: track session IDs in _completion_consumed set when wait/poll/log
returns an exited process. Drain loops in cli.py and gateway watcher
skip completion events for consumed sessions. Watch pattern events
are never suppressed (they have independent semantics).

Adds 4 tests covering wait/poll/log marking and running-process
negative case.
2026-04-12 00:36:22 -07:00
Teknium
cac6178104 fix(gateway): propagate user identity through process watcher pipeline
Background process watchers (notify_on_complete, check_interval) created
synthetic SessionSource objects without user_id/user_name. While the
internal=True bypass (1d8d4f28) prevented false pairing for agent-
generated notifications, the missing identity caused:

- Garbage entries in pairing rate limiters (discord:None, telegram:None)
- 'User None' in approval messages and logs
- No user identity available for future code paths that need it

Additionally, platform messages arriving without from_user (Telegram
service messages, channel forwards, anonymous admin actions) could still
trigger false pairing because they are not internal events.

Fix:
1. Propagate user_id/user_name through the full watcher chain:
   session_context.py → gateway/run.py → terminal_tool.py →
   process_registry.py (including checkpoint persistence/recovery)

2. Add None user_id guard in _handle_message() — silently drop
   non-internal messages with no user identity instead of triggering
   the pairing flow.

Salvaged from PRs #7664 (kagura-agent, ContextVar approach),
#6540 (MestreY0d4-Uninter, tests), and #7709 (guang384, None guard).

Closes #6341, #6485, #7643
Relates to #6516, #7392
2026-04-11 13:46:16 -07:00
Teknium
c8e4dcf412
fix: prevent duplicate completion notifications on process kill (#7124)
When kill_process() sends SIGTERM, both it and the reader thread race
to call _move_to_finished() — kill_process sets exit_code=-15 and
enqueues a notification, then the reader thread's process.wait()
returns with exit_code=143 (128+SIGTERM) and enqueues a second one.

Fix: make _move_to_finished() idempotent by tracking whether the
session was actually removed from _running. The second call sees it
was already moved and skips the completion_queue.put().

Adds regression test: test_move_to_finished_idempotent_no_duplicate
2026-04-10 03:52:16 -07:00
mrshu
19b0ddce40 fix(process): correct detached crash recovery state
Previously crash recovery recreated detached sessions as if they were
fully managed, so polls and kills could lie about liveness and the
checkpoint could forget recovered jobs after the next restart.
This commit refreshes recovered host-backed sessions from real PID
state, keeps checkpoint data durable, and preserves notify watcher
metadata while treating sandbox-only PIDs as non-recoverable.

- Persist `pid_scope` in `tools/process_registry.py` and skip
  recovering sandbox-backed entries without a host-visible PID handle
- Refresh detached sessions on access so `get`/`poll`/`wait` and active
  session queries observe exited processes instead of hanging forever
- Allow recovered host PIDs to be terminated honestly and requeue
  `notify_on_complete` watchers during checkpoint recovery
- Add regression tests for durable checkpoints, detached exit/kill
  behavior, sandbox skip logic, and recovered notify watchers
2026-04-08 03:35:43 -07:00
Teknium
e120d2afac
feat: notify_on_complete for background processes (#5779)
* feat: notify_on_complete for background processes

When terminal(background=true, notify_on_complete=true), the system
auto-triggers a new agent turn when the process exits — no polling needed.

Changes:
- ProcessSession: add notify_on_complete field
- ProcessRegistry: add completion_queue, populate on _move_to_finished()
- Terminal tool: add notify_on_complete parameter to schema + handler
- CLI: drain completion_queue after agent turn AND during idle loop
- Gateway: enhanced _run_process_watcher injects synthetic MessageEvent
  on completion, triggering a full agent turn
- Checkpoint persistence includes notify_on_complete for crash recovery
- code_execution_tool: block notify_on_complete in sandbox scripts
- 15 new tests covering queue mechanics, checkpoint round-trip, schema

* docs: update terminal tool descriptions for notify_on_complete

- background: remove 'ONLY for servers' language, describe both patterns
  (long-lived processes AND long-running tasks with notify_on_complete)
- notify_on_complete: more prescriptive about when to use it
- TERMINAL_TOOL_DESCRIPTION: remove 'Do NOT use background for builds'
  guidance that contradicted the new feature
2026-04-07 02:40:16 -07:00