Background process watchers (notify_on_complete, check_interval) created
synthetic SessionSource objects without user_id/user_name. While the
internal=True bypass (1d8d4f28) prevented false pairing for agent-
generated notifications, the missing identity caused:
- Garbage entries in pairing rate limiters (discord:None, telegram:None)
- 'User None' in approval messages and logs
- No user identity available for future code paths that need it
Additionally, platform messages arriving without from_user (Telegram
service messages, channel forwards, anonymous admin actions) could still
trigger false pairing because they are not internal events.
Fix:
1. Propagate user_id/user_name through the full watcher chain:
session_context.py → gateway/run.py → terminal_tool.py →
process_registry.py (including checkpoint persistence/recovery)
2. Add None user_id guard in _handle_message() — silently drop
non-internal messages with no user identity instead of triggering
the pairing flow.
Salvaged from PRs #7664 (kagura-agent, ContextVar approach),
#6540 (MestreY0d4-Uninter, tests), and #7709 (guang384, None guard).
Closes#6341, #6485, #7643
Relates to #6516, #7392
Previously crash recovery recreated detached sessions as if they were
fully managed, so polls and kills could lie about liveness and the
checkpoint could forget recovered jobs after the next restart.
This commit refreshes recovered host-backed sessions from real PID
state, keeps checkpoint data durable, and preserves notify watcher
metadata while treating sandbox-only PIDs as non-recoverable.
- Persist `pid_scope` in `tools/process_registry.py` and skip
recovering sandbox-backed entries without a host-visible PID handle
- Refresh detached sessions on access so `get`/`poll`/`wait` and active
session queries observe exited processes instead of hanging forever
- Allow recovered host PIDs to be terminated honestly and requeue
`notify_on_complete` watchers during checkpoint recovery
- Add regression tests for durable checkpoints, detached exit/kill
behavior, sandbox skip logic, and recovered notify watchers
Salvaged from PR #1573 by @eren-karakus0. Cherry-picked with authorship preserved.
Fixes#1143 — background process notifications resume after gateway restart.
Co-authored-by: Muhammet Eren Karakuş <erenkar950@gmail.com>
Extend subprocess env sanitization beyond provider credentials by blocking Hermes-managed tool, messaging, and related gateway runtime vars. Reuse a shared sanitizer in LocalEnvironment and ProcessRegistry so background and PTY processes honor the same blocklist and _HERMES_FORCE_ escape hatch. Add regression coverage for local env execution and process_registry spawning.
Cover model_tools, toolset_distributions, context_compressor,
prompt_caching, cronjob_tools, session_search, process_registry,
and cron/scheduler with 127 new test cases.