hermes-agent/tools/environments
Teknium 4e89c53082
fix(async): close unscheduled coroutines in all threadsafe bridges (#26584)
Wraps every sync->async coroutine-scheduling site in the codebase with a
new agent.async_utils.safe_schedule_threadsafe() helper that closes the
coroutine on scheduling failure (closed loop, shutdown race, etc.)
instead of leaking it as 'coroutine was never awaited' RuntimeWarnings
plus reference leaks.

22 production call sites migrated across the codebase:
- acp_adapter/events.py, acp_adapter/permissions.py
- agent/lsp/manager.py
- cron/scheduler.py (media + text delivery paths)
- gateway/platforms/feishu.py (5 sites, via existing _submit_on_loop helper
  which now delegates to safe_schedule_threadsafe)
- gateway/run.py (10 sites: telegram rename, agent:step hook, status
  callback, interim+bg-review, clarify send, exec-approval button+text,
  temp-bubble cleanup, channel-directory refresh)
- plugins/memory/hindsight, plugins/platforms/google_chat
- tools/browser_supervisor.py (3), browser_cdp_tool.py,
  computer_use/cua_backend.py, slash_confirm.py
- tools/environments/modal.py (_AsyncWorker)
- tools/mcp_tool.py (2 + 8 _run_on_mcp_loop callers converted to
  factory-style so the coroutine is never constructed on a dead loop)
- tui_gateway/ws.py

Tests: new tests/agent/test_async_utils.py covers helper behavior under
live loop, dead loop, None loop, and scheduling exceptions. Regression
tests added at three PR-original sites (acp events, acp permissions,
mcp loop runner) mirroring contributor's intent.

Live-tested end-to-end:
- Helper stress test: 1500 schedules across live/dead/race scenarios,
  zero leaked coroutines
- Race exercised: 5000 schedules with loop killed mid-flight, 100 ok /
  4900 None returns, zero leaks
- hermes chat -q with terminal tool call (exercises step_callback bridge)
- MCP probe against failing subprocess servers + factory path
- Real gateway daemon boot + SIGINT shutdown across multiple platform
  adapter inits
- WSTransport 100 live + 50 dead-loop writes
- Cron delivery path live + dead loop

Salvages PR #2657 — adopts contributor's intent over a much wider site
list and a single centralized helper instead of inline try/except at
each site. 3 of the original PR's 6 sites no longer exist on main
(environments/patches.py deleted, DingTalk refactored to native async);
the equivalent fix lives in tools/environments/modal.py instead.

Co-authored-by: JithendraNara <jithendranaidunara@gmail.com>
2026-05-15 14:00:01 -07:00
..
__init__.py docs: align terminal-backend count and naming across docs and code 2026-05-05 13:44:09 -07:00
base.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
daytona.py fix(daytona): migrate legacy-sandbox lookup to cursor-based list() (#24587) 2026-05-12 16:31:46 -07:00
docker.py feat(terminal,cli): docker_extra_args + display.timestamps 2026-05-10 22:43:39 -07:00
file_sync.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
local.py feat: expose HERMES_SESSION_ID to agent tools via ContextVar + env (#23847) 2026-05-12 00:16:45 +05:30
managed_modal.py feat(environments): unified spawn-per-call execution layer 2026-04-08 17:23:15 -07:00
modal.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
modal_utils.py fix: follow-up for salvaged PR #10854 2026-04-16 06:42:45 -07:00
singularity.py feat(environments): unified spawn-per-call execution layer 2026-04-08 17:23:15 -07:00
ssh.py fix(ssh): add scp availability check to preflight validation 2026-05-05 09:57:23 -07:00
vercel_sandbox.py feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220) 2026-05-12 01:02:25 -07:00