fix: add activity heartbeats to prevent false gateway inactivity timeouts (#10501)

Multiple gaps in activity tracking could cause the gateway's inactivity
timeout to fire while the agent is actively working:

1. Streaming wait loop had no periodic heartbeat — the outer thread only
   touched activity when the stale-stream detector fired (180-300s), and
   for local providers (Ollama) the stale timeout was infinity, meaning
   zero heartbeats. Now touches activity every 30s.

2. Concurrent tool execution never set the activity callback on worker
   threads (threading.local invisible across threads) and never set
   _current_tool. Workers now set the callback, and the concurrent wait
   uses a polling loop with 30s heartbeats.

3. Modal backend's execute() override had its own polling loop without
   any activity callback. Now matches _wait_for_process cadence (10s).
This commit is contained in:
Teknium 2026-04-15 13:29:05 -07:00 committed by GitHub
parent 0d25e1c146
commit a418ddbd8b
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
3 changed files with 73 additions and 3 deletions

View file

@ -296,7 +296,7 @@ def test_managed_modal_execute_times_out_and_cancels(monkeypatch):
modal_common = sys.modules["tools.environments.modal_utils"]
calls = []
monotonic_values = iter([0.0, 12.5])
monotonic_values = iter([0.0, 0.0, 0.0, 12.5, 12.5])
def fake_request(method, url, headers=None, json=None, timeout=None):
calls.append((method, url, json, timeout))