test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)

Cuts shard-3 local runtime in half by neutralizing real wall-clock
waits across three classes of slow test:

## 1. Retry backoff mocks

- tests/run_agent/conftest.py (NEW): autouse fixture mocks
  jittered_backoff to 0.0 so the `while time.time() < sleep_end`
  busy-loop exits immediately. No global time.sleep mock (would
  break threading tests).
- test_anthropic_error_handling, test_413_compression,
  test_run_agent_codex_responses, test_fallback_model: per-file
  fixtures mock time.sleep / asyncio.sleep for retry / compression
  paths.
- test_retaindb_plugin: cap the retaindb module's bound time.sleep
  to 0.05s via a per-test shim (background writer-thread retries
  sleep 2s after errors; tests don't care about exact duration).
  Plus replace arbitrary time.sleep(N) waits with short polling
  loops bounded by deadline.

## 2. Subprocess sleeps in production code

- test_update_gateway_restart: mock time.sleep. Production code
  does time.sleep(3) after `systemctl restart` to verify the
  service survived. Tests mock subprocess.run \u2014 nothing actually
  restarts \u2014 so the wait is dead time.

## 3. Network / IMDS timeouts (biggest single win)

- tests/conftest.py: add AWS_EC2_METADATA_DISABLED=true plus
  AWS_METADATA_SERVICE_TIMEOUT=1 and ATTEMPTS=1. boto3 falls back
  to IMDS (169.254.169.254) when no AWS creds are set. Any test
  hitting has_aws_credentials() / resolve_aws_auth_env_var() (e.g.
  test_status, test_setup_copilot_acp, anything that touches
  provider auto-detect) burned ~2-4s waiting for that to time out.
- test_exit_cleanup_interrupt: explicitly mock
  resolve_runtime_provider which was doing real network auto-detect
  (~4s). Tests don't care about provider resolution \u2014 the agent
  is already mocked.
- test_timezone: collapse the 3-test "TZ env in subprocess" suite
  into 2 tests by checking both injection AND no-leak in the same
  subprocess spawn (was 3 \u00d7 3.2s, now 2 \u00d7 4s).

## Validation

| Test | Before | After |
|---|---|---|
| test_anthropic_error_handling (8 tests) | ~80s | ~15s |
| test_413_compression (14 tests) | ~18s | 2.3s |
| test_retaindb_plugin (67 tests) | ~13s | 1.3s |
| test_status_includes_tavily_key | 4.0s | 0.05s |
| test_setup_copilot_acp_skips_same_provider_pool_step | 8.0s | 0.26s |
| test_update_gateway_restart (5 tests) | ~18s total | ~0.35s total |
| test_exit_cleanup_interrupt (2 tests) | 8s | 1.5s |
| **Matrix shard 3 local** | **108s** | **50s** |

No behavioral contract changed \u2014 tests still verify retry happens,
service restart logic runs, etc.; they just don't burn real seconds
waiting for it.

Supersedes PR #11779 (those changes are included here).
This commit is contained in:
Teknium 2026-04-17 14:21:22 -07:00 committed by GitHub
parent eb07c05646
commit 3207b9bda0
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
10 changed files with 231 additions and 33 deletions

View file

@ -0,0 +1,34 @@
"""Fast-path fixtures shared across tests/run_agent/.
Many tests in this directory exercise the retry/backoff paths in the
agent loop. Production code uses ``jittered_backoff(base_delay=5.0)``
with a ``while time.time() < sleep_end`` loop a single retry test
spends 5+ seconds of real wall-clock time on backoff waits.
Mocking ``jittered_backoff`` to return 0.0 collapses the while-loop
to a no-op (``time.time() < time.time() + 0`` is false immediately),
which handles the most common case without touching ``time.sleep``.
We deliberately DO NOT mock ``time.sleep`` here some tests
(test_interrupt_propagation, test_primary_runtime_restore, etc.) use
the real ``time.sleep`` for threading coordination or assert that it
was called with specific values. Tests that want to additionally
fast-path direct ``time.sleep(N)`` calls in production code should
monkeypatch ``run_agent.time.sleep`` locally (see
``test_anthropic_error_handling.py`` for the pattern).
"""
from __future__ import annotations
import pytest
@pytest.fixture(autouse=True)
def _fast_retry_backoff(monkeypatch):
"""Short-circuit retry backoff for all tests in this directory."""
try:
import run_agent
except ImportError:
return
monkeypatch.setattr(run_agent, "jittered_backoff", lambda *a, **k: 0.0)

View file

@ -19,6 +19,24 @@ import pytest
from agent.context_compressor import SUMMARY_PREFIX
from run_agent import AIAgent
import run_agent
# ---------------------------------------------------------------------------
# Fast backoff for compression retry tests
# ---------------------------------------------------------------------------
@pytest.fixture(autouse=True)
def _no_compression_sleep(monkeypatch):
"""Short-circuit the 2s time.sleep between compression retries.
Production code has ``time.sleep(2)`` in multiple places after a 413/context
compression, for rate-limit smoothing. Tests assert behavior, not timing.
"""
import time as _time
monkeypatch.setattr(_time, "sleep", lambda *_a, **_k: None)
monkeypatch.setattr(run_agent, "jittered_backoff", lambda *a, **k: 0.0)
# ---------------------------------------------------------------------------

View file

@ -27,6 +27,39 @@ from gateway.config import Platform
from gateway.session import SessionSource
# ---------------------------------------------------------------------------
# Fast backoff for tests that exercise the retry loop
# ---------------------------------------------------------------------------
@pytest.fixture(autouse=True)
def _no_backoff_wait(monkeypatch):
"""Short-circuit retry backoff so tests don't block on real wall-clock waits.
The production code uses jittered_backoff() with a 5s base delay plus a
tight time.sleep(0.2) loop. Without this patch, each 429/500/529 retry
test burns ~10s of real time on CI across six tests that's ~60s for
behavior we're not asserting against timing.
Tests assert retry counts and final results, never wait durations.
"""
import asyncio as _asyncio
import time as _time
monkeypatch.setattr(run_agent, "jittered_backoff", lambda *a, **k: 0.0)
monkeypatch.setattr(_time, "sleep", lambda *_a, **_k: None)
# Also fast-path asyncio.sleep — the gateway's _run_agent path has
# several await asyncio.sleep(...) calls that add real wall-clock time.
_real_asyncio_sleep = _asyncio.sleep
async def _fast_sleep(delay=0, *args, **kwargs):
# Yield to the event loop but skip the actual delay.
await _real_asyncio_sleep(0)
monkeypatch.setattr(_asyncio, "sleep", _fast_sleep)
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

View file

@ -13,6 +13,24 @@ from unittest.mock import MagicMock, patch, call
import pytest
@pytest.fixture(autouse=True)
def _mock_runtime_provider(monkeypatch):
"""run_job calls resolve_runtime_provider which can try real network
auto-detection (~4s of socket timeouts in hermetic CI). Mock it out
since these tests don't care about provider resolution — the agent
is mocked too."""
import hermes_cli.runtime_provider as rp
def _fake_resolve(*args, **kwargs):
return {
"provider": "openrouter",
"api_key": "test-key",
"base_url": "https://openrouter.ai/api/v1",
"model": "test/model",
"api_mode": "chat_completions",
}
monkeypatch.setattr(rp, "resolve_runtime_provider", _fake_resolve)
class TestCronJobCleanup:
"""cron/scheduler.py — end_session + close in the finally block."""

View file

@ -11,6 +11,16 @@ from unittest.mock import MagicMock, patch
import pytest
from run_agent import AIAgent
import run_agent
@pytest.fixture(autouse=True)
def _no_fallback_wait(monkeypatch):
"""Short-circuit time.sleep in fallback/recovery paths so tests don't
block on the ``min(3 + retry_count, 8)`` wait before a primary retry."""
import time as _time
monkeypatch.setattr(_time, "sleep", lambda *_a, **_k: None)
monkeypatch.setattr(run_agent, "jittered_backoff", lambda *a, **k: 0.0)
def _make_tool_defs(*names: str) -> list:

View file

@ -12,6 +12,15 @@ sys.modules.setdefault("fal_client", types.SimpleNamespace())
import run_agent
@pytest.fixture(autouse=True)
def _no_codex_backoff(monkeypatch):
"""Short-circuit retry backoff so Codex retry tests don't block on real
wall-clock waits (5s jittered_backoff base delay + tight time.sleep loop)."""
import time as _time
monkeypatch.setattr(run_agent, "jittered_backoff", lambda *a, **k: 0.0)
monkeypatch.setattr(_time, "sleep", lambda *_a, **_k: None)
def _patch_agent_bootstrap(monkeypatch):
monkeypatch.setattr(
run_agent,