Commit graph

5 commits

Author SHA1 Message Date
Teknium
a418ddbd8b
fix: add activity heartbeats to prevent false gateway inactivity timeouts (#10501)
Multiple gaps in activity tracking could cause the gateway's inactivity
timeout to fire while the agent is actively working:

1. Streaming wait loop had no periodic heartbeat — the outer thread only
   touched activity when the stale-stream detector fired (180-300s), and
   for local providers (Ollama) the stale timeout was infinity, meaning
   zero heartbeats. Now touches activity every 30s.

2. Concurrent tool execution never set the activity callback on worker
   threads (threading.local invisible across threads) and never set
   _current_tool. Workers now set the callback, and the concurrent wait
   uses a polling loop with 30s heartbeats.

3. Modal backend's execute() override had its own polling loop without
   any activity callback. Now matches _wait_for_process cadence (10s).
2026-04-15 13:29:05 -07:00
alt-glitch
d684d7ee7e feat(environments): unified spawn-per-call execution layer
Replace dual execution model (PersistentShellMixin + per-backend oneshot)
with spawn-per-call + session snapshot for all backends except ManagedModal.

Core changes:
- Every command spawns a fresh bash process; session snapshot (env vars,
  functions, aliases) captured at init and re-sourced before each command
- CWD persists via file-based read (local) or in-band stdout markers (remote)
- ProcessHandle protocol + _ThreadedProcessHandle adapter for SDK backends
- cancel_fn wired for Modal (sandbox.terminate) and Daytona (sandbox.stop)
- Shared utilities extracted: _pipe_stdin, _popen_bash, _load_json_store,
  _save_json_store, _file_mtime_key, _SYNC_INTERVAL_SECONDS
- Rate-limited file sync unified in base _before_execute() with _sync_files() hook
- execute_oneshot() removed; all 11 call sites in code_execution_tool.py
  migrated to execute()
- Daytona timeout wrapper replaced with SDK-native timeout parameter
- persistent_shell.py deleted (291 lines)

Backend-specific:
- Local: process-group kill via os.killpg, file-based CWD read
- Docker: -e env flags only on init_session, not per-command
- SSH: shlex.quote transport, ControlMaster connection reuse
- Singularity: apptainer exec with instance://, no forced --pwd
- Modal: _AsyncWorker + _ThreadedProcessHandle, cancel_fn -> sandbox.terminate
- Daytona: SDK-level timeout (not shell wrapper), cancel_fn -> sandbox.stop
- ManagedModal: unchanged (gateway owns execution); docstring added explaining why
2026-04-08 17:23:15 -07:00
Teknium
acea9ee20b
fix(tests): fix 11 real test failures + major cascade poisoner (#4570)
Three root causes addressed:

1. AIAgent no longer defaults base_url to OpenRouter (9 tests)
   Tests that assert OpenRouter-specific behavior (prompt caching,
   reasoning extra_body, provider preferences) need explicit base_url
   and model set on the agent. Updated test_run_agent.py and
   test_provider_parity.py.

2. Credential pool auto-seeding from host env (2 tests)
   test_auxiliary_client.py tests for Anthropic OAuth and custom
   endpoint fallback were not mocking _select_pool_entry, so the
   host's credential pool interfered. Added pool + codex mocks.

3. sys.modules corruption cascade (major - ~250 tests)
   test_managed_modal_environment.py replaced sys.modules entries
   (tools, hermes_cli, agent packages) with SimpleNamespace stubs
   but had NO cleanup fixture. Every subsequent test in the process
   saw corrupted imports: 'cannot import get_config_path from
   <unknown module name>' and 'module tools has no attribute
   environments'. Added _restore_tool_and_agent_modules autouse
   fixture matching the pattern in test_managed_browserbase_and_modal.py.

   This was also the root cause of CI failures (104 failed on main).
2026-04-02 08:43:06 -07:00
Robin Fernandes
1b7473e702 Fixes and refactors enabled by recent updates to main. 2026-03-31 09:29:59 +09:00
Robin Fernandes
95dc9aaa75 feat: add managed tool gateway and Nous subscription support
- add managed modal and gateway-backed tool integrations\n- improve CLI setup, auth, and configuration for subscriber flows\n- expand tests and docs for managed tool support
2026-03-26 16:17:58 -07:00