hermes-agent/tests/agent/test_copilot_acp_client.py
Teknium 31f70d1f2a
fix(ci): recover 38 failing tests on main (#17642)
CI Tests workflow has been red on main for 40+ consecutive runs. This
commit recovers every failure visible in run 25130722163 (most recent
completed run prior to this PR).

Root causes, by group:

Test-mock drift after product landed (fix: update mocks)
- test_mcp_structured_content / test_mcp_dynamic_discovery (6 tests):
  product added _rpc_lock (#02ae15222) and _schedule_tools_refresh
  (#1350d12b0) without updating sibling test files. Install a real
  asyncio.Lock inside the fake run-loop and patch at _schedule_tools_refresh.
- test_session.py: renamed normalize_whatsapp_identifier → canonical_
  whatsapp_identifier upstream; keep a local alias so the legacy tests
  keep working.
- test_run_progress_topics Slack DM test: PR #8006 made Slack default
  tool_progress=off; explicitly set it to 'all' in the test fixture so
  the progress-callback path still runs. Also read tool_progress_callback
  at call time rather than freezing it in FakeAgent.__init__ — production
  assigns it AFTER construction.
- test_tui_gateway_server session-create/close race: session.create now
  defers _start_agent_build behind a 50ms timer — wait for the build
  thread to enter _make_agent before closing, otherwise the orphan-
  cleanup path never runs.
- test_protocol session.resume: product get_messages_as_conversation now
  takes include_ancestors kwarg; accept **_kwargs in the test stub.
- test_copilot_acp_client redaction: redactor is OFF by default (snapshots
  HERMES_REDACT_SECRETS at import); patch agent.redact._REDACT_ENABLED=True
  for the duration of the test.
- test_minimax_provider: after #17171, dots in non-Anthropic model names
  stay dots even with preserve_dots=False. Assert the new invariant
  rather than the old 'broken for MiniMax' behavior.
- test_update_autostash: updater now scans `ps -A` for dashboard PIDs;
  the test's catch-all subprocess.run stub needed stdout/stderr fields.
- test_accretion_caps: read_timestamps dict is populated lazily when
  os.path.getmtime succeeds. Use .get("read_timestamps", {}) to tolerate
  CI filesystems where the stat races file creation.

Change-detector tests (fix: rewrite as structural invariants)
- test_credential_sources_registry_has_expected_steps: was a frozen set
  comparison that broke when minimax-oauth was added. Rewrite as an
  invariant check (every step has description, no dupes, core steps
  present) per AGENTS.md 'don't write change-detector tests'.

xdist ordering / test pollution (fix: reset state, use module-local patches)
- test_setup vercel: sibling test saved VERCEL_PROJECT_ID='project' to
  os.environ via save_env_value() and never cleared it. monkeypatch.delenv
  the VERCEL_* vars in the link-file test.
- test_clipboard TestIsWsl: GitHub Actions is on Azure VMs whose real
  /proc/version often contains 'microsoft'. Patching builtins.open with
  mock_open didn't reliably intercept hermes_constants.is_wsl's call in
  xdist workers that had already cached _wsl_detected=True from an
  earlier test. Patch hermes_constants.open directly and add
  teardown_method to reset the cache after each test.

Pytest-asyncio cancellation hangs (fix: bound product await with timeout)
- test_session_split_brain_11016 (3 params) + test_gateway_shutdown
  cancel-inflight: under pytest-asyncio 1.3.0, 'await task' and
  'asyncio.gather(cancelled_tasks)' can stall for 30s when the cancelled
  task's finally block awaits typing-task cleanup. Bound both with
  asyncio.wait_for(..., timeout=5.0) and asyncio.shield — the stragglers
  are released from adapter tracking and allowed to finish unwinding in
  the background. This is also a legitimate hardening: a wedged finally
  shouldn't stall the caller's dispatch or a gateway shutdown.

Orphan UI config (fix: merge tiny tab into messaging category)
- test_web_server test_no_single_field_categories: the telegram.reactions
  config field lived in its own 'telegram' schema category with no
  siblings. Fold it under 'discord' via _CATEGORY_MERGE so the dashboard
  doesn't render an orphan single-field tab.

Local verification: 38/38 originally-failing tests pass; 4044/4044
gateway tests pass; 684/684 targeted subset (all 16 touched test files)
passes.
2026-04-29 20:05:32 -07:00

207 lines
7 KiB
Python

"""Focused regressions for the Copilot ACP shim safety layer."""
from __future__ import annotations
import io
import json
import os
import tempfile
import unittest
from pathlib import Path
from unittest.mock import patch
from agent.copilot_acp_client import CopilotACPClient
class _FakeProcess:
def __init__(self) -> None:
self.stdin = io.StringIO()
class CopilotACPClientSafetyTests(unittest.TestCase):
def setUp(self) -> None:
self.client = CopilotACPClient(acp_cwd="/tmp")
def _dispatch(self, message: dict, *, cwd: str) -> dict:
process = _FakeProcess()
handled = self.client._handle_server_message(
message,
process=process,
cwd=cwd,
text_parts=[],
reasoning_parts=[],
)
self.assertTrue(handled)
payload = process.stdin.getvalue().strip()
self.assertTrue(payload)
return json.loads(payload)
def test_request_permission_is_not_auto_allowed(self) -> None:
response = self._dispatch(
{
"jsonrpc": "2.0",
"id": 1,
"method": "session/request_permission",
"params": {},
},
cwd="/tmp",
)
outcome = (((response.get("result") or {}).get("outcome") or {}).get("outcome"))
self.assertEqual(outcome, "cancelled")
def test_read_text_file_blocks_internal_hermes_hub_files(self) -> None:
with tempfile.TemporaryDirectory() as tmpdir:
home = Path(tmpdir) / "home"
blocked = home / ".hermes" / "skills" / ".hub" / "index-cache" / "entry.json"
blocked.parent.mkdir(parents=True, exist_ok=True)
blocked.write_text('{"token":"sk-test-secret-1234567890"}')
with patch.dict(
os.environ,
{"HOME": str(home), "HERMES_HOME": str(home / ".hermes")},
clear=False,
):
response = self._dispatch(
{
"jsonrpc": "2.0",
"id": 2,
"method": "fs/read_text_file",
"params": {"path": str(blocked)},
},
cwd=str(home),
)
self.assertIn("error", response)
def test_read_text_file_redacts_sensitive_content(self) -> None:
with tempfile.TemporaryDirectory() as tmpdir:
root = Path(tmpdir)
secret_file = root / "config.env"
secret_file.write_text("OPENAI_API_KEY=sk-proj-abc123def456ghi789jkl012")
# agent.redact snapshots HERMES_REDACT_SECRETS at import time into
# _REDACT_ENABLED, so patching os.environ is a no-op. Flip the
# module-level constant directly for the duration of the call.
with patch("agent.redact._REDACT_ENABLED", True):
response = self._dispatch(
{
"jsonrpc": "2.0",
"id": 3,
"method": "fs/read_text_file",
"params": {"path": str(secret_file)},
},
cwd=str(root),
)
content = ((response.get("result") or {}).get("content") or "")
self.assertNotIn("abc123def456", content)
self.assertIn("OPENAI_API_KEY=", content)
def test_write_text_file_reuses_write_denylist(self) -> None:
with tempfile.TemporaryDirectory() as tmpdir:
home = Path(tmpdir) / "home"
target = home / ".ssh" / "id_rsa"
target.parent.mkdir(parents=True, exist_ok=True)
with patch("agent.copilot_acp_client.is_write_denied", return_value=True, create=True):
response = self._dispatch(
{
"jsonrpc": "2.0",
"id": 4,
"method": "fs/write_text_file",
"params": {
"path": str(target),
"content": "fake-private-key",
},
},
cwd=str(home),
)
self.assertIn("error", response)
self.assertFalse(target.exists())
def test_write_text_file_respects_safe_root(self) -> None:
with tempfile.TemporaryDirectory() as tmpdir:
root = Path(tmpdir)
safe_root = root / "workspace"
safe_root.mkdir()
outside = root / "outside.txt"
with patch.dict(os.environ, {"HERMES_WRITE_SAFE_ROOT": str(safe_root)}, clear=False):
response = self._dispatch(
{
"jsonrpc": "2.0",
"id": 5,
"method": "fs/write_text_file",
"params": {
"path": str(outside),
"content": "should-not-write",
},
},
cwd=str(root),
)
self.assertIn("error", response)
self.assertFalse(outside.exists())
if __name__ == "__main__":
unittest.main()
# ── HOME env propagation tests (from PR #11285) ─────────────────────
from unittest.mock import patch as _patch
import pytest
def _make_home_client(tmp_path):
return CopilotACPClient(
api_key="copilot-acp",
base_url="acp://copilot",
acp_command="copilot",
acp_args=["--acp", "--stdio"],
acp_cwd=str(tmp_path),
)
def _fake_popen_capture(captured):
def _fake(cmd, **kwargs):
captured["cmd"] = cmd
captured["kwargs"] = kwargs
raise FileNotFoundError("copilot not found")
return _fake
def test_run_prompt_prefers_profile_home_when_available(monkeypatch, tmp_path):
hermes_home = tmp_path / "hermes"
profile_home = hermes_home / "home"
profile_home.mkdir(parents=True)
monkeypatch.delenv("HOME", raising=False)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
captured = {}
client = _make_home_client(tmp_path)
with _patch("agent.copilot_acp_client.subprocess.Popen", side_effect=_fake_popen_capture(captured)):
with pytest.raises(RuntimeError, match="Could not start Copilot ACP command"):
client._run_prompt("hello", timeout_seconds=1)
assert captured["kwargs"]["env"]["HOME"] == str(profile_home)
def test_run_prompt_passes_home_when_parent_env_is_clean(monkeypatch, tmp_path):
monkeypatch.delenv("HOME", raising=False)
monkeypatch.delenv("HERMES_HOME", raising=False)
captured = {}
client = _make_home_client(tmp_path)
with _patch("agent.copilot_acp_client.subprocess.Popen", side_effect=_fake_popen_capture(captured)):
with pytest.raises(RuntimeError, match="Could not start Copilot ACP command"):
client._run_prompt("hello", timeout_seconds=1)
assert "env" in captured["kwargs"]
assert captured["kwargs"]["env"]["HOME"]