From 24fa05576366564479e0b0e7ee37d34a28506cc3 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Thu, 16 Apr 2026 20:43:41 -0700
Subject: [PATCH] fix(ci): resolve 4 pre-existing main failures (docs lint + 3
 stale tests) (#11373)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* docs: fix ascii-guard border alignment errors

Three docs pages had ASCII diagram boxes with off-by-one column
alignment issues that failed docs-site-checks CI:

- architecture.md: outer box is 71 cols but inner-box content lines
  and border corners were offset by 1 col, making content-line right
  border at col 70/72 while top/bottom border was at col 71. Inner
  boxes also had border corners at cols 19/36/53 but content pipes
  at cols 20/37/54. Rewrote the diagram with consistent 71-col width
  throughout, aligned inner boxes at cols 4-19, 22-37, 40-55 with
  2-space gaps and 15-space trailing padding.

- gateway-internals.md: same class of issue — outer box at 51 cols,
  inner content lines varied 52-54 cols. Rewrote with consistent
  51-col width, inner boxes at cols 4-15, 18-29, 32-43. Also
  restructured the bottom-half message flow so it's bare text
  (not half-open box cells) matching the intent of the original.

- agent-loop.md line 112-114: box 2 (API thread) content lines had
  one extra space pushing the right border to col 46 while the top
  and bottom borders of that box sat at col 45. Trimmed one trailing
  space from each of the three content lines.

All 123 docs files now pass `npm run lint:diagrams`:
  ✓ Errors: 0  (warnings: 6, non-fatal)

Pre-existing failures on main — unrelated to any open PR.

* test(setup): accept description kwarg in prompt_choice mock lambdas

setup.py's `_curses_prompt_choice` gained an optional `description`
parameter (used for rendering context hints alongside the prompt).
`prompt_choice` forwards it via keyword arg. The two existing tests
mocked `_curses_prompt_choice` with lambdas that didn't accept the
new kwarg, so the forwarded call raised TypeError.

Fix: add `description=None` to both mock lambda signatures so they
absorb the new kwarg without changing behavior.

* test(matrix): update stale audio-caching assertion

test_regular_audio_has_http_url asserted that non-voice audio
messages keep their HTTP URL and are NOT downloaded/cached. That
was true when the caching code only triggered on
`is_voice_message`. Since bec02f37 (encrypted-media caching
refactor), matrix.py caches all media locally — photos, audio,
video, documents — so downstream tools can read them as real
files via media_urls. This applies to regular audio too.

Renamed the test to `test_regular_audio_is_cached_locally`,
flipped the assertions accordingly, and documented the
intentional behavior change in the docstring. Other tests in
the file (voice-specific caching, message-type detection,
reply-to threading) continue to pass.

* test(413): allow multi-pass preflight compression

run_agent.py's preflight compression runs up to 3 passes in a loop
for very large sessions (each pass summarizes the middle N turns,
then re-checks tokens). The loop breaks when a pass returns a
message list no shorter than its input (can't compress further).

test_preflight_compresses_oversized_history used a static mock
return value that returned the same 2 messages regardless of input,
so the loop ran pass 1 (41 -> 2) and pass 2 (2 -> 2 -> break),
making call_count == 2. The assert_called_once() assertion was
strictly wrong under the multi-pass design.

The invariant the test actually cares about is: preflight ran, and
its first invocation received the full oversized history. Replaced
the count assertion with those two invariants.

* docs: drop '...' from gateway diagram, merge side-by-side boxes

ascii-guard 2.3.0 flagged two remaining issues after the initial fix
pass:

1. gateway-internals.md L33: the '...' suffix after inner box 3's
   right border got parsed as 'extra characters after inner-box right
   border'. Dropped the '...' — the surrounding prose already conveys
   'and more platforms' without needing the visual hint.

2. agent-loop.md: ascii-guard can't cleanly parse two side-by-side
   boxes of different heights (main thread 7 rows, API thread 5 rows).
   Even equalizing heights didn't help — the linter treats the left
   box's right border as the end of the diagram. Merged into a single
   54-char-wide outer box with both threads labeled as regions inside,
   keeping the ▶ arrow to preserve the main→API flow direction.
---
 tests/gateway/test_matrix_voice.py            | 18 ++++++----
 tests/hermes_cli/test_setup_prompt_menus.py   |  4 +--
 tests/run_agent/test_413_compression.py       | 11 ++++--
 website/docs/developer-guide/agent-loop.md    | 15 ++++----
 website/docs/developer-guide/architecture.md  | 30 ++++++++--------
 .../docs/developer-guide/gateway-internals.md | 36 +++++++++----------
 6 files changed, 64 insertions(+), 50 deletions(-)

diff --git a/tests/gateway/test_matrix_voice.py b/tests/gateway/test_matrix_voice.py
index dab113c5d..3b3e08d14 100644
--- a/tests/gateway/test_matrix_voice.py
+++ b/tests/gateway/test_matrix_voice.py
@@ -184,8 +184,14 @@ class TestMatrixVoiceMessageDetection:
             f"Expected MessageType.AUDIO for non-voice, got {captured_event.message_type}"
 
     @pytest.mark.asyncio
-    async def test_regular_audio_has_http_url(self):
-        """Regular audio uploads should keep HTTP URL (not cached locally)."""
+    async def test_regular_audio_is_cached_locally(self):
+        """Regular audio uploads are cached locally for downstream tool access.
+
+        Since PR #bec02f37 (encrypted-media caching refactor), all media
+        types — photo, audio, video, document — are cached locally when
+        received so tools can read them as real files. This applies equally
+        to voice messages and regular audio.
+        """
         event = _make_audio_event(is_voice=False)
 
         captured_event = None
@@ -200,10 +206,10 @@ class TestMatrixVoiceMessageDetection:
 
         assert captured_event is not None
         assert captured_event.media_urls is not None
-        # Should be HTTP URL, not local path
-        assert captured_event.media_urls[0].startswith("http"), \
-            f"Non-voice audio should have HTTP URL, got {captured_event.media_urls[0]}"
-        self.adapter._client.download_media.assert_not_awaited()
+        # Should be a local path, not an HTTP URL.
+        assert not captured_event.media_urls[0].startswith("http"), \
+            f"Regular audio should be cached locally, got {captured_event.media_urls[0]}"
+        self.adapter._client.download_media.assert_awaited_once()
         assert captured_event.media_types == ["audio/ogg"]
 
 
diff --git a/tests/hermes_cli/test_setup_prompt_menus.py b/tests/hermes_cli/test_setup_prompt_menus.py
index 5a7225d09..fd017d87d 100644
--- a/tests/hermes_cli/test_setup_prompt_menus.py
+++ b/tests/hermes_cli/test_setup_prompt_menus.py
@@ -2,7 +2,7 @@ from hermes_cli import setup as setup_mod
 
 
 def test_prompt_choice_uses_curses_helper(monkeypatch):
-    monkeypatch.setattr(setup_mod, "_curses_prompt_choice", lambda question, choices, default=0: 1)
+    monkeypatch.setattr(setup_mod, "_curses_prompt_choice", lambda question, choices, default=0, description=None: 1)
 
     idx = setup_mod.prompt_choice("Pick one", ["a", "b", "c"], default=0)
 
@@ -10,7 +10,7 @@ def test_prompt_choice_uses_curses_helper(monkeypatch):
 
 
 def test_prompt_choice_falls_back_to_numbered_input(monkeypatch):
-    monkeypatch.setattr(setup_mod, "_curses_prompt_choice", lambda question, choices, default=0: -1)
+    monkeypatch.setattr(setup_mod, "_curses_prompt_choice", lambda question, choices, default=0, description=None: -1)
     monkeypatch.setattr("builtins.input", lambda _prompt="": "2")
 
     idx = setup_mod.prompt_choice("Pick one", ["a", "b", "c"], default=0)
diff --git a/tests/run_agent/test_413_compression.py b/tests/run_agent/test_413_compression.py
index b30f9f6bb..1d6f6cebb 100644
--- a/tests/run_agent/test_413_compression.py
+++ b/tests/run_agent/test_413_compression.py
@@ -430,8 +430,15 @@ class TestPreflightCompression:
             )
             result = agent.run_conversation("hello", conversation_history=big_history)
 
-        # Preflight compression should have been called BEFORE the API call
-        mock_compress.assert_called_once()
+        # Preflight compression is a multi-pass loop (up to 3 passes for very
+        # large sessions, breaking when no further reduction is possible).
+        # First pass must have received the full oversized history.
+        assert mock_compress.call_count >= 1, "Preflight compression never ran"
+        first_call_messages = mock_compress.call_args_list[0].args[0]
+        assert len(first_call_messages) >= 40, (
+            f"First preflight pass should see the full history, got "
+            f"{len(first_call_messages)} messages"
+        )
         assert result["completed"] is True
         assert result["final_response"] == "After preflight"
 
diff --git a/website/docs/developer-guide/agent-loop.md b/website/docs/developer-guide/agent-loop.md
index 2d0df3278..1ec647010 100644
--- a/website/docs/developer-guide/agent-loop.md
+++ b/website/docs/developer-guide/agent-loop.md
@@ -108,13 +108,14 @@ Providers validate these sequences and will reject malformed histories.
 API requests are wrapped in `_api_call_with_interrupt()` which runs the actual HTTP call in a background thread while monitoring an interrupt event:
 
 ```text
-┌──────────────────────┐     ┌──────────────┐
-│  Main thread         │     │  API thread   │
-│  wait on:            │────▶│  HTTP POST    │
-│  - response ready    │     │  to provider  │
-│  - interrupt event   │     └──────────────┘
-│  - timeout           │
-└──────────────────────┘
+┌────────────────────────────────────────────────────┐
+│  Main thread                  API thread           │
+│                                                    │
+│   wait on:                     HTTP POST           │
+│    - response ready     ───▶   to provider         │
+│    - interrupt event                               │
+│    - timeout                                       │
+└────────────────────────────────────────────────────┘
 ```
 
 When interrupted (user sends new message, `/stop` command, or signal):
diff --git a/website/docs/developer-guide/architecture.md b/website/docs/developer-guide/architecture.md
index 5b881c7e2..88ad96269 100644
--- a/website/docs/developer-guide/architecture.md
+++ b/website/docs/developer-guide/architecture.md
@@ -20,21 +20,21 @@ This page is the top-level map of Hermes Agent internals. Use it to orient yours
            │              │                       │
            ▼              ▼                       ▼
 ┌─────────────────────────────────────────────────────────────────────┐
-│                     AIAgent (run_agent.py)                           │
-│                                                                      │
-│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐                │
-│  │ Prompt        │ │ Provider     │ │ Tool         │                │
-│  │ Builder       │ │ Resolution   │ │ Dispatch     │                │
-│  │ (prompt_      │ │ (runtime_    │ │ (model_      │                │
-│  │  builder.py)  │ │  provider.py)│ │  tools.py)   │                │
-│  └──────┬───────┘ └──────┬───────┘ └──────┬───────┘                │
-│         │                │                │                          │
-│  ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐                │
-│  │ Compression  │ │ 3 API Modes  │ │ Tool Registry│                │
-│  │ & Caching    │ │ chat_compl.  │ │ (registry.py)│                │
-│  │              │ │ codex_resp.  │ │ 47 tools     │                │
-│  │              │ │ anthropic    │ │ 19 toolsets  │                │
-│  └──────────────┘ └──────────────┘ └──────────────┘                │
+│                     AIAgent (run_agent.py)                          │
+│                                                                     │
+│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐               │
+│  │ Prompt       │  │ Provider     │  │ Tool         │               │
+│  │ Builder      │  │ Resolution   │  │ Dispatch     │               │
+│  │ (prompt_     │  │ (runtime_    │  │ (model_      │               │
+│  │  builder.py) │  │  provider.py)│  │  tools.py)   │               │
+│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘               │
+│         │                 │                 │                       │
+│  ┌──────┴───────┐  ┌──────┴───────┐  ┌──────┴───────┐               │
+│  │ Compression  │  │ 3 API Modes  │  │ Tool Registry│               │
+│  │ & Caching    │  │ chat_compl.  │  │ (registry.py)│               │
+│  │              │  │ codex_resp.  │  │ 47 tools     │               │
+│  │              │  │ anthropic    │  │ 19 toolsets  │               │
+│  └──────────────┘  └──────────────┘  └──────────────┘               │
 └─────────────────────────────────────────────────────────────────────┘
            │                                    │
            ▼                                    ▼
diff --git a/website/docs/developer-guide/gateway-internals.md b/website/docs/developer-guide/gateway-internals.md
index f3a9942c8..3f9a46bec 100644
--- a/website/docs/developer-guide/gateway-internals.md
+++ b/website/docs/developer-guide/gateway-internals.md
@@ -27,25 +27,25 @@ The messaging gateway is the long-running process that connects Hermes to 14+ ex
 
 ```text
 ┌─────────────────────────────────────────────────┐
-│                 GatewayRunner                     │
-│                                                   │
+│                  GatewayRunner                  │
+│                                                 │
 │  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
-│  │ Telegram  │  │ Discord  │  │  Slack   │  ...  │
-│  │ Adapter   │  │ Adapter  │  │ Adapter  │       │
-│  └─────┬─────┘  └─────┬────┘  └─────┬────┘       │
-│        │              │              │             │
-│        └──────────────┼──────────────┘             │
-│                       ▼                            │
-│              _handle_message()                     │
-│                       │                            │
-│          ┌────────────┼────────────┐               │
-│          ▼            ▼            ▼               │
-│   Slash command   AIAgent      Queue/BG            │
-│    dispatch       creation     sessions            │
-│                       │                            │
-│                       ▼                            │
-│              SessionStore                          │
-│           (SQLite persistence)                     │
+│  │ Telegram │  │ Discord  │  │  Slack   │       │
+│  │ Adapter  │  │ Adapter  │  │ Adapter  │       │
+│  └────┬─────┘  └────┬─────┘  └────┬─────┘       │
+│       │             │             │             │
+│       └─────────────┼─────────────┘             │
+│                     ▼                           │
+│              _handle_message()                  │
+│                     │                           │
+│         ┌───────────┼───────────┐               │
+│         ▼           ▼           ▼               │
+│  Slash command   AIAgent    Queue/BG            │
+│    dispatch      creation   sessions            │
+│                     │                           │
+│                     ▼                           │
+│                 SessionStore                    │
+│              (SQLite persistence)               │
 └─────────────────────────────────────────────────┘
 ```