docs: stabilize website diagrams

teknium1 2026-03-14 22:49:57 -07:00
parent d5b64ebdb3
commit 259208bfe4
18 changed files with 1504 additions and 112 deletions


@@ -207,16 +207,17 @@ honcho: {}
Honcho context is fetched asynchronously to avoid blocking the response path:
```
Turn N:
user message
→ consume cached context (from previous turn's background fetch)
→ inject into system prompt (user representation, AI representation, dialectic)
→ LLM call
→ response
→ fire background fetch for next turn
→ fetch context ─┐
  → fetch dialectic ─┴→ cache for Turn N+1
```
```mermaid
flowchart TD
user["User message"] --> cache["Consume cached Honcho context<br/>from the previous turn"]
cache --> prompt["Inject user, AI, and dialectic context<br/>into the system prompt"]
prompt --> llm["LLM call"]
llm --> response["Assistant response"]
response --> fetch["Start background fetch for Turn N+1"]
fetch --> ctx["Fetch context"]
fetch --> dia["Fetch dialectic"]
ctx --> next["Cache for the next turn"]
dia --> next
```
Turn 1 is a cold start (no cache). All subsequent turns consume cached results with zero HTTP latency on the response path. The system prompt on turn 1 uses only static context to preserve prefix cache hits at the LLM provider.