fix(copilot-acp): tighten deprecation detection + sharpen GitHub Models 413 hint

Follow-up improvements on top of @konsisumer's cherry-picked fix for #10648:

1. Deprecation patterns now require BOTH a product fingerprint ('gh-copilot') and a deprecation marker. The previous list included 'copilot-cli' and bare 'deprecation', which would false-positive on stderr from the NEW @github/copilot CLI, whose repo is literally github.com/github/copilot-cli and which legitimately surfaces those substrings in its own messages.

2. Replace the deprecation hint. The user in #10648 ran 'gh extension install github/gh-copilot' (the deprecated extension) thinking that's what ACP mode uses, when ACP actually spawns the new 'copilot' binary from '@github/copilot'. The hint now points users at the correct install command ('npm install -g @github/copilot') with the new CLI's repo URL, and demotes provider-switching to a fallback alternative.

3. Change the _URL_TO_PROVIDER value for models.inference.ai.azure.com from the 'github-models' alias to the canonical 'copilot' provider id, matching the convention used by every other entry in the table.

4. Sharpen the 413 hint message. The free tier's ~8K cap is below the system-prompt floor, so this endpoint is fundamentally incompatible with an agentic loop; it is not a 'use a different URL' problem.

Tests:
- New parametrized false-positive coverage for the new CLI's stderr shape.
- Updated assertion to require canonical 'copilot' provider mapping.
- All 14 deprecation/URL tests pass.
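For illustration only, the two-part deprecation check described in point 1 could look roughly like the sketch below. The names `_PRODUCT_FINGERPRINT`, `_DEPRECATION_MARKERS`, and `looks_like_deprecated_extension`, as well as the marker strings, are hypothetical and are not taken from this commit; the real patterns live in the changed files.

```python
# Hypothetical sketch, not the code from this commit.
# A match requires BOTH the deprecated product's fingerprint and a
# deprecation marker, so stderr from the new @github/copilot CLI
# (which mentions 'copilot-cli') does not trigger it.

_PRODUCT_FINGERPRINT = "gh-copilot"                         # assumed fingerprint
_DEPRECATION_MARKERS = ("deprecated", "no longer supported")  # assumed markers


def looks_like_deprecated_extension(stderr_text: str) -> bool:
    """Return True only when fingerprint and marker both appear."""
    text = stderr_text.lower()
    has_fingerprint = _PRODUCT_FINGERPRINT in text
    has_marker = any(marker in text for marker in _DEPRECATION_MARKERS)
    return has_fingerprint and has_marker


if __name__ == "__main__":
    samples = [
        "The gh-copilot extension has been deprecated.",
        "copilot-cli: DeprecationWarning: some module is deprecated",
        "gh-copilot: command not found",
    ]
    for line in samples:
        print(looks_like_deprecated_extension(line), "<-", line)
```

Requiring both substrings trades a little recall for precision: a bare 'deprecated' warning from an unrelated dependency, or a bare mention of 'copilot-cli', is ignored.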
2026-05-18 04:41:56 +00:00 · 2026-05-16 01:58:13 -07:00 · 2026-05-16 01:58:13 -07:00 · 374dc81c23
commit 374dc81c23
parent b85b938b1f
4 changed files with 84 additions and 38 deletions
--- a/run_agent.py
+++ b/run_agent.py
@@ -14185,29 +14185,35 @@ class AIAgent:
                        }
                    
                    # Actionable hint for GitHub Models (Azure) 413 errors.
-                    # The free tier enforces a hard 8K token limit per request,
-                    # which Hermes' system prompt alone can exceed.  Compression
-                    # won't help — surface a clear message so the user doesn't
-                    # wait through three futile compression attempts.
+                    # The free tier enforces a hard 8K token cap per request,
+                    # which Hermes' system prompt + tool schemas alone exceed.
+                    # Compression can't help — the floor is the system prompt
+                    # itself, not the conversation — so surface a clear "not
+                    # compatible" message instead of looping into three futile
+                    # compression attempts.
                    if (
                        status_code == 413
                        and isinstance(_base, str)
                        and "models.inference.ai.azure.com" in _base
                    ):
                        self._vprint(
-                            f"{self.log_prefix}   💡 GitHub Models (Azure) enforces a hard per-request token limit (often 8K).",
+                            f"{self.log_prefix}   💡 GitHub Models free tier (models.inference.ai.azure.com) caps every",
                            force=True,
                        )
                        self._vprint(
-                            f"{self.log_prefix}      Hermes' system prompt alone may exceed this limit.  This endpoint is not",
+                            f"{self.log_prefix}      request at ~8K tokens. Hermes' system prompt + tool schemas baseline",
                            force=True,
                        )
                        self._vprint(
-                            f"{self.log_prefix}      compatible with Hermes Agent.  Use https://models.github.ai or the GitHub",
+                            f"{self.log_prefix}      exceeds that floor, so this endpoint cannot run an agentic loop.",
                            force=True,
                        )
                        self._vprint(
-                            f"{self.log_prefix}      Copilot provider instead, which have higher token limits.",
+                            f"{self.log_prefix}      Use the `copilot` provider with a Copilot subscription token (`hermes",
+                            force=True,
+                        )
+                        self._vprint(
+                            f"{self.log_prefix}      setup` → GitHub Copilot), or pick any other provider.",
                            force=True,
                        )