mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
fix(copilot-acp): tighten deprecation detection + sharpen GitHub Models 413 hint
Follow-up improvements on top of @konsisumer's cherry-picked fix for #10648:

1. Deprecation patterns now require BOTH a product fingerprint ('gh-copilot') and a deprecation marker. The previous list included 'copilot-cli' and bare 'deprecation', which would false-positive on stderr from the NEW @github/copilot CLI, whose repo is literally github.com/github/copilot-cli and which legitimately surfaces those substrings in its own messages.

2. Replace the deprecation hint. The user in #10648 installed 'gh extension install github/gh-copilot' (the deprecated extension) thinking that's what ACP mode uses, when ACP actually spawns the new 'copilot' binary from '@github/copilot'. The hint now points users at the correct install command ('npm install -g @github/copilot') with the new CLI's repo URL, and demotes provider-switching to a fallback alternative.

3. Change the _URL_TO_PROVIDER value for models.inference.ai.azure.com from the 'github-models' alias to the canonical 'copilot' provider id, matching the convention used by every other entry in the table.

4. Sharpen the 413 hint message. The free tier's ~8K cap is below the system-prompt floor, so this endpoint is fundamentally incompatible with an agentic loop; it is not a 'use a different URL' problem.

Tests:
- New parametrized false-positive coverage for the new CLI's stderr shape.
- Updated assertion to require canonical 'copilot' provider mapping.
- All 14 deprecation/URL tests pass.
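The "fingerprint AND marker" rule from item 1 can be sketched as below. This is a minimal illustration, not the code from this commit: the function name, the per-line matching, and the single 'deprecat' marker stem are assumptions made for the example. Only the 'gh-copilot' fingerprint and the two-part requirement come from the commit message.

```python
# Hypothetical sketch of the two-part check described in item 1.
# Only the 'gh-copilot' fingerprint is taken from the commit message;
# the marker list and per-line matching are assumptions.
_FINGERPRINT = "gh-copilot"
_MARKERS = ("deprecat",)  # assumed stem covering "deprecated" / "deprecation"


def is_deprecated_gh_copilot(stderr_text: str) -> bool:
    """Return True only if one stderr line names gh-copilot AND a deprecation marker."""
    for line in stderr_text.splitlines():
        lowered = line.lower()
        has_fingerprint = _FINGERPRINT in lowered
        has_marker = any(marker in lowered for marker in _MARKERS)
        if has_fingerprint and has_marker:
            return True
    return False
```

Under these assumptions, a line that mentions only github.com/github/copilot-cli alongside the word "deprecation" does not match, because 'gh-copilot' is not a substring of 'github/copilot-cli'.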
parent b85b938b1f
commit 374dc81c23
4 changed files with 84 additions and 38 deletions
22
run_agent.py
@@ -14185,29 +14185,35 @@ class AIAgent:
 }
 
 # Actionable hint for GitHub Models (Azure) 413 errors.
-# The free tier enforces a hard 8K token limit per request,
-# which Hermes' system prompt alone can exceed. Compression
-# won't help — surface a clear message so the user doesn't
-# wait through three futile compression attempts.
+# The free tier enforces a hard 8K token cap per request,
+# which Hermes' system prompt + tool schemas alone exceed.
+# Compression can't help — the floor is the system prompt
+# itself, not the conversation — so surface a clear "not
+# compatible" message instead of looping into three futile
+# compression attempts.
 if (
     status_code == 413
     and isinstance(_base, str)
     and "models.inference.ai.azure.com" in _base
 ):
     self._vprint(
-        f"{self.log_prefix} 💡 GitHub Models (Azure) enforces a hard per-request token limit (often 8K).",
+        f"{self.log_prefix} 💡 GitHub Models free tier (models.inference.ai.azure.com) caps every",
         force=True,
     )
     self._vprint(
-        f"{self.log_prefix} Hermes' system prompt alone may exceed this limit. This endpoint is not",
+        f"{self.log_prefix} request at ~8K tokens. Hermes' system prompt + tool schemas baseline",
         force=True,
     )
     self._vprint(
-        f"{self.log_prefix} compatible with Hermes Agent. Use https://models.github.ai or the GitHub",
+        f"{self.log_prefix} exceeds that floor, so this endpoint cannot run an agentic loop.",
         force=True,
     )
     self._vprint(
-        f"{self.log_prefix} Copilot provider instead, which have higher token limits.",
+        f"{self.log_prefix} Use the `copilot` provider with a Copilot subscription token (`hermes",
         force=True,
     )
+    self._vprint(
+        f"{self.log_prefix} setup` → GitHub Copilot), or pick any other provider.",
+        force=True,
+    )
 
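For readers who want to exercise the guard in isolation, the condition in the hunk above and the mapping change from item 3 can be sketched as two small pure functions. This is an assumed standalone form, not code from run_agent.py: the function names and the one-entry table are hypothetical, and the real _URL_TO_PROVIDER table has other entries that are not shown on this page.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical one-entry stand-in for the _URL_TO_PROVIDER table named in
# the commit message; the real table's other entries are not reproduced.
URL_TO_PROVIDER = {"models.inference.ai.azure.com": "copilot"}

GITHUB_MODELS_FREE_HOST = "models.inference.ai.azure.com"


def provider_for_base_url(base_url: str) -> Optional[str]:
    """Map a base URL to a provider id by hostname (assumed lookup strategy)."""
    host = urlparse(base_url).hostname
    return URL_TO_PROVIDER.get(host)


def is_github_models_413(status_code: int, base_url: object) -> bool:
    """Mirror the guard in the hunk: HTTP 413 from the free-tier endpoint."""
    return bool(
        status_code == 413
        and isinstance(base_url, str)
        and GITHUB_MODELS_FREE_HOST in base_url
    )
```

Keeping the check free of side effects means the 413 branch can be covered by plain assertions without constructing an AIAgent, which is one possible way to test the hint logic.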