fix(error_classifier): don't classify generic 404 as model_not_found (#14013)

The 404 branch in _classify_by_status had dead code: the generic
fallback below the _MODEL_NOT_FOUND_PATTERNS check returned the
exact same classification (model_not_found + should_fallback=True),
so every 404 — regardless of message — was treated as a missing model.

This bites local-endpoint users (llama.cpp, Ollama, vLLM) whose 404s
usually mean a wrong endpoint path, proxy routing glitch, or transient
backend issue — not a missing model. Claiming 'model not found' misleads
the next turn and silently falls back to another provider when the real
problem was a URL typo the user should see.

Fix: only classify 404 as model_not_found when the message actually
matches _MODEL_NOT_FOUND_PATTERNS ("invalid model", "model not found",
etc.). Otherwise fall through as unknown (retryable) so the real error
surfaces in the retry loop.

Test updated to match the new behavior. 103 error_classifier tests pass.
This commit is contained in:
Teknium 2026-04-22 06:11:47 -07:00 committed by GitHub
parent 40619b393f
commit 77e04a29d5
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 16 additions and 5 deletions

View file

@ -470,11 +470,16 @@ def _classify_by_status(
retryable=False,
should_fallback=True,
)
# Generic 404 — could be model or endpoint
# Generic 404 with no "model not found" signal — could be a wrong
# endpoint path (common with local llama.cpp / Ollama / vLLM when
# the URL is slightly misconfigured), a proxy routing glitch, or
# a transient backend issue. Classifying these as model_not_found
# silently falls back to a different provider and tells the model
# the model is missing, which is wrong and wastes a turn. Treat
# as unknown so the retry loop surfaces the real error instead.
return result_fn(
FailoverReason.model_not_found,
retryable=False,
should_fallback=True,
FailoverReason.unknown,
retryable=True,
)
if status_code == 413: