fix(error_classifier): don't classify generic 404 as model_not_found (#14013)

The 404 branch in _classify_by_status had dead code: the generic
fallback below the _MODEL_NOT_FOUND_PATTERNS check returned the
exact same classification (model_not_found + should_fallback=True),
so every 404 — regardless of message — was treated as a missing model.

This bites local-endpoint users (llama.cpp, Ollama, vLLM) whose 404s
usually mean a wrong endpoint path, proxy routing glitch, or transient
backend issue — not a missing model. Claiming 'model not found' misleads
the next turn and silently falls back to another provider when the real
problem was a URL typo the user should see.

Fix: only classify 404 as model_not_found when the message actually
matches _MODEL_NOT_FOUND_PATTERNS ("invalid model", "model not found",
etc.). Otherwise fall through as unknown (retryable) so the real error
surfaces in the retry loop.

Test updated to match the new behavior. 103 error_classifier tests pass.
This commit is contained in:
Teknium 2026-04-22 06:11:47 -07:00 committed by GitHub
parent 40619b393f
commit 77e04a29d5
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 16 additions and 5 deletions

View file

@ -298,9 +298,15 @@ class TestClassifyApiError:
assert result.retryable is False
def test_404_generic(self):
# Generic 404 with no "model not found" signal — common for local
# llama.cpp/Ollama/vLLM endpoints with slightly wrong paths. Treat
# as unknown (retryable) so the real error surfaces, rather than
# claiming the model is missing and silently falling back.
e = MockAPIError("Not Found", status_code=404)
result = classify_api_error(e)
assert result.reason == FailoverReason.model_not_found
assert result.reason == FailoverReason.unknown
assert result.retryable is True
assert result.should_fallback is False
# ── Payload too large ──