fix: align auth-by-message classification with status-code path, decode URLs before secret check

error_classifier.py: Message-only auth errors ("invalid api key", "unauthorized", etc.) were classified as retryable=True (line 707), inconsistent with the HTTP 401 path (line 432) which correctly uses retryable=False + should_fallback=True. The mismatch causes 3 wasted retries with the same broken credential before fallback, while 401 errors immediately attempt fallback. Align the message-based path to match: retryable=False, should_fallback=True.

web_tools.py: The _PREFIX_RE secret-detection check in web_extract_tool() runs against the raw URL string (line 1196). URL-encoded secrets like %73k-1234... (= sk-1234...) bypass the filter because the regex expects literal ASCII. Add urllib.parse.unquote() before the check so percent-encoded variants are also caught.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
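The percent-encoding bypass described for web_tools.py can be reproduced with a minimal sketch. It assumes a hypothetical stand-in pattern, SECRET_RE, because the real agent.redact._PREFIX_RE is not shown in this commit; the helper name url_contains_secret and the example URLs are also illustrative only.

```python
import re
from urllib.parse import unquote

# Hypothetical stand-in for agent.redact._PREFIX_RE (the real pattern is not shown here).
SECRET_RE = re.compile(r"sk-[A-Za-z0-9_-]{10,}")


def url_contains_secret(url: str) -> bool:
    """Return True if the raw or the percent-decoded URL matches the secret pattern."""
    return bool(SECRET_RE.search(url) or SECRET_RE.search(unquote(url)))


# Illustrative URLs: the same fake key, once literal and once with "s" encoded as %73.
raw_url = "https://example.com/?q=sk-abcdefghijklmnop"
encoded_url = "https://example.com/?q=%73k-abcdefghijklmnop"

# The raw-only check misses the encoded form; the combined check catches both.
raw_only_hit = bool(SECRET_RE.search(encoded_url))
combined_hit = url_contains_secret(encoded_url)
```

Checking both the raw and the decoded string, as the patch does, keeps the existing behavior for literal matches. A single unquote() pass does not unwrap double-encoded input such as %2573k-, so this sketch would not catch that variant.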
2026-04-25 00:51:20 +00:00 · 2026-04-10 12:00:31 +08:00
commit 738f0bac13
parent 37bb4f807b
2 changed files with 5 additions and 2 deletions
--- a/tools/web_tools.py
+++ b/tools/web_tools.py
@@ -1190,10 +1190,12 @@ async def web_extract_tool(
    Raises:
        Exception: If extraction fails or API key is not set
    """
-    # Block URLs containing embedded secrets (exfiltration prevention)
+    # Block URLs containing embedded secrets (exfiltration prevention).
+    # URL-decode first so percent-encoded secrets (%73k- = sk-) are caught.
    from agent.redact import _PREFIX_RE
+    from urllib.parse import unquote
    for _url in urls:
-        if _PREFIX_RE.search(_url):
+        if _PREFIX_RE.search(_url) or _PREFIX_RE.search(unquote(_url)):
            return json.dumps({
                "success": False,
                "error": "Blocked: URL contains what appears to be an API key or token. "