fix: sweep remaining provider-URL substring checks across codebase

Completes the hostname-hardening sweep — every substring check against a
provider host in live-routing code is now hostname-based. This closes the
same false-positive class for OpenRouter, GitHub Copilot, Kimi, Qwen,
ChatGPT/Codex, Bedrock, GitHub Models, Vercel AI Gateway, Nous, Z.AI,
Moonshot, Arcee, and MiniMax that the original PR closed for OpenAI, xAI,
and Anthropic.

New helper:
- utils.base_url_host_matches(base_url, domain) — safe counterpart to
  'domain in base_url'. Accepts hostname equality and subdomain matches;
  rejects path segments, host suffixes, and prefix collisions.

Call sites converted (real-code only; tests, optional-skills, red-teaming
scripts untouched):

run_agent.py (10 sites):
- AIAgent.__init__ Bedrock branch, ChatGPT/Codex branch (also path check)
- header cascade for openrouter / copilot / kimi / qwen / chatgpt
- interleaved-thinking trigger (openrouter + claude)
- _is_openrouter_url(), _is_qwen_portal()
- is_native_anthropic check
- github-models-vs-copilot detection (3 sites)
- reasoning-capable route gate (nousresearch, vercel, github)
- codex-backend detection in API kwargs build
- fallback api_mode Bedrock detection

agent/auxiliary_client.py (7 sites):
- extra-headers cascades in 4 distinct client-construction paths
  (resolve custom, resolve auto, OpenRouter-fallback-to-custom,
  _async_client_from_sync, resolve_provider_client explicit-custom,
  resolve_auto_with_codex)
- _is_openrouter_client() base_url sniff

agent/usage_pricing.py:
- resolve_billing_route openrouter branch

agent/model_metadata.py:
- _is_openrouter_base_url(), Bedrock context-length lookup

hermes_cli/providers.py:
- determine_api_mode Bedrock heuristic

hermes_cli/runtime_provider.py:
- _is_openrouter_url flag for API-key preference (issues #420, #560)

hermes_cli/doctor.py:
- Kimi User-Agent header for /models probes

tools/delegate_tool.py:
- subagent Codex endpoint detection

trajectory_compressor.py:
- _detect_provider() cascade (8 providers: openrouter, nous, codex, zai,
  kimi-coding, arcee, minimax-cn, minimax)

cli.py, gateway/run.py:
- /model-switch cache-enabled hint (openrouter + claude)

Bedrock detection tightened from 'bedrock-runtime in url' to
'hostname starts with bedrock-runtime. AND host is under amazonaws.com'.
ChatGPT/Codex detection tightened from 'chatgpt.com/backend-api/codex in
url' to 'hostname is chatgpt.com AND path contains /backend-api/codex'.

Tests:
- tests/test_base_url_hostname.py extended with a base_url_host_matches
  suite (exact match, subdomain, path-segment rejection, host-suffix
  rejection, host-prefix rejection, empty-input, case-insensitivity,
  trailing dot).

Validation: 651 targeted tests pass (runtime_provider, minimax, bedrock,
gemini, auxiliary, codex_cloudflare, usage_pricing, compressor_fallback,
fallback_model, openai_client_lifecycle, provider_parity, cli_provider_resolution,
delegate, credential_pool, context_compressor, plus the 4 hostname test
modules). 26-assertion E2E call-site verification across 6 modules passes.
This commit is contained in:
Teknium 2026-04-20 21:17:28 -07:00 committed by Teknium
parent cecf84daf7
commit dbb7e00e7e
13 changed files with 184 additions and 76 deletions

View file

@ -48,7 +48,7 @@ from openai import OpenAI
from agent.credential_pool import load_pool
from hermes_cli.config import get_hermes_home
from hermes_constants import OPENROUTER_BASE_URL
from utils import base_url_hostname
from utils import base_url_host_matches, base_url_hostname
logger = logging.getLogger(__name__)
@ -817,9 +817,9 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
if is_native_gemini_base_url(base_url):
return GeminiNativeClient(api_key=api_key, base_url=base_url), model
extra = {}
if "api.kimi.com" in base_url.lower():
if base_url_host_matches(base_url, "api.kimi.com"):
extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
elif "api.githubcopilot.com" in base_url.lower():
elif base_url_host_matches(base_url, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
@ -843,9 +843,9 @@ def _resolve_api_key_provider() -> Tuple[Optional[OpenAI], Optional[str]]:
if is_native_gemini_base_url(base_url):
return GeminiNativeClient(api_key=api_key, base_url=base_url), model
extra = {}
if "api.kimi.com" in base_url.lower():
if base_url_host_matches(base_url, "api.kimi.com"):
extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
elif "api.githubcopilot.com" in base_url.lower():
elif base_url_host_matches(base_url, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
@ -994,7 +994,7 @@ def _resolve_custom_runtime() -> Tuple[Optional[str], Optional[str], Optional[st
return None, None, None
custom_base = custom_base.strip().rstrip("/")
if "openrouter.ai" in custom_base.lower():
if base_url_host_matches(custom_base, "openrouter.ai"):
# requested='custom' falls back to OpenRouter when no custom endpoint is
# configured. Treat that as "no custom endpoint" for auxiliary routing.
return None, None, None
@ -1433,14 +1433,14 @@ def _to_async_client(sync_client, model: str):
"api_key": sync_client.api_key,
"base_url": str(sync_client.base_url),
}
base_lower = str(sync_client.base_url).lower()
if "openrouter" in base_lower:
sync_base_url = str(sync_client.base_url)
if base_url_host_matches(sync_base_url, "openrouter.ai"):
async_kwargs["default_headers"] = dict(_OR_HEADERS)
elif "api.githubcopilot.com" in base_lower:
elif base_url_host_matches(sync_base_url, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
async_kwargs["default_headers"] = copilot_default_headers()
elif "api.kimi.com" in base_lower:
elif base_url_host_matches(sync_base_url, "api.kimi.com"):
async_kwargs["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
return AsyncOpenAI(**async_kwargs), model
@ -1621,9 +1621,9 @@ def resolve_provider_client(
provider,
)
extra = {}
if "api.kimi.com" in custom_base.lower():
if base_url_host_matches(custom_base, "api.kimi.com"):
extra["default_headers"] = {"User-Agent": "KimiCLI/1.30.0"}
elif "api.githubcopilot.com" in custom_base.lower():
elif base_url_host_matches(custom_base, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
extra["default_headers"] = copilot_default_headers()
client = OpenAI(api_key=custom_key, base_url=custom_base, **extra)
@ -1728,9 +1728,9 @@ def resolve_provider_client(
# Provider-specific headers
headers = {}
if "api.kimi.com" in base_url.lower():
if base_url_host_matches(base_url, "api.kimi.com"):
headers["User-Agent"] = "KimiCLI/1.30.0"
elif "api.githubcopilot.com" in base_url.lower():
elif base_url_host_matches(base_url, "api.githubcopilot.com"):
from hermes_cli.models import copilot_default_headers
headers.update(copilot_default_headers())
@ -2154,7 +2154,7 @@ def cleanup_stale_async_clients() -> None:
def _is_openrouter_client(client: Any) -> bool:
for obj in (client, getattr(client, "_client", None), getattr(client, "client", None)):
if obj and "openrouter" in str(getattr(obj, "base_url", "") or "").lower():
if obj and base_url_host_matches(str(getattr(obj, "base_url", "") or ""), "openrouter.ai"):
return True
return False

View file

@ -14,7 +14,7 @@ from urllib.parse import urlparse
import requests
import yaml
from utils import base_url_hostname
from utils import base_url_host_matches, base_url_hostname
from hermes_constants import OPENROUTER_MODELS_URL
@ -220,7 +220,7 @@ def _auth_headers(api_key: str = "") -> Dict[str, str]:
def _is_openrouter_base_url(base_url: str) -> bool:
return "openrouter.ai" in _normalize_base_url(base_url).lower()
return base_url_host_matches(base_url, "openrouter.ai")
def _is_custom_endpoint(base_url: str) -> bool:
@ -1089,7 +1089,11 @@ def get_model_context_length(
# 4b. AWS Bedrock — use static context length table.
# Bedrock's ListFoundationModels doesn't expose context window sizes,
# so we maintain a curated table in bedrock_adapter.py.
if provider == "bedrock" or (base_url and "bedrock-runtime" in base_url):
if provider == "bedrock" or (
base_url
and base_url_hostname(base_url).startswith("bedrock-runtime.")
and base_url_host_matches(base_url, "amazonaws.com")
):
try:
from agent.bedrock_adapter import get_bedrock_context_length
return get_bedrock_context_length(model)

View file

@ -6,6 +6,7 @@ from decimal import Decimal
from typing import Any, Dict, Literal, Optional
from agent.model_metadata import fetch_endpoint_model_metadata, fetch_model_metadata
from utils import base_url_host_matches
DEFAULT_PRICING = {"input": 0.0, "output": 0.0}
@ -393,7 +394,7 @@ def resolve_billing_route(
if provider_name == "openai-codex":
return BillingRoute(provider="openai-codex", model=model, base_url=base_url or "", billing_mode="subscription_included")
if provider_name == "openrouter" or "openrouter.ai" in base:
if provider_name == "openrouter" or base_url_host_matches(base_url or "", "openrouter.ai"):
return BillingRoute(provider="openrouter", model=model, base_url=base_url or "", billing_mode="official_models_api")
if provider_name == "anthropic":
return BillingRoute(provider="anthropic", model=model.split("/")[-1], base_url=base_url or "", billing_mode="official_docs_snapshot")