* remove Vercel AI Gateway provider and Vercel Sandbox terminal backend
Both Vercel-hosted integrations are removed end-to-end. Users on the AI
Gateway should switch to OpenRouter or one of the other aggregators
(Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should
switch to Docker, Modal, Daytona, or SSH.
What's removed:
- `plugins/model-providers/ai-gateway/` provider plugin
- `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper
- `tools/environments/vercel_sandbox.py` terminal backend
- `ai-gateway` provider wiring across auth, doctor, setup, models,
config, status, providers, main, web_server, model_normalize, dump
- `vercel_sandbox` backend wiring across terminal_tool, file_tools,
code_execution_tool, file_operations, approval, skills_tool,
environments/local, credential_files, lazy_deps, prompt_builder,
cli, gateway/run
- `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client
header set, run_agent base-URL header/reasoning special-cases
- `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock
- env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`,
`VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`,
`TERMINAL_VERCEL_RUNTIME`
- Tests: deletes test_ai_gateway_models.py and
test_vercel_sandbox_environment.py; scrubs references across 23
surviving test files (no entire tests deleted unless they were
dedicated to AI Gateway / Sandbox)
- Docs: provider tables, env-var reference, setup guides, security
notes, tool config, terminal-backend tables — English plus zh-Hans
i18n parity
- `hermes-agent` skill: provider table entry and remote-backend list
What stays (intentional):
- `popular-web-designs/templates/vercel.md` — CSS design reference,
unrelated to Vercel-the-AI-product
- `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN
response header, useful diag signal on any Vercel-hosted endpoint
- `vercel-labs/agent-browser` URL in browser config — lightpanda
browser project, different OSS effort
- `userStories.json` historical contributor entry mentioning Vercel
Sandbox — archive, not active docs
Validation:
- 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`)
- Full repo `py_compile` clean
- Live import of every touched module + invariant check (no
`ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no
`vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`)
* test: convert profile-count check from change-detector to invariant
The hardcoded "== 34" assertion broke when ai-gateway was removed.
Per AGENTS.md change-detector-test guidance, assert the relationship
(registry count >= number of plugin dirs) instead of a literal count.
Counts shift when providers are added/removed; that's expected.
The previous revision of this PR added six GMI-specific branches
(`elif base_url_host_matches(..., 'api.gmi-serving.com')`) across
run_agent.py and agent/auxiliary_client.py, plus a _HERMES_UA_HEADERS
constant in auxiliary_client.py.
ProviderProfile already has a `default_headers: dict[str, str]` field
commented as 'Client-level quirks (set once at client construction)'.
Other plugins (ai-gateway, kimi-coding) already use it. Two of the four
auxiliary_client sites we previously patched already had a generic
`else: profile.default_headers` fallback that picked it up (so did
both run_agent sites).
This revision:
* Sets `default_headers={'User-Agent': 'HermesAgent/<ver>'}` on the
GMI profile in plugins/model-providers/gmi/__init__.py.
* Reverts all six GMI-specific branches in run_agent.py and
auxiliary_client.py.
* Adds the generic profile-fallback `else` block to the two
auxiliary_client sites (`_to_async_client`, `resolve_provider_client`)
that didn't have it yet. This benefits every provider whose profile
declares default_headers, not just GMI — e.g. Vercel AI Gateway's
HTTP-Referer/X-Title now flow through the async client path too.
* Replaces the GMI-specific URL-branch tests with a profile-level
assertion and keeps the run_agent integration test (with
`provider='gmi'` so the fallback picks up the profile).
Net diff vs main: +82/-0 across 5 files, touching only the GMI plugin,
two generic fallback blocks in auxiliary_client.py, AUTHOR_MAP, and
tests. No core files change.
Based on #20907 by @isaachuangGMICLOUD.
Introduces providers/ package — single source of truth for every
inference provider. Adding a simple api-key provider now requires one
providers/<name>.py file with zero edits anywhere else.
What this PR ships:
- providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes)
- ProviderProfile declarative fields: name, api_mode, aliases, display_name,
env_vars, base_url, models_url, auth_type, fallback_models, hostname,
default_headers, fixed_temperature, default_max_tokens, default_aux_model
- 4 overridable hooks: prepare_messages, build_extra_body,
build_api_kwargs_extras, fetch_models
- chat_completions.build_kwargs: profile path via _build_kwargs_from_profile,
legacy flag path retained for lmstudio/tencent-tokenhub (which have
session-aware reasoning probing that doesn't map cleanly to hooks yet)
- run_agent.py: profile path for all registered providers; legacy path
variable scoping fixed (all flags defined before branching)
- Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS,
doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER
- GeminiProfile: thinking_config translation (native + openai-compat nested)
- New tests/providers/ (79 tests covering profile declarations, transport
parity, hook overrides, e2e kwargs assembly)
Deltas vs original PR (salvaged onto current main):
- Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth
(were added to main since original PR)
- Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their
reasoning_effort probing has no clean hook equivalent yet)
- Removed lmstudio alias from custom profile (it's a separate provider now)
- Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension
(resolve_provider special-cases them; adding breaks runtime resolution)
- runtime_provider: profile.api_mode only as fallback when URL detection
finds nothing (was breaking minimax /v1 override)
- Preserved main's legacy-path improvements: deepseek reasoning_content
preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M
beta recovery, etc.
- Kept agent/copilot_acp_client.py in place (rejected PR's relocation —
main has 7 fixes landed since; relocation would revert them)
- _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing
test imports
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Closes#14418
- config.py: remove dead ENV_VARS_BY_VERSION[17] entry (current _config_version
is 22, so all users are past version 17 and would never be prompted for
GMI_API_KEY on upgrade — consistent with how arcee was added)
- auxiliary_client.py: use google/gemini-3.1-flash-lite-preview as GMI aux
model instead of anthropic/claude-opus-4.6 (matches cheap fast-model pattern
used by all other providers: zai→glm-4.5-flash, kimi→kimi-k2-turbo-preview,
stepfun→step-3.5-flash, kilocode→google/gemini-3-flash-preview)
- test_gmi_provider.py: fix malformed write_text() call in doctor test
(was: write_text("GMI_API_KEY=*** encoding="utf-8") → missing closing quote,
wrote literal string 'GMI_API_KEY=*** encoding=' to .env file)
- test_gmi_provider.py + test_auxiliary_client.py: update aux model assertions
to match new cheaper default
- docs/integrations/providers.md: add 'gmi' to inline 'Supported providers'
fallback list (was only in the table, not the inline list at line ~1181)
- docs/reference/cli-commands.md: add 'gmi' to --provider choices list