remove Vercel AI Gateway and Vercel Sandbox (#33067)

* remove Vercel AI Gateway provider and Vercel Sandbox terminal backend

Both Vercel-hosted integrations are removed end-to-end. Users on the AI
Gateway should switch to OpenRouter or one of the other aggregators
(Nous Portal, Kilo Code). Users on the Vercel Sandbox backend should
switch to Docker, Modal, Daytona, or SSH.

What's removed:
- `plugins/model-providers/ai-gateway/` provider plugin
- `hermes_cli/vercel_auth.py` Vercel-Sandbox auth helper
- `tools/environments/vercel_sandbox.py` terminal backend
- `ai-gateway` provider wiring across auth, doctor, setup, models,
  config, status, providers, main, web_server, model_normalize, dump
- `vercel_sandbox` backend wiring across terminal_tool, file_tools,
  code_execution_tool, file_operations, approval, skills_tool,
  environments/local, credential_files, lazy_deps, prompt_builder,
  cli, gateway/run
- `AI_GATEWAY_BASE_URL` constant, `_AI_GATEWAY_HEADERS` auxiliary-client
  header set, run_agent base-URL header/reasoning special-cases
- `[vercel]` pyproject extra and `vercel`/`vercel-workers` from uv.lock
- env vars: `AI_GATEWAY_API_KEY`, `AI_GATEWAY_BASE_URL`, `VERCEL_TOKEN`,
  `VERCEL_PROJECT_ID`, `VERCEL_TEAM_ID`, `VERCEL_OIDC_TOKEN`,
  `TERMINAL_VERCEL_RUNTIME`
- Tests: deletes test_ai_gateway_models.py and
  test_vercel_sandbox_environment.py; scrubs references across 23
  surviving test files (no entire tests deleted unless they were
  dedicated to AI Gateway / Sandbox)
- Docs: provider tables, env-var reference, setup guides, security
  notes, tool config, terminal-backend tables — English plus zh-Hans
  i18n parity
- `hermes-agent` skill: provider table entry and remote-backend list

What stays (intentional):
- `popular-web-designs/templates/vercel.md` — CSS design reference,
  unrelated to Vercel-the-AI-product
- `x-vercel-id` in `stream_diag.py` headers — generic Vercel CDN
  response header, useful diag signal on any Vercel-hosted endpoint
- `vercel-labs/agent-browser` URL in browser config — lightpanda
  browser project, different OSS effort
- `userStories.json` historical contributor entry mentioning Vercel
  Sandbox — archive, not active docs

Validation:
- 1153 tests in the 22 targeted files pass (`scripts/run_tests.sh`)
- Full repo `py_compile` clean
- Live import of every touched module + invariant check (no
  `ai-gateway` in `PROVIDER_REGISTRY`, no `_AI_GATEWAY_HEADERS`, no
  `vercel_sandbox` in `_REMOTE_TERMINAL_BACKENDS`)

* test: convert profile-count check from change-detector to invariant

The hardcoded "== 34" assertion broke when ai-gateway was removed.
Per AGENTS.md change-detector-test guidance, assert the relationship
(registry count >= number of plugin dirs) instead of a literal count.
Counts shift when providers are added/removed; that's expected.
This commit is contained in:
Teknium 2026-05-27 00:43:32 -07:00 committed by GitHub
parent cb38ce28cb
commit febc4cfec0
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
95 changed files with 111 additions and 3088 deletions

View file

@ -83,11 +83,11 @@ Leaving these unset keeps the legacy defaults (`HERMES_API_TIMEOUT=1800`s, `HERM
## Terminal Backend Configuration
Hermes supports seven terminal backends. Each determines where the agent's shell commands actually execute — your local machine, a Docker container, a remote server via SSH, a Modal cloud sandbox (direct or via the Nous-managed gateway), a Daytona workspace, a Vercel Sandbox, or a Singularity/Apptainer container.
Hermes supports six terminal backends. Each determines where the agent's shell commands actually execute — your local machine, a Docker container, a remote server via SSH, a Modal cloud sandbox (direct or via the Nous-managed gateway), a Daytona workspace, or a Singularity/Apptainer container.
```yaml
terminal:
backend: local # local | docker | ssh | modal | daytona | vercel_sandbox | singularity
backend: local # local | docker | ssh | modal | daytona | singularity
cwd: "." # Gateway/cron working directory (CLI always uses launch dir)
timeout: 180 # Per-command timeout in seconds
env_passthrough: [] # Env var names to forward to sandboxed execution (terminal + execute_code)
@ -96,7 +96,7 @@ terminal:
daytona_image: "nikolaik/python-nodejs:python3.11-nodejs20" # Container image for Daytona backend
```
For cloud sandboxes such as Modal, Daytona, and Vercel Sandbox, `container_persistent: true` means Hermes will try to preserve filesystem state across sandbox recreation. It does not promise that the same live sandbox, PID space, or background processes will still be running later.
For cloud sandboxes such as Modal and Daytona, `container_persistent: true` means Hermes will try to preserve filesystem state across sandbox recreation. It does not promise that the same live sandbox, PID space, or background processes will still be running later.
### Backend Overview
@ -107,7 +107,6 @@ For cloud sandboxes such as Modal, Daytona, and Vercel Sandbox, `container_persi
| **ssh** | Remote server via SSH | Network boundary | Remote dev, powerful hardware |
| **modal** | Modal cloud sandbox | Full (cloud VM) | Ephemeral cloud compute, evals |
| **daytona** | Daytona workspace | Full (cloud container) | Managed cloud dev environments |
| **vercel_sandbox** | Vercel Sandbox | Full (cloud microVM) | Cloud execution with snapshot-backed filesystem persistence |
| **singularity** | Singularity/Apptainer container | Namespaces (--containall) | HPC clusters, shared machines |
### Local Backend
@ -232,49 +231,6 @@ terminal:
**Disk limit:** Daytona enforces a 10 GiB maximum. Requests above this are capped with a warning.
### Vercel Sandbox Backend
Runs commands in a [Vercel Sandbox](https://vercel.com/docs/vercel-sandbox) cloud microVM. Hermes uses the normal terminal and file tool surfaces; there are no Vercel-specific model-facing tools.
```yaml
terminal:
backend: vercel_sandbox
vercel_runtime: node24 # node24 | node22 | python3.13
cwd: /vercel/sandbox # default workspace root
container_persistent: true # Snapshot/restore filesystem
container_disk: 51200 # Shared default only; custom disk is unsupported
```
**Required install:** Install the optional SDK extra:
```bash
pip install 'hermes-agent[vercel]'
```
**Required authentication:** Configure access-token auth with all three of `VERCEL_TOKEN`, `VERCEL_PROJECT_ID`, and `VERCEL_TEAM_ID`. This is the supported setup for deployments and normal long-running Hermes processes on Render, Railway, Docker, and similar hosts.
For one-off local development, Hermes also accepts short-lived Vercel OIDC tokens:
```bash
VERCEL_OIDC_TOKEN="$(vc project token <project-name>)" hermes chat
```
From a linked Vercel project directory, you can omit the project name:
```bash
VERCEL_OIDC_TOKEN="$(vc project token)" hermes chat
```
OIDC tokens are short-lived and should not be used as the documented deployment path.
**Runtime:** `terminal.vercel_runtime` supports `node24`, `node22`, and `python3.13`. If unset, Hermes defaults to `node24`.
**Persistence:** When `container_persistent: true`, Hermes snapshots the sandbox filesystem during cleanup and restores a later sandbox for the same task from that snapshot. Snapshot contents can include Hermes-synced credentials, skills, and cache files that were copied into the sandbox. This preserves filesystem state only; it does not preserve live sandbox identity, PID space, shell state, or running background processes.
**Background commands:** `terminal(background=true)` uses Hermes' generic non-local background process flow. You can spawn, poll, wait, view logs, and kill processes through the normal process tool while the sandbox is alive. Hermes does not provide native Vercel detached-process recovery after cleanup or restart.
**Disk sizing:** Vercel Sandbox does not currently support Hermes' `container_disk` resource knob. Leave `container_disk` unset or at the shared default `51200`; non-default values fail diagnostics and backend creation instead of being silently ignored.
### Singularity/Apptainer Backend
Runs commands in a [Singularity/Apptainer](https://apptainer.org) container. Designed for HPC clusters and shared machines where Docker isn't available.
@ -829,7 +785,7 @@ Every model slot in Hermes — auxiliary tasks, compression, fallback — uses t
When `base_url` is set, Hermes ignores the provider and calls that endpoint directly (using `api_key` or `OPENAI_API_KEY` for auth). When only `provider` is set, Hermes uses that provider's built-in auth and base URL.
Available providers for auxiliary tasks: `auto`, `main`, plus any provider in the [provider registry](/reference/environment-variables) — `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `xai-oauth`, `ollama-cloud`, `alibaba`, `bedrock`, `huggingface`, `arcee`, `xiaomi`, `kilocode`, `opencode-zen`, `opencode-go`, `ai-gateway`, `azure-foundry` — or any named custom provider from your `custom_providers` list (e.g. `provider: "beans"`).
Available providers for auxiliary tasks: `auto`, `main`, plus any provider in the [provider registry](/reference/environment-variables) — `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `gemini`, `google-gemini-cli`, `qwen-oauth`, `zai`, `kimi-coding`, `kimi-coding-cn`, `minimax`, `minimax-cn`, `minimax-oauth`, `deepseek`, `nvidia`, `xai`, `xai-oauth`, `ollama-cloud`, `alibaba`, `bedrock`, `huggingface`, `arcee`, `xiaomi`, `kilocode`, `opencode-zen`, `opencode-go`, `azure-foundry` — or any named custom provider from your `custom_providers` list (e.g. `provider: "beans"`).
:::tip MiniMax OAuth
`minimax-oauth` logs in via browser OAuth (no API key needed). Run `hermes model` and select **MiniMax (OAuth)** to authenticate. Auxiliary tasks use `MiniMax-M2.7-highspeed` automatically. See the [MiniMax OAuth guide](../guides/minimax-oauth.md).

View file

@ -47,7 +47,6 @@ Both `provider` and `model` are **required**. If either is missing, the fallback
| Provider | Value | Requirements |
|----------|-------|-------------|
| AI Gateway | `ai-gateway` | `AI_GATEWAY_API_KEY` |
| OpenRouter | `openrouter` | `OPENROUTER_API_KEY` |
| Nous Portal | `nous` | `hermes setup --portal` (fresh) or `hermes auth add nous` (OAuth) |
| OpenAI Codex | `openai-codex` | `hermes model` (ChatGPT OAuth) |

View file

@ -65,14 +65,13 @@ The terminal tool can execute commands in different environments:
| `singularity` | HPC containers | Cluster computing, rootless |
| `modal` | Cloud execution | Serverless, scale |
| `daytona` | Cloud sandbox workspace | Persistent remote dev environments |
| `vercel_sandbox` | Vercel Sandbox cloud microVM | Cloud execution with snapshot-backed filesystem persistence |
### Configuration
```yaml
# In ~/.hermes/config.yaml
terminal:
backend: local # or: docker, ssh, singularity, modal, daytona, vercel_sandbox
backend: local # or: docker, ssh, singularity, modal, daytona
cwd: "." # Working directory
timeout: 180 # Command timeout in seconds
```
@ -123,41 +122,13 @@ modal setup
hermes config set terminal.backend modal
```
### Vercel Sandbox
```bash
pip install 'hermes-agent[vercel]'
hermes config set terminal.backend vercel_sandbox
hermes config set terminal.vercel_runtime node24
```
Authenticate with all three of `VERCEL_TOKEN`, `VERCEL_PROJECT_ID`, and `VERCEL_TEAM_ID`. This access-token setup is the supported path for deployments and normal long-running Hermes processes on Render, Railway, Docker, and similar hosts. Supported runtimes are `node24`, `node22`, and `python3.13`; Hermes defaults to `/vercel/sandbox` as the remote workspace root.
For one-off local development, Hermes also accepts short-lived Vercel OIDC tokens:
```bash
VERCEL_OIDC_TOKEN="$(vc project token <project-name>)" hermes chat
```
From a linked Vercel project directory:
```bash
VERCEL_OIDC_TOKEN="$(vc project token)" hermes chat
```
With `container_persistent: true`, Hermes uses Vercel snapshots to preserve filesystem state across sandbox recreation for the same task. This can include Hermes-synced credentials, skills, and cache files inside the sandbox. Snapshots do not preserve live processes, PID space, or the same live sandbox identity.
Background terminal commands use Hermes' generic non-local process flow: spawn, poll, wait, log, and kill work through the normal process tool while the sandbox is alive, but Hermes does not provide native Vercel detached-process recovery after cleanup or restart.
Leave `container_disk` unset or at the shared default `51200`; custom disk sizing is unsupported for Vercel Sandbox and will fail diagnostics/backend creation.
### Container Resources
Configure CPU, memory, disk, and persistence for all container backends:
```yaml
terminal:
backend: docker # or singularity, modal, daytona, vercel_sandbox
backend: docker # or singularity, modal, daytona
container_cpu: 1 # CPU cores (default: 1)
container_memory: 5120 # Memory in MB (default: 5GB)
container_disk: 51200 # Disk in MB (default: 50GB)

View file

@ -144,7 +144,7 @@ The following patterns trigger approval prompts (defined in `tools/approval.py`)
| `gateway run` with `&`/`disown`/`nohup`/`setsid` | Prevents starting gateway outside service manager |
:::info
**Container bypass**: When running in `docker`, `singularity`, `modal`, `daytona`, or `vercel_sandbox` backends, dangerous command checks are **skipped** because the container itself is the security boundary. Destructive commands inside a container can't harm the host.
**Container bypass**: When running in `docker`, `singularity`, `modal`, or `daytona` backends, dangerous command checks are **skipped** because the container itself is the security boundary. Destructive commands inside a container can't harm the host.
:::
### Approval Flow (CLI)
@ -340,7 +340,7 @@ terminal:
- **Ephemeral mode** (`container_persistent: false`): Uses tmpfs for workspace — everything is lost on cleanup
:::tip
For production gateway deployments, use `docker`, `modal`, `daytona`, or `vercel_sandbox` backend to isolate agent commands from your host system. This eliminates the need for dangerous command approval entirely.
For production gateway deployments, use `docker`, `modal`, or `daytona` backend to isolate agent commands from your host system. This eliminates the need for dangerous command approval entirely.
:::
:::warning
@ -357,7 +357,6 @@ If you add names to `terminal.docker_forward_env`, those variables are intention
| **singularity** | Container | ❌ Skipped | HPC environments |
| **modal** | Cloud sandbox | ❌ Skipped | Scalable cloud isolation |
| **daytona** | Cloud sandbox | ❌ Skipped | Persistent cloud workspaces |
| **vercel_sandbox** | Cloud microVM | ❌ Skipped | Cloud execution with snapshot persistence |
## Environment Variable Passthrough {#environment-variable-passthrough}

View file

@ -406,7 +406,6 @@ Full config reference: https://hermes-agent.nousresearch.com/docs/user-guide/con
| Alibaba / DashScope | API key | `DASHSCOPE_API_KEY` |
| Xiaomi MiMo | API key | `XIAOMI_API_KEY` |
| Kilo Code | API key | `KILOCODE_API_KEY` |
| AI Gateway (Vercel) | API key | `AI_GATEWAY_API_KEY` |
| OpenCode Zen | API key | `OPENCODE_ZEN_API_KEY` |
| OpenCode Go | API key | `OPENCODE_GO_API_KEY` |
| Qwen OAuth | OAuth | `hermes auth add qwen-oauth` |
@ -1014,7 +1013,7 @@ See `tests/agent/test_prompt_builder.py::TestEnvironmentHints` for a worked exam
Factual guidance about the host OS, user home, cwd, terminal backend, and shell (bash vs. PowerShell on Windows) is emitted from `agent/prompt_builder.py::build_environment_hints()`. This is also where the WSL hint and per-backend probe logic live. The convention:
- **Local terminal backend** → emit host info (OS, `$HOME`, cwd) + Windows-specific notes (hostname ≠ username, `terminal` uses bash not PowerShell).
- **Remote terminal backend** (anything in `_REMOTE_TERMINAL_BACKENDS`: `docker, singularity, modal, daytona, ssh, vercel_sandbox, managed_modal`) → **suppress** host info entirely and describe only the backend. A live `uname`/`whoami`/`pwd` probe runs inside the backend via `tools.environments.get_environment(...).execute(...)`, cached per process in `_BACKEND_PROBE_CACHE`, with a static fallback if the probe times out.
- **Remote terminal backend** (anything in `_REMOTE_TERMINAL_BACKENDS`: `docker, singularity, modal, daytona, ssh, managed_modal`) → **suppress** host info entirely and describe only the backend. A live `uname`/`whoami`/`pwd` probe runs inside the backend via `tools.environments.get_environment(...).execute(...)`, cached per process in `_BACKEND_PROBE_CACHE`, with a static fallback if the probe times out.
- **Key fact for prompt authoring:** when `TERMINAL_ENV != "local"`, *every* file tool (`read_file`, `write_file`, `patch`, `search_files`) runs inside the backend container, not on the host. The system prompt must never describe the host in that case — the agent can't touch it.
Full design notes, the exact emitted strings, and testing pitfalls: