fix: clean up API server — remove dead code, deduplicate model resolution, cache streaming config, add setup integration and security docs

- Remove unused _write_sse_chat_completion pseudo-streaming method (dead code)
- Extract _resolve_model() helper in gateway/run.py, use from api_server
- Cache streaming config at GatewayRunner init instead of re-parsing the YAML on every message
- Add API_SERVER_* env vars to OPTIONAL_ENV_VARS for hermes setup integration
- Add security warning about network exposure without API_SERVER_KEY
teknium1 2026-03-11 09:01:17 -07:00
parent d54280ea03
commit b800e63137
4 changed files with 100 additions and 86 deletions


@@ -165,11 +165,17 @@ This means you can customize behavior per-frontend without losing capabilities:
Bearer token auth via the `Authorization` header:
```
Authorization: Bearer your-secret-key
Authorization: Bearer ***
```
Configure the key via `API_SERVER_KEY` env var. If no key is set, all requests are allowed (for local-only use).
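
The rule above (a Bearer token is checked when `API_SERVER_KEY` is set, and everything is allowed when it is not) can be sketched roughly as follows. This is an illustrative sketch only, not the actual hermes-agent implementation: the function name `is_authorized` and its signature are assumptions made for this example.

```python
import hmac
import os
from typing import Optional


def is_authorized(auth_header: Optional[str], expected_key: Optional[str]) -> bool:
    """Hypothetical helper: decide whether a request may proceed."""
    if not expected_key:
        # No key configured: all requests are allowed (local-only use).
        return True
    if not auth_header:
        return False
    # Expect a header of the form "Bearer <token>".
    scheme, _, token = auth_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())


# The key is read from the environment, as described above.
expected = os.environ.get("API_SERVER_KEY")
```

A caller would pass the raw `Authorization` header value, for example `is_authorized(headers.get("Authorization"), expected)`, and reject the request when it returns `False`.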
:::warning Security
The API server gives full access to hermes-agent's toolset, **including terminal commands**. If you change the bind address to `0.0.0.0` (network-accessible), **always set `API_SERVER_KEY`** — without it, anyone on your network can execute arbitrary commands on your machine.
The default bind address (`127.0.0.1`) is safe for local-only use.
:::
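
One way to make the warning above hard to overlook is a startup check that refuses a non-loopback bind address when no key is set. The sketch below is a hypothetical guard, not something hermes-agent is known to do: `is_loopback` and `check_bind_safety` are names invented for this example, and treating unresolved hostnames as network-exposed is a deliberately conservative assumption.

```python
import ipaddress
from typing import Optional


def is_loopback(host: str) -> bool:
    """Return True if host refers to the local machine only."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; assume it may be reachable from the network.
        return False


def check_bind_safety(host: str, api_key: Optional[str]) -> None:
    """Hypothetical guard: raise if the server would be exposed without a key."""
    if not api_key and not is_loopback(host):
        raise RuntimeError(
            f"Refusing to bind to {host} without API_SERVER_KEY: "
            "anyone on the network could run commands on this machine."
        )
```

With this guard, `check_bind_safety("127.0.0.1", None)` passes silently, while `check_bind_safety("0.0.0.0", None)` raises until a key is configured.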
## Configuration
### Environment Variables