mirror of https://github.com/NousResearch/hermes-agent.git
synced 2026-05-08 03:01:47 +00:00

docs(gemini): add Google Gemini guide

parent 794f48766c
commit b1476c76f6

2 changed files with 282 additions and 0 deletions
280	website/docs/guides/google-gemini.md	Normal file

@@ -0,0 +1,280 @@
---
sidebar_position: 16
title: "Google Gemini"
description: "Use Hermes Agent with Google Gemini — native AI Studio API, API-key setup, OAuth option, tool calling, streaming, and quota guidance"
---

# Google Gemini

Hermes Agent supports Google Gemini as a native provider using the **Google AI Studio / Gemini API** — not the OpenAI-compatible endpoint. This lets Hermes translate its internal OpenAI-shaped message and tool loop into Gemini's native `generateContent` API while preserving tool calling, streaming, multimodal inputs, and Gemini-specific response metadata.

Hermes also supports a separate **Google Gemini (OAuth)** provider that uses the same Cloud Code Assist backend as Google's Gemini CLI. Use the API-key provider (`gemini`) for the lowest-risk official API path.

## Prerequisites

- **Google AI Studio API key** — create one at [aistudio.google.com/apikey](https://aistudio.google.com/apikey)
- **Billing-enabled Google Cloud project** — recommended for agent use. Gemini's free tier is too small for long-running agent sessions because Hermes may make several model calls per user turn.
- **Hermes installed** — no extra Python package is required for the native Gemini provider.

:::tip API key path
Set `GOOGLE_API_KEY` or `GEMINI_API_KEY`. Hermes checks both names for the `gemini` provider.
:::

## Quick Start

```bash
# Add your Gemini API key
echo "GOOGLE_API_KEY=..." >> ~/.hermes/.env

# Select Gemini as your provider
hermes model
# → Choose "More providers..." → "Google AI Studio"
# → Hermes checks your key tier and shows Gemini models
# → Select a model

# Start chatting
hermes chat
```

If you prefer direct config editing, use the native Gemini API base URL:

```yaml
model:
  default: gemini-3-flash-preview
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
```

## Configuration

After running `hermes model`, your `~/.hermes/config.yaml` will contain:

```yaml
model:
  default: gemini-3-flash-preview
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
```

And in `~/.hermes/.env`:

```bash
GOOGLE_API_KEY=...
```

### Native Gemini API

The recommended endpoint is:

```text
https://generativelanguage.googleapis.com/v1beta
```

Hermes detects this endpoint and creates its native Gemini adapter. Internally, Hermes still keeps the agent loop in OpenAI-shaped messages, then translates each request to Gemini's native schema:

- `messages[]` → Gemini `contents[]`
- system prompts → Gemini `systemInstruction`
- tool schemas → Gemini `functionDeclarations`
- tool results → Gemini `functionResponse` parts
- streaming responses → OpenAI-shaped stream chunks for the Hermes loop
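
The mapping above can be sketched as a minimal translation step. This is an illustrative reconstruction, not Hermes' actual adapter code; it covers only text content and basic tool plumbing, and the field names follow Gemini's `generateContent` request schema:

```python
def to_gemini_request(messages, tools=None):
    """Sketch: translate OpenAI-shaped messages into a Gemini
    generateContent request body (simplified, text-only)."""
    request = {"contents": []}
    for msg in messages:
        if msg["role"] == "system":
            # System prompts become systemInstruction, not a content turn.
            request["systemInstruction"] = {"parts": [{"text": msg["content"]}]}
        elif msg["role"] == "tool":
            # Tool results go back as functionResponse parts.
            request["contents"].append({
                "role": "user",
                "parts": [{"functionResponse": {
                    "name": msg["name"],
                    "response": {"result": msg["content"]},
                }}],
            })
        else:
            # Gemini uses "model" where OpenAI uses "assistant".
            role = "model" if msg["role"] == "assistant" else "user"
            request["contents"].append(
                {"role": role, "parts": [{"text": msg["content"]}]})
    if tools:
        # OpenAI tool schemas map onto functionDeclarations.
        request["tools"] = [{"functionDeclarations": [
            t["function"] for t in tools]}]
    return request

req = to_gemini_request([
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "Hello"},
])
print(req["systemInstruction"]["parts"][0]["text"])  # → Be concise.
```

The real adapter also handles multimodal parts, streaming, and response metadata; this sketch only shows the shape of the request-side translation.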

:::note Gemini 3 thought signatures
For Gemini 3 tool use, Hermes preserves the `thoughtSignature` values attached to function-call parts and replays them on the next tool turn. That covers the validation-critical path for multi-step agent workflows.

Gemini 3 may also attach thought signatures to other response parts. Hermes' native adapter is optimized for agent tool loops today, so it does not yet replay every non-tool-call signature with full part-level fidelity.
:::
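
A hedged sketch of what that preservation looks like: when the model's function-call parts are echoed back in the next request, the `thoughtSignature` field is carried along verbatim. The dict shapes are illustrative, not Hermes internals:

```python
def replay_function_call_part(part):
    """Sketch: copy a Gemini functionCall part into the next request,
    preserving thoughtSignature when present."""
    replayed = {"functionCall": dict(part["functionCall"])}
    if "thoughtSignature" in part:
        # Gemini 3 validates this on the following tool turn,
        # so it must be echoed back unmodified.
        replayed["thoughtSignature"] = part["thoughtSignature"]
    return replayed

model_part = {
    "functionCall": {"name": "read_file", "args": {"path": "a.txt"}},
    "thoughtSignature": "opaque-signature-bytes",
}
print(replay_function_call_part(model_part)["thoughtSignature"])
```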

### Prefer the Native Endpoint

Google also exposes an OpenAI-compatible endpoint:

```text
https://generativelanguage.googleapis.com/v1beta/openai/
```

For Hermes agent sessions, prefer the native Gemini endpoint above. Hermes includes a native Gemini adapter so it can map multi-turn tool use, tool-call results, streaming, multimodal inputs, and Gemini response metadata directly onto Gemini's `generateContent` API. The OpenAI-compatible endpoint is still useful when you specifically need OpenAI API compatibility.

If you previously set `GEMINI_BASE_URL` to the `/openai` URL, remove it or change it:

```bash
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta
```

### OAuth Provider

Hermes also has a `google-gemini-cli` provider:

```bash
hermes model
# → Choose "Google Gemini (OAuth)"
```

This uses browser PKCE login and the Cloud Code Assist backend. It can be useful for users who want Gemini CLI-style OAuth, but Hermes shows an explicit warning because Google may treat use of the Gemini CLI OAuth client from third-party software as a policy violation. For production or lowest-risk usage, prefer the API-key provider above.

## Available Models

The `hermes model` picker shows Gemini models maintained in Hermes' provider registry. Common choices include:

| Model | ID | Notes |
|-------|----|-------|
| Gemini 3.1 Pro Preview | `gemini-3.1-pro-preview` | Most capable preview model when available |
| Gemini 3 Pro Preview | `gemini-3-pro-preview` | Strong reasoning and coding model |
| Gemini 3 Flash Preview | `gemini-3-flash-preview` | Recommended default balance of speed and capability |
| Gemini 3.1 Flash Lite Preview | `gemini-3.1-flash-lite-preview` | Fastest / lowest-cost option when available |

Model availability changes over time. If a model disappears or is not enabled for your key, run `hermes model` again and pick one from the current list.

:::info Model IDs
Use Gemini's native model IDs such as `gemini-3-flash-preview`, not OpenRouter-style IDs like `google/gemini-3-flash-preview`, when `provider: gemini`.
:::

### Latest Aliases

Google publishes moving aliases for the Pro and Flash Gemini families. `gemini-pro-latest` and `gemini-flash-latest` are useful when you want Google to advance the model automatically without changing your Hermes config.

| Alias | Currently tracks | Notes |
|-------|------------------|-------|
| `gemini-pro-latest` | Latest Gemini Pro model | Best when you want Google's current Pro default |
| `gemini-flash-latest` | Latest Gemini Flash model | Best when you want Google's current Flash default |

```yaml
model:
  default: gemini-pro-latest
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
```

If you need strict reproducibility, prefer explicit model IDs such as `gemini-3.1-pro-preview` or `gemini-3-flash-preview`.

### Gemma via the Gemini API

Google also exposes Gemma models through the Gemini API. Hermes recognizes these as Google models, but hides very low-throughput Gemma entries from the default model picker so new users do not accidentally select an evaluation-tier model for a long-running agent session.

Useful evaluation IDs include:

| Model | ID | Notes |
|-------|----|-------|
| Gemma 4 31B IT | `gemma-4-31b-it` | Larger Gemma model; useful for compatibility and quality evaluation |
| Gemma 4 26B A4B IT | `gemma-4-26b-a4b-it` | Smaller active-parameter variant when available |

These models are best treated as evaluation options on Gemini API keys. Google's Gemma API pricing is free-tier-only and the usage caps are low compared with production Gemini models, so sustained Hermes agent use should normally move to a paid Gemini model, a self-hosted deployment, or another provider with appropriate quota.

To use a Gemma model that is hidden from the picker, set it directly:

```yaml
model:
  default: gemma-4-31b-it
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
```

## Switching Models Mid-Session

Use the `/model` command during a conversation:

```text
/model gemini-3-flash-preview
/model gemini-flash-latest
/model gemini-3-pro-preview
/model gemini-pro-latest
/model gemma-4-31b-it
/model gemini-3.1-flash-lite-preview
```

If you have not configured Gemini yet, exit the session and run `hermes model` first. `/model` switches among already-configured providers and models; it does not collect new API keys.

## Diagnostics

```bash
hermes doctor
```

The doctor checks:

- Whether `GOOGLE_API_KEY` or `GEMINI_API_KEY` is available
- Whether Gemini OAuth credentials exist for `google-gemini-cli`
- Whether configured provider credentials can be resolved

For OAuth quota usage, run this inside a Hermes session:

```text
/gquota
```

`/gquota` applies to the `google-gemini-cli` OAuth provider, not the AI Studio API-key provider.

## Gateway (Messaging Platforms)

Gemini works with all Hermes gateway platforms (Telegram, Discord, Slack, WhatsApp, LINE, Feishu, etc.). Configure Gemini as your provider, then start the gateway normally:

```bash
hermes gateway setup
hermes gateway start
```

The gateway reads `config.yaml` and uses the same Gemini provider configuration.

## Troubleshooting

### "Gemini native client requires an API key"

Hermes could not find a usable API key. Add one of these to `~/.hermes/.env`:

```bash
GOOGLE_API_KEY=...
# or
GEMINI_API_KEY=...
```

Then run `hermes model` again.

### "This Google API key is on the free tier"

Hermes probes Gemini API keys during setup. Free-tier quotas can be exhausted after a handful of agent turns because tool use, retries, compression, and auxiliary tasks may require multiple model calls.

Enable billing on the Google Cloud project attached to your key, regenerate the key if needed, then run:

```bash
hermes model
```

### "404 model not found"

The selected model is not available for your account, region, or key. Run `hermes model` again and pick another Gemini model from the current list.

### Gemma model is not shown in `hermes model`

Hermes may hide low-throughput Gemma models from the picker by default. If you intentionally want to evaluate one, set the model ID directly in `~/.hermes/config.yaml`.

### "429 quota exceeded" on Gemma

Gemma models exposed through the Gemini API are useful for evaluation, but their Gemini API free-tier caps are low. Use them for compatibility testing, then switch to a paid Gemini model or another provider for sustained agent sessions.

### OpenAI-compatible endpoint is configured

Check `~/.hermes/.env` for:

```bash
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai/
```

Change it to the native endpoint or remove the override:

```bash
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta
```

### OAuth login warning

The `google-gemini-cli` provider uses a Gemini CLI / Cloud Code Assist OAuth flow. Hermes warns before starting it because this is distinct from the official AI Studio API-key path. Use `provider: gemini` with `GOOGLE_API_KEY` for the official API-key integration.

### Tool calling fails with schema errors

Upgrade Hermes and rerun `hermes model`. The native Gemini adapter sanitizes tool schemas for Gemini's stricter function-declaration format; older builds or custom endpoints may not.
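
As an illustration of that sanitization step, a recursive pass might drop JSON Schema keywords that Gemini's function-declaration subset rejects. The exact key list below is an assumption for the sketch, not Hermes' actual filter:

```python
# Keywords assumed unsupported by Gemini's OpenAPI-style schema
# subset (illustrative list, not Hermes' real one).
UNSUPPORTED_KEYS = {"$schema", "additionalProperties", "default", "examples"}

def sanitize_schema(schema):
    """Recursively strip unsupported keywords from a JSON Schema value."""
    if isinstance(schema, dict):
        return {k: sanitize_schema(v) for k, v in schema.items()
                if k not in UNSUPPORTED_KEYS}
    if isinstance(schema, list):
        return [sanitize_schema(item) for item in schema]
    return schema

tool_schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {"path": {"type": "string", "default": "."}},
}
print(sanitize_schema(tool_schema))  # "additionalProperties" and "default" are gone
```

If a custom endpoint still rejects a tool schema after upgrading, comparing the raw schema against the sanitized form is a quick way to find the offending keyword.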

## Related

- [AI Providers](/docs/integrations/providers)
- [Configuration](/docs/user-guide/configuration)
- [Fallback Providers](/docs/user-guide/features/fallback-providers)
- [AWS Bedrock](/docs/guides/aws-bedrock) — native cloud-provider integration using AWS credentials
@@ -42,6 +42,8 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
| **LM Studio** | `hermes model` → "LM Studio" (provider: `lmstudio`, optional `LM_API_KEY`) |
| **Custom Endpoint** | `hermes model` → choose "Custom endpoint" (saved in `config.yaml`) |

For the official API-key path, see the dedicated [Google Gemini guide](/docs/guides/google-gemini).

:::tip Model key alias
In the `model:` config section, you can use either `default:` or `model:` as the key name for your model ID. Both `model: { default: my-model }` and `model: { model: my-model }` work identically.
:::