---
sidebar_position: 16
title: Google Gemini
description: Use Hermes Agent with Google Gemini — native AI Studio API, API-key setup, OAuth option, tool calling, streaming, and quota guidance
---
# Google Gemini
Hermes Agent supports Google Gemini as a native provider using the Google AI Studio / Gemini API — not the OpenAI-compatible endpoint. This lets Hermes translate its internal OpenAI-shaped message and tool loop into Gemini's native `generateContent` API while preserving tool calling, streaming, multimodal inputs, and Gemini-specific response metadata.

Hermes also supports a separate Google Gemini (OAuth) provider that uses the same Cloud Code Assist backend as Google's Gemini CLI. Use the API-key provider (`gemini`) for the lowest-risk official API path.
## Prerequisites
- Google AI Studio API key — create one at aistudio.google.com/apikey
- Billing-enabled Google Cloud project — recommended for agent use. Gemini's free tier is too small for long-running agent sessions because Hermes may make several model calls per user turn.
- Hermes installed — no extra Python package is required for the native Gemini provider.
:::tip API key path
Set `GOOGLE_API_KEY` or `GEMINI_API_KEY`. Hermes checks both names for the `gemini` provider.
:::
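The lookup can be sketched in plain Python. The precedence shown here (`GOOGLE_API_KEY` first) is an assumption for illustration; Hermes' actual resolution order may differ:

```python
import os

def resolve_gemini_api_key(env=None):
    """Return the first available Gemini API key, or None.

    Illustrative sketch: checks both accepted variable names,
    preferring GOOGLE_API_KEY (the precedence is an assumption).
    """
    env = os.environ if env is None else env
    for name in ("GOOGLE_API_KEY", "GEMINI_API_KEY"):
        key = env.get(name)
        if key:
            return key
    return None
```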
## Quick Start
```bash
# Add your Gemini API key
echo "GOOGLE_API_KEY=..." >> ~/.hermes/.env

# Select Gemini as your provider
hermes model
# → Choose "More providers..." → "Google AI Studio"
# → Hermes checks your key tier and shows Gemini models
# → Select a model

# Start chatting
hermes chat
```
If you prefer direct config editing, use the native Gemini API base URL:

```yaml
model:
  default: gemini-3-flash-preview
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
```
## Configuration
After running `hermes model`, your `~/.hermes/config.yaml` will contain:

```yaml
model:
  default: gemini-3-flash-preview
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
```

And in `~/.hermes/.env`:

```bash
GOOGLE_API_KEY=...
```
## Native Gemini API
The recommended endpoint is:

```
https://generativelanguage.googleapis.com/v1beta
```
Hermes detects this endpoint and creates its native Gemini adapter. Internally, Hermes still keeps the agent loop in OpenAI-shaped messages, then translates each request to Gemini's native schema:

- `messages[]` → Gemini `contents[]`
- system prompts → Gemini `systemInstruction`
- tool schemas → Gemini `functionDeclarations`
- tool results → Gemini `functionResponse` parts
- streaming responses → OpenAI-shaped stream chunks for the Hermes loop
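The mapping above can be sketched as a plain dict transform. Field names follow Gemini's public `generateContent` schema; the helper itself is illustrative and is not Hermes' actual adapter code:

```python
def openai_to_gemini(messages, tools=None):
    """Translate an OpenAI-shaped chat request into a Gemini
    generateContent-shaped payload. Illustrative sketch only."""
    system_parts, contents = [], []
    for m in messages:
        if m["role"] == "system":
            # System prompts collect into systemInstruction
            system_parts.append({"text": m["content"]})
        elif m["role"] == "tool":
            # Tool results become functionResponse parts
            contents.append({
                "role": "user",
                "parts": [{"functionResponse": {
                    "name": m["name"],
                    "response": {"result": m["content"]},
                }}],
            })
        else:
            role = "model" if m["role"] == "assistant" else "user"
            contents.append({"role": role, "parts": [{"text": m["content"]}]})
    payload = {"contents": contents}
    if system_parts:
        payload["systemInstruction"] = {"parts": system_parts}
    if tools:
        # OpenAI tool schemas map onto functionDeclarations
        payload["tools"] = [{"functionDeclarations":
                             [t["function"] for t in tools]}]
    return payload
```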
:::note Gemini 3 thought signatures
For Gemini 3 tool use, Hermes preserves the `thoughtSignature` values attached to function-call parts and replays them on the next tool turn. That covers the validation-critical path for multi-step agent workflows.

Gemini 3 may also attach thought signatures to other response parts. Hermes' native adapter is optimized for agent tool loops today, so it does not yet replay every non-tool-call signature with full part-level fidelity.
:::
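The replay pattern looks roughly like this. Part field names follow the public API; the pairing logic is a simplified sketch, not Hermes' implementation:

```python
def replay_with_signatures(model_parts, tool_results):
    """Build next-turn contents for a Gemini 3 tool round trip,
    keeping thoughtSignature on replayed functionCall parts.
    Illustrative sketch only."""
    replayed = []
    for part in model_parts:
        if "functionCall" in part:
            p = {"functionCall": part["functionCall"]}
            # Carry the signature through so the API can validate
            # the multi-step tool sequence on the next turn
            if "thoughtSignature" in part:
                p["thoughtSignature"] = part["thoughtSignature"]
            replayed.append(p)
    responses = [{"functionResponse": r} for r in tool_results]
    return [
        {"role": "model", "parts": replayed},
        {"role": "user", "parts": responses},
    ]
```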
## Prefer the Native Endpoint
Google also exposes an OpenAI-compatible endpoint:

```
https://generativelanguage.googleapis.com/v1beta/openai/
```
For Hermes agent sessions, prefer the native Gemini endpoint above. Hermes includes a native Gemini adapter so it can map multi-turn tool use, tool-call results, streaming, multimodal inputs, and Gemini response metadata directly onto Gemini's `generateContent` API. The OpenAI-compatible endpoint is still useful when you specifically need OpenAI API compatibility.
If you previously set `GEMINI_BASE_URL` to the `/openai` URL, remove it or change it:

```bash
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta
```
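A small helper can normalize an overridden base URL back to the native endpoint. This is an illustrative utility, not part of Hermes:

```python
NATIVE_ENDPOINT = "https://generativelanguage.googleapis.com/v1beta"

def normalize_gemini_base_url(base_url):
    """Rewrite the OpenAI-compatible Gemini endpoint to the native
    one; leave any other URL untouched. Illustrative sketch."""
    if base_url.rstrip("/").endswith("/openai"):
        return NATIVE_ENDPOINT
    return base_url
```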
## OAuth Provider

Hermes also has a `google-gemini-cli` provider:

```bash
hermes model
# → Choose "Google Gemini (OAuth)"
```
This uses browser PKCE login and the Cloud Code Assist backend. It can be useful for users who want Gemini CLI-style OAuth, but Hermes shows an explicit warning because Google may treat use of the Gemini CLI OAuth client from third-party software as a policy violation. For production or lowest-risk usage, prefer the API-key provider above.
## Available Models

The `hermes model` picker shows Gemini models maintained in Hermes' provider registry. Common choices include:
| Model | ID | Notes |
|---|---|---|
| Gemini 3.1 Pro Preview | `gemini-3.1-pro-preview` | Most capable preview model when available |
| Gemini 3 Pro Preview | `gemini-3-pro-preview` | Strong reasoning and coding model |
| Gemini 3 Flash Preview | `gemini-3-flash-preview` | Recommended default balance of speed and capability |
| Gemini 3.1 Flash Lite Preview | `gemini-3.1-flash-lite-preview` | Fastest / lowest-cost option when available |
Model availability changes over time. If a model disappears or is not enabled for your key, run `hermes model` again and pick one from the current list.
:::info Model IDs
Use Gemini's native model IDs such as `gemini-3-flash-preview`, not OpenRouter-style IDs like `google/gemini-3-flash-preview`, when `provider: gemini`.
:::
### Latest Aliases

Google publishes moving aliases for the Pro and Flash Gemini families. `gemini-pro-latest` and `gemini-flash-latest` are useful when you want Google to advance the model automatically without changing your Hermes config.
| Alias | Currently tracks | Notes |
|---|---|---|
| `gemini-pro-latest` | Latest Gemini Pro model | Best when you want Google's current Pro default |
| `gemini-flash-latest` | Latest Gemini Flash model | Best when you want Google's current Flash default |
```yaml
model:
  default: gemini-pro-latest
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
```

If you need strict reproducibility, prefer explicit model IDs such as `gemini-3.1-pro-preview` or `gemini-3-flash-preview`.
### Gemma via the Gemini API
Google also exposes Gemma models through the Gemini API. Hermes recognizes these as Google models, but hides very low-throughput Gemma entries from the default model picker so new users do not accidentally select an evaluation-tier model for a long-running agent session.
Useful evaluation IDs include:
| Model | ID | Notes |
|---|---|---|
| Gemma 4 31B IT | `gemma-4-31b-it` | Larger Gemma model; useful for compatibility and quality evaluation |
| Gemma 4 26B A4B IT | `gemma-4-26b-a4b-it` | Smaller active-parameter variant when available |
These models are best treated as evaluation options on Gemini API keys. Google's Gemma API pricing is free-tier-only and the usage caps are low compared with production Gemini models, so sustained Hermes agent use should normally move to a paid Gemini model, a self-hosted deployment, or another provider with appropriate quota.
To use a Gemma model that is hidden from the picker, set it directly:

```yaml
model:
  default: gemma-4-31b-it
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
```
## Switching Models Mid-Session

Use the `/model` command during a conversation:

```
/model gemini-3-flash-preview
/model gemini-flash-latest
/model gemini-3-pro-preview
/model gemini-pro-latest
/model gemma-4-31b-it
/model gemini-3.1-flash-lite-preview
```
If you have not configured Gemini yet, exit the session and run `hermes model` first. `/model` switches among already-configured providers and models; it does not collect new API keys.
## Diagnostics

```bash
hermes doctor
```

The doctor checks:

- Whether `GOOGLE_API_KEY` or `GEMINI_API_KEY` is available
- Whether Gemini OAuth credentials exist for `google-gemini-cli`
- Whether configured provider credentials can be resolved
For OAuth quota usage, run this inside a Hermes session:

```
/gquota
```

`/gquota` applies to the `google-gemini-cli` OAuth provider, not the AI Studio API-key provider.
## Gateway (Messaging Platforms)

Gemini works with all Hermes gateway platforms (Telegram, Discord, Slack, WhatsApp, LINE, Feishu, etc.). Configure Gemini as your provider, then start the gateway normally:

```bash
hermes gateway setup
hermes gateway start
```
The gateway reads `config.yaml` and uses the same Gemini provider configuration.
## Troubleshooting

### "Gemini native client requires an API key"

Hermes could not find a usable API key. Add one of these to `~/.hermes/.env`:

```bash
GOOGLE_API_KEY=...
# or
GEMINI_API_KEY=...
```

Then run `hermes model` again.
### "This Google API key is on the free tier"

Hermes probes Gemini API keys during setup. Free-tier quotas can be exhausted after a handful of agent turns because tool use, retries, compression, and auxiliary tasks may require multiple model calls.
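The arithmetic is easy to check. Both numbers below are hypothetical placeholders; look up the actual caps and per-turn call counts for your key and workload:

```python
def turns_before_quota(requests_per_day_cap, calls_per_turn):
    """Rough number of user turns a daily request cap supports.
    Inputs are illustrative; check Google's current rate-limit
    docs for your key's real caps."""
    return requests_per_day_cap // calls_per_turn

# With a hypothetical 250-request/day free cap and ~5 model calls
# per agent turn (tool loop, retries, compression):
# turns_before_quota(250, 5) == 50 turns per day
```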
Enable billing on the Google Cloud project attached to your key, regenerate the key if needed, then run:

```bash
hermes model
```
### "404 model not found"

The selected model is not available for your account, region, or key. Run `hermes model` again and pick another Gemini model from the current list.
### Gemma model is not shown in `hermes model`

Hermes may hide low-throughput Gemma models from the picker by default. If you intentionally want to evaluate one, set the model ID directly in `~/.hermes/config.yaml`.
### "429 quota exceeded" on Gemma

Gemma models exposed through the Gemini API are useful for evaluation, but their Gemini API free-tier caps are low. Use them for compatibility testing, then switch to a paid Gemini model or another provider for sustained agent sessions.
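If you write your own scripts against these low caps, exponential backoff with jitter is the standard way to ride out occasional 429s. This is a generic sketch; the exception type is a stand-in, not a real client class:

```python
import random
import time

class QuotaExceeded(Exception):
    """Stand-in for an HTTP 429 error from the Gemini API; a real
    client's exception type will differ."""

def call_with_backoff(request_fn, max_retries=4, base_delay=1.0):
    """Retry a callable on 429-style errors with exponential backoff
    and jitter. Illustrative pattern, not a Hermes internal."""
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except QuotaExceeded:
            if attempt == max_retries:
                raise
            # Double the delay each attempt, plus a little jitter
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.2))
```

Backoff only smooths bursts; it cannot stretch a hard daily cap, which is why sustained agent use should move off the free Gemma tier.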
### OpenAI-compatible endpoint is configured

Check `~/.hermes/.env` for:

```bash
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai/
```

Change it to the native endpoint or remove the override:

```bash
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta
```
### OAuth login warning

The `google-gemini-cli` provider uses a Gemini CLI / Cloud Code Assist OAuth flow. Hermes warns before starting it because this is distinct from the official AI Studio API-key path. Use `provider: gemini` with `GOOGLE_API_KEY` for the official API-key integration.
### Tool calling fails with schema errors

Upgrade Hermes and rerun `hermes model`. The native Gemini adapter sanitizes tool schemas for Gemini's stricter function-declaration format; older builds or custom endpoints may not.
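Sanitization of this kind usually means recursively stripping JSON Schema keywords the function-declaration format rejects. The keyword set below is an assumption for illustration, not Hermes' actual list:

```python
# Keywords Gemini's functionDeclarations schema commonly rejects.
# This particular set is an assumption, not Hermes' real list.
UNSUPPORTED_KEYS = {"additionalProperties", "$schema", "default", "examples"}

def sanitize_schema(schema):
    """Recursively drop JSON Schema keywords that Gemini's
    function-declaration format does not accept. Sketch only."""
    if isinstance(schema, dict):
        return {k: sanitize_schema(v) for k, v in schema.items()
                if k not in UNSUPPORTED_KEYS}
    if isinstance(schema, list):
        return [sanitize_schema(v) for v in schema]
    return schema
```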
## Related
- AI Providers
- Configuration
- Fallback Providers
- AWS Bedrock — native cloud-provider integration using AWS credentials