The URL is now the primary element — displayed in a bordered box
before the browser auto-open attempt. Works for users who SSH into
remote servers where webbrowser.open() silently fails.
Put the authorization URL front and center instead of treating it as
a fallback. Most Hermes users run on remote servers via SSH where
webbrowser.open() silently fails.
Adds our own OAuth login and token refresh flow, independent of Claude
Code CLI. Mirrors the PKCE flow used by pi-ai (clawdbot) and OpenCode:
- run_hermes_oauth_login(): full PKCE authorization code flow
- Opens browser to claude.ai/oauth/authorize
- User pastes code#state back
- Exchanges for access + refresh tokens
- Stores in ~/.hermes/.anthropic_oauth.json (our own file)
- Also writes to ~/.claude/.credentials.json for backward compat
- refresh_hermes_oauth_token(): automatic token refresh
- POST to console.anthropic.com/v1/oauth/token with refresh_token
- Updates both credential files on success
- Credential resolution priority updated:
1. ANTHROPIC_TOKEN env var
2. CLAUDE_CODE_OAUTH_TOKEN env var
3. Hermes OAuth credentials (~/.hermes/.anthropic_oauth.json) ← NEW
4. Claude Code credentials (~/.claude/.credentials.json)
5. ANTHROPIC_API_KEY env var
Uses same CLIENT_ID, endpoints, scopes, and PKCE parameters as
Claude Code / OpenCode / pi-ai. Token refresh happens automatically
before each API call via _try_refresh_anthropic_client_credentials.
anthropic/claude-opus-4.6 (OpenRouter format) was being sent as
claude-opus-4.6 to the Anthropic API, which expects claude-opus-4-6
(hyphens, not dots).
normalize_model_name() now converts dots to hyphens after stripping
the provider prefix, matching Anthropic's naming convention.
Fixes 404: 'model: claude-opus-4.6 was not found'
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
Haiku models don't support extended thinking at all. Without this
guard, claude-haiku-4-5-20251001 would receive type=enabled +
budget_tokens and return a 400 error.
Incorporates the fix from PR #1127 (by frizynn) on top of #1128's
adaptive thinking refactor.
Verified live with Claude Code OAuth:
claude-opus-4-6 → adaptive thinking ✓
claude-haiku-4-5 → no thinking params ✓
claude-sonnet-4 → enabled thinking ✓
For Claude 4.6 models (Opus and Sonnet), the Anthropic API rejects
budget_tokens when thinking.type is 'adaptive'. This was causing a
400 error: 'thinking.adaptive.budget_tokens: Extra inputs are not
permitted'.
Changes:
- Send thinking: {type: 'adaptive'} without budget_tokens for 4.6
- Move effort control to output_config: {effort: ...} per Anthropic docs
- Map Hermes effort levels to Anthropic effort levels (xhigh->max, etc.)
- Narrow adaptive detection to 4.6 models only (4.5 still uses manual)
- Add tests for adaptive thinking on 4.6 and manual thinking on pre-4.6
Fixes#1126
Remaining issues from deep scan:
Adapter (agent/anthropic_adapter.py):
- Add _sanitize_tool_id() — Anthropic requires IDs matching [a-zA-Z0-9_-],
now strips invalid chars and ensures non-empty (both tool_use and tool_result)
- Empty tool result content → '(no output)' placeholder (Anthropic rejects empty)
- Set temperature=1 when thinking type='enabled' on older models (required)
- normalize_model_name now case-insensitive for 'Anthropic/' prefix
- Fix stale docstrings referencing only ~/.claude/.credentials.json
Agent loop (run_agent.py):
- Guard memory flush path (line ~2684) — was calling self.client.chat.completions
which is None in anthropic_messages mode. Now routes through Anthropic client.
- Guard summary generation path (line ~3171) — same crash when reaching
iteration limit. Now builds proper Anthropic kwargs and normalizes response.
- Guard retry summary path (line ~3200) — same fix for the summary retry loop.
All three self.client.chat.completions.create() calls outside the main
loop now have anthropic_messages branches to prevent NoneType crashes.
Fixes from comprehensive code review and cross-referencing with
clawdbot/OpenCode implementations:
CRITICAL:
- Add one-shot guard (anthropic_auth_retry_attempted) to prevent
infinite 401 retry loops when credentials keep changing
- Fix _is_oauth_token(): managed keys from ~/.claude.json are NOT
regular API keys (don't start with sk-ant-api). Inverted the logic:
only sk-ant-api* is treated as API key auth, everything else uses
Bearer auth + oauth beta headers
HIGH:
- Wrap json.loads(args) in try/except in message conversion — malformed
tool_call arguments no longer crash the entire conversation
- Raise AuthError in runtime_provider when no Anthropic token found
(was silently passing empty string, causing confusing API errors)
- Remove broken _try_anthropic() from auxiliary vision chain — the
centralized router creates an OpenAI client for api_key providers
which doesn't work with Anthropic's Messages API
MEDIUM:
- Handle empty assistant message content — Anthropic rejects empty
content blocks, now inserts '(empty)' placeholder
- Fix setup.py existing_key logic — set to 'KEEP' sentinel instead
of None to prevent falling through to the auth choice prompt
- Add debug logging to _fetch_anthropic_models on failure
Tests: 43 adapter tests (2 new for token detection), 3197 total passed
- Add _fetch_anthropic_models() to hermes_cli/models.py — hits the
Anthropic /v1/models endpoint to get the live model catalog. Handles
both API key and OAuth token auth headers.
- Wire it into provider_model_ids() so both 'hermes model' and
'hermes setup model' show the live list instead of a stale static one.
- Update static _PROVIDER_MODELS fallback with full current catalog:
opus-4-6, sonnet-4-6, opus-4-5, sonnet-4-5, opus-4, sonnet-4, haiku-4-5
- Update model_metadata.py with context lengths for all current models.
- Fix thinking parameter for 4.5+ models: use type='adaptive' instead
of type='enabled' (Anthropic deprecated 'enabled' for newer models,
warns at runtime). Detects model version from the model name string.
Verified live:
hermes model → Anthropic → auto-detected creds → shows 7 live models
hermes chat --provider anthropic --model claude-opus-4-6 → works
The critical bug: read_claude_code_credentials() only looked at
~/.claude/.credentials.json, but Claude Code's native binary (v2.x,
Bun-compiled) stores credentials in ~/.claude.json at the top level
as 'primaryApiKey'. The .credentials.json file is only written by
older npm-based installs.
Now checks both locations in priority order:
1. ~/.claude.json → primaryApiKey (native binary, v2.x)
2. ~/.claude/.credentials.json → claudeAiOauth.accessToken (legacy)
Verified live: hermes model → Anthropic → auto-detected credentials →
claude-sonnet-4-20250514 → 'Hello there, how are you?' (5 words)
Feedback fixes:
1. Revert _convert_vision_content — vision is handled by the vision_analyze
tool, not by converting image blocks inline in conversation messages.
Removed the function and its tests.
2. Add Anthropic to 'hermes model' (cmd_model in main.py):
- Added to provider_labels dict
- Added to providers selection list
- Added _model_flow_anthropic() with Claude Code credential auto-detection,
API key prompting, and model selection from catalog.
3. Wire up Anthropic as a vision-capable auxiliary provider:
- Added _try_anthropic() to auxiliary_client.py using claude-sonnet-4
as the vision model (Claude natively supports multimodal)
- Added to the get_vision_auxiliary_client() auto-detection chain
(after OpenRouter/Nous, before Codex/custom)
Cache tracking note: the Anthropic cache metrics branch in run_agent.py
(cache_read_input_tokens / cache_creation_input_tokens) is in the correct
place — it's response-level parsing, same location as the existing
OpenRouter cache tracking. auxiliary_client.py has no cache tracking.
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions