hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-18 04:41:56 +00:00

Author	SHA1	Message	Date
Alex-wuhu	c76e879574	feat: add NovitaAI as LLM provider Add NovitaAI as a first-class provider with dedicated model selection flow, live pricing, and authoritative context length resolution. - Register provider in PROVIDER_REGISTRY, HERMES_OVERLAYS, and all alias/label maps (ID: novita, aliases: novita-ai, novitaai) - Add dedicated _model_flow_novita() with 3-tier model list fallback: Novita API → models.dev → static curated list - Fetch live pricing from /v1/models with correct unit conversion (input_token_price_per_m is 0.0001 USD per Mtok) - Add Novita-specific context length resolution (step 4b) in get_model_context_length(), prioritized over models.dev/OpenRouter - Register api.novita.ai in _URL_TO_PROVIDER to prevent early return from the custom-endpoint code path - Add models.dev mapping (novita → novita-ai) - Add default auxiliary model (deepseek/deepseek-v3-0324) - Add NOVITA_API_KEY to test isolation (conftest.py) - Update docs: providers page, env vars reference, CLI reference, .env.example, README, and landing page	2026-05-13 23:51:15 -07:00
nicoechaniz	e2b713cced	fix(model-metadata): skip OpenRouter for known providers, add kimi/moonshot to PROVIDER_TO_MODELS_DEV Based on PR #23950 by @nicoechaniz. - Add "kimi" and "moonshot" to PROVIDER_TO_MODELS_DEV → kimi-for-coding - Gate OpenRouter metadata step behind "if not effective_provider": known providers should not be overridden by community-maintained OR data - Keep the targeted Kimi-family 32k guard as a secondary safety net inside the OR gate (for unknown providers with Kimi models) Co-authored-by: nicoechaniz <nicoechaniz@altermundi.net>	2026-05-11 13:16:07 -07:00
kshitijk4poor	91eef6255e	fix: correct context-length resolution for kimi-k2.6 on Ollama Cloud and Kimi Coding Kimi-k2.6 (which supports 262K context) was incorrectly resolved as 32K, tripping the 64K minimum-context guard and preventing use of the model on Ollama Cloud and Kimi Coding / Moonshot providers. Three fixes in the context-length resolution chain: 1. Ollama Cloud native /api/show query: new _query_ollama_api_show() queries the Ollama native API for authoritative GGUF model_info context_length. For hosted Ollama, prefers model_info over num_ctx since users can't set their own num_ctx on Cloud. Added at step 5e in get_model_context_length(), before the models.dev fallback. 2. models.dev :cloud/-cloud suffix fallback: lookup_models_dev_context() now also tries appending :cloud and -cloud suffixes when the bare model name doesn't match. models.dev stores 'kimi-k2.6:cloud' but users and the live API use bare 'kimi-k2.6'. 3. Kimi-family 32K guard: after the OpenRouter metadata step, reject exactly 32768 for Kimi-named models (kimi-, moonshot) and fall through to hardcoded defaults ('kimi': 262144). OpenRouter reports 32768 for moonshotai/kimi-k2.6 but the model actually supports 262K. Narrow filter — only 32768, only Kimi-family — becomes dead code when OpenRouter updates its metadata.	2026-05-11 13:16:07 -07:00
Teknium	775c0e22cf	perf(models_dev): cache-first lookup, skip network when disk cache is fresh (#22808) `fetch_models_dev()` is on the hot path of every `AIAgent.__init__` (via `context_compressor → get_model_context_length`). The previous policy was "always try network first, only fall back to disk if network fails," so every fresh `hermes chat` / `hermes gateway` / batch / cron process paid 250-500 ms re-fetching a 2 MB JSON registry that was already on disk from earlier runs. Add a stage 2 between in-mem and network: if `models_dev_cache.json` exists and its mtime is younger than the existing `_MODELS_DEV_CACHE_TTL` (1 hour, same TTL the in-mem cache already uses), load from disk and skip the network call. The in-mem TTL is anchored to the disk file's age, so a 50-min-old cache stays in-memory for only 10 more minutes — no surprise extension of staleness window. Invariants preserved: - `force_refresh=True` still always hits the network and only falls back to disk on failure (`hermes config refresh` semantics). - Missing disk cache → fall through to network (first-ever run). - Stale disk cache (mtime > TTL) → fall through to network. - Negative file age (clock skew) → fall through to network. - Network failure → existing stage-4 stale-disk fallback unchanged. Measured impact (3-run medians, 9950X3D, fresh process per run): fetch_models_dev cold: 256 → 17 ms (-93%); hermes chat -q wall: 4.00 → 3.73 s (-7% median), 3.99 → 3.60 s (-10% min). The chat-end-to-end win is bounded below by API latency variance, but the fetch_models_dev microbenchmark is the cleanest signal: 239 ms shaved off every fresh-process agent construction. Win compounds with the previous perf PRs: #22681 google_chat lazy-load, #22766 doctor parallel + IMDS off, #22790 gateway.platforms PEP 562. Tests: all 30 `tests/agent/test_models_dev.py` pass (added 4 new ones covering the new disk-cache-first path, force_refresh override, stale disk fallback, and missing-disk-cache fall-through). Full `tests/agent/` suite: 2560 passed, 0 failed.	2026-05-09 13:32:38 -07:00
LeonSGP43	14f38822fa	fix(models): prefer image modalities for vision routing	2026-05-07 05:54:12 -07:00
teknium1	40a98fb0fa	feat(minimax-oauth): full integration with peer OAuth providers Close integration gaps discovered by auditing qwen-oauth's file coverage. These are surfaces the original salvage missed — they all existed on main and were added in the 747 commits since PR #15203 was opened. Coverage added: - agent/credential_pool.py: seed pool from auth.json providers.minimax-oauth so `hermes auth list` reflects logged-in state and `hermes auth remove minimax-oauth <N>` works through the standard flow. - agent/credential_sources.py: register RemovalStep for minimax-oauth with suppression-aware `_clear_auth_store_provider`. - agent/models_dev.py: PROVIDER_TO_MODELS_DEV mapping (-> 'minimax' family). - hermes_cli/providers.py: HermesOverlay entry (anthropic_messages transport, oauth_external auth_type, api.minimax.io/anthropic base). - hermes_cli/model_normalize.py: add to _MATCHING_PREFIX_STRIP_PROVIDERS so `minimax-oauth/MiniMax-M2.7` in config.yaml gets correctly repaired. - hermes_cli/status.py: render MiniMax OAuth block in `hermes doctor` (logged-in / region / expires_at / error). - hermes_cli/web_server.py: register in OAUTH_PROVIDER_REGISTRY + dispatch branch in _resolve_provider_status so the dashboard auth page shows it. - website/docs/integrations/providers.md: full 'MiniMax (OAuth)' section. - website/docs/reference/cli-commands.md: --provider enum. - website/docs/user-guide/features/fallback-providers.md: fallback table row. - scripts/release.py AUTHOR_MAP: amanning3390 mapping (CI gate).	2026-04-29 09:53:42 -07:00
zhzouxiaoya12	3d90292eda	fix: normalize provider in list_provider_models to support aliases	2026-04-23 01:59:20 -07:00
hengm3467	c6b1ef4e58	feat: add Step Plan provider support (salvage #6005) Adds a first-class 'stepfun' API-key provider surfaced as Step Plan: - Support Step Plan setup for both International and China regions - Discover Step Plan models live from /step_plan/v1/models, with a small coding-focused fallback catalog when discovery is unavailable - Thread StepFun through provider metadata, setup persistence, status and doctor output, auxiliary routing, and model normalization - Add tests for provider resolution, model validation, metadata mapping, and StepFun region/model persistence Based on #6005 by @hengm3467. Co-authored-by: hengm3467 <100685635+hengm3467@users.noreply.github.com>	2026-04-22 02:59:58 -07:00
helix4u	a7dd6a3449	fix(gemini): hide stale and low-TPM Google models	2026-04-18 12:52:01 -07:00
helix4u	2eab7ee15f	fix(gemini): hide low-TPM Gemma models from exposed lists	2026-04-18 12:52:01 -07:00
kshitijk4poor	1b61ec470b	feat: add Ollama Cloud as built-in provider Add ollama-cloud as a first-class provider with full parity to existing API-key providers (gemini, zai, minimax, etc.): - PROVIDER_REGISTRY entry with OLLAMA_API_KEY env var - Provider aliases: ollama -> custom (local), ollama_cloud -> ollama-cloud - models.dev integration for accurate context lengths - URL-to-provider mapping (ollama.com -> ollama-cloud) - Passthrough model normalization (preserves Ollama model:tag format) - Default auxiliary model (nemotron-3-nano:30b) - HermesOverlay in providers.py - CLI --provider choices, CANONICAL_PROVIDERS entry - Dynamic model discovery with disk caching (1hr TTL) - 37 provider-specific tests Cherry-picked from PR #6038 by kshitijk4poor. Closes #3926	2026-04-16 02:22:09 -07:00
Teknium	8d023e43ed	refactor: remove dead code — 1,784 lines across 77 files (#9180) Deep scan with vulture, pyflakes, and manual cross-referencing identified: - 41 dead functions/methods (zero callers in production) - 7 production-dead functions (only test callers, tests deleted) - 5 dead constants/variables - ~35 unused imports across agent/, hermes_cli/, tools/, gateway/ Categories of dead code removed: - Refactoring leftovers: _set_default_model, _setup_copilot_reasoning_selection, rebuild_lookups, clear_session_context, get_logs_dir, clear_session - Unused API surface: search_models_dev, get_pricing, skills_categories, get_read_files_summary, clear_read_tracker, menu_labels, get_spinner_list - Dead compatibility wrappers: schedule_cronjob, list_cronjobs, remove_cronjob - Stale debug helpers: get_debug_session_info copies in 4 tool files (centralized version in debug_helpers.py already exists) - Dead gateway methods: send_emote, send_notice (matrix), send_reaction (bluebubbles), _normalize_inbound_text (feishu), fetch_room_history (matrix), _start_typing_indicator (signal), parse_feishu_post_content - Dead constants: NOUS_API_BASE_URL, SKILLS_TOOL_DESCRIPTION, FILE_TOOLS, VALID_ASPECT_RATIOS, MEMORY_DIR - Unused UI code: _interactive_provider_selection, _interactive_model_selection (superseded by prompt_toolkit picker) Test suite verified: 609 tests covering affected files all pass. Tests for removed functions deleted. Tests using removed utilities (clear_read_tracker, MEMORY_DIR) updated to use internal APIs directly.	2026-04-13 16:32:04 -07:00
Teknium	0e60a9dc25	fix: add kimi-coding-cn to remaining provider touchpoints Follow-up for salvaged PR #7637. Adds kimi-coding-cn to: - model_normalize.py (prefix strip) - providers.py (models.dev mapping) - runtime_provider.py (credential resolution) - setup.py (model list + setup label) - doctor.py (health check) - trajectory_compressor.py (URL detection) - models_dev.py (registry mapping) - integrations/providers.md (docs)	2026-04-13 11:20:37 -07:00
Teknium	078dba015d	fix: three provider-related bugs (#8161, #8181, #8147) (#8243) - Add openai/openai-codex -> openai mapping to PROVIDER_TO_MODELS_DEV so context-length lookups use models.dev data instead of 128k fallback. Fixes #8161. - Set api_mode from custom_providers entry when switching via hermes model, and clear stale api_mode when the entry has none. Also extract api_mode in _named_custom_provider_map(). Fixes #8181. - Convert OpenAI image_url content blocks to Anthropic image blocks when the endpoint is Anthropic-compatible (MiniMax, MiniMax-CN, or any URL containing /anthropic). Fixes #8147.	2026-04-12 01:44:18 -07:00
kshitijk4poor	6693e2a497	feat(xiaomi): add Xiaomi MiMo as first-class provider Cherry-picked from PR #7702 by kshitijk4poor. Adds Xiaomi MiMo as a direct provider (XIAOMI_API_KEY) with models: - mimo-v2-pro (1M context), mimo-v2-omni (256K, multimodal), mimo-v2-flash (256K, cheapest) Standard OpenAI-compatible provider checklist: auth.py, config.py, models.py, main.py, providers.py, doctor.py, model_normalize.py, model_metadata.py, models_dev.py, auxiliary_client.py, .env.example, cli-config.yaml.example. Follow-up: vision tasks use mimo-v2-omni (multimodal) instead of the user's main model. Non-vision aux uses the user's selected model. Added _PROVIDER_VISION_MODELS dict for provider-specific vision model overrides. On failure, falls back to aggregators (gemini flash) via existing fallback chain. Corrects pre-existing context lengths: mimo-v2-pro 1048576→1000000, mimo-v2-omni 1048576→256000, adds mimo-v2-flash 256000. 36 tests covering registry, aliases, auto-detect, credentials, models.dev, normalization, URL mapping, providers module, doctor, aux client, vision model override, and agent init.	2026-04-11 11:17:52 -07:00
kshitijk4poor	50bb4fe010	fix(vision): auto-resize oversized images, increase default timeout, fix vision capability detection Cherry-picked from PR #7749 by kshitijk4poor with modifications: - Raise hard image limit from 5 MB to 20 MB (matches most restrictive provider) - Send images at full resolution first; only auto-resize to 5 MB on API failure - Add _is_image_size_error() helper to detect size-related API rejections - Auto-resize uses Pillow (soft dep) with progressive downscale + JPEG quality reduction - Fix get_model_capabilities() to check modalities.input for vision support - Increase default vision timeout from 30s to 120s (matches hardcoded fallback intent) - Applied retry-with-resize to both vision_analyze_tool and browser_vision Closes #7740	2026-04-11 11:12:50 -07:00
alt-glitch	96c060018a	fix: remove 115 verified dead code symbols across 46 production files Automated dead code audit using vulture + coverage.py + ast-grep intersection, confirmed by Opus deep verification pass. Every symbol verified to have zero production callers (test imports excluded from reachability analysis). Removes ~1,534 lines of dead production code across 46 files and ~1,382 lines of stale test code. 3 entire files deleted (agent/builtin_memory_provider.py, hermes_cli/checklist.py, tests/hermes_cli/test_setup_model_selection.py). Co-authored-by: alt-glitch <balyan.sid@gmail.com>	2026-04-10 03:44:43 -07:00
kshitijk4poor	3377017eb4	feat(qwen): add Qwen OAuth provider with portal request support Based on #6079 by @tunamitom with critical fixes and comprehensive tests. Changes from #6079: - Fix: sanitization overwrite bug — Qwen message prep now runs AFTER codex field sanitization, not before (was silently discarding Qwen transforms) - Fix: missing try/except AuthError in runtime_provider.py — stale Qwen credentials now fall through to next provider on auto-detect - Fix: 'qwen' alias conflict — bare 'qwen' stays mapped to 'alibaba' (DashScope); use 'qwen-portal' or 'qwen-cli' for the OAuth provider - Fix: hardcoded ['coder-model'] replaced with live API fetch + curated fallback list (qwen3-coder-plus, qwen3-coder) - Fix: extract _is_qwen_portal() helper + _qwen_portal_headers() to replace 5 inline 'portal.qwen.ai' string checks and share headers between init and credential swap - Fix: add Qwen branch to _apply_client_headers_for_base_url for mid-session credential swaps - Fix: remove suspicious TypeError catch blocks around _prompt_provider_choice - Fix: handle bare string items in content lists (were silently dropped) - Fix: remove redundant dict() copies after deepcopy in message prep - Revert: unrelated ai-gateway test mock removal and model_switch.py comment deletion New tests (30 test functions): - _qwen_cli_auth_path, _read_qwen_cli_tokens (success + 3 error paths) - _save_qwen_cli_tokens (roundtrip, parent creation, permissions) - _qwen_access_token_is_expiring (5 edge cases: fresh, expired, within skew, None, non-numeric) - _refresh_qwen_cli_tokens (success, preserve old refresh, 4 error paths, default expires_in, disk persistence) - resolve_qwen_runtime_credentials (fresh, auto-refresh, force-refresh, missing token, env override) - get_qwen_auth_status (logged in, not logged in) - Runtime provider resolution (direct, pool entry, alias) - _build_api_kwargs (metadata, vl_high_resolution_images, message formatting, max_tokens suppression)	2026-04-08 13:46:30 -07:00
Teknium	187e90e425	refactor: replace inline HERMES_HOME re-implementations with get_hermes_home() 16 callsites across 14 files were re-deriving the hermes home path via os.environ.get('HERMES_HOME', ...) instead of using the canonical get_hermes_home() from hermes_constants. This breaks profiles — each profile has its own HERMES_HOME, and the inline fallback defaults to ~/.hermes regardless. Fixed by importing and calling get_hermes_home() at each site. For files already inside the hermes process (agent/, hermes_cli/, tools/, gateway/, plugins/), this is always safe. Files that run outside the process context (mcp_serve.py, mcp_oauth.py) already had correct try/except ImportError fallbacks and were left alone. Skipped: hermes_constants.py (IS the implementation), env_loader.py (bootstrap), profiles.py (intentionally manipulates the env var), standalone scripts (optional-skills/, skills/), and tests.	2026-04-07 10:40:34 -07:00
Teknium	d0ffb111c2	refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture) and manual analysis of the entire codebase. Changes by category: Unused imports removed (~95 across 55 files): - Removed genuinely unused imports from all major subsystems - agent/, hermes_cli/, tools/, gateway/, plugins/, cron/ - Includes imports in try/except blocks that were truly unused (vs availability checks which were left alone) Unused variables removed (~25): - Removed dead variables: connected, inner, channels, last_exc, source, new_server_names, verify, pconfig, default_terminal, result, pending_handled, temperature, loop - Dropped unused argparse subparser assignments in hermes_cli/main.py (12 instances of add_parser() where result was never used) Dead code removed: - run_agent.py: Removed dead ternary (None if False else None) and surrounding unreachable branch in identity fallback - run_agent.py: Removed write-only attribute _last_reported_tool - hermes_cli/providers.py: Removed dead @property decorator on module-level function (decorator has no effect outside a class) - gateway/run.py: Removed unused MCP config load before reconnect - gateway/platforms/slack.py: Removed dead SessionSource construction Undefined name bugs fixed (would cause NameError at runtime): - batch_runner.py: Added missing logger = logging.getLogger(__name__) - tools/environments/daytona.py: Added missing Dict and Path imports Unnecessary global statements removed (14): - tools/terminal_tool.py: 5 functions declared global for dicts they only mutated via .pop()/[key]=value (no rebinding) - tools/browser_tool.py: cleanup thread loop only reads flag - tools/rl_training_tool.py: 4 functions only do dict mutations - tools/mcp_oauth.py: only reads the global - hermes_time.py: only reads cached values Inefficient patterns fixed: - startswith/endswith tuple form: 15 instances of x.startswith('a') or x.startswith('b') consolidated to x.startswith(('a', 'b')) - len(x)==0 / len(x)>0: 13 instances replaced with pythonic truthiness checks (not x / bool(x)) - in dict.keys(): 5 instances simplified to in dict - Redefined unused name: removed duplicate _strip_mdv2 import in send_message_tool.py Other fixes: - hermes_cli/doctor.py: Replaced undefined logger.debug() with pass - hermes_cli/config.py: Consolidated chained .endswith() calls Test results: 3934 passed, 17 failed (all pre-existing on main), 19 skipped. Zero regressions.	2026-04-07 10:25:31 -07:00
Teknium	cc7136b1ac	fix: update Gemini model catalog + wire models.dev as live model source Follow-up for salvaged PR #5494: - Update model catalog to Gemini 3.x + Gemma 4 (drop deprecated 2.0) - Add list_agentic_models() to models_dev.py with noise filter - Wire models.dev into _model_flow_api_key_provider as primary source (static curated list serves as offline fallback) - Add gemini -> google mapping in PROVIDER_TO_MODELS_DEV - Fix Gemma 4 context lengths to 256K (models.dev values) - Update auxiliary model to gemini-3-flash-preview - Expand tests: 3.x catalog, context lengths, models.dev integration	2026-04-06 10:28:03 -07:00
Teknium	4976a8b066	feat: /model command — models.dev primary database + --provider flag (#5181) Full overhaul of the model/provider system. ## What changed - models.dev (109 providers, 4000+ models) as primary database for provider identity AND model metadata - --provider flag replaces colon syntax for explicit provider switching - Full ModelInfo/ProviderInfo dataclasses with context, cost, capabilities, modalities - HermesOverlay system merges models.dev + Hermes-specific transport/auth/aggregator flags - User-defined endpoints via config.yaml providers: section - /model (no args) lists authenticated providers with curated model catalog - Rich metadata display: context window, max output, cost/M tokens, capabilities - Config migration: custom_providers list → providers dict (v11→v12) - AIAgent.switch_model() for in-place model swap preserving conversation ## Files agent/models_dev.py, hermes_cli/providers.py, hermes_cli/model_switch.py, hermes_cli/model_normalize.py, cli.py, gateway/run.py, run_agent.py, hermes_cli/config.py, hermes_cli/commands.py	2026-04-05 01:04:44 -07:00
Teknium	3a68ec3172	feat: add Fireworks context length detection support (#4158) - Add api.fireworks.ai to _URL_TO_PROVIDER for automatic provider detection - Add fireworks to PROVIDER_TO_MODELS_DEV mapped to 'fireworks-ai' (the correct models.dev provider key — original PR used 'fireworks' which would silently fail the lookup) Cherry-picked from PR #3989 with models.dev key fix. Co-authored-by: sroecker <sroecker@users.noreply.github.com>	2026-03-30 20:37:08 -07:00
Teknium	2dd286c162	fix: write models.dev disk cache atomically (#3588) Use atomic_json_write() from utils.py instead of plain open()/json.dump() for the models.dev disk cache. Prevents corrupted cache if the process is killed mid-write — _load_disk_cache() silently returns {} on corrupt JSON, losing all model metadata until the next successful API fetch. Co-authored-by: memosr <memosr@users.noreply.github.com>	2026-03-28 14:20:30 -07:00
Test	55ce601502	fix: 6 bugs in model metadata, reasoning detection, and delegate tool Cherry-picked from PR #2169 by @0xbyt4. 1. _strip_provider_prefix: skip Ollama model:tag names (qwen:0.5b) 2. Fuzzy match: remove reverse direction that made claude-sonnet-4 resolve to 1M instead of 200K 3. _has_content_after_think_block: reuse _strip_think_blocks() to handle all tag variants (thinking, reasoning, REASONING_SCRATCHPAD) 4. models.dev lookup: elif→if so nous provider also queries models.dev 5. Disk cache fallback: use 5-min TTL instead of full hour so network is retried soon 6. Delegate build: wrap child construction in try/finally so _last_resolved_tool_names is always restored on exception	2026-03-20 08:52:37 -07:00
Teknium	88643a1ba9	feat: overhaul context length detection with models.dev and provider-aware resolution (#2158) Replace the fragile hardcoded context length system with a multi-source resolution chain that correctly identifies context windows per provider. Key changes: - New agent/models_dev.py: Fetches and caches the models.dev registry (3800+ models across 100+ providers with per-provider context windows). In-memory cache (1hr TTL) + disk cache for cold starts. - Rewritten get_model_context_length() resolution chain: 0. Config override (model.context_length) 1. Custom providers per-model context_length 2. Persistent disk cache 3. Endpoint /models (local servers) 4. Anthropic /v1/models API (max_input_tokens, API-key only) 5. OpenRouter live API (existing, unchanged) 6. Nous suffix-match via OpenRouter (dot/dash normalization) 7. models.dev registry lookup (provider-aware) 8. Thin hardcoded defaults (broad family patterns) 9. 128K fallback (was 2M) - Provider-aware context: same model now correctly resolves to different context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic, 128K on GitHub Copilot). Provider name flows through ContextCompressor. - DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns. models.dev replaces the per-model hardcoding. - CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K] to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M. - hermes model: prompts for context_length when configuring custom endpoints. Supports shorthand (32k, 128K). Saved to custom_providers per-model config. - custom_providers schema extended with optional models dict for per-model context_length (backward compatible). - Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash normalization. Handles all 15 current Nous models. - Anthropic direct: queries /v1/models for max_input_tokens. Only works with regular API keys (sk-ant-api*), not OAuth tokens. Falls through to models.dev for OAuth users. Tests: 5574 passed (18 new tests for models_dev + updated probe tiers) Docs: Updated configuration.md context length section, AGENTS.md Co-authored-by: Test <test@test.com>	2026-03-20 06:04:33 -07:00
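
The multi-source resolution chain described in commit 88643a1ba9 can be pictured as an ordered list of lookups where the first source that returns a value wins and a fixed fallback applies when none does. The sketch below is illustrative only and is not the code in agent/model_metadata.py: `resolve_context_length`, `from_config`, `from_registry`, `CONFIG_OVERRIDES`, `TOY_REGISTRY`, and the provider key `github-copilot` are hypothetical names, the integer used for the "128K fallback" is assumed, and only two of the ten listed steps are modeled.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Final fallback named in the commit message ("128K fallback").
# The exact integer is an assumption made for this sketch.
FALLBACK_CONTEXT_LENGTH = 128_000

# A resolver takes (model, provider) and returns a context length or None.
Resolver = Callable[[str, Optional[str]], Optional[int]]


def resolve_context_length(
    model: str,
    provider: Optional[str],
    resolvers: Sequence[Resolver],
    fallback: int = FALLBACK_CONTEXT_LENGTH,
) -> int:
    """Return the first positive value produced by the ordered resolvers."""
    for resolver in resolvers:
        value = resolver(model, provider)
        if value is not None and value > 0:
            return value
    return fallback


# Hypothetical stand-ins for two steps of the chain.
CONFIG_OVERRIDES: Dict[str, int] = {}
TOY_REGISTRY: Dict[Tuple[str, str], int] = {
    # Values taken from the example in the commit message.
    ("anthropic", "claude-opus-4.6"): 1_000_000,
    ("github-copilot", "claude-opus-4.6"): 128_000,
}


def from_config(model: str, provider: Optional[str]) -> Optional[int]:
    """Step 0 in the commit's list: an explicit user override."""
    return CONFIG_OVERRIDES.get(model)


def from_registry(model: str, provider: Optional[str]) -> Optional[int]:
    """Provider-aware registry lookup: the key includes the provider."""
    if provider is None:
        return None
    return TOY_REGISTRY.get((provider, model))


CHAIN = (from_config, from_registry)

if __name__ == "__main__":
    print(resolve_context_length("claude-opus-4.6", "anthropic", CHAIN))
    print(resolve_context_length("claude-opus-4.6", "github-copilot", CHAIN))
    print(resolve_context_length("unknown-model", None, CHAIN))
```

Keying the registry on the (provider, model) pair is what lets the same model name resolve to different windows per provider, which is the behavior the commit message calls provider-aware context.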

26 commits
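
The cache-first policy from commit 775c0e22cf (use the disk cache when its mtime is within the TTL, otherwise go to the network) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in agent/models_dev.py: `load_fresh_disk_cache`, `fetch_registry`, and the `fetch_network` callback are hypothetical names, and the in-memory stage and the stale-disk fallback on network failure that the commit message mentions are left out.

```python
import json
import os
import time
from typing import Callable, Optional

# One hour, matching the TTL stated in the commit message.
CACHE_TTL_SECONDS = 3600


def load_fresh_disk_cache(path: str, ttl: float = CACHE_TTL_SECONDS) -> Optional[dict]:
    """Return the cached dict if the file exists and is younger than ttl."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return None  # missing cache: caller falls through to the network
    age = time.time() - mtime
    if age < 0 or age > ttl:
        return None  # clock skew or stale cache: fall through to the network
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        return None  # unreadable or corrupt JSON is treated as a miss
    return data if isinstance(data, dict) and data else None


def fetch_registry(
    path: str,
    fetch_network: Callable[[], dict],
    force_refresh: bool = False,
    ttl: float = CACHE_TTL_SECONDS,
) -> dict:
    """Prefer a fresh disk cache; force_refresh always goes to the network."""
    if not force_refresh:
        cached = load_fresh_disk_cache(path, ttl)
        if cached is not None:
            return cached
    return fetch_network()
```

Treating a negative file age as a miss mirrors the clock-skew invariant listed in the commit message: a file that appears to come from the future is not trusted as fresh.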