hermes-agent/agent
Teknium 346601ca8d
fix(context): invalidate stale Codex OAuth cache entries >= 400k (#15078)
PR #14935 added a Codex-aware context resolver but only new lookups
hit the live /models probe. Users who had run Hermes on gpt-5.5 / 5.4
BEFORE that PR already had the wrong value (e.g. 1,050,000 from
models.dev) persisted in ~/.hermes/context_length_cache.yaml, and the
cache-first lookup in get_model_context_length() returns it forever.

Symptom (reported in the wild by Ludwig, min heo, Gaoge on current
main at 6051fba9d, which is AFTER #14935):
  * Startup banner shows context usage against 1M
  * Compression fires late and then OpenAI hard-rejects with
    'context length will be reduced from 1,050,000 to 128,000'
    around the real 272k boundary.

Fix: when the step-1 cache returns a value for an openai-codex lookup,
check whether it's >= 400k. Codex OAuth caps every slug at 272k (live
probe values) so anything at or above 400k is definitionally a
pre-#14935 leftover. Drop that entry from the on-disk cache and fall
through to step 5, which runs the live /models probe and repersists
the correct value (or 272k from the hardcoded fallback if the probe
fails). Non-Codex providers and legitimately-cached Codex entries at
272k are untouched.

Changes:
- agent/model_metadata.py:
  * _invalidate_cached_context_length() — drop a single entry from
    context_length_cache.yaml and rewrite the file.
  * Step-1 cache check in get_model_context_length() now gates
    provider=='openai-codex' entries >= 400k through invalidation
    instead of returning them.

Tests (3 new in TestCodexOAuthContextLength):
- stale 1.05M Codex entry is dropped from disk AND re-resolved
  through the live probe to 272k; unrelated cache entries survive.
- fresh 272k Codex entry is respected (no probe call, no invalidation).
- non-Codex 1M entries (e.g. anthropic/claude-opus-4.6 on OpenRouter)
  are unaffected — the guard is strictly scoped to openai-codex.

Full tests/agent/test_model_metadata.py: 88 passed.
2026-04-24 04:46:07 -07:00
..
transports fix(kimi,mcp): Moonshot schema sanitizer + MCP schema robustness (#14805) 2026-04-23 16:11:57 -07:00
__init__.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
account_usage.py feat(account-usage): add per-provider account limits module 2026-04-21 01:56:35 -07:00
anthropic_adapter.py refactor: remove _nr_to_assistant_message shim + fix flush_memories guard 2026-04-23 02:30:05 -07:00
auxiliary_client.py fix(aux): normalize GitHub Copilot provider slugs 2026-04-24 03:33:29 -07:00
bedrock_adapter.py feat: native AWS Bedrock provider via Converse API 2026-04-15 16:17:17 -07:00
codex_responses_adapter.py refactor: extract codex_responses logic into dedicated adapter 2026-04-20 11:53:17 -07:00
context_compressor.py fix(compress): don't reach into ContextCompressor privates from /compress (#15039) 2026-04-24 02:55:43 -07:00
context_engine.py fix(compress): don't reach into ContextCompressor privates from /compress (#15039) 2026-04-24 02:55:43 -07:00
context_references.py fix(agent): fall back when rg is blocked for @folder references 2026-04-20 01:56:41 -07:00
copilot_acp_client.py fix(security): apply file safety to copilot acp fs 2026-04-21 01:31:58 -07:00
credential_pool.py fix(auth): unify credential source removal — every source sticks (#13427) 2026-04-21 01:52:49 -07:00
credential_sources.py fix(auth): unify credential source removal — every source sticks (#13427) 2026-04-21 01:52:49 -07:00
display.py fix(display): render <missing old_text> in memory previews instead of empty quotes (#12852) 2026-04-19 22:45:47 -07:00
error_classifier.py fix(errors): classify OpenRouter privacy-guardrail 404s distinctly (#14943) 2026-04-23 23:26:29 -07:00
file_safety.py fix(security): apply file safety to copilot acp fs 2026-04-21 01:31:58 -07:00
gemini_cloudcode_adapter.py refactor: remove redundant local imports already available at module level 2026-04-21 00:50:58 -07:00
gemini_native_adapter.py refactor: remove redundant local imports already available at module level 2026-04-21 00:50:58 -07:00
gemini_schema.py fix(gemini): drop integer/number/boolean enums from tool schemas (#15082) 2026-04-24 03:40:00 -07:00
google_code_assist.py fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833) 2026-04-17 15:34:12 -07:00
google_oauth.py feat(gemini): add Google Gemini CLI OAuth provider via Cloud Code Assist (free + paid tiers) (#11270) 2026-04-16 16:49:00 -07:00
image_gen_provider.py feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) 2026-04-21 21:30:10 -07:00
image_gen_registry.py feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) 2026-04-21 21:30:10 -07:00
insights.py Merge branch 'main' into feat/dashboard-skill-analytics 2026-04-20 05:25:49 -07:00
manual_compression_feedback.py fix(gateway): make manual compression feedback truthful 2026-04-10 21:16:53 -07:00
memory_manager.py feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619) 2026-04-15 19:12:19 -07:00
memory_provider.py refactor(memory): drop on_session_reset — commit-only is enough 2026-04-15 11:28:45 -07:00
model_metadata.py fix(context): invalidate stale Codex OAuth cache entries >= 400k (#15078) 2026-04-24 04:46:07 -07:00
models_dev.py fix: normalize provider in list_provider_models to support aliases 2026-04-23 01:59:20 -07:00
moonshot_schema.py fix(kimi,mcp): Moonshot schema sanitizer + MCP schema robustness (#14805) 2026-04-23 16:11:57 -07:00
nous_rate_guard.py fix: Nous Portal rate limit guard — prevent retry amplification (#10568) 2026-04-15 16:31:48 -07:00
prompt_builder.py feat(agent): add PLATFORM_HINTS for matrix, mattermost, and feishu (#14428) 2026-04-23 12:50:22 +05:30
prompt_caching.py fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter 2026-03-21 16:54:43 -07:00
rate_limit_tracker.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
redact.py feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal (#13148) 2026-04-20 11:49:54 -07:00
retry_utils.py feat(agent): add jittered retry backoff 2026-04-08 00:41:36 -07:00
shell_hooks.py feat: shell hooks — wire shell scripts as Hermes hook callbacks 2026-04-20 20:53:51 -07:00
skill_commands.py refactor(commands): drop /provider, /plan handler, and clean up slash registry (#15047) 2026-04-24 03:10:52 -07:00
skill_utils.py fix(skills): follow symlinks in iter_skill_index_files 2026-04-22 17:43:30 -07:00
subdirectory_hints.py fix(agent): catch PermissionError in subdirectory hint discovery 2026-04-09 03:10:30 -07:00
title_generator.py fix: increase max_tokens for GLM 5.1 reasoning headroom 2026-04-22 18:44:07 -07:00
trajectory.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
usage_pricing.py fix(usage): read top-level Anthropic cache fields from OAI-compatible proxies 2026-04-22 17:40:49 -07:00