hermes-agent/gateway
Teknium b2b4a9ee7d
fix(gateway): hygiene compression ignores config context_length and 1.4x exceeds model limit
Three bugs in gateway session hygiene pre-compression caused 'Session too
large' errors for ~200K context models like GLM-5-turbo on z.ai:

1. Gateway hygiene called get_model_context_length(model) without passing
   config_context_length, provider, or base_url — so user overrides like
   model.context_length: 180000 were ignored, and provider-aware detection
   (models.dev, z.ai endpoint) couldn't fire. The agent's own compressor
   correctly passed all three (run_agent.py line 1038).

2. The 1.4x safety factor on rough token estimates pushed the compression
   threshold above the model's actual context limit:
     200K * 0.85 * 1.4 = 238K > 200K (model limit)
   So hygiene never compressed, sessions grew past the limit, and the API
   rejected the request.

3. Same issue for the warn threshold: 200K * 0.95 * 1.4 = 266K.

Fix:
- Read model.context_length, provider, and base_url from config.yaml
  (same as run_agent.py does) and pass them to get_model_context_length()
- Resolve provider/base_url from runtime when not in config
- Cap the 1.4x-adjusted compress threshold at 95% of context_length
- Cap the 1.4x-adjusted warn threshold at context_length
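
The threshold arithmetic above can be sketched as follows. This is a minimal illustration, not the code in run.py: the function name, constant names, and returned keys are hypothetical, and only the ratios (0.85, 0.95, 1.4) and the two caps come from this commit message.

```python
# Hypothetical names; only the numeric ratios and caps come from the commit message.
SAFETY_FACTOR = 1.4        # multiplier applied because token estimates are rough
COMPRESS_RATIO = 0.85      # compress threshold as a fraction of context_length
WARN_RATIO = 0.95          # warn threshold as a fraction of context_length
COMPRESS_CAP_RATIO = 0.95  # fix: compress threshold never exceeds 95% of the limit


def hygiene_thresholds(context_length: int) -> dict:
    """Return uncapped (buggy) and capped (fixed) thresholds, in tokens."""
    raw_compress = context_length * COMPRESS_RATIO * SAFETY_FACTOR
    raw_warn = context_length * WARN_RATIO * SAFETY_FACTOR
    return {
        # Before the fix: both values can exceed the model's context limit.
        "raw_compress": round(raw_compress),
        "raw_warn": round(raw_warn),
        # After the fix: cap compress at 95% of the limit, warn at the limit.
        "compress": round(min(raw_compress, context_length * COMPRESS_CAP_RATIO)),
        "warn": round(min(raw_warn, context_length)),
    }


if __name__ == "__main__":
    # 200K is the approximate context length cited for GLM-5-turbo above.
    print(hygiene_thresholds(200_000))
```

For a 200K context length the uncapped values are 238K and 266K, both above the model limit, while the capped values are 190K and 200K.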

Affects: z.ai GLM-5/GLM-5-turbo and any model with a ~200K or smaller
context window, where the 1.4x factor pushes the 85% compress threshold
above 100% of the context length.

Ref: Discord report from Ddox — glm-5-turbo on z.ai coding plan
2026-03-22 15:15:37 -07:00
platforms fix(matrix): duplicate messages, image caching for vision support (#2520) 2026-03-22 09:27:25 -07:00
__init__.py Enhance CLI with multi-platform messaging integration and configuration management 2026-02-02 19:01:51 -08:00
channel_directory.py feat: add SMS (Twilio) platform adapter 2026-03-17 03:14:53 -07:00
config.py feat(gateway): notify users when session auto-resets (#2519) 2026-03-22 09:33:39 -07:00
delivery.py Merge origin/main into hermes/hermes-5d160594 2026-03-14 19:34:05 -07:00
hooks.py feat(hooks): emit session:end lifecycle event (#1725) 2026-03-17 04:17:44 -07:00
mirror.py fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths 2026-03-13 21:32:53 -07:00
pairing.py fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths 2026-03-13 21:32:53 -07:00
run.py fix(gateway): hygiene compression ignores config context_length and 1.4x exceeds model limit 2026-03-22 15:15:37 -07:00
session.py feat(gateway): notify users when session auto-resets (#2519) 2026-03-22 09:33:39 -07:00
status.py fix(gateway): detect stopped processes and release stale locks on --replace 2026-03-21 18:13:53 -07:00
sticker_cache.py fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths 2026-03-13 21:32:53 -07:00
stream_consumer.py fix: handle message length overflow in streaming mode (#1783) 2026-03-17 11:00:52 -07:00