hermes-agent/website/docs
Teknium f783986f5a
fix: increase stream read timeout default to 120s, auto-raise for local LLMs (#6967)
Raise the default httpx stream read timeout from 60s to 120s for all
providers. Additionally, auto-detect local LLM endpoints (Ollama,
llama.cpp, vLLM) and raise the read timeout to HERMES_API_TIMEOUT
(1800s) since local models can take minutes for prefill on large
contexts before producing the first token.

The stale stream timeout already auto-detected local endpoints, but the
httpx read timeout did not. That left a hard 60s limit that users could
not discover, because HERMES_STREAM_READ_TIMEOUT was undocumented.

Changes:
- Default HERMES_STREAM_READ_TIMEOUT: 60s -> 120s
- Auto-detect local endpoints -> raise to 1800s (user override respected)
- Document HERMES_STREAM_READ_TIMEOUT and HERMES_STREAM_STALE_TIMEOUT
- Add 10 parametrized tests
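
The resolution order described above can be sketched roughly as follows.
This is an illustration only, not the code from the commit: the function
names, the loopback/private-address heuristic for "local", and reading
HERMES_API_TIMEOUT from the environment with a 1800s fallback are all
assumptions made for the example.

```python
import ipaddress
import os
from urllib.parse import urlparse

DEFAULT_STREAM_READ_TIMEOUT = 120.0   # new default from this change
LOCAL_STREAM_READ_TIMEOUT = 1800.0    # HERMES_API_TIMEOUT value cited above


def is_local_endpoint(base_url: str) -> bool:
    """Hypothetical heuristic: treat localhost, loopback and private IPs as local."""
    host = urlparse(base_url).hostname or ""
    if host == "localhost" or host.endswith(".local"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (e.g. a public DNS name), so assume remote.
        return False
    return ip.is_loopback or ip.is_private


def resolve_stream_read_timeout(base_url: str, env=None) -> float:
    """Pick the stream read timeout in seconds (hypothetical helper)."""
    env = os.environ if env is None else env
    # 1. An explicit user override always wins.
    override = env.get("HERMES_STREAM_READ_TIMEOUT")
    if override:
        return float(override)
    # 2. Local endpoints get the long timeout, since prefill can take minutes.
    if is_local_endpoint(base_url):
        return float(env.get("HERMES_API_TIMEOUT", LOCAL_STREAM_READ_TIMEOUT))
    # 3. Everything else uses the raised default.
    return DEFAULT_STREAM_READ_TIMEOUT
```

The returned value could then be supplied as the read component of an
httpx.Timeout. Checking the override first is what keeps the "user
override respected" guarantee even for local endpoints.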

Reported-by: Pavan Srinivas (@pavanandums)
2026-04-09 22:35:30 -07:00
developer-guide fix(bluebubbles): add missing integration points and documentation (#6460) 2026-04-09 00:19:05 -07:00
getting-started fix(termux): improve status and install UX 2026-04-09 16:24:53 -07:00
guides docs: fix 40+ discrepancies between documentation and codebase (#5818) 2026-04-07 10:17:44 -07:00
integrations fix(compaction): don't halve context_length on output-cap-too-large errors 2026-04-09 11:27:41 -07:00
reference fix: increase stream read timeout default to 120s, auto-raise for local LLMs (#6967) 2026-04-09 22:35:30 -07:00
user-guide feat: API server model name derived from profile name (#6857) 2026-04-09 17:07:29 -07:00
index.md fix(bluebubbles): add missing integration points and documentation (#6460) 2026-04-09 00:19:05 -07:00