hermes-agent/website/docs/user-guide
Ben Barclay eddfecd2ce fix(vision): cap vision_analyze fan-out concurrency process-wide
A single agent turn can fan out N vision_analyze calls at once — the
classic trigger is "analyze every frame of this video", where ffmpeg
explodes a clip into dozens of frames and the model calls vision_analyze
on each. Every call does a CPU-heavy base64-encode/resize burst AND holds
a long-lived LLM stream open. The tool executor runs concurrent tool calls
on a per-session ThreadPoolExecutor (_MAX_TOOL_WORKERS=8), and multiple
agent sessions share one process (the dashboard runs the agent in-process),
so there was no global ceiling. In prod (June 2026) a video-frame fan-out
pinned a worker thread at ~100% CPU and starved the shared asyncio event
loop that also serves the dashboard's /api/status liveness probe, flapping
the instance to UNHEALTHY even though nothing had crashed.

Add a process-global threading.BoundedSemaphore that bounds how many vision
analyses run concurrently across the whole process, held across the entire
analysis (image load + encode + LLM call) in the single _handle_vision_analyze
chokepoint (covers both the native fast path and the legacy aux-LLM path).

It is a threading semaphore, NOT asyncio: each vision call is dispatched
through model_tools._run_async on a per-thread event loop, so an asyncio
primitive bound to one loop cannot coordinate across them. The acquire is
offloaded via run_in_executor so waiting for a slot never blocks the calling
loop.

Default: min(host CPUs, 4), floored at 1 — respect the host's concurrency,
or lower. Override via auxiliary.vision.max_concurrency (config.yaml) or
HERMES_VISION_MAX_CONCURRENCY (env). Values < 1 are ignored so the cap can
never be disabled into an unbounded fan-out.

Tests: bounded-fan-out regression guard + a control proving it would fail
without the cap; resolver tests for host-cpu default, ceiling clamp, low-cpu
host, env override, and sub-1 rejection. Pre-existing handler tests updated
for the now-async _handle_vision_analyze. Verified via the real
registry.dispatch -> _run_async per-thread-loop path (16 concurrent calls,
peak bounded to cap).
2026-06-29 01:27:10 -07:00
..
features docs: reconcile docs with code across last 3 releases (#54254) 2026-06-28 12:47:50 -07:00
messaging feat(slack): nudge stale installs to add mpim scopes; mark message.mpim required 2026-06-29 01:02:53 -07:00
secrets feat(secrets/bitwarden): EU Cloud + self-hosted server URL support (#31378) 2026-05-24 02:19:57 -07:00
skills feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
_category_.json feat: add documentation website (Docusaurus) 2026-03-05 05:24:55 -08:00
checkpoints-and-rollback.md feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709) 2026-05-06 05:44:35 -07:00
cli.md docs: deep audit — registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952) 2026-06-07 01:39:06 -07:00
configuration.md fix(vision): cap vision_analyze fan-out concurrency process-wide 2026-06-29 01:27:10 -07:00
configuring-models.md fix(cli): warn when in-session model switch will preflight-compress 2026-06-21 16:29:31 +05:30
desktop.md fix(desktop): route old runtimes through dashboard when serve is absent 2026-06-28 22:10:42 -05:00
docker.md fix(docker): replace dashboard --insecure with basic-auth provider 2026-06-21 19:05:27 -07:00
git-worktrees.md docs: deep audit — registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952) 2026-06-07 01:39:06 -07:00
managed-scope.md docs: add managed scope admin guide + cross-link from configuration 2026-06-19 07:46:33 -07:00
multi-profile-gateways.md docs(gateway): document multiplexing opt-in + contract changes 2026-06-19 07:34:15 -07:00
profile-distributions.md Expand .gitignore example 2026-06-20 20:42:49 -07:00
profiles.md fix: make profile subprocess HOME policy explicit 2026-06-14 03:20:21 -07:00
security.md Make email pairing opt-in 2026-06-21 22:43:57 -07:00
sessions.md docs(sessions): clarify sessions.json is the gateway routing index, not the session list (#51726) 2026-06-23 23:56:36 -07:00
tui.md docs(tui): correct HERMES_TUI_GATEWAY_URL — dashboard-internal, not remote-attach (#42162) 2026-06-08 09:37:03 -07:00
windows-native.md docs(windows): correct native data dir to %LOCALAPPDATA%\hermes (#42856) 2026-06-09 14:11:20 -05:00
windows-wsl-quickstart.md fix(docs): update all install instructions everywhere 2026-06-04 21:07:45 -04:00