hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Ben Barclay eddfecd2ce fix(vision): cap vision_analyze fan-out concurrency process-wide	2026-06-29 01:27:10 -07:00

A single agent turn can fan out N vision_analyze calls at once; the classic trigger is "analyze every frame of this video", where ffmpeg explodes a clip into dozens of frames and the model calls vision_analyze on each. Every call does a CPU-heavy base64-encode/resize burst AND holds a long-lived LLM stream open. The tool executor runs concurrent tool calls on a per-session ThreadPoolExecutor (_MAX_TOOL_WORKERS=8), and multiple agent sessions share one process (the dashboard runs the agent in-process), so there was no global ceiling.

In prod (June 2026) a video-frame fan-out pinned a worker thread at ~100% CPU and starved the shared asyncio event loop that also serves the dashboard's /api/status liveness probe, flapping the instance to UNHEALTHY even though nothing had crashed.

Add a process-global threading.BoundedSemaphore that bounds how many vision analyses run concurrently across the whole process, held across the entire analysis (image load + encode + LLM call) in the single _handle_vision_analyze chokepoint (covers both the native fast path and the legacy aux-LLM path). It is a threading semaphore, NOT asyncio: each vision call is dispatched through model_tools._run_async on a per-thread event loop, so an asyncio primitive bound to one loop cannot coordinate across them. The acquire is offloaded via run_in_executor so waiting for a slot never blocks the calling loop.

Default: min(host CPUs, 4), floored at 1 (respect the host's concurrency, or lower). Override via auxiliary.vision.max_concurrency (config.yaml) or HERMES_VISION_MAX_CONCURRENCY (env). Values < 1 are ignored so the cap can never be disabled into an unbounded fan-out.

Tests: bounded-fan-out regression guard + a control proving it would fail without the cap; resolver tests for host-cpu default, ceiling clamp, low-cpu host, env override, and sub-1 rejection. Pre-existing handler tests updated for the now-async _handle_vision_analyze. Verified via the real registry.dispatch -> _run_async per-thread-loop path (16 concurrent calls, peak bounded to cap).
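
The mechanism described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: resolve_max_concurrency, bounded_analysis, and demo are hypothetical names, the config.yaml override is omitted, and taking an env override of 1 or more as-is (without clamping it to the ceiling) is an assumption. Only the env var name, the min(host CPUs, 4) default floored at 1, the sub-1 rejection, and the threading.BoundedSemaphore plus run_in_executor pattern come from the message itself.

```python
import asyncio
import os
import threading

_DEFAULT_CEILING = 4  # default cap from the commit message: min(host CPUs, 4)


def resolve_max_concurrency(env=None, cpu_count=None):
    """Hypothetical resolver: env override if >= 1, else min(CPUs, 4), floored at 1."""
    env = os.environ if env is None else env
    raw = env.get("HERMES_VISION_MAX_CONCURRENCY")
    if raw is not None:
        try:
            value = int(raw)
        except ValueError:
            value = 0  # unparsable values are treated like sub-1 values
        if value >= 1:
            return value  # assumption: a valid override is used as-is
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(cpus, _DEFAULT_CEILING))


async def bounded_analysis(analyze, sem):
    """Hold a process-wide threading semaphore across one whole analysis.

    The blocking acquire runs in the loop's default executor, so waiting for
    a slot does not block the event loop that called us.
    """
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, sem.acquire)
    try:
        return await analyze()
    finally:
        sem.release()


def demo(calls=16, cap=3):
    """Run `calls` analyses, each on its own thread and event loop; return peak concurrency."""
    sem = threading.BoundedSemaphore(cap)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    async def fake_analysis():
        # Stand-in for image load + encode + LLM call.
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.02)
        with lock:
            state["active"] -= 1
        return "ok"

    def worker():
        # One event loop per thread, so an asyncio.Semaphore could not be shared here.
        asyncio.run(bounded_analysis(fake_analysis, sem))

    threads = [threading.Thread(target=worker) for _ in range(calls)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    print("resolved cap:", resolve_max_concurrency())
    print("peak concurrency:", demo())
```

Because every worker thread runs its own event loop, the semaphore has to be a threading primitive; the demo's peak never exceeds the cap regardless of how many threads start at once.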
_category_.json	feat: add documentation website (Docusaurus)	2026-03-05 05:24:55 -08:00
automation-blueprints-catalog.mdx	docs: finish Automation Blueprints terminology rebrand (#44470)	2026-06-11 17:22:22 -04:00
cli-commands.md	feat(cli): add headless `hermes serve` backend; desktop no longer launches `dashboard`	2026-06-28 22:04:22 -05:00
environment-variables.md	fix(vision): cap vision_analyze fan-out concurrency process-wide	2026-06-29 01:27:10 -07:00
faq.md	feat(docs): clarify platform support	2026-06-26 11:37:56 -07:00
mcp-config-reference.md	refactor: remove agent-callable send_message tool (#47856)	2026-06-17 07:11:23 -07:00
model-catalog.md	docs: deep audit: registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952)	2026-06-07 01:39:06 -07:00
optional-skills-catalog.md	fix(docs): regenerate skill docs to fix stale cross-links, add tool-search to sidebar	2026-06-20 20:42:49 -07:00
profile-commands.md	fix(profile): make clone-from a full source selector	2026-06-13 07:33:58 -07:00
skills-catalog.md	Merge remote-tracking branch 'origin/main' into bb/pets	2026-06-22 05:25:49 -05:00
slash-commands.md	docs: reconcile docs with code across last 3 releases (#54254)	2026-06-28 12:47:50 -07:00
tools-reference.md	docs: reconcile docs with code across last 3 releases (#54254)	2026-06-28 12:47:50 -07:00
toolsets-reference.md	docs: reconcile docs with code across last 3 releases (#54254)	2026-06-28 12:47:50 -07:00