hermes-agent/website
Ben Barclay eddfecd2ce fix(vision): cap vision_analyze fan-out concurrency process-wide
A single agent turn can fan out N vision_analyze calls at once — the
classic trigger is "analyze every frame of this video", where ffmpeg
explodes a clip into dozens of frames and the model calls vision_analyze
on each. Every call does a CPU-heavy base64-encode/resize burst AND holds
a long-lived LLM stream open. The tool executor runs concurrent tool calls
on a per-session ThreadPoolExecutor (_MAX_TOOL_WORKERS=8), and multiple
agent sessions share one process (the dashboard runs the agent in-process),
so there was no global ceiling. In prod (June 2026) a video-frame fan-out
pinned a worker thread at ~100% CPU and starved the shared asyncio event
loop that also serves the dashboard's /api/status liveness probe, flapping
the instance to UNHEALTHY even though nothing had crashed.

Add a process-global threading.BoundedSemaphore that bounds how many vision
analyses run concurrently across the whole process, held across the entire
analysis (image load + encode + LLM call) in the single _handle_vision_analyze
chokepoint (covers both the native fast path and the legacy aux-LLM path).

It is a threading semaphore, NOT asyncio: each vision call is dispatched
through model_tools._run_async on a per-thread event loop, so an asyncio
primitive bound to one loop cannot coordinate across them. The acquire is
offloaded via run_in_executor so waiting for a slot never blocks the calling
loop.

Default: min(host CPUs, 4), floored at 1 — respect the host's concurrency,
or lower. Override via auxiliary.vision.max_concurrency (config.yaml) or
HERMES_VISION_MAX_CONCURRENCY (env). Values < 1 are ignored so the cap can
never be disabled into an unbounded fan-out.

Tests: bounded-fan-out regression guard + a control proving it would fail
without the cap; resolver tests for host-cpu default, ceiling clamp, low-cpu
host, env override, and sub-1 rejection. Pre-existing handler tests updated
for the now-async _handle_vision_analyze. Verified via the real
registry.dispatch -> _run_async per-thread-loop path (16 concurrent calls,
peak bounded to cap).
2026-06-29 01:27:10 -07:00
..
docs fix(vision): cap vision_analyze fan-out concurrency process-wide 2026-06-29 01:27:10 -07:00
i18n/zh-Hans/docusaurus-plugin-content-docs/current feat(slack): nudge stale installs to add mpim scopes; mark message.mpim required 2026-06-29 01:02:53 -07:00
scripts refactor(cron): rebrand Cron Recipes -> Automation Blueprints 2026-06-11 10:49:47 -07:00
src refactor(cron): rebrand Cron Recipes -> Automation Blueprints 2026-06-11 10:49:47 -07:00
static feat: add z-ai/glm-5.2 to OpenRouter and Nous model lists 2026-06-16 23:35:45 +05:30
.gitignore feat(skills-hub): health checks, freshness badge, and a watchdog cron (#32345) 2026-05-25 23:10:45 -07:00
docusaurus.config.ts docs: point desktop download links to site root (deprecate /desktop) (#46795) 2026-06-15 15:02:24 -04:00
package-lock.json docs(website): redirect old automation-templates URL to automation-blueprints 2026-06-12 09:46:27 -07:00
package.json docs(website): redirect old automation-templates URL to automation-blueprints 2026-06-12 09:46:27 -07:00
README.md docs: replace ASCII diagrams with Mermaid/lists, add linting note 2026-03-21 17:58:30 -07:00
sidebars.ts docs: reconcile docs with code across last 3 releases (#54254) 2026-06-28 12:47:50 -07:00
tsconfig.json change(tooling): typecheck in CI, update ts to 6 2026-06-10 11:59:34 -04:00

Website

This website is built using Docusaurus, a modern static website generator.

Installation

yarn

Local Development

yarn start

This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

Build

yarn build

This command generates static content into the build directory and can be served using any static contents hosting service.

Deployment

Using SSH:

USE_SSH=true yarn deploy

Not using SSH:

GIT_USER=<Your GitHub username> yarn deploy

If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the gh-pages branch.

Diagram Linting

CI runs ascii-guard to lint docs for ASCII box diagrams. Use Mermaid (````mermaid`) or plain lists/tables instead of ASCII boxes to avoid CI failures.