mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

teknium1 75317d82d0 fix(vision): narrow the fan-out cap to the CPU encode burst only

The original cap held a process-global slot across the WHOLE vision analysis (image load + encode + LLM call) with a default of min(CPUs, 4). That serialized legitimate multi-image workflows ("compare these 6 screenshots", "read this 10-page scan", "analyze every frame") behind a 4-wide gate, and on the native fast path it even throttled calls that make no LLM request at all. Excess calls queued (blocking acquire, nothing dropped), but the latency hit on real fan-out was the wrong tradeoff.

The incident was CPU exhaustion, not call count: concurrent base64/resize bursts saturated every core and left none to service the shared event loop serving /api/status. So cap ONLY that:

- A dedicated, bounded ThreadPoolExecutor (_vision_cpu_executor) runs the encode/resize/dimension-check off the caller's loop, sized to the host's usable core count with NO fixed ceiling: the cap tracks the actual exhausted resource (cores), not a magic number. Excess encodes queue on the executor; cores stay free for the loop.
- The LLM call is deliberately OUTSIDE the executor, so multi-image workflows keep full request concurrency.
- Override via auxiliary.vision.max_concurrency / HERMES_VISION_MAX_CONCURRENCY (honored verbatim, including above core count); sub-1 ignored.
- _vision_concurrency_slot() is now a no-op shim for back-compat.

Tests assert: resolver defaults to host cores with no ceiling; env/config override (incl. above cores); sub-1 rejection; the executor is dedicated and core-sized; encode runs on a vision-encode thread; and crucially that encode bursts are bounded to the cap while the analyses themselves stay fully concurrent (calls_peak > cap).

2026-06-29 01:27:10 -07:00
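The pattern this commit message describes can be illustrated with a minimal, self-contained sketch. It is not the project's code: `resolve_max_concurrency`, `Peak`, `encode`, and `run_demo` are hypothetical names, `os.cpu_count()` stands in for the "usable core count", the `auxiliary.vision.max_concurrency` config key is not modeled, and `asyncio.sleep` stands in for the LLM call. Only the environment variable name and the `vision-encode` thread prefix are taken from the message.

```python
import asyncio
import base64
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def resolve_max_concurrency(env=None):
    """Hypothetical resolver: env override honored verbatim, values below 1 ignored."""
    env = os.environ if env is None else env
    raw = env.get("HERMES_VISION_MAX_CONCURRENCY")
    if raw is not None:
        try:
            value = int(raw)
        except ValueError:
            value = 0  # unparsable input falls through to the default
        if value >= 1:
            return value
    # Assumption: cpu_count() approximates the host's usable cores.
    return os.cpu_count() or 1


class Peak:
    """Thread-safe counter that records the highest number of concurrent holders."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.current -= 1
        return False


def encode(image_bytes, encode_peak):
    """CPU-side step; runs on an executor thread, never on the event loop."""
    with encode_peak:
        time.sleep(0.02)  # widen the window so overlapping encodes are observable
        return base64.b64encode(image_bytes).decode("ascii")


async def run_demo(n_images=6, cap=2):
    encode_peak, calls_peak = Peak(), Peak()
    # Dedicated, bounded pool: at most `cap` encodes run at once, the rest queue.
    executor = ThreadPoolExecutor(max_workers=cap, thread_name_prefix="vision-encode")

    async def analyze(data):
        with calls_peak:
            loop = asyncio.get_running_loop()
            encoded = await loop.run_in_executor(executor, encode, data, encode_peak)
            # Stand-in for the LLM request, deliberately outside the executor.
            await asyncio.sleep(0.05)
            return len(encoded)

    try:
        results = await asyncio.gather(*(analyze(b"x" * 300) for _ in range(n_images)))
    finally:
        executor.shutdown(wait=True)
    return results, encode_peak.peak, calls_peak.peak
```

Calling `asyncio.run(run_demo(6, 2))` starts six analyses at once, while the encode step never exceeds two concurrent workers. This is the property the commit's tests are described as checking: encode bursts stay bounded while the number of in-flight analyses exceeds the cap.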
docs	fix(vision): narrow the fan-out cap to the CPU encode burst only	2026-06-29 01:27:10 -07:00
i18n/zh-Hans/docusaurus-plugin-content-docs/current	feat(slack): nudge stale installs to add mpim scopes; mark message.mpim required	2026-06-29 01:02:53 -07:00
scripts	refactor(cron): rebrand Cron Recipes -> Automation Blueprints	2026-06-11 10:49:47 -07:00
src	refactor(cron): rebrand Cron Recipes -> Automation Blueprints	2026-06-11 10:49:47 -07:00
static	feat: add z-ai/glm-5.2 to OpenRouter and Nous model lists	2026-06-16 23:35:45 +05:30
.gitignore	feat(skills-hub): health checks, freshness badge, and a watchdog cron (#32345)	2026-05-25 23:10:45 -07:00
docusaurus.config.ts	docs: point desktop download links to site root (deprecate /desktop) (#46795)	2026-06-15 15:02:24 -04:00
package-lock.json	docs(website): redirect old automation-templates URL to automation-blueprints	2026-06-12 09:46:27 -07:00
package.json	docs(website): redirect old automation-templates URL to automation-blueprints	2026-06-12 09:46:27 -07:00
README.md	docs: replace ASCII diagrams with Mermaid/lists, add linting note	2026-03-21 17:58:30 -07:00
sidebars.ts	docs: reconcile docs with code across last 3 releases (#54254)	2026-06-28 12:47:50 -07:00
tsconfig.json	change(tooling): typecheck in CI, update ts to 6	2026-06-10 11:59:34 -04:00

README.md

Website

This website is built using Docusaurus, a modern static website generator.

Installation

yarn

Local Development

yarn start

This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

Build

yarn build

This command generates static content into the build directory. The output can be served using any static content hosting service.

Deployment

Using SSH:

USE_SSH=true yarn deploy

Not using SSH:

GIT_USER=<Your GitHub username> yarn deploy

If you are using GitHub Pages for hosting, this command is a convenient way to build the website and push to the gh-pages branch.

Diagram Linting

CI runs ascii-guard to lint docs for ASCII box diagrams. Use Mermaid diagrams (fenced code blocks tagged mermaid) or plain lists/tables instead of ASCII boxes to avoid CI failures.