mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-26 01:01:40 +00:00

KUSH42 34d06a9802 fix(compaction): don't halve context_length on output-cap-too-large errors		2026-04-09 11:27:41 -07:00

When the API returns "max_tokens too large given prompt" (input tokens are within the context window, but input + requested output > window), the old code incorrectly routed through the same handler as "prompt too long" errors, calling get_next_probe_tier() and permanently halving context_length. This made things worse: the window was fine, only the requested output size needed trimming for that one call.

Two distinct error classes now handled separately:

- Prompt too long: input itself exceeds context window. Fix: compress history + halve context_length (existing behaviour, unchanged).
- Output cap too large: input OK, but input + max_tokens > window. Fix: parse available_tokens from the error message, set a one-shot _ephemeral_max_output_tokens override for the retry, and leave context_length completely untouched.

Changes:

- agent/model_metadata.py: add parse_available_output_tokens_from_error() that detects Anthropic's "available_tokens: N" error format and returns the available output budget, or None for all other error types.
- run_agent.py: call the new parser first in the is_context_length_error block; if it fires, set _ephemeral_max_output_tokens (with a 64-token safety margin) and break to retry without touching context_length. _build_api_kwargs consumes the ephemeral value exactly once then clears it so subsequent calls use self.max_tokens normally.
- agent/anthropic_adapter.py: expand build_anthropic_kwargs docstring to clearly document the max_tokens (output cap) vs context_length (total window) distinction, which is a persistent source of confusion due to the OpenAI-inherited "max_tokens" name.
- cli-config.yaml.example: add inline comments explaining both keys side by side where users are most likely to look.
- website/docs/integrations/providers.md: add a callout box at the top of "Context Length Detection" and clarify the troubleshooting entry.
- tests/test_ctx_halving_fix.py: 24 tests across four classes covering the parser, build_anthropic_kwargs clamping, ephemeral one-shot consumption, and the invariant that context_length is never mutated on output-cap errors.
docs	fix(compaction): don't halve context_length on output-cap-too-large errors	2026-04-09 11:27:41 -07:00
scripts	feat(website): add skills browse and search page to docs (#4500)	2026-04-02 10:47:38 -07:00
src	feat(website): add skills browse and search page to docs (#4500)	2026-04-02 10:47:38 -07:00
static	docs: stabilize website diagrams	2026-03-14 22:49:57 -07:00
.gitignore	chore: gitignore generated skills.json	2026-04-02 10:48:15 -07:00
docusaurus.config.ts	feat(website): add skills browse and search page to docs (#4500)	2026-04-02 10:47:38 -07:00
package-lock.json	docs: stabilize website diagrams	2026-03-14 22:49:57 -07:00
package.json	docs: stabilize website diagrams	2026-03-14 22:49:57 -07:00
README.md	docs: replace ASCII diagrams with Mermaid/lists, add linting note	2026-03-21 17:58:30 -07:00
sidebars.ts	fix(bluebubbles): add missing integration points and documentation (#6460)	2026-04-09 00:19:05 -07:00
tsconfig.json	feat: add documentation website (Docusaurus)	2026-03-05 05:24:55 -08:00
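
The output-cap handling described in the latest commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in agent/model_metadata.py or run_agent.py: the regular expression, the OneShotOutputCap class, and the sample error strings are assumptions made for the example. Only the "available_tokens: N" format, the 64-token safety margin, and the consume-once behaviour come from the commit message.

```python
import re
from typing import Optional

# Assumption: the error text contains "available_tokens: N" somewhere in it.
_AVAILABLE_TOKENS_RE = re.compile(r"available_tokens:\s*(\d+)")

SAFETY_MARGIN = 64  # the commit message mentions a 64-token safety margin


def parse_available_output_tokens(error_message: str) -> Optional[int]:
    """Return N from 'available_tokens: N', or None for any other error text."""
    match = _AVAILABLE_TOKENS_RE.search(error_message)
    return int(match.group(1)) if match else None


class OneShotOutputCap:
    """Hypothetical holder for a max_tokens override that applies to one call only."""

    def __init__(self, default_max_tokens: int) -> None:
        self.default_max_tokens = default_max_tokens
        self._ephemeral: Optional[int] = None

    def set_from_error(self, error_message: str) -> bool:
        """Arm the override if the error is an output-cap error; report whether it was."""
        available = parse_available_output_tokens(error_message)
        if available is None:
            return False  # not an output-cap error; the caller handles it elsewhere
        self._ephemeral = max(1, available - SAFETY_MARGIN)
        return True

    def consume(self) -> int:
        """Return the override once, then fall back to the configured default."""
        value = self._ephemeral if self._ephemeral is not None else self.default_max_tokens
        self._ephemeral = None
        return value
```

With a default of 4096 and an error reporting available_tokens: 1000, the first consume() call returns 936 and every later call returns 4096 again. Nothing in the sketch reads or writes a context length, which mirrors the invariant the commit's tests are said to check.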

README.md

Website

This website is built using Docusaurus, a modern static website generator.

Installation

yarn

Local Development

yarn start

This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

Build

yarn build

This command generates static content into the build directory, which can be served by any static content hosting service.

Deployment

Using SSH:

USE_SSH=true yarn deploy

Not using SSH:

GIT_USER=<Your GitHub username> yarn deploy

If you are using GitHub Pages for hosting, this command is a convenient way to build the website and push it to the gh-pages branch.

Diagram Linting

CI runs ascii-guard to lint docs for ASCII box diagrams. Use Mermaid (a ```mermaid fenced code block) or plain lists/tables instead of ASCII boxes to avoid CI failures.
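
For a rough local pre-check before pushing, something like the sketch below may help. It is not ascii-guard and does not reproduce its rules. The function name and the heuristic are assumptions for illustration: a line counts as a box border if it looks like +-----+ or contains common Unicode box-drawing characters.

```python
import re
from typing import List, Tuple

# Assumed heuristic: "+----+" style borders, or common Unicode box-drawing
# characters. ascii-guard's actual detection rules may differ.
_BOX_RE = re.compile(r"^\s*\+[-=+]{2,}\+\s*$|[\u2500\u2502\u250c\u2510\u2514\u2518]")


def find_ascii_box_borders(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like box borders."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if _BOX_RE.search(line):
            hits.append((number, line.strip()))
    return hits
```

Markdown tables and plain lists are not flagged by this heuristic, because their lines neither start with + nor use box-drawing characters.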