hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-01 12:02:05 +00:00

Teknium ee8cbfdc03 feat(web_extract): truncate-and-store instead of LLM summarization (#54843)	2026-06-29 10:00:49 -07:00

* feat(web_extract): truncate-and-store instead of LLM summarization

  web_extract no longer runs an auxiliary LLM over scraped pages. The extract backends (Firecrawl/Tavily/Exa/Parallel) already return clean, boilerplate-stripped markdown, so we return it directly: pages within a char budget (default 15000, web.extract_char_limit) come back whole; larger pages get a head+tail window plus an explicit footer giving the stored full-text path and the read_file call to page through the omitted middle. The full clean text is written to cache/web (mounted read-only into remote backends like the other cache dirs), so nothing is lost.

  Inline base64 images are converted to [IMAGE: alt] placeholders (token bombs dropped) while real http(s) image URLs are preserved as links so the agent can still web_extract/vision_analyze them.

  Removes process_content_with_llm + the chunked summarizer + check_auxiliary_model + _resolve_web_extract_auxiliary. context_references._default_url_fetcher is updated to the truncate path and its stale data.documents shape read is fixed to results (it was silently returning empty).

  Live before/after eval (firecrawl, 4 URLs): 11.7x faster overall (176.6s -> 15.1s); 10-60x on large pages. Quality identical; findability 4/4 (answer recoverable from stored full text on every truncated page).

  web_search is unchanged. No own scraper added; no changes to web_search.

* fix(web_extract): add char_limit to execute_code web_extract stub

  The new web_extract char_limit param must appear in the code_execution_tool _TOOL_STUBS signature (and doc line) or test_stubs_cover_all_schema_params fails: the stub schema must cover every real schema param.
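
The truncate-and-store behaviour described in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: the function names (strip_inline_images, truncate_and_store), the even head/tail split, the hash-based cache file name, and the footer wording are all assumptions made for the example. Only the 15000-character default and the [IMAGE: alt] placeholder format come from the message itself.

```python
import hashlib
import re
from pathlib import Path

# Default budget named in the commit message (web.extract_char_limit).
DEFAULT_CHAR_LIMIT = 15000

# Markdown images whose target is an inline base64 data URI.
# Images pointing at http(s) URLs do not match and are left alone.
_DATA_IMAGE = re.compile(r"!\[([^\]]*)\]\(data:image/[^)]*\)")


def strip_inline_images(markdown: str) -> str:
    """Replace inline base64 images with [IMAGE: alt] placeholders."""
    return _DATA_IMAGE.sub(lambda m: f"[IMAGE: {m.group(1)}]", markdown)


def truncate_and_store(
    text: str,
    url: str,
    cache_dir: Path,
    char_limit: int = DEFAULT_CHAR_LIMIT,
) -> str:
    """Return the page whole if it fits, else a head+tail window with a footer.

    Hypothetical helper: the real implementation may split the window and
    name the cache file differently.
    """
    text = strip_inline_images(text)
    if len(text) <= char_limit:
        return text  # within budget: nothing is stored, nothing is cut

    # Persist the full cleaned text so the omitted middle stays recoverable.
    cache_dir.mkdir(parents=True, exist_ok=True)
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".md"
    path = cache_dir / name
    path.write_text(text, encoding="utf-8")

    # Assumed split: half of the budget for the head, the rest for the tail.
    head_len = char_limit // 2
    tail_len = char_limit - head_len
    omitted = len(text) - char_limit

    footer = (
        f"\n\n[Truncated: {omitted} characters omitted from the middle. "
        f"Full text stored at {path}; page through it with read_file.]"
    )
    head = text[:head_len]
    tail = text[len(text) - tail_len:]
    return head + "\n\n[...]\n\n" + tail + footer
```

Under these assumptions a page at or below the limit is returned unchanged and no cache file is written, while a longer page yields its first and last portions plus a footer pointing at the stored copy.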
__init__.py	test: reorganize test structure and add missing unit tests	2026-02-26 03:20:08 +03:00
test_batch_runner.py	test: reorganize test structure and add missing unit tests	2026-02-26 03:20:08 +03:00
test_checkpoint_resumption.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_daytona_terminal.py	test(daytona): add unit and integration tests for Daytona backend	2026-03-05 10:26:22 -08:00
test_ha_integration.py	refactor(gateway): migrate Home Assistant adapter to bundled plugin	2026-06-06 11:46:24 -07:00
test_modal_terminal.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_voice_channel_flow.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_web_tools.py	feat(web_extract): truncate-and-store instead of LLM summarization (#54843)	2026-06-29 10:00:49 -07:00