hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-31 19:16:29 +00:00

teknium1 fa9383d27b feat(curator): umbrella-first prompt, inherit parent config, unbounded iterations

Based on three live test runs against 346 agent-created skills on the author's own setup (~6.5 min, opus-4.7, 86 API calls), the curator prompt needed three sharpenings before it consistently produced real umbrella consolidation instead of passive audit output:

Umbrella-first framing. The original 'decide keep/patch/archive/consolidate' framing lets opus default to 'keep' whenever two skills aren't byte-identical. The new prompt explicitly tells the reviewer that pairwise distinctness is the wrong bar; the right question is 'would a human maintainer write this as N separate skills, or one skill with N labeled subsections?' Expect 10-25 prefix clusters; merge each into an umbrella via one of three methods.

Three concrete consolidation methods. (a) Merge into an existing umbrella (patch the broadest skill, archive siblings); (b) Create a new umbrella SKILL.md (skill_manage action=create); (c) Demote session-specific detail into references/, templates/, or scripts/ under the umbrella via skill_manage action=write_file, then archive the narrow sibling. This matches the support-file vocabulary the review-prompt side already uses (PR #17213).

Two observed bailouts pre-empted: 'usage counters are zero so I can't judge' (rule 4: judge on content, not use_count) and 'each has a distinct trigger' (rule 5: pairwise distinctness is the wrong bar).

Config-aware parent inheritance. _run_llm_review() was building AIAgent() without explicit provider/model, hitting an auto-resolve path that returned empty credentials → HTTP 400 'No models provided' against OpenRouter. Fork now inherits the user's main provider and model (via load_config + resolve_runtime_provider) before spawning, so it runs on whatever the user is currently on, OAuth-backed or pool-backed included.

Unbounded iteration ceiling. max_iterations=8 was way too low for an umbrella-build pass over hundreds of skills. A live pass takes 50-100 API calls (scanning, clustering, skill_view'ing candidates, patching umbrellas, mv'ing siblings). Raised to 9999: the natural stopping criterion is 'no more clusters worth processing', not an arbitrary tool-call budget.

Tests updated: test_curator_review_prompt_has_invariants accepts DO NOT / MUST NOT and drops 'keep' from the required-verb set (the umbrella-first prompt correctly deemphasizes 'keep' as a first-class decision label since passive keep-everything is the failure mode being prevented). Added test_curator_review_prompt_is_umbrella_first asserting the umbrella framing, class-level thinking, references/ + templates/ + scripts/ support-file mentions, and the 'use_count is not evidence of value' pre-emption. Added test_curator_review_prompt_offers_support_file_actions asserting skill_manage action=create and action=write_file are both named.

Live validation on author's setup:
- Run 1 (old prompt): 3 archives, stopped after surveying (typical passive outcome)
- Run 2 (consolidation prompt): 44 archives, 3 patches, surfaced the 50-skill mlops reorg duplicate bug but didn't umbrella
- Run 3 (this prompt): 249 archives + 18 new class-level umbrellas created, reducing agent-created skills from 346 → 118 with every archived skill's content preserved as references/ under its umbrella. Pinned skill untouched.

Full report in PR description.

2026-04-28 22:33:33 -07:00
transports	feat(agent): add lmstudio integration	2026-04-28 12:27:36 -07:00
__init__.py	test: add unit tests for 8 modules (batch 2)	2026-02-26 13:54:20 +03:00
test_anthropic_adapter.py	fix(mcp): normalize nullable tool schemas	2026-04-28 04:58:03 -07:00
test_anthropic_keychain.py	fix: re-auth on stale OAuth token; read Claude Code credentials from macOS Keychain	2026-04-24 07:14:00 -07:00
test_auxiliary_client.py	feat(aux): translate extra_body.reasoning into Codex Responses API (#17004)	2026-04-28 05:47:42 -07:00
test_auxiliary_client_anthropic_custom.py	fix(anthropic): complete third-party Anthropic-compatible provider support (#12846)	2026-04-19 22:43:09 -07:00
test_auxiliary_config_bridge.py	fix: remove legacy compression.summary_* config and env var fallbacks (#8992)	2026-04-13 04:59:26 -07:00
test_auxiliary_main_first.py	fix(copilot): send vision header for Copilot vision requests	2026-04-27 08:35:50 -07:00
test_auxiliary_named_custom_providers.py	refactor(memory): remove flush_memories entirely (#15696)	2026-04-25 08:21:14 -07:00
test_auxiliary_transport_autodetect.py	fix(auxiliary): auto-detect Anthropic Messages transport for all aux clients (#17027)	2026-04-28 06:50:14 -07:00
test_bedrock_1m_context.py	fix(bedrock): send context-1m-2025-08-07 beta so Opus 4.6/4.7 get 1M context (#16793)	2026-04-27 20:41:36 -07:00
test_bedrock_adapter.py	test(bedrock): add model picker and region routing tests	2026-04-28 03:53:11 -07:00
test_bedrock_integration.py	fix(agent): handle aws_sdk auth type in resolve_provider_client	2026-04-24 07:26:07 -07:00
test_codex_cloudflare_headers.py	fix(codex): pin correct Cloudflare headers and extend to auxiliary client	2026-04-19 11:59:25 -07:00
test_compress_focus.py	fix: resolve CI test failures — add missing functions, fix stale tests (#9483)	2026-04-14 01:43:45 -07:00
test_compressor_image_tokens.py	feat(image-input): native multimodal routing based on model vision capability (#16506)	2026-04-27 06:27:59 -07:00
test_context_compressor.py	fix(compression): notify users when configured aux model fails even if main-model fallback recovers (#16775)	2026-04-27 20:08:23 -07:00
test_context_engine.py	feat: wire context engine plugin slot into agent and plugin system	2026-04-10 19:15:50 -07:00
test_context_references.py	fix(agent): fall back when rg is blocked for @folder references	2026-04-20 01:56:41 -07:00
test_copilot_acp_client.py	fix: set HOME for Copilot ACP subprocesses	2026-04-24 05:09:08 -07:00
test_credential_pool.py	fix(codex): resync pool entry from auth.json after reauth (#17001)	2026-04-28 05:43:09 -07:00
test_credential_pool_routing.py	refactor: remove smart_model_routing feature (#12732)	2026-04-19 18:12:55 -07:00
test_crossloop_client_cache.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_curator.py	feat(curator): umbrella-first prompt, inherit parent config, unbounded iterations	2026-04-28 22:33:33 -07:00
test_direct_provider_url_detection.py	fix: restrict provider URL detection to exact hostname matches	2026-04-20 22:14:29 -07:00
test_display.py	fix(display): render <missing old_text> in memory previews instead of empty quotes (#12852)	2026-04-19 22:45:47 -07:00
test_display_emoji.py	feat(tools): centralize tool emoji metadata in registry + skin integration	2026-03-15 20:21:21 -07:00
test_error_classifier.py	feat(image-input): native multimodal routing based on model vision capability (#16506)	2026-04-27 06:27:59 -07:00
test_external_skills.py	feat(skills): support external skill directories via config (#3678)	2026-03-29 00:33:30 -07:00
test_gemini_cloudcode.py	fix(gemini): assign unique stream indices to parallel tool calls	2026-04-20 02:10:53 -07:00
test_gemini_free_tier_gate.py	feat(gemini): block free-tier keys at setup + surface guidance on 429 (#15100)	2026-04-24 04:46:17 -07:00
test_gemini_native_adapter.py	fix(gemini): fail fast on missing API key + surface it in hermes dump (#15133)	2026-04-24 05:35:17 -07:00
test_gemini_schema.py	fix(gemini): drop integer/number/boolean enums from tool schemas (#15082)	2026-04-24 03:40:00 -07:00
test_image_gen_registry.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)	2026-04-21 21:30:10 -07:00
test_image_routing.py	feat(image-input): native multimodal routing based on model vision capability (#16506)	2026-04-27 06:27:59 -07:00
test_insights.py	test: stop testing mutable data — convert change-detectors to invariants (#13363)	2026-04-20 23:20:33 -07:00
test_kimi_coding_anthropic_thinking.py	fix(kimi): don't send Anthropic thinking to api.kimi.com/coding (#13826)	2026-04-21 21:19:14 -07:00
test_local_stream_timeout.py	fix(agent): recognize Tailscale CGNAT (100.64.0.0/10) as local for Ollama timeouts	2026-04-22 14:46:10 -07:00
test_memory_provider.py	fix(memory): add write origin metadata	2026-04-24 14:37:55 -07:00
test_memory_user_id.py	feat(hindsight): richer session-scoped retain metadata	2026-04-22 05:27:10 -07:00
test_minimax_auxiliary_url.py	fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983)	2026-04-07 22:23:28 -07:00
test_minimax_provider.py	fix(tests): resolve 17 persistent CI test failures (#15084)	2026-04-24 03:46:46 -07:00
test_model_metadata.py	revert: computer-use cua-driver (PR #16919) (#16927)	2026-04-28 01:57:21 -07:00
test_model_metadata_local_ctx.py	fix(tui): show correct context length	2026-04-28 12:27:36 -07:00
test_model_metadata_ssl.py	fix(auth): honor SSL CA env vars across httpx + requests callsites	2026-04-24 03:00:33 -07:00
test_models_dev.py	feat: add Step Plan provider support (salvage #6005)	2026-04-22 02:59:58 -07:00
test_moonshot_schema.py	fix(kimi,mcp): Moonshot schema sanitizer + MCP schema robustness (#14805)	2026-04-23 16:11:57 -07:00
test_nous_rate_guard.py	fix(nous): don't trip cross-session rate breaker on upstream-capacity 429s (#15898)	2026-04-26 04:53:42 -07:00
test_onboarding.py	fix(openclaw-migration): case-preserving brand rewrite + one-time ~/.openclaw residue banner (#16327)	2026-04-26 20:57:26 -07:00
test_prompt_builder.py	feat(agent): add PLATFORM_HINTS for matrix, mattermost, and feishu (#14428)	2026-04-23 12:50:22 +05:30
test_prompt_caching.py	fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter	2026-03-21 16:54:43 -07:00
test_proxy_and_url_validation.py	fix(agent): normalize socks:// env proxies for httpx/anthropic	2026-04-21 05:52:46 -07:00
test_rate_limit_tracker.py	feat: capture provider rate limit headers and show in /usage (#6541)	2026-04-09 03:43:14 -07:00
test_redact.py	feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal (#13148)	2026-04-20 11:49:54 -07:00
test_shell_hooks.py	feat: shell hooks — wire shell scripts as Hermes hook callbacks	2026-04-20 20:53:51 -07:00
test_shell_hooks_consent.py	fix(shell_hooks): parse hooks_auto_accept as strict bool/string, not bool() (#16322)	2026-04-26 20:48:35 -07:00
test_skill_commands.py	refactor(commands): drop /provider, /plan handler, and clean up slash registry (#15047)	2026-04-24 03:10:52 -07:00
test_streaming_context_scrubber.py	style: trim verbose comment blocks added by previous commit	2026-04-27 12:37:33 -07:00
test_subagent_progress.py	feat(delegate): orchestrator role and configurable spawn depth (default flat)	2026-04-21 14:23:45 -07:00
test_subagent_stop_hook.py	feat: shell hooks — wire shell scripts as Hermes hook callbacks	2026-04-20 20:53:51 -07:00
test_subdirectory_hints.py	fix(agent): catch PermissionError in subdirectory hint discovery	2026-04-09 03:10:30 -07:00
test_title_generator.py	fix(auxiliary): custom provider URL rewrite + main_runtime model for title gen	2026-04-28 01:47:25 -07:00
test_unsupported_parameter_retry.py	fix(auxiliary): generalize unsupported-parameter detector and harden max_tokens retry (#15633)	2026-04-25 05:50:34 -07:00
test_unsupported_temperature_retry.py	refactor(memory): remove flush_memories entirely (#15696)	2026-04-25 08:21:14 -07:00
test_usage_pricing.py	fix(usage): read top-level Anthropic cache fields from OAI-compatible proxies	2026-04-22 17:40:49 -07:00
test_vision_resolved_args.py	fix: pass resolved args to resolve_vision_provider_client()	2026-04-16 07:45:13 -07:00