hermes-agent/tests/run_agent
Teknium 76042f5867
feat(review): class-first skill review prompt (#16026)
The background skill-review prompt (spawned after N user turns) now instructs
the reviewer to SURVEY existing skills first, identify the CLASS of task, and
PREFER updating/generalizing an existing skill over creating a new narrow one.

This reduces near-duplicate skill accumulation at the source: it catches the
common failure mode where repeated tasks of the same class each spawn their
own specific skill ("fix-my-tauri-error", "fix-my-electron-error") instead
of a single class-level skill ("desktop-app-build-troubleshooting").

Applied to both _SKILL_REVIEW_PROMPT and the **Skills** half of
_COMBINED_REVIEW_PROMPT. Memory-only review prompt unchanged.

Groundwork for the Curator feature (issue #7816): this is the creation-side fix.
Curator handles the retirement/consolidation side in a follow-up PR.

Tests assert the behavioral instructions are present (survey, class, update-
over-create, overlap-flagging, opt-out clause) rather than snapshotting the
full prompt text.
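
The testing approach described above (assert that each behavioral instruction is present instead of snapshotting the prompt) might look roughly like the sketch below. Everything in it is illustrative: `EXAMPLE_PROMPT`, `REQUIRED_BEHAVIORS`, and `missing_behaviors` are hypothetical names, the prompt text is a stand-in rather than the real `_SKILL_REVIEW_PROMPT`, and the patterns are assumptions about how loosely such checks could be written.

```python
import re

# Hypothetical stand-in prompt; the real _SKILL_REVIEW_PROMPT text is not
# reproduced here and may be worded differently.
EXAMPLE_PROMPT = """\
Before creating anything, SURVEY the existing skills.
Identify the CLASS of task the conversation belongs to.
PREFER updating or generalizing an existing skill over creating a new narrow one.
Flag any overlap between existing skills.
If nothing is worth saving, say so and stop.
"""

# One loose, case-insensitive pattern per behavioral instruction (all assumed).
# Loose patterns let the prompt be reworded without breaking the test.
REQUIRED_BEHAVIORS = {
    "survey": r"survey\b.*\bexisting skills",
    "class": r"\bclass of task\b",
    "update-over-create": r"prefer\b.*\bupdating\b.*\bover creating",
    "overlap-flagging": r"\bflag\b.*\boverlap",
    "opt-out": r"nothing is worth saving",
}


def missing_behaviors(prompt, required=REQUIRED_BEHAVIORS):
    """Return the names of required behaviors that the prompt does not mention."""
    # Collapse whitespace so line wrapping in the prompt cannot hide a match.
    flat = " ".join(prompt.split())
    return [
        name
        for name, pattern in required.items()
        if not re.search(pattern, flat, re.IGNORECASE)
    ]


def test_prompt_contains_behavioral_instructions():
    # Fails with the list of missing behaviors, not a full-text diff.
    assert missing_behaviors(EXAMPLE_PROMPT) == []
```

Compared with a snapshot test, a check of this shape keeps passing when the prompt is reworded or reflowed, and fails only when one of the named instructions disappears.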
2026-04-26 05:17:10 -07:00
__init__.py
conftest.py
test_413_compression.py
test_860_dedup.py
test_1630_context_overflow_loop.py
test_agent_guardrails.py
test_agent_loop.py
test_agent_loop_tool_calling.py
test_agent_loop_vllm.py
test_anthropic_error_handling.py
test_anthropic_prompt_cache_policy.py
test_anthropic_third_party_oauth_guard.py
test_anthropic_truncation_continuation.py
test_api_max_retries_config.py
test_async_httpx_del_neuter.py
test_background_review_summary.py fix(agent): exclude prior-history tool messages from background review summary 2026-04-24 03:10:19 -07:00
test_compress_focus_plugin_fallback.py refactor(memory): remove flush_memories entirely (#15696) 2026-04-25 08:21:14 -07:00
test_compression_boundary.py
test_compression_feasibility.py refactor(memory): remove flush_memories entirely (#15696) 2026-04-25 08:21:14 -07:00
test_compression_persistence.py
test_compression_trigger_excludes_reasoning.py
test_compressor_fallback_update.py
test_concurrent_interrupt.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_context_token_tracking.py
test_create_openai_client_kwargs_isolation.py
test_create_openai_client_proxy_env.py test(proxy): regression tests for NO_PROXY bypass on keepalive client 2026-04-24 03:04:42 -07:00
test_create_openai_client_reuse.py
test_deepseek_reasoning_content_echo.py fix: DeepSeek/Kimi thinking mode requires reasoning_content on ALL assistant messages 2026-04-26 07:47:13 +08:00
test_dict_tool_call_args.py
test_exit_cleanup_interrupt.py
test_fallback_model.py
test_interactive_interrupt.py
test_interrupt_propagation.py
test_invalid_context_length_warning.py
test_jsondecodeerror_retryable.py fix(agent): retry on json.JSONDecodeError instead of treating it as a local validation error (#15107) 2026-04-24 05:02:58 -07:00
test_long_context_tier_429.py
test_memory_provider_init.py
test_memory_sync_interrupted.py fix(memory): skip external-provider sync on interrupted turns (#15218) 2026-04-24 15:30:18 -07:00
test_openai_client_lifecycle.py
test_percentage_clamp.py
test_plugin_context_engine_init.py
test_primary_runtime_restore.py fix(agent): only set rate-limit cooldown when leaving primary; add tests 2026-04-24 05:35:43 -07:00
test_provider_attribution_headers.py fix(providers): send user agent to routermint endpoints 2026-04-24 03:02:16 -07:00
test_provider_fallback.py fix(agent): fall back on rate limit when pool has no rotation room 2026-04-24 05:20:05 -07:00
test_provider_parity.py fix(agent): preserve Codex message items for replay 2026-04-25 18:22:06 -07:00
test_real_interrupt_subagent.py
test_redirect_stdout_issue.py
test_repair_tool_call_arguments.py fix(run_agent): handle unescaped control chars in tool_call arguments (#15356) 2026-04-24 15:06:41 -07:00
test_repair_tool_call_name.py fix(agent): repair CamelCase + _tool suffix tool-call emissions (#15124) 2026-04-24 05:32:08 -07:00
test_review_prompt_class_first.py feat(review): class-first skill review prompt (#16026) 2026-04-26 05:17:10 -07:00
test_run_agent.py fix(agent): support Azure OpenAI gpt-5.x on chat/completions endpoint 2026-04-25 18:48:43 -07:00
test_run_agent_codex_responses.py fix(agent): preserve Codex message items for replay 2026-04-25 18:22:06 -07:00
test_run_agent_multimodal_prologue.py
test_sequential_chats_live.py
test_session_meta_filtering.py
test_session_reset_fix.py
test_steer.py
test_stream_interrupt_retry.py fix: /stop now immediately aborts streaming retry loop 2026-04-25 09:51:39 -07:00
test_streaming.py
test_streaming_tool_call_repair.py fix: repair malformed tool call args in streaming assembly before flagging as truncated 2026-04-24 15:03:07 -07:00
test_strict_api_validation.py
test_strip_reasoning_tags_cli.py
test_switch_model_context.py
test_switch_model_fallback_prune.py fix(agent): default missing fallback chain on switch 2026-04-24 05:35:43 -07:00
test_token_persistence_non_cli.py
test_tool_arg_coercion.py
test_tool_call_args_sanitizer.py fix(run_agent): repair corrupted tool_call arguments before sending to provider 2026-04-24 14:55:47 -07:00
test_unicode_ascii_codec.py