hermes-agent/tests
Teknium ab8f9c089e
feat: thinking-only prefill continuation for structured reasoning responses (#5931)
When the model produces structured reasoning (via API fields like .reasoning,
.reasoning_content, .reasoning_details) but no visible text content, append
the assistant message as prefill and continue the loop. The model sees its own
reasoning context on the next turn and produces the text portion.

Inspired by clawdbot's 'incomplete-text' recovery pattern. Up to 2 prefill
attempts before falling through to the existing '(empty)' terminal.

Key design decisions:
- Only triggers for structured reasoning (API fields), NOT inline <think> tags
- Prefill messages are popped on success to maintain strict role alternation
- _thinking_prefill marker stripped from all API message building paths
- Works across all providers: OpenAI (continuation), Anthropic (native prefill)

Verified with E2E tests: simulated thinking-only → real OpenRouter continuation
produces correct content. Also confirmed Qwen models consistently produce
structured-reasoning-only responses under token pressure.
2026-04-07 13:19:06 -07:00
..
acp feat(api): structured run events via /v1/runs SSE endpoint 2026-04-05 12:05:13 -07:00
agent fix: thread gateway user_id to memory plugins for per-user scoping (#5895) 2026-04-07 11:14:12 -07:00
cron test: add unit tests for media helper — video, document, multi-file, failure isolation 2026-04-07 12:49:25 -07:00
e2e test(e2e): remove section separator comments 2026-04-01 15:23:52 -07:00
fakes fix: streaming tool call parsing, error handling, and fake HA state mutation 2026-03-14 14:27:20 +03:00
gateway feat(slack): thread engagement — auto-respond in bot-started and mentioned threads (#5897) 2026-04-07 11:12:08 -07:00
hermes_cli fix: repair 57 failing CI tests across 14 files (#5823) 2026-04-07 09:58:45 -07:00
honcho_plugin fix(honcho): migration guard for observation mode default change 2026-04-05 12:34:11 -07:00
integration refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804) 2026-03-24 07:30:25 -07:00
plugins fix(memory): clean up supermemory provider threads 2026-04-06 22:15:58 -07:00
skills fix: protect profile-scoped google workspace oauth tokens 2026-04-03 17:49:18 -07:00
tools feat(clipboard): add native Windows image paste support 2026-04-07 12:49:39 -07:00
__init__.py A bit of restructuring for simplicity and organization 2025-10-01 23:29:25 +00:00
conftest.py fix(approval): show full command in dangerous command approval (#1553) 2026-03-17 02:02:33 -07:00
run_interrupt_test.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_413_compression.py fix: remove stale test skips, fix regex backtracking, file search bug, and test flakiness 2026-04-04 10:18:57 -07:00
test_860_dedup.py fix: eliminate 3x SQLite message duplication in gateway sessions (#860) 2026-03-10 15:22:44 -07:00
test_1630_context_overflow_loop.py fix: prevent infinite 400 loop on context overflow + block prompt injection via cache files (#1630, #1558) 2026-03-17 01:50:59 -07:00
test_agent_guardrails.py feat: pre-call sanitization and post-call tool guardrails (#1732) 2026-03-17 04:24:27 -07:00
test_agent_loop.py fix: salvage gateway dedup and executor cleanup from PR #993 2026-03-14 11:03:20 -07:00
test_agent_loop_tool_calling.py fix: remove stale test skips, fix regex backtracking, file search bug, and test flakiness 2026-04-04 10:18:57 -07:00
test_agent_loop_vllm.py test: restore vllm integration coverage and add dict-args regression 2026-03-15 08:02:29 -07:00
test_anthropic_adapter.py fix: preserve Anthropic thinking block signatures across tool-use turns 2026-04-02 10:30:32 -07:00
test_anthropic_error_handling.py fix(ci): pin acp <0.9 and update retry-exhaust test (#3320) 2026-03-26 19:21:34 -07:00
test_anthropic_oauth_flow.py fix: repair 57 failing CI tests across 14 files (#5823) 2026-04-07 09:58:45 -07:00
test_anthropic_provider_persistence.py fix: preflight Anthropic auth and prefer Claude store 2026-03-14 19:38:55 -07:00
test_api_key_providers.py fix(credential_pool): auto-detect Z.AI endpoint via probe and cache 2026-04-07 00:00:08 -07:00
test_async_httpx_del_neuter.py fix: eliminate 'Event loop is closed' / 'Press ENTER to continue' during idle (#3398) 2026-03-27 09:45:25 -07:00
test_atomic_json_write.py test: cover atomic temp cleanup on interrupts 2026-03-14 22:31:51 -07:00
test_atomic_yaml_write.py test: cover atomic temp cleanup on interrupts 2026-03-14 22:31:51 -07:00
test_auth_codex_provider.py refactor(auth): transition Codex OAuth tokens to Hermes auth store 2026-03-01 19:59:24 -08:00
test_auth_commands.py fix: hermes auth remove now clears env-seeded credentials permanently (#5285) 2026-04-05 12:00:53 -07:00
test_auth_nous_provider.py Fix nous refresh token rotation failure in case where api key mint/retrieval fails 2026-03-02 17:18:15 +11:00
test_auxiliary_config_bridge.py feat(compression): add summary_base_url + move compression config to YAML-only 2026-03-17 04:46:15 -07:00
test_batch_runner_checkpoint.py fix: sanitize chat payloads and provider precedence 2026-03-13 23:59:12 -07:00
test_branch_command.py fix: clear ghost status-bar lines on terminal resize (#4960) 2026-04-03 22:43:45 -07:00
test_cli_approval_ui.py fix(cli): repair dangerous command approval UI 2026-03-14 11:57:44 -07:00
test_cli_background_tui_refresh.py fix(cli): refresh TUI before background task output to prevent status bar overlap (#3048) 2026-03-25 15:00:33 -07:00
test_cli_browser_connect.py fix: cross-platform browser test path separators 2026-04-06 16:54:16 -07:00
test_cli_context_warning.py fix: add missing provider attrs to cli_obj test fixture 2026-04-01 01:12:23 -07:00
test_cli_extension_hooks.py refactor(cli): add protected TUI extension hooks for wrapper CLIs 2026-03-21 09:42:07 -07:00
test_cli_file_drop.py refactor: extract _detect_file_drop() + add 28 tests 2026-04-02 00:40:27 -07:00
test_cli_init.py fix(cli): surface recent sessions inside /history and /resume 2026-04-03 00:50:49 -07:00
test_cli_interrupt_subagent.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_cli_loading_indicator.py fix(cli): add loading indicators for slow slash commands 2026-03-10 17:31:00 -07:00
test_cli_mcp_config_watch.py fix: auto-reload MCP tools when mcp_servers config changes without restart (#1474) 2026-03-15 19:03:34 -07:00
test_cli_new_session.py fix: complete session reset — missing compressor counters + test 2026-03-20 04:35:17 -07:00
test_cli_plan_command.py fix: save /plan output in workspace (#1381) 2026-03-14 21:28:51 -07:00
test_cli_prefix_matching.py feat: add /tools disable/enable/list slash commands with session reset (#1652) 2026-03-17 02:05:26 -07:00
test_cli_preloaded_skills.py fix: move activated skills line below welcome text 2026-03-23 06:20:19 -07:00
test_cli_provider_resolution.py fix: repair 57 failing CI tests across 14 files (#5823) 2026-04-07 09:58:45 -07:00
test_cli_retry.py test: lock retry replacement semantics 2026-03-14 21:19:22 -07:00
test_cli_save_config_value.py fix(cli): use atomic write in save_config_value to prevent config loss on interrupt 2026-03-31 12:21:55 -07:00
test_cli_secret_capture.py feat: secure skill env setup on load (core #688) 2026-03-13 03:14:04 -07:00
test_cli_skin_integration.py fix(test): add missing voice state attrs to CLI stub in skin tests 2026-03-14 15:00:45 +03:00
test_cli_status_bar.py fix(cli): handle CJK wide chars in TUI input height 2026-04-06 16:54:16 -07:00
test_cli_tools_command.py fix: resolve 7 failing CI tests (#3936) 2026-03-30 08:10:14 -07:00
test_codex_execution_paths.py fix: repair 57 failing CI tests across 14 files (#5823) 2026-04-07 09:58:45 -07:00
test_codex_models.py fix: repair OpenCode model routing and selection (#4508) 2026-04-02 09:36:24 -07:00
test_compression_boundary.py fix(agent): prevent silent tool result loss during context compression (#1993) 2026-03-18 15:22:51 -07:00
test_compression_persistence.py fix: persist compressed context to gateway session after mid-run compression 2026-03-30 18:49:14 -07:00
test_compressor_fallback_update.py feat(providers): add ordered fallback provider chain (salvage #1761) (#3813) 2026-03-29 16:04:53 -07:00
test_config_env_expansion.py feat(config): support ${ENV_VAR} substitution in config.yaml (#2684) 2026-03-23 16:02:06 -07:00
test_context_pressure.py fix: cap context pressure percentage at 100% in display (#3480) 2026-03-27 21:42:09 -07:00
test_context_references.py fix(context): restrict @ references to safe workspace paths (#2601) 2026-03-23 06:40:05 -07:00
test_context_token_tracking.py fix(tests): resolve all consistently failing tests 2026-03-22 05:58:26 -07:00
test_credential_pool.py fix(delegate): share credential pools with subagents + per-task leasing 2026-04-06 23:01:11 -07:00
test_credential_pool_routing.py Honor provider reset windows in pooled credential failover 2026-04-05 00:20:53 -07:00
test_crossloop_client_cache.py fix(agent): prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode (#2701) 2026-03-25 17:31:56 -07:00
test_dict_tool_call_args.py test: restore vllm integration coverage and add dict-args regression 2026-03-15 08:02:29 -07:00
test_display.py feat: add inline diff previews for write actions 2026-04-01 02:13:57 -07:00
test_evidence_store.py feat: add OSS Security Forensics skill (Skills Hub) (#1482) 2026-03-15 21:59:53 -07:00
test_exit_cleanup_interrupt.py feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) 2026-04-02 15:33:51 -07:00
test_external_credential_detection.py refactor(auth): transition Codex OAuth tokens to Hermes auth store 2026-03-01 19:59:24 -08:00
test_fallback_model.py feat: upgrade MiniMax default to M2.7 + add new OpenRouter models 2026-03-18 02:42:58 -07:00
test_file_permissions.py security: enforce 0600/0700 file permissions on sensitive files (inspired by openclaw) 2026-03-09 02:19:32 -07:00
test_flush_memories_codex.py fix: update all test mocks for call_llm migration 2026-03-11 21:06:54 -07:00
test_gemini_provider.py fix: repair 57 failing CI tests across 14 files (#5823) 2026-04-07 09:58:45 -07:00
test_hermes_logging.py fix: repair 57 failing CI tests across 14 files (#5823) 2026-04-07 09:58:45 -07:00
test_hermes_state.py fix: merge dotted+hyphenated FTS5 quoting into single pass 2026-04-02 00:49:11 -07:00
test_honcho_client_config.py feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) 2026-04-02 15:33:51 -07:00
test_insights.py feat: add route-aware pricing estimates (#1695) 2026-03-17 03:44:44 -07:00
test_interactive_interrupt.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_interrupt_propagation.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_large_tool_result.py feat: save oversized tool results to file instead of destructive truncation (#5210) 2026-04-05 10:29:57 -07:00
test_long_context_tier_429.py fix: handle Anthropic Sonnet long-context tier 429 by reducing to 200k (#4747) 2026-04-03 02:05:02 -07:00
test_managed_server_tool_support.py test: fix stale CI assumptions in parser and quick-command coverage (#1236) 2026-03-13 21:56:12 -07:00
test_mcp_serve.py feat: add MCP server mode — hermes mcp serve (#3795) 2026-03-29 15:47:19 -07:00
test_minisweagent_path.py chore: remove all remaining mini-swe-agent references 2026-03-24 08:19:23 -07:00
test_model_metadata_local_ctx.py fix: prefer loaded instance context size over max for LM Studio 2026-03-19 21:24:53 +01:00
test_model_normalize.py Fix #5211: Preserve dots in OpenCode Go model names 2026-04-06 11:25:06 -07:00
test_model_provider_persistence.py fix: repair OpenCode model routing and selection (#4508) 2026-04-02 09:36:24 -07:00
test_model_tools.py Add request-scoped plugin lifecycle hooks 2026-04-05 23:31:29 -07:00
test_model_tools_async_bridge.py fix: use per-thread persistent event loops in worker threads 2026-03-20 15:41:06 -04:00
test_ollama_cloud_auth.py fix: Ollama Cloud auth, /model switch persistence, and alias tab completion 2026-04-05 11:06:06 -07:00
test_openai_client_lifecycle.py fix: audit fixes — 5 bugs found and resolved 2026-03-16 06:35:46 -07:00
test_packaging_metadata.py chore: prepare Hermes for Homebrew packaging (#4099) 2026-03-30 17:34:43 -07:00
test_percentage_clamp.py fix: cap percentage displays at 100% in stats, gateway, and memory tool (#3599) 2026-03-28 14:55:18 -07:00
test_personality_none.py feat(cli,gateway): add /personality none and custom personality support 2026-03-09 17:31:54 +03:00
test_plugin_cli_registration.py fix(plugins): only register CLI commands for the active memory provider 2026-04-05 12:34:11 -07:00
test_plugins.py feat(plugins): pre_api_request/post_api_request with narrow payloads 2026-04-05 23:31:29 -07:00
test_plugins_cmd.py feat(plugins): prompt for required env vars during hermes plugins install 2026-04-06 16:37:53 -07:00
test_primary_runtime_restore.py feat: per-turn primary runtime restoration and transport recovery (#4624) 2026-04-02 10:52:01 -07:00
test_project_metadata.py fix: exclude matrix from [all] extras — python-olm is upstream-broken (#4615) 2026-04-02 09:21:37 -07:00
test_provider_fallback.py feat(providers): add ordered fallback provider chain (salvage #1761) (#3813) 2026-03-29 16:04:53 -07:00
test_provider_parity.py test: add test for _should_sanitize_tool_calls() 2026-04-05 00:13:25 -07:00
test_quick_commands.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_real_interrupt_subagent.py fix: thread safety for concurrent subagent delegation (#1672) 2026-03-17 02:53:33 -07:00
test_reasoning_command.py fix: prevent reasoning box from rendering 3x during tool-calling loops (#3405) 2026-03-27 09:57:50 -07:00
test_redirect_stdout_issue.py fix: use session_key instead of chat_id for adapter interrupt lookups 2026-03-12 08:35:45 -07:00
test_resume_display.py feat: display previous messages when resuming a session in CLI 2026-03-08 17:45:45 -07:00
test_run_agent.py feat: thinking-only prefill continuation for structured reasoning responses (#5931) 2026-04-07 13:19:06 -07:00
test_run_agent_codex_responses.py fix(codex): handle reasoning-only responses and replay path (#2070) 2026-03-19 10:34:44 -07:00
test_runtime_provider_resolution.py fix: stale OAuth credentials block OpenRouter users on auto-detect (#5746) 2026-04-06 23:01:43 -07:00
test_session_meta_filtering.py fix: filter transcript-only roles from chat-completions payload (#4715) 2026-04-03 14:57:33 -07:00
test_session_reset_fix.py fix(session): clear compressor summary and turn counter on /clear and /new (#3102) 2026-03-25 18:22:21 -07:00
test_setup_model_selection.py fix: repair OpenCode model routing and selection (#4508) 2026-04-02 09:36:24 -07:00
test_sql_injection.py fix(security): eliminate SQL string formatting in execute() calls 2026-03-19 15:16:35 +01:00
test_streaming.py test: add codex transport drop regression 2026-03-31 12:05:06 -07:00
test_strict_api_validation.py test: add strict API validation tests for Fireworks compatibility 2026-04-05 00:13:25 -07:00
test_surrogate_sanitization.py fix: sanitize surrogate characters from clipboard paste to prevent UnicodeEncodeError (#3624) 2026-03-28 16:53:14 -07:00
test_timezone.py fix: repair 57 failing CI tests across 14 files (#5823) 2026-04-07 09:58:45 -07:00
test_token_persistence_non_cli.py fix(insights): persist token usage for non-CLI sessions 2026-04-02 10:47:13 -07:00
test_tool_arg_coercion.py feat: coerce tool call arguments to match JSON Schema types (#5265) 2026-04-05 10:57:34 -07:00
test_tool_call_parsers.py fix(mistral-parser): handle nested JSON in fallback extraction 2026-03-21 09:41:17 -07:00
test_toolset_distributions.py test: add unit tests for 8 modules (batch 2) 2026-02-26 13:54:20 +03:00
test_toolsets.py fix: add missing Platform.SIGNAL to toolset mappings, update test + config docs 2026-03-09 23:27:19 -07:00
test_trajectory_compressor.py fix: URL-based auth for third-party Anthropic endpoints + CI test fixes (#4148) 2026-03-30 20:36:56 -07:00
test_trajectory_compressor_async.py fix: create AsyncOpenAI lazily in trajectory_compressor to avoid closed event loop (#4013) 2026-03-30 13:16:16 -07:00
test_utils_truthy_values.py Gate tool-gateway behind an env var, so it's not in users' faces until we're ready. Even if users enable it, it'll be blocked server-side for now, until we unlock for non-admin users on tool-gateway. 2026-03-30 13:28:10 +09:00
test_worktree.py fix: harden salvaged worktree include checks 2026-03-14 21:51:27 -07:00
test_worktree_security.py fix: harden salvaged worktree include checks 2026-03-14 21:51:27 -07:00