hermes-agent/tests/agent
Teknium 481caa66f2
feat(display): friendly human-phrased tool labels for built-in tools (#55166)
* feat(display): friendly human-phrased tool labels for built-in tools

Built-in tools now render ChatGPT-style status verbs ('Searching the web
for ...', 'Reading <file>', 'Browsing <url>') on the CLI spinner and
gateway/desktop tool-progress instead of the raw tool name.

- agent/display.py: _TOOL_VERBS map + build_tool_label() + set/get
  friendly-labels flag (default on). Custom/plugin/MCP tools fall back to
  the raw preview; verbose gateway mode left untouched (debug surface).
- tool_executor.py / tui_gateway / gateway: route the three spinner sites,
  the TUI _tool_ctx, and the gateway all/new progress line through the label.
- config: display.friendly_tool_labels (default True, per-platform aware).

Zero new core tool / schema footprint — pure display layer.

* docs: add PR infographic for friendly tool labels

* fix(display): preserve arg preview in gateway friendly labels + update tests

The first gateway pass re-derived the label from the callback's `args`, which
is empty ({}) at the gateway tool.started callsite — the command/query lives in
the `preview` string, so terminal rendered as a bare '💻 Running' and dedup
collapsed consecutive commands. Now the gateway prefixes the verb onto the
already-computed preview via get_tool_verb/tool_verb_connector/verb_drops_preview,
preserving the command/url/query. CLI spinner path (real args) keeps build_tool_label.

Tests: update test_run_progress_topics exact-format assertions to the friendly
form ('💻 Running pwd'), add a format-agnostic preview extractor for the
truncation tests (works for both quoted-legacy and verb-prefixed output).

* test(tui): update resume-display context to friendly tool label

_tool_ctx now uses build_tool_label, so the desktop resume-view context for a
search_files turn reads 'Searching files for resume' instead of the bare
'resume' preview — consistent with live tool-progress. Update the assertion.

* test(tui): harden no-race worker test against sibling shard leakage

test_session_create_no_race_keeps_worker_alive flaked under -j 8: a daemon
build thread leaked from a prior session.create test in the same shard process
fires close/unregister against its own (foreign) session_key after this test
patches the global approval hooks, polluting the captured lists. Scope the
assertions to this session's own session_key so the regression intent
(this session's worker/notify must survive) is preserved while the test
becomes immune to shard composition. Not related to friendly-tool-labels.
2026-06-29 20:31:17 -07:00
..
lsp fix(lsp): detect Windows wrapper binaries in installer probes 2026-05-30 02:08:36 -07:00
transports fix(cache): content-address prompt_cache_key so recurring cron jobs reuse the warm prefix (#52295) 2026-06-24 21:46:30 -07:00
__init__.py test: add unit tests for 8 modules (batch 2) 2026-02-26 13:54:20 +03:00
test_anthropic_adapter.py fix(anthropic): stop SDK auto-retry double-firing and raise Retry-After cap to 600s 2026-06-27 19:23:15 -07:00
test_anthropic_keychain.py fix(anthropic): adopt Claude Code's already-refreshed token before racing refresh 2026-06-27 19:14:43 -07:00
test_anthropic_kwargs_sanitize.py fix(anthropic): strip Responses-only kwargs before Messages SDK call (#31673) (#42155) 2026-06-08 09:36:38 -07:00
test_anthropic_mcp_prefix_strip.py fix(anthropic): also normalize MCP-server tool names to mcp__ on OAuth wire 2026-06-17 13:20:29 +05:30
test_anthropic_oauth_pkce.py test(anthropic-oauth): cover login token-endpoint host + fallback 2026-06-23 23:59:40 -07:00
test_anthropic_output_field_leak.py fix(anthropic): strip output-only SDK fields from replayed content blocks 2026-06-10 20:45:16 -07:00
test_anthropic_thinking_block_order.py refactor: keep anthropic_content_blocks in-memory only (no state.db column) 2026-06-10 20:45:16 -07:00
test_arcee_trinity_overrides.py feat(compression): raise compaction trigger to 85% for gpt-5.5 on Codex OAuth (#40957) 2026-06-07 01:40:50 -07:00
test_async_utils.py fix(ci): rip out some xdist legacy stuff... how did these ever work?? 2026-06-26 19:15:18 -07:00
test_auxiliary_client.py test(auxiliary): cover NVIDIA NIM max_tokens in _build_call_kwargs 2026-06-29 18:04:39 +05:30
test_auxiliary_client_anthropic_custom.py fix(anthropic): complete third-party Anthropic-compatible provider support (#12846) 2026-04-19 22:43:09 -07:00
test_auxiliary_client_azure_foundry.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_auxiliary_client_base_url_host_validation_52608.py fix(auxiliary): gate Anthropic base_url override on Anthropic-compatible host (#52608) 2026-06-26 11:21:05 +05:30
test_auxiliary_client_proxy_env.py test(auxiliary): cover env-only proxy policy for auxiliary clients (#53702) 2026-06-27 21:22:49 -07:00
test_auxiliary_client_xai_oauth_recovery.py fix(auxiliary): detect xAI OAuth 403 bad-credentials as auth error 2026-05-29 00:28:02 -07:00
test_auxiliary_config_bridge.py fix(vision): cap vision_analyze fan-out concurrency process-wide 2026-06-29 01:27:10 -07:00
test_auxiliary_main_first.py fix(moa): resolve auxiliary tasks to the aggregator, not the preset name (#53827) 2026-06-27 14:21:26 -07:00
test_auxiliary_named_custom_providers.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_auxiliary_transport_autodetect.py fix(providers): support anthropic proxy v1 endpoints 2026-06-14 02:09:16 -07:00
test_auxiliary_user_default_headers.py fix(aux): honor model.default_headers on auxiliary client too (#40033) 2026-06-07 02:02:40 -07:00
test_azure_identity_adapter.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_bedrock_1m_context.py feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
test_bedrock_adapter.py fix(bedrock): check boto3 version >= 1.34.59 before using converse_stream 2026-06-15 05:25:17 -07:00
test_bedrock_integration.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_billing_view.py feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449) 2026-06-19 01:53:32 +05:30
test_cascading_interrupt_6600.py fix(agent): don't retry interrupt-induced transport errors (cascading-interrupt hang) 2026-06-08 02:19:13 -07:00
test_chat_completion_helpers_provider_sort.py fix(agent): validate OpenRouter provider sort before request dispatch 2026-06-27 11:43:08 -07:00
test_close_interrupted_tool_sequence.py fix(agent): close tool-call sequence on all interrupt aborts, not just finalize_turn 2026-06-25 12:24:34 -05:00
test_codex_cloudflare_headers.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_codex_responses_adapter.py fix(xai): OAuth Responses native web_search, incomplete guard, grok-composer context 2026-06-17 17:33:32 -07:00
test_codex_ttfb_watchdog.py fix(codex): relax no-byte TTFB watchdog default from 12s to 120s 2026-05-29 02:02:25 -07:00
test_coding_context.py fix(agent): require code for coding posture 2026-06-25 16:40:27 -05:00
test_compress_focus.py refactor(agent): drop unused tail_start param from _derive_auto_focus_topic 2026-06-11 23:03:52 -07:00
test_compressed_summary_metadata.py test: compressed-summary metadata flag set in-process, stripped on wire 2026-06-12 16:47:15 -07:00
test_compression_concurrent_fork.py test(compression): pin rotation-fallback tests to in_place=False ahead of default flip 2026-06-25 12:56:05 -07:00
test_compression_count_warning_36908.py fix(agent): route repeated-compression warning through _emit_status (#36908) 2026-06-21 11:34:47 -07:00
test_compression_interrupt_protection.py fix(compression): protect the summary call from mid-flight interrupts 2026-06-20 21:32:30 -07:00
test_compression_logging_session_context.py test(compression): pin rotation-fallback tests to in_place=False ahead of default flip 2026-06-25 12:56:05 -07:00
test_compression_progress.py fix(agent): align preflight token-progress floor to 5% (#23767, #39548) 2026-06-22 15:51:52 +05:30
test_compression_rotation_state.py test(compression): pin rotation-fallback tests to in_place=False ahead of default flip 2026-06-25 12:56:05 -07:00
test_compressor_assistant_tail_anchor.py test(compressor): regression coverage for assistant-tail anchor + compaction rollup (#29824) 2026-06-12 15:41:57 -07:00
test_compressor_historical_media.py Port from Kilo-Org/kilocode#9434: strip historical media after compression (#27189) 2026-05-16 17:18:25 -07:00
test_compressor_image_tokens.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_compressor_media_stripping.py fix(agent): strip MEDIA directives from compressor summarizer input (#14665) 2026-06-12 01:14:28 -07:00
test_compressor_tool_call_budget.py fix(compressor): count tool_call envelope in tail-budget token estimate (#28053) 2026-06-22 16:26:56 +05:30
test_context_breakdown.py feat(desktop): add context usage breakdown popover 2026-06-29 09:18:10 -04:00
test_context_compressor.py fix(compression): abort (preserve context) on transient network summary failure (#29559, #25585) 2026-06-24 18:31:51 +05:30
test_context_compressor_cross_session_guard.py fix(compression): guard against cross-session stale _previous_summary contamination 2026-06-07 22:09:45 -07:00
test_context_compressor_summary_continuity.py test: cover ci-unblocker production regressions 2026-05-27 22:14:53 -07:00
test_context_compressor_temporal_anchoring.py fix(agent): frame compaction handoff sections as historical context 2026-06-11 13:57:13 -07:00
test_context_engine.py fix(agent): deepcopy plugin context engine to prevent parent corruption on delegate_task (#42449) 2026-06-25 02:13:26 +05:30
test_context_engine_host_contract.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_context_references.py fix(agent): make a binary @file: reference actionable instead of a dead end 2026-06-09 19:16:46 -05:00
test_copilot_acp_client.py fix(agent): stream copilot ACP chat completions 2026-06-28 22:52:51 -07:00
test_copilot_acp_deprecation.py fix(copilot-acp): tighten deprecation detection + sharpen GitHub Models 413 hint 2026-05-16 02:24:48 -07:00
test_credential_pool.py fix(auth): preserve concurrently-added credentials on pool rewrite 2026-06-27 19:01:37 -07:00
test_credential_pool_oauth_writethrough.py fix(auth): write rotated Codex/xAI pool grant through to global root (#48415) (#52760) 2026-06-25 19:14:06 -07:00
test_credential_pool_routing.py refactor: remove smart_model_routing feature (#12732) 2026-04-19 18:12:55 -07:00
test_credits_cold_start.py fix(credits): suppress usage gauge when top-up funds exist + add display.credits_notices toggle (#44716) 2026-06-12 01:06:46 -07:00
test_credits_fixture_snapshot.py feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011) 2026-06-06 13:18:18 +05:30
test_credits_policy.py feat(billing): /credits command — balance + portal top-up handoff (#44776) 2026-06-12 08:51:10 +00:00
test_credits_tracker.py feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011) 2026-06-06 13:18:18 +05:30
test_credits_view.py feat(billing): /credits command — balance + portal top-up handoff (#44776) 2026-06-12 08:51:10 +00:00
test_crossloop_client_cache.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_curator.py fix(curator): never archive cron-referenced skills + floor use=0 pruning (#54443) 2026-06-28 15:10:21 -07:00
test_curator_activity.py fix: use skill activity in curator status 2026-04-30 10:31:47 -07:00
test_curator_backup.py fix(curator): stop the rollback safety snapshot from pruning its target 2026-06-17 05:40:05 -07:00
test_curator_classification.py feat(curator): hint at hermes curator pin in the rename block (#23212) 2026-05-10 06:44:53 -07:00
test_curator_reports.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_custom_pool_mismatch_guard.py fix(agent): don't treat custom:<name> pools as cross-provider mismatch (#45289) 2026-06-13 02:01:09 -07:00
test_custom_provider_extra_body.py fix: handle named custom providers and Z.AI overload retries 2026-06-25 00:17:17 -07:00
test_custom_providers_vision.py fix(vision): honor custom_providers per-model supports_vision (#41036) 2026-06-07 21:50:57 -07:00
test_deepseek_anthropic_thinking.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_direct_provider_url_detection.py fix: restrict provider URL detection to exact hostname matches 2026-04-20 22:14:29 -07:00
test_display.py feat(display): friendly human-phrased tool labels for built-in tools (#55166) 2026-06-29 20:31:17 -07:00
test_display_emoji.py feat(tools): centralize tool emoji metadata in registry + skin integration 2026-03-15 20:21:21 -07:00
test_display_todo_progress.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_display_tool_failure.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_empty_tool_name_loop_dampening.py fix(agent): dampen empty-name phantom tool-call loop (#47967) (#48109) 2026-06-17 17:32:14 -07:00
test_error_classifier.py fix(agent): classify 429 'overloaded' bodies as overloaded, not rate_limit 2026-06-27 04:16:54 -07:00
test_external_skills.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_external_skills_dirs_cache.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_failover_identity.py fix(agent): keep system-prompt model identity in sync across provider failover 2026-06-20 10:46:01 -07:00
test_file_safety.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_file_safety_container_mirror.py fix(file-safety): extend sandbox-mirror guard to cover inner-container path (#32049) (#32407) 2026-06-02 14:03:37 +10:00
test_file_safety_credentials.py fix(security): block read_file on project-local .env files 2026-05-25 03:40:47 -07:00
test_file_safety_cross_profile.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_file_safety_sandbox_mirror.py fix(file-safety): add sandbox-mirror soft guard for writes to per-task .hermes mirrors (#32213) 2026-06-02 11:29:24 +10:00
test_gemini_fast_fallback.py feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492) 2026-06-21 19:53:27 -07:00
test_gemini_free_tier_gate.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_gemini_native_adapter.py fix(gemini): strip native self prefixes before generateContent (#36141) 2026-06-13 13:47:08 -07:00
test_gemini_schema.py fix(gemini): drop integer/number/boolean enums from tool schemas (#15082) 2026-04-24 03:40:00 -07:00
test_i18n.py fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix (#38383) 2026-06-03 12:00:27 -07:00
test_image_gen_registry.py feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) 2026-04-21 21:30:10 -07:00
test_image_routing.py test(vision): cover Ollama /api/show vision capability routing (#54511) 2026-06-28 22:52:59 -07:00
test_insights.py refactor(insights): drop dead pricing/duration wrappers, call usage_pricing directly (#40618) 2026-06-07 18:33:20 -07:00
test_intent_ack_continuation.py fix(agent): config-driven intent-ack continuation for all api_modes (#27881) (#53943) 2026-06-27 20:46:00 -07:00
test_jiter_preload.py fix(agent): preload jiter native parser 2026-05-28 00:20:11 -07:00
test_kimi_coding_anthropic_thinking.py fix(anthropic): broaden Kimi thinking-suppression to custom endpoints (#17455) 2026-04-29 06:35:42 -07:00
test_last_total_tokens.py fix(compressor): ABC compliance — total_tokens, api_mode, logger consistency 2026-05-23 17:38:19 -07:00
test_learn_prompt.py fix(learn): name distilled skills as author Hermes, not the host OS user (#52388) 2026-06-25 12:48:08 -07:00
test_local_stream_timeout.py fix(local): recognize unqualified hostnames as local endpoints (#9248) 2026-06-05 10:18:10 +10:00
test_markdown_tables.py fix(cli): vertical fallback for markdown tables wider than terminal (#23948) 2026-05-11 16:49:13 -07:00
test_memory_async_sync.py fix(memory): run end-of-turn sync off the turn thread (#41945) 2026-06-08 02:18:59 -07:00
test_memory_provider.py fix(agent): validate context/memory tool schemas before wrapping 2026-06-25 02:17:29 +05:30
test_memory_session_switch.py fix(memory): run end-of-turn sync off the turn thread (#41945) 2026-06-08 02:18:59 -07:00
test_memory_skill_scaffolding.py fix(memory): strip skill scaffolding for all providers, not just openviking 2026-06-16 10:37:37 -07:00
test_memory_user_id.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_memory_write_bridge.py refactor(memory): move write-mirror gating behind MemoryManager interface 2026-06-22 07:00:42 -07:00
test_message_content.py fix(openviking): preserve structured sync attribution 2026-06-19 15:23:41 +08:00
test_minimax_auxiliary_url.py fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983) 2026-04-07 22:23:28 -07:00
test_minimax_provider.py polish(minimax): address Copilot review comments on M3 default-aux fix 2026-06-04 05:53:35 -07:00
test_model_metadata.py fix(moa): resolve context window from the aggregator, not the 256K default (#53780) 2026-06-27 12:08:09 -07:00
test_model_metadata_local_ctx.py fix: normalize lmstudio base urls 2026-06-28 20:46:44 -07:00
test_model_metadata_ssl.py fix(auth): honor SSL CA env vars across httpx + requests callsites 2026-04-24 03:00:33 -07:00
test_models_dev.py test: remove low-value model-catalog mirror tests 2026-05-29 23:45:05 -07:00
test_moonshot_schema.py fix(moonshot): handle union type arrays in tool schemas 2026-06-13 05:51:41 -07:00
test_non_stream_stale_timeout.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_nous_credits_gauge.py feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011) 2026-06-06 13:18:18 +05:30
test_nous_credits_snapshot.py feat(billing): /credits command — balance + portal top-up handoff (#44776) 2026-06-12 08:51:10 +00:00
test_nous_oauth_401_guidance.py feat(cli): make hermes portal the human-readable Portal onboarding alias 2026-06-04 01:19:28 +05:30
test_nous_rate_guard.py fix(nous): don't trip cross-session rate breaker on upstream-capacity 429s (#15898) 2026-04-26 04:53:42 -07:00
test_onboarding.py feat(onboarding): opt-in structured profile-build path on first contact (#41114) 2026-06-07 08:36:48 -07:00
test_oneshot.py feat(agent): one-shot LLM helper + llm.oneshot gateway RPC (#51261) 2026-06-23 08:01:50 +00:00
test_openrouter_response_cache.py fix(openrouter): use canonical X-Title attribution header 2026-05-05 10:13:34 -07:00
test_pet_engine.py feat(pets): pet engine + display.pet config 2026-06-20 14:18:30 -05:00
test_pet_generate.py test(pets): make slow pet generation suite opt-in 2026-06-25 00:44:53 -05:00
test_platform_hint_overrides.py feat(prompt): configurable per-platform system-prompt hint overrides 2026-06-18 14:28:01 -07:00
test_plugin_llm.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_portal_tags.py feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779) 2026-05-12 20:49:20 -07:00
test_preflight_compression_gate.py fix(agent): trigger preflight compression on few-but-huge sessions (#27405) 2026-06-25 01:20:23 +05:30
test_prompt_builder.py test(agent): cover .hermes.md no-git-root cwd-only behavior 2026-06-28 20:46:32 -07:00
test_prompt_caching.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_proxy_and_url_validation.py fix(agent): normalize socks:// env proxies for httpx/anthropic 2026-04-21 05:52:46 -07:00
test_rate_limit_tracker.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_reasoning_stale_timeout_floor.py test(reasoning-floor): isolate stale-timeout floor tests from config-module reload races (#54775) 2026-06-29 02:42:54 -07:00
test_redact.py fix(security): redact bare-token credentials in URL userinfo (#6396) (#54475) 2026-06-28 18:52:42 -07:00
test_replay_cleanup.py fix(tui): sanitize replay history on WebUI/TUI session resume (#29086) (#53939) 2026-06-27 20:56:49 -07:00
test_restore_primary_pool_reselect.py fix(pool): re-select from credential pool on primary runtime restore 2026-06-27 20:04:45 -07:00
test_resume_stale_active_task.py fix(agent): strengthen compression preamble against stale task execution (#41607) 2026-06-11 13:57:13 -07:00
test_runtime_cwd.py fix(desktop): stabilize project folder sessions (#37586) 2026-06-02 20:23:09 +00:00
test_save_url_image.py fix(image_gen): cache xAI ephemeral URL responses to disk (#26942) (#31759) 2026-05-24 18:10:47 -07:00
test_secret_scope.py feat(gateway): multiplex phase 2 — fail-closed profile credential isolation (Workstream A) 2026-06-19 07:34:15 -07:00
test_set_runtime_main_custom_provider.py test(auxiliary): e2e routing assertions for custom-provider aux resolution 2026-05-30 02:38:59 -07:00
test_shell_hooks.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_shell_hooks_consent.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_skill_bundles.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_skill_commands.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_skill_commands_reload.py fix(ci): rip out some xdist legacy stuff... how did these ever work?? 2026-06-26 19:15:18 -07:00
test_skill_utils.py fix(curator): protect external skills from background curation 2026-06-25 22:03:02 -07:00
test_ssl_ca_guard.py fix(ssl): align guard docs and escape hatch 2026-06-13 21:14:32 -07:00
test_stream_read_timeout_floor.py fix(streaming): stop socket read timeout from preempting stale-stream detector (#43570) 2026-06-10 20:21:38 -05:00
test_streaming_context_scrubber.py 🐛 fix(memory): require newline after context tag 2026-05-18 10:53:08 -07:00
test_subagent_progress.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_subagent_stop_hook.py feat: shell hooks — wire shell scripts as Hermes hook callbacks 2026-04-20 20:53:51 -07:00
test_subdirectory_hints.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_summary_prefix_semantics.py fix(agent): freeze carveout-era SUMMARY_PREFIX for renormalization 2026-06-11 13:57:13 -07:00
test_system_prompt.py feat(tools): add project workspace tools 2026-06-25 16:40:27 -05:00
test_system_prompt_restore.py fix(desktop): keep live model switch metadata truthful 2026-06-16 09:50:17 -05:00
test_think_scrubber.py fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184) 2026-05-05 04:33:38 -07:00
test_thinking_timeout_guidance.py fix(agent): detect thinking-timeout for reasoning models and surface actionable guidance instead of misleading file-write advice 2026-06-25 19:00:48 -07:00
test_title_generator.py feat(titles): support language-aware title generation (#45296) 2026-06-19 17:15:52 -07:00
test_tool_call_arg_no_redaction.py fix(agent): stop redacting tool-call args in history; fix auth-header quote-eating 2026-06-28 02:44:06 -07:00
test_tool_dispatch_helpers.py feat(security): promptware defense — shared threat patterns + memory load-time scan + tool-result delimiters (#32269) 2026-05-25 14:52:24 -07:00
test_tool_guardrails.py fix: add recovery hints to loop guard warnings 2026-05-19 00:12:12 -07:00
test_tool_result_classification.py fix: classify landed file mutations with diagnostics 2026-05-13 06:46:23 -07:00
test_transcription_registry.py feat(stt): add register_transcription_provider() plugin hook 2026-05-25 01:41:19 -07:00
test_tts_registry.py feat(tts): add register_tts_provider() plugin hook (closes #30398) 2026-05-24 18:04:54 -07:00
test_turn_context.py fix: persist non-NULL system prompt on fresh turn setup (#45499) (#52616) 2026-06-25 12:54:19 -07:00
test_turn_finalizer_cleanup_guard.py fix(agent): complete final text on last turn 2026-06-22 13:57:59 -07:00
test_turn_finalizer_interrupt_alternation.py test(agent): cover interrupt tool-tail alternation close (#48879) 2026-06-23 23:52:28 -07:00
test_turn_retry_state.py fix(agent): route content-filter stream stalls to fallback chain (#32421) 2026-06-28 01:15:21 -07:00
test_unsupported_parameter_retry.py test: remove 50 stale/broken tests to unblock CI (#22098) 2026-05-08 14:55:40 -07:00
test_unsupported_temperature_retry.py fix(auxiliary): stop capping output with max_tokens by default (#34530) (#34845) 2026-05-29 17:24:30 -07:00
test_usage_pricing.py fix(bedrock): price Claude prompt-cache tokens in /usage (#50307) 2026-06-21 11:48:43 -07:00
test_verification_evidence.py feat(agent): recognize focused ad-hoc verification scripts 2026-06-24 23:03:45 -05:00
test_verification_stop.py feat(verify-on-stop): default OFF, one-time migration, skip doc-only edits (#53552) 2026-06-27 03:23:22 -07:00
test_video_gen_registry.py feat(video_gen): unified video_generate tool with pluggable provider backends (#25126) 2026-05-13 16:39:41 -07:00
test_vision_resolved_args.py fix(vision): preserve explicit provider auth with custom base_url 2026-05-04 05:05:43 -07:00
test_vision_routing_31179.py fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452) 2026-05-24 15:01:28 -07:00