hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-03 07:21:54 +00:00

xxxigm b5ea6a5c80 test(xai-oauth): regression coverage for the bad-credentials disambiguator (#29344)	2026-05-23 02:48:13 -07:00

Eleven new tests pinning the #29344 fix. Layout mirrors the existing "Fix D" entitlement section so the bad-credentials disambiguator sits alongside the entitlement-block tests it complements.

Classifier-level coverage:

* ``test_is_entitlement_failure_false_for_bad_credentials_wke_suffix``: verbatim shape from the reporter's wire capture (``{code: 'caller does not have permission', error: 'OAuth2 access token could not be validated. [WKE=unauthenticated:bad-credentials]'}``) ↦ classifier must return False so the refresh path runs.
* ``test_is_entitlement_failure_false_for_wke_suffix_in_normalized_shape``: same body after ``_extract_api_error_context`` has rewritten it to ``{reason, message}``. The disambiguator must fire in BOTH shapes; without this guard the production call site at ``_recover_with_credential_pool`` (which goes through the normalised extractor) would still misclassify.
* ``test_is_entitlement_failure_false_for_any_wke_unauthenticated_variant``: parametrised forward-compat over ``bad-credentials``, ``expired-token``, ``revoked``, ``some-future-reason``. xAI documents the prefix as stable, the suffix after the colon as a reason code that can grow; every variant under ``unauthenticated:`` must route to refresh.
* ``test_is_entitlement_failure_false_via_oauth2_validation_phrase_alone``: belt-and-braces guard. If a future API revision drops the WKE suffix but keeps "OAuth2 access token could not be validated", we still classify correctly.
* ``test_is_entitlement_failure_wke_signal_overrides_entitlement_keywords``: defensive. If a body ever carries BOTH the WKE suffix and entitlement language, the WKE signal wins. Auth is recoverable; entitlement isn't, and a refreshed token will resurface the entitlement message on the next request.
* ``test_is_entitlement_failure_case_insensitive_wke_match``: pins that the classifier lowercases the haystack so a future xAI build that uppercases the prefix doesn't reintroduce the bug.

Recovery-path coverage (end-to-end through ``_recover_with_credential_pool``):

* ``test_recover_with_credential_pool_refreshes_on_xai_bad_credentials_403``: the headline test the reporter requested. A bad-credentials 403 with the exact wire body must call ``try_refresh_current()`` exactly once and ``_swap_credential`` once. Pre-fix this returned ``(False, _)`` because the entitlement classifier over-matched and short-circuited the refresh path.
* ``test_recover_with_credential_pool_still_blocks_real_entitlement``: companion regression guard for #26847. A pure unsubscribed-account body (no WKE suffix, no OAuth2-validation phrase) must still surface as entitlement and skip refresh. The new disambiguator must not weaken the original loop-protection it was added to preserve.

The scaffolding reuses ``_make_codex_agent``, ``_FakePool``, and the existing ``MagicMock`` patterns from the surrounding tests so the new section reads as a natural extension of "Fix D" rather than a separate test file.
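
The classification rule that the commit message describes can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the function name ``is_entitlement_failure``, the two signal tuples, and the sample entitlement body in the usage lines are assumptions made for the example. Only the ``caller does not have permission`` code, the OAuth2-validation phrase, and the ``[WKE=unauthenticated:...]`` suffix come from the commit message.

```python
from typing import Mapping

# Assumed auth signals: a 403 carrying either of these is treated as a
# recoverable authentication failure, so the token refresh path should run.
AUTH_SIGNALS = (
    "[wke=unauthenticated:",
    "oauth2 access token could not be validated",
)

# Assumed entitlement signals (hypothetical keyword list for illustration).
ENTITLEMENT_SIGNALS = (
    "does not have permission",
    "not subscribed",
)


def is_entitlement_failure(status_code: int, body: Mapping[str, object]) -> bool:
    """Return True only for a 403 that looks like a real entitlement block."""
    if status_code != 403:
        return False
    # Join every value so both the raw {code, error} shape and the
    # normalised {reason, message} shape are searched; lowercase so the
    # match is case-insensitive.
    haystack = " ".join(str(value) for value in body.values()).lower()
    # Auth signals win even when entitlement wording is also present.
    if any(signal in haystack for signal in AUTH_SIGNALS):
        return False
    return any(signal in haystack for signal in ENTITLEMENT_SIGNALS)


# Body shape quoted in the commit message.
bad_credentials = {
    "code": "caller does not have permission",
    "error": "OAuth2 access token could not be validated. "
             "[WKE=unauthenticated:bad-credentials]",
}
# Hypothetical entitlement body, invented for this example.
unsubscribed = {
    "code": "caller does not have permission",
    "error": "Account is not subscribed to this product",
}

refresh_should_run = not is_entitlement_failure(403, bad_credentials)
refresh_should_be_skipped = is_entitlement_failure(403, unsubscribed)
```

Checking the auth signals before the entitlement keywords is what lets the bad-credentials body, which also carries permission wording, fall through to refresh. A body with no auth signal is still blocked.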
__init__.py
conftest.py	ci(tests): add pytest-timeout 60s hard cap to break suite-teardown deadlock (#28861)	2026-05-19 17:27:24 -07:00
test_413_compression.py	fix: show context compaction status	2026-05-13 23:11:43 -07:00
test_860_dedup.py	refactor(gateway): stop writing JSONL in append_to_transcript / rewrite_transcript	2026-05-20 13:00:57 -07:00
test_1630_context_overflow_loop.py
test_agent_guardrails.py
test_anthropic_prompt_cache_policy.py	fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778)	2026-05-12 20:46:04 -07:00
test_anthropic_third_party_oauth_guard.py
test_anthropic_truncation_continuation.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)	2026-05-17 02:29:41 -07:00
test_api_max_retries_config.py
test_async_httpx_del_neuter.py	fix(dashboard): UI polish — modals, layout, consistency, test fixes	2026-05-12 13:59:22 -04:00
test_background_review.py	fix(run_agent): isolate background review fork from external memory plugins (#27190)	2026-05-16 20:33:38 -07:00
test_background_review_cache_parity.py	chore: trim verbose comments/docstrings, add AUTHOR_MAP entry	2026-05-21 12:49:21 +05:30
test_background_review_summary.py
test_background_review_toolset_restriction.py	chore: trim verbose comments/docstrings, add AUTHOR_MAP entry	2026-05-21 12:49:21 +05:30
test_callable_api_key.py	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
test_codex_app_server_integration.py	fix(codex-runtime): retire wedged sessions + post-tool watchdog + OAuth refresh classify (#25769)	2026-05-14 07:55:09 -07:00
test_codex_multimodal_tool_result.py	feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955)	2026-05-09 21:06:19 -07:00
test_codex_xai_oauth_recovery.py	test(xai-oauth): regression coverage for the bad-credentials disambiguator (#29344)	2026-05-23 02:48:13 -07:00
test_commit_memory_session_context_engine.py	fix(agent): notify context engine on commit_memory_session (#22764)	2026-05-09 12:28:42 -07:00
test_compress_focus_plugin_fallback.py
test_compression_boundary.py
test_compression_boundary_hook.py	fix(tests): catch up six stale tests after compression/aux/kanban changes (#28465)	2026-05-18 21:43:59 -07:00
test_compression_feasibility.py	perf(compression): defer feasibility check to first compression attempt (#28957)	2026-05-19 17:27:17 -07:00
test_compression_persistence.py
test_compression_trigger_excludes_reasoning.py
test_compressor_fallback_update.py
test_concurrent_interrupt.py	test: remove 50 stale/broken tests to unblock CI (#22098)	2026-05-08 14:55:40 -07:00
test_context_token_tracking.py	refactor(session-log): delete _save_session_log and all callers	2026-05-20 11:44:10 -07:00
test_copilot_native_vision_headers.py
test_create_openai_client_kwargs_isolation.py
test_create_openai_client_proxy_env.py
test_create_openai_client_reuse.py	fix(force_close_tcp_sockets): shutdown only, do not release FD (#29507)	2026-05-23 02:31:10 -07:00
test_deepseek_reasoning_content_echo.py
test_deepseek_v4_thinking_live.py
test_dict_tool_call_args.py
test_empty_response_recovery_persistence.py	refactor(session-log): delete _save_session_log and all callers	2026-05-20 11:44:10 -07:00
test_exit_cleanup_interrupt.py
test_file_mutation_verifier.py	fix: classify landed file mutations with diagnostics	2026-05-13 06:46:23 -07:00
test_image_rejection_fallback.py	fix(agent): catch ChatGPT-account Codex data-URL rejection so images are stripped instead of cascading to compression (#23602)	2026-05-11 07:37:22 -07:00
test_image_shrink_recovery.py
test_init_fallback_on_exhausted_pool.py
test_interactive_interrupt.py
test_interrupt_propagation.py
test_invalid_context_length_warning.py
test_iteration_budget_race.py
test_jsondecodeerror_retryable.py	refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards	2026-05-16 22:55:49 -07:00
test_last_reasoning_per_turn.py
test_long_context_tier_429.py
test_materialize_data_url_cleanup.py	fix(misc): three small defensive fixes from PR #1974	2026-05-10 22:28:01 -07:00
test_memory_nudge_counter_hydration.py	refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards	2026-05-16 22:55:49 -07:00
test_memory_provider_init.py
test_memory_sync_interrupted.py
test_message_sequence_repair.py	fix(run_agent): break permanent empty-response loop from orphan tool-tail (#21385)	2026-05-07 08:35:10 -07:00
test_multimodal_tool_content_recovery.py	fix(agent): recover from providers rejecting list-type tool content (#27344) (#30259)	2026-05-21 23:40:16 -07:00
test_openai_client_lifecycle.py	fix(gateway): harden kanban and provider cleanup races	2026-05-20 14:31:22 -07:00
test_percentage_clamp.py
test_plugin_context_engine_init.py
test_primary_runtime_restore.py	fix(agent): reset _fallback_index at turn start even when no fallback activated	2026-05-16 17:12:48 -07:00
test_provider_attribution_headers.py	feat(nvidia): add NIM billing origin header	2026-05-15 14:06:51 -07:00
test_provider_fallback.py	fix(fallback): skip chain entries matching current provider/model/base_url (#22780)	2026-05-09 12:48:19 -07:00
test_provider_parity.py	fix(tests): stabilize xai env and provider parity	2026-05-17 11:55:25 -07:00
test_real_interrupt_subagent.py
test_redirect_stdout_issue.py
test_repair_tool_call_arguments.py
test_repair_tool_call_name.py
test_review_prompt_class_first.py	fix(review): tell background reviewer not to capture transient env failures as skills (#23004)	2026-05-09 22:51:25 -07:00
test_run_agent.py	fix(agent): fail fast on small Ollama runtime context	2026-05-21 23:25:01 -07:00
test_run_agent_codex_responses.py	test(session-log): pin no-session_json regression + drop trailing whitespace	2026-05-20 11:44:10 -07:00
test_run_agent_multimodal_prologue.py
test_sequential_chats_live.py
test_session_id_env.py	feat: expose HERMES_SESSION_ID to agent tools via ContextVar + env (#23847)	2026-05-12 00:16:45 +05:30
test_session_meta_filtering.py
test_session_reset_fix.py
test_steer.py
test_stream_drop_logging.py	feat(stream-retry): add upstream + timing diagnostics to drop log (#23005)	2026-05-09 22:49:35 -07:00
test_stream_interrupt_retry.py
test_streaming.py	fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184)	2026-05-16 17:09:41 -07:00
test_streaming_tool_call_repair.py	chore: remove Atropos RL environments and tinker-atropos integration (#26106)	2026-05-15 10:36:38 +05:30
test_strict_api_validation.py
test_strip_reasoning_tags_cli.py
test_switch_model_context.py	test(ci): stabilize shared optional dependency baselines	2026-05-13 17:32:22 -07:00
test_switch_model_fallback_prune.py
test_thinking_only_sanitizer.py
test_tls_fd_recycle_corruption.py	test(tls-fd-recycle): pin shutdown-only + thread-aware close contract (#29507)	2026-05-23 02:31:10 -07:00
test_token_persistence_non_cli.py	fix: make session search initialize session db	2026-05-09 14:36:58 -07:00
test_tool_arg_coercion.py
test_tool_call_args_sanitizer.py	ci(tests): add pytest-timeout 60s hard cap to break suite-teardown deadlock (#28861)	2026-05-19 17:27:24 -07:00
test_tool_call_guardrail_runtime.py	fix: add recovery hints to loop guard warnings	2026-05-19 00:12:12 -07:00
test_tool_executor_contextvar_propagation.py	refactor(run_agent): extract tool execution to agent/tool_executor.py	2026-05-16 18:24:05 -07:00
test_tool_name_db_persistence.py	fix(agent): set tool_name on tool-result messages at construction time	2026-05-19 20:49:11 +01:00
test_unicode_ascii_codec.py
test_vision_aware_preprocessing.py	fix(agent): resolve supports_vision override for named custom providers	2026-05-20 23:27:10 -07:00