hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-31 19:16:29 +00:00

Teknium 31f70d1f2a fix(ci): recover 38 failing tests on main (#17642)

CI Tests workflow has been red on main for 40+ consecutive runs. This commit recovers every failure visible in run 25130722163 (most recent completed run prior to this PR).

Root causes, by group:

Test-mock drift after product landed (fix: update mocks)
- test_mcp_structured_content / test_mcp_dynamic_discovery (6 tests): product added _rpc_lock (#`02ae15222`) and _schedule_tools_refresh (#`1350d12b0`) without updating sibling test files. Install a real asyncio.Lock inside the fake run-loop and patch at _schedule_tools_refresh.
- test_session.py: renamed normalize_whatsapp_identifier → canonical_whatsapp_identifier upstream; keep a local alias so the legacy tests keep working.
- test_run_progress_topics Slack DM test: PR #8006 made Slack default tool_progress=off; explicitly set it to 'all' in the test fixture so the progress-callback path still runs. Also read tool_progress_callback at call time rather than freezing it in FakeAgent.__init__; production assigns it AFTER construction.
- test_tui_gateway_server session-create/close race: session.create now defers _start_agent_build behind a 50ms timer; wait for the build thread to enter _make_agent before closing, otherwise the orphan-cleanup path never runs.
- test_protocol session.resume: product get_messages_as_conversation now takes include_ancestors kwarg; accept *_kwargs in the test stub.
- test_copilot_acp_client redaction: redactor is OFF by default (snapshots HERMES_REDACT_SECRETS at import); patch agent.redact._REDACT_ENABLED=True for the duration of the test.
- test_minimax_provider: after #17171, dots in non-Anthropic model names stay dots even with preserve_dots=False. Assert the new invariant rather than the old 'broken for MiniMax' behavior.
- test_update_autostash: updater now scans `ps -A` for dashboard PIDs; the test's catch-all subprocess.run stub needed stdout/stderr fields.
- test_accretion_caps: read_timestamps dict is populated lazily when os.path.getmtime succeeds. Use .get("read_timestamps", {}) to tolerate CI filesystems where the stat races file creation.

Change-detector tests (fix: rewrite as structural invariants)
- test_credential_sources_registry_has_expected_steps: was a frozen set comparison that broke when minimax-oauth was added. Rewrite as an invariant check (every step has description, no dupes, core steps present) per AGENTS.md 'don't write change-detector tests'.

xdist ordering / test pollution (fix: reset state, use module-local patches)
- test_setup vercel: sibling test saved VERCEL_PROJECT_ID='project' to os.environ via save_env_value() and never cleared it. monkeypatch.delenv the VERCEL_ vars in the link-file test.
- test_clipboard TestIsWsl: GitHub Actions is on Azure VMs whose real /proc/version often contains 'microsoft'. Patching builtins.open with mock_open didn't reliably intercept hermes_constants.is_wsl's call in xdist workers that had already cached _wsl_detected=True from an earlier test. Patch hermes_constants.open directly and add teardown_method to reset the cache after each test.

Pytest-asyncio cancellation hangs (fix: bound product await with timeout)
- test_session_split_brain_11016 (3 params) + test_gateway_shutdown cancel-inflight: under pytest-asyncio 1.3.0, 'await task' and 'asyncio.gather(cancelled_tasks)' can stall for 30s when the cancelled task's finally block awaits typing-task cleanup. Bound both with asyncio.wait_for(..., timeout=5.0) and asyncio.shield; the stragglers are released from adapter tracking and allowed to finish unwinding in the background. This is also a legitimate hardening: a wedged finally shouldn't stall the caller's dispatch or a gateway shutdown.

Orphan UI config (fix: merge tiny tab into messaging category)
- test_web_server test_no_single_field_categories: the telegram.reactions config field lived in its own 'telegram' schema category with no siblings. Fold it under 'discord' via _CATEGORY_MERGE so the dashboard doesn't render an orphan single-field tab.

Local verification: 38/38 originally-failing tests pass; 4044/4044 gateway tests pass; 684/684 targeted subset (all 16 touched test files) passes.

2026-04-29 20:05:32 -07:00
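
The cancellation-hang fix described in the commit message above (bounding the await on a cancelled task with asyncio.wait_for and asyncio.shield) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names cancel_with_bound, worker and demo are hypothetical, and the short timeouts stand in for the 5.0 seconds mentioned in the message.

```python
import asyncio


async def cancel_with_bound(task, timeout):
    """Cancel `task` and wait at most `timeout` seconds for it to unwind.

    Returns True if the task finished unwinding in time, False if it is
    still running its cleanup in the background. (Hypothetical helper.)
    """
    task.cancel()
    try:
        # shield() keeps the timeout from cancelling the task a second time;
        # on timeout only the outer wait is abandoned.
        await asyncio.wait_for(asyncio.shield(task), timeout=timeout)
    except asyncio.TimeoutError:
        return False  # cleanup is slow; let it finish in the background
    except asyncio.CancelledError:
        if not task.done():
            raise  # the caller itself was cancelled, so propagate
    return True


async def worker(cleanup_seconds):
    """Stand-in for a task whose finally block awaits slow cleanup."""
    try:
        await asyncio.sleep(3600)
    finally:
        await asyncio.sleep(cleanup_seconds)


async def demo():
    fast = asyncio.create_task(worker(0.0))
    slow = asyncio.create_task(worker(0.3))
    await asyncio.sleep(0)  # let both tasks reach their first await
    fast_done = await cancel_with_bound(fast, timeout=0.1)
    slow_done = await cancel_with_bound(slow, timeout=0.1)
    # Wait for the straggler so the event loop shuts down cleanly.
    await asyncio.gather(slow, return_exceptions=True)
    return fast_done, slow_done


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The design point is that the caller gives up waiting after the timeout while the task keeps unwinding, so a wedged finally block delays neither dispatch nor shutdown.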
..
acp	fix(acp): wire HERMES_SESSION_KEY per session so sudo cache scope activates	2026-04-28 01:34:16 -07:00
agent	fix(ci): recover 38 failing tests on main (#17642)	2026-04-29 20:05:32 -07:00
cli	feat: add Vercel Sandbox backend	2026-04-29 07:22:33 -07:00
cron	fix(cron): use last_run_at as croniter base for cron jobs	2026-04-29 08:24:48 -07:00
e2e	fix(gateway): coerce plaintext "restart gateway" DMs to /restart	2026-04-28 01:40:28 -07:00
environments/benchmarks
fakes
gateway	fix(ci): recover 38 failing tests on main (#17642)	2026-04-29 20:05:32 -07:00
hermes_cli	fix(ci): recover 38 failing tests on main (#17642)	2026-04-29 20:05:32 -07:00
hermes_state	fix(resume): redirect --resume to the descendant that actually holds the messages	2026-04-24 03:04:42 -07:00
honcho_plugin	feat(honcho): explain why when honcho_profile returns an empty card	2026-04-27 12:37:33 -07:00
integration	fix(discord): strip RTP padding before DAVE/Opus decode (#11267)	2026-04-16 16:50:15 -07:00
plugins	fix(hindsight): route flush-on-switch through writer queue, not raw thread	2026-04-29 08:09:03 -07:00
run_agent	fix(gemini): pass base_url into chat transport	2026-04-29 12:10:40 -07:00
skills	test(openclaw-migration): cover alias reverse-lookup for real OpenClaw schema	2026-04-28 04:58:13 -07:00
tools	fix(ci): recover 38 failing tests on main (#17642)	2026-04-29 20:05:32 -07:00
tui_gateway	fix(tui): responsive /compress with live progress + CLI-parity feedback (#17661)	2026-04-29 18:01:18 -07:00
website	fix(website): auto-wrap ASCII-art code blocks in generated skill pages (#16497)	2026-04-27 03:38:39 -07:00
__init__.py
conftest.py	perf(tools): memoize get_tool_definitions + TTL-cache check_fn results (#17098)	2026-04-28 18:20:17 -07:00
run_interrupt_test.py
test_account_usage.py	feat(account-usage): add per-provider account limits module	2026-04-21 01:56:35 -07:00
test_atomic_replace_symlinks.py	refactor: consolidate symlink-safe atomic replace into shared helper	2026-04-28 04:58:22 -07:00
test_base_url_hostname.py	security(runtime_provider): close OLLAMA_API_KEY substring-leak sweep miss (#13522)	2026-04-21 06:06:16 -07:00
test_batch_runner_checkpoint.py	test: regression coverage for checkpoint dedup and inf/nan coercion	2026-04-24 14:32:21 -07:00
test_cli_file_drop.py	fix(tui): improve macOS paste and shortcut parity	2026-04-21 08:00:00 -07:00
test_cli_manual_compress.py	test(cli): regression test for manual /compress system_message	2026-04-28 05:21:49 -07:00
test_cli_skin_integration.py	fix(tui): restore macOS copy behavior and theme polish (#17131)	2026-04-28 18:47:14 -05:00
test_ctx_halving_fix.py
test_empty_model_fallback.py
test_evidence_store.py
test_hermes_constants.py
test_hermes_logging.py	fix(logging): attach gateway log after cli init	2026-04-26 19:01:26 -07:00
test_hermes_state.py	fix(state): repair FTS5 delete trigger and add v11 migration for tool-call indexing	2026-04-28 01:33:00 -07:00
test_honcho_client_config.py
test_install_sh_setup_wizard_tty_probe.py	fix(install): widen /dev/tty open-probe to sibling gates (#16746)	2026-04-28 06:45:55 -07:00
test_ipv4_preference.py
test_mcp_serve.py
test_mini_swe_runner.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)	2026-04-20 12:23:05 -07:00
test_minimax_model_validation.py	fix(models): validate MiniMax models against static catalog (#12611, #12460, #12399, #12547)	2026-04-19 22:44:47 -07:00
test_minimax_oauth.py	test(cli): cover minimax-oauth resolution, refresh, menu wiring	2026-04-29 09:53:42 -07:00
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools.py	fix(plugins): stop firing pre_tool_call hook twice per tool execution (#17611)	2026-04-29 12:43:39 -07:00
test_model_tools_async_bridge.py	fix(model_tools): cancel coroutine on timeout so worker thread exits + log full traceback	2026-04-29 05:00:40 -07:00
test_ollama_num_ctx.py
test_packaging_metadata.py
test_plugin_skills.py	fix(tests): attach caplog to specific logger in 3 order-dependent tests (#11453)	2026-04-17 00:20:40 -07:00
test_project_metadata.py	build(deps): add qrcode to dingtalk + feishu extras (parity with messaging) (#11627)	2026-04-17 13:31:53 -07:00
test_retry_utils.py
test_sql_injection.py
test_subprocess_home_isolation.py
test_timezone.py	test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)	2026-04-17 14:21:22 -07:00
test_toolset_distributions.py
test_toolsets.py	feat(discord): split discord_server into discord + discord_admin tools	2026-04-25 04:50:14 -07:00
test_trajectory_compressor.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)	2026-04-20 12:23:05 -07:00
test_trajectory_compressor_async.py	fix(kimi): omit temperature entirely for Kimi/Moonshot models (#13157)	2026-04-20 12:23:05 -07:00
test_transform_tool_result_hook.py	test: stop testing mutable data: convert change-detectors to invariants (#13363)	2026-04-20 23:20:33 -07:00
test_tui_gateway_server.py	fix(ci): recover 38 failing tests on main (#17642)	2026-04-29 20:05:32 -07:00
test_utils_truthy_values.py
test_yuanbao_integration.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00
test_yuanbao_markdown.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00
test_yuanbao_pipeline.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00
test_yuanbao_proto.py	yuanbao platform (#16298)	2026-04-26 18:50:49 -07:00