hermes-agent/tests
Eri Barrett ba9e3a491b
feat(memory): Honcho OAuth connect — desktop and CLI flows + token refresh (#44335)
* feat(memory): OAuth token storage and refresh for the Honcho provider

* feat(memory): refresh the Honcho OAuth token in the client and session

* feat(memory): zero-CLI loopback OAuth authorization flow

* feat(memory): generic memory-provider OAuth connect endpoints

* feat(desktop): memory-provider OAuth connect link

* feat(memory): CLI OAuth sign-in with source-tagged authorize links

* fix(memory): IP-literal loopback redirect and consent config_path on the authorize link

* fix(memory): profile-scope the memory-provider OAuth endpoints

* refactor(desktop): generic memory-provider OAuth client functions

* docs(memory): trim OAuth module docstrings to the invariants

* docs(memory): document OAuth connect as an optional auth method

* fix(memory): send home-relative display path to consent, not the absolute path

* perf(memory): cache OAuth token expiry in memory to skip the hot-path disk read

* fix(memory): log OAuth refresh failures at warning, not debug

* feat(memory): fall back to an OS-assigned loopback port when 8765 is taken

* test(memory): cover the desktop Connect launcher, status, and provider dispatch

* fix(desktop): keep the memory-provider dropdown one size regardless of connect state

* fix(desktop): move the memory connect link to the description line, leaving the dropdown untouched

* refactor(memory): move OAuth connect routes out of web_server into a memory-layer router

* refactor(desktop): import MemoryConnect directly, drop the single-export barrel

* fix(memory): launch CLI OAuth sign-in right after the auth choice, not after the wizard

* fix(desktop): auto-clear the OAuth error state instead of leaving it sticky

* test(honcho): isolate auth-method prompt from deployment-shape wizard tests

main's wizard suite scripts the cloud prompts without the OAuth auth-method step; auto-answer it in the shared helper so the answer lists stay shape-only.

* docs(honcho): document query-adaptive reasoning level (reasoningHeuristic)

README never mentioned reasoningHeuristic and listed reasoningLevelCap as an orphaned cap with the wrong default (— vs "high"). Add the query-adaptive scaling note + the reasoningHeuristic/reasoningLevelCap rows (grouped under Dialectic & Reasoning), matching the wording already on the hosted honcho.md page, and add a pointer from the memory-providers overview.

* fix(honcho): default the CLI peer prompt to the OAuth consent name

The CLI runs the grant with apply_config=False, so the peerName the user just entered at consent was dropped and the wizard's 'Your name' prompt fell back to $USER. Surface it as a transient OAuthCredential.consent_peer_name (set even when config isn't merged) and seed the prompt default from it.

* feat(honcho): split OAuth client_id by surface (cli=hermes-agent, desktop=hermes-desktop)

resolve_endpoints now picks the client_id from the initiating surface and
threads it through authorize -> token exchange -> persisted grant -> refresh,
so the CLI and desktop register as distinct OAuth clients. Surface-specific
env overrides (HONCHO_OAUTH_CLIENT_ID_CLI/_DESKTOP) win over the generic
HONCHO_OAUTH_CLIENT_ID, which still overrides every surface.

* feat(honcho): show OAuth vs API key in status; detect existing OAuth in setup

status now prints 'Auth: OAuth (clientId, token valid Xm/expired)' instead of
masking the OAuth access token as a generic API key; setup notes an existing
OAuth grant when re-run.

* docs(honcho): drop 'shared pool' wording from unified observation mode help

* fix(honcho): cross-process lock around OAuth refresh to prevent grant revocation

The in-process threading lock can't stop a sibling process (another profile or
the desktop app sharing honcho.json) from replaying the single-use refresh
token and tripping reuse-detection, which revokes the whole grant. Guard the
read-refresh-persist section with an OS file lock on <config>.lock so only one
process rotates at a time; the others re-read the freshly-persisted token.
Best-effort: platforms without flock degrade to in-process serialization.

* refactor(honcho): one OAuth client (hermes-agent) for all surfaces

Collapse the per-surface client_id split. CLI and desktop now use a single
client_id (hermes-agent); consent branding/UI still adapt via the source query
param. One grant identity means no clientId-vs-refresh-token desync that could
get the grant revoked. HONCHO_OAUTH_CLIENT_ID still overrides for self-hosting.

* fix(honcho): per-session resolves to session_id, never remapped by title

Reorder resolve_session_name so stable identifiers win over labels: gateway
per-chat key first, then the per-session session_id, then the cwd map / title.
A (possibly auto-generated) title can no longer remap a live per-session
conversation onto a second Honcho session mid-stream — fixes the desktop, which
is per-conversation via session_id. Consequence: a gateway's per-chat key now
also wins over a title (titles never remap a stable id).
2026-06-22 19:16:47 -05:00
..
acp fix(codex): seed app-server sessions with configured cwd 2026-06-21 16:39:02 -07:00
acp_adapter feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
agent fix(agent): complete final text on last turn 2026-06-22 13:57:59 -07:00
cli feat(goals): /goal wait <pid> — park the loop on a background process (#50503) 2026-06-22 06:27:29 -07:00
computer_use feat(computer_use): disable cua-driver telemetry by default, add opt-in (#50842) 2026-06-22 09:57:16 -07:00
cron fix(cron): scope job execution to its owning profile (#32091 follow-up) (#50993) 2026-06-22 14:54:28 -07:00
docker fix(docker): replace dashboard --insecure with basic-auth provider 2026-06-21 19:05:27 -07:00
e2e refactor(gateway): migrate slack/dingtalk/whatsapp/matrix/feishu/telegram/wecom/email/sms adapters to bundled plugins 2026-06-20 10:26:45 -07:00
fakes
fixtures/plugins/example-dashboard/dashboard feat(dashboard): nous-blue theme, bulk sessions, schedule picker (#37383) 2026-06-02 12:37:40 -04:00
gateway fix(gateway): redact credentials from TUI approval prompts (#48456) 2026-06-23 03:14:18 +05:30
hermes_cli fix(tests): update cua install tests for cross-platform support 2026-06-22 13:41:03 -07:00
hermes_state test: narrow db._conn before raw SQL so ty stops flagging None-union access 2026-06-18 16:04:58 -05:00
honcho_plugin feat(memory): Honcho OAuth connect — desktop and CLI flows + token refresh (#44335) 2026-06-22 19:16:47 -05:00
integration refactor(gateway): migrate Home Assistant adapter to bundled plugin 2026-06-06 11:46:24 -07:00
openviking_plugin fix(openviking): guard empty tool_id in batch skip set; reuse env_var_enabled 2026-06-19 13:53:39 +05:30
plugins fix(memory): fail closed on unclear write results 2026-06-22 07:00:42 -07:00
providers fix(models): pass model.base_url to fetch_models in /model picker 2026-06-16 13:09:40 -07:00
run_agent fix(agent): shrink anthropic-native image history 2026-06-22 18:23:21 -05:00
scripts fix(skills-hub): stop shipping a degenerate index when GitHub taps collapse (#42347) 2026-06-08 15:21:28 -07:00
skills feat(skills): add cloudflare-temporary-deploy optional skill (#50849) 2026-06-22 12:14:30 -07:00
stress chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
tools Merge remote-tracking branch 'origin/main' into pr-50994 2026-06-22 18:48:07 -05:00
tui_gateway fix(tui): persist session messages on force-quit / signal shutdown 2026-06-21 07:26:07 -07:00
website feat(skills): fix browse cap, add source links + copy buttons + category cleanup (#37143) 2026-06-01 19:52:28 -07:00
__init__.py
conftest.py feat(managed-scope): add managed_scope module (resolver, loaders, key helpers) 2026-06-19 07:46:33 -07:00
run_interrupt_test.py
test_account_usage.py
test_assistant_ui_tap_compat.py test(deps): guard @assistant-ui cluster on one tap version 2026-06-15 11:55:02 -04:00
test_atomic_replace_symlinks.py fix(utils): copy fallback for atomic replace across devices (#43852) 2026-06-13 14:50:05 -07:00
test_base_url_hostname.py
test_batch_runner_checkpoint.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_bitwarden_secrets.py fix(bitwarden): prevent zip-slip path traversal when extracting bws binary (#40569) 2026-06-06 18:33:44 -07:00
test_cli_file_drop.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_cli_manual_compress.py fix(tests): catch up six stale tests after compression/aux/kanban changes (#28465) 2026-05-18 21:43:59 -07:00
test_cli_skin_integration.py
test_ctx_halving_fix.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_dashboard_sidecar_close_on_disconnect.py fix(dashboard): hide sidecar sessions from history (#49269) 2026-06-19 18:06:38 -04:00
test_delegate_cascade_49148.py fix(agent): stop delegate cascade from deleting the parent session 2026-06-21 12:09:16 -07:00
test_desktop_electron_pin.py fix(desktop): resolve electronDist dynamically + self-heal blocked installs (supersedes #48081/#48082) (#48091) 2026-06-17 18:48:35 -05:00
test_desktop_mac_entitlements.py test(desktop): assert macOS device entitlements are inherited 2026-06-03 07:32:00 +07:00
test_dispatch_session_id.py fix(dispatch): forward session_id into registry.dispatch (#28479) 2026-06-14 00:27:59 -04:00
test_docker_home_override_scripts.py Repair cron ownership on container restart (#41976) 2026-06-10 15:32:34 +10:00
test_docker_stage2_browser_discovery.py fix(docker): discover Playwright headless_shell browser (#35717) 2026-06-01 16:06:44 +10:00
test_docker_webui_install_surface.py fix(docker): support WebUI installs from read-only sources (#48541) 2026-06-19 10:52:16 +10:00
test_dockerfile_tini_compat_shim.py fix(docker): add /usr/bin/tini compatibility shim for legacy wrappers (#34192) (#34382) 2026-06-01 13:32:55 +10:00
test_empty_model_fallback.py test(models): guard Nous silent default against expensive-flagship escalation 2026-06-05 02:54:34 -07:00
test_empty_session_hygiene.py fix: in-memory transcript blocks empty-session prune 2026-06-10 17:37:34 -07:00
test_env_loader_secret_sources.py fix(secrets): only apply external secrets once per HERMES_HOME per process (#32271) 2026-05-25 15:18:55 -07:00
test_evidence_store.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_gateway_streaming_nested_config.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_get_tool_definitions_cache_isolation.py fix(gateway): close residual memory-leak sites under heavy scheduled workload 2026-06-08 06:32:42 -07:00
test_hermes_bootstrap.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_hermes_constants.py fix(windows): prefer cmd npm shim on PATH fallback 2026-06-21 14:06:39 -06:00
test_hermes_home_profile_warning.py
test_hermes_logging.py refactor(gateway): migrate slack/dingtalk/whatsapp/matrix/feishu/telegram/wecom/email/sms adapters to bundled plugins 2026-06-20 10:26:45 -07:00
test_hermes_state.py test(sessions): cover title reclaim across a compression lineage 2026-06-19 17:36:18 +05:30
test_hermes_state_compression_locks.py fix(compression): prevent session-id fork from concurrent compressions (#34351) 2026-05-28 21:40:39 -07:00
test_hermes_state_wal_fallback.py fix(kanban): skip redundant WAL pragma on already-WAL connections 2026-05-27 14:31:55 -07:00
test_honcho_client_concurrency.py fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150) 2026-06-08 09:35:22 -07:00
test_honcho_client_config.py fix(honcho): harden self-hosted setup paths 2026-05-29 22:29:48 -07:00
test_honcho_session_context.py fix(honcho): align user context peer perspective 2026-05-27 10:49:33 -07:00
test_honcho_startup_fail_open.py fix: make Honcho startup fail open 2026-06-01 20:13:42 -07:00
test_install_no_initial_commit.py fix(install): move broken checkout aside instead of deleting it 2026-06-08 02:18:21 -07:00
test_install_ps1_native_stderr_eap.py fix(install): fail fast when uv venv genuinely fails under relaxed EAP 2026-06-18 22:11:35 +05:30
test_install_ps1_uv_powershell_host.py test(install): lock uv installer to a resolved PowerShell host 2026-06-18 16:26:34 +07:00
test_install_sh_browser_install.py
test_install_sh_install_method_stamp.py fix(update): scope install-method stamp to the code tree, not $HERMES_HOME (#48188) 2026-06-18 14:14:41 +10:00
test_install_sh_node_global_prefix.py fix(install): repair existing managed-Node global prefix on re-run 2026-06-14 17:34:11 +07:00
test_install_sh_pythonpath_sanitization.py
test_install_sh_root_fhs_uv_python_path.py test(install): harden uv-python-path regression test against future drift 2026-05-27 13:55:51 -07:00
test_install_sh_setup_wizard_tty_probe.py
test_install_sh_symlink_stomp.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_install_sh_termux_network_prereqs.py
test_install_unmerged_index.py test(installer): regression for unmerged-index update failure 2026-06-13 05:19:44 -07:00
test_ipv4_preference.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_lazy_session_regressions.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_lint_config.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_live_system_guard_self_test.py
test_mcp_serve.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minimax_oauth.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_minisweagent_path.py
test_model_forces_max_completion_tokens.py fix(params): send max_completion_tokens for newer OpenAI families on custom endpoints 2026-06-09 23:22:10 -07:00
test_model_picker_scroll.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_model_tools.py fix(tools): preserve core tools when a platform bundle is disabled 2026-06-21 16:33:58 +05:30
test_model_tools_async_bridge.py fix(web): run URL SSRF checks off the event loop in async paths 2026-06-04 18:04:47 -07:00
test_ollama_num_ctx.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_output_cap_parsing.py test(agent): cover char-based output-cap overflow parsing (#42741) 2026-06-09 03:17:12 -07:00
test_package_json_lazy_deps.py
test_packaging_metadata.py feat(mcp-catalog): add official Unreal Engine 5.8 MCP server 2026-06-18 09:16:40 -07:00
test_plugin_skills.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_plugin_utils.py fix(plugins): thread-safe lazy-singleton helpers; fix honcho TOCTOU (#24759) (#42150) 2026-06-08 09:35:22 -07:00
test_process_loop_event_loop_warning.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_project_metadata.py fix(deps): align anthropic extra pin with lazy pin + guard whole pin surface (#42335) 2026-06-08 12:11:54 -07:00
test_retry_utils.py
test_run_tests_parallel.py fix(ci): remove pytest-timeout, use per-file timeout only 2026-06-12 13:42:42 -04:00
test_sanitize_tool_error.py
test_slash_worker_watchdog.py feat(slash-worker): self-terminate on parent death via create_time watchdog 2026-06-08 07:03:12 -07:00
test_sql_injection.py
test_state_db_malformed_repair.py fix(state.db): recover from malformed sqlite_master so hidden sessions reappear (#43149) 2026-06-09 18:49:08 -05:00
test_subprocess_home_isolation.py fix: make profile subprocess HOME policy explicit 2026-06-14 03:20:21 -07:00
test_termux_all_extra_compat.py
test_timezone.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_toolset_distributions.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_toolsets.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_trajectory_compressor.py fix(research): keep tool_call/tool_response pairs intact when compressing trajectories 2026-06-07 05:01:27 -07:00
test_trajectory_compressor_async.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_transform_llm_output_hook.py
test_transform_tool_result_hook.py test: stub has_hook in transform_tool_result hook tests 2026-06-03 06:36:46 -07:00
test_tui_gateway_server.py fix dashboard chat session titles 2026-06-21 22:44:02 -07:00
test_tui_gateway_ws.py feat(desktop): composer status stack, live subagent windows, editable prompts (#44630) 2026-06-12 08:30:06 -05:00
test_tui_mcp_late_refresh.py fix(tui): refresh tool snapshot when MCP discovery lands after agent build (#48403) 2026-06-18 05:41:23 -07:00
test_utils_truthy_values.py
test_web_server.py refactor(desktop): use port 0 for ephemeral port discovery instead of PortPool reservation 2026-06-12 14:02:19 -04:00
test_wheel_locales_e2e.py fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix (#38383) 2026-06-03 12:00:27 -07:00
test_yuanbao_integration.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_yuanbao_markdown.py
test_yuanbao_pipeline.py feat(Yuanbao): support wechat forward msg (#43508) 2026-06-12 02:06:47 -07:00
test_yuanbao_proto.py chore: prune unused imports and duplicate import redefinitions 2026-05-28 22:26:25 -07:00
test_yuanbao_shutdown.py fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (#40607) 2026-06-07 17:49:38 -07:00