CI Tests workflow has been red on main for 40+ consecutive runs. This
commit recovers every failure visible in run 25130722163 (most recent
completed run prior to this PR).
Root causes, by group:
Test-mock drift after product landed (fix: update mocks)
- test_mcp_structured_content / test_mcp_dynamic_discovery (6 tests):
product added _rpc_lock (#02ae15222) and _schedule_tools_refresh
(#1350d12b0) without updating sibling test files. Install a real
asyncio.Lock inside the fake run-loop and patch at _schedule_tools_refresh.
- test_session.py: normalize_whatsapp_identifier was renamed to
canonical_whatsapp_identifier upstream; keep a local alias so the
legacy tests keep working.
- test_run_progress_topics Slack DM test: PR #8006 made Slack default
tool_progress=off; explicitly set it to 'all' in the test fixture so
the progress-callback path still runs. Also read tool_progress_callback
at call time rather than freezing it in FakeAgent.__init__ — production
assigns it AFTER construction.
- test_tui_gateway_server session-create/close race: session.create now
defers _start_agent_build behind a 50ms timer — wait for the build
thread to enter _make_agent before closing, otherwise the orphan-
cleanup path never runs.
- test_protocol session.resume: product get_messages_as_conversation now
takes include_ancestors kwarg; accept **_kwargs in the test stub.
- test_copilot_acp_client redaction: redactor is OFF by default (snapshots
HERMES_REDACT_SECRETS at import); patch agent.redact._REDACT_ENABLED=True
for the duration of the test.
- test_minimax_provider: after #17171, dots in non-Anthropic model names
stay dots even with preserve_dots=False. Assert the new invariant
rather than the old 'broken for MiniMax' behavior.
- test_update_autostash: updater now scans `ps -A` for dashboard PIDs;
the test's catch-all subprocess.run stub needed stdout/stderr fields.
- test_accretion_caps: read_timestamps dict is populated lazily when
os.path.getmtime succeeds. Use .get("read_timestamps", {}) to tolerate
CI filesystems where the stat races file creation.
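A minimal sketch of the tool_progress_callback fix listed above, assuming a heavily simplified test double. The class names, `run_tool`, and `collect_events` are illustrative and not the actual FakeAgent; only the attribute name comes from the notes above.

```python
from typing import Callable, List, Optional


class FrozenFakeAgent:
    """Old pattern: snapshots the callback while the object is being built."""

    def __init__(self) -> None:
        self.tool_progress_callback: Optional[Callable[[str], None]] = None
        self._callback = self.tool_progress_callback  # frozen as None

    def run_tool(self, name: str) -> None:
        if self._callback is not None:
            self._callback(name)


class FakeAgent:
    """New pattern: looks the callback up every time a tool runs."""

    def __init__(self) -> None:
        self.tool_progress_callback: Optional[Callable[[str], None]] = None

    def run_tool(self, name: str) -> None:
        callback = self.tool_progress_callback  # read at call time
        if callback is not None:
            callback(name)


def collect_events(agent_cls) -> List[str]:
    """Mimic production ordering: construct first, assign the callback after."""
    events: List[str] = []
    agent = agent_cls()
    agent.tool_progress_callback = events.append
    agent.run_tool("search")
    return events
```

With the callback assigned after construction, the frozen variant records nothing while the call-time variant records the tool name.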
Change-detector tests (fix: rewrite as structural invariants)
- test_credential_sources_registry_has_expected_steps: was a frozen set
comparison that broke when minimax-oauth was added. Rewrite as an
invariant check (every step has description, no dupes, core steps
present) per AGENTS.md 'don't write change-detector tests'.
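As a rough illustration of the invariant-style check described above, assuming the registry can be viewed as a list of dicts with `name` and `description` keys. That shape, the step names `env-var` and `config-file`, and the helper `registry_problems` are hypothetical; only `minimax-oauth` is taken from the notes.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set

# Hypothetical registry contents, for illustration only.
EXAMPLE_STEPS = [
    {"name": "env-var", "description": "Read the key from the environment"},
    {"name": "config-file", "description": "Read the key from the config file"},
]
CORE_STEPS = {"env-var", "config-file"}


def registry_problems(steps: Iterable[Dict[str, str]], core: Set[str]) -> List[str]:
    """Return a list of invariant violations; an empty list means the registry is sound."""
    steps = list(steps)
    names = [step.get("name", "") for step in steps]
    problems: List[str] = []
    # Invariant 1: no step name appears twice.
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"duplicate step: {name}")
    # Invariant 2: every step carries a non-empty description.
    for step in steps:
        if not step.get("description", "").strip():
            problems.append(f"missing description: {step.get('name', '')}")
    # Invariant 3: the core steps are still present.
    for name in sorted(core - set(names)):
        problems.append(f"missing core step: {name}")
    return problems
```

Unlike a frozen-set comparison, adding a new well-formed step (such as minimax-oauth) leaves the result empty, while a duplicate or a missing core step is still reported.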
xdist ordering / test pollution (fix: reset state, use module-local patches)
- test_setup vercel: sibling test saved VERCEL_PROJECT_ID='project' to
os.environ via save_env_value() and never cleared it. monkeypatch.delenv
the VERCEL_* vars in the link-file test.
- test_clipboard TestIsWsl: GitHub Actions is on Azure VMs whose real
/proc/version often contains 'microsoft'. Patching builtins.open with
mock_open didn't reliably intercept hermes_constants.is_wsl's call in
xdist workers that had already cached _wsl_detected=True from an
earlier test. Patch hermes_constants.open directly and add
teardown_method to reset the cache after each test.
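The TestIsWsl fix can be sketched as follows, under loud assumptions: `fake_constants` is a throwaway stand-in for hermes_constants, and its `is_wsl` body is a guess at the general shape (a cached read of /proc/version), not the real implementation. The point is only that patching the module's own `open` name and resetting the cache around each check makes the result independent of test order and of the host's real /proc/version.

```python
import types
from unittest import mock

# Stand-in module source; the real hermes_constants code is not reproduced here.
_SOURCE = '''
_wsl_detected = None

def is_wsl():
    global _wsl_detected
    if _wsl_detected is None:
        try:
            with open("/proc/version") as fh:
                _wsl_detected = "microsoft" in fh.read().lower()
        except OSError:
            _wsl_detected = False
    return _wsl_detected
'''
fake_constants = types.ModuleType("fake_constants")
exec(_SOURCE, fake_constants.__dict__)


def detect_with_proc_version(text: str) -> bool:
    """Run is_wsl() against a fake /proc/version, with the cache reset around it."""
    fake_constants._wsl_detected = None  # setup: start from a cold cache
    patcher = mock.patch.object(
        fake_constants, "open", mock.mock_open(read_data=text), create=True
    )
    try:
        with patcher:  # shadows the builtin only inside the stand-in module
            return fake_constants.is_wsl()
    finally:
        fake_constants._wsl_detected = None  # teardown: leave no cached state behind
```

`create=True` is needed because the module has no `open` attribute of its own until the patch installs one; the patch removes it again on exit.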
Pytest-asyncio cancellation hangs (fix: bound product await with timeout)
- test_session_split_brain_11016 (3 params) + test_gateway_shutdown
cancel-inflight: under pytest-asyncio 1.3.0, 'await task' and
'asyncio.gather(cancelled_tasks)' can stall for 30s when the cancelled
task's finally block awaits typing-task cleanup. Bound both with
asyncio.wait_for(..., timeout=5.0) and asyncio.shield — the stragglers
are released from adapter tracking and allowed to finish unwinding in
the background. This is also a legitimate hardening: a wedged finally
shouldn't stall the caller's dispatch or a gateway shutdown.
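A minimal sketch of the bounded-cancellation pattern described above, assuming a stand-in `worker` whose finally block blocks on cleanup. The names `worker`, `cancel_bounded`, and `demo` are illustrative and are not the gateway's actual functions.

```python
import asyncio


async def worker(release: asyncio.Event) -> None:
    """Stand-in for a task whose finally block awaits slow cleanup."""
    try:
        await asyncio.sleep(3600)
    finally:
        await release.wait()  # simulates a wedged typing-task cleanup


async def cancel_bounded(task: asyncio.Task, timeout: float) -> bool:
    """Cancel `task` and wait at most `timeout` seconds for it to unwind.

    Returns True if the task finished in time, False if it is still unwinding.
    shield() keeps the timeout from cancelling the task a second time, so a
    straggler is left to finish in the background.
    """
    task.cancel()
    try:
        await asyncio.wait_for(asyncio.shield(task), timeout=timeout)
    except asyncio.TimeoutError:
        return False
    except asyncio.CancelledError:
        if not task.done():
            raise  # the caller itself was cancelled, not just the task
    return True


async def demo():
    release = asyncio.Event()
    task = asyncio.ensure_future(worker(release))
    await asyncio.sleep(0)  # let the worker reach its first await
    finished_in_time = await cancel_bounded(task, timeout=0.05)
    release.set()  # cleanup can now complete
    await asyncio.wait([task])
    return finished_in_time, task.cancelled()
```

Here `asyncio.run(demo())` returns `(False, True)`: the caller is released after 50ms even though cleanup is stuck, and the task still ends up cancelled once cleanup completes.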
Orphan UI config (fix: merge tiny tab into messaging category)
- test_web_server test_no_single_field_categories: the telegram.reactions
config field lived in its own 'telegram' schema category with no
siblings. Fold it under 'discord' via _CATEGORY_MERGE so the dashboard
doesn't render an orphan single-field tab.
Local verification: 38/38 originally-failing tests pass; 4044/4044
gateway tests pass; 684/684 targeted subset (all 16 touched test files)
passes.
When `hermes update` stashes local changes and the restore hits merge
conflicts, the old code prompted the user to reset or keep conflict
markers. If the user declined the reset, git conflict markers
(<<<<<<< Updated upstream) were left in source files, making hermes
completely unrunnable with a SyntaxError on the next invocation.
Additionally, the interactive path called sys.exit(1), which killed
the entire update process before pip dependency install, skill sync,
and gateway restart could finish — even though the code pull itself
had succeeded.
Changes:
- Always auto-reset to clean state when stash restore conflicts
- Remove the "Reset working tree?" prompt (footgun)
- Remove sys.exit(1) — return False so cmd_update continues normally
- User's changes remain safely in the stash for manual recovery
Also fixes a secondary bug where the conflict handling prompt used
bare input() instead of the input_fn parameter, which would hang
in gateway mode.
Tests updated: replaced prompt/sys.exit assertions with auto-reset
behavior checks; removed the "user declines reset" test (path no
longer exists).
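The auto-reset behavior can be sketched roughly as below. This is not the actual updater code: `restore_stash`, `run_git`, and the injected `run` parameter are hypothetical, and using `git diff --name-only --diff-filter=U` to list unmerged paths is an assumption about how the check could be done.

```python
import subprocess
from typing import Callable, List

GitRunner = Callable[[List[str]], subprocess.CompletedProcess]


def run_git(args: List[str]) -> subprocess.CompletedProcess:
    """Default runner: invoke git in the current directory and capture output."""
    return subprocess.run(["git", *args], capture_output=True, text=True)


def restore_stash(run: GitRunner = run_git) -> bool:
    """Try to reapply stashed changes; never prompt and never exit the process.

    Returns True when the stash applied cleanly. On conflict the working tree
    is reset to HEAD and False is returned so the caller can carry on with the
    rest of the update. The stash entry is left untouched in both cases here;
    dropping it after a clean apply is outside this sketch.
    """
    applied = run(["stash", "apply"])
    # Assumed check: list paths still unmerged, even if apply reported success.
    unmerged = run(["diff", "--name-only", "--diff-filter=U"]).stdout.splitlines()
    if applied.returncode == 0 and not unmerged:
        return True
    run(["reset", "--hard", "HEAD"])  # remove conflict markers from source files
    return False
```

Injecting the runner keeps the function testable without a real repository, which is also how the prompt/sys.exit assertions can be replaced with checks on the issued git commands.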
Follow-up for salvaged PR #2352:
- Replace hardcoded Path(os.getenv('HERMES_HOME', ...)) with
get_hermes_home() from hermes_constants (2 places)
- Consolidate redundant git_cmd_base into the existing git_cmd
variable, constructed once before fork detection
- Update autostash tests for the unmerged index check added
in the previous commit
* fix: harden `hermes update` against diverged history, non-main branches, and gateway edge cases
The self-update command (`hermes update` / gateway `/update`) could fail
or silently corrupt state in several scenarios:
1. **Diverged history** — `git pull --ff-only` aborts with a cryptic
subprocess error when upstream has force-pushed or rebased. Now falls
back to `git reset --hard origin/main` since local changes are already
stashed.
2. **User on a feature branch / detached HEAD** — the old code would
either clobber the feature branch HEAD to point at origin/main, or
silently pull against a non-existent remote branch. Now checks out
main automatically before pulling, with a clear warning.
3. **Fetch failures** — network or auth errors produced raw subprocess
tracebacks. Now shows user-friendly messages ("Network error",
"Authentication failed") with actionable hints.
4. **reset --hard failure** — if the fallback reset itself fails (disk
full, permissions), the old code would still attempt stash restore on
a broken working tree. Now skips restore and tells the user their
changes are safe in stash.
5. **Gateway /update stash conflicts** — non-interactive mode (Telegram
`/update`) called sys.exit(1) when stash restore had conflicts, making
the entire update report as failed even though the code update itself
succeeded. Now treats stash conflicts as non-fatal in non-interactive
mode (returns False instead of exiting).
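Scenarios 1, 3, and 4 above can be sketched as a single decision chain. This is an illustration under stated assumptions, not the real update code: `update_branch`, `describe_fetch_failure`, the returned status strings, and the stderr substrings used for classification are all hypothetical.

```python
import subprocess
from typing import Callable, List

GitRunner = Callable[[List[str]], subprocess.CompletedProcess]


def run_git(args: List[str]) -> subprocess.CompletedProcess:
    """Default runner: invoke git in the current directory and capture output."""
    return subprocess.run(["git", *args], capture_output=True, text=True)


def describe_fetch_failure(stderr: str) -> str:
    """Map raw git stderr to a short user-facing message (heuristic, assumed)."""
    text = stderr.lower()
    if "authentication failed" in text or "permission denied" in text:
        return "Authentication failed"
    if "could not resolve host" in text or "unable to access" in text:
        return "Network error"
    return "Fetch failed"


def update_branch(run: GitRunner = run_git, branch: str = "main") -> str:
    """Fetch, try a fast-forward, and fall back to a hard reset on divergence."""
    fetch = run(["fetch", "origin"])
    if fetch.returncode != 0:
        return describe_fetch_failure(fetch.stderr)
    if run(["pull", "--ff-only", "origin", branch]).returncode == 0:
        return "fast-forwarded"
    # Diverged history: local changes are assumed to be stashed already.
    if run(["reset", "--hard", f"origin/{branch}"]).returncode == 0:
        return "reset to upstream"
    # The reset itself failed: the caller should skip stash restore.
    return "reset failed; skipping stash restore"
```

Returning a status instead of raising lets the caller decide whether to restore the stash, which matches the intent that a failed fallback reset must not be followed by a restore on a broken working tree.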
* fix: restore stash and branch on 'already up to date' early return
The PR moved stash creation before the commit-count check (needed for
the branch-switching feature), but the 'already up to date' early return
didn't restore the stash or switch back to the original branch — leaving
the user stranded on main with changes trapped in a stash.
Now the early-return path restores the stash and checks out the original
branch when applicable.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
When 'hermes update' stashes local changes and the restore hits
conflicts, the previous behavior silently ran 'git reset --hard HEAD'
to clean up. This could surprise users who didn't realize their
working tree was being nuked.
Now the conflict handler:
- Lists the specific conflicted files
- Reassures the user their stash is preserved
- Asks before resetting (interactive mode)
- Auto-resets in non-interactive mode (prompt_user=False)
- If declined, leaves the working tree as-is with guidance
When `hermes update` stashes local changes and the subsequent
`git stash apply` fails or leaves unmerged files, the conflict markers
(<<<<<<< etc.) were left in the working tree, making Hermes unrunnable
until manually cleaned up.
Now the update command runs `git reset --hard HEAD` to restore a clean
working tree before exiting, and also detects unmerged files even when
git stash apply reports success.
Closes #2348
Add a restore prompt for interactive updates, keep the stash when the user declines, and print a post-restore warning that local changes were reapplied on top of updated code.