mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-18 04:41:56 +00:00
29 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
c1eb2dcda7
|
feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220)
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback
Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.
# What this PR makes true
1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
detection banner with copy-pasteable remediation steps the moment
they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
a fresh install to 'core only' — the installer keeps every other
extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
lazy-install on first use under a strict allowlist, instead of
eagerly pulling everything at install time.
# Detection: hermes_cli/security_advisories.py
- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
re-banner after ack.
- Wired into:
* hermes doctor — runs first, prints full remediation block
* hermes doctor --ack <id> — dismisses an advisory
* cli.py interactive run() and single-query branches — short
stderr banner pointing at hermes doctor
* gateway/run.py startup — operator-visible warning in gateway.log
# Lazy-install framework: tools/lazy_deps.py
- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
* tools/tts_tool.py — _import_elevenlabs() calls ensure first
* plugins/memory/honcho/client.py — get_honcho_client lazy-installs
* tts.mistral / stt.mistral entries pre-registered for when PyPI
restores mistralai
# Installer fallback tiers
scripts/install.sh, scripts/install.ps1, setup-hermes.sh:
- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
the same _BROKEN_EXTRAS array so updates stay in sync.
Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).
# Config
hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: [] (advisory IDs the user has dismissed)
- allow_lazy_installs: True (security gate for ensure())
No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.
# Tests
tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
gateway_log_message
- shipped catalog well-formedness invariant
tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command
Combined: 63 new tests, all passing under scripts/run_tests.sh.
# Validation
- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
tests/hermes_cli/test_doctor_command_install.py
tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
9191 passed, 8 pre-existing failures (verified on origin/main
before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
+ gateway_log_message with mocked installed version → produces
copy-pasteable remediation output
# Community
Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md
Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md
Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>
* build(deps): pin every direct dep to ==X.Y.Z (no ranges)
Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.
Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.
What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.
Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.
Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.
mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.
LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.
Validation:
- Cross-checked all 77 pinned direct deps in pyproject.toml against
uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
→ 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.
* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra
You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.
# What this commit fixes
1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
uv.lock records SHA256 hashes for every transitive — a compromised
package with a different hash gets REJECTED. Falls through to the
existing `uv pip install` cascade if the lockfile is missing or
stale, with a loud warning that the fallback path does NOT
hash-verify transitives. Previously only `setup-hermes.sh` (the dev
path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
(the paths fresh users actually run) skipped it.
2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
project is fully quarantined right now — every version returns 404,
so any pin we wrote was unresolvable, which broke `uv lock --check`
in CI. Restoration is documented in pyproject.toml as a 5-step
checklist (verify, re-add extra, re-enable in 4 modules, regenerate
lock, optionally re-add to [all]).
3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
jsonpath-python pruned. `uv lock --check` now passes.
# Defense-in-depth view
| Layer | Where | Protects against |
|----------------------------|-------------------|-------------------------------------------|
| Exact pins in pyproject | direct deps | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install | transitive graph | transitive worm injection |
| Tier-0 hash-verified path | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| `uv lock --check` CI gate | every PR | drift between pyproject and lockfile |
| `hermes_cli/security_advisories.py` | runtime | cleanup for users who already got hit |
The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.
# Validation
- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
(test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.
* chore: remove community announcement drafts (PR body covers it)
* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)
Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.
Moved out of core dependencies = []:
- anthropic (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client (image gen; only when picked)
- edge-tts (default TTS but still optional)
New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].
New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.
Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.
Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).
Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
|
||
|
|
cc38282b04 |
feat(cross-platform): psutil for PID/process management + Windows footgun checker
## Why
Hermes supports Linux, macOS, and native Windows, but the codebase grew up
POSIX-first and has accumulated patterns that silently break (or worse,
silently kill!) on Windows:
- `os.kill(pid, 0)` as a liveness probe — on Windows this maps to
CTRL_C_EVENT and broadcasts Ctrl+C to the target's entire console
process group (bpo-14484, open since 2012).
- `os.killpg` — doesn't exist on Windows at all (AttributeError).
- `os.setsid` / `os.getuid` / `os.geteuid` — same.
- `signal.SIGKILL` / `signal.SIGHUP` / `signal.SIGUSR1` — module-attr
errors at runtime on Windows.
- `open(path)` / `open(path, "r")` without explicit encoding= — inherits
the platform default, which is cp1252/mbcs on Windows (UTF-8 on POSIX),
causing mojibake round-tripping between hosts.
- `wmic` — removed from Windows 10 21H1+.
This commit does three things:
1. Makes `psutil` a core dependency and migrates critical callsites to it.
2. Adds a grep-based CI gate (`scripts/check-windows-footguns.py`) that
blocks new instances of any of the above patterns.
3. Fixes every existing instance in the codebase so the baseline is clean.
## What changed
### 1. psutil as a core dependency (pyproject.toml)
Added `psutil>=5.9.0,<8` to core deps. psutil is the canonical
cross-platform answer for "is this PID alive" and "kill this process
tree" — its `pid_exists()` uses `OpenProcess + GetExitCodeProcess` on
Windows (NOT a signal call), and its `Process.children(recursive=True)`
+ `.kill()` combo replaces `os.killpg()` portably.
### 2. `gateway/status.py::_pid_exists`
Rewrote to call `psutil.pid_exists()` first, falling back to the
hand-rolled ctypes `OpenProcess + WaitForSingleObject` dance on Windows
(and `os.kill(pid, 0)` on POSIX) only if psutil is somehow missing —
e.g. during the scaffold phase of a fresh install before pip finishes.
### 3. `os.killpg` migration to psutil (7 callsites, 5 files)
- `tools/code_execution_tool.py`
- `tools/process_registry.py`
- `tools/tts_tool.py`
- `tools/environments/local.py` (3 sites kept as-is, suppressed with
`# windows-footgun: ok` — the pgid semantics psutil can't replicate,
and the calls are already Windows-guarded at the outer branch)
- `gateway/platforms/whatsapp.py`
### 4. `scripts/check-windows-footguns.py` (NEW, 500 lines)
Grep-based checker with 11 rules covering every Windows cross-platform
footgun we've hit so far:
1. `os.kill(pid, 0)` — the silent killer
2. `os.setsid` without guard
3. `os.killpg` (recommends psutil)
4. `os.getuid` / `os.geteuid` / `os.getgid`
5. `os.fork`
6. `signal.SIGKILL`
7. `signal.SIGHUP/SIGUSR1/SIGUSR2/SIGALRM/SIGCHLD/SIGPIPE/SIGQUIT`
8. `subprocess` shebang script invocation
9. `wmic` without `shutil.which` guard
10. Hardcoded `~/Desktop` (OneDrive trap)
11. `asyncio.add_signal_handler` without try/except
12. `open()` without `encoding=` on text mode
Features:
- Triple-quoted-docstring aware (won't flag prose inside docstrings)
- Trailing-comment aware (won't flag mentions in `# os.kill(pid, 0)` comments)
- Guard-hint aware (skips lines with `hasattr(os, ...)`,
`shutil.which(...)`, `if platform.system() != 'Windows'`, etc.)
- Inline suppression with `# windows-footgun: ok — <reason>`
- `--list` to print all rules with fixes
- `--all` / `--diff <ref>` / staged-files (default) modes
- Scans 380 files in under 2 seconds
### 5. CI integration
A GitHub Actions workflow that runs the checker on every PR and push is
staged at `/tmp/hermes-stash/windows-footguns.yml` — not included in this
commit because the GH token on the push machine lacks `workflow` scope.
A maintainer with `workflow` permissions should add it as
`.github/workflows/windows-footguns.yml` in a follow-up. Content:
```yaml
name: Windows footgun check
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with: {python-version: "3.11"}
- run: python scripts/check-windows-footguns.py --all
```
### 6. CONTRIBUTING.md — "Cross-Platform Compatibility" expansion
Expanded from 5 to 16 rules, each with message, example, and fix.
Recommends psutil as the preferred API for PID / process-tree operations.
### 7. Baseline cleanup (91 → 0 findings)
- 14 `open()` sites → added `encoding='utf-8'` (internal logs/caches) or
`encoding='utf-8-sig'` (user-editable files that Notepad may BOM)
- 23 POSIX-only callsites in systemd helpers, pty_bridge, and plugin
tool subprocess management → annotated with
`# windows-footgun: ok — <reason>`
- 7 `os.killpg` sites → migrated to psutil (see §3 above)
## Verification
```
$ python scripts/check-windows-footguns.py --all
✓ No Windows footguns found (380 file(s) scanned).
$ python -c "from gateway.status import _pid_exists; import os
> print('self:', _pid_exists(os.getpid())); print('bogus:', _pid_exists(999999))"
self: True
bogus: False
```
Proof-of-repro that `os.kill(pid, 0)` was actually killing processes
before this fix — see commit `
|
||
|
|
3082fa0829 |
feat(hindsight): probe API for update_mode='append' support, dedupe across processes
Mirrors the pattern already shipping in hindsight-integrations/openclaw:
probe `<api_url>/version` once per process, gate on Hindsight ≥ 0.5.0.
When supported, retains use a stable session-scoped `document_id`
(`session_id`) plus `update_mode='append'` so cross-process retains for
the same session merge into one document instead of producing
N-different-process-stamped duplicates. When unsupported (or probe
fails), fall back to the existing per-process unique
`f"{session_id}-{start_ts}"` document_id with no `update_mode` — the
resume-overwrite fix (#6654) keeps working unchanged on legacy servers.
Closes the dedup half of #20115. The proposed `document_id_strategy`
config knob isn't needed: auto-detection via the same /version probe
the OpenClaw plugin already uses gives the same outcome with no extra
config burden, and the choice is purely a function of what the server
can do.
Plumbing
--------
- Module-level helpers (`_meets_minimum_version`, `_fetch_hindsight_api_version`,
`_check_api_supports_update_mode_append`) cache the result per api_url
so every provider in the process gets one /version round-trip.
- One-time WARN logged when the API is older than 0.5.0, telling the
user to upgrade for cross-session deduplication.
- New instance helper `_resolve_retain_target(fallback_doc_id)` returns
`(document_id, update_mode)` based on cached capability. Wired into
`sync_turn` and the `on_session_switch` flush path.
- For local_embedded mode, the probe URL is taken from the running
client (`client.url`) so we hit the actual daemon port rather than
the configured default.
- `update_mode` is set on the per-item dict; `aretain_batch` already
threads `item['update_mode']` into the API call.
Tests
-----
- `TestUpdateModeAppendCapability` (5 cases): legacy fallback, modern
stable+append, per-url cache, one-time warn, flush-on-switch resolves
against the OLD session.
- Existing `_make_hindsight_provider` factory in the manager-side test
file extended to seed `_mode`/`_api_url`/`_api_key`/`_client` and stub
`_resolve_retain_target` so the bypass-init pattern keeps working.
E2E verified against installed `~/.hermes/hermes-agent`:
- Legacy probe (unreachable host) → `legacy-session-<ts>` doc_id,
no `update_mode`.
- Modern probe (live local_embedded 0.5.6 daemon) → stable
`modern-session` doc_id + `update_mode='append'`.
- `test_hermes_embedded_smoke.py` passes (90s).
|
||
|
|
0a5ee01e48 |
fix(hindsight): route flush-on-switch through writer queue, not raw thread
Follow-up to the cherry-picked PR #17447. The original flush spawned a bare threading.Thread for the buffer-flush path, overwriting self._sync_thread — which is aliased to the long-lived writer thread. Two consequences: 1. No serialization with the writer queue. If old-session retains were still queued in _retain_queue, the flush ran concurrently with the writer and both threads could call aretain_batch against the same document_id. 2. The pre-spawn 'self._sync_thread.join(timeout=5.0)' tried to join the long-lived writer, which never exits, so the join was a no-op that just timed out — never actually serialized anything. Fix: enqueue the flush closure on _retain_queue via _ensure_writer + put(). Natural FIFO ordering behind any pending retains, no new thread, no broken join. Shutdown-aware so it doesn't enqueue after teardown. Tests updated to drain via _retain_queue.join() instead of the stale _sync_thread.join(). Added regression guard test_flush_serializes_behind_pending_retains_via_writer_queue that blocks the writer mid-retain to prove the flush waits in FIFO behind the old retain. Also seeds _retain_queue / _shutting_down / stubbed _ensure_writer on the bare-object test helper in test_memory_session_switch.py so that path doesn't blow up under the new queue-enqueue. tests/plugins/memory/test_hindsight_provider.py + tests/agent/test_memory_session_switch.py: 103/103 passing. |
||
|
|
c38dac742b |
fix(hindsight): flush buffered turns and drop stale prefetch on session switch
Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch introduced by #17409. 1. Buffered turns silently lost when retain_every_n_turns > 1. on_session_switch unconditionally cleared _session_turns without flushing. Users who batched every N>1 turns and switched mid-batch (/reset, /new, /resume, /branch, or context compression) had those buffered turns disappear. Same data-loss class as the shutdown race, different lifecycle event. Note commit_memory_session() -> on_session_end() runs *before* on_session_switch on /reset, but Hindsight doesn't implement on_session_end so the buffer survives that step and dies at clear time. /resume, /branch, and compression skip commit_memory_session entirely so an on_session_end impl wouldn't help them anyway. Fix: snapshot the old _session_id, _document_id, _parent_session_id, _turn_index, and _session_turns; spawn one final retain that lands under the OLD document_id; then rotate state. Metadata is built synchronously against the old self._* so session_id / lineage tags on the flushed item all reference the prior session consistently. 2. Stale _prefetch_result leaks across switch. If queue_prefetch ran in the old session and the result hadn't been consumed by prefetch() yet, on_session_switch left the cached recall text in place. The next session's first prefetch() call would return text mined from the prior session's bank/query. Fix: join any in-flight _prefetch_thread (3s bounded — matches shutdown()), then clear _prefetch_result under _prefetch_lock before rotating session_id. Tests ----- - tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush): - buffered turns flushed under OLD document_id with OLD lineage tags - empty buffer => no spurious retain - _prefetch_result cleared on switch - in-flight prefetch thread is awaited before clear (no race) - tests/agent/test_memory_session_switch.py: factory extended to seed the attrs the new flush path reads (_retain_source, _platform, _bank_id, prefetch state, etc.) and stub _run_hindsight_operation so existing switch-state assertions keep passing without network setup. |
||
|
|
0565497dcc |
fix(hindsight): drain retain queue cleanly on shutdown
The plugin used to spawn one daemon thread per sync_turn() to do the
aretain_batch network write. On CLI exit, that pattern raced interpreter
shutdown — the last retain could reach aiohttp after asyncio's
"cannot schedule new futures" guard had fired, producing noisy logs and
silently losing the final unsaved turn:
WARNING ... Hindsight sync failed: cannot schedule new futures after
interpreter shutdown
ERROR asyncio: Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x...>
Switch to a single-writer model: each provider owns one long-lived
writer thread plus a queue. sync_turn() snapshots state and enqueues a
job; the writer drains sequentially. Once shutdown() is called:
- new sync_turn() / queue_prefetch() calls are dropped, not enqueued
- a sentinel wakes the writer so it finishes in-flight work
- shutdown joins the writer (10s) before nulling the client
Also register an idempotent atexit hook from the first sync_turn(), so
exit paths that don't go through MemoryManager.shutdown_all() (Ctrl-C,
abrupt exit) still get a chance to drain.
Tests: keep _sync_thread as a legacy alias to the writer, swap join()
calls to _retain_queue.join() (canonical wait-for-drain), add a new
TestShutdownRace suite covering single-writer reuse, post-shutdown drop,
queue draining, and shutdown idempotency.
|
||
|
|
13683c0842
|
feat(memory): notify providers on mid-process session_id rotation (#17409)
Fixes #6672 Memory providers now receive on_session_switch() whenever AIAgent.session_id rotates mid-process — /resume, /branch, /reset, /new, and context compression. Before this, providers that cached per-session state in initialize() (Hindsight's _session_id, _document_id, accumulated _session_turns, _turn_counter) kept writing into the old session's record after the agent had moved on. MemoryProvider ABC ------------------ - New optional hook on_session_switch(new_session_id, *, parent_session_id='', reset=False, **kwargs) with no-op default for backward compat. reset=True signals /reset or /new — providers should flush accumulated per-session buffers. reset=False for /resume, /branch, compression where the logical conversation continues. MemoryManager ------------- - on_session_switch() fans the hook out to every registered provider. Isolated try/except per provider — one bad provider can't block others. - Empty/None new_session_id is a no-op to avoid corrupting provider state during shutdown paths. run_agent.py ------------ - _sync_external_memory_for_turn now passes session_id=self.session_id into sync_all() and queue_prefetch_all(). Providers with defensive session_id updates in sync_turn (Hindsight already had this at plugins/memory/hindsight/__init__.py:1199) now actually receive the current id. - Compression block at ~L8884 already notified the context engine of the rollover; now also calls _memory_manager.on_session_switch(reason='compression'). cli.py ------ - new_session() fires reset=True, reason='new_session' so providers flush buffers. - _handle_resume_command fires reset=False, reason='resume' with the previous session as parent_session_id. - _handle_branch_command fires reset=False, reason='branch' with the parent session_id already captured for the DB parent link. gateway/run.py -------------- - _handle_resume_command now evicts the cached AIAgent, mirroring /branch and /reset. The next message rebuilds a fresh agent whose memory provider initialize() runs with the correct session_id — matches the pattern the gateway already uses for provider state cross-session transitions. Hindsight reference implementation ---------------------------------- - plugins/memory/hindsight/__init__.py adds on_session_switch that: updates _session_id, mints a fresh _document_id (prevents vectorize-io/hindsight#1303 overwrite), and clears _session_turns / _turn_counter / _turn_index so in-flight batches don't flush under the new document id. parent_session_id only overwritten when provided (avoids clobbering on a bare switch). Tests ----- - tests/agent/test_memory_session_switch.py: new dedicated file. ABC default no-op, manager fan-out, failure isolation, empty-id no-op, session_id propagation through sync_all/queue_prefetch_all, Hindsight state transitions for every reset/non-reset case, parent preservation. - tests/cli/test_branch_command.py: new test verifying /branch fires the hook with correct parent_session_id + reset=False + reason. - tests/gateway/test_resume_command.py: new test verifying /resume evicts the cached agent. - tests/run_agent/test_memory_sync_interrupted.py: updated existing assertions to account for the session_id kwarg on sync_all and queue_prefetch_all. E2E verified (real imports, tmp HERMES_HOME): - /resume: session_id updates, doc_id fresh, buffers cleared, parent set - /branch: session_id forks, parent links to original - /new: reset=True clears accumulated state - compression: reason='compression' propagated, lineage preserved - Empty id: no-op, state preserved - Legacy provider without on_session_switch: no crash Reported by @nicoloboschi (Hindsight maintainer); related scope-widening comment by @kidonng extending coverage to compression. |
||
|
|
059980727a
|
refactor(config): migrate remaining 33 cfg_get call sites (#17311)
Completes the cfg_get migration started in PR #17304. Covers the remaining hermes_cli/ and plugins/ config-access sites that the first PR intentionally left opportunistic. Migrated (33 sites across 14 files): hermes_cli/setup.py 13 sites (terminal.*, agent.*, display.*, compression.*, tts.*) hermes_cli/tools_config.py 7 sites (tts.*, browser.*, web.*, platform_toolsets.*) hermes_cli/plugins_cmd.py 3 sites (plugins.*, memory.*, context.*) plugins/memory/honcho/cli.py 3 sites (hosts.*) hermes_cli/web_server.py 1 site (dashboard.*) hermes_cli/skills_config.py 1 site (platform_disabled) hermes_cli/plugins.py 1 site (plugins.disabled) hermes_cli/status.py 1 site (terminal.backend) hermes_cli/mcp_config.py 1 site (mcp_servers.*) hermes_cli/webhook.py 1 site (platforms.webhook) plugins/memory/__init__.py 1 site (memory.provider) plugins/memory/hindsight/ 1 site (banks.hermes) plugins/memory/holographic/ 1 site (plugins.hermes-memory-store) run_agent.py 1 site (auxiliary.compression) The helper supports non-literal keys too, so e.g. cfg.get('hosts', {}).get(HOST, {}) becomes cfg_get(cfg, 'hosts', HOST, default={}) Migration bugs caught and fixed during this PR: 1. An AST-based batch rewrite naïvely captured the first word token in a chain, which corrupted 'self._config.get(...).get(...)' into 'self.cfg_get(_config, ...)' (dropping 'self.', creating a broken method call). Plugins/memory/hindsight caught it via its test suite. Fixed manually to 'cfg_get(self._config, ...)'. 2. Import-extension heuristic rewrote multi-line parenthesized imports ('from X import (\n A,\n B,\n)') as 'from X import cfg_get, (' — syntactically broken. Fixed by inserting cfg_get as the first name inside the parentheses. Combined with PR #17304, the cfg_get migration now covers: PR #17304 (first batch): 20 sites in tools/ + gateway/ PR #17317 (this one): 33 sites in hermes_cli/ + plugins/ + run_agent.py Total: 53 sites migrated. Remaining ~8 sites are either: - Function-call chains (e.g. '_load_stt_config().get(...).get(...)') that would need double-evaluation or a local binding to migrate cleanly — intentionally deferred. - JSON response-navigation (e.g. 'response_data.get('data',{}).get('web')) which is unrelated to config access and shouldn't use cfg_get. Verified: - 412/412 tests/plugins/ pass (including the hindsight test that caught the self.X regex bug before commit) - 3181/3189 tests/hermes_cli/ pass (8 pre-existing failures on main, verified by git-stash comparison) - Live 'hermes status' and 'hermes config' render correctly (exercise the migrated terminal.backend, tts.provider, browser.cloud_provider, compression.threshold, display.tool_progress sites) - Live 'hermes chat': 1 turn + /quit, zero errors in 11-line log window No semantic changes — cfg_get was already proven to be a 1:1 match for the original .get("X",{}).get("Y",default) pattern in PR #17304. |
||
|
|
64a497bfa9 | fix(hindsight): preserve setup config on blank input | ||
|
|
0ba6471dd1 | fix: recover hindsight embedded daemon after idle shutdown | ||
|
|
18beb69b49 |
fix(memory): close embedded Hindsight async client cleanly
HindsightEmbedded.close() delegates to its sync client.close(). When Hermes created/used that client on the shared async loop, closing it from the main thread raises 'attached to a different loop' before aiohttp releases the session — so the ClientSession / TCPConnector leak past provider teardown. Close the embedded inner async client on the shared loop first via _run_sync(inner_client.aclose()), then let the wrapper's sync close() do its daemon/UI bookkeeping. Salvage of #14605: test placement rebased — appended TestShutdown class after TestSharedEventLoopLifecycle (which landed on main after the PR was written). Original author attribution preserved. |
||
|
|
edff2fbe7e |
feat(hindsight): optional bank_id_template for per-agent / per-user banks
Adds an optional bank_id_template config that derives the bank name at
initialize() time from runtime context. Existing users with a static
bank_id keep the current behavior (template is empty by default).
Supported placeholders:
{profile} — active Hermes profile (agent_identity kwarg)
{workspace} — Hermes workspace (agent_workspace kwarg)
{platform} — cli, telegram, discord, etc.
{user} — platform user id (gateway sessions)
{session} — session id
Unsafe characters in placeholder values are sanitized, and empty
placeholders collapse cleanly (e.g. "hermes-{user}" with no user
becomes "hermes"). If the template renders empty, the static bank_id
is used as a fallback.
Common uses:
bank_id_template: hermes-{profile} # isolate per Hermes profile
bank_id_template: {workspace}-{profile} # workspace + profile scoping
bank_id_template: hermes-{user} # per-user banks for gateway
|
||
|
|
f9c6c5ab84 |
fix(hindsight): scope document_id per process to avoid resume overwrite (#6602)
Reusing session_id as document_id caused data loss on /resume: when
the session is loaded again, _session_turns starts empty and the next
retain replaces the entire previously stored content.
Now each process lifecycle gets its own document_id formed as
{session_id}-{startup_timestamp}, so:
- Same session, same process: turns accumulate into one document (existing behavior)
- Resume (new process, same session): writes a new document, old one preserved
- Forks: child process gets its own document; parent's doc is untouched
Also adds session lineage tags so all processes for the same session
(or its parent) can still be filtered together via recall:
- session:<session_id> on every retain
- parent:<parent_session_id> when initialized with parent_session_id
Closes #6602
|
||
|
|
f1ba2f0c0b |
fix(hindsight): use configured timeout in _run_sync for all async operations
The previous commit added HINDSIGHT_TIMEOUT as a configurable env var, but _run_sync still used the hardcoded _DEFAULT_TIMEOUT (120s). All async operations (recall, retain, reflect, aclose) now go through an instance method that uses self._timeout, so the configured value is actually applied. Also: added backward-compatible alias comment for the module-level function. |
||
|
|
403c82b6b6 |
feat(hindsight): add configurable HINDSIGHT_TIMEOUT env var
The Hindsight Cloud API can take 30-40 seconds per request. The hardcoded 30s timeout was too aggressive and caused frequent timeout errors. This patch: 1. Adds HINDSIGHT_TIMEOUT environment variable (default: 120s) 2. Adds timeout to the config schema for setup wizard visibility 3. Uses the configurable timeout in both _run_sync() and client creation 4. Reads from config.json or env var, falling back to 120s default This makes the timeout upgrade-proof — users can set it via env var or config without patching source code. Signed-off-by: Kumar <kumar@tekgnosis.net> |
||
|
|
93a74f74bf |
fix(hindsight): preserve shared event loop across provider shutdowns
The module-global `_loop` / `_loop_thread` pair is shared across every `HindsightMemoryProvider` instance in the process — the plugin loader creates one provider per `AIAgent`, and the gateway creates one `AIAgent` per concurrent chat session (Telegram/Discord/Slack/CLI). `HindsightMemoryProvider.shutdown()` stopped the shared loop when any one session ended. That stranded the aiohttp `ClientSession` and `TCPConnector` owned by every sibling provider on a now-dead loop — they were never reachable for close and surfaced as the `Unclosed client session` / `Unclosed connector` warnings reported in #11923. Fix: stop stopping the shared loop in `shutdown()`. Per-provider cleanup still closes that provider's own client via `self._client.aclose()`. The loop runs on a daemon thread and is reclaimed on process exit; keeping it alive between provider shutdowns means sibling providers can drain their own sessions cleanly. Regression tests in `tests/plugins/memory/test_hindsight_provider.py` (`TestSharedEventLoopLifecycle`): - `test_shutdown_does_not_stop_shared_event_loop` — two providers share the loop; shutting down one leaves the loop live for the other. This test reproduces the #11923 leak on `main` and passes with the fix. - `test_client_aclose_called_on_cloud_mode_shutdown` — each provider's own aiohttp session is still closed via `aclose()`. Fixes #11923. |
||
|
|
df55660e3c | fix(hindsight): disable broken local runtime on unsupported CPUs | ||
|
|
3e994e38f7 | [verified] fix: materialize hindsight profile env during setup | ||
|
|
127048e643 | fix(hindsight): accept snake_case api_key config | ||
|
|
d6b65bbc47 | fix(hindsight): preserve non-ASCII text in retained conversation turns | ||
|
|
a5c7422f23 |
fix(hindsight): always write HINDSIGHT_LLM_API_KEY to .env, even when empty
When user runs ✓ Memory provider: built-in only Saved to config.yaml and leaves the API key blank, the old code skipped writing it entirely. This caused the uvx daemon launcher to fail at startup because it couldn't distinguish between "key not configured" and "explicitly blank key." Now HINDSIGHT_LLM_API_KEY is always written to .env so the value is either set or explicitly empty. |
||
|
|
b66644f0ec |
feat(hindsight): richer session-scoped retain metadata
- Add configurable retain_tags / retain_source / retain_user_prefix / retain_assistant_prefix knobs for native Hindsight. - Thread gateway session identity (user_name, chat_id, chat_name, chat_type, thread_id) through AIAgent and MemoryManager into MemoryProvider.initialize kwargs so providers can scope and tag retained memories. - Hindsight attaches the new identity fields as retain metadata, merges per-call tool tags with configured default tags, and uses the configurable transcript labels for auto-retained turns. Co-authored-by: Abner <abner.the.foreman@agentmail.to> |
||
|
|
25757d631b |
feat(hindsight): feature parity, setup wizard, and config improvements
Port missing features from the hindsight-hermes external integration package into the native plugin. Only touches plugin files — no core changes. Features: - Tags on retain/recall (tags, recall_tags, recall_tags_match) - Recall config (recall_max_tokens, recall_max_input_chars, recall_types, recall_prompt_preamble) - Retain controls (retain_every_n_turns, auto_retain, auto_recall, retain_async via aretain_batch, retain_context) - Bank config via Banks API (bank_mission, bank_retain_mission) - Structured JSON retain with per-message timestamps - Full session accumulation with document_id for dedup - Custom post_setup() wizard with curses picker - Mode-aware dep install (hindsight-client for cloud, hindsight-all for local) - local_external mode and openai_compatible LLM provider - OpenRouter support with auto base URL - Auto-upgrade of hindsight-client to >=0.4.22 on session start - Comprehensive debug logging across all operations - 46 unit tests - Updated README and website docs |
||
|
|
a1213d06bd
|
fix(hindsight): correct config key mismatch and add base URL support (#6282)
Fixes #6259. Three bugs fixed: 1. Config key mismatch: _get_client() and _start_daemon() read 'llmApiKey' (camelCase) but save_config() stores 'llm_api_key' (snake_case). The config value was never read — only the env var fallback worked. 2. Missing base URL support: users on OpenRouter or custom endpoints had no way to configure HINDSIGHT_API_LLM_BASE_URL through setup. Added llm_base_url to config schema with empty default, passed conditionally to HindsightEmbedded constructor. 3. Daemon config change detection: config_changed now also checks HINDSIGHT_API_LLM_BASE_URL, and the daemon profile .env includes the base URL when set. Keeps HINDSIGHT_API_LLM_API_KEY (with double API) in the daemon profile .env — this matches the upstream hindsight .env.example convention. |
||
|
|
55ac056920 |
fix(hindsight): add missing get_hermes_home import
Import hermes_constants.get_hermes_home at module level so it is available in _start_daemon() when local mode starts the embedded daemon. Previously the import was only inside _load_config(), causing NameError when _start_daemon() referenced get_hermes_home(). Fixes #5993 Co-Authored-By: 史官 <historian@slock.team> |
||
|
|
678a87c477
|
refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites
Add three reusable helpers to eliminate pervasive boilerplate:
tools/registry.py — tool_error() and tool_result():
Every tool handler returns JSON strings. The pattern
json.dumps({"error": msg}, ensure_ascii=False) appeared 106 times,
and json.dumps({"success": False, "error": msg}, ...) another 23.
Now: tool_error(msg) or tool_error(msg, success=False).
tool_result() handles arbitrary result dicts:
tool_result(success=True, data=payload) or tool_result(some_dict).
hermes_cli/config.py — read_raw_config():
Lightweight YAML reader that returns the raw config dict without
load_config()'s deep-merge + migration overhead. Available for
callsites that just need a single config value.
Migration (129 callsites across 32 files):
- tools/: browser_camofox (18), file_tools (10), homeassistant (8),
web_tools (7), skill_manager (7), cronjob (11), code_execution (4),
delegate (5), send_message (4), tts (4), memory (7), session_search (3),
mcp (2), clarify (2), skills_tool (3), todo (1), vision (1),
browser (1), process_registry (2), image_gen (1)
- plugins/memory/: honcho (9), supermemory (9), hindsight (8),
holographic (7), openviking (7), mem0 (7), byterover (6), retaindb (2)
- agent/: memory_manager (2), builtin_memory_provider (1)
|
||
|
|
187e90e425
|
refactor: replace inline HERMES_HOME re-implementations with get_hermes_home()
16 callsites across 14 files were re-deriving the hermes home path
via os.environ.get('HERMES_HOME', ...) instead of using the canonical
get_hermes_home() from hermes_constants. This breaks profiles — each
profile has its own HERMES_HOME, and the inline fallback defaults to
~/.hermes regardless.
Fixed by importing and calling get_hermes_home() at each site. For
files already inside the hermes process (agent/, hermes_cli/, tools/,
gateway/, plugins/), this is always safe. Files that run outside the
process context (mcp_serve.py, mcp_oauth.py) already had correct
try/except ImportError fallbacks and were left alone.
Skipped: hermes_constants.py (IS the implementation), env_loader.py
(bootstrap), profiles.py (intentionally manipulates the env var),
standalone scripts (optional-skills/, skills/), and tests.
|
||
|
|
28e1e210ee |
fix(hindsight): overhaul hindsight memory plugin and memory setup wizard
- Dedicated asyncio event loop for Hindsight async calls (fixes aiohttp session leaks) - Client caching (reuse instead of creating per-call) - Local mode daemon management with config change detection and auto-restart - Memory mode support (hybrid/context/tools) and prefetch method (recall/reflect) - Proper shutdown with event loop and client cleanup - Disable HindsightEmbedded.__del__ to avoid GC loop errors - Update API URLs (app -> ui.hindsight.vectorize.io, api_url -> base_url) - Setup wizard: conditional fields (when clause), dynamic defaults (default_from) - Switch dependency install from pip to uv (correct for uv-based venvs) - Add hindsight-all to plugin.yaml and import mapping - 12 new tests for dispatch routing and setup field filtering Original PR #5044 by cdbartholomew. |
||
|
|
924bc67eee
|
feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623)
* feat(memory): add pluggable memory provider interface with profile isolation Introduces a pluggable MemoryProvider ABC so external memory backends can integrate with Hermes without modifying core files. Each backend becomes a plugin implementing a standard interface, orchestrated by MemoryManager. Key architecture: - agent/memory_provider.py — ABC with core + optional lifecycle hooks - agent/memory_manager.py — single integration point in the agent loop - agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md Profile isolation fixes applied to all 6 shipped plugins: - Cognitive Memory: use get_hermes_home() instead of raw env var - Hindsight Memory: check $HERMES_HOME/hindsight/config.json first, fall back to legacy ~/.hindsight/ for backward compat - Hermes Memory Store: replace hardcoded ~/.hermes paths with get_hermes_home() for config loading and DB path defaults - Mem0 Memory: use get_hermes_home() instead of raw env var - RetainDB Memory: auto-derive profile-scoped project name from hermes_home path (hermes-<profile>), explicit env var overrides - OpenViking Memory: read-only, no local state, isolation via .env MemoryManager.initialize_all() now injects hermes_home into kwargs so every provider can resolve profile-scoped storage without importing get_hermes_home() themselves. Plugin system: adds register_memory_provider() to PluginContext and get_plugin_memory_providers() accessor. Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration). * refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider Remove cognitive-memory plugin (#727) — core mechanics are broken: decay runs 24x too fast (hourly not daily), prefetch uses row ID as timestamp, search limited by importance not similarity. Rewrite openviking-memory plugin from a read-only search wrapper into a full bidirectional memory provider using the complete OpenViking session lifecycle API: - sync_turn: records user/assistant messages to OpenViking session (threaded, non-blocking) - on_session_end: commits session to trigger automatic memory extraction into 6 categories (profile, preferences, entities, events, cases, patterns) - prefetch: background semantic search via find() endpoint - on_memory_write: mirrors built-in memory writes to the session - is_available: checks env var only, no network calls (ABC compliance) Tools expanded from 3 to 5: - viking_search: semantic search with mode/scope/limit - viking_read: tiered content (abstract ~100tok / overview ~2k / full) - viking_browse: filesystem-style navigation (list/tree/stat) - viking_remember: explicit memory storage via session - viking_add_resource: ingest URLs/docs into knowledge base Uses direct HTTP via httpx (no openviking SDK dependency needed). Response truncation on viking_read to prevent context flooding. * fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker - Remove redundant mem0_context tool (identical to mem0_search with rerank=true, top_k=5 — wastes a tool slot and confuses the model) - Thread sync_turn so it's non-blocking — Mem0's server-side LLM extraction can take 5-10s, was stalling the agent after every turn - Add threading.Lock around _get_client() for thread-safe lazy init (prefetch and sync threads could race on first client creation) - Add circuit breaker: after 5 consecutive API failures, pause calls for 120s instead of hammering a down server every turn. Auto-resets after cooldown. Logs a warning when tripped. - Track success/failure in prefetch, sync_turn, and all tool calls - Wait for previous sync to finish before starting a new one (prevents unbounded thread accumulation on rapid turns) - Clean up shutdown to join both prefetch and sync threads * fix(memory): enforce single external memory provider limit MemoryManager now rejects a second non-builtin provider with a warning. Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE external plugin provider is allowed at a time. This prevents tool schema bloat (some providers add 3-5 tools each) and conflicting memory backends. The warning message directs users to configure memory.provider in config.yaml to select which provider to activate. Updated all 47 tests to use builtin + one external pattern instead of multiple externals. Added test_second_external_rejected to verify the enforcement. * feat(memory): add ByteRover memory provider plugin Implements the ByteRover integration (from PR #3499 by hieuntg81) as a MemoryProvider plugin instead of direct run_agent.py modifications. ByteRover provides persistent memory via the brv CLI — a hierarchical knowledge tree with tiered retrieval (fuzzy text then LLM-driven search). Local-first with optional cloud sync. Plugin capabilities: - prefetch: background brv query for relevant context - sync_turn: curate conversation turns (threaded, non-blocking) - on_memory_write: mirror built-in memory writes to brv - on_pre_compress: extract insights before context compression Tools (3): - brv_query: search the knowledge tree - brv_curate: store facts/decisions/patterns - brv_status: check CLI version and context tree state Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped per profile). Binary resolution cached with thread-safe double-checked locking. All write operations threaded to avoid blocking the agent (curate can take 120s with LLM processing). * fix(memory): thread remaining sync_turns, fix holographic, add config key Plugin fixes: - Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread) - RetainDB: thread sync_turn (was blocking on HTTP POST) - Both: shutdown now joins sync threads alongside prefetch threads Holographic retrieval fixes: - reason(): removed dead intersection_key computation (bundled but never used in scoring). Now reuses pre-computed entity_residuals directly, moved role_content encoding outside the inner loop. - contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above 500 facts, only checks the most recently updated ones to avoid O(n^2) explosion (~125K comparisons at 500 is acceptable). Config: - Added memory.provider key to DEFAULT_CONFIG ("" = builtin only). No version bump needed (deep_merge handles new keys automatically). * feat(memory): extract Honcho as a MemoryProvider plugin Creates plugins/honcho-memory/ as a thin adapter over the existing honcho_integration/ package. All 4 Honcho tools (profile, search, context, conclude) move from the normal tool registry to the MemoryProvider interface. The plugin delegates all work to HonchoSessionManager — no Honcho logic is reimplemented. It uses the existing config chain: $HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars. Lifecycle hooks: - initialize: creates HonchoSessionManager via existing client factory - prefetch: background dialectic query - sync_turn: records messages + flushes to API (threaded) - on_memory_write: mirrors user profile writes as conclusions - on_session_end: flushes all pending messages This is a prerequisite for the MemoryManager wiring in run_agent.py. Once wired, Honcho goes through the same provider interface as all other memory plugins, and the scattered Honcho code in run_agent.py can be consolidated into the single MemoryManager integration point. * feat(memory): wire MemoryManager into run_agent.py Adds 8 integration points for the external memory provider plugin, all purely additive (zero existing code modified): 1. Init (~L1130): Create MemoryManager, find matching plugin provider from memory.provider config, initialize with session context 2. Tool injection (~L1160): Append provider tool schemas to self.tools and self.valid_tool_names after memory_manager init 3. System prompt (~L2705): Add external provider's system_prompt_block alongside existing MEMORY.md/USER.md blocks 4. Tool routing (~L5362): Route provider tool calls through memory_manager.handle_tool_call() before the catchall handler 5. Memory write bridge (~L5353): Notify external provider via on_memory_write() when the built-in memory tool writes 6. Pre-compress (~L5233): Call on_pre_compress() before context compression discards messages 7. Prefetch (~L6421): Inject provider prefetch results into the current-turn user message (same pattern as Honcho turn context) 8. Turn sync + session end (~L8161, ~L8172): sync_all() after each completed turn, queue_prefetch_all() for next turn, on_session_end() + shutdown_all() at conversation end All hooks are wrapped in try/except — a failing provider never breaks the agent. The existing memory system, Honcho integration, and all other code paths are completely untouched. Full suite: 7222 passed, 4 pre-existing failures. * refactor(memory): remove legacy Honcho integration from core Extracts all Honcho-specific code from run_agent.py, model_tools.py, toolsets.py, and gateway/run.py. Honcho is now exclusively available as a memory provider plugin (plugins/honcho-memory/). Removed from run_agent.py (-457 lines): - Honcho init block (session manager creation, activation, config) - 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools, _activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch, _honcho_prefetch, _honcho_save_user_observation, _honcho_sync - _inject_honcho_turn_context module-level function - Honcho system prompt block (tool descriptions, CLI commands) - Honcho context injection in api_messages building - Honcho params from __init__ (honcho_session_key, honcho_manager, honcho_config) - HONCHO_TOOL_NAMES constant - All honcho-specific tool dispatch forwarding Removed from other files: - model_tools.py: honcho_tools import, honcho params from handle_function_call - toolsets.py: honcho toolset definition, honcho tools from core tools list - gateway/run.py: honcho params from AIAgent constructor calls Removed tests (-339 lines): - 9 Honcho-specific test methods from test_run_agent.py - TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that were accidentally removed during the honcho function extraction. The honcho_integration/ package is kept intact — the plugin delegates to it. tools/honcho_tools.py registry entries are now dead code (import commented out in model_tools.py) but the file is preserved for reference. Full suite: 7207 passed, 4 pre-existing failures. Zero regressions. * refactor(memory): restructure plugins, add CLI, clean gateway, migration notice Plugin restructure: - Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/ (byterover, hindsight, holographic, honcho, mem0, openviking, retaindb) - New plugins/memory/__init__.py discovery module that scans the directory directly, loading providers by name without the general plugin system - run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers() CLI wiring: - hermes memory setup — interactive curses picker + config wizard - hermes memory status — show active provider, config, availability - hermes memory off — disable external provider (built-in only) - hermes honcho — now shows migration notice pointing to hermes memory setup Gateway cleanup: - Remove _get_or_create_gateway_honcho (already removed in prev commit) - Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods - Remove all calls to shutdown methods (4 call sites) - Remove _honcho_managers/_honcho_configs dict references Dead code removal: - Delete tools/honcho_tools.py (279 lines, import was already commented out) - Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods) - Remove if False placeholder from run_agent.py Migration: - Honcho migration notice on startup: detects existing honcho.json or ~/.honcho/config.json, prints guidance to run hermes memory setup. Only fires when memory.provider is not set and not in quiet mode. Full suite: 7203 passed, 4 pre-existing failures. Zero regressions. * feat(memory): standardize plugin config + add per-plugin documentation Config architecture: - Add save_config(values, hermes_home) to MemoryProvider ABC - Honcho: writes to $HERMES_HOME/honcho.json (SDK native) - Mem0: writes to $HERMES_HOME/mem0.json - Hindsight: writes to $HERMES_HOME/hindsight/config.json - Holographic: writes to config.yaml under plugins.hermes-memory-store - OpenViking/RetainDB/ByteRover: env-var only (default no-op) Setup wizard (hermes memory setup): - Now calls provider.save_config() for non-secret config - Secrets still go to .env via env vars - Only memory.provider activation key goes to config.yaml Documentation: - README.md for each of the 7 providers in plugins/memory/<name>/ - Requirements, setup (wizard + manual), config reference, tools table - Consistent format across all providers The contract for new memory plugins: - get_config_schema() declares all fields (REQUIRED) - save_config() writes native config (REQUIRED if not env-var-only) - Secrets use env_var field in schema, written to .env by wizard - README.md in the plugin directory * docs: add memory providers user guide + developer guide New pages: - user-guide/features/memory-providers.md — comprehensive guide covering all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover). Each with setup, config, tools, cost, and unique features. Includes comparison table and profile isolation notes. - developer-guide/memory-provider-plugin.md — how to build a new memory provider plugin. Covers ABC, required methods, config schema, save_config, threading contract, profile isolation, testing. Updated pages: - user-guide/features/memory.md — replaced Honcho section with link to new Memory Providers page - user-guide/features/honcho.md — replaced with migration redirect to the new Memory Providers page - sidebars.ts — added both new pages to navigation * fix(memory): auto-migrate Honcho users to memory provider plugin When honcho.json or ~/.honcho/config.json exists but memory.provider is not set, automatically set memory.provider: honcho in config.yaml and activate the plugin. The plugin reads the same config files, so all data and credentials are preserved. Zero user action needed. Persists the migration to config.yaml so it only fires once. Prints a one-line confirmation in non-quiet mode. * fix(memory): only auto-migrate Honcho when enabled + credentialed Check HonchoClientConfig.enabled AND (api_key OR base_url) before auto-migrating — not just file existence. Prevents false activation for users who disabled Honcho, stopped using it (config lingers), or have ~/.honcho/ from a different tool. * feat(memory): auto-install pip dependencies during hermes memory setup Reads pip_dependencies from plugin.yaml, checks which are missing, installs them via pip before config walkthrough. Also shows install guidance for external_dependencies (e.g. brv CLI for ByteRover). Updated all 7 plugin.yaml files with pip_dependencies: - honcho: honcho-ai - mem0: mem0ai - openviking: httpx - hindsight: hindsight-client - holographic: (none) - retaindb: requests - byterover: (external_dependencies for brv CLI) * fix: remove remaining Honcho crash risks from cli.py and gateway cli.py: removed Honcho session re-mapping block (would crash importing deleted tools/honcho_tools.py), Honcho flush on compress, Honcho session display on startup, Honcho shutdown on exit, honcho_session_key AIAgent param. gateway/run.py: removed honcho_session_key params from helper methods, sync_honcho param, _honcho.shutdown() block. tests: fixed test_cron_session_with_honcho_key_skipped (was passing removed honcho_key param to _flush_memories_for_session). * fix: include plugins/ in pyproject.toml package list Without this, plugins/memory/ wouldn't be included in non-editable installs. Hermes always runs from the repo checkout so this is belt- and-suspenders, but prevents breakage if the install method changes. * fix(memory): correct pip-to-import name mapping for dep checks The heuristic dep.replace('-', '_') fails for packages where the pip name differs from the import name: honcho-ai→honcho, mem0ai→mem0, hindsight-client→hindsight_client. Added explicit mapping table so hermes memory setup doesn't try to reinstall already-installed packages. * chore: remove dead code from old plugin memory registration path - hermes_cli/plugins.py: removed register_memory_provider(), _memory_providers list, get_plugin_memory_providers() — memory providers now use plugins/memory/ discovery, not the general plugin system - hermes_cli/main.py: stripped 74 lines of dead honcho argparse subparsers (setup, status, sessions, map, peer, mode, tokens, identity, migrate) — kept only the migration redirect - agent/memory_provider.py: updated docstring to reflect new registration path - tests: replaced TestPluginMemoryProviderRegistration with TestPluginMemoryDiscovery that tests the actual plugins/memory/ discovery system. Added 3 new tests (discover, load, nonexistent). * chore: delete dead honcho_integration/cli.py and its tests cli.py (794 lines) was the old 'hermes honcho' command handler — nobody calls it since cmd_honcho was replaced with a migration redirect. Deleted tests that imported from removed code: - tests/honcho_integration/test_cli.py (tested _resolve_api_key) - tests/honcho_integration/test_config_isolation.py (tested CLI config paths) - tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py) Remaining honcho_integration/ files (actively used by the plugin): - client.py (445 lines) — config loading, SDK client creation - session.py (991 lines) — session management, queries, flush * refactor: move honcho_integration/ into the honcho plugin Moves client.py (445 lines) and session.py (991 lines) from the top-level honcho_integration/ package into plugins/memory/honcho/. No Honcho code remains in the main codebase. - plugins/memory/honcho/client.py — config loading, SDK client creation - plugins/memory/honcho/session.py — session management, queries, flush - Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py, plugin __init__.py, session.py cross-import, all tests - Removed honcho_integration/ package and pyproject.toml entry - Renamed tests/honcho_integration/ → tests/honcho_plugin/ * docs: update architecture + gateway-internals for memory provider system - architecture.md: replaced honcho_integration/ with plugins/memory/ - gateway-internals.md: replaced Honcho-specific session routing and flush lifecycle docs with generic memory provider interface docs * fix: update stale mock path for resolve_active_host after honcho plugin migration * fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore Review feedback from Honcho devs (erosika): P0 — Provider lifecycle: - Remove on_session_end() + shutdown_all() from run_conversation() tail (was killing providers after every turn in multi-turn sessions) - Add shutdown_memory_provider() method on AIAgent for callers - Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry Bug fixes: - Remove sync_honcho=False kwarg from /btw callsites (TypeError crash) - Fix doctor.py references to dead 'hermes honcho setup' command - Cache prefetch_all() before tool loop (was re-calling every iteration) ABC contract hardening (all backwards-compatible): - Add session_id kwarg to prefetch/sync_turn/queue_prefetch - Make on_pre_compress() return str (provider insights in compression) - Add **kwargs to on_turn_start() for runtime context - Add on_delegation() hook for parent-side subagent observation - Document agent_context/agent_identity/agent_workspace kwargs on initialize() (prevents cron corruption, enables profile scoping) - Fix docstring: single external provider, not multiple Honcho CLI restoration: - Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py with imports adapted to plugin path) - Restore full hermes honcho command with all subcommands (status, peer, mode, tokens, identity, enable/disable, sync, peers, --target-profile) - Restore auto-clone on profile creation + sync on hermes update - hermes honcho setup now redirects to hermes memory setup * fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type - Wire on_delegation() in delegate_tool.py — parent's memory provider is notified with task+result after each subagent completes - Add skip_memory=True to cron scheduler (prevents cron system prompts from corrupting user representations — closes #4052) - Add skip_memory=True to gateway flush agent (throwaway agent shouldn't activate memory provider) - Fix ByteRover on_pre_compress() return type: None -> str * fix(honcho): port profile isolation fixes from PR #4632 Ports 5 bug fixes found during profile testing (erosika's PR #4632): 1. 3-tier config resolution — resolve_config_path() now checks $HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json (non-default profiles couldn't find shared host blocks) 2. Thread host=_host_key() through from_global_config() in cmd_setup, cmd_status, cmd_identity (--target-profile was being ignored) 3. Use bare profile name as aiPeer (not host key with dots) — Honcho's peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid 4. Wrap add_peers() in try/except — was fatal on new AI peers, killed all message uploads for the session 5. Gate Honcho clone behind --clone/--clone-all on profile create (bare create should be blank-slate) Also: sanitize assistant_peer_id via _sanitize_id() * fix(tests): add module cleanup fixture to test_cli_provider_resolution test_cli_provider_resolution._import_cli() wipes tools.*, cli, and run_agent from sys.modules to force fresh imports, but had no cleanup. This poisoned all subsequent tests on the same xdist worker — mocks targeting tools.file_tools, tools.send_message_tool, etc. patched the NEW module object while already-imported functions still referenced the OLD one. Caused ~25 cascade failures: send_message KeyError, process_registry FileNotFoundError, file_read_guards timeouts, read_loop_detection file-not-found, mcp_oauth None port, and provider_parity/codex_execution stale tool lists. Fix: autouse fixture saves all affected modules before each test and restores them after, matching the pattern in test_managed_browserbase_and_modal.py. |