hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-14 14:12:44 +00:00

Author	SHA1	Message	Date
Ziliang Peng	c3a09f7835	fix(background_review): propagate parent toolset config to keep tools[] cache-stable ## Summary The background skill/memory-review fork constructed a child `AIAgent` without propagating `enabled_toolsets` / `disabled_toolsets` from the parent. When the parent narrowed its toolset (via `hermes tools disable` or `config.yaml`), the fork's default `enabled_toolsets=None` expanded to "all registered tools" — and the fork's outbound request body sent a wider `tools[]` array than the parent's main-turn request. Anthropic's prompt-cache key includes the `tools[]` array byte-for-byte, so this divergence forked the cache lineage on every nudge and forced a full prefix rewrite. On a captured ~4 hour Claude-via-Hermes session this cost roughly 4.3 M cache-write tokens — about half of those attributable to the per-nudge alternation between the main turn's narrowed `tools[]` and the review fork's wider `tools[]`. ## Goal Extend the byte-stability invariant established by PR #17276 (which fixed `system`) to the `tools[]` slot of the request body, so the review fork's outbound request hits the parent's warmed Anthropic prefix cache regardless of how the parent's toolset is configured. ## Implementation Two-line change in `agent/background_review.py`: pass `enabled_toolsets=getattr(agent, "enabled_toolsets", None)` and the matching `disabled_toolsets` kwarg into the `AIAgent(...)` call inside `_spawn_background_review`. Adds an explanatory block comment that calls out the cache-key dependency and the relationship to PR #17276. The post-construction runtime whitelist (`set_thread_tool_whitelist({memory, skills})`) is untouched — it still gates which tools the model is allowed to dispatch. This change aligns only what the request body transmits, not what the review is allowed to do, so the safety contract from issue #15204 remains intact. 
## Testing - `tests/run_agent/test_background_review_cache_parity.py`: new `test_review_fork_inherits_parent_toolset_config` asserts the parent's `enabled_toolsets` and `disabled_toolsets` reach the review-fork constructor as kwargs. - `tests/run_agent/test_background_review_toolset_restriction.py`: the existing `test_background_review_does_not_narrow_toolset_schema` was inverted (its old "must NOT pass enabled_toolsets" rule was built on the assumption that the parent always ran with the registry default — wrong in practice when the parent is narrowed). Renamed to `test_background_review_matches_parent_toolset_config` and updated to assert the parent's value propagates verbatim. - Verified the new positive test fails without the fix and passes with it. - Full suite for `test_background_review*`: ``` $ python -m pytest tests/run_agent/test_background_review.py \ tests/run_agent/test_background_review_summary.py \ tests/run_agent/test_background_review_toolset_restriction.py \ tests/run_agent/test_background_review_cache_parity.py -q 18 passed in 1.85s ``` ## Scope - `agent/background_review.py`: 2 added kwargs + explanatory comment. - Two test files: one new positive test, one inverted existing test. - No production code paths outside the review fork; no schema changes; no public-API changes. Refs: ziliangpeng/hermes-agent#1 (root-cause analysis with wire-level cache-write measurements). Extends PR #17276's `system`-bytes invariant to the `tools[]` slot.	2026-05-21 12:49:21 +05:30
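The propagation described in this commit can be sketched roughly as follows. `Agent`, `REGISTRY`, `spawn_review_fork`, and `request_tools` are hypothetical stand-ins, not the real `AIAgent` API; only the `getattr(agent, "enabled_toolsets", None)` pattern is taken from the commit message.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical registry: toolset name -> tool names it contributes.
REGISTRY = {"memory": ["memory_read"], "skills": ["skill_run"], "web": ["web_search"]}


@dataclass
class Agent:
    """Stand-in for the real agent class; not the actual AIAgent signature."""
    enabled_toolsets: Optional[list] = None
    disabled_toolsets: Optional[list] = None

    def request_tools(self) -> str:
        """Serialize the tools[] array the way a request body would carry it."""
        names = self.enabled_toolsets if self.enabled_toolsets is not None else sorted(REGISTRY)
        disabled = set(self.disabled_toolsets or [])
        tools = [tool for name in names if name not in disabled for tool in REGISTRY[name]]
        return json.dumps(tools)


def spawn_review_fork(parent: Agent, inherit: bool) -> Agent:
    """Build the child; with inherit=True the parent's toolset config is copied."""
    if not inherit:
        return Agent()  # defaults expand to every registered toolset
    return Agent(
        enabled_toolsets=getattr(parent, "enabled_toolsets", None),
        disabled_toolsets=getattr(parent, "disabled_toolsets", None),
    )
```

With a parent narrowed to `["memory", "skills"]`, the non-inheriting fork serializes a wider `tools[]` string than the parent, while the inheriting fork serializes the same bytes, which is the property a byte-for-byte cache key depends on.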
EloquentBrush0x	6c26727bb3	fix(gateway): extend observe+attribution to location and media handlers _handle_location_message and _handle_media_message were skipped when the observe-unmentioned-group-messages feature landed (`a9db0e2c7`). Both handlers now: 1. Check _should_observe_unmentioned_group_message on the skipped path and call _observe_unmentioned_group_message so group chatter is stored as shared session context even when the bot is not addressed. 2. Call _apply_telegram_group_observe_attribution on the triggered path so the dispatched event uses the shared (user_id=None) group session instead of the per-user session, letting the model see previously observed context. For stickers the attribution is applied after _handle_sticker completes (which overwrites event.text with the vision description); for all other media types it is applied once after caption cleaning. Four new tests cover the observe and attribution paths for both handlers.	2026-05-20 23:52:18 -07:00
0xsir0000	5edb346c75	security(file-safety): also write-deny <root>/.env when running under a profile (#15981 ) build_write_denied_paths() resolved the protected ``.env`` via get_hermes_home(), which is profile-aware. When a profile is active HERMES_HOME points at ``<root>/profiles/<name>`` and ``hermes_home / ".env"`` expands to the profile env file only — the global ``<root>/.env`` is left off the deny list and a write_file call against it succeeds. Since the top-level .env supplies credentials inherited by every profile, this is a P0 credential-exfiltration / overwrite path. Add a parallel ``_hermes_root_path()`` helper that returns the Hermes root (via the existing ``get_default_hermes_root()`` constant) and include ``<root>/.env`` in the deny list alongside ``<active_profile>/.env``. Both paths now refuse write_file/patch regardless of profile state. The active HERMES_HOME .env entry is preserved so the protection in non-profile mode is unchanged. A regression test exercises the profile-active scenario by pointing HERMES_HOME at ``<tmp>/profiles/coder`` and asserting that ``<tmp>/.env`` is denied. Fixes #15981	2026-05-20 23:37:37 -07:00
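A minimal sketch of the deny-list idea, assuming the root and the active home are passed in as plain strings. `build_env_deny_list` and `is_write_denied` are hypothetical names for illustration; the commit describes the real logic as living in `build_write_denied_paths()` with helper functions resolving both directories.

```python
from pathlib import PurePosixPath


def build_env_deny_list(hermes_root: str, hermes_home: str) -> list:
    """Return the .env paths that writes should be refused for.

    Both the active home (a profile directory when a profile is active)
    and the global root contribute one entry each.
    """
    candidates = [
        PurePosixPath(hermes_home) / ".env",
        PurePosixPath(hermes_root) / ".env",
    ]
    deny = []
    for path in candidates:
        if path not in deny:  # identical paths when no profile is active
            deny.append(path)
    return deny


def is_write_denied(target: str, deny: list) -> bool:
    """True when the target path is one of the protected .env files."""
    return PurePosixPath(target) in deny
```

The sketch compares paths literally; a real check would presumably also need to resolve symlinks and relative segments before comparing.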
teknium1	f722ec723f	chore: add nycomar to AUTHOR_MAP	2026-05-20 23:27:38 -07:00
Omar B	be0728cacc	fix: handle Discord typing indicator 429 gracefully The typing indicator loop (send_typing) ran every 8s and died on any exception, including Discord 429 rate limits. Once a 429 killed the loop, the indicator never restarted — and the raw exception bounce could cascade into broader gateway instability. Changes: - Bump sleep interval from 8s to 12s (typing light lasts ~10s) - On 429: extract retry_after, log a warning, sleep the backoff, and continue the loop - On non-rate-limit errors: log debug and return (unchanged behaviour)	2026-05-20 23:27:38 -07:00
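The loop behaviour described above might look roughly like this. `RateLimited` is a hypothetical stand-in for the library's 429 error, and the injectable `sleep` parameter exists only to make the sketch testable; logging is omitted.

```python
import asyncio


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 error carrying retry_after."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


async def typing_loop(send_typing, stop: asyncio.Event,
                      interval: float = 12.0, sleep=asyncio.sleep) -> None:
    """Refresh the typing indicator until stop is set.

    A rate limit pauses the loop for retry_after seconds and then continues;
    any other error ends the loop.
    """
    while not stop.is_set():
        try:
            await send_typing()
        except RateLimited as exc:
            await sleep(exc.retry_after)  # back off, then keep looping
            continue
        except Exception:
            return  # non-rate-limit errors still end the loop
        await sleep(interval)
```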
Teknium	975e13091e	fix(cli): honour image-routing decision in quiet-mode -q --image path The interactive CLI input path consults decide_image_input_mode() to pick between native image_url attachment and the vision_analyze text pipeline, but the non-interactive 'hermes chat -Q -q ... --image FOO' path unconditionally called _preprocess_images_with_vision() — so even with `model.supports_vision: true` set, --image always went through the text-pipeline. Symptom: vision_analyze runs 4-5s per image and the model sees a lossy text summary instead of the actual pixels. Mirror the interactive path: load config, call decide_image_input_mode, branch on native vs text. Falls back to the text-pipeline on any import or build error (Pyright-clean: _build_parts guarded with `is not None`). Live E2E (provider=custom, base_url=openrouter, anthropic/claude-haiku-4.5, red 64x64 PNG): baseline (no override): vision_analyze called (8 log lines), 5.8s with supports_vision: vision_analyze NOT called (0 log lines), 3.9s Same model, same image, single knob flips text→native routing.	2026-05-20 23:27:10 -07:00
Teknium	32aea113f0	fix(agent): consult supports_vision override in auto-mode routing The contributor PR (#17936) only patched the strip path in `_model_supports_vision()`. The auto-mode router in `agent/image_routing._lookup_supports_vision` still only read models.dev, so a custom-provider model declared as vision-capable would still get its images routed through vision_analyze in the default `agent.image_input_mode: auto` setting. Users had to set both `supports_vision: true` AND `image_input_mode: native` to bypass the text pipeline. Single-knob behavior now: `supports_vision: true` alone is enough in auto mode. The strip path and the routing path consult the same resolver. - Extract override resolution into `_supports_vision_override()` in agent/image_routing.py and wire it into `_lookup_supports_vision()`. - Refactor `run_agent._model_supports_vision` to call the same helper (DRY, single source of truth for the resolution order). - Strict YAML boolean coercion: `supports_vision: "false"` (quoted — a common YAML mistake) no longer coerces to True via bool() truthiness. Recognised tokens: true/false/yes/no/on/off/1/0 plus real bools and 0/1. Unrecognised values return None and fall through to models.dev. - Add @CNSeniorious000 to AUTHOR_MAP for release attribution. Tests: 26 new (TestCoerceCapabilityBool, TestSupportsVisionOverride, TestLookupSupportsVisionOverride, TestAutoModeRespectsOverride). Existing contributor tests + image_routing + vision_native_fast_path + native_image_buffer_isolation all green (92/92).	2026-05-20 23:27:10 -07:00
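The strict coercion rule can be illustrated with a short sketch. The function name `coerce_capability_bool` is hypothetical, and the whitespace stripping and case-insensitive matching are assumptions not stated in the commit; the token list and the fall-through to `None` follow the message.

```python
from typing import Optional

_TRUE_TOKENS = {"true", "yes", "on", "1"}
_FALSE_TOKENS = {"false", "no", "off", "0"}


def coerce_capability_bool(value) -> Optional[bool]:
    """Map a config value to True/False, or None when it is not recognised."""
    if isinstance(value, bool):
        return value
    if isinstance(value, int) and value in (0, 1):
        return bool(value)
    if isinstance(value, str):
        token = value.strip().lower()
        if token in _TRUE_TOKENS:
            return True
        if token in _FALSE_TOKENS:
            return False
    return None  # caller falls through to the next source of truth
```

The contrast with plain truthiness is the point: `bool("false")` is `True` in Python because any non-empty string is truthy, whereas the sketch returns `False` for the same input.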
Muspi Merol	1c76689b28	fix(agent): resolve supports_vision override for named custom providers Named custom providers are rewritten to provider="custom" at runtime (hermes_cli/runtime_provider.py:_resolve_named_custom_runtime), so a config under providers.my-vllm.models.my-llava.supports_vision was unreachable via self.provider alone. Also try cfg.model.provider as a candidate provider key, covering both runtime and config naming. Adds a regression test for the named-provider path.	2026-05-20 23:27:10 -07:00
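A rough sketch of the two-candidate lookup, assuming the config is already loaded into a plain dict. The function name and the precedence (per-provider entry before the top-level `model.supports_vision`) are assumptions made for illustration; the commit only states that both the runtime provider name and `cfg.model.provider` are tried as provider keys.

```python
def supports_vision_override(cfg: dict, model: str, runtime_provider: str):
    """Return the declared capability, or None when the config is silent."""
    model_cfg = cfg.get("model") or {}
    providers = cfg.get("providers") or {}
    # Try the runtime name first, then the name written in the config.
    for name in (runtime_provider, model_cfg.get("provider")):
        if not name:
            continue
        entry = ((providers.get(name) or {}).get("models") or {}).get(model) or {}
        if "supports_vision" in entry:
            return entry["supports_vision"]
    return model_cfg.get("supports_vision")


# A named provider is seen as "custom" at runtime, so only the
# config-side name reaches the providers.my-vllm entry.
config = {
    "model": {"provider": "my-vllm"},
    "providers": {"my-vllm": {"models": {"my-llava": {"supports_vision": True}}}},
}
print(supports_vision_override(config, "my-llava", "custom"))
```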
Muspi Merol	24c7ce0fb8	feat(agent): allow declaring supports_vision via user config Custom/local provider models absent from models.dev get classified as non-vision and have their image content stripped before reaching the upstream API. Surface a user-facing override: model: supports_vision: true providers: my-vllm: models: my-llava: supports_vision: true The override short-circuits the models.dev lookup in _model_supports_vision(), which is the single gate guarding image-strip preprocessing on every transport path. Refs #8731.	2026-05-20 23:27:10 -07:00
Teknium	b4afc6546e	fix(xai): restore encrypted reasoning replay across turns xAI partner integration requires Hermes to thread `encrypted_content` reasoning items back to the Responses API on every turn so Grok can maintain cross-turn reasoning coherence. PR #26644 (May 15) gated this off for `is_xai_responses` on the theory that the OAuth/SuperGrok surface rejected replayed encrypted blobs and produced the multi-turn "Expected to have received \`response.created\` before \`error\`" failure. That diagnosis was wrong — the prelude-SSE fallback added in the same PR is what actually fixed that failure mode. Suppressing the replay was an unnecessary side-effect that broke the whole point of xAI's partnership integration. Changes: - agent/codex_responses_adapter.py — drop the `is_xai_responses` gate in `_chat_messages_to_responses_input`. Keep the kwarg in the signature for transport compatibility; update the docstring to document the May 2026 reversal. - agent/transports/codex.py — restore `kwargs["include"] = ["reasoning.encrypted_content"]` on the xAI Responses path so xAI echoes encrypted reasoning back to us. - tests/run_agent/test_codex_xai_oauth_recovery.py — flip the three xAI assertions (now: xAI MUST receive replayed reasoning AND we MUST include encrypted_content in the request). - tests/agent/transports/test_codex_transport.py — flip the `include` assertions on `test_xai_reasoning_effort_passed` and `test_xai_grok_4_omits_reasoning_effort`; update the allowlist block comment. The prelude-SSE fallback and the entitlement-403 surfacing fixes from #26644 are untouched — they were independent fixes that happened to ride along with the reasoning-replay gate. 
Validation: - Targeted: tests/run_agent/test_codex_xai_oauth_recovery.py + tests/agent/transports/test_codex_transport.py → 65/65 pass - Broader: tests/agent/transports/ + tests/run_agent/ → 1674 passed, 3 skipped, 0 failures - E2E (real imports, isolated HERMES_HOME, ResponsesApiTransport build_kwargs): turn-1 request carries `include: ["reasoning.encrypted_content"]`; turn-2 input replays the encrypted_content blob from turn-1's `codex_reasoning_items`; native Codex unchanged.	2026-05-20 23:12:45 -07:00
Teknium	127b56a61a	style: docstring + whitespace cleanup on secure_parent_dir - Drop two extra blank lines between display_hermes_home and secure_parent_dir - Fix docstring saying 'depth < 2' (actual guard is parts < 3)	2026-05-20 22:56:55 -07:00
liuhao1024	4ead464f97	fix(security): guard os.chmod(parent) against / and top-level dirs Five call sites do os.chmod(path.parent, 0o700) without checking that the parent resolves to a safe directory. If HERMES_HOME or another path env var resolves to /, the chmod strips traversal permission from the root inode and bricks the entire host. Add secure_parent_dir() to hermes_constants.py that refuses to chmod / or any top-level directory (depth < 2). Replace all 5 call sites with this helper. Fixes #25821	2026-05-20 22:56:55 -07:00
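The guard can be sketched as a pure path check. `is_safe_chmod_target` is a hypothetical name, and the sketch assumes an already-resolved absolute POSIX path; the `parts < 3` threshold follows the later docstring-cleanup commit on `secure_parent_dir`.

```python
from pathlib import PurePosixPath


def is_safe_chmod_target(directory: str) -> bool:
    """True only when the directory sits at least two levels below /."""
    path = PurePosixPath(directory)
    if not path.is_absolute():
        return False  # the sketch refuses anything it cannot place under /
    # "/" has 1 part, "/home" has 2, "/home/user" has 3.
    return len(path.parts) >= 3
```

A caller would run this check on the resolved parent directory and skip the `os.chmod(parent, 0o700)` call whenever it returns `False`.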
teknium1	3bbe980115	chore: add Glucksberg to AUTHOR_MAP	2026-05-20 22:55:31 -07:00
Markus	a9db0e2c74	Observe unmentioned Telegram group messages	2026-05-20 22:55:31 -07:00
Teknium	c6a992e3e3	fix(security): derive <VENDOR>_API_KEY from host as final credential fallback After #28660's host-gating fix, users with provider=custom and base_url pointed at a commercial endpoint (DeepSeek, Groq, Mistral, …) hit no-key-required even when they had the vendor-named env var set (DEEPSEEK_API_KEY, GROQ_API_KEY, …). The issue author flagged this as 'what users intuitively expect'. Adds _host_derived_api_key() to derive an env var name from the base URL host using the registrable label (second-to-last). Appended to all three api_key_candidates chains (_resolve_named_custom_runtime direct-alias path, named-custom path, _resolve_openrouter_runtime non-openrouter branch). Lookalike resistance: api.deepseek.com.attacker.test resolves to vendor label 'attacker', NOT 'deepseek' — DEEPSEEK_API_KEY stays put. IPs and loopback yield no vendor label. Already-handled vendors (OPENAI/OPENROUTER/ OLLAMA) are filtered to prevent bypass of the explicit host-gated paths. Adds 6 tests covering positive paths (DeepSeek, Groq), the lookalike attack, loopback rejection, the already-handled-vendor filter, and direct helper unit tests. Also adds erhnysr to AUTHOR_MAP.	2026-05-20 22:12:09 -07:00
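As an illustration of the host-to-env-var derivation, here is a minimal sketch. `host_derived_env_var` is a hypothetical name, and taking the second-to-last DNS label is a simplification that does not account for multi-part public suffixes such as `co.uk`; the lookalike and loopback cases mirror the ones listed in the commit.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlparse

# Vendors with their own explicit host-gated paths are skipped here.
ALREADY_HANDLED = {"OPENAI", "OPENROUTER", "OLLAMA"}


def host_derived_env_var(base_url: str) -> Optional[str]:
    """Derive '<VENDOR>_API_KEY' from the base URL host, or None."""
    host = (urlparse(base_url).hostname or "").lower()
    if not host or host == "localhost":
        return None
    try:
        ipaddress.ip_address(host)
        return None  # bare IPs carry no vendor label
    except ValueError:
        pass
    labels = host.split(".")
    if len(labels) < 2:
        return None
    vendor = labels[-2].upper().replace("-", "_")
    if vendor in ALREADY_HANDLED:
        return None
    return f"{vendor}_API_KEY"
```

Because the label is read from the right-hand end of the host name, a host like `api.deepseek.com.attacker.test` yields `ATTACKER_API_KEY` rather than `DEEPSEEK_API_KEY`.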
Erhnysr	9514ddbee2	fix(security): address review feedback from pmos69 - Preserve OPENROUTER_API_KEY for explicit mirror/proxy configs when requested provider is openrouter and OPENROUTER_BASE_URL is set - Gate OPENAI_API_KEY and OPENROUTER_API_KEY in named custom provider path (_resolve_named_custom_runtime) on authoritative hosts - Gate same keys in direct-alias path - Update tests to reflect secure-by-default behavior for local endpoints	2026-05-20 22:12:09 -07:00
Erhnysr	59088228f6	fix(security): prevent API key leakage to non-authoritative custom endpoints Custom endpoint provider was forwarding OPENAI_API_KEY and OLLAMA_API_KEY to arbitrary hosts. Keys should only be sent to their authoritative domains (openai.com, ollama.com) or when explicitly configured via pool/env. - Gate OPENAI_API_KEY to openai.com hosts only - Gate OLLAMA_API_KEY to ollama.com hosts only - Return 'no-key-required' for unrecognized custom endpoints - Update tests to reflect secure-by-default behavior Closes #28660	2026-05-20 22:12:09 -07:00
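The host gating can be sketched like this. `host_matches` and `pick_api_key` are hypothetical names, the exact-or-subdomain matching rule is an assumption, and the "explicitly configured via pool/env" escape hatch mentioned in the commit is left out.

```python
from urllib.parse import urlparse


def host_matches(base_url: str, domain: str) -> bool:
    """True when the URL host is the domain itself or one of its subdomains."""
    host = (urlparse(base_url).hostname or "").lower().rstrip(".")
    return host == domain or host.endswith("." + domain)


def pick_api_key(base_url: str, env: dict) -> str:
    """Forward a vendor key only to that vendor's own hosts."""
    if host_matches(base_url, "openai.com") and env.get("OPENAI_API_KEY"):
        return env["OPENAI_API_KEY"]
    if host_matches(base_url, "ollama.com") and env.get("OLLAMA_API_KEY"):
        return env["OLLAMA_API_KEY"]
    return "no-key-required"
```

Matching on `"." + domain` rather than a bare suffix keeps hosts such as `notopenai.com` or `openai.com.evil.test` from receiving the key.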
teknium1	5672772dab	fix(gateway): reorder telegram menu priority — everyday commands first Put /help, /new, /stop, /status, /resume, /sessions, /model ahead of the maintenance group (/debug, /restart, /update, /verbose, /commands) so the menu's first row matches what users actually type most often. The maintenance commands that prompted this priority list still land inside the 30-cap visible window — just not at the very top.	2026-05-20 19:14:21 -07:00
helix4u	b9b6e034d5	fix(gateway): prioritize Telegram command menu	2026-05-20 19:14:21 -07:00
ethernet	1566d71726	Merge pull request #29342 from NousResearch/fix/tui-linux-copy fix(tui): clipboard copy on linux/wayland	2026-05-20 21:40:37 -04:00
ethernet	f7441f9c42	fix(nix): add xclip and wl-copy	2026-05-20 19:47:30 -04:00
ethernet	c42edd8055	fix(tui): clipboard copy on linux/wayland `probeLinuxCopy` and `copyNative` in `osc.ts` await `execFileNoThrow` for wl-copy / xclip / xsel. Those tools double-fork a daemon that holds the system selection live, and the daemon inherits stdio pipes from `spawn(stdio: 'pipe')`. Node's 'close' event only fires when stdio is fully closed → the daemon keeps the pipes open → 'close' never fires → the await leaks past the timeout (kill(SIGTERM) on an already-exited child is a no-op, daemon survives). Result: `linuxCopy` cache stays `undefined` permanently, the actual copy never runs, ctrl-c silently does nothing on wayland/x11. Reproduced in isolation, confirmed across wl-copy and a daemonization-shaped fixture. Fix: add `resolveOnExit` option to `execFileNoThrow`. When set, the promise settles on the immediate child's 'exit' event instead of waiting for stdio drainage. Wired into both the probe and the actual copy spawns for every clipboard tool (pbcopy, wl-copy, xclip, xsel, clip). Tests: 5 new vitest cases covering daemon-style child handling, non-zero exit propagation, timeout behavior, and double-resolve guard. The forever-hang case is committed as `it.skip` with documentation so a reviewer can verify the bug by hand.	2026-05-20 19:47:30 -04:00
Teknium	c6a380eb6c	fix(skills-hub): widen identifier-dedup to GitHubSource + fix test patch path Sibling fix on top of @EloquentBrush0x's PR #29441. - tools/skills_hub.py GitHubSource.search() had the same r.name dedup bug. Two configured GitHub taps publishing same-named skills would collapse to one. - tests/hermes_cli/test_skills_hub.py:test_browse_skills_dedup_uses_identifier_not_name patched hermes_cli.skills_hub.create_source_router, but browse_skills() imports it locally from tools.skills_hub. Fixed patch path.	2026-05-20 15:04:01 -07:00
EloquentBrush0x	8f92327891	fix(skills-hub): fix dedup in browse_skills() programmatic API browse_skills() is the TUI gateway's API for the web UI skills browser (tui_gateway/server.py:6574). It had the same dedup-by-name bug as do_browse() and unified_search() fixed in the parent commit: r.name is not unique for browse-sh skills (Airbnb, Booking.com, Zillow all publish "search-listings"), so the dedup loop silently dropped all but the first skill with each task name. Switch to r.identifier, which is always globally unique. Add a regression test asserting that two browse-sh skills with the same name but different hostnames both appear in the browse_skills() result.	2026-05-20 15:04:01 -07:00
EloquentBrush0x	fc7e04e9ed	fix(skills-hub): deduplicate search results by identifier, not name Browse.sh exposes skills by task name (e.g. "search-listings"), which is shared across hundreds of sites. Deduplicating by name silently dropped every browse-sh skill after the first one with a given task name — e.g. only Airbnb's "search-listings" would survive, collapsing Booking.com, Zillow, and every other site's variant into nothing. Switch unified_search() and do_browse() to use r.identifier as the dedup key. identifier is always globally unique (e.g. "browse-sh/airbnb.com/search-listings-ddgioa"), so same-named skills from different browse-sh hostnames are preserved as distinct results. Update existing TestUnifiedSearchDedup tests to model the real scenario (same identifier appearing from two sources) and add a regression test that asserts browse-sh skills with the same name but different hostnames are never collapsed.	2026-05-20 15:04:01 -07:00
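The difference between the two dedup keys is easy to see in a small sketch. `SkillResult` and `dedup` are hypothetical; the Airbnb identifier comes from the commit message, while the Booking.com and Zillow identifiers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillResult:
    name: str
    identifier: str


def dedup(results, key):
    """Keep the first result for each key value, preserving order."""
    seen = set()
    unique = []
    for result in results:
        value = key(result)
        if value not in seen:
            seen.add(value)
            unique.append(result)
    return unique


results = [
    SkillResult("search-listings", "browse-sh/airbnb.com/search-listings-ddgioa"),
    SkillResult("search-listings", "browse-sh/booking.com/search-listings-aaaaaa"),
    SkillResult("search-listings", "browse-sh/zillow.com/search-listings-bbbbbb"),
]
print(len(dedup(results, key=lambda r: r.name)))        # name key collapses them
print(len(dedup(results, key=lambda r: r.identifier)))  # identifier key keeps all
```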
kshitij	3ce1cf2bb7	Merge pull request #29484 from kshitijk4poor/kp/x-search-degraded-flag Merged after self-review + local verification of date validation and degraded flag. All tests pass, claims confirmed end-to-end.	2026-05-20 14:39:27 -07:00
helix4u	1a7bb988fc	fix(gateway): harden kanban and provider cleanup races	2026-05-20 14:31:22 -07:00
kshitijk4poor	2a352f96ee	fix(x_search): surface degraded results + validate dates The xAI Responses API for x_search returns 200 OK with a synthesized fluff answer in two failure modes that callers currently cannot distinguish from a real, citation-backed result: 1. Any narrowing filter (allowed_x_handles, excluded_x_handles, from_date, to_date) was active, but the X index returned no matching posts. The model then answers from training data. 2. The date range is malformed, inverted, or pure-future (e.g. from_date=2030-01-01). The API call burns quota and Grok responds with a generic answer. Mitigations, both client-side: * Validate from_date / to_date before the HTTP call: - Strict YYYY-MM-DD. - from_date <= to_date when both set. - from_date <= today UTC (no posts in a window that hasn't started). to_date in the future remains allowed so callers can request 'from yesterday to tomorrow'. * Add 'degraded' + 'degraded_reason' to successful responses. degraded=True iff any narrowing filter was active AND both the top-level 'citations' array and inline 'url_citation' annotations came back empty. A broad query with no filters that returns no citations is not flagged degraded — that case is just an unsourced answer, not a filter miss. Tests cover all four validation paths plus six degraded-flag scenarios (each filter type, inline vs top-level citation recovery, broad query baseline). All existing tests continue to pass; the additions are purely additive on the success-path response shape. Discovered while testing the x_search toolset end-to-end: queries scoped to @Teknium1 returned confident-sounding generic text about Nous Research with zero citations, and from_date in 2030 produced sassy non-answers. Both are now detectable by the caller.	2026-05-21 02:38:45 +05:30
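The client-side checks listed above could be sketched as follows. The function names are hypothetical, and `today` is passed in explicitly so the example stays deterministic; the three date rules and the degraded condition follow the commit message.

```python
import re
from datetime import date
from typing import Optional

_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_day(value: str) -> date:
    """Parse a strict YYYY-MM-DD string."""
    if not _DATE_RE.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return date.fromisoformat(value)  # raises ValueError for dates like 2025-02-30


def validate_range(from_date: Optional[str], to_date: Optional[str], today: date) -> None:
    """Raise ValueError for malformed, inverted, or not-yet-started windows."""
    start = parse_day(from_date) if from_date else None
    end = parse_day(to_date) if to_date else None
    if start and end and start > end:
        raise ValueError("from_date is after to_date")
    if start and start > today:
        raise ValueError("from_date is in the future")
    # A future to_date is allowed on purpose.


def is_degraded(filters_active: bool, citations: list, inline_citations: list) -> bool:
    """Flag a response only when a narrowing filter was set and nothing was cited."""
    return bool(filters_active and not citations and not inline_citations)
```

Running the validation before the HTTP call means a window such as `from_date=2030-01-01` is rejected locally instead of spending quota on a generic answer.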
Teknium	31a0100104	feat(state.db): persist platform_message_id; restore yuanbao exact-id recall PR #29211 dropped JSONL gateway transcripts and noted that the platform's own `message_id` field (used by Yuanbao's recall guard to redact a message by exact platform id) was no longer preserved — falling back to content-match. That fallback works for the common case but redacts the wrong row when two messages share text (or fails to match when content is post-processed). Restore exact-id matching by giving state.db a column for it: - New `platform_message_id TEXT` column on the messages table (SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler on existing DBs, no version-gated migration block needed) - Partial index `idx_messages_platform_msg_id` on (session_id, platform_message_id) to keep recall's point-lookup cheap even on large sessions - `append_message()` and `replace_messages()` accept the new value: the gateway-facing `append_to_transcript` in `gateway/session.py` forwards either `message["platform_message_id"]` or the legacy `message["message_id"]` key (yuanbao's existing convention) - `get_messages_as_conversation()` surfaces the column back on the message dict as `message_id` so platform code reads the same shape it used to read from JSONL - Yuanbao `_patch_transcript`: restore branch A1 (exact id match) ahead of A2 (content match) ahead of B (system-note). Both branches log which one fired so operators can tell from gateway.log whether recall hit the canonical path or had to fall back. Tests: - New low-level round-trip tests in `test_hermes_state.py` for both `append_message` and `replace_messages` paths - The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert the new contract: branch A1 (id match) works against DB-only transcripts, and branch A2 (content match) still recovers rows that were observed without a platform id (e.g. agent-processed @bot messages where run.py doesn't carry msg_id through)	2026-05-20 13:00:57 -07:00
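A minimal sketch of the column, the partial index, and the A1/A2/B lookup order, using the standard `sqlite3` module. The table layout beyond `session_id` and `platform_message_id`, the `WHERE platform_message_id IS NOT NULL` predicate, and the helper names are assumptions made for illustration; only the column name, the index name, and the branch order come from the commit message.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a simplified messages table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE messages (
            id INTEGER PRIMARY KEY,
            session_id TEXT NOT NULL,
            content TEXT,
            platform_message_id TEXT
        );
        -- Partial index: rows without a platform id are not indexed.
        CREATE INDEX idx_messages_platform_msg_id
            ON messages (session_id, platform_message_id)
            WHERE platform_message_id IS NOT NULL;
        """
    )
    return conn


def find_for_recall(conn, session_id, platform_id, content):
    """Return (row id, branch): exact id match, then content match, then none."""
    row = conn.execute(
        "SELECT id FROM messages WHERE session_id = ? AND platform_message_id = ?",
        (session_id, platform_id),
    ).fetchone()
    if row:
        return row[0], "A1"
    row = conn.execute(
        "SELECT id FROM messages WHERE session_id = ? AND content = ? "
        "ORDER BY id DESC LIMIT 1",
        (session_id, content),
    ).fetchone()
    if row:
        return row[0], "A2"
    return None, "B"
```

When two rows share the same text, the id lookup picks the intended one, whereas a content match can only pick one of them by position.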
yoniebans	0cc1a1d2d9	refactor(yuanbao): drop dead branch A1 message_id loop + pin missing fixture PR #29211 review findings: 1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures. 2. yuanbao recall branch A1 (message_id exact match) was structurally dead once load_transcript() became DB-only — state.db never preserves the platform message_id. Removed the dead loop, consolidated to a single content-match branch (renamed 'A: content match'). Branch B (system note) unchanged. Updated the test name + docstring to reflect this. Note: self._lock is no longer taken in append_to_transcript (was guarding the JSONL file append). SQLite append_message handles its own concurrency via WAL mode, so this is safe; flagging for awareness.	2026-05-20 13:00:57 -07:00
yoniebans	c634c07bcc	test(gateway): pin DEFAULT_DB_PATH in fixtures to prevent real state.db writes Fixtures that instantiate SessionStore() trigger SessionDB() with no args, which resolves to ~/.hermes/state.db via the DEFAULT_DB_PATH module constant (snapshot of get_hermes_home() at hermes_state import time). The autouse _hermetic_environment fixture in tests/conftest.py monkeypatches HERMES_HOME env, but DEFAULT_DB_PATH is already cached by then. Per-test monkeypatch.setattr(hermes_state, 'DEFAULT_DB_PATH', tmp_path/'state.db') forces the DB into tmp_path so the tests can't leak into the real profile. Verified by counting u1-prefixed sessions in real state.db before/after: delta=0.	2026-05-20 13:00:57 -07:00
yoniebans	33a3cf5322	docs(sessions): state.db is canonical for gateway messages	2026-05-20 13:00:57 -07:00
yoniebans	b4b118c201	refactor(gateway): drop _append_to_jsonl from mirror Mirror messages are persisted via _append_to_sqlite. JSONL writer was a redundant dual-write. Updated test assertions from JSONL file checks to SQLite mock verification.	2026-05-20 13:00:57 -07:00
yoniebans	351fdcc6e6	refactor(gateway): stop writing JSONL in append_to_transcript / rewrite_transcript state.db is canonical. JSONL transcripts were a transition fallback; the fallback was removed in the previous commit. Existing *.jsonl files on disk are left untouched.	2026-05-20 13:00:57 -07:00
yoniebans	971cfaa38c	refactor(yuanbao): migrate recall to load_transcript() Yuanbao's recall feature was reading the gateway JSONL directly to look up messages by platform message_id, which state.db does not preserve. Migrated to use load_transcript() which returns DB messages. Recall branch A1 (message_id match) now falls through to A2 (content match) or B (system note) for all sessions — a documented degradation. Follow-up issue: add platform_message_id column to state.db messages to restore exact-id matching.	2026-05-20 13:00:57 -07:00
yoniebans	024a8e3ee9	refactor(gateway): drop JSONL fallback in load_transcript state.db is canonical. The 'use whichever source is longer' branch was defensive code for the pre-DB migration; on every real DB it has not fired (verified on a session corpus with 27 jsonl files / 950 sessions — zero jsonl-bigger cases). Test changes: - TestLoadTranscriptCorruptLines: deleted (tested dead JSONL code path) - TestLoadTranscriptPreferLongerSource: deleted (tested removed fallback) - Replaced with TestLoadTranscriptDBOnly (DB-only reads) - TestSessionStoreRewriteTranscript: fixture now creates DB session - test_gateway_retry_replaces_last_user_turn: fixture uses real DB	2026-05-20 13:00:57 -07:00
yoniebans	1d27be0ff3	test(gateway): pin SQLite-only load_transcript behaviour	2026-05-20 13:00:57 -07:00
helix4u	4d2df86281	docs(skills): clarify external dir mutations	2026-05-20 12:41:38 -07:00
Fabio Siqueira	57a61057f5	fix(deps): bump pydantic to 2.13.4 to avoid pydantic-core thread segfault (#29021 ) * fix(deps): bump pydantic to 2.13.4 to avoid pydantic-core thread segfault pydantic-core 2.41.5 (pulled by pydantic==2.12.5) segfaults when the OpenAI SDK's Responses API resource (client.responses.create / client.responses.stream) is exercised from a non-main threading.Thread. Hermes always dispatches codex_responses calls from a daemon thread in agent/chat_completion_helpers.py:_call, so the crash is 100% reproducible whenever the active provider is xai-oauth or openai-codex. Symptom: `hermes -z "ping"` (or any oneshot path) dies with SIGSEGV / exit 139 and zero output — hermes_cli/oneshot.py redirects stderr to /dev/null, hiding the crash. Bumping pydantic to 2.13.4 pulls in pydantic-core 2.46.4, which eliminates the crash. Verified end-to-end: `hermes -z "ping"` against xai-oauth/grok-4.3 now returns the expected response. Minimal repro (any OpenAI base_url; not xAI-specific): import threading from openai import OpenAI cli = OpenAI(api_key="sk-bogus", base_url="https://api.openai.com/v1") def go(): try: cli.responses.create(model="gpt-4o", input="ping") except BaseException as e: print(type(e).__name__) threading.Thread(target=go).start() # → SIGSEGV with pydantic-core 2.41.5; clean 401 with 2.46.4 * chore(deps): regenerate uv.lock for pydantic 2.13.4 bump	2026-05-20 15:27:14 -04:00
dependabot[bot]	419910ee21	chore(deps): bump idna from 3.11 to 3.15 (#28883 ) Bumps [idna](https://github.com/kjd/idna) from 3.11 to 3.15. - [Release notes](https://github.com/kjd/idna/releases) - [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.md) - [Commits](https://github.com/kjd/idna/compare/v3.11...v3.15) --- updated-dependencies: - dependency-name: idna dependency-version: '3.15' dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-20 15:25:45 -04:00
dependabot[bot]	fee88105f9	chore(deps): bump protobufjs in /scripts/whatsapp-bridge (#28889 ) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.5.6 to 7.6.0. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/protobufjs-v7.6.0/CHANGELOG.md) - [Commits](https://github.com/protobufjs/protobuf.js/compare/protobufjs-v7.5.6...protobufjs-v7.6.0) --- updated-dependencies: - dependency-name: protobufjs dependency-version: 7.6.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-20 15:25:32 -04:00
dependabot[bot]	27506cc02d	chore(deps): bump ws from 8.20.0 to 8.20.1 in /scripts/whatsapp-bridge (#28975 ) Bumps [ws](https://github.com/websockets/ws) from 8.20.0 to 8.20.1. - [Release notes](https://github.com/websockets/ws/releases) - [Commits](https://github.com/websockets/ws/compare/8.20.0...8.20.1) --- updated-dependencies: - dependency-name: ws dependency-version: 8.20.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-20 15:25:15 -04:00
brooklyn!	88f5186d35	fix(tui): anchor splitReasoning unclosed-tag regex to start of input (#29426) `splitReasoning()` strips paired `<think>…</think>` blocks first, then runs an unclosed-trailing regex to catch reasoning that hasn't yet streamed its closer. That second regex was unanchored and greedy: new RegExp(`<${tag}>([\\s\\S]*)$`, 'i') So any literal `<think>` somewhere in prose — a model quoting the tag, a code example, or a stream-mid-tag before the closer arrives — consumed every paragraph after it to EOF. User-visible symptom: "TUI eats last paragraph of output," both during streaming and on settled turns. Real reasoning streams always lead the message (that's the only place an unclosed opener can legitimately appear during streaming). Anchor the regex to `^\s*` so mid-prose mentions of the tag are preserved. Empirical repro before the fix: splitReasoning('final answer paragraph one.\n\n<think>internal note\n\nfinal answer paragraph two.') → text: 'final answer paragraph one.' ← paragraph two GONE After: → text: 'final answer paragraph one.\n\n<think>internal note\n\nfinal answer paragraph two.' Updated the existing trailing-unclosed test to lead with `<think>` (the real-world shape) and added a regression test pinning the mid-text case. ui-tui type-check clean, 808/808 vitest pass.	2026-05-20 14:09:38 -05:00
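The anchoring change described in 88f5186d35 can be illustrated with a small Python sketch. This is not the ui-tui code (which is TypeScript); `split_reasoning`, `PAIRED`, `UNANCHORED`, and `ANCHORED` are hypothetical names, and the patterns only approximate the ones quoted in the commit message.

```python
import re

TAG = "think"

# Paired <think>...</think> blocks, matched non-greedily.
PAIRED = re.compile(rf"<{TAG}>([\s\S]*?)</{TAG}>", re.IGNORECASE)
# Pre-fix shape: an opener anywhere in the text swallows everything to the end.
UNANCHORED = re.compile(rf"<{TAG}>([\s\S]*)$", re.IGNORECASE)
# Post-fix shape: the opener must lead the message (optional whitespace first).
ANCHORED = re.compile(rf"^\s*<{TAG}>([\s\S]*)$", re.IGNORECASE)


def split_reasoning(raw: str, unclosed: re.Pattern) -> tuple[str, str]:
    """Return (reasoning, visible_text) using the given unclosed-tag pattern."""
    parts: list[str] = []

    def take(match: re.Match) -> str:
        # Collect the captured reasoning and remove it from the visible text.
        parts.append(match.group(1).strip())
        return ""

    text = PAIRED.sub(take, raw)                # closed blocks first
    text = unclosed.sub(take, text, count=1)    # then a trailing unclosed opener
    return "\n\n".join(p for p in parts if p), text.strip()


sample = (
    "final answer paragraph one.\n\n"
    "<think>internal note\n\n"
    "final answer paragraph two."
)

before = split_reasoning(sample, UNANCHORED)
after = split_reasoning(sample, ANCHORED)
print(repr(before[1]))  # paragraph two is lost with the unanchored pattern
print(repr(after[1]))   # the full text survives with the anchored pattern
```

With the anchored pattern, a message that genuinely starts with an unclosed `<think>` is still treated as streaming reasoning, while a mid-prose mention of the tag is left in the visible text.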
Teknium	eeb747de25	feat(sessions): opt-in per-session JSON snapshot writer PR #29182 deleted the per-session JSON snapshot writer outright because state.db is canonical and the snapshots had no in-tree consumer. Some users have external tooling that reads `~/.hermes/sessions/session_{sid}.json` directly, so reintroduce the writer behind a config flag that defaults to off. - Add `sessions.write_json_snapshots` (default False) to DEFAULT_CONFIG - Restore `AIAgent._save_session_log` + `_clean_session_content` as gated methods. When the flag is off the call is a fast no-op; when on, the writer behaves as before (atomic write, truncation guard preserved, REASONING_SCRATCHPAD → think tag normalization) - Re-derive the target path from `agent.session_id` on each call so `/branch` and `/compress` re-points happen automatically — no need to restore the explicit re-point bookkeeping at call sites - Wire the single call site in `_persist_session` (the cleanup-on-exit hook). Did NOT restore the 7 intra-turn calls the original PR deleted — those were redundant writes within the same turn that doubled disk I/O without adding any persistence guarantee `_persist_session` does not already provide - Read the flag once at agent init via `load_config()`, cache as `agent._session_json_enabled` - Update `TestNoSessionJsonSnapshot` → `TestSessionJsonSnapshotOptIn` to pin behavior: default off (no file), opt-in true (file written), no-op method on default agents, logs_dir retained unconditionally - Update CONTRIBUTING.md and the bundled `hermes-agent` skill to document the flag and its default	2026-05-20 11:44:10 -07:00
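A minimal sketch of the opt-in pattern described in eeb747de25, assuming a plain dict config. `SnapshotWriter` and its methods are hypothetical and do not mirror `AIAgent._save_session_log`; only the `sessions.write_json_snapshots` flag name and the `session_{sid}.json` file name come from the commit message, and the truncation guard and content cleaning are omitted.

```python
import json
import os
import tempfile
from pathlib import Path


class SnapshotWriter:
    """Hypothetical stand-in for a config-gated per-session JSON writer."""

    def __init__(self, sessions_dir, config):
        self.sessions_dir = Path(sessions_dir)
        # Read the flag once at construction time; default is off.
        self.enabled = bool(
            config.get("sessions", {}).get("write_json_snapshots", False)
        )

    def save(self, session_id, messages):
        if not self.enabled:
            return None  # fast no-op when the flag is off

        # Re-derive the target path on every call so a changed session id
        # is picked up without extra bookkeeping.
        target = self.sessions_dir / f"session_{session_id}.json"
        self.sessions_dir.mkdir(parents=True, exist_ok=True)

        # Atomic write: dump to a temp file in the same directory, then rename.
        fd, tmp = tempfile.mkstemp(dir=self.sessions_dir, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                json.dump({"session_id": session_id, "messages": messages}, fh)
            os.replace(tmp, target)
        finally:
            # After a successful replace the temp file no longer exists.
            if os.path.exists(tmp):
                os.unlink(tmp)
        return target
```

Usage would look like `SnapshotWriter(path, {"sessions": {"write_json_snapshots": True}}).save(sid, messages)`; with an empty config the call returns `None` and writes nothing.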
Teknium	6fc1989a5d	chore(release): correct AUTHOR_MAP for jonny@nousresearch.com The email "jonny@nousresearch.com" belongs to @yoniebans (GitHub id 5584832, display name "jonny"), not to Jeffrey Quesnelle (@jquesnelle, id 687076, who commits as emozilla@nousresearch.com). Verified across all 60 historical commits on the repo authored from this email — every one of them was a yoniebans commit being mis-credited to jquesnelle in the changelog. Surfaced while salvaging PR #29182 (yoniebans's session-log refactor).	2026-05-20 11:44:10 -07:00
yoniebans	b6c6f650ee	test(session-log): pin no-session_json regression + drop trailing whitespace Adds TestNoSessionJsonSnapshot to lock the contract that session_log_file attribute, _save_session_log method, and the per-session JSON snapshot writer are gone. logs_dir is retained for request_dump_*.json. Also cleans up stray trailing whitespace in test_run_agent_codex_responses introduced when the _save_session_log stub line was deleted.	2026-05-20 11:44:10 -07:00
yoniebans	6f1a5f8597	refactor(session-log): delete dead _clean_session_content helper Only caller was the removed _save_session_log. Also removes the unused convert_scratchpad_to_think and has_incomplete_scratchpad imports from run_agent.py (both still used elsewhere via their own imports).	2026-05-20 11:44:10 -07:00
yoniebans	9d793e8e58	docs(session-log): state.db is canonical; ~/.hermes/sessions/ is legacy	2026-05-20 11:44:10 -07:00
yoniebans	cebd480818	refactor(session-log): drop branch/compress re-point of session_log_file The attribute no longer exists; nothing to re-point.	2026-05-20 11:44:10 -07:00
yoniebans	c547392fd4	refactor(session-log): stop initializing session_log_file attribute	2026-05-20 11:44:10 -07:00

9137 commits