hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-19 10:02:16 +00:00

Author	SHA1	Message	Date
Reiji Kisaragi	3d21666b2f	fix: preserve multimodal user content during persistence Avoid applying text-only persist_user_message overrides to multimodal current-turn user messages. Early crash-resilience persistence mutates the same messages list later used for the API call, so clobbering list content drops ACP image blocks before model dispatch. Add regression coverage for both text override behavior and multimodal preservation. Closes #44242	2026-06-17 09:49:39 -07:00
xxxigm	c2fa302e93	Merge pull request #47913 from xxxigm/fix/desktop-backend-skew-toast-nag fix(desktop): stop the "Backend out of date" toast nagging on every session open	2026-06-17 10:04:34 -05:00
Teknium	c6c8abbadb	refactor: remove agent-callable send_message tool (#47856) * feat(mcp): raise default tool-call timeout 120s -> 300s Port from openai/codex#28234. Long-running MCP tools (web fetches, sandboxed builds, deep-research servers) routinely exceed 120s, causing spurious timeout failures. Codex bumped its default MCP tool timeout from 120 to 300 for the same reason. - _DEFAULT_TOOL_TIMEOUT 120 -> 300 in tools/mcp_tool.py (per-server 'timeout' config override unchanged) - update test_default_timeout assertion - document the default in mcp-config-reference.md * refactor: remove agent-callable send_message tool The agent should not decide on its own to fire off cross-platform messages or reactions. Outbound platform messaging is handled outside the agent loop: cron delivery, the gateway kanban notifier (dashboard-toggled), and the `hermes send` CLI. Removes the model-tool registration only; the send engine in send_message_tool.py (_send_to_platform, _send_via_adapter, _parse_target_ref, per-platform _send_* helpers) is kept intact for those non-agent callers. Drops the now-empty 'messaging' toolset and its `hermes tools` toggle. Yuanbao DM guidance now points at the native yb_send_dm tool.	2026-06-17 07:11:23 -07:00
brooklyn!	f10f7114f9	Merge pull request #47664 from NousResearch/bb/desktop-markdown-spread-overflow fix(desktop): stop a single message from crashing or freezing the chat	2026-06-17 08:37:06 -05:00
Brooklyn Nicholson	0138282f97	perf(desktop): keep oversized messages from freezing the chat A multi-MB message (logged bundle, huge tool dump) froze the renderer before any paint: Streamdown runs `preprocess` + `marked` lex over the whole string synchronously in a useMemo, an uninterruptible long task that no try/catch or content-visibility can help (our JS runs before the browser ever skips layout). Tiered fix: - Message gate: past 200KB, bypass markdown entirely and render the raw text in `content-visibility:auto` line-chunks — synchronous work is bounded to a string split, the browser virtualizes layout natively, and every line stays in the DOM (selectable, find-in-page). - Code-block budget: past 3k lines / 150KB, skip Shiki (which emits a span per token) and render plain, chunked the same way. - Collapse/expand: a reusable ExpandableBlock clamps code blocks and the huge-text fallback to a 120px preview with a gradient + chevron, expanding to 300px. The inner element is always a scroll container so the content-visibility chunks stay lazily laid out in both states. No content is ever dropped; the copy button (card header) always yields the full block.	2026-06-17 08:25:52 -05:00
Max Freedom Pollard	992b922389	fix(curator): stop restore from matching unrelated skills by name prefix restore_skill() falls back to p.name.startswith(f"{skill_name}-") when no archive directory matches the requested name exactly. That fallback is meant to catch the timestamped duplicate archive_skill() writes on a name collision (<skill>-YYYYMMDDHHMMSS), but the bare prefix also matches any unrelated archived skill named <name>-something. So restoring "git" can pull an archived "git-helpers" out of .archive/, rename it to "git", and report success: the requested skill is not restored and the sibling is gone from the archive. Constrain the fallback to the exact suffix archive_skill() produces, a 14 digit timestamp. The exact-name match and the recursive nested-archive walk are unchanged, so nested and timestamped restores still work; unrelated siblings no longer match. Fixes #47647	2026-06-17 06:04:03 -07:00
Teknium	cbfa018aef	fix(auth): retry Codex device-code login on 429 with clear rate-limit message (#47860) The OpenAI device-code login (POST auth.openai.com/.../deviceauth/usercode) had no retry or 429 handling; a transient throttle from OpenAI surfaced as a bare "Device code request returned status 429" with no guidance, reading as a hard login failure. - Retry the device-code request with capped exponential backoff (honoring Retry-After), up to 4 attempts. - On persistent 429, raise a clear AuthError tagged CODEX_RATE_LIMITED_CODE (classified transient, not a credential problem) with a wait hint. - Apply the same 429 classification to the token-exchange step (same bug class). Unrelated to PR #47399 (Responses-API cache headers); this is the OAuth device-code path in hermes_cli/auth.py.	2026-06-17 05:48:35 -07:00
teknium1	06d907dc4e	fix(dashboard): only run runtime-pid liveness fallback against local status get_runtime_status_running_pid() validates liveness with a local os.kill(pid, 0) probe. In /api/status the runtime record can be the REMOTE health-probe body (cross-container), whose PID belongs to another host and is display-only — probing it locally is wrong and trips the test live-system guard (os.kill on a PID outside the test subtree). Run the fallback only against the local read_runtime_status() record.	2026-06-17 05:40:57 -07:00
teknium1	dc86d48a3e	fix(dashboard): use await-safe config-only scope for /api/status profile _profile_scope swaps process-global skills_tool/skill_manager module attrs under an RLock; /api/status holds that scope across the run_in_executor remote-health probe await, so a concurrent /api/skills?profile=X request can cross-restore the status profile's skill dir on its finally. Add _config_profile_scope (contextvar-only, task-local, await-safe) and use it for status, which only resolves get_hermes_home() at call time for config/env/gateway state and never needs the skills-module globals.	2026-06-17 05:40:57 -07:00
Shannon Sands	674e8b098a	Fix dashboard gateway profile scoping	2026-06-17 05:40:57 -07:00
Teknium	f80381c456	feat(prompt): scale context-file cap to model window + point agent at truncated file (#47846) Context files (AGENTS.md, CLAUDE.md, .hermes.md, .cursorrules, SOUL.md) were hard-capped at a flat 20K chars before head/tail truncation. Among the agent harnesses we track, only Codex caps project docs at all (32 KiB); Claude Code, OpenCode, and Cline load them whole. The flat 20K predates large context windows and silently truncates real-world AGENTS.md files. B - dynamic cap: when context_file_max_chars is unset (now the shipped default), the cap scales with the model's context window (ctx_tokens * 4 * 0.06, floor 20K, ceiling 500K). Small-context models stay at the historical 20K; a 200K model gets 48K; large models stop truncating real docs. An explicit context_file_max_chars still wins. Context length is resolved once per conversation (stable -> prompt cache untouched). C - when truncation does happen, the marker now names the concrete file path and tells the agent to read_file it for the full content. Validation: 154 targeted tests + full agent/ + hermes_cli/ + test_config (0 failures); E2E against a real 60K AGENTS.md confirms small windows truncate with the path-bearing marker, large windows load whole, and the system prompt is byte-stable across rebuilds.	2026-06-17 05:40:26 -07:00
teknium1	49ef0241eb	chore(release): map Adolanium author email for PR #44628 salvage	2026-06-17 05:40:15 -07:00
Adolanium	f4100f4394	fix(desktop): list markers and quote border follow RTL message direction unicode-bidi:plaintext (#44596) resolves text direction per line, but list markers and the blockquote border are box chrome driven by the CSS direction property, which plaintext never sets, so an RTL list renders its numbers stranded at the far left edge. CSS cannot close this gap (:dir() only reads the dir attribute, never plaintext resolution), so ul/ol/blockquote carry dir="auto" and the browser resolves their box direction natively while the plaintext rules keep owning the text. Inline code carries dir="ltr", which HTML's auto algorithm skips, matching the no-vote contract the CSS isolate already gives it.	2026-06-17 05:40:15 -07:00
Max Freedom Pollard	fc1119ca66	fix(curator): stop the rollback safety snapshot from pruning its target Rolling back to the oldest curator snapshot failed and deleted that snapshot. rollback() takes a safety snapshot first, and snapshot_skills() ends by pruning the backups directory down to keep (5 by default). At the steady keep limit that prune removed the oldest snapshot, which is the very one being restored, so the extract found no skills.tar.gz and the rollback stopped with "snapshot extract failed (state restored)". Thread an optional protect set through snapshot_skills() into _prune_old() so the pre rollback safety snapshot can never evict the snapshot being restored. Add two regression tests covering restore of the oldest snapshot at the keep limit. Fixes #47612	2026-06-17 05:40:05 -07:00
Teknium	7bbffceb9c	feat(curator): make skill consolidation opt-in (prune stays default-on) (#47840) The curator now defaults to prune-only: the deterministic inactivity pass (mark stale / archive long-unused skills) still runs whenever the curator is enabled, but the opinionated LLM umbrella-building consolidation fork is OFF by default. - agent/curator.py: add DEFAULT_CONSOLIDATE=False + get_consolidate(); gate the forked aux-model review in run_curator_review behind it (new consolidate param, None=read config). When off, the LLM pass is skipped entirely (no aux-model cost); the run is still recorded and reported. - config.py: add curator.consolidate (default false); v29->v30 migration seeds the key for existing installs without clobbering a user-set value. - hermes_cli/curator.py: 'hermes curator run --consolidate' override; status shows consolidate state; prune-only notice on run. - docs + tests.	2026-06-17 05:20:32 -07:00
Teknium	e48803daec	fix(gateway): defer macOS launchd reload when run inside the gateway tree (#47842) When refresh_launchd_plist_if_needed() runs from inside the gateway's own launchd process tree (agent-initiated self-update via the terminal tool), a direct launchctl bootout tears down the service's process group, including the CLI doing the refresh, before the follow-up bootstrap can run. The gateway is left unloaded and KeepAlive can't revive it (#43842). Detect in-service execution via gateway.status.get_running_pid() + _is_pid_ancestor_of_current_process(), and delegate the bootout->bootstrap to a detached (start_new_session=True) helper that survives the process-group teardown. The normal out-of-tree CLI path is unchanged. Fixes #43842.	2026-06-17 05:19:21 -07:00
kyssta-exe	4d39a603d1	fix(codex): restore session_id/x-client-request-id HTTP headers for cache routing (#47335)	2026-06-17 05:13:12 -07:00
Brooklyn Nicholson	435c706e8e	fix(desktop): stop a failed turn leaking into every other thread A turn that ends in an error (e.g. an out-of-funds state) was being re-rendered in unrelated threads. On a warm thread switch the on-screen `$messages` still belongs to the previously viewed thread, and `flushPendingViewState` fed it into `preserveLocalAssistantErrors`, which grafted the prior thread's failed turn onto the newly opened one. Because the polluted view then became the next switch's baseline, the error cascaded into every thread the user visited. Only carry local errors across a view flush when the on-screen baseline is the same session being flushed; the cached state we publish already retains that session's own errors. Also surface the turn error as a global toast even when the failing turn ran in a background thread, since the error blocks all subsequent interactions until the user acts.	2026-06-17 05:07:48 -07:00
kshitij	f9c8d95e43	Merge pull request #47723 from NousResearch/salvage/oauth-mcp-prefix fix(anthropic): no single-underscore mcp_ tool names on the OAuth wire (plan-limit billing)	2026-06-17 13:26:02 +05:30
kshitijk4poor	b70a4e7533	fix(anthropic): also normalize MCP-server tool names to mcp__ on OAuth wire The double-underscore prefix swap fixed bare native tools but SKIPPED tools already named mcp_<server>_<tool> (real MCP servers, e.g. mcp_linear_get_issue): they went on the OAuth wire single-underscore and still tripped Anthropic's third-party billing classifier -> HTTP 400 'extra usage, not plan limits'. Verified empirically against a live Max subscription: a single mcp_ tool flips the whole request to the extra-usage lane; mcp__ is accepted. - build_anthropic_kwargs: promote ANY leading single-underscore mcp_ to mcp__ (bare names -> mcp__name; mcp_<server>_<tool> -> mcp__<server>_<tool>), never double-prefixing an already-mcp__ name. Same for tool_use blocks in history. - normalize_response: reverse the mcp__ wire name back to whichever original the registry knows — the single-underscore mcp_<server>_<tool> form for MCP server tools, or the bare name for native tools — preferring a name that already resolves natively. - Tests rewritten to assert the invariant: ZERO single-underscore mcp_ names reach the OAuth wire, and the mcp__ round-trip resolves back to the registered name for both native and MCP-server tools. Builds on liuhao1024's mcp__ prefix commit (cherry-picked). Closes the MCP-server gap that left any session with an MCP server configured still billing to extra usage.	2026-06-17 13:20:29 +05:30
liuhao1024	3d37869295	fix(anthropic): use double-underscore mcp__ prefix for OAuth tool names Anthropic's Claude-Code request classifier treats tool names with a single-underscore `mcp_<x>` prefix as non-Claude-Code / third-party, routing the request to extra-usage billing (HTTP 400). Real Claude Code uses double underscores: `mcp__<server>__<tool>`. Change the tool-name prefix from `mcp_` to `mcp__` in both the outgoing path (build_anthropic_kwargs) and the incoming path (normalize_response). Update the skip-guard to check for both `mcp_` and `mcp__` prefixes so native MCP server tools (which use the legacy single-underscore format) are not double-prefixed. Fixes #46675	2026-06-17 13:12:23 +05:30
kshitij	9901141d64	Merge pull request #47701 from kshitijk4poor/salvage/cli-completer-keystroke-latency fix(cli): keep typing responsive by running completion off the UI event loop	2026-06-17 12:42:50 +05:30
kshitijk4poor	ca6542f602	docs(cli): note URL exclusion in _extract_path_word docstring The docstring described a token as path-like when it contains a "/" separator, but the keystroke-latency fix now excludes "://" scheme tokens (URLs) even though they contain "/". Document the exclusion so the contract matches the behavior.	2026-06-17 12:36:01 +05:30
kshitijk4poor	fbaad3031a	test(cli): URL tokens must not trigger filesystem path completion Regression coverage for the keystroke-latency fix: a URL token contains "/", so the bare-slash path heuristic used to return it as a path word and run os.listdir on every keystroke. Assert _extract_path_word rejects http/https/ssh scheme tokens, that ordinary paths (incl. a bare colon) are unaffected, and that the completer never touches the filesystem for a URL under the cursor.	2026-06-17 12:33:56 +05:30
xxxigm	f48b312037	fix(cli): keep typing responsive by not blocking the keystroke loop The interactive CLI input box runs its completer with `complete_while_typing=True`, so `SlashCommandCompleter.get_completions` is invoked on every keystroke. That completer does blocking I/O: fuzzy `@`-file indexing shells out to `rg`/`fd` (up to a 2s timeout) and file-path completion calls `os.listdir` + `stat`. Because the completer was passed inline (never wrapped in `ThreadedCompleter`), all of this ran synchronously on the prompt_toolkit event loop, stalling the render after each key — very noticeable on WSL2 and other slow-filesystem setups ("typing in the prompt box being very latent"). Two fixes: - Wrap the input completer in `ThreadedCompleter` so completion work runs off the UI event loop and never blocks rendering between keystrokes. - Stop treating URLs as file paths in `_extract_path_word`: a token like `https://example.com/x` contains `/`, so it triggered `os.listdir` on every keystroke while typing/pasting a link (listing a bogus `https:` dir) for a completion that can never be useful. Skip any token with a `://` scheme separator. (cherry picked from commit `b5be2ba276`)	2026-06-17 12:32:38 +05:30
Brooklyn Nicholson	b82eca2beb	fix(desktop): isolate message render crashes from the root boundary Streamdown runs our `preprocess` inside its own useMemo, and the user bubble runs `extractEmbeddedImages`/directive parsing inside theirs — so anything thrown while rendering one message (a regex/stack overflow on adversarial content) escapes to the ROOT error boundary and takes down the entire app, as seen in a reported `RangeError: Maximum call stack size exceeded` from a single message. Wrap both the assistant preprocess pipeline and the user-message directive passes in try/catch that degrade to the raw text. One bad message now renders plain instead of nuking the transcript.	2026-06-17 00:46:17 -05:00
Brooklyn Nicholson	547a014e7e	fix(desktop): avoid stack overflow rendering huge fenced blocks `normalizeFenceBlocks`/`pushProseFence` appended block bodies with `out.push(...lines)`, which spreads every line as a separate call argument. A single message carrying a large fenced block (a logged minified bundle, base64 blob, or big tool dump — common in long sessions) overflows V8's argument-count limit and throws `RangeError: Maximum call stack size exceeded`, breaking the transcript render. Compression doesn't save us: it gates on tokens vs. window, not a single message's line count, and the protected recent tail renders verbatim regardless. Append iteratively via a small `extend()` helper. Behavior is identical for normal-sized blocks.	2026-06-17 00:34:59 -05:00
Bartok	5e01a5dbf1	fix(cli): detect containerd/CRI cgroup-v2 containers in is_container() (#47131) Closes #47111 is_container() only recognized Docker (/.dockerenv), Podman (/run/.containerenv), and docker/podman/lxc markers in /proc/1/cgroup. Under cgroup v2 (Kubernetes/k3s on containerd or CRI-O) /proc/1/cgroup collapses to a single "0::/" line with no runtime marker, so is_container() returned False on every containerd/CRI pod. That false negative bypassed container-aware behavior across the CLI. The most damaging case (reported): even after #46290 fixed detect_service_manager() to gate on _s6_running() alone, other is_container() call sites (profile home resolution, gateway behaviors, config, doctor) still misbehave on containerd. Broaden detection conservatively: - KUBERNETES_SERVICE_HOST env var (present in every k8s pod). - kubepods/containerd/crio markers in /proc/1/cgroup (cgroup v1 nested). - same markers in /proc/self/mountinfo as a cgroup-v2 fallback. Tests: 3 new (k8s env, kubepods cgroup, cgroup-v2-via-mountinfo) plus the existing negative case hardened to stub mountinfo + env; 108 constants + service_manager tests pass.	2026-06-17 12:11:31 +10:00
teknium	36ae958473	feat(gateway): gate message timestamps behind opt-in (default off) Follow-up to salvaged PR #41633: the timestamp prefix injection was unconditional. Gate the in-context render behind gateway.message_timestamps.enabled (default false) at both the live-message and history-replay sites; timestamp metadata is still captured + persisted regardless so the toggle can be flipped on later. Add DEFAULT_CONFIG entry, docs, and gate tests.	2026-06-16 15:49:59 -07:00
Wolfram Ravenwolf	bd7fc8fdcd	feat(gateway): inject stable human-readable message timestamps Consolidates these related Amy fork patches: - 429830f39 feat(gateway): inject message timestamps into user messages for LLM context - 3c3d6fac0 fix: handle both ISO string and epoch float timestamps in history replay - 2874f7725 feat: human-friendly timestamp format with weekday and timezone name - 3735f4c8b fix: render gateway message timestamps once	2026-06-16 15:49:59 -07:00
brooklyn!	b7f0c9cd52	fix(desktop): honor pre-session model pick + restore global reasoning/speed defaults (#47447) * fix(desktop): keep the pre-session model pick selected in the picker The composer picker derived its "current" row from `model.options ?? store`, so model.options always won. Pre-session that query returns the PROFILE DEFAULT, not the sticky composer pick, so selecting a model before a session exists left the checkmark (and the picker's "current" line) on the default, making the pick look ignored even though the pill updated. Add `currentPickerSelection()`: with a live session the gateway's model.options is authoritative; pre-session the sticky `$currentModel`/`$currentProvider` wins, falling back to options. Wire it into ModelMenuPanel and ModelPickerDialog. * feat(desktop): global reasoning/speed defaults in Settings → Model The composer picker is now sticky-UI/per-session only and never writes the profile default (#46959), but Settings → Model had no reasoning/speed control and `agent.reasoning_effort` wasn't in the curated config surface at all (`service_tier` was buried in Advanced), so there was nowhere to set the profile default that crons/subagents/messaging resolve from. Add capability-gated Reasoning (effort) + Fast controls beside the main model, gated by the applied model's reported capabilities (reasoning defaults on, fast off when unreported, same as the composer). They read/write `agent.reasoning_effort` and `agent.service_tier` by round-tripping the config record, matching the gateway's value semantics (service_tier "fast"/"priority"/"on" ⇒ fast). * refactor(desktop): don't open the reasoning select from its row label A <label> wrapping the Select forwarded text clicks to the trigger, opening the dropdown unexpectedly. Plain row for reasoning; Fast stays a <label> so clicking its text toggles the switch (expected for a checkbox-like control).	2026-06-16 16:22:09 -05:00
xxxigm	d1ecebcbfd	fix(desktop): re-download Electron binary via mirror when pack fails (#47266) (#47276) * fix(desktop): re-download Electron binary via mirror when pack fails (#47266) Since #38673 pinned build.electronDist to node_modules/electron/dist, electron-builder reads the Electron binary straight from there and never downloads it during `npm run pack`. That dist tree is only produced by the electron package's postinstall (install.js) during `npm ci`. When that download is blocked or throttled (GitHub's release host is unreachable in some regions), the dist is missing and the build dies with: The specified electronDist does not exist: .../node_modules/electron/dist The existing ELECTRON_MIRROR fallback in all three desktop-build paths (scripts/install.ps1, scripts/install.sh, and `hermes desktop` in hermes_cli/main.py) re-ran `npm run pack` with ELECTRON_MIRROR set, but pack never downloads Electron anymore, so the mirror was never used and the retry re-read the same missing dist. The fallback was effectively dead. Drive the mirror through electron's own downloader instead: - Add a dist-presence check + a downloader helper (Test-ElectronDist / Restore-ElectronDist, _electron_dist_ok / _restore_electron_dist, _electron_dist_ok / _redownload_electron_dist) that wipes a partial dist + the path.txt version marker (electron's install.js short-circuits on it) and re-runs `node install.js`, optionally via a mirror. - On the first retry, repopulate a missing dist from the canonical source; on the mirror retry, re-fetch through npmmirror.com, then pack. - Gate the re-download on the dist check so an unrelated build failure (tsc/vite) doesn't trigger a pointless ~200 MB refetch, and skip the final pack when the binary still can't be fetched instead of failing the same way.
* test(desktop): cover Electron dist re-download mirror fallback (#47266) Add behavior coverage for the electronDist re-download fix: - _electron_dist_ok across linux/win32/darwin, including the partial-dist case (dir present but binary missing) that makes the pinned electronDist fail. - _redownload_electron_dist: no-op when the binary is present, bail when install.js is absent, wipe a stale dist + path.txt marker and run electron's downloader with ELECTRON_MIRROR injected, and report failure when the download still produces no binary. - `hermes desktop`: the mirror fallback now drives electron's own downloader before re-running pack, and skips the final pack entirely when the binary can't be fetched. Replaces the old mirror test that asserted the (now-fixed) dead behavior of re-running `npm run pack` with ELECTRON_MIRROR set — pack never downloads Electron under the pinned electronDist, so that retry could never help.	2026-06-16 15:40:55 -05:00
teknium1	db44af004c	test(model-picker): cover two overlapping user-defined custom providers Guards that two user-defined custom endpoints exposing an overlapping model each keep their full catalog — the dedup must never cross-filter two user-defined rows against each other.	2026-06-16 13:09:40 -07:00
liuhao1024	1b962f001e	fix(models): pass model.base_url to fetch_models in /model picker The /model interactive picker resolved a base_url from user credentials but never passed it to ProviderProfile.fetch_models(), causing the picker to always query the provider's hardcoded default endpoint instead of the user's custom URL (e.g. a company litellm proxy). - providers/base.py: add optional base_url parameter to fetch_models() - hermes_cli/models.py: pass resolved base_url to fetch_models() - Update all subclass overrides for signature compatibility - Add 6 regression tests covering override, fallback, and integration	2026-06-16 13:09:40 -07:00
Wolfram Ravenwolf	9137b86a52	fix(skills): ignore support docs in skill discovery Support files under references/, templates/, assets/, and scripts/ are progressive-disclosure data loaded through skill_view(..., file_path=...). They should not be treated as standalone skills during discovery or collision checks. This prevents archived skill packages or support markdown files inside a real skill from shadowing active skills with the same name while still allowing top-level categories named scripts/templates/assets/references. Tests cover: - pruning nested SKILL.md files inside skill support directories - preserving support-named top-level categories - avoiding skill_view collisions from support markdown - keeping archived package SKILL.md files accessible only through file_path	2026-06-16 13:08:34 -07:00
teknium1	7493de7fc3	test(model-switch): cover section-3 no-auth probe; map chimpera author Salvage follow-up for PR #29575: add regression tests for the section-3 no-api_key /v1/models probe (probes bare endpoints, skips when explicit models set) and add the contributor AUTHOR_MAP entry.	2026-06-16 13:07:52 -07:00
chimpera	1039e90b5e	fix(model-switch): probe /v1/models for providers without api_key Section 3 of list_authenticated_providers (user-defined endpoints from the providers: config section) required an api_key before probing the endpoint's /v1/models for live model discovery. This broke local self-hosted backends (llama.cpp, Ollama, vLLM, etc.) that don't require authentication — they would only ever show the single default_model from config instead of the full model catalog. Section 4 (custom_providers list) already handled this correctly with the policy: probe when api_key is set OR when no explicit models are configured. Apply the same logic to Section 3 so local backends get full model discovery without requiring a placeholder api_key workaround. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-16 13:07:52 -07:00
teknium1	8ed16a7a0c	test(telegram): rich-reply recovery via send-time index Cover #47375 fix: record-on-rich-send + lookup-on-reply round trip, lookup miss leaving reply_to_text None, and precedence (native quote and echoed caption both win over the index fallback).	2026-06-16 13:04:20 -07:00
teknium1	3f80bcac56	chore(release): AUTHOR_MAP entry for x1erra (Sierra)	2026-06-16 13:04:20 -07:00
Sierra (Hermes Agent)	01ae9b853e	fix(telegram): resolve replies to rich (sendRichMessage) messages Telegram does not echo a sendRichMessage's content back in reply_to_message (.text/.caption empty, .api_kwargs None), so replies to rich sends (briefings, the gateway's own rich finals) arrived with no quotable text and the [Replying to: ...] injection was skipped. Remember message_id -> text at send time in a best-effort JSON index (gateway/rich_sent_store.py), and recover it on inbound when text and caption are both empty. Best-effort and no-throw throughout: any failure degrades to prior behavior and never breaks a send or message. Salvaged from #47375 by @x1erra. Dropped the cross-platform run.py reply-prefix rewrite (out of scope; bloated every reply on every platform) and scrubbed a docstring reference to an out-of-repo script. Kept the inbound reply_to logging enrichment used to verify the fix.	2026-06-16 13:04:20 -07:00
teknium1	db01910e3a	chore(release): map cyb0rgk1tty noreply email for AUTHOR_MAP Salvage follow-up for PR #46921 — CI matches contributor authorship on the commit email, which is the GitHub noreply form.	2026-06-16 13:04:07 -07:00
cyb0rgk1tty	b7fa62c530	fix(inventory): keep user-defined custom providers in model dedup The #45954 model-dedup builds `user_models` from every is_user_defined row, then strips those model IDs from every row where is_aggregator(slug) is True. But is_aggregator() returns True for every `custom:*` slug, and list_authenticated_providers emits named custom providers with slug `custom:<name>` and is_user_defined=True. So a user's own custom provider is treated as an aggregator and filtered against user_models — which holds exactly its own models (the row helped build that set). Every model is removed, the row drops to zero, and the provider disappears from the model picker. Guard the dedup loop to skip is_user_defined rows: a user's configured provider is never an aggregator duplicate of itself. Built-in aggregators (openrouter, etc.) are still deduped as before. Adds a regression test.	2026-06-16 13:04:07 -07:00
Jaaneek	f4ef70f6fc	docs(xai): update default model references to grok-build-0.1 Reflect the default-model change in the xAI Grok OAuth guide, the web search docs (EN + zh-Hans), and the web provider docstring. grok-4.3 is kept in the model tables as the previous default; the Nous/OpenRouter aggregator catalog still lists grok-4.3 and is left unchanged.	2026-06-16 11:50:17 -07:00
Jaaneek	bbc842d31e	feat(xai): default to grok-build-0.1 Switch the default model for the xAI/Grok provider and the xAI web search backend from grok-4.3 to grok-build-0.1. grok-build-0.1 is already recognized by the model metadata, so no new model definition is required; grok-4.3 remains selectable.	2026-06-16 11:50:17 -07:00
teknium	28f92478e3	test(hooks): cover session:compress event; drop dead import Follow-up to salvaged PR #41624: - Remove stray urllib.parse import in run_agent.py (cherry-pick cruft, unused) - Add tests: session:compress emits with correct context, no-callback is safe, and a callback exception does not break compression	2026-06-16 11:45:36 -07:00
Wolfram Ravenwolf	e76e7b5073	feat(hooks): session:compress event_callback for MemPalace sync	2026-06-16 11:45:36 -07:00
kshitij	8fa562a399	Merge pull request #47391 from kshitijk4poor/feat/add-glm-5.2 feat: add z-ai/glm-5.2 to OpenRouter and Nous model lists	2026-06-17 00:02:05 +05:30
brooklyn!	44e5848e74	feat(desktop): stream subagent activity into watch windows (#47060) * feat(desktop): stream subagent replies into watch windows A desktop watch window resumes a child session lazily (no full agent) and mirrors the parent-relayed `subagent.` events into native child-session stream events. The child's streamed reply text was never relayed, so the window sat blank while the subagent "talked". - delegate_tool: forward the child's `run_conversation` stream tokens up the progress relay as `subagent.text` (inert under CLI/TUI — their progress handlers ignore non-tool event types; only a gateway watch window mirrors it). - server: mirror `subagent.text` -> `message.delta` on the child sid only, and skip the parent emit (per-token frames are meaningless on the parent session, which shows the child via the spawn tree). Demote `subagent.start` to a one-time goal header and drop the noisy `subagent.progress` mirror — tools already mirror natively. - server: guard `_start_agent_build` so a lazy watch session spectating an in-flight child stays lazy; incidental RPCs were upgrading it to a full agent mid-stream and silently killing the mirror. fix(desktop): keep watch-window chat clear of titlebar chrome Secondary windows (new-session scratch, subagent watch, cmd-click pop-out) hide the titlebar tool cluster + session header, so the transcript ran to the window's top edge and streamed text slid up under the OS traffic lights. - Gate the hidden chrome on `isSecondaryWindow()` everywhere (app-shell, chat header, thread list) instead of the narrower new-session flag. - Add a fixed opaque drag-strip at the top of the secondary-window transcript: content padding alone scrolls away with the text, so the strip masks anything behind it and keeps the window draggable like the main header.
* fix: WSL subagent window * fix: subagent window top padding --------- Co-authored-by: Austin Pickett <pickett.austin@gmail.com> Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-06-16 14:30:11 -04:00
teknium	6ebc449915	fix(prompt): isolate truncation warnings per context Follow-up to salvaged PR #41619: replace the module-global _truncation_warnings list with a contextvars.ContextVar so concurrent gateway-session prompt builds can't drain or clear each other's pending warnings (cross-session leak). Adds a context-isolation test.	2026-06-16 11:28:35 -07:00
Wolfram Ravenwolf	f6a42b1acf	feat(prompt): make context-file truncation limit configurable PROBLEM: Automatic context files such as SOUL.md and AGENTS.md were capped by a hardcoded CONTEXT_FILE_MAX_CHARS value. Amy's local fork had raised that constant from 20K to 25K so a larger SOUL.md would not be silently truncated, but the hardcoded 25K value changed upstream default behavior and made the patch less generally useful. SOLUTION: Restore the upstream-compatible 20K default, add a context_file_max_chars config setting for users who intentionally keep larger identity/project-context files, keep chat-visible truncation warnings, and document the new setting. Tests cover the default, config override, explicit max_chars precedence, and the warning text.	2026-06-16 11:28:35 -07:00
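
Commit 6ebc449915 above replaces a module-global warnings list with a `contextvars.ContextVar` so concurrent prompt builds cannot drain each other's pending warnings. The following is a minimal sketch of that isolation pattern, not the repository's code: the names `_pending_warnings`, `add_warning`, and `drain_warnings` are hypothetical, and an immutable tuple is used as the stored value (an assumption made here) so that copied contexts never share a mutable list.

```python
import contextvars

# Hypothetical stand-in for the per-context warning store described in the
# commit message; the real variable and helper names may differ.
_pending_warnings = contextvars.ContextVar("pending_warnings", default=())


def add_warning(message):
    """Record a warning in the current context only."""
    _pending_warnings.set(_pending_warnings.get() + (message,))


def drain_warnings():
    """Return and clear the warnings recorded in the current context."""
    pending = _pending_warnings.get()
    _pending_warnings.set(())
    return list(pending)


# Two independent contexts stand in for two concurrent gateway sessions.
ctx_a = contextvars.copy_context()
ctx_b = contextvars.copy_context()

# Session A records a truncation warning inside its own context.
ctx_a.run(add_warning, "session A: SOUL.md truncated")

# Session B drains first; with a module-global list this would steal A's warning.
drained_by_b = ctx_b.run(drain_warnings)
drained_by_a = ctx_a.run(drain_warnings)

print(drained_by_b)
print(drained_by_a)
```

Under asyncio each task already runs in its own copy of the context, so the explicit `copy_context()` calls are only needed here to simulate separate sessions in a synchronous script.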

11884 commits