hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-27 11:22:03 +00:00

Author	SHA1	Message	Date
teknium1	f284d85efa	fix(cron): restore [SILENT] silence + suppress empty-turn explainer on Telegram Scheduled jobs delivering to Telegram/etc. started posting a literal '⚠️ No reply: the model returned empty content…' message instead of staying silent. Two interacting causes: 1. The turn-completion explainer (#34452) replaces an empty model turn with a user-facing '⚠️ No reply…' string. In a cron context that is not a silence marker, so the scheduler delivered it — a regression from the previously-silent empty turn. run_job now detects the explainer text deterministically (via the same formatter that produced it) for abnormal-empty turn_exit_reasons and strips it to empty, so the existing empty-response suppression + soft-fail guard apply. The explainer is unchanged on CLI/gateway. 2. The cron suppression used a loose 'SILENT_MARKER in ...upper()' substring check. It leaked bracketless near-markers the model emits ('SILENT', 'NO_REPLY', 'NO REPLY' — #51438, #46917) and wrongly swallowed a real report that merely quoted '[SILENT]' mid-sentence. Replaced with _is_cron_silence_response(): suppresses a canonical token as the whole response, its own first/last line, or the documented bracketed '[SILENT] <note>' prefix — while a token buried mid-sentence in a genuine report is delivered. Preserves the intentional cron trailing/prefix tolerance (existing tests unchanged). Tests: bracketless-variant suppression, mid-sentence-quote delivery, direct matcher contract, and explainer-strip + defensive real-report delivery.	2026-06-25 13:45:09 -07:00
kshitij	42bea9e298	Merge pull request #52618 from NousResearch/salvage/14185-todo-coercion fix(tools): defensive type coercion in todo_tool for malformed LLM input (#14185)	2026-06-26 02:02:18 +05:30
infinitycrew39	d40b5735a4	test(telegram): cover table auto-rich and topic routing Assert bare tables upgrade to sendRichMessage under default/opt-out config, DM-topic resumed sends without reply anchors, and rich finalize edits carry forum topic routing metadata.	2026-06-25 13:10:54 -07:00
infinitycrew39	9d225fbf4e	fix(telegram): auto-rich pipe tables and topic routing for sendRichMessage Pipe-only markdown tables now use sendRichMessage even when rich_messages is off, and resumed DM-topic sends route via direct_messages_topic_id without requiring a reply anchor. Rich finalize edits forward topic kwargs.	2026-06-25 13:10:54 -07:00
teknium1	92b5987ca2	chore: add herbalizer404 + pyxl-dev to AUTHOR_MAP for auxiliary fallback salvage	2026-06-25 13:08:18 -07:00
teknium1	0d777453fa	fix(auxiliary): fall back when a route can't run the model at all (400 capability mismatch) The salvaged context-window screen (#52392) skips fallback candidates that are too small, and the rate-limit/403 fixes skip candidates that are at capacity. A third hard failure remained uncovered: a fallback that builds a client fine but returns a 400 because it structurally cannot run the model. The canonical case is a configured openai-codex / ChatGPT-account fallback asked to compress a glm-5.2 conversation: 400 - {'detail': "The 'glm-5.2' model is not supported when using Codex with a ChatGPT account."} This is a request-validation error, so should_fallback was False and the explicit-provider gate blocked it — the auxiliary task (compression) aborted every turn, dropping middle turns without a summary and churning the session, which is exactly what destroys the prompt cache. Adds _is_model_incompatible_error() (400 + capability phrasing, excluding not-found and billing 400s which the sibling predicates own) and treats it as a fallback-worthy capacity error in both sync and async call_llm, so the chain skips the incapable route and continues to the next viable candidate.	2026-06-25 13:08:18 -07:00
Tranquil-Flow	e4d026aa3b	fix(auxiliary): screen fallback chain by context window for compression (#52392) The runtime auxiliary fallback chain (_try_configured_fallback_chain and _try_main_fallback_chain) returned the first reachable candidate without checking whether the candidate's context window was large enough for the task. For task='compression' this meant a reachable but undersized fallback (e.g. 32K) could be selected and then fail, even when a later larger-context fallback was available. This adds two small helpers: _task_minimum_context_length(task), which returns MINIMUM_CONTEXT_LENGTH (64K) for compression and None for other tasks (vision, web_extract, etc.); and _candidate_context_window(provider, model, ...), a thin wrapper around get_model_context_length that returns None on probe failure so unknown/custom endpoints pass through unchanged (preserves the existing fallback surface). Both fallback loops now skip reachable candidates whose resolved context is below the task minimum and continue iterating. The success path (first viable candidate wins) is unchanged. Return shape and ordering for healthy candidates are preserved. Six regression tests cover: L2 configured chain skips too-small candidate; L2 chain continues after skipping, returns last viable; L3 main chain skips too-small candidate; L4 unknown-context candidate passes through; L5 non-compression task is not filtered; L6 minimum constant matches MINIMUM_CONTEXT_LENGTH (64K). 3/6 fail on upstream/main without the production change (verified); all 6 pass with the fix. Full test_auxiliary_client.py suite (231 tests) and related compression tests (130 tests) remain green.	2026-06-25 13:08:18 -07:00
herbalizer404	b82c83d320	fix(auxiliary): honor fallback chain when compression provider auth is unavailable When an explicit aux provider cannot build a client before any request is sent (missing raw env key, exhausted/unavailable OAuth or credential-pool auth, resolver returning (None, None)), call_llm raised a misleading "no API key was found" error and bypassed the configured fallback_chain entirely. A provider authenticated through Hermes auth / the credential pool (e.g. ollama-cloud) whose pool entry is exhausted hit this path, so compression failed instead of routing to the configured fallback. Adds _try_configured_fallback_for_unavailable_client() and wires it into both sync and async call_llm before the raise, and into the startup compression feasibility check. Salvaged from #51835 by @herbalizer404.	2026-06-25 13:08:18 -07:00
pyxl-dev	751adfa6b9	fix: include rate-limit in auxiliary capacity-error fallback gate Rate-limit (429) errors on explicit-provider auxiliary tasks were silently failing instead of triggering the fallback chain. The is_capacity_error gate only checked payment and connection errors, excluding rate limits — so when a configured provider like openai-codex hit its rate limit, auxiliary tasks (kanban_decomposer, vision, web_extract, approval, etc.) had zero resilience. Add _is_rate_limit_error() to is_capacity_error at both call sites (sync and async paths) so rate limits trigger fallback regardless of whether the provider was auto-detected or explicitly configured. Fixes #52228	2026-06-25 13:08:18 -07:00
herbalizer404	ff8920299c	fix(auxiliary): treat 403 subscription and session-usage-limit errors as payment errors for fallback Ollama Cloud (and similar) return 403 with bodies like "this model requires a subscription, upgrade for access" or "you have reached your session usage limit, upgrade for higher limits". These are capacity/billing conditions semantically identical to credit exhaustion, but _is_payment_error() did not recognize them (403 missing from the status set; keywords missing), so the configured fallback_chain was never tried and compression failed outright. Adds 403 to the status set and the subscription/session-usage keywords. Salvaged from #49076 by @herbalizer404.	2026-06-25 13:08:18 -07:00
kshitij	ca714f6189	Merge pull request #52653 from kshitijk4poor/salvage/33814-env-quote-hash fix(config): quote .env values containing # to prevent token truncation (#30355)	2026-06-26 01:32:49 +05:30
kshitijk4poor	0654319644	chore(release): map srojk34 legacy prefix-less noreply in AUTHOR_MAP (#50098)	2026-06-25 12:56:05 -07:00
kshitijk4poor	d9bd7ce827	test(compression): pin rotation-fallback tests to in_place=False ahead of default flip These 7 test sites assert rotation behavior (fork, child sessions, lock contention, logging session-context follows id rotation, boundary hooks fire on rotation). Pin each builder to in_place=False explicitly so they keep exercising the retained rotation fallback regardless of the global default (flipped to True in #38763). Rotation stays a working opt-out fallback and deserves continued coverage — these are NOT deleted. Pinned sites: - test_compression_concurrent_fork._build_agent_with_db - test_compression_logging_session_context._build_agent_with_db - test_compression_rotation_state._build_agent_with_db - test_compression_boundary_hook._make_agent (2 helpers: CompressionBoundaryHook + SessionCompressEvent) - test_compression_concurrent_sessions._build_agent_with_db	2026-06-25 12:56:05 -07:00
kshitijk4poor	2107b86024	feat(compression): flip in_place default to True (#38763 ) [2/2] In-place compaction (single durable session id, non-destructive soft-archive) becomes the default. Rotation is now the opt-out fallback via compression.in_place: false. Prerequisite: #50098 (hygiene guard reads result flag not config flag) merged first — without it, flipping the default causes permanent transcript loss on gateway hygiene-compress and /compress when no session_db is available. Blast radius (empirically measured on current main): 7 rotation-asserting tests broke and are pinned to in_place=False in the companion test commit: - tests/agent/test_compression_concurrent_fork.py (2) - tests/agent/test_compression_logging_session_context.py (1) - tests/agent/test_compression_rotation_state.py (1) - tests/run_agent/test_compression_boundary_hook.py (2 _make_agent helpers) - tests/gateway/test_compression_concurrent_sessions.py (2) Rotation stays as a working fallback and deserves continued coverage. Plan: .hermes/plans/in-place-compaction-38763.md	2026-06-25 12:56:05 -07:00
srojk34	510bf40705	fix(gateway): read compaction result flag not config flag in hygiene guard (#50098 ) Salvage of #50098 by @srojk34, cherry-picked onto current main. The hygiene auto-compress guard and the /compress slash command both read compression_in_place (config flag — is in-place mode enabled?) instead of _last_compaction_in_place (result flag — did in-place compaction actually succeed?). Both agents are built without a session_db, so archive_and_compact always fails silently and _last_compaction_in_place stays False. Reading the config flag makes the guard think in-place succeeded, triggering rewrite_transcript() which replaces the original messages with only the compressed summary — permanent data loss. Co-authored-by: srojk34 <srojk34@users.noreply.github.com>	2026-06-25 12:56:05 -07:00
Teknium	2a1e615565	fix: persist non-NULL system prompt on fresh turn setup (#45499 ) (#52616 ) build_turn_context() created the DB session row via _ensure_db_session() before the system prompt was restored/built, so a fresh API/gateway agent carrying client-managed history inserted a row with system_prompt=NULL. That tripped the misleading 'stored system prompt is null; rebuilding from scratch ... investigate the previous turn's write path' warning and a guaranteed first-turn prefix cache miss. Move row creation to after _cached_system_prompt is populated. Verified live (OpenRouter + claude-sonnet-4.5): persistent-agent turns show cache_read jumping to the full prefix on turn 2+ (write 24411 -> read 24411), and the persisted system_prompt is non-NULL so fresh-agent restore keeps the prefix cache warm. Tests: turn-context ordering regression asserting _ensure_db_session runs after _cached_system_prompt is populated.	2026-06-25 12:54:19 -07:00
Teknium	d7021af30f	fix(learn): name distilled skills as author Hermes, not the host OS user (#52388 ) /learn told the agent to fill the skill `author` field, and the system prompt environment probe surfaces the OS login name (user=$(whoami) in prompt_builder.py), so the model wrote the host username into published SKILL.md frontmatter — a privacy leak the user never opted into, and inconsistent run to run as the most-salient identity changed. The /learn authoring prompt now sets `author` to the literal value `Hermes` and explicitly forbids deriving it from the host environment (OS/login user, git config, or any probeable identity). The skill names itself as the tool that wrote it. Closes #52368.	2026-06-25 12:48:08 -07:00
helix4u	4efec63a34	fix(tools): let session_search match session titles	2026-06-26 01:12:26 +05:30
rob-maron	2c02583c2b	fix shape	2026-06-25 12:38:33 -07:00
rob-maron	525ee58b43	krea	2026-06-25 12:38:33 -07:00
sweetcornna	150afea942	fix(config): quote env values containing hash	2026-06-26 00:54:34 +05:30
kshitijk4poor	73c8d5a1e7	fix: use self._session_db directly + add regression test - Replace getattr(self.session_store, '_db', None) with self._session_db (the GatewayRunner's own SessionDB, consistent with existing usage in slash_commands.py L240/L499). - Remove verbose comment referencing a branch name as an issue number. - Update stale comment in run.py that said 'today it has no session_db'. - Add regression test verifying session_db is passed and rotated session is persisted (adapted from #51624 by @LeonSGP43). - Add _session_db=None to _make_runner fixtures in test_compress_command, test_compress_focus, and test_compress_plugin_engine.	2026-06-26 00:50:40 +05:30
Omar B	1a38a8ff7d	fix(gateway): pass session_db to compress temp agents so persistence works Manual /compress and session hygiene auto-compress both create temporary AIAgent instances to run compression. These agents were created without a session_db, so compress_context computed the compressed messages in memory, rotated the session ID, and reported success — but never wrote to the database. The next user message reloaded the original full transcript, making compression appear to do nothing. Fix: pass session_db=self.session_store._db to both temp agents so the session rotation is properly persisted. Also set _end_session_on_close on the /compress temp agent (already done in hygiene path) to prevent cleanup from ending the newly rotated session.	2026-06-26 00:50:40 +05:30
brooklyn!	edf35918be	Merge pull request #52620 from NousResearch/bb/desktop-session-switch-perf	2026-06-25 14:19:59 -05:00
Brooklyn Nicholson	e8561d61e6	test(tui_gateway): pin synchronous-build resume tests to eager_build These three assert the eager build contract — stored runtime overrides / profile db reach _make_agent synchronously, and the agent binds to the compression tip. Under deferred-by-default the build runs off-thread, so they raced the timer (green in CI, flaky locally). Pin them to eager_build; deferred coverage lives in the protocol tests.	2026-06-25 14:13:07 -05:00
David Metcalfe	da73223f4a	fix(desktop): show statusbar item tooltips on hover Statusbar items declared a 'title' string (e.g. YOLO, gateway health, agents, cron, version, context usage) that was populated by use-statusbar-items.tsx but never forwarded to the rendered DOM in StatusbarControls — so every statusbar button/menu/text/link had no hover hint. Wrap the four render branches (menu trigger, text, link, action) in the existing 'Tip' component from components/ui/tooltip.tsx. Tip is self-contained (carries its own Provider), instant (delayDuration=0), themed (bg-foreground/text-background, auto-inverts per theme), and already in use elsewhere in the desktop shell. Renders the child untouched when label is falsy, so items without a title stay zero-cost.	2026-06-25 12:11:17 -07:00
Brooklyn Nicholson	1ca1f9f2c7	refactor(tui_gateway): DRY the deferred-session paths Collapse the duplicated cold-resume / lazy-watch / create scaffolding into shared helpers: _deferred_session_record (the live-session dict minus the agent), _lazy_resume_info (the not-yet-built session.info), _claim_or_reuse_live (lock + double-checked register-or-reuse), and _schedule_agent_build (the pre-warm timer). Net -12 lines, three copies of the ~30-key session dict and the lazy-info block down to one each. No behavior change.	2026-06-25 14:03:03 -05:00
Brooklyn Nicholson	3bf00e459a	perf(desktop): make deferred resume the default, not an opt-in flag Per review: gating the faster path behind a `defer_build` flag that the only caller always sends is pointless. Flip it — `session.resume` now defers the agent build by default for every caller (desktop + Ink TUI); a caller that needs the agent built synchronously passes `eager_build: true` (used by the build-race test). The desktop no longer sends a flag. While verifying the flip, fixed two real parity gaps the deferred path had vs the old eager (`_init_session`) path: - `_enable_gateway_prompts()` was never called on a deferred resume, so approvals/clarify wouldn't route through the gateway prompt callbacks. - `_start_agent_build` never wired `background_review_callback` / `memory_notifications`, so a deferred-built session's self-improvement "💾 …" summary leaked to stdout instead of rendering in-transcript. Wiring it there also fixes it for `session.create` sessions, which build through the same path. ACP is unaffected (it uses its own session_manager, not this RPC); the Ink TUI already consumes the same lazy `info` shape from session.create and upgrades on the later `session.info` event.	2026-06-25 14:03:03 -05:00
Brooklyn Nicholson	c4c590e4a1	perf(desktop): make session switching fast under load Switching sessions in the desktop app could freeze the whole UI for several seconds on heavy, tool-rich chats. Root causes and fixes: - Cold `session.resume` built the AIAgent (MCP discovery, prompt/skill build) before returning, and the desktop awaits that RPC before it paints — so the entire switch blocked on the build. Add an opt-in `defer_build` resume path (the contract `session.create` already uses): return the full display transcript immediately, register an upgradable live session, and pre-warm the agent on a short timer. The persisted runtime identity (model/provider/base_url/api_mode/reasoning/tier) is restored on the deferred build so it can't drop the provider. - Nothing bounded how many in-memory agents accumulate; a user who reconnects often piled up detached sessions for the full 6h TTL. Add a soft LRU cap (`max_live_sessions`, default 16) that evicts the least-recently-active DETACHED sessions (no live client) — never a running, awaiting-input, mid-build, or live-transport one. Reopening re-resumes from disk. - On the prefetch-hit cold-resume path, skip rebuilding a throwaway merged-message array (and its 1000-entry Map) when the prefetch already painted the exact transcript; the downstream sameMessageList guard already drops the publish, so it was pure main-thread cost. The desktop opts into `defer_build` for every non-watch cold resume; the eager path stays for CLI/TUI and existing callers.	2026-06-25 14:03:03 -05:00
kshitij	5de8a8fbe8	Merge pull request #52375 from NousResearch/salvage/47237-dedupe-user-turns fix(gateway): dedupe user turns on transient failure (#47237)	2026-06-26 00:30:59 +05:30
davidgut1982	6208d6b3be	fix(gateway): dedupe user turns on transient failure (#47237) When the gateway persists a user message after a transient provider failure (429/timeout/auth error), subsequent retries of the same Telegram message could stack duplicate user turns in the transcript, causing the agent to fall behind by 1-2 messages. Add has_platform_message_id() to SessionDB (using the existing idx_messages_platform_msg_id partial index) and a SessionStore wrapper. The gateway's transient-failure path checks this before append_to_transcript -- if the platform_message_id is already persisted, the duplicate write is skipped. Salvaged from #47869 by @davidgut1982. Adapted to current main which has additional append sites and an existing content-based dedupe in the exception handler path. Closes #47237	2026-06-26 00:11:17 +05:30
Tranquil-Flow	0be10607d9	fix(tools): defensive type coercion in todo_tool for malformed LLM input (#14185) todo_tool crashed with `AttributeError: 'str' object has no attribute 'get'` when the LLM emitted the `todos` param as a JSON-encoded string instead of an array, or as a list containing non-dict items (observed intermittently on Claude 4.5/4.6/4.7, and after a prior tool-call rejection where the model "self-corrects" by wrapping the list in json.dumps). Three additive guards, no behavior change for well-formed input: - todo_tool(): if `todos` is a str, json.loads it; reject unparseable strings and non-list values with a clear tool_error instead of crashing downstream. - _validate(): non-dict items return a {id:"?", content:"(invalid item)"} placeholder rather than calling .get() on a str/int/None. - _dedupe_by_id(): non-dict items get a synthetic key so _validate handles them. Salvaged from #14785 by @Tranquil-Flow (authorship preserved via cherry-pick). Comprehensive tests: JSON-string coercion (parse / unparseable / non-list / non-string), non-dict list items (str/None/int/mixed), and a well-formed-unchanged regression class; both guards mutation-verified to fail without them. Closes #14185. Supersedes #14187, #22505, #14350 (same fix, less/no test coverage) and #16952 (bundled unrelated scope-creep).	2026-06-25 23:42:42 +05:30
kshitij	d682f320b3	Merge pull request #52147 from NousResearch/salvage/29184-mcp-osv-nonblocking fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184)	2026-06-25 23:39:44 +05:30
kshitij	c210e23a02	Merge pull request #52386 from NousResearch/salvage/31999-yaml-indent fix(utils): unify YAML list indent across all config writers (#31999)	2026-06-25 23:39:37 +05:30
qdaszx	6305ac0e4b	fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184) During stdio MCP server startup, _run_stdio (an async method) called the synchronous check_package_for_malware() inline. That makes a blocking urllib HTTPS POST to api.osv.dev whose own timeout doesn't reliably cover a stalled SSL handshake, so an intermittent network issue froze the entire asyncio event loop for up to ~120s, blowing past the TUI/gateway's 15s startup budget and showing "gateway startup timeout". Run the check via asyncio.to_thread (off the loop) AND bound it with asyncio.wait_for(timeout=_OSV_MALWARE_CHECK_TIMEOUT_S=12s). The malware check is fail-open, so on timeout we log and proceed rather than blocking startup. Salvaged from #29190 by @qdaszx (re-applied on current main; the call site moved since the PR was opened), combining the to_thread approach also proposed in #29192 by @ygd58. Two load-bearing tests: event-loop-not-blocked-during-check and timeout-fails-open, both mutation-verified to fail against the old inline blocking call. Closes #29184. Co-authored-by: ygd58 <buraysandro9@gmail.com>	2026-06-25 23:30:41 +05:30
xxxigm	0aea0c3654	fix(utils): unify YAML list indent across all config writers (#31999) atomic_yaml_write used default yaml.dump which emits indentless sequences (list items at column 0), while atomic_roundtrip_yaml_update (ruamel.yaml) emits 2-space-indented sequences. Cross-path writes to the same config.yaml toggled indentation on every save, eventually producing a mixed-indent file that js-yaml rejects with 'bad indentation of a mapping entry', silently dropping custom_providers and breaking model switching. Add IndentDumper SafeDumper subclass that forces indentless=False, route atomic_yaml_write through it. Route tui_gateway._save_cfg and the Telegram adapter's config writer through atomic_yaml_write so all paths emit the same 2-indent layout. Salvaged from #32034 by @xxxigm. Adapted to current main which already has allow_unicode=True (from #51356) but was missing IndentDumper. Closes #31999	2026-06-25 23:27:44 +05:30
brooklyn!	a53fc78c02	Merge pull request #52594 from NousResearch/bb/queue-resubmit-on-busy fix(tui_gateway): queue mid-turn prompts instead of dropping them on a busy retry	2026-06-25 12:50:18 -05:00
kshitijk4poor	15ee2d6f04	refactor: lightweight sudo count + drop chatty multi-sudo tip Replace _count_real_sudo_invocations (which called _rewrite_real_sudo_invocations and discarded the rewritten string) with a lightweight token scan that reuses the same tokeniser but skips string building. Remove the agent-facing tip about nested sudo in heredocs — the cache-cleared warning is enough.	2026-06-25 23:08:48 +05:30
xxxigm	d93abd75d1	test(terminal): cover sudo cache invalidation and multi-invocation piping	2026-06-25 23:08:48 +05:30
xxxigm	8278d82e17	fix(terminal): improve sudo -S password delivery and cache invalidation Pipe one password line per sudo invocation in compound commands so a correct password is not rejected on the second `sudo` in `sudo a && sudo b`. Drop the session cache when sudo returns Authentication failed, surface sudo_auth_failed in the tool result, and add hints for interactive sessions.	2026-06-25 23:08:48 +05:30
brooklyn!	931a5e92cc	Merge pull request #52592 from NousResearch/bb/close-interrupt-tool-seq-sibling-paths fix(agent): close tool-call sequence on all interrupt aborts (#48879 follow-up)	2026-06-25 12:31:27 -05:00
Brooklyn Nicholson	70319626a9	fix(tui_gateway): queue mid-turn prompts instead of dropping them on a busy retry A prompt sent while a turn was in flight got rejected with 4009 "session busy", which pushed clients (the desktop app) into a deadline-bounded busy-retry. When turn teardown outlived that deadline — e.g. the user hits stop while a slow, non-interruptible tool (web_search, read_file, an MCP call) is mid-flight, since the sequential executor only checks the interrupt flag between tools — the resubmitted message was silently dropped: "it just doesn't listen". Wire the previously-dead display.busy_input_mode config into prompt.submit: instead of rejecting, apply the policy and queue the message to run as the next turn (drained in run()'s tail, ahead of goal/notification follow-ups). Modes: interrupt (default) interrupts the live turn so it winds down promptly then runs the queued message; queue runs it after the current turn finishes; steer injects it into the live turn when accepted, else queues. The queued slot pins the sender's transport and losslessly merges a second arrival. No client deadline, no dropped sends.	2026-06-25 12:29:49 -05:00
Brooklyn Nicholson	2d286a6d00	fix(agent): close tool-call sequence on all interrupt aborts, not just finalize_turn #48879 closed the tool-call sequence on interrupt inside finalize_turn so a /stop after a tool no longer persists a `tool` tail that the next user message turns into a `tool -> user` role-alternation violation (which strict providers like Gemini/Claude react to by hallucinating a continuation and ignoring prior context — what users see as "lost context after stop"). But the retry-wait, error-handling, and post-error retry-wait interrupt aborts in conversation_loop return early and never reach finalize_turn, so they still persisted and returned a raw `tool` tail. Interrupting during provider backoff/rate-limiting (common under heavy work) hit exactly this path. Extract the close into a shared close_interrupted_tool_sequence helper and apply it at every interrupt abort (finalize_turn + the three early returns) so the whole bug class is fixed, not just the one site.	2026-06-25 12:24:34 -05:00
brooklyn!	88e01d92e6	Merge pull request #52591 from NousResearch/bb/desktop-update-adhoc-sign fix(desktop): ad-hoc sign macOS self-update rebuilds	2026-06-25 12:23:59 -05:00
Brooklyn Nicholson	1d9ed7f48a	fix(desktop): ad-hoc sign macOS self-update rebuilds The desktop self-updater rebuilds and re-signs the .app on each user's own machine (`hermes desktop --build-only` -> electron-builder `--dir`). With CSC_IDENTITY_AUTO_DISCOVERY on (its default), electron-builder signs the type=distribution, hardened-runtime bundle with whatever identity is in that user's keychain -- typically a personal "Apple Development" cert -- which stalls/fails the sign step (no Developer ID, no provisioning profile) or clobbers the original notarized signature with an unusable one, tripping Gatekeeper on every post-update launch. Force ad-hoc signing for the local packaged rebuild instead: deterministic, and exactly what _desktop_macos_relaunchable_fixup already finishes off. No-op for source runs, off-macOS, when a real identity is configured (CSC_LINK / APPLE_SIGNING_IDENTITY), or when the caller already pinned the flag.	2026-06-25 12:08:29 -05:00
ethernet	a6a28ce3e2	fix(ci): run CI on all PRs to anywhere. Fixes the stacked-PRs no-checks bug: where main < a < b, a merges into main and b is retargeted to main, but b doesn't run checks since it's not considered a new PR to main. Now b will simply already have passing CI :)	2026-06-25 09:15:20 -07:00
Ben Barclay	d6269da7fd	fix(gateway): harden scale-to-zero dormancy guards (#52359) Block scale-to-zero suspend while background async delegations are active, and restore runtime status to running on real inbound after a dormant wake. Add regression coverage for both review findings.	2026-06-25 20:41:03 +10:00
Teknium	e62afaca62	fix(learn): teach /learn the full CONTRIBUTING.md skill standards (#52372 ) The /learn authoring prompt taught a subset of the HARDLINE skill rules, and stated the <=60-char description rule without making the model enforce it — so generated descriptions overshot (up to 202 chars), which the 60-char system-prompt skill index then silently truncates. - description: add the index-truncation rationale, a count-and-trim self-check, and a good/bad length example so the model actually hits <=60. - add platforms-gating rule (OS-bound primitives -> declare platforms:). - add author-credits-human-first rule. - round out the Hermes-tool framing with the full wrapped-tool mapping and references/templates layout. Closes #52367.	2026-06-25 00:17:23 -07:00
Teknium	60a2feeebf	chore: add benbenlijie to AUTHOR_MAP for PR #47205 salvage	2026-06-25 00:17:17 -07:00
benbenwyb	6f2b2a1f34	fix: handle named custom providers and Z.AI overload retries	2026-06-25 00:17:17 -07:00
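
Commit f284d85efa above describes the cron silence matcher in prose: suppress a canonical token when it is the whole response, its own first or last line, or a bracketed '[SILENT] <note>' prefix, but deliver a report that only quotes the token mid-sentence. A minimal sketch of that contract follows. The token set and the function body are assumptions made for illustration and are not taken from the repository; only the rule itself comes from the commit message.

```python
import re

# Assumed token set: the commit names '[SILENT]', 'SILENT', 'NO_REPLY' and 'NO REPLY'.
_SILENCE_TOKENS = {"[SILENT]", "SILENT", "NO_REPLY", "NO REPLY"}


def is_cron_silence_response(text: str) -> bool:
    """Illustrative matcher: True when a cron reply should be suppressed."""
    stripped = text.strip()
    if not stripped:
        # Empty replies are left to a separate guard in this sketch.
        return False
    lines = [ln.strip() for ln in stripped.splitlines() if ln.strip()]
    # Token is the whole response, or stands alone as the first or last line.
    if lines[0].upper() in _SILENCE_TOKENS or lines[-1].upper() in _SILENCE_TOKENS:
        return True
    # Bracketed prefix form: "[SILENT] <note>".
    if re.match(r"\[SILENT\]\s+\S", stripped, flags=re.IGNORECASE):
        return True
    # A token buried mid-sentence falls through and is delivered.
    return False
```

Compared with a loose substring check, anchoring on whole lines and on the start of the response is what lets a genuine report quote '[SILENT]' without being swallowed.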


12913 commits
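
Commit 6305ac0e4b in the log above moves a blocking check off the asyncio event loop with asyncio.to_thread and bounds it with asyncio.wait_for, failing open on timeout. The sketch below shows that pattern in isolation. `blocking_check`, `preflight`, and the `delay` parameter are hypothetical stand-ins invented for this example (the real check calls api.osv.dev); only the 12s bound and the fail-open behaviour come from the commit message.

```python
import asyncio
import time
from typing import Optional

CHECK_TIMEOUT_S = 12.0  # the commit cites a 12s bound; this constant name is illustrative


def blocking_check(package: str, delay: float = 0.0) -> bool:
    """Hypothetical synchronous check; sleeps instead of making a network call."""
    time.sleep(delay)
    return False  # False means "nothing flagged"


async def preflight(
    package: str, delay: float = 0.0, timeout: float = CHECK_TIMEOUT_S
) -> Optional[bool]:
    """Run the blocking check in a worker thread; return None if it times out."""
    try:
        return await asyncio.wait_for(
            asyncio.to_thread(blocking_check, package, delay), timeout=timeout
        )
    except asyncio.TimeoutError:
        # Fail open: the caller proceeds with startup instead of waiting.
        return None


if __name__ == "__main__":
    print(asyncio.run(preflight("example-package")))
```

One caveat of this pattern: wait_for stops awaiting on timeout, but the worker thread itself is not interrupted and keeps running until the blocking call returns.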