hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-24 10:52:21 +00:00

Author	SHA1	Message	Date
teknium1	d0de4601d2	fix(tui): /compress shows a before/after summary (#46686) The TUI /compress slash side-effect compressed the session, synced the key, and emitted session.info — but returned an empty string, so the user saw no 'Compressed: N → M messages / ~X → ~Y tokens' feedback. The CLI (_manual_compress) and gateway (slash_commands) paths both already call summarize_manual_compression; the TUI slash path was the lone gap. Snapshot history + rough token estimate before and after compaction and return the formatted summarize_manual_compression() feedback, mirroring the session.compress RPC handler. The estimate uses the same estimate_request_tokens_rough(system_prompt, tools) inputs as the RPC path, re-reading the system prompt after compaction (it may be rebuilt). Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-21 11:36:09 -07:00
teknium1	9e4fe32d36	fix(session): opt the background-review fork out of session finalization The background-review fork (fires ~every 10 turns) pins review_agent.session_id = agent.session_id — the parent's LIVE id — for prefix-cache parity, then calls close(). With session finalization now in close(), that would end the still-active parent session mid-conversation. Set _end_session_on_close = False on the fork so the real owner (CLI close / gateway reset / cron) finalizes the session instead. Follow-up to the #12029 fix.	2026-06-21 11:35:09 -07:00
yeyitech	b17180d950	fix(session): finalize owned SQLite session rows on AIAgent.close() Funnel session finalization through AIAgent.close() — the single terminal path every agent (CLI, gateway, subagent, cron) funnels through — so finished agents stop leaving rows with ended_at IS NULL. The biggest leak source was delegate_task subagent + background-review forks whose close() never ended their row. end_session() is first-reason-wins and no-ops on an already-ended row, so a 'compression'/'cron_complete'/'cli_close' reason set by an earlier terminal path is never clobbered. /resume already calls reopen_session(), so finalizing-on-close does not break resumability. Temporary helper agents that rotate/share the session forward (manual compression, gateway session-hygiene) opt out via _end_session_on_close=False. Also stop the long-running gateway heartbeat once the executor is done or the session slot is rebound to a different agent, preventing a stale 'running: delegate_task' bubble from outliving its run. Closes #12029.	2026-06-21 11:35:09 -07:00
teknium1	41e0c10f7e	fix(agent): route repeated-compression warning through _emit_status (#36908) The 'Session compressed N times — accuracy may degrade' warning went through _vprint (CLI stdout only), so the Ink TUI / Telegram / Discord never saw it — unlike the two other compression warnings in the same module, which route through _emit_status (and store _compression_warning for late-bound gateway status_callback replay). Set agent._compression_warning + call agent._emit_status() for this warning too, matching the sibling pattern. _emit_status still _vprints for the CLI, so CLI output is unchanged; TUI / gateway surfaces now receive it via status_callback (and replay_compression_warning can re-deliver it once a late-bound gateway callback is wired). Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-21 11:34:47 -07:00
konsisumer	3e354b61db	fix(agent): preserve copilot routed headers	2026-06-21 11:29:49 -07:00
Teknium	b6a4638b6d	fix(compressor): treat empty-content summary response as failure, not an empty summary (#50297) When an OpenAI-compatible proxy (e.g. cmkey.cn, one-api Anthropic channels) returns a well-formed HTTP 200 whose summary content is null or empty/whitespace-only, _generate_summary coerced it to "" and stored a prefix-only summary — silently replacing the compacted turns with nothing. The model then lost all in-progress context after compression (#11978, #11914). _validate_llm_response already guards None / empty-choices, so those never reach the compressor; the gap was a well-formed response with empty content. Now treat empty content as a summary failure: raise so it routes through the existing main-model fallback then transient cooldown, dropping the turns without a summary rather than wiping context with an empty one. Also narrow the bare 'except RuntimeError' so only genuine 'No LLM provider configured' errors take the 600s no-provider cooldown; empty/invalid-response RuntimeErrors from a configured provider now correctly get the main-model fallback instead of being misrouted into the long no-provider cooldown. Reported by @Hung2124; area identified by @annguyenNous in #39590.	2026-06-21 11:27:07 -07:00
Teknium	296b290f8f	chore(release): add AUTHOR_MAP entry for de1tydev (#10158)	2026-06-21 11:11:23 -07:00
Teknium	41ba90f814	fix(process): keep CLI drain dedup after poll goes read-only (#10156) Follow-up to @de1tydev's poll-read-only fix. Removing the _completion_consumed.add() from poll() fixes the gateway/tui watcher suppression (#10156) but reintroduces the CLI duplicate that #8228 fixed: a notify_on_complete process always enqueues a completion event, and the CLI idle/post-turn drain would re-inject it as a [SYSTEM: ...] message even though the agent already saw the exit inline in its poll result. Add a separate _poll_observed set that poll() populates on an observed exit. drain_notifications() (CLI only) skips poll-observed sessions; the gateway/tui watchers keep checking only is_completion_consumed, so a read-only poll never suppresses their autonomous delivery turn. - _poll_observed pruned alongside _completion_consumed in _prune_if_needed - 4 tests: CLI drain dedup after poll, gateway gate untouched, running poll doesn't mark observed, wait/log still skip CLI drain	2026-06-21 11:11:23 -07:00
Liao Shiwu	6f5f58e34b	fix: keep poll read-only for notify_on_complete watcher	2026-06-21 11:11:23 -07:00
Eugeniusz Gilewski	9078b4bbdf	fix(file): harden read_file device alias blocking Security-hardening fix for the read_file device guard, not a new sandbox boundary. The guard already rejects direct device paths and upstream now has a resolved-path pass for workspace symlinks to blocked devices, but its concrete-path helper still compared the expanded path before normalization. That leaves residual alias cases where the dangerous path is visible before final terminal-specific resolution, for example: 1. /dev/../dev/zero and /dev/./urandom should match the blocked-device list as concrete paths, not only after final realpath; 2. /dev/stdin-style aliases can disappear once realpath follows them to /proc/self/fd/0 and then to a tty path; 3. a user symlink to /dev/../dev/stdin exposes the dangerous intermediate target before final resolution, but not necessarily after it. Normalize expanded paths before matching and inspect each symlink hop before falling back to realpath. This preserves the existing /proc fd and /proc pseudo-file guards while enforcing the intended security invariant: model-supplied read paths must not reach blocking or infinite device streams through spelling, normalization, or symlink-hop tricks. Classification: security hardening / residual bypass fix for the read_file device blocklist. This is defensive code at the file-tool boundary, but it fixes a concrete denial-of-service class tracked as security in #10141 and #29158. Tests: - normalized /dev/../dev/zero and /dev/./urandom aliases - symlink to /dev/../dev/stdin blocked before realpath - existing symlink-to-device and regular-symlink guards still pass Fixes #10141 Fixes #29158	2026-06-21 11:11:19 -07:00
tt-a1i	ea056b0559	fix(telegram): avoid rich messages for CJK text Telegram Mac/Desktop Bot API 10.1 rich-message rendering leaves garbled overlapping draft/overlay glyphs for CJK text (#47653), affecting every message containing CJK characters. The legacy MarkdownV2 path renders the same text cleanly, so skip the rich send / draft / final-edit paths up front for content containing CJK (incl. astral-plane extensions) until affected clients age out. Non-CJK rich rendering is preserved. Fixes #47653	2026-06-21 11:10:37 -07:00
brooklyn!	65a477f12e	feat(desktop): add Update now button to About panel (#50186)	2026-06-21 11:34:45 -05:00
teknium1	2f4f23fbfb	fix(codex): bridge app-server item/started events to Telegram tool-progress (#38835) When the main provider is the Codex app-server runtime (api_mode codex_app_server), the gateway showed no verbose 'running X' tool-progress breadcrumbs on Telegram while every other provider did. The app-server session processes item/started notifications (command execution, file changes, MCP/dynamic tool calls) but never surfaced them as Hermes tool-progress events — the session was constructed without an on_event hook, so the agent's tool_progress_callback was never invoked on this route. Add _codex_note_to_tool_progress() mapping item/started → (tool_name, preview, args) for commandExecution / fileChange / mcpToolCall / dynamicToolCall, and wire an on_event hook into CodexAppServerSession that forwards mapped events to agent.tool_progress_callback('tool.started', ...) — the same signature the chat_completions path uses (tool_executor.py). Non-tool items (agentMessage/reasoning) and non-item/started methods map to None and are ignored. Co-authored-by: jplew <462836+jplew@users.noreply.github.com>	2026-06-21 08:46:06 -07:00
yeyitech	8a506ed3ac	fix(auth): make load_pool() non-destructive for env-seeded credentials load_pool() is meant to be a read, but it persistently pruned env-seeded pool entries whenever the calling process's os.environ lacked the seeding var. A process without MINIMAX_API_KEY would delete the persisted env:MINIMAX_API_KEY entry from auth.json for every other process, causing auth.json to oscillate and auxiliary auto-detect to fall through to the wrong provider. env:* entries are persisted references re-hydrated from the environment on each load — a missing var means "cannot re-seed right now", not "source is gone forever". _prune_stale_seeded_entries now gates env-source removal behind prune_env_sources (default True for explicit cleanup paths); load_pool() passes prune_env_sources=False. File-backed singletons (device-code OAuth, hermes_pkce) still prune when their backing file is gone, and explicit removal via `hermes auth remove` (source suppression) is unaffected. Fixes #9331. Co-authored-by: houko <suzukaze.haduki@gmail.com>	2026-06-21 08:26:37 -07:00
Teknium	a966932392	fix(telegram): exempt tables from rich newline hard-breaks The newline normalization is the shared chokepoint for every rich send (sendRichMessage, draft, and editMessageText). Injecting a Markdown hard break (two trailing spaces) into a GFM table row separator corrupts the natively-rendered table — the rich path's headline feature. Protect both fenced code blocks AND pipe-table blocks as bare regions; only prose between them gets hard breaks. Verified RICH_CONTENT and the existing rich-table tests stay byte-identical.	2026-06-21 08:26:28 -07:00
Tranquil-Flow	31e59fe44d	fix(telegram): preserve newlines in rich slash-command output (#46070) Bot API 10.1 sendRichMessage treats a lone newline as a soft break, so multi-line content joined with "\n".join(lines) — slash-command lists, etc. — collapses into a single paragraph. Normalize single newlines to Markdown hard breaks (two trailing spaces) in _rich_message_payload, leaving paragraph breaks and fenced code blocks untouched. Fixes #46070	2026-06-21 08:26:28 -07:00
Teknium	03563dabac	fix(gateway): raise session-hygiene hard message limit 400 → 5000 (#50194) The gateway pre-compression hygiene valve force-compressed any session crossing 400 messages regardless of token usage. On large-context (1M+) models doing many short, message-dense turns, a healthy session at ~16% token usage could hit 400 messages and get force-compressed — and the compression summary's stale Active Task could then bleed into the next turn. The valve's actual purpose is to break a death spiral: when API calls keep disconnecting on an oversized session, no token-usage data arrives, the token threshold never fires, and the transcript grows unbounded. It's a count-based floor for that pathological case only. 400 was tuned for ~200K-context models and is far too low for modern large-context sessions. Raise the default to 5000 — still well clear of any death spiral, but no longer firing on legitimate long conversations. The value remains fully configurable via compression.hygiene_hard_message_limit.	2026-06-21 08:26:19 -07:00
teknium1	3509be7124	fix(compression): auto-compression triggers at minimum context length (#14690) The compaction threshold is max(context_length * threshold_percent, MINIMUM_CONTEXT_LENGTH=64000). The floor prevents premature compression on large models, but degenerates at small windows: a model at exactly 64000 ctx gets max(32000, 64000) = 64000 — a threshold equal to the ENTIRE window. should_compress() can then never fire, because the provider rejects the request before usage reaches 100%. Auto-compression silently never triggers for any model whose context_length <= MINIMUM / threshold_percent (e.g. 64K-per-slot local models). Centralize the calc in _compute_threshold_tokens(). When the floor would meet or exceed the context window, trigger at 85% of the window (_MIN_CTX_TRIGGER_RATIO) — high enough that a minimum-context model uses most of its budget before compacting (compacting at the 50% percentage would waste half the small window), but below 100% so compaction actually fires before the provider rejects the request. This mirrors the existing gpt-5.5/Codex 85% autoraise rationale. Large-context behavior (floor at 64000) is unchanged; both call sites (__init__ and update_model) use the shared helper. Co-authored-by: soynchux <soynchuux@gmail.com> Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com> Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com>	2026-06-21 07:53:14 -07:00
kshitij	c6a0929875	Merge pull request #50137 from NousResearch/fix/reset-calibration-on-model-switch fix(agent): reset stale token calibration on model switch (#23767)	2026-06-21 20:02:08 +05:30
kshitij	ed8f7898b9	Merge pull request #50136 from NousResearch/fix/context-aware-tool-budget fix(agent): scale tool-output budget to the model context window (#23767)	2026-06-21 20:01:32 +05:30
liuhao1024	6984026f12	fix(browser): enable SSRF guard when terminal runs in container When terminal.backend is docker/modal/daytona/ssh/singularity, the terminal runs in a sandboxed container with network isolation, but the browser still runs on the host. The SSRF guard was skipped because _is_local_backend() only checked browser.cloud_provider, not the terminal backend. Now _is_local_backend() also checks TERMINAL_ENV — when the terminal is containerized, the browser is treated as non-local and SSRF protection is enabled. Fixes #38690	2026-06-21 07:26:18 -07:00
bogerman1	c7e8854cb3	fix(tui): persist session messages on force-quit / signal shutdown Mirror the CLI's exit-path behaviour in the TUI gateway so that unpersisted conversation messages are flushed to state.db and the on_session_end plugin hook fires before the session is closed. Root cause: _finalize_session() only called db.end_session() to mark the session row as ended, but did NOT flush in-memory messages via _persist_session() or fire the on_session_end hook. When the user force-quit (double Ctrl-C, terminal-close, SIGHUP) while the agent was mid-turn, messages accumulated since the last persist point were silently lost. Changes: tui_gateway/server.py - _finalize_session(): - Persist unflushed messages via agent._persist_session() before db.end_session(). Prefers agent._session_messages (set by the last _persist_session call inside run_conversation) over session['history'] (stale when agent is mid-turn). - Fire on_session_end(interrupted=True) plugin hook so crash-recovery plugins can flush buffers, matching cli.py behaviour. tui_gateway/entry.py - _log_signal(): - Explicitly call _shutdown_sessions() before sys.exit(0) in the SIGHUP/SIGTERM handler as belt-and-suspenders over atexit. tests/tui_gateway/test_finalize_session_persist.py (new): - 11 tests covering: history persistence, _session_messages priority, empty-history skip, missing-agent, double-finalize, persist-exception resilience, hook firing, hook-exception resilience, and db.end_session preservation. Related: Closes the TUI half of #5021 (CLI already handles this via its atexit handler). Also addresses the session-persistence gap discussed in #18465 and #18269.	2026-06-21 07:26:07 -07:00
Teknium	e499d69e3e	feat(api-server): configurable concurrent-run cap to prevent DoS (#50007) The OpenAI-compatible API server only enforced a hardcoded cap of 10 concurrent runs on /v1/runs, leaving /v1/chat/completions and /v1/responses unbounded — a request flood could exhaust CPU, memory, and upstream LLM quota (#7483). - Add gateway.api_server.max_concurrent_runs (config.yaml, default 10, 0 disables). No env var. - Shared concurrency gate across all three agent-serving endpoints, counting both the chat/responses in-flight counter and the /v1/runs stream set. Returns OpenAI-style 429 + Retry-After when at the cap. - Remove the dead hardcoded _MAX_CONCURRENT_RUNS class attribute. Closes #7483.	2026-06-21 07:26:03 -07:00
Hariharan Ayappane	99233faf78	fix(cli): persist sessions before shutdown	2026-06-21 07:25:56 -07:00
Teknium	9f67ba1b01	fix(agent): guard finalize_turn cleanup chain so it never drops the response (#50009) When a turn hit max_iterations, finalize_turn ran three unguarded cleanup steps after the model's summary — _save_trajectory (file I/O), _cleanup_task_resources (remote VM/browser teardown), and _persist_session (SQLite write). Any raise there propagated out of run_conversation, discarding the partial final_response the caller was waiting for; subprocess wrappers saw an empty stdout with no traceback (#8049). Each step is now guarded independently so one failure can't skip the others. Failures log at ERROR with a traceback and are surfaced on the result dict via cleanup_errors; the partial response is always returned. Closes #8049.	2026-06-21 07:25:42 -07:00
miha	796f618f99	fix(telegram): keep chunk markers outside code fences When truncate_message appends a (N/M) chunk indicator to a chunk that had to close an in-progress fenced code block, the marker lands on the closing fence line (``` \(1/2\) after MarkdownV2 escaping). Telegram does not treat that as a clean closing fence and rejects the MarkdownV2, falling back to plain text. Move the indicator onto its own line right after the closing fence at all three legacy-send call sites. Fixes #48517	2026-06-21 07:25:37 -07:00
kshitijk4poor	1e0b3a2bcc	fix(agent): reset stale token calibration on model switch (#23767) ContextCompressor.update_model() recomputed context_length/threshold/budgets but kept the cross-call calibration state (last_real_prompt_tokens, last_rough_tokens_when_real_prompt_fit, last_compression_rough_tokens, awaiting_real_usage_after_compression, _ineffective_compression_count) from the PREVIOUS model. Those fields encode 'the provider proved this prompt fit' / 'preflight can be deferred' decisions valid only for the model that produced them. Carried across a switch to a smaller-context model, should_defer_preflight_to_real_usage() used the old model's 'it fit' history to SKIP a preflight compression the new model actually needed — sending an oversized prompt the provider rejects (#23767). update_model() now clears that state; the new model's first response repopulates it via update_from_response(). Verified E2E: after a 200K->65,536 switch, defer no longer suppresses and should_compress fires on an over-threshold estimate.	2026-06-21 17:46:58 +05:30
kshitijk4poor	1965d56219	fix(agent): scale tool-output budget to the model context window (#23767) The tool-result persistence budget was a fixed 100K chars/result and 200K chars/turn regardless of the active model. On a small-context model (e.g. a 65K-token local model switched into mid-session) a single large tool result (reporter: a 279K-char search result) or a full 200K-char turn (~50K tokens) could by itself approach or exceed the window, forcing an oversized request that the provider rejects as "Prompt too long". - budget_config.budget_for_context_window() scales per-result/per-turn char caps to a fraction of the model window, clamped to the historical 100K/200K defaults (large models unchanged) and floored so small models stay usable. - resolve_threshold() now caps the per-tool registry value at default_result_size so tools that register a fixed 100K cap (web/terminal/x_search) don't re-inflate a scaled-down budget. No-op for the default budget (both 100K). - tool_executor wires the agent's live context_length (recomputed on model switch) into all four persist/turn-budget call sites. read_file stays inf-pinned (no persist loop). Verified E2E: a 279K-char result against a 65K model collapses to a ~1.6K preview; a 200K model is byte-identical to today.	2026-06-21 17:46:38 +05:30
kshitij	5aec00f7a9	Merge pull request #50131 from kshitijk4poor/salvage/gateway-busy-readout-50103 feat(gateway+dashboard): busy/idle readout for safe lifecycle actions (salvage #50103)	2026-06-21 17:39:26 +05:30
kshitijk4poor	4d7bb382b0	refactor(gateway): route all active_agents coercion through parse_active_agents; harden drain-timeout fallback Second cleanup pass (simplify-code review of the first follow-up): - write_runtime_status now clamps active_agents via parse_active_agents instead of an inline max(0, int(...)). Removes the duplicated clamp the helper's docstring acknowledged AND closes a write-side ValueError gap (a non-numeric active_agents previously raised; now degrades to 0). - hermes_cli/gateway.py draining-status line routes its active-agents count through parse_active_agents too — the third coercion site of the same persisted field, now consistent and non-raising with the two HTTP surfaces. - web_server.py /api/status: the drain-timeout resolver fallback now catches ImportError specifically and falls back to DEFAULT_GATEWAY_RESTART_DRAIN_TIMEOUT (a real float) instead of a blanket 'except Exception -> None'. None would have violated the surfaced field's int/float contract and stripped NAS's poll-deadline hint silently. - Dropped a redundant 'if runtime else 0' branch (parse_active_agents already handles the empty/None case) and tightened the parse_active_agents docstring to describe the actual single-contract role (write + both reads).	2026-06-21 17:22:52 +05:30
kshitijk4poor	b577f25100	refactor(gateway): dedupe drain-timeout resolution + share active_agents parse Follow-up cleanups on top of the busy/idle readout (PR #50103): - web_server.py /api/status reused the single drain-timeout resolver hermes_cli.gateway._get_restart_drain_timeout() (HERMES_RESTART_DRAIN_TIMEOUT env -> agent.restart_drain_timeout config -> default) instead of inlining a third hand-rolled copy of that precedence chain. Also fixes a subtle divergence: the inline copy used os.environ.get() so a set-but-empty env var was treated as a value rather than falling through to config; the shared resolver .strip()s and falls through correctly. - Added gateway.status.parse_active_agents() and routed BOTH HTTP surfaces (/api/status and /health/detailed) through it, so the exposed active_agents field is consistently clamped non-negative. Previously /api/status clamped while /health/detailed exposed the raw file value, diverging on a corrupt count. - Added TestParseActiveAgents covering the shared coercion contract.	2026-06-21 17:22:52 +05:30
Ben	0ee75469d7	feat(dashboard): surface gateway busy/drainable on /api/status Give an external consumer (NAS) a trustworthy, always-reachable busy/idle readout it can poll before a disruptive lifecycle action (restart, migrate, stop, auto-update). The dashboard /api/status is the only HTTP surface guaranteed up on a hosted agent regardless of which gateway platforms are enabled, and it already reads gateway_state.json. Add to /api/status (additive, non-breaking): - active_agents — in-flight gateway-turn count (now refreshed per-turn by the companion gateway-side commit) - gateway_busy — running AND active_agents > 0 - gateway_drainable — running and live (a valid begin-drain target) - restart_drain_timeout — resolved seconds, so the consumer can size its poll deadline without out-of-band knowledge (env HERMES_RESTART_DRAIN_TIMEOUT → config agent.restart_drain_timeout → default) The busy/drainable contract is defined once in gateway.status (derive_gateway_busy / derive_gateway_drainable) and consumed by both /api/status and /health/detailed so the two surfaces can never disagree. Liveness keys off gateway_running (a live PID/health probe), NEVER gateway_updated_at — a healthy idle gateway never advances that timestamp. All derived fields degrade to safe falsy values when the gateway is down or the status file is absent/corrupt (never a spurious "busy" that would wedge the consumer). active_sessions (the 5-min DB recency heuristic the SPA reads) is left exactly as-is — new signal, new fields. Tests (behaviour contracts, not snapshots): the pure derivation contract across every running/state/count/liveness combination; /api/status integration for busy, idle-drainable, draining, down, stale-busy-file, corrupt-count, and timeout surfacing; and /health/detailed parity.	2026-06-21 17:22:52 +05:30
Ben	51a338a1b6	feat(gateway): track active_agents in runtime status on turn boundaries The gateway only rewrote gateway_state.json on lifecycle transitions (start/connect/drain/stop), never on turn start/end. Live-verified on a hosted agent: a confirmed end-to-end turn ran while gateway_updated_at stayed frozen at boot and active_agents was absent — so any active_agents read from the file between transitions is stale. That makes it unusable as a busy/idle signal for an external consumer (NAS deciding whether it's safe to restart/migrate/auto-update an agent mid-turn). Add _persist_active_agents(), called at every turn boundary: - turn start: both running-agent sentinel-claim sites (normal inbound message path + startup-resume path) - turn end: the central _release_running_agent_state() choke point (covers normal completion, /stop, /reset, sentinel cleanup, stale-eviction — every path that ends a running turn) It passes ONLY active_agents to write_runtime_status, leaving gateway_state (and every other field) _UNSET so the read-merge-write preserves the current lifecycle state. Passing gateway_state=None would clobber it — hence a dedicated helper rather than reusing _update_runtime_status. The write is the same cheap JSON write done on lifecycle transitions today; best-effort (a failed status write never disrupts a turn). Behaviour-contract test: an active_agents-only write preserves both running and draining gateway_state, and the count clamps non-negative.	2026-06-21 17:22:52 +05:30
kshitij	44d552ea5a	Merge pull request #50115 from NousResearch/salvage/model-switch-preflight-warning fix(cli): warn when in-session model switch will preflight-compress	2026-06-21 16:41:44 +05:30
kshitijk4poor	1ca29723f0	fix(cli): log instead of swallow preflight-warning errors; consistent TUI warning field Follow-up to the salvaged preflight-compression warning: - Replace silent `except Exception: pass` at all 5 guard call sites (cli.py x2, gateway/slash_commands.py x2, tui_gateway/server.py) with `logger.debug(...)` so signature drift in the guard helper isn't hidden. - tui_gateway/server.py: set the confirm dict's `warning` field to the merged message (was bare expensive-model text) so it matches `confirm_message` for any future consumer reading `warning`. - Add trailing newlines to the two new files.	2026-06-21 16:31:56 +05:30
Tuna Dev	04730f32e7	fix(cli): warn when in-session model switch will preflight-compress Adds hermes_cli/context_switch_guard.py mirroring the model_cost_guard pattern. When a user switches models mid-session (Herm TUI picker, CLI, or /model on Telegram/Discord), the warning surfaces on the existing ModelSwitchResult.warning_message path used by the expensive-model guard if the new model's compression threshold is below the current session size. Partial fix for #23767 — addresses only the 'user-facing guardrail when switching from a high-context provider to a substantially lower-context provider' slice. The other proposed fixes from that issue (hard preflight token guard, metadata cache invalidation on switch, compression safety invariant, oversized tool-output handling) are out of scope for this PR.	2026-06-21 16:29:31 +05:30
xxxigm	7b9a0b315b	test(mcp): cover 'unknown method' ping keepalive fallback (#50028) Two regression tests for the agentmemory reconnect-loop: - _is_method_not_found_error matches the plain 'Unknown method: ping' phrasing (no structural -32601 code). - _keepalive_probe latches _ping_unsupported and falls back to list_tools when send_ping raises 'Unknown method: ping', instead of propagating (which would reconnect-loop).	2026-06-21 16:02:56 +05:30
xxxigm	472c068159	fix(mcp): detect 'unknown method' phrasing in ping keepalive fallback A server that doesn't implement the optional 'ping' utility answers a keepalive ping with JSON-RPC method-not-found. _is_method_not_found_error latches that condition so the probe falls back to list_tools instead of reconnect-looping. The substring fallback only matched 'method not found' / '-32601' / 'not found: ping'. Servers that surface method-not-found as the common 'Unknown method: <name>' phrasing without a structural -32601 code (e.g. agentmemory's MCP server) slipped through, so the fallback never latched and the keepalive reconnect-looped every cycle. Add 'unknown method' to the substring fallback so the ping->list_tools keepalive fallback latches for these servers too. Fixes #50028.	2026-06-21 16:02:56 +05:30
kshitij	8ca38d3121	Merge pull request #50100 from kshitijk4poor/salvage/model-visibility-cross-provider-47450 fix(desktop): preserve other providers' hide-all in model visibility dialog (salvage #47450)	2026-06-21 15:56:00 +05:30
kshitijk4poor	461fcc0964	test(desktop): harden model-visibility toggle + dedupe default expansion Follow-up to the salvaged #47450 fix: - Extract expandProviderDefaults() so the curated-default expansion rule lives in one place (was duplicated between defaultVisibleKeys and resolveVisibleKeys). - Drop the redundant new Set() wrap in toggleModelVisibility (resolveVisibleKeys already returns a fresh Set; effectiveVisibleKeys already relied on this). - Document the intentional re-enable behavior (re-enabling one model of a hidden-all provider restores only that model, not the curated defaults) and tighten the toggleModelVisibility JSDoc. - Add 7 hardening tests: re-enable-restores-only-that-model, full hide/re-enable round-trip, empty-non-null stored, single toggle-off from null defaults, zero-model provider, and direct resolveVisibleKeys null/empty assertions.	2026-06-21 15:46:58 +05:30
David Doan	8666fd7635	fix(desktop): preserve other providers' hide-all in model visibility dialog #43496 added a per-provider hide-all sentinel ('provider::') so emptying a provider in the Edit Models dialog stopped re-expanding its defaults. That fixed the single-provider case, but the dialog's toggle handler seeds its working set from effectiveVisibleKeys(), which strips ALL sentinels before returning. So persisting after any toggle silently dropped every OTHER provider's hide-all sentinel; those providers then looked 'never customized' and re-enabled all their models on the next render. Split resolution into two functions: - resolveVisibleKeys(): stored keys + curated default expansion, with hide-all sentinels PRESERVED — the canonical working set the toggle handler mutates and persists. - effectiveVisibleKeys(): resolveVisibleKeys() then strips sentinels, for display only (unchanged contract). Move the toggle set-computation into a pure, unit-tested toggleModelVisibility() that seeds from resolveVisibleKeys(), so sibling sentinels survive the persist. Add regression tests that drive the real toggle handler across multiple providers. Follow-up to #43496; completes the fix for #43485 (cross-provider case).	2026-06-21 15:42:26 +05:30
kshitij	f57ff7aef1	Merge pull request #50034 from NousResearch/salvage/cron-tz-offset-repair fix(cron): repair migrated timezone offsets to prevent double-fire	2026-06-21 13:53:28 +05:30
kshitij	f6a504d088	Merge pull request #50025 from NousResearch/salvage/cron-run-immediate fix(cron): execute job immediately on action=run	2026-06-21 13:53:13 +05:30
kshitij	3051a1634c	Merge pull request #50023 from NousResearch/salvage/f3b-telegram-dmtopic fix(cron): route Telegram DM-topic cron delivery through DeliveryRouter (#22773)	2026-06-21 13:47:30 +05:30
kshitijk4poor	f43c61643d	chore(release): add devsart95 to AUTHOR_MAP	2026-06-21 13:35:50 +05:30
kshitijk4poor	4cc28aa3bb	fix(cron): route Telegram DM-topic cron delivery through DeliveryRouter (#22773 ) PR #22410 added three-mode Telegram topic routing to the live message path (TelegramAdapter.send via the gateway DeliveryRouter), but the cron delivery path never got it. cron/scheduler.py::_deliver_result sent through the live adapter with a bare ``{"thread_id": ...}`` and fell back to the standalone _send_telegram, neither of which addresses Bot API Direct Messages topics correctly. After Bot API 10.0 (2026-05-08), sending to a private chat with a bare ``message_thread_id`` is rejected/mis-routed, so cron deliveries to a private DM topic landed in the General topic instead of the requested lane. Fix: the cron live-adapter branch now routes the text send through the gateway's ``DeliveryRouter._deliver_to_platform`` — the same canonical path live messages use — so it inherits all three Telegram routing modes: 1. Forum/supergroup (negative chat_id) -> message_thread_id 2. Bot API DM topics (private chat_id + numeric topic id) -> direct_messages_topic_id (the case #22773 reported) 3. Hermes-created named private DM-topic lanes -> ensure_dm_topic + reply anchor For mode 2, a private-chat target with a numeric topic id is passed as ``direct_messages_topic_id`` metadata (verified end-to-end: TelegramAdapter._thread_kwargs_for_send turns it into ``{message_thread_id: None, direct_messages_topic_id: <int>}``), instead of a bare message_thread_id. Forum/supergroup and home-channel deliveries are unchanged. The standalone fallback (gateway down) is preserved. No new config knob and no duplicated routing logic — this reuses the existing DeliveryRouter rather than reimplementing topic routing in the cron path. Salvaged from #42051 (stepanov1975) and #23249 (devsart95), which both diagnosed the missing three-mode routing in the cron/standalone path; reimplemented onto the canonical DeliveryRouter that landed since those PRs were opened. 
Co-authored-by: Alex <9785479+stepanov1975@users.noreply.github.com> Co-authored-by: devsart95 <devsart95@gmail.com>	2026-06-21 13:35:45 +05:30
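The three routing modes listed in 4cc28aa3bb can be sketched as a single decision function. This is a minimal illustration, not the `DeliveryRouter` or `TelegramAdapter` code: the function name, its signature, and the `named_dm_topic` placeholder key are hypothetical, and only the `message_thread_id` / `direct_messages_topic_id` field names come from the commit message.

```python
from typing import Any, Dict, Union


def thread_kwargs(chat_id: int, topic: Union[int, str, None]) -> Dict[str, Any]:
    """Pick Telegram send kwargs for a target chat and optional topic (hypothetical helper)."""
    if topic is None:
        return {}  # home-channel style delivery, no topic addressing
    if chat_id < 0:
        # Mode 1: forum/supergroup chats have negative ids and use message_thread_id.
        return {"message_thread_id": int(topic)}
    if isinstance(topic, int) or str(topic).isdigit():
        # Mode 2: private chat plus numeric topic id, the case #22773 reported.
        return {"message_thread_id": None, "direct_messages_topic_id": int(topic)}
    # Mode 3: a named private DM-topic lane. The real path calls ensure_dm_topic and
    # uses a reply anchor; this placeholder key only marks that the lane needs resolving.
    return {"named_dm_topic": str(topic)}
```

Routing cron delivery through one such function is what keeps the cron path and the live message path from drifting apart.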
Tranquil-Flow	f1f36b3bae	fix(cron): repair migrated cron timezone offsets to prevent double-fire A recurring cron job persists `next_run_at` as an absolute timestamp with a UTC offset (e.g. `2026-05-19T21:00:00+10:00`). Cron expressions, however, describe local wall-clock intent ("run at 21:00"). When Hermes/system timezone changes after the timestamp was persisted, the stored instant is re-interpreted in the new zone: `21:00+10:00` is the instant `13:00+02:00`, which is `<= now` (13:02+02:00) — so the job fires HOURS EARLY, then `compute_next_run` advances it via croniter to `21:00+02:00` the same day, producing a SECOND fire. (#28934, recurrence of #24289.) `_get_due_jobs_locked` now detects this precise migration case before the due check: for a `cron` job whose converted instant looks due, whose stored UTC offset differs from the current zone's, AND whose stored wall-clock time is still in the future (distinguishing a migrated offset from a genuinely missed run), it recomputes `next_run_at` from the schedule and skips the early fire — preserving the local wall-clock intent. Verified against the issue's reproducer: stored `21:00+10` under runtime `+02:00` at wall-clock `13:02` is rescheduled to `21:00+02` instead of firing early + again. Salvaged from #28941 by @Tranquil-Flow (authorship preserved). Chosen over the alternative approaches (#28951 normalize-to-UTC, #28985 rebase-and-match) because UTC-normalization does not change the absolute-instant comparison and so does not fix the early fire, and this guard is the tightest: it only acts when all four conditions hold and reuses the existing `compute_next_run`. Fixes #28934	2026-06-21 13:31:31 +05:30
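The four-condition guard from f1f36b3bae can be checked against the reproducer with a short sketch. The function name and signature are hypothetical; the real check lives inside `_get_due_jobs_locked`, whose code is not shown in this log.

```python
from datetime import datetime, timedelta, timezone


def looks_like_migrated_offset(kind: str, stored: datetime, now: datetime) -> bool:
    """True only when all four conditions named in the commit message hold."""
    if kind != "cron":
        return False  # 1. only cron-expression jobs carry wall-clock intent
    if stored > now:
        return False  # 2. the absolute instant must look due
    if stored.utcoffset() == now.utcoffset():
        return False  # 3. the stored offset must differ from the current zone's
    # 4. the stored wall-clock reading is still in the future, so this is a
    #    migrated offset rather than a genuinely missed run.
    return stored.replace(tzinfo=None) > now.replace(tzinfo=None)


# Reproducer from the commit message: stored 21:00+10:00, runtime +02:00 at 13:02.
stored = datetime(2026, 5, 19, 21, 0, tzinfo=timezone(timedelta(hours=10)))
now = datetime(2026, 5, 19, 13, 2, tzinfo=timezone(timedelta(hours=2)))
migrated = looks_like_migrated_offset("cron", stored, now)

# A run whose wall-clock time (09:00) has already passed is a real missed run.
missed = datetime(2026, 5, 19, 9, 0, tzinfo=timezone(timedelta(hours=10)))
genuinely_missed = not looks_like_migrated_offset("cron", missed, now)
```

When the guard returns true, the commit recomputes `next_run_at` from the schedule and skips the early fire; the recompute step is omitted here because it depends on `compute_next_run`.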
kshitij	02a3288de3	Merge pull request #50018 from NousResearch/salvage/f3a-delivery-confirm fix(cron): make live-adapter delivery confirmation reliable (#38922, #47056, #43014)	2026-06-21 13:29:45 +05:30
kyssta-exe	65d7c7fafd	fix(cron): execute job immediately on action='run' `cronjob(action='run')` (and `hermes cron run`) only set `next_run_at = now` and returned success, relying on the scheduler ticker to actually execute the job on its next tick. When no gateway/ticker is running — a CLI-only setup, or the Windows case in #41037 — the job never executed: `run` reported success, but `last_run_at` stayed null forever, no output, no delivery. A manual `run` should actually run. `_execute_job_now` now: - claims the job via `claim_job_for_fire` — the same at-most-once CAS the scheduler/external-provider fire path uses. This both advances `next_run_at` for recurring jobs and blocks a concurrently-running gateway ticker from double-firing the same job; if the claim is lost, the run is skipped (the tool reports `execution_skipped`). This closes the double-fire race that a bare `advance_next_run` left open (a tick whose `get_due_jobs` already captured the job between trigger and advance would still fire it). - delegates firing to `run_one_job` — the single shared execute→save→deliver→mark body the ticker and external providers use — so failure delivery, `[SILENT]` handling, and live-adapter delivery stay identical across paths and can't drift. (The original salvage re-implemented this sequence inline and had already dropped failure delivery + `[SILENT]`.) The tool response carries `executed`, `execution_success`, and either `execution_error` or `execution_skipped`. The `hermes cron run` CLI message no longer claims "It will run on the next scheduler tick" — it reports the actual "Ran now: succeeded/failed" outcome (or the skip). Salvaged from #41130 by @kyssta-exe (authorship preserved); reworked to reuse `claim_job_for_fire` + `run_one_job` per review rather than re-implementing the fire sequence inline. Adds tests for the claim-then-fire path, claim-lost skip, failure reporting, and exception capture. 
Fixes #41037 Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>	2026-06-21 13:28:04 +05:30
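The claim-then-fire pattern described in 65d7c7fafd can be sketched with an in-memory store. Everything here is a stand-in under stated assumptions: `JobStore`, `execute_job_now`, the `fire_token` argument, and the value shapes in the response are hypothetical, and the real `claim_job_for_fire` and `run_one_job` signatures are not shown in this log. Only the response key names come from the commit message.

```python
import threading
from typing import Callable, Dict


class JobStore:
    """In-memory stand-in for the job table (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claimed: Dict[str, int] = {}  # job_id -> fire token already claimed

    def claim(self, job_id: str, fire_token: int) -> bool:
        """Compare-and-set: only the first caller for a given fire token wins."""
        with self._lock:
            if self._claimed.get(job_id) == fire_token:
                return False
            self._claimed[job_id] = fire_token
            return True


def execute_job_now(store: JobStore, job_id: str, fire_token: int,
                    run: Callable[[str], None]) -> dict:
    """Claim the job first, then fire it; skip if another path already claimed it."""
    if not store.claim(job_id, fire_token):
        return {"executed": False, "execution_skipped": "claim lost"}
    try:
        run(job_id)  # stand-in for the shared execute, save, deliver, mark body
    except Exception as exc:
        return {"executed": True, "execution_success": False,
                "execution_error": str(exc)}
    return {"executed": True, "execution_success": True}


def failing(job_id: str) -> None:
    raise RuntimeError("boom")


store = JobStore()
first = execute_job_now(store, "job-1", 1, lambda job_id: None)
second = execute_job_now(store, "job-1", 1, lambda job_id: None)  # same fire: skipped
failed = execute_job_now(store, "job-2", 1, failing)
```

Claiming before firing is what closes the race: a ticker that already captured the job loses the claim and skips, instead of firing a second time.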
kshitij	9f4c0b27c9	Merge pull request #50016 from NousResearch/salvage/cron-ticker-liveness	2026-06-21 13:08:46 +05:30

12390 commits