hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-24 10:52:21 +00:00

Author	SHA1	Message	Date
Teknium	9630ec6c19	fix(kanban): pin worker TERMINAL_CWD to the task workspace (#50348 ) _default_spawn launched the worker subprocess with cwd=workspace and set HERMES_KANBAN_WORKSPACE, but never set TERMINAL_CWD — so the worker inherited the dispatching gateway's TERMINAL_CWD. That value takes precedence over the process cwd in two places: - tools/file_tools.py::_resolve_base_dir — a relative write_file path resolved against the gateway user's home instead of the workspace, so artifacts silently landed outside the workspace (#41312). - agent_init's context-file loader — AGENTS.md was discovered relative to the gateway's cwd, so under multi-profile dispatch a worker loaded whichever gateway won the claim race's AGENTS.md, not the task's (#34619). Both are the same root cause. Pinning TERMINAL_CWD to the workspace (where the task's work actually happens) fixes both. Guarded on an existing absolute dir because file_tools rejects relative/sentinel TERMINAL_CWD values — a non-dir workspace leaves the inherited value rather than writing a meaningless one. Closes #34619, closes #41312.	2026-06-21 12:43:37 -07:00
Teknium	b6d1072408	fix(cli): branch new worktrees from the fresh remote tip, not stale local HEAD (#50355 ) hermes -w created the worktree branch from the standalone clone's HEAD, which lags origin when the clone isn't freshly updated (it's only refreshed by hermes update, not per session). Every worktree branch then rooted on a stale base, so the PR diff GitHub computes against current main ballooned with unrelated changes and the agent had to discover the staleness at push time and rebase. _resolve_worktree_base() now fetches and branches from the freshest available ref: the current branch's upstream if it tracks one (so a deliberate feature-branch worktree tracks its own remote), else the remote's default branch (origin/HEAD), else local HEAD as a fail-soft fallback (offline / no remote / detached). A bogus 'origin/(unknown)' default is guarded, and worktree creation retries from HEAD if branching off the remote ref fails — so this is never worse than the old behavior. Gated by worktree_sync (default true); set worktree_sync: false to keep the old branch-from-local-HEAD behavior. The resolved base is printed in the session banner. This is the follow-up to the #50319 session, where the standalone clone was 213 commits behind origin and the worktree inherited that stale base.	2026-06-21 12:42:11 -07:00
Teknium	e217fd42e2	feat(kanban): add task lifecycle plugin hooks (claimed/completed/blocked) (#50349 ) Plugins could observe session/tool/approval lifecycle but had no way to observe kanban task transitions. Adds three observer hooks fired by the board's claim/complete/block transitions: - kanban_task_claimed (dispatcher process, before worker spawn) - kanban_task_completed (worker process, carries summary) - kanban_task_blocked (worker process, carries reason) Each fires AFTER the DB write txn commits, so a plugin observes durable state and a slow/hanging callback can never hold the SQLite write lock. All firing is best-effort: a raising hook is logged and swallowed and never breaks a board transition. profile_name is resolved from HERMES_HOME so dispatcher- and worker-side hooks carry the right profile. Requested by @Smithangshu on Discord.	2026-06-21 12:38:14 -07:00
Teknium	9d883ac90e	feat(plugins): add ctx.profile_name for session-agnostic profile access (#50346) Plugins previously had no way to read the active profile name from the PluginContext. The workaround in the wild (reaching into ctx._manager._cli_ref) only works in an interactive CLI session; _cli_ref is None in the gateway and in kanban-spawned worker sessions (hermes -p <profile> chat -q ...), so the workaround breaks exactly where multi-profile awareness matters most. ctx.profile_name wraps hermes_cli.profiles.get_active_profile_name(), which derives the name from HERMES_HOME and therefore works in every execution context with zero dependency on _cli_ref.	2026-06-21 12:38:11 -07:00
Teknium	7d9f6a24f5	chore(release): add AUTHOR_MAP entry for #48678 salvage	2026-06-21 12:36:26 -07:00
natehale	565b7c8d9d	fix(telegram): stop typing indicator lingering after final reply After the agent's final response, the '...typing' bubble persisted ~5s. send() re-triggers send_typing() after every delivery so the bubble survives intermediate progress messages (Telegram clears typing on each delivered message). But that re-trigger also fired on the FINAL send, re-arming Telegram's ~5s timer AFTER the gateway had already torn down its typing-refresh loop — and Telegram exposes no stop-typing API, so nothing cancelled it. Gate the post-send re-trigger on the absence of metadata['notify'] (set only on the final user-visible reply via _mark_notify_metadata). Both the rich-message and legacy send paths are covered; intermediate progress sends still re-trigger so the bubble stays alive mid-response. Fixes #48678	2026-06-21 12:36:26 -07:00
Teknium	c0409a87ff	feat(gateway): typed send-error classification (SendResult.error_kind) (#50342 ) Add a platform-neutral send-failure vocabulary so consumers can branch on a typed category instead of substring-matching the raw provider message. - base.py: SEND_ERROR_KINDS + classify_send_error() (too_long / bad_format / forbidden / not_found / rate_limited / transient / unknown), and an optional SendResult.error_kind field (defaults None — fully backward compatible). - telegram.py: populate error_kind on send() failures; message_too_long keeps its existing error token plus error_kind='too_long'. Purely additive: no behavioral change to the existing degrade-and-deliver paths (MarkdownV2->plain-text fallback, overflow split, retry classification all untouched). 22 new tests + 210 adapter regression tests green.	2026-06-21 12:34:22 -07:00
teknium1	6bbacc2238	fix(desktop): make cold-start port-announcement deadline tolerant The port-announcement clock in waitForDashboardPort starts the instant the backend process is spawned — before uvicorn binds its socket. On a cold install the child first compiles and imports the whole hermes_cli.main -> web_server -> FastAPI/uvicorn chain, and on Windows real-time AV scans every freshly written .pyc. That pre-bind cost can exceed the old hardcoded 45s deadline, so the desktop killed a healthy-but-still-starting backend and respawned it, piling up orphaned processes (#50209). Raise the default to 90s and make it overridable via HERMES_DESKTOP_PORT_ANNOUNCE_TIMEOUT_MS, clamped to a 45s floor so a bad override can't reintroduce the loop. Warm starts still announce in well under a second; both call sites inherit the new default with no change. Adds backend-ready.test.cjs (wired into test:desktop:platforms).	2026-06-21 12:29:18 -07:00
joaomarcos	e580706d4d	test(web_server): add integration tests for desktop boot handshake fix Three tests covering the scenarios from issue #50209 that could not be validated with real Defender on a fresh install: 1. test_lifespan_warmup_is_nonblocking Patches _warm_gateway_module to sleep 3 s. Measures TestClient startup time — must complete in < 1.5 s, proving the fire-and-forget run_in_executor does not block the event loop before port binding (HERMES_DASHBOARD_READY timing proxy). 2. test_get_status_does_not_block_event_loop Patches _resolve_restart_drain_timeout to sleep 3 s. Fires concurrent GET /api/status and GET /api/version requests. /api/version must respond in < 3 s while /api/status waits — proving the event loop stays free during the slow import (15 s socket timeout would not fire). 3. test_concurrent_status_probes_all_respond Three simultaneous /api/status probes with the slow patch — all must return HTTP 200 (no connection resets, no orphan accumulation). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-21 12:29:18 -07:00
joaomarcos	475e81dab4	fix(web_server): use run_in_executor for gateway pre-warm and drain-timeout Fixes a regression introduced by the prior approach (synchronous import hermes_cli.gateway inside _lifespan) that caused a new failure mode: the blocking import stalled the asyncio event loop before uvicorn could bind its port, pushing HERMES_DASHBOARD_READY past the desktop shell's 45 s announcement deadline and triggering a respawn loop that accumulated orphaned backend processes. Two-part fix: _lifespan: replace the blocking import with a fire-and-forget run_in_executor call (_warm_gateway_module). The import runs in a worker thread while the server socket is already open, so HERMES_DASHBOARD_READY fires without delay. get_status: replace the inline lazy import with await run_in_executor(None, _resolve_restart_drain_timeout). This is the root fix for the original 15 s socket-timeout: the blocking .pyc-compilation + Defender scan is offloaded to a thread, keeping the event loop free for every /api/status probe. After the first call the module is in sys.modules and the executor returns in microseconds. Both helpers are extracted as module-level sync functions so they can be unit-tested independently of FastAPI or uvicorn. Closes #50209 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-21 12:29:18 -07:00
Teknium	5e3e89cc05	feat(hindsight): configurable embedded daemon health grace timeout (#50341 ) On resource-contended hosts the embedded Hindsight daemon can exceed a single 2s /health check; upstream then waits a grace window before treating it as stale and killing+restarting it (hindsight-embed reads HINDSIGHT_EMBED_PORT_HEALTH_GRACE_TIMEOUT, default 30s, into a module-level constant at import time). Users on busy boxes had no Hermes-side way to raise it short of hand-setting an env var. Add a 'port_health_grace_timeout' config.json option to the Hindsight plugin. When set, initialize() exports it to the process env BEFORE daemon_embed_manager is imported (the import-time read is the contract). setdefault() so an explicit operator env override always wins. Exposed in 'hermes memory setup' for local_embedded mode. Follow-up to #50308 / issue #13125 comment thread.	2026-06-21 12:20:53 -07:00
Eugeniusz Gilewski	def3f6388f	fix(file): anchor device symlink guard to task cwd The read_file device guard now walks symlink hops before the file operation layer, but that hop walk still interpreted relative paths against the Python process cwd. In sessions where TERMINAL_CWD points at the task workspace, a relative workspace symlink to a blocked alias such as /dev/../dev/stdin could therefore miss the intermediate device target before later task-cwd resolution. Anchor relative device checks to the task base before symlink-hop inspection so the pre-I/O guard sees the same workspace path that read_file would otherwise read. Absolute device paths and the existing final realpath fallback remain unchanged. Refs #10141 Refs #29158	2026-06-21 12:16:10 -07:00
teknium1	e267237671	test(photon): cover overflow retry, typing cooldown, sidecar-crash detection Follow-up for salvaged PR #50256. Unit tests for the three behaviors: retryable classification of Envoy/sidecar overflow strings, per-chat typing cooldown with stop_typing reset, and the _supervise_sidecar crash-detection path that raises a retryable fatal (and the clean-shutdown no-op).	2026-06-21 12:15:44 -07:00
joaomarcos	9578e52795	fix(photon): detect unexpected sidecar death and trigger reconnect When the Node spectrum-ts sidecar process exited mid-session (crash, OOM, upstream overflow escalation), _supervise_sidecar returned silently — readline hit EOF, the log-pump loop broke, and nothing notified the gateway. _inbound_loop entered an infinite retry loop against a dead port, _running stayed True, and the adapter remained in self.adapters with no path to self-recovery short of a manual gateway restart. Add a death-detection tail to _supervise_sidecar: after the log-pump exits (EOF or exception), guard on _inbound_running to distinguish unexpected death from a deliberate disconnect(). On unexpected exit, call _set_fatal_error("SIDECAR_CRASHED", retryable=True) followed by _notify_fatal_error() so the reconnect watcher picks up the platform within 30 s and retries with exponential backoff (30 s → 300 s cap) until the sidecar comes back up. All other platforms remain unaffected. The _inbound_running guard is safe against races: disconnect() sets _inbound_running = False before _stop_sidecar() cancels the supervisor task. CancelledError is BaseException, not Exception, so it bypasses the except clause and propagates normally — the detection block never runs during a clean shutdown.	2026-06-21 12:15:44 -07:00
joaomarcos	2a4542333e	fix(photon): classify Envoy overflow errors as retryable; add typing cooldown Closes #50185 Two independent gaps let a transient Photon/Spectrum upstream overflow degrade message delivery and amplify gRPC pressure: 1. _is_retryable_error did not recognise Photon- or Envoy-specific error strings ("internal sidecar error", "upstream connect error", "reset reason: overflow"), so _send_with_retry fell through to the plain-text fallback immediately instead of backing off and retrying. 2. send_typing had no rate gate, so a burst of typing-indicator calls during an overflow event kept hitting the upstream gRPC connection and widened the failure window. Fix: - Add _PHOTON_RETRYABLE_PATTERNS with the three high-specificity Envoy / sidecar substrings and override _is_retryable_error on PhotonAdapter to check them after delegating to the base-class patterns. base.py and all other adapters are untouched. - Add a 5 s per-chat cooldown in send_typing backed by _typing_last_sent. stop_typing clears the entry so the next start after a completed turn fires immediately — only rapid consecutive starts without a stop are suppressed. - Reduce PhotonAdapter._send_with_retry default max_retries from 2 to 1 (single 2 s back-off check) — enough to confirm whether the Envoy circuit-breaker has opened, without adding unnecessary latency. All changes are scoped to plugins/platforms/photon/adapter.py.	2026-06-21 12:15:44 -07:00
Teknium	7a131f7f40	fix(api-server): stop silently promising async delivery on stateless HTTP path (#50319) * fix(api-server): stop silently promising async delivery on stateless HTTP path terminal(notify_on_complete=True / watch_patterns) and delegate_task(background=True) silently no-op'd on the API server / WebUI path (#10760): the watcher / detached child registered, but every API-server route (OpenAI-spec /v1/chat/completions and /v1/responses, plus the proprietary /v1/runs SSE stream) tears down its channel when the turn ends, and APIServerAdapter.send() is a no-op stub. A completion that fires after the response closed had nowhere to go; from the agent side, it was indistinguishable from a hang. There is no spec-compliant surface to wake the agent later on a stateless HTTP client, so make the no-op honest instead of silent: - Add a per-adapter capability flag supports_async_delivery (default True; APIServerAdapter = False), propagated into a HERMES_SESSION_ASYNC_DELIVERY contextvar via async_delivery_supported(). Toggle on the adapter, not a hardcoded platform string, so a future stateless adapter is correct-by-default. - terminal: when delivery is unsupported, skip watcher registration, force notify_on_complete off, and return a notify_unsupported note telling the agent to process(action='poll'). - delegate_task: when delivery is unsupported, fall back to SYNCHRONOUS execution (work runs and returns in the same response) with a note, instead of handing out a handle that never resolves. CLI (in-process completion_queue) and the real gateway platforms are unchanged. Fixes #10760 * refactor(api-server): route session binding through a single no-delivery chokepoint Add APIServerAdapter._bind_api_server_session() and route both agent-entry paths (_run_agent for /v1/chat/completions + /v1/responses, and the /v1/runs _run_sync path) through it. The helper hardwires platform="api_server" and async_delivery=False with no async_delivery parameter to pass, so a future route added to the API server physically cannot reintroduce the silent no-op (#10760) by forgetting to mark the channel as non-delivering. The binding stays request-scoped (cleared per turn), so a session resumed later on a delivering interface (CLI / gateway platform) re-binds fresh and is NOT blocked: the no-delivery decision tracks the interface handling the current turn, never the session.	2026-06-21 12:15:14 -07:00
JackJin	56255f83f7	fix(agent): stop delegate cascade from deleting the parent session _collect_delegate_child_ids() walks the _delegate_from marker chain to gather delegate subagents for cascade deletion, but started its visited set empty. When the chain loops back onto a parent — a delegation cycle, or a parent that is also another parent's delegate child when several ids are deleted together — that parent was collected as one of its own descendants and then permanently deleted, along with all of its messages, by _delete_delegate_children(). Seed the visited set with the parent ids so they can never be re-collected, and exclude them from the returned child set. Callers (delete_session, bulk delete) remove the parents separately, so this only prevents the unintended parent deletion; legitimate child collection is unchanged. Add regression tests (in-memory sqlite) covering single/multi-level delegate chains, the parent_session_id+marker branch, untagged children (orphan-don't-delete contract), and the cycle case that previously leaked the parent into the deletion set. Fixes #49148	2026-06-21 12:09:16 -07:00
Teknium	e581740aa1	fix(kanban): single-writer dispatch lock to prevent orphan-dispatcher DB corruption (#50331 ) A shell-launched 'hermes gateway run --replace' / 'gateway restart' on a systemd/launchd host can leave an orphan gateway whose kanban dispatcher escapes the service cgroup, survives 'systemctl restart', and becomes a second long-lived writer on the shared kanban.db. Two dispatchers that each believe they own the file both pass SQLite busy_timeout and then race on WAL frames — the documented root cause of multi-writer corruption (issue #35240). The existing _guard_supervised_gateway_conflict startup guard blocks the common way an orphan is born, but does nothing once a second dispatcher already exists. This adds the defense-in-depth: dispatch_once now wraps every tick in a non-blocking, board-scoped flock (_dispatch_tick_lock). A losing dispatcher returns DispatchResult(skipped_locked=True) and does zero DB writes this tick — so two dispatchers can never run a reclaim/spawn/write sequence concurrently regardless of how the second one got there. - Non-blocking (LOCK_NB): never stalls the gateway's async watcher. - Board-scoped: lock file is a .dispatch.lock sibling of each board's kanban.db, so unrelated boards tick in parallel. - POSIX + Windows (fcntl / msvcrt LK_NBLCK), no-op degrade where neither exists — mirrors the existing _cross_process_init_lock pattern. Verified with a real two-process orphan repro: while a separate process holds the lock, dispatch_once skips; after release it runs.	2026-06-21 12:06:24 -07:00
Teknium	587b5b9ac2	fix(backup): capture memory-provider state stored outside HERMES_HOME (#50325 ) hermes backup only walks HERMES_HOME, so memory providers that keep config/credentials in home-anchored dotdirs (honcho -> ~/.honcho, hindsight -> ~/.hindsight, openviking -> ~/.openviking) lost that data across a backup/import cycle — the peer IDs, session pairings, and API keys never made it into the archive. Add an optional MemoryProvider.backup_paths() hook (default []). The active provider declares its external paths; backup resolves them from config only (no init, no network), archives the ones under the home dir into a reserved _external/ subtree encoded relative to home, and import restores them to their original location with a home-anchored traversal guard and 0600 on credential-shaped files. Paths outside home are skipped as non-portable. honcho, hindsight, and openviking override the hook. E2E-validated full backup->import cycle plus 7 new tests.	2026-06-21 12:03:46 -07:00
Teknium	7a8c4fe238	chore(release): add AUTHOR_MAP entry for #48422 salvage	2026-06-21 12:03:24 -07:00
kn8-codes	6183e8ce1b	fix(telegram): make Bot API 10.1 rich messages opt-in (default off) Rich messages are not ready for primetime: current Telegram clients can render Bot API 10.1 rich messages as blank/unsupported bubbles and make them hard to copy as plain text, which is worse than the legacy MarkdownV2 path for command snippets and mobile handoffs. Default the rich_messages toggle to False so replies stay on the copyable legacy path; users opt in per bot via platforms.telegram.extra.rich_messages: true. Updates adapter, gateway config default, example config, English + zh-Hans docs, and the default/opt-in tests.	2026-06-21 12:03:24 -07:00
Stephen Chin	3b56d3a29a	fix(security): redact secrets in kanban tool payloads before persistence	2026-06-21 12:02:30 -07:00
Teknium	d19aabbf2d	fix(gateway): persist in-flight transcript on restart/shutdown drain timeout (#50312 ) A turn forcibly interrupted by the drain-timeout escalation never reaches turn_finalizer.finalize_turn (the only place that flushes the turn to state.db). Its in-flight tool rounds live only in the in-memory _session_messages, so the immediate pre-restart turn was silently dropped from load_transcript() on resume. _finalize_shutdown_agents now flushes _session_messages to the SQLite session store before teardown. The flush is idempotent (identity-tracked in _flush_messages_to_session_db), so agents that finished gracefully re-flush nothing. The resume_pending / fresh-tool-tail branches in _handle_message_with_agent already expect a transcript whose tail may be a pending tool result. Fixes #13121.	2026-06-21 11:57:15 -07:00
sgaofen	93ea9b04af	fix(gateway): cap inbound media download size to prevent memory exhaustion Inbound image/audio/video payloads were buffered fully into process memory before being written to the cache, with no size limit. A large upload (Discord Nitro allows 500 MB) or a remote media URL in an inbound message pointing at a huge file could spike RAM and OOM-kill the gateway. Enforce a configurable cap in the shared cache helpers (gateway/platforms/ base.py) so the protection holds across every platform adapter, not one: - cache_image/audio/video_from_bytes reject oversized payloads before writing (video was the gap in the original report — now covered). - cache_image/audio_from_url stream the body, rejecting on an oversized Content-Length header and re-checking the running total per chunk so an absent/lying header can't smuggle an unbounded body past the cap. - Discord's _read_attachment_bytes checks att.size up front, so an oversized attachment is rejected before any bytes are pulled into memory. Configurable via gateway.max_inbound_media_bytes in config.yaml (default 128 MiB; 0 disables). No new env var — non-secret config lives in config.yaml. Salvaged and extended from @sgaofen's PR #13341 (the original report and the shared-helper approach). Reapplied onto current main (Discord adapter has since moved to plugins/platforms/discord/), the configurable knob moved from an env var to config.yaml, and the video cache helper added. Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-06-21 11:56:46 -07:00
teknium1	16899ae144	test(file): update guard assertions for unified display-text message The salvaged #19820 unifies the write_file guard under _is_internal_file_tool_content with the message 'internal read_file display text'. Two tests added to test_file_read_guards.py after the PR branch point still asserted the old 'status text' wording. Update them to match the new (correct, more general) message.	2026-06-21 11:55:59 -07:00
Brandon Zarnitz	71274f264b	fix(file): reject read_file line-numbered writeback	2026-06-21 11:55:59 -07:00
Teknium	a18bae65b9	fix(config): redact api_key in config show/set output (#50245 ) (#50313 ) hermes config show printed the model dict raw via print(), bypassing the logging redactor; a custom-provider api_key (e.g. Cloudflare cfut_...) was shown in plaintext even with security.redact_secrets=true. Opaque tokens don't match any vendor-prefix regex, so structural key-name masking is required. - Add redact_config_value(): recursively masks credential-shaped keys (api_key/token/secret/... exact-match) via mask_secret. - Wrap the show_config model dump in it. - Mask the set_config_value echo when the leaf key is credential-shaped (config set model.api_key routes to config.yaml, lowercase misses the .env allowlist).	2026-06-21 11:50:31 -07:00
Teknium	e0498bd305	fix(bedrock): price Claude prompt-cache tokens in /usage (#50307) Bedrock Claude routes through the AnthropicBedrock SDK and injects cache_control, so cached tokens are always reported, but the pricing table had no cache cost fields for any Bedrock model, so /usage showed "cost unknown" on every cached session. Also, cross-region inference profiles (us./global./eu. prefixes) never matched the bare pricing keys. - Add cache_read/cache_write rates to the four Bedrock Claude rows (read 0.1x input, write 1.25x input per the Bedrock pricing page). - Normalize the cross-region prefix in the Bedrock pricing lookup, mirroring is_anthropic_bedrock_model's prefix list. Closes #50295.	2026-06-21 11:48:43 -07:00
LehaoLin	7bc6f18062	fix(hindsight): skip local_embedded daemon when running as root PostgreSQL's initdb refuses to run as root, so the embedded Hindsight daemon could never initialize its data directory under root. The daemon-start thread would fail, retry, and loop forever — each cycle reloading embedding models (~958MB RAM, ~33% CPU) with no user-visible error, leaving Hermes sluggish on a common VPS/cloud root setup. initialize() now detects root (os.geteuid() == 0) before spawning the daemon thread, disables local_embedded mode, and surfaces a clear warning to both the log and the terminal so the user knows to run as a non-root user or switch to cloud / local_external mode. Closes #13125. Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-06-21 11:47:02 -07:00
teknium1	d0de4601d2	fix(tui): /compress shows a before/after summary (#46686 ) The TUI /compress slash side-effect compressed the session, synced the key, and emitted session.info — but returned an empty string, so the user saw no 'Compressed: N → M messages / ~X → ~Y tokens' feedback. The CLI (_manual_compress) and gateway (slash_commands) paths both already call summarize_manual_compression; the TUI slash path was the lone gap. Snapshot history + rough token estimate before and after compaction and return the formatted summarize_manual_compression() feedback, mirroring the session.compress RPC handler. The estimate uses the same estimate_request_tokens_rough(system_prompt, tools) inputs as the RPC path, re-reading the system prompt after compaction (it may be rebuilt). Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-21 11:36:09 -07:00
teknium1	9e4fe32d36	fix(session): opt the background-review fork out of session finalization The background-review fork (fires ~every 10 turns) pins review_agent.session_id = agent.session_id — the parent's LIVE id — for prefix-cache parity, then calls close(). With session finalization now in close(), that would end the still-active parent session mid-conversation. Set _end_session_on_close = False on the fork so the real owner (CLI close / gateway reset / cron) finalizes the session instead. Follow-up to the #12029 fix.	2026-06-21 11:35:09 -07:00
yeyitech	b17180d950	fix(session): finalize owned SQLite session rows on AIAgent.close() Funnel session finalization through AIAgent.close() — the single terminal path every agent (CLI, gateway, subagent, cron) funnels through — so finished agents stop leaving rows with ended_at IS NULL. The biggest leak source was delegate_task subagent + background-review forks whose close() never ended their row. end_session() is first-reason-wins and no-ops on an already-ended row, so a 'compression'/'cron_complete'/'cli_close' reason set by an earlier terminal path is never clobbered. /resume already calls reopen_session(), so finalizing-on-close does not break resumability. Temporary helper agents that rotate/share the session forward (manual compression, gateway session-hygiene) opt out via _end_session_on_close=False. Also stop the long-running gateway heartbeat once the executor is done or the session slot is rebound to a different agent, preventing a stale 'running: delegate_task' bubble from outliving its run. Closes #12029.	2026-06-21 11:35:09 -07:00
teknium1	41e0c10f7e	fix(agent): route repeated-compression warning through _emit_status (#36908 ) The 'Session compressed N times — accuracy may degrade' warning went through _vprint (CLI stdout only), so the Ink TUI / Telegram / Discord never saw it — unlike the two other compression warnings in the same module, which route through _emit_status (and store _compression_warning for late-bound gateway status_callback replay). Set agent._compression_warning + call agent._emit_status() for this warning too, matching the sibling pattern. _emit_status still _vprints for the CLI, so CLI output is unchanged; TUI / gateway surfaces now receive it via status_callback (and replay_compression_warning can re-deliver it once a late-bound gateway callback is wired). Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>	2026-06-21 11:34:47 -07:00
konsisumer	3e354b61db	fix(agent): preserve copilot routed headers	2026-06-21 11:29:49 -07:00
Teknium	b6a4638b6d	fix(compressor): treat empty-content summary response as failure, not an empty summary (#50297 ) When an OpenAI-compatible proxy (e.g. cmkey.cn, one-api Anthropic channels) returns a well-formed HTTP 200 whose summary content is null or empty/ whitespace-only, _generate_summary coerced it to "" and stored a prefix-only summary — silently replacing the compacted turns with nothing. The model then lost all in-progress context after compression (#11978, #11914). _validate_llm_response already guards None / empty-choices, so those never reach the compressor; the gap was a well-formed response with empty content. Now treat empty content as a summary failure: raise so it routes through the existing main-model fallback then transient cooldown, dropping the turns without a summary rather than wiping context with an empty one. Also narrow the bare 'except RuntimeError' so only genuine 'No LLM provider configured' errors take the 600s no-provider cooldown; empty/invalid-response RuntimeErrors from a configured provider now correctly get the main-model fallback instead of being misrouted into the long no-provider cooldown. Reported by @Hung2124; area identified by @annguyenNous in #39590.	2026-06-21 11:27:07 -07:00
Teknium	296b290f8f	chore(release): add AUTHOR_MAP entry for de1tydev (#10158)	2026-06-21 11:11:23 -07:00
Teknium	41ba90f814	fix(process): keep CLI drain dedup after poll goes read-only (#10156 ) Follow-up to @de1tydev's poll-read-only fix. Removing the _completion_consumed.add() from poll() fixes the gateway/tui watcher suppression (#10156) but reintroduces the CLI duplicate that #8228 fixed: a notify_on_complete process always enqueues a completion event, and the CLI idle/post-turn drain would re-inject it as a [SYSTEM: ...] message even though the agent already saw the exit inline in its poll result. Add a separate _poll_observed set that poll() populates on an observed exit. drain_notifications() (CLI only) skips poll-observed sessions; the gateway/tui watchers keep checking only is_completion_consumed, so a read-only poll never suppresses their autonomous delivery turn. - _poll_observed pruned alongside _completion_consumed in _prune_if_needed - 4 tests: CLI drain dedup after poll, gateway gate untouched, running poll doesn't mark observed, wait/log still skip CLI drain	2026-06-21 11:11:23 -07:00
Liao Shiwu	6f5f58e34b	fix: keep poll read-only for notify_on_complete watcher	2026-06-21 11:11:23 -07:00
Eugeniusz Gilewski	9078b4bbdf	fix(file): harden read_file device alias blocking Security-hardening fix for the read_file device guard, not a new sandbox boundary. The guard already rejects direct device paths and upstream now has a resolved-path pass for workspace symlinks to blocked devices, but its concrete-path helper still compared the expanded path before normalization. That leaves residual alias cases where the dangerous path is visible before final terminal-specific resolution, for example: 1. /dev/../dev/zero and /dev/./urandom should match the blocked-device list as concrete paths, not only after final realpath; 2. /dev/stdin-style aliases can disappear once realpath follows them to /proc/self/fd/0 and then to a tty path; 3. a user symlink to /dev/../dev/stdin exposes the dangerous intermediate target before final resolution, but not necessarily after it. Normalize expanded paths before matching and inspect each symlink hop before falling back to realpath. This preserves the existing /proc fd and /proc pseudo-file guards while enforcing the intended security invariant: model-supplied read paths must not reach blocking or infinite device streams through spelling, normalization, or symlink-hop tricks. Classification: security hardening / residual bypass fix for the read_file device blocklist. This is defensive code at the file-tool boundary, but it fixes a concrete denial-of-service class tracked as security in #10141 and #29158. Tests: - normalized /dev/../dev/zero and /dev/./urandom aliases - symlink to /dev/../dev/stdin blocked before realpath - existing symlink-to-device and regular-symlink guards still pass Fixes #10141 Fixes #29158	2026-06-21 11:11:19 -07:00
tt-a1i	ea056b0559	fix(telegram): avoid rich messages for CJK text Telegram Mac/Desktop Bot API 10.1 rich-message rendering leaves garbled overlapping draft/overlay glyphs for CJK text (#47653), affecting every message containing CJK characters. The legacy MarkdownV2 path renders the same text cleanly, so skip the rich send / draft / final-edit paths up front for content containing CJK (incl. astral-plane extensions) until affected clients age out. Non-CJK rich rendering is preserved. Fixes #47653	2026-06-21 11:10:37 -07:00
brooklyn!	65a477f12e	feat(desktop): add Update now button to About panel (#50186)	2026-06-21 11:34:45 -05:00
teknium1	2f4f23fbfb	fix(codex): bridge app-server item/started events to Telegram tool-progress (#38835) When the main provider is the Codex app-server runtime (api_mode codex_app_server), the gateway showed no verbose 'running X' tool-progress breadcrumbs on Telegram while every other provider did. The app-server session processes item/started notifications (command execution, file changes, MCP/dynamic tool calls) but never surfaced them as Hermes tool-progress events — the session was constructed without an on_event hook, so the agent's tool_progress_callback was never invoked on this route. Add _codex_note_to_tool_progress() mapping item/started → (tool_name, preview, args) for commandExecution / fileChange / mcpToolCall / dynamicToolCall, and wire an on_event hook into CodexAppServerSession that forwards mapped events to agent.tool_progress_callback('tool.started', ...) — the same signature the chat_completions path uses (tool_executor.py). Non-tool items (agentMessage/reasoning) and non-item/started methods map to None and are ignored. Co-authored-by: jplew <462836+jplew@users.noreply.github.com>	2026-06-21 08:46:06 -07:00
yeyitech	8a506ed3ac	fix(auth): make load_pool() non-destructive for env-seeded credentials load_pool() is meant to be a read, but it persistently pruned env-seeded pool entries whenever the calling process's os.environ lacked the seeding var. A process without MINIMAX_API_KEY would delete the persisted env:MINIMAX_API_KEY entry from auth.json for every other process, causing auth.json to oscillate and auxiliary auto-detect to fall through to the wrong provider. env:* entries are persisted references re-hydrated from the environment on each load — a missing var means "cannot re-seed right now", not "source is gone forever". _prune_stale_seeded_entries now gates env-source removal behind prune_env_sources (default True for explicit cleanup paths); load_pool() passes prune_env_sources=False. File-backed singletons (device-code OAuth, hermes_pkce) still prune when their backing file is gone, and explicit removal via `hermes auth remove` (source suppression) is unaffected. Fixes #9331. Co-authored-by: houko <suzukaze.haduki@gmail.com>	2026-06-21 08:26:37 -07:00
Teknium	a966932392	fix(telegram): exempt tables from rich newline hard-breaks The newline normalization is the shared chokepoint for every rich send (sendRichMessage, draft, and editMessageText). Injecting a Markdown hard break (two trailing spaces) into a GFM table row separator corrupts the natively-rendered table — the rich path's headline feature. Protect both fenced code blocks AND pipe-table blocks as bare regions; only prose between them gets hard breaks. Verified RICH_CONTENT and the existing rich-table tests stay byte-identical.	2026-06-21 08:26:28 -07:00
Tranquil-Flow	31e59fe44d	fix(telegram): preserve newlines in rich slash-command output (#46070) Bot API 10.1 sendRichMessage treats a lone newline as a soft break, so multi-line content joined with "\n".join(lines) — slash-command lists, etc. — collapses into a single paragraph. Normalize single newlines to Markdown hard breaks (two trailing spaces) in _rich_message_payload, leaving paragraph breaks and fenced code blocks untouched. Fixes #46070	2026-06-21 08:26:28 -07:00
Teknium	03563dabac	fix(gateway): raise session-hygiene hard message limit 400 → 5000 (#50194) The gateway pre-compression hygiene valve force-compressed any session crossing 400 messages regardless of token usage. On large-context (1M+) models doing many short, message-dense turns, a healthy session at ~16% token usage could hit 400 messages and get force-compressed — and the compression summary's stale Active Task could then bleed into the next turn. The valve's actual purpose is to break a death spiral: when API calls keep disconnecting on an oversized session, no token-usage data arrives, the token threshold never fires, and the transcript grows unbounded. It's a count-based floor for that pathological case only. 400 was tuned for ~200K-context models and is far too low for modern large-context sessions. Raise the default to 5000 — still well clear of any death spiral, but no longer firing on legitimate long conversations. The value remains fully configurable via compression.hygiene_hard_message_limit.	2026-06-21 08:26:19 -07:00
teknium1	3509be7124	fix(compression): auto-compression triggers at minimum context length (#14690) The compaction threshold is max(context_length * threshold_percent, MINIMUM_CONTEXT_LENGTH=64000). The floor prevents premature compression on large models, but degenerates at small windows: a model at exactly 64000 ctx gets max(32000, 64000) = 64000 — a threshold equal to the ENTIRE window. should_compress() can then never fire, because the provider rejects the request before usage reaches 100%. Auto-compression silently never triggers for any model whose context_length <= MINIMUM / threshold_percent (e.g. 64K-per-slot local models). Centralize the calc in _compute_threshold_tokens(). When the floor would meet or exceed the context window, trigger at 85% of the window (_MIN_CTX_TRIGGER_RATIO) — high enough that a minimum-context model uses most of its budget before compacting (compacting at the 50% percentage would waste half the small window), but below 100% so compaction actually fires before the provider rejects the request. This mirrors the existing gpt-5.5/Codex 85% autoraise rationale. Large-context behavior (floor at 64000) is unchanged; both call sites (__init__ and update_model) use the shared helper. Co-authored-by: soynchux <soynchuux@gmail.com> Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com> Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com>	2026-06-21 07:53:14 -07:00
kshitij	c6a0929875	Merge pull request #50137 from NousResearch/fix/reset-calibration-on-model-switch fix(agent): reset stale token calibration on model switch (#23767)	2026-06-21 20:02:08 +05:30
kshitij	ed8f7898b9	Merge pull request #50136 from NousResearch/fix/context-aware-tool-budget fix(agent): scale tool-output budget to the model context window (#23767)	2026-06-21 20:01:32 +05:30
liuhao1024	6984026f12	fix(browser): enable SSRF guard when terminal runs in container When terminal.backend is docker/modal/daytona/ssh/singularity, the terminal runs in a sandboxed container with network isolation, but the browser still runs on the host. The SSRF guard was skipped because _is_local_backend() only checked browser.cloud_provider, not the terminal backend. Now _is_local_backend() also checks TERMINAL_ENV — when the terminal is containerized, the browser is treated as non-local and SSRF protection is enabled. Fixes #38690	2026-06-21 07:26:18 -07:00
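
Note on commit 3509be7124 (compaction threshold): the arithmetic in that message can be checked with a small sketch. This is not the repository's code. The function name compute_threshold_tokens and its signature are hypothetical stand-ins for _compute_threshold_tokens(), the 0.5 default for threshold_percent is an assumption inferred from the message's max(32000, 64000) example, and rounding to the nearest token is also an assumption. Only the two constants and the floor/85% rule come from the commit message.

```python
# Constants as named in the commit message.
MINIMUM_CONTEXT_LENGTH = 64_000
_MIN_CTX_TRIGGER_RATIO = 0.85


def compute_threshold_tokens(context_length: int, threshold_percent: float = 0.5) -> int:
    """Hypothetical sketch of the threshold rule described in the commit.

    threshold_percent=0.5 is an assumed default, not confirmed by the source.
    """
    # Percentage threshold with the 64000-token floor applied.
    floored = max(round(context_length * threshold_percent), MINIMUM_CONTEXT_LENGTH)
    if floored >= context_length:
        # The floor would meet or exceed the window, so compaction could
        # never fire; trigger at 85% of the window instead.
        return round(context_length * _MIN_CTX_TRIGGER_RATIO)
    return floored


if __name__ == "__main__":
    for ctx in (32_000, 64_000, 100_000, 1_000_000):
        print(ctx, compute_threshold_tokens(ctx))
```

Under these assumptions, a 64000-token window triggers at 54400 tokens instead of 64000, while windows large enough to clear the floor keep the unchanged max(percentage, 64000) behavior.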

12419 commits