hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-27 11:22:03 +00:00

Author	SHA1	Message	Date
Harjoth Khara	233ef98afe	fix(docker): skip symlinked stage2 chown targets (#52789) Prevents stage2-hook.sh recursive chown from following a symlinked $HERMES_HOME/home (or profiles/cron) and destroying the host user's home directory. Also guards top-level state-file chowns and refuses first-boot seeding through symlinks. Fixes #52781. Co-authored-by: harjoth <harjoth.khara@gmail.com>	2026-06-26 12:09:52 +10:00
teknium1	1abfa66ba6	chore(release): add DavidMetcalfe to AUTHOR_MAP for PR #52272 salvage	2026-06-25 19:00:48 -07:00
DavidMetcalfe	865a09a610	fix(agent): detect thinking-timeout for reasoning models and surface actionable guidance instead of misleading file-write advice Two-part fix: Part 1 (classifier override at agent/error_classifier.py:720-738): A transport disconnect on a reasoning model — even on a large session — now routes to FailoverReason.timeout instead of context_overflow. Without this, large-session reasoning-model disconnects route to the compression branch and silently delete conversation history on a phantom context-length error. The override is strictly targeted: non-reasoning models (gpt-4o, claude-3-5-sonnet, llama-3.3-70b, etc.) still route to context_overflow on large sessions — the existing intentional behavior for chat models whose proxy doesn't idle-kill during prefill/generation. Part 2 (new agent/thinking_timeout_guidance.py + integration at agent/conversation_loop.py:3488-3567): New is_thinking_timeout() and build_thinking_timeout_guidance() helpers. When a known reasoning model (NVIDIA Nemotron 3 Ultra, OpenAI o1/o3, Anthropic Opus 4.x thinking, DeepSeek R1, Qwen QwQ, xAI Grok reasoning) hits a transport-kill on a small session (classifier says timeout directly) or after Part 1 routes correctly (large session), the user now sees reasoning-specific guidance with three actionable workarounds in priority order: 1. Set providers.<provider>.models.<model>.stale_timeout_seconds: 900 in ~/.hermes/config.yaml (Hermes's built-in floor is already 600s for known reasoning models; raise further if upstream is even tighter). 2. Lower reasoning_budget or set reasoning_effort: medium on this model if the provider supports it. 3. Use a smaller / faster reasoning model if the task doesn't require deep thinking. The new guidance takes precedence via if/elif over the existing _is_stream_drop block, so a reasoning-model user with a transport-kill message sees actionable advice instead of the misleading "try execute_code with Python's open() for large files" advice (which is correct for the unrelated large-file-write stream-drop case but actively wrong for the thinking-timeout case). Verified: - 478 tests passing across 9 directly-relevant files (49 new + 429 existing, zero regressions). - Ruff lint clean on all 4 modified/new files. - Negative test: 6 parametrized regression guards confirm non-reasoning models still route to context_overflow on large sessions; 4 parametrized gates confirm non-timeout classifier reasons never trigger the guidance; 5 parametrized cases confirm non-transport messages never trigger it. - Regression guard: new guidance message does NOT contain "execute_code" or "open()" — the misleading advice is fully replaced, not appended alongside. - Cross-vendor dual review via agy -p: - Gemini 3.5 Flash (Medium) — passed: true, zero blockers, one SHOULD-FIX (vprint block duplication — fixed by extracting detection into a helper module). - GPT-OSS 120B (Medium) — passed: true, zero blockers, two nits (test placement — adopted at tests/agent/test_thinking_timeout_guidance.py; primary-model capture — accepted as non-issue per Flash's nit). Dependency note for maintainers: This PR includes agent/reasoning_timeouts.py (the reasoning-model allowlist module from PR #52238) because the Layer 1 override is load-bearing on get_reasoning_stale_timeout_floor(). After PR #52238 lands on main, this PR's duplicate agent/reasoning_timeouts.py should be rebased away. Either PR can land first; the other rebase is mechanical. Fixes #52271.	2026-06-25 19:00:48 -07:00
Teknium	811df74a10	fix(gateway): defer cross-process cache cleanup off the cache lock (#52197) (#52761) The #45966 cross-process coherence guard popped the stale cached agent and then called the blocking _cleanup_agent_resources (memory-provider shutdown, tool-resource teardown, async-client teardown) while still holding _agent_cache_lock, on the gateway event-loop thread. While that ran, _sweep_idle_cached_agents (driven by _session_expiry_watcher) blocked acquiring the same lock and the asyncio loop stalled for minutes, tripping repeated Discord 'heartbeat blocked' warnings. Fix mirrors the cap-enforcer / idle-sweep paths: pop the stale entry under the lock, release it, then schedule the SOFT release on a daemon thread. The soft path (_release_evicted_agent_soft) is also more correct here than the hard teardown the regression used — the same session rebuilds a fresh agent immediately after invalidation, so its terminal sandbox / browser / bg processes (keyed on task_id) must be preserved for the rebuilt agent to inherit, not torn down. Verified the cross-process site was the only cleanup-under-lock instance; the other _cleanup_agent_resources call sites run outside the lock.	2026-06-25 18:58:47 -07:00
teknium1	e29823f1e8	chore(release): map agt-user noreply email for #48496 salvage	2026-06-25 18:50:11 -07:00
Teknium	ce802e932c	fix(telegram): heartbeat loop exits cleanly when bot has no get_me CI shard test_telegram_conflict.py timed out (140s) because the new _polling_heartbeat_loop, started by connect(), busy-spun under those tests: they monkeypatch asyncio.sleep to instant and pass a bot double with no get_me(), so the probe raised AttributeError (swallowed) and the loop re-entered immediately with no real pacing, starving the event loop. Guard the loop to return when bot.get_me is not callable — a real PTB Bot always exposes it, so this only triggers on a torn-down app or a test double, where there is nothing to probe. Also cancel the heartbeat task in the conflict tests that call connect() without disconnect(), matching the production disconnect() teardown. Verified: test_telegram_conflict.py now runs in ~4.5s; the 22 heartbeat/reconnect tests still pass; E2E confirms a hanging get_me still fires the reconnect ladder while a missing get_me exits without spinning.	2026-06-25 18:50:11 -07:00
agt-user	8501caf51f	fix(telegram): persistent heartbeat loop to detect CLOSE-WAIT polling sockets When a Telegram long-poll TCP socket enters CLOSE-WAIT (remote sent FIN but httpx hasn't noticed), epoll still reports it readable so no exception is raised. PTB's error_callback never fires, the reconnect ladder never engages, and the gateway silently stops receiving messages while the process stays alive — until a manual systemctl restart. The existing recovery only covers two cases: error_callback-driven reconnects (which require an exception PTB never gets) and a one-shot _verify_polling_after_reconnect probe (which runs only right after an explicit reconnect). A socket that wedges during steady-state operation is never detected. Add _polling_heartbeat_loop: a background asyncio.Task started in connect() (polling mode only) that probes get_me() every 90s on the general request pool (not the getUpdates pool, so healthy long-polls are never interrupted). On asyncio.TimeoutError/OSError it hands off to the existing _handle_polling_network_error ladder; other errors are swallowed. disconnect() cancels and awaits the task. Worst-case detection window ~105s. Complementary to #51541 (general-pool keepalive limits / fd leak) — that recycles idle pooled connections; this detects a wedged active read. Fixes #48495 Co-authored-by: agt-user <267614622+agt-user@users.noreply.github.com>	2026-06-25 18:50:11 -07:00
liuhao1024	56cf517ccd	fix(cron): detect partial job loss in restore_cron_jobs_if_emptied (#52144) The desktop scheduler can overwrite cron/jobs.json with its own small set of internally-tracked crons after an update/restart, causing partial loss of tool-created cron jobs. The previous guard only checked for total loss (live_count == 0), missing the case where live_count > 0 but less than the pre-update snapshot count. Compare live_count against snap_count instead of checking for zero, so both total loss (0 vs N) and partial loss (1 vs 19) trigger restoration. Salvaged from #52161 by @liuhao1024. Closes #52144	2026-06-25 18:49:18 -07:00
brooklyn!	6b639bc2b9	Merge pull request #52772 from NousResearch/bb/editor feat(desktop): in-app spot editor for the file preview pane	2026-06-25 20:25:06 -05:00
brooklyn!	41f4dce828	Merge pull request #52756 from NousResearch/bb/delegate-bg-resume-ux feat(delegation): calm "will resume" affordance for background delegate_task	2026-06-25 20:08:06 -05:00
Brooklyn Nicholson	985350dd85	feat(cli): note background delegate_task dispatch in _on_tool_complete A top-level delegate_task dispatches in the background and re-enters as a fresh turn when done. Print a one-line dispatch-time note — no spinner, nothing to poll — so the idle prompt doesn't read as "nothing happened."	2026-06-25 19:57:58 -05:00
Brooklyn Nicholson	7f02f30b76	feat(tui): add width-budgeted "resumes when subagent finishes" status segment When idle with a background subagent still in flight, append a tail status segment spelling out that the agent resumes on its own. Width-budgeted like every tail segment, so it drops first on a tight terminal where the ⛓ count already carries the signal.	2026-06-25 19:57:58 -05:00
Brooklyn Nicholson	563d347e4d	feat(desktop): show a calm "will resume" notice for background delegate_task When idle with a top-level delegate_task still in flight, render a static, shimmering system-note at the transcript tail instead of a spinner (which reads as "stuck"). Reuses the shared steer / slash-status chrome (centered, 0.6875rem, muted, Codicon) so it sits in the thread like every other meta line, and mirrors the primary child's latest stream line, falling back to generic copy. i18n across en/ja/zh/zh-hant; markdown prose/heading rhythm tuned so a re-entered turn breathes.	2026-06-25 19:57:51 -05:00
Brooklyn Nicholson	6e096a850a	feat(desktop): add $backgroundResume store for parked delegate_task Track top-level delegate_task work that dispatches in the background and re-enters as a fresh turn. $backgroundResume returns {count, activity} for the active session while idle — count of parked tasks plus the primary child's latest stream line (tool/progress/thinking) when readable.	2026-06-25 19:57:45 -05:00
Brooklyn Nicholson	09623b4527	fix(desktop): make the tab modified dot amber with a separating ring Use the app's amber warn color for the unsaved-edits tab dot (was inheriting the label text color) and add a tab-bg ring + soft drop shadow so it stays legible where it overlaps the filename.	2026-06-25 19:55:31 -05:00
Brooklyn Nicholson	c456029b4e	Merge remote-tracking branch 'origin/main' into bb/editor	2026-06-25 19:53:31 -05:00
Brooklyn Nicholson	1f950e189c	feat(desktop): vertical resize for the bottom-row terminal pane Extends the pane store with heightOverride (alongside widthOverride) and a get/set/clear API, and wires the pane shell + desktop controller so the bottom-row terminal pane can be resized on the Y axis with its size persisted.	2026-06-25 19:50:29 -05:00
Brooklyn Nicholson	ff81365988	feat(desktop): in-app spot editor for the file preview pane Adds a CodeMirror 6 spot editor to the right-rail file preview so users can make quick edits in-app without leaving for an IDE. Entering edit mode is a pure in-place swap of the read view — same fixed-height header, same gutter geometry/typography (mirrors SourceView 1:1) so nothing shifts — toggled via the Edit button, a bare `e` when the pane is hovered/focused, or the tab. - Save path is transport-agnostic (writeDesktopFileText): local Electron IPC or a new hardened POST /api/fs/write-text on the dashboard server (path validation, parent-must-exist, regular-files-only, size cap, atomic temp-file + os.replace), behind the existing auth middleware. - Stale-on-disk guard re-reads before writing and offers overwrite vs discard-and-reload instead of clobbering external/agent edits. - VS Code-style modified dot on the tab; ⌘/Ctrl+S and ⌘/Ctrl+Enter save, Esc cancels; GitHub highlight style matched to the read view's Shiki theme. - Typing stays render-free (draft in a ref; dirty flips once at the boundary).	2026-06-25 19:50:25 -05:00
Que0x	b8fc8c908b	fix(approval): fold Windows absolute home paths in dangerous-command detection The detector folds absolute home / Hermes-home prefixes into their canonical ~/ and ~/.hermes/ forms so static patterns catch /home/alice/.bashrc the same way they catch ~/.bashrc (`abd69b81`). On native Windows this fold never fired, so terminal commands writing to shell startup files, ~/.ssh/authorized_keys, or ~/.hermes/config.yaml / .env returned "safe" and skipped the approval prompt — and config.yaml carries the approval policy itself. Two compounding causes: 1. The fold ran after the backslash-escape strip (r\m -> rm), which dissolves the backslash separators in a Windows path (C:\Users\alice\.bashrc -> C:Usersalice...) before the fold could match. It now runs before the strip. 2. The fold only recognized POSIX absolute paths and only the home prefix, leaving multi-segment backslash suffixes (\.ssh\authorized_keys) to be mangled by the strip. Consolidated into _home_prefix_fold_regex / _fold_home_prefixes: match a home prefix with either separator, capture the rest of the path token, and normalize its separators to / so multi-segment patterns match. The degenerate-path guard generalizes count("/") >= 2 to "at least two components below the root" (also rejecting a bare drive root C:\). HOME is consulted directly because Windows' expanduser ignores it; the more specific Hermes home is folded first, longest candidate first, so neither fold clobbers the other. POSIX behavior unchanged; the r\m -> rm anti-obfuscation strip still runs. Adds TestWindowsAbsolutePathFolding, which monkeypatches a Windows-style HOME/HERMES_HOME so the behavior is also exercised on the CI runner.	2026-06-25 17:49:39 -07:00
brooklyn!	7cd5eaa646	Merge pull request #52745 from NousResearch/desktop/bundle-main desktop: bundle main.cjs for electron	2026-06-25 19:06:59 -05:00
ethernet	df514654ba	desktop: bundle main.cjs for electron fixes simple-git not found	2026-06-25 20:05:20 -04:00
brooklyn!	55af6c447a	Merge pull request #52206 from NousResearch/bb/desktop-tools-curation fix(desktop): hide platform/internal toolsets from the Skills & Tools list	2026-06-25 18:56:04 -05:00
teknium1	6dfb8326f5	fix(state): exclude delegate/branch/tool children from resume walk + reconcile salvaged fixes Follow-up to the salvage of #45035 + #48682. The two PRs touched different functions (resolve_resume_session_id vs get_compression_tip) but #45035's descendant walk followed ANY parent_session_id child, so a delegate/subagent child could hijack the resume target. Apply the same _branched_from / _delegate_from / source!='tool' exclusion the rest of hermes_state.py uses, so the resume walk only follows genuine compression continuations. Also updates the unrealistic delegation test fixture to carry the real _delegate_from marker, and updates 3 list_sessions_rich test mocks for the order_by_last_active kwarg #48682 added. AUTHOR_MAP: map PINKIIILQWQ + ailang323 salvage authors.	2026-06-25 16:29:09 -07:00
longer	6d9ca04574	fix(desktop): resume latest compression continuation	2026-06-25 16:29:09 -07:00
Pink	263f6b03eb	chore: rename test to reflect new semantics of resolve_resume_session_id	2026-06-25 16:29:09 -07:00
PINKIIILQWQ	abd6b85200	fix(state): resolve compression chain tip in resolve_resume_session_id After context compression, the parent session holds pre-compression messages and a child (or deeper descendant) holds the continuation. resolve_resume_session_id() short-circuited when the input session already had messages (row is not None -> return session_id), causing REST API endpoints, gateway resume, and CLI resume to serve stale parent messages. Remove the early-return. Walk the full descendant chain, record the deepest node that has messages (best), and return best if not None else the original session_id (preserving the empty-chain fallback). Callers (api_server.py, web_server.py, cli_agent_setup_mixin.py, cli_commands_mixin.py) all use the resolved != input -> redirect pattern and are transparent to this change.	2026-06-25 16:29:09 -07:00
Teknium	208f0d7c3b	fix(update): default pre-update backup to off (#52729) The pre-update HERMES_HOME zip shipped on by default (DEFAULT_CONFIG + runtime fallback both True), so every `hermes update` zipped the entire ~/.hermes — sessions DB, caches, skills — adding minutes to each update. The shipped cli-config.yaml.example, the --backup help, and the example config all already said "off by default," so the live default contradicted its own documentation. Flip the default to off everywhere: DEFAULT_CONFIG, the runtime `.get(..., False)` fallback in _run_pre_update_backup, and the stale --backup help string. Users who want the #48200 safety net opt in via updates.pre_update_backup: true or --backup for a single run. Updated test_default_enabled_creates_backup -> test_default_disabled_is_silent to assert the new default (silent no-op, no zip).	2026-06-25 16:01:09 -07:00
kshitij	e4ff494860	fix(cron): add default retention to per-run job output (#52383) (#52646) * fix(cron): add default retention to per-run job output to bound disk usage (#52383) Per-run cron output (cron/output/<job>/<timestamp>.md) is written once per execution and was never pruned, so a frequently-scheduled job on a long-running deploy accumulates one file per run indefinitely and can fill the volume ('no space left on device'). save_job_output() now keeps the most recent N output files per job and removes older ones. N defaults to 50 and is configurable via cron.output_retention; a non-positive value disables pruning for operators who manage cleanup externally. Salvaged from #52402 by @0xDevNinja. Closes #52383 * fix(config): add cron.output_retention to DEFAULT_CONFIG Follow-up to #52383: the retention config key was functional via get()-with-default but missing from DEFAULT_CONFIG, so the deep-merge wouldn't auto-populate it for new installs. Add it explicitly. --------- Co-authored-by: 0xDevNinja <manmit0x@gmail.com>	2026-06-25 16:00:13 -07:00
brooklyn!	ffa3d3c811	Merge pull request #49037 from NousResearch/bb/projects-paradigm feat(desktop): first-class projects — sidebar, coding rail, review pane, and agent project tools	2026-06-25 17:49:05 -05:00
Teknium	fd2a35b169	fix: stop reporting cache-hit rate and cost across all UI surfaces (#52717) * fix: stop reporting cache-hit rate and cost across all UI surfaces Cost estimates and cache read/write token reporting are unreliable on providers that don't surface cached_tokens (e.g. ollama-cloud, which doesn't implement prompt_tokens_details.cached_tokens), producing misleading near-zero 'cache hit' readouts and cost figures. Remove cost + cache-hit reporting from every user-facing surface; keep input/output/total token counts (provider-agnostic and accurate) and the Nous account billing UI (real account money, separate from per-conversation estimates). Surfaces: - CLI /usage + model-info: drop cost lines + cache read/write token lines - Gateway /usage + /model: drop cost + cache lines - tui_gateway/server.py: stop emitting cost_usd / cache_read in usage and subagent.complete payloads - TUI (Ink): drop cost from status bar (+ showCost plumbing), /usage panel, thinking rollup, agents overlay (incl. compare view); keep token counts - Desktop Command Center: drop cost stat, per-model cost, actual-cost hint Underlying estimate_usage_cost / format_cost / insights cost columns are left intact but no longer surfaced (display-only change, reversible). * test: update TUI + gateway + CLI tests for removed cost/cache-hit reporting - CLI /usage test asserts cost/cache lines are absent, tokens present - gateway /usage test drops cost + cache asserts; removes cost-included test - TUI subagentTree summary expectation drops the cost segment - useConfigSync + appChrome status-rule tests drop showCost prop/state	2026-06-25 15:21:22 -07:00
Brooklyn Nicholson	19ca295a84	fix(desktop): clarify branch convert actions Open checked-out branches, switch the primary checkout for the default branch, and create linked worktrees only for non-trunk free branches.	2026-06-25 17:19:36 -05:00
teknium1	3e99ec0ff9	test(hermes_state): cover update_session_billing_route overwrite + prompt null Regression for the salvaged #48254 fix: billing route is first-writer-wins via update_token_counts (COALESCE), so a mid-session provider switch left the dashboard attributing cost to the original provider. Asserts the new update_session_billing_route() overwrites unconditionally, nulls system_prompt so the next turn rebuilds Model:/Provider:, and preserves billing_mode when omitted (COALESCE on None).	2026-06-25 14:44:00 -07:00
x7peeps	c7e934a5b4	fix(hermes_state): persist billing provider/base_url after mid-session /model switch The session database records billing_provider and billing_base_url using COALESCE(column, ?) in update_token_counts(), making them write-once. When a user switches models mid-session via /model, the runtime (agent.provider, agent.base_url) updates correctly, but the session row never reflects the new provider. This causes the dashboard Models page to display a stale provider badge and misattributes token usage / cost analytics. Fix: add update_session_billing_route() that unconditionally sets billing_provider, billing_base_url, and billing_mode (no COALESCE), and call it from switch_model() in agent_runtime_helpers.py after the swap succeeds. This follows the same pattern as update_session_model() which already unconditionally updates the model column (added for the identical COALESCE problem on the model field). Closes #48248	2026-06-25 14:44:00 -07:00
Gille	bf0513bca0	test(windows): align gateway restart CI coverage	2026-06-25 14:42:38 -07:00
Gille	e7d2f0b93c	fix(windows): suppress console flashes and harden gateway restarts	2026-06-25 14:42:38 -07:00
Brooklyn Nicholson	9f3aa1685c	fix(cli): register project command beside MoA	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	890e890281	chore(desktop): update package lock	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	a391523bcc	i18n(desktop): add project and worktree strings	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	b8d220f268	feat(desktop): wire project settings and shell chrome	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	62af32efe7	feat(desktop): keep active sessions aligned with cwd	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	68680db10d	feat(desktop): add Codex-style review pane	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	7a7f9a5b3d	feat(desktop): add composer coding rail and worktree flow	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	488ae376db	feat(desktop): render backend-authoritative projects sidebar	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	74352a1e61	feat(desktop): add project and coding stores	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	344415892f	feat(desktop): add shared project UI primitives	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	e2b8018729	feat(desktop): add git worktree and review IPC	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	86e748df13	fix(agent): require code for coding posture	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	cb3f8ec03d	fix(tools): isolate per-session worktree cwd	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	4ffdedd369	feat(tools): add project workspace tools	2026-06-25 16:40:27 -05:00
Brooklyn Nicholson	4e023f5bc9	feat(gateway): build authoritative project tree	2026-06-25 16:40:27 -05:00
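
Commit abd6b85200 in the log above describes walking the descendant chain of a compressed session and returning the deepest node that has messages, falling back to the original id. The sketch below illustrates that walk under stated assumptions: the function name `resolve_tip`, the `child_of` mapping, and the `has_messages` callback are hypothetical stand-ins, not the real `hermes_state.py` API, and each session is assumed to have at most one continuation child. The delegate/branch/tool exclusion from commit 6dfb8326f5 is not modeled.

```python
from typing import Callable, Dict, Optional


def resolve_tip(
    session_id: str,
    child_of: Dict[str, str],
    has_messages: Callable[[str], bool],
) -> str:
    """Return the deepest session in the chain that holds messages.

    Hypothetical sketch: `child_of` maps a parent session id to its
    single continuation child (an assumption made for illustration).
    """
    best: Optional[str] = None
    current: Optional[str] = session_id
    seen = set()  # guards against a cyclic chain in malformed data
    while current is not None and current not in seen:
        seen.add(current)
        if has_messages(current):
            best = current  # a deeper node with messages replaces a shallower one
        current = child_of.get(current)
    # Empty-chain fallback: no node had messages, so keep the input id.
    return best if best is not None else session_id
```

With `child_of = {"a": "b", "b": "c"}` and messages only on `a` and `b`, the walk would return `b` instead of stopping early at `a`, which is the stale-parent case the commit message describes.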

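Commit e4ff494860 in the log above says per-run cron output is now pruned to the most recent N files per job, with N defaulting to 50 and a non-positive value disabling pruning. A minimal sketch of that policy follows. The function name `prune_job_output` is hypothetical, and the ordering rule is an assumption: file names are taken to be timestamps that sort lexicographically, which the commit message does not confirm.

```python
from pathlib import Path
from typing import List


def prune_job_output(job_dir: Path, retention: int = 50) -> List[Path]:
    """Delete all but the newest `retention` .md files in `job_dir`.

    Hypothetical sketch; returns the paths that were removed.
    """
    if retention <= 0:
        # A non-positive value disables pruning entirely.
        return []
    # Assumption: timestamped file names sort oldest-first lexicographically.
    files = sorted(job_dir.glob("*.md"), key=lambda p: p.name)
    stale = files[:-retention]  # everything except the newest `retention` files
    for path in stale:
        path.unlink()
    return stale
```

Sorting by name rather than by modification time keeps the result stable if files are copied or restored from a backup, though either choice would satisfy the behavior the commit describes.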

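Commit c7e934a5b4 in the log above attributes the stale billing provider to a `COALESCE(column, ?)` update, which makes the column write-once, and fixes it with an unconditional update. The sketch below reproduces that difference with the standard-library `sqlite3` module. The one-column `sessions` table and the helper names are hypothetical simplifications for illustration, not the real schema or the real `update_session_billing_route()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema for illustration only.
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, billing_provider TEXT)")
conn.execute("INSERT INTO sessions (id) VALUES ('s1')")


def write_once(provider: str) -> None:
    # COALESCE keeps the existing value whenever it is not NULL,
    # so only the first writer ever takes effect.
    conn.execute(
        "UPDATE sessions SET billing_provider = COALESCE(billing_provider, ?) "
        "WHERE id = 's1'",
        (provider,),
    )


def overwrite(provider: str) -> None:
    # Unconditional assignment: the latest writer wins.
    conn.execute(
        "UPDATE sessions SET billing_provider = ? WHERE id = 's1'",
        (provider,),
    )


def current() -> str:
    row = conn.execute(
        "SELECT billing_provider FROM sessions WHERE id = 's1'"
    ).fetchone()
    return row[0]


write_once("provider-a")
first = current()         # first write lands because the column was NULL
write_once("provider-b")
after_switch = current()  # unchanged: COALESCE ignored the new value
overwrite("provider-b")
after_fix = current()     # the unconditional update took effect
```
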
12972 commits