hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-20 10:11:58 +00:00

Author	SHA1	Message	Date
harshitAgr	3791a87dbe	fix(openviking): close remaining session-boundary races on switch Three follow-ups from review on #28296: 1. Sync worker outliving the bounded join. Each sync_turn POST has _TIMEOUT=30s and there are two per turn, but on_session_end and on_session_switch only join for 10s. If the worker is still alive after the join, committing the old session orphans the worker's late writes past the commit boundary — they land in an already- committed session and never get extracted. Both hooks now re-check is_alive() after the join and skip the commit when the worker hasn't drained. 2. on_memory_write late session_id capture. Same shape as the pre-fix sync_turn: f-string for the post path read self._session_id inside the worker, so a switch between thread spawn and post call landed the memory note in the new session. Snapshot sid at call time, same pattern as sync_turn. 3. Stale prefetch repopulating the new session. The pre-switch drain+clear only protects against workers that finish before the join completes; one finishing after the clear would write its result into the new generation's slot. Added a monotonic _prefetch_generation; workers capture it at spawn and refuse to write if it has advanced. Tests: existing in-flight-sync test updated to drain (it tested the join-before-commit happy path); four new tests cover hung-writer skip on end + switch, on_memory_write sid capture, and prefetch generation gating. 177/177 memory tests pass.	2026-05-21 07:21:18 +03:00
harshitAgr	2ea8d5c537	fix(openviking): close session-boundary races on sync_turn and on_session_end Two hardening fixes prompted by review on #28296: 1. sync_turn() now snapshots the target session id before spawning the worker. The previous code read self._session_id inside the worker, so a worker delayed past on_session_switch's bounded join could read the rotated-in NEW id and write the OLD turn's messages into the wrong session. 2. on_session_end() resets _turn_count to 0 after a successful commit, making the old-session commit path idempotent with the new switch hook. /new and compression call commit_memory_session() (which fires on_session_end) immediately before on_session_switch; without this, the old session would be committed twice. On commit failure we leave _turn_count > 0 so on_session_switch retries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 13:19:44 +03:00
harshitAgr	a1e7185e8a	fix(openviking): implement on_session_switch hook (#28296) OpenVikingMemoryProvider only overrides on_session_end and inherits the base-class no-op for on_session_switch. When the agent rotates session_id (via /new, /branch, /reset, /resume, or context compression), the provider's cached _session_id stays at the value initialize() captured. All subsequent sync_turn writes then land in the already-closed old session, and on_session_end tries to commit it a second time; the new session never accumulates messages and never triggers memory extraction. The fix mirrors the pattern Hindsight uses (#17508): 1. Wait for any in-flight sync thread to drain under the OLD _session_id before we mutate it, otherwise the commit below races the last message write. 2. Commit the old session if it accumulated turns, with the same extraction semantics as on_session_end. Skip if empty (nothing to extract). 3. Drain in-flight prefetch from the old session and clear its cached result so the new session doesn't see stale recall. 4. Rotate _session_id to the new value and reset _turn_count. Commit failures are swallowed (logged at WARN) so a flaky server can't strand the provider on the old session forever (same posture as the existing on_session_end commit).	2026-05-19 07:07:27 +03:00
RyanRana	206f595f66	perf(prompt): cache kanban worker guidance at session init Salvages #24402 by @RyanRana. The KANBAN_GUIDANCE block (~835 tokens) is session-static — the dispatcher decides at spawn time whether the process is a kanban worker via the kanban_show tool's check_fn (gated on HERMES_KANBAN_TASK env var). Re-checking 'kanban_show' in valid_tool_names and re-loading the reference on every system-prompt rebuild (init + each context compression) is wasted work. Caches the resolved string on agent._kanban_worker_guidance once in agent_init and consumes it in system_prompt.build_system_prompt(), with a getattr fallback for code paths that bypass agent_init.	2026-05-18 20:56:44 -07:00
Bartok9	365da2d2df	fix: 4 small surgical bugs Salvages #23302 by @Bartok9. Four independent one-area fixes: 1. kanban boards delete alias now hard-deletes (not archives) — the alias didn't carry --delete, so getattr(args, 'delete', False) returned False. Detect boards_action=='delete' explicitly. 2. Gateway auto-title failures no longer leak as user-visible warnings — debug-log only since they're not actionable. 3. Background process completion notification snaps truncation to the next newline boundary, prepends a marker when content is dropped. 4. _cprint() schedules the run_in_terminal coroutine via asyncio.ensure_future so output isn't silently dropped from background threads (fixes #23185 Bug A). Skips the double-print fallback that would fire for mock paths.	2026-05-18 20:54:52 -07:00
LeonSGP43	3a7ed7be08	fix(packaging): ship bundled skills in wheel Salvages #23738 by @LeonSGP43. Wheel installs were missing skills/ and optional-skills/ because pyproject's [tool.setuptools.packages.find] only includes Python packages — the skills directories don't have __init__.py so they were silently dropped from the wheel. Adds setup.py with data_files spec emitting skills/* and optional-skills/* under hermes_agent-<v>.data/data/, and a get_bundled_skills_dir() helper in hermes_constants that discovers the wheel-installed location via sysconfig before falling back to a source-checkout path. tools/skills_sync uses the helper so 'hermes update' works for pip-installed users.	2026-05-18 20:52:35 -07:00
SimbaKingjoe	5fdcfd851f	feat(kanban): add max_in_progress config to cap concurrent running tasks Salvages #22981 by @SimbaKingjoe. Adds 'kanban.max_in_progress' config that caps simultaneously running tasks. When the board already has N running, dispatcher skips spawning so slow workers (local LLMs, resource-constrained hosts) don't pile up and time out. Threads through dispatch_once(max_in_progress=) and gateway dispatcher config parsing with validation (warns on invalid/below-1 values).	2026-05-18 20:50:13 -07:00
steezkelly	d3345cc70d	test: isolate Kanban env pins in hermetic fixture Salvages the substantive part of #22295 by @steezkelly. Adds the missing HERMES_KANBAN_HOME, HERMES_KANBAN_RUN_ID, HERMES_KANBAN_CLAIM_LOCK, HERMES_KANBAN_DISPATCH_IN_GATEWAY entries to _HERMES_BEHAVIORAL_VARS so ambient developer-shell pins on those vars don't bleed into pytest runs. The frozenset extraction + standalone regression test from the original PR were dropped to keep the change minimal — main already maintains the list inline.	2026-05-18 20:47:51 -07:00
LeonSGP43	a94ddd8073	fix(kanban): honor severity thresholds in diagnostics Salvages #26431 by @LeonSGP43. Dashboard plugin_api list_diagnostics was using exact-match (severity == filter), so '--severity warning' hid 'error' and 'critical' diagnostics. Adds severity_at_or_above() helper to kanban_diagnostics and uses it in the dashboard endpoint (CLI already used SEVERITY_ORDER comparison correctly).	2026-05-18 20:47:01 -07:00
LeonJS	9f008bcd5c	fix(kanban): release scratch workspace and tmux session on task completion Salvages #27369 by @LeonJS. complete_task() now calls _cleanup_workspace() and _cleanup_worker_tmux() after marking a task complete. Scratch workspaces (used by swarm agents) accumulate on disk — hundreds of MB per task, never released. Stale tmux sessions from completed agents also persist indefinitely. Both gates are safe: - workspace_kind == 'scratch' gate preserves user worktree/dir workspaces - tmux #{pane_dead} == 1 gate only kills sessions where the worker has already exited - best-effort: cleanup failures never block task completion	2026-05-18 20:45:29 -07:00
shunsuke-hikiyama	fb96208892	feat(kanban): add initial-status for human-ops cards Salvages #27526 by @shunsuke-hikiyama. Adds an --initial-status flag (running|blocked, default running) to 'kanban create', threaded through kanban_db.create_task() and the kanban_create tool schema. 'blocked' parks the task directly in the blocked column for R3 human-ops review, skipping the brief running-to-blocked transition. Dropped the unrelated 'add' alias, WIFEXITED Windows compat, and slash-handler error formatting changes that were bundled in the original PR; those should ship as their own focused changes if still wanted.	2026-05-18 20:44:02 -07:00
kronexoi	e8ce7b83fa	fix(kanban): reject direct running transitions in dashboard bulk updates Salvages #24050 by @kronexoi. The single-task PATCH already rejects direct status='running' since it bypasses the dispatcher/claim invariant, but the bulk-update endpoint still accepted it. Aligns bulk with single by emitting an error result row for any 'running' entry.	2026-05-18 20:38:32 -07:00
uzunkuyruk	666b66a066	fix(oneshot): pass fallback_providers from profile config to AIAgent Salvages #23368 by @uzunkuyruk. Oneshot workers (e.g. kanban workers spawned via 'hermes -p <profile> chat -q ...') were not honouring the profile's fallback_providers / fallback_model chain because oneshot.py never read the config and never passed fallback_model= to AIAgent. Reads cfg.get('fallback_providers') (new list format) or cfg.get('fallback_model') (legacy single-dict) with the same normalization cli.py applies, then forwards as fallback_model=_fb.	2026-05-18 20:37:23 -07:00
helix4u	713c231cf8	docs(kanban): document worker protocol auto-blocks Salvages #21585 by @helix4u. Documents the protocol_violation event (worker exits successfully while task is still running), adds --max-retries to the create flag list and --failure-limit to dispatch.	2026-05-18 20:36:32 -07:00
LeonSGP43	fdb374e10f	fix(packaging): ship dashboard plugin assets in wheel Salvages #23737 by @LeonSGP43. Adds plugins/* manifest.json and dist/ glob entries to setuptools package-data so wheel installs ship the bundled dashboard plugin assets (kanban, achievements, etc.). Without these, /api/dashboard/plugins can't discover plugin assets outside a source checkout.	2026-05-18 20:35:00 -07:00
oemtalks	b9d38a56dd	fix(kanban): don't crash dispatched workers when kanban-worker skill is absent Salvages #27372 by @oemtalks. The dispatcher unconditionally injected `--skills kanban-worker` into every worker spawn, but worker profiles sometimes don't have that bundled skill in their skills dir, which is fatal at CLI startup (`ValueError: Unknown skill(s): kanban-worker`). Adds `_kanban_worker_skill_available(hermes_home)` and only injects the flag when the skill resolves. The MANDATORY lifecycle still ships via KANBAN_GUIDANCE in the system prompt, so omitting the flag is safe.	2026-05-18 20:32:20 -07:00
Ade5954	0392cf53b5	fix(kanban): close sqlite connection on init failure to prevent fd leak Salvages #28301 by @Ade5954. If WAL setup, PRAGMA application, or schema init raises after sqlite3.connect() succeeds, the new connection was leaking. Wrap the body in try/except so the connection is closed before the exception propagates.	2026-05-18 20:30:56 -07:00
Bartok9	4341072563	docs(env): add HERMES_KANBAN_DISPATCH_IN_GATEWAY override (#21956) Salvages the env-vars docs portion of #21956 by @Bartok9. The ascii-guard-ignore tags from the original PR already landed on main.	2026-05-18 20:27:30 -07:00
maxmilian	2dec7604e2	fix(kanban-dashboard): make Orchestration mode checkbox label static The checkbox label echoed its state ("Auto (default)" / "Manual") instead of describing the action, so a checked box reading "Auto" parsed as a status indicator rather than a control. The accompanying sub-description was also static and started with "When on, ...", which read awkwardly when the box was unchecked. Replace the dynamic label with a static action label ("Auto-decompose triage tasks") and flip the sub-description between the two modes so it stays accurate either way. The top-of-page Orchestration pill is unchanged — that one is intentionally a status badge / toggle. Fixes #28178 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 20:26:18 -07:00
DoGMaTiiC	4da4133d34	fix: assign single-task kanban decompositions	2026-05-18 20:26:02 -07:00
roycepersonalassistant	6c4f11c64a	fix: show scheduled kanban tasks in dashboard	2026-05-18 20:25:45 -07:00
ACR27	a5c2836b07	feat(kanban): allow trimmed task comments SS-1647 live SHIP validation: real code + tests for kanban comment --max-len.	2026-05-18 20:25:29 -07:00
hanzckernel	5d079fee17	fix: harden Kanban worker Hermes command resolution	2026-05-18 20:25:09 -07:00
ht1072	0b547aea03	fix(kanban): make legacy task migration idempotent (cherry picked from commit 293f1c3a7241b0117669e049d9aa746c9645ac90)	2026-05-18 20:24:53 -07:00
haran2001	c30608cfbe	fix(kanban): preserve worker tools with restricted toolsets	2026-05-18 20:24:37 -07:00
LizerAIDev	f12382fcc4	docs(kanban-worker): document notification routing configuration	2026-05-18 20:24:21 -07:00
zccyman	fe5e0bf5a3	feat(kanban): add board-level default workdir (#25430)	2026-05-18 20:24:04 -07:00
LeonSGP43	8bfb456948	fix(kanban): pass accept-hooks to worker chat subprocess	2026-05-18 20:23:47 -07:00
LeonSGP43	0f620138b0	fix(kanban): make claim ttl configurable Co-Authored-By: Paperclip <noreply@paperclip.ing>	2026-05-18 20:23:31 -07:00
wesleysimplicio	86279160b0	fix(kanban): persist worker session metadata on completion Salvages #25579 by @wesleysimplicio. Stamps task_runs.metadata.worker_session_id from HERMES_SESSION_ID on kanban_complete. Cherry-picked the substantive commit (not the AUTHOR_MAP fixup tip) onto current main.	2026-05-18 20:22:27 -07:00
moortekweb-art	4f6101cc74	Fix Kanban dashboard initial board selection	2026-05-18 20:18:21 -07:00
Interstellar-code	d8ad431de8	fix(kanban): task_age() tolerates ISO-8601 timestamps Prevents ValueError crash in dashboard get_board() when a task has an ISO timestamp (e.g. "2026-05-10T15:00:00Z") instead of a unix epoch int. Adds _to_epoch() helper that normalises both formats.	2026-05-18 20:18:04 -07:00
psionic73	ca8126bd53	fix(kanban): serialize DB initialization	2026-05-18 20:17:48 -07:00
Drexuxux	917e51858d	fix(kanban): demote ready children when a parent is reopened	2026-05-18 20:17:28 -07:00
soynchux	9281599b6f	fix(kanban): align board_exists with board discovery rules	2026-05-18 20:17:10 -07:00
bradhallett	de9bcfc6a0	fix(kanban): fingerprint crash errors to prevent fleet-wide retry exhaustion When a systemic failure (provider outage, auth expiry, OOM) crashes multiple workers simultaneously, detect_crashed_workers increments each task failure counter independently. The circuit breaker only trips after N × failure_limit retries across the fleet. Fingerprint crash errors by normalizing host-specific details (PIDs, timestamps). When 3+ tasks crash with the same fingerprint in a single detection cycle, immediately trip the circuit breaker (failure_limit=1) instead of waiting for repeated failures. Isolated crashes (unique fingerprints) retain their normal retry budget. Protocol violations continue to trip immediately. Includes regression tests for systemic and isolated crash paths.	2026-05-18 20:16:50 -07:00
bradhallett	f042931852	fix(kanban): reset failure counters on unblock_task When a task is manually unblocked (blocked → ready/todo), the consecutive_failures counter and last_failure_error were left intact. The next failure would immediately re-trip the circuit breaker because the counter was still at or above the failure limit. Reset both fields on unblock so the task gets a fresh retry budget. Includes a regression test that verifies counters are zeroed.	2026-05-18 20:16:32 -07:00
sprmn24	5db0d72c90	fix(kanban): use 'is not None' check for max_runtime_seconds in create_task max_runtime_seconds=0 was being silently coerced to None due to a falsy check (if max_runtime_seconds). Zero is a valid value that causes the dispatcher to immediately time out a task. The adjacent max_retries parameter already used the correct 'is not None' pattern. Fixes the inconsistency by aligning max_runtime_seconds with max_retries.	2026-05-18 20:16:15 -07:00
bradhallett	40c1decb3b	fix(kanban): promote blocked tasks when parent dependencies complete recompute_ready only scanned 'todo' tasks for promotion, ignoring 'blocked' tasks entirely. When a task was blocked (e.g. by the circuit breaker) and its parent dependencies later completed, the task stayed stuck in 'blocked' forever unless manually unblocked. Now recompute_ready also scans 'blocked' tasks. When all parents are done/archived, the blocked task is promoted to 'ready' with failure counters reset — equivalent to an automatic unblock. Includes a regression test for the blocked-parent-done promotion path.	2026-05-18 20:15:55 -07:00
Que0x	bc961c13f3	fix(kanban): sync slash subcommands with live parser	2026-05-18 20:15:38 -07:00
argabor	f149e1e567	fix(cli): make kanban specify max_tokens configurable	2026-05-18 20:15:20 -07:00
Zyrixtrex	b7ea62e5d3	fix(kanban): promote dependents when a parent is archived	2026-05-18 20:15:03 -07:00
Zyrixtrex	326c15d955	fix(kanban): preserve notifier_profile for dashboard home subscriptions	2026-05-18 20:14:45 -07:00
QuenVix	afae2dd9ec	fix(kanban): keep board-management commands independent from board override	2026-05-18 20:14:27 -07:00
QuenVix	8a64e1580b	fix(kanban): ignore stale HERMES_KANBAN_BOARD for removed boards	2026-05-18 20:14:10 -07:00
ms-alan	97ac94fe56	fix(kanban): seed bundled skills (e.g. kanban-worker) on kanban init Closes #23725	2026-05-18 20:13:52 -07:00
momowind	4519d2b476	fix(web): add Cache-Control: no-store to plugin static file serving Prevents browser caching of stale dashboard plugin JS files that may contain bugs already fixed upstream (e.g. COLUMN_LABEL undefined).	2026-05-18 20:13:35 -07:00
briandevans	d62964cdfa	fix(kanban): clear _INITIALIZED_PATHS in remove_board so recycled DBs re-init schema Archiving or deleting a board via remove_board() leaves the path's "schema already initialized" entry in the module-level cache. A concurrent connect(board=<slug>) call (e.g. the dashboard event-stream poll loop) then: 1. resolves the same kanban.db path, 2. recreates the directory + an empty sqlite file because connect() does mkdir(parents=True, exist_ok=True), 3. skips the CREATE TABLE pass because the cache entry says the schema is already in place, 4. errors on the next read with `no such table: task_events`. Drop the cache entry before mutating the filesystem so the fresh file gets a proper schema init on next connect(). Applies to both archive=True (rename) and archive=False (rmtree) branches. Fixes #23833. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 20:13:17 -07:00
wuli666	028bbc5425	test(kanban-dashboard): cover _task_dict task_age fallback The fix in `061a1830` added an outer try/except in plugin_api._task_dict so that a future failure mode in kanban_db.task_age (anything _safe_int doesn't already absorb) cannot 500 the GET /board response. The _safe_int / task_age corruption paths got regression coverage in tests/hermes_cli/test_kanban_db.py, but the OUTER fallback contract remained untested -- meaning a refactor that drops the try/except would not be caught by CI. Pin that contract from both consumers of _task_dict: - GET /board returns 200 with the literal fallback age dict for the affected card (other cards continue to render via the same path) - GET /tasks/:id (drawer view) returns 200 with the same fallback, so a single corrupt task can't block its own drawer Both tests force task_age to raise RuntimeError rather than ValueError on '%s', because ValueError is absorbed by _safe_int and never reaches the outer try/except -- testing that path would only re-cover what test_kanban_db.py already pins. Manually verified the regression discipline: git checkout `061a1830`^ -- plugins/kanban/dashboard/plugin_api.py pytest -k task_age_exception # both FAIL with 500 git checkout HEAD -- plugins/kanban/dashboard/plugin_api.py pytest -k task_age_exception # both PASS	2026-05-18 20:12:52 -07:00
hongchen1993	f01ee0b575	feat: per-task model override for kanban workers - Add model_override field to Task class and tasks schema - Add migration for existing databases - Spawn worker with -m model when model_override is set	2026-05-18 20:12:28 -07:00
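
Note on the session-boundary fixes in 2ea8d5c537 and 3791a87dbe: the two patterns those messages describe (snapshotting the session id at call time, and gating late prefetch writes on a monotonic generation counter) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the provider's actual code. The class PrefetchCache and its methods start_prefetch, switch_session and result are hypothetical names invented for the sketch; only the idea of a _prefetch_generation counter and a call-time session id snapshot comes from the commit messages.

```python
import threading


class PrefetchCache:
    """Hypothetical stand-in for a memory provider's prefetch state."""

    def __init__(self, session_id):
        self._lock = threading.Lock()
        self._session_id = session_id
        self._generation = 0   # bumped on every session switch
        self._result = None    # cached prefetch result for the current session

    def start_prefetch(self, fetch):
        # Snapshot the session id and generation at call time, so the worker
        # never reads state that a later switch may have rotated.
        with self._lock:
            sid = self._session_id
            gen = self._generation

        def worker():
            value = fetch(sid)
            with self._lock:
                # Refuse to write if a switch happened while we were fetching.
                if gen != self._generation:
                    return
                self._result = value

        thread = threading.Thread(target=worker, daemon=True)
        thread.start()
        return thread

    def switch_session(self, new_session_id):
        # Advance the generation first so any in-flight worker becomes stale,
        # then clear the cached result and rotate the id.
        with self._lock:
            self._generation += 1
            self._result = None
            self._session_id = new_session_id

    def result(self):
        with self._lock:
            return self._result
```

With this shape, a worker that finishes after switch_session() still fetches for the old session id it captured, but its result is dropped instead of landing in the new session's slot.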


8874 commits