feat(delegate): cross-agent file state coordination for concurrent subagents (#13718)

* feat(models): hide OpenRouter models that don't advertise tool support

Port from Kilo-Org/kilocode#9068.

hermes-agent is tool-calling-first — every provider path assumes the
model can invoke tools. Models whose OpenRouter supported_parameters
doesn't include 'tools' (e.g. image-only or completion-only models)
cannot be driven by the agent loop and fail at the first tool call.

Filter them out of fetch_openrouter_models() so they never appear in
the model picker (`hermes model`, setup wizard, /model slash command).

Permissive when the field is missing — OpenRouter-compatible gateways
(Nous Portal, private mirrors, older snapshots) don't always populate
supported_parameters. Treat missing as 'unknown → allow' rather than
silently emptying the picker on those gateways. Only hide models
whose supported_parameters is an explicit list that omits tools.

Tests cover: tools present → kept, tools absent → dropped, field
missing → kept, malformed non-list → kept, non-dict item → kept,
empty list → dropped.
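
A minimal sketch of the permissive filter described above, not the code in fetch_openrouter_models(). The helper names `supports_tools` and `filter_tool_capable` are hypothetical; only the `supported_parameters` field and the keep/drop rules come from this message.

```python
from typing import Any, List


def supports_tools(item: Any) -> bool:
    """Return False only for an explicit supported_parameters list lacking 'tools'."""
    if not isinstance(item, dict):
        return True  # non-dict item: kept
    params = item.get("supported_parameters")
    if not isinstance(params, list):
        return True  # field missing or malformed: unknown, so allow
    return "tools" in params  # an empty list is explicit, so it is dropped


def filter_tool_capable(models: List[Any]) -> List[Any]:
    """Keep every entry that is not known to lack tool support."""
    return [m for m in models if supports_tools(m)]
```

The check keys on "is this an explicit list" rather than truthiness, which is what keeps a missing field (kept) distinct from an empty list (dropped).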

* feat(delegate): cross-agent file state coordination for concurrent subagents

Prevents mangled edits when concurrent subagents touch the same file
(same process, same filesystem — the mangle scenario from #11215).

Three layers, all opt-out via HERMES_DISABLE_FILE_STATE_GUARD=1:

1. FileStateRegistry (tools/file_state.py) — process-wide singleton
   tracking per-agent read stamps and the last writer globally.
   check_stale() names the sibling subagent in the warning when a
   non-owning agent wrote after this agent's last read.

2. Per-path threading.Lock wrapped around the read-modify-write
   region in write_file_tool and patch_tool. Concurrent siblings on
   the same path serialize; different paths stay fully parallel.
   V4A multi-file patches lock in sorted path order (deadlock-free).

3. Delegate-completion reminder in tools/delegate_tool.py: after a
   subagent returns, writes_since(parent, child_start, parent_reads)
   appends '[NOTE: subagent modified files the parent previously
   read — re-read before editing: ...]' to entry.summary when the
   child touched anything the parent had already seen.

Complements (does not replace) the existing path-overlap check in
run_agent._should_parallelize_tool_batch. The batch check prevents
same-file parallel dispatch within one agent's turn (cheap prevention,
zero API cost), while the registry catches cross-subagent and
cross-turn staleness at write time (detection).
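
To make the detection side concrete, here is a toy registry. It is a sketch, not the FileStateRegistry in tools/file_state.py: the class name, `record_read`, `record_write`, and the explicit `when` timestamps are assumptions for illustration. Only check_stale and writes_since are named in this message, and their real signatures may differ.

```python
import threading
from typing import Dict, Iterable, List, Optional, Tuple


class MiniFileState:
    """Toy stand-in: per-agent read stamps plus the last writer per path."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._reads: Dict[str, Dict[str, float]] = {}         # task -> path -> time
        self._last_write: Dict[str, Tuple[str, float]] = {}   # path -> (task, time)

    def record_read(self, task_id: str, path: str, when: float) -> None:
        with self._guard:
            self._reads.setdefault(task_id, {})[path] = when

    def record_write(self, task_id: str, path: str, when: float) -> None:
        with self._guard:
            self._last_write[path] = (task_id, when)
            # The writer has seen its own content, so refresh its read stamp.
            self._reads.setdefault(task_id, {})[path] = when

    def check_stale(self, task_id: str, path: str) -> Optional[str]:
        """Warn when another task wrote the path after this task's last read."""
        with self._guard:
            read_at = self._reads.get(task_id, {}).get(path)
            last = self._last_write.get(path)
        if read_at is None or last is None:
            return None
        writer, wrote_at = last
        if writer != task_id and wrote_at > read_at:
            return f"{path} was modified by {writer} after your last read"
        return None

    def writes_since(
        self, exclude_task_id: str, since: float, paths: Iterable[str]
    ) -> Dict[str, List[str]]:
        """Map each other writer to the given paths it wrote at or after `since`."""
        wanted = set(paths)
        out: Dict[str, List[str]] = {}
        with self._guard:
            for path, (writer, wrote_at) in self._last_write.items():
                if path in wanted and writer != exclude_task_id and wrote_at >= since:
                    out.setdefault(writer, []).append(path)
        return {w: sorted(ps) for w, ps in out.items()}
```

In this sketch the warning is a returned string rather than an exception, which mirrors the warning-only behavior this commit describes.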

Behavior is warning-only, not hard-failing — matches existing project
style. Errors surface naturally: sibling writes often invalidate the
old_string in patch operations, which already errors cleanly.
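
As an illustration of the warning-only style, the delegate reminder can be assembled as a plain string. The function name `build_reminder` and the constant `MAX_LISTED` are hypothetical; the text and the eight-path cap follow the diff in this commit.

```python
from typing import Dict, List

MAX_LISTED = 8  # the diff in this commit lists at most eight paths


def build_reminder(sibling_writes: Dict[str, List[str]]) -> str:
    """Return the note to append to a summary, or '' when nothing was touched."""
    mod_paths = sorted({p for paths in sibling_writes.values() for p in paths})
    if not mod_paths:
        return ""
    extra = len(mod_paths) - MAX_LISTED
    suffix = f" (+{extra} more)" if extra > 0 else ""
    return (
        "\n\n[NOTE: subagent modified files the parent "
        "previously read \u2014 re-read before editing: "  # \u2014 is the dash used in the note
        + ", ".join(mod_paths[:MAX_LISTED])
        + suffix
        + "]"
    )
```

Nothing here raises: the caller only appends the returned text, so a failed or empty check leaves the subagent's result unchanged.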

Tests: tests/tools/test_file_state_registry.py — 16 tests covering
registry state transitions, per-path locking, per-path-not-global
locking, writes_since filtering, kill switch, and end-to-end
integration through the real read_file/write_file/patch handlers.
Teknium 2026-04-21 16:41:26 -07:00 committed by GitHub
parent 35a4b093d8
commit 9c9d9b7ddf
GPG key ID: B5690EEEBB952194
6 changed files with 785 additions and 39 deletions

@@ -27,6 +27,7 @@ from concurrent.futures import ThreadPoolExecutor, as_completed
 from typing import Any, Dict, List, Optional
 from toolsets import TOOLSETS
+from tools import file_state
 from utils import base_url_hostname
@@ -728,7 +729,22 @@ def _run_single_child(
             except Exception as e:
                 logger.debug("Progress callback start failed: %s", e)
-        result = child.run_conversation(user_message=goal)
+        # File-state coordination: generate a stable child task_id so the
+        # file_state registry can attribute writes back to this subagent,
+        # and snapshot the parent's read set at launch time. After the
+        # child returns we compare to detect "sibling modified files the
+        # parent previously read" and surface it as a reminder on the
+        # returned summary.
+        import uuid as _uuid
+        child_task_id = f"subagent-{task_index}-{_uuid.uuid4().hex[:8]}"
+        parent_task_id = getattr(parent_agent, "_current_task_id", None)
+        wall_start = time.time()
+        parent_reads_snapshot = (
+            list(file_state.known_reads(parent_task_id))
+            if parent_task_id else []
+        )
+        result = child.run_conversation(user_message=goal, task_id=child_task_id)
         # Flush any remaining batched progress to gateway
         if child_progress_cb and hasattr(child_progress_cb, '_flush'):
@@ -826,6 +842,36 @@ def _run_single_child(
         if status == "failed":
             entry["error"] = result.get("error", "Subagent did not produce a response.")
+        # Cross-agent file-state reminder. If this subagent wrote any
+        # files the parent had already read, surface it so the parent
+        # knows to re-read before editing — the scenario that motivated
+        # the registry. We check writes by ANY non-parent task_id (not
+        # just this child's), which also covers transitive writes from
+        # nested orchestrator→worker chains.
+        try:
+            if parent_task_id and parent_reads_snapshot:
+                sibling_writes = file_state.writes_since(
+                    parent_task_id, wall_start, parent_reads_snapshot
+                )
+                if sibling_writes:
+                    mod_paths = sorted(
+                        {p for paths in sibling_writes.values() for p in paths}
+                    )
+                    if mod_paths:
+                        reminder = (
+                            "\n\n[NOTE: subagent modified files the parent "
+                            "previously read — re-read before editing: "
+                            + ", ".join(mod_paths[:8])
+                            + (f" (+{len(mod_paths) - 8} more)" if len(mod_paths) > 8 else "")
+                            + "]"
+                        )
+                        if entry.get("summary"):
+                            entry["summary"] = entry["summary"] + reminder
+                        else:
+                            entry["stale_paths"] = mod_paths
+        except Exception:
+            logger.debug("file_state sibling-write check failed", exc_info=True)
         if child_progress_cb:
             try:
                 child_progress_cb(