hermes-agent

mirrors/hermes-agent

Fork 0

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-18 09:51:59 +00:00

Commit graph

Author	SHA1	Message	Date
brooklyn!	44e5848e74	feat(desktop): stream subagent activity into watch windows (#47060 ) * feat(desktop): stream subagent replies into watch windows A desktop watch window resumes a child session lazily (no full agent) and mirrors the parent-relayed `subagent.` events into native child-session stream events. The child's streamed reply text was never relayed, so the window sat blank while the subagent "talked". - delegate_tool: forward the child's `run_conversation` stream tokens up the progress relay as `subagent.text` (inert under CLI/TUI — their progress handlers ignore non-tool event types; only a gateway watch window mirrors it). - server: mirror `subagent.text` -> `message.delta` on the child sid only, and skip the parent emit (per-token frames are meaningless on the parent session, which shows the child via the spawn tree). Demote `subagent.start` to a one-time goal header and drop the noisy `subagent.progress` mirror — tools already mirror natively. - server: guard `_start_agent_build` so a lazy watch session spectating an in-flight child stays lazy; incidental RPCs were upgrading it to a full agent mid-stream and silently killing the mirror. fix(desktop): keep watch-window chat clear of titlebar chrome Secondary windows (new-session scratch, subagent watch, cmd-click pop-out) hide the titlebar tool cluster + session header, so the transcript ran to the window's top edge and streamed text slid up under the OS traffic lights. - Gate the hidden chrome on `isSecondaryWindow()` everywhere (app-shell, chat header, thread list) instead of the narrower new-session flag. - Add a fixed opaque drag-strip at the top of the secondary-window transcript: content padding alone scrolls away with the text, so the strip masks anything behind it and keeps the window draggable like the main header. * fix: WSL subagent window * fix: subagent window top padding --------- Co-authored-by: Austin Pickett <pickett.austin@gmail.com> Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-06-16 14:30:11 -04:00
kshitijk4poor	66827f8947	chore: prune unused imports and duplicate import redefinitions Remove unused imports (F401) and duplicate/shadowed import redefinitions (F811) across the codebase using ruff's safe autofixes. No behavioral changes -- imports only. - ~1400 safe autofixes applied across 644 files (net -1072 lines) - __init__.py re-exports preserved (excluded from F401 removal so public re-export surfaces stay intact) - Re-exports that are imported or monkeypatched by tests but look unused in their defining module are kept with explicit # noqa: F401 (gateway/run.py load_dotenv; run_agent re-exports from agent.message_sanitization, agent.context_compressor, agent.retry_utils, agent.prompt_builder, agent.process_bootstrap, agent.codex_responses_adapter) - Unsafe F841 (unused-variable) fixes deliberately skipped -- those can change behavior when the RHS has side effects - ruff lints remain disabled in pyproject.toml (only PLW1514 is selected); this is a one-time cleanup, not a config change Verification: - python -m compileall: clean - pytest --collect-only: all 27161 tests collect (zero import errors) - core entry points import clean (run_agent, model_tools, cli, toolsets, hermes_state, batch_runner, gateway) - static scan: every name any test imports directly from an edited module still resolves	2026-05-28 22:26:25 -07:00
Teknium	7634c1386f	feat(delegate): diagnostic dump when a subagent times out with 0 API calls (#15105 ) When a subagent in delegate_task times out before making its first LLM request, write a structured diagnostic file under ~/.hermes/logs/subagent-timeout-<sid>-<ts>.log capturing enough state for the user (and us) to debug the hang. The old error message — 'Subagent timed out after Ns with no response. The child may be stuck on a slow API call or unresponsive network request.' — gave no observability for the 0-API-call case, which is the hardest to reason about remotely. The diagnostic captures: - timeout config vs actual duration - goal (truncated to 1000 chars) - child config: model, provider, api_mode, base_url, max_iterations, quiet_mode, platform, _delegate_role, _delegate_depth - enabled_toolsets + loaded tool names - system prompt byte/char count (catches oversized prompts that providers silently choke on) - tool schema count + byte size - child's get_activity_summary() snapshot - Python stack of the worker thread at the moment of timeout (reveals whether the hang is in credential resolution, transport, prompt construction, etc.) Wiring: - _run_single_child captures the worker thread via a small wrapper around child.run_conversation so we can look up its stack at timeout. - After a FuturesTimeoutError, we pull child.get_activity_summary() to read api_call_count. If 0 AND it was a timeout (not a raise), _dump_subagent_timeout_diagnostic() is invoked. - The returned path is surfaced in the error string so the parent agent (and therefore the user / gateway) sees exactly where to look. - api_calls > 0 timeouts keep the old 'stuck on slow API call' phrasing since that's the correct diagnosis for those. This does NOT change any behavior for successful subagent runs, non-timeout errors, or subagents that made at least one API call before hanging. Tests: 7 cases (tests/tools/test_delegate_subagent_timeout_diagnostic.py) - output format + required sections + field values - long-goal truncation with [truncated] marker - missing / already-exited worker thread branches - unwritable HERMES_HOME/logs/ returns None without raising - _run_single_child wiring: 0 API calls → dump + diagnostic_path in error - _run_single_child wiring: N>0 API calls → no dump, old message Refs: #14726	2026-04-24 04:58:32 -07:00

Author

SHA1

Message

Date

brooklyn!

44e5848e74

feat(desktop): stream subagent activity into watch windows (#47060 )

* feat(desktop): stream subagent replies into watch windows

A desktop watch window resumes a child session lazily (no full agent) and
mirrors the parent-relayed `subagent.*` events into native child-session
stream events. The child's streamed reply text was never relayed, so the
window sat blank while the subagent "talked".

- delegate_tool: forward the child's `run_conversation` stream tokens up the
  progress relay as `subagent.text` (inert under CLI/TUI — their progress
  handlers ignore non-tool event types; only a gateway watch window mirrors it).
- server: mirror `subagent.text` -> `message.delta` on the child sid only, and
  skip the parent emit (per-token frames are meaningless on the parent session,
  which shows the child via the spawn tree). Demote `subagent.start` to a
  one-time goal header and drop the noisy `subagent.progress` mirror — tools
  already mirror natively.
- server: guard `_start_agent_build` so a lazy watch session spectating an
  in-flight child stays lazy; incidental RPCs were upgrading it to a full
  agent mid-stream and silently killing the mirror.

* fix(desktop): keep watch-window chat clear of titlebar chrome

Secondary windows (new-session scratch, subagent watch, cmd-click pop-out)
hide the titlebar tool cluster + session header, so the transcript ran to the
window's top edge and streamed text slid up under the OS traffic lights.

- Gate the hidden chrome on `isSecondaryWindow()` everywhere (app-shell,
  chat header, thread list) instead of the narrower new-session flag.
- Add a fixed opaque drag-strip at the top of the secondary-window transcript:
  content padding alone scrolls away with the text, so the strip masks
  anything behind it and keeps the window draggable like the main header.

* fix: WSL subagent window

* fix: subagent window top padding

---------

Co-authored-by: Austin Pickett <pickett.austin@gmail.com>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>

2026-06-16 14:30:11 -04:00

kshitijk4poor

66827f8947

chore: prune unused imports and duplicate import redefinitions

Remove unused imports (F401) and duplicate/shadowed import
redefinitions (F811) across the codebase using ruff's safe
autofixes. No behavioral changes -- imports only.

- ~1400 safe autofixes applied across 644 files (net -1072 lines)
- __init__.py re-exports preserved (excluded from F401 removal so
  public re-export surfaces stay intact)
- Re-exports that are imported or monkeypatched by tests but look
  unused in their defining module are kept with explicit # noqa:
  F401 (gateway/run.py load_dotenv; run_agent re-exports from
  agent.message_sanitization, agent.context_compressor,
  agent.retry_utils, agent.prompt_builder, agent.process_bootstrap,
  agent.codex_responses_adapter)
- Unsafe F841 (unused-variable) fixes deliberately skipped -- those
  can change behavior when the RHS has side effects
- ruff lints remain disabled in pyproject.toml (only PLW1514 is
  selected); this is a one-time cleanup, not a config change

Verification:
- python -m compileall: clean
- pytest --collect-only: all 27161 tests collect (zero import errors)
- core entry points import clean (run_agent, model_tools, cli,
  toolsets, hermes_state, batch_runner, gateway)
- static scan: every name any test imports directly from an edited
  module still resolves

2026-05-28 22:26:25 -07:00

Teknium

7634c1386f

feat(delegate): diagnostic dump when a subagent times out with 0 API calls (#15105 )

When a subagent in delegate_task times out before making its first LLM
request, write a structured diagnostic file under
~/.hermes/logs/subagent-timeout-<sid>-<ts>.log capturing enough state
for the user (and us) to debug the hang. The old error message —
'Subagent timed out after Ns with no response. The child may be stuck
on a slow API call or unresponsive network request.' — gave no
observability for the 0-API-call case, which is the hardest to reason
about remotely.

The diagnostic captures:
  - timeout config vs actual duration
  - goal (truncated to 1000 chars)
  - child config: model, provider, api_mode, base_url, max_iterations,
    quiet_mode, platform, _delegate_role, _delegate_depth
  - enabled_toolsets + loaded tool names
  - system prompt byte/char count (catches oversized prompts that
    providers silently choke on)
  - tool schema count + byte size
  - child's get_activity_summary() snapshot
  - Python stack of the worker thread at the moment of timeout
    (reveals whether the hang is in credential resolution, transport,
    prompt construction, etc.)

Wiring:
  - _run_single_child captures the worker thread via a small wrapper
    around child.run_conversation so we can look up its stack at
    timeout.
  - After a FuturesTimeoutError, we pull child.get_activity_summary()
    to read api_call_count. If 0 AND it was a timeout (not a raise),
    _dump_subagent_timeout_diagnostic() is invoked.
  - The returned path is surfaced in the error string so the parent
    agent (and therefore the user / gateway) sees exactly where to look.
  - api_calls > 0 timeouts keep the old 'stuck on slow API call'
    phrasing since that's the correct diagnosis for those.

This does NOT change any behavior for successful subagent runs,
non-timeout errors, or subagents that made at least one API call
before hanging.

Tests: 7 cases (tests/tools/test_delegate_subagent_timeout_diagnostic.py)
  - output format + required sections + field values
  - long-goal truncation with [truncated] marker
  - missing / already-exited worker thread branches
  - unwritable HERMES_HOME/logs/ returns None without raising
  - _run_single_child wiring: 0 API calls → dump + diagnostic_path in error
  - _run_single_child wiring: N>0 API calls → no dump, old message

Refs: #14726

2026-04-24 04:58:32 -07:00

3 commits