Commit graph

5006 commits

Author SHA1 Message Date
Erosika
78586ce036 fix(honcho): dialectic lifecycle — defaults, retry, prewarm consumption
Several correctness and cost-safety fixes to the Honcho dialectic path
after a multi-turn investigation surfaced a chain of silent failures:

- dialecticCadence default flipped 3 → 1. PR #10619 changed this from 1 to
  3 for cost, but existing installs with no explicit config silently went
  from per-turn dialectic to every-3-turns on upgrade. Restores pre-#10619
  behavior; 3+ remains available for cost-conscious setups. Docs + wizard
  + status output updated to match.

- Session-start prewarm now consumed. Previously fired a .chat() on init
  whose result landed in HonchoSessionManager._dialectic_cache and was
  never read — pop_dialectic_result had zero call sites. Turn 1 paid for
  a duplicate synchronous dialectic. Prewarm now writes directly to the
  plugin's _prefetch_result via _prefetch_lock so turn 1 consumes it with
  no extra call.

- Prewarm is now dialecticDepth-aware. A single-pass prewarm can return
  weak output on cold peers; the multi-pass audit/reconcile cycle is
  exactly the case dialecticDepth was built for. Prewarm now runs the
  full configured depth in the background.

- Silent dialectic failure no longer burns the cadence window.
  _last_dialectic_turn now advances only when the result is non-empty.
  Empty result → next eligible turn retries immediately instead of
  waiting the full cadence gap.

- Thread pile-up guard. queue_prefetch skips when a prior dialectic
  thread is still in-flight, preventing stacked races on _prefetch_result.

- First-turn sync timeout is recoverable. Previously on timeout the
  background thread's result was stored in a dead local list. Now the
  thread writes into _prefetch_result under lock so the next turn
  picks it up.

- Cadence gate applies uniformly. At cadence=1 the old "cadence > 1"
  guard let first-turn sync + same-turn queue_prefetch both fire.
  Gate now always applies.

- Restored query-length reasoning-level scaling, dropped in 9a0ab34c.
  Scales dialecticReasoningLevel up on longer queries (+1 at ≥120 chars,
  +2 at ≥400), clamped at reasoningLevelCap. Two new config keys:
  `reasoningHeuristic` (bool, default true) and `reasoningLevelCap`
  (string, default "high"; previously parsed but never enforced).
  Respects dialecticDepthLevels and proportional lighter-early passes.

- Restored short-prompt skip, dropped in ef7f3156. One-word
  acknowledgements ("ok", "y", "thanks") and slash commands bypass
  both injection and dialectic fire.

- Purged dead code in session.py: prefetch_dialectic, _dialectic_cache,
  set_dialectic_result, pop_dialectic_result — all unused after prewarm
  refactor.

Tests: 542 passed across honcho_plugin/, agent/test_memory_provider.py,
and run_agent/test_run_agent.py. New coverage:
- TestTrivialPromptHeuristic (classifier + prefetch/queue skip)
- TestDialecticCadenceAdvancesOnSuccess (empty-result retry, pile-up guard)
- TestSessionStartDialecticPrewarm (prewarm consumed, sync fallback)
- TestReasoningHeuristic (length bumps, cap clamp, interaction with depth)
- TestDialecticLifecycleSmoke (end-to-end 8-turn session walk)
2026-04-18 22:50:55 -07:00
Teknium
bf5d7462ba
fix(tui): reject history-mutating commands while session is running (#12416)
Fixes silent data loss in the TUI when /undo, /compress, /retry, or
rollback.restore runs during an in-flight agent turn.  The version-
guard at prompt.submit:1449 would fail the version check and silently
skip writing the agent's result — UI showed the assistant reply but
DB / backend history never received it, causing UI↔backend desync
that persisted across session resume.

Changes (tui_gateway/server.py):
- session.undo, session.compress, /retry, rollback.restore (full-history
  only — file-scoped rollbacks still allowed): reject with 4009 when
  session.running is True.  Users can /interrupt first.
- prompt.submit: on history_version mismatch (defensive backstop),
  attach a 'warning' field to message.complete and log to stderr
  instead of silently dropping the agent's output.  The UI can surface
  the warning to the user; the operator can spot it in logs.

Tests (tests/test_tui_gateway_server.py): 6 new cases.
- test_session_undo_rejects_while_running
- test_session_undo_allowed_when_idle (regression guard)
- test_session_compress_rejects_while_running
- test_rollback_restore_rejects_full_history_while_running
- test_prompt_submit_history_version_mismatch_surfaces_warning
- test_prompt_submit_history_version_match_persists_normally (regression)

Validated: against unpatched server.py the three 'rejects_while_running'
tests fail and the version-mismatch test fails (no 'warning' field).
With the fix, all 6 pass, all 33 tests in the file pass, 74 TUI tests
in total pass.  Live E2E against the live Python environment confirmed
all 5 patches present and guards enforce 4009 exactly as designed.
2026-04-18 22:30:10 -07:00
Teknium
3a6351454b
fix(gateway): close pending-drain and late-arrival races in base adapter (#12371)
Two related race conditions in gateway/platforms/base.py that could
produce duplicate agent runs or silently drop messages. Neither is
specific to any one platform — all adapters inherit this logic.

R5 (HIGH) — duplicate agent spawn on turn chain
  In _process_message_background, the pending-drain path deleted
  _active_sessions[session_key] before awaiting typing_task.cancel()
  and then recursively awaiting _process_message_background for the
  queued event. During the typing_task await, a fresh inbound message
  M3 could pass the Level-1 guard (entry now missing), set its own
  Event, and spawn a second _process_message_background for the same
  session_key — two agents running simultaneously, duplicate responses,
  duplicate tool calls.

  Fix: keep the _active_sessions entry populated and only clear() the
  Event. The guard stays live, so any concurrent inbound message takes
  the busy-handler path (queue + interrupt) as intended.

R6 (MED-HIGH) — message dropped during finally cleanup
  The finally block has two await points (typing_task, stop_typing)
  before it deletes _active_sessions. A message arriving in that
  window passes the guard (entry still live), lands in
  _pending_messages via the busy-handler — and then the unconditional
  del removes the guard with that message still queued. Nothing
  drains it; the user never gets a reply.

  Fix: before deleting _active_sessions in finally, pop any late
  pending_messages entry and spawn a drain task for it. Only delete
  _active_sessions when no pending is waiting.

Tests: tests/gateway/test_pending_drain_race.py — three regression
cases. Validated: without the fix, two of the three fail exactly
where the races manifest (duplicate-spawn guard loses identity,
late-arrival 'LATE' message not in processed list).
2026-04-18 19:32:26 -07:00
Teknium
762f7e9796 feat: configurable approval mode for cron jobs (approvals.cron_mode)
Add approvals.cron_mode config option that controls how cron jobs handle
dangerous commands. Previously, cron jobs silently auto-approved all
dangerous commands because there was no user present to approve them.

Now the behavior is configurable:
  - deny (default): block dangerous commands and return a message telling
    the agent to find an alternative approach. The agent loop continues —
    it just can't use that specific command.
  - approve: auto-approve all dangerous commands (previous behavior).

When a command is blocked, the agent receives the same response format as
a user denial in the CLI — exit_code=-1, status=blocked, with a message
explaining why and pointing to the config option. This keeps the agent
loop running and encourages it to adapt.

Implementation:
  - config.py: add approvals.cron_mode to DEFAULT_CONFIG
  - scheduler.py: set HERMES_CRON_SESSION=1 env var before agent runs
  - approval.py: both check_command_approval() and check_all_command_guards()
    now check for cron sessions and apply the configured mode
  - 21 new tests covering config parsing, deny/approve behavior, and
    interaction with other bypass mechanisms (yolo, containers)
2026-04-18 19:24:35 -07:00
Teknium
b02833f32d
fix(codex): Hermes owns its own Codex auth; stop touching ~/.codex/auth.json (#12360)
Codex OAuth refresh tokens are single-use and rotate on every refresh.
Sharing them with the Codex CLI / VS Code via ~/.codex/auth.json made
concurrent use of both tools a race: whoever refreshed last invalidated
the other side's refresh_token.  On top of that, the silent auto-import
path picked up placeholder / aborted-auth data from ~/.codex/auth.json
(e.g. literal {"access_token":"access-new","refresh_token":"refresh-new"})
and seeded it into the Hermes pool as an entry the selector could
eventually pick.

Hermes now owns its own Codex auth state end-to-end:

Removed
- agent/credential_pool.py: _sync_codex_entry_from_cli() method,
  its pre-refresh + retry + _available_entries call sites, and the
  post-refresh write-back to ~/.codex/auth.json.
- agent/credential_pool.py: auto-import from ~/.codex/auth.json in
  _seed_from_singletons() — users now run `hermes auth openai-codex`
  explicitly.
- hermes_cli/auth.py: silent runtime migration in
  resolve_codex_runtime_credentials() — now surfaces
  `codex_auth_missing` directly (message already points to `hermes auth`).
- hermes_cli/auth.py: post-refresh write-back in
  _refresh_codex_auth_tokens().
- hermes_cli/auth.py: dead helper _write_codex_cli_tokens() and its 4
  tests in test_auth_codex_provider.py.

Kept
- hermes_cli/auth.py: _import_codex_cli_tokens() — still used by the
  interactive `hermes auth openai-codex` setup flow for a user-gated
  one-time import (with "a separate login is recommended" messaging).

User-visible impact
- On existing installs with Hermes auth already present: no change.
- On a fresh install where the user has only logged in via Codex CLI:
  `hermes chat --provider openai-codex` now fails with "No Codex
  credentials stored. Run `hermes auth` to authenticate." The
  interactive setup flow then detects ~/.codex/auth.json and offers a
  one-time import.
- On an install where Codex CLI later refreshes its token: Hermes is
  unaffected (we no longer read from that file at runtime).

Tests
- tests/hermes_cli/test_auth_codex_provider.py: 15/15 pass.
- tests/hermes_cli/test_auth_commands.py: 20/20 pass.
- tests/agent/test_credential_pool.py: 31/31 pass.
- Live E2E on openai-codex/gpt-5.4: 1 API call, 1.7s latency,
  3 log lines, no refresh events, no auth drama.

The related 14:52 refresh-loop bug (hundreds of rotations/minute on a
single entry) is a separate issue — that requires a refresh-attempt
cap on the auth-recovery path in run_agent.py, which remains open.
2026-04-18 19:19:46 -07:00
yeyitech
bd01ec7885 fix(cli): strip all reasoning tag variants from /resume recap
HermesCLI._display_resumed_history() calls the module-level _strip_reasoning_tags() to clean assistant content before rendering the recap panel.  The tag list was missing <thought> (Gemma 4) and there was no pass for stray orphan </tag> closes, so those variants leaked internal reasoning into the recap display (#11316).

- Add <thought> to _REASONING_TAGS.
- Add a third regex pass that strips orphan close tags (e.g. 'stuff</think>answer' → 'stuffanswer').
- Apply IGNORECASE to closed-pair and unclosed-pair passes so mixed-case variants (<THINK>, <Thinking>) are handled uniformly — previously both 'THINKING' and 'thinking' had to be listed explicitly as distinct tuple entries, which missed <Thinking>.

7 new regression tests in tests/cli/test_resume_display.py covering: <think>, <thinking>, <reasoning>, <thought>, unclosed <think>, multiple interleaved blocks, and orphan </think> close.

Resolves #11316.

Originally proposed as PR #11366.
2026-04-18 19:19:24 -07:00
Tranquil-Flow
ec48ec5530 fix(agent): strip <think> blocks from stored assistant content
Inline reasoning tags in an assistant message's content field leak to every downstream consumer: messaging platforms (#8878, #9568), API replay of prior turns, session transcript, CLI recap, generated session titles, and context compression.  _extract_reasoning() already captures the reasoning text into msg['reasoning'] separately, so the raw tags in content are redundant.

Stripping once at the storage boundary in _build_assistant_message() cleans the content for every downstream path in one place — no per-platform or per-path stripper needed.  Measured impact on a real MiniMax M2.7-highspeed session (per @luoyejiaoe-source, #9306): 55% of assistant messages started with <think> blocks, 51/100 session titles were polluted, 16% content-size reduction.

3 new regression tests in TestBuildAssistantMessage: closed-pair strip with reasoning capture, no-think-tag passthrough, and unterminated-block strip.

Resolves #8878 and #9568.

Originally proposed as PR #9250.
2026-04-18 19:19:24 -07:00
Teknium
9489d1577d fix(agent): strip unterminated <think> blocks from visible content
Providers served via NIM (MiniMax M2.7, some Moonshot/DeepSeek proxies) sometimes drop the closing </think> tag, leaving raw reasoning in the assistant's content field.  _strip_think_blocks()'s closed-pair regex is non-greedy so it only matches complete blocks — any orphan <think>...EOF survived the stripper and leaked to users (#8878, #9568, #10408).

Adds an unterminated-tag pass that fires when an open reasoning tag sits at a block boundary (start of text or after a newline) with no matching close.  Everything from that tag to end of string is stripped.  The block-boundary check mirrors gateway/stream_consumer.py's filter so models that mention <think> in prose are not over-stripped.

Also makes the closed-pair regexes consistently case-insensitive so <THINK>...</THINK> and <Thinking>...</Thinking> are handled uniformly — previously the mixed-case open tag would bypass the closed-pair pass and be caught by the unterminated-tag pass, taking trailing visible content with it.

6 new regression tests in TestStripThinkBlocks covering: unterminated <think>, unterminated <thought>, multi-line unterminated, line-start orphan with preserved prefix, prose-mention non-regression, mixed-case closed pairs.

The implementation is inspired by @luinbytes's PR #10408 report of the NIM/MiniMax symptom.  This commit does not include the 💭/🧠 emoji regexes from that PR — those glyphs are Hermes CLI display decorations, not model content markers.
2026-04-18 19:19:24 -07:00
Teknium
79c5a381c5 feat(uninstall): offer to remove named profiles when uninstalling from default
When `hermes uninstall` runs from the default HERMES_HOME (~/.hermes)
and other named profiles exist under ~/.hermes/profiles/, show them in
the installation overview and prompt:

    Also stop and remove these N profile(s)? [y/N]

If confirmed, for each named profile we:
  1. Shell out to `python -m hermes_cli.main -p <name> gateway stop/uninstall`
     to stop the gateway and remove its systemd unit or launchd plist
     (service names + unit paths are derived from HERMES_HOME, so we
     can't cleanly switch in-process)
  2. Remove the ~/.local/bin/<name> alias wrapper (outside HERMES_HOME)
  3. Wipe the profile's HERMES_HOME dir

Previously `hermes uninstall` was silently profile-scoped, leaving
zombie systemd units at ~/.config/systemd/user/hermes-gateway-<profile>.service
and zombie HERMES_HOMEs under ~/.hermes/profiles/ whenever a user
uninstalled from default with other profiles configured.

Prompt only appears when uninstalling from the default root. Uninstalling
from within a named profile stays profile-scoped as before.
2026-04-18 19:18:13 -07:00
Teknium
3fe0d503b6 fix(uninstall): properly stop and destroy gateway on hermes uninstall
The uninstaller's gateway cleanup was incomplete:
- Linux only (ignored macOS launchd)
- Only checked user systemd scope (missed system services)
- Didn't kill standalone gateway processes (hermes gateway run)
- Missing DBUS env setup for headless servers

Now delegates to gateway.py's existing machinery:
1. Kill any standalone gateway processes (all platforms)
2. Linux: stop + disable + remove both user AND system systemd services
3. macOS: unload + remove launchd plist
4. Warns (instead of silently failing) when system service needs sudo
2026-04-18 19:18:13 -07:00
Teknium
1e5f0439d9 docs: update Anthropic console URLs to platform.claude.com
Anthropic migrated their developer console from console.anthropic.com
to platform.claude.com. Two user-facing display URLs were still pointing
to the old domain:

- hermes_cli/main.py — API key prompt in the Anthropic model flow
- run_agent.py — 401 troubleshooting output

The OAuth token refresh endpoint was already migrated in PR #3246
(with fallback).

Spotted by @LucidPaths in PR #3237.

(Salvage of #3758 — dropped the setup.py hunk since that section was
refactored away and no longer contains the stale URL.)
2026-04-18 18:55:58 -07:00
Teknium
2a2e5c0fed fix: force relogin on 401/403 Codex token refresh failures
When the OAuth token endpoint returns 401/403 but the JSON body
doesn't contain a known error code (invalid_grant, etc.),
relogin_required stayed False. Users saw a bare error message
without guidance to re-authenticate.

Now any 401/403 from the token endpoint forces relogin_required=True,
since these status codes always indicate invalid credentials on a
refresh endpoint. 500+ errors remain as transient (no relogin).
2026-04-18 18:54:34 -07:00
Teknium
beabbd87ef
fix(gateway): close adapter resources when connect() fails or raises (#12339)
Gateway startup leaks aiohttp.ClientSession (and other partial-init
resources) when an adapter's connect() returns False or raises. The
adapter is never added to self.adapters, so the shutdown path at
gateway/run.py:2426 never calls disconnect() on it — Python GC later
logs 'Unclosed client session' at process exit.

Seen on 2026-04-18 18:08:16 during a double --replace takeover cycle:
one of the partial-init sessions survived past shutdown and emitted
the warning right before status=75/TEMPFAIL.

Fix:
- New GatewayRunner._safe_adapter_disconnect() helper — calls
  adapter.disconnect() and swallows any exception. Used on error paths.
- Connect loop calls it in both failure branches: success=False and
  except Exception.
- Adapter disconnect() implementations are already expected to be
  idempotent and tolerate partial-init state (they all guard on
  self._http_session / self._bridge_process before touching them).

Tests: tests/gateway/test_safe_adapter_disconnect.py — 3 cases verify
the helper forwards to disconnect, swallows exceptions, and tolerates
platform=None.
2026-04-18 18:53:31 -07:00
Teknium
632a807a3e
fix(gateway): slash commands never interrupt a running agent (#12334)
Any recognized slash command now bypasses the Level-1 active-session
guard instead of queueing + interrupting. A mid-run /model (or
/reasoning, /voice, /insights, /title, /resume, /retry, /undo,
/compress, /usage, /provider, /reload-mcp, /sethome, /reset) used to
interrupt the agent AND get silently discarded by the slash-command
safety net — zero-char response, dropped tool calls.

Root cause:
- Discord registers 41 native slash commands via tree.command().
- Only 14 were in ACTIVE_SESSION_BYPASS_COMMANDS.
- The other ~15 user-facing ones fell through base.py:handle_message
  to the busy-session handler, which calls running_agent.interrupt()
  AND queues the text.
- After the aborted run, gateway/run.py:9912 correctly identifies the
  queued text as a slash command and discards it — but the damage
  (interrupt + zero-char response) already happened.

Fix:
- should_bypass_active_session() now returns True for any resolvable
  slash command. ACTIVE_SESSION_BYPASS_COMMANDS stays as the subset
  with dedicated Level-2 handlers (documentation + tests).
- gateway/run.py adds a catch-all after the dedicated handlers that
  returns a user-visible "agent busy — wait or /stop first" response
  for any other resolvable command.
- Unknown text / file-path-like messages are unchanged — they still
  queue.

Also:
- gateway/platforms/discord.py logs the invoker identity on every
  slash command (user id + name + channel + guild) so future
  ghost-command reports can be triaged without guessing.

Tests:
- 15 new parametrized cases in test_command_bypass_active_session.py
  cover every previously-broken Discord slash command.
- Existing tests for /stop, /new, /approve, /deny, /help, /status,
  /agents, /background, /steer, /update, /queue still pass.
- test_steer.py's ACTIVE_SESSION_BYPASS_COMMANDS check still passes.

Fixes #5057. Related: #6252, #10370, #4665.
2026-04-18 18:53:22 -07:00
Teknium
41560192c4
chore(attribution): add AUTHOR_MAP entry for nish3451
Adds the nish3451 noreply email to the AUTHOR_MAP so CI attribution checks
pass for the #6100 Telegram DM fallback fix merged in 1a9a2d7f.
2026-04-18 18:52:41 -07:00
Teknium
aa5f89d3ea test: add coverage for from_user=None DM fallback
Tests the three cases:
- DM with from_user=None: user_id falls back to chat.id
- Group with from_user=None: user_id stays None (safe default)
- DM with from_user present: user_id uses from_user.id (no regression)
2026-04-18 18:18:01 -07:00
Nish
1a9a2d7fe8 fix(gateway/telegram): fall back to chat.id when from_user is None in DMs
When `message.from_user` is None — which can happen for forwarded messages,
anonymous admin mode in groups, or certain Telegram client edge cases —
`_build_message_event` set `source.user_id` to None. This caused:

1. `_is_user_authorized()` to early-return False (`if not user_id: return False`)
2. The access check never compared against `TELEGRAM_ALLOWED_USERS` even when
   the user actually was in the allowlist
3. The pairing flow fired and generated a code for `user_id=None`
4. The pairing approval saved an entry under the literal string key "null"
5. The user was effectively locked out because their real user_id never
   matched the "null" key on subsequent messages

For DMs (`chat_type == "dm"`), Telegram guarantees `chat.id == user.id` —
they are the same numeric ID for private chats. Falling back to `chat.id`
when `from_user` is None for DMs restores the expected access-control
behavior without weakening it (group/channel chats correctly stay None).

Also adds a parallel `user_name` fallback to `chat.full_name` so the
display name still works in the same edge case.
2026-04-18 18:18:01 -07:00
Teknium
139a6da67c fix(skills): touchdesigner-mcp setup.sh — correct pgrep match + suppress stray yaml output
Discovered while dogfooding the skill end-to-end:

- pgrep -if "TouchDesigner" matched any shell whose command line
  contained the substring (including the setup script's own invocation
  under certain wrappers), falsely reporting TD running on machines
  where it isn't. Switch to pgrep -x (exact process name match,
  supported on both macOS and Linux) and also check TouchDesignerFTE
  (the non-commercial variant).
- The embedded python3 yaml-writer printed 'added' / 'exists' to
  stdout as status, which leaked a stray word into the setup output
  right before the ✔ line. Drop the print()s — the bash-level ✔/✘ is
  the status indicator.
2026-04-18 17:43:42 -07:00
Teknium
6b31e20894 chore(skills): touchdesigner-mcp follow-ups
- Remove orphan skills/creative/touchdesigner/references/pitfalls.md
  left over from the rename commit (git add-then-edit instead of git mv
  meant the old file never got deleted).
- Honour $HERMES_HOME in setup.sh and SKILL.md setup invocation so
  profile-aware installs work correctly.
- Fix troubleshooting.md config path to use $HERMES_HOME instead of
  hardcoding ~/.hermes/.
- Add touchdesigner-mcp entries to skills-catalog.md and
  optional-skills-catalog.md for parity with blender-mcp/meme-generation.
2026-04-18 17:43:42 -07:00
Teknium
11ee87e605 chore(attribution): add AUTHOR_MAP entry for kshitijk4poor@gmail.com
Covers the non-noreply email used on commit dd3e6424 (rename of the
TouchDesigner skill to touchdesigner-mcp).
2026-04-18 17:43:42 -07:00
kshitijk4poor
6d2fe1d624 feat: rename touchdesigner -> touchdesigner-mcp, move to optional-skills/
- Rename skill to touchdesigner-mcp (matches blender-mcp convention)
- Move from skills/creative/ to optional-skills/creative/
- Fix duplicate pitfall numbering (#3 appeared twice)
- Update SKILL.md cross-references for renumbered pitfalls
- Update setup.sh path for new directory location
2026-04-18 17:43:42 -07:00
kshitijk4poor
6f27390fae feat: rewrite TouchDesigner skill for twozero MCP (v2.0.0)
Major rewrite of the TouchDesigner skill:
- Replace custom API handler with twozero MCP (36 native tools)
- Add audio-reactive GLSL proven recipe (spectrum chain, pitfalls)
- Add recording checklist (FPS>0, non-black, audio cueing)
- Expand pitfalls: 38 entries from real sessions (was 20)
- Update network-patterns with MCP-native build scripts
- Rewrite mcp-tools reference for twozero v2.774+
- Update troubleshooting for MCP-based workflow
- Remove obsolete custom_api_handler.py
- Generalize Environment section for all users
- Remove session-specific Paired Skills section
- Bump version to 2.0.0
2026-04-18 17:43:42 -07:00
kshitijk4poor
7a5371b20d feat: add TouchDesigner integration skill
New skill: creative/touchdesigner — control a running TouchDesigner
instance via REST API. Build real-time visual networks programmatically.

Architecture:
  Hermes Agent -> HTTP REST (curl) -> TD WebServer DAT -> TD Python env

Key features:
- Custom API handler (scripts/custom_api_handler.py) that creates a
  self-contained WebServer DAT + callback in TD. More reliable than the
  official mcp_webserver_base.tox which frequently fails module imports.
- Discovery-first workflow: never hardcode TD parameter names. Always
  probe the running instance first since names change across versions.
- Persistent setup: save the TD project once with the API handler baked
  in. TD auto-opens the last project on launch, so port 9981 is live
  with zero manual steps after first-time setup.
- Works via curl in execute_code (no MCP dependency required).
- Optional MCP server config for touchdesigner-mcp-server npm package.

Skill structure (2823 lines total):
  SKILL.md (209 lines) — setup, workflow, key rules, operator reference
  references/pitfalls.md (276 lines) — 24 hard-won lessons
  references/operators.md (239 lines) — all 6 operator families
  references/network-patterns.md (589 lines) — audio-reactive, generative,
    video processing, GLSL, instancing, live performance recipes
  references/mcp-tools.md (501 lines) — 13 MCP tool schemas
  references/python-api.md (443 lines) — TD Python scripting patterns
  references/troubleshooting.md (274 lines) — connection diagnostics
  scripts/custom_api_handler.py (140 lines) — REST API handler for TD
  scripts/setup.sh (152 lines) — prerequisite checker

Tested on TouchDesigner 099 Non-Commercial (macOS/darwin).
2026-04-18 17:43:42 -07:00
Teknium
c49a58a6d0
fix(gateway): mark only still-running sessions resume_pending on drain timeout (#12332)
Follow-up to #12301.

The drain-timeout branch of _stop_impl() was iterating the drain-start
snapshot (active_agents) when marking sessions resume_pending. That
snapshot can include sessions that finished gracefully during the drain
window — marking them would give their next turn a stray
'your previous turn was interrupted by a gateway restart' system note
even though the prior turn actually completed cleanly.

Iterate self._running_agents at timeout time instead, mirroring
_interrupt_running_agents() exactly:
- only sessions still blocking the shutdown get marked
- pending sentinels (AIAgent construction not yet complete) are skipped

Changes:
- gateway/run.py: swap active_agents.keys() for filtered
  self._running_agents.items() iteration in the drain-timeout mark loop.
- tests/gateway/test_restart_resume_pending.py: two regression tests —
  finisher-during-drain not marked, pending sentinel not marked.
2026-04-18 17:40:34 -07:00
Teknium
cb4addacab
fix(gateway): auto-resume sessions after drain-timeout restart (#11852) (#12301)
The shutdown banner promised "send any message after restart to resume
where you left off" but the code did the opposite: a drain-timeout
restart skipped the .clean_shutdown marker, which made the next startup
call suspend_recently_active(), which marked the session suspended,
which made get_or_create_session() spawn a fresh session_id with a
'Session automatically reset. Use /resume...' notice — contradicting
the banner.

Introduce a resume_pending state on SessionEntry that is distinct from
suspended. Drain-timeout shutdown flags active sessions resume_pending
instead of letting startup-wide suspension destroy them. The next
message on the same session_key preserves the session_id, reloads the
transcript, and the agent receives a reason-aware restart-resume
system note that subsumes the existing tool-tail auto-continue note
(PR #9934).

Terminal escalation still flows through the existing
.restart_failure_counts stuck-loop counter (PR #7536, threshold 3) —
no parallel counter on SessionEntry. suspended still wins over
resume_pending in get_or_create_session() so genuinely stuck sessions
converge to a clean slate.

Spec: PR #11852 (BrennerSpear). Implementation follows the spec with
the approved correction (reuse .restart_failure_counts rather than
adding a resume_attempts field).

Changes:
- gateway/session.py: SessionEntry.resume_pending/resume_reason/
  last_resume_marked_at + to_dict/from_dict; SessionStore
  .mark_resume_pending()/clear_resume_pending(); get_or_create_session()
  returns existing entry when resume_pending (suspended still wins);
  suspend_recently_active() skips resume_pending entries.
- gateway/run.py: _stop_impl() drain-timeout branch marks active
  sessions resume_pending before _interrupt_running_agents();
  _run_agent() injects reason-aware restart-resume system note that
  subsumes the tool-tail case; successful-turn cleanup also clears
  resume_pending next to _clear_restart_failure_count();
  _notify_active_sessions_of_shutdown() softens the restart banner to
  'I'll try to resume where you left off' (honest about stuck-loop
  escalation).
- tests/gateway/test_restart_resume_pending.py: 29 new tests covering
  SessionEntry roundtrip, mark/clear helpers, get_or_create_session
  precedence (suspended > resume_pending), suspend_recently_active
  skip, drain-timeout mark reason (restart vs shutdown), system-note
  injection decision tree (including tool-tail subsumption), banner
  wording, and stuck-loop escalation override.
2026-04-18 17:32:17 -07:00
brooklyn!
ad99e32371
Merge pull request #12312 from NousResearch/bb/tui-ux-pack
feat(tui): UX pack — stable picker keys, /clear confirm, light-theme preset
2026-04-18 18:13:06 -05:00
Brooklyn Nicholson
df5ca5065f feat(tui): replace /clear double-press gate with a proper confirm overlay
The time-window gate felt wrong — users would hit /clear, read the
prompt, retype, and consistently blow past the window. Swapping to a
real yes/no overlay that blocks input like the existing Approval and
Clarify prompts.

- add ConfirmReq type + OverlayState.confirm + $isBlocked coverage
- ConfirmPrompt component (prompts.tsx): cancel row on top as the
  default, danger-coloured confirm row on the bottom, Y/N hotkeys,
  Enter on default = cancel, Esc/Ctrl+C cancel
- wire into PromptZone (appOverlays.tsx)
- /clear + /new now push onto the overlay instead of arming a timer
- HERMES_TUI_NO_CONFIRM=1 still skips the prompt for scripting
- drop the destructiveGate + createSlashHandler reset wiring
  (destructive.ts and its tests removed)

Refs #4069.
2026-04-18 18:04:08 -05:00
Brooklyn Nicholson
75377feb07 fix(tui): make /clear confirm window humane (3s → 30s, reset on other slash)
The 3s gate was too tight — users reading the prompt and retyping
consistently blow past it and get stuck in a loop ("press /clear
again within 3s" forever). Fixes:

- bump CONFIRM_WINDOW_MS 3_000 → 30_000
- drop the time number from the confirmation message to remove the
  pressure vibe: "press /clear again to confirm — starts a new session"
- reset the gate from createSlashHandler whenever any non-destructive
  slash command runs, so stale arming from 20s ago can't silently
  turn the next /clear into an unintended confirm
- export the gate + isDestructiveCommand helper for that wiring
- add armed() introspection method

Follow-up to #4069 / 3366714b.
2026-04-18 17:55:53 -05:00
Brooklyn Nicholson
20eab355e7 feat(tui): add LIGHT_THEME preset for white/light terminal backgrounds
Splits the existing palette into DARK_THEME (current yellow-heavy
default) and LIGHT_THEME (darker browns + proper contrast on white).
DEFAULT_THEME aliases DARK_THEME, and flips to LIGHT_THEME when
HERMES_TUI_LIGHT=1 is set at launch.

Skin system (fromSkin) still layers on top of whichever preset is
active, so users can keep customizing on top of either palette.

Refs #11300.
2026-04-18 17:49:40 -05:00
Brooklyn Nicholson
3366714ba4 feat(tui): double-press confirm on /clear and /new
Prevents accidental session loss: the first press prints
"press /clear again within 3s to confirm"; a second press inside
the window actually starts a new session. Outside the window the
gate re-arms.

Opt out with HERMES_TUI_NO_CONFIRM=1 for scripted / muscle-memory
workflows.

Refs #4069.
2026-04-18 17:48:34 -05:00
Brooklyn Nicholson
52124384de fix(tui): stable React keys in /model picker rows
Use provider.slug (and a composite key for model rows) instead of the
rendered string, so dupes in the backend response can't collapse two
rows into one or trigger key-collision warnings.
2026-04-18 17:47:26 -05:00
brooklyn!
db59c190c1
Merge pull request #12305 from NousResearch/bb/tui-status-git-branch
feat(tui): append git branch to cwd label in status bar
2026-04-18 17:27:40 -05:00
brooklyn!
c0edcf2d53
Merge pull request #12306 from NousResearch/bb/tui-model-picker-dedupe-names
fix(tui): disambiguate /model picker rows when provider display names collide
2026-04-18 17:27:31 -05:00
Brooklyn Nicholson
4aa52590d8 fix(tui): disambiguate /model picker rows when provider display names collide
If the gateway returns two providers that resolve to the same display name
(e.g. `kimi-coding` and `kimi-coding-cn` both → "Kimi For Coding"), the
picker now appends the slug so users can tell them apart, in both the
provider list and the selected-provider header. No-op when names are
already unique.

Refs #10526 — the Python backend dedupe from #10599 skips one alias, but
user-defined providers, canonical overlays, and future regressions can
still surface as indistinguishable rows in the picker. This is a
client-side safety net on top of that.
2026-04-18 17:22:23 -05:00
Brooklyn Nicholson
ff2aa7ccd7 feat(tui): append git branch to cwd label in status bar
Adds useGitBranch hook (async, cached, 15s TTL) and fmtCwdBranch
helper so the footer shows `~/repo (main)` instead of just `~/repo`.
Degrades silently when git is unavailable or cwd is outside a repo.

Partial fix for #12267 (TUI portion; #12277 covers the Python side).
2026-04-18 17:17:05 -05:00
Teknium
0175ff7516
feat(skills): replace xitter with xurl — the official X API CLI (#12303)
Swap the social-media/xitter skill (third-party wrapper around
Infatoshi/x-cli) for a new social-media/xurl skill wrapping
xdevplatform/xurl — the official X API CLI from the X developer
platform team.

Why:
- xurl is officially maintained by the X dev platform team
- OAuth 2.0 PKCE with auto-refresh + multi-app / multi-user support
  (vs. xitter's 5-env-var OAuth 1.0a + single account)
- Credentials stored in ~/.xurl managed by xurl itself — no manual
  env var juggling for users
- Substantially larger API surface: DMs, follows, blocks, mutes,
  media upload, streaming, and raw v2 endpoint access
- Ships stronger agent-safety guardrails (forbidden-flag list,
  no --verbose in agent mode, never-read-~/.xurl rule)

Adaptation:
- Ported the openclaw SKILL.md (which the xdevplatform team seeded)
  to Hermes frontmatter conventions (prerequisites.commands, platforms,
  metadata.hermes.tags/homepage) — dropped openclaw-specific metadata
- Added a Hermes-oriented one-time user setup section so the agent
  knows to direct the user to run auth commands themselves, never
  execute them with inline secrets
- Preserved the mandatory secret-safety rules verbatim
- Attribution block credits xdevplatform, openclaw, and the Hermes
  port

Docs: updated website/docs/reference/skills-catalog.md to replace
the xitter row with xurl.
2026-04-18 15:11:32 -07:00
Teknium
6a3a6a0fb6
Merge pull request #12263 from NousResearch/bb/tui-audit-followup
fix(tui): TUI v2 audit follow-up — registry, overlays, paste, reasoning, hyperlinks
2026-04-18 14:40:16 -07:00
helix4u
4e8f60fd11 fix(cli): use display width for wrapped spinner height 2026-04-18 14:34:05 -07:00
Brooklyn Nicholson
fb06bc67de fix(tui): Ctrl+C with input selection actually preserves input (lift handler to app level)
Previous fix in 9dbf1ec6 handled Ctrl+C inside textInput but the APP-level
useInputHandlers fires the same keypress in a separate React hook and ran
clearIn() regardless. Net effect: the OSC 52 copy succeeded but the input
wiped right after, so Brooklyn only noticed the wipe.

Lift the selection-aware Ctrl+C to a single place by threading input
selection state through a new nanostore (src/app/inputSelectionStore.ts).
textInput syncs its derived `selected` range + a clear() callback to the
store on every selection change, and the app-level Ctrl+C handler reads
the store before its clear/interrupt/die chain:

  - terminal-level selection (scrollback) → copy, existing behavior
  - in-input selection present → copy + clear selection, preserve input
  - input has text, no selection → clearIn(), existing behavior
  - empty + busy → interrupt turn
  - empty + idle → die

textInput no longer has its own Ctrl+C block; keypress falls through to
app-level like it did before 9dbf1ec6.
2026-04-18 16:28:51 -05:00
Brooklyn Nicholson
bfac5d039d Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/tui-audit-followup 2026-04-18 15:27:40 -05:00
Brooklyn Nicholson
17e95a26b7 fix(tui): render /skills browse as a formatted Panel instead of raw JSON
Previous handler dumped the raw skills.manage response into a pager, which
was unreadable and hid the pagination metadata. Also silently accepted
non-numeric page args.

Now:
- validates page arg (rejects NaN / <1 with a usage message)
- shows "fetching community skills (scans 6 sources, may take ~15s)…" up
  front so the 10-30s hub fetch isn't a silent hang
- renders items as {name · trust, description (truncated 160 chars)} rows
  in the existing Panel component
- footer shows "page X of Y · N skills total · /skills browse N+1 for more"
  when the server returned pagination metadata

Skills hub's remote fetch latency is a separate upstream issue
(browse_skills hits 6 sources sequentially) — client-side we just stop
misrepresenting it.
2026-04-18 15:22:43 -05:00
Brooklyn Nicholson
7e9a098574 chore: uptick 2026-04-18 15:17:42 -05:00
Brooklyn Nicholson
450ded98db chore(tui): prettier whitespace on files touched in this branch 2026-04-18 15:13:31 -05:00
Brooklyn Nicholson
93b4080b78 Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/tui-audit-followup
# Conflicts:
#	ui-tui/src/components/markdown.tsx
#	ui-tui/src/types/hermes-ink.d.ts
2026-04-18 14:52:54 -05:00
helix4u
ca32a2a60b fix(gemini): restore bearer auth on openai route 2026-04-18 12:52:01 -07:00
helix4u
a7dd6a3449 fix(gemini): hide stale and low-TPM Google models 2026-04-18 12:52:01 -07:00
helix4u
2eab7ee15f fix(gemini): hide low-TPM Gemma models from exposed lists 2026-04-18 12:52:01 -07:00
LVT382009
f7af90e2da fix: wire _ephemeral_max_output_tokens into chat_completions and add NVIDIA NIM default
Based on #12152 by @LVT382009.

Two fixes to run_agent.py:

1. _ephemeral_max_output_tokens consumption in chat_completions path:
   The error-recovery ephemeral override was only consumed in the
   anthropic_messages branch of _build_api_kwargs.  All chat_completions
   providers (OpenRouter, NVIDIA NIM, Qwen, Alibaba, custom, etc.)
   silently ignored it.  Now consumed at highest priority, matching the
   anthropic pattern.

2. NVIDIA NIM max_tokens default (16384):
   NVIDIA NIM falls back to a very low internal default when max_tokens
   is omitted, causing models like GLM-4.7 to truncate immediately
   (thinking tokens exhaust the budget before the response starts).

3. Progressive length-continuation boost:
   When finish_reason='length' triggers a continuation retry, the output
   budget now grows progressively (2x base on retry 1, 3x on retry 2,
   capped at 32768) via _ephemeral_max_output_tokens.  Previously the
   retry loop just re-sent the same token limit on all 3 attempts.
2026-04-18 12:51:30 -07:00
jarvischer
0f778f7768 fix: prevent tool name duplication in streaming accumulator (MiniMax/NVIDIA NIM)
Based on #11984 by @maxchernin.  Fixes #8259.

Some providers (MiniMax M2.7 via NVIDIA NIM) resend the full function
name in every streaming chunk instead of only the first.  The old
accumulator used += which concatenated them into 'read_fileread_file'.

Changed to simple assignment (=), matching the OpenAI Node SDK, LiteLLM,
and Vercel AI SDK patterns.  Function names are atomic identifiers
delivered complete — no provider splits them across chunks, so
concatenation was never correct semantics.
2026-04-18 12:50:32 -07:00
Brooklyn Nicholson
4caf6c23dd fix(tui): strip <think>…</think> tags from assistant content and route to reasoning panel
Models that emit reasoning inline as <think>/<reasoning>/<thinking>/<thought>/
<REASONING_SCRATCHPAD> tags in the content field (rather than a separate API
reasoning channel) had the raw tags + inner content shown twice: once as body
text with literal <think> markers, and again in the thinking panel when the
reasoning field was populated.

Port v1's tag set to lib/reasoning.ts with a splitReasoning(text) helper that
returns { reasoning, text }. Applied in three spots:

  - scheduleStreaming: strips tags from the live streaming view so the user
    never sees <think> mid-turn.
  - flushStreamingSegment: when a tool interrupts assistant output mid-turn,
    the saved segment is the stripped text; extracted reasoning promotes to
    reasoningText if the API channel hasn't already populated it.
  - recordMessageComplete: final message text is split, extracted reasoning
    merges with any existing reasoning (API channel wins on conflicts so we
    don't double-count when both are present).
2026-04-18 14:46:38 -05:00