Commit graph

538 commits

Author SHA1 Message Date
kshitijk4poor
69716a2e6f docs(compression): fix stale 'discarded' wording on in_place config flag
Review nit (yoniebans): the config.py comment still said compaction is
'lossy: the pre-compaction transcript is discarded, matching Claude Code /
Codex' — leftover from the original destructive design. The shipped behavior
is soft-archive: lossy for the LIVE context (what the model reloads), but the
pre-compaction turns are kept on disk (active=0, compacted=1), searchable via
session_search and recoverable. Comment now says so. Comment-only; no behavior
change.
2026-06-20 10:57:07 -07:00
kshitijk4poor
47fadc24d7 feat(compression): in-place compaction option that keeps one session id (#38763)
Context compression today rewrites the message list AND rotates the
session id — it ends the session, forks a parent_session_id child, and
renumbers the title (name -> name #2). That moving identity key is the
root cause of a whole bug cluster: /goal lost (#33618), pending response
lost at the split (#14238), orphan sessions (#33907), TUI sid desync
(#36777), FTS search gaps + duplicate sidebar entries (#45117), null
continuation cwd (#42228), and title-rename dead-ends (#48989). It also
forced a large defensive apparatus (compression lock, contextvar/env/
logging triple-sync, orphan finalization, gateway SessionEntry
re-propagation, tip projection) whose only job is surviving a
mid-conversation id change.

Add a compression.in_place config flag (default False during rollout).
When True, compaction rewrites the transcript and rebuilds the system
prompt but keeps the SAME session_id: no end_session, no child row, no
title renumber, no contextvar/logging re-sync, no memory/context-engine
session-switch. The conversation keeps one durable id for life, like
Claude Code / Codex. Compaction is lossy by design — the pre-compaction
transcript is summarized away, not archived.

The rotation path is unchanged when the flag is off (moved verbatim into
an else branch). Staged rollout: this PR ships the option behind a
default-off flag for live validation; a follow-up flips the default and
deletes the now-redundant rotation machinery, superseding the 14 open
band-aid PRs in this area.

- hermes_cli/config.py: add compression.in_place (default False), documented
- agent/agent_init.py: resolve the flag -> agent.compression_in_place
- agent/conversation_compression.py: branch compress_context() on the flag
- tests/run_agent/test_in_place_compaction.py: in-place invariants +
  rotation regression guard + config default

The pre-flush of current-turn messages (#47202) runs in BOTH modes, so no
boundary data loss. Prompt-cache invariant preserved: the system-prompt
rebuild is the same single sanctioned invalidation that already happens
during compaction — no NEW invalidation. Message alternation preserved.
2026-06-20 10:57:07 -07:00
helix4u
c253b07380 fix(model): clear stale endpoint credentials across switches 2026-06-19 19:58:26 -07:00
Teknium
cf58f1a520
feat(titles): support language-aware title generation (#45296)
Make auxiliary title prompts match the user language by default, with an optional pinned `auxiliary.title_generation.language` config.
2026-06-19 17:15:52 -07:00
alt-glitch
b6e2a54a94 fix(mcp): address adversarial review round 1 (cache parity, gates, races)
Consolidated findings from three independent reviewers (Codex, Claude Code, a
Hermes subagent w/ the hermes-agent-dev skill):

- BLOCKING: refresh_agent_mcp_tools rebuilt only the registry subset, silently
  dropping post-build-injected memory-provider (mem0/honcho/…) and context-
  engine (lcm_*) tools on every refresh. Now additive-preserving: re-applies
  the same injectors agent_init uses, staged on locals and published atomically.
- Re-injection now honors the #5544 enabled_toolsets gate for context-engine
  tools, so a restricted-toolset platform can't get lcm_* leaked back in.
- Atomic read-diff-publish under one lock: the returned `added` set and the
  (tools, valid_tool_names) pair are consistent even under concurrent callers
  (no half-swap, no TOCTOU).
- background_review fork opts out (_skip_mcp_refresh) so its byte-identical
  tools[] cache parity with the parent is preserved.
- CLI /reload-mcp routed through the shared helper (was a 4th divergent copy
  with the same clobber bug + missing disabled_toolsets).
- Explicit reloads (TUI RPC + CLI) pass enabled_override so a server the user
  just enabled in config this session is picked up; automatic paths reuse the
  agent's build-time selection.
- mcp_discovery_timeout default 5.0 -> 1.5s: correctness now comes from the
  between-turns refresh, so the startup wait is only a small turn-1 UX bump
  rather than a heavy dead-server latency penalty.
- has_registered_mcp_tools checks registered TOOLS (not connected servers) so a
  zero-tool/prompt-only server doesn't make the per-turn hook fire forever.
- Tests: rewrote the thread-safety test to actually exercise the write path
  (alternating tool sets), added the #5544-gate regression, the memory/context
  preservation regression, and a "callable next turn via valid_tool_names"
  contract; removed a dead monkeypatch line.
2026-06-19 11:57:43 -07:00
alt-glitch
93d6e73028 fix(mcp): expose late-connecting MCP tools to the agent (TUI/CLI/gateway)
MCP servers that connect after the agent's one-time tool snapshot were
invisible for the whole session. Two root causes, fixed together:

1. The startup discovery wait was a flat 0.75s. HTTP/OAuth servers
   commonly take 2-6s on a cold connect, so they missed the window and
   their tools never entered the agent's snapshot. `thread.join(timeout)`
   already returns the instant discovery completes, so raising the bound
   costs ~0s for the common case (no MCP / fast servers) and only ever
   blocks for a genuinely-pending server, capped so a dead server can't
   freeze startup. The bound is now configurable via
   `mcp_discovery_timeout` (config.yaml, default 5.0s).

2. Three call sites duplicated the agent tool-snapshot rebuild (the TUI
   `reload.mcp` RPC, the gateway reload, and the TUI late-binding refresh
   thread), and the late-refresh detected changes by tool COUNT — missing
   an equal-size add/remove swap. Consolidated into one shared
   `tools.mcp_tool.refresh_agent_mcp_tools(agent)` helper that diffs by
   tool NAME, mutates the agent under a lock (thread-safe), and respects
   the agent's own enabled/disabled toolsets.

The late-binding refresh keeps its pre-first-turn cache-safety guard:
it never rebuilds the tool list once a turn has started, so the cached
prompt prefix is never invalidated mid-conversation.

Tests: new tests/tools/test_refresh_agent_mcp_tools.py covers the
name-based diff, in-place mutation, agent-scoped filtering, thread
safety, and the config-driven discovery bound (incl. instant-return
when nothing is pending). 75 passed across the touched areas.
2026-06-19 11:57:43 -07:00
Teknium
2a5e9d994a
Merge pull request #48275 from NousResearch/feat/cron-scheduler-provider-chronos
feat(cron): pluggable CronScheduler interface + Chronos managed-cron provider (scale-to-zero)
2026-06-19 07:51:59 -07:00
Ben
ddd519ea70 feat(managed-scope): surface managed scope in config show and doctor
- show_config prints an administrator header naming the managed source and
  lists the pinned config/env keys when a scope is active (silent otherwise).
- hermes doctor gains a managed_scope_check under Configuration Files that
  reports the resolved managed dir + pinned key counts, and flags a
  HERMES_MANAGED_DIR redirect (the documented foot-gun).
2026-06-19 07:46:33 -07:00
Ben
4f9e15df97 feat(managed-scope): guard writes to managed config/env keys
- set_config_value hard-rejects a managed config key (D2) and names the
  source, exiting non-zero.
- save_env_value / remove_env_value refuse a managed env key.
- save_config strips managed leaves from a bulk write (mechanical safety net)
  with a warning, so the unmanaged remainder still persists.
New _strip_dotted_keys helper drives the bulk-save pruning. All guards are
distinct from and layered after the existing is_managed() package-manager
write-lock.
2026-06-19 07:46:33 -07:00
Ben
b5ddd6e719 feat(managed-scope): managed config layer wins over user config
_load_config_impl now deep-merges the managed config.yaml on top of the
expanded user config so managed leaves win while sibling keys stay
user-controlled (leaf-level merge, D3). Managed values are expanded against
the process env only, never user-defined ${VAR}, so a user can't shadow a
managed literal. The managed file's (mtime,size) is folded into the load
cache key so editing it invalidates the cache. This inverts the usual
env-over-config precedence for pinned keys by design (see design doc §4.1).
2026-06-19 07:46:33 -07:00
teknium1
a58287afcb
Merge remote-tracking branch 'origin/main' into pr48275-rebase
# Conflicts:
#	cron/scheduler.py
2026-06-19 07:40:29 -07:00
Teknium
d7bff949af
fix(cli): default cli_refresh_interval to 1.0 to keep status bar alive (#49087)
PR #49056 set the default to 0, which reverts the #45592 idle-clock fix:
without a periodic invalidate, prompt_toolkit stops repainting the bottom
chrome during idle and the status bar goes stale/disappears after a turn.

Restore 1.0 as the default for everyone. The config knob stays — users on
emulators where the per-second redraw fights auto-scroll (#48309) can set
display.cli_refresh_interval: 0 to opt out.
2026-06-19 07:35:06 -07:00
OYLFLMH
c1ffd4c3b4 fix(cli): make refresh_interval configurable, default to 0 (disabled)
Commit 6724daa2c added refresh_interval=1.0 to keep the idle clock
ticking, but unconditional 1 Hz redraws in non-fullscreen prompt_toolkit
mode cause terminal emulators (Xshell, iTerm2, Windows Terminal) to
auto-scroll to the bottom on every tick — breaking scroll-up to read
history.

Drive it from display.cli_refresh_interval (0 = disabled, the default)
so users who want the ticking clock can opt in without affecting everyone.

Fixes: #48309
Related: 6724daa2c, 8972a151a
2026-06-19 07:06:34 -07:00
Shannon Sands
d9190491a6 Add Slack setup hints and field validation 2026-06-19 12:16:23 +05:30
Shannon Sands
f741e70791 Add Slack allowed users setup field 2026-06-19 12:16:23 +05:30
Ben
637aff46e7 Merge remote-tracking branch 'origin/main' into hermes/hermes-6fe26723 2026-06-19 15:17:13 +10:00
Victor Kyriazakos
3ead2bdd0d feat(prompt): configurable per-platform system-prompt hint overrides
Add platform_hints config so an admin can append to or replace Hermes'
built-in platform hint for a single messaging platform (WhatsApp, Slack,
Telegram, ...) without affecting other platforms. Enables enterprise
managed profiles to steer platform-aware skills (e.g. invoke a custom
table-formatting skill on WhatsApp where Markdown tables don't render)
while leaving Telegram/Slack/CLI behavior unchanged.

- hermes_cli/config.py: document platform_hints in DEFAULT_CONFIG
- agent/agent_init.py: load platform_hints -> agent._platform_hint_overrides
- agent/system_prompt.py: _resolve_platform_hint() applies append/replace
  (replace wins; bare string = append shorthand); defensive on bad config
- tests: 16 cases covering append/replace/shorthand/isolation/malformed

Override only affects the platform-hint segment of the system prompt;
SOUL/context/memory tiers and general instructions are unchanged.
2026-06-18 14:28:01 -07:00
flooryyyy
f8d8f045fa feat(kanban): auto-subscribe calling session on kanban_create
When a worker calls kanban_create from inside a session that has a
persistent delivery channel, the originating session is now subscribed
to the new task's completion/block events automatically. The agent
that dispatched the task gets notified instead of having to poll.

- Gateway sessions (telegram/discord/slack): HERMES_SESSION_PLATFORM +
  HERMES_SESSION_CHAT_ID ContextVars, set by the messaging gateway.
- TUI / desktop sessions: HERMES_SESSION_KEY in the subprocess env.
  The TUI notification poller keys on platform='tui' + chat_id=<key>.
- CLI / cron / test: no persistent channel, no subscription.

Gated by kanban.auto_subscribe_on_create in config.yaml (default True).
Disable to mirror pre-feature behaviour — users who want explicit
kanban_notify-subscribe calls per task can set it to false. This
config gate addresses the design concern that got PR #19718 reverted
upstream (unconditional implicit auto-subscribe on tool-driven
kanban_create was too aggressive for orchestrator users).

HERMES_SESSION_ID is intentionally not a fallback channel — it is
set by ACP/agent subprocess telemetry for every invocation, not just
TUI, so treating it as a notification target would auto-subscribe
every CLI session and re-introduce the over-eager behaviour.

The kanban_create response now includes a 'subscribed' bool so
orchestrators can react if subscription failed (e.g. by falling
back to explicit kanban_notify-subscribe or to polling).

Includes 6 tests covering the gateway / TUI / CLI / partial-context /
gated / add_notify_sub-failure paths. All 90 tests in
test_kanban_tools.py pass; 509 broader kanban tests pass.
2026-06-18 14:10:51 -07:00
Teknium
0fa7d6f660
fix(desktop): never persist or restore a named custom provider as bare "custom" (#48547)
* Port from cline/cline#11514: encourage parallel tool calls

Add a universal system-prompt guidance block telling the model to batch
independent tool calls (reads, searches, web fetches, read-only commands)
into a single assistant turn instead of one call per turn. The runtime
already executes independent batches concurrently (read-only tools always;
non-overlapping path-scoped file ops); the open-source system prompt had
nothing steering the model to PRODUCE the batch. Fewer round-trips means
less resent context, which compounds over a long conversation.

- prompt_builder.py: new PARALLEL_TOOL_CALL_GUIDANCE block (short, static,
  cache-amortised) modeled on TASK_COMPLETION_GUIDANCE.
- system_prompt.py: inject right after the task-completion block, gated by
  agent.valid_tool_names + the new toggle.
- agent_init.py: read agent.parallel_tool_call_guidance (default True).
- config.py: add the default under the agent section.
- test_prompt_builder.py: behavior-contract tests (batching steer, dependent
  carve-out, length bound) — invariants, not wording snapshots.

Adapted from Cline's TypeScript tool-surface guidance to hermes-agent's
Python prompt-assembly architecture and config-over-env conventions.

* fix(desktop): never persist or restore a named custom provider as bare "custom"

Custom providers vanish from the Desktop/TUI model picker with
"No LLM provider configured" — repeatedly fixed (#44062, #44109, #45578)
and repeatedly regressed (#44022, #47714) because every fix only recovered
the entry identity from a persisted base_url. When a session is
persisted/restored with the resolved provider "custom" and NO base_url, bare
"custom" leaked through verbatim; resolve_runtime_provider("custom") routes to
the OpenRouter default URL with no api_key, so the next turn/resume dies.

Bare "custom" is the resolved billing class shared by every named providers:/
custom_providers: entry — it is not a routable identity. Centralize the
"never let bare custom escape" invariant in one helper,
runtime_provider.canonical_custom_identity(), and apply it at all four leak
sites in tui_gateway/server.py:

- _ensure_session_db_row  — the ORIGIN: first DB write seeds the bad row
- _runtime_model_config   — live persist
- _stored_session_runtime_overrides — resume restore (heals old rows; drops
  unrecoverable bare custom so resume falls back to config default)
- _make_agent             — rebuild / per-turn

The helper recovers custom:<name> from the endpoint URL when present, else
from config.model.provider (the durable identity left when no base_url
survived). Regression tests in test_custom_provider_session_persistence.py
lock the no-base_url vector at every site so it cannot regress again.
2026-06-18 11:11:51 -07:00
Kewe63
f1254c8eaf fix(skills): rmtree scope guard + default pre_update_backup to true (#48200)
Defense-in-depth fix for the silent wipe of ~/.hermes/ documented in
#48200. A `hermes update --yes` run silently destroyed a user's
.env, MEMORY.md, kanban.db, custom skills, and scripts. Two changes:

1. `_rmtree_writable` in tools/skills_sync.py now refuses to rmtree
   anything outside SKILLS_DIR (the HERMES_HOME/skills/ root).
   All five call sites pass paths under SKILLS_DIR, so the guard is
   a no-op for current code and a loud, recoverable failure for
   any future regression (bad path join, malicious bundled
   manifest, stale path in scope after an exception).

2. The default `updates.pre_update_backup` flips from false to
   true in hermes_cli/config.py. A few minutes of zip per update
   is negligible compared to silent total data loss. Still
   overridable; --no-backup still works for one-off opt-out.

Five new tests in TestRmtreeWritableScopeGuard (root path,
hermes home, sibling dir, skills root itself, subdir) plus a
flipped `test_default_enabled_creates_backup` in test_backup.py.
178/178 tests pass in the two affected files. Public method
signatures unchanged, no test-stub blast radius.

Closes #48200
2026-06-18 08:53:35 -07:00
Ben
28531b6186 Merge remote-tracking branch 'origin/main' into hermes/hermes-6fe26723 2026-06-18 15:31:29 +10:00
Ben
b75757d4aa feat(cron): wire on_jobs_changed, cron.chronos config, docs + agent↔NAS contract
Phase 4F (F.1 + F.2 + F.3, agent side). F.4 is the operator-run live smoke
(needs a NAS deployment); recorded in the PR, not code.

F.1 — on_jobs_changed wiring:
- cron/scheduler.py: _notify_provider_jobs_changed() — resolve the active
  provider, call on_jobs_changed(), swallow errors. Lives in scheduler.py (not
  jobs.py) so the store stays free of provider imports (no import cycle).
- Wired at the consumer surfaces AFTER a successful mutation: the cronjob model
  tool (tools/cronjob_tools.py, create/update/remove/pause/resume) — which the
  `hermes cron` CLI also routes through — and the REST handlers
  (gateway/platforms/api_server.py, same five). Built-in's no-op default = zero
  behavior change on the default path. Sleeping-agent direct jobs.json writes
  (no tool/CLI/REST) are covered by reconcile-on-wake in start().

F.2 — config: cron.chronos.{portal_url,callback_url,expected_audience,
nas_jwks_url}. All non-secret; the agent holds no scheduler creds and the
outbound provision call reuses the existing Nous token (no token key). Additive
deep-merge key, no version literal.

F.3 — docs:
- docs/chronos-managed-cron-contract.md: authoritative agent↔NAS wire contract
  (the three agent-cron endpoints + inbound /api/cron/fire + the 3-hop trust
  model + at-most-once/re-arm semantics). This is what the NAS-side agent builds
  against.
- cron-internals.md: "Managed cron (Chronos) for scale-to-zero" section.
- cli-commands.md: cron.provider accepts chronos + the cron.chronos.* keys.
- User docs name no scheduler vendor (QStash is a NAS-internal detail).

INVARIANT re-verified: zero qstash/upstash hits across plugins/cron, gateway,
hermes_cli, tools, website/docs (the one remaining repo hit is an unrelated
Context7 MCP comment in tools/mcp_tool.py).

Tests: test_jobs_changed_notify (5) — notify calls provider hook, swallows
errors, built-in harmless, tool create/remove notify. Full cron + chronos +
webhook + config + api_server_jobs suites green (504 in the cron+chronos+webhook
run).
2026-06-18 15:11:32 +10:00
Ben Barclay
4440d77bf3
fix(update): scope install-method stamp to the code tree, not $HERMES_HOME (#48188)
The install method (docker/git/pip/...) describes the *running binary*, but
detect_install_method() read it from $HERMES_HOME/.install_method — a shared
DATA directory. The Docker docs deliberately bind-mount $HERMES_HOME
(~/.hermes:/opt/data) so config/sessions/memory persist and can be shared with
a host-side Desktop/CLI install.

When a containerized gateway and a host install share one $HERMES_HOME, the
home-scoped stamp is a single slot describing two installs: the published image
stamps 'docker' on every boot, the host install then reads 'docker' and the
in-app updater refuses to run 'hermes update' ("doesn't apply inside the Docker
container"). Reinstalling the Desktop app from the DMG doesn't help because the
contaminated stamp is re-read every time.

Fix (option 1 — code-scoped stamp):
- detect_install_method() reads <install tree>/.install_method first (next to
  the running code, immune to the shared data dir). It falls back to the legacy
  $HERMES_HOME stamp for back-compat, but IGNORES a 'docker' home stamp when
  not actually containerized — so already-poisoned shared homes self-heal.
- stamp_install_method() writes the code-scoped stamp.
- install.sh stamps $INSTALL_DIR instead of $HERMES_HOME.
- Dockerfile bakes 'docker' into /opt/hermes/.install_method at build time
  (inside the immutable block); stage2-hook.sh no longer writes the home stamp
  and proactively removes a stale 'docker' one to heal existing shared homes.

Genuine containers still resolve to 'docker' (baked stamp, or legacy home stamp
honored when containerized). Unstamped installs in generic containers still fall
through to git/pip (preserves the #34397 fix).
2026-06-18 14:14:41 +10:00
Ben
ae8fa11097 feat(cron): cron.provider config + plugins/cron discovery + resolver
Phase 2 of the pluggable cron-scheduler refactor. Still no call-site changes;
this wires up provider SELECTION with a hard safety net.

Task 2.1: cron.provider config key (hermes_cli/config.py), empty = built-in.
  Additive key — deep-merge picks it up into existing configs with no version
  bump (verified: load_config() yields the key on a pre-existing config.yaml).
Task 2.2: plugins/cron/__init__.py — discovery machinery cloned near-verbatim
  from plugins/memory/__init__.py, retargeted at CronScheduler /
  register_cron_scheduler. Bundled (plugins/cron/<name>/) + user
  (/plugins/<name>/) dirs, bundled wins collisions. The built-in is
  NOT discovered here — it's core, so the fallback can't be removed.
Task 2.3: resolve_cron_scheduler() in cron/scheduler_provider.py — reads
  cron.provider and ALWAYS degrades to built-in (missing / unavailable / load
  error / typo all fall back with a warning). cron can never be left without a
  trigger.

Deviation from plan: the plan's resolver snippet used cfg_get("cron.provider")
(dotted-string form). The real cfg_get signature is cfg_get(cfg, *keys,
default=) — corrected to cfg_get(load_config(), "cron", "provider", default=""),
matching plugins/memory/__init__.py:349. Tests monkeypatch load_config (not
cfg_get) so the real traversal runs.

Tests: default key empty, discovery returns list, unknown load returns None,
and the four resolver paths (empty→builtin, no-section→builtin,
unknown→builtin, unavailable→builtin, available→used). Full tests/cron/: 453
passed; config suite green (additive key, no migration break).
2026-06-18 14:09:36 +10:00
Teknium
f80381c456
feat(prompt): scale context-file cap to model window + point agent at truncated file (#47846)
Context files (AGENTS.md, CLAUDE.md, .hermes.md, .cursorrules, SOUL.md) were
hard-capped at a flat 20K chars before head/tail truncation. Among the agent
harnesses we track, only Codex caps project docs at all (32 KiB); Claude Code,
OpenCode, and Cline load them whole. The flat 20K predates large context
windows and silently truncates real-world AGENTS.md files.

B — dynamic cap: when context_file_max_chars is unset (now the shipped
default), the cap scales with the model's context window
(ctx_tokens * 4 * 0.06, floor 20K, ceiling 500K). Small-context models stay at
the historical 20K; a 200K model gets 48K; large models stop truncating real
docs. An explicit context_file_max_chars still wins. Context length is resolved
once per conversation (stable -> prompt cache untouched).

C — when truncation does happen, the marker now names the concrete file path
and tells the agent to read_file it for the full content.

Validation: 154 targeted tests + full agent/ + hermes_cli/ + test_config
(0 failures); E2E against a real 60K AGENTS.md confirms small windows truncate
with the path-bearing marker, large windows load whole, and the system prompt
is byte-stable across rebuilds.
2026-06-17 05:40:26 -07:00
Teknium
7bbffceb9c
feat(curator): make skill consolidation opt-in (prune stays default-on) (#47840)
The curator now defaults to prune-only: the deterministic inactivity pass
(mark stale / archive long-unused skills) still runs whenever the curator is
enabled, but the opinionated LLM umbrella-building consolidation fork is OFF
by default.

- agent/curator.py: add DEFAULT_CONSOLIDATE=False + get_consolidate(); gate
  the forked aux-model review in run_curator_review behind it (new consolidate
  param, None=read config). When off, the LLM pass is skipped entirely (no
  aux-model cost); the run is still recorded and reported.
- config.py: add curator.consolidate (default false); v29->v30 migration seeds
  the key for existing installs without clobbering a user-set value.
- hermes_cli/curator.py: 'hermes curator run --consolidate' override; status
  shows consolidate state; prune-only notice on run.
- docs + tests.
2026-06-17 05:20:32 -07:00
teknium
36ae958473 feat(gateway): gate message timestamps behind opt-in (default off)
Follow-up to salvaged PR #41633: the timestamp prefix injection was
unconditional. Gate the in-context render behind
gateway.message_timestamps.enabled (default false) at both the live-message
and history-replay sites; timestamp metadata is still captured + persisted
regardless so the toggle can be flipped on later. Add DEFAULT_CONFIG entry,
docs, and gate tests.
2026-06-16 15:49:59 -07:00
Wolfram Ravenwolf
f6a42b1acf feat(prompt): make context-file truncation limit configurable
PROBLEM: Automatic context files such as SOUL.md and AGENTS.md were capped by a hardcoded CONTEXT_FILE_MAX_CHARS value. Amy's local fork had raised that constant from 20K to 25K so a larger SOUL.md would not be silently truncated, but the hardcoded 25K value changed upstream default behavior and made the patch less generally useful.

SOLUTION: Restore the upstream-compatible 20K default, add a context_file_max_chars config setting for users who intentionally keep larger identity/project-context files, keep chat-visible truncation warnings, and document the new setting. Tests cover the default, config override, explicit max_chars precedence, and the warning text.
2026-06-16 11:28:35 -07:00
teknium
6373aba80f feat(gateway): rename to tool_progress_grouping, add config/docs/tests
Follow-up to salvaged PR #41620:
- Rename tool_progress_style -> tool_progress_grouping (clearer intent)
- Add display.tool_progress_grouping to DEFAULT_CONFIG (accumulate default)
- Document in messaging docs incl. 'separate is noisier, only where progress enabled'
- Add resolver tests (default/global/override/invalid/case)
2026-06-16 05:49:24 -07:00
teknium
98ae28657f feat(display): document and test memory_notifications setting
Follow-up to salvaged PR #4684:
- Add display.memory_notifications to DEFAULT_CONFIG (off|on|verbose, default on)
- Document the setting in docs/user-guide/features/memory.md
- Add resolver tests for off/on/verbose memory + skill paths
2026-06-16 05:45:40 -07:00
Teknium
a6364bfa08
fix(telegram): edit streamed previews in place as rich (Bot API 10.1) (#46890)
Streamed Telegram replies that finalize through editMessageText were
converted to MarkdownV2, which has no table syntax and rewrites pipe
tables into bullet lists — users saw a table while streaming that
collapsed to a list at the last moment.

Finalize now edits the existing preview IN PLACE via Bot API 10.1's
editMessageText rich_message parameter when the content has constructs
the legacy path degrades (tables, task lists, <details>, block math).
No fresh send + delete, so no duplicate-preview flicker — the reason
#46206 reverted the fresh-final re-send path. prefers_fresh_final_streaming
stays False; the in-place edit replaces it.

- _needs_rich_rendering(): rich reserved for table/task-list/details/math
  (adapted from #45995, @YonganZhang); plain replies stay on MarkdownV2.
- _try_edit_rich(): editMessageText + rich_message via do_api_request,
  mirroring _try_send_rich's fallback/latch/transient contract.
- edit_message finalize tries rich in place before the 4,096 overflow
  pre-flight (rich cap is 32,768), falling back to legacy on rejection.
- rich_messages default flipped back to True (DEFAULT_CONFIG + adapter).
- docs (en + zh-Hans) + cli-config example updated to default-on.

Closes the root cause behind #45911 / #46009.
2026-06-16 05:26:04 -07:00
Teknium
c66ecf0bc3
feat(delegation): async background subagents via delegate_task(background=true) (#40946)
* feat(delegation): async background subagents via delegate_task(background=true)

delegate_task(background=true) dispatches a subagent that runs in the
background and returns a handle immediately, so the user and model keep
working while it runs. The full result — plus the original task source —
re-enters the conversation as a new turn when the subagent finishes,
riding the same completion-queue rail as terminal background processes.

- tools/async_delegation.py: daemon-executor registry, capacity cap,
  rich self-contained completion event pushed onto the shared
  process_registry.completion_queue (type='async_delegation').
- delegate_tool.py: background param + single-task dispatch branch;
  batch async rejected (v1).
- process_registry.py: format_process_notification renders the rich
  task-source block (goal/context/toolsets/model/status/result).
- gateway/run.py: dedicated _async_delegation_watcher drains + injects
  results into the originating session (idle + post-turn), session_key
  routing enrichment, shutdown interrupt of dangling delegations.
- config: delegation.max_async_children (default 3).

Reuses the existing idle-drain wiring rather than mutating a running
agent loop, preserving message-role alternation and prompt-cache
invariants. 13 targeted tests; CLI + gateway paths E2E-verified.

* test(delegation): make async non-blocking tests environment-independent

CI 'test (5)' flaked on a cold, 8-worker runner: the first
delegate_task(background=true) call measured 2.27s of one-time setup
(config load + child-agent construction + imports), tripping the
elapsed < 1.0 wall-clock assertion. That assertion was testing setup
overhead, not blocking.

Replace the wall-clock thresholds with the real invariant: dispatch
returns while the child is still gated (active_count == 1, completion
queue empty), which a synchronous impl could not do. Keep only a loose
4s sanity backstop well under the runner's 5s gate.

* fix(delegation): harden async background delegation

Follow-up review fixes:
- Detach background child from parent._active_children at dispatch —
  otherwise parent-turn interrupts (Ctrl+C, mid-turn steering), cache
  evicts (release_clients), and session close (/new) kill/close the
  detached subagent mid-run, defeating the point of background mode.
  Lifecycle is owned by the async registry's interrupt_fn.
- Make the capacity check atomic with the record insert (TOCTOU: two
  concurrent dispatches could both pass active_count() and exceed the cap).
- TUI dedup: key async_delegation events by delegation_id — the
  fallthrough keyed them all as ("", type), suppressing every completion
  after the first in the desktop/TUI status feed.
- CLI /stop now interrupts running background delegations and /agents
  lists them (they live outside the process registry and were invisible).
- Drop stray unbalanced ']' line from the re-injection block and the
  unused _ASYNC_DEFAULT import.

Tests: detach-at-dispatch + concurrent-capacity race added (15 total in
test_async_delegation.py); 137 delegate + 140 process-registry/notify/watch
+ 7 TUI dedup tests pass.

* fix(delegation): harden async background completion drains
2026-06-15 13:33:12 -07:00
Teknium
a1f51feb72
fix(telegram): avoid rich final duplicate previews (#46206) 2026-06-14 11:13:38 -07:00
Teknium
a27d7e68cc
fix(mcp): block suspicious stdio configs before probe (#46112) 2026-06-14 04:46:54 -07:00
Teknium
972a9885ee
fix(mcp): block exfil-shaped stdio server configs (#46083) 2026-06-14 04:24:14 -07:00
Teknium
723c2331bd fix: make profile subprocess HOME policy explicit 2026-06-14 03:20:21 -07:00
Justin Sunseri
12682d96b9 feat(telegram): restore rich messages opt-out
Salvages PR #45840's client-compatibility opt-out while keeping rich messages enabled by default via telegram.extra.rich_messages: true.
2026-06-13 21:45:49 -07:00
Teknium
bba9b519aa
fix(delegation): remove the default subagent wall-clock timeout (#45149)
Subagents doing legitimate heavy work (deep code reviews, research
fan-outs, slow reasoning models) were routinely killed at the blanket
600s child_timeout_seconds cap while making steady progress (e.g. 36
API calls completed when the axe fell). Failures should come from what
the child is actually doing — API errors, tool errors, iteration
budget — not a delegation-level stopwatch.

- DEFAULT_CHILD_TIMEOUT: 600 -> None; Future.result(timeout=None)
  blocks until the child finishes
- config default delegation.child_timeout_seconds: 600 -> 0
  (0/negative = disabled; positive opts back in, floor 30s unchanged)
- stuck-child protection unchanged: the heartbeat staleness monitor
  still stops refreshing parent activity so the gateway inactivity
  timeout fires on a truly wedged worker; the 0-API-call diagnostic
  dump still works when a cap is configured
- docs updated (EN + zh-Hans)
2026-06-12 12:58:25 -07:00
Teknium
4474873d2c
feat(cli): persist resolved approval/clarify prompts in scrollback (#44702)
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix Lockfile Fix / auto-fix-main (push) Waiting to run
Nix Lockfile Fix / fix (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Waiting to run
Tests / test (1) (push) Waiting to run
Tests / test (2) (push) Waiting to run
Tests / test (3) (push) Waiting to run
Tests / test (4) (push) Waiting to run
Tests / test (5) (push) Waiting to run
Tests / test (6) (push) Waiting to run
Tests / save-durations (push) Blocked by required conditions
Tests / e2e (push) Waiting to run
Typecheck / typecheck (apps/bootstrap-installer) (push) Waiting to run
Typecheck / typecheck (apps/desktop) (push) Waiting to run
Typecheck / typecheck (apps/shared) (push) Waiting to run
Typecheck / typecheck (ui-tui) (push) Waiting to run
Typecheck / typecheck (web) (push) Waiting to run
uv.lock check / uv lock --check (push) Waiting to run
Build Skills Index / build-index (push) Has been cancelled
Build Skills Index / trigger-deploy (push) Has been cancelled
Modal prompt panels (dangerous-command approval, clarify questions)
live in the prompt_toolkit layout and vanish on the next repaint,
leaving no trace of the question or the decision in chat history.

Emit a dim one-line summary after each prompt resolves:
  ⚠ Approval: <command> → allowed for session
  ? Clarify: <question> → <answer>

Gated on display.persist_prompts (default true). Detail and outcome
are whitespace-collapsed and capped at 120 chars.
2026-06-12 01:14:35 -07:00
Teknium
c196269d8d
fix(credits): suppress usage gauge when top-up funds exist + add display.credits_notices toggle (#44716)
The subscription-cap usage gauge (50/75/90% bands) ignored purchased
(top-up) credits: a sub user with top-up funds got a sticky warn banner
at 90% of their cap — permanently at >=100%, alongside grant_spent —
despite being fully able to keep inferencing. The cap is the wrong
denominator for an account that can keep spending.

- evaluate_credits_notices: purchased_micros > 0 suppresses the usage
  band (grant_spent already covers the cap-reached + top-up case with
  the remaining balance). A top-up landing mid-session clears any
  showing band; spending top-up down to 0 resumes the gauge.
- New display.credits_notices config (default true): false silences all
  credits notices. State capture and /usage are unaffected. Read once
  per agent (cached) in _emit_credits_notices, fail-open true.
- Docs: configuration.md display block.
2026-06-12 01:06:46 -07:00
teknium1
9a09ea69fb feat(cron): Suggested Cron Jobs — one surface for proposed automations
Hermes can propose automations and let the user accept them with one tap
via /suggestions, instead of making them assemble cron jobs by hand. Every
proposal — wherever it originates — flows through one surface.

Sources (the 'where suggestions come from'):
- catalog: curated starter automations (daily briefing, important-mail
  monitor, weekly review, workday-start reminder) via /suggestions catalog
- recipe: installing a skill that carries a metadata.hermes.recipe block
  registers a suggestion instead of auto-scheduling
- usage / integration: reserved for the background-review detector and
  account-connect triggers (sources defined; emitters land next)

Pieces:
- cron/suggestions.py — the store. add/list/accept/dismiss, dedup+latch by
  key (dismissed proposals never re-offered), pending cap so it can't become
  a nag wall. Accepting calls the existing cron.jobs.create_job — there is
  NO second job engine. Mirrors jobs.py storage (atomic writes, lock, 0600).
- cron/suggestion_catalog.py — the curated set. The important-mail monitor
  entry is where the old proactive-monitor poll->classify->surface engine
  lives now (cron/scripts/classify_items.py + the 'monitor' aux task), as ONE
  catalog automation rather than a standalone feature.
- tools/recipes.py — recipe<->job bridge; register_recipe_suggestion() makes
  a recipe source 'recipe' of this surface. recipe_to_job_spec() is the single
  translation both the direct and suggestion paths share.
- hermes_cli/suggestions_cmd.py — shared /suggestions handler (CLI + gateway
  never drift); /suggestions [accept N|dismiss N|catalog|clear].
- Wired: CommandDef + CLI dispatch (cli.py) + gateway dispatch (gateway/run.py)
  + aux 'monitor' task (config.py) + recipe-install hook (skills_hub.py).

Consent-first throughout: nothing auto-schedules; acceptance is always
explicit; dismissals latch.

Supersedes #41122 (proactive-monitor) and #41127 (recipes): both fold in here
as a catalog entry and a suggestion source respectively.

Tests: store (dedup/cap/accept/dismiss/latch), catalog seeding+idempotency,
recipe->suggestion bridge, command handler, aux config. E2E: recipe SKILL.md
-> parsed -> suggested -> accepted -> real cron job persisted to jobs.json.
2026-06-11 10:49:47 -07:00
Teknium
4d6a133a9f
fix(agent): gate skill-index demotion behind the opt-in focus mode (#44387)
The coding posture's names-only demotion of non-coding skill categories
(#44342) applied under the default auto mode, silently changing the skill
index for every user in a git repo. Index changes must be opt-in: demotion
now only fires under agent.coding_context=focus, alongside the toolset
collapse. auto/on leave the skill index untouched; focus semantics are
unchanged (demoted, never hidden; deny-list keeps coding-adjacent and
custom categories at full entries).
2026-06-11 10:00:57 -07:00
Teknium
9c051f57c3
fix(dashboard): Anthropic API Key entry checks ANTHROPIC_API_KEY, not Claude Code creds; hide deprecated tool-progress env vars (#44286)
Two dashboard fixes:

1. The 'Anthropic API Key' OAuth catalog entry's status fn read
   ~/.claude/.credentials.json (which has its own dedicated claude-code
   entry) and never checked ANTHROPIC_API_KEY at all. It now checks the
   Hermes PKCE file, then the registry env-var order (ANTHROPIC_API_KEY
   -> ANTHROPIC_TOKEN -> CLAUDE_CODE_OAUTH_TOKEN) via get_env_value, so
   keys from .env, the shell, or Bitwarden (injected into the process
   env by load_hermes_dotenv) are all reported, with a '(from Bitwarden)'
   source suffix when applicable.

2. Deprecated HERMES_TOOL_PROGRESS / HERMES_TOOL_PROGRESS_MODE removed
   from OPTIONAL_ENV_VARS so the keys page and setup checklists stop
   offering them. Moved to _EXTRA_ENV_KEYS so .env sanitization and
   reload_env still recognize them for existing users (gateway back-compat
   fallback unchanged).
2026-06-11 07:18:15 -07:00
brooklyn!
3e74f75e41
feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316)
* feat(agent): coding-context posture with per-model edit-format tuning

Hermes detects when it's running in a coding context — an interactive
surface (CLI, TUI, ACP, desktop) sitting in a code workspace (git repo or
recognised project root) — and shifts into a coding posture. Outside that
(chat platforms, non-workspaces) nothing changes.

The posture is modelled as a frozen RuntimeMode selected from a small
ContextProfile registry (coding/general). A profile is data: the toolset to
collapse to, the operating brief to inject, and seams for model routing and
memory. Every domain reads the same resolved object instead of re-probing
git/config on its own:

- System prompt — RuntimeMode.system_blocks(): an operating brief (gather
  context before editing, edit through tools not chat, verify with terminal,
  cap retry loops) plus a live git/workspace snapshot, built once and baked
  into the stable prompt tier so per-conversation caching is preserved.
- Per-model edit-format tuning — the brief nudges each model family toward
  the patch mode it handles best: OpenAI/Codex toward mode='patch' (V4A
  multi-file diffs), Anthropic toward mode='replace' (string replacement).
  The model id rides on RuntimeMode; unknown families keep neutral wording.
- Skill index — non-coding skill categories are pruned from the prompt's
  skill index (discovery-only; skills_list/skill_view still reach the full
  catalog, with a disclosure note).
- Toolset — only under the opt-in 'focus' mode does the posture collapse to
  the coding toolset + enabled MCP servers; the default posture is
  prompt-only and never overrides configured toolsets.

Activation via agent.coding_context: auto (default), focus, on, off.
Subagents inherit the posture for free via toolset inheritance + the shared
prompt builder. Detection is not memoized so a long-lived gateway/TUI
process can't pin a stale posture across working directories.

* feat(agent): cover new-file authoring in the coding edit-format nudge

The per-model edit-format guidance only addressed editing existing code
(patch mode='patch' vs 'replace'), but authoring a brand-new file —
write_file, not patch — is a large fraction of real coding work and the
nudge was silent on it. Surfaced when building a single-file artifact where
the dominant operation was write_file and the steering offered no guidance.

Both family lines now lead with "author new files with write_file; for
edits to existing code prefer ...". Tests assert write_file appears in each
family's brief; unknown families still get neutral wording.

* docs(agent): correct memoization docstring + clarify TUI config-load asymmetry

* feat(agent): sharpen the coding posture — verify-loop facts, wider edit steering, $HOME guard

Tuning pass on the coding posture from dogfooding it as a harness:

- Workspace snapshot now hands the model its verify loop up front:
  detected manifests + package manager (lockfile sniff), the exact
  verify commands (package.json scripts, Makefile targets,
  scripts/run_tests.sh, pytest config), and which context files
  (AGENTS.md / CLAUDE.md / .cursorrules) exist at the root. Marker-only
  (non-git) projects get the snapshot too instead of nothing. The
  "verify before claiming done" brief line was the highest-value piece
  in evals — this turns it from advice into an executable loop instead
  of making the model rediscover the test command every session. Still
  stat-cheap, size-guarded reads, built once at prompt time.

- Edit-format steering covers the families Hermes actually serves:
  Gemini and open-weight coding models (DeepSeek, Qwen, Kimi, GLM,
  Grok, Hermes, Llama, Mistral, Devstral, MiniMax) steer to
  mode='replace' — their RL scaffolds use str_replace-style editors.
  Previously only GPT/Codex and Claude families got steering; the
  models Hermes users disproportionately run all fell to neutral.

- Operating brief gains four behaviors elite harnesses encode: batch
  independent reads/searches in one turn; fix root causes and the bug
  class (sibling call paths), not the reported site; no drive-by
  refactors/renames/reformatting; never read, print, or commit secrets.
  Plus a patch-failure escalation ladder: after the same region fails
  twice, rewrite the enclosing function/file with write_file instead of
  a third patch attempt.

- $HOME dotfiles guard: a git repo rooted exactly at the home directory
  (or a marker sitting in it, e.g. a global ~/AGENTS.md) is user config,
  not a code workspace — without the guard, every session anywhere under
  a dotfiles-managed home silently flipped to the coding posture. Real
  projects under such a home still detect via their own markers/repos;
  'on' mode bypasses the guard.
2026-06-10 23:06:44 -05:00
Barron Roth
2c19208224 feat(tts): add Gemini audio tag rewrite 2026-06-10 02:57:39 -07:00
Barron Roth
5718811de0 feat(tts): add Gemini persona prompt file 2026-06-10 02:57:39 -07:00
Ben Barclay
15813336cc
fix(config): preserve original .env file mode in remove_env_value too (#43349)
#33699 fixed save_env_value so an operator-set .env mode (e.g. 0640 on a
Docker bind-mount) survives a config write instead of being re-tightened
to 0600 by the unconditional _secure_file() call. The sibling
remove_env_value() had the identical bug: it restores original_mode and
then unconditionally called _secure_file(env_path), clobbering the mode
back to 0600 on every `hermes config remove KEY`.

Apply the same fix: move _secure_file() into the else branch so it only
runs when no original mode was captured (a freshly created .env still
gets 0600 hardening; existing operator-set modes survive).

Added test_remove_env_value_preserves_existing_file_mode_on_posix, which
fails on the unfixed remove path (expected 0o640, got 0o600) and passes
with the fix.
2026-06-10 19:53:07 +10:00
Teknium
095f526b11
refactor(memory,skills): replace tri-state write_mode with boolean write_approval (default off) (#43354)
The shipped tri-state write_mode (on|off|approve) conflated two concepts —
whether writes are enabled and whether they're gated — so 'on' (writes flow
freely, gate inactive) read like 'gating is on'. Replace it with a single
clear boolean gate that defaults off.

  memory.write_approval / skills.write_approval:
    false (default) — write freely; the approval gate is off (pre-gate behaviour)
    true            — require approval: memory foreground prompts inline, memory
                      background-review + all skill writes stage for review

The old 'off = block all writes' mode is dropped; memory_enabled: false already
disables memory entirely, so a third 'block' state was redundant.

- tools/write_approval.py: get_write_mode/MODE_* → write_approval_enabled() bool;
  evaluate_gate() loses the config-driven 'blocked' path (blocked now only comes
  from an interactive user denial).
- tools/memory_tool.py, tools/skill_manager_tool.py: comment + behaviour follow.
- hermes_cli/config.py: memory/skills write_mode → write_approval (False);
  _config_version 28→29 with a 28→29 migration that renames any persisted
  write_mode (approve→true, on/off/unset→false) and drops the old key.
- slash commands: '/memory|/skills mode <on|off|approve>' → 'approval <on|off>'
  ('mode' kept as a back-compat alias); set_mode_fn callback now takes a bool.
- write_approval_commands.py, cli_commands_mixin.py, gateway/slash_commands.py,
  commands.py: handlers + registry args/subcommands updated.
- docs + tests rewritten for the boolean model; added migration tests.
2026-06-09 23:21:14 -07:00
Ben Barclay
e4a1b35a39
fix(config): preserve original .env file mode instead of unconditionally tightening to 0600 (#33699)
`save_env_value()` captures the original .env file mode (e.g. 0640 for Docker
volume mounts) and restores it via `os.chmod` — but then unconditionally calls
`_secure_file(env_path)` on the next line, which re-tightens the mode to 0600
and defeats the entire preservation logic. The intent (preserve when
`original_mode` is captured, secure otherwise) was already in the code but
got short-circuited.

Move `_secure_file()` into the `else` branch so it only runs when no original
mode was captured — fresh `.env` files written for the first time still get
the 0600 hardening treatment, but operator-set modes survive subsequent writes.

Salvages #31518 by @blut-agent (config.py portion only). Their PR also bundled
unrelated lowercase-lookup changes in `hermes_cli/commands.py`; this salvage
takes only the focused config fix. The commands.py changes are reasonable on
their own merits but belong in a separate PR.

Co-authored-by: blut-agent <278569635+blut-agent@users.noreply.github.com>
2026-06-10 15:42:16 +10:00
Teknium
96af61b6ef
feat(memory,skills): approve/deny gate for memory + skill writes (#38199)
Adds memory.write_mode and skills.write_mode (on|off|approve), applied to
both foreground turns and the background self-improvement review fork — the
source of the unprompted 'wrong assumption' saves users reported.

- on (default): write freely, unchanged behaviour
- off: never write; the tool returns a clean disabled result
- approve: don't commit. Memory foreground writes prompt inline (small,
  reviewable in a chat bubble); background memory writes and ALL skill writes
  stage to a pending store instead (a SKILL.md is too large to review inline,
  and a daemon thread can't block on a prompt)

Review staged writes from CLI or any messaging platform:
  /memory pending|approve|reject|mode
  /skills pending|approve|reject|diff|mode

Skill review respects the size asymmetry: inline you see a one-line gist;
the full unified diff stays out-of-band (/skills diff, dashboard, or the
staged JSON file).

New: tools/write_approval.py (gate + pending store), hermes_cli/
write_approval_commands.py (shared CLI+gateway handlers). Gates wired at the
single entry points memory_tool() and skill_manage(), using the existing
write-origin ContextVar to distinguish foreground from background_review.
2026-06-09 21:51:43 -07:00