Commit graph

589 commits

Author SHA1 Message Date
teknium1
64972b6403 fix(config): canonicalize model.name/model.model to model.default (#34500)
A custom_providers config that names the model under model.name (or
model.model) resolved to an empty model, so the API request went out
with model= — HTTP 400 from OpenAI-compatible backends. Display paths
(hermes status/dump) already read model.name and showed the model,
making the failure silent.

The model id was read via 'default or model' at ~14 independent sites
(cli, gateway, cron, curator, oneshot, fallback, profiles, ...), none
of which honored 'name'. Rather than patch every site, canonicalize at
the single load/save chokepoint: _normalize_root_model_keys() now
promotes model.model/model.name -> model.default (precedence
default > model > name) and drops the stale alias, so every reader —
present and future — sees a populated default and config.yaml is
migrated canonical on next save. The gateway, which bypasses
load_config(), replays the same normalization in _load_gateway_config().

Co-authored-by: Bartok9 <danielrpike9@gmail.com>

Credit: root-cause analysis and fix direction from @Bartok9 (#34502,
first) and @v86861062 (#34527).
2026-06-28 02:05:13 -07:00
Teknium
c9df4bc094
fix(gateway): default restart_drain_timeout to 0 to kill systemd crash loop (#54066)
A restart now interrupts in-flight agents immediately rather than holding
the gateway open for a grace window. The previous 180s default coupled two
independently-set timers: the gateway's own drain timer and systemd's
TimeoutStopSec. On a stale unit where TimeoutStopSec < drain, systemd
SIGKILLed the gateway mid-cleanup, leaving a stale lock that made the next
startup exit immediately ('already running') — an infinite crash loop under
Restart=on-failure (#31981).

Setting drain to 0 makes the mismatch structurally impossible: with drain 0
the generated unit gets TimeoutStopSec=90 against a near-instant drain, so
systemd never kills mid-cleanup. Contract: restart the gateway, in-flight
work stops. A grace window large enough to 'save' a long agent turn would
have to outlast an unbounded task, which is impossible.

Also fixes the stale-unit warning's suggested command
(hermes gateway service install --replace -> hermes gateway install --force);
the former subcommand does not exist.

Closes #31981
2026-06-28 01:14:34 -07:00
teknium1
aacc15b2c9 fix(clarify): raise default clarify_timeout to 3600s (#32762)
The 600s default evicted the gateway clarify entry while users were
still away (meeting/AFK); a later button tap then landed on a dead
entry and the agent hung on 'running: clarify'. Raise the default to
1h in DEFAULT_CONFIG and the get_clarify_timeout() code-level fallback,
documenting the running-agent-guard tradeoff. User overrides still win.
2026-06-28 01:07:53 -07:00
teknium1
c918d42d88 feat(desktop): config-driven Electron launch flags + GPU policy
Adds a desktop: section to config.yaml so headless/VM users can make
`hermes desktop` launch correctly without a wrapper command:

- desktop.electron_flags: extra Electron CLI flags (e.g. --ozone-platform=x11)
  appended to every launch. Accepts a list or a shell-split string.
- desktop.disable_gpu: auto|true|false, bridged to the HERMES_DESKTOP_DISABLE_GPU
  env var the Electron app already reads. An explicit env var still wins.

cmd_gui() reads these via _desktop_launch_options() and applies them. This is
the config.yaml form of the capability proposed as a raw env var in #38934
(@1RB) — behavioral settings belong in config.yaml, not a new HERMES_* env var.

Co-authored-by: ray <86501179+1RB@users.noreply.github.com>
2026-06-27 22:26:43 -07:00
Priyanshu Sharma
f6deabca0d fix(gateway): clear stale base_url on model switches 2026-06-27 21:23:25 -07:00
Teknium
d43e0cf304
fix(agent): config-driven intent-ack continuation for all api_modes (#27881) (#53943)
* fix(agent): config-driven intent-ack continuation for all api_modes (#27881)

The agent could end a turn after only stating intent ('I will run a health
check...') without executing the announced tool call, forcing the user to
re-prompt. A continuation guard that catches this and nudges the model to
proceed already existed but was hard-gated to the codex_responses api_mode,
so Gemini/Claude/OpenRouter turns never benefited.

- New agent.intent_ack_continuation config (default 'auto' = codex-only,
  byte-stable for existing conversations). 'true'/model-list opts every
  api_mode in; 'false' disables. Mirrors agent.tool_use_enforcement's shape.
- looks_like_codex_intermediate_ack gains require_workspace (default True).
  The opted-in path drops the codebase/filesystem requirement so general
  autonomous workflows (server ops, deploys, API calls) are caught, not just
  coding tasks. Future-ack + action-verb + short-content + no-prior-tool
  guards still apply; the 2-nudge-per-turn cap is unchanged.
- Resolution centralized in intent_ack_continuation_mode (off/codex_only/all).

* docs(infographic): intent-ack continuation (#27881)
2026-06-27 20:46:00 -07:00
Teknium
45b2e4dd6b fix(config): opt newer migrations out of default-stripping
The salvaged #27354 fix made save_config strip schema-default leaves by
default. Five migration sites added to main after the PR was authored
still called bare save_config(config) and intentionally materialize a
(often default-valued) key: model_catalog.ttl_hours, write_approval,
curator.consolidate, agent.verify_on_stop, and the suspicious-MCP-server
disable. Pass strip_defaults=False so those one-time deliberate writes
survive, matching the opt-out the PR applied to the other migrations.
2026-06-27 19:38:11 -07:00
郝鹏宇
98488c4be4 fix(config): prevent save_config from materialising schema defaults
Fixes #27354

Root cause:  called during init (or by any code path
that saves ) wrote injected schema defaults into
config.yaml as if the user had authored them.  Two fix layers:

1.  now only injects
    when the user actually set
    somewhere (root or agent).  A user who never set
    keeps it absent, so 's explicit-path
   detection won't treat it as user-authored.

2.  gains a  parameter and a
   new  pass that removes keys matching
    unless those paths were explicitly present in the
   **raw** (pre-normalization) config on disk.  Explicit-path detection
   uses  on  *before* any
   normalisation runs — preventing injected-in defaults from being
   mistaken for user-set values.

All migration and edit-config call sites pass
to preserve their intentional default-seeding behaviour.

New helpers:
-   — collects leaf-key paths from a raw dict
-    — removes keys matching schema defaults

Test coverage: 4 new regression tests (59 total, all passing).
2026-06-27 19:38:11 -07:00
Teknium
d3d621f7c3
revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853)
* Revert "fix(windows): capture is not a no-window boundary; route flashing spawns through chokepoint (#53829)"

This reverts commit 2ecca1e7d3.

* Revert "fix(windows): stop terminal-window popups from background spawns (#53810)"

This reverts commit 5db1430af9.

* Revert "fix(windows): stop subprocess console-window popups + add CI guard (#53791)"

This reverts commit ef17cd204d.
2026-06-27 15:59:00 -07:00
Teknium
ef17cd204d
fix(windows): stop subprocess console-window popups + add CI guard (#53791)
* fix(windows): stop subprocess console-window popups + add CI guard

The single biggest source of Windows 'terminal popup' bug reports was bare
subprocess.run/Popen calls spawning a console window. The compat helpers
(windows_hide_flags / windows_detach_popen_kwargs) already existed but the
footgun checker had no rule to stop new bare calls from reintroducing the flash.

- scripts/check-windows-footguns.py: new AST-based rule flagging subprocess
  calls that can create a new console — output-redirection-aware (capture/
  redirect/check_output exempt) and POSIX-only-program-aware (launchctl/
  systemctl/brew/etc. exempt). Comprehensive on real popups, no annotation
  burden on calls that can't flash.
- Swept all genuine window-spawning sites through windows_hide_flags()/
  windows_detach_popen_kwargs(); marked intentionally-visible launches
  (editor/terminal/foreground re-exec) with '# windows-footgun: ok'.
- tests/scripts/test_windows_footgun_subprocess_rule.py: behavior-contract
  tests + full-repo cleanliness invariant.
- CONTRIBUTING.md: documents the rule + the helper pattern.

* test: accept creationflags kwarg in psutil_android fake_subprocess_run

The Windows no-window sweep added creationflags=windows_hide_flags() to
install_psutil_android.py's subprocess.run call; the test's fake stub had a
fixed (cmd) signature and raised TypeError on the new kwarg.
2026-06-27 13:03:51 -07:00
ms-alan
16192103f4 fix(config): accept placeholder base_url in custom provider validation
_normalize_custom_provider_entry() ran urlparse() on base_url and dropped
any entry whose value was an un-expanded placeholder, so a caller reaching
the normalizer with raw config (e.g. the Dockerized gateway path) silently
skipped the provider with a 'not a valid URL' warning. Skip URL validation
when the candidate contains a placeholder token — both ${ENV_VAR} env-refs
and bare {region}-style templates — since those are expanded at runtime.

Closes #14457
2026-06-27 04:15:27 -07:00
Teknium
60f58a2b95
feat(verify-on-stop): default OFF, one-time migration, skip doc-only edits (#53552)
The verify-on-stop guard fired too eagerly — including on doc/markdown/skill
edits with nothing to verify, where it pushed a pointless /tmp verification
script. Three changes:

1. Default OFF for new installs: agent.verify_on_stop defaults to false
   (was the "auto" surface-aware sentinel). _config_version bumped 30 -> 31.
2. One-time migration (v30 -> v31): existing installs are switched off once,
   but only when the value is missing or still the "auto" sentinel — an
   explicit true/false the user set is preserved.
3. Path filter: build_verify_on_stop_nudge() now drops documentation/prose
   paths (.md/.mdx/.rst/.txt/LICENSE/CHANGELOG/...) so even when explicitly
   enabled, a doc-only turn never nudges. Mixed doc+code turns still nudge on
   the code paths.

The legacy "auto" sentinel is still honored when set explicitly (ON for
interactive coding surfaces, OFF for messaging). HERMES_VERIFY_ON_STOP env
override unchanged.
2026-06-27 03:23:22 -07:00
kyssta-exe
c0568ca95f fix(config): use read_raw_config() in migrations to prevent expanding defaults (#40821) 2026-06-26 22:40:52 +05:30
kshitij
1aa458a1e6
Merge pull request #52920 from NousResearch/salvage/38798-toolset-validation
fix(config): surface invalid platform_toolsets instead of silently dropping tools (#38798)
2026-06-26 14:14:55 +05:30
lEWFkRAD
41ede84b93 fix(config): surface invalid platform_toolsets instead of silently dropping tools (#38798)
A config migration (or hand-edit) that leaves an invalid toolset name in
`platform_toolsets` — e.g. the #38798 corruption that rewrote `hermes-cli` to
the non-existent `hermes` — silently disabled all affected tools:
resolve_toolset() returns [] for an unknown name, so the agent quietly lost its
tools with no error, warning, or log entry and degraded to text-only replies.

Surface it loudly at two points:
- After migration (migrate_config): validate platform_toolsets and record/print
  a warning per unknown name, with a `hermes-<platform>` suggestion when that
  would have been valid (the exact #38798 shape).
- At runtime (_get_platform_tools): if a platform was explicitly configured but
  every toolset name is invalid, log a warning when tools are resolved for a
  session — so an ALREADY-corrupted config is caught at startup, not only on the
  next `hermes update`.

Logic lives in a new pure, side-effect-free helper (toolset_validation.py) with
validate_toolset injected, so it is unit-testable without the tool registry.

Note: the original v25→v26 migration that caused the corruption no longer
exists (config format is now v30; no migration step rewrites toolset names).
This change is the durable defense against the silent-failure mode regardless
of cause, matching the issue's "Expected: log a warning".

Salvaged from #39207 by @lEWFkRAD (authorship preserved via cherry-pick).
Tests: 9 helper cases (incl. the #38798 corruption shape, mixed valid/invalid,
zero-tools state, non-dict/scalar/non-string) + a runtime caplog test — both the
helper warning and the runtime guard mutation-verified to fail without the fix.

Closes #38798. Supersedes #39581 (prevent-in-v25→v26 — that path is gone),
#41006 / #40208 (repair-migration for already-corrupted configs).
2026-06-26 14:07:43 +05:30
Ben
2e322466b1 feat(dashboard-auth): drain shared-bearer-secret provider plugin
Task 2.0b: the concrete shared-bearer-secret auth provider, the FIRST consumer
of the generic token-auth capability (Task 2.0a). Implements decisions.md Q-A.

plugins/dashboard_auth/drain/ (bundled, discovered like dashboard_auth/basic):
- DrainSecretProvider: non-interactive provider, supports_token=True. Verifies
  an inbound Authorization bearer token against a per-agent shared secret with
  hmac.compare_digest (constant-time, no timing oracle) and, on a match,
  vouches for the caller as the "drain-control" principal scoped to "drain".
  The five interactive ABC methods raise NotImplementedError; verify_session
  returns None (stacks harmlessly in the cookie-verify loop).
- assess_secret_strength(): fail-closed entropy gate. Rejects secrets shorter
  than 43 url-safe-b64 chars (~256 bits), with < 16 distinct characters, or
  below 128 bits Shannon entropy — so a weak/structured/repeated secret can
  never be silently accepted. Enforced both at register() (friendly skip
  reason) and in __init__ (raises — defence in depth).
- register(ctx): no-op + skip reason when HERMES_DASHBOARD_DRAIN_SECRET is
  unset; rejects a weak secret fail-closed (drain endpoint stays gated). On a
  strong secret, registers the provider AND opts /api/gateway/drain into the
  generic token-auth seam via register_token_route().

Config: the secret is a CREDENTIAL → carried via HERMES_DASHBOARD_DRAIN_SECRET
(per-agent, provisioned by NAS at deploy). Behavioural knobs only
(dashboard.drain_auth.{scope,min_secret_chars}) live in config.yaml — added to
DEFAULT_CONFIG with the .env-is-for-secrets rationale documented inline.

Tests: tests/plugins/dashboard_auth/test_drain_provider.py — entropy gate
(strong pass; empty/short/repeated/few-distinct/custom-min reject), verify_token
(match → scoped principal, wrong/empty → None, custom scope), protocol
compliance, interactive-methods-raise, and register() (skip-no-secret,
fail-closed-weak-secret, strong-env-secret registers + route opt-in, config
scope + min_secret_chars). 21 new tests; drain + token-auth suites 44 passed.
Verified the plugin is discovered as dashboard_auth/drain alongside basic/nous.

Intentionally deferred:
- The begin/cancel-drain endpoint handler itself — Task 2.1.
- The dashboard→gateway control channel — Task 2.2.

Build status: dashboard-auth + drain-plugin suites green.
2026-06-26 00:47:19 -07:00
brooklyn!
a2b49e60b6
Merge pull request #52412 from GodsBoy/fix/verify-on-stop-messaging-surface-leak
fix(agent): gate verify-on-stop nudge off for messaging surfaces
2026-06-26 02:30:08 -05:00
Teknium
208f0d7c3b
fix(update): default pre-update backup to off (#52729)
The pre-update HERMES_HOME zip shipped on by default (DEFAULT_CONFIG +
runtime fallback both True), so every `hermes update` zipped the entire
~/.hermes — sessions DB, caches, skills — adding minutes to each update.
The shipped cli-config.yaml.example, the --backup help, and the example
config all already said "off by default," so the live default
contradicted its own documentation.

Flip the default to off everywhere: DEFAULT_CONFIG, the runtime
`.get(..., False)` fallback in _run_pre_update_backup, and the stale
--backup help string. Users who want the #48200 safety net opt in via
updates.pre_update_backup: true or --backup for a single run.

Updated test_default_enabled_creates_backup -> test_default_disabled_is_silent
to assert the new default (silent no-op, no zip).
2026-06-25 16:01:09 -07:00
kshitij
e4ff494860
fix(cron): add default retention to per-run job output (#52383) (#52646)
* fix(cron): add default retention to per-run job output to bound disk usage (#52383)

Per-run cron output (cron/output/<job>/<timestamp>.md) is written once
per execution and was never pruned, so a frequently-scheduled job on
a long-running deploy accumulates one file per run indefinitely and
can fill the volume ('no space left on device').

save_job_output() now keeps the most recent N output files per job and
removes older ones. N defaults to 50 and is configurable via
cron.output_retention; a non-positive value disables pruning for
operators who manage cleanup externally.

Salvaged from #52402 by @0xDevNinja.

Closes #52383

* fix(config): add cron.output_retention to DEFAULT_CONFIG

Follow-up to #52383: the retention config key was functional via
get()-with-default but missing from DEFAULT_CONFIG, so the deep-merge
wouldn't auto-populate it for new installs. Add it explicitly.

---------

Co-authored-by: 0xDevNinja <manmit0x@gmail.com>
2026-06-25 16:00:13 -07:00
Teknium
c6575df927
feat(moa): expose MoA presets as selectable virtual models (#46081)
* feat(moa): expose MoA presets as selectable virtual models

Reconstructed onto current main (PR #46081's base had diverged with no common
ancestor, marking the PR dirty so CI never dispatched). MoA is now a virtual
provider: each named preset is a selectable model under provider 'moa', and the
preset's aggregator is the acting model that answers and calls tools.

Reference models fan out in parallel via a bounded ThreadPoolExecutor (the same
batch pattern delegate_task uses) — all references dispatched at once, collected
when every one finishes, then handed to the aggregator. Output order is
preserved, failures and the MoA-recursion guard stay isolated per reference.

- Removed the old mixture_of_agents model tool and moa toolset.
- Added moa as a virtual provider in the provider/model inventory.
- /moa is shortcut behavior over model selection (default preset / named preset
  / one-shot prompt).
- Dashboard + Desktop manage named presets; presets appear in model pickers.
- Parallel reference fan-out in agent/moa_loop.py with regression test.

* fix(moa): thread moa_config through _run_agent to _run_agent_inner

The reconstructed gateway MoA wiring declared moa_config on _run_agent (the
profile-scoping wrapper) and used it inside _run_agent_inner, but the wrapper
never forwarded it — _run_agent_inner had no such parameter, so the runtime hit
NameError: name 'moa_config' is not defined on the compression-failure session
sync path. Add moa_config to _run_agent_inner's signature and forward it from
both wrapper call sites (multiplex and non-multiplex). Caught by
tests/gateway/test_compression_failure_session_sync.py on CI shard test(4).

* fix(moa): classify moa as a virtual provider in the catalog

The moa virtual provider has no PROVIDER_REGISTRY/ProviderProfile entry, so
provider_catalog() fell through to the default auth_type="api_key" with no
env vars — tripping two catalog invariants:
  - test_provider_catalog: api_key providers must expose a credential env var
  - test_provider_parity: every hermes-model provider must be desktop-configurable

moa already declares auth_type="virtual" in HERMES_OVERLAYS; consult that
overlay as an auth_type fallback so the catalog reports moa as virtual (no real
credential, no network endpoint). Exempt virtual providers from the desktop
parity union check the same way 'custom' is exempt — derived from the catalog,
not a hardcoded slug, so future virtual providers are covered too.
2026-06-25 13:52:06 -07:00
kshitij
ca714f6189
Merge pull request #52653 from kshitijk4poor/salvage/33814-env-quote-hash
fix(config): quote .env values containing # to prevent token truncation (#30355)
2026-06-26 01:32:49 +05:30
kshitijk4poor
2107b86024 feat(compression): flip in_place default to True (#38763) [2/2]
In-place compaction (single durable session id, non-destructive soft-archive)
becomes the default. Rotation is now the opt-out fallback via
compression.in_place: false.

Prerequisite: #50098 (hygiene guard reads result flag not config flag) merged
first — without it, flipping the default causes permanent transcript loss on
gateway hygiene-compress and /compress when no session_db is available.

Blast radius (empirically measured on current main): 7 rotation-asserting
tests broke and are pinned to in_place=False in the companion test commit:
- tests/agent/test_compression_concurrent_fork.py (2)
- tests/agent/test_compression_logging_session_context.py (1)
- tests/agent/test_compression_rotation_state.py (1)
- tests/run_agent/test_compression_boundary_hook.py (2 _make_agent helpers)
- tests/gateway/test_compression_concurrent_sessions.py (2)
Rotation stays as a working fallback and deserves continued coverage.

Plan: .hermes/plans/in-place-compaction-38763.md
2026-06-25 12:56:05 -07:00
sweetcornna
150afea942 fix(config): quote env values containing hash 2026-06-26 00:54:34 +05:30
Brooklyn Nicholson
c4c590e4a1 perf(desktop): make session switching fast under load
Switching sessions in the desktop app could freeze the whole UI for
several seconds on heavy, tool-rich chats. Root causes and fixes:

- Cold `session.resume` built the AIAgent (MCP discovery, prompt/skill
  build) *before* returning, and the desktop awaits that RPC before it
  paints — so the entire switch blocked on the build. Add an opt-in
  `defer_build` resume path (the contract `session.create` already uses):
  return the full display transcript immediately, register an upgradable
  live session, and pre-warm the agent on a short timer. The persisted
  runtime identity (model/provider/base_url/api_mode/reasoning/tier) is
  restored on the deferred build so it can't drop the provider.

- Nothing bounded how many in-memory agents accumulate; a user who
  reconnects often piled up detached sessions for the full 6h TTL. Add a
  soft LRU cap (`max_live_sessions`, default 16) that evicts the
  least-recently-active DETACHED sessions (no live client) — never a
  running, awaiting-input, mid-build, or live-transport one. Reopening
  re-resumes from disk.

- On the prefetch-hit cold-resume path, skip rebuilding a throwaway
  merged-message array (and its 1000-entry Map) when the prefetch already
  painted the exact transcript; the downstream sameMessageList guard
  already drops the publish, so it was pure main-thread cost.

The desktop opts into `defer_build` for every non-watch cold resume; the
eager path stays for CLI/TUI and existing callers.
2026-06-25 14:03:03 -05:00
GodsBoy
f168631be0 fix(agent): gate verify-on-stop nudge off for messaging surfaces
The verify-on-stop guard (PRs #52296, #52297) defaulted ON for every
session, so on gateway messaging surfaces (Telegram, Discord, etc.) the
model complied with the nudge by writing a hermes-verify temp script and
emitting an ad-hoc verification summary, which the gateway delivered to
the end user as chat noise.

Resolve a surface-aware default instead. The DEFAULT_CONFIG value becomes
the sentinel "auto", which verify_on_stop_enabled() resolves to ON for
interactive coding surfaces (CLI, TUI, desktop) and programmatic callers,
and OFF for conversational messaging surfaces. The surface is read from
HERMES_SESSION_PLATFORM (what the gateway actually binds), with
HERMES_SESSION_SOURCE and HERMES_PLATFORM as fallbacks, matching the
sibling resolution in skill_commands.py and prompt_builder.py. An explicit
HERMES_VERIFY_ON_STOP env var or a boolean agent.verify_on_stop config
still overrides in either direction.

The passive evidence ledger and the call site are untouched.
2026-06-25 10:05:04 +02:00
brooklyn!
d473e5d07a
Merge pull request #52296 from NousResearch/bb/verify-stop-loop
Add verification stop loop
2026-06-24 23:10:03 -05:00
Brooklyn Nicholson
2f1a47b90e feat(agent): require verification before finishing edits
Make verification closure the default coding behavior after landed file edits while keeping bounded retries and config/env switches for users who need to disable it.
2026-06-24 23:02:48 -05:00
Victor Kyriazakos
b693bee100 feat(cron): thread-preferred continuable delivery (open a thread, mirror DM fallback)
Continuable cron jobs (attach_to_session / cron.mirror_delivery, default
OFF) now prefer a dedicated thread on thread-capable platforms, falling
back to origin-DM mirroring where threads don't exist.

- Thread-capable (Telegram topics, Discord/Slack threads): open a fresh
  thread for the job via the shipped adapter.create_handoff_thread,
  route the brief into it, and seed the thread-keyed session so the
  user's in-thread reply continues with full context. This is the
  'continuable cron opens its own thread' interface.
- DM-only (WhatsApp/Signal/SMS): create_handoff_thread returns None ->
  fall back to mirroring into the origin DM session (existing behaviour).

Reuses existing infrastructure end-to-end — no new adapter surface, no
provider-chain signature change:
- adapter.create_handoff_thread (already implemented per-platform,
  returns None on unsupported platforms = the fallback signal)
- the live SessionStore via adapter._session_store (already set on every
  adapter), reached without threading a new param through the frozen
  CronScheduler.start() contract
- gateway.mirror.mirror_to_session for the seed/append
- existing per-target delivery routing carries the new thread_id for free

Mirrors GatewayRunner._process_handoff's open-thread-or-fallback +
seed pattern, standalone for the cron delivery path. thread_seeded
guards against a double-mirror after seeding. Scoped to the origin
target only; fan-out/broadcast targets are never threaded or mirrored.

Config docs updated (cron.mirror_delivery) + cronjob tool
attach_to_session description reframed around continuable/thread-preferred.

Tests: +5 (thread id returned on thread platform; None on DM platform;
None without capability/loop; seed creates thread session + mirrors;
seed no-op on empty). 22/22 in TestCronDeliveryMirror; 532 cron tests
pass (4 failures pre-existing: croniter-not-installed + TZ).
2026-06-24 20:27:05 -07:00
Victor Kyriazakos
1b181724fa feat(cron): optional mirror of cron delivery into target chat session
Adds an opt-in path so a cron job's delivered output is also appended to
the TARGET chat's gateway session transcript (as an assistant turn), so a
user reply to a recurring delivery (daily brief, reminder) is answered with
the delivery in context instead of 'what is that?' amnesia.

- Reuses the shipped gateway.mirror.mirror_to_session — the same primitive
  interactive send_message mirroring already uses. No messaging-toolset
  change (cron still can't call send_message; this rides delivery).
- Gated: per-job attach_to_session overrides global cron.mirror_delivery
  (config.yaml). Default OFF — historical isolation preserved byte-for-byte.
- Mirrors the CLEAN agent output, not the cron header/footer wrapper.
- Alternation/cache-safe: append lands at a turn boundary, never mid-loop,
  never mutates the cached system prompt. Cold-start (no target session)
  is a silent no-op; mirror errors never fail a successful delivery.
- Surfaced on the cronjob tool (attach_to_session) + config schema.

Driven by enterprise cron-as-control-plane use case. 10 new tests; full
cron + cronjob-tool suites pass (600).
2026-06-24 20:27:05 -07:00
Teknium
411faf08bd
fix(soul): installers seed the real default persona, upgrade legacy empty templates (#52246)
The desktop bootstrap (and curl/PowerShell/docker installs) seeded
~/.hermes/SOUL.md with a comment-only scaffold that contained no persona
text. That shadowed the runtime default (_ensure_default_soul_md ->
DEFAULT_SOUL_MD), since seeding is guarded by 'if SOUL.md doesn't exist'.
Result: every fresh installer install got the empty template instead of
the documented Hermes persona; desktop just made it visible in onboarding.

- install.sh / install.ps1 / docker/SOUL.md now write DEFAULT_SOUL_MD.
- _ensure_default_soul_md() upgrades a SOUL.md still matching the known
  legacy scaffold in place; customized files (any deviation, incl. a
  persona appended below the comment) are never touched.
- Detection normalizes CRLF/BOM so Windows-installer drift still matches.
2026-06-24 18:56:26 -07:00
Ben
d1cac0e5ef feat(gateway): scale-to-zero idle detection + dormant-quiesce (Phase 0)
The gateway-side BEHAVIOUR layer that consumes the relay scale-to-zero
primitives (gateway-gateway Phase 5): the gateway decides it is idle and
drives the relay transport dormant so the platform (Fly autostop:"suspend")
can suspend the now-traffic-idle machine, which wakes on the connector's
wakeUrl poke (decisions.md Q3=C', D1-D13).

- gateway/scale_to_zero.py: pure helpers — scale_to_zero_enabled (the NAS
  Labs HERMES_SCALE_TO_ZERO stamp, D11/Q8=A), parse_idle_timeout_seconds
  (config.yaml gateway.scale_to_zero.idle_timeout_minutes, D2),
  messaging_is_relay_only_or_absent (F6/D1), should_arm (D1/D11/§3.4(1)),
  is_idle (D2/D3/F7).
- gateway/run.py: _last_inbound_at clock stamped on user inbound in
  _handle_message (F13); the arm-gate + idle predicate + the
  _scale_to_zero_watcher dormant sequence (mark draining -> adapter
  go_dormant() -> cooldown), started only when armed. Deliberately NOT the
  stop path and NOT mark_resume_pending (F12/D13).
- tools/process_registry.py: has_any_active() for the bg-work guard (D3/F7).
- hermes_cli/config.py: gateway.scale_to_zero.idle_timeout_minutes default 5.

Tests: 38 pure-logic + 6 watcher (incl. bg-work regression guard proven RED).
Full relay + scale-to-zero suites: 184 passed. The 20 unrelated failures in
the broader run are PRE-EXISTING on origin/main (custom-provider/tools tests),
confirmed via a pristine baseline worktree.
2026-06-24 18:47:18 -07:00
helix4u
17beb55e3c fix(telegram): gate rich draft previews separately 2026-06-24 18:11:14 -07:00
uperLu
0d4cecb352 fix(cron): avoid provider package shadowing core cron 2026-06-23 23:39:22 -07:00
Brooklyn Nicholson
e495b33bf1 Merge remote-tracking branch 'origin/main' into bb/pets-merge
# Conflicts:
#	hermes_cli/commands.py
#	tui_gateway/server.py
2026-06-23 19:05:22 -05:00
Teknium
6cc07b6cd0
feat(discord): render reasoning as -# subtext via display.reasoning_style (#51168)
Adds a per-platform display.reasoning_style setting (code | blockquote |
subtext) controlling how the show_reasoning summary renders on the gateway.
Discord defaults to "subtext" (-# small grey metadata text); every other
platform keeps the fenced code block. Resolves through the existing
display.platforms.<platform>.reasoning_style override chain.
2026-06-23 10:44:02 -07:00
Teknium
87c4a5ebb8
feat(background-review): aux-model selector for the self-improvement review (#49252)
Adds auxiliary.background_review.{provider,model} (default auto = main chat
model — unchanged). Set it to a different, cheaper model and the post-turn
self-improvement review runs there for ~3-5x lower cost.

Cache-aware by design: the main chat is warm in the prompt cache, so the
default full-history replay on the main model is cheap cache reads — left
exactly as-is. A different model can't reuse that cache (different key), so
when (and only when) routed to a different model the fork replays a compact
digest instead of the full transcript, minimising what it cold-writes on the
aux model. Same model -> full replay; different model -> digest.

Quality holds in benchmarks: memory capture identical, skill near-identical.
Nothing changes unless you opt in by naming a different model.

Co-authored-by: Hermes Agent <noreply@nousresearch.com>
2026-06-22 14:54:53 -07:00
Teknium
f1e6d39a74
feat(computer_use): disable cua-driver telemetry by default, add opt-in (#50842)
* feat(computer_use): disable cua-driver telemetry by default, add opt-in

cua-driver ships anonymous PostHog usage telemetry ENABLED by default
upstream (fires cua_driver_install / cua_driver_doctor events to
eu.i.posthog.com). Hermes now disables it for our users unless they
explicitly opt in.

- New config key `computer_use.cua_telemetry` (default false) in
  DEFAULT_CONFIG.
- `cua_backend.cua_driver_child_env()` injects
  `CUA_DRIVER_RS_TELEMETRY_ENABLED=0` into the child env when telemetry is
  disabled (the default); leaves the var untouched on opt-in so the driver
  uses its own default. Reads config fail-safe — any error defaults to
  telemetry off.
- Routed every cua-driver spawn site through the policy: MCP backend
  (StdioServerParameters env), `cua_driver_update_check`, doctor's
  health_report Popen, the install.sh/install.ps1 runner, and the
  `--version` / status probes.
- Docs: new Telemetry subsection in computer-use.md (EN).
- Tests: tests/computer_use/test_cua_telemetry.py — default disables,
  explicit-false disables, opt-in leaves var untouched, config-failure
  fails safe, inherited-enabled is overridden off.

Verified live on Linux against the real cua-driver-rs 0.6.0 binary: with
the var=0 the driver reports "telemetry: disabled via
CUA_DRIVER_RS_TELEMETRY_ENABLED" and sends no event; with it unset it logs
"sending event: cua_driver_doctor". 213 computer_use + install tests green.

* fix(dashboard): fold computer_use config category into agent tab

The new computer_use.cua_telemetry key created a single-field dashboard
config category, tripping test_no_single_field_categories (web_server's
invariant that categories with <2 fields must be merged to avoid tab
sprawl). Add computer_use -> agent to _CATEGORY_MERGE, matching the
existing onboarding/telegram single-field folds.
2026-06-22 09:57:16 -07:00
Brooklyn Nicholson
5342eccf12 Merge remote-tracking branch 'origin/main' into bb/pets 2026-06-22 05:25:49 -05:00
Shannon Sands
5dae502b86 Address email pairing review feedback 2026-06-21 22:43:57 -07:00
teknium1
4314d451ca fix(gateway): accept any inbound file type across all messaging platforms
Authorization to message the agent is the gate, not the file extension.
Previously the inbound-attachment allowlist (SUPPORTED_DOCUMENT_TYPES) was
opt-OUT on Discord (allow_any_attachment defaulted false) and had no bypass
at all on Telegram/Slack — so an .html (or any non-allowlisted type) was
dropped or hard-rejected before the agent saw it.

Now every authorized upload is cached and surfaced to the agent regardless
of type:
- base.cache_media_bytes(): unknown types cache as octet-stream (or the
  caller-supplied MIME) instead of returning None — fixes the chokepoint
  that Teams/Telegram-media route through.
- discord/telegram/slack adapters: removed the allowlist reject/skip; any
  non-media attachment is typed DOCUMENT and cached. Known types keep their
  precise MIME.
- Text inlining now gates on a shared _TEXT_INJECT_EXTENSIONS set (text +
  code + config + markup) instead of a blind UTF-8 decode, so binary formats
  (PDF/zip/docx) with ASCII headers are never inlined.
- gateway/run.py emits the path-pointing context note for every DOCUMENT,
  including non text/application MIME types.
- discord.allow_any_attachment is now a documented no-op kept for config
  back-compat.

Validation: 357 gateway tests pass; E2E confirms .html/.bin/custom types
cache, known types stay precise, PDFs are not inlined.
2026-06-21 22:43:45 -07:00
Teknium
95d53c3bcb
feat(cli): /reasoning full — show complete thinking, not 10-line clamp (#50499)
* feat(cli): /reasoning full to show complete thinking, not 10-line clamp

The post-response Reasoning recap box hard-clamped long thinking to the
first 10 lines, so there was no way to see the full reasoning trace after
a turn (live streaming already shows it in full). Add display.reasoning_full
(default off) plus /reasoning full|clamp to toggle it at runtime; the clamp
truncation note now points at the command. Addresses repeated user requests
to show all thinking tokens.

* test(gateway): de-snapshot /reasoning help assertion

The test froze the exact args-hint literal '/reasoning [level|show|hide]',
which the new full/clamp args change to '[level|show|hide|full|clamp]'.
Convert to an invariant: assert /reasoning is in help and carries its core
args, not the exact hint string.

* feat(tui): /reasoning full|clamp parity in tui_gateway

The classic-CLI reasoning_full toggle had no TUI equivalent — typing
/reasoning full in the TUI fell through to parse_reasoning_effort and
errored. The TUI renders thinking as an expand/collapse section (no fixed
10-line recap), so map full -> sections.thinking=expanded (raw, uncapped
via thinkingPreview mode='full') and clamp -> collapsed, persisting
display.reasoning_full for cross-surface config consistency.
2026-06-21 20:21:11 -07:00
Teknium
7130d60861
feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492)
* feat(providers): remove google-gemini-cli + google-antigravity OAuth providers

Google now actively bans accounts for third-party tools that piggyback on
Gemini CLI / Antigravity / Code Assist OAuth, and because abuse prevention
sits at a backend layer the ban can extend to the entire Google account
(Gmail/Drive), with a second violation being permanent.
Ref: https://github.com/google-gemini/gemini-cli/discussions/20632

Removes both OAuth inference providers entirely (modules, provider profiles,
auth/runtime/config/models wiring, the /gquota Code Assist quota command,
the antigravity-cli optional skill, desktop + docs surface in en + zh-Hans).
The API-key 'gemini' provider (GOOGLE_API_KEY/GEMINI_API_KEY against
generativelanguage.googleapis.com) is unaffected and stays fully supported.

* fix(skills): keep the antigravity-cli skill — only the OAuth provider is removed

The antigravity-cli optional skill orchestrates the external `agy` binary as
a coding-agent tool via the terminal tool — it does NOT wrap Hermes inference
through the banned google-antigravity OAuth provider, so it carries none of
the account-ban risk that motivated removing that provider. Restore the skill,
its docs page, the sidebar entry, and the optional-skills catalog row. The
google-antigravity / google-gemini-cli inference providers stay fully removed.
2026-06-21 19:53:27 -07:00
teknium1
8cecaf0b29 feat(process): escalate SIGTERM->SIGKILL on host-pid termination after grace
A daemon that ignores or stalls in its SIGTERM handler currently survives the
process-registry reap and leaks until reboot (observed as agent-browser
daemons accumulating to EMFILE on long-running gateways). _terminate_host_pid
now snapshots the tree, SIGTERMs it, waits a bounded grace window
(terminal.daemon_term_grace_seconds, default 2.0s, 0 disables), then SIGKILLs
any survivor. The recycled-PID identity guard still gates the whole path, so
escalation never reaches a stranger; Windows is unchanged (taskkill /F is
already a hard kill).

Config lives in config.yaml (terminal.daemon_term_grace_seconds), NOT an env
var, per the .env-secrets-only policy.

Implements the SIGKILL-escalation idea from @tkwong's #15008, reworked onto the
current _terminate_host_pid tree-kill path (the original predated it) and
config-gated instead of env-var-gated.

Co-authored-by: Benjamin Wong <tkwong@inspiresynergy.com>
2026-06-21 19:08:52 -07:00
pmos69
8baa4e9976 feat(cli): add native Antigravity OAuth provider 2026-06-21 16:41:30 -07:00
Teknium
824c9d3812
fix(config): alias model.api_base -> model.base_url for custom providers (#50385)
A bare custom provider configured via `model.api_base` (the intuitive name
OpenAI-SDK / LiteLLM users reach for) was silently ignored: `hermes config set`
accepts any dotted key, so `model.api_base` got written and confirmed, but the
runtime resolver reads only `model.base_url`. Requests fell back to OpenRouter
with an empty key -> 401, zero hits to the custom endpoint (issue #8919).

Now api_base is migrated to base_url at load time (fixes existing broken
configs) and at set time (with a notice), never overriding an explicit
base_url. Closes #8919.
2026-06-21 13:33:41 -07:00
kn8-codes
6183e8ce1b fix(telegram): make Bot API 10.1 rich messages opt-in (default off)
Rich messages are not ready for primetime: current Telegram clients can
render Bot API 10.1 rich messages as blank/unsupported bubbles and make
them hard to copy as plain text, which is worse than the legacy
MarkdownV2 path for command snippets and mobile handoffs. Default the
rich_messages toggle to False so replies stay on the copyable legacy
path; users opt in per bot via platforms.telegram.extra.rich_messages:
true. Updates adapter, gateway config default, example config, English +
zh-Hans docs, and the default/opt-in tests.
2026-06-21 12:03:24 -07:00
sgaofen
93ea9b04af fix(gateway): cap inbound media download size to prevent memory exhaustion
Inbound image/audio/video payloads were buffered fully into process memory
before being written to the cache, with no size limit. A large upload
(Discord Nitro allows 500 MB) or a remote media URL in an inbound message
pointing at a huge file could spike RAM and OOM-kill the gateway.

Enforce a configurable cap in the shared cache helpers (gateway/platforms/
base.py) so the protection holds across every platform adapter, not one:

- cache_image/audio/video_from_bytes reject oversized payloads before writing
  (video was the gap in the original report — now covered).
- cache_image/audio_from_url stream the body, rejecting on an oversized
  Content-Length header and re-checking the running total per chunk so an
  absent/lying header can't smuggle an unbounded body past the cap.
- Discord's _read_attachment_bytes checks att.size up front, so an oversized
  attachment is rejected before any bytes are pulled into memory.

Configurable via gateway.max_inbound_media_bytes in config.yaml (default
128 MiB; 0 disables). No new env var — non-secret config lives in config.yaml.

Salvaged and extended from @sgaofen's PR #13341 (the original report and the
shared-helper approach). Reapplied onto current main (Discord adapter has
since moved to plugins/platforms/discord/), the configurable knob moved from
an env var to config.yaml, and the video cache helper added.

Co-authored-by: Hermes Agent <noreply@nousresearch.com>
2026-06-21 11:56:46 -07:00
Teknium
a18bae65b9
fix(config): redact api_key in config show/set output (#50245) (#50313)
hermes config show printed the model dict raw via print(), bypassing the
logging redactor; a custom-provider api_key (e.g. Cloudflare cfut_...) was
shown in plaintext even with security.redact_secrets=true. Opaque tokens
don't match any vendor-prefix regex, so structural key-name masking is
required.

- Add redact_config_value(): recursively masks credential-shaped keys
  (api_key/token/secret/... exact-match) via mask_secret.
- Wrap the show_config model dump in it.
- Mask the set_config_value echo when the leaf key is credential-shaped
  (config set model.api_key routes to config.yaml, lowercase misses the
  .env allowlist).
2026-06-21 11:50:31 -07:00
Teknium
03563dabac
fix(gateway): raise session-hygiene hard message limit 400 → 5000 (#50194)
The gateway pre-compression hygiene valve force-compressed any session
crossing 400 messages regardless of token usage. On large-context (1M+)
models doing many short, message-dense turns, a healthy session at ~16%
token usage could hit 400 messages and get force-compressed — and the
compression summary's stale Active Task could then bleed into the next
turn.

The valve's actual purpose is to break a death spiral: when API calls
keep disconnecting on an oversized session, no token-usage data arrives,
the token threshold never fires, and the transcript grows unbounded.
It's a count-based floor for that pathological case only. 400 was tuned
for ~200K-context models and is far too low for modern large-context
sessions. Raise the default to 5000 — still well clear of any death
spiral, but no longer firing on legitimate long conversations.

The value remains fully configurable via compression.hygiene_hard_message_limit.
2026-06-21 08:26:19 -07:00
Teknium
e499d69e3e
feat(api-server): configurable concurrent-run cap to prevent DoS (#50007)
The OpenAI-compatible API server only enforced a hardcoded cap of 10
concurrent runs on /v1/runs, leaving /v1/chat/completions and
/v1/responses unbounded — a request flood could exhaust CPU, memory,
and upstream LLM quota (#7483).

- Add gateway.api_server.max_concurrent_runs (config.yaml, default 10,
  0 disables). No env var.
- Shared concurrency gate across all three agent-serving endpoints,
  counting both the chat/responses in-flight counter and the /v1/runs
  stream set. Returns OpenAI-style 429 + Retry-After when at the cap.
- Remove the dead hardcoded _MAX_CONCURRENT_RUNS class attribute.

Closes #7483.
2026-06-21 07:26:03 -07:00