PR #22410 added three-mode Telegram topic routing to the live message path
(TelegramAdapter.send via the gateway DeliveryRouter), but the cron delivery
path never got it. cron/scheduler.py::_deliver_result sent through the live
adapter with a bare ``{"thread_id": ...}`` and fell back to the standalone
_send_telegram, neither of which addresses Bot API Direct Messages topics
correctly. After Bot API 10.0 (2026-05-08), sending to a private chat with a
bare ``message_thread_id`` is rejected/mis-routed, so cron deliveries to a
private DM topic landed in the General topic instead of the requested lane.
Fix: the cron live-adapter branch now routes the text send through the
gateway's ``DeliveryRouter._deliver_to_platform`` — the same canonical path
live messages use — so it inherits all three Telegram routing modes:
1. Forum/supergroup (negative chat_id) -> message_thread_id
2. Bot API DM topics (private chat_id + numeric topic id) ->
direct_messages_topic_id (the case #22773 reported)
3. Hermes-created named private DM-topic lanes -> ensure_dm_topic +
reply anchor
For mode 2, a private-chat target with a numeric topic id is passed as
``direct_messages_topic_id`` metadata (verified end-to-end:
TelegramAdapter._thread_kwargs_for_send turns it into
``{message_thread_id: None, direct_messages_topic_id: <int>}``), instead of a
bare message_thread_id. Forum/supergroup and home-channel deliveries are
unchanged. The standalone fallback (gateway down) is preserved.
No new config knob and no duplicated routing logic — this reuses the existing
DeliveryRouter rather than reimplementing topic routing in the cron path.
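For illustration only, modes 1 and 2 can be sketched as a small pure function. The name `thread_kwargs` and its signature are hypothetical and are not the real `TelegramAdapter._thread_kwargs_for_send`; mode 3 (named DM-topic lanes) is left out because it needs the adapter's `ensure_dm_topic` state.

```python
from typing import Dict, Optional


def thread_kwargs(chat_id: int, topic_id: Optional[int]) -> Dict[str, Optional[int]]:
    """Hypothetical helper: pick Telegram send kwargs for a topic target."""
    if topic_id is None:
        # No topic requested: send to the chat root.
        return {}
    if chat_id < 0:
        # Mode 1: forum/supergroup chats have negative chat ids.
        return {"message_thread_id": topic_id}
    # Mode 2: private chat plus numeric topic id.
    return {"message_thread_id": None, "direct_messages_topic_id": topic_id}
```

A private chat id with a topic therefore never produces a bare `message_thread_id`, which is the case #22773 reported.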
Salvaged from #42051 (stepanov1975) and #23249 (devsart95), which both
diagnosed the missing three-mode routing in the cron/standalone path;
reimplemented onto the canonical DeliveryRouter that landed since those PRs
were opened.
Co-authored-by: Alex <9785479+stepanov1975@users.noreply.github.com>
Co-authored-by: devsart95 <devsart95@gmail.com>
The in-process cron ticker (cron/scheduler_provider.py) caught only
`Exception` and logged at DEBUG, so a `SystemExit`/`KeyboardInterrupt`
raised from a misbehaving provider SDK or agent retry path killed the
ticker thread silently. The gateway PROCESS stayed up, so `hermes cron
status` — which only checks `find_gateway_pids()` — kept reporting
"✓ jobs will fire automatically" while no jobs ever fired (#32612,
#32895).
This makes ticker death survivable and detectable:
- The ticker loop now catches `BaseException` and logs at ERROR with a
traceback, so a single bad tick no longer tears the thread down and
the failure is visible in the gateway log.
- The loop records a heartbeat (`cron/ticker_heartbeat`, epoch seconds)
on startup and after every tick — best-effort, never raised into the
loop. Both ticker entry points (the gateway and the desktop fallback
in web_server.py) funnel through `InProcessCronScheduler.start`, so one
heartbeat site covers both.
- `hermes cron status` now reads the heartbeat age: if the gateway is
running but the heartbeat is stale (> 200s, i.e. several missed ~60s
ticks), it reports the ticker as STALLED and suggests a restart instead
of falsely claiming jobs will fire. A missing heartbeat (older build /
never ran) is treated as "unknown", not "dead".
Adds tests for BaseException survival, per-iteration heartbeat recording,
heartbeat round-trip/age, staleness detection, and silent-write-failure.
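A minimal sketch of the two ideas, assuming a simplified loop: `run_ticks`, `classify_ticker`, and the callback-style heartbeat writer are hypothetical names, and only the 200s threshold comes from the text above.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("cron.ticker.sketch")
STALE_AFTER_SECONDS = 200  # threshold quoted in the commit message


def run_ticks(tick: Callable[[], None],
              record_heartbeat: Callable[[float], None],
              iterations: int) -> int:
    """Run `iterations` ticks; return how many failed but were survived."""
    survived = 0
    for _ in range(iterations):
        try:
            tick()
        except BaseException:  # includes SystemExit / KeyboardInterrupt
            logger.error("cron tick failed", exc_info=True)
            survived += 1
        try:
            record_heartbeat(time.time())
        except Exception:
            # Best-effort: a heartbeat write failure never reaches the loop.
            pass
    return survived


def classify_ticker(heartbeat: Optional[float], now: float) -> str:
    """Map heartbeat age to a status string."""
    if heartbeat is None:
        return "unknown"  # older build or never ran: not proof of death
    return "stalled" if now - heartbeat > STALE_AFTER_SECONDS else "ok"
```

The real loop runs until shutdown rather than for a fixed iteration count; the count here only keeps the sketch testable.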
Salvaged from #49660 (BaseException survival on current structure),
extended with the heartbeat + honest-status reporting that the earlier
(pre-refactor) watchdog PRs #35616 and #33849 proposed.
Fixes #32612
Fixes #32895
Co-authored-by: banditburai <promptsiren@gmail.com>
Co-authored-by: sweetcornna <96944678+sweetcornna@users.noreply.github.com>
Consolidates three cron-delivery defects in cron/scheduler.py::_deliver_result
that all stem from how the live-adapter send result is interpreted.
#38922 — duplicate message on confirmation timeout.
future.result(timeout=60) raising TimeoutError bubbled to the outer
except handler, which left delivered=False, so `if not delivered:` re-sent
the identical message via the standalone path. future.cancel() cannot
un-send a request already in flight on the wire, so a slow confirmation
deterministically produced a duplicate. The send was already dispatched onto
the gateway loop, so a bare timeout is now treated as delivered
(assume-delivered is safer than guaranteed-duplicate) and the standalone
fallback is skipped. The live-adapter media attempt is also skipped on
timeout since the contended loop would re-block each 30s media budget.
#47056 — silent drop when the gateway has an active session.
The old check `if send_result is None or not getattr(send_result,
"success", True)` let a result object missing a `success` attribute default
to True, which counted as a successful delivery, so the scheduler logged
"delivered via live adapter" while the gateway never processed the message.
Delivery is now confirmed via _confirm_adapter_delivery(): only an explicit,
truthy `success` attribute counts; None or a `success`-less object falls
through to the standalone path so the message actually arrives.
A genuine send Exception (not a slow confirmation) still falls through to
the standalone path, and is caught by run_job's outer handler — it is
recorded as the job's last_error and never crashes the cron ticker.
#43014 — deliver=origin fails to resolve in CLI sessions.
A CLI-created job has no {platform, chat_id} origin, so deliver=origin (and
auto-detect / deliver=None) was unresolvable and emitted "no delivery target
resolved" on every run. An unresolvable origin with no configured home
channel is now treated as local (output stays in last_output), matching the
documented auto-deliver contract; a concrete unresolvable platform target
still reports a real error.
Salvaged from #41007 (timeout discriminator), folding in #47127's
_confirm_adapter_delivery hardening and #38937 / #43063's origin→local
fallback. Tests rewritten as behavior contracts (timeout => no duplicate;
None / success-less result => standalone fallback; confirmed success => no
fallback; CLI origin => local, explicit platform => still errors).
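The behavior contracts for the live-adapter result can be sketched as follows. This is an approximation, not the shipped `_confirm_adapter_delivery`: the function names, the `wait_for_result` callable, and the returned labels are assumptions made for illustration.

```python
from concurrent.futures import TimeoutError as FutureTimeoutError
from types import SimpleNamespace
from typing import Any, Callable


def confirm_adapter_delivery(send_result: Any) -> bool:
    """Count a send as delivered only on an explicit, truthy `success`."""
    if send_result is None:
        return False
    # Default is False: an object without `success` is NOT a success.
    return bool(getattr(send_result, "success", False))


def interpret_send(wait_for_result: Callable[[], Any]) -> str:
    """Classify the outcome of waiting on an already-dispatched send."""
    try:
        result = wait_for_result()
    except (TimeoutError, FutureTimeoutError):
        # The request is already on the wire; re-sending would duplicate it.
        return "assume_delivered"
    except Exception:
        # A genuine send failure: let the standalone path try.
        return "standalone_fallback"
    if confirm_adapter_delivery(result):
        return "delivered"
    return "standalone_fallback"


# Example result objects (shapes are illustrative only).
confirmed = SimpleNamespace(success=True)
no_success_attr = object()
```

Catching both timeout classes keeps the sketch correct on Python versions where `concurrent.futures.TimeoutError` is not yet an alias of the builtin.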
Co-authored-by: Evi Nova <66773372+Tranquil-Flow@users.noreply.github.com>
Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>
Cron jobs created without an explicit `model` are stored as `model: null`.
At fire time `run_job` resolved `model = job.get("model") or os.getenv(
"HERMES_MODEL") or ""` and then `_model_cfg.get("default", model)`, so when
config.yaml had no `model.default` (or `model: {default: null}`) an empty
string flowed straight to the provider and surfaced as an opaque HTTP 400
("Model parameter is required" / "model: String should have at least 1
character"). The operator had to inspect jobs.json to discover the job was
stored with a null model.
This change makes cron model resolution robust and symmetric with the CLI:
- Coerce `model: null`/missing config to `{}` so a falsy default never
overwrites an already-resolved env value with `None`.
- Only overwrite `model` from `model.default` when the resolved value is
truthy; accept a `model.model` alias key, mirroring the sibling resolvers
in hermes_cli/oneshot.py, fallback_cmd.py and prompt_size.py.
- Resolve AFTER the managed-scope overlay so an administrator-pinned model
still wins.
- Fail fast with an actionable error (caught by run_job's outer handler and
recorded as the job's last_error — the cron ticker is unaffected) instead
of letting an empty model reach the API.
- The per-job model is re-read every tick, so a `cronjob action=update
model=...` after a failed run takes effect on the next tick (no cache).
Adds tests/cron/conftest.py pinning a default HERMES_MODEL so existing
run_job tests don't trip the new guard, plus regression tests covering env
fallback, config.default fallback, string-form config, the model alias key,
null-default-no-clobber, corrupt-config graceful degradation, fail-fast,
and the no-cache re-read property.
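A rough sketch of the resolution order, assuming plain dict inputs. `resolve_cron_model` is a hypothetical name, the managed-scope overlay is assumed to have been applied to `config` already, and the rule that an explicit per-job model wins over `model.default` is an assumption not stated above.

```python
from typing import Any, Mapping


def resolve_cron_model(job: Mapping[str, Any],
                       env: Mapping[str, str],
                       config: Mapping[str, Any]) -> str:
    """Hypothetical resolver: job model, then config default, then env."""
    model = job.get("model") or env.get("HERMES_MODEL") or ""

    # Coerce `model: null` / missing config to an empty mapping.
    model_cfg = config.get("model") or {}
    if isinstance(model_cfg, str):
        configured = model_cfg  # string-form config
    elif isinstance(model_cfg, Mapping):
        # Accept the `model` alias key next to `default`.
        configured = model_cfg.get("default") or model_cfg.get("model")
    else:
        configured = None  # corrupt config: degrade gracefully

    # Only a truthy configured value may overwrite the resolved one.
    if configured and not job.get("model"):
        model = configured

    if not model:
        raise ValueError(
            "cron job has no model: set one on the job, in config.yaml "
            "(model.default), or via HERMES_MODEL"
        )
    return model
```

Because the function takes the job record as an argument and keeps no state, calling it on every tick gives the no-cache re-read property for free.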
Salvaged from #24005, rebased onto current main, with additional test
coverage folded in from #45550 and the alias-key behavior from #43952.
Fixes #43899
Fixes #23979
Fixes #22761
Co-authored-by: szzhoujiarui-sketch <szzhoujiarui@gmail.com>
Co-authored-by: rayjun <rayjun0412@gmail.com>
Matches the env= callsite convention at the other sanitized
subprocess spawns (cua_backend dict(os.environ), gateway
os.environ.copy()). Functionally equivalent — _sanitize_subprocess_env
never mutates its input — but avoids handing the live mapping to the
helper.
Follow-up to salvaged PR #49207.
Cron no_agent and pre-check scripts ran with the full gateway/agent
environment, allowing scripts under HERMES_HOME/scripts/ to read provider
credentials. Apply _sanitize_subprocess_env like terminal and MCP paths
(SECURITY.md section 2.3).
Add regression test asserting blocklisted provider vars are absent in the
child process.
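The shape of the fix can be sketched like this. The suffix blocklist and both function names are assumptions; the real `_sanitize_subprocess_env` rules are not shown in this message.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping

# Hypothetical blocklist; the real helper's rules may differ.
BLOCKED_SUFFIXES = ("_API_KEY", "_TOKEN", "_SECRET")


def sanitize_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Return a filtered copy; the input mapping is never mutated."""
    return {
        key: value
        for key, value in env.items()
        if not key.upper().endswith(BLOCKED_SUFFIXES)
    }


def run_script(script_path: str) -> subprocess.CompletedProcess:
    """Run a cron script with a sanitized copy of the environment."""
    # Pass a copy (dict(os.environ)) rather than the live mapping.
    child_env = sanitize_env(dict(os.environ))
    return subprocess.run(
        [sys.executable, script_path],
        env=child_env,
        capture_output=True,
        text=True,
        check=False,
    )
```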
The skin bug was one instance of a class: several subsystems build their
config dict directly from config.yaml instead of routing through
hermes_cli.config.load_config (which carries the managed merge), so they
silently ignored administrator-pinned values. Audited every config.yaml
reader and fixed the behavioral-read bypasses:
- gateway/config.py load_gateway_config (messaging gateway: session_reset,
quick_commands, stt, model, ...)
- gateway/run.py _load_gateway_config (its read_raw_config fast path also
skipped the merge — read_raw_config returns raw user YAML)
- tui_gateway/server.py _load_cfg (new TUI + desktop backend: skin,
reasoning_effort, service_tier, provider_routing)
- cron/scheduler.py (scheduled-job model/reasoning/toolsets/provider_routing)
- hermes_logging.py (logging.level/max_size_mb/backup_count)
- hermes_time.py (timezone)
- hermes_cli/doctor.py (memory-provider diagnostic reads effective config)
All route through a new shared managed_scope.apply_managed_overlay() helper
that mirrors _load_config_impl (env-only expansion so a user ${VAR} can't
shadow a managed literal, root-model-string normalization, leaf-merge) and is
fail-open. cli.py's earlier inline fix is refactored onto the same helper.
Write-back paths (slash_commands, telegram/yuanbao dm_topics, profile
distribution) are deliberately left reading raw user YAML — overlaying managed
values there would persist them into the user file. The dashboard
(web_server.py) already routes through load_config and needed no change.
The TUI loader caches the RAW config so _save_cfg never writes managed values to
disk. Adds test_managed_scope_overlay.py (helper) and
test_managed_scope_loaders.py (per-surface integration); mutation-checked.
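The leaf-merge part of the overlay can be illustrated with a short recursive function. This sketch covers only the merge; env expansion, root-model-string normalization, and the fail-open wrapper are omitted, and `overlay_managed` is a hypothetical name rather than the real `apply_managed_overlay`.

```python
from typing import Any, Dict, Mapping


def overlay_managed(user: Mapping[str, Any],
                    managed: Mapping[str, Any]) -> Dict[str, Any]:
    """Leaf-merge: managed leaves win, untouched user keys survive."""
    merged: Dict[str, Any] = dict(user)
    for key, value in managed.items():
        current = merged.get(key)
        if isinstance(value, Mapping) and isinstance(current, Mapping):
            # Recurse so sibling user keys under the same section are kept.
            merged[key] = overlay_managed(current, value)
        else:
            merged[key] = value
    return merged
```

Returning a new dict instead of mutating `user` matches the write-back concern above: the raw user config can still be saved without managed values leaking into it.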
Two small, focused fixes for the cron scheduler and checkpoint manager.
1. _summarize_cron_failure_for_delivery (cron/scheduler.py):
Replaces the raw error dump in _process_job with a compact
pattern-matched summary. Provider rate limits, timeouts, and
authentication errors now produce a short human-readable message
instead of dumping multi-KB provider JSON into the delivery channel.
2. _repair_bare_repo_dirs (tools/checkpoint_manager.py):
Recreates refs/heads/ and branches/ directories after git gc
--prune=now, which can remove empty dirs from bare repos and cause
subsequent git add -A to fail with 'fatal: not a git repository'.
Called after all four git gc call sites.
Both fixes use only standard library imports and plug into existing
call sites with no architectural changes.
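As a sketch of the first fix, a pattern-matched summary could look like the following. The regexes, messages, and the name `summarize_failure` are illustrative assumptions, not the patterns used by `_summarize_cron_failure_for_delivery`.

```python
import re
from typing import List, Pattern, Tuple

# Hypothetical patterns and messages, checked in order.
_SUMMARIES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"rate.?limit|\b429\b", re.I),
     "Provider rate limit reached; the job will retry on its next run."),
    (re.compile(r"timed? ?out", re.I),
     "The provider request timed out."),
    (re.compile(r"\b401\b|\b403\b|unauthori[sz]ed|invalid api key", re.I),
     "Provider authentication failed; check the configured credentials."),
]


def summarize_failure(error: str, max_len: int = 200) -> str:
    """Return a short message instead of a multi-KB provider payload."""
    for pattern, summary in _SUMMARIES:
        if pattern.search(error):
            return summary
    stripped = error.strip()
    if not stripped:
        return "unknown error"
    # Fallback: first line only, truncated.
    return stripped.splitlines()[0][:max_len]
```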
Phase 4F (F.1 + F.2 + F.3, agent side). F.4 is the operator-run live smoke
(needs a NAS deployment); recorded in the PR, not code.
F.1 — on_jobs_changed wiring:
- cron/scheduler.py: _notify_provider_jobs_changed() — resolve the active
provider, call on_jobs_changed(), swallow errors. Lives in scheduler.py (not
jobs.py) so the store stays free of provider imports (no import cycle).
- Wired at the consumer surfaces AFTER a successful mutation: the cronjob model
tool (tools/cronjob_tools.py, create/update/remove/pause/resume) — which the
`hermes cron` CLI also routes through — and the REST handlers
(gateway/platforms/api_server.py, same five). Built-in's no-op default = zero
behavior change on the default path. Sleeping-agent direct jobs.json writes
(no tool/CLI/REST) are covered by reconcile-on-wake in start().
F.2 — config: cron.chronos.{portal_url,callback_url,expected_audience,
nas_jwks_url}. All non-secret; the agent holds no scheduler creds and the
outbound provision call reuses the existing Nous token (no token key). Additive
deep-merge key, no version literal.
F.3 — docs:
- docs/chronos-managed-cron-contract.md: authoritative agent↔NAS wire contract
(the three agent-cron endpoints + inbound /api/cron/fire + the 3-hop trust
model + at-most-once/re-arm semantics). This is what the NAS-side agent builds
against.
- cron-internals.md: "Managed cron (Chronos) for scale-to-zero" section.
- cli-commands.md: cron.provider accepts chronos + the cron.chronos.* keys.
- User docs name no scheduler vendor (QStash is a NAS-internal detail).
INVARIANT re-verified: zero qstash/upstash hits across plugins/cron, gateway,
hermes_cli, tools, website/docs (the one remaining repo hit is an unrelated
Context7 MCP comment in tools/mcp_tool.py).
Tests: test_jobs_changed_notify (5) — notify calls provider hook, swallows
errors, built-in harmless, tool create/remove notify. Full cron + chronos +
webhook + config + api_server_jobs suites green (504 in the cron+chronos+webhook
run).
Phase 4A. Factor tick's per-job closure (_process_job: execute → save →
deliver → mark) into a module-level run_one_job(job, *, adapters, loop,
verbose) so the external Chronos provider's fire_due (Phase 4D) reuses the
IDENTICAL body — no duplicated correctness. tick's _process_job is now a thin
wrapper calling run_one_job; the pool/in-flight-guard/contextvars dispatch
logic is unchanged.
run_one_job fires ONE given job; it does NOT decide due-ness, claim, or compute
next_run (tick advances next_run_at under the file lock; an external provider
claims via the store CAS in Phase 4C). Pure refactor, no behavior change.
TDD: test_run_one_job.py characterizes the sequence through tick() first
(test_tick_process_job_sequence, passed pre-extraction), then unit-tests the
helper directly: success sequence, [SILENT]→skip delivery, empty-response soft
failure (#8585), failed-job-still-delivers, exception→mark-failed.
Verified: tests/cron/ 459 passed (was 453 + 6 new); tick behavior unchanged.
Fully removes the cron per-job 'profile' arg added in #28124: the
cronjob tool schema field, CLI --profile flags on cron create/edit,
job-record storage/validation, the scheduler's _job_profile_context
wrapper, and the script-runner env override. Sequential-partition
logic reverts to workdir-only.
The context-local HERMES_HOME override in hermes_constants and the
subprocess bridging in tools/environments/local.py are kept — they
now have other consumers (dashboard multi-profile, TUI gateway).
The runtime assembled-prompt scan (#3968 lineage) selected its pattern
tier on has_skills alone. A script-driven, no-skills job injects its
script's stdout into the prompt, and that blob was scanned with the
STRICT user-prompt pattern set — so any command-shape string in the
data feed (e.g. a triage bot ingesting a bug report that quotes
`rm -rf /`) hard-blocked the job on every tick.
Script output and context_from output are runtime DATA produced by
operator-authored code — the same trust class as install-vetted skill
markdown, not a user-authored directive prompt. Select the scan tier by
what the assembled prompt CONTAINS: when it includes skill content OR
injected data, use the looser _scan_cron_skill_assembled set (keeps
unambiguous injection directives, drops command-shape patterns,
sanitizes invisible unicode instead of blocking).
Defense-in-depth is preserved:
- The raw user prompt is still strict-scanned at create/update
(api_server paths untouched) AND re-scanned strict at runtime even
when the looser tier was selected for the data blob.
- Plain no-script/no-skills jobs keep the strict scan on the whole
assembled prompt.
- Injection directives arriving via script stdout still block.
Rejected alternative: removing destructive_root_rm from the strict set
or a per-job skip_injection_scan flag — both weaken the guard globally.
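The tier selection described above can be sketched with two tiny pattern sets. Everything here is a simplification: the regexes stand in for the real pattern tiers, `prompt_injection` is a made-up pattern name, and `scan_job_prompt` is hypothetical.

```python
import re
from typing import Dict, Optional, Pattern

# Stand-in patterns; the real tiers contain many more entries.
LOOSE: Dict[str, Pattern[str]] = {
    "prompt_injection": re.compile(r"ignore (all )?previous instructions", re.I),
}
STRICT: Dict[str, Pattern[str]] = {
    **LOOSE,
    "destructive_root_rm": re.compile(r"rm\s+-rf\s+/"),
}


def _first_hit(text: str, patterns: Dict[str, Pattern[str]]) -> Optional[str]:
    for name, pattern in patterns.items():
        if pattern.search(text):
            return name
    return None


def scan_job_prompt(user_prompt: str, skill_text: str = "",
                    script_output: str = "") -> Optional[str]:
    """Return the name of the matched pattern, or None if the job may run."""
    # The raw user prompt is always scanned with the strict set.
    hit = _first_hit(user_prompt, STRICT)
    if hit:
        return hit
    # Tier for the assembled prompt depends on what it CONTAINS.
    tier = LOOSE if (skill_text or script_output) else STRICT
    assembled = "\n".join(p for p in (skill_text, script_output, user_prompt) if p)
    return _first_hit(assembled, tier)
```

A quoted `rm -rf /` in script output passes, while the same string in the user prompt, or an injection directive in script output, still blocks.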
A cron session's first message is the injected "[IMPORTANT: you are running as
a scheduled cron job …]" delivery hint, so with no explicit title the sidebar
and history rows fell back to that hint as their label.
Set the session title from the job (name → short prompt → id) with a run-time
suffix for uniqueness against the sessions.title index. Done after the run so
the agent's own INSERT keeps model/system_prompt — this only updates the title.
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
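A minimal sketch of the guard, assuming the error has already been classified. The class labels, the returned dict shape, and `handle_overflow` are hypothetical; only the `compaction_disabled` flag and the advice text follow the description above.

```python
from typing import Any, Callable, Dict

# Hypothetical labels for the three overflow classes.
OVERFLOW_CLASSES = frozenset(
    {"long_context_tier_429", "payload_too_large_413", "context_overflow"}
)


def handle_overflow(error_class: str, compression_enabled: bool,
                    compress: Callable[[], None]) -> Dict[str, Any]:
    """Guard at the top of the overflow-recovery dispatch."""
    if error_class not in OVERFLOW_CLASSES:
        return {"handled": False}
    if not compression_enabled:
        # Respect the user's setting: surface a terminal error instead.
        return {
            "handled": True,
            "compaction_disabled": True,
            "message": ("Context overflow and auto-compaction is disabled: "
                        "run /compress, start /new, switch to a "
                        "larger-context model, or reduce attachments."),
        }
    compress()
    return {"handled": True, "compaction_disabled": False}
```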
* fix(dashboard): populate cron delivery dropdown from configured platforms
The dashboard cron-create/edit dropdown hardcoded five delivery options
(local, telegram, discord, slack, email), so users on Matrix — or any
other backend-supported platform — had no way to pick their channel even
though the cron scheduler delivers to all of them. It also offered
Telegram/Discord/etc. to users who never set those up.
- cron/scheduler.py: add cron_delivery_targets() — the single source of
truth. Intersects gateway-configured platforms with cron-deliverable
ones and reports whether each platform's home channel is set.
- web_server.py: GET /api/cron/delivery-targets exposes that list (+ the
implicit local option) to the dashboard.
- CronPage.tsx: both modals render options from the endpoint. Configured
platforms missing a home channel still appear, annotated "set a home
channel first" (option B), so the user knows what to fix. Edit modal
preserves a job's current target even if it's no longer configured.
Local-only state shows a "configure a platform under Channels" hint.
Validation: scheduler + endpoint E2E'd with a Matrix gateway (home set
and unset); 5 new tests; tests/cron + tests/hermes_cli/test_web_server
green (366 passed).
Follow-up on the parallel-dispatch decoupling: the sequential pass for
workdir/profile jobs still ran inline in the ticker thread, so a long
workdir/profile job reintroduced the exact starvation #37312 describes,
just for env-mutating jobs. And the MCP orphan sweep ran immediately
after dispatch in sync=False mode — before jobs finished — defeating its
own 'runs after every job' contract and racing jobs still spawning MCP
children.
- Sequential jobs now queue to a persistent single-thread cron-seq pool
(preserves one-at-a-time ordering across ticks, never blocks the tick).
- Same in-flight dedup guard now covers sequential jobs.
- MCP orphan sweep runs via a done-callback after the LAST dispatched job
completes in async mode; inline after as_completed in sync mode.
Verified E2E: tick(sync=False) returns in ~1ms with a 1.5s sequential job
in flight; sweep fires only after that job ends.
PR #13021 fixed serial starvation by adding ThreadPoolExecutor to tick(),
but kept as_completed(timeout=600) which still blocks the ticker thread
until the slowest job finishes. This causes the same starvation pattern:
when one job runs long (15+ min), other jobs' next_run_at expires past the
grace window and they get perpetually fast-forwarded instead of running.
This PR decouples dispatch from completion:
- Persistent ThreadPoolExecutor (reused across ticks, no auto-join)
- Fire-and-forget dispatch: tick submits and returns immediately
- Running-job guard: prevents re-dispatching active jobs
- sync parameter: defaults to True (backward compatible), callers opt
into sync=False for non-blocking behavior
- atexit shutdown handler for clean pool teardown
- gateway/run.py: production ticker opts into sync=False
Refs #33315 (complementary — that issue's PRs fix grace handling in
jobs.py; this PR prevents the grace from expiring in the first place)
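Fire-and-forget dispatch with a running-job guard can be sketched as below. The `Dispatcher` class is a hypothetical stand-in for the module-level pool in the scheduler, and the atexit hook is reduced to an explicit `shutdown()` method.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, FrozenSet, Optional, Set


class Dispatcher:
    """Persistent pool plus in-flight guard (illustrative only)."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._running: Set[str] = set()

    def dispatch(self, job_id: str, fn: Callable[[], object]) -> Optional[Future]:
        """Submit and return immediately; None if the job is still running."""
        with self._lock:
            if job_id in self._running:
                return None
            self._running.add(job_id)
        try:
            future = self._pool.submit(fn)
        except Exception:
            self._finish(job_id)  # do not leave a phantom in-flight entry
            raise
        future.add_done_callback(lambda _f: self._finish(job_id))
        return future

    def _finish(self, job_id: str) -> None:
        with self._lock:
            self._running.discard(job_id)

    def in_flight(self) -> FrozenSet[str]:
        with self._lock:
            return frozenset(self._running)

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)
```

The caller never joins the pool inside a tick, so one slow job cannot delay the next tick; the guard only prevents the same job from being stacked on itself.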
A stray zero-width space (U+200B), BOM, or bidi control in loaded skill
markdown permanently killed any cron that loaded it. The skills-attached
assembled-prompt scan hard-blocked on any invisible-unicode char, even
though skill bodies are already install-time vetted by skills_guard.py and
the chars commonly appear in copy-pasted unicode docs / code examples.
The skills path now strips invisibles (logging the codepoints) and runs the
cleaned prompt. The raw user-prompt path (_scan_cron_prompt) keeps the hard
block — that is the actual #3968 injection surface, where a small directive
prompt with a ZWSP is a smoking gun, not prose. Stripping does not let a real
injection slip through: the directive still matches after sanitization.
_scan_cron_skill_assembled now returns (cleaned_prompt, error).
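Stripping instead of blocking might look like the sketch below. The character set is an assumed subset, and the helper returns the stripped codepoints rather than the `(cleaned_prompt, error)` pair of the real function.

```python
from typing import List, Tuple

# Assumed subset: zero-width chars, BOM, and bidi controls.
INVISIBLES = frozenset(
    "\u200b\u200c\u200d\u2060\ufeff"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)


def strip_invisibles(text: str) -> Tuple[str, List[str]]:
    """Return (cleaned_text, sorted codepoint labels that were removed)."""
    found = sorted({f"U+{ord(ch):04X}" for ch in text if ch in INVISIBLES})
    cleaned = "".join(ch for ch in text if ch not in INVISIBLES)
    return cleaned, found
```

Running the directive patterns on `cleaned` afterwards is what keeps a real injection from slipping through: a directive split by a ZWSP is rejoined before matching.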
The runtime cron prompt scanner (added in #3968 to plug the
"malicious skill carrying an injection payload" gap) reuses the same
critical-severity patterns as the create-time user-prompt scan against
the *assembled* prompt — which includes loaded skill markdown.
That works fine for narrow patterns like "ignore previous instructions"
which never legitimately appear in prose. It catastrophically false-
positives on command-shape patterns like `cat ~/.hermes/.env`,
`authorized_keys`, `/etc/sudoers`, and `rm -rf /`, which routinely
appear in security postmortems and runbooks as **descriptive prose**
about attacks, not as actual commands.
Concrete failure: the bundled `hermes-agent-dev` skill contains a
security postmortem section saying "the attacker could just
`cat ~/.hermes/.env`". Every PR-scout cron job that loaded this skill
was silently blocked with `Blocked: prompt matches threat pattern
'read_secrets'`. All 11 scout jobs failed for weeks.
Fix: split the scanner into two tiers and route by context:
- `_scan_cron_prompt` (strict, unchanged behavior) runs against
the small user-authored cron prompt at create/update and as a
runtime defense-in-depth when no skills are attached. A legit
user prompt has no business saying `cat .env`, so the strict
patterns still apply there.
- `_scan_cron_skill_assembled` (new, looser) runs against the
assembled prompt when skills are attached. It only catches
unambiguous prompt-injection directives ("ignore previous
instructions", "disregard your rules", "system prompt override",
"do not tell the user") plus invisible-unicode markers. Command-
shape patterns are dropped because they false-positive on prose.
This is defense-in-depth, not the only line of defense. Skill bodies
are already scanned at install time by `skills_guard.py`; the runtime
cron scan exists purely as a tripwire for an obvious injection
directive surviving a malicious install. Catching prose mentions of
commands was never the goal of #3968 — the test that planted a skill
containing `cat ~/.hermes/.env` was the wrong shape of test for the
threat model.
Tests:
- `_scan_cron_prompt` strict behavior preserved (56 existing tests
unchanged: bare `cat .env`, `rm -rf /`, etc. still block).
- New `TestScanCronSkillAssembled` class verifies the looser scanner:
injection / disregard / system-override / do-not-tell-the-user /
invisible-unicode still block; descriptive prose about attack
commands is allowed; GitHub auth-header allowlist still works.
- `test_skill_with_env_exfil_payload_raises` (planted `cat .env`
in skill body) replaced with
`test_skill_with_env_exfil_command_in_prose_is_allowed`, documenting the
new correct behavior with the real-world postmortem-style example that
triggered the bug.
- All 11 originally-failing PR-scout jobs validated end-to-end via
`_build_job_prompt` — assembled prompts now build successfully
with the `hermes-agent-dev` skill attached.
Total: 75/75 tests in cron + cronjob_tools + threat scanner pass;
544/544 across the wider cron / memory / threat-pattern surface.
The bug: cron/scheduler.py:_resolve_cron_enabled_toolsets returns an
LLM-supplied per-job enabled_toolsets verbatim. The disabled_toolsets
passed to AIAgent was a hardcoded [cronjob, messaging, clarify] that
ignored agent.disabled_toolsets from config.yaml. An LLM could call
cronjob(action='add', enabled_toolsets=['terminal','file'],
prompt='...') and the cron-spawned agent would receive terminal+file
even when the operator had globally disabled them.
Fix: new _resolve_cron_disabled_toolsets() helper that ALWAYS layers
agent.disabled_toolsets on top of the cron baseline. AIAgent's
disabled_toolsets takes precedence over enabled_toolsets, so this
stops the bypass regardless of what the per-job override contains.
This is the disabled-side fix. Three concurrent PRs (#25842, #25815,
#25780) proposed intersection-side variants on _resolve_cron_enabled_toolsets;
this fix is more robust because it stops the leak at the precedence
boundary AIAgent itself enforces, not at a layer above.
Regression test reproduces the issue's PoC exactly:
config.yaml has agent.disabled_toolsets=[terminal,file]; cron job has
enabled_toolsets=[web,terminal,file]; assertion: AIAgent receives
disabled_toolsets containing terminal AND file.
Salvaged from PR #25786 by @Schrotti77. Simplified the implementation:
dropped a 23-line _normalize_toolset_list() helper (handled str/tuple/
set/garbage input shapes) in favor of the existing convention
(agent_cfg.get('disabled_toolsets') or []) used elsewhere in the
codebase. YAML always parses these as lists; the elaborate normalizer
was theatre for shapes we never produce.
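The layering can be sketched in a few lines. The helper name mirrors the one in the message but the body is an approximation, and `effective_toolsets` is a hypothetical stand-in for the precedence AIAgent applies internally.

```python
from typing import Any, List, Mapping, Sequence

CRON_BASELINE_DISABLED = ["cronjob", "messaging", "clarify"]


def resolve_cron_disabled_toolsets(config: Mapping[str, Any]) -> List[str]:
    """Always layer agent.disabled_toolsets on top of the cron baseline."""
    agent_cfg = config.get("agent") or {}
    configured = agent_cfg.get("disabled_toolsets") or []
    merged = list(CRON_BASELINE_DISABLED)
    for name in configured:
        if name not in merged:
            merged.append(name)
    return merged


def effective_toolsets(enabled: Sequence[str],
                       disabled: Sequence[str]) -> List[str]:
    """Stand-in for AIAgent's rule: disabled wins over enabled."""
    return [name for name in enabled if name not in disabled]
```

With the PoC inputs, a job asking for `web`, `terminal`, and `file` ends up with `web` only.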
Closes #25752
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
ntfy now ships as a self-contained plugin under plugins/platforms/ntfy/
instead of editing 8 core files (gateway/config.py Platform enum,
gateway/run.py factory + auth maps, cron/scheduler.py, toolsets.py,
hermes_cli/status.py, agent/prompt_builder.py, gateway/channel_directory.py,
tools/send_message_tool.py).
All routing goes through gateway/platform_registry via register_platform():
- adapter_factory, check_fn, validate_config, is_connected
- env_enablement_fn seeds PlatformConfig.extra from NTFY_* env vars so
gateway status reflects env-only setups without instantiating httpx
- standalone_sender_fn handles deliver=ntfy cron jobs when cron runs
out-of-process from the gateway
- allowed_users_env / allow_all_env hook into _is_user_authorized
- cron_deliver_env_var=NTFY_HOME_CHANNEL for cron home routing
- platform_hint surfaces in the system prompt
- pii_safe=True (topic names are the only identifier; no PII to redact)
Tests moved to tests/gateway/test_ntfy_plugin.py using _plugin_adapter_loader
so the module lives under plugin_adapter_ntfy in sys.modules and cannot
collide with sibling plugin-adapter tests on the same xdist worker. The
core-file grep tests (Platform.NTFY in source, hermes-ntfy in toolsets,
etc.) are replaced with plugin-shape tests covering register() metadata,
env_enablement_fn output, and standalone_sender_fn behavior.
68 tests pass under scripts/run_tests.sh.
Add an official, production-grade WhatsApp integration via Meta's
Business Cloud API as a complement to the existing Baileys bridge.
No bridge subprocess, no QR codes, no account-ban risk — at the cost
of a Meta Business account and a public HTTPS webhook URL.
Setup is fully wizard-driven: 'hermes whatsapp-cloud' walks through
every credential with paste-time validation (catches the #1 trap of
pasting a phone number into the Phone Number ID field), generates a
verify token, and ends with copy-paste instructions for the
cloudflared / Meta-dashboard / Business Manager pieces that can't be
automated. The wizard also points users at Meta's Business Manager
for setting the bot's display name and profile picture.
Feature set:
- Inbound: text, images (with native-vision routing), voice notes
(STT), documents (small text inlined, larger cached), reply context.
- Outbound: text with WhatsApp-flavored markdown conversion, images,
videos, documents, opus voice notes via ffmpeg with MP3 fallback.
- Native interactive buttons for clarify, dangerous-command approval,
and slash-command confirmation flows — matches the Telegram /
Discord UX, gracefully degrades to plain text.
- Read receipts (blue double-checkmarks) and typing indicator,
using Meta's combined endpoint so they fire in a single API call.
- Webhook security: X-Hub-Signature-256 HMAC verification (raw body,
constant-time), wamid deduplication, group-shaped-message refusal
(groups deferred to v2 — Baileys still covers them).
- Full integration with the gateway's session, cron, display-tier,
prompt-hint, and auth-allowlist systems. Cloud and Baileys can run
side-by-side against different phone numbers.
Also wires STT (speech-to-text) through Nous's managed audio gateway
for Nous subscribers — previously the default stt.provider=local
required a separate faster-whisper install. New subscribers now get
voice-note transcription out of the box.
Docs: 418-line user guide at website/docs/user-guide/messaging/
whatsapp-cloud.md, sidebar entry, environment-variables reference,
ADDING_A_PLATFORM.md updated with the optional interactive-UX
contract for future adapter authors.
Tests: 100 dedicated tests for the adapter, 32 for the setup wizard,
20 for the Nous subscription STT wiring, plus regression coverage
across display_config, prompt_builder, and the cron scheduler.
Known limitations (deferred until clear demand signal):
- Group chats — use the Baileys bridge if you need them.
- Message templates for 24-hour-window outside-conversation sends —
reactive chat is unaffected; cron / delegate_task with gaps > 24h
will fail with a clear error. The agent's system prompt warns the
model about this so it knows to mention it when scheduling delayed
messages.
Apply CREATE_NO_WINDOW flags when the cron scheduler launches job scripts on Windows so gateway-managed no-agent cron jobs do not flash cmd or python console windows every tick.
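A sketch of the platform gate, assuming the flag is passed through `subprocess` keyword arguments; `no_window_kwargs` is a hypothetical helper name.

```python
import subprocess
import sys
from typing import Any, Dict


def no_window_kwargs() -> Dict[str, Any]:
    """Extra subprocess kwargs that suppress console windows on Windows."""
    if sys.platform == "win32":
        # CREATE_NO_WINDOW only exists in the subprocess module on Windows.
        return {"creationflags": subprocess.CREATE_NO_WINDOW}
    return {}
```

A call site would then look like `subprocess.run(cmd, **no_window_kwargs())`, which is a no-op on other platforms.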
1. trajectory_compressor.py: yaml.safe_load() returns None on empty
files, crashing with TypeError on `if 'tokenizer' in data`. Fix by
adding `or {}` fallback. (HIGH — blocks startup with empty config)
2. 6 files with fcntl.flock(LOCK_UN) in finally blocks without
try/except: cron/scheduler.py, hermes_cli/auth.py,
agent/shell_hooks.py, tools/skill_usage.py,
tools/environments/file_sync.py, tools/memory_tool.py. If unlock
raises OSError, fd.close() is skipped and the lock is held forever.
The msvcrt branches already had try/except; the fcntl branches did
not. Fix by wrapping in try/except (OSError, IOError): pass.
3. agent/copilot_acp_client.py line 639: TOCTOU race — path.exists()
followed by path.read_text() with no try/except. If file is deleted
between the check and the read, FileNotFoundError propagates. Fix
by using try/except FileNotFoundError.
4. gateway/sticker_cache.py: non-atomic write via Path.write_text()
can leave truncated JSON on crash, causing JSONDecodeError on next
load. Fix by writing to tempfile + fsync + os.replace (atomic).
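The atomic-write pattern from item 4 can be sketched as follows; `atomic_write_json` is a hypothetical name and not the function in gateway/sticker_cache.py.

```python
import json
import os
import tempfile
from typing import Any


def atomic_write_json(path: str, data: Any) -> None:
    """Write JSON so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives in the same directory so os.replace stays atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file, then re-raise the original error.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```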
When Telegram topic mode is enabled, cron messages delivered to the bot's
root DM (TELEGRAM_HOME_CHANNEL without a thread id) land in the system
lobby — replies there are rebuffed with the lobby reminder and
reply_to_message_id is dropped, so users cannot interact with the cron
output (#24409).
Add an optional TELEGRAM_CRON_THREAD_ID env var that overrides
TELEGRAM_HOME_CHANNEL_THREAD_ID for cron deliveries only. Operators can
create a "Cron" forum topic in the DM, point this var at its thread id,
and replies to cron messages will land in that topic's existing session
instead of the lobby. The home-channel thread id (used elsewhere, e.g.
restart notifications) is unchanged, and explicit
deliver="telegram:chat:thread" targets continue to win over the env var.
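The precedence described above can be sketched as below. The function name
resolve_cron_thread_id and its signature are assumptions for illustration;
only the two env var names come from this change.

```python
import os
from typing import Mapping, Optional


def resolve_cron_thread_id(
    explicit_thread_id: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the Telegram thread id for a cron delivery.

    Order: explicit target thread > TELEGRAM_CRON_THREAD_ID >
    TELEGRAM_HOME_CHANNEL_THREAD_ID > None (root DM).
    """
    if explicit_thread_id:
        return explicit_thread_id
    return (
        env.get("TELEGRAM_CRON_THREAD_ID")
        or env.get("TELEGRAM_HOME_CHANNEL_THREAD_ID")
        or None
    )
```

Treating an empty string as unset keeps a blank TELEGRAM_CRON_THREAD_ID from
masking the home-channel thread id.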
Per the reporter's clarification on 2026-05-13, option (a) (cron-side
route to a dedicated topic + config knob) was chosen.
Fixes #24409
Two code paths call json.loads() on output from external tools without
catching JSONDecodeError. If the tool returns a non-JSON string (error
message, empty string, or None), the entire call path crashes.
1. gateway/run.py — text_to_speech_tool() result in voice reply path.
A TTS failure that returns an error string instead of JSON crashes
the voice reply handler, killing the message response entirely.
2. cron/scheduler.py — skill_view() result when loading skills for
cron jobs. A corrupted or missing skill file that returns an error
string instead of JSON crashes the cron tick, preventing all jobs
from executing that cycle.
Both fixes catch (json.JSONDecodeError, TypeError), log a warning,
and gracefully skip the failed operation instead of crashing.
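A minimal sketch of the guard, assuming a hypothetical helper name
(parse_tool_json) and log wording; the real call sites may inline the
try/except instead.

```python
import json
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def parse_tool_json(raw: Any, source: str) -> Optional[Any]:
    """Parse tool output as JSON, returning None instead of raising."""
    try:
        return json.loads(raw)
    except (json.JSONDecodeError, TypeError) as exc:
        # JSONDecodeError covers error strings and empty output;
        # TypeError covers None and other non-string values.
        logger.warning("Skipping %s: output was not valid JSON (%s)", source, exc)
        return None
```

Callers check for None and skip the operation, so one bad tool result no
longer takes down the whole reply or cron tick.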
Instead of raising FileNotFoundError (which silently bricks the job),
log a warning and fall back to the scheduler default home. Validation
at create/update time still catches typos. Idea from PR #19958.
Replace generator-based result collection with explicit per-future
handling. Each future is now processed independently with a 600s timeout.
Before: _results.extend(f.result() for f in _futures)
- One exception stops the generator, remaining results are lost
- No timeout: one hung job blocks the entire tick
After: as_completed() + per-future try/except
- Each future handled independently
- 600s timeout prevents indefinite blocking
- Failed futures are logged and counted as failures
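The "after" shape might look like the sketch below. The names
(collect_results, the sample jobs) are hypothetical, and it assumes the 600s
limit is applied as an overall as_completed() deadline; the scheduler may
apply it differently.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeoutError
from typing import List, Tuple

logger = logging.getLogger(__name__)


def collect_results(futures: List[Future], timeout: float = 600.0) -> Tuple[list, int]:
    """Gather results future by future; one failure never discards the rest."""
    results: list = []
    failures = 0
    handled = 0
    try:
        for future in as_completed(futures, timeout=timeout):
            handled += 1
            try:
                results.extend(future.result())
            except Exception:
                logger.warning("cron job future failed", exc_info=True)
                failures += 1
    except FuturesTimeoutError:
        # Deadline hit: every future not yet yielded counts as a failure.
        unfinished = len(futures) - handled
        logger.warning("%d cron job(s) exceeded %.0fs", unfinished, timeout)
        failures += unfinished
    return results, failures


def _ok() -> list:
    return ["sent"]


def _boom() -> list:
    raise RuntimeError("job crashed")


with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(_ok), pool.submit(_boom), pool.submit(_ok)]
    results, failures = collect_results(futures)
```

Here the crashing job is counted as one failure while both successful
results are still collected.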
Wraps every sync->async coroutine-scheduling site in the codebase with a
new agent.async_utils.safe_schedule_threadsafe() helper that closes the
coroutine on scheduling failure (closed loop, shutdown race, etc.)
instead of leaking it as 'coroutine was never awaited' RuntimeWarnings
plus reference leaks.
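A rough sketch of what such a helper can look like. The signature and
logging here are assumptions, not the contents of agent/async_utils.py; the
key point is that the coroutine is closed whenever it cannot be scheduled.

```python
import asyncio
import logging
from concurrent.futures import Future
from typing import Any, Coroutine, Optional

logger = logging.getLogger(__name__)


def safe_schedule_threadsafe(
    coro: Coroutine[Any, Any, Any],
    loop: Optional[asyncio.AbstractEventLoop],
) -> Optional[Future]:
    """Schedule coro on loop from another thread; close it if that fails."""
    if loop is None or loop.is_closed():
        coro.close()  # avoids the "coroutine was never awaited" warning
        return None
    try:
        return asyncio.run_coroutine_threadsafe(coro, loop)
    except Exception:
        # Shutdown race: the loop closed between the check and the call.
        coro.close()
        logger.debug("could not schedule coroutine", exc_info=True)
        return None
```

Callers treat a None return as "not scheduled" rather than waiting on a
future that will never resolve.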
22 production call sites migrated across the codebase:
- acp_adapter/events.py, acp_adapter/permissions.py
- agent/lsp/manager.py
- cron/scheduler.py (media + text delivery paths)
- gateway/platforms/feishu.py (5 sites, via existing _submit_on_loop helper
which now delegates to safe_schedule_threadsafe)
- gateway/run.py (10 sites: telegram rename, agent:step hook, status
callback, interim+bg-review, clarify send, exec-approval button+text,
temp-bubble cleanup, channel-directory refresh)
- plugins/memory/hindsight, plugins/platforms/google_chat
- tools/browser_supervisor.py (3), browser_cdp_tool.py,
computer_use/cua_backend.py, slash_confirm.py
- tools/environments/modal.py (_AsyncWorker)
- tools/mcp_tool.py (2 + 8 _run_on_mcp_loop callers converted to
factory-style so the coroutine is never constructed on a dead loop)
- tui_gateway/ws.py
Tests: new tests/agent/test_async_utils.py covers helper behavior under
live loop, dead loop, None loop, and scheduling exceptions. Regression
tests added at three PR-original sites (acp events, acp permissions,
mcp loop runner) mirroring the contributor's intent.
Live-tested end-to-end:
- Helper stress test: 1500 schedules across live/dead/race scenarios,
zero leaked coroutines
- Race exercised: 5000 schedules with loop killed mid-flight, 100 ok /
4900 None returns, zero leaks
- hermes chat -q with terminal tool call (exercises step_callback bridge)
- MCP probe against failing subprocess servers + factory path
- Real gateway daemon boot + SIGINT shutdown across multiple platform
adapter inits
- WSTransport 100 live + 50 dead-loop writes
- Cron delivery path live + dead loop
Salvages PR #2657: adopts the contributor's intent over a much wider site
list and a single centralized helper instead of inline try/except at
each site. 3 of the original PR's 6 sites no longer exist on main
(environments/patches.py deleted, DingTalk refactored to native async);
the equivalent fix lives in tools/environments/modal.py instead.
Co-authored-by: JithendraNara <jithendranaidunara@gmail.com>
Cron jobs using `deliver: whatsapp` were silently dropped because the
resolver's home-channel env var dict in cron/scheduler.py listed every
messaging platform except whatsapp. _resolve_delivery_targets() returned
[] and no message was sent — but jobs.json marked the run successful and
no log line surfaced the failure.
The gateway adapter and the send_message tool path both honored
WHATSAPP_HOME_CHANNEL correctly; only the cron path missed.
Adds 'whatsapp' -> 'WHATSAPP_HOME_CHANNEL' to _HOME_TARGET_ENV_VARS.
Verified end-to-end with multiple cron pings landing in WhatsApp
self-chat after the fix.
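For illustration, a reduced version of the lookup. The mapping below shows
only two entries, and resolve_home_target is a hypothetical stand-in for the
relevant part of _resolve_delivery_targets(); the warning on an unmapped
platform is an assumption of this sketch, not part of the fix.

```python
import logging
import os
from typing import Mapping, Optional

logger = logging.getLogger(__name__)

# Reduced, illustrative mapping; the real dict lists every messaging platform.
_HOME_TARGET_ENV_VARS = {
    "telegram": "TELEGRAM_HOME_CHANNEL",
    "whatsapp": "WHATSAPP_HOME_CHANNEL",
}


def resolve_home_target(
    platform: str, env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Return the home-channel target for a platform, or None if unresolved."""
    var = _HOME_TARGET_ENV_VARS.get(platform)
    if var is None:
        logger.warning("no home-channel env var mapped for %r", platform)
        return None
    return env.get(var) or None
```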
Fixes #22997
Replace `x in (a, b)` with `x in {a, b}` for all literal-tuple
membership tests. Set lookup is O(1) vs O(n) for tuple — consistent
micro-optimization across the codebase.
608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining.
133 files, +626/-626 (net zero).
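A small illustration of the rewrite and of one likely reason the fix is
classed as unsafe: the two forms agree for hashable values but differ when
the left-hand side is unhashable. The function names are hypothetical.

```python
def in_tuple(value) -> bool:
    # Before: linear scan over the tuple literal using ==.
    return value in ("done", "failed")


def in_set(value) -> bool:
    # After: hash lookup in a set literal.
    return value in {"done", "failed"}


same_for_hashable = all(
    in_tuple(v) == in_set(v) for v in ("done", "failed", "running", 3, None)
)

# Unhashable values expose the behavioral difference.
tuple_result = in_tuple(["done"])  # compares with ==, no hashing needed
try:
    in_set(["done"])
    set_raised = False
except TypeError:  # lists are unhashable, so the set lookup raises
    set_raised = True
```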
Pick openrouter/pareto-code as your model and OpenRouter auto-routes each
request to the cheapest model meeting your coding-quality bar (ranked by
Artificial Analysis). The new openrouter.min_coding_score config key (0.0-1.0,
default 0.65) tunes the floor.
- hermes_cli/models.py: add openrouter/pareto-code to OPENROUTER_MODELS so
it shows up in the picker with a description
- hermes_cli/config.py: add openrouter.min_coding_score (default 0.65 — lands
on a mid-tier coder on the current Pareto frontier)
- plugins/model-providers/openrouter: emit extra_body.plugins =
[{id: pareto-router, min_coding_score: X}] when model is openrouter/pareto-code
AND the score is a valid float in [0.0, 1.0]
- agent/transports/chat_completions.py: same emission on the legacy flag
path (when no provider profile is loaded)
- run_agent.py: openrouter_min_coding_score kwarg + storage; plumbed into
both build_kwargs() invocations and the context-summary extra_body path
- cli.py: read openrouter.min_coding_score once at init, validate float in
[0,1], pass to AIAgent constructions (CLI + background-task paths)
- cron/scheduler.py, batch_runner.py, tools/delegate_tool.py,
tui_gateway/server.py: propagate the kwarg (mirrors providers_order
plumbing — subagents inherit, cron/batch read from config)
- tests: profile-level + transport-level coverage of the model gating,
unset/empty/out-of-range handling, and the legacy flag path
- docs: new 'OpenRouter Pareto Code Router' section in providers.md
Verified end-to-end against api.openrouter.ai: at score=0.65 we land on a
mid-tier coder; with the score omitted we get the strongest. Score is
silently dropped on any model other than openrouter/pareto-code, so it's
safe to leave set.
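The gating described above can be sketched as follows. The function name
pareto_extra_body is hypothetical; the model id, plugin id, and key name are
taken from the bullets in this message.

```python
from typing import Any, Optional

PARETO_MODEL = "openrouter/pareto-code"


def pareto_extra_body(model: str, min_coding_score: Any) -> Optional[dict]:
    """Build the extra_body fragment, or None when the score must be dropped."""
    if model != PARETO_MODEL:
        return None  # score is ignored for every other model
    try:
        score = float(min_coding_score)
    except (TypeError, ValueError):
        return None  # unset, empty, or non-numeric
    if not 0.0 <= score <= 1.0:
        return None  # out of range (NaN also fails this comparison)
    return {"plugins": [{"id": "pareto-router", "min_coding_score": score}]}
```

Returning None for invalid input mirrors the "silently dropped" behavior, so
a stale or malformed config value cannot break requests to other models.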