hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Author	SHA1	Message	Date
Tranquil-Flow	ee83a710f0	fix(gateway,cron): activate fallback_model when primary provider auth fails When the primary provider raises AuthError (expired OAuth token, revoked API key), the error was re-raised before AIAgent was created, so fallback_model was never consulted. Now both gateway/run.py and cron/scheduler.py catch AuthError specifically and attempt to resolve credentials from the fallback_providers/fallback_model config chain before propagating the error. Closes #7230	2026-04-24 05:35:43 -07:00
Teknium	852c7f3be3	feat(cron): per-job workdir for project-aware cron runs (#15110 ) Cron jobs can now specify a per-job working directory. When set, the job runs as if launched from that directory: AGENTS.md / CLAUDE.md / .cursorrules from that dir are injected into the system prompt, and the terminal / file / code-exec tools use it as their cwd (via TERMINAL_CWD). When unset, old behaviour is preserved (no project context files, tools use the scheduler's cwd). Requested by @bluthcy. ## Mechanism - cron/jobs.py: create_job / update_job accept 'workdir'; validated to be an absolute existing directory at create/update time. - cron/scheduler.py run_job: if job.workdir is set, point TERMINAL_CWD at it and flip skip_context_files to False before building the agent. Restored in finally on every exit path. - cron/scheduler.py tick: workdir jobs run sequentially (outside the thread pool) because TERMINAL_CWD is process-global. Workdir-less jobs still run in the parallel pool unchanged. - tools/cronjob_tools.py + hermes_cli/cron.py + hermes_cli/main.py: expose 'workdir' via the cronjob tool and 'hermes cron create/edit --workdir ...'. Empty string on edit clears the field. ## Validation - tests/cron/test_cron_workdir.py (21 tests): normalize, create, update, JSON round-trip via cronjob tool, tick partition (workdir jobs run on the main thread, not the pool), run_job env toggle + restore in finally. - Full targeted suite (tests/cron/, test_cronjob_tools.py, test_cron.py, test_config_cwd_bridge.py, test_worktree.py): 314/314 passed. - Live smoke: hermes cron create --workdir $(pwd) works; relative path rejected; list shows 'Workdir:'; edit --workdir '' clears.	2026-04-24 05:07:01 -07:00
Teknium	ef5eaf8d87	feat(cron): honor `hermes tools` config for the cron platform (#14798 ) Cron now resolves its toolset from the same per-platform config the gateway uses — `_get_platform_tools(cfg, 'cron')` — instead of blindly loading every default toolset. Existing cron jobs without a per-job override automatically lose `moa`, `homeassistant`, and `rl` (the `_DEFAULT_OFF_TOOLSETS` set), which stops the "surprise $4.63 mixture_of_agents run" class of bug (Norbert, Discord). Precedence inside `run_job`: 1. per-job `enabled_toolsets` (PR #14767 / #6130) — wins if set 2. `_get_platform_tools(cfg, 'cron')` — new, the blanket gate 3. `None` fallback (legacy) — only on resolver exception Changes: - hermes_cli/platforms.py: register 'cron' with default_toolset 'hermes-cron' - toolsets.py: add 'hermes-cron' toolset (mirrors 'hermes-cli'; `_get_platform_tools` then filters via `_DEFAULT_OFF_TOOLSETS`) - cron/scheduler.py: add `_resolve_cron_enabled_toolsets(job, cfg)`, call it at the `AIAgent(...)` kwargs site - tests/cron/test_scheduler.py: replace the 'None when not set' test (outdated contract) with an invariant ('moa not in default cron toolset') + new per-job-wins precedence test - tests/hermes_cli/test_tools_config.py: mark 'cron' as non-messaging in the gateway-toolset-coverage test	2026-04-23 15:48:50 -07:00
say8hi	0086fd894d	feat(cron): support enabled_toolsets per job to reduce token overhead	2026-04-23 15:16:18 -07:00
Nilesh	22afa066f8	fix(cron): guard against non-dict result from run_conversation When run_conversation returns a non-dict value (e.g. an int under error conditions), the subsequent result.get("final_response", "") raises an opaque "'int' object has no attribute 'get'" AttributeError. Add a type guard that converts this into a clear RuntimeError, which is properly caught by the outer except Exception handler that marks the job as failed and delivers the error message. Fixes NousResearch/hermes-agent#9433 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 02:30:00 -07:00
VTRiot	18e7fd8364	fix(cron): cancel orphan coroutine on delivery timeout before standalone fallback When the live adapter delivery path (_deliver_result) or media send path (_send_media_via_adapter) times out at future.result(timeout=N), the underlying coroutine scheduled via asyncio.run_coroutine_threadsafe can still complete on the event loop, causing a duplicate send after the standalone fallback runs. Cancel the future on TimeoutError before re-raising, so the standalone fallback is the sole delivery path. Adds TestDeliverResultTimeoutCancelsFuture and TestSendMediaTimeoutCancelsFuture.	2026-04-21 05:52:16 -07:00
alt-glitch	28b3f49aaa	refactor: remove remaining redundant local imports (comprehensive sweep) Full AST-based scan of all .py files to find every case where a module or name is imported locally inside a function body but is already available at module level. This is the second pass — the first commit handled the known cases from the lint report; this one catches everything else. Files changed (19): cli.py — 16 removals: time as _time/_t/_tmod (×10), re / re as _re (×2), os as _os, sys, partial os from combo import, from model_tools import get_tool_definitions gateway/run.py — 8 removals: MessageEvent as _ME / MessageType as _MT (×3), os as _os2, MessageEvent+MessageType (×2), Platform, BasePlatformAdapter as _BaseAdapter run_agent.py — 6 removals: get_hermes_home as _ghh, partial (contextlib, os as _os), cleanup_vm, cleanup_browser, set_interrupt as _sif (×2), partial get_toolset_for_tool hermes_cli/main.py — 4 removals: get_hermes_home, time as _time, logging as _log, shutil hermes_cli/config.py — 1 removal: get_hermes_home as _ghome hermes_cli/runtime_provider.py — 1 removal: load_config as _load_bedrock_config hermes_cli/setup.py — 2 removals: importlib.util (×2) hermes_cli/nous_subscription.py — 1 removal: from hermes_cli.config import load_config hermes_cli/tools_config.py — 1 removal: from hermes_cli.config import load_config, save_config cron/scheduler.py — 3 removals: concurrent.futures, json as _json, from hermes_cli.config import load_config batch_runner.py — 1 removal: list_distributions as get_all_dists (kept print_distribution_info, not at top level) tools/send_message_tool.py — 2 removals: import os (×2) tools/skills_tool.py — 1 removal: logging as _logging tools/browser_camofox.py — 1 removal: from hermes_cli.config import load_config tools/image_generation_tool.py — 1 removal: import fal_client environments/tool_context.py — 1 removal: concurrent.futures gateway/platforms/bluebubbles.py — 1 removal: httpx as _httpx gateway/platforms/whatsapp.py — 1 removal: import asyncio tui_gateway/server.py — 2 removals: from datetime import datetime, import time All alias references (_time, _t, _tmod, _re, _os, _os2, _json, _ghh, _ghome, _sif, _ME, _MT, _BaseAdapter, _load_bedrock_config, _httpx, _logging, _log, get_all_dists) updated to use the top-level names.	2026-04-21 00:50:58 -07:00
Teknium	c86915024e	fix(cron): run due jobs in parallel to prevent serial tick starvation (#13021 ) Replaces the serial for-loop in tick() with ThreadPoolExecutor so all jobs due in a single tick run concurrently. A slow job no longer blocks others from executing, fixing silent job skipping (issue #9086). Thread safety: - Session/delivery env vars migrated from os.environ to ContextVars (gateway/session_context.py) so parallel jobs can't clobber each other's delivery targets. Each thread gets its own copied context. - jobs.json read-modify-write cycles (advance_next_run, mark_job_run) protected by threading.Lock to prevent concurrent save clobber. - send_message_tool reads delivery vars via get_session_env() for ContextVar-aware resolution with os.environ fallback. Configuration: - cron.max_parallel_jobs in config.yaml (null = unbounded, 1 = serial) - HERMES_CRON_MAX_PARALLEL env var override Based on PR #9169 by @VenomMoth1. Fixes #9086	2026-04-20 11:53:07 -07:00
Teknium	424e9f36b0	refactor: remove smart_model_routing feature (#12732 ) Smart model routing (auto-routing short/simple turns to a cheap model across providers) was opt-in and disabled by default. This removes the feature wholesale: the routing module, its config keys, docs, tests, and the orchestration scaffolding it required in cli.py / gateway/run.py / cron/scheduler.py. The /fast (Priority Processing / Anthropic fast mode) feature kept its hooks into _resolve_turn_agent_config — those still build a route dict and attach request_overrides when the model supports it; the route now just always uses the session's primary model/provider rather than running prompts through choose_cheap_model_route() first. Also removed: - DEFAULT_CONFIG['smart_model_routing'] block and matching commented-out example sections in hermes_cli/config.py and cli-config.yaml.example - _load_smart_model_routing() / self._smart_model_routing on GatewayRunner - self._smart_model_routing / self._active_agent_route_signature on HermesCLI (signature kept; just no longer initialised through the smart-routing pipeline) - route_label parameter on HermesCLI._init_agent (only set by smart routing; never read elsewhere) - 'Smart Model Routing' section in website/docs/integrations/providers.md - tip in hermes_cli/tips.py - entries in hermes_cli/dump.py + hermes_cli/web_server.py - row in skills/autonomous-ai-agents/hermes-agent/SKILL.md Tests: - Deleted tests/agent/test_smart_model_routing.py - Rewrote tests/agent/test_credential_pool_routing.py to target the simplified _resolve_turn_agent_config directly (preserves credential pool propagation + 429 rotation coverage) - Dropped 'cheap model' test from test_cli_provider_resolution.py - Dropped resolve_turn_route patches from cli + gateway test_fast_command — they now exercise the real method end-to-end - Removed _smart_model_routing stub assignments from gateway/cron test helpers Targeted suites: 74/74 in the directly affected test files; tests/agent + tests/cron + tests/cli pass except 5 failures that already exist on main (cron silent-delivery + alias quick-command).	2026-04-19 18:12:55 -07:00
Teknium	e017131403	feat(cron): add wakeAgent gate — scripts can skip the agent entirely Extends the existing cron script hook with a wake gate ported from nanoclaw #1232. When a cron job's pre-check Python script (already sandboxed to HERMES_HOME/scripts/) writes a JSON line like ```json {"wakeAgent": false} ``` on its last stdout line, `run_job()` returns the SILENT marker and skips the agent entirely — no LLM call, no delivery, no tokens spent. Useful for frequent polls (every 1-5 min) that only need to wake the agent when something has genuinely changed. Any other script output (non-JSON, missing key, non-dict, `wakeAgent: true`, truthy/falsy non-False values) behaves as before: stdout is injected as context and the agent runs normally. Strict `False` is required to skip — avoids accidental gating from arbitrary JSON. Refactor: - New pure helper `_parse_wake_gate(script_output)` in cron/scheduler.py - `_build_job_prompt` accepts optional `prerun_script` tuple so the script runs exactly once per job (run_job runs it for the gate check, reuses the output for prompt injection) - `run_job` short-circuits with SILENT_MARKER when gate fires Script failures (success=False) still cannot trigger the gate — the failure is reported as context to the agent as before. This replaces the approach in closed PR #3837, which inlined bash scripts via tempfile and lost the path-traversal/scripts-dir sandbox that main's impl has. The wake-gate idea (the one net-new capability) is ported on top of the existing sandboxed Python-script model. Tests: - 11 pure unit tests for _parse_wake_gate (empty, whitespace, non-JSON, non-dict JSON, missing key, truthy/falsy non-False, multi-line, trailing blanks, non-last-line JSON) - 5 integration tests for run_job wake-gate (skip returns SILENT, wake-true passes through, script-runs-only-once, script failure doesn't gate, no-script regression) - Full tests/cron/ suite: 194/194 pass	2026-04-19 01:42:35 -07:00
Teknium	762f7e9796	feat: configurable approval mode for cron jobs (approvals.cron_mode) Add approvals.cron_mode config option that controls how cron jobs handle dangerous commands. Previously, cron jobs silently auto-approved all dangerous commands because there was no user present to approve them. Now the behavior is configurable: - deny (default): block dangerous commands and return a message telling the agent to find an alternative approach. The agent loop continues — it just can't use that specific command. - approve: auto-approve all dangerous commands (previous behavior). When a command is blocked, the agent receives the same response format as a user denial in the CLI — exit_code=-1, status=blocked, with a message explaining why and pointing to the config option. This keeps the agent loop running and encourages it to adapt. Implementation: - config.py: add approvals.cron_mode to DEFAULT_CONFIG - scheduler.py: set HERMES_CRON_SESSION=1 env var before agent runs - approval.py: both check_command_approval() and check_all_command_guards() now check for cron sessions and apply the configured mode - 21 new tests covering config parsing, deny/approve behavior, and interaction with other bypass mechanisms (yolo, containers)	2026-04-18 19:24:35 -07:00
Teknium	d2206c69cc	fix(qqbot): add back-compat for env var rename; drop qrcode core dep Follow-up to WideLee's salvaged PR #11582. Back-compat for QQ_HOME_CHANNEL → QQBOT_HOME_CHANNEL rename: - gateway/config.py reads QQBOT_HOME_CHANNEL, falls back to QQ_HOME_CHANNEL with a one-shot deprecation warning so users on the old name aren't silently broken. - cron/scheduler.py: _HOME_TARGET_ENV_VARS['qqbot'] now maps to the new name; _get_home_target_chat_id falls back to the legacy name via a _LEGACY_HOME_TARGET_ENV_VARS table. - hermes_cli/status.py + hermes_cli/setup.py: honor both names when displaying or checking for missing home channels. - hermes_cli/config.py: keep legacy QQ_HOME_CHANNEL[_NAME] in _EXTRA_ENV_KEYS so .env sanitization still recognizes them. Scope cleanup: - Drop qrcode from core dependencies and requirements.txt (remains in messaging/dingtalk/feishu extras). _qqbot_render_qr already degrades gracefully when qrcode is missing, printing a 'pip install qrcode' tip and falling back to URL-only display. - Restore @staticmethod on QQAdapter._detect_message_type (it doesn't use self). Revert the test change that was only needed when it was converted to an instance method. - Reset uv.lock to origin/main; the PR's stale lock also included unrelated changes (atroposlib source URL, hermes-agent version bump, fastapi additions) that don't belong. Verified E2E: - Existing user (QQ_HOME_CHANNEL set): gateway + cron both pick up the legacy name; deprecation warning logs once. - Fresh user (QQBOT_HOME_CHANNEL set): gateway + cron use new name, no warning. - Both set: new name wins on both surfaces. Targeted tests: 296 passed, 4 skipped (qqbot + cron + hermes_cli).	2026-04-17 15:31:14 -07:00
Teknium	f64241ed90	feat(cron+tests): extend origin fallback to email/dingtalk/qqbot + fix Weixin test mocks Cron origin fallback extension (builds on #9193's _HOME_TARGET_ENV_VARS): adds the three remaining origin-fallback-eligible platforms that have home channel env vars configured in gateway/config.py but use non-generic env var names: - email → EMAIL_HOME_ADDRESS (non-standard suffix) - dingtalk → DINGTALK_HOME_CHANNEL - qqbot → QQ_HOME_CHANNEL (non-standard prefix: QQ_ not QQBOT_) Picks up the completeness intent of @Xowiek's PR #11317 using the architecturally-correct dict-based lookup from #9193, so platforms with non-standard env var names actually resolve instead of silently missing. Extended the parametrized regression test to cover the new three. Weixin test mock alignment (builds on #10091's _send_session split): Three test sites added in Batch 1 (TestWeixinSendImageFileParameterName) and Batch 3 (TestWeixinVoiceSending) mocked only adapter._session, but #10091 switched the send paths to check self._send_session. Added the companion setter so the tests stay green with the session split in place.	2026-04-17 06:26:43 -07:00
bde3249023	b46db048c3	fix(cron): align home target env lookup	2026-04-17 06:26:43 -07:00
bde3249023	f696b4745a	fix(cron): restore origin fallback for feishu home channels	2026-04-17 06:26:43 -07:00
Ubuntu	5ca52bae5b	fix(gateway/weixin): split poll/send sessions, reuse live adapter for cron & send_message - gateway/platforms/weixin.py: - Split aiohttp.ClientSession into _poll_session and _send_session - Add _LIVE_ADAPTERS registry so send_weixin_direct() reuses the connected gateway adapter instead of creating a competing session - Fixes silent message loss when gateway is running (iLink token contention) - cron/scheduler.py: - Support comma-separated deliver values (e.g. 'feishu,weixin') for multi-target delivery - Delay pconfig/enabled check until standalone fallback so live adapters work even when platform is not in gateway config - tools/send_message_tool.py: - Synthesize PlatformConfig from WEIXIN_* env vars when gateway config lacks a weixin entry - Fall back to WEIXIN_HOME_CHANNEL env var for home channel resolution - tests/gateway/test_weixin.py: - Update mocks to include _send_session	2026-04-17 06:26:43 -07:00
Billard	e9b3b8e820	fix(cron): treat empty agent response as error in last_status (fixes #8585 ) When a cron job's agent run completes but produces an empty final_response (e.g. API 404 from invalid model name), the scheduler now marks last_status as "error" instead of "ok", so the failure is visible in job listings. Previously, any run that didn't raise an exception was marked "ok" regardless of whether the agent actually produced output.	2026-04-16 06:49:57 -07:00
Mil Wang (from Dev Box)	f9714161f0	fix: stop leaking '(No response generated)' placeholder to users and cron targets When the LLM returns an empty completion, gateway/run.py replaced final_response with the literal string '(No response generated)'. This defeated cron/scheduler.py's empty-response skip guard, causing the placeholder to be delivered to home channels. Changes: - gateway/run.py: return empty string instead of placeholder when there is no error and no response content - cron/scheduler.py: defensively strip the placeholder text in case any upstream path still produces it Fixes NousResearch/hermes-agent#9270	2026-04-16 06:10:40 -07:00
helix4u	aa398ad655	fix(cron): preserve skill env passthrough in worker thread	2026-04-15 11:03:49 -07:00
Teknium	1c4d3216d3	fix(cron): include job_id in delivery and guide models on removal workflow (#10242 ) * fix(gateway): suppress duplicate replies on interrupt and streaming flood control Three fixes for the duplicate reply bug affecting all gateway platforms: 1. base.py: Suppress stale response when the session was interrupted by a new message that hasn't been consumed yet. Checks both interrupt_event and _pending_messages to avoid false positives. (#8221, #2483) 2. run.py (return path): Remove response_previewed guard from already_sent check. Stream consumer's already_sent alone is authoritative — if content was delivered via streaming, the duplicate send must be suppressed regardless of the agent's response_previewed flag. (#8375) 3. run.py (queued-message path): Same fix — already_sent without response_previewed now correctly marks the first response as already streamed, preventing re-send before processing the queued message. The response_previewed field is still produced by the agent (run_agent.py) but is no longer required as a gate for duplicate suppression. The stream consumer's already_sent flag is the delivery-level truth about what the user actually saw. Concepts from PR #8380 (konsisumer). Closes #8375, #8221, #2483. * fix(cron): include job_id in delivery and guide models on removal workflow Users reported cron reminders keep firing after asking the agent to stop. Root cause: the conversational agent didn't know the job_id (not in delivery) and models don't reliably do the list→remove two-step without guidance. 1. Include job_id in the cron delivery wrapper so users and agents can reference it when requesting removal. 2. Replace confusing footer ('The agent cannot see this message') with actionable guidance ('To stop or manage this job, send me a new message'). 3. Add explicit list→remove guidance in the cronjob tool schema so models know to list first and never guess job IDs.	2026-04-15 03:46:58 -07:00
walli	884cd920d4	feat(gateway): unify QQBot branding, add PLATFORM_HINTS, fix streaming, restore missing setup functions - Rename platform from 'qq' to 'qqbot' across all integration points (Platform enum, toolset, config keys, import paths, file rename qq.py → qqbot.py) - Add PLATFORM_HINTS for QQBot in prompt_builder (QQ supports markdown) - Set SUPPORTS_MESSAGE_EDITING = False to skip streaming on QQ (prevents duplicate messages from non-editable partial + final sends) - Add _send_qqbot() standalone send function for cron/send_message tool - Add interactive _setup_qq() wizard in hermes_cli/setup.py - Restore missing _setup_signal/email/sms/dingtalk/feishu/wecom/wecom_callback functions that were lost during the original merge	2026-04-14 00:11:49 -07:00
Junjun Zhang	87bfc28e70	feat: add QQ Bot platform adapter (Official API v2) Add full QQ Bot integration via the Official QQ Bot API (v2): - WebSocket gateway for inbound events (C2C, group, guild, DM) - REST API for outbound text/markdown/media messages - Voice transcription (Tencent ASR + configurable STT provider) - Attachment processing (images, voice, files) - User authorization (allowlist + allow-all + DM pairing) Integration points: - gateway: Platform.QQ enum, adapter factory, allowlist maps - CLI: setup wizard, gateway config, status display, tools config - tools: send_message cross-platform routing, toolsets - cron: delivery platform support - docs: QQ Bot setup guide	2026-04-14 00:11:49 -07:00
Teknium	8e00b3a69e	fix(cron): steer model away from explicit deliver targets that lose topic context (#8187 ) Rewrite the cronjob tool's 'deliver' parameter description to strongly guide models toward omitting the parameter (which auto-detects origin including thread/topic). The previous description listed all platform names equally, inviting models to construct explicit targets like 'telegram:<chat_id>' which silently drops the thread_id. New description: - Leads with 'Omit this parameter' as the recommended path - Explicitly warns that platform:chat_id without :thread_id loses topics - Removes the long flat list of platform names that invited construction Also adds diagnostic logging at two key points: - _origin_from_env(): logs when thread_id is captured during job creation - _deliver_result(): warns when origin has thread_id but delivery target lost it; logs at debug when delivering to a specific thread Helps diagnose user-reported issue where cron responses from Telegram topics are delivered to the main chat instead of the originating topic.	2026-04-11 23:20:39 -07:00
Teknium	1ca9b19750	feat: add network.force_ipv4 config to fix IPv6 timeout issues (#8196 ) On servers with broken or unreachable IPv6, Python's socket.getaddrinfo returns AAAA records first. urllib/httpx/requests all try IPv6 connections first and hang for the full TCP timeout before falling back to IPv4. This affects web_extract, web_search, the OpenAI SDK, and all HTTP tools. Adds network.force_ipv4 config option (default: false) that monkey-patches socket.getaddrinfo to resolve as AF_INET when the caller didn't specify a family. Falls back to full resolution if no A record exists, so pure-IPv6 hosts still work. Applied early at all three entry points (CLI, gateway, cron scheduler) before any HTTP clients are created. Reported by user @29n — Chinese Ubuntu server with unreachable IPv6 causing timeouts on lobste.rs and other IPv6-enabled sites while Google/GitHub worked fine (IPv4-only resolution).	2026-04-11 23:12:11 -07:00
chqchshj	5f0caf54d6	feat(gateway): add WeCom callback-mode adapter for self-built apps Add a second WeCom integration mode for regular enterprise self-built applications. Unlike the existing bot/websocket adapter (wecom.py), this handles WeCom's standard callback flow: WeCom POSTs encrypted XML to an HTTP endpoint, the adapter decrypts, queues for the agent, and immediately acknowledges. The agent's reply is delivered proactively via the message/send API. Key design choice: always acknowledge immediately and use proactive send — agent sessions take 3-30 minutes, so the 5-second inline reply window is never useful. The original PR's Future/pending-reply machinery was removed in favour of this simpler architecture. Features: - AES-CBC encrypt/decrypt (BizMsgCrypt-compatible) - Multi-app routing scoped by corp_id:user_id - Legacy bare user_id fallback for backward compat - Access-token management with auto-refresh - WECOM_CALLBACK_* env var overrides - Port-in-use pre-check before binding - Health endpoint at /health Salvaged from PR #7774 by @chqchshj. Simplified by removing the inline reply Future system and fixing: secrets.choice for nonce generation, immediate plain-text acknowledgment (not encrypted XML containing 'success'), and initial token refresh error handling.	2026-04-11 15:22:49 -07:00
Teknium	b53f681993	fix(cron): pass skip_context_files=True to AIAgent in run_job (#7958 ) Cron jobs run from whatever directory the scheduler process lives in (typically the hermes-agent install dir), so without this flag the agent picks up AGENTS.md, SOUL.md, or .cursorrules from that cwd — injecting irrelevant project context into the cron job's system prompt. batch_runner.py and gateway boot_md already pass skip_context_files=True for the same reason. This aligns cron with the established pattern for autonomous/headless agent runs.	2026-04-11 14:48:58 -07:00
aaronagent	307697688e	fix: prevent zombie processes, redact cron stderr, skip symlinks in skill enumeration process_registry.py: _reader_loop() has process.wait() after the try-except block (line 380). If the reader thread crashes with an unexpected exception (e.g. MemoryError, KeyboardInterrupt), control exits the except handler but skips wait() — leaving the child as a zombie process. Move wait() and the cleanup into a finally block so the child is always reaped. cron/scheduler.py: _run_job_script() only redacts secrets in stdout on the SUCCESS path (line 417-421). When a cron script fails (non-zero exit), both stdout and stderr are returned WITHOUT redaction (lines 407-413). A script that accidentally prints an API key to stderr during a failure would leak it into the LLM context. Move redaction before the success/failure branch so both paths benefit. skill_commands.py: _build_skill_message() enumerates supporting files using rglob("*") but only checks is_file() (line 171) without filtering symlinks. PR #6693 added symlink protection to scan_skill_commands() but missed this function. A malicious skill can create symlinks in references/ pointing to arbitrary files, exposing their paths (and potentially content via skill_view) to the LLM. Add is_symlink() check to match the guard in scan_skill_commands. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 02:03:20 -07:00
coffee	c5ab760528	fix(cron): missing field init, unnecessary save, and shutdown cleanup 1. Add missing `last_delivery_error` field initialization in `create_job()`. `mark_job_run()` sets this field on line 596 but it was never initialized, causing inconsistent job schemas between new and executed jobs. 2. Replace unnecessary `save_jobs()` call with a warning log when `mark_job_run()` is called with a non-existent job_id. Previously the function would silently write unchanged data to disk. 3. Add `cancel_futures=True` to the `finally` block in cron scheduler's thread pool shutdown. The `except` path already passes this flag but the normal exit path did not, leaving futures running after inactivity timeout detection.	2026-04-10 16:51:35 -07:00
Teknium	be4f049f46	fix: salvage follow-ups for Weixin adapter (#6747 ) - Remove sys.path.insert hack (leftover from standalone dev) - Add token lock (acquire_scoped_lock/release_scoped_lock) in connect()/disconnect() to prevent duplicate pollers across profiles - Fix get_connected_platforms: WEIXIN check must precede generic token/api_key check (requires both token AND account_id) - Add WEIXIN_HOME_CHANNEL_NAME to _EXTRA_ENV_KEYS - Add gateway setup wizard with QR login flow - Add platform status check for partially configured state - Add weixin.md docs page with full adapter documentation - Update environment-variables.md reference with all 11 env vars - Update sidebars.ts to include weixin docs page - Wire all gateway integration points onto current main Salvaged from PR #6747 by Zihan Huang.	2026-04-10 05:54:37 -07:00
Dominic Grieco	38cce22e2c	fix: harden cron script timeout and provider recovery	2026-04-10 02:57:57 -07:00
Teknium	d97f6cec7f	feat(gateway): add BlueBubbles iMessage platform adapter (#6437 ) Adds Apple iMessage as a gateway platform via BlueBubbles macOS server. Architecture: - Webhook-based inbound (event-driven, no polling/dedup needed) - Email/phone → chat GUID resolution for user-friendly addressing - Private API safety (checks helper_connected before tapback/typing) - Inbound attachment downloading (images, audio, documents cached locally) - Markdown stripping for clean iMessage delivery - Smart progress suppression for platforms without message editing Based on PR #5869 by @benjaminsehl (webhook architecture, GUID resolution, Private API safety, progress suppression) with inbound attachment downloading from PR #4588 by @1960697431 (attachment cache routing). Integration points: Platform enum, env config, adapter factory, auth maps, cron delivery, send_message routing, channel directory, platform hints, toolset definition, setup wizard, status display. 27 tests covering config, adapter, webhook parsing, GUID resolution, attachment download routing, toolset consistency, and prompt hints.	2026-04-08 23:54:03 -07:00
Teknium	30ea423ce8	fix: unify reasoning_effort to config.yaml only, remove HERMES_REASONING_EFFORT env var Gateway and cron had inconsistent reasoning_effort resolution: - CLI: config.yaml only (correct) - Gateway: config.yaml first, env var fallback - Cron: env var first, config.yaml fallback All three now read exclusively from agent.reasoning_effort in config.yaml. Removed HERMES_REASONING_EFFORT env var support entirely — .env is for secrets only, not behavioral config.	2026-04-08 03:36:44 -07:00
Teknium	fff237e111	feat(cron): track delivery failures in job status (#6042 ) _deliver_result() now returns Optional[str] — None on success, error message on failure. All failure paths (unknown platform, platform disabled, config load error, send failure, unresolvable target) return descriptive error strings. mark_job_run() gains delivery_error param, tracked as last_delivery_error on the job — separate from agent execution errors. A job where the agent succeeded but delivery failed shows last_status='ok' + last_delivery_error='...'. The cronjob list tool now surfaces last_delivery_error so agents and users can see when cron outputs aren't arriving. Inspired by PR #5863 (oxngon) — reimplemented with proper wiring. Tests: 3 new mark_job_run tests + 6 new _deliver_result return tests.	2026-04-07 22:49:01 -07:00
kshitijk4poor	0f3895ba29	fix(cron): deliver MEDIA files as native platform attachments The cron delivery path sent raw 'MEDIA:/path/to/file' text instead of uploading the file as a native attachment. The standalone path (via _send_to_platform) already extracted MEDIA tags and forwarded them as media_files, but the live adapter path passed the unprocessed delivery_content directly to adapter.send(). Two bugs fixed: 1. Live adapter path now sends cleaned text (MEDIA tags stripped) instead of raw content — prevents 'MEDIA:/path' from appearing as literal text in Discord/Telegram/etc. 2. Live adapter path now sends each extracted media file via the adapter's native method (send_voice for audio, send_image_file for images, send_video for video, send_document as fallback) — files are uploaded as proper platform attachments. The file-type routing mirrors BasePlatformAdapter._process_message_background to ensure consistent behavior between normal gateway responses and cron-delivered responses. Adds 2 tests: - test_live_adapter_sends_media_as_attachments: verifies Discord adapter receives send_voice call for .mp3 file - test_live_adapter_sends_cleaned_text_not_raw: verifies MEDIA tag stripped from text sent via live adapter	2026-04-07 12:49:25 -07:00
Teknium	d0ffb111c2	refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821 ) Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture) and manual analysis of the entire codebase. Changes by category: Unused imports removed (~95 across 55 files): - Removed genuinely unused imports from all major subsystems - agent/, hermes_cli/, tools/, gateway/, plugins/, cron/ - Includes imports in try/except blocks that were truly unused (vs availability checks which were left alone) Unused variables removed (~25): - Removed dead variables: connected, inner, channels, last_exc, source, new_server_names, verify, pconfig, default_terminal, result, pending_handled, temperature, loop - Dropped unused argparse subparser assignments in hermes_cli/main.py (12 instances of add_parser() where result was never used) Dead code removed: - run_agent.py: Removed dead ternary (None if False else None) and surrounding unreachable branch in identity fallback - run_agent.py: Removed write-only attribute _last_reported_tool - hermes_cli/providers.py: Removed dead @property decorator on module-level function (decorator has no effect outside a class) - gateway/run.py: Removed unused MCP config load before reconnect - gateway/platforms/slack.py: Removed dead SessionSource construction Undefined name bugs fixed (would cause NameError at runtime): - batch_runner.py: Added missing logger = logging.getLogger(__name__) - tools/environments/daytona.py: Added missing Dict and Path imports Unnecessary global statements removed (14): - tools/terminal_tool.py: 5 functions declared global for dicts they only mutated via .pop()/[key]=value (no rebinding) - tools/browser_tool.py: cleanup thread loop only reads flag - tools/rl_training_tool.py: 4 functions only do dict mutations - tools/mcp_oauth.py: only reads the global - hermes_time.py: only reads cached values Inefficient patterns fixed: - startswith/endswith tuple form: 15 instances of x.startswith('a') or x.startswith('b') consolidated to x.startswith(('a', 'b')) - len(x)==0 / len(x)>0: 13 instances replaced with pythonic truthiness checks (not x / bool(x)) - in dict.keys(): 5 instances simplified to in dict - Redefined unused name: removed duplicate _strip_mdv2 import in send_message_tool.py Other fixes: - hermes_cli/doctor.py: Replaced undefined logger.debug() with pass - hermes_cli/config.py: Consolidated chained .endswith() calls Test results: 3934 passed, 17 failed (all pre-existing on main), 19 skipped. Zero regressions.	2026-04-07 10:25:31 -07:00
Myeongwon Choi	ea16949422	fix(cron): suppress delivery when [SILENT] appears anywhere in response Previously the scheduler checked startswith('[SILENT]'), so agents that appended [SILENT] after an explanation (e.g. 'N items filtered.\n\n[SILENT]') would still trigger delivery. Change the check to 'in' so the marker is caught regardless of position. Add test_silent_trailing_suppresses_delivery to cover this case.	2026-04-06 16:49:40 -07:00
Awsh1	878b1d3d33	fix(cron): harden scheduler against path traversal and env leaks Cherry-picked from PR #5503 by Awsh1. - Validate ALL script paths (absolute, relative, tilde) against scripts_dir boundary - Add API-boundary validation in cronjob_tools.py - Move os.environ injections inside try block so finally cleanup always runs - Comprehensive regression tests for path containment bypass	2026-04-06 12:42:16 -07:00
Teknium	3d08a2fa1b	fix: extract MEDIA: tags from cron delivery before sending (#5598 ) The cron scheduler delivery path passed raw text including MEDIA: tags to _send_to_platform(), so media attachments were delivered as literal text instead of actual files. The send function already supports media_files= but the cron path never used it. Now calls BasePlatformAdapter.extract_media() to split media paths from text before sending, matching the gateway's normal message flow. Salvaged from PR #4877 by robert-hoffmann.	2026-04-06 11:42:44 -07:00
Teknium	89db3aeb2c	fix(cron): add delivery guidance to cron prompt — stop send_message thrashing (#5444 ) Cron agents were burning iterations trying to use send_message (which is disabled via messaging toolset) because their prompts said things like 'send the report to Telegram'. The scheduler handles delivery automatically via the deliver setting, but nothing told the agent that. Add a delivery guidance hint to _build_job_prompt alongside the existing [SILENT] hint: tells agents their final response is auto-delivered and they should NOT use send_message. Before: only [SILENT] suppression hint After: delivery guidance ('do NOT use send_message') + [SILENT] hint	2026-04-05 23:58:45 -07:00
Teknium	d6ef7fdf92	fix(cron): replace wall-clock timeout with inactivity-based timeout (#5440 ) Port the gateway's inactivity-based timeout pattern (PR #5389) to the cron scheduler. The agent can now run for hours if it's actively calling tools or receiving stream tokens — only genuine inactivity (no activity for HERMES_CRON_TIMEOUT seconds, default 600s) triggers a timeout. This fixes the Sunday PR scouts (openclaw, nanoclaw, ironclaw) which all hit the hard 600s wall-clock limit while actively working. Changes: - Replace flat future.result(timeout=N) with a polling loop that checks agent.get_activity_summary() every 5s (same pattern as gateway) - Timeout error now includes diagnostic info: last activity description, idle duration, current tool, iteration count - HERMES_CRON_TIMEOUT=0 means unlimited (no timeout) - Move sys.path.insert before repo-level imports to fix ModuleNotFoundError for hermes_time on stale gateway processes - Add time import needed by the polling loop - Add 9 tests covering active/idle/unlimited/env-var/diagnostic scenarios	2026-04-05 23:49:42 -07:00
Teknium	567bc79948	fix: clean up cron platform allowlist — add homeassistant, fix import, improve placement Follow-up for cherry-picked #5118 commits: - Remove duplicate 'import subprocess' - Move _KNOWN_DELIVERY_PLATFORMS to module-level (after imports) - Add 'homeassistant' to allowlist (existing platform missing from original PR) - Remove trailing whitespace	2026-04-05 12:31:27 -07:00
Maymun	71a4582bf8	fix(security): hoist platform allowlist to module scope as frozenset	2026-04-05 12:31:27 -07:00
Maymun	1ebc932417	fix(security): validate cron deliver platform name to prevent env var enumeration	2026-04-05 12:31:27 -07:00
Damian P	afccbf253c	fix: resolve listed messaging targets consistently	2026-04-05 11:59:28 -07:00
memosr	7f853ba7b6	fix: use logger.exception to preserve traceback in logs and drop unused import	2026-04-05 11:59:28 -07:00
memosr	5ff514ec79	fix(security): remove full traceback from cron error output to prevent info leakage	2026-04-05 11:59:28 -07:00
Teknium	c100ad874c	fix(matrix): E2EE cron delivery via live adapter + HTML formatting + origin fallback Salvaged from PRs #3767 (chalkers), #5236 (ygd58), #2641 (buntingszn). Three improvements to Matrix cron delivery: 1. Live adapter path: when the gateway is running, cron delivery now uses the connected MatrixAdapter via run_coroutine_threadsafe instead of the standalone HTTP PUT. This enables delivery to E2EE rooms where the raw HTTP path cannot encrypt. Falls back to standalone on failure. Threads adapters + event loop from gateway -> cron ticker -> tick() -> _deliver_result(). (from #3767) 2. HTML formatted_body: _send_matrix() now converts markdown to HTML using the optional markdown library, with h1-h6 to bold conversion for Element X compatibility. Falls back to plain text if markdown is not installed. Also adds random bytes to txn_id to prevent collisions. (from #5236) 3. Origin fallback: when deliver="origin" but origin is null (jobs created via API/scripts), falls back to HOME_CHANNEL env vars in order: matrix -> telegram -> discord -> slack. (from #2641)	2026-04-05 11:07:47 -07:00
memosr	931624feda	fix(security): guard cron script against path traversal and redact output Relative script paths resolved against HERMES_HOME/scripts/ were not validated to stay within that directory. Paths like '../../etc/passwd' could escape and be executed as Python. Fix: resolve the path and verify it stays within scripts_dir using Path.relative_to(). Also apply redact_sensitive_text() to script stdout before LLM injection — same pattern as execute_code sandbox output. Cherry-picked from PR #5093 by memosr (fixes 1 and 3; absolute path restriction dropped as too restrictive for the feature's design intent).	2026-04-04 17:01:11 -07:00
Teknium	5d0f55cac4	feat(cron): add script field for pre-run data collection (#5082 ) Add an optional 'script' parameter to cron jobs that references a Python script. The script runs before each agent turn, and its stdout is injected into the prompt as context. This enables stateful monitoring — the script handles data collection and change detection, the LLM analyzes and reports. - cron/jobs.py: add script field to create_job(), stored in job dict - cron/scheduler.py: add _run_job_script() executor with timeout handling, inject script output/errors into _build_job_prompt() - tools/cronjob_tools.py: add script to tool schema, create/update handlers, _format_job display - hermes_cli/cron.py: add --script to create/edit, display in list/edit output - hermes_cli/main.py: add --script argparse for cron create/edit subcommands - tests/cron/test_cron_script.py: 20 tests covering job CRUD, script execution, path resolution, error handling, prompt injection, tool API Script paths can be absolute or relative (resolved against ~/.hermes/scripts/). Scripts run with a 120s timeout. Failures are injected as error context so the LLM can report the problem. Empty string clears an attached script.	2026-04-04 10:43:39 -07:00
kshitijk4poor	0ed28ab80c	refactor: simplify and harden PR fixes after review - Fix cron ThreadPoolExecutor blocking on timeout: use shutdown(wait=False, cancel_futures=True) instead of context manager that waits indefinitely - Extract _dequeue_pending_text() to deduplicate media-placeholder logic in interrupt and normal-completion dequeue paths - Remove hasattr guards for _running_agents_ts: add class-level default so partial test construction works without scattered defensive checks - Move `import concurrent.futures` to top of cron/scheduler.py - Progress throttle: sleep remaining interval instead of busy-looping 0.1s (~15 wakeups per 1.5s window → 1 wakeup) - Deduplicate _load_stt_config() in transcription_tools.py: _has_openai_audio_backend() now delegates to _resolve_openai_audio_client_config()	2026-04-03 00:50:17 -07:00

1 2 3

113 commits