hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-09 08:21:50 +00:00

Author	SHA1	Message	Date
Teknium	3705625b74	feat(gateway): render terminal commands as bare fenced code blocks in chat (#42576) Terminal tool progress on markdown-capable gateways (Telegram, Slack, Discord, WhatsApp, Matrix, Weixin, Feishu) renders the full command in a fenced code block again, in all/new AND verbose modes — gated on the adapter's supports_code_blocks capability. Plain-text platforms keep the short truncated preview. No language tag is emitted: Slack mrkdwn renders a '```bash' fence with 'bash' as a literal first code line, so a bare '```' fence is used, which renders correctly on every platform that supports blocks. This restores the #41215 feature (removed in #41950 due to the command showing in group chats) as the default. For a personal assistant the command display is desired; the group-chat concern is a preference, not a vulnerability.	2026-06-08 21:19:05 -07:00
helix4u	b23184cad4	fix(api-server): bind request session context for tools	2026-06-08 20:52:08 -07:00
ruangraung	f4531feee8	fix(telegram): improve MarkdownV2 edit fallback and fix _strip_mdv2 bold handling When edit_message(finalize=True) fails with a MarkdownV2 parse error, the silent fallback previously sent raw content with escape sequences. Now it logs the error and strips markdown formatting via _strip_mdv2() for clean plain-text fallback. Also fixes _strip_mdv2 to handle standard markdown bold (**text**) before MarkdownV2 bold (*text*), preventing half-stripped asterisks. Refs: #41955, #41732	2026-06-08 15:53:16 -07:00
ruangraung	6d2732e786	fix(gateway): apply MarkdownV2 formatting on progress message edits When a platform adapter sets REQUIRES_EDIT_FINALIZE=True (e.g. TelegramAdapter), tool progress edits now pass finalize=True so format_message() is applied before sending to the platform. Previously, the initial send() formatted the message correctly via MarkdownV2, but subsequent edit_message() calls skipped formatting (finalize=False), causing raw markdown (e.g. triple backticks for bash code blocks) to render as plain text on Telegram. Refs: #41955, #41732	2026-06-08 15:53:16 -07:00
GodsBoy	421226e404	fix(gateway): stop terminal progress from posting the full command to messaging chats #41215 rendered a terminal tool call as a native ```bash fenced block on markdown platforms (Telegram, WhatsApp, Slack, and others), showing the full command with no truncation, in both all/new and verbose modes. That posted complete shell commands (heredocs, internal paths, destructive commands) into the chat before the final answer, visible to everyone in it. This restores the prior behavior: terminal progress shows the short, truncated preview line that every other tool already uses, capped at tool_preview_length. The supports_code_blocks capability flag is left in place for future use. CLI/TUI rendering is a separate path and was unaffected. Adds a regression test asserting terminal progress renders as a truncated preview, not a fenced bash block, even on a markdown-capable gateway. Fixes #41955	2026-06-08 15:53:00 -07:00
Robin Fernandes	639c1e3636	feat(sessions): add optional max session cap	2026-06-08 15:12:12 -07:00
liuhao1024	8e4c447e5f	fix(gateway): prevent duplicate user messages in state.db When the agent has its own SessionDB reference (_session_db is not None), _flush_messages_to_session_db() persists user messages to SQLite during the agent run. Two gateway fallback paths also wrote the same user message without skip_db=True, creating duplicate entries in state.db: 1. agent_failed_early path (transient 429/timeout failures) 2. not-new-messages path (history_offset >= len(messages) edge case) Move agent_persisted flag definition to before the if/elif/else block so all paths can use it, and pass skip_db=agent_persisted to every fallback append_to_transcript() call. Fixes #42039	2026-06-08 11:29:53 -07:00
teknium1	a706a349b5	refactor(gateway): extract authorization cluster into GatewayAuthorizationMixin (god-file Phase 3) Lift the 4 inbound-message authorization methods out of GatewayRunner into gateway/authz_mixin.py:GatewayAuthorizationMixin. Behavior-neutral; gateway/run.py 16200 -> 15812 LOC. Methods moved (~389 LOC): _is_user_authorized, _get_unauthorized_dm_behavior, _adapter_dm_policy, _adapter_enforces_own_access_policy. The two adapter-policy helpers are private to _is_user_authorized, so the cluster is fully self-contained (zero outside-cluster self.method calls after the lift). All self.* calls resolve unchanged via the MRO (GatewayRunner(GatewayAuthorizationMixin, ...)). Import split: 6 neutral deps (os, Optional, Platform, SessionSource, the two whatsapp_identity helpers) at the mixin module top; the module-level logger is imported lazily inside _is_user_authorized (from gateway.run import logger) so the mixin never imports gateway.run at module scope -> no cycle. The lazy import preserves the exact logger name (gateway.run) so log records are unchanged.	2026-06-08 09:42:02 -07:00
Teknium	e9c1e757fe	fix(gateway): release evicted agent clients to stop RSS leak (#29298) (#41974) _evict_cached_agent (the chokepoint for /new, /model, /undo, session resets — 17 call sites) only popped the cache entry, dropping the AIAgent reference without releasing its httpx client pool. AIAgent holds reference cycles (callbacks, tool state) so CPython refcounting does not free the client promptly; under steady gateway traffic the held sockets + buffers accumulate and RSS climbs (the leak class behind Now the chokepoint pops AND schedules a soft release_clients() on a daemon thread (mirrors the cap-enforcer / idle-sweeper). Soft release frees the client pool + per-turn child subagents but preserves the session's terminal sandbox / browser / bg processes for resumption. Mid-turn agents are skipped so a running request is never torn down. Also fixes the no-lock branch which previously never popped at all.	2026-06-08 06:44:51 -07:00
Michael Steuer	3d029a53ec	fix(gateway): close residual memory-leak sites under heavy scheduled workload Long-lived gateways under heavy cron/build workloads grow steadily (~18 MB/hr post-phantom-dispatch-fix) and eventually need a restart-or-OOM. Four retention sites, all confirmed live on current main: 1. _evict_cached_agent() (/model, /reasoning, codex-runtime, /undo, etc.) popped the cache entry without releasing the agent's OpenAI client, httpx transport, SSL context, or conversation history. Only /new cleaned up first. Now releases clients on a daemon thread, matching _enforce_agent_cache_cap. 2. _release_evicted_agent_soft() now clears _session_messages after release_clients() — tool outputs (file reads, terminal output, search results) can be tens of MB per 100+-tool-call session; the list is rebuilt from persisted session JSON on resume, so dropping it on soft eviction is safe. 3. The session-expiry watcher (permanent finalization) now drops the session's per-session control dicts (_session_model_overrides, _session_reasoning_overrides, _pending_approvals, _update_prompt_pending, _pending_model_notes). These leaked one entry per session per gateway lifetime. NOTE: this is the session-finalize path, NOT idle agent-cache eviction — an idle-evicted session is still alive and rebuilds its agent from these overrides, so pruning them there would silently reset a user's /model choice. 4. _tool_defs_cache is now bounded (_TOOL_DEFS_CACHE_MAX=8) with oldest-first eviction instead of growing unboundedly across the distinct toolset/config fingerprints a gateway sees over its lifetime. Salvaged from #25318 by Michael Steuer (@mssteuer); fix 3 redirected from the idle-sweep to the session-finalize lifecycle, magic number 8 lifted to a named constant, test ported. Fixes #19251 Co-authored-by: Michael Steuer <michael@make.software>	2026-06-08 06:32:42 -07:00
Kristian Vastveit	d55304c39f	fix(gateway): transcribe voice messages during active agent runs Salvaged from #6600 (@kristianvast) — re-scoped to the voice half only and rebased onto current main. The cascading-interrupt hang half of the original PR landed independently in `dd0d1222a`, so this carries ONLY Problem 1. When a voice/audio message arrives while the agent is busy on the same session, it hit the interrupt path with empty text because STT only ran after the running-agent guard — the voice was effectively lost. Now we transcribe audio BEFORE signaling the agent (and on the fresh-message path), echo the raw transcript back to the user (🎙️), and _enrich_message_with_transcription returns (text, transcripts) so callers can echo. A new _dequeue_pending_with_transcription drives the post-agent drain the same way. Reapplied onto _prepare_inbound_message_text (inbound enrichment was extracted from the inline dispatch block since the original PR). Co-authored-by: Kristian Vastveit <kristian@agrointel.no>	2026-06-08 15:16:20 +05:30
kshitij	4eb8972390	Merge pull request #33817 from sweetcornna/fix/28503-busy-input-fifo fix(gateway): use FIFO queue for busy_input_mode pending messages	2026-06-08 02:02:02 -07:00
teknium1	619bd78273	refactor(gateway): extract 42 slash-command handlers into GatewaySlashCommandsMixin (god-file Phase 3b) The in-session slash commands (/model, /reset, /usage, /compress, /voice, ...) — 42 _handle_*_command handlers, ~3,200 LOC — move out of gateway/run.py into a mixin GatewayRunner inherits. self._handle_*_command dispatch + all test references resolve unchanged via the MRO. Neutral deps (MessageEvent, EphemeralReply, Platform, t, cfg_get, atomic_*_write, account-usage helpers, stdlib) imported at the mixin top level. The ~10 run.py-internal helpers (_hermes_home, _load_gateway_config, _resolve_gateway_model, _AGENT_PENDING_SENTINEL, ...) imported lazily inside the handlers that need them to avoid an import cycle. gateway/run.py 19157 -> 15870 LOC; GatewayRunner direct methods 214 -> 172. Behavior-neutral: voice/update/model/compress command test suites pass; all 42 resolve to the mixin via MRO.	2026-06-08 01:25:35 -07:00
LeonSGP43	e02f4c03c3	fix(gateway): abort --replace when old PID survives SIGKILL When --replace force-kills an unresponsive old gateway, SIGKILL can fail to reap it (uninterruptible sleep, zombie-reaping parent, etc.). The old code unconditionally cleared the PID file and scoped locks and started a fresh instance anyway, leaving two live gateways fighting over the same bot token — a duplicate-gateway failure mode of #19471. Re-verify the process is actually gone (via the Windows-safe _pid_exists helper) after the force-kill; if it still appears alive, clear the takeover marker and abort the replacement instead of duplicating. Co-authored-by: Hermes <noreply@nousresearch.com>	2026-06-07 23:57:32 -07:00
konsisumer	3714caa1b9	fix(session): follow compression continuations for transcript reads	2026-06-07 23:57:20 -07:00
teknium1	1c68f6f81f	refactor(gateway): extract kanban watcher loops into GatewayKanbanWatchersMixin (god-file Phase 3) gateway/run.py is the largest god file (20k LOC, GatewayRunner with 220 methods). This lifts the cohesive kanban-watcher cluster — _kanban_notifier_watcher, _kanban_dispatcher_watcher, _kanban_advance/unsub/rewind, _deliver_kanban_artifacts (~1,035 LOC, 6 methods) — into gateway/kanban_watchers.py as a mixin that GatewayRunner inherits. Mixin (not free functions) because the methods use only self state: inheriting keeps every self._kanban_* call site working unchanged via the MRO, making this a behavior-neutral move. The methods' lazy imports (_kb, _decomp, _load_config, Platform) travel with them; the mixin needs only stdlib + a matching logging.getLogger('gateway.run'). run.py 20187 -> 19157 LOC; GatewayRunner direct methods 220 -> 214. Behavior-neutral: gateway test suite 6582 passed / 0 failed; start() still wires both watchers via self._kanban_*; MRO resolves all 6 to the mixin. One test (corrupt-board quarantine retry) keyed its time-travel mock on the caller's filename being gateway/run.py — updated to also accept gateway/kanban_watchers.py. Establishes the mixin-extraction pattern for further GatewayRunner decomposition (the 2406-LOC _run_agent and 1164-LOC _handle_message remain, but their callback closures need a context-object redesign — deferred).	2026-06-07 23:14:18 -07:00
Rod Boev	648706936d	test(gateway): add compression session_id rotation integration tests (#34089)	2026-06-07 22:39:51 -07:00
teknium1	2a10da3a16	fix(gateway): keep /model + /reasoning overrides on topic recovery & compression splits Session-scoped /model and /reasoning overrides were silently lost on Telegram DM/forum topics and after compression session splits (#30479). Root cause: _handle_message_with_agent rewrites source.thread_id via _recover_telegram_topic_thread_id (lobby/stripped reply -> the user's bound topic) before deriving the session key. The /model and /reasoning handlers derived their override key from the raw inbound event.source, skipping that recovery, so the override was stored under one key and the next message turn read a different key. Fix: add _normalize_source_for_session_key (applies the same recovery a message turn does) and use it in both handlers before deriving the key. session_id rotation on compression was never the cause — overrides are keyed by the durable session_key; the split path preserves it. Author: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-06-07 22:10:32 -07:00
Hariharan Ayappane	b8469a81e3	fix(weixin): add rate-limit circuit breaker	2026-06-07 22:10:17 -07:00
Teknium	2e62862784	fix(telegram): use get_running_loop in polling-conflict retry reschedule (#41716) The conflict-retry path called asyncio.get_event_loop() to reschedule itself when a retry's start_polling raised. On Python 3.11+ (our floor) that raises 'RuntimeError: There is no current event loop in thread MainThread' when no loop is attached to the thread, which is what happens when PTB dispatches this error callback. The retry never gets scheduled, the adapter goes silent-but-alive, and gateway --replace keeps spawning fresh instances that hit the same wall — the crash loop reported in #19471 (worse under multi-profile, where two bots hold the same conflict open). We are inside a coroutine here, so asyncio.get_running_loop() is the correct, guaranteed-valid replacement. Only get_event_loop() call in any platform adapter, so no sibling sites. Fixes #19471	2026-06-07 22:10:03 -07:00
Teknium	5408013369	fix(gateway): isolate DM sessions on user_id when chat_id is absent (#41764) build_session_key collapsed every DM that arrived without a chat_id into one shared 'agent:main:<platform>:dm' key. A single cached AIAgent then served multiple users' conversations, bleeding history across senders. DMs now fall back to the sender's user_id_alt/user_id (mirroring the group-path participant precedence and the telegram auth-path fallback) before the bare per-platform sink. Telegram's normal event path always sets chat_id, so this hardens the synthetic-source / non-standard-adapter paths that don't.	2026-06-07 22:07:07 -07:00
islam666	09a5548628	fix(weixin): refresh typing ticket on expiry to prevent stuck indicator (#38085) The WeChat iLink typing ticket has a 600-second TTL. When a long-running session exceeds that window, the cached ticket evicts from TypingTicketCache. Both send_typing and stop_typing silently returned early when the ticket was None, meaning the TYPING_STOP=2 signal was never sent to iLink. The WeChat client then showed the typing indicator indefinitely. Fix: add _ensure_typing_ticket() that transparently refreshes the ticket via getConfig when the cached one has expired or is missing. Both send_typing and stop_typing now call this method instead of silently no-oping. Fixes #38085	2026-06-07 21:50:57 -07:00
Brian D. Evans	ab0a6270c3	fix(slack): align thread_ts check with is_thread_reply invariant (Copilot #15464 ) Two findings from Copilot's review on #15464, both addressed: 1. ``event.get("thread_ts")`` truthy vs ``event_thread_ts != ts``: the new channel branch treated ANY truthy ``thread_ts`` as a real thread reply, but three lines below ``is_thread_reply`` is defined with the stricter ``event_thread_ts and event_thread_ts != ts`` invariant. If Slack ever ships a payload where ``thread_ts == ts`` on a thread root, the stricter check would treat it as a top-level message for the ``is_thread_reply`` path but as a thread reply for session keying — divergent behaviour. Aligned this branch to the same ``and event_thread_ts_raw != ts`` invariant. 2. ``test_top_level_reply_to_id_stays_none_when_shared`` docstring had the ternary logic backwards ("None != ts → reply_to_message_id IS set"). The code reads ``reply_to_message_id = thread_ts if thread_ts != ts else None`` — with ``thread_ts = None``, the condition is True so the expression evaluates to ``thread_ts`` itself (None), meaning the reply stays un-threaded. The test asserted the correct end-state; only the explanatory docstring was wrong. Rewrote the docstring to match the actual code flow, with the note that Copilot caught the reversal. 7/7 tests still pass. No behaviour change for the existing test_thread_reply_scopes_by_thread_even_when_shared case because ``event_thread_ts_raw = "1700000000.000000"`` and ``ts = "1700000000.000005"`` are distinct — the new ``!= ts`` guard is a no-op there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-07 21:19:59 -07:00
Brian D. Evans	133e0271e2	fix(slack): scope top-level channel messages by channel-only when reply_in_thread=false (#15421) Top-level Slack channel messages previously fell back to the message's own ``ts`` as a synthetic ``thread_ts``: thread_ts = event.get("thread_ts") or ts # ts fallback for channels That value flows into ``build_source(thread_id=thread_ts)`` at line 1247. The gateway session store keys sessions by ``(platform, channel_id, thread_id)``, so every top-level channel message ended up on a unique session. Operators who set ``reply_in_thread: false`` in ``config.yaml`` expected all top-level channel messages to share one session (the whole point of that flag) — instead each one spawned a fresh conversation with no context carry-over. ### Fix Three explicit cases in the channel branch: | event.thread_ts | reply_in_thread | thread_ts for session keying | |---|---|---| | non-null (real thread reply) | either | event.thread_ts | | null (top-level) | true (default) | ts (legacy: own-thread sessions) | | null (top-level) | false | None (shared channel session) | The outbound-reply gate at line 1264 (``reply_to_message_id = thread_ts if thread_ts != ts else None``) still works correctly in all three cases without further changes: ``None != ts`` is True, so shared-channel top-level messages don't get their reply threaded either — matching the operator's ``reply_in_thread=false`` intent end-to-end. Genuine thread replies still scope per-thread under both modes so multi-person threaded conversations can't collide with unrelated channel chatter. ### Tests (7 new in ``tests/gateway/test_slack_channel_session_scope.py``) All drive the real ``SlackAdapter._handle_slack_message`` code path (not a re-implementation) via the standard pytest fixture pattern used by ``tests/gateway/test_slack.py``. Messages @mention the bot so the mention gate doesn't drop them — the tests are specifically about what happens once the handler decides to emit a ``MessageEvent``. * ``TestChannelSessionScopeDefault`` (2 cases): - Explicit ``reply_in_thread: true`` keeps ``thread_id = ts`` (legacy behaviour — regression guard) - Unset config behaves like ``reply_in_thread: true`` (pins the default) * ``TestChannelSessionScopeShared`` (3 cases): - ``reply_in_thread: false`` + top-level → ``thread_id is None`` (the #15421 bug 1 fix) - ``reply_to_message_id is None`` in the same case (no threaded outbound reply) - Genuine thread reply still scopes per-thread when shared mode is on — only TOP-LEVEL messages collapse to the channel session * ``TestThreadReplyAlwaysScopesByThread`` (2 parametrised cases): - Thread replies get ``thread_id = event.thread_ts`` regardless of ``reply_in_thread`` — critical invariant for multi-thread channels; a regression here would leak per-thread context across threads Regression guard verified: reverted the else-branch to the legacy ``thread_ts = event.get("thread_ts") or ts`` one-liner; ``test_top_level_maps_to_none_when_reply_in_thread_false`` correctly failed (asserts ``thread_id is None`` but got ``"1700000000.000003"``). Restored → 182 slack tests pass (175 existing + 7 new). Scope: this fixes #15421 bug 1 only. Bug 2 (sessions.json not persisting across compression) lives elsewhere in the session manager and is left for a separate diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-07 21:19:59 -07:00
Teknium	e3b8b6d32c	feat(hooks): expose thread_id and chat_type in agent:start/end context (#41672) Adds thread_id and chat_type to the agent:start/end plugin hook context (via getattr with safe defaults; both are real `source` attrs already used in gateway/run.py). agent:end inherits them via **hook_ctx. Purely additive — no prompt/history mutation. Documents the full ctx dict in hooks.py. Co-authored-by: SNooZyy2 <SNooZyy2@users.noreply.github.com>	2026-06-07 19:16:36 -07:00
Teknium	30c7913617	fix(api_server): report hermes version on /health and /health/detailed (#40620) Salvaged from #40479; re-verified on main, tightened, tested. Co-authored-by: tfournet <tfournet@users.noreply.github.com>	2026-06-07 18:38:54 -07:00
Gilad Bauman	ae82eed2b1	fix(gateway): use OGG for Telegram auto TTS	2026-06-07 18:05:58 -07:00
Teknium	cb83149dc6	fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (#40607) Salvaged from #40421; re-verified on main, tightened, tested. Co-authored-by: maxmilian <maxmilian@users.noreply.github.com>	2026-06-07 17:49:38 -07:00
Teknium	dde9c0d19d	feat(gateway): render terminal tool calls as native bash code blocks on markdown platforms (#41215 ) Tool-progress now shows a terminal command in a ```bash fenced block — full command, no surrounding quotes, no label, no 40-char truncation — instead of the noisy `terminal: "cmd…"` line, on every platform that renders markdown code blocks (Telegram, Slack, Matrix, WhatsApp, Feishu, Weixin, Discord). Plain-text platforms keep the compact preview line. Gated on a new `BasePlatformAdapter.supports_code_blocks` capability (default False) rather than a hardcoded platform list, so plugin adapters (Discord lives in plugins/platforms/) opt in by setting the flag. Applies to both all/new and verbose progress modes, with a safe fallback when the command arg is missing or blank.	2026-06-07 17:29:55 -07:00
Teknium	0c48b7165d	hardening(api-server): scan cron prompts on REST create/update for parity with the agent tool The agent-facing cronjob tool scans the user prompt with _scan_cron_prompt() before creating/updating a job (tools/cronjob_tools.py); the REST cron endpoints (POST /api/jobs, PATCH /api/jobs/{id}) validated length but not content. This adds the same scan to both handlers so an exfiltration/injection prompt is rejected the same way regardless of which surface created the job. NOT a security boundary, defense-in-depth / parity only: the REST cron endpoints are authenticated (every handler runs _check_auth, and connect() refuses to start without API_SERVER_KEY), and _scan_cron_prompt is a documented in-process heuristic, not a containment boundary (SECURITY.md 3.2). Raised externally via GHSA-fr3q-rjg3-x6mf (DNS-rebinding pre-auth RCE). The report's load-bearing 'no auth by default' premise was already closed three weeks after it was filed by the API_SERVER_KEY-required guard (commit `1a9ef8314`); this lands the create/update prompt-validation parity the report also pointed at. Scanner imported defensively so a missing scanner cannot disable the cron REST API.	2026-06-07 10:04:57 -07:00
Teknium	cb3e41e2fd	feat(onboarding): opt-in structured profile-build path on first contact (#41114 ) * feat(onboarding): opt-in structured profile-build path on first contact On a user's very first gateway message, Hermes now optionally offers to build a short profile of them — then, only with consent, gathers durable facts and persists them to the user-profile memory store (memory tool, target="user") so future sessions start already knowing who they are. Inspired by Poke's zero-input onboarding, but consent-first by design: - The agent OFFERS, never assumes. Declining stops it immediately. - Before ANY external lookup it states what it will look up and asks. - It never reads connected accounts (email/calendar) silently — the exact privacy concern that made naive implementations feel invasive. Wiring reuses existing infrastructure end-to-end: - gateway/run.py first-message hook (was a plain self-intro) now swaps in the profile-build directive when enabled and not yet offered. - agent/onboarding.py gains profile_build_mode()/profile_build_directive() + PROFILE_BUILD_FLAG, latched once via the existing onboarding.seen mechanism so the offer fires at most once per install. - config default onboarding.profile_build: "ask" (set "off" to disable). Added to an existing section, so no _config_version bump needed. No new storage layer, no new injection path, no prompt-cache impact. * fix(dashboard): fold onboarding into agent tab to avoid 1-field category onboarding.profile_build is the only schema-surfaced onboarding field (onboarding.seen is an internal latch dict), so the dashboard CONFIG_SCHEMA single-field-category invariant rejected it. Merge onboarding -> agent like the other small categories.	2026-06-07 08:36:48 -07:00
Dusk1e	3fa15b33dd	fix(feishu): fail closed for update prompt card actions	2026-06-07 06:21:37 -07:00
Dusk1e	410cb743bf	fix(slack): re-check gateway auth on approval and slash-confirm buttons	2026-06-07 06:21:37 -07:00
manishbyatroy	490c486ff6	fix(simplex): accept display name in SIMPLEX_ALLOWED_USERS SIMPLEX_ALLOWED_USERS silently denied every contact when operators listed display names instead of numeric contactIds. The SimpleX UI never surfaces the numeric id, so display names are what operators naturally put in the env var. _is_user_authorized only compared source.user_id (the contactId), so the allowlist never matched. Expand check_ids to include source.user_name for the simplex platform, mirroring the existing WhatsApp phone-LID aliasing pattern. Adds doc + setup-prompt clarification and three regression tests. Salvaged from PR #40393. Adds manishbyatroy to release.py AUTHOR_MAP.	2026-06-07 04:53:22 -07:00
bmoore210	330ca4585b	fix: harden gateway startup and turn persistence Persist the inbound user turn before provider/tool execution so a crash before run_conversation() (e.g. provider/httpx client init failure) keeps the inbound message in the transcript. Repair stale/missing SSL_CERT_FILE state on gateway startup, and avoid duplicate gateway fallback writes.	2026-06-07 02:15:23 -07:00
annguyenNous	7223f22d65	fix: add timeout to subprocess.run() and proc.wait() calls subprocess.run() and proc.wait() without timeout can hang indefinitely if the child process becomes unresponsive. This blocks the calling thread forever. Fixed locations: - tools/transcription_tools.py: ffmpeg conversion (timeout=300) and user-configured STT commands with shell=True (timeout=300) - gateway/run.py: helper script proc.wait() (timeout=3600) Not fixed: - agent/anthropic_adapter.py: interactive 'claude setup-token' — user-driven, timeout would be inappropriate	2026-06-07 01:26:33 -07:00
annguyenNous	b08662b782	fix(gateway): tolerate Unicode in stderr log handlers on Windows On Windows with non-UTF-8 console encodings (e.g. cp949, cp1252), StreamHandler emits raise UnicodeEncodeError when log messages contain characters outside the console codepage — such as the em-dash (U+2014) in the session hygiene message. This crashed the gateway process silently, leaving no diagnostic output. Fix: add _safe_stderr() helper that wraps sys.stderr in a TextIOWrapper with encoding='utf-8' and errors='replace' when the console encoding is not UTF-8. Applied to both: - hermes_logging.py setup_verbose_logging() stderr handler - gateway/run.py optional stderr handler The wrapper ensures log lines are never lost — un-encodable characters are replaced with '?' instead of crashing the process. Fixes #40432	2026-06-06 19:57:44 -07:00
Teknium	3eeca4613d	fix(qqbot): stop 100% CPU spin when WebSocket is closed but not None (#31193, #31771) (#40574) _read_events() returned normally when self._ws was closed-but-non-None (the while-condition is false on entry). _listen_loop treats a normal return as a clean read, resets backoff to 0, and immediately retries — a tight busy-loop pinning CPU. Raising on entry routes it through the reconnect/backoff path instead. Co-authored-by: xushibo <xushibo@users.noreply.github.com> Co-authored-by: cnfi <cnfi@users.noreply.github.com>	2026-06-06 18:44:44 -07:00
Teknium	f4a73abbd0	chore(gateway): drop HOMEASSISTANT from /update allowlist (#40736) Home Assistant is a bundled plugin now (#40709) and declares allow_update_command=True on its PlatformEntry. The registry fallback in _handle_update_command already covers it, so the frozenset entry is a redundant double-allow — same cleanup #40711 did for Discord and Mattermost. Adds a registry-fallback test mirroring the existing discord/mattermost cases.	2026-06-06 18:25:43 -07:00
kshitijk4poor	ef7e5168b5	chore(gateway): drop plugin-migrated platforms from /update allowlist `gateway/run.py::_UPDATE_ALLOWED_PLATFORMS` was a hardcoded frozenset listing every messaging platform allowed to invoke the `/update` slash command. Plugin-migrated platforms (currently Discord and Mattermost, soon also Home Assistant via #32500) declare `allow_update_command=True` on their `PlatformEntry`, and `_handle_update_command` already falls back to the registry when a platform isn't in the frozenset. The result was a silent redundancy: those entries said "allowed" twice, and the registry flag was a no-op for them in practice. - Removed `Platform.DISCORD` and `Platform.MATTERMOST` from the frozenset. - Updated the docstring to make the split explicit (built-ins live in the frozenset; plugins use `allow_update_command` on the registry entry). The remaining frozenset entries are all still built-in platforms living under `gateway/platforms/` today. Future plugin migrations should drop their entry from the frozenset as part of the migration PR (or in a sibling chore PR like this one). Added a `TestUpdateCommandPlatformGate` test class that pins down all three branches of the gate so future changes don't silently regress: - Programmatic interfaces (`Platform.WEBHOOK`, `Platform.API_SERVER`) must remain blocked. - Plugin-migrated platforms (Discord, Mattermost) must pass via the registry fallback. - Built-in platforms in the hardcoded frozenset (Telegram) must still pass without needing the registry. The gate previously had zero direct test coverage — its only existing coverage was `test_no_adapter_for_platform` which exercised a different code path.	2026-06-06 11:48:55 -07:00
kshitijk4poor	c37c6eaf29	refactor(gateway): migrate Home Assistant adapter to bundled plugin Move gateway/platforms/homeassistant.py into plugins/platforms/homeassistant/ following the same shape as the Mattermost and Discord migrations. - Adapter file is renamed via git mv (history is preserved). - register() exposes the platform via the plugin system instead of the hardcoded Platform.HOMEASSISTANT elif in gateway/run.py::build_adapter(). - _standalone_send() replaces the legacy _send_homeassistant() helper in tools/send_message_tool.py. Out-of-process cron delivery (deliver=homeassistant from a cron process not co-located with the gateway) now flows through the registry's standalone_sender_fn path instead of the hardcoded elif. - _is_connected() probes HASS_TOKEN via hermes_cli.gateway.get_env_value so existing connected-platform checks behave identically. The HASS_TOKEN / HASS_URL env-to-PlatformConfig seeding in gateway/config.py stays in core — same pattern bluebubbles, mattermost, and discord migrations followed. No setup_fn or apply_yaml_config_fn is registered because Home Assistant has no _setup_homeassistant wizard in hermes_cli/setup.py and no homeassistant: YAML block in config.yaml today; setup runs through the existing hermes_cli/tools_config.py toolset wizard. Test imports were rewritten across tests/gateway/test_homeassistant.py, tests/integration/test_ha_integration.py, and tests/tools/test_send_message_missing_platforms.py; the legacy (token, extra, chat_id, message)-shaped _send_homeassistant call site is preserved via a small SimpleNamespace shim in test_send_message_missing_platforms.py (same approach used when mattermost moved). - Focused HA suites (64 tests across the three rewritten files) pass. - Broader gateway/cron sweep produces 10 failures identical to main baseline (telegram approval/model-picker xdist isolation flakes, wecom_callback defusedxml issue, cron script_timeout fixture issue). Zero net new failures.	2026-06-06 11:46:24 -07:00
Teknium	54e7b74f7f	fix(gateway): plain text while busy interrupts by default again (#40590 ) * fix: respect disabled auto-compaction on context overflow Port from anomalyco/opencode#30749. When compression.enabled is false, NO automatic compaction trigger may fire. The proactive token-threshold paths (preflight + post-response should_compress gate) already honoured the setting, but the three provider-overflow recovery paths in the agent loop — long-context-tier 429, 413 payload-too-large, and context-overflow — called _compress_context() unconditionally, silently compressing and rotating the session against the user's explicit choice. Add a single guard at the top of the overflow-recovery dispatch: when compression is disabled and the error is one of those three overflow classes, surface a terminal error (compaction_disabled: True) telling the user to /compress manually, /new, switch to a larger-context model, or reduce attachments. Manual /compress (force=True) is unaffected — it never enters this loop. Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't compress when disabled; control case still compresses when enabled). Existing overflow-recovery tests updated to enable compaction explicitly (they verify the recovery fires); fixture defaults flipped to True to match production (compression.enabled defaults to True). * fix(gateway): plain text while busy interrupts by default again busy_input_mode (default 'interrupt') was advertised as the busy-behavior knob, but a second knob added in `7abd62719` — busy_text_mode, defaulting to 'queue' — short-circuited every plain TEXT message before busy_input_mode was consulted. Result: plain follow-ups silently queued instead of interrupting, even with busy_input_mode left at its 'interrupt' default (regression #38390, silent-queue #31588). Collapse to one source of truth: busy_input_mode drives text handling. 
busy_text_mode is kept only as a legacy explicit override for back-compat (existing queue setups keep working); when unset it follows busy_input_mode. All default fallbacks flipped queue->interrupt. The debounce mechanism is preserved and now keyed off the resolved mode. Fixes #38390, #31588.	2026-06-06 09:00:10 -07:00
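The mode resolution described in the entry above can be sketched as follows. This is a minimal illustration, not the gateway's code: the helper name `resolve_busy_text_mode` and the plain-dict config shape are assumptions, and only the option names `busy_input_mode` / `busy_text_mode` and the `interrupt` / `queue` values come from the commit message.

```python
from typing import Mapping, Optional

# Default stated in the commit message ("default 'interrupt'").
DEFAULT_BUSY_INPUT_MODE = "interrupt"


def resolve_busy_text_mode(config: Mapping[str, Optional[str]]) -> str:
    """Return the mode applied to plain text that arrives while the agent is busy.

    Hypothetical helper: busy_text_mode wins only when it is explicitly set
    (legacy override); when unset, text handling follows busy_input_mode.
    """
    explicit = config.get("busy_text_mode")
    if explicit:  # legacy explicit override, kept for back-compat
        return explicit
    # Single source of truth: fall through to busy_input_mode, then the default.
    return config.get("busy_input_mode") or DEFAULT_BUSY_INPUT_MODE
```

With an empty config the sketch resolves to `interrupt`, and an existing setup that pins `busy_text_mode: queue` keeps queueing.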
Siddharth Balyan	fcb1944b4f	feat(credits): usage-aware credits: in-session notices, /usage view, dev readout (#40011) * feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits) L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that exercises the real header -> CreditsState -> TUI pipe end-to-end behind HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists. - agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are strings -> paid_access via == "true", never bool(); retain-last-known; only subscription_micros may be negative; _usd kept verbatim). 
- run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros, session-start baseline latch, + dev-gated "credits" capture log. - agent/chat_completion_helpers.py: capture on the streaming response. - agent/agent_init.py: init _credits_state + _credits_session_start_micros. - tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged. - ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner. Off by default; silent for normal users. Validated live against staging (capture log delta matches the TUI segment). Throwaway consumer (readout/log/ banner); credits_tracker + the capture plumbing are the real feature foundation. test(credits): lock parser under 9-state matrix + harden validation (L2) Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free, depleted, debt, missing, no_org) plus validation edge cases: version strict==1 with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off == "true"/"false", never bool()), half-pair subscription limit treated as both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros → None, negative non-subscription micros → None, as_of_ms junk → None, zero limit ZeroDivision guard. 
Harden agent/credits_tracker.py to match the spec: - Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState - Add depleted property (== not paid_access, never remaining==0) - Change used_fraction guard to key off subscription_limit_micros (the actual denominator) not denominator_kind (metadata) - Replace fail-soft _safe_int with a sentinel-returning variant; full validation now returns None on any malformed field rather than silently defaulting - Add module-level warn-once latch for version > 1 - Add USD regex validation; add denominator_kind allow-list check - Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-) feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1) L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's policy will fire through and L5's TUI render will consume. - agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id; kind defaults "sticky", kept TTL-expressive for a future config seam). - run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and _emit_notice / _emit_notice_clear emitters (swallow all callback errors — a notice must never break the agent loop; no-op when unbound). - agent/agent_init.py: thread both callbacks through init_agent. - tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear WS events (snake_case payload, matching the existing gateway-event convention). - ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent. - tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op, signature threading, TUI binding payload shape). Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/ decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly. 
* feat(credits): threshold reconciliation policy + tests (L4.1) * feat(credits): wire threshold policy into capture + latch (L4.2) After a fresh header parse, _capture_credits runs evaluate_credits_notices against the agent's _credits_latch and emits the result — clears first, then shows (so a recovered depletion clears before the "restored" success lands, and depleted wins the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks) still caches state for /usage but runs no policy. Parse stays fail-open (miss → keep last-known); the eval/emit path warns on failure rather than swallowing, so a depletion-notice bug can't vanish silently. - run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn); latch lazy-guarded (object.__new__ safety). - agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}. * feat(tui): render credits notices in the status bar (L5, Strategy B) The TUI now renders the notification.show / notification.clear gateway events the agent emits — a level-colored notice overrides the status/verb slot when not busy. - Notice state machine on turnController (pendingNotice + dedicated noticeTimer + show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler decodes the events and delegates. - Render priority busy > notice > status (appChrome StatusRule); notice text rendered verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx; dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire). - Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites (recordMessageComplete / interruptTurn / recordError) — never idle(), which reset() also calls (would leak across sessions); reset() clears instead. 
- Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard; latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak). - 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority). * feat(credits): cold-start seed for new Nous sessions (L3) A genuinely-new Nous session has no inference header yet, so seed credits state from the authoritative GET /api/oauth/account snapshot at session start (in the new-session branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin hook gets no agent reference). The seed runs the shared notice policy, so a session that opens already depleted warns IMMEDIATELY rather than only after the first turn. - Maps the nested account fields (paid_service_access → paid_access; total_usable / subscription / purchased on paid_service_access_info; rollover on subscription), each None-guarded; float dollars → micros via round(d1e6), _usd left "" (render formats from micros — never synthesize a verbatim usd from a float). - Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset → used_fraction None → no warn90 from the seed (% only once a header lands, per D-E). - Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never blocks startup); paid_access unknown ⇒ True (never falsely depleted). - run_agent.py: extracted the warm-path policy/emit block into a shared _emit_credits_notices() so capture and the seed fire notices identically. * feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6) Add Nous credit dollar magnitudes to /usage (subscription / top-up / total + rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the account endpoint exposes a denominator). 
Reuses the existing account-usage render machinery via a new pure build_nous_credits_snapshot() that maps a NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to fetch_account_usage (keeps the per-provider boundary intact). CLI /usage also doubles as a depletion-recovery trigger: a force_fresh account fetch, kept in a SEPARATE local so it never clobbers the header-sourced agent._credits_state (which alone carries used_fraction). If paid access recovered while credits.depleted is latched and a notice consumer is bound, it reuses agent._emit_credits_notices() to clear it. Gateway /usage displays magnitudes only — messaging binds no notice consumer, so it performs no recovery emit. Fail-open throughout: any portal hiccup leaves /usage unaffected. * refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers The dev-flag truthy check was inlined in three places. Replace with the shared utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the env check on every render). Behaviour-preserving; identical truthy set. * fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review) Adversarial review found the /usage depletion-recovery trigger dead AND broken: the CLI binds no notice_clear_callback, the TUI runs /usage in a separate slash-worker subprocess (its own agent/latch), and the no-clobber rule made it evaluate stale paid_access anyway. Recovery already happens on the next inference (warm path), so the trigger was redundant — remove it and stop the depleted notice over-promising. - cli.py: remove the dead recovery block; bound the /usage portal fetch with a 10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch — urllib's per-socket timeout is not a wall-clock guarantee. 
- agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance" (no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn). - agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch so a stalled portal can't hang session startup; tidy its time import. * chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE) Throwaway dev scaffolding to exercise the notice pipeline without real spend or Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct / grant_exhausted / depleted / clear) or a file path whose contents name a state (re-read each turn → flip states live for recovery testing). _capture_credits injects the chosen CreditsState instead of parsing real headers and runs the shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding. * feat(credits): /usage monthly-grant % gauge The portal /api/oauth/account subscription block now carries monthly_credits (the per-period grant allowance, the % denominator). The consumer parsed monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only. Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload. build_nous_credits_snapshot emits a Subscription usage window (real % used, routed through the existing render machinery) when monthly_credits is a finite positive denominator and credits_remaining is finite and <= cap; otherwise it degrades to magnitudes-only (older portals, rollover-over-cap, or non-finite payloads). Guards (adversarial-review-driven): reject non-finite operands (json.loads parses bare NaN/Infinity by default → would render $nan + a false 100% used), reject bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap (rollover spanning the period makes the cap a nonsensical denominator → the $X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%. 
Money rule preserved: the ratio + magnitudes are computed from numeric float account fields via display formatting, never by parsing a server _usd string (there are none on these dataclasses). 13 gauge tests added (tests/agent/test_nous_credits_gauge.py). fix(credits): show /usage Nous block whenever a Nous account is present /usage runs in a slash-worker subprocess whose resolved inference provider is often not "nous" even when the user has a Nous account, so gating the Nous credits block on (provider == "nous") hid it entirely — the account data was fully available but never rendered. Gate instead on "a Nous account is logged in": a cheap local auth-state lookup (get_provider_auth_state('nous') has an access_token) decides whether to attempt the portal fetch, regardless of which provider inference runs on. In the gateway the block is also lifted out of the 'if provider:' scope so a Nous-credentialled user with another (or no) resident inference provider still sees their balance. Fail-open and the per-fetch wall-clock timeout are preserved. * fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker) In the TUI, /usage runs in a slash-worker subprocess that resumes the session WITHOUT building an agent (self.agent is None), so _show_usage early-returned "(._.) No active agent" before ever reaching the Nous credits block — which is agent-independent (a portal fetch gated on Nous auth-state). Extract the block into _print_nous_credits_block() and run it at the no-agent / no-calls early-returns too (returns True if it printed, so the fallback message only shows when there's genuinely nothing). Verified live against staging: the block + monthly-grant gauge now render in the slash-worker /usage path (previously hidden). The plain CLI REPL + messaging paths are unchanged (they have a live agent). 
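The guards listed for the monthly-grant % gauge above can be sketched as below. This is an illustration under stated assumptions, not the code in `agent/account_usage.py`: `used_percent` is a hypothetical name, and `_is_finite_num` here is a plain boolean stand-in for the helper the log mentions (its real body is not shown in the log).

```python
import math
from typing import Optional


def _is_finite_num(value: object) -> bool:
    """True for real, finite numbers; bools are rejected explicitly (assumed behavior)."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    try:
        return math.isfinite(value)
    except OverflowError:  # int too large to convert to float
        return False


def used_percent(monthly_credits: object, credits_remaining: object) -> Optional[float]:
    """Return the percent of the monthly grant used, or None for magnitudes-only."""
    if not (_is_finite_num(monthly_credits) and _is_finite_num(credits_remaining)):
        return None  # NaN, Infinity, bool, None, or non-numeric payloads
    cap = float(monthly_credits)
    remaining = float(credits_remaining)
    if cap <= 0:
        return None  # division-by-zero guard: the cap must be positive
    if remaining > cap:
        return None  # rollover over the cap makes the cap a nonsensical denominator
    used = (cap - remaining) / cap
    return min(used, 1.0) * 100.0  # debt (remaining < 0) clamps to 100%
```

Returning `None` rather than raising keeps the caller on the fail-open path the commit describes: no gauge is drawn and the dollar magnitudes still render.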
* feat(credits): escalating 50/75/90 usage bands (single status line) Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn, 90 warn) shown as ONE status-bar line: it displays the highest band the subscription grant has crossed, replaces the line as usage climbs, steps back down on recovery, and clears below 50%. No stacking, no per-turn churn. Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything from it. Single notice key (credits.usage) with a usage_band latch field so the notice only re-emits when the band actually changes. The crossing gate (seen_below_90) is preserved so a fresh live session that opens mid-range stays quiet until it has been observed below the lowest band (cold-start primes it when it wants an open-high warning). Denominator math unchanged: % = subscription grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %. Migrated test_credits_policy.py to the new key + added TestUsageBands (climb, step-down, recovery-clear, idempotent, inclusive boundaries). * feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn) Notices previously only fired inside a conversation turn (first message), so a session that opened already depleted / past a usage band showed nothing at 'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start() and call it (a) in the TUI/desktop agent build right after the notice callback is wired (fires at 'ready', before any message) and (b) as the first-turn fallback in conversation_loop. Idempotent (skips once _credits_state exists) and fail-open. The seed now maps monthly_credits -> subscription_limit_micros + denominator_kind='subscription_cap', so used_fraction is computable at seed time and usage-band warnings (not just depletion) hydrate on open. Primes the crossing latch so a session opening already in a band warns immediately. Degrades to depletion-only when monthly_credits is absent (older portals). 
Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap degradation, and the shared seed (fires/idempotent/skips-non-nous). * feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge when the portal supplies a positive, finite monthly_credits denominator with remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise. Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so the CLI and TUI /usage render the same block, and _snapshot_from_credits_state() so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too. TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage panel renders them regardless of API-call count or resume state — previously the TUI's separate /usage implementation only showed token counts. Money rule preserved: %% and magnitudes come from numeric float account fields via display formatting, never by parsing a server _usd string. feat(credits): CLI REPL inline notices (parity with TUI) The plain CLI agent bound no notice callbacks, so credit notices were TUI-only. Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders a single level-colored line above the prompt (error red / warn yellow / success green / info dim) via _cprint, and seed credits at session open so a depletion or usage-band warning shows before the first message — the same hydration the TUI got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot). 
* test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open seed, /usage gauge). * fix(credits): usage-band notice clears on next prompt (not sticky-forever) A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear the visible credits.usage notice when a new turn starts (startMessage), so it shows until your next prompt then yields. The server latch is unchanged, so it won't re-nag at the same band — it only re-shows when the band actually changes (climb) or clears when usage drops below the lowest band. Depletion stays sticky. * refactor(credits): consolidate the /usage credits block behind nous_credits_lines() The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command) each re-implemented the auth-gate + portal fetch + render, and both bypassed the dev-fixture short-circuit that only the TUI honored — so /usage ignored HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth gate, and the fixture works on every surface (~60 fewer duplicated lines). The gateway usage test recorded only the last asyncio.to_thread call; /usage now dispatches both the account fetch and the credits fetch, so it records every call and matches the account fetch by its provider arg. * fix(credits): keep the /usage gauge type-safe and log its fail-open path _is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge operands (monthly_credits / credits_remaining) and the magnitudes passed to _fmt_usd through it — no more None-operand warnings on the arithmetic. 
Add a debug breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block is diagnosable in agent.log without a dev flag. * fix(credits): harden the header tracker — prod-leak gate, hot-path probe, fire-and-forget seed - Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a real account. Matches the documented run workflow (both vars set together). - Hot-path probe: parse_credits_headers checks for the version sentinel header before allocating a lowercased copy of the response headers — skips that work on every non-Nous API call. Behaviour-identical and still case-insensitive. - Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now runs in a daemon thread, so a slow/unreachable portal never delays session "ready" (previously blocked up to 10s). The dev-fixture path stays synchronous; the thread re-checks idempotency before hydrating (a live header may land first). - Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed parser / dead seed is distinguishable from a legitimate no-headers miss. Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate. * test(tui): fix env-timing in the StatusRule dev-credits assertion DEV_CREDITS_MODE is read once at module load (config/env), so mutating process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner assertion only passed if the env was exported before vitest started, and failed in a normal run. Move that assertion to a sibling file that mocks config/env with DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard). 
* test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt - _snapshot_from_credits_state (the offline /usage renderer) had no direct test: lock the gauge math, the verbatim _usd magnitudes, the depletion line and the fixture marker, plus the no-cap (no gauge) and None-state cases. - turnController.startMessage had no test for clearing the credits.usage notice on the next prompt while leaving credits.depleted sticky. feat(credits): deliver credit notices over messaging gateways Bind notice_callback/notice_clear_callback on the per-turn gateway agent so usage-band / depletion / restored notices reach Telegram/Discord/Slack/ etc. Previously the messaging gateway bound neither callback, so the agent's _emit_credits_notices early-returned and a chat user crossing a band got nothing unless they ran /usage manually. - render_notice_line(): AgentNotice -> single plaintext line (level glyph + text), plaintext-only so it renders uniformly without per-platform escaping. Fail-soft on malformed/empty notices. - Standalone push for every notice (messaging has no persistent status bar): route through the shared _deliver_platform_notice rail (honors private/ public delivery + thread metadata), scheduled onto the gateway loop via safe_schedule_threadsafe from the agent's sync worker thread — same pattern as _status_callback_sync. - The fired-once latch lives on the cached (reused-in-place) agent and persists across turns, so a band crosses once -> one push, no per-turn re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder). - Recovery ('Credit access restored') rides the show path (emitted as a success notice, not a clear). notice_clear_callback is a no-op: a sent platform message can't be cleanly retracted. Tests: render glyph/levels/fail-soft + public/private delivery seam through _deliver_platform_notice + no-adapter no-op. 
* fix(credits): don't double the glyph on messaging notices render_notice_line prepended a per-level glyph, but the notice policy already bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used", "⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead level→glyph map. The render tests fed glyph-less text (and the success case only checked startswith), so the doubling slipped through. Rework them around the verbatim contract and add an end-to-end regression that runs real evaluate_credits_notices output through render_notice_line and asserts the line is returned unchanged.	2026-06-06 13:18:18 +05:30
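The escalating 50/75/90 band policy described in this commit can be sketched roughly as follows. The list name `CREDITS_USAGE_BANDS` and the `usage_band` latch field appear in the commit message; the list's element shape and the functions `highest_band` and `band_changed` are assumptions made for illustration, and the crossing gate (`seen_below_90`) is deliberately left out.

```python
from typing import Dict, List, Optional, Tuple

# (percent threshold, notice level); values taken from the commit message.
CREDITS_USAGE_BANDS: List[Tuple[int, str]] = [(50, "info"), (75, "warn"), (90, "warn")]


def highest_band(used_fraction: float) -> Optional[Tuple[int, str]]:
    """Return (percent, level) of the highest band crossed, or None below the lowest."""
    crossed = None
    for percent, level in sorted(CREDITS_USAGE_BANDS):
        if used_fraction >= percent / 100:  # boundaries are inclusive
            crossed = (percent, level)
    return crossed


def band_changed(latch: Dict[str, Optional[int]], used_fraction: float) -> bool:
    """Update the latch; True means the single notice must be re-emitted or cleared."""
    band = highest_band(used_fraction)
    percent = band[0] if band else None
    if latch.get("usage_band") == percent:
        return False  # same band as last time: no per-turn churn
    latch["usage_band"] = percent
    return True
```

Because the latch stores only the current band, a climb from 50 to 75 replaces the line once, repeated turns inside the same band stay silent, and a drop below 50 flips the latch back to `None`, which a caller would treat as a clear.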
Brooklyn Nicholson	30340eae2f	Include git SHA in /version output via banner label helper. Reuses format_banner_version_label() so CLI, TUI, gateway, and desktop show upstream/local commit when available.	2026-06-05 18:05:05 -07:00
Brooklyn Nicholson	9c1bb8d2c7	Add /version slash command across CLI, gateway, TUI, and desktop. Surfaces Hermes Agent version info on demand without leaving chat; works mid-run like /help and /update.	2026-06-05 18:05:05 -07:00
teknium1	14275d7baa	fix(gateway): honor per-provider max_output_tokens in max_tokens chain Widens ViewWay's #20741 fix to the sibling config surface: a custom_providers entry can pin its own output cap via max_output_tokens (or max_tokens). _get_named_custom_provider now lifts it onto the resolved runtime at all three return sites, and the gateway uses it as a fallback only when the documented global model.max_tokens isn't set, so the global key always wins. Precedence: HERMES_MAX_TOKENS > model.max_tokens > provider max_output_tokens > None. Closes the same #20741 truncation for users who configure the cap per-provider rather than globally. Picks up the intent of #19782 (alexcam1901), reimplemented to feed ViewWay's max_tokens pipeline.	2026-06-05 09:10:26 -07:00
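The precedence chain stated above (HERMES_MAX_TOKENS > model.max_tokens > provider max_output_tokens > None) can be sketched as a single resolver. The function name `resolve_max_tokens` and the dict-shaped config arguments are assumptions for illustration; the real logic is spread across `_get_named_custom_provider` and the gateway's runtime resolution, which are not reproduced here.

```python
import os
from typing import Any, Mapping, Optional


def resolve_max_tokens(
    model_cfg: Mapping[str, Any],
    provider_cfg: Mapping[str, Any],
    env: Mapping[str, str] = os.environ,
) -> Optional[int]:
    """Hypothetical resolver for the output cap, highest-priority source first."""
    env_value = env.get("HERMES_MAX_TOKENS")
    if env_value:
        return int(env_value)  # the env override always wins
    if model_cfg.get("max_tokens") is not None:
        return int(model_cfg["max_tokens"])  # documented global key
    # Per-provider cap: max_output_tokens, with max_tokens accepted as an alias.
    for key in ("max_output_tokens", "max_tokens"):
        if provider_cfg.get(key) is not None:
            return int(provider_cfg[key])
    return None  # no cap configured anywhere
```

For example, `resolve_max_tokens({}, {"max_output_tokens": 4096}, env={})` yields the provider cap, while setting `model.max_tokens` or the environment variable overrides it. A non-numeric environment value would raise `ValueError` in this sketch; how the real code treats that case is not stated in the log.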
ViewWay	1c909e75e1	fix(cli,gateway): complete max_tokens propagation — CLI path + env var override Previous commit only covered the gateway runtime path. This adds: - CLI __init__: read max_tokens from model config with HERMES_MAX_TOKENS env override - CLI AIAgent() calls (interactive + background): pass max_tokens - Gateway _resolve_runtime_agent_kwargs: add HERMES_MAX_TOKENS env override All three code paths (CLI, gateway runtime, session override) now consistently propagate max_tokens to AIAgent.	2026-06-05 09:10:26 -07:00
ViewWay	cf786593cd	fix(gateway): propagate max_tokens from config.yaml to AIAgent max_tokens set under model: in config.yaml was silently ignored. The value was never read from config, never passed through _resolve_runtime_agent_kwargs(), _resolve_turn_agent_config(), or the session override path. Added it to all three code paths so custom/Ollama endpoints receive the correct output cap. Closes #20741	2026-06-05 09:10:26 -07:00
Teknium	947e21b3d6	fix(gateway): log silent file-delivery drops (#39767) When the agent's reply references a deliverable file path that does not exist on disk, extract_local_files dropped it from native delivery with no log line, which is the most common reason a promised file never arrives over a messaging platform. Add an INFO log at that drop point so the gap is visible in gateway.log instead of vanishing. Also convert the two print() calls in Telegram's send_document / send_video exception handlers to logger.warning(exc_info=True). print() writes to stdout, which 'hermes logs' never captures, so outbound upload failures (oversized files, Bot API rejections) were invisible.	2026-06-05 04:50:04 -07:00
Teknium	06268f11cc	feat(gateway): explain /voice usage when toggled bare (#39766) A bare /voice silently toggled on/off with a one-line result, leaving users with no idea what the modes mean or that Discord also supports TTS-all and live voice-channel join/leave. Bare /voice now still toggles but appends a usage explainer covering on/off/tts/status, with the Discord voice-channel lines shown only on adapters that support them. Adds gateway.voice.help + gateway.voice.help_channels across all 16 locales (placeholders {toggle}/{channels}).	2026-06-05 04:21:13 -07:00

1951 commits