hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-23 10:42:00 +00:00

Author	SHA1	Message	Date
teknium1	a6ce9b2fbb	fix(picker): keep flat-namespace reseller first-party models in desktop picker OpenCode Go (and OpenCode Zen) showed only a subset of the models they serve in the desktop/CLI model picker — e.g. opencode-go rendered 13 of 19, silently dropping minimax-m3/m2.7/m2.5, glm-5/5.1, deepseek-v4-flash. Root cause: the picker dedup in build_models_payload strips any model from an aggregator row that overlaps a user-defined provider's catalog (so a local proxy isn't shadowed by OpenRouter). It gated on is_aggregator(), which is True for opencode-go/zen because their flat /v1/models returns bare IDs the model-switch resolver searches. But those are flat-namespace RESELLERS, not routing aggregators — every model they list is first-party, so deduping them against a user proxy that happens to serve a same-named model guts their own catalog. Fix: add is_routing_aggregator() (True only for true routers like OpenRouter and custom:* proxies; False for opencode-go/zen) and gate the picker dedup on it. is_aggregator() is unchanged so model-switch flat catalog resolution keeps working. Both desktop entry points (model.options JSON-RPC and /api/model/options REST) and hermes model share build_models_payload, so all surfaces get the full list. Fixes #47077	2026-06-22 06:09:08 -07:00
Teknium	ef6492b648	fix(gateway): cold-start installed Windows gateway after update when none was running (#50804) The post-update gateway resume path (`_resume_windows_gateways_after_update`) only relaunched gateways that were running when the update began — it enumerates live PIDs in `_pause_windows_gateways_for_update` and respawns exactly those. A gateway that had already died between updates (e.g. it was launched attached to a terminal/TUI that later closed, taking the child with it) was never brought back: the Startup-folder / Scheduled-Task autostart entry only fires on the next login, not after an in-place update. So a Desktop-GUI update (which runs `hermes update --yes --gateway`) on a box whose gateway had quietly died would complete with no gateway running, and the user had no indication anything should have come up. Fix: when no gateway is running at pause time but an autostart entry is installed (`gateway_windows.is_installed()` — an explicit "I want a gateway" signal), return a `cold_start_if_installed` token. The resume step then does a fresh detached spawn via `gateway_windows._spawn_detached()` — the same windowless `pythonw` + `CREATE_BREAKAWAY_FROM_JOB` path `hermes gateway start` uses. It re-checks liveness immediately before spawning so a concurrent start (autostart entry firing) can't produce a duplicate. Gateway-less users (no autostart entry) get nothing forced on them — the pause step still returns None for them. POSIX is unaffected: enabled systemd units already restart via `Restart=always`. Windows-only; best-effort throughout (logs at debug and no-ops on any error). Tests: pause returns the cold-start token only when installed, returns None when not installed, resume cold-starts on the token, and resume skips the cold-start when a gateway is already running.	2026-06-22 06:02:31 -07:00
teknium	da498ed99b	chore(release): map ScotterMonk for PR #50145 salvage	2026-06-22 05:41:22 -07:00
teknium	e9cd8c5bf3	fix(delivery): drop env-var knob, flag all chunking adapters Follow-up to ScotterMonk's cron-truncation fix: - Remove HERMES_DELIVERY_MAX_PLATFORM_OUTPUT env var. Behavioral config belongs in config.yaml, not a new HERMES_* env var (.env is secrets only). The actual bug is fixed entirely by the adapter-aware skip; the configurable cap was unneeded scope. MAX_PLATFORM_OUTPUT is a constant again, collapsing the max_output=0 disable branch and the audit-vs-truncation threshold divergence. - Flag the remaining verified-chunking adapters (slack, matrix, feishu, mattermost, teams, whatsapp, whatsapp_cloud, weixin, bluebubbles, yuanbao) with splits_long_messages=True so the fix covers the whole bug class, not just Discord/Telegram. Each verified to chunk in its own send() via truncate_message(). - SMS deliberately left False: it chunks for normal replies but a multi-segment cron blast is cost-bearing; the 4000-cap + file save is the safer default there. - Update tests: drop the two env-override tests, add a test asserting a save failure during truncation (non-chunking) propagates.	2026-06-22 05:41:22 -07:00
ScotterMonk	86e4521cb1	fix(delivery): make cron output truncation configurable + adapter-aware Gateway-level truncation (MAX_PLATFORM_OUTPUT=4000) was pre-empting adapter-side message splitting. Discord and Telegram both chunk long content natively in their send() via truncate_message(), but the delivery router truncated to 3800 chars + footer before the adapter ever saw the full payload — so long cron output was cut short instead of being delivered as multiple messages (issue #50126). Changes: - HERMES_DELIVERY_MAX_PLATFORM_OUTPUT env var makes the cap configurable (default 4000, backward compatible). Set to 0 to disable truncation. - TRUNCATED_VISIBLE (3800) removed — visible portion now derived dynamically from max_output minus the actual footer length. - New BasePlatformAdapter.splits_long_messages capability flag (default False). Adapters that chunk in send() set True; delivery skips truncation for them but still saves full output to disk as audit. - Flagged Discord and Telegram (both verified to chunk in send()). Fixes #50126	2026-06-22 05:41:22 -07:00
Teknium	eecb5b9dd1	fix(update): don't count across shallow-clone boundary (bogus '12492 commits behind') (#50784) * chore: re-trigger CI (workflows did not dispatch on prior head) * fix(update): don't count across shallow-clone boundary (bogus '12492 commits behind') Installer checkouts are shallow (git clone --depth 1). The CLI banner and hermes update --check both did a plain git fetch (silently unshallowing the repo) then git rev-list --count HEAD..origin/main, which counts across the shallow boundary and prints a huge nonsense number like '12492 commits behind'. Detect shallow up front, fetch with --depth 1 to preserve the boundary, and compare tip SHAs instead of counting: - banner _check_via_local_git: returns UPDATE_AVAILABLE_NO_COUNT when behind (renders as 'update available') instead of the bogus count. - _cmd_update_check: reports presence-only on shallow clones. Full clones keep the exact count path unchanged. Mirrors the desktop fix in apps/desktop/electron/main.cjs (commit `2950c6fa2`).	2026-06-22 05:39:11 -07:00
Kartik	2e779d11a0	feat(mem0): v3 API, OSS mode, update/delete tools, telemetry & review fixes (#15624) * fix: update to version 3 endpoints and adding update and delete tool * chore: removing the test md file * fix: prevent circuit breaker on client errors in Mem0 provider * chore: add telemetry for platform version * feat: add OSS mode support to Mem0 memory provider * chore: bump mem0ai dependency to >=2.0.1 in memory plugin * refactor: enhance dependency checks and embedder config in mem0 backend * refactor: adjust fact storage message for OSS mode * refactor: expand user paths, add collection recreation on dimension change for Qdrant * fix(mem0): make MEM0_USER_ID override gateway-native ids and tag writes with channel When MEM0_USER_ID was configured (env or mem0.json), the gateway-native id from kwargs (Telegram numeric id, Discord snowflake, ...) still won, so the same human ended up under different user_ids per channel and memories never merged across CLI / Telegram / Slack / Discord. Mirrors openclaw's cfg.userId pattern: configured override wins, gateway-native id is the fallback. The legacy "hermes-user" placeholder default written by the setup wizard is treated as unset to avoid silently bucketing every gateway user together. Also tag every write with metadata.channel (cli/telegram/discord/...) so the dashboard can offer per-channel filtered views without coupling identity to the channel; document the read/write filter asymmetry as intentional (reads scope to user_id only for cross-agent recall). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: improve Mem0 memory provider backend, pagination, config, and error handling * refactor: update mem0 telemetry code, docs, and bump version * fix(mem0): make get_config_schema() return unified schema with mode-aware required flag Schema always includes api_key field so picker shows "API key / local" for both modes. In OSS mode api_key.required=False so status won't mislead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: improve mem0 telemetry, add env var key and OSS mode detection * chore: bump mem0ai lower bound to 2.0.4 (latest SDK release) * refactor: set telemetry sample rate to 1.0 and update docs for opt‑out * fix(mem0): resolve 15 correctness, thread-safety, and resource bugs Thread safety: - Protect circuit breaker counters with _breaker_lock (race between prefetch/sync daemon threads and main thread) - Wrap sync_turn thread creation in _sync_lock; skip if previous sync is still alive after 5 s join to prevent duplicate memory ingestion - Guard _schedule_flush timer creation under _queue_lock (TOCTOU race) - Capture local `backend` reference in prefetch/sync closures so shutdown() nulling self._backend cannot crash in-flight threads Correctness: - Fix bool("false")==True for rerank param; parse string values explicitly - Guard page/top_k with max(1,...) and move int() inside try blocks - Fix fact_count=0 always in OSS mode (Memory.add returns list, not dict) - Fix prefetch() not clearing result when thread still alive after timeout - Fix atexit.register accumulating on repeated initialize() calls Backend / setup: - Handle Qdrant named-vector collections in _recreate_collection_if_dims_changed (vectors is a dict; .size access raised AttributeError, swallowed silently) - Wrap QdrantClient and psycopg2 conn/cursor in try/finally to prevent leaks - Resolve ollama_bin at top of _ensure_ollama; use it for ollama pull - Fix embedder key lookup when LLM provider has no env_var (e.g. ollama) Also: remove _telemetry_enabled cache (env var check is cheap), bump required mem0ai to >=2.0.7, minor README wording fix. 
* fix(mem0): fix brittle qdrant path test + add telemetry sample-rate docs - Replace generator-throw lambda with a proper def in test_qdrant_path_not_writable; use tmp_path instead of a hardcoded /nonexistent path so the test is root-safe - Add MEM0_TELEMETRY_SAMPLE_RATE to memory-providers.md (was only in the plugin README, not the user-guide docs) * revert: remove MEM0_TELEMETRY_SAMPLE_RATE from user-guide docs * refactor: remove telemetry from mem0 plugin and update documentation * fix(mem0): set stdin=DEVNULL on setup subprocess calls The TUI stdin guard (scripts/check_subprocess_stdin.py) requires every subprocess call in plugin code to set stdin= so it can't inherit the gateway's JSON-RPC stdin fd. Muzzle the docker/ollama calls in the OSS setup wizard with stdin=subprocess.DEVNULL (none need interactive input). Also covers the docker-inspect call the linter's regex misses. --------- Co-authored-by: chaithanyak42 <chaithanya.kumar42a@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-22 12:30:47 +00:00
kshitij	a904ff1724	Merge pull request #50781 from NousResearch/salvage/output-token-reservation-threshold fix(compress): reserve output tokens in the compaction threshold (#23767, #43547)	2026-06-22 17:33:09 +05:30
kshitijk4poor	623b21bf24	fix(compress): reserve output tokens in the compaction threshold (#23767, #43547) The compaction trigger compared estimated input against context_length * threshold, but the provider reserves max_tokens of OUTPUT out of the same window. With a large max_tokens (e.g. 65536 on a custom provider) the usable input budget is materially smaller than the raw window, so sessions hit a provider 400 before compaction ever fired. _compute_threshold_tokens now subtracts the output reservation (context_length - max_tokens) before applying the percentage and the small-window 85% guard. max_tokens is stored on the compressor (threaded from agent.max_tokens at construction) and reused across update_model() switches; None = provider default = no reservation (full-window behavior, unchanged). Reimplemented on the current _compute_threshold_tokens surface (the inline threshold calc the original PR targeted was since refactored for the small-window #14690 fix); composes with that 85% guard on the effective budget. Credit: @kyssta-exe (#43651) — original design for the output-token reservation in the compaction threshold. Closes #43547.	2026-06-22 17:26:17 +05:30
Ben Barclay	75a70d98f3	feat(relay): forward a stable instance id at self-provision (Phase 6 Unit α) (#50772) Add relay_instance_id() (env GATEWAY_RELAY_INSTANCE_ID first, then gateway.relay_instance_id in config.yaml, mirroring the other relay readers) and forward it in the /relay/provision body so the connector can bind gatewayId -> instanceId and route inbound per-instance once Phase 6 delivery lands. The value is gateway-asserted but safely scoped: the org/tenant stays NAS-token-verified at the connector, so a dishonest gateway can only bind its OWN tenant's instance — same posture as relay_endpoint(). instanceId is only added to the body when present, so omitting it lets the connector store null (back-compat: self-hosted / pre-Phase-6 gateways simply have no binding yet). For a managed (NAS-hosted) agent the id is NAS's AgentInstance.id, stamped into the container env beside GATEWAY_RELAY_URL. Tests: reader (env/config/absent), self_provision_relay forwards the id (set + absent), and the real _post_provision body includes instanceId ONLY when set. Refs: ~/nous/specs/gateway-gateway plan.md Phase 6 Unit α; decisions.md Q11.	2026-06-22 21:46:59 +10:00
kshitij	065946d84f	Merge pull request #50762 from NousResearch/salvage/defer-preflight-after-compaction fix(agent): defer preflight compaction until real usage after a compaction (#23767, #36718)	2026-06-22 17:10:03 +05:30
kshitij	1f28b1a9b9	fix(gateway): redact credentials from approval prompts before sending to clients (#48456) (#50767) Tirith redacts its own findings, but the approval-request callbacks built the operator prompt from the RAW command string, so a credential-shaped value Tirith flagged was sent verbatim to clients, undoing the redaction one layer up. Two egress transports carried the leak; both are fixed via a shared module-level seam _redact_approval_command() (redact_sensitive_text force=True): 1. chat platforms — _approval_notify_sync (gateway/run.py): redact before both the button path (send_exec_approval) and the plain-text /approve fallback. 2. SSE/API stream — _approval_notify (gateway/platforms/api_server.py): redact event['command'] before it is enqueued to API/desktop clients. (whole-bug-class: sibling call path on a separate transport.) force=True so the prompt — a hard secret-egress boundary — honors redaction even when security.redact_secrets is off. Clean commands pass through unchanged. Tests bind the seam (synthetic credential-format fixtures, force-when-disabled) AND assert BOTH callbacks ASSIGN the redacted result before the send/enqueue sink, via an AST contract that rejects a discarded-result call. All mutation-checked.	2026-06-22 11:39:45 +00:00
kshitijk4poor	b2c84a1626	fix(agent): defer preflight compaction until real usage after a compaction (#23767, #36718) After a compaction, the post-compression path parks last_prompt_tokens=-1 and sets awaiting_real_usage_after_compression=True, but last_real_prompt_tokens still holds the stale pre-compression value (above threshold). should_defer_preflight_to_real_usage() hit the 'last_real_prompt_tokens >= threshold => False' short-circuit and let preflight fire a SECOND compaction before the provider reported real post-compaction usage. Add an early-return on the awaiting flag so deferral holds for exactly one turn; update_from_response() clears it. The flag-setting half (#36718) already landed on main via the in-place compaction path (conversation_compression.py); this adds the missing should_defer guard that consumes it. Credit: - @ashishpatel26 (#38133) — diagnosis + the should_defer early-return design - @Tranquil-Flow (#36769) — same #36718 fix, identical guard placement Closes #36718.	2026-06-22 16:33:18 +05:30
kshitijk4poor	b4cb33cd42	chore(release): map basilalshukaili@gmail.com in AUTHOR_MAP Committer email for the salvaged #43293 commit; required by the contributor attribution check.	2026-06-22 16:26:56 +05:30
Basil Al Shukaili	72f75f8456	fix(compressor): count tool_call envelope in tail-budget token estimate (#28053) The tail-protection budget walks estimated an assistant message's tokens from content + function.arguments only, dropping each tool_call's id, type and function.name (plus JSON structure). Assistant turns that fan out into parallel tool calls were undercounted by 2-15x (a 4-tool-call turn measures ~73 vs ~1,090 real tokens), so the protected tail overshot tail_token_budget and compression ran far below its intended ratio — context kept growing. Consolidate the three duplicated budget walks (_prune_old_tool_results and the two passes in _find_tail_cut_by_tokens) into a single _estimate_msg_budget_tokens() helper that counts the full tool_call envelope via len(str(tc)), consistent with how _estimate_message_chars estimates message size elsewhere. Tested on Windows: new tests/agent/test_compressor_tool_call_budget.py plus the existing compression suite (test_context_compressor, compressor_image_tokens, cross_session_guard, infinite_compaction_loop) — 209 passed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 16:26:56 +05:30
kshitij	0e87c0a41b	Merge pull request #50117 from NousResearch/salvage/f5-cron-mcp-per-job fix(cron): layer enabled MCP servers onto per-job enabled_toolsets (#23997)	2026-06-22 16:03:54 +05:30
kshitij	aa83213c53	Merge pull request #50740 from NousResearch/salvage/preflight-token-progress fix(agent): count tokens, not just rows, as preflight compression progress (#23767, #39548)	2026-06-22 15:58:58 +05:30
kshitij	21541ce6e9	Merge pull request #50108 from NousResearch/salvage/f4m1-anthropic-pool fix(auth): consult credential_pool in resolve_anthropic_token (#26344)	2026-06-22 15:58:01 +05:30
kshitijk4poor	5bd3dae9e2	chore(release): add sherman-yang to AUTHOR_MAP	2026-06-22 15:53:30 +05:30
sherman-yang	74a5905aea	fix(cron): layer enabled MCP servers onto per-job enabled_toolsets A cron job that sets `enabled_toolsets` to a list of native toolsets (e.g. `["web", "terminal"]`) silently got ZERO MCP tools, while a job with no per-job list got every globally-enabled MCP server. `_resolve_cron_enabled_toolsets` returned the per-job list verbatim, bypassing the MCP-merge that the platform-fallback branch performs via `_get_platform_tools`. So `discover_mcp_tools()` registered the MCP tools into the registry, but `get_tool_definitions(enabled_toolsets=...)` kept only the named native toolsets — the agent then rejected every `mcp_` call as "Unknown tool". (R2 of #23997.) Fix: `_merge_mcp_into_per_job_toolsets` layers MCP membership onto a per-job allowlist with the SAME semantics as `_get_platform_tools`: * `no_mcp` sentinel present -> no MCP servers (sentinel stripped) * one or more MCP server names already listed -> treat as an allowlist * otherwise -> union in every globally-enabled MCP server To avoid duplicating the "which MCP servers are enabled" computation (it already existed inline in `_get_platform_tools`), this extracts a shared `enabled_mcp_server_names(config)` helper in `hermes_cli.tools_config` and has BOTH the gateway/CLI platform resolver and the cron per-job resolver call it — so every path agrees on MCP membership (extend, don't duplicate). Note: the issue's headline — bare MCP server names rejected, registry never includes them — was already fixed on main (commits `c10fea8d2` + `04918345e`, both before the issue was filed). This PR closes the remaining cron-specific gap (R2). The `server:*` / `mcp:server` alias-notation rejection (R1) and the quiet-mode silent-drop (R3) are tracked separately. Salvaged from #32788 by sherman-yang (credited below). Reworked to reuse the shared `enabled_mcp_server_names` helper instead of re-implementing the MCP membership set in cron/scheduler.py.
Fixes #23997 Co-authored-by: sherman-yang <58446328+sherman-yang@users.noreply.github.com>	2026-06-22 15:52:58 +05:30
brooklyn!	04a1d9efd7	feat(desktop): PR-style file diffs in chat (#50731) * feat(desktop): add Update now button to About panel The About > Updates panel only surfaced "See what's new" when an update was available, which just opens the changelog overlay — there was no way to start the install directly from About. Add an "Update now" primary button that opens the updates overlay (for apply progress) and kicks off the install for the active target (backend in remote mode, else client). * feat(desktop): PR-style file diffs in chat Render write_file/edit_file/patch as a reviewable diff instead of raw result JSON, closer to a Cursor/T3 per-edit review. - Unified diff via FileDiffPanel: strip git file-header + @@ hunk noise, drop the +/- gutter, color by line with a 2px gutter accent, full-bleed to the card, transparent context lines, compact scroll height. - Header shows filename + language icon + +N/-N stats; full path moves to a hover tooltip (no Edited verb, no ms). - Treat the three file-edit tools uniformly (isFileEditTool); read diff from inline_diff or patch's diff field; suppress raw-arg detail. - Reusable FileTypeIcon primitive sharing the code-block icon mapping (codiconForFilename), codicon fallback. - Per-row scaffolding fade (not the group wrapper, which trapped child opacity); expanded edits stay full, collapsed fade; keyboard-only focus lift. Hide diff-less rehydrated creates that read as dupes. * style(desktop): lead --dt-font-mono with bundled JetBrains Mono Code/diff blocks preferred a system Cascadia Code before the bundled JetBrains Mono, so they drifted from the terminal (which leads with JetBrains Mono) on machines where Cascadia is installed. Reorder so every mono surface uses the face we actually ship.
* feat(desktop): syntax-highlight inline diffs via Shiki Unify the diff renderer onto the same Shiki path as code blocks: highlight the marker-stripped change content in the file's language, then a per-line transformer layers the add/remove tint + gutter accent on top. Falls back to the plain color-only renderer when the language is unknown, over budget, or while Shiki loads. - shikiLanguageForFilename(): extension → bundled-language id (shared filename-token helper with codiconForFilename). - code display:grid so full-width line tints don't double with newline nodes; theme surface stripped so context lines stay transparent. * style(desktop): use github-dark-dimmed for inline diffs The vivid github-dark-default tokens read harsh behind the add/remove tint in dark mode; switch the diff's dark theme to GitHub's lower-contrast dimmed palette. Light mode and code blocks are unchanged. * style(desktop): dim code-block syntax theme + share with diffs Apply github-dark-dimmed to code blocks too (not just inline diffs) and export one shared SHIKI_THEME so the two highlighters can't drift. Lower contrast reads easier at our small code size in dark mode. * style(desktop): soften shiki token contrast in dark mode github-dark-dimmed only dims the background, which the diff/code surfaces strip — so the bright token foregrounds were unchanged. Pull saturation + brightness back a touch (hues preserved) on .shiki in dark mode for both code blocks and inline diffs.	2026-06-22 05:22:23 -05:00
kshitij	b9f302441f	Merge pull request #50112 from NousResearch/salvage/f5-cron-storage-root fix(cron): anchor cron storage at the default root home (#32091)	2026-06-22 15:51:59 +05:30
kshitijk4poor	69de0360a1	fix(agent): align preflight token-progress floor to 5% (#23767, #39548) Follow-up to the salvaged preflight token-progress fix: require a material (>5%) token reduction to count as progress, matching the overflow-handler retry path (conversation_loop.py, #39550), so a sub-5% wobble can't keep the 3-pass preflight loop spinning. Adds boundary + zero-token regression tests.	2026-06-22 15:51:52 +05:30
kshitij	f509d65336	Merge pull request #50109 from NousResearch/salvage/f5-disabled-bundle-core fix(tools): preserve core tools when a platform bundle is disabled	2026-06-22 15:51:50 +05:30
kshitij	2649f7360c	Merge pull request #50062 from NousResearch/salvage/cron-missed-grace-runonce fix(cron): run missed-grace jobs once instead of deferring forever	2026-06-22 15:50:54 +05:30
kshitijk4poor	3545d29422	refactor(auth): drop dead select() fallback in anthropic pool resolver /simplify-code QUALITY finding: the `if callable(_available_entries): ... else: pool.select()` ladder was dead for the real CredentialPool type (`_available_entries` is always a bound method) AND the select() fallback violated the helper's read-only contract — select() -> _select_unlocked() runs _available_entries(clear_expired=True, refresh=True), which persists to auth.json and triggers a network refresh. Call _available_entries(clear_expired=False, refresh=False) directly inside the existing try/except instead. Also drops the now-dead `select=` stubs from the 6 pool tests (they only existed to satisfy the removed fallback branch). Behavior unchanged; 6 pool tests pass and the read-only / null-token contract tests were mutation-checked (flipping the flags / removing the None-guard fails the respective test).	2026-06-22 15:50:26 +05:30
JackJin	b08ee8ad04	fix(agent): count tokens, not just rows, as preflight compression progress Rebased onto god-file Phase 1 refactor — preflight compression has moved from agent/conversation_loop.py to agent/turn_context.py (no semantic change in the refactor itself; the bug below was carried over verbatim). The preflight compression loop in ``turn_context.py`` uses ``len(messages) >= _orig_len`` to decide whether a compression pass has made progress. That conflates two different conditions: a true no-op (transcript materially unchanged) and effective token compression that summarises message contents but keeps the same number of rows. The second case is misread as "Cannot compress further" — the session then surfaces ``Context length exceeded`` and auto-resets even when the post-compression estimate is far below the model context window. Observed example from #39548: a Telegram session on GPT-5.5 with a 1M context dropped from ~288k → ~183k tokens (a 36% reduction) while preserving 220 messages. The loop treats that as exhaustion and the gateway auto-resets the session. Fix: add ``_compression_made_progress(orig_len, new_len, orig_tokens, new_tokens)`` and call it after the post-pass ``estimate_request_tokens_rough`` (which is moved up to run before the progress check instead of after it). Either a row-count reduction OR a token-count reduction now counts as progress; only when neither moves do we break out as "stuck". Fixes #39548	2026-06-22 15:49:19 +05:30
Brooklyn Nicholson	61c266b0dc	style(desktop): soften dark-mode syntax highlighting Share one SHIKI_THEME (github-dark-dimmed) across code blocks and inline diffs so they can't drift, and pull token saturation/brightness back via a `.shiki` dark-mode filter. The dimmed theme alone only changes the background — which both surfaces strip — so the bright foregrounds needed the filter to actually calm down.	2026-06-22 05:16:18 -05:00
kshitij	33efff0d8c	Merge pull request #50726 from NousResearch/salvage/compression-token-progress fix(agent): count tokens, not just message rows, as compression progress (#23767, #39550)	2026-06-22 15:44:38 +05:30
Ben Barclay	64a507da44	feat(relay): handle passthrough_forward over the WS (Phase 5 §5.1, gateway half) (#50702) The connector half (gateway-gateway) moves the passthrough plane's post-ACK forward off the HTTP gatewayEndpoint onto the gateway's outbound /relay WS via a new passthrough_forward frame. This is the gateway side: the relay adapter now RECEIVES and handles that frame, so a hosted gateway (no public IP) can process forwarded Class-2/3 traffic (Discord interactions, Twilio) over the socket it already holds — closing the "passthrough inbound doesn't work for hosted gateways" gap. - ws_transport.py: decode the passthrough_forward frame; PassthroughForward dataclass + _passthrough_from_wire (base64 body -> exact bytes, byte parity with the connector's toPassthroughForward); set_passthrough_handler mirrors set_interrupt_inbound_handler. - transport.py: PassthroughHandler type + set_passthrough_handler on the RelayTransport protocol. - adapter.py: connect() wires the passthrough handler; _on_passthrough decodes the (already-sanitized, token-free) forward and, for a Discord interaction, converts it to a MessageEvent routed through the normal agent path (handle_message) — the reply egresses over the outbound / token-less follow_up path, so the gateway never holds the interaction credential. Never raises (a bad forward can't kill the read loop). Non-discord forwards (Twilio) are logged + dropped for now. - docs/relay-connector-contract.md: document the passthrough_forward frame + PassthroughForward shape + §3.1. The interaction -> MessageEvent CONVERSION semantics (slash-command vs button UX, option rendering) are the open sub-design flagged in the spec; the TRANSPORT + receive mechanism (this) is settled per Ben's Gate-2 decision: "the relay adapter handles receiving these events over the WS."
Tests (tests/gateway/relay/test_relay_passthrough.py): byte-preservation round-trip (+ malformed-body tolerance), connect() wiring, application-command and message-component interactions route through handle_message with correct session source + scope capture, malformed/non-discord forwards dropped cleanly. 100 relay tests green. Pairs with the connector PR (gateway-gateway).	2026-06-22 20:10:57 +10:00
Brooklyn Nicholson	ac128af1ce	feat(desktop): syntax-highlight inline diffs via Shiki Unify the diff renderer onto the same Shiki path as code blocks: highlight the marker-stripped change content in the file's language, then a per-line transformer layers the add/remove tint + gutter accent on top. Falls back to the plain color-only renderer when the language is unknown, over budget, or while Shiki loads. - shikiLanguageForFilename(): extension → bundled-language id (shared filename-token helper with codiconForFilename). - code display:grid so full-width line tints don't double with newline nodes; theme surface stripped so context lines stay transparent.	2026-06-22 05:10:23 -05:00
Brooklyn Nicholson	c6fbd5a104	style(desktop): lead --dt-font-mono with bundled JetBrains Mono Code/diff blocks preferred a system Cascadia Code before the bundled JetBrains Mono, so they drifted from the terminal (which leads with JetBrains Mono) on machines where Cascadia is installed. Reorder so every mono surface uses the face we actually ship.	2026-06-22 05:05:34 -05:00
Brooklyn Nicholson	a61baa9615	feat(desktop): PR-style file diffs in chat Render write_file/edit_file/patch as a reviewable diff instead of raw result JSON, closer to a Cursor/T3 per-edit review. - Unified diff via FileDiffPanel: strip git file-header + @@ hunk noise, drop the +/- gutter, color by line with a 2px gutter accent, full-bleed to the card, transparent context lines, compact scroll height. - Header shows filename + language icon + +N/-N stats; full path moves to a hover tooltip (no Edited verb, no ms). - Treat the three file-edit tools uniformly (isFileEditTool); read diff from inline_diff or patch's diff field; suppress raw-arg detail. - Reusable FileTypeIcon primitive sharing the code-block icon mapping (codiconForFilename), codicon fallback. - Per-row scaffolding fade (not the group wrapper, which trapped child opacity); expanded edits stay full, collapsed fade; keyboard-only focus lift. Hide diff-less rehydrated creates that read as dupes.	2026-06-22 05:04:13 -05:00
kshitijk4poor	ebd38e1280	test(agent): regression for token-only compression progress (#39550, #23767) Adds test_413_retries_on_token_only_compression: same message count but materially fewer tokens after compaction must count as progress and retry, not abort. Fails on main without the salvaged fix, passes with it.	2026-06-22 15:26:29 +05:30
David Gutowsky	87b60ae49a	no-mistakes(review): guard token-delta status msg on actual compression in overflow handler	2026-06-22 15:23:24 +05:30
David Gutowsky	47b6b4cf85	fix #39550: detect token-only compression success Compression can materially reduce request size (tool-result pruning, in-place summarization) without reducing message count. The two compression-success checks in conversation_loop.py (413 handler and context-overflow handler) only compared len(messages) to detect success, missing token-only compression. Now re-estimates tokens after compress_context() returns and treats any >=5% reduction as a successful compression pass. Error logs also use the post-compression token count instead of the stale pre-compression estimate. Fixes: #39550	2026-06-22 15:23:24 +05:30
kshitij	ab22317d09	Merge pull request #50214 from kshitijk4poor/salvage/desktop-rename-branched-50143 fix(desktop): rename a branched session via session.title RPC (fixes "Session not found")	2026-06-22 15:15:30 +05:30
Teknium	5ff11a689b	feat(cli): /timestamps command + timestamps in /history (#50506) display.timestamps already drove the [HH:MM] suffix on live submitted and streamed message labels, but there was no runtime command to toggle it and /history ignored the setting entirely. Add /timestamps [on|off|status] (alias /ts) and render [HH:MM] in /history for turns that carry a stored unix timestamp (resumed sessions). Live unsaved turns without a stored time are never given a fabricated one. Uses the existing sanctioned non-wire 'timestamp' message key (stripped before the API call in chat_completions), so message-alternation and prompt-cache invariants are untouched.	2026-06-21 22:44:25 -07:00
Shannon Sands	b9b4756ab4	fix dashboard chat session titles	2026-06-21 22:44:02 -07:00
Shannon Sands	5dae502b86	Address email pairing review feedback	2026-06-21 22:43:57 -07:00
Shannon Sands	2455e1801b	Make email pairing opt-in	2026-06-21 22:43:57 -07:00
Teknium	74f0dd62e8	feat(cli): Ctrl+G submits the edited draft on save (TUI parity) (#50560 ) Ctrl+G already opened $EDITOR with the current draft, but used open_in_editor(validate_and_handle=False), which only loaded the saved text back into the input area — the user still had to press Enter. The TUI's Ctrl+G (openEditor) submits the draft on a clean exit. Since CLI submission is driven by the custom Enter keybinding (not the buffer accept_handler), validate_and_handle can't route through it; instead chain a done-callback on the editor Task that calls the new _submit_editor_buffer(), which mirrors the Enter handler's idle/queue/slash branches and drops an empty save.	2026-06-21 22:43:55 -07:00
Shannon Sands	4b09903de5	fix Nous auth refresh for idle agents	2026-06-21 22:43:48 -07:00
teknium1	b5bd66eac9	fix(telegram): observed/replied group docs of any type are cached too Follow-up to the accept-any-file-type change. The observe-unmentioned and replied-media paths relied on cache_media_bytes() returning None for unsupported document types to emit an 'unsupported, not cached' note. Now that any file type is always cached, those docs are cached and surfaced with a path-pointing note — consistent with the main document path. The remaining cached-is-None branch is image-validation-failure only; its note is reworded accordingly. Updates the group-gating test to the new contract.	2026-06-21 22:43:45 -07:00
teknium1	4314d451ca	fix(gateway): accept any inbound file type across all messaging platforms Authorization to message the agent is the gate, not the file extension. Previously the inbound-attachment allowlist (SUPPORTED_DOCUMENT_TYPES) was opt-OUT on Discord (allow_any_attachment defaulted false) and had no bypass at all on Telegram/Slack — so an .html (or any non-allowlisted type) was dropped or hard-rejected before the agent saw it. Now every authorized upload is cached and surfaced to the agent regardless of type: - base.cache_media_bytes(): unknown types cache as octet-stream (or the caller-supplied MIME) instead of returning None — fixes the chokepoint that Teams/Telegram-media route through. - discord/telegram/slack adapters: removed the allowlist reject/skip; any non-media attachment is typed DOCUMENT and cached. Known types keep their precise MIME. - Text inlining now gates on a shared _TEXT_INJECT_EXTENSIONS set (text + code + config + markup) instead of a blind UTF-8 decode, so binary formats (PDF/zip/docx) with ASCII headers are never inlined. - gateway/run.py emits the path-pointing context note for every DOCUMENT, including non text/application MIME types. - discord.allow_any_attachment is now a documented no-op kept for config back-compat. Validation: 357 gateway tests pass; E2E confirms .html/.bin/custom types cache, known types stay precise, PDFs are not inlined.	2026-06-21 22:43:45 -07:00
Ben Barclay	de6b3ae377	fix(terminal): bridge docker_extra_args to TERMINAL_DOCKER_EXTRA_ARGS in CLI + gateway (#50631) terminal.docker_extra_args passes flags verbatim to `docker run` (e.g. --gpus=all, --shm-size=16g). It was wired into DEFAULT_CONFIG, TERMINAL_CONFIG_ENV_MAP (so `hermes config set` bridged it), terminal_tool._get_env_config (reads TERMINAL_DOCKER_EXTRA_ARGS), and DockerEnvironment (applies extra_args) -- but it was MISSING from cli.py's env_mappings and gateway/run.py's _terminal_env_map. Consequence: a user who hand-edits config.yaml (rather than running `hermes config set`) has docker_extra_args silently dropped on the CLI and gateway/desktop startup paths, while docker_image / docker_volumes (which ARE in those maps) bridge correctly -- producing the reported 'Hermes partially reads the Docker config' symptom where --gpus=all and --shm-size=16g never reach docker run. This is the same bridge-coverage bug class that shipped before for docker_run_as_host_user (cli + gateway) and docker_mount_cwd_to_workspace (gateway). Fix by adding the key to both maps, plus a dedicated regression pin in test_terminal_config_env_sync.py mirroring the existing test_docker_*_is_bridged_everywhere guards.	2026-06-22 15:41:23 +10:00
Ben Barclay	6202fdfc35	fix(container): detect dashboard role under s6-overlay v3 (#49196 ) (#50600 ) * fix(gateway): walk /proc//cmdline to find main-wrapper.sh under s6-overlay v3 (#49196) (cherry picked from commit `3a108c2df0`) fix(container): peel s6-v3 rc.init prefix so dashboard role is detected kyssta-exe's preceding commit (#49238) fixed _read_container_argv() to locate the rc.init-launched main-wrapper.sh process under s6-overlay v3, but the skip still never fired: _strip_container_argv_prefix() only peeled a prefix when args[0] was init/main-wrapper.sh/hermes. Under s6 v3 the matched argv is /bin/sh -e /run/s6/basedir/scripts/rc.init top /opt/hermes/docker/main-wrapper.sh dashboard ... so args[0] stayed /bin/sh, _is_dashboard_container() returned False, and the dashboard container reconciled + started its own gateway-default — the exact dual Telegram getUpdates 409 in issue #49196. Fix: strip everything up to and including the main-wrapper.sh token (the stable boundary the image owns), covering both the v2 (/init ...) and v3 (/bin/sh ... rc.init top ...) shapes with one rule, instead of matching launcher tokens positionally. This also repairs _is_legacy_gateway_run_request() under v3, which shares the same strip helper (the issue called this out). Tests: extend the dashboard true/false parametrize sets with the s6-v3 argv shape, and add test_main_skips_reconcile_in_dashboard_container_s6v3 exercising main() end-to-end with the v3 argv. Verified via mutation that both new v3 assertions fail under the old positional strip and pass with the fix. --------- Co-authored-by: kyssta-exe <kyssta-exe@users.noreply.github.com>	2026-06-22 15:35:38 +10:00
Teknium	e448b21414	feat(dashboard): interactive auth setup on no-provider non-loopback bind (#50551 ) When `hermes dashboard --host 0.0.0.0` is run interactively with the auth gate engaged but no DashboardAuthProvider configured, prompt to set up the bundled username/password provider on the spot (or point at `hermes dashboard register` for OAuth) instead of only emitting the fail-closed error. - main.py: `_maybe_setup_dashboard_auth_interactively()` runs before start_server. No-ops on loopback binds, when a provider is already registered, or when stdin/stdout isn't a TTY (Docker/s6, CI, piped runs) so the fail-closed SystemExit stays the backstop for unattended deploys. On the password path it writes dashboard.basic_auth.{username,password_hash,secret} to config.yaml (scrypt hash, never plaintext), then force-rediscovers plugins so the basic provider registers before the gate check. - web_server.py: fix the fail-closed hint — it told operators to set `dashboard_auth.basic.username` but the provider reads `dashboard.basic_auth`. - docs: note the interactive setup under Fail-closed semantics. No new env vars; reuses the existing dashboard.basic_auth config surface.	2026-06-21 20:21:48 -07:00
Teknium	9e96e70995	feat(cli): /prompt — compose your next prompt in $EDITOR (#50509 ) * feat(cli): /prompt — compose your next prompt in $EDITOR Adds /prompt (alias /compose): opens $VISUAL/$EDITOR on a temp markdown file so you can hand-edit a multi-line prompt, then sends the saved buffer as the next agent turn. Text after the command pre-seeds the buffer; an empty save cancels. Reuses the one-shot _pending_agent_seed the interactive loop already consumes (same mechanism as /blueprint), so no changes to the input event loop or message pipeline. CLI-only. * feat(tui): /prompt slash command opens $EDITOR (parity with CLI) The TUI already opens $EDITOR via Ctrl+G (openEditor), but had no /prompt slash command like the classic CLI. Wire openEditor into the slash handler context and register /prompt (alias /compose) to call it; inline text after the command is dropped into the composer first so it carries into the editor, matching the CLI's /prompt <text>.	2026-06-21 20:21:33 -07:00
Teknium	95d53c3bcb	feat(cli): /reasoning full — show complete thinking, not 10-line clamp (#50499) * feat(cli): /reasoning full to show complete thinking, not 10-line clamp The post-response Reasoning recap box hard-clamped long thinking to the first 10 lines, so there was no way to see the full reasoning trace after a turn (live streaming already shows it in full). Add display.reasoning_full (default off) plus /reasoning full|clamp to toggle it at runtime; the clamp truncation note now points at the command. Addresses repeated user requests to show all thinking tokens. * test(gateway): de-snapshot /reasoning help assertion The test froze the exact args-hint literal '/reasoning [level|show|hide]', which the new full/clamp args change to '[level|show|hide|full|clamp]'. Convert to an invariant: assert /reasoning is in help and carries its core args, not the exact hint string. * feat(tui): /reasoning full|clamp parity in tui_gateway The classic-CLI reasoning_full toggle had no TUI equivalent — typing /reasoning full in the TUI fell through to parse_reasoning_effort and errored. The TUI renders thinking as an expand/collapse section (no fixed 10-line recap), so map full -> sections.thinking=expanded (raw, uncapped via thinkingPreview mode='full') and clamp -> collapsed, persisting display.reasoning_full for cross-surface config consistency.	2026-06-21 20:21:11 -07:00
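
The token-only compression check described in commit 47b6b4cf85 can be sketched roughly as follows. This is an illustration, not the code in conversation_loop.py: the function name, its parameters, and the zero-estimate guard are hypothetical. Only the two signals (fewer messages, or a token reduction of at least 5%) come from the commit message.

```python
def compression_made_progress(
    msgs_before: int,
    msgs_after: int,
    tokens_before: int,
    tokens_after: int,
    min_token_reduction: float = 0.05,  # the ">=5%" threshold from the commit message
) -> bool:
    """Return True if a compaction pass shrank the request enough to retry.

    Hypothetical helper: the real checks live in the 413 and
    context-overflow handlers and are not reproduced here.
    """
    # Original signal: a lower message count always counts as progress.
    if msgs_after < msgs_before:
        return True
    # Assumption: treat a missing or zero estimate as "no progress"
    # rather than dividing by zero.
    if tokens_before <= 0:
        return False
    # New signal: same message count, but materially fewer tokens.
    reduction = (tokens_before - tokens_after) / tokens_before
    return reduction >= min_token_reduction
```

With this shape, a pass that leaves the message count unchanged but prunes tool results from 1000 to 900 estimated tokens counts as progress and triggers a retry, while a 1000 to 990 change does not.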

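The argv rule from commit 6202fdfc35 (strip everything up to and including the main-wrapper.sh token) might look like the simplified sketch below. The function names are hypothetical stand-ins for `_strip_container_argv_prefix()` and `_is_dashboard_container()`, whose real bodies are not shown in this log. Returning the argv unchanged when no wrapper token is present is an assumption, as is the exact shape of the v2 argv used in the usage note.

```python
from typing import List

# Boundary token named in the commit message.
WRAPPER_NAME = "main-wrapper.sh"


def strip_through_wrapper(argv: List[str]) -> List[str]:
    """Drop every token up to and including the wrapper script.

    Hypothetical sketch: matching on the wrapper's basename, instead of on
    launcher tokens by position, covers different launcher prefixes with
    one rule.
    """
    for i, token in enumerate(argv):
        if token.rsplit("/", 1)[-1] == WRAPPER_NAME:
            return argv[i + 1:]
    # Assumption: with no wrapper token, leave the argv untouched.
    return list(argv)


def looks_like_dashboard(argv: List[str]) -> bool:
    """True when the first token after the wrapper is 'dashboard'."""
    rest = strip_through_wrapper(argv)
    return bool(rest) and rest[0] == "dashboard"
```

For the s6-overlay v3 shape quoted in the commit, `["/bin/sh", "-e", "/run/s6/basedir/scripts/rc.init", "top", "/opt/hermes/docker/main-wrapper.sh", "dashboard"]`, the helper leaves `["dashboard"]`, so the dashboard check passes. A v2-style argv such as `["/init", "/opt/hermes/docker/main-wrapper.sh", "dashboard"]` reduces to the same result.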
12556 commits