hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-09 08:21:50 +00:00

Author	SHA1	Message	Date
teknium1	76f01780f0	fix(kanban): sweep deferred scratch parent on non-scratch child completion + tests Follow-up on the deferred-cleanup salvage (#33774): _cleanup_workspace returned early for a non-scratch ('dir'/'worktree') task and never ran the parent sweep, so a scratch parent waiting on a 'dir' child would leak its deferred workspace forever. Run the parent sweep before the early return. Adds regression tests: deferred-while-child-active, swept-after-last-child, and dir-child-unblocks-scratch-parent.	2026-06-07 09:50:44 -07:00
annguyenNous	9405cd0812	fix: defer scratch workspace cleanup when task has active children (#33774 ) When a Kanban task with workspace_kind=scratch completes, the _cleanup_workspace() function immediately deletes the workspace directory. If the task has children linked via task_links, those children find the workspace deleted when they start. This fix adds two checks: 1. Before deleting, check if any children are still active (todo/ready/running). If so, defer cleanup. 2. After a child completes, check if parent workspace can now be cleaned up (all children terminal). Fixes NousResearch/hermes-agent#33774	2026-06-07 09:50:44 -07:00
Teknium	cb3e41e2fd	feat(onboarding): opt-in structured profile-build path on first contact (#41114 ) * feat(onboarding): opt-in structured profile-build path on first contact On a user's very first gateway message, Hermes now optionally offers to build a short profile of them — then, only with consent, gathers durable facts and persists them to the user-profile memory store (memory tool, target="user") so future sessions start already knowing who they are. Inspired by Poke's zero-input onboarding, but consent-first by design: - The agent OFFERS, never assumes. Declining stops it immediately. - Before ANY external lookup it states what it will look up and asks. - It never reads connected accounts (email/calendar) silently — the exact privacy concern that made naive implementations feel invasive. Wiring reuses existing infrastructure end-to-end: - gateway/run.py first-message hook (was a plain self-intro) now swaps in the profile-build directive when enabled and not yet offered. - agent/onboarding.py gains profile_build_mode()/profile_build_directive() + PROFILE_BUILD_FLAG, latched once via the existing onboarding.seen mechanism so the offer fires at most once per install. - config default onboarding.profile_build: "ask" (set "off" to disable). Added to an existing section, so no _config_version bump needed. No new storage layer, no new injection path, no prompt-cache impact. * fix(dashboard): fold onboarding into agent tab to avoid 1-field category onboarding.profile_build is the only schema-surfaced onboarding field (onboarding.seen is an internal latch dict), so the dashboard CONFIG_SCHEMA single-field-category invariant rejected it. Merge onboarding -> agent like the other small categories.	2026-06-07 08:36:48 -07:00
Teknium	d87f293972	feat(compression): temporal anchoring in compaction summaries (#41102 ) Compaction summaries now receive the current date and instruct the summarizer to rewrite completed actions as absolute, dated, past-tense facts (e.g. "email John about the proposal" -> "Sent the proposal email to John on 2026-06-07"). A resumed conversation no longer re-issues work that already happened or treats a finished action as still pending. The date is resolved via hermes_time.now() (date-only, user-configured timezone) inside _generate_summary. The compaction summary is a mid-conversation message that is never part of the cached prefix, so the date does not affect prompt-cache stability. Date resolution is best-effort: a clock failure omits the rule rather than blocking compaction. The rule rides the shared template, so both first-compaction and iterative-update prompts carry it. Inspired by Poke's summarization (temporal anchoring + semantic preservation).	2026-06-07 08:36:45 -07:00
Teknium	9dbad1990b	test(discord): align clarify/model-picker tests with fail-closed component auth (#41338 ) Three gateway tests broke on main after the component-auth security hardening (test_discord_component_auth.py) made empty Discord component allowlists fail-closed: a view built with allowed_user_ids=set() now rejects every click instead of allowing anyone. The clarify and model-picker BEHAVIOR tests still constructed their views with an empty allowlist and expected the click to succeed — a stale assumption from before the hardening. Fixed by giving each view an allowlist containing the clicking user (the interaction's own id), which is the realistic shape and what the security model requires. Production code unchanged — this only updates the test fixtures to match the intended (and separately pinned) fail-closed contract. The security regression suite and these behavior suites now both pass. Fixes: - test_discord_clarify_buttons.py: test_choice_falls_back_to_label_text_when_entry_missing, test_other_flips_entry_to_awaiting_text - test_discord_model_picker.py: test_model_picker_clears_controls_before_running_switch_callback	2026-06-07 08:27:40 -07:00
Teknium	a317e54935	chore(release): map Dusk1e and LaPhilosophie for approval fail-closed salvage (#33844 , #33866 , #30964 )	2026-06-07 06:21:37 -07:00
LaPhilosophie	f6f363662e	fix(discord): fail closed for component button auth when no allowlist set Salvage of the Discord half of PR #30964 by @LaPhilosophie. Discord component button callbacks (ExecApprovalView, SlashConfirmView, UpdatePromptView, ModelPickerView) bypass the normal message dispatch authorization path. _component_check_auth previously returned True when both the user and role allowlists were empty, so any guild member who could see an approval prompt could click Approve on a dangerous command. Fail closed instead: require DISCORD_ALLOWED_USERS / DISCORD_ALLOWED_ROLES / GATEWAY_ALLOWED_USERS membership, or an explicit DISCORD_ALLOW_ALL_USERS / GATEWAY_ALLOW_ALL_USERS opt-in for deliberately-open deployments. Mirrors the Telegram (#24457) and Matrix fail-closed precedent. The Slack half of #30964 is superseded by PR #33844's helper. Reported via GHSA-mc26-p6fw-7pp6 (@whyiug). Co-authored-by: LaPhilosophie <804436395@qq.com>	2026-06-07 06:21:37 -07:00
Dusk1e	3fa15b33dd	fix(feishu): fail closed for update prompt card actions	2026-06-07 06:21:37 -07:00
Dusk1e	410cb743bf	fix(slack): re-check gateway auth on approval and slash-confirm buttons	2026-06-07 06:21:37 -07:00
Teknium	2912d94370	fix: guard int(os.getenv()) casts against malformed env vars (#40598 ) A non-numeric value in env vars like HERMES_STREAM_RETRIES, HERMES_KANBAN_SPECIFY_MAX_TOKENS, GOOGLE_CHAT_MAX_BYTES, IRC_PORT, etc. raised ValueError at import/init and crashed startup. Parse them safely, falling back to the default. Unified onto the existing utils.env_int(key, default) helper for core/ hermes_cli/tools modules instead of the original PR's three duplicate local helpers; plugins keep minimal inline guards (no core-utils import). All existing max()/min()/`or extra.get()` wrappers preserved. Co-authored-by: annguyenNous <annguyenNous@users.noreply.github.com>	2026-06-07 06:14:24 -07:00
oxngon	e2cc24e331	fix: respect Honcho env var fallback in doctor and honcho status hermes doctor and hermes honcho status warned 'Honcho config not found' whenever ~/.honcho/config.json was absent, even though HONCHO_API_KEY in .env resolves a working config via HonchoClientConfig.from_global_config() -> from_env(). Both now check hcfg.api_key/base_url before warning. Co-authored-by: oxngon <98992931+oxngon@users.noreply.github.com>	2026-06-07 05:37:02 -07:00
teknium1	fa8fd513ea	chore(release): add synapsesx to AUTHOR_MAP for #40495 salvage	2026-06-07 05:01:27 -07:00
synapsesx	f10a330aee	fix(research): keep tool_call/tool_response pairs intact when compressing trajectories ## What does this PR do? The trajectory compressor could corrupt training trajectories by cutting a conversation in the middle of a tool-call/tool-response pair. In the from/value trajectory format a `tool` turn (carrying `<tool_response>` markers) is always emitted immediately after the `gpt` turn whose `<tool_call>` it answers, so the two turns must stay together. The compressible region's end boundary, however, was chosen purely by token accumulation: the loop stopped at the first turn where the accumulated tokens met the savings target, with no regard for turn roles. For any over-budget trajectory whose savings boundary happened to land between a `gpt` turn and its `tool` turn, the `gpt` (with its `<tool_call>`) was summarised away into the replacement `human` message while the now-orphaned `tool` turn (with its `<tool_response>`) was kept verbatim in the tail — producing an unmatched marker and silently corrupting the training signal. The head boundary had the mirror problem when the first tool turn was not protected. This change snaps both compression boundaries to a clean turn boundary before the region is extracted and replaced, so the summary always covers whole gpt+tool blocks and a `tool` turn is never separated from the `gpt` turn that precedes it. The boundary is moved forward when possible (folding an orphaned tool turn into the region that already holds its gpt) and falls back to moving backward when no clean boundary exists ahead, such as when the protected tail itself begins on a tool turn. ## Related Issue N/A ## Type of Change - [x] 🐛 Bug fix (non-breaking change that fixes an issue) ## Changes Made - `trajectory_compressor.py`: added `_is_boundary_clean()` and `_snap_boundary()` helpers on `TrajectoryCompressor`, and applied them to both the head and tail compression boundaries in `compress_trajectory()` and `compress_trajectory_async()`. When snapping collapses the region to nothing safe to compress, the trajectory is returned unchanged and flagged as still over the limit rather than being corrupted. - `tests/test_trajectory_compressor.py`: added `TestCompressionToolPairIntegrity` covering the sync and async paths plus direct unit tests for the boundary snapping (forward skip and backward fallback). ## How to Test 1. Run the focused tests: `pytest tests/test_trajectory_compressor.py -q`. 2. The new sync/async cases build a trajectory of gpt/tool pairs with an oversized middle gpt turn and choose a token target that forces the accumulation boundary to stop between a `<tool_call>` and its `<tool_response>`. They assert that `<tool_call>` and `<tool_response>` markers stay balanced after compression and that every kept `tool` turn is immediately preceded by a `gpt` turn (never the inserted summary or another tool turn). ## Checklist ### Code - [x] I've read the [Contributing Guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md) - [x] My commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) (`fix(scope):`, `feat(scope):`, etc.) - [x] I searched for [existing PRs](https://github.com/NousResearch/hermes-agent/pulls) to make sure this isn't a duplicate - [x] My PR contains only changes related to this fix/feature (no unrelated commits) - [x] I've run `pytest tests/ -q` and all tests pass - [x] I've added tests for my changes (required for bug fixes, strongly encouraged for features) - [x] I've tested on my platform: macOS 15 (Darwin 25.5) ### Documentation & Housekeeping - [x] I've updated relevant documentation (README, `docs/`, docstrings) — or N/A - [x] I've updated `cli-config.yaml.example` if I added/changed config keys — or N/A - [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows — or N/A - [x] I've considered cross-platform impact (Windows, macOS) per the [compatibility guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md#cross-platform-compatibility) — or N/A - [x] I've updated tool descriptions/schemas if I changed tool behavior — or N/A	2026-06-07 05:01:27 -07:00
manishbyatroy	490c486ff6	fix(simplex): accept display name in SIMPLEX_ALLOWED_USERS SIMPLEX_ALLOWED_USERS silently denied every contact when operators listed display names instead of numeric contactIds. The SimpleX UI never surfaces the numeric id, so display names are what operators naturally put in the env var. _is_user_authorized only compared source.user_id (the contactId), so the allowlist never matched. Expand check_ids to include source.user_name for the simplex platform, mirroring the existing WhatsApp phone-LID aliasing pattern. Adds doc + setup-prompt clarification and three regression tests. Salvaged from PR #40393. Adds manishbyatroy to release.py AUTHOR_MAP.	2026-06-07 04:53:22 -07:00
Teknium	9d72680ca3	fix(desktop): make the running-turn timer per-session (#41182 ) The desktop statusbar turn timer read a single process-global $turnStartedAt, set/cleared only for the active session. With multiple same-profile sessions running at once, switching to session B reset the one shared clock, so session A's still-running turn "restarted from zero" the moment you left it — exactly the behaviour @Da7_Tech reported after the profile-scoped session work. Move turnStartedAt onto ClientSessionState so each session owns its own turn clock. The global atom now just mirrors whichever session is focused, written on view-sync (the flush that already stages the active session's state). A backgrounded turn keeps counting in its own cache entry, and focusing it restores its real elapsed time instead of zeroing it. Set/clear sites: message.start (seed), message.complete + error + interrupted bail (clear), and the session.info running-state path (seed if missing / clear on stop) so a turn that goes busy via session.info — e.g. resuming a session that's already running — also gets a clock. Note: the agent loop itself never froze — every same-profile session runs in its own backend thread and background deltas are buffered per-session. This fixes the timer-reset symptom; the "no live progress until you return" is inherent to a single-view transcript and is out of scope here.	2026-06-07 04:29:05 -07:00
teknium1	1a4010edf5	test(approval): regression for shell-escape denylist bypass (#36846 , #36847 )	2026-06-07 03:57:21 -07:00
ashishpatel26	621bf3a873	fix(security): strip shell escapes in denylist normalizer; fail-closed on missing approval module DANGEROUS_PATTERNS and HARDLINE_PATTERNS are matched on the raw command string, so backslash-escape (r\m) and empty-quote split (r''m) bypass both lists. _normalize_command_for_detection now strips these before pattern matching. tui_gateway shell.exec had a bare 'except ImportError: pass' that silently disabled the entire safety gate if tools.approval wasn't importable. Changed to fail-closed (return 5001 error). Added detect_hardline_command check. Fixes #36846, #36847.	2026-06-07 03:57:21 -07:00
Teknium	1fb99b1f22	fix(stream+output-cap): guard empty streams and parse OpenRouter output-cap errors (#40589 ) Two isolated reliability fixes: - chat_completion_helpers: raise on a zero-chunk stream (no finish_reason, no content/reasoning/tool_calls) so retry handles it instead of fabricating a successful empty turn. - model_metadata: parse the OpenRouter/Nous output-cap error phrasing ("maximum context length is N ... (A of text input, B of tool input, C in the output)") so parse_available_output_tokens_from_error returns a real cap and the caller stops looping on it. Salvaged from #40405 (@ashishpatel26) — took the two stream/error-parsing fixes. The PR also bundled compression-state changes (on_session_start clearing _previous_summary; cron session-id prefix preservation, #38788); those touch the compression hot path and are split out for separate review. Co-authored-by: ashishpatel26 <ashishpatel26@users.noreply.github.com>	2026-06-07 03:52:09 -07:00
teknium1	02aad08acf	fix(desktop): bootstrap falls back to installed agent install.sh on GitHub 404 Packaged Desktop first-launch bootstrap no longer dies with a fatal HTTP 404 when install-stamp.json pins a commit that isn't fetchable from GitHub. This only happens for locally-built desktop apps: write-build-stamp.cjs's fromLocalGit() pins `git rev-parse HEAD`, which can be an unpushed commit or dirty tree. CI builds stamp $GITHUB_SHA and are unaffected. The fix unblocks the dev / self-builder workflow. resolveInstallScript() now wraps the GitHub download in try/catch; on failure it resolves ~/.hermes/hermes-agent/scripts/install.sh (the already-installed agent checkout), copies it into bootstrap-cache, and returns it as source 'installed-agent'. If the cache copy fails (read-only FS), it uses the source path directly. With no installed checkout to fall back to, the original error rethrows unchanged. Download is now injectable via an optional _download param so the fallback path is tested hermetically (no network). Reported with a precise repro and suggested fix by @Tamaz-sujashvili (#40815). Co-authored-by: Tamaz-sujashvili <56168197+Tamaz-sujashvili@users.noreply.github.com>	2026-06-07 03:46:12 -07:00
Teknium	9e63109522	feat(dashboard): change UI font from the theme picker, independent of theme (#41145 ) The dashboard font is now selectable from the UI, not just YAML. A new Font section in the header theme picker overrides the UI font of whatever theme is active; the choice is orthogonal to the theme and survives theme switches. Each theme keeps its own font as the default — picking "Theme default" clears the override. - web/src/themes/fonts.ts: curated font catalog (system + Google Fonts across sans/serif/mono), each with a family stack and optional webfont URL. The catalog is the only injected-font surface — no free-text URL box, so the injected <link> origins stay fixed. - web/src/themes/context.tsx: font-override state (localStorage + server), applied after theme typography so it wins; theme apply re-asserts it, and clearing re-runs theme apply to restore the theme's own font. Mono is left to the theme so code/terminal are untouched. - web/src/components/ThemeSwitcher.tsx: Font section with grouped, self- previewing font rows and a "Theme default" clear option. - hermes_cli/web_server.py: GET/PUT /api/dashboard/font persisting to config.yaml dashboard.font, with a server-side id allow-list (unknown ids coerce to the theme sentinel). - i18n + types, api client methods, tests, and docs. Validation: 6 new backend endpoint tests pass; tsc + vite build clean; live browser test confirmed pick/persist/survive-theme-switch/clear all work.	2026-06-07 03:39:01 -07:00
Teknium	136dae779e	fix(cli): return bool (not None) when a destructive-slash confirmation is cancelled (#40583 ) process_command() is typed -> bool, but the /clear, /new, and /undo cancel paths did a bare `return` (None) when _confirm_destructive_slash was declined, leaking None through the bool contract. Return True (command handled, keep the REPL alive) on cancel. Co-authored-by: yubingz <yubingz@users.noreply.github.com>	2026-06-07 02:49:28 -07:00
Teknium	0507e4630d	fix(desktop): preserve configured base_url on same-provider model switch (#41121 ) The desktop model picker calls POST /api/model/set with provider+model only (no base_url). _apply_main_model_assignment cleared model.base_url for every non-custom provider, so re-picking a Xiaomi MiMo model wiped a Token Plan endpoint (https://token-plan-*.xiaomimimo.com/v1) back to the registry default api.xiaomimimo.com — breaking valid tp- keys with 401s. Now base_url is cleared only when switching to a different provider (the stale URL belonged to the old one); same-provider re-assignment preserves it, and an explicitly supplied base_url is honored for any provider.	2026-06-07 02:48:21 -07:00
Teknium	349a3f601c	fix(desktop): stop bare-URL autolinker swallowing trailing emphasis asterisks (#41093 ) The desktop markdown preprocessor autolinks bare URLs by wrapping them in <...>. RAW_URL_RE allowed '' in its character classes, so a bold line with a URL and no separating space — e.g. 'PR opened: https://.../pull/123' — greedily pulled the closing '' into the href, producing a broken link and an unterminated bold run. Exclude '' from both URL character classes; '_' and '~' (which can appear in real paths) are preserved.	2026-06-07 02:47:39 -07:00
Teknium	ed81cfe3de	fix(cron): bound the desktop run-history query to one job (#41088 ) The cron run-history endpoint (GET /api/cron/jobs/{id}/runs, added in #40684) reused list_sessions_rich's order_by_last_active path with a leading-wildcard id_query. That routes through the recursive compression-chain CTE, which seeds from EVERY source='cron' row in the DB and runs per-row preview/last_active subqueries before filtering to one job and applying LIMIT. Work scaled with the total cron history, so a large pile made the run-history load time out before eventually populating. Cron runs are flat, never-compressed sessions with ids of the form cron_{job_id}_{ts}, so the chain machinery is pure overhead and the job binding is a true prefix, not a substring. - New SessionDB.list_cron_job_runs(): bounded [prefix, hi) id-range scan on source='cron', ordered by started_at DESC, with the same preview/last_active enrichment. No CTE, no leading-wildcard LIKE. - Add idx_sessions_source(source, id) so the range is an index scan; bump SCHEMA_VERSION 14 -> 15 (index reconciles onto existing DBs via CREATE INDEX IF NOT EXISTS on startup). - Point the endpoint at the new method. Measured on a real SessionDB with 30k cron rows: 5ms vs 85ms for the old path (16x), and the new path stays flat as the pile grows while the old one scaled with it. Verified the query plan uses idx_sessions_source_id (range scan, no full table scan), runs are correctly scoped (substring collisions like cron_xalpha_ excluded), newest-first, and paged.	2026-06-07 02:41:01 -07:00
Teknium	5a3092b601	fix(desktop): scope in-session /model switch per-session, stop process-env leak (#41120 ) * fix(desktop): scope in-session /model switch per-session, stop process-env leak The desktop/dashboard tui_gateway backend hosts every same-profile session in ONE process. An in-session /model switch wrote process-global env vars (HERMES_MODEL / HERMES_INFERENCE_MODEL / HERMES_TUI_PROVIDER / HERMES_INFERENCE_PROVIDER), which _resolve_startup_runtime() reads when building a fresh agent. So switching the model in one session leaked into every other live session's next agent rebuild (/new, resume) — changing the model in session B silently changed it in session A. Fix: record the switch as a per-session model_override on the session dict instead of mutating os.environ. _make_agent honors that override on rebuild (carrying the concrete base_url/api_key/api_mode the switch resolved), and falls back to global config when absent. Global persistence on the --global flag is unchanged. Also a cleaner fix for #16857 (/new after switching to a custom-provider model): the override carries the resolved credentials, so the rebuild keeps the right endpoint without relying on the leaky env vars. Reported via Twitter (@Da7_Tech): MiniMax M3 in one session + GLM 5.1 in another interfere when switching between them. * test(tui_gateway): align /model switch tests with per-session override contract The three test_config_set_model_syncs_* tests asserted the old leaky contract (switch writes HERMES_MODEL / HERMES_TUI_PROVIDER / HERMES_INFERENCE_PROVIDER to process env). That env-sync IS the cross-session contamination bug this PR removes. Updated to assert the new contract: shared process env untouched, the switch recorded as a per-session model_override carrying provider/model/base_url/ api_key/api_mode. #16857's intent (a custom-provider switch survives /new) is still covered — now via the override _make_agent honors on rebuild.	2026-06-07 02:33:28 -07:00
Teknium	4b9862eb7f	chore: map bmoore210 author email for PR #40550 salvage	2026-06-07 02:15:23 -07:00
bmoore210	b55ac45264	fix(desktop): scope session list to active profile + longer timeout The desktop sidebar fetched the unified cross-profile session list as profile='all' and filtered it client-side by the active profile. On a large multi-profile install the active profile's rows could be windowed out of the cross-profile recency page entirely, so switching to a profile agent showed an empty history panel (and the 'all' fetch could exceed the 15s IPC timeout on startup). Scope the fetch to the active profile so its own page comes back on its merits, and bump the session-list IPC timeout to 60s. profileScope is now a refreshSessions dep, so the existing gateway-open effect re-pulls on profile switch.	2026-06-07 02:15:23 -07:00
bmoore210	330ca4585b	fix: harden gateway startup and turn persistence Persist the inbound user turn before provider/tool execution so a crash before run_conversation() (e.g. provider/httpx client init failure) keeps the inbound message in the transcript. Repair stale/missing SSL_CERT_FILE state on gateway startup, and avoid duplicate gateway fallback writes.	2026-06-07 02:15:23 -07:00
helix4u	591e6fb8f4	fix(computer_use): honor custom vision routing	2026-06-07 02:09:20 -07:00
kshitijk4poor	ffe665277c	fix(aux): honor model.default_headers on auxiliary client too (#40033 ) The salvaged main-agent fix (sanidhyasin) applies model.default_headers to the primary OpenAI client, but the auxiliary client (title generation, context compression, vision routing) builds its own clients and did not read the override. For a `provider: custom` endpoint behind a gateway/WAF that rejects the OpenAI SDK's identifying headers, the main turn would succeed while auxiliary calls to the same endpoint still failed with the opaque 502/4xx from #40033. Add agent.auxiliary_client._apply_user_default_headers() (user values win over provider/SDK defaults; no-op when unconfigured) and apply it at every OpenAI-wire client construction site: - _try_custom_endpoint() — config-level `model.provider: custom` - the named custom-provider branch (custom_providers/providers entries), including the anthropic-SDK-missing OpenAI-wire fallback - the api-key-provider, async-conversion, and main resolve_provider_client fallback branches To prevent the two clients ever drifting on precedence/value handling, AIAgent._apply_user_default_headers (run_agent.py) now delegates the config read + merge to this shared helper (run_agent already imports from auxiliary_client). Native Anthropic/Bedrock branches are untouched (they don't use the OpenAI wire). 8 new tests (helper semantics + config-level custom + named custom); full aux + attribution header suites green (295).	2026-06-07 02:02:40 -07:00
Sanidhya Singh	a216ff839b	fix(agent): honor model.default_headers for custom OpenAI-compatible providers (#40033 ) Custom OpenAI-compatible endpoints sitting behind a gateway/WAF can reject the OpenAI Python SDK's default identifying headers (User-Agent: OpenAI/Python, X-Stainless-*) and return an opaque 502/4xx even though the same request body succeeds under curl. There was no supported way to override those headers. Add a model.default_headers config key whose values are merged onto the OpenAI client's default_headers, taking precedence over provider- and SDK-supplied defaults. Applied at client construction and on every credential swap / client rebuild so the override survives reconnects. No-op for native Anthropic / Bedrock modes and when unconfigured.	2026-06-07 02:02:40 -07:00
Teknium	f5c3fc319c	docs(i18n): port deep-audit corrections to zh-Hans mirror (#41104 ) Mirrors the EN deep-audit fixes (PR #40952) into the zh-Hans translation so the two locales agree. zh-Hans is the only non-English locale; 26 translated pages carried the same stale claims. Corrections ported (code tokens identical across locales; prose re-translated where the surrounding text was already Chinese): - reference: /version slash command + dual-surface list; cli --provider adds openai-api + novita aliases; tool count 70->71 (+ removed phantom "10 RL tools" and fixed kanban 7->9); model_catalog ttl 24->1. - user-guide: hermes -w -q -> -w -z; language list 8->16; aux slots 8->11; docker separate-dashboard claim; gateway-streaming per-platform note; computer-use frontmatter. - features: curator prune_builtins truth; codex-runtime aux keys (context_compression->compression, vision_detect->vision); voice-mode STT/TTS enums; removed phantom rl toolset. - integrations: StepFun step-3-mini->step-3.5-flash; web-search backends 4->8; nous-portal status subcommand. - messaging: WeCom typing/streaming columns; telegram transport default edit->auto; sms host 0.0.0.0->127.0.0.1; simplex/ntfy gateway-setup + pairing approve; line smart-chunking; matrix MATRIX_DM_AUTO_THREAD; msgraph host note. - developer-guide: entry-point group hermes.plugins->hermes_agent.plugins; PLUGIN.yaml->plugin.yaml. Net-new EN sections (mcp mTLS, api-server run-approval, kanban CLI verbs) are untranslated in zh-Hans and fall back to English source, consistent with the mirror's existing partial-coverage state. Verified: docusaurus build --locale zh-Hans succeeds; no new broken anchors from these edits.	2026-06-07 01:57:18 -07:00
Teknium	3c8f1dee8d	fix(compression): don't overwrite the -1 post-compression sentinel in preflight seed (#36718 ) compress_context() sets last_prompt_tokens=-1 right after compression to mark "no real API usage yet". The preflight display-seed used `_preflight_tokens > (last_prompt_tokens or 0)`, and `(-1 or 0)` is -1 (truthy), so any positive rough estimate clobbered the sentinel with a schema-inflated count — re-triggering compression on the next turn. Treat any negative value as "no real data yet" and skip the seed. Salvaged from #40246 as the minimal root-cause fix. The original also added an `_awaiting_suppression_count` bounded-window state machine to should_compress() across 3 files; left out here to keep blast radius small — the sentinel guard alone fixes the re-fire. The suppression window can be added separately if the usage=None-stub edge case warrants it. Co-authored-by: davidgut1982 <davidgut1982@users.noreply.github.com>	2026-06-07 01:56:51 -07:00
kshitij	3763355f08	chore(release): map singhsanidhya741@gmail.com to sanidhyasin (#41094 ) Adds the AUTHOR_MAP entry for the #40403 salvage (model.default_headers for custom OpenAI-compatible providers, fixes #40033) so contributor_audit passes when the salvage PR lands.	2026-06-07 01:55:24 -07:00
Teknium	e18f14d928	test(kimi): align stale parity/profile tests with thinking-xor-effort contract (#41095 ) * test(kimi): align stale parity/profile tests with thinking-xor-effort contract `ce4e74b3` (fix(kimi): send thinking xor reasoning_effort, never both) changed the Kimi profile to emit at most one of extra_body.thinking or a top-level reasoning_effort, and added tests/plugins/model_providers/test_kimi_profile.py to pin it — but left two older test files still asserting the removed 'send both' behavior, turning main red for every PR branched after it. Update the stale assertions to the xor contract: - explicit recognized effort (low\|medium\|high) -> reasoning_effort only, no thinking - enabled w/o effort, or no reasoning_config -> thinking:enabled only, no reasoning_effort - disabled -> thinking:disabled only No production change. * test(kimi): cover remaining xor stale assertions (profile_wiring, run_agent) Two more test files asserted the pre-ce4e74b3 'thinking + reasoning_effort together' behavior — landed in a different CI shard so they surfaced only after the first batch went green: - tests/providers/test_profile_wiring.py::TestKimiProfileParity (2) - tests/run_agent/test_run_agent.py::TestBuildApiKwargs (3: kimi-coding, moonshot, moonshot-cn) Same realignment to the xor contract: default/enabled-without-effort emits thinking:enabled and no reasoning_effort; explicit effort emits reasoning_effort only. Verified by running the full provider + TestBuildApiKwargs Kimi surface (202 passed) plus a codebase-wide grep for any remaining paired thinking+effort assertion (none).	2026-06-07 01:52:49 -07:00
Teknium	0524c9b34e	feat(compression): raise compaction trigger to 85% for gpt-5.5 on Codex OAuth (#40957 ) The ChatGPT Codex OAuth backend hard-caps gpt-5.5 at a 272K context window (verified live: a ~330K-token request to chatgpt.com/backend-api/codex/responses is rejected with context_length_exceeded while ~250K succeeds; the same slug exposes 1.05M on the direct OpenAI API / OpenRouter and 400K on Copilot). At the default 50% trigger, auto-compaction fires at ~136K — half the usable window. Raise the trigger to 85% (~231K) on this exact route only, gated by a new compression.codex_gpt55_autoraise config flag (default true). When it fires, emit a one-time notice (CLI inline print + gateway status_callback replay) with the exact opt-back-out command. gpt-5.5 on any other provider keeps the user's global threshold. - _is_codex_gpt55() matches the 5.5 family only on provider=openai-codex - _compression_threshold_for_model() now provider-aware + opt-out param - config key + _config_version bump (27->28) for backfill - docs + tests (40 cases in test_arcee_trinity_overrides.py)	2026-06-07 01:40:50 -07:00
Teknium	2d099fed1e	docs: deep audit — registry drift, stale claims, 2-week PR coverage, dashboard screenshot (#40952 ) Full-corpus correctness audit of the hand-written docs against the codebase, plus a 2-week merged-PR coverage sweep and one live dashboard screenshot. Correctness (verified against COMMAND_REGISTRY / PROVIDER_REGISTRY / TOOLSETS / tools.registry / DEFAULT_CONFIG / source): - reference: add /version slash command, context_engine toolset, openai-api + novita-ai to --provider; fix tool count 64->71; model_catalog ttl 24->1; add profile describe to summary table; add real provider env vars (LM_API_KEY/LM_BASE_URL, KIMI_CODING_API_KEY, ALIBABA_CODING_PLAN_*, ANTHROPIC_BASE_URL, COPILOT_API_BASE_URL); fix faq "Windows: not natively". - user-guide: fix broken `hermes -w -q` (->-z) and `hermes logs --tail` (->-f); language list 8->16; aux slots 8->11; docker separate-dashboard claim; _SECURITY_ARGS -> _BASE_SECURITY_ARGS. - features: curator prune_builtins truth + missing CLI verbs; codex-runtime aux keys (context_compression->compression, vision_detect->vision); kanban terminate endpoint + promote/reassign/schedule/diagnostics/edit + per-profile cap; mcp mTLS (client_cert/client_key); built-in-plugins nemo_relay + teams_pipeline; api-server run approval endpoint; computer-use frontmatter. - features N-Z + integrations: StepFun step-3-mini->step-3.5-flash; web-search backends 4->8; tool-gateway image-model IDs; voice-mode STT/TTS enums; remove phantom `rl` toolset; nous-portal status subcommand. - messaging: WeCom typing/streaming cols; telegram transport default edit->auto; sms host default; simplex/ntfy `gateway setup` + pairing approve; line smart-chunking; matrix MATRIX_DM_AUTO_THREAD. - developer-guide: build-a-plugin code examples (register_command signature, ContextEngine/ImageGenProvider/MemoryProvider ABCs); model-provider-plugin entry-point group hermes.plugins->hermes_agent.plugins; PLUGIN.yaml->plugin.yaml; agent-loop stale LOC; web-search-provider phantom crawl(). PR coverage (2-week window, 149 feat PRs): - desktop.md refreshed for ~15 shipped features (zh-Hans switcher, rebindable shortcuts + zoom + Cmd+K, status-bar model picker + YOLO toggle, session-by-id + archive, multi-profile concurrent + cross-profile @session, composer history, Providers pane, per-profile remote hosts, Grok OAuth, aux-pin warning). - configuration.md gateway-streaming default corrected to per-platform. - tool-gateway.md free tool pool entitlement note. Media: - New /img/dashboard/admin-config.png — live dashboard Config admin page (captured from a clean profile, no secrets/personalization).	2026-06-07 01:39:06 -07:00
Teknium	3289d4adf2	fix(transcription): handle ffmpeg TimeoutExpired in _prepare_local_audio Follow-up to the subprocess timeout: _prepare_local_audio only caught CalledProcessError, so a timeout would raise uncaught. Return a clean error instead.	2026-06-07 01:26:33 -07:00
annguyenNous	7223f22d65	fix: add timeout to subprocess.run() and proc.wait() calls subprocess.run() and proc.wait() without timeout can hang indefinitely if the child process becomes unresponsive. This blocks the calling thread forever. Fixed locations: - tools/transcription_tools.py: ffmpeg conversion (timeout=300) and user-configured STT commands with shell=True (timeout=300) - gateway/run.py: helper script proc.wait() (timeout=3600) Not fixed: - agent/anthropic_adapter.py: interactive 'claude setup-token' — user-driven, timeout would be inappropriate	2026-06-07 01:26:33 -07:00
teknium1	ce4e74b350	fix(kimi): send thinking xor reasoning_effort, never both The standalone Kimi/Moonshot profile (api.moonshot.ai/v1) sent both extra_body.thinking AND a top-level reasoning_effort. With no reasoning config it even defaulted to thinking:enabled + reasoning_effort:medium, pairing them on every default call. Moonshot treats these as mutually exclusive (cannot specify both 'thinking' and 'reasoning_effort'). Align with the kimi-k2 handling already shipped for the opencode-go relay: send effort when a recognized low\|medium\|high is requested, otherwise fall back to the extra_body.thinking toggle. Disabled sends thinking:disabled only. Never both. Reported by Cars29 (NOUS Discord). DeepSeek was deliberately left untouched: its native endpoint accepts both (verified by the live guardrail in test_deepseek_v4_thinking_live.py), so the report's DeepSeek claim does not hold there. Tests: tests/plugins/model_providers/test_kimi_profile.py pins the xor contract across all config shapes.	2026-06-07 01:24:29 -07:00
teknium1	03392b67d6	fix(opencode-go): gate thinking when reasoning_effort set to avoid HTTP 400 Salvaged from #40429; re-verified on main, tightened, tested. Co-authored-by: jimjsong <jimjsong@users.noreply.github.com>	2026-06-07 01:24:29 -07:00
Teknium	fe0b3f2338	fix(windows): retry watcher Popen without breakaway when parent job denies it, plus regression tests for the breakaway bit (#40956 ) #40909 added `CREATE_BREAKAWAY_FROM_JOB` to `windows_detach_flags()`, which fixed the headline bug (gateway dies after Desktop GUI update and never comes back). The flag's own docstring acknowledges that restrictive parent job objects can still refuse breakaway with `ERROR_ACCESS_DENIED`, surfacing as `OSError` on the `subprocess.Popen` call: "Callers in this codebase already wrap detached spawns in try/except OSError and fall back to a cmd.exe wrapper, so the breakaway-denied case degrades gracefully rather than crashing." That's true for `_spawn_detached` in `gateway_windows.py` (the `hermes gateway start` path), which has both the breakaway bit AND a retry-without-breakaway fallback. It's NOT true for the post-update watcher path in `launch_detached_profile_gateway_restart` (`hermes_cli/gateway.py`), which only has `except OSError: return False` and gives up entirely. If a user's shell/terminal/container wraps Hermes in a breakaway-denying job, the gateway-respawn watcher silently fails to launch instead of trying again without breakaway. This PR closes that gap and adds the regression tests that were missing from the original fix. ## Changes ### `hermes_cli/_subprocess_compat.py` Adds a sibling helper `windows_detach_flags_without_breakaway()` so callers can express the fallback symbolically (via the helper) rather than coding the magic `& ~0x01000000` mask at every site. Documented on `windows_detach_flags` and `windows_detach_flags_without_breakaway` with the recommended try/except pattern. ### `hermes_cli/gateway.py::launch_detached_profile_gateway_restart` Two changes, both aligned with the canonical pattern in `gateway_windows._spawn_detached`: 1. The outer watcher Popen now wraps in `try/except OSError`, and on failure retries with `windows_detach_flags_without_breakaway()` (POSIX never reaches this branch — `start_new_session=True` can't raise OSError). 2. The inlined respawn payload (the `python -c` watcher) also wraps its CreateProcess in try/except OSError and retries with `_flags & ~_CREATE_BREAKAWAY_FROM_JOB` on failure. This matters because the watcher's job-object inheritance is independent of the outer process's — even if the outer Popen succeeds with breakaway, the respawned gateway might inherit a job that doesn't. ### Regression tests in `tests/tools/test_windows_native_support.py` #40909 shipped the fix without any test that the breakaway bit is present (the existing `test_windows_detach_flags_has_expected_win32_bits` asserted only the three legacy bits). Four new tests close that: - `test_windows_detach_flags_includes_breakaway_from_job` — explicit assertion that the breakaway bit is in the default bundle, with the rationale spelled out in the docstring so a future maintainer staring at this test understands why removing it would resurrect the gateway-dies-after-GUI-update bug. - `test_windows_detach_flags_without_breakaway_drops_only_that_bit` — fallback payload keeps the other three detach bits intact. - `test_launch_detached_profile_gateway_restart_inlined_watcher_uses_breakaway` — static-text check on the stringified watcher payload. The inlined Python program isn't reachable via normal import-time inspection because it lives in a `textwrap.dedent("""...""")` literal that gets passed to a separate `python -c` interpreter. Asserting that both `_CREATE_BREAKAWAY_FROM_JOB` (symbolic) and `0x01000000` (hex literal) appear inside the dedent block is a sufficient regression guard against accidental refactors. - `test_launch_detached_profile_gateway_restart_outer_popen_has_access_denied_fallback` — static check that this PR's fallback retry is wired up symbolically. Without standing up a real Windows job object that refuses breakaway, we can't trigger the OSError in a unit test; the text guard catches the case where a future refactor removes the helper import or the `& ~_CREATE_BREAKAWAY_FROM_JOB` retry. Also extends `test_windows_detach_flags_has_expected_win32_bits` to include the breakaway bit assertion and updates `test_windows_flags_zero_on_posix` to cover the new helper. ## Tests Locally on Windows: 8/8 in the `-k "detach or breakaway or popen_kwargs or launch_detached or gateway_run_update or hermes_cli_gateway"` slice pass. Broader `tests/hermes_cli/test_gateway*.py + test_windows_native_support.py`: 172 passed, 10 failed. All 10 failures are pre-existing POSIX-only tests running on a Windows host (os.geteuid, SIGKILL fallback, is_linux fixture mismatches). Stashing this PR and re-running on bare post-#40909 main reproduces all 10 identically — none are regressions. POSIX paths unchanged: `windows_detach_flags()` and `windows_detach_flags_without_breakaway()` both return 0 off Windows, `windows_detach_popen_kwargs()` still yields `{"start_new_session": True}`. ## Out of scope - The other detached-spawn site in `hermes_cli/gateway.py` (around line 3068) also uses `windows_detach_popen_kwargs()` + `except OSError`. It deserves the same fallback treatment but the codepath is different enough (not the update-flow watcher) that it warrants a separate PR with its own scrutiny. - `gateway/run.py` has Windows branches with `windows_detach_popen_kwargs` too — same reasoning. ## Context Follow-up to #40909 (merged). I had a parallel PR (#40934, closed) that duplicated the core breakaway fix; the bits unique to that PR that #40909 didn't cover are the contents of this one. Closing #40934 and opening this slimmed-down version as the focused follow-up.	2026-06-07 01:21:58 -07:00
kshitijk4poor	44c0c2d4ac	refactor(inventory): make force_fresh_nous_tier keyword-only + pin contract Some checks failed Deploy Site / deploy-vercel (push) Waiting to run Details Deploy Site / deploy-docs (push) Waiting to run Details Docker Build and Publish / build-amd64 (push) Waiting to run Details Docker Build and Publish / build-arm64 (push) Waiting to run Details Docker Build and Publish / merge (push) Blocked by required conditions Details Lint (ruff + ty) / ruff + ty diff (push) Waiting to run Details Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run Details Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run Details Nix Lockfile Fix / auto-fix-main (push) Waiting to run Details Nix Lockfile Fix / fix (push) Waiting to run Details Nix / nix (macos-latest) (push) Waiting to run Details Nix / nix (ubuntu-latest) (push) Waiting to run Details Tests / test (1) (push) Waiting to run Details Tests / test (2) (push) Waiting to run Details Tests / test (3) (push) Waiting to run Details Tests / test (4) (push) Waiting to run Details Tests / test (5) (push) Waiting to run Details Tests / test (6) (push) Waiting to run Details Tests / save-durations (push) Blocked by required conditions Details Tests / e2e (push) Waiting to run Details OSV-Scanner / Scan lockfiles (push) Has been cancelled Details uv.lock check / uv lock --check (push) Has been cancelled Details Follow-up to the salvaged perf fix. The new force_fresh_nous_tier param was inserted into list_authenticated_providers between custom_providers and max_models. Make it keyword-only (*) so a positional caller passing max_models as the 5th arg can never silently mis-bind it to the tier-refresh flag, and add a signature-contract test that fails if the keyword-only separator is later dropped. All in-repo callers already use keyword args; verified no caller breaks.	2026-06-07 00:41:13 -07:00
helix4u	eb70ab894b	fix(inventory): avoid fresh Nous tier checks in picker payloads	2026-06-07 00:41:13 -07:00
brooklyn!	846821d8c0	Merge pull request #40684 from NousResearch/bb/cron-sessions-sidebar feat(desktop): first-class cron jobs in the sidebar + dashboard scheduler	2026-06-07 00:32:25 -05:00
teknium1	210f4e706a	fix(desktop): resolve powershell.exe by absolute path in Electron bootstrap Mirror the bootstrap-installer (Rust) fix in the Electron first-launch runner. spawnPowerShell launched bare 'powershell.exe', trusting PATH to contain %SystemRoot%\System32\WindowsPowerShell\v1.0 — the same latent weakness that stalled the native installer at "0 of 0 steps" when PATH is trimmed/truncated or stored as a non-expanding REG_SZ. Resolve by absolute path first (%SystemRoot%/%windir%), then PATH (powershell 5.1 -> pwsh 7), then bare name as last resort.	2026-06-06 19:59:16 -07:00
xxxigm	5dee40fcc0	test(bootstrap-installer): cover PowerShell path layout cross-platform Make `powershell_under_root` visible under `cfg(test)` so the %SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe layout is asserted on any host (the rest of the resolution is gated to Windows).	2026-06-06 19:59:16 -07:00
xxxigm	8720023e96	fix(bootstrap-installer): resolve powershell.exe by absolute path on Windows The native Windows installer spawned PowerShell via the bare program name `powershell.exe`, which trusts PATH to contain %SystemRoot%\System32\WindowsPowerShell\v1.0. On machines whose PATH was trimmed or truncated (Windows silently drops entries once the variable exceeds its length limit), the lookup fails and the spawn dies with "program not found" before install.ps1 runs at all — the installer then stalls at "0 of 0 steps". Resolve PowerShell by absolute path first (%SystemRoot%/%windir%), then fall back to PATH (powershell 5.1, then pwsh 7), then a bare name as a last resort. Also include the resolved interpreter in the spawn-failure context; the old message printed only the script path, which misleadingly read as if the .ps1 itself was missing.	2026-06-06 19:59:16 -07:00
xxxigm	fe2942a5aa	test(desktop): assert every theme typography carries an emoji font (#40364 ) Regression guard for the emoji-fallback fix: checks DEFAULT_TYPOGRAPHY and every defined builtin-theme fontSans/fontMono stack contains a color-emoji font.	2026-06-06 19:58:39 -07:00
xxxigm	bec07964be	fix(desktop): add color-emoji font fallback so emoji render (#40364 ) None of the UI sans/mono font stacks (themes/presets.ts, styles.css) carry emoji glyphs, so on platforms whose default text font lacks them (e.g. Linux) emoji rendered as tofu boxes in the composer and chat. Append a color-emoji fallback — Apple Color Emoji / Segoe UI Emoji / Segoe UI Symbol / Noto Color Emoji / the `emoji` generic — to every font stack (SYSTEM_SANS, SYSTEM_MONO, the Courier theme, and the CSS --dt-font-* defaults). Text still uses the primary fonts; the browser only falls back for emoji codepoints. Custom themes build on SYSTEM_* so they inherit it automatically.	2026-06-06 19:58:39 -07:00

1 2 3 4 5 ...

10886 commits