Four real issues Copilot flagged:
1. delegate_tool: `_build_child_agent` never passed `toolsets` to the
progress callback, so the event payload's `toolsets` field (wired
through every layer) was always empty and the overlay's toolsets
row never populated. Thread `child_toolsets` through.
2. event handler: the race-protection on subagent.spawn_requested /
   subagent.start only preserved `completed`, so a late-arriving queued
   event could still clobber `failed` / `interrupted`. Preserve any
   terminal status (`completed | failed | interrupted`).
3. SpawnHud: comment claimed concurrency was approximated by "widest
level in the tree" but code used `totals.activeCount` (total across
all parents). `max_concurrent_children` is a per-parent cap, so
activeCount over-warns for multi-orchestrator runs. Switch to
`max(widthByDepth(tree))`; the label now reads `⚡W/cap+extra` where
W is the widest level (drives the ratio) and `+extra` is the rest.
4. spawn_tree.list: comment said "peek header without parsing full list"
   but the code json.loads()'d every snapshot. Adds a per-session
   `_index.jsonl` sidecar written on save; list() reads only the index
   (with a full-scan fallback for pre-index sessions). Now O(1) per
   snapshot instead of O(file-size).
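A minimal sketch of the sidecar pattern behind issue 4, with illustrative helper and field names (the real index schema may differ):

```python
import json
from pathlib import Path

def append_index_entry(session_dir: Path, snapshot_name: str, header: dict) -> None:
    # Written on save: one JSON line of header fields per snapshot.
    with (session_dir / "_index.jsonl").open("a") as f:
        f.write(json.dumps({"file": snapshot_name, **header}) + "\n")

def list_snapshots(session_dir: Path) -> list[dict]:
    index = session_dir / "_index.jsonl"
    if index.exists():
        # O(1) per snapshot: one small line each, no snapshot parsing.
        return [json.loads(line) for line in index.read_text().splitlines() if line]
    # Pre-index sessions: full scan, parsing every snapshot as before.
    return [json.loads(p.read_text()) for p in sorted(session_dir.glob("*.json"))]
```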
Adds _reactions_enabled() gating to match the Discord (DISCORD_REACTIONS) and
Telegram (TELEGRAM_REACTIONS) pattern. Defaults to true to preserve existing
behavior. Gates at three levels:
- _handle_slack_message: skips _reacting_message_ids registration
- on_processing_start: early return
- on_processing_complete: early return
Also adds config.yaml bridge (slack.reactions) and two new tests.
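A sketch of the gate, assuming the Slack env var follows the same naming pattern as the other platforms (SLACK_REACTIONS is an inferred name; the slack.reactions config key is from this change):

```python
import os

class SlackGatewaySketch:
    def __init__(self, config: dict):
        self._config = config

    def _reactions_enabled(self) -> bool:
        # Env var wins, mirroring the DISCORD_REACTIONS / TELEGRAM_REACTIONS
        # pattern (SLACK_REACTIONS is an assumed name here).
        env = os.environ.get("SLACK_REACTIONS")
        if env is not None:
            return env.strip().lower() not in ("0", "false", "no", "off")
        # config.yaml bridge; defaults to true to preserve existing behavior.
        return bool((self._config.get("slack") or {}).get("reactions", True))
```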
Slack reactions were placed around handle_message(), which returns
immediately after spawning a background task. This caused the 👀
→ ✅ swap to happen before any real work began.
Fix: implement on_processing_start / on_processing_complete callbacks
(matching Discord/Telegram) so reactions bracket actual _message_handler
work driven by the base class.
Also fixes missing stop_typing() for Slack's assistant thread status
indicator, which left 'is thinking...' stuck in the UI after processing
completed.
- Add _reacting_message_ids set for DM/@mention-only gating
- Add _active_status_threads dict for stop_typing lookup
- Update test_reactions_in_message_flow for new callback pattern
- Add test_reactions_failure_outcome and test_reactions_skipped_for_non_dm_non_mention
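A sketch of the new callback shape as fragments of the Slack gateway class; signatures and reaction helpers are illustrative (only the callback names, _reacting_message_ids, _active_status_threads, and stop_typing come from this change):

```python
async def on_processing_start(self, message_id: str) -> None:
    # Driven by the base class when _message_handler actually begins work.
    if not self._reactions_enabled() or message_id not in self._reacting_message_ids:
        return
    await self._add_reaction(message_id, "eyes")  # 👀

async def on_processing_complete(self, message_id: str, ok: bool) -> None:
    # Clear Slack's assistant thread status first so 'is thinking...'
    # never sticks (ordering here is illustrative).
    if thread := self._active_status_threads.pop(message_id, None):
        await self.stop_typing(thread)
    if not self._reactions_enabled() or message_id not in self._reacting_message_ids:
        return
    await self._remove_reaction(message_id, "eyes")
    await self._add_reaction(message_id, "white_check_mark" if ok else "x")
```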
- createGatewayEventHandler: remove dead `return` after a block that
always returns (tool.complete case). The inner block exits via
both branches so the outer statement was never reachable. Was
pre-existing on main; fixed here because it was the only thing
blocking `npm run fix` on this branch.
- agentsOverlay + ops: prettier reformatting.
`npm run fix` / `npm run type-check` / `npm test` all clean.
The Write tool that wrote the cleaned overlay split the `if` keyword
across two lines in 9 places (` i\nf (cond) {`), which silently
passed one typecheck run but actually left the handler as broken
JS — every keystroke threw. Input froze in the /agents overlay
(j/k/arrows/q/etc. all no-ops) while the 500ms now-tick kept
rendering, so the UI looked "frozen but the timeline moves".
Reflows the handler as intended with no behaviour change.
Adds a live + post-hoc audit surface for recursive delegate_task fan-out.
None of cc/oc/oclaw tackle nested subagent trees inside an Ink overlay;
this ships a view-switched dashboard that handles arbitrary depth + width.
Python
- delegate_tool: every subagent event now carries subagent_id, parent_id,
depth, model, tool_count; subagent.complete also ships input/output/
reasoning tokens, cost, api_calls, files_read/files_written, and a
tail of tool-call outputs
- delegate_tool: new subagent.spawn_requested event + _active_subagents
registry so the overlay can kill a branch by id and pause new spawns
- tui_gateway: new RPCs delegation.status, delegation.pause,
subagent.interrupt, spawn_tree.save/list/load (disk under
\$HERMES_HOME/spawn-trees/<session>/<ts>.json)
TUI
- /agents overlay: full-width list mode (gantt strip + row picker) and
Enter-to-drill full-width scrollable detail mode; inverse+amber
selection, heat-coloured branch markers, wall-clock gantt with tick
ruler, per-branch rollups
- Detail pane: collapsible accordions (Budget, Files, Tool calls, Output,
Progress, Summary); open-state persists across agents + mode switches
via a shared atom
- /replay [N|last|list|load <path>] for in-memory + disk history;
/replay-diff <a> <b> for side-by-side tree comparison
- Status-bar SpawnHud warns as depth/concurrency approaches caps;
overlay auto-follows the just-finished turn onto history[1]
- Theme: bump DARK dim #B8860B → #CC9B1F for readable secondary text
globally; keep LIGHT untouched
Tests: +29 new subagentTree unit tests; 215/215 passing.
Follow-up to the cherry-picked PR #13897 fix. Three issues found:
1. CRITICAL: The thinking block synthesised from reasoning_content was
immediately stripped by the third-party signature management code
(Kimi is classified as _is_third_party_anthropic_endpoint). Added a
Kimi-specific carve-out that preserves unsigned thinking blocks while
still stripping Anthropic-signed blocks Kimi can't validate.
2. Empty-string reasoning_content was silently dropped because the
truthiness check ('if reasoning_content and ...') evaluates to False
for ''. Changed to 'isinstance(reasoning_content, str)' so the
tier-3 fallback from _copy_reasoning_content_for_api (which injects
'' for Kimi tool-call messages with no reasoning) actually produces
a thinking block.
3. The thinking block was appended AFTER tool_use blocks. Anthropic
protocol requires thinking -> text -> tool_use ordering. Changed to
blocks.insert(0, ...) to prepend.
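A sketch of the three fixes together; block shapes follow the Anthropic Messages format, and function/variable names are illustrative:

```python
def add_kimi_thinking_block(blocks: list[dict], reasoning_content) -> None:
    # Issue 2: isinstance, not truthiness — the tier-3 fallback injects ''
    # for Kimi tool-call messages with no reasoning, and '' is falsy.
    if isinstance(reasoning_content, str):
        # Issue 3: Anthropic ordering is thinking -> text -> tool_use,
        # so prepend rather than append.
        blocks.insert(0, {"type": "thinking", "thinking": reasoning_content})

def keep_thinking_block(block: dict, is_kimi_coding: bool) -> bool:
    # Issue 1: carve-out in the third-party signature stripping. Kimi keeps
    # unsigned (synthesised) blocks but loses Anthropic-signed ones it
    # cannot validate; other third-party endpoints strip all thinking.
    if block.get("type") != "thinking":
        return True
    if is_kimi_coding:
        return not block.get("signature")
    return False
```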
Fixes NousResearch/hermes-agent#13848
Kimi's /coding endpoint speaks the Anthropic Messages protocol but has its
own thinking semantics: when thinking is enabled, Kimi validates message
history and requires every prior assistant tool-call message to carry
OpenAI-style reasoning_content.
The Anthropic path never populated that field, and
convert_messages_to_anthropic strips all Anthropic thinking blocks on
third-party endpoints — so the request failed with HTTP 400:
"thinking is enabled but reasoning_content is missing in assistant
tool call message at index N"
Now, when an assistant message contains tool_calls and a
reasoning_content string, we append a {"type": "thinking", ...} block
to the Anthropic content so Kimi can validate the history. This only
affects assistant messages with tool_calls + reasoning_content; plain
text assistant messages are unchanged.
The 404 branch in _classify_by_status had dead code: the generic
fallback below the _MODEL_NOT_FOUND_PATTERNS check returned the
exact same classification (model_not_found + should_fallback=True),
so every 404 — regardless of message — was treated as a missing model.
This bites local-endpoint users (llama.cpp, Ollama, vLLM) whose 404s
usually mean a wrong endpoint path, proxy routing glitch, or transient
backend issue — not a missing model. Claiming 'model not found' misleads
the next turn and silently falls back to another provider when the real
problem was a URL typo the user should see.
Fix: only classify 404 as model_not_found when the message actually
matches _MODEL_NOT_FOUND_PATTERNS ("invalid model", "model not found",
etc.). Otherwise fall through as unknown (retryable) so the real error
surfaces in the retry loop.
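The shape of the fix, sketched with an abridged pattern list and a simplified return type:

```python
_MODEL_NOT_FOUND_PATTERNS = ("invalid model", "model not found")  # abridged

def classify_404(message: str) -> tuple[str, bool]:
    """Returns (classification, should_fallback) — a sketch of the new branch."""
    if any(p in message.lower() for p in _MODEL_NOT_FOUND_PATTERNS):
        return ("model_not_found", True)
    # Wrong endpoint path, proxy glitch, transient backend issue:
    # classify as unknown (retryable) so the real error surfaces.
    return ("unknown", False)
```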
Test updated to match the new behavior. 103 error_classifier tests pass.
* fix(plugins): auto-coerce user-installed memory plugins to kind=exclusive
User-installed memory provider plugins at $HERMES_HOME/plugins/<name>/
were being dispatched to the general PluginManager, which has no
register_memory_provider method on PluginContext. Every startup logged:
Failed to load plugin 'mempalace': 'PluginContext' object has no
attribute 'register_memory_provider'
Bundled memory providers were already skipped via skip_names={memory,
context_engine} in discover_and_load, but user-installed ones weren't.
Fix: _parse_manifest now scans the plugin's __init__.py source for
'register_memory_provider' or 'MemoryProvider' (same heuristic as
plugins/memory/__init__.py:_is_memory_provider_dir) and auto-coerces
kind to 'exclusive' when the manifest didn't declare one explicitly.
This routes the plugin to plugins/memory discovery instead of the
general loader.
The escape hatch: if a manifest explicitly declares kind: standalone,
the heuristic doesn't override it.
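A sketch of the heuristic (the real code lives in _parse_manifest; names here are illustrative):

```python
from pathlib import Path

def coerce_memory_plugin_kind(plugin_dir: Path, declared: str | None) -> str | None:
    if declared is not None:
        return declared  # escape hatch: explicit kind always wins
    init = plugin_dir / "__init__.py"
    if not init.exists():
        return None
    src = init.read_text(errors="ignore")
    if "register_memory_provider" in src or "MemoryProvider" in src:
        return "exclusive"  # route to plugins/memory discovery
    return None
```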
Reported by Uncle HODL on Discord.
* fix(nous): actionable CLI message when Nous 401 refresh fails
Mirrors the Anthropic 401 diagnostic pattern. When Nous returns 401
and the credential refresh (_try_refresh_nous_client_credentials)
also fails, the user used to see only the raw APIError. Now prints:
🔐 Nous 401 — Portal authentication failed.
Response: <truncated body>
Most likely: Portal OAuth expired, account out of credits, or
agent key revoked.
Troubleshooting:
• Re-authenticate: hermes login --provider nous
• Check credits / billing: https://portal.nousresearch.com
• Verify stored credentials: $HERMES_HOME/auth.json
• Switch providers temporarily: /model <model> --provider openrouter
Addresses the common 'my hermes model hangs' pattern where the user's
Portal OAuth expired and the CLI gave no hint about the next step.
Adds schema v7 'api_call_count' column. run_agent.py increments it by 1
per LLM API call, web_server analytics SQL aggregates it, frontend uses
the real counter instead of summing sessions.
The 'API Calls' card on the analytics dashboard previously displayed
COUNT(*) from the sessions table — the number of conversations, not
LLM requests. Each session makes 10-90 API calls through the tool loop,
so the reported number was ~30x lower than reality.
Salvaged from PR #10140 (@kshitijk4poor). The cache-token accuracy
portions of the original PR were deferred — per-provider analytics is
the better path there, since cache_write_tokens and actual_cost_usd
are only reliably available from a subset of providers (Anthropic
native, Codex Responses, OpenRouter with usage.include).
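A sketch of the counter semantics the tests below pin down (table and helper names are simplified):

```python
import sqlite3

def migrate_to_v7(conn: sqlite3.Connection) -> None:
    conn.execute(
        "ALTER TABLE sessions ADD COLUMN api_call_count INTEGER NOT NULL DEFAULT 0"
    )

def bump_api_calls(conn, session_id: str, delta: int = 1, absolute: bool = False):
    # run_agent.py calls this with delta=1 per LLM API call.
    if absolute:
        conn.execute("UPDATE sessions SET api_call_count=? WHERE id=?",
                     (delta, session_id))
    else:
        conn.execute("UPDATE sessions SET api_call_count=api_call_count+? WHERE id=?",
                     (delta, session_id))
```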
Tests:
- schema_version v7 assertion
- migration v2 -> v7 adds api_call_count column with default 0
- update_token_counts increments api_call_count by provided delta
- absolute=True sets api_call_count directly
- /api/analytics/usage exposes total_api_calls in totals
- Replace async create_bind_task/poll_bind_result with synchronous
httpx.Client equivalents, eliminating manual event loop management
- Move _render_qr and full qr_register() entry-point into onboard.py,
mirroring the Feishu onboarding pattern
- Remove _qqbot_render_qr and _qqbot_qr_flow from gateway.py (~90 lines);
call site becomes a single qr_register() import
- Fix potential segfault: previous code called loop.close() in the EXPIRED
branch and again in the finally block (double-close crashed under uvloop)
- Add configurable retain_tags / retain_source / retain_user_prefix /
retain_assistant_prefix knobs for native Hindsight.
- Thread gateway session identity (user_name, chat_id, chat_name,
chat_type, thread_id) through AIAgent and MemoryManager into
MemoryProvider.initialize kwargs so providers can scope and tag
retained memories.
- Hindsight attaches the new identity fields as retain metadata,
merges per-call tool tags with configured default tags, and uses
the configurable transcript labels for auto-retained turns.
Co-authored-by: Abner <abner.the.foreman@agentmail.to>
* feat(state): auto-prune old sessions + VACUUM state.db at startup
state.db accumulates every session, message, and FTS5 index entry forever.
A heavy user (gateway + cron) reported 384MB with 982 sessions / 68K messages
causing slowdown; manual 'hermes sessions prune --older-than 7' + VACUUM
brought it to 43MB. The prune command and VACUUM are not wired to run
automatically anywhere — sessions grew unbounded until users noticed.
Changes:
- hermes_state.py: new state_meta key/value table, vacuum() method, and
maybe_auto_prune_and_vacuum() — idempotent via last-run timestamp in
state_meta so it only actually executes once per min_interval_hours
across all Hermes processes for a given HERMES_HOME. Never raises.
- hermes_cli/config.py: new 'sessions:' block in DEFAULT_CONFIG
(auto_prune=True, retention_days=90, vacuum_after_prune=True,
min_interval_hours=24). Added to _KNOWN_ROOT_KEYS.
- cli.py: call maintenance once at HermesCLI init (shared helper
_run_state_db_auto_maintenance reads config and delegates to DB).
- gateway/run.py: call maintenance once at GatewayRunner init.
- Docs: user-guide/sessions.md rewrites 'Automatic Cleanup' section.
Why VACUUM matters: SQLite does NOT shrink the file on DELETE — freed
pages get reused on next INSERT. Without VACUUM, a delete-heavy DB stays
bloated forever. VACUUM only runs when the prune actually removed rows,
so tight DBs don't pay the I/O cost.
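A sketch of the idempotent gate (meta key and DB helper names are illustrative):

```python
import time

def maybe_auto_prune_and_vacuum(db, retention_days=90, min_interval_hours=24):
    try:
        try:
            last = float(db.get_meta("last_auto_maintenance") or 0)
        except (TypeError, ValueError):
            last = 0.0  # corrupt marker: recover by treating as never-run
        if time.time() - last < min_interval_hours * 3600:
            return  # another Hermes process ran recently for this HERMES_HOME
        removed = db.prune_sessions(older_than_days=retention_days)
        if removed:
            db.vacuum()  # only pay the I/O cost when rows were actually removed
        db.set_meta("last_auto_maintenance", str(time.time()))
    except Exception:
        pass  # never raises — maintenance must not block startup
```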
Tests: 10 new tests in tests/test_hermes_state.py covering state_meta,
vacuum, idempotency, interval skipping, VACUUM-only-when-needed,
corrupt-marker recovery. All 246 existing state/config/gateway tests
still pass.
Verified E2E with real imports + isolated HERMES_HOME: DEFAULT_CONFIG
exposes the new block, load_config() returns it for fresh installs,
first call prunes+vacuums, second call within min_interval_hours skips,
and the state_meta marker persists across connection close/reopen.
* sessions.auto_prune defaults to false (opt-in)
Session history powers session_search recall across past conversations,
so silently pruning on startup could surprise users. Ship the machinery
disabled and let users opt in when they notice state.db is hurting
performance.
- DEFAULT_CONFIG.sessions.auto_prune: True → False
- Call-site fallbacks in cli.py and gateway/run.py match the new default
(so unmigrated configs still see off)
- Docs: flip 'Enable in config.yaml' framing + tip explains the tradeoff
Follow-ups on top of salvaged #13923 (@keifergu):
- Print QR poll dot every 3s instead of every 18s so "Fetching
configuration results..." doesn't look hung.
- On "status=success but no bot_info" from the WeCom query endpoint,
log the full payload at WARNING and tell the user we're falling
back to manual entry (was previously a single opaque line).
- Document in the qr_scan_for_bot_info() docstring that the
work.weixin.qq.com/ai/qc/* endpoints are the admin-console web-UI
flow, not the public developer API, and may change without notice.
Also add keifergu@tencent.com to scripts/release.py AUTHOR_MAP so
release notes attribute the feature correctly.
Adds an optional skill that walks users through installing and using
alibaba/page-agent — a pure-JS in-page GUI agent that web developers
embed into their own webapps so end users can drive the UI with
natural language.
Three install paths: CDN demo (30s, no install), npm install into an
existing app with provider config table (Qwen/OpenAI/Ollama/OpenRouter),
and clone-from-source for dev/contributor workflow.
Clear use-case framing up front (embed AI copilot in SaaS/admin/B2B,
modernize legacy UIs, accessibility via natural language) and an
explicit NOT-for list that points users wanting server-side browser
automation back to Hermes' built-in browser tool.
Live-verified: repo builds on Node 22.22 + npm 10.9, dev:demo serves
at localhost:5174, API surface (new PageAgent{...}, panel.show(),
execute(task)) matches what the skill documents. Also verified
discovery end-to-end via OptionalSkillSource with isolated
HERMES_HOME — search/inspect/fetch all resolve
official/web-development/page-agent correctly.
New category directory: optional-skills/web-development/ with a
DESCRIPTION.md explaining the distinction from Hermes' own browser
automation (outside-in vs inside-out).
The transport refactor (PRs #13862 ff.) added agent/transports/ as a
sub-package but the setuptools packages.find include list only had
"agent" (top-level files), not "agent.*" (sub-packages).
pip install / Nix builds therefore ship run_agent.py (which now imports
from agent.transports on every API call) but omit the transports
directory entirely, causing:
ModuleNotFoundError: No module named 'agent.transports'
on every LLM call for packaged installs.
Adds "agent.*" to match the existing pattern used by tools, gateway,
tui_gateway, and plugins.
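The equivalent shape of the fix in setup.py terms, abridged (the project may express the same include list in pyproject.toml):

```python
from setuptools import setup, find_packages

setup(
    packages=find_packages(include=[
        "agent", "agent.*",        # the fix: now ships agent.transports
        "tools", "tools.*",
        "gateway", "gateway.*",
        "tui_gateway", "tui_gateway.*",
        "plugins", "plugins.*",
    ]),
)
```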
Adds a first-class 'stepfun' API-key provider surfaced as Step Plan:
- Support Step Plan setup for both International and China regions
- Discover Step Plan models live from /step_plan/v1/models, with a
small coding-focused fallback catalog when discovery is unavailable
- Thread StepFun through provider metadata, setup persistence, status
and doctor output, auxiliary routing, and model normalization
- Add tests for provider resolution, model validation, metadata
mapping, and StepFun region/model persistence
Based on #6005 by @hengm3467.
Co-authored-by: hengm3467 <100685635+hengm3467@users.noreply.github.com>
* feat(plugins): pluggable image_gen backends + OpenAI provider
Adds an ImageGenProvider ABC so image generation backends register as
bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner
gains three primitives to make this work generically:
- `kind:` manifest field (`standalone` | `backend` | `exclusive`).
Bundled `kind: backend` plugins auto-load — no `plugins.enabled`
incantation. User-installed backends stay opt-in.
- Path-derived keys: `plugins/image_gen/openai/` gets key
`image_gen/openai`, so a future `tts/openai` cannot collide.
- Depth-2 recursion into category namespaces (parent dirs without a
`plugin.yaml` of their own).
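A sketch of the depth-2 scan with path-derived keys (function name and yields are illustrative):

```python
from pathlib import Path
from typing import Iterator

def scan_bundled_plugins(root: Path) -> Iterator[tuple[str, Path]]:
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue
        if (entry / "plugin.yaml").exists():
            yield entry.name, entry  # top-level plugin, key 'name'
        else:
            # Category namespace (no plugin.yaml of its own): recurse
            # exactly one level, e.g. plugins/image_gen/openai/ gets key
            # 'image_gen/openai', so a future tts/openai cannot collide.
            for child in sorted(entry.iterdir()):
                if child.is_dir() and (child / "plugin.yaml").exists():
                    yield f"{entry.name}/{child.name}", child
```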
Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5
default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64
responses save to `$HERMES_HOME/cache/images/`; URL responses pass
through.
FAL stays in-tree for this PR — a follow-up ports it into
`plugins/image_gen/fal/` so the in-tree `image_generation_tool.py`
slims down. The dispatch shim in `_handle_image_generate` only fires
when `image_gen.provider` is explicitly set to a non-FAL value, so
existing FAL setups are untouched.
- 41 unit tests (scanner recursion, kind parsing, gate logic,
registry, OpenAI payload shapes)
- E2E smoke verified: bundled plugin autoloads, registers, and
`_handle_image_generate` routes to OpenAI when configured
* fix(image_gen/openai): don't send response_format to gpt-image-*
The live API rejects it: 'Unknown parameter: response_format'
(verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return
b64_json unconditionally, so the parameter was both unnecessary and
actively broken.
* feat(image_gen/openai): gpt-image-2 only, drop legacy catalog
gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21)
and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 /
dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward
(dall-e-2 squares only). Trim the catalog down to a single model.
Live-verified end-to-end: landscape 1536x1024 render of a Moog-style
synth matches prompt exactly, 2.4MB PNG saved to cache.
* feat(image_gen/openai): expose gpt-image-2 as three quality tiers
Users pick speed/fidelity via the normal model picker instead of a
hidden quality knob. All three tier IDs resolve to the single underlying
gpt-image-2 API model with a different quality parameter:
gpt-image-2-low ~15s fast iteration
gpt-image-2-medium ~40s default
gpt-image-2-high ~2min highest fidelity
Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the
same 1024x1024 prompt.
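A sketch of the resolution (tier IDs are from this change; the mapping helper is illustrative):

```python
# Three catalog IDs, one API model, quality as the only difference.
TIERS = {
    "gpt-image-2-low": "low",
    "gpt-image-2-medium": "medium",
    "gpt-image-2-high": "high",
}

def resolve_image_model(model_id: str) -> tuple[str, str]:
    """Map a picker tier ID to (api_model, quality)."""
    return "gpt-image-2", TIERS[model_id]
```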
Config:
image_gen.openai.model: gpt-image-2-high
# or
image_gen.model: gpt-image-2-low
# or env var for scripts/tests
OPENAI_IMAGE_MODEL=gpt-image-2-medium
Live-verified end-to-end with the low tier: 18.8s landscape render of a
golden retriever in wildflowers, vision-confirmed exact match.
* feat(tools_config): plugin image_gen providers inject themselves into picker
'hermes tools' → Image Generation now shows plugin-registered backends
alongside Nous Subscription and FAL.ai without tools_config.py needing
to know about them. OpenAI appears as a third option today; future
backends appear automatically as they're added.
Mechanism:
- ImageGenProvider gains an optional get_setup_schema() hook
(name, badge, tag, env_vars). Default derived from display_name.
- tools_config._plugin_image_gen_providers() pulls the schemas from
every registered non-FAL plugin provider.
- _visible_providers() appends those rows when rendering the Image
Generation category.
- _configure_provider() handles the new image_gen_plugin_name marker:
writes image_gen.provider and routes to the plugin's list_models()
catalog for the model picker.
- _toolset_needs_configuration_prompt('image_gen') stops demanding a
FAL key when any plugin provider reports is_available().
FAL is skipped in the plugin path because it already has hardcoded
TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up
PR the hardcoded rows go away and it surfaces through the same path
as OpenAI.
Verified live: picker shows Nous Subscription / FAL.ai / OpenAI.
Picking OpenAI prompts for OPENAI_API_KEY, then shows the
gpt-image-2-low/medium/high model picker sourced from the plugin.
397 tests pass across plugins/, tools_config, registry, and picker.
* fix(image_gen): close final gaps for plugin-backend parity with FAL
Two small places that still hardcoded FAL:
- hermes_cli/setup.py status line: an OpenAI-only setup showed
'Image Generation: missing FAL_KEY'. Now probes plugin providers
and reports '(OpenAI)' when one is_available() — or falls back to
'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured.
- image_generate tool schema description: said 'using FAL.ai, default
FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are
user-configured' — and notes the 'image' field can be a URL or an
absolute path, which the gateway delivers either way via
extract_local_files().
Surfaces the free variant alongside the paid minimax-m2.5 entry in
both the OPENROUTER_MODELS fallback snapshot and the nous/openrouter
provider model list.
Kimi's /coding endpoint speaks the Anthropic Messages protocol but has
its own thinking semantics: when thinking.enabled is sent, Kimi validates
the history and requires every prior assistant tool-call message to carry
OpenAI-style reasoning_content. The Anthropic path never populates that
field, and convert_messages_to_anthropic strips Anthropic thinking blocks
on third-party endpoints — so after one tool-calling turn the next request
fails with:
HTTP 400: thinking is enabled but reasoning_content is missing in
assistant tool call message at index N
Kimi on chat_completions handles thinking via extra_body in
ChatCompletionsTransport (#13503). On the Anthropic route, drop the
parameter entirely and let Kimi drive reasoning server-side.
build_anthropic_kwargs now gates the reasoning_config -> thinking block
on not _is_kimi_coding_endpoint(base_url).
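A sketch of the gate; only the function names come from this change, and the thinking block is the standard Anthropic Messages form:

```python
def _is_kimi_coding_endpoint(base_url: str) -> bool:
    # Stand-in; the real matcher covers /coding, /coding/v1, trailing slash.
    return "/coding" in base_url

def apply_reasoning(kwargs: dict, reasoning_config, base_url: str) -> None:
    if reasoning_config is None or _is_kimi_coding_endpoint(base_url):
        return  # Kimi /coding: drop the parameter, reasoning is server-side
    # Standard Anthropic Messages thinking block for everyone else.
    kwargs["thinking"] = {
        "type": "enabled",
        "budget_tokens": reasoning_config.budget_tokens,
    }
```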
Tests: 8 new parametric tests cover /coding, /coding/v1, /coding/anthropic,
/coding/ (trailing slash), explicit disabled, other third-party endpoints
still getting thinking (MiniMax), native Anthropic unaffected, and the
non-/coding Kimi root route.
Remove nvidia/nemotron-3-super-120b-a12b:free, arcee-ai/trinity-large-preview:free,
and openrouter/elephant-alpha from _PROVIDER_MODELS['nous']. The paid nemotron and
arcee-thinking variants remain.
Fourth and final transport — completes the transport layer with all four
api_modes covered. Wraps agent/bedrock_adapter.py behind the ProviderTransport
ABC, handles both raw boto3 dicts and already-normalized SimpleNamespace.
Wires all transport methods to production paths in run_agent.py:
- build_kwargs: _build_api_kwargs bedrock branch
- validate_response: response validation, new bedrock_converse branch
- finish_reason: new bedrock_converse branch in finish_reason extraction
Based on PR #13467 by @kshitijk4poor, with one adjustment: the main normalize
loop does NOT add a bedrock_converse branch to invoke normalize_response on
the already-normalized response. Bedrock's normalize_converse_response runs
at the dispatch site (run_agent.py:5189), so the response already has the
OpenAI-compatible .choices[0].message shape by the time the main loop sees
it. Falling through to the chat_completions else branch is correct and
sidesteps a redundant NormalizedResponse rebuild.
Transport coverage — complete:
| api_mode | Transport | build_kwargs | normalize | validate |
|--------------------|--------------------------|:------------:|:---------:|:--------:|
| anthropic_messages | AnthropicTransport | ✅ | ✅ | ✅ |
| codex_responses | ResponsesApiTransport | ✅ | ✅ | ✅ |
| chat_completions | ChatCompletionsTransport | ✅ | ✅ | ✅ |
| bedrock_converse | BedrockTransport | ✅ | ✅ | ✅ |
17 new BedrockTransport tests pass. 117 transport tests total pass.
160 bedrock/converse tests across tests/agent/ pass. Full tests/run_agent/
targeted suite passes (885/885 + 15 skipped; the 1 remaining failure is the
pre-existing test_concurrent_interrupt flake on origin/main).
Restore the old-CLI contract where only complete failures tint Activity
red. Everything else is still visible for debugging but no longer
commandeers attention.
- gateway.stderr: always tone='info' (drops the ERRLIKE_RE regex)
- gateway.protocol_error: both pushes demoted to 'info'
- commands.catalog cold-start failure: demoted to 'info'
- approval.request: no longer duplicates the overlay into Activity
Kept as 'error': terminal `error` event, gateway.start_timeout,
gateway-exited, explicit status.update kinds.
Reverts the auto-expand-on-new-error effect added in 93b47d96. The
effect overrode the user's chosen detailsMode and visually interrupted
every turn. Red/yellow chevron tint remains as the passive signal —
click to read, just like Thinking and Tool calls.
Third concrete transport — handles the default 'chat_completions' api_mode used
by ~16 OpenAI-compatible providers (OpenRouter, Nous, NVIDIA, Qwen, Ollama,
DeepSeek, xAI, Kimi, custom, etc.). Wires build_kwargs + validate_response to
production paths.
Based on PR #13447 by @kshitijk4poor, with fixes:
- Preserve tool_call.extra_content (Gemini thought_signature) via
ToolCall.provider_data — the original shim stripped it, causing 400 errors
on multi-turn Gemini 3 thinking requests.
- Preserve reasoning_content distinctly from reasoning (DeepSeek/Moonshot) so
the thinking-prefill retry check (_has_structured) still triggers.
- Port Kimi/Moonshot quirks (32000 max_tokens, top-level reasoning_effort,
extra_body.thinking) that landed on main after the original PR was opened.
- Keep _qwen_prepare_chat_messages_inplace alive and call it through the
transport when sanitization already deepcopied (avoids a second deepcopy).
- Skip the back-compat SimpleNamespace shim in the main normalize loop — for
chat_completions, response.choices[0].message is already the right shape
with .content/.tool_calls/.reasoning/.reasoning_content/.reasoning_details
and per-tool-call .extra_content from the OpenAI SDK.
run_agent.py: -239 lines in _build_api_kwargs default branch extracted to the
transport. build_kwargs now owns: codex-field sanitization, Qwen portal prep,
developer role swap, provider preferences, max_tokens resolution (ephemeral >
user > NVIDIA 16384 > Qwen 65536 > Kimi 32000 > anthropic_max_output), Kimi
reasoning_effort + extra_body.thinking, OpenRouter/Nous/GitHub reasoning,
Nous product attribution tags, Ollama num_ctx, custom-provider think=false,
Qwen vl_high_resolution_images, request_overrides.
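A sketch of the max_tokens precedence chain from the paragraph above (helper name and arguments are illustrative):

```python
def resolve_max_tokens(ephemeral, user, provider, anthropic_max_output):
    if ephemeral is not None:          # per-request override
        return ephemeral
    if user is not None:               # user-configured value
        return user
    provider_defaults = {"nvidia": 16384, "qwen": 65536, "kimi": 32000}
    return provider_defaults.get(provider, anthropic_max_output)
```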
39 new transport tests (8 build_kwargs, 5 Kimi, 4 validate, 4 normalize
including extra_content regression, 3 cache stats, 3 basic). Tests/run_agent/
targeted suite passes (885/885 + 15 skipped; the 1 remaining failure is the
test_concurrent_interrupt flake present on origin/main).
Wire the auxiliary client (compaction, vision, session search, web extract)
to the Nous Portal's curated recommended-models endpoint when running on
Nous Portal, with a TTL-cached fetch that mirrors how we pull /models for
pricing.
hermes_cli/models.py
- fetch_nous_recommended_models(portal_base_url, force_refresh=False)
10-minute TTL cache, keyed per portal URL (staging vs prod don't
collide). Public endpoint, no auth required. Returns {} on any
failure so callers always get a dict.
- get_nous_recommended_aux_model(vision, free_tier=None, ...)
Tier-aware pick from the payload:
- Paid tier → paidRecommended{Vision,Compaction}Model, falling back
to freeRecommended* when the paid field is null (common during
staged rollouts of new paid models).
- Free tier → freeRecommended* only, never leaks paid models.
When free_tier is None, auto-detects via the existing
check_nous_free_tier() helper (already cached 3 min against
/api/oauth/account). Detection errors default to paid so we never
silently downgrade a paying user.
agent/auxiliary_client.py — _try_nous()
- Replaces the hardcoded xiaomi/mimo free-tier branch with a single call
to get_nous_recommended_aux_model(vision=vision).
- Falls back to _NOUS_MODEL (google/gemini-3-flash-preview) when the
Portal is unreachable or returns a null recommendation.
- The Portal is now the source of truth for aux model selection; the
xiaomi allowlist we used to carry is effectively dead.
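A sketch of the tier-aware pick; the payload field names are from this change, the helper shape is illustrative:

```python
def pick_recommended_aux_model(payload: dict, vision: bool, free_tier: bool):
    kind = "Vision" if vision else "Compaction"
    free = (payload.get(f"freeRecommended{kind}Model") or {}).get("modelName")
    if free_tier:
        return free or None            # never leak paid models to free tier
    paid = (payload.get(f"paidRecommended{kind}Model") or {}).get("modelName")
    return paid or free or None        # staged rollout: paid field may be null
```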
Tests (15 new)
- tests/hermes_cli/test_models.py::TestNousRecommendedModels
Fetch caching, per-portal keying, network failure, force_refresh;
paid-prefers-paid, paid-falls-to-free, free-never-leaks-paid,
auto-detect, detection-error → paid default, null/blank modelName
handling.
- tests/agent/test_auxiliary_client.py::TestNousAuxiliaryRefresh
_try_nous honors Portal recommendation for text + vision, falls
back to google/gemini-3-flash-preview on None or exception.
Behavior won't visibly change today — both tier recommendations currently
point at google/gemini-3-flash-preview — but the moment the Portal ships
a better paid recommendation, subscribers pick it up within 10 minutes
without a Hermes release.
Drop _NOUS_ALLOWED_FREE_MODELS + filter_nous_free_models and its two call
sites. Whatever Nous Portal prices as free now shows up in the picker as-is
— no local allowlist gatekeeping. Free-tier partitioning (paid vs free in
the menu) still runs via partition_nous_models_by_tier.
- Wrap child.run_conversation() in a ThreadPoolExecutor with configurable
timeout (delegation.child_timeout_seconds, default 300s) to prevent
indefinite blocking when a subagent's API call or tool HTTP request hangs.
- Add heartbeat stale detection: if a child's api_call_count doesn't
advance for 5 consecutive heartbeat cycles (~2.5 min), stop touching
the parent's activity timestamp so the gateway inactivity timeout
can fire as a last resort.
- Add 'timeout' as a new exit_reason/status alongside the existing
completed/max_iterations/interrupted states.
- Use shutdown(wait=False) on the timeout executor to avoid the
ThreadPoolExecutor.__exit__ deadlock when a child is stuck on
blocking I/O.
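A sketch of the timeout wrapper ('child' stands in for the subagent object; names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FuturesTimeout

def run_child_with_timeout(child, timeout_s: float = 300.0) -> str:
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(child.run_conversation)
        try:
            future.result(timeout=timeout_s)
            return "completed"
        except FuturesTimeout:
            return "timeout"  # new exit_reason
    finally:
        # wait=False: a context-manager __exit__ (shutdown(wait=True)) would
        # deadlock here while the stuck child blocks on I/O.
        executor.shutdown(wait=False)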
Closes #13768
Add ResponsesApiTransport wrapping codex_responses_adapter.py behind the
ProviderTransport ABC. Auto-registered via _discover_transports().
Wire ALL Codex transport methods to production paths in run_agent.py:
- build_kwargs: main _build_api_kwargs codex branch (50 lines extracted)
- normalize_response: main loop + flush + summary + retry (4 sites)
- convert_tools: memory flush tool override
- convert_messages: called internally via build_kwargs
- validate_response: response validation gate
- preflight_kwargs: request sanitization (2 sites)
Remove 7 dead legacy wrappers from AIAgent (_responses_tools,
_chat_messages_to_responses_input, _normalize_codex_response,
_preflight_codex_api_kwargs, _preflight_codex_input_items,
_extract_responses_message_text, _extract_responses_reasoning_text).
Keep 3 ID manipulation methods still used by _build_assistant_message.
Update 18 test call sites across 3 test files to call adapter functions
directly instead of through deleted AIAgent wrappers.
24 new tests. 343 codex/responses/transport tests pass (0 failures).
PR 4 of the provider transport refactor.
Follow-ups after salvaging xiaoqiang243's kimi-for-coding patches:
- KIMI_CODE_BASE_URL: drop trailing /v1 (was /coding/v1).
The /coding endpoint speaks Anthropic Messages, and the Anthropic SDK
appends /v1/messages internally. /coding/v1 + SDK suffix produced
/coding/v1/v1/messages (a 404). /coding + SDK suffix now yields
/coding/v1/messages correctly.
- kimi-coding ProviderConfig: keep legacy default api.moonshot.ai/v1 so
non-sk-kimi- moonshot keys still authenticate. sk-kimi- keys are
already redirected to api.kimi.com/coding via _resolve_kimi_base_url.
- doctor.py: update Kimi UA to claude-code/0.1.0 (was KimiCLI/1.30.0)
and rewrite /coding base URLs to /coding/v1 for the /models health
check (Anthropic surface has no /models).
- test_kimi_env_vars: accept KIMI_CODING_API_KEY as a secondary env var.
E2E verified:
sk-kimi-<key> → https://api.kimi.com/coding/v1/messages (Anthropic)
sk-<legacy> → https://api.moonshot.ai/v1/chat/completions (OpenAI)
UA: claude-code/0.1.0, x-api-key: <sk-kimi-*>
- Add _is_kimi_coding_endpoint() to detect Kimi coding API
- Place Kimi check BEFORE _requires_bearer_auth to ensure User-Agent header is set
- Without this header, Kimi returns 403 on /coding/v1/messages
- Fixes kimi-2.5, kimi-for-coding, kimi-k2.6-code-preview all returning 403
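A sketch of the ordering fix; the UA value is the one from the follow-up above, and _requires_bearer_auth is stubbed for illustration:

```python
def _is_kimi_coding_endpoint(base_url: str) -> bool:
    # Sketch — the real matcher also covers /coding/v1 and variants.
    return "api.kimi.com" in base_url and "/coding" in base_url

def _requires_bearer_auth(base_url: str) -> bool:  # stand-in for the real check
    return True

def build_headers(base_url: str, api_key: str) -> dict:
    headers: dict[str, str] = {}
    if _is_kimi_coding_endpoint(base_url):
        # Must run BEFORE the bearer-auth check; without this UA Kimi
        # returns 403 on /coding/v1/messages.
        headers["User-Agent"] = "claude-code/0.1.0"
        headers["x-api-key"] = api_key
    elif _requires_bearer_auth(base_url):
        headers["Authorization"] = f"Bearer {api_key}"
    return headers
```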
The CLI has no attachment channel — MEDIA:<path> tags are only
intercepted on messaging gateway platforms (Telegram, Discord,
Slack, WhatsApp, Signal, BlueBubbles, email, etc.). On the CLI
they render as literal text, which is confusing for users.
The CLI platform hint was the one PLATFORM_HINTS entry that said
nothing about file delivery, so models trained on the messaging
hints would default to MEDIA: tags on the CLI too. Tool schemas
(browser_tool, tts_tool, etc.) also recommend MEDIA: generically.
Extend the CLI hint to explicitly discourage MEDIA: tags and tell
the agent to reference files by plain absolute path instead.
Add a regression test asserting the CLI hint carries negative
guidance about MEDIA: while messaging hints keep positive guidance.
* fix(skills/baoyu-comic): require absolute paths for curl -o downloads
When downloading generated images across several batches of image_generate
calls, relying on persistent-shell CWD is unsafe. The terminal tool's shell
can rotate (TERMINAL_LIFETIME_SECONDS expiry, a failed cd that leaves the
shell somewhere else), and 'curl -fsSL <url> -o relative.png' then silently
writes to the wrong directory with no error.
Update the skill's Step 7 Download step to require absolute -o paths (or
workdir= on the terminal tool) and add a matching pitfall entry referencing
the Apr 2026 incident where pages 06-09 of a 10-page comic landed at the
repo root instead of comic/<slug>/. The agent then spent several turns
claiming the files existed where they didn't.
* fix(skills/baoyu-comic): handle clarify timeouts correctly in Step 2
A clarify timeout returning 'Use your best judgement to make the choice
and proceed' is NOT user consent to default the entire Step 2 questionnaire.
It is a per-question default only. Add guidance at both instruction sites
(SKILL.md User Questions section, references/workflow.md Step 2 header)
telling the agent to:
1. Continue asking the remaining questions in the sequence after a
timeout — each question is an independent consent point.
2. Surface every defaulted choice in the next user-visible message
so the user can correct it when they return. An unreported default
is indistinguishable from never having asked.
Reported live Apr 2026: agent asked style question via clarify, got a
timeout response, and silently defaulted style + narrative focus +
audience + review flags in one pass. User only learned style had
defaulted to 'ohmsha' after the comic was fully generated.