hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-19 15:18:03 +00:00

Author	SHA1	Message	Date
Matt Maximo	271f0e6eb0	fix(model): let Codex setup reuse or reauthenticate	2026-04-24 04:53:32 -07:00
j3ffffff	f76df30e08	fix(auth): parse OpenAI nested error shape in Codex token refresh OpenAI's OAuth token endpoint returns errors in a nested shape — {"error": {"code": "refresh_token_reused", "message": "..."}} — not the OAuth spec's flat {"error": "...", "error_description": "..."}. The existing parser only handled the flat shape, so: - `err.get("error")` returned a dict, the `isinstance(str)` guard rejected it, and `code` stayed `"codex_refresh_failed"`. - The dedicated `refresh_token_reused` branch (with its actionable "re-run codex + hermes auth" message and `relogin_required=True`) never fired. - Users saw the generic "Codex token refresh failed with status 401" when another Codex client (CLI, VS Code extension) had consumed their single-use refresh token — giving no hint that re-auth was required. Parse both shapes, mapping OpenAI's nested `code`/`type` onto the existing `code` variable so downstream branches (`refresh_token_reused`, `invalid_grant`, etc.) fire correctly. Add regression tests covering: - nested `refresh_token_reused` → actionable message + relogin_required - nested generic code → code + message surfaced - flat OAuth-spec `invalid_grant` still handled (back-compat) - unparseable body → generic fallback message, relogin_required=False Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-24 04:53:32 -07:00
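The two error shapes described in the commit above can be told apart with a small parser. This is a minimal sketch, not the code in the repository: the function name `parse_token_error` and the constant `DEFAULT_CODE` are hypothetical, and only the two JSON shapes and the `codex_refresh_failed` default come from the commit message.

```python
from typing import Any, Optional, Tuple

# Default code named in the commit message; used when nothing better is found.
DEFAULT_CODE = "codex_refresh_failed"


def parse_token_error(body: Any) -> Tuple[str, Optional[str]]:
    """Return (code, message) for a nested or flat OAuth error body.

    Hypothetical helper for illustration only.
    """
    code: str = DEFAULT_CODE
    message: Optional[str] = None
    if not isinstance(body, dict):
        # Unparseable body: keep the generic fallback.
        return code, message

    err = body.get("error")
    if isinstance(err, dict):
        # Nested shape: {"error": {"code": "...", "message": "..."}}
        nested = err.get("code") or err.get("type")
        if isinstance(nested, str) and nested:
            code = nested
        if isinstance(err.get("message"), str):
            message = err["message"]
    elif isinstance(err, str) and err:
        # Flat OAuth-spec shape: {"error": "...", "error_description": "..."}
        code = err
        description = body.get("error_description")
        if isinstance(description, str):
            message = description
    return code, message
```

With both shapes mapped onto the same `code` value, a downstream branch such as the `refresh_token_reused` check only has to compare one string.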
LeonSGP43	ccc8fccf77	fix(cli): validate user-defined providers consistently	2026-04-24 04:48:56 -07:00
Teknium	3aa1a41e88	feat(gemini): block free-tier keys at setup + surface guidance on 429 (#15100 ) Google AI Studio's free tier (<= 250 req/day for gemini-2.5-flash) is exhausted in a handful of agent turns, so the setup wizard now refuses to wire up Gemini when the supplied key is on the free tier, and the runtime 429 handler appends actionable billing guidance. Setup-time probe (hermes_cli/main.py): - `_model_flow_api_key_provider` fires one minimal generateContent call when provider_id == 'gemini' and classifies the response as free/paid/unknown via x-ratelimit-limit-requests-per-day header or 429 body containing 'free_tier'. - Free -> print block message, refuse to save the provider, return. - Paid -> 'Tier check: paid' and proceed. - Unknown (network/auth error) -> 'could not verify', proceed anyway. Runtime 429 handler (agent/gemini_native_adapter.py): - `gemini_http_error` appends billing guidance when the 429 error body mentions 'free_tier', catching users who bypass setup by putting GOOGLE_API_KEY directly in .env. Tests: 21 unit tests for the probe + error path, 4 tests for the setup-flow block. All 67 existing gemini tests still pass.	2026-04-24 04:46:17 -07:00
Teknium	346601ca8d	fix(context): invalidate stale Codex OAuth cache entries >= 400k (#15078 ) PR #14935 added a Codex-aware context resolver but only new lookups hit the live /models probe. Users who had run Hermes on gpt-5.5 / 5.4 BEFORE that PR already had the wrong value (e.g. 1,050,000 from models.dev) persisted in ~/.hermes/context_length_cache.yaml, and the cache-first lookup in get_model_context_length() returns it forever. Symptom (reported in the wild by Ludwig, min heo, Gaoge on current main at `6051fba9d`, which is AFTER #14935): * Startup banner shows context usage against 1M * Compression fires late and then OpenAI hard-rejects with 'context length will be reduced from 1,050,000 to 128,000' around the real 272k boundary. Fix: when the step-1 cache returns a value for an openai-codex lookup, check whether it's >= 400k. Codex OAuth caps every slug at 272k (live probe values) so anything at or above 400k is definitionally a pre-#14935 leftover. Drop that entry from the on-disk cache and fall through to step 5, which runs the live /models probe and repersists the correct value (or 272k from the hardcoded fallback if the probe fails). Non-Codex providers and legitimately-cached Codex entries at 272k are untouched. Changes: - agent/model_metadata.py: * _invalidate_cached_context_length() — drop a single entry from context_length_cache.yaml and rewrite the file. * Step-1 cache check in get_model_context_length() now gates provider=='openai-codex' entries >= 400k through invalidation instead of returning them. Tests (3 new in TestCodexOAuthContextLength): - stale 1.05M Codex entry is dropped from disk AND re-resolved through the live probe to 272k; unrelated cache entries survive. - fresh 272k Codex entry is respected (no probe call, no invalidation). - non-Codex 1M entries (e.g. anthropic/claude-opus-4.6 on OpenRouter) are unaffected — the guard is strictly scoped to openai-codex. Full tests/agent/test_model_metadata.py: 88 passed.	2026-04-24 04:46:07 -07:00
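The stale-entry guard in the commit above can be sketched as a cache lookup that drops suspicious values. This is an illustration under assumptions: the in-memory `dict`, the `provider/model` key format, and the function name are hypothetical, while the `openai-codex` scope and the 400k threshold come from the commit message.

```python
from typing import Dict, Optional

# At or above this value, a Codex OAuth entry is treated as a stale leftover.
STALE_CODEX_THRESHOLD = 400_000


def lookup_cached_context_length(
    cache: Dict[str, int], provider: str, model: str
) -> Optional[int]:
    """Return the cached context length, or None if absent or invalidated.

    Hypothetical helper; the real cache is a YAML file on disk.
    """
    key = f"{provider}/{model}"  # assumed key format, for illustration only
    value = cache.get(key)
    if value is None:
        return None
    if provider == "openai-codex" and value >= STALE_CODEX_THRESHOLD:
        # Drop the stale entry so the caller falls through to a live probe.
        del cache[key]
        return None
    return value
```

Returning `None` for an invalidated entry lets the caller treat it exactly like a cache miss, so no separate code path is needed for re-resolution.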
Teknium	18f3fc8a6f	fix(tests): resolve 17 persistent CI test failures (#15084 ) Make the main-branch test suite pass again. Most failures were tests still asserting old shapes after recent refactors; two were real source bugs. Source fixes: - tools/mcp_tool.py: _kill_orphaned_mcp_children() slept 2s on every shutdown even when no tracked PIDs existed, making test_shutdown_is_parallel measure ~3s for 3 parallel 1s shutdowns. Early-return when pids is empty. - hermes_cli/tips.py: tip 105 was 157 chars; corpus max is 150. Test fixes (mostly stale mock targets / missing fixture fields): - test_zombie_process_cleanup, test_agent_cache: patch run_agent.cleanup_vm (the local name bound at import), not tools.terminal_tool.cleanup_vm. - test_browser_camofox: patch tools.browser_camofox.load_config, not hermes_cli.config.load_config (the source module, not the resolved one). - test_flush_memories_codex._chat_response_with_memory_call: add finish_reason, tool_call.id, tool_call.type so the chat_completions transport normalizer doesn't AttributeError. - test_concurrent_interrupt: polling_tool signature now accepts messages= kwarg that _invoke_tool() passes through. - test_minimax_provider: add _fallback_chain=[] to the __new__'d agent so switch_model() doesn't AttributeError. - test_skills_config: SKILLS_DIR MagicMock + .rglob stopped working after the scanner switched to agent.skill_utils.iter_skill_index_files (os.walk-based). Point SKILLS_DIR at a real tmp_path and patch agent.skill_utils.get_external_skills_dirs. - test_browser_cdp_tool: browser_cdp toolset was intentionally split into 'browser-cdp' (commit `96b0f3700`) so its stricter check_fn doesn't gate the whole browser toolset; test now expects 'browser-cdp'. - test_registry: add tools.browser_dialog_tool to the expected builtin-discovery set (PR #14540 added it). - test_file_tools TestPatchHints: patch_tool surfaces hints as a '_hint' key on the JSON payload, not inline '[Hint: ...' text. 
- test_write_deny test_hermes_env: resolve .env via get_hermes_home() so the path matches the profile-aware denylist under hermetic HERMES_HOME. - test_checkpoint_manager test_falls_back_to_parent: guard the walk-up so a stray /tmp/pyproject.toml on the host doesn't pick up /tmp as the project root. - test_quick_commands: set cli.session_id in the __new__'d CLI so the alias-args path doesn't trip AttributeError when fuzzy-matching leaks a skill command across xdist test distribution.	2026-04-24 03:46:46 -07:00
Teknium	1f9c368622	fix(gemini): drop integer/number/boolean enums from tool schemas (#15082 ) Gemini's Schema validator requires every `enum` entry to be a string, even when the parent `type` is integer/number/boolean. Discord's `auto_archive_duration` parameter (`type: integer, enum: [60, 1440, 4320, 10080]`) tripped this on every request that shipped the full tool catalog to generativelanguage.googleapis.com, surfacing as `Gateway: Non-retryable client error: Gemini HTTP 400 (INVALID_ARGUMENT) Invalid value ... (TYPE_STRING), 60` and aborting the turn. Sanitize by dropping the `enum` key when the declared type is numeric or boolean and any entry is non-string. The `type` and `description` survive, so the model still knows the allowed values; the tool handler keeps its own runtime validation. Other providers (OpenAI, OpenRouter, Anthropic) are unaffected — the sanitizer only runs for native Gemini / cloudcode adapters. Reported by @selfhostedsoul on Discord with hermes debug share.	2026-04-24 03:40:00 -07:00
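The enum rule in the commit above (drop `enum` when the declared type is numeric or boolean and any entry is non-string) could look roughly like the following. The function names are hypothetical and the recursive walk over every nested value is an assumption about how deep the sanitizer goes.

```python
import copy
from typing import Any

_NON_STRING_TYPES = {"integer", "number", "boolean"}


def drop_non_string_enums(schema: Any) -> Any:
    """Return a deep copy of `schema` with offending `enum` keys removed.

    Hypothetical helper for illustration; the input is left untouched.
    """
    cleaned = copy.deepcopy(schema)
    _walk(cleaned)
    return cleaned


def _walk(node: Any) -> None:
    if isinstance(node, dict):
        declared = node.get("type")
        enum = node.get("enum")
        if (
            isinstance(declared, str)
            and declared in _NON_STRING_TYPES
            and isinstance(enum, list)
            and any(not isinstance(entry, str) for entry in enum)
        ):
            # `type` and `description` survive; only the enum list goes.
            del node["enum"]
        for value in node.values():
            _walk(value)
    elif isinstance(node, list):
        for value in node:
            _walk(value)
```

The `isinstance(declared, str)` check matters because a `properties` mapping may itself contain a property called `type` whose value is a nested schema rather than a type name.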
Nicolò Boschi	edff2fbe7e	feat(hindsight): optional bank_id_template for per-agent / per-user banks Adds an optional bank_id_template config that derives the bank name at initialize() time from runtime context. Existing users with a static bank_id keep the current behavior (template is empty by default). Supported placeholders: {profile} — active Hermes profile (agent_identity kwarg) {workspace} — Hermes workspace (agent_workspace kwarg) {platform} — cli, telegram, discord, etc. {user} — platform user id (gateway sessions) {session} — session id Unsafe characters in placeholder values are sanitized, and empty placeholders collapse cleanly (e.g. "hermes-{user}" with no user becomes "hermes"). If the template renders empty, the static bank_id is used as a fallback. Common uses: bank_id_template: hermes-{profile} # isolate per Hermes profile bank_id_template: {workspace}-{profile} # workspace + profile scoping bank_id_template: hermes-{user} # per-user banks for gateway	2026-04-24 03:38:17 -07:00
Nicolò Boschi	f9c6c5ab84	fix(hindsight): scope document_id per process to avoid resume overwrite (#6602 ) Reusing session_id as document_id caused data loss on /resume: when the session is loaded again, _session_turns starts empty and the next retain replaces the entire previously stored content. Now each process lifecycle gets its own document_id formed as {session_id}-{startup_timestamp}, so: - Same session, same process: turns accumulate into one document (existing behavior) - Resume (new process, same session): writes a new document, old one preserved - Forks: child process gets its own document; parent's doc is untouched Also adds session lineage tags so all processes for the same session (or its parent) can still be filtered together via recall: - session:<session_id> on every retain - parent:<parent_session_id> when initialized with parent_session_id Closes #6602	2026-04-24 03:38:17 -07:00
Teknium	3a86f70969	test(hindsight): update materialize-profile-env test for HINDSIGHT_TIMEOUT The existing test_local_embedded_setup_materializes_profile_env expected exact equality on ~/.hermes/.env content; the new HINDSIGHT_TIMEOUT=120 line from the timeout feature now appears in that file. Append it to the expected string so the test reflects the new post_setup output.	2026-04-24 03:36:02 -07:00
Jason Perlow	93a74f74bf	fix(hindsight): preserve shared event loop across provider shutdowns The module-global `_loop` / `_loop_thread` pair is shared across every `HindsightMemoryProvider` instance in the process — the plugin loader creates one provider per `AIAgent`, and the gateway creates one `AIAgent` per concurrent chat session (Telegram/Discord/Slack/CLI). `HindsightMemoryProvider.shutdown()` stopped the shared loop when any one session ended. That stranded the aiohttp `ClientSession` and `TCPConnector` owned by every sibling provider on a now-dead loop — they were never reachable for close and surfaced as the `Unclosed client session` / `Unclosed connector` warnings reported in #11923. Fix: stop stopping the shared loop in `shutdown()`. Per-provider cleanup still closes that provider's own client via `self._client.aclose()`. The loop runs on a daemon thread and is reclaimed on process exit; keeping it alive between provider shutdowns means sibling providers can drain their own sessions cleanly. Regression tests in `tests/plugins/memory/test_hindsight_provider.py` (`TestSharedEventLoopLifecycle`): - `test_shutdown_does_not_stop_shared_event_loop` — two providers share the loop; shutting down one leaves the loop live for the other. This test reproduces the #11923 leak on `main` and passes with the fix. - `test_client_aclose_called_on_cloud_mode_shutdown` — each provider's own aiohttp session is still closed via `aclose()`. Fixes #11923.	2026-04-24 03:34:12 -07:00
Teknium	42d6ab5082	test(gateway): unify discord mock via shared conftest; drop duplicated mock in model_picker test The cherry-picked model_picker test installed its own discord mock at module-import time via a local _ensure_discord_mock(), overwriting sys.modules['discord'] with a mock that lacked attributes other gateway tests needed (Intents.default(), File, app_commands.Choice). On pytest-xdist workers that collected test_discord_model_picker.py first, the shared mock in tests/gateway/conftest.py got clobbered and downstream tests failed with AttributeError / TypeError against missing mock attrs. Classic sys.modules cross-test pollution (see xdist-cross-test-pollution skill). Fix: - Extend the canonical _ensure_discord_mock() in tests/gateway/conftest.py to cover everything the model_picker test needs: real View/Select/ Button/SelectOption classes (not MagicMock sentinels), an Embed class that preserves title/description/color kwargs for assertion, and Color.greyple. - Strip the duplicated mock-setup block from test_discord_model_picker.py and rely on the shared mock that conftest installs at collection time. Regression check: scripts/run_tests.sh tests/gateway/ tests/hermes_cli/ -k 'discord or model or copilot or provider' -o 'addopts=' 1291 passed (was 1288 passed + 3 xdist-ordered failures before this commit).	2026-04-24 03:33:29 -07:00
Nicecsh	fe34741f32	fix(model): repair Discord Copilot /model flow Keep Discord Copilot model switching responsive and current by refreshing picker data from the live catalog when possible, correcting the curated fallback list, and clearing stale controls before the switch completes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 03:33:29 -07:00
Nicecsh	2e2de124af	fix(aux): normalize GitHub Copilot provider slugs Keep auxiliary provider resolution aligned with the switch and persisted main-provider paths when models.dev returns github-copilot slugs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 03:33:29 -07:00
LeonSGP43	df55660e3c	fix(hindsight): disable broken local runtime on unsupported CPUs	2026-04-24 03:33:14 -07:00
kshitij	7897f65a94	fix(normalize): lowercase Xiaomi model IDs for case-insensitive config (#15066) Xiaomi's API (api.xiaomimimo.com) requires lowercase model IDs like "mimo-v2.5-pro" and rejects mixed-case names like "MiMo-V2.5-Pro" that users copy from marketing docs or the ProviderEntry description. Add _LOWERCASE_MODEL_PROVIDERS set and apply .lower() to model names for providers in this set (currently just xiaomi) after stripping the provider prefix. This ensures any case variant in config.yaml is normalized before hitting the API. Other providers (minimax, zai, etc.) are NOT affected; their APIs accept mixed case (e.g. MiniMax-M2.7).	2026-04-24 03:33:05 -07:00
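The normalization described in the commit above amounts to a provider-scoped `.lower()` call. In this sketch `_LOWERCASE_MODEL_PROVIDERS` is the name given in the commit message, while `normalize_model_name` and the `provider/` prefix format are assumptions made for illustration.

```python
# Providers whose APIs only accept lowercase model IDs (name from the commit).
_LOWERCASE_MODEL_PROVIDERS = {"xiaomi"}


def normalize_model_name(provider: str, model: str) -> str:
    """Strip an optional "provider/" prefix, then lowercase if required.

    Hypothetical helper; the prefix format is an assumption.
    """
    prefix = f"{provider}/"
    if model.startswith(prefix):
        model = model[len(prefix):]
    if provider in _LOWERCASE_MODEL_PROVIDERS:
        model = model.lower()
    return model
```

Keeping the provider set explicit means mixed-case IDs for other providers, such as `MiniMax-M2.7`, pass through unchanged.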
bwjoke	3e994e38f7	[verified] fix: materialize hindsight profile env during setup	2026-04-24 03:30:11 -07:00
JC的AI分身	127048e643	fix(hindsight): accept snake_case api_key config	2026-04-24 03:30:03 -07:00
harryplusplus	d6b65bbc47	fix(hindsight): preserve non-ASCII text in retained conversation turns	2026-04-24 03:29:58 -07:00
WildCat Eng Manager	7626f3702e	feat: read prompt caching cache_ttl from config - Load prompt_caching.cache_ttl in AIAgent (5m default, 1h opt-in) - Document DEFAULT_CONFIG and developer guide example - Add unit tests for default, 1h, and invalid TTL fallback Made-with: Cursor	2026-04-24 03:21:29 -07:00
Harry Riddle	ac25e6c99a	feat(auth-codex): add config-provider fallback detection for logout in hermes-agent/hermes_cli/auth.py	2026-04-24 03:17:18 -07:00
Teknium	b2e124d082	refactor(commands): drop /provider, /plan handler, and clean up slash registry (#15047) * refactor(commands): drop /provider and clean up slash registry * refactor(commands): drop /plan special handler; use plain skill dispatch	2026-04-24 03:10:52 -07:00
Teknium	b29287258a	fix(aux-client): honor api_mode: anthropic_messages for named custom providers (#15059 ) Auxiliary tasks (session_search, flush_memories, approvals, compression, vision, etc.) that route to a named custom provider declared under config.yaml 'providers:' with 'api_mode: anthropic_messages' were silently building a plain OpenAI client and POSTing to {base_url}/chat/completions, which returns 404 on Anthropic-compatible gateways that only expose /v1/messages. Two gaps caused this: 1. hermes_cli/runtime_provider.py::_get_named_custom_provider — the providers-dict branch (new-style) returned only name/base_url/api_key/ model and dropped api_mode. The legacy custom_providers-list branch already propagated it correctly. The dict branch now parses and returns api_mode via _parse_api_mode() in both match paths. 2. agent/auxiliary_client.py::resolve_provider_client — the named custom provider block at ~L1740 ignored custom_entry['api_mode'] and unconditionally built an OpenAI client (only wrapping for Codex/Responses). It now mirrors _try_custom_endpoint()'s three-way dispatch: anthropic_messages → AnthropicAuxiliaryClient (async wrapped in AsyncAnthropicAuxiliaryClient), codex_responses → CodexAuxiliaryClient, otherwise plain OpenAI. An explicit task-level api_mode override still wins over the provider entry's declared api_mode. Fixes #15033 Tests: tests/agent/test_auxiliary_named_custom_providers.py gains a TestProvidersDictApiModeAnthropicMessages class covering - providers-dict preserves valid api_mode - invalid api_mode values are dropped - missing api_mode leaves the entry unchanged (no regression) - resolve_provider_client returns (Async)AnthropicAuxiliaryClient for api_mode=anthropic_messages - full chain via get_text_auxiliary_client / get_async_text_auxiliary_client with an auxiliary.<task> override - providers without api_mode still use the OpenAI-wire path	2026-04-24 03:10:30 -07:00
luyao618	bc15f526fb	fix(agent): exclude prior-history tool messages from background review summary Cherry-pick-of: `27b6a217b` (PR #14967 by @luyao618) Co-authored-by: luyao618 <364939526@qq.com>	2026-04-24 03:10:19 -07:00
Teknium	f24956ba12	fix(resume): redirect --resume to the descendant that actually holds the messages When context compression fires mid-session, run_agent's _compress_context ends the current session, creates a new child session linked by parent_session_id, and resets the SQLite flush cursor. New messages land in the child; the parent row ends up with message_count = 0. A user who runs 'hermes --resume <original_id>' sees a blank chat even though the transcript exists — just under a descendant id. PR #12920 already fixed the exit banner to print the live descendant id at session end, but that didn't help users who resume by a session id captured BEFORE the banner update (scripts, sessions list, old terminal scrollback) or who type the parent id manually. Fix: add SessionDB.resolve_resume_session_id() which walks the parent→child chain forward and returns the first descendant with at least one message row. Wire it into all three resume entry points: - HermesCLI._preload_resumed_session() (early resume at run() time) - HermesCLI._init_agent() (the classical resume path) - /resume slash command Semantics preserved when the chain has no descendants with messages, when the requested session already has messages, or when the id is unknown. A depth cap of 32 guards against malformed loops. This does NOT concatenate the pre-compression parent transcript into the child — the whole point of compression is to shrink that, so replaying it would blow the cache budget we saved. We just jump to the post-compression child. The summary already reflects what was compressed away. Tests: tests/hermes_state/test_resolve_resume_session_id.py covers - the exact 6-session shape from the issue - passthrough when session has messages / no descendants - passthrough for nonexistent / empty / None input - middle-of-chain redirects - fork resolution (prefers most-recent child) Closes #15000	2026-04-24 03:04:42 -07:00
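The forward walk described in the commit above can be sketched over an in-memory session table. Everything about the data model here is an assumption (the `Session` fields, the use of `started_at` to pick the most recent child); only the behaviour comes from the commit message: pass through when the session has messages, is unknown, or has no descendant with messages, and stop after 32 hops.

```python
from typing import Dict, Optional, TypedDict


class Session(TypedDict):
    # Hypothetical row shape; the real store is SQLite.
    parent: Optional[str]
    message_count: int
    started_at: float


MAX_DEPTH = 32  # depth cap named in the commit message


def resolve_resume_session_id(
    sessions: Dict[str, Session], session_id: Optional[str]
) -> Optional[str]:
    """Follow parent -> child links to the first descendant with messages."""
    if not session_id or session_id not in sessions:
        return session_id  # unknown, empty, or None: pass through
    current = session_id
    for _ in range(MAX_DEPTH):
        if sessions[current]["message_count"] > 0:
            return current
        children = [
            sid for sid, row in sessions.items() if row["parent"] == current
        ]
        if not children:
            return session_id  # chain ended without messages
        # On a fork, prefer the most recently started child.
        current = max(children, key=lambda sid: sessions[sid]["started_at"])
    return session_id  # depth cap hit, e.g. a malformed loop
```

Returning the original id in every fallback case keeps the previous resume semantics intact whenever no redirect is possible.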
Teknium	166b960fe4	test(proxy): regression tests for NO_PROXY bypass on keepalive client Pin the behaviour added in the preceding commit — `_get_proxy_for_base_url()` must return None for hosts covered by NO_PROXY and the HTTPS_PROXY otherwise, and the full `_create_openai_client()` path must NOT mount HTTPProxy for a NO_PROXY host. Refs: #14966	2026-04-24 03:04:42 -07:00
Cameron Aragon	dfc5563641	fix(acp): include MCP toolsets in ACP sessions	2026-04-24 03:04:42 -07:00
Teknium	8a1e247c6c	fix(discord): honor wildcard '*' in ignored_channels and free_response_channels Follow-up to the allowed_channels wildcard fix in the preceding commit. The same '*' literal trap affected two other Discord channel config lists: - DISCORD_IGNORED_CHANNELS: '*' was stored as the literal string in the ignored set, and the intersection check never matched real channel IDs, so '*' was a no-op instead of silencing every channel. - DISCORD_FREE_RESPONSE_CHANNELS: same shape; '*' never matched, so the bot still required a mention everywhere. Add a '*' short-circuit to both checks, matching the allowed_channels semantics. Extend tests/gateway/test_discord_allowed_channels.py with regression coverage for all three lists. Refs: #14920	2026-04-24 03:04:42 -07:00
Mrunmayee Rane	8598746e86	fix(discord): honor wildcard '*' in DISCORD_ALLOWED_CHANNELS allowed_channels: "*" in config (or DISCORD_ALLOWED_CHANNELS="*" env var) is meant to allow all channels, but the check was comparing numeric channel IDs against the literal string set {"*"} via set intersection, which is always empty, so every message was silently dropped. Add a "*" short-circuit before the set intersection, consistent with every other platform's allowlist handling (Signal, Slack, Telegram all do this). Fixes #14920	2026-04-24 03:04:42 -07:00
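The wildcard short-circuit from the two Discord commits above fits in a few lines. The function name and the string-typed channel IDs are assumptions for illustration; the logic (check for the wildcard before intersecting) follows the commit messages.

```python
from typing import Set

WILDCARD = "*"


def channel_list_matches(configured: Set[str], channel_ids: Set[str]) -> bool:
    """Return True if a configured channel list covers any of `channel_ids`.

    Hypothetical helper: the wildcard check runs before the intersection,
    because "*" never equals a real channel ID.
    """
    if WILDCARD in configured:
        return True
    return bool(configured & channel_ids)
```

The same helper shape would serve an allowlist, an ignore list, or a free-response list, since each one asks the same membership question.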
Reginaldas	3e10f339fd	fix(providers): send user agent to routermint endpoints	2026-04-24 03:02:16 -07:00
Keira Voss	1ef1e4c669	feat(plugins): add pre_gateway_dispatch hook Introduces a new plugin hook `pre_gateway_dispatch` fired once per incoming MessageEvent in `_handle_message`, after the internal-event guard but before the auth / pairing chain. Plugins may return a dict to influence flow: {"action": "skip", "reason": "..."} -> drop (no reply) {"action": "rewrite", "text": "..."} -> replace event.text {"action": "allow"} / None -> normal dispatch Motivation: gateway-level message-flow patterns that don't fit cleanly into any single adapter — e.g. listen-only group-chat windows (buffer ambient messages, collapse on @mention), or human-handover silent ingest (record messages while an owner handles the chat manually). Today these require forking core; with this hook they can live in a single profile-agnostic plugin. Hook runs BEFORE auth so plugins can handle unauthorized senders (e.g. customer-service handover ingest) without triggering the pairing-code flow. Exceptions in plugin callbacks are caught and logged; the first non-None action dict wins, remaining results are ignored. Includes: - `VALID_HOOKS` entry + inline doc in `hermes_cli/plugins.py` - Invocation block in `gateway/run.py::_handle_message` - 5 new tests in `tests/gateway/test_pre_gateway_dispatch.py` (skip, rewrite, allow, exception safety, internal-event bypass) - 2 additional tests in `tests/hermes_cli/test_plugins.py` - Table entry in `website/docs/user-guide/features/plugins.md` Made-with: Cursor	2026-04-24 03:02:03 -07:00
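The dispatch rules described in the commit above (exceptions are swallowed, the first non-None action dict wins) can be sketched as follows. The function name, the `Hook` type, and the `(deliver, text)` return value are hypothetical; whether every hook still runs after a decision is found is also an assumption, inferred from "remaining results are ignored".

```python
import logging
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical hook signature: takes the message text, may return an action dict.
Hook = Callable[[str], Optional[Dict[str, Any]]]

logger = logging.getLogger(__name__)


def apply_pre_gateway_dispatch(hooks: List[Hook], text: str) -> Tuple[bool, str]:
    """Return (deliver, text) after consulting every hook."""
    decision: Optional[Dict[str, Any]] = None
    for hook in hooks:
        try:
            result = hook(text)
        except Exception:
            # A broken plugin must not take down message handling.
            logger.exception("pre_gateway_dispatch hook failed")
            continue
        if decision is None and isinstance(result, dict) and "action" in result:
            decision = result  # first action dict wins; later ones are ignored

    if decision is None:
        return True, text
    action = decision.get("action")
    if action == "skip":
        return False, text  # drop the message, no reply
    if action == "rewrite" and isinstance(decision.get("text"), str):
        return True, decision["text"]
    return True, text  # "allow" or anything unrecognised: normal dispatch
```

A plugin that wants a listen-only window would return `{"action": "skip", "reason": "..."}` for ambient messages and `None` otherwise.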
0xbyt4	8aa37a0cf9	fix(auth): honor SSL CA env vars across httpx + requests callsites - hermes_cli/auth.py: add _default_verify() with macOS Homebrew certifi fallback (mirrors weixin `3a0ec1d93`). Extend env var chain to include REQUESTS_CA_BUNDLE so one env var works across httpx + requests paths. - agent/model_metadata.py: add _resolve_requests_verify() reading HERMES_CA_BUNDLE / REQUESTS_CA_BUNDLE / SSL_CERT_FILE in priority order. Apply explicit verify= to all 6 requests.get callsites. - Tests: 18 new unit tests + autouse platform pin on existing TestResolveVerifyFallback to keep its "returns True" assertions platform-independent. Empirically verified against self-signed HTTPS server: requests honors REQUESTS_CA_BUNDLE only; httpx honors SSL_CERT_FILE only. Hermes now honors all three everywhere. Triggered by Discord reports — Nous OAuth SSL failure on macOS Homebrew Python; custom provider self-signed cert ignored despite REQUESTS_CA_BUNDLE set in env.	2026-04-24 03:00:33 -07:00
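The priority order for CA bundle variables in the commit above can be sketched as a small resolver. The function name and the injectable `env` mapping are assumptions; the variable names and their order come from the commit message, and the sketch does not reproduce the macOS Homebrew certifi fallback.

```python
import os
from typing import Mapping, Union

# Checked in priority order, as listed in the commit message.
_CA_ENV_VARS = ("HERMES_CA_BUNDLE", "REQUESTS_CA_BUNDLE", "SSL_CERT_FILE")


def resolve_verify(env: Mapping[str, str] = os.environ) -> Union[str, bool]:
    """Return a CA bundle path for `verify=`, or True for library defaults.

    Hypothetical helper for illustration.
    """
    for name in _CA_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    return True
```

The result can be passed as the `verify=` argument of `requests.get`, which accepts either a boolean or a path to a CA bundle.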
Teknium	a9a4416c7c	fix(compress): don't reach into ContextCompressor privates from /compress (#15039 ) Manual /compress crashed with 'LCMEngine' object has no attribute '_align_boundary_forward' when any context-engine plugin was active. The gateway handler reached into _align_boundary_forward and _find_tail_cut_by_tokens on tmp_agent.context_compressor, but those are ContextCompressor-specific — not part of the generic ContextEngine ABC — so every plugin engine (LCM, etc.) raised AttributeError. - Add optional has_content_to_compress(messages) to ContextEngine ABC with a safe default of True (always attempt). - Override it in the built-in ContextCompressor using the existing private helpers — preserves exact prior behavior for 'compressor'. - Rewrite gateway /compress preflight to call the ABC method, deleting the private-helper reach-in. - Add focus_topic to the ABC compress() signature. Make _compress_context retry without focus_topic on TypeError so older strict-sig plugins don't crash on manual /compress <focus>. - Regression test with a fake ContextEngine subclass that only implements the ABC (mirrors LCM's surface). Reported by @selfhostedsoul (Discord, Apr 22).	2026-04-24 02:55:43 -07:00
Teknium	4350668ae4	fix(transcription): fall back to CPU when CUDA runtime libs are missing faster-whisper's device="auto" picks CUDA when ctranslate2's wheel ships CUDA shared libs, even on hosts without the NVIDIA runtime (libcublas.so.12 / libcudnn*). On those hosts the model often loads fine but transcribe() fails at first dlopen, and the broken model stays cached in the module-global — every subsequent voice message in the gateway process fails identically until restart. - Add _load_local_whisper_model() wrapper: try auto, catch missing-lib errors, retry on device=cpu compute_type=int8. - Wrap transcribe() with the same fallback: evict cached model, reload on CPU, retry once. Required because the dlopen failure only surfaces at first kernel launch, not at model construction. - Narrow marker list (libcublas, libcudnn, libcudart, 'cannot be loaded', 'no kernel image is available', 'no CUDA-capable device', driver mismatch). Deliberately excludes 'CUDA out of memory' and similar — those are real runtime failures that should surface, not be silently retried on CPU. - Tests for load-time fallback, runtime fallback (with cached-model eviction verified), and the OOM non-fallback path. Reported via Telegram voice-message dumps on WSL2 hosts where libcublas isn't installed by default.	2026-04-24 02:50:14 -07:00
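The load-time half of the fallback in the commit above might look like this. The names `is_missing_cuda_runtime` and `load_with_cpu_fallback` are hypothetical, `factory` stands in for whatever constructs the model, and the marker tuple copies only the substrings quoted in the commit message (the "driver mismatch" wording is not given there, so it is left out).

```python
from typing import Any, Callable

# Substrings that indicate missing CUDA runtime libraries, not a real failure.
_MISSING_CUDA_MARKERS = (
    "libcublas",
    "libcudnn",
    "libcudart",
    "cannot be loaded",
    "no kernel image is available",
    "no cuda-capable device",
)


def is_missing_cuda_runtime(exc: BaseException) -> bool:
    """True when the error text matches one of the narrow markers."""
    message = str(exc).lower()
    return any(marker in message for marker in _MISSING_CUDA_MARKERS)


def load_with_cpu_fallback(factory: Callable[..., Any]) -> Any:
    """Try device="auto" first; retry on CPU only for missing-lib errors."""
    try:
        return factory(device="auto")
    except Exception as exc:
        if not is_missing_cuda_runtime(exc):
            raise  # e.g. "CUDA out of memory" must surface unchanged
        return factory(device="cpu", compute_type="int8")
```

Keeping the marker list narrow is the design point: a genuine out-of-memory error is re-raised instead of being hidden behind a silent CPU retry.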
Teknium	34c3e67109	fix: sanitize tool schemas for llama.cpp backends; restore MCP in TUI (#15032 ) Local llama.cpp servers (e.g. ggml-org/llama.cpp:full-cuda) fail the entire request with HTTP 400 'Unable to generate parser for this template. ... Unrecognized schema: "object"' when any tool schema contains shapes its json-schema-to-grammar converter can't handle: * 'type': 'object' without 'properties' * bare string schema values ('additionalProperties: "object"') * 'type': ['X', 'null'] arrays (nullable form) Cloud providers accept these silently, so they ship from external MCP servers (Atlassian, GCloud, Datadog) and from a couple of our own tools. Changes - tools/schema_sanitizer.py: walks the finalized tool list right before it leaves get_tool_definitions() and repairs the hostile shapes in a deep copy. No-op on well-formed schemas. Recurses into properties, items, additionalProperties, anyOf/oneOf/allOf, and $defs. - model_tools.get_tool_definitions(): invoke the sanitizer as the last step so all paths (built-in, MCP, plugin, dynamically-rebuilt) get covered uniformly. - tools/browser_cdp_tool.py, tools/mcp_tool.py: fix our own bare-object schemas so sanitization isn't load-bearing for in-repo tools. - tui_gateway/server.py: _load_enabled_toolsets() was passing include_default_mcp_servers=False at runtime. That's the config-editing variant (see PR #3252) — it silently drops every default MCP server from the TUI's enabled_toolsets, which is why the TUI didn't hit the llama.cpp crash (no MCP tools sent at all). Switch to True so TUI matches CLI behavior. Tests tests/tools/test_schema_sanitizer.py (17 tests) covers the individual failure modes, well-formed pass-through, deep-copy isolation, and required-field pruning. E2E: loaded the default 'hermes-cli' toolset with MCP discovery and confirmed all 27 resolved tool schemas pass a llama.cpp-compatibility walk (no 'object' node missing 'properties', no bare-string schema values).	2026-04-24 02:44:46 -07:00
Brooklyn Nicholson	78481ac124	feat(tui): per-section visibility for the details accordion Adds optional per-section overrides on top of the existing global details_mode (hidden | collapsed | expanded). Lets users keep the accordion collapsed by default while auto-expanding tools, or hide the activity panel entirely without touching thinking/tools/subagents. Config (~/.hermes/config.yaml): display: details_mode: collapsed sections: thinking: expanded tools: expanded activity: hidden Slash command: /details show current global + overrides /details [hidden|collapsed|expanded] set global mode (existing) /details <section> <mode|reset> per-section override (new) /details <section> reset clear override Sections: thinking, tools, subagents, activity. Implementation: - ui-tui/src/types.ts SectionName + SectionVisibility - ui-tui/src/domain/details.ts parseSectionMode / resolveSections / sectionMode + SECTION_NAMES - ui-tui/src/app/uiStore.ts + app/interfaces.ts + app/useConfigSync.ts sections threaded into UiState - ui-tui/src/components/ thinking.tsx ToolTrail consults per-section mode for hidden/expanded behaviour; expandAll skips hidden sections; floating-alert fallback respects activity:hidden - ui-tui/src/components/ messageLine.tsx + appLayout.tsx pass sections through render tree - ui-tui/src/app/slash/ commands/core.ts /details <section> <mode|reset> syntax - tui_gateway/server.py config.set details_mode.<section> writes to display.sections.<section> (empty value clears the override) - website/docs/user-guide/tui.md documented Tests: 14 new (4 domain, 4 useConfigSync, 3 slash, 3 gateway). Total: 269/269 vitest, all gateway tests pass.	2026-04-24 02:34:32 -05:00
Teknium	6051fba9dc	feat(banner): hyperlink startup banner title to latest GitHub release (#14945) Wrap the existing version label in the welcome-banner panel title ('Hermes Agent v… · upstream … · local …') with an OSC-8 terminal hyperlink pointing at the latest git tag's GitHub release page (https://github.com/NousResearch/hermes-agent/releases/tag/<tag>). Clickable in modern terminals (iTerm2, WezTerm, Windows Terminal, GNOME Terminal, Kitty, etc.); degrades to plain text on terminals without OSC-8 support. No new line added to the banner. New get_latest_release_tag() helper runs 'git describe --tags --abbrev=0' in the Hermes checkout (3s timeout, per-process cache, silent fallback for non-git/pip installs and forks without tags).	2026-04-23 23:28:34 -07:00
Teknium	2acc8783d1	fix(errors): classify OpenRouter privacy-guardrail 404s distinctly (#14943) OpenRouter returns a 404 with the specific message 'No endpoints available matching your guardrail restrictions and data policy. Configure: https://openrouter.ai/settings/privacy' when a user's account-level privacy setting excludes the only endpoint serving a model (e.g. DeepSeek V4 Pro, which today is hosted only by DeepSeek's own endpoint that may log inputs). Before this change we classified it as model_not_found, which was misleading (the model exists) and triggered provider fallback (useless, since the same account setting applies to every OpenRouter call). Now it classifies as a new FailoverReason.provider_policy_blocked with retryable=False, should_fallback=False. The error body already contains the fix URL, so the user still gets actionable guidance.	2026-04-23 23:26:29 -07:00
Teknium	51f4c9827f	fix(context): resolve real Codex OAuth context windows (272k, not 1M) (#14935) On ChatGPT Codex OAuth every gpt-5.x slug actually caps at 272,000 tokens, but Hermes was resolving gpt-5.5 / gpt-5.4 to 1,050,000 (from models.dev) because openai-codex aliases to the openai entry there. At 1.05M the compressor never fires and requests hard-fail with 'context window exceeded' around the real 272k boundary. Verified live against chatgpt.com/backend-api/codex/models: gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.3-codex, gpt-5.2-codex, gpt-5.2, gpt-5.1-codex-max → context_window = 272000 Changes: - agent/model_metadata.py: * _fetch_codex_oauth_context_lengths(): probe the Codex /models endpoint with the OAuth bearer token and read context_window per slug (1h in-memory TTL). * _resolve_codex_oauth_context_length(): prefer the live probe, fall back to hardcoded _CODEX_OAUTH_CONTEXT_FALLBACK (all 272k). * Wire into get_model_context_length() when provider=='openai-codex', running BEFORE the models.dev lookup (which returns 1.05M). Result persists via save_context_length() so subsequent lookups skip the probe entirely. * Fixed the now-wrong comment on the DEFAULT_CONTEXT_LENGTHS gpt-5.5 entry (400k was never right for Codex; it's the catch-all for providers we can't probe live). Tests (4 new in TestCodexOAuthContextLength): - fallback table used when no token is available (no models.dev leakage) - live probe overrides the fallback - probe failure (non-200) falls back to hardcoded 272k - non-codex providers (openrouter, direct openai) unaffected Non-codex context resolution is unchanged; the Codex branch only fires when provider=='openai-codex'.	2026-04-23 22:39:47 -07:00
Teknium	5a1c599412	feat(browser): CDP supervisor - dialog detection + response + cross-origin iframe eval (#14540) * docs: browser CDP supervisor design (for upcoming PR) Design doc ahead of implementation: dialog + iframe detection/interaction via a persistent CDP supervisor. Covers backend capability matrix (verified live 2026-04-23), architecture, lifecycle, policy, agent surface, PR split, non-goals, and test plan. Supersedes #12550. No code changes in this commit. * feat(browser): add persistent CDP supervisor for dialog + frame detection Single persistent CDP WebSocket per Hermes task_id that subscribes to Page/Runtime/Target events and maintains thread-safe state for pending dialogs, frame tree, and console errors. Supervisor lives in its own daemon thread running an asyncio loop; external callers use sync API (snapshot(), respond_to_dialog()) that bridges onto the loop. Auto-attaches to OOPIF child targets via Target.setAutoAttach{flatten:true} and enables Page+Runtime on each so iframe-origin dialogs surface through the same supervisor. Dialog policies: must_respond (default, 300s safety timeout), auto_dismiss, auto_accept. Frame tree capped at 30 entries + OOPIF depth 2 to keep snapshot payloads bounded on ad-heavy pages. E2E verified against real Chrome via smoke test: detects + responds to main-frame alerts, iframe-contentWindow alerts, preserves frame tree, graceful no-dialog error path, clean shutdown. No agent-facing tool wiring in this commit (comes next). * feat(browser): add browser_dialog tool wired to CDP supervisor Agent-facing response-only tool. Schema: action: 'accept' | 'dismiss' (required) prompt_text: response for prompt() dialogs (optional) dialog_id: disambiguate when multiple dialogs queued (optional) Handler: SUPERVISOR_REGISTRY.get(task_id).respond_to_dialog(...) check_fn shares _browser_cdp_check with browser_cdp so both surface and hide together.
When no supervisor is attached (Camofox, default Playwright, or no browser session started yet), tool is hidden; if somehow invoked it returns a clear error pointing the agent to browser_navigate / /browser connect. Registered in _HERMES_CORE_TOOLS and the browser / hermes-acp / hermes-api-server toolsets alongside browser_cdp. * feat(browser): wire CDP supervisor into session lifecycle + browser_snapshot Supervisor lifecycle: * _get_session_info lazy-starts the supervisor after a session row is materialized — covers every backend code path (Browserbase, cdp_url override, /browser connect, future providers) with one hook. * cleanup_browser(task_id) stops the supervisor for that task first (before the backend tears down CDP). * cleanup_all_browsers() calls SUPERVISOR_REGISTRY.stop_all(). * /browser connect eagerly starts the supervisor for task 'default' so the first snapshot already shows pending_dialogs. * /browser disconnect stops the supervisor. CDP URL resolution for the supervisor: 1. BROWSER_CDP_URL / browser.cdp_url override. 2. Fallback: session_info['cdp_url'] from cloud providers (Browserbase). browser_snapshot merges supervisor state (pending_dialogs + frame_tree) into its JSON output when a supervisor is active — the agent reads pending_dialogs from the snapshot it already requests, then calls browser_dialog to respond. No extra tool surface. Config defaults: * browser.dialog_policy: 'must_respond' (new) * browser.dialog_timeout_s: 300 (new) No version bump — new keys deep-merge into existing browser section. Deadlock fix in supervisor event dispatch: * _on_dialog_opening and _on_target_attached used to await CDP calls while the reader was still processing an event — but only the reader can set the response Future, so the call timed out. * Both now fire asyncio.create_task(...) so the reader stays pumping. * auto_dismiss/auto_accept now actually close the dialog immediately. 
Tests (tests/tools/test_browser_supervisor.py, 11 tests, real Chrome): * supervisor start/snapshot * main-frame alert detection + dismiss * iframe.contentWindow alert * prompt() with prompt_text reply * respond with no pending dialog -> clean error * auto_dismiss clears on event * registry idempotency * registry stop -> snapshot reports inactive * browser_dialog tool no-supervisor error * browser_dialog invalid action * browser_dialog end-to-end via tool handler xdist-safe: chrome_cdp fixture uses a per-worker port. Skipped when google-chrome/chromium isn't installed. * docs(browser): document browser_dialog tool + CDP supervisor - user-guide/features/browser.md: new browser_dialog section with workflow, availability gate, and dialog_policy table - reference/tools-reference.md: row for browser_dialog, tool count bumped 53 -> 54, browser tools count 11 -> 12 - reference/toolsets-reference.md: browser_dialog added to browser toolset row with note on pending_dialogs / frame_tree snapshot fields Full design doc lives at developer-guide/browser-supervisor.md (committed earlier). * fix(browser): reconnect loop + recent_dialogs for Browserbase visibility Found via Browserbase E2E test that revealed two production-critical issues: 1. Supervisor WebSocket drops when other clients disconnect. Browserbase's CDP proxy tears down our long-lived WebSocket whenever a short-lived client (e.g. agent-browser CLI's per-command CDP connection) disconnects. Fixed with a reconnecting _run loop that re-attaches with exponential backoff on drops. _page_session_id and _child_sessions are reset on each reconnect; pending_dialogs and frames are preserved across reconnects. 2. Browserbase auto-dismisses dialogs server-side within ~10ms. Their Playwright-based CDP proxy dismisses alert/confirm/prompt before our Page.handleJavaScriptDialog call can respond. So pending_dialogs is empty by the time the agent reads a snapshot on Browserbase. 
Added a recent_dialogs ring buffer (capacity 20) that retains a DialogRecord for every dialog that opened, with a closed_by tag: * 'agent' — agent called browser_dialog * 'auto_policy' — local auto_dismiss/auto_accept fired * 'watchdog' — must_respond timeout auto-dismissed (300s default) * 'remote' — browser/backend closed it on us (Browserbase) Agents on Browserbase now see the dialog history with closed_by='remote' so they at least know a dialog fired, even though they couldn't respond. 3. Page.javascriptDialogClosed matching bug. The event doesn't include a 'message' field (CDP spec has only 'result' and 'userInput') but our _on_dialog_closed was matching on message. Fixed to match by session_id + oldest-first, with a safety assumption that only one dialog is in flight per session (the JS thread is blocked while a dialog is up). Docs + tests updated: * browser.md: new availability matrix showing the three backends and which mode (pending / recent / response) each supports * developer-guide/browser-supervisor.md: three-field snapshot schema with closed_by semantics * test_browser_supervisor.py: +test_recent_dialogs_ring_buffer (12/12 passing against real Chrome) E2E verified both backends: * Local Chrome via /browser connect: detect + respond full workflow (smoke_supervisor.py all 7 scenarios pass) * Browserbase: detect via recent_dialogs with closed_by='remote' (smoke_supervisor_browserbase_v2.py passes) Camofox remains out of scope (REST-only, no CDP) — tracked for upstream PR 3. * feat(browser): XHR bridge for dialog response on Browserbase (FIXED) Browserbase's CDP proxy auto-dismisses native JS dialogs within ~10ms, so Page.handleJavaScriptDialog calls lose the race. Solution: bypass native dialogs entirely. The supervisor now injects Page.addScriptToEvaluateOnNewDocument with a JavaScript override for window.alert/confirm/prompt. Those overrides perform a synchronous XMLHttpRequest to a magic host ('hermes-dialog-bridge.invalid'). 
We intercept those XHRs via Fetch.enable with a requestStage=Request pattern. Flow when a page calls alert('hi'): 1. window.alert override intercepts, builds XHR GET to http://hermes-dialog-bridge.invalid/?kind=alert&message=hi 2. Sync XHR blocks the page's JS thread (mirrors real dialog semantics) 3. Fetch.requestPaused fires on our WebSocket; supervisor surfaces it as a pending dialog with bridge_request_id set 4. Agent reads pending_dialogs from browser_snapshot, calls browser_dialog 5. Supervisor calls Fetch.fulfillRequest with JSON body: {accept: true|false, prompt_text: '...', dialog_id: 'd-N'} 6. The injected script parses the body, returns the appropriate value from the override (undefined for alert, bool for confirm, string|null for prompt) This works identically on Browserbase AND local Chrome; no native dialog ever fires, so Browserbase's auto-dismiss has nothing to race. Dialog policies (must_respond / auto_dismiss / auto_accept) all still work. Bridge is installed on every attached session (main page + OOPIF child sessions) so iframe dialogs are captured too. Native-dialog path kept as a fallback for backends that don't auto-dismiss (so a page that somehow bypasses our override, e.g. iframes that load after Fetch.enable but before the init-script runs, still gets observed via Page.javascriptDialogOpening).
E2E VERIFIED: * Local Chrome: 13/13 pytest tests green (12 original + new test_bridge_captures_prompt_and_returns_reply_text that asserts window.__ret === 'AGENT-SUPPLIED-REPLY' after agent responds) * Browserbase: smoke_bb_bridge_v2.py runs 4/4 PASS: - alert('BB-ALERT-MSG') dismiss → page.alert_ret = undefined ✓ - prompt('BB-PROMPT-MSG', 'default-xyz') accept with 'AGENT-REPLY' → page.prompt_ret === 'AGENT-REPLY' ✓ - confirm('BB-CONFIRM-MSG') accept → page.confirm_ret === true ✓ - confirm('BB-CONFIRM-MSG') dismiss → page.confirm_ret === false ✓ Docs updated in browser.md and developer-guide/browser-supervisor.md — availability matrix now shows Browserbase at full parity with local Chrome for both detection and response. * feat(browser): cross-origin iframe interaction via browser_cdp(frame_id=...) Adds iframe interaction to the CDP supervisor PR (was queued as PR 2). Design: browser_cdp gets an optional frame_id parameter. When set, the tool looks up the frame in the supervisor's frame_tree, grabs its child cdp_session_id (OOPIF session), and dispatches the CDP call through the supervisor's already-connected WebSocket via run_coroutine_threadsafe. Why not stateless: on Browserbase, each fresh browser_cdp WebSocket must re-negotiate against a signed connectUrl. The session info carries a specific URL that can expire while the supervisor's long-lived connection stays valid. Routing via the supervisor sidesteps this. Agent workflow: 1. browser_snapshot → frame_tree.children[] shows OOPIFs with is_oopif=true 2. browser_cdp(method='Runtime.evaluate', frame_id=<OOPIF frame_id>, params={'expression': 'document.title', 'returnByValue': True}) 3. 
Supervisor dispatches the call on the OOPIF's child session Supervisor state fixes needed along the way: * _on_frame_detached now skips reason='swap' (frame migrating processes) * _on_frame_detached also skips when the frame is an OOPIF with a live child session — Browserbase fires spurious remove events when a same-origin iframe gets promoted to OOPIF * _on_target_detached clears cdp_session_id but KEEPS the frame record so the agent still sees the OOPIF in frame_tree during transient session flaps E2E VERIFIED on Browserbase (smoke_bb_iframe_agent_path.py): browser_cdp(method='Runtime.evaluate', params={'expression': 'document.title', 'returnByValue': True}, frame_id=<OOPIF>) → {'success': True, 'result': {'value': 'Example Domain'}} The iframe is <iframe src='https://example.com/'> inside a top-level data: URL page on a real Browserbase session. The agent Runtime.evaluates INSIDE the cross-origin iframe and gets example.com's title back. Tests (tests/tools/test_browser_supervisor.py — 16 pass total): * test_browser_cdp_frame_id_routes_via_supervisor — injects fake OOPIF, verifies routing via supervisor, Runtime.evaluate returns 1+1=2 * test_browser_cdp_frame_id_missing_supervisor — clean error when no supervisor attached * test_browser_cdp_frame_id_not_in_frame_tree — clean error on bad frame_id Docs (browser.md and developer-guide/browser-supervisor.md) updated with the iframe workflow, availability matrix now shows OOPIF eval as shipped for local Chrome + Browserbase. * test(browser): real-OOPIF E2E verified manually + chrome_cdp uses --site-per-process When asked 'did you test the iframe stuff' I had only done a mocked pytest (fake injected OOPIF) plus a Browserbase E2E. 
Closed the local-Chrome real-OOPIF gap by writing /tmp/dialog-iframe-test/ smoke_local_oopif.py: * 2 http servers on different hostnames (localhost:18905 + 127.0.0.1:18906) * Chrome with --site-per-process so the cross-origin iframe becomes a real OOPIF in its own process * Navigate, find OOPIF in supervisor.frame_tree, call browser_cdp(method='Runtime.evaluate', frame_id=<OOPIF>) which routes through the supervisor's child session * Asserts iframe document.title === 'INNER-FRAME-XYZ' (from the inner page, retrieved via OOPIF eval) PASSED on 2026-04-23. Tried to embed this as a pytest but hit an asyncio version quirk between venv (3.11) and the system python (3.13) — Page.navigate hangs in the pytest harness but works in standalone. Left a self-documenting skip test that points to the smoke script + describes the verification. chrome_cdp fixture now passes --site-per-process so future iframe tests can rely on OOPIF behavior. Result: 16 pass + 1 documented-skip = 17 tests in tests/tools/test_browser_supervisor.py. * docs(browser): add dialog_policy + dialog_timeout_s to configuration.md, fix tool count Pre-merge docs audit revealed two gaps: 1. user-guide/configuration.md browser config example was missing the two new dialog_* knobs. Added with a short table explaining must_respond / auto_dismiss / auto_accept semantics and a link to the feature page for the full workflow. 2. reference/tools-reference.md header said '54 built-in tools' — real count on main is 54, this branch adds browser_dialog so it's 55. Fixed the header. (browser count was already correctly bumped 11 -> 12 in the earlier docs commit.) No code changes.	2026-04-23 22:23:37 -07:00
Matt Maximo	3ccda2aa05	fix(mcp): seed protocol header before HTTP initialize	2026-04-23 22:01:24 -07:00
Teknium	983bbe2d40	feat(skills): add design-md skill for Google's DESIGN.md spec (#14876) * feat(config): make tool output truncation limits configurable Port from anomalyco/opencode#23770: expose a new `tool_output` config section so users can tune the hardcoded truncation caps that apply to terminal output and read_file pagination. Three knobs under `tool_output`: - max_bytes (default 50_000): terminal stdout/stderr cap - max_lines (default 2000): read_file pagination cap - max_line_length (default 2000): per-line cap in line-numbered view All three keep their existing hardcoded values as defaults, so behaviour is unchanged when the section is absent. Power users on big-context models can raise them; small-context local models can lower them. Implementation: - New `tools/tool_output_limits.py` reads the section with defensive fallback (missing/invalid values → defaults, never raises). - `tools/terminal_tool.py` MAX_OUTPUT_CHARS now comes from get_max_bytes(). - `tools/file_operations.py` normalize_read_pagination() and _add_line_numbers() now pull the limits at call time. - `hermes_cli/config.py` DEFAULT_CONFIG gains the `tool_output` section so `hermes setup` writes defaults into fresh configs. - Docs page `user-guide/configuration.md` gains a "Tool Output Truncation Limits" section with large-context and small-context example configs. Tests (18 new in tests/tools/test_tool_output_limits.py): - Default resolution with missing / malformed / non-dict config. - Full and partial user overrides. - Coercion of bad values (None, negative, wrong type, str int). - Shortcut accessors delegate correctly. - DEFAULT_CONFIG exposes the section with the right defaults. - Integration: normalize_read_pagination clamps to the configured max_lines.
* feat(skills): add design-md skill for Google's DESIGN.md spec Built-in skill under skills/creative/ that teaches the agent to author, lint, diff, and export DESIGN.md files — Google's open-source (Apache-2.0) format for describing a visual identity to coding agents. Covers: - YAML front matter + markdown body anatomy - Full token schema (colors, typography, rounded, spacing, components) - Canonical section order + duplicate-heading rejection - Component property whitelist + variants-as-siblings pattern - CLI workflow via 'npx @google/design.md' (lint/diff/export/spec) - Lint rule reference including WCAG contrast checks - Common YAML pitfalls (quoted hex, negative dimensions, dotted refs) - Starter template at templates/starter.md Package verified live on npm (@google/design.md@0.1.1).	2026-04-23 21:51:19 -07:00
Brooklyn Nicholson	0a679cb7ad	fix(tui): restore voice/panic handlers + scope fuzzy paths to cwd Two fixes on top of the fuzzy-@ branch: (1) Rebase artefact: re-apply only the fuzzy additions on top of fresh `tui_gateway/server.py`. The earlier commit was cut from a base 58 commits behind main and clobbered ~170 lines of voice.toggle / voice.record handlers and the gateway crash hooks (`_panic_hook`, `_thread_panic_hook`). Reset server.py to origin/main and re-add only: - `_FUZZY_*` constants + `_list_repo_files` + `_fuzzy_basename_rank` - the new fuzzy branch in the `complete.path` handler (2) Path scoping (Copilot review): `git ls-files` returns repo-root- relative paths, but completions need to resolve under the gateway's cwd. When hermes is launched from a subdirectory, the previous code surfaced `@file:apps/web/src/foo.tsx` even though the agent would resolve that relative to `apps/web/` and miss. Fix: - `git -C root rev-parse --show-toplevel` to get repo top - `git -C top ls-files …` for the listing - `os.path.relpath(top + p, root)` per result, dropping anything starting with `../` so the picker stays scoped to cwd-and-below (matches Cmd-P workspace semantics) `apps/web/src/foo.tsx` ends up as `@file:src/foo.tsx` from inside `apps/web/`, and sibling subtrees + parent-of-cwd files don't leak. New test `test_fuzzy_paths_relative_to_cwd_inside_subdir` builds a 3-package mono-repo, runs from `apps/web/`, and verifies completion paths are subtree-relative + outside-of-cwd files don't appear. Copilot review threads addressed: #3134675504 (path scoping), #3134675532 (`voice.toggle` regression), #3134675541 (`voice.record` regression — both were stale-base artefacts, not behavioural changes).	2026-04-23 19:38:33 -05:00
Brooklyn Nicholson	b08cbc7a79	fix(tui): @<name> fuzzy-matches filenames across the repo Typing `@appChrome` in the composer should surface `ui-tui/src/components/appChrome.tsx` without requiring the user to first type the full directory path — matches the Cmd-P behaviour users expect from modern editors. The gateway's `complete.path` handler was doing a plain `os.listdir(".")` + `startswith` prefix match, so basenames only resolved inside the current working directory. This reworks it to: - enumerate repo files via `git ls-files -z --cached --others --exclude-standard` (fast, honours `.gitignore`); fall back to a bounded `os.walk` that skips common vendor / build dirs when the working dir isn't a git repo. Results cached per-root with a 5s TTL so rapid keystrokes don't respawn git processes. - rank basenames with a 5-tier scorer: exact → prefix → camelCase / word-boundary → substring → subsequence. Shorter basenames win ties; shorter rel paths break basename-length ties. - only take the fuzzy branch when the query is bare (no `/`), is a context reference (`@...`), and isn't `@folder:` — path-ish queries and folder tags fall through to the existing directory-listing path so explicit navigation intent is preserved. Completion rows now carry `display = basename`, `meta = directory`, so the picker renders `appChrome.tsx ui-tui/src/components` on one row (basename bold, directory dim) — the meta column was previously "dir" / "" and is a more useful signal for fuzzy hits. Reported by Ben Barclay during the TUI v2 blitz test.	2026-04-23 19:01:27 -05:00
Teknium	6a20e187dd	test,chore: cover stringified array/object coercion + AUTHOR_MAP entry Follow-up to the cherry-picked coercion commit: adds 9 regression tests covering array/object parsing, invalid-JSON passthrough, wrong-shape preservation, and the issue #3947 gmail-mcp scenario end-to-end. Adds dan@danlynn.com -> danklynn to scripts/release.py AUTHOR_MAP so the salvage PR's contributor attribution doesn't break CI.	2026-04-23 16:38:38 -07:00
0xbyt4	04c489b587	feat(tui): match CLI's voice slash + VAD-continuous recording model The TUI had drifted from the CLI's voice model in two ways: - /voice on was lighting up the microphone immediately and Ctrl+B was interpreted as a mode toggle. The CLI separates the two: /voice on just flips the umbrella bit, recording only starts once the user presses Ctrl+B, which also sets _voice_continuous so the VAD loop auto-restarts until the user presses Ctrl+B again or three silent cycles pass. - /voice tts was missing entirely, so users couldn't turn agent reply speech on/off from inside the TUI. This commit brings the TUI to parity. Python - hermes_cli/voice.py: continuous-mode API (start_continuous, stop_continuous, is_continuous_active) layered on the existing PTT wrappers. The silence callback transcribes, fires on_transcript, tracks consecutive no-speech cycles, and auto-restarts — mirroring cli.py:_voice_stop_and_transcribe + _restart_recording. - tui_gateway/server.py: - voice.toggle now supports on / off / tts / status. The umbrella bit lives in HERMES_VOICE + display.voice_enabled; tts lives in HERMES_VOICE_TTS + display.voice_tts. /voice off also tears down any active continuous loop so a toggle-off really releases the microphone. - voice.record start/stop now drives start_continuous/stop_continuous. start is refused with a clear error when the mode is off, matching cli.py:handle_voice_record's early return on `not _voice_mode`. - New voice.transcript / voice.status events emit through _voice_emit (remembers the sid that last enabled the mode so events land in the right session). TypeScript - gatewayTypes.ts: voice.status + voice.transcript event discriminants; VoiceToggleResponse gains tts; VoiceRecordResponse gains status for the new "started/stopped" responses. 
- interfaces.ts: GatewayEventHandlerContext gains composer.setInput + submission.submitRef + voice.{setRecording, setProcessing, setVoiceEnabled}; InputHandlerContext.voice gains enabled + setVoiceEnabled for the mode-aware Ctrl+B handler. - createGatewayEventHandler.ts: voice.status drives REC/STT badges; voice.transcript auto-submits when the composer is empty (CLI _pending_input.put parity) and appends when a draft is in flight. no_speech_limit flips voice off + sys line. - useInputHandlers.ts: Ctrl+B now calls voice.record (start/stop), not voice.toggle, and nudges the user with a sys line when the mode is off instead of silently flipping it on. - useMainApp.ts: wires the new event-handler context fields. - slash/commands/session.ts: /voice handles on / off / tts / status with CLI-matching output ("voice: mode on · tts off"). Backward compat preserved for voice.record (was always PTT shape; gateway still honours start/stop with mode-gating added).	2026-04-23 16:18:15 -07:00
0xbyt4	0bb460b070	fix(tui): add missing hermes_cli.voice wrapper for gateway RPC tui_gateway/server.py:3486/3491/3509 imports start_recording, stop_and_transcribe, and speak_text from hermes_cli.voice, but the module never existed (not in git history — never shipped, never deleted). Every voice.record / voice.tts RPC call hit the ImportError branch and the TUI surfaced it as "voice module not available — install audio dependencies" even on boxes with sounddevice / faster-whisper / numpy installed. Adds a thin wrapper on top of tools.voice_mode (recording + transcription) and tools.tts_tool (text-to-speech): - start_recording() — idempotent; stores the active AudioRecorder in a module-global guarded by a Lock so repeat Ctrl+B presses don't fight over the mic. - stop_and_transcribe() — returns None for no-op / no-speech / Whisper-hallucination cases so the TUI's existing "no speech detected" path keeps working unchanged. - speak_text(text) — lazily imports tts_tool (optional provider SDKs stay unloaded until the first /voice tts call), parses the tool's JSON result, and plays the audio via play_audio_file. Paired with the Ctrl+B keybinding fix in the prior commit, the TUI voice pipeline now works end-to-end for the first time.	2026-04-23 16:18:15 -07:00
Teknium	e26c4f0e34	fix(kimi,mcp): Moonshot schema sanitizer + MCP schema robustness (#14805) Fixes a broader class of 'tools.function.parameters is not a valid moonshot flavored json schema' errors on Nous / OpenRouter aggregators routing to moonshotai/kimi-k2.6 with MCP tools loaded. ## Moonshot sanitizer (agent/moonshot_schema.py, new) Model-name-routed (not base-URL-routed) so Nous / OpenRouter users are covered alongside api.moonshot.ai. Applied in ChatCompletionsTransport.build_kwargs when is_moonshot_model(model). Two repairs: 1. Fill missing 'type' on every property / items / anyOf-child schema node (structural walk; only schema-position dicts are touched, not container maps like properties/$defs). 2. Strip 'type' at anyOf parents; Moonshot rejects it. ## MCP normalizer hardened (tools/mcp_tool.py) Draft-07 $ref rewrite from PR #14802 now also does: - coerce missing / null 'type' on object-shaped nodes (salvages #4897) - prune 'required' arrays to names that exist in 'properties' (salvages #4651; Gemini 400s on dangling required) - apply recursively, not just top-level These repairs are provider-agnostic so the same MCP schema is valid on OpenAI, Anthropic, Gemini, and Moonshot in one pass. ## Crash fix: safe getattr for Tool.inputSchema _convert_mcp_schema now uses getattr(t, 'inputSchema', None) so MCP servers whose Tool objects omit the attribute entirely no longer abort registration (salvages #3882).
## Validation - tests/agent/test_moonshot_schema.py: 27 new tests (model detection, missing-type fill, anyOf-parent strip, non-mutation, real-world MCP shape) - tests/tools/test_mcp_tool.py: 7 new tests (missing / null type, required pruning, nested repair, safe getattr) - tests/agent/transports/test_chat_completions.py: 2 new integration tests (Moonshot route sanitizes, non-Moonshot route doesn't) - Targeted suite: 49 passed - E2E via execute_code with a realistic MCP tool carrying all three Moonshot rejection modes + dangling required + draft-07 refs: sanitizer produces a schema valid on Moonshot and Gemini	2026-04-23 16:11:57 -07:00
helix4u	24f139e16a	fix(mcp): rewrite definitions refs to in input schemas	2026-04-23 15:56:57 -07:00
Teknium	ef5eaf8d87	feat(cron): honor `hermes tools` config for the cron platform (#14798) Cron now resolves its toolset from the same per-platform config the gateway uses, `_get_platform_tools(cfg, 'cron')`, instead of blindly loading every default toolset. Existing cron jobs without a per-job override automatically lose `moa`, `homeassistant`, and `rl` (the `_DEFAULT_OFF_TOOLSETS` set), which stops the "surprise $4.63 mixture_of_agents run" class of bug (Norbert, Discord). Precedence inside `run_job`: 1. per-job `enabled_toolsets` (PR #14767 / #6130): wins if set 2. `_get_platform_tools(cfg, 'cron')`: new, the blanket gate 3. `None` fallback (legacy): only on resolver exception Changes: - hermes_cli/platforms.py: register 'cron' with default_toolset 'hermes-cron' - toolsets.py: add 'hermes-cron' toolset (mirrors 'hermes-cli'; `_get_platform_tools` then filters via `_DEFAULT_OFF_TOOLSETS`) - cron/scheduler.py: add `_resolve_cron_enabled_toolsets(job, cfg)`, call it at the `AIAgent(...)` kwargs site - tests/cron/test_scheduler.py: replace the 'None when not set' test (outdated contract) with an invariant ('moa not in default cron toolset') + new per-job-wins precedence test - tests/hermes_cli/test_tools_config.py: mark 'cron' as non-messaging in the gateway-toolset-coverage test	2026-04-23 15:48:50 -07:00
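
The three-step precedence listed in the cron commit above (ef5eaf8d87) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the repository's code: `resolve_cron_enabled_toolsets`, the `platform_toolsets` config key, the shapes of `job` and `cfg`, and the stand-in `get_platform_tools` resolver are all hypothetical. Only the ordering (per-job override, then platform config, then a `None` fallback on resolver failure) and the three default-off toolset names come from the commit message. "Wins if set" is interpreted here as "is not None", which is also an assumption.

```python
from typing import Callable, List, Optional

# Default-off toolsets as named in the commit message; the constant name is hypothetical.
DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "rl"}


def get_platform_tools(cfg: dict, platform: str) -> List[str]:
    """Hypothetical resolver: toolsets configured for a platform, minus default-off ones."""
    configured = cfg["platform_toolsets"][platform]  # KeyError if the config lacks the entry
    return [name for name in configured if name not in DEFAULT_OFF_TOOLSETS]


def resolve_cron_enabled_toolsets(
    job: dict,
    cfg: dict,
    resolver: Callable[[dict, str], List[str]] = get_platform_tools,
) -> Optional[List[str]]:
    # 1. A per-job override wins whenever it is set.
    per_job = job.get("enabled_toolsets")
    if per_job is not None:
        return list(per_job)
    # 2. Otherwise use the platform-level configuration for 'cron'.
    try:
        return resolver(cfg, "cron")
    except Exception:
        # 3. Legacy fallback: None, reached only when the resolver raises.
        return None


cfg = {"platform_toolsets": {"cron": ["web", "moa", "terminal"]}}
print(resolve_cron_enabled_toolsets({"enabled_toolsets": ["moa"]}, cfg))
print(resolve_cron_enabled_toolsets({}, cfg))
print(resolve_cron_enabled_toolsets({}, {}))
```

In this sketch a job that explicitly lists `moa` still gets it, while a job with no override inherits the filtered platform list, which matches the "per-job wins" and "moa not in default cron toolset" tests the commit describes.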


2428 commits