Commit graph

5319 commits

Author SHA1 Message Date
Billard
e9b3b8e820 fix(cron): treat empty agent response as error in last_status (fixes #8585)
When a cron job's agent run completes but produces an empty final_response
(e.g. API 404 from invalid model name), the scheduler now marks last_status
as "error" instead of "ok", so the failure is visible in job listings.

Previously, any run that didn't raise an exception was marked "ok" regardless
of whether the agent actually produced output.
2026-04-16 06:49:57 -07:00
Teknium
77bdad5b02
fix(tests): resolve 12 CI failures + 10 errors across 6 root causes (#11040)
Group A (3 tests): 'No LLM provider configured' RuntimeError
- test_user_message_surrogates_sanitized, test_counters_initialized_in_init,
  test_openai_prompt_tokens_unchanged
- Root cause: AIAgent.__init__ now requires base_url alongside api_key to
  skip resolve_provider_client() (which returns None when API keys are
  blanked in CI). Added base_url='http://localhost:1234/v1' to test
  agent construction.

Group B (5 tests): Discord slash command auto-registration
- test_auto_registers_missing_gateway_commands, test_auto_registered_command_*,
  test_register_skill_group_*
- Root cause: xdist workers that loaded a discord mock WITHOUT
  app_commands.Command/Group caused _register_slash_commands() to fail
  silently. Added comprehensive shared discord mock in
  tests/gateway/conftest.py (same pattern as existing telegram mock).

Group C (5 errors): Discord reply mode 'NoneType has no DMChannel'
- All TestReplyToText tests
- Root cause: FakeDMChannel was not a subclass of real discord.DMChannel,
  so isinstance() checks in _handle_message failed when running in full
  suite (real discord installed). Made FakeDMChannel inherit from
  discord.DMChannel when available. Removed fragile monkeypatch approach.

Group D (2 tests): detect_provider_for_model wrong provider
- test_openrouter_slug_match (got 'ai-gateway'), test_bare_name_gets_
  openrouter_slug (got 'copilot')
- Root cause: ai-gateway, copilot, and kilocode are multi-vendor
  aggregators that list other providers' models (OpenRouter-style slugs).
  They were being matched in Step 1 before OpenRouter. Added all three
  to _AGGREGATORS set so they're skipped like nous/openrouter.

Group E (1 test): model_flow_custom StopIteration
- test_model_flow_custom_saves_verified_v1_base_url
- Root cause: 'Display name' prompt was added after the test was written.
  The input iterator had 5 answers but the flow now asks 6 questions.
  Added 6th empty string answer.

Group F (1 test): Telegram proxy env assertion
- test_uses_proxy_env_for_primary_and_fallback_transports
- Root cause: _resolve_proxy_url() now checks TELEGRAM_PROXY first
  (via resolve_proxy_url('TELEGRAM_PROXY')). Test didn't clear this
  env var, allowing potential leakage from other tests in xdist workers.
  Added TELEGRAM_PROXY to the cleanup list.
2026-04-16 06:49:36 -07:00
Teknium
3c42064efc
fix: enforce config.yaml as sole CWD source + deprecate .env CWD vars + add hermes memory reset (#11029)
config.yaml terminal.cwd is now the single source of truth for working
directory. MESSAGING_CWD and TERMINAL_CWD in .env are deprecated with a
migration warning.

Changes:

1. config.py: Remove MESSAGING_CWD from OPTIONAL_ENV_VARS (setup wizard
   no longer prompts for it). Add warn_deprecated_cwd_env_vars() that
   prints a migration hint when deprecated env vars are detected.

2. gateway/run.py: Replace all MESSAGING_CWD reads with TERMINAL_CWD
   (which is bridged from config.yaml terminal.cwd). MESSAGING_CWD is
   still accepted as a backward-compat fallback with deprecation warning.
   Config bridge skips cwd placeholder values so they don't clobber
   the resolved TERMINAL_CWD.

3. cli.py: Guard against lazy-import clobbering — when cli.py is
   imported lazily during gateway runtime (via delegate_tool), don't
   let load_cli_config() overwrite an already-resolved TERMINAL_CWD
   with os.getcwd() of the service's working directory. (#10817)

4. hermes_cli/main.py: Add 'hermes memory reset' command with
   --target all/memory/user and --yes flags. Profile-scoped via
   HERMES_HOME.

Migration path for users with .env settings:
  Remove MESSAGING_CWD / TERMINAL_CWD from .env
  Add to config.yaml:
    terminal:
      cwd: /your/project/path

Addresses: #10225, #4672, #10817, #7663
2026-04-16 06:48:33 -07:00
Teknium
fe12042e50
fix: remove context pressure warnings entirely (#11039)
The gateway compression notifications were already removed in commit cc63b2d1
(PR #4139), but the agent-level context pressure warnings (85%/95% tiered
alerts via _emit_context_pressure) were still firing on both CLI and gateway.

Removed:
- _emit_context_pressure method and all call sites in run_conversation()
- Class-level dedup state (_context_pressure_last_warned, _CONTEXT_PRESSURE_COOLDOWN)
- Instance attribute _context_pressure_warned_at
- Pressure reset logic in _compress_context
- format_context_pressure and format_context_pressure_gateway from agent/display.py
- Orphaned ANSI constants that only served these functions
- tests/run_agent/test_context_pressure.py (all 361 lines)

Compression itself continues to run silently in the background.
Closes #3784
2026-04-16 06:44:23 -07:00
kshitijk4poor
a6142a8e08 fix: follow-up for salvaged PR #10854
- Extract duplicated activity-callback polling into shared
  touch_activity_if_due() helper in tools/environments/base.py
- Use helper from both base.py _wait_for_process and
  code_execution_tool.py local polling loop (DRY)
- Add test assertion that timeout output field contains the
  timeout message and emoji (#10807)
- Add stream_consumer test for tool-boundary fallback scenario
  where continuation is empty but final_text differs from
  visible prefix (#10807)
2026-04-16 06:42:45 -07:00
konsisumer
3e3ec35a5e fix: surface execute_code timeout to user instead of silently dropping (#10807)
When execute_code times out, the result JSON had status="timeout" and an
error field, but the output field was empty.  Many models treat empty
output as "nothing happened" and produce an empty/minimal response.  The
gateway stream consumer then considers the response "already sent" (from
pre-tool streaming) and silently drops it — leaving the user staring at
silence.

Three changes:

1. Include the timeout message in the output field (both local and remote
   paths) so the model always has visible content to relay to the user.

2. Add periodic activity callbacks to the local execution polling loop so
   the gateway's inactivity monitor knows execute_code is alive during
   long runs.

3. Fix stream_consumer._send_fallback_final to not silently drop content
   when the continuation appears empty but the final text differs from
   what was previously streamed (e.g. after a tool boundary reset).
2026-04-16 06:42:45 -07:00
Bartok9
73befa505d fix(cli): handle null/non-dict display config in skin initialization
display: null or display: <non-dict> in config.yaml crashed skin init
with AttributeError. Now falls back to default skin gracefully.

Cherry-picked from #10867 by @Bartok9. Consolidates #10876 by @cola-runner.

Co-authored-by: cola-runner <cola-runner@users.noreply.github.com>
2026-04-16 06:35:31 -07:00
LeonSGP43
465193b7eb fix(gateway): close temporary agents after one-off tasks
Add shared _cleanup_agent_resources() for temporary gateway AIAgent
instances. Apply cleanup to memory flush, background tasks, /btw,
manual /compress, and session-hygiene auto-compression. Prevents
unclosed aiohttp client session leaks.

Cherry-picked from #10899 by @LeonSGP43. Consolidates #10945 by @Lubrsy706.
Fixes #10865.

Co-authored-by: Lubrsy706 <Lubrsy706@users.noreply.github.com>
2026-04-16 06:31:23 -07:00
Brooklyn Nicholson
39b1336d1f fix: ctx usage display 2026-04-16 08:27:41 -05:00
Brooklyn Nicholson
f81dba0da2 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 08:23:20 -05:00
Teknium
dc7d47a6b8 chore: add GenKoKo to AUTHOR_MAP 2026-04-16 06:10:40 -07:00
Mil Wang (from Dev Box)
f9714161f0 fix: stop leaking '(No response generated)' placeholder to users and cron targets
When the LLM returns an empty completion, gateway/run.py replaced
final_response with the literal string '(No response generated)'.
This defeated cron/scheduler.py's empty-response skip guard, causing
the placeholder to be delivered to home channels.

Changes:
- gateway/run.py: return empty string instead of placeholder when
  there is no error and no response content
- cron/scheduler.py: defensively strip the placeholder text in case
  any upstream path still produces it

Fixes NousResearch/hermes-agent#9270
2026-04-16 06:10:40 -07:00
Ko
85752791ed fix: resolve UnboundLocalError in post-tool empty response nudge path
When a model returns an empty response after tool calls with no new
tool_calls in the follow-up turn, the code enters the "nudge" recovery
path which referenced `assistant_msg` before it was assigned. This
variable is only set in the tool-calls branch (line 10098), but the
nudge code lives in the no-tool-calls branch (line 10263+).

The fix builds a fresh assistant message dict via `_build_assistant_message()`
instead of reusing the unbound variable, consistent with the exhausted-
retries path at line 10457.
2026-04-16 06:10:40 -07:00
Teknium
9f231dae56
fix: quiet mode (-Q) outputs only raw response text (#11024)
Two issues when running hermes chat -Q -q:
1. The streaming 'Hermes' response box was rendering to stdout because
   stream_delta_callback was wired during _init_agent() before quiet_mode
   was set. This caused the response to appear twice — once in the styled
   box and once as plain text.
2. session_id was printed to stdout, making piped output unusable.

Fix: null out stream_delta_callback and tool_gen_callback after agent init
in the quiet-mode path, and redirect session_id to stderr.

Now 'hermes chat -Q -q "prompt" | cat' produces only the answer text.
session_id is still available on stderr for scripts that need it.

Reported by @nixpiper on X.
2026-04-16 06:07:14 -07:00
Teknium
4b1cf77770 chore: add davetist to AUTHOR_MAP 2026-04-16 05:53:18 -07:00
Teknium
fa830a49e0 test: add cancellation handler delivery confirmation tests
5 tests covering the stream_consumer.py cancellation handler fix:
- partial-only (no accumulated) stays False
- best-effort send succeeds → True
- best-effort send fails → stays False (gateway fallback delivers)
- preserves existing True through cancellation
- regression: old code would have promoted partial to final
2026-04-16 05:53:18 -07:00
Teknium
3b5572ded3 fix(stream-consumer): only confirm final delivery on successful best-effort send
The cancellation handler previously promoted any partial send
(already_sent=True) to final_response_sent=True unconditionally.
This meant if intermediate text (e.g. 'Let me search…') was streamed
and the consumer was cancelled before delivering the actual answer,
the gateway's suppression check would still prevent the fallback send.

Now final_response_sent is only set in the cancellation path when:
- The best-effort send of accumulated content actually succeeded, OR
- It was already confirmed before cancellation

Companion fix for PR #11000's run.py changes — closes the
cancellation-path loophole that would otherwise let partial streams
suppress final delivery during queued follow-ups.
2026-04-16 05:53:18 -07:00
Dave Tist
35bbc6851b fix(gateway): honor previewed replies in queued follow-ups 2026-04-16 05:53:18 -07:00
Dave Tist
d67e602cc8 fix: only suppress gateway replies after confirmed final stream delivery
(cherry picked from commit 675249085b383fff305cc84b8aeacd6dd20c7b14)
2026-04-16 05:53:18 -07:00
kshitij
512c328815
fix(copilot): eliminate redundant catalog fetch in api_mode resolution (#11008)
copilot_model_api_mode() called normalize_copilot_model_id() which
fetched the GitHub model catalog via HTTP, then the secondary endpoint
check fetched it again because the catalog was never passed through.

Fix: fetch the catalog once at the top of copilot_model_api_mode()
and pass it to normalize_copilot_model_id(). The secondary check
then sees a non-None catalog and skips the redundant fetch.

For a Claude model switch on Copilot this eliminates one 5-second-
timeout HTTP call from the interactive /model path.

Surfaced during review of PR #10533.

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-04-16 05:18:34 -07:00
kshitij
92a78ffeee
chore(gateway): replace deprecated asyncio.get_event_loop() with get_running_loop() (#11005)
All 10 call sites in gateway/run.py and gateway/platforms/api_server.py
are inside async functions where a loop is guaranteed to be running.

get_event_loop() is deprecated since Python 3.10 — it can silently
create a new loop when none is running, masking bugs.
get_running_loop() raises RuntimeError instead, which is safer.

Surfaced during review of PRs #10533 and #10647.

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-04-16 05:13:39 -07:00
Teknium
0de6340a73 fix(docs): show sidebar on docs homepage 2026-04-16 04:24:45 -07:00
helix4u
bd7e272c1f fix(slack): per-thread sessions for DMs by default
Each top-level Slack DM now gets its own Hermes session, matching the
per-thread behavior channels already have. Previously all top-level DM
messages shared one continuous session because thread_ts was None,
causing context to accumulate across unrelated conversations.

The behavior is controlled by platforms.slack.extra.dm_top_level_threads_as_sessions
in config.yaml (default: true). Set to false to restore legacy behavior.

Based on PR #10789 by helix4u. Changes from original:
- Default flipped to true (was opt-in, now opt-out)
- Removed env var fallback (config.yaml only per project policy)
- Tests updated to cover both default and opt-out paths
2026-04-16 04:22:33 -07:00
LeonSGP43
daef0519e9 fix(google-workspace): normalize authorized user token writes 2026-04-16 04:22:16 -07:00
Teknium
f726b9b843 fix(browser): runtime fallback to local Chromium when cloud provider fails
Wraps provider.create_session() in _get_session_info() with try/except
to catch cloud provider runtime failures (timeouts, auth errors, rate
limits, invalid responses). Falls back to _create_local_session() so
browser automation continues working when cloud APIs are down.

Marks fallback sessions with fallback_from_cloud, fallback_reason, and
fallback_provider metadata for observability. If both cloud and local
fail, raises RuntimeError with chained context from both errors.

Closes #10883
Co-authored-by: konsisumer <konsisumer@users.noreply.github.com>
2026-04-16 04:19:34 -07:00
Teknium
e0532be8ae fix(docs): add dashboard-plugins to sidebar navigation 2026-04-16 04:16:50 -07:00
Teknium
50d438d125
fix(honcho): drop anyOf schema — breaks Fireworks and other providers
The honcho_conclude tool schema used anyOf with nested required
fields which is unsupported by Fireworks AI, MiniMax, and other
providers that only handle basic JSON Schema. The handler already
validates that conclusion or delete_id is present (line 1018-1020),
so the schema constraint was redundant.

Replace with required: [] and let the handler reject bad calls.
2026-04-16 04:10:36 -07:00
Teknium
131d261a74 docs: add dashboard themes and plugins documentation
- web-dashboard.md: add Themes section covering built-in themes, custom
  theme YAML format (21 color tokens + overlay), and theme API endpoints
- dashboard-plugins.md: full plugin authoring guide covering manifest
  format, plugin SDK reference, backend API routes, custom CSS, loading
  flow, discovery, and tips
2026-04-16 04:10:06 -07:00
Teknium
01214a7f73 feat: dashboard plugin system — extend the web UI with custom tabs
Add a plugin system that lets plugins add new tabs to the dashboard.
Plugins live in ~/.hermes/plugins/<name>/dashboard/ alongside any
existing CLI/gateway plugin code.

Plugin structure:
  plugins/<name>/dashboard/
    manifest.json     # name, label, icon, tab config, entry point
    dist/index.js     # pre-built JS bundle (IIFE, uses SDK globals)
    plugin_api.py     # optional FastAPI router mounted at /api/plugins/<name>/

Backend (hermes_cli/web_server.py):
- Plugin discovery: scans plugins/*/dashboard/manifest.json from user,
  bundled, and project plugin directories
- GET /api/dashboard/plugins — returns discovered plugin manifests
- GET /api/dashboard/plugins/rescan — force re-discovery
- GET /dashboard-plugins/<name>/<path> — serves plugin static assets
  with path traversal protection
- Optional API route mounting: imports plugin_api.py and mounts its
  router under /api/plugins/<name>/
- Plugin API routes bypass session token auth (localhost-only)

Frontend (web/src/plugins/):
- Plugin SDK exposed on window.__HERMES_PLUGIN_SDK__ — provides React,
  hooks, UI components (Card, Badge, Button, etc.), API client,
  fetchJSON, theme/i18n hooks, and utilities
- Plugin registry on window.__HERMES_PLUGINS__.register(name, Component)
- usePlugins() hook: fetches manifests, loads JS/CSS, resolves components
- App.tsx dynamically adds nav items and routes for discovered plugins
- Icon resolution via static map of 20 common Lucide icons (no tree-
  shaking penalty — bundle only +5KB over baseline)

Example plugin (plugins/example-dashboard/):
- Demonstrates SDK usage: Card components, backend API call, SDK reference
- Backend route: GET /api/plugins/example/hello

Tested: plugin discovery, static serving, API routes, path traversal
blocking, unknown plugin 404, bundle size (400KB vs 394KB baseline).
2026-04-16 04:10:06 -07:00
Teknium
23a42635f0
docs: remove nonexistent CAMOFOX_PROFILE_DIR env var references (#10976)
Camofox automatically maps each userId to a persistent Firefox profile
on the server side — no CAMOFOX_PROFILE_DIR env var exists. Our docs
incorrectly told users to configure this on the server.

Removed the fabricated env var from:
- browser docs (:::note block)
- config.py DEFAULT_CONFIG comment
- test docstring
2026-04-16 04:07:11 -07:00
Teknium
e07dbde582
Revert "fix: enable TCP keepalives to detect dead provider connections (#10324)"
This reverts commit 64fee35dc0.
2026-04-16 03:59:05 -07:00
Teknium
e66b373351
fix: word-wrap spinner, interruptable agent join, and delegate_task interrupt (#10940)
* fix: stop /model from silently rerouting direct providers to OpenRouter (#10300)

detect_provider_for_model() silently remapped models to OpenRouter when
the direct provider's credentials weren't found via env vars. Three bugs:

1. Credential check only looked at env vars from PROVIDER_REGISTRY,
   missing credential pool entries, auth store, and OAuth tokens
2. When env var check failed, silently returned ('openrouter', slug)
   instead of the direct provider the model actually belongs to
3. Users with valid credentials via non-env-var mechanisms (pool,
   OAuth, Claude Code tokens) got silently rerouted

Fix:
- Expand credential check to also query credential pool and auth store
- Always return the direct provider match regardless of credential
  status -- let client init handle missing creds with a clear error
  rather than silently routing through the wrong provider

Same philosophy as the provider-required fix: don't guess, don't
silently reroute, error clearly when something is missing.

Closes #10300

* fix: word-wrap spinner, interruptable agent join, and delegate_task interrupt

Three fixes:

1. Spinner widget clips long tool commands — prompt_toolkit Window had
   height=1 and wrap_lines=False. Now uses wrap_lines=True with dynamic
   height from text length / terminal width. Long commands wrap naturally.

2. agent_thread.join() blocked forever after interrupt — if the agent
   thread took time to clean up, the process_loop thread froze. Now polls
   with 0.2s timeout on the interrupt path, checking _should_exit so
   double Ctrl+C breaks out immediately.

3. Root cause of 5-hour CLI hang: delegate_task() used as_completed()
   with no interrupt check. When subagent children got stuck, the parent
   blocked forever inside the ThreadPoolExecutor. Now polls with
   wait(timeout=0.5) and checks parent_agent._interrupt_requested each
   iteration. Stuck children are reported as interrupted, and the parent
   returns immediately.
2026-04-16 03:50:49 -07:00
Bartok9
f05590796e fix(telegram): increase cold-boot retry budget and cap backoff
Bump connect retry attempts from 3 to 8 and cap exponential backoff at
15 seconds. Old budget: 3 attempts, 1+2+4=7s total — insufficient for
cold boot on slow networks or embedded devices. New budget: 8 attempts,
1+2+4+8+15+15+15=~60s total.

Inspired by PR #5770 by @Bartok9 (re-implemented against current main
since original was 913 commits stale with conflicts).
2026-04-16 03:47:00 -07:00
Markus Corazzione
c928ebb1b1 retry transient telegram send failures 2026-04-16 03:47:00 -07:00
Teknium
333cb8251b
fix: improve interrupt responsiveness during concurrent tool execution and follow-up turns (#10935)
Three targeted fixes for the 'agent stuck on terminal command' report:

1. **Concurrent tool wait loop now checks interrupts** (run_agent.py)
   The sequential path checked _interrupt_requested before each tool call,
   but the concurrent path's wait loop just blocked with 30s timeouts.
   Now polls every 5s and cancels pending futures on interrupt, giving
   already-running tools 3s to notice the per-thread interrupt signal.

2. **Cancelled concurrent tools get proper interrupt messages** (run_agent.py)
   When a concurrent tool is cancelled or didn't return a result due to
   interrupt, the tool result message says 'skipped due to user interrupt'
   instead of a generic error.

3. **Typing indicator fires before follow-up turn** (gateway/run.py)
   After an interrupt is acknowledged and the pending message dequeued,
   the gateway now sends a typing indicator before starting the recursive
   _run_agent call. This gives the user immediate visual feedback that
   the system is processing their new message (closing the perceived
   'dead air' gap between the interrupt ack and the response).

Reported by @_SushantSays.
2026-04-16 02:44:56 -07:00
Teknium
3f6c4346ac feat: dashboard theme system with live switching
Add a theme engine for the web dashboard that mirrors the CLI skin
engine philosophy — pure data, no code changes needed for new themes.

Frontend:
- ThemeProvider context that loads active theme from backend on mount
  and applies CSS variable overrides to document.documentElement
- ThemeSwitcher dropdown component in the header (next to language
  switcher) with instant preview on click
- 6 built-in themes: Hermes Teal (default), Midnight, Ember, Mono,
  Cyberpunk, Rosé — each defines all 21 color tokens + overlay settings
- Theme types, presets, and context in web/src/themes/

Backend:
- GET /api/dashboard/themes — returns available themes + active name
- PUT /api/dashboard/theme — persists selection to config.yaml
- User custom themes discoverable from ~/.hermes/dashboard-themes/*.yaml
- Theme list endpoint added to public API paths (no auth needed)

Config:
- dashboard.theme key in DEFAULT_CONFIG (default: 'default')
- Schema override for select dropdown in config page
- Category merged into 'display' tab in config UI

i18n: theme switcher strings added for en + zh.
2026-04-16 02:44:32 -07:00
Peter Berthelsen
9a9b8cd1e4 fix: keep rapid telegram follow-ups from getting cut off 2026-04-16 02:44:00 -07:00
Teknium
12b109b664
fix: enable TCP keepalives to detect dead provider connections (#10324) (#10933)
When a custom provider drops a connection mid-stream, the TCP socket
can enter CLOSE-WAIT and the httpx read timeout may never fire —
epoll_wait blocks indefinitely because no data or error signal arrives.
The agent hangs until manually killed.

The existing defenses (httpx read timeout, stale stream detector,
_force_close_tcp_sockets) are all time-based and work correctly once
triggered, but they rely on the socket layer reporting the dead
connection. Without TCP keepalives, the kernel has no reason to probe
a silent connection.

Fix: inject SO_KEEPALIVE + TCP_KEEPIDLE/KEEPINTVL/KEEPCNT into the
httpx transport via socket_options. The kernel probes idle connections
after 30s, retries every 10s, gives up after 3 failures — dead peer
detected within ~60s instead of hanging forever.

Platform-aware: uses TCP_KEEPIDLE on Linux, TCP_KEEPALIVE on macOS.
Falls back silently if socket options aren't available (Windows, etc.).

Closes #10324
2026-04-16 02:32:21 -07:00
Teknium
f2f9d0c819
fix: stop /model from silently rerouting direct providers to OpenRouter (#10300) (#10780)
detect_provider_for_model() silently remapped models to OpenRouter when
the direct provider's credentials weren't found via env vars. Three bugs:

1. Credential check only looked at env vars from PROVIDER_REGISTRY,
   missing credential pool entries, auth store, and OAuth tokens
2. When env var check failed, silently returned ('openrouter', slug)
   instead of the direct provider the model actually belongs to
3. Users with valid credentials via non-env-var mechanisms (pool,
   OAuth, Claude Code tokens) got silently rerouted

Fix:
- Expand credential check to also query credential pool and auth store
- Always return the direct provider match regardless of credential
  status -- let client init handle missing creds with a clear error
  rather than silently routing through the wrong provider

Same philosophy as the provider-required fix: don't guess, don't
silently reroute, error clearly when something is missing.

Closes #10300
2026-04-16 02:27:20 -07:00
Teknium
e4cd62d07d
fix(tests): resolve remaining CI failures — commit_memory_session, already_sent, timezone leak, session env (#10785)
Fixes 12 CI test failures:

1. test_cli_new_session (4): _FakeAgent missing commit_memory_session
   attribute added in the memory provider refactoring. Added MagicMock.

2. test_run_progress_topics (1): already_sent detection only checked
   stream consumer flags, missing the response_previewed path from
   interim_assistant_callback. Restructured guard to check both paths.

3. test_timezone (1): HERMES_TIMEZONE leaked into child processes via
   _SAFE_ENV_PREFIXES matching HERMES_*. The code correctly converts
   it to TZ but didn't remove the original. Added child_env.pop().

4. test_session_env (1): contextvars baseline captured from a different
   context couldn't be restored after clear. Changed assertion to verify
   the test's value was removed rather than comparing to a fragile baseline.

5. test_discord_slash_commands (5): already fixed on current main.
2026-04-16 02:26:14 -07:00
Teknium
0c1217d01e feat(xai): upgrade to Responses API, add TTS provider
Cherry-picked and trimmed from PR #10600 by Jaaneek.

- Switch xAI transport from openai_chat to codex_responses (Responses API)
- Add codex_responses detection for xAI in all runtime_provider resolution paths
- Add xAI api_mode detection in AIAgent.__init__ (provider name + URL auto-detect)
- Add extra_headers passthrough for codex_responses requests
- Add x-grok-conv-id session header for xAI prompt caching
- Add xAI reasoning support (encrypted_content include, no effort param)
- Move x-grok-conv-id from chat_completions path to codex_responses path
- Add xAI TTS provider (dedicated /v1/tts endpoint with Opus conversion)
- Add xAI provider aliases (grok, x-ai, x.ai) across auth, models, providers, auxiliary
- Trim xAI model list to agentic models (grok-4.20-reasoning, grok-4-1-fast-reasoning)
- Add XAI_API_KEY/XAI_BASE_URL to OPTIONAL_ENV_VARS
- Add xAI TTS config section, setup wizard entry, tools_config provider option
- Add shared xai_http.py helper for User-Agent string

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>
2026-04-16 02:24:08 -07:00
Teknium
330ed12fb1 chore: add nosleepcassette to AUTHOR_MAP 2026-04-16 02:22:19 -07:00
nosleepcassette
3c859e35dc fix: skin spinner faces and verbs not applied at runtime
Skins define waiting_faces, thinking_faces, and thinking_verbs in their
spinner config, but all 7 call sites in run_agent.py used hardcoded class
constants. Add three classmethods on KawaiiSpinner that query the active
skin first and fall back to the class constants, matching the existing
pattern used for wings/tool_prefix/tool_emojis.

Co-authored-by: nosleepcassette <nosleepcassette@users.noreply.github.com>
2026-04-16 02:22:19 -07:00
Teknium
5c397876b9 fix(cli): hint about /v1 suffix when configuring local model endpoints
When a user enters a local model server URL (Ollama, vLLM, llama.cpp)
without a /v1 suffix during 'hermes model' custom endpoint setup,
prompt them to add it. Most OpenAI-compatible local servers require
/v1 in the base URL for chat completions to work.
2026-04-16 02:22:09 -07:00
ygd58
8798b069d3 fix(agent): sanitize surrogate characters from API responses and before API calls 2026-04-16 02:22:09 -07:00
Mibayy
3522a7aa13 feat(ollama): pass think=false to custom providers when reasoning_effort is none
When a custom/Ollama provider is used and reasoning_effort is set to 'none'
(or enabled: false), inject 'think': false into the request extra_body.

Ollama does not recognise the OpenRouter-style 'reasoning' extra_body field,
so thinking-capable models (Qwen3, etc.) generate <think> blocks regardless
of the reasoning_effort setting. This produces empty-response errors that
corrupt session state.

The fix adds a provider-specific block in _build_api_kwargs() that sets
think=false in extra_body whenever self.provider == 'custom' and reasoning
is explicitly disabled.

Closes #3191
2026-04-16 02:22:09 -07:00
LeonSGP43
8011aa31ba fix(agent): continue ollama glm truncation replies 2026-04-16 02:22:09 -07:00
kshitijk4poor
1b61ec470b feat: add Ollama Cloud as built-in provider
Add ollama-cloud as a first-class provider with full parity to existing
API-key providers (gemini, zai, minimax, etc.):

- PROVIDER_REGISTRY entry with OLLAMA_API_KEY env var
- Provider aliases: ollama -> custom (local), ollama_cloud -> ollama-cloud
- models.dev integration for accurate context lengths
- URL-to-provider mapping (ollama.com -> ollama-cloud)
- Passthrough model normalization (preserves Ollama model:tag format)
- Default auxiliary model (nemotron-3-nano:30b)
- HermesOverlay in providers.py
- CLI --provider choices, CANONICAL_PROVIDERS entry
- Dynamic model discovery with disk caching (1hr TTL)
- 37 provider-specific tests

Cherry-picked from PR #6038 by kshitijk4poor. Closes #3926
2026-04-16 02:22:09 -07:00
helix4u
8021a735c2 fix(gateway): preserve notify context in executor threads
Gateway executor work now inherits the active session contextvars via
copy_context() so background process watchers retain the correct
platform/chat/user/session metadata for routing completion events back
to the originating chat.

Cherry-picked from #10647 by @helix4u with:
- Use asyncio.get_running_loop() instead of deprecated get_event_loop()
- Strip trailing whitespace
- Add *args forwarding test
- Add exception propagation test
2026-04-16 02:05:59 -07:00
helix4u
4093982f19 fix: recompute Copilot api_mode after model switch
Recomputes GitHub Copilot api_mode from the selected model in the
shared /model switch path.  Before this change, Copilot could carry a
stale codex_responses mode forward from a GPT-5 selection into a later
Claude model switch, causing unsupported_api_for_model errors.

Cherry-picked from #10533 by @helix4u with:
- Comment specificity (Provider-specific → Copilot api_mode override)
- Fix pre-existing duplicate opencode-go in set literal
- Extract test mock helper to reduce duplication
- Add GPT-5 → GPT-5 regression test (keeps codex_responses)
2026-04-16 01:16:14 -07:00