When enabled, @mentioning the bot in a DM creates a thread (default:
false). Supports both env var and YAML config (matrix.dm_mention_threads).
6 new tests, docs updated.
From #6957
When _build_api_kwargs() throws an exception, the except handler in
the retry loop referenced api_kwargs before it was assigned. This
caused an UnboundLocalError that masked the real error, making
debugging impossible for the user.
Two _dump_api_request_debug() calls in the except block (non-retryable
client error path and max-retries-exhausted path) both accessed
api_kwargs without checking if it was assigned.
Fix: initialize api_kwargs = None before the retry loop and guard both
dump calls. Now the real error surfaces instead of the masking
UnboundLocalError.
Reported by Discord user gruman0.
HERMES_OVERLAYS keys use models.dev IDs (e.g. 'github-copilot') but
_PROVIDER_MODELS curated lists and config.yaml use Hermes provider IDs
('copilot'). list_authenticated_providers() Section 2 was using the
overlay key directly for model lookups and is_current checks, causing:
- 0 models shown for copilot, kimi, kilo, opencode, vercel
- is_current never matching the config provider
Fix: build reverse mapping from PROVIDER_TO_MODELS_DEV to translate
overlay keys to Hermes slugs before curated list lookup and result
construction. Also adds 'kimi-for-coding' alias in auth.py so the
picker's returned slug resolves correctly in resolve_provider().
Fixes#5223. Based on work by HearthCore (#6492) and linxule (#6287).
Co-authored-by: HearthCore <HearthCore@users.noreply.github.com>
Co-authored-by: linxule <linxule@users.noreply.github.com>
Adds xAI as a first-class provider: ProviderConfig in auth.py,
HermesOverlay in providers.py, 11 curated Grok models, URL mapping
in model_metadata.py, aliases (x-ai, x.ai), and env var tests.
Uses standard OpenAI-compatible chat completions.
Closes#7050
`delegate_task` silently truncated batch tasks to 3 — the model sends
5 tasks, gets results for 3, never told 2 were dropped. Now returns a
clear tool_error explaining the limit and how to fix it.
The limit is configurable via:
- delegation.max_concurrent_children in config.yaml (priority 1)
- DELEGATION_MAX_CONCURRENT_CHILDREN env var (priority 2)
- default: 3
Uses the same _load_config() path as the rest of delegate_task for
consistent config priority. Clamps to min 1, warns on non-integer
config values.
Also removes the hardcoded maxItems: 3 from the JSON schema — the
schema was blocking the model from even attempting >3 tasks before
the runtime check could fire. The runtime check gives a much more
actionable error message.
Backwards compatible: default remains 3, existing configs unchanged.
Isolate system tool configs (git, ssh, gh, npm) per profile by injecting
a per-profile HOME into subprocess environments only. The Python
process's own os.environ['HOME'] and Path.home() are never modified,
preserving all existing profile infrastructure.
Activation is directory-based: when {HERMES_HOME}/home/ exists on disk,
subprocesses see it as HOME. The directory is created automatically for:
- Docker: entrypoint.sh bootstraps it inside the persistent volume
- Named profiles: added to _PROFILE_DIRS in profiles.py
Injection points (all three subprocess env builders):
- tools/environments/local.py _make_run_env() — foreground terminal
- tools/environments/local.py _sanitize_subprocess_env() — background procs
- tools/code_execution_tool.py child_env — execute_code sandbox
Single source of truth: hermes_constants.get_subprocess_home()
Closes#4426
Broaden the UnicodeEncodeError recovery to handle systems with ASCII-only
locale (LANG=C, Chromebooks) where ANY non-ASCII character causes encoding
failure, not just lone surrogates.
Changes:
- Add _strip_non_ascii() and _sanitize_messages_non_ascii() helpers that
strip all non-ASCII characters from message content, name, and tool_calls
- Update the UnicodeEncodeError handler to detect ASCII codec errors and
fall back to non-ASCII sanitization after surrogate check fails
- Sanitize tool_calls arguments and name fields (not just content)
- Fix bare .encode() in cli.py suspend handler to use explicit utf-8
- Add comprehensive test suite (17 tests)
When delegate_task runs, the parent agent's activity tracker freezes
because child.run_conversation() blocks and the child's own
_touch_activity() never propagates back to the parent. The gateway
inactivity timeout then fires a spurious 'No activity' warning and
eventually kills the agent, even though the subagent is actively working.
Fix: add a heartbeat thread in _run_single_child that calls
parent._touch_activity() every 30 seconds with detail from the child's
activity summary (current tool, iteration count). The thread is a daemon
that starts before child.run_conversation() and is cleaned up in the
finally block.
This also improves the gateway 'Still working...' status messages —
instead of just 'running: delegate_task', users now see what the
subagent is actually doing (e.g., 'delegate_task: subagent running
terminal (iteration 5/50)').
_resolve_verify() returned stale CA bundle paths from auth.json without
checking if the file exists. When a user logs into Nous Portal on their
host (where SSL_CERT_FILE points to a valid cert), that path gets
persisted in auth.json. Running hermes model later in Docker where the
host path doesn't exist caused FileNotFoundError bubbling up as
'Could not verify credentials: [Errno 2] No such file or directory'.
Now _resolve_verify validates the path exists before returning it. If
missing, logs a warning and falls back to True (default certifi-based
TLS verification).
In Docker, HERMES_HOME=/opt/data (set in Dockerfile) and users mount
their .hermes directory to /opt/data. However, profile operations used
Path.home() / '.hermes' which resolves to /root/.hermes in Docker —
an ephemeral container path, not the mounted volume.
This caused:
- Profiles created at /root/.hermes/profiles/ (lost on container recreate)
- active_profile sticky file written to wrong location
- profile list looking at wrong directory
Fix: Add get_default_hermes_root() to hermes_constants.py that detects
Docker/custom deployments (HERMES_HOME outside ~/.hermes) and returns
HERMES_HOME as the root. Also handles Docker profiles correctly
(<root>/profiles/<name> → root is grandparent).
Files changed:
- hermes_constants.py: new get_default_hermes_root()
- hermes_cli/profiles.py: _get_default_hermes_home() delegates to shared fn
- hermes_cli/main.py: _apply_profile_override() + _invalidate_update_cache()
- hermes_cli/gateway.py: _profile_suffix() + _profile_arg()
- Tests: 12 new tests covering Docker scenarios
fetch_api_models is imported locally inside _model_flow_named_custom from
hermes_cli.models, not defined as a module-level attribute of hermes_cli.main.
Patch the source module so the local import picks up the mock.
Also force simple_term_menu ImportError so tests reliably use the input()
fallback path regardless of environment.
Co-Authored-By: Claude <noreply@anthropic.com>
When opencode-go API key is set, it should appear in the /model list.
The provider was already in PROVIDER_TO_MODELS_DEV and PROVIDER_REGISTRY,
so it appears via Part 1 (built-in source).
Also fixes a potential issue in Part 2 (HERMES_OVERLAYS) where providers
with auth_type=api_key but no extra_env_vars would not be detected:
- Now also checks api_key_env_vars from PROVIDER_REGISTRY for api_key auth_type
- Add test verifying opencode-go appears when OPENCODE_GO_API_KEY is set
When switching models at runtime, the config_context_length override
was not being passed to the new context compressor instance. This
meant the user-specified context length from config.yaml was lost
after a model switch.
- Store _config_context_length on AIAgent instance during __init__
- Pass _config_context_length when creating new ContextCompressor in switch_model
- Add test to verify config_context_length is preserved across model switches
Fixes: quando estamos alterando o modelo não está alterando o tamanho do contexto
Port from anomalyco/opencode#21355: Alibaba's DashScope API returns a
unique throttling message ('Request rate increased too quickly...') that
doesn't match standard rate-limit patterns ('rate limit', 'too many
requests'). This caused Alibaba errors to fall through to the 'unknown'
category rather than being properly classified as rate_limit with
appropriate backoff/rotation.
Add 'rate increased too quickly' to _RATE_LIMIT_PATTERNS and test with
the exact error message observed from the Alibaba provider.
Telegram's Bot API only allows a specific set of emoji for bot reactions
(the ReactionEmoji enum). ✅ (U+2705) and ❌ (U+274C) are not in that
set, causing on_processing_complete reactions to silently fail with
REACTION_INVALID (caught at debug log level).
Replace with 👍 (U+1F44D) / 👎 (U+1F44E) which are always available in
Telegram's allowed reaction list. The 👀 (eyes) reaction used by
on_processing_start was already valid.
Based on the fix by @ppdng in PR #6685.
Fixes#6068
Add debug logging when eyes reaction redaction fails, and add tests
for the success=False path and the no-pending-reaction edge case.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_send_reaction now returns Optional[str] (event_id) instead of bool.
Tests updated:
- test_send_reaction: assert result == event_id string
- test_send_reaction_no_client: assert result is None
- test_on_processing_start_sends_eyes: _send_reaction returns event_id,
now also asserts _pending_reactions is populated
- test_on_processing_complete_sends_check: set up _pending_reactions and
mock _redact_reaction, assert eyes reaction is redacted before sending check
Four gaps in DANGEROUS_PATTERNS found by running 10 targeted tests that
each mapped to a specific pattern in approval.py and checked whether the
documented defense actually held.
1. **Heredoc script injection** — `python3 << 'EOF'` bypasses the
existing `-e`/`-c` flag pattern. Adds pattern for interpreter + `<<`
covering python{2,3}, perl, ruby, node.
2. **PID expansion self-termination** — `kill -9 $(pgrep hermes)` is
opaque to the existing `pkill|killall` + name pattern because command
substitution is not expanded at detection time. Adds structural
patterns matching `kill` + `$(pgrep` and backtick variants.
3. **Git destructive operations** — `git reset --hard`, `push --force`,
`push -f`, `clean -f*`, and `branch -D` were entirely absent.
Note: `branch -d` also triggers because IGNORECASE is global —
acceptable since -d is still a delete, just a safe one, and the
prompt is only a confirmation, not a hard block.
4. **chmod +x then execute** — two-step social engineering where a
script containing dangerous commands is first written to disk (not
checked by write_file), then made executable and run as `./script`.
Pattern catches `chmod +x ... [;&|]+ ./` combos. Does not solve the
deeper architectural issue (write_file not checking content) — that
is called out in the PR description as a known limitation.
Tests: 23 new cases across 4 test classes, all in test_approval.py:
- TestHeredocScriptExecution (7 cases, incl. regressions for -c)
- TestPgrepKillExpansion (5 cases, incl. safe kill PID negative)
- TestGitDestructiveOps (8 cases, incl. safe git status/push negatives)
- TestChmodExecuteCombo (3 cases, incl. safe chmod-only negative)
Full suite: 146 passed, 0 failed.
_resolve_api_key_provider() now checks is_provider_explicitly_configured
before calling _try_anthropic(). Previously, any auxiliary fallback
(e.g. when kimi-coding key was invalid) would silently discover and use
Claude Code OAuth tokens — consuming the user's Claude Max subscription
without their knowledge.
This is the auxiliary-client counterpart of the setup-wizard gate in
PR #4210.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, removing a claude_code credential from the anthropic pool
only printed a note — the next load_pool() re-seeded it from
~/.claude/.credentials.json. Now writes a 'suppressed_sources' flag
to auth.json that _seed_from_singletons checks before seeding.
Follows the pattern of env: source removal (clears .env var) and
device_code removal (clears auth store state).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_seed_from_singletons('anthropic') now checks
is_provider_explicitly_configured('anthropic') before reading
~/.claude/.credentials.json. Without this, the auxiliary client
fallback chain silently discovers and uses Claude Code tokens when
the user's primary provider key is invalid — consuming their Claude
Max subscription quota without consent.
Follows the same gating pattern as PR #4210 (setup wizard gate)
but applied to the credential pool seeding path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gate function for checking whether a user has explicitly selected a
provider via hermes model/setup, auth.json active_provider, or env
vars. Used in subsequent commits to prevent unauthorized credential
auto-discovery. Follows the pattern from PR #4210.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Follow-up to Dusk1e's PR #7120 (Slack send_image redirect guard):
- Rename _safe_url_for_log -> safe_url_for_log (drop underscore) since
it is now imported cross-module by the Slack adapter
- Add _ssrf_redirect_guard httpx event hook to cache_image_from_url()
and cache_audio_from_url() in base.py — same pattern as vision_tools
and the Slack adapter fix
- Update url_safety.py docstring to reflect broader coverage
- Add regression tests for image/audio redirect blocking + safe passthrough
The API server's _run_agent() was not passing task_id to
run_conversation(), causing a fresh random UUID per request. This meant
every Open WebUI message spun up a new Docker container and tore it down
afterward — making persistent filesystem state impossible.
Two fixes:
1. Pass task_id="default" so all API server conversations share the same
Docker container (matching the design intent: one configured Docker
environment, always the same container).
2. Derive a stable session_id from the system prompt + first user message
hash instead of uuid4(). This stops hermes sessions list from being
polluted with single-message throwaway sessions.
Fixes#3438.
Slack may return an HTML sign-in/redirect page instead of actual media
bytes (e.g. expired token, restricted file access). This adds two layers
of defense:
1. Content-Type check in slack.py rejects text/html responses early
2. Magic-byte validation in base.py's cache_image_from_bytes() rejects
non-image data regardless of source platform
Also adds ValueError guards in wecom.py and email.py so the new
validation doesn't crash those adapters.
Closes#6829
When kill_process() sends SIGTERM, both it and the reader thread race
to call _move_to_finished() — kill_process sets exit_code=-15 and
enqueues a notification, then the reader thread's process.wait()
returns with exit_code=143 (128+SIGTERM) and enqueues a second one.
Fix: make _move_to_finished() idempotent by tracking whether the
session was actually removed from _running. The second call sees it
was already moved and skips the completion_queue.put().
Adds regression test: test_move_to_finished_idempotent_no_duplicate
- test_background_autocompletes: pytest.importorskip("prompt_toolkit")
so the test skips gracefully where the CLI dep is absent
- test_run_agent_progress_stays_in_originating_topic: update stale emoji
💻 → ⚙️ to match get_tool_emoji("terminal", default="⚙️") in run.py
- test_internal_event_bypass{_authorization,_pairing}: mock
_handle_message_with_agent to raise immediately; avoids the 300s
run_in_executor hang that caused the tests to time out
Platforms that don't return a message_id after the first send (Signal,
GitHub webhooks) were causing GatewayStreamConsumer to re-enter the
"first send" path on every tool boundary, posting one platform message
per tool call (observed as 155 PR comments on a single response).
Fix: treat _message_id == "__no_edit__" as a sentinel meaning "platform
accepted the send but cannot be edited". When a tool boundary arrives
in that state, skip the message_id/accumulated/last_sent_text reset so
all continuation text is delivered once via _send_fallback_final rather
than re-posted per segment.
Also make prompt_toolkit imports in hermes_cli/commands.py optional so
gateway and test environments that lack the package can still import
resolve_command, gateway_help_lines, and COMMAND_REGISTRY.
Adds a regression test verifying that /background bypasses the
active-session guard in the platform adapter, matching the existing
test pattern for /stop, /new, /approve, /deny, and /status.