- Add agent/embeddings.py with Embedder protocol, FastEmbedEmbedder, OpenAIEmbedder
- Factory function get_embedder() reads provider from config.yaml embeddings section
- Lazy initialization — no startup impact, model loaded on first embed call
- cosine_similarity() and cosine_similarity_matrix() utility functions included
- Add fastembed as optional dependency in pyproject.toml
- 30 unit tests, all passing
Closes#675
Completes the fix started in 8318a51 — handle_function_call() accepted
enabled_tools but run_agent.py never passed it. Now both call sites in
_execute_tool_calls() pass self.valid_tool_names, so each agent session
uses its own tool list instead of the process-global
_last_resolved_tool_names (which subagents can overwrite).
Also simplifies the redundant ternary in code_execution_tool.py:
sandbox_tools is already computed correctly (intersection with session
tools, or full SANDBOX_ALLOWED_TOOLS as fallback), so the conditional
was dead logic.
Inspired by PR #663 (JasonOA888). Closes#662.
Tests: 2857 passed.
Add support for using Nous Portal via a direct API key, mirroring
how OpenRouter and other API-key providers work. This gives users a
simpler alternative to the OAuth device-code flow when they already
have a Nous API key.
Changes:
- Add 'nous-api' to PROVIDER_REGISTRY as an api_key provider
pointing to https://inference-api.nousresearch.com/v1
- Add NOUS_API_KEY and NOUS_BASE_URL to OPTIONAL_ENV_VARS
- Add NOUS_API_BASE_URL / NOUS_API_CHAT_URL to hermes_constants
- Add 'Nous Portal API key' as first option in setup wizard
- Add provider aliases (nous_api, nousapi, nous-portal-api)
- Add test for nous-api runtime provider resolution
Closes#644
Move all new tests (schema, env filtering, edge cases, interrupt) into
the existing test_code_execution.py instead of a separate file.
Delete the now-redundant test_code_execution_schema.py.
build_execute_code_schema(set()) produced "from hermes_tools import , ..."
in the code property description — invalid Python syntax shown to the model.
This triggers when a user enables only the code_execution toolset without
any of the sandbox-allowed tools (e.g. `hermes tools code_execution`),
because SANDBOX_ALLOWED_TOOLS & {"execute_code"} = empty set.
Also adds 29 unit tests covering build_execute_code_schema, environment
variable filtering, execute_code edge cases, and interrupt handling.
Authored by tripledoublev.
After context compression on 413/400 errors, the inner retry loop was
reusing the stale pre-compression api_messages payload. Fix breaks out
of the inner retry loop so the outer loop rebuilds api_messages from
the now-compressed messages list. Adds regression test verifying the
second request actually contains the compressed payload.
Add display.background_process_notifications config option to control
how chatty the gateway process watcher is when using
terminal(background=true, check_interval=...) from messaging platforms.
Modes:
- all: running-output updates + final message (default, current behavior)
- result: only the final completion message
- error: only the final message when exit code != 0
- off: no watcher messages at all
Also supports HERMES_BACKGROUND_NOTIFICATIONS env var override.
Includes 12 tests (5 config loading + 7 watcher behavior).
Inspired by @PeterFile's PR #593. Closes#592.
Automatic filesystem snapshots before destructive file operations,
with user-facing rollback. Inspired by PR #559 (by @alireza78a).
Architecture:
- Shadow git repos at ~/.hermes/checkpoints/{hash}/ via GIT_DIR
- CheckpointManager: take/list/restore, turn-scoped dedup, pruning
- Transparent — the LLM never sees it, no tool schema, no tokens
- Once per turn — only first write_file/patch triggers a snapshot
Integration:
- Config: checkpoints.enabled + checkpoints.max_snapshots
- CLI flag: hermes --checkpoints
- Trigger: run_agent.py _execute_tool_calls() before write_file/patch
- /rollback slash command in CLI + gateway (list, restore by number)
- Pre-rollback snapshot auto-created on restore (undo the undo)
Safety:
- Never blocks file operations — all errors silently logged
- Skips root dir, home dir, dirs >50K files
- Disables gracefully when git not installed
- Shadow repo completely isolated from project git
Tests: 35 new tests, all passing (2798 total suite)
Docs: feature page, config reference, CLI commands reference
Authored by unmodeled-tyler. Adds openclaw-migration skill to optional-skills/
with migration script, SKILL.md, and 7 tests. Also improves clarify/approval
panel rendering with dynamic width calculation.
Authored by 0xbyt4. Fixes #N/A (no linked issue).
- Sanitize user input before FTS5 MATCH to prevent OperationalError on
special characters (C++, unbalanced quotes, dangling operators, etc.)
- Close SessionDB connection in mirror._append_to_sqlite() via finally block
- Added tests for both fixes
Platform.SIGNAL was missing from default_toolset_map and platform_config_key
in gateway/run.py, causing Signal to silently fall back to hermes-telegram
toolset (same bug as HomeAssistant, fixed in PR #538).
Also updates:
- tests/test_toolsets.py: include hermes-signal and hermes-homeassistant in
the platform core-tools consistency check
- cli-config.yaml.example: document signal and homeassistant platform keys
When a user runs `hermes -w -c Pokemon Agent Dev` without quoting the
session name, argparse would fail with:
error: argument command: invalid choice: 'Agent'
This is because argparse parses `-c Pokemon` (consuming one token via
nargs='?'), then sees 'Agent' and tries to match it as a subcommand.
Fix: add _coalesce_session_name_args() that pre-processes sys.argv before
argparse, joining consecutive non-flag, non-subcommand tokens after -c or
-r into a single argument. This makes both quoted and unquoted multi-word
session names work transparently.
Includes 17 tests covering all edge cases: multi-word names, single-word,
bare flags, flag ordering, subcommand boundaries, and passthrough.
Complements PR #453 by 0xbyt4. Adds isinstance(dict) guard in
run_agent.py to catch cases where json.loads returns non-dict
(e.g. null, list, string) before they reach downstream code.
Also adds 15 tests for build_tool_preview covering None args,
empty dicts, known/unknown tools, fallback keys, truncation,
and all special-cased tools (process, todo, memory, session_search).
Validate that the username portion of ~username paths contains only
valid characters (alphanumeric, dot, hyphen, underscore) before passing
to shell echo for expansion. Previously, paths like '~; rm -rf /'
would be passed unquoted to self._exec(f'echo {path}'), allowing
arbitrary command execution.
The approach validates the username rather than using shlex.quote(),
which would prevent tilde expansion from working at all since
echo '~user' outputs the literal string instead of expanding it.
Added tests for injection blocking and valid ~username/path expansion.
Credit to @alireza78a for reporting (PR #442, issue #442).
SamplingHandler.__call__ accessed response.choices[0] without checking
if the list was non-empty. LLM APIs can return empty choices on content
filtering, provider errors, or rate limits, causing an unhandled
IndexError that propagates to the MCP SDK and may crash the connection.
Add a defensive guard that returns a proper ErrorData when choices is
empty, None, or missing. Includes three test cases covering all
variants.
Vision auto-mode previously only tried OpenRouter, Nous, and Codex
for multimodal — deliberately skipping custom endpoints with the
assumption they 'may not handle vision input.' This caused silent
failures for users running local multimodal models (Qwen-VL, LLaVA,
Pixtral, etc.) without any cloud API keys.
Now custom endpoints are tried as a last resort in auto mode. If the
model doesn't support vision, the API call fails gracefully — but
users with local vision models no longer need to manually set
auxiliary.vision.provider: main in config.yaml.
Reported by @Spadav and @kotyKD.
Skills can now declare fallback_for_toolsets, fallback_for_tools,
requires_toolsets, and requires_tools in their SKILL.md frontmatter.
The system prompt builder filters skills automatically based on which
tools are available in the current session.
- Add _read_skill_conditions() to parse conditional frontmatter fields
- Add _skill_should_show() to evaluate conditions against available tools
- Update build_skills_system_prompt() to accept and apply tool availability
- Pass valid_tool_names and available toolsets from run_agent.py
- Backward compatible: skills without conditions always show; calling
build_skills_system_prompt() with no args preserves existing behavior
Closes#539
Implement send_document() and send_video() overrides in TelegramAdapter
so the agent can deliver files (PDFs, CSVs, docs, etc.) and videos as
native Telegram attachments instead of just printing the file path as
text.
The base adapter already routes MEDIA:<path> tags by extension — audio
goes to send_voice(), images to send_image_file(), and everything else
falls through to send_document(). But TelegramAdapter didn't override
send_document() or send_video(), so those fell back to plain text.
Now when the agent includes MEDIA:/path/to/report.pdf in its response,
users get a proper downloadable file attachment in Telegram.
Features:
- send_document: sends files via bot.send_document with display name,
caption (truncated to 1024), and reply_to support
- send_video: sends videos via bot.send_video with inline playback
- Both fall back to base class text if the Telegram API call fails
- 10 new tests covering success, custom filename, file-not-found,
not-connected, caption truncation, API error fallback, and reply_to
Requested by @TigerHixTang on Twitter.
- Register no-op app_mention event handler to suppress Bolt 404 errors.
The 'message' handler already processes @mentions in channels, so
app_mention is acknowledged without duplicate processing.
- Add send_document() for native file attachments (PDFs, CSVs, etc.)
via files_upload_v2, matching the pattern from Telegram PR #779.
- Add send_video() for native video uploads via files_upload_v2.
- Handle incoming document attachments from users: download, cache,
and inject text content for .txt/.md files (capped at 100KB),
following the same pattern as the Telegram adapter.
- Add _download_slack_file_bytes() helper for raw byte downloads.
- Add 24 new tests covering all new functionality.
Fixes the unhandled app_mention events reported in gateway logs.
Closes#643
Changes:
- /personality none|default|neutral — clears system prompt overlay
- Custom personalities in config.yaml support dict format with:
name, description, system_prompt, tone, style directives
- Backwards compatible — existing string format still works
- CLI + gateway both updated
- 18 tests covering none/default/neutral, dict format, string format,
list display, save to config
Add MCP sampling/createMessage capability via SamplingHandler class.
Text-only sampling + tool use in sampling with governance (rate limits,
model whitelist, token caps, tool loop limits). Per-server audit metrics.
Based on concept from PR #366 by eren-karakus0. Restructured as class-based
design with bug fixes and tests using real MCP SDK types.
50 new tests, 2600 total passing.
_FakeReadResult and _FakeSearchResult now expose the attributes
that read_file_tool/search_tool access after the redact_sensitive_text
integration from main.
Combine read/search loop detection with main's redact_sensitive_text
and truncation hint features. Add tracker reset to TestSearchHints
to prevent cross-test state leakage.
Add configurable bot message filtering via DISCORD_ALLOW_BOTS env var:
- 'none' (default): Ignore all other bot messages — matches previous
behavior where only our own bot was filtered, but now ALL bots are
filtered by default for cleaner channels
- 'mentions': Accept bot messages only when they @mention our bot —
useful for bot-to-bot workflows triggered by mentions
- 'all': Accept all bot messages — for setups where bots need to
interact freely
Previously, we only ignored our own bot's messages, allowing all other
bots through. This could cause noisy loops in channels with multiple bots.
8 new tests covering all filter modes and edge cases.
Inspired by openclaw v2026.3.7 Discord allowBots: 'mentions' config.
Enforce owner-only permissions on files and directories that contain
secrets or sensitive data:
- cron/jobs.py: jobs.json (0600), cron dirs (0700), job output files (0600)
- hermes_cli/config.py: config.yaml (0600), .env (0600), ~/.hermes/* dirs (0700)
- cli.py: config.yaml via save_config_value (0600)
All chmod calls use try/except for Windows compatibility.
Includes _secure_file() and _secure_dir() helpers with graceful fallback.
8 new tests verify permissions on all file types.
Inspired by openclaw v2026.3.7 file permission enforcement.
Two changes to prevent unnecessary Anthropic prompt cache misses in the
gateway, where a fresh AIAgent is created per user message:
1. Reuse stored system prompt for continuing sessions:
When conversation_history is non-empty, load the system prompt from
the session DB instead of rebuilding from disk. The model already has
updated memory in its conversation history (it wrote it!), so
re-reading memory from disk produces a different system prompt that
breaks the cache prefix.
2. Stabilize Honcho context per session:
- Only prefetch Honcho context on the first turn (empty history)
- Bake Honcho context into the cached system prompt and store to DB
- Remove the per-turn Honcho injection from the API call loop
This ensures the system message is identical across all turns in a
session. Previously, re-fetching Honcho could return different context
on each turn, changing the system message and invalidating the cache.
Both changes preserve the existing behavior for compression (which
invalidates the prompt and rebuilds from scratch) and for the CLI
(where the same AIAgent persists and the cached prompt is already
stable across turns).
Tests: 2556 passed (6 new)
Terminal output was already redacted via redact_sensitive_text() but
read_file and search_files returned raw content. Now both tools
redact secrets before returning results to the LLM.
Based on PR #372 by @teyrebaz33 (closes#363) — applied manually
due to branch conflicts with the current codebase.
The summary message was always injected as 'user' role, which causes
consecutive user messages when the last preserved head message is also
'user'. Some APIs reject this (400 error), and it produces malformed
training data.
Fix: check the role of the last head message and pick the opposite role
for the summary — 'user' after assistant/tool, 'assistant' after user.
Based on PR #328 by johnh4098. Closes#328.
Split fallback provider handling into two clean registries:
_FALLBACK_API_KEY_PROVIDERS — env-var-based (openrouter, zai, kimi, minimax)
_FALLBACK_OAUTH_PROVIDERS — OAuth-based (openai-codex, nous)
New _resolve_fallback_credentials() method handles all three cases
(OAuth, API key, custom endpoint) and returns a uniform (key, url, mode)
tuple. _try_activate_fallback() is now just validation + client build.
Adds Nous Portal as a fallback provider — uses the same OAuth flow
as the primary provider (hermes login), returns chat_completions mode.
OAuth providers get credential refresh for free: the existing 401
retry handlers (_try_refresh_codex/nous_client_credentials) check
self.provider, which is set correctly after fallback activation.
4 new tests (nous activation, nous no-login, codex retained).
27 total fallback tests passing, 2548 full suite.
Codex OAuth uses a different auth flow (OAuth tokens, not env vars)
and a different API mode (codex_responses, not chat_completions).
The fallback now handles this specially:
- Resolves credentials via resolve_codex_runtime_credentials()
- Sets api_mode to codex_responses
- Fails gracefully if no Codex OAuth session exists
Also added to the commented-out config.yaml example.
2 new tests (codex activation + graceful failure).
The gateway had a SEPARATE compression system ('session hygiene')
with hardcoded thresholds (100k tokens / 200 messages) that were
completely disconnected from the model's context length and the
user's compression config in config.yaml. This caused premature
auto-compression on Telegram/Discord — triggering at ~60k tokens
(from the 200-message threshold) or inconsistent token counts.
Changes:
- Gateway hygiene now reads model name from config.yaml and uses
get_model_context_length() to derive the actual context limit
- Compression threshold comes from compression.threshold in
config.yaml (default 0.85), same as the agent's ContextCompressor
- Removed the message-count-based trigger (was redundant and caused
false positives in tool-heavy sessions)
- Removed the undocumented session_hygiene config section — the
standard compression.* config now controls everything
- Env var overrides (CONTEXT_COMPRESSION_THRESHOLD,
CONTEXT_COMPRESSION_ENABLED) are respected
- Warn threshold is now 95% of model context (was hardcoded 200k)
- Updated tests to verify model-aware thresholds, scaling across
models, and that message count alone no longer triggers compression
For claude-opus-4.6 (200k context) at 85% threshold: gateway
hygiene now triggers at 170k tokens instead of the old 100k.