Vision auto-mode previously tried only OpenRouter, Nous, and Codex
for multimodal requests, deliberately skipping custom endpoints on the
assumption that they 'may not handle vision input.' This caused silent
failures for users running local multimodal models (Qwen-VL, LLaVA,
Pixtral, etc.) without any cloud API keys.
Now custom endpoints are tried as a last resort in auto mode. If the
model doesn't support vision, the API call fails gracefully — but
users with local vision models no longer need to manually set
auxiliary.vision.provider: main in config.yaml.
Reported by @Spadav and @kotyKD.
The Docker backend already supports user-configured volume mounts via
docker_volumes, but it was undocumented — missing from DEFAULT_CONFIG,
cli.py defaults, and configuration docs.
Changes:
- hermes_cli/config.py: Add docker_volumes to DEFAULT_CONFIG with
inline documentation and examples
- cli.py: Add docker_volumes to load_cli_config defaults
- configuration.md: Full Docker Volume Mounts section with YAML
examples, use cases (providing files, receiving outputs, shared
workspaces), and env var alternative
Authored by @aydnOktay. Improves URL validation with urlparse, adds exc_info
to error logs for full stack traces, and tightens type hints.
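A minimal sketch of urlparse-based validation, for illustration only. The
function name is_valid_http_url and the exact acceptance rules (http/https
scheme plus a non-empty host) are assumptions, not the code from the PR.

```python
from urllib.parse import urlparse


def is_valid_http_url(url: str) -> bool:
    """Return True when url has an http(s) scheme and a non-empty host.

    Hypothetical helper: the real validation rules may differ.
    """
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse raises ValueError for some malformed inputs,
        # e.g. an unterminated IPv6 literal such as "http://[::1".
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

Checking both scheme and netloc rejects inputs such as 'http://' or a bare
file path, which a simple startswith('http') test would let through.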
Resolved merge conflict in _handle_vision_analyze: kept PR's string formatting
with our AUXILIARY_VISION_MODEL env var logic.
MCP tests import from mcp.types but mcp wasn't in the dev optional
dependencies. Fresh 'pip install -e .[dev]' setups failed 3 tests.
Based on PR #427 by @teyrebaz33 (applied manually due to stale branch).
The 'hermes gateway setup' instructions for Slack were missing:
- The 'Subscribe to Events' step entirely (message.im, message.channels,
app_mention, message.groups)
- Several required scopes (app_mentions:read, groups:history, users:read,
files:write)
- Warning about bot only working in DMs without message.channels
- Step to invite the bot to channels
The 'hermes setup' flow (setup.py) and the website docs (slack.md)
already had the correct information — only gateway.py was outdated.
Reported by JordanB on Slack.
The #1 support issue with Slack is 'bot works in DMs but not channels'.
This is almost always caused by missing event subscriptions (message.channels,
message.groups) or missing OAuth scopes (channels:history, groups:history).
Changes:
- slack.md: Move channels:history and groups:history from optional to required
scopes. Move message.channels and message.groups to required events. Add new
'How the Bot Responds' section explaining DM vs channel behavior. Add Step 8
for inviting bot to channels. Expand troubleshooting table with specific
'works in DMs not channels' entry. Add quick checklist for channel debugging.
- setup.py: Expand Slack setup wizard with all required scopes, event
subscriptions, and a warning that without message.channels/message.groups
the bot only works in DMs. Add link to full docs. Improve Member ID
discovery instructions.
- config.py: Update SLACK_BOT_TOKEN and SLACK_APP_TOKEN descriptions to list
required scopes and event subscriptions inline.
Skills can now declare fallback_for_toolsets, fallback_for_tools,
requires_toolsets, and requires_tools in their SKILL.md frontmatter.
The system prompt builder filters skills automatically based on which
tools are available in the current session.
- Add _read_skill_conditions() to parse conditional frontmatter fields
- Add _skill_should_show() to evaluate conditions against available tools
- Update build_skills_system_prompt() to accept and apply tool availability
- Pass valid_tool_names and available toolsets from run_agent.py
- Backward compatible: skills without conditions always show; calling
build_skills_system_prompt() with no args preserves existing behavior
Closes #539
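A rough sketch of how the condition check could work. The semantics here are
an assumption inferred from the field names: requires_* hides a skill when
any listed tool or toolset is missing, and fallback_for_* hides it when any
listed tool or toolset is present. The function name and signature are
hypothetical, not the actual _skill_should_show().

```python
def skill_should_show(conditions, available_tools=None, available_toolsets=None):
    """Decide whether a skill belongs in the system prompt (illustrative)."""
    # No availability info passed: keep the old behavior and show everything.
    if available_tools is None and available_toolsets is None:
        return True

    tools = set(available_tools or ())
    toolsets = set(available_toolsets or ())

    # Requirements: hide the skill if anything it needs is missing.
    if any(t not in toolsets for t in conditions.get("requires_toolsets", [])):
        return False
    if any(t not in tools for t in conditions.get("requires_tools", [])):
        return False

    # Fallbacks (assumed semantics): hide the skill when the primary
    # tool or toolset it stands in for is available.
    if any(t in toolsets for t in conditions.get("fallback_for_toolsets", [])):
        return False
    if any(t in tools for t in conditions.get("fallback_for_tools", [])):
        return False

    return True
```

A skill with an empty conditions dict always passes, which matches the
backward-compatibility note above.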
Implement send_document() and send_video() overrides in TelegramAdapter
so the agent can deliver files (PDFs, CSVs, docs, etc.) and videos as
native Telegram attachments instead of just printing the file path as
text.
The base adapter already routes MEDIA:<path> tags by extension — audio
goes to send_voice(), images to send_image_file(), and everything else
falls through to send_document(). But TelegramAdapter didn't override
send_document() or send_video(), so those fell back to plain text.
Now when the agent includes MEDIA:/path/to/report.pdf in its response,
users get a proper downloadable file attachment in Telegram.
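The extension-based routing described above can be sketched roughly as
follows. The extension sets and the helper names route_media and
truncate_caption are assumptions for illustration; only the three target
method names and the 1024-character caption cap come from this change.

```python
from pathlib import Path

# Assumed extension sets; the base adapter's real lists may differ.
AUDIO_EXTS = {".ogg", ".mp3", ".wav"}
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}

CAPTION_LIMIT = 1024  # caption cap mentioned for send_document


def route_media(path: str) -> str:
    """Return the adapter method name a MEDIA:<path> tag would go to."""
    ext = Path(path).suffix.lower()
    if ext in AUDIO_EXTS:
        return "send_voice"
    if ext in IMAGE_EXTS:
        return "send_image_file"
    # Everything else falls through to the document sender.
    return "send_document"


def truncate_caption(caption):
    """Trim a caption to the limit; None passes through unchanged."""
    if caption is None:
        return None
    return caption[:CAPTION_LIMIT]
```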
Features:
- send_document: sends files via bot.send_document with display name,
caption (truncated to 1024), and reply_to support
- send_video: sends videos via bot.send_video with inline playback
- Both fall back to base class text if the Telegram API call fails
- 10 new tests covering success, custom filename, file-not-found,
not-connected, caption truncation, API error fallback, and reply_to
Requested by @TigerHixTang on Twitter.
- Register no-op app_mention event handler to suppress Bolt 404 errors.
The 'message' handler already processes @mentions in channels, so
app_mention is acknowledged without duplicate processing.
- Add send_document() for native file attachments (PDFs, CSVs, etc.)
via files_upload_v2, matching the pattern from Telegram PR #779.
- Add send_video() for native video uploads via files_upload_v2.
- Handle incoming document attachments from users: download, cache,
and inject text content for .txt/.md files (capped at 100KB),
following the same pattern as the Telegram adapter.
- Add _download_slack_file_bytes() helper for raw byte downloads.
- Add 24 new tests covering all new functionality.
Fixes the unhandled app_mention events reported in gateway logs.
Closes #643
Changes:
- /personality none|default|neutral — clears system prompt overlay
- Custom personalities in config.yaml support dict format with:
name, description, system_prompt, tone, style directives
- Backwards compatible — existing string format still works
- CLI + gateway both updated
- 18 tests covering none/default/neutral, dict format, string format,
list display, save to config
time.sleep(1) inside async def connect() blocks the entire event
loop for 1 second. Replaced with await asyncio.sleep(1) to yield
control back to the event loop while waiting for the killed port
process to release.
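The difference can be seen in a small standalone demo (not the adapter
code; the function names are made up for illustration). A second task
scheduled alongside connect() only gets to run during the wait when the
wait is awaited.

```python
import asyncio
import time


async def connect_blocking(events):
    events.append("connect:start")
    time.sleep(0.05)  # blocks the whole event loop; nothing else can run
    events.append("connect:done")


async def connect_async(events):
    events.append("connect:start")
    await asyncio.sleep(0.05)  # yields control back to the event loop
    events.append("connect:done")


async def heartbeat(events):
    events.append("heartbeat")


async def run(connect):
    """Run connect() next to another task and record the event order."""
    events = []
    await asyncio.gather(connect(events), heartbeat(events))
    return events


if __name__ == "__main__":
    print(asyncio.run(run(connect_blocking)))
    print(asyncio.run(run(connect_async)))
```

With the blocking variant the heartbeat is recorded only after connect has
finished; with the awaited variant it runs while connect is still waiting.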
The full HERMES-AGENT ASCII logo needs ~95 columns, and the
side-by-side caduceus + tools panel needs ~80. In narrow terminals
(Kitty default, resized windows) everything wraps into visual garbage.
Fixes:
- show_banner() auto-detects terminal width and falls back to compact
banner when < 80 columns
- build_welcome_banner() skips the ASCII logo when < 95 columns
- Compact banner now dynamically sized via _build_compact_banner()
instead of a hardcoded 64-char box that also wrapped in narrow terminals
- Same width checks applied to /clear command's banner refresh
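A sketch of the width checks, assuming the thresholds quoted above (80 and
95 columns). The function names, layout labels, and box-drawing style are
hypothetical; only shutil.get_terminal_size is a real standard-library call.

```python
import shutil

PANEL_MIN_COLS = 80  # side-by-side caduceus + tools panel
LOGO_MIN_COLS = 95   # full ASCII logo


def choose_banner_layout(width=None):
    """Pick a banner layout for the given (or detected) terminal width."""
    if width is None:
        # Falls back to 80x24 when the size cannot be determined.
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    if width < PANEL_MIN_COLS:
        return "compact"
    if width < LOGO_MIN_COLS:
        return "panel_without_logo"
    return "full"


def compact_box(title, width):
    """Draw a three-line box exactly `width` columns wide (width >= 3)."""
    inner = max(width - 2, 1)
    text = title[:inner].center(inner)
    border = "+" + "-" * inner + "+"
    return "\n".join([border, "|" + text + "|", border])
```

Sizing the box from the detected width, instead of a fixed 64 characters,
is what keeps the compact banner from wrapping in very narrow windows.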
The up/down arrow key issue in Kitty terminal for multiline input is
a known Kitty keyboard protocol (CSI u) vs prompt_toolkit compatibility
gap — arrow keys work correctly in standard terminals and tmux. Users
can work around it by running in tmux or setting TERM=xterm-256color.
Adds a 'find-nearby' skill for discovering nearby places using
OpenStreetMap (Overpass + Nominatim). No API keys needed. Works with:
- Coordinates (from Telegram location pins)
- Addresses, cities, zip codes, landmarks (auto-geocoded)
- Multiple place types (restaurant, cafe, bar, pharmacy, etc.)
Returns names, distances, cuisine, hours, addresses, and Google Maps
links (pin + directions). 184-line stdlib-only script.
Also adds Telegram location message handling:
- New MessageType.LOCATION in gateway base
- Telegram adapter handles LOCATION and VENUE messages
- Injects lat/lon coordinates into conversation context
- Prompts agent to ask what the user wants nearby
Inspired by PR #422 (reimplemented with simpler script and broader
skill scope — addresses/cities/zips, not just Telegram coordinates).
Selecting a saved custom provider now switches instantly without
probing /models — the model name is stored in the config entry
as a complete profile (name + url + key + model).
Changes:
- custom_providers entries now include 'model' field
- Selecting a saved provider with a model just activates it
- Only probes /models if no model is saved (first-time setup)
- Menu shows saved model name: 'Local (localhost:8000) — llama-70b'
- Dedup on re-entry: still activates the model, just doesn't add
a duplicate config entry (updates model name if changed)
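The dedup-on-re-entry behavior might look roughly like this. It is a sketch
under assumptions: the helper name save_custom_provider is hypothetical, and
comparing base_url values after stripping a trailing slash is a guess at how
duplicates are detected.

```python
def save_custom_provider(providers, name, base_url, api_key, model=None):
    """Add a provider profile, or update the model of an existing one.

    `providers` is the custom_providers list from config.yaml, already
    loaded as a list of dicts. Returns the entry that is now active.
    """
    key = base_url.rstrip("/")  # assumed normalization
    for entry in providers:
        if entry.get("base_url", "").rstrip("/") == key:
            # Same endpoint: do not add a duplicate, just refresh the model.
            if model and entry.get("model") != model:
                entry["model"] = model
            return entry

    entry = {"name": name, "base_url": base_url, "api_key": api_key}
    if model:
        entry["model"] = model
    providers.append(entry)
    return entry
```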
When a user adds a custom endpoint via 'hermes model' → 'Custom
endpoint', it now automatically saves to custom_providers in
config.yaml so it persists and appears in the provider menu on
subsequent runs. Deduplicates by base_url.
Auto-generated names based on URL:
  http://localhost:8000/v1 → 'Local (localhost:8000)'
  https://xyz.runpod.ai/v1 → 'RunPod (xyz.runpod.ai)'
  https://api.example.com/v1 → 'Api.example.com'
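One way to reproduce the three naming examples above. The function name and
the matching rules (exact 'localhost' host, 'runpod.ai' suffix) are
assumptions chosen to fit those examples, not the actual implementation.

```python
from urllib.parse import urlparse


def auto_provider_name(base_url: str) -> str:
    """Derive a display name for a custom endpoint from its URL."""
    parsed = urlparse(base_url)
    host = parsed.hostname or ""
    netloc = parsed.netloc  # host plus port, e.g. "localhost:8000"
    if host == "localhost":
        return f"Local ({netloc})"
    if host.endswith("runpod.ai"):
        return f"RunPod ({netloc})"
    # Generic case: capitalize the first letter of the host.
    return netloc.capitalize()
```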
Also adds 'Remove a saved custom provider' option to the menu
(only shown when custom providers exist) with a selection UI
to pick which one to remove.
Users can also manually edit custom_providers in config.yaml
for full control over names and settings.
Users with multiple local servers or custom endpoints can now define
them all in config.yaml and switch between them from the model
selection menu:
  custom_providers:
    - name: 'Local Llama 70B'
      base_url: 'http://localhost:8000/v1'
      api_key: 'not-needed'
    - name: 'RunPod vLLM'
      base_url: 'https://xyz.runpod.ai/v1'
      api_key: 'rp_xxxxx'
These appear in `hermes model` provider selection alongside the
built-in providers. When selected, the endpoint's /models API is
probed to show available models in a selection menu.
Previously only a single 'Custom endpoint' option existed, requiring
manual URL entry each time you wanted to switch between local servers.
Requested by @ZiarnoBobu on Twitter.
Add MCP sampling/createMessage capability via SamplingHandler class.
Text-only sampling + tool use in sampling with governance (rate limits,
model whitelist, token caps, tool loop limits). Per-server audit metrics.
Based on concept from PR #366 by eren-karakus0. Restructured as class-based
design with bug fixes and tests using real MCP SDK types.
50 new tests, 2600 total passing.
Some local LLM servers (llama-server, etc.) return message.content as
a dict or list instead of a plain string. This caused
AttributeError: 'dict' object has no attribute 'strip' on every API call.
Normalizes content to string immediately after receiving the response:
- dict: extracts 'text' or 'content' field, falls back to json.dumps
- list: extracts text parts (OpenAI multimodal content format)
- other: str() conversion
Applied at the single point where response.choices[0].message is read
in the main agent loop, so all downstream .strip()/.startswith()/[:100]
operations work regardless of server implementation.
Closes #759
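The normalization rules listed above could be sketched like this. The
function name is hypothetical, and two details are assumptions rather than
facts from the change: None is mapped to an empty string, and list parts
are joined without a separator.

```python
import json


def normalize_content(content) -> str:
    """Coerce a chat completion's message.content into a plain string."""
    if isinstance(content, str):
        return content
    if content is None:
        return ""  # assumption: treat a missing body as empty text
    if isinstance(content, dict):
        # Prefer a 'text' or 'content' field, else dump the dict as JSON.
        for key in ("text", "content"):
            value = content.get(key)
            if isinstance(value, str):
                return value
        return json.dumps(content)
    if isinstance(content, list):
        # OpenAI multimodal format: [{"type": "text", "text": "..."}, ...]
        parts = []
        for part in content:
            if isinstance(part, str):
                parts.append(part)
            elif isinstance(part, dict) and isinstance(part.get("text"), str):
                parts.append(part["text"])
        return "".join(parts)  # assumption: no separator between parts
    return str(content)
```

Doing this once, right where the response is read, means later calls such
as .strip() or .startswith() never see a non-string value.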
_FakeReadResult and _FakeSearchResult now expose the attributes
that read_file_tool/search_tool access after the redact_sensitive_text
integration from main.
Combine read/search loop detection with main's redact_sensitive_text
and truncation hint features. Add tracker reset to TestSearchHints
to prevent cross-test state leakage.
When switching FROM Codex/Nous/custom TO OpenRouter via 'hermes setup',
the old provider stayed active because setup only saved the API key but
never updated config.yaml or auth.json. This caused resolve_provider()
to keep returning the old provider (e.g. openai-codex) even after the
user selected OpenRouter.
Fix: the OpenRouter path in setup now deactivates any OAuth provider
in auth.json and writes model.provider='openrouter' to config.yaml,
matching what all other provider paths already do.
Added pitfalls discovered during live abliteration testing:
- Models < 1B have fragmented refusal, respond poorly (0.5B: 60%→20%)
- Models 3B+ work much better (3B: 75%→0% with advanced defaults)
- aggressive method can backfire on small models (made it worse)
- Spectral certification RED is common even when refusal rate is 0%
- Fixed torch property: total_mem → total_memory
Three issues caused the gateway to display 'openrouter' instead of
'Custom endpoint' when users configured a custom OAI-compatible endpoint:
1. hermes setup: custom endpoint path saved OPENAI_BASE_URL and
OPENAI_API_KEY to .env but never wrote model.provider to config.yaml.
All other providers (Codex, z.ai, Kimi, etc.) call
_update_config_for_provider() which sets this — custom was the only
path that skipped it. Now writes model.provider='custom' and
model.base_url to config.yaml.
2. hermes model: custom endpoint set model.provider='auto' in config.yaml.
The CLI display had a hack to detect OPENAI_BASE_URL and override to
'custom', but the gateway didn't. Now sets model.provider='custom'
directly.
3. gateway /model and /provider commands: defaulted to 'openrouter' and
read config.yaml — which had no provider set. Added OPENAI_BASE_URL
detection fallback (same pattern the CLI uses) as a defensive catch
for existing users who set up before this fix.
Add configurable bot message filtering via DISCORD_ALLOW_BOTS env var:
- 'none' (default): Ignore messages from all other bots. This is stricter
than the previous behavior, where only our own bot was filtered; ALL
bots are now filtered by default for cleaner channels
- 'mentions': Accept bot messages only when they @mention our bot —
useful for bot-to-bot workflows triggered by mentions
- 'all': Accept all bot messages — for setups where bots need to
interact freely
Previously, we only ignored our own bot's messages, allowing all other
bots through. This could cause noisy loops in channels with multiple bots.
8 new tests covering all filter modes and edge cases.
Inspired by openclaw v2026.3.7 Discord allowBots: 'mentions' config.
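The three modes reduce to a small decision function. This is a sketch with
a hypothetical name and plain boolean inputs; the real adapter works on
Discord message objects, and treating unknown values as 'none' is an
assumption.

```python
def should_accept_message(author_is_bot, author_is_self, mentions_self,
                          allow_bots="none"):
    """Apply the DISCORD_ALLOW_BOTS policy to one incoming message."""
    if author_is_self:
        return False  # never react to our own messages
    if not author_is_bot:
        return True   # human messages are unaffected by the setting

    mode = allow_bots.strip().lower()
    if mode == "all":
        return True
    if mode == "mentions":
        return mentions_self
    # 'none' and, by assumption, any unrecognized value.
    return False
```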
Enforce owner-only permissions on files and directories that contain
secrets or sensitive data:
- cron/jobs.py: jobs.json (0600), cron dirs (0700), job output files (0600)
- hermes_cli/config.py: config.yaml (0600), .env (0600), ~/.hermes/* dirs (0700)
- cli.py: config.yaml via save_config_value (0600)
All chmod calls use try/except for Windows compatibility.
Includes _secure_file() and _secure_dir() helpers with graceful fallback.
8 new tests verify permissions on all file types.
Inspired by openclaw v2026.3.7 file permission enforcement.
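A minimal sketch of what such helpers could look like. The names mirror
_secure_file() and _secure_dir() but the bodies and the boolean return
value are assumptions; the point is that a failing chmod is swallowed
instead of crashing.

```python
import os
import stat
import tempfile


def secure_file(path) -> bool:
    """Restrict a file to owner read/write (0600); never raises."""
    try:
        os.chmod(path, 0o600)
        return True
    except (OSError, NotImplementedError):
        # Windows or unusual filesystems: continue without the permission.
        return False


def secure_dir(path) -> bool:
    """Restrict a directory to owner-only access (0700); never raises."""
    try:
        os.chmod(path, 0o700)
        return True
    except (OSError, NotImplementedError):
        return False
```

On POSIX systems the result can be checked with
stat.S_IMODE(os.stat(path).st_mode).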
Prevents unnecessary Anthropic prompt cache misses by reusing stored
system prompts for continuing sessions and stabilizing Honcho context
per session instead of per turn.
Two changes to prevent unnecessary Anthropic prompt cache misses in the
gateway, where a fresh AIAgent is created per user message:
1. Reuse stored system prompt for continuing sessions:
When conversation_history is non-empty, load the system prompt from
the session DB instead of rebuilding from disk. The model already has
updated memory in its conversation history (it wrote it!), so
re-reading memory from disk produces a different system prompt that
breaks the cache prefix.
2. Stabilize Honcho context per session:
- Only prefetch Honcho context on the first turn (empty history)
- Bake Honcho context into the cached system prompt and store to DB
- Remove the per-turn Honcho injection from the API call loop
This ensures the system message is identical across all turns in a
session. Previously, re-fetching Honcho could return different context
on each turn, changing the system message and invalidating the cache.
Both changes preserve the existing behavior for compression (which
invalidates the prompt and rebuilds from scratch) and for the CLI
(where the same AIAgent persists and the cached prompt is already
stable across turns).
Tests: 2556 passed (6 new)
Moved redact_secrets out of DEFAULT_CONFIG (it's on by default when
unset) and into the commented sections at the bottom of config.yaml,
alongside fallback_model. Users can see the option and uncomment to
disable.
New config option:
  security:
    redact_secrets: false  # default: true
When set to false, API keys, tokens, and passwords are shown in
full in read_file, search_files, and terminal output. Useful for
debugging auth issues where you need to verify the actual key value.
Bridged to both CLI and gateway via HERMES_REDACT_SECRETS env var.
The check is in redact_sensitive_text() itself, so all call sites
(terminal, file tools, log formatter) respect it.
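Putting the check inside the redaction function itself could look like the
sketch below. Only the HERMES_REDACT_SECRETS variable name comes from this
change; the accepted 'off' values, the single sk-... pattern, and the
function names are illustrative assumptions.

```python
import os
import re

# Illustrative pattern only; the real redactor covers more secret formats.
_SECRET_RE = re.compile(r"sk-[A-Za-z0-9_-]{8,}")


def redaction_enabled() -> bool:
    """Redaction is on unless the env var explicitly turns it off."""
    value = os.environ.get("HERMES_REDACT_SECRETS", "true")
    return value.strip().lower() not in ("false", "0", "no", "off")


def redact_sensitive(text: str) -> str:
    """Mask secrets in text, honoring the global toggle."""
    if not redaction_enabled():
        return text
    return _SECRET_RE.sub("[REDACTED]", text)
```

Because every caller goes through the same function, no call site needs
its own copy of the toggle.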
Terminal output was already redacted via redact_sensitive_text() but
read_file and search_files returned raw content. Now both tools
redact secrets before returning results to the LLM.
Based on PR #372 by @teyrebaz33 (closes #363), applied manually
due to branch conflicts with the current codebase.
Authored by @ch3ronsa. Fixes UnicodeEncodeError/UnicodeDecodeError on
Windows with non-UTF-8 system locales (e.g. Turkish cp1254).
Adds encoding='utf-8' to 10 open() calls across gateway/session.py,
gateway/channel_directory.py, and gateway/mirror.py.
Uses temp file + fsync + os.replace() to avoid corruption if the
process crashes mid-write. Cleans up temp file on failure, logs
errors at debug level.
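The temp file + fsync + os.replace() pattern, sketched for a JSON file. The
function name and the choice to re-raise after logging are assumptions; the
manifest writer in this change may handle errors differently.

```python
import json
import logging
import os
import tempfile

logger = logging.getLogger(__name__)


def atomic_write_json(path, data):
    """Write data to path so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so os.replace() stays
    # on one filesystem, where the rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to disk before renaming
        os.replace(tmp_path, path)
    except BaseException as exc:
        logger.debug("atomic write to %s failed: %s", path, exc)
        try:
            os.unlink(tmp_path)  # clean up the temp file on failure
        except OSError:
            pass
        raise  # assumption: surface the error to the caller
```

If the process dies before os.replace(), the original file is untouched;
at worst a stray .tmp file is left behind.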
Based on PR #335 by @aydnOktay — adapted for the current v2
manifest format (name:hash).