Introduces providers/ package — single source of truth for every
inference provider. Adding a simple api-key provider now requires one
providers/<name>.py file with zero edits anywhere else.
What this PR ships:
- providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes)
- ProviderProfile declarative fields: name, api_mode, aliases, display_name,
env_vars, base_url, models_url, auth_type, fallback_models, hostname,
default_headers, fixed_temperature, default_max_tokens, default_aux_model
- 4 overridable hooks: prepare_messages, build_extra_body,
build_api_kwargs_extras, fetch_models
- chat_completions.build_kwargs: profile path via _build_kwargs_from_profile,
legacy flag path retained for lmstudio/tencent-tokenhub (which have
session-aware reasoning probing that doesn't map cleanly to hooks yet)
- run_agent.py: profile path for all registered providers; legacy path
variable scoping fixed (all flags defined before branching)
- Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS,
doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER
- GeminiProfile: thinking_config translation (native + openai-compat nested)
- New tests/providers/ (79 tests covering profile declarations, transport
parity, hook overrides, e2e kwargs assembly)
Deltas vs original PR (salvaged onto current main):
- Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth
(were added to main since original PR)
- Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their
reasoning_effort probing has no clean hook equivalent yet)
- Removed lmstudio alias from custom profile (it's a separate provider now)
- Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension
(resolve_provider special-cases them; adding breaks runtime resolution)
- runtime_provider: profile.api_mode only as fallback when URL detection
finds nothing (was breaking minimax /v1 override)
- Preserved main's legacy-path improvements: deepseek reasoning_content
preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M
beta recovery, etc.
- Kept agent/copilot_acp_client.py in place (rejected PR's relocation —
main has 7 fixes landed since; relocation would revert them)
- _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing
test imports
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Closes#14418
Cron is a built-in Hermes feature (CLI `hermes cron`, `cronjob` agent
tool, gateway ticker, scheduler in cron/scheduler.py) but croniter has
been gated behind the [cron] optional extra. Users who do a plain
`pip install hermes-agent` can create jobs via /cron but any recurring
cron schedule silently returns next_run_at=None (HAS_CRONITER=False),
which then gets wrapped into a 'state=error' message only after a tick.
Move croniter into core dependencies so scheduled jobs work out of the
box on any install path. The [cron] extra is kept as an empty
passthrough so existing `pip install hermes-agent[cron]` installs and
the [all]/[termux] extras continue to resolve.
Also update the now-stale user-facing error message in
`compute_next_run()` that still tells users to install `hermes-agent[cron]`.
Salvaged from #17234 (authored by @txbxxx) with a corrected premise:
the original PR claimed [cron] wasn't in [all], but it is (pyproject.toml
line 112). The real UX problem is the plain no-extras install path,
which this fix addresses.
Adds Vercel Sandbox as a supported Hermes terminal backend alongside
existing providers (Local, Docker, Modal, SSH, Daytona, Singularity).
Uses the Vercel Python SDK to create/manage cloud microVMs, supports
snapshot-based filesystem persistence keyed by task_id, and integrates
with the existing BaseEnvironment shell contract and FileSyncManager
for credential/skill syncing.
Based on #17127 by @scotttrinh, cherry-picked onto current main.
Background macOS desktop control via cua-driver MCP — does NOT steal the
user's cursor or keyboard focus, works with any tool-capable model.
Replaces the Anthropic-native `computer_20251124` approach from the
abandoned #4562 with a generic OpenAI function-calling schema plus SOM
(set-of-mark) captures so Claude, GPT, Gemini, and open models can all
drive the desktop via numbered element indices.
- `tools/computer_use/` package — swappable ComputerUseBackend ABC +
CuaDriverBackend (stdio MCP client to trycua/cua's cua-driver binary).
- Universal `computer_use` tool with one schema for all providers.
Actions: capture (som/vision/ax), click, double_click, right_click,
middle_click, drag, scroll, type, key, wait, list_apps, focus_app.
- Multimodal tool-result envelope (`_multimodal=True`, OpenAI-style
`content: [text, image_url]` parts) that flows through
handle_function_call into the tool message. Anthropic adapter converts
into native `tool_result` image blocks; OpenAI-compatible providers
get the parts list directly.
- Image eviction in convert_messages_to_anthropic: only the 3 most
recent screenshots carry real image data; older ones become text
placeholders to cap per-turn token cost.
- Context compressor image pruning: old multimodal tool results have
their image parts stripped instead of being skipped.
- Image-aware token estimation: each image counts as a flat 1500 tokens
instead of its base64 char length (~1MB would have registered as
~250K tokens before).
- COMPUTER_USE_GUIDANCE system-prompt block — injected when the toolset
is active.
- Session DB persistence strips base64 from multimodal tool messages.
- Trajectory saver normalises multimodal messages to text-only.
- `hermes tools` post-setup installs cua-driver via the upstream script
and prints permission-grant instructions.
- CLI approval callback wired so destructive computer_use actions go
through the same prompt_toolkit approval dialog as terminal commands.
- Hard safety guards at the tool level: blocked type patterns
(curl|bash, sudo rm -rf, fork bomb), blocked key combos (empty trash,
force delete, lock screen, log out).
- Skill `apple/macos-computer-use/SKILL.md` — universal (model-agnostic)
workflow guide.
- Docs: `user-guide/features/computer-use.md` plus reference catalog
entries.
44 new tests in tests/tools/test_computer_use.py covering schema
shape (universal, not Anthropic-native), dispatch routing, safety
guards, multimodal envelope, Anthropic adapter conversion, screenshot
eviction, context compressor pruning, image-aware token estimation,
run_agent helpers, and universality guarantees.
469/469 pass across tests/tools/test_computer_use.py + the affected
agent/ test suites.
- `model_tools.py` provider-gating: the tool is available to every
provider. Providers without multi-part tool message support will see
text-only tool results (graceful degradation via `text_summary`).
- Anthropic server-side `clear_tool_uses_20250919` — deferred;
client-side eviction + compressor pruning cover the same cost ceiling
without a beta header.
- macOS only. cua-driver uses private SkyLight SPIs
(SLEventPostToPid, SLPSPostEventRecordTo,
_AXObserverAddNotificationAndCheckRemote) that can break on any macOS
update. Pin with HERMES_CUA_DRIVER_VERSION.
- Requires Accessibility + Screen Recording permissions — the post-setup
prints the Settings path.
Supersedes PR #4562 (pyautogui/Quartz foreground backend, Anthropic-
native schema). Credit @0xbyt4 for the original #3816 groundwork whose
context/eviction/token design is preserved here in generic form.
Closes#13626.
Two follow-ups on top of the _hermes_home helper from @jerome-benoit's #12729:
1. Declare a [google] optional extra in pyproject.toml
(google-api-python-client, google-auth-oauthlib, google-auth-httplib2) and
include it in [all]. Packagers (Nix flake, Homebrew) now ship the deps by
default, so `setup.py --check` does not need to shell out to pip at
runtime — the imports succeed and install_deps() is never reached.
This fixes the Nix breakage where pip/ensurepip are stripped.
2. Add `from __future__ import annotations` to setup.py so the PEP 604
`str | None` annotation parses on Python 3.9 (macOS system python).
Previously system python3 SyntaxError'd before any code ran.
install_deps() error message now also points users at the extra instead of
just the raw pip command.
Adds ruff (fast Python linter from Astral) as a dev dependency and sets
up initial config with all files excluded — ruff is entirely disabled
for now, this just lands the config for slow rollout enabling it
module-by-module in follow-up PRs.
The transport refactor (PRs #13862 ff.) added agent/transports/ as a
sub-package but the setuptools packages.find include list only had
"agent" (top-level files), not "agent.*" (sub-packages).
pip install / Nix builds therefore ship run_agent.py (which now imports
from agent.transports on every API call) but omit the transports
directory entirely, causing:
ModuleNotFoundError: No module named 'agent.transports'
on every LLM call for packaged installs.
Adds "agent.*" to match the existing pattern used by tools, gateway,
tui_gateway, and plugins.
Cherry-picked from #10985 by pedh, adapted to current main:
* Keeps main's full group-chat gating (require_mention + allowed_users +
free_response_chats + mention_patterns) — PR's simpler subset dropped.
* Keeps main's fire-and-forget process() dispatch + session_webhook
fallback for SDK >= 0.24.
* Picks up PR's REQUIRES_EDIT_FINALIZE capability flag on
BasePlatformAdapter + finalize kwarg on edit_message(), plumbed through
stream_consumer. Default False so Telegram/Slack/Discord/Matrix stay
on the zero-overhead fast path.
* DingTalk AI Card lifecycle: per-chat _message_contexts, two-card flow
(tool-progress + final response) with sibling auto-close driven by
reply_to, idempotent 🤔Thinking → 🥳Done swap, $alibabacloud-dingtalk$
for media URL resolution (replaces raw HTTP that was 403-ing).
* pyproject: dingtalk extra now dingtalk-stream>=0.20,<1 +
alibabacloud-dingtalk>=2.0.0 + qrcode.
Closes#10991
Co-authored-by: pedh
#4b1567f4 (anthhub) added qrcode to the messaging extra for Weixin's
QR login. The same package is needed by:
* hermes_cli/dingtalk_auth.py — QR device-flow auth shipped in #11574
* gateway/platforms/feishu.py:3962 — Feishu QR login
These extras are independent of [messaging] (users can install
hermes-agent[dingtalk] or hermes-agent[feishu] without [messaging]),
so the dep needs to be declared on each.
Pin matches anthhub's choice (>=7.0,<8) for consistency. The all
extra inherits from all three, so it picks up qrcode transitively.
Adds parallel tests to tests/test_project_metadata.py — same shape
as test_messaging_extra_includes_qrcode_for_weixin_setup.
Refs #9431.
The everywhere release — Hermes goes mobile with Termux/Android, adds
iMessage and WeChat, ships Fast Mode for OpenAI and Anthropic,
introduces background process monitoring, launches a local web
dashboard, and delivers the deepest security hardening pass yet
across 16 supported platforms.
487 commits, 269 merged PRs, 167 resolved issues, 24 contributors.
* feat: web UI dashboard for managing Hermes Agent (salvage of #8204/#7621)
Adds an embedded web UI dashboard accessible via `hermes web`:
- Status page: agent version, active sessions, gateway status, connected platforms
- Config editor: schema-driven form with tabbed categories, import/export, reset
- API Keys page: set, clear, and view redacted values with category grouping
- Sessions, Skills, Cron, Logs, and Analytics pages
Backend:
- hermes_cli/web_server.py: FastAPI server with REST endpoints
- hermes_cli/config.py: reload_env() utility for hot-reloading .env
- hermes_cli/main.py: `hermes web` subcommand (--port, --host, --no-open)
- cli.py / commands.py: /reload slash command for .env hot-reload
- pyproject.toml: [web] optional dependency extra (fastapi + uvicorn)
- Both update paths (git + zip) auto-build web frontend when npm available
Frontend:
- Vite + React + TypeScript + Tailwind v4 SPA in web/
- shadcn/ui-style components, Nous design language
- Auto-refresh status page, toast notifications, masked password inputs
Security:
- Path traversal guard (resolve().is_relative_to()) on SPA file serving
- CORS localhost-only via allow_origin_regex
- Generic error messages (no internal leak), SessionDB handles closed properly
Tests: 47 tests covering reload_env, redact_key, API endpoints, schema
generation, path traversal, category merging, internal key stripping,
and full config round-trip.
Original work by @austinpickett (PR #1813), salvaged by @kshitijk4poor
(PR #7621 → #8204), re-salvaged onto current main with stale-branch
regressions removed.
* fix(web): clean up status page cards, always rebuild on `hermes web`
- Remove config version migration alert banner from status page
- Remove config version card (internal noise, not surfaced in TUI)
- Reorder status cards: Agent → Gateway → Active Sessions (3-col grid)
- `hermes web` now always rebuilds from source before serving,
preventing stale web_dist when editing frontend files
* feat(web): full-text search across session messages
- Add GET /api/sessions/search endpoint backed by FTS5
- Auto-append prefix wildcards so partial words match (e.g. 'nimb' → 'nimby')
- Debounced search (300ms) with spinner in the search icon slot
- Search results show FTS5 snippets with highlighted match delimiters
- Expanding a search hit auto-scrolls to the first matching message
- Matching messages get a warning ring + 'match' badge
- Inline term highlighting within Markdown (text, bold, italic, headings, lists)
- Clear button (x) on search input for quick reset
---------
Co-authored-by: emozilla <emozilla@nousresearch.com>
Fixes#7952 — Matrix E2EE completely broken after mautrix migration.
- Replace MemoryCryptoStore + pickle/HMAC persistence with mautrix's
PgCryptoStore backed by SQLite via aiosqlite. Crypto state now
persists reliably across restarts without fragile serialization.
- Add handle_sync() call on initial sync response so to-device events
(queued Megolm key shares) are dispatched to OlmMachine instead of
being silently dropped.
- Add _verify_device_keys_on_server() after loading crypto state.
Detects missing keys (re-uploads), stale keys from migration
(attempts re-upload), and corrupted state (refuses E2EE).
- Add _CryptoStateStore adapter wrapping MemoryStateStore to satisfy
mautrix crypto's StateStore interface (is_encrypted,
get_encryption_info, find_shared_rooms).
- Remove redundant share_keys() call from sync loop — OlmMachine
already handles this via DEVICE_OTK_COUNT event handler.
- Fix datetime vs float TypeError in session.py suspend_recently_active()
that crashed gateway startup.
- Add aiosqlite and asyncpg to [matrix] extra in pyproject.toml.
- Update test mocks for PgCryptoStore/Database and add query_keys mock
for key verification. 174 tests pass.
- Add E2EE upgrade/migration docs to Matrix user guide.
matrix-nio pulls in peewee -> atomicwrites (sdist-only, archived,
missing build-system metadata) which breaks nix flake builds.
mautrix-python publishes wheels, has a leaner dep tree, and its
[encryption] extra uses the same python-olm without the problematic
transitive chain.
* fix(nix): gate matrix extra to Linux in [all] profile
matrix-nio[e2e] depends on python-olm which is upstream-broken on modern
macOS (Clang 21+, archived libolm). Previously the [matrix] extra was
completely excluded from [all], meaning NixOS users (who install via [all])
had no Matrix support at all.
Add a sys_platform == 'linux' marker so [all] pulls in [matrix] on Linux
(where python-olm builds fine) while still skipping it on macOS. This
fixes the NixOS setup path without breaking macOS installs.
Update the regression test to verify the Linux-gated marker is present
rather than just checking matrix is absent from [all].
Fixes#4594
* chore: regenerate uv.lock with matrix-on-linux in [all]
Add the [socks] extra to the httpx dependency to include the required
'socksio' package. This fixes the error: "Using SOCKS proxy, but the
'socksio' package is not installed" when users configure SOCKS proxy
settings.
Straight rename to match the 0.9.0 API where AuthMethod was split into
AuthMethodAgent, AuthMethodEnvVar, AuthMethodTerminal. Bump pin to >=0.9.0,<1.0.
Co-authored-by: Mibayy <mibayy@users.noreply.github.com>
Cherry-picked from PR #4338 by nepenth, resolved against current main.
Adds:
- Processing lifecycle reactions (eyes/checkmark/cross) via MATRIX_REACTIONS env
- Reaction send/receive with ReactionEvent + UnknownEvent fallback for older nio
- Fire-and-forget read receipts on text and media messages
- Message redaction, room history fetch, room creation, user invite
- Presence status control (online/offline/unavailable)
- Emote (/me) and notice message types with HTML rendering
- XSS-hardened markdown-to-HTML converter (strips raw HTML preprocessor,
sanitizes link URLs against javascript:/data:/vbscript: schemes)
- Comprehensive regex fallback with full block/inline markdown support
- Markdown>=3.6 added to [matrix] extras in pyproject.toml
- 46 new tests covering all features and security hardening