hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-04-25 00:51:20 +00:00

Author	SHA1	Message	Date
Teknium	f683132c1d	feat(api-server): inline image inputs on /v1/chat/completions and /v1/responses (#12969 ) OpenAI-compatible clients (Open WebUI, LobeChat, etc.) can now send vision requests to the API server. Both endpoints accept the canonical OpenAI multimodal shape: Chat Completions: {type: text\|image_url, image_url: {url, detail?}} Responses: {type: input_text\|input_image, image_url: <str>, detail?} The server validates and converts both into a single internal shape that the existing agent pipeline already handles (Anthropic adapter converts, OpenAI-wire providers pass through). Remote http(s) URLs and data:image/* URLs are supported. Uploaded files (file, input_file, file_id) and non-image data: URLs are rejected with 400 unsupported_content_type. Changes: - gateway/platforms/api_server.py - _normalize_multimodal_content(): validates + normalizes both Chat and Responses content shapes. Returns a plain string for text-only content (preserves prompt-cache behavior on existing callers) or a canonical [{type:text\|image_url,...}] list when images are present. - _content_has_visible_payload(): replaces the bare truthy check so a user turn with only an image no longer rejects as 'No user message'. - _handle_chat_completions and _handle_responses both call the new helper for user/assistant content; system messages continue to flatten to text. - Codex conversation_history, input[], and inline history paths all share the same validator. No duplicated normalizers. - run_agent.py - _summarize_user_message_for_log(): produces a short string summary ('[1 image] describe this') from list content for logging, spinner previews, and trajectory writes. Fixes AttributeError when list user_message hit user_message[:80] + '...' / .replace(). - _chat_content_to_responses_parts(): module-level helper that converts chat-style multimodal content to Responses 'input_text'/'input_image' parts. Used in _chat_messages_to_responses_input for Codex routing. - _preflight_codex_input_items() now validates and passes through list content parts for user/assistant messages instead of stringifying. - tests/gateway/test_api_server_multimodal.py (new, 38 tests) - Unit coverage for _normalize_multimodal_content, including both part formats, data URL gating, and all reject paths. - Real aiohttp HTTP integration on /v1/chat/completions and /v1/responses verifying multimodal payloads reach _run_agent intact. - 400 coverage for file / input_file / non-image data URL. - tests/run_agent/test_run_agent_multimodal_prologue.py (new) - Regression coverage for the prologue no-crash contract. - _chat_content_to_responses_parts round-trip coverage. - website/docs/user-guide/features/api-server.md - Inline image examples for both endpoints. - Updated Limitations: files still unsupported, images now supported. Validated live against openrouter/anthropic/claude-opus-4.6: POST /v1/chat/completions → 200, vision-accurate description POST /v1/responses → 200, same image, clean output_text POST /v1/chat/completions [file] → 400 unsupported_content_type POST /v1/responses [input_file] → 400 unsupported_content_type POST /v1/responses [non-image data URL] → 400 unsupported_content_type Closes #5621, #8253, #4046, #6632. Co-authored-by: Paul Bergeron <paul@gamma.app> Co-authored-by: zhangxicen <zhangxicen@example.com> Co-authored-by: Manuel Schipper <manuelschipper@users.noreply.github.com> Co-authored-by: pradeep7127 <pradeep7127@users.noreply.github.com>	2026-04-20 04:16:13 -07:00
Teknium	54e0eb24c0	docs: correctness audit — fix wrong values, add missing coverage (#11972 ) Comprehensive audit of every reference/messaging/feature doc page against the live code registries (PROVIDER_REGISTRY, OPTIONAL_ENV_VARS, COMMAND_REGISTRY, TOOLSETS, tool registry, on-disk skills). Every fix was verified against code before writing. ### Wrong values fixed (users would paste-and-fail) - reference/environment-variables.md: - DASHSCOPE_BASE_URL default was `coding-intl.dashscope.aliyuncs.com/v1` \u2192 actual `dashscope-intl.aliyuncs.com/compatible-mode/v1`. - MINIMAX_BASE_URL and MINIMAX_CN_BASE_URL defaults were `/v1` \u2192 actual `/anthropic` (Hermes calls MiniMax via its Anthropic Messages endpoint). - reference/toolsets-reference.md MCP example used the non-existent nested `mcp: servers:` key \u2192 real key is the flat `mcp_servers:`. - reference/skills-catalog.md listed ~20 bundled skills that no longer exist on disk (all moved to `optional-skills/`). Regenerated the whole bundled section from `skills/*/SKILL.md` \u2014 79 skills, accurate paths and names. - messaging/slack.md ":::info" callout claimed Slack has no `free_response_channels` equivalent; both the env var and the yaml key are in fact read. - messaging/qqbot.md documented `QQ_MARKDOWN_SUPPORT` as an env var, but the adapter only reads `extra.markdown_support` from config.yaml. Removed the env var row and noted config-only nature. - messaging/qqbot.md `hermes setup gateway` \u2192 `hermes gateway setup`. ### Missing coverage added - Providers: AWS Bedrock and Qwen Portal (qwen-oauth) \u2014 both in PROVIDER_REGISTRY but undocumented everywhere. Added sections to integrations/providers.md, rows to quickstart.md and fallback-providers.md. - integrations/providers.md "Fallback Model" provider list now includes gemini, google-gemini-cli, qwen-oauth, xai, nvidia, ollama-cloud, bedrock. - reference/cli-commands.md `--provider` enum and HERMES_INFERENCE_PROVIDER enum in env-vars now include the same set. - reference/slash-commands.md: added `/agents` (alias `/tasks`) and `/copy`. Removed duplicate rows for `/snapshot`, `/fast` (\u00d72), `/debug`. - reference/tools-reference.md: fixed "47 built-in tools" \u2192 52. Added `feishu_doc` and `feishu_drive` toolset sections. - reference/toolsets-reference.md: added `feishu_doc` / `feishu_drive` core rows + all missing `hermes-<platform>` toolsets in the platform table (bluebubbles, dingtalk, feishu, qqbot, wecom, wecom-callback, weixin, homeassistant, webhook, gateway). Fixed the `debugging` composite to describe the actual `includes=[...]` mechanism. - reference/optional-skills-catalog.md: added `fitness-nutrition`. - reference/environment-variables.md: added NOUS_BASE_URL, NOUS_INFERENCE_BASE_URL, NVIDIA_API_KEY/BASE_URL, OLLAMA_API_KEY/BASE_URL, XAI_API_KEY/BASE_URL, MISTRAL_API_KEY, AWS_REGION/AWS_PROFILE, BEDROCK_BASE_URL, HERMES_QWEN_BASE_URL, DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, TELEGRAM_REPLY_TO_MODE, MATRIX_DEVICE_ID, MATRIX_REACTIONS, QQBOT_HOME_CHANNEL_NAME, QQ_SANDBOX. - messaging/discord.md: documented DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS and HERMES_DISCORD_TEXT_BATCH_SPLIT _DELAY_SECONDS (all actively read by the adapter). - messaging/matrix.md: documented MATRIX_REACTIONS (default true). - messaging/telegram.md: removed the redundant second Webhook Mode section that invented a `telegram.webhook_mode: true` yaml key the adapter does not read. - user-guide/features/hooks.md: added `on_session_finalize` and `on_session_reset` (both emitted via invoke_hook but undocumented). - user-guide/features/api-server.md: documented GET /health/detailed, the `/api/jobs/` CRUD surface, POST /v1/runs, and GET /v1/runs/{id}/events (10 routes that were live but undocumented). - user-guide/features/fallback-providers.md: added `approval` and `title_generation` auxiliary-task rows; added gemini, bedrock, qwen-oauth to the supported-providers table. - user-guide/features/tts.md: "seven providers" \u2192 "eight" (post-xAI add oversight in #11942). - user-guide/configuration.md: TTS provider enum gains `xai` and `gemini`; yaml example block gains `mistral:`, `gemini:`, `xai:` subsections. Auxiliary-provider enum now enumerates all real registry entries. - reference/faq.md: stale AIAgent/config examples bumped from `nous/hermes-3-llama-3.1-70b` and `claude-sonnet-4.6` to `claude-opus-4.7`. ### Docs-site integrity - guides/build-a-hermes-plugin.md referenced two nonexistent hooks (`pre_api_request`, `post_api_request`). Replaced with the real `on_session_finalize` / `on_session_reset` entries. - messaging/open-webui.md and features/api-server.md had pre-existing broken links to `/docs/user-guide/features/profiles` (actual path is `/docs/user-guide/profiles`). Fixed. - reference/skills-catalog.md had one `<1%` literal that MDX parsed as a JSX tag. Escaped to `<1%`. ### False positives filtered out (not changed, verified correct) - `/set-home` is a registered alias of `/sethome` \u2014 docs were fine. - `hermes setup gateway` is valid syntax (`hermes setup \<section\>`); changed in qqbot.md for cross-doc consistency, not as a bug fix. - Telegram reactions "disabled by default" matches code (default `"false"`). - Matrix encryption "opt-in" matches code (empty env default \u2192 disabled). - `pre_api_request` / `post_api_request` hooks do NOT exist in current code; documented instead the real `on_session_finalize` / `on_session_reset`. - SIGNAL_IGNORE_STORIES is already in env-vars.md (subagent missed it). Validation: - `docusaurus build` \u2014 passes (only pre-existing nix-setup anchor warning). - `ascii-guard lint docs` \u2014 124 files, 0 errors. - 22 files changed, +317 / \u2212158.	2026-04-18 01:45:48 -07:00
Teknium	2871ef1807	docs: note session continuity for previous_response_id chains (#10060 )	2026-04-14 21:07:37 -07:00
simon-marcus	d6c09ab94a	feat(api-server): stream /v1/responses SSE tool events	2026-04-14 20:51:52 -07:00
Teknium	8bb5973950	docs: add proxy mode documentation - Matrix docs: full Proxy Mode section with architecture diagram, step-by-step setup (host + Docker), docker-compose.yml/Dockerfile examples, configuration reference, and limitations notes - API Server docs: add Proxy Mode section explaining the api_server serves as the backend for gateway proxy mode - Environment variables reference: add GATEWAY_PROXY_URL and GATEWAY_PROXY_KEY entries	2026-04-14 10:49:48 -07:00
entropidelic	989b950fbc	fix(security): enforce API_SERVER_KEY for non-loopback binding Add is_network_accessible() helper using Python's ipaddress module to robustly classify bind addresses (IPv4/IPv6 loopback, wildcards, mapped addresses, hostname resolution with DNS-failure-fails-closed). The API server connect() now refuses to start when the bind address is network-accessible and no API_SERVER_KEY is set, preventing RCE from other machines on the network. Co-authored-by: entropidelic <entropidelic@users.noreply.github.com>	2026-04-10 16:51:44 -07:00
Teknium	9634e20e15	feat: API server model name derived from profile name (#6857 ) * feat: API server model name derived from profile name For multi-user setups (e.g. OpenWebUI), each profile's API server now advertises a distinct model name on /v1/models: - Profile 'lucas' -> model ID 'lucas' - Profile 'admin' -> model ID 'admin' - Default profile -> 'hermes-agent' (unchanged) Explicit override via API_SERVER_MODEL_NAME env var or platforms.api_server.model_name config for custom names. Resolves friction where OpenWebUI couldn't distinguish multiple hermes-agent connections all advertising the same model name. * docs: multi-user setup with profiles for API server + Open WebUI - api-server.md: added Multi-User Setup section, API_SERVER_MODEL_NAME to config table, updated /v1/models description - open-webui.md: added Multi-User Setup with Profiles section with step-by-step guide, updated model name references - environment-variables.md: added API_SERVER_MODEL_NAME entry	2026-04-09 17:07:29 -07:00
Teknium	fb2af3bd1d	docs: document tool progress streaming in API server and Open WebUI (#4138 ) Update docs to reflect that tool progress now streams inline during SSE responses. Previously docs said tool calls were invisible. - api-server.md: add 'Tool progress in streams' note to streaming docs - open-webui.md: update 'How It Works' steps, add Tool Progress tip	2026-03-30 19:40:39 -07:00
Teknium	02fb7c4aaf	docs: comprehensive docs audit — fix 12 stale/missing items across 10 pages (#3618 ) Fixes found by auditing docs against recent PRs/commits: Critical (misleading): - hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks that are now active (#3542). Add correct callback signatures. - security.md: Update tirith verdict behavior — block verdicts now go through approval flow instead of hard-blocking (#3428). Add pkill/killall self-termination guard and gateway-run backgrounding patterns (#3593). New feature docs: - configuration.md: Add tool_use_enforcement section with value table (auto/true/false/list) from #3551/#3528. - configuration.md: Expand auxiliary config with per-task timeouts (compression 120s, web_extract 30s, approval 30s) from #3597. - api-server.md: Add /v1/health alias, Security Headers section, CORS details (Max-Age, SSE headers, Idempotency-Key) from #3572/#3573/#3576/#3580/#3530. Stale/incomplete: - configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484). - environment-variables.md: Specify actual DashScope default URL. - cli-commands.md: Add alibaba to --provider list. - fallback-providers.md: Add Alibaba/DashScope to provider table. - email.md: Document noreply/automated sender filtering (#3606). - toolsets-reference.md: Add 4 missing platform toolsets — matrix, mattermost, dingtalk, api-server (#3583). - skills.md: List default GitHub taps including garrytan/gstack (#3605).	2026-03-28 15:26:35 -07:00
Teknium	9718334962	docs: fix api-server response storage — SQLite, not in-memory (#2819 ) * docs: update all docs for /model command overhaul and custom provider support Documents the full /model command overhaul across 6 files: AGENTS.md: - Add model_switch.py to project structure tree configuration.md: - Rewrite General Setup with 3 config methods (interactive, config.yaml, env vars) - Add new 'Switching Models with /model' section documenting all syntax variants - Add 'Named Custom Providers' section with config.yaml examples and custom:name:model triple syntax slash-commands.md: - Update /model descriptions in both CLI and messaging tables with full syntax examples (provider:model, custom:model, custom:name:model, bare custom auto-detect) cli-commands.md: - Add /model slash command subsection under hermes model with syntax table - Add custom endpoint config to hermes model use cases faq.md: - Add config.yaml example for offline/local model setup - Note that provider: custom is a first-class provider - Document /model custom auto-detect provider-runtime.md: - Add model_switch.py to implementation file list - Update provider families to show Custom as first-class with named variants * docs: fix api-server response storage description — SQLite, not in-memory The ResponseStore class uses SQLite persistence (with in-memory fallback), not pure in-memory storage. Responses survive gateway restarts.	2026-03-24 09:05:15 -07:00
Teknium	e109a8b502	fix(security): block untrusted browser access to api server (#2451 ) Co-authored-by: ifrederico <fr@tecompanytea.com>	2026-03-22 04:08:48 -07:00
Teknium	588962d24e	docs: escape {id} in api-server.md headings to fix MDX build (#1787 ) MDX v2+ interprets curly braces in regular markdown as JSX expressions. The headings 'GET /v1/responses/{id}' and 'DELETE /v1/responses/{id}' caused a ReferenceError during Docusaurus static site generation because 'id' is not a defined JavaScript variable. Escaped with backslashes. Co-authored-by: Test <test@test.com>	2026-03-17 11:04:37 -07:00
Teknium	dd60bcbfb7	feat: OpenAI-compatible API server + WhatsApp configurable reply prefix (#1756 ) * feat: OpenAI-compatible API server platform adapter Salvaged from PR #956, updated for current main. Adds an HTTP API server as a gateway platform adapter that exposes hermes-agent via the OpenAI Chat Completions and Responses APIs. Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat, AnythingLLM, NextChat, ChatBox, etc.) can connect by pointing at http://localhost:8642/v1. Endpoints: - POST /v1/chat/completions — stateless Chat Completions API - POST /v1/responses — stateful Responses API with chaining - GET /v1/responses/{id} — retrieve stored response - DELETE /v1/responses/{id} — delete stored response - GET /v1/models — list hermes-agent as available model - GET /health — health check Features: - Real SSE streaming via stream_delta_callback (uses main's streaming) - In-memory LRU response store for Responses API conversation chaining - Named conversations via 'conversation' parameter - Bearer token auth (optional, via API_SERVER_KEY) - CORS support for browser-based frontends - System prompt layering (frontend system messages on top of core) - Real token usage tracking in responses Integration points: - Platform.API_SERVER in gateway/config.py - _create_adapter() branch in gateway/run.py - API_SERVER_* env vars in hermes_cli/config.py - Env var overrides in gateway/config.py _apply_env_overrides() Changes vs original PR #956: - Removed streaming infrastructure (already on main via stream_consumer.py) - Removed Telegram reply_to_mode (separate feature, not included) - Updated _resolve_model() -> _resolve_gateway_model() - Updated stream_callback -> stream_delta_callback - Updated connect()/disconnect() to use _mark_connected()/_mark_disconnected() - Adapted to current Platform enum (includes MATTERMOST, MATRIX, DINGTALK) Tests: 72 new tests, all passing Docs: API server guide, Open WebUI integration guide, env var reference * feat(whatsapp): make reply prefix configurable via config.yaml Reworked from PR #1764 (ifrederico) to use config.yaml instead of .env. The WhatsApp bridge prepends a header to every outgoing message. This was hardcoded to '⚕ Hermes Agent'. Users can now customize or disable it via config.yaml: whatsapp: reply_prefix: '' # disable header reply_prefix: '🤖 My Bot\n───\n' # custom prefix How it works: - load_gateway_config() reads whatsapp.reply_prefix from config.yaml and stores it in PlatformConfig.extra['reply_prefix'] - WhatsAppAdapter reads it from config.extra at init - When spawning bridge.js, the adapter passes it as WHATSAPP_REPLY_PREFIX in the subprocess environment - bridge.js handles undefined (default), empty (no header), or custom values with \\n escape support - Self-chat echo suppression uses the configured prefix Also fixes _config_version: was 9 but ENV_VARS_BY_VERSION had a key 10 (TAVILY_API_KEY), so existing users at v9 would never be prompted for Tavily. Bumped to 10 to close the gap. Added a regression test to prevent this from happening again. Credit: ifrederico (PR #1764) for the bridge.js implementation and the config version gap discovery. --------- Co-authored-by: Test <test@test.com>	2026-03-17 10:44:37 -07:00

13 commits