hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-05 12:42:30 +00:00

History

Teknium 5aa755e4e6 feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194 ) * feat(plugins): host-owned LLM access via ctx.llm Plugins can now ask the host to run a one-shot chat or structured completion against the user's active model and auth, without ever seeing an OAuth token or API key. Closes the gap where plugins that needed bounded structured inference (receipts, CRM extraction, support classification) had to either bring their own provider keys or register a tool the agent had to call. New surface on PluginContext: - ctx.llm.complete(messages, ...) - ctx.llm.complete_structured(instructions, input, json_schema, ...) - async siblings ctx.llm.acomplete / acomplete_structured Backed by the existing auxiliary_client.call_llm pipeline — every provider, fallback chain, vision routing, and timeout policy Hermes already supports applies automatically. Trust gate (fail-closed by default): - plugins.entries.<id>.llm.allow_model_override - plugins.entries.<id>.llm.allowed_models (allowlist; '' = any) - plugins.entries.<id>.llm.allow_agent_id_override - plugins.entries.<id>.llm.allow_profile_override Embedded model@profile shorthand goes through the same gate as explicit profile=, so it can't bypass the auth-profile policy. Conflicting explicit and embedded profiles fail closed. Also lands: - plugins/plugin-llm-example/ — reference plugin that registers /receipt-extract, demonstrating image+text structured input, jsonschema validation, and the trust-gate config. - website/docs/developer-guide/plugin-llm-access.md — full API docs. - 45 unit tests covering trust gates, JSON parsing, schema validation, image encoding, async surface, and config loading. Validation: - 2628 tests pass in tests/agent/ - E2E: bundled plugin loaded with isolated HERMES_HOME, slash command produced parsed JSON via stubbed call_llm - response_format extra_body wired correctly for both json_object and json_schema modes docs(plugin-llm): rewrite quickstart and framing The quickstart now uses a meeting-notes-to-tasks example instead of a receipt extractor, and the page leads with hook-time / gateway pre-filter / scheduled-job framing rather than the OpenClaw KB/support/CRM/finance/migration enumeration that the original upstream PR used. Receipt example moved to a separate worked example link so the docs page itself doesn't echo any of the upstream framing. Also clarifies where ctx.llm fits in the broader plugin surface (table comparing register_tool / register_platform / register_hook / etc.) and what makes this lane different from auxiliary_client internals. No code change. * docs(plugin-llm): reframe as any LLM call, not just structured output The original draft leaned heavily on complete_structured() and made the chat lane (complete() / acomplete()) feel like a footnote. Restructure so: - The page title and description say 'any LLM call.' - The lead shows BOTH a plain chat call (error rewriter) AND a structured call (triage scorer) up top. - Quick start has two complete plugin examples — /tldr (chat) and /paste-to-tasks (structured). - New 'When to use which' table for choosing complete() vs complete_structured() vs the async siblings. - Trust-gate sections explicitly note 'all four methods,' and the request-shaping list calls out chat-only fields (messages) and structured-only fields (instructions, input, json_schema) alongside each other. - The 'Where this fits' section now says 'for any reason, structured or not.' The receipt-extractor reference plugin still exists under plugins/plugin-llm-example/ — but the docs page no longer treats it as the canonical surface example. It's now described as 'a third worked example, this time with image input.' No code change. * feat(plugin-llm): split provider/model into independent explicit kwargs The first cut accepted a single 'provider/model' slug on every method and split it internally. That looked clean but broke under live test: the model-override path tried to use the slug's vendor prefix as a literal Hermes provider id, which silently switched the user off their aggregator (e.g. plugin asks for 'openai/gpt-4o-mini' on a user who routes through OpenRouter — host attempted to call the 'openai' provider directly, failed because OPENAI_API_KEY wasn't set). New shape mirrors the host's main config: ctx.llm.complete( messages=[...], provider='openrouter', # gated, optional model='openai/gpt-4o-mini', # gated, optional profile='work', # gated, optional ... ) Each is independently gated by its own allow__override flag. Granting model-override does NOT auto-grant provider-override. Allowlists are now per-axis (allowed_providers, allowed_models) matched literally against whatever string the plugin sends. Dropped 'model@profile' embedded-suffix shorthand entirely. Hermes doesn't use that pattern anywhere else; profile= is its own kwarg. Live E2E (against real OpenRouter via Teknium's config) confirms: - zero-config call works - default-deny blocks each override with a helpful error - model-only override stays on user's active provider (the bug) - provider+model override switches cleanly - allowlist refuses non-listed entries - structured output round-trip parses + schema-validates Tests: 49 cases (up from 45); all green. Docs updated to match the new shape, including a 'most plugins never need this section' callout on the trust-gate config block. fix+cleanup(plugin-llm): real attribution, hook-mode coverage, move example out of core Three integration fixes for the ctx.llm surface: 1. Attribution bug — result.provider and result.model now reflect what call_llm actually used, not placeholder fallbacks ('auto', 'default'). New _resolve_attribution() helper: - explicit overrides win (what the call targeted) - response.model wins for the recorded model (provider canonicalisation: 'gpt-4o' → 'gpt-4o-2024-08-06' etc.) - falls back to _read_main_provider() / _read_main_model() when no override is set, so audit logs reflect the user's active main provider/model - 'auto' / 'default' only when EVERYTHING is empty Live verified: zero-config call now records provider='openrouter', model='anthropic/claude-4.7-opus-20260416' instead of provider='auto', model='default'. 2. Hook-mode coverage — TestHookMode confirms ctx.llm.complete works from inside a registered post_tool_call callback. The docs page promised hook integration; now there's a test that exercises the lazy-import path through the real invoke_hook machinery. Two cases: traceback-rewrite hook with conditional ctx.llm.complete, and minimal hook regression for the sync-hook + sync-llm path. 3. Reference plugin moved out of core. plugins/plugin-llm-example/ is gone from hermes-agent — it now lives in the new NousResearch/hermes-example-plugins companion repo. The docs page links there. Hermes' bundled plugins should be plugins users actually run; reference / docs-companion plugins live externally. Test count: 56 (up from 49). Wider sweep on tests/hermes_cli/ + tests/gateway/ + tests/tools/ + tests/agent/ shows 16770 passing; the 12 failures are all pre-existing on origin/main (verified by stashing this branch's changes and re-running) — kanban-boards, delegate-task, gateway-restart, tts-routing — none touch the plugin_llm surface. * chore(plugins): move all example plugins to companion repo Reference / docs-companion plugins now live exclusively in NousResearch/hermes-example-plugins, not bundled with the core repo: - example-dashboard - strike-freedom-cockpit A new fourth example, plugin-llm-async-example, was added to that repo demonstrating ctx.llm's async surface (acomplete()) with asyncio.gather() — registers /translate <lang>: <text> which fires forward translation + sentiment classifier in parallel, then a back-translation for QA. Live-tested at 2.5s for three real provider round-trips (would be ~5-6s sequential). Docs updated: - developer-guide/plugin-llm-access.md links both sync and async examples in the Reference section - user-guide/features/extending-the-dashboard.md repoints both demo sections to the companion repo with corrected install paths - user-guide/features/built-in-plugins.md drops the two demo rows - AGENTS.md notes that example plugins live in the companion repo Net: hermes-agent's plugins/ directory now contains only plugins users actually run (memory providers, dashboard tabs that ship real features, the disk-cleanup hook, platform adapters). All four demo / reference plugins live externally where they can be cloned on demand instead of inflating the core install.		2026-05-10 07:09:28 -07:00
..
__init__.py	chore: release v0.13.0 (2026.5.7) (#21406 )	2026-05-07 09:22:48 -07:00
_parser.py	fix: add dashboard to CLI help epilogue and Docker CI smoke test	2026-05-07 06:16:23 -07:00
_subprocess_compat.py	feat(windows): close remaining POSIX-only landmines — TUI crash, kanban waitpid, AF_UNIX sandbox, /bin/bash, npm .cmd shims, cwd tracking, detach flags	2026-05-08 14:27:40 -07:00
auth.py	feat(cross-platform): psutil for PID/process management + Windows footgun checker	2026-05-08 14:27:40 -07:00
auth_commands.py	auth: use get_default_hermes_root() for shared nous_auth.json path	2026-05-08 14:27:40 -07:00
azure_detect.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 )	2026-04-28 06:46:45 -07:00
backup.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
banner.py	fix(banner): resolve update-check repo from running code, not profile-scoped path	2026-05-09 04:10:35 -07:00
browser_connect.py	fix(browser): address Copilot review on /browser connect	2026-04-28 22:11:10 -07:00
callbacks.py	fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902 )	2026-04-14 16:11:37 -07:00
checkpoints.py	feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709 )	2026-05-06 05:44:35 -07:00
claw.py	Merge origin/main and resolve conflict in nix/tui.nix	2026-05-07 22:56:19 +00:00
cli_output.py	refactor: remove dead code — 1,784 lines across 77 files (#9180 )	2026-04-13 16:32:04 -07:00
clipboard.py	feat: fix img pasting in new ink plus newline after tools	2026-04-11 13:14:32 -05:00
codex_models.py	docs(codex-spark): document ChatGPT Pro entitlement gating	2026-05-09 23:17:25 -07:00
colors.py	feat: respect NO_COLOR env var and TERM=dumb (#4079 )	2026-03-30 17:07:21 -07:00
commands.py	Merge pull request #20805 from NousResearch/austin-feat-sessions-skills-menu	2026-05-07 18:54:16 -04:00
completion.py	fix(completion): use valid zsh _arguments exclusion-group syntax	2026-05-09 13:36:44 -07:00
config.py	fix(terminal): bridge docker_env config to TERMINAL_DOCKER_ENV	2026-05-09 17:53:35 -07:00
copilot_auth.py	fix(oauth,gateway): monotonic deadlines for polling/timeout loops	2026-05-07 05:09:39 -07:00
cron.py	feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709 )	2026-05-04 12:31:01 -07:00
curator.py	feat(curator): show rename map in user-visible summary (#22910 )	2026-05-09 18:43:40 -07:00
curses_ui.py	fix: treat ctrl-c as curses cancel	2026-05-04 01:36:44 -07:00
debug.py	fix(debug): redact log content at upload time in hermes debug share	2026-05-03 11:42:20 -07:00
default_soul.py	fix: reset default SOUL.md to baseline identity text (#3159 )	2026-03-26 01:34:27 -07:00
dingtalk_auth.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010 )	2026-04-28 06:46:45 -07:00
doctor.py	fix(doctor): normalize provider name and aliases before dedicated-skip check	2026-05-09 13:36:33 -07:00
dump.py	refactor(env): use shared Hermes dotenv loader	2026-05-05 10:13:13 -07:00
env_loader.py	feat(cross-platform): psutil for PID/process management + Windows footgun checker	2026-05-08 14:27:40 -07:00
fallback_cmd.py	feat(cli): add 'hermes fallback' command to manage fallback providers (#16052 )	2026-04-26 06:19:04 -07:00
gateway.py	fix(gateway): detect gateway process via /proc in Docker without procps	2026-05-09 17:54:17 -07:00
gateway_windows.py	fix(gateway): preserve Ctrl+C for Windows foreground runs	2026-05-09 14:34:18 -07:00
goals.py	fix(goals): auto-pause when judge model returns unparseable output	2026-05-07 17:33:09 -07:00
hooks.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
kanban.py	fix(kanban): /kanban slash command emits argparse garbage instead of help	2026-05-09 22:49:29 -07:00
kanban_db.py	fix(kanban): make _migrate_add_optional_columns idempotent on concurrent open	2026-05-09 13:36:23 -07:00
kanban_diagnostics.py	fix(kanban): unify failure counter across spawn/timeout/crash outcomes (#20410 )	2026-05-05 13:55:37 -07:00
kanban_specify.py	feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks (#21435 )	2026-05-07 13:04:41 -07:00
logs.py	feat: component-separated logging with session context and filtering (#7991 )	2026-04-11 17:23:36 -07:00
main.py	feat(curator): show rename map in user-visible summary (#22910 )	2026-05-09 18:43:40 -07:00
mcp_config.py	feat(mcp): add codex preset for built-in MCP server discovery	2026-05-09 11:11:28 -07:00
memory_setup.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
model_catalog.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
model_normalize.py	fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802 )	2026-05-06 09:08:33 -07:00
model_switch.py	docs(codex-spark): document ChatGPT Pro entitlement gating	2026-05-09 23:17:25 -07:00
models.py	chore(models): refresh OpenRouter + Nous fallback lists (#23001 )	2026-05-09 22:47:38 -07:00
nous_subscription.py	feat(web): add SearXNG as a native search-only backend	2026-05-06 10:05:29 -07:00
oneshot.py	fix: make session search initialize session db	2026-05-09 14:36:58 -07:00
pairing.py	fix(pairing): enforce lockout on approve_code, not just generate_code (#10195 ) (#21325 )	2026-05-07 07:18:21 -07:00
platforms.py	feat: complete plugin platform parity — all 12 integration points	2026-04-29 21:56:51 -07:00
plugins.py	feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194 )	2026-05-10 07:09:28 -07:00
plugins_cmd.py	fix(plugins): resolve Git binary for installs under minimal PATH	2026-05-09 11:10:04 -07:00
profile_distribution.py	feat(profile): shareable profile distributions via git (#20831 )	2026-05-08 10:04:32 -07:00
profiles.py	fix(profiles): exclude infrastructure artifacts when cloning with --clone-all	2026-05-09 04:10:35 -07:00
providers.py	fix: prevent bare 'custom' slug in model.provider (#17478 )	2026-04-30 04:32:11 -07:00
pt_input_extras.py	fix(cli): make Ctrl+Enter insert newline on WSL/SSH/Windows Terminal (#22777 )	2026-05-09 12:48:14 -07:00
pty_bridge.py	feat(cross-platform): psutil for PID/process management + Windows footgun checker	2026-05-08 14:27:40 -07:00
relaunch.py	fix(windows): prefer npm.cmd over npm.ps1, skip .py argv0 in relaunch	2026-05-08 14:27:40 -07:00
runtime_provider.py	fix: use credential_pool for custom endpoint model listing probes	2026-05-09 17:54:58 -07:00
setup.py	fix(setup): offer gateway service install on Windows (#22099 )	2026-05-08 14:59:59 -07:00
skills_config.py	refactor(config): migrate remaining 33 cfg_get call sites (#17311 )	2026-04-29 04:03:03 -07:00
skills_hub.py	codebase: add encoding='utf-8' to all bare open() calls (PLW1514)	2026-05-08 14:27:40 -07:00
skin_engine.py	fix(tui): honor skin highlight colors (#20895 )	2026-05-06 14:01:56 -07:00
slack_cli.py	fix(slack): enable writable app home DMs in manifest	2026-05-08 17:01:12 -07:00
status.py	fix(status): add missing popular provider API keys to hermes status display	2026-05-04 05:14:13 -07:00
stdio.py	fix(windows): quote cache paths in bash + augment PATH so rg/bash resolve on first launch	2026-05-08 14:27:40 -07:00
timeouts.py	refactor(timeouts): drop redundant ImportError in except clause	2026-04-26 20:48:20 -07:00
tips.py	feat: Ctrl+Enter inserts newline on Windows Terminal	2026-05-08 14:27:40 -07:00
tools_config.py	fix(tools): install cua-driver when Computer Use is enabled via 'hermes tools' (#22765 )	2026-05-09 13:02:25 -07:00
uninstall.py	feat(windows uninstall): clean up User env, PATH, Scheduled Task, and portable tooling	2026-05-08 14:27:40 -07:00
vercel_auth.py	feat: add Vercel Sandbox backend	2026-04-29 07:22:33 -07:00
voice.py	fix(tui): restore voice push-to-talk parity (#20897 )	2026-05-06 15:49:59 -07:00
web_server.py	fix(security): require dashboard auth for plugin API routes	2026-05-10 07:04:18 -07:00
webhook.py	refactor(config): migrate remaining 33 cfg_get call sites (#17311 )	2026-04-29 04:03:03 -07:00