hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-26 17:38:36 +00:00


Teknium ff9752410a feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)

* feat(plugins): pluggable image_gen backends + OpenAI provider

Adds an ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically:

- `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load, with no `plugins.enabled` incantation. User-installed backends stay opt-in.
- Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide.
- Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own).

Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through.

FAL stays in-tree for this PR; a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched.

- 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes)
- E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured

* fix(image_gen/openai): don't send response_format to gpt-image-*

The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken.
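The path-derived keys and depth-2 recursion described in the first change above can be illustrated with a minimal sketch. This is not the project's scanner: `discover_plugin_keys` and `MANIFEST` are hypothetical names, and the only facts taken from the commit message are the `plugin.yaml` manifest filename, the `category/name` key shape, and the rule that a parent directory without its own manifest is treated as a category namespace.

```python
from pathlib import Path
from typing import List

MANIFEST = "plugin.yaml"  # manifest filename as named in the commit message


def discover_plugin_keys(root: Path) -> List[str]:
    """Return path-derived keys for plugins found under ``root``.

    Hypothetical helper (an assumption, not the real scanner): a directory
    that holds a manifest is a plugin keyed by its own name; a directory
    without one is treated as a category namespace and searched exactly one
    level deeper, which gives keys such as ``image_gen/openai``.
    """
    keys: List[str] = []
    for child in sorted(p for p in root.iterdir() if p.is_dir()):
        if (child / MANIFEST).is_file():
            # Top-level plugin: the key is just the directory name.
            keys.append(child.name)
            continue
        # No manifest here, so treat it as a category and look one level down.
        for sub in sorted(p for p in child.iterdir() if p.is_dir()):
            if (sub / MANIFEST).is_file():
                keys.append(f"{child.name}/{sub.name}")
    return keys
```

Because the category name is part of the key, `image_gen/openai` and `tts/openai` stay distinct even though both leaf directories are called `openai`. Directories nested deeper than two levels are ignored in this sketch.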

* feat(image_gen/openai): gpt-image-2 only, drop legacy catalog

gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it: slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model.

Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache.

* feat(image_gen/openai): expose gpt-image-2 as three quality tiers

Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter:

    gpt-image-2-low      ~15s    fast iteration
    gpt-image-2-medium   ~40s    default
    gpt-image-2-high     ~2min   highest fidelity

Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt.

Config:

    image_gen.openai.model: gpt-image-2-high
    # or
    image_gen.model: gpt-image-2-low
    # or env var for scripts/tests
    OPENAI_IMAGE_MODEL=gpt-image-2-medium

Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match.

* feat(tools_config): plugin image_gen providers inject themselves into picker

'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added.

Mechanism:

- ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name.
- tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider.
- _visible_providers() appends those rows when rendering the Image Generation category.
- _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker.
- _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available().

FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows. When it gets ported to a plugin in a follow-up PR, the hardcoded rows go away and it surfaces through the same path as OpenAI.

Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker.

* fix(image_gen): close final gaps for plugin-backend parity with FAL

Two small places that still hardcoded FAL:

- hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available(), or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured.
- image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote it provider-neutral ('backend and model are user-configured') and it now notes that the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().

2026-04-21 21:30:10 -07:00
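The quality-tier change above maps three picker IDs onto one API model plus a quality setting. A minimal sketch of that resolution follows, under stated assumptions: `pick_tier` and `resolve_tier` are hypothetical names, the quality strings `low`/`medium`/`high` are assumed from the tier suffixes, and the lookup order (env var, then `image_gen.openai.model`, then `image_gen.model`) is a guess, since the commit message lists the three sources without saying which one wins.

```python
import os
from typing import Any, Mapping, Optional, Tuple

API_MODEL = "gpt-image-2"            # single underlying API model (from the commit message)
DEFAULT_TIER = "gpt-image-2-medium"  # the commit message marks medium as the default

# Tier ID -> quality value. The value strings are an assumption.
TIER_QUALITY = {
    "gpt-image-2-low": "low",
    "gpt-image-2-medium": "medium",
    "gpt-image-2-high": "high",
}


def pick_tier(config: Mapping[str, Any],
              env: Optional[Mapping[str, str]] = None) -> str:
    """Choose a tier ID from the environment or config, else the default.

    Assumed precedence: OPENAI_IMAGE_MODEL, then image_gen.openai.model,
    then image_gen.model. Unknown or missing values are skipped.
    """
    env = os.environ if env is None else env
    image_gen = config.get("image_gen") or {}
    openai_cfg = image_gen.get("openai") or {}
    candidates = (
        env.get("OPENAI_IMAGE_MODEL"),
        openai_cfg.get("model"),
        image_gen.get("model"),
    )
    for candidate in candidates:
        if candidate in TIER_QUALITY:
            return candidate
    return DEFAULT_TIER


def resolve_tier(tier_id: str) -> Tuple[str, str]:
    """Translate a tier ID into (api_model, quality)."""
    return API_MODEL, TIER_QUALITY[tier_id]
```

For example, `resolve_tier(pick_tier({"image_gen": {"model": "gpt-image-2-low"}}, env={}))` yields the API model `gpt-image-2` with quality `low`. The design keeps the tier choice in the model picker, so no separate quality setting is needed.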
browser_providers	feat: ungate Tool Gateway — subscription-based access with per-tool opt-in	2026-04-16 12:36:49 -07:00
environments	feat(skills+terminal): make bundled skill scripts runnable out of the box (#13384)	2026-04-21 00:39:19 -07:00
neutts_samples	refactor(tts): replace NeuTTS optional skill with built-in provider + setup flow	2026-03-17 02:33:12 -07:00
__init__.py	Merge branch 'main' into rewbs/tool-use-charge-to-subscription	2026-03-31 08:48:54 +09:00
ansi_strip.py	fix: strip ANSI at the source — clean terminal output before it reaches the model	2026-03-23 07:43:12 -07:00
approval.py	feat: configurable approval mode for cron jobs (approvals.cron_mode)	2026-04-18 19:24:35 -07:00
binary_extensions.py	fix(tools): address PR review — remove _extract_raw_output, BudgetConfig everywhere, read_file hardening	2026-04-08 02:24:32 -07:00
browser_camofox.py	refactor: remove remaining redundant local imports (comprehensive sweep)	2026-04-21 00:50:58 -07:00
browser_camofox_state.py	feat(browser): add persistent Camofox sessions and VNC URL discovery (salvage #4400) (#4419)	2026-04-01 04:18:50 -07:00
browser_cdp_tool.py	feat(browser): add browser_cdp raw DevTools Protocol passthrough (#12369)	2026-04-19 00:03:10 -07:00
browser_tool.py	refactor: remove redundant local imports already available at module level	2026-04-21 00:50:58 -07:00
budget_config.py	fix: preserve existing thresholds, remove pre-read byte guard	2026-04-08 02:24:32 -07:00
checkpoint_manager.py	refactor: remove redundant local imports already available at module level	2026-04-21 00:50:58 -07:00
clarify_tool.py	refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites	2026-04-07 13:36:38 -07:00
code_execution_tool.py	feat(execute_code): add project/strict execution modes, default to project (#11971)	2026-04-18 01:46:25 -07:00
credential_files.py	refactor: extract shared helpers to deduplicate repeated code patterns (#7917)	2026-04-11 13:59:52 -07:00
cronjob_tools.py	fix: replace hardcoded ~/.hermes with display_hermes_home() in agent-facing text (#10285)	2026-04-15 04:57:55 -07:00
debug_helpers.py	refactor: codebase-wide lint cleanup - unused imports, dead code, and inefficient patterns (#5821)	2026-04-07 10:25:31 -07:00
delegate_tool.py	fix(delegation): add hard timeout and stale detection for subagent execution (#13770)	2026-04-21 20:20:16 -07:00
discord_tool.py	feat: add Discord server introspection and management tool (#4753)	2026-04-19 11:52:19 -07:00
env_passthrough.py	fix(env_passthrough): reject Hermes provider credentials from skill passthrough (#13523)	2026-04-21 06:14:25 -07:00
feishu_doc_tool.py	fix(feishu-comment): use get_hermes_home(); drop dead asyncio wrapper; AUTHOR_MAP	2026-04-17 19:04:11 -07:00
feishu_drive_tool.py	fix(feishu-comment): use get_hermes_home(); drop dead asyncio wrapper; AUTHOR_MAP	2026-04-17 19:04:11 -07:00
file_operations.py	fix(patch): gate 'did you mean?' to no-match + extend to v4a/skill_manage	2026-04-21 02:03:46 -07:00
file_state.py	feat(delegate): cross-agent file state coordination for concurrent subagents (#13718)	2026-04-21 16:41:26 -07:00
file_tools.py	feat(delegate): cross-agent file state coordination for concurrent subagents (#13718)	2026-04-21 16:41:26 -07:00
fuzzy_match.py	fix(patch): gate 'did you mean?' to no-match + extend to v4a/skill_manage	2026-04-21 02:03:46 -07:00
homeassistant_tool.py	fix: clean up description escaping, add string-data tests	2026-04-13 04:45:07 -07:00
image_generation_tool.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)	2026-04-21 21:30:10 -07:00
interrupt.py	fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace (#11907)	2026-04-17 20:39:25 -07:00
managed_tool_gateway.py	fix(tools): add debug logging for token refresh and tighten domain check	2026-04-02 12:40:03 +11:00
mcp_oauth.py	fix(mcp-oauth): bidirectional auth_flow bridge + absolute expires_at (salvage #12025) (#12717)	2026-04-19 16:31:07 -07:00
mcp_oauth_manager.py	fix(mcp-oauth): bidirectional auth_flow bridge + absolute expires_at (salvage #12025) (#12717)	2026-04-19 16:31:07 -07:00
mcp_tool.py	fix(mcp): reset circuit breaker on successful OAuth reconnect	2026-04-21 05:19:03 -07:00
memory_tool.py	fix: nest msvcrt import inside fcntl except block	2026-04-14 10:18:05 -07:00
mixture_of_agents_tool.py	refactor: remove dead code - 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
neutts_synth.py	fix(tts): document NeuTTS provider and align install guidance (#1903)	2026-03-18 02:55:30 -07:00
openrouter_client.py	refactor: route ad-hoc LLM consumers through centralized provider router	2026-03-11 20:02:36 -07:00
osv_check.py	feat: OSV malware check for MCP extension packages (#5305)	2026-04-05 12:46:07 -07:00
patch_parser.py	fix(patch): gate 'did you mean?' to no-match + extend to v4a/skill_manage	2026-04-21 02:03:46 -07:00
path_security.py	refactor: extract shared helpers to deduplicate repeated code patterns (#7917)	2026-04-11 13:59:52 -07:00
process_registry.py	refactor: remove redundant local imports already available at module level	2026-04-21 00:50:58 -07:00
registry.py	fix: tighten AST check to module-level only	2026-04-14 21:12:29 -07:00
rl_training_tool.py	refactor: codebase-wide lint cleanup - unused imports, dead code, and inefficient patterns (#5821)	2026-04-07 10:25:31 -07:00
send_message_tool.py	refactor: remove remaining redundant local imports (comprehensive sweep)	2026-04-21 00:50:58 -07:00
session_search_tool.py	fix(aux): add session_search extra_body and concurrency controls	2026-04-20 00:47:39 -07:00
skill_manager_tool.py	fix(patch): gate 'did you mean?' to no-match + extend to v4a/skill_manage	2026-04-21 02:03:46 -07:00
skills_guard.py	refactor: remove dead code - 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
skills_hub.py	feat(skills): centralized skills index — eliminate GitHub API calls for search/install	2026-04-12 16:39:04 -07:00
skills_sync.py	feat(skills): add 'hermes skills reset' to un-stick bundled skills (#11468)	2026-04-17 00:41:31 -07:00
skills_tool.py	fix(skills): respect HERMES_SESSION_PLATFORM in _is_skill_disabled	2026-04-21 05:42:32 -07:00
terminal_tool.py	fix(acp): wire approval callback + make it thread-local (#13525)	2026-04-21 06:20:40 -07:00
tirith_security.py	fix: handle cross-device shutil.move failure in tirith auto-install (#10127) (#10524)	2026-04-15 14:50:07 -07:00
todo_tool.py	fix(tools): enforce ID uniqueness in TODO store during replace operations	2026-04-11 16:22:50 -07:00
tool_backend_helpers.py	fix(fal): extend whitespace-only FAL_KEY handling to all call sites	2026-04-21 02:04:21 -07:00
tool_result_storage.py	fix(tools): neutralize shell injection in _write_to_sandbox via path quoting (#7940)	2026-04-11 14:26:11 -07:00
transcription_tools.py	fix(stt): map cloud-only model names to valid local size for faster-whisper (#2544)	2026-04-20 05:18:48 -07:00
tts_tool.py	fix(tts): use per-provider input-character caps instead of global 4000 (#13743)	2026-04-21 17:49:39 -07:00
url_safety.py	fix: allow trusted QQ CDN benchmark IP resolution	2026-04-17 04:22:40 -07:00
vision_tools.py	fix: vision tool respects auxiliary.vision.temperature from config (#4661)	2026-04-20 00:32:09 -07:00
voice_mode.py	fix: point optional-dep install hints at the venv's python (#11938)	2026-04-17 21:16:33 -07:00
web_tools.py	feat: ungate Tool Gateway — subscription-based access with per-tool opt-in	2026-04-16 12:36:49 -07:00
website_policy.py	refactor: codebase-wide lint cleanup - unused imports, dead code, and inefficient patterns (#5821)	2026-04-07 10:25:31 -07:00
xai_http.py	feat(xai): upgrade to Responses API, add TTS provider	2026-04-16 02:24:08 -07:00