hermes-agent/tools
Teknium 01906e99dd
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265)
* feat(image_gen): multi-model FAL support with picker in hermes tools

Adds 8 FAL text-to-image models selectable via `hermes tools` →
Image Generation → (FAL.ai | Nous Subscription) → model picker.

Models supported:
- fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP)
- fal-ai/flux-2-pro (previous default, kept backward-compat upscaling)
- fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN)
- fal-ai/nano-banana (Gemini 2.5 Flash Image)
- fal-ai/gpt-image-1.5 (with quality tier: low/medium/high)
- fal-ai/ideogram/v3 (best typography)
- fal-ai/recraft-v3 (vector, brand styles)
- fal-ai/qwen-image (LLM-based)

Architecture:
- FAL_MODELS catalog declares per-model size family, defaults, supports
  whitelist, and upscale flag. Three size families handled uniformly:
  image_size_preset (flux family), aspect_ratio (nano-banana), and
  gpt_literal (gpt-image-1.5).
- _build_fal_payload() translates unified inputs (prompt + aspect_ratio)
  into model-specific payloads, merges defaults, applies caller overrides,
  wires GPT quality_setting, then filters to the supports whitelist — so
  models never receive rejected keys.
- IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen
  providers (Replicate, Stability, etc.); each provider entry tags itself
  with imagegen_backend: 'fal' to select the right catalog.
- Upscaler (Clarity) defaults off for new models (preserves <1s value
  prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS.

Config:
  image_gen.model           = fal-ai/flux-2/klein/9b  (new)
  image_gen.quality_setting = medium                  (new, GPT only)
  image_gen.use_gateway     = bool                    (existing)

Agent-facing schema unchanged (prompt + aspect_ratio only) — model
choice is a user-level config decision, not an agent-level arg.

Picker uses curses_radiolist (arrow keys, auto numbered-fallback on
non-TTY). Column-aligned: Model / Speed / Strengths / Price.

Docs: image-generation.md rewritten with the model table and picker
walkthrough. tools-reference, tool-gateway, overview updated to drop
the stale "FLUX 2 Pro" wording.

Tests: 42 new in tests/tools/test_image_generation.py covering catalog
integrity, all 3 size families, supports filter, default merging, GPT
quality wiring, model resolution fallback. 8 new in
tests/hermes_cli/test_tools_config.py for picker wiring (registry,
config writes, GPT quality follow-up prompt, corrupt-config repair).

* feat(image_gen): translate managed-gateway 4xx to actionable error

When the Nous Subscription managed FAL proxy rejects a model with 4xx
(likely portal-side allowlist miss or billing gate), surface a clear
message explaining:
  1. The rejected model ID + HTTP status
  2. Two remediation paths: set FAL_KEY for direct access, or
     pick a different model via `hermes tools`

5xx, connection errors, and direct-FAL errors pass through unchanged
(those have different root causes and reasonable native messages).

Motivation: new FAL models added to this release (flux-2-klein-9b,
z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3,
qwen-image) are untested against the Nous Portal proxy. If the portal
allowlists model IDs, users on Nous Subscription will hit cryptic
4xx errors without guidance on how to work around it.

Tests: 8 new cases covering status extraction across httpx/fal error
shapes and 4xx-vs-5xx-vs-ConnectionError translation policy.

Docs: brief note in image-generation.md for Nous subscribers.

Operator action (Nous Portal side): verify that fal-queue-gateway
passes through these 7 new FAL model IDs. If the proxy has an
allowlist, add them; otherwise Nous Subscription users will see the
new translated error and fall back to direct FAL.

* feat(image_gen): pin GPT-Image quality to medium (no user choice)

Previously the tools picker asked a follow-up question for GPT-Image
quality tier (low / medium / high) and persisted the answer to
`image_gen.quality_setting`. This created two problems:

1. Nous Portal billing complexity — the 22x cost spread between tiers
   ($0.009 low / $0.20 high) forces the gateway to meter per-tier per
   user, which the portal team can't easily support at launch.
2. User footgun — anyone picking `high` by mistake burns through
   credit ~6x faster than `medium`.

This commit pins quality at medium by baking it into FAL_MODELS
defaults for gpt-image-1.5 and removes all user-facing override paths:

- Removed `_resolve_gpt_quality()` runtime lookup
- Removed `honors_quality_setting` flag on the model entry
- Removed `_configure_gpt_quality_setting()` picker helper
- Removed `_GPT_QUALITY_CHOICES` constant
- Removed the follow-up prompt call in `_configure_imagegen_model()`
- Even if a user manually edits `image_gen.quality_setting` in
  config.yaml, no code path reads it — always sends medium.

Tests:
- Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium
  (5 tests) — proves medium is baked in, config is ignored, flag is
  removed, helper is removed, non-gpt models never get quality.
- Replaced test_picker_with_gpt_image_also_prompts_quality with
  test_picker_with_gpt_image_does_not_prompt_quality — proves only 1
  picker call fires when gpt-image is selected (no quality follow-up).

Docs updated: image-generation.md replaces the quality-tier table
with a short note explaining the pinning decision.

* docs(image_gen): drop stale 'wires GPT quality tier' line from internals section

Caught in a cleanup sweep after pinning quality to medium. The
"How It Works Internally" walkthrough still described the removed
quality-wiring step.
2026-04-16 20:19:53 -07:00
..
browser_providers feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
environments fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards 2026-04-16 19:39:21 -07:00
neutts_samples refactor(tts): replace NeuTTS optional skill with built-in provider + setup flow 2026-03-17 02:33:12 -07:00
__init__.py Merge branch 'main' into rewbs/tool-use-charge-to-subscription 2026-03-31 08:48:54 +09:00
ansi_strip.py fix: strip ANSI at the source — clean terminal output before it reaches the model 2026-03-23 07:43:12 -07:00
approval.py fix(approval): heartbeat activity during gateway approval wait (#11245) 2026-04-16 14:48:50 -07:00
binary_extensions.py fix(tools): address PR review — remove _extract_raw_output, BudgetConfig everywhere, read_file hardening 2026-04-08 02:24:32 -07:00
browser_camofox.py fix: /browser connect CDP override now takes priority over Camofox (#10523) 2026-04-15 14:11:18 -07:00
browser_camofox_state.py feat(browser): add persistent Camofox sessions and VNC URL discovery (salvage #4400) (#4419) 2026-04-01 04:18:50 -07:00
browser_tool.py fix(browser): runtime fallback to local Chromium when cloud provider fails 2026-04-16 04:19:34 -07:00
budget_config.py fix: preserve existing thresholds, remove pre-read byte guard 2026-04-08 02:24:32 -07:00
checkpoint_manager.py fix(checkpoints): isolate shadow git repo from user's global config (#11261) 2026-04-16 16:06:49 -07:00
clarify_tool.py refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites 2026-04-07 13:36:38 -07:00
code_execution_tool.py fix: follow-up for salvaged PR #10854 2026-04-16 06:42:45 -07:00
credential_files.py refactor: extract shared helpers to deduplicate repeated code patterns (#7917) 2026-04-11 13:59:52 -07:00
cronjob_tools.py fix: replace hardcoded ~/.hermes with display_hermes_home() in agent-facing text (#10285) 2026-04-15 04:57:55 -07:00
debug_helpers.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
delegate_tool.py fix: UnboundLocalError on 'entry' in parallel subagent polling loop (#11050) 2026-04-16 06:53:44 -07:00
env_passthrough.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
file_operations.py feat: improve file search UX — fuzzy @ completions, mtime sorting, better suggestions (#9467) 2026-04-13 23:54:45 -07:00
file_tools.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
fuzzy_match.py fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs 2026-04-10 16:47:44 -07:00
homeassistant_tool.py fix: clean up description escaping, add string-data tests 2026-04-13 04:45:07 -07:00
image_generation_tool.py feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) 2026-04-16 20:19:53 -07:00
interrupt.py fix: scope tool interrupt signal per-thread to prevent cross-session leaks (#7930) 2026-04-11 14:02:58 -07:00
managed_tool_gateway.py fix(tools): add debug logging for token refresh and tighten domain check 2026-04-02 12:40:03 +11:00
mcp_oauth.py fix(approval,mcp): log silent exception handlers, narrow OAuth catches, close server on error 2026-04-10 03:05:04 -07:00
mcp_tool.py fix: add circuit breaker to MCP tool handler to prevent retry burn loops (#10447) (#10776) 2026-04-15 22:33:48 -07:00
memory_tool.py fix: nest msvcrt import inside fcntl except block 2026-04-14 10:18:05 -07:00
mixture_of_agents_tool.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
neutts_synth.py fix(tts): document NeuTTS provider and align install guidance (#1903) 2026-03-18 02:55:30 -07:00
openrouter_client.py refactor: route ad-hoc LLM consumers through centralized provider router 2026-03-11 20:02:36 -07:00
osv_check.py feat: OSV malware check for MCP extension packages (#5305) 2026-04-05 12:46:07 -07:00
patch_parser.py fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs 2026-04-10 16:47:44 -07:00
path_security.py refactor: extract shared helpers to deduplicate repeated code patterns (#7917) 2026-04-11 13:59:52 -07:00
process_registry.py fix: prevent agent hang when backgrounding processes via terminal tool 2026-04-15 17:26:31 -07:00
registry.py fix: tighten AST check to module-level only 2026-04-14 21:12:29 -07:00
rl_training_tool.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
send_message_tool.py retry transient telegram send failures 2026-04-16 03:47:00 -07:00
session_search_tool.py fix(session_search): coerce limit to int to prevent TypeError with non-int values (#10522) 2026-04-15 14:11:05 -07:00
skill_manager_tool.py fix: five HERMES_HOME profile-isolation leaks (#10570) 2026-04-15 17:09:41 -07:00
skills_guard.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
skills_hub.py feat(skills): centralized skills index — eliminate GitHub API calls for search/install 2026-04-12 16:39:04 -07:00
skills_sync.py fix(skills): read name from SKILL.md frontmatter in skills_sync 2026-04-11 01:21:20 -07:00
skills_tool.py fix: use absolute skill_dir for external skills (#10313) (#10587) 2026-04-15 17:22:55 -07:00
terminal_tool.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
tirith_security.py fix: handle cross-device shutil.move failure in tirith auto-install (#10127) (#10524) 2026-04-15 14:50:07 -07:00
todo_tool.py fix(tools): enforce ID uniqueness in TODO store during replace operations 2026-04-11 16:22:50 -07:00
tool_backend_helpers.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
tool_result_storage.py fix(tools): neutralize shell injection in _write_to_sandbox via path quoting (#7940) 2026-04-11 14:26:11 -07:00
transcription_tools.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
tts_tool.py feat(tts): add Google Gemini TTS provider (#11229) 2026-04-16 14:23:16 -07:00
url_safety.py fix: make safe_url_for_log public, add SSRF redirect guards to base.py cache helpers 2026-04-10 05:04:28 -07:00
vision_tools.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
voice_mode.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
web_tools.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
website_policy.py refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821) 2026-04-07 10:25:31 -07:00
xai_http.py feat(xai): upgrade to Responses API, add TTS provider 2026-04-16 02:24:08 -07:00