hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-02 02:01:47 +00:00

Teknium 01906e99dd feat(image_gen): multi-model FAL support with picker in hermes tools (#11265)

* feat(image_gen): multi-model FAL support with picker in hermes tools

Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker.

Models supported:
- fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP)
- fal-ai/flux-2-pro (previous default, kept backward-compat upscaling)
- fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN)
- fal-ai/nano-banana (Gemini 2.5 Flash Image)
- fal-ai/gpt-image-1.5 (with quality tier: low/medium/high)
- fal-ai/ideogram/v3 (best typography)
- fal-ai/recraft-v3 (vector, brand styles)
- fal-ai/qwen-image (LLM-based)

Architecture:
- FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5).
- _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist, so models never receive rejected keys.
- IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog.
- Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS.

Config:
  image_gen.model = fal-ai/flux-2/klein/9b (new)
  image_gen.quality_setting = medium (new, GPT only)
  image_gen.use_gateway = bool (existing)

Agent-facing schema unchanged (prompt + aspect_ratio only); model choice is a user-level config decision, not an agent-level arg.

Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price.

Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording.

Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair).

* feat(image_gen): translate managed-gateway 4xx to actionable error

When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining:
1. The rejected model ID + HTTP status
2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools`

5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages).

Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it.

Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy.

Docs: brief note in image-generation.md for Nous subscribers.

Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL.

* feat(image_gen): pin GPT-Image quality to medium (no user choice)

Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`.

This created two problems:
1. Nous Portal billing complexity: the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch.
2. User footgun: anyone picking `high` by mistake burns through credit ~6x faster than `medium`.

This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths:
- Removed `_resolve_gpt_quality()` runtime lookup
- Removed `honors_quality_setting` flag on the model entry
- Removed `_configure_gpt_quality_setting()` picker helper
- Removed `_GPT_QUALITY_CHOICES` constant
- Removed the follow-up prompt call in `_configure_imagegen_model()`
- Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it; medium is always sent.

Tests:
- Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests): proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality.
- Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality: proves only 1 picker call fires when gpt-image is selected (no quality follow-up).

Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision.

* docs(image_gen): drop stale 'wires GPT quality tier' line from internals section

Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.

2026-04-16 20:19:53 -07:00
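
The payload-building steps described in the commit message (translate the unified inputs, merge defaults, apply overrides, filter to the supports whitelist) can be sketched roughly as follows. This is a minimal illustration, not the repository's `_build_fal_payload()`: the catalog field names, the translation tables, the parameter keys, and the function name `build_payload_sketch` are all assumptions made for the example, and the merge order is one plausible reading of the description.

```python
from typing import Any, Optional

# Hypothetical catalog. The real FAL_MODELS entries, field names, and
# per-model parameter keys may differ; these values are illustrative only.
SKETCH_MODELS: dict[str, dict[str, Any]] = {
    "fal-ai/flux-2/klein/9b": {
        "size_family": "image_size_preset",
        "defaults": {"num_images": 1},
        "supports": {"prompt", "image_size", "num_images", "seed"},
    },
    "fal-ai/nano-banana": {
        "size_family": "aspect_ratio",
        "defaults": {"num_images": 1},
        "supports": {"prompt", "aspect_ratio", "num_images"},
    },
    "fal-ai/gpt-image-1.5": {
        "size_family": "gpt_literal",
        "defaults": {"quality": "medium"},  # quality pinned via defaults
        "supports": {"prompt", "image_size", "quality"},
    },
}

# Assumed translation tables, one per size family.
PRESETS = {"landscape": "landscape_16_9", "square": "square_hd", "portrait": "portrait_16_9"}
RATIOS = {"landscape": "16:9", "square": "1:1", "portrait": "9:16"}
GPT_SIZES = {"landscape": "1536x1024", "square": "1024x1024", "portrait": "1024x1536"}


def build_payload_sketch(
    model_id: str,
    prompt: str,
    aspect_ratio: str = "landscape",
    overrides: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Translate (prompt, aspect_ratio) into a model-specific payload."""
    entry = SKETCH_MODELS[model_id]
    translated: dict[str, Any] = {"prompt": prompt}

    # Step 1: express the unified aspect_ratio in the model's size family.
    family = entry["size_family"]
    if family == "image_size_preset":
        translated["image_size"] = PRESETS[aspect_ratio]
    elif family == "aspect_ratio":
        translated["aspect_ratio"] = RATIOS[aspect_ratio]
    elif family == "gpt_literal":
        translated["image_size"] = GPT_SIZES[aspect_ratio]
    else:
        raise ValueError(f"unknown size family: {family}")

    # Steps 2 and 3: defaults first, then translated inputs, then overrides.
    merged = {**entry["defaults"], **translated, **(overrides or {})}

    # Step 4: drop any key the model does not list as supported.
    return {k: v for k, v in merged.items() if k in entry["supports"]}
```

Filtering last means an override for a key the model does not accept (for example `seed` on the hypothetical nano-banana entry) is silently dropped rather than sent upstream, which matches the stated goal that models never receive rejected keys.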
__init__.py	test: reorganize test structure and add missing unit tests	2026-02-26 03:20:08 +03:00
test_ansi_strip.py	fix: strip ANSI at the source — clean terminal output before it reaches the model	2026-03-23 07:43:12 -07:00
test_approval.py	fix: block agent from self-destructing gateway via terminal (#6666)	2026-04-14 15:43:31 -07:00
test_approval_heartbeat.py	fix(approval): heartbeat activity during gateway approval wait (#11245)	2026-04-16 14:48:50 -07:00
test_base_environment.py	feat(environments): unified spawn-per-call execution layer	2026-04-08 17:23:15 -07:00
test_browser_camofox.py	fix: /browser connect CDP override now takes priority over Camofox (#10523)	2026-04-15 14:11:18 -07:00
test_browser_camofox_persistence.py	docs: remove nonexistent CAMOFOX_PROFILE_DIR env var references (#10976)	2026-04-16 04:07:11 -07:00
test_browser_camofox_state.py	feat(discord): add channel_prompts config	2026-04-15 16:31:28 -07:00
test_browser_cdp_override.py	feat: switch managed browser provider from Browserbase to Browser Use (#5750)	2026-04-07 08:40:22 -04:00
test_browser_cleanup.py	fix(doctor): only check the active memory provider, not all providers unconditionally (#6285)	2026-04-08 13:44:58 -07:00
test_browser_cloud_fallback.py	fix(browser): runtime fallback to local Chromium when cloud provider fails	2026-04-16 04:19:34 -07:00
test_browser_console.py	fix: add browser_console to browser toolset and core tools list (#1084)	2026-03-17 02:02:57 -07:00
test_browser_content_none_guard.py	fix(browser): guard LLM response content against None in snapshot and vision (#3642)	2026-03-28 17:25:04 -07:00
test_browser_hardening.py	fix(browser): hardening — dead code, caching, scroll perf, security, thread safety	2026-04-10 13:05:44 -07:00
test_browser_homebrew_paths.py	fix(browser): add termux PATH fallbacks	2026-04-14 16:55:55 -07:00
test_browser_orphan_reaper.py	fix: reap orphaned browser sessions on startup (#7931)	2026-04-11 14:02:46 -07:00
test_browser_secret_exfil.py	fix: rewrite test mock secrets and add redaction fixture	2026-04-01 12:03:56 -07:00
test_browser_ssrf_local.py	fix(browser): skip SSRF check for local backends (Camofox, headless Chromium) (#4292)	2026-03-31 10:40:13 -07:00
test_budget_config.py	test(tools): add unit tests for budget_config module	2026-04-11 02:58:48 -07:00
test_checkpoint_manager.py	fix(checkpoints): isolate shadow git repo from user's global config (#11261)	2026-04-16 16:06:49 -07:00
test_clarify_tool.py	test(tools): add unit tests for clarify_tool.py	2026-02-27 03:29:26 -05:00
test_clipboard.py	feat(gateway): WSL-aware gateway with smart systemd detection (#7510)	2026-04-10 21:15:47 -07:00
test_code_execution.py	fix: follow-up for salvaged PR #10854	2026-04-16 06:42:45 -07:00
test_command_guards.py	fix: remove 115 verified dead code symbols across 46 production files	2026-04-10 03:44:43 -07:00
test_config_null_guard.py	fix: guard config.get() against YAML null values to prevent AttributeError (#3377)	2026-03-27 04:03:00 -07:00
test_credential_files.py	fix: remove 115 verified dead code symbols across 46 production files	2026-04-10 03:44:43 -07:00
test_cron_prompt_injection.py	fix: cron prompt injection scanner bypass for multi-word variants	2026-02-26 13:55:54 +03:00
test_cronjob_tools.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_daytona_environment.py	fix: update tests for unified spawn-per-call execution model	2026-04-08 17:23:15 -07:00
test_debug_helpers.py	fix(tests): isolate HERMES_HOME in tests and adjust log directory for debug session	2026-03-02 04:34:21 -08:00
test_delegate.py	feat(delegation): add configurable reasoning_effort for subagents	2026-04-10 21:16:53 -07:00
test_delegate_toolset_scope.py	fix(security): restrict subagent toolsets to parent's enabled set (#3269)	2026-03-26 14:50:26 -07:00
test_docker_environment.py	fix(tests): fix several failing/flaky tests on main (#6777)	2026-04-09 13:17:06 -07:00
test_docker_find.py	feat: entry-level Podman support — find_docker() + rootless entrypoint (#10066)	2026-04-14 21:20:37 -07:00
test_env_passthrough.py	fix: remove 115 verified dead code symbols across 46 production files	2026-04-10 03:44:43 -07:00
test_file_operations.py	fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs	2026-04-10 16:47:44 -07:00
test_file_operations_edge_cases.py	fix(tools): remove dead code in _is_likely_binary and harden _check_lint against brace paths	2026-04-10 21:16:53 -07:00
test_file_read_guards.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_file_staleness.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_file_sync.py	test(file_sync): add tests for bulk_upload_fn callback	2026-04-10 21:14:32 -07:00
test_file_sync_back.py	fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards	2026-04-16 19:39:21 -07:00
test_file_sync_perf.py	test: add reproducible perf benchmark for file sync overhead	2026-04-10 03:01:46 -07:00
test_file_tools.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_file_tools_live.py	feat(environments): unified spawn-per-call execution layer	2026-04-08 17:23:15 -07:00
test_file_write_safety.py	fix(file_tools): block /private/etc writes on macOS symlink bypass	2026-04-13 05:15:05 -07:00
test_force_dangerous_override.py	fix(skills): honor policy table for dangerous verdicts	2026-03-14 11:27:02 -07:00
test_fuzzy_match.py	fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs	2026-04-10 16:47:44 -07:00
test_hidden_dir_filter.py	fix: use Path.parts for hidden directory filter in skill listing	2026-03-04 18:34:16 +03:00
test_homeassistant_tool.py	fix: clean up description escaping, add string-data tests	2026-04-13 04:45:07 -07:00
test_image_generation.py	feat(image_gen): multi-model FAL support with picker in hermes tools (#11265)	2026-04-16 20:19:53 -07:00
test_interrupt.py	fix: resolve remaining 4 CI test failures (#9543)	2026-04-14 02:18:38 -07:00
test_llm_content_none_guard.py	fix: guard aux LLM calls against None content + reasoning fallback + retry (salvage #3389) (#3449)	2026-03-27 15:28:19 -07:00
test_local_env_blocklist.py	fix: add macOS Homebrew paths to browser and terminal PATH resolution	2026-03-23 22:45:55 -07:00
test_local_tempdir.py	fix(termux): honor temp dirs for local temp artifacts	2026-04-09 16:24:53 -07:00
test_managed_browserbase_and_modal.py	feat: ungate Tool Gateway — subscription-based access with per-tool opt-in	2026-04-16 12:36:49 -07:00
test_managed_media_gateways.py	feat: ungate Tool Gateway — subscription-based access with per-tool opt-in	2026-04-16 12:36:49 -07:00
test_managed_modal_environment.py	fix: add activity heartbeats to prevent false gateway inactivity timeouts (#10501)	2026-04-15 13:29:05 -07:00
test_managed_server_tool_support.py	fix(tests): fix several failing/flaky tests on main (#6777)	2026-04-09 13:17:06 -07:00
test_managed_tool_gateway.py	feat: ungate Tool Gateway — subscription-based access with per-tool opt-in	2026-04-16 12:36:49 -07:00
test_mcp_dynamic_discovery.py	fix(mcp): make server aliases explicit	2026-04-14 17:19:20 -07:00
test_mcp_oauth.py	feat: implement MCP OAuth 2.1 PKCE client support (#5420)	2026-04-05 22:08:00 -07:00
test_mcp_probe.py	fix: remove stale test skips, fix regex backtracking, file search bug, and test flakiness	2026-04-04 10:18:57 -07:00
test_mcp_stability.py	fix: add vLLM/local server error patterns + MCP initial connection retry (#9281)	2026-04-13 18:46:14 -07:00
test_mcp_structured_content.py	fix(mcp): combine content and structuredContent when both present (#7118)	2026-04-10 03:44:35 -07:00
test_mcp_tool.py	fix(mcp): make server aliases explicit	2026-04-14 17:19:20 -07:00
test_mcp_tool_issue_948.py	fix: remove stale test skips, fix regex backtracking, file search bug, and test flakiness	2026-04-04 10:18:57 -07:00
test_memory_tool.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_memory_tool_import_fallback.py	fix(tools): keep memory tool available when fcntl is unavailable	2026-04-14 10:18:05 -07:00
test_mixture_of_agents_tool.py	refactor: tighten MoA traceback logging scope (#1307)	2026-03-14 07:53:56 -07:00
test_modal_bulk_upload.py	perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive (#8014)	2026-04-12 06:18:05 +05:30
test_modal_sandbox_fixes.py	fix: update tests for unified spawn-per-call execution model	2026-04-08 17:23:15 -07:00
test_modal_snapshot_isolation.py	fix(tests): update mocks for file sync changes	2026-04-10 03:01:46 -07:00
test_notify_on_complete.py	fix: suppress duplicate completion notifications when agent already consumed output via wait/poll/log (#8228)	2026-04-12 00:36:22 -07:00
test_osv_check.py	feat: OSV malware check for MCP extension packages (#5305)	2026-04-05 12:46:07 -07:00
test_parse_env_var.py	fix(docker): add explicit env allowlist for container credentials (#1436)	2026-03-17 02:34:35 -07:00
test_patch_parser.py	fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs	2026-04-10 16:47:44 -07:00
test_process_registry.py	fix(gateway): propagate user identity through process watcher pipeline	2026-04-11 13:46:16 -07:00
test_read_loop_detection.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_registry.py	fix(tools): auto-discover built-in tool modules	2026-04-14 21:12:29 -07:00
test_rl_training_tool.py	fix: call _stop_training_run on early-return failure paths	2026-03-10 17:09:51 -07:00
test_search_hidden_dirs.py	fix: exclude hidden directories from find/grep search backends (#1558)	2026-03-17 02:02:57 -07:00
test_send_message_missing_platforms.py	fix(send_message): deliver Matrix media via adapter	2026-04-15 17:37:43 -07:00
test_send_message_tool.py	retry transient telegram send failures	2026-04-16 03:47:00 -07:00
test_session_search.py	fix(session_search): coerce limit to int to prevent TypeError with non-int values (#10522)	2026-04-15 14:11:05 -07:00
test_singularity_preflight.py	fix(tests): use case-insensitive regex in singularity preflight tests	2026-03-16 19:01:39 +03:00
test_skill_env_passthrough.py	fix: remove 115 verified dead code symbols across 46 production files	2026-04-10 03:44:43 -07:00
test_skill_improvements.py	feat(skills): size limits for agent writes + fuzzy matching for patch (#4414)	2026-04-01 04:19:19 -07:00
test_skill_manager_tool.py	refactor: extract shared helpers to deduplicate repeated code patterns (#7917)	2026-04-11 13:59:52 -07:00
test_skill_size_limits.py	feat(skills): size limits for agent writes + fuzzy matching for patch (#4414)	2026-04-01 04:19:19 -07:00
test_skill_view_path_check.py	refactor: use Path.is_relative_to() for skill_view boundary check	2026-03-04 05:30:43 -08:00
test_skill_view_traversal.py	fix(security): block path traversal in skill_view file_path (fixes #220)	2026-03-02 02:00:09 -08:00
test_skills_guard.py	fix(skills): preserve trust for skills-sh identifiers + reduce resolution churn (#3251)	2026-03-26 13:40:21 -07:00
test_skills_hub.py	fix: update 6 test files broken by dead code removal	2026-04-10 03:44:43 -07:00
test_skills_hub_clawhub.py	fix: improve clawhub skill search matching	2026-03-14 23:15:04 -07:00
test_skills_sync.py	fix(skills): read name from SKILL.md frontmatter in skills_sync	2026-04-11 01:21:20 -07:00
test_skills_tool.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_ssh_bulk_upload.py	perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive (#8014)	2026-04-12 06:18:05 +05:30
test_ssh_environment.py	fix(tests): update mocks for file sync changes	2026-04-10 03:01:46 -07:00
test_symlink_prefix_confusion.py	fix: use is_relative_to() for symlink boundary check in skills_guard	2026-03-04 17:23:23 +03:00
test_sync_back_backends.py	fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards	2026-04-16 19:39:21 -07:00
test_terminal_exit_semantics.py	feat: add exit code context for common CLI tools in terminal results (#5144)	2026-04-04 16:57:24 -07:00
test_terminal_foreground_timeout_cap.py	fix: reject foreground timeout above cap instead of clamping	2026-04-10 02:58:54 -07:00
test_terminal_none_command_guard.py	fix(terminal): guard invalid command values	2026-04-08 21:37:51 -07:00
test_terminal_requirements.py	feat: ungate Tool Gateway — subscription-based access with per-tool opt-in	2026-04-16 12:36:49 -07:00
test_terminal_timeout_output.py	fix(terminal): preserve partial output when command times out (#3868)	2026-03-29 21:51:44 -07:00
test_terminal_tool.py	fix terminal workdir validation for Windows paths	2026-04-15 15:06:51 -07:00
test_terminal_tool_pty_fallback.py	feat: add tested Termux install path and EOF-aware gh auth	2026-04-09 16:24:53 -07:00
test_terminal_tool_requirements.py	feat: ungate Tool Gateway — subscription-based access with per-tool opt-in	2026-04-16 12:36:49 -07:00
test_threaded_process_handle.py	feat(environments): unified spawn-per-call execution layer	2026-04-08 17:23:15 -07:00
test_tirith_security.py	fix: send_animation metadata, MarkdownV2 inline code splitting, tirith cosign-free install (#1626)	2026-03-16 23:39:41 -07:00
test_todo_tool.py	fix(tools): enforce ID uniqueness in TODO store during replace operations	2026-04-11 16:22:50 -07:00
test_tool_backend_helpers.py	feat: ungate Tool Gateway — subscription-based access with per-tool opt-in	2026-04-16 12:36:49 -07:00
test_tool_call_parsers.py	refactor(tests): re-architect tests + fix CI failures (#5946)	2026-04-07 17:19:07 -07:00
test_tool_result_storage.py	fix(tools): neutralize shell injection in _write_to_sandbox via path quoting (#7940)	2026-04-11 14:26:11 -07:00
test_transcription.py	feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)	2026-03-31 03:10:01 -07:00
test_transcription_tools.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_tts_gemini.py	feat(tts): add Google Gemini TTS provider (#11229)	2026-04-16 14:23:16 -07:00
test_tts_mistral.py	feat(tools): add Voxtral TTS provider (Mistral AI)	2026-04-11 01:56:55 -07:00
test_tts_speed.py	test(tts): add speed config tests for Edge, OpenAI, and MiniMax	2026-04-12 16:46:18 -07:00
test_url_safety.py	fix(security): add SSRF protection to vision_tools and web_tools (hardened)	2026-03-23 15:40:42 -07:00
test_vision_tools.py	refactor: remove dead code — 1,784 lines across 77 files (#9180)	2026-04-13 16:32:04 -07:00
test_voice_cli_integration.py	fix(tests): fix 78 CI test failures and remove dead test (#9036)	2026-04-13 10:50:24 -07:00
test_voice_mode.py	fix(termux): tighten voice setup and mobile chat UX	2026-04-09 16:24:53 -07:00
test_watch_patterns.py	fix(gateway): route synthetic background events by session	2026-04-15 11:16:01 -07:00
test_web_tools_config.py	feat: ungate Tool Gateway — subscription-based access with per-tool opt-in	2026-04-16 12:36:49 -07:00
test_web_tools_tavily.py	fix(tests): fix several failing/flaky tests on main (#6777)	2026-04-09 13:17:06 -07:00
test_website_policy.py	fix: resolve 7 failing CI tests (#3936)	2026-03-30 08:10:14 -07:00
test_windows_compat.py	fix: guard POSIX-only process functions for Windows compatibility	2026-03-01 01:54:27 +03:00
test_write_deny.py	fix: resolve symlink bypass in write deny list on macOS	2026-02-26 13:30:55 +03:00
test_yolo_mode.py	fix(gateway): scope /yolo to the active session	2026-04-10 03:38:44 -07:00
test_zombie_process_cleanup.py	fix(tests): fix 78 CI test failures and remove dead test (#9036)	2026-04-13 10:50:24 -07:00