Commit graph

213 commits

Author SHA1 Message Date
kshitijk4poor
748f3e016b refactor(web): delete inline vendor helpers, re-export from plugins
Removes ~580 lines of dead code from tools/web_tools.py that were
superseded by the plugin migration but kept around in the cutover commit
to keep the diff focused. Replaces them with thin re-export shims so
existing tests and external callers that reach for the legacy
``tools.web_tools.<name>`` paths continue to work transparently.

Deleted from tools/web_tools.py
--------------------------------
- Lazy Firecrawl SDK proxy (_load_firecrawl_cls, _FirecrawlProxy,
  _FIRECRAWL_CLS_CACHE, the Firecrawl singleton)
- Firecrawl client section (_get_direct_firecrawl_config,
  _get_firecrawl_gateway_url, _is_tool_gateway_ready,
  _has_direct_firecrawl_config, _raise_web_backend_configuration_error,
  _firecrawl_backend_help_suffix, _get_firecrawl_client)
- Parallel client section (_get_parallel_client,
  _get_async_parallel_client, _parallel_client, _async_parallel_client)
- Tavily client section (_TAVILY_BASE_URL, _tavily_request,
  _normalize_tavily_search_results, _normalize_tavily_documents)
- Generic SDK normalizers (_to_plain_object, _normalize_result_list,
  _extract_web_search_results, _extract_scrape_payload)
- Exa client section (_get_exa_client, _exa_client, _exa_search,
  _exa_extract)
- Parallel helpers (_parallel_search, _parallel_extract)
- Duplicate inline check_firecrawl_api_key

Net: tools/web_tools.py drops from 2227 → 1613 lines (-614 lines).

Re-exports added at top of tools/web_tools.py
---------------------------------------------
- From plugins.web.firecrawl.provider:
  Firecrawl, _FirecrawlProxy, _FIRECRAWL_CLS_CACHE, _load_firecrawl_cls,
  _get_direct_firecrawl_config, _get_firecrawl_gateway_url,
  _is_tool_gateway_ready, _has_direct_firecrawl_config,
  _firecrawl_backend_help_suffix, _raise_web_backend_configuration_error,
  _get_firecrawl_client, _to_plain_object, _normalize_result_list,
  _extract_web_search_results, _extract_scrape_payload,
  check_firecrawl_api_key
- From plugins.web.tavily.provider:
  _tavily_request, _normalize_tavily_search_results,
  _normalize_tavily_documents
- From plugins.web.parallel.provider:
  _get_parallel_client, _get_async_parallel_client
- From plugins.web.exa.provider:
  _get_exa_client

Plus retained module-level imports for backward-compat with tests:
- httpx (tests patch tools.web_tools.httpx for tavily request mocking)
- build_vendor_gateway_url, _read_nous_access_token,
  resolve_managed_tool_gateway, managed_nous_tools_enabled,
  prefers_gateway (tests patch tools.web_tools.<name>)

Plugin indirection pattern (key technique)
------------------------------------------
For functions inside the firecrawl/parallel/exa plugins to honor
unit-test patches that target ``tools.web_tools.<name>``, the plugin
implementations now do ``import tools.web_tools as _wt`` at call time
and read helper names through that module (``_wt._read_nous_access_token``,
``_wt.Firecrawl``, ``_wt.prefers_gateway``, etc.). This makes the
existing test patches transparently reach the plugin code without any
test changes.

The cached client globals (_firecrawl_client, _firecrawl_client_config,
_parallel_client, _async_parallel_client, _exa_client) also now live on
tools.web_tools so existing test setup_method handlers that reset
``tools.web_tools._<vendor>_client = None`` between cases keep working.
The plugins read/write the cache via getattr/setattr on the web_tools
module.

Verified
--------
- 173/173 targeted web tests pass:
  test_web_providers.py, test_web_providers_brave_free.py,
  test_web_providers_ddgs.py, test_web_providers_searxng.py,
  test_web_tools_config.py, test_web_tools_tavily.py,
  test_website_policy.py, test_config_null_guard.py
- Compile-clean (py_compile.compile passes)
- All inline implementations now exist in exactly one place
  (plugins.web.<vendor>.provider)

Follow-up clean-up
------------------
- Drop _WEB_PLUGIN_SKIPLIST + hardcoded TOOL_CATEGORIES["web"] rows
  (next commit)
- Delete tools/web_providers/ directory entirely
- Add tests/plugins/web/ coverage
- Full tests/tools/ + tests/gateway/ regression sweep before promoting PR
2026-05-13 22:31:28 -07:00
kshitijk4poor
5e54330e27 fix(web): preserve firecrawl crawl + website-policy gate after migration
Two regressions discovered by running the full tests/tools/ suite after
the dispatcher cutover, both fixed in this commit:

1. web_crawl_tool incorrectly errored "search-only" for firecrawl
---------------------------------------------------------------------
The cutover treated any provider with supports_crawl()==False as a
search-only backend and returned the typed search-only error. But
firecrawl can crawl via the legacy multi-page-extract path inside
web_crawl_tool — it just doesn't expose supports_crawl on the plugin
(adding native firecrawl crawl is a clean follow-up).

Fix: only emit the search-only error when the provider supports
NEITHER crawl NOR extract (brave-free / ddgs / searxng). When the
provider supports extract but not crawl (firecrawl), fall through to
the legacy firecrawl-via-extract path below.

2. firecrawl plugin's check_website_access wasn't patchable
---------------------------------------------------------------------
The plugin imported `from tools.website_policy import check_website_access`
INSIDE the extract() function body, so monkeypatching the name on
plugins.web.firecrawl.provider had no effect — the inner import re-bound
the name on every call.

Fix: hoist the import to module level. Cheap (website_policy itself
has no heavy deps) and makes the standard
monkeypatch.setattr(firecrawl_provider, "check_website_access", ...)
pattern work.

Test updates (tests/tools/test_website_policy.py — 4 tests):
  - test_web_extract_short_circuits_blocked_url
  - test_web_extract_blocks_redirected_final_url
    Both: patch the gate at plugins.web.firecrawl.provider (where it
    runs after migration) and force the firecrawl plugin to be the
    active extract provider via FIRECRAWL_API_KEY.
  - test_web_crawl_short_circuits_blocked_url
  - test_web_crawl_blocks_redirected_final_url
    Both: unchanged — the dispatcher-level gate at tools.web_tools.py
    line 1651 still uses the imported `check_website_access` name and
    the firecrawl-fallthrough path is exercised as before.

Verified: 22/22 tests/tools/test_website_policy.py pass.
2026-05-13 22:31:28 -07:00
kshitijk4poor
143184e943 feat(web): firecrawl plugin — largest migration (search + async extract + dual auth)
Migrates Firecrawl from inline code in tools/web_tools.py to a bundled
plugin at plugins/web/firecrawl/. By line count this is the largest of
the seven provider migrations: the firecrawl path captured most of the
file's vendor-specific complexity.

What moved into the plugin (all previously in tools/web_tools.py):

  Lazy Firecrawl SDK proxy
    - _load_firecrawl_cls() — caches the imported SDK class
    - _FirecrawlProxy + Firecrawl singleton — defers ~200ms of SDK
      imports until first construction or isinstance check.

  Client construction (dual auth)
    - _get_direct_firecrawl_config()  — direct FIRECRAWL_API_KEY/URL path
    - _get_firecrawl_gateway_url()    — managed Nous tool-gateway URL
    - _is_tool_gateway_ready()        — gateway URL + Nous token check
    - _has_direct_firecrawl_config()  — direct config present?
    - _get_firecrawl_client()         — combined client construction
                                        honoring web.use_gateway
    - check_firecrawl_api_key()       — top-level "is firecrawl usable"
    - _firecrawl_backend_help_suffix() — managed-gateway help string
    - _raise_web_backend_configuration_error() — typed misconfig error

  Response shape normalization (vendor-specific)
    - _to_plain_object(), _normalize_result_list() — SDK→dict helpers
    - _extract_web_search_results() — handles SDK/direct/gateway shapes
    - _extract_scrape_payload()     — nested-data unwrap for scrape

  Per-URL extract loop
    - 60s asyncio.wait_for timeout per URL
    - Pre-scrape website-policy gate
    - Post-scrape redirect-aware SSRF re-check
    - Format-aware content selection (markdown / html / auto)
    - Per-URL errors returned as {"error": str} entries, no raises

Extract is declared `async def` — each URL is scraped in
asyncio.to_thread(...). This is the second async-extract plugin after
parallel.

The plugin re-exports `Firecrawl` (the lazy proxy) and
`check_firecrawl_api_key()` so existing tests doing
`patch("tools.web_tools.Firecrawl")` or
`monkeypatch.setattr(web_tools, "check_firecrawl_api_key", ...)` keep
working — tools/web_tools.py re-exports both names in the next
dispatcher-cutover commit.

Note: web_crawl_tool still has its own Firecrawl crawl path inline
(separate from extract); the Firecrawl SDK supports /crawl but we don't
expose supports_crawl=True on this plugin yet. Tavily handles crawl
today. Adding Firecrawl crawl is a clean follow-up.

Adds "firecrawl" to _WEB_PLUGIN_SKIPLIST.

E2E verified:
  - All 7 providers register: brave-free, ddgs, exa, firecrawl,
    parallel, searxng, tavily
  - inspect.iscoroutinefunction(firecrawl.extract) -> True
  - Firecrawl proxy is a callable lazy proxy at module level
  - check_firecrawl_api_key reflects FIRECRAWL_API_KEY presence
2026-05-13 22:31:28 -07:00
kshitijk4poor
31fcde876c feat(web): tavily plugin — first three-capability plugin (search + extract + crawl)
Migrates Tavily from inline _tavily_request() / _normalize_tavily_*
helpers in tools/web_tools.py to a bundled plugin at plugins/web/tavily/.

First plugin in the codebase to advertise supports_crawl=True. Tavily is
unique among built-in backends in offering a native /crawl endpoint that
walks linked pages from a seed URL with optional natural-language
instructions and depth ("basic" or "advanced").

Capabilities:
  - supports_search()  -> True (Tavily /search)
  - supports_extract() -> True (Tavily /extract)
  - supports_crawl()   -> True (Tavily /crawl)
  All sync (httpx.post under the hood).

The crawl method accepts forward-compat kwargs (instructions, depth,
limit) and is gated against unsafe URLs/policy by the dispatcher in
web_crawl_tool — exactly as before.

Behavior preserved:
  - TAVILY_API_KEY required (ValueError → typed error response)
  - TAVILY_BASE_URL env override honored
  - /crawl requires both body auth AND Bearer header — preserved
  - failed_results[] and failed_urls[] response keys mapped to per-URL
    items with error fields rather than raising
  - max_results capped at 20 server-side

Adds "tavily" to _WEB_PLUGIN_SKIPLIST.

The legacy inline _tavily_request / _normalize_tavily_search_results /
_normalize_tavily_documents / _TAVILY_BASE_URL in tools/web_tools.py are
NOT deleted yet — search/extract dispatch and the entire web_crawl_tool
function still reference them. They go away when those dispatchers are
cut over to the registry.

E2E verified:
  - Tavily registers with all 3 capabilities
  - Provider list now: brave-free, ddgs, exa, parallel, searxng, tavily
2026-05-13 22:31:28 -07:00
kshitijk4poor
4816646109 feat(web): parallel plugin — first async-extract plugin
Migrates Parallel.ai from inline `_parallel_search()` / `_parallel_extract()`
in tools/web_tools.py to a bundled plugin at plugins/web/parallel/.

First plugin in the codebase to expose an async :meth:`extract`:

  - search() is sync — Parallel.beta.search
  - extract() is **async def** — AsyncParallel.beta.extract

The ABC's docstring on supports_extract() already permits sync-or-async;
this commit is the first to exercise the async path. The web_extract_tool
dispatcher (next commit) detects coroutines via
inspect.iscoroutinefunction and awaits accordingly.

Behavior preserved:
  - PARALLEL_API_KEY required (raises ValueError if missing → surfaced
    as {"success": False, "error": "..."} instead)
  - PARALLEL_SEARCH_MODE env var honored (agentic|fast|one-shot, default
    agentic), validated via _resolve_search_mode()
  - Limit capped at 20 server-side via min(limit, 20)
  - Per-URL failure mode preserved: response.errors[] each become a
    result dict with an "error" field rather than raising
  - Module-level _parallel_client / _async_parallel_client caches kept
    (mirrors legacy singleton pattern)

Adds "parallel" to _WEB_PLUGIN_SKIPLIST in hermes_cli/tools_config.py so
the picker doesn't double-list.

The legacy inline _parallel_search, _parallel_extract, _get_parallel_client,
_get_async_parallel_client in tools/web_tools.py are NOT deleted yet — the
dispatcher still calls them. They go away when the dispatcher cuts over.

E2E verified:
  - inspect.iscoroutinefunction(p.search) -> False
  - inspect.iscoroutinefunction(p.extract) -> True
  - extract() returns a coroutine (not a list)
  - 5 providers register correctly (brave-free, ddgs, exa, parallel, searxng)
2026-05-13 22:31:28 -07:00
kshitijk4poor
ec8449e9c6 feat(web): exa plugin — first multi-capability migration (search + extract)
Migrates Exa from the inline `_exa_search()` / `_exa_extract()` helpers in
tools/web_tools.py to a bundled plugin at plugins/web/exa/.

This is the first plugin in this PR to advertise supports_extract=True,
exercising the multi-capability ABC path that the initial three migrations
(brave_free, ddgs, searxng — all search-only) did not cover.

Both Exa methods are sync — the SDK is sync-only. The web_extract_tool
dispatcher in tools/web_tools.py will continue to call them inline until
Task "dispatch-extract-all" cuts it over to the registry.

Behaviour preserved bit-for-bit aside from the ABC method-name change:
  - is_configured()  -> is_available()
  - provider_name()  -> name (property)
  - "exa" stays as the registered name
  - Module-level `_exa_client` cache + lazy `from exa_py import Exa`
    preserved at the new location.
  - Errors (ValueError for missing API key, ImportError for missing SDK,
    generic Exception) caught and surfaced as {"success": False, "error": ...}
    instead of raising.

Adds "exa" to _WEB_PLUGIN_SKIPLIST in hermes_cli/tools_config.py so the
hardcoded TOOL_CATEGORIES["web"] row and the plugin-injected row don't
duplicate during the spike. The skip-list goes away in the cleanup phase
along with the hardcoded row.

The legacy inline `_exa_search` / `_exa_extract` / `_get_exa_client` /
`_exa_client` in tools/web_tools.py are NOT deleted yet — the dispatcher
still references them. They go away in the next dispatcher-cutover commit.

E2E verified:
  - Plugin discovers + registers
  - .supports_search/.supports_extract/.supports_crawl = (True, True, False)
  - .get_setup_schema() returns the picker row shape
  - resolve(): explicit exa + EXA_API_KEY -> exa; without key -> exa (registered
    but unavailable, dispatcher surfaces "EXA_API_KEY not set" error)
2026-05-13 22:31:28 -07:00
kshitijk4poor
6b219f5af6 refactor(web): remove legacy in-tree provider modules
Deletes tools/web_providers/{brave_free,ddgs,searxng}.py — the three
providers that moved to plugins/web/ in prior commits. tools/web_tools.py
no longer imports them (registry dispatch as of d8735963f), so removing
them is purely a cleanup pass.

Also migrates the existing tests to the new import paths:
  tests/tools/test_web_providers_brave_free.py
  tests/tools/test_web_providers_ddgs.py
  tests/tools/test_web_providers_searxng.py

Mechanical rewrites:
  - `from tools.web_providers.X import YSearchProvider`
      -> `from plugins.web.X.provider import YWebSearchProvider`
  - `.is_configured()` -> `.is_available()`        (legacy method  -> new method)
  - `.provider_name()` -> `.name`                  (legacy method  -> new property)
  - `from tools.web_providers.base import WebSearchProvider`
      -> `from agent.web_search_provider import WebSearchProvider`
      (the subclass-check asserts membership in the new plugin-facing ABC)
  - `sys.modules.delitem("tools.web_providers.ddgs")` updated to point at
    `plugins.web.ddgs.provider` (cache-busting for lazy ddgs imports)

The TestXBackendWiring / TestXSearchOnlyErrors classes (covering
_is_backend_available, _get_backend, check_web_api_key, and the
"search-only" error paths in web_extract/web_crawl) are untouched —
those still test web_tools.py's backend-selection logic, which continues
to recognize the names "brave-free" / "ddgs" / "searxng" even after the
modules behind them moved to plugins.

tools/web_providers/base.py is intentionally NOT deleted by this commit
— it's the parent ABC of the legacy modules and shares its name with
agent/web_search_provider.py::WebSearchProvider. Removing it surfaces the
naming collision (see PR description Finding 0); the real migration PR
deletes it in the same commit that drops the _WEB_PLUGIN_SKIPLIST
guards in hermes_cli/tools_config.py.

Test results:
  bash scripts/run_tests.sh tests/tools/test_web_providers_*.py
  -> 65 passed in 3.41s (all rewritten unit tests + unchanged integration tests)
  bash scripts/run_tests.sh tests/tools/test_web_*.py
  -> 141 passed in 4.70s (full web test set, post-deletion)
2026-05-13 22:31:28 -07:00
kshitijk4poor
0d085d9454 feat(web): searxng plugin (search-only, third migration)
Adds plugins/web/searxng/. SearXNG aggregates results from upstream engines
via its JSON API (/search?format=json) — search-only, no extract capability
(supports_extract() returns False).

E2E verified — registry now has ['brave-free', 'ddgs', 'searxng'].
2026-05-13 22:31:28 -07:00
kshitijk4poor
5c7d098bee feat(web): ddgs plugin (second migration)
Adds plugins/web/ddgs/ following the same plugins/image_gen/ pattern as
brave_free. DuckDuckGo search via the community ddgs package; no API key,
package is an optional dep gated by is_available().

E2E verified — registry now has ['brave-free', 'ddgs'].
2026-05-13 22:31:28 -07:00
kshitijk4poor
d403cf018c feat(web): brave_free plugin (first migration from tools/web_providers/)
Adds plugins/web/brave_free/ as the first plugin built against the new
WebSearchProvider ABC. Mirrors the plugins/image_gen/openai/ layout exactly:

  plugins/web/brave_free/
    plugin.yaml      kind: backend, provides_web_providers: [brave-free]
    __init__.py      register(ctx) -> ctx.register_web_search_provider(...)
    provider.py      BraveFreeWebSearchProvider(WebSearchProvider)

Behavior preserved: same name ("brave-free" with hyphen), same env var
(BRAVE_SEARCH_API_KEY), same HTTP request shape, same response normalization.

The legacy tools/web_providers/brave_free.py is left in place — the
dispatcher in tools/web_tools.py still references it. Task 7 cuts over the
dispatcher to the new registry; Task 10 deletes the legacy file.

E2E verified:
  HERMES_PLUGINS_DEBUG=1 python -c "
  from hermes_cli.plugins import _ensure_plugins_discovered
  _ensure_plugins_discovered()
  from agent.web_search_registry import list_providers
  print([p.name for p in list_providers()])
  "
  # -> ['brave-free']
2026-05-13 22:31:28 -07:00
Teknium
9d42c2c286
feat(video_gen): unified video_generate tool with pluggable provider backends (#25126)
* feat(video_gen): unified video_generate tool with pluggable provider backends

One core video_generate tool, every backend a plugin. Mirrors the
image_gen + memory_provider + context_engine architecture: ABC, registry,
plugin-context registration hook, and per-plugin model catalogs surfaced
through hermes tools.

Surface (one schema, every backend):
- operation: generate / edit / extend
- modalities: text-to-video (prompt only), image-to-video (prompt +
  image_url), video edit (prompt + video_url), video extend (video_url)
- reference_image_urls, duration, aspect_ratio, resolution,
  negative_prompt, audio, seed, model override
- Providers ignore unknown kwargs and declare what they support via
  VideoGenProvider.capabilities() — backend-specific quirks stay in the
  backend, the agent learns one tool

Backends shipped:
- plugins/video_gen/xai/  — Grok-Imagine, full generate/edit/extend +
  image-to-video + reference images (salvaged from PR #10600 by
  @Jaaneek, reshaped into the plugin interface)
- plugins/video_gen/fal/  — Veo 3.1 (t2v + i2v), Kling O3 i2v,
  Pixverse v6 i2v with model-aware payload building that drops keys a
  model doesn't declare

Wiring:
- agent/video_gen_provider.py — VideoGenProvider ABC, normalize_operation,
  success_response / error_response, save_b64_video / save_bytes_video,
  $HERMES_HOME/cache/videos/
- agent/video_gen_registry.py — thread-safe register/get/list +
  get_active_provider() reading video_gen.provider from config.yaml
- hermes_cli/plugins.py — PluginContext.register_video_gen_provider()
- hermes_cli/tools_config.py — Video Generation category in
  hermes tools, plugin-only providers list, model picker per plugin,
  config write to video_gen.{provider,model}
- toolsets.py — new video_gen toolset
- tests: 31 new tests covering ABC, registry, tool dispatch, both plugins
- docs: developer-guide/video-gen-provider-plugin.md (parallel to the
  image-gen guide), sidebar + toolsets-reference + plugin guides updated

Supersedes: #25035 (FAL), #17972 (FAL), #14543 (xAI), #13847 (HappyHorse),
#10458 (provider categories), #10786 (xAI media+search bundle), #2984
(FAL duplicate), #19086 (Google Veo standalone — easy port to plugin
interface).

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>

* feat(video_gen): dynamic schema reflects active backend's capabilities

Address the 'capability variance' question — instead of one tool with a
static schema that lies about what every backend supports, the
video_generate tool now rebuilds its description at get_definitions()
time based on the configured video_gen.provider and video_gen.model.

The agent sees backend-specific guidance up-front:
- 'fal-ai/veo3.1/image-to-video': 'image-to-video only — image_url is
  REQUIRED; text-only prompts will be rejected'
- 'fal-ai/veo3.1' (t2v): no image_url restriction shown
- xAI grok-imagine-video: 'operations: generate, edit, extend; up to 7
  reference_image_urls'
- Backends without edit/extend: 'not supported on this backend — surface
  that they need to switch backends via hermes tools'

This is the same pattern PR #22694 used for delegate_task self-capping —
documented in the dynamic-tool-schemas skill. Cache invalidation is
free: get_tool_definitions() already memoizes on config.yaml mtime, so a
mid-session backend swap rebuilds the schema automatically.

Tested:
- Empirical FAL OpenAPI schema check confirms image-to-video models
  require image_url (FAL returns HTTP 422 otherwise) — client-side
  rejection in FALVideoGenProvider.generate() now prevents the wasted
  round-trip
- Live E2E: fal-ai/veo3.1/image-to-video + prompt-only → clean
  missing_image_url error; fal-ai/veo3.1 + prompt-only → dispatches
- 6 new tests cover the builder (no config / image-only / full-surface /
  text-only / unknown provider / registry wiring), all passing
- 37/37 in the slice, 134/134 in the broader regression set

* test(video_gen/xai): full surface integration tests + cleaner schema

Verified end-to-end that the xAI plugin handles every documented mode
from PR #10600's surface: text-to-video, image-to-video,
reference-images-to-video, video edit, video extend (with and without
prompt). All five modes route to the correct xAI endpoint
(/videos/generations, /videos/edits, /videos/extensions) with the right
payload shape (image / reference_images / video keys), and all five
client-side rejections fire before the network: edit-without-prompt,
extend-without-video_url, image+refs conflict, >7 references, and
duration/aspect_ratio clamping.

15 new integration tests grouped into four classes (endpoint routing,
modalities, validation, clamping). httpx is stubbed via a small fake
AsyncClient that records POSTs so the tests assert the actual payload
the plugin would send to xAI — not just the success/error envelope.

Also cleaned up a description redundancy: when a model's operations
match the backend's overall set, we no longer print the duplicate
'operations supported by this model' line. xAI's description now reads:

    Active backend: xAI . model: grok-imagine-video
    - operations supported by this backend: edit, extend, generate
    - modalities supported by this backend: image, reference_images, text
    - aspect_ratio choices: 16:9, 1:1, 2:3, 3:2, 3:4, 4:3, 9:16
    - resolution choices: 480p, 720p
    - duration range: 1-15s
    - reference_image_urls: up to 7 images

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>

* feat(video_gen): collapse surface to t2v + i2v, family-based auto-routing

Two design changes per Teknium:

1) Drop edit/extend from the tool surface entirely. Only text-to-video
and image-to-video remain. The agent sees a clean tool with two
modalities; backend-specific quirks like xAI's edit/extend endpoints
stay out of the unified schema.

2) FAL: pick a model FAMILY once, the plugin routes between the
family's text-to-video and image-to-video endpoints based on whether
image_url was passed. Users no longer pick 'fal-ai/veo3.1' AND
'fal-ai/veo3.1/image-to-video' as separate options — they pick
'veo3.1', and the plugin handles the rest.

Catalog rewritten as families:

    veo3.1            fal-ai/veo3.1                                /  fal-ai/veo3.1/image-to-video
    pixverse-v6       fal-ai/pixverse/v6/text-to-video             /  fal-ai/pixverse/v6/image-to-video
    kling-o3-standard fal-ai/kling-video/o3/standard/text-to-video /  fal-ai/kling-video/o3/standard/image-to-video

xAI uses a single endpoint (/videos/generations) for both modes,
routed by the presence of the 'image' field in the payload — no
edit/extend exposure.

Schema changes:
- VIDEO_GENERATE_SCHEMA: drop operation, drop video_url. Final params:
  prompt (required), image_url, reference_image_urls, duration,
  aspect_ratio, resolution, negative_prompt, audio, seed, model.
- VideoGenProvider ABC: drop normalize_operation, VALID_OPERATIONS,
  DEFAULT_OPERATION. capabilities() drops 'operations' key.
- success_response: add 'modality' field ('text' | 'image') so the
  agent and logs can see which endpoint was actually hit.

Dynamic schema builder simplified — no operations bullet, no
'switch backends if you need edit/extend' guidance. When the active
backend supports both modalities (the common case), description reads:

    Active backend: FAL . model: pixverse-v6
    - supports both text-to-video (omit image_url) and image-to-video
      (pass image_url) - routes automatically
    - aspect_ratio choices: 16:9, 9:16, 1:1
    - resolution choices: 360p, 540p, 720p, 1080p
    - duration range: 1-15s
    - audio: pass audio=true to enable native audio (pricing tier)
    - negative_prompt: supported

Tests: 51 in the video_gen slice, 216 across the broader image+video
sweep, all passing. New FAL routing tests prove pixverse-v6 + no image
hits text-to-video endpoint, pixverse-v6 + image_url hits
image-to-video endpoint, same for veo3.1 and kling-o3-standard.

Docs updated: developer-guide page rewrites the 'model families' pattern
as a first-class section so external plugin authors know the convention.
toolsets-reference and toolsets.py descriptions match the new surface.

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>

* feat(video_gen/fal): expand catalog to 6 families, cheap + premium tiers

Catalog now covers everything Teknium specced from FAL:

  Cheap tier:
    ltx-2.3        fal-ai/ltx-2.3-22b/text-to-video       / image-to-video
    pixverse-v6    fal-ai/pixverse/v6/text-to-video       / image-to-video

  Premium tier:
    veo3.1         fal-ai/veo3.1                          / fal-ai/veo3.1/image-to-video
    seedance-2.0   bytedance/seedance-2.0/text-to-video   / image-to-video
    kling-v3-4k    fal-ai/kling-video/v3/4k/text-to-video / image-to-video
    happy-horse    fal-ai/happy-horse/text-to-video       / image-to-video

DEFAULT_MODEL moved from veo3.1 (premium) to pixverse-v6 (cheap, sane
defaults, both modalities) — better first-run UX for users who haven't
explicitly picked a model.

New family-entry knob: image_param_key. Kling v3 4K's image-to-video
endpoint expects start_image_url instead of image_url; declaring
image_param_key='start_image_url' on the family lets _build_payload
remap correctly. Other families default to plain image_url.

Per-family capability flags reflect each model's docs:
- LTX 2.3 + Happy Horse: minimal payloads (no duration/aspect/resolution
  enum exposed by FAL — let endpoint apply defaults)
- Seedance: 6 aspect ratios incl 21:9, durations 4-15, audio supported,
  negative prompts NOT supported per docs
- Kling v3 4K: 16:9/9:16/1:1, 3-15s, audio + negative
- Veo 3.1: unchanged, 16:9/9:16, 4/6/8s

Tests: +5 covering the new families (full catalog, Kling 4K
start_image_url remap, Seedance routing, LTX payload minimality, Happy
Horse minimality). 56/56 in the slice green.

Note: I did NOT add the FAL-hosted xAI Grok-Imagine variant. Hermes
already has a direct xAI plugin that talks to xAI's own API; routing
the same model through FAL's wrapper would duplicate the surface
without adding capabilities. Users on FAL who want Grok-Imagine should
use the xAI plugin directly; flag if you want both routes available.

* test(video_gen): tool-surface routing matrix — every model x modality

End-to-end matrix test driven through _handle_video_generate() — the
actual function the agent's video_generate tool call lands in. Writes
config.yaml, invokes the registered handler with a raw args dict, then
asserts the outbound HTTP/SDK call hit the right endpoint with the right
payload shape.

Parametrized over FAL_FAMILIES.keys() so the matrix auto-discovers new
families as they're added (add a family to FAL_FAMILIES and you get
both modalities tested for free).

Coverage:
- All 6 FAL families x {text-only, text+image} = 12 cases
- xAI x {text-only, text+image} = 2 cases
- tool-level model= arg overrides config = 2 cases

For each case, verifies:
- result['success'] is True
- result['modality'] matches input shape ('text' if no image_url, 'image' otherwise)
- outbound endpoint URL matches the family's text_endpoint or image_endpoint
- text-only payloads carry no image-shaped keys
- text+image payloads carry the family's image key (image_url for most,
  start_image_url for kling-v3-4k, wrapped 'image' object for xAI)

All 16 cases passing. Confirms the tool surface routes every
(provider, model, modality) combination correctly with zero leakage.

* feat(video_gen): keep video_gen out of first-run setup, surface in status

Two changes:

1. video_gen joins _DEFAULT_OFF_TOOLSETS, so it is NOT pre-selected in
   the first-run toolset checklist. Video gen is niche, paid, and slow —
   most users don't want it nagging them during initial setup. Anyone
   who wants it opts in via 'hermes tools' -> Video Generation, which
   already routes to the provider+model picker.

2. The 'hermes setup' status panel learns about video_gen — but only
   shows the row when a plugin reports available. Users without
   FAL_KEY/XAI_API_KEY see nothing about video gen; users with one of
   those keys see 'Video Generation (FAL) ✓' as confirmation it's wired.

Verified live:
- Fresh install (no creds): zero video_gen mentions in wizard.
- With FAL_KEY: status row appears with active backend name.
- 160/160 in the setup + tools_config + video_gen test slice.

Rationale: image_gen is on by default because it's a featured creative
tool used in casual chat (telegrams, etc). Video gen is heavier — long
wait, paid per-second pricing. Default-off matches user intent better.

---------

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>
2026-05-13 16:39:41 -07:00
Teknium
486b692ddd
feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779)
Some checks failed
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Docker Build and Publish / move-latest (push) Blocked by required conditions
Lint (ruff + ty) / ruff + ty diff (push) Waiting to run
Lint (ruff + ty) / ruff enforcement (blocking) (push) Waiting to run
Lint (ruff + ty) / Windows footguns (blocking) (push) Waiting to run
Nix / nix (macos-latest) (push) Waiting to run
Nix / nix (ubuntu-latest) (push) Waiting to run
Tests / test (push) Waiting to run
Tests / e2e (push) Waiting to run
OSV-Scanner / Scan lockfiles (push) Has been cancelled
uv.lock check / uv lock --check (push) Has been cancelled
* feat(nous): unified client=hermes-client-v<version> tag on every Portal request

Every Hermes request to Nous Portal now carries the same
client=hermes-client-v<__version__> tag (e.g. client=hermes-client-v0.13.0
on this release), sourced live from hermes_cli.__version__. The release
script's regex bump auto-aligns it on every release.

Centralized in agent/portal_tags.py and wired into all four call sites:
- NousProfile.build_extra_body (main agent loop, every chat completion)
- auxiliary_client.NOUS_EXTRA_BODY + _build_call_kwargs (aux client)
- run_agent.py compression-summary fallback path
- tools/web_tools.py web_extract fallback

Replaces the client=aux marker added in #24194 with the unified version
tag. Tests assert against the helper output (invariant) rather than the
literal string, so they don't need updating on every release.

* feat(nous): cover /goal judge and kanban specify aux paths

Two aux-using surfaces bypassed call_llm by invoking
client.chat.completions.create() directly without extra_body, so they
were missing the unified Portal client tag:

- hermes_cli/goals.py — /goal standing-goal judge
- hermes_cli/kanban_specify.py — kanban triage specifier

Both now pass extra_body=get_auxiliary_extra_body() or None so they
inherit the version tag when the aux client points at Nous Portal, and
emit nothing otherwise (no tag leak to OpenRouter/Anthropic auxes).
2026-05-12 20:49:20 -07:00
mizgyo
7c67097325 fix(line): use build_source instead of nonexistent create_source
The LINE adapter calls self.create_source(...) which raises
AttributeError on every inbound message — no such method exists.
The base PlatformAdapter exposes this factory as build_source(),
consistent with the IRC and Teams adapters.

Fixes #23728
2026-05-12 18:46:54 -07:00
jak983464779
0c233e70f8 fix(doctor): skip /models health check for providers that don't support it
Xiaomi MiMo's /v1/models endpoint returns 401 even with a valid API key,
causing hermes doctor to falsely report 'invalid API key'.

Add a `supports_health_check` field to ProviderProfile (default True).
Providers whose /models endpoint doesn't support auth verification can
set it to False. The doctor's dynamic provider discovery now reads this
field instead of hardcoding True.

The xiaomi provider plugin sets supports_health_check=False.
2026-05-12 17:12:25 -07:00
Austin Pickett
fc3fd6bb6b fix(dashboard): UI polish — modals, layout, consistency, test fixes
Dashboard UX polish pass — consolidates create forms into modals
triggered from the page header, fixes layout inconsistencies, adds
scroll-to navigation for the Keys page, and aligns the TokenBar with
the design system.

Changes:
- App.tsx: add padding to sidebar header
- resolve-page-title.ts: add missing routes, better fallback title
- en.ts: fix nav labels (Profiles was 'profiles : multi agents')
- ModelsPage: two-col layout, auxiliary tasks modal, TokenBar redesign
- ProfilesPage: create button in header, form in modal, Checkbox component
- CronPage: create button in header, form in modal
- EnvPage: scroll-to sub-nav in header, fix text overflow

Modal and dialog standardization:
- Replace all native confirm()/window.confirm() with ConfirmDialog
  (OAuthProvidersCard, PluginsPage, ModelsPage, ConfigPage)
- Add useModalBehavior hook (Escape-to-close, scroll lock, focus restore)
- Apply hook to ProfilesPage, CronPage, AuxiliaryTasksModal

Component fixes (from PR review):
- Checkbox: fix controlled/uncontrolled mismatch, add focus-visible ring
- TokenBar: add rounded-full to legend dots, remove dead code

CI/test fixes:
- Fix TS unused imports (noUnusedLocals), type-narrow PickerTarget union
- Add windows-footgun suppression on platform-guarded os.killpg
- Fix 19 stale unit tests + 9 e2e tests broken by recent main changes
- Restore minimal example-dashboard plugin for plugin auth test
2026-05-12 13:59:22 -04:00
Teknium
c1eb2dcda7
feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220)
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback

Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.

# What this PR makes true

1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
   detection banner with copy-pasteable remediation steps the moment
   they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
   a fresh install to 'core only' — the installer keeps every other
   extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
   lazy-install on first use under a strict allowlist, instead of
   eagerly pulling everything at install time.

# Detection: hermes_cli/security_advisories.py

- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
  mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
  dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
  the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
  re-banner after ack.
- Wired into:
  * hermes doctor — runs first, prints full remediation block
  * hermes doctor --ack <id> — dismisses an advisory
  * cli.py interactive run() and single-query branches — short
    stderr banner pointing at hermes doctor
  * gateway/run.py startup — operator-visible warning in gateway.log

# Lazy-install framework: tools/lazy_deps.py

- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
  memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
  uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
  pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
  HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
  * tools/tts_tool.py — _import_elevenlabs() calls ensure first
  * plugins/memory/honcho/client.py — get_honcho_client lazy-installs
  * tts.mistral / stt.mistral entries pre-registered for when PyPI
    restores mistralai

# Installer fallback tiers

scripts/install.sh, scripts/install.ps1, setup-hermes.sh:

- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
  array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
  PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
  landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
  the same _BROKEN_EXTRAS array so updates stay in sync.

Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).

# Config

hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: []  (advisory IDs the user has dismissed)
- allow_lazy_installs: True  (security gate for ensure())

No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.

# Tests

tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
  gateway_log_message
- shipped catalog well-formedness invariant

tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
  every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
  pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command

Combined: 63 new tests, all passing under scripts/run_tests.sh.

# Validation

- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
  tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
  tests/hermes_cli/test_doctor_command_install.py
  tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
  tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
  9191 passed, 8 pre-existing failures (verified on origin/main
  before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
  + gateway_log_message with mocked installed version → produces
  copy-pasteable remediation output

# Community

Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md

Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md

Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>

* build(deps): pin every direct dep to ==X.Y.Z (no ranges)

Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.

Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.

What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.

Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.

Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.

mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.

LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.

Validation:

- Cross-checked all 77 pinned direct deps in pyproject.toml against
  uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
  → 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.

* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra

You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.

# What this commit fixes

1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
   uv.lock records SHA256 hashes for every transitive — a compromised
   package with a different hash gets REJECTED. Falls through to the
   existing `uv pip install` cascade if the lockfile is missing or
   stale, with a loud warning that the fallback path does NOT
   hash-verify transitives. Previously only `setup-hermes.sh` (the dev
   path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
   (the paths fresh users actually run) skipped it.

2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
   project is fully quarantined right now — every version returns 404,
   so any pin we wrote was unresolvable, which broke `uv lock --check`
   in CI. Restoration is documented in pyproject.toml as a 5-step
   checklist (verify, re-add extra, re-enable in 4 modules, regenerate
   lock, optionally re-add to [all]).

3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
   jsonpath-python pruned. `uv lock --check` now passes.

# Defense-in-depth view

| Layer                      | Where             | Protects against                          |
|----------------------------|-------------------|-------------------------------------------|
| Exact pins in pyproject    | direct deps       | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install | transitive graph  | transitive worm injection                  |
| Tier-0 hash-verified path  | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| `uv lock --check` CI gate  | every PR          | drift between pyproject and lockfile      |
| `hermes_cli/security_advisories.py` | runtime  | cleanup for users who already got hit      |

The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.

# Validation

- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
  (test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
  test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.

* chore: remove community announcement drafts (PR body covers it)

* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)

Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.

Moved out of core dependencies = []:
- anthropic   (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client  (image gen; only when picked)
- edge-tts    (default TTS but still optional)

New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].

New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.

Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.

Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).

Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
2026-05-12 01:02:25 -07:00
Evgenii
27cfe72543 fix(kanban): use localized column label in select-all aria label 2026-05-11 06:44:58 -07:00
Teknium1
b8bf2f817d fix(kanban): merge dashboard batch QOL with i18n + collapse + assignee-casing
PR #23240 was branched before main landed:
- c39168453 i18n localization (16 locales)
- a91e5a875 native <details> collapse + skip empty metadata
- 0e0ddaac8 tone down completed-run metadata panel
- b308dd7d7 preserve assignee casing in dashboard

The cherry-pick took PR's dist/index.js wholesale via -X theirs,
which dropped those features. This commit re-applies them by
hand-merging the 7 conflict regions:

1. bulk-action catch handler: keep PR's failedIds + loadBoard,
   keep main's t-in-deps for tx() i18n calls
2. Refresh button: keep main's tx(t, 'refresh', ...), add PR's
   Clear filters button with tx(t, 'clearFilters', ...)
3. Archive button: keep main's tx(t, 'archive', ...), add PR's
   priority setter with tx(t, 'priority'/'setPriority', ...)
4. Column header: keep main's colHelp i18n var, add PR's
   column-select-all checkbox
5/6. lane.tasks/column.tasks .map: keep main's t->tk rename
   (avoids shadowing the i18n t), apply tk to PR's failed/
   draggingSource props
7. Card checkbox label-wrap: keep PR's <label> structure
   (larger hit target), keep main's tx(i18n, 'selectForBulk', ...)

Adds three new i18n keys (clearFilters, priority, setPriority)
that fall back to English via tx() until translators add them
to the kanban catalog, matching the existing pattern.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam
3df7e30244 kanban dashboard: fix shift-click range selection, column select-all toggle, and bulk action optimistic UI
- Bug 1: shift-click now always adds the target card and sets it as the
  last-selected anchor, so range selection works even when 0 or 1 cards
  are selected.
- Bug 2: column select-all checkbox now toggles: if every card in the
  column is already selected, clicking unselects them all.
- Bug 3: applyBulk now mirrors moveSelected with optimistic UI updates
  for status moves and calls loadBoard() on catch for consistency.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam
69053832e3 kanban dashboard: remove redundant t.summary from search haystack
The Task dataclass has no `summary` field; only Run carries summary.
The dashboard already searches `latest_summary` (derived from the
latest run), so `t.summary` in the client-side haystack was always
undefined and therefore redundant.

Verdict from task t_4bcac44f:
- Before batch QOL (6c7ec94d9): search only covered id, title,
  assignee, tenant.
- Batch QOL (7fd187102) correctly added body, result, latest_summary.
- `t.summary` was included but is a misleading no-op because tasks
  never expose a `summary` key — `latest_summary` already covers it.

Removes the redundant field from the haystack only.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam
a88f201cd4 kanban dashboard: multi-card drag visual feedback
- When dragging a selected card while multiple cards are selected, the
  browser ghost image now shows a 'N cards' badge instead of a single card.
- All selected cards in the original column are dimmed (opacity 0.45 +
  grayscale) during the drag so the user sees the whole set is in-flight.
- Uses React state for the dragged task id; event delegation on the board
  columns container to avoid deep prop threading.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam
98c499b235 kanban dashboard: fix batch QOL oracle blockers
- Preserve failedIds partial-failure highlighting after moveSelected/
  applyBulk by clearing only selectedIds/lastSelectedId instead of
  calling clearSelected() (which also wiped failedIds).
- Fix touch/native multi-drag drop stale closure by adding
  props.selectedIds and props.onMoveSelected to the hermes-kanban:drop
  useEffect dependency array.

Fixes t_5bfafb73.
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam
0ea234e093 feat(kanban): dashboard batch QOL upgrade
- Shift-click range selection, column select-all, select-all-visible
- Multi-card drag/drop via selectedIds + /tasks/bulk
- Expanded bulk actions: todo/ready/blocked/unblock/complete/archive,
  priority setter, reassign with reclaim_first checkbox
- Partial failure card highlight (failedIds + hermes-kanban-card--failed)
- Search expanded to body, result, latest_summary, summary
- Clear filters button + reset all filters on board switch
- Accessibility: larger checkbox hit target, tabIndex/role/aria-label,
  Enter/Space/Esc keyboard handlers
- Fix temporal-dead-zone bug: move clearSelected before moveSelected
2026-05-10 21:44:37 -07:00
Yi Lok Enoch Lam
518d37f6af feat(kanban): add reclaim_first support to bulk reassign endpoint
- Extend BulkTaskBody with reclaim_first: bool = False
- In bulk_update, use kanban_db.reassign_task(..., reclaim_first=True)
  when payload.reclaim_first is set and assignee is present
- Falls back to existing assign_task behavior when reclaim_first is false

This enables the dashboard to bulk-reassign running tasks by
reclaiming their claims first, matching the single-task
/tasks/{id}/reassign endpoint behavior.
2026-05-10 21:44:37 -07:00
Teknium
80bb5f2947 fix(achievements): use canonical X-Hermes-Session-Token header
Follow-up to TreyDong's fix: switch the auth header to
`X-Hermes-Session-Token` (the canonical pattern used by the rest of
the dashboard SPA — see `web/src/lib/api.ts` `fetchJSON()`). The
server still accepts both schemes, so the original `Authorization:
Bearer` form would also work; we standardize on X-header to match
every other dashboard fetch and only set the header when a token is
actually present.

Also add scripts/release.py AUTHOR_MAP entry for treydong.zh@gmail.com.
2026-05-10 19:41:45 -07:00
treydong
da2ed478b5 fix(achievements): inject Authorization header in plugin API calls 2026-05-10 19:41:45 -07:00
Teknium
a91e5a8759 feat(kanban-dashboard): native <details> collapse + skip empty metadata
Two follow-up improvements to Tranquil-Flow's metadata-panel restyle.
Both stay within the parent PR's "tone down the panel" scope.

1. Native <details>/<summary> collapse for verbose metadata.

   The parent PR consciously deferred this ("adding native expand/collapse
   would be the next step but requires UX agreement"). The default they
   asked for is straightforward: collapsed when the rendered JSON exceeds
   300 chars (the threshold where the max-height: 8.5rem cap actually
   starts mattering), expanded otherwise. <details>/<summary> is the right
   primitive — zero JS, browser-handled state, accessible by default
   (keyboard-navigable, screen-reader announces the disclosure state),
   and survives any react-state churn for free.

   The OS-default disclosure marker is suppressed (list-style: none +
   ::-webkit-details-marker hidden) and replaced with a CSS ::before
   chevron that rotates 90deg on the [open] attribute, so the look is
   consistent across Firefox/WebKit/Blink without the double-marker
   that would otherwise appear on the platforms that still render the
   default triangle.

2. Skip rendering when metadata is an empty object.

   `r.metadata && ...` truthy-checks, but `{}` is truthy in JS — so a
   completed task with no actual metadata would render a "Metadata"
   labeled disclosure block containing literal `{}`. Adds an
   Object.keys(r.metadata).length > 0 guard so empty payloads render
   nothing instead of an empty disclosure stub.

Tests: three new static-asset assertions covering the <details> shape,
the empty-object skip, and the suppress-default-marker + animated-chevron
CSS — all in `tests/plugins/test_kanban_dashboard_plugin.py`.
2026-05-10 08:30:42 -07:00
Tranquil-Flow
0e0ddaac8f fix(kanban-dashboard): tone down completed-run metadata panel (#19548)
Hand-rebased onto current main from PR #19980; the original branch was stale
against main (~6 unrelated dashboard fixes had landed since), so applying
the PR's dist files directly would have silently reverted them.

The run-history panel in the task drawer rendered each completed run's
`metadata` field as a `<code class="hermes-kanban-run-meta">` containing
`JSON.stringify(r.metadata)` — a single unindented monoline. With
`white-space: pre-wrap` and a monospace font, a writer task's metadata
(changed_files paths, source URLs, generated-artifact details) wrapped
into a tall block of code-ish text that filled the parent run row. The
container's faint `--color-foreground 3%` background then made the whole
thing read like a crash dump even though the run completed normally.

Restyle and label, no interactivity changes:

- Wrap the meta payload in a `.hermes-kanban-run-meta-block` sub-block
  with an explicit `Metadata` label (small, uppercase, muted) so the
  panel reads as auxiliary detail at a glance.
- Pretty-print the JSON (`indent=2`) so the structure is scannable
  instead of a wall of monoline text.
- Cap `.hermes-kanban-run-meta` at `max-height: 8.5rem; overflow: auto`
  so a verbose blob scrolls inside its own pane rather than swamping
  the run row.
- Sub-block uses a thin `border-left` rule and `background: transparent`
  — distinct from the destructive-tinted treatment used by crashed /
  timed_out / blocked / spawn_failed runs higher in the same file.

Tests: two new static-asset assertions in
`tests/plugins/test_kanban_dashboard_plugin.py` lock in the rendered
shape (the plugin ships built-only, no src/).
2026-05-10 08:30:42 -07:00
princepal9120
b308dd7d75 fix(kanban): preserve assignee casing in dashboard 2026-05-10 07:35:01 -07:00
baocin
061a183008 fix(kanban): guard task_age against corrupt created_at values like '%s'
task_age() crashed with ValueError when created_at contained the
literal format string '%s' instead of a Unix timestamp, taking down
the entire GET /board endpoint with a 500.

- Add _safe_int() helper that returns None on non-numeric values
- Refactor task_age() to use _safe_int instead of bare int() casts
- Wrap task_age() call in _task_dict with try/except fallback so one
  corrupt row never kills the whole board endpoint
2026-05-10 07:15:59 -07:00
Teknium
c39168453d
feat(i18n): localize all gateway commands + web dashboard, add 8 new locales (16 total) (#22914)
* feat(i18n): localize /model command output

Reported by @tianma8888: when Chinese users run /model, the labels
("Provider:", "Context:", "_session only_", etc.) are still English.
This routes the static prose through the existing i18n catalog so it
follows display.language / HERMES_LANGUAGE.

Changes:
- locales/{en,zh,ja,de,es,fr,tr,uk}.yaml: add 17 keys under
  gateway.model.* covering switched/provider/context/max_output/cost/
  capabilities/prompt_caching/warning/saved_global/session_only_hint/
  current_label/current_tag/more_models_suffix/usage_*.
- gateway/run.py _handle_model_command: replace hardcoded f-strings in
  the picker callback, the text-list fallback, and the direct-switch
  confirmation block with t("gateway.model.<key>", ...).

What stays English:
- model IDs, provider slugs, capability strings, cost figures, and the
  "[Note: model was just switched...]" prepended to the model's next
  prompt (LLM-facing, not user-facing).
- The two slightly-different session-only hints unify on a single key
  with the em-dash phrasing.

Validation: tests/agent/test_i18n.py 27/27 passing (parity contract
holds), tests/gateway/ -k 'model or i18n' 74/74 passing.

* feat(i18n): localize all gateway slash command outputs

Expands the i18n catalog from 7 strings to 234 keys across 35 gateway
slash command handlers, so non-English users see localized output for
\`/profile\`, \`/status\`, \`/help\`, \`/personality\`, \`/voice\`, \`/reset\`,
\`/agents\`, \`/restart\`, \`/commands\`, \`/goal\`, \`/retry\`, \`/undo\`,
\`/sethome\`, \`/title\`, \`/yolo\`, \`/background\`, \`/approve\`, \`/deny\`,
\`/insights\`, \`/debug\`, \`/rollback\`, \`/reasoning\`, \`/fast\`,
\`/verbose\`, \`/footer\`, \`/compress\`, \`/topic\`, \`/kanban\`,
\`/resume\`, \`/branch\`, \`/usage\`, \`/reload-mcp\`, \`/reload-skills\`,
\`/update\`, \`/stop\` (plus the \`/model\` block already added in the
previous commit).

Reported by @tianma8888 — Chinese users want command output prose in
their language, not just the labels we already had.

Translations are hand-written for all 8 supported locales (en, zh, ja,
de, es, fr, tr, uk), matching each catalog's existing style: full-width
punctuation in zh, em-dashes in zh/ja/uk, French spaced colons,
German noun capitalization, etc.

What stays English (unchanged):
- Identifiers/values: model IDs, file paths, profile names, session IDs,
  command flag names like --global, URLs, config keys.
- Backtick code spans: \`/foo\`, \`config.yaml\`.
- Log messages (logger.info/warning/error).
- LLM-facing system notes prepended to next prompt (e.g. [Note: model
  was just switched...]).
- Strings produced by external modules (gateway_help_lines,
  format_gateway, manual_compression_feedback) — those have their
  own surfaces.

New shared keys for cross-handler boilerplate:
- gateway.shared.session_db_unavailable (5 call sites: branch, title,
  resume, topic, _disable_telegram_topic_mode_for_chat)
- gateway.shared.session_not_found (1 site)
- gateway.shared.warn_passthrough (2 sites in /title's f"⚠️ {e}" pattern)

YAML gotcha fixed: \`yolo.on\` and \`yolo.off\` were originally written
unquoted, which YAML 1.1 parses as boolean True/False keys. Renamed to
\`yolo.enabled\` / \`yolo.disabled\` for both safety and clarity.

Test fix: tests/agent/test_i18n.py::test_t_missing_key_in_non_english_falls_back_to_english
now resets the catalog cache on teardown, so the fake "foo: English Foo"
locale doesn't poison the module-level cache for subsequent tests in
the same xdist worker. (Without this, every gateway slash command test
that shares a worker with the i18n suite would see the fake catalog.)

Validation:
- tests/agent/test_i18n.py: 27/27 (parity contract — every key in every
  locale, matching placeholder tokens).
- tests/gateway/: 5077 passed, 0 failed (full gateway suite).
- 180 t() call sites added across 35 handlers; 1872 catalog entries
  total (234 keys × 8 locales).

* feat(i18n): add 8 new locales — af, ko, it, ga, zh-hant, pt, ru, hu

Expands the static-message catalog from 8 → 16 languages, each with full
270-key parity against the English source-of-truth.  Every locale now
covers the same surface PR #22914 added: approval prompts plus all 35
gateway slash command outputs.

New locales:
- af  Afrikaans      (community ask in #21961 by @GodsBoy; PRs #21962, #21970)
- ko  Korean         (PRs #20297 by @tmdgusya, #22285 by @project820)
- it  Italian        (PR #20371 by @leprincep35700)
- ga  Irish/Gaeilge  (PR #20962 by @ryanmcc09-dot)
- zh-hant Traditional Chinese (PRs #20523 by @jackey8616, #13140 by @anomixer)
- pt  Portuguese     (PRs #20443 by @pedroborges, #15737 by @carloshenriquecarniatto, #22063 by @Magaav)
- ru  Russian        (PR #22770 by @DrMaks22)
- hu  Hungarian      (PR #22336 by @lunasec007)

Each locale uses native-quality translations matching the existing tone
and conventions of the older 8 locales:
- zh-hant uses 繁體 characters with TW/HK technical vocabulary (軟體
  not 软件, 連線 not 连接, 設定 not 设置, 訊息 not 消息, 工作階段 not 会话, 程式
  not 程序, 預設 not 默认, 伺服器 not 服务器), full-width punctuation 「:()」.
- ko uses formal 합니다체 (습니다/합니다) register throughout.
- pt uses European Portuguese as baseline with neutral PT/BR vocabulary
  where possible.
- ga uses standard An Caighdeán Oifigiúil; English loanwords retained
  for tech terms without good Irish equivalents (gateway, API, JSON).
- All preserve {placeholder} tokens, backtick code spans, slash commands,
  brand names (Hermes, MCP, TTS, YOLO, OpenAI, Telegram, etc.), and emoji.

Aliases added in agent/i18n.py:
- af-za, Afrikaans → af
- ko-kr, Korean, 한국어 → ko
- it-it, italiano → it
- ga-ie, Irish, Gaeilge → ga
- zh-tw, zh-hk, zh-mo, traditional-chinese → zh-hant (note: zh-tw used to
  alias to zh; now aliases to its own zh-hant catalog)
- zh-cn, zh-hans, zh-sg → zh (unchanged from before)
- pt-pt, pt-br, brazilian, portuguese → pt
- ru-ru, Russian, русский → ru
- hu-hu, Magyar → hu

The zh-tw alias re-routing is intentional: previously typing 'zh-TW' got
the Simplified Chinese catalog (wrong vocabulary for Taiwan/HK users).
Now those users get the proper Traditional Chinese catalog.

Validation:
- tests/agent/test_i18n.py: 43/43 (parity contract holds for all 16
  languages × 270 keys = 4320 catalog entries, with matching placeholder
  tokens).
- E2E alias resolution verified for all 19 alias inputs (Afrikaans, ko-KR,
  한국어, italiano, Gaeilge, zh-TW, zh-HK, traditional-chinese, pt-BR,
  brazilian, Magyar, etc.).
- tests/gateway/: 5198 passed (3 pre-existing TTS routing failures
  unrelated to i18n).

Credit to all contributors whose PRs surfaced these language requests.
Their original PRs may now be closed as superseded with credit.

* feat(dashboard-i18n): add 14 web dashboard locales matching the static catalog

Brings the React dashboard (web/src/) up to the same 16-language
coverage the static catalog already has after the previous commits in
this PR. The Translations interface is TypeScript-typed, so every new
locale must provide every key — tsc -b is the parity guard.

Languages added (each is a complete 429-line locale file):
- af  Afrikaans
- ja  Japanese        (PR #22513 by @snuffxxx surfaced this)
- de  German          (PR #21749 by @mag1art)
- es  Spanish         (PR #21749)
- fr  French          (PRs #21749, #10310 by @foXaCe)
- tr  Turkish
- uk  Ukrainian
- ko  Korean          (PRs #21749, #18894 by @ovstng, #22285 by @project820)
- it  Italian
- ga  Irish (Gaeilge)
- zh-hant Traditional Chinese (PR #13140 by @anomixer)
- pt  Portuguese      (PRs #22063 by @Magaav, #22182 by @wesleysimplicio, #15737 by @carloshenriquecarniatto)
- ru  Russian         (PRs #21749, #22770 by @DrMaks22)
- hu  Hungarian       (PR #22336 by @lunasec007)

Each translation covers all 15 namespaces with full key parity vs en.ts,
preserves every {placeholder} token verbatim, keeps identifiers
untranslated (brand names, file paths, cron expressions, code spans),
translates the language.switchTo tooltip into the target language, and
matches existing tone conventions (zh-hant uses TW/HK vocab; ja uses
formal desu/masu; ko uses formal seumnida register; ga uses An
Caighdean Oifigiuil with English loanwords for tech vocab without good
Irish equivalents).

Plumbing:
- web/src/i18n/types.ts: Locale union expanded to all 16 codes.
- web/src/i18n/context.tsx: imports all 16 catalogs; exports
  LOCALE_META (endonym + flag per locale); isLocale() type guard.
- web/src/i18n/index.ts: re-export LOCALE_META.
- web/src/components/LanguageSwitcher.tsx: replaced two-state EN-ZH
  toggle with a click-to-open dropdown listing all 16 languages.

Note: zh-hant.ts exports zhHant (camelCase) since hyphen is invalid in
a JS identifier; the canonical 'zh-hant' string keys it in TRANSLATIONS.

Validation:
- npx tsc -b: 0 errors. Every locale satisfies Translations.
- npm run build (tsc + vite production): green, 2062 modules.
- Each locale file is exactly 429 lines.

Out of scope: plugin dashboards (kanban/achievements ship as prebuilt
bundles with no source in repo); Docusaurus docs (separate surface);
TUI (no i18n yet).

* feat(plugin-i18n): localize achievements + kanban plugin dashboards across all 16 locales

Brings the two shipped plugin dashboards (hermes-achievements, kanban)
under the same i18n umbrella as the core dashboard PR #22914 just
established.  Both bundles now read user-facing strings from the host's
i18n catalog via SDK.useI18n() instead of hardcoded English.

## Approach

Plugin dashboards ship as prebuilt IIFE bundles in
plugins/<name>/dashboard/dist/index.js — no build step, no source in
repo (upstream-authored, vendored as compiled JS).  Earlier contributor
PRs (#22594, #22595, #18747) tried direct edits but didn't actually
wire the bundles to read translations.

This change does the wiring properly:

1.  Each bundle gets a useI18n shim at IIFE scope:
        const useI18n = SDK.useI18n
          || function () { return { t: { kanban: null }, locale: "en" }; };
    Older host SDKs without useI18n still load the bundle and render
    English fallbacks.

2.  A small tx(t, path, fallback, vars) helper resolves dotted keys
    under the plugin's namespace (t.kanban.* or t.achievements.*) and
    interpolates {placeholder} tokens.

3.  Every React component starts with const { t } = useI18n() and
    each user-visible string is wrapped in tx(t, "key", "English fallback").
    Helpers called outside React components (window.prompt callers,
    constants used during init) take t as a parameter.

4.  Top-level constants that were English dictionaries (COLUMN_LABEL,
    COLUMN_HELP, DESTRUCTIVE_TRANSITIONS, DIAGNOSTIC_EVENT_LABELS in
    kanban) become getColumnLabel(t, status)-style functions backed by
    FALLBACK_* dictionaries.

## Translations added

Two new top-level namespaces added to the dashboard's TypeScript-typed
Translations interface:

- achievements: ~70 keys covering the hero, scan banner, achievement
  card, share dialog, stats, filters, and empty states.
- kanban: ~145 keys covering the board, columns (with nested
  columnLabels and columnHelp sub-dicts), card detail panel,
  bulk-actions toolbar, dependency editor, board switcher, and
  diagnostic callouts.

Each key is provided across all 16 supported locales:
en, zh, zh-hant, ja, de, es, fr, tr, uk, af, ko, it, ga, pt, ru, hu.

Total new translation entries: ~3,440 (215 keys × 16 locales).

## What stays English (deliberate)

- API paths, CSS class names, data-* attributes, JSON keys, regex
  strings, URLs, file paths (~/.hermes/kanban.db, boards/_archived/).
- State identifier strings used as lookup keys (triage / todo / ready /
  running / blocked / done / archived) — labels translate, key strings
  don't.
- The PNG share-card text rendered to canvas in the achievements
  ShareDialog (HERMES AGENT watermark, UNLOCKED stamp, tier names) —
  these become part of a globally-shared image and stay English.
- localStorage keys (hermes.kanban.selectedBoard).
- Brand names (Kanban, Hermes, WebSocket, Nous Research).

## Contributor credit

PR #22594 by @02356abc and PR #22595 by @02356abc supplied the
en + zh kanban namespace skeleton (145 keys); used as the en source-
of-truth in this commit and translated to the other 14 locales.

PR #18747 by @laolaoshiren first surfaced the achievements
localization request.

## Validation

- npx tsc -b: 0 errors. All 16 locale .ts files satisfy the
  Translations type with full key parity.
- npm run build (tsc + vite production build): green, 2062 modules,
  1.56MB JS / 95KB CSS, ~2.5s build.
- node --check on both plugin bundles: parse cleanly.
- 126 tx() call sites in kanban, 46 in achievements.

## Out of scope

- TUI (ui-tui/) has no i18n infrastructure yet.
- Docusaurus docs (website/i18n/) — already had zh-Hans; expanding
  is a separate translation workstream (Thai / Korean / Hindi PRs).
2026-05-10 07:14:14 -07:00
Teknium
5aa755e4e6
feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194)
* feat(plugins): host-owned LLM access via ctx.llm

Plugins can now ask the host to run a one-shot chat or structured
completion against the user's active model and auth, without ever
seeing an OAuth token or API key. Closes the gap where plugins that
needed bounded structured inference (receipts, CRM extraction,
support classification) had to either bring their own provider keys
or register a tool the agent had to call.

New surface on PluginContext:
- ctx.llm.complete(messages, ...)
- ctx.llm.complete_structured(instructions, input, json_schema, ...)
- async siblings ctx.llm.acomplete / acomplete_structured

Backed by the existing auxiliary_client.call_llm pipeline — every
provider, fallback chain, vision routing, and timeout policy Hermes
already supports applies automatically.

Trust gate (fail-closed by default):
- plugins.entries.<id>.llm.allow_model_override
- plugins.entries.<id>.llm.allowed_models (allowlist; '*' = any)
- plugins.entries.<id>.llm.allow_agent_id_override
- plugins.entries.<id>.llm.allow_profile_override

Embedded model@profile shorthand goes through the same gate as
explicit profile=, so it can't bypass the auth-profile policy.
Conflicting explicit and embedded profiles fail closed.

Also lands:
- plugins/plugin-llm-example/ — reference plugin that registers
  /receipt-extract, demonstrating image+text structured input,
  jsonschema validation, and the trust-gate config.
- website/docs/developer-guide/plugin-llm-access.md — full API docs.
- 45 unit tests covering trust gates, JSON parsing, schema
  validation, image encoding, async surface, and config loading.

Validation:
- 2628 tests pass in tests/agent/
- E2E: bundled plugin loaded with isolated HERMES_HOME, slash
  command produced parsed JSON via stubbed call_llm
- response_format extra_body wired correctly for both json_object
  and json_schema modes

* docs(plugin-llm): rewrite quickstart and framing

The quickstart now uses a meeting-notes-to-tasks example instead of
a receipt extractor, and the page leads with hook-time / gateway
pre-filter / scheduled-job framing rather than the OpenClaw
KB/support/CRM/finance/migration enumeration that the original
upstream PR used. Receipt example moved to a separate worked
example link so the docs page itself doesn't echo any of the
upstream framing.

Also clarifies where ctx.llm fits in the broader plugin surface
(table comparing register_tool / register_platform / register_hook
/ etc.) and what makes this lane different from auxiliary_client
internals.

No code change.

* docs(plugin-llm): reframe as any LLM call, not just structured output

The original draft leaned heavily on complete_structured() and made
the chat lane (complete() / acomplete()) feel like a footnote.
Restructure so:

- The page title and description say 'any LLM call.'
- The lead shows BOTH a plain chat call (error rewriter) AND a
  structured call (triage scorer) up top.
- Quick start has two complete plugin examples — /tldr (chat) and
  /paste-to-tasks (structured).
- New 'When to use which' table for choosing complete() vs
  complete_structured() vs the async siblings.
- Trust-gate sections explicitly note 'all four methods,' and the
  request-shaping list calls out chat-only fields (messages) and
  structured-only fields (instructions, input, json_schema)
  alongside each other.
- The 'Where this fits' section now says 'for any reason,
  structured or not.'

The receipt-extractor reference plugin still exists under
plugins/plugin-llm-example/ — but the docs page no longer treats
it as the canonical surface example. It's now described as 'a third
worked example, this time with image input.'

No code change.

* feat(plugin-llm): split provider/model into independent explicit kwargs

The first cut accepted a single 'provider/model' slug on every method
and split it internally. That looked clean but broke under live test:
the model-override path tried to use the slug's vendor prefix as a
literal Hermes provider id, which silently switched the user off
their aggregator (e.g. plugin asks for 'openai/gpt-4o-mini' on a user
who routes through OpenRouter — host attempted to call the 'openai'
provider directly, failed because OPENAI_API_KEY wasn't set).

New shape mirrors the host's main config:

  ctx.llm.complete(
      messages=[...],
      provider='openrouter',         # gated, optional
      model='openai/gpt-4o-mini',    # gated, optional
      profile='work',                # gated, optional
      ...
  )

Each is independently gated by its own allow_*_override flag.
Granting model-override does NOT auto-grant provider-override.
Allowlists are now per-axis (allowed_providers, allowed_models)
matched literally against whatever string the plugin sends.

Dropped 'model@profile' embedded-suffix shorthand entirely. Hermes
doesn't use that pattern anywhere else; profile= is its own kwarg.

Live E2E (against real OpenRouter via Teknium's config) confirms:
- zero-config call works
- default-deny blocks each override with a helpful error
- model-only override stays on user's active provider (the bug)
- provider+model override switches cleanly
- allowlist refuses non-listed entries
- structured output round-trip parses + schema-validates

Tests: 49 cases (up from 45); all green. Docs updated to match the
new shape, including a 'most plugins never need this section' callout
on the trust-gate config block.

* fix+cleanup(plugin-llm): real attribution, hook-mode coverage, move example out of core

Three integration fixes for the ctx.llm surface:

1. Attribution bug — result.provider and result.model now reflect
   what call_llm actually used, not placeholder fallbacks ('auto',
   'default'). New _resolve_attribution() helper:

     - explicit overrides win (what the call targeted)
     - response.model wins for the recorded model (provider
       canonicalisation: 'gpt-4o' → 'gpt-4o-2024-08-06' etc.)
     - falls back to _read_main_provider() / _read_main_model()
       when no override is set, so audit logs reflect the user's
       active main provider/model
     - 'auto' / 'default' only when EVERYTHING is empty

   Live verified: zero-config call now records
   provider='openrouter', model='anthropic/claude-4.7-opus-20260416'
   instead of provider='auto', model='default'.

2. Hook-mode coverage — TestHookMode confirms ctx.llm.complete
   works from inside a registered post_tool_call callback. The
   docs page promised hook integration; now there's a test that
   exercises the lazy-import path through the real invoke_hook
   machinery. Two cases: traceback-rewrite hook with conditional
   ctx.llm.complete, and minimal hook regression for the
   sync-hook + sync-llm path.

3. Reference plugin moved out of core. plugins/plugin-llm-example/
   is gone from hermes-agent — it now lives in the new
   NousResearch/hermes-example-plugins companion repo. The docs
   page links there. Hermes' bundled plugins should be plugins
   users actually run; reference / docs-companion plugins live
   externally.

Test count: 56 (up from 49). Wider sweep on tests/hermes_cli/
+ tests/gateway/ + tests/tools/ + tests/agent/ shows 16770
passing; the 12 failures are all pre-existing on origin/main
(verified by stashing this branch's changes and re-running) —
kanban-boards, delegate-task, gateway-restart, tts-routing —
none touch the plugin_llm surface.

* chore(plugins): move all example plugins to companion repo

Reference / docs-companion plugins now live exclusively in
NousResearch/hermes-example-plugins, not bundled with the core repo:

- example-dashboard
- strike-freedom-cockpit

A new fourth example, plugin-llm-async-example, was added to that
repo demonstrating ctx.llm's async surface (acomplete()) with
asyncio.gather() — registers /translate <lang>: <text> which fires
forward translation + sentiment classifier in parallel, then a
back-translation for QA. Live-tested at 2.5s for three real
provider round-trips (would be ~5-6s sequential).

Docs updated:
- developer-guide/plugin-llm-access.md links both sync and async
  examples in the Reference section
- user-guide/features/extending-the-dashboard.md repoints both demo
  sections to the companion repo with corrected install paths
- user-guide/features/built-in-plugins.md drops the two demo rows
- AGENTS.md notes that example plugins live in the companion repo

Net: hermes-agent's plugins/ directory now contains only plugins
users actually run (memory providers, dashboard tabs that ship real
features, the disk-cleanup hook, platform adapters). All four
demo / reference plugins live externally where they can be cloned
on demand instead of inflating the core install.
2026-05-10 07:09:28 -07:00
Teknium
ae4b09ce10 test(security): broaden plugin API auth coverage + correct stale docstring
Follow-up to the previous commit's middleware fix.

- plugins/kanban/dashboard/plugin_api.py: rewrite the "Security note"
  docstring. The previous text said "/api/plugins/ is unauthenticated by
  design" — that's now actively wrong and dangerously misleading. New
  text explains that plugin routes flow through the same session-token
  middleware as core API routes and that --host 0.0.0.0 is safe to use
  on a LAN as a result.

- tests/hermes_cli/test_web_server.py: extend TestPluginAPIAuth to cover
  the surfaces the original PR didn't pin:
  * test_plugin_route_allows_auth now exercises a real plugin path
    (/api/plugins/example/hello) instead of accepting 200 OR 404 from
    a maybe-loaded kanban plugin — the assertion was effectively vacuous.
  * test_plugin_patch_requires_auth + test_plugin_delete_requires_auth
    cover non-GET mutation methods in case a future regression
    whitelists them by accident.
  * test_non_kanban_plugin_route_requires_auth proves the fix is
    plugin-agnostic, not kanban-specific (hits hermes-achievements +
    a non-existent plugin namespace; both 401 before route resolution).
  * test_plugin_websocket_unaffected_by_http_middleware locks in that
    the HTTP middleware change didn't accidentally start gating WS
    upgrades — kanban /events still uses its own ?token= check.
  Plus a cosmetic blank-line cleanup.
2026-05-10 07:04:18 -07:00
Teknium
50f9fee988
feat(gateway): add LINE Messaging API platform plugin (#23197)
* feat(gateway): add LINE Messaging API platform plugin

Adds LINE as a bundled platform plugin under `plugins/platforms/line/`,
synthesized from the strongest pieces of seven open community PRs. The
adapter requires zero core edits — `Platform("line")` is auto-discovered
via the bundled-plugin scan in `gateway/config.py`, and all hooks
(setup, env-enablement, cron delivery, standalone send) are wired
through `register_platform()` kwargs the way IRC and Teams do it.

Highlights merged into one plugin:

- **Reply token preferred, Push fallback.** Try the free reply token
  first (single-use, ~60s TTL); fall back to metered Push when the
  token is absent, expired, or rejected. (PR #21023)
- **Slow-LLM Template Buttons postback.** When the LLM is still running
  past `LINE_SLOW_RESPONSE_THRESHOLD` (default 45s), the adapter burns
  the original reply token to send a "Get answer" button bubble. The
  user taps it to fetch the cached answer via a fresh reply token —
  also free. State machine: PENDING → READY → DELIVERED, ERROR for
  cancelled runs (orphan resolves to `LINE_INTERRUPTED_TEXT` after
  /stop). Set threshold to 0 to disable. (PR #18153)
- **Three-allowlist gating** — separate user / group / room allowlists
  with `LINE_ALLOW_ALL_USERS=true` dev-only escape hatch. (PR #18153)
- **Markdown URL preservation.** Strip bold/italic/code-fence/heading
  markers (LINE renders them literally) but keep `[label](url)` →
  `label (url)` so URLs stay tappable. (PR #18153)
- **System-message bypass** for ` Interrupting`, ` Queued`, etc. —
  busy-acks reach the user as visible bubbles instead of being
  swallowed into the postback cache. (PR #18153)
- **Media via public HTTPS URLs.** LINE doesn't accept binary uploads;
  images/audio/video must be HTTPS-reachable. The adapter serves
  registered tempfiles under `/line/media/<token>/<filename>` from the
  same aiohttp app. Allowed-roots traversal guard covers
  `tempfile.gettempdir()`, `/tmp` (→ `/private/tmp` on macOS), and
  `HERMES_HOME`. `LINE_PUBLIC_URL` overrides URL construction for
  setups behind tunnels/proxies. (PR #8398)
- **5-message-per-call batching.** LINE rejects >5 messages per
  Reply/Push; smart-chunker caps text at 4500 chars per bubble.
- **Inbound dedup** via `webhookEventId` LRU. (PR #21023)
- **Self-message filter** via `/v2/bot/info` userId lookup. (PR #21023)
- **Loading-animation indicator** wired to LINE's `chat/loading/start`
  endpoint, DM-only (LINE rejects it for groups/rooms). (PR #21023)
- **Out-of-process cron delivery** via `_standalone_send`, so
  `deliver: line` cron jobs work even when cron runs detached from
  the gateway.
- **Webhook hardening** — 1 MiB body cap, constant-time HMAC-SHA256
  signature verification, dedup, scoped lock so two profiles can't
  bind the same channel.

Validation
----------

- `scripts/run_tests.sh tests/gateway/test_line_plugin.py` →
  73 passed in 1.05s
- `scripts/run_tests.sh tests/gateway/test_line_plugin.py
  tests/gateway/test_irc_adapter.py
  tests/gateway/test_plugin_platform_interface.py
  tests/gateway/test_platform_registry.py
  tests/gateway/test_config.py` → 193 passed, 7 skipped
- E2E import + register + signature roundtrip + `Platform("line")`
  bundled-plugin discovery verified against current `origin/main`.

Closes the seven open LINE PRs (#18153, #16832, #6676, #21023, #14942,
#14988, #8398) by superseding them with a single plugin-form
implementation that takes the best idea from each.

Co-authored-by: pwlee <32443648+leepoweii@users.noreply.github.com>
Co-authored-by: Jetha Chan <jetha@google.com>
Co-authored-by: Cattia <openclaw@liyangchen.me>
Co-authored-by: perng <charles@perng.com>
Co-authored-by: Soichiro Yoshimura <soichiro0111.dev@gmail.com>
Co-authored-by: David Zhou <77736378+David-0x221Eight@users.noreply.github.com>
Co-authored-by: Yu-ga <74749461+yuga-hashimoto@users.noreply.github.com>

* docs(platforms): document platform-specific slow-LLM UX pattern

Add a 'Platform-Specific Slow-LLM UX' section to the platform-adapter
developer guide covering the _keep_typing override pattern that LINE
uses for its Template Buttons postback flow.

Three subsections:
- Pattern: subclass _keep_typing to layer mid-flight UX (with code)
- Pattern: subclass send to route through a cache instead of sending
- When this pattern is appropriate (vs. always-Push fallback)

Plus a short pointer in gateway/platforms/ADDING_A_PLATFORM.md so
tree-readers find the prose walkthrough on the docsite.

Filed because the LINE plugin (PR #23197) was the first bundled
adapter to need this pattern — every prior plugin (irc, teams,
google_chat) handles slow responses with the default typing-loop and
a regular send_text. Documenting now while the rationale is fresh.

---------

Co-authored-by: pwlee <32443648+leepoweii@users.noreply.github.com>
Co-authored-by: Jetha Chan <jetha@google.com>
Co-authored-by: Cattia <openclaw@liyangchen.me>
Co-authored-by: perng <charles@perng.com>
Co-authored-by: Soichiro Yoshimura <soichiro0111.dev@gmail.com>
Co-authored-by: David Zhou <77736378+David-0x221Eight@users.noreply.github.com>
Co-authored-by: Yu-ga <74749461+yuga-hashimoto@users.noreply.github.com>
2026-05-10 06:40:46 -07:00
Tranquil-Flow
8954537f95 fix(kanban): request default board explicitly (#21819) 2026-05-09 19:31:32 -07:00
Teknium
c7f0aab949
feat(openrouter): wire Pareto Code router with min_coding_score knob (#22838)
Pick openrouter/pareto-code as your model and OpenRouter auto-routes each
request to the cheapest model meeting your coding-quality bar (ranked by
Artificial Analysis). The new openrouter.min_coding_score config key (0.0-1.0,
default 0.65) tunes the floor.

- hermes_cli/models.py: add openrouter/pareto-code to OPENROUTER_MODELS so
  it shows up in the picker with a description
- hermes_cli/config.py: add openrouter.min_coding_score (default 0.65 — lands
  on a mid-tier coder on the current Pareto frontier)
- plugins/model-providers/openrouter: emit extra_body.plugins =
  [{id: pareto-router, min_coding_score: X}] when model is openrouter/pareto-code
  AND the score is a valid float in [0.0, 1.0]
- agent/transports/chat_completions.py: same emission on the legacy flag
  path (when no provider profile is loaded)
- run_agent.py: openrouter_min_coding_score kwarg + storage; plumbed into
  both build_kwargs() invocations and the context-summary extra_body path
- cli.py: read openrouter.min_coding_score once at init, validate float in
  [0,1], pass to AIAgent constructions (CLI + background-task paths)
- cron/scheduler.py, batch_runner.py, tools/delegate_tool.py,
  tui_gateway/server.py: propagate the kwarg (mirrors providers_order
  plumbing — subagents inherit, cron/batch read from config)
- tests: profile-level + transport-level coverage of the model gating,
  unset/empty/out-of-range handling, and the legacy flag path
- docs: new 'OpenRouter Pareto Code Router' section in providers.md

Verified end-to-end against api.openrouter.ai: at score=0.65 we land on a
mid-tier coder, at omission we get the strongest. Score is silently dropped
on any model other than openrouter/pareto-code, so it's safe to leave set.
2026-05-09 14:47:00 -07:00
Teknium
550f6e2efc
perf(teams): defer httpx import to first webhook call (#22831)
Same pattern as the google_chat lazy-load (PR #22681), applied to the
Teams plugin. The bundled `plugins/platforms/teams/adapter.py` did
`import httpx` at module top, which dragged the entire httpx +
httpcore stack into every process that triggered plugin discovery —
including `hermes` invocations that never instantiate the Teams
adapter.

`httpx` is only needed inside one method
(`TeamsMeetingPipeline._write_summary_via_incoming_webhook`), and the
`httpx.AsyncBaseTransport` parameter annotation is already string-only
thanks to the existing `from __future__ import annotations`. Move the
runtime import inside the method.

Measured impact (7-run medians, 9950X3D):
  teams plugin alone:    118 → 89 ms  (-25%)
                         46 → 38 MB   (-17%)
  import cli (full):     unchanged
  import model_tools:    unchanged

The full-CLI numbers are flat because httpx is loaded transitively
from many other modules on that path. The microbench win is the real
signal: 29 ms / 8 MB shaved off any process that touches the teams
plugin without otherwise pulling httpx — primarily future workflows
where the gateway is enabled but Teams is not configured.

Tests: 44/44 `tests/gateway/test_teams.py` pass; 345 across all
plugin-platform suites (teams + qqbot + google_chat). The test file
imports `httpx` itself for the `MockTransport` fixture, which is
correct — tests legitimately use httpx, only the plugin's module-level
import was the issue.
2026-05-09 14:42:12 -07:00
Ninso112
883e11f0a0 fix(openrouter): add x-grok-conv-id header for Grok models to improve prompt cache hit rates (carve-out of #22708)
Pass session_id through to provider profile build_api_kwargs_extras so
the OpenRouter profile can attach an xAI cache-affinity header
(x-grok-conv-id: <session-id>) for x-ai/grok-* models. xAI prompt
cache requires server affinity via this header — without it the cache
is poisoned and Grok prompt-cache hit rates drop dramatically on
multi-turn sessions.

Carve-out of #22708 by Ninso112. The original PR bundled a /diff
slash command, a zsh completion fix (already on main via #22802),
and holographic memory null-guards. This salvage keeps just the
Grok header work — small, targeted, and well-tested. Other
contributors and changes preserved for separate review.

Closes #22705.
2026-05-09 13:38:52 -07:00
Ayman Kamal
13b474c56e fix: send correct resolution param to xAI image generation API
The xAI /v1/images/generations endpoint expects resolution as a
literal string ('1k' or '2k'), not the numeric value ('1024').

- Change _XAI_RESOLUTIONS from a dict mapping to a validation set
- Use the resolution key directly instead of the mapped value
- Fall back to DEFAULT_RESOLUTION on invalid config values

Fixes 422 Unprocessable Entity errors when resolution was sent.
2026-05-09 13:07:46 -07:00
Teknium
8f83046f6c
perf(google_chat): defer heavy google-cloud imports to first adapter use (#22681)
Plugin discovery imports every bundled platform plugin at model_tools
import time. The google_chat adapter unconditionally pulled in
google.cloud.pubsub_v1, googleapiclient, grpc, httplib2, and friends at
module top — about 33 MB RSS and 110 ms wall on every CLI invocation,
even ones that never construct a gateway adapter.

Wrap the heavy imports in _load_google_modules(): an idempotent loader
that rebinds the module-level globals (pubsub_v1, service_account,
HttpError, MediaFileUpload, …) on first call and is invoked from
GoogleChatAdapter.__init__, connect(), and check_google_chat_requirements().

The HttpError = Exception placeholder is preserved for the brief window
before the loader runs, so 'except HttpError as exc:' clauses stay
correct (Python looks up the name at try/except evaluation time, not
at function definition time).

Measured impact on a 9950X3D, 7-run medians:
  import cli:              895 → 787 ms  (-108 ms / -12%)
                           133 → 110 MB  ( -23 MB / -17%)
  import model_tools:      491 → 400 ms  ( -91 ms / -19%)
                            95 →  66 MB  ( -29 MB / -31%)
  google_chat alone:       244 → 132 ms  (-112 ms / -46%)
                            83 →  50 MB  ( -33 MB / -40%)
  hermes chat -q (cold):   177 → 145 MB  ( -32 MB / -18%)

Real-world win lands on every path that imports cli.py: hermes chat,
hermes gateway, cron jobs, batch runs, subagents. Long-lived gateway
processes save ~30 MB resident.

All 157 google_chat tests pass; full gateway suite (5050 tests) green.
2026-05-09 11:07:06 -07:00
GodsBoy
93e25ceb13 feat(plugins): add standalone_sender_fn for out-of-process cron delivery
Plugin platforms (IRC, Teams, Google Chat) currently fail with
`No live adapter for platform '<name>'` when a `deliver=<plugin>` cron
job runs in a separate process from the gateway, even though the
platforms are eligible cron targets via `cron_deliver_env_var` (added
in #21306). Built-in platforms (Telegram, Discord, Slack, etc.) use
direct REST helpers in `tools/send_message_tool.py` so cron can deliver
without holding the gateway in the same process; plugin platforms
historically depended on `_gateway_runner_ref()` which returns `None`
out of process.

This change adds an optional `standalone_sender_fn` field to
`PlatformEntry` so plugins can register an ephemeral send path that
opens its own connection, sends, and closes without needing the live
adapter. The dispatch site in `_send_via_adapter` falls through to the
hook when the gateway runner is unavailable, with a descriptive error
when neither path applies. The hook is optional, so existing plugins
are unaffected.

Reference migrations land in the same change for IRC, Teams, and
Google Chat, exercising the hook across stdlib (asyncio + IRC protocol),
Bot Framework OAuth client_credentials, and Google service-account
flows respectively.

Security hardening on the new code paths:
* IRC: control-character stripping on chat_id and message body to
  block CRLF command injection; bounded nick-collision retries; JOIN
  before PRIVMSG so channels with the default `+n` mode accept the
  delivery.
* Teams: TEAMS_SERVICE_URL validated against an allowlist of known
  Bot Framework hosts (`smba.trafficmanager.net`,
  `smba.infra.gov.teams.microsoft.us`) to block SSRF; chat_id and
  tenant_id constrained to the documented Bot Framework character set;
  per-request timeouts so a slow STS endpoint cannot starve the
  activity POST.
* Google Chat: chat_id and thread_id validated against strict
  resource-name regexes; service-account refresh wrapped in
  `asyncio.wait_for` so a hung token endpoint cannot stall the
  scheduler.

Test coverage: 20 new tests covering happy path, missing-config errors,
network failure modes, and each defensive validation. Existing tests
unchanged. `bash scripts/run_tests.sh tests/tools/test_send_message_tool.py
tests/gateway/test_irc_adapter.py tests/gateway/test_teams.py
tests/gateway/test_google_chat.py` reports 341 passed, 0 regressions.

Documentation: new "Out-of-process cron delivery" section in
website/docs/developer-guide/adding-platform-adapters.md and an entry
in gateway/platforms/ADDING_A_PLATFORM.md naming the hook.
2026-05-09 02:56:29 -07:00
kshitijk4poor
8578f898cb test(google-chat): cover relay-declared sender_type honoring
Adds five regression tests for the Format 3 (Cloud Run relay) envelope
path:

- test_relay_flat_honors_declared_sender_type_bot: BOT sender_type
  propagates to msg['sender']['type'].
- test_relay_flat_defaults_sender_type_human_when_absent: backward
  compat \u2014 missing field still flows as HUMAN.
- test_relay_flat_coerces_unknown_sender_type_to_human: defensive
  coercion \u2014 strip+upper normalizes whitespace/case, anything outside
  {HUMAN, BOT} falls back to HUMAN.
- test_relay_flat_bot_sender_is_filtered_end_to_end: end-to-end
  through _on_pubsub_message \u2014 a relay envelope with sender_type=BOT
  is dropped by the BOT self-filter without dispatch.
- test_relay_flat_human_sender_dispatches: end-to-end negative
  control \u2014 human relay envelopes still reach the agent loop.

Also clarifies the operator contract in the adapter comment: the
relay must forward upstream sender.type as envelope.sender_type,
otherwise bot replies forwarded as HUMAN cannot be distinguished
from genuine humans by this filter.
2026-05-09 02:31:31 -07:00
memosr
c386400040 fix(security): honor relay-declared sender_type in Google Chat adapter to prevent BOT filter bypass 2026-05-09 02:31:31 -07:00
kshitij
2a7047c2ed
fix(sqlite): fall back to journal_mode=DELETE on NFS/SMB/FUSE (#22043)
SQLite's WAL mode requires shared-memory (mmap) coordination and fcntl
byte-range locks that don't reliably work on network filesystems. Upstream
documents this explicitly:
  https://www.sqlite.org/wal.html#sometimes_queries_return_sqlite_busy_in_wal_mode

On NFS / SMB / some FUSE mounts / WSL1, 'PRAGMA journal_mode=WAL' raises
'sqlite3.OperationalError: locking protocol' (SQLITE_PROTOCOL). Before
this change, every feature backed by state.db or kanban.db broke silently:
  - /resume, /title, /history, /branch returned 'Session database not
    available.' with no cause
  - gateway logged the init failure at DEBUG (invisible in errors.log)
  - kanban dispatcher crashed every 60s, driving the known migration race
    (duplicate column name: consecutive_failures, #21708 / #21374)

Changes:
  - hermes_state.apply_wal_with_fallback(): shared helper that tries WAL
    and falls back to DELETE on SQLITE_PROTOCOL-style errors with one
    WARNING explaining why
  - hermes_state.get_last_init_error() + format_session_db_unavailable():
    capture the init failure cause and surface it in user-facing strings
    (with an NFS/SMB pointer for 'locking protocol')
  - hermes_cli/kanban_db.connect(): use the shared helper
  - gateway/run.py: bump SessionDB init failure log DEBUG -> WARNING
    (matches cli.py's existing correct behavior)
  - cli.py (4 sites) + gateway/run.py (5 sites): replace bare
    'Session database not available.' with format_session_db_unavailable()

Tests: 12 new tests in tests/test_hermes_state_wal_fallback.py + 1 new
test in tests/hermes_cli/test_kanban_db.py. Existing suites (state,
kanban, gateway, cli) remain green for all tests unrelated to pre-existing
failures on main.

Evidence: real-world user on NFSv3 mount (172.26.224.200:d2dfac12/home,
local_lock=none) reporting 'Session database not available.' on /resume;
'locking protocol' appears in 4 distinct log entries across backup,
kanban, TUI, and CLI paths in the same session.

closes #22032
2026-05-09 02:09:35 -07:00
kshitij
8fb3e2d63a
fix: always send tenant headers in OpenViking _headers() when account/user are set
OpenViking 0.3.x requires X-OpenViking-Account and X-OpenViking-User headers for ROOT API key requests to tenant-scoped APIs. Previously the `!="default"` guard skipped these headers when account/user were the literal string "default", causing INVALID_ARGUMENT errors.

Remove the `!="default"` guard so headers are sent whenever account/user are truthy. Empty strings are still correctly skipped since `""` is falsy.

Update tests to reflect the new behavior:
- test_viking_client_headers_send_tenant_when_default: asserts "default" headers ARE present
- test_viking_client_headers_send_tenant_when_empty_falls_back_to_default: asserts "default" headers ARE present from constructor fallback

Based on #21775 by @happy5318
2026-05-09 01:53:19 -07:00
helix4u
cacb984732 fix(google-chat): repair setup prompt imports 2026-05-08 16:24:01 -07:00
Teknium
cc38282b04 feat(cross-platform): psutil for PID/process management + Windows footgun checker
## Why

Hermes supports Linux, macOS, and native Windows, but the codebase grew up
POSIX-first and has accumulated patterns that silently break (or worse,
silently kill!) on Windows:

- `os.kill(pid, 0)` as a liveness probe — on Windows this maps to
  CTRL_C_EVENT and broadcasts Ctrl+C to the target's entire console
  process group (bpo-14484, open since 2012).
- `os.killpg` — doesn't exist on Windows at all (AttributeError).
- `os.setsid` / `os.getuid` / `os.geteuid` — same.
- `signal.SIGKILL` / `signal.SIGHUP` / `signal.SIGUSR1` — module-attr
  errors at runtime on Windows.
- `open(path)` / `open(path, "r")` without explicit encoding= — inherits
  the platform default, which is cp1252/mbcs on Windows (UTF-8 on POSIX),
  causing mojibake round-tripping between hosts.
- `wmic` — removed from Windows 10 21H1+.

This commit does three things:

1. Makes `psutil` a core dependency and migrates critical callsites to it.
2. Adds a grep-based CI gate (`scripts/check-windows-footguns.py`) that
   blocks new instances of any of the above patterns.
3. Fixes every existing instance in the codebase so the baseline is clean.

## What changed

### 1. psutil as a core dependency (pyproject.toml)

Added `psutil>=5.9.0,<8` to core deps. psutil is the canonical
cross-platform answer for "is this PID alive" and "kill this process
tree" — its `pid_exists()` uses `OpenProcess + GetExitCodeProcess` on
Windows (NOT a signal call), and its `Process.children(recursive=True)`
+ `.kill()` combo replaces `os.killpg()` portably.

### 2. `gateway/status.py::_pid_exists`

Rewrote to call `psutil.pid_exists()` first, falling back to the
hand-rolled ctypes `OpenProcess + WaitForSingleObject` dance on Windows
(and `os.kill(pid, 0)` on POSIX) only if psutil is somehow missing —
e.g. during the scaffold phase of a fresh install before pip finishes.

### 3. `os.killpg` migration to psutil (7 callsites, 5 files)

- `tools/code_execution_tool.py`
- `tools/process_registry.py`
- `tools/tts_tool.py`
- `tools/environments/local.py` (3 sites kept as-is, suppressed with
  `# windows-footgun: ok` — the pgid semantics psutil can't replicate,
  and the calls are already Windows-guarded at the outer branch)
- `gateway/platforms/whatsapp.py`

### 4. `scripts/check-windows-footguns.py` (NEW, 500 lines)

Grep-based checker with 11 rules covering every Windows cross-platform
footgun we've hit so far:

1. `os.kill(pid, 0)` — the silent killer
2. `os.setsid` without guard
3. `os.killpg` (recommends psutil)
4. `os.getuid` / `os.geteuid` / `os.getgid`
5. `os.fork`
6. `signal.SIGKILL`
7. `signal.SIGHUP/SIGUSR1/SIGUSR2/SIGALRM/SIGCHLD/SIGPIPE/SIGQUIT`
8. `subprocess` shebang script invocation
9. `wmic` without `shutil.which` guard
10. Hardcoded `~/Desktop` (OneDrive trap)
11. `asyncio.add_signal_handler` without try/except
12. `open()` without `encoding=` on text mode

Features:
- Triple-quoted-docstring aware (won't flag prose inside docstrings)
- Trailing-comment aware (won't flag mentions in `# os.kill(pid, 0)` comments)
- Guard-hint aware (skips lines with `hasattr(os, ...)`,
  `shutil.which(...)`, `if platform.system() != 'Windows'`, etc.)
- Inline suppression with `# windows-footgun: ok — <reason>`
- `--list` to print all rules with fixes
- `--all` / `--diff <ref>` / staged-files (default) modes
- Scans 380 files in under 2 seconds

### 5. CI integration

A GitHub Actions workflow that runs the checker on every PR and push is
staged at `/tmp/hermes-stash/windows-footguns.yml` — not included in this
commit because the GH token on the push machine lacks `workflow` scope.
A maintainer with `workflow` permissions should add it as
`.github/workflows/windows-footguns.yml` in a follow-up. Content:

```yaml
name: Windows footgun check
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: {python-version: "3.11"}
      - run: python scripts/check-windows-footguns.py --all
```

### 6. CONTRIBUTING.md — "Cross-Platform Compatibility" expansion

Expanded from 5 to 16 rules, each with message, example, and fix.
Recommends psutil as the preferred API for PID / process-tree operations.

### 7. Baseline cleanup (91 → 0 findings)

- 14 `open()` sites → added `encoding='utf-8'` (internal logs/caches) or
  `encoding='utf-8-sig'` (user-editable files that Notepad may BOM)
- 23 POSIX-only callsites in systemd helpers, pty_bridge, and plugin
  tool subprocess management → annotated with
  `# windows-footgun: ok — <reason>`
- 7 `os.killpg` sites → migrated to psutil (see §3 above)

## Verification

```
$ python scripts/check-windows-footguns.py --all
✓ No Windows footguns found (380 file(s) scanned).

$ python -c "from gateway.status import _pid_exists; import os
> print('self:', _pid_exists(os.getpid())); print('bogus:', _pid_exists(999999))"
self: True
bogus: False
```

Proof-of-repro that `os.kill(pid, 0)` was actually killing processes
before this fix — see commit `1cbe39914` and bpo-14484. This commit
removes the last hand-rolled ctypes path from the hot liveness-check
path and defers to the best-maintained cross-platform answer.
2026-05-08 14:27:40 -07:00
Teknium
324567c936 fix(windows): os.kill(pid, 0) is NOT a no-op on Windows — route through new _pid_exists helper
On Windows, Python's ``os.kill(pid, 0)`` is NOT a no-op. CPython's
implementation (``Modules/posixmodule.c::os_kill_impl``) treats sig=0
as ``CTRL_C_EVENT`` because the two integer values collide at the C
layer, and routes it through ``GenerateConsoleCtrlEvent(0, pid)`` —
which sends a Ctrl+C to the ENTIRE console process group containing
the target PID, not just the PID itself. Any caller that wanted to
check "is PID X alive" via the classic POSIX ``os.kill(pid, 0)``
idiom was silently killing that process (and often unrelated
processes in the same console group) on Windows. Long-standing
Python Windows quirk; see bpo-14484 (open since 2012).

This manifested in Hermes as: every ``hermes gateway status``
invocation would read the gateway's PID from the PID file, call
``os.kill(pid, 0)`` via ``gateway.status.get_running_pid()`` as a
"liveness check", and instantly terminate the gateway it was trying
to report on. No shutdown log, no traceback, no atexit hook fire,
no exit-diag entry — just silent termination of the detached pythonw
process. "Bot answered one message then stopped typing" was the
characteristic end-user symptom because `os.kill(pid, 0)` fires
mid-response-send and kills the gateway between logs.

Reproduction (verified in this branch before the fix):

  $ hermes gateway start       # gateway alive, PID 37520
  $ hermes gateway status      # reports "No gateway process detected"
  $ tasklist /FI "PID eq 37520"  # INFO: No tasks are running
                                 # — gateway terminated silently

Root-cause fix is a new ``gateway.status._pid_exists(pid)`` helper:

- On Windows: Win32 ``OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION |
  SYNCHRONIZE, False, pid)`` + ``WaitForSingleObject(handle, 0)``
  via ctypes. Zero signal delivery, zero console-group side effects.
  Pins ctypes return types to avoid DWORD-vs-signed-int parse bugs
  on WAIT_TIMEOUT (0x102). Distinguishes ERROR_INVALID_PARAMETER
  (PID gone) from ERROR_ACCESS_DENIED (alive but another user).
- On POSIX: the canonical ``os.kill(pid, 0)`` idiom that actually is
  a no-op there.

Then patch every ``os.kill(pid, 0)`` liveness-check callsite to
route through ``_pid_exists`` instead. Total 14 callsites across
11 files; every single one was a latent silent-kill on Windows:

  gateway/run.py:2810      — /restart watcher (inline subprocess)
  gateway/run.py:15195     — --replace wait loop
  gateway/status.py:572    — acquire_gateway_runtime_lock stale check
  gateway/status.py:828    — get_running_pid (THE killer for status)
  gateway/platforms/whatsapp.py:111
  hermes_cli/gateway.py:228, 522, 1012  — gateway-related drain loops
  hermes_cli/kanban_db.py:2826         — _pid_alive was claiming to
                                         be cross-platform but used
                                         os.kill(pid, 0) on Windows
  hermes_cli/main.py:5792        — CLI process-kill polling
  hermes_cli/profiles.py:782     — profile stop wait loop
  plugins/google_meet/process_manager.py:74
  tools/browser_tool.py:1215, 1255  — browser daemon ownership probes
  tools/mcp_tool.py:1255, 3374     — MCP stdio orphan tracking

The watcher source in gateway/run.py:2810 is a multi-line string
that gets spawned as an inline ``python -c "..."`` subprocess, so
it can't import gateway.status. The fix for that callsite inlines
the same ctypes probe directly into the watcher source.

Tested on Windows 10 with the hermes gateway + Telegram bot:
- gateway start → alive
- 5 consecutive ``hermes gateway status`` invocations → gateway
  alive after every one, same PID reported each time (37520, 21952)
- gateway.log shows uninterrupted operation; no spurious shutdown
  entries; cron ticker and kanban dispatcher still running on
  their 60-second cadence
- bot continues answering Telegram messages throughout

Ships alongside an exit-path diagnostic wrapper in
``hermes_cli/gateway.py::run_gateway()`` that captures every way
``asyncio.run(start_gateway(...))`` can return (success, SystemExit,
KeyboardInterrupt, BaseException, atexit) with full traceback to
``logs/gateway-exit-diag.log``. This was used to prove the gateway
was being hard-killed externally (no exit event fired) and should
be kept for future Windows debugging.

Refs: https://bugs.python.org/issue14484
See also: references/windows-subprocess-sigint-storm.md in
the hermes-agent skill.
2026-05-08 14:27:40 -07:00
Dilee
d36ccc29c9 refactor(teams): remove redundant delivery-mode branch 2026-05-08 12:00:09 -07:00
Dilee
397f750bb4 feat(teams): add pipeline outbound delivery via existing adapter 2026-05-08 12:00:09 -07:00