* fix(api-server): stop silently promising async delivery on stateless HTTP path
terminal(notify_on_complete=True / watch_patterns) and delegate_task(background=True)
silently no-op'd on the API server / WebUI path (#10760): the watcher / detached
child registered, but every API-server route (OpenAI-spec /v1/chat/completions
and /v1/responses, plus the proprietary /v1/runs SSE stream) tears down its
channel when the turn ends, and APIServerAdapter.send() is a no-op stub. A
completion that fires after the response closed had nowhere to go — from the
agent side, indistinguishable from a hang.
There is no spec-compliant surface to wake the agent later on a stateless HTTP
client, so make the no-op honest instead of silent:
- Add a per-adapter capability flag supports_async_delivery (default True;
APIServerAdapter = False), propagated into a HERMES_SESSION_ASYNC_DELIVERY
contextvar via async_delivery_supported(). Toggle on the adapter, not a
hardcoded platform string — a future stateless adapter is correct-by-default.
- terminal: when delivery is unsupported, skip watcher registration, force
notify_on_complete off, and return a notify_unsupported note telling the
agent to process(action='poll').
- delegate_task: when delivery is unsupported, fall back to SYNCHRONOUS
execution (work runs and returns in the same response) with a note, instead
of handing out a handle that never resolves.
CLI (in-process completion_queue) and the real gateway platforms are unchanged.
Fixes#10760
* refactor(api-server): route session binding through a single no-delivery chokepoint
Add APIServerAdapter._bind_api_server_session() and route both agent-entry
paths (_run_agent for /v1/chat/completions + /v1/responses, and the /v1/runs
_run_sync path) through it. The helper hardwires platform="api_server" and
async_delivery=False with no async_delivery parameter to pass, so a future
route added to the API server physically cannot reintroduce the silent
no-op (#10760) by forgetting to mark the channel as non-delivering.
The binding stays request-scoped (cleared per turn), so a session resumed
later on a delivering interface (CLI / gateway platform) re-binds fresh and
is NOT blocked — the no-delivery decision tracks the interface handling the
current turn, never the session.