fix(middleware): preserve translated downstream failures

Track successful next_call completion separately from invocation so execution
  middleware that catches and translates a downstream provider/tool failure does
  not accidentally convert that failure into a successful None result.

  Also avoid wrapping BaseException from downstream execution, and document the
  execution middleware error semantics.

  Tests cover:
  - pre-next_call middleware failures fail open to the remaining chain
  - post-next_call middleware failures preserve the downstream result
  - translated downstream failures propagate instead of returning None
  - downstream BaseException is not wrapped

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Bryan Bednarski 2026-06-06 09:26:18 -07:00
parent 2e0c9083db
commit 5abe45674d
3 changed files with 81 additions and 3 deletions

@@ -244,6 +244,15 @@ For NeMo Relay adaptive execution middleware, see
patches.
- Execution middleware should call `next_call(...)` exactly once unless it is
intentionally short-circuiting execution.
- If execution middleware raises before calling `next_call(...)`, Hermes treats
that as middleware failure and continues with the remaining middleware chain
and base execution.
- If execution middleware calls `next_call(...)` successfully and then raises
during post-processing, Hermes preserves the downstream result and does not
run the provider or tool a second time.
- If downstream provider or tool execution fails, middleware may let that error
propagate or translate it deliberately. Hermes does not convert downstream
failure into a successful `None` result.
- Tool request middleware runs before approvals. If it mutates file paths,
commands, URLs, or arguments, the mutated values are what guardrails and
approvals evaluate.
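
The execution middleware error semantics documented in this hunk can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the dispatcher Hermes actually ships: `run_chain`, the `(request, next_call)` middleware signature, and the `state` flags are hypothetical names chosen for the example. The point it shows is the one the commit message makes: tracking whether `next_call` was invoked separately from whether it completed.

```python
from typing import Any, Callable, Sequence

# Hypothetical types: the real Hermes middleware signature may differ.
NextCall = Callable[[Any], Any]
Middleware = Callable[[Any, NextCall], Any]


def run_chain(
    middlewares: Sequence[Middleware], base_call: NextCall, request: Any
) -> Any:
    """Run `request` through the middleware chain, then `base_call`."""
    if not middlewares:
        return base_call(request)

    head, rest = middlewares[0], middlewares[1:]
    # "invoked" and "completed" are tracked separately on purpose.
    state = {"invoked": False, "completed": False, "result": None}

    def next_call(req: Any) -> Any:
        state["invoked"] = True
        result = run_chain(rest, base_call, req)  # may raise
        state["completed"] = True
        state["result"] = result
        return result

    try:
        return head(request, next_call)
    except Exception:
        # BaseException (e.g. KeyboardInterrupt) is not caught here,
        # so it propagates unwrapped.
        if state["completed"]:
            # Post-processing failed: keep the downstream result and
            # do not run the provider or tool a second time.
            return state["result"]
        if state["invoked"]:
            # Downstream failed and the middleware propagated or
            # translated the error: re-raise instead of returning None.
            raise
        # Middleware failed before next_call: fail open to the rest
        # of the chain and the base execution.
        return run_chain(rest, base_call, request)


class TranslatedError(Exception):
    """Hypothetical error type a middleware might translate into."""


def translating_middleware(request: Any, next_call: NextCall) -> Any:
    # Deliberately translates a downstream failure into another error.
    try:
        return next_call(request)
    except RuntimeError as exc:
        raise TranslatedError(str(exc)) from exc
```

In this sketch, a middleware that raises before calling `next_call` is skipped, one that raises after a successful `next_call` yields the downstream result, and `translating_middleware` placed over a failing base call surfaces `TranslatedError` to the caller.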