fix(middleware): preserve translated downstream failures

Track successful next_call completion separately from invocation so
execution middleware that catches and translates a downstream
provider/tool failure does not accidentally convert that failure into a
successful None result.

Also avoid wrapping BaseException from downstream execution, and document
the execution middleware error semantics.

Tests cover:
- pre-next_call middleware failures fail open to the remaining chain
- post-next_call middleware failures preserve the downstream result
- translated downstream failures propagate instead of returning None
- downstream BaseException is not wrapped

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
2026-06-09 08:21:50 +00:00 · 2026-06-06 09:26:18 -07:00 · 5abe45674d
commit 5abe45674d
parent 2e0c9083db
3 changed files with 81 additions and 3 deletions
--- a/docs/middleware/README.md
+++ b/docs/middleware/README.md
@@ -244,6 +244,15 @@ For NeMo Relay adaptive execution middleware, see
  patches.
 - Execution middleware should call `next_call(...)` exactly once unless it is
  intentionally short-circuiting execution.
+- If execution middleware raises before calling `next_call(...)`, Hermes treats
+  that as middleware failure and continues with the remaining middleware chain
+  and base execution.
+- If execution middleware calls `next_call(...)` successfully and then raises
+  during post-processing, Hermes preserves the downstream result and does not
+  run the provider or tool a second time.
+- If downstream provider or tool execution fails, middleware may let that error
+  propagate or translate it deliberately. Hermes does not convert downstream
+  failure into a successful `None` result.
 - Tool request middleware runs before approvals. If it mutates file paths,
  commands, URLs, or arguments, the mutated values are what guardrails and
  approvals evaluate.
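
The three execution middleware rules added in this hunk can be illustrated with a small chain runner. This is a minimal sketch, not the Hermes implementation: `run_chain`, the `(request, next_call)` middleware signature, and the `state` dictionary are all hypothetical names chosen for illustration. The point it shows is the one the commit message makes: recording that `next_call` *completed* separately from recording that it was *invoked* is what lets the runner tell a post-processing failure apart from a translated downstream failure.

```python
from typing import Any, Callable, Sequence

# Hypothetical middleware shape: receives the request and a next_call callable.
Middleware = Callable[[Any, Callable[[Any], Any]], Any]


def run_chain(
    middlewares: Sequence[Middleware],
    base: Callable[[Any], Any],
    request: Any,
) -> Any:
    """Run middlewares around base, following the documented error rules."""

    def call_from(index: int, req: Any) -> Any:
        if index == len(middlewares):
            return base(req)  # provider or tool execution

        # Track invocation and successful completion as separate facts.
        state = {"invoked": False, "completed": False, "result": None}

        def next_call(next_req: Any) -> Any:
            state["invoked"] = True
            result = call_from(index + 1, next_req)  # may raise
            state["completed"] = True
            state["result"] = result
            return result

        try:
            return middlewares[index](req, next_call)
        except Exception:
            # BaseException (e.g. KeyboardInterrupt) is not caught here,
            # so it propagates unwrapped.
            if state["completed"]:
                # Post-processing failed: keep the downstream result and
                # do not run the provider or tool a second time.
                return state["result"]
            if state["invoked"]:
                # Downstream failed (possibly translated by the middleware):
                # propagate instead of returning None.
                raise
            # Middleware failed before next_call: fail open to the
            # remaining chain and base execution.
            return call_from(index + 1, req)

    return call_from(0, request)
```

With only an `invoked` flag, the second and third branches would be indistinguishable, and a middleware that re-raised a translated provider error could be mistaken for one that failed during post-processing, yielding the stored default of `None` as if it were a real result.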