mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
feat(hooks): add duration_ms to post_tool_call + transform_tool_result
Plugin hooks fired after a tool dispatch now receive an integer
duration_ms kwarg measuring how long the tool's registry.dispatch()
call took (timed with time.monotonic() before and after). Inspired by
Claude Code 2.1.119, which added the same field to PostToolUse hook inputs.
Wire points:
- model_tools.py: measure dispatch latency, pass duration_ms to
invoke_hook("post_tool_call", ...) and invoke_hook("transform_tool_result", ...)
- hermes_cli/hooks.py: include duration_ms in the synthetic payload
used by 'hermes hooks test' and 'hermes hooks doctor' so shell-hook
authors see the same shape at development time as runtime
- shell hooks (agent/shell_hooks.py): no code change needed;
_serialize_payload already surfaces non-top-level kwargs under
payload['extra'], so duration_ms lands at extra.duration_ms for
shell-hook scripts
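The dispatch-side wiring described above can be sketched as follows. This is a hypothetical stand-in for the model_tools.py internals: the function name dispatch_with_hooks, the invoke_hook signature, and the treatment of transform_tool_result's return value are illustrative assumptions, not the real code.

```python
import time

def dispatch_with_hooks(registry, plugins, tool_name, args, task_id=""):
    # Time only registry.dispatch() itself, using a monotonic clock so
    # wall-clock adjustments cannot skew the measurement.
    start = time.monotonic()
    result = registry.dispatch(tool_name, args)
    duration_ms = int((time.monotonic() - start) * 1000)  # non-negative int

    # Pass the same measurement to both post-dispatch hooks.
    plugins.invoke_hook("post_tool_call", tool_name=tool_name, args=args,
                        result=result, task_id=task_id,
                        duration_ms=duration_ms)
    # Assumption: transform_tool_result may return a replacement result,
    # and None means "leave the result unchanged".
    transformed = plugins.invoke_hook("transform_tool_result",
                                      tool_name=tool_name, args=args,
                                      result=result, task_id=task_id,
                                      duration_ms=duration_ms)
    return transformed if transformed is not None else result
```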
Plugin authors can now build latency dashboards, per-tool SLO alerts,
and regression canaries without having to wrap every tool manually.
Test: tests/test_model_tools.py::test_post_tool_call_receives_non_negative_integer_duration_ms
E2E: real PluginManager + dispatch monkey-patched with a 50ms sleep,
hook callback observes duration_ms=50 (int).
Refs: https://code.claude.com/docs/en/changelog (2.1.119, Apr 23 2026)
parent 13038dc747
commit 0f82c757e0
5 changed files with 52 additions and 6 deletions
@@ -317,7 +317,8 @@ Fires **immediately after** every tool execution returns.
 **Callback signature:**
 
 ```python
-def my_callback(tool_name: str, args: dict, result: str, task_id: str, **kwargs):
+def my_callback(tool_name: str, args: dict, result: str, task_id: str,
+                duration_ms: int, **kwargs):
 ```
 
 | Parameter | Type | Description |
@@ -326,24 +327,27 @@ def my_callback(tool_name: str, args: dict, result: str, task_id: str, **kwargs)
 | `args` | `dict` | The arguments the model passed to the tool |
 | `result` | `str` | The tool's return value (always a JSON string) |
 | `task_id` | `str` | Session/task identifier. Empty string if not set. |
+| `duration_ms` | `int` | How long the tool's dispatch took, in milliseconds (measured with `time.monotonic()` around `registry.dispatch()`). |
 
 **Fires:** In `model_tools.py`, inside `handle_function_call()`, after the tool's handler returns. Fires once per tool call. Does **not** fire if the tool raised an unhandled exception (the error is caught and returned as an error JSON string instead, and `post_tool_call` fires with that error string as `result`).
 
 **Return value:** Ignored.
 
-**Use cases:** Logging tool results, metrics collection, tracking tool success/failure rates, sending notifications when specific tools complete.
+**Use cases:** Logging tool results, metrics collection, tracking tool success/failure rates, latency dashboards, per-tool budget alerts, sending notifications when specific tools complete.
 
 **Example — track tool usage metrics:**
 
 ```python
-from collections import Counter
+from collections import Counter, defaultdict
 import json
 
 _tool_counts = Counter()
 _error_counts = Counter()
+_latency_ms = defaultdict(list)
 
-def track_metrics(tool_name, result, **kwargs):
+def track_metrics(tool_name, result, duration_ms=0, **kwargs):
     _tool_counts[tool_name] += 1
+    _latency_ms[tool_name].append(duration_ms)
     try:
         parsed = json.loads(result)
         if "error" in parsed:
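For shell-hook authors, the payload shape the commit message describes (extra kwargs such as duration_ms surfacing under payload['extra']) can be sketched roughly like this. The serializer name and the exact top-level key set are assumptions about agent/shell_hooks.py, not its actual code.

```python
import json

def serialize_payload_sketch(tool_name, args, result, task_id="", **kwargs):
    # Hypothetical stand-in for _serialize_payload: well-known arguments
    # get top-level keys, and any remaining kwargs (e.g. duration_ms) are
    # grouped under "extra", so scripts read extra.duration_ms.
    payload = {"tool_name": tool_name, "args": args,
               "result": result, "task_id": task_id}
    if kwargs:
        payload["extra"] = dict(kwargs)
    return json.dumps(payload)
```

A shell hook receiving this JSON on stdin could then pull the field out with, for example, `jq .extra.duration_ms`.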