hermes-agent/tests/tools/test_memory_tool_schema.py
Teknium 38c8a9c10f
feat(memory): batch operations for single-turn memory updates (#48507)
The memory tool was strictly one-op-per-call. With the store running near
its char limit by design, a new add that would overflow gets rejected with
'consolidate now, then retry' -- but the model could not consolidate and add
in one call. It had to remove/replace across several turns, then retry the
add, each turn re-sending the whole conversation context. Expensive thrash.

Add an 'operations' array: a list of add/replace/remove ops applied
atomically against the FINAL char budget. The model frees space and adds new
entries in ONE call, even when an add alone would overflow. All-or-nothing:
any bad op aborts the whole batch, nothing written.

Root-cause note: the two agent-level memory interception sites
(agent_runtime_helpers.py, tool_executor.py) silently dropped any param not
in their explicit kwarg list, so 'operations' never reached the handler and
batch calls failed with 'Unknown action None'. Both now pass it through and
bridge each add/replace op to external memory providers.

Also: success response is now terminal (done=true + 'do not repeat' note,
no full-entries echo that invited re-edits); schema rewritten to lead with
the batch mechanism and an explicit one-shot stop rule (2138 -> 1476 chars).

Live-verified: near-full consolidate-and-add went 7 calls -> 1 call,
stable across 3 reps. 103 memory/approval tests + 398 background-review/
run_agent tests green; 6 new batch tests added.
2026-06-18 10:19:33 -07:00

54 lines
2.4 KiB
Python

"""Schema-shape tests for the built-in memory tool.
The memory tool previously used ``allOf: [{if: ..., then: {required: ...}}]``
at the top level of ``parameters`` to hint per-action required fields. That
form was:
1. Ignored by every provider (Chat Completions doesn't honour ``if/then``
on function schemas), so it never actually enforced anything.
2. **Rejected outright by strict backends** — OpenAI's Codex endpoint
(``chatgpt.com/backend-api/codex``, gpt-5.x) returns
``Invalid schema for function 'memory': schema must have type 'object'
and not have 'oneOf'/'anyOf'/'allOf'/'enum'/'not' at the top level``.
We now rely on the runtime handler (``memory_tool()`` in ``tools/memory_tool.py``)
to validate required fields per action and return actionable error messages.
These tests guard the schema against regressing back to a shape strict
backends reject.
"""
import json
from tools.memory_tool import MEMORY_SCHEMA
_FORBIDDEN_TOP_LEVEL_KEYS = ("allOf", "anyOf", "oneOf", "enum", "not")
def test_memory_schema_has_no_forbidden_top_level_combinators():
"""OpenAI's Codex backend rejects these at the top level of parameters."""
params = MEMORY_SCHEMA["parameters"]
for key in _FORBIDDEN_TOP_LEVEL_KEYS:
assert key not in params, (
f"top-level {key!r} in memory tool parameters will break the "
"Codex backend (chatgpt.com/backend-api/codex). Per-action "
"required-field checks belong in the runtime handler, not the schema."
)
def test_memory_schema_is_well_formed():
params = MEMORY_SCHEMA["parameters"]
assert params["type"] == "object"
# Only ``target`` is universally required: ``action`` belongs to the
# single-op shape and is omitted when the batch ``operations`` array is used.
assert params["required"] == ["target"]
# Nested ``enum`` on property values is fine — only top-level is forbidden.
assert params["properties"]["action"]["enum"] == ["add", "replace", "remove"]
assert params["properties"]["target"]["enum"] == ["memory", "user"]
# Batch shape is exposed and its items reuse the same actions.
assert params["properties"]["operations"]["type"] == "array"
assert params["properties"]["operations"]["items"]["properties"]["action"]["enum"] == ["add", "replace", "remove"]
def test_memory_schema_is_json_serializable():
json.dumps(MEMORY_SCHEMA)