hermes-agent/tests/test_fast_safe_load.py
Teknium 980622d0ec
perf(startup): parse config + plugin manifests with libyaml CSafeLoader (#54486)
The startup config/manifest reads used PyYAML's pure-Python SafeLoader,
which is ~8x slower than the libyaml-backed CSafeLoader C extension.
config.yaml is parsed several times during launch (cli config, raw
config, early interface/redaction bridge, logging config) and every
plugin manifest is parsed once — all on the slow path.

Add utils.fast_safe_load (CSafeLoader-preferring, pure-Python fallback,
true drop-in for safe_load) and route the hot startup parse sites
through it: hermes_cli/config.py (config + manifest reads),
hermes_cli/plugins.py (manifest parse), env_loader, cli.load_cli_config,
hermes_logging, and the two pre-config early YAML bridges in main.py.

Behavior is identical (same restricted safe tag set); only speed changes.
safe_load calls on the startup path drop from ~79 to ~0, cutting the
YAML parse cost from ~0.9s to ~0.15s under profiling.
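
The figures above come from the author's profiling run. One rough way to compare the two loaders locally is sketched below. The synthetic document and the helper names `parse_pure_python` and `parse_fast` are hypothetical, and the timings will vary by machine and by whether PyYAML was built against libyaml.

```python
import timeit

import yaml

# Synthetic document for illustration only; it is not the project's config.yaml.
DOC = "\n".join(
    f"key_{i}: {{name: item-{i}, values: [1, 2, 3], enabled: true}}"
    for i in range(200)
)


def parse_pure_python():
    """Parse with the pure-Python SafeLoader (what yaml.safe_load uses)."""
    return yaml.load(DOC, Loader=yaml.SafeLoader)


def parse_fast():
    """Parse with CSafeLoader when available, else fall back to SafeLoader."""
    return yaml.load(DOC, Loader=getattr(yaml, "CSafeLoader", yaml.SafeLoader))


if __name__ == "__main__":
    slow = timeit.timeit(parse_pure_python, number=20)
    fast = timeit.timeit(parse_fast, number=20)
    print(f"SafeLoader:  {slow:.3f}s")
    print(f"fast loader: {fast:.3f}s")
```

If `yaml.CSafeLoader` is missing, both timings measure the same loader, so no speedup should be expected in that environment.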

Adds tests/test_fast_safe_load.py asserting equivalence with safe_load
across input shapes, empty-doc falsiness, C-loader preference, and that
python/object tags are still rejected (safe, not full loader).
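
Two of these invariants matter directly to callers: an empty document must stay falsy, and unsafe tags must still raise. The sketch below illustrates both using PyYAML alone. `read_mapping` and `is_rejected` are hypothetical helpers for illustration, not functions from the repository.

```python
import yaml

# Prefer the C loader when PyYAML exposes it; both options are safe loaders.
_LOADER = getattr(yaml, "CSafeLoader", yaml.SafeLoader)


def read_mapping(text):
    """Hypothetical caller: parse text, treating an empty document as {}."""
    return yaml.load(text, Loader=_LOADER) or {}


def is_rejected(text):
    """Return True when the safe loader refuses to construct the document."""
    try:
        yaml.load(text, Loader=_LOADER)
    except yaml.YAMLError:
        return True
    return False
```

The `or {}` idiom only works because an empty or comment-only document parses to `None`, which is why the test suite pins that behavior.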
2026-06-28 15:38:39 -07:00

62 lines
2.1 KiB
Python

"""Invariants for utils.fast_safe_load.
fast_safe_load is a drop-in for yaml.safe_load that prefers the libyaml
CSafeLoader C extension for speed. These tests assert the behavior contract
(it parses identically to safe_load across input shapes), not a snapshot of
any particular document.
"""

import io

import yaml

from utils import fast_safe_load, _get_fast_yaml_loader

_DOCS = [
    "",  # empty document -> None
    "a: 1\nb: two\nc: 3.5\n",
    "list: [1, 2, 3]\nnested:\n k: v\n flag: true\n empty: null\n",
    "name: skill-x\nmetadata:\n hermes:\n tags: [alpha, beta]\n category: devops\n",
    "- one\n- two\n- three\n",  # top-level sequence
    "scalar string",  # bare scalar
]


def test_equivalent_to_safe_load_for_strings():
    for doc in _DOCS:
        assert fast_safe_load(doc) == yaml.safe_load(doc), repr(doc)


def test_equivalent_to_safe_load_for_file_objects():
    for doc in _DOCS:
        assert fast_safe_load(io.StringIO(doc)) == yaml.safe_load(io.StringIO(doc)), repr(doc)


def test_empty_document_returns_none():
    # Callers rely on ``fast_safe_load(...) or {}``, so empty must be falsy.
    assert fast_safe_load("") is None


def test_prefers_c_loader_when_available():
    loader = _get_fast_yaml_loader()
    # If libyaml is compiled in, we must be using the C loader; otherwise the
    # pure-Python SafeLoader is an acceptable fallback. Either way it must be a
    # safe loader (never the unsafe full Loader).
    c_loader = getattr(yaml, "CSafeLoader", None)
    if c_loader is not None:
        assert loader is c_loader
    else:
        assert loader is yaml.SafeLoader


def test_rejects_arbitrary_python_objects_like_safe_load():
    # Safe loaders must not construct arbitrary Python objects. This tag is
    # accepted by the unsafe Loader but rejected by Safe/CSafe loaders.
    dangerous = "!!python/object/apply:os.system ['echo pwned']\n"
    try:
        fast_safe_load(dangerous)
        raised = False
    except yaml.YAMLError:
        raised = True
    assert raised, "fast_safe_load must reject python/object tags like safe_load"