hermes-agent/scripts/ci/classify_changes.py
ethernet 05c896cf52 ci: refactor paths & clones
ci: centralize path-gating behind single orchestrator + all-checks-pass
gate

Replace the scattered per-workflow detect-changes pattern with a single
ci.yml orchestrator that runs the classifier once, then conditionally
calls sub-workflows via workflow_call based on lane outputs. A final
all-checks-pass job (if: always()) aggregates all results so branch
protection only needs to require one check.

Changes:
- New .github/workflows/ci.yml orchestrator (detect + conditional calls
  + all-checks-pass gate)
- Extend classify_changes.py with scan/deps/mcp_catalog lanes, absorbing
  supply-chain-audit's internal changes job
- Update detect-changes/action.yml to expose the new lane outputs
- Convert all 10 PR-gated sub-workflows to workflow_call-only triggers,
  removing their push/pull_request triggers and per-step detect-changes
  guards (gating now happens at the orchestrator level)
- lint.yml + supply-chain-audit.yml receive event_name as a
workflow_call
  input to replace github.event_name (which is "workflow_call" inside
  called workflows)
- supply-chain-audit.yml: remove internal changes job + *-gate jobs
  (orchestrator handles gating, booleans arrive as inputs)
- contributor-check.yml: remove internal filter step
- Update test_classify_changes.py for 6-lane output + new supply-chain
  test cases
2026-06-23 09:30:50 -07:00

99 lines
3.8 KiB
Python

#!/usr/bin/env python3
"""Classify a PR's changed files into CI work lanes.
Reads newline-separated changed paths on stdin and writes ``key=value``
booleans (one per lane) to ``$GITHUB_OUTPUT`` and stdout. The
``detect-changes`` composite action consumes them so steps gate on
``if: steps.changes.outputs.<lane> == 'true'``.
Lanes:
* ``python`` — pytest / ruff / ty / footguns.
* ``docker_meta`` — Dockerfiles etc.
* ``frontend`` — TS typecheck matrix + desktop build.
* ``site`` — Docusaurus + generated skill docs.
* ``scan`` — supply-chain scan (Python files, .pth, setup hooks).
* ``deps`` — pyproject.toml dependency bounds check.
* ``mcp_catalog`` — bundled MCP catalog / installer review.
Docker is not a lane — it builds on push-to-main and release only,
never per-PR.
Contract — *fail open, never closed*. We may run a lane we didn't need, but
must never skip one a change could break:
* An empty diff, or any ``.github/`` change, runs everything.
* ``python`` is a denylist: skipped only when *every* file is provably prose
or a frontend-only package; an unrecognized path keeps it on.
* ``skills/`` (incl. ``SKILL.md``) is python-relevant — the skill-doc tests
read that tree, so a doc-looking edit can still break Python.
"""
from __future__ import annotations
import os
import sys
_FRONTEND = ("ui-tui/", "web/", "apps/") # TS typecheck-matrix packages
_ROOT_NPM = {"package.json", "package-lock.json"} # shifts every package's tree
_DOCKER_META = ("docker/", ".hadolint.yml", "Dockerfile") # docker setup
_SITE = ("website/", "skills/", "optional-skills/") # docs site + skill pages
# Prose/frontend trees that can't touch Python. skills/ is excluded on purpose.
_PY_SKIP = ("docs/", "website/") + _FRONTEND
# Supply-chain scan: files that can execute code at install/import time.
_SCAN_EXTS = (".py", ".pth")
_SCAN_FILES = {"setup.cfg", "pyproject.toml"}
# MCP catalog files that require explicit security review.
_MCP_CATALOG_PATHS = ("optional-mcps/",)
_MCP_CATALOG_FILES = {"hermes_cli/mcp_catalog.py"}
def _is_docs(p: str) -> bool:
if p.startswith(("skills/", "optional-skills/")):
return False
return p.endswith((".md", ".mdx")) or p.startswith("docs/") or p.startswith("LICENSE")
def _py_irrelevant(p: str) -> bool:
return _is_docs(p) or p in _ROOT_NPM or p.startswith(_PY_SKIP) or p.startswith(_DOCKER_META)
def _is_scan(p: str) -> bool:
return p.endswith(_SCAN_EXTS) or p in _SCAN_FILES
def _is_mcp_catalog(p: str) -> bool:
return p.startswith(_MCP_CATALOG_PATHS) or p in _MCP_CATALOG_FILES
def classify(files: list[str]) -> dict[str, bool]:
"""Map changed paths to ``{lane: should_run}``."""
files = [f.strip() for f in files if f.strip()]
if not files or any(f.startswith(".github/") for f in files):
return dict.fromkeys(
("python", "docker_meta", "frontend", "site", "scan", "deps", "mcp_catalog"), True
)
return {
"python": any(not _py_irrelevant(f) for f in files),
"docker_meta": any(f.startswith(_DOCKER_META) for f in files),
"frontend": any(f.startswith(_FRONTEND) or f in _ROOT_NPM for f in files),
"site": any(f.startswith(_SITE) for f in files),
"scan": any(_is_scan(f) for f in files),
"deps": any(f == "pyproject.toml" for f in files),
"mcp_catalog": any(_is_mcp_catalog(f) for f in files),
}
def main() -> int:
lanes = classify(sys.stdin.read().splitlines())
out = "\n".join(f"{k}={str(v).lower()}" for k, v in lanes.items())
if dest := os.environ.get("GITHUB_OUTPUT"):
with open(dest, "a", encoding="utf-8") as fh:
fh.write(out + "\n")
print(out) # echo for local runs + CI step logs
return 0
if __name__ == "__main__":
raise SystemExit(main())