fix: prevent infinite 400 loop on context overflow + block prompt injection via cache files (#1630, #1558)

* fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. --------- Co-authored-by: buray <ygd58@users.noreply.github.com>
2026-04-25 00:51:20 +00:00 · 2026-03-17 01:50:59 -07:00 · 2026-03-17 01:50:59 -07:00 · 96dac22194
commit 96dac22194
parent 2d36819503
5 changed files with 436 additions and 32 deletions
--- a/tools/file_tools.py
+++ b/tools/file_tools.py
@ -169,6 +169,27 @@ def clear_file_ops_cache(task_id: str = None):
 def read_file_tool(path: str, offset: int = 1, limit: int = 500, task_id: str = "default") -> str:
    """Read a file with pagination and line numbers."""
    try:
+        # Security: block direct reads of internal Hermes cache/index files
+        # to prevent prompt injection via catalog or hub metadata files.
+        import pathlib as _pathlib
+        _resolved = _pathlib.Path(path).expanduser().resolve()
+        _hermes_home = _pathlib.Path("~/.hermes").expanduser().resolve()
+        _blocked_dirs = [
+            _hermes_home / "skills" / ".hub" / "index-cache",
+            _hermes_home / "skills" / ".hub",
+        ]
+        for _blocked in _blocked_dirs:
+            try:
+                _resolved.relative_to(_blocked)
+                return json.dumps({
+                    "error": (
+                        f"Access denied: {path} is an internal Hermes cache file "
+                        "and cannot be read directly to prevent prompt injection. "
+                        "Use the skills_list or skill_view tools instead."
+                    )
+                })
+            except ValueError:
+                pass
        file_ops = _get_file_ops(task_id)
        result = file_ops.read_file(path, offset, limit)
        if result.content: