fix(redact): stop DB-connstr redaction from corrupting code output (#33801) (#54061)

Secret redaction is display/output-scoped on main — write_file writes content verbatim, terminal/execute_code redact only output not the command/source. The real bug is in displayed tool OUTPUT (read_file, terminal, execute_code): _DB_CONNSTR_RE's password group [^@]+ was greedy across newlines, so on a multi-line block it scanned past the DSN line to the next stray '@' (a Python @decorator), replacing every intervening character — including line breaks — with ***. That dropped lines and concatenated the next line onto the f-string line, making read_file output look corrupted (the file on disk was always correct). Reported in #33801. Fix: - Forbid whitespace in the userinfo/password groups ([^:\s]+ / [^@\s]+) so the match can never span a line break. A real DSN password never contains whitespace. This alone kills the catastrophic line-dropping. - Under code_file=True, preserve a password group that is a pure {...} brace expression — f"postgresql://{user}:{pass}@{host}" is an f-string template, not a live credential. Literal passwords are still masked. - Pass code_file=True at the terminal and execute_code output redaction call sites (file_tools already did) so code-execution output isn't corrupted by ENV/JSON/template false positives. Real prefixes, auth headers, JWTs, and private keys are still redacted. Verified E2E against the reporter's exact pydantic-settings module: file written verbatim, read_file shows the DSN f-string + @model_validator intact with zero *** corruption, while a literal postgresql://admin:pw@host DSN and a real sk- key are still masked. Reported-by: koishi70 Reported-by: pfrenssen
2026-07-01 12:02:05 +00:00 · 2026-06-28 01:15:39 -07:00 · 2026-06-28 01:15:39 -07:00 · 674e16e7c6
commit 674e16e7c6
parent de6e9ac760
5 changed files with 119 additions and 11 deletions
--- a/tools/code_execution_tool.py
+++ b/tools/code_execution_tool.py
@ -1031,9 +1031,11 @@ def _execute_remote(
    from tools.ansi_strip import strip_ansi
    stdout_text = strip_ansi(stdout_text)

-    # Redact secrets
+    # Redact secrets. code_file=True: execute_code output is code-execution
+    # output that often echoes source/config — skip false-positive ENV/JSON/
+    # f-string-template redaction while still masking real credentials.
    from agent.redact import redact_sensitive_text
-    stdout_text = redact_sensitive_text(stdout_text)
+    stdout_text = redact_sensitive_text(stdout_text, code_file=True)

    # Build response
    result: Dict[str, Any] = {
@ -1441,9 +1443,11 @@ def execute_code(
        # The sandbox env-var filter (lines 434-454) blocks os.environ access,
        # but scripts can still read secrets from disk (e.g. open('~/.hermes/.env')).
        # This ensures leaked secrets never enter the model context.
+        # code_file=True: this is code-execution output — skip false-positive
+        # ENV/JSON/f-string-template redaction; real credentials still masked.
        from agent.redact import redact_sensitive_text
-        stdout_text = redact_sensitive_text(stdout_text)
-        stderr_text = redact_sensitive_text(stderr_text)
+        stdout_text = redact_sensitive_text(stdout_text, code_file=True)
+        stderr_text = redact_sensitive_text(stderr_text, code_file=True)

        # Build response
        result: Dict[str, Any] = {