fix(redact): stop DB-connstr redaction from corrupting code output (#33801) (#54061)

Secret redaction on main is scoped to displayed output: write_file writes
content verbatim, and terminal/execute_code redact only their output, not
the command/source. The real bug is in displayed tool OUTPUT (read_file,
terminal, execute_code):

_DB_CONNSTR_RE's password group [^@]+ was greedy across newlines, so on a
multi-line block it scanned past the DSN line to the next stray '@' (a
Python @decorator), replacing every intervening character — including line
breaks — with ***. That dropped lines and concatenated the next line onto
the f-string line, making read_file output look corrupted (the file on disk
was always correct). Reported in #33801.
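
A minimal reproduction of the failure mode, as a sketch only. OLD_DSN_RE is
a hypothetical, simplified stand-in for the old _DB_CONNSTR_RE (the real
pattern is not shown here and likely covers more schemes), and the source
snippet is illustrative, not the reporter's module:

```python
import re

# Hypothetical, simplified stand-in for the old pattern; the real
# _DB_CONNSTR_RE may be written differently.
OLD_DSN_RE = re.compile(r"(postgresql://[^:]+:)([^@]+)(@)")

# Illustrative source: the DSN line has no '@', the decorator below it does.
source = (
    'url = f"postgresql://{host}:{port}/{db}"\n'
    "\n"
    "@model_validator(mode='after')\n"
    "def check(self):\n"
    "    return self\n"
)

# [^@]+ also matches '\n', so the "password" runs on until the decorator's '@'.
redacted = OLD_DSN_RE.sub(r"\1***\3", source)
print(redacted)
```

In this sketch the rest of the DSN line and the blank line are swallowed by
the password group, so the decorator ends up glued onto the f-string line.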

Fix:
- Forbid whitespace in the userinfo/password groups ([^:\s]+ / [^@\s]+) so
  the match can never span a line break. A real DSN password never contains
  whitespace. This alone kills the catastrophic line-dropping.
- Under code_file=True, preserve a password group that is a pure {...} brace
  expression — f"postgresql://{user}:{pass}@{host}" is an f-string template,
  not a live credential. Literal passwords are still masked.
- Pass code_file=True at the terminal and execute_code output redaction call
  sites (file_tools already did) so code-execution output isn't corrupted by
  ENV/JSON/template false positives. Known key prefixes (such as sk-), auth
  headers, JWTs, and private keys are still redacted.
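
The first two bullets can be sketched roughly as follows. This is an
assumption-laden illustration, not the code in agent/redact.py: DSN_RE,
BRACE_ONLY_RE and redact_dsn are hypothetical names, and only the
postgresql:// scheme is handled:

```python
import re

# Hypothetical simplified pattern: whitespace is forbidden in both the
# userinfo and password groups, so a match cannot cross a line break.
DSN_RE = re.compile(r"(postgresql://[^:\s]+:)([^@\s]+)(@)")

# A password that is exactly one {...} expression is treated as a template.
BRACE_ONLY_RE = re.compile(r"\{[^{}]*\}")


def redact_dsn(text: str, code_file: bool = False) -> str:
    """Mask DSN passwords; keep pure {...} placeholders when code_file is set."""

    def _mask(match: re.Match) -> str:
        password = match.group(2)
        if code_file and BRACE_ONLY_RE.fullmatch(password):
            return match.group(0)  # f-string template, not a live credential
        return f"{match.group(1)}***{match.group(3)}"

    return DSN_RE.sub(_mask, text)


template = 'dsn = f"postgresql://{user}:{pass}@{host}"'
literal = "postgresql://admin:pw@host"
multiline = 'url = f"postgresql://{host}:{port}/{db}"\n\n@model_validator\n'

print(redact_dsn(template, code_file=True))   # template left as written
print(redact_dsn(literal, code_file=True))    # literal password masked
print(redact_dsn(multiline, code_file=True))  # no match across the newline
```

Without code_file=True the sketch masks the template's password group too,
which mirrors the intent that only code-like output gets the exemption.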

Verified E2E against the reporter's exact pydantic-settings module: file
written verbatim, read_file shows the DSN f-string + @model_validator intact
with zero *** corruption, while a literal postgresql://admin:pw@host DSN and
a real sk- key are still masked.

Reported-by: koishi70
Reported-by: pfrenssen
Teknium 2026-06-28 01:15:39 -07:00 committed by GitHub
parent de6e9ac760
commit 674e16e7c6
5 changed files with 119 additions and 11 deletions

@@ -1031,9 +1031,11 @@ def _execute_remote(
 from tools.ansi_strip import strip_ansi
 stdout_text = strip_ansi(stdout_text)
-# Redact secrets
+# Redact secrets. code_file=True: execute_code output is code-execution
+# output that often echoes source/config — skip false-positive ENV/JSON/
+# f-string-template redaction while still masking real credentials.
 from agent.redact import redact_sensitive_text
-stdout_text = redact_sensitive_text(stdout_text)
+stdout_text = redact_sensitive_text(stdout_text, code_file=True)
 # Build response
 result: Dict[str, Any] = {
@@ -1441,9 +1443,11 @@ def execute_code(
 # The sandbox env-var filter (lines 434-454) blocks os.environ access,
 # but scripts can still read secrets from disk (e.g. open('~/.hermes/.env')).
 # This ensures leaked secrets never enter the model context.
+# code_file=True: this is code-execution output — skip false-positive
+# ENV/JSON/f-string-template redaction; real credentials still masked.
 from agent.redact import redact_sensitive_text
-stdout_text = redact_sensitive_text(stdout_text)
-stderr_text = redact_sensitive_text(stderr_text)
+stdout_text = redact_sensitive_text(stdout_text, code_file=True)
+stderr_text = redact_sensitive_text(stderr_text, code_file=True)
 # Build response
 result: Dict[str, Any] = {