Commit graph

23 commits

Author SHA1 Message Date
Teknium
3483424aaa
fix(security): redact bare-token credentials in URL userinfo (#6396) (#54475)
git remote set-url with an embedded password (https://PASSWORD@github.com)
leaked the credential into agent output — the redaction engine only masked
user:pass@ DB connection strings, never the colon-less bare-token userinfo
form a git remote uses.

Add _URL_BARE_TOKEN_RE: scheme://TOKEN@host for web/transport schemes
(http/https/wss/git/ssh/ftp), 8+ char floor to skip short usernames, token
class forbidding /:@ so an @ in a path/query is never treated as userinfo.

Deliberately scoped to the bare-token form only. The user:pass@ colon form
and query-string tokens stay passing through (#34029, 'pass web URLs through
unchanged') so magic-link / OAuth round-trip skills keep working — a bare
credential in userinfo is never a workflow token (those live in the query
string), so masking it can't break a skill.
2026-06-28 18:52:42 -07:00
teknium1
091ce825fe test(redact): fix file_read regression-guard for current-main YAML collapse
The salvaged #35519 regression guard asserted that default (non-file_read)
mode keeps a head/tail `ghp_S1...Pn2T` mask for a `token: <key>` line. On
current main the YAML config pass (`_YAML_ASSIGN_RE`, key `token`) re-masks
the already-prefix-masked value to `***`, so the assertion was stale. Switch
to a bare-token context so the guard isolates what it claims (prefix-mask
head/tail shape in default mode) without depending on the YAML collapse.
2026-06-28 04:13:20 -07:00
kshitijk4poor
de928bccde fix(redact): non-reusable sentinel for prefix secrets in file reads (#35519)
When security.redact_secrets is on (default), read_file/search_files/cat
applied redact_sensitive_text(code_file=True) to file content, which still
ran prefix masking. An API key in config.yaml (ghp_..., sk-..., xai-..., etc.)
came back as a head/tail mask like `ghp_S1...Pn2T` — a plausible-looking
truncated key. When an agent read that and wrote it back to config, the masked
value replaced the real credential, silently breaking auth (401). Production
evidence: a config.yaml found containing the exact 13-char masked GitHub PAT.

The two community PRs (#35529, #35534) fixed the corruption by NOT redacting
prefixes for config reads — but that exposes the user's real keys to the agent
context, model, and logs (a security regression). This takes the safer route:
keep redacting, but for file content emit a NON-REUSABLE sentinel.

- New `_mask_token_nonreusable`: prefix secrets -> `«redacted:ghp_…»` (vendor
  label preserved for debuggability; zero secret bytes; angle-bracket/ellipsis
  wrapper is syntactically invalid as a token so it can't be mistaken for or
  written back as a usable key).
- New `redact_sensitive_text(file_read=True)` routes prefix matches through it
  (implies code_file=True). Default/log/display mode is UNCHANGED — `_mask_token`
  still keeps head/tail (fine for logs, never written back).
- Wired the 3 file_tools.py call sites (read_file / search_files / cat) to
  file_read=True.

Fixes both the corruption AND avoids the secret-exposure of the un-redact
approach. 6 new tests (sentinel shape, no-leak, not-a-plausible-key, default
mode unchanged, file_read implies code_file, sk- prefix); 88 redact tests pass;
mutation-verified (reverting to the old mask fails the sentinel/leak tests).

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
Co-authored-by: adammatski1972 <289282750+adammatski1972@users.noreply.github.com>

Closes #35519. Supersedes #35529, #35534.
2026-06-28 04:13:20 -07:00
Teknium
c1c179a239
fix(security): redact secrets in background process + foreground env-dump output (#43025) (#54149)
* fix(security): redact secrets in background process + foreground env-dump output

Terminal-output redaction was incomplete (#43025):

- Gap 1: process(action=poll/log/wait) returned background stdout verbatim —
  no redaction at all. A background printenv/server/test emitting a key leaked
  raw to the model, session.db, and CLI display. Same for the gateway
  background-process watcher's completion/progress notifications.
- Gap 2: the foreground terminal path hardcoded code_file=True, which skips the
  ENV-assignment pass, so an opaque token (no vendor prefix) from env/printenv
  leaked even there.

Adds agent.redact.redact_terminal_output(output, command) as the single policy
for ALL terminal-output surfaces: env-dump commands (env/printenv/set/export/
declare) get the ENV-assignment pass (code_file=False) to mask opaque tokens;
other commands stay on code_file=True to avoid false positives on source dumps.
Wired into terminal_tool, process_registry (_handle_process boundary), and the
gateway watcher. Respects security.redact_secrets (no force) — opt-out preserved.

* docs: add infographic for #43025 terminal-output redaction fix
2026-06-28 02:44:21 -07:00
teknium1
bbe1bf4045 fix(agent): stop redacting tool-call args in history; fix auth-header quote-eating
Two related redaction bugs from #43083:

1. build_assistant_message redacted tool-call arguments in-memory. That dict
   feeds both the replayed conversation history and state.db (which is itself
   replayed verbatim on session resume), so the model read back its own
   PGPASSWORD='***' psql call and copied the placeholder, breaking every
   credential-dependent command on the second turn. The masking gave no real
   protection either — the same secret still leaks through tool OUTPUT. Remove
   it. Keeping secrets out of the replayable store is a separate
   tokenization/vault concern (security.redact_secrets still governs
   storage-time redaction elsewhere).

2. _AUTH_HEADER_RE's greedy \S+ credential class ate a closing quote when the
   token sat flush against it (Authorization: Bearer sk-.."), turning value
   corruption into syntax corruption (unterminated quote -> shell EOF /
   SyntaxError). Exclude " and ' from the token class; real credentials never
   contain them.

Closes #43083.
2026-06-28 02:44:06 -07:00
Teknium
674e16e7c6
fix(redact): stop DB-connstr redaction from corrupting code output (#33801) (#54061)
Secret redaction is display/output-scoped on main — write_file writes
content verbatim, terminal/execute_code redact only output not the
command/source. The real bug is in displayed tool OUTPUT (read_file,
terminal, execute_code):

_DB_CONNSTR_RE's password group [^@]+ was greedy across newlines, so on a
multi-line block it scanned past the DSN line to the next stray '@' (a
Python @decorator), replacing every intervening character — including line
breaks — with ***. That dropped lines and concatenated the next line onto
the f-string line, making read_file output look corrupted (the file on disk
was always correct). Reported in #33801.

Fix:
- Forbid whitespace in the userinfo/password groups ([^:\s]+ / [^@\s]+) so
  the match can never span a line break. A real DSN password never contains
  whitespace. This alone kills the catastrophic line-dropping.
- Under code_file=True, preserve a password group that is a pure {...} brace
  expression — f"postgresql://{user}:{pass}@{host}" is an f-string template,
  not a live credential. Literal passwords are still masked.
- Pass code_file=True at the terminal and execute_code output redaction call
  sites (file_tools already did) so code-execution output isn't corrupted by
  ENV/JSON/template false positives. Real prefixes, auth headers, JWTs, and
  private keys are still redacted.

Verified E2E against the reporter's exact pydantic-settings module: file
written verbatim, read_file shows the DSN f-string + @model_validator intact
with zero *** corruption, while a literal postgresql://admin:pw@host DSN and
a real sk- key are still masked.

Reported-by: koishi70
Reported-by: pfrenssen
2026-06-28 01:15:39 -07:00
Teknium
190e1ffac9
fix(redact): mask passwords in lowercase/dotted config keys (#53590)
The secret redactor only matched uppercase env-style keys ([A-Z0-9_]),
so config-file assignments like spring.datasource.password=secret,
app.api.key=xyz, and YAML password: secret leaked verbatim when the
agent ran cat/grep on application.properties or .env files (issue #16413).

Adds three case-insensitive config-key matchers that run only in a
config-file context, preserving the existing #4367 (lowercase code/prose)
and web-URL-passthrough carve-outs:
  - _CFG_DOTTED_RE: namespaced keys (contain a dot) — unambiguously config
  - _CFG_ANCHORED_RE: bare secret-word keys at line start (incl. export)
  - _YAML_ASSIGN_RE: unquoted colon config (password: value)
Value capture stops at whitespace and '&' so form bodies stay pair-wise;
the '://' guard keeps intentional web-URL query-param passthrough intact.

Reported-by: Murtaza1211
2026-06-27 04:43:28 -07:00
devorun
6f0ecf37da fix(redact): mask all Authorization schemes and x-api-key style headers
Secret redaction only matched `Authorization: Bearer <token>`. Other auth
headers passed through verbatim into logs, tool output, and transcripts:

- `Authorization: Basic <base64>` — leaks base64(user:password)
- `Authorization: token <pat>` / any non-Bearer scheme
- `Proxy-Authorization: ...`
- `x-api-key: <key>` (Anthropic and many providers) and `api-key`,
  `x-goog-api-key`, `x-auth-token`, `x-access-token`, ... — opaque values with
  no known vendor prefix were caught by nothing

A logged request or an echoed `curl -H "x-api-key: ..."` command therefore
leaked live credentials.

Generalize the Authorization rule to mask the credential for any scheme (and
Proxy-Authorization) while preserving the header name and scheme word for
debuggability, and add an api-key header rule for the single-opaque-value
headers. Bearer behavior is unchanged; plain prose containing the word
"authorization" (no colon-delimited value) is left untouched.

Adds regression tests for Basic/token/Proxy auth and the x-api-key/api-key
headers, including inside a curl command.
2026-06-21 14:08:06 -07:00
Teknium
aab2e99bae test: cover request debug dump redaction
Keep request dump writes on the shared atomic JSON path, add regression coverage for request body/error/stdout redaction, and map the salvaged contributor email for release attribution.
2026-06-15 05:31:21 -07:00
teknium1
fe62424ac4 test(redact): assert Discord mentions pass through unchanged
Rewrite TestDiscordMentions as negative assertions (mentions survive the
redactor) and clean up the orphaned comment + dangling whitespace left by
removing _DISCORD_MENTION_RE. Follow-up to the salvaged #32259 fix for #35611.
2026-05-30 20:48:41 -07:00
kshitijk4poor
66827f8947 chore: prune unused imports and duplicate import redefinitions
Remove unused imports (F401) and duplicate/shadowed import
redefinitions (F811) across the codebase using ruff's safe
autofixes. No behavioral changes -- imports only.

- ~1400 safe autofixes applied across 644 files (net -1072 lines)
- __init__.py re-exports preserved (excluded from F401 removal so
  public re-export surfaces stay intact)
- Re-exports that are imported or monkeypatched by tests but look
  unused in their defining module are kept with explicit # noqa:
  F401 (gateway/run.py load_dotenv; run_agent re-exports from
  agent.message_sanitization, agent.context_compressor,
  agent.retry_utils, agent.prompt_builder, agent.process_bootstrap,
  agent.codex_responses_adapter)
- Unsafe F841 (unused-variable) fixes deliberately skipped -- those
  can change behavior when the RHS has side effects
- ruff lints remain disabled in pyproject.toml (only PLW1514 is
  selected); this is a one-time cleanup, not a config change

Verification:
- python -m compileall: clean
- pytest --collect-only: all 27161 tests collect (zero import errors)
- core entry points import clean (run_agent, model_tools, cli,
  toolsets, hermes_state, batch_runner, gateway)
- static scan: every name any test imports directly from an edited
  module still resolves
2026-05-28 22:26:25 -07:00
Teknium
5f66c36470
fix(redact): pass web URLs through unchanged (#34029)
* fix(redact): pass web URLs through unchanged

Magic-link checkout URLs, OAuth callbacks the agent is meant to follow,
and pre-signed share URLs were getting `?token=***` / `?code=***` /
`?signature=***` blanket-redacted by parameter NAME, which breaks any
skill that has to round-trip a URL through history (the model's tool
call arguments get sanitized before persistence — the live call fires
with the real URL, but the next turn sees `***`).

Joe Rinaldi Johnson hit this with a checkout-acceleration skill that
uses magic links in URLs.

Drops three call sites from `redact_sensitive_text`:
- `_redact_url_query_params` (was redacting `access_token`, `token`,
  `api_key`, `code`, `signature`, `key`, `auth`, etc.)
- `_redact_url_userinfo` (was redacting `https://user:pass@host`)
- `_redact_http_request_target_query_params` (was redacting access-log
  request targets like `"POST /hook?password=... HTTP/1.1"`)

The helpers themselves are kept in the module — still importable by
anything that wants to opt in explicitly.

Still redacted (unchanged):
- Vendor-prefix credential shapes (sk-, ghp_, AKIA, gAAAA, etc.)
  anywhere they appear, including inside URLs — see the
  `test_known_prefix_inside_url_still_redacted` case.
- JWTs (`eyJ...`)
- DB connection-string passwords (`postgres://admin:pw@host`) —
  these are connection strings, not web URLs the agent navigates to.
- Authorization headers, ENV assignments, JSON `apiKey`/`token` fields,
  Telegram bot tokens, private key blocks, Discord mentions, E.164
  phone numbers, and form-urlencoded bodies (request bodies, not URLs).

Tests: replaces `TestUrlQueryParamRedaction` + `TestUrlUserinfoRedaction`
with `TestWebUrlsNotRedacted`, asserting representative URLs (OAuth
callback, magic link, S3 pre-signed, websocket, userinfo, access log)
pass through unchanged. Adds positive cases proving the prefix and DB
connstr nets still fire. 74 redact tests + 10 browser-exfil + 16 PII
redaction tests all pass.

* test(codex_app_server): drop URL-query assertion from stderr-tail redaction test

The test bundled (a) sk-live-* credential-prefix redaction with (b)
URL query-param redaction. (a) is still in effect via _PREFIX_RE;
(b) was the contract we just removed in the parent commit so the
'querysecret12345' assertion stopped holding. Keep the credential-shape
assertion, drop the URL-query one.

Send-message tool's local _URL_SECRET_QUERY_RE in tools/send_message_tool.py
is independent of agent/redact.py and unchanged — its tests
(test_top_level_send_failure_redacts_query_token,
test_http_error_redacts_access_token_in_exception_text) still pass.
2026-05-28 11:32:39 -07:00
helix4u
514f5020c7 fix(debug): redact BlueBubbles webhook secrets 2026-05-24 15:43:48 -07:00
flamiinngo
5613dfea93 fix(security): redact xAI (Grok) API keys in logs
xAI is a first-class provider in hermes-agent with its own credential
pool entry (XAI_API_KEY / xai-oauth). API keys follow the format
xai-<60+ alphanumeric chars> and were absent from _PREFIX_PATTERNS in
agent/redact.py.

When a key appears raw in log output, tool results, or error messages,
it passed through completely unmasked. The ENV-assignment and Bearer
header patterns catch the most common cases, but a raw token in a
stack trace or debug print had no protection.

Verified before fix:
  redact_sensitive_text("using key xai-ABCD...rstu to call xAI", force=True)
  # "using key xai-ABCD...rstu to call xAI"  <- exposed

After fix:
  # "using key xai-AB...rstu to call xAI"    <- masked

Five unit tests added to TestXaiToken covering bare token masking,
env assignment, short-prefix false positive, company name false
positive, and visible prefix in masked output.
2026-05-18 10:21:22 -07:00
Teknium
d587d62eba
feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal (#13148)
* feat(security): URL query param + userinfo + form body redaction

Port from nearai/ironclaw#2529.

Hermes already has broad value-shape coverage in agent/redact.py
(30+ vendor prefixes, JWTs, DB connstrs, etc.) but missed three
key-name-based patterns that catch opaque tokens without recognizable
prefixes:

1. URL query params - OAuth callback codes (?code=...),
   access_token, refresh_token, signature, etc. These are opaque and
   won't match any prefix regex. Now redacted by parameter NAME.

2. URL userinfo (https://user:pass@host) - for non-DB schemes. DB
   schemes were already handled by _DB_CONNSTR_RE.

3. Form-urlencoded body (k=v pairs joined by ampersands) -
   conservative, only triggers on clean pure-form inputs with no
   other text.

Sensitive key allowlist matches ironclaw's (exact case-insensitive,
NOT substring - so token_count and session_id pass through).

Tests: +20 new test cases across 3 test classes. All 75 redact tests
pass; gateway/test_pii_redaction and tools/test_browser_secret_exfil
also green.

Known pre-existing limitation: _ENV_ASSIGN_RE greedy match swallows
whole all-caps ENV-style names + trailing text when followed by
another assignment. Left untouched here (out of scope); URL query
redaction handles the lowercase case.

* feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal

Update model catalogs for OpenRouter (fallback snapshot), Nous Portal,
and NVIDIA NIM to reference moonshotai/kimi-k2.6.  Add kimi-k2.6 to
the fixed-temperature frozenset in auxiliary_client.py so the 0.6
contract is enforced on aggregator routings.

Native Moonshot provider lists (kimi-coding, kimi-coding-cn, moonshot,
opencode-zen, opencode-go) are unchanged — those use Moonshot's own
model IDs which are unaffected.
2026-04-20 11:49:54 -07:00
Teknium
ee9c0a3ed0
fix(security): add JWT token and Discord mention redaction (#10547)
Found via trace data audit: JWT tokens (eyJ...) and Discord snowflake
mentions (<@ID>) were passing through unredacted.

JWT pattern: matches 1/2/3-part tokens starting with eyJ (base64 for '{').
Zero false-positive risk — no normal text matches eyJ + 10+ base64url chars.

Discord pattern: matches <@digits> and <@!digits> with 17-20 digit snowflake
IDs. Syntactically unique to Discord's mention format.

Both patterns follow the same structural-uniqueness standard as existing
prefix patterns (sk-, ghp_, AKIA, etc.).
2026-04-15 16:08:52 -07:00
Teknium
5b003ca4a0
test(redact): add regression tests for lowercase variable redaction (#4367) (#5185)
Add 5 regression tests from PR #4476 (gnanam1990) to prevent re-introducing
the IGNORECASE bug that caused lowercase Python/TypeScript variable assignments
to be incorrectly redacted as secrets. The core fix landed in 6367e1c4.

Tests cover:
- Lowercase Python variable with 'token' in name
- Lowercase Python variable with 'api_key' in name
- TypeScript 'await' not treated as secret value
- TypeScript 'secret' variable assignment
- 'export' prefix preserved for uppercase env vars

Co-authored-by: gnanam1990 <gnanam1990@users.noreply.github.com>
2026-04-05 00:10:16 -07:00
Teknium
fad3f338d1 fix: patch _REDACT_ENABLED in test fixture for module-level snapshot
The _REDACT_ENABLED constant is snapshotted at import time, so
monkeypatch.delenv() alone doesn't re-enable redaction during tests
when HERMES_REDACT_SECRETS=false is set in the host environment.
2026-03-31 10:30:48 -07:00
Teknium
fb634068df
fix(security): extend secret redaction to ElevenLabs, Tavily and Exa API keys (#3920)
ElevenLabs (sk_), Tavily (tvly-), and Exa (exa_) keys were not covered
by _PREFIX_PATTERNS, leaking in plain text via printenv or log output.

Salvaged from PR #3790 by @memosr. Tests rewritten with correct
assertions (original tests had vacuously true checks).

Co-authored-by: memosr <memosr@users.noreply.github.com>
2026-03-30 08:13:01 -07:00
Teknium
be3eb62047
fix(tests): resolve all consistently failing tests
- test_plugins.py: remove tests for unimplemented plugin command API
  (get_plugin_command_handler, register_command never existed)
- test_redact.py: add autouse fixture to clear HERMES_REDACT_SECRETS
  env var leaked by cli.py import in other tests
- test_signal.py: same HERMES_REDACT_SECRETS fix for phone redaction
- test_mattermost.py: add @bot_user_id to test messages after the
  mention-only filter was added in #2443
- test_context_token_tracking.py: mock resolve_provider_client for
  openai-codex provider that requires real OAuth credentials

Full suite: 5893 passed, 0 failed.
2026-03-22 05:58:26 -07:00
aydnOktay
40c9a13476
fix(redact): safely handle non-string inputs
redact_sensitive_text() now returns early for None and coerces other
non-string values to str before applying regex-based redaction,
preventing TypeErrors in logging/tool-output paths.

Cherry-picked from PR #2369 by aydnOktay.
2026-03-21 16:55:02 -07:00
kshitijk4poor
ccfbf42844 feat: secure skill env setup on load (core #688)
When a skill declares required_environment_variables in its YAML
frontmatter, missing env vars trigger a secure TUI prompt (identical
to the sudo password widget) when the skill is loaded. Secrets flow
directly to ~/.hermes/.env, never entering LLM context.

Key changes:
- New required_environment_variables frontmatter field for skills
- Secure TUI widget (masked input, 120s timeout)
- Gateway safety: messaging platforms show local setup guidance
- Legacy prerequisites.env_vars normalized into new format
- Remote backend handling: conservative setup_needed=True
- Env var name validation, file permissions hardened to 0o600
- Redact patterns extended for secret-related JSON fields
- 12 existing skills updated with prerequisites declarations
- ~48 new tests covering skip, timeout, gateway, remote backends
- Dynamic panel widget sizing (fixes hardcoded width from original PR)

Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main
with conflict resolution.

Fixes #688

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-13 03:14:04 -07:00
teknium1
70dfec9638 test(redact): add sensitive text redaction
- Introduce a new test suite for the `redact_sensitive_text` function, covering various sensitive data formats including API keys, tokens, and environment variables.
- Ensure that sensitive information is properly masked in logs and outputs while non-sensitive data remains unchanged.
- Add tests for different scenarios including JSON fields, authorization headers, and environment variable assignments.
- Implement a redacting formatter for logging to enhance security during log output.
2026-02-28 21:56:27 -08:00