Commit graph

45 commits

Author SHA1 Message Date
Teknium
2558d28a9b
fix: resolve CI test failures — add missing functions, fix stale tests (#9483)
Production fixes:
- Add clear_session_context() to hermes_logging.py (fixes 48 teardown errors)
- Add clear_session() to tools/approval.py (fixes 9 setup errors)
- Add SyncError M_UNKNOWN_TOKEN check to Matrix _sync_loop (bug fix)
- Fall back to inline api_key in named custom providers when key_env
  is absent (runtime_provider.py)

Test fixes:
- test_memory_user_id: use builtin+external provider pair, fix honcho
  peer_name override test to match production behavior
- test_display_config: remove TestHelpers for non-existent functions
- test_auxiliary_client: fix OAuth tokens to match _is_oauth_token
  patterns, replace get_vision_auxiliary_client with resolve_vision_provider_client
- test_cli_interrupt_subagent: add missing _execution_thread_id attr
- test_compress_focus: add model/provider/api_key/base_url/api_mode
  to mock compressor
- test_auth_provider_gate: add autouse fixture to clean Anthropic env
  vars that leak from CI secrets
- test_opencode_go_in_model_list: accept both 'built-in' and 'hermes'
  source (models.dev API unavailable in CI)
- test_email: verify email Platform enum membership instead of source
  inspection (build_channel_directory now uses dynamic enum loop)
- test_feishu: add bot_added/bot_deleted handler mocks to _Builder
- test_ws_auth_retry: add AsyncMock for sync_store.get_next_batch,
  add _pending_megolm and _joined_rooms to Matrix adapter mocks
- test_restart_drain: monkeypatch-delete INVOCATION_ID (systemd sets
  this in CI, changing the restart call signature)
- test_session_hygiene: add user_id to SessionSource
- test_session_env: use relative baseline for contextvar clear check
  (pytest-xdist workers share context)
2026-04-14 01:43:45 -07:00
Teknium
8d023e43ed
refactor: remove dead code — 1,784 lines across 77 files (#9180)
Deep scan with vulture, pyflakes, and manual cross-referencing identified:
- 41 dead functions/methods (zero callers in production)
- 7 production-dead functions (only test callers, tests deleted)
- 5 dead constants/variables
- ~35 unused imports across agent/, hermes_cli/, tools/, gateway/

Categories of dead code removed:
- Refactoring leftovers: _set_default_model, _setup_copilot_reasoning_selection,
  rebuild_lookups, clear_session_context, get_logs_dir, clear_session
- Unused API surface: search_models_dev, get_pricing, skills_categories,
  get_read_files_summary, clear_read_tracker, menu_labels, get_spinner_list
- Dead compatibility wrappers: schedule_cronjob, list_cronjobs, remove_cronjob
- Stale debug helpers: get_debug_session_info copies in 4 tool files
  (centralized version in debug_helpers.py already exists)
- Dead gateway methods: send_emote, send_notice (matrix), send_reaction
  (bluebubbles), _normalize_inbound_text (feishu), fetch_room_history
  (matrix), _start_typing_indicator (signal), parse_feishu_post_content
- Dead constants: NOUS_API_BASE_URL, SKILLS_TOOL_DESCRIPTION,
  FILE_TOOLS, VALID_ASPECT_RATIOS, MEMORY_DIR
- Unused UI code: _interactive_provider_selection,
  _interactive_model_selection (superseded by prompt_toolkit picker)

Test suite verified: 609 tests covering affected files all pass.
Tests for removed functions deleted. Tests using removed utilities
(clear_read_tracker, MEMORY_DIR) updated to use internal APIs directly.
2026-04-13 16:32:04 -07:00
Teknium
c7d8d109ff fix(matrix): trust m.mentions.user_ids as authoritative mention signal
Port from openclaw/openclaw#64796: Per MSC3952 / Matrix v1.7, the
m.mentions.user_ids field is the authoritative mention signal. Clients
that populate m.mentions but don't duplicate @bot in the body text
were being silently dropped when MATRIX_REQUIRE_MENTION=true.

Cherry-picked from PR #8673.
2026-04-12 18:05:41 -07:00
Sicheng Li
ea2829ab43 fix(weixin,wecom,matrix): respect system proxy via aiohttp trust_env
aiohttp.ClientSession defaults to trust_env=False, ignoring HTTP_PROXY/
HTTPS_PROXY env vars. This causes QR login and all API calls to fail for
users behind a proxy (e.g. Clash in fake-ip mode), which is common in
China where Weixin and WeCom are primarily used.

Added trust_env=True to all aiohttp.ClientSession instantiations that
connect to external hosts (weixin: 3 places, wecom: 1, matrix: 1).
WhatsApp sessions are excluded as they only connect to localhost.

httpx-based adapters (dingtalk, signal, wecom_callback) are unaffected
as httpx defaults to trust_env=True.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 18:03:16 -07:00
Teknium
dd5b1063d0 fix: register MATRIX_RECOVERY_KEY env var + document migration path
Follow-up for cherry-picked PR #8272:
- Add MATRIX_RECOVERY_KEY to module docstring header in matrix.py
- Register in OPTIONAL_ENV_VARS (config.py) with password=True, advanced=True
- Add to _NON_SETUP_ENV_VARS set
- Document cross-signing verification in matrix.md E2EE section
- Update migration guide with recovery key step (step 3)
- Add to environment-variables.md reference
2026-04-12 02:18:03 -07:00
elkimek
b9af4955b9 fix(matrix): restore verify_with_recovery_key after device key rotation
After the PgCryptoStore migration in v0.8.0, the verify_with_recovery_key
call that previously ran after share_keys() was dropped. On any rotation
that uploads fresh device keys (fresh crypto.db, server had stale keys
from a prior install, etc.), the new device keys carry no valid self-
signing signature because the bot has no access to the self-signing
private key.

Peers like Element then refuse to share Megolm sessions with the
rotated device, so the bot silently stops decrypting incoming messages.

This restores the recovery-key bootstrap: on startup, if
MATRIX_RECOVERY_KEY is set, import the cross-signing private keys from
SSSS and sign_own_device(), producing a valid signature server-side.

Idempotent and gated on MATRIX_RECOVERY_KEY — no behavior change for
users who don't configure a recovery key.

Verified end-to-end by deleting crypto.db and restarting: the bot
rotates device identity keys, re-uploads, self-signs via recovery key,
and decrypts+replies to fresh messages from a paired Element client.
2026-04-12 02:18:03 -07:00
Siddharth Balyan
50d86b3c71
fix(matrix): replace pickle crypto store with SQLite, fix E2EE decryption (#7981)
Fixes #7952 — Matrix E2EE completely broken after mautrix migration.

- Replace MemoryCryptoStore + pickle/HMAC persistence with mautrix's
  PgCryptoStore backed by SQLite via aiosqlite. Crypto state now
  persists reliably across restarts without fragile serialization.

- Add handle_sync() call on initial sync response so to-device events
  (queued Megolm key shares) are dispatched to OlmMachine instead of
  being silently dropped.

- Add _verify_device_keys_on_server() after loading crypto state.
  Detects missing keys (re-uploads), stale keys from migration
  (attempts re-upload), and corrupted state (refuses E2EE).

- Add _CryptoStateStore adapter wrapping MemoryStateStore to satisfy
  mautrix crypto's StateStore interface (is_encrypted,
  get_encryption_info, find_shared_rooms).

- Remove redundant share_keys() call from sync loop — OlmMachine
  already handles this via DEVICE_OTK_COUNT event handler.

- Fix datetime vs float TypeError in session.py suspend_recently_active()
  that crashed gateway startup.

- Add aiosqlite and asyncpg to [matrix] extra in pyproject.toml.

- Update test mocks for PgCryptoStore/Database and add query_keys mock
  for key verification. 174 tests pass.

- Add E2EE upgrade/migration docs to Matrix user guide.
2026-04-12 07:24:46 +05:30
Teknium
04c1c5d53f
refactor: extract shared helpers to deduplicate repeated code patterns (#7917)
* refactor: add shared helper modules for code deduplication

New modules:
- gateway/platforms/helpers.py: MessageDeduplicator, TextBatchAggregator,
  strip_markdown, ThreadParticipationTracker, redact_phone
- hermes_cli/cli_output.py: print_info/success/warning/error, prompt helpers
- tools/path_security.py: validate_within_dir, has_traversal_component
- utils.py additions: safe_json_loads, read_json_file, read_jsonl,
  append_jsonl, env_str/lower/int/bool helpers
- hermes_constants.py additions: get_config_path, get_skills_dir,
  get_logs_dir, get_env_path

* refactor: migrate gateway adapters to shared helpers

- MessageDeduplicator: discord, slack, dingtalk, wecom, weixin, mattermost
- strip_markdown: bluebubbles, feishu, sms
- redact_phone: sms, signal
- ThreadParticipationTracker: discord, matrix
- _acquire/_release_platform_lock: telegram, discord, slack, whatsapp,
  signal, weixin

Net -316 lines across 19 files.

* refactor: migrate CLI modules to shared helpers

- tools_config.py: use cli_output print/prompt + curses_radiolist (-117 lines)
- setup.py: use cli_output print helpers + curses_radiolist (-101 lines)
- mcp_config.py: use cli_output prompt (-15 lines)
- memory_setup.py: use curses_radiolist (-86 lines)

Net -263 lines across 5 files.

* refactor: migrate to shared utility helpers

- safe_json_loads: agent/display.py (4 sites)
- get_config_path: skill_utils.py, hermes_logging.py, hermes_time.py
- get_skills_dir: skill_utils.py, prompt_builder.py
- Token estimation dedup: skills_tool.py imports from model_metadata
- Path security: skills_tool, cronjob_tools, skill_manager_tool, credential_files
- Non-atomic YAML writes: doctor.py, config.py now use atomic_yaml_write
- Platform dict: new platforms.py, skills_config + tools_config derive from it
- Anthropic key: new get_anthropic_key() in auth.py, used by doctor/status/config/main

* test: update tests for shared helper migrations

- test_dingtalk: use _dedup.is_duplicate() instead of _is_duplicate()
- test_mattermost: use _dedup instead of _seen_posts/_prune_seen
- test_signal: import redact_phone from helpers instead of signal
- test_discord_connect: _platform_lock_identity instead of _token_lock_identity
- test_telegram_conflict: updated lock error message format
- test_skill_manager_tool: 'escapes' instead of 'boundary' in error msgs
2026-04-11 13:59:52 -07:00
Teknium
06e1d9cdd4
fix: resolve three high-impact community bugs (#5819, #6893, #3388) (#7881)
Matrix gateway: fix sync loop never dispatching events (#5819)
- _sync_loop() called client.sync() but never called handle_sync()
  to dispatch events to registered callbacks — _on_room_message was
  registered but never fired for new messages
- Store next_batch token from initial sync and pass as since= to
  subsequent incremental syncs (was doing full initial sync every time)
- 17 comments, confirmed by multiple users on matrix.org

Feishu docs: add interactive card configuration for approvals (#6893)
- Error 200340 is a Feishu Developer Console configuration issue,
  not a code bug — users need to enable Interactive Card capability
  and configure Card Request URL
- Added required 3-step setup instructions to feishu.md
- Added troubleshooting entry for error 200340
- 17 comments from Feishu users

Copilot provider drift: detect GPT-5.x Responses API requirement (#3388)
- GPT-5.x models are rejected on /v1/chat/completions by both OpenAI
  and OpenRouter (unsupported_api_for_model error)
- Added _model_requires_responses_api() to detect models needing
  Responses API regardless of provider
- Applied in __init__ (covers OpenRouter primary users) and in
  _try_activate_fallback() (covers Copilot->OpenRouter drift)
- Fixed stale comment claiming gateway creates fresh agents per message
  (it caches them via _agent_cache since the caching was added)
- 7 comments, reported on Copilot+Telegram gateway
2026-04-11 11:12:20 -07:00
Siddharth Balyan
69f3aaa1d6
fix(matrix): pass required args to MemoryCryptoStore for mautrix ≥0.21 (#7848)
* fix(matrix): pass required args to MemoryCryptoStore for mautrix ≥0.21

MemoryCryptoStore.__init__() now requires account_id and pickle_key
positional arguments as of mautrix 0.21. The migration from matrix-nio
(commit 1850747) didn't account for this, causing E2EE initialization
to fail with:

  MemoryCryptoStore.__init__() missing 2 required positional arguments:
  'account_id' and 'pickle_key'

Pass self._user_id as account_id and derive pickle_key from the same
user_id:device_id pair already used for the on-disk HMAC signature.

Update the test stub to accept the new parameters.

Fixes #7803

* fix: use consistent fallback for pickle_key derivation

Address review: _pickle_key now uses _acct_id (which has the 'hermes'
fallback) instead of raw self._user_id, so both values stay consistent
when user_id is empty.

---------

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-04-11 10:43:49 -07:00
Teknium
be9198f1e1 fix: guard mautrix imports for gateway-safe fallback + fix test isolation
Follow-up fixes for the matrix-nio → mautrix migration:

1. Module-level mautrix.types import now wrapped in try/except with
   proper stub classes. Without this, importing gateway.platforms.matrix
   crashes the entire gateway when mautrix isn't installed — even for
   users who don't use Matrix. The stubs mirror mautrix's real attribute
   names so tests that exercise adapter methods (send, reactions, etc.)
   work without the real SDK.

2. Removed _ensure_mautrix_mock() from test_matrix_mention.py — it
   permanently installed MagicMock modules in sys.modules via setdefault(),
   polluting later tests in the suite. No longer needed since the module
   imports cleanly without mautrix.

3. Fixed thread persistence tests to use direct class reference in
   monkeypatch.setattr() instead of string-based paths, which broke
   when the module was reimported by other tests.

4. Moved the module-importability test to a subprocess to prevent it
   from polluting sys.modules (reimporting creates a second module object
   with different __dict__, breaking patch.object in subsequent tests).
2026-04-10 21:15:59 -07:00
alt-glitch
be06db71d7 fix(matrix): ignore m.notice messages to prevent bot-to-bot loops
The old nio code only handled RoomMessageText (m.text). The mautrix
rewrite dispatched both m.text and m.notice, which would cause infinite
loops between bots since m.notice is the conventional msgtype for bot
responses in the Matrix ecosystem.
2026-04-10 21:15:59 -07:00
alt-glitch
5d3332dbba fix(matrix): close leaked sessions on connect failure + HMAC-sign pickle store
- Add api.session.close() on E2EE dep check and E2EE setup failure
  paths (two missing cleanup points from the mautrix migration)
- Replace raw pickle.load/dump with HMAC-SHA256 signed payloads to
  prevent arbitrary code execution from a tampered store file
2026-04-10 21:15:59 -07:00
alt-glitch
bc8b93812c refactor(matrix): simplify adapter after code review
- Extract _resolve_message_context() to deduplicate ~40 lines of
  mention/thread/DM gating logic between text and media handlers
- Move mautrix.types imports to module level (16 scattered local
  imports consolidated)
- Parse mention/thread env vars once in __init__ instead of per-message
- Cache _is_bot_mentioned() result instead of calling 3x per event
- Consolidate send_emote/send_notice into shared _send_simple_message()
- Use _is_dm_room() in get_chat_info() instead of inline duplication
- Add _CRYPTO_PICKLE_PATH constant (was duplicated in 2 locations)
- Fix fragile event_ts extraction (double getattr, None safety)
- Clean up leaked aiohttp session on auth failure paths
- Remove redundant trailing _track_thread() calls
2026-04-10 21:15:59 -07:00
alt-glitch
1f3f120042 fix(matrix): persist E2EE crypto store and fix decrypted event dedup
Address two bugs found by code review:

1. MemoryCryptoStore loses all E2EE keys on restart — now pickle the
   store to disk on disconnect and restore on connect, preserving
   Megolm sessions across restarts.

2. Encrypted events buffered for retry were silently dropped after
   decryption because _on_encrypted_event registered the event ID
   in the dedup set, then _on_room_message rejected it as a
   duplicate. Now clear the dedup entry before routing decrypted
   events.
2026-04-10 21:15:59 -07:00
alt-glitch
8053d48c8d refactor(matrix): rewrite adapter from matrix-nio to mautrix-python
Translate all nio SDK calls to mautrix equivalents while preserving the
adapter structure, business logic, and all features (E2EE, reactions,
threading, mention gating, text batching, media caching, voice MSC3245).

Key changes:
- nio.AsyncClient -> mautrix.client.Client + HTTPAPI + MemoryStateStore
- Manual E2EE key management -> OlmMachine with auto key lifecycle
- isinstance(resp, nio.XxxResponse) -> mautrix returns values directly
- add_event_callback per type -> single ROOM_MESSAGE handler with
  msgtype dispatch
- Room state (member_count, display_name) via async state store lookups
- Upload/download return ContentURI/bytes directly (no wrapper objects)
2026-04-10 21:15:59 -07:00
Fran Fitzpatrick
3e24ba1656 feat(matrix): add MATRIX_DM_MENTION_THREADS env var
When enabled, @mentioning the bot in a DM creates a thread (default:
false). Supports both env var and YAML config (matrix.dm_mention_threads).
6 new tests, docs updated.

From #6957
2026-04-10 15:46:20 -07:00
Fran Fitzpatrick
21bb2547c6 fix(matrix): log redact failures and add missing reaction test cases
Add debug logging when eyes reaction redaction fails, and add tests
for the success=False path and the no-pending-reaction edge case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 05:19:26 -07:00
Fran Fitzpatrick
cc12ab8290 fix(matrix): remove eyes reaction on processing complete
The on_processing_complete handler was never removing the eyes reaction because
_send_reaction didn't return the reaction event_id.

Fix:
- _send_reaction returns Optional[str] event_id
- on_processing_start stores it in _pending_reactions dict
- on_processing_complete redacts the eyes reaction before adding completion emoji
2026-04-10 05:19:26 -07:00
Kenny Xie
4f2f09affa fix(gateway): avoid false failure reactions on restart cancellation 2026-04-10 03:52:00 -07:00
Teknium
13d7ff3420
fix(gateway): bypass text batching when delay is 0 (#6996)
The text batching feature routes TEXT messages through
asyncio.create_task() + asyncio.sleep(delay). Even with delay=0,
the task fires asynchronously and won't complete before synchronous
test assertions. This broke 33 tests across Discord, Matrix, and
WeCom adapters.

When _text_batch_delay_seconds is 0 (the test fixture setting),
dispatch directly to handle_message() instead of going through
the async batching path. This preserves the pre-batching behavior
for tests while keeping batching active in production (default
delay 0.6s).
2026-04-09 23:59:20 -07:00
Teknium
07148cac9a fix(matrix): add text batching to merge split long messages
Ports the adaptive batching pattern from the Telegram adapter.
Matrix clients split messages around 4000 chars. Adaptive delay waits
2.0s when a chunk is near the limit, 0.6s otherwise. Only text messages
are batched; commands dispatch immediately.

Ref #6892
2026-04-09 23:25:27 -07:00
Teknium
469cd16fe0
fix(security): consolidated security hardening — SSRF, timing attack, tar traversal, credential leakage (#5944)
Salvaged from PRs #5800 (memosr), #5806 (memosr), #5915 (Ruzzgar), #5928 (Awsh1).

Changes:
- Use hmac.compare_digest for API key comparison (timing attack prevention)
- Apply provider env var blocklist to Docker containers (credential leakage)
- Replace tar.extractall() with safe extraction in TerminalBench2 (CVE-2007-4559)
- Add SSRF protection via is_safe_url to ALL platform adapters:
  base.py (cache_image_from_url, cache_audio_from_url),
  discord, slack, telegram, matrix, mattermost, feishu, wecom
  (Signal and WhatsApp protected via base.py helpers)
- Update tests: mock is_safe_url in Mattermost download tests
- Add security tests for tar extraction (traversal, symlinks, safe files)
2026-04-07 17:28:37 -07:00
Teknium
d0ffb111c2
refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821)
Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture)
and manual analysis of the entire codebase.

Changes by category:

Unused imports removed (~95 across 55 files):
- Removed genuinely unused imports from all major subsystems
- agent/, hermes_cli/, tools/, gateway/, plugins/, cron/
- Includes imports in try/except blocks that were truly unused
  (vs availability checks which were left alone)

Unused variables removed (~25):
- Removed dead variables: connected, inner, channels, last_exc,
  source, new_server_names, verify, pconfig, default_terminal,
  result, pending_handled, temperature, loop
- Dropped unused argparse subparser assignments in hermes_cli/main.py
  (12 instances of add_parser() where result was never used)

Dead code removed:
- run_agent.py: Removed dead ternary (None if False else None) and
  surrounding unreachable branch in identity fallback
- run_agent.py: Removed write-only attribute _last_reported_tool
- hermes_cli/providers.py: Removed dead @property decorator on
  module-level function (decorator has no effect outside a class)
- gateway/run.py: Removed unused MCP config load before reconnect
- gateway/platforms/slack.py: Removed dead SessionSource construction

Undefined name bugs fixed (would cause NameError at runtime):
- batch_runner.py: Added missing logger = logging.getLogger(__name__)
- tools/environments/daytona.py: Added missing Dict and Path imports

Unnecessary global statements removed (14):
- tools/terminal_tool.py: 5 functions declared global for dicts
  they only mutated via .pop()/[key]=value (no rebinding)
- tools/browser_tool.py: cleanup thread loop only reads flag
- tools/rl_training_tool.py: 4 functions only do dict mutations
- tools/mcp_oauth.py: only reads the global
- hermes_time.py: only reads cached values

Inefficient patterns fixed:
- startswith/endswith tuple form: 15 instances of
  x.startswith('a') or x.startswith('b') consolidated to
  x.startswith(('a', 'b'))
- len(x)==0 / len(x)>0: 13 instances replaced with pythonic
  truthiness checks (not x / bool(x))
- in dict.keys(): 5 instances simplified to in dict
- Redefined unused name: removed duplicate _strip_mdv2 import in
  send_message_tool.py

Other fixes:
- hermes_cli/doctor.py: Replaced undefined logger.debug() with pass
- hermes_cli/config.py: Consolidated chained .endswith() calls

Test results: 3934 passed, 17 failed (all pre-existing on main),
19 skipped. Zero regressions.
2026-04-07 10:25:31 -07:00
kshitijk4poor
05f9267938 fix(matrix): hard-fail E2EE when python-olm missing + stable MATRIX_DEVICE_ID
Two issues caused Matrix E2EE to silently not work in encrypted rooms:

1. When matrix-nio is installed without the [e2e] extra (no python-olm /
   libolm), nio.crypto.ENCRYPTION_ENABLED is False and client.olm is
   never initialized. The adapter logged warnings but returned True from
   connect(), so the bot appeared online but could never decrypt messages.
   Now: check_matrix_requirements() and connect() both hard-fail with a
   clear error message when MATRIX_ENCRYPTION=true but E2EE deps are
   missing.

2. Without a stable device_id, the bot gets a new device identity on each
   restart. Other clients see it as "unknown device" and refuse to share
   Megolm session keys. Now: MATRIX_DEVICE_ID env var lets users pin a
   stable device identity that persists across restarts and is passed to
   nio.AsyncClient constructor + restore_login().

Changes:
- gateway/platforms/matrix.py: add _check_e2ee_deps(), hard-fail in
  connect() and check_matrix_requirements(), MATRIX_DEVICE_ID support
  in constructor + restore_login
- gateway/config.py: plumb MATRIX_DEVICE_ID into platform extras
- hermes_cli/config.py: add MATRIX_DEVICE_ID to OPTIONAL_ENV_VARS

Closes #3521
2026-04-06 16:54:16 -07:00
nepenth
534511bebb feat(matrix): Tier 1 enhancement — reactions, read receipts, rich formatting, room management
Cherry-picked from PR #4338 by nepenth, resolved against current main.

Adds:
- Processing lifecycle reactions (eyes/checkmark/cross) via MATRIX_REACTIONS env
- Reaction send/receive with ReactionEvent + UnknownEvent fallback for older nio
- Fire-and-forget read receipts on text and media messages
- Message redaction, room history fetch, room creation, user invite
- Presence status control (online/offline/unavailable)
- Emote (/me) and notice message types with HTML rendering
- XSS-hardened markdown-to-HTML converter (strips raw HTML preprocessor,
  sanitizes link URLs against javascript:/data:/vbscript: schemes)
- Comprehensive regex fallback with full block/inline markdown support
- Markdown>=3.6 added to [matrix] extras in pyproject.toml
- 46 new tests covering all features and security hardening
2026-04-05 11:19:54 -07:00
Teknium
c100ad874c fix(matrix): E2EE cron delivery via live adapter + HTML formatting + origin fallback
Salvaged from PRs #3767 (chalkers), #5236 (ygd58), #2641 (buntingszn).

Three improvements to Matrix cron delivery:

1. Live adapter path: when the gateway is running, cron delivery now uses
   the connected MatrixAdapter via run_coroutine_threadsafe instead of
   the standalone HTTP PUT. This enables delivery to E2EE rooms where
   the raw HTTP path cannot encrypt. Falls back to standalone on failure.
   Threads adapters + event loop from gateway -> cron ticker -> tick() ->
   _deliver_result(). (from #3767)

2. HTML formatted_body: _send_matrix() now converts markdown to HTML
   using the optional markdown library, with h1-h6 to bold conversion
   for Element X compatibility. Falls back to plain text if markdown
   is not installed. Also adds random bytes to txn_id to prevent
   collisions. (from #5236)

3. Origin fallback: when deliver="origin" but origin is null (jobs
   created via API/scripts), falls back to HOME_CHANNEL env vars
   in order: matrix -> telegram -> discord -> slack. (from #2641)
2026-04-05 11:07:47 -07:00
chalkers
bec02f3731 fix(matrix): handle encrypted media events and cache decrypted attachments
Cherry-picked from PR #3140 by chalkers, resolved against current main.
Registers RoomEncryptedImage/Audio/Video/File callbacks, decrypts
attachments via nio.crypto, caches all media types (images, audio,
documents), prevents ciphertext URL fallback for encrypted media.
Unifies the separate voice-message download into the main cache block.
Preserves main's MATRIX_REQUIRE_MENTION, auto-thread, and mention
stripping features. Includes 355 lines of encrypted media tests.
2026-04-05 11:07:47 -07:00
binhnt92
b65e67545a fix(gateway): stop Matrix/Mattermost reconnect on permanent auth failures
Cherry-picked from PR #3695 by binhnt92.
Matrix _sync_loop() and Mattermost _ws_loop() were retrying all errors
forever, including permanent auth failures (expired tokens, revoked
access). Now detects M_UNKNOWN_TOKEN, M_FORBIDDEN, 401/403 and stops
instead of spinning. Includes 216 lines of tests.
2026-04-05 11:07:47 -07:00
pjay-io
9d7c288d86 fix(matrix): add filesize to nio.upload() for Synapse compatibility
Cherry-picked from PR #4343 by pjay-io.
Synapse rejects chunked uploads without Content-Length. Adding
filesize=len(data) ensures the upload includes proper sizing.
2026-04-05 11:07:47 -07:00
Fran Fitzpatrick
2556cfdab1 fix(gateway): match Discord mention-stripping behavior in Matrix adapter
Move mention stripping outside the `if not is_dm` guard so mentions
are stripped in DMs too. Remove the bare-mention early return so a
message containing only a mention passes through as empty string,
matching Discord's behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 13:09:27 -07:00
Fran Fitzpatrick
d86be33161 feat(gateway): add MATRIX_REQUIRE_MENTION and MATRIX_AUTO_THREAD support
Bring Matrix feature parity with Discord by adding mention gating and
auto-threading. Both default to true, matching Discord behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 13:09:27 -07:00
Teknium
07746dca0c
fix(matrix): E2EE decryption — request keys, auto-trust devices, retry buffered events (#4083)
When the Matrix adapter receives encrypted events it can't decrypt
(MegolmEvent), it now:

1. Requests the missing room key from other devices via
   client.request_room_key(event) instead of silently dropping the message

2. Buffers undecrypted events (bounded to 100, 5 min TTL) and retries
   decryption after each E2EE maintenance cycle when new keys arrive

3. Auto-trusts/verifies all devices after key queries so other clients
   share session keys with the bot proactively

4. Exports Megolm keys on disconnect and imports them on connect, so
   session keys survive gateway restarts

This addresses the 'could not decrypt event' warnings that caused the
bot to miss messages in encrypted rooms.
2026-03-30 17:16:09 -07:00
Teknium
1e896b0251
fix: resolve 7 failing CI tests (#3936)
1. matrix voice: _on_room_message_media unconditionally overwrote
   media_urls with the image cache path (always None for non-images),
   wiping the locally-cached voice path. Now only overrides when
   cached_path is truthy.

2. cli_tools_command: /tools disable no longer prompts for confirmation
   (input() removed in earlier commit to fix TUI hang), but tests still
   expected the old Y/N prompt flow. Updated tests to match current
   behavior (direct apply + session reset).

3. slack app_mention: connect() was refactored for multi-workspace
   (creates AsyncWebClient per token), but test only mocked the old
   self._app.client path. Added AsyncWebClient and acquire_scoped_lock
   mocks.

4. website_policy: module-level _cached_policy from earlier tests caused
   fast-path return of None. Added invalidate_cache() before assertion.

5. codex 401 refresh: already passing on current main (fixed by
   intervening commit).
2026-03-30 08:10:14 -07:00
Teknium
ee61485cac
feat(matrix): support native voice messages via MSC3245 (#3877)
* feat(matrix): support native voice messages

* fix: skip matrix voice tests when matrix-nio not installed

---------

Co-authored-by: Carlos Alberto Pereira Gomes <carlosapgomes@users.noreply.github.com>
2026-03-30 00:02:51 -07:00
Teknium
1e924e99b9
refactor: consolidate ~/.hermes directory layout with backward compat (#3610)
New installs get a cleaner structure:
  cache/images/      (was image_cache/)
  cache/audio/       (was audio_cache/)
  cache/documents/   (was document_cache/)
  cache/screenshots/ (was browser_screenshots/)
  platforms/whatsapp/session/ (was whatsapp/session/)
  platforms/matrix/store/    (was matrix/store/)
  platforms/pairing/         (was pairing/)

Existing installs are unaffected -- get_hermes_dir() checks for the
old path first and uses it if present. No migration needed.

Adds get_hermes_dir(new_subpath, old_name) helper to hermes_constants.py
for reuse by any future subsystem.
2026-03-28 15:22:19 -07:00
Teknium
e97c0cb578
fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support
* feat: GPT tool-use steering + strip budget warnings from history

Two changes to improve tool reliability, especially for OpenAI GPT models:

1. GPT tool-use enforcement prompt: Adds GPT_TOOL_USE_GUIDANCE to the
   system prompt when the model name contains 'gpt' and tools are loaded.
   This addresses a known behavioral pattern where GPT models describe
   intended actions ('I will run the tests') instead of actually making
   tool calls. Inspired by similar steering in OpenCode (beast.txt) and
   Cline (GPT-5.1 variant).

2. Budget warning history stripping: Budget pressure warnings injected by
   _get_budget_warning() into tool results are now stripped when
   conversation history is replayed via run_conversation(). Previously,
   these turn-scoped signals persisted across turns, causing models to
   avoid tool calls in all subsequent messages after any turn that hit
   the 70-90% iteration threshold.

* fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support

Prep for the upcoming profiles feature — each profile is a separate
HERMES_HOME directory, so all paths must respect the env var.

Fixes:
- gateway/platforms/matrix.py: Matrix E2EE store was hardcoded to
  ~/.hermes/matrix/store, ignoring HERMES_HOME. Now uses
  get_hermes_home() so each profile gets its own Matrix state.

- gateway/platforms/telegram.py: Two locations reading config.yaml via
  Path.home()/.hermes instead of get_hermes_home(). DM topic thread_id
  persistence and hot-reload would read the wrong config in a profile.

- tools/file_tools.py: Security path for hub index blocking was
  hardcoded to ~/.hermes, would miss the actual profile's hub cache.

- hermes_cli/gateway.py: Service naming now uses the profile name
  (hermes-gateway-coder) instead of a cryptic hash suffix. Extracted
  _profile_suffix() helper shared by systemd and launchd.

- hermes_cli/gateway.py: Launchd plist path and Label now scoped per
  profile (ai.hermes.gateway-coder.plist). Previously all profiles
  would collide on the same plist file on macOS.

- hermes_cli/gateway.py: Launchd plist now includes HERMES_HOME in
  EnvironmentVariables — was missing entirely, making custom
  HERMES_HOME broken on macOS launchd (pre-existing bug).

- All launchctl commands in gateway.py, main.py, status.py updated
  to use get_launchd_label() instead of hardcoded string.

Test fixes: DM topic tests now set HERMES_HOME env var alongside
Path.home() mock. Launchd test uses get_launchd_label() for expected
commands.
2026-03-28 13:51:08 -07:00
Teknium
be322efdf2
fix(matrix): harden e2ee access-token handling (#3562)
* fix(matrix): harden e2ee access-token handling

* fix: patch nio mock in e2ee maintenance sync loop test

The sync_loop now imports nio for SyncError checking (from PR #3280),
so the test needs to inject a fake nio module via sys.modules.

---------

Co-authored-by: Cortana <andrew+cortana@chalkley.org>
2026-03-28 12:13:35 -07:00
Teknium
148f46620f
fix(matrix): add backoff for SyncError in sync loop (#3280)
When the homeserver returns an error response, matrix-nio parses it
as a SyncError return value rather than raising an exception. The sync
loop only had backoff in the except handler, so SyncError caused a
tight retry loop (~489 req/s) flooding logs and hammering the
homeserver. Check the return value and sleep 5s before retry.

Cherry-picked from PR #2937 by ticketclosed-wontfix.

Co-authored-by: ticketclosed-wontfix <ticketclosed-wontfix@users.noreply.github.com>
2026-03-26 16:19:58 -07:00
Teknium
8bb1d15da4
chore: remove ~100 unused imports across 55 files (#3016)
Automated cleanup via pyflakes + autoflake with manual review.

Changes:
- Removed unused stdlib imports (os, sys, json, pathlib.Path, etc.)
- Removed unused typing imports (List, Dict, Any, Optional, Tuple, Set, etc.)
- Removed unused internal imports (hermes_cli.auth, hermes_cli.config, etc.)
- Fixed cli.py: removed 8 shadowed banner imports (imported from hermes_cli.banner
  then immediately redefined locally — only build_welcome_banner is actually used)
- Added noqa comments to imports that appear unused but serve a purpose:
  - Re-exports (gateway/session.py SessionResetPolicy, tools/terminal_tool.py
    is_interrupted/_interrupt_event)
  - SDK presence checks in try/except (daytona, fal_client, discord)
  - Test mock targets (auxiliary_client.py Path, mcp_config.py get_hermes_home)

Zero behavioral changes. Full test suite passes (6162/6162, 2 pre-existing
streaming test failures unrelated to this change).
2026-03-25 15:02:03 -07:00
Teknium
5e5ad634a1
fix(matrix): duplicate messages, image caching for vision support (#2520)
Three fixes for the Matrix adapter:

1. Remove RoomMessageMedia callback registration — RoomMessageImage
   inherits from it, causing images to be processed twice.

2. Add event ID deduplication to both text and media handlers.
   nio can fire the same event more than once; bounded deque+set
   tracks the last 1000 events.

3. Cache images locally via Matrix client download. MXC URLs require
   authentication, so the vision pipeline couldn't access them.
   Images are now downloaded via the authenticated client and saved
   to the local cache (same pattern as Telegram/Discord).

Cherry-picked from PR #2353 by williamtwomey.

Co-authored-by: williamtwomey <williamtwomey@users.noreply.github.com>
2026-03-22 09:27:25 -07:00
Bartok9
66f71c1836 fix(matrix): use correct reply_to_message_id parameter name
Fixes #1842

The MessageEvent dataclass expects 'reply_to_message_id' but the Matrix
connector was passing 'reply_to'. This caused replies to fail with:

    MessageEvent.__init__() got an unexpected keyword argument 'reply_to'

Changed the parameter name to match the dataclass definition.
2026-03-18 02:23:21 -07:00
teknium1
6832d60bc0 fix(gateway): SMS persistent HTTP session + Matrix MIME media types
1. sms.py: Replace per-send aiohttp.ClientSession with a persistent
   session created in connect() and closed in disconnect(). Each
   outbound SMS no longer pays the TCP+TLS handshake cost. Falls back
   to a temporary session if the persistent one isn't available.

2. matrix.py: Use proper MIME types (image/png, audio/ogg, video/mp4)
   instead of bare category words (image, audio, video). The gateway's
   media processing checks startswith('image/') and startswith('audio/')
   so bare words caused Matrix images to skip vision enrichment and
   Matrix audio to skip transcription. Now extracts the actual MIME
   type from the nio event's content info when available.
2026-03-17 04:35:14 -07:00
teknium1
b111f2a779 fix(gateway): Matrix and Mattermost never report as connected
Neither adapter called _mark_connected() after successful connect(),
so _running stayed False, runtime status never showed 'connected',
and /status reported them as offline even while actively processing
messages.

Add _mark_connected() calls matching the pattern used by Telegram
and DingTalk adapters.
2026-03-17 04:01:02 -07:00
teknium1
cd67f60e01 feat: add Mattermost and Matrix gateway adapters
Add support for Mattermost (self-hosted Slack alternative) and Matrix
(federated messaging protocol) as messaging platforms.

Mattermost adapter:
- REST API v4 client for posts, files, channels, typing indicators
- WebSocket listener for real-time 'posted' events with reconnect backoff
- Thread support via root_id
- File upload/download with auth-aware caching
- Dedup cache (5min TTL, 2000 entries)
- Full self-hosted instance support

Matrix adapter:
- matrix-nio AsyncClient with sync loop
- Dual auth: access token or user_id + password
- Optional E2EE via matrix-nio[e2e] (libolm)
- Thread support via m.thread (MSC3440)
- Reply support via m.in_reply_to with fallback stripping
- Media upload/download via mxc:// URLs (authenticated v1.11+ endpoint)
- Auto-join on room invite
- DM detection via m.direct account data with sync fallback
- Markdown to HTML conversion

Fixes applied over original PR #1225 by @cyb0rgk1tty:
- Mattermost: add timeout to file downloads, wrap API helpers in
  try/except for network errors, download incoming files immediately
  with auth headers instead of passing auth-required URLs
- Matrix: use authenticated media endpoint (/_matrix/client/v1/media/),
  robust m.direct cache with sync fallback, prefer aiohttp over httpx

Install Matrix support: pip install 'hermes-agent[matrix]'
Mattermost needs no extra deps (uses aiohttp).

Salvaged from PR #1225 by @cyb0rgk1tty with fixes.
2026-03-17 03:18:16 -07:00