mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-26 01:01:40 +00:00
fix(matrix): replace pickle crypto store with SQLite, fix E2EE decryption (#7981)
Fixes #7952 — Matrix E2EE completely broken after mautrix migration. - Replace MemoryCryptoStore + pickle/HMAC persistence with mautrix's PgCryptoStore backed by SQLite via aiosqlite. Crypto state now persists reliably across restarts without fragile serialization. - Add handle_sync() call on initial sync response so to-device events (queued Megolm key shares) are dispatched to OlmMachine instead of being silently dropped. - Add _verify_device_keys_on_server() after loading crypto state. Detects missing keys (re-uploads), stale keys from migration (attempts re-upload), and corrupted state (refuses E2EE). - Add _CryptoStateStore adapter wrapping MemoryStateStore to satisfy mautrix crypto's StateStore interface (is_encrypted, get_encryption_info, find_shared_rooms). - Remove redundant share_keys() call from sync loop — OlmMachine already handles this via DEVICE_OTK_COUNT event handler. - Fix datetime vs float TypeError in session.py suspend_recently_active() that crashed gateway startup. - Add aiosqlite and asyncpg to [matrix] extra in pyproject.toml. - Update test mocks for PgCryptoStore/Database and add query_keys mock for key verification. 174 tests pass. - Add E2EE upgrade/migration docs to Matrix user guide.
This commit is contained in:
parent
27eeea0555
commit
50d86b3c71
5 changed files with 298 additions and 64 deletions
|
|
@ -344,9 +344,79 @@ pip install 'hermes-agent[matrix]'
|
|||
**Fix**:
|
||||
1. Verify `libolm` is installed on your system (see the E2EE section above).
|
||||
2. Make sure `MATRIX_ENCRYPTION=true` is set in your `.env`.
|
||||
3. In your Matrix client (Element), go to the bot's profile → **Sessions** → verify/trust the bot's device.
|
||||
3. In your Matrix client (Element), go to the bot's profile -> Sessions -> verify/trust the bot's device.
|
||||
4. If the bot just joined an encrypted room, it can only decrypt messages sent *after* it joined. Older messages are inaccessible.
|
||||
|
||||
### Upgrading from a previous version with E2EE
|
||||
|
||||
If you previously used Hermes with `MATRIX_ENCRYPTION=true` and are upgrading to
|
||||
a version that uses the new SQLite-based crypto store, the bot's encryption
|
||||
identity has changed. Your Matrix client (Element) may cache the old device keys
|
||||
and refuse to share encryption sessions with the bot.
|
||||
|
||||
**Symptoms**: The bot connects and shows "E2EE enabled" in the logs, but all
|
||||
messages show "could not decrypt event" and the bot never responds.
|
||||
|
||||
**What's happening**: The old encryption state (from the previous `matrix-nio` or
|
||||
serialization-based `mautrix` backend) is incompatible with the new SQLite crypto
|
||||
store. The bot creates a fresh encryption identity, but your Matrix client still
|
||||
has the old keys cached and won't share the room's encryption session with a
|
||||
device whose keys changed. This is a Matrix security feature -- clients treat
|
||||
changed identity keys for the same device as suspicious.
|
||||
|
||||
**Fix** (one-time migration):
|
||||
|
||||
1. **Generate a new access token** to get a fresh device ID. The simplest way:
|
||||
|
||||
```bash
|
||||
curl -X POST https://your-server/_matrix/client/v3/login \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"type": "m.login.password",
|
||||
"identifier": {"type": "m.id.user", "user": "@hermes:your-server.org"},
|
||||
"password": "your-password",
|
||||
"initial_device_display_name": "Hermes Agent"
|
||||
}'
|
||||
```
|
||||
|
||||
Copy the new `access_token` and update `MATRIX_ACCESS_TOKEN` in `~/.hermes/.env`.
|
||||
|
||||
2. **Delete old encryption state**:
|
||||
|
||||
```bash
|
||||
rm -f ~/.hermes/platforms/matrix/store/crypto.db
|
||||
rm -f ~/.hermes/platforms/matrix/store/crypto_store.*
|
||||
```
|
||||
|
||||
3. **Force your Matrix client to rotate the encryption session**. In Element,
|
||||
open the DM room with the bot and type `/discardsession`. This forces Element
|
||||
to create a new encryption session and share it with the bot's new device.
|
||||
|
||||
4. **Restart the gateway**:
|
||||
|
||||
```bash
|
||||
hermes gateway run
|
||||
```
|
||||
|
||||
5. **Send a new message**. The bot should decrypt and respond normally.
|
||||
|
||||
:::note
|
||||
After migration, messages sent *before* the upgrade cannot be decrypted -- the old
|
||||
encryption keys are gone. This only affects the transition; new messages work
|
||||
normally.
|
||||
:::
|
||||
|
||||
:::tip
|
||||
**New installations are not affected.** This migration is only needed if you had
|
||||
a working E2EE setup with a previous version of Hermes and are upgrading.
|
||||
|
||||
**Why a new access token?** Each Matrix access token is bound to a specific device
|
||||
ID. Reusing the same device ID with new encryption keys causes other Matrix
|
||||
clients to distrust the device (they see changed identity keys as a potential
|
||||
security breach). A new access token gets a new device ID with no stale key
|
||||
history, so other clients trust it immediately.
|
||||
:::
|
||||
|
||||
### Sync issues / bot falls behind
|
||||
|
||||
**Cause**: Long-running tool executions can delay the sync loop, or the homeserver is slow.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue