Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-06-05 07:41:39 +00:00)
The 5-second startup-grace filter in _on_room_message silently drops events where event_ts < startup_ts - 5. When the host clock is set ahead of real time, the comparison flips against every live event and the bot 'connects but never replies', which is exactly the symptom in #12614. Reporter Schnurzel700 chased this for several weeks before tracing it to their Debian VM's clock being out of sync.

The current /1000.0 millisecond->second conversion is correct (mautrix returns ms); the failure mode is purely environmental.

Add a one-shot WARNING that fires when:
- we are >30s past startup (initial-sync replay window closed), AND
- 3 consecutive drops share the same skew within 60s (a constant clock offset, not varied-age backfill from an invited room).

State is reset in connect() so reconnects after fixing NTP rearm the detector.

Includes the NTP fix instruction in the warning message itself and a new Troubleshooting entry in the Matrix docs.

5 new tests cover the happy path, initial-sync backfill, under-threshold drops, varied-age backfill, and the reconnect rearm path.
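For readers without the diff at hand, the detector described above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the `SkewDetector` class, its method names, the constants, and the text after the semicolon in the warning are all hypothetical, and "same skew within 60s" is read here as "the last three skews differ by at most 60 seconds".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

STARTUP_GRACE_S = 5.0    # events older than startup minus this are dropped
REPLAY_WINDOW_S = 30.0   # initial-sync replay is assumed finished after this
DROPS_TO_WARN = 3        # consecutive drops needed before warning
SKEW_TOLERANCE_S = 60.0  # assumed reading of "same skew within 60s"


@dataclass
class SkewDetector:
    """Hypothetical stand-in for the state the adapter keeps."""
    startup_ts: float
    warned: bool = False
    skews: List[float] = field(default_factory=list)

    def reset(self, startup_ts: float) -> None:
        """Re-arm the detector; meant to be called from connect()."""
        self.startup_ts = startup_ts
        self.warned = False
        self.skews.clear()

    def on_event(self, event_ts_ms: int, now: float) -> Tuple[bool, Optional[str]]:
        """Return (dropped, warning); warning is set at most once per arm."""
        event_ts = event_ts_ms / 1000.0  # timestamps arrive in milliseconds
        if event_ts >= self.startup_ts - STARTUP_GRACE_S:
            self.skews.clear()  # a delivered event breaks the streak
            return False, None
        if now - self.startup_ts <= REPLAY_WINDOW_S:
            return True, None  # still inside the initial-sync replay window
        # Keep only the most recent DROPS_TO_WARN skews.
        self.skews = (self.skews + [now - event_ts])[-DROPS_TO_WARN:]
        constant = (
            len(self.skews) == DROPS_TO_WARN
            and max(self.skews) - min(self.skews) <= SKEW_TOLERANCE_S
        )
        if constant and not self.warned:
            self.warned = True
            return True, (
                f"Matrix: dropped {DROPS_TO_WARN} live events as 'too old' more than "
                "30s after startup; the host clock may be ahead, sync it with NTP"
            )
        return True, None


# Host clock 300 s ahead: live events look 300 s old on every delivery.
detector = SkewDetector(startup_ts=1000.0)
results = [
    detector.on_event(int((now - 300.0) * 1000), now)
    for now in (1040.0, 1041.0, 1042.0, 1043.0)
]
```

In this sketch the third drop carries the warning and the fourth stays silent, which is the one-shot behaviour; calling `reset()` arms it again. Varied-age backfill produces widely spread skews and therefore never trips the constant-offset check.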
parent 56ad30de17
commit 519657aa98
3 changed files with 280 additions and 0 deletions
@@ -357,6 +357,23 @@ To find a Room ID: in Element, go to the room → **Settings** → **Advanced**

**Fix**: Invite the bot to the room; it auto-joins on invite. Verify your User ID is in `MATRIX_ALLOWED_USERS` (use the full `@user:server` format). Restart the gateway.

### Bot joins rooms but silently drops every message (clock skew)

**Cause**: The host's system clock is set ahead of real time. The Matrix adapter applies a 5-second startup-grace filter (`event_ts < startup_ts - 5`) to ignore events replayed from initial sync. When the wall clock is ahead, every incoming event looks "older than startup" and is dropped before reaching the message handler, so the bot appears connected but never replies. See [#12614](https://github.com/NousResearch/hermes-agent/issues/12614).
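
To see why a fast clock flips the comparison, here is a small worked example. It is only a sketch: `is_too_old` and the timestamps are made up for illustration, and the one thing taken from the adapter is the `event_ts < startup_ts - 5` check with event timestamps in milliseconds.

```python
STARTUP_GRACE_S = 5.0


def is_too_old(event_ts_ms: int, startup_ts: float) -> bool:
    """Mirror of the documented check: event_ts < startup_ts - 5."""
    event_ts = event_ts_ms / 1000.0  # event timestamps are in milliseconds
    return event_ts < startup_ts - STARTUP_GRACE_S


true_start = 1_700_000_000.0  # real time at which the bot started (made up)
# A message sent 60 s after startup, stamped by the homeserver in real time.
event_ms = int((true_start + 60.0) * 1000)

# Correct host clock: startup_ts equals the real start time, event is kept.
kept_with_good_clock = not is_too_old(event_ms, startup_ts=true_start)

# Host clock one hour ahead: startup_ts was recorded 3600 s too late,
# so the same live event now looks older than startup and is dropped.
dropped_with_fast_clock = is_too_old(event_ms, startup_ts=true_start + 3600.0)
```

Both flags come out `True`: the same event passes with a correct clock and is rejected once `startup_ts` was taken from a clock running an hour fast.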

**Symptom**: Gateway log shows `Matrix: dropped N live events as 'too old' more than 30s after startup`.

**Fix**: Sync the host clock with NTP and restart the bot:

```bash
# Debian/Ubuntu
sudo timedatectl set-ntp true
timedatectl status   # confirm "System clock synchronized: yes"

# macOS
sudo sntp -sS time.apple.com
```

### "Failed to authenticate" / "whoami failed" on startup

**Cause**: The access token or homeserver URL is incorrect.