mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-11 08:42:11 +00:00
* fix(session-db): survive CLI/gateway concurrent write contention Closes #3139 Three layered fixes for the scenario where CLI and gateway write to state.db concurrently, causing create_session() to fail with 'database is locked' and permanently disabling session_search on the gateway side. 1. Increase SQLite connection timeout: 10s -> 30s hermes_state.py: longer window for the WAL writer to finish a batch flush before the other process gives up entirely. 2. INSERT OR IGNORE in create_session hermes_state.py: prevents IntegrityError on duplicate session IDs (e.g. gateway restarts while CLI session is still alive). 3. Don't null out _session_db on create_session failure (main fix) run_agent.py: a transient lock at agent startup must not permanently disable session_search for the lifetime of that agent instance. _session_db now stays alive so subsequent flushes and searches work once the lock clears. 4. New ensure_session() helper + call it during flush hermes_state.py: INSERT OR IGNORE for a minimal session row. run_agent.py _flush_messages_to_session_db: calls ensure_session() before appending messages, so the FK constraint is satisfied even when create_session() failed at startup. No-op when the row exists. * fix(state): release lock between context queries in search_messages The context-window queries (one per FTS5 match) were running inside the same lock acquisition as the primary FTS5 query, holding the lock for O(N) sequential SQLite round-trips. Move per-match context fetches outside the outer lock block so each acquires the lock independently, keeping critical sections short and allowing other threads to interleave. * fix(session): prefer longer source in load_transcript to prevent legacy truncation When a long-lived session pre-dates SQLite storage (e.g. sessions created before the DB layer was introduced, or after a clean deployment that reset the DB), _flush_messages_to_session_db only writes the *new* messages from the current turn to SQLite — it skips messages already present in conversation_history, assuming they are already persisted. That assumption fails for legacy JSONL-only sessions: Turn N (first after DB migration): load_transcript(id) → SQLite: 0 → falls back to JSONL: 994 ✓ _flush_messages_to_session_db: skip first 994, write 2 new → SQLite: 2 Turn N+1: load_transcript(id) → SQLite: 2 → returns immediately ✗ Agent sees 2 messages of history instead of 996 The same pattern causes the reported symptom: session JSON truncated to 4 messages (_save_session_log writes agent.messages which only has 2 history + 2 new = 4). Fix: always load both sources and return whichever is longer. For a fully-migrated session SQLite will always be ≥ JSONL, so there is no regression. For a legacy session that hasn't been bootstrapped yet, JSONL wins and the full history is restored. Closes #3212 * test: add load_transcript source preference tests for #3212 Covers: JSONL longer returns JSONL, SQLite longer returns SQLite, SQLite empty falls back to JSONL, both empty returns empty, equal length prefers SQLite (richer reasoning fields). --------- Co-authored-by: Mibayy <mibayy@hermes.ai> Co-authored-by: kewe63 <kewe.3217@gmail.com> Co-authored-by: Mibayy <mibayy@users.noreply.github.com> |
||
|---|---|---|
| .. | ||
| __init__.py | ||
| test_agent_cache.py | ||
| test_api_server.py | ||
| test_api_server_jobs.py | ||
| test_approve_deny_commands.py | ||
| test_async_memory_flush.py | ||
| test_background_command.py | ||
| test_background_process_notifications.py | ||
| test_base_topic_sessions.py | ||
| test_channel_directory.py | ||
| test_config.py | ||
| test_config_cwd_bridge.py | ||
| test_delivery.py | ||
| test_dingtalk.py | ||
| test_discord_bot_filter.py | ||
| test_discord_document_handling.py | ||
| test_discord_free_response.py | ||
| test_discord_imports.py | ||
| test_discord_media_metadata.py | ||
| test_discord_opus.py | ||
| test_discord_send.py | ||
| test_discord_slash_commands.py | ||
| test_discord_system_messages.py | ||
| test_discord_thread_persistence.py | ||
| test_dm_topics.py | ||
| test_document_cache.py | ||
| test_email.py | ||
| test_extract_local_files.py | ||
| test_flush_memory_stale_guard.py | ||
| test_gateway_shutdown.py | ||
| test_homeassistant.py | ||
| test_honcho_lifecycle.py | ||
| test_hooks.py | ||
| test_interrupt_key_match.py | ||
| test_matrix.py | ||
| test_mattermost.py | ||
| test_media_extraction.py | ||
| test_mirror.py | ||
| test_pairing.py | ||
| test_pii_redaction.py | ||
| test_plan_command.py | ||
| test_platform_base.py | ||
| test_platform_reconnect.py | ||
| test_queue_consumption.py | ||
| test_reasoning_command.py | ||
| test_resume_command.py | ||
| test_retry_replacement.py | ||
| test_retry_response.py | ||
| test_run_progress_topics.py | ||
| test_runner_fatal_adapter.py | ||
| test_runner_startup_failures.py | ||
| test_send_image_file.py | ||
| test_session.py | ||
| test_session_env.py | ||
| test_session_hygiene.py | ||
| test_session_race_guard.py | ||
| test_session_reset_notify.py | ||
| test_signal.py | ||
| test_slack.py | ||
| test_sms.py | ||
| test_ssl_certs.py | ||
| test_status.py | ||
| test_status_command.py | ||
| test_sticker_cache.py | ||
| test_stt_config.py | ||
| test_telegram_conflict.py | ||
| test_telegram_documents.py | ||
| test_telegram_format.py | ||
| test_telegram_photo_interrupts.py | ||
| test_telegram_reply_mode.py | ||
| test_telegram_text_batching.py | ||
| test_title_command.py | ||
| test_transcript_offset.py | ||
| test_unauthorized_dm_behavior.py | ||
| test_update_command.py | ||
| test_voice_command.py | ||
| test_webhook_adapter.py | ||
| test_webhook_integration.py | ||
| test_whatsapp_connect.py | ||
| test_whatsapp_reply_prefix.py | ||