perf(desktop): make session-id search SQL-bounded, not O(n)

search_sessions_by_id previously fetched up to 10k sessions via
list_sessions_rich and filtered them in Python — O(n) per keystroke.
Push the id match into SQL instead.

- list_sessions_rich gains an optional id_query param: a case-insensitive
  LIKE pushed into the outer WHERE, matched against each surfaced row's id
  AND every id in its forward compression chain (via the existing chain
  CTE). Searching by a compression root id or by a tip id resolves to the
  same projected conversation. LIKE wildcards in the needle are escaped.
- search_sessions_by_id now fetches only matching rows (limit*4) and ranks
  exact > prefix > substring in Python over that small set.
- web_server /api/sessions/search: route ID matches and content matches
  through one lineage-keyed dedup helper so an id-hit and a content-hit on
  the same conversation collapse to a single result (the contributor's
  version keyed ID hits by raw sid and content hits by root, which could
  double-list a compression tip).
- command-center haystack also matches _lineage_root_id for parity.
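
A minimal sketch of the first two bullets, not the code in this commit: it
assumes a bare `sessions(id)` table and hypothetical helper names
(`escape_like`, `search_ids`), and it leaves out the compression-chain CTE
entirely. It only illustrates escaping LIKE wildcards, bounding the fetch
to limit*4, and ranking exact > prefix > substring over the small result.

```python
import sqlite3


def escape_like(needle: str, escape: str = "\\") -> str:
    """Escape LIKE metacharacters so the needle is matched literally."""
    return (
        needle.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def search_ids(conn: sqlite3.Connection, query: str, limit: int = 20) -> list[str]:
    """Return ids containing `query`, ranked exact > prefix > substring."""
    needle = query.strip().lower()
    if not needle:
        return []
    pattern = f"%{escape_like(needle)}%"
    # Hypothetical schema: a `sessions` table with a text `id` column.
    # The match happens in SQL, so Python only sees up to limit*4 rows.
    rows = conn.execute(
        "SELECT id FROM sessions WHERE LOWER(id) LIKE ? ESCAPE '\\' LIMIT ?",
        (pattern, limit * 4),
    ).fetchall()

    def rank(sid: str) -> tuple[int, str]:
        lowered = sid.lower()
        if lowered == needle:
            return (0, sid)  # exact match
        if lowered.startswith(needle):
            return (1, sid)  # prefix match
        return (2, sid)      # substring match

    return sorted((row[0] for row in rows), key=rank)[:limit]
```

Over-fetching by a small factor before ranking keeps the SQL simple while
still leaving room for an exact match to outrank earlier substring hits.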

E2E verified against a real DB: exact match over 3000+ sessions
materializes 1 row in Python (was ~3000), 5ms; root-id resolves to tip;
LIKE-wildcard escaping holds.
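
The lineage-keyed dedup described for /api/sessions/search can be pictured
roughly as follows. This is a sketch under assumptions, not the helper in
web_server: `lineage_root` and `merge_by_lineage` are hypothetical names,
and the root is found by walking `parent_session_id` via a `get_session`
callable shaped like the one in the test fake.

```python
from typing import Callable, Iterable, Optional


def lineage_root(session_id: str, get_session: Callable[[str], Optional[dict]]) -> str:
    """Follow parent_session_id links up to the root, guarding against cycles."""
    seen: set[str] = set()
    current = session_id
    while current not in seen:
        seen.add(current)
        row = get_session(current)
        parent = row.get("parent_session_id") if row else None
        if not parent:
            return current
        current = parent
    return current  # cycle detected; stop at the last id visited


def merge_by_lineage(
    id_hits: Iterable[dict],
    content_hits: Iterable[dict],
    get_session: Callable[[str], Optional[dict]],
    limit: int,
) -> list[dict]:
    """Merge ID hits before content hits, keeping one result per lineage root."""
    results: list[dict] = []
    seen_roots: set[str] = set()
    for hit in [*id_hits, *content_hits]:
        root = lineage_root(hit["session_id"], get_session)
        if root in seen_roots:
            continue  # same conversation already listed
        seen_roots.add(root)
        results.append({**hit, "lineage_root": root})
        if len(results) >= limit:
            break
    return results
```

Keying both kinds of hit by the same root is what prevents a compression
tip from being listed once as an ID match and again as a content match.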

Follow-up to @0xharryriddle's feat(desktop): search sessions by id.
Teknium 2026-06-04 06:05:22 -07:00
parent 9ecc331be8
commit 580d924097
4 changed files with 133 additions and 73 deletions

@@ -4,11 +4,18 @@ from hermes_cli import web_server
class _FakeSessionDB:
    """Fake backing the /api/sessions/search endpoint.
    The endpoint surfaces direct session-id matches first, then FTS message
    matches, deduping both by compression lineage root. This fake has no
    compression chains (get_session returns no parent), so each session is its
    own lineage root.
    """
    closed = False
    def search_sessions_by_id(self, query, limit=20, include_archived=True):
        assert query == "20260603"
        assert limit == 2
        assert include_archived is True
        return [
            {
@@ -22,7 +29,6 @@ class _FakeSessionDB:
    def search_messages(self, query, limit=20):
        assert query == "20260603*"
        assert limit == 2
        return [
            {
                "session_id": "20260603_090200_exact",
@@ -42,6 +48,13 @@ class _FakeSessionDB:
            },
        ]
    def get_session(self, session_id):
        # No compression chains in this fixture: every session is its own root.
        return {"id": session_id, "parent_session_id": None}
    def get_compression_tip(self, session_id):
        return session_id
    def close(self):
        self.closed = True
@@ -51,10 +64,13 @@ def test_desktop_session_search_merges_id_matches_before_content_matches(monkeyp
    response = asyncio.run(web_server.search_sessions(q="20260603", limit=2))
    # ID match surfaces first; the content hit on the SAME session is deduped
    # by lineage root (not double-listed); the unrelated content hit follows.
    assert response == {
        "results": [
            {
                "session_id": "20260603_090200_exact",
                "lineage_root": "20260603_090200_exact",
                "snippet": "ID match preview",
                "role": None,
                "source": "cli",
@@ -63,6 +79,7 @@ def test_desktop_session_search_merges_id_matches_before_content_matches(monkeyp
            },
            {
                "session_id": "content_session",
                "lineage_root": "content_session",
                "snippet": "content hit",
                "role": "assistant",
                "source": "desktop",