Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-04-26 01:01:40 +00:00)

Add official optional skill for qmd (tobi/qmd), a local on-device search engine for personal knowledge bases, notes, docs, and meeting transcripts.

Covers:

- Installation and setup for macOS and Linux
- Collection management and context annotations
- All search modes: BM25, vector, hybrid with reranking
- MCP integration (stdio and HTTP daemon modes)
- Structured query patterns and best practices
- systemd/launchd service configs for daemon persistence

Placed in optional-skills/ due to heavyweight requirements (Node >= 22, ~2GB local models).

---
name: qmd
description: Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd, a hybrid retrieval engine with BM25, vector search, and LLM reranking. Supports CLI and MCP integration.
version: 1.0.0
author: Hermes Agent + Teknium
license: MIT
platforms: [macos, linux]
metadata:
  hermes:
    tags: [Search, Knowledge-Base, RAG, Notes, MCP, Local-AI]
    related_skills: [obsidian, native-mcp, arxiv]
---

# QMD: Query Markup Documents

Local, on-device search engine for personal knowledge bases. Indexes markdown
notes, meeting transcripts, documentation, and any text-based files, then
provides hybrid search combining keyword matching, semantic understanding, and
LLM-powered reranking, all running locally with no cloud dependencies.

Created by [Tobi Lütke](https://github.com/tobi/qmd). MIT licensed.

## When to Use

- User asks to search their notes, docs, knowledge base, or meeting transcripts
- User wants to find something across a large collection of markdown/text files
- User wants semantic search ("find notes about X concept"), not just keyword grep
- User has already set up qmd collections and wants to query them
- User asks to set up a local knowledge base or document search system
- Keywords: "search my notes", "find in my docs", "knowledge base", "qmd"

## Prerequisites

### Node.js >= 22 (required)

```bash
# Check version
node --version  # must be >= 22

# macOS: install or upgrade via Homebrew
brew install node@22

# Linux: use NodeSource or nvm
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt-get install -y nodejs
# or with nvm:
nvm install 22 && nvm use 22
```

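The version requirement above can also be checked from a script. This is a minimal sketch, assuming `node --version` prints a string of the form `v22.3.0`; the function name `node_major_ok` is hypothetical and not part of qmd or Node.

```bash
# Hypothetical helper: succeed if a Node version string has major >= 22.
node_major_ok() {
  local version="${1#v}"        # strip a leading "v" (v22.3.0 -> 22.3.0)
  local major="${version%%.*}"  # keep everything before the first dot
  # Reject empty or non-numeric input instead of guessing.
  case "$major" in
    ''|*[!0-9]*) return 2 ;;
  esac
  [ "$major" -ge 22 ]
}
```

A typical call would be `node_major_ok "$(node --version)" || echo "Node >= 22 required" >&2`, guarded by `command -v node` on machines where Node may be missing.
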
### SQLite with Extension Support (macOS only)

macOS system SQLite lacks extension loading. Install via Homebrew:

```bash
brew install sqlite
```

### Install qmd

```bash
npm install -g @tobilu/qmd
# or with Bun:
bun install -g @tobilu/qmd
```

First run auto-downloads 3 local GGUF models (~2GB total):

| Model | Purpose | Size |
|-------|---------|------|
| embeddinggemma-300M-Q8_0 | Vector embeddings | ~300MB |
| qwen3-reranker-0.6b-q8_0 | Result reranking | ~640MB |
| qmd-query-expansion-1.7B | Query expansion | ~1.1GB |

### Verify Installation

```bash
qmd --version
qmd status
```

## Quick Reference

| Command | What It Does | Speed |
|---------|-------------|-------|
| `qmd search "query"` | BM25 keyword search (no models) | ~0.2s |
| `qmd vsearch "query"` | Semantic vector search (1 model) | ~3s |
| `qmd query "query"` | Hybrid + reranking (all 3 models) | ~2-3s warm, ~19s cold |
| `qmd get <docid>` | Retrieve full document content | instant |
| `qmd multi-get "glob"` | Retrieve multiple files | instant |
| `qmd collection add <path> --name <n>` | Add a directory as a collection | instant |
| `qmd context add <path> "description"` | Add context metadata to improve retrieval | instant |
| `qmd embed` | Generate/update vector embeddings | varies |
| `qmd status` | Show index health and collection info | instant |
| `qmd mcp` | Start MCP server (stdio) | persistent |
| `qmd mcp --http --daemon` | Start MCP server (HTTP, warm models) | persistent |

## Setup Workflow

### 1. Add Collections

Point qmd at directories containing your documents:

```bash
# Add a notes directory
qmd collection add ~/notes --name notes

# Add project docs
qmd collection add ~/projects/myproject/docs --name project-docs

# Add meeting transcripts
qmd collection add ~/meetings --name meetings

# List all collections
qmd collection list
```

### 2. Add Context Descriptions

Context metadata helps the search engine understand what each collection
contains. This significantly improves retrieval quality:

```bash
qmd context add qmd://notes "Personal notes, ideas, and journal entries"
qmd context add qmd://project-docs "Technical documentation for the main project"
qmd context add qmd://meetings "Meeting transcripts and action items from team syncs"
```

### 3. Generate Embeddings

```bash
qmd embed
```

This processes all documents in all collections and generates vector
embeddings. Re-run after adding new documents or collections.

### 4. Verify

```bash
qmd status  # shows index health, collection stats, model info
```

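Steps 1 and 2 are usually run as a pair for each directory, so they can be wrapped in one function. The sketch below uses only the `qmd collection add` and `qmd context add` invocations shown above; the wrapper name `qmd_add_collection` and its directory check are hypothetical conveniences, not part of qmd.

```bash
# Hypothetical wrapper: register one directory and describe it.
# Arguments: <path> <collection-name> <description>
qmd_add_collection() {
  local path="$1" name="$2" description="$3"
  # Fail early on a typo in the path rather than registering nothing.
  if [ ! -d "$path" ]; then
    echo "skip: $path is not a directory" >&2
    return 1
  fi
  # Step 1: add the collection. Step 2: attach its context description.
  qmd collection add "$path" --name "$name" &&
    qmd context add "qmd://$name" "$description"
}
```

For example, `qmd_add_collection ~/notes notes "Personal notes, ideas, and journal entries" && qmd embed` would cover steps 1 through 3 for a single collection.
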
## Search Patterns

### Fast Keyword Search (BM25)

Best for: exact terms, code identifiers, names, known phrases.
No models are loaded, so results are near-instant.

```bash
qmd search "authentication middleware"
qmd search "handleError async"
```

### Semantic Vector Search

Best for: natural language questions, conceptual queries.
Loads the embedding model (~3s on the first query).

```bash
qmd vsearch "how does the rate limiter handle burst traffic"
qmd vsearch "ideas for improving onboarding flow"
```

### Hybrid Search with Reranking (Best Quality)

Best for: important queries where quality matters most.
Uses all 3 models: query expansion, parallel BM25 + vector retrieval, and reranking.

```bash
qmd query "what decisions were made about the database migration"
```

### Structured Multi-Mode Queries

Combine different search types in a single query for precision:

```bash
# BM25 for exact term + vector for concept
qmd query $'lex: rate limiter\nvec: how does throttling work under load'

# With query expansion
qmd query $'expand: database migration plan\nlex: "schema change"'
```

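When the sub-queries come from variables, writing the `$'...\n...'` string by hand gets awkward. A small sketch of a builder that joins type/text pairs into the same newline-separated form; the function name `build_structured_query` is hypothetical, and the accepted prefixes are limited to the ones that appear in this document (`lex`, `vec`, `expand`, `hyde`).

```bash
# Hypothetical helper: turn "type text type text ..." arguments into
# newline-separated "type: text" lines for `qmd query`.
build_structured_query() {
  local out="" type text
  while [ "$#" -ge 2 ]; do
    type="$1"; text="$2"; shift 2
    case "$type" in
      lex|vec|expand|hyde) ;;
      *) echo "unknown query type: $type" >&2; return 1 ;;
    esac
    if [ -z "$out" ]; then
      out="$type: $text"
    else
      # Join with a newline; no trailing newline is added.
      out="$(printf '%s\n%s: %s' "$out" "$type" "$text")"
    fi
  done
  printf '%s' "$out"
}
```

Usage would look like `qmd query "$(build_structured_query lex 'rate limiter' vec 'how does throttling work under load')"`. Argument order is preserved, which matters because the first query in a structured search gets 2x weight.
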
### Query Syntax (lex/BM25 mode)

| Syntax | Effect | Example |
|--------|--------|---------|
| `term` | Prefix match | `perf` matches "performance" |
| `"phrase"` | Exact phrase | `"rate limiter"` |
| `-term` | Exclude term | `performance -sports` |

### HyDE (Hypothetical Document Embeddings)

For complex topics, write what you expect the answer to look like:

```bash
qmd query $'hyde: The migration plan involves three phases. First, we add the new columns without dropping the old ones. Then we backfill data. Finally we cut over and remove legacy columns.'
```

### Scoping to Collections

```bash
qmd search "query" --collection notes
qmd query "query" --collection project-docs
```

### Output Formats

```bash
qmd search "query" --json              # JSON output (best for parsing)
qmd search "query" --limit 5           # Limit results
qmd get "#abc123"                      # Get by document ID
qmd get "path/to/file.md"              # Get by file path
qmd get "file.md:50" -l 100            # Get up to 100 lines starting at line 50
qmd multi-get "journals/*.md" --json   # Batch retrieve by glob
```

## MCP Integration (Recommended)

qmd exposes an MCP server that provides search tools directly to
Hermes Agent via the native MCP client. This is the preferred
integration: once configured, the agent gets qmd tools automatically
without needing to load this skill.

### Option A: Stdio Mode (Simple)

Add to `~/.hermes/config.yaml`:

```yaml
mcp_servers:
  qmd:
    command: "qmd"
    args: ["mcp"]
    timeout: 30
    connect_timeout: 45
```

This registers tools: `mcp_qmd_search`, `mcp_qmd_vsearch`,
`mcp_qmd_deep_search`, `mcp_qmd_get`, `mcp_qmd_status`.

**Tradeoff:** Models load on first search call (~19s cold start),
then stay warm for the session. Acceptable for occasional use.

### Option B: HTTP Daemon Mode (Fast, Recommended for Heavy Use)

Start the qmd daemon separately; it keeps models warm in memory:

```bash
# Start daemon (persists across agent restarts)
qmd mcp --http --daemon

# Runs on http://localhost:8181 by default
```

Then configure Hermes Agent to connect via HTTP:

```yaml
mcp_servers:
  qmd:
    url: "http://localhost:8181/mcp"
    timeout: 30
```

**Tradeoff:** Uses ~2GB RAM while running, but every query is fast
(~2-3s). Best for users who search frequently.

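Because the daemon is started separately from the agent, a startup script may want to wait until the port answers before launching Hermes Agent. The following is a sketch only: it polls the daemon's base URL with `curl` and treats any HTTP response, even an error status, as a sign that the server is listening. The function name `wait_for_qmd`, the retry count, and the one-second delay are assumptions, and no dedicated health endpoint is assumed.

```bash
# Hypothetical helper: poll a URL until something answers or we give up.
# Arguments: [url] [max attempts]
wait_for_qmd() {
  local url="${1:-http://localhost:8181/}"
  local attempts="${2:-30}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # Without -f, curl exits 0 for any HTTP response, including 4xx.
    if curl -s -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "qmd daemon did not answer at $url" >&2
  return 1
}
```

A launcher could then run `qmd mcp --http --daemon && wait_for_qmd` before starting the agent, so the first MCP connection does not race the server startup.
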
### Keeping the Daemon Running

#### macOS (launchd)

```bash
cat > ~/Library/LaunchAgents/com.qmd.daemon.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.qmd.daemon</string>
  <key>ProgramArguments</key>
  <array>
    <string>qmd</string>
    <string>mcp</string>
    <string>--http</string>
    <string>--daemon</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
  <key>StandardOutPath</key>
  <string>/tmp/qmd-daemon.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/qmd-daemon.log</string>
</dict>
</plist>
EOF

launchctl load ~/Library/LaunchAgents/com.qmd.daemon.plist
```

Note: launchd runs jobs with a minimal `PATH`, so if the job fails to start,
replace `qmd` with the absolute path reported by `command -v qmd` and make sure
`node` is resolvable as well. `KeepAlive` supervises a foreground process; if
`--daemon` detaches into the background on your version, drop that flag from
`ProgramArguments`.

#### Linux (systemd user service)

```bash
mkdir -p ~/.config/systemd/user

cat > ~/.config/systemd/user/qmd-daemon.service << 'EOF'
[Unit]
Description=QMD MCP Daemon
After=network.target

[Service]
ExecStart=qmd mcp --http --daemon
Restart=on-failure
RestartSec=10
Environment=PATH=/usr/local/bin:/usr/bin:/bin

[Install]
WantedBy=default.target
EOF

systemctl --user daemon-reload
systemctl --user enable --now qmd-daemon
systemctl --user status qmd-daemon
```

Note: if qmd was installed via nvm or Bun, extend `Environment=PATH=...` or use
the absolute path from `command -v qmd` in `ExecStart`. If `--daemon` detaches
into the background on your version, drop it so systemd can track the process.

### MCP Tools Reference

Once connected, these tools are available as `mcp_qmd_*`:

| MCP Tool | Maps To | Description |
|----------|---------|-------------|
| `mcp_qmd_search` | `qmd search` | BM25 keyword search |
| `mcp_qmd_vsearch` | `qmd vsearch` | Semantic vector search |
| `mcp_qmd_deep_search` | `qmd query` | Hybrid search + reranking |
| `mcp_qmd_get` | `qmd get` | Retrieve document by ID or path |
| `mcp_qmd_status` | `qmd status` | Index health and stats |

The MCP tools accept structured JSON queries for multi-mode search:

```json
{
  "searches": [
    {"type": "lex", "query": "authentication middleware"},
    {"type": "vec", "query": "how user login is verified"}
  ],
  "collections": ["project-docs"],
  "limit": 10
}
```

## CLI Usage (Without MCP)

When MCP is not configured, use qmd directly via terminal:

```
terminal(command="qmd query 'what was decided about the API redesign' --json", timeout=30)
```

For setup and management tasks, always use terminal:

```
terminal(command="qmd collection add ~/Documents/notes --name notes")
terminal(command="qmd context add qmd://notes 'Personal research notes and ideas'")
terminal(command="qmd embed")
terminal(command="qmd status")
```

## How the Search Pipeline Works

Understanding the internals helps you choose the right search mode:

1. **Query Expansion:** A fine-tuned 1.7B model generates 2 alternative
   queries. The original gets 2x weight in fusion.
2. **Parallel Retrieval:** BM25 (SQLite FTS5) and vector search run
   simultaneously across all query variants.
3. **RRF Fusion:** Reciprocal Rank Fusion (k=60) merges results.
   Top-rank bonus: #1 gets +0.05, #2-3 get +0.02.
4. **LLM Reranking:** qwen3-reranker scores the top 30 candidates (0.0-1.0).
5. **Position-Aware Blending:** Ranks 1-3: 75% retrieval / 25% reranker.
   Ranks 4-10: 60/40. Ranks 11+: 40/60 (trusts the reranker more for the long tail).

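Steps 3 and 5 are plain arithmetic and can be illustrated with `awk`. This is a sketch under stated assumptions, not qmd's actual code: ranks are taken as 1-based with a per-list contribution of `weight / (k + rank)`, the top-rank bonus is read as applying to a document's best rank in any list, and the function names `rrf_fuse` and `blend_score` are hypothetical.

```bash
# Sketch of weighted RRF (step 3). Input lines on stdin:
#   <list_weight> <rank> <doc_id>
# Output: "<score> <doc_id>", best first.
rrf_fuse() {
  awk -v k=60 '
    {
      score[$3] += $1 / (k + $2)   # weighted reciprocal rank
      if (!($3 in best) || $2 < best[$3]) { best[$3] = $2 }
    }
    END {
      for (doc in score) {
        # Assumed reading of the bonus: based on best rank in any list.
        if (best[doc] == 1) { score[doc] += 0.05 }
        else if (best[doc] <= 3) { score[doc] += 0.02 }
        printf "%.4f %s\n", score[doc], doc
      }
    }
  ' | sort -rn
}

# Position-aware blend (step 5).
# Arguments: <rank> <retrieval score> <reranker score>
blend_score() {
  awk -v rank="$1" -v ret="$2" -v rer="$3" 'BEGIN {
    if (rank <= 3) { w = 0.75 }        # ranks 1-3: mostly retrieval
    else if (rank <= 10) { w = 0.60 }  # ranks 4-10
    else { w = 0.40 }                  # ranks 11+: mostly reranker
    printf "%.2f\n", w * ret + (1 - w) * rer
  }'
}
```

For instance, feeding `rrf_fuse` the lines `2 1 a.md`, `2 2 b.md`, `1 1 b.md`, `1 2 c.md` (weight 2 for the original query, weight 1 for one expanded variant) ranks `b.md` first, because it appears in both lists, and `blend_score 1 0.8 0.4` prints `0.70`.
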
**Smart Chunking:** Documents are split at natural break points (headings,
code blocks, blank lines) targeting ~900 tokens with 15% overlap. Code
blocks are never split mid-block.

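The chunking numbers give a quick way to estimate how many chunks a document produces. A rough sketch, assuming fixed 900-token chunks with exactly 15% overlap (135 tokens shared, 765 new tokens per extra chunk); real boundaries follow headings and code blocks, so actual counts will differ. The function name `estimate_chunks` is hypothetical.

```bash
# Rough estimate only: ignores natural break points.
# Argument: <token count of the document>
estimate_chunks() {
  local tokens="$1" size=900 overlap_pct=15
  local overlap=$(( size * overlap_pct / 100 ))  # 135 tokens shared with the previous chunk
  local stride=$(( size - overlap ))             # 765 new tokens per additional chunk
  if [ "$tokens" -le "$size" ]; then
    echo 1
  else
    # One full chunk, then ceil((tokens - size) / stride) more.
    echo $(( 1 + (tokens - size + stride - 1) / stride ))
  fi
}
```

Under these assumptions, `estimate_chunks 3000` prints `4`, which gives a feel for how much work `qmd embed` has to do on a large collection.
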
## Best Practices

1. **Always add context descriptions:** `qmd context add` dramatically
   improves retrieval accuracy. Describe what each collection contains.
2. **Re-embed after adding documents:** `qmd embed` must be re-run when
   new files are added to collections.
3. **Use `qmd search` for speed:** when you need fast keyword lookup
   (code identifiers, exact names), BM25 is instant and needs no models.
4. **Use `qmd query` for quality:** when the question is conceptual or
   the user needs the best possible results, use hybrid search.
5. **Prefer MCP integration:** once configured, the agent gets native
   tools without needing to load this skill each time.
6. **Daemon mode for frequent users:** if the user searches their
   knowledge base regularly, recommend the HTTP daemon setup.
7. **First query in structured search gets 2x weight:** put the most
   important/certain query first when combining lex and vec.

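Practice 2 is easy to forget, so it can be automated from a cron job or shell hook. The sketch below re-runs `qmd embed` only when some file under a directory is newer than a stamp file; the function name `embed_if_stale` and the stamp-file convention are hypothetical, and the check does not notice deleted files.

```bash
# Hypothetical helper: re-embed only when files changed since the last run.
# Arguments: <directory> <stamp file>
embed_if_stale() {
  local dir="$1" stamp="$2"
  # If the stamp exists and no file is newer than it, there is nothing to do.
  if [ -e "$stamp" ] &&
     [ -z "$(find "$dir" -type f -newer "$stamp" -print 2>/dev/null | head -n 1)" ]; then
    echo "up to date"
    return 0
  fi
  # Record the time of this run only if embedding succeeded.
  qmd embed && touch "$stamp"
}
```

A call such as `embed_if_stale ~/notes ~/.cache/qmd-notes.stamp` (the stamp path here is only an example) keeps the index current without paying the embedding cost on every run.
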
## Troubleshooting

### "Models downloading on first run"
This is normal: qmd auto-downloads ~2GB of GGUF models on first use.
This is a one-time operation.

### Cold start latency (~19s)
This happens when models aren't loaded in memory. Solutions:
- Use HTTP daemon mode (`qmd mcp --http --daemon`) to keep models warm
- Use `qmd search` (BM25 only) when models aren't needed
- MCP stdio mode loads models on the first search and keeps them warm for the session

### macOS: "unable to load extension"
Install Homebrew SQLite: `brew install sqlite`
Then ensure it's on PATH before system SQLite.

### "No collections found"
Run `qmd collection add <path> --name <name>` to add directories,
then `qmd embed` to index them.

### Embedding model override (CJK/multilingual)
Set the `QMD_EMBED_MODEL` environment variable for non-English content:
```bash
export QMD_EMBED_MODEL="your-multilingual-model"
```

## Data Storage

- **Index & vectors:** `~/.cache/qmd/index.sqlite`
- **Models:** Auto-downloaded to local cache on first run
- **No cloud dependencies:** everything runs locally

## References

- [GitHub: tobi/qmd](https://github.com/tobi/qmd)
- [QMD Changelog](https://github.com/tobi/qmd/blob/main/CHANGELOG.md)