hermes-agent/skills/mlops/faiss/SKILL.md
teknium1 14e59706b7 Add Skills Hub — universal skill search, install, and management from online registries
Implements the Hermes Skills Hub with agentskills.io spec compliance,
multi-registry skill discovery, security scanning, and user-driven
management via CLI and /skills slash command.

Core features:
- Security scanner (tools/skills_guard.py): 120 threat patterns across
  12 categories, trust-aware install policy (builtin/trusted/community),
  structural checks, unicode injection detection, LLM audit pass
- Hub client (tools/skills_hub.py): GitHub, ClawHub, Claude Code
  marketplace, and LobeHub source adapters with shared GitHubAuth
  (PAT + gh CLI + GitHub App), lock file provenance tracking, quarantine
  flow, and unified search across all sources
- CLI interface (hermes_cli/skills_hub.py): search, install, inspect,
  list, audit, uninstall, publish (GitHub PR), snapshot export/import,
  and tap management — powers both `hermes skills` and `/skills`

Spec conformance (Phase 0):
- Upgraded frontmatter parser to yaml.safe_load with fallback
- Migrated 39 SKILL.md files: tags/related_skills to metadata.hermes.*
- Added assets/ directory support and compatibility/metadata fields
- Excluded .hub/ from skill discovery in skills_tool.py

Updated 13 config/doc files including README, AGENTS.md, .env.example,
setup wizard, doctor, status, pyproject.toml, and docs.
2026-02-18 16:09:05 -08:00

5 KiB
Raw Blame History

name description version author license dependencies metadata
faiss Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). Use for fast k-NN search, large-scale vector retrieval, or when you need pure similarity search without metadata. Best for high-performance applications. 1.0.0 Orchestra Research MIT
faiss-cpu
faiss-gpu
numpy
hermes
tags
RAG
FAISS
Similarity Search
Vector Search
Facebook AI
GPU Acceleration
Billion-Scale
K-NN
HNSW
High Performance
Large Scale

FAISS - Efficient Similarity Search

Facebook AI's library for billion-scale vector similarity search.

When to use FAISS

Use FAISS when:

  • Need fast similarity search on large vector datasets (millions/billions)
  • GPU acceleration required
  • Pure vector similarity (no metadata filtering needed)
  • High throughput, low latency critical
  • Offline/batch processing of embeddings

Metrics:

  • 31,700+ GitHub stars
  • Meta/Facebook AI Research
  • Handles billions of vectors
  • C++ with Python bindings

Use alternatives instead:

  • Chroma/Pinecone: Need metadata filtering
  • Weaviate: Need full database features
  • Annoy: Simpler, fewer features

Quick start

Installation

# CPU only
pip install faiss-cpu

# GPU support
pip install faiss-gpu

Basic usage

import faiss
import numpy as np

# Create sample data (1000 vectors, 128 dimensions)
d = 128
nb = 1000
vectors = np.random.random((nb, d)).astype('float32')

# Create index
index = faiss.IndexFlatL2(d)  # L2 distance
index.add(vectors)             # Add vectors

# Search
k = 5  # Find 5 nearest neighbors
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)

print(f"Nearest neighbors: {indices}")
print(f"Distances: {distances}")

Index types

# L2 (Euclidean) distance
index = faiss.IndexFlatL2(d)

# Inner product (cosine similarity if normalized)
index = faiss.IndexFlatIP(d)

# Slowest, most accurate

2. IVF (inverted file) - Fast approximate

# Create quantizer
quantizer = faiss.IndexFlatL2(d)

# IVF index with 100 clusters
nlist = 100
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Train on data
index.train(vectors)

# Add vectors
index.add(vectors)

# Search (nprobe = clusters to search)
index.nprobe = 10
distances, indices = index.search(query, k)

3. HNSW (Hierarchical NSW) - Best quality/speed

# HNSW index
M = 32  # Number of connections per layer
index = faiss.IndexHNSWFlat(d, M)

# No training needed
index.add(vectors)

# Search
distances, indices = index.search(query, k)

4. Product Quantization - Memory efficient

# PQ reduces memory by 16-32×
m = 8   # Number of subquantizers
nbits = 8
index = faiss.IndexPQ(d, m, nbits)

# Train and add
index.train(vectors)
index.add(vectors)

Save and load

# Save index
faiss.write_index(index, "large.index")

# Load index
index = faiss.read_index("large.index")

# Continue using
distances, indices = index.search(query, k)

GPU acceleration

# Single GPU
res = faiss.StandardGpuResources()
index_cpu = faiss.IndexFlatL2(d)
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)  # GPU 0

# Multi-GPU
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)

# 10-100× faster than CPU

LangChain integration

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Create FAISS vector store
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# Save
vectorstore.save_local("faiss_index")

# Load
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True
)

# Search
results = vectorstore.similarity_search("query", k=5)

LlamaIndex integration

from llama_index.vector_stores.faiss import FaissVectorStore
import faiss

# Create FAISS index
d = 1536
faiss_index = faiss.IndexFlatL2(d)

vector_store = FaissVectorStore(faiss_index=faiss_index)

Best practices

  1. Choose right index type - Flat for <10K, IVF for 10K-1M, HNSW for quality
  2. Normalize for cosine - Use IndexFlatIP with normalized vectors
  3. Use GPU for large datasets - 10-100× faster
  4. Save trained indices - Training is expensive
  5. Tune nprobe/ef_search - Balance speed/accuracy
  6. Monitor memory - PQ for large datasets
  7. Batch queries - Better GPU utilization

Performance

Index Type Build Time Search Time Memory Accuracy
Flat Fast Slow High 100%
IVF Medium Fast Medium 95-99%
HNSW Slow Fastest High 99%
PQ Medium Fast Low 90-95%

Resources