mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
refactor: reorganize skills into sub-categories
The skills directory was getting disorganized: mlops alone had 40 skills in a flat list, and 12 categories were singletons with just one skill each.

Code change:
- prompt_builder.py: Support sub-categories in skill scanner. skills/mlops/training/axolotl/SKILL.md now shows as category 'mlops/training' instead of just 'mlops'. Backwards-compatible with existing flat structure.

Split mlops (40 skills) into 7 sub-categories:
- mlops/training (12): accelerate, axolotl, flash-attention, grpo-rl-training, peft, pytorch-fsdp, pytorch-lightning, simpo, slime, torchtitan, trl-fine-tuning, unsloth
- mlops/inference (8): gguf, guidance, instructor, llama-cpp, obliteratus, outlines, tensorrt-llm, vllm
- mlops/models (6): audiocraft, clip, llava, segment-anything, stable-diffusion, whisper
- mlops/vector-databases (4): chroma, faiss, pinecone, qdrant
- mlops/evaluation (5): huggingface-tokenizers, lm-evaluation-harness, nemo-curator, saelens, weights-and-biases
- mlops/cloud (2): lambda-labs, modal
- mlops/research (1): dspy

Merged singleton categories:
- gifs → media (gif-search joins youtube-content)
- music-creation → media (heartmula, songsee)
- diagramming → creative (excalidraw joins ascii-art)
- ocr-and-documents → productivity
- domain → research (domain-intel)
- feeds → research (blogwatcher)
- market-data → research (polymarket)

Fixed misplaced skills:
- mlops/code-review → software-development (not ML-specific)
- mlops/ml-paper-writing → research (academic writing)

Added DESCRIPTION.md files for all new/updated categories.
This commit is contained in:
parent
d6c710706f
commit
732c66b0f3
217 changed files with 39 additions and 4 deletions
361
skills/mlops/vector-databases/pinecone/SKILL.md
Normal file
@ -0,0 +1,361 @@
---
name: pinecone
description: Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or semantic search at scale. Best for serverless, managed infrastructure.
version: 1.0.0
author: Orchestra Research
license: MIT
dependencies: [pinecone-client]
metadata:
  hermes:
    tags: [RAG, Pinecone, Vector Database, Managed Service, Serverless, Hybrid Search, Production, Auto-Scaling, Low Latency, Recommendations]

---

# Pinecone - Managed Vector Database

The vector database for production AI applications.

## When to use Pinecone

**Use when:**
- Need managed, serverless vector database
- Production RAG applications
- Auto-scaling required
- Low latency critical (<100ms)
- Don't want to manage infrastructure
- Need hybrid search (dense + sparse vectors)

**Metrics**:
- Fully managed SaaS
- Auto-scales to billions of vectors
- **p95 latency <100ms**
- 99.9% uptime SLA

**Use alternatives instead**:
- **Chroma**: Self-hosted, open-source
- **FAISS**: Offline, pure similarity search
- **Weaviate**: Self-hosted with more features

## Quick start

### Installation

```bash
# Newer SDK releases are published under the package name "pinecone"
pip install pinecone-client
```

### Basic usage

```python
from pinecone import Pinecone, ServerlessSpec

# Initialize
pc = Pinecone(api_key="your-api-key")

# Create index
pc.create_index(
    name="my-index",
    dimension=1536,  # Must match embedding dimension
    metric="cosine",  # or "euclidean", "dotproduct"
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

# Connect to index
index = pc.Index("my-index")

# Upsert vectors
index.upsert(vectors=[
    {"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"category": "A"}},
    {"id": "vec2", "values": [0.3, 0.4, ...], "metadata": {"category": "B"}}
])

# Query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    include_metadata=True
)

print(results["matches"])
```

## Core operations

### Create index

```python
# Serverless (recommended)
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",  # or "gcp", "azure"
        region="us-east-1"
    )
)

# Pod-based (for consistent performance)
from pinecone import PodSpec

pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east1-gcp",
        pod_type="p1.x1"
    )
)
```

### Upsert vectors

```python
# Single upsert
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # 1536 dimensions
        "metadata": {
            "text": "Document content",
            "category": "tutorial",
            "timestamp": "2025-01-01"
        }
    }
])

# Batch upsert (recommended)
# embeddings and metadatas are parallel lists prepared elsewhere
vectors = [
    {"id": f"vec{i}", "values": embedding, "metadata": metadata}
    for i, (embedding, metadata) in enumerate(zip(embeddings, metadatas))
]

index.upsert(vectors=vectors, batch_size=100)
```
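
When each batch needs its own handling (retries, progress logging, rate limiting), the batching can also be done by hand. The following is a minimal sketch, not part of the Pinecone client: `batched` is a hypothetical helper, and the sample records use short placeholder vectors instead of real embeddings.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Placeholder records; real ones would carry full-length embeddings.
vectors = [{"id": f"vec{i}", "values": [0.0, 0.0, 0.0]} for i in range(250)]
batch_sizes = [len(batch) for batch in batched(vectors, size=100)]
print(batch_sizes)  # [100, 100, 50]

# With a live index, each batch would be sent separately:
# for batch in batched(vectors, size=100):
#     index.upsert(vectors=batch)
```

The last batch is simply shorter than the rest, so no padding or special casing is needed.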

### Query vectors

```python
# Basic query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=10,
    include_metadata=True,
    include_values=False
)

# With metadata filtering
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    filter={"category": {"$eq": "tutorial"}}
)

# Namespace query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    namespace="production",
    include_metadata=True  # needed because the loop below reads metadata
)

# Access results
for match in results["matches"]:
    print(f"ID: {match['id']}")
    print(f"Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")
```

### Metadata filtering

```python
# Exact match
filter = {"category": "tutorial"}

# Comparison
filter = {"price": {"$gte": 100}}  # $gt, $gte, $lt, $lte, $ne

# Logical operators
filter = {
    "$and": [
        {"category": "tutorial"},
        {"difficulty": {"$lte": 3}}
    ]
}  # Also: $or

# In operator
filter = {"tags": {"$in": ["python", "ml"]}}
```

## Namespaces

```python
# Partition data by namespace
index.upsert(
    vectors=[{"id": "vec1", "values": [...]}],
    namespace="user-123"
)

# Query specific namespace
results = index.query(
    vector=[...],
    namespace="user-123",
    top_k=5
)

# List namespaces
stats = index.describe_index_stats()
print(stats['namespaces'])
```

## Hybrid search (dense + sparse)

```python
# Sparse-dense vectors require an index created with metric="dotproduct"

# Upsert with sparse vectors
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # Dense vector
        "sparse_values": {
            "indices": [10, 45, 123],  # Token IDs
            "values": [0.5, 0.3, 0.8]  # TF-IDF scores
        },
        "metadata": {"text": "..."}
    }
])

# Hybrid query
# query() takes no alpha argument: to weight dense vs. sparse, scale the
# vectors client-side first (dense * alpha, sparse * (1 - alpha))
results = index.query(
    vector=[0.1, 0.2, ...],
    sparse_vector={
        "indices": [10, 45],
        "values": [0.5, 0.3]
    },
    top_k=5
)
```
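
A dense/sparse weighting such as `alpha` is usually applied to the query vectors on the client before they are sent. The sketch below shows one way to do that as a convex combination; `hybrid_scale` is a hypothetical helper name, and the sample vectors are shortened for illustration.

```python
from typing import Dict, List, Tuple

SparseVector = Dict[str, list]


def hybrid_scale(
    dense: List[float], sparse: SparseVector, alpha: float
) -> Tuple[List[float], SparseVector]:
    """Weight a dense/sparse pair: alpha=1.0 is dense only, alpha=0.0 is sparse only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


dense, sparse = hybrid_scale(
    [0.2, 0.4],
    {"indices": [10, 45], "values": [0.5, 0.25]},
    alpha=0.5,
)
print(dense)   # [0.1, 0.2]
print(sparse)  # {'indices': [10, 45], 'values': [0.25, 0.125]}

# With a live index, the scaled pair would be passed to the query:
# results = index.query(vector=dense, sparse_vector=sparse, top_k=5)
```

Only the query side is scaled here; stored vectors stay untouched, so `alpha` can be tuned per request without re-indexing.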

## LangChain integration

```python
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings

# Create vector store
vectorstore = PineconeVectorStore.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    index_name="my-index"
)

# Query
results = vectorstore.similarity_search("query", k=5)

# With metadata filter
results = vectorstore.similarity_search(
    "query",
    k=5,
    filter={"category": "tutorial"}
)

# As retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 10})
```

## LlamaIndex integration

```python
from pinecone import Pinecone
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Connect to Pinecone
pc = Pinecone(api_key="your-key")
pinecone_index = pc.Index("my-index")

# Create vector store
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)

# Use in LlamaIndex
from llama_index.core import StorageContext, VectorStoreIndex

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```

## Index management

```python
# List indexes
indexes = pc.list_indexes()

# Describe index
index_info = pc.describe_index("my-index")
print(index_info)

# Get index stats
stats = index.describe_index_stats()
print(f"Total vectors: {stats['total_vector_count']}")
print(f"Namespaces: {stats['namespaces']}")

# Delete index
pc.delete_index("my-index")
```

## Delete vectors

```python
# Delete by ID
index.delete(ids=["vec1", "vec2"])

# Delete by filter
index.delete(filter={"category": "old"})

# Delete all in namespace
index.delete(delete_all=True, namespace="test")

# Delete all vectors in the default namespace
# (the index itself remains; use pc.delete_index() to remove it)
index.delete(delete_all=True)
```

## Best practices

1. **Use serverless** - Auto-scaling, cost-effective
2. **Batch upserts** - More efficient (100-200 per batch)
3. **Add metadata** - Enable filtering
4. **Use namespaces** - Isolate data by user/tenant
5. **Monitor usage** - Check Pinecone dashboard
6. **Optimize filters** - Index frequently filtered fields
7. **Test with free tier** - 1 index, 100K vectors free
8. **Use hybrid search** - Better quality
9. **Set appropriate dimensions** - Match embedding model
10. **Regular backups** - Export important data
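
For the dimension practice above, a cheap client-side check before upserting can catch a mismatched embedding model early. This is a minimal sketch: `find_dimension_mismatches` is a hypothetical helper, and 1536 is simply the dimension used in the examples in this document.

```python
from typing import Dict, List, Sequence


def find_dimension_mismatches(vectors: Sequence[Dict], expected_dim: int) -> List[str]:
    """Return the IDs of records whose "values" length differs from expected_dim."""
    return [v["id"] for v in vectors if len(v["values"]) != expected_dim]


# Placeholder records: one matches the index dimension, one does not.
records = [
    {"id": "ok", "values": [0.0] * 1536},
    {"id": "too-short", "values": [0.0] * 768},
]
bad_ids = find_dimension_mismatches(records, expected_dim=1536)
print(bad_ids)  # ['too-short']
```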

## Performance

| Operation | Latency | Notes |
|-----------|---------|-------|
| Upsert | ~50-100ms | Per batch |
| Query (p50) | ~50ms | Depends on index size |
| Query (p95) | ~100ms | SLA target |
| Metadata filter | ~+10-20ms | Additional overhead |
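
The figures above are indicative, so it is worth measuring p50 and p95 against your own index. The sketch below times repeated calls and computes nearest-rank percentiles; `percentile` and `time_calls` are hypothetical helpers, and the sample latencies are synthetic.

```python
import math
import time
from typing import Callable, List, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def time_calls(fn: Callable[[], object], n: int = 20) -> List[float]:
    """Call fn n times and return each call's wall-clock latency in milliseconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


samples_ms = [float(ms) for ms in range(1, 101)]  # synthetic latencies: 1..100 ms
print(percentile(samples_ms, 50))  # 50.0
print(percentile(samples_ms, 95))  # 95.0

# With a live index (query_vector is a placeholder for a real embedding):
# latencies = time_calls(lambda: index.query(vector=query_vector, top_k=10))
# print(percentile(latencies, 50), percentile(latencies, 95))
```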

## Pricing (as of 2025)

**Serverless**:
- $0.096 per million read units
- $0.06 per million write units
- $0.06 per GB storage/month
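
As a worked example of how these rates combine, the sketch below estimates a monthly bill. The rates are copied from the list above and may be out of date, the workload numbers are made up, and `estimate_monthly_cost` is a hypothetical helper; check the pricing page before relying on the result.

```python
# Rates copied from the list above (USD); treat them as assumptions.
READ_USD_PER_MILLION = 0.096
WRITE_USD_PER_MILLION = 0.06
STORAGE_USD_PER_GB_MONTH = 0.06


def estimate_monthly_cost(read_units: int, write_units: int, storage_gb: float) -> float:
    """Estimate a monthly serverless bill in USD, rounded to cents."""
    reads = read_units / 1_000_000 * READ_USD_PER_MILLION
    writes = write_units / 1_000_000 * WRITE_USD_PER_MILLION
    storage = storage_gb * STORAGE_USD_PER_GB_MONTH
    return round(reads + writes + storage, 2)


# Made-up workload: 10M read units, 2M write units, 5 GB stored.
cost = estimate_monthly_cost(read_units=10_000_000, write_units=2_000_000, storage_gb=5)
print(f"${cost:.2f}")  # $1.38  (0.96 reads + 0.12 writes + 0.30 storage)
```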

**Free tier**:
- 1 serverless index
- 100K vectors (1536 dimensions)
- Great for prototyping

## Resources

- **Website**: https://www.pinecone.io
- **Docs**: https://docs.pinecone.io
- **Console**: https://app.pinecone.io
- **Pricing**: https://www.pinecone.io/pricing
181
skills/mlops/vector-databases/pinecone/references/deployment.md
Normal file
@ -0,0 +1,181 @@
# Pinecone Deployment Guide

Production deployment patterns for Pinecone.

## Serverless vs Pod-based

### Serverless (Recommended)

```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-key")

# Create serverless index
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",  # or "gcp", "azure"
        region="us-east-1"
    )
)
```

**Benefits:**
- Auto-scaling
- Pay per usage
- No infrastructure management
- Cost-effective for variable load

**Use when:**
- Variable traffic
- Cost optimization important
- Don't need consistent latency

### Pod-based

```python
from pinecone import PodSpec

pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east1-gcp",
        pod_type="p1.x1",  # or p1.x2, p1.x4, p1.x8
        pods=2,  # Number of pods
        replicas=2  # High availability
    )
)
```

**Benefits:**
- Consistent performance
- Predictable latency
- Higher throughput
- Dedicated resources

**Use when:**
- Production workloads
- Need consistent p95 latency
- High throughput required

## Hybrid search

### Dense + Sparse vectors

```python
# Sparse-dense vectors require an index created with metric="dotproduct"

# Upsert with both dense and sparse vectors
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # Dense (semantic)
        "sparse_values": {
            "indices": [10, 45, 123],  # Token IDs
            "values": [0.5, 0.3, 0.8]  # TF-IDF/BM25 scores
        },
        "metadata": {"text": "..."}
    }
])

# Hybrid query
# query() takes no alpha argument: to weight dense vs. sparse, scale the
# vectors client-side first (dense * alpha, sparse * (1 - alpha));
# alpha=0 is sparse only, alpha=1 is dense only, 0.5 is balanced
results = index.query(
    vector=[0.1, 0.2, ...],  # Dense query
    sparse_vector={
        "indices": [10, 45],
        "values": [0.5, 0.3]
    },
    top_k=10
)
```

**Benefits:**
- Best of both worlds
- Semantic + keyword matching
- Better recall than either alone

## Namespaces for multi-tenancy

```python
# Separate data by user/tenant
index.upsert(
    vectors=[{"id": "doc1", "values": [...]}],
    namespace="user-123"
)

# Query specific namespace
results = index.query(
    vector=[...],
    namespace="user-123",
    top_k=5
)

# List namespaces
stats = index.describe_index_stats()
print(stats['namespaces'])
```

**Use cases:**
- Multi-tenant SaaS
- User-specific data isolation
- A/B testing or environment separation (e.g. prod/staging namespaces)
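
When records for several tenants arrive in one stream, they can be bucketed by namespace before upserting. The sketch below assumes each incoming record carries a `tenant_id` field and reuses the `user-<id>` naming from the example above; both the field name and `group_by_namespace` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_namespace(
    records: Iterable[Dict], tenant_key: str = "tenant_id"
) -> Dict[str, List[Dict]]:
    """Bucket records into one namespace per tenant, dropping the tenant key from each record."""
    grouped = defaultdict(list)
    for record in records:
        vector = {k: v for k, v in record.items() if k != tenant_key}
        grouped[f"user-{record[tenant_key]}"].append(vector)
    return dict(grouped)


# Placeholder records with shortened vectors.
records = [
    {"id": "doc1", "values": [0.1, 0.2], "tenant_id": "123"},
    {"id": "doc2", "values": [0.3, 0.4], "tenant_id": "456"},
    {"id": "doc3", "values": [0.5, 0.6], "tenant_id": "123"},
]
grouped = group_by_namespace(records)
print(sorted(grouped))                          # ['user-123', 'user-456']
print([v["id"] for v in grouped["user-123"]])   # ['doc1', 'doc3']

# With a live index, each bucket goes to its own namespace:
# for namespace, vectors in grouped.items():
#     index.upsert(vectors=vectors, namespace=namespace)
```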

## Metadata filtering

### Exact match

```python
results = index.query(
    vector=[...],
    filter={"category": "tutorial"},
    top_k=5
)
```

### Range queries

```python
results = index.query(
    vector=[...],
    filter={"price": {"$gte": 100, "$lte": 500}},
    top_k=5
)
```

### Complex filters

```python
results = index.query(
    vector=[...],
    filter={
        "$and": [
            {"category": {"$in": ["tutorial", "guide"]}},
            {"difficulty": {"$lte": 3}},
            # Range operators compare numbers, so store dates as
            # Unix timestamps (1704067200 = 2024-01-01 UTC)
            {"published_ts": {"$gte": 1704067200}}
        ]
    },
    top_k=5
)
```
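
Range operators such as `$gte` compare numbers, so date bounds are typically stored and filtered as Unix timestamps. The sketch below converts ISO dates into such a filter; the `published_ts` field name and both helper functions are hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def to_epoch_seconds(iso_date: str) -> int:
    """Convert a YYYY-MM-DD date to Unix seconds at midnight UTC."""
    dt = datetime.strptime(iso_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def published_between(
    start: str, end: Optional[str] = None, field: str = "published_ts"
) -> Dict:
    """Build a numeric range filter: start inclusive, end exclusive."""
    bounds = {"$gte": to_epoch_seconds(start)}
    if end is not None:
        bounds["$lt"] = to_epoch_seconds(end)
    return {field: bounds}


date_filter = published_between("2024-01-01", "2025-01-01")
print(date_filter)
# {'published_ts': {'$gte': 1704067200, '$lt': 1735689600}}
```

The same numeric field has to be written at upsert time, since the filter only matches values stored as numbers.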

## Best practices

1. **Use serverless for development** - Cost-effective
2. **Switch to pods for production** - Consistent performance
3. **Implement namespaces** - Multi-tenancy
4. **Add metadata strategically** - Enable filtering
5. **Use hybrid search** - Better quality
6. **Batch upserts** - 100-200 vectors per batch
7. **Monitor usage** - Check Pinecone dashboard
8. **Set up alerts** - Usage/cost thresholds
9. **Regular backups** - Export important data
10. **Test filters** - Verify performance

## Resources

- **Docs**: https://docs.pinecone.io
- **Console**: https://app.pinecone.io