Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-04-26)
---
name: pinecone
description: Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or semantic search at scale. Best for serverless, managed infrastructure.
version: 1.0.0
author: Orchestra Research
license: MIT
dependencies: [pinecone-client]
metadata:
  hermes:
    tags: [RAG, Pinecone, Vector Database, Managed Service, Serverless, Hybrid Search, Production, Auto-Scaling, Low Latency, Recommendations]
---

# Pinecone - Managed Vector Database

The vector database for production AI applications.

## When to use Pinecone

**Use when:**
- Need a managed, serverless vector database
- Production RAG applications
- Auto-scaling required
- Low latency critical (<100ms)
- Don't want to manage infrastructure
- Need hybrid search (dense + sparse vectors)

**Metrics**:
- Fully managed SaaS
- Auto-scales to billions of vectors
- **p95 latency <100ms**
- 99.9% uptime SLA

**Use alternatives instead**:
- **Chroma**: Self-hosted, open-source
- **FAISS**: Offline, pure similarity search
- **Weaviate**: Self-hosted with more features

## Quick start

### Installation

```bash
pip install pinecone-client  # newer SDK releases are published under the renamed package "pinecone"
```

### Basic usage

```python
from pinecone import Pinecone, ServerlessSpec

# Initialize
pc = Pinecone(api_key="your-api-key")

# Create index
pc.create_index(
    name="my-index",
    dimension=1536,   # Must match embedding dimension
    metric="cosine",  # or "euclidean", "dotproduct"
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

# Connect to index
index = pc.Index("my-index")

# Upsert vectors
index.upsert(vectors=[
    {"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"category": "A"}},
    {"id": "vec2", "values": [0.3, 0.4, ...], "metadata": {"category": "B"}}
])

# Query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    include_metadata=True
)

print(results["matches"])
```

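The `[0.1, 0.2, ...]` placeholders above stand in for real embeddings. As a small pre-flight check before upserting (a hypothetical helper for illustration, not part of the Pinecone client), you can verify that every payload matches the index dimension and that ids are unique:

```python
def validate_vectors(vectors, dimension):
    """Check upsert payloads: each vector has the right dimension, ids are unique.

    Hypothetical helper; the Pinecone client does not require it.
    """
    seen = set()
    for v in vectors:
        if len(v["values"]) != dimension:
            raise ValueError(
                f"{v['id']}: expected {dimension} dims, got {len(v['values'])}"
            )
        if v["id"] in seen:
            raise ValueError(f"duplicate id: {v['id']}")
        seen.add(v["id"])
    return vectors
```

Running this on the list right before `index.upsert(...)` makes dimension mismatches fail fast locally instead of at the API.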
## Core operations

### Create index

```python
# Serverless (recommended)
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",  # or "gcp", "azure"
        region="us-east-1"
    )
)

# Pod-based (for consistent performance)
from pinecone import PodSpec

pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east1-gcp",
        pod_type="p1.x1"
    )
)
```

### Upsert vectors

```python
# Single upsert
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # 1536 dimensions
        "metadata": {
            "text": "Document content",
            "category": "tutorial",
            "timestamp": "2025-01-01"
        }
    }
])

# Batch upsert (recommended)
vectors = [
    {"id": f"vec{i}", "values": embedding, "metadata": metadata}
    for i, (embedding, metadata) in enumerate(zip(embeddings, metadatas))
]

index.upsert(vectors=vectors, batch_size=100)
```

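The `batch_size` argument chunks the upsert for you. As a sketch of the underlying pattern (useful when you want per-batch control, e.g. progress logging), a hypothetical helper:

```python
def chunked(vectors, batch_size=100):
    """Yield successive slices of at most batch_size vectors.

    Illustrative sketch of what upsert's batch_size option does internally.
    """
    for start in range(0, len(vectors), batch_size):
        yield vectors[start:start + batch_size]
```

With it, `for batch in chunked(vectors): index.upsert(vectors=batch)` is equivalent to a single batched upsert, but each iteration is a separate API call you can observe or retry.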
### Query vectors

```python
# Basic query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=10,
    include_metadata=True,
    include_values=False
)

# With metadata filtering
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    filter={"category": {"$eq": "tutorial"}}
)

# Namespace query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    namespace="production"
)

# Access results
for match in results["matches"]:
    print(f"ID: {match['id']}")
    print(f"Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")
```

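Query responses often need light post-processing before use. A hypothetical helper (name and shape are illustrative, matching the dict-style matches above) that keeps only matches above a score threshold, ordered best-first:

```python
def top_matches(matches, min_score=0.0):
    """Filter matches by score and sort descending (for cosine, higher is closer)."""
    hits = [m for m in matches if m["score"] >= min_score]
    return sorted(hits, key=lambda m: m["score"], reverse=True)
```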
### Metadata filtering

```python
# Exact match
filter = {"category": "tutorial"}

# Comparison ($gt, $gte, $lt, $lte, $ne)
filter = {"price": {"$gte": 100}}

# Logical operators ($and, $or)
filter = {
    "$and": [
        {"category": "tutorial"},
        {"difficulty": {"$lte": 3}}
    ]
}

# In operator
filter = {"tags": {"$in": ["python", "ml"]}}
```

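When filters are assembled from optional inputs (say, UI facets), hand-building the `$and` wrapper gets error-prone. A hypothetical composer that drops empty conditions and only wraps when there is more than one:

```python
def and_filter(*conditions):
    """Combine metadata conditions with $and; skip falsy ones, unwrap singletons."""
    conds = [c for c in conditions if c]
    if not conds:
        return None  # pass filter=None to query for no filtering
    if len(conds) == 1:
        return conds[0]
    return {"$and": conds}
```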
## Namespaces

```python
# Partition data by namespace
index.upsert(
    vectors=[{"id": "vec1", "values": [...]}],
    namespace="user-123"
)

# Query a specific namespace
results = index.query(
    vector=[...],
    namespace="user-123",
    top_k=5
)

# List namespaces
stats = index.describe_index_stats()
print(stats['namespaces'])
```

## Hybrid search (dense + sparse)

```python
# Upsert with sparse vectors
# Note: single-index hybrid search requires metric="dotproduct"
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],     # Dense vector
        "sparse_values": {
            "indices": [10, 45, 123],  # Token IDs
            "values": [0.5, 0.3, 0.8]  # TF-IDF scores
        },
        "metadata": {"text": "..."}
    }
])

# Hybrid query. The query API has no weighting parameter; to weight dense
# vs. sparse relevance, scale the two vectors client-side before querying.
results = index.query(
    vector=[0.1, 0.2, ...],
    sparse_vector={
        "indices": [10, 45],
        "values": [0.5, 0.3]
    },
    top_k=5
)
```

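The usual way to weight dense against sparse relevance, following the convex-combination pattern in Pinecone's hybrid-search docs (the helper name below is conventional, not part of the client), is to scale both vectors before querying:

```python
def hybrid_score_norm(dense, sparse, alpha):
    """Scale dense values by alpha and sparse values by (1 - alpha).

    alpha=1.0 -> pure dense, alpha=0.0 -> pure sparse, 0.5 -> equal weight.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse
```

Pass the scaled results as `vector=` and `sparse_vector=` to `index.query`.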
## LangChain integration

```python
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings

# Create vector store
vectorstore = PineconeVectorStore.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    index_name="my-index"
)

# Query
results = vectorstore.similarity_search("query", k=5)

# With metadata filter
results = vectorstore.similarity_search(
    "query",
    k=5,
    filter={"category": "tutorial"}
)

# As retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 10})
```

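`from_documents` assumes `docs` are already split into embedding-sized pieces; LangChain's text splitters (e.g. `RecursiveCharacterTextSplitter`) do this properly. As a minimal sketch of just the sliding-window idea, not a LangChain API:

```python
def chunk_text(text, size=800, overlap=100):
    """Naive character-window chunker with overlap (illustrative only)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[start:start + size]
            for start in range(0, max(len(text) - overlap, 1), step)]
```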
## LlamaIndex integration

```python
from pinecone import Pinecone
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Connect to Pinecone
pc = Pinecone(api_key="your-key")
pinecone_index = pc.Index("my-index")

# Create vector store
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)

# Use in LlamaIndex
from llama_index.core import StorageContext, VectorStoreIndex

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
```

## Index management

```python
# List indexes
indexes = pc.list_indexes()

# Describe index
index_info = pc.describe_index("my-index")
print(index_info)

# Get index stats
stats = index.describe_index_stats()
print(f"Total vectors: {stats['total_vector_count']}")
print(f"Namespaces: {stats['namespaces']}")

# Delete index
pc.delete_index("my-index")
```

## Delete vectors

```python
# Delete by ID
index.delete(ids=["vec1", "vec2"])

# Delete by metadata filter (pod-based indexes; not supported on serverless)
index.delete(filter={"category": "old"})

# Delete all vectors in a namespace
index.delete(delete_all=True, namespace="test")

# Delete all vectors in the default namespace
# (to remove the index itself, use pc.delete_index)
index.delete(delete_all=True)
```

## Best practices

1. **Use serverless** - Auto-scaling, cost-effective
2. **Batch upserts** - More efficient (100-200 vectors per batch)
3. **Add metadata** - Enables filtering at query time
4. **Use namespaces** - Isolate data by user/tenant
5. **Monitor usage** - Check the Pinecone dashboard
6. **Optimize filters** - Index frequently filtered fields
7. **Test with the free tier** - 1 index, 100K vectors free
8. **Use hybrid search** - Improves quality for keyword-heavy queries
9. **Set appropriate dimensions** - Must match the embedding model
10. **Regular backups** - Export important data

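Batch upserts occasionally hit transient failures (rate limits, timeouts). A hedged retry sketch with exponential backoff and jitter (names are hypothetical; the broad `except` is deliberate for illustration):

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=0.5):
    """Call fn, retrying on any exception with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Usage: `with_backoff(lambda: index.upsert(vectors=batch))`. In production you would narrow the exception type to the client's retryable errors.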
## Performance

| Operation | Latency | Notes |
|-----------|---------|-------|
| Upsert | ~50-100ms | Per batch |
| Query (p50) | ~50ms | Depends on index size |
| Query (p95) | ~100ms | SLA target |
| Metadata filter | +10-20ms | Additional overhead |

## Pricing (as of 2025)

**Serverless**:
- $0.096 per million read units
- $0.06 per million write units
- $0.06 per GB storage/month

**Free tier**:
- 1 serverless index
- 100K vectors (1536 dimensions)
- Great for prototyping

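Using the list prices above, a rough back-of-envelope estimator (the function and its defaults are illustrative; prices are parameters so they can be updated):

```python
def monthly_cost(read_units_m, write_units_m, storage_gb,
                 read_price=0.096, write_price=0.06, storage_price=0.06):
    """Rough monthly USD estimate from millions of read/write units and GB stored."""
    return (read_units_m * read_price
            + write_units_m * write_price
            + storage_gb * storage_price)
```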
## Resources

- **Website**: https://www.pinecone.io
- **Docs**: https://docs.pinecone.io
- **Console**: https://app.pinecone.io
- **Pricing**: https://www.pinecone.io/pricing