refactor: reorganize skills into sub-categories

The skills directory was getting disorganized: mlops alone had 40 skills in a flat list, and 12 categories were singletons with just one skill each.

Code change:
- prompt_builder.py: Support sub-categories in skill scanner. skills/mlops/training/axolotl/SKILL.md now shows as category 'mlops/training' instead of just 'mlops'. Backwards-compatible with existing flat structure.

Split mlops (40 skills) into 7 sub-categories:
- mlops/training (12): accelerate, axolotl, flash-attention, grpo-rl-training, peft, pytorch-fsdp, pytorch-lightning, simpo, slime, torchtitan, trl-fine-tuning, unsloth
- mlops/inference (8): gguf, guidance, instructor, llama-cpp, obliteratus, outlines, tensorrt-llm, vllm
- mlops/models (6): audiocraft, clip, llava, segment-anything, stable-diffusion, whisper
- mlops/vector-databases (4): chroma, faiss, pinecone, qdrant
- mlops/evaluation (5): huggingface-tokenizers, lm-evaluation-harness, nemo-curator, saelens, weights-and-biases
- mlops/cloud (2): lambda-labs, modal
- mlops/research (1): dspy

Merged singleton categories:
- gifs → media (gif-search joins youtube-content)
- music-creation → media (heartmula, songsee)
- diagramming → creative (excalidraw joins ascii-art)
- ocr-and-documents → productivity
- domain → research (domain-intel)
- feeds → research (blogwatcher)
- market-data → research (polymarket)

Fixed misplaced skills:
- mlops/code-review → software-development (not ML-specific)
- mlops/ml-paper-writing → research (academic writing)

Added DESCRIPTION.md files for all new/updated categories.
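
The sub-category rule described for prompt_builder.py can be illustrated with a small sketch. This is not the actual scanner code: the function name `skill_category`, the `skills_root` default, and the empty-string result for a skill sitting directly under the root are all assumptions made for illustration. The idea is that every directory between the skills root and the skill's own directory becomes part of the category, so the old flat layout keeps working unchanged.

```python
from pathlib import PurePosixPath


def skill_category(skill_md_path: str, skills_root: str = "skills") -> str:
    """Derive a category such as 'mlops/training' from a SKILL.md path.

    Hypothetical helper: the real prompt_builder.py may differ.
    """
    parts = PurePosixPath(skill_md_path).relative_to(skills_root).parts
    # parts looks like (<category dirs...>, <skill dir>, "SKILL.md"),
    # so dropping the last two entries leaves only the category dirs.
    return "/".join(parts[:-2])


if __name__ == "__main__":
    # Nested layout introduced by this commit.
    print(skill_category("skills/mlops/training/axolotl/SKILL.md"))
    # Flat layout that existed before, still resolved the same way.
    print(skill_category("skills/media/gif-search/SKILL.md"))
```

Treating the parent of SKILL.md as the skill directory means no list of known categories is needed; any depth of nesting resolves the same way.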
2026-04-25 00:51:20 +00:00 · 2026-03-09 03:35:53 -07:00 · 2026-03-09 03:35:53 -07:00 · 732c66b0f3
commit 732c66b0f3
parent d6c710706f
217 changed files with 39 additions and 4 deletions
--- a/skills/mlops/vector-databases/pinecone/SKILL.md
+++ b/skills/mlops/vector-databases/pinecone/SKILL.md
@@ -0,0 +1,361 @@
+---
+name: pinecone
+description: Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or semantic search at scale. Best for serverless, managed infrastructure.
+version: 1.0.0
+author: Orchestra Research
+license: MIT
+dependencies: [pinecone-client]
+metadata:
+  hermes:
+    tags: [RAG, Pinecone, Vector Database, Managed Service, Serverless, Hybrid Search, Production, Auto-Scaling, Low Latency, Recommendations]
+
+---
+
+# Pinecone - Managed Vector Database
+
+The vector database for production AI applications.
+
+## When to use Pinecone
+
+**Use when:**
+- Need managed, serverless vector database
+- Production RAG applications
+- Auto-scaling required
+- Low latency critical (<100ms)
+- Don't want to manage infrastructure
+- Need hybrid search (dense + sparse vectors)
+
+**Metrics**:
+- Fully managed SaaS
+- Auto-scales to billions of vectors
+- **p95 latency <100ms**
+- 99.9% uptime SLA
+
+**Use alternatives instead**:
+- **Chroma**: Self-hosted, open-source
+- **FAISS**: Offline, pure similarity search
+- **Weaviate**: Self-hosted with more features
+
+## Quick start
+
+### Installation
+
+```bash
+pip install pinecone  # formerly published as pinecone-client
+```
+
+### Basic usage
+
+```python
+from pinecone import Pinecone, ServerlessSpec
+
+# Initialize
+pc = Pinecone(api_key="your-api-key")
+
+# Create index
+pc.create_index(
+    name="my-index",
+    dimension=1536,  # Must match embedding dimension
+    metric="cosine",  # or "euclidean", "dotproduct"
+    spec=ServerlessSpec(cloud="aws", region="us-east-1")
+)
+
+# Connect to index
+index = pc.Index("my-index")
+
+# Upsert vectors
+index.upsert(vectors=[
+    {"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"category": "A"}},
+    {"id": "vec2", "values": [0.3, 0.4, ...], "metadata": {"category": "B"}}
+])
+
+# Query
+results = index.query(
+    vector=[0.1, 0.2, ...],
+    top_k=5,
+    include_metadata=True
+)
+
+print(results["matches"])
+```
+
+## Core operations
+
+### Create index
+
+```python
+# Serverless (recommended)
+pc.create_index(
+    name="my-index",
+    dimension=1536,
+    metric="cosine",
+    spec=ServerlessSpec(
+        cloud="aws",         # or "gcp", "azure"
+        region="us-east-1"
+    )
+)
+
+# Pod-based (for consistent performance)
+from pinecone import PodSpec
+
+pc.create_index(
+    name="my-index",
+    dimension=1536,
+    metric="cosine",
+    spec=PodSpec(
+        environment="us-east1-gcp",
+        pod_type="p1.x1"
+    )
+)
+```
+
+### Upsert vectors
+
+```python
+# Single upsert
+index.upsert(vectors=[
+    {
+        "id": "doc1",
+        "values": [0.1, 0.2, ...],  # 1536 dimensions
+        "metadata": {
+            "text": "Document content",
+            "category": "tutorial",
+            "timestamp": "2025-01-01"
+        }
+    }
+])
+
+# Batch upsert (recommended)
+vectors = [
+    {"id": f"vec{i}", "values": embedding, "metadata": metadata}
+    for i, (embedding, metadata) in enumerate(zip(embeddings, metadatas))
+]
+
+index.upsert(vectors=vectors, batch_size=100)
+```
+
+### Query vectors
+
+```python
+# Basic query
+results = index.query(
+    vector=[0.1, 0.2, ...],
+    top_k=10,
+    include_metadata=True,
+    include_values=False
+)
+
+# With metadata filtering
+results = index.query(
+    vector=[0.1, 0.2, ...],
+    top_k=5,
+    filter={"category": {"$eq": "tutorial"}}
+)
+
+# Namespace query
+results = index.query(
+    vector=[0.1, 0.2, ...],
+    top_k=5,
+    namespace="production"
+)
+
+# Access results
+for match in results["matches"]:
+    print(f"ID: {match['id']}")
+    print(f"Score: {match['score']}")
+    print(f"Metadata: {match['metadata']}")
+```
+
+### Metadata filtering
+
+```python
+# Exact match
+filter = {"category": "tutorial"}
+
+# Comparison
+filter = {"price": {"$gte": 100}}  # $gt, $gte, $lt, $lte, $ne
+
+# Logical operators
+filter = {
+    "$and": [
+        {"category": "tutorial"},
+        {"difficulty": {"$lte": 3}}
+    ]
+}  # Also: $or
+
+# In operator
+filter = {"tags": {"$in": ["python", "ml"]}}
+```
+
+## Namespaces
+
+```python
+# Partition data by namespace
+index.upsert(
+    vectors=[{"id": "vec1", "values": [...]}],
+    namespace="user-123"
+)
+
+# Query specific namespace
+results = index.query(
+    vector=[...],
+    namespace="user-123",
+    top_k=5
+)
+
+# List namespaces
+stats = index.describe_index_stats()
+print(stats['namespaces'])
+```
+
+## Hybrid search (dense + sparse)
+
+```python
+# Upsert with sparse vectors
+index.upsert(vectors=[
+    {
+        "id": "doc1",
+        "values": [0.1, 0.2, ...],  # Dense vector
+        "sparse_values": {
+            "indices": [10, 45, 123],  # Token IDs
+            "values": [0.5, 0.3, 0.8]   # TF-IDF scores
+        },
+        "metadata": {"text": "..."}
+    }
+])
+
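+# Hybrid query (the index must use metric="dotproduct").
+# query() takes no alpha argument, so weight the vectors client-side.
+alpha = 0.5  # 0=sparse only, 1=dense only, 0.5=balanced
+dense = [0.1, 0.2, ...]
+sparse = {"indices": [10, 45], "values": [0.5, 0.3]}
+
+results = index.query(
+    vector=[v * alpha for v in dense],
+    sparse_vector={
+        "indices": sparse["indices"],
+        "values": [v * (1 - alpha) for v in sparse["values"]]
+    },
+    top_k=5
+)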
+```
+
+## LangChain integration
+
+```python
+from langchain_pinecone import PineconeVectorStore
+from langchain_openai import OpenAIEmbeddings
+
+# Create vector store
+vectorstore = PineconeVectorStore.from_documents(
+    documents=docs,
+    embedding=OpenAIEmbeddings(),
+    index_name="my-index"
+)
+
+# Query
+results = vectorstore.similarity_search("query", k=5)
+
+# With metadata filter
+results = vectorstore.similarity_search(
+    "query",
+    k=5,
+    filter={"category": "tutorial"}
+)
+
+# As retriever
+retriever = vectorstore.as_retriever(search_kwargs={"k": 10})
+```
+
+## LlamaIndex integration
+
+```python
+from pinecone import Pinecone
+from llama_index.vector_stores.pinecone import PineconeVectorStore
+
+# Connect to Pinecone
+pc = Pinecone(api_key="your-key")
+pinecone_index = pc.Index("my-index")
+
+# Create vector store
+vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
+
+# Use in LlamaIndex
+from llama_index.core import StorageContext, VectorStoreIndex
+
+storage_context = StorageContext.from_defaults(vector_store=vector_store)
+index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
+```
+
+## Index management
+
+```python
+# List indices
+indexes = pc.list_indexes()
+
+# Describe index
+index_info = pc.describe_index("my-index")
+print(index_info)
+
+# Get index stats
+stats = index.describe_index_stats()
+print(f"Total vectors: {stats['total_vector_count']}")
+print(f"Namespaces: {stats['namespaces']}")
+
+# Delete index
+pc.delete_index("my-index")
+```
+
+## Delete vectors
+
+```python
+# Delete by ID
+index.delete(ids=["vec1", "vec2"])
+
+# Delete by filter
+index.delete(filter={"category": "old"})
+
+# Delete all in namespace
+index.delete(delete_all=True, namespace="test")
+
+# Delete all vectors in the default namespace (use pc.delete_index to drop the index)
+index.delete(delete_all=True)
+```
+
+## Best practices
+
+1. **Use serverless** - Auto-scaling, cost-effective
+2. **Batch upserts** - More efficient (100-200 per batch)
+3. **Add metadata** - Enable filtering
+4. **Use namespaces** - Isolate data by user/tenant
+5. **Monitor usage** - Check Pinecone dashboard
+6. **Optimize filters** - Index frequently filtered fields
+7. **Test with free tier** - 1 index, 100K vectors free
+8. **Use hybrid search** - Better quality
+9. **Set appropriate dimensions** - Match embedding model
+10. **Regular backups** - Export important data
+
+## Performance
+
+| Operation | Latency | Notes |
+|-----------|---------|-------|
+| Upsert | ~50-100ms | Per batch |
+| Query (p50) | ~50ms | Depends on index size |
+| Query (p95) | ~100ms | SLA target |
+| Metadata filter | +10-20ms (approx.) | Additional overhead |
+
+## Pricing (as of 2025)
+
+**Serverless**:
+- $0.096 per million read units
+- $0.06 per million write units
+- $0.06 per GB storage/month
+
+**Free tier**:
+- 1 serverless index
+- 100K vectors (1536 dimensions)
+- Great for prototyping
+
+## Resources
+
+- **Website**: https://www.pinecone.io
+- **Docs**: https://docs.pinecone.io
+- **Console**: https://app.pinecone.io
+- **Pricing**: https://www.pinecone.io/pricing
+
+
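
As a rough illustration of the serverless prices listed in SKILL.md above, the sketch below turns monthly usage into a dollar estimate. The function names are hypothetical, the rates are copied from that list (as of 2025, so they may have changed), and the storage helper assumes 4-byte float32 values with no metadata or index overhead, which understates real usage. How many read or write units a given request consumes is not covered here; the unit counts are taken as inputs.

```python
# Rates copied from the "Pricing (as of 2025)" list; check the pricing page for current values.
READ_USD_PER_MILLION_UNITS = 0.096
WRITE_USD_PER_MILLION_UNITS = 0.06
STORAGE_USD_PER_GB_MONTH = 0.06


def raw_vector_gb(vector_count: int, dimension: int, bytes_per_value: int = 4) -> float:
    """Size of the raw vector values in GB (assumes float32; ignores metadata and overhead)."""
    return vector_count * dimension * bytes_per_value / 1e9


def estimate_monthly_cost(read_units: float, write_units: float, storage_gb: float) -> float:
    """Estimated monthly serverless cost in USD for the given usage."""
    read_cost = read_units / 1_000_000 * READ_USD_PER_MILLION_UNITS
    write_cost = write_units / 1_000_000 * WRITE_USD_PER_MILLION_UNITS
    storage_cost = storage_gb * STORAGE_USD_PER_GB_MONTH
    return round(read_cost + write_cost + storage_cost, 4)


if __name__ == "__main__":
    # 100K vectors at 1536 dimensions, the size quoted for the free tier.
    storage = raw_vector_gb(100_000, 1536)
    print(f"raw vector storage: {storage:.4f} GB")
    print(f"estimated cost: ${estimate_monthly_cost(10_000_000, 5_000_000, storage)}")
```

At this scale the storage term is small next to reads and writes, which is why the estimate is driven mostly by query volume.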
--- a/skills/mlops/vector-databases/pinecone/references/deployment.md
+++ b/skills/mlops/vector-databases/pinecone/references/deployment.md
@@ -0,0 +1,181 @@
+# Pinecone Deployment Guide
+
+Production deployment patterns for Pinecone.
+
+## Serverless vs Pod-based
+
+### Serverless (Recommended)
+
+```python
+from pinecone import Pinecone, ServerlessSpec
+
+pc = Pinecone(api_key="your-key")
+
+# Create serverless index
+pc.create_index(
+    name="my-index",
+    dimension=1536,
+    metric="cosine",
+    spec=ServerlessSpec(
+        cloud="aws",  # or "gcp", "azure"
+        region="us-east-1"
+    )
+)
+```
+
+**Benefits:**
+- Auto-scaling
+- Pay per usage
+- No infrastructure management
+- Cost-effective for variable load
+
+**Use when:**
+- Variable traffic
+- Cost optimization important
+- Don't need consistent latency
+
+### Pod-based
+
+```python
+from pinecone import PodSpec
+
+pc.create_index(
+    name="my-index",
+    dimension=1536,
+    metric="cosine",
+    spec=PodSpec(
+        environment="us-east1-gcp",
+        pod_type="p1.x1",  # or p1.x2, p1.x4, p1.x8
+        pods=2,  # Number of pods
+        replicas=2  # High availability
+    )
+)
+```
+
+**Benefits:**
+- Consistent performance
+- Predictable latency
+- Higher throughput
+- Dedicated resources
+
+**Use when:**
+- Production workloads
+- Need consistent p95 latency
+- High throughput required
+
+## Hybrid search
+
+### Dense + Sparse vectors
+
+```python
+# Upsert with both dense and sparse vectors
+index.upsert(vectors=[
+    {
+        "id": "doc1",
+        "values": [0.1, 0.2, ...],  # Dense (semantic)
+        "sparse_values": {
+            "indices": [10, 45, 123],  # Token IDs
+            "values": [0.5, 0.3, 0.8]   # TF-IDF/BM25 scores
+        },
+        "metadata": {"text": "..."}
+    }
+])
+
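+# Hybrid query (the index must use metric="dotproduct").
+# query() takes no alpha argument, so weight the vectors client-side.
+alpha = 0.5  # 0=sparse only, 1=dense only, 0.5=balanced
+dense = [0.1, 0.2, ...]  # Dense query
+sparse = {"indices": [10, 45], "values": [0.5, 0.3]}
+
+results = index.query(
+    vector=[v * alpha for v in dense],
+    sparse_vector={
+        "indices": sparse["indices"],
+        "values": [v * (1 - alpha) for v in sparse["values"]]
+    },
+    top_k=10
+)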
+```
+
+**Benefits:**
+- Best of both worlds
+- Semantic + keyword matching
+- Better recall than either alone
+
+## Namespaces for multi-tenancy
+
+```python
+# Separate data by user/tenant
+index.upsert(
+    vectors=[{"id": "doc1", "values": [...]}],
+    namespace="user-123"
+)
+
+# Query specific namespace
+results = index.query(
+    vector=[...],
+    namespace="user-123",
+    top_k=5
+)
+
+# List namespaces
+stats = index.describe_index_stats()
+print(stats['namespaces'])
+```
+
+**Use cases:**
+- Multi-tenant SaaS
+- User-specific data isolation
+- A/B testing (prod/staging namespaces)
+
+## Metadata filtering
+
+### Exact match
+
+```python
+results = index.query(
+    vector=[...],
+    filter={"category": "tutorial"},
+    top_k=5
+)
+```
+
+### Range queries
+
+```python
+results = index.query(
+    vector=[...],
+    filter={"price": {"$gte": 100, "$lte": 500}},
+    top_k=5
+)
+```
+
+### Complex filters
+
+```python
+results = index.query(
+    vector=[...],
+    filter={
+        "$and": [
+            {"category": {"$in": ["tutorial", "guide"]}},
+            {"difficulty": {"$lte": 3}},
+            {"published": {"$gte": 1704067200}}  # Unix time for 2024-01-01 UTC; range operators compare numbers only
+        ]
+    },
+    top_k=5
+)
+```
+
+## Best practices
+
+1. **Use serverless for development** - Cost-effective
+2. **Switch to pods for production** - Consistent performance
+3. **Implement namespaces** - Multi-tenancy
+4. **Add metadata strategically** - Enable filtering
+5. **Use hybrid search** - Better quality
+6. **Batch upserts** - 100-200 vectors per batch
+7. **Monitor usage** - Check Pinecone dashboard
+8. **Set up alerts** - Usage/cost thresholds
+9. **Regular backups** - Export important data
+10. **Test filters** - Verify performance
+
+## Resources
+
+- **Docs**: https://docs.pinecone.io
+- **Console**: https://app.pinecone.io
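
Best practice 6 in deployment.md above suggests 100-200 vectors per batch. The sketch below shows one way to split any iterable of vector records into batches of that size before upserting. `chunked` is a hypothetical helper, not part of the Pinecone client, and the `index` object in the trailing comment is assumed to be an already-connected index as in the earlier examples.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Placeholder records; real ones would carry full-length embeddings.
    records = [{"id": f"vec{i}", "values": [0.0, 0.0]} for i in range(250)]
    for batch in chunked(records, batch_size=100):
        print(len(batch))
        # index.upsert(vectors=batch)  # assumed: an existing Pinecone index
```

Because the helper consumes an iterator lazily, it also works for generators that stream embeddings without holding them all in memory.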