refactor: reorganize skills into sub-categories

The skills directory was getting disorganized — mlops alone had 40 skills in a flat list, and 12 categories were singletons with just one skill each. Code change: - prompt_builder.py: Support sub-categories in skill scanner. skills/mlops/training/axolotl/SKILL.md now shows as category 'mlops/training' instead of just 'mlops'. Backwards-compatible with existing flat structure. Split mlops (40 skills) into 7 sub-categories: - mlops/training (12): accelerate, axolotl, flash-attention, grpo-rl-training, peft, pytorch-fsdp, pytorch-lightning, simpo, slime, torchtitan, trl-fine-tuning, unsloth - mlops/inference (8): gguf, guidance, instructor, llama-cpp, obliteratus, outlines, tensorrt-llm, vllm - mlops/models (6): audiocraft, clip, llava, segment-anything, stable-diffusion, whisper - mlops/vector-databases (4): chroma, faiss, pinecone, qdrant - mlops/evaluation (5): huggingface-tokenizers, lm-evaluation-harness, nemo-curator, saelens, weights-and-biases - mlops/cloud (2): lambda-labs, modal - mlops/research (1): dspy Merged singleton categories: - gifs → media (gif-search joins youtube-content) - music-creation → media (heartmula, songsee) - diagramming → creative (excalidraw joins ascii-art) - ocr-and-documents → productivity - domain → research (domain-intel) - feeds → research (blogwatcher) - market-data → research (polymarket) Fixed misplaced skills: - mlops/code-review → software-development (not ML-specific) - mlops/ml-paper-writing → research (academic writing) Added DESCRIPTION.md files for all new/updated categories.
2026-04-25 00:51:20 +00:00 · 2026-03-09 03:35:53 -07:00 · 2026-03-09 03:35:53 -07:00 · 732c66b0f3
commit 732c66b0f3
parent d6c710706f
217 changed files with 39 additions and 4 deletions
--- a/skills/mlops/vector-databases/pinecone/references/deployment.md
+++ b/skills/mlops/vector-databases/pinecone/references/deployment.md
@ -0,0 +1,181 @@
+# Pinecone Deployment Guide
+
+Production deployment patterns for Pinecone.
+
+## Serverless vs Pod-based
+
+### Serverless (Recommended)
+
+```python
+from pinecone import Pinecone, ServerlessSpec
+
+pc = Pinecone(api_key="your-key")
+
+# Create serverless index
+pc.create_index(
+    name="my-index",
+    dimension=1536,
+    metric="cosine",
+    spec=ServerlessSpec(
+        cloud="aws",  # or "gcp", "azure"
+        region="us-east-1"
+    )
+)
+```
+
+**Benefits:**
+- Auto-scaling
+- Pay per usage
+- No infrastructure management
+- Cost-effective for variable load
+
+**Use when:**
+- Variable traffic
+- Cost optimization important
+- Don't need consistent latency
+
+### Pod-based
+
+```python
+from pinecone import PodSpec
+
+pc.create_index(
+    name="my-index",
+    dimension=1536,
+    metric="cosine",
+    spec=PodSpec(
+        environment="us-east1-gcp",
+        pod_type="p1.x1",  # or p1.x2, p1.x4, p1.x8
+        pods=2,  # Number of pods
+        replicas=2  # High availability
+    )
+)
+```
+
+**Benefits:**
+- Consistent performance
+- Predictable latency
+- Higher throughput
+- Dedicated resources
+
+**Use when:**
+- Production workloads
+- Need consistent p95 latency
+- High throughput required
+
+## Hybrid search
+
+### Dense + Sparse vectors
+
+```python
+# Upsert with both dense and sparse vectors
+index.upsert(vectors=[
+    {
+        "id": "doc1",
+        "values": [0.1, 0.2, ...],  # Dense (semantic)
+        "sparse_values": {
+            "indices": [10, 45, 123],  # Token IDs
+            "values": [0.5, 0.3, 0.8]   # TF-IDF/BM25 scores
+        },
+        "metadata": {"text": "..."}
+    }
+])
+
+# Hybrid query
+results = index.query(
+    vector=[0.1, 0.2, ...],  # Dense query
+    sparse_vector={
+        "indices": [10, 45],
+        "values": [0.5, 0.3]
+    },
+    top_k=10,
+    alpha=0.5  # 0=sparse only, 1=dense only, 0.5=balanced
+)
+```
+
+**Benefits:**
+- Best of both worlds
+- Semantic + keyword matching
+- Better recall than either alone
+
+## Namespaces for multi-tenancy
+
+```python
+# Separate data by user/tenant
+index.upsert(
+    vectors=[{"id": "doc1", "values": [...]}],
+    namespace="user-123"
+)
+
+# Query specific namespace
+results = index.query(
+    vector=[...],
+    namespace="user-123",
+    top_k=5
+)
+
+# List namespaces
+stats = index.describe_index_stats()
+print(stats['namespaces'])
+```
+
+**Use cases:**
+- Multi-tenant SaaS
+- User-specific data isolation
+- A/B testing (prod/staging namespaces)
+
+## Metadata filtering
+
+### Exact match
+
+```python
+results = index.query(
+    vector=[...],
+    filter={"category": "tutorial"},
+    top_k=5
+)
+```
+
+### Range queries
+
+```python
+results = index.query(
+    vector=[...],
+    filter={"price": {"$gte": 100, "$lte": 500}},
+    top_k=5
+)
+```
+
+### Complex filters
+
+```python
+results = index.query(
+    vector=[...],
+    filter={
+        "$and": [
+            {"category": {"$in": ["tutorial", "guide"]}},
+            {"difficulty": {"$lte": 3}},
+            {"published": {"$gte": "2024-01-01"}}
+        ]
+    },
+    top_k=5
+)
+```
+
+## Best practices
+
+1. **Use serverless for development** - Cost-effective
+2. **Switch to pods for production** - Consistent performance
+3. **Implement namespaces** - Multi-tenancy
+4. **Add metadata strategically** - Enable filtering
+5. **Use hybrid search** - Better quality
+6. **Batch upserts** - 100-200 vectors per batch
+7. **Monitor usage** - Check Pinecone dashboard
+8. **Set up alerts** - Usage/cost thresholds
+9. **Regular backups** - Export important data
+10. **Test filters** - Verify performance
+
+## Resources
+
+- **Docs**: https://docs.pinecone.io
+- **Console**: https://app.pinecone.io