Mirror of https://github.com/NousResearch/hermes-agent.git (synced 2026-04-26 01:01:40 +00:00)
Pinecone Deployment Guide
Production deployment patterns for Pinecone.
Serverless vs Pod-based
Serverless (Recommended)
```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-key")

# Create serverless index
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",  # or "gcp", "azure"
        region="us-east-1"
    )
)
```
Benefits:
- Auto-scaling
- Pay per usage
- No infrastructure management
- Cost-effective for variable load
Use when:
- Variable traffic
- Cost optimization important
- Don't need consistent latency
Pod-based
```python
from pinecone import PodSpec

pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east1-gcp",
        pod_type="p1.x1",  # or p1.x2, p1.x4, p1.x8
        pods=2,            # Number of pods
        replicas=2         # High availability
    )
)
```
Benefits:
- Consistent performance
- Predictable latency
- Higher throughput
- Dedicated resources
Use when:
- Production workloads
- Need consistent p95 latency
- High throughput required
Hybrid search
Dense + Sparse vectors
```python
# Connect to an existing index
index = pc.Index("my-index")

# Upsert with both dense and sparse vectors
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],    # Dense (semantic)
        "sparse_values": {
            "indices": [10, 45, 123],  # Token IDs
            "values": [0.5, 0.3, 0.8]  # TF-IDF/BM25 scores
        },
        "metadata": {"text": "..."}
    }
])

# Hybrid query. Note: query() has no alpha parameter; to weight
# dense vs. sparse (0=sparse only, 1=dense only), scale both
# vectors client-side before querying.
results = index.query(
    vector=[0.1, 0.2, ...],  # Dense query
    sparse_vector={
        "indices": [10, 45],
        "values": [0.5, 0.3]
    },
    top_k=10
)
```
Benefits:
- Best of both worlds
- Semantic + keyword matching
- Better recall than either alone
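Since the client applies no dense/sparse weighting for you, the usual pattern is a convex combination applied client-side before querying. A minimal sketch (the helper name `hybrid_score_norm` is our own, following the pattern described in Pinecone's hybrid search docs):

```python
def hybrid_score_norm(dense, sparse, alpha):
    """Scale dense and sparse vectors by a convex weight.

    alpha=1.0 -> dense only, alpha=0.0 -> sparse only, 0.5 -> balanced.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse

# Balanced hybrid query:
# d, s = hybrid_score_norm(dense_vec, sparse_vec, alpha=0.5)
# results = index.query(vector=d, sparse_vector=s, top_k=10)
```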
Namespaces for multi-tenancy
```python
# Separate data by user/tenant
index.upsert(
    vectors=[{"id": "doc1", "values": [...]}],
    namespace="user-123"
)

# Query a specific namespace
results = index.query(
    vector=[...],
    namespace="user-123",
    top_k=5
)

# List namespaces
stats = index.describe_index_stats()
print(stats["namespaces"])
```
Use cases:
- Multi-tenant SaaS
- User-specific data isolation
- A/B testing (prod/staging namespaces)
Metadata filtering
Exact match
```python
results = index.query(
    vector=[...],
    filter={"category": "tutorial"},
    top_k=5
)
```
Range queries
```python
results = index.query(
    vector=[...],
    filter={"price": {"$gte": 100, "$lte": 500}},
    top_k=5
)
```
Complex filters
```python
results = index.query(
    vector=[...],
    filter={
        "$and": [
            {"category": {"$in": ["tutorial", "guide"]}},
            {"difficulty": {"$lte": 3}},
            # Range operators ($gte, $lte, ...) compare numbers, so
            # store dates as numeric timestamps (e.g. epoch seconds)
            {"published": {"$gte": 1704067200}}  # 2024-01-01 UTC
        ]
    },
    top_k=5
)
```
Best practices
- Use serverless for development - Cost-effective
- Switch to pods for production - Consistent performance
- Implement namespaces - Multi-tenancy
- Add metadata strategically - Enable filtering
- Use hybrid search - Better quality
- Batch upserts - 100-200 vectors per batch
- Monitor usage - Check Pinecone dashboard
- Set up alerts - Usage/cost thresholds
- Regular backups - Export important data
- Test filters - Verify performance
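The batching advice above can be sketched as a simple chunking loop; the batch size of 100 matches the recommendation, and the `index.upsert` usage in the comment assumes the index created earlier:

```python
def batched(vectors, batch_size=100):
    """Yield successive batches of vectors for upsert (~100-200 per call)."""
    for i in range(0, len(vectors), batch_size):
        yield vectors[i:i + batch_size]

# for batch in batched(all_vectors):
#     index.upsert(vectors=batch)
```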
Resources
- Docs: https://docs.pinecone.io
- Console: https://app.pinecone.io