mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-14 04:02:26 +00:00

Teknium b9bac87d5a feat(skills): declare platforms frontmatter for all 79 undeclared built-in skills

Completes the Windows-gating coverage for the built-in skills/ tree. Every
bundled SKILL.md now carries an explicit platforms: declaration so the
loader (agent.skill_utils.skill_matches_platform) can skip-load skills
that don't fit the current OS.

74 skills declared cross-platform (platforms: [linux, macos, windows]):
  Creative (16): ascii-art, ascii-video, architecture-diagram, baoyu-comic,
    baoyu-infographic, claude-design, creative-ideation, design-md,
    excalidraw, humanizer, manim-video, p5js, pixel-art,
    popular-web-designs, pretext, sketch, songwriting-and-ai-music,
    touchdesigner-mcp
  Autonomous agents: claude-code, codex, hermes-agent, opencode
  Data/devops: jupyter-live-kernel, kanban-orchestrator, kanban-worker,
    webhook-subscriptions, dogfood, codebase-inspection
  GitHub: github-auth, github-code-review, github-issues,
    github-pr-workflow, github-repo-management
  Media: gif-search, heartmula, songsee, spotify, youtube-content
  MCP / email / gaming / notes / smart-home: native-mcp, himalaya,
    pokemon-player, obsidian, openhue
  mlops (non-broken): weights-and-biases, huggingface-hub, llama-cpp,
    outlines, segment-anything-model, dspy, trl-fine-tuning
  Productivity: airtable, google-workspace, linear, maps, nano-pdf,
    notion, ocr-and-documents, powerpoint
  Red-teaming / research: godmode, arxiv, blogwatcher, llm-wiki,
    polymarket
  Software-dev: debugging-hermes-tui-commands, hermes-agent-skill-authoring,
    node-inspect-debugger, plan, requesting-code-review, spike,
    subagent-driven-development, systematic-debugging,
    test-driven-development, writing-plans
  Misc: yuanbao

5 skills gated from Windows (platforms: [linux, macos]):
  mlops/inference/vllm (serving-llms-vllm)
    vLLM is officially Linux-only; Windows requires WSL.
  mlops/training/axolotl
    Axolotl's flash-attn + deepspeed + bitsandbytes stack is Linux-first.
  mlops/training/unsloth
    Requires Triton + xformers + flash-attn — Linux only in practice.
  mlops/models/audiocraft (audiocraft-audio-generation)
    torchaudio ffmpeg backend + encodec dependencies are Linux-first.
  mlops/inference/obliteratus
    Research abliteration workflow; relies on Linux-focused pytorch
    kernels and MLX — no first-class Windows path.

Same strict-over-lenient policy as the optional-skills sweep: when the
underlying tool's Windows support is rough, missing, or WSL-only, gate the
skill. Easier to un-gate after verified Windows support lands than to leak
partial support that manifests as mid-task failures.

Combined with prior commits in this branch, every bundled SKILL.md
(skills/ + optional-skills/) now has a platforms: declaration.

2026-05-08 09:23:27 -07:00

9.8 KiB

Raw Blame History

name

description

version

author

license

platforms

metadata

arxiv

Search arXiv papers by keyword, author, category, or ID.

1.0.0

Hermes Agent

MIT

linux

macos

windows

hermes

arXiv Research

Search and retrieve academic papers from arXiv via their free REST API. No API key, no dependencies — just curl.

Quick Reference

Action	Command
Search papers	`curl "https://export.arxiv.org/api/query?search_query=all:QUERY&max_results=5"`
Get specific paper	`curl "https://export.arxiv.org/api/query?id_list=2402.03300"`
Read abstract (web)	`web_extract(urls=["https://arxiv.org/abs/2402.03300"])`
Read full paper (PDF)	`web_extract(urls=["https://arxiv.org/pdf/2402.03300"])`

Searching Papers

The API returns Atom XML. Parse with grep/sed or pipe through python3 for clean output.

Basic search

curl -s "https://export.arxiv.org/api/query?search_query=all:GRPO+reinforcement+learning&max_results=5"

Clean output (parse XML to readable format)

curl -s "https://export.arxiv.org/api/query?search_query=all:GRPO+reinforcement+learning&max_results=5&sortBy=submittedDate&sortOrder=descending" | python3 -c "
import sys, xml.etree.ElementTree as ET
ns = {'a': 'http://www.w3.org/2005/Atom'}
root = ET.parse(sys.stdin).getroot()
for i, entry in enumerate(root.findall('a:entry', ns)):
    title = entry.find('a:title', ns).text.strip().replace('\n', ' ')
    arxiv_id = entry.find('a:id', ns).text.strip().split('/abs/')[-1]
    published = entry.find('a:published', ns).text[:10]
    authors = ', '.join(a.find('a:name', ns).text for a in entry.findall('a:author', ns))
    summary = entry.find('a:summary', ns).text.strip()[:200]
    cats = ', '.join(c.get('term') for c in entry.findall('a:category', ns))
    print(f'{i+1}. [{arxiv_id}] {title}')
    print(f'   Authors: {authors}')
    print(f'   Published: {published} | Categories: {cats}')
    print(f'   Abstract: {summary}...')
    print(f'   PDF: https://arxiv.org/pdf/{arxiv_id}')
    print()
"

Search Query Syntax

Prefix	Searches	Example
`all:`	All fields	`all:transformer+attention`
`ti:`	Title	`ti:large+language+models`
`au:`	Author	`au:vaswani`
`abs:`	Abstract	`abs:reinforcement+learning`
`cat:`	Category	`cat:cs.AI`
`co:`	Comment	`co:accepted+NeurIPS`

Boolean operators

# AND (default when using +)
search_query=all:transformer+attention

# OR
search_query=all:GPT+OR+all:BERT

# AND NOT
search_query=all:language+model+ANDNOT+all:vision

# Exact phrase
search_query=ti:"chain+of+thought"

# Combined
search_query=au:hinton+AND+cat:cs.LG

Sort and Pagination

Parameter	Options
`sortBy`	`relevance`, `lastUpdatedDate`, `submittedDate`
`sortOrder`	`ascending`, `descending`
`start`	Result offset (0-based)
`max_results`	Number of results (default 10, max 30000)

# Latest 10 papers in cs.AI
curl -s "https://export.arxiv.org/api/query?search_query=cat:cs.AI&sortBy=submittedDate&sortOrder=descending&max_results=10"

Fetching Specific Papers

# By arXiv ID
curl -s "https://export.arxiv.org/api/query?id_list=2402.03300"

# Multiple papers
curl -s "https://export.arxiv.org/api/query?id_list=2402.03300,2401.12345,2403.00001"

BibTeX Generation

After fetching metadata for a paper, generate a BibTeX entry:

{% raw %}

curl -s "https://export.arxiv.org/api/query?id_list=1706.03762" | python3 -c "
import sys, xml.etree.ElementTree as ET
ns = {'a': 'http://www.w3.org/2005/Atom', 'arxiv': 'http://arxiv.org/schemas/atom'}
root = ET.parse(sys.stdin).getroot()
entry = root.find('a:entry', ns)
if entry is None: sys.exit('Paper not found')
title = entry.find('a:title', ns).text.strip().replace('\n', ' ')
authors = ' and '.join(a.find('a:name', ns).text for a in entry.findall('a:author', ns))
year = entry.find('a:published', ns).text[:4]
raw_id = entry.find('a:id', ns).text.strip().split('/abs/')[-1]
cat = entry.find('arxiv:primary_category', ns)
primary = cat.get('term') if cat is not None else 'cs.LG'
last_name = entry.find('a:author', ns).find('a:name', ns).text.split()[-1]
print(f'@article{{{last_name}{year}_{raw_id.replace(\".\", \"\")},')
print(f'  title     = {{{title}}},')
print(f'  author    = {{{authors}}},')
print(f'  year      = {{{year}}},')
print(f'  eprint    = {{{raw_id}}},')
print(f'  archivePrefix = {{arXiv}},')
print(f'  primaryClass  = {{{primary}}},')
print(f'  url       = {{https://arxiv.org/abs/{raw_id}}}')
print('}')
"

{% endraw %}

Reading Paper Content

After finding a paper, read it:

# Abstract page (fast, metadata + abstract)
web_extract(urls=["https://arxiv.org/abs/2402.03300"])

# Full paper (PDF → markdown via Firecrawl)
web_extract(urls=["https://arxiv.org/pdf/2402.03300"])

For local PDF processing, see the ocr-and-documents skill.

Common Categories

Category	Field
`cs.AI`	Artificial Intelligence
`cs.CL`	Computation and Language (NLP)
`cs.CV`	Computer Vision
`cs.LG`	Machine Learning
`cs.CR`	Cryptography and Security
`stat.ML`	Machine Learning (Statistics)
`math.OC`	Optimization and Control
`physics.comp-ph`	Computational Physics

Full list: https://arxiv.org/category_taxonomy

Helper Script

The scripts/search_arxiv.py script handles XML parsing and provides clean output:

python scripts/search_arxiv.py "GRPO reinforcement learning"
python scripts/search_arxiv.py "transformer attention" --max 10 --sort date
python scripts/search_arxiv.py --author "Yann LeCun" --max 5
python scripts/search_arxiv.py --category cs.AI --sort date
python scripts/search_arxiv.py --id 2402.03300
python scripts/search_arxiv.py --id 2402.03300,2401.12345

No dependencies — uses only Python stdlib.

arXiv doesn't provide citation data or recommendations. Use the Semantic Scholar API for that — free, no key needed for basic use (1 req/sec), returns JSON.

Get paper details + citations

# By arXiv ID
curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300?fields=title,authors,citationCount,referenceCount,influentialCitationCount,year,abstract" | python3 -m json.tool

# By Semantic Scholar paper ID or DOI
curl -s "https://api.semanticscholar.org/graph/v1/paper/DOI:10.1234/example?fields=title,citationCount"

Get citations OF a paper (who cited it)

curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300/citations?fields=title,authors,year,citationCount&limit=10" | python3 -m json.tool

Get references FROM a paper (what it cites)

curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300/references?fields=title,authors,year,citationCount&limit=10" | python3 -m json.tool

Search papers (alternative to arXiv search, returns JSON)

curl -s "https://api.semanticscholar.org/graph/v1/paper/search?query=GRPO+reinforcement+learning&limit=5&fields=title,authors,year,citationCount,externalIds" | python3 -m json.tool

Get paper recommendations

curl -s -X POST "https://api.semanticscholar.org/recommendations/v1/papers/" \
  -H "Content-Type: application/json" \
  -d '{"positivePaperIds": ["arXiv:2402.03300"], "negativePaperIds": []}' | python3 -m json.tool

Author profile

curl -s "https://api.semanticscholar.org/graph/v1/author/search?query=Yann+LeCun&fields=name,hIndex,citationCount,paperCount" | python3 -m json.tool

Useful Semantic Scholar fields

title, authors, year, abstract, citationCount, referenceCount, influentialCitationCount, isOpenAccess, openAccessPdf, fieldsOfStudy, publicationVenue, externalIds (contains arXiv ID, DOI, etc.)

Complete Research Workflow

Discover: python scripts/search_arxiv.py "your topic" --sort date --max 10
Assess impact: curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:ID?fields=citationCount,influentialCitationCount"
Read abstract: web_extract(urls=["https://arxiv.org/abs/ID"])
Read full paper: web_extract(urls=["https://arxiv.org/pdf/ID"])
Find related work: curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:ID/references?fields=title,citationCount&limit=20"
Get recommendations: POST to Semantic Scholar recommendations endpoint
Track authors: curl -s "https://api.semanticscholar.org/graph/v1/author/search?query=NAME"

Rate Limits

API	Rate	Auth
arXiv	~1 req / 3 seconds	None needed
Semantic Scholar	1 req / second	None (100/sec with API key)

Notes

arXiv returns Atom XML — use the helper script or parsing snippet for clean output
Semantic Scholar returns JSON — pipe through python3 -m json.tool for readability
arXiv IDs: old format (hep-th/0601001) vs new (2402.03300)
PDF: https://arxiv.org/pdf/{id} — Abstract: https://arxiv.org/abs/{id}
HTML (when available): https://arxiv.org/html/{id}
For local PDF processing, see the ocr-and-documents skill

ID Versioning

arxiv.org/abs/1706.03762 always resolves to the latest version
arxiv.org/abs/1706.03762v1 points to a specific immutable version
When generating citations, preserve the version suffix you actually read to prevent citation drift (a later version may substantially change content)
The API <id> field returns the versioned URL (e.g., http://arxiv.org/abs/1706.03762v7)

Withdrawn Papers

Papers can be withdrawn after submission. When this happens:

The <summary> field contains a withdrawal notice (look for "withdrawn" or "retracted")
Metadata fields may be incomplete
Always check the summary before treating a result as a valid paper

9.8 KiB Raw Blame History