feat(skills): /learn — distill a reusable skill from anything you describe (#51506)

Open-ended skill learning across every surface. /learn <free text> takes a
description of any source — a directory, a URL, the workflow you just walked
the agent through, or pasted notes — and the live agent gathers it with the
tools it already has (read_file/search_files, web_extract, the conversation,
the pasted text), then authors a SKILL.md via skill_manage following the
house authoring standards (<=60-char description, the standard section order,
Hermes-tool framing, no invented commands).

No engine, no model-tool footprint, works on any terminal backend (local,
Docker, remote): /learn builds a standards-guided prompt and hands it to the
agent as a normal turn.

- agent/learn_prompt.py: shared standards-guided prompt builder
- /learn registry entry (both surfaces) + CLI handler (inject onto input
  queue) + gateway handler (rewrite turn, fall through, /blueprint pattern)
- tui_gateway command.dispatch returns a send directive -> TUI + dashboard chat
- dashboard Skills page 'Learn a skill' panel (dir + URL + open-ended text)
  composes a /learn request and runs it in chat
- docs (slash-commands ref + skills feature page), 11 targeted tests

Inspired by OpenAI Codex's Record & Replay and the /learn concept from #47234
(dir-distillation engine); reworked to be open-ended and engine-free per
review.
Teknium 2026-06-23 13:51:28 -07:00 committed by GitHub
parent aaa2e2cb88
commit e32ebc6aa2
GPG key ID: B5690EEEBB952194
11 changed files with 404 additions and 1 deletions

agent/learn_prompt.py (new file, +109)

@ -0,0 +1,109 @@
#!/usr/bin/env python3
"""``/learn`` — build the standards-guided prompt that turns whatever the user
described into a reusable skill.
``/learn`` is open-ended. The user can point it at anything they can describe:
a directory of code, an API doc URL, a workflow they just walked the agent
through in this conversation, or pasted notes. This module builds ONE prompt
that instructs the live agent to:
1. Gather the sources the user named, using the tools it already has
(``read_file`` / ``search_files`` for dirs, ``web_extract`` for URLs, the
current conversation for "what I just did", the user's text for pasted
material).
2. Author a single ``SKILL.md`` via ``skill_manage`` that follows the Hermes
skill-authoring standards (description <=60 chars, the modern section
order, Hermes-tool framing, no invented commands).
There is no separate distillation engine and no model-tool footprint: the
agent does the work with its existing toolset, so this works identically on
local, Docker, and remote terminal backends. Every surface (CLI ``/learn``,
gateway ``/learn``, the dashboard "Learn a skill" panel) calls
:func:`build_learn_prompt` and feeds the result to the agent as a normal turn.
"""
from __future__ import annotations
# The house-style rules, distilled from AGENTS.md "Skill authoring standards
# (HARDLINE)" and the hermes-agent-dev new-skill salvage reference. Embedded in
# the prompt so the agent authors skills the way a maintainer would by hand.
_AUTHORING_STANDARDS = """\
Follow the Hermes skill-authoring standards exactly:
Frontmatter:
- name: lowercase-hyphenated, <=64 chars, no spaces.
- description: ONE sentence, <=60 characters, ends with a period. State the
capability, not the implementation. No marketing words (powerful,
comprehensive, seamless, advanced). Do NOT repeat the skill name. If the
description contains a colon, wrap the whole value in double quotes.
- version: 0.1.0
- metadata.hermes.tags: a few Capitalized, Relevant, Tags.
Body section order (omit a section only if it genuinely has no content):
1. "# <Human Title>" then a 2-3 sentence intro: what it does, what it does NOT
do, and the key dependency stance (e.g. "stdlib only").
2. "## When to Use" bullet list of concrete trigger phrases.
3. "## Prerequisites" exact env vars, install steps, credentials.
4. "## How to Run" the canonical invocation, framed through Hermes tools.
5. "## Quick Reference" a flat command/endpoint list, no narration.
6. "## Procedure" numbered steps with copy-paste-exact commands.
7. "## Pitfalls" known limits, rate limits, things that look broken but aren't.
8. "## Verification" a single command/check that proves the skill worked.
Hermes-tool framing (this is what makes it a skill, not shell docs):
- Frame running scripts as "invoke through the `terminal` tool".
- Use `read_file` (not cat/head/tail), `search_files` (not grep/find/ls),
`patch` (not sed/awk), `web_extract` (not curl-to-scrape),
`vision_analyze` for images. Reference these tools by name in backticks.
- Do NOT name shell utilities the agent already has wrapped.
Quality bar:
- Prefer exact commands, endpoint URLs, function signatures, and config keys
that appear VERBATIM in the source. NEVER invent flags, paths, or APIs: if
you didn't see it in the source, don't write it.
- Keep it tight and scannable: ~100 lines for a simple skill, ~200 for a
complex one. Don't re-paste the source docs.
- Don't write a router/index/hub skill that only points at other skills.
- Larger scripts/parsers belong in a `scripts/` file (add via
`skill_manage` write_file), referenced from SKILL.md by relative path, not
inlined for the agent to re-type every run."""
def build_learn_prompt(user_request: str) -> str:
"""Build the agent prompt for an open-ended ``/learn`` request.
Args:
user_request: the free text the user gave after ``/learn``: a
description of the workflow, paths, URLs, or "what I just did".
Returns:
A complete instruction the agent runs as a normal turn. The agent
gathers the described sources with its existing tools and authors the
skill via ``skill_manage``.
"""
req = (user_request or "").strip()
if not req:
req = (
"the workflow we just went through in this conversation — review "
"the steps taken and distill them into a reusable skill"
)
return (
"[/learn] The user wants you to learn a reusable skill from the "
"source(s) they described below, and save it.\n\n"
f"WHAT TO LEARN FROM:\n{req}\n\n"
"Do this:\n"
"1. Gather the material. Resolve whatever the user named using the "
"tools you already have — `read_file`/`search_files` for local files "
"or directories, `web_extract` for URLs, the current conversation "
"history if they referred to something you just did, and the text "
"they pasted as-is. If the request is ambiguous about scope, make a "
"reasonable choice and note it; do not stall.\n"
"2. Author ONE SKILL.md and save it with the `skill_manage` tool "
"(action=\"create\"). Pick a sensible category. If the procedure needs "
"a non-trivial script, add it under the skill's `scripts/` with "
"`skill_manage` write_file and reference it by relative path.\n\n"
f"{_AUTHORING_STANDARDS}\n\n"
"When done, tell the user the skill name, its category, and a "
"one-line summary of what it captured."
)
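
The frontmatter rules in `_AUTHORING_STANDARDS` are plain-text guidance for the agent; nothing in this commit enforces them in code. Purely as an illustration, here is a minimal sketch of how the description rules could be checked mechanically. The helper `check_description` and the constant `MARKETING_WORDS` are hypothetical names, not part of this change.

```python
from __future__ import annotations

# Hypothetical helper, not part of this commit: checks a candidate
# frontmatter description against the rules quoted in _AUTHORING_STANDARDS.
MARKETING_WORDS = ("powerful", "comprehensive", "seamless", "advanced")


def check_description(name: str, description: str) -> list[str]:
    """Return a list of rule violations (empty if the description passes)."""
    problems = []
    if len(description) > 60:
        problems.append("longer than 60 characters")
    if not description.endswith("."):
        problems.append("does not end with a period")
    lowered = description.lower()
    for word in MARKETING_WORDS:
        if word in lowered:
            problems.append(f"marketing word: {word}")
    # The skill name is lowercase-hyphenated; compare against both spellings.
    if name and (name in lowered or name.replace("-", " ") in lowered):
        problems.append("repeats the skill name")
    return problems
```

A description such as "Authenticate against the Acme REST API." for a skill named `acme-auth` passes with no violations, while "A powerful acme auth toolkit" trips three rules at once.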

cli.py (+2)

@ -8009,6 +8009,8 @@ class HermesCLI(CLIAgentSetupMixin, CLICommandsMixin):
elif canonical == "skills":
with self._busy_command(self._slow_command_status(cmd_original)):
self._handle_skills_command(cmd_original)
elif canonical == "learn":
self._handle_learn_command(cmd_original)
elif canonical == "memory":
self._handle_memory_command(cmd_original)
elif canonical == "platforms":


@ -8113,6 +8113,34 @@ class GatewayRunner(GatewayAuthorizationMixin, GatewayKanbanWatchersMixin, Gatew
if canonical == "skills":
return await self._handle_skills_command(event)
if canonical == "learn":
# Open-ended: rewrite the turn to a standards-guided prompt and fall
# through to normal agent processing. The live agent gathers the
# sources the user described (dirs via read_file, URLs via
# web_extract, this conversation, pasted text) and authors the skill
# via skill_manage. Mirrors the /blueprint fall-through so role
# alternation is preserved. No engine, works on any backend.
from agent.learn_prompt import build_learn_prompt
_learn_req = event.get_command_args().strip()
_ack = (
"Learning a skill from what you described…"
if _learn_req
else "Learning a skill from this conversation…"
)
try:
adapter = self.adapters.get(source.platform)
if adapter:
_ack_meta = self._thread_metadata_for_source(source)
await adapter.send(str(source.chat_id), _ack, metadata=_ack_meta)
except Exception:
logger.debug("learn ack send failed", exc_info=True)
try:
event.text = build_learn_prompt(_learn_req)
# fall through to agent processing
except Exception:
return "Could not start /learn — please try again."
if canonical == "fast":
return await self._handle_fast_command(event)


@ -1354,6 +1354,32 @@ class CLICommandsMixin:
from hermes_cli.skills_hub import handle_skills_slash
handle_skills_slash(cmd, ChatConsole())
def _handle_learn_command(self, cmd: str):
"""Handle /learn — distill a reusable skill from anything the user describes.
Open-ended: the argument is free text describing the source(s): a
directory, a URL, "what we just did", pasted notes. We build a
standards-guided prompt and inject it onto the agent's input queue; the
live agent gathers the material with the tools it already has and
authors the skill via ``skill_manage``. No engine, no model-tool
footprint, works on any terminal backend.
"""
from agent.learn_prompt import build_learn_prompt
# Everything after the command word is the open-ended request.
parts = cmd.strip().split(None, 1)
user_request = parts[1].strip() if len(parts) > 1 else ""
msg = build_learn_prompt(user_request)
if user_request:
print("\n⚡ Learning a skill from what you described...")
else:
print("\n⚡ Learning a skill from this conversation...")
if hasattr(self, "_pending_input"):
self._pending_input.put(msg)
else: # pragma: no cover - defensive (no live input loop)
print(" /learn needs an active chat session to run.")
def _handle_memory_command(self, cmd: str):
"""Handle /memory slash command — pending review + approval-gate toggle."""
from hermes_cli.write_approval_commands import handle_pending_subcommand


@ -179,6 +179,8 @@ COMMAND_REGISTRY: list[CommandDef] = [
subcommands=("pending", "approve", "reject", "approval")),
CommandDef("bundles", "List skill bundles (aliases /<name> for multiple skills)",
"Tools & Skills"),
CommandDef("learn", "Learn a reusable skill from anything you describe (dirs, URLs, this chat, notes)",
"Tools & Skills", args_hint="<what to learn from>"),
CommandDef("cron", "Manage scheduled tasks", "Tools & Skills",
cli_only=True, args_hint="[subcommand]",
subcommands=("list", "add", "create", "edit", "pause", "resume", "run", "remove")),


@ -0,0 +1,73 @@
"""Tests for /learn — open-ended skill distillation.
Covers the shared prompt builder (agent.learn_prompt.build_learn_prompt) and
the slash-command registry wiring. /learn has no engine and no model tool: it
builds a standards-guided prompt that the live agent runs as a normal turn, so
these are the load-bearing behavior contracts.
"""
from agent.learn_prompt import build_learn_prompt, _AUTHORING_STANDARDS
class TestBuildLearnPrompt:
def test_embeds_the_user_request_verbatim(self):
req = "the REST client in ~/projects/acme-sdk, focus on auth"
prompt = build_learn_prompt(req)
assert req in prompt
def test_always_includes_the_authoring_standards(self):
# The standards are what make distilled skills match house style;
# they must travel with every prompt regardless of input.
for req in ["", "a url https://x/y", "what we just did"]:
assert _AUTHORING_STANDARDS in build_learn_prompt(req)
def test_instructs_saving_via_skill_manage_not_a_raw_file(self):
prompt = build_learn_prompt("learn the thing")
assert "skill_manage" in prompt
def test_references_gather_tools_for_open_ended_sourcing(self):
# Open-ended sourcing relies on the agent's own tools, named so it
# knows dirs/URLs/conversation/paste all route through existing tools.
prompt = build_learn_prompt("learn from somewhere")
for tool in ("read_file", "search_files", "web_extract"):
assert tool in prompt
def test_empty_request_falls_back_to_the_conversation(self):
# Bare /learn should distill "what we just did", not error.
prompt = build_learn_prompt("")
assert "conversation" in prompt.lower()
# And still carries the standards + save instruction.
assert "skill_manage" in prompt
def test_whitespace_only_request_is_treated_as_empty(self):
assert build_learn_prompt(" \n ") == build_learn_prompt("")
def test_description_length_rule_is_in_the_standards(self):
# The single most-violated rule must be explicit in the prompt.
assert "60" in _AUTHORING_STANDARDS
class TestLearnRegistryWiring:
def test_learn_is_registered_and_resolves(self):
from hermes_cli.commands import resolve_command
cmd = resolve_command("learn")
assert cmd is not None
assert cmd.name == "learn"
def test_learn_is_in_tools_and_skills_category(self):
from hermes_cli.commands import resolve_command
assert resolve_command("learn").category == "Tools & Skills"
def test_learn_works_on_the_gateway(self):
# /learn must reach the gateway runner (it's a both-surfaces command),
# not be CLI-only.
from hermes_cli.commands import GATEWAY_KNOWN_COMMANDS
assert "learn" in GATEWAY_KNOWN_COMMANDS
def test_learn_is_not_cli_only(self):
from hermes_cli.commands import resolve_command
assert not resolve_command("learn").cli_only


@ -9127,6 +9127,15 @@ def _(rid, params: dict) -> dict:
return _err(rid, 4004, "usage: /queue <prompt>")
return _ok(rid, {"type": "send", "message": arg})
if name == "learn":
# Open-ended: build the standards-guided prompt and submit it as a
# normal agent turn. The live agent gathers whatever the user
# described (dirs, URLs, this conversation, pasted text) with its own
# tools and authors the skill via skill_manage. Works on any backend.
from agent.learn_prompt import build_learn_prompt
return _ok(rid, {"type": "send", "message": build_learn_prompt(arg)})
if name == "retry":
if not session:
return _err(rid, 4001, "no active session to retry")


@ -671,6 +671,25 @@ export default function ChatPage({ isActive = true }: { isActive?: boolean }) {
// follow up with the authoritative measurement — at worst Ink
// reflows once after the PTY boots, which is imperceptible.
ws.send(`\x1b[RESIZE:${term.cols};${term.rows}]`);
// One-shot: a ?learn=<text> param (set by the Skills page "Learn a
// skill" panel) is typed into the composer as a /learn command once the
// PTY is up. /learn resolves via command.dispatch → a normal agent turn,
// so this reuses the existing composer path — no special PTY protocol.
const learnSeed = searchParams.get("learn");
if (learnSeed) {
const next = new URLSearchParams(searchParams);
next.delete("learn");
setSearchParams(next, { replace: true });
const cmd = `/learn ${learnSeed}`.trim();
// Delay so Ink's composer has mounted and grabbed focus before input.
setTimeout(() => {
try {
wsRef.current?.send(cmd + "\r");
} catch {
/* PTY not ready / closed — user can retype */
}
}, 800);
}
};
ws.onmessage = (ev) => {


@ -1,4 +1,5 @@
import { useEffect, useLayoutEffect, useState, useMemo, useCallback } from "react";
import { useNavigate } from "react-router-dom";
import {
Package,
Search,
@ -212,6 +213,37 @@ export default function SkillsPage() {
setEditorSkill(null);
setEditorOpen(true);
}, []);
// ── "Learn a skill" panel ──────────────────────────────────────────────
// Open-ended: dir + URL + free-text inputs are composed into a single-line
// /learn command and handed to the chat. /learn resolves to a normal agent
// turn (command.dispatch → send), so the live agent gathers the sources
// with its own tools and authors the skill via skill_manage. No backend
// distill endpoint — one code path with the CLI/TUI/gateway /learn.
const navigate = useNavigate();
const [learnOpen, setLearnOpen] = useState(false);
const [learnDir, setLearnDir] = useState("");
const [learnUrl, setLearnUrl] = useState("");
const [learnText, setLearnText] = useState("");
const openLearn = useCallback(() => {
setLearnDir("");
setLearnUrl("");
setLearnText("");
setLearnOpen(true);
}, []);
const submitLearn = useCallback(() => {
const segs: string[] = [];
const dir = learnDir.trim();
const url = learnUrl.trim();
const text = learnText.trim();
if (dir) segs.push(`local source: ${dir}`);
if (url) segs.push(`URL: ${url}`);
if (text) segs.push(text);
// Flatten to a single line — the chat composer submits on the first Enter.
const composed = segs.join("; ").replace(/\s*\n\s*/g, " ").trim();
if (!composed) return;
setLearnOpen(false);
navigate(`/chat?learn=${encodeURIComponent(composed)}`);
}, [learnDir, learnUrl, learnText, navigate]);
const openEditEditor = useCallback((skillName: string) => {
setEditorSkill(skillName);
setEditorOpen(true);
@ -492,6 +524,14 @@ export default function SkillsPage() {
.replace("{count}", String(activeSkills.length))
.replace("{s}", activeSkills.length !== 1 ? "s" : "")}
</Badge>
<Button
size="sm"
outlined
onClick={openLearn}
prefix={<Sparkles />}
>
Learn a skill
</Button>
<Button
size="sm"
outlined
@ -630,6 +670,64 @@ export default function SkillsPage() {
onClose={() => setEditorOpen(false)}
onSaved={handleEditorSaved}
/>
<Dialog open={learnOpen} onOpenChange={setLearnOpen}>
<DialogContent className="max-w-lg">
<DialogHeader>
<DialogTitle>Learn a skill</DialogTitle>
<DialogDescription>
Point Hermes at anything and it will distill a reusable skill
following the house authoring standards. Fill in any combination
below; the agent gathers the sources and writes the skill in chat.
</DialogDescription>
</DialogHeader>
<div className="grid gap-3 py-2">
<div className="grid gap-1.5">
<label className="text-xs font-medium text-muted-foreground">
Local file or directory
</label>
<Input
placeholder="~/projects/some-sdk (read with read_file / search_files)"
value={learnDir}
onChange={(e) => setLearnDir(e.target.value)}
/>
</div>
<div className="grid gap-1.5">
<label className="text-xs font-medium text-muted-foreground">
URL
</label>
<Input
placeholder="https://docs.example.com/api (fetched with web_extract)"
value={learnUrl}
onChange={(e) => setLearnUrl(e.target.value)}
/>
</div>
<div className="grid gap-1.5">
<label className="text-xs font-medium text-muted-foreground">
Anything else: describe the workflow, paste notes, or say
"what we just did"
</label>
<textarea
className="min-h-[90px] w-full rounded-md border border-input bg-transparent px-3 py-2 text-sm shadow-sm focus-visible:outline-none focus-visible:ring-1 focus-visible:ring-ring"
placeholder="e.g. how I file an expense report: open the portal, …"
value={learnText}
onChange={(e) => setLearnText(e.target.value)}
/>
</div>
</div>
<div className="flex justify-end gap-2 pt-1">
<Button ghost onClick={() => setLearnOpen(false)}>
Cancel
</Button>
<Button
onClick={submitLearn}
prefix={<Sparkles />}
disabled={!learnDir.trim() && !learnUrl.trim() && !learnText.trim()}
>
Learn it
</Button>
</div>
</DialogContent>
</Dialog>
<PluginSlot name="skills:bottom" />
</div>
);


@ -89,6 +89,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/skills` | Search, install, inspect, or manage skills from online registries. Also the review surface for the skill write-approval gate: `/skills pending`, `/skills diff <id>`, `/skills approve <id>`, `/skills reject <id>`, `/skills approval on\|off`. See [Gating agent skill writes](/user-guide/features/skills#gating-agent-skill-writes-skillswrite_approval). |
| `/memory [pending\|approve\|reject\|approval]` | Review pending memory writes staged by the write-approval gate (`memory.write_approval`) and toggle the gate. See [Controlling memory writes](/user-guide/features/memory#controlling-memory-writes-write_approval). |
| `/bundles` | List configured skill bundles — `/<name>` slash aliases that preload several skills at once. Configure under `bundles:` in `~/.hermes/config.yaml`. See [Skill Bundles](/user-guide/features/skills#skill-bundles). |
| `/learn <what to learn from>` | Distill a reusable skill from anything you describe — a directory, a URL, the workflow you just walked the agent through, or pasted notes. Open-ended: the agent gathers the sources with its own tools and authors a `SKILL.md` following the house authoring standards. Works in the CLI, the messaging gateway, the TUI, and the dashboard Skills page. |
| `/cron` | Manage scheduled tasks (list, add/create, edit, pause, resume, run, remove) |
| `/suggestions [accept\|dismiss N\|catalog\|clear]` (alias: `/suggest`) | Review suggested automations. Use `/suggestions` to list pending suggestions, `/suggestions accept <id>` to create the proposed automation, `/suggestions dismiss <id>` to reject one, `/suggestions catalog` to add curated starter automations, and `/suggestions clear` to clear resolved suggestion records. Accepted jobs preserve the current surface as the delivery origin. |
| `/blueprint [name] [slot=value ...]` (alias: `/bp`) | Set up an automation from a blueprint template. Bare `/blueprint` lists the catalog; `/blueprint <name>` starts a guided slot-filling flow on the next agent turn; `/blueprint <name> slot=value ...` creates the job directly. |
@ -249,7 +250,7 @@ The messaging gateway supports the following built-in commands inside Telegram,
- `/skills` is **CLI-only for search/browse/install**; its write-approval review subcommands (`pending`, `approve`, `reject`, `diff`, `approval`) also work on messaging platforms when `skills.write_approval` is on. `/memory` works on **both** surfaces.
- `/verbose` is **CLI-only by default**, but can be enabled for messaging platforms by setting `display.tool_progress_command: true` in `config.yaml`. When enabled, it cycles the `display.tool_progress` mode and saves to config.
- `/sethome`, `/update`, `/restart`, `/approve`, `/deny`, `/topic`, `/platform`, and `/commands` are **messaging-only** commands.
-- `/status`, `/version`, `/background`, `/queue`, `/steer`, `/voice`, `/reload-mcp`, `/reload-skills`, `/rollback`, `/debug`, `/fast`, `/footer`, `/curator`, `/kanban`, `/credits`, `/suggestions`, `/blueprint`, `/sessions`, and `/yolo` work in **both** the CLI and the messaging gateway.
+- `/status`, `/version`, `/background`, `/queue`, `/steer`, `/voice`, `/reload-mcp`, `/reload-skills`, `/rollback`, `/debug`, `/fast`, `/footer`, `/curator`, `/kanban`, `/credits`, `/suggestions`, `/blueprint`, `/learn`, `/sessions`, and `/yolo` work in **both** the CLI and the messaging gateway.
- `/voice join`, `/voice channel`, and `/voice leave` are only meaningful on Discord.
- In the TUI, `/sessions` shows live sessions in the current TUI process. Use `/resume [name]` or `hermes --tui --resume <id-or-title>` for saved or closed transcripts.


@ -71,6 +71,42 @@ hermes chat --toolsets skills -q "What skills do you have?"
hermes chat --toolsets skills -q "Show me the axolotl skill"
```
## Learning a skill from sources (`/learn`)
`/learn` is the fast way to turn something you already know — or a pile of
reference material — into a reusable skill, without hand-writing the
`SKILL.md`. It is open-ended: point it at *anything you can describe* and the
agent gathers the material with the tools it already has, then authors a skill
that follows the [house authoring standards](#skillmd-format) (≤60-char
description, the standard section order, Hermes-tool framing, no invented
commands).
```bash
# A local SDK or doc directory — read with read_file / search_files
/learn the REST client in ~/projects/acme-sdk, focus on auth + pagination
# An online doc page — fetched with web_extract
/learn https://docs.example.com/api/quickstart
# The workflow you just walked the agent through in this conversation
/learn how I just deployed the staging server
# Pasted notes / a described procedure
/learn filing an expense: open the portal, New > Expense, attach the receipt, submit
```
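Everything after the command word is passed through as free text, so the examples above need no quoting or flags. The CLI handler in this change splits the input once on the first run of whitespace. A minimal standalone sketch of that step follows; the function name `split_learn_request` is illustrative and does not appear in the codebase.

```python
def split_learn_request(cmd: str) -> str:
    """Return the free text after the command word, or "" for a bare /learn."""
    # split(None, 1) splits once on any run of whitespace, so the request
    # itself keeps its internal spacing and punctuation.
    parts = cmd.strip().split(None, 1)
    return parts[1].strip() if len(parts) > 1 else ""
```

An empty result corresponds to a bare `/learn`, which the prompt builder treats as a request to distill the workflow from the current conversation.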
Because the live agent does the sourcing, `/learn` works the same in the CLI,
the messaging gateway, the TUI, and the dashboard — and on any terminal backend
(local, Docker, remote), since there is no separate ingestion engine. In the
**dashboard**, the Skills page has a **Learn a skill** button that opens a panel
with a directory field, a URL field, and an open-ended text box; it composes a
`/learn` request and runs it in chat.
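The panel's composition step is written in TypeScript. As an illustration only, here is a Python port of the same logic. It assumes the field labels the panel uses (`local source:` and `URL:`) and approximates the browser's `encodeURIComponent` with `urllib.parse.quote`; the function names are hypothetical.

```python
import re
from urllib.parse import quote


def compose_learn_request(directory: str = "", url: str = "", text: str = "") -> str:
    """Join the non-empty panel fields into one single-line request."""
    segments = []
    if directory.strip():
        segments.append(f"local source: {directory.strip()}")
    if url.strip():
        segments.append(f"URL: {url.strip()}")
    if text.strip():
        segments.append(text.strip())
    # Flatten newlines: the chat composer submits on the first Enter.
    return re.sub(r"\s*\n\s*", " ", "; ".join(segments)).strip()


def chat_link(composed: str) -> str:
    """Build the chat URL carrying the request in the ?learn= parameter."""
    # safe= lists the punctuation that encodeURIComponent leaves unescaped.
    return "/chat?learn=" + quote(composed, safe="-_.!~*'()")
```

Flattening to one line matters because the request is typed into the chat composer, where a newline would submit a partial command.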
There is no model-tool footprint: `/learn` builds a standards-guided prompt and
hands it to the agent as a normal turn. The agent saves the result with the
`skill_manage` tool, so the [write-approval gate](#gating-agent-skill-writes-skillswrite_approval)
applies if you have it on.
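To make "a normal turn" concrete: the TUI gateway in this change answers `/learn` with a `send` directive whose message is the built prompt, and the client submits that message like any typed input. Below is a reduced sketch of that shape. `stand_in_prompt` is a simplified placeholder for `build_learn_prompt`, and the wrapper `dispatch_learn` is hypothetical; only the `{"type": "send", "message": ...}` payload is taken from the change itself.

```python
from typing import Callable


def dispatch_learn(arg: str, build_prompt: Callable[[str], str]) -> dict:
    """Turn the /learn argument into a send directive for the chat client."""
    # No tool is registered and no engine runs here: the directive just
    # carries prompt text for the agent to process as an ordinary message.
    return {"type": "send", "message": build_prompt(arg)}


def stand_in_prompt(request: str) -> str:
    """Simplified stand-in for build_learn_prompt (assumption, not the real text)."""
    source = request.strip() or "the workflow from this conversation"
    return f"[/learn] WHAT TO LEARN FROM:\n{source}"
```

Because the result is ordinary message text, everything downstream (tool calls, approval gates, backends) behaves exactly as it would for a prompt the user typed by hand.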
## Progressive Disclosure
Skills use a token-efficient loading pattern: