chore(skills/darwinian-evolver): AUTHOR_MAP + docs regen

2026-05-18 04:41:56 +00:00 · 2026-05-15 21:55:01 -07:00 · 2026-05-15 21:55:01 -07:00 · 53637fb17d
commit 53637fb17d
parent c9b32a654c
4 changed files with 220 additions and 0 deletions
--- a/website/docs/reference/optional-skills-catalog.md
+++ b/website/docs/reference/optional-skills-catalog.md
@@ -161,6 +161,7 @@ hermes skills uninstall <skill-name>
 | Skill | Description |
 |-------|-------------|
 | [**bioinformatics**](/docs/user-guide/skills/optional/research/research-bioinformatics) | Gateway to 400+ bioinformatics skills from bioSkills and ClawBio. Covers genomics, transcriptomics, single-cell, variant calling, pharmacogenomics, metagenomics, structural biology, and more. Fetches domain-specific reference material on... |
+| [**darwinian-evolver**](/docs/user-guide/skills/optional/research/research-darwinian-evolver) | Evolve prompts/regex/SQL/code with Imbue's evolution loop. |
 | [**domain-intel**](/docs/user-guide/skills/optional/research/research-domain-intel) | Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, domain availability checks, and bulk multi-domain analysis. No API keys required. |
 | [**drug-discovery**](/docs/user-guide/skills/optional/research/research-drug-discovery) | Pharmaceutical research assistant for drug discovery workflows. Search bioactive compounds on ChEMBL, calculate drug-likeness (Lipinski Ro5, QED, TPSA, synthetic accessibility), look up drug-drug interactions via OpenFDA, interpret ADMET... |
 | [**duckduckgo-search**](/docs/user-guide/skills/optional/research/research-duckduckgo-search) | Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Prefer the `ddgs` CLI when installed; use the Python DDGS library only after verifying that `ddgs` is available in the current runtime. |
--- a/website/docs/user-guide/skills/optional/research/research-darwinian-evolver.md
+++ b/website/docs/user-guide/skills/optional/research/research-darwinian-evolver.md
@@ -0,0 +1,217 @@
+---
+title: "Darwinian Evolver — Evolve prompts/regex/SQL/code with Imbue's evolution loop"
+sidebar_label: "Darwinian Evolver"
+description: "Evolve prompts/regex/SQL/code with Imbue's evolution loop"
+---
+
+{/* This page is auto-generated from the skill's SKILL.md by website/scripts/generate-skill-docs.py. Edit the source SKILL.md, not this page. */}
+
+# Darwinian Evolver
+
+Evolve prompts/regex/SQL/code with Imbue's evolution loop.
+
+## Skill metadata
+
+| | |
+|---|---|
+| Source | Optional — install with `hermes skills install official/research/darwinian-evolver` |
+| Path | `optional-skills/research/darwinian-evolver` |
+| Version | `0.1.0` |
+| Author | Bihruze (Asahi0x), Hermes Agent |
+| License | MIT |
+| Platforms | linux, macos |
+| Tags | `evolution`, `optimization`, `prompt-engineering`, `research` |
+| Related skills | [`arxiv`](/docs/user-guide/skills/bundled/research/research-arxiv), [`jupyter-live-kernel`](/docs/user-guide/skills/bundled/data-science/data-science-jupyter-live-kernel) |
+
+## Reference: full SKILL.md
+
+:::info
+The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.
+:::
+
+# Darwinian Evolver
+
+Run Imbue's [darwinian_evolver](https://github.com/imbue-ai/darwinian_evolver) — an
+LLM-driven evolutionary search loop — to optimize a **prompt, regex, SQL query,
+or small code snippet** against a fitness function.
+
+Status: thin wrapper around the upstream tool. The skill installs it, walks the
+agent through writing a `Problem` definition (organism + evaluator + mutator),
+and drives the loop via the upstream CLI or a small custom Python driver.
+
+**License:** the upstream tool is **AGPL-3.0**. The skill ONLY ever invokes it
+via the upstream CLI or a `subprocess`/`uv run` call (mere aggregation). Do NOT
+import upstream classes into Hermes itself.
+
+## When to Use
+
+- User says "optimize this prompt", "evolve a regex for X", "auto-improve this
+  code/SQL", "search for a better instruction".
+- You have a scorer (exact match, regex pass-rate, unit test, LLM-judge, runtime
+  metric) AND a starting candidate (organism). If you don't have a scorer, stop
+  and define one first — that's the hard part.
+- The cost is acceptable: a typical run makes 50–500 LLM calls. On gpt-4o-mini
+  that costs pennies; on Claude Sonnet it can reach a few dollars.
+
+Do **not** use this when:
+- The optimization target is differentiable (use gradient descent / DSPy).
+- You only need to try 2–3 variants — just write them by hand.
+- The fitness signal is purely subjective with no measurable criterion.
+
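A minimal sketch of the kind of scorer the "When to Use" list asks for, assuming exact-match grading over a handful of labeled cases. The names `exact_match_score` and `shout` are hypothetical and not part of darwinian_evolver; the only property that matters is a deterministic number in `[0, 1]` where higher is better.

```python
from typing import Callable, Sequence


def exact_match_score(
    candidate: Callable[[str], str],
    cases: Sequence[tuple[str, str]],
) -> float:
    """Return the fraction of (input, expected) cases reproduced exactly."""
    if not cases:
        return 0.0
    passed = 0
    for given, expected in cases:
        try:
            if candidate(given) == expected:
                passed += 1
        except Exception:
            # A candidate that raises simply fails that case.
            pass
    return passed / len(cases)


def shout(text: str) -> str:
    """Hypothetical candidate under test: upper-cases its input."""
    return text.upper()


cases = [("hi", "HI"), ("ok", "OK"), ("Bye", "bye"), ("a b", "A B")]
score = exact_match_score(shout, cases)
print(score)
```

If no function of this shape can be written for the task, the fitness signal is probably too subjective for an evolutionary search.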
+## Prerequisites
+
+- Python ≥3.11
+- `git`, `uv` (or `pip`)
+- One of: `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, or `OPENAI_API_KEY`
+
+The skill ships a small `parrot_openrouter.py` driver that uses `OPENROUTER_API_KEY`
+via the OpenAI SDK, so any model on OpenRouter works. The upstream CLI itself
+hardcodes Anthropic and needs `ANTHROPIC_API_KEY`.
+
+## Install (One-Time)
+
+Run via the `terminal` tool:
+
+```bash
+mkdir -p ~/.hermes/cache/darwinian-evolver && cd ~/.hermes/cache/darwinian-evolver
+[ -d darwinian_evolver ] || git clone --depth 1 https://github.com/imbue-ai/darwinian_evolver.git
+cd darwinian_evolver && uv sync
+```
+
+Verify:
+
+```bash
+cd ~/.hermes/cache/darwinian-evolver/darwinian_evolver \
+  && uv run darwinian_evolver --help | head -5
+```
+
+## Quick Start — The Built-In Parrot Example
+
+Tiny smoke test (requires `ANTHROPIC_API_KEY`):
+
+```bash
+cd ~/.hermes/cache/darwinian-evolver/darwinian_evolver
+uv run darwinian_evolver parrot \
+  --num_iterations 2 \
+  --num_parents_per_iteration 2 \
+  --mutator_concurrency 2 --evaluator_concurrency 2 \
+  --output_dir /tmp/parrot_demo
+```
+
+Outputs:
+- `/tmp/parrot_demo/snapshots/iteration_N.pkl` — pickled population per iteration
+- `/tmp/parrot_demo/<jsonl>` — per-iteration JSON log (path printed at end)
+
+Open `~/.hermes/cache/darwinian-evolver/darwinian_evolver/darwinian_evolver/lineage_visualizer.html`
+in a browser and load the JSON log to see the evolutionary tree.
+
+## Quick Start — OpenRouter Driver (No Anthropic Key)
+
+The skill ships `scripts/parrot_openrouter.py` — same parrot problem, but the
+LLM call goes through OpenRouter so any provider works.
+
+```bash
+# From wherever the skill is installed:
+SKILL_DIR=~/.hermes/skills/research/darwinian-evolver
+DE_DIR=~/.hermes/cache/darwinian-evolver/darwinian_evolver
+
+cd "$DE_DIR" && \
+  EVOLVER_MODEL='openai/gpt-4o-mini' \
+  uv run --with openai python "$SKILL_DIR/scripts/parrot_openrouter.py" \
+    --num_iterations 3 --num_parents_per_iteration 2 \
+    --output_dir /tmp/parrot_or
+```
+
+Inspect the result with `scripts/show_snapshot.py` (run it from `$DE_DIR` in the same shell, so `SKILL_DIR` is still set):
+
+```bash
+uv run --with openai python "$SKILL_DIR/scripts/show_snapshot.py" \
+  /tmp/parrot_or/snapshots/iteration_3.pkl
+```
+
+Expected output: 7 evolved prompt templates ranked by score, with the best
+landing around 0.6–0.8 (the seed `Say {{ phrase }}` scored 0.000).
+
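A sketch of the two-stage load that inspecting a snapshot involves, assuming the nested layout described under Pitfalls: an outer dict whose `population_snapshot` value is itself pickled bytes. The inner payload below is stand-in data written by the example itself, and `load_snapshot` is a hypothetical helper, not the shipped `show_snapshot.py`. With real snapshots the `Organism` class must be importable for the second `pickle.loads` to succeed.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in inner payload; a real snapshot holds upstream population objects.
inner = {"organisms": ["Say {{ phrase }}", "Repeat exactly: {{ phrase }}"]}

# Outer layer: a dict whose `population_snapshot` value is pickled bytes.
outer = {"population_snapshot": pickle.dumps(inner)}

snapshot_path = Path(tempfile.mkdtemp()) / "iteration_0.pkl"
snapshot_path.write_bytes(pickle.dumps(outer))


def load_snapshot(path: Path):
    """Unpickle the outer dict, then the nested population bytes."""
    with path.open("rb") as fh:
        outer_obj = pickle.load(fh)
    return pickle.loads(outer_obj["population_snapshot"])


population = load_snapshot(snapshot_path)
print(population["organisms"])
```

Unpickling executes code embedded in the file, so only load snapshots produced by your own runs.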
+## Defining a Custom Problem
+
+The skill ships `templates/custom_problem_template.py` — copy, edit, run.
+Three things you must define:
+
+1. **`Organism`** — a Pydantic `BaseModel` subclass holding the artifact being
+   evolved (`prompt_template: str`, `regex_pattern: str`, `sql_query: str`,
+   `code_block: str`, etc.). Add a `run(*args)` method that exercises it.
+
+2. **`Evaluator`** — `.evaluate(organism) -> EvaluationResult(score=..., trainable_failure_cases=[...], holdout_failure_cases=[...], is_viable=True)`.
+   - **`score`** is in `[0, 1]`. Higher is better.
+   - **`trainable_failure_cases`** — what the mutator sees. Include enough
+     context (input, expected, actual) for the LLM to diagnose.
+   - **`holdout_failure_cases`** — kept out of the mutator's view. Use these
+     to detect overfitting.
+   - **`is_viable=True`** unless the organism is completely broken (raises,
+     returns None, etc.). A 0-score viable organism is fine — it just gets
+     down-weighted in parent selection.
+
+3. **`Mutator`** — `.mutate(organism, failure_cases, learning_log_entries) -> list[Organism]`.
+   Typically: build an LLM prompt that includes the current organism, a
+   failure case, and a request to propose a fix; parse the LLM's response; return
+   a list containing the new `Organism`. Return `[]` on parse failure; the loop handles it.
+
+Then write a driver script that wires `Problem(initial_organism, evaluator, [mutators])`
+into `EvolveProblemLoop` and iterates over `loop.run(num_iterations=N)` — the
+shipped `scripts/parrot_openrouter.py` is the reference.
+
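The three pieces above can be sketched without touching upstream code. This is an illustration only: it uses plain dataclasses as stand-ins for the upstream Pydantic classes, and `RegexOrganism`, `Result`, `evaluate`, `mutate`, and `fake_llm` are hypothetical names. Take the real base classes and signatures from `templates/custom_problem_template.py`.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RegexOrganism:
    """Stand-in for an Organism: holds the artifact being evolved."""
    regex_pattern: str

    def run(self, text: str) -> bool:
        return re.fullmatch(self.regex_pattern, text) is not None


@dataclass
class Result:
    """Stand-in mirroring the EvaluationResult fields named in the text."""
    score: float
    trainable_failure_cases: list = field(default_factory=list)
    holdout_failure_cases: list = field(default_factory=list)
    is_viable: bool = True


# (input, should_match) pairs; the mutator only ever sees TRAIN failures.
TRAIN = [("2024-01-31", True), ("2024-1-31", False), ("20240131", False)]
HOLDOUT = [("1999-12-01", True), ("1999/12/01", False)]


def evaluate(org: RegexOrganism) -> Result:
    try:
        re.compile(org.regex_pattern)
    except re.error:
        # A pattern that does not compile is completely broken: not viable.
        return Result(score=0.0, is_viable=False)

    def failures(cases):
        out = []
        for text, expected in cases:
            actual = org.run(text)
            if actual != expected:
                out.append({"input": text, "expected": expected, "actual": actual})
        return out

    train_fail, hold_fail = failures(TRAIN), failures(HOLDOUT)
    total = len(TRAIN) + len(HOLDOUT)
    passed = total - len(train_fail) - len(hold_fail)
    return Result(passed / total, train_fail, hold_fail)


def mutate(org, failure_cases, learning_log_entries, llm: Callable[[str], str]) -> list:
    prompt = (
        f"Pattern: {org.regex_pattern}\n"
        f"Failing case: {failure_cases[:1]}\n"
        "Reply with one line: PATTERN: <regex>"
    )
    reply = llm(prompt)
    match = re.search(r"^PATTERN:\s*(.+)$", reply, flags=re.MULTILINE)
    if match is None:
        return []  # parse failure: return no children
    return [RegexOrganism(match.group(1).strip())]


def fake_llm(prompt: str) -> str:
    """Canned reply standing in for a real model call."""
    return r"PATTERN: \d{4}-\d{2}-\d{2}"


seed = RegexOrganism(r"\d+-\d+-\d+")
seed_result = evaluate(seed)
children = mutate(seed, seed_result.trainable_failure_cases, [], fake_llm)
child_result = evaluate(children[0])
print(seed_result.score, child_result.score)
```

The seed wrongly accepts `2024-1-31`, so that case lands in `trainable_failure_cases` and gives the mutator a concrete input, expected value, and actual value to work from.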
+## Hyperparameters That Actually Matter
+
+| flag | default | when to change |
+|---|---|---|
+| `--num_iterations` | 5 | bump to 10–20 once you trust the evaluator |
+| `--num_parents_per_iteration` | 4 | drop to 2 for cheap exploration |
+| `--mutator_concurrency` | 10 | drop to 2–4 to avoid rate limits |
+| `--evaluator_concurrency` | 10 | same; evaluator hits the LLM too |
+| `--batch_size` | 1 | raise to 3–5 once your mutator handles multiple failures |
+| `--verify_mutations` | off | turn on if the mutator proves wasteful; Imbue reports >10× cost savings on later runs |
+| `--midpoint_score` | `p75` | leave alone unless scores cluster |
+| `--sharpness` | 10 | leave alone |
+
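For intuition about the last two rows, the sketch below assumes that `--midpoint_score` and `--sharpness` parametrize a logistic (sigmoid) weighting of scores during parent selection. That is an assumption about upstream behavior, not something this page states, and `selection_weight` is a hypothetical function. It illustrates why a viable 0-score organism is down-weighted rather than excluded, and why tightly clustered scores weaken the selection signal.

```python
import math
import statistics


def selection_weight(score: float, midpoint: float, sharpness: float = 10.0) -> float:
    """Logistic weight: 0.5 at the midpoint, steeper as sharpness grows."""
    return 1.0 / (1.0 + math.exp(-sharpness * (score - midpoint)))


# Spread-out population: p75 as midpoint separates strong from weak parents.
scores = [0.0, 0.2, 0.4, 0.6, 0.8]
midpoint = statistics.quantiles(scores, n=4, method="inclusive")[2]  # 75th percentile
weights = [selection_weight(s, midpoint) for s in scores]

# Clustered population: every weight stays close to 0.5.
clustered = [0.70, 0.71, 0.72]
clustered_weights = [selection_weight(s, 0.71) for s in clustered]

for s, w in zip(scores, weights):
    print(f"score={s:.1f} weight={w:.3f}")
```

Under this assumption, the 0-score organism keeps a small non-zero weight, while the clustered population gets nearly uniform weights, which is the situation where revisiting the midpoint would make sense.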
+## Pitfalls
+
+1. **`Initial organism must be viable`** — set `is_viable=True` in your
+   `EvaluationResult` even on a 0-score seed. The loop rejects a non-viable
+   initial organism because it would have nothing to evolve from.
+2. **Provider content filters kill runs.** Azure-backed OpenRouter models
+   reject phrases like "ignore previous instructions" with HTTP 400. Wrap
+   the LLM call in `try/except` and return `f"<LLM_ERROR: {e}>"` — the
+   evolver will just score that organism 0 and move on.
+3. **`loop.run()` is a generator** — calling it doesn't run anything until
+   you iterate. Use `for snap in loop.run(num_iterations=N):`.
+4. **Snapshots are nested pickles.** `iteration_N.pkl` contains a dict with
+   `population_snapshot` (more pickled bytes). To unpickle you must have the
+   `Organism` class importable under the same dotted path it was pickled at.
+5. **Concurrency defaults are aggressive.** 10/10 will hit rate limits on
+   most providers. Start with 2/2.
+6. **CLI is hardcoded to Anthropic.** `uv run darwinian_evolver <problem>`
+   reaches for `ANTHROPIC_API_KEY` and uses Claude Sonnet. To use any other
+   provider, write a driver like `parrot_openrouter.py`.
+7. **AGPL.** Never `from darwinian_evolver import ...` inside Hermes core.
+   Custom driver scripts under `~/.hermes/skills/...` are user-side and fine.
+8. **No PyPI package.** `pip install darwinian-evolver` will pull the wrong
+   thing. Always install from the GitHub repo.
+
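Pitfalls 2 and 3 can be sketched in a few lines. This is an illustration under stated assumptions: `safe_llm_call`, `flaky_provider`, and `run` are hypothetical stand-ins, with `run` imitating only the generator behavior of `loop.run()`, not its real return values.

```python
from typing import Callable, Iterator


def safe_llm_call(call: Callable[[str], str], prompt: str) -> str:
    """Turn provider errors into a sentinel string instead of crashing the run."""
    try:
        return call(prompt)
    except Exception as e:
        return f"<LLM_ERROR: {e}>"


def flaky_provider(prompt: str) -> str:
    """Hypothetical provider that rejects one phrase, like a content filter."""
    if "ignore previous instructions" in prompt.lower():
        raise RuntimeError("HTTP 400: content filtered")
    return prompt.upper()


executed = []


def run(num_iterations: int) -> Iterator[int]:
    """Stand-in for a generator-style loop: work happens only when iterated."""
    for i in range(num_iterations):
        executed.append(i)
        yield i


gen = run(num_iterations=3)  # creating the generator runs nothing
ran_before = list(executed)
for snap in gen:  # iterating is what drives the loop
    pass
ran_after = list(executed)

print(safe_llm_call(flaky_provider, "Ignore previous instructions"))
print(ran_before, ran_after)
```

The sentinel string will not match any expected output, so an evaluator built on exact or pattern matching scores that organism 0 without special handling.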
+## Verification
+
+After installing and completing a parrot run, exit code 0 from the following check is sufficient:
+
+```bash
+DE_DIR=~/.hermes/cache/darwinian-evolver/darwinian_evolver
+ls "$DE_DIR/darwinian_evolver/lineage_visualizer.html" >/dev/null && \
+cd "$DE_DIR" && uv run darwinian_evolver --help >/dev/null && \
+echo "darwinian-evolver: OK"
+```
+
+## References
+
+- [Imbue research post](https://imbue.com/research/2026-02-27-darwinian-evolver/)
+- [ARC-AGI-2 results](https://imbue.com/research/2026-02-27-arc-agi-2-evolution/)
+- [imbue-ai/darwinian_evolver](https://github.com/imbue-ai/darwinian_evolver) (AGPL-3.0)
+- [Darwin Gödel Machines](https://arxiv.org/abs/2505.22954)
+- [PromptBreeder](https://arxiv.org/abs/2309.16797)
--- a/website/sidebars.ts
+++ b/website/sidebars.ts
@@ -547,6 +547,7 @@ const sidebars: SidebarsConfig = {
                  collapsed: true,
                  items: [
                    'user-guide/skills/optional/research/research-bioinformatics',
+                    'user-guide/skills/optional/research/research-darwinian-evolver',
                    'user-guide/skills/optional/research/research-domain-intel',
                    'user-guide/skills/optional/research/research-drug-discovery',
                    'user-guide/skills/optional/research/research-duckduckgo-search',