mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
Merge c77175b7f7 into 05d8f11085
commit
bb11bc76a0
7 changed files with 590 additions and 1 deletions
@@ -1,3 +1,31 @@
---
-description: Skills for academic research, paper discovery, literature review, domain reconnaissance, market data, content monitoring, and scientific knowledge retrieval.
+description: Skills for academic research, paper discovery, literature review, domain reconnaissance, market data, content monitoring, and scientific knowledge retrieval. Includes autoresearch for autonomous, continuous research with iterative experimentation.
---

## Skill Overview

| Skill | Purpose | Best For |
|-------|---------|----------|
| `autoresearch` | Autonomous research orchestration | Continuous experimentation, hypothesis testing, benchmark optimization |
| `arxiv` | Search academic papers | Literature surveys, paper discovery |
| `research-paper-writing` | Write publication-ready papers | Final paper generation |
| `blogwatcher` | Monitor research blogs | Staying current with new developments |
| `llm-wiki` | LLM knowledge base | Quick reference on models and techniques |
| `polymarket` | Prediction market data | Research on market trends and predictions |

## Getting Started with Research

**Quick literature search:**
```
/arxiv "transformer attention mechanisms"
```

**Start autonomous research:**
```
/autoresearch "Does LoRA rank affect convergence speed?"
```

**Write a paper:**
```
/research-paper-writing
```
200
skills/research/autoresearch/SKILL.md
Normal file
@@ -0,0 +1,200 @@
---
name: autoresearch
description: Autonomous research orchestration for AI coding agents. Run continuous, self-directed research with a two-loop architecture — rapid inner-loop experiments and periodic outer-loop synthesis. Ideal for literature surveys, hypothesis testing, benchmark optimization, and iterative discovery. No human hand-holding required.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [Research, Autonomous, Experiments, ML, AI, Literature, Hypothesis, Benchmark, Optimization]
    related_skills: [arxiv, research-paper-writing, web-search, notebooklm]
    config:
      autoresearch.loop_interval_minutes:
        description: "Interval between autonomous research loops (in minutes)"
        default: 20
      autoresearch.max_iterations:
        description: "Maximum number of inner-loop experiments before forced reflection"
        default: 10
      autoresearch.auto_commit:
        description: "Automatically git-commit research milestones"
        default: true
---

# Autoresearch

**Autonomous research orchestration for AI coding agents.**

Run continuous, self-directed research with a two-loop architecture:
- **Inner Loop**: Rapid experiment iteration with clear measurable outcomes
- **Outer Loop**: Periodic synthesis, pattern discovery, and direction setting

Ideal for literature surveys, hypothesis testing, benchmark optimization, mechanistic interpretability studies, and any research requiring iterative experimentation.

## When to Use

| Scenario | Use Autoresearch? |
|----------|-------------------|
| "I want to explore X and see what works" | ✅ Yes |
| "Does technique Y improve metric Z?" | ✅ Yes |
| "What's the state of the art for problem W?" | ✅ Yes (bootstrap + literature) |
| "Train a model with specific hyperparameters" | ❌ Use domain skills directly |
| "Run a single evaluation" | ❌ Use evaluation skills directly |

## Quick Start

```bash
# Start a research project
/autoresearch "Does LoRA rank affect convergence speed on small datasets?"

# Or with the research tool
research_init(project="lora-rank-study", question="Does LoRA rank affect convergence speed?")
```

## The Two-Loop Architecture

```
BOOTSTRAP (once)
↓
INNER LOOP (fast, repeating) → Run experiments → Measure → Record → Learn
↓ (every N experiments or when stuck)
OUTER LOOP (reflective) → Synthesize → New hypotheses → Decide direction
↓
CONCLUDE → Write findings → Generate report
```

### Inner Loop: Experiment Fast

1. Pick highest-priority untested hypothesis
2. Write protocol (what change, what prediction, why)
3. **Lock it**: Commit to git BEFORE running
4. Run experiment (invoke domain skill)
5. Sanity check results (converged? baseline correct?)
6. Measure proxy metric
7. Record in `experiments/{hypothesis-slug}/`
8. Update `research-state.yaml`
9. If stuck → search literature or brainstorm

### Outer Loop: Step Back and Synthesize

1. Review all results since last reflection
2. Cluster by type: what worked? what didn't?
3. Ask WHY — identify mechanisms
4. Update `findings.md` with current understanding
5. Search literature if results surprise you
6. Generate new hypotheses if warranted
7. Decide direction: DEEPEN / BROADEN / PIVOT / CONCLUDE
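
The control flow of the two loops can be sketched in Python. This is an illustrative outline under stated assumptions, not the skill's actual implementation: `Hypothesis`, `ResearchState`, `research`, `run_experiment`, and `reflect` are hypothetical names, `max_iterations` mirrors the `autoresearch.max_iterations` setting, and `max_reflections` is a safety bound that the skill description does not mention.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """A hypothesis to test (hypothetical structure for illustration)."""
    id: str
    priority: int                   # lower number = higher priority
    result: Optional[float] = None  # None means "untested"


@dataclass
class ResearchState:
    hypotheses: List[Hypothesis] = field(default_factory=list)
    inner_loop_count: int = 0
    outer_loop_count: int = 0


def research(
    state: ResearchState,
    run_experiment: Callable[[Hypothesis], float],
    reflect: Callable[[ResearchState], str],
    max_iterations: int = 10,
    max_reflections: int = 5,  # assumed safety bound, not from the skill
) -> ResearchState:
    """Alternate inner-loop experiments with outer-loop reflection."""
    while state.outer_loop_count < max_reflections:
        # Inner loop: test untested hypotheses, highest priority first.
        for _ in range(max_iterations):
            untested = [h for h in state.hypotheses if h.result is None]
            if not untested:
                break  # nothing left to test, so fall through to reflection
            hypothesis = min(untested, key=lambda h: h.priority)
            hypothesis.result = run_experiment(hypothesis)
            state.inner_loop_count += 1

        # Outer loop: synthesize results and choose a direction.
        state.outer_loop_count += 1
        direction = reflect(state)  # DEEPEN | BROADEN | PIVOT | CONCLUDE
        if direction == "CONCLUDE":
            break
    return state
```

In this sketch `reflect` is where new hypotheses would be appended to `state.hypotheses` before returning DEEPEN, BROADEN, or PIVOT; if it adds none, the loop simply reflects again until `max_reflections` is reached.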

## Workspace Structure

```
{project}/
├── research-state.yaml    # Central state tracking
├── research-log.md        # Decision timeline
├── findings.md            # Evolving narrative synthesis
├── literature/            # Papers, survey notes
├── src/                   # Reusable code (utils, plotting)
├── data/                  # Raw result data
├── experiments/           # Per-hypothesis work
│   └── {hypothesis-slug}/
│       ├── protocol.md    # What, why, and prediction
│       ├── code/          # Experiment-specific code
│       ├── results/       # Raw outputs, metrics
│       └── analysis.md    # What we learned
├── to_human/              # Progress presentations
└── paper/                 # Final paper (optional)
```
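
As a rough illustration, the layout above could be scaffolded with `pathlib`. The directory and file names come from the tree; the function names `create_workspace` and `create_experiment`, and the choice to create empty placeholder files, are assumptions made for this sketch rather than documented behavior of the skill.

```python
from pathlib import Path

# Names taken from the workspace tree; everything else is illustrative.
TOP_LEVEL_DIRS = ["literature", "src", "data", "experiments", "to_human", "paper"]
TOP_LEVEL_FILES = ["research-state.yaml", "research-log.md", "findings.md"]


def create_workspace(root: Path, project: str) -> Path:
    """Create the project skeleton and return its path."""
    project_dir = root / project
    for name in TOP_LEVEL_DIRS:
        (project_dir / name).mkdir(parents=True, exist_ok=True)
    for name in TOP_LEVEL_FILES:
        (project_dir / name).touch(exist_ok=True)
    return project_dir


def create_experiment(project_dir: Path, hypothesis_slug: str) -> Path:
    """Create experiments/{hypothesis-slug}/ with its standard contents."""
    exp_dir = project_dir / "experiments" / hypothesis_slug
    (exp_dir / "code").mkdir(parents=True, exist_ok=True)
    (exp_dir / "results").mkdir(exist_ok=True)
    (exp_dir / "protocol.md").touch(exist_ok=True)
    (exp_dir / "analysis.md").touch(exist_ok=True)
    return exp_dir
```

Both functions are idempotent, so calling them on an existing project leaves earlier files untouched.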

## Research Discipline

### Lock Before You Run

Always commit your protocol to git BEFORE executing:

```bash
git add experiments/{hypothesis-slug}/protocol.md
git commit -m "research(protocol): H001 — cosine warmup improves convergence"
# THEN run the experiment
```

This creates a timestamped record showing that your plan existed before the results did.
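
The same ordering can be enforced in code. The sketch below wraps the two git commands shown above with `subprocess.run`; the function names `lock_protocol` and `locked_run` are hypothetical, and the injectable `run` parameter is an assumption added so the logic can be exercised without a real repository.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def lock_protocol(
    repo: Path,
    protocol: Path,
    message: str,
    run: Callable[..., object] = subprocess.run,
) -> List[List[str]]:
    """Stage and commit the protocol file; return the git commands issued."""
    commands = [
        ["git", "add", str(protocol)],
        ["git", "commit", "-m", message],
    ]
    for cmd in commands:
        # check=True raises CalledProcessError if git fails, so the
        # experiment never starts on top of an uncommitted protocol.
        run(cmd, cwd=repo, check=True)
    return commands


def locked_run(repo, protocol, message, experiment, run=subprocess.run):
    """Commit the protocol first, THEN call the experiment callable."""
    lock_protocol(repo, protocol, message, run=run)
    return experiment()
```

Note that `git commit` exits with an error when there is nothing new to commit, so with the default runner this sketch refuses to start an experiment whose protocol has not changed since the last commit.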

### Confirmatory vs Exploratory

| Type | Definition | Trust Level |
|------|------------|-------------|
| **Confirmatory** | Matches your locked protocol | High |
| **Exploratory** | Discovered during execution | Medium — needs replication |

### Negative Results Are Progress

A refuted hypothesis tells you something. Log what it rules out and what it suggests.

## Commands

| Command | Description |
|---------|-------------|
| `/autoresearch <question>` | Initialize and start research project |
| `/research-status` | Show current state and progress |
| `/research-pause` | Pause autonomous loops |
| `/research-resume` | Resume autonomous loops |
| `/research-report` | Generate progress presentation |
| `/research-conclude` | Finalize and write paper |

## Configuration

Add to `~/.hermes/config.yaml`:

```yaml
autoresearch:
  loop_interval_minutes: 20        # How often to check progress
  max_iterations: 10               # Experiments before forced reflection
  auto_commit: true                # Auto-commit milestones
  default_workspace: "./research"  # Where to create projects
```
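
One way to think about these settings is as user overrides layered on top of defaults. The sketch below assumes the config file has already been parsed into a plain dictionary (for example by a YAML parser); `AutoresearchConfig` and `resolve_config` are hypothetical names, and how Hermes itself loads this file is not shown in the source.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class AutoresearchConfig:
    # Defaults copied from the YAML example above.
    loop_interval_minutes: int = 20
    max_iterations: int = 10
    auto_commit: bool = True
    default_workspace: str = "./research"


def resolve_config(user_config: Mapping[str, Any]) -> AutoresearchConfig:
    """Overlay the `autoresearch` section of a parsed config on the defaults."""
    section = user_config.get("autoresearch") or {}
    known = {f.name for f in fields(AutoresearchConfig)}
    unknown = set(section) - known
    if unknown:
        # Fail loudly on typos instead of silently ignoring them.
        raise KeyError(f"unknown autoresearch settings: {sorted(unknown)}")
    return AutoresearchConfig(**section)
```

For example, `resolve_config({"autoresearch": {"max_iterations": 5}})` keeps the 20-minute interval and only changes the number of experiments before reflection.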

## Integration with Other Skills

| Research Phase | Skills to Invoke |
|----------------|------------------|
| Literature search | `arxiv`, `web-search`, `notebooklm` |
| Data preparation | `data-science` tools |
| Model training | `mlops`, domain-specific skills |
| Evaluation | `evaluating-llms-harness`, custom evals |
| Paper writing | `research-paper-writing` |
| Progress reports | Built-in report generation |

## Example: LoRA Rank Study

```
User: /autoresearch "Does LoRA rank affect convergence speed on small datasets?"

Agent:
1. Bootstraps: Searches arxiv for LoRA papers
2. Forms hypotheses: H1 (rank 4), H2 (rank 8), H3 (rank 16)
3. Inner loop: Trains 3 models, records convergence steps
4. Outer loop: Notices rank 8 converges fastest
5. Deepens: Tests rank 6, 10, 12
6. Concludes: Generates report with trajectory plot
```

## Best Practices

1. **Start simple**: First experiment should run in <30 minutes
2. **Define metrics upfront**: Lock evaluation criteria before running
3. **Return to literature**: When stuck or surprised, search papers
4. **Commit frequently**: Git history is your research log
5. **Show your work**: Generate progress reports for human review
6. **Never idle**: If blocked, diagnose, fix, or pivot — but keep moving

## References

- Inspired by Andrej Karpathy's autoresearch methodology
- Compatible with agentskills.io open standard
- Built-in templates from `templates/` directory

## See Also

- `templates/research-state.yaml` — State tracking template
- `templates/findings.md` — Synthesis template
- `templates/research-log.md` — Decision log template
- `examples/` — Example research projects
34
skills/research/autoresearch/examples/README.md
Normal file
@@ -0,0 +1,34 @@
# Autoresearch Examples

This directory contains example research projects using the autoresearch methodology.

## Available Examples

### `lora-rank-study.md`

**Question:** Does LoRA rank affect convergence speed on small datasets?

**Type:** Benchmark optimization, hyperparameter study

**Skills Used:**
- `arxiv` — Literature search
- `mlops` — Model training
- `tensorboard` — Experiment tracking

**Key Takeaway:** Higher rank improves convergence speed up to about r=16, after which returns diminish.

---

## Creating Your Own Research

1. Start with `/autoresearch "your question"`
2. Follow the two-loop architecture
3. Commit protocols before running
4. Generate progress reports with `/research-report`

## Tips from Examples

- **Start small:** First experiment should complete in <30 minutes
- **Define metrics upfront:** Know what you're measuring before you start
- **Document surprises:** Negative results are progress too
- **Show your work:** Progress reports help humans follow along
74
skills/research/autoresearch/examples/lora-rank-study.md
Normal file
@@ -0,0 +1,74 @@
# LoRA Rank Convergence Study

**Research Question:** Does LoRA rank affect convergence speed on small datasets?

## Bootstrap

### Literature

Key papers:
- Hu et al. (2021) — LoRA: Low-Rank Adaptation of Large Language Models
- Valipour et al. (2023) — DyLoRA: Parameter-Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation

Gap: Most papers focus on final performance, not convergence dynamics.

### Hypotheses

- **H1:** Higher rank (r=16) converges faster but may overfit on small data
- **H2:** Lower rank (r=4) converges slower but generalizes better
- **H3:** There's an optimal rank (r=8) that balances speed and generalization

## Experiments

### H001 — Baseline (r=8)

```bash
# Protocol: Train with rank 8, measure convergence steps to 90% of max accuracy
# Prediction: Baseline behavior, ~50 steps to converge
```

**Results:**
- Convergence steps: 47
- Final accuracy: 0.892
- Wall time: 12 min
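
The "convergence steps to 90% of max accuracy" metric named in the H001 protocol can be sketched as follows. This is one plausible reading, not the study's actual measurement code: it assumes accuracy is evaluated once per step and that "max accuracy" means the best accuracy reached within the same run. The function name `convergence_step` is hypothetical.

```python
from typing import Optional, Sequence


def convergence_step(accuracies: Sequence[float], fraction: float = 0.9) -> Optional[int]:
    """Return the 1-based step at which accuracy first reaches
    `fraction` of the run's maximum accuracy, or None for an empty run."""
    if not accuracies:
        return None
    threshold = fraction * max(accuracies)
    for step, acc in enumerate(accuracies, start=1):
        if acc >= threshold:
            return step
    return None  # not reached when fraction <= 1; kept as a safeguard
```

Because the threshold depends on each run's own maximum, two runs with different final accuracies are compared on how quickly they approach their own plateau, not on a shared absolute target.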

### H002 — Low Rank (r=4)

**Results:**
- Convergence steps: 68 (+45% vs baseline)
- Final accuracy: 0.887 (-0.6%)

### H003 — High Rank (r=16)

**Results:**
- Convergence steps: 41 (-13% vs baseline)
- Final accuracy: 0.894 (+0.2%)

## Outer Loop #1

**Pattern:** Higher rank → faster convergence, minimal overfit on this dataset

**Decision:** DEEPEN — Test r=32 and r=64 to find saturation point

### H004 — Very High Rank (r=32)

**Results:**
- Convergence steps: 38 (-7% vs r=16)
- Final accuracy: 0.891 (-0.3%)
- **Diminishing returns observed**
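
The percentage deltas in these results are relative changes against a reference run. A minimal sketch of that arithmetic, using the convergence-step counts reported above (the helper name `relative_change` is hypothetical):

```python
def relative_change(value: float, reference: float) -> float:
    """Percentage change of `value` relative to `reference`."""
    return (value - reference) / reference * 100.0


# Convergence steps reported above: r=8 -> 47, r=4 -> 68, r=16 -> 41, r=32 -> 38.
for label, value, reference in [
    ("r=4 vs r=8", 68, 47),
    ("r=16 vs r=8", 41, 47),
    ("r=32 vs r=16", 38, 41),
]:
    print(f"{label}: {relative_change(value, reference):+.1f}%")
```

This prints +44.7%, -12.8%, and -7.3% for the three comparisons.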

### H005 — Optimal Search (r=6, r=10, r=12)

[Running...]

## Current Findings

1. Convergence speed improves with rank up to r=16, then plateaus
2. Final accuracy relatively stable across ranks (±0.5%)
3. For small datasets, r=8-12 appears optimal (speed vs compute tradeoff), pending the H005 results for r=10 and r=12

## Next Steps

- Complete H005-H007
- Test on different dataset sizes (generalization)
- Write up findings
93
skills/research/autoresearch/templates/findings.md
Normal file
@@ -0,0 +1,93 @@
# Findings: {{PROJECT_NAME}}

**Research Question:** {{RESEARCH_QUESTION}}

**Last Updated:** {{LAST_UPDATED}}

---

## Current Understanding

### What We Know So Far

[Summarize the current state of knowledge. 2-4 paragraphs.]

### Key Patterns and Insights

| Pattern | Evidence | Confidence |
|---------|----------|------------|
| [Pattern 1] | [Which experiments support this] | High/Medium/Low |
| [Pattern 2] | [Which experiments support this] | High/Medium/Low |

### Mechanistic Understanding

[If applicable: What mechanisms explain the results?]

---

## Lessons and Constraints

### What Works

- [Specific finding with context]
- [Another finding]

### What Doesn't Work

- [Failed approach and why]
- [Constraint discovered]

### Critical Parameters

| Parameter | Sweet Spot | Why |
|-----------|------------|-----|
| [Param 1] | [Value/range] | [Explanation] |

---

## Open Questions

### High Priority

1. [Question that would change the story if answered]
2. [Another critical question]

### Medium Priority

1. [Nice to know but not blocking]

### Answered

1. ~~[Question]~~ → Answer: [Brief answer with evidence]

---

## Narrative Arc

[For paper writing: What's the story? What would the abstract say?]

### Contribution Sketch

[1-2 sentences on what this research contributes]

### Implications

[Who cares? Why does this matter?]

---

## Next Steps

### Immediate (Next 1-3 Experiments)

- [ ] [Specific experiment]
- [ ] [Another experiment]

### Medium Term

- [ ] [Broader direction]

### If Current Direction Fails

- [Pivot option 1]
- [Pivot option 2]
90
skills/research/autoresearch/templates/research-log.md
Normal file
@@ -0,0 +1,90 @@
# Research Log: {{PROJECT_NAME}}

**Research Question:** {{RESEARCH_QUESTION}}

---

## Bootstrap Phase

### {{DATE}} — Project Initialization

- **Action:** Created workspace, initialized state files
- **Research Question:** {{RESEARCH_QUESTION}}
- **Initial Thoughts:** [What makes this interesting?]

### {{DATE}} — Literature Search

- **Sources:** arxiv, semantic scholar, web search
- **Key Papers:**
  - [Paper 1] — [Key finding relevant to question]
  - [Paper 2] — [Key finding]
- **Gap Identified:** [What's missing in existing work?]

### {{DATE}} — Hypothesis Formation

- **H001:** [Description] → Prediction: [Specific prediction]
- **H002:** [Description] → Prediction: [Specific prediction]
- **H003:** [Description] → Prediction: [Specific prediction]

---

## Inner Loop Log

### {{DATE}} — Experiment H001

- **Hypothesis:** H001
- **Protocol:** [What was changed, what was predicted]
- **Git Commit:** `research(protocol): H001 — [description]`
- **Status:** [Running/Completed/Failed]
- **Results:**
  - Metric: [Value]
  - Baseline: [Value]
  - Delta: [+/-X]
- **Interpretation:** [What this means]
- **Next Action:** [Continue/Adjust/Pivot]

### {{DATE}} — Experiment H002

[Same format...]

---

## Outer Loop Log

### {{DATE}} — Reflection #1 (After {{N}} experiments)

- **Experiments Reviewed:** H001-H00N
- **Patterns Observed:**
  - [Pattern 1]
  - [Pattern 2]
- **Updated Understanding:** [New insights]
- **Direction Decision:** [DEEPEN/BROADEN/PIVOT/CONCLUDE]
- **Rationale:** [Why this direction?]
- **New Hypotheses:**
  - H00N+1: [Description]
  - H00N+2: [Description]

---

## Direction Changes

### {{DATE}} — PIVOT: [New Direction]

- **From:** [Old direction/assumption]
- **To:** [New direction]
- **Trigger:** [What result/surprise caused this?]
- **New Research Question:** [If changed]

---

## Conclusion

### {{DATE}} — Research Concluded

- **Final Status:** [Completed/Partial/Abandoned]
- **Key Findings:**
  1. [Finding 1]
  2. [Finding 2]
- **Contribution:** [What this adds to the field]
- **Limitations:** [What we didn't test/couldn't conclude]
- **Future Work:** [What someone should do next]
70
skills/research/autoresearch/templates/research-state.yaml
Normal file
@@ -0,0 +1,70 @@
# Research State
# Central tracking file for autoresearch project
# Updated automatically by the agent — do not edit manually

project:
  name: "{{PROJECT_NAME}}"
  created_at: "{{CREATED_AT}}"
  research_question: "{{RESEARCH_QUESTION}}"
  status: "bootstrapping"  # bootstrapping | active | paused | concluding | completed

bootstrap:
  literature_searched: false
  initial_hypotheses_formed: false
  evaluation_metric_defined: false
  baseline_established: false

loops:
  inner_loop_count: 0
  outer_loop_count: 0
  last_inner_loop_at: null
  last_outer_loop_at: null

direction:
  current: "explore"  # explore | deepen | broaden | pivot | conclude
  rationale: "Initial exploration phase"
  next_milestone: "Complete first 3 experiments"

hypotheses:
  # Example structure — replace with actual hypotheses
  H001:
    description: "{{HYPOTHESIS_DESCRIPTION}}"
    status: "untested"  # untested | running | completed | refuted | supported
    prediction: "{{PREDICTION}}"
    priority: 1
    created_at: "{{CREATED_AT}}"
    completed_at: null
    result_summary: null
    experiment_slug: null

metrics:
  primary: "{{PRIMARY_METRIC}}"  # e.g., "val_loss", "accuracy", "convergence_steps"
  baseline_value: null
  target_value: null
  current_best: null
  optimization_direction: "minimize"  # minimize | maximize

trajectory:
  # Auto-populated from experiments
  # - experiment_id: run_001
  #   hypothesis: H001
  #   metric_value: 0.847
  #   baseline: 0.812
  #   delta: "+0.035"
  #   wall_time_min: 23
  #   change_summary: "Added cosine annealing"

resources:
  literature_papers: []
  related_work_notes: null
  code_references: []

continuity:
  last_session_at: "{{CREATED_AT}}"
  next_scheduled_loop: null
  current_experiment: null
  pending_tasks: []

notes: |
  # Agent notes — context for next session
  # What was I doing? What's next?