* feat(config): make tool output truncation limits configurable
Port from anomalyco/opencode#23770: expose a new `tool_output` config
section so users can tune the hardcoded truncation caps that apply to
terminal output and read_file pagination.
Three knobs under `tool_output`:
- max_bytes (default 50_000) — terminal stdout/stderr cap
- max_lines (default 2000) — read_file pagination cap
- max_line_length (default 2000) — per-line cap in line-numbered view
All three keep their existing hardcoded values as defaults, so behaviour
is unchanged when the section is absent. Power users on big-context
models can raise them; small-context local models can lower them.
Implementation:
- New `tools/tool_output_limits.py` reads the section with defensive
fallback (missing/invalid values → defaults, never raises).
- `tools/terminal_tool.py` MAX_OUTPUT_CHARS now comes from
get_max_bytes().
- `tools/file_operations.py` normalize_read_pagination() and
_add_line_numbers() now pull the limits at call time.
- `hermes_cli/config.py` DEFAULT_CONFIG gains the `tool_output` section
so `hermes setup` writes defaults into fresh configs.
- Docs page `user-guide/configuration.md` gains a "Tool Output
Truncation Limits" section with large-context and small-context
example configs.
Tests (18 new in tests/tools/test_tool_output_limits.py):
- Default resolution with missing / malformed / non-dict config.
- Full and partial user overrides.
- Coercion of bad values (None, negative, wrong type, str int).
- Shortcut accessors delegate correctly.
- DEFAULT_CONFIG exposes the section with the right defaults.
- Integration: normalize_read_pagination clamps to the configured
max_lines.
* feat(skills): add design-md skill for Google's DESIGN.md spec
Built-in skill under skills/creative/ that teaches the agent to author,
lint, diff, and export DESIGN.md files — Google's open-source
(Apache-2.0) format for describing a visual identity to coding agents.
Covers:
- YAML front matter + markdown body anatomy
- Full token schema (colors, typography, rounded, spacing, components)
- Canonical section order + duplicate-heading rejection
- Component property whitelist + variants-as-siblings pattern
- CLI workflow via 'npx @google/design.md' (lint/diff/export/spec)
- Lint rule reference including WCAG contrast checks
- Common YAML pitfalls (quoted hex, negative dimensions, dotted refs)
- Starter template at templates/starter.md
Package verified live on npm (@google/design.md@0.1.1).
* fix(skills/baoyu-comic): require absolute paths for curl -o downloads
When downloading generated images across several batches of image_generate
calls, relying on persistent-shell CWD is unsafe. The terminal tool's shell
can rotate (TERMINAL_LIFETIME_SECONDS expiry, a failed cd that leaves the
shell somewhere else), and 'curl -fsSL <url> -o relative.png' then silently
writes to the wrong directory with no error.
Update the skill's Step 7 Download step to require absolute -o paths (or
workdir= on the terminal tool) and add a matching pitfall entry referencing
the Apr 2026 incident where pages 06-09 of a 10-page comic landed at the
repo root instead of comic/<slug>/. The agent then spent several turns
claiming the files existed where they didn't.
* fix(skills/baoyu-comic): handle clarify timeouts correctly in Step 2
A clarify timeout returning 'Use your best judgement to make the choice
and proceed' is NOT user consent to default the entire Step 2 questionnaire.
It is a per-question default only. Add guidance at both instruction sites
(SKILL.md User Questions section, references/workflow.md Step 2 header)
telling the agent to:
1. Continue asking the remaining questions in the sequence after a
timeout — each question is an independent consent point.
2. Surface every defaulted choice in the next user-visible message
so the user can correct it when they return. An unreported default
is indistinguishable from never having asked.
Reported live Apr 2026: agent asked style question via clarify, got a
timeout response, and silently defaulted style + narrative focus +
audience + review flags in one pass. User only learned style had
defaulted to 'ohmsha' after the comic was fully generated.
Page prompts are written in Step 5 from the text descriptions in
characters/characters.md — the PNG sheet generated in Step 7.1
cannot be used to write them. Reposition the PNG as a human-facing
review artifact (and reference for later regenerations / manual
edits), and drop the confusing "Character sheet | Strategy" tables
since the embedding rule is uniform.
- Remove PDF merge feature and scripts/ directory (no pdf-lib dep)
- Correct image_generate docs: prompt-only, returns URL; add
curl download step after every call
- Downgrade reference images to text-based trait extraction
(style/palette/scene); character sheet is agent-facing reference
- Unify source file naming on source-{slug}.md across SKILL.md
and workflow.md
Port the upstream baoyu-comic skill to Hermes' tool ecosystem, matching
the earlier baoyu-infographic adaptation:
- metadata namespace openclaw -> hermes (+ tags, homepage)
- drop EXTEND.md preferences system (references/config/ removed,
workflow Step 1.1 removed)
- user prompts via clarify (one question at a time) instead of
AskUserQuestion batches
- image generation via image_generate instead of baoyu-imagine, with
aspect-ratio mapping to landscape/portrait/square
- Windows/PowerShell/WSL shell snippets dropped
- file I/O referenced via Hermes write_file/read_file tools
- CLI-style --flags converted to natural-language options and
user-intent cues (skill matching has no slash command trigger)
Add PORT_NOTES.md documenting the adaptations and a sync procedure.
Art-style/tone/layout reference files are preserved verbatim from
upstream v1.56.1.
- Remove orphan skills/creative/touchdesigner/references/pitfalls.md
left over from the rename commit (git add-then-edit instead of git mv
meant the old file never got deleted).
- Honour $HERMES_HOME in setup.sh and SKILL.md setup invocation so
profile-aware installs work correctly.
- Fix troubleshooting.md config path to use $HERMES_HOME instead of
hardcoding ~/.hermes/.
- Add touchdesigner-mcp entries to skills-catalog.md and
optional-skills-catalog.md for parity with blender-mcp/meme-generation.
New skill: creative/touchdesigner — control a running TouchDesigner
instance via REST API. Build real-time visual networks programmatically.
Architecture:
Hermes Agent -> HTTP REST (curl) -> TD WebServer DAT -> TD Python env
Key features:
- Custom API handler (scripts/custom_api_handler.py) that creates a
self-contained WebServer DAT + callback in TD. More reliable than the
official mcp_webserver_base.tox which frequently fails module imports.
- Discovery-first workflow: never hardcode TD parameter names. Always
probe the running instance first since names change across versions.
- Persistent setup: save the TD project once with the API handler baked
in. TD auto-opens the last project on launch, so port 9981 is live
with zero manual steps after first-time setup.
- Works via curl in execute_code (no MCP dependency required).
- Optional MCP server config for touchdesigner-mcp-server npm package.
Skill structure (2823 lines total):
SKILL.md (209 lines) — setup, workflow, key rules, operator reference
references/pitfalls.md (276 lines) — 24 hard-won lessons
references/operators.md (239 lines) — all 6 operator families
references/network-patterns.md (589 lines) — audio-reactive, generative,
video processing, GLSL, instancing, live performance recipes
references/mcp-tools.md (501 lines) — 13 MCP tool schemas
references/python-api.md (443 lines) — TD Python scripting patterns
references/troubleshooting.md (274 lines) — connection diagnostics
scripts/custom_api_handler.py (140 lines) — REST API handler for TD
scripts/setup.sh (152 lines) — prerequisite checker
Tested on TouchDesigner 099 Non-Commercial (macOS/darwin).
Previous pass assumed both skills would always be loaded together, so
each description pointed at the other ('use concept-diagrams instead').
That breaks when only one skill is active — the agent reads 'use the
other skill' and there is no other skill.
Now each skill's description and scope section is fully self-contained:
- States what it's best suited for
- Lists subjects where a more specialized skill (if available) would be
a better fit, naming them only as 'consider X if available'
- Explicitly offers itself as a general SVG diagram fallback when no
more specialized skill exists
An agent loading either skill alone gets unambiguous guidance; an
agent with both loaded still gets useful routing via the 'consider X
if available' hints and the related_skills metadata.
Both skills generate SVG system diagrams, but for very different subjects
and aesthetics. The old descriptions didn't make the split clear, so an
agent loading either one couldn't confidently pick.
Changes:
- Rewrote both frontmatter descriptions to state the scope up front plus
an explicit 'for X, use the other skill instead' pointer.
- Added a symmetric 'When to use this skill vs <other>' decision table
to the top of each SKILL.md body, so the guidance is visible whether
the agent is reading frontmatter or full content.
- Added architecture-diagram <-> concept-diagrams to each other's
related_skills metadata.
Rule of thumb baked into both skills:
software/cloud infra -> architecture-diagram
physical / scientific / educational -> concept-diagrams
Port of Cocoon AI's architecture-diagram-generator (MIT) as a Hermes skill.
Generates professional dark-themed system architecture diagrams as standalone
HTML/SVG files. Self-contained output, no dependencies.
- SKILL.md with design system specs, color palette, layout rules
- HTML template with all component types, arrow styles, legend examples
- Fits alongside excalidraw in creative/ category
Source: https://github.com/Cocoon-AI/architecture-diagram-generator
Generate project ideas through creative constraints. Constraint + direction
= creativity.
Core skill (SKILL.md, 147 lines):
- 15 curated constraints across 3 categories: developers, makers, anyone
- Developer-focused prompts: 'solve your own itch', 'the CLI tool that
should exist', 'automate the annoying thing', 'nothing new except glue'
- Matching table: maps user mood/intent to appropriate constraints
- Complete worked example with 3 concrete project ideas
- Output format for consistent, actionable idea presentation
Extended library (references/full-prompt-library.md, 110 lines):
- 30+ additional constraints: communication, screens, philosophy,
transformation, identity, scale, starting points
Constraint approach inspired by wttdotm.com/prompts.html. Adapted for
software development and general-purpose ideation.
Adds opt-in creative thinking frameworks to ascii-video, p5js, and
manim-video skills, based on Lluminate (joelsimon.net/lluminate).
Only engaged when the user explicitly asks for creative, experimental,
or unconventional output. Straightforward requests are unaffected.
Each skill gets 2-3 strategies matched to its domain:
- ascii-video: Forced Connections, Conceptual Blending, Oblique Strategies
- p5js: Conceptual Blending, SCAMPER, Distance Association
- manim-video: SCAMPER, Assumption Reversal
Strategies sourced from creativity research (Boden, Eno, de Bono,
Koestler, Fauconnier & Turner, Osborn), formalized for LLM prompting
by Lluminate.
Add geometry mobjects, movement/creation animations, and LaTeX
environments to the skill's reference docs. All verified against
Manim CE v0.20.1.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Manim's Pango text renderer produces broken kerning with proportional
fonts (Helvetica, Inter, SF Pro, Arial) at all sizes and resolutions.
Characters overlap and spacing is inconsistent. This is a fundamental
Pango limitation.
Changes:
- Recommend Menlo (monospace) as the default font for ALL text
- Proportional fonts only acceptable for large titles (>=48, short strings)
- Set minimum font_size=18 for readability
- Update all code examples to use MONO='Menlo' pattern
- Remove Inter/Helvetica/SF Pro from recommendations
Production pipeline for creating 3Blue1Brown-style animated videos
using Manim Community Edition. The agent handles the full workflow:
creative planning, Python code generation, rendering, scene stitching,
audio muxing, and iterative refinement.
Modes: concept explainers, equation derivations, algorithm
visualizations, data stories, architecture diagrams, paper explainers,
3D visualizations.
9 reference files, setup verification script, README.
All API references verified against ManimCommunity/manim source.
- composition.md: add text backdrop (gaussian dark mask behind glyphs) and
external layout oracle pattern (browser-based text layout → JSON → Python
renderer pipeline for obstacle-aware text reflow)
- shaders.md: add reverse vignette shader (center-darkening for text readability)
- troubleshooting.md: add diagnostic entries for text-over-busy-background
readability and kaleidoscope-destroys-text pitfall
Adds a songwriting craft and AI music prompt engineering skill covering
song structure, rhyme/meter, emotional arcs, Suno metatag reference,
phonetic tricks for AI singers, parody adaptation, and production workflow.
Complements existing music skills (heartmula, audiocraft, songsee) which
cover model setup/usage — this one covers the creative process itself.
Also removes the empty skills/music-creation/ category (only had a
DESCRIPTION.md, no actual skills).
Co-authored-by: 123mikeyd <123mikeyd@users.noreply.github.com>
- New references/design-patterns.md: layer hierarchy (bg/content/accent),
directional parameter arcs, scene concepts and visual metaphors,
counter-rotating systems, wave collision, progressive fragmentation,
entropy/consumption, staggered crescendo buildup, scene ordering
- New references/examples.md: copy-paste-ready scenes at every complexity
- Update scenes.md: local time convention (t=0 at scene start)
- Update SKILL.md: add design-patterns.md to reference table
- Add README.md to hermes-agent copy
- Sync all reference docs with canonical source (SHL0MS/ascii-video)
Major issues fixed:
- Removed dead APIs: artii.herokuapp.com (404 since Heroku free tier
ended 2022), patorjk.com TAAG AJAX endpoint (404)
- Removed unusable sources: emojicombos.com (3.3MB JS blob, not
curl-accessible), asciiart.eu (art loads via JavaScript only)
New working sources added:
- asciified API (asciified.thelicato.io): free text-to-ASCII REST API,
250+ FIGlet fonts, returns plain text, no auth — perfect remote
alternative when pyfiglet isn't installed
- ascii.co.uk: classic ASCII art archive, art in <pre> tags,
extractable with simple curl + Python parsing
- qrenco.de: QR codes as ASCII art via curl
- wttr.in: weather and moon phase as ASCII art via curl
Also fixed: Tool 6 no longer relies on web_extract inside
execute_code (which was the original #662 bug). All web lookups
now use terminal curl which is universally available.
emojicombos.com has a huge curated collection of ASCII art, dot art,
kaomoji, and emoji combos searchable via web_extract with a simple
URL pattern: https://emojicombos.com/{term}-ascii-art
No API key needed. Returns modern/meme art, pop culture references,
and kaomoji alongside classic ASCII art. Added as Source A (recommended
first) before asciiart.eu (Source B, classic archive).
Also added GitHub Octocat API as a fun easter egg and kaomoji search
to the decision flow.
Adds 5 additional tools from the awesome-ascii-art ecosystem:
- cowsay: 50+ characters with speech/thought bubbles
- boxes: 70+ decorative border designs, composable with pyfiglet
- toilet: colored text art with rainbow/metal/border filters
- ascii-image-converter: modern image-to-ASCII (PNG/JPEG/GIF/WEBP)
- jp2a: lightweight JPEG-to-ASCII fallback
Also adds fun extras (Star Wars telnet), resource links, and
an expanded decision flow covering all 7 modes.
Ref: github.com/moul/awesome-ascii-art
Adds two primary modes on top of the original LLM-generation approach:
- Mode 1: pyfiglet (571 fonts, pip install, no API key) for text banners
- Mode 2: asciiart.eu search (11,000+ pieces) via web_extract for pre-made art
- Mode 3: LLM-generated art using Unicode palette (original PR, now fallback)
Includes decision flow, font recommendations, and category reference.
Unicode-based ASCII art generator skill with multiple styles
(block, shadow, outlined, gradient, decorative frame), character
palette reference, and usage examples. No external dependencies.