mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-04-25 00:51:20 +00:00
fix(skills/baoyu-comic): absolute curl paths + clarify-timeout handling (#13775)
* fix(skills/baoyu-comic): require absolute paths for curl -o downloads

  When downloading generated images across several batches of image_generate calls, relying on persistent-shell CWD is unsafe. The terminal tool's shell can rotate (TERMINAL_LIFETIME_SECONDS expiry, a failed cd that leaves the shell somewhere else), and 'curl -fsSL <url> -o relative.png' then silently writes to the wrong directory with no error.

  Update the skill's Step 7 Download step to require absolute -o paths (or workdir= on the terminal tool) and add a matching pitfall entry referencing the Apr 2026 incident where pages 06-09 of a 10-page comic landed at the repo root instead of comic/<slug>/. The agent then spent several turns claiming the files existed where they didn't.

* fix(skills/baoyu-comic): handle clarify timeouts correctly in Step 2

  A clarify timeout returning 'Use your best judgement to make the choice and proceed' is NOT user consent to default the entire Step 2 questionnaire. It is a per-question default only.

  Add guidance at both instruction sites (SKILL.md User Questions section, references/workflow.md Step 2 header) telling the agent to:

  1. Continue asking the remaining questions in the sequence after a timeout; each question is an independent consent point.
  2. Surface every defaulted choice in the next user-visible message so the user can correct it when they return. An unreported default is indistinguishable from never having asked.

  Reported live Apr 2026: agent asked style question via clarify, got a timeout response, and silently defaulted style + narrative focus + audience + review flags in one pass. User only learned style had defaulted to 'ohmsha' after the comic was fully generated.
parent 1dfbfcfe74
commit 0a1e85dd0d
2 changed files with 14 additions and 2 deletions
@ -169,6 +169,12 @@ Input → Analyze → [Check Existing?] → [Confirm: Style + Reviews] → Story
Use the `clarify` tool to confirm options. Since `clarify` handles one question at a time, ask the most important question first and proceed sequentially. See [references/workflow.md](references/workflow.md) for the full Step 2 question set.
**Timeout handling (CRITICAL)**: `clarify` can return `"The user did not provide a response within the time limit. Use your best judgement to make the choice and proceed."` — this is NOT user consent to default everything.
- Treat it as a default **for that one question only**. Continue asking the remaining Step 2 questions in sequence; each question is an independent consent point.
- **Surface the default to the user visibly** in your next message so they have a chance to correct it: e.g. `"Style: defaulted to ohmsha preset (clarify timed out). Say the word to switch."` — an unreported default is indistinguishable from never having asked.
- Do NOT collapse Step 2 into a single "use all defaults" pass after one timeout. If the user is genuinely absent, they will be equally absent for all five questions — but they can correct visible defaults when they return, and cannot correct invisible ones.
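The timeout rules above could be sketched roughly as follows. This is an illustration only, not the skill's implementation: `run_step2`, `defaults_notice`, `Step2Result`, and the `ask` callable (a stand-in for a `clarify` call) are hypothetical names, and detecting a timeout by matching the start of the quoted message is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable

# Start of the timeout message quoted in the skill text. Detecting a timeout
# by prefix match is an assumption of this sketch, not a documented contract.
TIMEOUT_PREFIX = "The user did not provide a response within the time limit."


@dataclass
class Step2Result:
    """Hypothetical container for Step 2 answers and the keys that defaulted."""
    answers: dict[str, str] = field(default_factory=dict)
    defaulted: list[str] = field(default_factory=list)


def run_step2(
    questions: list[tuple[str, str, str]],
    ask: Callable[[str], str],
) -> Step2Result:
    """Ask every question in order; a timeout defaults only that one question.

    `questions` holds (key, prompt, default) tuples. `ask` is a hypothetical
    stand-in for one `clarify` call and returns the raw reply text.
    """
    result = Step2Result()
    for key, prompt, default in questions:
        reply = ask(prompt)
        if reply.startswith(TIMEOUT_PREFIX):
            # Per-question default: record it, then keep asking the rest.
            result.answers[key] = default
            result.defaulted.append(key)
        else:
            result.answers[key] = reply
    return result


def defaults_notice(result: Step2Result) -> str:
    """Build the user-visible line that surfaces every defaulted choice."""
    if not result.defaulted:
        return ""
    parts = [f"{key} → {result.answers[key]}" for key in result.defaulted]
    return "Defaulted " + ", ".join(parts) + " (clarify timed out). Say the word to switch."
```

The loop never exits early on a timeout, so each question remains its own consent point, and the notice is built only from the keys that actually defaulted.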
### Step 7: Image Generation
Use Hermes' built-in `image_generate` tool for all image rendering. Its schema accepts only `prompt` and `aspect_ratio` (`landscape` | `portrait` | `square`); it **returns a URL**, not a local file. Every generated page or character sheet must therefore be downloaded to the output directory.
@ -185,8 +191,11 @@ Use Hermes' built-in `image_generate` tool for all image rendering. Its schema a
**Download step** — after every `image_generate` call:
1. Read the URL from the tool result
- 2. Fetch the image bytes (e.g., `curl -fsSL "<url>" -o <target>.png`)
- 3. Verify the file exists and is non-empty before proceeding to the next page
+ 2. Fetch the image bytes using an **absolute** output path, e.g.
+ `curl -fsSL "<url>" -o /abs/path/to/comic/<slug>/NN-page-<slug>.png`
+ 3. Verify the file exists and is non-empty at that exact path before proceeding to the next page
+
+ **Never rely on shell CWD persistence for `-o` paths.** The terminal tool's persistent-shell CWD can change between batches (session expiry, `TERMINAL_LIFETIME_SECONDS`, a failed `cd` that leaves you in the wrong directory). `curl -o relative/path.png` is a silent footgun: if CWD has drifted, the file lands somewhere else with no error. **Always pass a fully-qualified absolute path to `-o`**, or pass `workdir=<abs path>` to the terminal tool. Incident Apr 2026: pages 06-09 of a 10-page comic landed at the repo root instead of `comic/<slug>/` because batch 3 inherited a stale CWD from batch 2 and `curl -o 06-page-skills.png` wrote to the wrong directory. The agent then spent several turns claiming the files existed where they didn't.
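As a rough illustration of the download rule above, the sketch below refuses relative output directories and verifies the file at the exact target path. It is not what the skill runs: it swaps `curl` for Python's standard `urllib.request`, and `absolute_target` and `download_image` are hypothetical helper names introduced only for this example.

```python
from pathlib import Path
import urllib.request


def absolute_target(output_dir: str, filename: str) -> Path:
    """Return output_dir/filename, rejecting any relative output_dir."""
    out = Path(output_dir)
    if not out.is_absolute():
        # A relative directory would resolve against whatever CWD the shell
        # or process currently has, which is the failure mode described above.
        raise ValueError(f"output_dir must be absolute, got {output_dir!r}")
    return out / filename


def download_image(url: str, output_dir: str, filename: str) -> Path:
    """Fetch url to an absolute path and verify the result is non-empty."""
    target = absolute_target(output_dir, filename)
    target.parent.mkdir(parents=True, exist_ok=True)
    # urlopen follows redirects and raises on HTTP error statuses.
    with urllib.request.urlopen(url) as response:
        target.write_bytes(response.read())
    # Check the exact path that was requested, not a CWD-relative guess.
    if not target.is_file() or target.stat().st_size == 0:
        raise RuntimeError(f"download produced no data at {target}")
    return target
```

A call such as `download_image(url, "/abs/path/to/comic/<slug>", "NN-page-<slug>.png")` then either returns the verified absolute path or raises, so a drifted working directory cannot silently redirect the file.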
**7.1 Character sheet** — generate it (to `characters/characters.png`, aspect `landscape`) when the comic is multi-page with recurring characters. Skip for simple presets (e.g., four-panel minimalist) or single-page comics. The prompt file at `characters/characters.md` must exist before invoking `image_generate`. The rendered PNG is a **human-facing review artifact** (so the user can visually verify character design) and a reference for later regenerations or manual prompt edits — it does **not** drive Step 7.2. Page prompts are already written in Step 5 from the **text descriptions** in `characters/characters.md`; `image_generate` cannot accept images as visual input.
@ -229,6 +238,7 @@ Full step-by-step workflow (analysis, storyboard, review gates, regeneration var
- Image generation: 10-30 seconds per page; auto-retry once on failure
- **Always download** the URL returned by `image_generate` to a local PNG — downstream tooling (and the user's review) expects files in the output directory, not ephemeral URLs
- **Use absolute paths for `curl -o`** — never rely on persistent-shell CWD across batches. Silent footgun: files land in the wrong directory and subsequent `ls` on the intended path shows nothing. See Step 7 "Download step".
- Use stylized alternatives for sensitive public figures
- **Step 2 confirmation required** - do not skip
- **Steps 4/6 conditional** - only if user requested in Step 2
@ -99,6 +99,8 @@ Save result and handle accordingly:
**Use `clarify` one question at a time**, in priority order:
> **Timeout handling (CRITICAL)**: if `clarify` returns `"The user did not provide a response within the time limit. Use your best judgement..."`, that is a per-question default, NOT blanket consent. Continue to the next question in the sequence — do not bail out of Step 2. Then, in your next user-visible message, explicitly surface every default that was taken (e.g. `"Defaulted style → ohmsha, narrative focus → concept explanation, audience → developers (clarify timed out on all three). Say the word to redirect."`). An unreported default is indistinguishable to the user from "the agent never asked."
### Question 1: Visual Style
If a preset is recommended (see `auto-selection.md`), show it first: