MatrixAdapter._is_self_sender returns True defensively when _user_id is empty
(whoami not yet resolved) to prevent echo loops — see #15763. The reaction
approval test must therefore initialize a user_id so _on_reaction does not
drop the inbound test event before reaching the approval handler.
Self-contained docker-compose harness that exercises the new bootstrap
branch against a real Continuwuity homeserver. Three tests:
1. fresh bot → bootstrap fires, /keys/query returns master + ssk
with UNPADDED base64 keyids, current device is signed by the
new SSK
2. second startup with same crypto store → bootstrap is skipped
3. MATRIX_RECOVERY_KEY set → existing verify_with_recovery_key path
takes precedence, no new bootstrap
Run via:
docker compose -f tests/e2e/matrix_xsign_bootstrap/docker-compose.yml up -d
python tests/e2e/matrix_xsign_bootstrap/test_bootstrap.py
docker compose -f tests/e2e/matrix_xsign_bootstrap/docker-compose.yml down -v
The test mirrors the bootstrap snippet from matrix.py inline so it can
run without importing the full hermes gateway and its deps. Skipped
automatically when mautrix isn't installed or the homeserver is
unreachable.
All three pass against ghcr.io/continuwuity/continuwuity:latest
(Continuwuity 0.5.7). The unpadded-keyid assertion is the load-bearing
one — it's exactly the property the PR's bootstrap path provides that
the hand-rolled `base64.b64encode().decode()` scripts get wrong.
Without this, every Matrix bot started under hermes-agent shows the
"Encrypted by a device not verified by its owner" badge in Element
indefinitely, because the cross-signing chain (master → SSK → device)
was never published. Operators currently have to write their own
bootstrap script and remember to run it once per bot — and it's easy
to get wrong (the obvious base64.b64encode().decode() produces padded
keyids that matrix-rust-sdk silently rejects in /keys/query, so even
correctly-signed keys fail to load identity in Element).
mautrix already has the right primitive: generate_recovery_key() does
the full flow — generate seeds, upload privates to SSSS, publish
publics to the homeserver, sign the current device with the new SSK,
and return the human-readable recovery key. We invoke it once on
startup if the bot has no existing cross-signing identity, and log
the recovery key with a clear instruction to save it for future
restarts via MATRIX_RECOVERY_KEY (which the existing recovery-key
path already consumes).
Skipped when MATRIX_RECOVERY_KEY is set (existing path takes over)
or when the bot already has cross-signing keys on the homeserver
(get_own_cross_signing_public_keys returns non-None).
Bootstrap failure is non-fatal — logged with hint about UIA; the bot
continues without cross-signing and Element will show the warning
that prompted this PR. That matches the existing soft-fail pattern
for verify_with_recovery_key.
Tested against Continuwuity 0.5.7 (no UIA required). Synapse with
UIA enabled will need a follow-up PR to thread MATRIX_PASSWORD
through to /keys/device_signing/upload.
Five ``except Exception as exc:`` blocks in the Matrix adapter logged
only ``str(exc)`` without ``exc_info=True``:
- _reverify_keys_after_upload → post-upload key verification failure
- _upload_keys_if_needed → initial device-key query failure
- _upload_keys_if_needed → re-upload device keys failure
- _upload_keys_if_needed → initial device key upload failure
- connect → whoami / access-token validation failure
The E2EE key paths here are security-critical: a silent traceback-
less failure during device-key verification or upload makes it
hard for operators to tell whether their Matrix bot is failing
because of a stale token, a federation timeout, or an olm state
mismatch — all three fail with different tracebacks, which
``str(exc)`` alone flattens.
The contributing guide asks for ``exc_info=True`` on error logs.
Append it to each of the five call sites. Pure logging enrichment.
- Wrap _sync_loop sync() call with asyncio.wait_for(timeout=45s) to guard
against TCP-level hangs that the Matrix long-poll timeout cannot catch
- Add logger.debug at the top of _on_room_message so LOG_LEVEL=DEBUG
confirms whether callbacks fire at all (diagnoses #5819, #7914, #12614)
- Add logger.debug when MATRIX_REQUIRE_MENTION silently drops a message,
pointing users to the env var to disable the filter
Adapted for current mautrix-python adapter (PR was written against the
legacy matrix-nio adapter).
Closes#5819
* Port from Kilo-Org/kilocode#9448: roll up subagent costs into parent session total
Child subagents built by delegate_task() each track their own
session_estimated_cost_usd, but the parent agent's total never folded
those numbers in. On runs where the parent mostly delegates and the
children do the expensive work, the footer/UI was reporting a fraction
of the actual spend — sometimes $0.00 when the parent itself made no
billed calls.
Fix:
- Capture each child's session_estimated_cost_usd into _child_cost_usd
on the result entry (before child.close() drops the counter).
- After the existing subagent_stop hook loop, sum the children's costs
and add the total to parent.session_estimated_cost_usd.
- Promote session_cost_source from 'none' -> 'subagent' when the parent
had no direct spend but children did, so the UI doesn't label the
total as having unknown provenance. Real sources (openrouter,
anthropic, etc.) are preserved.
Nested orchestrator -> worker trees roll up naturally: each layer's own
delegate_task() folds its direct children in, and when the orchestrator
itself returns, its parent folds the orchestrator's now-inflated total
on top.
Internal fields (_child_cost_usd, _child_role) are stripped from the
results dict before it's serialised back to the model — same contract
as _child_role already followed.
Tests: TestSubagentCostRollup (5 cases) covers single-child, batch,
zero-cost-children, preserved-source, and legacy-fixture paths.
Source: https://github.com/Kilo-Org/kilocode/pull/9448
* fix(web): scope dashboard config Reset button to the current tab
Reported by @ykmfb001 via X: clicking 'Restore Defaults' (恢复默认值) on
the Auxiliary page wiped the entire config.yaml to defaults, not just
the auxiliary section. The button sits next to the category tabs and
users reasonably assumed 'reset this tab', not 'reset everything'.
Changes:
- handleReset now scopes to the fields in the current view:
active category's fields (form mode) or search-matched fields
(search mode). Only those keys are copied from defaults; the rest
of the config is left alone.
- Added a window.confirm() with the scope name before applying.
- Button is hidden in YAML mode (scoping doesn't apply there).
- Tooltip/aria-label now name the scope, e.g. 'Reset Auxiliary to
defaults'.
- i18n: new resetScopeTooltip / confirmResetScope / resetScopeToast
strings in en + zh; resetDefaults key preserved for compat.
On AWS Bedrock (and Azure AI Foundry), Claude Opus 4.6/4.7 and Sonnet 4.6
are capped at 200K context unless the request carries the
`context-1m-2025-08-07` beta header. On native Anthropic (api.anthropic.com)
1M went GA so the header is a harmless no-op, but Bedrock/Azure still gate
it as beta as of 2026-04.
Hermes was advertising 1M in model_metadata.py (`claude-opus-4-7: 1000000`)
while silently sending a request without the beta — so Bedrock users saw
a 200K ceiling with no error message, and no config knob unblocked it.
Claude Code sends this header by default, which is why the same Bedrock
credentials worked there.
- Add `context-1m-2025-08-07` to `_COMMON_BETAS` (alongside interleaved
thinking and fine-grained tool streaming).
- Strip it in `_common_betas_for_base_url` for MiniMax bearer-auth
endpoints — they host their own models, not Claude, so Anthropic beta
headers are irrelevant and could risk rejection.
- Attach `_COMMON_BETAS` as `default_headers` on the AnthropicBedrock
client. Previously that constructor passed no betas at all, so native
Anthropic had the 1M unlock via default_headers but Bedrock didn't.
- Fast-mode per-request `extra_headers` already rebuilds from
`_common_betas_for_base_url`, so it picks up the 1M beta automatically.
Reported by user 'Rodmar' on Discord: Bedrock Opus 4.7 stuck at 200K while
same credentials worked in Claude Code.
Anyone who ran hermes between Apr 15 (42aeb4ec) and Apr 22 (a7d78d3b)
has schema_version=7 from the pre-renumber api_call_count migration.
When a7d78d3b inserted reasoning_content as the new v7 and pushed
api_call_count to v8, the 'if current_version < 7' gate was already
false for those users, so reasoning_content was never created —
sqlite3.OperationalError: no such column: reasoning_content on any
/continue or /resume touching assistant replays.
Replaces the version-gated ADD COLUMN chain with _reconcile_columns():
on every startup, parse SCHEMA_SQL via an in-memory SQLite and diff
against PRAGMA table_info; ALTER TABLE ADD COLUMN for anything missing.
Follows the Beets / sqlite-utils pattern — SCHEMA_SQL becomes the single
source of truth for declared columns. Self-healing and idempotent.
v10 trigram FTS backfill is retained in a version-gated block — that
migration isn't a column add, it inserts existing message rows into
the new FTS virtual table, so reconciliation can't express it.
schema_version is also kept for future row-data migrations.
Salvaged from #14097 (@kshitijk4poor) onto current main; v10 trigram
preservation and the v9 codex_message_items column (stale-missed by
the original branch) are covered automatically by reconciliation.
Tests:
- Regression: DB at old v7 with api_call_count but no reasoning_content
gets the column on open
- Idempotency: reopening the same DB is a no-op
- Structural invariant: every SCHEMA_SQL column is in the live DB
- Existing v2 migration test still passes
- E2E verified against fresh / v1 / old-v7 / v9 DBs, plus v10 trigram
backfill preserved
Port https://github.com/blader/humanizer (MIT, v2.5.1, 16k stars) into
the built-in skills under skills/creative/humanizer/. Based on Wikipedia's
'Signs of AI writing' guide (WikiProject AI Cleanup) — detects 29 AI-writing
patterns and rewrites them to sound human.
Hermes-native adaptations:
- Description (<60 chars) explains what it's for: 'Humanize text: strip
AI-isms and add real voice.'
- 'When to use this skill' section — trigger phrases (humanize, de-AI,
de-slop, un-ChatGPT, rewrite to not sound like an LLM) plus guidance to
apply it to the agent's own output (release notes, PR descriptions, docs).
- 'How to use it in Hermes' — maps the three real input paths (inline,
file via read_file/patch/write_file, voice-calibration sample) onto the
tools the agent actually has. Drops Claude Code's allowed-tools block.
- Converted frontmatter to Hermes format (metadata.hermes.tags, category,
homepage, related_skills).
Attribution preserved:
- Original author Siqi Chen (@blader) credited in frontmatter and body.
- Full MIT LICENSE copied verbatim alongside SKILL.md.
- Wikipedia / WikiProject AI Cleanup credited.
- 29 patterns, personality/soul section, and full worked example kept
verbatim from the source (29,914 chars).
Validated end-to-end against a clean HERMES_HOME:
- sync_skills() copies skills/creative/humanizer/ including LICENSE.
- skills_list(category='creative') returns the 48-char description.
- skill_view(name='humanizer') returns the full body with all 29 patterns,
personality/soul, attribution, and Hermes tool refs (read_file, patch,
write_file) intact.
Plugins can now observe dangerous-command approval events in real time,
on both the CLI-interactive path and the async gateway path. This is the
missing hook surface external tools need to build approval notifiers
(macOS menu-bar allow/deny, Slack alerts, audit logs, etc.) without
forking Hermes or running a parallel gateway adapter.
Changes:
- hermes_cli/plugins.py: add two entries to VALID_HOOKS
- tools/approval.py: fire both hooks from check_all_command_guards --
around prompt_dangerous_approval (CLI surface) and around the
notify_cb + blocking event.wait loop (gateway surface)
- website/docs/user-guide/features/hooks.md: document both hooks with
a macOS-notification example
- tests/tools/test_approval_plugin_hooks.py: 5 tests covering CLI once,
CLI deny, plugin-crash resilience, gateway approve, gateway timeout
Hooks are observer-only: return values are ignored, so plugins cannot
veto or pre-answer an approval (use pre_tool_call for that). A crashing
plugin cannot break the approval flow -- invoke_hook swallows per-
callback errors, and the wrapper logs and swallows dispatch-layer
errors too.
Surface kwarg distinguishes "cli" from "gateway"; post hook reports
choice as one of once/session/always/deny/timeout.
A misconfigured auxiliary.compression.model is a user-fixable problem that silent recovery would hide. The previous retry-on-main logic transparently swallowed aux-model failures whenever the fallback succeeded, leaving the user's broken config in place and racking up future failures.
Track the aux-model failure on the compressor alongside the existing fallback-placeholder fields:
- _last_aux_model_failure_model: str | None
- _last_aux_model_failure_error: str | None
Both are set at the moment the aux model errors (captured before summary_model is cleared for retry), regardless of whether the retry succeeds. Cleared at compress() start and on on_session_reset() so a clean run doesn't leak stale warnings.
Surface at three places:
- gateway hygiene auto-compress: ℹ note to the platform adapter (thread_id preserved)
- gateway /compress command: ℹ line appended to the reply
- CLI via _emit_warning: deduped on (model, error) so repeat compactions don't spam
Distinct from the existing ⚠️ dropped-turns warning — different severity, different emoji, explicit 'context is intact' reassurance.
Adds four new reference docs covering common TD use cases not previously
documented in the skill:
- animation.md: LFOs, timers, keyframes, easing, time references
- midi-osc.md: MIDI controllers, OSC routing, TouchOSC, multi-machine sync
- particles.md: POPs and particleSOP — emission, forces, collisions, render
- projection-mapping.md: windowCOMP, corner pin, mesh warp, edge blending
Also clarifies the SKILL.md tool quick reference: adds td_screen_point_to_global
and notes that 4 admin/dev-mode tools (td_project_quit, td_test_session,
td_dev_log, td_clear_dev_log) live only in mcp-tools.md to keep the main
reference focused on creative workflows.
No SKILL.md workflow or critical-rules changes. References load on demand
so no token-budget impact at session start.
The existing retry-on-main path in _generate_summary only fires for errors that match the _is_model_not_found heuristic (404/503, 'model_not_found', 'does not exist', 'no available channel'). Other misconfiguration errors — 400s from aggregators, provider-specific 'no route' strings, opaque rejections — fall straight through to the transient-cooldown branch, which drops N turns of context and inserts a static placeholder.
Losing context is almost always worse than one extra summary attempt. Add a best-effort retry-on-main for the unknown-error branch, guarded by the same invariants as the existing fast-path retry: only when summary_model differs from main, and only once per compressor (_summary_model_fallen_back).
Tests cover: 404 fast-path fallback still works, unknown 400 now falls back, same-model aux skips retry (no infinite loop), and a double-failure (aux + main) stops at 2 calls.
PR #16333 added a warning to the manual /compress reply when the
auxiliary summariser fails and the static fallback placeholder is
used, but only the gateway-hygiene path had a test
(test_session_hygiene_warns_user_when_summary_generation_fails).
The /compress branch in _handle_compress_command was uncovered.
New test test_compress_command_appends_warning_when_summary_generation_fails
mocks the compressor's _last_summary_fallback_used /
_last_summary_dropped_count / _last_summary_error fields and
verifies the /compress reply contains the ⚠️ marker, the underlying
error string, the dropped message count, and the 'historical
message(s) were removed' wording — i.e. the same contract the
hygiene-path test enforces.
The per-call reset block at the top of compress() cleared
_last_summary_dropped_count and _last_summary_fallback_used but
not _last_summary_error. Functionally this didn't break the
gateway warning path (callers gate on _last_summary_fallback_used
first, and _last_summary_error is overwritten on the next failure),
but it left the three tracking fields inconsistent — anyone
reading _last_summary_error standalone after a successful compress
would see a stale value from a previous failed compress.
Reset all three together so the per-call contract is uniform.
The fallback placeholder said "N conversation turns were removed" while the
gateway warning said "N historical message(s) were removed". Use "messages"
in both so users don't wonder if the two counters refer to different things.
Address review feedback on PR #16333:
1. The hygiene-path warning send was missing metadata=_hyg_meta. On
Telegram topics / Slack threads / Discord threads the warning would
land in the main channel instead of the originating thread. Now
reuses the same _hyg_meta dict already computed for the hygiene
compaction itself.
2. New gateway-level test
test_session_hygiene_warns_user_when_summary_generation_fails
verifies end-to-end:
- When the compressor's _last_summary_fallback_used flag is True,
the gateway invokes adapter.send() exactly once.
- The warning message includes the dropped count and the underlying
error string.
- metadata={'thread_id': ...} is propagated so the warning lands
in the originating topic/thread.
Tests: 20 gateway hygiene + 54 context_compressor — all pass.
When auxiliary compression's summary LLM call fails (e.g. model 404,
auxiliary model misconfigured), the compressor still drops the selected
turns and inserts a static fallback placeholder — the dropped context
is unrecoverable.
Previously the only signal of this was a WARNING in agent.log. Gateway
users (Telegram/Discord/etc.) had no way to know context was lost
because the existing _emit_warning path requires a status_callback,
and the gateway hygiene path uses a temporary _hyg_agent with
quiet_mode=True and no callback wired up.
Changes:
- ContextCompressor: track _last_summary_fallback_used and
_last_summary_dropped_count on each compress() call. Cleared at the
start of compress() and on session reset.
- gateway/run.py hygiene: after auto-compress, inspect the temp
agent's compressor; if fallback was used, send a visible ⚠️ warning
to the user via the platform adapter (TG/Discord/etc.) including
dropped count and the underlying error.
- gateway/run.py /compress: append the same warning to the manual
compress reply so users running /compress see the failure too.
Acceptance:
- Summary success: no user-visible warning (unchanged).
- Summary failure on gateway hygiene: user receives a TG/Discord
message with dropped count + error + remediation hint.
- Summary failure on /compress: warning appended to the command reply.
- CLI status_callback / _emit_warning path is untouched.
- Test coverage: two new tests verify the tracking fields are set on
failure and cleared on subsequent success.
The typing-indicator refresh loop in BasePlatformAdapter._keep_typing
awaited each send_typing call unconditionally. Each call is an HTTP
round-trip to the platform API (Telegram/Discord), normally ~100ms. When
the same network instability that causes upstream provider timeouts
(e.g. Anthropic capacity blips slowing first-token latency past the
120s stream-read timeout) also slows the platform typing API to
multi-second response times, the refresh loop stalls inside the await.
Platform-side typing expires at ~5s, so the bubble dies and stays dead
until the stuck send_typing call returns — right when the user most
needs the 'still working' signal and instead sees a bot that looks
dead, then asks 'wtf are you doing' which itself interrupts the
eventually-recovering turn.
Bound each send_typing with asyncio.wait_for (1.5s cap, derived from
interval so it's always below the 2s cadence). Slow calls get abandoned
so the next scheduled tick fires a fresh send_typing on schedule. As
long as any one of them reaches the platform within its ~5s
typing-expiry window, the bubble stays visible across the stall.
Also catches non-timeout send_typing exceptions (transient HTTP errors)
so one bad tick doesn't terminate the whole loop.
Tests: 4 new in tests/gateway/test_keep_typing_timeout.py covering
slow-send non-blocking, fast-send still-awaited, exception resilience,
and paused-chat regression guard.
- moveCursor(extend=true) now collapses to the bare cursor when the
computed offset equals the existing anchor instead of leaving a
zero-length sel. Without this, Shift+Left at col 0 / Shift+Home at
start would silently hide the hardware cursor (selected truthy)
without rendering any highlight.
- _tui_need_npm_install also catches UnicodeDecodeError so a corrupted
/ non-UTF8 lockfile falls back to the mtime path the docstring
promises instead of crashing.
Made-with: Cursor
* feat(tui): auto copy-on-select for transcript text
Drag in the transcript already highlighted but you had to press Cmd+C to
land it on the clipboard, and the highlight cleared on copy — most users
never realised selection existed. Now drag-release fires copySelectionNoClear
so the text is on the clipboard immediately while the highlight stays put,
matching iTerm2's "Copy to pasteboard on selection" default. Esc clears.
Behaviour:
- Single click in the input still positions the cursor (TextInput onClick).
- Single click in the transcript still does nothing destructive.
- Double / triple click select word / line, then drag extends.
- /copyselect [on|off|toggle] (alias /cos) flips the setting at runtime,
HERMES_TUI_DISABLE_COPY_ON_SELECT=1 disables at startup, persists via
display.tui_copy_on_select in config.yaml.
Help overlay now lists drag-select, multi-click, and click-to-position
so the gestures are discoverable.
Made-with: Cursor
* fix(tui): support prompt text selection gestures
Add mouse drag selection and Shift+Arrow/Home/End extension inside the TUI composer so prompt text behaves like a normal editable field while keeping click-to-position and right-click paste intact.
Made-with: Cursor
* Revert "feat(tui): auto copy-on-select for transcript text"
This reverts commit 6701288fe0.
* fix(tui): allow composer selection from prompt whitespace
Give the composer a one-cell mouse capture pad before the editable text. The prompt glyph/gutter still does not become selectable, but dragging from the edge now anchors at input offset 0 so users do not need to hit the first character precisely.
Made-with: Cursor
* fix(tui): clear selections from blank composer space
Clicking blank space in the transcript or composer now clears active TUI/input selections like a normal text surface. TextInput clicks stop bubbling so cursor placement and selection gestures keep their local behavior.
Made-with: Cursor
* fix(tui): delegate prompt gutter drags to composer text
The prompt gutter is now an input gesture region, not selectable content. Dragging from the whitespace or prompt area anchors the composer selection at offset 0, while selection highlight/copy remains limited to actual input text.
Made-with: Cursor
* fix(tui): move composer cursor to end on selection clear
External clear actions now collapse the composer selection to the end of the input, matching normal text-field behavior after dismissing a selection.
Made-with: Cursor
* fix(tui): capture composer padding before prompt
Add an explicit mouse capture cell over the left padding before the prompt glyph. Drags starting there now delegate to the composer input at offset 0 instead of starting terminal-level selection over the prompt chrome.
Made-with: Cursor
* fix(tui): avoid npm install on lockfile mtime churn
Compare package-lock.json against npm's hidden node_modules lock by content instead of mtimes. Git checkouts and npm lock rewrites can make the root lockfile newer even when installed dependencies already match, causing hermes --tui to print Installing TUI dependencies on every launch.
Made-with: Cursor
* fix(tui): include prompt leading cell in gesture region
Use the prompt box's real layout region to cover the leading whitespace cell before the glyph. The cell now participates in mouse hit testing and delegates to composer selection instead of starting terminal-level selection.
Made-with: Cursor
* fix(tui): widen prompt-side gesture capture band
Capture a wider left-side band around the composer prompt row so drags starting in terminal gutter/padding cells are consumed and delegated to input selection, instead of triggering terminal-level selection chrome.
Made-with: Cursor
* fix(tui): make pre-prompt spacer non-selectable content
Replace the sticky-prompt fallback `Text(' ')` with an empty spacer box so the visual gap remains but no literal space character is rendered/copyable before the composer prompt.
Made-with: Cursor
* fix(tui): capture pre-prompt spacer without shifting prompt layout
Revert the widened negative-margin prompt capture band and instead capture drags on the dedicated spacer row above the prompt. This keeps prompt/text alignment stable while still delegating whitespace-start drags to composer selection.
Made-with: Cursor
* fix(tui): align prompt with status bar and capture full input row
Drop the leading prompt column from 3 to 2 so the input first character lines up with the status bar text. Wrap the prompt+input row in a single mouse-capture box and stop event propagation from TextInput's own handlers so any drag in that row delegates to composer selection without leaking to terminal-level selection.
Made-with: Cursor
* fix(tui): anchor hardware cursor during composer selection
When a composer selection covers a row exactly the column width, the rendered text fills the row and the terminal auto-wraps the hardware cursor to col 0 of the next row, leaving a ghost block beneath the prompt. Park the cursor at the start of the input box during selection so it can't escape the input region.
Made-with: Cursor
* fix(tui): hide hardware cursor during composer selection
Stop fighting auto-wrap by hiding the hardware cursor outright while the
composer has an active selection. This prevents both the ghost block under
the prompt (cursor wrapping past the last cell) and the parked-cursor block
on the first selected character. The cursor restores as soon as the
selection clears or focus changes.
Made-with: Cursor
* chore(tui): /clean — drop dead capture-pad path, dedupe gutter handlers
- TextInput: remove unused leftCaptureColumns prop and capture-pad math, drop
unused mouseApi.startAt, fold mouse offset into a single offsetAt helper,
share a MouseEventLite type across the four handlers.
- appLayout: hoist a GutterMouseEvent type and an endInputDrag callback so the
spacer/prompt/input rows share one shape.
- _tui_need_npm_install: lift the runtime-only key set to a module constant,
collapse nested isinstance checks, and document the mtime fallback.
Made-with: Cursor
* fix(tui): address copilot review on PR #16732
- Split InputSelection.clear() into clear() (cursor-preserving) and
collapseToEnd() (clear + jump to end). Cmd+C copy paths keep using
clear() so the cursor stays put; the blank-area click in useMainApp
switches to collapseToEnd() to match the requested UX.
- Spacer-row drags now force row=0 when forwarding into the input,
since the spacer's vertical origin doesn't align with the input box
and Ink mouse-capture keeps dispatching motion to the original
target. Prompt+input row drag keeps localRow because origins match.
Made-with: Cursor
* fix(tui): give TextInput Box an explicit width
After the /clean pass dropped the unused capture-pad math, the wrapping
Box also lost its explicit width and started sizing to its rendered
content. Clicks past the last character missed TextInput and fell
through to the parent prompt-row Box, which collapsed the cursor to
offset 0. Pin the Box back to `columns` so the input owns its full
column span regardless of value length.
Made-with: Cursor
* feat(tui): double-click select-all + hide cursor on terminal blur
- Track click time/offset in TextInput so a quick second click on the
same offset triggers select-all. Ink's screen-level multi-click is
bypassed once our onMouseDown captures, so the gesture has to be
detected locally.
- Extend the cursor-hide effect to also fire when the terminal loses
focus, so the hollow-rect ghost most terminals draw at the parked
cursor position disappears too.
Made-with: Cursor
* chore(tui): /clean — extract isMultiClickAt helper
Pull the click-recurrence math out of TextInput's onMouseDown into a
small isMultiClickAt(offset) helper so the handler reads as the gesture
list it actually is (multi-click → select-all, otherwise start).
Drop the redundant length>0 guard now that selectAll() already noops on
an empty value.
Made-with: Cursor
* docs(tui): explain _tui_need_npm_install content-vs-mtime comparison
Expand the docstring so future readers understand why we parse the
lockfiles instead of comparing mtimes, what the optional/peer skip
covers, how stale hidden-lock entries are handled, and when we fall
back to mtime.
The previous commit on this branch went through a layer that redacted
strings matching API-key patterns. Restore the original placeholder
values (sk-ant-..., ${ANTHROPIC_API_KEY}, etc.) that were already in
main so the diff is scoped strictly to the new Multi-profile support
section.
Clarifies that Hermes' built-in multi-profile feature is not recommended
when running under Docker. Recommends instead running one container per
profile, each bind-mounting its own host data directory as /opt/data.
Includes docker run examples, a rationale list (isolation, independent
lifecycle, port separation, concurrent-write safety), and a Compose
snippet showing two profile services side by side.
- Rename `removeAt` → `removeAtInPlace` and document the mutation
contract; the old name read like a non-mutating helper.
- Hotkey table + queue header: use `Ctrl+X` / `Esc` to match the
rest of the UI (was `⌃X` / `esc`).
- Render the queued header as a single template literal so JSX
text-node whitespace can't sneak into the rendered line.
- Make `Esc` while editing beat the `terminal.hasSelection` clear:
the header promises 'Esc cancel', so an active selection
shouldn't silently consume the keystroke.
The text input's ctrl-passthrough whitelist only listed Ctrl+C and
Ctrl+B. Ctrl+X fell through to the printable-char branch and got
inserted as 'x' alongside the queue-delete action firing in
useInputHandlers.
Add Ctrl+X to the same whitelist so it bypasses the readline-style
fallback and reaches the app-level handler unchanged. When not in
queue-edit mode it's a no-op, which is fine — typing 'x' on Ctrl+X
was the wrong default anyway.
Today there's no way to remove a queued message — ↑ loads it for edit,
ctrl-K dispatches the head, but a draft you no longer want stays put
forever. ctrl-C just clears the composer and exits edit mode without
touching the queue.
Two new bindings, both gated on queueEditIdx !== null so they're
inert when the user isn't pointing at a queue item:
- ctrl-X — delete the queue item being edited, clear composer, exit
edit mode. "cut" matches the mental model and doesn't collide with
any existing binding.
- esc — cancel the edit (composer clears, item stays in queue).
Mirrors ctrl-C's existing behavior so muscle memory has two paths.
Header line now reads `queued (3) · editing 2 · ⌃X delete · esc cancel`
when in edit mode, so the affordance is discoverable without /help.
The /help hotkey table also gets a Ctrl+X entry.
ctrl-C is intentionally unchanged: it should never destroy queued
content. Cancel is non-destructive (esc / ctrl-C); only ctrl-X
removes the item.
Same layering concern as the persisted-assistant scrub already removed:
_emit_interim_assistant_message and the final_response return path were
mutating model output broadly. Streaming scrubber covers real leaks
delta-by-delta; these post-stream scrubs were redundant.
Reviewer pushback on the original boundary-hardening commits — three
overreach points pulled plugin-specific policy into shared core paths:
1. gateway/run.py hardcoded a '## Honcho Context' literal split for
vision-LLM output. Plugin-format heading in framework code; could
truncate legitimate output naturally containing that header.
Drop the literal split; keep generic sanitize_context (the wrapper
strip is plugin-agnostic). Plugin-specific cleanup belongs at the
provider boundary, not the shared gateway path.
2. run_agent.run_conversation scrubbed user_message and
persist_user_message before the conversation loop. User text is
sacred — if a user types a literal <memory-context> tag we must
not silently delete it. The producer (build_memory_context_block)
is the only legitimate emitter; user input should never need the
reverse op.
3. _build_assistant_message scrubbed model output before persistence.
Same hazard: would silently mutate legitimate documentation/code
the model emits containing the literal markers. The streaming
scrubber catches real leaks delta-by-delta before content is
concatenated; persist-time scrub was redundant belt-and-suspenders.
4. _fire_stream_delta stripped leading newlines from every delta unless
a paragraph break flag was set. Mid-stream '\n' is legitimate
markdown — lists, code fences, paragraph breaks — and chunk
boundaries are arbitrary. Narrow lstrip to the very first delta
of the stream only (so stale provider preamble still gets cleaned
on turn start, but mid-stream formatting survives).
Plus: build_memory_context_block now logs a warning when its defensive
sanitize_context strips something — surfaces buggy providers returning
pre-wrapped text instead of silently double-fencing.
Net architectural change: scrub surface collapses from 8 sites to 3
(StreamingContextScrubber on output deltas, plugin→backend send,
build_memory_context_block input-validation). Plugin-specific strings
stay out of shared runtime paths. User input and persisted assistant
output are no longer mutated.
Tests: rescoped TestMemoryContextSanitization (helper-correctness only,
no source-inspection of removed call sites), updated vision tests to
drop '## Honcho Context' literal-split assertions, updated
_build_assistant_message persistence test to assert preservation.
Added: cross-turn scrubber reset, build_memory_context_block warn-on-
violation, mid-stream newline preservation (plain + code fence).
Closed PR #5137 addressed the retrieval path (peer cards via get_card()
instead of the session-scoped lookup that returned empty for per-session
messaging flows) — that architectural fix is already in main as
_fetch_peer_card / _fetch_peer_context.
What never got fixed is the user-visible side: honcho_profile returning
a flat 'No profile facts available yet.' leaves the model to guess at
why. The model then often surfaces it to the user as a cryptic error.
Adds a diagnostic hint next to the existing 'result' message, enumerating
the likely causes in rough order of frequency:
1. Observation disabled for this peer (user_observe_me/others off)
2. Peer card hasn't accumulated yet (fresh peer / dialectic cadence
hasn't fired enough turns — cards build over time)
3. Generic fallback: self-hosted Honcho < 3.x lacks peer cards
The hint also suggests alternative tools (honcho_reasoning / honcho_search)
so the model can route around the empty card rather than giving up.
Schema description updated so the model knows the hint field exists and
that an empty card is NOT an error state.
7 tests cover the hint paths: warmup, observation-disabled for user + ai,
generic fallback, populated card still returns plain result (no hint),
alternative-tool suggestion present.
The scheme-validation commit (e77a3f2c) was too strict: a user with
legacy ''baseUrl: localhost:8000'' (no ''http://'' prefix) in their
''~/.honcho/config.json'' would get ''No API key configured'' from the
CLI after that change, even though their setup worked before.
urlparse on a schemeless host:port treats the host segment as the
scheme and leaves netloc empty, so the http/https check rejected it.
Falls back to a lenient check for schemeless strings that look like
hosts: contain '.' or ':', aren't a boolean/null literal, aren't pure
digits. The SDK still rejects truly malformed URLs at connect time
with a clearer error than ours.
Three new tests: legacy schemeless hosts accepted; obvious garbage
literals (''true'', ''null'', ''12345'') still rejected. Reviewer
noted concern #1: schemeless regression for self-hosters with old
configs.
main's 6a957a74 added an optional 'metadata' kwarg to
MemoryProvider.on_memory_write so providers can distinguish tool-driven
memory writes from background-review writes. MemoryManager already
does a getfullargspec-based introspection, so the old 3-arg signature
didn't break at runtime — but it missed the origin hint entirely.
Updates HonchoMemoryProvider.on_memory_write to accept the kwarg. The
metadata isn't yet threaded into Honcho's create_conclusion payload —
that's worth its own PR once the consolidation lands and the new
metadata shape stabilises.
Two small follow-ups to the PR review:
- Hoist hashlib import from _enforce_session_id_limit() to module top.
stdlib imports are free after first cache, but keeping all imports at
module top matches the rest of the codebase.
- _resolve_api_key now URL-parses baseUrl and requires http/https +
non-empty netloc before returning the 'local' sentinel. A typo like
baseUrl: 'true' (or bare 'localhost') no longer silently passes the
credential guard; the CLI correctly reports 'not configured'.
Three new tests cover the new validation (garbage strings, non-http
schemes, valid https).
fixes#5719
The auxiliary vision LLM called by gateway._enrich_message_with_vision
can echo its injected Honcho system prompt back into the image
description. That description gets embedded verbatim into the enriched
user message, so recalled memory (personal facts, dialectic output)
surfaces into a user-visible bubble.
Strips both forms of leak before embedding:
- <memory-context>...</memory-context> fenced blocks (sanitize_context)
- trailing '## Honcho Context' sections (header + everything after)
Plus regression tests:
- tests/agent/test_streaming_context_scrubber.py — 13 tests on the
stateful scrubber (whole block, split tags, false-positive partial
tags, unterminated span, reset, case-insensitivity)
- tests/run_agent/test_run_agent_codex_responses.py — 2 new tests on
_fire_stream_delta covering the realistic 7-chunk leak scenario and
the cross-turn scrubber reset
- tests/gateway/test_vision_memory_leak.py — 4 tests covering the
vision auto-analysis boundary (clean pass-through, '## Honcho Context'
header, fenced block, both patterns together)
sanitize_context() uses a non-greedy block regex that needs both
<memory-context> open and close tags present in a single string. When a
provider streams the fenced memory block across multiple deltas (typical
for recalled-context leaks — the payload often arrives in 10+ 1-80 char
chunks), the per-delta sanitize stripped the lone open/close tags via
_FENCE_TAG_RE but let the payload in between flow straight to the UI.
Adds StreamingContextScrubber: a small stateful scrubber that tracks
open/close tag pairs across deltas, holds back partial-tag tails at
chunk boundaries, and discards span contents wholesale (including the
system-note line that fragments across deltas).
Wired into _fire_stream_delta; reset per user turn; benign trailing
partial-tag tails are flushed at the end of each model call. Mid-span
interruption (provider drops closing tag) drops the orphaned content
rather than leaking it — truncated answer > leaked memory.
Follow-up to #13672 (@dontcallmejames).