The check-windows-footguns.py script outputs a checkmark (U+2713) and
cross (U+2717) to report results. Windows terminals default to cp1252,
which cannot encode these characters, so running the script on Windows
threw a UnicodeEncodeError before any results were printed.
This made the tool completely unusable on the exact platform it exists
to help -- a developer on Windows trying to check their code for
Windows-safety issues would just get a crash instead.
Fix: reconfigure stdout and stderr to UTF-8 at the start of main(),
before any output is produced. Verified on Windows 11 Home with
Python 3.13 (terminal defaulting to cp1252).
Tests in TestReadClaudeCodeCredentials were not mocking
_read_claude_code_credentials_from_keychain, which was added after the
tests were written. On macOS machines with real Claude Code credentials
stored in the Keychain, the function returns live credentials instead of
the test fixtures, causing assertions to fail and leaking real tokens in
test output.
Add an autouse fixture that stubs the keychain reader to None so all
tests in the class exercise only the file-based credential path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace generator-based result collection with explicit per-future
handling. Each future is now processed independently with a 600s timeout.
Before: _results.extend(f.result() for f in _futures)
- One exception stops the generator, remaining results are lost
- No timeout: one hung job blocks the entire tick
After: as_completed() + per-future try/except
- Each future handled independently
- 600s timeout prevents indefinite blocking
- Failed futures are logged and counted as failures
The goal judge only receives the goal text and the agent's last
response. It has no concept of the current time, making it
impossible to evaluate time-sensitive goals like 'keep working
until 5pm'.
This commit adds 'Current time' to both JUDGE_USER_PROMPT_TEMPLATE
and JUDGE_USER_PROMPT_WITH_SUBGOALS_TEMPLATE, computed from
datetime.now().astimezone() at judge call time.
The Telegram/Discord model picker skipped live model discovery for
custom providers (llama.cpp, Ollama) unless an api_key was configured.
Local providers typically don't require auth on the /models endpoint.
The CLI always probes /models, so this brings the gateway picker into
parity.
Change: `if api_url and api_key:` -> `if api_url:`
Add creationflags=CREATE_NO_WINDOW to every Windows Popen call
across the terminal, process registry, code execution, and kanban
worker subsystems. Prevents visible CMD windows from flashing on
the user's desktop during agent operation.
Also adds the _IS_WINDOWS module constant to kanban_db.py where
it was missing, for consistency with the other patched files.
5 Popen sites across 4 files:
- tools/environments/local.py (terminal foreground spawn)
- tools/process_registry.py (background process spawn)
- tools/code_execution_tool.py (sandbox + interpreter probe)
- hermes_cli/kanban_db.py (kanban worker spawn)
When /goal loop generates synthetic MessageEvents (goal continuations,
status notices), the reply anchor is unavailable (message_id=None). For
Telegram DM topic lanes, the Telegram adapter requires
direct_messages_topic_id to route messages correctly; without it, the
adapter falls back to message_thread_id=None, sending messages to the
root 'All Messages' thread instead of the active topic lane.
The fix includes direct_messages_topic_id in thread metadata for all
non-General Telegram DM topics, ensuring queued/synthetic messages are
delivered to the correct thread even when no reply anchor exists.
_propare_messages_for_non_vision_model() was only called in the legacy
flag path (no provider profile). Providers with registered profiles
(e.g. DeepSeek, Kimi) bypassed the strip, causing HTTP 400 errors when
image_url content blocks reached their non-vision APIs.
This mirrors the existing behavior in the legacy path, ensuring all
non-vision models get image stripping regardless of profile status.
Vision-capable models are unaffected (the function is a no-op for them).
The _normalize_custom_provider_entry() function was dropping the
discover_models field from custom_provider entries because:
1. It was not listed in _KNOWN_KEYS, so it was logged as an
unknown key and ignored.
2. The function builds the normalized dict by explicitly copying
known fields, so even if the warning was suppressed, the value
was not carried through.
This caused downstream model_switch.py to default discover_models
to True, triggering /models HTTP probes on unreachable endpoints.
With 4 unreachable internal endpoints at ~6s timeout each, the
/api/model/options endpoint took ~24s instead of <1s.
Two protocol-correctness gaps from review:
1. Stage-Node used [void](Test-Node) which discarded Test-Node's return
value, so the JSON frame always reported ok=true even when Node
install fully failed. A GUI driver consuming the manifest couldn't
tell 'node ready' from 'node missing'. Wire a soft-skip channel
($script:_StageSkippedReason) that workers can populate to surface
'ran, but the thing it was supposed to set up is not available' as
skipped=true with a reason in the JSON, without aborting the install
(Node is optional -- browser tools degrade gracefully, matches
Write-Completion's existing 'Note: Node.js could not be installed'
behavior). Reset before each stage so a prior reason can't leak.
2. The -Stage dispatch used 'if ($Stage)' which is falsy for empty
string, so 'install.ps1 -Stage ""' fell through to Main and silently
kicked off a full destructive install. Switch to
PSBoundParameters.ContainsKey('Stage') so an explicit empty value
surfaces as unknown-stage exit 2 with a structured JSON frame, the
way every other bad stage name does.
Address the two cosmetic items from review:
- Completion banner middle line was 62 chars vs 59-char top/bottom borders
(replacing the 1-char checkmark with [OK] added width that wasn't
reflected in the trailing whitespace). Drop 3 trailing spaces.
- Smoke test file had a single em-dash in a comment -- the only
non-ASCII byte across both files. Replace with -- for consistency
with install.ps1's pure-ASCII goal.
Three issues flagged by the Copilot review on this PR:
1. Double JSON emit on stage failure (Copilot #1, #2). When -Stage <name>
ran a worker that threw, Invoke-Stage's finally emitted a JSON result
frame AND the entry-point catch emitted a second error frame --
producing two concatenated JSON objects on stdout and breaking the
one-line-per-invocation contract that drivers parse against. Same
issue applied to -Json mode on a full install (every stage's finally
plus a final error frame missing duration_ms/skipped).
Fix: Invoke-Stage's finally now sets $script:_StageEmittedErrorFrame
when it emits a failure frame; the entry-point catch checks the flag
and skips its own emit, still exit 1.
2. $prevEAP uninitialized on early try-block throw (Copilot #3). In
Install-Uv, Test-Python, Test-Node's winget fallback,
_Run-NpmInstall, and the playwright block, '$prevEAP =
$ErrorActionPreference' lived as the first statement INSIDE the
try. If anything between 'try {' and that line threw (Write-Info on
an unusual host, the npx-finding loop, etc.), the catch's
'if ($prevEAP) { ... }' restore was a no-op and EAP could remain
relaxed.
Fix: hoist '$prevEAP = $ErrorActionPreference' to the line
immediately before 'try {' in all five sites. Catch's restore is
now always meaningful regardless of where in the try the throw
originated.
No change to Invoke-Stage's success path or to the four lint-clean EAP
sites (Test-Node was the only winget-related catch). All 19 metadata
smoke tests still pass.
Adds an opt-in stage protocol that lets programmatic drivers (the
desktop GUI's onboarding wizard, CI, future install.sh parity) drive
install.ps1 one step at a time with structured JSON results. Default
invocation (`irm | iex` one-liner) behaves unchanged.
Entry points:
install.ps1 Today's interactive install (unchanged)
install.ps1 -ProtocolVersion Emit protocol version integer
install.ps1 -Manifest Emit JSON manifest of available stages
install.ps1 -Stage <name> Run one stage, emit JSON result
install.ps1 -NonInteractive Suppress Read-Host prompts (skips the
setup wizard and gateway autostart)
install.ps1 -Json Machine-readable completion frame
Manifest exposes 14 stages across prereqs/install/finalize/post-install
categories, with 2 (configure, gateway) flagged needs_user_input=true
so GUI drivers can skip them and handle the equivalent UX themselves.
Along the way, clean-VM testing on stock Windows 10/11 surfaced a
series of latent install.ps1 bugs that were never exercised by
developer machines. Fixed in the same commit:
* Encoding: file is now pure ASCII with no BOM. Windows PowerShell
5.1 reads BOM-less files as Windows-1252 and chokes on em-dashes
(and other UTF-8 sequences), while iex chokes on a leading U+FEFF.
Pure-ASCII satisfies both invocation paths.
* EAP=Stop + native `2>&1` captures: PowerShell wraps stderr lines
from native commands as ErrorRecord objects under EAP=Stop and
throws even when the command exits 0. Relaxed to EAP=Continue
around the astral.sh uv installer, `uv python install`, `npm
install`, `npx playwright install`, the venv import probes, and
the Node winget fallback. Check $LASTEXITCODE for the real signal.
* Cross-process state: each `-Stage <name>` invocation spawns a
fresh powershell child. $script:UvCmd set by Stage-Uv was invisible
to Stage-Python; PATH updated by Stage-Git/Stage-Node was invisible
to subsequent stages spawned by the driver shell. Added Resolve-UvCmd
helper called at the top of every stage that needs uv, and a
Sync-EnvPath helper called at the top of Invoke-Stage to refresh
PATH from the registry.
* UAC avoidance: `winget install OpenJS.NodeJS.LTS` triggers a UAC
prompt that often appears minimized in the taskbar -- looks like a
hang. Switched Test-Node to prefer the official portable Node zip
dropped into %LOCALAPPDATA%\hermes\node\ (mirrors the PortableGit
pattern Install-Git already uses). winget kept as fallback.
* npx hangs on confirmation: `npx playwright install chromium` blocks
on stdin waiting for "Need to install playwright@X.Y.Z (y/N)" when
playwright isn't in local node_modules. Tee-Object pipelines
disconnect stdin from the user's TTY so the install hangs forever.
Pass `--yes` to auto-accept.
* Silent long-running installs: `*> $logPath` redirected every stream
to disk and left the user staring at a frozen "Installing..." line
for the 5-10 minutes Playwright Chromium takes to download. Switched
to `2>&1 | ForEach-Object { "$_" } | Tee-Object -FilePath $log` so
output streams live to the console AND captures to log for failure
diagnostics. ForEach-Object coercion strips PowerShell's red
NativeCommandError formatter from stderr items.
* Console encoding: forced [Console]::OutputEncoding to UTF-8 so
playwright/git/npm progress bars, box-drawing, and check marks render
correctly instead of as IBM437/Windows-1252 mojibake.
* Performance: set $ProgressPreference = "SilentlyContinue" so
Invoke-WebRequest doesn't paint its per-chunk progress bar. The
PS 5.1 progress UI throttles downloads by 10-100x (a 57MB PortableGit
grab takes 5 minutes with the bar on vs ~20 seconds with it off,
same network). Affects PortableGit, Node portable zip, and the
Hermes repo zip fallback.
Tests: scripts/tests/test-install-ps1-stage-protocol.ps1 provides 19
metadata-only assertions covering -ProtocolVersion, -Manifest schema,
and unknown -Stage error frame. No install side effects.
End-to-end validated on a clean Windows 10 VM via:
1. `irm <branch>/scripts/install.ps1 | iex` (canonical CLI path)
2. `powershell -File install.ps1 -Stage X` iterated through every
stage (GUI driver path, exercises cross-process fixes)
Closes#26924 (and supersedes #26926) in spirit.
DeepSeek was missing `default_aux_model` on its `ProviderProfile`, so
`_get_aux_model_for_provider("deepseek")` returned an empty string and
the compression / vision / session-search paths emitted
"No auxiliary LLM provider configured -- context compression will
drop middle turns without a summary."
on every DeepSeek session, even when the user had perfectly working
DeepSeek credentials.
Fix lands at the profile layer rather than the legacy
`_API_KEY_PROVIDER_AUX_MODELS_FALLBACK` dict the original PR targeted.
Every modern provider (gemini, zai, minimax, anthropic, kimi-coding,
stepfun, ollama-cloud, gmi, novita, kilocode, ai-gateway, opencode-zen)
sets `default_aux_model` on its `ProviderProfile`; the fallback dict
only exists for providers that predate the profiles system.
Tests added under `tests/plugins/model_providers/test_deepseek_profile.py`:
- `test_profile_advertises_deepseek_chat` -- pins the profile attribute
- `test_consumer_api_returns_deepseek_chat` -- pins the consumer API behavior
- `test_consumer_api_returns_non_empty` -- regression guard for the
symptom in the issue
Original diagnosis and aux-model choice from @kriscolab in PR #26926;
moved one layer up.
Co-authored-by: kriscolab <71590782+kriscolab@users.noreply.github.com>
previously only checked provider ID and
base URL. When kimi-k2.6 is served via ollama-cloud (or any third-party
provider), provider is not 'kimi-coding' and base URL is not
api.kimi.com — so reasoning_content pad was never injected. This caused
HTTP 400 from Ollama Cloud's Go backend: 'invalid message content type:
map[string]interface {}'.
Fix: add model-name detection ('kimi' in model.lower()) so any route
serving a kimi model gets the required reasoning_content echo-back.
Refs the 400/401 Telegram errors where kimi-k2.6 via ollama-cloud
consistently failed after tool-call turns.
The install_open_webui function correctly resolved the python interpreter into the $py variable, but hardcoded 'python' in subsequent pip install commands. This caused 'command not found' or 'externally-managed-environment' errors on systems where 'python' is not implicitly aliased to 'python3'.
The gateway already accepts plain-text config files (.ini, .cfg) and
structured formats (.json, .yaml, .toml) as documents, but not common
source-file extensions. Sending a .ts/.py/.sh file currently requires
renaming it to .txt first.
Adds .ts, .py, .sh as text/plain, consistent with the existing
.ini/.cfg entries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds logger.info when large pastes are collapsed to file
references in both paste-code paths (handle_paste and
_on_text_changed). Logs paste ID, line count, character
count, and file path so operators can correlate missing-
content reports with specific paste files. This is a
diagnostic aid, not a fix for the paste-drop issue.
Add test_returns_none_when_skill_load_fails to verify that
build_skill_invocation_message() returns None when a registered
skill exists in the command cache but _load_skill_payload() fails.
This guards against regression of the fix in 877d01b.
build_skill_invocation_message() returns a non-empty placeholder string
('[Failed to load skill: ...]') when the skill exists in the command cache
but loading the actual SKILL.md payload fails. CLI/gateway callers treat
any truthy return value as success, so the failure is silently routed into
the model as if it were a valid skill prompt.
Return None instead, matching the existing behavior for unknown commands,
so callers using 'if msg:' can properly detect the failure.
- Remove unused from tools/tts_tool.py (dead code)
- Move _BUILTIN_DELIVER_PLATFORMS set from send() method to module
scope in gateway/platforms/webhook.py to avoid reallocation on
every call
hermes_cli/gateway.py:3702 referenced logger.debug() but 'logger' was
never defined in the module, causing a NameError at runtime if the
try/except around discover_plugins() caught an exception.
Added import logging and logger = logging.getLogger(__name__)
at module level to resolve the undefined name.
Both links were merged from low-risk batch salvage but on review they're
brand-new single-commit personal repos with zero stars/forks and no
track record. README links from us implicitly endorse community
projects; the Community section should have a minimum activity bar
before we link to a repo, not just "the contributor opened a PR."
MemPalace in particular wraps an in-process memory provider, so a
README endorsement carries more risk than a typical docs link.
Consume multi-byte non-CSI ESC sequences during ANSI sanitization and handle UnicodeDecodeError for `hermes send --file` so review findings are resolved without regressions.
Strip incomplete CSI prefixes before rendering, remove carriage returns from sanitized output, and add regression tests to prevent escape-sequence recomposition across message boundaries.
Avoid Terminal.app paint corruption by disabling fast-echo in that terminal, sanitizing non-SGR control sequences before ANSI rendering, and defaulting Apple Terminal back to the safer 256-color path unless truecolor is explicitly requested.
Pass skip_memory=True to the AIAgent constructor used by
_spawn_background_review() so the review fork's __init__ no longer
rebuilds a _memory_manager wired to honcho / mem0 / supermemory /
etc. under the parent's session_id.
Before this change, the review fork ingested its harness prompt
(the 'Review the conversation above and update the skill library...'
text) into the user's real memory namespace via three sites in
run_conversation():
- on_turn_start(turn_count, prompt) cadence + turn-message
- prefetch_all(prompt) recall query
- sync_all(prompt, review_output, ...) harness + review output
recorded as a
(user, assistant) pair
Built-in MEMORY.md / USER.md state is still rebound from the parent
right after construction, so memory(action='add') writes from the
review continue to land on disk; only the external-plugin side
effects are removed.
Reported by @Utku.
The same root cause as the auxiliary compression fix (commit 7becb19):
get_model_context_length() is called without custom_providers, so per-model
context_length overrides are silently skipped. The fallback activation path
(_try_activate_fallback) had the same missing parameter.
When the agent switches to a fallback provider, the fallback model would use
the models.dev value (e.g. 204800 for NVIDIA NIM minimax-m2.7) instead of
the user-configured one in custom_providers (e.g. 196608) — a subtle
discrepancy that could cause the fallback model to run with an incorrect
context window, leading to truncated messages or failed API requests when
the model does not support the detected length.
Fix: pass self._custom_providers to get_model_context_length() so the
fallback path sees the same per-model overrides as the main model path.