Cover the deprecation pattern matching against real gh-copilot stderr
output, verify the GitHub Models Azure URL is in _URL_TO_PROVIDER, and
confirm _is_github_models_base_url recognises the Azure endpoint.
Address two blocking issues when using GitHub Copilot integrations:
1. ACP mode: detect the gh-copilot CLI deprecation error from stderr
and surface an actionable message with alternatives instead of
hanging or showing a cryptic error.
2. GitHub Models (Azure) 413: recognize models.inference.ai.azure.com
as a known GitHub Models URL, and print a targeted hint explaining
the hard 8K token limit that makes this endpoint incompatible with
Hermes' system prompt size.
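
A minimal sketch of the two checks described above, not the shipped code. The regex wording and the public helper names are assumptions for illustration (the real gh-copilot stderr text may differ); only the Azure host comes from this message.

```python
import re
from urllib.parse import urlparse

# Hypothetical pattern: the actual deprecation text on stderr may be worded differently.
_DEPRECATION_RE = re.compile(r"gh[- ]copilot.*\bdeprecated\b", re.IGNORECASE | re.DOTALL)

# Host taken from the commit message; any other GitHub Models hosts are not listed here.
_GITHUB_MODELS_HOSTS = {"models.inference.ai.azure.com"}


def looks_like_copilot_deprecation(stderr_text: str) -> bool:
    """True when stderr appears to carry the gh-copilot deprecation notice."""
    return bool(_DEPRECATION_RE.search(stderr_text))


def is_github_models_base_url(base_url: str) -> bool:
    """Match on hostname only, so any path under the endpoint is recognised."""
    host = (urlparse(base_url).hostname or "").lower()
    return host in _GITHUB_MODELS_HOSTS
```

Matching on the parsed hostname rather than a substring avoids false positives from URLs that merely mention the host in a path or query string.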
Fixes #26693
`hermes doctor` currently promotes invalid direct API keys into the final
summary even when the matching OAuth path is already healthy. That makes
the setup look more broken than it really is.
This change keeps the failed API Connectivity row visible but stops
treating it as a blocking summary issue when a healthy OAuth fallback
already exists for the same provider family.
Covered cases:
- Gemini OAuth + invalid direct Gemini key
- MiniMax OAuth + invalid direct MiniMax key
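
The summary rule can be sketched roughly as follows. `CheckResult` and `blocking_issues` are hypothetical names, and the real `hermes doctor` data model is likely richer; this only illustrates the "healthy OAuth suppresses the failed key" decision.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    provider_family: str  # e.g. "gemini", "minimax"
    auth_kind: str        # "oauth" or "api_key"
    healthy: bool


def blocking_issues(results: list[CheckResult]) -> list[CheckResult]:
    """Return the failed checks that should be promoted into the final summary."""
    # Provider families that already have a working OAuth path.
    healthy_oauth = {
        r.provider_family for r in results if r.auth_kind == "oauth" and r.healthy
    }
    return [
        r
        for r in results
        if not r.healthy
        # A bad direct key is not blocking when OAuth for the same family works.
        and not (r.auth_kind == "api_key" and r.provider_family in healthy_oauth)
    ]
```

The failed row itself stays in `results`, so it can still be rendered in the API Connectivity table.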
Based on #26704 by @worlldz.
* feat(skills): add osint-investigation optional skill (closes #355)
Phase-1 public-records OSINT investigation framework adapted from
ShinMegamiBoson/OpenPlanter (MIT). Lives in optional-skills/research/.
Six data-source wiki entries (FEC, SEC EDGAR, USAspending, Senate LD,
OFAC SDN, ICIJ Offshore Leaks), each following the 9-section template:
summary, access, schema, coverage, cross-reference keys, data quality,
acquisition, legal, references.
Six stdlib-only acquisition scripts that emit normalized CSV, plus three
analysis scripts:
- entity_resolution.py — three-tier match (exact / fuzzy / token overlap)
with explicit confidence per row
- timing_analysis.py — permutation test for donation/contract timing
correlation, joins through cross-links
- build_findings.py — assembles structured findings.json with
evidence chains pointing back to source rows
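
As an illustration of the three-tier idea in entity_resolution.py, here is a stdlib-only sketch. The thresholds, the normalization, and the confidence values are assumptions; the actual script may score matches differently.

```python
import re
from difflib import SequenceMatcher
from typing import Optional, Tuple


def _normalize(name: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def match_entities(
    a: str, b: str, fuzzy_threshold: float = 0.9, overlap_threshold: float = 0.5
) -> Optional[Tuple[str, float]]:
    """Return (tier, confidence), trying exact, then fuzzy, then token overlap."""
    na, nb = _normalize(a), _normalize(b)
    if not na or not nb:
        return None
    if na == nb:
        return ("exact", 1.0)
    ratio = SequenceMatcher(None, na, nb).ratio()
    if ratio >= fuzzy_threshold:
        return ("fuzzy", round(ratio, 3))
    ta, tb = set(na.split()), set(nb.split())
    overlap = len(ta & tb) / len(ta | tb)  # Jaccard similarity on tokens
    if overlap >= overlap_threshold:
        return ("token_overlap", round(overlap, 3))
    return None
```

Trying the tiers in order of strictness means each row carries the strongest evidence available, which is what makes a per-row confidence meaningful.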
Validation: full pipeline runs end-to-end on synthetic fixtures. Entity
resolution found 24 cross-matches with 0 false positives on a 5-row /
4-row test set. Timing analysis on 5 donations clustered near 3 awards
returned p=0.000, effect size 2.41 SD. Findings JSON correctly tags
HIGH-severity timing pattern. All 9 scripts pass --help and py_compile.
Docs site page auto-generated by website/scripts/generate-skill-docs.py;
sidebar + catalog entries updated by the same generator.
* fix(osint-investigation): live API fixes from end-to-end sweep
Live-tested the skill on a real public-citizen query and found three bugs
the synthetic E2E missed. All three are now fixed and re-verified.
1. FEC fetch hung on contributor name searches.
The combination of two_year_transaction_period + sort=date +
contributor_name puts the OpenFEC query plan on a slow path, and the
upstream gateway times out (25s+). Switched to min_date/max_date with no
explicit sort. Renamed --candidate to --contributor (the original name
was misleading: FEC searches by donor, not by candidate; --candidate is
kept as a deprecated alias). Added --state filter for narrowing.
2. ICIJ Offshore Leaks reconcile endpoint returns 404.
ICIJ removed the Open Refine reconciliation API. Rewrote
fetch_icij_offshore.py to download the official bulk CSV ZIP (~70 MB,
public, no auth) and search it locally. Cached under
$HERMES_OSINT_CACHE/icij/ (default ~/.cache/hermes-osint/icij/) for
30 days, --force-refresh to refetch. Verified live: 'PUTIN' query
returns 5 Panama Papers officer matches in 0.5s after first download.
3. SEC EDGAR silently returned 0 when the company-name resolver matched
an individual Form 3/4/5 filer (insider trading disclosures).
Now surfaces 'Resolved company X → CIK Y (Z)' on stderr, prints a
filing-type histogram when the type filter wipes results, and
explicitly warns when the matched CIK appears to be an individual
filer rather than a corporate registrant.
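
The ICIJ bulk-cache behaviour from item 2 could look roughly like this. The function names and the mtime-based freshness check are assumptions; only the environment variable, the default directory, and the 30-day window come from the text above.

```python
import os
import time
from pathlib import Path
from typing import Mapping, Optional

CACHE_TTL_SECONDS = 30 * 24 * 3600  # 30 days


def icij_cache_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve $HERMES_OSINT_CACHE/icij/, defaulting to ~/.cache/hermes-osint/icij/."""
    root = env.get("HERMES_OSINT_CACHE") or str(Path.home() / ".cache" / "hermes-osint")
    return Path(root) / "icij"


def needs_refresh(path: Path, now: Optional[float] = None, force: bool = False) -> bool:
    """True when the cached ZIP is missing, older than the TTL, or --force-refresh is set."""
    if force or not path.exists():
        return True
    age = (now if now is not None else time.time()) - path.stat().st_mtime
    return age > CACHE_TTL_SECONDS
```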
Bonus: _http.py was retrying 429 responses with exponential backoff plus
honoring (often-missing) Retry-After headers, which compounded into
multi-second hangs per page when the upstream key was over quota.
Changed to fail-fast on 429 with a clear, actionable error showing the
upstream's quota message. Verified: 0.3s fast-fail vs the previous 60s
hang on DEMO_KEY rate-limit exhaustion.
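
A hedged sketch of the fail-fast policy, assuming a plain urllib client. The helper names and the set of retryable statuses are illustrative and are not taken from _http.py.

```python
import time
import urllib.error
import urllib.request

# Assumed set of transient statuses; 429 is deliberately absent.
RETRYABLE = {500, 502, 503, 504}


def should_retry(status: int) -> bool:
    return status in RETRYABLE


def quota_error(url: str, body: str) -> str:
    """Build an actionable message that echoes the upstream quota text."""
    snippet = body.strip()[:300] or "no detail from upstream"
    return f"HTTP 429 from {url}: {snippet}. Quota exhausted; not retrying."


def fetch(url: str, retries: int = 3, timeout: float = 30.0) -> bytes:
    for attempt in range(retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code == 429:
                # Fail fast: backing off cannot help once the key is over quota.
                body = err.read().decode("utf-8", "replace")
                raise RuntimeError(quota_error(url, body)) from err
            if not should_retry(err.code) or attempt == retries:
                raise
            time.sleep(2 ** attempt)  # back off only for transient 5xx errors
    raise AssertionError("unreachable")
```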
Updated SKILL.md, fec.md, and icij-offshore.md to match the new CLI
flags and ICIJ bulk-cache flow. Regenerated the docusaurus page via
website/scripts/generate-skill-docs.py.
Live sweep results across all 6 sources for 'Dillon Rolnick, New York':
- OFAC SDN: 0 matches ✓ (correctly not sanctioned)
- USAspending: 0 matches ✓ (correctly not a federal contractor)
- Senate LDA: 0 matches ✓ (correctly not a lobbying client)
- SEC EDGAR: warns it resolved to 'Rolnick Michael' (CIK 0001845264)
who is an individual Form 3 filer, not a corporate registrant
- ICIJ: 0 matches ✓ (correctly not in any offshore leak)
- FEC: rate-limited (DEMO_KEY); fails fast with clear quota message
* feat(osint-investigation): expand to 12 sources covering identity, property, courts, archives, news
Phase-2 expansion per Teknium feedback that the original 6-source skill
(federal financial/regulatory only) wasn't a complete OSINT toolkit. Adds
6 more sources covering the major omissions a real investigation would
reach for first.
New sources (6 fetch scripts + 6 wiki entries):
1. NYC ACRIS — Real property records (deeds, mortgages, liens) via the
city's Socrata API. Search by party name or property address. Joins
Parties to Master to populate doc_type, dates, borough, and amount.
Coverage: 5 NYC boroughs, ~70M party records, 1966-present.
2. OpenCorporates — Global corporate registry covering 130+ jurisdictions
(~200M companies). Free API token at
https://opencorporates.com/api_accounts/new raises the rate limit;
HTML fallback works without one (limited fields).
3. CourtListener (Free Law Project) — federal + state court opinions
(~10M back to colonial era) + PACER dockets via RECAP. Anonymous v4
search works; COURTLISTENER_TOKEN raises rate limits.
4. Wayback Machine CDX — historical web captures (~900B+). Used both for
surveillance-of-record (when did this site change?) and as a
content-recovery layer when other sources point to dead URLs.
5. Wikipedia + Wikidata — narrative bio + structured facts. Wikipedia
OpenSearch for article matching, REST summary for extracts, Wikidata
Action API (wbgetentities) for claims. Avoids the SPARQL Query
Service which is aggressively rate-limited.
6. GDELT 2.0 DOC API — global news monitoring in 100+ languages,
~2015-present. Auto-retries with 6s backoff on the standard
1-req-per-5-sec throttle.
Other changes in this commit:
- SEC EDGAR no longer raises SystemExit when the company-name resolver
finds no CIK; writes an empty CSV with header so the rest of a
pipeline can keep moving and the warning is just on stderr.
- _http.py User-Agent updated per Wikimedia policy: includes app name,
version, and a 'set HERMES_OSINT_UA to identify yourself' instruction.
- SKILL.md workflow now groups sources into two clusters (federal
financial vs identity/property/courts/archives/news) with bash
examples for each. 'When to use this skill' lists the broader set of
investigation patterns the expanded sources unlock.
Live sweep results on 'Dillon Rolnick, New York' across all 12 sources:
ofac ✓ 0 (correctly clean)
icij ✓ 0 (correctly not in any leak)
usaspending ✓ 0 (correctly not a federal contractor)
senate_lda ✓ 0 (correctly not a lobbying client)
sec_edgar ✓ 0, warns: resolved to 'Rolnick Michael' (CIK 0001845264),
individual Form 3 filer, NOT a corporate registrant
fec — rate-limited (DEMO_KEY exhausted), fails fast with
clear quota message
nyc_acris ✓ 200 records named Rolnick across NYC; 48 records at
571 Hudson (the property the web identifies as his)
opencorporates ✓ 0 (no API token configured; HTML fallback)
courtlistener ✓ 0 for 'Dillon Rolnick'; 20 for 'Rolnick' generally;
5 for 'Microsoft' sanity check
wayback ✓ 30 captures of nousresearch.com from 2011-present
wikipedia ✓ 0 (correctly not notable enough); Bill Gates sanity
returns full structured facts (occupation, employer,
DOB, place of birth, country)
gdelt ✓ 0 for 'Dillon Rolnick'; 5 for 'Nous Research'
All 17 scripts compile clean and pass --help. Synthetic analysis pipeline
regression still passes (entity_resolution 30 matches, timing p=0.000,
findings 2).
* feat(osint-investigation): remove FEC; DEMO_KEY rate-limits make it unreliable
The FEC fetcher consistently failed the live sweep because the OpenFEC
DEMO_KEY tier (40 calls/hour) exhausts on a single investigation, and
the upstream returns slow-path query plans for unindexed contributor-name
searches that the gateway times out. Without a real API key it's not
usable; with one the user has to sign up at api.data.gov first. That's
too much setup friction for a skill that should work out of the box.
Removed:
- scripts/fetch_fec.py
- references/sources/fec.md
Updated:
- SKILL.md frontmatter description + tags
- 'When NOT to use' now points users at https://www.fec.gov/data/ for
federal donations
- entity_resolution example switched from donor↔contractor to
lobbying-client↔contractor (Senate LDA + USAspending pair)
- timing_analysis example switched to lobbying-filings vs awards
- 8 wiki entries had their 'FEC ↔ ...' cross-reference bullets removed
11 sources remain (5 federal financial + 6 identity/property/courts/
archives/news). All scripts compile, pass --help, and the synthetic
analysis pipeline still passes on the new lobbying-shaped regression
fixture (30 matches, p=0.000 on tight clustering, 2 findings).
Closes #10695. Picks up the still-vulnerable Python pins on current main:
- aiohttp 3.13.3 -> 3.13.4 (messaging, slack, homeassistant, sms extras +
lazy_deps platform.slack) — CVE-2026-34513 (DNS cache exhaustion),
CVE-2026-34518 (cookie/proxy-auth leak on cross-origin redirect, relevant
for the gateway since it handles OAuth tokens), CVE-2026-34519 (response
reason injection), CVE-2026-34520 (null bytes in headers), CVE-2026-34525
(multiple Host headers).
- anthropic 0.86.0 -> 0.87.0 (anthropic extra + lazy_deps provider.anthropic)
— CVE-2026-34450 (memory tool files created mode 0o666),
CVE-2026-34452 (path-traversal in async local-filesystem memory tool).
Not directly exploitable since hermes-agent doesn't use the SDK's
filesystem memory tool, but the SDK is bumped for hygiene.
- cryptography pinned explicitly at 46.0.7 in core dependencies —
CVE-2026-39892 (buffer overflow on non-contiguous buffers). Previously
came in transitively via PyJWT[crypto]; the explicit floor keeps the
WeCom/Weixin crypto paths from drifting below the fix.
curl-cffi from the original issue is no longer in pyproject.toml or uv.lock,
so no action needed there.
uv.lock regenerated cleanly; only aiohttp / anthropic / cryptography moved.
Credit: original issue + scoping by @shaun0927 (#10695, #10701).
Floor analysis and packaging-surface audit by @gnanirahulnutakki (#10784),
adapted to current main's exact-pin style.
Co-authored-by: shaun0927 <shaun0927@users.noreply.github.com>
Co-authored-by: Gnani Rahul Nutakki <gnanirahulnutakki@users.noreply.github.com>
Port three hardening patches from Claude Code 2.1.113's expanded deny
rules to hermes' detect_dangerous_command() pattern list.
1. macOS /private/{etc,var,tmp,home} system paths
/etc, /var, /tmp, /home are symlinks to /private/<name> on macOS.
A write to /private/etc/sudoers works identically to /etc/sudoers
but bypassed the plain /etc/ pattern check. Extracted a shared
_SYSTEM_CONFIG_PATH fragment so /etc/ and the /private/ mirror
stay in sync across redirect / tee / cp / mv / install / sed -i
patterns.
2. killall -9 / -KILL / -SIGKILL / -s KILL / -r <regex>
Parallel to the existing pkill -9 pattern. killall -9 against
non-hermes processes was previously unprotected, and killall -r
can sweep unrelated processes matching a regex.
3. find -execdir rm
Same destructive effect as find -exec rm, but runs in each match's
directory. The previous pattern required a literal '-exec ' so
-execdir slipped through.
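
To make the three hardening rules concrete, here is a simplified sketch. The regexes are illustrative approximations, cover only /etc and its /private mirror, and are not the patterns shipped in detect_dangerous_command(); `detect` is a hypothetical name.

```python
import re
from typing import Optional

# Shared fragment so /etc/ and its macOS /private/ mirror stay in sync.
_SYSTEM_CONFIG_PATH = r"(?:/private)?/etc/"

_PATTERNS = [
    (re.compile(rf">>?\s*{_SYSTEM_CONFIG_PATH}"), "redirect into system config"),
    (re.compile(rf"\btee\b[^|;&]*\s{_SYSTEM_CONFIG_PATH}"), "tee into system config"),
    (
        re.compile(r"\bkillall\s+(?:-9|-KILL|-SIGKILL|-s\s+KILL|-r)\b"),
        "killall with a kill signal or regex",
    ),
    # (?:dir)? lets the same rule cover both -exec and -execdir.
    (re.compile(r"\bfind\b.*\s-exec(?:dir)?\s+rm\b"), "find -exec/-execdir rm"),
]


def detect(command: str) -> Optional[str]:
    """Return a short reason when the command matches a dangerous pattern."""
    for pattern, reason in _PATTERNS:
        if pattern.search(command):
            return reason
    return None
```

Building every path-based pattern from one fragment is the point of the refactor: adding a new mirror means editing a single string rather than six patterns.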
Guarded by 32 new test cases in 4 test classes:
- TestMacOSPrivateSystemPaths (11 cases)
- TestKillallKillSignals (9 cases)
- TestFindExecdir (4 cases)
- TestEtcPatternsUnaffectedByRefactor (6 regression guards on
the existing /etc/ coverage after the _SYSTEM_CONFIG_PATH refactor)
Inspiration: https://github.com/anthropics/claude-code/releases
(Claude Code 2.1.113, April 17 2026 - "Enhanced deny rules" and
"Dangerous path protection")
Port from openai/codex#17667: MCP servers can now opt-in to parallel
tool execution by setting supports_parallel_tool_calls: true in their
config. This allows tools from the same server to run concurrently
within a single tool-call batch, matching the behavior already available
for built-in tools like web_search and read_file.
Previously all MCP tools were forced sequential because they weren't in
the _PARALLEL_SAFE_TOOLS set. Now _should_parallelize_tool_batch checks
is_mcp_tool_parallel_safe() which looks up the server's config flag.
Config example:
mcp_servers:
  docs:
    command: "docs-server"
    supports_parallel_tool_calls: true
Changes:
- tools/mcp_tool.py: Track parallel-safe servers in _parallel_safe_servers
set, populated during register_mcp_servers(). Add is_mcp_tool_parallel_safe()
public API.
- run_agent.py: Add _is_mcp_tool_parallel_safe() lazy-import wrapper. Update
_should_parallelize_tool_batch() to check MCP tools against server config.
- 11 new tests covering the feature end-to-end.
- Updated MCP docs and config reference.
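
The lookup can be sketched as below. The `mcp_<server>_<tool>` naming scheme is an assumption made for this example, and the functions are simplified stand-ins for the ones named above.

```python
from typing import Iterable

_PARALLEL_SAFE_TOOLS = {"web_search", "read_file"}
_parallel_safe_servers: set[str] = set()


def register_mcp_servers(config: dict) -> None:
    """Record which servers opted in via supports_parallel_tool_calls: true."""
    _parallel_safe_servers.clear()
    for name, server in config.get("mcp_servers", {}).items():
        if server.get("supports_parallel_tool_calls") is True:
            _parallel_safe_servers.add(name)


def is_mcp_tool_parallel_safe(tool_name: str) -> bool:
    # Assumed naming scheme: "mcp_<server>_<tool>".
    return any(tool_name.startswith(f"mcp_{s}_") for s in _parallel_safe_servers)


def should_parallelize_tool_batch(tool_names: Iterable[str]) -> bool:
    """Parallelize only when every tool in the batch is known to be safe."""
    names = list(tool_names)
    return len(names) > 1 and all(
        n in _PARALLEL_SAFE_TOOLS or is_mcp_tool_parallel_safe(n) for n in names
    )
```

Because the flag defaults to absent, a server that never sets it keeps the previous sequential behaviour.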
Subagent delegation hardcoded api_mode='chat_completions' for any
delegation.base_url that didn't match three specific hostnames
(chatgpt.com, api.anthropic.com, api.kimi.com/coding), and never
read delegation.api_mode from config. Azure AI Foundry's
https://foundry.services.ai.azure.com/anthropic endpoint fell through
and got chat_completions, causing 404s on every delegate_task call.
The main agent already handles this correctly via the shared
_detect_api_mode_for_url() helper (anything ending in /anthropic →
anthropic_messages); delegation reimplemented its own narrower check.
Reuse the shared detector and honor an explicit delegation.api_mode
when set so users can also force the transport on non-standard
endpoints the URL heuristic can't classify.
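
A minimal sketch of the resolution order, assuming the delegation config is a plain dict. The detector below implements only the "/anthropic suffix" rule mentioned above; the real _detect_api_mode_for_url() presumably recognises more endpoints.

```python
from urllib.parse import urlparse


def detect_api_mode_for_url(base_url: str) -> str:
    """Simplified stand-in for the shared detector."""
    path = urlparse(base_url).path.rstrip("/")
    if path.endswith("/anthropic"):
        return "anthropic_messages"
    return "chat_completions"


def resolve_delegation_api_mode(delegation: dict) -> str:
    """An explicit delegation.api_mode wins; otherwise fall back to the URL heuristic."""
    explicit = delegation.get("api_mode")
    if explicit:
        return explicit
    return detect_api_mode_for_url(delegation.get("base_url", ""))
```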
Fixes #10213.
Co-authored-by: HiddenPuppy <HiddenPuppy@users.noreply.github.com>
* feat(x_search): gated X (Twitter) search tool with OAuth-or-API-key auth
Salvages tools/x_search_tool.py from the closed PR #10786 (originally by
@Jaaneek) and reworks its credential resolution so the tool registers
when EITHER xAI credential path is available:
* XAI_API_KEY (paid xAI API key) is set in ~/.hermes/.env or the env, OR
* The user is signed in via xAI Grok OAuth — SuperGrok subscription —
i.e. hermes auth add xai-oauth has been run
Both paths route through xAI's built-in x_search Responses tool at
https://api.x.ai/v1/responses. When both credentials exist OAuth wins,
matching tools/xai_http.py's existing preference order (uses SuperGrok
quota instead of paid API spend).
The check_fn calls resolve_xai_http_credentials() which auto-refreshes
the OAuth access token if it's within the refresh skew window, so a
True return means the bearer is fetchable AND non-empty.
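
The preference order can be sketched as follows. The function signature, the skew value, and the fallback to the API key when the OAuth token is empty are all assumptions; resolve_xai_http_credentials() may behave differently in those details.

```python
from typing import Callable, Mapping, Optional

REFRESH_SKEW_SECONDS = 120  # assumed value for illustration


def resolve_bearer(
    env: Mapping[str, str],
    oauth: Optional[dict],
    now: float,
    refresh: Callable[[dict], dict],
) -> Optional[str]:
    """Prefer the OAuth token (refreshing it near expiry), else XAI_API_KEY."""
    if oauth:
        if oauth["expires_at"] - now <= REFRESH_SKEW_SECONDS:
            oauth = refresh(oauth)  # token is inside the skew window
        token = oauth.get("access_token")
        if token:
            return token
    return env.get("XAI_API_KEY") or None


def check_fn(env, oauth, now, refresh) -> bool:
    """Gate the tool schema on a bearer that is fetchable and non-empty."""
    return resolve_bearer(env, oauth, now, refresh) is not None
```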
Wiring
- tools/x_search_tool.py — new tool, ~370 LOC. Schema gated by check_fn,
bearer resolved per-call so revoked OAuth surfaces a clean tool_error
rather than an HTTP 401.
- toolsets.py — "x_search" toolset def. NOT added to _HERMES_CORE_TOOLS;
users opt in via hermes tools.
- hermes_cli/tools_config.py — CONFIGURABLE_TOOLSETS entry + TOOL_CATEGORIES
block with two provider options (OAuth + API key) sharing the existing
xai_grok post_setup hook for credential bootstrap.
- hermes_cli/config.py — DEFAULT_CONFIG["x_search"] with model /
timeout_seconds / retries. Additive nested key; no version bump.
- tests/tools/test_x_search_tool.py — 13 tests covering HTTP shape,
handle validation, citation extraction, 4xx/5xx/timeout handling,
and the full credential-resolution matrix (OAuth-only, API-key-only,
both-set, neither-set, resolver-raises, config overrides, registry
registration).
- website/docs/guides/xai-grok-oauth.md — adds X Search to the
direct-to-xAI tools section with off-by-default note.
- website/docs/user-guide/features/tools.md — new row in the tools table.
Off by default — users enable via `hermes tools` → 🐦 X (Twitter) Search.
Schema only appears to the model when xAI credentials are configured.
Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>
* docs(x_search): add dedicated feature page + reference entries
- website/docs/user-guide/features/x-search.md (new) — full feature
walkthrough: authentication, enablement, configuration, parameters,
returned fields, example, troubleshooting, see-also links.
- website/docs/reference/tools-reference.md — new "x_search" toolset
section with parameter docs and credential gating note.
- website/docs/reference/toolsets-reference.md — new row in the
toolset catalog table.
- website/sidebars.ts — wires the new feature page under
Media & Web, after web-search.
---------
Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>
Adds _sanitize_tool_error() in model_tools and routes both error paths
through it: registry.dispatch's try/except (the primary path for tool
exceptions) and handle_function_call's outer except (defense in depth).
Stripping targets structural framing tokens that the model itself can
react to even though json.dumps already handles wire-layer escaping:
XML role tags (tool_call, function_call, result, response, output,
input, system, assistant, user), CDATA sections, and markdown code
fences. Caps message body at 2000 chars and wraps with [TOOL_ERROR]
prefix.
Defense-in-depth: a tool exception carrying '<tool_call>...' won't
break message framing (json escapes it), but the model still reads
those tokens and they nudge it toward role-confusion framing.
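
A rough sketch of such a sanitizer, using the tag list, length cap, and prefix stated above. The regexes are illustrative, and dropping whole CDATA sections (rather than only their delimiters) is an assumption about the intended behaviour.

```python
import re

_ROLE_TAGS = (
    "tool_call", "function_call", "result", "response", "output",
    "input", "system", "assistant", "user",
)
# Opening or closing role tags, with optional attributes.
_TAG_RE = re.compile(r"</?(?:%s)\b[^>]*>" % "|".join(_ROLE_TAGS), re.IGNORECASE)
_CDATA_RE = re.compile(r"<!\[CDATA\[.*?\]\]>", re.DOTALL)
_FENCE_RE = re.compile(r"`{3,}")  # markdown code-fence markers
_MAX_LEN = 2000


def sanitize_tool_error(message: str) -> str:
    """Strip structural framing tokens, cap the body, and add the prefix."""
    text = _CDATA_RE.sub("", message)
    text = _TAG_RE.sub("", text)
    text = _FENCE_RE.sub("", text)
    return "[TOOL_ERROR] " + text.strip()[:_MAX_LEN]
```

The `\b` after the tag name keeps unrelated identifiers such as `<username>` intact.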
Ported from ironclaw#1639 (one piece of #3838's three-feature scout).
The truncated-tool-call (#1632) and empty-response-recovery (#1677,
#1720) pieces are skipped because main now implements both far more
thoroughly (run_agent.py L8147/L12209/L13012 for truncation retry +
length rewrite; L4500/L15090+ for empty-response scaffolding stripper,
multi-stage nudge, fallback model activation).
* fix(tui): keep Ink displayCursor in sync with fast-echo writes so cursor stops drifting
TextInput's fast-echo bypass writes characters directly to stdout to
avoid waiting on a React re-render for each keystroke. The hardware
cursor advances by text.length cells, but Ink's cached `displayCursor`
(the basis for the next frame's relative cursor-move preamble in
log-update) stayed unchanged. When ANY unrelated component re-rendered
between the fast-echo write and the deferred composer setCur/setParent
flush — status bar timer, streaming reasoning, etc. — the next frame's
preamble emitted a relative cursor move from a stale parked position
and the hardware cursor parked N cells offset from the actual caret.
Visible symptom: extra whitespace between the just-typed character and
the cursor block, intermittent, worse on long sessions during streaming.
Alt-screen was immune because frames begin with absolute CSI H.
This adds a small API in @hermes/ink:
- `Ink.noteExternalCursorAdvance(dx, dy?)` — bumps displayCursor if
set, otherwise seeds from frontFrame.cursor so the next preamble's
relative move correctly cancels the external advance. No-op on
alt-screen.
- `CursorAdvanceContext` + `useCursorAdvance()` hook to expose it.
TextInput then calls `noteCursorAdvance(text.length)` after the
fast-echo `stdout.write(text)` append, and `noteCursorAdvance(-1)`
after the fast-backspace `\b \b` sequence.
Tests: 4 new vitest cases pin the API contract (bumps when set, seeds
from frontFrame.cursor when null, alt-screen no-op, zero-delta no-op).
All 751 ui-tui tests pass; tests/test_tui_gateway_server.py (177) pass.
* fix(tui): also advance cursorDeclaration so fast-echo survives deferred React state
Copilot review on PR #26717 flagged a gap in the original fix:
TextInput's fast-echo path defers the React `cur` state update by
16ms (perf optimization that batches re-renders during heavy typing).
Inside that window, `useDeclaredCursor` still publishes a target
computed from the PRE-keystroke `cur` — `cursorLayout(display, cur,
columns)`. Advancing only `displayCursor` would let any unrelated
re-render in that 16ms window run onRender's cursor-park branch with
the stale declaration and visually undo the fast-echo's advance.
The fix is symmetric: `noteExternalCursorAdvance` now bumps BOTH
`displayCursor` (the log-update relative-move basis) AND, if non-null,
`cursorDeclaration.relativeX/Y` (the target the cursor parks at after
every frame). When React finally flushes `setCur`, `useDeclaredCursor`
publishes a fresh declaration that supersedes our bumped one — exactly
what we want.
Adds two new vitest cases covering both halves:
- active declaration advances in lock-step with displayCursor
- null declaration stays null (no spurious bump)
All 753 ui-tui tests pass; tests/test_tui_gateway_server.py (177) pass.
Closes review threads:
PRRT_kwDOPRF1G86ChKtD (textInput.tsx:1016 fast-echo append)
PRRT_kwDOPRF1G86ChKtF (textInput.tsx:924 fast-backspace)
PRRT_kwDOPRF1G86ChKtG (ink-cursor-advance.test.ts:57 missing coverage)
* fix(tui): make fast-echo survive TextInput rerenders + alt-screen (Copilot round 2)
Round 2 of PR #26717 review. Three real holes Copilot flagged after the
initial cursorDeclaration bump:
1. alt-screen early-return skipped BOTH halves of the notifier. But the
default TUI wraps the composer in <AlternateScreen> — that IS the
production path. CSI H resets log-update's relative-move basis, but
the alt-screen park branch uses absolute CUP =
`rect.x + decl.relativeX`, so a stale declaration there still parks
the cursor at the pre-keystroke caret. Fix: skip ONLY the
displayCursor half on alt-screen; still bump cursorDeclaration.
2. TextInput's own rerender could clobber the Ink-level bump. The fast-
echo path defers setCur by 16ms; if a parent state change rerenders
TextInput in that window, the layout effect inside useDeclaredCursor
reads the stale React `cur` state and re-publishes a declaration at
the OLD column. Fix:
`cursorLayout(display, curRef.current, columns)` — read the always-
up-to-date ref, not the deferred state. useMemo dropped (compute is
cheap, single-line wrap-text in the common case).
3. Tests bypassed the production wiring. Added two structural tests:
- `still advances cursorDeclaration on alt-screen` in the Ink-level
suite, asserting displayCursor stays put but the declaration
advances by the delta.
- `textInputCursorSourceOfTruth.test.ts` pins three structural
invariants: layout reads curRef.current, never the bare `cur`
state, and the fast-echo stdout.write calls remain paired with
noteCursorAdvance(±N). Source-grep invariants > flaky Ink mount
tests for this kind of regression.
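
The round-2 notifier semantics can be modelled in a few lines. This is a language-neutral model written in Python, not the TypeScript in @hermes/ink; the class and field names are hypothetical stand-ins for displayCursor, frontFrame.cursor, and cursorDeclaration.relativeX/Y.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cursor:
    x: int
    y: int


class InkModel:
    def __init__(self, alt_screen: bool = False) -> None:
        self.alt_screen = alt_screen
        self.display_cursor: Optional[Cursor] = None
        self.front_frame_cursor = Cursor(0, 0)
        self.cursor_declaration: Optional[Cursor] = None

    def note_external_cursor_advance(self, dx: int, dy: int = 0) -> None:
        if dx == 0 and dy == 0:
            return
        if not self.alt_screen:
            # Main screen: keep the relative-move basis in sync, seeding
            # from the last frame's cursor when nothing is cached yet.
            base = self.display_cursor
            if base is None:
                base = self.front_frame_cursor
            self.display_cursor = Cursor(base.x + dx, base.y + dy)
        if self.cursor_declaration is not None:
            # Both screen modes: advance the park target as well.
            d = self.cursor_declaration
            self.cursor_declaration = Cursor(d.x + dx, d.y + dy)
```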
757/757 ui-tui tests pass (+3 over round 1). type-check clean. lint
introduces zero new errors on touched files. tests/test_tui_gateway_server.py
(177) pass.
Closes review threads:
PRRT_kwDOPRF1G86ChOG2 (ink.tsx alt-screen guard)
PRRT_kwDOPRF1G86ChOG9 (textInput.tsx fast-backspace rerender window)
PRRT_kwDOPRF1G86ChOHC (textInput.tsx fast-append rerender window)
PRRT_kwDOPRF1G86ChOHJ (alt-screen test asserts wrong invariant)
PRRT_kwDOPRF1G86ChOHP (missing integration-style coverage)
* fix(tui): reject fast-backspace at soft-wrap boundary (Copilot round 3)
PR #26717 round 3. Copilot caught two real things:
1. `\b \b` cannot move the terminal cursor onto the previous visual
row across a soft-wrap boundary. When the caret sits at visual
column 0 of a wrapped row (e.g. value 'hello ' at width 6 →
cursorLayout produces (line 1, col 0)), backspace would leave the
physical cursor in place while the logical caret moves up to the
end of the previous visual line. `noteCursorAdvance(-1)` would then
feed Ink a wrong delta. Fix: `canFastBackspaceShape` now takes the
composer width and rejects when `cursorLayout(value, cursor, columns).column === 0`.
The fast path falls through to the normal Ink render, which
correctly lays out the new caret position. The PR-description
inconsistency about alt-screen is fixed in a separate gh pr edit.
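
The wrap-boundary rejection can be illustrated with a simplified model in Python (the real code is TypeScript). The layout function here assumes hard wrapping of single-width characters with no newlines, and the other shape checks in canFastBackspaceShape are omitted.

```python
from typing import Optional, Tuple


def cursor_layout(value: str, cursor: int, columns: int) -> Tuple[int, int]:
    """Simplified model: return (line, column) for a hard-wrapped single-width string."""
    return divmod(cursor, columns)


def can_fast_backspace(value: str, cursor: int, columns: Optional[int] = None) -> bool:
    if cursor <= 0:
        return False
    if columns is None:
        # Legacy call shape: the wrap-boundary check is skipped entirely.
        return True
    _, column = cursor_layout(value, cursor, columns)
    # At visual column 0 the backspace sequence cannot climb to the previous row.
    return column != 0
```

With value 'hello ' and the caret at index 6, a width of 6 places the caret at (line 1, column 0), so the fast path is rejected, while omitting the width still returns true.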
Adds 4 new tests in textInputFastEcho.test.ts pinning the rejection at
exact-multiple wrap boundaries plus a positive control inside a
wrapped line and a back-compat case where `columns` is omitted.
761/761 ui-tui tests pass. type-check / lint clean. 177/177 Python
tests/test_tui_gateway_server.py pass.
Closes review threads:
PRRT_kwDOPRF1G86ChxE5 (textInput.tsx:933 wrap-boundary regression)
* fix(tui): polish doc + tests after Copilot round 4
Three polish points Copilot raised:
1. canFastBackspaceShape doc comment overstated the legacy contract —
said it conservatively rejects potential wrap boundaries when
columns is omitted, but the implementation actually skips the
wrap-boundary check entirely. Reworded to make the legacy behavior
explicit and warn callers not to rely on protection they don't get.
2. ink-cursor-advance.test.ts rationale comment for the
'advances cursorDeclaration in lock-step' case still referenced
the pre-fix `cursorLayout(display, cur, columns)` expression. Now
accurately describes the current source of truth — `curRef.current`
in textInput.tsx — and explains the window the bump is bridging.
3. Removed the three `__get*ForTest` accessors from Ink. The test
file already cast the instance to inspect private state in the
couple of tests that needed declaration mutation; the rest now use
a small `peek(ink)` helper that does the same cast for reads. No
test-only API surface ships in production.
761/761 ui-tui tests pass. type-check clean. lint introduces zero new
errors on touched files. 177/177 tests/test_tui_gateway_server.py pass.
Closes review threads:
PRRT_kwDOPRF1G86Ch23W (canFastBackspaceShape doc accuracy)
PRRT_kwDOPRF1G86Ch23f (stale test rationale)
PRRT_kwDOPRF1G86Ch23p (test-only API surface in production)
* fix(tui): tighten doc + add dy test coverage (Copilot round 5)
Two polish points from round 5:
1. canFastBackspaceShape doc had two paragraphs that conflicted —
the main 'Additionally rejects when the physical cursor sits at
visual column 0' was stated unconditionally, then the columns-param
paragraph qualified that it only happens when columns is passed.
Reworked into clear 'When supplied / When omitted' branches with a
concrete example value ('hello ' returns true without columns even
though it would be unsafe at width 6). No more inconsistency.
2. Added a test asserting cursorDeclaration.relativeY advances when dy
is non-zero. Existing tests exercised dy on displayCursor only.
Newlines in fast-echoed text don't currently hit the bypass
(canFastAppendShape rejects '\n'), but dy is part of the public
notifier contract and must propagate symmetrically with dx so
future callers get a fully-implemented contract.
762/762 ui-tui tests pass (+1). type-check / lint / build clean.
Closes review threads:
PRRT_kwDOPRF1G86Ch6Sz (doc inconsistency)
PRRT_kwDOPRF1G86Ch6TE (missing dy coverage on declaration)
* fix(tui): doc polish (Copilot round 6)
Four small but valid points:
1. textInputCursorSourceOfTruth.test.ts used bare 'fs'/'path'/'url'
imports; the rest of ui-tui consistently uses the 'node:' prefix
(see src/__tests__/useSessionLifecycle.test.ts, src/lib/editor.test.ts).
Switched to node:fs / node:path / node:url to match convention.
2. CursorAdvanceContext.ts type-level doc described only displayCursor.
The notifier intentionally also mutates the active cursorDeclaration
and that's the only part that matters on alt-screen. Reworked the
doc into a two-part 'updates both' summary with the alt-screen
asymmetry called out explicitly.
3. use-cursor-advance.ts hook doc had the same problem. Same fix —
document both pieces of state, both screen modes.
4. App.tsx onCursorAdvance prop comment was incomplete. Same fix —
describe both state updates and the screen-mode asymmetry.
No behavior change. 762/762 ui-tui tests pass. type-check / lint /
build clean.
Closes review threads (auto-resolved on PR but valid critiques):
PRRT_kwDOPRF1G86Ch926 (node: prefix on built-in imports)
PRRT_kwDOPRF1G86Ch92_ (use-cursor-advance.ts doc)
PRRT_kwDOPRF1G86Ch93H (CursorAdvanceContext.ts type doc)
PRRT_kwDOPRF1G86Ch93J (App.tsx prop comment)
Zero-install localhost tunnels over SSH via Pinggy. Covers HTTP/HTTPS,
TCP, TLS, access control (basic auth / bearer / IP whitelist), header
manipulation (CORS, force-HTTPS), web debugger, Pro token mode, and four
composite recipes (webhook receiver, MCP server exposure, local LLM
endpoint share, dev-server quick-share with one-shot password).
Closes #361
Document the three protocols already available for driving hermes-agent
from external programs — ACP, the TUI gateway JSON-RPC, and the
OpenAI-compatible API server — with a 'which one should I use' guide and
a Pi-style RPC command mapping table. Sidebar entry under Developer
Guide -> Architecture.
Plugins can now replace a built-in tool by passing override=True to
ctx.register_tool(). Without it, the registry rejects any registration
that would shadow an existing tool from a different toolset (unchanged
default behavior).
Unlocks the use case from #11049: drop-in replacement of browser/web
backends without forking core. Composes with the existing pre_tool_call
hook for runtime interception of any implementation.
The override is audit-logged at INFO so it surfaces in agent.log.
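
A minimal sketch of the shadowing rule, assuming a dict-backed registry. The class, the method signatures, and returning False on rejection are illustrative; the real registry may raise or report the rejection differently.

```python
import logging
from typing import Callable, Dict, Tuple

logger = logging.getLogger("agent")


class ToolRegistry:
    def __init__(self) -> None:
        # tool name -> (toolset, handler)
        self._tools: Dict[str, Tuple[str, Callable[[], str]]] = {}

    def register_tool(
        self, name: str, toolset: str, handler: Callable[[], str], override: bool = False
    ) -> bool:
        """Register a tool; refuse to shadow another toolset's tool unless override=True."""
        existing = self._tools.get(name)
        if existing is not None and existing[0] != toolset:
            if not override:
                logger.warning("rejected %s: already owned by toolset %s", name, existing[0])
                return False
            # Audit trail for intentional replacements.
            logger.info("tool %s overridden: %s -> %s", name, existing[0], toolset)
        self._tools[name] = (toolset, handler)
        return True

    def call(self, name: str) -> str:
        return self._tools[name][1]()
```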
Thin wrapper around Imbue's darwinian_evolver (AGPL-3.0, subprocess-only).
Ships a working OpenRouter driver (parrot_openrouter.py), a snapshot
inspector (show_snapshot.py), and a custom-problem template. SKILL.md has a
58-char description and a Pitfalls section sourced from actually running the loop:
non-viable seed trap, Azure content filter killing runs, loop.run() being
a generator, nested-pickle snapshots, and aggressive default concurrency.
Salvaged from #12719 by @Bihruze — original PR shipped 12,289 LOC across
61 files (29 Python modules, FastAPI dashboard, VS Code extension,
benchmark hub, marketplace, etc.) which was far beyond the scope of the
underlying issue (#336). This version stays at the ~700-LOC scope that
issue actually asked for. Authorship of the original effort credited via
AUTHOR_MAP entry and the SKILL.md author field.
Verified end-to-end: seed 'Say {{ phrase }}' (score 0.000) evolved into
'Please repeat the following phrase exactly as it is, without any
modifications or additional formatting: {{ phrase }}' (score 0.750)
across 3 iterations on gpt-4o-mini via OpenRouter.
Co-authored-by: Bihruze <98262967+Bihruze@users.noreply.github.com>
Mirrors the dependency-ready / assign-profile semantics used in other locales;
Copilot review noted uk.ts was still on the old dispatcher-tick wording.
Co-authored-by: Cursor <cursoragent@cursor.com>
Tirith ships no Windows binary, so on every Windows CLI startup users
saw a scary 'tirith security scanner enabled but not available' banner
they could not act on. The banner suggested degraded security; in
reality pattern-matching guards still run and the message was pure noise.
Fix:
- New public is_platform_supported() helper in tools/tirith_security.py
that returns False when _detect_target() doesn't resolve (Windows, any
non-x86_64/aarch64 arch).
- ensure_installed(), _resolve_tirith_path(), and check_command_security()
short-circuit on unsupported platforms: cache _resolved_path =
_INSTALL_FAILED with reason 'unsupported_platform', skip PATH probes,
skip the background download thread, skip the disk failure marker, and
return allow with an empty summary from check_command_security so the
spawn loop never fires.
- Explicit user-configured tirith_path is still honored everywhere (a
user who built tirith themselves under WSL keeps that path).
- CLI banner in cli.py gated on is_platform_supported() — fires only on
platforms where tirith *should* work but isn't installed.
- Docs note tirith's supported-platform list and point Windows users at
WSL.
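A minimal sketch of the platform gate described above, for illustration only. The target table and the detect_target() signature are assumptions; the real _detect_target() in tools/tirith_security.py may resolve different names or triples.

```python
import platform
from typing import Optional

# Hypothetical lookup tables; the real target names may differ.
_SUPPORTED_SYSTEMS = {"Linux": "unknown-linux-gnu", "Darwin": "apple-darwin"}
_SUPPORTED_ARCHES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "aarch64": "aarch64",
    "arm64": "aarch64",
}


def detect_target(system: Optional[str] = None, machine: Optional[str] = None) -> Optional[str]:
    """Return a target triple, or None when no binary exists for this platform."""
    system = system if system is not None else platform.system()
    machine = (machine if machine is not None else platform.machine()).lower()
    os_part = _SUPPORTED_SYSTEMS.get(system)
    arch = _SUPPORTED_ARCHES.get(machine)
    if os_part is None or arch is None:
        return None
    return f"{arch}-{os_part}"


def is_platform_supported(system: Optional[str] = None, machine: Optional[str] = None) -> bool:
    """True only when detect_target() resolves, i.e. a tirith binary could exist."""
    return detect_target(system, machine) is not None
```

Callers such as the CLI banner can then check is_platform_supported() before warning, so the message only appears where installing tirith is actually possible.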
Tests: tests/tools/test_tirith_security.py +8 tests covering Linux
x86_64, Darwin arm64, Windows, and unknown-arch verdicts plus the
silent ensure_installed / check_command_security / _resolve_tirith_path
fast-paths and the explicit-path override.
test_tirith_security.py 75 passed (8 new + 67 pre-existing)
test_command_guards.py 19 passed
The per-skill sidebar tree from PR #26646 emitted category entries with
only a label. Docusaurus derives translation keys from the label
(sidebar.docs.category.<label>), and categories that exist in both
Bundled and Optional (productivity, mcp, mlops, research, email,
software-development, dogfood) collided on identical keys — failing
i18n extraction and the Deploy Site build. Result: source had the
sidebar fix but no per-skill page rendered with a sidebar in production.
Add a 'key: skills-<source>-<category>' attribute to each generated
category dict so Bundled vs Optional get distinct translation keys.
Regenerated sidebars.ts via the script. Local docusaurus build passes.
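Roughly, the generated category entry could look like the sketch below. build_category() and its parameters are hypothetical names, not the actual helper in the generator script; only the 'key' format comes from this change.

```python
def build_category(source: str, category: str, doc_ids: list[str]) -> dict:
    """Build one sidebar category entry with a source-qualified translation key."""
    return {
        "type": "category",
        "label": category,  # shared between Bundled and Optional
        "key": f"skills-{source}-{category}",  # unique per source, avoids the collision
        "collapsed": True,
        "items": doc_ids,
    }


bundled = build_category("bundled", "productivity", ["productivity-notion"])
optional = build_category("optional", "productivity", ["productivity-example"])
```

Both entries keep the same label, but their keys differ, so the derived translation keys no longer collide.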
When an approval / clarify / confirm overlay was active, the global input
handler in useInputHandlers returned for every key that wasn't Ctrl+C, which
silently disabled transcript scrolling. On long threads the context the
prompt was asking about often lived above the visible viewport, and being
unable to scroll while answering felt like the prompt had locked the UI.
ApprovalPrompt also had no Esc handler at all, so the one obvious 'abort'
key did nothing during a permission prompt and the user had to memorize
Ctrl+C or hunt for the deny number.
Fixes:
- Extract shouldFallThroughForScroll(key) (pure, exported) covering wheel
scrolls, PageUp/PageDown, and Shift+ArrowUp/Down. When a prompt overlay
is up and the pressed key is a scroll input, skip the early return so it
reaches the existing wheel/PageUp/Shift+arrow handlers below. Plain
arrows still drive in-prompt selection — they don't fall through.
- ApprovalPrompt now maps Esc to onChoice('deny'), parity with the global
Ctrl+C cancellation path that already invokes cancelOverlayFromCtrlC()
for approvals. The bottom-of-prompt hint now advertises 'Esc/Ctrl+C deny'.
- Extract approvalAction(ch, key, sel) — pure key-dispatch helper for the
approval prompt, exported so the regression matrix (Esc, numbers, Enter,
arrows, edge clamping, precedence) is testable without mounting Ink.
Tests:
- useInputHandlers.test.ts: 6 cases covering shouldFallThroughForScroll
positives (wheel/PageUp/PageDown/Shift+arrows) and negatives (plain
arrows, bare shift, no scroll key).
- approvalAction.test.ts: 8 cases covering Esc→deny, numeric mapping,
Enter, ↑↓ within bounds, edge clamping, Esc-beats-others precedence,
unrelated keystrokes.
Ready column help and fallbacks now describe dependency-ready work; show a
badge on unassigned ready cards and fix the stale unassigned tooltip. Align
localized Ready help strings with the new semantics.
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor(tui): thread cols through Md/StreamingMd/renderTable, update cache key
* feat(tui): three-tier width calc + full-line string rendering in renderTable
Replaces the old renderTable (L203-244) with:
- Empty table guard
- Ragged row normalization
- Three-tier column width calculation (ideal → proportional shrink → hard scale)
- Rounding remainder distribution
- Full-line string rendering (one <Text> per row, not per cell)
- wrap=truncate-end on all table lines
- All cells rendered as plain text via stripInlineMarkup
No wrapping or vertical fallback yet — those come in Phase 3 and 4.
* feat(tui): wrapCell with grapheme-safe hard-break + multi-line row rendering
Adds:
- Intl.Segmenter-based grapheme splitting (fallback to [...word])
- wrapCell() for width-correct word wrapping on stripped text
- Multi-line row rendering with LineEntry metadata (header/separator/body)
- Post-render safety condition (maxLineWidth computed, vertical fallback in Task 4)
- Non-wrapping path preserved for tables that fit at ideal widths
* feat(tui): vertical key-value fallback with scaled threshold + safety check
Wires:
- Scaled row-height threshold (numCols<=3: 8, <=6: 5, else: 4)
- Post-render safety check (maxLineWidth > available space)
- Header-only edge case
- Vertical format: bold headers, stripped cell text, clamped separator width
- Iterates headers (not rows) for consistent key-value fields on ragged rows
* test(tui): pass cols to Md in test helpers, add width-overflow assertions
- renderAtWidth now passes cols={columns} to <Md> so width-aware code paths
are exercised in tests
- tableFuzz: every rendered line must fit within allocated width (stringWidth)
- tableRepro: separator regex updated to match truncation ellipsis
- stringWidth imported from @hermes/ink for CJK-correct assertions
* fix(tui): address adversarial review — comment tier 3 budget overshoot, eliminate redundant wrapCell
- Add comment on Tier 3 MIN_COL_WIDTH clamp exceeding budget (self-heals via safetyOverflow)
- Track tallestBodyRow during allEntries build pass instead of re-wrapping every cell
in a second traversal (eliminates O(cells) of redundant stripInlineMarkup+stringWidth)
* fix(tui): pass cols to recursive fenced-markdown Md, fix test frame extraction
- Thread cols into <Md> for fenced markdown blocks (L734) so nested
tables use the width-aware renderer instead of max-content path
- Fix renderAtWidth helpers to extract final Ink repaint frame instead
of concatenating all intermediate frames (REPAINT_RE split)
- Add fenced-markdown-table fixture to tableFuzz (exercises the nested path)
* chore: remove repro test suites and tmux driver script
These were scaffolding for development/reproduction — not needed in the PR.
Accept delegation timeout/error statuses in the TUI subagent model, normalize unknown status strings defensively, and harden /agents overlay rendering/sorting so unknown statuses cannot crash glyph/color lookup. Add regression tests for live event normalization and disk snapshot replay.
Avoid shifting the terminal's last visible row in the alt-screen DECSTBM fast path, which can leave transient scroll bleed/discoloration artifacts around the status lane until a repaint. Add regression tests to preserve the fast path when safe and skip it when the hint touches the bottom row.
The #1 confusing cause of the xAI 403 (per Teknium): X Premium+
subscribers see Grok inside the X app and assume API access is
included. It is NOT — only standalone SuperGrok subscribers can use
xai-oauth with Hermes today. Without calling this out, every Premium+
user hits the 403 with no idea why.
PR #26666's neutral 4-cause list was correct but buried the most
common cause. Lead with the Premium+ gotcha, then list the other
possibilities (no subscription, wrong tier, exhausted quota) as
fallbacks. Same neutral framing — does not accuse anyone of being
unsubscribed.
PR #26644 confidently told users "xAI OAuth account lacks SuperGrok /
X Premium entitlement" on any 403 from xAI's permission-denied surface.
But that body is returned for at least four distinct causes that
Hermes cannot distinguish from the wire:
* Account has no Grok subscription at all
* Account has SuperGrok but the tier doesn't include the requested
model (e.g. grok-4.3 needs SuperGrok Heavy)
* Monthly quota for the subscribed tier is exhausted
* SuperGrok is active but the API access add-on isn't enabled
Don Piedro pushed back that he IS subscribed yet still hit this.
Picking the worst-case interpretation ("you're not subscribed")
reads as wrong and insulting to subscribers, and points them at a
fix they already did.
New wording lists all 4 possibilities and points at
https://grok.com/?_s=usage where the user can check which applies.
The detection logic and credential-pool short-circuit (PR #26664)
are unchanged — only the user-facing wording is rephrased.
Don Piedro's 18-minute hang on grok-4.3 traced to two issues PR #26644
didn't cover:
- _recover_with_credential_pool classifies 403 as FailoverReason.auth
and calls pool.try_refresh_current(). For xAI OAuth on an
unsubscribed account, refresh succeeds (mints a new token from the
same account) but the next API call 403s with the same entitlement
error. Result: infinite refresh → retry → 403 loop until Ctrl+C
(1133s in Don's log). New _is_entitlement_failure(error_context,
status_code) detects the subscription-shape body ("do not have an
active Grok subscription" / "out of available resources" + grok /
"does not have permission" + grok) and short-circuits recovery so
_summarize_api_error surfaces PR #26644's friendly hint.
- grok-4.3 resolved to 256k via the grok-4 catch-all in
DEFAULT_CONTEXT_LENGTHS. Per docs.x.ai/developers/models/grok-4.3
the model ships with 1M context. Add explicit grok-4.3 entry
before the grok-4 fallback (longest-first substring matching
ensures grok-4.3 and grok-4.3-latest both land on the new value).
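A rough sketch of the body-shape check, using the three phrases quoted above. The 403-only gate and the function signature are assumptions; the real _is_entitlement_failure may inspect error_context differently.

```python
from typing import Optional


def is_entitlement_failure(error_body: str, status_code: Optional[int]) -> bool:
    """Detect the subscription-shaped 403 body that a token refresh cannot fix."""
    if status_code != 403:  # assumption: only 403 responses are considered
        return False
    body = error_body.lower()
    if "do not have an active grok subscription" in body:
        return True
    # The two generic phrases only count when the body also mentions grok.
    mentions_grok = "grok" in body
    return mentions_grok and (
        "out of available resources" in body
        or "does not have permission" in body
    )
```

When this returns True, the recovery path can skip the refresh attempt and let the error surface with the hint instead of looping.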
Tests: 8 new (23 total in test_codex_xai_oauth_recovery.py).
E2E verified Don's 100-iteration loop bails out with 0 refresh calls
while genuine auth failures still refresh once and recover.
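The grok-4.3 context-length fix relies on longest-first substring matching. A minimal sketch of that lookup follows; the table values are illustrative round numbers for "1M" and "256k", and the default is hypothetical, not the real DEFAULT_CONTEXT_LENGTHS contents.

```python
# Illustrative values only; the real table holds the exact token counts.
CONTEXT_LENGTHS = {
    "grok-4": 256_000,
    "grok-4.3": 1_000_000,
}
FALLBACK_CONTEXT_LENGTH = 128_000  # hypothetical default


def resolve_context_length(model: str) -> int:
    """Pick the entry whose key is the longest substring of the model name."""
    name = model.lower()
    # Longest keys first, so "grok-4.3" wins over the "grok-4" catch-all.
    for key in sorted(CONTEXT_LENGTHS, key=len, reverse=True):
        if key in name:
            return CONTEXT_LENGTHS[key]
    return FALLBACK_CONTEXT_LENGTH
```

Without the length ordering, "grok-4.3-latest" could match the shorter "grok-4" key first and land on the smaller window.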
Individual skill pages (e.g. /docs/user-guide/skills/bundled/productivity/notion)
had no sidebar rendered — the sidebar config only listed the two catalog index
pages. That was an intentional choice from an earlier 'too many entries would
drown product docs' concern, but the effect is that a user landing on any skill
page (via search, share link, or the catalog table) loses navigation entirely
and can't see related skills.
Wire build_sidebar_items() (which was already computed and discarded) back into
the sidebar. Structure:
Skills
├── Bundled skills catalog (catalog table, was already there)
├── Optional skills catalog (catalog table, was already there)
├── Bundled
│ ├── apple/
│ │ ├── apple-apple-notes
│ │ └── ...
│ └── ... (one collapsed category per skill category)
└── Optional
└── ... (same)
Categories are collapsed by default so the top-level Skills entry doesn't
explode visually. Users browsing one skill see siblings in the same category;
the catalogs remain the at-a-glance entry point.
Also includes drift the regen script naturally produces on top of current main:
- creative-comfyui v5.0.0 → v5.1.0 page (author + new ref file)
- devops-kanban-worker SKILL.md updates
- new pages for optional skills that lacked generated docs:
hyperliquid, finance-stocks, software-development/rest-graphql-debug
- updated optional-skills-catalog row for those
Validation:
- npx docusaurus build (en locale) succeeded — only pre-existing warnings
- inspected built productivity-notion/index.html: sidebar tree present,
sibling productivity skills (airtable, linear, etc.) all linked
The cherry-picked PR #15251 from @tw2818 correctly identified the
DeepSeek 400 root cause but placed the fix in the legacy fallback path
of `build_kwargs`, which DeepSeek never reaches — DeepSeek has a
registered ProviderProfile and goes through `_build_kwargs_from_profile`
instead. The legacy-path block was therefore dead code.
This commit pivots the fix to where it actually fires:
- New `DeepSeekProfile` in `plugins/model-providers/deepseek/__init__.py`
overrides `build_api_kwargs_extras` to emit DeepSeek's expected wire
format (mirrors `KimiProfile`):
{"reasoning_effort": "<low|medium|high|max>",
"extra_body": {"thinking": {"type": "enabled" | "disabled"}}}
- Model gating: only `deepseek-v4-*` and `deepseek-reasoner` emit
thinking control. `deepseek-chat` (V3) is untouched — current behavior.
- Effort mapping: low/medium/high passthrough, xhigh/max → max, unset →
omitted (DeepSeek server applies its own default).
- Revert the legacy-path additions from PR #15251 — they were dead code,
and the `_copy_reasoning_content_for_api` strip block specifically
would have nullified the existing reasoning_content padding machinery
(`_needs_deepseek_tool_reasoning` → space-pad on replay) that the
active provider already relies on for replay correctness.
- Unit tests pin the wire-shape contract and the model gating rules
(26 tests, all passing). Existing transport + provider profile suites
(321 tests) continue to pass.
- AUTHOR_MAP: map twebefy@gmail.com → tw2818 for release notes credit.
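The gating and effort mapping above can be sketched as a standalone function. This is not the actual DeepSeekProfile.build_api_kwargs_extras; the function name, the thinking_enabled flag, and the choice to still send reasoning_effort when thinking is disabled are assumptions.

```python
from typing import Any, Optional

_EFFORT_MAP = {
    "low": "low",
    "medium": "medium",
    "high": "high",
    "xhigh": "max",
    "max": "max",
}


def supports_thinking_control(model: str) -> bool:
    """Only deepseek-v4-* and deepseek-reasoner take thinking control."""
    name = model.lower()
    return name.startswith("deepseek-v4-") or name == "deepseek-reasoner"


def build_deepseek_extras(
    model: str, effort: Optional[str], thinking_enabled: bool = True
) -> dict[str, Any]:
    """Return the extra request kwargs, or {} for models left untouched."""
    if not supports_thinking_control(model):
        return {}  # e.g. deepseek-chat keeps its current behavior
    extras: dict[str, Any] = {
        "extra_body": {
            "thinking": {"type": "enabled" if thinking_enabled else "disabled"}
        }
    }
    mapped = _EFFORT_MAP.get((effort or "").lower())
    if mapped is not None:  # unset effort is omitted; the server default applies
        extras["reasoning_effort"] = mapped
    return extras
```

Keeping the mapping in one dict makes the wire-shape contract easy to pin in unit tests.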
Closes #15700, #17212, #17825.
Co-authored-by: tw2818 <twebefy@gmail.com>
DeepSeek's thinking mode requires both:
- extra_body.thinking.type: "enabled" to activate thinking mode
- top-level reasoning_effort: "max" or "high" to control depth
Previously, the ChatCompletionsTransport only handled Kimi's thinking
mode — DeepSeek was left unmapped, so reasoning_effort config was
silently dropped.
This patch:
1. Adds is_deepseek: bool to the Params dataclass, detected by
base_url matching api.deepseek.com
2. Maps Hermes effort levels (xhigh/max → "max", low/medium/high →
themselves) to the top-level reasoning_effort parameter
3. Sets extra_body.thinking.type alongside the effort
4. Strips reasoning_content from assistant messages sent back to
DeepSeek, preventing 400 errors when thinking was enabled
Three fixes for the May 2026 xAI OAuth (SuperGrok / X Premium) rollout
failures:
- _run_codex_stream: when openai SDK raises RuntimeError("Expected to
have received `response.created` before `<type>`"), retry once then
fall back to responses.create(stream=True) — same path used for
missing-response.completed postlude. Fallback surfaces the real
provider error with body+status_code intact. Also fixes #8133
(response.in_progress prelude on custom relays) and #14634
(codex.rate_limits prelude on codex-lb).
- _summarize_api_error: when error body matches xAI's entitlement
shape, append a one-line hint pointing to https://grok.com and
/model. Once-only, applies to both auxiliary warnings and
main-loop error surfacing.
- _chat_messages_to_responses_input: new is_xai_responses kwarg
drops replayed codex_reasoning_items (encrypted_content) before
they reach xAI. Also drops reasoning.encrypted_content from the
xAI include array. Native Codex behavior unchanged. Grok still
reasons natively each turn; coherence rides on visible message
text alone.
Closes #8133, #14634.
Two log-spam fixes surfaced by a Windows user (Git Bash + Python 3.11.9):
1. LocalEnvironment cwd warn spam
============================
Git Bash's `pwd -P` emits paths like `/c/Users/x`. The base-class
`_extract_cwd_from_output` was assigning this verbatim to `self.cwd`
without validation, then `_resolve_safe_cwd`'s `os.path.isdir(/c/...)`
returned False on Windows, triggering:
LocalEnvironment cwd '/c/Users/NVIDIA' is missing on disk;
falling back to '/' so terminal commands keep working.
...on every terminal call. The pre-existing Windows-path translation
inside `_run_bash` ran AFTER the safe-cwd check, so it could never
prevent the warning.
Fix:
- New `_msys_to_windows_path` helper (idempotent, no-op off Windows).
- `_resolve_safe_cwd` normalizes before `isdir`, so a valid MSYS path
is recognized as the real directory it points at.
- `LocalEnvironment._update_cwd` and a new override of
`_extract_cwd_from_output` translate + validate before mutating
`self.cwd`. Stale / non-existent marker paths roll back to the
previous cwd instead of clobbering it.
- The fallback warning still fires when the directory really is gone
(deletion-recovery scenario from #17558 still covered).
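A minimal sketch of the MSYS-to-Windows translation, assuming only the /<drive>/... form needs handling. The is_windows parameter is added here for testability and is not part of the real _msys_to_windows_path.

```python
import re
import sys
from typing import Optional

# Matches "/c", "/c/" and "/c/Users/x": a single drive letter as the first segment.
_MSYS_DRIVE_RE = re.compile(r"^/([A-Za-z])(/.*)?$")


def msys_to_windows_path(path: str, is_windows: Optional[bool] = None) -> str:
    """Translate '/c/Users/x' to 'C:\\Users\\x'; leave everything else alone."""
    if is_windows is None:
        is_windows = sys.platform == "win32"
    if not is_windows:
        return path  # no-op off Windows
    match = _MSYS_DRIVE_RE.match(path)
    if match is None:
        return path  # already a Windows path, or not a drive-style path
    drive = match.group(1).upper()
    rest = match.group(2) or "/"
    return f"{drive}:" + rest.replace("/", "\\")
```

Because an already-translated path no longer matches the pattern, calling the helper twice yields the same result, which is what makes it safe to apply before every isdir check.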
2. tirith spawn-failed warn spam
=============================
When tirith isn't installed (background install in flight, or marked
failed for the day) and the configured path stays as the bare string
`tirith`, every `subprocess.run([tirith_path, ...])` raises OSError
and logs:
tirith spawn failed: [WinError 2] The system cannot find the file specified
...on every command. fail_open=True means behaviour is correct, but
the log noise is severe.
Fix:
- `_warn_once(key, ...)` thread-safe dedupe helper.
- Three hot-path warnings (`tirith path resolved to None`,
`tirith spawn failed: ...`, `tirith timed out after Ns`) now log
once per (exception class, errno) / timeout-value / path-none key.
- Dedupe set is cleared on `_clear_install_failed` so a successful
install lets a subsequent failure surface again.
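One possible shape for the dedupe helper, as a sketch. The key layout, logger name, and return value are assumptions; only the once-per-key behaviour and the reset on a cleared install failure come from this change.

```python
import logging
import threading

logger = logging.getLogger("tirith_sketch")  # hypothetical logger name

_warned_keys: set = set()
_warned_lock = threading.Lock()


def warn_once(key, message: str, *args) -> bool:
    """Log a warning the first time a key is seen; return True if it was logged."""
    with _warned_lock:
        if key in _warned_keys:
            return False
        _warned_keys.add(key)
    logger.warning(message, *args)
    return True


def reset_warnings() -> None:
    """Clear the dedupe set so a later failure can surface again."""
    with _warned_lock:
        _warned_keys.clear()


def spawn_failure_key(exc: OSError) -> tuple:
    """Dedupe spawn failures per (exception class, errno)."""
    return ("spawn_failed", type(exc).__name__, exc.errno)
```

The lock only guards the set membership check, so the log call itself never runs while holding it.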
Tests
=====
- `tests/tools/test_local_env_windows_msys.py`: 12 tests covering the
MSYS→Windows translator, the resolve fast-path, update_cwd validation,
and extract_cwd_from_output rollback.
- `tests/tools/test_tirith_security.py`: 4 new dedupe tests (15 spawn
failures → 1 log line; distinct exc types → 2 lines; timeout dedupe;
path-None dedupe).
Targeted runs:
test_local_env_windows_msys.py 12 passed
test_local_env_cwd_recovery.py 7 passed (pre-existing, no regressions)
test_tirith_security.py 67 passed (63 pre-existing + 4 new)
test_base_environment + local_* 37 passed (no regressions)
test_local_env_blocklist + neighbours 114 passed
Reported via Hermes log capture: 19× cwd warnings + 15× tirith warnings
in a single short session.
On Windows (msvcrt path), _file_lock() first checked if the lock file
existed and wrote it with write_text(), then opened it with open('r+').
Between these two calls, another process could delete the file causing
open('r+') to raise FileNotFoundError — uncaught, leaving memory writes
to proceed without holding the lock, risking data corruption.
Replace the three-line sequence with a single open('a+', ...) call which
atomically creates the file if missing or opens it if it exists, closing
the TOCTOU window entirely. The existing fd.seek(0) before msvcrt.locking()
is preserved and sufficient for correct lock byte positioning.
Root cause: TOCTOU between lock_path.write_text() and open('r+')
Impact: concurrent memory writes on Windows could corrupt MEMORY.md
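A cross-platform sketch of the single-open pattern. The function name and the non-Windows fcntl branch are assumptions added so the example runs anywhere; the fix itself only concerns the msvcrt path.

```python
import contextlib
import os
from pathlib import Path


@contextlib.contextmanager
def file_lock(lock_path: Path):
    """Hold an exclusive lock on lock_path for the duration of the block."""
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    # 'a+' creates the file if missing or opens it if present, in one call,
    # so there is no gap between an existence check and the open.
    with open(lock_path, "a+") as fd:
        fd.seek(0)  # lock the first byte, regardless of the append position
        if os.name == "nt":
            import msvcrt

            msvcrt.locking(fd.fileno(), msvcrt.LK_LOCK, 1)
            try:
                yield
            finally:
                fd.seek(0)
                msvcrt.locking(fd.fileno(), msvcrt.LK_UNLCK, 1)
        else:
            import fcntl

            fcntl.flock(fd, fcntl.LOCK_EX)
            try:
                yield
            finally:
                fcntl.flock(fd, fcntl.LOCK_UN)
```

Usage is a plain `with file_lock(path): ...` around the memory write.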
Pairs with the prior commit (start() now inside the try block). If
threading.Thread.start() itself raises (OS thread exhaustion under
heavy delegation fanout), the finally would call .join() on a
never-started thread, which raises RuntimeError("cannot join thread
before it is started") — trading one rare bug for another.
Thread.ident is None until start() succeeds, so gate the join on it.
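The combined pattern (start inside the try, join gated on ident) might look like this sketch. run_with_heartbeat() and its parameters are hypothetical and stand in for the real delegation code.

```python
import threading


def run_with_heartbeat(work, on_beat, interval=0.01, thread_factory=threading.Thread):
    """Run work() while a heartbeat thread calls on_beat() periodically."""
    stop = threading.Event()

    def _beat():
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(interval):
            on_beat()

    heartbeat = thread_factory(target=_beat, daemon=True)
    try:
        heartbeat.start()  # inside the try, so the finally always runs
        return work()
    finally:
        stop.set()
        # ident stays None if start() never succeeded; joining then would
        # raise RuntimeError and mask the original exception.
        if heartbeat.ident is not None:
            heartbeat.join(timeout=1.0)
```

If start() raises, the caller sees that original error rather than a secondary "cannot join thread before it is started".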
_heartbeat_thread.start() was called before the try/finally block that
contains _heartbeat_stop.set(). If _register_subagent() or any code
between .start() and try: raised an exception, the finally block would
never run — leaving the heartbeat thread as an orphan that continues
calling _touch_activity() on the parent agent, incorrectly resetting
gateway timeout counters.
Move _heartbeat_thread.start() to be the first statement inside the
try block so the finally block always reaches _heartbeat_stop.set()
regardless of how the child run completes or fails.
Root cause: heartbeat start outside try/finally scope
Impact: orphan heartbeat thread incorrectly resets parent gateway timeouts
* feat(skills/notion): overhaul for Notion Developer Platform (May 2026)
Notion shipped its Developer Platform on May 13, 2026: ntn CLI, Workers,
Markdown API, bidirectional webhooks, agent tools. The existing skill only
covered curl + integration token CRUD, so it didn't surface any of the new
ergonomics — particularly the /markdown endpoints (much easier for agents
to consume) and the ntn CLI for headless API + Workers management.
This rewrite (v1.0.0 -> v2.0.0):
- Splits setup into Path A (HTTP, cross-platform incl. Windows), Path B
(ntn CLI on macOS/Linux, with NOTION_API_TOKEN env var for headless),
and Path C (Windows fallback — HTTP API or WSL2; native ntn is 'coming
soon').
- Keeps the full curl reference (still the only Windows-compatible path).
- Adds /markdown endpoints — GET and PATCH page-as-markdown, plus POST
/v1/pages with a markdown body param. Agent-friendly, no CLI required.
- Adds ntn CLI cheat sheet for raw API shorthand, file uploads, and
workspace flags.
- Adds Notion Workers section: scaffold, tool/webhook capability shapes,
lifecycle commands. Gated on Business/Enterprise plans + macOS/Linux.
- Adds Notion-flavored Markdown reference (callouts, toggles, columns,
mentions, colors) for the /markdown endpoints.
- Adds a 'choose the right path' decision table at the bottom.
- Notes the new efficient Notion MCP server as an optional wiring path.
Auto-generated docs page regenerated via
website/scripts/generate-skill-docs.py.
* docs(skills-catalog): update notion description for v2.0.0
Catches the failure mode that produced #25045: a contributor PR whose
branch had been disconnected from main's history (likely an accidental
'git checkout --orphan' or '.git/' re-init). GitHub's merge UI does
not refuse merges of unrelated histories, so the PR landed cleanly
with its intended one-file change but its parent-less root commit
(413990c94) got grafted into main as a second root. The merge
resolution itself was correct — main's content won for every
conflicting file — but ~1500 files' worth of git blame collapsed
onto that single commit.
Implementation: 'git merge-base origin/main HEAD' exits non-zero and
prints nothing when the two commits share no ancestor. Check both
conditions and fail with a clear message + recovery steps.
Verified: against the historic state of PR #25045 (base 5d90386ba,
head 1149e75db), 'git merge-base' returns empty with exit 1, so the
new check would have rejected it.
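For illustration, the same check expressed in Python. The actual check may well be a shell step in CI; shares_history() and the injectable run parameter are assumptions made for testability.

```python
import subprocess


def shares_history(base: str = "origin/main", head: str = "HEAD", run=subprocess.run) -> bool:
    """True when base and head have a common ancestor according to git merge-base."""
    result = run(
        ["git", "merge-base", base, head],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is an expected outcome here
    )
    # Unrelated histories give a non-zero exit and empty stdout; require both
    # a zero exit and a printed commit before treating the branch as related.
    return result.returncode == 0 and bool(result.stdout.strip())
```

A caller would fail the job with the recovery steps whenever this returns False.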
Follow-up to #26592. The new docs/guides/oauth-over-ssh.md page was
linked from the two SSH-specific sections of the xAI Grok OAuth guide
but was missing from the surfaces a user is more likely to hit first:
- guides/xai-grok-oauth.md 'See Also' — add the SSH guide at the top
with a short qualifier so remote users notice it before clicking
through.
- integrations/providers.md xAI Grok OAuth callout — append the SSH
guide link alongside the existing xAI OAuth guide link.
- user-guide/configuration.md xai-oauth tip — same.
Docs build: zero warnings on touched files.
- installation.md: add tip about `hermes postinstall` for upfront dep install
- quickstart.md: show `hermes postinstall` in pip install flow
- updating.md: fix --check description to mention PyPI path for pip installs
- dep_ensure.py: use get_hermes_home() instead of hand-rolled env var
- dep_ensure.py: add "chrome" to browser name list (was inconsistent with browser_tool.py)
- main.py _cmd_update_check: use detect_install_method() directly instead of redundant .git check
- main.py _cmd_update_pip: build command list directly instead of fragile split() on display string
- banner.py: rename _check_via_pypi → check_via_pypi (cross-module public API)
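To illustrate why splitting a display string is fragile, here is a sketch that builds the argv list first and derives the display string from it. The function names and the exact pip flags are assumptions, not the real _cmd_update_pip code.

```python
import shlex
import sys


def build_pip_update_command(
    python_executable: str = sys.executable, package: str = "hermes-agent"
) -> list[str]:
    """Build the argv list directly; no string parsing involved."""
    return [python_executable, "-m", "pip", "install", "--upgrade", package]


def display_command(cmd: list[str]) -> str:
    """Render the list for humans, quoting arguments that need it."""
    return shlex.join(cmd)


cmd = build_pip_update_command("/opt/my tools/python")
shown = display_command(cmd)
```

With an interpreter path that contains a space, shown.split() breaks the first argument into two pieces, while the list in cmd stays intact.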
Document pip install hermes-agent as a first-class install option.
Clarify that PyPI releases track tagged versions (major/minor),
not every commit on main — git installer is for bleeding-edge.
One-shot bootstrap that installs non-Python deps (node, browser,
ripgrep, ffmpeg) via ensure_dependency(), then runs setup if no
provider is configured. Closes the gap between `pip install` and
the full user-facing experience.
Also fixes 3 pre-existing test regressions caused by earlier commits:
- test_recommended_update_command: mock detect_install_method for git env
- test_check_for_updates_no_git_dir: now falls back to PyPI, not None
- test_plist_path_includes_node_modules_bin: skip when dir absent