Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/gui

Brooklyn Nicholson 2026-05-16 11:36:22 -05:00
commit 062eed654d
116 changed files with 10770 additions and 258 deletions

RELEASE_v0.14.0.md

@@ -0,0 +1,477 @@
# Hermes Agent v0.14.0 (v2026.5.16)
**Release Date:** May 16, 2026
**Since v0.13.0:** 808 commits · 633 merged PRs · 1393 files changed · 165,061 insertions · 545 issues closed (12 P0, 50 P1) · 215 community contributors (including co-authors)
> The Foundation Release — Hermes Agent installs and runs anywhere now. Native Windows ships in early beta with a full PowerShell installer story, a `pip install hermes-agent` wheel lands on PyPI, lazy-deps reshape what `pip install hermes-agent` actually pulls down, the supply-chain checker scans every install/upgrade for unsafe versions, and a new OpenAI-compatible local proxy lets Codex / Aider / Cline talk to OAuth-only providers (Claude Pro, ChatGPT Pro, SuperGrok). The cold-start wave shaves ~19 seconds off `hermes` launch, browser-tool CDP calls run 180x faster, and `hermes tools` All-Platforms drops from 14s to under 1.5s. Two new messaging platforms (LINE and SimpleX Chat) and a Microsoft Graph foundation (Teams pipeline + webhook adapter) land alongside `/handoff` that finally transfers sessions live, `vision_analyze` passing pixels through to vision-capable models, `x_search` as a first-class tool, LSP semantic diagnostics on every `write_file` / `patch`, a unified pluggable `video_generate`, a `computer_use` cua-driver backend, cross-session 1-hour Claude prompt caching, a per-turn file-mutation verifier, plus 9 new optional skills. 50+ P1 closures, 12 P0 closures.
---
## ✨ Highlights
- **Native Windows support (early beta)** — full PowerShell installer, native subprocess/PTY paths, taskkill-based process management, MinGit auto-install, Microsoft Store python stub detection, foreground Ctrl+C preservation, taskkill+ps2 fallback, npm prefix handling, and ~40 follow-up Windows-only fixes across CLI / gateway / TUI / curator / tools. Hermes finally runs natively on `cmd.exe` and PowerShell, no WSL required. ([#21561](https://github.com/NousResearch/hermes-agent/pull/21561), [#22130](https://github.com/NousResearch/hermes-agent/pull/22130), [#22752](https://github.com/NousResearch/hermes-agent/pull/22752), [#26618](https://github.com/NousResearch/hermes-agent/pull/26618), and many more)
- **`pip install hermes-agent && hermes`** — Hermes Agent is now a real PyPI package. One command, no clone, no git, no shell installer. Wheel includes the Ink TUI bundle and shell launcher. (salvage of [#26350](https://github.com/NousResearch/hermes-agent/pull/26350)) ([#26593](https://github.com/NousResearch/hermes-agent/pull/26593))
- **Cold-start performance wave — ~19s off `hermes` launch** — skills cache, lazy Feishu import, no Nous HTTP at startup, plus PEP-562 lazy adapter imports (QQ, Yuanbao, Teams, Google Chat), deferred `fal_client` / `google-cloud` / `httpx` loads, models.dev disk-cache-first lookup, parallel doctor API checks, eager-skip plugin discovery on built-in subcommands, `hermes tools` All-Platforms drops from 14s to <1.5s, welcome banner skipped on `chat -q`. ([#22138](https://github.com/NousResearch/hermes-agent/pull/22138), [#22120](https://github.com/NousResearch/hermes-agent/pull/22120), [#22681](https://github.com/NousResearch/hermes-agent/pull/22681), [#22790](https://github.com/NousResearch/hermes-agent/pull/22790), [#22808](https://github.com/NousResearch/hermes-agent/pull/22808), [#22831](https://github.com/NousResearch/hermes-agent/pull/22831), [#22859](https://github.com/NousResearch/hermes-agent/pull/22859), [#22904](https://github.com/NousResearch/hermes-agent/pull/22904), [#22766](https://github.com/NousResearch/hermes-agent/pull/22766), [#25341](https://github.com/NousResearch/hermes-agent/pull/25341))
- **180x faster `browser_console` evaluations** — routed through the supervisor's persistent CDP WebSocket instead of spawning a fresh DevTools session per call. Real-world page interactions feel instant. ([#23226](https://github.com/NousResearch/hermes-agent/pull/23226))
- **Supply-chain advisory checker + lazy-deps framework + tiered install fallback** — every `pip install` / `hermes update` scans dependencies against an advisory list, lazy-deps replace heavy import-time loads with first-use installs, and the installer falls back through extras tiers when a wheel rejects on the target platform. ([#24220](https://github.com/NousResearch/hermes-agent/pull/24220))
- **OpenAI-compatible local proxy** — `hermes proxy` exposes any OAuth-authed provider (Claude Pro, ChatGPT Pro, SuperGrok) as an OpenAI-compatible endpoint that Codex / Aider / Cline / VS Code Continue can hit. Your subscription, your tools. ([#25969](https://github.com/NousResearch/hermes-agent/pull/25969))
- **Cross-session 1-hour Claude prompt cache** — Anthropic / OpenRouter / Nous Portal now share a 1h prefix cache across sessions for Claude models. Fast resume, fast `/new`, lower cost on repeat work. ([#23828](https://github.com/NousResearch/hermes-agent/pull/23828))
- **Two new messaging platforms — LINE + SimpleX Chat** — LINE Messaging API lands as a first-class platform, SimpleX Chat salvages #2558 onto the modern adapter spec. Hermes is now on 22 platforms. ([#23197](https://github.com/NousResearch/hermes-agent/pull/23197), [#26232](https://github.com/NousResearch/hermes-agent/pull/26232))
- **Microsoft Graph foundation — Teams pipeline + webhook adapter** — `msgraph` auth/client foundation, webhook listener platform, Teams pipeline plugin runtime, and Teams outbound delivery via the existing adapter — Hermes can now read and post to Teams. (salvages of #21408–#21411) ([#21922](https://github.com/NousResearch/hermes-agent/pull/21922), [#21969](https://github.com/NousResearch/hermes-agent/pull/21969), [#22007](https://github.com/NousResearch/hermes-agent/pull/22007), [#22024](https://github.com/NousResearch/hermes-agent/pull/22024))
- **`/handoff` actually transfers the session live** — the agent's active session moves to a different model / persona / profile mid-conversation, with messages, tool history, and context preserved. ([#23395](https://github.com/NousResearch/hermes-agent/pull/23395))
- **`x_search` — first-class X (Twitter) search tool** — gated tool with OAuth-or-API-key auth, no skill needed to query the timeline. ([#26763](https://github.com/NousResearch/hermes-agent/pull/26763))
- **`vision_analyze` returns pixels to vision-capable models** — when the active model can see, `vision_analyze` now hands the image straight through instead of falling back to a text description. ([#22955](https://github.com/NousResearch/hermes-agent/pull/22955))
- **LSP semantic diagnostics on every write** — `write_file` and `patch` now run real language-server diagnostics on the post-edit file (delta-only) and surface real errors before they ship downstream. ([#24168](https://github.com/NousResearch/hermes-agent/pull/24168), [#25978](https://github.com/NousResearch/hermes-agent/pull/25978))
- **Per-turn file-mutation verifier footer** — after every turn that wrote files, the agent gets a verifier footer summarizing what actually changed on disk — catches silent overwrites and "wrote it but it didn't land" bugs. ([#24498](https://github.com/NousResearch/hermes-agent/pull/24498))
- **Unified `video_generate` with pluggable provider backends** — single tool, any backend. Drop in a new video provider as a plugin, no core changes. ([#25126](https://github.com/NousResearch/hermes-agent/pull/25126))
- **`computer_use` cua-driver backend** — proper focus-safe ops, non-Anthropic provider support, refresh on `hermes update`. Computer-use is no longer locked to a single SDK. (re-salvage of #16936) ([#21967](https://github.com/NousResearch/hermes-agent/pull/21967), [#24063](https://github.com/NousResearch/hermes-agent/pull/24063))
- **xAI Grok OAuth provider — SuperGrok via subscription** — sign in with your xAI account, talk to Grok models from Hermes. ([#26534](https://github.com/NousResearch/hermes-agent/pull/26534))
- **Clarify with buttons — native inline keyboards on Telegram + Discord** — the `clarify` tool renders multi-choice prompts as platform-native buttons instead of typed responses. ([#24199](https://github.com/NousResearch/hermes-agent/pull/24199), [#25485](https://github.com/NousResearch/hermes-agent/pull/25485))
- **Discord channel history backfill (default on)** — Hermes reads recent channel history when joining a thread so it actually knows what's been said. ([#25984](https://github.com/NousResearch/hermes-agent/pull/25984))
- **Watchers skill — RSS / HTTP JSON / GitHub polling via cron `no_agent` mode** — skill recipes that wire change-detection sources directly into cron's script-only watchdog mode. ([#21881](https://github.com/NousResearch/hermes-agent/pull/21881))
- **Zed ACP Registry integration + uvx distribution** — Hermes is in the Zed registry, installable via `uvx` (no npm). Plus `hermes acp --setup-browser` bootstraps browser tools for registry installs. (salvage of [#25908](https://github.com/NousResearch/hermes-agent/pull/25908)) ([#26079](https://github.com/NousResearch/hermes-agent/pull/26079), [#26120](https://github.com/NousResearch/hermes-agent/pull/26120), [#26234](https://github.com/NousResearch/hermes-agent/pull/26234))
- **OpenRouter Pareto Code router** — wire a new OpenRouter router with `min_coding_score` knob. Pick the cheapest model that meets your quality bar. ([#22838](https://github.com/NousResearch/hermes-agent/pull/22838))
- **Optional codex app-server runtime for OpenAI/Codex models** — drives the OpenAI Codex CLI under the hood for OpenAI/Codex paths, with session reuse, wedge retirement, and OAuth refresh classification. ([#24182](https://github.com/NousResearch/hermes-agent/pull/24182), [#25769](https://github.com/NousResearch/hermes-agent/pull/25769))
- **`hermes-skills/huggingface` as a trusted default tap** — community skills index from huggingface.co/skills is available by default in the Skills Hub. ([#26219](https://github.com/NousResearch/hermes-agent/pull/26219))
- **9 new optional skills** — Hyperliquid (perp/spot trading via SDK + REST) (@kshitijk4poor & Hermes), Yahoo Finance market data, api-testing (REST/GraphQL debug), unified EVM multi-chain skill (folds #25291 + #2010 + base/), darwinian-evolver, osint-investigation (closes #355), pinggy-tunnel, watchers (RSS/HTTP/GitHub via cron), Notion overhaul for the Developer Platform (May 2026). ([#23582](https://github.com/NousResearch/hermes-agent/pull/23582), [#23583](https://github.com/NousResearch/hermes-agent/pull/23583), [#23590](https://github.com/NousResearch/hermes-agent/pull/23590), [#25299](https://github.com/NousResearch/hermes-agent/pull/25299), [#26760](https://github.com/NousResearch/hermes-agent/pull/26760), [#26729](https://github.com/NousResearch/hermes-agent/pull/26729), [#26765](https://github.com/NousResearch/hermes-agent/pull/26765), [#21881](https://github.com/NousResearch/hermes-agent/pull/21881), [#26612](https://github.com/NousResearch/hermes-agent/pull/26612))
- **API server exposes run approval events** — long-running runs surface approval requests over the API stream, no more silent stalls. (salvage of [#20311](https://github.com/NousResearch/hermes-agent/pull/20311)) ([#21899](https://github.com/NousResearch/hermes-agent/pull/21899))
- **`/subgoal` — user-added criteria appended to active `/goal`** — layer extra success criteria onto a running goal loop. The judge sees them in the prompt, no behavior change when subgoals are empty. ([#25449](https://github.com/NousResearch/hermes-agent/pull/25449))
- **Plugins can run any LLM call via `ctx.llm`** — plugins get a first-class hook to make their own LLM requests through the active provider/credentials, no manual wiring. Plus `tool_override` flag for replacing built-in tools. ([#23194](https://github.com/NousResearch/hermes-agent/pull/23194), [#26759](https://github.com/NousResearch/hermes-agent/pull/26759))
- **Brave Search (free tier) + DuckDuckGo (DDGS) as web-search providers** — two new free search backends alongside Tavily / SearXNG / Exa. ([#21337](https://github.com/NousResearch/hermes-agent/pull/21337))
- **Sudo brute-force block + sudo-stdin/askpass DANGEROUS classification** — closes the `sudo -S` brute-force avenue; approval gates classify stdin-fed and askpass-stripped sudo invocations as dangerous. (salvages of #22194 + #21128) ([#23736](https://github.com/NousResearch/hermes-agent/pull/23736))
- **Provider rename — Alibaba Cloud → Qwen Cloud, picker reorder** — matches what the world calls it. Existing config keys still work. ([#24835](https://github.com/NousResearch/hermes-agent/pull/24835))
---
## 🪟 Windows — Native Support (Early Beta)
### Bootstrap & installer
- **Native Windows support (early beta)** — first-class native Windows path across CLI / gateway / TUI / tools ([#21561](https://github.com/NousResearch/hermes-agent/pull/21561))
- **PyPI wheel packaging — `pip install hermes-agent && hermes`** (salvage of #26350) ([#26593](https://github.com/NousResearch/hermes-agent/pull/26593))
- **Recognise Shift+Enter as a newline key** + Windows docs (salvage #21545) ([#22130](https://github.com/NousResearch/hermes-agent/pull/22130))
- **Preserve Ctrl+C for Windows foreground runs** (@helix4u) ([#22752](https://github.com/NousResearch/hermes-agent/pull/22752))
- **Stop spamming cwd-missing + tirith-spawn warnings on every terminal call** ([#26618](https://github.com/NousResearch/hermes-agent/pull/26618))
- **Use `--extra all` not `--all-extras`; drop lazy-covered extras from `[all]`** ([#24515](https://github.com/NousResearch/hermes-agent/pull/24515))
### Windows-specific fixes (40+ across cli / tools / gateway / curator / TUI)
A long tail of native-Windows fixes shipped alongside the beta — taskkill-based subprocess management, MinGit auto-install, Microsoft Store python stub detection, npm prefix handling, native PTY paths, signal handling differences, foreground process management, ANSI sequence handling, path normalization, file-locking semantics, and many more. Full list in commit log under `fix(windows)` / `feat(windows)` / `windows`.
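
As a rough illustration of what taskkill-based subprocess management can look like, the sketch below builds a `taskkill` invocation that ends a process together with its children. The helper names `kill_tree_command` and `kill_tree` are hypothetical and are not taken from the Hermes codebase; only the `taskkill` flags (`/PID`, `/T`, `/F`) are standard Windows options.

```python
import subprocess
import sys


def kill_tree_command(pid, force=True):
    """Build a taskkill command that ends `pid` and its child processes."""
    # /T walks the process tree; /F forces termination.
    cmd = ["taskkill", "/PID", str(pid), "/T"]
    if force:
        cmd.append("/F")
    return cmd


def kill_tree(pid, force=True):
    """Run taskkill on Windows and return its exit code (hypothetical helper)."""
    if sys.platform != "win32":
        raise RuntimeError("taskkill is only available on Windows")
    completed = subprocess.run(
        kill_tree_command(pid, force=force),
        capture_output=True,
        text=True,
    )
    return completed.returncode
```

Separating the command builder from the runner keeps the Windows-only part small and lets the argument list be checked on any platform.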
---
## 🚀 Performance Wave
### Cold start
- **Cut ~19s from `hermes` cold start** — skills cache + lazy Feishu + no Nous HTTP at startup ([#22138](https://github.com/NousResearch/hermes-agent/pull/22138))
- **Skip eager plugin discovery on known built-in subcommands** ([#22120](https://github.com/NousResearch/hermes-agent/pull/22120))
- **Cache Nous auth + .env loads** — `hermes tools` All Platforms from 14s to <1.5s ([#25341](https://github.com/NousResearch/hermes-agent/pull/25341))
- **Skip welcome banner on `chat -q` single-query mode** ([#22904](https://github.com/NousResearch/hermes-agent/pull/22904))
- **Defer heavy google-cloud imports in google_chat to first adapter use** ([#22681](https://github.com/NousResearch/hermes-agent/pull/22681))
- **Defer QQAdapter and YuanbaoAdapter imports via PEP 562** ([#22790](https://github.com/NousResearch/hermes-agent/pull/22790))
- **Defer httpx import in teams to first webhook call** ([#22831](https://github.com/NousResearch/hermes-agent/pull/22831))
- **Defer fal_client import to first generation request** ([#22859](https://github.com/NousResearch/hermes-agent/pull/22859))
- **models.dev cache-first lookup, skip network when disk cache is fresh** ([#22808](https://github.com/NousResearch/hermes-agent/pull/22808))
- **Parallelize API connectivity checks in `hermes doctor` and disable IMDS** ([#22766](https://github.com/NousResearch/hermes-agent/pull/22766))
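
Several of the cold-start items above defer imports via PEP 562, which lets a module define `__getattr__` so that an attribute is resolved only on first access. The following is a minimal sketch of that mechanism, not the project's actual adapter code: the `make_lazy_module` helper and the `_LAZY_ATTRS` mapping are assumptions, and stdlib modules stand in for heavy adapter modules.

```python
import importlib
import types

# Hypothetical mapping: public name -> (module to import, attribute inside it).
# Stdlib modules stand in for heavy adapter dependencies here.
_LAZY_ATTRS = {
    "JSONDecoder": ("json", "JSONDecoder"),
    "Fraction": ("fractions", "Fraction"),
}


def make_lazy_module(name, lazy_attrs):
    """Build a module whose listed attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr):
        # PEP 562: called only when normal attribute lookup on the module fails.
        try:
            target_module, target_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), target_attr)
        # Cache the result so later lookups bypass __getattr__ entirely.
        setattr(module, attr, value)
        return value

    module.__getattr__ = __getattr__
    return module


adapters = make_lazy_module("adapters", _LAZY_ATTRS)
```

In a real package the same effect usually comes from defining `def __getattr__(name): ...` at the top level of `__init__.py`; the factory form is used here only so the example runs as a single file.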
### Runtime
- **180x faster `browser_console` evaluations** — route through supervisor's persistent CDP WebSocket ([#23226](https://github.com/NousResearch/hermes-agent/pull/23226))
- **Tune Telegram cadence + adaptive fast-path for short replies** (salvage of #10388) ([#23587](https://github.com/NousResearch/hermes-agent/pull/23587))
- **Accumulate length-continuation prefix via list+join** ([#26237](https://github.com/NousResearch/hermes-agent/pull/26237))
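
The list+join change is a standard Python idiom: repeated `prefix += chunk` may copy the growing string on each step, while appending to a list and joining once keeps the total work linear. A minimal sketch, with the hypothetical function name `accumulate_with_join` standing in for whatever the continuation code actually calls it:

```python
def accumulate_with_join(chunks):
    """Collect string pieces in a list and join them once at the end."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)  # O(1) amortized, no string copying here
    return "".join(parts)    # single pass to build the final string
```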
### Prompt caching
- **Cross-session 1h prefix cache for Claude on Anthropic / OpenRouter / Nous Portal** ([#23828](https://github.com/NousResearch/hermes-agent/pull/23828))
- **Hit prefix cache in background review fork** (salvage #17276 + #25427) ([#25434](https://github.com/NousResearch/hermes-agent/pull/25434))
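
For orientation, a prefix cache on Claude models depends on a stable prompt prefix marked with a `cache_control` entry. The sketch below builds such a system block with a one-hour TTL in the shape documented for the Anthropic Messages API; the helper name `build_cached_system` is hypothetical, and how Hermes actually constructs or shares its prefix across sessions is not described in these notes.

```python
def build_cached_system(prefix_text, ttl="1h"):
    """Return a system-block list whose text is marked as a cacheable prefix."""
    return [
        {
            "type": "text",
            "text": prefix_text,
            # Assumption: the provider accepts an explicit TTL on the cache marker.
            "cache_control": {"type": "ephemeral", "ttl": ttl},
        }
    ]
```

A cache hit generally requires the prefix to be byte-identical between requests, so anything that varies per session would need to sit after the marked block.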
---
## 📦 Installation & Distribution
### PyPI + supply-chain
- **PyPI wheel packaging — `pip install hermes-agent && hermes`** (salvage of #26350) ([#26593](https://github.com/NousResearch/hermes-agent/pull/26593))
- **Supply-chain advisory checker + lazy-install framework + tiered install fallback** ([#24220](https://github.com/NousResearch/hermes-agent/pull/24220))
- **Use `--extra all` not `--all-extras`; drop lazy-covered extras from `[all]`** ([#24515](https://github.com/NousResearch/hermes-agent/pull/24515))
- **Skip browser download when system chromium exists** (@helix4u) ([#25317](https://github.com/NousResearch/hermes-agent/pull/25317))
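
To make the advisory-check idea concrete, here is a minimal sketch that compares installed package versions against a list of known-unsafe versions. The advisory format, the `ADVISORIES` contents, and the function names are all assumptions for illustration; the real checker's data source and matching rules are not specified in these notes.

```python
from importlib import metadata

# Hypothetical advisory list: package name -> versions known to be unsafe.
ADVISORIES = {
    "example-package": {"1.2.3", "1.2.4"},
}


def installed_versions():
    """Snapshot installed distributions as {lowercased name: version}."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            result[name.lower()] = dist.version
    return result


def find_unsafe(installed, advisories):
    """Return sorted (name, version) pairs that match an advisory entry."""
    flagged = []
    for name, version in installed.items():
        if version in advisories.get(name.lower(), set()):
            flagged.append((name, version))
    return sorted(flagged)
```

A caller would run `find_unsafe(installed_versions(), ADVISORIES)` after an install or upgrade and refuse to continue if the result is non-empty.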
### Nix
- **`extraDependencyGroups` for sealed venv extras** (@alt-glitch) ([#21817](https://github.com/NousResearch/hermes-agent/pull/21817))
- **Refresh npm lockfile hashes** — keeps Nix flake builds reproducible
### Docker
- **Bootstrap auth.json from env on first boot** ([#21880](https://github.com/NousResearch/hermes-agent/pull/21880))
- **Drop manual @hermes/ink build, rely on esbuild bundle** — slimmer image
### ACP / Zed
- **Zed ACP Registry integration** (salvage of #25908) ([#26079](https://github.com/NousResearch/hermes-agent/pull/26079))
- **Switch to uvx distribution, drop npm launcher** ([#26120](https://github.com/NousResearch/hermes-agent/pull/26120))
- **`hermes acp --setup-browser` bootstraps browser tools for registry installs** ([#26234](https://github.com/NousResearch/hermes-agent/pull/26234))
---
## 🏗️ Core Agent & Architecture
### Sessions & handoff
- **`/handoff` actually transfers the session live** ([#23395](https://github.com/NousResearch/hermes-agent/pull/23395))
- **Expose `HERMES_SESSION_ID` env var to agent tools** (@alt-glitch) ([#23847](https://github.com/NousResearch/hermes-agent/pull/23847))
### Goals (Ralph loop)
- **`/subgoal` — user-added criteria appended to active `/goal`** ([#25449](https://github.com/NousResearch/hermes-agent/pull/25449))
- **`/goal` checklist + `/subgoal` user controls** ([#23456](https://github.com/NousResearch/hermes-agent/pull/23456)) — rolled back within this release window ([#23813](https://github.com/NousResearch/hermes-agent/pull/23813)); `/subgoal` returned in simpler form via #25449
### Compression
- **Make `protect_first_n` configurable** ([#25447](https://github.com/NousResearch/hermes-agent/pull/25447))
### Verification
- **Per-turn file-mutation verifier footer** ([#24498](https://github.com/NousResearch/hermes-agent/pull/24498))
### Stream retry
- **Log inner cause, upstream headers, bytes/elapsed on every drop** ([#23005](https://github.com/NousResearch/hermes-agent/pull/23005))
---
## 🤖 Models & Providers
### New providers
- **xAI Grok OAuth (SuperGrok Subscription) provider** ([#26534](https://github.com/NousResearch/hermes-agent/pull/26534))
- **NovitaAI provider** (salvage #7219) (@kshitijk4poor) ([#25507](https://github.com/NousResearch/hermes-agent/pull/25507))
- **NVIDIA NIM billing origin header** (salvage #25211) ([#26585](https://github.com/NousResearch/hermes-agent/pull/26585))
### Provider work
- **OpenRouter Pareto Code router with `min_coding_score` knob** ([#22838](https://github.com/NousResearch/hermes-agent/pull/22838))
- **Optional codex app-server runtime for OpenAI/Codex models** ([#24182](https://github.com/NousResearch/hermes-agent/pull/24182))
- **Codex-runtime: retire wedged sessions + post-tool watchdog + OAuth refresh classify** ([#25769](https://github.com/NousResearch/hermes-agent/pull/25769))
- **Codex-runtime: skip unavailable plugins during migration** ([#25437](https://github.com/NousResearch/hermes-agent/pull/25437))
- **Codex-runtime: de-dup `[plugins.X]` tables and stop leaking HERMES_HOME into config.toml** (#26250) (@kshitijk4poor) ([#26260](https://github.com/NousResearch/hermes-agent/pull/26260))
- **Pass `reasoning.effort` to xAI Responses API** ([#22807](https://github.com/NousResearch/hermes-agent/pull/22807))
- **Custom provider: prompt and persist explicit `api_mode`** ([#25068](https://github.com/NousResearch/hermes-agent/pull/25068))
- **Rename Alibaba Cloud → Qwen Cloud, reorder picker** ([#24835](https://github.com/NousResearch/hermes-agent/pull/24835))
- **Restore gpt-5.3-codex-spark for ChatGPT Pro** (salvage #18286 + #19530, fixes #16172) (@kshitijk4poor) ([#22991](https://github.com/NousResearch/hermes-agent/pull/22991))
- **Inject tool-use enforcement for GLM models** ([#24715](https://github.com/NousResearch/hermes-agent/pull/24715))
- **Use Nous Portal as model metadata authority** (@rob-maron) ([#24502](https://github.com/NousResearch/hermes-agent/pull/24502))
- **Unified `client=hermes-client-v<version>` tag on every Portal request** ([#24779](https://github.com/NousResearch/hermes-agent/pull/24779))
- **Prevent stale Ollama credentials after provider switch** (@kshitijk4poor) ([#21703](https://github.com/NousResearch/hermes-agent/pull/21703))
- **Auxiliary client: rotate pooled auth after quota failures** (salvage #22779) ([#22792](https://github.com/NousResearch/hermes-agent/pull/22792))
- **Auxiliary client: skip providers without credentials immediately** (#25395) ([#25487](https://github.com/NousResearch/hermes-agent/pull/25487))
- **Auth: send Nous refresh token via header** (@shannonsands) ([#21578](https://github.com/NousResearch/hermes-agent/pull/21578))
- **MiniMax: harden OAuth dashboard and runtime** ([#24165](https://github.com/NousResearch/hermes-agent/pull/24165))
### OpenAI-compatible proxy
- **Local OpenAI-compatible proxy for OAuth providers** — Codex / Aider / Cline can hit Claude Pro, ChatGPT Pro, SuperGrok ([#25969](https://github.com/NousResearch/hermes-agent/pull/25969))
---
## 📱 Messaging Platforms (Gateway)
### New platforms
- **LINE Messaging API platform plugin** ([#23197](https://github.com/NousResearch/hermes-agent/pull/23197))
- **SimpleX Chat platform plugin** (salvages #2558) ([#26232](https://github.com/NousResearch/hermes-agent/pull/26232))
### Microsoft Graph foundation
- **msgraph: add auth and client foundation** (salvage of #21408) ([#21922](https://github.com/NousResearch/hermes-agent/pull/21922))
- **msgraph: add webhook listener platform** (salvage of #21409) ([#21969](https://github.com/NousResearch/hermes-agent/pull/21969))
- **teams-pipeline: add plugin runtime and operator cli** (salvage of #21410) ([#22007](https://github.com/NousResearch/hermes-agent/pull/22007))
- **teams: add pipeline outbound delivery via existing adapter** (salvage of #21411) ([#22024](https://github.com/NousResearch/hermes-agent/pull/22024))
### Cross-platform
- **Per-platform admin/user split for slash commands** (salvage of #4443) ([#23373](https://github.com/NousResearch/hermes-agent/pull/23373))
- **Forensics on signal handling — non-blocking diag, per-phase timing, stale-unit warning** ([#23285](https://github.com/NousResearch/hermes-agent/pull/23285))
- **Keep gateway running when platforms fail; add per-platform circuit breaker + `/platform`** ([#26600](https://github.com/NousResearch/hermes-agent/pull/26600))
- **Wire `clarify` tool with inline keyboard buttons on Telegram** ([#24199](https://github.com/NousResearch/hermes-agent/pull/24199))
- **Add `chat_id` to `hook_ctx` for message source tracking** ([#24710](https://github.com/NousResearch/hermes-agent/pull/24710))
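
The per-platform circuit breaker mentioned above follows a well-known pattern: after repeated failures a platform is skipped for a cooldown period instead of taking the whole gateway down. The class below is a generic sketch of that pattern; the class name, thresholds, and method names are assumptions and do not reflect the gateway's real interface.

```python
import time


class CircuitBreaker:
    """Stop calling a failing platform until a cooldown has elapsed."""

    def __init__(self, max_failures=3, cooldown_seconds=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow(self):
        """Return True if a call to the platform may be attempted now."""
        if self._opened_at is None:
            return True
        # After the cooldown, let an attempt through (half-open state).
        return self._clock() - self._opened_at >= self.cooldown_seconds

    def record_success(self):
        """Close the circuit and reset the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        """Count a failure and open the circuit once the limit is reached."""
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()
```

Keeping one breaker instance per platform means a failing adapter only pauses itself while the other platforms keep running.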
### Telegram
- **Native draft streaming via `sendMessageDraft` (Bot API 9.5+)** (salvage of #3412) ([#23512](https://github.com/NousResearch/hermes-agent/pull/23512))
- **Stream Telegram edits safely** — salvage of #22264 (@kshitijk4poor) ([#22518](https://github.com/NousResearch/hermes-agent/pull/22518))
- **Telegram notification mode** (salvage #22772) ([#22793](https://github.com/NousResearch/hermes-agent/pull/22793))
- **Telegram guest mention mode** (@kshitijk4poor) ([#22759](https://github.com/NousResearch/hermes-agent/pull/22759))
- **Split-and-deliver oversized edits instead of silent truncation** (salvage of #19537) ([#23576](https://github.com/NousResearch/hermes-agent/pull/23576))
- **Preserve DM topic routing via reply fallback** (salvage #22053) (@kshitijk4poor) ([#22410](https://github.com/NousResearch/hermes-agent/pull/22410))
- **Pass `source.thread_id` explicitly on auto-reset notice** (carve-out of #7404) ([#23440](https://github.com/NousResearch/hermes-agent/pull/23440))
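
Split-and-deliver can be pictured as chunking a long message at newline boundaries so that each piece fits the platform limit. The sketch below assumes Telegram's 4096-character text limit and counts Python characters; the function name `split_message` is hypothetical and the adapter's actual splitting rules may differ.

```python
TELEGRAM_MAX_CHARS = 4096  # Bot API text limit per message


def split_message(text, limit=TELEGRAM_MAX_CHARS):
    """Split text into chunks of at most `limit` chars, preferring newlines."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text
    while len(remaining) > limit:
        # Look for the last newline that keeps the chunk within the limit.
        cut = remaining.rfind("\n", 0, limit + 1)
        if cut > 0:
            chunks.append(remaining[:cut])
            remaining = remaining[cut + 1:]  # drop the newline we split on
        else:
            # No usable newline: fall back to a hard split.
            chunks.append(remaining[:limit])
            remaining = remaining[limit:]
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each returned chunk can then be sent as its own message, so nothing is silently cut off.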
### Discord
- **Render clarify choices as buttons** ([#25485](https://github.com/NousResearch/hermes-agent/pull/25485))
- **Channel history backfill — default on, broadened scope** ([#25984](https://github.com/NousResearch/hermes-agent/pull/25984))
- **`thread_require_mention` for multi-bot threads** (salvage #25313) ([#25445](https://github.com/NousResearch/hermes-agent/pull/25445))
### Slack
- **Support `!cmd` as alternate prefix for slash commands in threads** ([#25355](https://github.com/NousResearch/hermes-agent/pull/25355))
### WhatsApp
- **Surface quoted reply metadata from Baileys** (#25398) ([#25489](https://github.com/NousResearch/hermes-agent/pull/25489))
### Feishu / Google Chat / others
- **Feishu: native update prompt cards** (@kshitijk4poor) ([#22448](https://github.com/NousResearch/hermes-agent/pull/22448))
- **Google Chat: repair setup prompt imports** (@helix4u) ([#22038](https://github.com/NousResearch/hermes-agent/pull/22038))
- **Google Chat: honor relay-declared sender_type** (salvage of #22107) (@kshitijk4poor) ([#22432](https://github.com/NousResearch/hermes-agent/pull/22432))
- **LINE: use `build_source` instead of nonexistent `create_source`** ([#24717](https://github.com/NousResearch/hermes-agent/pull/24717))
- **Add `weixin, and more` to gateway docs** (salvage of #21063 by @wuwuzhijing)
---
## 🖥️ CLI & TUI
### CLI
- **Show YOLO mode warning in banner and status bar** ([#26238](https://github.com/NousResearch/hermes-agent/pull/26238))
- **Confirm prompt for destructive slash commands** (#4069) ([#22687](https://github.com/NousResearch/hermes-agent/pull/22687))
- **`docker_extra_args` + `display.timestamps`** ([#23599](https://github.com/NousResearch/hermes-agent/pull/23599))
- **Delegate tool: show user's actual concurrency / spawn-depth limits in description** ([#22694](https://github.com/NousResearch/hermes-agent/pull/22694))
### TUI
- **`/sessions` slash command for browsing and resuming previous sessions** (@austinpickett) ([#20805](https://github.com/NousResearch/hermes-agent/pull/20805))
- **Segment turns with rule above non-first user msgs; trim ticker dead space** (@OutThisLife) ([#21846](https://github.com/NousResearch/hermes-agent/pull/21846))
- **Support attaching to an existing gateway** (@OutThisLife) ([#21978](https://github.com/NousResearch/hermes-agent/pull/21978))
- **Resolve markdown links to readable page titles** (@OutThisLife) ([#24013](https://github.com/NousResearch/hermes-agent/pull/24013))
- **Width-aware markdown table rendering with vertical fallback** (@alt-glitch) ([#26195](https://github.com/NousResearch/hermes-agent/pull/26195))
- **Keep Ink displayCursor in sync with fast-echo writes so cursor stops drifting** (@OutThisLife) ([#26717](https://github.com/NousResearch/hermes-agent/pull/26717))
- **Allow transcript scroll + Esc during approval/clarify/confirm prompts** (@OutThisLife) ([#26414](https://github.com/NousResearch/hermes-agent/pull/26414))
- **Preserve session when switching personality** (@austinpickett) ([#20942](https://github.com/NousResearch/hermes-agent/pull/20942))
- **Skip native safety net on OSC52-capable terminals** (@benbarclay) ([#20954](https://github.com/NousResearch/hermes-agent/pull/20954))
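
Width-aware table rendering with a vertical fallback can be sketched as follows: lay the table out horizontally when it fits the available width, otherwise print each row as a block of `header: value` lines. This is an illustration in Python, not the TUI's implementation; `render_table` and its layout details are assumptions.

```python
def render_table(headers, rows, width):
    """Render a table horizontally if it fits in `width`, else vertically."""
    columns = list(zip(headers, *rows))
    col_widths = [max(len(str(cell)) for cell in col) for col in columns]
    # Each column separator (" | ") costs three characters.
    total = sum(col_widths) + 3 * (len(headers) - 1)

    if total <= width:
        def fmt(cells):
            padded = (str(c).ljust(w) for c, w in zip(cells, col_widths))
            return " | ".join(padded).rstrip()

        lines = [fmt(headers), "-+-".join("-" * w for w in col_widths)]
        lines.extend(fmt(row) for row in rows)
        return "\n".join(lines)

    # Vertical fallback: one "header: value" block per row.
    blocks = [
        "\n".join(f"{h}: {c}" for h, c in zip(headers, row))
        for row in rows
    ]
    return "\n\n".join(blocks)
```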
### Dashboard / GUI
- **Route embedded TUI through dashboard gateway** (@OutThisLife) ([#21979](https://github.com/NousResearch/hermes-agent/pull/21979))
- **Hide token/cost analytics behind config flag (default off)** ([#25438](https://github.com/NousResearch/hermes-agent/pull/25438))
- **Fix Langfuse observability — trace I/O, tool outputs, placeholder credentials** (closes #22342, #22763) (@kshitijk4poor) ([#26320](https://github.com/NousResearch/hermes-agent/pull/26320))
- **MiniMax 'Login' button launched Claude OAuth** (salvage #22849) ([#24058](https://github.com/NousResearch/hermes-agent/pull/24058))
- **Update cron modals** (@austinpickett) ([#25985](https://github.com/NousResearch/hermes-agent/pull/25985))
- **Analytics: prevent silent token loss and add Claude 4.5–4.7 pricing** (@austinpickett) ([#21455](https://github.com/NousResearch/hermes-agent/pull/21455))
---
## 🔧 Tools & Capabilities
### Vision & video
- **`vision_analyze` returns pixels to vision-capable models** ([#22955](https://github.com/NousResearch/hermes-agent/pull/22955))
- **Unified `video_generate` with pluggable provider backends** ([#25126](https://github.com/NousResearch/hermes-agent/pull/25126))
- **`image_gen`: actionable setup message when no FAL backend is reachable** ([#26222](https://github.com/NousResearch/hermes-agent/pull/26222))
### Computer use
- **`computer_use` cua-driver backend + focus-safe ops + non-Anthropic provider fix** (re-salvage #16936) ([#21967](https://github.com/NousResearch/hermes-agent/pull/21967))
- **Refresh cua-driver on `hermes update` + add `install --upgrade`** ([#24063](https://github.com/NousResearch/hermes-agent/pull/24063))
### LSP & write-time diagnostics
- **Semantic diagnostics from real language servers in `write_file`/`patch`** ([#24168](https://github.com/NousResearch/hermes-agent/pull/24168))
- **Shift baseline diagnostics into post-edit coordinates** ([#25978](https://github.com/NousResearch/hermes-agent/pull/25978))
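
One way to read "delta-only" diagnostics with a shifted baseline: diagnostics recorded before the edit are mapped to their post-edit line numbers, and only diagnostics that are not in that mapped set are reported. The sketch below assumes a single contiguous edit and `(line, message)` tuples; both the data shape and the function names are hypothetical.

```python
def shift_baseline(baseline, edit_start, edit_end, new_line_count):
    """Map pre-edit diagnostics to post-edit line numbers.

    `baseline` holds (line, message) pairs with 0-based line numbers.
    The edit replaced lines [edit_start, edit_end) with `new_line_count` lines.
    """
    delta = new_line_count - (edit_end - edit_start)
    shifted = set()
    for line, message in baseline:
        if line < edit_start:
            shifted.add((line, message))          # above the edit: unchanged
        elif line >= edit_end:
            shifted.add((line + delta, message))  # below the edit: moved by delta
        # Diagnostics inside the replaced range have no post-edit position.
    return shifted


def new_diagnostics(baseline, post_edit, edit_start, edit_end, new_line_count):
    """Return post-edit diagnostics that were not already present before."""
    known = shift_baseline(baseline, edit_start, edit_end, new_line_count)
    return [diag for diag in post_edit if diag not in known]
```

Without the shift, a pre-existing error below the edited region would appear at a new line number and be misreported as newly introduced.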
### Search & web
- **Brave Search (free tier) and DDGS search providers** ([#21337](https://github.com/NousResearch/hermes-agent/pull/21337))
- **Bearer auth header for Tavily `/crawl` endpoint** ([#24658](https://github.com/NousResearch/hermes-agent/pull/24658))
### X (Twitter)
- **Gated `x_search` tool with OAuth-or-API-key auth** ([#26763](https://github.com/NousResearch/hermes-agent/pull/26763))
### Browser
- **Route `browser_console` eval through supervisor's persistent CDP WS (180x faster)** ([#23226](https://github.com/NousResearch/hermes-agent/pull/23226))
- **Support externally managed Camofox sessions** ([#24499](https://github.com/NousResearch/hermes-agent/pull/24499))
### MCP
- **`supports_parallel_tool_calls` for MCP servers** (salvage of #9944) ([#26825](https://github.com/NousResearch/hermes-agent/pull/26825))
- **Codex preset for Codex CLI MCP server** (salvage #22663) ([#22679](https://github.com/NousResearch/hermes-agent/pull/22679))
- **Stop retrying initial MCP auth failures** (#25624) ([#25776](https://github.com/NousResearch/hermes-agent/pull/25776))
### Google Workspace
- **Drive write ops + Docs/Sheets create/append** ([#21895](https://github.com/NousResearch/hermes-agent/pull/21895))
### Per-turn verifier
- **Per-turn file-mutation verifier footer** ([#24498](https://github.com/NousResearch/hermes-agent/pull/24498))
---
## 🧩 Kanban (Multi-Agent)
- **`specify` — auxiliary LLM fleshes out triage tasks** ([#21435](https://github.com/NousResearch/hermes-agent/pull/21435))
- **Orchestrator board tools — `kanban_list` + `kanban_unblock`** (carve-out of #20568) ([#23012](https://github.com/NousResearch/hermes-agent/pull/23012))
- **`stranded_in_ready` diagnostic for unclaimed tasks** ([#23578](https://github.com/NousResearch/hermes-agent/pull/23578))
- **Dashboard batch QOL upgrade** (salvage of #23240) ([#23550](https://github.com/NousResearch/hermes-agent/pull/23550))
- **Tooltips and docs link across dashboard** ([#21541](https://github.com/NousResearch/hermes-agent/pull/21541))
- **Dedupe notifier delivery via atomic claim + rewind on failure** (salvage #22558) ([#23401](https://github.com/NousResearch/hermes-agent/pull/23401))
- **Keep notifier subscriptions alive across retry cycles** (salvage #21398) ([#23423](https://github.com/NousResearch/hermes-agent/pull/23423))
- **Drop caller-controlled author override in `kanban_comment`** (salvage of #22109) (@kshitijk4poor) ([#22435](https://github.com/NousResearch/hermes-agent/pull/22435))
- **Sanitize comment author rendering in `build_worker_context`** ([#22769](https://github.com/NousResearch/hermes-agent/pull/22769))
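
As a rough illustration of the "atomic claim + rewind on failure" idea named in the notifier entry above: the sketch below is not the code from #23401. The `notifications` table, its `state` column, and the `deliver_once` helper are hypothetical, and SQLite is assumed only to keep the example self-contained.

```python
import sqlite3


def deliver_once(conn, notification_id, send):
    """Claim a row atomically, send, and rewind the claim if sending fails."""
    # The conditional UPDATE succeeds for exactly one caller; any other
    # caller sees rowcount == 0 and skips delivery.
    cur = conn.execute(
        "UPDATE notifications SET state = 'claimed' "
        "WHERE id = ? AND state = 'pending'",
        (notification_id,),
    )
    conn.commit()
    if cur.rowcount != 1:
        return False
    try:
        send(notification_id)
    except Exception:
        # Rewind so a later retry cycle can claim the row again.
        conn.execute(
            "UPDATE notifications SET state = 'pending' WHERE id = ?",
            (notification_id,),
        )
        conn.commit()
        return False
    conn.execute(
        "UPDATE notifications SET state = 'sent' WHERE id = ?",
        (notification_id,),
    )
    conn.commit()
    return True


def failing_send(_notification_id):
    raise RuntimeError("channel down")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notifications (id INTEGER PRIMARY KEY, state TEXT)")
conn.execute("INSERT INTO notifications VALUES (1, 'pending')")
conn.execute("INSERT INTO notifications VALUES (2, 'pending')")
conn.commit()

sent = []
first = deliver_once(conn, 1, sent.append)    # claims and sends
second = deliver_once(conn, 1, sent.append)   # already sent, skipped
rewound = deliver_once(conn, 2, failing_send) # send fails, claim is rewound
state_after_failure = conn.execute(
    "SELECT state FROM notifications WHERE id = 2"
).fetchone()[0]
```

A conditional `UPDATE` is one common way to make the claim atomic, since only one caller can move the row out of `pending`.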
---
## 🧠 Plugins & Extension
### Plugin surface
- **Run any LLM call from inside a plugin via `ctx.llm`** ([#23194](https://github.com/NousResearch/hermes-agent/pull/23194))
- **`tool_override` flag for replacing built-in tools** (closes #11049) ([#26759](https://github.com/NousResearch/hermes-agent/pull/26759))
- **`standalone_sender_fn` for out-of-process cron delivery** (@kshitijk4poor) ([#22461](https://github.com/NousResearch/hermes-agent/pull/22461))
- **`HERMES_PLUGINS_DEBUG=1` surfaces plugin discovery logs** ([#22684](https://github.com/NousResearch/hermes-agent/pull/22684))
- **Hindsight-client as optional dependency** (@alt-glitch) ([#21818](https://github.com/NousResearch/hermes-agent/pull/21818))
### Profile & distribution
- **Shareable profile distributions via git** ([#20831](https://github.com/NousResearch/hermes-agent/pull/20831))
---
## ⏰ Cron
- **Routing intent — `deliver=all` fans out to every connected channel** ([#21495](https://github.com/NousResearch/hermes-agent/pull/21495))
- **Support name-based lookup for job operations** ([#26231](https://github.com/NousResearch/hermes-agent/pull/26231))
- **Blank Cron dashboard tab + partial-record crashes** (salvage #21042 + #22330) (@kshitijk4poor) ([#22389](https://github.com/NousResearch/hermes-agent/pull/22389))
- **Do not seed `HERMES_SESSION_*` contextvars from cron origin** (salvage of #22356) (@kshitijk4poor) ([#22382](https://github.com/NousResearch/hermes-agent/pull/22382))
- **Scan assembled prompt including skill content for prompt injection** (#3968)
---
## 🧩 Skills Ecosystem
### Skills Hub
- **`hermes-skills/huggingface` as a trusted default tap** (closes #2549) ([#26219](https://github.com/NousResearch/hermes-agent/pull/26219))
- **Show per-skill pages in the left sidebar** ([#26646](https://github.com/NousResearch/hermes-agent/pull/26646))
- **Richer info panels on the Skills Hub** ([#22905](https://github.com/NousResearch/hermes-agent/pull/22905))
- **Refuse `skill_view` name collisions instead of guessing** (closes #6136 @polkn)
### Curator
- **Show rename map in user-visible summary** ([#22910](https://github.com/NousResearch/hermes-agent/pull/22910))
- **Hint at `hermes curator pin` in the rename block** ([#23212](https://github.com/NousResearch/hermes-agent/pull/23212))
### New optional skills
- **Hyperliquid** — perp/spot trading via SDK + REST (salvage of #1952) ([#23583](https://github.com/NousResearch/hermes-agent/pull/23583))
- **Yahoo Finance** market data ([#23590](https://github.com/NousResearch/hermes-agent/pull/23590))
- **api-testing** (REST/GraphQL debug, salvages #1800) ([#23582](https://github.com/NousResearch/hermes-agent/pull/23582))
- **Unified EVM multi-chain skill** (salvages #25291 + #2010 + folds in base/) ([#25299](https://github.com/NousResearch/hermes-agent/pull/25299))
- **darwinian-evolver** ([#26760](https://github.com/NousResearch/hermes-agent/pull/26760))
- **osint-investigation** (closes #355) ([#26729](https://github.com/NousResearch/hermes-agent/pull/26729))
- **pinggy-tunnel** ([#26765](https://github.com/NousResearch/hermes-agent/pull/26765))
- **watchers** — RSS / HTTP JSON / GitHub polling via cron no-agent ([#21881](https://github.com/NousResearch/hermes-agent/pull/21881))
- **Notion overhaul for the Developer Platform** (May 2026) ([#26612](https://github.com/NousResearch/hermes-agent/pull/26612))
---
## 🔒 Security & Reliability
### Security hardening
- **Sudo brute-force block + sudo-stdin/askpass DANGEROUS** (salvage of #22194 + #21128) (@kshitijk4poor) ([#23736](https://github.com/NousResearch/hermes-agent/pull/23736))
- **Drop caller-controlled author override in `kanban_comment`** (salvage of #22109) (@kshitijk4poor) ([#22435](https://github.com/NousResearch/hermes-agent/pull/22435))
- **Cover remaining SSRF fetch paths in skills-hub** (salvage #22804) ([#22843](https://github.com/NousResearch/hermes-agent/pull/22843))
- **Use credential_pool for custom endpoint model listing probes** (salvage #22810) ([#22842](https://github.com/NousResearch/hermes-agent/pull/22842))
- **Require dashboard auth for plugin API routes** (salvage #19541) ([#23220](https://github.com/NousResearch/hermes-agent/pull/23220))
- **Sanitize env and redact output in quick commands + remove write-only `_pending_messages`** ([#23584](https://github.com/NousResearch/hermes-agent/pull/23584))
- **Reduce unnecessary `shell=True` in subprocess calls** ([#25149](https://github.com/NousResearch/hermes-agent/pull/25149))
- **Sanitize Google Chat sender_type from relay** (salvage of #22107) (@kshitijk4poor) ([#22432](https://github.com/NousResearch/hermes-agent/pull/22432))
- **Supply-chain advisory checker** ([#24220](https://github.com/NousResearch/hermes-agent/pull/24220))
- **Rewrite security policy around OS-level isolation as the boundary** (@jquesnelle) ([#20317](https://github.com/NousResearch/hermes-agent/pull/20317))
- **Remove public security advisory page** ([#24253](https://github.com/NousResearch/hermes-agent/pull/24253))
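
The `shell=True` reduction listed above follows a general Python pattern, sketched here with the standard `subprocess` module. The `run_echo` helper is hypothetical and is not taken from #25149; it only shows why a list argv is the safer default for untrusted text.

```python
import subprocess
import sys


def run_echo(user_text: str) -> str:
    """Pass untrusted text as one argv element instead of a shell string."""
    # With a list argv and the default shell=False, no shell parses the
    # text, so metacharacters such as ';' or '$(...)' stay literal.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


echoed = run_echo("hello; echo injected")
```

The child process receives the whole string, semicolon included, as a single argument, so nothing after the `;` is executed as a second command.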
### Reliability — notable bug closures
- **SQLite: fall back to `journal_mode=DELETE` on NFS/SMB/FUSE** (fixes `/resume` on network mounts) (@kshitijk4poor) ([#22043](https://github.com/NousResearch/hermes-agent/pull/22043))
- **Codex-runtime: retire wedged sessions + post-tool watchdog + OAuth refresh classify** ([#25769](https://github.com/NousResearch/hermes-agent/pull/25769))
- **Codex-runtime: de-dup `[plugins.X]` tables and stop leaking HERMES_HOME** (#26250) (@kshitijk4poor) ([#26260](https://github.com/NousResearch/hermes-agent/pull/26260))
- **Daytona: migrate legacy-sandbox lookup to cursor-based `list()`** ([#24587](https://github.com/NousResearch/hermes-agent/pull/24587))
- **MCP: stop retrying initial MCP auth failures** (#25624) ([#25776](https://github.com/NousResearch/hermes-agent/pull/25776))
- **Gateway: enable text-intercept for multi-choice clarify fallback** (#25587) ([#25778](https://github.com/NousResearch/hermes-agent/pull/25778))
- **Gateway: keep running when platforms fail; per-platform circuit breaker + `/platform`** ([#26600](https://github.com/NousResearch/hermes-agent/pull/26600))
- **Delegate: salvage #21933 JSON-string batch + diagnostic logging** (@kshitijk4poor) ([#22436](https://github.com/NousResearch/hermes-agent/pull/22436))
- **Profiles+banner: exclude infrastructure from `--clone-all` + fix stale update-check repo resolution** (@kshitijk4poor) ([#22475](https://github.com/NousResearch/hermes-agent/pull/22475))
- **ACP: inline file attachment resources** (salvage #21400 + image support) ([#21407](https://github.com/NousResearch/hermes-agent/pull/21407))
- **CI: unblock shared PR checks** (@stephenschoettler) ([#21012](https://github.com/NousResearch/hermes-agent/pull/21012), [#25957](https://github.com/NousResearch/hermes-agent/pull/25957))
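
For the SQLite entry above, a minimal sketch of what a journal-mode fallback could look like, assuming the approach is to request WAL and fall back to `journal_mode=DELETE` when WAL is not granted. The `open_with_journal_fallback` helper is hypothetical; the actual change in #22043 may detect network mounts differently.

```python
import os
import sqlite3
import tempfile


def open_with_journal_fallback(path: str) -> tuple[sqlite3.Connection, str]:
    """Open a database, preferring WAL and falling back to DELETE."""
    conn = sqlite3.connect(path)
    try:
        # The PRAGMA returns the journal mode that is actually in effect.
        mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    except sqlite3.OperationalError:
        mode = ""
    if str(mode).lower() != "wal":
        # WAL was refused or failed: use the rollback-journal mode instead.
        mode = conn.execute("PRAGMA journal_mode=DELETE").fetchone()[0]
    return conn, str(mode).lower()


with tempfile.TemporaryDirectory() as tmp:
    conn, mode = open_with_journal_fallback(os.path.join(tmp, "state.db"))
    conn.close()
```

WAL depends on shared memory between processes on the same host, which is why SQLite's documentation warns against using it over network filesystems.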
### Notable reverts in window
- **`/goal` checklist + /subgoal feature stack** — rolled back ([#23813](https://github.com/NousResearch/hermes-agent/pull/23813)); `/subgoal` returned in simpler form via [#25449](https://github.com/NousResearch/hermes-agent/pull/25449)
- **Scrollback box width clamp** (#25975) rolled back to restore full-width borders ([#26163](https://github.com/NousResearch/hermes-agent/pull/26163))
- **`fix(cli): tolerate unreadable dirs when building systemd PATH`** rolled back
---
## 🌍 i18n
- **Localize all gateway commands + web dashboard, add 8 new locales (16 total)** ([#22914](https://github.com/NousResearch/hermes-agent/pull/22914))
---
## 📚 Documentation
- **Repair Voice & TTS provider table** (@nightcityblade, fixes #24101) ([#24138](https://github.com/NousResearch/hermes-agent/pull/24138))
- **Show per-skill pages in the left sidebar** ([#26646](https://github.com/NousResearch/hermes-agent/pull/26646))
- **Mention Weixin in gateway help and docstrings** (salvage of #21063 by @wuwuzhijing)
- **Richer info panels on the Skills Hub** ([#22905](https://github.com/NousResearch/hermes-agent/pull/22905))
- Many more doc updates across providers, platforms, skills, Windows install paths, and dashboard.
---
## 🧪 Testing & CI
- **Unblock shared PR checks** (@stephenschoettler) ([#21012](https://github.com/NousResearch/hermes-agent/pull/21012))
- **Stabilize shared test state after 21012** (@stephenschoettler) ([#25957](https://github.com/NousResearch/hermes-agent/pull/25957))
- A long tail of test additions for platforms, providers, plugins, and edge cases — 8 explicit `test:` PRs plus ~250 fix PRs that also added regression coverage.
---
## 👥 Contributors
### Core
- @teknium1 — release lead, architecture, ~406 PRs merged in window
### Top community contributors
- **@kshitijk4poor** — 38 PRs · Telegram cadence/streaming/topic routing, security hardening (sudo, SSRF, kanban_comment, dashboard auth), codex-runtime hygiene, NovitaAI provider, profile/banner fixes, Feishu update cards, gateway QOL across the board
- **@alt-glitch** — 13 PRs · Markdown-table TUI rendering, `HERMES_SESSION_ID` env var, hindsight-client optional dep, Nix `extraDependencyGroups`
- **@OutThisLife** (Brooklyn Nicholson) — 12 PRs · TUI turn segmentation, attach-to-gateway, markdown link titles, embedded TUI via dashboard gateway, Ink cursor sync, scroll/Esc during prompts
- **@austinpickett** — 8 PRs · `/sessions` slash command, personality switching preserves session, cron modals, dashboard analytics
- **@helix4u** — 5 PRs · Google Chat setup, browser install skip on system chromium, Windows Ctrl+C preservation
- **@rob-maron** — 4 PRs · Nous Portal as model metadata authority, provider polish
- **@stephenschoettler** — 3 PRs · CI stabilization
- **@ethernet8023** — 3 PRs · platform/gateway work
### All contributors (alphabetical)
@02356abc, @0xbyt4, @0xharryriddle, @1000Delta, @1RB, @29206394, @A-kamal, @aashizpoudel, @Abd0r,
@adybag14-cyber, @AgentArcLab, @ahmedbadr3, @AhmetArif0, @alblez, @Alex-yang00, @ALIYILD, @AllynSheep,
@alt-glitch, @am423, @amathxbt, @amethystani, @ArecaNon, @Arkmusn, @askclaw-vesper, @AsoTora, @austinpickett,
@aydnOktay, @ayushere, @baocin, @Bartok9, @benbarclay, @BennetYrWang, @Bihruze, @binhnt92, @briandevans,
@brooklynnicholson, @btorresgil, @buntingszn, @CalmProton, @chrisworksai, @CoinTheHat, @dandacompany, @Dangooy,
@DanielLSM, @David-0x221Eight, @ddupont808, @dhruv-saxena, @diablozzc, @dlkakbs, @dmahan93, @dmnkhorvath,
@domtriola, @donrhmexe, @Dusk1e, @eloklam, @emozilla, @ephron-ren, @erenkarakus, @EthanGuo-coder,
@ethernet8023, @evgyur, @explainanalyze, @fahdad, @fr33d3m0n, @Freeman-Consulting, @freqyfreqy, @Frowtek,
@fu576, @github-actions[bot], @gnanirahulnutakki, @GodsBoy, @guglielmofonda, @Gutslabs, @hanzckernel,
@heathley, @hekaru-agent, @helix4u, @HenkDz, @HiddenPuppy, @hllqkb, @hrygo, @HuangYuChuh, @Hugo-SEQUIER, @HxT9,
@iacker, @InB4DevOps, @isaachuangGMICLOUD, @iuyup, @Jaaneek, @jackey8616, @jackjin1997, @Jaggia, @jak983464779,
@jelrod27, @jethac, @JithendraNara, @johnisag, @Julientalbot, @Jwd-gity, @kallidean, @keyuyuan, @kfa-ai,
@kidonng, @KiraKatana, @kjames2001, @konsisumer, @Korkyzer, @kshitijk4poor, @KvnGz, @lars-hagen, @leehack,
@leepoweii, @LeonSGP43, @li0near, @libo1106, @liquidchen, @littlewwwhite, @liuhao1024, @liyoungc, @luandiasrj,
@luoyuctl, @luyao618, @magic524, @mbac, @McClean, @memosr, @Mibayy, @ming1523, @mizgyo, @mrshu, @ms-alan,
@MustafaKara7, @nederev, @nicoechaniz, @nidhi-singh02, @nightcityblade, @nik1t7n, @Ninso112, @NivOO5,
@novax635, @nv-kasikritc, @oferlaor, @oswaldb22, @outdoorsea, @oxngon, @PaTTeeL, @pearjelly, @pefontana,
@perng, @PhilipAD, @phuongvm, @polkn, @Prasanna28Devadiga, @princepal9120, @pty819, @purzbeats, @Quarkex,
@quocanh261997, @qWaitCrypto, @Qwinty, @rahimsais, @raymaylee, @ReqX, @rewbs, @RhombusMaximus, @rob-maron,
@Ruzzgar, @ryptotalent, @Sanjays2402, @shannonsands, @shaun0927, @SiliconID, @silv-mt-holdings, @simpolism,
@smwbev, @soichiyo, @sprmn24, @steezkelly, @stephenschoettler, @Sylw3ster, @szymonclawd, @teyrebaz33,
@Tianyu199509, @Tranquil-Flow, @TreyDong, @TurgutKural, @tw2818, @tymrtn, @uzunkuyruk, @v1b3coder,
@vanthinh6886, @VinceZcrikl, @vKongv, @vominh1919, @voteblake, @VTRiot, @wali-reheman, @wesleysimplicio,
@wilsen0, @WorldWriter, @worlldz, @wuli666, @wuwuzhijing, @Wysie, @XiaoXiao0221, @xieNniu, @xxxigm, @yehuosi,
@ygd58, @yifengingit, @yuga-hashimoto, @zccyman, @ZeterMordio, @Zhekinmaksim, @zhengyn0001
Also: @Nagatha (Claude Opus 4.7).
---
**Full Changelog**: [v2026.5.7...v2026.5.16](https://github.com/NousResearch/hermes-agent/compare/v2026.5.7...v2026.5.16)

View file

@ -18,6 +18,7 @@ import acp
from acp.schema import (
AgentCapabilities,
AgentMessageChunk,
AgentThoughtChunk,
AuthenticateResponse,
AvailableCommand,
AvailableCommandsUpdate,
@ -788,14 +789,20 @@ class HermesACPAgent(acp.Agent):
# ---- Session management -------------------------------------------------
@staticmethod
def _history_message_text(message: dict[str, Any]) -> str:
"""Extract displayable text from a persisted OpenAI-style message."""
content = message.get("content")
if isinstance(content, str):
return content.strip()
if isinstance(content, list):
def _flatten_history_text(value: Any) -> str:
"""Normalize a persisted text-or-text-parts value into a single string.
OpenAI-style assistant content (and provider reasoning fields) can arrive
as either a scalar string or a list of ``{"text": ...}`` /
``{"type": "text", "content": ...}`` parts. Whitespace-only inputs
collapse to an empty string so callers can treat ``""`` as "nothing to
emit".
"""
if isinstance(value, str):
return value.strip()
if isinstance(value, list):
parts: list[str] = []
for item in content:
for item in value:
if isinstance(item, dict):
text = item.get("text")
if isinstance(text, str):
@ -807,6 +814,29 @@ class HermesACPAgent(acp.Agent):
return "\n".join(part.strip() for part in parts if part and part.strip()).strip()
return ""
@classmethod
def _history_message_text(cls, message: dict[str, Any]) -> str:
"""Extract displayable text from a persisted OpenAI-style message."""
return cls._flatten_history_text(message.get("content"))
@classmethod
def _history_reasoning_text(cls, message: dict[str, Any]) -> str:
"""Extract displayable reasoning/thought text from a persisted assistant message.
Returns the first non-empty value among ``reasoning_content`` (the
canonical field used by DeepSeek / Moonshot and the post-#16892
chat-completions normalizer) and ``reasoning`` (used by the codex
event projector and several other transports). Both keys are
actively written by live code paths, so neither branch is
deprecated; they cover different transports rather than old vs.
new sessions.
"""
for key in ("reasoning_content", "reasoning"):
text = cls._flatten_history_text(message.get(key))
if text:
return text
return ""
@staticmethod
def _history_message_update(
*,
@ -827,6 +857,11 @@ class HermesACPAgent(acp.Agent):
)
return None
@staticmethod
def _history_thought_update(text: str) -> AgentThoughtChunk:
"""Build an ACP history replay update for an assistant thought."""
return acp.update_agent_thought_text(text)
@staticmethod
def _history_tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
"""Extract function name/arguments from an OpenAI-style tool_call."""
@ -854,13 +889,17 @@ class HermesACPAgent(acp.Agent):
).strip()
async def _replay_session_history(self, state: SessionState) -> None:
"""Send persisted user/assistant history to clients during session/load.
"""Replay persisted user/assistant history during session/load or session/resume.
Zed's ACP history UI calls ``session/load`` after the user picks an item
from the Agents sidebar. The agent must then replay the full conversation
as user/assistant chunks plus reconstructed tool-call start/completion
notifications; merely restoring server-side state makes Hermes remember
context, but leaves the editor looking like a clean thread.
Invoked inline (``await``) from both ``load_session`` and
``resume_session`` so that spec-compliant ACP clients receive the
full transcript within the request's lifetime — see the comment at
the call sites for the rationale and prior-art citations.
Replays the conversation as user/assistant chunks, thinking-mode
thought chunks, plus reconstructed tool-call start/completion
notifications. Merely restoring server-side state makes Hermes
remember context, but leaves the editor looking like a clean thread.
"""
if not self._conn or not state.history:
return
@ -882,24 +921,37 @@ class HermesACPAgent(acp.Agent):
for message in state.history:
role = str(message.get("role") or "")
if role in {"user", "assistant"}:
if role == "user":
text = self._history_message_text(message)
if text:
update = self._history_message_update(role=role, text=text)
if update is not None and not await _send(update):
return
continue
if role == "assistant":
thought = self._history_reasoning_text(message)
if thought and not await _send(self._history_thought_update(thought)):
return
text = self._history_message_text(message)
if text:
update = self._history_message_update(role=role, text=text)
if update is not None and not await _send(update):
return
if role == "assistant" and isinstance(message.get("tool_calls"), list):
for tool_call in message["tool_calls"]:
if not isinstance(tool_call, dict):
continue
tool_call_id = self._history_tool_call_id(tool_call)
if not tool_call_id:
continue
tool_name, args = self._history_tool_call_name_args(tool_call)
active_tool_calls[tool_call_id] = (tool_name, args)
if not await _send(build_tool_start(tool_call_id, tool_name, args)):
return
tool_calls = message.get("tool_calls")
if isinstance(tool_calls, list):
for tool_call in tool_calls:
if not isinstance(tool_call, dict):
continue
tool_call_id = self._history_tool_call_id(tool_call)
if not tool_call_id:
continue
tool_name, args = self._history_tool_call_name_args(tool_call)
active_tool_calls[tool_call_id] = (tool_name, args)
if not await _send(build_tool_start(tool_call_id, tool_name, args)):
return
continue
if role == "tool":
@ -942,18 +994,6 @@ class HermesACPAgent(acp.Agent):
models=self._build_model_state(state),
)
def _schedule_history_replay(self, state: SessionState) -> None:
"""Replay persisted history after session/load or session/resume returns.
Zed only attaches streamed transcript/tool updates once the load/resume
response has completed. Sending replay notifications while the request is
still in-flight can make the server look correct in logs while the editor
drops or fails to attach the tool-call history.
"""
loop = asyncio.get_running_loop()
replay_coro = self._replay_session_history(state)
loop.call_soon(asyncio.create_task, replay_coro)
async def load_session(
self,
cwd: str,
@ -967,7 +1007,30 @@ class HermesACPAgent(acp.Agent):
return None
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Loaded session %s", session_id)
self._schedule_history_replay(state)
# Per ACP spec, `session/load` must stream the prior conversation back
# to the client via `session/update` notifications BEFORE responding,
# so the client receives the full transcript within the load request's
# lifetime. Awaiting the replay here matches Codex / Claude Code /
# OpenCode / Pi and the Zed client (which registers the session-update
# routing entry before awaiting the loadSession RPC specifically so
# in-call history replay updates can find the thread). Deferring this
# via `loop.call_soon` (as we did briefly in May 2026) broke every
# spec-compliant ACP client that measures notifications synchronously
# against the load response — see #12285 follow-up.
try:
await self._replay_session_history(state)
except Exception:
# Replay is best-effort — a corrupted or unexpected message shape
# must not turn a successful session/load into a JSON-RPC error
# response. Per-notification failures are already caught inside
# ``_replay_session_history``; this outer guard covers anything
# raised by the helpers themselves before reaching ``_send``.
logger.warning(
"ACP history replay raised during session/load for %s; "
"load will still succeed, partial transcript may be missing",
session_id,
exc_info=True,
)
self._schedule_available_commands_update(session_id)
self._schedule_usage_update(state)
return LoadSessionResponse(models=self._build_model_state(state))
@ -985,7 +1048,18 @@ class HermesACPAgent(acp.Agent):
state = self.session_manager.create_session(cwd=cwd)
await self._register_session_mcp_servers(state, mcp_servers)
logger.info("Resumed session %s", state.session_id)
self._schedule_history_replay(state)
# See `load_session` above for the spec rationale — replay must
# complete before the response so clients receive the full transcript
# within the request's lifetime.
try:
await self._replay_session_history(state)
except Exception:
logger.warning(
"ACP history replay raised during session/resume for %s; "
"resume will still succeed, partial transcript may be missing",
state.session_id,
exc_info=True,
)
self._schedule_available_commands_update(state.session_id)
self._schedule_usage_update(state)
return ResumeSessionResponse(models=self._build_model_state(state))
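
The ordering difference between the removed `loop.call_soon` scheduling and the inline `await` can be seen in a small standalone sketch. The names `replay`, `load_inline`, and `load_deferred` are stand-ins invented for this illustration, and the "response" is modeled as a log entry rather than a real JSON-RPC reply.

```python
import asyncio


async def replay(log: list[str]) -> None:
    """Stand-in for history replay: emit three updates, yielding between them."""
    for i in range(3):
        log.append(f"update-{i}")
        await asyncio.sleep(0)


async def load_inline(log: list[str]) -> None:
    # Awaiting inline: every update is logged before the response.
    await replay(log)
    log.append("response")


async def load_deferred(log: list[str]) -> None:
    # Deferring via call_soon: the replay task only starts after this
    # coroutine has returned, so the response is logged first.
    loop = asyncio.get_running_loop()
    loop.call_soon(asyncio.create_task, replay(log))
    log.append("response")


async def main() -> tuple[list[str], list[str]]:
    inline: list[str] = []
    deferred: list[str] = []
    await load_inline(inline)
    await load_deferred(deferred)
    await asyncio.sleep(0.05)  # give the deferred task time to finish
    return inline, deferred


inline_log, deferred_log = asyncio.run(main())
```

In the inline variant the response is the last entry; in the deferred variant it is the first, which is the situation the call-site comment describes as breaking clients that expect the transcript within the request's lifetime.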

View file

@ -1060,10 +1060,12 @@ def _generate_pkce() -> tuple:
def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:
"""Run Hermes-native OAuth PKCE flow and return credential state."""
import secrets
import time
import webbrowser
verifier, challenge = _generate_pkce()
oauth_state = secrets.token_urlsafe(32)
params = {
"code": "true",
@ -1073,7 +1075,7 @@ def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:
"scope": _OAUTH_SCOPES,
"code_challenge": challenge,
"code_challenge_method": "S256",
"state": verifier,
"state": oauth_state,
}
from urllib.parse import urlencode
@ -1110,7 +1112,12 @@ def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:
splits = auth_code.split("#")
code = splits[0]
state = splits[1] if len(splits) > 1 else ""
received_state = splits[1] if len(splits) > 1 else ""
# Validate state to prevent CSRF (RFC 6749 §10.12)
if received_state != oauth_state:
logger.warning("OAuth state mismatch — possible CSRF, aborting")
return None
try:
import urllib.request
@ -1119,7 +1126,7 @@ def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:
"grant_type": "authorization_code",
"client_id": _OAUTH_CLIENT_ID,
"code": code,
"state": state,
"state": received_state,
"redirect_uri": _OAUTH_REDIRECT_URI,
"code_verifier": verifier,
}).encode()
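
The state check in this hunk can be exercised in isolation. The sketch below mirrors the `split("#")` parsing shown above; the helper names are hypothetical, and `hmac.compare_digest` is swapped in for the diff's plain `!=` as a constant-time comparison, which is a choice of this sketch rather than something the change itself does.

```python
import hmac
import secrets


def new_oauth_state() -> str:
    """Generate an unguessable state value, independent of the PKCE verifier."""
    return secrets.token_urlsafe(32)


def split_code_and_state(auth_code: str) -> tuple[str, str]:
    """Split a pasted 'code#state' string the same way the hunk does."""
    parts = auth_code.split("#")
    code = parts[0]
    state = parts[1] if len(parts) > 1 else ""
    return code, state


def state_matches(expected: str, received: str) -> bool:
    """Compare states without leaking where the first mismatch occurs."""
    return hmac.compare_digest(expected.encode(), received.encode())


expected_state = new_oauth_state()
code, received_state = split_code_and_state(f"abc123#{expected_state}")
ok = state_matches(expected_state, received_state)

_, missing_state = split_code_and_state("abc123")
rejected = not state_matches(expected_state, missing_state)
```

Keeping the state separate from the PKCE verifier matters because the state travels in the authorization URL, while the verifier is meant to stay secret until the token exchange.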

View file

@ -30,6 +30,28 @@ _DEFAULT_TIMEOUT_SECONDS = 900.0
_TOOL_CALL_BLOCK_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
_TOOL_CALL_JSON_RE = re.compile(r"\{\s*\"id\"\s*:\s*\"[^\"]+\"\s*,\s*\"type\"\s*:\s*\"function\"\s*,\s*\"function\"\s*:\s*\{.*?\}\s*\}", re.DOTALL)
# Stderr fingerprint of the deprecated `gh copilot` CLI extension
# (https://github.blog/changelog/2025-09-25-upcoming-deprecation-of-gh-copilot-cli-extension).
# We require BOTH the literal product name ("gh-copilot") AND a deprecation
# marker, so generic stderr from the NEW `@github/copilot` CLI — whose repo
# is github.com/github/copilot-cli and which legitimately mentions "copilot-cli"
# in its own banners and error messages — doesn't get misclassified as the
# deprecated extension.
_DEPRECATION_REQUIRED = ("gh-copilot",)
_DEPRECATION_MARKERS = (
"has been deprecated",
"no commands will be executed",
)
def _is_gh_copilot_deprecation_message(stderr_text: str) -> bool:
"""True iff stderr looks like the deprecated gh-copilot extension's banner."""
lower = stderr_text.lower()
if not any(req in lower for req in _DEPRECATION_REQUIRED):
return False
return any(marker in lower for marker in _DEPRECATION_MARKERS)
def _resolve_command() -> str:
return (
@ -506,6 +528,21 @@ class CopilotACPClient:
stderr_text = "\n".join(stderr_tail).strip()
if proc.poll() is not None and stderr_text:
if _is_gh_copilot_deprecation_message(stderr_text):
raise RuntimeError(
"Hermes ACP mode requires the NEW GitHub Copilot CLI "
"(github.com/github/copilot-cli), but the binary it just "
"spawned is the deprecated `gh copilot` extension.\n\n"
"Install the new CLI:\n"
" npm install -g @github/copilot\n"
" # then verify with: copilot --help\n\n"
"If `copilot` already resolves to the new CLI but you still see this,\n"
"point Hermes at it explicitly:\n"
" export HERMES_COPILOT_ACP_COMMAND=/path/to/new/copilot\n\n"
"Alternative: use the `copilot` provider (no ACP, hits the Copilot API\n"
"directly with a Copilot subscription token) via `hermes setup`.\n\n"
f"Original error:\n{stderr_text}"
)
raise RuntimeError(f"Copilot ACP process exited early: {stderr_text}")
raise TimeoutError(f"Timed out waiting for Copilot ACP response to {method}.")
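
The two-part match (product name plus a deprecation marker) is easiest to see against sample inputs. The sketch below copies the classifier logic from the hunk under a public name; the three stderr strings are made up for illustration and are not real CLI output.

```python
_DEPRECATION_REQUIRED = ("gh-copilot",)
_DEPRECATION_MARKERS = (
    "has been deprecated",
    "no commands will be executed",
)


def is_gh_copilot_deprecation_message(stderr_text: str) -> bool:
    """True only when both the product name and a deprecation marker appear."""
    lower = stderr_text.lower()
    if not any(req in lower for req in _DEPRECATION_REQUIRED):
        return False
    return any(marker in lower for marker in _DEPRECATION_MARKERS)


# Hypothetical stderr samples, not captured from any real binary.
old_banner = "The gh-copilot extension has been deprecated."
new_cli_error = "copilot-cli: authentication failed"
marker_only = "This flag has been deprecated."

results = [
    is_gh_copilot_deprecation_message(old_banner),
    is_gh_copilot_deprecation_message(new_cli_error),
    is_gh_copilot_deprecation_message(marker_only),
]
```

Only the first sample is classified as the deprecated extension; a message that mentions "copilot-cli" or a generic deprecation notice on its own falls through to the ordinary early-exit error.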

View file

@ -358,6 +358,12 @@ _URL_TO_PROVIDER: Dict[str, str] = {
"api.deepseek.com": "deepseek",
"api.githubcopilot.com": "copilot",
"models.github.ai": "copilot",
# GitHub Models free tier (Azure-hosted prototyping endpoint) — same
# canonical provider as the Copilot API. Hard per-request token cap
# (often 8K) makes it unusable for Hermes' system prompt, but mapping
# it here lets us recognize the endpoint and emit a targeted hint
# instead of falling through the unknown-custom-endpoint path.
"models.inference.ai.azure.com": "copilot",
"api.fireworks.ai": "fireworks",
"opencode.ai": "opencode-go",
"api.x.ai": "xai",

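A sketch of how a host-to-provider table like this might be consulted, assuming the lookup keys on the URL's hostname. The `describe_endpoint` helper, the `LOW_CAP_HOSTS` set, and the hint wording are hypothetical; only the four host entries come from the hunk above.

```python
from urllib.parse import urlparse

URL_TO_PROVIDER = {
    "api.deepseek.com": "deepseek",
    "api.githubcopilot.com": "copilot",
    "models.github.ai": "copilot",
    "models.inference.ai.azure.com": "copilot",
}

# Hypothetical: hosts known to enforce a small per-request token cap.
LOW_CAP_HOSTS = {"models.inference.ai.azure.com"}


def describe_endpoint(base_url: str):
    """Return (provider, hint) for a base URL; either may be None."""
    host = (urlparse(base_url).hostname or "").lower()
    provider = URL_TO_PROVIDER.get(host)
    hint = None
    if host in LOW_CAP_HOSTS:
        hint = "GitHub Models free tier: the per-request token cap may be too small"
    return provider, hint


known = describe_endpoint("https://models.inference.ai.azure.com/v1")
unknown = describe_endpoint("https://example.invalid/v1")
```

Mapping the host to an existing provider lets the caller attach a targeted hint rather than treating the endpoint as an unknown custom one.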
View file

@ -663,7 +663,7 @@ export const af: Translations = {
columnHelp: {
triage: "Rou idees — 'n spesifiseerder sal die spesifikasie uitwerk",
todo: "Wag op afhanklikhede of nie toegewys nie",
ready: "Toegewys en wag vir 'n versender-tik",
ready: "Afhanklikhede is bevredig; wys 'n profiel toe om te versend",
running: "Deur 'n werker geëis — in vlug",
blocked: "Werker het mensinvoer aangevra",
done: "Voltooi",

View file

@ -662,7 +662,7 @@ export const de: Translations = {
columnHelp: {
triage: "Rohe Ideen — ein Specifier wird die Spezifikation ausarbeiten",
todo: "Wartet auf Abhängigkeiten oder ist nicht zugewiesen",
ready: "Zugewiesen und wartet auf einen Dispatcher-Tick",
ready: "Abhängigkeiten erfüllt; Profil zum Dispatch zuweisen",
running: "Von einem Worker übernommen — in Bearbeitung",
blocked: "Worker hat um menschliche Eingabe gebeten",
done: "Abgeschlossen",

View file

@ -574,6 +574,9 @@ export const en: Translations = {
createTask: "Create task in this column",
noTasks: "— no tasks —",
unassigned: "unassigned",
needsAssignee: "Needs assignee",
needsAssigneeHint:
"Dependencies are satisfied, but the dispatcher skips this task until you assign a profile.",
untitled: "(untitled)",
loadingDetail: "Loading…",
addComment: "Add a comment… (Enter to submit)",
@ -664,7 +667,7 @@ export const en: Translations = {
columnHelp: {
triage: "Raw ideas — a specifier will flesh out the spec",
todo: "Waiting on dependencies or unassigned",
ready: "Assigned and waiting for a dispatcher tick",
ready: "Dependencies satisfied; assign a profile to dispatch",
running: "Claimed by a worker — in-flight",
blocked: "Worker asked for human input",
done: "Completed",

View file

@ -662,7 +662,7 @@ export const es: Translations = {
columnHelp: {
triage: "Ideas en bruto — un specifier desarrollará la especificación",
todo: "Esperando dependencias o sin asignar",
ready: "Asignado y esperando un tick del dispatcher",
ready: "Dependencias satisfechas; asigna un perfil para despachar",
running: "Reclamado por un worker — en ejecución",
blocked: "El worker pidió intervención humana",
done: "Completado",

View file

@ -662,7 +662,7 @@ export const fr: Translations = {
columnHelp: {
triage: "Idées brutes — un specifier rédigera la spécification",
todo: "En attente de dépendances ou non assigné",
ready: "Assigné et en attente d'un tick du dispatcher",
ready: "Dépendances satisfaites ; assignez un profil pour dispatch",
running: "Réclamé par un worker — en cours d'exécution",
blocked: "Le worker a demandé une intervention humaine",
done: "Terminé",

View file

@ -663,7 +663,7 @@ export const ga: Translations = {
columnHelp: {
triage: "Smaointe amha — déanfaidh specifier an spec a chur i bhfeidhm",
todo: "Ag fanacht ar spleáchais nó gan sannadh",
ready: "Sannta agus ag fanacht ar thic an dispatcher",
ready: "Tá na spleáchais sásaithe; sann próifíl le dispatch a dhéanamh",
running: "Éilithe ag worker — ar siúl",
blocked: "D'iarr an worker ionchur duine",
done: "Críochnaithe",

View file

@ -663,7 +663,7 @@ export const hu: Translations = {
columnHelp: {
triage: "Nyers ötletek — egy specifier kidolgozza a specifikációt",
todo: "Függőségekre vár vagy nincs felelőse",
ready: "Kiosztva, dispatcher tickre vár",
ready: "A függőségek teljesültek; rendelj hozzá profilt az indításhoz",
running: "Worker felvette — folyamatban",
blocked: "A worker emberi beavatkozást kért",
done: "Befejezve",

View file

@ -662,7 +662,7 @@ export const it: Translations = {
columnHelp: {
triage: "Idee grezze — un specifier elaborerà la specifica",
todo: "In attesa di dipendenze o non assegnato",
ready: "Assegnato e in attesa di un tick del dispatcher",
ready: "Dipendenze soddisfatte; assegna un profilo per il dispatch",
running: "Preso in carico da un worker — in esecuzione",
blocked: "Il worker ha richiesto input umano",
done: "Completato",

View file

@ -663,7 +663,7 @@ export const ja: Translations = {
columnHelp: {
triage: "未整理のアイデア — スペシファイアが仕様を肉付けします",
todo: "依存関係の待機中、または未割り当て",
ready: "割り当て済み、ディスパッチャーのティック待ち",
ready: "依存関係は満たされています。ディスパッチするにはプロファイルを割り当ててください",
running: "ワーカーが取得中 — 実行中",
blocked: "ワーカーが人間の入力を求めています",
done: "完了",

View file

@ -663,7 +663,7 @@ export const ko: Translations = {
columnHelp: {
triage: "원시 아이디어 — 스페시파이어가 사양을 구체화합니다",
todo: "종속성 대기 중 또는 미지정",
ready: "지정되었으며 디스패처 틱 대기 중",
ready: "종속성이 충족됨; 디스패치하려면 프로필을 지정하세요",
running: "워커가 점유 중 — 실행 중",
blocked: "워커가 사람의 입력을 요청함",
done: "완료됨",

View file

@ -663,7 +663,7 @@ export const pt: Translations = {
columnHelp: {
triage: "Ideias em bruto — um specifier vai detalhar a especificação",
todo: "À espera de dependências ou sem atribuição",
ready: "Atribuído e à espera de um tick do dispatcher",
ready: "Dependências satisfeitas; atribua um perfil para despachar",
running: "Reivindicado por um worker — em execução",
blocked: "O worker pediu intervenção humana",
done: "Concluído",

View file

@ -663,7 +663,7 @@ export const ru: Translations = {
columnHelp: {
triage: "Сырые идеи — specifier подготовит спецификацию",
todo: "Ожидает зависимостей или без исполнителя",
ready: "Назначено и ждёт тика диспетчера",
ready: "Зависимости выполнены; назначьте профиль для диспетчеризации",
running: "Взято воркером — выполняется",
blocked: "Воркер запросил вмешательство человека",
done: "Завершено",

View file

@ -663,7 +663,7 @@ export const tr: Translations = {
columnHelp: {
triage: "Ham fikirler — bir specifier şartnameyi detaylandıracak",
todo: "Bağımlılıklar bekleniyor veya atanmamış",
ready: "Atanmış ve dispatcher tick'i bekleniyor",
ready: "Bağımlılıklar karşılandı; dispatch için bir profil atayın",
running: "Bir worker tarafından alındı — yürütülüyor",
blocked: "Worker insan girdisi istedi",
done: "Tamamlandı",

View file

@ -586,6 +586,8 @@ export interface Translations {
createTask: string;
noTasks: string;
unassigned: string;
needsAssignee?: string;
needsAssigneeHint?: string;
untitled: string;
loadingDetail: string;
addComment: string;

View file

@ -663,7 +663,7 @@ export const uk: Translations = {
columnHelp: {
triage: "Сирі ідеї — специфікатор деталізує специфікацію",
todo: "Очікує на залежності або не призначено",
ready: "Призначено, очікує тіку диспетчера",
ready: "Залежності задоволені; призначте профіль для диспетчеризації",
running: "Захоплено воркером — у роботі",
blocked: "Воркер запитав втручання людини",
done: "Завершено",

View file

@ -663,7 +663,7 @@ export const zhHant: Translations = {
columnHelp: {
triage: "原始想法 — 規格制定者將完善規格",
todo: "等待相依項目或尚未指派",
ready: "已指派,等待排程器輪詢",
ready: "相依項目已滿足;指派設定檔以便排程",
running: "已被工作者領取 — 執行中",
blocked: "工作者請求人工輸入",
done: "已完成",

View file

@ -659,7 +659,7 @@ export const zh: Translations = {
columnHelp: {
triage: "原始想法 — 规范制定者将完善规格",
todo: "等待依赖项或未分配",
ready: "已分配,等待调度器轮询",
ready: "依赖项已满足;分配一个配置文件以便调度",
running: "已被工作者认领 — 执行中",
blocked: "工作者请求人工输入",
done: "已完成",

View file

@ -2961,9 +2961,25 @@ class BasePlatformAdapter(ABC):
merge_pending_message_event(self._pending_messages, session_key, event)
return # Don't interrupt now - will run after current task completes
# Default behavior for non-photo follow-ups: interrupt the running agent
# Default behavior for non-photo follow-ups: interrupt the running agent.
#
# Use merge_text=True so rapid TEXT follow-ups (#4469) accumulate
# into the single pending slot instead of clobbering each other.
# Without merging, three rapid messages "A", "B", "C" land like:
# _pending_messages[k] = A (interrupts)
# _pending_messages[k] = B (replaces A before consumer reads)
# _pending_messages[k] = C (replaces B)
# ...and only "C" reaches the next turn. merge_pending_message_event
# already does the right thing for photo/media bursts; the
# ``merge_text=True`` flag extends that to plain TEXT events.
# Same shape as the Telegram bursty-grace path in gateway/run.py.
logger.debug("[%s] New message while session %s is active — triggering interrupt", self.name, session_key)
self._pending_messages[session_key] = event
merge_pending_message_event(
self._pending_messages,
session_key,
event,
merge_text=True,
)
# Signal the interrupt (the processing task checks this)
self._active_sessions[session_key].set()
return # Don't process now - will be handled after current task finishes

View file

@ -14,8 +14,8 @@ Provides subcommands for:
import os
import sys
__version__ = "0.13.0"
__release_date__ = "2026.5.7"
__version__ = "0.14.0"
__release_date__ = "2026.5.16"
def _ensure_utf8():

View file

@ -1152,6 +1152,10 @@ DEFAULT_CONFIG = {
"provider": "", # e.g. "openrouter" (empty = inherit parent provider + credentials)
"base_url": "", # direct OpenAI-compatible endpoint for subagents
"api_key": "", # API key for delegation.base_url (falls back to OPENAI_API_KEY)
"api_mode": "", # wire protocol for delegation.base_url: "chat_completions",
# "codex_responses", or "anthropic_messages". Empty = auto-detect
# from URL (e.g. /anthropic suffix → anthropic_messages). Set this
# explicitly for non-standard endpoints the heuristic can't detect.
# When delegate_task narrows child toolsets explicitly, preserve any
# MCP toolsets the parent already has enabled. On by default so
# narrowing (e.g. toolsets=["web","browser"]) expresses "I want these
@ -1609,6 +1613,23 @@ DEFAULT_CONFIG = {
"servers": {},
},
# X (Twitter) Search via xAI's built-in x_search Responses tool.
# The tool registers when xAI credentials are available (SuperGrok
# OAuth or XAI_API_KEY) AND the x_search toolset is enabled in
# `hermes tools`. These settings tune the backing Responses API call.
"x_search": {
# xAI model used for the Responses call. grok-4.20-reasoning is
# the recommended default; any Grok model with x_search tool
# access works.
"model": "grok-4.20-reasoning",
# Request timeout in seconds (minimum 30). x_search can take
# 60-120s for complex queries — the default is generous.
"timeout_seconds": 180,
# Number of automatic retries on 5xx / ReadTimeout / ConnectionError.
# Each retry backs off (1.5x attempt seconds, capped at 5s).
"retries": 2,
},
# Config schema version - bump this when adding new required fields
"_config_version": 23,
}

View file

@ -152,6 +152,30 @@ def _apply_doctor_tool_availability_overrides(available: list[str], unavailable:
return updated_available, updated_unavailable
def _has_healthy_oauth_fallback_for_apikey_provider(provider_label: str) -> bool:
    """Return True when a direct API-key probe failure is non-blocking.

    Some provider families support both a direct API-key path and a separate
    OAuth runtime path. When the OAuth path is already healthy, doctor should
    still show a failed API-key connectivity row, but it should not promote
    that direct-key problem into the final blocking summary.
    """
    try:
        from hermes_cli.auth import (
            get_gemini_oauth_auth_status,
            get_minimax_oauth_auth_status,
        )
    except Exception:
        return False

    normalized = (provider_label or "").strip().lower()
    if normalized in {"google / gemini", "gemini"}:
        return bool((get_gemini_oauth_auth_status() or {}).get("logged_in"))
    if normalized == "minimax":
        return bool((get_minimax_oauth_auth_status() or {}).get("logged_in"))
    return False
def check_ok(text: str, detail: str = ""):
print(f" {color('', Colors.GREEN)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
@ -1594,7 +1618,10 @@ def run_doctor(args):
print(f" {_glyph} {_label} {_detail}")
else:
print(f" {_glyph} {_label}")
for _issue in _r.issues:
_issues_to_add = list(_r.issues)
if _issues_to_add and _has_healthy_oauth_fallback_for_apikey_provider(_r.label):
_issues_to_add = []
for _issue in _issues_to_add:
issues.append(_issue)
# =========================================================================

View file

@ -2525,6 +2525,7 @@ def _is_github_models_base_url(base_url: Optional[str]) -> bool:
return (
normalized.startswith(COPILOT_BASE_URL)
or normalized.startswith("https://models.github.ai/inference")
or normalized.startswith("https://models.inference.ai.azure.com")
)

View file

@ -325,8 +325,15 @@ class PluginContext:
is_async: bool = False,
description: str = "",
emoji: str = "",
override: bool = False,
) -> None:
"""Register a tool in the global registry **and** track it as plugin-provided."""
"""Register a tool in the global registry **and** track it as plugin-provided.
Pass ``override=True`` to replace an existing built-in tool with the
same name (e.g. swap the default ``browser_navigate`` for a custom
CDP-backed implementation). Without it, attempting to register a name
already claimed by a different toolset is rejected.
"""
from tools.registry import registry
registry.register(
@ -339,9 +346,13 @@ class PluginContext:
is_async=is_async,
description=description,
emoji=emoji,
override=override,
)
self._manager._plugin_tool_names.add(name)
logger.debug("Plugin %s registered tool: %s", self.manifest.name, name)
logger.debug(
"Plugin %s registered tool: %s%s",
self.manifest.name, name, " (override)" if override else "",
)
# -- message injection --------------------------------------------------

View file

@ -61,6 +61,7 @@ CONFIGURABLE_TOOLSETS = [
("video", "🎬 Video Analysis", "video_analyze (requires video-capable model)"),
("image_gen", "🎨 Image Generation", "image_generate"),
("video_gen", "🎬 Video Generation", "video_generate (text-to-video + image-to-video)"),
("x_search", "🐦 X (Twitter) Search", "x_search (requires xAI OAuth or XAI_API_KEY)"),
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
("tts", "🔊 Text-to-Speech", "text_to_speech"),
("skills", "📚 Skills", "list, view, manage"),
@ -86,7 +87,12 @@ CONFIGURABLE_TOOLSETS = [
# Video gen is off by default — it's a niche, paid, slow feature. Users
# who want it opt in via `hermes tools` → Video Generation, which walks
# them through provider + model selection.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "spotify", "discord", "discord_admin", "video", "video_gen"}
#
# X search is off by default — gated on xAI credentials (SuperGrok OAuth
# or XAI_API_KEY). Users opt in via `hermes tools` → X (Twitter) Search,
# which walks them through credential setup. The tool's check_fn means
# the schema won't appear to the model even if enabled without credentials.
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "spotify", "discord", "discord_admin", "video", "video_gen", "x_search"}
# Platform-scoped toolsets: only appear in the `hermes tools` checklist for
# these platforms, and only resolve/save for these platforms. A toolset
@ -308,6 +314,39 @@ TOOL_CATEGORIES = {
# converge image_gen toward.
"providers": [],
},
"x_search": {
"name": "X (Twitter) Search",
"setup_title": "Select xAI Credential Source",
"setup_note": (
"Hermes routes X searches through xAI's built-in x_search "
"Responses tool. Both credential sources hit the same "
"https://api.x.ai/v1/responses endpoint — pick whichever you "
"already have. SuperGrok OAuth is preferred when both are set "
"(uses your subscription quota instead of API spend)."
),
"icon": "🐦",
"providers": [
{
"name": "xAI Grok OAuth (SuperGrok Subscription)",
"badge": "subscription",
"tag": "Browser login at accounts.x.ai — no API key required",
"env_vars": [],
"post_setup": "xai_grok",
},
{
"name": "xAI API key",
"badge": "paid",
"tag": "Direct xAI API billing via XAI_API_KEY",
"env_vars": [
{
"key": "XAI_API_KEY",
"prompt": "xAI API key",
"url": "https://console.x.ai/",
},
],
},
],
},
"browser": {
"name": "Browser Automation",
"icon": "🌐",

View file

@ -21,6 +21,7 @@ Public API (signatures preserved from the original 2,400-line version):
"""
import json
import re
import asyncio
import logging
import threading
@ -485,6 +486,48 @@ _AGENT_LOOP_TOOLS = {"todo", "memory", "session_search", "delegate_task"}
_READ_SEARCH_TOOLS = {"read_file", "search_files"}
# =========================================================================
# Tool error sanitization
# =========================================================================
#
# Tool exceptions can carry arbitrary text into the model's context as the
# `tool` message content. json.dumps() handles quote/backslash escaping so a
# raw injection of `</tool_call>` won't break message framing, but the model
# still *reads* those tokens and they can confuse downstream tool-call
# parsing or, in adversarial cases, nudge it toward role-confusion framing.
#
# This helper strips structural framing tokens (XML role tags, CDATA,
# markdown code fences) and caps the message at a sane upper bound before it
# becomes part of the conversation. It's defense-in-depth — the json layer
# already prevents framing escape — but cheap and worth having.
#
# Ported from ironclaw#1639.
_TOOL_ERROR_ROLE_TAG_RE = re.compile(
    r'</?(?:tool_call|function_call|result|response|output|input|system|assistant|user)>',
    re.IGNORECASE,
)
_TOOL_ERROR_FENCE_OPEN_RE = re.compile(r'^\s*```(?:json|xml|html|markdown)?\s*', re.MULTILINE)
_TOOL_ERROR_FENCE_CLOSE_RE = re.compile(r'\s*```\s*$', re.MULTILINE)
_TOOL_ERROR_CDATA_RE = re.compile(r'<!\[CDATA\[.*?\]\]>', re.DOTALL)
_TOOL_ERROR_MAX_LEN = 2000


def _sanitize_tool_error(error_msg: str) -> str:
    """Strip structural framing tokens from a tool error before showing it to the model.

    See the comment block above _TOOL_ERROR_ROLE_TAG_RE for rationale.
    """
    if not error_msg:
        return "[TOOL_ERROR] "
    sanitized = _TOOL_ERROR_ROLE_TAG_RE.sub("", error_msg)
    sanitized = _TOOL_ERROR_FENCE_OPEN_RE.sub("", sanitized)
    sanitized = _TOOL_ERROR_FENCE_CLOSE_RE.sub("", sanitized)
    sanitized = _TOOL_ERROR_CDATA_RE.sub("", sanitized)
    if len(sanitized) > _TOOL_ERROR_MAX_LEN:
        sanitized = sanitized[:_TOOL_ERROR_MAX_LEN - 3] + "..."
    return f"[TOOL_ERROR] {sanitized}"
# =========================================================================
# Tool argument type coercion
# =========================================================================
@ -824,7 +867,7 @@ def handle_function_call(
except Exception as e:
error_msg = f"Error executing {function_name}: {str(e)}"
logger.exception(error_msg)
return json.dumps({"error": error_msg}, ensure_ascii=False)
return json.dumps({"error": _sanitize_tool_error(error_msg)}, ensure_ascii=False)
# =============================================================================

View file

@ -0,0 +1,309 @@
---
name: pinggy-tunnel
description: Zero-install localhost tunnels over SSH via Pinggy.
version: 0.1.0
author: Teknium (teknium1), Hermes Agent
license: MIT
platforms: [linux, macos, windows]
metadata:
hermes:
tags: [Pinggy, Tunnel, Networking, SSH, Webhook, Localhost]
related_skills: [cloudflared-quick-tunnel, webhook-subscriptions]
---
# Pinggy Tunnel Skill
Expose a local service (dev server, webhook receiver, MCP endpoint, demo) to the public internet using a Pinggy SSH reverse tunnel. No daemon to install — the user's stock SSH client connects to `a.pinggy.io:443` and Pinggy hands back a public HTTP/HTTPS URL.
Free tier: 60-minute tunnels, random subdomain, no signup. Pro tier ($3/mo) is opt-in and requires a token.
## When to Use
- User asks to "expose this locally", "share my dev server", "make this URL public", "tunnel port N", "get a public URL for a webhook"
- Need to receive a webhook callback during a local task (Stripe, GitHub, Discord, AgentMail)
- Sharing a one-off HTTP demo (MCP server, Ollama/vLLM endpoint, dashboard) with a remote party
- The host has SSH but no `cloudflared` / `ngrok` binary, and installing one would be overkill
If the host already has `cloudflared` configured, prefer the `cloudflared-quick-tunnel` skill — Cloudflare quick tunnels don't expire after 60 minutes.
## Prerequisites
- `ssh` on PATH (`ssh -V`). Default on Linux, macOS, and Windows 10+. No other install.
- A local service listening on `127.0.0.1:<port>` before the tunnel starts. Pinggy will return URLs but they'll 502 until the local origin is up.
Optional:
- `PINGGY_TOKEN` env var for paid Pro features (persistent subdomain, custom domain, multiple tunnels, no 60-minute cap). Free tier needs no credentials.
## Quick Reference
```bash
# Plain HTTP/HTTPS tunnel for port 8000 (free tier)
ssh -p 443 -o StrictHostKeyChecking=no -o ServerAliveInterval=30 \
-R0:localhost:8000 free@a.pinggy.io
# TCP tunnel (databases, raw SSH, etc.)
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:5432 tcp@a.pinggy.io
# TLS tunnel (Pinggy can't decrypt — bring your own certs at origin)
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:443 tls@a.pinggy.io
# Basic auth gate (b:user:pass)
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 \
"b:admin:secret+free@a.pinggy.io"
# Bearer token gate (k:token)
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 \
"k:mysecrettoken+free@a.pinggy.io"
# IP whitelist (w:CIDR)
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 \
"w:203.0.113.0/24+free@a.pinggy.io"
# Enable CORS + force HTTPS redirect
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 \
"co+x:https+free@a.pinggy.io"
# Pro tier (persistent URL, no 60-min cap)
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 "$PINGGY_TOKEN@a.pinggy.io"
```
## Procedure — Start a Tunnel and Get the URL
The model SHOULD use the `terminal` tool. The tunnel must stay alive for the duration of the share, so run it as a background process and parse the public URL from stdout.
### 1. Confirm a local origin is up
```bash
curl -sI http://127.0.0.1:8000/ | head -1
# expect HTTP/1.x 200 (or any non-connection-refused response)
```
If nothing is listening yet, start it first (e.g. `python3 -m http.server 8000 --bind 127.0.0.1`). Pinggy will happily return a URL pointed at nothing — the user will see 502 until the origin comes up.
### 2. Launch the tunnel as a background process
Use `terminal(background=True)` and capture output to a logfile (Pinggy prints the URLs on stdout, then keeps the connection open):
```bash
LOG=/tmp/pinggy-8000.log
nohup ssh -p 443 \
-o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null \
-o ServerAliveInterval=30 \
-o ServerAliveCountMax=3 \
-R0:localhost:8000 free@a.pinggy.io \
> "$LOG" 2>&1 &
echo $! > /tmp/pinggy-8000.pid
```
`StrictHostKeyChecking=no` + `UserKnownHostsFile=/dev/null` skips the first-run host-key prompt. `ServerAliveInterval=30` keeps the SSH session from getting torn down by an idle NAT.
### 3. Parse the URL out of the log
```bash
sleep 4
grep -oE 'https://[a-z0-9.-]+\.pinggy\.link' /tmp/pinggy-8000.log | head -1
```
Expected output looks like:
```
You are not authenticated.
Your tunnel will expire in 60 minutes.
http://yqycl-98-162-69-48.a.free.pinggy.link
https://yqycl-98-162-69-48.a.free.pinggy.link
```
Hand the `https://...pinggy.link` URL to the user.
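
The fixed `sleep 4` is only a guess at how long the banner takes to appear. A minimal Python sketch of the same step, polling the logfile until a URL shows up, might look like the following. The function names, the timeout values, and the regex (any host ending in `.pinggy.link`, as in the sample output) are assumptions made for illustration and are not part of Pinggy's tooling.

```python
import re
import time
from pathlib import Path
from typing import Optional

# Assumption: free-tier HTTP tunnels print a URL ending in .pinggy.link,
# as in the sample output shown in this step.
PINGGY_URL_RE = re.compile(r"https://[a-z0-9.-]+\.pinggy\.link")


def extract_pinggy_url(log_text: str) -> Optional[str]:
    """Return the first https Pinggy URL found in the log text, or None."""
    match = PINGGY_URL_RE.search(log_text)
    return match.group(0) if match else None


def wait_for_pinggy_url(log_path: str, timeout: float = 15.0, poll: float = 0.5) -> Optional[str]:
    """Poll the logfile until a URL appears or the timeout expires."""
    deadline = time.monotonic() + timeout
    path = Path(log_path)
    while time.monotonic() < deadline:
        if path.exists():
            url = extract_pinggy_url(path.read_text(errors="replace"))
            if url:
                return url
        time.sleep(poll)
    return None
```

Calling `wait_for_pinggy_url("/tmp/pinggy-8000.log")` returns as soon as the banner is written, and returns `None` if the tunnel never came up, which is a clearer failure signal than an empty `grep` result.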
### 4. Verify
```bash
curl -sI https://<the-url>/ | head -3
# expect 200/302/whatever the local origin actually returns
```
If you get `502 Bad Gateway`, the SSH session is up but the local origin isn't listening — fix step 1 first.
### 5. Teardown
```bash
kill "$(cat /tmp/pinggy-8000.pid)"
# or, if the pid file got lost:
pkill -f 'ssh -p 443 .* free@a\.pinggy\.io'
```
If you have a session_id from `terminal(background=True)`, prefer `process(action='kill', session_id=...)`.
## Access Control via Username Keywords
Pinggy stacks control flags into the SSH username separated by `+`. Always quote the whole `user@host` argument when it contains a `+`:
| Keyword | Effect |
|---------|--------|
| `b:user:pass` | HTTP Basic auth gate |
| `k:token` | Bearer-token header gate (`Authorization: Bearer <token>`) |
| `w:CIDR` | IP whitelist (single IP or CIDR, repeatable) |
| `co` | Add `Access-Control-Allow-Origin: *` (CORS) |
| `x:https` | Force HTTPS — auto-redirect HTTP to HTTPS |
| `a:Name:Value` | Add request header |
| `u:Name:Value` | Update request header |
| `r:Name` | Remove request header |
| `qr` | Print a QR code of the URL to stdout (handy for mobile sharing) |
Combine freely: `"b:admin:secret+co+x:https+free@a.pinggy.io"`.
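
When the command is assembled programmatically, building the argument list directly sidesteps shell quoting altogether. The sketch below is one way to do that, assuming the `+`-joined keyword syntax from the table above; `pinggy_target` and `pinggy_argv` are hypothetical helper names, not part of any Pinggy or Hermes API.

```python
import shlex


def pinggy_target(keywords, account="free", host="a.pinggy.io"):
    """Join keywords and the account name with '+' to form user@host."""
    return "+".join([*keywords, account]) + "@" + host


def pinggy_argv(local_port, keywords):
    """Build the ssh argument list for an HTTP tunnel to a local port."""
    return [
        "ssh", "-p", "443",
        "-o", "StrictHostKeyChecking=no",
        "-o", "UserKnownHostsFile=/dev/null",
        f"-R0:localhost:{local_port}",
        pinggy_target(keywords),
    ]


argv = pinggy_argv(8000, ["b:admin:secret", "co", "x:https"])
# shlex.join renders the list as a single copy-pasteable command line.
print(shlex.join(argv))
```

Passing `argv` as a list to `subprocess.Popen` means no shell ever interprets the `+` or `:` characters.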
## Web Debugger (optional)
Pinggy can mirror the inbound traffic to `localhost:4300` for inspection. Add a local forward to the SSH command:
```bash
ssh -p 443 -L4300:localhost:4300 -R0:localhost:8000 free@a.pinggy.io
```
Then open `http://localhost:4300` in a browser to see live request/response pairs.
## Pitfalls
- **60-minute hard cap on the free tier.** The SSH session terminates at the 60-minute mark; the URL goes dead. For longer shares, either use `PINGGY_TOKEN` (Pro) or auto-restart with a shell loop (note that the URL changes on every restart for free-tier).
- **Free-tier URL is random and changes on restart.** Don't bookmark it, don't paste it into a config file. Re-parse from the log each time.
- **Concurrent free tunnels are limited to one per source IP.** Starting a second tunnel from the same machine usually kills the first. Pro tier lifts this.
- **`+` in usernames must be quoted.** Bare `ssh ... b:admin:secret+free@a.pinggy.io` works in bash but breaks under shells that treat `+` specially or when assembled programmatically. Always wrap in double quotes.
- **Don't tunnel anything sensitive without an access-control flag.** A bare HTTP tunnel is reachable by anyone with the URL. Use `b:`, `k:`, or `w:` for non-public services.
- **`process(action='log')` may miss SSH banner output.** Pinggy prints the URLs and then the SSH session goes interactive. Always redirect to a logfile and `grep` the file directly — same pattern as `cloudflared-quick-tunnel`.
- **Host-key prompt on first run.** Default OpenSSH config asks the user to accept Pinggy's host key. Always pass `-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null` for unattended runs.
- **TCP and TLS tunnels return a `<subdomain>.a.pinggy.online:<port>` pair, not an https URL.** Parse with a different regex (`tcp://` and a port). Don't assume every Pinggy tunnel is HTTP.
- **Pro mode requires the token as the username, not a flag.** Use `"$PINGGY_TOKEN@a.pinggy.io"` (the token takes the place of `free`). With a token you can also add `:persistent` for a stable subdomain; see `pinggy.io/docs/`.
## Recipes
Composite patterns combining a local origin with a Pinggy tunnel. Each recipe is self-contained — start the origin, start the tunnel, parse the URL, hand it back to the user.
### Recipe 1 — Receive a webhook callback
Use this when an external service (Stripe, GitHub, Discord, AgentMail, etc.) needs to POST to a publicly reachable URL during a local task.
```bash
# 1. Tiny capturing server: every request gets appended to /tmp/webhook-hits.log
cat >/tmp/webhook-server.py <<'PY'
import http.server, json, datetime, pathlib
LOG = pathlib.Path("/tmp/webhook-hits.log")
class H(http.server.BaseHTTPRequestHandler):
    def _capture(self):
        n = int(self.headers.get("content-length") or 0)
        body = self.rfile.read(n).decode("utf-8", "replace") if n else ""
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated in Python 3.12+)
        rec = {"t": datetime.datetime.now(datetime.timezone.utc).isoformat(), "path": self.path,
               "method": self.command, "headers": dict(self.headers), "body": body}
        with LOG.open("a") as f: f.write(json.dumps(rec) + "\n")
        self.send_response(200); self.send_header("content-type","application/json")
        self.end_headers(); self.wfile.write(b'{"ok":true}\n')
    def do_GET(self): self._capture()
    def do_POST(self): self._capture()
    def log_message(self,*a,**k): pass
http.server.HTTPServer(("127.0.0.1", 18080), H).serve_forever()
PY
nohup python3 /tmp/webhook-server.py >/tmp/webhook-server.log 2>&1 &
echo $! >/tmp/webhook-server.pid
# 2. Tunnel — bearer-token-gate so randos can't pollute the capture log
nohup ssh -p 443 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
-o ServerAliveInterval=30 \
-R0:localhost:18080 "k:$(openssl rand -hex 12)+free@a.pinggy.io" \
>/tmp/webhook-pinggy.log 2>&1 &
echo $! >/tmp/webhook-pinggy.pid
sleep 5
URL=$(grep -oE 'https://[a-z0-9.-]+\.pinggy\.link' /tmp/webhook-pinggy.log | head -1)
echo "Webhook URL: $URL"
# 3. While the agent works, watch hits land
tail -f /tmp/webhook-hits.log
```
Hand `$URL` to the service that needs to call you. Teardown: `kill $(cat /tmp/webhook-server.pid) $(cat /tmp/webhook-pinggy.pid)`.
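
Since the capture server writes one JSON object per line, the log can be read back programmatically instead of tailed. A minimal sketch, assuming the record fields written by the server in this recipe (`t`, `path`, `method`, `headers`, `body`); the helper names are illustrative only.

```python
import json
from pathlib import Path


def load_webhook_hits(log_path="/tmp/webhook-hits.log"):
    """Parse the JSON-lines capture log into a list of dicts."""
    path = Path(log_path)
    if not path.exists():
        return []
    hits = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            hits.append(json.loads(line))
        except json.JSONDecodeError:
            # A partially written or corrupt line is skipped rather than fatal.
            continue
    return hits


def posts_to(hits, path_prefix):
    """Keep only POST requests whose path starts with the given prefix."""
    return [
        h for h in hits
        if h.get("method") == "POST" and h.get("path", "").startswith(path_prefix)
    ]
```

For example, `posts_to(load_webhook_hits(), "/stripe")` would return only the POSTs delivered to a hypothetical `/stripe` path.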
### Recipe 2 — Expose an MCP server over HTTP/SSE
Use when a remote MCP client (Claude Desktop on another machine, a teammate's editor, etc.) needs to reach an MCP server running on the local box. Only works for MCP servers that speak HTTP transport — stdio-mode servers can't be tunneled.
```bash
# 1. Start the MCP server in HTTP mode (example: a FastMCP server on port 8765)
nohup python3 my_mcp_server.py --transport http --port 8765 \
>/tmp/mcp-server.log 2>&1 &
echo $! >/tmp/mcp-server.pid
# 2. Tunnel with a bearer token — MCP traffic should not be open to the internet
TOKEN=$(openssl rand -hex 16)
nohup ssh -p 443 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
-o ServerAliveInterval=30 \
-R0:localhost:8765 "k:$TOKEN+free@a.pinggy.io" \
>/tmp/mcp-pinggy.log 2>&1 &
echo $! >/tmp/mcp-pinggy.pid
sleep 5
URL=$(grep -oE 'https://[a-z0-9.-]+\.pinggy\.link' /tmp/mcp-pinggy.log | head -1)
echo "MCP URL: $URL"
echo "Bearer token: $TOKEN"
```
The remote client connects to `$URL` with `Authorization: Bearer $TOKEN`. Hermes' own native MCP client config: `{"transport": "http", "url": "<URL>", "headers": {"Authorization": "Bearer <TOKEN>"}}`.
### Recipe 3 — Expose a local LLM endpoint (Ollama / vLLM / llama.cpp)
Share a local model with a remote caller (another agent, a phone, a teammate). Ollama listens on `:11434`, vLLM and llama.cpp typically on `:8000`.
```bash
# Pre-req: the model server is already running on 127.0.0.1:11434 (Ollama default)
TOKEN=$(openssl rand -hex 16)
nohup ssh -p 443 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
-o ServerAliveInterval=30 \
-R0:localhost:11434 "k:$TOKEN+co+free@a.pinggy.io" \
>/tmp/llm-pinggy.log 2>&1 &
echo $! >/tmp/llm-pinggy.pid
sleep 5
URL=$(grep -oE 'https://[a-z0-9.-]+\.pinggy\.link' /tmp/llm-pinggy.log | head -1)
echo "Endpoint: $URL"
echo "Token: $TOKEN"
# Verify
curl -s "$URL/api/tags" -H "Authorization: Bearer $TOKEN" | head
```
`co` enables CORS so a browser caller can hit the endpoint. Drop `co` for backend-only callers. For an OpenAI-compatible vLLM/llama.cpp endpoint, callers use base URL `$URL/v1` with `Authorization: Bearer $TOKEN`. Note that Pinggy does not strip or rewrite the request, so the model server itself also sees the tunnel token in the `Authorization` header; configure the local server to ignore auth (it is already bound to `127.0.0.1`) and let Pinggy do the gating.
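
On the caller side, the only change from talking to a local server is the base URL and the bearer header. The sketch below prepares such a request with the standard library without sending it. It assumes the server exposes the common OpenAI-style `/v1/models` listing route, and `build_models_request` is a hypothetical helper name.

```python
import urllib.request


def build_models_request(base_url: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET to the OpenAI-compatible model list."""
    url = base_url.rstrip("/") + "/v1/models"
    # The bearer token is the one Pinggy checks at the tunnel edge.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

Sending it is then a matter of `urllib.request.urlopen(req, timeout=10)` with the URL and token printed by the recipe.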
### Recipe 4 — Share a dev server with a one-shot password
The fastest "let a teammate poke at my running app" pattern. Random password, prints once, dies when you Ctrl-C.
```bash
PASS=$(openssl rand -base64 12 | tr -d '+/=' | head -c 12)
echo "Dev server password: $PASS"
ssh -p 443 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
-o ServerAliveInterval=30 \
-R0:localhost:3000 "b:dev:$PASS+co+x:https+free@a.pinggy.io"
# URL prints to the terminal. Share URL + password. Ctrl-C to tear down.
```
`b:dev:$PASS` gates the URL with HTTP Basic auth. `x:https` forces TLS. `co` adds CORS for SPA frontends.
## Verification
```bash
# End-to-end: spin up a trivial origin, tunnel it, hit it, tear down
python3 -m http.server 18000 --bind 127.0.0.1 >/tmp/origin.log 2>&1 &
ORIGIN_PID=$!
nohup ssh -p 443 \
-o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null \
-R0:localhost:18000 free@a.pinggy.io >/tmp/pinggy-verify.log 2>&1 &
SSH_PID=$!
sleep 5
URL=$(grep -oE 'https://[a-z0-9.-]+\.pinggy\.link' /tmp/pinggy-verify.log | head -1)
echo "URL: $URL"
curl -sI "$URL/" | head -1
kill "$SSH_PID" "$ORIGIN_PID"
```
Expected: a `pinggy.link` URL and `HTTP/2 200` on the curl head.

View file

@ -0,0 +1,199 @@
---
name: darwinian-evolver
description: Evolve prompts/regex/SQL/code with Imbue's evolution loop.
version: 0.1.0
author: Bihruze (Asahi0x), Hermes Agent
license: MIT
platforms: [linux, macos]
metadata:
hermes:
tags: [evolution, optimization, prompt-engineering, research]
related_skills: [arxiv, jupyter-live-kernel]
---
# Darwinian Evolver
Run Imbue's [darwinian_evolver](https://github.com/imbue-ai/darwinian_evolver) — an
LLM-driven evolutionary search loop — to optimize a **prompt, regex, SQL query,
or small code snippet** against a fitness function.
Status: thin wrapper around the upstream tool. The skill installs it, walks the
agent through writing a `Problem` definition (organism + evaluator + mutator),
and drives the loop via the upstream CLI or a small custom Python driver.
**License:** the upstream tool is **AGPL-3.0**. The skill ONLY ever invokes it
via the upstream CLI or a `subprocess`/`uv run` call (mere aggregation). Do NOT
import upstream classes into Hermes itself.
## When to Use
- User says "optimize this prompt", "evolve a regex for X", "auto-improve this
code/SQL", "search for a better instruction".
- You have a scorer (exact match, regex pass-rate, unit test, LLM-judge, runtime
metric) AND a starting candidate (organism). If you don't have a scorer, stop
and define one first — that's the hard part.
- Cost is OK: a typical run is 50 to 500 LLM calls. On gpt-4o-mini that's pennies;
on Claude Sonnet it can be a few dollars.
Do **not** use this when:
- The optimization target is differentiable (use gradient descent / DSPy).
- You only need to try 2 or 3 variants; just write them by hand.
- The fitness signal is purely subjective with no measurable criterion.
## Prerequisites
- Python ≥3.11
- `git`, `uv` (or `pip`)
- One of: `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, or `OPENAI_API_KEY`
The skill ships a small `parrot_openrouter.py` driver that uses `OPENROUTER_API_KEY`
via the OpenAI SDK, so any model on OpenRouter works. The upstream CLI itself
hardcodes Anthropic and needs `ANTHROPIC_API_KEY`.
## Install (One-Time)
Run via the `terminal` tool:
```bash
mkdir -p ~/.hermes/cache/darwinian-evolver && cd ~/.hermes/cache/darwinian-evolver
[ -d darwinian_evolver ] || git clone --depth 1 https://github.com/imbue-ai/darwinian_evolver.git
cd darwinian_evolver && uv sync
```
Verify:
```bash
cd ~/.hermes/cache/darwinian-evolver/darwinian_evolver \
&& uv run darwinian_evolver --help | head -5
```
## Quick Start — The Built-In Parrot Example
Tiny smoke test (requires `ANTHROPIC_API_KEY`):
```bash
cd ~/.hermes/cache/darwinian-evolver/darwinian_evolver
uv run darwinian_evolver parrot \
--num_iterations 2 \
--num_parents_per_iteration 2 \
--mutator_concurrency 2 --evaluator_concurrency 2 \
--output_dir /tmp/parrot_demo
```
Outputs:
- `/tmp/parrot_demo/snapshots/iteration_N.pkl` — pickled population per iteration
- `/tmp/parrot_demo/<jsonl>` — per-iteration JSON log (path printed at end)
Open `~/.hermes/cache/darwinian-evolver/darwinian_evolver/darwinian_evolver/lineage_visualizer.html`
in a browser and load the JSON log to see the evolutionary tree.
## Quick Start — OpenRouter Driver (No Anthropic Key)
The skill ships `scripts/parrot_openrouter.py` — same parrot problem, but the
LLM call goes through OpenRouter so any provider works.
```bash
# From wherever the skill is installed:
SKILL_DIR=~/.hermes/skills/research/darwinian-evolver
DE_DIR=~/.hermes/cache/darwinian-evolver/darwinian_evolver
cd "$DE_DIR" && \
EVOLVER_MODEL='openai/gpt-4o-mini' \
uv run --with openai python "$SKILL_DIR/scripts/parrot_openrouter.py" \
--num_iterations 3 --num_parents_per_iteration 2 \
--output_dir /tmp/parrot_or
```
Inspect the result with `scripts/show_snapshot.py`:
```bash
uv run --with openai python "$SKILL_DIR/scripts/show_snapshot.py" \
/tmp/parrot_or/snapshots/iteration_3.pkl
```
Expected output: 7 evolved prompt templates ranked by score, with the best
landing around 0.6 to 0.8 (the seed `Say {{ phrase }}` scored 0.000).
## Defining a Custom Problem
The skill ships `templates/custom_problem_template.py` — copy, edit, run.
Three things you must define:
1. **`Organism`** — a Pydantic `BaseModel` subclass holding the artifact being
evolved (`prompt_template: str`, `regex_pattern: str`, `sql_query: str`,
`code_block: str`, etc.). Add a `run(*args)` method that exercises it.
2. **`Evaluator`** — `.evaluate(organism) -> EvaluationResult(score=..., trainable_failure_cases=[...], holdout_failure_cases=[...], is_viable=True)`.
- **`score`** is in `[0, 1]`. Higher is better.
- **`trainable_failure_cases`** — what the mutator sees. Include enough
context (input, expected, actual) for the LLM to diagnose.
- **`holdout_failure_cases`** — kept out of the mutator's view. Use these
to detect overfitting.
- **`is_viable=True`** unless the organism is completely broken (raises,
returns None, etc.). A 0-score viable organism is fine — it just gets
down-weighted in parent selection.
3. **`Mutator`** — `.mutate(organism, failure_cases, learning_log_entries) -> list[Organism]`.
Typically: build an LLM prompt that includes the current organism + a
failure case + an ask to propose a fix; parse the LLM's response; return
a new `Organism`. Return `[]` on parse failure — the loop handles it.
Then write a driver script that wires `Problem(initial_organism, evaluator, [mutators])`
into `EvolveProblemLoop` and iterates over `loop.run(num_iterations=N)` — the
shipped `scripts/parrot_openrouter.py` is the reference.
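
To make the organism and evaluator contract concrete, here is a self-contained sketch for a regex-evolution problem. It deliberately uses plain dataclasses as stand-ins that only mirror the field names described above; it does not import or reproduce the upstream `darwinian_evolver` classes, and every class name in it is hypothetical. A real problem would subclass the upstream types inside a user-side driver script.

```python
import re
from dataclasses import dataclass, field


# Stand-in types that mirror the field names described in this section.
# They are NOT the upstream darwinian_evolver classes.
@dataclass
class RegexOrganism:
    regex_pattern: str

    def run(self, text: str) -> bool:
        """Return True when the pattern matches the whole input."""
        try:
            return re.fullmatch(self.regex_pattern, text) is not None
        except re.error:
            return False


@dataclass
class EvaluationResult:
    score: float
    trainable_failure_cases: list = field(default_factory=list)
    holdout_failure_cases: list = field(default_factory=list)
    is_viable: bool = True


class RegexEvaluator:
    def __init__(self, train, holdout):
        # Each case is a tuple of (input_text, expected_match).
        self.train = train
        self.holdout = holdout

    def _failures(self, organism, cases):
        # Record input, expected, and actual so a mutator could diagnose.
        return [
            {"input": text, "expected": want, "actual": organism.run(text)}
            for text, want in cases
            if organism.run(text) != want
        ]

    def evaluate(self, organism):
        train_fail = self._failures(organism, self.train)
        holdout_fail = self._failures(organism, self.holdout)
        total = len(self.train) + len(self.holdout)
        passed = total - len(train_fail) - len(holdout_fail)
        return EvaluationResult(
            score=passed / total if total else 0.0,
            trainable_failure_cases=train_fail,
            holdout_failure_cases=holdout_fail,
        )
```

The score stays in `[0, 1]` as a pass rate, and a seed that fails most cases is still reported as viable so the loop has something to evolve from.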
## Hyperparameters That Actually Matter
| flag | default | when to change |
|---|---|---|
| `--num_iterations` | 5 | bump to 10-20 once you trust the evaluator |
| `--num_parents_per_iteration` | 4 | drop to 2 for cheap exploration |
| `--mutator_concurrency` | 10 | drop to 2-4 to avoid rate limits |
| `--evaluator_concurrency` | 10 | same; evaluator hits the LLM too |
| `--batch_size` | 1 | raise to 3-5 once your mutator handles multiple failures |
| `--verify_mutations` | off | turn on once mutator is wasteful (>10× cost saving on later runs per Imbue) |
| `--midpoint_score` | `p75` | leave alone unless scores cluster |
| `--sharpness` | 10 | leave alone |
## Pitfalls
1. **`Initial organism must be viable`** — set `is_viable=True` in your
`EvaluationResult` even on a 0-score seed. The loop refuses non-viable
organisms because they imply the loop has nothing to evolve from.
2. **Provider content filters kill runs.** Azure-backed OpenRouter models
reject phrases like "ignore previous instructions" with HTTP 400. Wrap
the LLM call in `try/except` and return `f"<LLM_ERROR: {e}>"` — the
evolver will just score that organism 0 and move on.
3. **`loop.run()` is a generator** — calling it doesn't run anything until
you iterate. Use `for snap in loop.run(num_iterations=N):`.
4. **Snapshots are nested pickles.** `iteration_N.pkl` contains a dict with
`population_snapshot` (more pickled bytes). To unpickle you must have the
`Organism` class importable under the same dotted path it was pickled at.
5. **Concurrency defaults are aggressive.** 10/10 will hit rate limits on
most providers. Start with 2/2.
6. **CLI is hardcoded to Anthropic.** `uv run darwinian_evolver <problem>`
reaches for `ANTHROPIC_API_KEY` and uses Claude Sonnet. To use any other
provider, write a driver like `parrot_openrouter.py`.
7. **AGPL.** Never `from darwinian_evolver import ...` inside Hermes core.
Custom driver scripts under `~/.hermes/skills/...` are user-side and fine.
8. **No PyPI package.** `pip install darwinian-evolver` will pull the wrong
thing. Always install from the GitHub repo.
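Pitfall 3 is ordinary Python generator behaviour. The sketch below uses a stand-in `run` function (hypothetical, not the upstream `EvolveProblemLoop.run`) to show that calling a generator function does no work until something iterates over the result.

```python
executed = []  # records which iterations actually ran


def run(num_iterations):
    """Stand-in for a generator-based loop; NOT the upstream implementation."""
    for i in range(num_iterations):
        executed.append(i)  # side effect proves the iteration body ran
        yield i


pending = run(3)              # creates the generator; no iteration has run yet
ran_before_iterating = list(executed)
snapshots = list(pending)     # iterating is what drives the loop
```

If a driver seems to finish instantly without output, check that the result of `loop.run(...)` is consumed by a `for` loop or `list(...)`.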
## Verification
After install + a parrot run, exit code 0 from this is sufficient:
```bash
DE_DIR=~/.hermes/cache/darwinian-evolver/darwinian_evolver
ls "$DE_DIR/darwinian_evolver/lineage_visualizer.html" >/dev/null && \
cd "$DE_DIR" && uv run darwinian_evolver --help >/dev/null && \
echo "darwinian-evolver: OK"
```
## References
- [Imbue research post](https://imbue.com/research/2026-02-27-darwinian-evolver/)
- [ARC-AGI-2 results](https://imbue.com/research/2026-02-27-arc-agi-2-evolution/)
- [imbue-ai/darwinian_evolver](https://github.com/imbue-ai/darwinian_evolver) (AGPL-3.0)
- [Darwin Gödel Machines](https://arxiv.org/abs/2505.22954)
- [PromptBreeder](https://arxiv.org/abs/2309.16797)


@ -0,0 +1,218 @@
"""
parrot_openrouter: same as the upstream `parrot` example but the LLM call goes
through OpenRouter (OpenAI SDK) instead of Anthropic native. Lets us run an
end-to-end evolution with whatever model the user already has paid access to.
Run with:
uv --project darwinian_evolver run python parrot_openrouter.py \
--num_iterations 3 --output_dir /tmp/parrot_out
Reads `OPENROUTER_API_KEY` from the environment.
"""
from __future__ import annotations
import argparse
import os
import sys
from pathlib import Path
import jinja2
from openai import OpenAI
# Vendored problem types from upstream (AGPL — only run via subprocess in production)
from darwinian_evolver.cli_common import build_hyperparameter_config_from_args
from darwinian_evolver.cli_common import register_hyperparameter_args
from darwinian_evolver.cli_common import parse_learning_log_view_type
from darwinian_evolver.evolve_problem_loop import EvolveProblemLoop
from darwinian_evolver.learning_log import LearningLogEntry
from darwinian_evolver.problem import EvaluationFailureCase
from darwinian_evolver.problem import EvaluationResult
from darwinian_evolver.problem import Evaluator
from darwinian_evolver.problem import Mutator
from darwinian_evolver.problem import Organism
from darwinian_evolver.problem import Problem
DEFAULT_MODEL = os.environ.get("EVOLVER_MODEL", "openai/gpt-4o-mini")
def _client() -> OpenAI:
key = os.environ.get("OPENROUTER_API_KEY")
if not key:
sys.exit("OPENROUTER_API_KEY is not set")
return OpenAI(api_key=key, base_url="https://openrouter.ai/api/v1")
def _prompt_llm(prompt: str) -> str:
try:
r = _client().chat.completions.create(
model=DEFAULT_MODEL,
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return r.choices[0].message.content or ""
except Exception as e:
# Treat any provider error (rate limit, content filter, schema reject)
# as a failed response. The evolver will simply see this as a low score
# on this organism and move on — much friendlier than killing the run.
return f"<LLM_ERROR: {type(e).__name__}: {e}>"
class ParrotOrganism(Organism):
prompt_template: str
def run(self, phrase: str) -> str:
try:
prompt = jinja2.Template(self.prompt_template).render(phrase=phrase)
except jinja2.exceptions.TemplateError as e:
return f"Error rendering prompt: {e}"
if not prompt:
return ""
return _prompt_llm(prompt)
class ParrotEvaluationFailureCase(EvaluationFailureCase):
phrase: str
response: str
class ImproveParrotMutator(Mutator[ParrotOrganism, ParrotEvaluationFailureCase]):
IMPROVEMENT_PROMPT_TEMPLATE = """
We want to build a prompt that causes an LLM to repeat back a given phrase verbatim.
The current prompt template is:
```
{{ organism.prompt_template }}
```
Unfortunately, on this phrase:
```
{{ failure_case.phrase }}
```
the LLM responded with:
```
{{ failure_case.response }}
```
Diagnose what went wrong, then propose an improved prompt template. Put the new
template in the LAST triple-backtick block of your response.
""".strip()
def mutate(
self,
organism: ParrotOrganism,
failure_cases: list[ParrotEvaluationFailureCase],
learning_log_entries: list[LearningLogEntry],
) -> list[ParrotOrganism]:
fc = failure_cases[0]
prompt = jinja2.Template(self.IMPROVEMENT_PROMPT_TEMPLATE).render(
organism=organism, failure_case=fc
)
try:
resp = _prompt_llm(prompt)
parts = resp.split("```")
if len(parts) < 3:
return []
new_tpl = parts[-2].strip()
return [ParrotOrganism(prompt_template=new_tpl)]
except Exception as e:
print(f"mutate error: {e}", file=sys.stderr)
return []
class ParrotEvaluator(Evaluator[ParrotOrganism, EvaluationResult, ParrotEvaluationFailureCase]):
TRAINABLE_PHRASES = [
"Hello world.",
"bla",
"Bla",
"bla.",
'"bla bla".',
"Just say 'foo' once with no extra words.",
]
HOLDOUT_PHRASES = [
"bla, but only once.",
"'bla'",
]
def evaluate(self, organism: ParrotOrganism) -> EvaluationResult:
train_fails: list[ParrotEvaluationFailureCase] = []
hold_fails: list[ParrotEvaluationFailureCase] = []
for i, p in enumerate(self.TRAINABLE_PHRASES):
r = organism.run(p)
if r != p:
train_fails.append(ParrotEvaluationFailureCase(
phrase=p, response=r, data_point_id=f"trainable_{i}"))
for i, p in enumerate(self.HOLDOUT_PHRASES):
r = organism.run(p)
if r != p:
hold_fails.append(ParrotEvaluationFailureCase(
phrase=p, response=r, data_point_id=f"holdout_{i}"))
n_total = len(self.TRAINABLE_PHRASES) + len(self.HOLDOUT_PHRASES)
n_ok = n_total - len(train_fails) - len(hold_fails)
return EvaluationResult(
score=n_ok / n_total,
trainable_failure_cases=train_fails,
holdout_failure_cases=hold_fails,
# Always viable. Even a 0-score seed is a valid starting point; the
# mutator should still get a chance to fix it.
is_viable=True,
)
def make_problem() -> Problem:
return Problem[ParrotOrganism, EvaluationResult, ParrotEvaluationFailureCase](
evaluator=ParrotEvaluator(),
mutators=[ImproveParrotMutator()],
initial_organism=ParrotOrganism(prompt_template="Say {{ phrase }}"),
)
def main() -> int:
ap = argparse.ArgumentParser()
register_hyperparameter_args(ap.add_argument_group("hyperparameters"))
ap.add_argument("--num_iterations", type=int, default=3)
ap.add_argument("--mutator_concurrency", type=int, default=4)
ap.add_argument("--evaluator_concurrency", type=int, default=4)
ap.add_argument("--output_dir", type=str, required=True)
args = ap.parse_args()
out = Path(args.output_dir)
out.mkdir(parents=True, exist_ok=True)
hp = build_hyperparameter_config_from_args(args)
loop = EvolveProblemLoop(
problem=make_problem(),
learning_log_view_type=parse_learning_log_view_type(hp.learning_log_view_type),
num_parents_per_iteration=hp.num_parents_per_iteration,
mutator_concurrency=args.mutator_concurrency,
evaluator_concurrency=args.evaluator_concurrency,
fixed_midpoint_score=hp.fixed_midpoint_score,
midpoint_score_percentile=hp.midpoint_score_percentile,
sharpness=hp.sharpness,
novelty_weight=hp.novelty_weight,
batch_size=hp.batch_size,
should_verify_mutations=hp.verify_mutations,
)
import json
log_path = out / "results.jsonl"
snap_dir = out / "snapshots"
snap_dir.mkdir(exist_ok=True)
print("Evaluating initial organism...")
for snap in loop.run(num_iterations=args.num_iterations):
(snap_dir / f"iteration_{snap.iteration}.pkl").write_bytes(snap.snapshot)
_, best_eval = snap.best_organism_result
print(f"iter={snap.iteration} pop={snap.population_size} "
f"best_score={best_eval.score:.3f}")
with log_path.open("a") as f:
f.write(json.dumps({
"iteration": snap.iteration,
"best_score": best_eval.score,
"pop_size": snap.population_size,
"score_percentiles": {str(k): v for k, v in snap.score_percentiles.items()},
}) + "\n")
print(f"\nDone. Results in: {out}")
return 0
if __name__ == "__main__":
sys.exit(main())


@ -0,0 +1,69 @@
"""
show_snapshot.py: dump the population from a darwinian-evolver snapshot pickle.
Usage:
python show_snapshot.py PATH/TO/iteration_N.pkl [--field prompt_template]
The script is intentionally Organism-agnostic: it inspects `vars(org)` and prints
one field per organism. By default that is the first public str field it finds
(`prompt_template` for the parrot example); pass --field to target a different
attribute (e.g. `regex_pattern`, `sql_query`, `code_block`).
"""
from __future__ import annotations
import argparse
import pickle
import sys
from pathlib import Path
def main() -> int:
ap = argparse.ArgumentParser()
ap.add_argument("snapshot", type=Path)
ap.add_argument(
"--field",
default=None,
help="Organism attribute to display. Defaults to the first str field found.",
)
ap.add_argument("--top", type=int, default=None, help="Show only top N by score.")
args = ap.parse_args()
if not args.snapshot.exists():
sys.exit(f"snapshot not found: {args.snapshot}")
# The outer pickle wraps a dict; the inner pickle contains the actual organism
# objects, which must be importable under their original dotted path. If you
# ran a custom driver, make sure its module is on sys.path before calling this.
outer = pickle.loads(args.snapshot.read_bytes())
if not isinstance(outer, dict) or "population_snapshot" not in outer:
sys.exit("not a darwinian-evolver snapshot (no population_snapshot key)")
inner = pickle.loads(outer["population_snapshot"])
pairs = inner["organisms"] # list of (Organism, EvaluationResult)
print(f"# organisms: {len(pairs)}\n")
ranked = sorted(pairs, key=lambda p: getattr(p[1], "score", 0) or 0, reverse=True)
if args.top:
ranked = ranked[: args.top]
for i, (org, res) in enumerate(ranked):
score = getattr(res, "score", float("nan"))
print(f"=== rank {i} score={score:.3f} ===")
# pick field
field = args.field
if field is None:
for k, v in vars(org).items():
if isinstance(v, str) and not k.startswith("_") and k not in ("id",):
field = k
break
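        val = getattr(org, field, None) if field else None
        if val is not None and not isinstance(val, str):
            # --field may name a non-str attribute; show its repr instead of crashing.
            val = repr(val)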
if val is None:
print(f" (no string field; org fields: {list(vars(org).keys())})")
else:
print(f" {field} ({len(val)} chars):")
for ln in val.splitlines()[:30]:
print(f" {ln}")
print()
return 0
if __name__ == "__main__":
sys.exit(main())


@ -0,0 +1,240 @@
"""
Template: a custom darwinian-evolver problem.
Copy this file, fill in the THREE marked spots (Organism, Evaluator, Mutator),
then run it as a driver script. The skeleton handles all the wiring so you only
write the domain-specific logic.
To run:
cd ~/.hermes/cache/darwinian-evolver/darwinian_evolver
OPENROUTER_API_KEY=... uv run --with openai python /path/to/this_file.py \
--num_iterations 3 --num_parents_per_iteration 2 \
--output_dir /tmp/my_problem
The pattern mirrors `scripts/parrot_openrouter.py` (the working reference).
"""
from __future__ import annotations
import argparse
import os
import sys
from pathlib import Path
from openai import OpenAI
# Upstream types (AGPL — invoked via subprocess in production; importing here
# is fine for skill-side driver scripts the user owns).
from darwinian_evolver.cli_common import (
build_hyperparameter_config_from_args,
parse_learning_log_view_type,
register_hyperparameter_args,
)
from darwinian_evolver.evolve_problem_loop import EvolveProblemLoop
from darwinian_evolver.learning_log import LearningLogEntry
from darwinian_evolver.problem import (
EvaluationFailureCase,
EvaluationResult,
Evaluator,
Mutator,
Organism,
Problem,
)
DEFAULT_MODEL = os.environ.get("EVOLVER_MODEL", "openai/gpt-4o-mini")
def _client() -> OpenAI:
key = os.environ.get("OPENROUTER_API_KEY")
if not key:
sys.exit("OPENROUTER_API_KEY is not set")
return OpenAI(api_key=key, base_url="https://openrouter.ai/api/v1")
def _prompt_llm(prompt: str, max_tokens: int = 1024) -> str:
try:
r = _client().chat.completions.create(
model=DEFAULT_MODEL,
max_tokens=max_tokens,
messages=[{"role": "user", "content": prompt}],
)
return r.choices[0].message.content or ""
except Exception as e:
# Never let one bad LLM response kill the run.
return f"<LLM_ERROR: {type(e).__name__}: {e}>"
# ---------------------------------------------------------------------------
# 1. ORGANISM — what you are evolving.
# ---------------------------------------------------------------------------
class MyOrganism(Organism):
# TODO: replace with your artifact field. Common shapes:
# prompt_template: str
# regex_pattern: str
# sql_query: str
# code_block: str
artifact: str
def run(self, *inputs) -> str:
"""Exercise the organism on a test input. Return whatever your
evaluator wants to score."""
# TODO: implement. For prompt evolution this typically calls _prompt_llm
# with the artifact rendered against the input. For regex/SQL it would
# call `re.findall(self.artifact, input)` / execute SQL / etc.
raise NotImplementedError
# ---------------------------------------------------------------------------
# 2. EVALUATOR — score organisms and surface failures the mutator can learn from.
# ---------------------------------------------------------------------------
class MyFailureCase(EvaluationFailureCase):
# TODO: include enough context for the LLM to diagnose the failure.
input: str
expected: str
actual: str
class MyEvaluator(Evaluator[MyOrganism, EvaluationResult, MyFailureCase]):
# Split your dataset. Mutator only sees trainable; holdout detects overfitting.
TRAINABLE = [
# TODO: list of (input, expected) tuples
# ("input1", "expected1"),
]
HOLDOUT = [
# TODO: separate set the mutator never sees
]
def evaluate(self, organism: MyOrganism) -> EvaluationResult:
train_fails: list[MyFailureCase] = []
hold_fails: list[MyFailureCase] = []
for i, (inp, expected) in enumerate(self.TRAINABLE):
actual = organism.run(inp)
if actual != expected:
train_fails.append(MyFailureCase(
input=inp, expected=expected, actual=actual,
data_point_id=f"trainable_{i}",
))
for i, (inp, expected) in enumerate(self.HOLDOUT):
actual = organism.run(inp)
if actual != expected:
hold_fails.append(MyFailureCase(
input=inp, expected=expected, actual=actual,
data_point_id=f"holdout_{i}",
))
n_total = len(self.TRAINABLE) + len(self.HOLDOUT)
n_ok = n_total - len(train_fails) - len(hold_fails)
return EvaluationResult(
score=n_ok / n_total if n_total else 0.0,
trainable_failure_cases=train_fails,
holdout_failure_cases=hold_fails,
# Always-viable. The evolver only blocks completely-broken organisms;
# a 0-score organism is fine and will simply be sampled less often.
is_viable=True,
)
# ---------------------------------------------------------------------------
# 3. MUTATOR — LLM proposes an improved organism from a failure case.
# ---------------------------------------------------------------------------
class MyMutator(Mutator[MyOrganism, MyFailureCase]):
PROMPT = """
The current artifact is:
```
{artifact}
```
On this input:
```
{input}
```
it produced:
```
{actual}
```
but we wanted:
```
{expected}
```
Diagnose what went wrong, then propose an improved version of the artifact.
Put the new version in the LAST triple-backtick block of your response.
""".strip()
def mutate(
self,
organism: MyOrganism,
failure_cases: list[MyFailureCase],
learning_log_entries: list[LearningLogEntry],
) -> list[MyOrganism]:
        if not failure_cases:
            # Nothing to learn from; let the loop treat this as "no mutation".
            return []
        fc = failure_cases[0]
prompt = self.PROMPT.format(
artifact=organism.artifact,
input=fc.input,
actual=fc.actual,
expected=fc.expected,
)
resp = _prompt_llm(prompt)
parts = resp.split("```")
if len(parts) < 3:
return []
        # Everything on the opening-fence line is the info string (e.g. "python"
        # or "sql"); the artifact itself starts after the first newline. Splitting
        # before stripping avoids dropping a short first line of real content.
        info, newline, body = parts[-2].partition("\n")
        new_artifact = (body if newline else info).strip()
return [MyOrganism(artifact=new_artifact)]
# ---------------------------------------------------------------------------
# Driver — fills in the EvolveProblemLoop boilerplate. You shouldn't need to
# touch anything below this line for a typical run.
# ---------------------------------------------------------------------------
def make_problem() -> Problem:
initial = MyOrganism(artifact="TODO: starting artifact here") # TODO
return Problem[MyOrganism, EvaluationResult, MyFailureCase](
evaluator=MyEvaluator(),
mutators=[MyMutator()],
initial_organism=initial,
)
def main() -> int:
ap = argparse.ArgumentParser()
register_hyperparameter_args(ap.add_argument_group("hyperparameters"))
ap.add_argument("--num_iterations", type=int, default=3)
ap.add_argument("--mutator_concurrency", type=int, default=2)
ap.add_argument("--evaluator_concurrency", type=int, default=2)
ap.add_argument("--output_dir", type=str, required=True)
args = ap.parse_args()
out = Path(args.output_dir)
out.mkdir(parents=True, exist_ok=True)
(out / "snapshots").mkdir(exist_ok=True)
hp = build_hyperparameter_config_from_args(args)
loop = EvolveProblemLoop(
problem=make_problem(),
learning_log_view_type=parse_learning_log_view_type(hp.learning_log_view_type),
num_parents_per_iteration=hp.num_parents_per_iteration,
mutator_concurrency=args.mutator_concurrency,
evaluator_concurrency=args.evaluator_concurrency,
fixed_midpoint_score=hp.fixed_midpoint_score,
midpoint_score_percentile=hp.midpoint_score_percentile,
sharpness=hp.sharpness,
novelty_weight=hp.novelty_weight,
batch_size=hp.batch_size,
should_verify_mutations=hp.verify_mutations,
)
print("Evaluating initial organism...")
for snap in loop.run(num_iterations=args.num_iterations):
(out / "snapshots" / f"iteration_{snap.iteration}.pkl").write_bytes(snap.snapshot)
_, best = snap.best_organism_result
print(f"iter={snap.iteration} pop={snap.population_size} best_score={best.score:.3f}")
print(f"\nDone. Results in: {out}")
return 0
if __name__ == "__main__":
sys.exit(main())


@ -0,0 +1,277 @@
---
name: osint-investigation
description: Public-records OSINT investigation framework — SEC EDGAR filings, USAspending contracts, Senate lobbying, OFAC sanctions, ICIJ offshore leaks, NYC property records (ACRIS), OpenCorporates registries, CourtListener court records, Wayback Machine archives, Wikipedia + Wikidata, GDELT news monitoring. Entity resolution across sources, cross-link analysis, timing correlation, evidence chains. Python stdlib only.
version: 0.1.0
platforms: [linux, macos, windows]
author: Hermes Agent (adapted from ShinMegamiBoson/OpenPlanter, MIT)
metadata:
hermes:
tags: [osint, investigation, public-records, sec, sanctions, corporate-registry, property, courts, due-diligence, journalism]
category: research
related_skills: [domain-intel, arxiv]
---
# OSINT Investigation — Public Records Cross-Reference
Investigative framework for public-records OSINT: government contracts,
corporate filings, lobbying, sanctions, offshore leaks, property records,
court records, web archives, knowledge bases, and global news. Resolve
entities across heterogeneous sources, build cross-links with explicit
confidence, run statistical timing tests, and produce structured evidence
chains.
**Python stdlib only.** Zero install. Works on Linux, macOS, Windows. Most
sources work with no API key (OpenCorporates has an optional free token
that raises rate limits).
Adapted from the MIT-licensed ShinMegamiBoson/OpenPlanter project; expanded
to cover identity / property / litigation / archives / news sources that
the original didn't address.
## When to use this skill
Use when the user asks for:
- "follow the money" — government contracts, lobbying → legislation, sanctions
- corporate due diligence — who controls company X, where are they
incorporated, who serves on their boards, what filings have they made
- sanctions screening — is entity X on OFAC SDN, ICIJ offshore leaks
- pay-to-play investigation — contractors with offshore ties, lobbying
clients winning awards
- property ownership — find recorded deeds/mortgages by name or address
(NYC; for other counties point users at the relevant recorder)
- litigation history — find federal + state court opinions and PACER dockets
- multi-source entity resolution where naming varies (LLC suffixes, abbreviations)
- evidence-chain construction with explicit confidence levels
- "what's been said about X" — international news (GDELT) + Wikipedia
narrative + Wayback Machine to recover dead URLs
Do NOT use this skill for:
- general web research → `web_search` / `web_extract`
- domain/infrastructure OSINT → `domain-intel` skill
- academic literature → `arxiv` skill
- social-media profile discovery → `sherlock` skill (optional)
- US **federal** campaign finance — FEC is intentionally NOT covered here
(the API is unreliable for ad-hoc contributor-name queries on the free
DEMO_KEY tier). For federal donations, point users at
https://www.fec.gov/data/ directly.
## Workflow
The agent runs scripts via the `terminal` tool. `SKILL_DIR` is the directory
holding this SKILL.md.
### 1. Identify which sources apply
Read the data-source wiki entries to plan the investigation:
```
ls SKILL_DIR/references/sources/
# Federal financial / regulatory
cat SKILL_DIR/references/sources/sec-edgar.md # corporate filings
cat SKILL_DIR/references/sources/usaspending.md # federal contracts
cat SKILL_DIR/references/sources/senate-ld.md # lobbying
cat SKILL_DIR/references/sources/ofac-sdn.md # sanctions
cat SKILL_DIR/references/sources/icij-offshore.md # offshore leaks
# Identity / property / litigation / archives / news
cat SKILL_DIR/references/sources/nyc-acris.md # NYC property records
cat SKILL_DIR/references/sources/opencorporates.md # global corporate registry
cat SKILL_DIR/references/sources/courtlistener.md # court records (federal + state)
cat SKILL_DIR/references/sources/wayback.md # Wayback Machine archives
cat SKILL_DIR/references/sources/wikipedia.md # Wikipedia + Wikidata
cat SKILL_DIR/references/sources/gdelt.md # global news monitoring
```
Each entry follows a 9-section template: summary, access methods, data schema,
coverage, cross-reference potential, data quality, acquisition script,
legal & licensing, references.
The **cross-reference potential** section maps join keys between sources — read
those first to pick the right pair.
### 2. Acquire data
Each source has a stdlib-only fetch script in `SKILL_DIR/scripts/`:
**Federal financial / regulatory**
```bash
# SEC EDGAR filings (corporate disclosures)
python3 SKILL_DIR/scripts/fetch_sec_edgar.py --cik 0000320193 \
--types 10-K,10-Q --out data/edgar_filings.csv
# USAspending federal contracts
python3 SKILL_DIR/scripts/fetch_usaspending.py --recipient "EXAMPLE CORP" \
--fy 2024 --out data/contracts.csv
# Senate LD-1 / LD-2 lobbying disclosures
python3 SKILL_DIR/scripts/fetch_senate_ld.py --client "EXAMPLE CORP" \
--year 2024 --out data/lobbying.csv
# OFAC SDN sanctions list (full snapshot)
python3 SKILL_DIR/scripts/fetch_ofac_sdn.py --out data/ofac_sdn.csv
# ICIJ Offshore Leaks — downloads ~70 MB bulk CSV on first use,
# then searches it locally. Cached for 30 days under
# $HERMES_OSINT_CACHE/icij/ (default: ~/.cache/hermes-osint/icij/).
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
--out data/icij.csv
```
**Identity / property / litigation / archives / news**
```bash
# NYC property records (deeds, mortgages, liens) — ACRIS via Socrata
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --name "SMITH, JOHN" \
--out data/acris.csv
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --address "571 HUDSON" \
--out data/acris_addr.csv
# OpenCorporates — 130+ jurisdiction corporate registry
# (free token required; set OPENCORPORATES_API_TOKEN or pass --token)
python3 SKILL_DIR/scripts/fetch_opencorporates.py --query "Example Corp" \
--jurisdiction us_ny --out data/opencorporates.csv
# CourtListener — federal + state court opinions, PACER dockets
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Smith v. Example Corp" \
--type opinions --out data/courts.csv
# Wayback Machine — historical web captures
python3 SKILL_DIR/scripts/fetch_wayback.py --url "example.com" \
--match host --collapse digest --out data/wayback.csv
# Wikipedia + Wikidata — narrative bio + structured facts
# Set HERMES_OSINT_UA=your-app/1.0 (your@email) to identify yourself
python3 SKILL_DIR/scripts/fetch_wikipedia.py --query "Bill Gates" \
--out data/wp.csv
# GDELT — global news in 100+ languages, ~2015→present
python3 SKILL_DIR/scripts/fetch_gdelt.py --query '"Example Corp"' \
--timespan 1y --out data/gdelt.csv
```
All outputs are normalized CSV with a header row. Re-run scripts idempotently.
When a private individual won't be in a source (e.g. SEC EDGAR for a non-public-
company person, USAspending for someone who isn't a federal contractor, Senate
LDA for someone who isn't a lobbying client), the script returns 0 rows with a
clear warning rather than silently writing an empty CSV. EDGAR specifically
flags when the company-name resolver matched an individual Form 3/4/5 filer
rather than a corporate registrant.
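The output contract described above (a header row, and a warning instead of a silent empty file) can be sketched with the stdlib `csv` module. The column names in `FIELDS` and the helper `write_rows` are hypothetical and only illustrate the pattern; the shipped fetch scripts may structure this differently.

```python
import csv
import io
import sys

FIELDS = ["name", "record_date", "source_url"]  # hypothetical normalized columns


def write_rows(rows, out_file):
    """Write rows with a header; warn and write nothing when there are no rows."""
    if not rows:
        print("warning: 0 rows matched; no CSV written", file=sys.stderr)
        return 0
    writer = csv.DictWriter(out_file, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        # Fill missing columns with "" so every record has the same shape.
        writer.writerow({key: row.get(key, "") for key in FIELDS})
    return len(rows)


buffer = io.StringIO()
written = write_rows([{"name": "EXAMPLE CORP", "record_date": "2024-01-15"}], buffer)
skipped = write_rows([], io.StringIO())
```

Because the whole result set is rewritten on every call, re-running the same query overwrites the file with equivalent content rather than appending duplicates.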
Rate-limit notes are in each source's wiki entry. Default fetchers sleep
politely between paginated requests. **API keys raise rate limits** for
sources that support them (`SEC_USER_AGENT`, `SENATE_LDA_TOKEN`,
`OPENCORPORATES_API_TOKEN`, `COURTLISTENER_TOKEN`). All scripts surface
429 responses immediately with the upstream's quota message so the user
knows to slow down or supply a key.
### 3. Resolve entities across sources
Normalize names and find matches between two CSV files:
```bash
# Match lobbying clients (Senate LDA) against contract recipients (USAspending)
python3 SKILL_DIR/scripts/entity_resolution.py \
--left data/lobbying.csv --left-name-col client_name \
--right data/contracts.csv --right-name-col recipient_name \
--out data/cross_links.csv
```
Three matching tiers with explicit confidence:
| Tier | Method | Confidence |
|------|--------|------------|
| `exact` | Normalized strings equal after suffix/punctuation strip | high |
| `fuzzy` | Sorted-token equality (word-bag match) | medium |
| `token_overlap` | ≥60% token overlap, ≥2 shared tokens, tokens ≥4 chars | low |
Output `cross_links.csv` columns: `match_type, confidence, left_name,
right_name, left_normalized, right_normalized, left_row, right_row`.
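A minimal sketch of the three tiers, under stated assumptions: the suffix list is a hypothetical sample, and the 60% overlap is measured against the smaller of the two token sets, which the table does not specify. `normalize` and `match_tier` are illustrative names, not the script's actual functions.

```python
import re

# Hypothetical suffix list; the real script's list may differ.
SUFFIXES = {"llc", "inc", "corp", "corporation", "co", "ltd", "lp", "llp", "company"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, and drop corporate suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def match_tier(left, right):
    """Return (match_type, confidence) or None when no tier applies."""
    a, b = normalize(left), normalize(right)
    if not a or not b:
        return None
    if a == b:
        return ("exact", "high")
    if sorted(a.split()) == sorted(b.split()):
        return ("fuzzy", "medium")  # same words, different order
    long_a = {t for t in a.split() if len(t) >= 4}
    long_b = {t for t in b.split() if len(t) >= 4}
    shared = long_a & long_b
    smaller = min(len(long_a), len(long_b))
    # Assumption: overlap ratio is computed against the smaller token set.
    if smaller and len(shared) >= 2 and len(shared) / smaller >= 0.6:
        return ("token_overlap", "low")
    return None
```

Whatever the tier, the result is a candidate link: the `confidence` value is meant to travel with the row into later steps rather than be dropped once a match is found.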
### 4. Statistical timing correlation (optional)
Test whether two time series cluster suspiciously close together — e.g.
lobbying filings near contract awards — using a permutation test:
```bash
python3 SKILL_DIR/scripts/timing_analysis.py \
--donations data/lobbying.csv --donation-date-col filing_date \
--donation-amount-col income --donation-donor-col client_name \
--donation-recipient-col registrant_name \
--contracts data/contracts.csv --contract-date-col award_date \
--contract-vendor-col recipient_name \
--cross-links data/cross_links.csv \
--permutations 1000 \
--out data/timing.json
```
The script's column flags are intentionally generic — the original tool was
written for donations vs awards, but it works for any (event, payee) time
series joined through cross-links. Null hypothesis: event timing is
independent of award dates. One-tailed p-value = fraction of permutations
with mean nearest-award distance ≤ observed. Minimum 3 events per (payer,
vendor) pair to run the test.
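The test described above can be sketched as follows. This is an illustration, not the code in `timing_analysis.py`: it assumes the null distribution is generated by redrawing event dates uniformly over the observed date window, and the function names are hypothetical.

```python
import random
from datetime import date, timedelta


def mean_nearest_distance(events, awards):
    """Mean number of days from each event to its nearest award."""
    return sum(min(abs((e - a).days) for a in awards) for e in events) / len(events)


def permutation_p_value(events, awards, permutations=1000, seed=0):
    """One-tailed p-value: share of random timings at least as tight as observed."""
    if len(events) < 3 or not awards:
        return None  # too few events to say anything
    rng = random.Random(seed)  # seeded for reproducibility
    start = min(events + awards)
    span = (max(events + awards) - start).days
    observed = mean_nearest_distance(events, awards)
    hits = 0
    for _ in range(permutations):
        # Assumed null model: event dates are uniform over the observed window.
        redrawn = [start + timedelta(days=rng.randint(0, span)) for _ in events]
        if mean_nearest_distance(redrawn, awards) <= observed:
            hits += 1
    return hits / permutations


awards = [date(2024, 1, 15), date(2024, 7, 15)]
events = [date(2024, 1, 14), date(2024, 1, 16), date(2024, 7, 14)]
p_value = permutation_p_value(events, awards)
```

A small p-value here only says the observed clustering would be rare if timing were random under the assumed null model; the choice of null model changes the number, so it should be reported alongside the result.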
### 5. Build the findings JSON (evidence chain)
```bash
python3 SKILL_DIR/scripts/build_findings.py \
--cross-links data/cross_links.csv \
--timing data/timing.json \
--out data/findings.json
```
Every finding has `id, title, severity, confidence, summary, evidence[], sources[]`.
Each evidence item points back to a specific row in a source CSV. The user (or a
follow-up agent) can verify every claim against its source.
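For orientation, one finding might be assembled from a `cross_links.csv` row roughly like this. The top-level keys come from the list above; the keys inside each evidence item (`file`, `row`, `value`), the `severity` placeholder, and the helper name `make_finding` are assumptions, since the exact schema produced by `build_findings.py` is not shown here.

```python
import json


def make_finding(finding_id, link, left_file, right_file):
    """Build one finding dict from a cross_links.csv row given as a dict."""
    return {
        "id": finding_id,
        "title": f"{link['left_name']} may match {link['right_name']}",
        "severity": "info",  # placeholder; real severity rules are not documented here
        "confidence": link["confidence"],
        "summary": f"{link['match_type']} name match; a lead, not a conclusion.",
        # Each evidence item points at one row of one source CSV (hypothetical keys).
        "evidence": [
            {"file": left_file, "row": int(link["left_row"]), "value": link["left_name"]},
            {"file": right_file, "row": int(link["right_row"]), "value": link["right_name"]},
        ],
        "sources": [left_file, right_file],
    }


link = {
    "match_type": "fuzzy", "confidence": "medium",
    "left_name": "ACME LLC", "right_name": "Acme Holdings Group",
    "left_row": "12", "right_row": "40",
}
finding = make_finding("F-001", link, "data/lobbying.csv", "data/contracts.csv")
encoded = json.dumps(finding, indent=2)
```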
## Confidence and evidence discipline
This is the load-bearing rule of the skill. Tell the user:
- Every claim must trace to a record. No naked assertions.
- Confidence tier travels with the claim. `match_type=fuzzy` is "probable",
not "confirmed."
- Entity resolution produces candidates, NOT conclusions. A `fuzzy` match
between "ACME LLC" and "Acme Holdings Group" is a lead, not a fact.
- Statistical significance ≠ wrongdoing. p < 0.05 means the timing pattern
is unlikely under the null. It does not establish corruption.
- All data sources here are public records. They may still contain
inaccuracies, stale info, or redactions (GDPR, sealed records).
## Adding a new data source
Use the template:
```bash
cp SKILL_DIR/templates/source-template.md \
SKILL_DIR/references/sources/<your-source>.md
```
Fill in all 9 sections. Write a `fetch_<source>.py` script in `scripts/` that
uses stdlib only and writes a normalized CSV. Update the source list in the
"When to use" section above.
## Tools and their limits
- `entity_resolution.py` does NOT use external fuzzy libraries (no rapidfuzz,
no jellyfish). Token-bag matching is the upper bound here. If you need
Levenshtein, transliteration, or phonetic matching, pip-install separately.
- `timing_analysis.py` uses Python's `random` for permutations. For
reproducibility, pass `--seed N`.
- `fetch_*.py` scripts use `urllib.request` and respect `Retry-After`. Heavy
bulk usage may still violate ToS — read each source's legal section first.
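As an illustration of what respecting `Retry-After` can look like with `urllib.request`, here is a minimal sketch. The helper names, the default delay, the retry count, and the User-Agent string are all assumptions; the shipped scripts may instead surface the 429 immediately, as described in the workflow section.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(retry_after, default=5.0):
    """Translate a Retry-After header value (seconds or HTTP date) into seconds."""
    if not retry_after:
        return default
    text = retry_after.strip()
    if text.isdigit():
        return float(text)
    try:
        when = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the default pause
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def fetch(url, user_agent="hermes-osint-sketch/0.1", max_attempts=3):
    """GET a URL, pausing as instructed when the server answers 429."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts - 1:
                raise  # surface other errors, and the final 429, to the caller
            time.sleep(retry_delay(err.headers.get("Retry-After")))
    raise RuntimeError("max_attempts must be at least 1")
```

Keeping the header parsing in its own function makes the delay logic testable without any network access.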
## Legal note
All Phase-1 sources are public records. Bulk acquisition is permitted under
their respective access terms (FOIA, public records law, ICIJ explicit
publication, OFAC public data). However:
- Some sources rate-limit aggressively. Respect their headers.
- Some redact registrant info (GDPR on WHOIS, sealed filings).
- Cross-referencing public records to identify private individuals can have
ethical implications. The skill produces evidence chains, not accusations.


@ -0,0 +1,98 @@
# CourtListener — Free Law Project
## 1. Summary
CourtListener (Free Law Project) aggregates court opinions, dockets, oral
arguments, and judge data. Covers ~10M federal and state court opinions
back to colonial America, plus PACER docket data from RECAP submissions.
## 2. Access Methods
- **REST API v4:** `https://www.courtlistener.com/api/rest/v4/`
- **Auth:** Anonymous reads allowed on most endpoints; token raises rate
limits and unlocks bulk export
- **Rate limit:** ~5,000 req/hour unauthenticated for search; higher with token
Set `COURTLISTENER_TOKEN` env var. Get a free token at
https://www.courtlistener.com/sign-in/ then create an API key.
## 3. Data Schema
Key fields emitted by `fetch_courtlistener.py`:
| Column | Type | Description |
|--------|------|-------------|
| `case_name` | str | Case name |
| `court` | str | Court name |
| `court_id` | str | Court ID (e.g. `nysd`, `scotus`, `ca9`) |
| `date_filed` | str | YYYY-MM-DD |
| `docket_number` | str | Court docket number |
| `judge` | str | Judge name(s) |
| `citation` | str | Reporter citation(s) |
| `result_type` | str | opinions / dockets / oral / people |
| `snippet` | str | Search-match snippet (up to 500 chars) |
| `absolute_url` | str | Direct CourtListener URL |
## 4. Coverage
- Federal: all circuit and district courts, SCOTUS
- State: all 50 state supreme/appellate courts, many trial courts
- Opinions: ~10M back to 1600s (colonial), full coverage 1950 → present
- Dockets via RECAP: ~3M+ from user-submitted PACER PDFs
- Updated continuously
## 5. Cross-Reference Potential
- **OpenCorporates** via `case_name` (corporate litigation)
- **SEC EDGAR** via `case_name` (securities class actions)
- **OFAC SDN** via `case_name` (sanctions-related civil/criminal cases)
Join key: party name from `case_name`. Note: `case_name` often abbreviates
("Smith v. Jones" rather than full party names) — use the full case URL
to get all parties.
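Extracting a join key from `case_name` can be sketched as a simple caption split. The helper `split_parties` is hypothetical and deliberately naive: it only handles "A v. B" style captions and returns the abbreviated names, so it does not replace fetching the full party list from the case URL.

```python
import re


def split_parties(case_name):
    """Split an "A v. B" caption into two party strings; return [] otherwise."""
    parts = re.split(r"\s+vs?\.?\s+", case_name, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) != 2:
        return []  # e.g. "In re Example Corp" has no opposing-party caption
    return [part.strip() for part in parts]


parties = split_parties("Smith v. Example Corp")
```

Each extracted party name can then be normalized and matched against another source's name column, with the resulting match treated as a lead because captions are often abbreviated.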
## 6. Data Quality
- Older opinions (pre-1990) often lack docket numbers and judges
- State coverage is more uneven than federal
- PACER docket coverage depends on RECAP user submissions — not exhaustive
- Sealed documents are excluded
- Party names in case captions don't always match filing names exactly
## 7. Acquisition Script
Path: `scripts/fetch_courtlistener.py`
```bash
# Search opinions for a party / keyword
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Example Corp" \
--out data/cl.csv
# PACER dockets (best for recent litigation)
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Example Corp" \
--type dockets --out data/cl_dockets.csv
# Restrict to a court
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Microsoft" \
--court ca9 --out data/cl_9th.csv
# Date range
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Example Corp" \
--date-from 2020-01-01 --date-to 2024-12-31 --out data/cl.csv
```
Pass `--token` or set `COURTLISTENER_TOKEN`.
## 8. Legal & Licensing
- Court opinions are public domain
- Free Law Project provides the data under CC0 / public domain dedication
- No commercial use restrictions on opinion text or metadata
- Some PACER PDFs have copyright on layout (not text) — fair use applies
## 9. References
- API docs: https://www.courtlistener.com/help/api/rest/
- Court IDs: https://www.courtlistener.com/api/jurisdictions/
- RECAP archive: https://www.courtlistener.com/recap/
- Bulk data: https://www.courtlistener.com/help/api/bulk-data/

@ -0,0 +1,104 @@
# GDELT — Global News Monitoring
## 1. Summary
GDELT (Global Database of Events, Language, and Tone) monitors world news
in 100+ languages with full-text indexing. Updated every 15 minutes.
~2015 → present, ~1B+ articles indexed. Free anonymous access.
GDELT is wider than Google News (more international, more long-tail
sources) and indexed by tone/sentiment, themes (CAMEO codes), people, and
organizations.
## 2. Access Methods
- **DOC 2.0 API:** `https://api.gdeltproject.org/api/v2/doc/doc`
- **Events / GKG 2.0:** `https://api.gdeltproject.org/api/v2/events/events`
- **Auth:** None
- **Rate limit:** **1 request per 5 seconds** for the DOC API — strict
The fetch script automatically retries after a 6-second sleep when a
429 is received.
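The retry behaviour described above might look roughly like the following. This is a sketch under stated assumptions, not the code in `fetch_gdelt.py`: `fetch_with_retry`, `do_request`, and `max_attempts` are hypothetical names, and the request itself is injected as a callable so the example stays independent of any HTTP library.

```python
import time
from typing import Callable, Tuple


def fetch_with_retry(
    do_request: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,          # assumed cap; the real script may differ
    retry_sleep: float = 6.0,       # 6-second pause after a 429, as described above
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call do_request() until it stops returning HTTP 429 or attempts run out.

    do_request must return a (status_code, body) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = do_request()
        if status == 429 and attempt < max_attempts:
            sleep(retry_sleep)      # back off, then try again
            continue
        if status != 200:
            raise RuntimeError(f"GDELT request failed with HTTP {status}")
        return body
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real delays.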
## 3. Data Schema
Key fields emitted by `fetch_gdelt.py`:
| Column | Type | Description |
|--------|------|-------------|
| `title` | str | Article title |
| `url` | str | Article URL |
| `seen_date` | str | When GDELT first saw the article (UTC) |
| `domain` | str | Publisher domain |
| `language` | str | Source language |
| `source_country` | str | 2-letter country code |
| `tone` | str | GDELT-computed tone score (negative = negative coverage) |
| `social_image` | str | Open Graph image URL when available |
## 4. Coverage
- Worldwide news in 100+ languages
- ~2015 → present (Events back to 1979 via a separate stream)
- Update frequency: 15 minutes
- Bias: heavily Anglophone in volume but very wide source list overall
## 5. Cross-Reference Potential
- **All sources** ↔ `title` / `url` (news context for any subject)
- **Wikipedia** ↔ event timeline for notable entities
- **Wayback Machine** ↔ recover articles whose URLs have died
- **OFAC SDN** ↔ news context for sanctions designations
- **SEC EDGAR** ↔ news context for 8-K material events
Join key: entity name appearing in article title or full-text. GDELT also
extracts named entities into a separate stream (GKG) not exposed by this
fetcher — query GDELT directly for entity-level filtering.
## 6. Data Quality
- Title extraction is automated and can be wrong (sometimes captures the
site name + delimiter + article title; sometimes a generic page title)
- Sentiment / tone is computed by GDELT, not source-supplied
- Some domains are oversampled (newswires, aggregators)
- Source country is inferred from domain registration / TLD — can be
wrong for international news sites with country-neutral domains
- Article URLs can rot — pair with Wayback Machine to preserve content
## 7. Acquisition Script
Path: `scripts/fetch_gdelt.py`
```bash
# Recent news mentioning an entity
python3 SKILL_DIR/scripts/fetch_gdelt.py --query "Nous Research" \
--timespan 6m --out data/gdelt.csv
# Phrase-exact (use double quotes inside single quotes for the shell)
python3 SKILL_DIR/scripts/fetch_gdelt.py --query '"Dillon Rolnick"' \
--timespan 1y --out data/gdelt.csv
# Filter to a country / language
python3 SKILL_DIR/scripts/fetch_gdelt.py --query "Microsoft" \
--source-country US --source-lang English --out data/gdelt.csv
# Date range
python3 SKILL_DIR/scripts/fetch_gdelt.py --query "Microsoft" \
--start 2024-01-01 --end 2024-12-31 --out data/gdelt.csv
```
GDELT supports its own query operators: phrase quoting, AND/OR/NOT,
`sourcecountry:US`, `theme:ECON_BANKRUPTCY`, `tone<-5`, etc.
See https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/ for syntax.
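The operators listed above can be combined into a single query string. The helper below is a hypothetical sketch (`build_gdelt_query` is not part of `fetch_gdelt.py`), and it assumes that space-separated terms are combined with an implicit AND; check the linked syntax guide before relying on that.

```python
from typing import Optional


def build_gdelt_query(
    phrase: str,
    source_country: Optional[str] = None,
    theme: Optional[str] = None,
    max_tone: Optional[int] = None,
) -> str:
    """Compose a DOC API query from an exact phrase plus optional operators."""
    terms = [f'"{phrase}"']                      # phrase quoting
    if source_country:
        terms.append(f"sourcecountry:{source_country}")
    if theme:
        terms.append(f"theme:{theme}")
    if max_tone is not None:
        terms.append(f"tone<{max_tone}")         # e.g. tone<-5
    return " ".join(terms)


if __name__ == "__main__":
    print(build_gdelt_query("Example Corp", source_country="US", max_tone=-5))
```

The returned string would then be passed as the value of `--query`.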
## 8. Legal & Licensing
- GDELT data is provided free for academic and journalistic use
- Article URLs link out to original publishers — copyright remains with
the publisher
- GDELT is NOT a content archive; it's a metadata index
## 9. References
- DOC 2.0 API: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/
- Themes & query syntax: https://blog.gdeltproject.org/gkg-2-0-our-global-knowledge-graph-2-0-amazing-data-at-your-fingertips/
- Project home: https://www.gdeltproject.org/

@ -0,0 +1,104 @@
# ICIJ Offshore Leaks Database
## 1. Summary
The International Consortium of Investigative Journalists (ICIJ) publishes a
combined database of offshore entities from the Panama Papers, Paradise Papers,
Pandora Papers, Bahamas Leaks, and Offshore Leaks. ~800,000+ offshore entities
with their officers, intermediaries, and addresses.
## 2. Access Methods
- **Bulk download (primary):** `https://offshoreleaks-data.icij.org/offshoreleaks/csv/full-oldb.LATEST.zip` (~70 MB ZIP, refreshed periodically)
- **Search UI (human):** `https://offshoreleaks.icij.org/`
- **Auth:** None
- **Note:** The previous Open Refine reconciliation endpoint at
`/reconcile` now returns 404. ICIJ has removed it. The bulk ZIP is the
remaining stable access path. The skill's `fetch_icij_offshore.py` caches
the ZIP locally (default `~/.cache/hermes-osint/icij/`, refreshes after
30 days) and searches it offline.
## 3. Data Schema
Key fields emitted by `fetch_icij_offshore.py`:
| Column | Type | Description |
|--------|------|-------------|
| `node_id` | int | ICIJ canonical node ID |
| `name` | str | Entity / officer / intermediary name |
| `node_type` | str | entity / officer / intermediary / address |
| `country_codes` | str | Semicolon-separated ISO codes |
| `countries` | str | Country names |
| `jurisdiction` | str | Offshore jurisdiction (BVI, Panama, etc.) |
| `incorporation_date` | str | YYYY-MM-DD |
| `inactivation_date` | str | YYYY-MM-DD (if struck) |
| `source` | str | Panama Papers / Paradise Papers / Pandora Papers / etc. |
| `entity_url` | str | Link to ICIJ page |
| `connections` | str | Semicolon-separated node IDs of related entities |
## 4. Coverage
- Worldwide offshore entity records
- Earliest records: 1970s (Bahamas Leaks). Most data 1990-2018.
- NOT updated in real-time — new leaks added when ICIJ publishes them
- ~810,000 offshore entities + ~750,000 officers + ~150,000 intermediaries
## 5. Cross-Reference Potential
- **SEC EDGAR** ↔ `name` (public companies with offshore arms)
- **USAspending** ↔ `name` (federal contractors with offshore structure)
- **OFAC SDN** ↔ `name` (sanctioned entities using offshore vehicles)
Join key: normalized entity/officer name. `node_id` is canonical for cross-
referencing within ICIJ. Connections graph traversal is in-script (BFS over
`connections`).
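To make the BFS over `connections` concrete, here is a minimal sketch. It is not the traversal code shipped in `fetch_icij_offshore.py`; `connected_nodes` and `max_depth` are hypothetical names, and the input is assumed to be rows shaped like the schema above (an integer `node_id` and a semicolon-separated `connections` string).

```python
from collections import deque


def connected_nodes(rows, start_id: int, max_depth: int = 2) -> dict:
    """Breadth-first walk over the `connections` column.

    Returns {node_id: hop distance from start_id} for every node
    reachable within max_depth hops.
    """
    # Build an adjacency map from the semicolon-separated ID lists.
    graph = {}
    for row in rows:
        raw = row.get("connections") or ""
        graph[int(row["node_id"])] = [int(x) for x in raw.split(";") if x.strip()]

    depth = {start_id: 0}
    queue = deque([start_id])
    while queue:
        node = queue.popleft()
        if depth[node] >= max_depth:
            continue                      # do not expand beyond the hop limit
        for nxt in graph.get(node, []):
            if nxt not in depth:          # first visit is the shortest path
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth
```

Because the connections graph is incomplete, an absent path here does not prove that two entities are unrelated.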
## 6. Data Quality
- Offshore entity names sometimes appear in multiple leaks with slight variations
- Officers may be nominees (front persons), not beneficial owners
- Some entries have minimal info (just a name + jurisdiction)
- The connections graph is incomplete — some relationships are documented in
source materials but not in the structured database
- Inactive/struck-off entities are still included with `inactivation_date`
## 7. Acquisition Script
Path: `scripts/fetch_icij_offshore.py`
```bash
# Search by entity name (case-insensitive substring across the bulk DB)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
--out data/icij.csv
# Search by officer (individual person)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --officer "SMITH JOHN" \
--out data/icij.csv
# Search by jurisdiction (filter on cached results)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --officer "SMITH" \
--jurisdiction "BRITISH VIRGIN ISLANDS" --out data/icij_bvi.csv
# Force a fresh download (default refresh window is 30 days)
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
--force-refresh --out data/icij.csv
```
First call downloads the ~70 MB ZIP under `~/.cache/hermes-osint/icij/`
(or `$HERMES_OSINT_CACHE/icij/`). Subsequent calls reuse the cache for 30 days.
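The cache rule above (reuse for 30 days unless a refresh is forced) can be sketched as follows. `cache_dir` and `needs_refresh` are hypothetical helpers written for illustration; the real script may resolve paths and ages differently.

```python
import os
import time
from pathlib import Path


def cache_dir(env=None) -> Path:
    """Resolve the ICIJ cache directory, honouring HERMES_OSINT_CACHE if set."""
    env = os.environ if env is None else env
    base = env.get("HERMES_OSINT_CACHE")
    root = Path(base) if base else Path.home() / ".cache" / "hermes-osint"
    return root / "icij"


def needs_refresh(zip_path: Path, max_age_days: int = 30,
                  now: float = None, force: bool = False) -> bool:
    """True when the ZIP is missing, older than max_age_days, or force is set."""
    if force or not zip_path.exists():
        return True
    now = time.time() if now is None else now
    age_seconds = now - zip_path.stat().st_mtime
    return age_seconds > max_age_days * 86400
```

The file's modification time is used as the download time here, which is an assumption; a sidecar timestamp file would be more robust if the ZIP is ever copied between machines.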
## 8. Legal & Licensing
- Public record as published by ICIJ under explicit publication
- No copyright on the underlying facts (entity names, jurisdictions)
- ICIJ asks for attribution if used in derivative reporting
- **Ethical note**: Presence in this database does NOT imply wrongdoing. Many
offshore structures are legal. The database is a research tool, not a list of
criminals.
## 9. References
- Database: https://offshoreleaks.icij.org/
- About the data: https://offshoreleaks.icij.org/pages/about
- Methodology: https://www.icij.org/investigations/panama-papers/
- API hints: the former Open Refine reconciliation endpoint at `https://offshoreleaks.icij.org/reconcile` now returns 404 (see Access Methods); use the bulk ZIP instead

@ -0,0 +1,90 @@
# NYC ACRIS — NYC Real Property Records
## 1. Summary
The Automated City Register Information System (ACRIS) is NYC's index of
recorded property documents: deeds, mortgages, satisfactions, liens, UCC
filings. Covers Manhattan, Bronx, Brooklyn, Queens, Staten Island.
Published as 4 linked Socrata datasets on the NYC Open Data portal.
## 2. Access Methods
- **Socrata API:** `https://data.cityofnewyork.us/resource/636b-3b5g.json` (Parties)
- **Other datasets:** `bnx9-e6tj` (Master), `8h5j-fqxa` (Legal), `uqqa-hym2` (References)
- **Auth:** None for read access (Socrata `$app_token` raises rate limits if needed)
- **Rate limit:** Generous (~1000 req/hour unauthenticated)
## 3. Data Schema
Key fields emitted by `fetch_nyc_acris.py` (Parties joined to Master):
| Column | Type | Description |
|--------|------|-------------|
| `document_id` | str | ACRIS document ID |
| `name` | str | Party name as recorded (often "LAST, FIRST" but varies) |
| `party_type` | str | 1=grantor, 2=grantee, 3=other |
| `party_role` | str | Human-readable role label |
| `address_1` | str | Property or party address line 1 |
| `city`, `state`, `zip`, `country` | str | Address parts |
| `doc_type` | str | DEED, MTGE (mortgage), SAT (satisfaction), AGMT, etc. |
| `doc_date`, `recorded_date` | str | YYYY-MM-DD |
| `borough` | str | Manhattan / Bronx / Brooklyn / Queens / Staten Island |
| `amount` | str | Document amount (USD, when applicable) |
| `filing_url` | str | Direct ACRIS DocumentImageView link |
## 4. Coverage
- NYC 5 boroughs only — other counties have their own recorders
- 1966 → present (older filings exist on microfilm at the County Clerk)
- Updated nightly
- ~70M+ party records cumulative
## 5. Cross-Reference Potential
- **SEC EDGAR** ↔ `name` (insider filers with NYC property)
- **USAspending** ↔ `name` (federal contractors with NYC property)
- **Senate LDA** ↔ `name` (lobbyists / clients with NYC property)
- **ICIJ Offshore** ↔ `name` (NYC properties owned via offshore vehicles)
Join key: normalized party name. NYC property records typically store names
as "LAST, FIRST" or full LLC names — use `entity_resolution.py`.
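For illustration only, the sketch below shows one way to reorder "LAST, FIRST" party names before joining. It is not the logic in `entity_resolution.py`, whose behaviour is not described here; `normalize_party_name` and the suffix list are assumptions.

```python
import re

# Assumed list of trailing company designators that should not be
# mistaken for a first name after a comma.
_COMPANY_SUFFIXES = {"LLC", "INC", "CORP", "LP", "LLP", "TRUST"}


def normalize_party_name(name: str) -> str:
    """Upper-case, collapse whitespace, and turn 'LAST, FIRST' into 'FIRST LAST'."""
    cleaned = re.sub(r"\s+", " ", name.strip().upper())
    if cleaned.count(",") == 1:
        last, first = (part.strip() for part in cleaned.split(","))
        is_suffix = first.replace(".", "") in _COMPANY_SUFFIXES
        if first and last and not is_suffix:
            return f"{first} {last}"
    return cleaned   # LLC names and other formats pass through unchanged
```

Names with more than one comma are left untouched, since their structure cannot be guessed safely.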
## 6. Data Quality
- Same person appears with multiple name formats over time
- LLC and trust ownership obscures beneficial owners
- Recording lag can be 2-4 weeks after closing
- Older documents have spottier address data
- Sealed records (e.g. domestic violence shelters) are excluded by law
## 7. Acquisition Script
Path: `scripts/fetch_nyc_acris.py`
```bash
# By party name
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --name "ROLNICK" --out data/acris.csv
# By address (useful when you know the property but not the names)
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --address "571 HUDSON" --out data/acris.csv
# Restrict to grantees (buyers / mortgagees)
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --name "ROLNICK" --party-type 2 \
--out data/acris_buyers.csv
```
The script joins Parties → Master to populate doc_type, dates, borough, and
amount. Pass `--no-enrich` to skip the join (faster, fewer columns).
## 8. Legal & Licensing
- Public record under NYS Real Property Law and NYC Charter
- No commercial use restrictions on the data
- All ACRIS data is public information by statute
## 9. References
- ACRIS portal: https://a836-acris.nyc.gov/CP/
- NYC Open Data: https://data.cityofnewyork.us/
- Parties dataset: https://data.cityofnewyork.us/City-Government/ACRIS-Real-Property-Parties/636b-3b5g
- Document type codes: https://www1.nyc.gov/site/finance/taxes/acris.page

@ -0,0 +1,92 @@
# OFAC SDN — Specially Designated Nationals List
## 1. Summary
The Office of Foreign Assets Control (OFAC) publishes the Specially Designated
Nationals and Blocked Persons List (SDN). US persons are generally prohibited
from dealing with individuals and entities on this list. Also published:
non-SDN consolidated lists (BIS Denied Persons, FSE, etc.).
## 2. Access Methods
- **Full XML:** `https://www.treasury.gov/ofac/downloads/sdn.xml`
- **Delimited:** `https://www.treasury.gov/ofac/downloads/sdn.csv`
- **Consolidated:** `https://www.treasury.gov/ofac/downloads/consolidated/consolidated.xml`
- **Auth:** None
- **Rate limit:** None (static file downloads). Updated continuously.
## 3. Data Schema
Key fields emitted by `fetch_ofac_sdn.py`:
| Column | Type | Description |
|--------|------|-------------|
| `entity_id` | int | OFAC unique ID |
| `name` | str | Primary name |
| `entity_type` | str | individual / entity / vessel / aircraft |
| `program_list` | str | Semicolon-separated sanctions programs (e.g. SDGT;IRAN) |
| `title` | str | For individuals: title/role |
| `nationalities` | str | Semicolon-separated country codes |
| `aka_list` | str | Semicolon-separated "also known as" names |
| `addresses` | str | Semicolon-separated known addresses |
| `dob` | str | Date of birth (individuals) |
| `pob` | str | Place of birth (individuals) |
| `remarks` | str | OFAC's free-text remarks |
| `last_updated` | str | YYYY-MM-DD (publication date) |
## 4. Coverage
- Worldwide — all entities sanctioned by US Treasury
- ~10,000 entries on SDN, ~15,000 on consolidated lists
- Updated continuously (sometimes daily during active enforcement)
- Includes AKAs (very common, can be 10+ per entity)
## 5. Cross-Reference Potential
- **SEC EDGAR** ↔ `name` (public companies sanctioned)
- **USAspending** ↔ `name` (sanctioned entity as federal contractor; should
be impossible, but verify)
- **ICIJ Offshore** ↔ `name` (offshore entities also sanctioned)
Join key: normalized name. **CRITICAL**: must match against `aka_list` too.
Many sanctioned entities are caught only via aliases.
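A minimal sketch of alias-aware matching, assuming rows shaped like the schema above. `matches_sdn` is a hypothetical helper that only does exact comparison after light normalization; real screening typically needs fuzzy matching to cope with transliteration variants.

```python
def _norm(value: str) -> str:
    """Upper-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(value.upper().split())


def matches_sdn(candidate: str, row: dict) -> bool:
    """True if candidate equals the primary name or any entry in aka_list."""
    names = [row.get("name") or ""]
    names += (row.get("aka_list") or "").split(";")   # aliases are semicolon-separated
    target = _norm(candidate)
    return any(_norm(n) == target for n in names if n.strip())


if __name__ == "__main__":
    row = {"name": "EXAMPLE TRADING CO", "aka_list": "EX TRADING; Example Holdings"}
    print(matches_sdn("example holdings", row))
```

Checking only `name` would miss the alias hit in this example, which is the failure mode the note above warns about.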
## 6. Data Quality
- Names are transliterated from many scripts — multiple romanizations possible
- AKAs often differ wildly from primary name
- Some entries have minimal info (no DOB, no address) for individuals
- Free-text `remarks` contain critical context — read them
- "Specially Designated Global Terrorists" (SDGT) and "Cyber-related" (CYBER2)
programs add and remove entries frequently
## 7. Acquisition Script
Path: `scripts/fetch_ofac_sdn.py`
```bash
# Full snapshot
python3 SKILL_DIR/scripts/fetch_ofac_sdn.py --out data/ofac_sdn.csv
# Filter to specific program
python3 SKILL_DIR/scripts/fetch_ofac_sdn.py --program SDGT --out data/sdn_sdgt.csv
# Entities only (skip individuals, vessels, aircraft)
python3 SKILL_DIR/scripts/fetch_ofac_sdn.py --entity-type entity --out data/sdn_entities.csv
```
## 8. Legal & Licensing
- Public record under Executive Order authority and statutory sanctions programs
- US persons MUST screen against this list — it is enforced
- No restrictions on the data itself; restrictions are on transactions with
the listed entities
- ZERO penalty for "over-matching" — false positives must be cleared but are not
prohibited
## 9. References
- OFAC home: https://ofac.treasury.gov/
- SDN list: https://ofac.treasury.gov/specially-designated-nationals-and-blocked-persons-list-sdn-human-readable-lists
- Data formats: https://ofac.treasury.gov/sdn-list/sanctions-list-search-tool
- Compliance guidance: https://ofac.treasury.gov/recent-actions

@ -0,0 +1,103 @@
# OpenCorporates — Global Corporate Registry
## 1. Summary
OpenCorporates aggregates corporate registry data from 130+ jurisdictions
worldwide (~200M companies). Covers US state-level filings (NY DOS, Delaware
DOC, California SOS, etc.), UK Companies House, EU registries, and most
common-law jurisdictions.
## 2. Access Methods
- **REST API:** `https://api.opencorporates.com/v0.4/`
- **HTML fallback:** `https://opencorporates.com/companies?q=...`
- **Auth:** API token required (free tier 500 calls/month, paid plans available)
- **Rate limit:** Token-bound; un-tokened requests return 401
Set `OPENCORPORATES_API_TOKEN` env var. Get a free token at
https://opencorporates.com/api_accounts/new.
## 3. Data Schema
Key fields emitted by `fetch_opencorporates.py`:
| Column | Type | Description |
|--------|------|-------------|
| `name` | str | Company legal name |
| `company_number` | str | Registry-assigned number |
| `jurisdiction_code` | str | e.g. `us_ny`, `us_de`, `gb` |
| `jurisdiction_name` | str | Human-readable jurisdiction |
| `incorporation_date` | str | YYYY-MM-DD |
| `dissolution_date` | str | YYYY-MM-DD (empty if active) |
| `company_type` | str | Domestic LLC / Foreign Corp / etc. |
| `status` | str | Active / Inactive / Dissolved |
| `registered_address` | str | Registered office address |
| `opencorporates_url` | str | Link to OpenCorporates entity page |
| `officers_count` | str | Total officers on record |
| `source` | str | `api`, `html`, or `html-fallback` |
## 4. Coverage
- US: all 50 states + DC at state level (LLCs, corps, LPs)
- International: UK, EU, Canada, Australia, NZ, many APAC + LATAM jurisdictions
- ~200M company records cumulative
- Update frequency varies by jurisdiction (UK CH is near-realtime; some
state registries lag months)
## 5. Cross-Reference Potential
- **NYC ACRIS** ↔ `name` (LLC/corp owners of NYC property)
- **USAspending** ↔ `name` (corporate federal contractors)
- **SEC EDGAR** ↔ `name` (public companies + their subsidiaries)
- **ICIJ Offshore** ↔ `name` (international corporate structures)
Join key: normalized company name. Some entries have `previous_names` arrays
which are not currently exported by the fetch script — query OC directly
for that.
## 6. Data Quality
- Company-name spellings vary across re-incorporations and renames
- Officer records are spottier than company records (many jurisdictions
don't require officer disclosure)
- Beneficial-ownership data is generally NOT here — most jurisdictions
don't require it. UK Companies House has PSC (people with significant
control) but that's not universal.
- Cross-jurisdictional links (parent / subsidiary) are based on registry
filings only; corporate trees are often incomplete
## 7. Acquisition Script
Path: `scripts/fetch_opencorporates.py`
```bash
# Search globally by name
python3 SKILL_DIR/scripts/fetch_opencorporates.py --query "Example Corp" \
--out data/oc.csv
# Restrict to a jurisdiction
python3 SKILL_DIR/scripts/fetch_opencorporates.py --query "Example Corp" \
--jurisdiction us_ny --out data/oc_ny.csv
# Set token via env or flag
OPENCORPORATES_API_TOKEN=xxx python3 SKILL_DIR/scripts/fetch_opencorporates.py \
--query "Microsoft" --out data/oc.csv
```
Without a token the script falls back to scraping the HTML search page.
The fallback is brittle and only fills in `name`, `jurisdiction_code`,
`opencorporates_url` — set the token for serious work.
## 8. Legal & Licensing
- OpenCorporates aggregates public records — the underlying facts are
public domain
- OpenCorporates' own database is licensed CC-BY-SA-4.0; attribution required
- API ToS prohibits redistributing the full dataset; per-record reference
is fine
## 9. References
- API docs: https://api.opencorporates.com/documentation/API-Reference
- Jurisdiction codes: https://api.opencorporates.com/v0.4/jurisdictions.json
- Schema: https://opencorporates.com/info/our_data

@ -0,0 +1,83 @@
# SEC EDGAR — Corporate Filings
## 1. Summary
EDGAR (Electronic Data Gathering, Analysis, and Retrieval) is the SEC's system
for corporate disclosure filings: 10-K (annual), 10-Q (quarterly), 8-K (current
events), DEF 14A (proxy), Form 4 (insider trading), 13F (institutional holdings).
## 2. Access Methods
- **API:** `https://data.sec.gov/submissions/CIK<10-digit-padded>.json` (no auth)
- **Filing index:** `https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=...`
- **Full-text search:** `https://efts.sec.gov/LATEST/search-index?q=...`
- **Auth:** None — requires `User-Agent` header with contact info per SEC policy
- **Rate limit:** 10 requests/second per IP (enforced)
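The submissions URL takes the CIK zero-padded to 10 digits, and every request should carry a `User-Agent` with contact details. The sketch below shows both; `submissions_url` and `sec_headers` are hypothetical helper names, not functions from `fetch_sec_edgar.py`.

```python
def submissions_url(cik) -> str:
    """Build the data.sec.gov submissions URL for a CIK given as int or str."""
    digits = str(cik).strip()
    if not digits.isdigit() or len(digits.lstrip("0")) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    # int() drops any existing leading zeros; :010d pads back to 10 digits.
    return f"https://data.sec.gov/submissions/CIK{int(digits):010d}.json"


def sec_headers(contact: str) -> dict:
    """Headers carrying the contact string the SEC asks for."""
    return {"User-Agent": contact}


if __name__ == "__main__":
    print(submissions_url(320193))
    print(sec_headers("Research example@example.com"))
```

Accepting both `320193` and `"0000320193"` avoids a common source of 404s when CIKs are copied from different places.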
## 3. Data Schema
Key fields emitted by `fetch_sec_edgar.py` (filings index):
| Column | Type | Description |
|--------|------|-------------|
| `cik` | str | Central Index Key (10-digit padded) |
| `company_name` | str | Registrant name |
| `form_type` | str | 10-K, 10-Q, 8-K, etc. |
| `filing_date` | str | YYYY-MM-DD |
| `accession_number` | str | Filing accession (e.g. 0000320193-24-000123) |
| `primary_document` | str | Filename of main document |
| `filing_url` | str | Direct URL to filing index |
| `reporting_period` | str | Period of report (where applicable) |
## 4. Coverage
- All public US registrants from 1993 → present
- 1993-2000 has spotty coverage of older filings (paper-to-electronic migration)
- ~12M filings cumulative
- Updated within minutes of filing acceptance
## 5. Cross-Reference Potential
- **USAspending** ↔ `company_name` (public companies as federal contractors)
- **Senate LD** ↔ `company_name` (public companies hire lobbyists)
- **OFAC SDN** ↔ `company_name` (sanctions screening of public registrants)
Join key: company name OR CIK if you have it. CIK is canonical and stable.
## 6. Data Quality
- Subsidiaries often filed under parent CIK — be careful with name matches
- Name changes over time (rebrands, acquisitions) — CIK remains constant
- 10-K Item 1A Risk Factors are free-form text — useful for `web_extract`-style
parsing, not structured queries
- Foreign private issuers file 20-F instead of 10-K
## 7. Acquisition Script
Path: `scripts/fetch_sec_edgar.py`
```bash
# By CIK
python3 SKILL_DIR/scripts/fetch_sec_edgar.py --cik 0000320193 \
--types 10-K,10-Q --out data/edgar_filings.csv
# By company name (resolves to CIK first via name search)
python3 SKILL_DIR/scripts/fetch_sec_edgar.py --company "APPLE INC" \
--types 8-K --since 2024-01-01 --out data/edgar_filings.csv
```
Set `SEC_USER_AGENT` env var with your contact email (SEC requirement).
Example: `SEC_USER_AGENT="Research example@example.com"`.
## 8. Legal & Licensing
- Public record under SEC Rule 24b-2 / 17 CFR § 230.401
- No commercial use restrictions on filing content
- SEC asks all bulk users to include a `User-Agent` with contact info and to
respect 10 req/s — failure to do so can result in IP blocking
## 9. References
- Developer docs: https://www.sec.gov/edgar/sec-api-documentation
- EDGAR full-text search: https://efts.sec.gov/LATEST/search-index
- Fair access policy: https://www.sec.gov/os/accessing-edgar-data

@ -0,0 +1,89 @@
# Senate LD — Lobbying Disclosure (LD-1 / LD-2)
## 1. Summary
The Senate Office of Public Records publishes lobbying disclosures under the
Lobbying Disclosure Act of 1995 (LDA, as amended by HLOGA 2007). LD-1 is
registration of a new client-lobbyist relationship; LD-2 is the quarterly
activity report.
## 2. Access Methods
- **API:** `https://lda.senate.gov/api/v1/` (no auth required for read-only)
- **Bulk download:** `https://lda.senate.gov/api/v1/filings/?format=csv` (paginated)
- **Auth:** Token required for >120 req/hour — register at https://lda.senate.gov/api/auth/register/
- **Rate limit:** 120 req/hour unauthenticated, 1,200 req/hour authenticated
## 3. Data Schema
Key fields emitted by `fetch_senate_ld.py`:
| Column | Type | Description |
|--------|------|-------------|
| `filing_uuid` | str | Unique filing ID |
| `filing_type` | str | LD-1, LD-2, LD-203, etc. |
| `filing_year` | int | Year |
| `filing_period` | str | Q1/Q2/Q3/Q4 or annual |
| `registrant_name` | str | Lobbying firm or organization |
| `registrant_id` | str | Senate-assigned registrant ID |
| `client_name` | str | Client being represented |
| `client_id` | str | Senate-assigned client ID |
| `client_general_description` | str | Client industry / business |
| `income` | float | LD-2 income from client this quarter (USD) |
| `expenses` | float | LD-2 expenses (in-house lobbying) |
| `lobbyists` | str | Semicolon-separated lobbyist names |
| `issues` | str | Semicolon-separated issue areas |
| `government_entities` | str | Agencies/chambers contacted |
| `filing_date` | str | YYYY-MM-DD |
## 4. Coverage
- US federal lobbying only (state lobbying handled by individual state ethics offices)
- 1999 → present (full electronic coverage from 2008)
- Quarterly reporting cycle (LD-2)
- ~1M+ filings cumulative
## 5. Cross-Reference Potential
- **USAspending** ↔ `client_name` (clients lobbying for contracts)
- **SEC EDGAR** ↔ `client_name` (public companies as lobbying clients)
- **OFAC SDN** ↔ `client_name` (sanctions screening of lobbying clients)
Join key: normalized client_name. registrant_id and client_id are canonical
when joining Senate-internal records.
## 6. Data Quality
- Many lobbyist names appear in multiple registrants over time (job changes)
- `issues` and `government_entities` are free-text, with inconsistent capitalization
- Foreign agents register under FARA (Department of Justice), NOT here
- Income/expenses are reported in $10,000 brackets in some older filings
## 7. Acquisition Script
Path: `scripts/fetch_senate_ld.py`
```bash
# By client
python3 SKILL_DIR/scripts/fetch_senate_ld.py --client "EXAMPLE CORP" \
--year 2024 --out data/lobbying.csv
# By registrant (lobbying firm)
python3 SKILL_DIR/scripts/fetch_senate_ld.py --registrant "BIG K STREET LLP" \
--year 2024 --out data/lobbying.csv
```
Set `SENATE_LDA_TOKEN` env var if you have one (or pass `--token`).
Defaults to anonymous (120 req/hour).
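The hourly limits translate into a minimum spacing between requests. A small sketch of that arithmetic, using the limits quoted above; `min_interval_seconds` is a hypothetical helper, and how the real script paces itself is not stated here.

```python
def min_interval_seconds(has_token: bool) -> float:
    """Seconds to wait between requests to stay under the hourly limit."""
    per_hour = 1200 if has_token else 120   # limits quoted in Access Methods
    return 3600 / per_hour


if __name__ == "__main__":
    print(min_interval_seconds(False))  # anonymous
    print(min_interval_seconds(True))   # with a token
```

In other words, anonymous use allows one request every 30 seconds on average, and a token brings that down to one every 3 seconds.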
## 8. Legal & Licensing
- Public record under 2 U.S.C. § 1604 (LDA)
- No commercial use restrictions
- Reuse is unconditional — see Senate Public Records Office disclaimer
## 9. References
- API docs: https://lda.senate.gov/api/redoc/v1/
- LDA guidance: https://lobbyingdisclosure.house.gov/ld_guidance.pdf
- Senate Public Records: https://lda.senate.gov/

@ -0,0 +1,97 @@
# USAspending — Federal Government Contracts and Grants
## 1. Summary
USAspending.gov is the official source of federal spending data. Coverage:
contracts, grants, loans, direct payments, sub-awards. Required by the DATA Act
of 2014 — all federal agencies must report to a single schema.
## 2. Access Methods
- **API v2:** `https://api.usaspending.gov/api/v2/` (no auth, no key)
- **Bulk:** `https://files.usaspending.gov/` (CSV / Parquet by award type)
- **Auth:** None
- **Rate limit:** Not strictly enforced, but be polite — keep to <10 req/s
## 3. Data Schema
Key fields emitted by `fetch_usaspending.py` (prime awards):
| Column | Type | Description |
|--------|------|-------------|
| `award_id` | str | Federal award ID (PIID for contracts, FAIN for grants) |
| `recipient_name` | str | Awardee legal name |
| `recipient_uei` | str | Unique Entity Identifier (replaced DUNS in 2022) |
| `recipient_duns` | str | Legacy DUNS number (historical only) |
| `recipient_parent_name` | str | Ultimate parent organization |
| `recipient_state` | str | Recipient state |
| `awarding_agency` | str | Department / agency name |
| `awarding_sub_agency` | str | Sub-tier (e.g. DoD → Army) |
| `award_type` | str | Contract / Grant / Loan / Direct Payment |
| `award_amount` | float | Current total obligation in USD |
| `award_date` | str | Action / signed date YYYY-MM-DD |
| `period_of_performance_start` | str | YYYY-MM-DD |
| `period_of_performance_end` | str | YYYY-MM-DD |
| `naics_code` | str | Industry classification |
| `psc_code` | str | Product / Service Code |
| `competition_extent` | str | Full / limited / sole-source |
| `description` | str | Award description (free-text) |
## 4. Coverage
- US federal awards only (state/local not included)
- FY 2008 → present (full coverage from FY 2017)
- Updated bi-weekly from agency reporting
- ~100M+ transaction records cumulative
## 5. Cross-Reference Potential
- **SEC EDGAR** ↔ `recipient_name` (public companies as contractors)
- **Senate LD** ↔ `recipient_name` (lobbying clients winning contracts)
- **OFAC SDN** ↔ `recipient_name` (sanctions screening of contractors; these should be
filtered out by SAM.gov, but verify)
- **ICIJ Offshore** ↔ `recipient_name` (offshore-linked contractors)
Join key: normalized recipient name. UEI is canonical when present.
## 6. Data Quality
- DUNS → UEI transition (April 2022) — old records have DUNS, new records have UEI
- Some sub-awards aren't reported (FFATA threshold is $30k)
- Award amount changes over time (mod actions) — fetch script reports current total
- `competition_extent` field is free-text in older records — `fetch_usaspending.py`
normalizes to canonical values
- Recipient name variations are extensive — "ACME LLC", "Acme L.L.C.", "ACME, INC"
all appear. Use `entity_resolution.py`.
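To show why the three spellings above should collapse to one key, here is a minimal normalization sketch. It is not the algorithm in `entity_resolution.py`; `normalize_recipient` and the suffix list are assumptions made for illustration.

```python
import re

# Assumed set of legal-form suffixes to strip from the end of a name.
_SUFFIXES = {"LLC", "INC", "CORP", "CO", "LTD", "LP", "LLP"}


def normalize_recipient(name: str) -> str:
    """Reduce a recipient name to an upper-case key without legal suffixes."""
    text = name.upper().replace(".", "")        # "L.L.C." -> "LLC"
    text = re.sub(r"[^A-Z0-9 ]", " ", text)     # commas and other punctuation -> space
    tokens = text.split()
    while tokens and tokens[-1] in _SUFFIXES:   # drop trailing legal forms
        tokens.pop()
    return " ".join(tokens)


if __name__ == "__main__":
    for raw in ("ACME LLC", "Acme L.L.C.", "ACME, INC"):
        print(raw, "->", normalize_recipient(raw))
```

Stripping the legal form also merges entities that are legally distinct (an LLC and an Inc. with the same base name), so treat matches as candidates and confirm with UEI where it is present.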
## 7. Acquisition Script
Path: `scripts/fetch_usaspending.py`
```bash
# By recipient name
python3 SKILL_DIR/scripts/fetch_usaspending.py --recipient "EXAMPLE CORP" \
--fy 2024 --out data/contracts.csv
# By awarding agency
python3 SKILL_DIR/scripts/fetch_usaspending.py --agency "Department of Defense" \
--fy 2024 --out data/contracts.csv
# Filter to sole-source only
python3 SKILL_DIR/scripts/fetch_usaspending.py --recipient "EXAMPLE CORP" \
--fy 2024 --sole-source-only --out data/contracts.csv
```
## 8. Legal & Licensing
- Public record under the Federal Funding Accountability and Transparency Act
(FFATA, 2006) and DATA Act (2014)
- No commercial use restrictions on the data
- Personal information of award recipients (e.g. small business owners' addresses
in some grants) should be handled per the source agency's privacy notice
## 9. References
- API docs: https://api.usaspending.gov/
- Data dictionary: https://www.usaspending.gov/data-dictionary
- Award schema: https://files.usaspending.gov/docs/Data_Dictionary_Crosswalk.xlsx

@ -0,0 +1,93 @@
# Wayback Machine — Internet Archive CDX
## 1. Summary
The Internet Archive's Wayback Machine has captured ~900B+ web pages since
1996. The CDX server API indexes those captures by URL, timestamp, and
content hash. Free, anonymous, no auth.
## 2. Access Methods
- **CDX server:** `https://web.archive.org/cdx/search/cdx`
- **Wayback URL:** `https://web.archive.org/web/<timestamp>/<url>`
- **Save Page Now (write):** `https://web.archive.org/save/<url>` (different API)
- **Auth:** None
- **Rate limit:** Generous; be polite (~1 req/s)
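The replay URL pattern above combines a 14-digit CDX timestamp with the original URL. The sketch below builds such a URL and parses the timestamp; `replay_url` and `parse_cdx_timestamp` are hypothetical helper names, and the timestamp is assumed to be in UTC.

```python
from datetime import datetime, timezone


def replay_url(timestamp: str, url: str) -> str:
    """Build a Wayback replay URL from a CDX timestamp and the original URL."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


def parse_cdx_timestamp(timestamp: str) -> datetime:
    """Parse a YYYYMMDDHHMMSS timestamp into an aware datetime (assumed UTC)."""
    parsed = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    ts = "20200101123000"
    print(replay_url(ts, "https://example.com/page"))
    print(parse_cdx_timestamp(ts).isoformat())
```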
## 3. Data Schema
Key fields emitted by `fetch_wayback.py`:
| Column | Type | Description |
|--------|------|-------------|
| `url` | str | Original URL captured |
| `timestamp` | str | YYYYMMDDHHMMSS (CDX format) |
| `wayback_url` | str | Direct replay URL |
| `mimetype` | str | Content-type at capture |
| `status` | str | HTTP status (typically 200) |
| `digest` | str | SHA1 of capture content (collapse-friendly) |
| `length` | str | Byte length of capture |
## 4. Coverage
- 1996 → present
- ~900B+ captures across ~700M domains
- Updated continuously by automated crawls + manual saves
- Some domains have aggressive coverage (news), others sparse (private)
## 5. Cross-Reference Potential
- **Wikipedia** ↔ Reverse-lookup pages cited as references that have since
disappeared
- **News URLs** ↔ Original article content when present-day URLs 404
- **Corporate websites** ↔ Historical "About" pages, executive bios that
have been scrubbed
The Wayback CDX is most useful as a **content-recovery** layer when other
sources point to URLs that no longer exist.
## 6. Data Quality
- robots.txt-blocked domains may have spotty or no coverage
- Captures vary in completeness (HTML may be saved without CSS/JS)
- Some content is excluded by domain owner request (DMCA, etc.)
- Coverage of "deep links" (URLs with query strings) is uneven
- Time resolution is per-capture, not continuous — gaps are common
## 7. Acquisition Script
Path: `scripts/fetch_wayback.py`
```bash
# All captures of a specific URL
python3 SKILL_DIR/scripts/fetch_wayback.py --url "https://example.com/page" \
--out data/wb.csv
# All captures of a host
python3 SKILL_DIR/scripts/fetch_wayback.py --url "example.com" \
--match host --out data/wb.csv
# All captures of a domain + subdomains
python3 SKILL_DIR/scripts/fetch_wayback.py --url "example.com" \
--match domain --out data/wb.csv
# Only unique-content captures within a date window
python3 SKILL_DIR/scripts/fetch_wayback.py --url "example.com" \
--match host --collapse digest \
--from-date 2020-01-01 --to-date 2023-12-31 \
--out data/wb.csv
```
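For orientation, the last command above could translate into a CDX query roughly as sketched below. The mapping from script flags to the CDX parameters `matchType`, `collapse`, `from`, and `to` is an assumption about how `fetch_wayback.py` works, and `cdx_query_url` is a hypothetical helper.

```python
from __future__ import annotations

from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(
    url: str,
    match: str | None = None,
    collapse: str | None = None,
    from_date: str | None = None,
    to_date: str | None = None,
) -> str:
    """Build a CDX search URL; dates are given as YYYY-MM-DD and sent compact."""
    params = {"url": url, "output": "json"}
    if match:
        params["matchType"] = match          # e.g. "host" or "domain"
    if collapse:
        params["collapse"] = collapse        # e.g. "digest" for unique content
    if from_date:
        params["from"] = from_date.replace("-", "")
    if to_date:
        params["to"] = to_date.replace("-", "")
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


print(cdx_query_url("example.com", match="host", collapse="digest",
                    from_date="2020-01-01", to_date="2023-12-31"))
```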
## 8. Legal & Licensing
- Internet Archive captures are made under fair-use research provisions
- Replay URLs are stable references — citing them is encouraged
- Internet Archive non-profit terms of use govern content
- Some content is rights-restricted; replay may be blocked even if the
CDX entry shows it as captured
## 9. References
- CDX server docs: https://github.com/internetarchive/wayback/blob/master/wayback-cdx-server/README.md
- Wayback API: https://archive.org/help/wayback_api.php
- Internet Archive: https://archive.org/

@ -0,0 +1,107 @@
# Wikipedia + Wikidata
## 1. Summary
Wikipedia is the canonical narrative-bio source for notable people, places,
and organizations. Wikidata is its structured-data counterpart: ~110M
items, each with claims, dates, identifiers, and cross-references to
external authorities (VIAF, ISNI, ORCID, GRID, etc.).
Together they're a high-precision entity-resolution layer — the bar for
inclusion is real, but anything past that bar is well-cross-referenced.
## 2. Access Methods
- **Wikipedia OpenSearch:** `https://en.wikipedia.org/w/api.php?action=opensearch`
- **Wikipedia REST summary:** `https://en.wikipedia.org/api/rest_v1/page/summary/<title>`
- **Wikidata Action API:** `https://www.wikidata.org/w/api.php?action=wbgetentities`
- **Wikidata SPARQL:** `https://query.wikidata.org/sparql` (more powerful but aggressively rate-limited)
- **Auth:** None, but **a meaningful User-Agent is required**
Set `HERMES_OSINT_UA` to something identifying (e.g. `your-app/1.0 (you@example.com)`).
Wikimedia returns HTTP 429 to generic UAs.
## 3. Data Schema
Key fields emitted by `fetch_wikipedia.py`:
| Column | Type | Description |
|--------|------|-------------|
| `source` | str | `wikipedia` or `wikipedia+wikidata` |
| `label` | str | Wikipedia article title |
| `description` | str | Short Wikidata description |
| `qid` | str | Wikidata QID (e.g. Q2283 for Microsoft) |
| `wikipedia_title`, `wikipedia_url` | str | Article identifier + URL |
| `wikidata_url` | str | Wikidata entity URL |
| `instance_of` | str | What kind of thing it is (P31) |
| `country` | str | Country (P17 for orgs/places, P27 for people) |
| `occupation` | str | P106 |
| `employer` | str | P108 |
| `date_of_birth` | str | P569, YYYY-MM-DD |
| `place_of_birth` | str | P19 |
| `summary` | str | Wikipedia REST extract (~1000 chars) |
The fetch script uses Wikidata's Action API (NOT SPARQL) for structured
facts — far more lenient on rate limits.
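To illustrate how property columns such as `instance_of` (P31) or `date_of_birth` (P569) can be read out of a `wbgetentities` response, here is a small sketch. The `claim_values` helper and the hand-written `SAMPLE` entity are illustrative only; the real fetcher may extract and label values differently.

```python
from __future__ import annotations


def claim_values(entity: dict, prop: str) -> list[str]:
    """Return plain values for one property from a Wikidata entity dict."""
    out: list[str] = []
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "novalue" / "somevalue" snaks carry no datavalue
        value = snak["datavalue"]["value"]
        if isinstance(value, dict) and "id" in value:
            out.append(value["id"])                       # item reference (QID)
        elif isinstance(value, dict) and "time" in value:
            out.append(value["time"].lstrip("+")[:10])    # keep YYYY-MM-DD
        else:
            out.append(str(value))
    return out


# Hand-written, heavily trimmed stand-in for one entity in the API response.
SAMPLE = {
    "id": "Q5284",
    "claims": {
        "P31": [{"mainsnak": {"snaktype": "value", "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "id": "Q5"}}}}],
        "P569": [{"mainsnak": {"snaktype": "value", "datavalue": {
            "type": "time",
            "value": {"time": "+1955-10-28T00:00:00Z", "precision": 11}}}}],
        "P19": [{"mainsnak": {"snaktype": "novalue"}}],
    },
}

print(claim_values(SAMPLE, "P31"))
print(claim_values(SAMPLE, "P569"))
```

Item-valued properties come back as QIDs, so a second lookup is needed to turn them into human-readable labels.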
## 4. Coverage
- Wikipedia EN: ~7M articles
- Wikidata: ~110M items, ~1.5B statements
- Updated continuously; abuse filters and bots run constantly
- High notability bar — most private individuals are not in Wikipedia
## 5. Cross-Reference Potential
- **All sources** ↔ `label` (entity identity resolution)
- **SEC EDGAR** ↔ `label` (public companies)
- **CourtListener** ↔ `label` (parties to notable litigation)
- **Wikidata external identifiers** (not currently in this fetcher's output)
link to VIAF, ISNI, ORCID, GRID, GitHub, Twitter, IMDb, ...
Join key: Wikidata QID is canonical. Wikipedia titles are stable for
most articles but can be renamed.
## 6. Data Quality
- Notability filter — only notable entities (criteria vary by topic)
- Recency lag — current events take days to weeks to be reflected
- POV / vandalism — moderated, but edits between sweeps can be bad
- Living-persons biographies have stricter sourcing requirements
- Wikidata claims have qualifiers and references — the fetch script
doesn't currently export them
## 7. Acquisition Script
Path: `scripts/fetch_wikipedia.py`
```bash
# Look up a notable entity
python3 SKILL_DIR/scripts/fetch_wikipedia.py --query "Microsoft" --out data/wp.csv
# A specific person
python3 SKILL_DIR/scripts/fetch_wikipedia.py --query "Bill Gates" --out data/wp_bg.csv
# Skip the Wikidata enrichment for speed
python3 SKILL_DIR/scripts/fetch_wikipedia.py --query "Microsoft" --no-wikidata \
--limit 5 --out data/wp.csv
```
The OpenSearch is fuzzy — `--limit 5` returns the top 5 Wikipedia article
matches. Each is enriched with the QID + structured facts unless
`--no-wikidata` is passed.
## 8. Legal & Licensing
- Wikipedia text: CC BY-SA 4.0 / GFDL
- Wikidata claims: CC0 (public domain)
- API ToS: respect rate limits, identify your agent
- Commercial use allowed with attribution
## 9. References
- Wikipedia OpenSearch: https://www.mediawiki.org/wiki/API:Opensearch
- Wikipedia REST: https://en.wikipedia.org/api/rest_v1/
- Wikidata Action API: https://www.wikidata.org/wiki/Wikidata:Data_access
- Wikidata SPARQL: https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service
- User-Agent policy: https://meta.wikimedia.org/wiki/User-Agent_policy

@ -0,0 +1,82 @@
"""Tiny stdlib HTTP helper used by fetch_*.py scripts.

Provides polite retry + JSON convenience + User-Agent enforcement.
"""
from __future__ import annotations

import json
import os
import time
import urllib.error
import urllib.parse
import urllib.request

DEFAULT_UA = (
    "hermes-osint-investigation/0.2 "
    "(+https://github.com/NousResearch/hermes-agent; "
    "set HERMES_OSINT_UA env var to identify yourself per "
    "Wikimedia / SEC fair-use guidance)"
)


def get(
    url: str,
    *,
    params: dict | None = None,
    headers: dict | None = None,
    user_agent: str | None = None,
    max_retries: int = 3,
    backoff: float = 1.5,
    timeout: float = 30.0,
) -> bytes:
    """GET with retry on 5xx and Retry-After honoring.

    429 (rate-limit) is raised IMMEDIATELY with a clear message: retrying
    when the upstream says "you're over quota" just wastes time. The caller
    should slow down or supply real credentials.
    """
    if params:
        sep = "&" if "?" in url else "?"
        url = f"{url}{sep}{urllib.parse.urlencode(params)}"
    h = {"User-Agent": user_agent or os.environ.get("HERMES_OSINT_UA", DEFAULT_UA)}
    if headers:
        h.update(headers)
    last_err: Exception | None = None
    for attempt in range(max_retries + 1):
        req = urllib.request.Request(url, headers=h)
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return resp.read()
        except urllib.error.HTTPError as e:
            if e.code == 429:
                # Surface immediately. Read the body so the caller sees the
                # provider's actual message ("OVER_RATE_LIMIT" etc.).
                try:
                    body = e.read(2048).decode("utf-8", errors="replace")
                except Exception:  # noqa: BLE001
                    body = ""
                raise RuntimeError(
                    f"HTTP 429 rate-limited by {urllib.parse.urlsplit(url).netloc}. "
                    f"Slow down or supply a real API key. Body: {body[:300]}"
                ) from e
            if e.code in (500, 502, 503, 504) and attempt < max_retries:
                retry_after = e.headers.get("Retry-After") if e.headers else None
                wait = float(retry_after) if (retry_after and retry_after.isdigit()) else backoff ** (attempt + 1)
                time.sleep(wait)
                last_err = e
                continue
            raise
        except urllib.error.URLError as e:
            if attempt < max_retries:
                time.sleep(backoff ** (attempt + 1))
                last_err = e
                continue
            raise
    if last_err:
        raise last_err
    raise RuntimeError("unreachable")


def get_json(url: str, **kwargs) -> dict | list:
    return json.loads(get(url, **kwargs).decode("utf-8"))
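
The wait computed inside `get` can be isolated to see the retry schedule it produces. The sketch below mirrors that one expression; `retry_wait` is a hypothetical name used only for illustration.

```python
from __future__ import annotations


def retry_wait(attempt: int, retry_after: str | None = None, backoff: float = 1.5) -> float:
    """Seconds to sleep before the next try, following the rule used in get()."""
    if retry_after and retry_after.isdigit():
        return float(retry_after)       # server-supplied delay in whole seconds
    return backoff ** (attempt + 1)     # exponential fallback


# Default schedule for max_retries=3: sleeps happen after attempts 0, 1 and 2.
print([retry_wait(a) for a in range(3)])
# A numeric Retry-After header takes precedence over the exponential schedule.
print(retry_wait(0, "10"))
# An HTTP-date Retry-After is not all digits, so the exponential value is used.
print(retry_wait(1, "Wed, 21 Oct 2015 07:28:00 GMT"))
```

With the defaults, a request that keeps failing with 5xx sleeps a little over seven seconds in total before the final error is raised.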

@ -0,0 +1,67 @@
"""Shared entity-name normalization helpers (stdlib-only).

Used by entity_resolution.py and timing_analysis.py.
"""
from __future__ import annotations

import re

# Legal suffixes / corporate boilerplate to strip during normalization.
_SUFFIX_TOKENS = {
    "INC", "INCORPORATED", "LLC", "LLP", "LP", "LTD", "LIMITED",
    "CORP", "CORPORATION", "CO", "COMPANY",
    "GROUP", "GRP", "HOLDINGS", "HOLDING",
    "PARTNERS", "ASSOCIATES",
    "INTERNATIONAL", "INTL",
    "ENTERPRISES", "ENTERPRISE",
    "SERVICES", "SERVICE", "SVCS",
    "SOLUTIONS", "MANAGEMENT", "MGMT", "CONSULTING",
    "TECHNOLOGY", "TECHNOLOGIES", "TECH",
    "INDUSTRIES", "INDUSTRY",
    "AMERICA", "AMERICAN",
    "USA", "US",
    "PLLC", "PC",
    "TRUST", "FOUNDATION",
}

_PUNCT_RE = re.compile(r"[^\w\s]")
_WS_RE = re.compile(r"\s+")


def normalize_name(name: str | None) -> str:
    """Standard normalization: uppercase, strip suffixes, drop punctuation."""
    if not name:
        return ""
    s = _PUNCT_RE.sub(" ", name.upper())
    s = _WS_RE.sub(" ", s).strip()
    tokens = [t for t in s.split() if t and t not in _SUFFIX_TOKENS]
    return " ".join(tokens)


def normalize_aggressive(name: str | None) -> str:
    """Aggressive normalization: sorted unique tokens (word-bag)."""
    base = normalize_name(name)
    if not base:
        return ""
    return " ".join(sorted(set(base.split())))


def name_tokens(name: str | None, min_len: int = 4) -> set[str]:
    """Token set used for overlap matching."""
    base = normalize_name(name)
    if not base:
        return set()
    return {t for t in base.split() if len(t) >= min_len}


def token_overlap_ratio(left: str | None, right: str | None) -> tuple[float, int]:
    """Return (jaccard-like ratio, shared token count) over min-len tokens."""
    a = name_tokens(left)
    b = name_tokens(right)
    if not a or not b:
        return 0.0, 0
    shared = a & b
    if not shared:
        return 0.0, 0
    union = a | b
    return len(shared) / len(union), len(shared)
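
A condensed, self-contained sketch of the two normalization levels shows how differently written names converge. It uses a deliberately small `SUFFIXES` subset rather than the full list above, so its results can differ from the real module for names that contain other suffix tokens.

```python
from __future__ import annotations

import re

# Illustrative subset only; the module above strips many more tokens.
SUFFIXES = {"INC", "LLC", "CORP", "CO", "COMPANY", "GROUP", "HOLDINGS"}


def normalize_name(name: str | None) -> str:
    """Uppercase, replace punctuation with spaces, drop suffix tokens."""
    s = re.sub(r"[^\w\s]", " ", (name or "").upper())
    return " ".join(t for t in s.split() if t not in SUFFIXES)


def normalize_aggressive(name: str | None) -> str:
    """Word-bag form: sorted unique tokens of the standard form."""
    return " ".join(sorted(set(normalize_name(name).split())))


print(normalize_name("Acme Widgets, Inc."))         # standard form
print(normalize_name("Widgets Acme Co."))           # word order is preserved
print(normalize_aggressive("Widgets Acme Co."))     # word order is ignored
print(repr(normalize_name("Holdings Group LLC")))   # only suffixes: empty string
```

The last case matters in practice: a name made up entirely of suffix tokens normalizes to an empty string, so it cannot be matched on any tier.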

@ -0,0 +1,221 @@
#!/usr/bin/env python3
"""Build a structured findings.json with evidence chains (stdlib-only).
Aggregates cross_links.csv (entity_resolution output) and an optional
timing.json (timing_analysis output) into a single evidence-chain document.
Output structure:
{
"metadata": {...},
"findings": [
{
"id": "F0001",
"title": "...",
"severity": "HIGH|MEDIUM|LOW",
"confidence": "high|medium|low",
"summary": "...",
"evidence": [
{"source": "cross_links.csv", "row": 12, "fields": {...}},
...
],
"sources": ["cross_links.csv", "timing.json"]
}
]
}
Every finding traces to specific source rows. No naked claims.
"""
from __future__ import annotations
import argparse
import csv
import json
from collections import defaultdict
from pathlib import Path
CONFIDENCE_ORDER = {"high": 0, "medium": 1, "low": 2}
SEVERITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
def _read_cross_links(path: str) -> list[dict[str, str]]:
with open(path, newline="", encoding="utf-8") as fh:
return list(csv.DictReader(fh))
def build_findings(
cross_links_path: str,
timing_path: str | None = None,
out_path: str = "findings.json",
bundled_threshold: int = 3,
) -> dict:
findings: list[dict] = []
next_id = 1
# 1. Match-based findings, grouped by (left_normalized, right_normalized).
matches = _read_cross_links(cross_links_path)
grouped: dict[tuple[str, str], list[dict[str, str]]] = defaultdict(list)
for i, row in enumerate(matches):
row["__row__"] = str(i)
grouped[(row.get("left_normalized", ""), row.get("right_normalized", ""))].append(row)
for (left_norm, right_norm), rows in grouped.items():
if not left_norm or not right_norm:
continue
# Use the highest-confidence match for the finding's overall confidence.
best = min(rows, key=lambda r: CONFIDENCE_ORDER.get(r.get("confidence", "low"), 2))
finding_id = f"F{next_id:04d}"
next_id += 1
evidence = [
{
"source": "cross_links.csv",
"row": int(r["__row__"]),
"fields": {
"match_type": r.get("match_type", ""),
"confidence": r.get("confidence", ""),
"left_name": r.get("left_name", ""),
"right_name": r.get("right_name", ""),
"overlap_ratio": r.get("overlap_ratio", ""),
"shared_tokens": r.get("shared_tokens", ""),
},
}
for r in rows
]
findings.append(
{
"id": finding_id,
"title": f"Entity match: {best.get('left_name', '')} ↔ {best.get('right_name', '')}",
"severity": "MEDIUM" if best.get("confidence") == "high" else "LOW",
"confidence": best.get("confidence", "low"),
"summary": (
f"{len(rows)} cross-link record(s) tie "
f"'{best.get('left_name', '')}' to "
f"'{best.get('right_name', '')}' "
f"(best tier: {best.get('match_type', '')})."
),
"evidence": evidence,
"sources": ["cross_links.csv"],
}
)
# 2. Bundled-donations findings (if cross_links carries donor↔candidate pattern).
# Heuristic: many distinct left names sharing the same right name.
by_right: dict[str, set[str]] = defaultdict(set)
by_right_rows: dict[str, list[dict[str, str]]] = defaultdict(list)
for r in matches:
right = r.get("right_normalized", "")
left_raw = r.get("left_name", "").strip()
if right and left_raw:
by_right[right].add(left_raw)
by_right_rows[right].append(r)
for right_norm, lefts in by_right.items():
if len(lefts) < bundled_threshold:
continue
rows = by_right_rows[right_norm]
right_raw = rows[0].get("right_name", "")
findings.append(
{
"id": f"F{next_id:04d}",
"title": f"Bundled cross-links: {len(lefts)} distinct left entities ↔ '{right_raw}'",
"severity": "HIGH",
"confidence": "medium",
"summary": (
f"{len(lefts)} distinct left-side entities link to "
f"'{right_raw}'. Pattern suggests coordinated relationship "
f"(e.g. bundled donations, multi-vendor employer)."
),
"evidence": [
{
"source": "cross_links.csv",
"row": int(r.get("__row__", "0")),
"fields": {
"left_name": r.get("left_name", ""),
"match_type": r.get("match_type", ""),
},
}
for r in rows
],
"sources": ["cross_links.csv"],
}
)
next_id += 1
# 3. Timing-based findings.
if timing_path and Path(timing_path).exists():
timing = json.loads(Path(timing_path).read_text())
for r in timing.get("results", []):
if not r.get("significant"):
continue
findings.append(
{
"id": f"F{next_id:04d}",
"title": (
f"Donation timing significantly clusters near awards: "
f"{r['donor']} → {r['recipient']}"
),
"severity": "HIGH" if r["p_value"] < 0.01 else "MEDIUM",
"confidence": "medium",
"summary": (
f"Mean nearest-award distance {r['observed_mean_days']} days "
f"(null {r['null_mean_days']} days). p={r['p_value']}, "
f"effect size {r['effect_size_sd']} SD. "
f"{r['n_donations']} donations, {r['n_award_dates']} awards."
),
"evidence": [
{
"source": "timing.json",
"row": None,
"fields": r,
}
],
"sources": ["timing.json"],
}
)
next_id += 1
# Sort: severity → confidence → id.
findings.sort(
key=lambda f: (
SEVERITY_ORDER.get(f["severity"], 3),
CONFIDENCE_ORDER.get(f["confidence"], 3),
f["id"],
)
)
payload = {
"metadata": {
"n_findings": len(findings),
"cross_links_path": cross_links_path,
"timing_path": timing_path,
"bundled_threshold": bundled_threshold,
},
"findings": findings,
}
Path(out_path).write_text(json.dumps(payload, indent=2))
return payload
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--cross-links", required=True)
p.add_argument("--timing", help="Optional timing.json from timing_analysis.py")
p.add_argument("--out", default="findings.json")
p.add_argument(
"--bundled-threshold",
type=int,
default=3,
help="Minimum distinct left entities to flag as bundled (default 3)",
)
a = p.parse_args()
payload = build_findings(
cross_links_path=a.cross_links,
timing_path=a.timing,
out_path=a.out,
bundled_threshold=a.bundled_threshold,
)
print(f"Wrote {payload['metadata']['n_findings']} findings to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
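
A downstream consumer of `findings.json` might filter by severity and trace each finding back to its source rows. The sketch below works on a hand-written payload that follows the structure in the module docstring; `by_severity` and `evidence_rows` are hypothetical helpers, not part of the script.

```python
from __future__ import annotations


def by_severity(payload: dict, severity: str) -> list[dict]:
    """Return findings with the given severity, keeping file order."""
    return [f for f in payload.get("findings", []) if f.get("severity") == severity]


def evidence_rows(finding: dict, source: str = "cross_links.csv") -> list[int]:
    """Row indices in `source` that back this finding (timing rows are None)."""
    return [
        e["row"]
        for e in finding.get("evidence", [])
        if e.get("source") == source and e.get("row") is not None
    ]


# Hand-written sample; in practice this comes from json.load() on findings.json.
PAYLOAD = {
    "metadata": {"n_findings": 2},
    "findings": [
        {"id": "F0002", "severity": "HIGH", "confidence": "medium",
         "evidence": [
             {"source": "cross_links.csv", "row": 0, "fields": {}},
             {"source": "cross_links.csv", "row": 4, "fields": {}},
             {"source": "timing.json", "row": None, "fields": {}},
         ]},
        {"id": "F0001", "severity": "LOW", "confidence": "low",
         "evidence": [{"source": "cross_links.csv", "row": 0, "fields": {}}]},
    ],
}

high = by_severity(PAYLOAD, "HIGH")
print([f["id"] for f in high])
print(evidence_rows(high[0]))
```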

@ -0,0 +1,228 @@
#!/usr/bin/env python3
"""Cross-source entity resolution (stdlib-only).
Given two CSV files with name columns, find candidate matches using three
tiers of normalization:
1. exact: normalized strings equal
2. fuzzy: sorted-token (word-bag) match
3. token_overlap: >=60% Jaccard overlap on >=4-char tokens, >=2 shared
Adapted from ShinMegamiBoson/OpenPlanter (MIT) but generalized: no Boston-
specific record types, no contribution-code filters, no fixed schemas.
Output CSV columns:
match_type, confidence, left_name, right_name,
left_normalized, right_normalized, left_row, right_row,
overlap_ratio, shared_tokens
"""
from __future__ import annotations
import argparse
import csv
import sys
from pathlib import Path
# Allow running directly or as a module.
sys.path.insert(0, str(Path(__file__).parent))
from _normalize import ( # noqa: E402
normalize_name,
normalize_aggressive,
token_overlap_ratio,
)
CONFIDENCE = {
"exact": "high",
"fuzzy": "medium",
"token_overlap": "low",
}
def _read_csv(path: str, name_col: str) -> list[dict[str, str]]:
rows = []
with open(path, newline="", encoding="utf-8") as fh:
reader = csv.DictReader(fh)
if name_col not in (reader.fieldnames or []):
raise SystemExit(
f"Column {name_col!r} not in {path}. "
f"Available: {reader.fieldnames}"
)
for i, row in enumerate(reader):
row["__row__"] = str(i)
rows.append(row)
return rows
def _build_index(rows: list[dict[str, str]], name_col: str):
"""Index by exact-normalized and aggressive (sorted-token) form."""
exact: dict[str, list[dict[str, str]]] = {}
aggressive: dict[str, list[dict[str, str]]] = {}
for row in rows:
raw = row.get(name_col, "")
n = normalize_name(raw)
if n:
exact.setdefault(n, []).append(row)
a = normalize_aggressive(raw)
if a:
aggressive.setdefault(a, []).append(row)
return exact, aggressive
def _emit(
out_rows: list[dict[str, str]],
seen: set[tuple],
match_type: str,
left_row: dict[str, str],
right_row: dict[str, str],
left_col: str,
right_col: str,
ratio: float = 0.0,
shared: int = 0,
):
left_raw = left_row.get(left_col, "")
right_raw = right_row.get(right_col, "")
key = (
left_row["__row__"],
right_row["__row__"],
match_type,
)
if key in seen:
return
seen.add(key)
out_rows.append(
{
"match_type": match_type,
"confidence": CONFIDENCE[match_type],
"left_name": left_raw,
"right_name": right_raw,
"left_normalized": normalize_name(left_raw),
"right_normalized": normalize_name(right_raw),
"left_row": left_row["__row__"],
"right_row": right_row["__row__"],
"overlap_ratio": f"{ratio:.3f}" if ratio else "",
"shared_tokens": str(shared) if shared else "",
}
)
def resolve(
left_path: str,
left_col: str,
right_path: str,
right_col: str,
out_path: str,
overlap_threshold: float = 0.60,
min_shared: int = 2,
skip_overlap: bool = False,
) -> int:
left_rows = _read_csv(left_path, left_col)
right_rows = _read_csv(right_path, right_col)
right_exact, right_aggressive = _build_index(right_rows, right_col)
out_rows: list[dict[str, str]] = []
seen: set[tuple] = set()
# Pass 1+2: exact / fuzzy via index lookup.
for lrow in left_rows:
raw = lrow.get(left_col, "")
n = normalize_name(raw)
if not n:
continue
for rrow in right_exact.get(n, []):
_emit(out_rows, seen, "exact", lrow, rrow, left_col, right_col)
a = normalize_aggressive(raw)
if a:
for rrow in right_aggressive.get(a, []):
_emit(out_rows, seen, "fuzzy", lrow, rrow, left_col, right_col)
if not skip_overlap:
# Pass 3: token overlap (O(N*M) — expensive; allow opt-out).
for lrow in left_rows:
l_raw = lrow.get(left_col, "")
if not normalize_name(l_raw):
continue
for rrow in right_rows:
ratio, shared = token_overlap_ratio(
l_raw, rrow.get(right_col, "")
)
if ratio >= overlap_threshold and shared >= min_shared:
_emit(
out_rows,
seen,
"token_overlap",
lrow,
rrow,
left_col,
right_col,
ratio=ratio,
shared=shared,
)
fieldnames = [
"match_type",
"confidence",
"left_name",
"right_name",
"left_normalized",
"right_normalized",
"left_row",
"right_row",
"overlap_ratio",
"shared_tokens",
]
with open(out_path, "w", newline="", encoding="utf-8") as fh:
writer = csv.DictWriter(fh, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(out_rows)
return len(out_rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--left", required=True, help="Left CSV path")
p.add_argument(
"--left-name-col", required=True, help="Name column in left CSV"
)
p.add_argument("--right", required=True, help="Right CSV path")
p.add_argument(
"--right-name-col",
required=True,
help="Name column in right CSV",
)
p.add_argument("--out", required=True, help="Output CSV path")
p.add_argument(
"--overlap-threshold",
type=float,
default=0.60,
help="Jaccard overlap threshold for token_overlap tier (default 0.60)",
)
p.add_argument(
"--min-shared",
type=int,
default=2,
help="Minimum shared tokens for token_overlap tier (default 2)",
)
p.add_argument(
"--skip-overlap",
action="store_true",
help="Skip the O(N*M) token_overlap pass (much faster on large CSVs)",
)
args = p.parse_args()
count = resolve(
left_path=args.left,
left_col=args.left_name_col,
right_path=args.right,
right_col=args.right_name_col,
out_path=args.out,
overlap_threshold=args.overlap_threshold,
min_shared=args.min_shared,
skip_overlap=args.skip_overlap,
)
print(f"Wrote {count} match rows to {args.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
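
The third tier's two conditions (overlap ratio and minimum shared tokens) interact in ways that are easier to see on concrete names. The sketch below is a simplified stand-in: it assumes the inputs are already normalized and skips suffix stripping, and the company names are invented.

```python
from __future__ import annotations


def tokens(name: str, min_len: int = 4) -> set[str]:
    """Tokens of at least min_len characters (input assumed pre-normalized)."""
    return {t for t in name.upper().split() if len(t) >= min_len}


def overlap(left: str, right: str) -> tuple[float, int]:
    """Jaccard ratio and shared-token count over the two token sets."""
    a, b = tokens(left), tokens(right)
    if not a or not b:
        return 0.0, 0
    shared = a & b
    return len(shared) / len(a | b), len(shared)


def passes(left: str, right: str, threshold: float = 0.60, min_shared: int = 2) -> bool:
    """Apply both token_overlap conditions with the script's default values."""
    ratio, shared = overlap(left, right)
    return ratio >= threshold and shared >= min_shared


pairs = [
    ("NORTHWIND HARBOR LOGISTICS", "NORTHWIND HARBOR FREIGHT"),
    ("NORTHWIND HARBOR LOGISTICS", "HARBOR NORTHWIND LOGISTICS EAST"),
    ("NORTHWIND", "NORTHWIND"),
]
for left, right in pairs:
    print(left, "|", right, "|", overlap(left, right), passes(left, right))
```

The first pair shares two of four distinct tokens and falls below the 0.60 ratio; the third has a perfect ratio but only one shared token, so it relies on the exact tier instead.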

@ -0,0 +1,149 @@
#!/usr/bin/env python3
"""Search court records via CourtListener (Free Law Project).
Covers ~10M federal and state court opinions, plus PACER docket data
where available. Public REST API v4 supports anonymous read access for
search; some endpoints require a token (free at courtlistener.com).
Set COURTLISTENER_TOKEN to authenticate (raises rate limits).
"""
from __future__ import annotations
import argparse
import csv
import os
import sys
import urllib.parse
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from _http import get_json # noqa: E402
BASE = "https://www.courtlistener.com/api/rest/v4/search/"
COLUMNS = [
"case_name",
"court",
"court_id",
"date_filed",
"docket_number",
"judge",
"citation",
"result_type",
"snippet",
"absolute_url",
]
SEARCH_TYPES = {
"opinions": "o", # Court opinions
"dockets": "r", # PACER dockets (may require auth depending on coverage)
"oral": "oa", # Oral arguments
"people": "p", # Judges / people
"recap": "r", # Same as dockets in v4
}
def fetch(
query: str,
search_type: str,
court: str | None,
date_from: str | None,
date_to: str | None,
token: str | None,
limit: int,
out_path: str,
) -> int:
type_code = SEARCH_TYPES.get(search_type, search_type)
params = {
"q": query,
"type": type_code,
}
if court:
params["court"] = court
if date_from:
params["filed_after"] = date_from
if date_to:
params["filed_before"] = date_to
headers = {"Authorization": f"Token {token}"} if token else None
rows: list[dict[str, str]] = []
next_url: str | None = f"{BASE}?{urllib.parse.urlencode(params)}"
while next_url and len(rows) < limit:
try:
payload = get_json(next_url, headers=headers)
except Exception as e: # noqa: BLE001
print(f"CourtListener error: {e}", file=sys.stderr)
break
if not isinstance(payload, dict):
break
results = payload.get("results", [])
for r in results:
if len(rows) >= limit:
break
rows.append(
{
"case_name": r.get("caseName", "") or r.get("case_name", "") or "",
"court": r.get("court", "") or "",
"court_id": r.get("court_id", "") or "",
"date_filed": (r.get("dateFiled", "") or r.get("date_filed", "") or "")[:10],
"docket_number": r.get("docketNumber", "") or r.get("docket_number", "") or "",
"judge": r.get("judge", "") or "",
"citation": "; ".join(r.get("citation", []) or []) if isinstance(r.get("citation"), list) else (r.get("citation") or ""),
"result_type": search_type,
"snippet": (r.get("snippet", "") or "").replace("\n", " ")[:500],
"absolute_url": (
f"https://www.courtlistener.com{r.get('absolute_url', '')}"
if r.get("absolute_url", "").startswith("/")
else r.get("absolute_url", "")
),
}
)
next_url = payload.get("next")
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
if not rows:
print(
f"CourtListener: 0 results for type={search_type!r} q={query!r}. "
"Most private individuals don't appear in published court records "
"unless they were party to a federal or state appellate case.",
file=sys.stderr,
)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--query", required=True, help="Search query (party name, case name, keyword)")
p.add_argument(
"--type",
default="opinions",
choices=list(SEARCH_TYPES.keys()),
help="Search type (default: opinions)",
)
p.add_argument("--court", help="Court ID filter (e.g. 'nysd' = SDNY, 'scotus' = Supreme Court)")
p.add_argument("--date-from", help="Filed-after date YYYY-MM-DD")
p.add_argument("--date-to", help="Filed-before date YYYY-MM-DD")
p.add_argument("--token", default=os.environ.get("COURTLISTENER_TOKEN"))
p.add_argument("--limit", type=int, default=100)
p.add_argument("--out", required=True)
a = p.parse_args()
n = fetch(
query=a.query,
search_type=a.type,
court=a.court,
date_from=a.date_from,
date_to=a.date_to,
token=a.token,
limit=a.limit,
out_path=a.out,
)
print(f"Wrote {n} CourtListener rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
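
The pagination loop in `fetch` follows each page's `next` link until the limit is reached. The sketch below isolates that pattern with a stubbed page source so it runs offline; `collect` and the `PAGES` dictionary are illustrative stand-ins, not CourtListener responses.

```python
from __future__ import annotations

from typing import Callable


def collect(first_url: str, fetch_page: Callable[[str], dict], limit: int) -> list:
    """Follow `next` links, stopping once `limit` results are gathered."""
    rows: list = []
    next_url: str | None = first_url
    while next_url and len(rows) < limit:
        payload = fetch_page(next_url)
        for result in payload.get("results", []):
            if len(rows) >= limit:
                break
            rows.append(result)
        next_url = payload.get("next")  # None on the last page ends the loop
    return rows


# Stub standing in for get_json(); keys play the role of page URLs.
PAGES = {
    "page1": {"results": [1, 2, 3], "next": "page2"},
    "page2": {"results": [4, 5, 6], "next": None},
}

print(collect("page1", PAGES.__getitem__, limit=4))
print(collect("page1", PAGES.__getitem__, limit=100))
```

Checking the limit both in the `while` condition and inside the inner loop avoids requesting a further page once enough rows are in hand.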

@ -0,0 +1,162 @@
#!/usr/bin/env python3
"""Search the GDELT 2.0 DOC API for news mentions.
GDELT monitors world news in 100+ languages and indexes the full text.
Free, anonymous, ~15-minute update frequency. Covers ~2015 → present.
Useful for surfacing news mentions of a person, company, or topic across
international media, a much wider net than Google News.
"""
from __future__ import annotations
import argparse
import csv
import json
import sys
import time
import urllib.parse
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from _http import get_json # noqa: E402
BASE = "https://api.gdeltproject.org/api/v2/doc/doc"
COLUMNS = [
"title",
"url",
"seen_date",
"domain",
"language",
"source_country",
"tone",
"social_image",
]
def fetch(
query: str,
mode: str,
timespan: str | None,
start_datetime: str | None,
end_datetime: str | None,
source_country: str | None,
source_lang: str | None,
limit: int,
out_path: str,
) -> int:
params: dict[str, str] = {
"query": query,
"mode": mode,
"format": "json",
"maxrecords": str(min(limit, 250)),
"sort": "datedesc",
}
if timespan:
params["timespan"] = timespan
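if start_datetime:
# Also strip "T" so the documented YYYY-MM-DDTHH:MM:SS form yields digits only.
params["startdatetime"] = start_datetime.replace("-", "").replace(":", "").replace("T", "").replace(" ", "")
if end_datetime:
params["enddatetime"] = end_datetime.replace("-", "").replace(":", "").replace("T", "").replace(" ", "")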
if source_country:
params["sourcecountry"] = source_country
if source_lang:
params["sourcelang"] = source_lang
url = f"{BASE}?{urllib.parse.urlencode(params)}"
payload: dict | list = {}
for attempt in range(3):
try:
payload = get_json(url)
break
except RuntimeError as e:
# GDELT requires 1 request per 5 seconds; back off and retry.
if "429" in str(e) and attempt < 2:
print(
f"GDELT throttle hit; sleeping 6s before retry "
f"(attempt {attempt + 1}/3)",
file=sys.stderr,
)
time.sleep(6)
continue
print(f"GDELT error: {e}", file=sys.stderr)
payload = {}
break
except Exception as e: # noqa: BLE001
print(f"GDELT error: {e}", file=sys.stderr)
payload = {}
break
rows: list[dict[str, str]] = []
if isinstance(payload, dict):
articles = payload.get("articles", []) or []
for a in articles[:limit]:
seen = (a.get("seendate") or "")
# GDELT format: 20260319T083000Z → 2026-03-19 08:30:00Z
if len(seen) == 16 and "T" in seen:
seen = f"{seen[0:4]}-{seen[4:6]}-{seen[6:8]} {seen[9:11]}:{seen[11:13]}:{seen[13:15]}Z"
rows.append(
{
"title": (a.get("title") or "").replace("\n", " ").strip(),
"url": a.get("url") or "",
"seen_date": seen,
"domain": a.get("domain") or "",
"language": a.get("language") or "",
"source_country": a.get("sourcecountry") or "",
"tone": str(a.get("tone") or ""),
"social_image": a.get("socialimage") or "",
}
)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
if not rows:
print(
f"GDELT: 0 articles for query={query!r}. "
"GDELT indexes ~2015→present. Try widening the timespan or "
"checking the query syntax (https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/).",
file=sys.stderr,
)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--query", required=True, help='Search query (supports GDELT operators: quoted phrases, AND/OR/NOT, sourcecountry:, theme:)')
p.add_argument(
"--mode",
default="ArtList",
choices=["ArtList", "ImageCollage", "TimelineVol", "TimelineTone", "ToneChart"],
help="GDELT mode (default ArtList for article list)",
)
p.add_argument(
"--timespan",
help="Relative window: e.g. '1d', '1w', '1m', '3m', '1y' (overrides start/end)",
)
p.add_argument("--start", help="Absolute start YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS")
p.add_argument("--end", help="Absolute end YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS")
p.add_argument("--source-country", help="2-letter source country (e.g. US, UK)")
p.add_argument("--source-lang", help="Source language (e.g. English, Spanish)")
p.add_argument("--limit", type=int, default=100)
p.add_argument("--out", required=True)
a = p.parse_args()
n = fetch(
query=a.query,
mode=a.mode,
timespan=a.timespan,
start_datetime=a.start,
end_datetime=a.end,
source_country=a.source_country,
source_lang=a.source_lang,
limit=a.limit,
out_path=a.out,
)
print(f"Wrote {n} GDELT article rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
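
Two small string transformations in this script are easy to get subtly wrong, so here they are in isolation. The helper names are hypothetical; `compact_datetime` also strips a literal `T`, which is an assumption layered on the separator formats named in the `--start` / `--end` help text.

```python
def format_seendate(seen: str) -> str:
    """Convert GDELT's 20260319T083000Z form to 2026-03-19 08:30:00Z."""
    if len(seen) == 16 and "T" in seen:
        return (
            f"{seen[0:4]}-{seen[4:6]}-{seen[6:8]} "
            f"{seen[9:11]}:{seen[11:13]}:{seen[13:15]}Z"
        )
    return seen  # unexpected shapes pass through unchanged


def compact_datetime(value: str) -> str:
    """Reduce YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS input to digits only."""
    for sep in ("-", ":", "T", " "):
        value = value.replace(sep, "")
    return value


print(format_seendate("20260319T083000Z"))
print(compact_datetime("2024-01-01T08:30:00"))
print(compact_datetime("2024-01-01"))
```

A date-only input produces eight digits, so callers that need a full timestamp have to supply the time part themselves.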

@ -0,0 +1,234 @@
#!/usr/bin/env python3
"""Search ICIJ Offshore Leaks via the bulk CSV database.
The old reconcile endpoint (https://offshoreleaks.icij.org/reconcile) returns
404; ICIJ has removed it. The remaining stable access path is the public
bulk download:
https://offshoreleaks-data.icij.org/offshoreleaks/csv/full-oldb.LATEST.zip
~70 MB, ~6 CSVs inside (nodes-entities, nodes-officers, nodes-intermediaries,
nodes-addresses, relationships, ...). We cache it under
$HERMES_OSINT_CACHE/icij/ (default: ~/.cache/hermes-osint/icij/) and search
locally so the agent doesn't re-download for every query.
Output CSV columns match the original `fetch_icij_offshore.py` contract.
"""
from __future__ import annotations
import argparse
import csv
import io
import os
import re
import sys
import time
import urllib.request
import zipfile
from pathlib import Path
BULK_URL = "https://offshoreleaks-data.icij.org/offshoreleaks/csv/full-oldb.LATEST.zip"
COLUMNS = [
"node_id",
"name",
"node_type",
"country_codes",
"countries",
"jurisdiction",
"incorporation_date",
"inactivation_date",
"source",
"entity_url",
"connections",
]
def _cache_dir() -> Path:
base = os.environ.get("HERMES_OSINT_CACHE")
if base:
return Path(base) / "icij"
return Path.home() / ".cache" / "hermes-osint" / "icij"
def _download(dest: Path, force: bool = False) -> Path:
"""Download (or reuse cached) ICIJ bulk ZIP."""
dest.mkdir(parents=True, exist_ok=True)
zip_path = dest / "full-oldb.zip"
if zip_path.exists() and not force:
# Re-check age: refetch if older than 30 days.
age_days = (time.time() - zip_path.stat().st_mtime) / 86400
if age_days < 30:
return zip_path
print(f"Downloading ICIJ bulk database (~70 MB) to {zip_path}", file=sys.stderr)
req = urllib.request.Request(
BULK_URL,
headers={"User-Agent": "hermes-agent osint-investigation skill"},
)
with urllib.request.urlopen(req, timeout=120) as resp: # noqa: S310
tmp = zip_path.with_suffix(".zip.tmp")
with open(tmp, "wb") as fh:
while True:
chunk = resp.read(1 << 16)
if not chunk:
break
fh.write(chunk)
tmp.replace(zip_path)
return zip_path
def _open_csv(zf: zipfile.ZipFile, name_pattern: str):
"""Open the first CSV matching name_pattern (case-insensitive substring)."""
for info in zf.infolist():
if name_pattern.lower() in info.filename.lower() and info.filename.lower().endswith(".csv"):
return zf.open(info), info.filename
return None, None
def _match(needle_norm: str, hay: str) -> bool:
return needle_norm in (hay or "").upper()
def _normalize_query(s: str) -> str:
s = s.upper()
s = re.sub(r"[^\w\s]", " ", s)
s = re.sub(r"\s+", " ", s).strip()
return s
def fetch(
entity: str | None,
officer: str | None,
jurisdiction: str | None,
out_path: str,
cache_dir: Path,
force_refresh: bool = False,
limit: int = 500,
) -> int:
zip_path = _download(cache_dir, force=force_refresh)
rows: list[dict[str, str]] = []
needles: list[tuple[str, str]] = [] # (kind, normalized needle)
if entity:
needles.append(("Entity", _normalize_query(entity)))
if officer:
needles.append(("Officer", _normalize_query(officer)))
jur_norm = _normalize_query(jurisdiction) if jurisdiction else None
targets = [
("Entity", "nodes-entities"),
("Officer", "nodes-officers"),
("Intermediary", "nodes-intermediaries"),
]
with zipfile.ZipFile(zip_path) as zf:
for node_type, csv_substring in targets:
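            # Needles whose kind matches this node type (e.g. --entity for
            # nodes-entities); these are preferred when present.
            applicable_needles = [n for (k, n) in needles if k == node_type]
            # Fallback: every supplied needle. Used when no needle targets this
            # node type but a jurisdiction filter keeps the CSV in scope.
            relevant_needles = [n for (_, n) in needles]
            # Skip this CSV only when name needles were supplied, none targets
            # this node type, and there is no jurisdiction filter.
            if needles and not applicable_needles and not jur_norm:
                continue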
stream, fname = _open_csv(zf, csv_substring)
if not stream:
continue
with stream:
text = io.TextIOWrapper(stream, encoding="utf-8", errors="replace")
reader = csv.DictReader(text)
for row in reader:
name = (row.get("name") or "").strip()
if not name:
continue
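                    # Normalize the candidate the same way as the query so that
                    # punctuation differences do not block a match.
                    name_u = _normalize_query(name)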
matched = False
for n in applicable_needles or relevant_needles:
if _match(n, name_u):
matched = True
break
if not needles:
matched = True # jurisdiction-only sweep
if not matched:
continue
jur = (row.get("jurisdiction_description") or row.get("country_codes") or "").strip()
if jur_norm and jur_norm not in jur.upper() and jur_norm not in (row.get("countries") or "").upper():
continue
node_id = (row.get("node_id") or "").strip()
rows.append(
{
"node_id": node_id,
"name": name,
"node_type": node_type,
"country_codes": row.get("country_codes", "") or "",
"countries": row.get("countries", "") or "",
"jurisdiction": jur,
"incorporation_date": row.get("incorporation_date", "") or "",
"inactivation_date": row.get("inactivation_date", "") or "",
"source": row.get("sourceID", "") or row.get("source", "") or "",
"entity_url": (
f"https://offshoreleaks.icij.org/nodes/{node_id}" if node_id else ""
),
"connections": "",
}
)
if len(rows) >= limit:
break
if len(rows) >= limit:
break
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
if not rows:
bits = []
if entity:
bits.append(f"entity={entity!r}")
if officer:
bits.append(f"officer={officer!r}")
if jurisdiction:
bits.append(f"jurisdiction={jurisdiction!r}")
print(
f"ICIJ: 0 matches for {', '.join(bits)}. "
"The bulk database covers offshore leaks (Panama, Paradise, Pandora, "
"Bahamas, Offshore Leaks). Most private US individuals are NOT in it.",
file=sys.stderr,
)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--entity", help="Search by entity name (substring, case-insensitive)")
p.add_argument("--officer", help="Search by officer / individual name (substring, case-insensitive)")
p.add_argument("--jurisdiction", help="Filter results by jurisdiction substring")
p.add_argument("--limit", type=int, default=500)
p.add_argument("--out", required=True)
p.add_argument(
"--cache-dir",
type=Path,
default=None,
help="Override cache directory (default: $HERMES_OSINT_CACHE/icij or ~/.cache/hermes-osint/icij)",
)
p.add_argument(
"--force-refresh",
action="store_true",
help="Re-download the bulk ZIP even if a recent cached copy exists.",
)
a = p.parse_args()
if not (a.entity or a.officer or a.jurisdiction):
p.error("must supply at least one of --entity / --officer / --jurisdiction")
n = fetch(
entity=a.entity,
officer=a.officer,
jurisdiction=a.jurisdiction,
out_path=a.out,
cache_dir=a.cache_dir or _cache_dir(),
force_refresh=a.force_refresh,
limit=a.limit,
)
print(f"Wrote {n} ICIJ Offshore Leaks rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
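
One way to make a name substring match robust to punctuation is to normalize both the query and the candidate before comparing. The following is a minimal sketch under that assumption; `normalize` and `name_matches` are hypothetical helper names used for illustration and are not part of the script above.

```python
import re


def normalize(text: str) -> str:
    """Uppercase, replace punctuation with spaces, collapse whitespace."""
    text = text.upper()
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def name_matches(query: str, candidate: str) -> bool:
    """Substring test on the normalized forms of both strings."""
    needle = normalize(query)
    # An empty needle would match everything, so treat it as no match.
    return bool(needle) and needle in normalize(candidate or "")
```

With this approach a query such as `mossack fonseca` matches a record named `Mossack-Fonseca & Co.`, because the hyphen, ampersand, and period are replaced by spaces on both sides.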

@ -0,0 +1,203 @@
#!/usr/bin/env python3
"""Search NYC property records via ACRIS (Automated City Register Information System).
Uses the city's Socrata-backed open data API. No auth required for read access.
Datasets:
  bnx9-e6tj   Real Property Master      (one row per recorded document)
  636b-3b5g   Real Property Parties     (names: grantor, grantee, etc.)
  8h5j-fqxa   Real Property Legal       (lot / property identifiers)
  uqqa-hym2   Real Property References
The Parties dataset has the names. We search by name and optionally join to
Master to get the doc type and date.
"""
from __future__ import annotations
import argparse
import csv
import sys
import urllib.parse
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from _http import get_json # noqa: E402
PARTIES_URL = "https://data.cityofnewyork.us/resource/636b-3b5g.json"
MASTER_URL = "https://data.cityofnewyork.us/resource/bnx9-e6tj.json"
PARTY_TYPE = {
"1": "grantor (seller / mortgagor / debtor)",
"2": "grantee (buyer / mortgagee / creditor)",
"3": "other party",
}
BOROUGH = {
"1": "Manhattan",
"2": "Bronx",
"3": "Brooklyn",
"4": "Queens",
"5": "Staten Island",
}
COLUMNS = [
"document_id",
"name",
"party_type",
"party_role",
"address_1",
"address_2",
"city",
"state",
"zip",
"country",
"doc_type",
"doc_date",
"recorded_date",
"borough",
"amount",
"filing_url",
]
def _filing_url(document_id: str) -> str:
if not document_id:
return ""
return (
f"https://a836-acris.nyc.gov/DS/DocumentSearch/DocumentImageView?doc_id={document_id}"
)
def fetch(
name: str | None,
address: str | None,
party_type: str | None,
limit: int,
out_path: str,
enrich: bool = True,
) -> int:
if not (name or address):
raise SystemExit("must supply --name or --address")
where_clauses: list[str] = []
if name:
safe = name.upper().replace("'", "''")
where_clauses.append(f"upper(name) like '%{safe}%'")
if address:
safe_addr = address.upper().replace("'", "''")
where_clauses.append(f"upper(address_1) like '%{safe_addr}%'")
if party_type and party_type in {"1", "2", "3"}:
where_clauses.append(f"party_type='{party_type}'")
params = {
"$where": " AND ".join(where_clauses),
"$limit": str(limit),
}
url = f"{PARTIES_URL}?{urllib.parse.urlencode(params)}"
parties = get_json(url)
if not isinstance(parties, list):
raise SystemExit(f"Unexpected ACRIS response: {parties!r}")
# Enrich with master record (doc_type, dates, borough, amount).
doc_ids: list[str] = sorted({
d for d in (p.get("document_id") for p in parties) if d
})
masters: dict[str, dict] = {}
if enrich and doc_ids:
# Batch up to 100 doc_ids per request (Socrata IN-list is fine for this).
for i in range(0, len(doc_ids), 100):
chunk = doc_ids[i : i + 100]
id_list = ",".join(f"'{d}'" for d in chunk)
master_params = {
"$where": f"document_id in ({id_list})",
"$limit": "100",
}
url = f"{MASTER_URL}?{urllib.parse.urlencode(master_params)}"
try:
rows = get_json(url)
except Exception as e: # noqa: BLE001
print(f"ACRIS master lookup failed for chunk: {e}", file=sys.stderr)
continue
if isinstance(rows, list):
for r in rows:
did = r.get("document_id", "")
if did:
masters[did] = r
out_rows: list[dict[str, str]] = []
for p in parties:
did = p.get("document_id", "") or ""
m = masters.get(did, {})
out_rows.append(
{
"document_id": did,
"name": p.get("name", "") or "",
"party_type": p.get("party_type", "") or "",
"party_role": PARTY_TYPE.get(p.get("party_type", ""), ""),
"address_1": p.get("address_1", "") or "",
"address_2": p.get("address_2", "") or "",
"city": p.get("city", "") or "",
"state": p.get("state", "") or "",
"zip": p.get("zip", "") or "",
"country": p.get("country", "") or "",
"doc_type": m.get("doc_type", "") or "",
"doc_date": (m.get("document_date", "") or "")[:10],
"recorded_date": (m.get("recorded_datetime", "") or "")[:10],
"borough": BOROUGH.get(m.get("recorded_borough", ""), m.get("recorded_borough", "")),
"amount": m.get("document_amt", "") or "",
"filing_url": _filing_url(did),
}
)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(out_rows)
if not out_rows:
filters = []
if name:
filters.append(f"name={name!r}")
if address:
filters.append(f"address={address!r}")
print(
f"NYC ACRIS: 0 records for {', '.join(filters)}. "
"ACRIS covers ONLY NYC (5 boroughs). For property records elsewhere, "
"search the relevant county recorder directly.",
file=sys.stderr,
)
return len(out_rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--name", help="Party name substring (case-insensitive)")
p.add_argument("--address", help="Address line 1 substring")
p.add_argument(
"--party-type",
choices=["1", "2", "3"],
help="Filter party type: 1=grantor (seller/mortgagor), 2=grantee (buyer/mortgagee), 3=other",
)
p.add_argument("--limit", type=int, default=200)
p.add_argument(
"--no-enrich",
action="store_true",
help="Skip the master-document lookup that adds doc_type/date/amount",
)
p.add_argument("--out", required=True)
a = p.parse_args()
n = fetch(
name=a.name,
address=a.address,
party_type=a.party_type,
limit=a.limit,
out_path=a.out,
enrich=not a.no_enrich,
)
print(f"Wrote {n} NYC ACRIS rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
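
The query construction in this script (quote doubling, a case-insensitive `like` filter, and batched `in` lists) can be sketched in isolation. This is an illustrative sketch only; `soql_quote`, `like_clause`, `in_clauses`, and `query_string` are hypothetical names, and the batch size of 100 simply mirrors the value used above.

```python
import urllib.parse


def soql_quote(value: str) -> str:
    """Quote a string literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def like_clause(column: str, needle: str) -> str:
    """Case-insensitive substring filter on one column."""
    return f"upper({column}) like {soql_quote('%' + needle.upper() + '%')}"


def in_clauses(column: str, values: list[str], size: int = 100) -> list[str]:
    """Split values into IN-list filters of at most `size` items each."""
    return [
        f"{column} in ({','.join(soql_quote(v) for v in values[i:i + size])})"
        for i in range(0, len(values), size)
    ]


def query_string(where: list[str], limit: int) -> str:
    """URL-encode the $where and $limit parameters."""
    return urllib.parse.urlencode(
        {"$where": " AND ".join(where), "$limit": str(limit)}
    )
```

Note that `%` and `_` inside the user-supplied needle are still treated as `like` wildcards in this sketch, as they are in the script.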

@ -0,0 +1,175 @@
#!/usr/bin/env python3
"""Fetch OFAC SDN list (CSV format) and normalize.
Public endpoint: https://www.treasury.gov/ofac/downloads/sdn.csv
Format reference: https://ofac.treasury.gov/specially-designated-nationals-and-blocked-persons-list-sdn-human-readable-lists
The SDN CSV uses a specific 12-column format with no header row:
ent_num, sdn_name, sdn_type, program, title, call_sign, vess_type,
tonnage, grt, vess_flag, vess_owner, remarks
Address and AKA records live in separate files. We fetch all three and join.
"""
from __future__ import annotations
import argparse
import csv
import io
import sys
from collections import defaultdict
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from _http import get # noqa: E402
SDN_URL = "https://www.treasury.gov/ofac/downloads/sdn.csv"
ADD_URL = "https://www.treasury.gov/ofac/downloads/add.csv"
ALT_URL = "https://www.treasury.gov/ofac/downloads/alt.csv"
SDN_COLS = [
"ent_num", "sdn_name", "sdn_type", "program", "title",
"call_sign", "vess_type", "tonnage", "grt", "vess_flag",
"vess_owner", "remarks",
]
ADD_COLS = [
"ent_num", "add_num", "address", "city_state_zip", "country", "add_remarks",
]
ALT_COLS = [
"ent_num", "alt_num", "alt_type", "alt_name", "alt_remarks",
]
COLUMNS = [
"entity_id",
"name",
"entity_type",
"program_list",
"title",
"nationalities",
"aka_list",
"addresses",
"dob",
"pob",
"remarks",
"last_updated",
]
_TYPE_MAP = {
"individual": "individual",
"entity": "entity",
"vessel": "vessel",
"aircraft": "aircraft",
}
def _read_csv(url: str, columns: list[str]) -> list[dict[str, str]]:
body = get(url, timeout=60).decode("latin-1", errors="replace")
reader = csv.reader(io.StringIO(body))
out = []
for row in reader:
if not row:
continue
# Pad/truncate to expected width.
row = row[: len(columns)] + [""] * (len(columns) - len(row))
out.append(dict(zip(columns, row)))
return out
def _strip_quotes(s: str) -> str:
s = s.strip()
if s.startswith('"') and s.endswith('"'):
s = s[1:-1]
if s == "-0-":
return ""
return s
def fetch(
program: str | None,
entity_type: str | None,
out_path: str,
) -> int:
sdn = _read_csv(SDN_URL, SDN_COLS)
addresses = _read_csv(ADD_URL, ADD_COLS)
akas = _read_csv(ALT_URL, ALT_COLS)
addr_by_ent: dict[str, list[str]] = defaultdict(list)
for a in addresses:
ent = _strip_quotes(a["ent_num"])
parts = [
_strip_quotes(a[c])
for c in ("address", "city_state_zip", "country")
if _strip_quotes(a[c])
]
if parts:
addr_by_ent[ent].append(", ".join(parts))
aka_by_ent: dict[str, list[str]] = defaultdict(list)
for k in akas:
ent = _strip_quotes(k["ent_num"])
name = _strip_quotes(k["alt_name"])
if name:
aka_by_ent[ent].append(name)
rows: list[dict[str, str]] = []
for r in sdn:
ent_num = _strip_quotes(r["ent_num"])
if not ent_num:
continue
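        raw_type = _strip_quotes(r["sdn_type"])
        # Assumption: the legacy CSV leaves sdn_type as the "-0-" null
        # placeholder for records that are not individuals, vessels, or
        # aircraft, so an empty value is treated as "entity".
        sdn_type = _TYPE_MAP.get(raw_type.lower(), raw_type) or "entity"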
if entity_type and sdn_type != entity_type:
continue
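        progs = _strip_quotes(r["program"])
        # Compare against stripped program codes so "SDGT; IRAN" matches IRAN.
        prog_codes = {p.strip() for p in progs.upper().split(";")}
        if program and program.strip().upper() not in prog_codes:
            continue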
remarks = _strip_quotes(r["remarks"])
# DOB / POB are commonly embedded in remarks for individuals.
dob = ""
pob = ""
if sdn_type == "individual" and remarks:
for chunk in remarks.split(";"):
ch = chunk.strip()
if ch.upper().startswith("DOB"):
dob = ch.split(maxsplit=1)[1] if " " in ch else ""
elif ch.upper().startswith("POB"):
pob = ch.split(maxsplit=1)[1] if " " in ch else ""
rows.append(
{
"entity_id": ent_num,
"name": _strip_quotes(r["sdn_name"]),
"entity_type": sdn_type,
"program_list": "; ".join(p.strip() for p in progs.split(";") if p.strip()),
"title": _strip_quotes(r["title"]),
"nationalities": "", # not in this CSV; available in XML format
"aka_list": "; ".join(aka_by_ent.get(ent_num, [])),
"addresses": "; ".join(addr_by_ent.get(ent_num, [])),
"dob": dob,
"pob": pob,
"remarks": remarks,
"last_updated": "",
}
)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__)
p.add_argument("--program", help="Filter to specific sanctions program (e.g. SDGT, IRAN)")
p.add_argument(
"--entity-type",
choices=["individual", "entity", "vessel", "aircraft"],
help="Filter to a specific entity type",
)
p.add_argument("--out", required=True)
a = p.parse_args()
n = fetch(program=a.program, entity_type=a.entity_type, out_path=a.out)
print(f"Wrote {n} OFAC SDN rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
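
The DOB / POB extraction from the semicolon-separated remarks field can be sketched as a standalone function. This is a slight variation rather than the script's exact logic: it requires the label to be a whole word and keeps the first value seen. `parse_dob_pob` is a hypothetical name, and the sample remarks string is made up for illustration.

```python
def parse_dob_pob(remarks: str) -> tuple[str, str]:
    """Return (dob, pob) parsed from a ';'-separated remarks string."""
    dob = pob = ""
    for chunk in remarks.split(";"):
        # Split "DOB 01 Jan 1970" into the label and the rest of the chunk.
        label, _, value = chunk.strip().partition(" ")
        if label.upper() == "DOB" and not dob:
            dob = value.strip()
        elif label.upper() == "POB" and not pob:
            pob = value.strip()
    return dob, pob


if __name__ == "__main__":
    print(parse_dob_pob("DOB 01 Jan 1970; POB Example City; nationality Example"))
```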

@ -0,0 +1,192 @@
#!/usr/bin/env python3
"""Search OpenCorporates company registry data.
OpenCorporates aggregates ~200M companies from 130+ jurisdictions. The
public API requires an API token (free tier: 500 calls/month). Set
OPENCORPORATES_API_TOKEN in env or pass --token.
Without a token, this script falls back to scraping the public HTML
search page (limited fields, more brittle, no jurisdiction filter).
"""
from __future__ import annotations
import argparse
import csv
import json
import os
import re
import sys
import urllib.parse
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from _http import get, get_json # noqa: E402
API_URL = "https://api.opencorporates.com/v0.4/companies/search"
HTML_URL = "https://opencorporates.com/companies"
COLUMNS = [
"name",
"company_number",
"jurisdiction_code",
"jurisdiction_name",
"incorporation_date",
"dissolution_date",
"company_type",
"status",
"registered_address",
"opencorporates_url",
"officers_count",
"source",
]
def _via_api(query: str, jurisdiction: str | None, token: str, limit: int) -> list[dict]:
params = {
"q": query,
"api_token": token,
"per_page": str(min(limit, 100)),
}
if jurisdiction:
params["jurisdiction_code"] = jurisdiction
url = f"{API_URL}?{urllib.parse.urlencode(params)}"
payload = get_json(url)
if not isinstance(payload, dict):
return []
results = payload.get("results", {}).get("companies", []) or []
return [r.get("company", {}) for r in results if isinstance(r, dict)]
def _via_html(query: str, limit: int) -> list[dict]:
"""Best-effort HTML fallback when no API token is available."""
params = {"q": query, "utf8": ""}
url = f"{HTML_URL}?{urllib.parse.urlencode(params)}"
body = get(url, user_agent="Mozilla/5.0 hermes-osint").decode("utf-8", errors="replace")
# Each result is in <li class="company"> ... </li> with name, url, status
pattern = re.compile(
r'<li[^>]*class="[^"]*company[^"]*"[^>]*>.*?'
r'<a[^>]+href="(?P<url>/companies/[^"]+)"[^>]*>(?P<name>[^<]+)</a>'
r'(?:.*?<span[^>]*class="[^"]*jurisdiction[^"]*"[^>]*>(?P<jur>[^<]+)</span>)?'
r"(?:.*?<dt[^>]*>(?:Company\s+Number|Number)</dt>\s*<dd[^>]*>(?P<num>[^<]+)</dd>)?",
re.DOTALL | re.IGNORECASE,
)
out = []
for m in pattern.finditer(body):
if len(out) >= limit:
break
url_path = m.group("url").strip()
out.append(
{
"name": (m.group("name") or "").strip(),
"opencorporates_url": f"https://opencorporates.com{url_path}",
"jurisdiction_code": (m.group("jur") or "").strip(),
"company_number": (m.group("num") or "").strip(),
"_via": "html",
}
)
return out
def fetch(
query: str,
jurisdiction: str | None,
token: str | None,
limit: int,
out_path: str,
) -> int:
if token:
try:
companies = _via_api(query, jurisdiction, token, limit)
source_tag = "api"
except Exception as e: # noqa: BLE001
print(
f"OpenCorporates API call failed ({e}); falling back to HTML.",
file=sys.stderr,
)
companies = _via_html(query, limit)
source_tag = "html-fallback"
else:
print(
"OPENCORPORATES_API_TOKEN not set — using HTML fallback (limited fields). "
"Get a free token at https://opencorporates.com/api_accounts/new",
file=sys.stderr,
)
companies = _via_html(query, limit)
source_tag = "html"
rows: list[dict[str, str]] = []
for c in companies[:limit]:
if c.get("_via") == "html":
rows.append(
{
"name": c.get("name", ""),
"company_number": c.get("company_number", ""),
"jurisdiction_code": c.get("jurisdiction_code", ""),
"jurisdiction_name": "",
"incorporation_date": "",
"dissolution_date": "",
"company_type": "",
"status": "",
"registered_address": "",
"opencorporates_url": c.get("opencorporates_url", ""),
"officers_count": "",
"source": source_tag,
}
)
continue
addr = c.get("registered_address_in_full") or ""
rows.append(
{
"name": c.get("name", "") or "",
"company_number": c.get("company_number", "") or "",
"jurisdiction_code": c.get("jurisdiction_code", "") or "",
"jurisdiction_name": "",
"incorporation_date": c.get("incorporation_date", "") or "",
"dissolution_date": c.get("dissolution_date", "") or "",
"company_type": c.get("company_type", "") or "",
"status": c.get("current_status", "") or c.get("inactive", "") or "",
"registered_address": addr,
"opencorporates_url": c.get("opencorporates_url", "") or "",
"officers_count": str(c.get("officers", {}).get("total_count", "") if c.get("officers") else ""),
"source": source_tag,
}
)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
if not rows:
print(
f"OpenCorporates: 0 matches for query={query!r}"
f"{f' jurisdiction={jurisdiction!r}' if jurisdiction else ''}.",
file=sys.stderr,
)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--query", required=True, help="Company name search")
p.add_argument(
"--jurisdiction",
help="Jurisdiction code, e.g. 'us_ny', 'us_de', 'gb', 'sg' (lowercased OpenCorporates style)",
)
p.add_argument("--limit", type=int, default=50)
p.add_argument("--token", default=os.environ.get("OPENCORPORATES_API_TOKEN"))
p.add_argument("--out", required=True)
a = p.parse_args()
n = fetch(
query=a.query,
jurisdiction=a.jurisdiction,
token=a.token,
limit=a.limit,
out_path=a.out,
)
print(f"Wrote {n} OpenCorporates rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())

@ -0,0 +1,184 @@
#!/usr/bin/env python3
"""Fetch SEC EDGAR filings index for a given CIK or company name.
SEC requires a User-Agent header with contact info. Set SEC_USER_AGENT,
e.g. SEC_USER_AGENT="Research example@example.com".
Filings JSON is published at:
https://data.sec.gov/submissions/CIK<10-digit-padded>.json
Company lookup uses:
https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&company=<name>&output=atom
"""
from __future__ import annotations
import argparse
import csv
import os
import re
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from _http import get, get_json # noqa: E402
SUBMISSIONS_URL = "https://data.sec.gov/submissions/CIK{cik}.json"
COLUMNS = [
"cik",
"company_name",
"form_type",
"filing_date",
"accession_number",
"primary_document",
"filing_url",
"reporting_period",
]
def _ua() -> str:
ua = os.environ.get("SEC_USER_AGENT", "").strip()
if not ua:
raise SystemExit(
"SEC requires a User-Agent with contact info. "
"Set SEC_USER_AGENT='Your Name your@email'."
)
return ua
def _resolve_cik(company: str) -> tuple[str, str]:
    """Resolve a company name to a CIK via EDGAR's Atom feed.

    Returns (cik, resolved_company_name). The match may be an individual
    filer (Forms 3/4/5 only) rather than a company; that is not surfaced in
    the return value, so fetch() checks the filing types and warns.
    """
url = "https://www.sec.gov/cgi-bin/browse-edgar"
params = {"action": "getcompany", "company": company, "output": "atom", "owner": "include"}
body = get(url, params=params, user_agent=_ua()).decode("utf-8", errors="replace")
m = re.search(r"CIK=(\d{10})", body)
if not m:
raise SystemExit(f"Could not resolve CIK for company={company!r}")
cik = m.group(1)
name_m = re.search(r"<title>([^<]+)\s*\((\d{10})\)</title>", body)
resolved = name_m.group(1).strip() if name_m else ""
return cik, resolved
def fetch(
cik: str | None,
company: str | None,
types: list[str],
since: str | None,
out_path: str,
) -> int:
resolved_name = ""
if not cik and company:
try:
cik, resolved_name = _resolve_cik(company) # type: ignore[assignment]
except SystemExit as e:
# Write empty CSV with header so downstream tools still work,
# and tell the user clearly.
print(f"SEC EDGAR: {e}", file=sys.stderr)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
csv.DictWriter(fh, fieldnames=COLUMNS).writeheader()
return 0
if resolved_name:
print(
f"Resolved company={company!r} → CIK {cik} ({resolved_name})",
file=sys.stderr,
)
if not cik:
raise SystemExit("must supply --cik or --company")
cik = cik.zfill(10)
url = SUBMISSIONS_URL.format(cik=cik)
payload = get_json(url, user_agent=_ua())
if not isinstance(payload, dict):
raise SystemExit(f"Unexpected EDGAR response shape for CIK {cik}")
name = payload.get("name", "")
recent = (payload.get("filings", {}) or {}).get("recent", {}) or {}
form = recent.get("form", [])
date = recent.get("filingDate", [])
accession = recent.get("accessionNumber", [])
primary_doc = recent.get("primaryDocument", [])
period = recent.get("reportDate", [])
# Histogram of available filing types — useful for surfacing why a filter
# returned 0 (e.g. user asked for 10-K on an individual Form 4 filer).
type_hist: dict[str, int] = {}
for ftype in form:
type_hist[ftype] = type_hist.get(ftype, 0) + 1
type_set = {t.strip().upper() for t in types} if types else None
rows: list[dict[str, str]] = []
for i, ftype in enumerate(form):
if type_set and ftype.upper() not in type_set:
continue
fdate = date[i] if i < len(date) else ""
if since and fdate and fdate < since:
continue
acc = accession[i] if i < len(accession) else ""
pdoc = primary_doc[i] if i < len(primary_doc) else ""
acc_nodash = acc.replace("-", "")
filing_url = (
f"https://www.sec.gov/Archives/edgar/data/{int(cik)}/{acc_nodash}/{pdoc}"
if acc and pdoc
else ""
)
rows.append(
{
"cik": cik,
"company_name": name,
"form_type": ftype,
"filing_date": fdate,
"accession_number": acc,
"primary_document": pdoc,
"filing_url": filing_url,
"reporting_period": period[i] if i < len(period) else "",
}
)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
if not rows and type_hist:
top = sorted(type_hist.items(), key=lambda kv: -kv[1])[:8]
hist_str = ", ".join(f"{t}={n}" for t, n in top)
print(
f"Warning: SEC EDGAR CIK {cik} ({name}) has {sum(type_hist.values())} "
f"recent filings but NONE match types={types}. "
f"Available form types: {hist_str}.",
file=sys.stderr,
)
# Insider-filer heuristic: only Form 3/4/5 → individual person, not a company.
company_types = {"10-K", "10-Q", "8-K", "20-F", "DEF 14A", "S-1"}
if not (set(type_hist.keys()) & company_types):
print(
f"Note: CIK {cik} appears to be an INDIVIDUAL filer "
f"(insider Form 3/4/5 only), not a corporate registrant. "
f"The resolver may have matched an officer/director named "
f"{company!r} rather than a company.",
file=sys.stderr,
)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__)
p.add_argument("--cik", help="Central Index Key (will be 10-digit zero-padded)")
p.add_argument("--company", help="Resolve to CIK by company name")
p.add_argument("--types", default="", help="Comma-separated form types (e.g. 10-K,10-Q,8-K)")
p.add_argument("--since", help="Skip filings before YYYY-MM-DD")
p.add_argument("--out", required=True)
a = p.parse_args()
types = [t for t in (a.types or "").split(",") if t.strip()]
n = fetch(cik=a.cik, company=a.company, types=types, since=a.since, out_path=a.out)
print(f"Wrote {n} EDGAR filing rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
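
The CIK padding and filing URL layout used above can be sketched on their own. `pad_cik` and `filing_url` are hypothetical helper names, and the CIK, accession number, and document name in the usage lines are made-up placeholders, not real filings.

```python
def pad_cik(cik: str) -> str:
    """Zero-pad a CIK to the 10 digits used in the submissions URL."""
    return str(cik).zfill(10)


def filing_url(cik: str, accession: str, primary_document: str) -> str:
    """Build the archive URL for a filing's primary document.

    The archive path uses the CIK without leading zeros and the accession
    number without dashes. Returns "" when either part is missing.
    """
    if not (accession and primary_document):
        return ""
    return (
        "https://www.sec.gov/Archives/edgar/data/"
        f"{int(cik)}/{accession.replace('-', '')}/{primary_document}"
    )


if __name__ == "__main__":
    print(pad_cik("12345"))
    print(filing_url("0000012345", "0000012345-24-000001", "report.htm"))
```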

@ -0,0 +1,146 @@
#!/usr/bin/env python3
"""Fetch Senate Lobbying Disclosure (LD-1 / LD-2) filings.
Anonymous: 120 req/hour. Token (SENATE_LDA_TOKEN): 1200 req/hour.
"""
from __future__ import annotations
import argparse
import csv
import os
import sys
import time
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from _http import get_json # noqa: E402
ENDPOINT = "https://lda.senate.gov/api/v1/filings/"
COLUMNS = [
"filing_uuid",
"filing_type",
"filing_year",
"filing_period",
"registrant_name",
"registrant_id",
"client_name",
"client_id",
"client_general_description",
"income",
"expenses",
"lobbyists",
"issues",
"government_entities",
"filing_date",
]
def fetch(
client: str | None,
registrant: str | None,
year: int,
token: str | None,
out_path: str,
page_size: int = 100,
max_pages: int = 25,
) -> int:
params: dict = {"filing_year": year, "page_size": page_size}
if client:
params["client_name"] = client
if registrant:
params["registrant_name"] = registrant
headers = {"Authorization": f"Token {token}"} if token else None
rows: list[dict[str, str]] = []
url = ENDPOINT
page = 0
while page < max_pages:
try:
payload = get_json(url, params=params if page == 0 else None, headers=headers)
except Exception as e: # noqa: BLE001
print(f"Senate LDA error on page {page + 1}: {e}", file=sys.stderr)
break
if not isinstance(payload, dict):
break
results = payload.get("results", [])
for r in results:
client_obj = r.get("client") or {}
registrant_obj = r.get("registrant") or {}
lobbying_activities = r.get("lobbying_activities") or []
lobbyists = []
issues = []
entities = []
for la in lobbying_activities:
for lob in la.get("lobbyists") or []:
lob_obj = lob.get("lobbyist") or {}
name = " ".join(
x for x in (lob_obj.get("first_name", ""), lob_obj.get("last_name", "")) if x
)
if name:
lobbyists.append(name)
desc = la.get("description") or ""
if desc:
issues.append(desc)
for ge in la.get("government_entities") or []:
nm = ge.get("name") or ""
if nm:
entities.append(nm)
rows.append(
{
"filing_uuid": r.get("filing_uuid", "") or "",
"filing_type": r.get("filing_type", "") or "",
"filing_year": str(r.get("filing_year", "") or year),
"filing_period": r.get("filing_period", "") or "",
"registrant_name": registrant_obj.get("name", "") or "",
"registrant_id": str(registrant_obj.get("id", "") or ""),
"client_name": client_obj.get("name", "") or "",
"client_id": str(client_obj.get("id", "") or ""),
"client_general_description": client_obj.get("general_description", "") or "",
"income": str(r.get("income", "") or ""),
"expenses": str(r.get("expenses", "") or ""),
"lobbyists": "; ".join(sorted(set(lobbyists))),
"issues": "; ".join(issues),
"government_entities": "; ".join(sorted(set(entities))),
"filing_date": (r.get("dt_posted") or "")[:10],
}
)
next_url = payload.get("next")
if not next_url:
break
url = next_url
page += 1
time.sleep(1.0 if not token else 0.3)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__)
p.add_argument("--client", help="Client name filter")
p.add_argument("--registrant", help="Registrant (lobbying firm) name filter")
p.add_argument("--year", type=int, default=2024)
p.add_argument("--token", default=os.environ.get("SENATE_LDA_TOKEN"))
p.add_argument("--max-pages", type=int, default=25)
p.add_argument("--out", required=True)
a = p.parse_args()
if not (a.client or a.registrant):
p.error("must supply at least one of --client / --registrant")
n = fetch(
client=a.client,
registrant=a.registrant,
year=a.year,
token=a.token,
out_path=a.out,
max_pages=a.max_pages,
)
print(f"Wrote {n} Senate LDA rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
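
The pagination pattern in this script (follow the `next` link until it is empty or a page cap is reached) can be sketched with the HTTP call injected as a function, which makes it testable without network access. `iter_results` and `fetch_page` are hypothetical names, and the sketch omits the per-page sleep the script uses for rate limiting.

```python
from typing import Callable, Iterator, Optional


def iter_results(
    first_url: str,
    fetch_page: Callable[[str], dict],
    max_pages: int = 25,
) -> Iterator[dict]:
    """Yield items from each page, following `next` links up to max_pages."""
    url: Optional[str] = first_url
    pages = 0
    while url and pages < max_pages:
        payload = fetch_page(url)
        yield from payload.get("results") or []
        # A missing or empty `next` value ends the loop.
        url = payload.get("next")
        pages += 1


if __name__ == "__main__":
    fake = {
        "p1": {"results": [{"id": 1}, {"id": 2}], "next": "p2"},
        "p2": {"results": [{"id": 3}], "next": None},
    }
    print(list(iter_results("p1", fake.__getitem__)))
```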

@ -0,0 +1,170 @@
#!/usr/bin/env python3
"""Fetch federal contracts/awards from USAspending.gov API v2.
No auth required. POST to /api/v2/search/spending_by_award/ with filters.
"""
from __future__ import annotations
import argparse
import csv
import json
import sys
import time
import urllib.request
from pathlib import Path
ENDPOINT = "https://api.usaspending.gov/api/v2/search/spending_by_award/"
COLUMNS = [
"award_id",
"recipient_name",
"recipient_uei",
"recipient_duns",
"recipient_parent_name",
"recipient_state",
"awarding_agency",
"awarding_sub_agency",
"award_type",
"award_amount",
"award_date",
"period_of_performance_start",
"period_of_performance_end",
"naics_code",
"psc_code",
"competition_extent",
"description",
]
# Result fields requested from USAspending; each result row is keyed by these labels.
_FIELDS = [
"Award ID",
"Recipient Name",
"Recipient UEI",
"Recipient DUNS Number",
"Recipient Parent Name",
"Recipient State Code",
"Awarding Agency",
"Awarding Sub Agency",
"Award Type",
"Award Amount",
"Start Date",
"End Date",
"NAICS Code",
"PSC Code",
"Type of Set Aside",
"Description",
]
def _post(body: dict) -> dict:
req = urllib.request.Request(
ENDPOINT,
data=json.dumps(body).encode("utf-8"),
headers={"Content-Type": "application/json", "User-Agent": "hermes-agent osint-investigation"},
method="POST",
)
with urllib.request.urlopen(req, timeout=60) as resp:
return json.loads(resp.read().decode("utf-8"))
def fetch(
recipient: str | None,
agency: str | None,
fy: int,
sole_source_only: bool,
out_path: str,
page_size: int = 100,
max_pages: int = 20,
) -> int:
filters: dict = {
"time_period": [{"start_date": f"{fy - 1}-10-01", "end_date": f"{fy}-09-30"}],
# Contracts only by default; adjust award_type_codes for grants/loans.
"award_type_codes": ["A", "B", "C", "D"],
}
if recipient:
filters["recipient_search_text"] = [recipient]
if agency:
filters["agencies"] = [{"type": "awarding", "tier": "toptier", "name": agency}]
rows: list[dict[str, str]] = []
page = 1
while page <= max_pages:
body = {
"filters": filters,
"fields": _FIELDS,
"page": page,
"limit": page_size,
"sort": "Award Amount",
"order": "desc",
}
try:
payload = _post(body)
except Exception as e: # noqa: BLE001
print(f"USAspending error on page {page}: {e}", file=sys.stderr)
break
results = payload.get("results", [])
if not results:
break
for r in results:
set_aside = r.get("Type of Set Aside", "") or ""
if sole_source_only and "sole" not in set_aside.lower():
continue
rows.append(
{
"award_id": r.get("Award ID", "") or "",
"recipient_name": r.get("Recipient Name", "") or "",
"recipient_uei": r.get("Recipient UEI", "") or "",
"recipient_duns": r.get("Recipient DUNS Number", "") or "",
"recipient_parent_name": r.get("Recipient Parent Name", "") or "",
"recipient_state": r.get("Recipient State Code", "") or "",
"awarding_agency": r.get("Awarding Agency", "") or "",
"awarding_sub_agency": r.get("Awarding Sub Agency", "") or "",
"award_type": r.get("Award Type", "") or "",
"award_amount": str(r.get("Award Amount", "") or ""),
"award_date": r.get("Start Date", "") or "",
"period_of_performance_start": r.get("Start Date", "") or "",
"period_of_performance_end": r.get("End Date", "") or "",
"naics_code": str(r.get("NAICS Code", "") or ""),
"psc_code": str(r.get("PSC Code", "") or ""),
"competition_extent": set_aside,
"description": r.get("Description", "") or "",
}
)
meta = payload.get("page_metadata", {})
if not meta.get("hasNext"):
break
page += 1
time.sleep(0.5)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__)
p.add_argument("--recipient", help="Recipient name search")
p.add_argument("--agency", help="Awarding agency (top-tier)")
p.add_argument("--fy", type=int, default=2024, help="Federal fiscal year")
p.add_argument("--sole-source-only", action="store_true")
p.add_argument("--max-pages", type=int, default=20)
p.add_argument("--out", required=True)
a = p.parse_args()
if not (a.recipient or a.agency):
p.error("must supply at least one of --recipient / --agency")
n = fetch(
recipient=a.recipient,
agency=a.agency,
fy=a.fy,
sole_source_only=a.sole_source_only,
out_path=a.out,
max_pages=a.max_pages,
)
print(f"Wrote {n} USAspending rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
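
The `time_period` filter in the script above depends on the US federal fiscal year running from October 1 of the previous calendar year through September 30. A minimal sketch of that mapping follows; `fiscal_year_window` is a hypothetical helper name used only for illustration, since the script inlines the same arithmetic.

```python
import datetime as dt


def fiscal_year_window(fy: int) -> tuple[dt.date, dt.date]:
    """Return the (start, end) dates of US federal fiscal year `fy`.

    Hypothetical helper: the fetch script builds the same two dates inline.
    """
    return dt.date(fy - 1, 10, 1), dt.date(fy, 9, 30)


start, end = fiscal_year_window(2024)
print(start.isoformat(), end.isoformat())  # 2023-10-01 2024-09-30
```

Keeping the window in one place makes it easier to spot off-by-one-year mistakes when a caller passes a calendar year where a fiscal year is expected.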

@ -0,0 +1,142 @@
#!/usr/bin/env python3
"""Search the Internet Archive Wayback Machine via the CDX server.
The CDX API indexes ~900B+ archived web pages. Anonymous read access,
no auth required. Useful for finding deleted / changed pages by URL,
domain, or substring match.
"""
from __future__ import annotations
import argparse
import csv
import sys
import urllib.parse
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from _http import get_json # noqa: E402
BASE = "https://web.archive.org/cdx/search/cdx"
COLUMNS = [
"url",
"timestamp",
"wayback_url",
"mimetype",
"status",
"digest",
"length",
]
def fetch(
url_or_host: str,
match_type: str,
from_date: str | None,
to_date: str | None,
status: str | None,
mime: str | None,
collapse: str | None,
limit: int,
out_path: str,
) -> int:
params: dict[str, str] = {
"url": url_or_host,
"matchType": match_type,
"output": "json",
"limit": str(limit),
}
if from_date:
params["from"] = from_date.replace("-", "")
if to_date:
params["to"] = to_date.replace("-", "")
# CDX accepts repeated filter params, so collect one entry per condition
# instead of overwriting a single "filter" key.
filters: list[str] = []
if status:
filters.append(f"statuscode:{status}")
if mime:
filters.append(f"mimetype:{mime}")
if collapse:
params["collapse"] = collapse
query = list(params.items()) + [("filter", f) for f in filters]
url = f"{BASE}?{urllib.parse.urlencode(query)}"
try:
payload = get_json(url)
except Exception as e: # noqa: BLE001
print(f"Wayback CDX error: {e}", file=sys.stderr)
payload = []
rows: list[dict[str, str]] = []
if isinstance(payload, list) and len(payload) > 1:
header = payload[0]
idx = {h: i for i, h in enumerate(header)}
for entry in payload[1:]:
ts = entry[idx["timestamp"]] if "timestamp" in idx else ""
orig = entry[idx["original"]] if "original" in idx else ""
rows.append(
{
"url": orig,
"timestamp": ts,
"wayback_url": f"https://web.archive.org/web/{ts}/{orig}" if ts and orig else "",
"mimetype": entry[idx["mimetype"]] if "mimetype" in idx else "",
"status": entry[idx["statuscode"]] if "statuscode" in idx else "",
"digest": entry[idx["digest"]] if "digest" in idx else "",
"length": entry[idx["length"]] if "length" in idx else "",
}
)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
if not rows:
print(
f"Wayback Machine: 0 captures for {url_or_host!r} matchType={match_type}.",
file=sys.stderr,
)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--url", required=True, help="URL or host to look up in the archive")
p.add_argument(
"--match",
default="exact",
choices=["exact", "prefix", "host", "domain"],
help=(
"exact: this URL only. "
"prefix: this URL's path-prefix. "
"host: any URL on this host. "
"domain: any URL on this domain or subdomains."
),
)
p.add_argument("--from-date", help="Earliest capture YYYY-MM-DD")
p.add_argument("--to-date", help="Latest capture YYYY-MM-DD")
p.add_argument("--status", help="HTTP status filter (e.g. 200)")
p.add_argument("--mime", help="MIME type filter (e.g. text/html)")
p.add_argument(
"--collapse",
help="Collapse adjacent identical entries (e.g. 'digest' for unique-content captures)",
)
p.add_argument("--limit", type=int, default=200)
p.add_argument("--out", required=True)
a = p.parse_args()
n = fetch(
url_or_host=a.url,
match_type=a.match,
from_date=a.from_date,
to_date=a.to_date,
status=a.status,
mime=a.mime,
collapse=a.collapse,
limit=a.limit,
out_path=a.out,
)
print(f"Wrote {n} Wayback capture rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
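
The comment in `fetch` notes that the CDX server accepts repeated `filter` parameters. One way to sketch that, assuming a list of key/value tuples is passed to `urllib.parse.urlencode` so the same key can appear more than once; `build_cdx_url` is a hypothetical name and is not part of the script.

```python
from __future__ import annotations

import urllib.parse

BASE = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(url: str, status: str | None = None, mime: str | None = None) -> str:
    """Build a CDX query URL with one `filter` parameter per condition.

    Hypothetical helper for illustration only.
    """
    # A list of tuples (rather than a dict) lets "filter" repeat.
    query: list[tuple[str, str]] = [("url", url), ("output", "json")]
    if status:
        query.append(("filter", f"statuscode:{status}"))
    if mime:
        query.append(("filter", f"mimetype:{mime}"))
    return f"{BASE}?{urllib.parse.urlencode(query)}"


print(build_cdx_url("example.com", status="200", mime="text/html"))
```

With a plain dict, a second assignment to the `filter` key would silently replace the first, so only one of the two conditions would reach the server.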

@ -0,0 +1,267 @@
#!/usr/bin/env python3
"""Search Wikipedia + Wikidata for an entity (person, company, place, concept).
Two free APIs:
- Wikipedia OpenSearch + REST summary endpoint for narrative bio
- Wikidata Action API (wbgetentities) for structured facts (birth date and place, occupation, employer, etc.)
Both are anonymous-access. Useful for resolving who-is-this-entity questions
and surfacing cross-references that other sources can join against.
"""
from __future__ import annotations
import argparse
import csv
import json
import re
import sys
import urllib.parse
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from _http import get_json # noqa: E402
WP_OPENSEARCH = "https://en.wikipedia.org/w/api.php"
WP_SUMMARY = "https://en.wikipedia.org/api/rest_v1/page/summary/"
WD_ACTION = "https://www.wikidata.org/w/api.php"
COLUMNS = [
"source",
"label",
"description",
"qid",
"wikipedia_title",
"wikipedia_url",
"wikidata_url",
"instance_of",
"country",
"occupation",
"employer",
"date_of_birth",
"place_of_birth",
"summary",
]
def _wp_search(query: str, limit: int) -> list[dict]:
params = {
"action": "opensearch",
"search": query,
"limit": str(min(limit, 20)),
"format": "json",
}
url = f"{WP_OPENSEARCH}?{urllib.parse.urlencode(params)}"
data = get_json(url)
if not isinstance(data, list) or len(data) < 4:
return []
titles, descs, urls = data[1], data[2], data[3]
out = []
for i, title in enumerate(titles):
out.append(
{
"title": title,
"description": descs[i] if i < len(descs) else "",
"url": urls[i] if i < len(urls) else "",
}
)
return out
def _wp_summary(title: str) -> dict:
"""Pull the REST summary for a title — short bio, image, type."""
url = f"{WP_SUMMARY}{urllib.parse.quote(title.replace(' ', '_'))}"
try:
return get_json(url) # type: ignore[return-value]
except Exception as e: # noqa: BLE001
print(f"Wikipedia summary lookup for {title!r} failed: {e}", file=sys.stderr)
return {}
def _wd_lookup_by_qid(qid: str) -> dict:
"""Pull common facts for a QID via Wikidata's Action API (no SPARQL).
The Action API is far more lenient on rate-limits than the SPARQL Query
Service. We get claims as QIDs and then resolve labels in one batch call.
"""
# Properties of interest. The Action API returns claims as QIDs or
# typed literals, so the slot mapping is local-only.
interesting = {
"P31": "instance_of",
"P17": "country", # for orgs / places
"P27": "country", # for individuals (country of citizenship)
"P106": "occupation",
"P108": "employer",
"P569": "date_of_birth",
"P19": "place_of_birth",
}
params = {
"action": "wbgetentities",
"ids": qid,
"props": "claims",
"format": "json",
}
url = f"{WD_ACTION}?{urllib.parse.urlencode(params)}"
try:
data = get_json(url)
except Exception as e: # noqa: BLE001
print(f"Wikidata wbgetentities for {qid} failed: {e}", file=sys.stderr)
return {}
if not isinstance(data, dict):
return {}
claims = (data.get("entities", {}).get(qid, {}) or {}).get("claims", {}) or {}
# Collect raw values (QIDs or literals) and remember which slot each
# came from. Date literals come back as ISO strings; QIDs need a label
# resolution pass.
qid_to_slots: dict[str, list[str]] = {}
facts: dict[str, list[str]] = {}
for prop_id, slot in interesting.items():
for claim in claims.get(prop_id, []) or []:
v = (claim.get("mainsnak", {}) or {}).get("datavalue", {}) or {}
vtype = v.get("type")
value = v.get("value")
if vtype == "wikibase-entityid" and isinstance(value, dict):
vqid = value.get("id", "")
if vqid:
qid_to_slots.setdefault(vqid, [])
if slot not in qid_to_slots[vqid]:
qid_to_slots[vqid].append(slot)
elif vtype == "time" and isinstance(value, dict):
raw = value.get("time", "") or ""
# +1955-10-28T00:00:00Z → 1955-10-28
m = re.search(r"[+-]?(\d{4})-(\d{2})-(\d{2})", raw)
if m:
facts.setdefault(slot, []).append(
f"{m.group(1)}-{m.group(2)}-{m.group(3)}"
)
elif vtype == "string":
facts.setdefault(slot, []).append(str(value))
# Resolve labels for all referenced QIDs in one batch (up to 50 at a time).
qids = list(qid_to_slots)
for i in range(0, len(qids), 50):
batch = qids[i : i + 50]
params = {
"action": "wbgetentities",
"ids": "|".join(batch),
"props": "labels",
"languages": "en",
"format": "json",
}
url = f"{WD_ACTION}?{urllib.parse.urlencode(params)}"
try:
data = get_json(url)
except Exception as e: # noqa: BLE001
print(f"Wikidata label batch failed: {e}", file=sys.stderr)
continue
if not isinstance(data, dict):
continue
ents = data.get("entities", {}) or {}
for vqid, ent in ents.items():
label = (ent.get("labels", {}).get("en", {}) or {}).get("value", "") or vqid
for slot in qid_to_slots.get(vqid, []):
facts.setdefault(slot, []).append(label)
# Deduplicate per slot, preserving order.
deduped: dict[str, list[str]] = {}
for slot, vals in facts.items():
seen = set()
out = []
for v in vals:
if v in seen:
continue
seen.add(v)
out.append(v)
deduped[slot] = out
return deduped
def _wd_qid_for_title(title: str) -> str:
"""Get the Wikidata QID associated with a Wikipedia article title."""
params = {
"action": "query",
"format": "json",
"prop": "pageprops",
"ppprop": "wikibase_item",
"titles": title,
"redirects": 1,
}
url = f"{WP_OPENSEARCH}?{urllib.parse.urlencode(params)}"
try:
data = get_json(url)
except Exception: # noqa: BLE001
return ""
if not isinstance(data, dict):
return ""
pages = data.get("query", {}).get("pages", {}) or {}
for page in pages.values():
qid = (page.get("pageprops") or {}).get("wikibase_item", "")
if qid:
return qid
return ""
def fetch(query: str, limit: int, no_wikidata: bool, out_path: str) -> int:
hits = _wp_search(query, limit)
rows: list[dict[str, str]] = []
for hit in hits[:limit]:
title = hit.get("title", "")
if not title:
continue
summary = _wp_summary(title)
qid = _wd_qid_for_title(title) if not no_wikidata else ""
facts: dict = {}
if qid:
facts = _wd_lookup_by_qid(qid)
rows.append(
{
"source": "wikipedia+wikidata" if qid else "wikipedia",
"label": title,
"description": (summary.get("description") or hit.get("description") or "").strip(),
"qid": qid,
"wikipedia_title": title,
"wikipedia_url": hit.get("url", ""),
"wikidata_url": f"https://www.wikidata.org/wiki/{qid}" if qid else "",
"instance_of": "; ".join(facts.get("instance_of", [])),
"country": "; ".join(facts.get("country", [])),
"occupation": "; ".join(facts.get("occupation", [])),
"employer": "; ".join(facts.get("employer", [])),
"date_of_birth": "; ".join(facts.get("date_of_birth", []))[:10] if facts.get("date_of_birth") else "",
"place_of_birth": "; ".join(facts.get("place_of_birth", [])),
"summary": (summary.get("extract") or "").replace("\n", " ")[:1000],
}
)
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
with open(out_path, "w", newline="", encoding="utf-8") as fh:
w = csv.DictWriter(fh, fieldnames=COLUMNS)
w.writeheader()
w.writerows(rows)
if not rows:
print(
f"Wikipedia: 0 articles for query={query!r}. "
"Private individuals not notable enough for a Wikipedia article "
"won't appear here (the bar is real).",
file=sys.stderr,
)
return len(rows)
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--query", required=True, help="Entity name (person, company, place, concept)")
p.add_argument("--limit", type=int, default=5)
p.add_argument(
"--no-wikidata",
action="store_true",
help="Skip the Wikidata enrichment (faster, less detail)",
)
p.add_argument("--out", required=True)
a = p.parse_args()
n = fetch(query=a.query, limit=a.limit, no_wikidata=a.no_wikidata, out_path=a.out)
print(f"Wrote {n} Wikipedia/Wikidata rows to {a.out}")
return 0
if __name__ == "__main__":
raise SystemExit(main())
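
Two small steps in `_wd_lookup_by_qid` can be tried in isolation: trimming a Wikidata time literal to a plain date, and removing duplicates while keeping first-seen order. The sketch below reuses the script's regular expression; both function names are hypothetical, and `dict.fromkeys` is an alternative to the explicit `seen` set used in the script.

```python
import re


def wikidata_time_to_iso(raw: str) -> str:
    """Trim a literal such as '+1955-10-28T00:00:00Z' to '1955-10-28'.

    Returns an empty string when no date-like prefix is found.
    """
    m = re.search(r"[+-]?(\d{4})-(\d{2})-(\d{2})", raw)
    return f"{m.group(1)}-{m.group(2)}-{m.group(3)}" if m else ""


def dedupe_keep_order(values: list[str]) -> list[str]:
    """Drop repeated values while preserving first-seen order."""
    return list(dict.fromkeys(values))


print(wikidata_time_to_iso("+1955-10-28T00:00:00Z"))  # 1955-10-28
print(dedupe_keep_order(["United States", "Canada", "United States"]))
```

The regular expression copies the digits through unchanged, so a value stored with less than day precision may still carry zero month or day fields; treat the result as a display string, not a validated date.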

@ -0,0 +1,253 @@
#!/usr/bin/env python3
"""Permutation test for donation/contract timing correlation (stdlib-only).
For each (donor, recipient) pair, compute the mean number of days between each
donation and the nearest contract award. Then draw the same number of award
dates uniformly at random within the observation window, N times, and compute
the same statistic each time. The one-tailed p-value is the fraction of
simulated means that are <= the observed mean (smaller distance = tighter clustering).
Adapted from ShinMegamiBoson/OpenPlanter (MIT). Differences:
- Pure stdlib (no pandas / numpy)
- Domain-agnostic (no snow-vendor / CRITICAL-politician filter)
- Configurable column names via flags
- Optional --seed for reproducibility
"""
from __future__ import annotations
import argparse
import csv
import datetime as dt
import json
import math
import random
import statistics
from collections import defaultdict
from pathlib import Path
_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y/%m/%d", "%m-%d-%Y", "%Y%m%d")
def parse_date(raw: str) -> dt.date | None:
if not raw:
return None
raw = raw.strip()
for fmt in _DATE_FORMATS:
try:
return dt.datetime.strptime(raw, fmt).date()
except ValueError:
continue
return None
def _read(path: str) -> list[dict[str, str]]:
with open(path, newline="", encoding="utf-8") as fh:
return list(csv.DictReader(fh))
def _nearest_distance(donation_date: dt.date, awards: list[dt.date]) -> int:
"""Absolute days to nearest award date."""
return min(abs((donation_date - a).days) for a in awards)
def _permute(
awards_count: int,
donations: list[dt.date],
date_min: dt.date,
date_max: dt.date,
rng: random.Random,
) -> float:
"""One permutation: draw uniform random award dates, compute mean nearest-distance."""
span_days = (date_max - date_min).days or 1
rand_awards = [
date_min + dt.timedelta(days=rng.randint(0, span_days))
for _ in range(awards_count)
]
distances = [_nearest_distance(d, rand_awards) for d in donations]
return statistics.mean(distances)
def analyze(
donations_path: str,
donation_date_col: str,
donation_amount_col: str,
donation_donor_col: str,
donation_recipient_col: str,
contracts_path: str,
contract_date_col: str,
contract_vendor_col: str,
cross_links_path: str | None,
n_permutations: int = 1000,
min_donations: int = 3,
p_threshold: float = 0.05,
seed: int | None = None,
out_path: str = "timing.json",
) -> dict:
rng = random.Random(seed)
donations = _read(donations_path)
contracts = _read(contracts_path)
# Allow optional join through cross_links — donor (left) ↔ vendor (right).
# When present, donor strings get mapped to matched vendor names so the
# vendor-date index lookup actually finds the contracts.
matched_pairs: set[tuple[str, str]] | None = None
donor_to_vendors: dict[str, set[str]] = defaultdict(set)
if cross_links_path:
matched_pairs = set()
for row in _read(cross_links_path):
left = row.get("left_name", "")
right = row.get("right_name", "")
matched_pairs.add((left, right))
donor_to_vendors[left].add(right)
# Index contract dates by vendor name.
vendor_to_award_dates: dict[str, list[dt.date]] = defaultdict(list)
all_award_dates: list[dt.date] = []
for row in contracts:
d = parse_date(row.get(contract_date_col, ""))
if not d:
continue
vendor_to_award_dates[row.get(contract_vendor_col, "").strip()].append(d)
all_award_dates.append(d)
if not all_award_dates:
raise SystemExit(f"No parseable dates in {contracts_path}/{contract_date_col}")
global_min = min(all_award_dates)
global_max = max(all_award_dates)
# Group donations by (donor, recipient).
grouped: dict[tuple[str, str], list[tuple[dt.date, float]]] = defaultdict(list)
for row in donations:
donor = row.get(donation_donor_col, "").strip()
recip = row.get(donation_recipient_col, "").strip()
d = parse_date(row.get(donation_date_col, ""))
try:
amt = float(row.get(donation_amount_col, "0") or 0)
except ValueError:
amt = 0.0
if not (donor and recip and d):
continue
grouped[(donor, recip)].append((d, amt))
results = []
skipped = 0
for (donor, recip), records in grouped.items():
if len(records) < min_donations:
skipped += 1
continue
# Only test if donor appears in cross-links (when provided). The
# (donor, candidate) tuple itself is NOT what's in matched_pairs —
# cross_links pairs are (donor, vendor). We use the cross-link to
# map donor → vendor name(s) so the vendor-date index resolves.
if matched_pairs is not None and donor not in donor_to_vendors:
skipped += 1
continue
# Try direct donor→awards first, then go through cross-link vendor names.
award_dates = list(vendor_to_award_dates.get(donor, []))
if not award_dates:
award_dates = list(vendor_to_award_dates.get(recip, []))
if not award_dates and donor_to_vendors.get(donor):
for vendor_name in donor_to_vendors[donor]:
award_dates.extend(vendor_to_award_dates.get(vendor_name, []))
if not award_dates:
skipped += 1
continue
donation_dates = [d for (d, _) in records]
observed = statistics.mean(
_nearest_distance(d, award_dates) for d in donation_dates
)
permuted_means = [
_permute(len(award_dates), donation_dates, global_min, global_max, rng)
for _ in range(n_permutations)
]
p_value = sum(1 for m in permuted_means if m <= observed) / n_permutations
null_mean = statistics.mean(permuted_means)
null_std = statistics.pstdev(permuted_means) or 1.0
effect_size = (null_mean - observed) / null_std
results.append(
{
"donor": donor,
"recipient": recip,
"n_donations": len(records),
"n_award_dates": len(award_dates),
"observed_mean_days": round(observed, 2),
"null_mean_days": round(null_mean, 2),
"p_value": round(p_value, 4),
"effect_size_sd": round(effect_size, 2),
"significant": p_value < p_threshold,
"total_donation_amount": round(sum(a for (_, a) in records), 2),
}
)
results.sort(key=lambda r: r["p_value"])
payload = {
"metadata": {
"n_permutations": n_permutations,
"min_donations": min_donations,
"p_threshold": p_threshold,
"seed": seed,
"n_pairs_tested": len(results),
"n_pairs_skipped": skipped,
"n_significant": sum(1 for r in results if r["significant"]),
"observation_window": [global_min.isoformat(), global_max.isoformat()],
},
"results": results,
}
Path(out_path).write_text(json.dumps(payload, indent=2))
return payload
def main() -> int:
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
p.add_argument("--donations", required=True)
p.add_argument("--donation-date-col", required=True)
p.add_argument("--donation-amount-col", required=True)
p.add_argument("--donation-donor-col", required=True)
p.add_argument("--donation-recipient-col", required=True)
p.add_argument("--contracts", required=True)
p.add_argument("--contract-date-col", required=True)
p.add_argument("--contract-vendor-col", required=True)
p.add_argument(
"--cross-links",
help="Optional cross_links.csv to restrict (donor, vendor) pairs",
)
p.add_argument("--permutations", type=int, default=1000)
p.add_argument("--min-donations", type=int, default=3)
p.add_argument("--p-threshold", type=float, default=0.05)
p.add_argument("--seed", type=int)
p.add_argument("--out", default="timing.json")
a = p.parse_args()
payload = analyze(
donations_path=a.donations,
donation_date_col=a.donation_date_col,
donation_amount_col=a.donation_amount_col,
donation_donor_col=a.donation_donor_col,
donation_recipient_col=a.donation_recipient_col,
contracts_path=a.contracts,
contract_date_col=a.contract_date_col,
contract_vendor_col=a.contract_vendor_col,
cross_links_path=a.cross_links,
n_permutations=a.permutations,
min_donations=a.min_donations,
p_threshold=a.p_threshold,
seed=a.seed,
out_path=a.out,
)
meta = payload["metadata"]
print(
f"Tested {meta['n_pairs_tested']} pairs ({meta['n_pairs_skipped']} skipped). "
f"Significant (p<{meta['p_threshold']}): {meta['n_significant']}. "
f"Wrote {a.out}"
)
return 0
if __name__ == "__main__":
raise SystemExit(main())
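
The test statistic and its null distribution can be illustrated on a toy dataset. The sketch below follows the same idea as `analyze` (mean nearest-award distance, compared against uniformly random award dates), but it is a simplified stand-in with hypothetical function names, with none of the script's CSV handling or cross-link logic.

```python
import datetime as dt
import random
import statistics


def mean_nearest_days(donations, awards):
    """Mean absolute distance in days from each donation to its nearest award."""
    return statistics.mean(min(abs((d - a).days) for a in awards) for d in donations)


def timing_p_value(donations, awards, window_start, window_end, n=500, seed=0):
    """Fraction of random award draws at least as tightly clustered as observed."""
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    span = (window_end - window_start).days or 1
    observed = mean_nearest_days(donations, awards)
    hits = 0
    for _ in range(n):
        fake = [window_start + dt.timedelta(days=rng.randint(0, span)) for _ in awards]
        if mean_nearest_days(donations, fake) <= observed:
            hits += 1
    return hits / n


awards = [dt.date(2024, 6, 15)]
donations = [dt.date(2024, 6, 14), dt.date(2024, 6, 15), dt.date(2024, 6, 16)]
p = timing_p_value(donations, awards, dt.date(2024, 1, 1), dt.date(2024, 12, 31))
print(f"p = {p:.3f}")
```

A small p-value here only says the donations sit closer to the award date than random dates usually would. It does not establish intent, and with many pairs tested some will fall below the threshold by chance.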

@ -0,0 +1,59 @@
# <Source Name>
## 1. Summary
What this data source is, who publishes it, why it matters for investigations.
## 2. Access Methods
- API endpoint(s)
- Bulk download URLs
- Auth requirements (none / API key / OAuth)
- Rate limits
## 3. Data Schema
Key fields, record types, table relationships. List the columns the fetch
script emits.
## 4. Coverage
- Jurisdiction
- Time range
- Update frequency
- Data volume (rows / GB)
## 5. Cross-Reference Potential
Which other sources can be joined and on what keys. Be explicit:
- `<source>`: `<column>` (join key: <normalized entity name / EIN / CIK / etc.>)
## 6. Data Quality
Known issues — formatting inconsistencies, missing fields, duplicates,
historical gaps, redaction.
## 7. Acquisition Script
Path: `scripts/fetch_<source>.py`
Example:
```bash
python3 SKILL_DIR/scripts/fetch_<source>.py --<filter> <value> --out data/<source>.csv
```
Output CSV columns: `<col1>, <col2>, ...`
## 8. Legal & Licensing
- Public records law / FOIA basis
- Terms of use / acceptable use
- Attribution requirements (if any)
## 9. References
- Official docs: <url>
- Data dictionary: <url>
- Related coverage / journalism: <url>

@ -68,7 +68,7 @@
const FALLBACK_COLUMN_HELP = {
triage: "Raw ideas — a specifier will flesh out the spec",
todo: "Waiting on dependencies or unassigned",
ready: "Assigned and waiting for a dispatcher tick",
ready: "Dependencies satisfied; assign a profile to dispatch",
running: "Claimed by a worker — in-flight",
blocked: "Worker asked for human input",
done: "Completed",
@ -2048,6 +2048,7 @@
};
const progress = t.progress;
const needsAssignee = t.status === "ready" && !t.assignee;
return h("div", {
ref: cardRef,
@ -2118,6 +2119,13 @@
title: `${progress.done} of ${progress.total} child tasks done`,
}, `${progress.done}/${progress.total}`)
: null,
needsAssignee
? h(Badge, {
variant: "outline",
className: "hermes-kanban-needs-assignee",
title: tx(i18n, "needsAssigneeHint", "Dependencies are satisfied, but the dispatcher skips this task until you assign a profile."),
}, tx(i18n, "needsAssignee", "Needs assignee"))
: null,
),
h("div", { className: "hermes-kanban-card-title" },
t.title || tx(i18n, "untitled", "(untitled)")),
@ -2126,7 +2134,9 @@
? h("span", { className: "hermes-kanban-assignee",
title: `Assigned to Hermes profile @${t.assignee}` }, "@", t.assignee)
: h("span", { className: "hermes-kanban-unassigned",
title: "No profile assigned. The dispatcher will pick one from available profiles when the task is Ready." },
title: needsAssignee
? tx(i18n, "needsAssigneeHint", "Dependencies are satisfied, but the dispatcher skips this task until you assign a profile.")
: "No profile assigned." },
tx(i18n, "unassigned", "unassigned")),
t.comment_count > 0
? h("span", { className: "hermes-kanban-count",

@ -280,6 +280,14 @@
padding: 0.05rem 0.3rem !important;
}
.hermes-kanban-needs-assignee {
font-size: 0.6rem !important;
padding: 0.05rem 0.3rem !important;
background: color-mix(in srgb, var(--color-warning, #d4b348) 16%, transparent);
border-color: color-mix(in srgb, var(--color-warning, #d4b348) 45%, var(--color-border));
color: var(--color-foreground);
}
.hermes-kanban-assignee {
font-weight: 500;
color: color-mix(in srgb, var(--color-foreground) 80%, var(--color-muted-foreground));

@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "hermes-agent"
version = "0.13.0"
version = "0.14.0"
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
readme = "README.md"
requires-python = ">=3.11"
@ -216,12 +216,11 @@ hermes-acp = "acp_adapter.entry:main"
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_bootstrap", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "utils"]
[tool.setuptools.package-data]
hermes_cli = ["web_dist/**/*", "tui_dist/**/*", "scripts/install.sh"]
hermes_cli = ["web_dist/**/*"]
gateway = ["assets/**/*"]
acp_adapter = ["bootstrap/*.sh", "bootstrap/*.ps1"]
[tool.setuptools.packages.find]
include = ["agent", "agent.*", "tools", "tools.*", "hermes_cli", "gateway", "gateway.*", "tui_gateway", "tui_gateway.*", "cron", "acp_adapter", "acp_adapter.*", "plugins", "plugins.*", "providers", "providers.*"]
include = ["agent", "agent.*", "tools", "tools.*", "hermes_cli", "gateway", "gateway.*", "tui_gateway", "tui_gateway.*", "cron", "acp_adapter", "plugins", "plugins.*", "providers", "providers.*"]
[tool.pytest.ini_options]
testpaths = ["tests"]

@ -393,6 +393,19 @@ def _is_destructive_command(cmd: str) -> bool:
return False
def _is_mcp_tool_parallel_safe(tool_name: str) -> bool:
"""Check if an MCP tool comes from a server with parallel tool calls enabled.
Lazy-imports from ``tools.mcp_tool`` to avoid circular dependencies.
Returns False if the MCP module is not available.
"""
try:
from tools.mcp_tool import is_mcp_tool_parallel_safe
return is_mcp_tool_parallel_safe(tool_name)
except Exception:
return False
def _should_parallelize_tool_batch(tool_calls) -> bool:
"""Return True when a tool-call batch is safe to run concurrently."""
if len(tool_calls) <= 1:
@ -432,7 +445,9 @@ def _should_parallelize_tool_batch(tool_calls) -> bool:
continue
if tool_name not in _PARALLEL_SAFE_TOOLS:
return False
# Check if it's an MCP tool from a server that opted into parallel calls.
if not _is_mcp_tool_parallel_safe(tool_name):
return False
return True
@ -3027,6 +3042,24 @@ class AIAgent:
parts.append(f"{type(e).__name__}({msg})" if msg else type(e).__name__)
return " <- ".join(parts) if parts else type(error).__name__
def _is_provider_stream_parse_error(self, error: BaseException) -> bool:
"""Return True for malformed provider streaming data from SDK parsers.
Some Anthropic-compatible streaming providers can send a malformed
event-stream frame. The Anthropic SDK surfaces that as a plain
``ValueError`` such as ``expected ident at line 1 column 149``. That
is provider wire-format trouble, not local request validation, so it
should follow the same retry path as a truncated JSON body.
"""
if getattr(self, "api_mode", None) != "anthropic_messages":
return False
if not isinstance(error, ValueError):
return False
if isinstance(error, (UnicodeEncodeError, json.JSONDecodeError)):
return False
message = str(error).strip().lower()
return "expected ident at line" in message
def _log_stream_retry(
self,
*,
@ -5080,6 +5113,12 @@ class AIAgent:
"""
raw = str(error)
if (
isinstance(error, ValueError)
and "expected ident at line" in raw.lower()
):
return f"Malformed provider streaming response: {raw[:300]}"
# Cloudflare / proxy HTML pages: grab the <title> for a clean summary
if "<!DOCTYPE" in raw or "<html" in raw:
m = re.search(r"<title[^>]*>([^<]+)</title>", raw, re.IGNORECASE)
@ -8528,6 +8567,7 @@ class AIAgent:
_is_conn_err = isinstance(
e, (_httpx.ConnectError, _httpx.RemoteProtocolError, ConnectionError)
)
_is_stream_parse_err = self._is_provider_stream_parse_error(e)
# If the stream died AFTER some tokens were delivered:
# normally we don't retry (the user already saw text,
@ -8567,7 +8607,10 @@ class AIAgent:
for phrase in _SSE_PREVIEW_PHRASES
)
_is_transient = (
_is_timeout or _is_conn_err or _is_sse_conn_err_preview
_is_timeout
or _is_conn_err
or _is_sse_conn_err_preview
or _is_stream_parse_err
)
_can_silent_retry = (
_partial_tool_in_flight
@ -8665,7 +8708,7 @@ class AIAgent:
for phrase in _SSE_CONN_PHRASES
)
if _is_timeout or _is_conn_err or _is_sse_conn_err:
if _is_timeout or _is_conn_err or _is_sse_conn_err or _is_stream_parse_err:
# Transient network / timeout error. Retry the
# streaming request with a fresh connection first.
if _stream_attempt < _max_stream_retries:
@ -8706,12 +8749,20 @@ class AIAgent:
mid_tool_call=False,
diag=request_client_holder.get("diag"),
)
self._emit_status(
"❌ Connection to provider failed after "
f"{_max_stream_retries + 1} attempts. "
"The provider may be experiencing issues — "
"try again in a moment."
)
if _is_stream_parse_err:
self._emit_status(
"❌ Provider returned malformed streaming data after "
f"{_max_stream_retries + 1} attempts. "
"The provider may be experiencing issues — "
"try again in a moment."
)
else:
self._emit_status(
"❌ Connection to provider failed after "
f"{_max_stream_retries + 1} attempts. "
"The provider may be experiencing issues — "
"try again in a moment."
)
else:
_err_lower = str(e).lower()
_is_stream_unsupported = (
@ -14133,6 +14184,39 @@ class AIAgent:
"interrupted": True,
}
# Actionable hint for GitHub Models (Azure) 413 errors.
# The free tier enforces a hard 8K token cap per request,
# which Hermes' system prompt + tool schemas alone exceed.
# Compression can't help — the floor is the system prompt
# itself, not the conversation — so surface a clear "not
# compatible" message instead of looping into three futile
# compression attempts.
if (
status_code == 413
and isinstance(_base, str)
and "models.inference.ai.azure.com" in _base
):
self._vprint(
f"{self.log_prefix} 💡 GitHub Models free tier (models.inference.ai.azure.com) caps every",
force=True,
)
self._vprint(
f"{self.log_prefix} request at ~8K tokens. Hermes' system prompt + tool schemas baseline",
force=True,
)
self._vprint(
f"{self.log_prefix} exceeds that floor, so this endpoint cannot run an agentic loop.",
force=True,
)
self._vprint(
f"{self.log_prefix} Use the `copilot` provider with a Copilot subscription token (`hermes",
force=True,
)
self._vprint(
f"{self.log_prefix} setup` → GitHub Copilot), or pick any other provider.",
force=True,
)
# Check for 413 payload-too-large BEFORE generic 4xx handler.
# A 413 is a payload-size error — the correct response is to
# compress history and retry, not abort immediately.
@ -14509,11 +14593,16 @@ class AIAgent:
# provider/network failure (malformed response body,
# truncated stream, routing layer corruption), not a
# local programming bug, and should be retried (#14782).
# Exclude Anthropic stream parser ValueErrors for the
# same reason: third-party Anthropic-compatible providers
# can emit malformed event-stream frames that SDK parsers
# raise as plain ValueError.
is_local_validation_error = (
isinstance(api_error, (ValueError, TypeError))
and not isinstance(
api_error, (UnicodeEncodeError, json.JSONDecodeError)
)
and not self._is_provider_stream_parse_error(api_error)
# ssl.SSLError (and its subclass SSLCertVerificationError)
# inherits from OSError *and* ValueError via Python MRO,
# so the isinstance(ValueError) check above would

@ -59,6 +59,8 @@ AUTHOR_MAP = {
"m@mobrienv.dev": "mikeyobrien",
"qiyin.zuo@pcitc.com": "qiyin-code",
"mr.aashiz@gmail.com": "aashizpoudel",
"70629228+shaun0927@users.noreply.github.com": "shaun0927",
"98262967+Bihruze@users.noreply.github.com": "Bihruze",
"nidhi2894@gmail.com": "nidhi-singh02",
"30312689+aashizpoudel@users.noreply.github.com": "aashizpoudel",
"oleksii.lisikh@gmail.com": "olisikh",
@ -91,6 +93,7 @@ AUTHOR_MAP = {
"30397170+1000Delta@users.noreply.github.com": "1000Delta",
"szymonclawd@mac.home": "szymonclawd",
"257759490+szymonclawd@users.noreply.github.com": "szymonclawd",
"101180447+worlldz@users.noreply.github.com": "worlldz",
"zhanganzhe@tenclass.com": "luoyuctl",
"51604064+luoyuctl@users.noreply.github.com": "luoyuctl",
"127238744+teknium1@users.noreply.github.com": "teknium1",
@ -1078,6 +1081,11 @@ AUTHOR_MAP = {
"nidhi2894@gmail.com": "nidhi-singh02", # PR #2752 salvage (slack whitespace-only IndexError guard)
"38173192+nidhi-singh02@users.noreply.github.com": "nidhi-singh02",
"Jaaneek@users.noreply.github.com": "Jaaneek", # PR #26457 (xAI Grok OAuth provider)
# v0.14.0 additions
"chuang.guo@hopechart.com": "wuwuzhijing", # PR #21063 salvage (gateway docs mention Weixin)
"nightcityblade@gmail.com": "nightcityblade", # PR #24138 (docs voice/tts table)
"pol.kuijken@gmail.com": "polkn", # PR #6136 salvage (skill_view collision refusal)
"robin@soal.org": "rewbs",
}


@ -13,6 +13,7 @@ from acp.schema import (
AgentCapabilities,
AgentMessageChunk,
AgentPlanUpdate,
AgentThoughtChunk,
AuthenticateResponse,
AvailableCommandsUpdate,
Implementation,
@ -467,25 +468,296 @@ class TestSessionOps:
)
@pytest.mark.asyncio
async def test_load_session_replays_reasoning_thought_before_message(self, agent):
"""Thinking-model thoughts must be replayed via ``agent_thought_chunk``.
Regression for #12285 — when a session is loaded, persisted assistant
``reasoning_content`` / ``reasoning`` fields must surface as ACP
``AgentThoughtChunk`` notifications in the same relative position they
had live (thought streams before the assistant message text), so Zed's
collapsed Thinking pane rebuilds instead of vanishing on reconnect.
"""
mock_conn = MagicMock(spec=acp.Client)
mock_conn.session_update = AsyncMock()
agent._conn = mock_conn
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [
{"role": "user", "content": "Walk me through it."},
{
"role": "assistant",
"reasoning_content": "Let me think step by step about the request.",
"content": "Here is the plan.",
},
{"role": "user", "content": "And the legacy case?"},
{
"role": "assistant",
# No reasoning_content — exercise the legacy "reasoning" fallback
# path so sessions persisted before #16892 still replay thoughts.
"reasoning": "Older sessions stored the trace under the internal key.",
"content": "Same idea, older field name.",
},
]
mock_conn.session_update.reset_mock()
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
await asyncio.sleep(0)
await asyncio.sleep(0)
assert isinstance(resp, LoadSessionResponse)
replay_kinds = [
getattr(call.kwargs.get("update"), "session_update", None)
for call in mock_conn.session_update.await_args_list
if getattr(call.kwargs.get("update"), "session_update", None)
in {"user_message_chunk", "agent_message_chunk", "agent_thought_chunk"}
]
assert replay_kinds == [
"user_message_chunk",
"agent_thought_chunk",
"agent_message_chunk",
"user_message_chunk",
"agent_thought_chunk",
"agent_message_chunk",
]
thought_updates = [
call.kwargs["update"]
for call in mock_conn.session_update.await_args_list
if isinstance(call.kwargs.get("update"), AgentThoughtChunk)
]
assert len(thought_updates) == 2
assert thought_updates[0].content.text == "Let me think step by step about the request."
assert thought_updates[1].content.text == "Older sessions stored the trace under the internal key."
@pytest.mark.asyncio
async def test_load_session_replays_reasoning_only_turn(self, agent):
"""Assistant turns with reasoning but no content should still emit a thought.
Pure reasoning-only assistant entries (e.g. a thinking step before a
tool-call turn) commonly carry ``reasoning_content`` with empty
``content``. The replay must still surface the thought so the editor's
Thinking pane rebuilds, even when there is no message text to follow.
"""
mock_conn = MagicMock(spec=acp.Client)
mock_conn.session_update = AsyncMock()
agent._conn = mock_conn
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [
{
"role": "assistant",
"reasoning_content": "I should call the search tool next.",
"content": "",
},
]
mock_conn.session_update.reset_mock()
await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
await asyncio.sleep(0)
await asyncio.sleep(0)
thought_updates = [
call.kwargs["update"]
for call in mock_conn.session_update.await_args_list
if isinstance(call.kwargs.get("update"), AgentThoughtChunk)
]
message_updates = [
call.kwargs["update"]
for call in mock_conn.session_update.await_args_list
if isinstance(call.kwargs.get("update"), AgentMessageChunk)
]
assert len(thought_updates) == 1
assert thought_updates[0].content.text == "I should call the search tool next."
assert message_updates == []
@pytest.mark.asyncio
async def test_load_session_skips_empty_reasoning_fields(self, agent):
"""Empty/whitespace reasoning fields must not produce notifications."""
mock_conn = MagicMock(spec=acp.Client)
mock_conn.session_update = AsyncMock()
agent._conn = mock_conn
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [
{
"role": "assistant",
"reasoning_content": "",
"reasoning": " \n\t",
"content": "Just a regular answer.",
},
]
mock_conn.session_update.reset_mock()
await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
await asyncio.sleep(0)
await asyncio.sleep(0)
thought_updates = [
call.kwargs["update"]
for call in mock_conn.session_update.await_args_list
if isinstance(call.kwargs.get("update"), AgentThoughtChunk)
]
assert thought_updates == []
@pytest.mark.asyncio
async def test_load_session_replays_thought_then_tool_call_without_message(self, agent):
"""Canonical thinking-model shape: reasoning + tool_call + no body text.
Thinking models commonly emit a pre-tool thought followed by a
tool_calls turn with empty ``content``. Replay must emit:
``agent_thought_chunk`` then ``tool_call`` then ``tool_call_update``
for the matching tool result, and crucially NO ``agent_message_chunk``
for the empty-text assistant body. Regression for the canonical
thinking-then-tool flow on #12285.
"""
mock_conn = MagicMock(spec=acp.Client)
mock_conn.session_update = AsyncMock()
agent._conn = mock_conn
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [
{"role": "user", "content": "Find the bug."},
{
"role": "assistant",
"reasoning_content": "I should grep for the function name first.",
"content": "",
"tool_calls": [
{
"id": "call_grep_1",
"type": "function",
"function": {
"name": "search_files",
"arguments": '{"pattern":"foo","path":"."}',
},
}
],
},
{
"role": "tool",
"tool_call_id": "call_grep_1",
"content": '{"total_count":1,"matches":[{"path":"x.py","line":1,"content":"foo"}]}',
},
]
mock_conn.session_update.reset_mock()
await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
await asyncio.sleep(0)
await asyncio.sleep(0)
kinds = [
getattr(call.kwargs.get("update"), "session_update", None)
for call in mock_conn.session_update.await_args_list
if getattr(call.kwargs.get("update"), "session_update", None)
in {
"user_message_chunk",
"agent_thought_chunk",
"agent_message_chunk",
"tool_call",
"tool_call_update",
}
]
# No agent_message_chunk for the empty-content assistant turn.
assert "agent_message_chunk" not in kinds
# Thought must precede the tool_call_start within the assistant turn,
# and the tool result follows.
assert kinds == [
"user_message_chunk",
"agent_thought_chunk",
"tool_call",
"tool_call_update",
]
@pytest.mark.asyncio
async def test_load_session_replays_history_before_returning_response(self, agent):
"""Per ACP spec, replay must complete BEFORE load_session returns.
Spec-compliant ACP clients (Codex, Claude Code, OpenCode, Pi, Zed)
attach their ``session/update`` listeners before awaiting the
``loadSession`` RPC and rely on receiving the full transcript within
the request's lifetime. Deferring replay via ``loop.call_soon`` (the
prior behavior in May 2026) broke clients that read notification
counts synchronously against the load response; see the #12285 follow-up.
"""
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [{"role": "user", "content": "hello from history"}]
events: list[str] = []
async def replay_records(_state):
events.append("replay")
with patch.object(agent, "_replay_session_history", side_effect=replay_records):
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
events.append("returned")
assert isinstance(resp, LoadSessionResponse)
# Replay must have happened BEFORE the response was constructed —
# i.e. before the `events.append("returned")` after the await resolves.
assert events == ["replay", "returned"]
@pytest.mark.asyncio
async def test_resume_session_replays_history_before_returning_response(self, agent):
"""Same spec rationale as ``load_session`` — replay before responding."""
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [{"role": "user", "content": "hello from history"}]
events: list[str] = []
async def replay_records(_state):
events.append("replay")
with patch.object(agent, "_replay_session_history", side_effect=replay_records):
resp = await agent.resume_session(cwd="/tmp", session_id=new_resp.session_id)
events.append("returned")
assert isinstance(resp, ResumeSessionResponse)
assert events == ["replay", "returned"]
@pytest.mark.asyncio
async def test_load_session_survives_replay_helper_exception(self, agent, caplog):
"""A replay helper raising must not turn load_session into an error.
With awaited replay, an exception in ``_replay_session_history`` now
propagates into the ``load_session`` handler. The defensive try/except
guard at the call site must catch and log it so the JSON-RPC client
still receives a ``LoadSessionResponse``: partial transcripts are
acceptable, total load failure is not.
"""
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [{"role": "user", "content": "hi"}]
async def boom(_state):
raise RuntimeError("simulated replay helper crash")
with caplog.at_level("WARNING", logger="acp_adapter.server"):
with patch.object(agent, "_replay_session_history", side_effect=boom):
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
assert isinstance(resp, LoadSessionResponse)
assert "history replay raised during session/load" in caplog.text
@pytest.mark.asyncio
async def test_resume_session_survives_replay_helper_exception(self, agent, caplog):
"""Same guarantee as ``load_session`` for the resume path."""
new_resp = await agent.new_session(cwd="/tmp")
state = agent.session_manager.get_session(new_resp.session_id)
state.history = [{"role": "user", "content": "hi"}]
async def boom(_state):
raise RuntimeError("simulated replay helper crash")
with caplog.at_level("WARNING", logger="acp_adapter.server"):
with patch.object(agent, "_replay_session_history", side_effect=boom):
resp = await agent.resume_session(cwd="/tmp", session_id=new_resp.session_id)
assert isinstance(resp, ResumeSessionResponse)
assert "history replay raised during session/resume" in caplog.text
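The replay ordering these tests pin down can be sketched as a small pure function. This is only an illustration under stated assumptions, not the adapter's actual `_replay_session_history`: the `replay_kinds` helper is hypothetical, and emitting message text before any `tool_call` within one assistant turn is an assumption. Only the history field names (`reasoning_content`, `reasoning`, `content`, `tool_calls`) and the update kinds come from the tests above.

```python
from typing import Any


def replay_kinds(history: list[dict[str, Any]]) -> list[str]:
    """Return the update kinds a replay would emit, in order (hypothetical helper)."""
    kinds: list[str] = []
    for entry in history:
        role = entry.get("role")
        if role == "user":
            kinds.append("user_message_chunk")
        elif role == "assistant":
            # Pick the first non-blank reasoning field; "reasoning" is the legacy key.
            thought = next(
                (
                    entry[key]
                    for key in ("reasoning_content", "reasoning")
                    if isinstance(entry.get(key), str) and entry[key].strip()
                ),
                "",
            )
            if thought:
                kinds.append("agent_thought_chunk")
            # Empty or whitespace-only bodies produce no message chunk.
            if (entry.get("content") or "").strip():
                kinds.append("agent_message_chunk")
            # Assumption: one tool_call update per persisted tool call.
            for _call in entry.get("tool_calls") or []:
                kinds.append("tool_call")
        elif role == "tool":
            kinds.append("tool_call_update")
    return kinds
```

Keeping the ordering logic free of I/O like this would make it checkable without a mocked connection, although the real adapter may well structure it differently.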
@pytest.mark.asyncio
async def test_resume_session_creates_new_if_missing(self, agent):


@ -0,0 +1,170 @@
"""Regression tests for the Anthropic OAuth PKCE flow.
Guards against re-introducing the bug where the PKCE ``code_verifier`` was
reused as the OAuth ``state`` parameter, leaking the verifier via the
authorization URL (browser history, Referer headers, auth-server logs) and
removing CSRF protection on the callback path.
History:
- PR #1775 first fixed this on ``run_hermes_oauth_login()``.
- PR #2647 (b17e5c10) added ``run_hermes_oauth_login_pure()`` and silently
copy-pasted the pre-#1775 vulnerable pattern.
- PR #3107 removed the old function, leaving only the regressed copy.
- PR #10699 (issue #10693) fixed the regression on the surviving function.
"""
from __future__ import annotations
import io
import json
from typing import Any, Dict
from urllib.parse import parse_qs, urlparse
def _patch_oauth_flow(
monkeypatch,
*,
callback_code: str,
token_response: Dict[str, Any] | None = None,
capture_token_request: Dict[str, Any] | None = None,
capture_auth_url: Dict[str, str] | None = None,
) -> None:
"""Wire up monkeypatches that let ``run_hermes_oauth_login_pure()`` run
end-to-end without touching a real browser, stdin, or HTTP endpoint.
``callback_code`` is the literal string the user would paste back into the
terminal (``"<code>#<state>"`` format).
``capture_token_request`` and ``capture_auth_url`` are out-dict captures
so the test can introspect the token-endpoint request and the authorization
URL that was opened, respectively.
"""
import urllib.request
if token_response is None:
token_response = {
"access_token": "sk-ant-test-access",
"refresh_token": "sk-ant-test-refresh",
"expires_in": 3600,
}
def fake_open(url):
if capture_auth_url is not None:
capture_auth_url["url"] = url
return True
monkeypatch.setattr("webbrowser.open", fake_open)
monkeypatch.setattr("builtins.input", lambda *_a, **_kw: callback_code)
class _FakeResponse:
def __init__(self, body: bytes) -> None:
self._body = body
def __enter__(self):
return self
def __exit__(self, *_exc):
return False
def read(self):
return self._body
def fake_urlopen(req, *_a, **_kw):
if capture_token_request is not None:
capture_token_request["url"] = req.full_url
capture_token_request["data"] = json.loads(req.data.decode())
capture_token_request["headers"] = dict(req.headers)
return _FakeResponse(json.dumps(token_response).encode())
monkeypatch.setattr(urllib.request, "urlopen", fake_urlopen)
def test_authorization_url_state_is_not_pkce_verifier(monkeypatch, tmp_path):
"""The ``state`` parameter in the authorization URL must NOT equal the
PKCE ``code_verifier``.
Reusing the verifier as state leaks the verifier into browser history,
Referer headers, and auth-server access logs, defeating RFC 7636.
"""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
captured_url: Dict[str, str] = {}
captured_token: Dict[str, Any] = {}
_patch_oauth_flow(
monkeypatch,
# state echoed back unchanged so the CSRF guard passes
callback_code="auth-code-from-anthropic#PLACEHOLDER",
capture_auth_url=captured_url,
capture_token_request=captured_token,
)
# Stub the callback parse: we need the state echoed back to match. To do
# that without hardcoding the state value, override input() AFTER seeing
# the auth URL.
import builtins
real_input_calls = {"count": 0}
def fake_input(*_a, **_kw):
real_input_calls["count"] += 1
# First (and only) call is the "Authorization code:" prompt.
url = captured_url.get("url", "")
qs = parse_qs(urlparse(url).query)
state = qs.get("state", [""])[0]
return f"auth-code-from-anthropic#{state}"
monkeypatch.setattr(builtins, "input", fake_input)
from agent.anthropic_adapter import run_hermes_oauth_login_pure
result = run_hermes_oauth_login_pure()
assert result is not None, "OAuth flow should succeed with matching state"
url = captured_url["url"]
qs = parse_qs(urlparse(url).query)
assert "state" in qs and qs["state"][0], "authorization URL must include state"
assert "code_challenge" in qs, "authorization URL must include code_challenge"
state_in_url = qs["state"][0]
verifier_sent = captured_token["data"]["code_verifier"]
# The whole point: state and verifier must be independent values.
assert state_in_url != verifier_sent, (
"PKCE code_verifier was reused as OAuth state — regression of #10693 / "
"#1775. The verifier is supposed to be a secret known only to the "
"client; placing it in the authorization URL leaks it via browser "
"history, Referer headers, and auth-server logs."
)
# And the verifier MUST NOT appear anywhere in the URL.
assert verifier_sent not in url, (
"PKCE verifier leaked into authorization URL — regression of #10693"
)
def test_callback_state_mismatch_aborts(monkeypatch, tmp_path, caplog):
"""If the state returned in the callback does not match the one we sent
in the authorization URL, the flow must abort before exchanging the code.
Without this check, an attacker who tricks the user into pasting a
crafted ``<code>#<state>`` string can complete the token exchange — the
CSRF protection that ``state`` is supposed to provide (RFC 6749 §10.12)
would be absent.
"""
monkeypatch.setenv("HERMES_HOME", str(tmp_path))
captured_token: Dict[str, Any] = {}
_patch_oauth_flow(
monkeypatch,
callback_code="attacker-code#attacker-state-does-not-match",
capture_token_request=captured_token,
)
from agent.anthropic_adapter import run_hermes_oauth_login_pure
result = run_hermes_oauth_login_pure()
assert result is None, "mismatched state must abort the flow"
assert "url" not in captured_token, (
"token exchange must NOT happen when state mismatches"
)
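The invariant both tests guard (an independent `state` and `code_verifier`, plus a state check before the token exchange) can be sketched with the standard library alone. This is a minimal illustration, not the code in `agent.anthropic_adapter`: `build_authorization_url`, `callback_state_matches`, and the endpoint and client values used with them are hypothetical. Only the `<code>#<state>` paste format is taken from the tests above.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def build_authorization_url(authorize_endpoint: str, client_id: str, redirect_uri: str):
    """Return (url, code_verifier, state) with independently generated secrets."""
    # PKCE verifier: stays on the client and is only sent to the token endpoint.
    code_verifier = secrets.token_urlsafe(64)
    # CSRF token: a separate random value, safe to expose in the URL.
    state = secrets.token_urlsafe(32)
    # S256 challenge: base64url(SHA-256(verifier)) without padding.
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    code_challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "code_challenge": code_challenge,
            "code_challenge_method": "S256",
            "state": state,
        }
    )
    return f"{authorize_endpoint}?{query}", code_verifier, state


def callback_state_matches(pasted: str, expected_state: str) -> bool:
    """Check the state half of a pasted "<code>#<state>" string in constant time."""
    _code, _sep, returned_state = pasted.partition("#")
    return secrets.compare_digest(returned_state.encode(), expected_state.encode())
```

A caller would abort before the token exchange whenever `callback_state_matches` returns `False`, which is the behaviour `test_callback_state_mismatch_aborts` expects from the real flow.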


@ -0,0 +1,77 @@
"""Tests for gh-copilot CLI deprecation detection and GitHub Models Azure URL mapping."""
import pytest
from agent.copilot_acp_client import _is_gh_copilot_deprecation_message
class TestDeprecationPatternDetection:
"""Verify that stderr from the deprecated `gh copilot` extension is caught
without false-positiving on the new `@github/copilot` CLI."""
_REAL_DEPRECATION_STDERR = (
"The gh-copilot extension has been deprecated in favor of the newer "
"GitHub Copilot CLI.\nFor more information, visit:\n"
"- Copilot CLI: https://github.com/github/copilot-cli\n"
"- Deprecation announcement: https://github.blog/changelog/"
"2025-09-25-upcoming-deprecation-of-gh-copilot-cli-extension\n"
"No commands will be executed."
)
def test_real_deprecation_message_matches(self):
assert _is_gh_copilot_deprecation_message(self._REAL_DEPRECATION_STDERR)
@pytest.mark.parametrize(
"stderr_text",
[
# The deprecation banner uses both halves of the fingerprint.
"The gh-copilot extension has been deprecated.",
"gh-copilot: no commands will be executed.",
# Mixed casing — match is case-insensitive.
"The GH-Copilot Extension HAS BEEN DEPRECATED.",
],
)
def test_genuine_deprecation_variants_match(self, stderr_text: str):
assert _is_gh_copilot_deprecation_message(stderr_text)
@pytest.mark.parametrize(
"stderr_text",
[
# Generic errors — no fingerprint at all.
"Error: connection refused",
"",
# The NEW @github/copilot CLI's repo is github.com/github/copilot-cli.
# Its stderr can legitimately mention "copilot-cli" or "deprecation"
# in unrelated contexts; neither alone should trip the detector.
"copilot-cli: failed to authenticate with the API",
"warning: the --foo flag is scheduled for deprecation in v3",
"See https://github.com/github/copilot-cli/issues for support",
# Half the fingerprint without the other half.
"gh-copilot: command not found",
"extension has been deprecated (some other extension)",
],
)
def test_does_not_false_positive(self, stderr_text: str):
assert not _is_gh_copilot_deprecation_message(stderr_text)
class TestGitHubModelsAzureUrl:
"""Verify that the Azure GitHub Models URL is recognised."""
def test_url_to_provider_contains_azure_models(self):
from agent.model_metadata import _URL_TO_PROVIDER
# Maps to the canonical "copilot" provider (same convention as the
# other GitHub-family entries) — not the "github-models" alias.
assert _URL_TO_PROVIDER.get("models.inference.ai.azure.com") == "copilot"
def test_is_github_models_base_url_recognises_azure(self):
from hermes_cli.models import _is_github_models_base_url
assert _is_github_models_base_url("https://models.inference.ai.azure.com")
assert _is_github_models_base_url("https://models.inference.ai.azure.com/v1/chat")
def test_is_github_models_base_url_still_recognises_github_ai(self):
from hermes_cli.models import _is_github_models_base_url
assert _is_github_models_base_url("https://models.github.ai/inference")


@ -0,0 +1,152 @@
"""Regression test for #4469.
When the agent is actively running (session present in
``adapter._active_sessions``) and the user fires off multiple TEXT
follow-ups in rapid succession, the previous behaviour was a single-slot
replacement at ``gateway/platforms/base.py``:
self._pending_messages[session_key] = event
So three rapid messages ``A``, ``B``, ``C`` arriving while the agent was
still working on the initial turn produced a pending slot containing only
``C``; ``A`` and ``B`` were silently dropped.
The fix routes the follow-up through ``merge_pending_message_event(...,
merge_text=True)`` so TEXT events accumulate into the existing pending
event's text instead of clobbering it. Photo / media bursts continue to
merge through the same helper (they always did).
"""
from __future__ import annotations
import asyncio
import sys
import types
from unittest.mock import AsyncMock, MagicMock
import pytest
# Minimal telegram stub so importing gateway.platforms.base does not pull
# in the real python-telegram-bot dependency.
_tg = sys.modules.get("telegram") or types.ModuleType("telegram")
_tg.constants = sys.modules.get("telegram.constants") or types.ModuleType("telegram.constants")
_ct = MagicMock()
_ct.PRIVATE = "private"
_ct.GROUP = "group"
_ct.SUPERGROUP = "supergroup"
_tg.constants.ChatType = _ct
sys.modules.setdefault("telegram", _tg)
sys.modules.setdefault("telegram.constants", _tg.constants)
sys.modules.setdefault("telegram.ext", types.ModuleType("telegram.ext"))
from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
BasePlatformAdapter,
MessageEvent,
MessageType,
)
from gateway.session import SessionSource, build_session_key
def _make_event(text: str, chat_id: str = "12345") -> MessageEvent:
source = SessionSource(
platform=Platform.TELEGRAM,
chat_id=chat_id,
chat_type="dm",
user_id="u1",
)
return MessageEvent(
text=text,
message_type=MessageType.TEXT,
source=source,
message_id=f"msg-{text[:8]}",
)
def _make_adapter() -> BasePlatformAdapter:
"""Build a BasePlatformAdapter without running its heavy __init__.
We only need the bits ``handle_message`` touches on the active-session
path: ``_active_sessions``, ``_pending_messages``,
``_message_handler``, ``_busy_session_handler``, ``config``, ``platform``.
"""
class _DummyAdapter(BasePlatformAdapter): # type: ignore[misc]
async def connect(self):
pass
async def disconnect(self):
pass
async def get_chat_info(self, chat_id):
return None
async def send(self, *args, **kwargs):
return MagicMock(success=True, message_id="x", retryable=False)
adapter = object.__new__(_DummyAdapter)
adapter.config = PlatformConfig(enabled=True, token="***")
adapter.platform = Platform.TELEGRAM
adapter._message_handler = AsyncMock(return_value=None)
adapter._busy_session_handler = None
adapter._active_sessions = {}
adapter._pending_messages = {}
adapter._session_tasks = {}
adapter._background_tasks = set()
adapter._post_delivery_callbacks = {}
adapter._expected_cancelled_tasks = set()
adapter._fatal_error_code = None
adapter._fatal_error_message = None
adapter._fatal_error_retryable = True
adapter._fatal_error_handler = None
adapter._running = True
adapter._auto_tts_default = False
adapter._auto_tts_enabled_chats = set()
adapter._auto_tts_disabled_chats = set()
adapter._typing_paused = set()
return adapter
@pytest.mark.asyncio
async def test_rapid_text_followups_accumulate_instead_of_replacing():
"""Three rapid TEXT follow-ups during an active session must all
survive in ``adapter._pending_messages[session_key].text``."""
adapter = _make_adapter()
first = _make_event("part one")
session_key = build_session_key(first.source)
# Mark the session as active so subsequent messages take the
# "already running" branch in handle_message.
adapter._active_sessions[session_key] = asyncio.Event()
second = _make_event("part two")
third = _make_event("part three")
await adapter.handle_message(second)
await adapter.handle_message(third)
# Both rapid follow-ups must be preserved, not just the last one.
pending = adapter._pending_messages[session_key]
assert pending.text == "part two\npart three", (
f"expected accumulated text, got {pending.text!r}"
)
# Interrupt event must be signalled exactly like before.
assert adapter._active_sessions[session_key].is_set()
@pytest.mark.asyncio
async def test_single_followup_is_stored_as_is():
"""One TEXT follow-up still lands as the event object itself
(no spurious wrapping / mutation); this guards against the merge path
breaking the simple case."""
adapter = _make_adapter()
first = _make_event("only one")
session_key = build_session_key(first.source)
adapter._active_sessions[session_key] = asyncio.Event()
await adapter.handle_message(first)
pending = adapter._pending_messages[session_key]
assert pending is first
assert pending.text == "only one"
assert adapter._active_sessions[session_key].is_set()
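The difference between single-slot replacement and accumulation can be sketched in isolation. This is a simplified stand-in, not `merge_pending_message_event` from `gateway/platforms/base.py`: the `PendingText` class and `queue_followup` function are hypothetical, and the newline separator is inferred from the expected value `"part two\npart three"` in the test above.

```python
from dataclasses import dataclass


@dataclass
class PendingText:
    """Hypothetical stand-in for a TEXT MessageEvent."""

    text: str


def queue_followup(pending: dict[str, PendingText], session_key: str, event: PendingText) -> None:
    """Accumulate follow-up text instead of overwriting the pending slot."""
    existing = pending.get(session_key)
    if existing is None:
        # First follow-up: store the event object itself, unmodified.
        pending[session_key] = event
        return
    # Later follow-ups: append to the existing text rather than replacing it.
    existing.text = f"{existing.text}\n{event.text}"
```

With the old behaviour (`pending[session_key] = event` on every call), only the last message of a burst would survive; the sketch keeps all of them in arrival order.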


@ -839,3 +839,108 @@ class TestGitHubTokenCheck:
assert "gh auth" in str(call_log) or any(c[0] == "gh" for c in call_log), f"gh not called: {call_log}"
assert "GitHub authenticated via gh CLI" in out or "token configured" in out
def _run_doctor_with_healthy_oauth_fallback(
monkeypatch,
tmp_path,
*,
env_key: str,
bad_key: str,
failing_host: str,
gemini_oauth_status: dict,
minimax_oauth_status: dict,
) -> str:
home = tmp_path / ".hermes"
home.mkdir(parents=True, exist_ok=True)
(home / "config.yaml").write_text(
"model:\n"
" provider: nous\n"
" default: moonshotai/kimi-k2.6\n",
encoding="utf-8",
)
project = tmp_path / "project"
project.mkdir(exist_ok=True)
monkeypatch.setattr(doctor_mod, "HERMES_HOME", home)
monkeypatch.setattr(doctor_mod, "PROJECT_ROOT", project)
monkeypatch.setattr(doctor_mod, "_DHH", str(home))
monkeypatch.setenv(env_key, bad_key)
monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
monkeypatch.delenv("OPENAI_API_KEY", raising=False)
monkeypatch.delenv("GEMINI_API_KEY", raising=False)
monkeypatch.delenv("GOOGLE_API_KEY", raising=False)
monkeypatch.delenv("MINIMAX_API_KEY", raising=False)
monkeypatch.delenv("MINIMAX_CN_API_KEY", raising=False)
monkeypatch.setenv(env_key, bad_key)
fake_model_tools = types.SimpleNamespace(
check_tool_availability=lambda *a, **kw: ([], []),
TOOLSET_REQUIREMENTS={},
)
monkeypatch.setitem(sys.modules, "model_tools", fake_model_tools)
from hermes_cli import auth as _auth_mod
monkeypatch.setattr(_auth_mod, "get_nous_auth_status", lambda: {"logged_in": True})
monkeypatch.setattr(_auth_mod, "get_codex_auth_status", lambda: {})
monkeypatch.setattr(_auth_mod, "get_gemini_oauth_auth_status", lambda: gemini_oauth_status)
monkeypatch.setattr(_auth_mod, "get_minimax_oauth_auth_status", lambda: minimax_oauth_status)
def fake_get(url, headers=None, timeout=None):
status = 401 if failing_host in url else 200
return types.SimpleNamespace(status_code=status)
import httpx
monkeypatch.setattr(httpx, "get", fake_get)
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
doctor_mod.run_doctor(Namespace(fix=False))
return buf.getvalue()
@pytest.mark.parametrize(
("env_key", "bad_key", "failing_host", "gemini_oauth_status", "minimax_oauth_status", "unexpected_issue"),
[
(
"GOOGLE_API_KEY",
"bad-gemini-key",
"googleapis.com",
{"logged_in": True, "email": "user@example.com"},
{},
"Check GOOGLE_API_KEY in .env",
),
(
"MINIMAX_API_KEY",
"bad-minimax-key",
"minimax.io",
{},
{"logged_in": True, "region": "global"},
"Check MINIMAX_API_KEY in .env",
),
],
)
def test_run_doctor_ignores_invalid_direct_keys_when_oauth_fallback_is_healthy(
monkeypatch,
tmp_path,
env_key,
bad_key,
failing_host,
gemini_oauth_status,
minimax_oauth_status,
unexpected_issue,
):
out = _run_doctor_with_healthy_oauth_fallback(
monkeypatch,
tmp_path,
env_key=env_key,
bad_key=bad_key,
failing_host=failing_host,
gemini_oauth_status=gemini_oauth_status,
minimax_oauth_status=minimax_oauth_status,
)
assert "invalid API key" in out
assert unexpected_issue not in out


@ -662,6 +662,129 @@ class TestPluginContext:
from tools.registry import registry
assert "plugin_echo" in registry._tools
def test_register_tool_rejects_shadow_without_override(self, tmp_path, monkeypatch, caplog):
"""Without override=True, registering a tool name claimed by a different toolset is rejected."""
from tools.registry import registry
# Seed an existing entry from a non-plugin toolset.
registry.register(
name="shadow_target",
toolset="terminal",
schema={"name": "shadow_target", "description": "Built-in", "parameters": {"type": "object", "properties": {}}},
handler=lambda args, **kw: "built-in",
)
original_handler = registry._tools["shadow_target"].handler
try:
plugins_dir = tmp_path / "hermes_test" / "plugins"
plugin_dir = plugins_dir / "shadow_plugin"
plugin_dir.mkdir(parents=True)
(plugin_dir / "plugin.yaml").write_text(yaml.dump({"name": "shadow_plugin"}))
(plugin_dir / "__init__.py").write_text(
'def register(ctx):\n'
' ctx.register_tool(\n'
' name="shadow_target",\n'
' toolset="plugin_shadow_plugin",\n'
' schema={"name": "shadow_target", "description": "Plugin", "parameters": {"type": "object", "properties": {}}},\n'
' handler=lambda args, **kw: "plugin",\n'
' )\n'
)
hermes_home = tmp_path / "hermes_test"
(hermes_home / "config.yaml").write_text(
yaml.safe_dump({"plugins": {"enabled": ["shadow_plugin"]}})
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
with caplog.at_level(logging.ERROR, logger="tools.registry"):
mgr = PluginManager()
mgr.discover_and_load()
# Original handler must still be in place — registration was rejected.
assert registry._tools["shadow_target"].handler is original_handler
assert registry._tools["shadow_target"].toolset == "terminal"
# And an ERROR was logged explaining why and how to opt in.
assert any("override=True" in r.message for r in caplog.records)
finally:
registry.deregister("shadow_target")
def test_register_tool_override_replaces_existing(self, tmp_path, monkeypatch, caplog):
"""override=True lets a plugin replace an existing built-in tool."""
from tools.registry import registry
registry.register(
name="override_target",
toolset="terminal",
schema={"name": "override_target", "description": "Built-in", "parameters": {"type": "object", "properties": {}}},
handler=lambda args, **kw: "built-in",
)
try:
plugins_dir = tmp_path / "hermes_test" / "plugins"
plugin_dir = plugins_dir / "override_plugin"
plugin_dir.mkdir(parents=True)
(plugin_dir / "plugin.yaml").write_text(yaml.dump({"name": "override_plugin"}))
(plugin_dir / "__init__.py").write_text(
'def register(ctx):\n'
' ctx.register_tool(\n'
' name="override_target",\n'
' toolset="plugin_override_plugin",\n'
' schema={"name": "override_target", "description": "Plugin", "parameters": {"type": "object", "properties": {}}},\n'
' handler=lambda args, **kw: "plugin",\n'
' override=True,\n'
' )\n'
)
hermes_home = tmp_path / "hermes_test"
(hermes_home / "config.yaml").write_text(
yaml.safe_dump({"plugins": {"enabled": ["override_plugin"]}})
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
with caplog.at_level(logging.INFO, logger="tools.registry"):
mgr = PluginManager()
mgr.discover_and_load()
# Plugin handler replaced the built-in one.
assert registry._tools["override_target"].toolset == "plugin_override_plugin"
assert registry._tools["override_target"].handler({}) == "plugin"
# Override is audit-logged at INFO.
assert any(
"overriding existing" in r.message and "override_target" in r.message
for r in caplog.records
)
# Plugin tracks it.
assert "override_target" in mgr._plugin_tool_names
finally:
registry.deregister("override_target")
def test_register_tool_override_on_new_name_is_noop_path(self, tmp_path, monkeypatch):
"""override=True on a brand-new name still registers cleanly (no existing entry to replace)."""
from tools.registry import registry
plugins_dir = tmp_path / "hermes_test" / "plugins"
plugin_dir = plugins_dir / "new_override_plugin"
plugin_dir.mkdir(parents=True)
(plugin_dir / "plugin.yaml").write_text(yaml.dump({"name": "new_override_plugin"}))
(plugin_dir / "__init__.py").write_text(
'def register(ctx):\n'
' ctx.register_tool(\n'
' name="brand_new_override_tool",\n'
' toolset="plugin_new_override_plugin",\n'
' schema={"name": "brand_new_override_tool", "description": "New", "parameters": {"type": "object", "properties": {}}},\n'
' handler=lambda args, **kw: "ok",\n'
' override=True,\n'
' )\n'
)
hermes_home = tmp_path / "hermes_test"
(hermes_home / "config.yaml").write_text(
yaml.safe_dump({"plugins": {"enabled": ["new_override_plugin"]}})
)
monkeypatch.setenv("HERMES_HOME", str(hermes_home))
try:
mgr = PluginManager()
mgr.discover_and_load()
assert "brand_new_override_tool" in registry._tools
finally:
registry.deregister("brand_new_override_tool")
# ── TestPluginToolVisibility ───────────────────────────────────────────────
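
Reviewer note: the two override tests above rely on a registry that lets a later registration replace an earlier one only when `override=True`, and that logs the replacement at INFO. The following is a minimal sketch of that behavior, not the code in `tools.registry`. The class name `ToolRegistrySketch`, the logger name, and the choice to raise `ValueError` on a duplicate without `override` are all assumptions made for illustration.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("registry_sketch")  # hypothetical logger name


class ToolRegistrySketch:
    """Hypothetical stand-in for a tool registry with an override flag."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, handler: Callable[..., Any], override: bool = False) -> None:
        if name in self._tools:
            if not override:
                # Assumption: duplicates without override are rejected.
                raise ValueError(f"tool {name!r} is already registered")
            # Audit trail for replacements, mirroring the log message the test greps for.
            logger.info("overriding existing tool %s", name)
        # A brand-new name with override=True simply falls through to here.
        self._tools[name] = handler

    def deregister(self, name: str) -> None:
        self._tools.pop(name, None)
```

With this shape, `override=True` on an unused name takes the ordinary registration path, which is the case the second test covers.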

@ -2269,6 +2269,60 @@ class TestParallelScopePathNormalization:
assert not _should_parallelize_tool_batch([tc1, tc2])
class TestMcpParallelToolBatch:
"""Integration test: _should_parallelize_tool_batch respects MCP parallel flag."""
def test_mcp_tools_default_sequential(self):
"""MCP tools without supports_parallel_tool_calls are sequential."""
from run_agent import _should_parallelize_tool_batch
tc1 = _mock_tool_call(name="mcp_github_list_repos", arguments='{"org":"openai"}', call_id="c1")
tc2 = _mock_tool_call(name="mcp_github_search_code", arguments='{"q":"test"}', call_id="c2")
assert not _should_parallelize_tool_batch([tc1, tc2])
def test_mcp_tools_parallel_when_server_opted_in(self):
"""MCP tools from a parallel-safe server can run concurrently."""
from run_agent import _should_parallelize_tool_batch
from tools.mcp_tool import _parallel_safe_servers, _lock
with _lock:
_parallel_safe_servers.add("github")
try:
tc1 = _mock_tool_call(name="mcp_github_list_repos", arguments='{"org":"openai"}', call_id="c1")
tc2 = _mock_tool_call(name="mcp_github_search_code", arguments='{"q":"test"}', call_id="c2")
assert _should_parallelize_tool_batch([tc1, tc2])
finally:
with _lock:
_parallel_safe_servers.discard("github")
def test_mixed_mcp_and_builtin_parallel(self):
"""MCP parallel tools mixed with built-in parallel-safe tools."""
from run_agent import _should_parallelize_tool_batch
from tools.mcp_tool import _parallel_safe_servers, _lock
with _lock:
_parallel_safe_servers.add("docs")
try:
tc1 = _mock_tool_call(name="mcp_docs_search", arguments='{"query":"api"}', call_id="c1")
tc2 = _mock_tool_call(name="web_search", arguments='{"query":"test"}', call_id="c2")
assert _should_parallelize_tool_batch([tc1, tc2])
finally:
with _lock:
_parallel_safe_servers.discard("docs")
def test_mixed_parallel_and_serial_mcp_servers(self):
"""One parallel MCP server + one non-parallel MCP server = sequential."""
from run_agent import _should_parallelize_tool_batch
from tools.mcp_tool import _parallel_safe_servers, _lock
with _lock:
_parallel_safe_servers.add("docs")
# "github" is NOT in _parallel_safe_servers
try:
tc1 = _mock_tool_call(name="mcp_docs_search", arguments='{"query":"api"}', call_id="c1")
tc2 = _mock_tool_call(name="mcp_github_list_repos", arguments='{"org":"openai"}', call_id="c2")
assert not _should_parallelize_tool_batch([tc1, tc2])
finally:
with _lock:
_parallel_safe_servers.discard("docs")
class TestHandleMaxIterations:
def test_returns_summary(self, agent):
resp = _mock_response(content="Here is a summary of what I did.")

@ -999,6 +999,88 @@ class TestAnthropicStreamCallbacks:
assert touch_calls.count("receiving stream response") == len(events)
@patch("run_agent.AIAgent._replace_primary_openai_client")
def test_anthropic_stream_parser_valueerror_retries_before_delivery(
self, mock_replace, monkeypatch,
):
"""Malformed Anthropic event-stream frames retry instead of surfacing HTTP None."""
from run_agent import AIAgent
agent = AIAgent(
api_key="test-key",
base_url="https://api.minimax.io/anthropic",
provider="minimax",
model="MiniMax-M2.7",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
agent.api_mode = "anthropic_messages"
agent._interrupt_requested = False
monkeypatch.setenv("HERMES_STREAM_RETRIES", "1")
class _BadStream:
response = None
def __enter__(self):
return self
def __exit__(self, *_args):
return False
def __iter__(self):
raise ValueError("expected ident at line 1 column 149")
final_message = SimpleNamespace(content=[], stop_reason="end_turn")
good_stream = MagicMock()
good_stream.__enter__ = MagicMock(return_value=good_stream)
good_stream.__exit__ = MagicMock(return_value=False)
good_stream.__iter__ = MagicMock(return_value=iter([]))
good_stream.get_final_message.return_value = final_message
agent._anthropic_client = MagicMock()
agent._anthropic_client.messages.stream.side_effect = [
_BadStream(),
good_stream,
]
response = agent._interruptible_streaming_api_call({})
assert response is final_message
assert agent._anthropic_client.messages.stream.call_count == 2
assert mock_replace.call_count == 1
@patch("run_agent.AIAgent._replace_primary_openai_client")
def test_generic_anthropic_valueerror_still_propagates_without_stream_retry(
self, mock_replace, monkeypatch,
):
"""Only known provider stream parser ValueErrors are treated as transient."""
from run_agent import AIAgent
agent = AIAgent(
api_key="test-key",
base_url="https://api.minimax.io/anthropic",
provider="minimax",
model="MiniMax-M2.7",
quiet_mode=True,
skip_context_files=True,
skip_memory=True,
)
agent.api_mode = "anthropic_messages"
agent._interrupt_requested = False
monkeypatch.setenv("HERMES_STREAM_RETRIES", "1")
agent._anthropic_client = MagicMock()
agent._anthropic_client.messages.stream.side_effect = ValueError(
"invalid local request shape"
)
with pytest.raises(ValueError, match="invalid local request shape"):
agent._interruptible_streaming_api_call({})
assert agent._anthropic_client.messages.stream.call_count == 1
assert mock_replace.call_count == 0
class TestPartialToolCallWarning:
"""Regression: when a stream dies mid tool-call argument generation after
@ -1504,4 +1586,3 @@ class TestCopilotACPStreamingDecision:
_use_streaming = False
assert _use_streaming is True
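
Reviewer note: the two Anthropic stream tests above distinguish a malformed event-stream frame (retried once, with the client replaced) from an ordinary `ValueError` (propagated immediately). How `run_agent` makes that distinction is not visible in this diff. The sketch below shows one plausible approach, matching on a message marker. The marker tuple, the function names, and the `on_retry` hook are hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical marker list; the test only shows "expected ident ..." being retried.
TRANSIENT_PARSER_MARKERS = ("expected ident",)


def is_transient_stream_parse_error(exc: Exception) -> bool:
    """True only for ValueErrors that look like a provider stream-parser failure."""
    return isinstance(exc, ValueError) and any(
        marker in str(exc) for marker in TRANSIENT_PARSER_MARKERS
    )


def call_with_stream_retry(
    attempt: Callable[[], T],
    retries: int = 1,
    on_retry: Callable[[], None] = lambda: None,
) -> T:
    """Run attempt(); retry up to `retries` times on transient parser errors."""
    for tries_left in range(retries, -1, -1):
        try:
            return attempt()
        except ValueError as exc:
            # Unknown ValueErrors, or an exhausted budget, propagate unchanged.
            if tries_left == 0 or not is_transient_stream_parse_error(exc):
                raise
            on_retry()  # e.g. rebuild the client before the next attempt
    raise AssertionError("unreachable for retries >= 0")
```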

@ -0,0 +1,102 @@
"""
Smoke tests for the darwinian-evolver optional skill.
We can't actually run the evolution loop in CI (it needs network + a paid LLM),
so these tests verify:
- SKILL.md frontmatter conforms to the hardline format
- shipped scripts parse as valid Python
- the scripts reference the right env var / module paths
"""
from __future__ import annotations
import ast
import re
from pathlib import Path
import pytest
import yaml
SKILL_DIR = Path(__file__).resolve().parents[2] / "optional-skills" / "research" / "darwinian-evolver"
@pytest.fixture(scope="module")
def frontmatter() -> dict:
src = (SKILL_DIR / "SKILL.md").read_text()
m = re.search(r"^---\n(.*?)\n---", src, re.DOTALL)
assert m, "SKILL.md missing YAML frontmatter"
return yaml.safe_load(m.group(1))
def test_skill_dir_exists() -> None:
assert SKILL_DIR.is_dir(), f"missing skill dir: {SKILL_DIR}"
def test_skill_md_present() -> None:
assert (SKILL_DIR / "SKILL.md").is_file()
def test_description_under_60_chars(frontmatter) -> None:
desc = frontmatter["description"]
assert len(desc) <= 60, f"description is {len(desc)} chars (hardline ≤60): {desc!r}"
def test_name_matches_dir(frontmatter) -> None:
assert frontmatter["name"] == "darwinian-evolver"
def test_platforms_excludes_windows(frontmatter) -> None:
# Upstream uses func_timeout (POSIX signals) and uv subprocess pipelines; the
# skill is gated [linux, macos]. If we ever port to Windows, update this test
# to assert ["linux", "macos", "windows"].
assert "windows" not in frontmatter["platforms"]
assert set(frontmatter["platforms"]) >= {"linux", "macos"}
def test_author_credits_contributor(frontmatter) -> None:
author = frontmatter["author"]
assert "Bihruze" in author, f"author should credit the original contributor: {author!r}"
def test_license_mit(frontmatter) -> None:
assert frontmatter["license"] == "MIT"
@pytest.mark.parametrize(
"path",
[
"scripts/parrot_openrouter.py",
"scripts/show_snapshot.py",
"templates/custom_problem_template.py",
],
)
def test_shipped_scripts_parse(path: str) -> None:
src = (SKILL_DIR / path).read_text()
ast.parse(src) # raises SyntaxError on broken Python
def test_parrot_script_uses_openrouter() -> None:
src = (SKILL_DIR / "scripts" / "parrot_openrouter.py").read_text()
assert "OPENROUTER_API_KEY" in src, "parrot driver should read OPENROUTER_API_KEY"
assert "openrouter.ai/api/v1" in src, "parrot driver should target OpenRouter"
assert "EVOLVER_MODEL" in src, "model should be overridable via EVOLVER_MODEL"
def test_parrot_script_has_error_swallowing() -> None:
"""Provider content-filter / rate-limit must not kill the run — see Pitfall 2."""
src = (SKILL_DIR / "scripts" / "parrot_openrouter.py").read_text()
assert "LLM_ERROR" in src, "_prompt_llm should swallow provider errors and tag them"
def test_skill_calls_out_agpl(frontmatter) -> None:
"""The upstream tool is AGPL-3.0. The skill MUST flag this so users don't
import it into MIT-licensed code by accident."""
src = (SKILL_DIR / "SKILL.md").read_text()
assert "AGPL" in src, "SKILL.md must mention upstream AGPL license"
def test_skill_pitfalls_section_present() -> None:
src = (SKILL_DIR / "SKILL.md").read_text()
assert "## Pitfalls" in src
# Pitfalls we discovered during the spike — keep them in sync with reality.
assert "Initial organism must be viable" in src
assert "generator" in src # loop.run() pitfall

@ -0,0 +1,137 @@
"""Tests for `_sanitize_tool_error` in model_tools.
Ported from ironclaw#1639 — defense-in-depth on tool exception strings before
they enter the model's `tool` message content. Note that `json.dumps()` in
`handle_function_call` already handles quote/backslash escaping at the wire
layer; this helper exists to strip structural framing tokens the model
itself might react to (XML role tags, CDATA, markdown code fences) and to
cap pathological lengths.
"""
from __future__ import annotations
from model_tools import _sanitize_tool_error, _TOOL_ERROR_MAX_LEN
class TestRoleTagStripping:
def test_strips_tool_call_tags(self):
out = _sanitize_tool_error("bad <tool_call>injected</tool_call> happened")
assert "<tool_call>" not in out
assert "</tool_call>" not in out
assert "bad injected happened" in out
def test_strips_function_call_tags(self):
out = _sanitize_tool_error("<function_call>x</function_call>")
assert "<function_call>" not in out
assert "</function_call>" not in out
def test_strips_role_tags(self):
# Each of these should be stripped
for tag in ("system", "assistant", "user", "result", "response", "output", "input"):
raw = f"prefix <{tag}>hi</{tag}> suffix"
out = _sanitize_tool_error(raw)
assert f"<{tag}>" not in out, f"failed to strip <{tag}>"
assert f"</{tag}>" not in out, f"failed to strip </{tag}>"
def test_role_tag_strip_is_case_insensitive(self):
out = _sanitize_tool_error("<TOOL_CALL>x</Tool_Call>")
assert "<" not in out.replace("[TOOL_ERROR]", "") # only the prefix bracket survives
def test_unrelated_xml_kept(self):
# We intentionally only strip the role-like tag whitelist, not all XML
out = _sanitize_tool_error("Error parsing <ParseError>line 5</ParseError>")
assert "<ParseError>" in out
class TestCDATAStripping:
def test_strips_cdata(self):
out = _sanitize_tool_error("error: <![CDATA[malicious]]> here")
assert "<![CDATA[" not in out
assert "]]>" not in out
def test_strips_multiline_cdata(self):
out = _sanitize_tool_error("a\n<![CDATA[line1\nline2]]>\nb")
assert "CDATA" not in out
assert "a" in out and "b" in out
class TestCodeFenceStripping:
def test_strips_leading_fence_with_lang(self):
out = _sanitize_tool_error("```json\n{\"x\": 1}")
assert not out.replace("[TOOL_ERROR] ", "").startswith("```")
def test_strips_trailing_fence(self):
out = _sanitize_tool_error("payload\n```")
assert not out.rstrip().endswith("```")
def test_strips_bare_fence(self):
out = _sanitize_tool_error("```\nstuff")
assert "```" not in out.split("\n")[0]
class TestTruncation:
def test_caps_long_input(self):
long = "A" * (_TOOL_ERROR_MAX_LEN * 2)
out = _sanitize_tool_error(long)
# Total length is prefix + truncated body
body = out[len("[TOOL_ERROR] "):]
assert len(body) == _TOOL_ERROR_MAX_LEN
assert body.endswith("...")
def test_does_not_truncate_short_input(self):
msg = "short error"
out = _sanitize_tool_error(msg)
assert "..." not in out
assert msg in out
class TestEnvelope:
def test_wraps_with_prefix(self):
out = _sanitize_tool_error("oh no")
assert out.startswith("[TOOL_ERROR] ")
def test_empty_input(self):
out = _sanitize_tool_error("")
assert out == "[TOOL_ERROR] "
def test_preserves_normal_error_text(self):
msg = "Error executing read_file: FileNotFoundError: /tmp/missing"
out = _sanitize_tool_error(msg)
assert msg in out
class TestHandleFunctionCallIntegration:
"""Verify handle_function_call routes exception-path errors through the sanitizer.
Note: the "Unknown tool: ..." early-return in tools/registry.py is a
*different* code path from `except Exception` in handle_function_call;
that one returns directly without sanitization (and there's nothing to
sanitize in a hardcoded format string anyway). This test exercises the
real exception path by passing args that make a known tool raise.
"""
def test_exception_path_error_is_sanitized(self):
import json
from model_tools import handle_function_call
from tools.registry import registry as _registry
# Force a known tool to raise with a payload containing role tags.
def boom(_args, **_kwargs):
raise RuntimeError("<tool_call>injected</tool_call> boom")
all_tools = _registry.get_all_tool_names()
assert all_tools, "no tools registered — test environment broken"
target = all_tools[0]
original = _registry._tools[target].handler
_registry._tools[target].handler = boom
try:
result_str = handle_function_call(target, {})
finally:
_registry._tools[target].handler = original
payload = json.loads(result_str)
assert "error" in payload, payload
assert payload["error"].startswith("[TOOL_ERROR] "), payload["error"]
# Role-tag stripping carried through
assert "<tool_call>" not in payload["error"]
assert "</tool_call>" not in payload["error"]
assert "boom" in payload["error"]
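
Reviewer note: for readers who want to see what the tests in this file pin down, here is a minimal sketch of a sanitizer that would satisfy them. It is not the `_sanitize_tool_error` implementation from `model_tools`. The name `sanitize_tool_error_sketch`, the `MAX_LEN` value, and the decision to drop a CDATA section together with its contents are assumptions.

```python
import re

MAX_LEN = 2000  # hypothetical cap; the real _TOOL_ERROR_MAX_LEN is not shown in this diff
PREFIX = "[TOOL_ERROR] "

# Role-like tags listed in the tests; other XML such as <ParseError> is left alone.
_ROLE_TAGS = ("tool_call", "function_call", "system", "assistant", "user",
              "result", "response", "output", "input")
_ROLE_TAG_RE = re.compile(r"</?(?:%s)>" % "|".join(_ROLE_TAGS), re.IGNORECASE)
_CDATA_RE = re.compile(r"<!\[CDATA\[.*?\]\]>", re.DOTALL)
_LEADING_FENCE_RE = re.compile(r"\A\s*```[A-Za-z0-9_+-]*[ \t]*\n?")
_TRAILING_FENCE_RE = re.compile(r"\n?```\s*\Z")


def sanitize_tool_error_sketch(message: str) -> str:
    """Strip framing tokens, cap the length, and wrap with a fixed prefix."""
    text = _CDATA_RE.sub("", message)        # assumption: drop the whole CDATA section
    text = _ROLE_TAG_RE.sub("", text)        # remove only the tags, keep inner text
    text = _LEADING_FENCE_RE.sub("", text)   # leading ``` or ```lang
    text = _TRAILING_FENCE_RE.sub("", text)  # trailing ```
    if len(text) > MAX_LEN:
        # Body ends up exactly MAX_LEN characters, including the ellipsis.
        text = text[: MAX_LEN - 3] + "..."
    return PREFIX + text
```

Stripping only an allowlisted set of tags keeps genuine diagnostics such as `<ParseError>` readable, which is the trade-off `test_unrelated_xml_kept` guards.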

@ -1102,3 +1102,206 @@ class TestDetectSudoStdin:
"make 2>&1 | tee build.log"
)
assert is_dangerous is False
class TestMacOSPrivateSystemPaths:
"""Inspired by Claude Code 2.1.113 "dangerous path protection".
On macOS, /etc, /var, /tmp, /home are symlinks to
/private/{etc,var,tmp,home}. A command that writes to
/private/etc/sudoers works identically to /etc/sudoers but bypasses
a plain "/etc/" pattern check. These tests guard the shared
_SYSTEM_CONFIG_PATH fragment used across redirect / tee / cp / mv /
install / sed -i patterns.
"""
def test_private_etc_redirect(self):
dangerous, _, desc = detect_dangerous_command(
"echo 'root ALL=NOPASSWD: ALL' > /private/etc/sudoers"
)
assert dangerous is True
assert "system config" in desc.lower()
def test_private_var_redirect(self):
dangerous, _, _ = detect_dangerous_command(
"echo payload > /private/var/db/dslocal/nodes/x"
)
assert dangerous is True
def test_private_etc_via_tee(self):
dangerous, _, desc = detect_dangerous_command(
"echo malicious | tee /private/etc/hosts"
)
assert dangerous is True
assert "tee" in desc.lower() or "system" in desc.lower()
def test_private_etc_cp(self):
dangerous, _, desc = detect_dangerous_command(
"cp malicious.conf /private/etc/hosts"
)
assert dangerous is True
assert "copy" in desc.lower() or "system config" in desc.lower()
def test_private_etc_mv(self):
dangerous, _, _ = detect_dangerous_command(
"mv evil /private/etc/ssh/sshd_config"
)
assert dangerous is True
def test_private_etc_install(self):
dangerous, _, _ = detect_dangerous_command(
"install -m 600 key /private/etc/ssh/keys"
)
assert dangerous is True
def test_private_etc_sed_in_place(self):
dangerous, _, desc = detect_dangerous_command(
"sed -i 's/root/pwned/' /private/etc/passwd"
)
assert dangerous is True
assert "in-place" in desc.lower() or "system config" in desc.lower()
def test_private_var_sed_long_flag(self):
dangerous, _, _ = detect_dangerous_command(
"sed --in-place 's/x/y/' /private/var/log/wtmp"
)
assert dangerous is True
def test_private_tmp_cp(self):
dangerous, _, _ = detect_dangerous_command(
"cp rootkit /private/tmp/payload"
)
assert dangerous is True
def test_ls_private_is_safe(self):
"""Reading under /private/ must not trigger approval."""
dangerous, _, _ = detect_dangerous_command("ls /private")
assert dangerous is False
def test_echo_mentioning_private_path_is_safe(self):
"""Literal mention of /private/etc in an echo string must not fire."""
dangerous, _, _ = detect_dangerous_command(
"echo 'the macOS path is /private/etc on disk'"
)
assert dangerous is False
class TestKillallKillSignals:
"""Inspired by Claude Code 2.1.113 expanded deny rules.
The existing pattern caught `pkill -9` but not the equivalent
`killall -9` / `-KILL` / `-s KILL` / `-r <regex>` broad sweeps that
can wipe out unrelated processes.
"""
def test_killall_dash_9(self):
dangerous, _, desc = detect_dangerous_command("killall -9 firefox")
assert dangerous is True
assert "kill" in desc.lower()
def test_killall_dash_kill(self):
dangerous, _, _ = detect_dangerous_command("killall -KILL firefox")
assert dangerous is True
def test_killall_dash_sigkill(self):
dangerous, _, _ = detect_dangerous_command("killall -SIGKILL firefox")
assert dangerous is True
def test_killall_dash_s_kill(self):
dangerous, _, _ = detect_dangerous_command("killall -s KILL firefox")
assert dangerous is True
def test_killall_dash_s_signum(self):
dangerous, _, _ = detect_dangerous_command("killall -s 9 firefox")
assert dangerous is True
def test_killall_regex(self):
"""killall -r <regex> is a broad sweep; require approval."""
dangerous, _, desc = detect_dangerous_command("killall -r 'fire.*'")
assert dangerous is True
assert "regex" in desc.lower() or "kill" in desc.lower()
def test_killall_combined_flags(self):
dangerous, _, _ = detect_dangerous_command("killall -9 -r 'herm.*'")
assert dangerous is True
def test_killall_list_signals_is_safe(self):
"""`killall -l` lists signals and is harmless — must not fire."""
dangerous, _, _ = detect_dangerous_command("killall -l")
assert dangerous is False
def test_killall_version_is_safe(self):
dangerous, _, _ = detect_dangerous_command("killall -V")
assert dangerous is False
class TestFindExecdir:
"""Inspired by Claude Code 2.1.113 tightening of find rules.
`find -execdir rm` has the same destructive effect as `find -exec rm`
but runs in each match's directory. Previously missed because the
pattern required a literal `-exec ` followed by a space.
"""
def test_find_execdir_rm(self):
dangerous, _, desc = detect_dangerous_command(
"find . -execdir rm {} \\;"
)
assert dangerous is True
assert "find" in desc.lower() or "rm" in desc.lower()
def test_find_execdir_with_absolute_rm(self):
dangerous, _, _ = detect_dangerous_command(
"find /var -execdir /bin/rm -rf {} \\;"
)
assert dangerous is True
def test_find_exec_rm_still_caught(self):
"""Original -exec pattern must still fire (regression guard)."""
dangerous, _, _ = detect_dangerous_command(
"find . -exec rm {} \\;"
)
assert dangerous is True
def test_find_execdir_ls_is_safe(self):
"""-execdir with a read-only command is not dangerous."""
dangerous, _, _ = detect_dangerous_command(
"find . -execdir ls {} \\;"
)
assert dangerous is False
class TestEtcPatternsUnaffectedByRefactor:
"""Regression guard: the /etc/ patterns were refactored to share the
_SYSTEM_CONFIG_PATH fragment with the /private/ mirror. Make sure the
existing /etc/ coverage remains identical.
"""
def test_etc_redirect(self):
dangerous, _, _ = detect_dangerous_command("echo x > /etc/hosts")
assert dangerous is True
def test_etc_cp(self):
dangerous, _, _ = detect_dangerous_command("cp evil /etc/hosts")
assert dangerous is True
def test_etc_sed_inline(self):
dangerous, _, _ = detect_dangerous_command(
"sed -i 's/a/b/' /etc/hosts"
)
assert dangerous is True
def test_etc_tee(self):
dangerous, _, _ = detect_dangerous_command(
"echo x | tee /etc/hosts"
)
assert dangerous is True
def test_cat_etc_hostname_is_safe(self):
"""Reading /etc/ files is safe — only writes require approval."""
dangerous, _, _ = detect_dangerous_command("cat /etc/hostname")
assert dangerous is False
def test_grep_etc_passwd_is_safe(self):
dangerous, _, _ = detect_dangerous_command("grep root /etc/passwd")
assert dangerous is False
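
Reviewer note: the docstrings above describe a shared `_SYSTEM_CONFIG_PATH` regex fragment reused across the redirect, tee, cp/mv/install, and sed patterns. The real fragment is not part of this diff, so the following is only a sketch of the idea. The fragment contents, the pattern list, the descriptions, and the function name `detect_system_config_write` are assumptions, and the patterns are deliberately coarse (for example, they also flag `cp /etc/hosts ./backup`).

```python
import re
from typing import Optional, Tuple

# Assumption: /etc plus the macOS /private mirrors named in the docstring above.
SYSTEM_CONFIG_PATH = r"(?:/etc/|/private/(?:etc|var|tmp|home)/)"

# Every write-style pattern reuses the same path fragment, so adding a new
# protected root only requires touching SYSTEM_CONFIG_PATH.
_WRITE_PATTERNS = [
    (re.compile(r">>?\s*" + SYSTEM_CONFIG_PATH),
     "redirect into system config path"),
    (re.compile(r"\btee\b[^|;&]*\s" + SYSTEM_CONFIG_PATH),
     "tee into system config path"),
    (re.compile(r"\b(?:cp|mv|install)\b[^|;&]*\s" + SYSTEM_CONFIG_PATH),
     "copy/move into system config path"),
    (re.compile(r"\bsed\b[^|;&]*\s(?:-i|--in-place)\b[^|;&]*\s" + SYSTEM_CONFIG_PATH),
     "in-place edit of system config path"),
]


def detect_system_config_write(command: str) -> Tuple[bool, Optional[str]]:
    """Return (is_dangerous, description) for write-style commands only."""
    for pattern, description in _WRITE_PATTERNS:
        if pattern.search(command):
            return True, description
    return False, None
```

Read-only commands such as `cat /etc/hostname` never match because no pattern is keyed on them, and the fragment's trailing slash keeps a bare mention like `/private/etc on disk` from matching.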

@ -890,6 +890,63 @@ class TestDelegationCredentialResolution(unittest.TestCase):
self.assertEqual(creds["api_key"], "local-key")
self.assertEqual(creds["api_mode"], "chat_completions")
def test_direct_endpoint_auto_detects_anthropic_messages_suffix(self):
# Issue #10213: Azure AI Foundry exposes Anthropic-compatible models at
# a /anthropic URL suffix. Subagents must pick anthropic_messages
# automatically, matching the main agent's runtime resolver.
parent = _make_mock_parent(depth=0)
cfg = {
"model": "claude-opus-4-6",
"provider": "custom",
"base_url": "https://myfoundry.services.ai.azure.com/anthropic",
"api_key": "foundry-key",
}
creds = _resolve_delegation_credentials(cfg, parent)
self.assertEqual(creds["provider"], "custom")
self.assertEqual(creds["base_url"], "https://myfoundry.services.ai.azure.com/anthropic")
self.assertEqual(creds["api_key"], "foundry-key")
self.assertEqual(creds["api_mode"], "anthropic_messages")
def test_direct_endpoint_honors_explicit_api_mode(self):
# When delegation.api_mode is set explicitly, it overrides URL-based
# detection so users can force a transport on non-standard endpoints.
parent = _make_mock_parent(depth=0)
cfg = {
"model": "claude-opus-4-6",
"provider": "custom",
"base_url": "https://proxy.example.com/v1",
"api_key": "proxy-key",
"api_mode": "anthropic_messages",
}
creds = _resolve_delegation_credentials(cfg, parent)
self.assertEqual(creds["api_mode"], "anthropic_messages")
def test_direct_endpoint_explicit_api_mode_overrides_url_detection(self):
# Explicit api_mode in config always wins over auto-detection.
parent = _make_mock_parent(depth=0)
cfg = {
"model": "claude-opus-4-6",
"provider": "custom",
"base_url": "https://myfoundry.services.ai.azure.com/anthropic",
"api_key": "foundry-key",
"api_mode": "chat_completions",
}
creds = _resolve_delegation_credentials(cfg, parent)
self.assertEqual(creds["api_mode"], "chat_completions")
def test_direct_endpoint_invalid_api_mode_falls_back_to_detection(self):
# An invalid api_mode string must not break detection; fall back to URL heuristic.
parent = _make_mock_parent(depth=0)
cfg = {
"model": "claude-opus-4-6",
"provider": "custom",
"base_url": "https://myfoundry.services.ai.azure.com/anthropic",
"api_key": "foundry-key",
"api_mode": "garbage",
}
creds = _resolve_delegation_credentials(cfg, parent)
self.assertEqual(creds["api_mode"], "anthropic_messages")
def test_direct_endpoint_returns_none_api_key_when_not_configured(self):
# When base_url is set without api_key, api_key should be None so
# _build_child_agent inherits the parent's key (effective_api_key = override or parent).
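
Reviewer note: the four `api_mode` tests in this hunk imply a precedence order: a valid explicit `api_mode` wins, an invalid one is ignored, and otherwise a `/anthropic` URL suffix selects `anthropic_messages`. Below is a minimal sketch of that precedence. It is not `_resolve_delegation_credentials`. The function name, the `VALID_API_MODES` set (likely only a subset of what the project accepts), and the `chat_completions` default are assumptions.

```python
from typing import Optional

# Assumed subset of accepted transports; the real list is not shown in this diff.
VALID_API_MODES = {"chat_completions", "anthropic_messages"}


def resolve_api_mode_sketch(base_url: str, explicit: Optional[str] = None) -> str:
    """Pick a transport: explicit config first, then URL heuristic, then default."""
    if explicit in VALID_API_MODES:
        return explicit  # explicit, valid config always wins
    # Invalid or missing api_mode falls through to URL-based detection.
    if base_url.rstrip("/").endswith("/anthropic"):
        return "anthropic_messages"
    return "chat_completions"  # assumed default for other direct endpoints
```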

@ -3762,3 +3762,135 @@ class TestRegisterMcpServers:
)
_servers.pop("srv", None)
# ---------------------------------------------------------------------------
# Tests for parallel tool call support (port from openai/codex#17667)
# ---------------------------------------------------------------------------
class TestMcpParallelToolCalls:
"""Tests for the supports_parallel_tool_calls config option."""
def test_is_mcp_tool_parallel_safe_non_mcp_tool(self):
"""Non-MCP tool names always return False."""
from tools.mcp_tool import is_mcp_tool_parallel_safe
assert is_mcp_tool_parallel_safe("web_search") is False
assert is_mcp_tool_parallel_safe("read_file") is False
assert is_mcp_tool_parallel_safe("terminal") is False
assert is_mcp_tool_parallel_safe("") is False
def test_is_mcp_tool_parallel_safe_no_servers(self):
"""MCP tool from unknown server returns False."""
from tools.mcp_tool import is_mcp_tool_parallel_safe, _parallel_safe_servers, _lock
with _lock:
_parallel_safe_servers.clear()
assert is_mcp_tool_parallel_safe("mcp_docs_search") is False
def test_is_mcp_tool_parallel_safe_with_flag(self):
"""MCP tool from a parallel-safe server returns True."""
from tools.mcp_tool import is_mcp_tool_parallel_safe, _parallel_safe_servers, _lock
with _lock:
_parallel_safe_servers.add("docs")
try:
assert is_mcp_tool_parallel_safe("mcp_docs_search") is True
assert is_mcp_tool_parallel_safe("mcp_docs_read_file") is True
# Different server should be False
assert is_mcp_tool_parallel_safe("mcp_github_list_repos") is False
finally:
with _lock:
_parallel_safe_servers.discard("docs")
def test_is_mcp_tool_parallel_safe_server_with_underscores(self):
"""Server names containing underscores are correctly matched."""
from tools.mcp_tool import is_mcp_tool_parallel_safe, _parallel_safe_servers, _lock
with _lock:
_parallel_safe_servers.add("my_server")
try:
assert is_mcp_tool_parallel_safe("mcp_my_server_query") is True
finally:
with _lock:
_parallel_safe_servers.discard("my_server")
def test_is_mcp_tool_parallel_safe_no_tool_suffix(self):
"""Tool name that is just 'mcp_{server}' without a tool part returns False."""
from tools.mcp_tool import is_mcp_tool_parallel_safe, _parallel_safe_servers, _lock
with _lock:
_parallel_safe_servers.add("docs")
try:
# "mcp_docs" has no tool part after the server name
assert is_mcp_tool_parallel_safe("mcp_docs") is False
# "mcp_docs_" has empty tool part
assert is_mcp_tool_parallel_safe("mcp_docs_") is False
finally:
with _lock:
_parallel_safe_servers.discard("docs")
def test_register_mcp_servers_tracks_parallel_flag(self):
"""register_mcp_servers populates _parallel_safe_servers from config."""
from tools.mcp_tool import (
register_mcp_servers, _parallel_safe_servers, _lock,
sanitize_mcp_name_component,
)
fake_config = {
"parallel_srv": {
"command": "echo",
"supports_parallel_tool_calls": True,
},
"serial_srv": {
"command": "echo",
"supports_parallel_tool_calls": False,
},
"default_srv": {
"command": "echo",
# no supports_parallel_tool_calls key
},
}
with patch("tools.mcp_tool._MCP_AVAILABLE", True), \
patch("tools.mcp_tool._ensure_mcp_loop"), \
patch("tools.mcp_tool._run_on_mcp_loop"), \
patch("tools.mcp_tool._existing_tool_names", return_value=[]):
register_mcp_servers(fake_config)
with _lock:
assert sanitize_mcp_name_component("parallel_srv") in _parallel_safe_servers
assert sanitize_mcp_name_component("serial_srv") not in _parallel_safe_servers
assert sanitize_mcp_name_component("default_srv") not in _parallel_safe_servers
# Cleanup
_parallel_safe_servers.discard(sanitize_mcp_name_component("parallel_srv"))
def test_register_mcp_servers_removes_parallel_flag_on_toggle(self):
"""Toggling supports_parallel_tool_calls to false removes server from the set."""
from tools.mcp_tool import (
register_mcp_servers, _parallel_safe_servers, _lock,
sanitize_mcp_name_component,
)
# First registration: parallel enabled
config_on = {
"toggle_srv": {
"command": "echo",
"supports_parallel_tool_calls": True,
},
}
with patch("tools.mcp_tool._MCP_AVAILABLE", True), \
patch("tools.mcp_tool._ensure_mcp_loop"), \
patch("tools.mcp_tool._run_on_mcp_loop"), \
patch("tools.mcp_tool._existing_tool_names", return_value=[]):
register_mcp_servers(config_on)
with _lock:
assert sanitize_mcp_name_component("toggle_srv") in _parallel_safe_servers
# Second registration: parallel disabled
config_off = {
"toggle_srv": {
"command": "echo",
"supports_parallel_tool_calls": False,
},
}
with patch("tools.mcp_tool._MCP_AVAILABLE", True), \
patch("tools.mcp_tool._ensure_mcp_loop"), \
patch("tools.mcp_tool._run_on_mcp_loop"), \
patch("tools.mcp_tool._existing_tool_names", return_value=[]):
register_mcp_servers(config_off)
with _lock:
assert sanitize_mcp_name_component("toggle_srv") not in _parallel_safe_servers
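
Reviewer note: the parallel-safety tests above fix the observable behavior of the lookup (server names may contain underscores, a bare `mcp_{server}` is not a tool, non-MCP names return False) and of the batch decision (every call must be parallel-safe). A minimal sketch consistent with those tests follows. It does not reproduce `tools.mcp_tool` or `run_agent`. The names ending in `_sketch`, the module-level set, and the `builtin_parallel_safe` default are hypothetical.

```python
import threading
from typing import FrozenSet, Iterable, Set

_lock = threading.Lock()
parallel_safe_servers: Set[str] = set()  # hypothetical stand-in for the opted-in servers


def is_parallel_safe_sketch(tool_name: str) -> bool:
    """True if tool_name is 'mcp_{server}_{tool}' for an opted-in server."""
    prefix = "mcp_"
    if not tool_name.startswith(prefix):
        return False
    rest = tool_name[len(prefix):]
    with _lock:
        servers = set(parallel_safe_servers)  # copy so the lock is held briefly
    # Server names may contain underscores, so match each known server as a
    # prefix and require a non-empty tool part after the separator.
    return any(
        rest.startswith(server + "_") and len(rest) > len(server) + 1
        for server in servers
    )


def should_parallelize_sketch(
    tool_names: Iterable[str],
    builtin_parallel_safe: FrozenSet[str] = frozenset({"web_search"}),  # assumption
) -> bool:
    """A batch runs concurrently only if it has 2+ calls and all are parallel-safe."""
    names = list(tool_names)
    return len(names) > 1 and all(
        name in builtin_parallel_safe or is_parallel_safe_sketch(name)
        for name in names
    )
```

Matching against the set of known servers, instead of splitting the tool name on the first underscore, is what makes `mcp_my_server_query` resolve to server `my_server`.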

@ -0,0 +1,438 @@
"""Tests for the X (Twitter) Search tool backed by xAI Responses API.
Covers:
- HTTP request shape (URL, headers, payload, model from config)
- Handle filter validation (allowed vs excluded mutual exclusion)
- Inline url_citation extraction from message annotations
- Structured error handling (4xx with code, 5xx retry, ReadTimeout retry)
- Credential resolution: API key path, OAuth path, both-set preference, none-set
- check_x_search_requirements gating in registry
"""
import json
import requests
class _FakeResponse:
def __init__(self, payload, *, status_code=200, text=None):
self._payload = payload
self.status_code = status_code
self.text = text if text is not None else json.dumps(payload)
def raise_for_status(self):
if self.status_code >= 400:
err = requests.HTTPError(f"{self.status_code} Client Error")
err.response = self
raise err
def json(self):
return self._payload
# ---------------------------------------------------------------------------
# Original PR #10786 test coverage (HTTP shape, handle validation, citations,
# retry behavior) — preserved verbatim. Uses XAI_API_KEY env var via the
# default resolver path.
# ---------------------------------------------------------------------------
def test_x_search_posts_responses_request(monkeypatch):
from tools.x_search_tool import x_search_tool
from hermes_cli import __version__
captured = {}
def _fake_post(url, headers=None, json=None, timeout=None):
captured["url"] = url
captured["headers"] = headers
captured["json"] = json
captured["timeout"] = timeout
return _FakeResponse(
{
"output_text": "People on X are discussing xAI's latest launch.",
"citations": [{"url": "https://x.com/example/status/1", "title": "Example post"}],
}
)
monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
monkeypatch.setattr("requests.post", _fake_post)
result = json.loads(
x_search_tool(
query="What are people saying about xAI on X?",
allowed_x_handles=["xai", "@grok"],
from_date="2026-04-01",
to_date="2026-04-10",
enable_image_understanding=True,
)
)
tool_def = captured["json"]["tools"][0]
assert captured["url"] == "https://api.x.ai/v1/responses"
assert captured["headers"]["User-Agent"] == f"Hermes-Agent/{__version__}"
assert captured["json"]["model"] == "grok-4.20-reasoning"
assert captured["json"]["store"] is False
assert tool_def["type"] == "x_search"
assert tool_def["allowed_x_handles"] == ["xai", "grok"]
assert tool_def["from_date"] == "2026-04-01"
assert tool_def["to_date"] == "2026-04-10"
assert tool_def["enable_image_understanding"] is True
assert result["success"] is True
assert result["answer"] == "People on X are discussing xAI's latest launch."
def test_x_search_rejects_conflicting_handle_filters(monkeypatch):
from tools.x_search_tool import x_search_tool
monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
result = json.loads(
x_search_tool(
query="latest xAI discussion",
allowed_x_handles=["xai"],
excluded_x_handles=["grok"],
)
)
assert result["error"] == "allowed_x_handles and excluded_x_handles cannot be used together"
def test_x_search_extracts_inline_url_citations(monkeypatch):
from tools.x_search_tool import x_search_tool
def _fake_post(url, headers=None, json=None, timeout=None):
return _FakeResponse(
{
"output": [
{
"type": "message",
"content": [
{
"type": "output_text",
"text": "xAI posted an update on X.",
"annotations": [
{
"type": "url_citation",
"url": "https://x.com/xai/status/123",
"title": "xAI update",
"start_index": 0,
"end_index": 3,
}
],
}
],
}
]
}
)
monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
monkeypatch.setattr("requests.post", _fake_post)
result = json.loads(x_search_tool(query="latest post from xai"))
assert result["success"] is True
assert result["answer"] == "xAI posted an update on X."
assert result["inline_citations"] == [
{
"url": "https://x.com/xai/status/123",
"title": "xAI update",
"start_index": 0,
"end_index": 3,
}
]
def test_x_search_returns_structured_http_error(monkeypatch):
from tools.x_search_tool import x_search_tool
class _FailingResponse:
status_code = 403
text = '{"code":"forbidden","error":"x_search is not enabled for this model"}'
def json(self):
return {
"code": "forbidden",
"error": "x_search is not enabled for this model",
}
def raise_for_status(self):
err = requests.HTTPError("403 Client Error: Forbidden")
err.response = self
raise err
monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
monkeypatch.setattr("requests.post", lambda *a, **k: _FailingResponse())
result = json.loads(x_search_tool(query="latest xai discussion"))
assert result["success"] is False
assert result["provider"] == "xai"
assert result["tool"] == "x_search"
assert result["error_type"] == "HTTPError"
assert result["error"] == "forbidden: x_search is not enabled for this model"
def test_x_search_retries_read_timeout_then_succeeds(monkeypatch):
from tools.x_search_tool import x_search_tool
calls = {"count": 0}
def _fake_post(url, headers=None, json=None, timeout=None):
calls["count"] += 1
if calls["count"] == 1:
raise requests.ReadTimeout("timed out")
return _FakeResponse(
{
"output_text": "Recovered after retry.",
"citations": [],
}
)
monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
monkeypatch.setattr("requests.post", _fake_post)
monkeypatch.setattr("tools.x_search_tool.time.sleep", lambda *_: None)
result = json.loads(x_search_tool(query="grok xai"))
assert calls["count"] == 2
assert result["success"] is True
assert result["answer"] == "Recovered after retry."
def test_x_search_retries_5xx_then_succeeds(monkeypatch):
from tools.x_search_tool import x_search_tool
calls = {"count": 0}
def _fake_post(url, headers=None, json=None, timeout=None):
calls["count"] += 1
if calls["count"] == 1:
return _FakeResponse(
{"code": "Internal error", "error": "Service temporarily unavailable."},
status_code=500,
)
return _FakeResponse({"output_text": "Recovered after 5xx retry."})
monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
monkeypatch.setattr("requests.post", _fake_post)
monkeypatch.setattr("tools.x_search_tool.time.sleep", lambda *_: None)
result = json.loads(x_search_tool(query="grok xai"))
assert calls["count"] == 2
assert result["success"] is True
assert result["answer"] == "Recovered after 5xx retry."
# ---------------------------------------------------------------------------
# Credential-resolution coverage — the OAuth-or-API-key gating contract.
# ---------------------------------------------------------------------------
def _no_xai_env(monkeypatch):
"""Strip any XAI_* env vars so the resolver doesn't see a leaked dev key."""
for var in ("XAI_API_KEY", "XAI_BASE_URL", "HERMES_XAI_BASE_URL"):
monkeypatch.delenv(var, raising=False)
def test_x_search_uses_xai_oauth_when_only_oauth_available(monkeypatch):
"""OAuth-only user: credential_source should be ``xai-oauth``."""
from tools.registry import invalidate_check_fn_cache
from tools.x_search_tool import check_x_search_requirements, x_search_tool
_no_xai_env(monkeypatch)
def _fake_resolve():
return {
"provider": "xai-oauth",
"api_key": "oauth-bearer-token",
"base_url": "https://api.x.ai/v1",
}
monkeypatch.setattr(
"tools.x_search_tool.resolve_xai_http_credentials", _fake_resolve
)
invalidate_check_fn_cache()
assert check_x_search_requirements() is True
captured = {}
def _fake_post(url, headers=None, json=None, timeout=None):
captured["headers"] = headers
return _FakeResponse({"output_text": "Found posts via OAuth."})
monkeypatch.setattr("requests.post", _fake_post)
result = json.loads(x_search_tool(query="anything about xai"))
assert result["success"] is True
assert result["credential_source"] == "xai-oauth"
assert captured["headers"]["Authorization"] == "Bearer oauth-bearer-token"
def test_x_search_uses_api_key_when_only_xai_api_key_set(monkeypatch):
"""API-key-only user: credential_source should be ``xai``."""
from tools.registry import invalidate_check_fn_cache
from tools.x_search_tool import check_x_search_requirements, x_search_tool
_no_xai_env(monkeypatch)
def _fake_resolve():
# Real ``resolve_xai_http_credentials`` returns ``"xai"`` when it
# falls through to the XAI_API_KEY env var path.
return {
"provider": "xai",
"api_key": "raw-api-key",
"base_url": "https://api.x.ai/v1",
}
monkeypatch.setattr(
"tools.x_search_tool.resolve_xai_http_credentials", _fake_resolve
)
invalidate_check_fn_cache()
assert check_x_search_requirements() is True
captured = {}
def _fake_post(url, headers=None, json=None, timeout=None):
captured["headers"] = headers
return _FakeResponse({"output_text": "Found posts via API key."})
monkeypatch.setattr("requests.post", _fake_post)
result = json.loads(x_search_tool(query="anything"))
assert result["success"] is True
assert result["credential_source"] == "xai"
assert captured["headers"]["Authorization"] == "Bearer raw-api-key"
def test_x_search_prefers_oauth_when_both_available(monkeypatch):
"""Both credentials present: OAuth wins (matches Teknium's billing preference).
The real ordering is implemented in ``tools.xai_http.resolve_xai_http_credentials``:
OAuth runtime first, fallback OAuth resolver second, ``XAI_API_KEY`` third.
This test exercises the contract by having the resolver return the OAuth
bearer (the ``xai-oauth`` ``provider`` tag is the marker).
"""
from tools.registry import invalidate_check_fn_cache
from tools.x_search_tool import x_search_tool
monkeypatch.setenv("XAI_API_KEY", "raw-api-key")
# Mimic xai_http's preference: OAuth wins, so we return the OAuth tuple
# even though XAI_API_KEY is also set.
def _fake_resolve():
return {
"provider": "xai-oauth",
"api_key": "oauth-bearer-token",
"base_url": "https://api.x.ai/v1",
}
monkeypatch.setattr(
"tools.x_search_tool.resolve_xai_http_credentials", _fake_resolve
)
invalidate_check_fn_cache()
captured = {}
def _fake_post(url, headers=None, json=None, timeout=None):
captured["headers"] = headers
return _FakeResponse({"output_text": "OAuth preferred."})
monkeypatch.setattr("requests.post", _fake_post)
result = json.loads(x_search_tool(query="anything"))
assert result["credential_source"] == "xai-oauth"
assert captured["headers"]["Authorization"] == "Bearer oauth-bearer-token"
def test_x_search_returns_tool_error_when_no_credentials(monkeypatch):
"""No credentials anywhere: tool returns a clear error, not a 401 from xAI."""
from tools.registry import invalidate_check_fn_cache
from tools.x_search_tool import check_x_search_requirements, x_search_tool
_no_xai_env(monkeypatch)
def _fake_resolve():
return {
"provider": "xai",
"api_key": "",
"base_url": "https://api.x.ai/v1",
}
monkeypatch.setattr(
"tools.x_search_tool.resolve_xai_http_credentials", _fake_resolve
)
invalidate_check_fn_cache()
assert check_x_search_requirements() is False
# If a model somehow invokes the tool despite a False check_fn, the call
# surfaces a friendly error rather than an HTTP exception.
result = x_search_tool(query="anything")
assert "No xAI credentials available" in result
assert "hermes auth add xai-oauth" in result
def test_x_search_check_fn_false_when_resolver_raises(monkeypatch):
"""Resolver exceptions (e.g. expired token + failed refresh) gate the tool out."""
from tools.registry import invalidate_check_fn_cache
from tools.x_search_tool import check_x_search_requirements
_no_xai_env(monkeypatch)
def _boom():
raise RuntimeError("token revoked and refresh failed")
monkeypatch.setattr(
"tools.x_search_tool.resolve_xai_http_credentials", _boom
)
invalidate_check_fn_cache()
assert check_x_search_requirements() is False
def test_x_search_honors_config_model_and_timeout(monkeypatch, tmp_path):
"""``x_search.model`` and ``x_search.timeout_seconds`` override the defaults."""
from tools.x_search_tool import x_search_tool
monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
# Patch the in-module config loader so tests don't touch ~/.hermes/config.yaml.
monkeypatch.setattr(
"tools.x_search_tool._load_x_search_config",
lambda: {"model": "grok-custom-test", "timeout_seconds": 45, "retries": 0},
)
captured = {}
def _fake_post(url, headers=None, json=None, timeout=None):
captured["model"] = json["model"]
captured["timeout"] = timeout
return _FakeResponse({"output_text": "Custom model OK."})
monkeypatch.setattr("requests.post", _fake_post)
result = json.loads(x_search_tool(query="anything"))
assert result["success"] is True
assert captured["model"] == "grok-custom-test"
assert captured["timeout"] == 45
def test_x_search_registered_in_registry_with_check_fn():
"""The tool is registered under the x_search toolset with the gating check_fn."""
import tools.x_search_tool # noqa: F401 — ensures registration runs
from tools.registry import registry
entry = registry.get_entry("x_search")
assert entry is not None
assert entry.toolset == "x_search"
assert entry.check_fn is not None
assert entry.check_fn.__name__ == "check_x_search_requirements"
assert "XAI_API_KEY" in entry.requires_env
assert entry.emoji == "🐦"

View file

@ -133,8 +133,19 @@ _CREDENTIAL_FILES = (
r'(?:~|\$home|\$\{home\})/\.'
r'(?:netrc|pgpass|npmrc|pypirc)\b'
)
# macOS: /etc, /var, /tmp, /home are symlinks to /private/{etc,var,tmp,home}.
# A command written to target /private/etc/sudoers works identically to
# /etc/sudoers on macOS but bypasses a plain "/etc/" pattern check. Match
# both forms. Inspired by Claude Code 2.1.113's "dangerous path protection".
_MACOS_PRIVATE_SYSTEM_PATH = r'/private/(?:etc|var|tmp|home)/'
# System-config paths that should trigger approval for any write/edit,
# collapsing /etc, its macOS /private/etc mirror, and /etc/sudoers.d/ into
# one shared fragment so new DANGEROUS_PATTERNS stay consistent.
_SYSTEM_CONFIG_PATH = (
rf'(?:/etc/|{_MACOS_PRIVATE_SYSTEM_PATH})'
)
_SENSITIVE_WRITE_TARGET = (
r'(?:/etc/|/dev/sd|'
rf'(?:{_SYSTEM_CONFIG_PATH}|/dev/sd|'
rf'{_SSH_SENSITIVE_PATH}|'
rf'{_HERMES_ENV_PATH}|'
rf'{_SHELL_RC_FILES}|'
@ -318,10 +329,17 @@ DANGEROUS_PATTERNS = [
# *next* line to satisfy the negative lookahead, silently allowing DELETE without WHERE.
(r'\bDELETE\s+FROM\b(?![^\n]*\bWHERE\b)', "SQL DELETE without WHERE"),
(r'\bTRUNCATE\s+(TABLE)?\s*\w', "SQL TRUNCATE"),
(r'>\s*/etc/', "overwrite system config"),
(rf'>\s*{_SYSTEM_CONFIG_PATH}', "overwrite system config"),
(r'\bsystemctl\s+(-[^\s]+\s+)*(stop|restart|disable|mask)\b', "stop/restart system service"),
(r'\bkill\s+-9\s+-1\b', "kill all processes"),
(r'\bpkill\s+-9\b', "force kill processes"),
# killall with SIGKILL (parallel to pkill -9). Catches -9 / -KILL /
# -s KILL / -SIGKILL forms, and also `killall -r <regex>` broad sweeps
# that can wipe out unrelated processes by accident.
# Inspired by Claude Code 2.1.113 expanded deny rules.
(r'\bkillall\s+(-[^\s]*\s+)*-(9|KILL|SIGKILL)\b', "force kill processes (killall -KILL)"),
(r'\bkillall\s+(-[^\s]*\s+)*-s\s+(KILL|SIGKILL|9)\b', "force kill processes (killall -s KILL)"),
(r'\bkillall\s+(-[^\s]*\s+)*-r\b', "kill processes by regex (killall -r)"),
(r':\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:', "fork bomb"),
# Any shell invocation via -c or combined flags like -lc, -ic, etc.
(r'\b(bash|sh|zsh|ksh)\s+-[^\s]*c(\s+|$)', "shell command via -c/-lc flag"),
@ -333,7 +351,11 @@ DANGEROUS_PATTERNS = [
(rf'\btee\b.*["\']?{_PROJECT_SENSITIVE_WRITE_TARGET}["\']?{_COMMAND_TAIL}', "overwrite project env/config via tee"),
(rf'>>?\s*["\']?{_PROJECT_SENSITIVE_WRITE_TARGET}["\']?{_COMMAND_TAIL}', "overwrite project env/config via redirection"),
(r'\bxargs\s+.*\brm\b', "xargs with rm"),
(r'\bfind\b.*-exec\s+(/\S*/)?rm\b', "find -exec rm"),
# find -exec rm / -execdir rm — the -execdir variant (same semantics,
# runs in the directory of each match) was previously missed. Claude
# Code 2.1.113 tightened their equivalent find rule to stop auto-
# approving -exec / -delete flags.
(r'\bfind\b.*-exec(?:dir)?\s+(/\S*/)?rm\b', "find -exec/-execdir rm"),
(r'\bfind\b.*-delete\b', "find -delete"),
# Gateway lifecycle protection: prevent the agent from killing its own
# gateway process. These commands trigger a gateway restart/stop that
@ -351,11 +373,12 @@ DANGEROUS_PATTERNS = [
# to regex at detection time. Catch the structural pattern instead.
(r'\bkill\b.*\$\(\s*pgrep\b', "kill process via pgrep expansion (self-termination)"),
(r'\bkill\b.*`\s*pgrep\b', "kill process via backtick pgrep expansion (self-termination)"),
# File copy/move/edit into sensitive system paths
(r'\b(cp|mv|install)\b.*\s/etc/', "copy/move file into /etc/"),
# File copy/move/edit into sensitive system paths (/etc/ and macOS
# /private/etc/ mirror).
(rf'\b(cp|mv|install)\b.*\s{_SYSTEM_CONFIG_PATH}', "copy/move file into system config path"),
(rf'\b(cp|mv|install)\b.*\s["\']?{_PROJECT_SENSITIVE_WRITE_TARGET}["\']?{_COMMAND_TAIL}', "overwrite project env/config file"),
(r'\bsed\s+-[^\s]*i.*\s/etc/', "in-place edit of system config"),
(r'\bsed\s+--in-place\b.*\s/etc/', "in-place edit of system config (long flag)"),
(rf'\bsed\s+-[^\s]*i.*\s{_SYSTEM_CONFIG_PATH}', "in-place edit of system config"),
(rf'\bsed\s+--in-place\b.*\s{_SYSTEM_CONFIG_PATH}', "in-place edit of system config (long flag)"),
# Script execution via heredoc — bypasses the -e/-c flag patterns above.
# `python3 << 'EOF'` feeds arbitrary code via stdin without -c/-e flags.
(r'\b(python[23]?|perl|ruby|node)\s+<<', "script execution via heredoc"),
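The new and widened patterns in the hunks above can be exercised in isolation. The following is a minimal sketch, not the module's actual detection code: the regex strings are copied from the diff, while `PATTERNS`, `first_match`, and the use of a plain `re.search` with no flags are assumptions made for illustration.

```python
import re

# Fragments copied from the hunks above.
MACOS_PRIVATE_SYSTEM_PATH = r'/private/(?:etc|var|tmp|home)/'
SYSTEM_CONFIG_PATH = rf'(?:/etc/|{MACOS_PRIVATE_SYSTEM_PATH})'

# A small subset of DANGEROUS_PATTERNS, limited to the entries this commit touches.
PATTERNS = [
    (rf'>\s*{SYSTEM_CONFIG_PATH}', "overwrite system config"),
    (r'\bkillall\s+(-[^\s]*\s+)*-(9|KILL|SIGKILL)\b', "force kill processes (killall -KILL)"),
    (r'\bkillall\s+(-[^\s]*\s+)*-s\s+(KILL|SIGKILL|9)\b', "force kill processes (killall -s KILL)"),
    (r'\bkillall\s+(-[^\s]*\s+)*-r\b', "kill processes by regex (killall -r)"),
    (r'\bfind\b.*-exec(?:dir)?\s+(/\S*/)?rm\b', "find -exec/-execdir rm"),
]


def first_match(command):
    """Hypothetical helper: description of the first pattern that matches, else None."""
    for pattern, description in PATTERNS:
        if re.search(pattern, command):
            return description
    return None


if __name__ == "__main__":
    for cmd in (
        "echo x > /private/etc/sudoers",
        "killall -s KILL nginx",
        "find . -name '*.tmp' -execdir rm {} +",
        "killall nginx",
    ):
        print(f"{cmd!r}: {first_match(cmd)}")
```

Under these assumptions a plain `killall nginx` is not flagged, while the `-9`, `-s KILL`, and `-r` forms are. A read such as `cat /private/etc/hosts` also passes, because the redirect pattern requires a `>` before the path.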

View file

@ -2362,6 +2362,7 @@ def _resolve_delegation_credentials(cfg: dict, parent_agent) -> dict:
configured_provider = str(cfg.get("provider") or "").strip() or None
configured_base_url = str(cfg.get("base_url") or "").strip() or None
configured_api_key = str(cfg.get("api_key") or "").strip() or None
configured_api_mode = str(cfg.get("api_mode") or "").strip().lower() or None
if configured_base_url:
# When delegation.api_key is not set, return None so _build_child_agent
@ -2372,9 +2373,17 @@ def _resolve_delegation_credentials(cfg: dict, parent_agent) -> dict:
# callers to duplicate the key under delegation.api_key.
api_key = configured_api_key # None → inherited from parent in _build_child_agent
# Use the shared URL-based api_mode detector (same path the main agent's
# runtime resolver uses) so Anthropic-compatible direct endpoints with a
# /anthropic suffix — Azure AI Foundry, MiniMax, Zhipu GLM, LiteLLM
# proxies — pick the right transport automatically. Without this,
# subagents would default to chat_completions and hit 404s on endpoints
# that only speak the Anthropic Messages protocol. Fixes #10213.
from hermes_cli.runtime_provider import _detect_api_mode_for_url
base_lower = configured_base_url.lower()
provider = "custom"
api_mode = "chat_completions"
api_mode = _detect_api_mode_for_url(configured_base_url) or "chat_completions"
if (
base_url_hostname(configured_base_url) == "chatgpt.com"
and "/backend-api/codex" in base_lower
@ -2388,6 +2397,11 @@ def _resolve_delegation_credentials(cfg: dict, parent_agent) -> dict:
provider = "custom"
api_mode = "anthropic_messages"
# Explicit delegation.api_mode in config always wins. Lets users force
# a transport for non-standard endpoints the URL heuristic can't detect.
if configured_api_mode in {"chat_completions", "codex_responses", "anthropic_messages"}:
api_mode = configured_api_mode
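The precedence introduced here (URL heuristic first, explicit `delegation.api_mode` last) can be sketched on its own. This is an illustration under stated assumptions: `detect_api_mode_for_url` is a hypothetical stand-in for `_detect_api_mode_for_url`, whose real body is not shown in this diff, and the intermediate hostname-specific branches (for example the `chatgpt.com` Codex check) are omitted.

```python
from typing import Optional
from urllib.parse import urlparse

# The three transports accepted by the explicit override in the hunk above.
VALID_API_MODES = {"chat_completions", "codex_responses", "anthropic_messages"}


def detect_api_mode_for_url(base_url: str) -> Optional[str]:
    """Hypothetical stand-in: only recognizes a trailing /anthropic path."""
    path = urlparse(base_url).path.rstrip("/").lower()
    if path.endswith("/anthropic"):
        return "anthropic_messages"
    return None


def resolve_api_mode(base_url: str, configured_api_mode: Optional[str]) -> str:
    # Step 1: URL heuristic, falling back to chat_completions.
    api_mode = detect_api_mode_for_url(base_url) or "chat_completions"
    # Step 2: an explicit, recognized config value always wins.
    normalized = str(configured_api_mode or "").strip().lower() or None
    if normalized in VALID_API_MODES:
        api_mode = normalized
    return api_mode


if __name__ == "__main__":
    print(resolve_api_mode("https://example.invalid/anthropic", None))
    print(resolve_api_mode("https://example.invalid/anthropic", "chat_completions"))
```

An unrecognized value such as `"bogus"` is ignored rather than rejected in this sketch, mirroring the set-membership check in the diff.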
return {
"model": configured_model,
"provider": provider,

View file

@ -78,7 +78,7 @@ LAZY_DEPS: dict[str, tuple[str, ...]] = {
# ─── Inference providers ───────────────────────────────────────────────
# Native Anthropic SDK — needed when provider=anthropic (not via
# OpenRouter / aggregators which use the openai SDK).
"provider.anthropic": ("anthropic==0.86.0",),
"provider.anthropic": ("anthropic==0.87.0",), # CVE-2026-34450, CVE-2026-34452
# AWS Bedrock provider
"provider.bedrock": ("boto3==1.42.89",),
@ -125,7 +125,7 @@ LAZY_DEPS: dict[str, tuple[str, ...]] = {
"platform.slack": (
"slack-bolt==1.27.0",
"slack-sdk==3.40.1",
"aiohttp==3.13.3",
"aiohttp==3.13.4", # CVE-2026-34513/34518/34519/34520/34525
),
"platform.matrix": (
"mautrix[encryption]==0.21.0",

View file

@ -24,6 +24,7 @@ Example config::
args: ["-y", "@modelcontextprotocol/server-github"]
env:
GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_..."
supports_parallel_tool_calls: true # tools from this server may run concurrently
remote_api:
url: "https://my-mcp-server.example.com/mcp"
headers:
@ -56,6 +57,8 @@ Features:
- Thread-safe architecture with dedicated background event loop
- Sampling support: MCP servers can request LLM completions via
sampling/createMessage (text and tool-use responses)
- Parallel tool call opt-in: per-server ``supports_parallel_tool_calls``
flag allows concurrent execution of tools from the same server
Architecture:
A dedicated background event loop (_mcp_loop) runs in a daemon thread.
@ -1976,11 +1979,16 @@ def _handle_session_expired_and_retry(
return None
# Sanitized server names whose ``supports_parallel_tool_calls`` config is True.
# Populated during ``register_mcp_servers()`` and queried by
# ``is_mcp_tool_parallel_safe()`` for the parallel-execution check in run_agent.
_parallel_safe_servers: set = set()
# Dedicated event loop running in a background daemon thread.
_mcp_loop: Optional[asyncio.AbstractEventLoop] = None
_mcp_thread: Optional[threading.Thread] = None
# Protects _mcp_loop, _mcp_thread, _servers, and _stdio_pids.
# Protects _mcp_loop, _mcp_thread, _servers, _parallel_safe_servers, and _stdio_pids.
_lock = threading.Lock()
# PIDs of stdio MCP server subprocesses. Tracked so we can force-kill
@ -3098,6 +3106,12 @@ def register_mcp_servers(servers: Dict[str, dict]) -> List[str]:
for k, v in servers.items()
if k not in _servers and _parse_boolish(v.get("enabled", True), default=True)
}
# Track which servers opt-in to parallel tool calls (idempotent).
for srv_name, srv_cfg in servers.items():
if _parse_boolish(srv_cfg.get("supports_parallel_tool_calls", False), default=False):
_parallel_safe_servers.add(sanitize_mcp_name_component(srv_name))
else:
_parallel_safe_servers.discard(sanitize_mcp_name_component(srv_name))
if not new_servers:
return _existing_tool_names()
@ -3208,6 +3222,29 @@ def discover_mcp_tools() -> List[str]:
return tool_names
def is_mcp_tool_parallel_safe(tool_name: str) -> bool:
"""Check if an MCP tool belongs to a server that supports parallel tool calls.
MCP tool names follow the pattern ``mcp_{server}_{tool}``. This extracts
the server component and checks it against the set of servers whose config
includes ``supports_parallel_tool_calls: true``.
Returns False for non-MCP tools or tools from servers without the flag.
"""
if not tool_name.startswith("mcp_"):
return False
# Strip the "mcp_" prefix and extract the server name.
# Tool names are: mcp_{sanitized_server}_{sanitized_tool}
# We need to check all possible server prefixes because the server name
# itself may contain underscores after sanitization.
rest = tool_name[4:] # strip "mcp_"
with _lock:
for server_name in _parallel_safe_servers:
if rest.startswith(server_name + "_") and len(rest) > len(server_name) + 1:
return True
return False
def get_mcp_status() -> List[dict]:
"""Return status of all configured MCP servers for banner display.

View file

@ -244,8 +244,16 @@ class ToolRegistry:
emoji: str = "",
max_result_size_chars: int | float | None = None,
dynamic_schema_overrides: Callable = None,
override: bool = False,
):
"""Register a tool. Called at module-import time by each tool file."""
"""Register a tool. Called at module-import time by each tool file.
``override=True`` is an explicit opt-in for plugins that intend to
replace an existing built-in tool implementation (e.g. swap the
default browser tool for a headed-Chrome CDP backend). Without it,
registrations that would shadow an existing tool from a different
toolset are rejected to prevent accidental overwrites.
"""
with self._lock:
existing = self._tools.get(name)
if existing and existing.toolset != toolset:
@ -260,13 +268,22 @@ class ToolRegistry:
"Tool '%s': MCP toolset '%s' overwriting MCP toolset '%s'",
name, toolset, existing.toolset,
)
elif override:
# Explicit plugin opt-in: replace the existing tool.
# Logged at INFO so the override is auditable in agent.log.
logger.info(
"Tool '%s': toolset '%s' overriding existing toolset '%s' "
"(override=True opt-in)",
name, toolset, existing.toolset,
)
else:
# Reject shadowing — prevent plugins/MCP from overwriting
# built-in tools or vice versa.
logger.error(
"Tool registration REJECTED: '%s' (toolset '%s') would "
"shadow existing tool from toolset '%s'. Deregister the "
"existing tool first if this is intentional.",
"shadow existing tool from toolset '%s'. Pass "
"override=True to register() if the replacement is "
"intentional, or deregister the existing tool first.",
name, toolset, existing.toolset,
)
return
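The shadowing rule and the new `override=True` opt-in can be illustrated with a reduced registry. This is a simplified sketch, not the real `ToolRegistry`: `MiniRegistry` and `Entry` are hypothetical names, the MCP-to-MCP overwrite branch and the lock are left out, and the boolean return value is an addition that exists only to make the outcome easy to check.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Dict

logger = logging.getLogger("registry_sketch")


@dataclass
class Entry:
    name: str
    toolset: str
    handler: Callable


class MiniRegistry:
    """Hypothetical reduced registry showing only the cross-toolset rule."""

    def __init__(self) -> None:
        self._tools: Dict[str, Entry] = {}

    def register(self, name: str, toolset: str, handler: Callable,
                 override: bool = False) -> bool:
        existing = self._tools.get(name)
        if existing and existing.toolset != toolset:
            if override:
                # Explicit opt-in: replace and leave an auditable log line.
                logger.info("Tool '%s': toolset '%s' overriding '%s'",
                            name, toolset, existing.toolset)
            else:
                # Accidental shadowing from another toolset is rejected.
                logger.error("Tool registration REJECTED: '%s' (toolset '%s')",
                             name, toolset)
                return False
        self._tools[name] = Entry(name, toolset, handler)
        return True


if __name__ == "__main__":
    reg = MiniRegistry()
    reg.register("browser", "builtin", lambda: "default")
    print(reg.register("browser", "plugin", lambda: "cdp"))
    print(reg.register("browser", "plugin", lambda: "cdp", override=True))
```

Re-registering a name from the same toolset passes through unchanged in this sketch, since the check only fires when the toolsets differ.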
@ -387,7 +404,16 @@ class ToolRegistry:
return entry.handler(args, **kwargs)
except Exception as e:
logger.exception("Tool %s dispatch error: %s", name, e)
return json.dumps({"error": f"Tool execution failed: {type(e).__name__}: {e}"})
# Route through the sanitizer so framing tokens / CDATA / fences
# in exception strings don't reach the model as structural noise.
# See model_tools._sanitize_tool_error for rationale.
raw = f"Tool execution failed: {type(e).__name__}: {e}"
try:
from model_tools import _sanitize_tool_error
sanitized = _sanitize_tool_error(raw)
except Exception:
sanitized = raw # defensive: never let the sanitizer block error propagation
return json.dumps({"error": sanitized})
# ------------------------------------------------------------------
# Query helpers (replace redundant dicts in model_tools.py)

424
tools/x_search_tool.py Normal file
View file

@ -0,0 +1,424 @@
#!/usr/bin/env python3
"""X Search tool backed by xAI's built-in ``x_search`` Responses API tool.
Authentication
--------------
The tool registers when **either** xAI credential path is available:
* ``XAI_API_KEY`` is set in ``~/.hermes/.env`` or the process environment
(paid xAI API key), OR
* The user is signed in via xAI Grok OAuth (SuperGrok subscription),
i.e. ``hermes auth add xai-oauth`` has been run and the stored refresh
token still works.
Credential preference at call time matches
:func:`tools.xai_http.resolve_xai_http_credentials`: SuperGrok OAuth first,
direct OAuth resolver second, ``XAI_API_KEY`` last. That helper also
auto-refreshes the OAuth access token when it's within the refresh skew
window, so a ``True`` from :func:`check_x_search_requirements` means the
bearer is fetchable AND non-empty.
Salvaged from PR #10786 (originally by @Jaaneek); credential resolution
reworked to honor both auth modes per Teknium's design.
"""
from __future__ import annotations
import json
import logging
import os
import time
from typing import Any, Dict, List, Optional, Tuple
import requests
from tools.registry import registry, tool_error
from tools.xai_http import hermes_xai_user_agent, resolve_xai_http_credentials
logger = logging.getLogger(__name__)
DEFAULT_XAI_BASE_URL = "https://api.x.ai/v1"
DEFAULT_X_SEARCH_MODEL = "grok-4.20-reasoning"
DEFAULT_X_SEARCH_TIMEOUT_SECONDS = 180
DEFAULT_X_SEARCH_RETRIES = 2
MAX_HANDLES = 10
# ---------------------------------------------------------------------------
# Config
# ---------------------------------------------------------------------------
def _load_x_search_config() -> Dict[str, Any]:
try:
from hermes_cli.config import load_config
return load_config().get("x_search", {}) or {}
except Exception:
return {}
def _get_x_search_model() -> str:
cfg = _load_x_search_config()
return (str(cfg.get("model") or "").strip() or DEFAULT_X_SEARCH_MODEL)
def _get_x_search_timeout_seconds() -> int:
cfg = _load_x_search_config()
raw_value = cfg.get("timeout_seconds", DEFAULT_X_SEARCH_TIMEOUT_SECONDS)
try:
return max(30, int(raw_value))
except Exception:
return DEFAULT_X_SEARCH_TIMEOUT_SECONDS
def _get_x_search_retries() -> int:
cfg = _load_x_search_config()
raw_value = cfg.get("retries", DEFAULT_X_SEARCH_RETRIES)
try:
return max(0, int(raw_value))
except Exception:
return DEFAULT_X_SEARCH_RETRIES
# ---------------------------------------------------------------------------
# Credential resolution
# ---------------------------------------------------------------------------
def _resolve_xai_bearer() -> Tuple[str, str, str]:
"""Return ``(api_key, base_url, source)``.
``source`` is one of ``"xai-oauth"`` or ``"xai"`` so callers (and tests)
can tell which credential path won. Raises ``RuntimeError`` if no usable
credential is available. The registered :func:`check_x_search_requirements`
gate makes that case unreachable in normal operation, but the runtime
check exists so that a credential which expires between registration and
invocation produces a clean tool error instead of a 401.
"""
creds = resolve_xai_http_credentials()
api_key = str(creds.get("api_key") or "").strip()
if not api_key:
raise RuntimeError(
"No xAI credentials available. Run `hermes auth add xai-oauth` "
"to sign in with your SuperGrok subscription, or set XAI_API_KEY."
)
base_url = str(creds.get("base_url") or DEFAULT_XAI_BASE_URL).strip().rstrip("/")
source = str(creds.get("provider") or "xai")
return api_key, base_url, source
def check_x_search_requirements() -> bool:
"""Return True when xAI credentials are available AND valid.
``resolve_xai_http_credentials`` calls
:func:`hermes_cli.auth.resolve_xai_oauth_runtime_credentials` which
auto-refreshes the OAuth access token if it's expiring; a successful
return therefore implies a usable bearer.
"""
try:
creds = resolve_xai_http_credentials()
return bool(str(creds.get("api_key") or "").strip())
except Exception:
return False
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _normalize_handles(handles: Optional[List[str]], field_name: str) -> List[str]:
cleaned: List[str] = []
for handle in handles or []:
normalized = str(handle or "").strip().lstrip("@")
if normalized:
cleaned.append(normalized)
if len(cleaned) > MAX_HANDLES:
raise ValueError(f"{field_name} supports at most {MAX_HANDLES} handles")
return cleaned
def _extract_response_text(payload: Dict[str, Any]) -> str:
output_text = str(payload.get("output_text") or "").strip()
if output_text:
return output_text
parts: List[str] = []
for item in payload.get("output", []) or []:
if item.get("type") != "message":
continue
for content in item.get("content", []) or []:
ctype = content.get("type")
if ctype in ("output_text", "text"):
text = str(content.get("text") or "").strip()
if text:
parts.append(text)
return "\n\n".join(parts).strip()
def _extract_inline_citations(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
citations: List[Dict[str, Any]] = []
for item in payload.get("output", []) or []:
if item.get("type") != "message":
continue
for content in item.get("content", []) or []:
for annotation in content.get("annotations", []) or []:
if annotation.get("type") != "url_citation":
continue
citations.append(
{
"url": annotation.get("url", ""),
"title": annotation.get("title", ""),
"start_index": annotation.get("start_index"),
"end_index": annotation.get("end_index"),
}
)
return citations
def _http_error_message(exc: requests.HTTPError) -> str:
response = getattr(exc, "response", None)
if response is None:
return str(exc)
try:
payload = response.json()
except Exception:
payload = None
if isinstance(payload, dict):
code = str(payload.get("code") or "").strip()
error = str(payload.get("error") or "").strip()
message = error or str(payload)
if code and code not in message:
message = f"{code}: {message}"
return message or str(exc)
text = str(getattr(response, "text", "") or "").strip()
if text:
return text[:500]
return str(exc)
# ---------------------------------------------------------------------------
# Tool implementation
# ---------------------------------------------------------------------------
def x_search_tool(
query: str,
allowed_x_handles: Optional[List[str]] = None,
excluded_x_handles: Optional[List[str]] = None,
from_date: str = "",
to_date: str = "",
enable_image_understanding: bool = False,
enable_video_understanding: bool = False,
) -> str:
if not query or not query.strip():
return tool_error("query is required for x_search")
try:
api_key, base_url, source = _resolve_xai_bearer()
except RuntimeError as exc:
return tool_error(str(exc))
try:
allowed = _normalize_handles(allowed_x_handles, "allowed_x_handles")
excluded = _normalize_handles(excluded_x_handles, "excluded_x_handles")
if allowed and excluded:
return tool_error("allowed_x_handles and excluded_x_handles cannot be used together")
tool_def: Dict[str, Any] = {"type": "x_search"}
if allowed:
tool_def["allowed_x_handles"] = allowed
if excluded:
tool_def["excluded_x_handles"] = excluded
if from_date.strip():
tool_def["from_date"] = from_date.strip()
if to_date.strip():
tool_def["to_date"] = to_date.strip()
if enable_image_understanding:
tool_def["enable_image_understanding"] = True
if enable_video_understanding:
tool_def["enable_video_understanding"] = True
payload = {
"model": _get_x_search_model(),
"input": [
{
"role": "user",
"content": query.strip(),
}
],
"tools": [tool_def],
"store": False,
}
timeout_seconds = _get_x_search_timeout_seconds()
max_retries = _get_x_search_retries()
response: Optional[requests.Response] = None
for attempt in range(max_retries + 1):
try:
response = requests.post(
f"{base_url}/responses",
headers={
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
"User-Agent": hermes_xai_user_agent(),
},
json=payload,
timeout=timeout_seconds,
)
response.raise_for_status()
break
except requests.HTTPError as e:
status_code = getattr(getattr(e, "response", None), "status_code", None)
if status_code is None or status_code < 500 or attempt >= max_retries:
raise
logger.warning(
"x_search upstream failure on attempt %s/%s: %s",
attempt + 1,
max_retries + 1,
_http_error_message(e),
)
time.sleep(min(5.0, 1.5 * (attempt + 1)))
except (requests.ReadTimeout, requests.ConnectionError) as e:
if attempt >= max_retries:
raise
logger.warning(
"x_search transient failure on attempt %s/%s: %s",
attempt + 1,
max_retries + 1,
e,
)
time.sleep(min(5.0, 1.5 * (attempt + 1)))
if response is None:
raise RuntimeError("x_search request did not return a response")
data = response.json()
answer = _extract_response_text(data)
citations = list(data.get("citations") or [])
inline_citations = _extract_inline_citations(data)
return json.dumps(
{
"success": True,
"provider": "xai",
"credential_source": source,
"tool": "x_search",
"model": payload["model"],
"query": query.strip(),
"answer": answer,
"citations": citations,
"inline_citations": inline_citations,
},
ensure_ascii=False,
)
except requests.HTTPError as e:
logger.error("x_search failed: %s", e, exc_info=True)
return json.dumps(
{
"success": False,
"provider": "xai",
"tool": "x_search",
"error": _http_error_message(e),
"error_type": type(e).__name__,
},
ensure_ascii=False,
)
except requests.ReadTimeout as e:
logger.error("x_search timed out: %s", e, exc_info=True)
return json.dumps(
{
"success": False,
"provider": "xai",
"tool": "x_search",
"error": f"xAI x_search timed out after {_get_x_search_timeout_seconds()} seconds",
"error_type": type(e).__name__,
},
ensure_ascii=False,
)
except Exception as e:
logger.error("x_search failed: %s", e, exc_info=True)
return json.dumps(
{
"success": False,
"provider": "xai",
"tool": "x_search",
"error": str(e),
"error_type": type(e).__name__,
},
ensure_ascii=False,
)


X_SEARCH_SCHEMA = {
    "name": "x_search",
    "description": (
        "Search X (Twitter) posts, profiles, and threads using xAI's built-in "
        "X Search tool. Use this for current discussion, reactions, or claims "
        "on X rather than general web pages. Available when xAI credentials "
        "are configured (SuperGrok OAuth or XAI_API_KEY)."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "What to look up on X.",
            },
            "allowed_x_handles": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Optional list of X handles to include exclusively (max 10).",
            },
            "excluded_x_handles": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Optional list of X handles to exclude (max 10).",
            },
            "from_date": {
                "type": "string",
                "description": "Optional start date in YYYY-MM-DD format.",
            },
            "to_date": {
                "type": "string",
                "description": "Optional end date in YYYY-MM-DD format.",
            },
            "enable_image_understanding": {
                "type": "boolean",
                "description": "Whether xAI should analyze images attached to matching X posts.",
                "default": False,
            },
            "enable_video_understanding": {
                "type": "boolean",
                "description": "Whether xAI should analyze videos attached to matching X posts.",
                "default": False,
            },
        },
        "required": ["query"],
    },
}


def _handle_x_search(args, **kw):
    return x_search_tool(
        query=args.get("query", ""),
        allowed_x_handles=args.get("allowed_x_handles"),
        excluded_x_handles=args.get("excluded_x_handles"),
        from_date=args.get("from_date", ""),
        to_date=args.get("to_date", ""),
        enable_image_understanding=bool(args.get("enable_image_understanding", False)),
        enable_video_understanding=bool(args.get("enable_video_understanding", False)),
    )


registry.register(
    name="x_search",
    toolset="x_search",
    schema=X_SEARCH_SCHEMA,
    handler=_handle_x_search,
    check_fn=check_x_search_requirements,
    requires_env=["XAI_API_KEY"],
    emoji="🐦",
    max_result_size_chars=100_000,
)


@ -88,6 +88,17 @@ TOOLSETS = {
"tools": ["web_search"],
"includes": []
},
"x_search": {
"description": (
"Search X (Twitter) posts and threads via xAI's built-in "
"x_search Responses tool. Available when xAI credentials are "
"configured (SuperGrok OAuth or XAI_API_KEY). Off by default; "
"enable in `hermes tools` → X (Twitter) Search."
),
"tools": ["x_search"],
"includes": []
},
"vision": {
"description": "Image analysis and vision tools",


@ -21,6 +21,7 @@ export { default as Text } from './src/ink/components/Text.tsx'
export type { Props as TextProps } from './src/ink/components/Text.tsx'
export type { Key } from './src/ink/events/input-event.ts'
export { default as useApp } from './src/ink/hooks/use-app.ts'
export { useCursorAdvance } from './src/ink/hooks/use-cursor-advance.ts'
export { useDeclaredCursor } from './src/ink/hooks/use-declared-cursor.ts'
export { default as useInput } from './src/ink/hooks/use-input.ts'
export { useHasSelection, useSelection } from './src/ink/hooks/use-selection.ts'


@ -12,6 +12,7 @@ export { default as ScrollBox } from './ink/components/ScrollBox.js'
export { default as Spacer } from './ink/components/Spacer.js'
export { default as Text } from './ink/components/Text.js'
export { default as useApp } from './ink/hooks/use-app.js'
export { useCursorAdvance } from './ink/hooks/use-cursor-advance.js'
export { useDeclaredCursor } from './ink/hooks/use-declared-cursor.js'
export { type RunExternalProcess, useExternalProcess, withInkSuspended } from './ink/hooks/use-external-process.js'
export { default as useInput } from './ink/hooks/use-input.js'


@ -33,6 +33,7 @@ import { DBP, DFE, DISABLE_MOUSE_TRACKING, EBP, EFE, SHOW_CURSOR } from '../term
import AppContext from './AppContext.js'
import { ClockProvider } from './ClockContext.js'
import CursorAdvanceContext, { type CursorAdvanceNotifier } from './CursorAdvanceContext.js'
import CursorDeclarationContext, { type CursorDeclarationSetter } from './CursorDeclarationContext.js'
import ErrorOverview from './ErrorOverview.js'
import StdinContext from './StdinContext.js'
@ -100,6 +101,18 @@ type Props = {
// Enables IME composition at the input caret and lets screen readers /
// magnifiers track the input. Optional so testing.tsx doesn't stub it.
readonly onCursorDeclaration?: CursorDeclarationSetter
// Receives notifications that the physical cursor was advanced out-of-band
// (e.g. TextInput's fast-echo bypass writing directly to stdout). The
// handler in ink.tsx updates two pieces of state from a single call:
// - `displayCursor` (the relative-move basis log-update uses on the
// next frame; skipped on alt-screen where CSI H resets it every
// frame anyway), and
// - the active `cursorDeclaration.relativeX/Y` (the target the cursor
// parks at after every frame; bumped on BOTH screens because
// onRender's alt-screen branch emits an absolute CUP from it and
// a stale declaration there is still visibly wrong).
// Optional so testing.tsx doesn't need to stub it.
readonly onCursorAdvance?: CursorAdvanceNotifier
// Dispatch a keyboard event through the DOM tree. Called for each
// parsed key alongside the legacy EventEmitter path.
readonly dispatchKeyboardEvent: (parsedKey: ParsedKey) => void
@ -196,7 +209,9 @@ export default class App extends PureComponent<Props, State> {
<TerminalFocusProvider>
<ClockProvider>
<CursorDeclarationContext.Provider value={this.props.onCursorDeclaration ?? (() => {})}>
{this.state.error ? <ErrorOverview error={this.state.error as Error} /> : this.props.children}
<CursorAdvanceContext.Provider value={this.props.onCursorAdvance ?? (() => {})}>
{this.state.error ? <ErrorOverview error={this.state.error as Error} /> : this.props.children}
</CursorAdvanceContext.Provider>
</CursorDeclarationContext.Provider>
</ClockProvider>
</TerminalFocusProvider>


@ -0,0 +1,35 @@
import { createContext } from 'react'
/**
* Notify Ink that the physical terminal cursor was advanced by an
* out-of-band stdout.write (e.g. the TextInput fast-echo path).
*
* This is a two-part notification; calling it updates both:
*
* 1. Ink's cached `displayCursor` (the basis log-update uses to
* compute relative cursor moves for the next frame's preamble).
* Without this, the next frame's preamble starts from a stale
* parked position and the diff is rendered N cells offset.
* This half is SKIPPED on alt-screen: every alt-screen frame
* begins with CSI H which absolutely repositions the cursor, so
* the relative-move basis is reset for free.
*
* 2. Ink's active `cursorDeclaration` (the target the cursor parks
* at after every frame, set by `useDeclaredCursor`). Without
* this, an unrelated component re-rendering before the deferred
* React state catches up would publish a stale declaration and
* visually undo the fast-echo's advance. This half applies to
* BOTH main-screen and alt-screen: on alt-screen the cursor-park
* branch in onRender emits an absolute CUP to
* `rect.x + decl.relativeX`, so a stale declaration there is
* still wrong even though displayCursor is skipped.
*
* `dx`/`dy` are deltas in terminal cells (positive = right/down,
* negative = left/up). The caller is responsible for ensuring the
* physical cursor really did move by that amount.
*/
export type CursorAdvanceNotifier = (dx: number, dy?: number) => void
const CursorAdvanceContext = createContext<CursorAdvanceNotifier>(() => {})
export default CursorAdvanceContext


@ -0,0 +1,33 @@
import { useContext } from 'react'
import CursorAdvanceContext, { type CursorAdvanceNotifier } from '../components/CursorAdvanceContext.js'
/**
* Returns a function that notifies Ink the physical terminal cursor was
* advanced out-of-band (e.g. by a direct stdout.write from the
* TextInput fast-echo bypass).
*
* Calling the returned function updates two pieces of Ink state:
*
* - `displayCursor`: the cached parked-cursor position log-update
* uses as the relative-move basis for the next frame. Skipped on
* alt-screen, where every frame's CSI H resets the cursor anyway.
*
* - The active `cursorDeclaration`: the target the cursor parks at
* after every frame. Bumped on BOTH main- and alt-screen, because
* onRender's alt-screen park branch emits an absolute CUP from
* this value and a stale declaration there is still visibly wrong.
* The next React commit that publishes a fresh declaration
* supersedes the bump.
*
* The caller is responsible for the stdout write itself; this hook
* only reports the resulting cursor delta. Pass `dx` and optional
* `dy` in terminal cells (positive = moved right/down, negative =
* moved left/up).
*
* If the host isn't an Ink render root (test stubs, non-Ink renderer)
* the returned callback is a safe no-op.
*/
export function useCursorAdvance(): CursorAdvanceNotifier {
return useContext(CursorAdvanceContext)
}


@ -0,0 +1,234 @@
import { EventEmitter } from 'events'
import React from 'react'
import { describe, expect, it } from 'vitest'
import Text from './components/Text.js'
import Ink from './ink.js'
class FakeTty extends EventEmitter {
chunks: string[] = []
columns = 40
rows = 8
isTTY = true
write(chunk: string | Uint8Array, cb?: (err?: Error | null) => void): boolean {
this.chunks.push(typeof chunk === 'string' ? chunk : Buffer.from(chunk).toString('utf8'))
cb?.()
return true
}
}
function makeInk() {
const stdout = new FakeTty()
const stdin = new FakeTty()
const stderr = new FakeTty()
const ink = new Ink({
exitOnCtrlC: false,
patchConsole: false,
stderr: stderr as unknown as NodeJS.WriteStream,
stdin: stdin as unknown as NodeJS.ReadStream,
stdout: stdout as unknown as NodeJS.WriteStream
})
return { ink, stdout, stdin, stderr }
}
// Cast helper instead of exposing __get*ForTest methods on production Ink —
// these are internal frame/cursor caches we only inspect from tests.
type InkPrivate = {
displayCursor: { x: number; y: number } | null
cursorDeclaration: { node: unknown; relativeX: number; relativeY: number } | null
frontFrame: { cursor: { x: number; y: number } }
}
const peek = (ink: Ink): InkPrivate => ink as unknown as InkPrivate
// Closes the cursor-drift bug: when TextInput's fast-echo path writes a
// printable character directly to stdout, the hardware cursor advances by
// one cell BUT Ink's `displayCursor` cache (used as the basis for the
// next frame's relative cursor preamble) wasn't being updated. On long
// sessions an unrelated re-render (status bar timer, streaming
// reasoning, etc.) would then park the hardware cursor N cells offset
// from the actual caret — visible as "extra whitespace between my last
// typed character and the cursor block".
describe('Ink.noteExternalCursorAdvance', () => {
it('bumps an already-tracked displayCursor by the given delta', () => {
const { ink } = makeInk()
ink.render(React.createElement(Text, null, 'hi'))
ink.onRender()
// Seed a known parked position directly. In production this is set by
// the cursor-park branch in onRender when a useDeclaredCursor caller
// commits a declaration; this test bypasses React for hermeticity.
peek(ink).displayCursor = { x: 5, y: 0 }
ink.noteExternalCursorAdvance(3)
expect(peek(ink).displayCursor).toEqual({ x: 8, y: 0 })
ink.noteExternalCursorAdvance(-1)
expect(peek(ink).displayCursor).toEqual({ x: 7, y: 0 })
ink.noteExternalCursorAdvance(0, 2)
expect(peek(ink).displayCursor).toEqual({ x: 7, y: 2 })
ink.unmount()
})
it('seeds displayCursor from frontFrame.cursor when nothing was parked', () => {
const { ink } = makeInk()
ink.render(React.createElement(Text, null, 'hello'))
ink.onRender()
expect(peek(ink).displayCursor).toBeNull()
const base = { x: peek(ink).frontFrame.cursor.x, y: peek(ink).frontFrame.cursor.y }
ink.noteExternalCursorAdvance(4)
expect(peek(ink).displayCursor).toEqual({ x: base.x + 4, y: base.y })
ink.unmount()
})
it('is a no-op when the delta is zero', () => {
const { ink } = makeInk()
ink.render(React.createElement(Text, null, 'hi'))
ink.onRender()
ink.noteExternalCursorAdvance(0)
expect(peek(ink).displayCursor).toBeNull()
ink.noteExternalCursorAdvance(0, 0)
expect(peek(ink).displayCursor).toBeNull()
ink.unmount()
})
it('skips displayCursor on alt-screen — CSI H resets every frame', () => {
const { ink } = makeInk()
ink.setAltScreenActive(true)
ink.render(React.createElement(Text, null, 'hi'))
ink.onRender()
peek(ink).displayCursor = { x: 5, y: 0 }
ink.noteExternalCursorAdvance(3)
expect(peek(ink).displayCursor).toEqual({ x: 5, y: 0 })
ink.unmount()
})
// Closes Copilot follow-up on PR #26717: the default TUI wraps the
// composer in <AlternateScreen>, so alt-screen is the production
// path. CSI H only resets the log-update relative-move basis — the
// declared cursor target is still consulted by onRender's alt-screen
// park branch (`cursorPosition(row, col)` using rect + decl). So
// cursorDeclaration MUST advance on alt-screen too, even though
// displayCursor doesn't need to.
it('still advances cursorDeclaration on alt-screen', () => {
const { ink } = makeInk()
ink.setAltScreenActive(true)
ink.render(React.createElement(Text, null, 'hi'))
ink.onRender()
const fakeNode = {} as unknown as Record<string, unknown>
peek(ink).cursorDeclaration = { node: fakeNode, relativeX: 7, relativeY: 0 }
peek(ink).displayCursor = { x: 12, y: 0 }
ink.noteExternalCursorAdvance(3)
// displayCursor untouched on alt-screen
expect(peek(ink).displayCursor).toEqual({ x: 12, y: 0 })
// declaration still advanced — onRender's alt-screen park reads this
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 10, relativeY: 0 })
ink.unmount()
})
// Closes Copilot review feedback on PR #26717: even after the
// TextInput-level fix where layout reads `curRef.current` directly,
// there's still a window where a fast-echo wrote to stdout but the
// current cursor declaration on Ink (set by an earlier render's
// useDeclaredCursor commit) points at the PRE-keystroke caret
// column. If we advanced only `displayCursor`, an unrelated re-render
// in that window would re-run onRender's cursor-park branch with the
// stale declaration and visually undo the fast-echo's advance. We
// must bump BOTH so the cursor stays anchored to the physical caret
// until the next React commit publishes a fresh declaration
// (computed from `curRef.current` via the cursorLayout call in
// textInput.tsx) that supersedes the bump.
it('advances the active cursorDeclaration in lock-step with displayCursor', () => {
const { ink } = makeInk()
ink.render(React.createElement(Text, null, 'hi'))
ink.onRender()
const fakeNode = {} as unknown as Record<string, unknown>
peek(ink).cursorDeclaration = { node: fakeNode, relativeX: 7, relativeY: 0 }
peek(ink).displayCursor = { x: 12, y: 0 }
ink.noteExternalCursorAdvance(3)
expect(peek(ink).displayCursor).toEqual({ x: 15, y: 0 })
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 10, relativeY: 0 })
ink.noteExternalCursorAdvance(-1)
expect(peek(ink).displayCursor).toEqual({ x: 14, y: 0 })
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 9, relativeY: 0 })
ink.unmount()
})
// Closes Copilot follow-up on PR #26717: the dy half of the notifier
// contract was tested for `displayCursor` but not for
// `cursorDeclaration.relativeY`. Newlines in fast-echoed text never
// hit the bypass today (canFastAppendShape rejects '\n'), but `dy`
// is part of the public API and must propagate symmetrically with
// dx so future callers (e.g. multi-line paste shortcuts) don't get
// a half-implemented contract.
it('advances cursorDeclaration.relativeY when dy is non-zero', () => {
const { ink } = makeInk()
ink.render(React.createElement(Text, null, 'hi'))
ink.onRender()
const fakeNode = {} as unknown as Record<string, unknown>
peek(ink).cursorDeclaration = { node: fakeNode, relativeX: 2, relativeY: 1 }
peek(ink).displayCursor = { x: 4, y: 2 }
ink.noteExternalCursorAdvance(1, 3)
expect(peek(ink).displayCursor).toEqual({ x: 5, y: 5 })
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 3, relativeY: 4 })
// Negative dy too — cursor moving up across visual rows.
ink.noteExternalCursorAdvance(0, -2)
expect(peek(ink).displayCursor).toEqual({ x: 5, y: 3 })
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 3, relativeY: 2 })
ink.unmount()
})
it('leaves cursorDeclaration unchanged when no declaration is active', () => {
const { ink } = makeInk()
ink.render(React.createElement(Text, null, 'hi'))
ink.onRender()
expect(peek(ink).cursorDeclaration).toBeNull()
ink.noteExternalCursorAdvance(3)
expect(peek(ink).cursorDeclaration).toBeNull()
ink.unmount()
})
})


@ -16,6 +16,7 @@ import { logError } from '../utils/log.js'
import { colorize } from './colorize.js'
import App from './components/App.js'
import type { CursorAdvanceNotifier } from './components/CursorAdvanceContext.js'
import type { CursorDeclaration, CursorDeclarationSetter } from './components/CursorDeclarationContext.js'
import { FRAME_INTERVAL_MS } from './constants.js'
import * as dom from './dom.js'
@ -2219,6 +2220,85 @@ export default class Ink {
this.cursorDeclaration = decl
}
// Caller writes raw bytes to stdout that move the physical terminal
// cursor (e.g. TextInput's fast-echo bypass). Without this notification,
// Ink's `displayCursor` cache and log-update's prevFrame.cursor stay
// unchanged, so the next frame's relative cursor moves compute from a
// stale position and the hardware cursor parks `dx` cells offset from
// the actual caret. Visible symptom: extra whitespace between the just-
// typed character and the cursor block, more pronounced on long
// sessions where unrelated components re-render between fast-echo and
// the deferred composer re-render.
//
// If displayCursor was already tracked, just bump it. Otherwise seed it
// to (prevFrame.cursor + delta) so the next frame's preamble emits a
// (-dx, -dy) relative move that brings the cursor back to log-update's
// expected start position before the diff body runs.
//
// Public so tests can drive it directly without mounting App.
//
// Bumps BOTH `displayCursor` (used by log-update's relative-move
// preamble) AND, if non-null, `cursorDeclaration.relativeX/Y` (the
// target the cursor parks at after every frame). Advancing only one
// of the two would leave the other stale: e.g. if the deferred React
// `setCur` hasn't flushed yet, the next unrelated re-render would
// re-compute `target` from the stale declaration and park the
// hardware cursor back at the old caret column. We advance both so
// the fast-echo is invisible to intervening frames until React
// catches up.
noteExternalCursorAdvance: CursorAdvanceNotifier = (dx, dy = 0) => {
if (dx === 0 && dy === 0) {
return
}
// displayCursor / log-update relative-move basis only matters on
// main screen — alt-screen frames begin with absolute CSI H every
// frame so the next preamble naturally resets to (0,0). cursorDeclaration,
// however, IS still consulted on alt-screen — onRender's park branch
// emits an absolute CUP using `rect.x + decl.relativeX`, so a stale
// declaration in the deferred-setCur window would park the cursor
// at the pre-keystroke caret. We therefore skip ONLY the displayCursor
// half on alt-screen, not the declaration half.
if (!this.altScreenActive) {
if (this.displayCursor !== null) {
this.displayCursor = {
x: this.displayCursor.x + dx,
y: this.displayCursor.y + dy
}
} else {
// No prior parked position. Seed from frontFrame.cursor (where
// log-update parked the cursor at the end of the last frame) so
// the next preamble's relative move correctly cancels the
// external advance.
const baseX = this.frontFrame.cursor.x
const baseY = this.frontFrame.cursor.y
this.displayCursor = { x: baseX + dx, y: baseY + dy }
}
}
// Also advance the active cursor declaration if any. Without this,
// a TextInput that defers its React `cur` state update (16ms timer
// in textInput.tsx — perf optimization that batches re-renders
// during heavy typing) leaves `cursorDeclaration.relativeX` pointing
// at the pre-keystroke caret column. If an unrelated component
// re-renders before the deferred `setCur` flushes, the cursor-park
// branch at the end of onRender would move the hardware cursor back
// to that stale relativeX and visually undo the fast-echo's
// advance. Bumping relativeX here keeps the declared target in
// lock-step with the physical cursor until React state catches up.
// Applies to BOTH main-screen and alt-screen — the alt-screen park
// branch uses an absolute CUP to (rect.x + decl.relativeX), so a
// stale declaration there would still produce the wrong column.
const decl = this.cursorDeclaration
if (decl !== null) {
this.cursorDeclaration = {
node: decl.node,
relativeX: decl.relativeX + dx,
relativeY: decl.relativeY + dy
}
}
}
render(node: ReactNode): void {
this.currentNode = node
@ -2228,6 +2308,7 @@ export default class Ink {
exitOnCtrlC={this.options.exitOnCtrlC}
getHyperlinkAt={this.getHyperlinkAt}
onClickAt={this.dispatchClick}
onCursorAdvance={this.noteExternalCursorAdvance}
onCursorDeclaration={this.setCursorDeclaration}
onExit={this.unmount}
onHoverAt={this.dispatchHover}
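
The two-part update performed by `noteExternalCursorAdvance` can be sketched as a pure function over a small state record. This is a minimal model, not the shipped code: the `CursorModel` type, the `frontFrameCursor` field, and the `applyCursorAdvance` helper are hypothetical names introduced for illustration, and only the field names `displayCursor`, `cursorDeclaration`, and `altScreenActive` come from the diff.

```typescript
// Hypothetical, simplified model of the state touched by
// Ink.noteExternalCursorAdvance. CursorModel and applyCursorAdvance are
// illustrative names only; they do not exist in the Ink source.
type Point = { x: number; y: number }

type CursorModel = {
  altScreenActive: boolean
  displayCursor: Point | null
  cursorDeclaration: { node: unknown; relativeX: number; relativeY: number } | null
  // Stand-in for `frontFrame.cursor` in the real class.
  frontFrameCursor: Point
}

function applyCursorAdvance(state: CursorModel, dx: number, dy = 0): CursorModel {
  // A zero delta changes nothing.
  if (dx === 0 && dy === 0) {
    return state
  }

  let displayCursor = state.displayCursor

  // Main screen only: keep the relative-move basis in sync, seeding it
  // from the last frame's cursor when nothing was parked yet.
  if (!state.altScreenActive) {
    const base = displayCursor ?? state.frontFrameCursor
    displayCursor = { x: base.x + dx, y: base.y + dy }
  }

  // Both screens: shift the declared park target, if one is active.
  const decl = state.cursorDeclaration
  const cursorDeclaration =
    decl === null ? null : { node: decl.node, relativeX: decl.relativeX + dx, relativeY: decl.relativeY + dy }

  return { ...state, displayCursor, cursorDeclaration }
}

const mainScreen: CursorModel = {
  altScreenActive: false,
  displayCursor: { x: 12, y: 0 },
  cursorDeclaration: { node: {}, relativeX: 7, relativeY: 0 },
  frontFrameCursor: { x: 0, y: 0 }
}

const afterMain = applyCursorAdvance(mainScreen, 3)
const afterAlt = applyCursorAdvance({ ...mainScreen, altScreenActive: true }, 3)

console.log(afterMain.displayCursor, afterMain.cursorDeclaration?.relativeX)
console.log(afterAlt.displayCursor, afterAlt.cursorDeclaration?.relativeX)
```

On the main screen both halves move together; on alt-screen only the declaration moves, which mirrors the cases the tests for this method pin down.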


@ -0,0 +1,50 @@
import { readFileSync } from 'node:fs'
import { dirname, join } from 'node:path'
import { fileURLToPath } from 'node:url'
import { describe, expect, it } from 'vitest'
// Locate textInput.tsx relative to this test file so the assertion
// survives moves of the test fixture itself.
const TEXT_INPUT_PATH = join(dirname(fileURLToPath(import.meta.url)), '..', 'components', 'textInput.tsx')
const source = readFileSync(TEXT_INPUT_PATH, 'utf8')
// Closes Copilot follow-up on PR #26717: the original cursor-drift
// fix bumped Ink's displayCursor / cursorDeclaration on fast-echo, but
// if TextInput itself re-renders before the deferred 16ms `setCur`
// flushes (parent state change, status-bar tick, spinner) the layout
// effect inside `useDeclaredCursor` re-publishes a declaration
// computed from the STALE React `cur` state and clobbers the Ink-level
// bump. The fix is structural: read `curRef.current` (always
// up-to-date) when computing the layout, not the `cur` state.
//
// This file pins that invariant. Switching back to `cur` state — or
// re-introducing a memo keyed on `cur` that uses `curRef.current`
// inside but stops re-computing on rerender — is a regression and
// should be caught here, not via a flaky integration test that mounts
// Ink + stdin.
describe('textInput cursor-layout source of truth', () => {
it('reads curRef.current (not the cur React state) for cursorLayout', () => {
// The line we care about. We allow whitespace / formatting drift,
// but the call itself must use `curRef.current`.
expect(source).toMatch(/cursorLayout\(\s*display\s*,\s*curRef\.current\s*,\s*columns\s*\)/)
})
it('does not pass the bare `cur` React state into cursorLayout', () => {
// Any `cursorLayout(display, cur, columns)` invocation would
// reintroduce the stale-declaration window.
expect(source).not.toMatch(/cursorLayout\(\s*display\s*,\s*cur\s*,\s*columns\s*\)/)
})
it('keeps the fast-echo notifier calls paired with the stdout writes', () => {
// Both fast-echo paths must call noteCursorAdvance, otherwise Ink
// never learns about the out-of-band write and drifts again. We
// tolerate explanatory comments in between (the rationale block is
// intentionally long), but the pairing itself must hold.
const backspacePattern = /stdout!\.write\(['"`]\\b \\b['"`]\)[\s\S]{0,1000}?noteCursorAdvance\(-1\)/
expect(source).toMatch(backspacePattern)
const appendPattern = /stdout!\.write\(text\)[\s\S]{0,1000}?noteCursorAdvance\(text\.length\)/
expect(source).toMatch(appendPattern)
})
})
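
To see what the source-pinning assertions above accept and reject, the two `cursorLayout` patterns can be run against sample lines. The patterns are copied from the test; the sample strings are made-up stand-ins for lines of `textInput.tsx`, not excerpts read from the file.

```typescript
// Patterns copied from the test above.
const usesRef = /cursorLayout\(\s*display\s*,\s*curRef\.current\s*,\s*columns\s*\)/
const usesState = /cursorLayout\(\s*display\s*,\s*cur\s*,\s*columns\s*\)/

// Hypothetical sample lines (assumptions for illustration only).
const good = 'const layout = cursorLayout(display, curRef.current, columns)'
const reflowed = 'const layout = cursorLayout(\n  display,\n  curRef.current,\n  columns\n)'
const regressed = 'const layout = useMemo(() => cursorLayout(display, cur, columns), [columns, cur, display])'

const results = {
  // The intended call passes the positive pin and not the negative one.
  good: usesRef.test(good) && !usesState.test(good),
  // Formatter reflow across lines is tolerated by the \s* runs.
  reflowed: usesRef.test(reflowed) && !usesState.test(reflowed),
  // Going back to the `cur` state trips the negative pin.
  regressed: usesState.test(regressed) && !usesRef.test(regressed)
}

console.log(results)
```

Because `cur` must be followed by optional whitespace and a comma, the negative pattern does not accidentally match `curRef.current`.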


@ -133,4 +133,42 @@ describe('canFastBackspaceShape', () => {
it('rejects deleting an emoji', () => {
expect(canFastBackspaceShape('hi🙂', 'hi🙂'.length)).toBe(false)
})
// Closes Copilot PR #26717 round 3: the "\b \b" sequence cannot move
// the terminal cursor onto the previous visual row across a
// soft-wrap boundary. When the caret sits at visual column 0 of a
// wrapped row (column == 0 in the computed cursor layout), backspace
// would leave the physical cursor in place while the logical caret
// moves up to the end of the previous visual line — desyncing both
// Ink's displayCursor model and the user-visible position. The fast
// path must fall through in that case so the normal Ink render path
// can lay out the correct cursor position.
it('rejects fast-backspace at a soft-wrap boundary when columns is known', () => {
// value width 6 in a column of 6 → cursorLayout produces (line 1, col 0)
// i.e. the caret has overflowed onto the next visual line.
const value = 'hello '
expect(canFastBackspaceShape(value, value.length, 6)).toBe(false)
})
it('rejects fast-backspace at an exact multiple of columns (wide wrap)', () => {
// 12 chars at width 6 → two full visual rows, caret at (line 2, col 0).
const value = 'abcdefghijkl'
expect(canFastBackspaceShape(value, value.length, 6)).toBe(false)
})
it('still accepts fast-backspace inside a wrapped line', () => {
// Caret mid-visual-line — "\b \b" can move the cursor one cell left
// without crossing a wrap boundary.
expect(canFastBackspaceShape('hello world', 'hello world'.length, 20)).toBe(true)
expect(canFastBackspaceShape('abcdefghi', 9, 6)).toBe(true) // visual line 1, col 3 → ok
})
it('skips the wrap-boundary check when columns is omitted (legacy contract)', () => {
// Callers that don't pass `columns` fall back to the pre-wrap-aware
// behavior — the function does NOT magically reject anything that
// could be a wrap boundary without the width. Production callers
// must always pass `columns`; this case is for unit tests of the
// pre-wrap shape contract.
expect(canFastBackspaceShape('hello ', 'hello '.length)).toBe(true)
})
})
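
The wrap-boundary cases above can be reproduced with a deliberately simplified model. This is a sketch under stated assumptions: `simpleCursorLayout` assumes printable ASCII (one cell per character) and hard wrapping every `columns` cells, whereas the real `cursorLayout` is not shown in this diff and may wrap differently; `canFastBackspaceSketch` and `PRINTABLE_ASCII` are hypothetical stand-ins for `canFastBackspaceShape` and `ASCII_PRINTABLE_RE`.

```typescript
// Simplified stand-in for cursorLayout: one cell per character, hard
// wrap every `columns` cells. An assumption, not the real helper.
function simpleCursorLayout(cursor: number, columns: number): { line: number; column: number } {
  return { line: Math.floor(cursor / columns), column: cursor % columns }
}

// Hypothetical equivalent of ASCII_PRINTABLE_RE for a single character.
const PRINTABLE_ASCII = /^[\x20-\x7e]$/

// Restates the checks described in the tests; not the shipped function.
function canFastBackspaceSketch(current: string, cursor: number, columns?: number): boolean {
  // Only a caret at the very end of a non-empty value qualifies.
  if (cursor !== current.length || cursor === 0) {
    return false
  }

  // With a known width, refuse when the caret sits at visual column 0:
  // "\b \b" cannot step back onto the previous visual row.
  if (columns !== undefined && simpleCursorLayout(cursor, columns).column === 0) {
    return false
  }

  // The removed character must be a single printable ASCII cell.
  return PRINTABLE_ASCII.test(current.charAt(cursor - 1))
}

console.log(canFastBackspaceSketch('hello ', 6, 6))
console.log(canFastBackspaceSketch('abcdefghi', 9, 6))
console.log(canFastBackspaceSketch('hello ', 6))
```

Omitting `columns` skips the boundary check entirely, which is why the legacy-contract case still accepts `'hello '`.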


@ -16,13 +16,14 @@ import {
type InkExt = typeof Ink & {
stringWidth: (s: string) => number
useCursorAdvance: () => (dx: number, dy?: number) => void
useDeclaredCursor: (a: { line: number; column: number; active: boolean }) => (el: any) => void
useStdout: () => { stdout?: NodeJS.WriteStream }
useTerminalFocus: () => boolean
}
const ink = Ink as unknown as InkExt
const { Box, Text, useStdin, useInput, useStdout, stringWidth, useDeclaredCursor, useTerminalFocus } = ink
const { Box, Text, useStdin, useInput, useStdout, stringWidth, useCursorAdvance, useDeclaredCursor, useTerminalFocus } = ink
const ESC = '\x1b'
const INV = `${ESC}[7m`
@ -238,8 +239,26 @@ export function canFastAppendShape(
* ASCII. Anything else (combining marks, IME compositions, wide chars,
* tabs, ANSI fragments) goes through the normal render path so Ink can
* recompute cell widths.
*
* When `columns` is supplied, ALSO rejects when the physical cursor
* sits at visual column 0, i.e., right after a soft-wrap boundary.
* The "\b \b" sequence cannot move the cursor onto the previous visual
* row (terminals don't back-step across line wraps), so the physical
* cursor would stay put while the logical caret moves to the end of
* the previous visual line, desyncing both Ink's `displayCursor` model
* and the user-visible position.
*
* When `columns` is OMITTED, the wrap-boundary check is skipped
* entirely and the function reverts to the legacy non-wrap-aware
* contract: values like `'hello '` will return `true` even though
* they would be unsafe at a width of 6. Production callers (the
* composer's `canFastBackspace` helper) always pass `columns`;
* `columns` is optional only so unit tests of the pre-wrap shape
* contract can keep calling the helper without threading width
* through. Do NOT omit it from any new caller that relies on the
* wrap-boundary protection.
*/
export function canFastBackspaceShape(current: string, cursor: number): boolean {
export function canFastBackspaceShape(current: string, cursor: number, columns?: number): boolean {
if (cursor !== current.length) {
return false
}
@ -252,6 +271,13 @@ export function canFastBackspaceShape(current: string, cursor: number): boolean
return false
}
// If we know the wrap width, reject at the soft-wrap boundary: the
// caret's visual column is 0, so "\b \b" can't represent the physical
// move back to the previous visual line.
if (columns !== undefined && cursorLayout(current, cursor, columns).column === 0) {
return false
}
const removed = current.slice(prevPos(current, cursor), cursor)
return ASCII_PRINTABLE_RE.test(removed)
@ -333,6 +359,7 @@ export function TextInput({
const fwdDel = useFwdDelete(focus)
const termFocus = useTerminalFocus()
const { stdout } = useStdout()
const noteCursorAdvance = useCursorAdvance()
const curRef = useRef(cur)
const selRef = useRef<null | { end: number; start: number }>(null)
@ -368,7 +395,19 @@ export function TextInput({
[sel]
)
const layout = useMemo(() => cursorLayout(display, cur, columns), [columns, cur, display])
// Read `curRef.current` (always up-to-date) rather than the `cur`
// React state. The fast-echo path defers the React `setCur` by 16ms
// to batch re-renders during heavy typing; if an unrelated render
// flushes this component during that window and we used the stale
// `cur` state here, the layout effect inside `useDeclaredCursor`
// would publish a stale cursor declaration and clobber the Ink-level
// bump from `noteCursorAdvance(...)`. `cur` is still in scope and
// referenced by setSel/setCur paths below, so React tracks the
// dependency naturally — we just don't use it as the source of truth
// for layout. The cursorLayout call is cheap (one wrap-text pass
// over a single-line string in the common case), so dropping useMemo
// is fine.
const layout = cursorLayout(display, curRef.current, columns)
const boxRef = useDeclaredCursor({
line: layout.line,
@ -526,7 +565,7 @@ export function TextInput({
canFastEchoBase() && canFastAppendShape(current, cursor, text, columns, lineWidthRef.current)
const canFastBackspace = (current: string, cursor: number) =>
canFastEchoBase() && canFastBackspaceShape(current, cursor)
canFastEchoBase() && canFastBackspaceShape(current, cursor, columns)
const commit = (
next: string,
@ -911,6 +950,12 @@ export function TextInput({
v = v.slice(0, t) + v.slice(c)
c = t
stdout!.write('\b \b')
// The "\b \b" sequence ends with the cursor one column to the
// LEFT of where Ink last parked it. Tell Ink so its `displayCursor`
// (and log-update's relative-move basis on the next frame) stays
// in sync — otherwise the cursor parks one cell to the right of
// the caret on the next unrelated re-render.
noteCursorAdvance(-1)
commit(v, c, true, false, false, Math.max(0, lineWidthRef.current - 1))
return
@ -998,6 +1043,14 @@ export function TextInput({
if (simpleAppend) {
stdout!.write(text)
// ASCII-printable text advances the physical cursor by exactly
// text.length cells (canFastAppendShape rejects non-ASCII,
// wide chars, newlines). Notify Ink so the cached displayCursor
// / log-update relative-move basis advances with it; otherwise
// any unrelated re-render that happens before the 16ms
// setCur/setParent flush parks the cursor text.length cells
// too far right (#cursor-drift).
noteCursorAdvance(text.length)
commit(v, c, true, false, false, lineWidthRef.current + stringWidth(text))
return

Some files were not shown because too many files have changed in this diff.