mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-11 08:42:11 +00:00
Merge branch 'main' of github.com:NousResearch/hermes-agent into bb/gui
commit
062eed654d
116 changed files with 10770 additions and 258 deletions
477
RELEASE_v0.14.0.md
Normal file
@ -0,0 +1,477 @@
# Hermes Agent v0.14.0 (v2026.5.16)
**Release Date:** May 16, 2026
**Since v0.13.0:** 808 commits · 633 merged PRs · 1393 files changed · 165,061 insertions · 545 issues closed (12 P0, 50 P1) · 215 community contributors (including co-authors)
> The Foundation Release — Hermes Agent installs and runs anywhere now. Native Windows ships in early beta with a full PowerShell installer story, a `pip install hermes-agent` wheel lands on PyPI, lazy-deps reshape what `pip install hermes-agent` actually pulls down, the supply-chain checker scans every install/upgrade for unsafe versions, and a new OpenAI-compatible local proxy lets Codex / Aider / Cline talk to OAuth-only providers (Claude Pro, ChatGPT Pro, SuperGrok). The cold-start wave shaves ~19 seconds off `hermes` launch, browser-tool CDP calls run 180x faster, and `hermes tools` All-Platforms drops from 14s to under 1.5s. Two new messaging platforms (LINE and SimpleX Chat) and a Microsoft Graph foundation (Teams pipeline + webhook adapter) land alongside `/handoff` that finally transfers sessions live, `vision_analyze` passing pixels through to vision-capable models, `x_search` as a first-class tool, LSP semantic diagnostics on every `write_file` / `patch`, a unified pluggable `video_generate`, a `computer_use` cua-driver backend, cross-session 1-hour Claude prompt caching, a per-turn file-mutation verifier, plus 9 new optional skills. 50+ P1 closures, 12 P0 closures.
---

## ✨ Highlights
- **Native Windows support (early beta)** — full PowerShell installer, native subprocess/PTY paths, taskkill-based process management, MinGit auto-install, Microsoft Store python stub detection, foreground Ctrl+C preservation, taskkill+ps2 fallback, npm prefix handling, and ~40 follow-up Windows-only fixes across CLI / gateway / TUI / curator / tools. Hermes finally runs natively on `cmd.exe` and PowerShell, no WSL required. ([#21561](https://github.com/NousResearch/hermes-agent/pull/21561), [#22130](https://github.com/NousResearch/hermes-agent/pull/22130), [#22752](https://github.com/NousResearch/hermes-agent/pull/22752), [#26618](https://github.com/NousResearch/hermes-agent/pull/26618), and many more)
- **`pip install hermes-agent && hermes`** — Hermes Agent is now a real PyPI package. One command, no clone, no git, no shell installer. Wheel includes the Ink TUI bundle and shell launcher. (salvage of [#26350](https://github.com/NousResearch/hermes-agent/pull/26350)) ([#26593](https://github.com/NousResearch/hermes-agent/pull/26593))
- **Cold-start performance wave — ~19s off `hermes` launch** — skills cache, lazy Feishu import, no Nous HTTP at startup, plus PEP-562 lazy adapter imports (QQ, Yuanbao, Teams, Google Chat), deferred `fal_client` / `google-cloud` / `httpx` loads, models.dev disk-cache-first lookup, parallel doctor API checks, eager-skip plugin discovery on built-in subcommands, `hermes tools` All-Platforms drops from 14s to <1.5s, welcome banner skipped on `chat -q`. ([#22138](https://github.com/NousResearch/hermes-agent/pull/22138), [#22120](https://github.com/NousResearch/hermes-agent/pull/22120), [#22681](https://github.com/NousResearch/hermes-agent/pull/22681), [#22790](https://github.com/NousResearch/hermes-agent/pull/22790), [#22808](https://github.com/NousResearch/hermes-agent/pull/22808), [#22831](https://github.com/NousResearch/hermes-agent/pull/22831), [#22859](https://github.com/NousResearch/hermes-agent/pull/22859), [#22904](https://github.com/NousResearch/hermes-agent/pull/22904), [#22766](https://github.com/NousResearch/hermes-agent/pull/22766), [#25341](https://github.com/NousResearch/hermes-agent/pull/25341))
- **180x faster `browser_console` evaluations** — routed through the supervisor's persistent CDP WebSocket instead of spawning a fresh DevTools session per call. Real-world page interactions feel instant. ([#23226](https://github.com/NousResearch/hermes-agent/pull/23226))
- **Supply-chain advisory checker + lazy-deps framework + tiered install fallback** — every `pip install` / `hermes update` scans dependencies against an advisory list, lazy-deps replace heavy import-time loads with first-use installs, and the installer falls back through extras tiers when a wheel rejects on the target platform. ([#24220](https://github.com/NousResearch/hermes-agent/pull/24220))
- **OpenAI-compatible local proxy** — `hermes proxy` exposes any OAuth-authed provider (Claude Pro, ChatGPT Pro, SuperGrok) as an OpenAI-compatible endpoint that Codex / Aider / Cline / VS Code Continue can hit. Your subscription, your tools. ([#25969](https://github.com/NousResearch/hermes-agent/pull/25969))
- **Cross-session 1-hour Claude prompt cache** — Anthropic / OpenRouter / Nous Portal now share a 1h prefix cache across sessions for Claude models. Fast resume, fast `/new`, lower cost on repeat work. ([#23828](https://github.com/NousResearch/hermes-agent/pull/23828))
- **Two new messaging platforms — LINE + SimpleX Chat** — LINE Messaging API lands as a first-class platform, SimpleX Chat salvages #2558 onto the modern adapter spec. Hermes is now on 22 platforms. ([#23197](https://github.com/NousResearch/hermes-agent/pull/23197), [#26232](https://github.com/NousResearch/hermes-agent/pull/26232))
- **Microsoft Graph foundation — Teams pipeline + webhook adapter** — `msgraph` auth/client foundation, webhook listener platform, Teams pipeline plugin runtime, and Teams outbound delivery via the existing adapter — Hermes can now read and post to Teams. (salvages of #21408–#21411) ([#21922](https://github.com/NousResearch/hermes-agent/pull/21922), [#21969](https://github.com/NousResearch/hermes-agent/pull/21969), [#22007](https://github.com/NousResearch/hermes-agent/pull/22007), [#22024](https://github.com/NousResearch/hermes-agent/pull/22024))
- **`/handoff` actually transfers the session live** — the agent's active session moves to a different model / persona / profile mid-conversation, with messages, tool history, and context preserved. ([#23395](https://github.com/NousResearch/hermes-agent/pull/23395))
- **`x_search` — first-class X (Twitter) search tool** — gated tool with OAuth-or-API-key auth, no skill needed to query the timeline. ([#26763](https://github.com/NousResearch/hermes-agent/pull/26763))
- **`vision_analyze` returns pixels to vision-capable models** — when the active model can see, `vision_analyze` now hands the image straight through instead of falling back to a text description. ([#22955](https://github.com/NousResearch/hermes-agent/pull/22955))
- **LSP semantic diagnostics on every write** — `write_file` and `patch` now run real language-server diagnostics on the post-edit file (delta-only) and surface real errors before they ship downstream. ([#24168](https://github.com/NousResearch/hermes-agent/pull/24168), [#25978](https://github.com/NousResearch/hermes-agent/pull/25978))
- **Per-turn file-mutation verifier footer** — after every turn that wrote files, the agent gets a verifier footer summarizing what actually changed on disk — catches silent overwrites and "wrote it but it didn't land" bugs. ([#24498](https://github.com/NousResearch/hermes-agent/pull/24498))
- **Unified `video_generate` with pluggable provider backends** — single tool, any backend. Drop in a new video provider as a plugin, no core changes. ([#25126](https://github.com/NousResearch/hermes-agent/pull/25126))
- **`computer_use` cua-driver backend** — proper focus-safe ops, non-Anthropic provider support, refresh on `hermes update`. Computer-use is no longer locked to a single SDK. (re-salvage of #16936) ([#21967](https://github.com/NousResearch/hermes-agent/pull/21967), [#24063](https://github.com/NousResearch/hermes-agent/pull/24063))
- **xAI Grok OAuth provider — SuperGrok via subscription** — sign in with your xAI account, talk to Grok models from Hermes. ([#26534](https://github.com/NousResearch/hermes-agent/pull/26534))
- **Clarify with buttons — native inline keyboards on Telegram + Discord** — the `clarify` tool renders multi-choice prompts as platform-native buttons instead of typed responses. ([#24199](https://github.com/NousResearch/hermes-agent/pull/24199), [#25485](https://github.com/NousResearch/hermes-agent/pull/25485))
- **Discord channel history backfill (default on)** — Hermes reads recent channel history when joining a thread so it actually knows what's been said. ([#25984](https://github.com/NousResearch/hermes-agent/pull/25984))
- **Watchers skill — RSS / HTTP JSON / GitHub polling via cron `no_agent` mode** — skill recipes that wire change-detection sources directly into cron's script-only watchdog mode. ([#21881](https://github.com/NousResearch/hermes-agent/pull/21881))
- **Zed ACP Registry integration + uvx distribution** — Hermes is in the Zed registry, installable via `uvx` (no npm). Plus `hermes acp --setup-browser` bootstraps browser tools for registry installs. (salvage of [#25908](https://github.com/NousResearch/hermes-agent/pull/25908)) ([#26079](https://github.com/NousResearch/hermes-agent/pull/26079), [#26120](https://github.com/NousResearch/hermes-agent/pull/26120), [#26234](https://github.com/NousResearch/hermes-agent/pull/26234))
- **OpenRouter Pareto Code router** — wire a new OpenRouter router with `min_coding_score` knob. Pick the cheapest model that meets your quality bar. ([#22838](https://github.com/NousResearch/hermes-agent/pull/22838))
- **Optional codex app-server runtime for OpenAI/Codex models** — drives the OpenAI Codex CLI under the hood for OpenAI/Codex paths, with session reuse, wedge retirement, and OAuth refresh classification. ([#24182](https://github.com/NousResearch/hermes-agent/pull/24182), [#25769](https://github.com/NousResearch/hermes-agent/pull/25769))
- **`hermes-skills/huggingface` as a trusted default tap** — community skills index from huggingface.co/skills is available by default in the Skills Hub. ([#26219](https://github.com/NousResearch/hermes-agent/pull/26219))
- **9 new optional skills** — Hyperliquid (perp/spot trading via SDK + REST) (@kshitijk4poor & Hermes), Yahoo Finance market data, api-testing (REST/GraphQL debug), unified EVM multi-chain skill (folds #25291 + #2010 + base/), darwinian-evolver, osint-investigation (closes #355), pinggy-tunnel, watchers (RSS/HTTP/GitHub via cron), Notion overhaul for the Developer Platform (May 2026). ([#23582](https://github.com/NousResearch/hermes-agent/pull/23582), [#23583](https://github.com/NousResearch/hermes-agent/pull/23583), [#23590](https://github.com/NousResearch/hermes-agent/pull/23590), [#25299](https://github.com/NousResearch/hermes-agent/pull/25299), [#26760](https://github.com/NousResearch/hermes-agent/pull/26760), [#26729](https://github.com/NousResearch/hermes-agent/pull/26729), [#26765](https://github.com/NousResearch/hermes-agent/pull/26765), [#21881](https://github.com/NousResearch/hermes-agent/pull/21881), [#26612](https://github.com/NousResearch/hermes-agent/pull/26612))
- **API server exposes run approval events** — long-running runs surface approval requests over the API stream, no more silent stalls. (salvage of [#20311](https://github.com/NousResearch/hermes-agent/pull/20311)) ([#21899](https://github.com/NousResearch/hermes-agent/pull/21899))
- **`/subgoal` — user-added criteria appended to active `/goal`** — layer extra success criteria onto a running goal loop. The judge sees them in the prompt, no behavior change when subgoals are empty. ([#25449](https://github.com/NousResearch/hermes-agent/pull/25449))
- **Plugins can run any LLM call via `ctx.llm`** — plugins get a first-class hook to make their own LLM requests through the active provider/credentials, no manual wiring. Plus `tool_override` flag for replacing built-in tools. ([#23194](https://github.com/NousResearch/hermes-agent/pull/23194), [#26759](https://github.com/NousResearch/hermes-agent/pull/26759))
- **Brave Search (free tier) + DuckDuckGo (DDGS) as web-search providers** — two new free search backends alongside Tavily / SearXNG / Exa. ([#21337](https://github.com/NousResearch/hermes-agent/pull/21337))
- **Sudo brute-force block + sudo-stdin/askpass DANGEROUS classification** — closes the `sudo -S` brute-force avenue; approval gates classify stdin-fed and askpass-stripped sudo invocations as dangerous. (salvages of #22194 + #21128) ([#23736](https://github.com/NousResearch/hermes-agent/pull/23736))
- **Provider rename — Alibaba Cloud → Qwen Cloud, picker reorder** — matches what the world calls it. Existing config keys still work. ([#24835](https://github.com/NousResearch/hermes-agent/pull/24835))
---

## 🪟 Windows — Native Support (Early Beta)

### Bootstrap & installer
- **Native Windows support (early beta)** — first-class native Windows path across CLI / gateway / TUI / tools ([#21561](https://github.com/NousResearch/hermes-agent/pull/21561))
- **PyPI wheel packaging — `pip install hermes-agent && hermes`** (salvage of #26350) ([#26593](https://github.com/NousResearch/hermes-agent/pull/26593))
- **Recognise Shift+Enter as a newline key** + Windows docs (salvage #21545) ([#22130](https://github.com/NousResearch/hermes-agent/pull/22130))
- **Preserve Ctrl+C for Windows foreground runs** (@helix4u) ([#22752](https://github.com/NousResearch/hermes-agent/pull/22752))
- **Stop spamming cwd-missing + tirith-spawn warnings on every terminal call** ([#26618](https://github.com/NousResearch/hermes-agent/pull/26618))
- **Use `--extra all` not `--all-extras`; drop lazy-covered extras from `[all]`** ([#24515](https://github.com/NousResearch/hermes-agent/pull/24515))
### Windows-specific fixes (40+ across cli / tools / gateway / curator / TUI)

A long tail of native-Windows fixes shipped alongside the beta: taskkill-based subprocess management, MinGit auto-install, Microsoft Store python stub detection, npm prefix handling, native PTY paths, signal handling differences, foreground process management, ANSI sequence handling, path normalization, file-locking semantics, and many more. The full list is in the commit log under `fix(windows)` / `feat(windows)` / `windows`.

---

## 🚀 Performance Wave

### Cold start
- **Cut ~19s from `hermes` cold start** — skills cache + lazy Feishu + no Nous HTTP at startup ([#22138](https://github.com/NousResearch/hermes-agent/pull/22138))
- **Skip eager plugin discovery on known built-in subcommands** ([#22120](https://github.com/NousResearch/hermes-agent/pull/22120))
- **Cache Nous auth + .env loads** — `hermes tools` All Platforms from 14s to <1.5s ([#25341](https://github.com/NousResearch/hermes-agent/pull/25341))
- **Skip welcome banner on `chat -q` single-query mode** ([#22904](https://github.com/NousResearch/hermes-agent/pull/22904))
- **Defer heavy google-cloud imports in google_chat to first adapter use** ([#22681](https://github.com/NousResearch/hermes-agent/pull/22681))
- **Defer QQAdapter and YuanbaoAdapter imports via PEP 562** ([#22790](https://github.com/NousResearch/hermes-agent/pull/22790))
- **Defer httpx import in teams to first webhook call** ([#22831](https://github.com/NousResearch/hermes-agent/pull/22831))
- **Defer fal_client import to first generation request** ([#22859](https://github.com/NousResearch/hermes-agent/pull/22859))
- **models.dev cache-first lookup, skip network when disk cache is fresh** ([#22808](https://github.com/NousResearch/hermes-agent/pull/22808))
- **Parallelize API connectivity checks in `hermes doctor` and disable IMDS** ([#22766](https://github.com/NousResearch/hermes-agent/pull/22766))
### Runtime
- **180x faster `browser_console` evaluations** — route through supervisor's persistent CDP WebSocket ([#23226](https://github.com/NousResearch/hermes-agent/pull/23226))
- **Tune Telegram cadence + adaptive fast-path for short replies** (salvage of #10388) ([#23587](https://github.com/NousResearch/hermes-agent/pull/23587))
- **Accumulate length-continuation prefix via list+join** ([#26237](https://github.com/NousResearch/hermes-agent/pull/26237))
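The list+join entry above names a general Python idiom: collect pieces in a list and join once, rather than growing a string with `+=`, which can copy the accumulated text on every step. A small sketch of the two styles, with hypothetical function names:

```python
def accumulate_concat(chunks):
    """Repeated concatenation: may copy the growing string on each iteration."""
    text = ""
    for chunk in chunks:
        text += chunk
    return text


def accumulate_join(chunks):
    """list + join: append pieces, then build the final string in one pass."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return "".join(parts)
```

Both return the same string; the second keeps the work linear in the total length regardless of interpreter optimizations.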
### Prompt caching
- **Cross-session 1h prefix cache for Claude on Anthropic / OpenRouter / Nous Portal** ([#23828](https://github.com/NousResearch/hermes-agent/pull/23828))
- **Hit prefix cache in background review fork** (salvage #17276 + #25427) ([#25434](https://github.com/NousResearch/hermes-agent/pull/25434))
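For orientation, a one-hour prefix cache on the Anthropic Messages API is requested by marking a stable prompt prefix with a `cache_control` block. The sketch below only builds such a payload. It assumes the `{"type": "ephemeral", "ttl": "1h"}` form of that block; the helper name and the model string are placeholders, and how Hermes assembles its own requests is not shown in these notes.

```python
def build_cached_request(model, system_prompt, user_message):
    """Build a Messages API payload whose system prefix is marked cacheable for 1h."""
    return {
        "model": model,  # placeholder; pass whichever Claude model is in use
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Assumption: 1-hour TTL variant of the ephemeral cache marker.
                "cache_control": {"type": "ephemeral", "ttl": "1h"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }
```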
---

## 📦 Installation & Distribution

### PyPI + supply-chain
- **PyPI wheel packaging — `pip install hermes-agent && hermes`** (salvage of #26350) ([#26593](https://github.com/NousResearch/hermes-agent/pull/26593))
- **Supply-chain advisory checker + lazy-install framework + tiered install fallback** ([#24220](https://github.com/NousResearch/hermes-agent/pull/24220))
- **Use `--extra all` not `--all-extras`; drop lazy-covered extras from `[all]`** ([#24515](https://github.com/NousResearch/hermes-agent/pull/24515))
- **Skip browser download when system chromium exists** (@helix4u) ([#25317](https://github.com/NousResearch/hermes-agent/pull/25317))
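The advisory check in the list above can be pictured as comparing installed distributions against a list of known-bad versions. The sketch below is an illustration under stated assumptions: the advisory format, the `example-pkg` entry, and both function names are hypothetical, and only `importlib.metadata` is a real standard-library API.

```python
from importlib import metadata

# Hypothetical advisory list: lowercase package name -> versions known to be unsafe.
ADVISORIES = {"example-pkg": {"1.2.3", "1.2.4"}}


def installed_versions():
    """Map each installed distribution name to its version string."""
    return {
        dist.metadata["Name"]: dist.version
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip distributions with broken metadata
    }


def find_unsafe(installed, advisories=ADVISORIES):
    """Return sorted (name, version) pairs whose version matches an advisory."""
    return sorted(
        (name, version)
        for name, version in installed.items()
        if version in advisories.get(name.lower(), set())
    )
```

A caller could run `find_unsafe(installed_versions())` after an install or upgrade and refuse to continue when the result is non-empty.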
### Nix
- **`extraDependencyGroups` for sealed venv extras** (@alt-glitch) ([#21817](https://github.com/NousResearch/hermes-agent/pull/21817))
- **Refresh npm lockfile hashes** — keeps Nix flake builds reproducible
### Docker
- **Bootstrap auth.json from env on first boot** ([#21880](https://github.com/NousResearch/hermes-agent/pull/21880))
- **Drop manual @hermes/ink build, rely on esbuild bundle** — slimmer image
### ACP / Zed
- **Zed ACP Registry integration** (salvage of #25908) ([#26079](https://github.com/NousResearch/hermes-agent/pull/26079))
- **Switch to uvx distribution, drop npm launcher** ([#26120](https://github.com/NousResearch/hermes-agent/pull/26120))
- **`hermes acp --setup-browser` bootstraps browser tools for registry installs** ([#26234](https://github.com/NousResearch/hermes-agent/pull/26234))
---

## 🏗️ Core Agent & Architecture

### Sessions & handoff
- **`/handoff` actually transfers the session live** ([#23395](https://github.com/NousResearch/hermes-agent/pull/23395))
- **Expose `HERMES_SESSION_ID` env var to agent tools** (@alt-glitch) ([#23847](https://github.com/NousResearch/hermes-agent/pull/23847))
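A tool or script launched by the agent could read the session identifier from its environment. In this sketch only the variable name `HERMES_SESSION_ID` comes from the notes above; the helper name and the fallback behaviour are assumptions, and the format of the value is not specified here.

```python
import os


def current_session_id(default=None):
    """Return the session id exposed to tools, or `default` when it is not set."""
    return os.environ.get("HERMES_SESSION_ID", default)
```

A script might use the value to tag its log lines or to name per-session scratch files.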
### Goals (Ralph loop)
- **`/subgoal` — user-added criteria appended to active `/goal`** ([#25449](https://github.com/NousResearch/hermes-agent/pull/25449))
- **`/goal` checklist + `/subgoal` user controls** ([#23456](https://github.com/NousResearch/hermes-agent/pull/23456)), rolled back within the release window ([#23813](https://github.com/NousResearch/hermes-agent/pull/23813)); `/subgoal` returned in simpler form via #25449
### Compression
- **Make `protect_first_n` configurable** ([#25447](https://github.com/NousResearch/hermes-agent/pull/25447))
### Verification
- **Per-turn file-mutation verifier footer** ([#24498](https://github.com/NousResearch/hermes-agent/pull/24498))
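One way to picture a per-turn verifier is to hash the files of interest before and after a turn and report what actually changed on disk. The sketch below takes that approach as an assumption; the function names are hypothetical, and the real footer's mechanism and format are not described in these notes.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each path to the SHA-256 of its contents, or None if it is not a file."""
    result = {}
    for path in map(Path, paths):
        digest = hashlib.sha256(path.read_bytes()).hexdigest() if path.is_file() else None
        result[str(path)] = digest
    return result


def diff_snapshots(before, after):
    """Classify every path whose state differs between two snapshots."""
    changes = {}
    for path in sorted(set(before) | set(after)):
        old, new = before.get(path), after.get(path)
        if old == new:
            continue
        if old is None:
            changes[path] = "created"
        elif new is None:
            changes[path] = "deleted"
        else:
            changes[path] = "modified"
    return changes


def unlanded(claimed_writes, changes):
    """Paths a turn claimed to write that show no change on disk."""
    return [path for path in claimed_writes if path not in changes]
```

Comparing the tool-reported writes against `diff_snapshots` surfaces both unexpected changes and writes that never landed.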
### Stream retry
- **Log inner cause, upstream headers, bytes/elapsed on every drop** ([#23005](https://github.com/NousResearch/hermes-agent/pull/23005))
---

## 🤖 Models & Providers

### New providers
- **xAI Grok OAuth (SuperGrok Subscription) provider** ([#26534](https://github.com/NousResearch/hermes-agent/pull/26534))
- **NovitaAI provider** (salvage #7219) (@kshitijk4poor) ([#25507](https://github.com/NousResearch/hermes-agent/pull/25507))
- **NVIDIA NIM billing origin header** (salvage #25211) ([#26585](https://github.com/NousResearch/hermes-agent/pull/26585))
### Provider work
- **OpenRouter Pareto Code router with `min_coding_score` knob** ([#22838](https://github.com/NousResearch/hermes-agent/pull/22838))
- **Optional codex app-server runtime for OpenAI/Codex models** ([#24182](https://github.com/NousResearch/hermes-agent/pull/24182))
- **Codex-runtime: retire wedged sessions + post-tool watchdog + OAuth refresh classify** ([#25769](https://github.com/NousResearch/hermes-agent/pull/25769))
- **Codex-runtime: skip unavailable plugins during migration** ([#25437](https://github.com/NousResearch/hermes-agent/pull/25437))
- **Codex-runtime: de-dup `[plugins.X]` tables and stop leaking HERMES_HOME into config.toml** (#26250) (@kshitijk4poor) ([#26260](https://github.com/NousResearch/hermes-agent/pull/26260))
- **Pass `reasoning.effort` to xAI Responses API** ([#22807](https://github.com/NousResearch/hermes-agent/pull/22807))
- **Custom provider: prompt and persist explicit `api_mode`** ([#25068](https://github.com/NousResearch/hermes-agent/pull/25068))
- **Rename Alibaba Cloud → Qwen Cloud, reorder picker** ([#24835](https://github.com/NousResearch/hermes-agent/pull/24835))
- **Restore gpt-5.3-codex-spark for ChatGPT Pro** (salvage #18286 + #19530, fixes #16172) (@kshitijk4poor) ([#22991](https://github.com/NousResearch/hermes-agent/pull/22991))
- **Inject tool-use enforcement for GLM models** ([#24715](https://github.com/NousResearch/hermes-agent/pull/24715))
- **Use Nous Portal as model metadata authority** (@rob-maron) ([#24502](https://github.com/NousResearch/hermes-agent/pull/24502))
- **Unified `client=hermes-client-v<version>` tag on every Portal request** ([#24779](https://github.com/NousResearch/hermes-agent/pull/24779))
- **Prevent stale Ollama credentials after provider switch** (@kshitijk4poor) ([#21703](https://github.com/NousResearch/hermes-agent/pull/21703))
- **Auxiliary client: rotate pooled auth after quota failures** (salvage #22779) ([#22792](https://github.com/NousResearch/hermes-agent/pull/22792))
- **Auxiliary client: skip providers without credentials immediately** (#25395) ([#25487](https://github.com/NousResearch/hermes-agent/pull/25487))
- **Auth: send Nous refresh token via header** (@shannonsands) ([#21578](https://github.com/NousResearch/hermes-agent/pull/21578))
- **MiniMax: harden OAuth dashboard and runtime** ([#24165](https://github.com/NousResearch/hermes-agent/pull/24165))
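The `min_coding_score` knob on the Pareto Code router entry above amounts to picking the cheapest model that still clears a quality bar. The routing itself happens on the provider side; the sketch below only illustrates that selection rule with made-up candidates, and every name and number in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    coding_score: float    # hypothetical quality metric, higher is better
    price_per_mtok: float  # hypothetical price, lower is better


def cheapest_meeting_bar(candidates, min_coding_score):
    """Return the lowest-priced candidate whose score meets the bar, or None."""
    eligible = [c for c in candidates if c.coding_score >= min_coding_score]
    return min(eligible, key=lambda c: c.price_per_mtok, default=None)
```

Raising the bar shrinks the eligible set, so the chosen model can only stay the same or get more expensive.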
### OpenAI-compatible proxy
- **Local OpenAI-compatible proxy for OAuth providers** — Codex / Aider / Cline can hit Claude Pro, ChatGPT Pro, SuperGrok ([#25969](https://github.com/NousResearch/hermes-agent/pull/25969))
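Because the proxy speaks the OpenAI wire format, any client that can POST to a chat-completions route should be able to use it. The sketch below builds such a request with the standard library. The host, port, and model name are placeholders chosen for illustration (the notes do not state the proxy's address), and the request is constructed but not sent.

```python
import json
import urllib.request

# Assumption: the proxy listens locally; adjust host and port to the real setup.
BASE_URL = "http://127.0.0.1:8000/v1"


def build_chat_request(model, prompt, base_url=BASE_URL):
    """Build an OpenAI-style chat completion request aimed at a local proxy."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(build_chat_request("some-model", "hi")) as resp:
#       print(json.load(resp))
```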
---

## 📱 Messaging Platforms (Gateway)

### New platforms
- **LINE Messaging API platform plugin** ([#23197](https://github.com/NousResearch/hermes-agent/pull/23197))
- **SimpleX Chat platform plugin** (salvages #2558) ([#26232](https://github.com/NousResearch/hermes-agent/pull/26232))
### Microsoft Graph foundation
- **msgraph: add auth and client foundation** (salvage of #21408) ([#21922](https://github.com/NousResearch/hermes-agent/pull/21922))
- **msgraph: add webhook listener platform** (salvage of #21409) ([#21969](https://github.com/NousResearch/hermes-agent/pull/21969))
- **teams-pipeline: add plugin runtime and operator cli** (salvage of #21410) ([#22007](https://github.com/NousResearch/hermes-agent/pull/22007))
- **teams: add pipeline outbound delivery via existing adapter** (salvage of #21411) ([#22024](https://github.com/NousResearch/hermes-agent/pull/22024))
### Cross-platform
- **Per-platform admin/user split for slash commands** (salvage of #4443) ([#23373](https://github.com/NousResearch/hermes-agent/pull/23373))
- **Forensics on signal handling — non-blocking diag, per-phase timing, stale-unit warning** ([#23285](https://github.com/NousResearch/hermes-agent/pull/23285))
- **Keep gateway running when platforms fail; add per-platform circuit breaker + `/platform`** ([#26600](https://github.com/NousResearch/hermes-agent/pull/26600))
- **Wire `clarify` tool with inline keyboard buttons on Telegram** ([#24199](https://github.com/NousResearch/hermes-agent/pull/24199))
- **Add `chat_id` to `hook_ctx` for message source tracking** ([#24710](https://github.com/NousResearch/hermes-agent/pull/24710))
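The per-platform circuit breaker mentioned in the list above follows a well-known pattern: after repeated failures, stop calling a platform for a cooldown period so the rest of the gateway keeps running. A generic sketch of that pattern is shown here; the class, thresholds, and registry are hypothetical and not taken from the gateway's code.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, retries after `cooldown` seconds."""

    def __init__(self, max_failures=3, cooldown=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow(self):
        """Return True if a call to the platform may be attempted now."""
        if self._opened_at is None:
            return True
        # Half-open: once the cooldown has passed, let an attempt through.
        return self._clock() - self._opened_at >= self.cooldown

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()


_breakers = {}


def breaker_for(platform):
    """One breaker per platform name, created on first use."""
    return _breakers.setdefault(platform, CircuitBreaker())
```

Keeping one breaker per platform means a failing adapter is skipped on its own while the others continue to deliver.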
### Telegram
- **Native draft streaming via `sendMessageDraft` (Bot API 9.5+)** (salvage of #3412) ([#23512](https://github.com/NousResearch/hermes-agent/pull/23512))
- **Stream Telegram edits safely** — salvage of #22264 (@kshitijk4poor) ([#22518](https://github.com/NousResearch/hermes-agent/pull/22518))
- **Telegram notification mode** (salvage #22772) ([#22793](https://github.com/NousResearch/hermes-agent/pull/22793))
- **Telegram guest mention mode** (@kshitijk4poor) ([#22759](https://github.com/NousResearch/hermes-agent/pull/22759))
- **Split-and-deliver oversized edits instead of silent truncation** (salvage of #19537) ([#23576](https://github.com/NousResearch/hermes-agent/pull/23576))
- **Preserve DM topic routing via reply fallback** (salvage #22053) (@kshitijk4poor) ([#22410](https://github.com/NousResearch/hermes-agent/pull/22410))
- **Pass `source.thread_id` explicitly on auto-reset notice** (carve-out of #7404) ([#23440](https://github.com/NousResearch/hermes-agent/pull/23440))
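Splitting instead of truncating, as in the oversized-edit entry above, can be sketched as chunking text at the platform's message limit while preferring newline boundaries. Telegram caps text messages at 4096 characters; the function name and the newline preference are assumptions, and this sketch counts Python characters rather than applying any platform-specific length rules.

```python
TELEGRAM_LIMIT = 4096  # maximum characters in one Telegram text message


def split_message(text, limit=TELEGRAM_LIMIT):
    """Split text into chunks of at most `limit` characters, preferring newlines."""
    chunks = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit + 1)  # last newline that keeps the chunk in bounds
        if cut <= 0:
            cut = limit  # no usable newline: fall back to a hard split
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")  # drop the newline(s) at the split point
    if text:
        chunks.append(text)
    return chunks
```

Each chunk can then be delivered as its own message, so nothing is silently dropped.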
### Discord
- **Render clarify choices as buttons** ([#25485](https://github.com/NousResearch/hermes-agent/pull/25485))
- **Channel history backfill — default on, broadened scope** ([#25984](https://github.com/NousResearch/hermes-agent/pull/25984))
- **`thread_require_mention` for multi-bot threads** (salvage #25313) ([#25445](https://github.com/NousResearch/hermes-agent/pull/25445))
### Slack
- **Support `!cmd` as alternate prefix for slash commands in threads** ([#25355](https://github.com/NousResearch/hermes-agent/pull/25355))
### WhatsApp
- **Surface quoted reply metadata from Baileys** (#25398) ([#25489](https://github.com/NousResearch/hermes-agent/pull/25489))
### Feishu / Google Chat / others
- **Feishu: native update prompt cards** (@kshitijk4poor) ([#22448](https://github.com/NousResearch/hermes-agent/pull/22448))
- **Google Chat: repair setup prompt imports** (@helix4u) ([#22038](https://github.com/NousResearch/hermes-agent/pull/22038))
- **Google Chat: honor relay-declared sender_type** (salvage of #22107) (@kshitijk4poor) ([#22432](https://github.com/NousResearch/hermes-agent/pull/22432))
- **LINE: use `build_source` instead of nonexistent `create_source`** ([#24717](https://github.com/NousResearch/hermes-agent/pull/24717))
- **Add `weixin, and more` to gateway docs** (salvage of #21063 by @wuwuzhijing)
---

## 🖥️ CLI & TUI

### CLI
- **Show YOLO mode warning in banner and status bar** ([#26238](https://github.com/NousResearch/hermes-agent/pull/26238))
- **Confirm prompt for destructive slash commands** (#4069) ([#22687](https://github.com/NousResearch/hermes-agent/pull/22687))
- **`docker_extra_args` + `display.timestamps`** ([#23599](https://github.com/NousResearch/hermes-agent/pull/23599))
- **Delegate tool: show user's actual concurrency / spawn-depth limits in description** ([#22694](https://github.com/NousResearch/hermes-agent/pull/22694))
### TUI
- **`/sessions` slash command for browsing and resuming previous sessions** (@austinpickett) ([#20805](https://github.com/NousResearch/hermes-agent/pull/20805))
- **Segment turns with rule above non-first user msgs; trim ticker dead space** (@OutThisLife) ([#21846](https://github.com/NousResearch/hermes-agent/pull/21846))
- **Support attaching to an existing gateway** (@OutThisLife) ([#21978](https://github.com/NousResearch/hermes-agent/pull/21978))
- **Resolve markdown links to readable page titles** (@OutThisLife) ([#24013](https://github.com/NousResearch/hermes-agent/pull/24013))
- **Width-aware markdown table rendering with vertical fallback** (@alt-glitch) ([#26195](https://github.com/NousResearch/hermes-agent/pull/26195))
- **Keep Ink displayCursor in sync with fast-echo writes so cursor stops drifting** (@OutThisLife) ([#26717](https://github.com/NousResearch/hermes-agent/pull/26717))
- **Allow transcript scroll + Esc during approval/clarify/confirm prompts** (@OutThisLife) ([#26414](https://github.com/NousResearch/hermes-agent/pull/26414))
- **Preserve session when switching personality** (@austinpickett) ([#20942](https://github.com/NousResearch/hermes-agent/pull/20942))
- **Skip native safety net on OSC52-capable terminals** (@benbarclay) ([#20954](https://github.com/NousResearch/hermes-agent/pull/20954))
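Width-aware table rendering with a vertical fallback, as named in the list above, can be sketched as: measure the columns, draw a normal grid if it fits the terminal width, and otherwise print each row as a block of `header: value` lines. The function below is an illustrative Python sketch with a hypothetical name; it measures width with `len`, so it ignores wide characters and ANSI escapes, and it is not the TUI's implementation.

```python
def render_table(headers, rows, width):
    """Render a grid when it fits in `width` columns, else one record per block."""
    # Widest cell in each column, header included.
    cols = [max(len(str(cell)) for cell in col) for col in zip(headers, *rows)]
    needed = sum(cols) + 3 * (len(cols) - 1)  # 3 characters per " | " separator

    if needed <= width:
        def line(cells):
            return " | ".join(str(c).ljust(w) for c, w in zip(cells, cols)).rstrip()

        separator = "-+-".join("-" * w for w in cols)
        return "\n".join([line(headers), separator, *(line(row) for row in rows)])

    # Vertical fallback: "header: value" lines, blank line between records.
    records = ["\n".join(f"{h}: {v}" for h, v in zip(headers, row)) for row in rows]
    return "\n\n".join(records)
```

The vertical form trades compactness for readability on narrow terminals, since no row is wrapped mid-cell.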
### Dashboard / GUI
- **Route embedded TUI through dashboard gateway** (@OutThisLife) ([#21979](https://github.com/NousResearch/hermes-agent/pull/21979))
- **Hide token/cost analytics behind config flag (default off)** ([#25438](https://github.com/NousResearch/hermes-agent/pull/25438))
- **Fix Langfuse observability — trace I/O, tool outputs, placeholder credentials** (closes #22342, #22763) (@kshitijk4poor) ([#26320](https://github.com/NousResearch/hermes-agent/pull/26320))
- **Fix MiniMax 'Login' button launching Claude OAuth** (salvage #22849) ([#24058](https://github.com/NousResearch/hermes-agent/pull/24058))
- **Update cron modals** (@austinpickett) ([#25985](https://github.com/NousResearch/hermes-agent/pull/25985))
- **Analytics: prevent silent token loss and add Claude 4.5–4.7 pricing** (@austinpickett) ([#21455](https://github.com/NousResearch/hermes-agent/pull/21455))
---

## 🔧 Tools & Capabilities

### Vision & video
- **`vision_analyze` returns pixels to vision-capable models** ([#22955](https://github.com/NousResearch/hermes-agent/pull/22955))
- **Unified `video_generate` with pluggable provider backends** ([#25126](https://github.com/NousResearch/hermes-agent/pull/25126))
- **`image_gen`: actionable setup message when no FAL backend is reachable** ([#26222](https://github.com/NousResearch/hermes-agent/pull/26222))
### Computer use
- **`computer_use` cua-driver backend + focus-safe ops + non-Anthropic provider fix** (re-salvage #16936) ([#21967](https://github.com/NousResearch/hermes-agent/pull/21967))
- **Refresh cua-driver on `hermes update` + add `install --upgrade`** ([#24063](https://github.com/NousResearch/hermes-agent/pull/24063))
### LSP & write-time diagnostics
- **Semantic diagnostics from real language servers in `write_file`/`patch`** ([#24168](https://github.com/NousResearch/hermes-agent/pull/24168))
- **Shift baseline diagnostics into post-edit coordinates** ([#25978](https://github.com/NousResearch/hermes-agent/pull/25978))
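Reporting only the diagnostics an edit introduced requires comparing pre-edit and post-edit results in the same coordinate system, since an edit that adds or removes lines moves everything below it. The sketch below shows one way to do that, assuming diagnostics are simple `(line, message)` pairs and the edit replaces one contiguous line range; the function names and data shape are hypothetical.

```python
def shift_baseline(baseline, start, end, new_count):
    """Map pre-edit (line, message) diagnostics into post-edit line numbers.

    The edit replaced lines start..end-1 (0-based, end exclusive) with new_count lines.
    """
    delta = new_count - (end - start)
    shifted = set()
    for line, message in baseline:
        if line < start:
            shifted.add((line, message))          # above the edit: unchanged
        elif line >= end:
            shifted.add((line + delta, message))  # below the edit: moved by delta
        # Diagnostics inside the replaced range have no stable post-edit position.
    return shifted


def new_diagnostics(baseline, post_edit, start, end, new_count):
    """Return post-edit diagnostics that were not already present before the edit."""
    known = shift_baseline(baseline, start, end, new_count)
    return [diag for diag in post_edit if diag not in known]
```

Without the shift, a pre-existing error below the edit would appear at a new line number and be misreported as new.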
|
||||
|
||||
### Search & web
|
||||
- **Brave Search (free tier) and DDGS search providers** ([#21337](https://github.com/NousResearch/hermes-agent/pull/21337))
|
||||
- **Bearer auth header for Tavily `/crawl` endpoint** ([#24658](https://github.com/NousResearch/hermes-agent/pull/24658))
|
||||
|
||||
### X (Twitter)
|
||||
- **Gated `x_search` tool with OAuth-or-API-key auth** ([#26763](https://github.com/NousResearch/hermes-agent/pull/26763))
|
||||
|
||||
### Browser
|
||||
- **Route `browser_console` eval through supervisor's persistent CDP WS (180x faster)** ([#23226](https://github.com/NousResearch/hermes-agent/pull/23226))
|
||||
- **Support externally managed Camofox sessions** ([#24499](https://github.com/NousResearch/hermes-agent/pull/24499))
|
||||
|
||||
### MCP
|
||||
- **`supports_parallel_tool_calls` for MCP servers** (salvage of #9944) ([#26825](https://github.com/NousResearch/hermes-agent/pull/26825))
|
||||
- **Codex preset for Codex CLI MCP server** (salvage #22663) ([#22679](https://github.com/NousResearch/hermes-agent/pull/22679))
|
||||
- **Stop retrying initial MCP auth failures** (#25624) ([#25776](https://github.com/NousResearch/hermes-agent/pull/25776))
|
||||
|
||||
### Google Workspace
|
||||
- **Drive write ops + Docs/Sheets create/append** ([#21895](https://github.com/NousResearch/hermes-agent/pull/21895))
|
||||
|
||||
### Per-turn verifier
|
||||
- **Per-turn file-mutation verifier footer** ([#24498](https://github.com/NousResearch/hermes-agent/pull/24498))
|
||||
|
||||
---
|
||||
|
||||
## 🧩 Kanban (Multi-Agent)

- **`specify` — auxiliary LLM fleshes out triage tasks** ([#21435](https://github.com/NousResearch/hermes-agent/pull/21435))
- **Orchestrator board tools — `kanban_list` + `kanban_unblock`** (carve-out of #20568) ([#23012](https://github.com/NousResearch/hermes-agent/pull/23012))
- **`stranded_in_ready` diagnostic for unclaimed tasks** ([#23578](https://github.com/NousResearch/hermes-agent/pull/23578))
- **Dashboard batch QOL upgrade** (salvage of #23240) ([#23550](https://github.com/NousResearch/hermes-agent/pull/23550))
- **Tooltips and docs link across dashboard** ([#21541](https://github.com/NousResearch/hermes-agent/pull/21541))
- **Dedupe notifier delivery via atomic claim + rewind on failure** (salvage #22558) ([#23401](https://github.com/NousResearch/hermes-agent/pull/23401))
- **Keep notifier subscriptions alive across retry cycles** (salvage #21398) ([#23423](https://github.com/NousResearch/hermes-agent/pull/23423))
- **Drop caller-controlled author override in `kanban_comment`** (salvage of #22109) (@kshitijk4poor) ([#22435](https://github.com/NousResearch/hermes-agent/pull/22435))
- **Sanitize comment author rendering in `build_worker_context`** ([#22769](https://github.com/NousResearch/hermes-agent/pull/22769))

---

## 🧠 Plugins & Extension

### Plugin surface
- **Run any LLM call from inside a plugin via `ctx.llm`** ([#23194](https://github.com/NousResearch/hermes-agent/pull/23194))
- **`tool_override` flag for replacing built-in tools** (closes #11049) ([#26759](https://github.com/NousResearch/hermes-agent/pull/26759))
- **`standalone_sender_fn` for out-of-process cron delivery** (@kshitijk4poor) ([#22461](https://github.com/NousResearch/hermes-agent/pull/22461))
- **`HERMES_PLUGINS_DEBUG=1` surfaces plugin discovery logs** ([#22684](https://github.com/NousResearch/hermes-agent/pull/22684))
- **Hindsight-client as optional dependency** (@alt-glitch) ([#21818](https://github.com/NousResearch/hermes-agent/pull/21818))

### Profile & distribution
- **Shareable profile distributions via git** ([#20831](https://github.com/NousResearch/hermes-agent/pull/20831))

---

## ⏰ Cron

- **Routing intent — `deliver=all` fans out to every connected channel** ([#21495](https://github.com/NousResearch/hermes-agent/pull/21495))
- **Support name-based lookup for job operations** ([#26231](https://github.com/NousResearch/hermes-agent/pull/26231))
- **Blank Cron dashboard tab + partial-record crashes** (salvage #21042 + #22330) (@kshitijk4poor) ([#22389](https://github.com/NousResearch/hermes-agent/pull/22389))
- **Do not seed `HERMES_SESSION_*` contextvars from cron origin** (salvage of #22356) (@kshitijk4poor) ([#22382](https://github.com/NousResearch/hermes-agent/pull/22382))
- **Scan assembled prompt including skill content for prompt injection** (#3968)

---

## 🧩 Skills Ecosystem

### Skills Hub
- **`hermes-skills/huggingface` as a trusted default tap** (closes #2549) ([#26219](https://github.com/NousResearch/hermes-agent/pull/26219))
- **Show per-skill pages in the left sidebar** ([#26646](https://github.com/NousResearch/hermes-agent/pull/26646))
- **Richer info panels on the Skills Hub** ([#22905](https://github.com/NousResearch/hermes-agent/pull/22905))
- **Refuse `skill_view` name collisions instead of guessing** (closes #6136 @polkn)

### Curator
- **Show rename map in user-visible summary** ([#22910](https://github.com/NousResearch/hermes-agent/pull/22910))
- **Hint at `hermes curator pin` in the rename block** ([#23212](https://github.com/NousResearch/hermes-agent/pull/23212))

### New optional skills
- **Hyperliquid** — perp/spot trading via SDK + REST (salvage of #1952) ([#23583](https://github.com/NousResearch/hermes-agent/pull/23583))
- **Yahoo Finance** market data ([#23590](https://github.com/NousResearch/hermes-agent/pull/23590))
- **api-testing** (REST/GraphQL debug, salvages #1800) ([#23582](https://github.com/NousResearch/hermes-agent/pull/23582))
- **Unified EVM multi-chain skill** (salvages #25291 + #2010 + folds in base/) ([#25299](https://github.com/NousResearch/hermes-agent/pull/25299))
- **darwinian-evolver** ([#26760](https://github.com/NousResearch/hermes-agent/pull/26760))
- **osint-investigation** (closes #355) ([#26729](https://github.com/NousResearch/hermes-agent/pull/26729))
- **pinggy-tunnel** ([#26765](https://github.com/NousResearch/hermes-agent/pull/26765))
- **watchers** — RSS / HTTP JSON / GitHub polling via cron no-agent ([#21881](https://github.com/NousResearch/hermes-agent/pull/21881))
- **Notion overhaul for the Developer Platform** (May 2026) ([#26612](https://github.com/NousResearch/hermes-agent/pull/26612))

---

## 🔒 Security & Reliability

### Security hardening
- **Sudo brute-force block + sudo-stdin/askpass DANGEROUS** (salvage of #22194 + #21128) (@kshitijk4poor) ([#23736](https://github.com/NousResearch/hermes-agent/pull/23736))
- **Drop caller-controlled author override in `kanban_comment`** (salvage of #22109) (@kshitijk4poor) ([#22435](https://github.com/NousResearch/hermes-agent/pull/22435))
- **Cover remaining SSRF fetch paths in skills-hub** (salvage #22804) ([#22843](https://github.com/NousResearch/hermes-agent/pull/22843))
- **Use credential_pool for custom endpoint model listing probes** (salvage #22810) ([#22842](https://github.com/NousResearch/hermes-agent/pull/22842))
- **Require dashboard auth for plugin API routes** (salvage #19541) ([#23220](https://github.com/NousResearch/hermes-agent/pull/23220))
- **Sanitize env and redact output in quick commands + remove write-only `_pending_messages`** ([#23584](https://github.com/NousResearch/hermes-agent/pull/23584))
- **Reduce unnecessary `shell=True` in subprocess calls** ([#25149](https://github.com/NousResearch/hermes-agent/pull/25149))
- **Sanitize Google Chat sender_type from relay** (salvage of #22107) (@kshitijk4poor) ([#22432](https://github.com/NousResearch/hermes-agent/pull/22432))
- **Supply-chain advisory checker** ([#24220](https://github.com/NousResearch/hermes-agent/pull/24220))
- **Rewrite security policy around OS-level isolation as the boundary** (@jquesnelle) ([#20317](https://github.com/NousResearch/hermes-agent/pull/20317))
- **Remove public security advisory page** ([#24253](https://github.com/NousResearch/hermes-agent/pull/24253))

### Reliability — notable bug closures
- **SQLite: fall back to `journal_mode=DELETE` on NFS/SMB/FUSE** (fixes `/resume` on network mounts) (@kshitijk4poor) ([#22043](https://github.com/NousResearch/hermes-agent/pull/22043))
- **Codex-runtime: retire wedged sessions + post-tool watchdog + OAuth refresh classify** ([#25769](https://github.com/NousResearch/hermes-agent/pull/25769))
- **Codex-runtime: de-dup `[plugins.X]` tables and stop leaking HERMES_HOME** (#26250) (@kshitijk4poor) ([#26260](https://github.com/NousResearch/hermes-agent/pull/26260))
- **Daytona: migrate legacy-sandbox lookup to cursor-based `list()`** ([#24587](https://github.com/NousResearch/hermes-agent/pull/24587))
- **MCP: stop retrying initial MCP auth failures** (#25624) ([#25776](https://github.com/NousResearch/hermes-agent/pull/25776))
- **Gateway: enable text-intercept for multi-choice clarify fallback** (#25587) ([#25778](https://github.com/NousResearch/hermes-agent/pull/25778))
- **Gateway: keep running when platforms fail; per-platform circuit breaker + `/platform`** ([#26600](https://github.com/NousResearch/hermes-agent/pull/26600))
- **Delegate: salvage #21933 JSON-string batch + diagnostic logging** (@kshitijk4poor) ([#22436](https://github.com/NousResearch/hermes-agent/pull/22436))
- **Profiles+banner: exclude infrastructure from `--clone-all` + fix stale update-check repo resolution** (@kshitijk4poor) ([#22475](https://github.com/NousResearch/hermes-agent/pull/22475))
- **ACP: inline file attachment resources** (salvage #21400 + image support) ([#21407](https://github.com/NousResearch/hermes-agent/pull/21407))
- **CI: unblock shared PR checks** (@stephenschoettler) ([#21012](https://github.com/NousResearch/hermes-agent/pull/21012), [#25957](https://github.com/NousResearch/hermes-agent/pull/25957))

### Notable reverts in window
- **`/goal` checklist + /subgoal feature stack** — rolled back ([#23813](https://github.com/NousResearch/hermes-agent/pull/23813)); `/subgoal` returned in simpler form via [#25449](https://github.com/NousResearch/hermes-agent/pull/25449)
- **Scrollback box width clamp** (#25975) rolled back to restore full-width borders ([#26163](https://github.com/NousResearch/hermes-agent/pull/26163))
- **`fix(cli): tolerate unreadable dirs when building systemd PATH`** rolled back

---

## 🌍 i18n

- **Localize all gateway commands + web dashboard, add 8 new locales (16 total)** ([#22914](https://github.com/NousResearch/hermes-agent/pull/22914))

---

## 📚 Documentation

- **Repair Voice & TTS provider table** (@nightcityblade, fixes #24101) ([#24138](https://github.com/NousResearch/hermes-agent/pull/24138))
- **Show per-skill pages in the left sidebar** ([#26646](https://github.com/NousResearch/hermes-agent/pull/26646))
- **Mention Weixin in gateway help and docstrings** (salvage of #21063 by @wuwuzhijing)
- **Richer info panels on the Skills Hub** ([#22905](https://github.com/NousResearch/hermes-agent/pull/22905))
- Many more doc updates across providers, platforms, skills, Windows install paths, and dashboard.

---

## 🧪 Testing & CI

- **Unblock shared PR checks** (@stephenschoettler) ([#21012](https://github.com/NousResearch/hermes-agent/pull/21012))
- **Stabilize shared test state after 21012** (@stephenschoettler) ([#25957](https://github.com/NousResearch/hermes-agent/pull/25957))
- A long tail of test additions for platforms, providers, plugins, and edge cases — 8 explicit `test:` PRs plus ~250 fix PRs that also added regression coverage.

---

## 👥 Contributors

### Core
- @teknium1 — release lead, architecture, ~406 PRs merged in window

### Top community contributors
- **@kshitijk4poor** — 38 PRs · Telegram cadence/streaming/topic routing, security hardening (sudo, SSRF, kanban_comment, dashboard auth), codex-runtime hygiene, NovitaAI provider, profile/banner fixes, Feishu update cards, gateway QOL across the board
- **@alt-glitch** — 13 PRs · Markdown-table TUI rendering, `HERMES_SESSION_ID` env var, hindsight-client optional dep, Nix `extraDependencyGroups`
- **@OutThisLife** (Brooklyn Nicholson) — 12 PRs · TUI turn segmentation, attach-to-gateway, markdown link titles, embedded TUI via dashboard gateway, Ink cursor sync, scroll/Esc during prompts
- **@austinpickett** — 8 PRs · `/sessions` slash command, personality switching preserves session, cron modals, dashboard analytics
- **@helix4u** — 5 PRs · Google Chat setup, browser install skip on system chromium, Windows Ctrl+C preservation
- **@rob-maron** — 4 PRs · Nous Portal as model metadata authority, provider polish
- **@stephenschoettler** — 3 PRs · CI stabilization
- **@ethernet8023** — 3 PRs · platform/gateway work

### All contributors (alphabetical)

@02356abc, @0xbyt4, @0xharryriddle, @1000Delta, @1RB, @29206394, @A-kamal, @aashizpoudel, @Abd0r,
@adybag14-cyber, @AgentArcLab, @ahmedbadr3, @AhmetArif0, @alblez, @Alex-yang00, @ALIYILD, @AllynSheep,
@alt-glitch, @am423, @amathxbt, @amethystani, @ArecaNon, @Arkmusn, @askclaw-vesper, @AsoTora, @austinpickett,
@aydnOktay, @ayushere, @baocin, @Bartok9, @benbarclay, @BennetYrWang, @Bihruze, @binhnt92, @briandevans,
@brooklynnicholson, @btorresgil, @buntingszn, @CalmProton, @chrisworksai, @CoinTheHat, @dandacompany, @Dangooy,
@DanielLSM, @David-0x221Eight, @ddupont808, @dhruv-saxena, @diablozzc, @dlkakbs, @dmahan93, @dmnkhorvath,
@domtriola, @donrhmexe, @Dusk1e, @eloklam, @emozilla, @ephron-ren, @erenkarakus, @EthanGuo-coder,
@ethernet8023, @evgyur, @explainanalyze, @fahdad, @fr33d3m0n, @Freeman-Consulting, @freqyfreqy, @Frowtek,
@fu576, @github-actions[bot], @gnanirahulnutakki, @GodsBoy, @guglielmofonda, @Gutslabs, @hanzckernel,
@heathley, @hekaru-agent, @helix4u, @HenkDz, @HiddenPuppy, @hllqkb, @hrygo, @HuangYuChuh, @Hugo-SEQUIER, @HxT9,
@iacker, @InB4DevOps, @isaachuangGMICLOUD, @iuyup, @Jaaneek, @jackey8616, @jackjin1997, @Jaggia, @jak983464779,
@jelrod27, @jethac, @JithendraNara, @johnisag, @Julientalbot, @Jwd-gity, @kallidean, @keyuyuan, @kfa-ai,
@kidonng, @KiraKatana, @kjames2001, @konsisumer, @Korkyzer, @kshitijk4poor, @KvnGz, @lars-hagen, @leehack,
@leepoweii, @LeonSGP43, @li0near, @libo1106, @liquidchen, @littlewwwhite, @liuhao1024, @liyoungc, @luandiasrj,
@luoyuctl, @luyao618, @magic524, @mbac, @McClean, @memosr, @Mibayy, @ming1523, @mizgyo, @mrshu, @ms-alan,
@MustafaKara7, @nederev, @nicoechaniz, @nidhi-singh02, @nightcityblade, @nik1t7n, @Ninso112, @NivOO5,
@novax635, @nv-kasikritc, @oferlaor, @oswaldb22, @outdoorsea, @oxngon, @PaTTeeL, @pearjelly, @pefontana,
@perng, @PhilipAD, @phuongvm, @polkn, @Prasanna28Devadiga, @princepal9120, @pty819, @purzbeats, @Quarkex,
@quocanh261997, @qWaitCrypto, @Qwinty, @rahimsais, @raymaylee, @ReqX, @rewbs, @RhombusMaximus, @rob-maron,
@Ruzzgar, @ryptotalent, @Sanjays2402, @shannonsands, @shaun0927, @SiliconID, @silv-mt-holdings, @simpolism,
@smwbev, @soichiyo, @sprmn24, @steezkelly, @stephenschoettler, @Sylw3ster, @szymonclawd, @teyrebaz33,
@Tianyu199509, @Tranquil-Flow, @TreyDong, @TurgutKural, @tw2818, @tymrtn, @uzunkuyruk, @v1b3coder,
@vanthinh6886, @VinceZcrikl, @vKongv, @vominh1919, @voteblake, @VTRiot, @wali-reheman, @wesleysimplicio,
@wilsen0, @WorldWriter, @worlldz, @wuli666, @wuwuzhijing, @Wysie, @XiaoXiao0221, @xieNniu, @xxxigm, @yehuosi,
@ygd58, @yifengingit, @yuga-hashimoto, @zccyman, @ZeterMordio, @Zhekinmaksim, @zhengyn0001

Also: @Nagatha (Claude Opus 4.7).

---

**Full Changelog**: [v2026.5.7...v2026.5.16](https://github.com/NousResearch/hermes-agent/compare/v2026.5.7...v2026.5.16)

@@ -18,6 +18,7 @@ import acp
 from acp.schema import (
     AgentCapabilities,
     AgentMessageChunk,
+    AgentThoughtChunk,
     AuthenticateResponse,
     AvailableCommand,
     AvailableCommandsUpdate,
@@ -788,14 +789,20 @@ class HermesACPAgent(acp.Agent):
     # ---- Session management -------------------------------------------------

     @staticmethod
-    def _history_message_text(message: dict[str, Any]) -> str:
-        """Extract displayable text from a persisted OpenAI-style message."""
-        content = message.get("content")
-        if isinstance(content, str):
-            return content.strip()
-        if isinstance(content, list):
+    def _flatten_history_text(value: Any) -> str:
+        """Normalize a persisted text-or-text-parts value into a single string.
+
+        OpenAI-style assistant content (and provider reasoning fields) can arrive
+        as either a scalar string or a list of ``{"text": ...}`` /
+        ``{"type": "text", "content": ...}`` parts. Whitespace-only inputs
+        collapse to an empty string so callers can treat ``""`` as "nothing to
+        emit".
+        """
+        if isinstance(value, str):
+            return value.strip()
+        if isinstance(value, list):
             parts: list[str] = []
-            for item in content:
+            for item in value:
                 if isinstance(item, dict):
                     text = item.get("text")
                     if isinstance(text, str):
@@ -807,6 +814,29 @@ class HermesACPAgent(acp.Agent):
             return "\n".join(part.strip() for part in parts if part and part.strip()).strip()
         return ""

+    @classmethod
+    def _history_message_text(cls, message: dict[str, Any]) -> str:
+        """Extract displayable text from a persisted OpenAI-style message."""
+        return cls._flatten_history_text(message.get("content"))
+
+    @classmethod
+    def _history_reasoning_text(cls, message: dict[str, Any]) -> str:
+        """Extract displayable reasoning/thought text from a persisted assistant message.
+
+        Returns the first non-empty value among ``reasoning_content`` (the
+        canonical field used by DeepSeek / Moonshot and the post-#16892
+        chat-completions normalizer) and ``reasoning`` (used by the codex
+        event projector and several other transports). Both keys are
+        actively written by live code paths, so neither branch is
+        deprecated — they cover different transports rather than old vs.
+        new sessions.
+        """
+        for key in ("reasoning_content", "reasoning"):
+            text = cls._flatten_history_text(message.get(key))
+            if text:
+                return text
+        return ""
+
     @staticmethod
     def _history_message_update(
         *,
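The flattening and reasoning-key fallback introduced in the two hunks above can be sketched in isolation. This is a minimal standalone sketch, not the code from the commit: the middle of `_flatten_history_text` is elided between hunks, so the handling of `{"type": "text", "content": ...}` parts and of bare string items here is an assumption based on the docstring, and the free-function names are hypothetical.

```python
from typing import Any


def flatten_history_text(value: Any) -> str:
    """Collapse a string, or a list of text parts, into one stripped string."""
    if isinstance(value, str):
        return value.strip()
    if isinstance(value, list):
        parts: list[str] = []
        for item in value:
            if isinstance(item, str):
                # Assumption: bare strings inside the list are accepted as-is.
                parts.append(item)
            elif isinstance(item, dict):
                # Assumed part shapes: {"text": ...} or {"type": "text", "content": ...}.
                text = item.get("text")
                if not isinstance(text, str):
                    text = item.get("content")
                if isinstance(text, str):
                    parts.append(text)
        return "\n".join(p.strip() for p in parts if p.strip())
    # Anything else (None, numbers, dicts) yields "nothing to emit".
    return ""


def history_reasoning_text(message: dict[str, Any]) -> str:
    """Return the first non-empty reasoning field, checking reasoning_content first."""
    for key in ("reasoning_content", "reasoning"):
        text = flatten_history_text(message.get(key))
        if text:
            return text
    return ""
```

Because whitespace-only values flatten to `""`, a blank `reasoning_content` does not mask a populated `reasoning` field.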
@@ -827,6 +857,11 @@ class HermesACPAgent(acp.Agent):
             )
         return None

+    @staticmethod
+    def _history_thought_update(text: str) -> AgentThoughtChunk:
+        """Build an ACP history replay update for an assistant thought."""
+        return acp.update_agent_thought_text(text)
+
     @staticmethod
     def _history_tool_call_name_args(tool_call: dict[str, Any]) -> tuple[str, dict[str, Any]]:
         """Extract function name/arguments from an OpenAI-style tool_call."""
@@ -854,13 +889,17 @@ class HermesACPAgent(acp.Agent):
         ).strip()

     async def _replay_session_history(self, state: SessionState) -> None:
-        """Send persisted user/assistant history to clients during session/load.
+        """Replay persisted user/assistant history during session/load or session/resume.

-        Zed's ACP history UI calls ``session/load`` after the user picks an item
-        from the Agents sidebar. The agent must then replay the full conversation
-        as user/assistant chunks plus reconstructed tool-call start/completion
-        notifications; merely restoring server-side state makes Hermes remember
-        context, but leaves the editor looking like a clean thread.
+        Invoked inline (``await``) from both ``load_session`` and
+        ``resume_session`` so that spec-compliant ACP clients receive the
+        full transcript within the request's lifetime — see the comment at
+        the call sites for the rationale and prior-art citations.
+
+        Replays the conversation as user/assistant chunks, thinking-mode
+        thought chunks, plus reconstructed tool-call start/completion
+        notifications. Merely restoring server-side state makes Hermes
+        remember context, but leaves the editor looking like a clean thread.
         """
         if not self._conn or not state.history:
             return
@@ -882,24 +921,37 @@ class HermesACPAgent(acp.Agent):
         for message in state.history:
             role = str(message.get("role") or "")

-            if role in {"user", "assistant"}:
+            if role == "user":
                 text = self._history_message_text(message)
                 if text:
                     update = self._history_message_update(role=role, text=text)
                     if update is not None and not await _send(update):
                         return
+                continue
+
+            if role == "assistant":
+                thought = self._history_reasoning_text(message)
+                if thought and not await _send(self._history_thought_update(thought)):
+                    return
+
+                text = self._history_message_text(message)
+                if text:
+                    update = self._history_message_update(role=role, text=text)
+                    if update is not None and not await _send(update):
+                        return

-                if role == "assistant" and isinstance(message.get("tool_calls"), list):
-                    for tool_call in message["tool_calls"]:
-                        if not isinstance(tool_call, dict):
-                            continue
-                        tool_call_id = self._history_tool_call_id(tool_call)
-                        if not tool_call_id:
-                            continue
-                        tool_name, args = self._history_tool_call_name_args(tool_call)
-                        active_tool_calls[tool_call_id] = (tool_name, args)
-                        if not await _send(build_tool_start(tool_call_id, tool_name, args)):
-                            return
+                tool_calls = message.get("tool_calls")
+                if isinstance(tool_calls, list):
+                    for tool_call in tool_calls:
+                        if not isinstance(tool_call, dict):
+                            continue
+                        tool_call_id = self._history_tool_call_id(tool_call)
+                        if not tool_call_id:
+                            continue
+                        tool_name, args = self._history_tool_call_name_args(tool_call)
+                        active_tool_calls[tool_call_id] = (tool_name, args)
+                        if not await _send(build_tool_start(tool_call_id, tool_name, args)):
+                            return
                 continue

             if role == "tool":
@@ -942,18 +994,6 @@ class HermesACPAgent(acp.Agent):
             models=self._build_model_state(state),
         )

-    def _schedule_history_replay(self, state: SessionState) -> None:
-        """Replay persisted history after session/load or session/resume returns.
-
-        Zed only attaches streamed transcript/tool updates once the load/resume
-        response has completed. Sending replay notifications while the request is
-        still in-flight can make the server look correct in logs while the editor
-        drops or fails to attach the tool-call history.
-        """
-        loop = asyncio.get_running_loop()
-        replay_coro = self._replay_session_history(state)
-        loop.call_soon(asyncio.create_task, replay_coro)
-
     async def load_session(
         self,
         cwd: str,
@@ -967,7 +1007,30 @@ class HermesACPAgent(acp.Agent):
             return None
         await self._register_session_mcp_servers(state, mcp_servers)
         logger.info("Loaded session %s", session_id)
-        self._schedule_history_replay(state)
+        # Per ACP spec, `session/load` must stream the prior conversation back
+        # to the client via `session/update` notifications BEFORE responding,
+        # so the client receives the full transcript within the load request's
+        # lifetime. Awaiting the replay here matches Codex / Claude Code /
+        # OpenCode / Pi and the Zed client (which registers the session-update
+        # routing entry before awaiting the loadSession RPC specifically so
+        # in-call history replay updates can find the thread). Deferring this
+        # via `loop.call_soon` (as we did briefly in May 2026) broke every
+        # spec-compliant ACP client that measures notifications synchronously
+        # against the load response — see #12285 follow-up.
+        try:
+            await self._replay_session_history(state)
+        except Exception:
+            # Replay is best-effort — a corrupted or unexpected message shape
+            # must not turn a successful session/load into a JSON-RPC error
+            # response. Per-notification failures are already caught inside
+            # ``_replay_session_history``; this outer guard covers anything
+            # raised by the helpers themselves before reaching ``_send``.
+            logger.warning(
+                "ACP history replay raised during session/load for %s — "
+                "load will still succeed, partial transcript may be missing",
+                session_id,
+                exc_info=True,
+            )
         self._schedule_available_commands_update(session_id)
         self._schedule_usage_update(state)
         return LoadSessionResponse(models=self._build_model_state(state))
@@ -985,7 +1048,18 @@ class HermesACPAgent(acp.Agent):
             state = self.session_manager.create_session(cwd=cwd)
         await self._register_session_mcp_servers(state, mcp_servers)
         logger.info("Resumed session %s", state.session_id)
-        self._schedule_history_replay(state)
+        # See `load_session` above for the spec rationale — replay must
+        # complete before the response so clients receive the full transcript
+        # within the request's lifetime.
+        try:
+            await self._replay_session_history(state)
+        except Exception:
+            logger.warning(
+                "ACP history replay raised during session/resume for %s — "
+                "resume will still succeed, partial transcript may be missing",
+                state.session_id,
+                exc_info=True,
+            )
         self._schedule_available_commands_update(state.session_id)
         self._schedule_usage_update(state)
         return ResumeSessionResponse(models=self._build_model_state(state))

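The ordering guarantee described in the `load_session` and `resume_session` comments (replay inline, then respond, and never let a replay failure fail the request) can be illustrated with a small asyncio sketch. Everything here is hypothetical scaffolding rather than the ACP library: `replay_history`, `load_session`, `run_demo`, and the `"boom"` trigger are illustrative names, and the event list stands in for what a client would observe on the wire.

```python
import asyncio
import logging

logger = logging.getLogger("acp_replay_sketch")


async def replay_history(history, send):
    """Send each persisted message to the client, in order."""
    for message in history:
        await send(message)


async def load_session(history, send):
    """Await the replay inline so updates precede the response."""
    try:
        await replay_history(history, send)
    except Exception:
        # Best-effort: a bad message must not turn the load into an error.
        logger.warning("history replay failed; load still succeeds", exc_info=True)
    return {"status": "loaded"}


async def run_demo(history):
    """Record what a client would see: updates first, then the response."""
    events = []

    async def send(message):
        if message == "boom":  # hypothetical corrupted message
            raise ValueError("unexpected message shape")
        events.append(("update", message))

    response = await load_session(history, send)
    events.append(("response", response["status"]))
    return events
```

Running `asyncio.run(run_demo(["hi", "hello"]))` records both updates before the response. With a failing message in the middle, the response is still produced and only the transcript is truncated, which is the trade-off the warning text in the diff describes.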
@@ -1060,10 +1060,12 @@ def _generate_pkce() -> tuple:

 def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:
     """Run Hermes-native OAuth PKCE flow and return credential state."""
+    import secrets
     import time
     import webbrowser

     verifier, challenge = _generate_pkce()
+    oauth_state = secrets.token_urlsafe(32)

     params = {
         "code": "true",
@@ -1073,7 +1075,7 @@ def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:
         "scope": _OAUTH_SCOPES,
         "code_challenge": challenge,
         "code_challenge_method": "S256",
-        "state": verifier,
+        "state": oauth_state,
     }
     from urllib.parse import urlencode

@@ -1110,7 +1112,12 @@ def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:

     splits = auth_code.split("#")
     code = splits[0]
-    state = splits[1] if len(splits) > 1 else ""
+    received_state = splits[1] if len(splits) > 1 else ""
+
+    # Validate state to prevent CSRF (RFC 6749 §10.12)
+    if received_state != oauth_state:
+        logger.warning("OAuth state mismatch — possible CSRF, aborting")
+        return None

     try:
         import urllib.request
@@ -1119,7 +1126,7 @@ def run_hermes_oauth_login_pure() -> Optional[Dict[str, Any]]:
             "grant_type": "authorization_code",
             "client_id": _OAUTH_CLIENT_ID,
             "code": code,
-            "state": state,
+            "state": received_state,
             "redirect_uri": _OAUTH_REDIRECT_URI,
             "code_verifier": verifier,
         }).encode()

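The change above stops reusing the PKCE verifier as the OAuth `state` value and validates the returned state before exchanging the code. A minimal sketch of that separation follows. It is not the commit's code: `generate_pkce`, `split_callback`, and `state_matches` are hypothetical names, the body of the real `_generate_pkce` is not shown in the diff, and the constant-time comparison via `hmac.compare_digest` is a substitution for the plain `!=` check used in the hunk.

```python
import base64
import hashlib
import hmac
import secrets


def generate_pkce() -> tuple[str, str]:
    """Return (verifier, S256 challenge) as described by RFC 7636."""
    verifier = secrets.token_urlsafe(64)  # 86 characters, within the 43-128 limit
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def split_callback(auth_code: str) -> tuple[str, str]:
    """Split a pasted 'code#state' value; state is empty when absent."""
    code, _, received_state = auth_code.partition("#")
    return code, received_state


def state_matches(expected: str, received: str) -> bool:
    """Compare state values in constant time (assumption: the diff uses !=)."""
    return hmac.compare_digest(expected.encode(), received.encode())
```

Keeping `state` independent of the verifier matters because `state` travels through the browser redirect, while the verifier is meant to stay secret until the token exchange.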
@@ -30,6 +30,28 @@ _DEFAULT_TIMEOUT_SECONDS = 900.0
 _TOOL_CALL_BLOCK_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
 _TOOL_CALL_JSON_RE = re.compile(r"\{\s*\"id\"\s*:\s*\"[^\"]+\"\s*,\s*\"type\"\s*:\s*\"function\"\s*,\s*\"function\"\s*:\s*\{.*?\}\s*\}", re.DOTALL)
+
+# Stderr fingerprint of the deprecated `gh copilot` CLI extension
+# (https://github.blog/changelog/2025-09-25-upcoming-deprecation-of-gh-copilot-cli-extension).
+# We require BOTH the literal product name ("gh-copilot") AND a deprecation
+# marker, so generic stderr from the NEW `@github/copilot` CLI — whose repo
+# is github.com/github/copilot-cli and which legitimately mentions "copilot-cli"
+# in its own banners and error messages — doesn't get misclassified as the
+# deprecated extension.
+_DEPRECATION_REQUIRED = ("gh-copilot",)
+_DEPRECATION_MARKERS = (
+    "has been deprecated",
+    "no commands will be executed",
+)
+
+
+def _is_gh_copilot_deprecation_message(stderr_text: str) -> bool:
+    """True iff stderr looks like the deprecated gh-copilot extension's banner."""
+
+    lower = stderr_text.lower()
+    if not any(req in lower for req in _DEPRECATION_REQUIRED):
+        return False
+    return any(marker in lower for marker in _DEPRECATION_MARKERS)


 def _resolve_command() -> str:
     return (
@@ -506,6 +528,21 @@ class CopilotACPClient:

         stderr_text = "\n".join(stderr_tail).strip()
         if proc.poll() is not None and stderr_text:
+            if _is_gh_copilot_deprecation_message(stderr_text):
+                raise RuntimeError(
+                    "Hermes ACP mode requires the NEW GitHub Copilot CLI "
+                    "(github.com/github/copilot-cli), but the binary it just "
+                    "spawned is the deprecated `gh copilot` extension.\n\n"
+                    "Install the new CLI:\n"
+                    "  npm install -g @github/copilot\n"
+                    "  # then verify with: copilot --help\n\n"
+                    "If `copilot` already resolves to the new CLI but you still see this,\n"
+                    "point Hermes at it explicitly:\n"
+                    "  export HERMES_COPILOT_ACP_COMMAND=/path/to/new/copilot\n\n"
+                    "Alternative: use the `copilot` provider (no ACP, hits the Copilot API\n"
+                    "directly with a Copilot subscription token) via `hermes setup`.\n\n"
+                    f"Original error:\n{stderr_text}"
+                )
             raise RuntimeError(f"Copilot ACP process exited early: {stderr_text}")
         raise TimeoutError(f"Timed out waiting for Copilot ACP response to {method}.")

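The two-part fingerprint above (product name plus a deprecation marker) is easiest to see against sample input. The sketch below repeats the classifier as a standalone function so it runs on its own. The three stderr strings are hypothetical examples written for illustration, not captured output from either CLI.

```python
_DEPRECATION_REQUIRED = ("gh-copilot",)
_DEPRECATION_MARKERS = (
    "has been deprecated",
    "no commands will be executed",
)


def is_gh_copilot_deprecation_message(stderr_text: str) -> bool:
    """True only when the product name AND a deprecation marker both appear."""
    lower = stderr_text.lower()
    if not any(req in lower for req in _DEPRECATION_REQUIRED):
        return False
    return any(marker in lower for marker in _DEPRECATION_MARKERS)


# Hypothetical stderr samples, for illustration only.
old_extension_banner = (
    "The gh-copilot extension has been deprecated. No commands will be executed."
)
new_cli_error = "copilot-cli: authentication failed, run `copilot login`"
unrelated_notice = "warning: this flag has been deprecated"
```

Only the first sample is classified as the deprecated extension. The second mentions `copilot-cli` but not `gh-copilot`, and the third has a deprecation marker without the product name, so both fall through to the generic "exited early" error.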
@@ -358,6 +358,12 @@ _URL_TO_PROVIDER: Dict[str, str] = {
     "api.deepseek.com": "deepseek",
     "api.githubcopilot.com": "copilot",
     "models.github.ai": "copilot",
+    # GitHub Models free tier (Azure-hosted prototyping endpoint) — same
+    # canonical provider as the Copilot API. Hard per-request token cap
+    # (often 8K) makes it unusable for Hermes' system prompt, but mapping
+    # it here lets us recognize the endpoint and emit a targeted hint
+    # instead of falling through the unknown-custom-endpoint path.
+    "models.inference.ai.azure.com": "copilot",
     "api.fireworks.ai": "fireworks",
     "opencode.ai": "opencode-go",
     "api.x.ai": "xai",

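A host-to-provider table like the one above is typically consulted by extracting the hostname from a configured base URL. The sketch below shows one way to do that. The lookup function `provider_for_base_url` is hypothetical, the table is a small excerpt of the entries visible in the hunk, and exact-hostname matching is an assumption, since the diff does not show how Hermes actually performs the match.

```python
from typing import Dict, Optional
from urllib.parse import urlparse

# Excerpt of the mapping shown in the hunk above.
URL_TO_PROVIDER: Dict[str, str] = {
    "api.githubcopilot.com": "copilot",
    "models.github.ai": "copilot",
    "models.inference.ai.azure.com": "copilot",
    "api.x.ai": "xai",
}


def provider_for_base_url(base_url: str) -> Optional[str]:
    """Map a base URL to a canonical provider name, or None if unknown."""
    # urlparse only fills in hostname when the URL carries a scheme.
    host = (urlparse(base_url).hostname or "").lower()
    return URL_TO_PROVIDER.get(host)
```

With the new entry, `provider_for_base_url("https://models.inference.ai.azure.com/chat/completions")` resolves to `"copilot"` instead of `None`, which is what lets a caller emit the targeted hint mentioned in the comment rather than treating the endpoint as an unknown custom one.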
@@ -663,7 +663,7 @@ export const af: Translations = {
     columnHelp: {
       triage: "Rou idees — 'n spesifiseerder sal die spesifikasie uitwerk",
       todo: "Wag op afhanklikhede of nie toegewys nie",
-      ready: "Toegewys en wag vir 'n versender-tik",
+      ready: "Afhanklikhede is bevredig; wys 'n profiel toe om te versend",
       running: "Deur 'n werker geëis — in vlug",
       blocked: "Werker het mensinvoer aangevra",
       done: "Voltooi",

@ -662,7 +662,7 @@ export const de: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Rohe Ideen — ein Specifier wird die Spezifikation ausarbeiten",
|
||||
todo: "Wartet auf Abhängigkeiten oder ist nicht zugewiesen",
|
||||
ready: "Zugewiesen und wartet auf einen Dispatcher-Tick",
|
||||
ready: "Abhängigkeiten erfüllt; Profil zum Dispatch zuweisen",
|
||||
running: "Von einem Worker übernommen — in Bearbeitung",
|
||||
blocked: "Worker hat um menschliche Eingabe gebeten",
|
||||
done: "Abgeschlossen",
|
||||
|
|
|
|||
|
|
@ -574,6 +574,9 @@ export const en: Translations = {
|
|||
createTask: "Create task in this column",
|
||||
noTasks: "— no tasks —",
|
||||
unassigned: "unassigned",
|
||||
needsAssignee: "Needs assignee",
|
||||
needsAssigneeHint:
|
||||
"Dependencies are satisfied, but the dispatcher skips this task until you assign a profile.",
|
||||
untitled: "(untitled)",
|
||||
loadingDetail: "Loading…",
|
||||
addComment: "Add a comment… (Enter to submit)",
|
||||
|
|
@ -664,7 +667,7 @@ export const en: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Raw ideas — a specifier will flesh out the spec",
|
||||
todo: "Waiting on dependencies or unassigned",
|
||||
ready: "Assigned and waiting for a dispatcher tick",
|
||||
ready: "Dependencies satisfied; assign a profile to dispatch",
|
||||
running: "Claimed by a worker — in-flight",
|
||||
blocked: "Worker asked for human input",
|
||||
done: "Completed",
|
||||
|
|
|
|||
|
|
@ -662,7 +662,7 @@ export const es: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Ideas en bruto — un specifier desarrollará la especificación",
|
||||
todo: "Esperando dependencias o sin asignar",
|
||||
ready: "Asignado y esperando un tick del dispatcher",
|
||||
ready: "Dependencias satisfechas; asigna un perfil para despachar",
|
||||
running: "Reclamado por un worker — en ejecución",
|
||||
blocked: "El worker pidió intervención humana",
|
||||
done: "Completado",
|
||||
|
|
|
|||
|
|
@ -662,7 +662,7 @@ export const fr: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Idées brutes — un specifier rédigera la spécification",
|
||||
todo: "En attente de dépendances ou non assigné",
|
||||
ready: "Assigné et en attente d'un tick du dispatcher",
|
||||
ready: "Dépendances satisfaites ; assignez un profil pour dispatch",
|
||||
running: "Réclamé par un worker — en cours d'exécution",
|
||||
blocked: "Le worker a demandé une intervention humaine",
|
||||
done: "Terminé",
|
||||
|
|
|
|||
|
|
@ -663,7 +663,7 @@ export const ga: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Smaointe amha — déanfaidh specifier an spec a chur i bhfeidhm",
|
||||
todo: "Ag fanacht ar spleáchais nó gan sannadh",
|
||||
ready: "Sannta agus ag fanacht ar thic an dispatcher",
|
||||
ready: "Tá na spleáchais sásaithe; sann próifíl le dispatch a dhéanamh",
|
||||
running: "Éilithe ag worker — ar siúl",
|
||||
blocked: "D'iarr an worker ionchur duine",
|
||||
done: "Críochnaithe",
|
||||
|
|
|
|||
|
|
@ -663,7 +663,7 @@ export const hu: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Nyers ötletek — egy specifier kidolgozza a specifikációt",
|
||||
todo: "Függőségekre vár vagy nincs felelőse",
|
||||
ready: "Kiosztva, dispatcher tickre vár",
|
||||
ready: "A függőségek teljesültek; rendelj hozzá profilt az indításhoz",
|
||||
running: "Worker felvette — folyamatban",
|
||||
blocked: "A worker emberi beavatkozást kért",
|
||||
done: "Befejezve",
|
||||
|
|
|
|||
|
|
@ -662,7 +662,7 @@ export const it: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Idee grezze — un specifier elaborerà la specifica",
|
||||
todo: "In attesa di dipendenze o non assegnato",
|
||||
ready: "Assegnato e in attesa di un tick del dispatcher",
|
||||
ready: "Dipendenze soddisfatte; assegna un profilo per il dispatch",
|
||||
running: "Preso in carico da un worker — in esecuzione",
|
||||
blocked: "Il worker ha richiesto input umano",
|
||||
done: "Completato",
|
||||
|
|
|
|||
|
|
@ -663,7 +663,7 @@ export const ja: Translations = {
|
|||
columnHelp: {
|
||||
triage: "未整理のアイデア — スペシファイアが仕様を肉付けします",
|
||||
todo: "依存関係の待機中、または未割り当て",
|
||||
ready: "割り当て済み、ディスパッチャーのティック待ち",
|
||||
ready: "依存関係は満たされています。ディスパッチするにはプロファイルを割り当ててください",
|
||||
running: "ワーカーが取得中 — 実行中",
|
||||
blocked: "ワーカーが人間の入力を求めています",
|
||||
done: "完了",
|
||||
|
|
|
|||
|
|
@ -663,7 +663,7 @@ export const ko: Translations = {
|
|||
columnHelp: {
|
||||
triage: "원시 아이디어 — 스페시파이어가 사양을 구체화합니다",
|
||||
todo: "종속성 대기 중 또는 미지정",
|
||||
ready: "지정되었으며 디스패처 틱 대기 중",
|
||||
ready: "종속성이 충족됨; 디스패치하려면 프로필을 지정하세요",
|
||||
running: "워커가 점유 중 — 실행 중",
|
||||
blocked: "워커가 사람의 입력을 요청함",
|
||||
done: "완료됨",
|
||||
|
|
|
|||
|
|
@ -663,7 +663,7 @@ export const pt: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Ideias em bruto — um specifier vai detalhar a especificação",
|
||||
todo: "À espera de dependências ou sem atribuição",
|
||||
ready: "Atribuído e à espera de um tick do dispatcher",
|
||||
ready: "Dependências satisfeitas; atribua um perfil para despachar",
|
||||
running: "Reivindicado por um worker — em execução",
|
||||
blocked: "O worker pediu intervenção humana",
|
||||
done: "Concluído",
|
||||
|
|
|
|||
|
|
@ -663,7 +663,7 @@ export const ru: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Сырые идеи — specifier подготовит спецификацию",
|
||||
todo: "Ожидает зависимостей или без исполнителя",
|
||||
ready: "Назначено и ждёт тика диспетчера",
|
||||
ready: "Зависимости выполнены; назначьте профиль для диспетчеризации",
|
||||
running: "Взято воркером — выполняется",
|
||||
blocked: "Воркер запросил вмешательство человека",
|
||||
done: "Завершено",
|
||||
|
|
|
|||
|
|
@ -663,7 +663,7 @@ export const tr: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Ham fikirler — bir specifier şartnameyi detaylandıracak",
|
||||
todo: "Bağımlılıklar bekleniyor veya atanmamış",
|
||||
ready: "Atanmış ve dispatcher tick'i bekleniyor",
|
||||
ready: "Bağımlılıklar karşılandı; dispatch için bir profil atayın",
|
||||
running: "Bir worker tarafından alındı — yürütülüyor",
|
||||
blocked: "Worker insan girdisi istedi",
|
||||
done: "Tamamlandı",
|
||||
|
|
|
|||
|
|
@ -586,6 +586,8 @@ export interface Translations {
|
|||
createTask: string;
|
||||
noTasks: string;
|
||||
unassigned: string;
|
||||
needsAssignee?: string;
|
||||
needsAssigneeHint?: string;
|
||||
untitled: string;
|
||||
loadingDetail: string;
|
||||
addComment: string;
|
||||
|
|
|
|||
|
|
@ -663,7 +663,7 @@ export const uk: Translations = {
|
|||
columnHelp: {
|
||||
triage: "Сирі ідеї — специфікатор деталізує специфікацію",
|
||||
todo: "Очікує на залежності або не призначено",
|
||||
ready: "Призначено, очікує тіку диспетчера",
|
||||
ready: "Залежності задоволені; призначте профіль для диспетчеризації",
|
||||
running: "Захоплено воркером — у роботі",
|
||||
blocked: "Воркер запитав втручання людини",
|
||||
done: "Завершено",
|
||||
|
|
|
|||
|
|
@ -663,7 +663,7 @@ export const zhHant: Translations = {
|
|||
columnHelp: {
|
||||
triage: "原始想法 — 規格制定者將完善規格",
|
||||
todo: "等待相依項目或尚未指派",
|
||||
ready: "已指派,等待排程器輪詢",
|
||||
ready: "相依項目已滿足;指派設定檔以便排程",
|
||||
running: "已被工作者領取 — 執行中",
|
||||
blocked: "工作者請求人工輸入",
|
||||
done: "已完成",
|
||||
|
|
|
|||
|
|
@ -659,7 +659,7 @@ export const zh: Translations = {
|
|||
columnHelp: {
|
||||
triage: "原始想法 — 规范制定者将完善规格",
|
||||
todo: "等待依赖项或未分配",
|
||||
ready: "已分配,等待调度器轮询",
|
||||
ready: "依赖项已满足;分配一个配置文件以便调度",
|
||||
running: "已被工作者认领 — 执行中",
|
||||
blocked: "工作者请求人工输入",
|
||||
done: "已完成",
|
||||
|
|
|
|||
|
|
@ -2961,9 +2961,25 @@ class BasePlatformAdapter(ABC):
|
|||
merge_pending_message_event(self._pending_messages, session_key, event)
|
||||
return # Don't interrupt now - will run after current task completes
|
||||
|
||||
# Default behavior for non-photo follow-ups: interrupt the running agent
|
||||
# Default behavior for non-photo follow-ups: interrupt the running agent.
|
||||
#
|
||||
# Use merge_text=True so rapid TEXT follow-ups (#4469) accumulate
|
||||
# into the single pending slot instead of clobbering each other.
|
||||
# Without merging, three rapid messages "A", "B", "C" land like:
|
||||
# _pending_messages[k] = A (interrupts)
|
||||
# _pending_messages[k] = B (replaces A before consumer reads)
|
||||
# _pending_messages[k] = C (replaces B)
|
||||
# ...and only "C" reaches the next turn. merge_pending_message_event
|
||||
# already does the right thing for photo/media bursts; the
|
||||
# ``merge_text=True`` flag extends that to plain TEXT events.
|
||||
# Same shape as the Telegram bursty-grace path in gateway/run.py.
|
||||
logger.debug("[%s] New message while session %s is active — triggering interrupt", self.name, session_key)
|
||||
self._pending_messages[session_key] = event
|
||||
merge_pending_message_event(
|
||||
self._pending_messages,
|
||||
session_key,
|
||||
event,
|
||||
merge_text=True,
|
||||
)
|
||||
# Signal the interrupt (the processing task checks this)
|
||||
self._active_sessions[session_key].set()
|
||||
return # Don't process now - will be handled after current task finishes
|
||||
|
|
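The clobbering problem described in the comment above can be illustrated with a small merge into a single pending slot. This is a sketch under stated assumptions: `TextEvent` and `merge_pending_text` are hypothetical stand-ins, and joining with a newline is a guess at how merged text might be combined, not the behaviour of `merge_pending_message_event` itself.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TextEvent:
    """Hypothetical stand-in for the gateway's message event type."""
    text: str


def merge_pending_text(pending: Dict[str, TextEvent], key: str, event: TextEvent) -> None:
    """Append to the pending slot instead of overwriting it."""
    existing = pending.get(key)
    if existing is None:
        pending[key] = event
    else:
        # Newline joiner is an assumption made for this sketch.
        existing.text = f"{existing.text}\n{event.text}"


pending: Dict[str, TextEvent] = {}
for chunk in ("A", "B", "C"):
    merge_pending_text(pending, "session-1", TextEvent(chunk))
```

After the loop the slot holds all three messages, whereas plain assignment (`pending[key] = event`) would leave only "C".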
|
|||
|
|
@ -14,8 +14,8 @@ Provides subcommands for:
|
|||
import os
|
||||
import sys
|
||||
|
||||
__version__ = "0.13.0"
|
||||
__release_date__ = "2026.5.7"
|
||||
__version__ = "0.14.0"
|
||||
__release_date__ = "2026.5.16"
|
||||
|
||||
|
||||
def _ensure_utf8():
|
||||
|
|
|
|||
|
|
@ -1152,6 +1152,10 @@ DEFAULT_CONFIG = {
|
|||
"provider": "", # e.g. "openrouter" (empty = inherit parent provider + credentials)
|
||||
"base_url": "", # direct OpenAI-compatible endpoint for subagents
|
||||
"api_key": "", # API key for delegation.base_url (falls back to OPENAI_API_KEY)
|
||||
"api_mode": "", # wire protocol for delegation.base_url: "chat_completions",
|
||||
# "codex_responses", or "anthropic_messages". Empty = auto-detect
|
||||
# from URL (e.g. /anthropic suffix → anthropic_messages). Set this
|
||||
# explicitly for non-standard endpoints the heuristic can't detect.
|
||||
# When delegate_task narrows child toolsets explicitly, preserve any
|
||||
# MCP toolsets the parent already has enabled. On by default so
|
||||
# narrowing (e.g. toolsets=["web","browser"]) expresses "I want these
|
||||
|
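The `api_mode` comment above describes an explicit setting with URL-based auto-detection as the fallback. A minimal sketch of that resolution order follows. `resolve_api_mode` is a hypothetical name, only the `/anthropic` suffix rule comes from the comment, and defaulting to `chat_completions` is an assumption.

```python
from urllib.parse import urlparse

KNOWN_API_MODES = {"chat_completions", "codex_responses", "anthropic_messages"}


def resolve_api_mode(base_url: str, api_mode: str = "") -> str:
    """Hypothetical resolver: explicit setting wins, else guess from the URL."""
    if api_mode:
        if api_mode not in KNOWN_API_MODES:
            raise ValueError(f"unknown api_mode: {api_mode!r}")
        return api_mode
    path = urlparse(base_url).path.rstrip("/")
    if path.endswith("/anthropic"):
        return "anthropic_messages"
    # Assumed default for this sketch; the real fallback may differ.
    return "chat_completions"
```

Setting `api_mode` explicitly is the escape hatch for endpoints whose URL gives no hint about the wire protocol.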
|
@ -1609,6 +1613,23 @@ DEFAULT_CONFIG = {
|
|||
"servers": {},
|
||||
},
|
||||
|
||||
# X (Twitter) Search via xAI's built-in x_search Responses tool.
|
||||
# The tool registers when xAI credentials are available (SuperGrok
|
||||
# OAuth or XAI_API_KEY) AND the x_search toolset is enabled in
|
||||
# `hermes tools`. These settings tune the backing Responses API call.
|
||||
"x_search": {
|
||||
# xAI model used for the Responses call. grok-4.20-reasoning is
|
||||
# the recommended default; any Grok model with x_search tool
|
||||
# access works.
|
||||
"model": "grok-4.20-reasoning",
|
||||
# Request timeout in seconds (minimum 30). x_search can take
|
||||
# 60-120s for complex queries — the default is generous.
|
||||
"timeout_seconds": 180,
|
||||
# Number of automatic retries on 5xx / ReadTimeout / ConnectionError.
|
||||
# Each retry backs off (1.5x attempt seconds, capped at 5s).
|
||||
"retries": 2,
|
||||
},
|
||||
|
||||
# Config schema version - bump this when adding new required fields
|
||||
"_config_version": 23,
|
||||
}
|
||||
|
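The retry and timeout comments in the `x_search` block above imply a simple schedule. The sketch below is one reading of "1.5x attempt seconds, capped at 5s", assuming a 1-based attempt counter. `retry_delay` and `effective_timeout` are hypothetical names, not functions from the Hermes codebase.

```python
def retry_delay(attempt: int, factor: float = 1.5, cap: float = 5.0) -> float:
    """Delay in seconds before retry number `attempt` (1-based, assumed)."""
    return min(factor * attempt, cap)


def effective_timeout(configured: float, minimum: float = 30.0) -> float:
    """Clamp the configured timeout to the documented 30-second floor."""
    return max(configured, minimum)


# Delays for the first four attempts; with the default retries=2 only the
# first two would ever be used.
delays = [retry_delay(n) for n in (1, 2, 3, 4)]
```

Under this reading the cap is only reached from the fourth attempt onward, so it matters only when `retries` is raised above the default.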
|
|
|||
|
|
@ -152,6 +152,30 @@ def _apply_doctor_tool_availability_overrides(available: list[str], unavailable:
|
|||
return updated_available, updated_unavailable
|
||||
|
||||
|
||||
def _has_healthy_oauth_fallback_for_apikey_provider(provider_label: str) -> bool:
|
||||
"""Return True when a direct API-key probe failure is non-blocking.
|
||||
|
||||
Some provider families support both a direct API-key path and a separate
|
||||
OAuth runtime path. When the OAuth path is already healthy, doctor should
|
||||
still show a failed API-key connectivity row, but it should not promote
|
||||
that direct-key problem into the final blocking summary.
|
||||
"""
|
||||
try:
|
||||
from hermes_cli.auth import (
|
||||
get_gemini_oauth_auth_status,
|
||||
get_minimax_oauth_auth_status,
|
||||
)
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
normalized = (provider_label or "").strip().lower()
|
||||
if normalized in {"google / gemini", "gemini"}:
|
||||
return bool((get_gemini_oauth_auth_status() or {}).get("logged_in"))
|
||||
if normalized == "minimax":
|
||||
return bool((get_minimax_oauth_auth_status() or {}).get("logged_in"))
|
||||
return False
|
||||
|
||||
|
||||
def check_ok(text: str, detail: str = ""):
|
||||
print(f" {color('✓', Colors.GREEN)} {text}" + (f" {color(detail, Colors.DIM)}" if detail else ""))
|
||||
|
||||
|
|
@ -1594,7 +1618,10 @@ def run_doctor(args):
|
|||
print(f" {_glyph} {_label} {_detail}")
|
||||
else:
|
||||
print(f" {_glyph} {_label}")
|
||||
for _issue in _r.issues:
|
||||
_issues_to_add = list(_r.issues)
|
||||
if _issues_to_add and _has_healthy_oauth_fallback_for_apikey_provider(_r.label):
|
||||
_issues_to_add = []
|
||||
for _issue in _issues_to_add:
|
||||
issues.append(_issue)
|
||||
|
||||
# =========================================================================
|
||||
|
|
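The doctor change above keeps the failed API-key row visible but drops its issues from the blocking summary when an OAuth path is healthy. A minimal sketch of that filtering step, with the fallback check injected as a callable so it runs without Hermes installed (`blocking_issues` is a hypothetical name):

```python
from typing import Callable, List, Sequence


def blocking_issues(
    label: str,
    issues: Sequence[str],
    has_oauth_fallback: Callable[[str], bool],
) -> List[str]:
    """Return the issues that should count toward the final summary."""
    if issues and has_oauth_fallback(label):
        return []  # direct-key failure is non-blocking when OAuth works
    return list(issues)


# Stand-in for the real status check: pretend only Gemini has healthy OAuth.
fake_fallback = lambda label: label.strip().lower() in {"gemini", "google / gemini"}
```

The real helper consults `hermes_cli.auth` status functions, and the lambda here only mimics its label normalization.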
|
|||
|
|
@ -2525,6 +2525,7 @@ def _is_github_models_base_url(base_url: Optional[str]) -> bool:
|
|||
return (
|
||||
normalized.startswith(COPILOT_BASE_URL)
|
||||
or normalized.startswith("https://models.github.ai/inference")
|
||||
or normalized.startswith("https://models.inference.ai.azure.com")
|
||||
)
|
||||
|
||||
|
||||
|
|
|
|||
|
|
@ -325,8 +325,15 @@ class PluginContext:
|
|||
is_async: bool = False,
|
||||
description: str = "",
|
||||
emoji: str = "",
|
||||
override: bool = False,
|
||||
) -> None:
|
||||
"""Register a tool in the global registry **and** track it as plugin-provided."""
|
||||
"""Register a tool in the global registry **and** track it as plugin-provided.
|
||||
|
||||
Pass ``override=True`` to replace an existing built-in tool with the
|
||||
same name (e.g. swap the default ``browser_navigate`` for a custom
|
||||
CDP-backed implementation). Without it, attempting to register a name
|
||||
already claimed by a different toolset is rejected.
|
||||
"""
|
||||
from tools.registry import registry
|
||||
|
||||
registry.register(
|
||||
|
|
@ -339,9 +346,13 @@ class PluginContext:
|
|||
is_async=is_async,
|
||||
description=description,
|
||||
emoji=emoji,
|
||||
override=override,
|
||||
)
|
||||
self._manager._plugin_tool_names.add(name)
|
||||
logger.debug("Plugin %s registered tool: %s", self.manifest.name, name)
|
||||
logger.debug(
|
||||
"Plugin %s registered tool: %s%s",
|
||||
self.manifest.name, name, " (override)" if override else "",
|
||||
)
|
||||
|
||||
# -- message injection --------------------------------------------------
|
||||
|
||||
|
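The `override` semantics described in the docstring above can be sketched with a toy registry. `ToolRegistry` here is hypothetical and much smaller than `tools.registry`. Raising `ValueError` on a conflict is an assumption, since the diff only says the registration "is rejected".

```python
from typing import Callable, Dict, Tuple


class ToolRegistry:
    """Toy registry: one handler per tool name, owned by a toolset."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[str, Callable[[], str]]] = {}

    def register(self, name: str, toolset: str, handler: Callable[[], str],
                 override: bool = False) -> None:
        existing = self._tools.get(name)
        if existing is not None and existing[0] != toolset and not override:
            # Rejection mechanism is assumed; the real registry may log instead.
            raise ValueError(f"{name!r} already registered by {existing[0]!r}")
        self._tools[name] = (toolset, handler)

    def call(self, name: str) -> str:
        return self._tools[name][1]()


registry = ToolRegistry()
registry.register("browser_navigate", "browser", lambda: "builtin")
try:
    registry.register("browser_navigate", "my_plugin", lambda: "cdp")
    rejected = False
except ValueError:
    rejected = True
registry.register("browser_navigate", "my_plugin", lambda: "cdp", override=True)
```

The default stays conservative, so a plugin cannot silently shadow a built-in tool unless it opts in.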
|
|
|||
|
|
@ -61,6 +61,7 @@ CONFIGURABLE_TOOLSETS = [
|
|||
("video", "🎬 Video Analysis", "video_analyze (requires video-capable model)"),
|
||||
("image_gen", "🎨 Image Generation", "image_generate"),
|
||||
("video_gen", "🎬 Video Generation", "video_generate (text-to-video + image-to-video)"),
|
||||
("x_search", "🐦 X (Twitter) Search", "x_search (requires xAI OAuth or XAI_API_KEY)"),
|
||||
("moa", "🧠 Mixture of Agents", "mixture_of_agents"),
|
||||
("tts", "🔊 Text-to-Speech", "text_to_speech"),
|
||||
("skills", "📚 Skills", "list, view, manage"),
|
||||
|
|
@ -86,7 +87,12 @@ CONFIGURABLE_TOOLSETS = [
|
|||
# Video gen is off by default — it's a niche, paid, slow feature. Users
|
||||
# who want it opt in via `hermes tools` → Video Generation, which walks
|
||||
# them through provider + model selection.
|
||||
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "spotify", "discord", "discord_admin", "video", "video_gen"}
|
||||
#
|
||||
# X search is off by default — gated on xAI credentials (SuperGrok OAuth
|
||||
# or XAI_API_KEY). Users opt in via `hermes tools` → X (Twitter) Search,
|
||||
# which walks them through credential setup. The tool's check_fn means
|
||||
# the schema won't appear to the model even if enabled without credentials.
|
||||
_DEFAULT_OFF_TOOLSETS = {"moa", "homeassistant", "spotify", "discord", "discord_admin", "video", "video_gen", "x_search"}
|
||||
|
||||
# Platform-scoped toolsets: only appear in the `hermes tools` checklist for
|
||||
# these platforms, and only resolve/save for these platforms. A toolset
|
||||
|
|
@ -308,6 +314,39 @@ TOOL_CATEGORIES = {
|
|||
# converge image_gen toward.
|
||||
"providers": [],
|
||||
},
|
||||
"x_search": {
|
||||
"name": "X (Twitter) Search",
|
||||
"setup_title": "Select xAI Credential Source",
|
||||
"setup_note": (
|
||||
"Hermes routes X searches through xAI's built-in x_search "
|
||||
"Responses tool. Both credential sources hit the same "
|
||||
"https://api.x.ai/v1/responses endpoint — pick whichever you "
|
||||
"already have. SuperGrok OAuth is preferred when both are set "
|
||||
"(uses your subscription quota instead of API spend)."
|
||||
),
|
||||
"icon": "🐦",
|
||||
"providers": [
|
||||
{
|
||||
"name": "xAI Grok OAuth (SuperGrok Subscription)",
|
||||
"badge": "subscription",
|
||||
"tag": "Browser login at accounts.x.ai — no API key required",
|
||||
"env_vars": [],
|
||||
"post_setup": "xai_grok",
|
||||
},
|
||||
{
|
||||
"name": "xAI API key",
|
||||
"badge": "paid",
|
||||
"tag": "Direct xAI API billing via XAI_API_KEY",
|
||||
"env_vars": [
|
||||
{
|
||||
"key": "XAI_API_KEY",
|
||||
"prompt": "xAI API key",
|
||||
"url": "https://console.x.ai/",
|
||||
},
|
||||
],
|
||||
},
|
||||
],
|
||||
},
|
||||
"browser": {
|
||||
"name": "Browser Automation",
|
||||
"icon": "🌐",
|
||||
|
|
|
|||
|
|
@ -21,6 +21,7 @@ Public API (signatures preserved from the original 2,400-line version):
|
|||
"""
|
||||
|
||||
import json
|
||||
import re
|
||||
import asyncio
|
||||
import logging
|
||||
import threading
|
||||
|
|
@ -485,6 +486,48 @@ _AGENT_LOOP_TOOLS = {"todo", "memory", "session_search", "delegate_task"}
|
|||
_READ_SEARCH_TOOLS = {"read_file", "search_files"}
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Tool error sanitization
|
||||
# =========================================================================
|
||||
#
|
||||
# Tool exceptions can carry arbitrary text into the model's context as the
|
||||
# `tool` message content. json.dumps() handles quote/backslash escaping so a
|
||||
# raw injection of `</tool_call>` won't break message framing, but the model
|
||||
# still *reads* those tokens and they can confuse downstream tool-call
|
||||
# parsing or, in adversarial cases, nudge it toward role-confusion framing.
|
||||
#
|
||||
# This helper strips structural framing tokens (XML role tags, CDATA,
|
||||
# markdown code fences) and caps the message at a sane upper bound before it
|
||||
# becomes part of the conversation. It's defense-in-depth — the json layer
|
||||
# already prevents framing escape — but cheap and worth having.
|
||||
#
|
||||
# Ported from ironclaw#1639.
|
||||
_TOOL_ERROR_ROLE_TAG_RE = re.compile(
|
||||
r'</?(?:tool_call|function_call|result|response|output|input|system|assistant|user)>',
|
||||
re.IGNORECASE,
|
||||
)
|
||||
_TOOL_ERROR_FENCE_OPEN_RE = re.compile(r'^\s*```(?:json|xml|html|markdown)?\s*', re.MULTILINE)
|
||||
_TOOL_ERROR_FENCE_CLOSE_RE = re.compile(r'\s*```\s*$', re.MULTILINE)
|
||||
_TOOL_ERROR_CDATA_RE = re.compile(r'<!\[CDATA\[.*?\]\]>', re.DOTALL)
|
||||
_TOOL_ERROR_MAX_LEN = 2000
|
||||
|
||||
|
||||
def _sanitize_tool_error(error_msg: str) -> str:
|
||||
"""Strip structural framing tokens from a tool error before showing it to the model.
|
||||
|
||||
See the comment block above _TOOL_ERROR_ROLE_TAG_RE for rationale.
|
||||
"""
|
||||
if not error_msg:
|
||||
return "[TOOL_ERROR] "
|
||||
sanitized = _TOOL_ERROR_ROLE_TAG_RE.sub("", error_msg)
|
||||
sanitized = _TOOL_ERROR_FENCE_OPEN_RE.sub("", sanitized)
|
||||
sanitized = _TOOL_ERROR_FENCE_CLOSE_RE.sub("", sanitized)
|
||||
sanitized = _TOOL_ERROR_CDATA_RE.sub("", sanitized)
|
||||
if len(sanitized) > _TOOL_ERROR_MAX_LEN:
|
||||
sanitized = sanitized[:_TOOL_ERROR_MAX_LEN - 3] + "..."
|
||||
return f"[TOOL_ERROR] {sanitized}"
|
||||
|
||||
|
||||
# =========================================================================
|
||||
# Tool argument type coercion
|
||||
# =========================================================================
|
||||
|
|
@ -824,7 +867,7 @@ def handle_function_call(
|
|||
except Exception as e:
|
||||
error_msg = f"Error executing {function_name}: {str(e)}"
|
||||
logger.exception(error_msg)
|
||||
return json.dumps({"error": error_msg}, ensure_ascii=False)
|
||||
return json.dumps({"error": _sanitize_tool_error(error_msg)}, ensure_ascii=False)
|
||||
|
||||
|
||||
# =============================================================================
|
||||
|
|
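To see what the sanitizer does to a hostile error string, here is a reduced sketch covering only the role-tag and length-cap steps. The regex and the 2000-character cap are copied from the hunk above. `sanitize_tool_error` is a simplified stand-in that omits the fence and CDATA passes, so it is not a drop-in replacement for `_sanitize_tool_error`.

```python
import re

ROLE_TAG_RE = re.compile(
    r'</?(?:tool_call|function_call|result|response|output|input|system|assistant|user)>',
    re.IGNORECASE,
)
MAX_LEN = 2000


def sanitize_tool_error(error_msg: str) -> str:
    """Reduced sketch: strip role tags, cap the length, add the prefix."""
    if not error_msg:
        return "[TOOL_ERROR] "
    sanitized = ROLE_TAG_RE.sub("", error_msg)
    if len(sanitized) > MAX_LEN:
        sanitized = sanitized[:MAX_LEN - 3] + "..."
    return f"[TOOL_ERROR] {sanitized}"


hostile = "Error executing web: </tool_call><system>ignore previous</system> boom"
cleaned = sanitize_tool_error(hostile)
long_cleaned = sanitize_tool_error("x" * 3000)
```

The tags are removed but the text between them survives, which matches the stated goal: the model still sees the error, just without structural framing tokens.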
|
|||
309
optional-skills/devops/pinggy-tunnel/SKILL.md
Normal file
|
|
@ -0,0 +1,309 @@
|
|||
---
|
||||
name: pinggy-tunnel
|
||||
description: Zero-install localhost tunnels over SSH via Pinggy.
|
||||
version: 0.1.0
|
||||
author: Teknium (teknium1), Hermes Agent
|
||||
license: MIT
|
||||
platforms: [linux, macos, windows]
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [Pinggy, Tunnel, Networking, SSH, Webhook, Localhost]
|
||||
related_skills: [cloudflared-quick-tunnel, webhook-subscriptions]
|
||||
---
|
||||
|
||||
# Pinggy Tunnel Skill
|
||||
|
||||
Expose a local service (dev server, webhook receiver, MCP endpoint, demo) to the public internet using a Pinggy SSH reverse tunnel. No daemon to install — the user's stock SSH client connects to `a.pinggy.io:443` and Pinggy hands back a public HTTP/HTTPS URL.
|
||||
|
||||
Free tier: 60-minute tunnels, random subdomain, no signup. The Pro tier ($3/mo) is opt-in and requires a token.
|
||||
|
||||
## When to Use
|
||||
|
||||
- User asks to "expose this locally", "share my dev server", "make this URL public", "tunnel port N", "get a public URL for a webhook"
|
||||
- Need to receive a webhook callback during a local task (Stripe, GitHub, Discord, AgentMail)
|
||||
- Sharing a one-off HTTP demo (MCP server, Ollama/vLLM endpoint, dashboard) with a remote party
|
||||
- The host has SSH but no `cloudflared` / `ngrok` binary, and installing one would be overkill
|
||||
|
||||
If the host already has `cloudflared` configured, prefer the `cloudflared-quick-tunnel` skill — Cloudflare quick tunnels don't expire after 60 minutes.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- `ssh` on PATH (`ssh -V`). Default on Linux, macOS, and Windows 10+. No other install.
|
||||
- A local service listening on `127.0.0.1:<port>` before the tunnel starts. Pinggy will return URLs but they'll 502 until the local origin is up.
|
||||
|
||||
Optional:
|
||||
|
||||
- `PINGGY_TOKEN` env var for paid Pro features (persistent subdomain, custom domain, multiple tunnels, no 60-minute cap). Free tier needs no credentials.
|
||||
|
||||
## Quick Reference
|
||||
|
||||
```bash
|
||||
# Plain HTTP/HTTPS tunnel for port 8000 (free tier)
|
||||
ssh -p 443 -o StrictHostKeyChecking=no -o ServerAliveInterval=30 \
|
||||
-R0:localhost:8000 free@a.pinggy.io
|
||||
|
||||
# TCP tunnel (databases, raw SSH, etc.)
|
||||
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:5432 tcp@a.pinggy.io
|
||||
|
||||
# TLS tunnel (Pinggy can't decrypt — bring your own certs at origin)
|
||||
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:443 tls@a.pinggy.io
|
||||
|
||||
# Basic auth gate (b:user:pass)
|
||||
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 \
|
||||
"b:admin:secret+free@a.pinggy.io"
|
||||
|
||||
# Bearer token gate (k:token)
|
||||
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 \
|
||||
"k:mysecrettoken+free@a.pinggy.io"
|
||||
|
||||
# IP whitelist (w:CIDR)
|
||||
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 \
|
||||
"w:203.0.113.0/24+free@a.pinggy.io"
|
||||
|
||||
# Enable CORS + force HTTPS redirect
|
||||
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 \
|
||||
"co+x:https+free@a.pinggy.io"
|
||||
|
||||
# Pro tier (persistent URL, no 60-min cap)
|
||||
ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:8000 "${PINGGY_TOKEN}@a.pinggy.io"
|
||||
```
|
||||
|
||||
## Procedure — Start a Tunnel and Get the URL
|
||||
|
||||
The model SHOULD use the `terminal` tool. The tunnel must stay alive for the duration of the share, so run it as a background process and parse the public URL from stdout.
|
||||
|
||||
### 1. Confirm a local origin is up
|
||||
|
||||
```bash
|
||||
curl -sI http://127.0.0.1:8000/ | head -1
|
||||
# expect HTTP/1.x 200 (or any non-connection-refused response)
|
||||
```
|
||||
|
||||
If nothing is listening yet, start it first (e.g. `python3 -m http.server 8000 --bind 127.0.0.1`). Pinggy will happily return a URL pointed at nothing — the user will see 502 until the origin comes up.
|
||||
|
||||
### 2. Launch the tunnel as a background process
|
||||
|
||||
Use `terminal(background=True)` and capture output to a logfile (Pinggy prints the URLs on stdout, then keeps the connection open):
|
||||
|
||||
```bash
|
||||
LOG=/tmp/pinggy-8000.log
|
||||
nohup ssh -p 443 \
|
||||
-o StrictHostKeyChecking=no \
|
||||
-o UserKnownHostsFile=/dev/null \
|
||||
-o ServerAliveInterval=30 \
|
||||
-o ServerAliveCountMax=3 \
|
||||
-R0:localhost:8000 free@a.pinggy.io \
|
||||
> "$LOG" 2>&1 &
|
||||
echo $! > /tmp/pinggy-8000.pid
|
||||
```
|
||||
|
||||
`StrictHostKeyChecking=no` + `UserKnownHostsFile=/dev/null` skips the first-run host-key prompt. `ServerAliveInterval=30` keeps the SSH session from getting torn down by an idle NAT.
|
||||
|
||||
### 3. Parse the URL out of the log
|
||||
|
||||
```bash
|
||||
sleep 4
|
||||
grep -oE 'https://[a-z0-9.-]+\.pinggy\.link' /tmp/pinggy-8000.log | head -1
|
||||
```
|
||||
|
||||
Expected output looks like:
|
||||
|
||||
```
|
||||
You are not authenticated.
|
||||
Your tunnel will expire in 60 minutes.
|
||||
http://yqycl-98-162-69-48.a.free.pinggy.link
|
||||
https://yqycl-98-162-69-48.a.free.pinggy.link
|
||||
```
|
||||
|
||||
Hand the `https://...pinggy.link` URL to the user.
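If the log is read from Python rather than the shell, the same extraction might look like the sketch below. `first_pinggy_url` is a hypothetical helper, and the pattern allows dots inside the subdomain part so that multi-label hosts like the sample output above are matched.

```python
import re
from typing import Optional

# Allows dots in the subdomain so hosts like "<id>.a.free.pinggy.link" match.
PINGGY_URL_RE = re.compile(r"https://[a-z0-9.-]+\.pinggy\.link")


def first_pinggy_url(log_text: str) -> Optional[str]:
    """Return the first https Pinggy URL in the log text, or None."""
    match = PINGGY_URL_RE.search(log_text)
    return match.group(0) if match else None


sample = """You are not authenticated.
Your tunnel will expire in 60 minutes.
http://yqycl-98-162-69-48.a.free.pinggy.link
https://yqycl-98-162-69-48.a.free.pinggy.link
"""
```

Returning `None` when nothing matches lets the caller retry after a short sleep instead of handing the user an empty string.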
|
||||
|
||||
### 4. Verify
|
||||
|
||||
```bash
|
||||
curl -sI https://<the-url>/ | head -3
|
||||
# expect 200/302/whatever the local origin actually returns
|
||||
```
|
||||
|
||||
If you get `502 Bad Gateway`, the SSH session is up but the local origin isn't listening — fix step 1 first.
|
||||
|
||||
### 5. Teardown
|
||||
|
||||
```bash
|
||||
kill "$(cat /tmp/pinggy-8000.pid)"
|
||||
# or, if the pid file got lost:
|
||||
pkill -f 'ssh -p 443 .* free@a\.pinggy\.io'
|
||||
```
|
||||
|
||||
If you have a session_id from `terminal(background=True)`, prefer `process(action='kill', session_id=...)`.
|
||||
|
||||
## Access Control via Username Keywords
|
||||
|
||||
Pinggy stacks control flags into the SSH username separated by `+`. Always quote the whole `user@host` argument when it contains a `+`:
|
||||
|
||||
| Keyword | Effect |
|
||||
|---------|--------|
|
||||
| `b:user:pass` | HTTP Basic auth gate |
|
||||
| `k:token` | Bearer-token header gate (`Authorization: Bearer <token>`) |
|
||||
| `w:CIDR` | IP whitelist (single IP or CIDR, repeatable) |
|
||||
| `co` | Add `Access-Control-Allow-Origin: *` (CORS) |
|
||||
| `x:https` | Force HTTPS — auto-redirect HTTP to HTTPS |
|
||||
| `a:Name:Value` | Add request header |
|
||||
| `u:Name:Value` | Update request header |
|
||||
| `r:Name` | Remove request header |
|
||||
| `qr` | Print a QR code of the URL to stdout (handy for mobile sharing) |
|
||||
|
||||
Combine freely: `"b:admin:secret+co+x:https+free@a.pinggy.io"`.
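When the SSH command is assembled programmatically, building the username from a list of keywords avoids shell-quoting mistakes. A minimal sketch follows. `pinggy_target` is a hypothetical helper, and the resulting string is meant to be passed as a single argv element, for example to `subprocess.Popen`.

```python
from typing import Iterable, List


def pinggy_target(keywords: Iterable[str], user: str = "free",
                  host: str = "a.pinggy.io") -> str:
    """Join control keywords and the user with '+', then append @host."""
    parts: List[str] = [*keywords, user]
    return "+".join(parts) + "@" + host


target = pinggy_target(["b:admin:secret", "co", "x:https"])
# Passing a list (no shell=True) means the '+' never reaches a shell parser.
cmd = ["ssh", "-p", "443", "-R0:localhost:8000", target]
```

Keeping the target as one list element sidesteps the quoting pitfall entirely, since no shell ever interprets the `+` characters.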
|
||||
|
||||
## Web Debugger (optional)
|
||||
|
||||
Pinggy can mirror the inbound traffic to `localhost:4300` for inspection. Add a local forward to the SSH command:
|
||||
|
||||
```bash
|
||||
ssh -p 443 -L4300:localhost:4300 -R0:localhost:8000 free@a.pinggy.io
|
||||
```
|
||||
|
||||
Then open `http://localhost:4300` in a browser to see live request/response pairs.
|
||||
|
||||
## Pitfalls
|
||||
|
||||
- **60-minute hard cap on the free tier.** The SSH session terminates at the 60-minute mark; the URL goes dead. For longer shares, either use `PINGGY_TOKEN` (Pro) or auto-restart with a shell loop (note that the URL changes on every restart for free-tier).
|
||||
- **Free-tier URL is random and changes on restart.** Don't bookmark it, don't paste it into a config file. Re-parse from the log each time.
|
||||
- **Concurrent free tunnels are limited to one per source IP.** Starting a second tunnel from the same machine usually kills the first. Pro tier lifts this.
|
||||
- **`+` in usernames must be quoted.** Bare `ssh ... b:admin:secret+free@a.pinggy.io` works in bash but breaks under shells that treat `+` specially or when assembled programmatically. Always wrap in double quotes.
|
||||
- **Don't tunnel anything sensitive without an access-control flag.** A bare HTTP tunnel is reachable by anyone with the URL. Use `b:`, `k:`, or `w:` for non-public services.
|
||||
- **`process(action='log')` may miss SSH banner output.** Pinggy prints the URLs and then the SSH session goes interactive. Always redirect to a logfile and `grep` the file directly — same pattern as `cloudflared-quick-tunnel`.
|
||||
- **Host-key prompt on first run.** Default OpenSSH config asks the user to accept Pinggy's host key. Always pass `-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null` for unattended runs.
|
||||
- **TCP and TLS tunnels return a `<subdomain>.a.pinggy.online:<port>` pair, not an https URL.** Parse with a different regex (`tcp://` and a port). Don't assume every Pinggy tunnel is HTTP.
|
||||
- **Pro mode requires the token as the username, not a flag.** Use `"${PINGGY_TOKEN}@a.pinggy.io"` (no `free@`); without the `@`, ssh treats the whole string as a hostname. With a token you can also add `:persistent` for a stable subdomain; see `pinggy.io/docs/`.
|
||||
|
||||
## Recipes
|
||||
|
||||
Composite patterns combining a local origin with a Pinggy tunnel. Each recipe is self-contained — start the origin, start the tunnel, parse the URL, hand it back to the user.
|
||||
|
||||
### Recipe 1 — Receive a webhook callback
|
||||
|
||||
Use this when an external service (Stripe, GitHub, Discord, AgentMail, etc.) needs to POST to a publicly reachable URL during a local task.
|
||||
|
||||
```bash
|
||||
# 1. Tiny capturing server: every request gets appended to /tmp/webhook-hits.log
|
||||
cat >/tmp/webhook-server.py <<'PY'
|
||||
import http.server, json, datetime, pathlib
|
||||
LOG = pathlib.Path("/tmp/webhook-hits.log")
|
||||
class H(http.server.BaseHTTPRequestHandler):
|
||||
def _capture(self):
|
||||
n = int(self.headers.get("content-length") or 0)
|
||||
body = self.rfile.read(n).decode("utf-8", "replace") if n else ""
|
||||
rec = {"t": datetime.datetime.now(datetime.timezone.utc).isoformat(), "path": self.path,
|
||||
"method": self.command, "headers": dict(self.headers), "body": body}
|
||||
with LOG.open("a") as f: f.write(json.dumps(rec) + "\n")
|
||||
self.send_response(200); self.send_header("content-type","application/json")
|
||||
self.end_headers(); self.wfile.write(b'{"ok":true}\n')
|
||||
def do_GET(self): self._capture()
|
||||
def do_POST(self): self._capture()
|
||||
def log_message(self,*a,**k): pass
|
||||
http.server.HTTPServer(("127.0.0.1", 18080), H).serve_forever()
|
||||
PY
|
||||
nohup python3 /tmp/webhook-server.py >/tmp/webhook-server.log 2>&1 &
|
||||
echo $! >/tmp/webhook-server.pid
|
||||
|
||||
# 2. Tunnel — bearer-token-gate so randos can't pollute the capture log
|
||||
nohup ssh -p 443 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
|
||||
-o ServerAliveInterval=30 \
|
||||
-R0:localhost:18080 "k:$(openssl rand -hex 12)+free@a.pinggy.io" \
|
||||
>/tmp/webhook-pinggy.log 2>&1 &
|
||||
echo $! >/tmp/webhook-pinggy.pid
|
||||
sleep 5
|
||||
URL=$(grep -oE 'https://[a-z0-9.-]+\.pinggy\.link' /tmp/webhook-pinggy.log | head -1)
|
||||
echo "Webhook URL: $URL"
|
||||
|
||||
# 3. While the agent works, watch hits land
|
||||
tail -f /tmp/webhook-hits.log
|
||||
```
|
||||
|
||||
Hand `$URL` to the service that needs to call you. Teardown: `kill $(cat /tmp/webhook-server.pid) $(cat /tmp/webhook-pinggy.pid)`.
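
Once hits start landing, the capture log can be summarized instead of read by eye. The following is a minimal sketch, assuming the JSON-lines record shape written by the capturing server in this recipe (`method`, `path`, and so on); the helper names `load_hits` and `summarize` are illustrative and not part of the skill.

```python
import json
from collections import Counter
from pathlib import Path


def load_hits(path="/tmp/webhook-hits.log"):
    """Parse the JSON-lines capture log into a list of dicts.

    Blank lines and lines that fail to parse are skipped, since a record
    may be half-written while the server is still running.
    """
    hits = []
    p = Path(path)
    if not p.exists():
        return hits
    for line in p.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            hits.append(json.loads(line))
        except json.JSONDecodeError:
            continue
    return hits


def summarize(hits):
    """Count captured requests per (method, path) pair."""
    return Counter((h.get("method"), h.get("path")) for h in hits)
```

Calling `summarize(load_hits()).most_common()` gives a quick view of which endpoints the external service actually called.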
|
||||
|
||||
### Recipe 2 — Expose an MCP server over HTTP/SSE
|
||||
|
||||
Use when a remote MCP client (Claude Desktop on another machine, a teammate's editor, etc.) needs to reach an MCP server running on the local box. Only works for MCP servers that speak HTTP transport — stdio-mode servers can't be tunneled.
|
||||
|
||||
```bash
|
||||
# 1. Start the MCP server in HTTP mode (example: a FastMCP server on port 8765)
|
||||
nohup python3 my_mcp_server.py --transport http --port 8765 \
|
||||
>/tmp/mcp-server.log 2>&1 &
|
||||
echo $! >/tmp/mcp-server.pid
|
||||
|
||||
# 2. Tunnel with a bearer token — MCP traffic should not be open to the internet
|
||||
TOKEN=$(openssl rand -hex 16)
|
||||
nohup ssh -p 443 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
|
||||
-o ServerAliveInterval=30 \
|
||||
-R0:localhost:8765 "k:$TOKEN+free@a.pinggy.io" \
|
||||
>/tmp/mcp-pinggy.log 2>&1 &
|
||||
echo $! >/tmp/mcp-pinggy.pid
|
||||
sleep 5
|
||||
URL=$(grep -oE 'https://[a-z0-9-]+\.[a-z]+\.pinggy\.link' /tmp/mcp-pinggy.log | head -1)
|
||||
echo "MCP URL: $URL"
|
||||
echo "Bearer token: $TOKEN"
|
||||
```
|
||||
|
||||
The remote client connects to `$URL` with `Authorization: Bearer $TOKEN`. Hermes' own native MCP client config: `{"transport": "http", "url": "<URL>", "headers": {"Authorization": "Bearer <TOKEN>"}}`.
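
The client config above can also be generated rather than typed by hand. This is a small sketch that only reproduces the JSON shape quoted in the previous sentence; the function name `build_mcp_client_config` is hypothetical and the HTTPS check is an added assumption, not a Hermes requirement.

```python
import json


def build_mcp_client_config(url: str, token: str) -> dict:
    """Return the HTTP-transport MCP client config for a tunneled server."""
    if not url.startswith("https://"):
        # Assumption: the tunnel URL grepped from the Pinggy log is always HTTPS.
        raise ValueError(f"expected an https:// tunnel URL, got {url!r}")
    return {
        "transport": "http",
        "url": url,
        "headers": {"Authorization": f"Bearer {token}"},
    }


def render(url: str, token: str) -> str:
    """Serialize the config as JSON for pasting into a client config file."""
    return json.dumps(build_mcp_client_config(url, token), indent=2)
```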
|
||||
|
||||
### Recipe 3 — Expose a local LLM endpoint (Ollama / vLLM / llama.cpp)
|
||||
|
||||
Share a local model with a remote caller (another agent, a phone, a teammate). Ollama listens on `:11434`, vLLM and llama.cpp typically on `:8000`.
|
||||
|
||||
```bash
|
||||
# Pre-req: the model server is already running on 127.0.0.1:11434 (Ollama default)
|
||||
TOKEN=$(openssl rand -hex 16)
|
||||
nohup ssh -p 443 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
|
||||
-o ServerAliveInterval=30 \
|
||||
-R0:localhost:11434 "k:$TOKEN+co+free@a.pinggy.io" \
|
||||
>/tmp/llm-pinggy.log 2>&1 &
|
||||
echo $! >/tmp/llm-pinggy.pid
|
||||
sleep 5
|
||||
URL=$(grep -oE 'https://[a-z0-9-]+\.[a-z]+\.pinggy\.link' /tmp/llm-pinggy.log | head -1)
|
||||
echo "Endpoint: $URL"
|
||||
echo "Token: $TOKEN"
|
||||
|
||||
# Verify
|
||||
curl -s "$URL/api/tags" -H "Authorization: Bearer $TOKEN" | head
|
||||
```
|
||||
|
||||
`co` enables CORS so a browser caller can hit the endpoint; drop it for backend-only callers. For an OpenAI-compatible vLLM or llama.cpp endpoint, point `-R0:localhost:` at that server's port (typically `8000`) and have callers use base URL `$URL/v1` with `Authorization: Bearer $TOKEN`. Pinggy checks the token but otherwise forwards the request unmodified, so the model server itself still sees Pinggy's token in the `Authorization` header. Configure the local server to ignore auth (it only listens on `127.0.0.1`) and let Pinggy do the gating.
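
A remote caller does not need `curl` to use the tunnel. The sketch below builds the same `/api/tags` request as the verify step using only the standard library; `build_tags_request` and `list_models` are illustrative names, and the response parsing assumes Ollama's `{"models": [{"name": ...}]}` payload.

```python
import json
import urllib.request


def build_tags_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the Ollama model list behind the tunnel."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/tags",
        headers={"Authorization": f"Bearer {token}"},
    )


def list_models(base_url: str, token: str, timeout: float = 10.0) -> list:
    """Send the request and return the model names (performs network I/O)."""
    with urllib.request.urlopen(build_tags_request(base_url, token), timeout=timeout) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]
```

Keeping request construction separate from the network call makes it easy to check the URL and header before anything is sent.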
|
||||
|
||||
### Recipe 4 — Share a dev server with a one-shot password
|
||||
|
||||
The fastest "let a teammate poke at my running app" pattern. Random password, prints once, dies when you Ctrl-C.
|
||||
|
||||
```bash
|
||||
PASS=$(openssl rand -base64 12 | tr -d '+/=' | head -c 12)
|
||||
echo "Dev server password: $PASS"
|
||||
ssh -p 443 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
|
||||
-o ServerAliveInterval=30 \
|
||||
-R0:localhost:3000 "b:dev:$PASS+co+x:https+free@a.pinggy.io"
|
||||
# URL prints to the terminal. Share URL + password. Ctrl-C to tear down.
|
||||
```
|
||||
|
||||
`b:dev:$PASS` gates the URL with HTTP Basic auth. `x:https` forces TLS. `co` adds CORS for SPA frontends.
|
||||
|
||||
## Verification
|
||||
|
||||
```bash
|
||||
# End-to-end: spin up a trivial origin, tunnel it, hit it, tear down
|
||||
python3 -m http.server 18000 --bind 127.0.0.1 >/tmp/origin.log 2>&1 &
|
||||
ORIGIN_PID=$!
|
||||
|
||||
nohup ssh -p 443 \
|
||||
-o StrictHostKeyChecking=no \
|
||||
-o UserKnownHostsFile=/dev/null \
|
||||
-R0:localhost:18000 free@a.pinggy.io >/tmp/pinggy-verify.log 2>&1 &
|
||||
SSH_PID=$!
|
||||
|
||||
sleep 5
|
||||
URL=$(grep -oE 'https://[a-z0-9-]+\.[a-z]+\.pinggy\.link' /tmp/pinggy-verify.log | head -1)
|
||||
echo "URL: $URL"
|
||||
curl -sI "$URL/" | head -1
|
||||
|
||||
kill "$SSH_PID" "$ORIGIN_PID"
|
||||
```
|
||||
|
||||
Expected: a `pinggy.link` URL and `HTTP/2 200` on the curl head.
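
The fixed `sleep 5` followed by `grep` can miss the URL on a slow connection. As a hedged alternative, this sketch polls the tunnel log with the same regular expression used in the recipes; `extract_tunnel_url` and `wait_for_tunnel_url` are hypothetical helpers, and the timeout values are arbitrary defaults.

```python
import re
import time
from pathlib import Path

# Same pattern as the grep in the recipes above.
TUNNEL_URL_RE = re.compile(r"https://[a-z0-9-]+\.[a-z]+\.pinggy\.link")


def extract_tunnel_url(log_text: str):
    """Return the first HTTPS tunnel URL found in the log text, or None."""
    m = TUNNEL_URL_RE.search(log_text)
    return m.group(0) if m else None


def wait_for_tunnel_url(log_path, timeout: float = 15.0, interval: float = 0.5):
    """Poll the ssh log until a tunnel URL appears or the timeout expires."""
    deadline = time.monotonic() + timeout
    path = Path(log_path)
    while time.monotonic() < deadline:
        if path.exists():
            url = extract_tunnel_url(path.read_text(errors="replace"))
            if url:
                return url
        time.sleep(interval)
    return None
```

A `None` result means the tunnel never came up within the timeout, which is a clearer failure signal than an empty `$URL`.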
|
||||
199
optional-skills/research/darwinian-evolver/SKILL.md
Normal file
|
|
@ -0,0 +1,199 @@
|
|||
---
|
||||
name: darwinian-evolver
|
||||
description: Evolve prompts/regex/SQL/code with Imbue's evolution loop.
|
||||
version: 0.1.0
|
||||
author: Bihruze (Asahi0x), Hermes Agent
|
||||
license: MIT
|
||||
platforms: [linux, macos]
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [evolution, optimization, prompt-engineering, research]
|
||||
related_skills: [arxiv, jupyter-live-kernel]
|
||||
---
|
||||
|
||||
# Darwinian Evolver
|
||||
|
||||
Run Imbue's [darwinian_evolver](https://github.com/imbue-ai/darwinian_evolver) — an
|
||||
LLM-driven evolutionary search loop — to optimize a **prompt, regex, SQL query,
|
||||
or small code snippet** against a fitness function.
|
||||
|
||||
Status: thin wrapper around the upstream tool. The skill installs it, walks the
|
||||
agent through writing a `Problem` definition (organism + evaluator + mutator),
|
||||
and drives the loop via the upstream CLI or a small custom Python driver.
|
||||
|
||||
**License:** the upstream tool is **AGPL-3.0**. The skill ONLY ever invokes it
|
||||
via the upstream CLI or a `subprocess`/`uv run` call (mere aggregation). Do NOT
|
||||
import upstream classes into Hermes itself.
|
||||
|
||||
## When to Use
|
||||
|
||||
- User says "optimize this prompt", "evolve a regex for X", "auto-improve this
|
||||
code/SQL", "search for a better instruction".
|
||||
- You have a scorer (exact match, regex pass-rate, unit test, LLM-judge, runtime
|
||||
metric) AND a starting candidate (organism). If you don't have a scorer, stop
|
||||
and define one first — that's the hard part.
|
||||
- Cost is OK: a typical run is 50–500 LLM calls. On gpt-4o-mini that's pennies;
|
||||
on Claude Sonnet it can be a few dollars.
|
||||
|
||||
Do **not** use this when:
|
||||
- The optimization target is differentiable (use gradient descent / DSPy).
|
||||
- You only need to try 2–3 variants — just write them by hand.
|
||||
- The fitness signal is purely subjective with no measurable criterion.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Python ≥3.11
|
||||
- `git`, `uv` (or `pip`)
|
||||
- One of: `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, or `OPENAI_API_KEY`
|
||||
|
||||
The skill ships a small `parrot_openrouter.py` driver that uses `OPENROUTER_API_KEY`
|
||||
via the OpenAI SDK, so any model on OpenRouter works. The upstream CLI itself
|
||||
hardcodes Anthropic and needs `ANTHROPIC_API_KEY`.
|
||||
|
||||
## Install (One-Time)
|
||||
|
||||
Run via the `terminal` tool:
|
||||
|
||||
```bash
|
||||
mkdir -p ~/.hermes/cache/darwinian-evolver && cd ~/.hermes/cache/darwinian-evolver
|
||||
[ -d darwinian_evolver ] || git clone --depth 1 https://github.com/imbue-ai/darwinian_evolver.git
|
||||
cd darwinian_evolver && uv sync
|
||||
```
|
||||
|
||||
Verify:
|
||||
|
||||
```bash
|
||||
cd ~/.hermes/cache/darwinian-evolver/darwinian_evolver \
|
||||
&& uv run darwinian_evolver --help | head -5
|
||||
```
|
||||
|
||||
## Quick Start — The Built-In Parrot Example
|
||||
|
||||
Tiny smoke test (requires `ANTHROPIC_API_KEY`):
|
||||
|
||||
```bash
|
||||
cd ~/.hermes/cache/darwinian-evolver/darwinian_evolver
|
||||
uv run darwinian_evolver parrot \
|
||||
--num_iterations 2 \
|
||||
--num_parents_per_iteration 2 \
|
||||
--mutator_concurrency 2 --evaluator_concurrency 2 \
|
||||
--output_dir /tmp/parrot_demo
|
||||
```
|
||||
|
||||
Outputs:
|
||||
- `/tmp/parrot_demo/snapshots/iteration_N.pkl` — pickled population per iteration
|
||||
- `/tmp/parrot_demo/<jsonl>` — per-iteration JSON log (path printed at end)
|
||||
|
||||
Open `~/.hermes/cache/darwinian-evolver/darwinian_evolver/darwinian_evolver/lineage_visualizer.html`
|
||||
in a browser and load the JSON log to see the evolutionary tree.
|
||||
|
||||
## Quick Start — OpenRouter Driver (No Anthropic Key)
|
||||
|
||||
The skill ships `scripts/parrot_openrouter.py` — same parrot problem, but the
|
||||
LLM call goes through OpenRouter so any provider works.
|
||||
|
||||
```bash
|
||||
# From wherever the skill is installed:
|
||||
SKILL_DIR=~/.hermes/skills/research/darwinian-evolver
|
||||
DE_DIR=~/.hermes/cache/darwinian-evolver/darwinian_evolver
|
||||
|
||||
cd "$DE_DIR" && \
|
||||
EVOLVER_MODEL='openai/gpt-4o-mini' \
|
||||
uv run --with openai python "$SKILL_DIR/scripts/parrot_openrouter.py" \
|
||||
--num_iterations 3 --num_parents_per_iteration 2 \
|
||||
--output_dir /tmp/parrot_or
|
||||
```
|
||||
|
||||
Inspect the result with `scripts/show_snapshot.py`:
|
||||
|
||||
```bash
|
||||
uv run --with openai python "$SKILL_DIR/scripts/show_snapshot.py" \
|
||||
/tmp/parrot_or/snapshots/iteration_3.pkl
|
||||
```
|
||||
|
||||
Expected output: 7 evolved prompt templates ranked by score, with the best
|
||||
landing around 0.6–0.8 (the seed `Say {{ phrase }}` scored 0.000).
|
||||
|
||||
## Defining a Custom Problem
|
||||
|
||||
The skill ships `templates/custom_problem_template.py` — copy, edit, run.
|
||||
Three things you must define:
|
||||
|
||||
1. **`Organism`** — a Pydantic `BaseModel` subclass holding the artifact being
|
||||
evolved (`prompt_template: str`, `regex_pattern: str`, `sql_query: str`,
|
||||
`code_block: str`, etc.). Add a `run(*args)` method that exercises it.
|
||||
|
||||
2. **`Evaluator`** — `.evaluate(organism) -> EvaluationResult(score=..., trainable_failure_cases=[...], holdout_failure_cases=[...], is_viable=True)`.
|
||||
- **`score`** is in `[0, 1]`. Higher is better.
|
||||
- **`trainable_failure_cases`** — what the mutator sees. Include enough
|
||||
context (input, expected, actual) for the LLM to diagnose.
|
||||
- **`holdout_failure_cases`** — kept out of the mutator's view. Use these
|
||||
to detect overfitting.
|
||||
- **`is_viable=True`** unless the organism is completely broken (raises,
|
||||
returns None, etc.). A 0-score viable organism is fine — it just gets
|
||||
down-weighted in parent selection.
|
||||
|
||||
3. **`Mutator`** — `.mutate(organism, failure_cases, learning_log_entries) -> list[Organism]`.
|
||||
Typically: build an LLM prompt that includes the current organism + a
|
||||
failure case + an ask to propose a fix; parse the LLM's response; return
|
||||
a new `Organism`. Return `[]` on parse failure — the loop handles it.
|
||||
|
||||
Then write a driver script that wires `Problem(initial_organism, evaluator, [mutators])`
|
||||
into `EvolveProblemLoop` and iterates over `loop.run(num_iterations=N)` — the
|
||||
shipped `scripts/parrot_openrouter.py` is the reference.
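
To make the evaluator contract concrete, here is a stand-alone sketch of the scoring logic for a regex organism. It deliberately does not import the upstream classes: `FailureCase`, `Result`, and `evaluate_regex` are illustrative stand-ins that mirror the `EvaluationResult` fields described above, so the example runs with the standard library only.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FailureCase:
    input: str
    expected: bool
    actual: bool


@dataclass
class Result:
    score: float
    trainable_failure_cases: list = field(default_factory=list)
    holdout_failure_cases: list = field(default_factory=list)
    is_viable: bool = True


def evaluate_regex(pattern, trainable, holdout):
    """Score a regex against (text, should_match) cases; score is in [0, 1]."""
    try:
        rx = re.compile(pattern)
    except re.error:
        # A pattern that does not compile is completely broken: not viable.
        return Result(score=0.0, is_viable=False)

    def failures(cases):
        out = []
        for text, expected in cases:
            actual = rx.fullmatch(text) is not None
            if actual != expected:
                out.append(FailureCase(text, expected, actual))
        return out

    train_fails, hold_fails = failures(trainable), failures(holdout)
    n_total = len(trainable) + len(holdout)
    n_ok = n_total - len(train_fails) - len(hold_fails)
    # A 0-score pattern that compiles stays viable so the mutator can fix it.
    return Result(n_ok / n_total if n_total else 0.0, train_fails, hold_fails, True)
```

Each failure case carries input, expected, and actual values, which is the context a mutator prompt needs to diagnose the miss.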
|
||||
|
||||
## Hyperparameters That Actually Matter
|
||||
|
||||
| flag | default | when to change |
|---|---|---|
| `--num_iterations` | 5 | bump to 10–20 once you trust the evaluator |
| `--num_parents_per_iteration` | 4 | drop to 2 for cheap exploration |
| `--mutator_concurrency` | 10 | drop to 2–4 to avoid rate limits |
| `--evaluator_concurrency` | 10 | same; evaluator hits the LLM too |
| `--batch_size` | 1 | raise to 3–5 once your mutator handles multiple failures |
| `--verify_mutations` | off | turn on once mutator is wasteful (>10× cost saving on later runs per Imbue) |
| `--midpoint_score` | `p75` | leave alone unless scores cluster |
| `--sharpness` | 10 | leave alone |
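
To build intuition for `--midpoint_score` and `--sharpness`, the sketch below assumes a sigmoid-shaped parent weighting with a novelty discount, in the style of the Darwin Gödel Machine paper listed in the references. This is an assumption for illustration only: it is not copied from the upstream source, and `parent_weight` is a hypothetical function.

```python
import math


def parent_weight(score, midpoint, sharpness=10.0, num_children=0, novelty_weight=1.0):
    """Illustrative selection weight: sigmoid of the score, discounted by offspring count.

    Assumed form (not verified against upstream):
        sigmoid(sharpness * (score - midpoint)) / (1 + novelty_weight * num_children)
    """
    sigmoid = 1.0 / (1.0 + math.exp(-sharpness * (score - midpoint)))
    novelty = 1.0 / (1.0 + novelty_weight * num_children)
    return sigmoid * novelty
```

Under this assumed form, two organisms scoring 0.70 and 0.72 around a midpoint of 0.71 get nearly equal weights, which is why clustered scores are the one case where adjusting the midpoint may be worth it.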
|
||||
|
||||
## Pitfalls
|
||||
|
||||
1. **`Initial organism must be viable`**: set `is_viable=True` in your
   `EvaluationResult` even on a 0-score seed. The loop refuses non-viable
   organisms because they imply the loop has nothing to evolve from.
2. **Provider content filters kill runs.** Azure-backed OpenRouter models
   reject phrases like "ignore previous instructions" with HTTP 400. Wrap
   the LLM call in `try/except` and return `f"<LLM_ERROR: {e}>"`; the
   evolver will just score that organism 0 and move on.
3. **`loop.run()` is a generator.** Calling it doesn't run anything until
   you iterate. Use `for snap in loop.run(num_iterations=N):`.
4. **Snapshots are nested pickles.** `iteration_N.pkl` contains a dict with
   `population_snapshot` (more pickled bytes). To unpickle you must have the
   `Organism` class importable under the same dotted path it was pickled at.
5. **Concurrency defaults are aggressive.** 10/10 will hit rate limits on
   most providers. Start with 2/2.
6. **CLI is hardcoded to Anthropic.** `uv run darwinian_evolver <problem>`
   reaches for `ANTHROPIC_API_KEY` and uses Claude Sonnet. To use any other
   provider, write a driver like `parrot_openrouter.py`.
7. **AGPL.** Never `from darwinian_evolver import ...` inside Hermes core.
   Custom driver scripts under `~/.hermes/skills/...` are user-side and fine.
8. **No PyPI package.** `pip install darwinian-evolver` will pull the wrong
   thing. Always install from the GitHub repo.
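
The nested-pickle layout from pitfall 4 can be seen in miniature below. This sketch assumes the structure that `scripts/show_snapshot.py` reads (an outer dict with a `population_snapshot` bytes value, whose inner pickle holds an `organisms` list of pairs). It uses plain dicts as toy organisms, so it sidesteps the importability requirement that real `Organism` instances have; `pack_snapshot` and `unpack_snapshot` are illustrative names.

```python
import pickle


def pack_snapshot(pairs) -> bytes:
    """Build a toy snapshot: an outer pickle wrapping inner pickled bytes."""
    inner = pickle.dumps({"organisms": pairs})
    return pickle.dumps({"population_snapshot": inner})


def unpack_snapshot(blob: bytes):
    """Undo both pickle layers and return the (organism, result) pairs."""
    outer = pickle.loads(blob)
    if not isinstance(outer, dict) or "population_snapshot" not in outer:
        raise ValueError("not a snapshot: missing population_snapshot key")
    inner = pickle.loads(outer["population_snapshot"])
    return inner["organisms"]


def rank(pairs):
    """Sort pairs by descending score, as show_snapshot.py does."""
    return sorted(pairs, key=lambda p: p[1].get("score", 0) or 0, reverse=True)
```

With real snapshots the second `pickle.loads` is where a missing driver module surfaces, so put the driver's directory on `sys.path` before unpacking. Only unpickle snapshots you produced yourself, since `pickle` can execute arbitrary code.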
|
||||
|
||||
## Verification
|
||||
|
||||
After install + a parrot run, exit code 0 from this is sufficient:
|
||||
|
||||
```bash
|
||||
DE_DIR=~/.hermes/cache/darwinian-evolver/darwinian_evolver
|
||||
ls "$DE_DIR/darwinian_evolver/lineage_visualizer.html" >/dev/null && \
|
||||
cd "$DE_DIR" && uv run darwinian_evolver --help >/dev/null && \
|
||||
echo "darwinian-evolver: OK"
|
||||
```
|
||||
|
||||
## References
|
||||
|
||||
- [Imbue research post](https://imbue.com/research/2026-02-27-darwinian-evolver/)
|
||||
- [ARC-AGI-2 results](https://imbue.com/research/2026-02-27-arc-agi-2-evolution/)
|
||||
- [imbue-ai/darwinian_evolver](https://github.com/imbue-ai/darwinian_evolver) (AGPL-3.0)
|
||||
- [Darwin Gödel Machines](https://arxiv.org/abs/2505.22954)
|
||||
- [PromptBreeder](https://arxiv.org/abs/2309.16797)
|
||||
|
|
@ -0,0 +1,218 @@
|
|||
"""
|
||||
parrot_openrouter: same as the upstream `parrot` example but the LLM call goes
|
||||
through OpenRouter (OpenAI SDK) instead of Anthropic native. Lets us run an
|
||||
end-to-end evolution with whatever model the user already has paid access to.
|
||||
|
||||
Run with (from inside the upstream checkout, matching SKILL.md):
    uv run --with openai python /path/to/parrot_openrouter.py \
        --num_iterations 3 --output_dir /tmp/parrot_out
|
||||
|
||||
Reads `OPENROUTER_API_KEY` from the environment.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
import jinja2
|
||||
from openai import OpenAI
|
||||
|
||||
# Upstream problem types, imported directly rather than vendored (AGPL: in production, run this driver only via subprocess)
|
||||
from darwinian_evolver.cli_common import build_hyperparameter_config_from_args
|
||||
from darwinian_evolver.cli_common import register_hyperparameter_args
|
||||
from darwinian_evolver.cli_common import parse_learning_log_view_type
|
||||
from darwinian_evolver.evolve_problem_loop import EvolveProblemLoop
|
||||
from darwinian_evolver.learning_log import LearningLogEntry
|
||||
from darwinian_evolver.problem import EvaluationFailureCase
|
||||
from darwinian_evolver.problem import EvaluationResult
|
||||
from darwinian_evolver.problem import Evaluator
|
||||
from darwinian_evolver.problem import Mutator
|
||||
from darwinian_evolver.problem import Organism
|
||||
from darwinian_evolver.problem import Problem
|
||||
|
||||
DEFAULT_MODEL = os.environ.get("EVOLVER_MODEL", "openai/gpt-4o-mini")
|
||||
|
||||
|
||||
def _client() -> OpenAI:
|
||||
key = os.environ.get("OPENROUTER_API_KEY")
|
||||
if not key:
|
||||
sys.exit("OPENROUTER_API_KEY is not set")
|
||||
return OpenAI(api_key=key, base_url="https://openrouter.ai/api/v1")
|
||||
|
||||
|
||||
def _prompt_llm(prompt: str) -> str:
|
||||
try:
|
||||
r = _client().chat.completions.create(
|
||||
model=DEFAULT_MODEL,
|
||||
max_tokens=1024,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
)
|
||||
return r.choices[0].message.content or ""
|
||||
except Exception as e:
|
||||
# Treat any provider error (rate limit, content filter, schema reject)
|
||||
# as a failed response. The evolver will simply see this as a low score
|
||||
# on this organism and move on — much friendlier than killing the run.
|
||||
return f"<LLM_ERROR: {type(e).__name__}: {e}>"
|
||||
|
||||
|
||||
class ParrotOrganism(Organism):
|
||||
prompt_template: str
|
||||
|
||||
def run(self, phrase: str) -> str:
|
||||
try:
|
||||
prompt = jinja2.Template(self.prompt_template).render(phrase=phrase)
|
||||
except jinja2.exceptions.TemplateError as e:
|
||||
return f"Error rendering prompt: {e}"
|
||||
if not prompt:
|
||||
return ""
|
||||
return _prompt_llm(prompt)
|
||||
|
||||
|
||||
class ParrotEvaluationFailureCase(EvaluationFailureCase):
|
||||
phrase: str
|
||||
response: str
|
||||
|
||||
|
||||
class ImproveParrotMutator(Mutator[ParrotOrganism, ParrotEvaluationFailureCase]):
|
||||
IMPROVEMENT_PROMPT_TEMPLATE = """
|
||||
We want to build a prompt that causes an LLM to repeat back a given phrase verbatim.
|
||||
|
||||
The current prompt template is:
|
||||
```
|
||||
{{ organism.prompt_template }}
|
||||
```
|
||||
|
||||
Unfortunately, on this phrase:
|
||||
```
|
||||
{{ failure_case.phrase }}
|
||||
```
|
||||
the LLM responded with:
|
||||
```
|
||||
{{ failure_case.response }}
|
||||
```
|
||||
|
||||
Diagnose what went wrong, then propose an improved prompt template. Put the new
|
||||
template in the LAST triple-backtick block of your response.
|
||||
""".strip()
|
||||
|
||||
def mutate(
|
||||
self,
|
||||
organism: ParrotOrganism,
|
||||
failure_cases: list[ParrotEvaluationFailureCase],
|
||||
learning_log_entries: list[LearningLogEntry],
|
||||
) -> list[ParrotOrganism]:
|
||||
fc = failure_cases[0]
|
||||
prompt = jinja2.Template(self.IMPROVEMENT_PROMPT_TEMPLATE).render(
|
||||
organism=organism, failure_case=fc
|
||||
)
|
||||
try:
|
||||
resp = _prompt_llm(prompt)
|
||||
parts = resp.split("```")
|
||||
if len(parts) < 3:
|
||||
return []
|
||||
new_tpl = parts[-2].strip()
|
||||
return [ParrotOrganism(prompt_template=new_tpl)]
|
||||
except Exception as e:
|
||||
print(f"mutate error: {e}", file=sys.stderr)
|
||||
return []
|
||||
|
||||
|
||||
class ParrotEvaluator(Evaluator[ParrotOrganism, EvaluationResult, ParrotEvaluationFailureCase]):
|
||||
TRAINABLE_PHRASES = [
|
||||
"Hello world.",
|
||||
"bla",
|
||||
"Bla",
|
||||
"bla.",
|
||||
'"bla bla".',
|
||||
"Just say 'foo' once with no extra words.",
|
||||
]
|
||||
HOLDOUT_PHRASES = [
|
||||
"bla, but only once.",
|
||||
"'bla'",
|
||||
]
|
||||
|
||||
def evaluate(self, organism: ParrotOrganism) -> EvaluationResult:
|
||||
train_fails: list[ParrotEvaluationFailureCase] = []
|
||||
hold_fails: list[ParrotEvaluationFailureCase] = []
|
||||
for i, p in enumerate(self.TRAINABLE_PHRASES):
|
||||
r = organism.run(p)
|
||||
if r != p:
|
||||
train_fails.append(ParrotEvaluationFailureCase(
|
||||
phrase=p, response=r, data_point_id=f"trainable_{i}"))
|
||||
for i, p in enumerate(self.HOLDOUT_PHRASES):
|
||||
r = organism.run(p)
|
||||
if r != p:
|
||||
hold_fails.append(ParrotEvaluationFailureCase(
|
||||
phrase=p, response=r, data_point_id=f"holdout_{i}"))
|
||||
n_total = len(self.TRAINABLE_PHRASES) + len(self.HOLDOUT_PHRASES)
|
||||
n_ok = n_total - len(train_fails) - len(hold_fails)
|
||||
return EvaluationResult(
|
||||
score=n_ok / n_total,
|
||||
trainable_failure_cases=train_fails,
|
||||
holdout_failure_cases=hold_fails,
|
||||
# Always viable. Even a 0-score seed is a valid starting point; the
|
||||
# mutator should still get a chance to fix it.
|
||||
is_viable=True,
|
||||
)
|
||||
|
||||
|
||||
def make_problem() -> Problem:
|
||||
return Problem[ParrotOrganism, EvaluationResult, ParrotEvaluationFailureCase](
|
||||
evaluator=ParrotEvaluator(),
|
||||
mutators=[ImproveParrotMutator()],
|
||||
initial_organism=ParrotOrganism(prompt_template="Say {{ phrase }}"),
|
||||
)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
ap = argparse.ArgumentParser()
|
||||
register_hyperparameter_args(ap.add_argument_group("hyperparameters"))
|
||||
ap.add_argument("--num_iterations", type=int, default=3)
|
||||
ap.add_argument("--mutator_concurrency", type=int, default=4)
|
||||
ap.add_argument("--evaluator_concurrency", type=int, default=4)
|
||||
ap.add_argument("--output_dir", type=str, required=True)
|
||||
args = ap.parse_args()
|
||||
|
||||
out = Path(args.output_dir)
|
||||
out.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
hp = build_hyperparameter_config_from_args(args)
|
||||
loop = EvolveProblemLoop(
|
||||
problem=make_problem(),
|
||||
learning_log_view_type=parse_learning_log_view_type(hp.learning_log_view_type),
|
||||
num_parents_per_iteration=hp.num_parents_per_iteration,
|
||||
mutator_concurrency=args.mutator_concurrency,
|
||||
evaluator_concurrency=args.evaluator_concurrency,
|
||||
fixed_midpoint_score=hp.fixed_midpoint_score,
|
||||
midpoint_score_percentile=hp.midpoint_score_percentile,
|
||||
sharpness=hp.sharpness,
|
||||
novelty_weight=hp.novelty_weight,
|
||||
batch_size=hp.batch_size,
|
||||
should_verify_mutations=hp.verify_mutations,
|
||||
)
|
||||
|
||||
import json
|
||||
log_path = out / "results.jsonl"
|
||||
snap_dir = out / "snapshots"
|
||||
snap_dir.mkdir(exist_ok=True)
|
||||
print("Evaluating initial organism...")
|
||||
for snap in loop.run(num_iterations=args.num_iterations):
|
||||
(snap_dir / f"iteration_{snap.iteration}.pkl").write_bytes(snap.snapshot)
|
||||
_, best_eval = snap.best_organism_result
|
||||
print(f"iter={snap.iteration} pop={snap.population_size} "
|
||||
f"best_score={best_eval.score:.3f}")
|
||||
with log_path.open("a") as f:
|
||||
f.write(json.dumps({
|
||||
"iteration": snap.iteration,
|
||||
"best_score": best_eval.score,
|
||||
"pop_size": snap.population_size,
|
||||
"score_percentiles": {str(k): v for k, v in snap.score_percentiles.items()},
|
||||
}) + "\n")
|
||||
print(f"\nDone. Results in: {out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
|
|
@ -0,0 +1,69 @@
|
|||
"""
|
||||
show_snapshot.py — Dump the population from a darwinian-evolver snapshot pickle.
|
||||
|
||||
Usage:
|
||||
python show_snapshot.py PATH/TO/iteration_N.pkl [--field prompt_template]
|
||||
|
||||
The script is intentionally Organism-agnostic: it inspects `vars(org)` and prints
the first public str field it finds (skipping `id`). Pass --field to target a
specific attribute (e.g. `prompt_template`, `regex_pattern`, `sql_query`, `code_block`).
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import pickle
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
def main() -> int:
|
||||
ap = argparse.ArgumentParser()
|
||||
ap.add_argument("snapshot", type=Path)
|
||||
ap.add_argument(
|
||||
"--field",
|
||||
default=None,
|
||||
help="Organism attribute to display. Defaults to the first str field found.",
|
||||
)
|
||||
ap.add_argument("--top", type=int, default=None, help="Show only top N by score.")
|
||||
args = ap.parse_args()
|
||||
|
||||
if not args.snapshot.exists():
|
||||
sys.exit(f"snapshot not found: {args.snapshot}")
|
||||
|
||||
# The outer pickle wraps a dict; the inner pickle contains the actual organism
|
||||
# objects, which must be importable under their original dotted path. If you
|
||||
# ran a custom driver, make sure its module is on sys.path before calling this.
|
||||
outer = pickle.loads(args.snapshot.read_bytes())
|
||||
if not isinstance(outer, dict) or "population_snapshot" not in outer:
|
||||
sys.exit("not a darwinian-evolver snapshot (no population_snapshot key)")
|
||||
inner = pickle.loads(outer["population_snapshot"])
|
||||
pairs = inner["organisms"] # list of (Organism, EvaluationResult)
|
||||
|
||||
print(f"# organisms: {len(pairs)}\n")
|
||||
ranked = sorted(pairs, key=lambda p: getattr(p[1], "score", 0) or 0, reverse=True)
|
||||
if args.top:
|
||||
ranked = ranked[: args.top]
|
||||
|
||||
for i, (org, res) in enumerate(ranked):
|
||||
score = getattr(res, "score", float("nan"))
|
||||
print(f"=== rank {i} score={score:.3f} ===")
|
||||
# pick field
|
||||
field = args.field
|
||||
if field is None:
|
||||
for k, v in vars(org).items():
|
||||
if isinstance(v, str) and not k.startswith("_") and k not in ("id",):
|
||||
field = k
|
||||
break
|
||||
val = getattr(org, field, None) if field else None
|
||||
if val is None:
|
||||
print(f" (no string field; org fields: {list(vars(org).keys())})")
|
||||
else:
|
||||
print(f" {field} ({len(val)} chars):")
|
||||
for ln in val.splitlines()[:30]:
|
||||
print(f" {ln}")
|
||||
print()
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
|
|
@ -0,0 +1,240 @@
|
|||
"""
|
||||
Template: a custom darwinian-evolver problem.
|
||||
|
||||
Copy this file, fill in the THREE marked spots (Organism, Evaluator, Mutator),
|
||||
then run it as a driver script. The skeleton handles all the wiring so you only
|
||||
write the domain-specific logic.
|
||||
|
||||
To run:
|
||||
cd ~/.hermes/cache/darwinian-evolver/darwinian_evolver
|
||||
OPENROUTER_API_KEY=... uv run --with openai python /path/to/this_file.py \
|
||||
--num_iterations 3 --num_parents_per_iteration 2 \
|
||||
--output_dir /tmp/my_problem
|
||||
|
||||
The pattern mirrors `scripts/parrot_openrouter.py` (the working reference).
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
from openai import OpenAI
|
||||
|
||||
# Upstream types (AGPL — invoked via subprocess in production; importing here
|
||||
# is fine for skill-side driver scripts the user owns).
|
||||
from darwinian_evolver.cli_common import (
|
||||
build_hyperparameter_config_from_args,
|
||||
parse_learning_log_view_type,
|
||||
register_hyperparameter_args,
|
||||
)
|
||||
from darwinian_evolver.evolve_problem_loop import EvolveProblemLoop
|
||||
from darwinian_evolver.learning_log import LearningLogEntry
|
||||
from darwinian_evolver.problem import (
|
||||
EvaluationFailureCase,
|
||||
EvaluationResult,
|
||||
Evaluator,
|
||||
Mutator,
|
||||
Organism,
|
||||
Problem,
|
||||
)
|
||||
|
||||
DEFAULT_MODEL = os.environ.get("EVOLVER_MODEL", "openai/gpt-4o-mini")
|
||||
|
||||
|
||||
def _client() -> OpenAI:
|
||||
key = os.environ.get("OPENROUTER_API_KEY")
|
||||
if not key:
|
||||
sys.exit("OPENROUTER_API_KEY is not set")
|
||||
return OpenAI(api_key=key, base_url="https://openrouter.ai/api/v1")
|
||||
|
||||
|
||||
def _prompt_llm(prompt: str, max_tokens: int = 1024) -> str:
|
||||
try:
|
||||
r = _client().chat.completions.create(
|
||||
model=DEFAULT_MODEL,
|
||||
max_tokens=max_tokens,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
)
|
||||
return r.choices[0].message.content or ""
|
||||
except Exception as e:
|
||||
# Never let one bad LLM response kill the run.
|
||||
return f"<LLM_ERROR: {type(e).__name__}: {e}>"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# 1. ORGANISM — what you are evolving.
|
||||
# ---------------------------------------------------------------------------
|
||||
class MyOrganism(Organism):
|
||||
# TODO: replace with your artifact field. Common shapes:
|
||||
# prompt_template: str
|
||||
# regex_pattern: str
|
||||
# sql_query: str
|
||||
# code_block: str
|
||||
artifact: str
|
||||
|
||||
def run(self, *inputs) -> str:
|
||||
"""Exercise the organism on a test input. Return whatever your
|
||||
evaluator wants to score."""
|
||||
# TODO: implement. For prompt evolution this typically calls _prompt_llm
|
||||
# with the artifact rendered against the input. For regex/SQL it would
|
||||
# call `re.findall(self.artifact, input)` / execute SQL / etc.
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# 2. EVALUATOR — score organisms and surface failures the mutator can learn from.
|
||||
# ---------------------------------------------------------------------------
|
||||
class MyFailureCase(EvaluationFailureCase):
|
||||
# TODO: include enough context for the LLM to diagnose the failure.
|
||||
input: str
|
||||
expected: str
|
||||
actual: str
|
||||
|
||||
|
||||
class MyEvaluator(Evaluator[MyOrganism, EvaluationResult, MyFailureCase]):
|
||||
# Split your dataset. Mutator only sees trainable; holdout detects overfitting.
|
||||
TRAINABLE = [
|
||||
# TODO: list of (input, expected) tuples
|
||||
# ("input1", "expected1"),
|
||||
]
|
||||
HOLDOUT = [
|
||||
# TODO: separate set the mutator never sees
|
||||
]
|
||||
|
||||
def evaluate(self, organism: MyOrganism) -> EvaluationResult:
|
||||
train_fails: list[MyFailureCase] = []
|
||||
hold_fails: list[MyFailureCase] = []
|
||||
for i, (inp, expected) in enumerate(self.TRAINABLE):
|
||||
actual = organism.run(inp)
|
||||
if actual != expected:
|
||||
train_fails.append(MyFailureCase(
|
||||
input=inp, expected=expected, actual=actual,
|
||||
data_point_id=f"trainable_{i}",
|
||||
))
|
||||
for i, (inp, expected) in enumerate(self.HOLDOUT):
|
||||
actual = organism.run(inp)
|
||||
if actual != expected:
|
||||
hold_fails.append(MyFailureCase(
|
||||
input=inp, expected=expected, actual=actual,
|
||||
data_point_id=f"holdout_{i}",
|
||||
))
|
||||
n_total = len(self.TRAINABLE) + len(self.HOLDOUT)
|
||||
n_ok = n_total - len(train_fails) - len(hold_fails)
|
||||
return EvaluationResult(
|
||||
score=n_ok / n_total if n_total else 0.0,
|
||||
trainable_failure_cases=train_fails,
|
||||
holdout_failure_cases=hold_fails,
|
||||
# Always-viable. The evolver only blocks completely-broken organisms;
|
||||
# a 0-score organism is fine and will simply be sampled less often.
|
||||
is_viable=True,
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# 3. MUTATOR — LLM proposes an improved organism from a failure case.
|
||||
# ---------------------------------------------------------------------------
|
||||
class MyMutator(Mutator[MyOrganism, MyFailureCase]):
|
||||
PROMPT = """
|
||||
The current artifact is:
|
||||
```
|
||||
{artifact}
|
||||
```
|
||||
|
||||
On this input:
|
||||
```
|
||||
{input}
|
||||
```
|
||||
it produced:
|
||||
```
|
||||
{actual}
|
||||
```
|
||||
but we wanted:
|
||||
```
|
||||
{expected}
|
||||
```
|
||||
|
||||
Diagnose what went wrong, then propose an improved version of the artifact.
|
||||
Put the new version in the LAST triple-backtick block of your response.
|
||||
""".strip()
|
||||
|
||||
def mutate(
|
||||
self,
|
||||
organism: MyOrganism,
|
||||
failure_cases: list[MyFailureCase],
|
||||
learning_log_entries: list[LearningLogEntry],
|
||||
) -> list[MyOrganism]:
|
||||
fc = failure_cases[0]
|
||||
prompt = self.PROMPT.format(
|
||||
artifact=organism.artifact,
|
||||
input=fc.input,
|
||||
actual=fc.actual,
|
||||
expected=fc.expected,
|
||||
)
|
||||
resp = _prompt_llm(prompt)
|
||||
parts = resp.split("```")
|
||||
if len(parts) < 3:
|
||||
return []
|
||||
new_artifact = parts[-2].strip()
|
||||
# Strip an opening language tag like "python\n" or "sql\n"
|
||||
if "\n" in new_artifact:
|
||||
first_line, rest = new_artifact.split("\n", 1)
|
||||
if first_line and not first_line.startswith(" ") and len(first_line) < 20:
|
||||
new_artifact = rest
|
||||
return [MyOrganism(artifact=new_artifact)]
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Driver — fills in the EvolveProblemLoop boilerplate. You shouldn't need to
|
||||
# touch anything below this line for a typical run.
|
||||
# ---------------------------------------------------------------------------
|
||||
def make_problem() -> Problem:
    initial = MyOrganism(artifact="TODO: starting artifact here")  # TODO
    return Problem[MyOrganism, EvaluationResult, MyFailureCase](
        evaluator=MyEvaluator(),
        mutators=[MyMutator()],
        initial_organism=initial,
    )
|
||||
|
||||
|
||||
def main() -> int:
    ap = argparse.ArgumentParser()
    register_hyperparameter_args(ap.add_argument_group("hyperparameters"))
    ap.add_argument("--num_iterations", type=int, default=3)
    ap.add_argument("--mutator_concurrency", type=int, default=2)
    ap.add_argument("--evaluator_concurrency", type=int, default=2)
    ap.add_argument("--output_dir", type=str, required=True)
    args = ap.parse_args()

    out = Path(args.output_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "snapshots").mkdir(exist_ok=True)

    hp = build_hyperparameter_config_from_args(args)
    loop = EvolveProblemLoop(
        problem=make_problem(),
        learning_log_view_type=parse_learning_log_view_type(hp.learning_log_view_type),
        num_parents_per_iteration=hp.num_parents_per_iteration,
        mutator_concurrency=args.mutator_concurrency,
        evaluator_concurrency=args.evaluator_concurrency,
        fixed_midpoint_score=hp.fixed_midpoint_score,
        midpoint_score_percentile=hp.midpoint_score_percentile,
        sharpness=hp.sharpness,
        novelty_weight=hp.novelty_weight,
        batch_size=hp.batch_size,
        should_verify_mutations=hp.verify_mutations,
    )

    print("Evaluating initial organism...")
    for snap in loop.run(num_iterations=args.num_iterations):
        (out / "snapshots" / f"iteration_{snap.iteration}.pkl").write_bytes(snap.snapshot)
        _, best = snap.best_organism_result
        print(f"iter={snap.iteration} pop={snap.population_size} best_score={best.score:.3f}")

    print(f"\nDone. Results in: {out}")
    return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
    sys.exit(main())
|
||||
277
optional-skills/research/osint-investigation/SKILL.md
Normal file
|
|
@ -0,0 +1,277 @@
|
|||
---
|
||||
name: osint-investigation
|
||||
description: Public-records OSINT investigation framework — SEC EDGAR filings, USAspending contracts, Senate lobbying, OFAC sanctions, ICIJ offshore leaks, NYC property records (ACRIS), OpenCorporates registries, CourtListener court records, Wayback Machine archives, Wikipedia + Wikidata, GDELT news monitoring. Entity resolution across sources, cross-link analysis, timing correlation, evidence chains. Python stdlib only.
|
||||
version: 0.1.0
|
||||
platforms: [linux, macos, windows]
|
||||
author: Hermes Agent (adapted from ShinMegamiBoson/OpenPlanter, MIT)
|
||||
metadata:
  hermes:
    tags: [osint, investigation, public-records, sec, sanctions, corporate-registry, property, courts, due-diligence, journalism]
    category: research
    related_skills: [domain-intel, arxiv]
|
||||
---
|
||||
|
||||
# OSINT Investigation — Public Records Cross-Reference
|
||||
|
||||
Investigative framework for public-records OSINT: government contracts,
|
||||
corporate filings, lobbying, sanctions, offshore leaks, property records,
|
||||
court records, web archives, knowledge bases, and global news. Resolve
|
||||
entities across heterogeneous sources, build cross-links with explicit
|
||||
confidence, run statistical timing tests, and produce structured evidence
|
||||
chains.
|
||||
|
||||
**Python stdlib only.** Zero install. Works on Linux, macOS, Windows. Most
|
||||
sources work with no API key (OpenCorporates has an optional free token
|
||||
that raises rate limits).
|
||||
|
||||
Adapted from the MIT-licensed ShinMegamiBoson/OpenPlanter project; expanded
|
||||
to cover identity / property / litigation / archives / news sources that
|
||||
the original didn't address.
|
||||
|
||||
## When to use this skill
|
||||
|
||||
Use when the user asks for:
|
||||
|
||||
- "follow the money" — government contracts, lobbying → legislation, sanctions
|
||||
- corporate due diligence — who controls company X, where are they
|
||||
incorporated, who serves on their boards, what filings have they made
|
||||
- sanctions screening — is entity X on OFAC SDN, ICIJ offshore leaks
|
||||
- pay-to-play investigation — contractors with offshore ties, lobbying
|
||||
clients winning awards
|
||||
- property ownership — find recorded deeds/mortgages by name or address
|
||||
(NYC; for other counties point users at the relevant recorder)
|
||||
- litigation history — find federal + state court opinions and PACER dockets
|
||||
- multi-source entity resolution where naming varies (LLC suffixes, abbreviations)
|
||||
- evidence-chain construction with explicit confidence levels
|
||||
- "what's been said about X" — international news (GDELT) + Wikipedia
|
||||
narrative + Wayback Machine to recover dead URLs
|
||||
|
||||
Do NOT use this skill for:
|
||||
|
||||
- general web research → `web_search` / `web_extract`
|
||||
- domain/infrastructure OSINT → `domain-intel` skill
|
||||
- academic literature → `arxiv` skill
|
||||
- social-media profile discovery → `sherlock` skill (optional)
|
||||
- US **federal** campaign finance — FEC is intentionally NOT covered here
|
||||
(the API is unreliable for ad-hoc contributor-name queries on the free
|
||||
DEMO_KEY tier). For federal donations, point users at
|
||||
https://www.fec.gov/data/ directly.
|
||||
|
||||
## Workflow
|
||||
|
||||
The agent runs scripts via the `terminal` tool. `SKILL_DIR` is the directory
|
||||
holding this SKILL.md.
|
||||
|
||||
### 1. Identify which sources apply
|
||||
|
||||
Read the data-source wiki entries to plan the investigation:
|
||||
|
||||
```
|
||||
ls SKILL_DIR/references/sources/
|
||||
|
||||
# Federal financial / regulatory
|
||||
cat SKILL_DIR/references/sources/sec-edgar.md # corporate filings
|
||||
cat SKILL_DIR/references/sources/usaspending.md # federal contracts
|
||||
cat SKILL_DIR/references/sources/senate-ld.md # lobbying
|
||||
cat SKILL_DIR/references/sources/ofac-sdn.md # sanctions
|
||||
cat SKILL_DIR/references/sources/icij-offshore.md # offshore leaks
|
||||
|
||||
# Identity / property / litigation / archives / news
|
||||
cat SKILL_DIR/references/sources/nyc-acris.md # NYC property records
|
||||
cat SKILL_DIR/references/sources/opencorporates.md # global corporate registry
|
||||
cat SKILL_DIR/references/sources/courtlistener.md # court records (federal + state)
|
||||
cat SKILL_DIR/references/sources/wayback.md # Wayback Machine archives
|
||||
cat SKILL_DIR/references/sources/wikipedia.md # Wikipedia + Wikidata
|
||||
cat SKILL_DIR/references/sources/gdelt.md # global news monitoring
|
||||
```
|
||||
|
||||
Each entry follows a 9-section template: summary, access methods, data schema,
coverage, cross-reference potential, data quality, acquisition script, legal
and licensing, references.

The **Cross-Reference Potential** section maps join keys between sources. Read
those sections first to pick the right pair.
|
||||
|
||||
### 2. Acquire data
|
||||
|
||||
Each source has a stdlib-only fetch script in `SKILL_DIR/scripts/`:
|
||||
|
||||
**Federal financial / regulatory**
|
||||
|
||||
```bash
|
||||
# SEC EDGAR filings (corporate disclosures)
|
||||
python3 SKILL_DIR/scripts/fetch_sec_edgar.py --cik 0000320193 \
|
||||
--types 10-K,10-Q --out data/edgar_filings.csv
|
||||
|
||||
# USAspending federal contracts
|
||||
python3 SKILL_DIR/scripts/fetch_usaspending.py --recipient "EXAMPLE CORP" \
|
||||
--fy 2024 --out data/contracts.csv
|
||||
|
||||
# Senate LD-1 / LD-2 lobbying disclosures
|
||||
python3 SKILL_DIR/scripts/fetch_senate_ld.py --client "EXAMPLE CORP" \
|
||||
--year 2024 --out data/lobbying.csv
|
||||
|
||||
# OFAC SDN sanctions list (full snapshot)
|
||||
python3 SKILL_DIR/scripts/fetch_ofac_sdn.py --out data/ofac_sdn.csv
|
||||
|
||||
# ICIJ Offshore Leaks — downloads ~70 MB bulk CSV on first use,
|
||||
# then searches it locally. Cached for 30 days under
|
||||
# $HERMES_OSINT_CACHE/icij/ (default: ~/.cache/hermes-osint/icij/).
|
||||
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
|
||||
--out data/icij.csv
|
||||
```
|
||||
|
||||
**Identity / property / litigation / archives / news**
|
||||
|
||||
```bash
|
||||
# NYC property records (deeds, mortgages, liens) — ACRIS via Socrata
|
||||
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --name "SMITH, JOHN" \
|
||||
--out data/acris.csv
|
||||
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --address "571 HUDSON" \
|
||||
--out data/acris_addr.csv
|
||||
|
||||
# OpenCorporates — 130+ jurisdiction corporate registry
|
||||
# (free token required; set OPENCORPORATES_API_TOKEN or pass --token)
|
||||
python3 SKILL_DIR/scripts/fetch_opencorporates.py --query "Example Corp" \
|
||||
--jurisdiction us_ny --out data/opencorporates.csv
|
||||
|
||||
# CourtListener — federal + state court opinions, PACER dockets
|
||||
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Smith v. Example Corp" \
|
||||
--type opinions --out data/courts.csv
|
||||
|
||||
# Wayback Machine — historical web captures
|
||||
python3 SKILL_DIR/scripts/fetch_wayback.py --url "example.com" \
|
||||
--match host --collapse digest --out data/wayback.csv
|
||||
|
||||
# Wikipedia + Wikidata — narrative bio + structured facts
|
||||
# Set HERMES_OSINT_UA=your-app/1.0 (your@email) to identify yourself
|
||||
python3 SKILL_DIR/scripts/fetch_wikipedia.py --query "Bill Gates" \
|
||||
--out data/wp.csv
|
||||
|
||||
# GDELT — global news in 100+ languages, ~2015→present
|
||||
python3 SKILL_DIR/scripts/fetch_gdelt.py --query '"Example Corp"' \
|
||||
--timespan 1y --out data/gdelt.csv
|
||||
```
|
||||
|
||||
All outputs are normalized CSV files with a header row. The scripts are idempotent and safe to re-run.
|
||||
|
||||
When the subject should not be expected in a source (for example, a person
with no public-company role in SEC EDGAR, someone who isn't a federal
contractor in USAspending, or someone who isn't a lobbying client in the Senate
LDA data), the script returns 0 rows with a clear warning rather than silently
writing an empty CSV. EDGAR specifically flags when the company-name resolver
matched an individual Form 3/4/5 filer rather than a corporate registrant.
|
||||
|
||||
Rate-limit notes are in each source's wiki entry. By default, the fetchers
sleep politely between paginated requests. **Credentials raise rate limits**
for sources that support them (`SEC_USER_AGENT`, `SENATE_LDA_TOKEN`,
`OPENCORPORATES_API_TOKEN`, `COURTLISTENER_TOKEN`). Scripts surface 429
responses together with the upstream's quota message so the user knows to
slow down or supply a key (the GDELT fetcher first retries after a short
sleep, as described in its wiki entry).
|
||||
|
||||
### 3. Resolve entities across sources
|
||||
|
||||
Normalize names and find matches between two CSV files:
|
||||
|
||||
```bash
|
||||
# Match lobbying clients (Senate LDA) against contract recipients (USAspending)
|
||||
python3 SKILL_DIR/scripts/entity_resolution.py \
|
||||
--left data/lobbying.csv --left-name-col client_name \
|
||||
--right data/contracts.csv --right-name-col recipient_name \
|
||||
--out data/cross_links.csv
|
||||
```
|
||||
|
||||
Three matching tiers with explicit confidence:
|
||||
|
||||
| Tier | Method | Confidence |
|
||||
|------|--------|------------|
|
||||
| `exact` | Normalized strings equal after suffix/punctuation strip | high |
|
||||
| `fuzzy` | Sorted-token equality (word-bag match) | medium |
|
||||
| `token_overlap` | ≥60% token overlap, ≥2 shared tokens, tokens ≥4 chars | low |
|
||||
|
||||
Output `cross_links.csv` columns: `match_type, confidence, left_name,
|
||||
right_name, left_normalized, right_normalized, left_row, right_row`.
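
The three tiers can be sketched in a few lines of stdlib Python. This is a minimal illustration, not the code in `entity_resolution.py`: the suffix list, the function names, and the choice to measure overlap against the smaller token set are all assumptions made for the example.

```python
import re

# Hypothetical suffix list; the real script's list is not shown in this document.
SUFFIXES = {"llc", "inc", "corp", "corporation", "co", "ltd", "lp", "llp"}


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and strip trailing corporate suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def match_tier(left: str, right: str):
    """Return (match_type, confidence), or None when no tier applies."""
    a, b = normalize(left), normalize(right)
    if not a or not b:
        return None
    if a == b:
        return ("exact", "high")
    if sorted(a.split()) == sorted(b.split()):
        return ("fuzzy", "medium")
    # Only tokens of 4+ characters count toward the overlap tier.
    ta = {t for t in a.split() if len(t) >= 4}
    tb = {t for t in b.split() if len(t) >= 4}
    shared = ta & tb
    # Assumption: overlap is measured against the smaller token set.
    if len(shared) >= 2 and len(shared) / min(len(ta), len(tb)) >= 0.6:
        return ("token_overlap", "low")
    return None
```

Whatever the exact thresholds, the tier name and confidence label should be carried into every downstream finding so a reader can see how strong the link is.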
|
||||
|
||||
### 4. Statistical timing correlation (optional)
|
||||
|
||||
Test whether two time series cluster suspiciously close together — e.g.
|
||||
lobbying filings near contract awards — using a permutation test:
|
||||
|
||||
```bash
|
||||
python3 SKILL_DIR/scripts/timing_analysis.py \
|
||||
--donations data/lobbying.csv --donation-date-col filing_date \
|
||||
--donation-amount-col income --donation-donor-col client_name \
|
||||
--donation-recipient-col registrant_name \
|
||||
--contracts data/contracts.csv --contract-date-col award_date \
|
||||
--contract-vendor-col recipient_name \
|
||||
--cross-links data/cross_links.csv \
|
||||
--permutations 1000 \
|
||||
--out data/timing.json
|
||||
```
|
||||
|
||||
The column flags let you map any columns onto the script's roles. The flag
names come from the original tool, which compared donations against awards,
but the script works for any (event, payee) time series joined through
cross-links. Null hypothesis: event timing is independent of award dates.
The one-tailed p-value is the fraction of permutations whose mean
nearest-award distance is less than or equal to the observed one. A (payer,
vendor) pair needs at least 3 events for the test to run.
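
The test statistic and p-value described above can be sketched as follows. This is an illustration under stated assumptions, not the logic of `timing_analysis.py`: in particular, the null distribution here redraws event dates uniformly over the observed date window, and the function names are hypothetical.

```python
import random
from datetime import date


def mean_nearest_distance(events, awards):
    """Mean distance in days from each event to its nearest award."""
    return sum(min(abs(e - a) for a in awards) for e in events) / len(events)


def permutation_p_value(event_dates, award_dates, permutations=1000, seed=None):
    """One-tailed p-value: share of random draws at least as tight as observed."""
    if len(event_dates) < 3 or not award_dates:
        return None  # below the minimum of 3 events per pair
    events = [d.toordinal() for d in event_dates]
    awards = [d.toordinal() for d in award_dates]
    observed = mean_nearest_distance(events, awards)
    first, last = min(events + awards), max(events + awards)
    rng = random.Random(seed)  # a fixed seed makes the run reproducible
    hits = 0
    for _ in range(permutations):
        # Assumption: under the null, event dates fall uniformly in the window.
        redrawn = [rng.randint(first, last) for _ in events]
        if mean_nearest_distance(redrawn, awards) <= observed:
            hits += 1
    return hits / permutations
```

A small p-value here only says the clustering is unlikely under this particular null model; a different null (for example, shuffling within fiscal quarters) could give a different answer.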
|
||||
|
||||
### 5. Build the findings JSON (evidence chain)
|
||||
|
||||
```bash
|
||||
python3 SKILL_DIR/scripts/build_findings.py \
|
||||
--cross-links data/cross_links.csv \
|
||||
--timing data/timing.json \
|
||||
--out data/findings.json
|
||||
```
|
||||
|
||||
Every finding has `id, title, severity, confidence, summary, evidence[], sources[]`.
|
||||
Each evidence item points back to a specific row in a source CSV. The user (or a
|
||||
follow-up agent) can verify every claim against its source.
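
For orientation, one finding might be assembled like this. The top-level keys are the ones listed above; the keys inside each evidence item (`file`, `row`), the severity label, and the helper name are assumptions for illustration, since the output of `build_findings.py` is not shown here.

```python
import json


def make_finding(finding_id, title, severity, confidence, summary, evidence, sources):
    """Assemble one finding; reject findings that cite no evidence."""
    if not evidence:
        raise ValueError("every claim must trace to a record")
    return {
        "id": finding_id,
        "title": title,
        "severity": severity,
        "confidence": confidence,
        "summary": summary,
        "evidence": list(evidence),
        "sources": list(sources),
    }


# Hypothetical evidence items: each points at one row of a source CSV.
finding = make_finding(
    "F001",
    "Lobbying client also received federal contracts",
    "medium",
    "high",
    "EXAMPLE CORP appears in both datasets (exact name match).",
    evidence=[
        {"file": "data/lobbying.csv", "row": 12},
        {"file": "data/contracts.csv", "row": 87},
    ],
    sources=["Senate LDA", "USAspending"],
)
print(json.dumps(finding, indent=2))
```

Refusing to build a finding without evidence keeps the "no naked assertions" rule enforceable in code rather than by convention.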
|
||||
|
||||
## Confidence and evidence discipline
|
||||
|
||||
This is the load-bearing rule of the skill. Tell the user:
|
||||
|
||||
- Every claim must trace to a record. No naked assertions.
|
||||
- Confidence tier travels with the claim. `match_type=fuzzy` is "probable",
|
||||
not "confirmed."
|
||||
- Entity resolution produces candidates, NOT conclusions. A `fuzzy` match
|
||||
between "ACME LLC" and "Acme Holdings Group" is a lead, not a fact.
|
||||
- Statistical significance ≠ wrongdoing. p < 0.05 means the timing pattern
|
||||
is unlikely under the null. It does not establish corruption.
|
||||
- All data sources here are public records. They may still contain
|
||||
inaccuracies, stale info, or redactions (GDPR, sealed records).
|
||||
|
||||
## Adding a new data source
|
||||
|
||||
Use the template:
|
||||
|
||||
```bash
|
||||
cp SKILL_DIR/templates/source-template.md \
|
||||
SKILL_DIR/references/sources/<your-source>.md
|
||||
```
|
||||
|
||||
Fill in all 9 sections. Write a `fetch_<source>.py` script in `scripts/` that
|
||||
uses stdlib only and writes a normalized CSV. Update the source list in the
|
||||
"When to use" section above.
|
||||
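
A new fetch script mostly needs a CSV writer that keeps a fixed header and warns when nothing matched. The sketch below shows one way to do that with the stdlib; the function name is hypothetical, and writing a header-only file on zero rows is an assumption about the desired behavior.

```python
import csv
import sys


def write_normalized_csv(path, fieldnames, rows):
    """Write rows under a fixed header; warn on stderr when nothing matched."""
    rows = list(rows)
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            # Keep only the declared columns and fill gaps with empty strings.
            writer.writerow({key: row.get(key, "") for key in fieldnames})
    if not rows:
        print(f"warning: 0 rows matched; {path} holds only the header", file=sys.stderr)
    return len(rows)
```

Because the file is opened in write mode, running the same fetch twice replaces the earlier output instead of appending to it.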
|
||||
## Tools and their limits
|
||||
|
||||
- `entity_resolution.py` does NOT use external fuzzy libraries (no rapidfuzz,
|
||||
no jellyfish). Token-bag matching is the upper bound here. If you need
|
||||
Levenshtein, transliteration, or phonetic matching, pip-install separately.
|
||||
- `timing_analysis.py` uses Python's `random` for permutations. For
|
||||
reproducibility, pass `--seed N`.
|
||||
- `fetch_*.py` scripts use `urllib.request` and respect `Retry-After`. Heavy
|
||||
bulk usage may still violate ToS — read each source's legal section first.
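
As a sketch of what respecting `Retry-After` involves: the header may carry either a number of seconds or an HTTP date, so a fetcher has to handle both. The helper below is illustrative and its name is an assumption; it is not taken from the `fetch_*.py` scripts.

```python
import email.utils
import time


def retry_after_seconds(header_value, now=None):
    """Convert a Retry-After header (delta-seconds or HTTP-date) to seconds."""
    if header_value is None:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = email.utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header: let the caller pick a default wait
    now = time.time() if now is None else now
    return max(0.0, when.timestamp() - now)
```

A caller would typically sleep for the returned number of seconds before retrying, and fall back to its own default when the result is `None`.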
|
||||
|
||||
## Legal note
|
||||
|
||||
All Phase-1 sources are public records. Bulk acquisition is permitted under
|
||||
their respective access terms (FOIA, public records law, ICIJ explicit
|
||||
publication, OFAC public data). However:
|
||||
|
||||
- Some sources rate-limit aggressively. Respect their headers.
|
||||
- Some redact registrant info (GDPR on WHOIS, sealed filings).
|
||||
- Cross-referencing public records to identify private individuals can have
|
||||
ethical implications. The skill produces evidence chains, not accusations.
|
||||
|
|
@ -0,0 +1,98 @@
|
|||
# CourtListener — Free Law Project
|
||||
|
||||
## 1. Summary
|
||||
|
||||
CourtListener (Free Law Project) aggregates court opinions, dockets, oral
|
||||
arguments, and judge data. Covers ~10M federal and state court opinions
|
||||
back to colonial America, plus PACER docket data from RECAP submissions.
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **REST API v4:** `https://www.courtlistener.com/api/rest/v4/`
|
||||
- **Auth:** Anonymous reads allowed on most endpoints; token raises rate
|
||||
limits and unlocks bulk export
|
||||
- **Rate limit:** ~5,000 req/hour unauthenticated for search; higher with token
|
||||
|
||||
Set the `COURTLISTENER_TOKEN` environment variable. To get a free token, sign
in at https://www.courtlistener.com/sign-in/ and create an API key.
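
The sketch below shows how a stdlib client might attach the token. It only builds the request and does not send it. The `search/` path, the `q` parameter, and the `Authorization: Token ...` header format are assumptions to check against the API docs in section 9, and `build_search_request` is a hypothetical helper rather than a function from `fetch_courtlistener.py`.

```python
import os
import urllib.parse
import urllib.request

API_ROOT = "https://www.courtlistener.com/api/rest/v4/"


def build_search_request(query, token=None):
    """Build (without sending) a search request, adding the token if present."""
    token = token or os.environ.get("COURTLISTENER_TOKEN")
    url = API_ROOT + "search/?" + urllib.parse.urlencode({"q": query})
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    if token:
        # Assumed header format; anonymous requests simply omit it.
        req.add_header("Authorization", f"Token {token}")
    return req
```

Passing the request to `urllib.request.urlopen` would perform the call; keeping construction separate makes the auth logic easy to test offline.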
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_courtlistener.py`:
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `case_name` | str | Case name |
|
||||
| `court` | str | Court name |
|
||||
| `court_id` | str | Court ID (e.g. `nysd`, `scotus`, `ca9`) |
|
||||
| `date_filed` | str | YYYY-MM-DD |
|
||||
| `docket_number` | str | Court docket number |
|
||||
| `judge` | str | Judge name(s) |
|
||||
| `citation` | str | Reporter citation(s) |
|
||||
| `result_type` | str | opinions / dockets / oral / people |
|
||||
| `snippet` | str | Search-match snippet (up to 500 chars) |
|
||||
| `absolute_url` | str | Direct CourtListener URL |
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- Federal: all circuit and district courts, SCOTUS
|
||||
- State: all 50 state supreme/appellate courts, many trial courts
|
||||
- Opinions: ~10M back to 1600s (colonial), full coverage 1950 → present
|
||||
- Dockets via RECAP: ~3M+ from user-submitted PACER PDFs
|
||||
- Updated continuously
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **OpenCorporates** ↔ `case_name` (corporate litigation)
|
||||
- **SEC EDGAR** ↔ `case_name` (securities class actions)
|
||||
- **OFAC SDN** ↔ `case_name` (sanctions-related civil/criminal cases)
|
||||
|
||||
Join key: party name from `case_name`. Note: `case_name` often abbreviates
|
||||
("Smith v. Jones" rather than full party names) — use the full case URL
|
||||
to get all parties.
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- Older opinions (pre-1990) often lack docket numbers and judges
|
||||
- State coverage is more uneven than federal
|
||||
- PACER docket coverage depends on RECAP user submissions — not exhaustive
|
||||
- Sealed documents are excluded
|
||||
- Party names in case captions don't always match filing names exactly
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_courtlistener.py`
|
||||
|
||||
```bash
|
||||
# Search opinions for a party / keyword
|
||||
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Example Corp" \
|
||||
--out data/cl.csv
|
||||
|
||||
# PACER dockets (best for recent litigation)
|
||||
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Example Corp" \
|
||||
--type dockets --out data/cl_dockets.csv
|
||||
|
||||
# Restrict to a court
|
||||
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Microsoft" \
|
||||
--court ca9 --out data/cl_9th.csv
|
||||
|
||||
# Date range
|
||||
python3 SKILL_DIR/scripts/fetch_courtlistener.py --query "Example Corp" \
|
||||
--date-from 2020-01-01 --date-to 2024-12-31 --out data/cl.csv
|
||||
```
|
||||
|
||||
Pass `--token` or set `COURTLISTENER_TOKEN`.
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- Court opinions are public domain
|
||||
- Free Law Project provides the data under CC0 / public domain dedication
|
||||
- No commercial use restrictions on opinion text or metadata
|
||||
- Some PACER PDFs have copyright on layout (not text) — fair use applies
|
||||
|
||||
## 9. References
|
||||
|
||||
- API docs: https://www.courtlistener.com/help/api/rest/
|
||||
- Court IDs: https://www.courtlistener.com/api/jurisdictions/
|
||||
- RECAP archive: https://www.courtlistener.com/recap/
|
||||
- Bulk data: https://www.courtlistener.com/help/api/bulk-data/
|
||||
|
|
@ -0,0 +1,104 @@
|
|||
# GDELT — Global News Monitoring
|
||||
|
||||
## 1. Summary
|
||||
|
||||
GDELT (Global Database of Events, Language, and Tone) monitors world news
|
||||
in 100+ languages with full-text indexing. Updated every 15 minutes.
|
||||
~2015 → present, ~1B+ articles indexed. Free anonymous access.
|
||||
|
||||
GDELT is wider than Google News (more international, more long-tail
sources) and is indexed by tone/sentiment, themes (GKG theme codes such as
`ECON_BANKRUPTCY`), people, and organizations.
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **DOC 2.0 API:** `https://api.gdeltproject.org/api/v2/doc/doc`
|
||||
- **Events / GKG 2.0:** `https://api.gdeltproject.org/api/v2/events/events`
|
||||
- **Auth:** None
|
||||
- **Rate limit:** **1 request per 5 seconds** for the DOC API — strict
|
||||
|
||||
The fetch script automatically retries after a 6-second sleep when a
|
||||
429 is received.
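
The retry behavior can be sketched as a small wrapper. The 6-second wait comes from the text above; the retry count, the function name, and the injectable `fetch` and `sleep` callables are assumptions made so the example can run without network access.

```python
import time
import urllib.error


def fetch_with_429_retry(fetch, url, retries=3, wait_seconds=6.0, sleep=time.sleep):
    """Call fetch(url); on HTTP 429, sleep and retry up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a rate limit, or the final 429.
            if err.code != 429 or attempt == retries:
                raise
            sleep(wait_seconds)
```

In real use `fetch` could be a function that calls `urllib.request.urlopen(url)` and returns the decoded body.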
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_gdelt.py`:
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `title` | str | Article title |
|
||||
| `url` | str | Article URL |
|
||||
| `seen_date` | str | When GDELT first saw the article (UTC) |
|
||||
| `domain` | str | Publisher domain |
|
||||
| `language` | str | Source language |
|
||||
| `source_country` | str | 2-letter country code |
|
||||
| `tone` | str | GDELT-computed tone score (negative = negative coverage) |
|
||||
| `social_image` | str | Open Graph image URL when available |
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- Worldwide news in 100+ languages
|
||||
- ~2015 → present (Events back to 1979 via a separate stream)
|
||||
- Update frequency: 15 minutes
|
||||
- Bias: heavily Anglophone in volume but very wide source list overall
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **All sources** ↔ `title` / `url` (news context for any subject)
|
||||
- **Wikipedia** ↔ event timeline for notable entities
|
||||
- **Wayback Machine** ↔ recover articles whose URLs have died
|
||||
- **OFAC SDN** ↔ news context for sanctions designations
|
||||
- **SEC EDGAR** ↔ news context for 8-K material events
|
||||
|
||||
Join key: entity name appearing in article title or full-text. GDELT also
|
||||
extracts named entities into a separate stream (GKG) not exposed by this
|
||||
fetcher — query GDELT directly for entity-level filtering.
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- Title extraction is automated and can be wrong (sometimes captures the
|
||||
site name + delimiter + article title; sometimes a generic page title)
|
||||
- Sentiment / tone is computed by GDELT, not source-supplied
|
||||
- Some domains are oversampled (newswires, aggregators)
|
||||
- Source country is inferred from domain registration / TLD — can be
|
||||
wrong for international news sites with country-neutral domains
|
||||
- Article URLs can rot — pair with Wayback Machine to preserve content
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_gdelt.py`
|
||||
|
||||
```bash
|
||||
# Recent news mentioning an entity
|
||||
python3 SKILL_DIR/scripts/fetch_gdelt.py --query "Nous Research" \
|
||||
--timespan 6m --out data/gdelt.csv
|
||||
|
||||
# Phrase-exact (use double quotes inside single quotes for the shell)
|
||||
python3 SKILL_DIR/scripts/fetch_gdelt.py --query '"Dillon Rolnick"' \
|
||||
--timespan 1y --out data/gdelt.csv
|
||||
|
||||
# Filter to a country / language
|
||||
python3 SKILL_DIR/scripts/fetch_gdelt.py --query "Microsoft" \
|
||||
--source-country US --source-lang English --out data/gdelt.csv
|
||||
|
||||
# Date range
|
||||
python3 SKILL_DIR/scripts/fetch_gdelt.py --query "Microsoft" \
|
||||
--start 2024-01-01 --end 2024-12-31 --out data/gdelt.csv
|
||||
```
|
||||
|
||||
GDELT supports its own query operators: phrase quoting, AND/OR/NOT,
|
||||
`sourcecountry:US`, `theme:ECON_BANKRUPTCY`, `tone<-5`, etc.
|
||||
See https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/ for syntax.
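
To show how these operators end up in a request, here is a sketch that composes a DOC API URL. The `mode`, `format`, `timespan`, and `maxrecords` parameters are written as I understand the DOC 2.0 API and should be checked against the linked syntax post; `build_doc_url` is a hypothetical helper, not part of `fetch_gdelt.py`.

```python
import urllib.parse

DOC_API = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_doc_url(terms, source_country=None, timespan="1y", max_records=75):
    """Compose a DOC API URL, quoting multi-word terms as exact phrases."""
    parts = [f'"{term}"' if " " in term else term for term in terms]
    if source_country:
        parts.append(f"sourcecountry:{source_country}")
    params = {
        "query": " ".join(parts),
        "mode": "ArtList",
        "format": "json",
        "timespan": timespan,
        "maxrecords": max_records,
    }
    return DOC_API + "?" + urllib.parse.urlencode(params)
```

Building the query string in one place keeps the shell-quoting concerns from the examples above out of the Python side entirely.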
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- GDELT data is provided free for academic and journalistic use
|
||||
- Article URLs link out to original publishers — copyright remains with
|
||||
the publisher
|
||||
- GDELT is NOT a content archive; it's a metadata index
|
||||
|
||||
## 9. References
|
||||
|
||||
- DOC 2.0 API: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/
|
||||
- Themes & query syntax: https://blog.gdeltproject.org/gkg-2-0-our-global-knowledge-graph-2-0-amazing-data-at-your-fingertips/
|
||||
- Project home: https://www.gdeltproject.org/
|
||||
|
|
@ -0,0 +1,104 @@
|
|||
# ICIJ Offshore Leaks Database
|
||||
|
||||
## 1. Summary
|
||||
|
||||
The International Consortium of Investigative Journalists (ICIJ) publishes a
combined database of offshore entities from the Panama Papers, Paradise Papers,
Pandora Papers, Bahamas Leaks, and Offshore Leaks. It holds more than 800,000
offshore entities along with their officers, intermediaries, and addresses.
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **Bulk download (primary):** `https://offshoreleaks-data.icij.org/offshoreleaks/csv/full-oldb.LATEST.zip` (~70 MB ZIP, refreshed periodically)
|
||||
- **Search UI (human):** `https://offshoreleaks.icij.org/`
|
||||
- **Auth:** None
|
||||
- **Note:** ICIJ has removed the former OpenRefine reconciliation endpoint at
  `/reconcile`, which now returns 404. The bulk ZIP is the remaining stable
  access path. The skill's `fetch_icij_offshore.py` caches the ZIP locally
  (default `~/.cache/hermes-osint/icij/`, refreshed after 30 days) and
  searches it offline.
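
The cache rule (reuse for 30 days, honor `$HERMES_OSINT_CACHE`, allow a forced refresh) can be sketched as below. The helper names are hypothetical, and using the file's modification time as the age signal is an assumption about how `fetch_icij_offshore.py` decides.

```python
import os
import time
from pathlib import Path

MAX_AGE_SECONDS = 30 * 24 * 60 * 60  # the 30-day refresh window


def cache_dir() -> Path:
    """Resolve the ICIJ cache directory, honoring $HERMES_OSINT_CACHE."""
    root = os.environ.get("HERMES_OSINT_CACHE")
    base = Path(root) if root else Path.home() / ".cache" / "hermes-osint"
    return base / "icij"


def needs_download(zip_path: Path, force_refresh=False, now=None) -> bool:
    """True when the ZIP is missing, older than the window, or a refresh is forced."""
    if force_refresh or not zip_path.exists():
        return True
    now = time.time() if now is None else now
    return now - zip_path.stat().st_mtime > MAX_AGE_SECONDS
```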
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_icij_offshore.py`:
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `node_id` | int | ICIJ canonical node ID |
|
||||
| `name` | str | Entity / officer / intermediary name |
|
||||
| `node_type` | str | entity / officer / intermediary / address |
|
||||
| `country_codes` | str | Semicolon-separated ISO codes |
|
||||
| `countries` | str | Country names |
|
||||
| `jurisdiction` | str | Offshore jurisdiction (BVI, Panama, etc.) |
|
||||
| `incorporation_date` | str | YYYY-MM-DD |
|
||||
| `inactivation_date` | str | YYYY-MM-DD (if struck) |
|
||||
| `source` | str | Panama Papers / Paradise Papers / Pandora Papers / etc. |
|
||||
| `entity_url` | str | Link to ICIJ page |
|
||||
| `connections` | str | Semicolon-separated node IDs of related entities |
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- Worldwide offshore entity records
|
||||
- Earliest records: 1970s (Bahamas Leaks). Most data 1990–2018.
|
||||
- NOT updated in real-time — new leaks added when ICIJ publishes them
|
||||
- ~810,000 offshore entities + ~750,000 officers + ~150,000 intermediaries
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **SEC EDGAR** ↔ `name` (public companies with offshore arms)
|
||||
- **USAspending** ↔ `name` (federal contractors with offshore structure)
|
||||
- **OFAC SDN** ↔ `name` (sanctioned entities using offshore vehicles)
|
||||
|
||||
Join key: normalized entity/officer name. `node_id` is canonical for cross-
|
||||
referencing within ICIJ. Connections graph traversal is in-script (BFS over
|
||||
`connections`).
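
A breadth-first walk over the `connections` column might look like the sketch below. It assumes rows shaped like the schema in section 3 (a `node_id` plus semicolon-separated neighbor IDs); the function name and the depth limit are illustrative choices, not the script's actual interface.

```python
from collections import deque


def connected_nodes(rows, start_id, max_depth=2):
    """Breadth-first walk over `connections`, returning {node_id: depth}."""
    graph = {}
    for row in rows:
        raw = row.get("connections", "")
        graph[str(row["node_id"])] = [n.strip() for n in raw.split(";") if n.strip()]
    seen = {str(start_id): 0}
    queue = deque([str(start_id)])
    while queue:
        node = queue.popleft()
        if seen[node] >= max_depth:
            continue  # do not expand past the depth limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen
```

Since the connections graph is incomplete, a node missing from the result is not evidence that no relationship exists.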
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- Offshore entity names sometimes appear in multiple leaks with slight variations
|
||||
- Officers may be nominees (front persons), not beneficial owners
|
||||
- Some entries have minimal info (just a name + jurisdiction)
|
||||
- The connections graph is incomplete — some relationships are documented in
|
||||
source materials but not in the structured database
|
||||
- Inactive/struck-off entities are still included with `inactivation_date`
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_icij_offshore.py`
|
||||
|
||||
```bash
|
||||
# Search by entity name (case-insensitive substring across the bulk DB)
|
||||
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
|
||||
--out data/icij.csv
|
||||
|
||||
# Search by officer (individual person)
|
||||
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --officer "SMITH JOHN" \
|
||||
--out data/icij.csv
|
||||
|
||||
# Search by jurisdiction (filter on cached results)
|
||||
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --officer "SMITH" \
|
||||
--jurisdiction "BRITISH VIRGIN ISLANDS" --out data/icij_bvi.csv
|
||||
|
||||
# Force a fresh download (default refresh window is 30 days)
|
||||
python3 SKILL_DIR/scripts/fetch_icij_offshore.py --entity "EXAMPLE CORP" \
|
||||
--force-refresh --out data/icij.csv
|
||||
```
|
||||
|
||||
First call downloads the ~70 MB ZIP under `~/.cache/hermes-osint/icij/`
|
||||
(or `$HERMES_OSINT_CACHE/icij/`). Subsequent calls reuse the cache for 30 days.
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- Public record as published by ICIJ under explicit publication
|
||||
- No copyright on the underlying facts (entity names, jurisdictions)
|
||||
- ICIJ asks for attribution if used in derivative reporting
|
||||
- **Ethical note**: Presence in this database does NOT imply wrongdoing. Many
|
||||
offshore structures are legal. The database is a research tool, not a list of
|
||||
criminals.
|
||||
|
||||
## 9. References
|
||||
|
||||
- Database: https://offshoreleaks.icij.org/
|
||||
- About the data: https://offshoreleaks.icij.org/pages/about
|
||||
- Methodology: https://www.icij.org/investigations/panama-papers/
|
||||
- Removed endpoint: the OpenRefine reconciliation endpoint at `https://offshoreleaks.icij.org/reconcile` now returns 404 (see section 2); use the bulk download instead
|
||||
|
|
@ -0,0 +1,90 @@
|
|||
# NYC ACRIS — NYC Real Property Records
|
||||
|
||||
## 1. Summary
|
||||
|
||||
The Automated City Register Information System (ACRIS) is NYC's index of
|
||||
recorded property documents: deeds, mortgages, satisfactions, liens, UCC
|
||||
filings. Covers Manhattan, Bronx, Brooklyn, Queens, Staten Island.
|
||||
Published as 4 linked Socrata datasets on the NYC Open Data portal.
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **Socrata API:** `https://data.cityofnewyork.us/resource/636b-3b5g.json` (Parties)
|
||||
- **Other datasets:** `bnx9-e6tj` (Master), `8h5j-fqxa` (Legal), `uqqa-hym2` (References)
|
||||
- **Auth:** None for read access (Socrata `$app_token` raises rate limits if needed)
|
||||
- **Rate limit:** Generous (~1000 req/hour unauthenticated)
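
A name search against the Parties dataset can be expressed as a SoQL query. The sketch below only builds the URL. It assumes the dataset columns are called `name` and `party_type` (as in the schema table in this entry) and that `party_type` is stored as text; `build_party_query` is a hypothetical helper.

```python
import urllib.parse

PARTIES = "https://data.cityofnewyork.us/resource/636b-3b5g.json"


def build_party_query(name_fragment, party_type=None, limit=1000):
    """Build a SoQL URL for a case-insensitive substring match on party name."""
    # SoQL string literals escape a single quote by doubling it.
    safe = name_fragment.upper().replace("'", "''")
    where = f"upper(name) like '%{safe}%'"
    if party_type is not None:
        where += f" AND party_type = '{party_type}'"
    return PARTIES + "?" + urllib.parse.urlencode({"$where": where, "$limit": limit})
```

Larger result sets would need paging with `$offset`, which is omitted here to keep the example short.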
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_nyc_acris.py` (Parties joined to Master):
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `document_id` | str | ACRIS document ID |
|
||||
| `name` | str | Party name as recorded (often "LAST, FIRST" but varies) |
|
||||
| `party_type` | str | 1=grantor, 2=grantee, 3=other |
|
||||
| `party_role` | str | Human-readable role label |
|
||||
| `address_1` | str | Property or party address line 1 |
|
||||
| `city`, `state`, `zip`, `country` | str | Address parts |
|
||||
| `doc_type` | str | DEED, MTGE (mortgage), SAT (satisfaction), AGMT, etc. |
|
||||
| `doc_date`, `recorded_date` | str | YYYY-MM-DD |
|
||||
| `borough` | str | Manhattan / Bronx / Brooklyn / Queens / Staten Island |
|
||||
| `amount` | str | Document amount (USD, when applicable) |
|
||||
| `filing_url` | str | Direct ACRIS DocumentImageView link |
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- NYC 5 boroughs only — other counties have their own recorders
|
||||
- 1966 → present (older filings exist on microfilm at the County Clerk)
|
||||
- Updated nightly
|
||||
- ~70M+ party records cumulative
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **SEC EDGAR** ↔ `name` (insider filers with NYC property)
|
||||
- **USAspending** ↔ `name` (federal contractors with NYC property)
|
||||
- **Senate LDA** ↔ `name` (lobbyists / clients with NYC property)
|
||||
- **ICIJ Offshore** ↔ `name` (NYC properties owned via offshore vehicles)
|
||||
|
||||
Join key: normalized party name. NYC property records typically store names
|
||||
as "LAST, FIRST" or full LLC names — use `entity_resolution.py`.
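
As a rough illustration of what "normalized party name" can mean here, the sketch below reorders "LAST, FIRST" and strips punctuation. It is an assumption-laden stand-in, not the logic of `entity_resolution.py`: the function name, the suffix list, and the ASCII-only cleanup are all hypothetical.

```python
import re

# Hypothetical suffix list: a comma-separated tail in this set is treated
# as part of an entity name rather than as a first name.
ENTITY_SUFFIXES = {"LLC", "INC", "CORP", "LP", "LLP", "LTD", "TRUST"}


def normalize_party_name(raw: str) -> str:
    """Uppercase, reorder "LAST, FIRST" to "FIRST LAST", drop punctuation."""
    name = raw.strip().upper().replace(".", "")
    # Only reorder when there is exactly one comma.
    if name.count(",") == 1:
        last, first = (part.strip() for part in name.split(","))
        if first and first not in ENTITY_SUFFIXES:
            name = f"{first} {last}"
    # ASCII-only cleanup; accented characters would need extra handling.
    name = re.sub(r"[^A-Z0-9 ]+", " ", name)
    return " ".join(name.split())
```

Under these assumptions, `"Rolnick, Jane"` and `"JANE  ROLNICK"` collapse to the same key, while `"Acme Holdings, L.L.C."` keeps its word order.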
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- Same person appears with multiple name formats over time
|
||||
- LLC and trust ownership obscures beneficial owners
|
||||
- Recording lag can be 2-4 weeks after closing
|
||||
- Older documents have spottier address data
|
||||
- Sealed records (e.g. domestic violence shelters) are excluded by law
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_nyc_acris.py`
|
||||
|
||||
```bash
|
||||
# By party name
|
||||
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --name "ROLNICK" --out data/acris.csv
|
||||
|
||||
# By address (useful when you know the property but not the names)
|
||||
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --address "571 HUDSON" --out data/acris.csv
|
||||
|
||||
# Restrict to grantees (buyers / mortgagees)
|
||||
python3 SKILL_DIR/scripts/fetch_nyc_acris.py --name "ROLNICK" --party-type 2 \
|
||||
--out data/acris_buyers.csv
|
||||
```
|
||||
|
||||
The script joins Parties → Master to populate doc_type, dates, borough, and
|
||||
amount. Pass `--no-enrich` to skip the join (faster, fewer columns).
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- Public record under NYS Real Property Law and NYC Charter
|
||||
- No commercial use restrictions on the data
|
||||
- All ACRIS data is public information by statute
|
||||
|
||||
## 9. References
|
||||
|
||||
- ACRIS portal: https://a836-acris.nyc.gov/CP/
|
||||
- NYC Open Data: https://data.cityofnewyork.us/
|
||||
- Parties dataset: https://data.cityofnewyork.us/City-Government/ACRIS-Real-Property-Parties/636b-3b5g
|
||||
- Document type codes: https://www1.nyc.gov/site/finance/taxes/acris.page
|
||||
|
|
@ -0,0 +1,92 @@
|
|||
# OFAC SDN — Specially Designated Nationals List
|
||||
|
||||
## 1. Summary
|
||||
|
||||
The Office of Foreign Assets Control (OFAC) publishes the Specially Designated
|
||||
Nationals and Blocked Persons List (SDN). US persons are generally prohibited
|
||||
from dealing with individuals and entities on this list. Also published:
|
||||
non-SDN consolidated lists (BIS Denied Persons, FSE, etc.).
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **Full XML:** `https://www.treasury.gov/ofac/downloads/sdn.xml`
|
||||
- **Delimited:** `https://www.treasury.gov/ofac/downloads/sdn.csv`
|
||||
- **Consolidated:** `https://www.treasury.gov/ofac/downloads/consolidated/consolidated.xml`
|
||||
- **Auth:** None
|
||||
- **Rate limit:** None (static file downloads). Updated continuously.
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_ofac_sdn.py`:
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `entity_id` | int | OFAC unique ID |
|
||||
| `name` | str | Primary name |
|
||||
| `entity_type` | str | individual / entity / vessel / aircraft |
|
||||
| `program_list` | str | Semicolon-separated sanctions programs (e.g. SDGT;IRAN) |
|
||||
| `title` | str | For individuals: title/role |
|
||||
| `nationalities` | str | Semicolon-separated country codes |
|
||||
| `aka_list` | str | Semicolon-separated "also known as" names |
|
||||
| `addresses` | str | Semicolon-separated known addresses |
|
||||
| `dob` | str | Date of birth (individuals) |
|
||||
| `pob` | str | Place of birth (individuals) |
|
||||
| `remarks` | str | OFAC's free-text remarks |
|
||||
| `last_updated` | str | YYYY-MM-DD (publication date) |
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- Worldwide — all entities sanctioned by US Treasury
|
||||
- ~10,000 entries on SDN, ~15,000 on consolidated lists
|
||||
- Updated continuously (sometimes daily during active enforcement)
|
||||
- Includes AKAs (very common, can be 10+ per entity)
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **SEC EDGAR** ↔ `name` (public companies sanctioned)
|
||||
- **USAspending** ↔ `name` (sanctioned entity as federal contractor — should
|
||||
be impossible but verify)
|
||||
- **ICIJ Offshore** ↔ `name` (offshore entities also sanctioned)
|
||||
|
||||
Join key: normalized name. **CRITICAL**: must match against `aka_list` too.
|
||||
Many sanctioned entities are caught only via aliases.
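
A minimal sketch of alias-aware matching against one exported row, assuming the `name` and `aka_list` columns described in the schema above. The helper names are hypothetical, and exact uppercase comparison is a simplification: real screening usually needs fuzzy or transliteration-aware matching.

```python
from __future__ import annotations


def sdn_candidate_names(row: dict) -> set[str]:
    """Collect the primary name plus every alias from one exported row."""
    names = [row.get("name") or ""]
    # aka_list is semicolon-separated per the schema table.
    names.extend((row.get("aka_list") or "").split(";"))
    return {n.strip().upper() for n in names if n.strip()}


def matches_sdn(query: str, row: dict) -> bool:
    """True when the query equals the primary name or any alias."""
    return query.strip().upper() in sdn_candidate_names(row)
```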
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- Names are transliterated from many scripts — multiple romanizations possible
|
||||
- AKAs often differ wildly from primary name
|
||||
- Some entries have minimal info (no DOB, no address) for individuals
|
||||
- Free-text `remarks` contain critical context — read them
|
||||
- "Specially Designated Global Terrorists" (SDGT) and "Cyber-related" (CYBER2)
|
||||
programs add and remove entries frequently
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_ofac_sdn.py`
|
||||
|
||||
```bash
|
||||
# Full snapshot
|
||||
python3 SKILL_DIR/scripts/fetch_ofac_sdn.py --out data/ofac_sdn.csv
|
||||
|
||||
# Filter to specific program
|
||||
python3 SKILL_DIR/scripts/fetch_ofac_sdn.py --program SDGT --out data/sdn_sdgt.csv
|
||||
|
||||
# Entities only (skip individuals, vessels, aircraft)
|
||||
python3 SKILL_DIR/scripts/fetch_ofac_sdn.py --entity-type entity --out data/sdn_entities.csv
|
||||
```
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- Public record under Executive Order authority and statutory sanctions programs
|
||||
- US persons MUST screen against this list — it is enforced
|
||||
- No restrictions on the data itself; restrictions are on transactions with
|
||||
the listed entities
|
||||
- ZERO penalty for "over-matching" — false positives must be cleared but are not
|
||||
prohibited
|
||||
|
||||
## 9. References
|
||||
|
||||
- OFAC home: https://ofac.treasury.gov/
|
||||
- SDN list: https://ofac.treasury.gov/specially-designated-nationals-and-blocked-persons-list-sdn-human-readable-lists
|
||||
- Data formats: https://ofac.treasury.gov/sdn-list/sanctions-list-search-tool
|
||||
- Compliance guidance: https://ofac.treasury.gov/recent-actions
|
||||
|
|
@ -0,0 +1,103 @@
|
|||
# OpenCorporates — Global Corporate Registry
|
||||
|
||||
## 1. Summary
|
||||
|
||||
OpenCorporates aggregates corporate registry data from 130+ jurisdictions
|
||||
worldwide (~200M companies). Covers US state-level filings (NY DOS, Delaware
|
||||
DOC, California SOS, etc.), UK Companies House, EU registries, and most
|
||||
common-law jurisdictions.
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **REST API:** `https://api.opencorporates.com/v0.4/`
|
||||
- **HTML fallback:** `https://opencorporates.com/companies?q=...`
|
||||
- **Auth:** API token required (free tier 500 calls/month, paid plans available)
|
||||
- **Rate limit:** Token-bound; un-tokened requests return 401
|
||||
|
||||
Set `OPENCORPORATES_API_TOKEN` env var. Get a free token at
|
||||
https://opencorporates.com/api_accounts/new.
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_opencorporates.py`:
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `name` | str | Company legal name |
|
||||
| `company_number` | str | Registry-assigned number |
|
||||
| `jurisdiction_code` | str | e.g. `us_ny`, `us_de`, `gb` |
|
||||
| `jurisdiction_name` | str | Human-readable jurisdiction |
|
||||
| `incorporation_date` | str | YYYY-MM-DD |
|
||||
| `dissolution_date` | str | YYYY-MM-DD (empty if active) |
|
||||
| `company_type` | str | Domestic LLC / Foreign Corp / etc. |
|
||||
| `status` | str | Active / Inactive / Dissolved |
|
||||
| `registered_address` | str | Registered office address |
|
||||
| `opencorporates_url` | str | Link to OpenCorporates entity page |
|
||||
| `officers_count` | str | Total officers on record |
|
||||
| `source` | str | `api`, `html`, or `html-fallback` |
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- US: all 50 states + DC at state level (LLCs, corps, LPs)
|
||||
- International: UK, EU, Canada, Australia, NZ, many APAC + LATAM jurisdictions
|
||||
- ~200M company records cumulative
|
||||
- Update frequency varies by jurisdiction (UK CH is near-realtime; some
|
||||
state registries lag months)
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **NYC ACRIS** ↔ `name` (LLC/corp owners of NYC property)
|
||||
- **USAspending** ↔ `name` (corporate federal contractors)
|
||||
- **SEC EDGAR** ↔ `name` (public companies + their subsidiaries)
|
||||
- **ICIJ Offshore** ↔ `name` (international corporate structures)
|
||||
|
||||
Join key: normalized company name. Some entries have `previous_names` arrays
|
||||
which are not currently exported by the fetch script — query OC directly
|
||||
for that.
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- Company-name spellings vary across re-incorporations and renames
|
||||
- Officer records are spottier than company records (many jurisdictions
|
||||
don't require officer disclosure)
|
||||
- Beneficial-ownership data is generally NOT here — most jurisdictions
|
||||
don't require it. UK Companies House has PSC (people with significant
|
||||
control) but that's not universal.
|
||||
- Cross-jurisdictional links (parent / subsidiary) are based on registry
|
||||
filings only; corporate trees are often incomplete
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_opencorporates.py`
|
||||
|
||||
```bash
|
||||
# Search globally by name
|
||||
python3 SKILL_DIR/scripts/fetch_opencorporates.py --query "Example Corp" \
|
||||
--out data/oc.csv
|
||||
|
||||
# Restrict to a jurisdiction
|
||||
python3 SKILL_DIR/scripts/fetch_opencorporates.py --query "Example Corp" \
|
||||
--jurisdiction us_ny --out data/oc_ny.csv
|
||||
|
||||
# Set token via env or flag
|
||||
OPENCORPORATES_API_TOKEN=xxx python3 SKILL_DIR/scripts/fetch_opencorporates.py \
|
||||
--query "Microsoft" --out data/oc.csv
|
||||
```
|
||||
|
||||
Without a token the script falls back to scraping the HTML search page.
|
||||
The fallback is brittle and only fills in `name`, `jurisdiction_code`,
|
||||
`opencorporates_url` — set the token for serious work.
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- OpenCorporates aggregates public records — the underlying facts are
|
||||
public domain
|
||||
- OpenCorporates' own database is licensed CC-BY-SA-4.0; attribution required
|
||||
- API ToS prohibits redistributing the full dataset; per-record reference
|
||||
is fine
|
||||
|
||||
## 9. References
|
||||
|
||||
- API docs: https://api.opencorporates.com/documentation/API-Reference
|
||||
- Jurisdiction codes: https://api.opencorporates.com/v0.4/jurisdictions.json
|
||||
- Schema: https://opencorporates.com/info/our_data
|
||||
|
|
@ -0,0 +1,83 @@
|
|||
# SEC EDGAR — Corporate Filings
|
||||
|
||||
## 1. Summary
|
||||
|
||||
EDGAR (Electronic Data Gathering, Analysis, and Retrieval) is the SEC's system
|
||||
for corporate disclosure filings: 10-K (annual), 10-Q (quarterly), 8-K (current
|
||||
events), DEF 14A (proxy), Form 4 (insider trading), 13F (institutional holdings).
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **API:** `https://data.sec.gov/submissions/CIK<10-digit-padded>.json` (no auth)
|
||||
- **Filing index:** `https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=...`
|
||||
- **Full-text search:** `https://efts.sec.gov/LATEST/search-index?q=...`
|
||||
- **Auth:** None — requires `User-Agent` header with contact info per SEC policy
|
||||
- **Rate limit:** 10 requests/second per IP (enforced)
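
The sketch below shows one way to build a request for the submissions endpoint above: zero-pad the CIK to 10 digits and attach a `User-Agent`. `submissions_request` is a hypothetical helper, not the code in `fetch_sec_edgar.py`; it reads the `SEC_USER_AGENT` variable that this document asks you to set and falls back to a placeholder that should be replaced with real contact details.

```python
from __future__ import annotations

import os
import urllib.request


def submissions_request(cik: str | int) -> urllib.request.Request:
    """Build (but do not send) a request for one registrant's filings."""
    # int() drops any existing leading zeros before re-padding to 10 digits.
    padded = str(int(cik)).zfill(10)
    url = f"https://data.sec.gov/submissions/CIK{padded}.json"
    # Placeholder fallback only; supply your own contact info.
    ua = os.environ.get("SEC_USER_AGENT", "Research example@example.com")
    return urllib.request.Request(url, headers={"User-Agent": ua})
```

Passing the result to `urllib.request.urlopen` would perform the fetch; keeping construction separate makes the padding easy to check offline.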
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_sec_edgar.py` (filings index):
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `cik` | str | Central Index Key (10-digit padded) |
|
||||
| `company_name` | str | Registrant name |
|
||||
| `form_type` | str | 10-K, 10-Q, 8-K, etc. |
|
||||
| `filing_date` | str | YYYY-MM-DD |
|
||||
| `accession_number` | str | Filing accession (e.g. 0000320193-24-000123) |
|
||||
| `primary_document` | str | Filename of main document |
|
||||
| `filing_url` | str | Direct URL to filing index |
|
||||
| `reporting_period` | str | Period of report (where applicable) |
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- All public US registrants from 1993 → present
|
||||
- Coverage of 1993-2000 filings is spotty (paper-to-electronic migration)
|
||||
- ~12M filings cumulative
|
||||
- Updated within minutes of filing acceptance
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **USAspending** ↔ `company_name` (public companies as federal contractors)
|
||||
- **Senate LD** ↔ `company_name` (public companies hire lobbyists)
|
||||
- **OFAC SDN** ↔ `company_name` (sanctions screening of public registrants)
|
||||
|
||||
Join key: company name OR CIK if you have it. CIK is canonical and stable.
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- Subsidiaries often file under the parent's CIK, so be careful with name matches
|
||||
- Name changes over time (rebrands, acquisitions) — CIK remains constant
|
||||
- 10-K Item 1A Risk Factors are free-form text — useful for `web_extract`-style
|
||||
parsing, not structured queries
|
||||
- Foreign private issuers file 20-F instead of 10-K
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_sec_edgar.py`
|
||||
|
||||
```bash
|
||||
# By CIK
|
||||
python3 SKILL_DIR/scripts/fetch_sec_edgar.py --cik 0000320193 \
|
||||
--types 10-K,10-Q --out data/edgar_filings.csv
|
||||
|
||||
# By company name (resolves to CIK first via name search)
|
||||
python3 SKILL_DIR/scripts/fetch_sec_edgar.py --company "APPLE INC" \
|
||||
--types 8-K --since 2024-01-01 --out data/edgar_filings.csv
|
||||
```
|
||||
|
||||
Set `SEC_USER_AGENT` env var with your contact email (SEC requirement).
|
||||
Example: `SEC_USER_AGENT="Research example@example.com"`.
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- Public record under SEC Rule 24b-2 / 17 CFR § 230.401
|
||||
- No commercial use restrictions on filing content
|
||||
- SEC asks all bulk users to include a `User-Agent` with contact info and to
|
||||
respect 10 req/s — failure to do so can result in IP blocking
|
||||
|
||||
## 9. References
|
||||
|
||||
- Developer docs: https://www.sec.gov/edgar/sec-api-documentation
|
||||
- EDGAR full-text search: https://efts.sec.gov/LATEST/search-index
|
||||
- Fair access policy: https://www.sec.gov/os/accessing-edgar-data
|
||||
|
|
@ -0,0 +1,89 @@
|
|||
# Senate LD — Lobbying Disclosure (LD-1 / LD-2)
|
||||
|
||||
## 1. Summary
|
||||
|
||||
The Senate Office of Public Records publishes lobbying disclosures under the
|
||||
Lobbying Disclosure Act of 1995 (LDA, as amended by HLOGA 2007). LD-1 is
|
||||
registration of a new client-lobbyist relationship; LD-2 is the quarterly
|
||||
activity report.
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **API:** `https://lda.senate.gov/api/v1/` (no auth required for read-only)
|
||||
- **Bulk download:** `https://lda.senate.gov/api/v1/filings/?format=csv` (paginated)
|
||||
- **Auth:** Token required for >120 req/hour — register at https://lda.senate.gov/api/auth/register/
|
||||
- **Rate limit:** 120 req/hour unauthenticated, 1,200 req/hour authenticated
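
The quotas above translate into a minimum spacing between requests: 3600 / 120 = 30 seconds anonymously and 3600 / 1200 = 3 seconds with a token. The sketch below is a hypothetical pacing helper built on that arithmetic; it is not taken from `fetch_senate_ld.py`, and the `sleep` parameter exists only so the delay can be swapped out.

```python
import time


def min_interval_seconds(has_token: bool) -> float:
    """Seconds to wait between requests to stay under the hourly quota."""
    requests_per_hour = 1200 if has_token else 120
    return 3600 / requests_per_hour


def paced(items, has_token=False, sleep=time.sleep):
    """Yield items one by one, sleeping between them (not before the first)."""
    interval = min_interval_seconds(has_token)
    for index, item in enumerate(items):
        if index:
            sleep(interval)
        yield item
```

Wrapping a list of page URLs in `paced(...)` spreads the requests evenly instead of bursting and then stalling on the quota.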
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_senate_ld.py`:
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `filing_uuid` | str | Unique filing ID |
|
||||
| `filing_type` | str | LD-1, LD-2, LD-203, etc. |
|
||||
| `filing_year` | int | Year |
|
||||
| `filing_period` | str | Q1/Q2/Q3/Q4 or annual |
|
||||
| `registrant_name` | str | Lobbying firm or organization |
|
||||
| `registrant_id` | str | Senate-assigned registrant ID |
|
||||
| `client_name` | str | Client being represented |
|
||||
| `client_id` | str | Senate-assigned client ID |
|
||||
| `client_general_description` | str | Client industry / business |
|
||||
| `income` | float | LD-2 income from client this quarter (USD) |
|
||||
| `expenses` | float | LD-2 expenses (in-house lobbying) |
|
||||
| `lobbyists` | str | Semicolon-separated lobbyist names |
|
||||
| `issues` | str | Semicolon-separated issue areas |
|
||||
| `government_entities` | str | Agencies/chambers contacted |
|
||||
| `filing_date` | str | YYYY-MM-DD |
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- US federal lobbying only (state lobbying handled by individual state ethics offices)
|
||||
- 1999 → present (full electronic coverage from 2008)
|
||||
- Quarterly reporting cycle (LD-2)
|
||||
- ~1M+ filings cumulative
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **USAspending** ↔ `client_name` (clients lobbying for contracts)
|
||||
- **SEC EDGAR** ↔ `client_name` (public companies as lobbying clients)
|
||||
- **OFAC SDN** ↔ `client_name` (sanctions screening of lobbying clients)
|
||||
|
||||
Join key: normalized `client_name`. `registrant_id` and `client_id` are canonical
|
||||
when joining Senate-internal records.
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- Many lobbyist names appear under multiple registrants over time (job changes)
|
||||
- `issues` and `government_entities` are free-text, with inconsistent capitalization
|
||||
- Foreign agents register under FARA (Department of Justice), NOT here
|
||||
- Income/expenses are reported in $10,000 brackets in some older filings
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_senate_ld.py`
|
||||
|
||||
```bash
|
||||
# By client
|
||||
python3 SKILL_DIR/scripts/fetch_senate_ld.py --client "EXAMPLE CORP" \
|
||||
--year 2024 --out data/lobbying.csv
|
||||
|
||||
# By registrant (lobbying firm)
|
||||
python3 SKILL_DIR/scripts/fetch_senate_ld.py --registrant "BIG K STREET LLP" \
|
||||
--year 2024 --out data/lobbying.csv
|
||||
```
|
||||
|
||||
Set `SENATE_LDA_TOKEN` env var if you have one (or pass `--token`).
|
||||
Defaults to anonymous (120 req/hour).
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- Public record under 2 U.S.C. § 1604 (LDA)
|
||||
- No commercial use restrictions
|
||||
- Reuse is unconditional — see Senate Public Records Office disclaimer
|
||||
|
||||
## 9. References
|
||||
|
||||
- API docs: https://lda.senate.gov/api/redoc/v1/
|
||||
- LDA guidance: https://lobbyingdisclosure.house.gov/ld_guidance.pdf
|
||||
- Senate Public Records: https://lda.senate.gov/
|
||||
|
|
@ -0,0 +1,97 @@
|
|||
# USAspending — Federal Government Contracts and Grants
|
||||
|
||||
## 1. Summary
|
||||
|
||||
USAspending.gov is the official source of federal spending data. Coverage:
|
||||
contracts, grants, loans, direct payments, sub-awards. Required by the DATA Act
|
||||
of 2014 — all federal agencies must report to a single schema.
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **API v2:** `https://api.usaspending.gov/api/v2/` (no auth, no key)
|
||||
- **Bulk:** `https://files.usaspending.gov/` (CSV / Parquet by award type)
|
||||
- **Auth:** None
|
||||
- **Rate limit:** Not strictly enforced, but be polite — keep to <10 req/s
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_usaspending.py` (prime awards):
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `award_id` | str | Federal award ID (PIID for contracts, FAIN for grants) |
|
||||
| `recipient_name` | str | Awardee legal name |
|
||||
| `recipient_uei` | str | Unique Entity Identifier (replaced DUNS in 2022) |
|
||||
| `recipient_duns` | str | Legacy DUNS number (historical only) |
|
||||
| `recipient_parent_name` | str | Ultimate parent organization |
|
||||
| `recipient_state` | str | Recipient state |
|
||||
| `awarding_agency` | str | Department / agency name |
|
||||
| `awarding_sub_agency` | str | Sub-tier (e.g. DoD → Army) |
|
||||
| `award_type` | str | Contract / Grant / Loan / Direct Payment |
|
||||
| `award_amount` | float | Current total obligation in USD |
|
||||
| `award_date` | str | Action / signed date YYYY-MM-DD |
|
||||
| `period_of_performance_start` | str | YYYY-MM-DD |
|
||||
| `period_of_performance_end` | str | YYYY-MM-DD |
|
||||
| `naics_code` | str | Industry classification |
|
||||
| `psc_code` | str | Product / Service Code |
|
||||
| `competition_extent` | str | Full / limited / sole-source |
|
||||
| `description` | str | Award description (free-text) |
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- US federal awards only (state/local not included)
|
||||
- FY 2008 → present (full coverage from FY 2017)
|
||||
- Updated bi-weekly from agency reporting
|
||||
- ~100M+ transaction records cumulative
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **SEC EDGAR** ↔ `recipient_name` (public companies as contractors)
|
||||
- **Senate LD** ↔ `recipient_name` (lobbying clients winning contracts)
|
||||
- **OFAC SDN** ↔ `recipient_name` (sanctions screening of contractors — must be
|
||||
filtered out by SAM.gov but verify)
|
||||
- **ICIJ Offshore** ↔ `recipient_name` (offshore-linked contractors)
|
||||
|
||||
Join key: normalized recipient name. UEI is canonical when present.
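
One possible way to act on "UEI is canonical when present" is a tiered join key. The sketch below is hypothetical: the DUNS middle tier is an assumption based on the schema's `recipient_duns` column, and the name tier only uppercases and collapses whitespace rather than doing real entity resolution.

```python
from __future__ import annotations


def recipient_join_key(row: dict) -> tuple[str, str]:
    """Return (key_kind, key_value), preferring UEI, then DUNS, then name."""
    uei = (row.get("recipient_uei") or "").strip().upper()
    if uei:
        return ("uei", uei)
    # Assumed fallback for older records that carry only a DUNS number.
    duns = (row.get("recipient_duns") or "").strip()
    if duns:
        return ("duns", duns)
    name = " ".join((row.get("recipient_name") or "").upper().split())
    return ("name", name)
```

Returning the key kind alongside the value keeps a DUNS from ever being compared against a UEI or a name by accident.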
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- DUNS → UEI transition (April 2022) — old records have DUNS, new records have UEI
|
||||
- Some sub-awards aren't reported (FFATA threshold is $30k)
|
||||
- Award amount changes over time (mod actions) — fetch script reports current total
|
||||
- `competition_extent` field is free-text in older records — `fetch_usaspending.py`
|
||||
normalizes to canonical values
|
||||
- Recipient name variations are extensive — "ACME LLC", "Acme L.L.C.", "ACME, INC"
|
||||
all appear. Use `entity_resolution.py`.
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_usaspending.py`
|
||||
|
||||
```bash
|
||||
# By recipient name
|
||||
python3 SKILL_DIR/scripts/fetch_usaspending.py --recipient "EXAMPLE CORP" \
|
||||
--fy 2024 --out data/contracts.csv
|
||||
|
||||
# By awarding agency
|
||||
python3 SKILL_DIR/scripts/fetch_usaspending.py --agency "Department of Defense" \
|
||||
--fy 2024 --out data/contracts.csv
|
||||
|
||||
# Filter to sole-source only
|
||||
python3 SKILL_DIR/scripts/fetch_usaspending.py --recipient "EXAMPLE CORP" \
|
||||
--fy 2024 --sole-source-only --out data/contracts.csv
|
||||
```
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- Public record under the Federal Funding Accountability and Transparency Act
|
||||
(FFATA, 2006) and DATA Act (2014)
|
||||
- No commercial use restrictions on the data
|
||||
- Personal information of award recipients (e.g. small business owners' addresses
|
||||
in some grants) should be handled per the source agency's privacy notice
|
||||
|
||||
## 9. References
|
||||
|
||||
- API docs: https://api.usaspending.gov/
|
||||
- Data dictionary: https://www.usaspending.gov/data-dictionary
|
||||
- Award schema: https://files.usaspending.gov/docs/Data_Dictionary_Crosswalk.xlsx
|
||||
|
|
@ -0,0 +1,93 @@
|
|||
# Wayback Machine — Internet Archive CDX
|
||||
|
||||
## 1. Summary
|
||||
|
||||
The Internet Archive's Wayback Machine has captured ~900B+ web pages since
|
||||
1996. The CDX server API indexes those captures by URL, timestamp, and
|
||||
content hash. Free, anonymous, no auth.
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **CDX server:** `https://web.archive.org/cdx/search/cdx`
|
||||
- **Wayback URL:** `https://web.archive.org/web/<timestamp>/<url>`
|
||||
- **Save Page Now (write):** `https://web.archive.org/save/<url>` (different API)
|
||||
- **Auth:** None
|
||||
- **Rate limit:** Generous; be polite (~1 req/s)
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_wayback.py`:
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `url` | str | Original URL captured |
|
||||
| `timestamp` | str | YYYYMMDDHHMMSS (CDX format) |
|
||||
| `wayback_url` | str | Direct replay URL |
|
||||
| `mimetype` | str | Content-type at capture |
|
||||
| `status` | str | HTTP status (typically 200) |
|
||||
| `digest` | str | SHA1 of capture content (collapse-friendly) |
|
||||
| `length` | str | Byte length of capture |
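
To illustrate how the `timestamp` and `wayback_url` columns relate, here is a small sketch that parses the 14-digit CDX timestamp and rebuilds the replay URL from the pattern in section 2. The helper names are hypothetical, and treating the timestamp as UTC is an assumption.

```python
from datetime import datetime, timezone


def parse_cdx_timestamp(ts: str) -> datetime:
    """Parse a YYYYMMDDHHMMSS capture timestamp (assumed to be UTC)."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def replay_url(ts: str, original_url: str) -> str:
    """Build the replay URL for one capture."""
    return f"https://web.archive.org/web/{ts}/{original_url}"
```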
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- 1996 → present
|
||||
- ~900B+ captures across ~700M domains
|
||||
- Updated continuously by automated crawls + manual saves
|
||||
- Some domains have aggressive coverage (news), others sparse (private)
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **Wikipedia** ↔ Reverse-lookup pages cited as references that have since
|
||||
disappeared
|
||||
- **News URLs** ↔ Original article content when present-day URLs 404
|
||||
- **Corporate websites** ↔ Historical "About" pages, executive bios that
|
||||
have been scrubbed
|
||||
|
||||
The Wayback CDX is most useful as a **content-recovery** layer when other
|
||||
sources point to URLs that no longer exist.
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- robots.txt-blocked domains may have spotty or no coverage
|
||||
- Captures vary in completeness (HTML may be saved without CSS/JS)
|
||||
- Some content is excluded by domain owner request (DMCA, etc.)
|
||||
- Coverage of "deep links" (URLs with query strings) is uneven
|
||||
- Time resolution is per-capture, not continuous — gaps are common
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_wayback.py`
|
||||
|
||||
```bash
|
||||
# All captures of a specific URL
|
||||
python3 SKILL_DIR/scripts/fetch_wayback.py --url "https://example.com/page" \
|
||||
--out data/wb.csv
|
||||
|
||||
# All captures of a host
|
||||
python3 SKILL_DIR/scripts/fetch_wayback.py --url "example.com" \
|
||||
--match host --out data/wb.csv
|
||||
|
||||
# All captures of a domain + subdomains
|
||||
python3 SKILL_DIR/scripts/fetch_wayback.py --url "example.com" \
|
||||
--match domain --out data/wb.csv
|
||||
|
||||
# Only unique-content captures within a date window
|
||||
python3 SKILL_DIR/scripts/fetch_wayback.py --url "example.com" \
|
||||
--match host --collapse digest \
|
||||
--from-date 2020-01-01 --to-date 2023-12-31 \
|
||||
--out data/wb.csv
|
||||
```
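
The flags above presumably map onto CDX server query parameters. The sketch below shows what such a mapping could look like using the documented CDX parameters `url`, `matchType`, `collapse`, `from`, `to`, and `output`; `build_cdx_query` is a hypothetical helper, and the flag-to-parameter correspondence is an assumption rather than a description of `fetch_wayback.py`.

```python
from __future__ import annotations

import urllib.parse

CDX_URL = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(
    url: str,
    match_type: str = "exact",
    collapse: str | None = None,
    from_date: str | None = None,
    to_date: str | None = None,
) -> str:
    """Return a CDX search URL; dates are given as YYYY-MM-DD."""
    params = {"url": url, "matchType": match_type, "output": "json"}
    if collapse:
        params["collapse"] = collapse
    # CDX accepts partial timestamps, so YYYYMMDD is enough for a day bound.
    if from_date:
        params["from"] = from_date.replace("-", "")
    if to_date:
        params["to"] = to_date.replace("-", "")
    return f"{CDX_URL}?{urllib.parse.urlencode(params)}"
```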
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- Internet Archive captures are made under fair-use research provisions
|
||||
- Replay URLs are stable references — citing them is encouraged
|
||||
- Internet Archive non-profit terms of use govern content
|
||||
- Some content is rights-restricted; replay may be blocked even if the
|
||||
CDX entry shows it as captured
|
||||
|
||||
## 9. References
|
||||
|
||||
- CDX server docs: https://github.com/internetarchive/wayback/blob/master/wayback-cdx-server/README.md
|
||||
- Wayback API: https://archive.org/help/wayback_api.php
|
||||
- Internet Archive: https://archive.org/
|
||||
|
|
@ -0,0 +1,107 @@
|
|||
# Wikipedia + Wikidata
|
||||
|
||||
## 1. Summary
|
||||
|
||||
Wikipedia is the canonical narrative-bio source for notable people, places,
|
||||
and organizations. Wikidata is its structured-data counterpart: ~110M
|
||||
items, each with claims, dates, identifiers, and cross-references to
|
||||
external authorities (VIAF, ISNI, ORCID, GRID, etc.).
|
||||
|
||||
Together they're a high-precision entity-resolution layer — the bar for
|
||||
inclusion is real, but anything past that bar is well-cross-referenced.
|
||||
|
||||
## 2. Access Methods
|
||||
|
||||
- **Wikipedia OpenSearch:** `https://en.wikipedia.org/w/api.php?action=opensearch`
|
||||
- **Wikipedia REST summary:** `https://en.wikipedia.org/api/rest_v1/page/summary/<title>`
|
||||
- **Wikidata Action API:** `https://www.wikidata.org/w/api.php?action=wbgetentities`
|
||||
- **Wikidata SPARQL:** `https://query.wikidata.org/sparql` (more powerful but aggressively rate-limited)
|
||||
- **Auth:** None, but **a meaningful User-Agent is required**
|
||||
|
||||
Set `HERMES_OSINT_UA` to something identifying (e.g. `your-app/1.0 (you@example.com)`).
|
||||
Wikimedia returns HTTP 429 to generic UAs.
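
A minimal sketch of a REST summary request that carries an identifying User-Agent, assuming the `HERMES_OSINT_UA` variable described above. `summary_request` is a hypothetical helper, and the fallback string is the placeholder from this section, not a value to ship.

```python
import os
import urllib.parse
import urllib.request


def summary_request(title: str) -> urllib.request.Request:
    """Build (but do not send) a page-summary request for one article."""
    # Titles use underscores for spaces; everything else is percent-encoded.
    slug = urllib.parse.quote(title.strip().replace(" ", "_"), safe="")
    url = f"https://en.wikipedia.org/api/rest_v1/page/summary/{slug}"
    ua = os.environ.get("HERMES_OSINT_UA", "your-app/1.0 (you@example.com)")
    return urllib.request.Request(url, headers={"User-Agent": ua})
```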
|
||||
|
||||
## 3. Data Schema
|
||||
|
||||
Key fields emitted by `fetch_wikipedia.py`:
|
||||
|
||||
| Column | Type | Description |
|
||||
|--------|------|-------------|
|
||||
| `source` | str | `wikipedia` or `wikipedia+wikidata` |
|
||||
| `label` | str | Wikipedia article title |
|
||||
| `description` | str | Short Wikidata description |
|
||||
| `qid` | str | Wikidata QID (e.g. Q2283 for Microsoft) |
|
||||
| `wikipedia_title`, `wikipedia_url` | str | Article identifier + URL |
|
||||
| `wikidata_url` | str | Wikidata entity URL |
|
||||
| `instance_of` | str | What kind of thing it is (P31) |
|
||||
| `country` | str | Country (P17 for orgs/places, P27 for people) |
|
||||
| `occupation` | str | P106 |
|
||||
| `employer` | str | P108 |
|
||||
| `date_of_birth` | str | P569, YYYY-MM-DD |
|
||||
| `place_of_birth` | str | P19 |
|
||||
| `summary` | str | Wikipedia REST extract (~1000 chars) |
|
||||
|
||||
The fetch script uses Wikidata's Action API (NOT SPARQL) for structured
|
||||
facts — far more lenient on rate limits.
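
For orientation, a `wbgetentities` call for a single QID can be sketched as below. The helper name and the chosen `props` and `languages` values are assumptions for illustration; the script's actual parameters are not shown in this document.

```python
import re
import urllib.parse

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_entity_query(qid: str) -> str:
    """Return an Action API URL that fetches claims for one item."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata QID: {qid!r}")
    params = {
        "action": "wbgetentities",
        "ids": qid,
        "props": "claims|descriptions",  # assumed selection
        "languages": "en",
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urllib.parse.urlencode(params)}"
```

Validating the QID up front avoids spending a request on an identifier the API would reject anyway.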
|
||||
|
||||
## 4. Coverage
|
||||
|
||||
- Wikipedia EN: ~7M articles
|
||||
- Wikidata: ~110M items, ~1.5B statements
|
||||
- Updated continuously; abuse filters and bots run constantly
|
||||
- High notability bar — most private individuals are not in Wikipedia
|
||||
|
||||
## 5. Cross-Reference Potential
|
||||
|
||||
- **All sources** ↔ `label` (entity identity resolution)
|
||||
- **SEC EDGAR** ↔ `label` (public companies)
|
||||
- **CourtListener** ↔ `label` (parties to notable litigation)
|
||||
- **Wikidata external identifiers** (not currently in this fetcher's output)
|
||||
link to VIAF, ISNI, ORCID, GRID, GitHub, Twitter, IMDb, ...
|
||||
|
||||
Join key: Wikidata QID is canonical. Wikipedia titles are stable for
|
||||
most articles but can be renamed.
|
||||
|
||||
## 6. Data Quality
|
||||
|
||||
- Notability filter — only notable entities (criteria vary by topic)
|
||||
- Recency lag — current events take days to weeks to be reflected
|
||||
- POV / vandalism — moderated, but edits between sweeps can be bad
|
||||
- Living-persons biographies have stricter sourcing requirements
|
||||
- Wikidata claims have qualifiers and references — the fetch script
|
||||
doesn't currently export them
|
||||
|
||||
## 7. Acquisition Script
|
||||
|
||||
Path: `scripts/fetch_wikipedia.py`
|
||||
|
||||
```bash
|
||||
# Look up a notable entity
|
||||
python3 SKILL_DIR/scripts/fetch_wikipedia.py --query "Microsoft" --out data/wp.csv
|
||||
|
||||
# A specific person
|
||||
python3 SKILL_DIR/scripts/fetch_wikipedia.py --query "Bill Gates" --out data/wp_bg.csv
|
||||
|
||||
# Skip the Wikidata enrichment for speed
|
||||
python3 SKILL_DIR/scripts/fetch_wikipedia.py --query "Microsoft" --no-wikidata \
|
||||
--limit 5 --out data/wp.csv
|
||||
```
|
||||
|
||||
OpenSearch matching is fuzzy: `--limit 5` returns the top 5 Wikipedia article
|
||||
matches. Each is enriched with the QID + structured facts unless
|
||||
`--no-wikidata` is passed.
|
||||
|
||||
## 8. Legal & Licensing
|
||||
|
||||
- Wikipedia text: CC-BY-SA-4.0 (CC-BY-SA-3.0 before June 2023) / GFDL
|
||||
- Wikidata claims: CC0 (public domain)
|
||||
- API ToS: respect rate limits, identify your agent
|
||||
- Commercial use allowed with attribution
|
||||
|
||||
## 9. References
|
||||
|
||||
- Wikipedia OpenSearch: https://www.mediawiki.org/wiki/API:Opensearch
|
||||
- Wikipedia REST: https://en.wikipedia.org/api/rest_v1/
|
||||
- Wikidata Action API: https://www.wikidata.org/wiki/Wikidata:Data_access
|
||||
- Wikidata SPARQL: https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service
|
||||
- User-Agent policy: https://meta.wikimedia.org/wiki/User-Agent_policy
|
||||
|
|
@ -0,0 +1,82 @@
"""Tiny stdlib HTTP helper used by fetch_*.py scripts.

Provides polite retry + JSON convenience + User-Agent enforcement.
"""
from __future__ import annotations

import json
import os
import time
import urllib.error
import urllib.parse
import urllib.request

DEFAULT_UA = (
    "hermes-osint-investigation/0.2 "
    "(+https://github.com/NousResearch/hermes-agent; "
    "set HERMES_OSINT_UA env var to identify yourself per "
    "Wikimedia / SEC fair-use guidance)"
)


def get(
    url: str,
    *,
    params: dict | None = None,
    headers: dict | None = None,
    user_agent: str | None = None,
    max_retries: int = 3,
    backoff: float = 1.5,
    timeout: float = 30.0,
) -> bytes:
    """GET with retry on 5xx and Retry-After honoring.

    429 (rate-limit) is raised IMMEDIATELY with a clear message — retrying
    when the upstream says "you're over quota" just wastes time. The caller
    should slow down or supply real credentials.
    """
    if params:
        sep = "&" if "?" in url else "?"
        url = f"{url}{sep}{urllib.parse.urlencode(params)}"
    h = {"User-Agent": user_agent or os.environ.get("HERMES_OSINT_UA", DEFAULT_UA)}
    if headers:
        h.update(headers)

    last_err: Exception | None = None
    for attempt in range(max_retries + 1):
        req = urllib.request.Request(url, headers=h)
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return resp.read()
        except urllib.error.HTTPError as e:
            if e.code == 429:
                # Surface immediately. Read the body so the caller sees the
                # provider's actual message ("OVER_RATE_LIMIT" etc.).
                try:
                    body = e.read(2048).decode("utf-8", errors="replace")
                except Exception:  # noqa: BLE001
                    body = ""
                raise RuntimeError(
                    f"HTTP 429 rate-limited by {urllib.parse.urlsplit(url).netloc}. "
                    f"Slow down or supply a real API key. Body: {body[:300]}"
                ) from e
            if e.code in (500, 502, 503, 504) and attempt < max_retries:
                retry_after = e.headers.get("Retry-After") if e.headers else None
                wait = float(retry_after) if (retry_after and retry_after.isdigit()) else backoff ** (attempt + 1)
                time.sleep(wait)
                last_err = e
                continue
            raise
        except urllib.error.URLError as e:
            if attempt < max_retries:
                time.sleep(backoff ** (attempt + 1))
                last_err = e
                continue
            raise
    if last_err:
        raise last_err
    raise RuntimeError("unreachable")


def get_json(url: str, **kwargs) -> dict | list:
    return json.loads(get(url, **kwargs).decode("utf-8"))
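
To make the retry timing concrete, the sketch below mirrors the wait computation used in `get()`: a purely numeric `Retry-After` header wins, otherwise the delay is `backoff ** (attempt + 1)`. `retry_wait` is a hypothetical helper name used only for illustration; it is not part of the module.

```python
from __future__ import annotations


def retry_wait(attempt: int, retry_after: str | None = None, backoff: float = 1.5) -> float:
    """Seconds to sleep before the next attempt (attempt is 0-based)."""
    if retry_after and retry_after.isdigit():
        return float(retry_after)
    return backoff ** (attempt + 1)


# With the defaults (max_retries=3, backoff=1.5) there are three sleeps.
schedule = [retry_wait(a) for a in range(3)]
print(schedule)  # [1.5, 2.25, 3.375]
print(retry_wait(0, retry_after="7"))  # 7.0
# An HTTP-date Retry-After is not all digits, so the exponential delay applies.
print(retry_wait(1, retry_after="Wed, 21 Oct 2026 07:28:00 GMT"))  # 2.25
```

Under these defaults a request that keeps failing with a retryable status and no `Retry-After` header spends roughly seven seconds sleeping before the final error is raised.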
@ -0,0 +1,67 @@
"""Shared entity-name normalization helpers (stdlib-only).

Used by entity_resolution.py and timing_analysis.py.
"""
from __future__ import annotations

import re

# Legal suffixes / corporate boilerplate to strip during normalization.
_SUFFIX_TOKENS = {
    "INC", "INCORPORATED", "LLC", "LLP", "LP", "LTD", "LIMITED",
    "CORP", "CORPORATION", "CO", "COMPANY",
    "GROUP", "GRP", "HOLDINGS", "HOLDING",
    "PARTNERS", "ASSOCIATES",
    "INTERNATIONAL", "INTL",
    "ENTERPRISES", "ENTERPRISE",
    "SERVICES", "SERVICE", "SVCS",
    "SOLUTIONS", "MANAGEMENT", "MGMT", "CONSULTING",
    "TECHNOLOGY", "TECHNOLOGIES", "TECH",
    "INDUSTRIES", "INDUSTRY",
    "AMERICA", "AMERICAN",
    "USA", "US",
    "PLLC", "PC",
    "TRUST", "FOUNDATION",
}

_PUNCT_RE = re.compile(r"[^\w\s]")
_WS_RE = re.compile(r"\s+")


def normalize_name(name: str | None) -> str:
    """Standard normalization: uppercase, strip suffixes, drop punctuation."""
    if not name:
        return ""
    s = _PUNCT_RE.sub(" ", name.upper())
    s = _WS_RE.sub(" ", s).strip()
    tokens = [t for t in s.split() if t and t not in _SUFFIX_TOKENS]
    return " ".join(tokens)


def normalize_aggressive(name: str | None) -> str:
    """Aggressive normalization: sorted unique tokens (word-bag)."""
    base = normalize_name(name)
    if not base:
        return ""
    return " ".join(sorted(set(base.split())))


def name_tokens(name: str | None, min_len: int = 4) -> set[str]:
    """Token set used for overlap matching."""
    base = normalize_name(name)
    if not base:
        return set()
    return {t for t in base.split() if len(t) >= min_len}


def token_overlap_ratio(left: str | None, right: str | None) -> tuple[float, int]:
    """Return (jaccard-like ratio, shared token count) over min-len tokens."""
    a = name_tokens(left)
    b = name_tokens(right)
    if not a or not b:
        return 0.0, 0
    shared = a & b
    if not shared:
        return 0.0, 0
    union = a | b
    return len(shared) / len(union), len(shared)
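
A worked example may help show how suffix stripping and the token-overlap ratio interact. The sketch below is a condensed stand-in, not the module itself: `SUFFIXES` is a deliberately reduced list, and `normalize` / `overlap` are hypothetical names that follow the same steps as `normalize_name` and `token_overlap_ratio`.

```python
from __future__ import annotations

import re

# Reduced suffix list for illustration only.
SUFFIXES = {"INC", "LLC", "CORP", "HOLDINGS", "GROUP"}


def normalize(name: str) -> str:
    """Uppercase, replace punctuation with spaces, drop suffix tokens."""
    cleaned = re.sub(r"[^\w\s]", " ", name.upper())
    return " ".join(t for t in cleaned.split() if t not in SUFFIXES)


def overlap(left: str, right: str, min_len: int = 4) -> tuple[float, int]:
    """Jaccard ratio and shared-token count over tokens of at least min_len chars."""
    a = {t for t in normalize(left).split() if len(t) >= min_len}
    b = {t for t in normalize(right).split() if len(t) >= min_len}
    if not a or not b:
        return 0.0, 0
    shared = a & b
    return len(shared) / len(a | b), len(shared)


print(normalize("Acme Widget Holdings, LLC"))  # ACME WIDGET
ratio, shared = overlap("Acme Widget Holdings, LLC", "ACME WIDGET MANUFACTURING INC.")
print(round(ratio, 3), shared)  # 0.667 2
```

Here the two names share 2 of 3 distinct tokens, so the pair clears a 0.60 ratio threshold with 2 shared tokens even though neither the normalized strings nor the sorted token bags are equal.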
@ -0,0 +1,221 @@
#!/usr/bin/env python3
"""Build a structured findings.json with evidence chains (stdlib-only).

Aggregates cross_links.csv (entity_resolution output) and an optional
timing.json (timing_analysis output) into a single evidence-chain document.

Output structure:
    {
      "metadata": {...},
      "findings": [
        {
          "id": "F0001",
          "title": "...",
          "severity": "HIGH|MEDIUM|LOW",
          "confidence": "high|medium|low",
          "summary": "...",
          "evidence": [
            {"source": "cross_links.csv", "row": 12, "fields": {...}},
            ...
          ],
          "sources": ["cross_links.csv", "timing.json"]
        }
      ]
    }

Every finding traces to specific source rows. No naked claims.
"""
from __future__ import annotations

import argparse
import csv
import json
from collections import defaultdict
from pathlib import Path

CONFIDENCE_ORDER = {"high": 0, "medium": 1, "low": 2}
SEVERITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


def _read_cross_links(path: str) -> list[dict[str, str]]:
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


def build_findings(
    cross_links_path: str,
    timing_path: str | None = None,
    out_path: str = "findings.json",
    bundled_threshold: int = 3,
) -> dict:
    findings: list[dict] = []
    next_id = 1

    # 1. Match-based findings, grouped by (left_normalized, right_normalized).
    matches = _read_cross_links(cross_links_path)
    grouped: dict[tuple[str, str], list[dict[str, str]]] = defaultdict(list)
    for i, row in enumerate(matches):
        row["__row__"] = str(i)
        grouped[(row.get("left_normalized", ""), row.get("right_normalized", ""))].append(row)

    for (left_norm, right_norm), rows in grouped.items():
        if not left_norm or not right_norm:
            continue
        # Use the highest-confidence match for the finding's overall confidence.
        best = min(rows, key=lambda r: CONFIDENCE_ORDER.get(r.get("confidence", "low"), 2))
        finding_id = f"F{next_id:04d}"
        next_id += 1
        evidence = [
            {
                "source": "cross_links.csv",
                "row": int(r["__row__"]),
                "fields": {
                    "match_type": r.get("match_type", ""),
                    "confidence": r.get("confidence", ""),
                    "left_name": r.get("left_name", ""),
                    "right_name": r.get("right_name", ""),
                    "overlap_ratio": r.get("overlap_ratio", ""),
                    "shared_tokens": r.get("shared_tokens", ""),
                },
            }
            for r in rows
        ]
        findings.append(
            {
                "id": finding_id,
                "title": f"Entity match: {best.get('left_name', '')} ↔ {best.get('right_name', '')}",
                "severity": "MEDIUM" if best.get("confidence") == "high" else "LOW",
                "confidence": best.get("confidence", "low"),
                "summary": (
                    f"{len(rows)} cross-link record(s) tie "
                    f"'{best.get('left_name', '')}' to "
                    f"'{best.get('right_name', '')}' "
                    f"(best tier: {best.get('match_type', '')})."
                ),
                "evidence": evidence,
                "sources": ["cross_links.csv"],
            }
        )

    # 2. Bundled-donations findings (if cross_links carries donor↔candidate pattern).
    # Heuristic: many distinct left names sharing the same right name.
    by_right: dict[str, set[str]] = defaultdict(set)
    by_right_rows: dict[str, list[dict[str, str]]] = defaultdict(list)
    for r in matches:
        right = r.get("right_normalized", "")
        left_raw = r.get("left_name", "").strip()
        if right and left_raw:
            by_right[right].add(left_raw)
            by_right_rows[right].append(r)
    for right_norm, lefts in by_right.items():
        if len(lefts) < bundled_threshold:
            continue
        rows = by_right_rows[right_norm]
        right_raw = rows[0].get("right_name", "")
        findings.append(
            {
                "id": f"F{next_id:04d}",
                "title": f"Bundled cross-links: {len(lefts)} distinct left entities ↔ '{right_raw}'",
                "severity": "HIGH",
                "confidence": "medium",
                "summary": (
                    f"{len(lefts)} distinct left-side entities link to "
                    f"'{right_raw}'. Pattern suggests coordinated relationship "
                    f"(e.g. bundled donations, multi-vendor employer)."
                ),
                "evidence": [
                    {
                        "source": "cross_links.csv",
                        "row": int(r.get("__row__", "0")),
                        "fields": {
                            "left_name": r.get("left_name", ""),
                            "match_type": r.get("match_type", ""),
                        },
                    }
                    for r in rows
                ],
                "sources": ["cross_links.csv"],
            }
        )
        next_id += 1

    # 3. Timing-based findings.
    if timing_path and Path(timing_path).exists():
        timing = json.loads(Path(timing_path).read_text())
        for r in timing.get("results", []):
            if not r.get("significant"):
                continue
            findings.append(
                {
                    "id": f"F{next_id:04d}",
                    "title": (
                        f"Donation timing significantly clusters near awards: "
                        f"{r['donor']} ↔ {r['recipient']}"
                    ),
                    "severity": "HIGH" if r["p_value"] < 0.01 else "MEDIUM",
                    "confidence": "medium",
                    "summary": (
                        f"Mean nearest-award distance {r['observed_mean_days']} days "
                        f"(null {r['null_mean_days']} days). p={r['p_value']}, "
                        f"effect size {r['effect_size_sd']} SD. "
                        f"{r['n_donations']} donations, {r['n_award_dates']} awards."
                    ),
                    "evidence": [
                        {
                            "source": "timing.json",
                            "row": None,
                            "fields": r,
                        }
                    ],
                    "sources": ["timing.json"],
                }
            )
            next_id += 1

    # Sort: severity → confidence → id.
    findings.sort(
        key=lambda f: (
            SEVERITY_ORDER.get(f["severity"], 3),
            CONFIDENCE_ORDER.get(f["confidence"], 3),
            f["id"],
        )
    )

    payload = {
        "metadata": {
            "n_findings": len(findings),
            "cross_links_path": cross_links_path,
            "timing_path": timing_path,
            "bundled_threshold": bundled_threshold,
        },
        "findings": findings,
    }
    Path(out_path).write_text(json.dumps(payload, indent=2))
    return payload


def main() -> int:
    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
    p.add_argument("--cross-links", required=True)
    p.add_argument("--timing", help="Optional timing.json from timing_analysis.py")
    p.add_argument("--out", default="findings.json")
    p.add_argument(
        "--bundled-threshold",
        type=int,
        default=3,
        help="Minimum distinct left entities to flag as bundled (default 3)",
    )
    a = p.parse_args()

    payload = build_findings(
        cross_links_path=a.cross_links,
        timing_path=a.timing,
        out_path=a.out,
        bundled_threshold=a.bundled_threshold,
    )
    print(f"Wrote {payload['metadata']['n_findings']} findings to {a.out}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
@ -0,0 +1,228 @@
#!/usr/bin/env python3
"""Cross-source entity resolution (stdlib-only).

Given two CSV files with name columns, find candidate matches using three
tiers of normalization:

  1. exact — normalized strings equal
  2. fuzzy — sorted-token (word-bag) match
  3. token_overlap — >=60% Jaccard overlap on >=4-char tokens, >=2 shared

Adapted from ShinMegamiBoson/OpenPlanter (MIT) but generalized: no Boston-
specific record types, no contribution-code filters, no fixed schemas.

Output CSV columns:
    match_type, confidence, left_name, right_name,
    left_normalized, right_normalized, left_row, right_row,
    overlap_ratio, shared_tokens
"""
from __future__ import annotations

import argparse
import csv
import sys
from pathlib import Path

# Allow running directly or as a module.
sys.path.insert(0, str(Path(__file__).parent))
from _normalize import (  # noqa: E402
    normalize_name,
    normalize_aggressive,
    token_overlap_ratio,
)

CONFIDENCE = {
    "exact": "high",
    "fuzzy": "medium",
    "token_overlap": "low",
}


def _read_csv(path: str, name_col: str) -> list[dict[str, str]]:
    rows = []
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        if name_col not in (reader.fieldnames or []):
            raise SystemExit(
                f"Column {name_col!r} not in {path}. "
                f"Available: {reader.fieldnames}"
            )
        for i, row in enumerate(reader):
            row["__row__"] = str(i)
            rows.append(row)
    return rows


def _build_index(rows: list[dict[str, str]], name_col: str):
    """Index by exact-normalized and aggressive (sorted-token) form."""
    exact: dict[str, list[dict[str, str]]] = {}
    aggressive: dict[str, list[dict[str, str]]] = {}
    for row in rows:
        raw = row.get(name_col, "")
        n = normalize_name(raw)
        if n:
            exact.setdefault(n, []).append(row)
        a = normalize_aggressive(raw)
        if a:
            aggressive.setdefault(a, []).append(row)
    return exact, aggressive


def _emit(
    out_rows: list[dict[str, str]],
    seen: set[tuple],
    match_type: str,
    left_row: dict[str, str],
    right_row: dict[str, str],
    left_col: str,
    right_col: str,
    ratio: float = 0.0,
    shared: int = 0,
):
    left_raw = left_row.get(left_col, "")
    right_raw = right_row.get(right_col, "")
    key = (
        left_row["__row__"],
        right_row["__row__"],
        match_type,
    )
    if key in seen:
        return
    seen.add(key)
    out_rows.append(
        {
            "match_type": match_type,
            "confidence": CONFIDENCE[match_type],
            "left_name": left_raw,
            "right_name": right_raw,
            "left_normalized": normalize_name(left_raw),
            "right_normalized": normalize_name(right_raw),
            "left_row": left_row["__row__"],
            "right_row": right_row["__row__"],
            "overlap_ratio": f"{ratio:.3f}" if ratio else "",
            "shared_tokens": str(shared) if shared else "",
        }
    )


def resolve(
    left_path: str,
    left_col: str,
    right_path: str,
    right_col: str,
    out_path: str,
    overlap_threshold: float = 0.60,
    min_shared: int = 2,
    skip_overlap: bool = False,
) -> int:
    left_rows = _read_csv(left_path, left_col)
    right_rows = _read_csv(right_path, right_col)

    right_exact, right_aggressive = _build_index(right_rows, right_col)

    out_rows: list[dict[str, str]] = []
    seen: set[tuple] = set()

    # Pass 1+2: exact / fuzzy via index lookup.
    for lrow in left_rows:
        raw = lrow.get(left_col, "")
        n = normalize_name(raw)
        if not n:
            continue
        for rrow in right_exact.get(n, []):
            _emit(out_rows, seen, "exact", lrow, rrow, left_col, right_col)
        a = normalize_aggressive(raw)
        if a:
            for rrow in right_aggressive.get(a, []):
                _emit(out_rows, seen, "fuzzy", lrow, rrow, left_col, right_col)

    if not skip_overlap:
        # Pass 3: token overlap (O(N*M) — expensive; allow opt-out).
        for lrow in left_rows:
            l_raw = lrow.get(left_col, "")
            if not normalize_name(l_raw):
                continue
            for rrow in right_rows:
                ratio, shared = token_overlap_ratio(
                    l_raw, rrow.get(right_col, "")
                )
                if ratio >= overlap_threshold and shared >= min_shared:
                    _emit(
                        out_rows,
                        seen,
                        "token_overlap",
                        lrow,
                        rrow,
                        left_col,
                        right_col,
                        ratio=ratio,
                        shared=shared,
                    )

    fieldnames = [
        "match_type",
        "confidence",
        "left_name",
        "right_name",
        "left_normalized",
        "right_normalized",
        "left_row",
        "right_row",
        "overlap_ratio",
        "shared_tokens",
    ]
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(out_rows)
    return len(out_rows)


def main() -> int:
    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
    p.add_argument("--left", required=True, help="Left CSV path")
    p.add_argument(
        "--left-name-col", required=True, help="Name column in left CSV"
    )
    p.add_argument("--right", required=True, help="Right CSV path")
    p.add_argument(
        "--right-name-col",
        required=True,
        help="Name column in right CSV",
    )
    p.add_argument("--out", required=True, help="Output CSV path")
    p.add_argument(
        "--overlap-threshold",
        type=float,
        default=0.60,
        help="Jaccard overlap threshold for token_overlap tier (default 0.60)",
    )
    p.add_argument(
        "--min-shared",
        type=int,
        default=2,
        help="Minimum shared tokens for token_overlap tier (default 2)",
    )
    p.add_argument(
        "--skip-overlap",
        action="store_true",
        help="Skip the O(N*M) token_overlap pass (much faster on large CSVs)",
    )
    args = p.parse_args()

    count = resolve(
        left_path=args.left,
        left_col=args.left_name_col,
        right_path=args.right,
        right_col=args.right_name_col,
        out_path=args.out,
        overlap_threshold=args.overlap_threshold,
        min_shared=args.min_shared,
        skip_overlap=args.skip_overlap,
    )
    print(f"Wrote {count} match rows to {args.out}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
@ -0,0 +1,149 @@
#!/usr/bin/env python3
"""Search court records via CourtListener (Free Law Project).

Covers ~10M federal and state court opinions, plus PACER docket data
where available. Public REST API v4 supports anonymous read access for
search; some endpoints require a token (free at courtlistener.com).

Set COURTLISTENER_TOKEN to authenticate (raises rate limits).
"""
from __future__ import annotations

import argparse
import csv
import os
import sys
import urllib.parse
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent))
from _http import get_json  # noqa: E402

BASE = "https://www.courtlistener.com/api/rest/v4/search/"

COLUMNS = [
    "case_name",
    "court",
    "court_id",
    "date_filed",
    "docket_number",
    "judge",
    "citation",
    "result_type",
    "snippet",
    "absolute_url",
]

SEARCH_TYPES = {
    "opinions": "o",  # Court opinions
    "dockets": "r",  # PACER dockets (may require auth depending on coverage)
    "oral": "oa",  # Oral arguments
    "people": "p",  # Judges / people
    "recap": "r",  # Same as dockets in v4
}


def fetch(
    query: str,
    search_type: str,
    court: str | None,
    date_from: str | None,
    date_to: str | None,
    token: str | None,
    limit: int,
    out_path: str,
) -> int:
    type_code = SEARCH_TYPES.get(search_type, search_type)
    params = {
        "q": query,
        "type": type_code,
    }
    if court:
        params["court"] = court
    if date_from:
        params["filed_after"] = date_from
    if date_to:
        params["filed_before"] = date_to
    headers = {"Authorization": f"Token {token}"} if token else None

    rows: list[dict[str, str]] = []
    next_url: str | None = f"{BASE}?{urllib.parse.urlencode(params)}"
    while next_url and len(rows) < limit:
        try:
            payload = get_json(next_url, headers=headers)
        except Exception as e:  # noqa: BLE001
            print(f"CourtListener error: {e}", file=sys.stderr)
            break
        if not isinstance(payload, dict):
            break
        results = payload.get("results", [])
        for r in results:
            if len(rows) >= limit:
                break
            rows.append(
                {
                    "case_name": r.get("caseName", "") or r.get("case_name", "") or "",
                    "court": r.get("court", "") or "",
                    "court_id": r.get("court_id", "") or "",
                    "date_filed": (r.get("dateFiled", "") or r.get("date_filed", "") or "")[:10],
                    "docket_number": r.get("docketNumber", "") or r.get("docket_number", "") or "",
                    "judge": r.get("judge", "") or "",
                    "citation": "; ".join(r.get("citation", []) or []) if isinstance(r.get("citation"), list) else (r.get("citation") or ""),
                    "result_type": search_type,
                    "snippet": (r.get("snippet", "") or "").replace("\n", " ")[:500],
                    "absolute_url": (
                        f"https://www.courtlistener.com{r.get('absolute_url', '')}"
                        if (r.get("absolute_url") or "").startswith("/")  # tolerate a null value
                        else (r.get("absolute_url") or "")
                    ),
                }
            )
        next_url = payload.get("next")

    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        w = csv.DictWriter(fh, fieldnames=COLUMNS)
        w.writeheader()
        w.writerows(rows)
    if not rows:
        print(
            f"CourtListener: 0 results for type={search_type!r} q={query!r}. "
            "Most private individuals don't appear in published court records "
            "unless they were party to a federal or state appellate case.",
            file=sys.stderr,
        )
    return len(rows)


def main() -> int:
    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
    p.add_argument("--query", required=True, help="Search query (party name, case name, keyword)")
    p.add_argument(
        "--type",
        default="opinions",
        choices=list(SEARCH_TYPES.keys()),
        help="Search type (default: opinions)",
    )
    p.add_argument("--court", help="Court ID filter (e.g. 'nysd' = SDNY, 'scotus' = Supreme Court)")
    p.add_argument("--date-from", help="Filed-after date YYYY-MM-DD")
    p.add_argument("--date-to", help="Filed-before date YYYY-MM-DD")
    p.add_argument("--token", default=os.environ.get("COURTLISTENER_TOKEN"))
    p.add_argument("--limit", type=int, default=100)
    p.add_argument("--out", required=True)
    a = p.parse_args()
    n = fetch(
        query=a.query,
        search_type=a.type,
        court=a.court,
        date_from=a.date_from,
        date_to=a.date_to,
        token=a.token,
        limit=a.limit,
        out_path=a.out,
    )
    print(f"Wrote {n} CourtListener rows to {a.out}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
@ -0,0 +1,162 @@
#!/usr/bin/env python3
"""Search the GDELT 2.0 DOC API for news mentions.

GDELT monitors world news in 100+ languages and indexes the full text.
Free, anonymous, ~15-minute update frequency. Covers ~2015→present.

Useful for surfacing news mentions of a person, company, or topic across
international media — much wider net than Google News.
"""
from __future__ import annotations

import argparse
import csv
import json
import sys
import time
import urllib.parse
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent))
from _http import get_json  # noqa: E402

BASE = "https://api.gdeltproject.org/api/v2/doc/doc"

COLUMNS = [
    "title",
    "url",
    "seen_date",
    "domain",
    "language",
    "source_country",
    "tone",
    "social_image",
]


def fetch(
    query: str,
    mode: str,
    timespan: str | None,
    start_datetime: str | None,
    end_datetime: str | None,
    source_country: str | None,
    source_lang: str | None,
    limit: int,
    out_path: str,
) -> int:
    params: dict[str, str] = {
        "query": query,
        "mode": mode,
        "format": "json",
        "maxrecords": str(min(limit, 250)),
        "sort": "datedesc",
    }
    if timespan:
        params["timespan"] = timespan
    if start_datetime:
        params["startdatetime"] = ("".join(c for c in start_datetime if c.isdigit()) + "000000")[:14]  # YYYYMMDDHHMMSS; bare date starts at 00:00:00
    if end_datetime:
        params["enddatetime"] = ("".join(c for c in end_datetime if c.isdigit()) + "235959")[:14]  # YYYYMMDDHHMMSS; bare date ends at 23:59:59
    if source_country:
        params["sourcecountry"] = source_country
    if source_lang:
        params["sourcelang"] = source_lang

    url = f"{BASE}?{urllib.parse.urlencode(params)}"
    payload: dict | list = {}
    for attempt in range(3):
        try:
            payload = get_json(url)
            break
        except RuntimeError as e:
            # GDELT requires 1 request per 5 seconds; back off and retry.
            if "429" in str(e) and attempt < 2:
                print(
                    f"GDELT throttle hit; sleeping 6s before retry "
                    f"(attempt {attempt + 1}/3)",
                    file=sys.stderr,
                )
                time.sleep(6)
                continue
            print(f"GDELT error: {e}", file=sys.stderr)
            payload = {}
            break
        except Exception as e:  # noqa: BLE001
            print(f"GDELT error: {e}", file=sys.stderr)
            payload = {}
            break

    rows: list[dict[str, str]] = []
    if isinstance(payload, dict):
        articles = payload.get("articles", []) or []
        for a in articles[:limit]:
            seen = (a.get("seendate") or "")
            # GDELT format: 20260319T083000Z → 2026-03-19 08:30:00Z
            if len(seen) == 16 and "T" in seen:
                seen = f"{seen[0:4]}-{seen[4:6]}-{seen[6:8]} {seen[9:11]}:{seen[11:13]}:{seen[13:15]}Z"
            rows.append(
                {
                    "title": (a.get("title") or "").replace("\n", " ").strip(),
                    "url": a.get("url") or "",
                    "seen_date": seen,
                    "domain": a.get("domain") or "",
                    "language": a.get("language") or "",
                    "source_country": a.get("sourcecountry") or "",
                    "tone": str(a.get("tone") or ""),
                    "social_image": a.get("socialimage") or "",
                }
            )

    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        w = csv.DictWriter(fh, fieldnames=COLUMNS)
        w.writeheader()
        w.writerows(rows)
    if not rows:
        print(
            f"GDELT: 0 articles for query={query!r}. "
            "GDELT indexes ~2015→present. Try widening the timespan or "
            "checking the query syntax (https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/).",
            file=sys.stderr,
        )
    return len(rows)


def main() -> int:
    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
    p.add_argument("--query", required=True, help='Search query (supports GDELT operators: quoted phrases, AND/OR/NOT, sourcecountry:, theme:)')
    p.add_argument(
        "--mode",
        default="ArtList",
        choices=["ArtList", "ImageCollage", "TimelineVol", "TimelineTone", "ToneChart"],
        help="GDELT mode (default ArtList for article list)",
    )
    p.add_argument(
        "--timespan",
        help="Relative window: e.g. '1d', '1w', '1m', '3m', '1y' (overrides start/end)",
    )
    p.add_argument("--start", help="Absolute start YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS")
    p.add_argument("--end", help="Absolute end YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS")
    p.add_argument("--source-country", help="2-letter source country (e.g. US, UK)")
    p.add_argument("--source-lang", help="Source language (e.g. English, Spanish)")
    p.add_argument("--limit", type=int, default=100)
    p.add_argument("--out", required=True)
    a = p.parse_args()
    n = fetch(
        query=a.query,
        mode=a.mode,
        timespan=a.timespan,
        start_datetime=a.start,
        end_datetime=a.end,
        source_country=a.source_country,
        source_lang=a.source_lang,
        limit=a.limit,
        out_path=a.out,
    )
    print(f"Wrote {n} GDELT article rows to {a.out}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
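
The DOC API takes `startdatetime` / `enddatetime` as 14-digit `YYYYMMDDHHMMSS` values and reports `seendate` as `YYYYMMDDTHHMMSSZ`. The sketch below shows one way to convert in both directions with `datetime`. The helper names `to_gdelt_datetime` and `from_gdelt_seendate` are hypothetical and are not part of the script; the accepted input shapes are assumed to be the two documented in the `--start` / `--end` help text.

```python
from __future__ import annotations

from datetime import datetime


def to_gdelt_datetime(value: str) -> str:
    """Convert YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS to YYYYMMDDHHMMSS."""
    fmt = "%Y-%m-%dT%H:%M:%S" if "T" in value else "%Y-%m-%d"
    return datetime.strptime(value, fmt).strftime("%Y%m%d%H%M%S")


def from_gdelt_seendate(value: str) -> str:
    """Convert 20260319T083000Z to 2026-03-19 08:30:00Z."""
    parsed = datetime.strptime(value, "%Y%m%dT%H%M%SZ")
    return parsed.strftime("%Y-%m-%d %H:%M:%SZ")


print(to_gdelt_datetime("2026-03-19"))           # 20260319000000
print(to_gdelt_datetime("2026-03-19T08:30:00"))  # 20260319083000
print(from_gdelt_seendate("20260319T083000Z"))   # 2026-03-19 08:30:00Z
```

Parsing with `strptime` rejects malformed input with a `ValueError` instead of sending an invalid timestamp to the API, which is the main reason to prefer it over plain string replacement.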
@ -0,0 +1,234 @@
#!/usr/bin/env python3
|
||||
"""Search ICIJ Offshore Leaks via the bulk CSV database.
|
||||
|
||||
The old reconcile endpoint (https://offshoreleaks.icij.org/reconcile) returns
|
||||
404 — ICIJ has removed it. The remaining stable access path is the public
|
||||
bulk download:
|
||||
|
||||
https://offshoreleaks-data.icij.org/offshoreleaks/csv/full-oldb.LATEST.zip
|
||||
|
||||
~70 MB, ~6 CSVs inside (nodes-entities, nodes-officers, nodes-intermediaries,
|
||||
nodes-addresses, relationships, ...). We cache it under
|
||||
$HERMES_OSINT_CACHE/icij/ (default: ~/.cache/hermes-osint/icij/) and search
|
||||
locally so the agent doesn't re-download for every query.
|
||||
|
||||
Output CSV columns match the original `fetch_icij_offshore.py` contract.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import io
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
import urllib.request
|
||||
import zipfile
|
||||
from pathlib import Path
|
||||
|
||||
BULK_URL = "https://offshoreleaks-data.icij.org/offshoreleaks/csv/full-oldb.LATEST.zip"
|
||||
|
||||
COLUMNS = [
|
||||
"node_id",
|
||||
"name",
|
||||
"node_type",
|
||||
"country_codes",
|
||||
"countries",
|
||||
"jurisdiction",
|
||||
"incorporation_date",
|
||||
"inactivation_date",
|
||||
"source",
|
||||
"entity_url",
|
||||
"connections",
|
||||
]
|
||||
|
||||
|
||||
def _cache_dir() -> Path:
|
||||
base = os.environ.get("HERMES_OSINT_CACHE")
|
||||
if base:
|
||||
return Path(base) / "icij"
|
||||
return Path.home() / ".cache" / "hermes-osint" / "icij"
|
||||
|
||||
|
||||
def _download(dest: Path, force: bool = False) -> Path:
|
||||
"""Download (or reuse cached) ICIJ bulk ZIP."""
|
||||
dest.mkdir(parents=True, exist_ok=True)
|
||||
zip_path = dest / "full-oldb.zip"
|
||||
if zip_path.exists() and not force:
|
||||
# Re-check age: refetch if older than 30 days.
|
||||
age_days = (time.time() - zip_path.stat().st_mtime) / 86400
|
||||
if age_days < 30:
|
||||
return zip_path
|
||||
print(f"Downloading ICIJ bulk database (~70 MB) to {zip_path}", file=sys.stderr)
|
||||
req = urllib.request.Request(
|
||||
BULK_URL,
|
||||
headers={"User-Agent": "hermes-agent osint-investigation skill"},
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=120) as resp: # noqa: S310
|
||||
tmp = zip_path.with_suffix(".zip.tmp")
|
||||
with open(tmp, "wb") as fh:
|
||||
while True:
|
||||
chunk = resp.read(1 << 16)
|
||||
if not chunk:
|
||||
break
|
||||
fh.write(chunk)
|
||||
tmp.replace(zip_path)
|
||||
return zip_path
|
||||
|
||||
|
||||
def _open_csv(zf: zipfile.ZipFile, name_pattern: str):
|
||||
"""Open the first CSV matching name_pattern (case-insensitive substring)."""
|
||||
for info in zf.infolist():
|
||||
if name_pattern.lower() in info.filename.lower() and info.filename.lower().endswith(".csv"):
|
||||
return zf.open(info), info.filename
|
||||
return None, None
|
||||
|
||||
|
||||
def _match(needle_norm: str, hay: str) -> bool:
|
||||
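    # Normalize the haystack the same way as the needle, so punctuation in a
    # stored name ("ACME, INC.") cannot defeat a query normalized to "ACME INC".
    return needle_norm in _normalize_query(hay or "")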
|
||||
|
||||
|
||||
def _normalize_query(s: str) -> str:
|
||||
s = s.upper()
|
||||
s = re.sub(r"[^\w\s]", " ", s)
|
||||
s = re.sub(r"\s+", " ", s).strip()
|
||||
return s
|
||||
|
||||
|
||||
def fetch(
|
||||
entity: str | None,
|
||||
officer: str | None,
|
||||
jurisdiction: str | None,
|
||||
out_path: str,
|
||||
cache_dir: Path,
|
||||
force_refresh: bool = False,
|
||||
limit: int = 500,
|
||||
) -> int:
|
||||
zip_path = _download(cache_dir, force=force_refresh)
|
||||
rows: list[dict[str, str]] = []
|
||||
needles: list[tuple[str, str]] = [] # (kind, normalized needle)
|
||||
if entity:
|
||||
needles.append(("Entity", _normalize_query(entity)))
|
||||
if officer:
|
||||
needles.append(("Officer", _normalize_query(officer)))
|
||||
jur_norm = _normalize_query(jurisdiction) if jurisdiction else None
|
||||
|
||||
targets = [
|
||||
("Entity", "nodes-entities"),
|
||||
("Officer", "nodes-officers"),
|
||||
("Intermediary", "nodes-intermediaries"),
|
||||
]
|
||||
|
||||
with zipfile.ZipFile(zip_path) as zf:
|
||||
for node_type, csv_substring in targets:
|
||||
            # Needle kinds are only "Entity" or "Officer", so this is every needle.
            relevant_needles = [n for (_, n) in needles]
|
||||
# Only scan a CSV if we have a needle that could plausibly match it,
|
||||
# or if we have ONLY a jurisdiction filter.
|
||||
applicable_needles = [n for (k, n) in needles if k == node_type]
|
||||
if needles and not applicable_needles and not jur_norm:
|
||||
continue
|
||||
stream, fname = _open_csv(zf, csv_substring)
|
||||
if not stream:
|
||||
continue
|
||||
with stream:
|
||||
text = io.TextIOWrapper(stream, encoding="utf-8", errors="replace")
|
||||
reader = csv.DictReader(text)
|
||||
for row in reader:
|
||||
name = (row.get("name") or "").strip()
|
||||
if not name:
|
||||
continue
|
||||
name_u = name.upper()
|
||||
matched = False
|
||||
for n in applicable_needles or relevant_needles:
|
||||
if _match(n, name_u):
|
||||
matched = True
|
||||
break
|
||||
if not needles:
|
||||
matched = True # jurisdiction-only sweep
|
||||
if not matched:
|
||||
continue
|
||||
jur = (row.get("jurisdiction_description") or row.get("country_codes") or "").strip()
|
||||
if jur_norm and jur_norm not in jur.upper() and jur_norm not in (row.get("countries") or "").upper():
|
||||
continue
|
||||
node_id = (row.get("node_id") or "").strip()
|
||||
rows.append(
|
||||
{
|
||||
"node_id": node_id,
|
||||
"name": name,
|
||||
"node_type": node_type,
|
||||
"country_codes": row.get("country_codes", "") or "",
|
||||
"countries": row.get("countries", "") or "",
|
||||
"jurisdiction": jur,
|
||||
"incorporation_date": row.get("incorporation_date", "") or "",
|
||||
"inactivation_date": row.get("inactivation_date", "") or "",
|
||||
"source": row.get("sourceID", "") or row.get("source", "") or "",
|
||||
"entity_url": (
|
||||
f"https://offshoreleaks.icij.org/nodes/{node_id}" if node_id else ""
|
||||
),
|
||||
"connections": "",
|
||||
}
|
||||
)
|
||||
if len(rows) >= limit:
|
||||
break
|
||||
if len(rows) >= limit:
|
||||
break
|
||||
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
w = csv.DictWriter(fh, fieldnames=COLUMNS)
|
||||
w.writeheader()
|
||||
w.writerows(rows)
|
||||
if not rows:
|
||||
bits = []
|
||||
if entity:
|
||||
bits.append(f"entity={entity!r}")
|
||||
if officer:
|
||||
bits.append(f"officer={officer!r}")
|
||||
if jurisdiction:
|
||||
bits.append(f"jurisdiction={jurisdiction!r}")
|
||||
print(
|
||||
f"ICIJ: 0 matches for {', '.join(bits)}. "
|
||||
"The bulk database covers offshore leaks (Panama, Paradise, Pandora, "
|
||||
"Bahamas, Offshore Leaks). Most private US individuals are NOT in it.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
return len(rows)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
|
||||
p.add_argument("--entity", help="Search by entity name (substring, case-insensitive)")
|
||||
p.add_argument("--officer", help="Search by officer / individual name (substring, case-insensitive)")
|
||||
p.add_argument("--jurisdiction", help="Filter results by jurisdiction substring")
|
||||
p.add_argument("--limit", type=int, default=500)
|
||||
p.add_argument("--out", required=True)
|
||||
p.add_argument(
|
||||
"--cache-dir",
|
||||
type=Path,
|
||||
default=None,
|
||||
help="Override cache directory (default: $HERMES_OSINT_CACHE/icij or ~/.cache/hermes-osint/icij)",
|
||||
)
|
||||
p.add_argument(
|
||||
"--force-refresh",
|
||||
action="store_true",
|
||||
help="Re-download the bulk ZIP even if a recent cached copy exists.",
|
||||
)
|
||||
a = p.parse_args()
|
||||
if not (a.entity or a.officer or a.jurisdiction):
|
||||
p.error("must supply at least one of --entity / --officer / --jurisdiction")
|
||||
n = fetch(
|
||||
entity=a.entity,
|
||||
officer=a.officer,
|
||||
jurisdiction=a.jurisdiction,
|
||||
out_path=a.out,
|
||||
cache_dir=a.cache_dir or _cache_dir(),
|
||||
force_refresh=a.force_refresh,
|
||||
limit=a.limit,
|
||||
)
|
||||
print(f"Wrote {n} ICIJ Offshore Leaks rows to {a.out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
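The ZIP-streaming search in the script above can be sketched in isolation. This is a minimal illustration, not the script's own code: `normalize` and `search_zip` are hypothetical helper names, and the archive is a tiny synthetic one built in memory rather than the real ICIJ download. It assumes both the query and the stored name should be normalized the same way before the substring test.

```python
import csv
import io
import re
import zipfile


def normalize(text: str) -> str:
    """Uppercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.upper())
    return re.sub(r"\s+", " ", text).strip()


def search_zip(zip_bytes: bytes, csv_substring: str, needle: str) -> list[str]:
    """Return names from the first matching CSV member that contain the needle."""
    wanted = normalize(needle)
    hits: list[str] = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            fname = info.filename.lower()
            if csv_substring in fname and fname.endswith(".csv"):
                with zf.open(info) as raw:
                    # Decode the member lazily instead of extracting it to disk.
                    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                    reader = csv.DictReader(text)
                    hits = [r["name"] for r in reader if wanted in normalize(r.get("name") or "")]
                break
    return hits


# Build a tiny stand-in archive (synthetic rows, not real ICIJ data).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("nodes-entities.csv", 'node_id,name\n1,"Acme, Inc."\n2,Other Ltd\n')

demo_hits = search_zip(buf.getvalue(), "nodes-entities", "acme inc")
print(demo_hits)
```

Reading through `zipfile.ZipFile.open` keeps memory use flat even when the CSV member is large, which is presumably why the script avoids unpacking the archive.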
|
||||
|
|
@ -0,0 +1,203 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Search NYC property records via ACRIS (Automated City Register Information System).
|
||||
|
||||
Uses the city's Socrata-backed open data API. No auth required for read access.
|
||||
|
||||
Datasets:
|
||||
bnx9-e6tj — Real Property Master (one row per recorded document)
|
||||
636b-3b5g — Real Property Parties (names — grantor, grantee, etc.)
|
||||
8h5j-fqxa — Real Property Legal (lot / property identifiers)
|
||||
uqqa-hym2 — Real Property References
|
||||
|
||||
The Parties dataset has the names. We search by name and optionally join to
|
||||
Master to get the doc type and date.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import sys
|
||||
import urllib.parse
|
||||
from pathlib import Path
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).parent))
|
||||
from _http import get_json # noqa: E402
|
||||
|
||||
PARTIES_URL = "https://data.cityofnewyork.us/resource/636b-3b5g.json"
|
||||
MASTER_URL = "https://data.cityofnewyork.us/resource/bnx9-e6tj.json"
|
||||
|
||||
PARTY_TYPE = {
|
||||
"1": "grantor (seller / mortgagor / debtor)",
|
||||
"2": "grantee (buyer / mortgagee / creditor)",
|
||||
"3": "other party",
|
||||
}
|
||||
|
||||
BOROUGH = {
|
||||
"1": "Manhattan",
|
||||
"2": "Bronx",
|
||||
"3": "Brooklyn",
|
||||
"4": "Queens",
|
||||
"5": "Staten Island",
|
||||
}
|
||||
|
||||
COLUMNS = [
|
||||
"document_id",
|
||||
"name",
|
||||
"party_type",
|
||||
"party_role",
|
||||
"address_1",
|
||||
"address_2",
|
||||
"city",
|
||||
"state",
|
||||
"zip",
|
||||
"country",
|
||||
"doc_type",
|
||||
"doc_date",
|
||||
"recorded_date",
|
||||
"borough",
|
||||
"amount",
|
||||
"filing_url",
|
||||
]
|
||||
|
||||
|
||||
def _filing_url(document_id: str) -> str:
|
||||
if not document_id:
|
||||
return ""
|
||||
return (
|
||||
f"https://a836-acris.nyc.gov/DS/DocumentSearch/DocumentImageView?doc_id={document_id}"
|
||||
)
|
||||
|
||||
|
||||
def fetch(
|
||||
name: str | None,
|
||||
address: str | None,
|
||||
party_type: str | None,
|
||||
limit: int,
|
||||
out_path: str,
|
||||
enrich: bool = True,
|
||||
) -> int:
|
||||
if not (name or address):
|
||||
raise SystemExit("must supply --name or --address")
|
||||
|
||||
where_clauses: list[str] = []
|
||||
if name:
|
||||
safe = name.upper().replace("'", "''")
|
||||
where_clauses.append(f"upper(name) like '%{safe}%'")
|
||||
if address:
|
||||
safe_addr = address.upper().replace("'", "''")
|
||||
where_clauses.append(f"upper(address_1) like '%{safe_addr}%'")
|
||||
if party_type and party_type in {"1", "2", "3"}:
|
||||
where_clauses.append(f"party_type='{party_type}'")
|
||||
|
||||
params = {
|
||||
"$where": " AND ".join(where_clauses),
|
||||
"$limit": str(limit),
|
||||
}
|
||||
url = f"{PARTIES_URL}?{urllib.parse.urlencode(params)}"
|
||||
parties = get_json(url)
|
||||
if not isinstance(parties, list):
|
||||
raise SystemExit(f"Unexpected ACRIS response: {parties!r}")
|
||||
|
||||
# Enrich with master record (doc_type, dates, borough, amount).
|
||||
doc_ids: list[str] = sorted({
|
||||
d for d in (p.get("document_id") for p in parties) if d
|
||||
})
|
||||
masters: dict[str, dict] = {}
|
||||
if enrich and doc_ids:
|
||||
# Batch up to 100 doc_ids per request (Socrata IN-list is fine for this).
|
||||
for i in range(0, len(doc_ids), 100):
|
||||
chunk = doc_ids[i : i + 100]
|
||||
id_list = ",".join(f"'{d}'" for d in chunk)
|
||||
master_params = {
|
||||
"$where": f"document_id in ({id_list})",
|
||||
"$limit": "100",
|
||||
}
|
||||
url = f"{MASTER_URL}?{urllib.parse.urlencode(master_params)}"
|
||||
try:
|
||||
rows = get_json(url)
|
||||
except Exception as e: # noqa: BLE001
|
||||
print(f"ACRIS master lookup failed for chunk: {e}", file=sys.stderr)
|
||||
continue
|
||||
if isinstance(rows, list):
|
||||
for r in rows:
|
||||
did = r.get("document_id", "")
|
||||
if did:
|
||||
masters[did] = r
|
||||
|
||||
out_rows: list[dict[str, str]] = []
|
||||
for p in parties:
|
||||
did = p.get("document_id", "") or ""
|
||||
m = masters.get(did, {})
|
||||
out_rows.append(
|
||||
{
|
||||
"document_id": did,
|
||||
"name": p.get("name", "") or "",
|
||||
"party_type": p.get("party_type", "") or "",
|
||||
"party_role": PARTY_TYPE.get(p.get("party_type", ""), ""),
|
||||
"address_1": p.get("address_1", "") or "",
|
||||
"address_2": p.get("address_2", "") or "",
|
||||
"city": p.get("city", "") or "",
|
||||
"state": p.get("state", "") or "",
|
||||
"zip": p.get("zip", "") or "",
|
||||
"country": p.get("country", "") or "",
|
||||
"doc_type": m.get("doc_type", "") or "",
|
||||
"doc_date": (m.get("document_date", "") or "")[:10],
|
||||
"recorded_date": (m.get("recorded_datetime", "") or "")[:10],
|
||||
"borough": BOROUGH.get(m.get("recorded_borough", ""), m.get("recorded_borough", "")),
|
||||
"amount": m.get("document_amt", "") or "",
|
||||
"filing_url": _filing_url(did),
|
||||
}
|
||||
)
|
||||
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
w = csv.DictWriter(fh, fieldnames=COLUMNS)
|
||||
w.writeheader()
|
||||
w.writerows(out_rows)
|
||||
|
||||
if not out_rows:
|
||||
filters = []
|
||||
if name:
|
||||
filters.append(f"name={name!r}")
|
||||
if address:
|
||||
filters.append(f"address={address!r}")
|
||||
print(
|
||||
f"NYC ACRIS: 0 records for {', '.join(filters)}. "
|
||||
"ACRIS covers ONLY NYC (5 boroughs). For property records elsewhere, "
|
||||
"search the relevant county recorder directly.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
return len(out_rows)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
|
||||
p.add_argument("--name", help="Party name substring (case-insensitive)")
|
||||
p.add_argument("--address", help="Address line 1 substring")
|
||||
p.add_argument(
|
||||
"--party-type",
|
||||
choices=["1", "2", "3"],
|
||||
help="Filter party type: 1=grantor (seller/mortgagor), 2=grantee (buyer/mortgagee), 3=other",
|
||||
)
|
||||
p.add_argument("--limit", type=int, default=200)
|
||||
p.add_argument(
|
||||
"--no-enrich",
|
||||
action="store_true",
|
||||
help="Skip the master-document lookup that adds doc_type/date/amount",
|
||||
)
|
||||
p.add_argument("--out", required=True)
|
||||
a = p.parse_args()
|
||||
n = fetch(
|
||||
name=a.name,
|
||||
address=a.address,
|
||||
party_type=a.party_type,
|
||||
limit=a.limit,
|
||||
out_path=a.out,
|
||||
enrich=not a.no_enrich,
|
||||
)
|
||||
print(f"Wrote {n} NYC ACRIS rows to {a.out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
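The query construction used by the ACRIS script can be shown on its own. This is a sketch under stated assumptions: `soql_like` and `build_url` are hypothetical helper names, and only the quote-doubling and `$where` / `$limit` parameters that appear in the script are used. No request is sent.

```python
import urllib.parse


def soql_like(column: str, value: str) -> str:
    """Build a case-insensitive substring filter, doubling single quotes."""
    safe = value.upper().replace("'", "''")
    return f"upper({column}) like '%{safe}%'"


def build_url(base: str, clauses: list[str], limit: int) -> str:
    """Join filter clauses with AND and URL-encode them as query parameters."""
    params = {"$where": " AND ".join(clauses), "$limit": str(limit)}
    return f"{base}?{urllib.parse.urlencode(params)}"


clause = soql_like("name", "O'Brien")
demo_url = build_url(
    "https://data.cityofnewyork.us/resource/636b-3b5g.json",
    [clause, "party_type='2'"],
    10,
)
print(clause)
print(demo_url)
```

Doubling the single quote keeps a name such as O'Brien from terminating the string literal early. The `%` and `_` characters are still treated as LIKE wildcards, which this sketch does not attempt to escape.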
|
||||
|
|
@ -0,0 +1,175 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Fetch OFAC SDN list (CSV format) and normalize.
|
||||
|
||||
Public endpoint: https://www.treasury.gov/ofac/downloads/sdn.csv
|
||||
Format reference: https://ofac.treasury.gov/specially-designated-nationals-and-blocked-persons-list-sdn-human-readable-lists
|
||||
|
||||
The SDN CSV uses a specific 12-column format with no header row:
|
||||
ent_num, sdn_name, sdn_type, program, title, call_sign, vess_type,
|
||||
tonnage, grt, vess_flag, vess_owner, remarks
|
||||
Address and AKA records live in separate files. We fetch all three and join.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import io
|
||||
import sys
|
||||
from collections import defaultdict
|
||||
from pathlib import Path
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).parent))
|
||||
from _http import get # noqa: E402
|
||||
|
||||
SDN_URL = "https://www.treasury.gov/ofac/downloads/sdn.csv"
|
||||
ADD_URL = "https://www.treasury.gov/ofac/downloads/add.csv"
|
||||
ALT_URL = "https://www.treasury.gov/ofac/downloads/alt.csv"
|
||||
|
||||
SDN_COLS = [
|
||||
"ent_num", "sdn_name", "sdn_type", "program", "title",
|
||||
"call_sign", "vess_type", "tonnage", "grt", "vess_flag",
|
||||
"vess_owner", "remarks",
|
||||
]
|
||||
ADD_COLS = [
|
||||
"ent_num", "add_num", "address", "city_state_zip", "country", "add_remarks",
|
||||
]
|
||||
ALT_COLS = [
|
||||
"ent_num", "alt_num", "alt_type", "alt_name", "alt_remarks",
|
||||
]
|
||||
|
||||
COLUMNS = [
|
||||
"entity_id",
|
||||
"name",
|
||||
"entity_type",
|
||||
"program_list",
|
||||
"title",
|
||||
"nationalities",
|
||||
"aka_list",
|
||||
"addresses",
|
||||
"dob",
|
||||
"pob",
|
||||
"remarks",
|
||||
"last_updated",
|
||||
]
|
||||
|
||||
_TYPE_MAP = {
|
||||
"individual": "individual",
|
||||
"entity": "entity",
|
||||
"vessel": "vessel",
|
||||
"aircraft": "aircraft",
|
||||
}
|
||||
|
||||
|
||||
def _read_csv(url: str, columns: list[str]) -> list[dict[str, str]]:
|
||||
body = get(url, timeout=60).decode("latin-1", errors="replace")
|
||||
reader = csv.reader(io.StringIO(body))
|
||||
out = []
|
||||
for row in reader:
|
||||
if not row:
|
||||
continue
|
||||
# Pad/truncate to expected width.
|
||||
row = row[: len(columns)] + [""] * (len(columns) - len(row))
|
||||
out.append(dict(zip(columns, row)))
|
||||
return out
|
||||
|
||||
|
||||
def _strip_quotes(s: str) -> str:
|
||||
s = s.strip()
|
||||
if s.startswith('"') and s.endswith('"'):
|
||||
s = s[1:-1]
|
||||
if s == "-0-":
|
||||
return ""
|
||||
return s
|
||||
|
||||
|
||||
def fetch(
|
||||
program: str | None,
|
||||
entity_type: str | None,
|
||||
out_path: str,
|
||||
) -> int:
|
||||
sdn = _read_csv(SDN_URL, SDN_COLS)
|
||||
addresses = _read_csv(ADD_URL, ADD_COLS)
|
||||
akas = _read_csv(ALT_URL, ALT_COLS)
|
||||
|
||||
addr_by_ent: dict[str, list[str]] = defaultdict(list)
|
||||
for a in addresses:
|
||||
ent = _strip_quotes(a["ent_num"])
|
||||
parts = [
|
||||
_strip_quotes(a[c])
|
||||
for c in ("address", "city_state_zip", "country")
|
||||
if _strip_quotes(a[c])
|
||||
]
|
||||
if parts:
|
||||
addr_by_ent[ent].append(", ".join(parts))
|
||||
|
||||
aka_by_ent: dict[str, list[str]] = defaultdict(list)
|
||||
for k in akas:
|
||||
ent = _strip_quotes(k["ent_num"])
|
||||
name = _strip_quotes(k["alt_name"])
|
||||
if name:
|
||||
aka_by_ent[ent].append(name)
|
||||
|
||||
rows: list[dict[str, str]] = []
|
||||
for r in sdn:
|
||||
ent_num = _strip_quotes(r["ent_num"])
|
||||
        # ent_num is a numeric ID; skip blank rows and any non-numeric trailer line.
        if not ent_num.isdigit():
|
||||
continue
|
||||
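        raw_type = _strip_quotes(r["sdn_type"]).lower()
        # Organizations carry the "-0-" null marker in sdn_type, which
        # _strip_quotes turns into "", so treat an empty type as "entity".
        sdn_type = _TYPE_MAP.get(raw_type, raw_type) if raw_type else "entity"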
|
||||
if entity_type and sdn_type != entity_type:
|
||||
continue
|
||||
progs = _strip_quotes(r["program"])
|
||||
        # Strip whitespace around each program so "A; B" still matches "B".
        if program and program.strip().upper() not in {p.strip() for p in progs.upper().split(";")}:
|
||||
continue
|
||||
remarks = _strip_quotes(r["remarks"])
|
||||
# DOB / POB are commonly embedded in remarks for individuals.
|
||||
dob = ""
|
||||
pob = ""
|
||||
if sdn_type == "individual" and remarks:
|
||||
for chunk in remarks.split(";"):
|
||||
ch = chunk.strip()
|
||||
if ch.upper().startswith("DOB"):
|
||||
dob = ch.split(maxsplit=1)[1] if " " in ch else ""
|
||||
elif ch.upper().startswith("POB"):
|
||||
pob = ch.split(maxsplit=1)[1] if " " in ch else ""
|
||||
rows.append(
|
||||
{
|
||||
"entity_id": ent_num,
|
||||
"name": _strip_quotes(r["sdn_name"]),
|
||||
"entity_type": sdn_type,
|
||||
"program_list": "; ".join(p.strip() for p in progs.split(";") if p.strip()),
|
||||
"title": _strip_quotes(r["title"]),
|
||||
"nationalities": "", # not in this CSV; available in XML format
|
||||
"aka_list": "; ".join(aka_by_ent.get(ent_num, [])),
|
||||
"addresses": "; ".join(addr_by_ent.get(ent_num, [])),
|
||||
"dob": dob,
|
||||
"pob": pob,
|
||||
"remarks": remarks,
|
||||
"last_updated": "",
|
||||
}
|
||||
)
|
||||
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
w = csv.DictWriter(fh, fieldnames=COLUMNS)
|
||||
w.writeheader()
|
||||
w.writerows(rows)
|
||||
return len(rows)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
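    # Keep the docstring's column layout intact in --help, as the sibling scripts do.
    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)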
|
||||
p.add_argument("--program", help="Filter to specific sanctions program (e.g. SDGT, IRAN)")
|
||||
p.add_argument(
|
||||
"--entity-type",
|
||||
choices=["individual", "entity", "vessel", "aircraft"],
|
||||
help="Filter to a specific entity type",
|
||||
)
|
||||
p.add_argument("--out", required=True)
|
||||
a = p.parse_args()
|
||||
n = fetch(program=a.program, entity_type=a.entity_type, out_path=a.out)
|
||||
print(f"Wrote {n} OFAC SDN rows to {a.out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
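The three-file join that the OFAC script performs can be illustrated with in-memory text. This is a minimal sketch, assuming headerless CSV input with a "-0-" null marker as described in the script. The shortened column lists, the helper names `clean` and `read_rows`, and the sample rows are all made up for illustration.

```python
import csv
import io
from collections import defaultdict

SDN_COLS = ["ent_num", "sdn_name", "sdn_type", "program"]
ALT_COLS = ["ent_num", "alt_num", "alt_type", "alt_name"]


def clean(value: str) -> str:
    """Trim whitespace and map the "-0-" null marker to an empty string."""
    value = value.strip()
    return "" if value == "-0-" else value


def read_rows(text: str, columns: list[str]) -> list[dict[str, str]]:
    """Parse headerless CSV text, padding or truncating rows to the column list."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        row = row[: len(columns)] + [""] * (len(columns) - len(row))
        rows.append({c: clean(v) for c, v in zip(columns, row)})
    return rows


# Synthetic sample rows, not real sanctions data.
sdn_text = '1,"EXAMPLE TRADING CO",-0- ,"PROG1"\n2,"DOE, John",individual,"PROG2"\n'
alt_text = '1,10,"aka","EXAMPLE CO"\n1,11,"aka","EX TRADING"\n'

# Group alternate names by entity number, then attach them to each SDN row.
aka_by_ent = defaultdict(list)
for alt in read_rows(alt_text, ALT_COLS):
    if alt["alt_name"]:
        aka_by_ent[alt["ent_num"]].append(alt["alt_name"])

joined = [
    {**r, "aka_list": "; ".join(aka_by_ent.get(r["ent_num"], []))}
    for r in read_rows(sdn_text, SDN_COLS)
]
print(joined)
```

Grouping the secondary file into a dictionary first makes the join a single pass over the main list rather than a nested scan.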
|
||||
|
|
@ -0,0 +1,192 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Search OpenCorporates company registry data.
|
||||
|
||||
OpenCorporates aggregates ~200M companies from 130+ jurisdictions. The
|
||||
public API requires an API token (free tier: 500 calls/month). Set
|
||||
OPENCORPORATES_API_TOKEN in env or pass --token.
|
||||
|
||||
Without a token, this script falls back to scraping the public HTML
|
||||
search page (limited fields, more brittle, no jurisdiction filter).
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
import urllib.parse
|
||||
from pathlib import Path
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).parent))
|
||||
from _http import get, get_json # noqa: E402
|
||||
|
||||
API_URL = "https://api.opencorporates.com/v0.4/companies/search"
|
||||
HTML_URL = "https://opencorporates.com/companies"
|
||||
|
||||
COLUMNS = [
|
||||
"name",
|
||||
"company_number",
|
||||
"jurisdiction_code",
|
||||
"jurisdiction_name",
|
||||
"incorporation_date",
|
||||
"dissolution_date",
|
||||
"company_type",
|
||||
"status",
|
||||
"registered_address",
|
||||
"opencorporates_url",
|
||||
"officers_count",
|
||||
"source",
|
||||
]
|
||||
|
||||
|
||||
def _via_api(query: str, jurisdiction: str | None, token: str, limit: int) -> list[dict]:
|
||||
params = {
|
||||
"q": query,
|
||||
"api_token": token,
|
||||
"per_page": str(min(limit, 100)),
|
||||
}
|
||||
if jurisdiction:
|
||||
params["jurisdiction_code"] = jurisdiction
|
||||
url = f"{API_URL}?{urllib.parse.urlencode(params)}"
|
||||
payload = get_json(url)
|
||||
if not isinstance(payload, dict):
|
||||
return []
|
||||
results = payload.get("results", {}).get("companies", []) or []
|
||||
return [r.get("company", {}) for r in results if isinstance(r, dict)]
|
||||
|
||||
|
||||
def _via_html(query: str, limit: int) -> list[dict]:
|
||||
"""Best-effort HTML fallback when no API token is available."""
|
||||
params = {"q": query, "utf8": "✓"}
|
||||
url = f"{HTML_URL}?{urllib.parse.urlencode(params)}"
|
||||
body = get(url, user_agent="Mozilla/5.0 hermes-osint").decode("utf-8", errors="replace")
|
||||
# Each result is in <li class="company"> ... </li> with name, url, status
|
||||
pattern = re.compile(
|
||||
r'<li[^>]*class="[^"]*company[^"]*"[^>]*>.*?'
|
||||
r'<a[^>]+href="(?P<url>/companies/[^"]+)"[^>]*>(?P<name>[^<]+)</a>'
|
||||
r'(?:.*?<span[^>]*class="[^"]*jurisdiction[^"]*"[^>]*>(?P<jur>[^<]+)</span>)?'
|
||||
r"(?:.*?<dt[^>]*>(?:Company\s+Number|Number)</dt>\s*<dd[^>]*>(?P<num>[^<]+)</dd>)?",
|
||||
re.DOTALL | re.IGNORECASE,
|
||||
)
|
||||
out = []
|
||||
for m in pattern.finditer(body):
|
||||
if len(out) >= limit:
|
||||
break
|
||||
url_path = m.group("url").strip()
|
||||
out.append(
|
||||
{
|
||||
"name": (m.group("name") or "").strip(),
|
||||
"opencorporates_url": f"https://opencorporates.com{url_path}",
|
||||
"jurisdiction_code": (m.group("jur") or "").strip(),
|
||||
"company_number": (m.group("num") or "").strip(),
|
||||
"_via": "html",
|
||||
}
|
||||
)
|
||||
return out
|
||||
|
||||
|
||||
def fetch(
|
||||
query: str,
|
||||
jurisdiction: str | None,
|
||||
token: str | None,
|
||||
limit: int,
|
||||
out_path: str,
|
||||
) -> int:
|
||||
if token:
|
||||
try:
|
||||
companies = _via_api(query, jurisdiction, token, limit)
|
||||
source_tag = "api"
|
||||
except Exception as e: # noqa: BLE001
|
||||
print(
|
||||
f"OpenCorporates API call failed ({e}); falling back to HTML.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
companies = _via_html(query, limit)
|
||||
source_tag = "html-fallback"
|
||||
else:
|
||||
print(
|
||||
"OPENCORPORATES_API_TOKEN not set — using HTML fallback (limited fields). "
|
||||
"Get a free token at https://opencorporates.com/api_accounts/new",
|
||||
file=sys.stderr,
|
||||
)
|
||||
companies = _via_html(query, limit)
|
||||
source_tag = "html"
|
||||
|
||||
rows: list[dict[str, str]] = []
|
||||
for c in companies[:limit]:
|
||||
if c.get("_via") == "html":
|
||||
rows.append(
|
||||
{
|
||||
"name": c.get("name", ""),
|
||||
"company_number": c.get("company_number", ""),
|
||||
"jurisdiction_code": c.get("jurisdiction_code", ""),
|
||||
"jurisdiction_name": "",
|
||||
"incorporation_date": "",
|
||||
"dissolution_date": "",
|
||||
"company_type": "",
|
||||
"status": "",
|
||||
"registered_address": "",
|
||||
"opencorporates_url": c.get("opencorporates_url", ""),
|
||||
"officers_count": "",
|
||||
"source": source_tag,
|
||||
}
|
||||
)
|
||||
continue
|
||||
addr = c.get("registered_address_in_full") or ""
|
||||
rows.append(
|
||||
{
|
||||
"name": c.get("name", "") or "",
|
||||
"company_number": c.get("company_number", "") or "",
|
||||
"jurisdiction_code": c.get("jurisdiction_code", "") or "",
|
||||
"jurisdiction_name": "",
|
||||
"incorporation_date": c.get("incorporation_date", "") or "",
|
||||
"dissolution_date": c.get("dissolution_date", "") or "",
|
||||
"company_type": c.get("company_type", "") or "",
|
||||
                # "inactive" is a boolean flag; emit a readable string rather than True.
                "status": c.get("current_status") or ("inactive" if c.get("inactive") else ""),
|
||||
"registered_address": addr,
|
||||
"opencorporates_url": c.get("opencorporates_url", "") or "",
|
||||
"officers_count": str(c.get("officers", {}).get("total_count", "") if c.get("officers") else ""),
|
||||
"source": source_tag,
|
||||
}
|
||||
)
|
||||
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
w = csv.DictWriter(fh, fieldnames=COLUMNS)
|
||||
w.writeheader()
|
||||
w.writerows(rows)
|
||||
if not rows:
|
||||
print(
|
||||
f"OpenCorporates: 0 matches for query={query!r}"
|
||||
f"{f' jurisdiction={jurisdiction!r}' if jurisdiction else ''}.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
return len(rows)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
|
||||
p.add_argument("--query", required=True, help="Company name search")
|
||||
p.add_argument(
|
||||
"--jurisdiction",
|
||||
help="Jurisdiction code, e.g. 'us_ny', 'us_de', 'gb', 'sg' (lowercased OpenCorporates style)",
|
||||
)
|
||||
p.add_argument("--limit", type=int, default=50)
|
||||
p.add_argument("--token", default=os.environ.get("OPENCORPORATES_API_TOKEN"))
|
||||
p.add_argument("--out", required=True)
|
||||
a = p.parse_args()
|
||||
n = fetch(
|
||||
query=a.query,
|
||||
jurisdiction=a.jurisdiction,
|
||||
token=a.token,
|
||||
limit=a.limit,
|
||||
out_path=a.out,
|
||||
)
|
||||
print(f"Wrote {n} OpenCorporates rows to {a.out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
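The payload handling in the API path can be sketched separately from any network call. This is an illustration only: `flatten_companies` is a hypothetical helper, and the sample payload is invented to match the `results` → `companies` → `company` nesting that the script reads. It is not a captured API response.

```python
def flatten_companies(payload: object) -> list[dict[str, str]]:
    """Pull name, number and a string status out of a search payload."""
    if not isinstance(payload, dict):
        return []
    wrapped = (payload.get("results") or {}).get("companies") or []
    rows = []
    for item in wrapped:
        company = item.get("company") if isinstance(item, dict) else None
        if not isinstance(company, dict):
            continue
        # Prefer the textual status; fall back to the boolean "inactive" flag.
        status = company.get("current_status") or ("inactive" if company.get("inactive") else "")
        rows.append(
            {
                "name": company.get("name") or "",
                "company_number": company.get("company_number") or "",
                "status": status,
            }
        )
    return rows


# Hypothetical payload for illustration only.
sample = {
    "results": {
        "companies": [
            {"company": {"name": "EXAMPLE LTD", "company_number": "123", "current_status": "Active"}},
            {"company": {"name": "OLD CO", "company_number": "456", "current_status": None, "inactive": True}},
        ]
    }
}
demo_rows = flatten_companies(sample)
print(demo_rows)
```

Guarding each level with `isinstance` or `or {}` lets a malformed or empty response degrade to an empty result instead of raising.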
|
||||
|
|
@ -0,0 +1,184 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Fetch SEC EDGAR filings index for a given CIK or company name.
|
||||
|
||||
SEC requires a User-Agent header with contact info. Set SEC_USER_AGENT,
|
||||
e.g. SEC_USER_AGENT="Research example@example.com".
|
||||
|
||||
Filings JSON is published at:
|
||||
https://data.sec.gov/submissions/CIK<10-digit-padded>.json
|
||||
|
||||
Company lookup uses:
|
||||
https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&company=<name>&output=atom
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).parent))
|
||||
from _http import get, get_json # noqa: E402
|
||||
|
||||
SUBMISSIONS_URL = "https://data.sec.gov/submissions/CIK{cik}.json"
|
||||
COLUMNS = [
|
||||
"cik",
|
||||
"company_name",
|
||||
"form_type",
|
||||
"filing_date",
|
||||
"accession_number",
|
||||
"primary_document",
|
||||
"filing_url",
|
||||
"reporting_period",
|
||||
]
|
||||
|
||||
|
||||
def _ua() -> str:
|
||||
ua = os.environ.get("SEC_USER_AGENT", "").strip()
|
||||
if not ua:
|
||||
raise SystemExit(
|
||||
"SEC requires a User-Agent with contact info. "
|
||||
"Set SEC_USER_AGENT='Your Name your@email'."
|
||||
)
|
||||
return ua
|
||||
|
||||
|
||||
def _resolve_cik(company: str) -> tuple[str, str]:
|
||||
"""Resolve a company name to a CIK via EDGAR's atom feed.
|
||||
|
||||
    Returns (cik, resolved_company_name). The match may be an individual
    filer (Form 3/4/5 only) rather than a company. That is not reported
    here; fetch() prints a note about it when a type filter matches nothing.
|
||||
"""
|
||||
url = "https://www.sec.gov/cgi-bin/browse-edgar"
|
||||
params = {"action": "getcompany", "company": company, "output": "atom", "owner": "include"}
|
||||
body = get(url, params=params, user_agent=_ua()).decode("utf-8", errors="replace")
|
||||
m = re.search(r"CIK=(\d{10})", body)
|
||||
if not m:
|
||||
raise SystemExit(f"Could not resolve CIK for company={company!r}")
|
||||
cik = m.group(1)
|
||||
name_m = re.search(r"<title>([^<]+)\s*\((\d{10})\)</title>", body)
|
||||
resolved = name_m.group(1).strip() if name_m else ""
|
||||
return cik, resolved
|
||||
|
||||
|
||||
def fetch(
|
||||
cik: str | None,
|
||||
company: str | None,
|
||||
types: list[str],
|
||||
since: str | None,
|
||||
out_path: str,
|
||||
) -> int:
|
||||
resolved_name = ""
|
||||
if not cik and company:
|
||||
try:
|
||||
cik, resolved_name = _resolve_cik(company) # type: ignore[assignment]
|
||||
except SystemExit as e:
|
||||
# Write empty CSV with header so downstream tools still work,
|
||||
# and tell the user clearly.
|
||||
print(f"SEC EDGAR: {e}", file=sys.stderr)
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
csv.DictWriter(fh, fieldnames=COLUMNS).writeheader()
|
||||
return 0
|
||||
if resolved_name:
|
||||
print(
|
||||
f"Resolved company={company!r} → CIK {cik} ({resolved_name})",
|
||||
file=sys.stderr,
|
||||
)
|
||||
if not cik:
|
||||
raise SystemExit("must supply --cik or --company")
|
||||
cik = cik.zfill(10)
|
||||
url = SUBMISSIONS_URL.format(cik=cik)
|
||||
payload = get_json(url, user_agent=_ua())
|
||||
if not isinstance(payload, dict):
|
||||
raise SystemExit(f"Unexpected EDGAR response shape for CIK {cik}")
|
||||
name = payload.get("name", "")
|
||||
recent = (payload.get("filings", {}) or {}).get("recent", {}) or {}
|
||||
form = recent.get("form", [])
|
||||
date = recent.get("filingDate", [])
|
||||
accession = recent.get("accessionNumber", [])
|
||||
primary_doc = recent.get("primaryDocument", [])
|
||||
period = recent.get("reportDate", [])
|
||||
|
||||
# Histogram of available filing types — useful for surfacing why a filter
|
||||
# returned 0 (e.g. user asked for 10-K on an individual Form 4 filer).
|
||||
type_hist: dict[str, int] = {}
|
||||
for ftype in form:
|
||||
type_hist[ftype] = type_hist.get(ftype, 0) + 1
|
||||
|
||||
type_set = {t.strip().upper() for t in types} if types else None
|
||||
rows: list[dict[str, str]] = []
|
||||
for i, ftype in enumerate(form):
|
||||
if type_set and ftype.upper() not in type_set:
|
||||
continue
|
||||
fdate = date[i] if i < len(date) else ""
|
||||
if since and fdate and fdate < since:
|
||||
continue
|
||||
acc = accession[i] if i < len(accession) else ""
|
||||
pdoc = primary_doc[i] if i < len(primary_doc) else ""
|
||||
acc_nodash = acc.replace("-", "")
|
||||
filing_url = (
|
||||
f"https://www.sec.gov/Archives/edgar/data/{int(cik)}/{acc_nodash}/{pdoc}"
|
||||
if acc and pdoc
|
||||
else ""
|
||||
)
|
||||
rows.append(
|
||||
{
|
||||
"cik": cik,
|
||||
"company_name": name,
|
||||
"form_type": ftype,
|
||||
"filing_date": fdate,
|
||||
"accession_number": acc,
|
||||
"primary_document": pdoc,
|
||||
"filing_url": filing_url,
|
||||
"reporting_period": period[i] if i < len(period) else "",
|
||||
}
|
||||
)
|
||||
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
w = csv.DictWriter(fh, fieldnames=COLUMNS)
|
||||
w.writeheader()
|
||||
w.writerows(rows)
|
||||
|
||||
if not rows and type_hist:
|
||||
top = sorted(type_hist.items(), key=lambda kv: -kv[1])[:8]
|
||||
hist_str = ", ".join(f"{t}={n}" for t, n in top)
|
||||
print(
|
||||
f"Warning: SEC EDGAR CIK {cik} ({name}) has {sum(type_hist.values())} "
|
||||
f"recent filings but NONE match types={types}. "
|
||||
f"Available form types: {hist_str}.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
# Insider-filer heuristic: only Form 3/4/5 → individual person, not a company.
|
||||
company_types = {"10-K", "10-Q", "8-K", "20-F", "DEF 14A", "S-1"}
|
||||
if not (set(type_hist.keys()) & company_types):
|
||||
print(
|
||||
f"Note: CIK {cik} appears to be an INDIVIDUAL filer "
|
||||
f"(insider Form 3/4/5 only), not a corporate registrant. "
|
||||
f"The resolver may have matched an officer/director named "
|
||||
f"{company!r} rather than a company.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
return len(rows)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
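    # Keep the docstring's URL lines unwrapped in --help, as the sibling scripts do.
    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)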
|
||||
p.add_argument("--cik", help="Central Index Key (will be 10-digit zero-padded)")
|
||||
p.add_argument("--company", help="Resolve to CIK by company name")
|
||||
p.add_argument("--types", default="", help="Comma-separated form types (e.g. 10-K,10-Q,8-K)")
|
||||
p.add_argument("--since", help="Skip filings before YYYY-MM-DD")
|
||||
p.add_argument("--out", required=True)
|
||||
a = p.parse_args()
|
||||
types = [t.strip() for t in (a.types or "").split(",") if t.strip()]
|
||||
n = fetch(cik=a.cik, company=a.company, types=types, since=a.since, out_path=a.out)
|
||||
print(f"Wrote {n} EDGAR filing rows to {a.out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
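
The filing-URL construction in the EDGAR script above can be sketched in isolation. This is a minimal sketch, assuming the same Archives path layout the script uses; `edgar_filing_url` is a hypothetical helper name and the CIK, accession number, and document name below are made-up placeholders, not real filings.

```python
def edgar_filing_url(cik: str, accession: str, primary_doc: str) -> str:
    """Build an EDGAR Archives URL the way the script above does.

    Hypothetical helper for illustration; returns "" when a piece is missing.
    """
    if not (accession and primary_doc):
        return ""
    # The Archives path uses the CIK without zero padding and the
    # accession number without dashes.
    acc_nodash = accession.replace("-", "")
    return f"https://www.sec.gov/Archives/edgar/data/{int(cik)}/{acc_nodash}/{primary_doc}"


# Placeholder inputs (not a real filing).
url = edgar_filing_url("0000012345", "0000012345-24-000001", "example.htm")
missing = edgar_filing_url("0000012345", "", "example.htm")
```

Returning an empty string for incomplete rows keeps the CSV column present without emitting a broken link.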
|
||||
|
|
@@ -0,0 +1,146 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Fetch Senate Lobbying Disclosure (LD-1 / LD-2) filings.
|
||||
|
||||
Anonymous: 120 req/hour. Token (SENATE_LDA_TOKEN): 1200 req/hour.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import os
|
||||
import sys
|
||||
import time
|
||||
from pathlib import Path
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).parent))
|
||||
from _http import get_json # noqa: E402
|
||||
|
||||
ENDPOINT = "https://lda.senate.gov/api/v1/filings/"
|
||||
COLUMNS = [
|
||||
"filing_uuid",
|
||||
"filing_type",
|
||||
"filing_year",
|
||||
"filing_period",
|
||||
"registrant_name",
|
||||
"registrant_id",
|
||||
"client_name",
|
||||
"client_id",
|
||||
"client_general_description",
|
||||
"income",
|
||||
"expenses",
|
||||
"lobbyists",
|
||||
"issues",
|
||||
"government_entities",
|
||||
"filing_date",
|
||||
]
|
||||
|
||||
|
||||
def fetch(
|
||||
client: str | None,
|
||||
registrant: str | None,
|
||||
year: int,
|
||||
token: str | None,
|
||||
out_path: str,
|
||||
page_size: int = 100,
|
||||
max_pages: int = 25,
|
||||
) -> int:
|
||||
params: dict = {"filing_year": year, "page_size": page_size}
|
||||
if client:
|
||||
params["client_name"] = client
|
||||
if registrant:
|
||||
params["registrant_name"] = registrant
|
||||
|
||||
headers = {"Authorization": f"Token {token}"} if token else None
|
||||
rows: list[dict[str, str]] = []
|
||||
url = ENDPOINT
|
||||
page = 0
|
||||
while page < max_pages:
|
||||
try:
|
||||
payload = get_json(url, params=params if page == 0 else None, headers=headers)
|
||||
except Exception as e: # noqa: BLE001
|
||||
print(f"Senate LDA error on page {page + 1}: {e}", file=sys.stderr)
|
||||
break
|
||||
if not isinstance(payload, dict):
|
||||
break
|
||||
results = payload.get("results", [])
|
||||
for r in results:
|
||||
client_obj = r.get("client") or {}
|
||||
registrant_obj = r.get("registrant") or {}
|
||||
lobbying_activities = r.get("lobbying_activities") or []
|
||||
lobbyists = []
|
||||
issues = []
|
||||
entities = []
|
||||
for la in lobbying_activities:
|
||||
for lob in la.get("lobbyists") or []:
|
||||
lob_obj = lob.get("lobbyist") or {}
|
||||
name = " ".join(
|
||||
x for x in (lob_obj.get("first_name", ""), lob_obj.get("last_name", "")) if x
|
||||
)
|
||||
if name:
|
||||
lobbyists.append(name)
|
||||
desc = la.get("description") or ""
|
||||
if desc:
|
||||
issues.append(desc)
|
||||
for ge in la.get("government_entities") or []:
|
||||
nm = ge.get("name") or ""
|
||||
if nm:
|
||||
entities.append(nm)
|
||||
rows.append(
|
||||
{
|
||||
"filing_uuid": r.get("filing_uuid", "") or "",
|
||||
"filing_type": r.get("filing_type", "") or "",
|
||||
"filing_year": str(r.get("filing_year", "") or year),
|
||||
"filing_period": r.get("filing_period", "") or "",
|
||||
"registrant_name": registrant_obj.get("name", "") or "",
|
||||
"registrant_id": str(registrant_obj.get("id", "") or ""),
|
||||
"client_name": client_obj.get("name", "") or "",
|
||||
"client_id": str(client_obj.get("id", "") or ""),
|
||||
"client_general_description": client_obj.get("general_description", "") or "",
|
||||
"income": str(r.get("income", "") or ""),
|
||||
"expenses": str(r.get("expenses", "") or ""),
|
||||
"lobbyists": "; ".join(sorted(set(lobbyists))),
|
||||
"issues": "; ".join(issues),
|
||||
"government_entities": "; ".join(sorted(set(entities))),
|
||||
"filing_date": (r.get("dt_posted") or "")[:10],
|
||||
}
|
||||
)
|
||||
next_url = payload.get("next")
|
||||
if not next_url:
|
||||
break
|
||||
url = next_url
|
||||
page += 1
|
||||
time.sleep(1.0 if not token else 0.3)
|
||||
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
w = csv.DictWriter(fh, fieldnames=COLUMNS)
|
||||
w.writeheader()
|
||||
w.writerows(rows)
|
||||
return len(rows)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__)
|
||||
p.add_argument("--client", help="Client name filter")
|
||||
p.add_argument("--registrant", help="Registrant (lobbying firm) name filter")
|
||||
p.add_argument("--year", type=int, default=2024)
|
||||
p.add_argument("--token", default=os.environ.get("SENATE_LDA_TOKEN"))
|
||||
p.add_argument("--max-pages", type=int, default=25)
|
||||
p.add_argument("--out", required=True)
|
||||
a = p.parse_args()
|
||||
if not (a.client or a.registrant):
|
||||
p.error("must supply at least one of --client / --registrant")
|
||||
n = fetch(
|
||||
client=a.client,
|
||||
registrant=a.registrant,
|
||||
year=a.year,
|
||||
token=a.token,
|
||||
out_path=a.out,
|
||||
max_pages=a.max_pages,
|
||||
)
|
||||
print(f"Wrote {n} Senate LDA rows to {a.out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
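
The `results` / `next` pagination loop in the Senate LDA script above can be sketched as a small generator. This is a sketch under stated assumptions: `iter_paginated` and `fetch_page` are hypothetical names, with `fetch_page` standing in for the script's `get_json`, and the in-memory pages replace real HTTP calls.

```python
from typing import Callable, Iterator


def iter_paginated(
    first_url: str,
    fetch_page: Callable[[str], dict],
    max_pages: int = 25,
) -> Iterator[dict]:
    """Yield items from a `results`/`next` style paginated JSON API.

    `fetch_page` is a hypothetical callable (URL -> decoded JSON dict).
    """
    url = first_url
    for _ in range(max_pages):
        payload = fetch_page(url)
        yield from payload.get("results", [])
        # Follow the server-provided link; stop when there is none.
        url = payload.get("next")
        if not url:
            break


# Tiny in-memory stand-in for the API: two pages, then no `next` link.
_PAGES = {
    "page1": {"results": [{"id": 1}, {"id": 2}], "next": "page2"},
    "page2": {"results": [{"id": 3}], "next": None},
}
items = list(iter_paginated("page1", _PAGES.__getitem__))
first_page_only = list(iter_paginated("page1", _PAGES.__getitem__, max_pages=1))
```

Capping the loop with `max_pages` mirrors the script's guard against unbounded crawling when a filter matches a very large result set.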
|
||||
|
|
@@ -0,0 +1,170 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Fetch federal contracts/awards from USAspending.gov API v2.
|
||||
|
||||
No auth required. POST to /api/v2/search/spending_by_award/ with filters.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import json
|
||||
import sys
|
||||
import time
|
||||
import urllib.request
|
||||
from pathlib import Path
|
||||
|
||||
ENDPOINT = "https://api.usaspending.gov/api/v2/search/spending_by_award/"
|
||||
COLUMNS = [
|
||||
"award_id",
|
||||
"recipient_name",
|
||||
"recipient_uei",
|
||||
"recipient_duns",
|
||||
"recipient_parent_name",
|
||||
"recipient_state",
|
||||
"awarding_agency",
|
||||
"awarding_sub_agency",
|
||||
"award_type",
|
||||
"award_amount",
|
||||
"award_date",
|
||||
"period_of_performance_start",
|
||||
"period_of_performance_end",
|
||||
"naics_code",
|
||||
"psc_code",
|
||||
"competition_extent",
|
||||
"description",
|
||||
]
|
||||
|
||||
# USAspending result field names requested from the API; fetch() maps them onto COLUMNS.
|
||||
_FIELDS = [
|
||||
"Award ID",
|
||||
"Recipient Name",
|
||||
"Recipient UEI",
|
||||
"Recipient DUNS Number",
|
||||
"Recipient Parent Name",
|
||||
"Recipient State Code",
|
||||
"Awarding Agency",
|
||||
"Awarding Sub Agency",
|
||||
"Award Type",
|
||||
"Award Amount",
|
||||
"Start Date",
|
||||
"End Date",
|
||||
"NAICS Code",
|
||||
"PSC Code",
|
||||
"Type of Set Aside",
|
||||
"Description",
|
||||
]
|
||||
|
||||
|
||||
def _post(body: dict) -> dict:
|
||||
req = urllib.request.Request(
|
||||
ENDPOINT,
|
||||
data=json.dumps(body).encode("utf-8"),
|
||||
headers={"Content-Type": "application/json", "User-Agent": "hermes-agent osint-investigation"},
|
||||
method="POST",
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=60) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def fetch(
|
||||
recipient: str | None,
|
||||
agency: str | None,
|
||||
fy: int,
|
||||
sole_source_only: bool,
|
||||
out_path: str,
|
||||
page_size: int = 100,
|
||||
max_pages: int = 20,
|
||||
) -> int:
|
||||
filters: dict = {
|
||||
"time_period": [{"start_date": f"{fy - 1}-10-01", "end_date": f"{fy}-09-30"}],
|
||||
# Contracts only by default; adjust award_type_codes for grants/loans.
|
||||
"award_type_codes": ["A", "B", "C", "D"],
|
||||
}
|
||||
if recipient:
|
||||
filters["recipient_search_text"] = [recipient]
|
||||
if agency:
|
||||
filters["agencies"] = [{"type": "awarding", "tier": "toptier", "name": agency}]
|
||||
|
||||
rows: list[dict[str, str]] = []
|
||||
page = 1
|
||||
while page <= max_pages:
|
||||
body = {
|
||||
"filters": filters,
|
||||
"fields": _FIELDS,
|
||||
"page": page,
|
||||
"limit": page_size,
|
||||
"sort": "Award Amount",
|
||||
"order": "desc",
|
||||
}
|
||||
try:
|
||||
payload = _post(body)
|
||||
except Exception as e: # noqa: BLE001
|
||||
print(f"USAspending error on page {page}: {e}", file=sys.stderr)
|
||||
break
|
||||
results = payload.get("results", [])
|
||||
if not results:
|
||||
break
|
||||
for r in results:
|
||||
set_aside = r.get("Type of Set Aside", "") or ""
|
||||
if sole_source_only and "sole" not in set_aside.lower():
|
||||
continue
|
||||
rows.append(
|
||||
{
|
||||
"award_id": r.get("Award ID", "") or "",
|
||||
"recipient_name": r.get("Recipient Name", "") or "",
|
||||
"recipient_uei": r.get("Recipient UEI", "") or "",
|
||||
"recipient_duns": r.get("Recipient DUNS Number", "") or "",
|
||||
"recipient_parent_name": r.get("Recipient Parent Name", "") or "",
|
||||
"recipient_state": r.get("Recipient State Code", "") or "",
|
||||
"awarding_agency": r.get("Awarding Agency", "") or "",
|
||||
"awarding_sub_agency": r.get("Awarding Sub Agency", "") or "",
|
||||
"award_type": r.get("Award Type", "") or "",
|
||||
"award_amount": str(r.get("Award Amount", "") or ""),
|
||||
"award_date": r.get("Start Date", "") or "",
|
||||
"period_of_performance_start": r.get("Start Date", "") or "",
|
||||
"period_of_performance_end": r.get("End Date", "") or "",
|
||||
"naics_code": str(r.get("NAICS Code", "") or ""),
|
||||
"psc_code": str(r.get("PSC Code", "") or ""),
|
||||
"competition_extent": set_aside,
|
||||
"description": r.get("Description", "") or "",
|
||||
}
|
||||
)
|
||||
meta = payload.get("page_metadata", {})
|
||||
if not meta.get("hasNext"):
|
||||
break
|
||||
page += 1
|
||||
time.sleep(0.5)
|
||||
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
w = csv.DictWriter(fh, fieldnames=COLUMNS)
|
||||
w.writeheader()
|
||||
w.writerows(rows)
|
||||
return len(rows)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__)
|
||||
p.add_argument("--recipient", help="Recipient name search")
|
||||
p.add_argument("--agency", help="Awarding agency (top-tier)")
|
||||
p.add_argument("--fy", type=int, default=2024, help="Federal fiscal year")
|
||||
p.add_argument("--sole-source-only", action="store_true")
|
||||
p.add_argument("--max-pages", type=int, default=20)
|
||||
p.add_argument("--out", required=True)
|
||||
a = p.parse_args()
|
||||
if not (a.recipient or a.agency):
|
||||
p.error("must supply at least one of --recipient / --agency")
|
||||
n = fetch(
|
||||
recipient=a.recipient,
|
||||
agency=a.agency,
|
||||
fy=a.fy,
|
||||
sole_source_only=a.sole_source_only,
|
||||
out_path=a.out,
|
||||
max_pages=a.max_pages,
|
||||
)
|
||||
print(f"Wrote {n} USAspending rows to {a.out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
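
The `time_period` filter in the USAspending script above relies on the US federal fiscal year running from October 1 of the previous calendar year to September 30. A minimal sketch of that mapping follows; `fiscal_year_window` is a hypothetical helper name, not part of the script.

```python
import datetime as dt


def fiscal_year_window(fy: int) -> tuple[dt.date, dt.date]:
    """Return (start, end) dates of US federal fiscal year `fy`."""
    return dt.date(fy - 1, 10, 1), dt.date(fy, 9, 30)


start, end = fiscal_year_window(2024)
# Same shape as the `time_period` entry the script puts in its filters.
time_period = [{"start_date": start.isoformat(), "end_date": end.isoformat()}]
```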
|
||||
|
|
@@ -0,0 +1,142 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Search the Internet Archive Wayback Machine via the CDX server.
|
||||
|
||||
The CDX API indexes ~900B+ archived web pages. Anonymous read access,
|
||||
no auth required. Useful for finding deleted / changed pages by URL,
|
||||
domain, or substring match.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import sys
|
||||
import urllib.parse
|
||||
from pathlib import Path
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).parent))
|
||||
from _http import get_json # noqa: E402
|
||||
|
||||
BASE = "https://web.archive.org/cdx/search/cdx"
|
||||
|
||||
COLUMNS = [
|
||||
"url",
|
||||
"timestamp",
|
||||
"wayback_url",
|
||||
"mimetype",
|
||||
"status",
|
||||
"digest",
|
||||
"length",
|
||||
]
|
||||
|
||||
|
||||
def fetch(
|
||||
url_or_host: str,
|
||||
match_type: str,
|
||||
from_date: str | None,
|
||||
to_date: str | None,
|
||||
status: str | None,
|
||||
mime: str | None,
|
||||
collapse: str | None,
|
||||
limit: int,
|
||||
out_path: str,
|
||||
) -> int:
|
||||
params: dict[str, str] = {
|
||||
"url": url_or_host,
|
||||
"matchType": match_type,
|
||||
"output": "json",
|
||||
"limit": str(limit),
|
||||
}
|
||||
if from_date:
|
||||
params["from"] = from_date.replace("-", "")
|
||||
if to_date:
|
||||
params["to"] = to_date.replace("-", "")
|
||||
    # CDX accepts repeated `filter` params, so collect them in a list and
    # encode with doseq=True instead of overwriting a single dict key.
    filters: list[str] = []
    if status:
        filters.append(f"statuscode:{status}")
    if mime:
        filters.append(f"mimetype:{mime}")
    if collapse:
        params["collapse"] = collapse

    query: dict[str, str | list[str]] = dict(params)
    if filters:
        query["filter"] = filters
    url = f"{BASE}?{urllib.parse.urlencode(query, doseq=True)}"
|
||||
try:
|
||||
payload = get_json(url)
|
||||
except Exception as e: # noqa: BLE001
|
||||
print(f"Wayback CDX error: {e}", file=sys.stderr)
|
||||
payload = []
|
||||
|
||||
rows: list[dict[str, str]] = []
|
||||
if isinstance(payload, list) and len(payload) > 1:
|
||||
header = payload[0]
|
||||
idx = {h: i for i, h in enumerate(header)}
|
||||
for entry in payload[1:]:
|
||||
ts = entry[idx["timestamp"]] if "timestamp" in idx else ""
|
||||
orig = entry[idx["original"]] if "original" in idx else ""
|
||||
rows.append(
|
||||
{
|
||||
"url": orig,
|
||||
"timestamp": ts,
|
||||
"wayback_url": f"https://web.archive.org/web/{ts}/{orig}" if ts and orig else "",
|
||||
"mimetype": entry[idx["mimetype"]] if "mimetype" in idx else "",
|
||||
"status": entry[idx["statuscode"]] if "statuscode" in idx else "",
|
||||
"digest": entry[idx["digest"]] if "digest" in idx else "",
|
||||
"length": entry[idx["length"]] if "length" in idx else "",
|
||||
}
|
||||
)
|
||||
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
w = csv.DictWriter(fh, fieldnames=COLUMNS)
|
||||
w.writeheader()
|
||||
w.writerows(rows)
|
||||
if not rows:
|
||||
print(
|
||||
f"Wayback Machine: 0 captures for {url_or_host!r} matchType={match_type}.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
return len(rows)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
|
||||
p.add_argument("--url", required=True, help="URL or host to look up in the archive")
|
||||
p.add_argument(
|
||||
"--match",
|
||||
default="exact",
|
||||
choices=["exact", "prefix", "host", "domain"],
|
||||
help=(
|
||||
"exact: this URL only. "
|
||||
"prefix: this URL's path-prefix. "
|
||||
"host: any URL on this host. "
|
||||
"domain: any URL on this domain or subdomains."
|
||||
),
|
||||
)
|
||||
p.add_argument("--from-date", help="Earliest capture YYYY-MM-DD")
|
||||
p.add_argument("--to-date", help="Latest capture YYYY-MM-DD")
|
||||
p.add_argument("--status", help="HTTP status filter (e.g. 200)")
|
||||
p.add_argument("--mime", help="MIME type filter (e.g. text/html)")
|
||||
p.add_argument(
|
||||
"--collapse",
|
||||
help="Collapse adjacent identical entries (e.g. 'digest' for unique-content captures)",
|
||||
)
|
||||
p.add_argument("--limit", type=int, default=200)
|
||||
p.add_argument("--out", required=True)
|
||||
a = p.parse_args()
|
||||
n = fetch(
|
||||
url_or_host=a.url,
|
||||
match_type=a.match,
|
||||
from_date=a.from_date,
|
||||
to_date=a.to_date,
|
||||
status=a.status,
|
||||
mime=a.mime,
|
||||
collapse=a.collapse,
|
||||
limit=a.limit,
|
||||
out_path=a.out,
|
||||
)
|
||||
print(f"Wrote {n} Wayback capture rows to {a.out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
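
The `timestamp` column written by the Wayback script above is the raw CDX value. A minimal sketch of turning it into a `datetime` and a replay URL follows, assuming the usual 14-digit `YYYYMMDDhhmmss` form; `parse_cdx_timestamp` and `wayback_url` are hypothetical helper names, and the capture shown is a made-up example.

```python
import datetime as dt


def parse_cdx_timestamp(ts: str) -> dt.datetime:
    """Parse a 14-digit CDX timestamp (YYYYMMDDhhmmss) into a datetime."""
    return dt.datetime.strptime(ts, "%Y%m%d%H%M%S")


def wayback_url(ts: str, original: str) -> str:
    """Build a replay URL the same way the script does; "" if a part is missing."""
    return f"https://web.archive.org/web/{ts}/{original}" if ts and original else ""


# Made-up capture for illustration.
captured = parse_cdx_timestamp("20200102030405")
replay = wayback_url("20200102030405", "http://example.com/")
```

Parsing the timestamp makes it straightforward to sort captures or compare them against dates from the other sources.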
|
||||
|
|
@@ -0,0 +1,267 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Search Wikipedia + Wikidata for an entity (person, company, place, concept).
|
||||
|
||||
Two free APIs:
|
||||
- Wikipedia OpenSearch + REST summary endpoint for narrative bio
|
||||
- Wikidata Action API (wbgetentities) for structured facts (birth, employer, occupation, etc.)
|
||||
|
||||
Both are anonymous-access. Useful for resolving who-is-this-entity questions
|
||||
and surfacing cross-references that other sources can join against.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import json
|
||||
import re
|
||||
import sys
|
||||
import urllib.parse
|
||||
from pathlib import Path
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).parent))
|
||||
from _http import get_json # noqa: E402
|
||||
|
||||
WP_OPENSEARCH = "https://en.wikipedia.org/w/api.php"
|
||||
WP_SUMMARY = "https://en.wikipedia.org/api/rest_v1/page/summary/"
|
||||
WD_ACTION = "https://www.wikidata.org/w/api.php"
|
||||
|
||||
COLUMNS = [
|
||||
"source",
|
||||
"label",
|
||||
"description",
|
||||
"qid",
|
||||
"wikipedia_title",
|
||||
"wikipedia_url",
|
||||
"wikidata_url",
|
||||
"instance_of",
|
||||
"country",
|
||||
"occupation",
|
||||
"employer",
|
||||
"date_of_birth",
|
||||
"place_of_birth",
|
||||
"summary",
|
||||
]
|
||||
|
||||
|
||||
def _wp_search(query: str, limit: int) -> list[dict]:
|
||||
params = {
|
||||
"action": "opensearch",
|
||||
"search": query,
|
||||
"limit": str(min(limit, 20)),
|
||||
"format": "json",
|
||||
}
|
||||
url = f"{WP_OPENSEARCH}?{urllib.parse.urlencode(params)}"
|
||||
data = get_json(url)
|
||||
if not isinstance(data, list) or len(data) < 4:
|
||||
return []
|
||||
titles, descs, urls = data[1], data[2], data[3]
|
||||
out = []
|
||||
for i, title in enumerate(titles):
|
||||
out.append(
|
||||
{
|
||||
"title": title,
|
||||
"description": descs[i] if i < len(descs) else "",
|
||||
"url": urls[i] if i < len(urls) else "",
|
||||
}
|
||||
)
|
||||
return out
|
||||
|
||||
|
||||
def _wp_summary(title: str) -> dict:
|
||||
"""Pull the REST summary for a title — short bio, image, type."""
|
||||
url = f"{WP_SUMMARY}{urllib.parse.quote(title.replace(' ', '_'))}"
|
||||
try:
|
||||
return get_json(url) # type: ignore[return-value]
|
||||
except Exception as e: # noqa: BLE001
|
||||
print(f"Wikipedia summary lookup for {title!r} failed: {e}", file=sys.stderr)
|
||||
return {}
|
||||
|
||||
|
||||
def _wd_lookup_by_qid(qid: str) -> dict:
|
||||
"""Pull common facts for a QID via Wikidata's Action API (no SPARQL).
|
||||
|
||||
The Action API is far more lenient on rate-limits than the SPARQL Query
|
||||
Service. We get claims as QIDs and then resolve labels in one batch call.
|
||||
"""
|
||||
# Properties of interest. The Action API returns claims as QIDs or
|
||||
# typed literals, so the slot mapping is local-only.
|
||||
interesting = {
|
||||
"P31": "instance_of",
|
||||
"P17": "country", # for orgs / places
|
||||
"P27": "country", # for individuals (country of citizenship)
|
||||
"P106": "occupation",
|
||||
"P108": "employer",
|
||||
"P569": "date_of_birth",
|
||||
"P19": "place_of_birth",
|
||||
}
|
||||
params = {
|
||||
"action": "wbgetentities",
|
||||
"ids": qid,
|
||||
"props": "claims",
|
||||
"format": "json",
|
||||
}
|
||||
url = f"{WD_ACTION}?{urllib.parse.urlencode(params)}"
|
||||
try:
|
||||
data = get_json(url)
|
||||
except Exception as e: # noqa: BLE001
|
||||
print(f"Wikidata wbgetentities for {qid} failed: {e}", file=sys.stderr)
|
||||
return {}
|
||||
if not isinstance(data, dict):
|
||||
return {}
|
||||
claims = (data.get("entities", {}).get(qid, {}) or {}).get("claims", {}) or {}
|
||||
|
||||
# Collect raw values (QIDs or literals) and remember which slot each
|
||||
# came from. Date literals come back as ISO strings; QIDs need a label
|
||||
# resolution pass.
|
||||
qid_to_slots: dict[str, list[str]] = {}
|
||||
facts: dict[str, list[str]] = {}
|
||||
for prop_id, slot in interesting.items():
|
||||
for claim in claims.get(prop_id, []) or []:
|
||||
v = (claim.get("mainsnak", {}) or {}).get("datavalue", {}) or {}
|
||||
vtype = v.get("type")
|
||||
value = v.get("value")
|
||||
if vtype == "wikibase-entityid" and isinstance(value, dict):
|
||||
vqid = value.get("id", "")
|
||||
if vqid:
|
||||
qid_to_slots.setdefault(vqid, [])
|
||||
if slot not in qid_to_slots[vqid]:
|
||||
qid_to_slots[vqid].append(slot)
|
||||
elif vtype == "time" and isinstance(value, dict):
|
||||
raw = value.get("time", "") or ""
|
||||
# +1955-10-28T00:00:00Z → 1955-10-28
|
||||
m = re.search(r"[+-]?(\d{4})-(\d{2})-(\d{2})", raw)
|
||||
if m:
|
||||
facts.setdefault(slot, []).append(
|
||||
f"{m.group(1)}-{m.group(2)}-{m.group(3)}"
|
||||
)
|
||||
elif vtype == "string":
|
||||
facts.setdefault(slot, []).append(str(value))
|
||||
|
||||
# Resolve labels for all referenced QIDs in one batch (up to 50 at a time).
|
||||
qids = list(qid_to_slots)
|
||||
for i in range(0, len(qids), 50):
|
||||
batch = qids[i : i + 50]
|
||||
params = {
|
||||
"action": "wbgetentities",
|
||||
"ids": "|".join(batch),
|
||||
"props": "labels",
|
||||
"languages": "en",
|
||||
"format": "json",
|
||||
}
|
||||
url = f"{WD_ACTION}?{urllib.parse.urlencode(params)}"
|
||||
try:
|
||||
data = get_json(url)
|
||||
except Exception as e: # noqa: BLE001
|
||||
print(f"Wikidata label batch failed: {e}", file=sys.stderr)
|
||||
continue
|
||||
if not isinstance(data, dict):
|
||||
continue
|
||||
ents = data.get("entities", {}) or {}
|
||||
for vqid, ent in ents.items():
|
||||
label = (ent.get("labels", {}).get("en", {}) or {}).get("value", "") or vqid
|
||||
for slot in qid_to_slots.get(vqid, []):
|
||||
facts.setdefault(slot, []).append(label)
|
||||
|
||||
# Deduplicate per slot, preserving order.
|
||||
deduped: dict[str, list[str]] = {}
|
||||
for slot, vals in facts.items():
|
||||
seen = set()
|
||||
out = []
|
||||
for v in vals:
|
||||
if v in seen:
|
||||
continue
|
||||
seen.add(v)
|
||||
out.append(v)
|
||||
deduped[slot] = out
|
||||
return deduped
|
||||
|
||||
|
||||
def _wd_qid_for_title(title: str) -> str:
|
||||
"""Get the Wikidata QID associated with a Wikipedia article title."""
|
||||
params = {
|
||||
"action": "query",
|
||||
"format": "json",
|
||||
"prop": "pageprops",
|
||||
"ppprop": "wikibase_item",
|
||||
"titles": title,
|
||||
"redirects": 1,
|
||||
}
|
||||
url = f"{WP_OPENSEARCH}?{urllib.parse.urlencode(params)}"
|
||||
try:
|
||||
data = get_json(url)
|
||||
except Exception: # noqa: BLE001
|
||||
return ""
|
||||
if not isinstance(data, dict):
|
||||
return ""
|
||||
pages = data.get("query", {}).get("pages", {}) or {}
|
||||
for page in pages.values():
|
||||
qid = (page.get("pageprops") or {}).get("wikibase_item", "")
|
||||
if qid:
|
||||
return qid
|
||||
return ""
|
||||
|
||||
|
||||
def fetch(query: str, limit: int, no_wikidata: bool, out_path: str) -> int:
|
||||
hits = _wp_search(query, limit)
|
||||
rows: list[dict[str, str]] = []
|
||||
for hit in hits[:limit]:
|
||||
title = hit.get("title", "")
|
||||
if not title:
|
||||
continue
|
||||
summary = _wp_summary(title)
|
||||
qid = _wd_qid_for_title(title) if not no_wikidata else ""
|
||||
facts: dict = {}
|
||||
if qid:
|
||||
facts = _wd_lookup_by_qid(qid)
|
||||
rows.append(
|
||||
{
|
||||
"source": "wikipedia+wikidata" if qid else "wikipedia",
|
||||
"label": title,
|
||||
"description": (summary.get("description") or hit.get("description") or "").strip(),
|
||||
"qid": qid,
|
||||
"wikipedia_title": title,
|
||||
"wikipedia_url": hit.get("url", ""),
|
||||
"wikidata_url": f"https://www.wikidata.org/wiki/{qid}" if qid else "",
|
||||
"instance_of": "; ".join(facts.get("instance_of", [])),
|
||||
"country": "; ".join(facts.get("country", [])),
|
||||
"occupation": "; ".join(facts.get("occupation", [])),
|
||||
"employer": "; ".join(facts.get("employer", [])),
|
||||
"date_of_birth": "; ".join(facts.get("date_of_birth", []))[:10] if facts.get("date_of_birth") else "",
|
||||
"place_of_birth": "; ".join(facts.get("place_of_birth", [])),
|
||||
"summary": (summary.get("extract") or "").replace("\n", " ")[:1000],
|
||||
}
|
||||
)
|
||||
|
||||
Path(out_path).parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(out_path, "w", newline="", encoding="utf-8") as fh:
|
||||
w = csv.DictWriter(fh, fieldnames=COLUMNS)
|
||||
w.writeheader()
|
||||
w.writerows(rows)
|
||||
if not rows:
|
||||
print(
|
||||
f"Wikipedia: 0 articles for query={query!r}. "
|
||||
"Private individuals not notable enough for a Wikipedia article "
|
||||
"won't appear here (the bar is real).",
|
||||
file=sys.stderr,
|
||||
)
|
||||
return len(rows)
|
||||
|
||||
|
||||
def main() -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
|
||||
p.add_argument("--query", required=True, help="Entity name (person, company, place, concept)")
|
||||
p.add_argument("--limit", type=int, default=5)
|
||||
p.add_argument(
|
||||
"--no-wikidata",
|
||||
action="store_true",
|
||||
help="Skip the Wikidata enrichment (faster, less detail)",
|
||||
)
|
||||
p.add_argument("--out", required=True)
|
||||
a = p.parse_args()
|
||||
n = fetch(query=a.query, limit=a.limit, no_wikidata=a.no_wikidata, out_path=a.out)
|
||||
print(f"Wrote {n} Wikipedia/Wikidata rows to {a.out}")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
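
The date handling in `_wd_lookup_by_qid` above reduces a Wikidata time literal to `YYYY-MM-DD`. A minimal sketch of that step on its own, mirroring the script's regex; `wikidata_date` is a hypothetical helper name.

```python
import re


def wikidata_date(raw: str) -> str:
    """Reduce a Wikidata time literal such as '+1955-10-28T00:00:00Z' to 'YYYY-MM-DD'.

    Hypothetical helper mirroring the regex used in the script; returns ""
    when no date is found.
    """
    m = re.search(r"[+-]?(\d{4})-(\d{2})-(\d{2})", raw or "")
    return "-".join(m.groups()) if m else ""


full = wikidata_date("+1955-10-28T00:00:00Z")
empty = wikidata_date("")
```

Values stored with year or month precision may carry `00` for the missing parts, so downstream code that parses the result as a calendar date should be prepared for that.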
|
||||
|
|
@@ -0,0 +1,253 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Permutation test for donation/contract timing correlation (stdlib-only).
|
||||
|
||||
For each (donor, vendor) pair, compute the mean number of days between each
|
||||
donation and the nearest contract award. Then redraw contract award dates uniformly at random
|
||||
N times within the observation window and compute the same statistic. The
|
||||
one-tailed p-value is the fraction of permutations whose mean is <= the
|
||||
observed mean (smaller distance = tighter clustering).
|
||||
|
||||
Adapted from ShinMegamiBoson/OpenPlanter (MIT). Differences:
|
||||
- Pure stdlib (no pandas / numpy)
|
||||
- Domain-agnostic (no snow-vendor / CRITICAL-politician filter)
|
||||
- Configurable column names via flags
|
||||
- Optional --seed for reproducibility
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import csv
|
||||
import datetime as dt
|
||||
import json
|
||||
import math
|
||||
import random
|
||||
import statistics
|
||||
from collections import defaultdict
|
||||
from pathlib import Path
|
||||
|
||||
_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y/%m/%d", "%m-%d-%Y", "%Y%m%d")
|
||||
|
||||
|
||||
def parse_date(raw: str) -> dt.date | None:
|
||||
if not raw:
|
||||
return None
|
||||
raw = raw.strip()
|
||||
for fmt in _DATE_FORMATS:
|
||||
try:
|
||||
return dt.datetime.strptime(raw, fmt).date()
|
||||
except ValueError:
|
||||
continue
|
||||
return None
|
||||
|
||||
|
||||
def _read(path: str) -> list[dict[str, str]]:
|
||||
with open(path, newline="", encoding="utf-8") as fh:
|
||||
return list(csv.DictReader(fh))
|
||||
|
||||
|
||||
def _nearest_distance(donation_date: dt.date, awards: list[dt.date]) -> int:
|
||||
"""Absolute days to nearest award date."""
|
||||
return min(abs((donation_date - a).days) for a in awards)
|
||||
|
||||
|
||||
def _permute(
|
||||
awards_count: int,
|
||||
donations: list[dt.date],
|
||||
date_min: dt.date,
|
||||
date_max: dt.date,
|
||||
rng: random.Random,
|
||||
) -> float:
|
||||
"""One permutation: draw uniform random award dates, compute mean nearest-distance."""
|
||||
span_days = (date_max - date_min).days or 1
|
||||
rand_awards = [
|
||||
date_min + dt.timedelta(days=rng.randint(0, span_days))
|
||||
for _ in range(awards_count)
|
||||
]
|
||||
distances = [_nearest_distance(d, rand_awards) for d in donations]
|
||||
return statistics.mean(distances)
|
||||
|
||||
|
||||
def analyze(
|
||||
donations_path: str,
|
||||
donation_date_col: str,
|
||||
donation_amount_col: str,
|
||||
donation_donor_col: str,
|
||||
donation_recipient_col: str,
|
||||
contracts_path: str,
|
||||
contract_date_col: str,
|
||||
contract_vendor_col: str,
|
||||
cross_links_path: str | None,
|
||||
n_permutations: int = 1000,
|
||||
min_donations: int = 3,
|
||||
p_threshold: float = 0.05,
|
||||
seed: int | None = None,
|
||||
out_path: str = "timing.json",
|
||||
) -> dict:
|
||||
rng = random.Random(seed)
|
||||
|
||||
donations = _read(donations_path)
|
||||
contracts = _read(contracts_path)
|
||||
|
||||
# Allow optional join through cross_links — donor (left) ↔ vendor (right).
|
||||
# When present, donor strings get mapped to matched vendor names so the
|
||||
# vendor-date index lookup actually finds the contracts.
|
||||
matched_pairs: set[tuple[str, str]] | None = None
|
||||
donor_to_vendors: dict[str, set[str]] = defaultdict(set)
|
||||
if cross_links_path:
|
||||
matched_pairs = set()
|
||||
for row in _read(cross_links_path):
|
||||
left = row.get("left_name", "")
|
||||
right = row.get("right_name", "")
|
||||
matched_pairs.add((left, right))
|
||||
donor_to_vendors[left].add(right)
|
||||
|
||||
# Index contract dates by vendor name.
|
||||
vendor_to_award_dates: dict[str, list[dt.date]] = defaultdict(list)
|
||||
all_award_dates: list[dt.date] = []
|
||||
for row in contracts:
|
||||
d = parse_date(row.get(contract_date_col, ""))
|
||||
if not d:
|
||||
continue
|
||||
vendor_to_award_dates[row.get(contract_vendor_col, "").strip()].append(d)
|
||||
all_award_dates.append(d)
|
||||
|
||||
if not all_award_dates:
|
||||
raise SystemExit(f"No parseable dates in {contracts_path}/{contract_date_col}")
|
||||
global_min = min(all_award_dates)
|
||||
global_max = max(all_award_dates)
|
||||
|
||||
# Group donations by (donor, recipient).
|
||||
grouped: dict[tuple[str, str], list[tuple[dt.date, float]]] = defaultdict(list)
|
||||
for row in donations:
|
||||
donor = row.get(donation_donor_col, "").strip()
|
||||
recip = row.get(donation_recipient_col, "").strip()
|
||||
d = parse_date(row.get(donation_date_col, ""))
|
||||
try:
|
||||
amt = float(row.get(donation_amount_col, "0") or 0)
|
||||
except ValueError:
|
||||
amt = 0.0
|
||||
if not (donor and recip and d):
|
||||
continue
|
||||
grouped[(donor, recip)].append((d, amt))
|
||||
|
||||
results = []
|
||||
skipped = 0
|
||||
for (donor, recip), records in grouped.items():
|
||||
if len(records) < min_donations:
|
||||
skipped += 1
|
||||
continue
|
||||
# Only test if donor appears in cross-links (when provided). The
|
||||
# (donor, candidate) tuple itself is NOT what's in matched_pairs —
|
||||
# cross_links pairs are (donor, vendor). We use the cross-link to
|
||||
# map donor → vendor name(s) so the vendor-date index resolves.
|
||||
if matched_pairs is not None and donor not in donor_to_vendors:
|
||||
skipped += 1
|
||||
continue
|
||||
# Try direct donor→awards first, then recipient→awards, then cross-link vendor names.
|
||||
award_dates = list(vendor_to_award_dates.get(donor, []))
|
||||
if not award_dates:
|
||||
award_dates = list(vendor_to_award_dates.get(recip, []))
|
||||
if not award_dates and donor_to_vendors.get(donor):
|
||||
for vendor_name in donor_to_vendors[donor]:
|
||||
award_dates.extend(vendor_to_award_dates.get(vendor_name, []))
|
||||
if not award_dates:
|
||||
skipped += 1
|
||||
continue
|
||||
|
||||
donation_dates = [d for (d, _) in records]
|
||||
observed = statistics.mean(
|
||||
_nearest_distance(d, award_dates) for d in donation_dates
|
||||
)
|
||||
|
||||
permuted_means = [
|
||||
_permute(len(award_dates), donation_dates, global_min, global_max, rng)
|
||||
for _ in range(n_permutations)
|
||||
]
|
||||
p_value = sum(1 for m in permuted_means if m <= observed) / n_permutations
|
||||
null_mean = statistics.mean(permuted_means)
|
||||
null_std = statistics.pstdev(permuted_means) or 1.0
|
||||
effect_size = (null_mean - observed) / null_std
|
||||
|
||||
results.append(
|
||||
{
|
||||
"donor": donor,
|
||||
"recipient": recip,
|
||||
"n_donations": len(records),
|
||||
"n_award_dates": len(award_dates),
|
||||
"observed_mean_days": round(observed, 2),
|
||||
"null_mean_days": round(null_mean, 2),
|
||||
"p_value": round(p_value, 4),
|
||||
"effect_size_sd": round(effect_size, 2),
|
||||
"significant": p_value < p_threshold,
|
||||
"total_donation_amount": round(sum(a for (_, a) in records), 2),
|
||||
}
|
||||
)
|
||||
|
||||
results.sort(key=lambda r: r["p_value"])
|
||||
|
||||
payload = {
|
||||
"metadata": {
|
||||
"n_permutations": n_permutations,
|
||||
"min_donations": min_donations,
|
||||
"p_threshold": p_threshold,
|
||||
"seed": seed,
|
||||
"n_pairs_tested": len(results),
|
||||
"n_pairs_skipped": skipped,
|
||||
"n_significant": sum(1 for r in results if r["significant"]),
|
||||
"observation_window": [global_min.isoformat(), global_max.isoformat()],
|
||||
},
|
||||
"results": results,
|
||||
}
|
||||
|
||||
Path(out_path).write_text(json.dumps(payload, indent=2))
|
||||
return payload
|
||||
|
||||
|
||||
def main() -> int:
|
||||
p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
|
||||
p.add_argument("--donations", required=True)
|
||||
p.add_argument("--donation-date-col", required=True)
|
||||
p.add_argument("--donation-amount-col", required=True)
|
||||
p.add_argument("--donation-donor-col", required=True)
|
||||
p.add_argument("--donation-recipient-col", required=True)
|
||||
p.add_argument("--contracts", required=True)
|
||||
p.add_argument("--contract-date-col", required=True)
|
||||
p.add_argument("--contract-vendor-col", required=True)
|
||||
p.add_argument(
|
||||
"--cross-links",
|
||||
help="Optional cross_links.csv to restrict (donor, vendor) pairs",
|
||||
)
|
||||
p.add_argument("--permutations", type=int, default=1000)
|
||||
p.add_argument("--min-donations", type=int, default=3)
|
||||
p.add_argument("--p-threshold", type=float, default=0.05)
|
||||
p.add_argument("--seed", type=int)
|
||||
p.add_argument("--out", default="timing.json")
|
||||
a = p.parse_args()
|
||||
|
||||
payload = analyze(
|
||||
donations_path=a.donations,
|
||||
donation_date_col=a.donation_date_col,
|
||||
donation_amount_col=a.donation_amount_col,
|
||||
donation_donor_col=a.donation_donor_col,
|
||||
donation_recipient_col=a.donation_recipient_col,
|
||||
contracts_path=a.contracts,
|
||||
contract_date_col=a.contract_date_col,
|
||||
contract_vendor_col=a.contract_vendor_col,
|
||||
cross_links_path=a.cross_links,
|
||||
n_permutations=a.permutations,
|
||||
min_donations=a.min_donations,
|
||||
p_threshold=a.p_threshold,
|
||||
seed=a.seed,
|
||||
out_path=a.out,
|
||||
)
|
||||
meta = payload["metadata"]
|
||||
print(
|
||||
f"Tested {meta['n_pairs_tested']} pairs ({meta['n_pairs_skipped']} skipped). "
|
||||
f"Significant (p<{meta['p_threshold']}): {meta['n_significant']}. "
|
||||
f"Wrote {a.out}"
|
||||
)
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
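The per-pair statistics computed in `analyze()` can be illustrated in isolation. What follows is a minimal sketch under stated assumptions, not the script's actual helpers: `nearest_distance` and `permuted_mean` are hypothetical stand-ins for `_nearest_distance` and `_permute`, whose bodies are not shown in this diff, and the null model (redrawing the same number of award dates uniformly over the observation window) is an assumption inferred from the `_permute` call signature.

```python
import random
import statistics
from datetime import date, timedelta


def nearest_distance(d, award_dates):
    """Days between one donation date and the closest award date."""
    return min(abs((d - a).days) for a in award_dates)


def permuted_mean(n_awards, donation_dates, lo, hi, rng):
    """Mean nearest distance after redrawing award dates uniformly in [lo, hi].

    Assumed null model; the real _permute helper may differ.
    """
    span = (hi - lo).days
    fake_awards = [lo + timedelta(days=rng.randint(0, span)) for _ in range(n_awards)]
    return statistics.mean(nearest_distance(d, fake_awards) for d in donation_dates)


def timing_test(donation_dates, award_dates, lo, hi, n_permutations=1000, seed=0):
    rng = random.Random(seed)
    observed = statistics.mean(nearest_distance(d, award_dates) for d in donation_dates)
    null = [
        permuted_mean(len(award_dates), donation_dates, lo, hi, rng)
        for _ in range(n_permutations)
    ]
    # One-sided test: how often is a random layout at least as tight as the observed one?
    p_value = sum(1 for m in null if m <= observed) / n_permutations
    null_mean = statistics.mean(null)
    null_std = statistics.pstdev(null) or 1.0  # guard against a zero-variance null
    return {
        "observed_mean_days": observed,
        "p_value": p_value,
        "effect_size_sd": (null_mean - observed) / null_std,
    }


# Made-up dates: each donation lands one or two days before an award.
donations = [date(2024, 3, 1), date(2024, 6, 10), date(2024, 9, 20)]
awards = [date(2024, 3, 3), date(2024, 6, 12), date(2024, 9, 21)]
result = timing_test(donations, awards, date(2024, 1, 1), date(2024, 12, 31))
```

With this estimator a p-value of exactly 0 is possible when no permutation is as tight as the observed layout, so it is best read as "below 1 / n_permutations" rather than as a literal zero.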
|
||||
|
|
@ -0,0 +1,59 @@
|
|||
# <Source Name>

## 1. Summary

What this data source is, who publishes it, why it matters for investigations.

## 2. Access Methods

- API endpoint(s)
- Bulk download URLs
- Auth requirements (none / API key / OAuth)
- Rate limits

## 3. Data Schema

Key fields, record types, table relationships. List the columns the fetch
script emits.

## 4. Coverage

- Jurisdiction
- Time range
- Update frequency
- Data volume (rows / GB)

## 5. Cross-Reference Potential

Which other sources can be joined and on what keys. Be explicit:

- `<source>` ↔ `<column>` (join key: <normalized entity name / EIN / CIK / etc.>)

## 6. Data Quality

Known issues — formatting inconsistencies, missing fields, duplicates,
historical gaps, redaction.

## 7. Acquisition Script

Path: `scripts/fetch_<source>.py`

Example:

```bash
python3 SKILL_DIR/scripts/fetch_<source>.py --<filter> <value> --out data/<source>.csv
```

Output CSV columns: `<col1>, <col2>, ...`

## 8. Legal & Licensing

- Public records law / FOIA basis
- Terms of use / acceptable use
- Attribution requirements (if any)

## 9. References

- Official docs: <url>
- Data dictionary: <url>
- Related coverage / journalism: <url>
|
||||
14
plugins/kanban/dashboard/dist/index.js
vendored
|
|
@ -68,7 +68,7 @@
|
|||
const FALLBACK_COLUMN_HELP = {
|
||||
triage: "Raw ideas — a specifier will flesh out the spec",
|
||||
todo: "Waiting on dependencies or unassigned",
|
||||
-ready: "Assigned and waiting for a dispatcher tick",
+ready: "Dependencies satisfied; assign a profile to dispatch",
|
||||
running: "Claimed by a worker — in-flight",
|
||||
blocked: "Worker asked for human input",
|
||||
done: "Completed",
|
||||
|
|
@ -2048,6 +2048,7 @@
|
|||
};
|
||||
|
||||
const progress = t.progress;
|
||||
const needsAssignee = t.status === "ready" && !t.assignee;
|
||||
|
||||
return h("div", {
|
||||
ref: cardRef,
|
||||
|
|
@ -2118,6 +2119,13 @@
|
|||
title: `${progress.done} of ${progress.total} child tasks done`,
|
||||
}, `${progress.done}/${progress.total}`)
|
||||
: null,
|
||||
needsAssignee
|
||||
? h(Badge, {
|
||||
variant: "outline",
|
||||
className: "hermes-kanban-needs-assignee",
|
||||
title: tx(i18n, "needsAssigneeHint", "Dependencies are satisfied, but the dispatcher skips this task until you assign a profile."),
|
||||
}, tx(i18n, "needsAssignee", "Needs assignee"))
|
||||
: null,
|
||||
),
|
||||
h("div", { className: "hermes-kanban-card-title" },
|
||||
t.title || tx(i18n, "untitled", "(untitled)")),
|
||||
|
|
@ -2126,7 +2134,9 @@
|
|||
? h("span", { className: "hermes-kanban-assignee",
|
||||
title: `Assigned to Hermes profile @${t.assignee}` }, "@", t.assignee)
|
||||
: h("span", { className: "hermes-kanban-unassigned",
|
||||
-title: "No profile assigned. The dispatcher will pick one from available profiles when the task is Ready." },
+title: needsAssignee
+  ? tx(i18n, "needsAssigneeHint", "Dependencies are satisfied, but the dispatcher skips this task until you assign a profile.")
+  : "No profile assigned." },
|
||||
tx(i18n, "unassigned", "unassigned")),
|
||||
t.comment_count > 0
|
||||
? h("span", { className: "hermes-kanban-count",
|
||||
|
|
|
|||
8
plugins/kanban/dashboard/dist/style.css
vendored
|
|
@ -280,6 +280,14 @@
|
|||
padding: 0.05rem 0.3rem !important;
|
||||
}
|
||||
|
||||
.hermes-kanban-needs-assignee {
|
||||
font-size: 0.6rem !important;
|
||||
padding: 0.05rem 0.3rem !important;
|
||||
background: color-mix(in srgb, var(--color-warning, #d4b348) 16%, transparent);
|
||||
border-color: color-mix(in srgb, var(--color-warning, #d4b348) 45%, var(--color-border));
|
||||
color: var(--color-foreground);
|
||||
}
|
||||
|
||||
.hermes-kanban-assignee {
|
||||
font-weight: 500;
|
||||
color: color-mix(in srgb, var(--color-foreground) 80%, var(--color-muted-foreground));
|
||||
|
|
|
|||
|
|
@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
|||
|
||||
[project]
|
||||
name = "hermes-agent"
|
||||
-version = "0.13.0"
+version = "0.14.0"
|
||||
description = "The self-improving AI agent — creates skills from experience, improves them during use, and runs anywhere"
|
||||
readme = "README.md"
|
||||
requires-python = ">=3.11"
|
||||
|
|
@ -216,12 +216,11 @@ hermes-acp = "acp_adapter.entry:main"
|
|||
py-modules = ["run_agent", "model_tools", "toolsets", "batch_runner", "trajectory_compressor", "toolset_distributions", "cli", "hermes_bootstrap", "hermes_constants", "hermes_state", "hermes_time", "hermes_logging", "utils"]
|
||||
|
||||
[tool.setuptools.package-data]
|
||||
-hermes_cli = ["web_dist/**/*", "tui_dist/**/*", "scripts/install.sh"]
+hermes_cli = ["web_dist/**/*"]
|
||||
gateway = ["assets/**/*"]
|
||||
acp_adapter = ["bootstrap/*.sh", "bootstrap/*.ps1"]
|
||||
|
||||
[tool.setuptools.packages.find]
|
||||
-include = ["agent", "agent.*", "tools", "tools.*", "hermes_cli", "gateway", "gateway.*", "tui_gateway", "tui_gateway.*", "cron", "acp_adapter", "acp_adapter.*", "plugins", "plugins.*", "providers", "providers.*"]
+include = ["agent", "agent.*", "tools", "tools.*", "hermes_cli", "gateway", "gateway.*", "tui_gateway", "tui_gateway.*", "cron", "acp_adapter", "plugins", "plugins.*", "providers", "providers.*"]
|
||||
|
||||
[tool.pytest.ini_options]
|
||||
testpaths = ["tests"]
|
||||
|
|
|
|||
107
run_agent.py
|
|
@ -393,6 +393,19 @@ def _is_destructive_command(cmd: str) -> bool:
|
|||
return False
|
||||
|
||||
|
||||
def _is_mcp_tool_parallel_safe(tool_name: str) -> bool:
|
||||
"""Check if an MCP tool comes from a server with parallel tool calls enabled.
|
||||
|
||||
Lazy-imports from ``tools.mcp_tool`` to avoid circular dependencies.
|
||||
Returns False if the MCP module is not available.
|
||||
"""
|
||||
try:
|
||||
from tools.mcp_tool import is_mcp_tool_parallel_safe
|
||||
return is_mcp_tool_parallel_safe(tool_name)
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
|
||||
def _should_parallelize_tool_batch(tool_calls) -> bool:
|
||||
"""Return True when a tool-call batch is safe to run concurrently."""
|
||||
if len(tool_calls) <= 1:
|
||||
|
|
@ -432,7 +445,9 @@ def _should_parallelize_tool_batch(tool_calls) -> bool:
|
|||
continue
|
||||
|
||||
         if tool_name not in _PARALLEL_SAFE_TOOLS:
-            return False
+            # Check if it's an MCP tool from a server that opted into parallel calls.
+            if not _is_mcp_tool_parallel_safe(tool_name):
+                return False
|
||||
|
||||
return True
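The effect of this hunk is that a tool outside the static allowlist no longer vetoes parallel execution outright; it gets a second chance if its MCP server opted in. The sketch below shows that decision in isolation. It is illustrative only: the allowlist contents, the `mcp_<server>_<tool>` naming scheme, and the `MCP_PARALLEL_SERVERS` registry are assumptions made up for the example, and the real `is_mcp_tool_parallel_safe` in `tools.mcp_tool` is not shown in this diff.

```python
# Hypothetical allowlist; the real _PARALLEL_SAFE_TOOLS contents are not shown here.
PARALLEL_SAFE_TOOLS = {"read_file", "search_files"}

# Hypothetical registry of MCP servers that enabled parallel tool calls.
MCP_PARALLEL_SERVERS = {"docs"}


def is_mcp_tool_parallel_safe(tool_name: str) -> bool:
    """Assumes MCP tools are named 'mcp_<server>_<tool>' (an assumption)."""
    if not tool_name.startswith("mcp_"):
        return False
    parts = tool_name.split("_", 2)
    return len(parts) == 3 and parts[1] in MCP_PARALLEL_SERVERS


def should_parallelize(tool_names: list[str]) -> bool:
    """Return True only when every tool in the batch is safe to run concurrently."""
    if len(tool_names) <= 1:
        return False  # nothing to parallelize
    for name in tool_names:
        if name in PARALLEL_SAFE_TOOLS:
            continue
        # Not on the static allowlist: allow it only if its MCP server opted in.
        if not is_mcp_tool_parallel_safe(name):
            return False
    return True
```

A single unsafe tool still forces the whole batch to run sequentially, which keeps the conservative default for anything that has not explicitly opted in.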
|
||||
|
||||
|
|
@ -3027,6 +3042,24 @@ class AIAgent:
|
|||
parts.append(f"{type(e).__name__}({msg})" if msg else type(e).__name__)
|
||||
return " <- ".join(parts) if parts else type(error).__name__
|
||||
|
||||
def _is_provider_stream_parse_error(self, error: BaseException) -> bool:
|
||||
"""Return True for malformed provider streaming data from SDK parsers.
|
||||
|
||||
Some Anthropic-compatible streaming providers can send a malformed
|
||||
event-stream frame. The Anthropic SDK surfaces that as a plain
|
||||
``ValueError`` such as ``expected ident at line 1 column 149``. That
|
||||
is provider wire-format trouble, not local request validation, so it
|
||||
should follow the same retry path as a truncated JSON body.
|
||||
"""
|
||||
if getattr(self, "api_mode", None) != "anthropic_messages":
|
||||
return False
|
||||
if not isinstance(error, ValueError):
|
||||
return False
|
||||
if isinstance(error, (UnicodeEncodeError, json.JSONDecodeError)):
|
||||
return False
|
||||
message = str(error).strip().lower()
|
||||
return "expected ident at line" in message
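To see how this classifier feeds the retry path changed later in this file, here is a small standalone sketch. It mirrors the checks in the method above as a free function and wraps them in a simplified retry loop. `call_with_stream_retries` and its `max_retries` parameter are hypothetical names for illustration; the real loop in `run_agent.py` also handles partial output, status messages, and client rebuilds, none of which are modeled here.

```python
import json


def is_provider_stream_parse_error(error: BaseException, api_mode: str) -> bool:
    """Free-function mirror of the method above, for illustration."""
    if api_mode != "anthropic_messages":
        return False
    if not isinstance(error, ValueError):
        return False
    # These ValueError subclasses are classified separately, so exclude them here.
    if isinstance(error, (UnicodeEncodeError, json.JSONDecodeError)):
        return False
    return "expected ident at line" in str(error).strip().lower()


def call_with_stream_retries(make_request, api_mode: str, max_retries: int = 2):
    """Hypothetical retry wrapper: retry transient failures, re-raise the rest."""
    attempt = 0
    while True:
        try:
            return make_request()
        except Exception as exc:
            transient = isinstance(exc, (ConnectionError, TimeoutError)) or (
                is_provider_stream_parse_error(exc, api_mode)
            )
            if not transient or attempt >= max_retries:
                raise
            attempt += 1


calls = {"n": 0}


def flaky_request():
    # Fails twice with the malformed-frame error, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("expected ident at line 1 column 149")
    return "ok"


outcome = call_with_stream_retries(flaky_request, "anthropic_messages")
```

An ordinary `ValueError` with a different message is not retried, so genuine local validation bugs still surface immediately.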
|
||||
|
||||
def _log_stream_retry(
|
||||
self,
|
||||
*,
|
||||
|
|
@ -5080,6 +5113,12 @@ class AIAgent:
|
|||
"""
|
||||
raw = str(error)
|
||||
|
||||
if (
|
||||
isinstance(error, ValueError)
|
||||
and "expected ident at line" in raw.lower()
|
||||
):
|
||||
return f"Malformed provider streaming response: {raw[:300]}"
|
||||
|
||||
# Cloudflare / proxy HTML pages: grab the <title> for a clean summary
|
||||
if "<!DOCTYPE" in raw or "<html" in raw:
|
||||
m = re.search(r"<title[^>]*>([^<]+)</title>", raw, re.IGNORECASE)
|
||||
|
|
@ -8528,6 +8567,7 @@ class AIAgent:
|
|||
_is_conn_err = isinstance(
|
||||
e, (_httpx.ConnectError, _httpx.RemoteProtocolError, ConnectionError)
|
||||
)
|
||||
_is_stream_parse_err = self._is_provider_stream_parse_error(e)
|
||||
|
||||
# If the stream died AFTER some tokens were delivered:
|
||||
# normally we don't retry (the user already saw text,
|
||||
|
|
@ -8567,7 +8607,10 @@ class AIAgent:
|
|||
for phrase in _SSE_PREVIEW_PHRASES
|
||||
)
|
||||
_is_transient = (
|
||||
-    _is_timeout or _is_conn_err or _is_sse_conn_err_preview
+    _is_timeout
+    or _is_conn_err
+    or _is_sse_conn_err_preview
+    or _is_stream_parse_err
|
||||
)
|
||||
_can_silent_retry = (
|
||||
_partial_tool_in_flight
|
||||
|
|
@ -8665,7 +8708,7 @@ class AIAgent:
|
|||
for phrase in _SSE_CONN_PHRASES
|
||||
)
|
||||
|
||||
-if _is_timeout or _is_conn_err or _is_sse_conn_err:
+if _is_timeout or _is_conn_err or _is_sse_conn_err or _is_stream_parse_err:
|
||||
# Transient network / timeout error. Retry the
|
||||
# streaming request with a fresh connection first.
|
||||
if _stream_attempt < _max_stream_retries:
|
||||
|
|
@ -8706,12 +8749,20 @@ class AIAgent:
|
|||
mid_tool_call=False,
|
||||
diag=request_client_holder.get("diag"),
|
||||
)
|
||||
-self._emit_status(
-    "❌ Connection to provider failed after "
-    f"{_max_stream_retries + 1} attempts. "
-    "The provider may be experiencing issues — "
-    "try again in a moment."
-)
+if _is_stream_parse_err:
+    self._emit_status(
+        "❌ Provider returned malformed streaming data after "
+        f"{_max_stream_retries + 1} attempts. "
+        "The provider may be experiencing issues — "
+        "try again in a moment."
+    )
+else:
+    self._emit_status(
+        "❌ Connection to provider failed after "
+        f"{_max_stream_retries + 1} attempts. "
+        "The provider may be experiencing issues — "
+        "try again in a moment."
+    )
|
||||
else:
|
||||
_err_lower = str(e).lower()
|
||||
_is_stream_unsupported = (
|
||||
|
|
@ -14133,6 +14184,39 @@ class AIAgent:
|
|||
"interrupted": True,
|
||||
}
|
||||
|
||||
# Actionable hint for GitHub Models (Azure) 413 errors.
|
||||
# The free tier enforces a hard 8K token cap per request,
|
||||
# which Hermes' system prompt + tool schemas alone exceed.
|
||||
# Compression can't help — the floor is the system prompt
|
||||
# itself, not the conversation — so surface a clear "not
|
||||
# compatible" message instead of looping into three futile
|
||||
# compression attempts.
|
||||
if (
|
||||
status_code == 413
|
||||
and isinstance(_base, str)
|
||||
and "models.inference.ai.azure.com" in _base
|
||||
):
|
||||
self._vprint(
|
||||
f"{self.log_prefix} 💡 GitHub Models free tier (models.inference.ai.azure.com) caps every",
|
||||
force=True,
|
||||
)
|
||||
self._vprint(
|
||||
f"{self.log_prefix} request at ~8K tokens. Hermes' system prompt + tool schemas baseline",
|
||||
force=True,
|
||||
)
|
||||
self._vprint(
|
||||
f"{self.log_prefix} exceeds that floor, so this endpoint cannot run an agentic loop.",
|
||||
force=True,
|
||||
)
|
||||
self._vprint(
|
||||
f"{self.log_prefix} Use the `copilot` provider with a Copilot subscription token (`hermes",
|
||||
force=True,
|
||||
)
|
||||
self._vprint(
|
||||
f"{self.log_prefix} setup` → GitHub Copilot), or pick any other provider.",
|
||||
force=True,
|
||||
)
|
||||
|
||||
# Check for 413 payload-too-large BEFORE generic 4xx handler.
|
||||
# A 413 is a payload-size error — the correct response is to
|
||||
# compress history and retry, not abort immediately.
|
||||
|
|
@ -14509,11 +14593,16 @@ class AIAgent:
|
|||
# provider/network failure (malformed response body,
|
||||
# truncated stream, routing layer corruption), not a
|
||||
# local programming bug, and should be retried (#14782).
|
||||
# Exclude Anthropic stream parser ValueErrors for the
|
||||
# same reason: third-party Anthropic-compatible providers
|
||||
# can emit malformed event-stream frames that SDK parsers
|
||||
# raise as plain ValueError.
|
||||
is_local_validation_error = (
|
||||
isinstance(api_error, (ValueError, TypeError))
|
||||
and not isinstance(
|
||||
api_error, (UnicodeEncodeError, json.JSONDecodeError)
|
||||
)
|
||||
and not self._is_provider_stream_parse_error(api_error)
|
||||
# ssl.SSLError (and its subclass SSLCertVerificationError)
|
||||
# inherits from OSError *and* ValueError via Python MRO,
|
||||
# so the isinstance(ValueError) check above would
|
||||
|
|
|
|||
|
|
@ -59,6 +59,8 @@ AUTHOR_MAP = {
|
|||
"m@mobrienv.dev": "mikeyobrien",
|
||||
"qiyin.zuo@pcitc.com": "qiyin-code",
|
||||
"mr.aashiz@gmail.com": "aashizpoudel",
|
||||
"70629228+shaun0927@users.noreply.github.com": "shaun0927",
|
||||
"98262967+Bihruze@users.noreply.github.com": "Bihruze",
|
||||
"nidhi2894@gmail.com": "nidhi-singh02",
|
||||
"30312689+aashizpoudel@users.noreply.github.com": "aashizpoudel",
|
||||
"oleksii.lisikh@gmail.com": "olisikh",
|
||||
|
|
@ -91,6 +93,7 @@ AUTHOR_MAP = {
|
|||
"30397170+1000Delta@users.noreply.github.com": "1000Delta",
|
||||
"szymonclawd@mac.home": "szymonclawd",
|
||||
"257759490+szymonclawd@users.noreply.github.com": "szymonclawd",
|
||||
"101180447+worlldz@users.noreply.github.com": "worlldz",
|
||||
"zhanganzhe@tenclass.com": "luoyuctl",
|
||||
"51604064+luoyuctl@users.noreply.github.com": "luoyuctl",
|
||||
"127238744+teknium1@users.noreply.github.com": "teknium1",
|
||||
|
|
@ -1078,6 +1081,11 @@ AUTHOR_MAP = {
|
|||
"nidhi2894@gmail.com": "nidhi-singh02", # PR #2752 salvage (slack whitespace-only IndexError guard)
|
||||
"38173192+nidhi-singh02@users.noreply.github.com": "nidhi-singh02",
|
||||
"Jaaneek@users.noreply.github.com": "Jaaneek", # PR #26457 (xAI Grok OAuth provider)
|
||||
# v0.14.0 additions
|
||||
"chuang.guo@hopechart.com": "wuwuzhijing", # PR #21063 salvage (gateway docs mention Weixin)
|
||||
"nightcityblade@gmail.com": "nightcityblade", # PR #24138 (docs voice/tts table)
|
||||
"pol.kuijken@gmail.com": "polkn", # PR #6136 salvage (skill_view collision refusal)
|
||||
"robin@soal.org": "rewbs",
|
||||
}
|
||||
|
||||
|
||||
|
|
|
|||
|
|
@ -13,6 +13,7 @@ from acp.schema import (
|
|||
AgentCapabilities,
|
||||
AgentMessageChunk,
|
||||
AgentPlanUpdate,
|
||||
AgentThoughtChunk,
|
||||
AuthenticateResponse,
|
||||
AvailableCommandsUpdate,
|
||||
Implementation,
|
||||
|
|
@ -467,25 +468,296 @@ class TestSessionOps:
|
|||
)
|
||||
|
||||
@pytest.mark.asyncio
|
||||
-async def test_load_session_schedules_history_replay_after_response(self, agent):
-    """Zed only attaches replayed updates after session/load has completed."""
+async def test_load_session_replays_reasoning_thought_before_message(self, agent):
|
||||
"""Thinking-model thoughts must be replayed via ``agent_thought_chunk``.
|
||||
|
||||
Regression for #12285 — when a session is loaded, persisted assistant
|
||||
``reasoning_content`` / ``reasoning`` fields must surface as ACP
|
||||
``AgentThoughtChunk`` notifications in the same relative position they
|
||||
had live (thought streams before the assistant message text), so Zed's
|
||||
collapsed Thinking pane rebuilds instead of vanishing on reconnect.
|
||||
"""
|
||||
mock_conn = MagicMock(spec=acp.Client)
|
||||
mock_conn.session_update = AsyncMock()
|
||||
agent._conn = mock_conn
|
||||
|
||||
new_resp = await agent.new_session(cwd="/tmp")
|
||||
state = agent.session_manager.get_session(new_resp.session_id)
|
||||
state.history = [
|
||||
{"role": "user", "content": "Walk me through it."},
|
||||
{
|
||||
"role": "assistant",
|
||||
"reasoning_content": "Let me think step by step about the request.",
|
||||
"content": "Here is the plan.",
|
||||
},
|
||||
{"role": "user", "content": "And the legacy case?"},
|
||||
{
|
||||
"role": "assistant",
|
||||
# No reasoning_content — exercise the legacy "reasoning" fallback
|
||||
# path so sessions persisted before #16892 still replay thoughts.
|
||||
"reasoning": "Older sessions stored the trace under the internal key.",
|
||||
"content": "Same idea, older field name.",
|
||||
},
|
||||
]
|
||||
|
||||
mock_conn.session_update.reset_mock()
|
||||
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
|
||||
await asyncio.sleep(0)
|
||||
await asyncio.sleep(0)
|
||||
|
||||
assert isinstance(resp, LoadSessionResponse)
|
||||
|
||||
replay_kinds = [
|
||||
getattr(call.kwargs.get("update"), "session_update", None)
|
||||
for call in mock_conn.session_update.await_args_list
|
||||
if getattr(call.kwargs.get("update"), "session_update", None)
|
||||
in {"user_message_chunk", "agent_message_chunk", "agent_thought_chunk"}
|
||||
]
|
||||
assert replay_kinds == [
|
||||
"user_message_chunk",
|
||||
"agent_thought_chunk",
|
||||
"agent_message_chunk",
|
||||
"user_message_chunk",
|
||||
"agent_thought_chunk",
|
||||
"agent_message_chunk",
|
||||
]
|
||||
|
||||
thought_updates = [
|
||||
call.kwargs["update"]
|
||||
for call in mock_conn.session_update.await_args_list
|
||||
if isinstance(call.kwargs.get("update"), AgentThoughtChunk)
|
||||
]
|
||||
assert len(thought_updates) == 2
|
||||
assert thought_updates[0].content.text == "Let me think step by step about the request."
|
||||
assert thought_updates[1].content.text == "Older sessions stored the trace under the internal key."
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_load_session_replays_reasoning_only_turn(self, agent):
|
||||
"""Assistant turns with reasoning but no content should still emit a thought.
|
||||
|
||||
Pure reasoning-only assistant entries (e.g. a thinking step before a
|
||||
tool-call turn) commonly carry ``reasoning_content`` with empty
|
||||
``content``. The replay must still surface the thought so the editor's
|
||||
Thinking pane rebuilds, even when there is no message text to follow.
|
||||
"""
|
||||
mock_conn = MagicMock(spec=acp.Client)
|
||||
mock_conn.session_update = AsyncMock()
|
||||
agent._conn = mock_conn
|
||||
|
||||
new_resp = await agent.new_session(cwd="/tmp")
|
||||
state = agent.session_manager.get_session(new_resp.session_id)
|
||||
state.history = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"reasoning_content": "I should call the search tool next.",
|
||||
"content": "",
|
||||
},
|
||||
]
|
||||
|
||||
mock_conn.session_update.reset_mock()
|
||||
await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
|
||||
await asyncio.sleep(0)
|
||||
await asyncio.sleep(0)
|
||||
|
||||
thought_updates = [
|
||||
call.kwargs["update"]
|
||||
for call in mock_conn.session_update.await_args_list
|
||||
if isinstance(call.kwargs.get("update"), AgentThoughtChunk)
|
||||
]
|
||||
message_updates = [
|
||||
call.kwargs["update"]
|
||||
for call in mock_conn.session_update.await_args_list
|
||||
if isinstance(call.kwargs.get("update"), AgentMessageChunk)
|
||||
]
|
||||
assert len(thought_updates) == 1
|
||||
assert thought_updates[0].content.text == "I should call the search tool next."
|
||||
assert message_updates == []
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_load_session_skips_empty_reasoning_fields(self, agent):
|
||||
"""Empty/whitespace reasoning fields must not produce notifications."""
|
||||
mock_conn = MagicMock(spec=acp.Client)
|
||||
mock_conn.session_update = AsyncMock()
|
||||
agent._conn = mock_conn
|
||||
|
||||
new_resp = await agent.new_session(cwd="/tmp")
|
||||
state = agent.session_manager.get_session(new_resp.session_id)
|
||||
state.history = [
|
||||
{
|
||||
"role": "assistant",
|
||||
"reasoning_content": "",
|
||||
"reasoning": " \n\t",
|
||||
"content": "Just a regular answer.",
|
||||
},
|
||||
]
|
||||
|
||||
mock_conn.session_update.reset_mock()
|
||||
await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
|
||||
await asyncio.sleep(0)
|
||||
await asyncio.sleep(0)
|
||||
|
||||
thought_updates = [
|
||||
call.kwargs["update"]
|
||||
for call in mock_conn.session_update.await_args_list
|
||||
if isinstance(call.kwargs.get("update"), AgentThoughtChunk)
|
||||
]
|
||||
assert thought_updates == []
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_load_session_replays_thought_then_tool_call_without_message(self, agent):
|
||||
"""Canonical thinking-model shape: reasoning + tool_call + no body text.
|
||||
|
||||
Thinking models commonly emit a pre-tool thought followed by a
|
||||
tool_calls turn with empty ``content``. Replay must emit:
|
||||
``agent_thought_chunk`` then ``tool_call`` then ``tool_call_update``
|
||||
for the matching tool result — and crucially, NO ``agent_message_chunk``
|
||||
for the empty-text assistant body. Regression for the canonical
|
||||
thinking-then-tool flow on #12285.
|
||||
"""
|
||||
mock_conn = MagicMock(spec=acp.Client)
|
||||
mock_conn.session_update = AsyncMock()
|
||||
agent._conn = mock_conn
|
||||
|
||||
new_resp = await agent.new_session(cwd="/tmp")
|
||||
state = agent.session_manager.get_session(new_resp.session_id)
|
||||
state.history = [
|
||||
{"role": "user", "content": "Find the bug."},
|
||||
{
|
||||
"role": "assistant",
|
||||
"reasoning_content": "I should grep for the function name first.",
|
||||
"content": "",
|
||||
"tool_calls": [
|
||||
{
|
||||
"id": "call_grep_1",
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": "search_files",
|
||||
"arguments": '{"pattern":"foo","path":"."}',
|
||||
},
|
||||
}
|
||||
],
|
||||
},
|
||||
{
|
||||
"role": "tool",
|
||||
"tool_call_id": "call_grep_1",
|
||||
"content": '{"total_count":1,"matches":[{"path":"x.py","line":1,"content":"foo"}]}',
|
||||
},
|
||||
]
|
||||
|
||||
mock_conn.session_update.reset_mock()
|
||||
await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
|
||||
await asyncio.sleep(0)
|
||||
await asyncio.sleep(0)
|
||||
|
||||
kinds = [
|
||||
getattr(call.kwargs.get("update"), "session_update", None)
|
||||
for call in mock_conn.session_update.await_args_list
|
||||
if getattr(call.kwargs.get("update"), "session_update", None)
|
||||
in {
|
||||
"user_message_chunk",
|
||||
"agent_thought_chunk",
|
||||
"agent_message_chunk",
|
||||
"tool_call",
|
||||
"tool_call_update",
|
||||
}
|
||||
]
|
||||
# No agent_message_chunk for the empty-content assistant turn.
|
||||
assert "agent_message_chunk" not in kinds
|
||||
# Thought must precede the tool_call_start within the assistant turn,
|
||||
# and the tool result follows.
|
||||
assert kinds == [
|
||||
"user_message_chunk",
|
||||
"agent_thought_chunk",
|
||||
"tool_call",
|
||||
"tool_call_update",
|
||||
]
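The ordering rules these four tests pin down can be summarized in a few lines. The function below is a hypothetical model of the replay order, not the adapter's `_replay_session_history`: it returns only the kinds of notifications that would be emitted, and it assumes one `tool_call` per entry in `tool_calls` and one `tool_call_update` per `tool` message.

```python
def replay_kinds(history):
    """Return the sequence of update kinds a replay would emit (illustrative model)."""
    kinds = []
    for msg in history:
        role = msg.get("role")
        if role == "user":
            kinds.append("user_message_chunk")
        elif role == "assistant":
            # Prefer reasoning_content, fall back to the legacy "reasoning" key,
            # and skip empty or whitespace-only values.
            thought = (msg.get("reasoning_content") or msg.get("reasoning") or "").strip()
            if thought:
                kinds.append("agent_thought_chunk")
            # An empty assistant body must not produce a message chunk.
            if (msg.get("content") or "").strip():
                kinds.append("agent_message_chunk")
            for _call in msg.get("tool_calls") or []:
                kinds.append("tool_call")
        elif role == "tool":
            kinds.append("tool_call_update")
    return kinds


thinking_then_tool = [
    {"role": "user", "content": "Find the bug."},
    {
        "role": "assistant",
        "reasoning_content": "I should grep for the function name first.",
        "content": "",
        "tool_calls": [{"id": "call_grep_1", "type": "function"}],
    },
    {"role": "tool", "tool_call_id": "call_grep_1", "content": "{}"},
]
kinds = replay_kinds(thinking_then_tool)
```

Emitting the thought before the message text, and before any tool call, matches the order in which the updates streamed live, which is what lets a client rebuild its collapsed thinking pane on reconnect.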
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_load_session_replays_history_before_returning_response(self, agent):
|
||||
"""Per ACP spec, replay must complete BEFORE load_session returns.
|
||||
|
||||
Spec-compliant ACP clients (Codex, Claude Code, OpenCode, Pi, Zed)
|
||||
attach their ``session/update`` listeners before awaiting the
|
||||
``loadSession`` RPC and rely on receiving the full transcript within
|
||||
the request's lifetime. Deferring replay via ``loop.call_soon`` (the
|
||||
prior behavior in May 2026) broke clients that read notification
|
||||
counts synchronously against the load response — see #12285 follow-up.
|
||||
"""
|
||||
new_resp = await agent.new_session(cwd="/tmp")
|
||||
state = agent.session_manager.get_session(new_resp.session_id)
|
||||
state.history = [{"role": "user", "content": "hello from history"}]
|
||||
-events = []
+events: list[str] = []
|
||||
|
||||
-async def replay_after_response(_state):
+async def replay_records(_state):
|
||||
events.append("replay")
|
||||
|
||||
-with patch.object(agent, "_replay_session_history", side_effect=replay_after_response):
+with patch.object(agent, "_replay_session_history", side_effect=replay_records):
|
||||
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
|
||||
events.append("returned")
|
||||
|
||||
assert isinstance(resp, LoadSessionResponse)
|
||||
-assert events == ["returned"]
-await asyncio.sleep(0)
-await asyncio.sleep(0)
-assert events == ["returned", "replay"]
+# Replay must have happened BEFORE the response was constructed —
+# i.e. before the `events.append("returned")` after the await resolves.
+assert events == ["replay", "returned"]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_resume_session_replays_history_before_returning_response(self, agent):
|
||||
"""Same spec rationale as ``load_session`` — replay before responding."""
|
||||
new_resp = await agent.new_session(cwd="/tmp")
|
||||
state = agent.session_manager.get_session(new_resp.session_id)
|
||||
state.history = [{"role": "user", "content": "hello from history"}]
|
||||
events: list[str] = []
|
||||
|
||||
async def replay_records(_state):
|
||||
events.append("replay")
|
||||
|
||||
with patch.object(agent, "_replay_session_history", side_effect=replay_records):
|
||||
resp = await agent.resume_session(cwd="/tmp", session_id=new_resp.session_id)
|
||||
events.append("returned")
|
||||
|
||||
assert isinstance(resp, ResumeSessionResponse)
|
||||
assert events == ["replay", "returned"]
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_load_session_survives_replay_helper_exception(self, agent, caplog):
|
||||
"""A replay helper raising must not turn load_session into an error.
|
||||
|
||||
With awaited replay, an exception in ``_replay_session_history`` now
|
||||
propagates into the ``load_session`` handler. The defensive try/except
|
||||
guard at the call site must catch and log it so the JSON-RPC client
|
||||
still receives a ``LoadSessionResponse`` — partial transcripts are
|
||||
acceptable, total load failure is not.
|
||||
"""
|
||||
new_resp = await agent.new_session(cwd="/tmp")
|
||||
state = agent.session_manager.get_session(new_resp.session_id)
|
||||
state.history = [{"role": "user", "content": "hi"}]
|
||||
|
||||
async def boom(_state):
|
||||
raise RuntimeError("simulated replay helper crash")
|
||||
|
||||
with caplog.at_level("WARNING", logger="acp_adapter.server"):
|
||||
with patch.object(agent, "_replay_session_history", side_effect=boom):
|
||||
resp = await agent.load_session(cwd="/tmp", session_id=new_resp.session_id)
|
||||
|
||||
assert isinstance(resp, LoadSessionResponse)
|
||||
assert "history replay raised during session/load" in caplog.text
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_resume_session_survives_replay_helper_exception(self, agent, caplog):
|
||||
"""Same guarantee as ``load_session`` for the resume path."""
|
||||
new_resp = await agent.new_session(cwd="/tmp")
|
||||
state = agent.session_manager.get_session(new_resp.session_id)
|
||||
state.history = [{"role": "user", "content": "hi"}]
|
||||
|
||||
async def boom(_state):
|
||||
raise RuntimeError("simulated replay helper crash")
|
||||
|
||||
with caplog.at_level("WARNING", logger="acp_adapter.server"):
|
||||
with patch.object(agent, "_replay_session_history", side_effect=boom):
|
||||
resp = await agent.resume_session(cwd="/tmp", session_id=new_resp.session_id)
|
||||
|
||||
assert isinstance(resp, ResumeSessionResponse)
|
||||
assert "history replay raised during session/resume" in caplog.text
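Taken together, the last four tests describe a handler that awaits replay inline and shields the response from replay failures. A minimal sketch of that shape follows. It is an assumption about the structure, not the code in `acp_adapter.server`: `load_session` here takes the replay coroutine as a parameter, the logger name is made up, and a plain dict stands in for `LoadSessionResponse`.

```python
import asyncio
import logging

logger = logging.getLogger("acp_replay_sketch")  # hypothetical logger name


async def load_session(replay_history):
    """Await the replay first, then build the response."""
    try:
        await replay_history()
    except Exception:
        # A failed replay yields a partial transcript, not a failed load.
        logger.warning("history replay raised during session/load", exc_info=True)
    return {"loaded": True}  # stand-in for LoadSessionResponse


async def demo():
    events = []

    async def replay():
        events.append("replay")

    async def boom():
        raise RuntimeError("simulated replay helper crash")

    ok = await load_session(replay)
    events.append("returned")
    survived = await load_session(boom)
    return events, ok, survived


events, ok, survived = asyncio.run(demo())
```

Because the replay is awaited rather than scheduled, every notification is sent within the lifetime of the request, and the `try`/`except` keeps a replay bug from turning into a failed session load.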
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_resume_session_creates_new_if_missing(self, agent):
|
||||
|
|
|
|||
170
tests/agent/test_anthropic_oauth_pkce.py
Normal file
|
|
@ -0,0 +1,170 @@
|
"""Regression tests for the Anthropic OAuth PKCE flow.

Guards against re-introducing the bug where the PKCE ``code_verifier`` was
reused as the OAuth ``state`` parameter, leaking the verifier via the
authorization URL (browser history, Referer headers, auth-server logs) and
removing CSRF protection on the callback path.

History:
- PR #1775 first fixed this on ``run_hermes_oauth_login()``.
- PR #2647 (b17e5c10) added ``run_hermes_oauth_login_pure()`` and silently
  copy-pasted the pre-#1775 vulnerable pattern.
- PR #3107 removed the old function, leaving only the regressed copy.
- PR #10699 (issue #10693) fixed the regression on the surviving function.
"""

from __future__ import annotations

import io
import json
from typing import Any, Dict
from urllib.parse import parse_qs, urlparse


def _patch_oauth_flow(
    monkeypatch,
    *,
    callback_code: str,
    token_response: Dict[str, Any] | None = None,
    capture_token_request: Dict[str, Any] | None = None,
    capture_auth_url: Dict[str, str] | None = None,
) -> None:
    """Wire up monkeypatches that let ``run_hermes_oauth_login_pure()`` run
    end-to-end without touching a real browser, stdin, or HTTP endpoint.

    ``callback_code`` is the literal string the user would paste back into the
    terminal (``"<code>#<state>"`` format).
    ``capture_auth_url`` and ``capture_token_request`` are out-dict captures
    so the test can introspect what was sent to the auth URL and the token
    endpoint, respectively.
    """
    import urllib.request

    if token_response is None:
        token_response = {
            "access_token": "sk-ant-test-access",
            "refresh_token": "sk-ant-test-refresh",
            "expires_in": 3600,
        }

    def fake_open(url):
        if capture_auth_url is not None:
            capture_auth_url["url"] = url
        return True

    monkeypatch.setattr("webbrowser.open", fake_open)
    monkeypatch.setattr("builtins.input", lambda *_a, **_kw: callback_code)

    class _FakeResponse:
        def __init__(self, body: bytes) -> None:
            self._body = body

        def __enter__(self):
            return self

        def __exit__(self, *_exc):
            return False

        def read(self):
            return self._body

    def fake_urlopen(req, *_a, **_kw):
        if capture_token_request is not None:
            capture_token_request["url"] = req.full_url
            capture_token_request["data"] = json.loads(req.data.decode())
            capture_token_request["headers"] = dict(req.headers)
        return _FakeResponse(json.dumps(token_response).encode())

    monkeypatch.setattr(urllib.request, "urlopen", fake_urlopen)


def test_authorization_url_state_is_not_pkce_verifier(monkeypatch, tmp_path):
    """The ``state`` parameter in the authorization URL must NOT equal the
    PKCE ``code_verifier``.

    Reusing the verifier as state leaks the verifier into browser history,
    Referer headers, and auth-server access logs — defeating RFC 7636.
    """
    monkeypatch.setenv("HERMES_HOME", str(tmp_path))

    captured_url: Dict[str, str] = {}
    captured_token: Dict[str, Any] = {}
    _patch_oauth_flow(
        monkeypatch,
        # placeholder only; input() is re-patched below to echo the real state
        callback_code="auth-code-from-anthropic#PLACEHOLDER",
        capture_auth_url=captured_url,
        capture_token_request=captured_token,
    )

    # Stub the callback parse: we need the state echoed back to match. To do
    # that without hardcoding the state value, override input() AFTER seeing
    # the auth URL.
    import builtins

    real_input_calls = {"count": 0}

    def fake_input(*_a, **_kw):
        real_input_calls["count"] += 1
        # First (and only) call is the "Authorization code:" prompt.
        url = captured_url.get("url", "")
        qs = parse_qs(urlparse(url).query)
        state = qs.get("state", [""])[0]
        return f"auth-code-from-anthropic#{state}"

    monkeypatch.setattr(builtins, "input", fake_input)

    from agent.anthropic_adapter import run_hermes_oauth_login_pure

    result = run_hermes_oauth_login_pure()
    assert result is not None, "OAuth flow should succeed with matching state"

    url = captured_url["url"]
    qs = parse_qs(urlparse(url).query)

    assert "state" in qs and qs["state"][0], "authorization URL must include state"
    assert "code_challenge" in qs, "authorization URL must include code_challenge"

    state_in_url = qs["state"][0]
    verifier_sent = captured_token["data"]["code_verifier"]

    # The whole point: state and verifier must be independent values.
    assert state_in_url != verifier_sent, (
        "PKCE code_verifier was reused as OAuth state — regression of #10693 / "
        "#1775. The verifier is supposed to be a secret known only to the "
        "client; placing it in the authorization URL leaks it via browser "
        "history, Referer headers, and auth-server logs."
    )

    # And the verifier MUST NOT appear anywhere in the URL.
    assert verifier_sent not in url, (
        "PKCE verifier leaked into authorization URL — regression of #10693"
    )


def test_callback_state_mismatch_aborts(monkeypatch, tmp_path, caplog):
    """If the state returned in the callback does not match the one we sent
    in the authorization URL, the flow must abort before exchanging the code.

    Without this check, an attacker who tricks the user into pasting a
    crafted ``<code>#<state>`` string can complete the token exchange — the
    CSRF protection that ``state`` is supposed to provide (RFC 6749 §10.12)
    would be absent.
    """
    monkeypatch.setenv("HERMES_HOME", str(tmp_path))

    captured_token: Dict[str, Any] = {}
    _patch_oauth_flow(
        monkeypatch,
        callback_code="attacker-code#attacker-state-does-not-match",
        capture_token_request=captured_token,
    )

    from agent.anthropic_adapter import run_hermes_oauth_login_pure

    result = run_hermes_oauth_login_pure()

    assert result is None, "mismatched state must abort the flow"
    assert "url" not in captured_token, (
        "token exchange must NOT happen when state mismatches"
    )
|
||||
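The two properties these tests pin down (an independent `state`, and a callback check that runs before any token exchange) can be sketched in isolation. This is a minimal illustration using only the standard library, not the code in `agent.anthropic_adapter`; the helper names `new_pkce_pair`, `build_authorization_url`, and `split_callback`, and the example URLs, are hypothetical.

```python
from __future__ import annotations

import base64
import hashlib
import hmac
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def new_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method of RFC 7636."""
    verifier = secrets.token_urlsafe(32)  # 43 URL-safe characters
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorization_url(base_url: str, client_id: str, redirect_uri: str) -> tuple[str, str, str]:
    """Return (url, state, verifier). Only the challenge and state go in the URL."""
    verifier, challenge = new_pkce_pair()
    state = secrets.token_urlsafe(32)  # generated independently of the verifier
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "code_challenge": challenge,
            "code_challenge_method": "S256",
            "state": state,
        }
    )
    return f"{base_url}?{query}", state, verifier


def split_callback(pasted: str, expected_state: str) -> str | None:
    """Parse a pasted "<code>#<state>" string; return the code only if state matches."""
    code, _, returned_state = pasted.strip().partition("#")
    # Constant-time comparison on bytes, so non-ASCII input cannot raise TypeError.
    if not code or not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        return None
    return code
```

The caller keeps `verifier` in memory and sends it only in the token request body, which is the invariant the first test asserts through `captured_token["data"]["code_verifier"]`.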
77
tests/agent/test_copilot_acp_deprecation.py
Normal file
|
|
@ -0,0 +1,77 @@
|
"""Tests for gh-copilot CLI deprecation detection and GitHub Models Azure URL mapping."""

import pytest

from agent.copilot_acp_client import _is_gh_copilot_deprecation_message


class TestDeprecationPatternDetection:
    """Verify that stderr from the deprecated `gh copilot` extension is caught
    without false-positiving on the new `@github/copilot` CLI."""

    _REAL_DEPRECATION_STDERR = (
        "The gh-copilot extension has been deprecated in favor of the newer "
        "GitHub Copilot CLI.\nFor more information, visit:\n"
        "- Copilot CLI: https://github.com/github/copilot-cli\n"
        "- Deprecation announcement: https://github.blog/changelog/"
        "2025-09-25-upcoming-deprecation-of-gh-copilot-cli-extension\n"
        "No commands will be executed."
    )

    def test_real_deprecation_message_matches(self):
        assert _is_gh_copilot_deprecation_message(self._REAL_DEPRECATION_STDERR)

    @pytest.mark.parametrize(
        "stderr_text",
        [
            # The deprecation banner uses both halves of the fingerprint.
            "The gh-copilot extension has been deprecated.",
            "gh-copilot: no commands will be executed.",
            # Mixed casing — match is case-insensitive.
            "The GH-Copilot Extension HAS BEEN DEPRECATED.",
        ],
    )
    def test_genuine_deprecation_variants_match(self, stderr_text: str):
        assert _is_gh_copilot_deprecation_message(stderr_text)

    @pytest.mark.parametrize(
        "stderr_text",
        [
            # Generic errors — no fingerprint at all.
            "Error: connection refused",
            "",
            # The NEW @github/copilot CLI's repo is github.com/github/copilot-cli.
            # Its stderr can legitimately mention "copilot-cli" or "deprecation"
            # in unrelated contexts; neither alone should trip the detector.
            "copilot-cli: failed to authenticate with the API",
            "warning: the --foo flag is scheduled for deprecation in v3",
            "See https://github.com/github/copilot-cli/issues for support",
            # Half the fingerprint without the other half.
            "gh-copilot: command not found",
            "extension has been deprecated (some other extension)",
        ],
    )
    def test_does_not_false_positive(self, stderr_text: str):
        assert not _is_gh_copilot_deprecation_message(stderr_text)


class TestGitHubModelsAzureUrl:
    """Verify that the Azure GitHub Models URL is recognised."""

    def test_url_to_provider_contains_azure_models(self):
        from agent.model_metadata import _URL_TO_PROVIDER

        # Maps to the canonical "copilot" provider (same convention as the
        # other GitHub-family entries) — not the "github-models" alias.
        assert _URL_TO_PROVIDER.get("models.inference.ai.azure.com") == "copilot"

    def test_is_github_models_base_url_recognises_azure(self):
        from hermes_cli.models import _is_github_models_base_url

        assert _is_github_models_base_url("https://models.inference.ai.azure.com")
        assert _is_github_models_base_url("https://models.inference.ai.azure.com/v1/chat")

    def test_is_github_models_base_url_still_recognises_github_ai(self):
        from hermes_cli.models import _is_github_models_base_url

        assert _is_github_models_base_url("https://models.github.ai/inference")
|
||||
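The parametrized cases above imply a two-part fingerprint: the text must name `gh-copilot` and also carry a deprecation signal, compared case-insensitively. One way such a detector might look is sketched below. The real `_is_gh_copilot_deprecation_message` is not shown in this diff, so the function name `looks_like_gh_copilot_deprecation` and the exact signal phrases are assumptions inferred from the test inputs.

```python
# Hypothetical signal phrases, inferred from the test cases rather than from
# the actual implementation in agent.copilot_acp_client.
_DEPRECATION_SIGNALS = ("has been deprecated", "no commands will be executed")


def looks_like_gh_copilot_deprecation(stderr_text: str) -> bool:
    """Return True only when both halves of the fingerprint are present."""
    text = stderr_text.lower()
    # Half one: the message must be about the old gh-copilot extension.
    if "gh-copilot" not in text:
        return False
    # Half two: it must also carry one of the deprecation signals.
    return any(signal in text for signal in _DEPRECATION_SIGNALS)
```

Requiring both halves is what keeps unrelated `copilot-cli` errors and generic "deprecation" warnings from tripping the check.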
152
tests/gateway/test_active_session_text_merge.py
Normal file
|
|
@ -0,0 +1,152 @@
|
"""Regression test for #4469.

When the agent is actively running (session present in
``adapter._active_sessions``) and the user fires off multiple TEXT
follow-ups in rapid succession, the previous behaviour was a single-slot
replacement at ``gateway/platforms/base.py``:

    self._pending_messages[session_key] = event

So three rapid messages ``A``, ``B``, ``C`` arriving while the agent was
still working on the initial turn produced a pending slot containing only
``C``; ``A`` and ``B`` were silently dropped.

The fix routes the follow-up through ``merge_pending_message_event(...,
merge_text=True)`` so TEXT events accumulate into the existing pending
event's text instead of clobbering it. Photo / media bursts continue to
merge through the same helper (they always did).
"""

from __future__ import annotations

import asyncio
import sys
import types
from unittest.mock import AsyncMock, MagicMock

import pytest

# Minimal telegram stub so importing gateway.platforms.base does not pull
# in the real python-telegram-bot dependency.
_tg = sys.modules.get("telegram") or types.ModuleType("telegram")
_tg.constants = sys.modules.get("telegram.constants") or types.ModuleType("telegram.constants")
_ct = MagicMock()
_ct.PRIVATE = "private"
_ct.GROUP = "group"
_ct.SUPERGROUP = "supergroup"
_tg.constants.ChatType = _ct
sys.modules.setdefault("telegram", _tg)
sys.modules.setdefault("telegram.constants", _tg.constants)
sys.modules.setdefault("telegram.ext", types.ModuleType("telegram.ext"))

from gateway.config import Platform, PlatformConfig
from gateway.platforms.base import (
    BasePlatformAdapter,
    MessageEvent,
    MessageType,
)
from gateway.session import SessionSource, build_session_key


def _make_event(text: str, chat_id: str = "12345") -> MessageEvent:
    source = SessionSource(
        platform=Platform.TELEGRAM,
        chat_id=chat_id,
        chat_type="dm",
        user_id="u1",
    )
    return MessageEvent(
        text=text,
        message_type=MessageType.TEXT,
        source=source,
        message_id=f"msg-{text[:8]}",
    )


def _make_adapter() -> BasePlatformAdapter:
    """Build a BasePlatformAdapter without running its heavy __init__.

    We only need the bits ``handle_message`` touches on the active-session
    path: ``_active_sessions``, ``_pending_messages``,
    ``_message_handler``, ``_busy_session_handler``, ``config``, ``platform``.
    """

    class _DummyAdapter(BasePlatformAdapter):  # type: ignore[misc]
        async def connect(self):
            pass

        async def disconnect(self):
            pass

        async def get_chat_info(self, chat_id):
            return None

        async def send(self, *args, **kwargs):
            return MagicMock(success=True, message_id="x", retryable=False)

    adapter = object.__new__(_DummyAdapter)
    adapter.config = PlatformConfig(enabled=True, token="***")
    adapter.platform = Platform.TELEGRAM
    adapter._message_handler = AsyncMock(return_value=None)
    adapter._busy_session_handler = None
    adapter._active_sessions = {}
    adapter._pending_messages = {}
    adapter._session_tasks = {}
    adapter._background_tasks = set()
    adapter._post_delivery_callbacks = {}
    adapter._expected_cancelled_tasks = set()
    adapter._fatal_error_code = None
    adapter._fatal_error_message = None
    adapter._fatal_error_retryable = True
    adapter._fatal_error_handler = None
    adapter._running = True
    adapter._auto_tts_default = False
    adapter._auto_tts_enabled_chats = set()
    adapter._auto_tts_disabled_chats = set()
    adapter._typing_paused = set()
    return adapter


@pytest.mark.asyncio
async def test_rapid_text_followups_accumulate_instead_of_replacing():
    """Two rapid TEXT follow-ups during an active session must both
    survive in ``adapter._pending_messages[session_key].text``."""
    adapter = _make_adapter()
    first = _make_event("part one")
    session_key = build_session_key(first.source)

    # Mark the session as active so subsequent messages take the
    # "already running" branch in handle_message.
    adapter._active_sessions[session_key] = asyncio.Event()

    second = _make_event("part two")
    third = _make_event("part three")

    await adapter.handle_message(second)
    await adapter.handle_message(third)

    # Both rapid follow-ups must be preserved, not just the last one.
    pending = adapter._pending_messages[session_key]
    assert pending.text == "part two\npart three", (
        f"expected accumulated text, got {pending.text!r}"
    )
    # Interrupt event must be signalled exactly like before.
    assert adapter._active_sessions[session_key].is_set()


@pytest.mark.asyncio
async def test_single_followup_is_stored_as_is():
    """One TEXT follow-up still lands as the event object itself
    (no spurious wrapping / mutation) — guards against the merge path
    breaking the simple case."""
    adapter = _make_adapter()
    first = _make_event("only one")
    session_key = build_session_key(first.source)

    adapter._active_sessions[session_key] = asyncio.Event()
    await adapter.handle_message(first)

    pending = adapter._pending_messages[session_key]
    assert pending is first
    assert pending.text == "only one"
    assert adapter._active_sessions[session_key].is_set()
|
||||
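The difference between the old single-slot replacement and the accumulating behaviour can be shown with a small stand-alone model. This is a simplified sketch, not `merge_pending_message_event` itself: `PendingText` and `queue_followup` are hypothetical names, and the newline join is taken from the `"part two\npart three"` expectation in the test above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PendingText:
    """Stand-in for a TEXT MessageEvent; only the text field matters here."""

    text: str


def queue_followup(pending: dict[str, PendingText], session_key: str, event: PendingText) -> None:
    """Accumulate follow-up text into the pending slot instead of replacing it."""
    existing = pending.get(session_key)
    if existing is None:
        # First follow-up: store the event object itself, unmodified.
        pending[session_key] = event
        return
    # Later follow-ups: append to the existing event's text, newline-separated.
    existing.text = f"{existing.text}\n{event.text}" if existing.text else event.text
```

With the old behaviour the last line would have been `pending[session_key] = event`, which is exactly how earlier messages got dropped.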
|
|
@ -839,3 +839,108 @@ class TestGitHubTokenCheck:
|
|||
|
||||
        assert "gh auth" in str(call_log) or any(c[0] == "gh" for c in call_log), f"gh not called: {call_log}"
        assert "GitHub authenticated via gh CLI" in out or "token configured" in out


def _run_doctor_with_healthy_oauth_fallback(
    monkeypatch,
    tmp_path,
    *,
    env_key: str,
    bad_key: str,
    failing_host: str,
    gemini_oauth_status: dict,
    minimax_oauth_status: dict,
) -> str:
    home = tmp_path / ".hermes"
    home.mkdir(parents=True, exist_ok=True)
    (home / "config.yaml").write_text(
        "model:\n"
        "  provider: nous\n"
        "  default: moonshotai/kimi-k2.6\n",
        encoding="utf-8",
    )
    project = tmp_path / "project"
    project.mkdir(exist_ok=True)

    monkeypatch.setattr(doctor_mod, "HERMES_HOME", home)
    monkeypatch.setattr(doctor_mod, "PROJECT_ROOT", project)
    monkeypatch.setattr(doctor_mod, "_DHH", str(home))
    monkeypatch.setenv(env_key, bad_key)
    monkeypatch.delenv("OPENROUTER_API_KEY", raising=False)
    monkeypatch.delenv("OPENAI_API_KEY", raising=False)
    monkeypatch.delenv("GEMINI_API_KEY", raising=False)
    monkeypatch.delenv("GOOGLE_API_KEY", raising=False)
    monkeypatch.delenv("MINIMAX_API_KEY", raising=False)
    monkeypatch.delenv("MINIMAX_CN_API_KEY", raising=False)
    # Re-apply: the delenv calls above may have cleared env_key.
    monkeypatch.setenv(env_key, bad_key)

    fake_model_tools = types.SimpleNamespace(
        check_tool_availability=lambda *a, **kw: ([], []),
        TOOLSET_REQUIREMENTS={},
    )
    monkeypatch.setitem(sys.modules, "model_tools", fake_model_tools)

    from hermes_cli import auth as _auth_mod

    monkeypatch.setattr(_auth_mod, "get_nous_auth_status", lambda: {"logged_in": True})
    monkeypatch.setattr(_auth_mod, "get_codex_auth_status", lambda: {})
    monkeypatch.setattr(_auth_mod, "get_gemini_oauth_auth_status", lambda: gemini_oauth_status)
    monkeypatch.setattr(_auth_mod, "get_minimax_oauth_auth_status", lambda: minimax_oauth_status)

    def fake_get(url, headers=None, timeout=None):
        status = 401 if failing_host in url else 200
        return types.SimpleNamespace(status_code=status)

    import httpx

    monkeypatch.setattr(httpx, "get", fake_get)

    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        doctor_mod.run_doctor(Namespace(fix=False))
    return buf.getvalue()


@pytest.mark.parametrize(
    ("env_key", "bad_key", "failing_host", "gemini_oauth_status", "minimax_oauth_status", "unexpected_issue"),
    [
        (
            "GOOGLE_API_KEY",
            "bad-gemini-key",
            "googleapis.com",
            {"logged_in": True, "email": "user@example.com"},
            {},
            "Check GOOGLE_API_KEY in .env",
        ),
        (
            "MINIMAX_API_KEY",
            "bad-minimax-key",
            "minimax.io",
            {},
            {"logged_in": True, "region": "global"},
            "Check MINIMAX_API_KEY in .env",
        ),
    ],
)
def test_run_doctor_ignores_invalid_direct_keys_when_oauth_fallback_is_healthy(
    monkeypatch,
    tmp_path,
    env_key,
    bad_key,
    failing_host,
    gemini_oauth_status,
    minimax_oauth_status,
    unexpected_issue,
):
    out = _run_doctor_with_healthy_oauth_fallback(
        monkeypatch,
        tmp_path,
        env_key=env_key,
        bad_key=bad_key,
        failing_host=failing_host,
        gemini_oauth_status=gemini_oauth_status,
        minimax_oauth_status=minimax_oauth_status,
    )

    assert "invalid API key" in out
    assert unexpected_issue not in out
|
||||
|
|
|
|||
|
|
@ -662,6 +662,129 @@ class TestPluginContext:
|
|||
        from tools.registry import registry
        assert "plugin_echo" in registry._tools

    def test_register_tool_rejects_shadow_without_override(self, tmp_path, monkeypatch, caplog):
        """Without override=True, registering a tool name claimed by a different toolset is rejected."""
        from tools.registry import registry

        # Seed an existing entry from a non-plugin toolset.
        registry.register(
            name="shadow_target",
            toolset="terminal",
            schema={"name": "shadow_target", "description": "Built-in", "parameters": {"type": "object", "properties": {}}},
            handler=lambda args, **kw: "built-in",
        )
        original_handler = registry._tools["shadow_target"].handler
        try:
            plugins_dir = tmp_path / "hermes_test" / "plugins"
            plugin_dir = plugins_dir / "shadow_plugin"
            plugin_dir.mkdir(parents=True)
            (plugin_dir / "plugin.yaml").write_text(yaml.dump({"name": "shadow_plugin"}))
            (plugin_dir / "__init__.py").write_text(
                'def register(ctx):\n'
                '    ctx.register_tool(\n'
                '        name="shadow_target",\n'
                '        toolset="plugin_shadow_plugin",\n'
                '        schema={"name": "shadow_target", "description": "Plugin", "parameters": {"type": "object", "properties": {}}},\n'
                '        handler=lambda args, **kw: "plugin",\n'
                '    )\n'
            )
            hermes_home = tmp_path / "hermes_test"
            (hermes_home / "config.yaml").write_text(
                yaml.safe_dump({"plugins": {"enabled": ["shadow_plugin"]}})
            )
            monkeypatch.setenv("HERMES_HOME", str(hermes_home))

            with caplog.at_level(logging.ERROR, logger="tools.registry"):
                mgr = PluginManager()
                mgr.discover_and_load()

            # Original handler must still be in place — registration was rejected.
            assert registry._tools["shadow_target"].handler is original_handler
            assert registry._tools["shadow_target"].toolset == "terminal"
            # And an ERROR was logged explaining why and how to opt in.
            assert any("override=True" in r.message for r in caplog.records)
        finally:
            registry.deregister("shadow_target")

    def test_register_tool_override_replaces_existing(self, tmp_path, monkeypatch, caplog):
        """override=True lets a plugin replace an existing built-in tool."""
        from tools.registry import registry

        registry.register(
            name="override_target",
            toolset="terminal",
            schema={"name": "override_target", "description": "Built-in", "parameters": {"type": "object", "properties": {}}},
            handler=lambda args, **kw: "built-in",
        )
        try:
            plugins_dir = tmp_path / "hermes_test" / "plugins"
            plugin_dir = plugins_dir / "override_plugin"
            plugin_dir.mkdir(parents=True)
            (plugin_dir / "plugin.yaml").write_text(yaml.dump({"name": "override_plugin"}))
            (plugin_dir / "__init__.py").write_text(
                'def register(ctx):\n'
                '    ctx.register_tool(\n'
                '        name="override_target",\n'
                '        toolset="plugin_override_plugin",\n'
                '        schema={"name": "override_target", "description": "Plugin", "parameters": {"type": "object", "properties": {}}},\n'
                '        handler=lambda args, **kw: "plugin",\n'
                '        override=True,\n'
                '    )\n'
            )
            hermes_home = tmp_path / "hermes_test"
            (hermes_home / "config.yaml").write_text(
                yaml.safe_dump({"plugins": {"enabled": ["override_plugin"]}})
            )
            monkeypatch.setenv("HERMES_HOME", str(hermes_home))

            with caplog.at_level(logging.INFO, logger="tools.registry"):
                mgr = PluginManager()
                mgr.discover_and_load()

            # Plugin handler replaced the built-in one.
            assert registry._tools["override_target"].toolset == "plugin_override_plugin"
            assert registry._tools["override_target"].handler({}) == "plugin"
            # Override is audit-logged at INFO.
            assert any(
                "overriding existing" in r.message and "override_target" in r.message
                for r in caplog.records
            )
            # Plugin tracks it.
            assert "override_target" in mgr._plugin_tool_names
        finally:
            registry.deregister("override_target")

    def test_register_tool_override_on_new_name_is_noop_path(self, tmp_path, monkeypatch):
        """override=True on a brand-new name still registers cleanly (no existing entry to replace)."""
        from tools.registry import registry

        plugins_dir = tmp_path / "hermes_test" / "plugins"
        plugin_dir = plugins_dir / "new_override_plugin"
        plugin_dir.mkdir(parents=True)
        (plugin_dir / "plugin.yaml").write_text(yaml.dump({"name": "new_override_plugin"}))
        (plugin_dir / "__init__.py").write_text(
            'def register(ctx):\n'
            '    ctx.register_tool(\n'
            '        name="brand_new_override_tool",\n'
            '        toolset="plugin_new_override_plugin",\n'
            '        schema={"name": "brand_new_override_tool", "description": "New", "parameters": {"type": "object", "properties": {}}},\n'
            '        handler=lambda args, **kw: "ok",\n'
            '        override=True,\n'
            '    )\n'
        )
        hermes_home = tmp_path / "hermes_test"
        (hermes_home / "config.yaml").write_text(
            yaml.safe_dump({"plugins": {"enabled": ["new_override_plugin"]}})
        )
        monkeypatch.setenv("HERMES_HOME", str(hermes_home))

        try:
            mgr = PluginManager()
            mgr.discover_and_load()
            assert "brand_new_override_tool" in registry._tools
        finally:
            registry.deregister("brand_new_override_tool")
|
||||
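The three tests above describe a registration policy: a name owned by another toolset is protected unless the caller opts in with `override=True`, and an override is logged. A compact model of that policy is sketched below. `SketchRegistry` and `ToolEntry` are hypothetical stand-ins and say nothing about how `tools.registry` is actually written; only the log wording ("override=True", "overriding existing") mirrors what the tests look for.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("sketch.registry")


@dataclass
class ToolEntry:
    name: str
    toolset: str
    handler: Callable[..., str]


class SketchRegistry:
    """Toy registry that refuses cross-toolset shadowing unless override=True."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolEntry] = {}

    def register(self, name: str, toolset: str, handler: Callable[..., str], override: bool = False) -> bool:
        existing = self._tools.get(name)
        if existing is not None and existing.toolset != toolset:
            if not override:
                # Reject and explain how to opt in; the existing entry is untouched.
                logger.error(
                    "tool %r is already registered by toolset %r; pass override=True to replace it",
                    name,
                    existing.toolset,
                )
                return False
            logger.info("overriding existing tool %r from toolset %r", name, existing.toolset)
        self._tools[name] = ToolEntry(name, toolset, handler)
        return True
```

Returning a boolean is a choice made for this sketch so callers can observe a rejection; the tests in the diff observe it through the log record and the unchanged handler.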
|
||||
|
||||
# ── TestPluginToolVisibility ───────────────────────────────────────────────
|
||||
|
||||
|
|
|
|||
|
|
@ -2269,6 +2269,60 @@ class TestParallelScopePathNormalization:
|
|||
        assert not _should_parallelize_tool_batch([tc1, tc2])


class TestMcpParallelToolBatch:
    """Integration test: _should_parallelize_tool_batch respects MCP parallel flag."""

    def test_mcp_tools_default_sequential(self):
        """MCP tools without supports_parallel_tool_calls are sequential."""
        from run_agent import _should_parallelize_tool_batch
        tc1 = _mock_tool_call(name="mcp_github_list_repos", arguments='{"org":"openai"}', call_id="c1")
        tc2 = _mock_tool_call(name="mcp_github_search_code", arguments='{"q":"test"}', call_id="c2")
        assert not _should_parallelize_tool_batch([tc1, tc2])

    def test_mcp_tools_parallel_when_server_opted_in(self):
        """MCP tools from a parallel-safe server can run concurrently."""
        from run_agent import _should_parallelize_tool_batch
        from tools.mcp_tool import _parallel_safe_servers, _lock
        with _lock:
            _parallel_safe_servers.add("github")
        try:
            tc1 = _mock_tool_call(name="mcp_github_list_repos", arguments='{"org":"openai"}', call_id="c1")
            tc2 = _mock_tool_call(name="mcp_github_search_code", arguments='{"q":"test"}', call_id="c2")
            assert _should_parallelize_tool_batch([tc1, tc2])
        finally:
            with _lock:
                _parallel_safe_servers.discard("github")

    def test_mixed_mcp_and_builtin_parallel(self):
        """MCP parallel tools mixed with built-in parallel-safe tools."""
        from run_agent import _should_parallelize_tool_batch
        from tools.mcp_tool import _parallel_safe_servers, _lock
        with _lock:
            _parallel_safe_servers.add("docs")
        try:
            tc1 = _mock_tool_call(name="mcp_docs_search", arguments='{"query":"api"}', call_id="c1")
            tc2 = _mock_tool_call(name="web_search", arguments='{"query":"test"}', call_id="c2")
            assert _should_parallelize_tool_batch([tc1, tc2])
        finally:
            with _lock:
                _parallel_safe_servers.discard("docs")

    def test_mixed_parallel_and_serial_mcp_servers(self):
        """One parallel MCP server + one non-parallel MCP server = sequential."""
        from run_agent import _should_parallelize_tool_batch
        from tools.mcp_tool import _parallel_safe_servers, _lock
        with _lock:
            _parallel_safe_servers.add("docs")
        # "github" is NOT in _parallel_safe_servers
        try:
            tc1 = _mock_tool_call(name="mcp_docs_search", arguments='{"query":"api"}', call_id="c1")
            tc2 = _mock_tool_call(name="mcp_github_list_repos", arguments='{"org":"openai"}', call_id="c2")
            assert not _should_parallelize_tool_batch([tc1, tc2])
        finally:
            with _lock:
                _parallel_safe_servers.discard("docs")
|
||||
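The decision these four tests exercise is an all-or-nothing rule: a batch runs concurrently only if every call in it is parallel-safe. A rough stand-alone model follows. It is not `_should_parallelize_tool_batch`, which per the surrounding test classes also considers things like path scopes; `can_run_batch_in_parallel`, `PARALLEL_SAFE_BUILTINS`, and the `mcp_<server>_` prefix match are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Iterable

# Hypothetical allow-list of built-in tools that are safe to run concurrently.
PARALLEL_SAFE_BUILTINS = frozenset({"web_search"})


def _is_parallel_safe(tool_name: str, parallel_safe_servers: Iterable[str]) -> bool:
    if tool_name.startswith("mcp_"):
        # MCP tools are sequential by default; only opted-in servers qualify.
        return any(tool_name.startswith(f"mcp_{server}_") for server in parallel_safe_servers)
    return tool_name in PARALLEL_SAFE_BUILTINS


def can_run_batch_in_parallel(tool_names: list[str], parallel_safe_servers: Iterable[str]) -> bool:
    """True only when the batch has 2+ calls and every one is parallel-safe."""
    if len(tool_names) < 2:
        return False
    servers = set(parallel_safe_servers)
    return all(_is_parallel_safe(name, servers) for name in tool_names)
```

One unsafe call is enough to force the whole batch sequential, which is why the mixed `docs` plus `github` case fails even though `docs` has opted in.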
|
||||
|
||||
class TestHandleMaxIterations:
    def test_returns_summary(self, agent):
        resp = _mock_response(content="Here is a summary of what I did.")
|
||||
|
|
|
|||
|
|
@ -999,6 +999,88 @@ class TestAnthropicStreamCallbacks:
|
|||
|
||||
        assert touch_calls.count("receiving stream response") == len(events)

    @patch("run_agent.AIAgent._replace_primary_openai_client")
    def test_anthropic_stream_parser_valueerror_retries_before_delivery(
        self, mock_replace, monkeypatch,
    ):
        """Malformed Anthropic event-stream frames retry instead of surfacing HTTP None."""
        from run_agent import AIAgent

        agent = AIAgent(
            api_key="test-key",
            base_url="https://api.minimax.io/anthropic",
            provider="minimax",
            model="MiniMax-M2.7",
            quiet_mode=True,
            skip_context_files=True,
            skip_memory=True,
        )
        agent.api_mode = "anthropic_messages"
        agent._interrupt_requested = False
        monkeypatch.setenv("HERMES_STREAM_RETRIES", "1")

        class _BadStream:
            response = None

            def __enter__(self):
                return self

            def __exit__(self, *_args):
                return False

            def __iter__(self):
                raise ValueError("expected ident at line 1 column 149")

        final_message = SimpleNamespace(content=[], stop_reason="end_turn")
        good_stream = MagicMock()
        good_stream.__enter__ = MagicMock(return_value=good_stream)
        good_stream.__exit__ = MagicMock(return_value=False)
        good_stream.__iter__ = MagicMock(return_value=iter([]))
        good_stream.get_final_message.return_value = final_message

        agent._anthropic_client = MagicMock()
        agent._anthropic_client.messages.stream.side_effect = [
            _BadStream(),
            good_stream,
        ]

        response = agent._interruptible_streaming_api_call({})

        assert response is final_message
        assert agent._anthropic_client.messages.stream.call_count == 2
        assert mock_replace.call_count == 1

    @patch("run_agent.AIAgent._replace_primary_openai_client")
    def test_generic_anthropic_valueerror_still_propagates_without_stream_retry(
        self, mock_replace, monkeypatch,
    ):
        """Only known provider stream parser ValueErrors are treated as transient."""
        from run_agent import AIAgent

        agent = AIAgent(
            api_key="test-key",
            base_url="https://api.minimax.io/anthropic",
            provider="minimax",
            model="MiniMax-M2.7",
            quiet_mode=True,
            skip_context_files=True,
            skip_memory=True,
        )
        agent.api_mode = "anthropic_messages"
        agent._interrupt_requested = False
        monkeypatch.setenv("HERMES_STREAM_RETRIES", "1")

        agent._anthropic_client = MagicMock()
        agent._anthropic_client.messages.stream.side_effect = ValueError(
            "invalid local request shape"
        )

        with pytest.raises(ValueError, match="invalid local request shape"):
            agent._interruptible_streaming_api_call({})

        assert agent._anthropic_client.messages.stream.call_count == 1
        assert mock_replace.call_count == 0
|
||||
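The pair of tests above separates two failure points: a `ValueError` raised while iterating the stream (a malformed frame, retried) and a `ValueError` raised while opening it (a local request problem, propagated). A generic version of that split might look like the sketch below. `consume_with_retry` and the `is_transient` predicate are hypothetical and are not the logic inside `_interruptible_streaming_api_call`.

```python
from __future__ import annotations

from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def consume_with_retry(
    open_stream: Callable[[], Iterable[T]],
    is_transient: Callable[[ValueError], bool],
    max_retries: int = 1,
) -> list[T]:
    """Collect a stream, reopening it when iteration fails with a transient parser error."""
    attempt = 0
    while True:
        # Errors raised while opening the stream are outside the try block,
        # so they propagate immediately and are never retried.
        stream = open_stream()
        try:
            # Items are buffered and returned only on success, so nothing is
            # delivered to the caller before a retry happens.
            return list(stream)
        except ValueError as exc:
            if attempt >= max_retries or not is_transient(exc):
                raise
            attempt += 1
```

Buffering the whole stream is a simplification; a real streaming client would need to track whether any chunk had already been delivered before deciding a retry is safe.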
|
||||
|
||||
class TestPartialToolCallWarning:
    """Regression: when a stream dies mid tool-call argument generation after
|
||||
|
|
@ -1504,4 +1586,3 @@ class TestCopilotACPStreamingDecision:
|
|||
_use_streaming = False
|
||||
|
||||
assert _use_streaming is True
|
||||
|
||||
|
|
|
|||
102
tests/skills/test_darwinian_evolver_skill.py
Normal file
|
|
@ -0,0 +1,102 @@
"""
Smoke tests for the darwinian-evolver optional skill.

We can't actually run the evolution loop in CI (it needs network + a paid LLM),
so these tests verify:
  - SKILL.md frontmatter conforms to the hardline format
  - shipped scripts parse as valid Python
  - the scripts reference the right env var / module paths
"""
from __future__ import annotations

import ast
import re
from pathlib import Path

import pytest
import yaml

SKILL_DIR = Path(__file__).resolve().parents[2] / "optional-skills" / "research" / "darwinian-evolver"


@pytest.fixture(scope="module")
def frontmatter() -> dict:
    src = (SKILL_DIR / "SKILL.md").read_text()
    m = re.search(r"^---\n(.*?)\n---", src, re.DOTALL)
    assert m, "SKILL.md missing YAML frontmatter"
    return yaml.safe_load(m.group(1))


def test_skill_dir_exists() -> None:
    assert SKILL_DIR.is_dir(), f"missing skill dir: {SKILL_DIR}"


def test_skill_md_present() -> None:
    assert (SKILL_DIR / "SKILL.md").is_file()


def test_description_under_60_chars(frontmatter) -> None:
    desc = frontmatter["description"]
    assert len(desc) <= 60, f"description is {len(desc)} chars (hardline ≤60): {desc!r}"


def test_name_matches_dir(frontmatter) -> None:
    assert frontmatter["name"] == "darwinian-evolver"


def test_platforms_excludes_windows(frontmatter) -> None:
    # Upstream uses func_timeout (POSIX signals) and uv subprocess pipelines; the
    # skill is gated [linux, macos]. If we ever port to Windows, update this test
    # to assert ["linux", "macos", "windows"].
    assert "windows" not in frontmatter["platforms"]
    assert set(frontmatter["platforms"]) >= {"linux", "macos"}


def test_author_credits_contributor(frontmatter) -> None:
    author = frontmatter["author"]
    assert "Bihruze" in author, f"author should credit the original contributor: {author!r}"


def test_license_mit(frontmatter) -> None:
    assert frontmatter["license"] == "MIT"


@pytest.mark.parametrize(
    "path",
    [
        "scripts/parrot_openrouter.py",
        "scripts/show_snapshot.py",
        "templates/custom_problem_template.py",
    ],
)
def test_shipped_scripts_parse(path: str) -> None:
    src = (SKILL_DIR / path).read_text()
    ast.parse(src)  # raises SyntaxError on broken Python


def test_parrot_script_uses_openrouter() -> None:
    src = (SKILL_DIR / "scripts" / "parrot_openrouter.py").read_text()
    assert "OPENROUTER_API_KEY" in src, "parrot driver should read OPENROUTER_API_KEY"
    assert "openrouter.ai/api/v1" in src, "parrot driver should target OpenRouter"
    assert "EVOLVER_MODEL" in src, "model should be overridable via EVOLVER_MODEL"


def test_parrot_script_has_error_swallowing() -> None:
    """Provider content-filter / rate-limit must not kill the run — see Pitfall 2."""
    src = (SKILL_DIR / "scripts" / "parrot_openrouter.py").read_text()
    assert "LLM_ERROR" in src, "_prompt_llm should swallow provider errors and tag them"


def test_skill_calls_out_agpl(frontmatter) -> None:
    """The upstream tool is AGPL-3.0. The skill MUST flag this so users don't
    import it into MIT-licensed code by accident."""
    src = (SKILL_DIR / "SKILL.md").read_text()
    assert "AGPL" in src, "SKILL.md must mention upstream AGPL license"


def test_skill_pitfalls_section_present() -> None:
    src = (SKILL_DIR / "SKILL.md").read_text()
    assert "## Pitfalls" in src
    # Pitfalls we discovered during the spike — keep them in sync with reality.
    assert "Initial organism must be viable" in src
    assert "generator" in src  # loop.run() pitfall
137
tests/test_sanitize_tool_error.py
Normal file
@ -0,0 +1,137 @@
"""Tests for `_sanitize_tool_error` in model_tools.

Ported from ironclaw#1639 — defense-in-depth on tool exception strings before
they enter the model's `tool` message content. Note that `json.dumps()` in
`handle_function_call` already handles quote/backslash escaping at the wire
layer; this helper exists to strip structural framing tokens the model
itself might react to (XML role tags, CDATA, markdown code fences) and to
cap pathological lengths.
"""
from __future__ import annotations

from model_tools import _sanitize_tool_error, _TOOL_ERROR_MAX_LEN


class TestRoleTagStripping:
    def test_strips_tool_call_tags(self):
        out = _sanitize_tool_error("bad <tool_call>injected</tool_call> happened")
        assert "<tool_call>" not in out
        assert "</tool_call>" not in out
        assert "bad injected happened" in out

    def test_strips_function_call_tags(self):
        out = _sanitize_tool_error("<function_call>x</function_call>")
        assert "<function_call>" not in out
        assert "</function_call>" not in out

    def test_strips_role_tags(self):
        # Each of these should be stripped
        for tag in ("system", "assistant", "user", "result", "response", "output", "input"):
            raw = f"prefix <{tag}>hi</{tag}> suffix"
            out = _sanitize_tool_error(raw)
            assert f"<{tag}>" not in out, f"failed to strip <{tag}>"
            assert f"</{tag}>" not in out, f"failed to strip </{tag}>"

    def test_role_tag_strip_is_case_insensitive(self):
        out = _sanitize_tool_error("<TOOL_CALL>x</Tool_Call>")
        assert "<" not in out.replace("[TOOL_ERROR]", "")  # only the prefix bracket survives

    def test_unrelated_xml_kept(self):
        # We intentionally only strip the role-like tag whitelist, not all XML
        out = _sanitize_tool_error("Error parsing <ParseError>line 5</ParseError>")
        assert "<ParseError>" in out


class TestCDATAStripping:
    def test_strips_cdata(self):
        out = _sanitize_tool_error("error: <![CDATA[malicious]]> here")
        assert "<![CDATA[" not in out
        assert "]]>" not in out

    def test_strips_multiline_cdata(self):
        out = _sanitize_tool_error("a\n<![CDATA[line1\nline2]]>\nb")
        assert "CDATA" not in out
        assert "a" in out and "b" in out


class TestCodeFenceStripping:
    def test_strips_leading_fence_with_lang(self):
        out = _sanitize_tool_error("```json\n{\"x\": 1}")
        assert not out.replace("[TOOL_ERROR] ", "").startswith("```")

    def test_strips_trailing_fence(self):
        out = _sanitize_tool_error("payload\n```")
        assert not out.rstrip().endswith("```")

    def test_strips_bare_fence(self):
        out = _sanitize_tool_error("```\nstuff")
        assert "```" not in out.split("\n")[0]


class TestTruncation:
    def test_caps_long_input(self):
        long = "A" * (_TOOL_ERROR_MAX_LEN * 2)
        out = _sanitize_tool_error(long)
        # Total length is prefix + truncated body
        body = out[len("[TOOL_ERROR] "):]
        assert len(body) == _TOOL_ERROR_MAX_LEN
        assert body.endswith("...")

    def test_does_not_truncate_short_input(self):
        msg = "short error"
        out = _sanitize_tool_error(msg)
        assert "..." not in out
        assert msg in out


class TestEnvelope:
    def test_wraps_with_prefix(self):
        out = _sanitize_tool_error("oh no")
        assert out.startswith("[TOOL_ERROR] ")

    def test_empty_input(self):
        out = _sanitize_tool_error("")
        assert out == "[TOOL_ERROR] "

    def test_preserves_normal_error_text(self):
        msg = "Error executing read_file: FileNotFoundError: /tmp/missing"
        out = _sanitize_tool_error(msg)
        assert msg in out


class TestHandleFunctionCallIntegration:
    """Verify handle_function_call routes exception-path errors through the sanitizer.

    Note: the "Unknown tool: ..." early-return in tools/registry.py is a
    *different* code path from `except Exception` in handle_function_call —
    that one returns directly without sanitization (and there's nothing to
    sanitize in a hardcoded format string anyway). This test exercises the
    real exception path by passing args that make a known tool raise.
    """

    def test_exception_path_error_is_sanitized(self):
        import json
        from model_tools import handle_function_call
        from tools.registry import registry as _registry

        # Force a known tool to raise with a payload containing role tags.
        def boom(_args, **_kwargs):
            raise RuntimeError("<tool_call>injected</tool_call> boom")

        all_tools = _registry.get_all_tool_names()
        assert all_tools, "no tools registered — test environment broken"
        target = all_tools[0]
        original = _registry._tools[target].handler
        _registry._tools[target].handler = boom
        try:
            result_str = handle_function_call(target, {})
        finally:
            _registry._tools[target].handler = original

        payload = json.loads(result_str)
        assert "error" in payload, payload
        assert payload["error"].startswith("[TOOL_ERROR] "), payload["error"]
        # Role-tag stripping carried through
        assert "<tool_call>" not in payload["error"]
        assert "</tool_call>" not in payload["error"]
        assert "boom" in payload["error"]
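The behaviour pinned down by the sanitizer tests above can be sketched roughly as follows. This is a hypothetical reimplementation for illustration only, not the actual `model_tools._sanitize_tool_error`: the names `sanitize_tool_error_sketch`, `MAX_LEN`, and `PREFIX`, the 2000-character cap, and the choice to drop CDATA contents entirely are all assumptions.

```python
import re

MAX_LEN = 2000  # assumed cap; the real _TOOL_ERROR_MAX_LEN may differ
PREFIX = "[TOOL_ERROR] "

# Only role-like tags are stripped; unrelated XML such as <ParseError> survives.
_ROLE_TAGS = re.compile(
    r"</?(?:tool_call|function_call|system|assistant|user|result|response|output|input)>",
    re.IGNORECASE,
)
# Assumption: the whole CDATA section (markers and contents) is removed.
_CDATA = re.compile(r"<!\[CDATA\[.*?\]\]>", re.DOTALL)
# A fence at the very start (with optional language tag) or at the very end.
_FENCE = re.compile(r"^```[A-Za-z0-9_-]*[ \t]*\n?|\n?```[ \t]*$")


def sanitize_tool_error_sketch(message: str) -> str:
    """Strip framing tokens, cap the length, and wrap in a fixed prefix."""
    text = _CDATA.sub("", message)
    text = _ROLE_TAGS.sub("", text)
    text = _FENCE.sub("", text)
    if len(text) > MAX_LEN:
        # Keep the body at exactly MAX_LEN characters, ellipsis included.
        text = text[: MAX_LEN - 3] + "..."
    return PREFIX + text
```

Keeping the tag list explicit, rather than stripping every angle-bracketed token, is what lets ordinary error text that happens to contain XML pass through unchanged.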
@ -1102,3 +1102,206 @@ class TestDetectSudoStdin:
            "make 2>&1 | tee build.log"
        )
        assert is_dangerous is False


class TestMacOSPrivateSystemPaths:
    """Inspired by Claude Code 2.1.113 "dangerous path protection".

    On macOS, /etc, /var, /tmp, /home are symlinks to
    /private/{etc,var,tmp,home}. A command that writes to
    /private/etc/sudoers works identically to /etc/sudoers but bypasses
    a plain "/etc/" pattern check. These tests guard the shared
    _SYSTEM_CONFIG_PATH fragment used across redirect / tee / cp / mv /
    install / sed -i patterns.
    """

    def test_private_etc_redirect(self):
        dangerous, _, desc = detect_dangerous_command(
            "echo 'root ALL=NOPASSWD: ALL' > /private/etc/sudoers"
        )
        assert dangerous is True
        assert "system config" in desc.lower()

    def test_private_var_redirect(self):
        dangerous, _, _ = detect_dangerous_command(
            "echo payload > /private/var/db/dslocal/nodes/x"
        )
        assert dangerous is True

    def test_private_etc_via_tee(self):
        dangerous, _, desc = detect_dangerous_command(
            "echo malicious | tee /private/etc/hosts"
        )
        assert dangerous is True
        assert "tee" in desc.lower() or "system" in desc.lower()

    def test_private_etc_cp(self):
        dangerous, _, desc = detect_dangerous_command(
            "cp malicious.conf /private/etc/hosts"
        )
        assert dangerous is True
        assert "copy" in desc.lower() or "system config" in desc.lower()

    def test_private_etc_mv(self):
        dangerous, _, _ = detect_dangerous_command(
            "mv evil /private/etc/ssh/sshd_config"
        )
        assert dangerous is True

    def test_private_etc_install(self):
        dangerous, _, _ = detect_dangerous_command(
            "install -m 600 key /private/etc/ssh/keys"
        )
        assert dangerous is True

    def test_private_etc_sed_in_place(self):
        dangerous, _, desc = detect_dangerous_command(
            "sed -i 's/root/pwned/' /private/etc/passwd"
        )
        assert dangerous is True
        assert "in-place" in desc.lower() or "system config" in desc.lower()

    def test_private_var_sed_long_flag(self):
        dangerous, _, _ = detect_dangerous_command(
            "sed --in-place 's/x/y/' /private/var/log/wtmp"
        )
        assert dangerous is True

    def test_private_tmp_cp(self):
        dangerous, _, _ = detect_dangerous_command(
            "cp rootkit /private/tmp/payload"
        )
        assert dangerous is True

    def test_ls_private_is_safe(self):
        """Reading under /private/ must not trigger approval."""
        dangerous, _, _ = detect_dangerous_command("ls /private")
        assert dangerous is False

    def test_echo_mentioning_private_path_is_safe(self):
        """Literal mention of /private/etc in an echo string must not fire."""
        dangerous, _, _ = detect_dangerous_command(
            "echo 'the macOS path is /private/etc on disk'"
        )
        assert dangerous is False


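One way to read the "shared fragment" idea from the docstring above is a single path regex reused by every write-style rule. The sketch below is an assumption about the shape of such a design, not the project's actual `_SYSTEM_CONFIG_PATH` or `detect_dangerous_command`; `SYSTEM_CONFIG_PATH`, `WRITE_PATTERNS`, `writes_system_config`, and the description strings are hypothetical names.

```python
import re

# Assumed fragment: plain /etc/ plus the macOS /private/{etc,var,tmp,home}/
# mirrors. The real pattern in the codebase may cover more or fewer paths.
SYSTEM_CONFIG_PATH = r"(?:/etc/|/private/(?:etc|var|tmp|home)/)"

# Hypothetical write-style rules that all reuse the same fragment, so adding
# a new protected prefix only requires touching one string.
WRITE_PATTERNS = [
    (re.compile(r">>?\s*" + SYSTEM_CONFIG_PATH),
     "redirect into system config path"),
    (re.compile(r"\btee\b[^|;&]*\s" + SYSTEM_CONFIG_PATH),
     "tee into system config path"),
    (re.compile(r"\b(?:cp|mv|install)\b[^|;&]*\s" + SYSTEM_CONFIG_PATH),
     "copy/move into system config path"),
    (re.compile(r"\bsed\s+(?:-i\S*|--in-place\S*)\s[^|;&]*" + SYSTEM_CONFIG_PATH),
     "in-place edit of system config path"),
]


def writes_system_config(command: str):
    """Return (True, description) for the first matching write rule."""
    for pattern, description in WRITE_PATTERNS:
        if pattern.search(command):
            return True, description
    return False, None
```

Because the rules only fire when a write verb or redirect precedes the path, read-only commands such as `ls /private` and a quoted mention of `/private/etc` (without a trailing slash after a write verb) stay unflagged in this sketch.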
class TestKillallKillSignals:
    """Inspired by Claude Code 2.1.113 expanded deny rules.

    The existing pattern caught `pkill -9` but not the equivalent
    `killall -9` / `-KILL` / `-s KILL` / `-r <regex>` broad sweeps that
    can wipe out unrelated processes.
    """

    def test_killall_dash_9(self):
        dangerous, _, desc = detect_dangerous_command("killall -9 firefox")
        assert dangerous is True
        assert "kill" in desc.lower()

    def test_killall_dash_kill(self):
        dangerous, _, _ = detect_dangerous_command("killall -KILL firefox")
        assert dangerous is True

    def test_killall_dash_sigkill(self):
        dangerous, _, _ = detect_dangerous_command("killall -SIGKILL firefox")
        assert dangerous is True

    def test_killall_dash_s_kill(self):
        dangerous, _, _ = detect_dangerous_command("killall -s KILL firefox")
        assert dangerous is True

    def test_killall_dash_s_signum(self):
        dangerous, _, _ = detect_dangerous_command("killall -s 9 firefox")
        assert dangerous is True

    def test_killall_regex(self):
        """killall -r <regex> is a broad sweep; require approval."""
        dangerous, _, desc = detect_dangerous_command("killall -r 'fire.*'")
        assert dangerous is True
        assert "regex" in desc.lower() or "kill" in desc.lower()

    def test_killall_combined_flags(self):
        dangerous, _, _ = detect_dangerous_command("killall -9 -r 'herm.*'")
        assert dangerous is True

    def test_killall_list_signals_is_safe(self):
        """`killall -l` lists signals and is harmless — must not fire."""
        dangerous, _, _ = detect_dangerous_command("killall -l")
        assert dangerous is False

    def test_killall_version_is_safe(self):
        dangerous, _, _ = detect_dangerous_command("killall -V")
        assert dangerous is False


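The flag combinations exercised above could be matched by a single regular expression along these lines. This is only a sketch under assumptions: `KILLALL_BROAD` and `is_broad_killall` are hypothetical names, and the project's real pattern is not shown in this diff.

```python
import re

# Hypothetical pattern: killall with a force-kill signal (-9, -KILL, -SIGKILL,
# -s 9, -s KILL) or a regex sweep (-r). Informational flags such as -l and -V
# do not match because they are not in the alternation.
KILLALL_BROAD = re.compile(
    r"\bkillall\b.*\s-(?:9|(?:SIG)?KILL|s\s+(?:9|(?:SIG)?KILL)|r)(?=\s|$)"
)


def is_broad_killall(command: str) -> bool:
    """True when the command is a forceful or regex-based killall."""
    return KILLALL_BROAD.search(command) is not None
```

The trailing lookahead `(?=\s|$)` keeps a flag like `-9` from matching as a prefix of some longer, unrelated token.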
class TestFindExecdir:
    """Inspired by Claude Code 2.1.113 tightening of find rules.

    `find -execdir rm` has the same destructive effect as `find -exec rm`
    but runs in each match's directory. Previously missed because the
    pattern required a literal `-exec ` followed by a space.
    """

    def test_find_execdir_rm(self):
        dangerous, _, desc = detect_dangerous_command(
            "find . -execdir rm {} \\;"
        )
        assert dangerous is True
        assert "find" in desc.lower() or "rm" in desc.lower()

    def test_find_execdir_with_absolute_rm(self):
        dangerous, _, _ = detect_dangerous_command(
            "find /var -execdir /bin/rm -rf {} \\;"
        )
        assert dangerous is True

    def test_find_exec_rm_still_caught(self):
        """Original -exec pattern must still fire (regression guard)."""
        dangerous, _, _ = detect_dangerous_command(
            "find . -exec rm {} \\;"
        )
        assert dangerous is True

    def test_find_execdir_ls_is_safe(self):
        """-execdir with a read-only command is not dangerous."""
        dangerous, _, _ = detect_dangerous_command(
            "find . -execdir ls {} \\;"
        )
        assert dangerous is False


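The gap described in the docstring above (a pattern that only knew `-exec`) can be closed by making the `dir` suffix optional. The following is a minimal sketch, assuming hypothetical names `FIND_EXEC_RM` and `is_find_exec_rm`; it is not the pattern used by `detect_dangerous_command`.

```python
import re

# Hypothetical pattern: "-exec" or "-execdir", followed by rm either bare or
# with a path prefix such as /bin/rm.
FIND_EXEC_RM = re.compile(r"\bfind\b.*\s-exec(?:dir)?\s+(?:\S*/)?rm\b")


def is_find_exec_rm(command: str) -> bool:
    """True when find is asked to run rm on its matches."""
    return FIND_EXEC_RM.search(command) is not None
```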
class TestEtcPatternsUnaffectedByRefactor:
    """Regression guard: the /etc/ patterns were refactored to share the
    _SYSTEM_CONFIG_PATH fragment with the /private/ mirror. Make sure the
    existing /etc/ coverage remains identical.
    """

    def test_etc_redirect(self):
        dangerous, _, _ = detect_dangerous_command("echo x > /etc/hosts")
        assert dangerous is True

    def test_etc_cp(self):
        dangerous, _, _ = detect_dangerous_command("cp evil /etc/hosts")
        assert dangerous is True

    def test_etc_sed_inline(self):
        dangerous, _, _ = detect_dangerous_command(
            "sed -i 's/a/b/' /etc/hosts"
        )
        assert dangerous is True

    def test_etc_tee(self):
        dangerous, _, _ = detect_dangerous_command(
            "echo x | tee /etc/hosts"
        )
        assert dangerous is True

    def test_cat_etc_hostname_is_safe(self):
        """Reading /etc/ files is safe — only writes require approval."""
        dangerous, _, _ = detect_dangerous_command("cat /etc/hostname")
        assert dangerous is False

    def test_grep_etc_passwd_is_safe(self):
        dangerous, _, _ = detect_dangerous_command("grep root /etc/passwd")
        assert dangerous is False
@ -890,6 +890,63 @@ class TestDelegationCredentialResolution(unittest.TestCase):
        self.assertEqual(creds["api_key"], "local-key")
        self.assertEqual(creds["api_mode"], "chat_completions")

    def test_direct_endpoint_auto_detects_anthropic_messages_suffix(self):
        # Issue #10213: Azure AI Foundry exposes Anthropic-compatible models at
        # a /anthropic URL suffix. Subagents must pick anthropic_messages
        # automatically, matching the main agent's runtime resolver.
        parent = _make_mock_parent(depth=0)
        cfg = {
            "model": "claude-opus-4-6",
            "provider": "custom",
            "base_url": "https://myfoundry.services.ai.azure.com/anthropic",
            "api_key": "foundry-key",
        }
        creds = _resolve_delegation_credentials(cfg, parent)
        self.assertEqual(creds["provider"], "custom")
        self.assertEqual(creds["base_url"], "https://myfoundry.services.ai.azure.com/anthropic")
        self.assertEqual(creds["api_key"], "foundry-key")
        self.assertEqual(creds["api_mode"], "anthropic_messages")

    def test_direct_endpoint_honors_explicit_api_mode(self):
        # When delegation.api_mode is set explicitly, it overrides URL-based
        # detection so users can force a transport on non-standard endpoints.
        parent = _make_mock_parent(depth=0)
        cfg = {
            "model": "claude-opus-4-6",
            "provider": "custom",
            "base_url": "https://proxy.example.com/v1",
            "api_key": "proxy-key",
            "api_mode": "anthropic_messages",
        }
        creds = _resolve_delegation_credentials(cfg, parent)
        self.assertEqual(creds["api_mode"], "anthropic_messages")

    def test_direct_endpoint_explicit_api_mode_overrides_url_detection(self):
        # Explicit api_mode in config always wins over auto-detection.
        parent = _make_mock_parent(depth=0)
        cfg = {
            "model": "claude-opus-4-6",
            "provider": "custom",
            "base_url": "https://myfoundry.services.ai.azure.com/anthropic",
            "api_key": "foundry-key",
            "api_mode": "chat_completions",
        }
        creds = _resolve_delegation_credentials(cfg, parent)
        self.assertEqual(creds["api_mode"], "chat_completions")

    def test_direct_endpoint_invalid_api_mode_falls_back_to_detection(self):
        # An invalid api_mode string must not break detection; fall back to URL heuristic.
        parent = _make_mock_parent(depth=0)
        cfg = {
            "model": "claude-opus-4-6",
            "provider": "custom",
            "base_url": "https://myfoundry.services.ai.azure.com/anthropic",
            "api_key": "foundry-key",
            "api_mode": "garbage",
        }
        creds = _resolve_delegation_credentials(cfg, parent)
        self.assertEqual(creds["api_mode"], "anthropic_messages")

    def test_direct_endpoint_returns_none_api_key_when_not_configured(self):
        # When base_url is set without api_key, api_key should be None so
        # _build_child_agent inherits the parent's key (effective_api_key = override or parent).
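The precedence rules that the `api_mode` tests in this hunk describe (a valid explicit setting wins, an invalid one is ignored, otherwise the URL suffix decides) can be summarised in a few lines. This is a sketch only: `resolve_api_mode_sketch` and `VALID_API_MODES` are hypothetical, the set of valid modes is assumed to contain just the two values that appear in the tests, and the `chat_completions` default is inferred from the context lines rather than confirmed.

```python
from typing import Optional

# Assumed set; the real resolver may accept additional transports.
VALID_API_MODES = {"chat_completions", "anthropic_messages"}


def resolve_api_mode_sketch(base_url: str, explicit: Optional[str] = None) -> str:
    """Pick a transport for a direct endpoint."""
    # 1. A valid explicit setting always wins over detection.
    if explicit in VALID_API_MODES:
        return explicit
    # 2. Invalid or missing setting: fall back to the URL heuristic. A
    #    trailing /anthropic segment implies the Anthropic Messages transport.
    if base_url.rstrip("/").lower().endswith("/anthropic"):
        return "anthropic_messages"
    # 3. Assumed default for everything else.
    return "chat_completions"
```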
@ -3762,3 +3762,135 @@ class TestRegisterMcpServers:
            )

        _servers.pop("srv", None)


# ---------------------------------------------------------------------------
# Tests for parallel tool call support (port from openai/codex#17667)
# ---------------------------------------------------------------------------

class TestMcpParallelToolCalls:
    """Tests for the supports_parallel_tool_calls config option."""

    def test_is_mcp_tool_parallel_safe_non_mcp_tool(self):
        """Non-MCP tool names always return False."""
        from tools.mcp_tool import is_mcp_tool_parallel_safe
        assert is_mcp_tool_parallel_safe("web_search") is False
        assert is_mcp_tool_parallel_safe("read_file") is False
        assert is_mcp_tool_parallel_safe("terminal") is False
        assert is_mcp_tool_parallel_safe("") is False

    def test_is_mcp_tool_parallel_safe_no_servers(self):
        """MCP tool from unknown server returns False."""
        from tools.mcp_tool import is_mcp_tool_parallel_safe, _parallel_safe_servers, _lock
        with _lock:
            _parallel_safe_servers.clear()
        assert is_mcp_tool_parallel_safe("mcp_docs_search") is False

    def test_is_mcp_tool_parallel_safe_with_flag(self):
        """MCP tool from a parallel-safe server returns True."""
        from tools.mcp_tool import is_mcp_tool_parallel_safe, _parallel_safe_servers, _lock
        with _lock:
            _parallel_safe_servers.add("docs")
        try:
            assert is_mcp_tool_parallel_safe("mcp_docs_search") is True
            assert is_mcp_tool_parallel_safe("mcp_docs_read_file") is True
            # Different server should be False
            assert is_mcp_tool_parallel_safe("mcp_github_list_repos") is False
        finally:
            with _lock:
                _parallel_safe_servers.discard("docs")

    def test_is_mcp_tool_parallel_safe_server_with_underscores(self):
        """Server names containing underscores are correctly matched."""
        from tools.mcp_tool import is_mcp_tool_parallel_safe, _parallel_safe_servers, _lock
        with _lock:
            _parallel_safe_servers.add("my_server")
        try:
            assert is_mcp_tool_parallel_safe("mcp_my_server_query") is True
        finally:
            with _lock:
                _parallel_safe_servers.discard("my_server")

    def test_is_mcp_tool_parallel_safe_no_tool_suffix(self):
        """Tool name that is just 'mcp_{server}' without a tool part returns False."""
        from tools.mcp_tool import is_mcp_tool_parallel_safe, _parallel_safe_servers, _lock
        with _lock:
            _parallel_safe_servers.add("docs")
        try:
            # "mcp_docs" has no tool part after the server name
            assert is_mcp_tool_parallel_safe("mcp_docs") is False
            # "mcp_docs_" has empty tool part
            assert is_mcp_tool_parallel_safe("mcp_docs_") is False
        finally:
            with _lock:
                _parallel_safe_servers.discard("docs")

    def test_register_mcp_servers_tracks_parallel_flag(self):
        """register_mcp_servers populates _parallel_safe_servers from config."""
        from tools.mcp_tool import (
            register_mcp_servers, _parallel_safe_servers, _lock,
            sanitize_mcp_name_component,
        )
        fake_config = {
            "parallel_srv": {
                "command": "echo",
                "supports_parallel_tool_calls": True,
            },
            "serial_srv": {
                "command": "echo",
                "supports_parallel_tool_calls": False,
            },
            "default_srv": {
                "command": "echo",
                # no supports_parallel_tool_calls key
            },
        }
        with patch("tools.mcp_tool._MCP_AVAILABLE", True), \
             patch("tools.mcp_tool._ensure_mcp_loop"), \
             patch("tools.mcp_tool._run_on_mcp_loop"), \
             patch("tools.mcp_tool._existing_tool_names", return_value=[]):
            register_mcp_servers(fake_config)

        with _lock:
            assert sanitize_mcp_name_component("parallel_srv") in _parallel_safe_servers
            assert sanitize_mcp_name_component("serial_srv") not in _parallel_safe_servers
            assert sanitize_mcp_name_component("default_srv") not in _parallel_safe_servers
            # Cleanup
            _parallel_safe_servers.discard(sanitize_mcp_name_component("parallel_srv"))

    def test_register_mcp_servers_removes_parallel_flag_on_toggle(self):
        """Toggling supports_parallel_tool_calls to false removes server from the set."""
        from tools.mcp_tool import (
            register_mcp_servers, _parallel_safe_servers, _lock,
            sanitize_mcp_name_component,
        )

        # First registration: parallel enabled
        config_on = {
            "toggle_srv": {
                "command": "echo",
                "supports_parallel_tool_calls": True,
            },
        }
        with patch("tools.mcp_tool._MCP_AVAILABLE", True), \
             patch("tools.mcp_tool._ensure_mcp_loop"), \
             patch("tools.mcp_tool._run_on_mcp_loop"), \
             patch("tools.mcp_tool._existing_tool_names", return_value=[]):
            register_mcp_servers(config_on)
        with _lock:
            assert sanitize_mcp_name_component("toggle_srv") in _parallel_safe_servers

        # Second registration: parallel disabled
        config_off = {
            "toggle_srv": {
                "command": "echo",
                "supports_parallel_tool_calls": False,
            },
        }
        with patch("tools.mcp_tool._MCP_AVAILABLE", True), \
             patch("tools.mcp_tool._ensure_mcp_loop"), \
             patch("tools.mcp_tool._run_on_mcp_loop"), \
             patch("tools.mcp_tool._existing_tool_names", return_value=[]):
            register_mcp_servers(config_off)
        with _lock:
            assert sanitize_mcp_name_component("toggle_srv") not in _parallel_safe_servers
438
tests/tools/test_x_search_tool.py
Normal file
@ -0,0 +1,438 @@
"""Tests for the X (Twitter) Search tool backed by xAI Responses API.

Covers:
- HTTP request shape (URL, headers, payload, model from config)
- Handle filter validation (allowed vs excluded mutual exclusion)
- Inline url_citation extraction from message annotations
- Structured error handling (4xx with code, 5xx retry, ReadTimeout retry)
- Credential resolution: API key path, OAuth path, both-set preference, none-set
- check_x_search_requirements gating in registry
"""

import json

import requests


class _FakeResponse:
    def __init__(self, payload, *, status_code=200, text=None):
        self._payload = payload
        self.status_code = status_code
        self.text = text if text is not None else json.dumps(payload)

    def raise_for_status(self):
        if self.status_code >= 400:
            err = requests.HTTPError(f"{self.status_code} Client Error")
            err.response = self
            raise err

    def json(self):
        return self._payload


# ---------------------------------------------------------------------------
# Original PR #10786 test coverage (HTTP shape, handle validation, citations,
# retry behavior) — preserved verbatim. Uses XAI_API_KEY env var via the
# default resolver path.
# ---------------------------------------------------------------------------

def test_x_search_posts_responses_request(monkeypatch):
    from tools.x_search_tool import x_search_tool
    from hermes_cli import __version__

    captured = {}

    def _fake_post(url, headers=None, json=None, timeout=None):
        captured["url"] = url
        captured["headers"] = headers
        captured["json"] = json
        captured["timeout"] = timeout
        return _FakeResponse(
            {
                "output_text": "People on X are discussing xAI's latest launch.",
                "citations": [{"url": "https://x.com/example/status/1", "title": "Example post"}],
            }
        )

    monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
    monkeypatch.setattr("requests.post", _fake_post)

    result = json.loads(
        x_search_tool(
            query="What are people saying about xAI on X?",
            allowed_x_handles=["xai", "@grok"],
            from_date="2026-04-01",
            to_date="2026-04-10",
            enable_image_understanding=True,
        )
    )

    tool_def = captured["json"]["tools"][0]
    assert captured["url"] == "https://api.x.ai/v1/responses"
    assert captured["headers"]["User-Agent"] == f"Hermes-Agent/{__version__}"
    assert captured["json"]["model"] == "grok-4.20-reasoning"
    assert captured["json"]["store"] is False
    assert tool_def["type"] == "x_search"
    assert tool_def["allowed_x_handles"] == ["xai", "grok"]
    assert tool_def["from_date"] == "2026-04-01"
    assert tool_def["to_date"] == "2026-04-10"
    assert tool_def["enable_image_understanding"] is True
    assert result["success"] is True
    assert result["answer"] == "People on X are discussing xAI's latest launch."


def test_x_search_rejects_conflicting_handle_filters(monkeypatch):
    from tools.x_search_tool import x_search_tool

    monkeypatch.setenv("XAI_API_KEY", "xai-test-key")

    result = json.loads(
        x_search_tool(
            query="latest xAI discussion",
            allowed_x_handles=["xai"],
            excluded_x_handles=["grok"],
        )
    )

    assert result["error"] == "allowed_x_handles and excluded_x_handles cannot be used together"


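Two behaviours from the tests above, stripping a leading `@` from handles and rejecting an allow-list combined with a deny-list, can be sketched as a small helper. The names `normalize_handles` and `build_handle_filters` are hypothetical, and raising `ValueError` is a simplification: according to the test, the real tool reports the conflict as an `error` field in its JSON result instead.

```python
from typing import Optional, Sequence


def normalize_handles(handles: Optional[Sequence[str]]) -> list[str]:
    """Trim whitespace and a leading '@'; drop empty entries."""
    cleaned = []
    for handle in handles or []:
        name = handle.strip().lstrip("@")
        if name:
            cleaned.append(name)
    return cleaned


def build_handle_filters(
    allowed: Optional[Sequence[str]] = None,
    excluded: Optional[Sequence[str]] = None,
) -> dict:
    """Return the handle-filter part of a tool definition (assumed shape)."""
    allowed_clean = normalize_handles(allowed)
    excluded_clean = normalize_handles(excluded)
    if allowed_clean and excluded_clean:
        # The two filters are mutually exclusive.
        raise ValueError(
            "allowed_x_handles and excluded_x_handles cannot be used together"
        )
    filters = {}
    if allowed_clean:
        filters["allowed_x_handles"] = allowed_clean
    if excluded_clean:
        filters["excluded_x_handles"] = excluded_clean
    return filters
```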
def test_x_search_extracts_inline_url_citations(monkeypatch):
|
||||
from tools.x_search_tool import x_search_tool
|
||||
|
||||
def _fake_post(url, headers=None, json=None, timeout=None):
|
||||
return _FakeResponse(
|
||||
{
|
||||
"output": [
|
||||
{
|
||||
"type": "message",
|
||||
"content": [
|
||||
{
|
||||
"type": "output_text",
|
||||
"text": "xAI posted an update on X.",
|
||||
"annotations": [
|
||||
{
|
||||
"type": "url_citation",
|
||||
"url": "https://x.com/xai/status/123",
|
||||
"title": "xAI update",
|
||||
"start_index": 0,
|
||||
"end_index": 3,
|
||||
}
|
||||
],
|
||||
}
|
||||
],
|
||||
}
|
||||
]
|
||||
}
|
||||
)
|
||||
|
||||
monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
|
||||
monkeypatch.setattr("requests.post", _fake_post)
|
||||
|
||||
result = json.loads(x_search_tool(query="latest post from xai"))
|
||||
|
||||
assert result["success"] is True
|
||||
assert result["answer"] == "xAI posted an update on X."
|
||||
assert result["inline_citations"] == [
|
||||
{
|
||||
"url": "https://x.com/xai/status/123",
|
||||
"title": "xAI update",
|
||||
"start_index": 0,
|
||||
"end_index": 3,
|
||||
}
|
||||
]
|
||||
|
||||
|
||||
def test_x_search_returns_structured_http_error(monkeypatch):
|
||||
from tools.x_search_tool import x_search_tool
|
||||
|
||||
class _FailingResponse:
|
||||
status_code = 403
|
||||
text = '{"code":"forbidden","error":"x_search is not enabled for this model"}'
|
||||
|
||||
def json(self):
|
||||
return {
|
||||
"code": "forbidden",
|
||||
"error": "x_search is not enabled for this model",
|
||||
}
|
||||
|
||||
def raise_for_status(self):
|
||||
err = requests.HTTPError("403 Client Error: Forbidden")
|
||||
err.response = self
|
||||
raise err
|
||||
|
||||
monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
|
||||
monkeypatch.setattr("requests.post", lambda *a, **k: _FailingResponse())
|
||||
|
||||
result = json.loads(x_search_tool(query="latest xai discussion"))
|
||||
|
||||
assert result["success"] is False
|
||||
assert result["provider"] == "xai"
|
||||
assert result["tool"] == "x_search"
|
||||
assert result["error_type"] == "HTTPError"
|
||||
assert result["error"] == "forbidden: x_search is not enabled for this model"
|
||||
|
||||
|
+def test_x_search_retries_read_timeout_then_succeeds(monkeypatch):
+    from tools.x_search_tool import x_search_tool
+
+    calls = {"count": 0}
+
+    def _fake_post(url, headers=None, json=None, timeout=None):
+        calls["count"] += 1
+        if calls["count"] == 1:
+            raise requests.ReadTimeout("timed out")
+        return _FakeResponse(
+            {
+                "output_text": "Recovered after retry.",
+                "citations": [],
+            }
+        )
+
+    monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
+    monkeypatch.setattr("requests.post", _fake_post)
+    monkeypatch.setattr("tools.x_search_tool.time.sleep", lambda *_: None)
+
+    result = json.loads(x_search_tool(query="grok xai"))
+
+    assert calls["count"] == 2
+    assert result["success"] is True
+    assert result["answer"] == "Recovered after retry."
+
+
+def test_x_search_retries_5xx_then_succeeds(monkeypatch):
+    from tools.x_search_tool import x_search_tool
+
+    calls = {"count": 0}
+
+    def _fake_post(url, headers=None, json=None, timeout=None):
+        calls["count"] += 1
+        if calls["count"] == 1:
+            return _FakeResponse(
+                {"code": "Internal error", "error": "Service temporarily unavailable."},
+                status_code=500,
+            )
+        return _FakeResponse({"output_text": "Recovered after 5xx retry."})
+
+    monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
+    monkeypatch.setattr("requests.post", _fake_post)
+    monkeypatch.setattr("tools.x_search_tool.time.sleep", lambda *_: None)
+
+    result = json.loads(x_search_tool(query="grok xai"))
+
+    assert calls["count"] == 2
+    assert result["success"] is True
+    assert result["answer"] == "Recovered after 5xx retry."
+
+
+# ---------------------------------------------------------------------------
+# Credential-resolution coverage — the OAuth-or-API-key gating contract.
+# ---------------------------------------------------------------------------
+
+def _no_xai_env(monkeypatch):
+    """Strip any XAI_* env vars so the resolver doesn't see a leaked dev key."""
+    for var in ("XAI_API_KEY", "XAI_BASE_URL", "HERMES_XAI_BASE_URL"):
+        monkeypatch.delenv(var, raising=False)
+
+
+def test_x_search_uses_xai_oauth_when_only_oauth_available(monkeypatch):
+    """OAuth-only user: credential_source should be ``xai-oauth``."""
+    from tools.registry import invalidate_check_fn_cache
+    from tools.x_search_tool import check_x_search_requirements, x_search_tool
+
+    _no_xai_env(monkeypatch)
+
+    def _fake_resolve():
+        return {
+            "provider": "xai-oauth",
+            "api_key": "oauth-bearer-token",
+            "base_url": "https://api.x.ai/v1",
+        }
+
+    monkeypatch.setattr(
+        "tools.x_search_tool.resolve_xai_http_credentials", _fake_resolve
+    )
+    invalidate_check_fn_cache()
+
+    assert check_x_search_requirements() is True
+
+    captured = {}
+
+    def _fake_post(url, headers=None, json=None, timeout=None):
+        captured["headers"] = headers
+        return _FakeResponse({"output_text": "Found posts via OAuth."})
+
+    monkeypatch.setattr("requests.post", _fake_post)
+
+    result = json.loads(x_search_tool(query="anything about xai"))
+
+    assert result["success"] is True
+    assert result["credential_source"] == "xai-oauth"
+    assert captured["headers"]["Authorization"] == "Bearer oauth-bearer-token"
+
+
+def test_x_search_uses_api_key_when_only_xai_api_key_set(monkeypatch):
+    """API-key-only user: credential_source should be ``xai``."""
+    from tools.registry import invalidate_check_fn_cache
+    from tools.x_search_tool import check_x_search_requirements, x_search_tool
+
+    _no_xai_env(monkeypatch)
+
+    def _fake_resolve():
+        # Real ``resolve_xai_http_credentials`` returns ``"xai"`` when it
+        # falls through to the XAI_API_KEY env var path.
+        return {
+            "provider": "xai",
+            "api_key": "raw-api-key",
+            "base_url": "https://api.x.ai/v1",
+        }
+
+    monkeypatch.setattr(
+        "tools.x_search_tool.resolve_xai_http_credentials", _fake_resolve
+    )
+    invalidate_check_fn_cache()
+
+    assert check_x_search_requirements() is True
+
+    captured = {}
+
+    def _fake_post(url, headers=None, json=None, timeout=None):
+        captured["headers"] = headers
+        return _FakeResponse({"output_text": "Found posts via API key."})
+
+    monkeypatch.setattr("requests.post", _fake_post)
+
+    result = json.loads(x_search_tool(query="anything"))
+
+    assert result["success"] is True
+    assert result["credential_source"] == "xai"
+    assert captured["headers"]["Authorization"] == "Bearer raw-api-key"
+
+
+def test_x_search_prefers_oauth_when_both_available(monkeypatch):
+    """Both credentials present: OAuth wins (matches Teknium's billing preference).
+
+    The real ordering is implemented in ``tools.xai_http.resolve_xai_http_credentials``
+    — OAuth runtime first, fallback OAuth resolver second, ``XAI_API_KEY`` third.
+    This test exercises the contract by having the resolver return the OAuth
+    bearer (the ``xai-oauth`` ``provider`` tag is the marker).
+    """
+    from tools.registry import invalidate_check_fn_cache
+    from tools.x_search_tool import x_search_tool
+
+    monkeypatch.setenv("XAI_API_KEY", "raw-api-key")
+
+    # Mimic xai_http's preference: OAuth wins, so we return the OAuth tuple
+    # even though XAI_API_KEY is also set.
+    def _fake_resolve():
+        return {
+            "provider": "xai-oauth",
+            "api_key": "oauth-bearer-token",
+            "base_url": "https://api.x.ai/v1",
+        }
+
+    monkeypatch.setattr(
+        "tools.x_search_tool.resolve_xai_http_credentials", _fake_resolve
+    )
+    invalidate_check_fn_cache()
+
+    captured = {}
+
+    def _fake_post(url, headers=None, json=None, timeout=None):
+        captured["headers"] = headers
+        return _FakeResponse({"output_text": "OAuth preferred."})
+
+    monkeypatch.setattr("requests.post", _fake_post)
+
+    result = json.loads(x_search_tool(query="anything"))
+
+    assert result["credential_source"] == "xai-oauth"
+    assert captured["headers"]["Authorization"] == "Bearer oauth-bearer-token"
+
+
+def test_x_search_returns_tool_error_when_no_credentials(monkeypatch):
+    """No credentials anywhere: tool returns a clear error, not a 401 from xAI."""
+    from tools.registry import invalidate_check_fn_cache
+    from tools.x_search_tool import check_x_search_requirements, x_search_tool
+
+    _no_xai_env(monkeypatch)
+
+    def _fake_resolve():
+        return {
+            "provider": "xai",
+            "api_key": "",
+            "base_url": "https://api.x.ai/v1",
+        }
+
+    monkeypatch.setattr(
+        "tools.x_search_tool.resolve_xai_http_credentials", _fake_resolve
+    )
+    invalidate_check_fn_cache()
+
+    assert check_x_search_requirements() is False
+
+    # If a model somehow invokes the tool despite a False check_fn, the call
+    # surfaces a friendly error rather than an HTTP exception.
+    result = x_search_tool(query="anything")
+    assert "No xAI credentials available" in result
+    assert "hermes auth add xai-oauth" in result
+
+
+def test_x_search_check_fn_false_when_resolver_raises(monkeypatch):
+    """Resolver exceptions (e.g. expired token + failed refresh) gate the tool out."""
+    from tools.registry import invalidate_check_fn_cache
+    from tools.x_search_tool import check_x_search_requirements
+
+    _no_xai_env(monkeypatch)
+
+    def _boom():
+        raise RuntimeError("token revoked and refresh failed")
+
+    monkeypatch.setattr(
+        "tools.x_search_tool.resolve_xai_http_credentials", _boom
+    )
+    invalidate_check_fn_cache()
+
+    assert check_x_search_requirements() is False
+
+
+def test_x_search_honors_config_model_and_timeout(monkeypatch, tmp_path):
+    """``x_search.model`` and ``x_search.timeout_seconds`` override the defaults."""
+    from tools.x_search_tool import x_search_tool
+
+    monkeypatch.setenv("XAI_API_KEY", "xai-test-key")
+
+    # Patch the in-module config loader so tests don't touch ~/.hermes/config.yaml.
+    monkeypatch.setattr(
+        "tools.x_search_tool._load_x_search_config",
+        lambda: {"model": "grok-custom-test", "timeout_seconds": 45, "retries": 0},
+    )
+
+    captured = {}
+
+    def _fake_post(url, headers=None, json=None, timeout=None):
+        captured["model"] = json["model"]
+        captured["timeout"] = timeout
+        return _FakeResponse({"output_text": "Custom model OK."})
+
+    monkeypatch.setattr("requests.post", _fake_post)
+
+    result = json.loads(x_search_tool(query="anything"))
+
+    assert result["success"] is True
+    assert captured["model"] == "grok-custom-test"
+    assert captured["timeout"] == 45
+
+
+def test_x_search_registered_in_registry_with_check_fn():
+    """The tool is registered under the x_search toolset with the gating check_fn."""
+    import tools.x_search_tool  # noqa: F401 — ensures registration runs
+    from tools.registry import registry
+
+    entry = registry.get_entry("x_search")
+    assert entry is not None
+    assert entry.toolset == "x_search"
+    assert entry.check_fn is not None
+    assert entry.check_fn.__name__ == "check_x_search_requirements"
+    assert "XAI_API_KEY" in entry.requires_env
+    assert entry.emoji == "🐦"
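The two retry tests above pin down a contract: a first transient failure (read timeout or 5xx) is retried and the second attempt's answer is returned. A minimal standalone sketch of that contract follows. `TransientError`, `call_with_retries`, and the exponential backoff are illustrative assumptions, not the tool's actual implementation, whose backoff schedule is not shown here.

```python
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical stand-in for a read timeout or a 5xx response."""


def call_with_retries(
    send: Callable[[], str],
    retries: int = 2,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call ``send`` up to ``retries + 1`` times, sleeping between attempts."""
    for attempt in range(retries + 1):
        try:
            return send()
        except TransientError:
            if attempt == retries:
                raise  # out of attempts: surface the last failure
            sleep(2 ** attempt)  # assumed backoff, for illustration only
    raise ValueError("retries must be non-negative")


calls = {"count": 0}


def flaky_send() -> str:
    """Fail once, then succeed, like the fake ``requests.post`` in the tests."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise TransientError("timed out")
    return "Recovered after retry."


# Patch out sleeping, as the tests do with ``time.sleep``.
answer = call_with_retries(flaky_send, sleep=lambda _seconds: None)
```

Injecting `sleep` mirrors the tests' `monkeypatch.setattr("tools.x_search_tool.time.sleep", ...)`: the retry path runs without real delays.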
@@ -133,8 +133,19 @@ _CREDENTIAL_FILES = (
     r'(?:~|\$home|\$\{home\})/\.'
     r'(?:netrc|pgpass|npmrc|pypirc)\b'
 )
+# macOS: /etc, /var, /tmp, /home are symlinks to /private/{etc,var,tmp,home}.
+# A command written to target /private/etc/sudoers works identically to
+# /etc/sudoers on macOS but bypasses a plain "/etc/" pattern check. Match
+# both forms. Inspired by Claude Code 2.1.113's "dangerous path protection".
+_MACOS_PRIVATE_SYSTEM_PATH = r'/private/(?:etc|var|tmp|home)/'
+# System-config paths that should trigger approval for any write/edit,
+# collapsing /etc, its macOS /private/etc mirror, and /etc/sudoers.d/ into
+# one shared fragment so new DANGEROUS_PATTERNS stay consistent.
+_SYSTEM_CONFIG_PATH = (
+    rf'(?:/etc/|{_MACOS_PRIVATE_SYSTEM_PATH})'
+)
 _SENSITIVE_WRITE_TARGET = (
-    r'(?:/etc/|/dev/sd|'
+    rf'(?:{_SYSTEM_CONFIG_PATH}|/dev/sd|'
     rf'{_SSH_SENSITIVE_PATH}|'
     rf'{_HERMES_ENV_PATH}|'
     rf'{_SHELL_RC_FILES}|'
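To see what the shared fragment buys, the sketch below copies the two path fragments from this hunk and plugs them into the redirect rule `>\s*{_SYSTEM_CONFIG_PATH}` that this commit adds to `DANGEROUS_PATTERNS`. The helper name `is_flagged` and the use of plain `re.search` with no flags are assumptions; the real detector may compile its patterns differently.

```python
import re

# Fragments copied from the hunk.
_MACOS_PRIVATE_SYSTEM_PATH = r'/private/(?:etc|var|tmp|home)/'
_SYSTEM_CONFIG_PATH = rf'(?:/etc/|{_MACOS_PRIVATE_SYSTEM_PATH})'

# The "overwrite system config" redirect rule built on the shared fragment.
OVERWRITE_SYSTEM_CONFIG = re.compile(rf'>\s*{_SYSTEM_CONFIG_PATH}')


def is_flagged(command: str) -> bool:
    """Hypothetical helper: True when the redirect targets a system path."""
    return OVERWRITE_SYSTEM_CONFIG.search(command) is not None


checks = {
    "echo x > /etc/sudoers": is_flagged("echo x > /etc/sudoers"),
    "echo x > /private/etc/sudoers": is_flagged("echo x > /private/etc/sudoers"),
    "echo x >/private/tmp/out": is_flagged("echo x >/private/tmp/out"),
    "echo x > ./notes.txt": is_flagged("echo x > ./notes.txt"),
}
```

Under these assumptions the first three commands are flagged and the last is not, so a macOS `/private/etc` target no longer slips past a plain `/etc/` check.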
@@ -318,10 +329,17 @@ DANGEROUS_PATTERNS = [
     # *next* line to satisfy the negative lookahead, silently allowing DELETE without WHERE.
     (r'\bDELETE\s+FROM\b(?![^\n]*\bWHERE\b)', "SQL DELETE without WHERE"),
     (r'\bTRUNCATE\s+(TABLE)?\s*\w', "SQL TRUNCATE"),
-    (r'>\s*/etc/', "overwrite system config"),
+    (rf'>\s*{_SYSTEM_CONFIG_PATH}', "overwrite system config"),
     (r'\bsystemctl\s+(-[^\s]+\s+)*(stop|restart|disable|mask)\b', "stop/restart system service"),
     (r'\bkill\s+-9\s+-1\b', "kill all processes"),
     (r'\bpkill\s+-9\b', "force kill processes"),
+    # killall with SIGKILL (parallel to pkill -9). Catches -9 / -KILL /
+    # -s KILL / -SIGKILL forms, and also `killall -r <regex>` broad sweeps
+    # that can wipe out unrelated processes by accident.
+    # Inspired by Claude Code 2.1.113 expanded deny rules.
+    (r'\bkillall\s+(-[^\s]*\s+)*-(9|KILL|SIGKILL)\b', "force kill processes (killall -KILL)"),
+    (r'\bkillall\s+(-[^\s]*\s+)*-s\s+(KILL|SIGKILL|9)\b', "force kill processes (killall -s KILL)"),
+    (r'\bkillall\s+(-[^\s]*\s+)*-r\b', "kill processes by regex (killall -r)"),
     (r':\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:', "fork bomb"),
     # Any shell invocation via -c or combined flags like -lc, -ic, etc.
     (r'\b(bash|sh|zsh|ksh)\s+-[^\s]*c(\s+|$)', "shell command via -c/-lc flag"),
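The three `killall` rules can be exercised in isolation. The sketch below copies the patterns from this hunk; the `classify` helper and the first-match-wins loop are assumptions about how such a table might be consulted, not the project's detector.

```python
import re
from typing import Optional

# Patterns copied from the hunk.
KILLALL_PATTERNS = [
    (r'\bkillall\s+(-[^\s]*\s+)*-(9|KILL|SIGKILL)\b', "force kill processes (killall -KILL)"),
    (r'\bkillall\s+(-[^\s]*\s+)*-s\s+(KILL|SIGKILL|9)\b', "force kill processes (killall -s KILL)"),
    (r'\bkillall\s+(-[^\s]*\s+)*-r\b', "kill processes by regex (killall -r)"),
]


def classify(command: str) -> Optional[str]:
    """Hypothetical helper: description of the first matching rule, else None."""
    for pattern, description in KILLALL_PATTERNS:
        if re.search(pattern, command):
            return description
    return None


results = {
    cmd: classify(cmd)
    for cmd in (
        "killall -9 nginx",
        "killall -v -KILL nginx",
        "killall -s KILL nginx",
        "killall -r 'python.*'",
        "killall nginx",
        "killall -u bob -9 nginx",
    )
}
```

One thing the sketch surfaces: the optional-flags group only skips consecutive dash-prefixed tokens, so a flag that takes an argument before the signal (`killall -u bob -9 nginx`) matches none of the three rules as written.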
@@ -333,7 +351,11 @@ DANGEROUS_PATTERNS = [
     (rf'\btee\b.*["\']?{_PROJECT_SENSITIVE_WRITE_TARGET}["\']?{_COMMAND_TAIL}', "overwrite project env/config via tee"),
     (rf'>>?\s*["\']?{_PROJECT_SENSITIVE_WRITE_TARGET}["\']?{_COMMAND_TAIL}', "overwrite project env/config via redirection"),
     (r'\bxargs\s+.*\brm\b', "xargs with rm"),
-    (r'\bfind\b.*-exec\s+(/\S*/)?rm\b', "find -exec rm"),
+    # find -exec rm / -execdir rm — the -execdir variant (same semantics,
+    # runs in the directory of each match) was previously missed. Claude
+    # Code 2.1.113 tightened their equivalent find rule to stop auto-
+    # approving -exec / -delete flags.
+    (r'\bfind\b.*-exec(?:dir)?\s+(/\S*/)?rm\b', "find -exec/-execdir rm"),
     (r'\bfind\b.*-delete\b', "find -delete"),
     # Gateway lifecycle protection: prevent the agent from killing its own
     # gateway process. These commands trigger a gateway restart/stop that
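A quick comparison of the removed and added `find` rules shows the gap being closed. Both patterns are copied from this hunk; the sample commands are illustrative.

```python
import re

OLD_RULE = re.compile(r'\bfind\b.*-exec\s+(/\S*/)?rm\b')
NEW_RULE = re.compile(r'\bfind\b.*-exec(?:dir)?\s+(/\S*/)?rm\b')

execdir_cmd = "find . -name '*.tmp' -execdir rm {} +"
exec_cmd = "find /var/log -exec /bin/rm {} ;"
harmless_cmd = "find . -exec ls {} ;"

matches = {
    "old_execdir": OLD_RULE.search(execdir_cmd) is not None,
    "new_execdir": NEW_RULE.search(execdir_cmd) is not None,
    "old_exec": OLD_RULE.search(exec_cmd) is not None,
    "new_exec": NEW_RULE.search(exec_cmd) is not None,
    "new_harmless": NEW_RULE.search(harmless_cmd) is not None,
}
```

Only the new rule catches the `-execdir rm` form; both still catch `-exec /bin/rm`, and neither fires on `-exec ls`.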
@@ -351,11 +373,12 @@ DANGEROUS_PATTERNS = [
     # to regex at detection time. Catch the structural pattern instead.
     (r'\bkill\b.*\$\(\s*pgrep\b', "kill process via pgrep expansion (self-termination)"),
     (r'\bkill\b.*`\s*pgrep\b', "kill process via backtick pgrep expansion (self-termination)"),
-    # File copy/move/edit into sensitive system paths
-    (r'\b(cp|mv|install)\b.*\s/etc/', "copy/move file into /etc/"),
+    # File copy/move/edit into sensitive system paths (/etc/ and macOS
+    # /private/etc/ mirror).
+    (rf'\b(cp|mv|install)\b.*\s{_SYSTEM_CONFIG_PATH}', "copy/move file into system config path"),
     (rf'\b(cp|mv|install)\b.*\s["\']?{_PROJECT_SENSITIVE_WRITE_TARGET}["\']?{_COMMAND_TAIL}', "overwrite project env/config file"),
-    (r'\bsed\s+-[^\s]*i.*\s/etc/', "in-place edit of system config"),
-    (r'\bsed\s+--in-place\b.*\s/etc/', "in-place edit of system config (long flag)"),
+    (rf'\bsed\s+-[^\s]*i.*\s{_SYSTEM_CONFIG_PATH}', "in-place edit of system config"),
+    (rf'\bsed\s+--in-place\b.*\s{_SYSTEM_CONFIG_PATH}', "in-place edit of system config (long flag)"),
     # Script execution via heredoc — bypasses the -e/-c flag patterns above.
     # `python3 << 'EOF'` feeds arbitrary code via stdin without -c/-e flags.
     (r'\b(python[23]?|perl|ruby|node)\s+<<', "script execution via heredoc"),
@@ -2362,6 +2362,7 @@ def _resolve_delegation_credentials(cfg: dict, parent_agent) -> dict:
     configured_provider = str(cfg.get("provider") or "").strip() or None
     configured_base_url = str(cfg.get("base_url") or "").strip() or None
     configured_api_key = str(cfg.get("api_key") or "").strip() or None
+    configured_api_mode = str(cfg.get("api_mode") or "").strip().lower() or None
 
     if configured_base_url:
         # When delegation.api_key is not set, return None so _build_child_agent
@@ -2372,9 +2373,17 @@ def _resolve_delegation_credentials(cfg: dict, parent_agent) -> dict:
         # callers to duplicate the key under delegation.api_key.
         api_key = configured_api_key  # None → inherited from parent in _build_child_agent
 
+        # Use the shared URL-based api_mode detector (same path the main agent's
+        # runtime resolver uses) so Anthropic-compatible direct endpoints with a
+        # /anthropic suffix — Azure AI Foundry, MiniMax, Zhipu GLM, LiteLLM
+        # proxies — pick the right transport automatically. Without this,
+        # subagents would default to chat_completions and hit 404s on endpoints
+        # that only speak the Anthropic Messages protocol. Fixes #10213.
+        from hermes_cli.runtime_provider import _detect_api_mode_for_url
+
         base_lower = configured_base_url.lower()
         provider = "custom"
-        api_mode = "chat_completions"
+        api_mode = _detect_api_mode_for_url(configured_base_url) or "chat_completions"
         if (
             base_url_hostname(configured_base_url) == "chatgpt.com"
             and "/backend-api/codex" in base_lower
@@ -2388,6 +2397,11 @@ def _resolve_delegation_credentials(cfg: dict, parent_agent) -> dict:
             provider = "custom"
             api_mode = "anthropic_messages"
 
+        # Explicit delegation.api_mode in config always wins. Lets users force
+        # a transport for non-standard endpoints the URL heuristic can't detect.
+        if configured_api_mode in {"chat_completions", "codex_responses", "anthropic_messages"}:
+            api_mode = configured_api_mode
+
         return {
             "model": configured_model,
             "provider": provider,
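The three delegation hunks establish a precedence for `api_mode`: URL-based detection first, `chat_completions` as the fallback, and an explicit `delegation.api_mode` overriding both when it names a known transport. A minimal sketch of that ordering follows. `detect_api_mode_stub` is a hypothetical stand-in for `_detect_api_mode_for_url` that models only the `/anthropic` suffix case mentioned in the comment, and the hostname-specific branches in the real function are left out.

```python
from typing import Callable, Optional

VALID_API_MODES = {"chat_completions", "codex_responses", "anthropic_messages"}


def detect_api_mode_stub(base_url: str) -> Optional[str]:
    """Hypothetical detector: recognises only a trailing /anthropic path."""
    if base_url.rstrip("/").lower().endswith("/anthropic"):
        return "anthropic_messages"
    return None


def resolve_api_mode(
    base_url: str,
    configured_api_mode: Optional[str],
    detect: Callable[[str], Optional[str]] = detect_api_mode_stub,
) -> str:
    """Apply the precedence: explicit config > URL detection > default."""
    api_mode = detect(base_url) or "chat_completions"
    normalized = str(configured_api_mode or "").strip().lower() or None
    if normalized in VALID_API_MODES:
        api_mode = normalized  # explicit, valid config always wins
    return api_mode


detected = resolve_api_mode("https://example.invalid/anthropic", None)
defaulted = resolve_api_mode("https://example.invalid/v1", None)
forced = resolve_api_mode("https://example.invalid/anthropic", "chat_completions")
ignored = resolve_api_mode("https://example.invalid/anthropic", "bogus")
```

An unrecognised configured value is silently ignored rather than rejected, which matches the set-membership check in the hunk.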
@@ -78,7 +78,7 @@ LAZY_DEPS: dict[str, tuple[str, ...]] = {
     # ─── Inference providers ───────────────────────────────────────────────
     # Native Anthropic SDK — needed when provider=anthropic (not via
     # OpenRouter / aggregators which use the openai SDK).
-    "provider.anthropic": ("anthropic==0.86.0",),
+    "provider.anthropic": ("anthropic==0.87.0",),  # CVE-2026-34450, CVE-2026-34452
     # AWS Bedrock provider
     "provider.bedrock": ("boto3==1.42.89",),
 
@@ -125,7 +125,7 @@ LAZY_DEPS: dict[str, tuple[str, ...]] = {
     "platform.slack": (
         "slack-bolt==1.27.0",
         "slack-sdk==3.40.1",
-        "aiohttp==3.13.3",
+        "aiohttp==3.13.4",  # CVE-2026-34513/34518/34519/34520/34525
     ),
     "platform.matrix": (
         "mautrix[encryption]==0.21.0",
@@ -24,6 +24,7 @@ Example config::
         args: ["-y", "@modelcontextprotocol/server-github"]
         env:
           GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_..."
+        supports_parallel_tool_calls: true  # tools from this server may run concurrently
       remote_api:
         url: "https://my-mcp-server.example.com/mcp"
         headers:
@@ -56,6 +57,8 @@ Features:
     - Thread-safe architecture with dedicated background event loop
     - Sampling support: MCP servers can request LLM completions via
       sampling/createMessage (text and tool-use responses)
+    - Parallel tool call opt-in: per-server ``supports_parallel_tool_calls``
+      flag allows concurrent execution of tools from the same server
 
 Architecture:
     A dedicated background event loop (_mcp_loop) runs in a daemon thread.
@@ -1976,11 +1979,16 @@ def _handle_session_expired_and_retry(
     return None
 
 
+# Sanitized server names whose ``supports_parallel_tool_calls`` config is True.
+# Populated during ``register_mcp_servers()`` and queried by
+# ``is_mcp_tool_parallel_safe()`` for the parallel-execution check in run_agent.
+_parallel_safe_servers: set = set()
+
 # Dedicated event loop running in a background daemon thread.
 _mcp_loop: Optional[asyncio.AbstractEventLoop] = None
 _mcp_thread: Optional[threading.Thread] = None
 
-# Protects _mcp_loop, _mcp_thread, _servers, and _stdio_pids.
+# Protects _mcp_loop, _mcp_thread, _servers, _parallel_safe_servers, and _stdio_pids.
 _lock = threading.Lock()
 
 # PIDs of stdio MCP server subprocesses.  Tracked so we can force-kill
@@ -3098,6 +3106,12 @@ def register_mcp_servers(servers: Dict[str, dict]) -> List[str]:
             for k, v in servers.items()
             if k not in _servers and _parse_boolish(v.get("enabled", True), default=True)
         }
+        # Track which servers opt-in to parallel tool calls (idempotent).
+        for srv_name, srv_cfg in servers.items():
+            if _parse_boolish(srv_cfg.get("supports_parallel_tool_calls", False), default=False):
+                _parallel_safe_servers.add(sanitize_mcp_name_component(srv_name))
+            else:
+                _parallel_safe_servers.discard(sanitize_mcp_name_component(srv_name))
 
     if not new_servers:
         return _existing_tool_names()
@@ -3208,6 +3222,29 @@ def discover_mcp_tools() -> List[str]:
     return tool_names
 
 
+def is_mcp_tool_parallel_safe(tool_name: str) -> bool:
+    """Check if an MCP tool belongs to a server that supports parallel tool calls.
+
+    MCP tool names follow the pattern ``mcp_{server}_{tool}``. This extracts
+    the server component and checks it against the set of servers whose config
+    includes ``supports_parallel_tool_calls: true``.
+
+    Returns False for non-MCP tools or tools from servers without the flag.
+    """
+    if not tool_name.startswith("mcp_"):
+        return False
+    # Strip the "mcp_" prefix and extract the server name.
+    # Tool names are: mcp_{sanitized_server}_{sanitized_tool}
+    # We need to check all possible server prefixes because the server name
+    # itself may contain underscores after sanitization.
+    rest = tool_name[4:]  # strip "mcp_"
+    with _lock:
+        for server_name in _parallel_safe_servers:
+            if rest.startswith(server_name + "_") and len(rest) > len(server_name) + 1:
+                return True
+    return False
+
+
 def get_mcp_status() -> List[dict]:
     """Return status of all configured MCP servers for banner display.
 
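The prefix check in `is_mcp_tool_parallel_safe` is easy to try standalone. The sketch below reproduces the same logic against a hard-coded set; the server names `github` and `my` are hypothetical, and the module-level set stands in for whatever `register_mcp_servers()` would have populated.

```python
import threading

_lock = threading.Lock()
# Hypothetical sanitized server names that opted in to parallel tool calls.
_parallel_safe_servers = {"github", "my"}


def is_mcp_tool_parallel_safe(tool_name: str) -> bool:
    """Same prefix test as the hunk: mcp_{server}_{tool} with a non-empty tool."""
    if not tool_name.startswith("mcp_"):
        return False
    rest = tool_name[len("mcp_"):]
    with _lock:
        for server_name in _parallel_safe_servers:
            # Require the separator plus at least one character of tool name.
            if rest.startswith(server_name + "_") and len(rest) > len(server_name) + 1:
                return True
    return False


outcomes = {
    name: is_mcp_tool_parallel_safe(name)
    for name in (
        "mcp_github_create_issue",
        "mcp_github_",
        "mcp_gitlab_list",
        "read_file",
        "mcp_my_server_query",
    )
}
```

Because the match is purely by prefix, a flagged server named `my` also covers tools from a different, unflagged server named `my_server` in this sketch. Whether that overlap can occur in practice depends on how server names are sanitized, which is not shown here.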
@@ -244,8 +244,16 @@ class ToolRegistry:
         emoji: str = "",
         max_result_size_chars: int | float | None = None,
         dynamic_schema_overrides: Callable = None,
+        override: bool = False,
     ):
-        """Register a tool. Called at module-import time by each tool file."""
+        """Register a tool. Called at module-import time by each tool file.
+
+        ``override=True`` is an explicit opt-in for plugins that intend to
+        replace an existing built-in tool implementation (e.g. swap the
+        default browser tool for a headed-Chrome CDP backend). Without it,
+        registrations that would shadow an existing tool from a different
+        toolset are rejected to prevent accidental overwrites.
+        """
         with self._lock:
             existing = self._tools.get(name)
             if existing and existing.toolset != toolset:
@@ -260,13 +268,22 @@ class ToolRegistry:
                         "Tool '%s': MCP toolset '%s' overwriting MCP toolset '%s'",
                         name, toolset, existing.toolset,
                     )
+                elif override:
+                    # Explicit plugin opt-in: replace the existing tool.
+                    # Logged at INFO so the override is auditable in agent.log.
+                    logger.info(
+                        "Tool '%s': toolset '%s' overriding existing toolset '%s' "
+                        "(override=True opt-in)",
+                        name, toolset, existing.toolset,
+                    )
                 else:
                     # Reject shadowing — prevent plugins/MCP from overwriting
                     # built-in tools or vice versa.
                     logger.error(
                         "Tool registration REJECTED: '%s' (toolset '%s') would "
-                        "shadow existing tool from toolset '%s'. Deregister the "
-                        "existing tool first if this is intentional.",
+                        "shadow existing tool from toolset '%s'. Pass "
+                        "override=True to register() if the replacement is "
+                        "intentional, or deregister the existing tool first.",
                         name, toolset, existing.toolset,
                     )
                     return
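The shadowing rule with its new `override` escape hatch can be illustrated with a toy registry. Everything below is a simplified sketch: `MiniRegistry`, `Entry`, and the boolean return value are hypothetical, and the MCP-to-MCP overwrite branch and the lock from the real `ToolRegistry` are omitted.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Dict

logger = logging.getLogger("registry_sketch")


@dataclass
class Entry:
    toolset: str
    handler: Callable[[], str]


class MiniRegistry:
    """Toy registry: rejects cross-toolset shadowing unless override=True."""

    def __init__(self) -> None:
        self._tools: Dict[str, Entry] = {}

    def register(self, name: str, toolset: str, handler: Callable[[], str],
                 override: bool = False) -> bool:
        existing = self._tools.get(name)
        if existing and existing.toolset != toolset:
            if override:
                # Explicit opt-in: log the replacement so it stays auditable.
                logger.info("Tool '%s': '%s' overriding '%s'",
                            name, toolset, existing.toolset)
            else:
                logger.error("Tool '%s' from '%s' would shadow '%s'; rejected",
                             name, toolset, existing.toolset)
                return False
        self._tools[name] = Entry(toolset, handler)
        return True

    def call(self, name: str) -> str:
        return self._tools[name].handler()


reg = MiniRegistry()
first = reg.register("browser", "builtin", lambda: "default backend")
rejected = reg.register("browser", "plugin", lambda: "cdp backend")
after_reject = reg.call("browser")
accepted = reg.register("browser", "plugin", lambda: "cdp backend", override=True)
after_override = reg.call("browser")
```

The default stays fail-closed: a plugin that forgets `override=True` leaves the built-in handler in place.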
@@ -387,7 +404,16 @@ class ToolRegistry:
             return entry.handler(args, **kwargs)
         except Exception as e:
             logger.exception("Tool %s dispatch error: %s", name, e)
-            return json.dumps({"error": f"Tool execution failed: {type(e).__name__}: {e}"})
+            # Route through the sanitizer so framing tokens / CDATA / fences
+            # in exception strings don't reach the model as structural noise.
+            # See model_tools._sanitize_tool_error for rationale.
+            raw = f"Tool execution failed: {type(e).__name__}: {e}"
+            try:
+                from model_tools import _sanitize_tool_error
+                sanitized = _sanitize_tool_error(raw)
+            except Exception:
+                sanitized = raw  # defensive: never let the sanitizer block error propagation
+            return json.dumps({"error": sanitized})
 
     # ------------------------------------------------------------------
     # Query helpers (replace redundant dicts in model_tools.py)
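The dispatch change wraps the sanitizer in its own `try`/`except`, so a failing sanitizer degrades to the raw message instead of hiding the original error. A minimal sketch of that fallback follows. The sanitizer is passed in as a parameter, and `strip_cdata` is a hypothetical stand-in, since the behaviour of the real `_sanitize_tool_error` is not shown in this diff.

```python
import json
from typing import Callable


def error_payload(exc: Exception, sanitize: Callable[[str], str]) -> str:
    """Build the JSON error string, falling back to the raw text on failure."""
    raw = f"Tool execution failed: {type(exc).__name__}: {exc}"
    try:
        sanitized = sanitize(raw)
    except Exception:
        sanitized = raw  # never let the sanitizer block error propagation
    return json.dumps({"error": sanitized})


def strip_cdata(text: str) -> str:
    """Hypothetical sanitizer: drop CDATA framing markers."""
    return text.replace("<![CDATA[", "").replace("]]>", "")


def broken_sanitizer(text: str) -> str:
    raise RuntimeError("sanitizer unavailable")


clean = json.loads(error_payload(ValueError("bad <![CDATA[x]]>"), strip_cdata))
fallback = json.loads(error_payload(ValueError("bad input"), broken_sanitizer))
```

Either way the caller receives a well-formed JSON object with an `error` key.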
424
tools/x_search_tool.py
Normal file
@@ -0,0 +1,424 @@
+#!/usr/bin/env python3
+"""X Search tool backed by xAI's built-in ``x_search`` Responses API tool.
+
+Authentication
+--------------
+The tool registers when **either** xAI credential path is available:
+
+* ``XAI_API_KEY`` is set in ``~/.hermes/.env`` or the process environment
+  (paid xAI API key), OR
+* The user is signed in via xAI Grok OAuth — SuperGrok subscription —
+  i.e. ``hermes auth add xai-oauth`` has been run and the stored refresh
+  token still works.
+
+Credential preference at call time matches
+:func:`tools.xai_http.resolve_xai_http_credentials`: SuperGrok OAuth first,
+direct OAuth resolver second, ``XAI_API_KEY`` last. That helper also
+auto-refreshes the OAuth access token when it's within the refresh skew
+window, so a ``True`` from :func:`check_x_search_requirements` means the
+bearer is fetchable AND non-empty.
+
+Salvaged from PR #10786 (originally by @Jaaneek); credential resolution
+reworked to honor both auth modes per Teknium's design.
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+import os
+import time
+from typing import Any, Dict, List, Optional, Tuple
+
+import requests
+
+from tools.registry import registry, tool_error
+from tools.xai_http import hermes_xai_user_agent, resolve_xai_http_credentials
+
+logger = logging.getLogger(__name__)
+
+DEFAULT_XAI_BASE_URL = "https://api.x.ai/v1"
+DEFAULT_X_SEARCH_MODEL = "grok-4.20-reasoning"
+DEFAULT_X_SEARCH_TIMEOUT_SECONDS = 180
+DEFAULT_X_SEARCH_RETRIES = 2
+MAX_HANDLES = 10
+
+
+# ---------------------------------------------------------------------------
+# Config
+# ---------------------------------------------------------------------------
+
+def _load_x_search_config() -> Dict[str, Any]:
+    try:
+        from hermes_cli.config import load_config
+
+        return load_config().get("x_search", {}) or {}
+    except Exception:
+        return {}
+
+
+def _get_x_search_model() -> str:
+    cfg = _load_x_search_config()
+    return (str(cfg.get("model") or "").strip() or DEFAULT_X_SEARCH_MODEL)
+
+
+def _get_x_search_timeout_seconds() -> int:
+    cfg = _load_x_search_config()
+    raw_value = cfg.get("timeout_seconds", DEFAULT_X_SEARCH_TIMEOUT_SECONDS)
+    try:
+        return max(30, int(raw_value))
+    except Exception:
+        return DEFAULT_X_SEARCH_TIMEOUT_SECONDS
+
+
+def _get_x_search_retries() -> int:
+    cfg = _load_x_search_config()
+    raw_value = cfg.get("retries", DEFAULT_X_SEARCH_RETRIES)
+    try:
+        return max(0, int(raw_value))
+    except Exception:
+        return DEFAULT_X_SEARCH_RETRIES
+
+
+# ---------------------------------------------------------------------------
+# Credential resolution
+# ---------------------------------------------------------------------------
+
+def _resolve_xai_bearer() -> Tuple[str, str, str]:
+    """Return ``(api_key, base_url, source)``.
+
+    ``source`` is one of ``"xai-oauth"`` or ``"xai"`` so callers (and tests)
+    can tell which credential path won. Raises ``RuntimeError`` if no usable
+    credential is available — the registered :func:`check_x_search_requirements`
+    gate makes that case unreachable in normal operation, but the runtime
+    check exists so a credential that expires between registration and
+    invocation produces a clean tool error instead of a 401.
+    """
+    creds = resolve_xai_http_credentials()
+    api_key = str(creds.get("api_key") or "").strip()
+    if not api_key:
+        raise RuntimeError(
+            "No xAI credentials available. Run `hermes auth add xai-oauth` "
+            "to sign in with your SuperGrok subscription, or set XAI_API_KEY."
+        )
+    base_url = str(creds.get("base_url") or DEFAULT_XAI_BASE_URL).strip().rstrip("/")
+    source = str(creds.get("provider") or "xai")
+    return api_key, base_url, source
+
+
+def check_x_search_requirements() -> bool:
+    """Return True when xAI credentials are available AND valid.
+
+    ``resolve_xai_http_credentials`` calls
+    :func:`hermes_cli.auth.resolve_xai_oauth_runtime_credentials` which
+    auto-refreshes the OAuth access token if it's expiring; a successful
+    return therefore implies a usable bearer.
+    """
+    try:
+        creds = resolve_xai_http_credentials()
+        return bool(str(creds.get("api_key") or "").strip())
+    except Exception:
+        return False
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _normalize_handles(handles: Optional[List[str]], field_name: str) -> List[str]:
+    cleaned: List[str] = []
+    for handle in handles or []:
+        normalized = str(handle or "").strip().lstrip("@")
+        if normalized:
+            cleaned.append(normalized)
+    if len(cleaned) > MAX_HANDLES:
+        raise ValueError(f"{field_name} supports at most {MAX_HANDLES} handles")
+    return cleaned
+
+
+def _extract_response_text(payload: Dict[str, Any]) -> str:
+    output_text = str(payload.get("output_text") or "").strip()
+    if output_text:
+        return output_text
+
+    parts: List[str] = []
+    for item in payload.get("output", []) or []:
+        if item.get("type") != "message":
+            continue
+        for content in item.get("content", []) or []:
+            ctype = content.get("type")
+            if ctype in ("output_text", "text"):
+                text = str(content.get("text") or "").strip()
+                if text:
+                    parts.append(text)
+    return "\n\n".join(parts).strip()
+
+
+def _extract_inline_citations(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
+    citations: List[Dict[str, Any]] = []
+    for item in payload.get("output", []) or []:
+        if item.get("type") != "message":
+            continue
+        for content in item.get("content", []) or []:
+            for annotation in content.get("annotations", []) or []:
+                if annotation.get("type") != "url_citation":
+                    continue
+                citations.append(
+                    {
+                        "url": annotation.get("url", ""),
+                        "title": annotation.get("title", ""),
+                        "start_index": annotation.get("start_index"),
+                        "end_index": annotation.get("end_index"),
+                    }
+                )
+    return citations
+
+
+def _http_error_message(exc: requests.HTTPError) -> str:
+    response = getattr(exc, "response", None)
+    if response is None:
+        return str(exc)
+
+    try:
+        payload = response.json()
+    except Exception:
+        payload = None
+
+    if isinstance(payload, dict):
+        code = str(payload.get("code") or "").strip()
+        error = str(payload.get("error") or "").strip()
+        message = error or str(payload)
+        if code and code not in message:
+            message = f"{code}: {message}"
+        return message or str(exc)
+
+    text = str(getattr(response, "text", "") or "").strip()
+    if text:
+        return text[:500]
+    return str(exc)
+
+
+# ---------------------------------------------------------------------------
+# Tool implementation
+# ---------------------------------------------------------------------------
+
+def x_search_tool(
+    query: str,
+    allowed_x_handles: Optional[List[str]] = None,
+    excluded_x_handles: Optional[List[str]] = None,
+    from_date: str = "",
+    to_date: str = "",
+    enable_image_understanding: bool = False,
+    enable_video_understanding: bool = False,
+) -> str:
+    if not query or not query.strip():
+        return tool_error("query is required for x_search")
+
+    try:
+        api_key, base_url, source = _resolve_xai_bearer()
+    except RuntimeError as exc:
+        return tool_error(str(exc))
+
+    try:
+        allowed = _normalize_handles(allowed_x_handles, "allowed_x_handles")
+        excluded = _normalize_handles(excluded_x_handles, "excluded_x_handles")
+        if allowed and excluded:
+            return tool_error("allowed_x_handles and excluded_x_handles cannot be used together")
+
+        tool_def: Dict[str, Any] = {"type": "x_search"}
+        if allowed:
+            tool_def["allowed_x_handles"] = allowed
+        if excluded:
+            tool_def["excluded_x_handles"] = excluded
+        if from_date.strip():
+            tool_def["from_date"] = from_date.strip()
+        if to_date.strip():
+            tool_def["to_date"] = to_date.strip()
+        if enable_image_understanding:
+            tool_def["enable_image_understanding"] = True
+        if enable_video_understanding:
+            tool_def["enable_video_understanding"] = True
+
+        payload = {
+            "model": _get_x_search_model(),
+            "input": [
+                {
+                    "role": "user",
+                    "content": query.strip(),
+                }
+            ],
+            "tools": [tool_def],
+            "store": False,
+        }
+
+        timeout_seconds = _get_x_search_timeout_seconds()
+        max_retries = _get_x_search_retries()
+        response: Optional[requests.Response] = None
+        for attempt in range(max_retries + 1):
+            try:
+                response = requests.post(
+                    f"{base_url}/responses",
+                    headers={
+                        "Authorization": f"Bearer {api_key}",
+                        "Content-Type": "application/json",
+                        "User-Agent": hermes_xai_user_agent(),
+                    },
+                    json=payload,
+                    timeout=timeout_seconds,
+                )
+                response.raise_for_status()
+                break
+            except requests.HTTPError as e:
+                status_code = getattr(getattr(e, "response", None), "status_code", None)
|
||||
if status_code is None or status_code < 500 or attempt >= max_retries:
|
||||
raise
|
||||
logger.warning(
|
||||
"x_search upstream failure on attempt %s/%s: %s",
|
||||
attempt + 1,
|
||||
max_retries + 1,
|
||||
_http_error_message(e),
|
||||
)
|
||||
time.sleep(min(5.0, 1.5 * (attempt + 1)))
|
||||
except (requests.ReadTimeout, requests.ConnectionError) as e:
|
||||
if attempt >= max_retries:
|
||||
raise
|
||||
logger.warning(
|
||||
"x_search transient failure on attempt %s/%s: %s",
|
||||
attempt + 1,
|
||||
max_retries + 1,
|
||||
e,
|
||||
)
|
||||
time.sleep(min(5.0, 1.5 * (attempt + 1)))
|
||||
|
||||
if response is None:
|
||||
raise RuntimeError("x_search request did not return a response")
|
||||
|
||||
data = response.json()
|
||||
|
||||
answer = _extract_response_text(data)
|
||||
citations = list(data.get("citations") or [])
|
||||
inline_citations = _extract_inline_citations(data)
|
||||
|
||||
return json.dumps(
|
||||
{
|
||||
"success": True,
|
||||
"provider": "xai",
|
||||
"credential_source": source,
|
||||
"tool": "x_search",
|
||||
"model": payload["model"],
|
||||
"query": query.strip(),
|
||||
"answer": answer,
|
||||
"citations": citations,
|
||||
"inline_citations": inline_citations,
|
||||
},
|
||||
ensure_ascii=False,
|
||||
)
|
||||
except requests.HTTPError as e:
|
||||
logger.error("x_search failed: %s", e, exc_info=True)
|
||||
return json.dumps(
|
||||
{
|
||||
"success": False,
|
||||
"provider": "xai",
|
||||
"tool": "x_search",
|
||||
"error": _http_error_message(e),
|
||||
"error_type": type(e).__name__,
|
||||
},
|
||||
ensure_ascii=False,
|
||||
)
|
||||
except requests.ReadTimeout as e:
|
||||
logger.error("x_search timed out: %s", e, exc_info=True)
|
||||
return json.dumps(
|
||||
{
|
||||
"success": False,
|
||||
"provider": "xai",
|
||||
"tool": "x_search",
|
||||
"error": f"xAI x_search timed out after {_get_x_search_timeout_seconds()} seconds",
|
||||
"error_type": type(e).__name__,
|
||||
},
|
||||
ensure_ascii=False,
|
||||
)
|
||||
except Exception as e:
|
||||
logger.error("x_search failed: %s", e, exc_info=True)
|
||||
return json.dumps(
|
||||
{
|
||||
"success": False,
|
||||
"provider": "xai",
|
||||
"tool": "x_search",
|
||||
"error": str(e),
|
||||
"error_type": type(e).__name__,
|
||||
},
|
||||
ensure_ascii=False,
|
||||
)
|
||||
|
||||
|
||||
X_SEARCH_SCHEMA = {
|
||||
"name": "x_search",
|
||||
"description": (
|
||||
"Search X (Twitter) posts, profiles, and threads using xAI's built-in "
|
||||
"X Search tool. Use this for current discussion, reactions, or claims "
|
||||
"on X rather than general web pages. Available when xAI credentials "
|
||||
"are configured (SuperGrok OAuth or XAI_API_KEY)."
|
||||
),
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"query": {
|
||||
"type": "string",
|
||||
"description": "What to look up on X.",
|
||||
},
|
||||
"allowed_x_handles": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Optional list of X handles to include exclusively (max 10).",
|
||||
},
|
||||
"excluded_x_handles": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Optional list of X handles to exclude (max 10).",
|
||||
},
|
||||
"from_date": {
|
||||
"type": "string",
|
||||
"description": "Optional start date in YYYY-MM-DD format.",
|
||||
},
|
||||
"to_date": {
|
||||
"type": "string",
|
||||
"description": "Optional end date in YYYY-MM-DD format.",
|
||||
},
|
||||
"enable_image_understanding": {
|
||||
"type": "boolean",
|
||||
"description": "Whether xAI should analyze images attached to matching X posts.",
|
||||
"default": False,
|
||||
},
|
||||
"enable_video_understanding": {
|
||||
"type": "boolean",
|
||||
"description": "Whether xAI should analyze videos attached to matching X posts.",
|
||||
"default": False,
|
||||
},
|
||||
},
|
||||
"required": ["query"],
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def _handle_x_search(args, **kw):
|
||||
return x_search_tool(
|
||||
query=args.get("query", ""),
|
||||
allowed_x_handles=args.get("allowed_x_handles"),
|
||||
excluded_x_handles=args.get("excluded_x_handles"),
|
||||
from_date=args.get("from_date", ""),
|
||||
to_date=args.get("to_date", ""),
|
||||
enable_image_understanding=bool(args.get("enable_image_understanding", False)),
|
||||
enable_video_understanding=bool(args.get("enable_video_understanding", False)),
|
||||
)
|
||||
|
||||
|
||||
registry.register(
|
||||
name="x_search",
|
||||
toolset="x_search",
|
||||
schema=X_SEARCH_SCHEMA,
|
||||
handler=_handle_x_search,
|
||||
check_fn=check_x_search_requirements,
|
||||
requires_env=["XAI_API_KEY"],
|
||||
emoji="🐦",
|
||||
max_result_size_chars=100_000,
|
||||
)
|
||||
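The retry loop in `x_search_tool` above sleeps `min(5.0, 1.5 * (attempt + 1))` seconds between attempts and retries HTTP errors only when the status is 5xx. A minimal sketch of that schedule follows; the helper names `backoff_delays` and `should_retry` are hypothetical and are not part of the diff.

```python
from typing import List, Optional


def backoff_delays(max_retries: int, step: float = 1.5, cap: float = 5.0) -> List[float]:
    """Sleep durations between attempts: min(cap, step * (attempt + 1)).

    Hypothetical helper; the diff computes this inline in the retry loop.
    """
    return [min(cap, step * (attempt + 1)) for attempt in range(max_retries)]


def should_retry(status_code: Optional[int], attempt: int, max_retries: int) -> bool:
    """Mirror the HTTPError branch: retry only 5xx responses while attempts remain."""
    if status_code is None or status_code < 500:
        return False
    return attempt < max_retries


if __name__ == "__main__":
    print(backoff_delays(4))
    print(should_retry(503, 0, 2), should_retry(429, 0, 2), should_retry(503, 2, 2))
```

Under this reading, a 429 response is not retried, and the delay is linear up to the 5-second cap rather than exponential.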
11
toolsets.py
|
|
@ -88,6 +88,17 @@ TOOLSETS = {
|
|||
"tools": ["web_search"],
|
||||
"includes": []
|
||||
},
|
||||
|
||||
"x_search": {
|
||||
"description": (
|
||||
"Search X (Twitter) posts and threads via xAI's built-in "
|
||||
"x_search Responses tool. Available when xAI credentials are "
|
||||
"configured (SuperGrok OAuth or XAI_API_KEY). Off by default; "
|
||||
"enable in `hermes tools` → X (Twitter) Search."
|
||||
),
|
||||
"tools": ["x_search"],
|
||||
"includes": []
|
||||
},
|
||||
|
||||
"vision": {
|
||||
"description": "Image analysis and vision tools",
|
||||
|
|
|
|||
1
ui-tui/packages/hermes-ink/index.d.ts
vendored
|
|
@ -21,6 +21,7 @@ export { default as Text } from './src/ink/components/Text.tsx'
|
|||
export type { Props as TextProps } from './src/ink/components/Text.tsx'
|
||||
export type { Key } from './src/ink/events/input-event.ts'
|
||||
export { default as useApp } from './src/ink/hooks/use-app.ts'
|
||||
export { useCursorAdvance } from './src/ink/hooks/use-cursor-advance.ts'
|
||||
export { useDeclaredCursor } from './src/ink/hooks/use-declared-cursor.ts'
|
||||
export { default as useInput } from './src/ink/hooks/use-input.ts'
|
||||
export { useHasSelection, useSelection } from './src/ink/hooks/use-selection.ts'
|
||||
|
|
|
|||
|
|
@ -12,6 +12,7 @@ export { default as ScrollBox } from './ink/components/ScrollBox.js'
|
|||
export { default as Spacer } from './ink/components/Spacer.js'
|
||||
export { default as Text } from './ink/components/Text.js'
|
||||
export { default as useApp } from './ink/hooks/use-app.js'
|
||||
export { useCursorAdvance } from './ink/hooks/use-cursor-advance.js'
|
||||
export { useDeclaredCursor } from './ink/hooks/use-declared-cursor.js'
|
||||
export { type RunExternalProcess, useExternalProcess, withInkSuspended } from './ink/hooks/use-external-process.js'
|
||||
export { default as useInput } from './ink/hooks/use-input.js'
|
||||
|
|
|
|||
|
|
@ -33,6 +33,7 @@ import { DBP, DFE, DISABLE_MOUSE_TRACKING, EBP, EFE, SHOW_CURSOR } from '../term
|
|||
|
||||
import AppContext from './AppContext.js'
|
||||
import { ClockProvider } from './ClockContext.js'
|
||||
import CursorAdvanceContext, { type CursorAdvanceNotifier } from './CursorAdvanceContext.js'
|
||||
import CursorDeclarationContext, { type CursorDeclarationSetter } from './CursorDeclarationContext.js'
|
||||
import ErrorOverview from './ErrorOverview.js'
|
||||
import StdinContext from './StdinContext.js'
|
||||
|
|
@ -100,6 +101,18 @@ type Props = {
|
|||
// Enables IME composition at the input caret and lets screen readers /
|
||||
// magnifiers track the input. Optional so testing.tsx doesn't stub it.
|
||||
readonly onCursorDeclaration?: CursorDeclarationSetter
|
||||
// Receives notifications that the physical cursor was advanced out-of-band
|
||||
// (e.g. TextInput's fast-echo bypass writing directly to stdout). The
|
||||
// handler in ink.tsx updates two pieces of state from a single call:
|
||||
// - `displayCursor` (the relative-move basis log-update uses on the
|
||||
// next frame; skipped on alt-screen where CSI H resets it every
|
||||
// frame anyway), and
|
||||
// - the active `cursorDeclaration.relativeX/Y` (the target the cursor
|
||||
// parks at after every frame; bumped on BOTH screens because
|
||||
// onRender's alt-screen branch emits an absolute CUP from it and
|
||||
// a stale declaration there is still visibly wrong).
|
||||
// Optional so testing.tsx doesn't need to stub it.
|
||||
readonly onCursorAdvance?: CursorAdvanceNotifier
|
||||
// Dispatch a keyboard event through the DOM tree. Called for each
|
||||
// parsed key alongside the legacy EventEmitter path.
|
||||
readonly dispatchKeyboardEvent: (parsedKey: ParsedKey) => void
|
||||
|
|
@ -196,7 +209,9 @@ export default class App extends PureComponent<Props, State> {
|
|||
<TerminalFocusProvider>
|
||||
<ClockProvider>
|
||||
<CursorDeclarationContext.Provider value={this.props.onCursorDeclaration ?? (() => {})}>
|
||||
- {this.state.error ? <ErrorOverview error={this.state.error as Error} /> : this.props.children}
|
||||
+ <CursorAdvanceContext.Provider value={this.props.onCursorAdvance ?? (() => {})}>
|
||||
+ {this.state.error ? <ErrorOverview error={this.state.error as Error} /> : this.props.children}
|
||||
+ </CursorAdvanceContext.Provider>
|
||||
</CursorDeclarationContext.Provider>
|
||||
</ClockProvider>
|
||||
</TerminalFocusProvider>
|
||||
|
|
|
|||
|
|
@ -0,0 +1,35 @@
|
|||
import { createContext } from 'react'
|
||||
|
||||
/**
|
||||
* Notify Ink that the physical terminal cursor was advanced by an
|
||||
* out-of-band stdout.write (e.g. the TextInput fast-echo path).
|
||||
*
|
||||
* This is a two-part notification — calling it updates both:
|
||||
*
|
||||
* 1. Ink's cached `displayCursor` (the basis log-update uses to
|
||||
* compute relative cursor moves for the next frame's preamble).
|
||||
* Without this, the next frame's preamble starts from a stale
|
||||
* parked position and the diff is rendered N cells offset.
|
||||
* This half is SKIPPED on alt-screen — every alt-screen frame
|
||||
* begins with CSI H which absolutely repositions the cursor, so
|
||||
* the relative-move basis is reset for free.
|
||||
*
|
||||
* 2. Ink's active `cursorDeclaration` (the target the cursor parks
|
||||
* at after every frame, set by `useDeclaredCursor`). Without
|
||||
* this, an unrelated component re-rendering before the deferred
|
||||
* React state catches up would publish a stale declaration and
|
||||
* visually undo the fast-echo's advance. This half applies to
|
||||
* BOTH main-screen and alt-screen — on alt-screen the cursor-
|
||||
* park branch in onRender emits an absolute CUP to
|
||||
* `rect.x + decl.relativeX`, so a stale declaration there is
|
||||
* still wrong even though displayCursor is skipped.
|
||||
*
|
||||
* `dx`/`dy` are deltas in terminal cells (positive = right/down,
|
||||
* negative = left/up). The caller is responsible for ensuring the
|
||||
* physical cursor really did move by that amount.
|
||||
*/
|
||||
export type CursorAdvanceNotifier = (dx: number, dy?: number) => void
|
||||
|
||||
const CursorAdvanceContext = createContext<CursorAdvanceNotifier>(() => {})
|
||||
|
||||
export default CursorAdvanceContext
|
||||
|
|
@ -0,0 +1,33 @@
|
|||
import { useContext } from 'react'
|
||||
|
||||
import CursorAdvanceContext, { type CursorAdvanceNotifier } from '../components/CursorAdvanceContext.js'
|
||||
|
||||
/**
|
||||
* Returns a function that notifies Ink the physical terminal cursor was
|
||||
* advanced out-of-band (e.g. by a direct stdout.write from the
|
||||
* TextInput fast-echo bypass).
|
||||
*
|
||||
* Calling the returned function updates two pieces of Ink state:
|
||||
*
|
||||
* - `displayCursor` — the cached parked-cursor position log-update
|
||||
* uses as the relative-move basis for the next frame. Skipped on
|
||||
* alt-screen, where every frame's CSI H resets the cursor anyway.
|
||||
*
|
||||
* - The active `cursorDeclaration` — the target the cursor parks at
|
||||
* after every frame. Bumped on BOTH main- and alt-screen, because
|
||||
* onRender's alt-screen park branch emits an absolute CUP from
|
||||
* this value and a stale declaration there is still visibly wrong.
|
||||
* The next React commit that publishes a fresh declaration
|
||||
* supersedes the bump.
|
||||
*
|
||||
* The caller is responsible for the stdout write itself; this hook
|
||||
* only reports the resulting cursor delta. Pass `dx` and optional
|
||||
* `dy` in terminal cells (positive = moved right/down, negative =
|
||||
* moved left/up).
|
||||
*
|
||||
* If the host isn't an Ink render root (test stubs, non-Ink renderer)
|
||||
* the returned callback is a safe no-op.
|
||||
*/
|
||||
export function useCursorAdvance(): CursorAdvanceNotifier {
|
||||
return useContext(CursorAdvanceContext)
|
||||
}
|
||||
234
ui-tui/packages/hermes-ink/src/ink/ink-cursor-advance.test.ts
Normal file
|
|
@ -0,0 +1,234 @@
|
|||
import { EventEmitter } from 'events'
|
||||
|
||||
import React from 'react'
|
||||
import { describe, expect, it } from 'vitest'
|
||||
|
||||
import Text from './components/Text.js'
|
||||
import Ink from './ink.js'
|
||||
|
||||
class FakeTty extends EventEmitter {
|
||||
chunks: string[] = []
|
||||
columns = 40
|
||||
rows = 8
|
||||
isTTY = true
|
||||
|
||||
write(chunk: string | Uint8Array, cb?: (err?: Error | null) => void): boolean {
|
||||
this.chunks.push(typeof chunk === 'string' ? chunk : Buffer.from(chunk).toString('utf8'))
|
||||
cb?.()
|
||||
|
||||
return true
|
||||
}
|
||||
}
|
||||
|
||||
function makeInk() {
|
||||
const stdout = new FakeTty()
|
||||
const stdin = new FakeTty()
|
||||
const stderr = new FakeTty()
|
||||
|
||||
const ink = new Ink({
|
||||
exitOnCtrlC: false,
|
||||
patchConsole: false,
|
||||
stderr: stderr as unknown as NodeJS.WriteStream,
|
||||
stdin: stdin as unknown as NodeJS.ReadStream,
|
||||
stdout: stdout as unknown as NodeJS.WriteStream
|
||||
})
|
||||
|
||||
return { ink, stdout, stdin, stderr }
|
||||
}
|
||||
|
||||
// Cast helper instead of exposing __get*ForTest methods on production Ink —
|
||||
// these are internal frame/cursor caches we only inspect from tests.
|
||||
type InkPrivate = {
|
||||
displayCursor: { x: number; y: number } | null
|
||||
cursorDeclaration: { node: unknown; relativeX: number; relativeY: number } | null
|
||||
frontFrame: { cursor: { x: number; y: number } }
|
||||
}
|
||||
const peek = (ink: Ink): InkPrivate => ink as unknown as InkPrivate
|
||||
|
||||
// Closes the cursor-drift bug: when TextInput's fast-echo path writes a
|
||||
// printable character directly to stdout, the hardware cursor advances by
|
||||
// one cell BUT Ink's `displayCursor` cache (used as the basis for the
|
||||
// next frame's relative cursor preamble) wasn't being updated. On long
|
||||
// sessions an unrelated re-render (status bar timer, streaming
|
||||
// reasoning, etc.) would then park the hardware cursor N cells offset
|
||||
// from the actual caret — visible as "extra whitespace between my last
|
||||
// typed character and the cursor block".
|
||||
describe('Ink.noteExternalCursorAdvance', () => {
|
||||
it('bumps an already-tracked displayCursor by the given delta', () => {
|
||||
const { ink } = makeInk()
|
||||
|
||||
ink.render(React.createElement(Text, null, 'hi'))
|
||||
ink.onRender()
|
||||
|
||||
// Seed a known parked position directly. In production this is set by
|
||||
// the cursor-park branch in onRender when a useDeclaredCursor caller
|
||||
// commits a declaration; this test bypasses React for hermeticity.
|
||||
peek(ink).displayCursor = { x: 5, y: 0 }
|
||||
|
||||
ink.noteExternalCursorAdvance(3)
|
||||
expect(peek(ink).displayCursor).toEqual({ x: 8, y: 0 })
|
||||
|
||||
ink.noteExternalCursorAdvance(-1)
|
||||
expect(peek(ink).displayCursor).toEqual({ x: 7, y: 0 })
|
||||
|
||||
ink.noteExternalCursorAdvance(0, 2)
|
||||
expect(peek(ink).displayCursor).toEqual({ x: 7, y: 2 })
|
||||
|
||||
ink.unmount()
|
||||
})
|
||||
|
||||
it('seeds displayCursor from frontFrame.cursor when nothing was parked', () => {
|
||||
const { ink } = makeInk()
|
||||
|
||||
ink.render(React.createElement(Text, null, 'hello'))
|
||||
ink.onRender()
|
||||
|
||||
expect(peek(ink).displayCursor).toBeNull()
|
||||
const base = { x: peek(ink).frontFrame.cursor.x, y: peek(ink).frontFrame.cursor.y }
|
||||
|
||||
ink.noteExternalCursorAdvance(4)
|
||||
expect(peek(ink).displayCursor).toEqual({ x: base.x + 4, y: base.y })
|
||||
|
||||
ink.unmount()
|
||||
})
|
||||
|
||||
it('is a no-op when the delta is zero', () => {
|
||||
const { ink } = makeInk()
|
||||
|
||||
ink.render(React.createElement(Text, null, 'hi'))
|
||||
ink.onRender()
|
||||
|
||||
ink.noteExternalCursorAdvance(0)
|
||||
expect(peek(ink).displayCursor).toBeNull()
|
||||
|
||||
ink.noteExternalCursorAdvance(0, 0)
|
||||
expect(peek(ink).displayCursor).toBeNull()
|
||||
|
||||
ink.unmount()
|
||||
})
|
||||
|
||||
it('skips displayCursor on alt-screen — CSI H resets every frame', () => {
|
||||
const { ink } = makeInk()
|
||||
|
||||
ink.setAltScreenActive(true)
|
||||
ink.render(React.createElement(Text, null, 'hi'))
|
||||
ink.onRender()
|
||||
peek(ink).displayCursor = { x: 5, y: 0 }
|
||||
|
||||
ink.noteExternalCursorAdvance(3)
|
||||
|
||||
expect(peek(ink).displayCursor).toEqual({ x: 5, y: 0 })
|
||||
|
||||
ink.unmount()
|
||||
})
|
||||
|
||||
// Closes Copilot follow-up on PR #26717: the default TUI wraps the
|
||||
// composer in <AlternateScreen>, so alt-screen is the production
|
||||
// path. CSI H only resets the log-update relative-move basis — the
|
||||
// declared cursor target is still consulted by onRender's alt-screen
|
||||
// park branch (`cursorPosition(row, col)` using rect + decl). So
|
||||
// cursorDeclaration MUST advance on alt-screen too, even though
|
||||
// displayCursor doesn't need to.
|
||||
it('still advances cursorDeclaration on alt-screen', () => {
|
||||
const { ink } = makeInk()
|
||||
|
||||
ink.setAltScreenActive(true)
|
||||
ink.render(React.createElement(Text, null, 'hi'))
|
||||
ink.onRender()
|
||||
|
||||
const fakeNode = {} as unknown as Record<string, unknown>
|
||||
|
||||
peek(ink).cursorDeclaration = { node: fakeNode, relativeX: 7, relativeY: 0 }
|
||||
peek(ink).displayCursor = { x: 12, y: 0 }
|
||||
|
||||
ink.noteExternalCursorAdvance(3)
|
||||
|
||||
// displayCursor untouched on alt-screen
|
||||
expect(peek(ink).displayCursor).toEqual({ x: 12, y: 0 })
|
||||
// declaration still advanced — onRender's alt-screen park reads this
|
||||
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 10, relativeY: 0 })
|
||||
|
||||
ink.unmount()
|
||||
})
|
||||
|
||||
// Closes Copilot review feedback on PR #26717: even after the
|
||||
// TextInput-level fix where layout reads `curRef.current` directly,
|
||||
// there's still a window where a fast-echo wrote to stdout but the
|
||||
// current cursor declaration on Ink (set by an earlier render's
|
||||
// useDeclaredCursor commit) points at the PRE-keystroke caret
|
||||
// column. If we advanced only `displayCursor`, an unrelated re-render
|
||||
// in that window would re-run onRender's cursor-park branch with the
|
||||
// stale declaration and visually undo the fast-echo's advance. We
|
||||
// must bump BOTH so the cursor stays anchored to the physical caret
|
||||
// until the next React commit publishes a fresh declaration
|
||||
// (computed from `curRef.current` via the cursorLayout call in
|
||||
// textInput.tsx) that supersedes the bump.
|
||||
it('advances the active cursorDeclaration in lock-step with displayCursor', () => {
|
||||
const { ink } = makeInk()
|
||||
|
||||
ink.render(React.createElement(Text, null, 'hi'))
|
||||
ink.onRender()
|
||||
|
||||
const fakeNode = {} as unknown as Record<string, unknown>
|
||||
|
||||
peek(ink).cursorDeclaration = { node: fakeNode, relativeX: 7, relativeY: 0 }
|
||||
peek(ink).displayCursor = { x: 12, y: 0 }
|
||||
|
||||
ink.noteExternalCursorAdvance(3)
|
||||
|
||||
expect(peek(ink).displayCursor).toEqual({ x: 15, y: 0 })
|
||||
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 10, relativeY: 0 })
|
||||
|
||||
ink.noteExternalCursorAdvance(-1)
|
||||
expect(peek(ink).displayCursor).toEqual({ x: 14, y: 0 })
|
||||
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 9, relativeY: 0 })
|
||||
|
||||
ink.unmount()
|
||||
})
|
||||
|
||||
// Closes Copilot follow-up on PR #26717: the dy half of the notifier
|
||||
// contract was tested for `displayCursor` but not for
|
||||
// `cursorDeclaration.relativeY`. Newlines in fast-echoed text never
|
||||
// hit the bypass today (canFastAppendShape rejects '\n'), but `dy`
|
||||
// is part of the public API and must propagate symmetrically with
|
||||
// dx so future callers (e.g. multi-line paste shortcuts) don't get
|
||||
// a half-implemented contract.
|
||||
it('advances cursorDeclaration.relativeY when dy is non-zero', () => {
|
||||
const { ink } = makeInk()
|
||||
|
||||
ink.render(React.createElement(Text, null, 'hi'))
|
||||
ink.onRender()
|
||||
|
||||
const fakeNode = {} as unknown as Record<string, unknown>
|
||||
|
||||
peek(ink).cursorDeclaration = { node: fakeNode, relativeX: 2, relativeY: 1 }
|
||||
peek(ink).displayCursor = { x: 4, y: 2 }
|
||||
|
||||
ink.noteExternalCursorAdvance(1, 3)
|
||||
|
||||
expect(peek(ink).displayCursor).toEqual({ x: 5, y: 5 })
|
||||
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 3, relativeY: 4 })
|
||||
|
||||
// Negative dy too — cursor moving up across visual rows.
|
||||
ink.noteExternalCursorAdvance(0, -2)
|
||||
expect(peek(ink).displayCursor).toEqual({ x: 5, y: 3 })
|
||||
expect(peek(ink).cursorDeclaration).toEqual({ node: fakeNode, relativeX: 3, relativeY: 2 })
|
||||
|
||||
ink.unmount()
|
||||
})
|
||||
|
||||
it('leaves cursorDeclaration unchanged when no declaration is active', () => {
|
||||
const { ink } = makeInk()
|
||||
|
||||
ink.render(React.createElement(Text, null, 'hi'))
|
||||
ink.onRender()
|
||||
|
||||
expect(peek(ink).cursorDeclaration).toBeNull()
|
||||
|
||||
ink.noteExternalCursorAdvance(3)
|
||||
|
||||
expect(peek(ink).cursorDeclaration).toBeNull()
|
||||
|
||||
ink.unmount()
|
||||
})
|
||||
})
|
||||
|
|
@ -16,6 +16,7 @@ import { logError } from '../utils/log.js'
|
|||
|
||||
import { colorize } from './colorize.js'
|
||||
import App from './components/App.js'
|
||||
import type { CursorAdvanceNotifier } from './components/CursorAdvanceContext.js'
|
||||
import type { CursorDeclaration, CursorDeclarationSetter } from './components/CursorDeclarationContext.js'
|
||||
import { FRAME_INTERVAL_MS } from './constants.js'
|
||||
import * as dom from './dom.js'
|
||||
|
|
@ -2219,6 +2220,85 @@ export default class Ink {
|
|||
|
||||
this.cursorDeclaration = decl
|
||||
}
|
||||
// Caller writes raw bytes to stdout that move the physical terminal
|
||||
// cursor (e.g. TextInput's fast-echo bypass). Without this notification,
|
||||
// Ink's `displayCursor` cache and log-update's prevFrame.cursor stay
|
||||
// unchanged, so the next frame's relative cursor moves compute from a
|
||||
// stale position and the hardware cursor parks `dx` cells offset from
|
||||
// the actual caret. Visible symptom: extra whitespace between the just-
|
||||
// typed character and the cursor block, more pronounced on long
|
||||
// sessions where unrelated components re-render between fast-echo and
|
||||
// the deferred composer re-render.
|
||||
//
|
||||
// If displayCursor was already tracked, just bump it. Otherwise seed it
|
||||
// to (prevFrame.cursor + delta) so the next frame's preamble emits a
|
||||
// (-dx, -dy) relative move that brings the cursor back to log-update's
|
||||
// expected start position before the diff body runs.
|
||||
//
|
||||
// Public so tests can drive it directly without mounting App.
|
||||
//
|
||||
// Bumps BOTH `displayCursor` (used by log-update's relative-move
|
||||
// preamble) AND, if non-null, `cursorDeclaration.relativeX/Y` (the
|
||||
// target the cursor parks at after every frame). Advancing only one
|
||||
// of the two would leave the other stale: e.g. if the deferred React
|
||||
// `setCur` hasn't flushed yet, the next unrelated re-render would
|
||||
// re-compute `target` from the stale declaration and park the
|
||||
// hardware cursor back at the old caret column. We advance both so
|
||||
// the fast-echo is invisible to intervening frames until React
|
||||
// catches up.
|
||||
noteExternalCursorAdvance: CursorAdvanceNotifier = (dx, dy = 0) => {
|
||||
if (dx === 0 && dy === 0) {
|
||||
return
|
||||
}
|
||||
|
||||
// displayCursor / log-update relative-move basis only matters on
|
||||
// main screen: each alt-screen frame begins with an absolute CSI H,
|
||||
// so the next preamble naturally resets to (0,0). cursorDeclaration,
|
||||
// however, IS still consulted on alt-screen — onRender's park branch
|
||||
// emits an absolute CUP using `rect.x + decl.relativeX`, so a stale
|
||||
// declaration in the deferred-setCur window would park the cursor
|
||||
// at the pre-keystroke caret. We therefore skip ONLY the displayCursor
|
||||
// half on alt-screen, not the declaration half.
|
||||
if (!this.altScreenActive) {
|
||||
if (this.displayCursor !== null) {
|
||||
this.displayCursor = {
|
||||
x: this.displayCursor.x + dx,
|
||||
y: this.displayCursor.y + dy
|
||||
}
|
||||
} else {
|
||||
// No prior parked position. Seed from frontFrame.cursor (where
|
||||
// log-update parked the cursor at the end of the last frame) so
|
||||
// the next preamble's relative move correctly cancels the
|
||||
// external advance.
|
||||
const baseX = this.frontFrame.cursor.x
|
||||
const baseY = this.frontFrame.cursor.y
|
||||
this.displayCursor = { x: baseX + dx, y: baseY + dy }
|
||||
}
|
||||
}
|
||||
|
||||
// Also advance the active cursor declaration if any. Without this,
|
||||
// a TextInput that defers its React `cur` state update (16ms timer
|
||||
// in textInput.tsx — perf optimization that batches re-renders
|
||||
// during heavy typing) leaves `cursorDeclaration.relativeX` pointing
|
||||
// at the pre-keystroke caret column. If an unrelated component
|
||||
// re-renders before the deferred `setCur` flushes, the cursor-park
|
||||
// branch at the end of onRender would move the hardware cursor back
|
||||
// to that stale relativeX and visually undo the fast-echo's
|
||||
// advance. Bumping relativeX here keeps the declared target in
|
||||
// lock-step with the physical cursor until React state catches up.
|
||||
// Applies to BOTH main-screen and alt-screen — the alt-screen park
|
||||
// branch uses an absolute CUP to (rect.x + decl.relativeX), so a
|
||||
// stale declaration there would still produce the wrong column.
|
||||
const decl = this.cursorDeclaration
|
||||
|
||||
if (decl !== null) {
|
||||
this.cursorDeclaration = {
|
||||
node: decl.node,
|
||||
relativeX: decl.relativeX + dx,
|
||||
relativeY: decl.relativeY + dy
|
||||
}
|
||||
}
|
||||
}
|
||||
render(node: ReactNode): void {
|
||||
this.currentNode = node
|
||||
|
||||
|
|
@ -2228,6 +2308,7 @@ export default class Ink {
|
|||
exitOnCtrlC={this.options.exitOnCtrlC}
|
||||
getHyperlinkAt={this.getHyperlinkAt}
|
||||
onClickAt={this.dispatchClick}
|
||||
onCursorAdvance={this.noteExternalCursorAdvance}
|
||||
onCursorDeclaration={this.setCursorDeclaration}
|
||||
onExit={this.unmount}
|
||||
onHoverAt={this.dispatchHover}
|
||||
|
|
|
|||
50
ui-tui/src/__tests__/textInputCursorSourceOfTruth.test.ts
Normal file
|
|
@ -0,0 +1,50 @@
|
|||
import { readFileSync } from 'node:fs'
|
||||
import { dirname, join } from 'node:path'
|
||||
import { fileURLToPath } from 'node:url'
|
||||
|
||||
import { describe, expect, it } from 'vitest'
|
||||
|
||||
// Locate textInput.tsx relative to this test file so the assertion
|
||||
// survives moves of the test fixture itself.
|
||||
const TEXT_INPUT_PATH = join(dirname(fileURLToPath(import.meta.url)), '..', 'components', 'textInput.tsx')
|
||||
const source = readFileSync(TEXT_INPUT_PATH, 'utf8')
|
||||
|
||||
// Closes Copilot follow-up on PR #26717: the original cursor-drift
|
||||
// fix bumped Ink's displayCursor / cursorDeclaration on fast-echo, but
|
||||
// if TextInput itself re-renders before the deferred 16ms `setCur`
|
||||
// flushes (parent state change, status-bar tick, spinner) the layout
|
||||
// effect inside `useDeclaredCursor` re-publishes a declaration
|
||||
// computed from the STALE React `cur` state and clobbers the Ink-level
|
||||
// bump. The fix is structural: read `curRef.current` (always
|
||||
// up-to-date) when computing the layout, not the `cur` state.
|
||||
//
|
||||
// This file pins that invariant. Switching back to `cur` state — or
|
||||
// re-introducing a memo keyed on `cur` that uses `curRef.current`
|
||||
// inside but stops re-computing on rerender — is a regression and
|
||||
// should be caught here, not via a flaky integration test that mounts
|
||||
// Ink + stdin.
|
||||
describe('textInput cursor-layout source of truth', () => {
|
||||
it('reads curRef.current (not the cur React state) for cursorLayout', () => {
|
||||
// The line we care about. We allow whitespace / formatting drift,
|
||||
// but the call itself must use `curRef.current`.
|
||||
expect(source).toMatch(/cursorLayout\(\s*display\s*,\s*curRef\.current\s*,\s*columns\s*\)/)
|
||||
})
|
||||
|
||||
it('does not pass the bare `cur` React state into cursorLayout', () => {
|
||||
// Any `cursorLayout(display, cur, columns)` invocation would
|
||||
// reintroduce the stale-declaration window.
|
||||
expect(source).not.toMatch(/cursorLayout\(\s*display\s*,\s*cur\s*,\s*columns\s*\)/)
|
||||
})
|
||||
|
||||
it('keeps the fast-echo notifier calls paired with the stdout writes', () => {
|
||||
// Both fast-echo paths must call noteCursorAdvance, otherwise Ink
|
||||
// never learns about the out-of-band write and drifts again. We
|
||||
// tolerate explanatory comments in between (the rationale block is
|
||||
// intentionally long), but the pairing itself must hold.
|
||||
const backspacePattern = /stdout!\.write\(['"`]\\b \\b['"`]\)[\s\S]{0,1000}?noteCursorAdvance\(-1\)/
|
||||
expect(source).toMatch(backspacePattern)
|
||||
|
||||
const appendPattern = /stdout!\.write\(text\)[\s\S]{0,1000}?noteCursorAdvance\(text\.length\)/
|
||||
expect(source).toMatch(appendPattern)
|
||||
})
|
||||
})
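The pairing assertions above rely on a bounded lazy gap (`[\s\S]{0,1000}?`) between the stdout write and the notifier call. A minimal sketch of how that pattern behaves on small inputs follows; the sample strings and the `isPaired` helper are hypothetical and exist only for illustration, while the regex itself is copied from the test.

```typescript
// Same append pattern as in the test above: the write, then at most 1000
// arbitrary characters (comments, whitespace), then the notifier call.
const appendPattern = /stdout!\.write\(text\)[\s\S]{0,1000}?noteCursorAdvance\(text\.length\)/

// Hypothetical helper: true when the source keeps the write/notify pairing.
function isPaired(src: string): boolean {
  return appendPattern.test(src)
}

// Hypothetical source snippets, not taken from textInput.tsx.
const paired = `
  stdout!.write(text)
  // a long rationale comment may sit here
  noteCursorAdvance(text.length)
`

const unpaired = `
  stdout!.write(text)
  commit(v, c)
`

// A gap of 1001 characters exceeds the {0,1000} bound, so the pairing
// is treated as broken even though both calls are present.
const tooFar = 'stdout!.write(text)' + ' '.repeat(1001) + 'noteCursorAdvance(text.length)'
const justFits = 'stdout!.write(text)' + ' '.repeat(1000) + 'noteCursorAdvance(text.length)'

console.log(isPaired(paired), isPaired(unpaired), isPaired(tooFar), isPaired(justFits))
```

The bound keeps the check local: a notifier call that drifts far away from the write (for example into a different branch) no longer satisfies the test.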
|
||||
|
|
@@ -133,4 +133,42 @@ describe('canFastBackspaceShape', () => {
|
|||
it('rejects deleting an emoji', () => {
|
||||
expect(canFastBackspaceShape('hi🙂', 'hi🙂'.length)).toBe(false)
|
||||
})
|
||||
|
||||
// Closes Copilot PR #26717 round 3: the "\b \b" sequence cannot move
|
||||
// the terminal cursor onto the previous visual row across a
|
||||
// soft-wrap boundary. When the caret sits at visual column 0 of a
|
||||
// wrapped row (column == 0 in the computed cursor layout), backspace
|
||||
// would leave the physical cursor in place while the logical caret
|
||||
// moves up to the end of the previous visual line — desyncing both
|
||||
// Ink's displayCursor model and the user-visible position. The fast
|
||||
// path must fall through in that case so the normal Ink render path
|
||||
// can lay out the correct cursor position.
|
||||
it('rejects fast-backspace at a soft-wrap boundary when columns is known', () => {
|
||||
// value width 6 in a column of 6 → cursorLayout produces (line 1, col 0)
|
||||
// i.e. the caret has overflowed onto the next visual line.
|
||||
const value = 'hello '
|
||||
expect(canFastBackspaceShape(value, value.length, 6)).toBe(false)
|
||||
})
|
||||
|
||||
it('rejects fast-backspace at an exact multiple of columns (wide wrap)', () => {
|
||||
// 12 chars at width 6 → two full visual rows, caret at (line 2, col 0).
|
||||
const value = 'abcdefghijkl'
|
||||
expect(canFastBackspaceShape(value, value.length, 6)).toBe(false)
|
||||
})
|
||||
|
||||
it('still accepts fast-backspace inside a wrapped line', () => {
|
||||
// Caret mid-visual-line — "\b \b" can move the cursor one cell left
|
||||
// without crossing a wrap boundary.
|
||||
expect(canFastBackspaceShape('hello world', 'hello world'.length, 20)).toBe(true)
|
||||
expect(canFastBackspaceShape('abcdefghi', 9, 6)).toBe(true) // visual line 1, col 3 → ok
|
||||
})
|
||||
|
||||
it('skips the wrap-boundary check when columns is omitted (legacy contract)', () => {
|
||||
// Callers that don't pass `columns` fall back to the pre-wrap-aware
|
||||
// behavior — the function does NOT magically reject anything that
|
||||
// could be a wrap boundary without the width. Production callers
|
||||
// must always pass `columns`; this case is for unit tests of the
|
||||
// pre-wrap shape contract.
|
||||
expect(canFastBackspaceShape('hello ', 'hello '.length)).toBe(true)
|
||||
})
|
||||
})
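The wrap-boundary cases in these tests can be reproduced with a simplified model. The sketch below is not the real `canFastBackspaceShape`: it assumes one terminal cell per character and hard wrapping every `columns` cells, and `visualColumn` is a hypothetical stand-in for `cursorLayout(...).column`.

```typescript
// Hypothetical stand-in for cursorLayout(...).column. Assumption: every
// character is one cell wide and rows hard-wrap every `columns` cells.
function visualColumn(cursor: number, columns: number): number {
  return cursor % columns
}

// Printable ASCII only (space through tilde), one character.
const ASCII_PRINTABLE = /^[\x20-\x7e]$/

// Simplified sketch of the shape check described in the diff.
function canFastBackspaceSketch(value: string, cursor: number, columns?: number): boolean {
  // Only handle a caret at the end of a non-empty value.
  if (cursor !== value.length || cursor === 0) {
    return false
  }

  // With a known width, refuse when the caret sits at visual column 0:
  // "\b \b" cannot step back onto the previous visual row.
  if (columns !== undefined && visualColumn(cursor, columns) === 0) {
    return false
  }

  // The removed character must be a single printable ASCII cell.
  return ASCII_PRINTABLE.test(value.charAt(cursor - 1))
}

console.log(canFastBackspaceSketch('hello ', 6, 6))       // caret at column 0 of the next row
console.log(canFastBackspaceSketch('abcdefghi', 9, 6))    // caret at column 3
console.log(canFastBackspaceSketch('hello ', 6))          // width unknown, check skipped
```

Under these assumptions the sketch agrees with the expectations in the tests above: the first call is rejected, the other two are accepted.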
|
||||
|
|
|
|||
|
|
@@ -16,13 +16,14 @@ import {
|
|||
|
||||
type InkExt = typeof Ink & {
|
||||
stringWidth: (s: string) => number
|
||||
useCursorAdvance: () => (dx: number, dy?: number) => void
|
||||
useDeclaredCursor: (a: { line: number; column: number; active: boolean }) => (el: any) => void
|
||||
useStdout: () => { stdout?: NodeJS.WriteStream }
|
||||
useTerminalFocus: () => boolean
|
||||
}
|
||||
|
||||
const ink = Ink as unknown as InkExt
|
||||
-const { Box, Text, useStdin, useInput, useStdout, stringWidth, useDeclaredCursor, useTerminalFocus } = ink
+const { Box, Text, useStdin, useInput, useStdout, stringWidth, useCursorAdvance, useDeclaredCursor, useTerminalFocus } = ink
|
||||
|
||||
const ESC = '\x1b'
|
||||
const INV = `${ESC}[7m`
|
||||
|
|
@@ -238,8 +239,26 @@ export function canFastAppendShape(
|
|||
* ASCII. Anything else (combining marks, IME compositions, wide chars,
|
||||
* tabs, ANSI fragments) goes through the normal render path so Ink can
|
||||
* recompute cell widths.
|
||||
*
|
||||
* When `columns` is supplied, ALSO rejects when the physical cursor
|
||||
* sits at visual column 0 — i.e., right after a soft-wrap boundary.
|
||||
* The "\b \b" sequence cannot move the cursor onto the previous visual
|
||||
* row (terminals don't back-step across line wraps), so the physical
|
||||
* cursor would stay put while the logical caret moves to the end of
|
||||
* the previous visual line, desyncing both Ink's `displayCursor` model
|
||||
* and the user-visible position.
|
||||
*
|
||||
* When `columns` is OMITTED, the wrap-boundary check is skipped
|
||||
* entirely and the function reverts to the legacy non-wrap-aware
|
||||
* contract — values like `'hello '` will return `true` even though
|
||||
* they would be unsafe at a width of 6. Production callers (the
|
||||
* composer's `canFastBackspace` helper) always pass `columns`;
|
||||
* `columns` is optional only so unit tests of the pre-wrap shape
|
||||
* contract can keep calling the helper without threading width
|
||||
* through. Do NOT omit it from any new caller that relies on the
|
||||
* wrap-boundary protection.
|
||||
*/
|
||||
-export function canFastBackspaceShape(current: string, cursor: number): boolean {
+export function canFastBackspaceShape(current: string, cursor: number, columns?: number): boolean {
|
||||
if (cursor !== current.length) {
|
||||
return false
|
||||
}
|
||||
|
|
@@ -252,6 +271,13 @@ export function canFastBackspaceShape(current: string, cursor: number): boolean
|
|||
return false
|
||||
}
|
||||
|
||||
// If we know the wrap width, reject at the soft-wrap boundary: the
|
||||
// caret's visual column is 0, so "\b \b" can't represent the physical
|
||||
// move back to the previous visual line.
|
||||
if (columns !== undefined && cursorLayout(current, cursor, columns).column === 0) {
|
||||
return false
|
||||
}
|
||||
|
||||
const removed = current.slice(prevPos(current, cursor), cursor)
|
||||
|
||||
return ASCII_PRINTABLE_RE.test(removed)
|
||||
|
|
@@ -333,6 +359,7 @@ export function TextInput({
|
|||
const fwdDel = useFwdDelete(focus)
|
||||
const termFocus = useTerminalFocus()
|
||||
const { stdout } = useStdout()
|
||||
const noteCursorAdvance = useCursorAdvance()
|
||||
|
||||
const curRef = useRef(cur)
|
||||
const selRef = useRef<null | { end: number; start: number }>(null)
|
||||
|
|
@@ -368,7 +395,19 @@ export function TextInput({
|
|||
[sel]
|
||||
)
|
||||
|
||||
-const layout = useMemo(() => cursorLayout(display, cur, columns), [columns, cur, display])
+// Read `curRef.current` (always up-to-date) rather than the `cur`
+// React state. The fast-echo path defers the React `setCur` by 16ms
+// to batch re-renders during heavy typing; if an unrelated render
+// flushes this component during that window and we used the stale
+// `cur` state here, the layout effect inside `useDeclaredCursor`
+// would publish a stale cursor declaration and clobber the Ink-level
+// bump from `noteCursorAdvance(...)`. `cur` is still in scope and
+// referenced by setSel/setCur paths below, so React tracks the
+// dependency naturally — we just don't use it as the source of truth
+// for layout. The cursorLayout call is cheap (one wrap-text pass
+// over a single-line string in the common case), so dropping useMemo
+// is fine.
+const layout = cursorLayout(display, curRef.current, columns)
|
||||
|
||||
const boxRef = useDeclaredCursor({
|
||||
line: layout.line,
|
||||
|
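The stale-state window described in the hunk above can be illustrated without React. The sketch below is a hypothetical model, not code from `textInput.tsx`: `ref` plays the role of `curRef.current` (updated synchronously), `state` plays the role of the `cur` React state (updated only when the deferred flush runs), and `flush()` stands in for the 16ms `setCur` timer.

```typescript
// Hypothetical model of a cursor whose "state" update is deferred while
// its "ref" is updated immediately.
class DeferredCursor {
  ref = 0 // analogous to curRef.current
  state = 0 // analogous to the `cur` React state
  private pending: number | null = null

  // Fast-echo path: bump the ref now, queue the state update for later.
  advance(dx: number): void {
    this.ref += dx
    this.pending = this.ref
  }

  // Stand-in for the deferred setCur flush.
  flush(): void {
    if (this.pending !== null) {
      this.state = this.pending
      this.pending = null
    }
  }
}

// Two ways an unrelated re-render could compute the declared column.
const declareFromState = (c: DeferredCursor): number => c.state
const declareFromRef = (c: DeferredCursor): number => c.ref

const cursor = new DeferredCursor()
cursor.advance(3) // three characters typed via fast echo

// An unrelated render lands before the flush:
const staleColumn = declareFromState(cursor)
const freshColumn = declareFromRef(cursor)

cursor.flush()
const settledColumn = declareFromState(cursor)

console.log({ staleColumn, freshColumn, settledColumn })
```

In this model the state-based declaration lags by the number of fast-echoed characters until the flush, while the ref-based one is correct throughout, which is the property the change in the hunk above depends on.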
|
@@ -526,7 +565,7 @@ export function TextInput({
|
|||
canFastEchoBase() && canFastAppendShape(current, cursor, text, columns, lineWidthRef.current)
|
||||
|
||||
const canFastBackspace = (current: string, cursor: number) =>
|
||||
-canFastEchoBase() && canFastBackspaceShape(current, cursor)
+canFastEchoBase() && canFastBackspaceShape(current, cursor, columns)
|
||||
|
||||
const commit = (
|
||||
next: string,
|
||||
|
|
@@ -911,6 +950,12 @@ export function TextInput({
|
|||
v = v.slice(0, t) + v.slice(c)
|
||||
c = t
|
||||
stdout!.write('\b \b')
|
||||
// The "\b \b" sequence ends with the cursor one column to the
|
||||
// LEFT of where Ink last parked it. Tell Ink so its `displayCursor`
|
||||
// (and log-update's relative-move basis on the next frame) stays
|
||||
// in sync — otherwise the cursor parks one cell to the right of
|
||||
// the caret on the next unrelated re-render.
|
||||
noteCursorAdvance(-1)
|
||||
commit(v, c, true, false, false, Math.max(0, lineWidthRef.current - 1))
|
||||
|
||||
return
|
||||
|
|
@@ -998,6 +1043,14 @@ export function TextInput({
|
|||
|
||||
if (simpleAppend) {
|
||||
stdout!.write(text)
|
||||
// ASCII-printable text advances the physical cursor by exactly
|
||||
// text.length cells (canFastAppendShape rejects non-ASCII,
|
||||
// wide chars, newlines). Notify Ink so the cached displayCursor
|
||||
// / log-update relative-move basis advances with it; otherwise
|
||||
// any unrelated re-render that happens before the 16ms
|
||||
// setCur/setParent flush parks the cursor text.length cells
|
||||
// too far right (#cursor-drift).
|
||||
noteCursorAdvance(text.length)
|
||||
commit(v, c, true, false, false, lineWidthRef.current + stringWidth(text))
|
||||
|
||||
return
|
||||
|
|
|
|||
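The drift that `noteCursorAdvance` prevents in the two fast-echo hunks above can be modelled with a small amount of bookkeeping. This is a hedged sketch, not Ink's actual implementation: `CursorModel`, `note`, `moveTo`, and `simulate` are hypothetical names, and the model assumes the renderer positions the cursor with relative moves from the column it last recorded.

```typescript
// Hypothetical renderer-side record of where the cursor was last parked.
class CursorModel {
  column = 0

  // Analogous in spirit to noteCursorAdvance(dx): record an out-of-band move.
  note(dx: number): void {
    this.column = Math.max(0, this.column + dx)
  }

  // Relative move the renderer would emit to reach `target` next frame.
  moveTo(target: number): number {
    const delta = target - this.column
    this.column = target

    return delta
  }
}

// Simulates: cursor parked at column 5, three characters written straight
// to stdout, then a re-render that wants the cursor at column 8.
function simulate(notify: boolean): number {
  const model = new CursorModel()
  model.column = 5
  let physical = 5

  const text = 'abc'
  physical += text.length // the terminal advanced on its own

  if (notify) {
    model.note(text.length)
  }

  physical += model.moveTo(8) // renderer applies its relative move

  return physical
}

console.log(simulate(true), simulate(false))
```

Without the notification the relative move is computed from the old column, so in this model the physical cursor ends `text.length` cells too far right; with it, the cursor lands on the intended column.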
Some files were not shown because too many files have changed in this diff.