Commit graph

8524 commits

Author SHA1 Message Date
Teknium
973f27e956
fix(run_agent): isolate background review fork from external memory plugins (#27190)
Pass skip_memory=True to the AIAgent constructor used by
_spawn_background_review() so the review fork's __init__ no longer
rebuilds a _memory_manager wired to honcho / mem0 / supermemory /
etc. under the parent's session_id.

Before this change, the review fork ingested its harness prompt
(the 'Review the conversation above and update the skill library...'
text) into the user's real memory namespace via three sites in
run_conversation():
  - on_turn_start(turn_count, prompt)      cadence + turn-message
  - prefetch_all(prompt)                   recall query
  - sync_all(prompt, review_output, ...)   harness + review output
                                           recorded as a
                                           (user, assistant) pair

Built-in MEMORY.md / USER.md state is still rebound from the parent
right after construction, so memory(action='add') writes from the
review continue to land on disk; only the external-plugin side
effects are removed.

Reported by @Utku.
2026-05-16 20:33:38 -07:00
teknium1
96b7f3da45 chore(release): AUTHOR_MAP entries for batch salvage contributors
Adds release-note attribution mappings for:
- @Saurav0989 (PR #27071)
- @avifenesh (PR #25902)
- @BROCCOLO1D (PR #26796)
- @matthewlai (PR #25293)
2026-05-16 20:32:43 -07:00
Matthew Lai
7244116b68 feat(agent): Added gemma 4 to reasoning allowlist 2026-05-16 20:32:43 -07:00
PaTTeeL
21078ebcea fix(fallback): forward custom_providers to fallback model context-length detection
The same root cause as the auxiliary compression fix (commit 7becb19):
get_model_context_length() is called without custom_providers, so per-model
context_length overrides are silently skipped.  The fallback activation path
(_try_activate_fallback) had the same missing parameter.

When the agent switches to a fallback provider, the fallback model would use
the models.dev value (e.g. 204800 for NVIDIA NIM minimax-m2.7) instead of
the user-configured one in custom_providers (e.g. 196608) — a subtle
discrepancy that could cause the fallback model to run with an incorrect
context window, leading to truncated messages or failed API requests when
the model does not support the detected length.

Fix: pass self._custom_providers to get_model_context_length() so the
fallback path sees the same per-model overrides as the main model path.
2026-05-16 20:32:43 -07:00
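The lookup order that 21078ebcea restores on the fallback path can be sketched roughly as below. This is an illustration only, not the repository's code: `resolve_context_length`, the `models` mapping inside each provider entry, and the `catalog` dict are hypothetical stand-ins for `get_model_context_length()` and the models.dev data.

```python
def resolve_context_length(model, catalog, custom_providers=None):
    """Prefer a user-configured context_length over the catalog value.

    All names and the config shape here are assumptions for illustration.
    """
    for provider in custom_providers or []:
        models = provider.get("models") or {}
        override = (models.get(model) or {}).get("context_length")
        if override:
            return override
    # No override found (or custom_providers was not passed at all).
    return catalog.get(model)


catalog = {"minimax-m2.7": 204800}
providers = [
    {"name": "nim", "models": {"minimax-m2.7": {"context_length": 196608}}},
]
```

Calling the helper without `custom_providers` reproduces the bug described in the commit: the catalog value wins because the override is never consulted.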
aqilaziz
903ac23bc8 docs(dashboard): clarify chat tab tui flag 2026-05-16 20:32:43 -07:00
BROCCOLO1D
c741eacd0c docs(spotify): document Home Assistant speaker routing 2026-05-16 20:32:43 -07:00
r266-tech
49bd95c432 docs(security): document YOLO mode visual indicators added in #26238 2026-05-16 20:32:43 -07:00
r266-tech
6f7292a555 docs(cron): document name-based job lookup from #26231 2026-05-16 20:32:43 -07:00
r266-tech
86f3776a72 docs(delegation): document api_mode wire-protocol override from #26824 2026-05-16 20:32:43 -07:00
r266-tech
31a805883b docs(delegation): show api_mode override in custom-endpoint example 2026-05-16 20:32:43 -07:00
Avi Fenesh
d5ce85c423 docs: add computer-use-linux community MCP 2026-05-16 20:32:43 -07:00
kjames2001
df80bda778 docs: add Hermes MemPalace to Community plugins section 2026-05-16 20:32:43 -07:00
Saurav0989
a1e3d7969e docs: add hermes-eval to Community section 2026-05-16 20:32:43 -07:00
teknium1
407a11b419 feat(discord): allow_any_attachment config to accept arbitrary file types
The Discord adapter silently dropped any attachment whose extension wasn't
in the SUPPORTED_DOCUMENT_TYPES allowlist (PDF, text family, zip, office).
Users uploading .wav / .bin / other unrecognized formats saw nothing in
their conversation — the file got logged as 'Unsupported document type'
and discarded before the agent ever saw it.

Add discord.allow_any_attachment (default false) to bypass the allowlist.
When on:
  - Any file is downloaded, cached under ~/.hermes/cache/documents/, and
    surfaced as a DOCUMENT-typed event with application/octet-stream MIME
  - gateway/run.py already emits a context note with the cached path,
    auto-translated via to_agent_visible_cache_path() for Docker/Modal
    sandboxed terminals
  - File body is NOT inlined — only the path — so binary uploads don't
    blow up the context window
  - Allowlisted text formats (.txt/.md/.log) keep their 100 KiB inline
    behavior unchanged

Also adds discord.max_attachment_bytes (default 32 MiB matches the
historical hardcoded cap; 0 = unlimited) since users opting into arbitrary
types may want to raise the cap. The whole attachment is held in memory
while being cached, so unlimited carries a real memory cost.

Env overrides: DISCORD_ALLOW_ANY_ATTACHMENT, DISCORD_MAX_ATTACHMENT_BYTES.

Discord-only by deliberate scope. Telegram has hard 20 MB API limits and
Slack has its own caps — extending the same flag there is a separate
follow-up if/when requested.
2026-05-16 20:26:18 -07:00
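The accept/reject decision described in 407a11b419 might look something like the sketch below. The function name, the extension subset, and the choice to apply the size cap before the allowlist check are assumptions; only the two option names, the 32 MiB default, and the "0 = unlimited" rule come from the commit message.

```python
import os

DEFAULT_MAX_BYTES = 32 * 1024 * 1024  # 32 MiB, the default named in the commit
# Illustrative subset only; the real SUPPORTED_DOCUMENT_TYPES list is not shown here.
ALLOWLISTED_EXTENSIONS = {".pdf", ".txt", ".md", ".log", ".zip", ".docx"}


def should_accept(filename, size_bytes, allow_any_attachment=False,
                  max_attachment_bytes=DEFAULT_MAX_BYTES):
    """Hypothetical gate: size cap first, then allowlist or bypass flag."""
    # A cap of 0 means unlimited, so only enforce a non-zero cap.
    if max_attachment_bytes and size_bytes > max_attachment_bytes:
        return False
    ext = os.path.splitext(filename)[1].lower()
    return allow_any_attachment or ext in ALLOWLISTED_EXTENSIONS
```

With the defaults, a `.wav` upload is rejected; setting `allow_any_attachment=True` accepts it as long as it fits under the cap.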
Teknium
3b39096904
Port from Kilo-Org/kilocode#9434: strip historical media after compression (#27189)
After context compression, the protected tail messages retain their
original image parts. When those include multi-MB pasted screenshots,
every subsequent API request re-ships the same base64 blobs forever,
which can push the request past provider body-size limits and wedge the
session even though compression 'succeeded'.

Add _strip_historical_media() to agent/context_compressor.py. After the
summary is built, find the newest user message that carries an image
part and replace image parts in every earlier message with a short
text placeholder ('[Attached image — stripped after compression]').
The newest image-bearing user turn keeps its media so the model can
still analyse what the user just sent.

Handles all three multimodal shapes:
  - OpenAI chat.completions image_url
  - OpenAI Responses API input_image
  - Anthropic native {type: image, source: ...}

Includes 27 unit tests covering the helpers and the end-to-end
compress() integration, plus a manual E2E check confirming a ~4MB
two-image conversation shrinks to ~2MB after compression.
2026-05-16 17:18:25 -07:00
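A minimal sketch of the stripping pass described in 3b39096904, assuming messages are plain dicts whose `content` is either a string or a list of typed parts. The helper names, the shortened placeholder string, and the use of a `text` part for the replacement are assumptions; the three image part types are the ones the commit lists.

```python
from copy import deepcopy

# Part types for the three multimodal shapes named in the commit.
IMAGE_PART_TYPES = {"image_url", "input_image", "image"}
PLACEHOLDER = "[Attached image]"  # shortened stand-in for the commit's placeholder text


def is_image_part(part):
    return isinstance(part, dict) and part.get("type") in IMAGE_PART_TYPES


def has_image(message):
    """True when the message content is a list containing an image part."""
    content = message.get("content")
    return isinstance(content, list) and any(is_image_part(p) for p in content)


def strip_historical_media(messages):
    """Replace image parts in every message before the newest image-bearing user turn."""
    newest = None
    for i in range(len(messages) - 1, -1, -1):
        if messages[i].get("role") == "user" and has_image(messages[i]):
            newest = i
            break
    if newest is None:
        return messages  # nothing to strip
    result = deepcopy(messages)  # leave the caller's list untouched
    for msg in result[:newest]:
        if has_image(msg):
            msg["content"] = [
                {"type": "text", "text": PLACEHOLDER} if is_image_part(p) else p
                for p in msg["content"]
            ]
    return result
```

The newest image-bearing user message and everything after it pass through unchanged, which matches the commit's stated goal of keeping the image the user just sent.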
Guillaume Meyer
5cbe0b1c4f test(plugins): cover _discover_all_plugins recursion + cross-link loader
Add a TestDiscoverAllPlugins class covering the six cases the recursive
scan needs to handle:

- flat plugin uses its manifest ``name:`` as the key
- category-namespaced plugin keys off ``<category>/<dirname>`` even when
  the manifest ``name:`` is bare (regression test for the original bug —
  ``plugins/observability/langfuse/`` with ``name: langfuse`` must
  surface as ``observability/langfuse``, not ``langfuse``)
- user-installed plugin overrides bundled on key collision
- depth cap: anything below ``<root>/<category>/<plugin>/`` is ignored
- bundled ``memory/`` and ``context_engine/`` are skipped (they have
  their own loaders), but user plugins under those category names are
  still scanned

Also add an in-source comment next to the key derivation pointing at the
loader's matching line (``PluginManager._parse_manifest`` in
plugins.py:1027-1028), so future renames of one site flag the other.

Both items raised in Copilot review on #27161.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 17:15:19 -07:00
Guillaume Meyer
21be7025c5 refactor(plugins): drop dead bundled-source guard in _discover_all_plugins
The `if key in seen and source == "bundled": continue` check was
unreachable: bundled is scanned before user, so `key in seen` can never
be true while `source == "bundled"`. The "user overrides bundled"
semantics are preserved automatically by the unconditional
`seen[key] = …` on the user pass.

Replaces the dead guard with a one-line comment explaining the
overwrite semantics, so a future contributor adding a third source
(e.g. project plugins) can see at a glance how ordering interacts with
the dict-overwrite. Matches `PluginManager.discover_and_load`'s
"user wins" rule.

Spotted by Copilot in code review on #27161.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 17:15:19 -07:00
Guillaume Meyer
8ab8bc2f03 fix(plugins): remove unreachable hermes tools → Langfuse path
The langfuse plugin is hooks-only (no toolsets), so it never appears in
`hermes tools` — that menu iterates `_get_effective_configurable_toolsets()`
(= `CONFIGURABLE_TOOLSETS` + plugin-registered toolsets), and "langfuse"
is in neither. The `TOOL_CATEGORIES["langfuse"]` setup wizard (with its
`post_setup: "langfuse"` hook that pip-installs the SDK and writes
`plugins.enabled`) was reachable only when a toolset key "langfuse" got
enabled, which can't happen — so it's been dead code, and the docs that
promised "Setup (interactive): hermes tools → Langfuse Observability"
were silently broken.

Right home for that wizard is `hermes plugins` (e.g. auto-running a
plugin's post-setup hook on enable), which is a generic plugin-setup
mechanism worth designing properly rather than shoehorning langfuse
back into `hermes tools`. Until that exists, point users at the
working manual flow.

Code:
- Delete `TOOL_CATEGORIES["langfuse"]` (24 lines) — unreachable.
- Delete the `post_setup_key == "langfuse"` branch in `_run_post_setup`
  (29 lines) — only caller was the deleted TOOL_CATEGORIES entry.

Docs / comments (point at the manual flow + interactive `hermes plugins`):
- `plugins/observability/langfuse/README.md`: collapse the two-option
  setup section to the single working flow.
- `plugins/observability/langfuse/plugin.yaml`: update `description`.
- `plugins/observability/langfuse/__init__.py`: update module docstring.
- `hermes_cli/config.py`: update inline comment above the LANGFUSE_*
  env-var allow-list.
- `website/docs/user-guide/features/built-in-plugins.md`: collapse
  "Setup (interactive)" + "Setup (manual)" into one accurate block.
- `website/docs/reference/environment-variables.md`: update the
  cross-reference in the Langfuse env-vars section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 17:15:19 -07:00
Guillaume Meyer
9b82586c6b fix(plugins): surface category-namespaced plugins in hermes plugins list
`_discover_all_plugins()` in plugins_cmd.py did a flat scan of the
bundled and user plugin directories — only direct children with a
plugin.yaml were surfaced. Category directories like `observability/`,
`image_gen/`, `platforms/`, `model-providers/`, `web/`, and `video_gen/`
have no plugin.yaml of their own, so their nested plugins
(`observability/langfuse`, `image_gen/openai`, etc.) never appeared in
`hermes plugins list` or the interactive `hermes plugins` UI — even
though the runtime loader (`PluginManager._scan_directory_level`)
discovers them correctly and they do load at runtime.

This broke the documented promise that bundled plugins appear in
`hermes plugins list` and the interactive UI before being enabled,
and made it look like `observability/langfuse` didn't exist.

Refactor `_discover_all_plugins()` to mirror the loader's recursion
(depth cap = 2, same skip set, user overrides bundled on key collision).
Return the path-derived registry key (e.g. `observability/langfuse`) as
the displayed name, matching what the user passes to
`hermes plugins enable …` / writes under `plugins.enabled` in
config.yaml.

Also clarify the plugins docs: spell out that sub-category plugins
surface by their `<category>/<plugin>` key in `hermes plugins list` /
interactive UI, add an `observability/langfuse` example to the command
reference, and include a nested entry in the interactive-UI mock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 17:15:19 -07:00
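The depth-capped scan from 9b82586c6b can be approximated as follows. This is a simplified sketch under stated assumptions: `discover_plugins` is a hypothetical name, flat plugins are keyed by directory name rather than by manifest `name:`, and the loader's skip set is omitted.

```python
from pathlib import Path


def discover_plugins(roots):
    """Scan (source, path) pairs in order; later sources overwrite earlier ones.

    Pass bundled first and user last so that user plugins win on key collision.
    """
    seen = {}
    for source, root in roots:
        root = Path(root)
        if not root.is_dir():
            continue
        for child in sorted(p for p in root.iterdir() if p.is_dir()):
            if (child / "plugin.yaml").is_file():
                seen[child.name] = (source, child)  # flat plugin
                continue
            # Category directory: look exactly one level deeper (depth cap = 2).
            for nested in sorted(p for p in child.iterdir() if p.is_dir()):
                if (nested / "plugin.yaml").is_file():
                    seen[f"{child.name}/{nested.name}"] = (source, nested)
    return seen
```

Because the assignment into `seen` is unconditional, scan order alone decides which source wins, which is the same overwrite behaviour that 21be7025c5 documents.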
Teknium
29b1bd0e20
feat(cli): add hermes send to pipe script output to any messaging platform (#27188)
Introduces a thin CLI wrapper around the existing send_message_tool so
shell scripts, cron scripts, CI hooks, and monitoring daemons can reuse
the gateway's already-configured platform credentials without
reimplementing each platform's REST client.

  hermes send --to telegram "deploy finished"
  echo "RAM 92%" | hermes send --to telegram:-1001234567890
  hermes send --to discord:#ops --file report.md
  hermes send --to slack:#eng --subject "[CI]" --file build.log
  hermes send --list                  # all targets
  hermes send --list telegram         # filter by platform

Supports all platforms the send_message tool already does (Telegram,
Discord, Slack, Signal, SMS, WhatsApp, Matrix, Feishu, DingTalk, WeCom,
Weixin, Email, etc.), including threaded targets and #channel-name
resolution via the channel directory.

hermes_cli/send_cmd.py delegates to tools.send_message_tool.send_message_tool,
which means there is zero new platform-specific code. The subcommand just:

1. Bridges ~/.hermes/.env and top-level ~/.hermes/config.yaml scalars into
   os.environ (same bootstrap the gateway does at startup) — required so
   TELEGRAM_HOME_CHANNEL and friends are visible to load_gateway_config().
2. Resolves the message body from positional arg, --file, or piped stdin.
3. Calls the shared tool and translates its JSON result to exit codes:
   0 success, 1 delivery failure, 2 usage error.

No running gateway is required for bot-token platforms (Telegram, Discord,
Slack, Signal, SMS, WhatsApp) — the tool hits each platform's REST API
directly. Plugin platforms that rely on a live adapter connection still
need the gateway running; the error message is forwarded verbatim.

- New guide: website/docs/guides/pipe-script-output.md covering real-world
  patterns (memory watchdogs, CI hooks, cron pipes, long-running task
  completion pings) and the security/gateway notes.
- Cross-links added from automate-with-cron.md ("no LLM? use hermes send")
  and developer-guide/gateway-internals.md (delivery-path section).

tests/hermes_cli/test_send_cmd.py (20 tests, all green):

- Happy paths: positional message, stdin, --file, --file -, --subject,
  --json, --quiet.
- Error paths: missing --to, missing body, file not found, tool returns
  error payload (exit 1), tool skipped-send result (exit 0).
- --list: human output, --json output, platform filter, unknown platform.
- Env loader: bridges config.yaml scalars into env, does not override
  existing env vars, gracefully handles missing files.
- Registrar contract: register_send_subparser() returns a working parser.

Smoke-tested end-to-end against a live Telegram bot before commit.
2026-05-16 17:14:45 -07:00
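Steps 2 and 3 of 29b1bd0e20 (body resolution and exit-code translation) could be sketched like this. The precedence order, the function names, and the assumption that a failed send is signalled by an `error` key in the JSON result are illustrative guesses; only the three exit codes are taken from the commit message.

```python
import json

EXIT_OK, EXIT_DELIVERY_FAILURE, EXIT_USAGE = 0, 1, 2


def resolve_body(positional, file_path, stdin):
    """Pick the message body from a positional arg, --file, or piped stdin."""
    if positional:
        return positional
    if file_path == "-":
        return stdin.read()  # "--file -" means read stdin explicitly
    if file_path:
        with open(file_path, encoding="utf-8") as fh:
            return fh.read()
    if stdin is not None and not stdin.isatty():
        return stdin.read()  # piped input
    return None  # caller should treat this as a usage error (exit 2)


def exit_code_for(result_json):
    """Map the tool's JSON result to a process exit code (assumed result shape)."""
    result = json.loads(result_json)
    return EXIT_DELIVERY_FAILURE if result.get("error") else EXIT_OK
```

A wrapper would return `EXIT_USAGE` when `resolve_body` yields `None` or `--to` is missing, and otherwise pass the tool's result through `exit_code_for`.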
konsisumer
33528b428d fix(agent): reset _fallback_index at turn start even when no fallback activated
In long-lived interactive sessions, _try_activate_fallback() advances
_fallback_index before attempting client resolution.  When resolution
fails (provider not configured, etc.) the function returns False without
ever setting _fallback_activated=True.  _restore_primary_runtime() then
skips its reset block entirely (guarded by `if not _fallback_activated`),
leaving _fallback_index >= len(_fallback_chain) for all subsequent turns.
The eager-fallback guard at the top of the retry loop checks
`_fallback_index < len(_fallback_chain)`, so the condition fails silently
and no fallback is ever attempted again for that session.

Cron jobs spawn a fresh AIAgent per run and never hit this path, which is
why the same fallback chain works reliably for cron but not interactive.

Fix: reset _fallback_index=0 in the `not _fallback_activated` early-return
branch so every new turn starts with the full chain available.

Fixes #20465
2026-05-16 17:12:48 -07:00
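The stuck-index behaviour fixed in 33528b428d can be reproduced with a small model. The class below is a hypothetical reduction of the agent's fallback state, not the real `AIAgent` code; attribute and method names are simplified for illustration.

```python
class FallbackState:
    """Toy model of the fallback chain bookkeeping (names are illustrative)."""

    def __init__(self, chain):
        self.chain = list(chain)
        self.index = 0
        self.activated = False

    def try_activate(self, resolve):
        """Try the next provider; return True only if a client resolved."""
        if self.index >= len(self.chain):
            return False  # chain exhausted for this turn
        provider = self.chain[self.index]
        self.index += 1  # advanced before resolution, as the commit describes
        if resolve(provider) is None:
            return False  # resolution failed; activated stays False
        self.activated = True
        return True

    def restore_primary(self):
        """Called at the start of each turn."""
        if not self.activated:
            self.index = 0  # the fix: reset even when nothing was activated
            return
        self.activated = False
        self.index = 0
```

Without the reset in the early-return branch, one failed resolution leaves `index` at the end of the chain, and every later turn in the same session skips fallback entirely.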
Teknium
2b193907d6
fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184)
xAI's Responses stream emits 'type=error' as the FIRST SSE frame when an
OAuth account is unsubscribed/exhausted or rejects the encrypted-reasoning
replay introduced in the May 2026 SuperGrok rollout. The SDK helper
raises RuntimeError(Expected to have received response.created before
error), which the caller correctly routes to
_run_codex_create_stream_fallback. The fallback then opens a new stream
that emits the same 'error' frame — but the fallback loop only handled
{response.completed, response.incomplete, response.failed} and silently
continue'd past 'error' events. Result: the loop fell off the end of
the stream and raised the useless 'fallback did not emit a terminal
response' RuntimeError, which the classifier marked retryable=True and
looped 3x before failing with no clue what went wrong.

Now: 'error' frames raise a synthesized _StreamErrorEvent with an OpenAI
SDK-shaped .body so _summarize_api_error, _extract_api_error_context,
_is_entitlement_failure, and classify_api_error all see the real
provider message. Users on unsubscribed accounts now see 'do not have
an active Grok subscription' once, not three RuntimeErrors.

Verified end-to-end: classifier returns reason=auth retryable=False;
entitlement detector matches even with status_code=None; summarizer
returns the full xAI message.

Tests: 4 new in TestCodexFallbackErrorEvent covering xAI subscription
message, dict-shaped events, summarizer integration, and the empty-stream
case (must still raise the original RuntimeError so 'truncated mid-flight'
stays distinguishable from 'provider rejected the call').
2026-05-16 17:09:41 -07:00
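The fallback loop change in 2b193907d6 can be sketched as below, assuming dict-shaped events. `StreamErrorEvent`, `consume_fallback_stream`, and the `message` / `code` field names on the error frame are assumptions for illustration; the three terminal event types and the final RuntimeError text come from the commit message.

```python
TERMINAL_TYPES = {"response.completed", "response.incomplete", "response.failed"}


class StreamErrorEvent(Exception):
    """Carries an SDK-style .body so downstream classifiers can read the message."""

    def __init__(self, message, body):
        super().__init__(message)
        self.body = body


def consume_fallback_stream(events):
    """Return the terminal event, or raise with the provider's own error text."""
    for event in events:
        etype = event.get("type")
        if etype == "error":
            # Previously this frame was skipped; now it is surfaced immediately.
            message = event.get("message") or "provider returned an error frame"
            body = {"error": {"message": message, "code": event.get("code")}}
            raise StreamErrorEvent(message, body=body)
        if etype in TERMINAL_TYPES:
            return event
    # Empty or truncated stream: keep the original error so the cases stay distinct.
    raise RuntimeError("fallback did not emit a terminal response")
```

Keeping `StreamErrorEvent` separate from `RuntimeError` preserves the distinction the commit's tests rely on between a rejected call and a stream that was cut off mid-flight.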
Teknium
e21cb8d145
feat(status): append session recap to /status output (#27176)
Adds a pure-local recap of recent session activity — turn counts,
tools used, files touched, last user ask, last assistant reply —
appended to the existing /status output. Useful when juggling multiple
sessions and you want a one-glance reminder of where this one left off.

Inspired by Claude Code 2.1.114's /recap, but folded into /status so
we don't add a 6th info command. Pure local computation: no LLM call,
no auxiliary model, no prompt-cache invalidation, instant and free.

Salvage of #18587 — kept the shared hermes_cli.session_recap.build_recap
helper and its 13 unit tests, dropped the /recap slash command +
ACTIVE_SESSION_BYPASS_COMMANDS entry + Level-2 bypass since /status
already covers both surfaces.

Tailored to hermes-agent's tool vocabulary: file-editing tools
(patch, write_file, read_file, skill_manage, skill_view) surface
touched paths; tool-call counts highlight which classes of work
drove the session.

Source: https://code.claude.com/docs/en/whats-new/2026-w17
2026-05-16 16:51:42 -07:00
Teknium
226cee43d9
feat(cli): show ▶ N indicator in status bar when /background tasks are running (#27175)
Surface live background-task count in the prompt_toolkit status bar so users
can see at a glance that a /background task exists and is running — no need
to ask the agent about it (the agent has no visibility into bg sessions by
design).

- _get_status_bar_snapshot now reports active_background_tasks from len()
  of the live _background_tasks dict (entries are removed in the task
  thread's finally block, so this reflects truly-running tasks)
- Indicator shown only on medium (<76) and wide (>=76) tiers; narrow (<52)
  stays minimal since it's already cramped
- No invalidate plumbing needed: status bar fragments are pulled via lambda
  on every redraw, and the bg thread already calls _app.invalidate() on exit

Refs #8568
2026-05-16 16:51:29 -07:00
helix4u
6f817e1447 fix(telegram): restore DM topic typing indicator 2026-05-16 16:50:02 -07:00
Maxim Esipov
e51d74ab91 fix(codex): rotate pool on usage limit 429 2026-05-16 16:49:56 -07:00
Teknium
dffb602f37
fix(xai): drop stale X Premium+ hint from entitlement 403 surfacing (#27110)
xAI announced on 2026-05-16 (https://x.ai/news/grok-hermes) that X Premium
subscriptions now work in Hermes Agent. The hint we shipped in PR #26644
asserted the opposite ("X Premium+ does NOT include xAI API access — only
standalone SuperGrok subscribers can use this provider"), which would now
misdirect Premium+ users who hit any other 403 (no Grok sub at all, wrong
tier, exhausted quota) into thinking they need to switch subscriptions
when their sub is in fact valid.

Remove _decorate_xai_entitlement_error and its two call sites in
_summarize_api_error. xAI's own body text already says "Manage subscriptions
at https://grok.com/?_s=usage" — surface that verbatim and let xAI's wording
do the diagnosis.

The _is_entitlement_failure guard (which prevents credential-pool refresh
loops on entitlement 403s) and the reasoning-replay gating for xai-oauth
are unrelated and untouched.

Update tests to assert the body still surfaces verbatim and that no
Hermes-side editorializing is appended.
2026-05-16 16:00:01 -07:00
Teknium
fb05f5d4b5
fix(mcp): validate remote URLs up-front with a clear error (#27105)
Port from anomalyco/opencode#25019 ("fix: handle invalid mcp urls").

Previously: a typo in `config.yaml` (missing scheme, wrong scheme,
empty string, non-string value) slipped past `_is_http()` and hit
`httpx.URL(url)` or `streamablehttp_client(url, ...)` deep in the
transport layer. That raised a generic exception which went through
the reconnect-backoff loop, so a bad URL caused _MAX_INITIAL_CONNECT_RETRIES
attempts with doubling backoff — about a minute of pointless retries
plus an opaque error — before the server was marked failed.

Now: we validate the URL once, at the top of `run()`, before
entering the retry loop. A malformed URL raises `InvalidMcpUrlError`
(a `ValueError` subclass) with a message that names the offending
server and explains exactly what was wrong. `_ready` is set and
`_error` is populated, so `start()` re-raises and the server shows
up as failed in `hermes mcp list` without any backoff burn.

Validation rules:
- Must be a string (rejects None, dict, int)
- Must be non-empty (rejects '' and whitespace-only)
- Scheme must be http or https (rejects file://, ws://, stdio://)
- Must have a non-empty host (rejects http:///, http://:8080)

Tests (21 new cases in tests/tools/test_mcp_invalid_url.py):
- TestValidUrlsAccepted: http, https, IPv6, ports, paths, query strings
- TestInvalidUrlsRejected: every rejection path above + clear error text
- TestErrorIsValueError: downstream code catching ValueError still works

E2E verified: a misconfigured server with `url: not-a-valid-url`
now fails in <0.001s with the clear error, instead of about a minute of retries.

Doesn't touch stdio servers (they use `command`, not `url`) — the
validator only fires when `_is_http()` returns True.
2026-05-16 13:06:56 -07:00
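The four validation rules listed in fb05f5d4b5 can be expressed with the standard library alone. This is a sketch, not the repository's implementation: `validate_mcp_url` and the error wording are hypothetical, while `InvalidMcpUrlError` being a `ValueError` subclass follows the commit message.

```python
from urllib.parse import urlsplit


class InvalidMcpUrlError(ValueError):
    """Raised when a configured MCP server URL is malformed."""


def validate_mcp_url(server_name, url):
    """Return the stripped URL, or raise InvalidMcpUrlError naming the server."""
    prefix = f"MCP server '{server_name}'"
    if not isinstance(url, str):
        raise InvalidMcpUrlError(f"{prefix}: url must be a string, got {type(url).__name__}")
    cleaned = url.strip()
    if not cleaned:
        raise InvalidMcpUrlError(f"{prefix}: url is empty")
    try:
        parts = urlsplit(cleaned)
    except ValueError as exc:  # e.g. an unbalanced IPv6 bracket
        raise InvalidMcpUrlError(f"{prefix}: {exc}") from exc
    if parts.scheme not in ("http", "https"):
        raise InvalidMcpUrlError(f"{prefix}: scheme must be http or https, got {parts.scheme!r}")
    if not parts.hostname:
        raise InvalidMcpUrlError(f"{prefix}: url has no host")
    return cleaned
```

Running a check like this once before the retry loop is what lets a typo fail immediately instead of going through the reconnect backoff.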
Teknium
93e109a1d5
fix(moonshot): strip $ref siblings and collapse tuple items in tool schemas (#27104)
Port from anomalyco/opencode#24730: Moonshot's JSON Schema validator rejects
two shapes that the rest of the JSON Schema ecosystem accepts:

1. $ref nodes with sibling keywords. Moonshot expands the reference before
   validation and then rejects the node if keys like `description`, `type`,
   or `default` appear alongside $ref. MCP-sourced tool schemas commonly
   put a `description` on $ref-typed properties so the model sees the
   field hint — which worked on every provider except Moonshot.

2. Tuple-style `items` arrays (positional element schemas). Moonshot's
   engine requires ONE schema applied to every array element. Common in
   tool schemas generated from Go/Protobuf that model fixed-length arrays
   as `[{type:number}, {type:number}]`.

Repairs applied in `agent/moonshot_schema.py`:

- Rule 3: when a node has `$ref`, return `{"$ref": <value>}` only
  (strip every sibling). The referenced definition still carries its own
  description on the target node, which Moonshot accepts.
- Rule 4: when `items` is a list, collapse to the first element schema
  (falling back to `{}` which is then filled by the generic missing-type
  rule). Preserves `minItems` / `maxItems` / other siblings.

Tests: 10 new cases across TestRefSiblingStripping + TestTupleItems,
plus the existing TestMissingTypeFilled::test_ref_node_is_not_given_synthetic_type
still passes (it asserted plain $ref passes through; now it passes through
as exactly `{"$ref": "..."}` which is strictly compatible).

All 35 tests in test_moonshot_schema.py pass.
2026-05-16 13:02:19 -07:00
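Rules 3 and 4 from 93e109a1d5 amount to a small recursive rewrite of the schema. The sketch below is an assumption-laden simplification: `repair_schema` is a hypothetical name, it walks every value naively (so a property literally named `$ref` would be mishandled), and it leaves the generic missing-type rule out.

```python
def repair_schema(node):
    """Recursively strip $ref siblings and collapse tuple-style items."""
    if isinstance(node, list):
        return [repair_schema(item) for item in node]
    if not isinstance(node, dict):
        return node
    if "$ref" in node:
        # Rule 3: keep only the reference, dropping description/type/default.
        return {"$ref": node["$ref"]}
    repaired = {}
    for key, value in node.items():
        if key == "items" and isinstance(value, list):
            # Rule 4: collapse positional schemas to the first one (or {}).
            value = value[0] if value else {}
        repaired[key] = repair_schema(value)
    return repaired
```

Siblings of `items` such as `minItems` and `maxItems` are copied through untouched, in line with the commit's note that they are preserved.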
Teknium
dc3d0fe148
Port from cline/cline#10343: periodic gateway memory logging (#27102)
Emit a grep-friendly '[MEMORY] rss=...MB ...' line in agent.log /
gateway.log every N minutes (default 5) so slow leaks in the long-lived
gateway process show up as a time series. Based on
https://github.com/cline/cline/pull/10343
(src/standalone/memory-monitor.ts).

- gateway/memory_monitor.py: new module. Daemon thread, baseline on
  start, final snapshot on stop. Uses resource.getrusage() (stdlib)
  first, falls back to psutil, disables itself with one WARNING if
  neither is available.
- gateway/run.py: start monitor right after setup_logging() in
  start_gateway(); stop it in the shutdown block next to MCP teardown.
- hermes_cli/config.py: logging.memory_monitor { enabled, interval_seconds }
  defaults under the existing logging section.
- tests/gateway/test_memory_monitor.py: 10 unit tests covering format,
  baseline/shutdown snapshots, double-start noop, periodic timer,
  daemon thread invariant, and unavailable-RSS warn-and-skip path.

Adapted from TypeScript/Node to Python (threading.Event-based daemon
thread instead of setInterval/unref), added Python-specific gc + thread
counts to the log line (handier than ext/arrayBuffers for diagnosing
Python gateway leaks), and gated behind a config.yaml toggle so users
can silence the periodic line if they want.

No heap-snapshot-on-OOM equivalent — CPython doesn't have V8's
--heapsnapshot-near-heap-limit; tracemalloc would be the Python
equivalent but adds non-trivial overhead, so leaving that out.
2026-05-16 12:55:23 -07:00
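A stripped-down version of the monitor described in dc3d0fe148 might look like the following. The class and function names, the exact log-line fields, and the `emit` callback are assumptions; the psutil fallback is omitted. Note that `ru_maxrss` reports peak rather than current RSS, and its unit differs between Linux (kilobytes) and macOS (bytes).

```python
import gc
import sys
import threading

try:
    import resource  # Unix-only stdlib module
except ImportError:
    resource = None


def rss_mb():
    """Peak RSS in MB via getrusage, or None when the module is unavailable."""
    if resource is None:
        return None
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def format_line(tag):
    rss = rss_mb()
    rss_text = f"{rss:.1f}MB" if rss is not None else "n/a"
    return (f"[MEMORY] rss={rss_text} threads={threading.active_count()} "
            f"gc={gc.get_count()} ({tag})")


class MemoryMonitor:
    def __init__(self, interval_seconds=300.0, emit=print):
        self._interval = interval_seconds
        self._emit = emit
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        if self._thread is not None:
            return  # double start is a no-op
        self._emit(format_line("baseline"))
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Event.wait returns False on timeout and True once stop() sets the event.
        while not self._stop.wait(self._interval):
            self._emit(format_line("periodic"))

    def stop(self):
        if self._thread is None:
            return
        self._stop.set()
        self._thread.join()
        self._thread = None
        self._emit(format_line("shutdown"))
```

Using `Event.wait` as the sleep means `stop()` wakes the thread immediately instead of waiting out the remaining interval.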
Teknium
fc03c95da1
feat(cli): add /exit --delete flag to remove session on quit (#27101)
Port from google-gemini/gemini-cli#19332.

Users can now exit with '/exit --delete' (or '/quit --delete', '/exit -d')
to permanently remove the current session's SQLite history plus on-disk
transcripts (*.json / *.jsonl / request_dump_*) in one shot. Useful for
privacy-sensitive workflows and one-off interactions where leaving a
session recording behind is undesirable.

Implementation:
- New HermesCLI._delete_session_on_exit one-shot flag (defaults False).
- process_command() parses --delete / -d after /exit or /quit and arms
  the flag. Unknown args print a hint and keep the CLI running (prevents
  typos like '/exit -delete' from accidentally exiting).
- Shutdown path calls SessionDB.delete_session(session_id, sessions_dir=...)
  right after end_session() when the flag is set. That API already
  existed for 'hermes sessions delete' and handles both SQLite removal
  (orphaning child sessions so FK constraints hold) and on-disk file
  cleanup.
- /quit CommandDef now advertises '[--delete]' in args_hint so /help
  and CLI autocomplete surface it.

Tests: tests/cli/test_exit_delete_session.py (12 cases covering both
aliases, case insensitivity, whitespace, short form, unknown-arg
rejection, and registry metadata).

E2E-verified with isolated HERMES_HOME: session row deleted, all three
transcript/request-dump files removed, second delete_session call
correctly returns False.
2026-05-16 12:51:08 -07:00
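The argument handling described in fc03c95da1 can be sketched as a pure function. `parse_exit_command`, its return tuple, and the hint text are hypothetical; the accepted spellings (`/exit`, `/quit`, `--delete`, `-d`) and the rule that unknown arguments keep the CLI running follow the commit message.

```python
def parse_exit_command(line):
    """Return (should_exit, delete_session, hint) for an /exit or /quit line."""
    tokens = line.strip().split()
    if not tokens or tokens[0].lower() not in ("/exit", "/quit"):
        return (False, False, None)  # not an exit command at all
    args = [token.lower() for token in tokens[1:]]
    if not args:
        return (True, False, None)  # plain exit, session kept
    if all(arg in ("--delete", "-d") for arg in args):
        return (True, True, None)  # exit and arm the one-shot delete flag
    # Unknown argument: stay in the CLI so a typo cannot exit by accident.
    hint = "Unknown argument; use /exit --delete to remove this session on quit."
    return (False, False, hint)
```

For example, `/exit -delete` returns a hint and does not exit, which is the typo case the commit calls out.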
briandevans
c844d15c3d fix(update): stream npm install output so postinstall progress is visible (#18840)
`hermes update` ran the repo-root and ui-tui npm installs with both
`--silent` and `subprocess.run(..., capture_output=True)`, which hides
all output from optional postinstall scripts.  The largest of those —
`@askjo/camofox-browser`'s `npx camoufox-js fetch` — downloads a
Firefox-fork browser binary that can take many minutes on slow
connections.  Because nothing was printed during that wait, the updater
appeared to hang at "Updating Node.js dependencies..." and users
Ctrl-C'd, sometimes leaving `node_modules` partially installed.

Drop `--silent` and pass `capture_output=False` for the repo-root and
ui-tui paths so npm streams its `info run …` postinstall lines straight
to the terminal.  Output is still mirrored to `~/.hermes/logs/update.log`
by the existing `_UpdateOutputStream` wrapper, so SSH-disconnect safety
is preserved.

The `web/` install path is untouched — its build step is fast and does
not run binary-fetching postinstalls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 12:18:55 -07:00
Teknium
05af78c53d
fix(update): make Camofox lazy-installed instead of eager (#27055)
The `@askjo/camofox-browser` npm package was a top-level entry in
the root `package.json` `dependencies` block, so `hermes update`
ran its postinstall on every user, every update. That postinstall
calls `npx camoufox-js fetch`, which silently downloads a ~300MB
Firefox-fork browser binary from GitHub Releases — multi-minute on
fast connections, and a hard block for users on slow / restricted
networks (notably users in China running through a VPN).

Camofox is an explicit opt-in browser backend. The runtime check
in `tools/browser_tool.py` only routes through Camofox when the
user has set `CAMOFOX_URL` (selected via `hermes tools` →
Browser Automation → Camofox). Users who never opted in never
touched the package at runtime, yet every `hermes update` paid
for the binary fetch anyway.

This change:

* Removes `@askjo/camofox-browser` from root `package.json`
  dependencies (and the regenerated `package-lock.json` drops
  Camofox's entire transitive tree, ~2.6k lines).
* Updates the Camofox `post_setup` handler in
  `hermes_cli/tools_config.py` to install
  `@askjo/camofox-browser@^1.5.2` explicitly when the user
  selects Camofox, and streams npm output (no `--silent`, no
  `capture_output`) so the ~300MB download is visible rather
  than appearing frozen.
* Adds `tests/test_package_json_lazy_deps.py` as a regression
  guard so future PRs can't silently re-add Camofox (or any
  binary-postinstall package) to eager root dependencies.

`agent-browser` stays eager — it is the default Chromium-driving
backend used by every session that does not have a cloud browser
provider configured, and its postinstall is small.

Validation:

| | Before | After |
|---|---|---|
| `hermes update` time on slow network | multi-minute hang at `→ Updating Node.js dependencies...` | seconds (no binary fetch) |
| Camofox opt-in install visibility | silent, looked frozen | streamed npm output |
| Regression guard against re-adding | none | `test_package_json_lazy_deps.py` |

Tests:
- `tests/test_package_json_lazy_deps.py`: 3/3 pass
- `tests/tools/test_browser_camofox*`: 92/92 pass
- `tests/hermes_cli/test_tools_config.py`: 66/66 pass
- `tests/hermes_cli/test_cmd_update.py` + adjacent: green

Reported by lulu (Discord, May 2026) — `hermes update` hangs at
`→ Updating Node.js dependencies...` in China.
Related: #18840, #18869.
2026-05-16 12:15:45 -07:00
Teknium
8a2b2b9f6f
docs(release): expand v0.14.0 highlights with newcomer-friendly context (#27053)
Each highlight now gets 2-3 sentences explaining the user-facing value,
not just the technical change. Targeted at someone discovering Hermes
for the first time who isn't deep in the codebase.
2026-05-16 11:57:59 -07:00
Teknium
6c2406c5e1
fix(signal): read groupV2.id in envelope, fall back to legacy groupInfo (#27051)
Port from qwibitai/nanoclaw#1962: modern Signal V2-only groups surface on
dataMessage.groupV2.id, not groupInfo.groupId. signal-cli versions differ
in which field they expose for V2 groups — some forward the underlying
libsignal envelope verbatim (groupV2), others normalize everything into
groupInfo. Without a groupV2 read, V2-only groups appear as DMs because
groupInfo is undefined and the adapter misroutes them to the sender's
DM session.

Reads groupV2.id first, falls back to groupInfo.groupId. Also hardens
chat_name extraction against non-dict groupInfo payloads (crashed with
AttributeError under malformed envelopes).

6 new tests cover V2 routing, V1 legacy compatibility, V2-preferred
precedence, no-group DM path, allowlist enforcement, and malformed
payloads.
2026-05-16 11:53:57 -07:00
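The field precedence introduced in 6c2406c5e1 is small enough to show directly. This sketch assumes the `dataMessage` payload arrives as a plain dict; `extract_group_id` is a hypothetical helper name, while the field names `groupV2.id` and `groupInfo.groupId` are taken from the commit message.

```python
def extract_group_id(data_message):
    """Prefer groupV2.id, fall back to legacy groupInfo.groupId, else None (DM)."""
    group_v2 = data_message.get("groupV2")
    if isinstance(group_v2, dict) and group_v2.get("id"):
        return group_v2["id"]
    group_info = data_message.get("groupInfo")
    # isinstance guard: malformed envelopes may carry a non-dict groupInfo.
    if isinstance(group_info, dict) and group_info.get("groupId"):
        return group_info["groupId"]
    return None
```

Returning `None` for both the no-group case and malformed payloads routes those messages down the DM path rather than raising.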
Teknium
35f25523c6
docs(tools): add video_generate / video_gen toolset to user-facing tool docs (#27050)
The video_gen toolset and its video_generate tool shipped without
user-facing reference docs. toolsets-reference.md and the dev-guide
plugin page were already in, but reference/tools-reference.md had no
video_gen section at all and user-guide/features/tools.md's Media row
didn't list video_generate.

- reference/tools-reference.md: add a video_gen section after video,
  including backend list (xAI Grok-Imagine, FAL.ai Veo/Pixverse/Kling),
  unified text-to-video / image-to-video surface note, link to the
  dev-guide plugin page, and the video_generate tool row. Add
  video_generate to the standalone-tools quick-counts line.
- user-guide/features/tools.md: extend Media row with video_generate
  and video_analyze plus an opt-in caveat.
2026-05-16 11:53:13 -07:00
Teknium
6836987428
docs(release): rewrite v0.14.0 highlights for excitement framing (#27035)
* chore: release v0.14.0 (2026.5.16)

The Foundation Release — Hermes installs and runs anywhere now.

Highlights:
- Native Windows support (early beta) — PowerShell installer, native subprocess/PTY paths, ~40 follow-up Windows-only fixes
- pip install hermes-agent — PyPI wheel
- Cold-start wave — ~19s off hermes launch, 180x faster browser_console (CDP WS)
- Supply-chain advisory checker + lazy-deps + tiered install fallback
- OpenAI-compatible local proxy for OAuth providers (Claude Pro, ChatGPT Pro, SuperGrok)
- Cross-session 1h Claude prompt cache (Anthropic / OpenRouter / Nous Portal)
- 2 new platforms: LINE + SimpleX Chat (22 total)
- Microsoft Graph foundation — Teams pipeline + webhook adapter
- /handoff actually transfers sessions live
- x_search first-class tool, vision_analyze pixel passthrough
- LSP semantic diagnostics on every write
- Unified video_generate with pluggable backends
- computer_use cua-driver backend
- 9 new optional skills, OpenRouter Pareto Code router, xAI Grok OAuth
- 12 P0 + 50 P1 closures

808 commits · 633 PRs · 1393 files · 165k insertions · 545 issues closed · 215 contributors

* docs(release): rewrite v0.14.0 highlights for excitement framing

Demote Windows beta from headline; lead with SuperGrok / OAuth proxy /
x_search / Microsoft Teams. Frame lazy-deps as a debloating wave that
makes installs dramatically lighter. Add highlights for clickable URLs
in any terminal, dangerous-command detection bypasses, ChatGPT Pro
and SuperGrok via the local proxy. Tighten the summary paragraph.
2026-05-16 11:18:06 -07:00
kshitij
3034eee38e
fix(acp): replay session history before responding to session/load (#12285 follow-up) (#26957)
Switches `_replay_session_history` from `loop.call_soon`-deferred (after the
`LoadSessionResponse` is written) to `await`-inline (before the response is
constructed) for both `session/load` and `session/resume`. Adds defensive
try/except around the awaited call so a replay helper crash still yields a
successful load response — partial transcripts are acceptable, total
load failure is not.

The deferral was added on May 2 in commit 19854c7cd with the rationale "Zed
only attaches streamed transcript/tool updates once the load/resume response
has completed." That justification was incorrect:

- Zed's current ACP integration (zed-industries/zed
  crates/agent_servers/src/acp.rs) explicitly registers the session-update
  routing entry BEFORE awaiting the loadSession RPC, with the comment:
  "so that any session/update notifications that arrive during the call
  (e.g. history replay during session/load) can find the thread."
- Every other reference ACP server (Codex, Claude Code, OpenCode, Pi, agentao)
  replays history BEFORE responding to the load request.
- The ACP spec wording ("Stream the entire conversation history back to the
  client via notifications") and the natural JSON-RPC reading both mean
  "during the request's lifetime", not "after the response resolves".

Empirical reproduction (reported by Biraj on @agentclientprotocol/sdk
v0.21.1): the same custom ACP client works correctly against Codex /
Claude Code / OpenCode / Pi but receives 0 notifications from Hermes
because it measures the per-call notification count at the moment
`loadSession` resolves — which on Hermes was before the `call_soon`-
scheduled replay coroutine had a chance to run.
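
The ordering difference can be reproduced in isolation. The sketch below is not the adapter's code: `replay`, `load_deferred`, `load_inline`, and `observe` are hypothetical names, and appending to a list stands in for sending `session/update` notifications.

```python
import asyncio


async def replay(events: list) -> None:
    """Stand-in for streaming history notifications to the client."""
    await asyncio.sleep(0)
    events.append("replay")


async def load_deferred(events: list) -> str:
    """Old shape: schedule the replay, then respond immediately."""
    loop = asyncio.get_running_loop()
    loop.call_soon(lambda: loop.create_task(replay(events)))
    events.append("returned")
    return "response"


async def load_inline(events: list) -> str:
    """New shape: await the replay before building the response."""
    try:
        await replay(events)
    except Exception:
        # A replay failure must not turn into a failed load.
        pass
    events.append("returned")
    return "response"


def observe(handler) -> list:
    """Snapshot the events visible at the moment the call resolves."""
    events: list = []

    async def main() -> list:
        await handler(events)
        return list(events)

    return asyncio.run(main())
```

A client that counts notifications when the call resolves sees `["returned"]` from `observe(load_deferred)` and `["replay", "returned"]` from `observe(load_inline)`.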

Changes:
- `acp_adapter/server.py`: remove `_schedule_history_replay`; both
  `load_session` and `resume_session` now `await self._replay_session_history`
  before returning, wrapped in try/except that logs and continues on
  helper exceptions.
- `tests/acp/test_server.py`: replace the single
  `test_load_session_schedules_history_replay_after_response`
  (which encoded the now-incorrect post-response ordering) with two tests
  asserting `events == ["replay", "returned"]` for load and resume.
  Add two regression tests confirming that a replay helper raising still
  yields a `LoadSessionResponse` / `ResumeSessionResponse` rather than
  propagating the exception out as a JSON-RPC error.

Result: 240 ACP tests pass (was 238), ruff clean. Verified end-to-end:
Biraj's synchronous notification-counter pattern now sees 6 notifications
during `loadSession` for a 5-message session, matching all other reference
ACP servers.

The `_fenced_text` change in `acp_adapter/tools.py` from the same May 2
commit is orthogonal and intentionally left intact — it's a separate,
still-valid fix for Zed's pipe-as-table rendering.

Refs #12285. Follows up #26943 (which added thought-chunk replay but kept
the deferral).
2026-05-16 07:41:34 -07:00
kshitij
f3a4af9cf2
fix(acp): replay assistant reasoning as agent_thought_chunk on session/load (#12285) (#26943)
Persisted assistant `reasoning_content` / `reasoning` fields are now emitted
as ACP `agent_thought_chunk` notifications during `_replay_session_history`,
so editor clients (Zed, etc.) rebuild collapsed Thinking panes when the user
re-opens a session that used a thinking model.

Ordering matches live streaming: thought precedes message text within the
same assistant turn, mirroring how `reasoning_callback` deltas arrive before
`stream_delta_callback` deltas in `events.py::make_thinking_cb` /
`make_message_cb`.

Behavior on non-reasoning histories is unchanged; the replay loop's existing
text / tool_call / tool_call_update / plan emission is preserved bit-for-bit.

Closes #12285.

Credit:
- @Yukipukii1 (#14691) — original thought-replay design via
  `acp.update_agent_thought_text`; the tool-call portion of that PR has
  since landed via #19139, but the reasoning replay is theirs.
- @HenkDz (#17652 / #18578) — established the `_replay_session_history` and
  `_history_*` helper conventions this builds on.
- @D1zzyDwarf (#16531) — also closed by this work.
2026-05-16 06:45:29 -07:00
Teknium
a91a57fa5a
chore: release v0.14.0 (2026.5.16) (#26862)
The Foundation Release — Hermes installs and runs anywhere now.

Highlights:
- Native Windows support (early beta) — PowerShell installer, native subprocess/PTY paths, ~40 follow-up Windows-only fixes
- pip install hermes-agent — PyPI wheel
- Cold-start wave — ~19s off hermes launch, 180x faster browser_console (CDP WS)
- Supply-chain advisory checker + lazy-deps + tiered install fallback
- OpenAI-compatible local proxy for OAuth providers (Claude Pro, ChatGPT Pro, SuperGrok)
- Cross-session 1h Claude prompt cache (Anthropic / OpenRouter / Nous Portal)
- 2 new platforms: LINE + SimpleX Chat (22 total)
- Microsoft Graph foundation — Teams pipeline + webhook adapter
- /handoff actually transfers sessions live
- x_search first-class tool, vision_analyze pixel passthrough
- LSP semantic diagnostics on every write
- Unified video_generate with pluggable backends
- computer_use cua-driver backend
- 9 new optional skills, OpenRouter Pareto Code router, xAI Grok OAuth
- 12 P0 + 50 P1 closures

808 commits · 633 PRs · 1393 files · 165k insertions · 545 issues closed · 215 contributors
2026-05-16 02:58:57 -07:00
teknium1
72f94f4a7c test(security): regression guard for OAuth PKCE state/verifier separation
Two unit tests for run_hermes_oauth_login_pure():

1. test_authorization_url_state_is_not_pkce_verifier — asserts state in the
   auth URL is independent from the PKCE code_verifier sent in the token
   exchange, and that the verifier never appears in the URL.

2. test_callback_state_mismatch_aborts — asserts the flow returns None
   (no token exchange) when the callback state does not match the value
   we generated.

Negative control verified: reintroducing the b17e5c10 vulnerable pattern
(state = verifier, no callback validation) makes both tests fail.

Also adds AUTHOR_MAP entry for shaun0927 (contributor of the fix).
2026-05-16 02:38:02 -07:00
JunghwanNA
345821b4a1 style: move secrets import alongside other function-level imports
Group the secrets import with time and webbrowser at the top of
run_hermes_oauth_login_pure(), matching the existing pattern.
Drop the _secrets alias — no name conflict in this scope.
2026-05-16 02:38:02 -07:00
JunghwanNA
fcd9011f8d fix(security): separate OAuth PKCE state from code_verifier
The PKCE flow reused the code_verifier as the OAuth state parameter.
Per RFC 6749 §10.12 and RFC 7636, these serve different purposes:
state is an anti-CSRF token visible in the authorization URL; the
code_verifier must remain secret for the token exchange.

Generate an independent secrets.token_urlsafe(32) for state and
validate it on callback to provide actual CSRF protection.
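
A minimal sketch of the separation, using only the standard library. The helper names `new_pkce_request` and `callback_state_ok` are hypothetical and do not reflect the structure of `run_hermes_oauth_login_pure()`; the S256 challenge derivation follows RFC 7636.

```python
import base64
import hashlib
import hmac
import secrets


def new_pkce_request() -> dict:
    """Generate independent PKCE and anti-CSRF values for one login."""
    # Secret: sent only in the token exchange, never in the URL.
    verifier = secrets.token_urlsafe(64)
    # Public: S256 challenge derived from the verifier.
    challenge = (
        base64.urlsafe_b64encode(hashlib.sha256(verifier.encode("ascii")).digest())
        .rstrip(b"=")
        .decode("ascii")
    )
    # Public: anti-CSRF token, generated independently of the verifier.
    state = secrets.token_urlsafe(32)
    return {"code_verifier": verifier, "code_challenge": challenge, "state": state}


def callback_state_ok(expected: str, received: str) -> bool:
    """Constant-time comparison of the state echoed back on callback."""
    return hmac.compare_digest(expected, received)
```

Only `state` and `code_challenge` belong in the authorization URL; a callback whose state fails `callback_state_ok` should abort before any token exchange.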

Closes #10693
2026-05-16 02:38:02 -07:00
Teknium
585d6b6430
fix(gateway): merge rapid TEXT follow-ups during active sessions (#4469) (#26822)
When the agent is running and the user sends multiple TEXT messages in
rapid succession, base.py's active-session branch stored the pending
event as a single-slot replacement:

    self._pending_messages[session_key] = event

Three rapid messages A, B, C landed as: A (interrupts), B (replaces A
before consumer reads), C (replaces B). Only C reached the next turn —
A and B were silently dropped. This is the symptom in #4469.

Route the follow-up through merge_pending_message_event(..., merge_text=True)
so TEXT events accumulate into the existing pending event's text instead
of clobbering it. Photo and media bursts already merged through the same
helper; this just extends the merge_text path (already used by the
Telegram bursty-grace branch in gateway/run.py) to all platforms.
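
The difference between single-slot replacement and text merging can be sketched as below. This is an illustration only: `Event` and `store_pending` are hypothetical names and are not the signature of `merge_pending_message_event`.

```python
from dataclasses import dataclass


@dataclass
class Event:
    text: str


def store_pending(pending: dict, session_key: str, event: Event, merge_text: bool) -> None:
    """Record a follow-up event for a session that is still busy."""
    existing = pending.get(session_key)
    if merge_text and existing is not None:
        # Accumulate: append the new text to the pending event.
        existing.text = f"{existing.text}\n{event.text}"
    else:
        # Single-slot behaviour: the newest event replaces the previous one.
        pending[session_key] = event
```

With `merge_text=False`, storing "b" and then "c" leaves only "c" pending; with `merge_text=True` the pending text becomes "b" and "c" joined by a newline.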

Test exercises BasePlatformAdapter.handle_message directly with the
session marked active and asserts three rapid TEXT events merge to
'part two\npart three' rather than dropping the middle message.
Sanity-checked the test would fail without the fix.

Credits @devorun for the original investigation and analysis in #4491
that surfaced the underlying queue handling, though their fix targeted
GatewayRunner._pending_messages which is now dead state on main.
2026-05-16 02:25:41 -07:00
teknium1
374dc81c23 fix(copilot-acp): tighten deprecation detection + sharpen GitHub Models 413 hint
Follow-up improvements on top of @konsisumer's cherry-picked fix for #10648:

1. Deprecation patterns now require BOTH a product fingerprint ('gh-copilot') and
   a deprecation marker. The previous list included 'copilot-cli' and bare
   'deprecation', which would false-positive on stderr from the NEW
   @github/copilot CLI — whose repo is literally github.com/github/copilot-cli
   and which legitimately surfaces those substrings in its own messages.

2. Replace the deprecation hint. The user in #10648 installed
   'gh extension install github/gh-copilot' (the deprecated extension)
   thinking that's what ACP mode uses, when ACP actually spawns the new
   'copilot' binary from '@github/copilot'. The hint now points users at the
   correct install command ('npm install -g @github/copilot') with the new
   CLI's repo URL, and demotes provider-switching to a fallback alternative.

3. Change _URL_TO_PROVIDER value for models.inference.ai.azure.com from the
   'github-models' alias to the canonical 'copilot' provider id, matching the
   convention used by every other entry in the table.

4. Sharpen the 413 hint message. The free tier's ~8K cap is below the
   system-prompt floor, so this endpoint is fundamentally incompatible with
   an agentic loop — not a 'use a different URL' problem.
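
The two-part match from item 1 can be sketched as follows. The marker strings and the function name are assumptions for illustration; the actual pattern list in the adapter is not shown in this message.

```python
# Hypothetical pattern lists; only 'gh-copilot' is named in the commit.
PRODUCT_FINGERPRINTS = ("gh-copilot",)
DEPRECATION_MARKERS = ("deprecated", "no longer supported")


def looks_like_gh_copilot_deprecation(stderr: str) -> bool:
    """True only when stderr names the old extension AND says it is deprecated."""
    text = stderr.lower()
    has_product = any(p in text for p in PRODUCT_FINGERPRINTS)
    has_marker = any(m in text for m in DEPRECATION_MARKERS)
    return has_product and has_marker
```

Requiring both halves keeps output from the new CLI, which mentions 'copilot-cli' and may mention deprecations of its own, from tripping the hint.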

Tests:
- New parametrized false-positive coverage for the new CLI's stderr shape.
- Updated assertion to require canonical 'copilot' provider mapping.
- All 14 deprecation/URL tests pass.
2026-05-16 02:24:48 -07:00
konsisumer
b85b938b1f test: add tests for copilot ACP deprecation detection and Azure URL mapping
Cover the deprecation pattern matching against real gh-copilot stderr
output, verify the GitHub Models Azure URL is in _URL_TO_PROVIDER, and
confirm _is_github_models_base_url recognises the Azure endpoint.
2026-05-16 02:24:48 -07:00
konsisumer
4ded3ede33 fix: detect gh-copilot deprecation and improve GitHub Models 413 errors (#10648)
Address two blocking issues when using GitHub Copilot integrations:

1. ACP mode: detect the gh-copilot CLI deprecation error from stderr
   and surface an actionable message with alternatives instead of
   hanging or showing a cryptic error.

2. GitHub Models (Azure) 413: recognize models.inference.ai.azure.com
   as a known GitHub Models URL, and print a targeted hint explaining
   the hard 8K token limit that makes this endpoint incompatible with
   Hermes' system prompt size.
2026-05-16 02:24:48 -07:00
kshitijk4poor
7bb97b952f chore: add worlldz to AUTHOR_MAP for #26704 salvage
2026-05-16 02:21:17 -07:00
worlldz
d0a183cadd fix(doctor): suppress stale direct-key issues when oauth is healthy
Fixes #26693

`hermes doctor` currently promotes invalid direct API keys into the final
summary even when the matching OAuth path is already healthy. That makes
the setup look more broken than it really is.

This change keeps the failed API Connectivity row visible but stops
treating it as a blocking summary issue when a healthy OAuth fallback
already exists for the same provider family.

Covered cases:
- Gemini OAuth + invalid direct Gemini key
- MiniMax OAuth + invalid direct MiniMax key

Based on #26704 by @worlldz.
2026-05-16 02:21:17 -07:00
Teknium
5f91b1a48b
feat(skills): add osint-investigation optional skill (closes #355) (#26729)
* feat(skills): add osint-investigation optional skill (closes #355)

Phase-1 public-records OSINT investigation framework adapted from
ShinMegamiBoson/OpenPlanter (MIT). Lives in optional-skills/research/.

Six data-source wiki entries (FEC, SEC EDGAR, USAspending, Senate LD,
OFAC SDN, ICIJ Offshore Leaks), each following the 9-section template:
summary, access, schema, coverage, cross-reference keys, data quality,
acquisition, legal, references.

Six stdlib-only acquisition scripts that emit normalized CSV, plus three
analysis scripts:

  - entity_resolution.py  — three-tier match (exact / fuzzy / token overlap)
                            with explicit confidence per row
  - timing_analysis.py    — permutation test for donation/contract timing
                            correlation, joins through cross-links
  - build_findings.py     — assembles structured findings.json with
                            evidence chains pointing back to source rows
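
As a rough sketch of what a three-tier match with per-row confidence could look like (the thresholds, the normalization, and the function names are assumptions, not the contents of entity_resolution.py):

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple


def normalize(name: str) -> str:
    """Uppercase, drop commas and periods, and collapse whitespace."""
    return " ".join(name.upper().replace(",", " ").replace(".", " ").split())


def match(a: str, b: str,
          fuzzy_threshold: float = 0.9,
          overlap_threshold: float = 0.6) -> Optional[Tuple[str, float]]:
    """Return (tier, confidence) for the strongest tier that matches, else None."""
    na, nb = normalize(a), normalize(b)

    # Tier 1: exact match after normalization.
    if na == nb:
        return ("exact", 1.0)

    # Tier 2: character-level similarity.
    ratio = SequenceMatcher(None, na, nb).ratio()
    if ratio >= fuzzy_threshold:
        return ("fuzzy", ratio)

    # Tier 3: Jaccard overlap of name tokens, order-insensitive.
    ta, tb = set(na.split()), set(nb.split())
    if ta and tb:
        jaccard = len(ta & tb) / len(ta | tb)
        if jaccard >= overlap_threshold:
            return ("token_overlap", jaccard)

    return None
```

Returning the tier alongside the score lets downstream findings state how each cross-link was established.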

Validation: full pipeline runs end-to-end on synthetic fixtures. Entity
resolution found 24 cross-matches with 0 false positives on a 5-row /
4-row test set. Timing analysis on 5 donations clustered near 3 awards
returned p=0.000, effect size 2.41 SD. Findings JSON correctly tags
HIGH-severity timing pattern. All 9 scripts pass --help and py_compile.

Docs site page auto-generated by website/scripts/generate-skill-docs.py;
sidebar + catalog entries updated by the same generator.

* fix(osint-investigation): live API fixes from end-to-end sweep

Live-tested the skill on a real public-citizen query and found three bugs
the synthetic E2E missed. All three are now fixed and re-verified.

1. FEC fetch hung on contributor name searches.
   The combination of two_year_transaction_period + sort=date +
   contributor_name puts the OpenFEC query plan on a slow path that the
   upstream gateway times out (25s+). Switched to min_date/max_date with no
   explicit sort. Renamed --candidate to --contributor (the original name
   was misleading: FEC searches by donor, not by candidate; --candidate is
   kept as a deprecated alias). Added --state filter for narrowing.

2. ICIJ Offshore Leaks reconcile endpoint returns 404.
   ICIJ removed the Open Refine reconciliation API. Rewrote
   fetch_icij_offshore.py to download the official bulk CSV ZIP (~70 MB,
   public, no auth) and search it locally. Cached under
   $HERMES_OSINT_CACHE/icij/ (default ~/.cache/hermes-osint/icij/) for
   30 days, --force-refresh to refetch. Verified live: 'PUTIN' query
   returns 5 Panama Papers officer matches in 0.5s after first download.

3. SEC EDGAR silently returned 0 when the company-name resolver matched
   an individual Form 3/4/5 filer (insider trading disclosures).
   Now surfaces 'Resolved company X → CIK Y (Z)' on stderr, prints a
   filing-type histogram when the type filter wipes results, and
   explicitly warns when the matched CIK appears to be an individual
   filer rather than a corporate registrant.

Bonus: _http.py was retrying 429 responses with exponential backoff plus
honoring (often-missing) Retry-After headers, which compounded into
multi-second hangs per page when the upstream key was over quota.
Changed to fail-fast on 429 with a clear, actionable error showing the
upstream's quota message. Verified: 0.3s fast-fail vs the previous 60s
hang on DEMO_KEY rate-limit exhaustion.
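
A fail-fast handler of this kind might look like the sketch below. It is not the contents of _http.py: `RateLimited`, `fetch_bytes`, and the injectable `opener` parameter are hypothetical names chosen for illustration.

```python
import urllib.error
import urllib.request


class RateLimited(RuntimeError):
    """Raised immediately when the upstream answers 429."""


def fetch_bytes(url: str, opener=urllib.request.urlopen, timeout: float = 30.0) -> bytes:
    """Fetch a URL; on 429, raise at once instead of sleeping and retrying."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.read()
    except urllib.error.HTTPError as exc:
        if exc.code == 429:
            # Surface the upstream's own quota message, truncated.
            detail = exc.read().decode("utf-8", "replace")[:200]
            raise RateLimited(f"HTTP 429 from {url}: {detail}") from exc
        raise
```

Other HTTP errors propagate unchanged, so any retry policy for transient failures can sit in the caller.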

Updated SKILL.md, fec.md, and icij-offshore.md to match the new CLI
flags and ICIJ bulk-cache flow. Regenerated the docusaurus page via
website/scripts/generate-skill-docs.py.

Live sweep results across all 6 sources for 'Dillon Rolnick, New York':
- OFAC SDN: 0 matches ✓ (correctly not sanctioned)
- USAspending: 0 matches ✓ (correctly not a federal contractor)
- Senate LDA: 0 matches ✓ (correctly not a lobbying client)
- SEC EDGAR: warns it resolved to 'Rolnick Michael' (CIK 0001845264)
    who is an individual Form 3 filer, not a corporate registrant
- ICIJ: 0 matches ✓ (correctly not in any offshore leak)
- FEC: rate-limited (DEMO_KEY); fails fast with clear quota message

* feat(osint-investigation): expand to 12 sources covering identity, property, courts, archives, news

Phase-2 expansion per Teknium feedback that the original 6-source skill
(federal financial/regulatory only) wasn't a complete OSINT toolkit. Adds
6 more sources covering the major omissions a real investigation would
reach for first.

New sources (6 fetch scripts + 6 wiki entries):

1. NYC ACRIS — Real property records (deeds, mortgages, liens) via the
   city's Socrata API. Search by party name or property address. Joins
   Parties to Master to populate doc_type, dates, borough, and amount.
   Coverage: 5 NYC boroughs, ~70M party records, 1966-present.

2. OpenCorporates — Global corporate registry covering 130+ jurisdictions
   (~200M companies). Free API token at
   https://opencorporates.com/api_accounts/new raises the rate limit;
   HTML fallback works without one (limited fields).

3. CourtListener (Free Law Project) — federal + state court opinions
   (~10M back to colonial era) + PACER dockets via RECAP. Anonymous v4
   search works; COURTLISTENER_TOKEN raises rate limits.

4. Wayback Machine CDX — historical web captures (~900B+). Used both for
   surveillance-of-record (when did this site change?) and as a
   content-recovery layer when other sources point to dead URLs.

5. Wikipedia + Wikidata — narrative bio + structured facts. Wikipedia
   OpenSearch for article matching, REST summary for extracts, Wikidata
   Action API (wbgetentities) for claims. Avoids the SPARQL Query
   Service which is aggressively rate-limited.

6. GDELT 2.0 DOC API — global news monitoring in 100+ languages,
   ~2015-present. Auto-retries with 6s backoff on the standard
   1-req-per-5-sec throttle.

Other changes in this commit:

- SEC EDGAR no longer raises SystemExit when the company-name resolver
  finds no CIK; writes an empty CSV with header so the rest of a
  pipeline can keep moving and the warning is just on stderr.

- _http.py User-Agent updated per Wikimedia policy: includes app name,
  version, and a 'set HERMES_OSINT_UA to identify yourself' instruction.

- SKILL.md workflow now groups sources into two clusters (federal
  financial vs identity/property/courts/archives/news) with bash
  examples for each. 'When to use this skill' lists the broader set of
  investigation patterns the expanded sources unlock.

Live sweep results on 'Dillon Rolnick, New York' across all 12 sources:

  ofac           ✓ 0 (correctly clean)
  icij           ✓ 0 (correctly not in any leak)
  usaspending    ✓ 0 (correctly not a federal contractor)
  senate_lda     ✓ 0 (correctly not a lobbying client)
  sec_edgar      ✓ 0, warns: resolved to 'Rolnick Michael' (CIK 0001845264),
                   individual Form 3 filer, NOT a corporate registrant
  fec            — rate-limited (DEMO_KEY exhausted), fails fast with
                   clear quota message
  nyc_acris      ✓ 200 records named Rolnick across NYC; 48 records at
                   571 Hudson (the property the web identifies as his)
  opencorporates ✓ 0 (no API token configured; HTML fallback)
  courtlistener  ✓ 0 for 'Dillon Rolnick'; 20 for 'Rolnick' generally;
                   5 for 'Microsoft' sanity check
  wayback        ✓ 30 captures of nousresearch.com from 2011-present
  wikipedia      ✓ 0 (correctly not notable enough); Bill Gates sanity
                   returns full structured facts (occupation, employer,
                   DOB, place of birth, country)
  gdelt          ✓ 0 for 'Dillon Rolnick'; 5 for 'Nous Research'

All 17 scripts compile clean and pass --help. Synthetic analysis pipeline
regression still passes (entity_resolution 30 matches, timing p=0.000,
findings 2).

* feat(osint-investigation): remove FEC; DEMO_KEY rate-limits make it unreliable

The FEC fetcher consistently failed the live sweep because the OpenFEC
DEMO_KEY tier (40 calls/hour) exhausts on a single investigation, and
the upstream returns slow-path query plans for unindexed contributor-name
searches that the gateway times out. Without a real API key it's not
usable; with one the user has to sign up at api.data.gov first. That's
too much setup friction for a skill that should work out of the box.

Removed:
  - scripts/fetch_fec.py
  - references/sources/fec.md

Updated:
  - SKILL.md frontmatter description + tags
  - 'When NOT to use' now points users at https://www.fec.gov/data/ for
    federal donations
  - entity_resolution example switched from donor↔contractor to
    lobbying-client↔contractor (Senate LDA + USAspending pair)
  - timing_analysis example switched to lobbying-filings vs awards
  - 8 wiki entries had their 'FEC ↔ ...' cross-reference bullets removed

11 sources remain (5 federal financial + 6 identity/property/courts/
archives/news). All scripts compile, pass --help, and the synthetic
analysis pipeline still passes on the new lobbying-shaped regression
fixture (30 matches, p=0.000 on tight clustering, 2 findings).
2026-05-16 01:55:06 -07:00