hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-22 10:32:00 +00:00

Author	SHA1	Message	Date
teknium1	a58287afcb	Merge remote-tracking branch 'origin/main' into pr48275-rebase # Conflicts: # cron/scheduler.py	2026-06-19 07:40:29 -07:00
Ben	6c44471bfd	fix(hindsight): lazy-install cloud client dependency	2026-06-19 07:36:28 -07:00
Hao Zhe	d7cd0bc086	fix(openviking): preserve structured sync attribution	2026-06-19 15:23:41 +08:00
Eurekaxun	c7b7f92ec1	fix(openviking): sync structured turns with tool parts	2026-06-19 15:23:41 +08:00
Ben	637aff46e7	Merge remote-tracking branch 'origin/main' into hermes/hermes-6fe26723	2026-06-19 15:17:13 +10:00
qin-ctx	2a5d51c16e	fix(openviking): adapt memory provider for current api (cherry picked from commit `cbb87389f3`)	2026-06-18 16:58:11 +08:00
kshitijk4poor	0787ea07c8	test(langfuse): pin exact surviving key in turn-isolation test The prior assertion `all("turn1" in k or "turn2" in k for k in keys)` was weak on two counts: it passes vacuously when keys is empty (a regression that lost all state would slip through), and after turn 2 finalizes only turn 1 lingers, so it only ever inspected turn 1 anyway. Replace it with an exact check that one key survives, it is turn 1, and turn 2 never merged into it — the real isolation invariant the test name claims.	2026-06-18 13:00:01 +05:30
kshitijk4poor	f4fbaa6cda	fix(langfuse): bound _TRACE_STATE growth from non-finalizing turns Scoping the trace key by turn_id (the prior commit) fixed cross-turn collisions but introduced a slow leak: _finish_trace only pops a key when a turn ends cleanly (final response has content and no tool calls), so any turn that is interrupted, ends on a tool call, or has empty final content now leaves its uniquely-keyed entry in _TRACE_STATE forever. Previously the constant per-session key was overwritten by the next turn, capping growth at ~1 entry per session. Add an LRU cap (_MAX_TRACE_STATE) enforced by _evict_stale_locked, called under _STATE_LOCK immediately before each insert. It evicts the least-recently-updated entries (using the previously-dead last_updated_at field) and ends their root span so nothing dangles. Regression test drives 50 non-finalizing turns against a cap of 8 and asserts the dict stays bounded with the most-recent turns surviving.	2026-06-18 12:59:41 +05:30
kshitijk4poor	e1d10ec1ed	refactor(langfuse): extract _scope_prefix from _trace_key The turn- and api-scoped branches each repeated the same task/session/thread fallback ladder with only the infix differing. Extract the shared prefix into _scope_prefix so a future scope dimension touches one ladder instead of three. The legacy branch still returns a bare task_id (not the task: prefix) for backward compatibility, so it stays separate. Output key strings are unchanged; a new test pins them across every task/session/turn/api combination since the keys are matched across hooks and any drift would silently break trace finalization.	2026-06-18 12:58:24 +05:30
kshitijk4poor	b4356135f2	test(langfuse): add end-to-end turn-isolation regression The PR added helper-level tests for _trace_key but nothing exercised the keys through the real hooks. This adds TestTurnTraceIsolation, which drives on_pre_llm_request / on_post_llm_call across two turns of one gateway session (task_id == session_id, unique turn_id, api_call_count reset per turn) and asserts each turn opens its own root trace when the first turn fails to finalize (tool-only final step). This test fails on the pre-fix code (only one trace opened, turn 2 absorbed into turn 1) and passes with the scoping fix. Also pins the turn_id-over-api_request_id key precedence: the turn-scoped post_llm_call carries no api_request_id, so it must still resolve to the same key as the request-scoped hooks or finalization breaks.	2026-06-18 12:38:44 +05:30
infinitycrew39	40ed67ccfe	test(langfuse): cover turn/api trace-key scoping	2026-06-18 12:36:35 +05:30
Ben	e1e53bff9d	Merge remote-tracking branch 'origin/main' into hermes/hermes-6fe26723	2026-06-18 16:18:33 +10:00
kshitijk4poor	1153b42b24	Merge upstream/main into OpenViking setup-UX (salvage #32445 ) Resolves conflicts from the OpenViking churn that merged after #32445 was opened (#48042/#47662 session-switch + write hardening, #47311/#47973): - plugins/memory/openviking/__init__.py: keep both __init__ field groups (the PR's _runtime_start_* alongside main's _prefetch_threads/_shutting_down). - tests/plugins/memory/test_openviking_provider.py: keep BOTH the PR's new setup-validation tests and main's session-switch/concurrency tests (disjoint additions to the same region). Two fixes layered while reconciling (contributor work otherwise preserved): - Restore the merged tenant-header contract (#22414/#21232). The PR had changed _VikingClient defaults to '' and made empty account/user OMIT the tenant headers; main's contract is that empty falls back to 'default' and the X-OpenViking-Account/User headers are ALWAYS sent (ROOT API keys need them). Reverted the constructor to 'account or os.environ.get(..., "default")' and updated the two PR tests that asserted the omit-when-empty behavior. - Close a secret-file TOCTOU in the setup writers. _write_env_vars and _write_ovcli_config wrote the api_key/root_api_key file and chmod 0600 AFTERWARD, leaving a world-readable window on newly-created files. Added _precreate_secret_file() to create with 0600 before any secret bytes land.	2026-06-18 11:28:51 +05:30
Ben	3fc7b624d8	feat(cron,gateway): NAS-JWT fire verifier + /api/cron/fire webhook (Chronos) Phase 4E (E.1 + E.2). The inbound side of Chronos: NAS POSTs the agent when a one-shot fires; the agent verifies a NAS-minted JWT and runs the job. E.1 — plugins/cron/chronos/verify.py: - verify_nas_fire_token(token, expected_audience, jwks_or_key, issuer): verifies signature against the NAS JWKS (RS/ES family; symmetric rejected), aud == this agent, exp/nbf, iss, and purpose == "cron_fire" (so a general agent JWT can't be replayed against the fire endpoint). Returns claims or None; never raises. Crypto delegated to PyJWT[crypto] (already a declared dep) — no hand-rolled JWT, no new dependency. No key configured → refuse (never unsigned-decode a security boundary). - get_fire_verifier(): pluggable indirection so the DQ-4 escape hatch (direct per-job cron-key) can swap in with no handler change. E.2 — gateway/platforms/api_server.py: - POST /api/cron/fire (registered only when _CRON_AVAILABLE). Authenticated by the NAS-JWT via get_fire_verifier() — NOT API_SERVER_KEY (NAS holds no API key; this is the only inbound that triggers remote job execution, so it gets its own purpose-scoped check). Verifier args come from cron.chronos.* config. 401 on bad/missing/forged token. 400 on missing job_id. On success: 202 + fire_due runs in the background (so a long agent turn never trips NAS's HTTP timeout); the store CAS claim inside fire_due de-dupes a scheduler retry. Tests: - test_chronos_verify (11): REAL RS256 signing — valid→claims, wrong-aud, missing/wrong purpose, expired, wrong-iss, tampered-signature (attacker key), no-key-refuse, empty-token, JWKS-URL key resolution, get_fire_verifier. - test_cron_fire_webhook (5): valid→202+fire, invalid→401+no-fire, missing token→401, missing job_id→400, and fire path does NOT require API_SERVER_KEY. api_server regression suites (214) green. E.3 (NAS endpoints) is a separate cross-repo PR; the wire contract lands next (docs/chronos-managed-cron-contract.md).	2026-06-18 14:46:33 +10:00
Ben	4c8bbe6416	feat(cron): Chronos NAS-mediated managed-cron provider (scale-to-zero) Phase 4D. The first non-default CronScheduler: plugins/cron/chronos/. Inert unless cron.provider=chronos; resolve_cron_scheduler falls back to the built-in if unavailable, so cron never loses its trigger. Files: - chronos/__init__.py — ChronosCronScheduler + register(ctx). * is_available(): config-only, NO network (portal_url + callback_url + a stored Nous access token via get_provider_auth_state). Returns False → resolver falls back to built-in. * start(): reconcile() then RETURN — no blocking loop, no 60s wake (DQ-1: this is what makes scale-to-zero real; the machine wakes only on a NAS→agent fire). * _arm_one_shot(job): POST NAS provision {job_id, fire_at, agent_callback_url, dedup_key=job_id:fire_at}. Agent owns the time → sub-minute fires survive (no scheduler 1-minute floor). * reconcile(): converge NAS arms toward jobs.json — arm missing/changed-time, cancel orphaned, skip paused. Cold process rebuilds from jobs.json + idempotent dedup_key. * on_jobs_changed(): reconcile (re-arm/cancel the affected one-shot). * fire_due(): ABC default (CAS claim + run_one_job) THEN re-arm the next one-shot. Job gone (one-shot done / repeat-N exhausted) → no re-arm. - chronos/_nas_client.py — thin HTTP wrapper for provision/cancel/list using the agent's existing refresh-aware Nous token (resolve_nous_access_token). Names no scheduler vendor; holds no scheduler creds. - chronos/plugin.yaml — discovery metadata. INVARIANT: zero "qstash"/"upstash" hits in plugins/cron, gateway, hermes_cli, website/docs — the external scheduler is a NAS-internal detail, never named agent-side. Tests (13, all NAS mocked, zero network): is_available off-without-config + on-with-config + makes-no-network; arm payload incl. sub-minute + noop without next_run; reconcile arms-all / cancels-orphan / skips-paused / skips-already- armed; fire_due re-arms next / no re-arm when job gone / no re-arm when claim lost.	2026-06-18 14:40:56 +10:00
Austin Pickett	fd674af47f	fix(photon): preserve text in mixed iMessage attachments (salvage #46513 ) (#46818 ) * fix(photon): preserve text in mixed iMessage attachments When an iMessage bubble carried both text and an attachment, spectrum-ts' inbound mapper returned only buildAttachmentMessage(...), dropping the user's typed text before Hermes could see it. The Photon adapter then had no 'group' content path, so the text was lost entirely. - adapter.py: handle a new 'group' content type that flattens text + attachment items, preserving the typed text alongside cached media (extracted shared _normalize_binary_payload helper). - sidecar: emit 'group' content in normalizeContent, and ship patch-spectrum-mixed-attachments.mjs which patches spectrum-ts' pinned mapper (at npm postinstall AND at sidecar startup, so existing installs self-heal). Windows robustness fixes on top of the original PR: - The patcher's CLI guard used 'import.meta.url === file://${argv[1]}', which never matches on Windows (file:/// + drive letter) — it silently no-opped. Switched to pathToFileURL(argv[1]).href. - The patcher matched \n-joined strings, so a CRLF checkout (Windows git autocrlf) defeated every replacement. It now normalizes CRLF->LF for matching and restores the original EOL style on write. Co-authored-by: Yuhang Lin <yuhanglin@YuhangdeMac-mini.local> * chore: map YuhangLin contributor email for attribution (#46513) --------- Co-authored-by: Yuhang Lin <yuhanglin@YuhangdeMac-mini.local> Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>	2026-06-17 16:14:24 -05:00
kshitijk4poor	c835448908	fix(openviking): don't block the command thread on session switch; lock turn state Follow-up hardening on @ehz0ah / @harshitAgr's session-switch work (#28296): - on_session_switch no longer runs the old-session writer-drain + pending-token GET + commit POST inline on the caller's command thread. /new, /branch, /resume, /undo call it synchronously, so a slow drain (up to 10s) or wedged commit blocked the user-facing command — the same hazard #41945 fixed for end-of-turn sync. State now rotates synchronously (cheap) and the old-session commit is offloaded to a daemon finalizer (generalized _finalize_session_async). - Guard the (_session_id, _turn_count) pair with _session_state_lock: sync_turn runs on the memory-manager executor thread while the session hooks run on the command thread, so the snapshot+reset vs increment was a cross-thread race. - _session_needs_commit checks the committed-session guard BEFORE the turn_count>0 shortcut, closing a double-commit window when a racing sync_turn re-increments after commit+reset. - Add a _shutting_down flag so deferred finalizers stop POSTing against a torn-down client; track all prefetch threads in a set so invalidate/shutdown join every one, not just the latest slot. Tests: regression for the non-blocking switch (asserts the caller returns while a slow drain is parked off-thread) and the committed-guard ordering; updated the deferred-commit test to the unified finalizer contract.	2026-06-18 00:21:21 +05:30
Hao Zhe	3ac6551ba3	fix(openviking): handle rewound session switches	2026-06-17 14:46:06 +08:00
Hao Zhe	00c045b43f	fix(openviking): harden session writes and switch commits	2026-06-17 13:16:03 +08:00
Hao Zhe	f3b813c027	test(openviking): preserve content/write memory writes	2026-06-17 12:58:14 +08:00
harshitAgr	91e9459e10	fix(openviking): track writers per-session so commit waits for all sync_turn's bounded join could drop a still-alive previous worker by replacing the single _sync_thread slot. The dropped worker kept POSTing under the old sid but was no longer visible to on_session_end / on_session_switch, so the commit could fire while orphaned writes were still in flight — those writes landed past the commit boundary and were never extracted. Replace the single _sync_thread slot with _inflight_writers: Dict[sid, Set[Thread]]. Writers self-register on spawn (sync_turn, on_memory_write) and self-deregister on exit. The commit path drains _drain_writers(sid, 10.0) and skips the commit if any writer for that sid is still alive after the bounded budget. Also trim inline review-rationale comments to short invariants per reviewer style ask: "commit only after session writes drain" and "drop prefetch results from older switch generations." Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> (cherry picked from commit `7537ee6f5b`)	2026-06-17 12:55:37 +08:00
harshitAgr	eddbf291a4	fix(openviking): close remaining session-boundary races on switch Three follow-ups from review on #28296: 1. Sync worker outliving the bounded join. Each sync_turn POST has _TIMEOUT=30s and there are two per turn, but on_session_end and on_session_switch only join for 10s. If the worker is still alive after the join, committing the old session orphans the worker's late writes past the commit boundary — they land in an already- committed session and never get extracted. Both hooks now re-check is_alive() after the join and skip the commit when the worker hasn't drained. 2. on_memory_write late session_id capture. Same shape as the pre-fix sync_turn: f-string for the post path read self._session_id inside the worker, so a switch between thread spawn and post call landed the memory note in the new session. Snapshot sid at call time, same pattern as sync_turn. 3. Stale prefetch repopulating the new session. The pre-switch drain+clear only protects against workers that finish before the join completes; one finishing after the clear would write its result into the new generation's slot. Added a monotonic _prefetch_generation; workers capture it at spawn and refuse to write if it has advanced. Tests: existing in-flight-sync test updated to drain (it tested the join-before-commit happy path); four new tests cover hung-writer skip on end + switch, on_memory_write sid capture, and prefetch generation gating. 177/177 memory tests pass. (cherry picked from commit `3791a87dbe`)	2026-06-17 12:54:44 +08:00
harshitAgr	a30b40c73a	fix(openviking): close session-boundary races on sync_turn and on_session_end Two hardening fixes prompted by review on #28296: 1. sync_turn() now snapshots the target session id before spawning the worker. The previous code read self._session_id inside the worker, so a worker delayed past on_session_switch's bounded join could read the rotated-in NEW id and write the OLD turn's messages into the wrong session. 2. on_session_end() resets _turn_count to 0 after a successful commit, making the old-session commit path idempotent with the new switch hook. /new and compression call commit_memory_session() (which fires on_session_end) immediately before on_session_switch; without this, the old session would be committed twice. On commit failure we leave _turn_count > 0 so on_session_switch retries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> (cherry picked from commit `2ea8d5c537`)	2026-06-17 12:54:15 +08:00
harshitAgr	813a4e3838	fix(openviking): implement on_session_switch hook (#28296 ) OpenVikingMemoryProvider only overrides on_session_end and inherits the base-class no-op for on_session_switch. When the agent rotates session_id (via /new, /branch, /reset, /resume, or context compression), the provider's cached _session_id stays at the value initialize() captured. All subsequent sync_turn writes then land in the already-closed old session, and on_session_end tries to commit it a second time — the new session never accumulates messages and never triggers memory extraction. The fix mirrors the pattern Hindsight uses (#17508): 1. Wait for any in-flight sync thread to drain under the OLD _session_id before we mutate it, otherwise the commit below races the last message write. 2. Commit the old session if it accumulated turns — same extraction semantics as on_session_end. Skip if empty (nothing to extract). 3. Drain in-flight prefetch from the old session and clear its cached result so the new session doesn't see stale recall. 4. Rotate _session_id to the new value and reset _turn_count. Commit failures are swallowed (logged at WARN) so a flaky server can't strand the provider on the old session forever — same posture as the existing on_session_end commit. (cherry picked from commit `a1e7185e8a`)	2026-06-17 12:53:54 +08:00
Hao Zhe	166d2457b2	fix(memory): avoid setup autostart for unhealthy OpenViking	2026-06-17 01:32:43 +08:00
Hao Zhe	315fdae5f8	fix(memory): tighten OpenViking local autostart	2026-06-17 01:23:05 +08:00
Hao Zhe	2c2ca0443b	feat(memory): improve OpenViking setup UX	2026-06-17 01:04:26 +08:00
Hao Zhe	3c76dac4fd	fix(memory): log OpenViking chmod failures	2026-06-17 01:02:39 +08:00
Hao Zhe	2b972472ce	fix(memory): validate OpenViking manual setup steps	2026-06-17 01:02:39 +08:00
Hao Zhe	94523764fc	fix(memory): choose OpenViking key type before prompting	2026-06-17 01:02:39 +08:00
Hao Zhe	70f53f36cb	feat(memory): add manual OpenViking setup path	2026-06-17 01:02:39 +08:00
Hao Zhe	b0e25c9cb2	fix(memory): restrict OpenViking setup file permissions	2026-06-17 01:02:39 +08:00
Hao Zhe	2dace37f6b	feat(memory): improve OpenViking setup UX Support linking, copying, and creating ovcli.conf during OpenViking memory setup. Make setup cancellation write nothing and cover OpenViking/Hindsight picker cancellation paths.	2026-06-17 01:02:38 +08:00
underthestars-zhy	5b3fa26366	fix(photon): unify project identifiers and update documentation for Spectrum provisioning Co-Authored-By: Marvin <marvin@photon.codes> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 05:25:56 -07:00
Teknium	49e743985a	fix: route minimax m3 reasoning controls through profile Follow up PR #46609's api.minimax.io reasoning report by moving the behavior out of the broad run_agent host gate and into the MiniMax provider profile. Only MiniMax-M3 on the documented OpenAI-compatible /v1 route gets reasoning_split/thinking/reasoning_effort; Anthropic-format MiniMax and non-M3 models keep their existing wire shapes. Co-authored-by: goku94123 <gooku94123@gmail.com>	2026-06-15 07:08:43 -07:00
Teknium	a688d2a1bd	test: assert disk cleanup prunes protected walks	2026-06-15 05:25:27 -07:00
墨綠BG	40699c3292	🐛 fix(disk-cleanup): avoid brittle sweep review issues	2026-06-15 05:25:27 -07:00
墨綠BG	c1a70a5439	🐛 fix(disk-cleanup): prune protected cleanup walks	2026-06-15 05:25:27 -07:00
Nicolò Boschi	a376ca0081	feat(hindsight): make observation scopes configurable on retain Adds an observation_scopes config key (and HINDSIGHT_RETAIN_OBSERVATION_SCOPES env var) so retained memories can opt into per_tag / all_combinations / custom scoping instead of Hindsight's default combined pass. Threaded through _build_retain_kwargs so all three retain paths honor it: auto-retain and flush-on-switch already use aretain_batch; the tool retain path is switched from aretain to aretain_batch (functionally equivalent, aretain just wraps a single-item batch) since aretain doesn't accept the observation_scopes parameter.	2026-06-15 04:59:17 -07:00
Teknium	f3fe99863d	revert(web): remove keyless Parallel search fallback (#46350 ) Remove the free Parallel Search MCP path and restore the keyed Parallel backend behavior from before it was introduced. Also drops the keyless fallback registration/display labeling tests and returns the Parallel SDK pin to the prior version.	2026-06-14 16:47:57 -07:00
Teknium	2681c5a12d	fix(photon): correct gateway start command (#45566 )	2026-06-13 05:14:59 -07:00
underthestars-zhy	a652131c42	fix(photon): stop gateway restarts from orphaning the sidecar on its port A hard gateway exit (crash, SIGKILL, supervisor restart) left the detached Node sidecar running with a token the next gateway run doesn't know, so it could never be told to /shutdown. Every replacement spawn then died on EADDRINUSE, failing each 30→300s reconnect attempt while the orphan kept consuming the inbound gRPC stream. Two layers: - Lifetime binding: the adapter now holds the sidecar's stdin as a pipe, and the sidecar (PHOTON_SIDECAR_WATCH_STDIN=1) shuts down on stdin EOF — fired by the OS on any parent death, including SIGKILL. - Startup reaping: before spawning, the adapter probes the port and terminates a stale listener, but only after verifying its command line is a Photon sidecar; a foreign listener raises a clear error instead of being signalled. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-12 01:07:38 -07:00
underthestars-zhy	573c4e6511	feat(photon): upgrade to spectrum-ts 3.0.0 (pinned) with markdown + reactions Pin spectrum-ts to exactly 3.0.0 (was ^1.18.0 plus an `npm install spectrum-ts@latest` on every setup) so breaking SDK majors can't take down fresh installs silently; `hermes photon setup` now runs `npm ci`. Upgrade procedure documented in the README. Migrate resolveSpace to the v3 namespace API: `im.space.create(phone)` for DMs and `im.space.get(id)` for everything else — group spaces are now rehydratable from their persisted id after a sidecar restart, which v1 could not do. Markdown: replies go out via the v3 `markdown()` builder (iMessage renders natively; other Spectrum platforms degrade to plain text). `PHOTON_MARKDOWN=false` reverts to the stripped plain-text path. Reactions, behind PHOTON_REACTIONS (default off): lifecycle tapbacks (👀 while processing, 👍/👎 on completion) via new sidecar /react and /unreact endpoints with per-target reaction-handle tracking, and user tapbacks on bot-sent messages routed to the agent as synthetic `reaction:added:<emoji>` events. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-12 01:07:38 -07:00
Austin Pickett	c3464ecf45	fix(discord): recover from runtime gateway task exits (#44383 ) * fix(discord): recover from runtime gateway task exits Salvaged from #39416 (AMEOBIUS) — cherry-picked only the task-exit recovery; the original PR was 1081 commits behind with 28 unrelated commits. A post-ready discord.py WebSocket crash left the gateway split-brained: producers stayed active while Discord stopped responding. After this fix the adapter calls _set_fatal_error(retryable=True) + _notify_fatal_error() so the existing GatewayRunner reconnect watcher replaces the dead adapter. Also adds _wait_for_ready_or_bot_exit() so startup failures (SOCKS/proxy errors, invalid tokens) surface fast instead of burning the full ready timeout. Because connect() no longer waits via asyncio.wait_for on that path, test_connect_releases_token_lock_on_timeout is updated to trigger the timeout through the new helper (same lock-release contract). 3 tests pass (2 new runtime-failure tests + the updated timeout test); test_discord_connect.py and test_discord_slash_commands.py green. Co-Authored-By: ameobius <ameobius@local.host> * fix(test): patch _wait_for_ready_or_bot_exit in timeout cancel test connect() no longer uses asyncio.wait_for for the ready handshake, so test_connect_timeout_cancels_bot_task was hanging for 30s in CI. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: ameobius <ameobius@local.host> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-11 15:39:01 -04:00
Teknium	0a5762c78d	fix(web): genericize free-MCP client identity per telemetry policy Replace the hermes-identifying clientInfo/User-Agent/session-id prefix on the keyless Parallel Search MCP path with a neutral 'mcp-web-client' identity. Project policy forbids third-party usage attribution without an explicit user opt-in (see telemetry PR policy); MCP requires a clientInfo, so a generic one satisfies the spec without attributing traffic. Also adds the contributor AUTHOR_MAP entry and refreshes uv.lock against current main (parallel-web 0.6.0).	2026-06-10 19:54:38 -07:00
Matt Harris	e0e2571711	feat(web): Parallel-backed web search & extract — free Search MCP when keyless, v1 REST when keyed Make Parallel the web search/extract backend with a zero-setup free tier: - Keyless (no PARALLEL_API_KEY): web_search/web_extract work out of the box via Parallel's free hosted Search MCP (search.parallel.ai/mcp), and parallel becomes the default backend when no other web credentials are configured (ahead of ddgs, which is search-only). A small hand-rolled Streamable-HTTP JSON-RPC client speaks the MCP's web_search/web_fetch tools; the existing web_search/web_extract tools are the only tools registered. - Keyed (PARALLEL_API_KEY set): uses the Parallel v1 REST endpoints (client.search / client.extract with advanced_settings.full_content) — no beta. Bumps parallel-web 0.4.2 -> 0.6.0. - Attribution: on the free path only, results carry provider/attribution and the CLI tool line reads "Parallel search" / "Parallel fetch"; the paid path is unbranded. - Selection/registration: web tools register unconditionally (free MCP backstop) while check_web_api_key remains a real usability probe; explicit per-capability backends are honored (so misconfig surfaces) rather than masked by the fallback. Tested: live web_search/web_extract against search.parallel.ai in keyless and keyed modes; unit suites for the MCP client, backend selection, and display labeling; full agent run shows the "Parallel search" label on the free path.	2026-06-10 19:54:38 -07:00
kshitijk4poor	4642762289	fix(langfuse): redact base64 data URIs instead of truncating into invalid base64 The Langfuse SDK treats `data:;base64,...` strings as media and tries to decode them. `_truncate_text` was slicing those strings mid-payload, producing invalid base64 and noisy "Error parsing base64 data URI" logs. Observability only needs the metadata, not raw image/audio bytes, so redact the whole data URI (type, media_type, length) before it reaches the SDK. Salvaged the Langfuse fix from #39682 onto current main as a standalone, single-concern change (the dashboard `dist/*` and plugin-discovery parts of that PR already landed separately on main). Co-authored-by: foras910521-lab <foras910521-lab@users.noreply.github.com>	2026-06-10 10:49:36 +05:30
Philip D'Souza	92dfd70d6a	fix(photon): production hardening for the gRPC-native iMessage channel (#42732 ) * fix(photon): override transitive CVEs in the sidecar deps `npm audit` flagged 7 high-severity transitive CVEs (protobufjs code injection GHSA-66ff-xgx4-vchm + outdated @opentelemetry OTLP exporters) pulled in via spectrum-ts -> @photon-ai/otel. npm's suggested fix downgrades spectrum-ts to a version that targets the decommissioned spectrum host, so instead pin patched versions via `overrides` (protobufjs 8.6.1, @opentelemetry/* 0.218.0) without touching spectrum-ts. `npm audit` -> 0; spectrum-ts + provider still import. * fix(photon): harden the sidecar bridge + bound the dedup cache - constant-time sidecar control-token comparison (was `!==`, timing-attackable). - cap the control-channel request body (2 MiB) so a compromised local peer can't OOM the sidecar. - wrap the inbound gRPC stream consumer in a re-subscribe loop with capped exponential backoff + jitter — if the async iterator throws/ends it would otherwise stop inbound forever (the adapter dedupes any replay). - add an unhandledRejection handler so a stray rejection logs instead of killing the process. - dedup cache (adapter) was a true bounded LRU only for expired entries; a burst of unique ids within the window grew it without limit. Evict oldest at the cap. * chore: add AUTHOR_MAP entry for PhilipAD --------- Co-authored-by: PhilipAD <philipadsouza@gmail.com>	2026-06-09 11:12:58 -04:00
kshitij	85852b71d8	fix(nemo-relay): preserve downstream errors in adaptive execution (#42691 ) Based on #42658 by @mnajafian-nv. Preserves the real downstream provider/tool exception when NeMo Relay's managed adaptive execution wraps a failing callback as an internal runtime error. Without this, the original exception (and its retry-classification signal, e.g. status_code) is lost behind Relay's wrapper. Salvage changes on top of the original PR: - Tolerant Relay-wrapper match: _is_relay_wrapped_callback_error now uses str.startswith on the "internal error: <cls>: <msg>" prefix instead of exact equality, so a future Relay version appending a traceback/suffix doesn't silently defeat the unwrap. On a total format change it returns False and falls back to the pre-fix behavior (surfacing Relay's error) rather than masking it. - Deduplicated the LLM and tool execute paths into a shared _run_managed_with_downstream_preservation helper, removing ~20 lines of copy-pasted nonlocal/try-except scaffolding that could drift out of sync. - Added a real-middleware regression guard (test_nemo_relay_downstream_unwrap_matches_real_middleware_wrapper_shape) that drives hermes_cli.middleware._run_execution_chain and asserts the plugin's _original_downstream_error unwraps the actual private _DownstreamExecutionError wrapper. The original synthetic tests modeled the wrapper with a local class, so a rename or shape change in core middleware would not have been caught; this test fails loudly if that contract drifts. Co-authored-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 02:31:10 -07:00
underthestars-zhy	dbf2470d46	feat(photon): Add voice message support to Photon adapter Extend the sidecar and Python adapter to handle `voice` content alongside `attachment`. Voice notes are inlined as base64 (same size-cap logic), surfaced as `MessageType.VOICE`, and include an optional `duration` field in fallback markers when bytes are unavailable.	2026-06-08 22:53:01 -07:00

1 2 3 4 5

212 commits