Commit graph

212 commits

Author SHA1 Message Date
teknium1
a58287afcb
Merge remote-tracking branch 'origin/main' into pr48275-rebase
# Conflicts:
#	cron/scheduler.py
2026-06-19 07:40:29 -07:00
Ben
6c44471bfd fix(hindsight): lazy-install cloud client dependency 2026-06-19 07:36:28 -07:00
Hao Zhe
d7cd0bc086 fix(openviking): preserve structured sync attribution 2026-06-19 15:23:41 +08:00
Eurekaxun
c7b7f92ec1 fix(openviking): sync structured turns with tool parts 2026-06-19 15:23:41 +08:00
Ben
637aff46e7 Merge remote-tracking branch 'origin/main' into hermes/hermes-6fe26723 2026-06-19 15:17:13 +10:00
qin-ctx
2a5d51c16e fix(openviking): adapt memory provider for current api
(cherry picked from commit cbb87389f3)
2026-06-18 16:58:11 +08:00
kshitijk4poor
0787ea07c8 test(langfuse): pin exact surviving key in turn-isolation test
The prior assertion `all("turn1" in k or "turn2" in k for k in keys)` was
weak on two counts: it passes vacuously when keys is empty (a regression
that lost all state would slip through), and after turn 2 finalizes only
turn 1 lingers, so it only ever inspected turn 1 anyway. Replace it with an
exact check that one key survives, it is turn 1, and turn 2 never merged
into it — the real isolation invariant the test name claims.
2026-06-18 13:00:01 +05:30
kshitijk4poor
f4fbaa6cda fix(langfuse): bound _TRACE_STATE growth from non-finalizing turns
Scoping the trace key by turn_id (the prior commit) fixed cross-turn
collisions but introduced a slow leak: _finish_trace only pops a key when a
turn ends cleanly (final response has content and no tool calls), so any
turn that is interrupted, ends on a tool call, or has empty final content
now leaves its uniquely-keyed entry in _TRACE_STATE forever. Previously the
constant per-session key was overwritten by the next turn, capping growth at
~1 entry per session.

Add an LRU cap (_MAX_TRACE_STATE) enforced by _evict_stale_locked, called
under _STATE_LOCK immediately before each insert. It evicts the
least-recently-updated entries (using the previously-dead last_updated_at
field) and ends their root span so nothing dangles. Regression test drives
50 non-finalizing turns against a cap of 8 and asserts the dict stays bounded
with the most-recent turns surviving.
2026-06-18 12:59:41 +05:30
kshitijk4poor
e1d10ec1ed refactor(langfuse): extract _scope_prefix from _trace_key
The turn- and api-scoped branches each repeated the same
task/session/thread fallback ladder with only the infix differing. Extract
the shared prefix into _scope_prefix so a future scope dimension touches one
ladder instead of three. The legacy branch still returns a bare task_id (not
the task: prefix) for backward compatibility, so it stays separate.

Output key strings are unchanged; a new test pins them across every
task/session/turn/api combination since the keys are matched across hooks
and any drift would silently break trace finalization.
2026-06-18 12:58:24 +05:30
kshitijk4poor
b4356135f2 test(langfuse): add end-to-end turn-isolation regression
The PR added helper-level tests for _trace_key but nothing exercised the
keys through the real hooks. This adds TestTurnTraceIsolation, which drives
on_pre_llm_request / on_post_llm_call across two turns of one gateway
session (task_id == session_id, unique turn_id, api_call_count reset per
turn) and asserts each turn opens its own root trace when the first turn
fails to finalize (tool-only final step). This test fails on the pre-fix
code (only one trace opened, turn 2 absorbed into turn 1) and passes with
the scoping fix.

Also pins the turn_id-over-api_request_id key precedence: the turn-scoped
post_llm_call carries no api_request_id, so it must still resolve to the
same key as the request-scoped hooks or finalization breaks.
2026-06-18 12:38:44 +05:30
infinitycrew39
40ed67ccfe test(langfuse): cover turn/api trace-key scoping 2026-06-18 12:36:35 +05:30
Ben
e1e53bff9d Merge remote-tracking branch 'origin/main' into hermes/hermes-6fe26723 2026-06-18 16:18:33 +10:00
kshitijk4poor
1153b42b24 Merge upstream/main into OpenViking setup-UX (salvage #32445)
Resolves conflicts from the OpenViking churn that merged after #32445 was
opened (#48042/#47662 session-switch + write hardening, #47311/#47973):

- plugins/memory/openviking/__init__.py: keep both __init__ field groups
  (the PR's _runtime_start_* alongside main's _prefetch_threads/_shutting_down).
- tests/plugins/memory/test_openviking_provider.py: keep BOTH the PR's new
  setup-validation tests and main's session-switch/concurrency tests (disjoint
  additions to the same region).

Two fixes layered while reconciling (contributor work otherwise preserved):

- Restore the merged tenant-header contract (#22414/#21232). The PR had changed
  _VikingClient defaults to '' and made empty account/user OMIT the tenant
  headers; main's contract is that empty falls back to 'default' and the
  X-OpenViking-Account/User headers are ALWAYS sent (ROOT API keys need them).
  Reverted the constructor to 'account or os.environ.get(..., "default")' and
  updated the two PR tests that asserted the omit-when-empty behavior.

- Close a secret-file TOCTOU in the setup writers. _write_env_vars and
  _write_ovcli_config wrote the api_key/root_api_key file and chmod 0600
  AFTERWARD, leaving a world-readable window on newly-created files. Added
  _precreate_secret_file() to create with 0600 before any secret bytes land.
2026-06-18 11:28:51 +05:30
Ben
3fc7b624d8 feat(cron,gateway): NAS-JWT fire verifier + /api/cron/fire webhook (Chronos)
Phase 4E (E.1 + E.2). The inbound side of Chronos: NAS POSTs the agent when a
one-shot fires; the agent verifies a NAS-minted JWT and runs the job.

E.1 — plugins/cron/chronos/verify.py:
- verify_nas_fire_token(token, expected_audience, jwks_or_key, issuer): verifies
  signature against the NAS JWKS (RS/ES family; symmetric rejected), aud == this
  agent, exp/nbf, iss, and purpose == "cron_fire" (so a general agent JWT can't
  be replayed against the fire endpoint). Returns claims or None; never raises.
  Crypto delegated to PyJWT[crypto] (already a declared dep) — no hand-rolled
  JWT, no new dependency. No key configured → refuse (never unsigned-decode a
  security boundary).
- get_fire_verifier(): pluggable indirection so the DQ-4 escape hatch
  (direct per-job cron-key) can swap in with no handler change.

E.2 — gateway/platforms/api_server.py:
- POST /api/cron/fire (registered only when _CRON_AVAILABLE). Authenticated by
  the NAS-JWT via get_fire_verifier() — NOT API_SERVER_KEY (NAS holds no API
  key; this is the only inbound that triggers remote job execution, so it gets
  its own purpose-scoped check). Verifier args come from cron.chronos.* config.
  401 on bad/missing/forged token. 400 on missing job_id. On success: 202 +
  fire_due runs in the background (so a long agent turn never trips NAS's HTTP
  timeout); the store CAS claim inside fire_due de-dupes a scheduler retry.

Tests:
- test_chronos_verify (11): REAL RS256 signing — valid→claims, wrong-aud,
  missing/wrong purpose, expired, wrong-iss, tampered-signature (attacker key),
  no-key-refuse, empty-token, JWKS-URL key resolution, get_fire_verifier.
- test_cron_fire_webhook (5): valid→202+fire, invalid→401+no-fire, missing
  token→401, missing job_id→400, and fire path does NOT require API_SERVER_KEY.
api_server regression suites (214) green.

E.3 (NAS endpoints) is a separate cross-repo PR; the wire contract lands next
(docs/chronos-managed-cron-contract.md).
2026-06-18 14:46:33 +10:00
Ben
4c8bbe6416 feat(cron): Chronos NAS-mediated managed-cron provider (scale-to-zero)
Phase 4D. The first non-default CronScheduler: plugins/cron/chronos/. Inert
unless cron.provider=chronos; resolve_cron_scheduler falls back to the built-in
if unavailable, so cron never loses its trigger.

Files:
- chronos/__init__.py — ChronosCronScheduler + register(ctx).
  * is_available(): config-only, NO network (portal_url + callback_url + a
    stored Nous access token via get_provider_auth_state). Returns False →
    resolver falls back to built-in.
  * start(): reconcile() then RETURN — no blocking loop, no 60s wake (DQ-1:
    this is what makes scale-to-zero real; the machine wakes only on a
    NAS→agent fire).
  * _arm_one_shot(job): POST NAS provision {job_id, fire_at, agent_callback_url,
    dedup_key=job_id:fire_at}. Agent owns the time → sub-minute fires survive
    (no scheduler 1-minute floor).
  * reconcile(): converge NAS arms toward jobs.json — arm missing/changed-time,
    cancel orphaned, skip paused. Cold process rebuilds from jobs.json +
    idempotent dedup_key.
  * on_jobs_changed(): reconcile (re-arm/cancel the affected one-shot).
  * fire_due(): ABC default (CAS claim + run_one_job) THEN re-arm the next
    one-shot. Job gone (one-shot done / repeat-N exhausted) → no re-arm.
- chronos/_nas_client.py — thin HTTP wrapper for provision/cancel/list using
  the agent's existing refresh-aware Nous token (resolve_nous_access_token).
  Names no scheduler vendor; holds no scheduler creds.
- chronos/plugin.yaml — discovery metadata.

INVARIANT: zero "qstash"/"upstash" hits in plugins/cron, gateway, hermes_cli,
website/docs — the external scheduler is a NAS-internal detail, never named
agent-side.

Tests (13, all NAS mocked, zero network): is_available off-without-config +
on-with-config + makes-no-network; arm payload incl. sub-minute + noop without
next_run; reconcile arms-all / cancels-orphan / skips-paused / skips-already-
armed; fire_due re-arms next / no re-arm when job gone / no re-arm when claim
lost.
2026-06-18 14:40:56 +10:00
Austin Pickett
fd674af47f
fix(photon): preserve text in mixed iMessage attachments (salvage #46513) (#46818)
* fix(photon): preserve text in mixed iMessage attachments

When an iMessage bubble carried both text and an attachment, spectrum-ts'
inbound mapper returned only buildAttachmentMessage(...), dropping the user's
typed text before Hermes could see it. The Photon adapter then had no 'group'
content path, so the text was lost entirely.

- adapter.py: handle a new 'group' content type that flattens text + attachment
  items, preserving the typed text alongside cached media (extracted shared
  _normalize_binary_payload helper).
- sidecar: emit 'group' content in normalizeContent, and ship
  patch-spectrum-mixed-attachments.mjs which patches spectrum-ts' pinned mapper
  (at npm postinstall AND at sidecar startup, so existing installs self-heal).

Windows robustness fixes on top of the original PR:
- The patcher's CLI guard used 'import.meta.url === file://${argv[1]}', which
  never matches on Windows (file:/// + drive letter) — it silently no-opped.
  Switched to pathToFileURL(argv[1]).href.
- The patcher matched \n-joined strings, so a CRLF checkout (Windows git
  autocrlf) defeated every replacement. It now normalizes CRLF->LF for matching
  and restores the original EOL style on write.

Co-authored-by: Yuhang Lin <yuhanglin@YuhangdeMac-mini.local>

* chore: map YuhangLin contributor email for attribution (#46513)

---------

Co-authored-by: Yuhang Lin <yuhanglin@YuhangdeMac-mini.local>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
2026-06-17 16:14:24 -05:00
kshitijk4poor
c835448908 fix(openviking): don't block the command thread on session switch; lock turn state
Follow-up hardening on @ehz0ah / @harshitAgr's session-switch work (#28296):

- on_session_switch no longer runs the old-session writer-drain + pending-token
  GET + commit POST inline on the caller's command thread. /new, /branch,
  /resume, /undo call it synchronously, so a slow drain (up to 10s) or wedged
  commit blocked the user-facing command — the same hazard #41945 fixed for
  end-of-turn sync. State now rotates synchronously (cheap) and the old-session
  commit is offloaded to a daemon finalizer (generalized _finalize_session_async).
- Guard the (_session_id, _turn_count) pair with _session_state_lock: sync_turn
  runs on the memory-manager executor thread while the session hooks run on the
  command thread, so the snapshot+reset vs increment was a cross-thread race.
- _session_needs_commit checks the committed-session guard BEFORE the
  turn_count>0 shortcut, closing a double-commit window when a racing sync_turn
  re-increments after commit+reset.
- Add a _shutting_down flag so deferred finalizers stop POSTing against a
  torn-down client; track all prefetch threads in a set so invalidate/shutdown
  join every one, not just the latest slot.

Tests: regression for the non-blocking switch (asserts the caller returns while
a slow drain is parked off-thread) and the committed-guard ordering; updated the
deferred-commit test to the unified finalizer contract.
2026-06-18 00:21:21 +05:30
Hao Zhe
3ac6551ba3 fix(openviking): handle rewound session switches 2026-06-17 14:46:06 +08:00
Hao Zhe
00c045b43f fix(openviking): harden session writes and switch commits 2026-06-17 13:16:03 +08:00
Hao Zhe
f3b813c027 test(openviking): preserve content/write memory writes 2026-06-17 12:58:14 +08:00
harshitAgr
91e9459e10 fix(openviking): track writers per-session so commit waits for all
sync_turn's bounded join could drop a still-alive previous worker by
replacing the single _sync_thread slot. The dropped worker kept POSTing
under the old sid but was no longer visible to on_session_end /
on_session_switch, so the commit could fire while orphaned writes were
still in flight — those writes landed past the commit boundary and were
never extracted.

Replace the single _sync_thread slot with _inflight_writers:
Dict[sid, Set[Thread]]. Writers self-register on spawn (sync_turn,
on_memory_write) and self-deregister on exit. The commit path drains
_drain_writers(sid, 10.0) and skips the commit if any writer for that
sid is still alive after the bounded budget.

Also trim inline review-rationale comments to short invariants per
reviewer style ask: "commit only after session writes drain" and
"drop prefetch results from older switch generations."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit 7537ee6f5b)
2026-06-17 12:55:37 +08:00
harshitAgr
eddbf291a4 fix(openviking): close remaining session-boundary races on switch
Three follow-ups from review on #28296:

1. Sync worker outliving the bounded join. Each sync_turn POST has
   _TIMEOUT=30s and there are two per turn, but on_session_end and
   on_session_switch only join for 10s. If the worker is still alive
   after the join, committing the old session orphans the worker's
   late writes past the commit boundary — they land in an already-
   committed session and never get extracted. Both hooks now re-check
   is_alive() after the join and skip the commit when the worker
   hasn't drained.

2. on_memory_write late session_id capture. Same shape as the
   pre-fix sync_turn: f-string for the post path read self._session_id
   inside the worker, so a switch between thread spawn and post call
   landed the memory note in the new session. Snapshot sid at call
   time, same pattern as sync_turn.

3. Stale prefetch repopulating the new session. The pre-switch
   drain+clear only protects against workers that finish before the
   join completes; one finishing after the clear would write its
   result into the new generation's slot. Added a monotonic
   _prefetch_generation; workers capture it at spawn and refuse to
   write if it has advanced.

Tests: existing in-flight-sync test updated to drain (it tested the
join-before-commit happy path); four new tests cover hung-writer skip
on end + switch, on_memory_write sid capture, and prefetch generation
gating. 177/177 memory tests pass.

(cherry picked from commit 3791a87dbe)
2026-06-17 12:54:44 +08:00
harshitAgr
a30b40c73a fix(openviking): close session-boundary races on sync_turn and on_session_end
Two hardening fixes prompted by review on #28296:

1. sync_turn() now snapshots the target session id before spawning the
   worker. The previous code read self._session_id inside the worker, so
   a worker delayed past on_session_switch's bounded join could read the
   rotated-in NEW id and write the OLD turn's messages into the wrong
   session.

2. on_session_end() resets _turn_count to 0 after a successful commit,
   making the old-session commit path idempotent with the new switch
   hook. /new and compression call commit_memory_session() (which fires
   on_session_end) immediately before on_session_switch; without this,
   the old session would be committed twice. On commit failure we leave
   _turn_count > 0 so on_session_switch retries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(cherry picked from commit 2ea8d5c537)
2026-06-17 12:54:15 +08:00
harshitAgr
813a4e3838 fix(openviking): implement on_session_switch hook (#28296)
OpenVikingMemoryProvider only overrides on_session_end and inherits the
base-class no-op for on_session_switch. When the agent rotates session_id
(via /new, /branch, /reset, /resume, or context compression), the
provider's cached _session_id stays at the value initialize() captured.
All subsequent sync_turn writes then land in the already-closed old
session, and on_session_end tries to commit it a second time — the new
session never accumulates messages and never triggers memory extraction.

The fix mirrors the pattern Hindsight uses (#17508):

  1. Wait for any in-flight sync thread to drain under the OLD _session_id
     before we mutate it, otherwise the commit below races the last
     message write.
  2. Commit the old session if it accumulated turns — same extraction
     semantics as on_session_end. Skip if empty (nothing to extract).
  3. Drain in-flight prefetch from the old session and clear its cached
     result so the new session doesn't see stale recall.
  4. Rotate _session_id to the new value and reset _turn_count.

Commit failures are swallowed (logged at WARN) so a flaky server can't
strand the provider on the old session forever — same posture as the
existing on_session_end commit.

(cherry picked from commit a1e7185e8a)
2026-06-17 12:53:54 +08:00
Hao Zhe
166d2457b2 fix(memory): avoid setup autostart for unhealthy OpenViking 2026-06-17 01:32:43 +08:00
Hao Zhe
315fdae5f8 fix(memory): tighten OpenViking local autostart 2026-06-17 01:23:05 +08:00
Hao Zhe
2c2ca0443b feat(memory): improve OpenViking setup UX 2026-06-17 01:04:26 +08:00
Hao Zhe
3c76dac4fd fix(memory): log OpenViking chmod failures 2026-06-17 01:02:39 +08:00
Hao Zhe
2b972472ce fix(memory): validate OpenViking manual setup steps 2026-06-17 01:02:39 +08:00
Hao Zhe
94523764fc fix(memory): choose OpenViking key type before prompting 2026-06-17 01:02:39 +08:00
Hao Zhe
70f53f36cb feat(memory): add manual OpenViking setup path 2026-06-17 01:02:39 +08:00
Hao Zhe
b0e25c9cb2 fix(memory): restrict OpenViking setup file permissions 2026-06-17 01:02:39 +08:00
Hao Zhe
2dace37f6b feat(memory): improve OpenViking setup UX
Support linking, copying, and creating ovcli.conf during OpenViking memory setup.

Make setup cancellation write nothing and cover OpenViking/Hindsight picker cancellation paths.
2026-06-17 01:02:38 +08:00
underthestars-zhy
5b3fa26366 fix(photon): unify project identifiers and update documentation for Spectrum provisioning
Co-Authored-By: Marvin <marvin@photon.codes>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 05:25:56 -07:00
Teknium
49e743985a fix: route minimax m3 reasoning controls through profile
Follow up PR #46609's api.minimax.io reasoning report by moving the behavior out of the broad run_agent host gate and into the MiniMax provider profile. Only MiniMax-M3 on the documented OpenAI-compatible /v1 route gets reasoning_split/thinking/reasoning_effort; Anthropic-format MiniMax and non-M3 models keep their existing wire shapes.

Co-authored-by: goku94123 <gooku94123@gmail.com>
2026-06-15 07:08:43 -07:00
Teknium
a688d2a1bd test: assert disk cleanup prunes protected walks 2026-06-15 05:25:27 -07:00
墨綠BG
40699c3292 🐛 fix(disk-cleanup): avoid brittle sweep review issues 2026-06-15 05:25:27 -07:00
墨綠BG
c1a70a5439 🐛 fix(disk-cleanup): prune protected cleanup walks 2026-06-15 05:25:27 -07:00
Nicolò Boschi
a376ca0081 feat(hindsight): make observation scopes configurable on retain
Adds an observation_scopes config key (and HINDSIGHT_RETAIN_OBSERVATION_SCOPES
env var) so retained memories can opt into per_tag / all_combinations /
custom scoping instead of Hindsight's default combined pass.

Threaded through _build_retain_kwargs so all three retain paths honor it:
auto-retain and flush-on-switch already use aretain_batch; the tool retain
path is switched from aretain to aretain_batch (functionally equivalent,
aretain just wraps a single-item batch) since aretain doesn't accept the
observation_scopes parameter.
2026-06-15 04:59:17 -07:00
Teknium
f3fe99863d
revert(web): remove keyless Parallel search fallback (#46350)
Remove the free Parallel Search MCP path and restore the keyed Parallel backend behavior from before it was introduced.

Also drops the keyless fallback registration/display labeling tests and returns the Parallel SDK pin to the prior version.
2026-06-14 16:47:57 -07:00
Teknium
2681c5a12d
fix(photon): correct gateway start command (#45566) 2026-06-13 05:14:59 -07:00
underthestars-zhy
a652131c42 fix(photon): stop gateway restarts from orphaning the sidecar on its port
A hard gateway exit (crash, SIGKILL, supervisor restart) left the
detached Node sidecar running with a token the next gateway run doesn't
know, so it could never be told to /shutdown. Every replacement spawn
then died on EADDRINUSE, failing each 30→300s reconnect attempt while
the orphan kept consuming the inbound gRPC stream.

Two layers:
- Lifetime binding: the adapter now holds the sidecar's stdin as a
  pipe, and the sidecar (PHOTON_SIDECAR_WATCH_STDIN=1) shuts down on
  stdin EOF — fired by the OS on any parent death, including SIGKILL.
- Startup reaping: before spawning, the adapter probes the port and
  terminates a stale listener, but only after verifying its command
  line is a Photon sidecar; a foreign listener raises a clear error
  instead of being signalled.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 01:07:38 -07:00
underthestars-zhy
573c4e6511 feat(photon): upgrade to spectrum-ts 3.0.0 (pinned) with markdown + reactions
Pin spectrum-ts to exactly 3.0.0 (was ^1.18.0 plus an `npm install
spectrum-ts@latest` on every setup) so breaking SDK majors can't take
down fresh installs silently; `hermes photon setup` now runs `npm ci`.
Upgrade procedure documented in the README.

Migrate resolveSpace to the v3 namespace API: `im.space.create(phone)`
for DMs and `im.space.get(id)` for everything else — group spaces are
now rehydratable from their persisted id after a sidecar restart, which
v1 could not do.

Markdown: replies go out via the v3 `markdown()` builder (iMessage
renders natively; other Spectrum platforms degrade to plain text).
`PHOTON_MARKDOWN=false` reverts to the stripped plain-text path.

Reactions, behind PHOTON_REACTIONS (default off): lifecycle tapbacks
(👀 while processing, 👍/👎 on completion) via new sidecar /react and
/unreact endpoints with per-target reaction-handle tracking, and user
tapbacks on bot-sent messages routed to the agent as synthetic
`reaction:added:<emoji>` events.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 01:07:38 -07:00
Austin Pickett
c3464ecf45
fix(discord): recover from runtime gateway task exits (#44383)
* fix(discord): recover from runtime gateway task exits

Salvaged from #39416 (AMEOBIUS) — cherry-picked only the task-exit
recovery; the original PR was 1081 commits behind with 28 unrelated
commits.

A post-ready discord.py WebSocket crash left the gateway split-brained:
producers stayed active while Discord stopped responding. After this fix
the adapter calls _set_fatal_error(retryable=True) + _notify_fatal_error()
so the existing GatewayRunner reconnect watcher replaces the dead adapter.

Also adds _wait_for_ready_or_bot_exit() so startup failures (SOCKS/proxy
errors, invalid tokens) surface fast instead of burning the full ready
timeout. Because connect() no longer waits via asyncio.wait_for on that
path, test_connect_releases_token_lock_on_timeout is updated to trigger
the timeout through the new helper (same lock-release contract).

3 tests pass (2 new runtime-failure tests + the updated timeout test);
test_discord_connect.py and test_discord_slash_commands.py green.

Co-Authored-By: ameobius <ameobius@local.host>

* fix(test): patch _wait_for_ready_or_bot_exit in timeout cancel test

connect() no longer uses asyncio.wait_for for the ready handshake, so
test_connect_timeout_cancels_bot_task was hanging for 30s in CI.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: ameobius <ameobius@local.host>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-11 15:39:01 -04:00
Teknium
0a5762c78d fix(web): genericize free-MCP client identity per telemetry policy
Replace the hermes-identifying clientInfo/User-Agent/session-id prefix on
the keyless Parallel Search MCP path with a neutral 'mcp-web-client'
identity. Project policy forbids third-party usage attribution without an
explicit user opt-in (see telemetry PR policy); MCP requires a clientInfo,
so a generic one satisfies the spec without attributing traffic.

Also adds the contributor AUTHOR_MAP entry and refreshes uv.lock against
current main (parallel-web 0.6.0).
2026-06-10 19:54:38 -07:00
Matt Harris
e0e2571711 feat(web): Parallel-backed web search & extract — free Search MCP when keyless, v1 REST when keyed
Make Parallel the web search/extract backend with a zero-setup free tier:

- Keyless (no PARALLEL_API_KEY): web_search/web_extract work out of the box via
  Parallel's free hosted Search MCP (search.parallel.ai/mcp), and parallel
  becomes the default backend when no other web credentials are configured
  (ahead of ddgs, which is search-only). A small hand-rolled Streamable-HTTP
  JSON-RPC client speaks the MCP's web_search/web_fetch tools; the existing
  web_search/web_extract tools are the only tools registered.
- Keyed (PARALLEL_API_KEY set): uses the Parallel v1 REST endpoints
  (client.search / client.extract with advanced_settings.full_content) — no beta.
  Bumps parallel-web 0.4.2 -> 0.6.0.
- Attribution: on the free path only, results carry provider/attribution and the
  CLI tool line reads "Parallel search" / "Parallel fetch"; the paid path is
  unbranded.
- Selection/registration: web tools register unconditionally (free MCP backstop)
  while check_web_api_key remains a real usability probe; explicit per-capability
  backends are honored (so misconfig surfaces) rather than masked by the fallback.

Tested: live web_search/web_extract against search.parallel.ai in keyless and
keyed modes; unit suites for the MCP client, backend selection, and display
labeling; full agent run shows the "Parallel search" label on the free path.
2026-06-10 19:54:38 -07:00
kshitijk4poor
4642762289 fix(langfuse): redact base64 data URIs instead of truncating into invalid base64
The Langfuse SDK treats `data:*;base64,...` strings as media and tries to
decode them. `_truncate_text` was slicing those strings mid-payload, producing
invalid base64 and noisy "Error parsing base64 data URI" logs. Observability
only needs the metadata, not raw image/audio bytes, so redact the whole data
URI (type, media_type, length) before it reaches the SDK.

Salvaged the Langfuse fix from #39682 onto current main as a standalone,
single-concern change (the dashboard `dist/**` and plugin-discovery parts of
that PR already landed separately on main).

Co-authored-by: foras910521-lab <foras910521-lab@users.noreply.github.com>
2026-06-10 10:49:36 +05:30
Philip D'Souza
92dfd70d6a
fix(photon): production hardening for the gRPC-native iMessage channel (#42732)
* fix(photon): override transitive CVEs in the sidecar deps

`npm audit` flagged 7 high-severity transitive CVEs (protobufjs code injection
GHSA-66ff-xgx4-vchm + outdated @opentelemetry OTLP exporters) pulled in via
spectrum-ts -> @photon-ai/otel. npm's suggested fix downgrades spectrum-ts to a
version that targets the decommissioned spectrum host, so instead pin patched
versions via `overrides` (protobufjs 8.6.1, @opentelemetry/* 0.218.0) without
touching spectrum-ts. `npm audit` -> 0; spectrum-ts + provider still import.

* fix(photon): harden the sidecar bridge + bound the dedup cache

- constant-time sidecar control-token comparison (was `!==`, timing-attackable).
- cap the control-channel request body (2 MiB) so a compromised local peer can't
  OOM the sidecar.
- wrap the inbound gRPC stream consumer in a re-subscribe loop with capped
  exponential backoff + jitter — if the async iterator throws/ends it would
  otherwise stop inbound forever (the adapter dedupes any replay).
- add an unhandledRejection handler so a stray rejection logs instead of killing
  the process.
- dedup cache (adapter) was a true bounded LRU only for expired entries; a burst
  of unique ids within the window grew it without limit. Evict oldest at the cap.

* chore: add AUTHOR_MAP entry for PhilipAD

---------

Co-authored-by: PhilipAD <philipadsouza@gmail.com>
2026-06-09 11:12:58 -04:00
kshitij
85852b71d8
fix(nemo-relay): preserve downstream errors in adaptive execution (#42691)
Based on #42658 by @mnajafian-nv.

Preserves the real downstream provider/tool exception when NeMo Relay's
managed adaptive execution wraps a failing callback as an internal runtime
error. Without this, the original exception (and its retry-classification
signal, e.g. status_code) is lost behind Relay's wrapper.

Salvage changes on top of the original PR:

- Tolerant Relay-wrapper match: _is_relay_wrapped_callback_error now uses
  str.startswith on the "internal error: <cls>: <msg>" prefix instead of
  exact equality, so a future Relay version appending a traceback/suffix
  doesn't silently defeat the unwrap. On a total format change it returns
  False and falls back to the pre-fix behavior (surfacing Relay's error)
  rather than masking it.
- Deduplicated the LLM and tool execute paths into a shared
  _run_managed_with_downstream_preservation helper, removing ~20 lines of
  copy-pasted nonlocal/try-except scaffolding that could drift out of sync.
- Added a real-middleware regression guard
  (test_nemo_relay_downstream_unwrap_matches_real_middleware_wrapper_shape)
  that drives hermes_cli.middleware._run_execution_chain and asserts the
  plugin's _original_downstream_error unwraps the actual private
  _DownstreamExecutionError wrapper. The original synthetic tests modeled the
  wrapper with a local class, so a rename or shape change in core middleware
  would not have been caught; this test fails loudly if that contract drifts.

Co-authored-by: mnajafian-nv <mnajafian@nvidia.com>
2026-06-09 02:31:10 -07:00
underthestars-zhy
dbf2470d46 feat(photon): Add voice message support to Photon adapter
Extend the sidecar and Python adapter to handle `voice` content
alongside `attachment`. Voice notes are inlined as base64 (same
size-cap logic), surfaced as `MessageType.VOICE`, and include an
optional `duration` field in fallback markers when bytes are
unavailable.
2026-06-08 22:53:01 -07:00