Commit graph

12895 commits

Author SHA1 Message Date
Teknium
d7021af30f
fix(learn): name distilled skills as author Hermes, not the host OS user (#52388)
/learn told the agent to fill the skill `author` field, and the system
prompt environment probe surfaces the OS login name (user=$(whoami) in
prompt_builder.py), so the model wrote the host username into published
SKILL.md frontmatter — a privacy leak the user never opted into, and
inconsistent run to run as the most-salient identity changed.

The /learn authoring prompt now sets `author` to the literal value
`Hermes` and explicitly forbids deriving it from the host environment
(OS/login user, git config, or any probeable identity). The skill names
itself as the tool that wrote it.

Closes #52368.
2026-06-25 12:48:08 -07:00
helix4u
4efec63a34 fix(tools): let session_search match session titles 2026-06-26 01:12:26 +05:30
rob-maron
2c02583c2b fix shape 2026-06-25 12:38:33 -07:00
rob-maron
525ee58b43 krea 2026-06-25 12:38:33 -07:00
kshitijk4poor
73c8d5a1e7 fix: use self._session_db directly + add regression test
- Replace getattr(self.session_store, '_db', None) with self._session_db
  (the GatewayRunner's own SessionDB, consistent with existing usage in
  slash_commands.py L240/L499).
- Remove verbose comment referencing a branch name as an issue number.
- Update stale comment in run.py that said 'today it has no session_db'.
- Add regression test verifying session_db is passed and rotated session
  is persisted (adapted from #51624 by @LeonSGP43).
- Add _session_db=None to _make_runner fixtures in test_compress_command,
  test_compress_focus, and test_compress_plugin_engine.
2026-06-26 00:50:40 +05:30
Omar B
1a38a8ff7d fix(gateway): pass session_db to compress temp agents so persistence works
Manual /compress and session hygiene auto-compress both create temporary
AIAgent instances to run compression. These agents were created without
a session_db, so compress_context computed the compressed messages in
memory, rotated the session ID, and reported success — but never wrote
to the database. The next user message reloaded the original full
transcript, making compression appear to do nothing.

Fix: pass session_db=self.session_store._db to both temp agents so the
session rotation is properly persisted. Also set _end_session_on_close
on the /compress temp agent (already done in hygiene path) to prevent
cleanup from ending the newly rotated session.
2026-06-26 00:50:40 +05:30
brooklyn!
edf35918be
Merge pull request #52620 from NousResearch/bb/desktop-session-switch-perf 2026-06-25 14:19:59 -05:00
Brooklyn Nicholson
e8561d61e6 test(tui_gateway): pin synchronous-build resume tests to eager_build
These three assert the eager build contract — stored runtime overrides /
profile db reach _make_agent synchronously, and the agent binds to the
compression tip. Under deferred-by-default the build runs off-thread, so
they raced the timer (green in CI, flaky locally). Pin them to
eager_build; deferred coverage lives in the protocol tests.
2026-06-25 14:13:07 -05:00
David Metcalfe
da73223f4a fix(desktop): show statusbar item tooltips on hover
Statusbar items declared a 'title' string (e.g. YOLO, gateway health,
agents, cron, version, context usage) that was populated by
use-statusbar-items.tsx but never forwarded to the rendered DOM in
StatusbarControls — so every statusbar button/menu/text/link had no
hover hint.

Wrap the four render branches (menu trigger, text, link, action) in
the existing 'Tip' component from components/ui/tooltip.tsx. Tip is
self-contained (carries its own Provider), instant (delayDuration=0),
themed (bg-foreground/text-background, auto-inverts per theme), and
already in use elsewhere in the desktop shell. Renders the child
untouched when label is falsy, so items without a title stay
zero-cost.
2026-06-25 12:11:17 -07:00
Brooklyn Nicholson
1ca1f9f2c7 refactor(tui_gateway): DRY the deferred-session paths
Collapse the duplicated cold-resume / lazy-watch / create scaffolding into
shared helpers: _deferred_session_record (the live-session dict minus the
agent), _lazy_resume_info (the not-yet-built session.info), _claim_or_reuse_live
(lock + double-checked register-or-reuse), and _schedule_agent_build (the
pre-warm timer). Net -12 lines, three copies of the ~30-key session dict and
the lazy-info block down to one each. No behavior change.
2026-06-25 14:03:03 -05:00
Brooklyn Nicholson
3bf00e459a perf(desktop): make deferred resume the default, not an opt-in flag
Per review: gating the faster path behind a `defer_build` flag that the
only caller always sends is pointless. Flip it — `session.resume` now
defers the agent build by default for every caller (desktop + Ink TUI);
a caller that needs the agent built synchronously passes `eager_build:
true` (used by the build-race test). The desktop no longer sends a flag.

While verifying the flip, fixed two real parity gaps the deferred path
had vs the old eager (`_init_session`) path:

- `_enable_gateway_prompts()` was never called on a deferred resume, so
  approvals/clarify wouldn't route through the gateway prompt callbacks.
- `_start_agent_build` never wired `background_review_callback` /
  `memory_notifications`, so a deferred-built session's self-improvement
  "💾 …" summary leaked to stdout instead of rendering in-transcript.
  Wiring it there also fixes it for `session.create` sessions, which
  build through the same path.

ACP is unaffected (it uses its own session_manager, not this RPC); the
Ink TUI already consumes the same lazy `info` shape from session.create
and upgrades on the later `session.info` event.
2026-06-25 14:03:03 -05:00
Brooklyn Nicholson
c4c590e4a1 perf(desktop): make session switching fast under load
Switching sessions in the desktop app could freeze the whole UI for
several seconds on heavy, tool-rich chats. Root causes and fixes:

- Cold `session.resume` built the AIAgent (MCP discovery, prompt/skill
  build) *before* returning, and the desktop awaits that RPC before it
  paints — so the entire switch blocked on the build. Add an opt-in
  `defer_build` resume path (the contract `session.create` already uses):
  return the full display transcript immediately, register an upgradable
  live session, and pre-warm the agent on a short timer. The persisted
  runtime identity (model/provider/base_url/api_mode/reasoning/tier) is
  restored on the deferred build so it can't drop the provider.

- Nothing bounded how many in-memory agents accumulate; a user who
  reconnects often piled up detached sessions for the full 6h TTL. Add a
  soft LRU cap (`max_live_sessions`, default 16) that evicts the
  least-recently-active DETACHED sessions (no live client) — never a
  running, awaiting-input, mid-build, or live-transport one. Reopening
  re-resumes from disk.

- On the prefetch-hit cold-resume path, skip rebuilding a throwaway
  merged-message array (and its 1000-entry Map) when the prefetch already
  painted the exact transcript; the downstream sameMessageList guard
  already drops the publish, so it was pure main-thread cost.

The desktop opts into `defer_build` for every non-watch cold resume; the
eager path stays for CLI/TUI and existing callers.
2026-06-25 14:03:03 -05:00
kshitij
5de8a8fbe8
Merge pull request #52375 from NousResearch/salvage/47237-dedupe-user-turns
fix(gateway): dedupe user turns on transient failure (#47237)
2026-06-26 00:30:59 +05:30
davidgut1982
6208d6b3be fix(gateway): dedupe user turns on transient failure (#47237)
When the gateway persists a user message after a transient provider
failure (429/timeout/auth error), subsequent retries of the same
Telegram message could stack duplicate user turns in the transcript,
causing the agent to fall behind by 1-2 messages.

Add has_platform_message_id() to SessionDB (using the existing
idx_messages_platform_msg_id partial index) and a SessionStore wrapper.
The gateway's transient-failure path checks this before
append_to_transcript -- if the platform_message_id is already
persisted, the duplicate write is skipped.

Salvaged from #47869 by @davidgut1982. Adapted to current main which
has additional append sites and an existing content-based dedupe in
the exception handler path.

Closes #47237
2026-06-26 00:11:17 +05:30
kshitij
d682f320b3
Merge pull request #52147 from NousResearch/salvage/29184-mcp-osv-nonblocking
fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184)
2026-06-25 23:39:44 +05:30
kshitij
c210e23a02
Merge pull request #52386 from NousResearch/salvage/31999-yaml-indent
fix(utils): unify YAML list indent across all config writers (#31999)
2026-06-25 23:39:37 +05:30
qdaszx
6305ac0e4b fix(mcp): run OSV malware preflight off the event loop with a bounded timeout (#29184)
During stdio MCP server startup, _run_stdio (an async method) called the
synchronous check_package_for_malware() inline. That makes a blocking
urllib HTTPS POST to api.osv.dev whose own timeout doesn't reliably cover a
stalled SSL handshake, so an intermittent network issue froze the entire
asyncio event loop for up to ~120s — blowing past the TUI/gateway's 15s
startup budget and showing "gateway startup timeout".

Run the check via asyncio.to_thread (off the loop) AND bound it with
asyncio.wait_for(timeout=_OSV_MALWARE_CHECK_TIMEOUT_S=12s). The malware check
is fail-open, so on timeout we log and proceed rather than blocking startup.

Salvaged from #29190 by @qdaszx (re-applied on current main — the call site
moved since the PR was opened), combining the to_thread approach also proposed
in #29192 by @ygd58. Two load-bearing tests: event-loop-not-blocked-during-
check and timeout-fails-open — both mutation-verified to fail against the old
inline blocking call.

Closes #29184.

Co-authored-by: ygd58 <buraysandro9@gmail.com>
2026-06-25 23:30:41 +05:30
xxxigm
0aea0c3654 fix(utils): unify YAML list indent across all config writers (#31999)
atomic_yaml_write used default yaml.dump which emits indentless
sequences (list items at column 0), while atomic_roundtrip_yaml_update
(ruamel.yaml) emits 2-space-indented sequences. Cross-path writes to
the same config.yaml toggled indentation on every save, eventually
producing a mixed-indent file that js-yaml rejects with 'bad indentation
of a mapping entry', silently dropping custom_providers and breaking
model switching.

Add IndentDumper SafeDumper subclass that forces indentless=False,
route atomic_yaml_write through it. Route tui_gateway._save_cfg and
the Telegram adapter's config writer through atomic_yaml_write so all
paths emit the same 2-indent layout.

Salvaged from #32034 by @xxxigm. Adapted to current main which already
has allow_unicode=True (from #51356) but was missing IndentDumper.

Closes #31999
2026-06-25 23:27:44 +05:30
brooklyn!
a53fc78c02
Merge pull request #52594 from NousResearch/bb/queue-resubmit-on-busy
fix(tui_gateway): queue mid-turn prompts instead of dropping them on a busy retry
2026-06-25 12:50:18 -05:00
kshitijk4poor
15ee2d6f04 refactor: lightweight sudo count + drop chatty multi-sudo tip
Replace _count_real_sudo_invocations (which called
_rewrite_real_sudo_invocations and discarded the rewritten string) with
a lightweight token scan that reuses the same tokeniser but skips string
building. Remove the agent-facing tip about nested sudo in heredocs —
the cache-cleared warning is enough.
2026-06-25 23:08:48 +05:30
xxxigm
d93abd75d1 test(terminal): cover sudo cache invalidation and multi-invocation piping 2026-06-25 23:08:48 +05:30
xxxigm
8278d82e17 fix(terminal): improve sudo -S password delivery and cache invalidation
Pipe one password line per sudo invocation in compound commands so a correct
password is not rejected on the second `sudo` in `sudo a && sudo b`. Drop the
session cache when sudo returns Authentication failed, surface sudo_auth_failed
in the tool result, and add hints for interactive sessions.
2026-06-25 23:08:48 +05:30
brooklyn!
931a5e92cc
Merge pull request #52592 from NousResearch/bb/close-interrupt-tool-seq-sibling-paths
fix(agent): close tool-call sequence on all interrupt aborts (#48879 follow-up)
2026-06-25 12:31:27 -05:00
Brooklyn Nicholson
70319626a9 fix(tui_gateway): queue mid-turn prompts instead of dropping them on a busy retry
A prompt sent while a turn was in flight got rejected with 4009 "session busy",
which pushed clients (the desktop app) into a deadline-bounded busy-retry. When
turn teardown outlived that deadline — e.g. the user hits stop while a slow,
non-interruptible tool (web_search, read_file, an MCP call) is mid-flight, since
the sequential executor only checks the interrupt flag between tools — the
resubmitted message was silently dropped: "it just doesn't listen".

Wire the previously-dead display.busy_input_mode config into prompt.submit:
instead of rejecting, apply the policy and queue the message to run as the next
turn (drained in run()'s tail, ahead of goal/notification follow-ups). Modes:
interrupt (default) interrupts the live turn so it winds down promptly then runs
the queued message; queue runs it after the current turn finishes; steer injects
it into the live turn when accepted, else queues. The queued slot pins the
sender's transport and losslessly merges a second arrival. No client deadline,
no dropped sends.
2026-06-25 12:29:49 -05:00
Brooklyn Nicholson
2d286a6d00 fix(agent): close tool-call sequence on all interrupt aborts, not just finalize_turn
#48879 closed the tool-call sequence on interrupt inside finalize_turn so a
/stop after a tool no longer persists a `tool` tail that the next user message
turns into a `tool -> user` role-alternation violation (which strict providers
like Gemini/Claude react to by hallucinating a continuation and ignoring prior
context — what users see as "lost context after stop").

But the retry-wait, error-handling, and post-error retry-wait interrupt aborts
in conversation_loop return early and never reach finalize_turn, so they still
persisted and returned a raw `tool` tail. Interrupting during provider
backoff/rate-limiting (common under heavy work) hit exactly this path.

Extract the close into a shared close_interrupted_tool_sequence helper and apply
it at every interrupt abort (finalize_turn + the three early returns) so the
whole bug class is fixed, not just the one site.
2026-06-25 12:24:34 -05:00
brooklyn!
88e01d92e6
Merge pull request #52591 from NousResearch/bb/desktop-update-adhoc-sign
fix(desktop): ad-hoc sign macOS self-update rebuilds
2026-06-25 12:23:59 -05:00
Brooklyn Nicholson
1d9ed7f48a fix(desktop): ad-hoc sign macOS self-update rebuilds
The desktop self-updater rebuilds and re-signs the .app on each user's own
machine (`hermes desktop --build-only` -> electron-builder `--dir`). With
CSC_IDENTITY_AUTO_DISCOVERY on (its default), electron-builder signs the
type=distribution, hardened-runtime bundle with whatever identity is in that
user's keychain -- typically a personal "Apple Development" cert -- which
stalls/fails the sign step (no Developer ID, no provisioning profile) or
clobbers the original notarized signature with an unusable one, tripping
Gatekeeper on every post-update launch.

Force ad-hoc signing for the local packaged rebuild instead: deterministic,
and exactly what _desktop_macos_relaunchable_fixup already finishes off.
No-op for source runs, off-macOS, when a real identity is configured
(CSC_LINK / APPLE_SIGNING_IDENTITY), or when the caller already pinned the flag.
2026-06-25 12:08:29 -05:00
ethernet
a6a28ce3e2 fix(ci): run CI on all PRs to anywhere
fixes stacked PRs no-checks bug where
main < a < b
a merges into main
b is retargeted to main

but b doesn't run checks since it's not considered a new pr to main

now b will simply already have passing ci :)
2026-06-25 09:15:20 -07:00
Ben Barclay
d6269da7fd
fix(gateway): harden scale-to-zero dormancy guards (#52359)
Some checks are pending
CI / detect (push) Waiting to run
CI / tests (push) Blocked by required conditions
CI / lint (push) Blocked by required conditions
CI / typecheck (push) Blocked by required conditions
CI / docs-site (push) Blocked by required conditions
CI / history-check (push) Blocked by required conditions
CI / contributor-check (push) Blocked by required conditions
CI / uv-lockfile (push) Blocked by required conditions
CI / docker-lint (push) Blocked by required conditions
CI / supply-chain (push) Blocked by required conditions
CI / osv-scanner (push) Blocked by required conditions
CI / All required checks pass (push) Blocked by required conditions
Deploy Site / deploy-vercel (push) Waiting to run
Deploy Site / deploy-docs (push) Waiting to run
Docker Build and Publish / build-amd64 (push) Waiting to run
Docker Build and Publish / build-arm64 (push) Waiting to run
Docker Build and Publish / merge (push) Blocked by required conditions
Block scale-to-zero suspend while background async delegations are active, and restore runtime status to running on real inbound after a dormant wake.\n\nAdd regression coverage for both review findings.
2026-06-25 20:41:03 +10:00
Teknium
e62afaca62
fix(learn): teach /learn the full CONTRIBUTING.md skill standards (#52372)
The /learn authoring prompt taught a subset of the HARDLINE skill rules,
and stated the <=60-char description rule without making the model enforce
it — so generated descriptions overshot (up to 202 chars), which the
60-char system-prompt skill index then silently truncates.

- description: add the index-truncation rationale, a count-and-trim
  self-check, and a good/bad length example so the model actually hits <=60.
- add platforms-gating rule (OS-bound primitives -> declare platforms:).
- add author-credits-human-first rule.
- round out the Hermes-tool framing with the full wrapped-tool mapping and
  references/templates layout.

Closes #52367.
2026-06-25 00:17:23 -07:00
Teknium
60a2feeebf chore: add benbenlijie to AUTHOR_MAP for PR #47205 salvage 2026-06-25 00:17:17 -07:00
benbenwyb
6f2b2a1f34 fix: handle named custom providers and Z.AI overload retries 2026-06-25 00:17:17 -07:00
Ben Barclay
736e981abf
fix(auth): honor NOUS_INFERENCE_BASE_URL env override for Nous OAuth sessions (#52270)
The host-allowlist hardening (#30611) plus the refresh heal (#49735) left
the documented NOUS_INFERENCE_BASE_URL dev/staging escape hatch unreachable
for OAuth sessions, despite three code comments asserting it still works.

Root cause — resolution precedence in resolve_nous_runtime_credentials:

    inference_base_url = (
        _optional_base_url(state.get("inference_base_url"))  # stored — wins
        or os.getenv("NOUS_INFERENCE_BASE_URL")              # env — unreachable
        or DEFAULT_NOUS_INFERENCE_URL
    )

A staging OAuth login persists its inference_base_url, but the allowlist
rejects the staging host and the refresh heal rewrites the stored value to
the production default. The stored (now prod) value is then read BEFORE the
env var, so the override never takes effect — every request 401s against
prod or is pinned to prod, and setting the env var does nothing.

Fix: the user-set env override is the most-trusted source, so consult it
FIRST for the URL used to build the client / returned to callers — while
keeping the PERSISTED value the validated, network-provenance one (the
override is a runtime overlay, never written to auth.json, so unsetting it
cleanly reverts to prod). Applied at both chokepoints:

- resolve_nous_runtime_credentials (no-refresh read path AND refresh path)
- the nous_portal proxy adapter, which re-validates the resolver's returned
  base_url against the prod allowlist as defense-in-depth and would
  otherwise reject a legitimate staging override at the forward boundary.

New _nous_inference_env_override() / split of stored-vs-effective URL keep
the threat model intact: Portal-returned URLs are still allowlist-validated
at every network site, and the env path stays ungated (trusted OS user).

Also folds in the no-refresh read-path heal (supersedes the approach in
the open #50265): a poisoned stored staging host now heals to the prod
default on read even when no refresh fires.

Tests: TestEnvOverrideWins (env wins on read + refresh paths; override never
persisted; poisoned stored heals) and TestProxyAdapterEnvOverride. Verified
the 4 behavioral tests fail against pre-fix code and pass with the fix; full
inference-validation + nous-provider suites green (85 passed). E2E-validated
against a real temp HERMES_HOME exercising the real resolver + proxy adapter:
resolver→staging, persisted→prod, proxy→staging, unset→reverts to prod.
2026-06-25 00:11:15 -07:00
kshitijk4poor
d6cf383d74 refactor(setup): simplify Z.AI picker — drop dead fallback, fix tests
- Remove dead `chosen_base or effective_base` fallback; _select_zai_endpoint
  always returns a non-empty base URL (returns current_base on cancel).
- Add .rstrip("/") to official-endpoint return for symmetry with custom-proxy
  path (both now return normalized URLs).
- Replace magic index 4 with len(ZAI_ENDPOINTS) in custom-proxy tests so they
  don't break if a 5th endpoint is added to ZAI_ENDPOINTS.
2026-06-25 12:07:01 +05:30
kshitijk4poor
d0df264213 test(setup): add ZAI endpoint picker tests, move base-URL tests to MiniMax
Z.AI now uses a curses picker instead of plain text input for base URL,
so the existing TestBaseUrlValidation tests (which used zai as their test
subject) are migrated to MiniMax, which still uses the text input path.

Add TestZaiEndpointPicker covering:
- Selecting each official endpoint (Global, China, Coding Plan Global,
  Coding Plan China) saves the correct base URL to config
- Custom proxy URL entry (valid + invalid rejection)
- Cancel keeps the existing base URL
- Current endpoint is the default choice in the picker
- Non-standard URL defaults to the Custom proxy option
2026-06-25 12:07:01 +05:30
kshitijk4poor
f3372d3407 feat(setup): wire Z.AI endpoint picker into _model_flow_api_key_provider
When provider_id == 'zai', replace the plain text Base URL input with
_select_zai_endpoint, which presents a curses picker offering Global,
China, Coding Plan Global, Coding Plan China, and custom proxy options.
Other API-key providers (MiniMax, DeepSeek, etc.) keep the text input.
2026-06-25 12:07:01 +05:30
kshitijk4poor
d0f9c4bcc6 feat(setup): add _select_zai_endpoint helper for Z.AI endpoint picker
Presents a curses-based picker (via _prompt_provider_choice) offering the
four official Z.AI endpoints — Global, China, Coding Plan Global, Coding
Plan China — plus a custom-proxy option. Sourced from ZAI_ENDPOINTS in
auth.py so it stays in sync with the probe list.

Not yet wired into the setup flow; that comes in the next commit.
2026-06-25 12:07:01 +05:30
brooklyn!
818f03cdd8
Merge pull request #52366 from NousResearch/bb/pet-gen-variant-remix
feat(pets): remix a draft into a fresh round
2026-06-25 01:12:47 -05:00
Brooklyn Nicholson
6b3ea2cea6 refactor(pets): tighten remix comments and confirm handler 2026-06-25 01:10:56 -05:00
Brooklyn Nicholson
5196575d40 feat(pets): remix a draft into a fresh round
Add a hover/focus "Remix" action on each completed draft card in the
generation grid. It re-runs generation with the chosen draft fed back in
as the reference image, keeping the same prompt and staying on step 2 so
the user can explore variations without starting over.

Because regenerating is slow and replaces the current drafts, the first
remix shows a one-time confirmation; the acknowledgement is persisted so
subsequent remixes fire immediately.
2026-06-25 01:09:19 -05:00
brooklyn!
4362c1a3af
Merge pull request #52326 from NousResearch/bb/shared-tool-labels
fix(ui): share compact tool previews across clients
2026-06-25 00:53:11 -05:00
Brooklyn Nicholson
f3d6d9bbd3 fix(ui): share compact tool previews across clients
Move terminal/execute_code/read_file preview compaction into agent.display so CLI, gateway, and Ink TUI all inherit the same labels that desktop introduced in #52321.

The shared preview keeps raw args intact while trimming display-only shell plumbing (`cd`, pipe tails, banner/status echoes) and read_file line ranges. Desktop now prefers backend `context` for live rows and keeps its TypeScript fallback only for hydrated history.
2026-06-25 00:47:14 -05:00
brooklyn!
3af22c0ed5
Merge pull request #52338 from NousResearch/bb/pets-gen-timeouts
fix(pets): raise generation timeouts for the slow quality-first model path
2026-06-25 00:45:43 -05:00
Brooklyn Nicholson
a5849917a8 test(pets): make slow pet generation suite opt-in
The pet generation image-processing suite is deterministic but expensive enough
to blow the per-file CI timeout on Linux (140s), and it is not relevant to the
fast timeout PR's normal signal. Keep it available for manual validation, but do
not run it by default.

Set HERMES_RUN_SLOW_PET_TESTS=1 to enable the suite. The canonical test wrapper
now preserves that opt-in variable through its hermetic env.
2026-06-25 00:44:53 -05:00
Brooklyn Nicholson
25c31cab62 fix(pets): soften step-1 ETA copy to "several minutes"
The fixed "up to 5 minutes" wording undersells the slow quality-first path
(OpenAI image via OpenRouter), where a full hatch can run far longer. Use an
open-ended "several minutes" instead so the banner stays honest across the
fast and slow providers.
2026-06-25 00:35:54 -05:00
Brooklyn Nicholson
7078d9d1e2 fix(pets): raise generation timeouts for the slow quality-first model path
The quality-first default (OpenAI image via OpenRouter) is slow, and a full
hatch fans out ~8 rows with up to 3 retries each (300s/call) across 2 parallel
waves, so the absolute backend worst case is ~30 min. The old ceilings fired
mid-run:

- per-image HTTP call: 180s -> 300s (a single cold row can exceed 3 min)
- drafts RPC: 240s -> 420s (single wave, no retries — 7 min is ample)
- hatch RPC: 420s -> 1hr (sits above the ~30 min backend worst case)

The hatch ceiling is intentionally well above the realistic max so the frontend
never throws "request timed out" before the backend has exhausted its own
retries. The background-resumable notification path remains the real UX safety
net — the user can close the modal and get pinged on completion.
2026-06-25 00:34:52 -05:00
brooklyn!
a8e6a4f00b
Merge pull request #52321 from NousResearch/bb/desktop-cmd-label-summary
fix(desktop): compact tool row titles
2026-06-25 00:03:07 -05:00
Brooklyn Nicholson
41f302fa73 fix(desktop): compact tool row titles
Make completed desktop tool rows read like useful activity labels instead of raw plumbing: terminal rows use a dispatch-style shell summarizer for agent wrappers, and read_file rows keep the action plus filename and requested line range.

The shell cleanup follows condensed-milk-pi's shape: split command compounds on real separators, strip pipe tails inside each segment, clean redirects/env prefixes, then classify setup/banner/status segments. Multi-command probes render as `first command + N commands`; the full command remains available in copy/detail.

Read rows now render as `Read package.json` or `Read main.ts L25-34`, using requested positive offset/limit and returned line numbers only as fallback for negative/unknown offsets.
2026-06-25 00:01:11 -05:00
Teknium
7a65800fed
fix(cache): content-address prompt_cache_key so recurring cron jobs reuse the warm prefix (#52295)
Recurring cron jobs were prompt-cache-cold on every fire. session_id is
built as cron_<job_id>_<timestamp>, and the Codex/Responses transport used
session_id directly as prompt_cache_key — so the timestamp changed the cache
key on every run and the static prefix (agent identity + tool schemas) was
re-paid each tick.

Derive prompt_cache_key from a SHA-256 of the static prefix (instructions +
sorted tool schemas) instead. Repeated fires of the same job share one
content-addressed key (pck_<hash>) and reuse the warm prefix within the
provider's cache TTL. The key changes exactly when the prefix changes —
edit the job's prompt or toolset and it re-keys; leave it alone and it stays
stable.

session_id is left untouched for transcript isolation, log correlation, and
the Codex/xAI session-scope routing headers (session_id, x-client-request-id,
x-grok-conv-id) — those are the per-fire identity, not the cache key. Only the
prompt_cache_key body field (standard OpenAI/Codex path and the xAI extra_body
field) is content-addressed.

Closes #51395.

Co-authored-by: spiky02plateau <spiky02plateau@users.noreply.github.com>
Co-authored-by: JoaoMarcos44 <JoaoMarcos44@users.noreply.github.com>
2026-06-24 21:46:30 -07:00
Ben Barclay
72ae163250
fix(relay): authorize relay-delivered events by delivery, not source.platform (#52306)
* fix(relay): authorize relay-delivered events by delivery, not source.platform

The #52190 upstream-authz fix keyed _is_user_authorized off
source.platform via _adapter_authorization_is_upstream(source.platform).
But a relay *message* inbound carries the UNDERLYING platform
(source.platform == discord/telegram/...), NOT Platform.RELAY, because
ws_transport._event_from_wire maps the connector's wire payload
(platform="discord") straight onto SessionSource for session-keying and
egress. The relay adapter is registered only under Platform.RELAY, so
adapters.get(Platform.DISCORD) misses, the trusted-upstream branch is
skipped, and the user hits the env-allowlist default-deny:

    WARNING gateway.run: Unauthorized user: <id> (<name>) on discord

(Live staging bug: alpha tester linked successfully, then every
follow-up DM was silently dropped.)

Fix: the authentic trust signal is that the event was delivered over the
per-instance-authenticated relay WS, not which platform it underlies. Add
a wire-INVISIBLE SessionSource.delivered_via_upstream_relay flag, stamped
by the relay transport in _event_from_wire, and authorize on it. The flag
is excluded from to_dict/from_dict so a peer can neither forge it across
the wire nor have it restored from persistence. The existing adapter-flag
check is retained for events whose source.platform IS Platform.RELAY
(interaction-passthrough). A direct Discord event on a multiplexing
gateway (direct + relay adapters) is unmarked and still default-denies.

* fix(relay): use identity check on delivery marker to avoid MagicMock fail-open

A MagicMock() source (used by test_signal.py and other gateway tests) auto-
vivifies source.delivered_via_upstream_relay as a truthy Mock, which a bare
truthiness check would treat as authorized — flipping
test_signal_in_allowlist_maps from False to True. The marker is a real bool on
SessionSource, so check 'is True' explicitly: refuses to authorize any non-bool
stand-in, defensive against accidental fail-open.
2026-06-25 14:21:09 +10:00