The self-hosted OIDC dashboard provider was public-client + PKCE only, with
two `# TODO(confidential-client)` seams. Authentik and Keycloak commonly
default a new OIDC client to *confidential*, whose token endpoint rejects an
unauthenticated exchange (`invalid_client`) — so a self-hoster who accepts
their IDP's default could not complete dashboard login without manually
flipping the client to public.
Add optional confidential-client support:
- New optional `client_secret` (env `HERMES_DASHBOARD_OIDC_CLIENT_SECRET`,
or `dashboard.oauth.self_hosted.client_secret`; env-wins-config, empty
treated as unset). It is a credential, so docs steer operators to the
`.env` file; config.yaml is supported only for precedence symmetry.
- `_token_endpoint_auth()` selects `client_secret_basic` (HTTP Basic header)
vs `client_secret_post` (form body) from the IDP's advertised
`token_endpoint_auth_methods_supported`, defaulting to basic (the OIDC
default) when absent. Applied to complete_login, refresh_session, and
revoke_session (RFC 7009 §2.1).
- PKCE is sent in BOTH modes — the secret is client authentication layered
on top, never a replacement (OAuth 2.1 / RFC 9700 keep PKCE mandatory).
- Basic header url-encodes client_id/secret before base64 per RFC 6749
§2.3.1, so reserved chars (`:`, `@`, space) round-trip correctly.
Non-breaking: with no secret configured the provider is a pure public PKCE
client, byte-identical to prior behaviour (no Authorization header, no
client_secret in the body). The secret is never logged — register() reports
only a `confidential=<bool>` flag.
Tests: 16 new cases covering basic/post selection, default-when-absent,
public-unchanged contract, PKCE-preserved, reserved-char url-encoding,
blank-secret-is-public, refresh + revoke auth, no-secret-in-logs, and
env/config register wiring. Full dashboard-auth suite (nous provider,
middleware, gate, cookies, WS, 401-reauth, status endpoint) — 396 tests —
green, proving no existing auth path regressed.
The self-hosted OIDC dashboard login rejected any http:// redirect_uri
whose host was not localhost/127.0.0.1, surfacing "redirect_uri may only use http:// for localhost/127.0.0.1" before reaching the IDP. This broke self-hosted dashboards reached over plain HTTP (including LAN IPs, internal hostnames, and reverse proxies that terminate TLS upstream).
#38827 already dropped this check from the nous provider, but the generic self-hosted provider copied the old localhost-only
branch and reintroduced the bug for HERMES_DASHBOARD_OIDC_ISSUER setups.
The IDP's own allowlist is authoritative on which redirect_uris are
permitted; this client-side _validate_redirect_uri is only a fast-fail for
obvious operator error and should not second-guess valid http:// deployments.
Fix: drop the localhost-only branch on the http scheme. Validation now enforces only that the scheme is http(s) and the path ends with
/auth/callback. Updated the docstring to explain the relaxed contract,
and added test_allows_http_with_arbitrary_host covering an internal
hostname and a LAN IP alongside the existing localhost case.
The supermemory and mem0 memory providers shipped third-party SDKs
(supermemory / mem0ai) that are not core dependencies, but — unlike the
honcho and hindsight providers — they imported those SDKs directly with
no tools.lazy_deps.ensure() preflight and had no LAZY_DEPS allowlist
entry. On the published Docker image the agent venv is sealed
(HERMES_DISABLE_LAZY_INSTALLS=1) and lazy installs are redirected to a
writable durable target (HERMES_LAZY_INSTALL_TARGET). honcho/hindsight
route through ensure() and install fine there; supermemory/mem0 never
called it, so their SDK was never installed on a hosted instance and the
provider silently reported itself unavailable even with the API key set.
Fixes:
- Add memory.supermemory + memory.mem0 to the LAZY_DEPS allowlist
(tools/lazy_deps.py), pinned to current PyPI releases.
- Call ensure('memory.<x>', prompt=False) at each SDK-import chokepoint
(_SupermemoryClient.__init__; Mem0MemoryProvider._create_backend),
mirroring honcho's wrapped try/except shape.
- Drop the SDK-import gate from supermemory's is_available() — it was a
chicken-and-egg trap (provider never loaded on a sealed venv, so
ensure() never ran). Now key-presence only, like honcho/mem0.
- Add matching pyproject extras [supermemory]/[mem0]; update the
lazy-covered-extras contract test (excluded from [all] by policy).
Tests prove each path fails without the fix and the real sealed-venv
durable-target gate accepts both features.
HolographicMemoryProvider.shutdown() dropped its MemoryStore reference
without calling the existing MemoryStore.close(). Since the connection is
opened check_same_thread=False (one per session), its fd was released by
refcount/GC at a non-deterministic time on a non-deterministic thread,
churning a DB fd through the kernel free pool on every session teardown.
Call close() so the fd is released deterministically.
Reported by @alfranli123 (#44037), who pinpointed the exact code location.
Note: the report's TLS-fd-recycle corruption attribution could not be
reproduced from the code — dropping a sqlite connection flushes valid
SQLite pages via the VFS, never TLS framing, and the provider is at most a
releaser of DB fds, not a TLS-flushing socket owner. This change is correct
resource hygiene that removes per-session fd churn regardless.
Add post_setup() and get_status_config() to the Supermemory memory
provider so `hermes memory setup` and `hermes memory status` print a
one-line connection summary (container, profile fact count,
auto_recall/auto_capture). Point API-key onboarding at the Hermes
connect URL (app.supermemory.ai/integrations?connect=hermes).
Salvage of #52988. Two fixes folded in:
- Test isolation: the new probe/status tests mocked _SupermemoryClient
but not the __import__("supermemory") guard inside
_probe_supermemory_connection, so they passed only where the optional
supermemory package was installed and failed on a clean checkout / CI
(the PR shipped with red CI). Added _stub_supermemory_importable()
mirroring the existing test_is_available_false_when_import_missing
pattern; the suite now passes with supermemory absent.
- post_setup: `if api_key and api_key not in os.environ` checked whether
the key's *value* named an env var (always false in practice). Fixed to
compare the value: `os.environ.get("SUPERMEMORY_API_KEY") != api_key`.
Verified: 38/38 in test_supermemory_provider.py and the full
tests/plugins/memory/ suite green with supermemory not installed.
Closes#52988
Populate `reply_to_message_id`, `reply_to_text`, and
`reply_to_is_own_message` on reaction events so the gateway injects
`[Replying to your previous message: "..."]` when the agent receives
a tapback.
The sidecar now extracts a capped text preview from the hydrated
reaction target (plain text and mixed group messages; null for
attachment/voice-only targets), emitting it as `targetText` in the
NDJSON reaction payload. The Python adapter reads this field and sets
the reply correlation fields on the `MessageEvent`.
v8 made `richlink` outbound-only; inbound rich links now arrive as
plain `text`. Remove the `getBalloonBundleId`/`toRichlinkMessage`
branches from the iMessage mapper patch and update the fixture,
lockfile, and README accordingly.
Update the Photon platform plugin's Node.js sidecar from spectrum-ts
3.1.0 to 7.0.0, which splits the SDK into scoped `@spectrum-ts/*`
packages with `spectrum-ts` as the umbrella re-export.
- Bump exact pin in package.json/package-lock.json to 7.0.0
- Update mixed-attachments patch script to target the new
`@spectrum-ts/imessage/dist/index.js` path and tab-indented output
- Rewrite test fixture to match v7.x mapper shape (tab-indented,
`const ... = async` declarations, single-line builder calls) and
point at `@spectrum-ts/imessage/dist/index.js`
- Update README upgrade guide to document the v5 package split and
the postinstall patch validation step
- Update comments in cli.py and index.mjs to reference v5/v7 changes
The self-hosted OIDC provider fetched the discovery document with a bare
httpx.get(). httpx defaults to follow_redirects=False (unlike curl -L or
the requests library), so when an IDP answers GET
/.well-known/openid-configuration with a 3xx — Authentik canonicalises the
.well-known path, and any IDP behind a reverse proxy doing an http→https
upgrade redirects too — the bare redirect (empty body) tripped the
status != 200 guard and raised 'OIDC discovery returned 302', which
routes.py maps to the provider_unreachable audit event and a 503. The
browser surfaced 'Auth provider self-hosted unreachable'.
The user's smoking gun (curl -o writing zero bytes from inside the
container) is exactly a redirect with no body — the same wall the code hit.
Add follow_redirects=True to the discovery GET only. It's safe: the
issuer-pin check and _require_https_or_loopback still validate the resolved
document and every endpoint, so a redirect can't smuggle in a bad issuer or
a cleartext endpoint. The token/revocation POSTs deliberately keep the
no-follow default (they carry an auth code / refresh token and the endpoint
is already the canonical absolute URL).
Existing discovery tests mocked httpx.get with a canned 200 and never
exercised a real 3xx. Add a regression test that runs a real loopback
server returning a 302 on the .well-known path — fails without the fix
(ProviderError: discovery returned 302), passes with it.
* Return None instead of erroring on drain login failure
* Fix login on drain
* Remove login for drained endpoints flow and clean the code
* chore: drop unrelated credits changes from this PR
* Remove extra comments that were not really necessary
Task 2.0b: the concrete shared-bearer-secret auth provider, the FIRST consumer
of the generic token-auth capability (Task 2.0a). Implements decisions.md Q-A.
plugins/dashboard_auth/drain/ (bundled, discovered like dashboard_auth/basic):
- DrainSecretProvider: non-interactive provider, supports_token=True. Verifies
an inbound Authorization bearer token against a per-agent shared secret with
hmac.compare_digest (constant-time, no timing oracle) and, on a match,
vouches for the caller as the "drain-control" principal scoped to "drain".
The five interactive ABC methods raise NotImplementedError; verify_session
returns None (stacks harmlessly in the cookie-verify loop).
- assess_secret_strength(): fail-closed entropy gate. Rejects secrets shorter
than 43 url-safe-b64 chars (~256 bits), with < 16 distinct characters, or
below 128 bits Shannon entropy — so a weak/structured/repeated secret can
never be silently accepted. Enforced both at register() (friendly skip
reason) and in __init__ (raises — defence in depth).
- register(ctx): no-op + skip reason when HERMES_DASHBOARD_DRAIN_SECRET is
unset; rejects a weak secret fail-closed (drain endpoint stays gated). On a
strong secret, registers the provider AND opts /api/gateway/drain into the
generic token-auth seam via register_token_route().
Config: the secret is a CREDENTIAL → carried via HERMES_DASHBOARD_DRAIN_SECRET
(per-agent, provisioned by NAS at deploy). Behavioural knobs only
(dashboard.drain_auth.{scope,min_secret_chars}) live in config.yaml — added to
DEFAULT_CONFIG with the .env-is-for-secrets rationale documented inline.
Tests: tests/plugins/dashboard_auth/test_drain_provider.py — entropy gate
(strong pass; empty/short/repeated/few-distinct/custom-min reject), verify_token
(match → scoped principal, wrong/empty → None, custom scope), protocol
compliance, interactive-methods-raise, and register() (skip-no-secret,
fail-closed-weak-secret, strong-env-secret registers + route opt-in, config
scope + min_secret_chars). 21 new tests; drain + token-auth suites 44 passed.
Verified the plugin is discovered as dashboard_auth/drain alongside basic/nous.
Intentionally deferred:
- The begin/cancel-drain endpoint handler itself — Task 2.1.
- The dashboard→gateway control channel — Task 2.2.
Build status: dashboard-auth + drain-plugin suites green.
OpenRouter/Nous image gen now runs a quality-first model chain by default:
attempt the highest-fidelity OpenAI image model first, then fall back to
Gemini 3 Pro Image when it's access-gated/unavailable/times out. An explicit
OPENROUTER_IMAGE_MODEL / config model override pins one model with no fallback.
Atlas validation rejects malformed model output instead of shipping it: adds a
per-state collapse guard (a single sliver/fragment row no longer passes because
other rows are healthy), on top of the existing postage-stamp + multi-pose
checks.
Desktop: pet-gen native notifications are now "global" (not tied to a chat
session), so a background generation started from the command center fires an
OS notification when the user is away even with no active session. Adds a
neutral "This can take up to 5 minutes." banner on step 1, and lets the
provider picker auto-size.
Tests updated/added for the OpenRouter fallback chain, the collapse guard, and
the global notification path.
Reference-grounded image provider over the OpenRouter-compatible
chat-completions image protocol (Gemini Flash Image et al.). Nous Portal
proxies OpenRouter, so one provider serves both — giving pet generation a
reference-capable backend beyond OpenAI gpt-image.
Salvage of PR #48927 by @ehz0ah, which consolidates OpenViking recall
work from #41706 (@huangxun375-stack), #33260, #49975, and #32444.
Replaces stale background post-turn prefetch warming with synchronous
current-query recall. The old queue_prefetch warmed the PREVIOUS user
message while turn-start recall consumed the CURRENT one, so injected
context was always about the wrong topic.
Changes:
- prefetch() now does session-aware /api/v1/search/search with the
current query, falls back to /api/v1/search/find on failure
- Contract-safe payloads: limit, score_threshold, context_type,
session_id — no top_k, no search-body mode, no target_uri
- L2 content reads for items with level=2 or empty abstracts, capped
at full_read_limit (default 2)
- Local ranking (score + query-token overlap + leaf boost), dedup,
score threshold, and injected-char budget
- queue_prefetch() is now a no-op (background warming removed)
- Additive batched viking_read: uris param accepts up to 3 URIs
- Per-request timeout support on _VikingClient.get/post/delete
- Removes stale _prefetch_result/_prefetch_thread/_prefetch_generation
state and _invalidate_prefetch_state()
- Strengthened system_prompt_block guidance
Salvage follow-up fixes:
- Expose all 8 recall config knobs in get_config_schema() (PR #48927
had removed them; #41706 correctly exposed them). Env vars remain
as internal mechanism but are now visible in setup wizard.
- Lower default timeout 8s→4s, request_timeout 6s→3s, full_read_limit
3→2 to reduce per-turn blocking latency.
Co-authored-by: Hao Zhe <haozhe4547@gmail.com>
Co-authored-by: Eurekaxun <eurekaxun@163.com>
spectrum-ts routes stream telemetry through @photon-ai/otel's createLogger,
which sends severity>=ERROR to console.error and WARN/INFO to console.log.
The two lines the health monitor keys off land on different channels:
log.error("stream persistently failing") -> console.error (caught), but
log.warn("stream interrupted; reconnecting") -> console.log (was missed).
The original interception patched console.error only, so the recovering->
degraded escalation counter never saw the interrupt bursts that are the
primary silent-inbound symptom. Verified live against spectrum-ts 3.1.0 +
@photon-ai/otel: 3 real log.warn('stream interrupted') calls now escalate
to degraded -> process.exit(75) -> adapter reconnect.
Adds a shared classifyStreamLog() fed by both console.error and console.log,
plus a regression test asserting both channels are intercepted.
Map Hermes xhigh→max to unlock DeepSeek V4's 'Max thinking' tier
through Ollama Cloud's OpenAI-compatible /v1/chat/completions endpoint.
low/medium/high pass through unchanged; disabled/none suppress
reasoning entirely.
Empirically confirmed: reasoning_effort:max produces ~2.5× more
thinking tokens than high on deepseek-v4-pro:cloud (1576 vs 642).
Mirror built-in memory writes to external providers only after the native memory tool succeeds and is not staged for approval. Keep OpenViking's built-in memory mirroring add-only, since Hermes native memory entries do not yet have stable OpenViking file URIs for replace/remove.
Add a narrow viking_forget tool for exact user memory file deletion and document the current OpenViking write/delete behavior.
* fix: update to version 3 endpoints and adding update and delete tool
* chore: removing the test md file
* fix: prevent circuit breaker on client errors in Mem0 provider
* chore: add telemetry for platform version
* feat: add OSS mode support to Mem0 memory provider
* chore: bump mem0ai dependency to >=2.0.1 in memory plugin
* refactor: enhance dependency checks and embedder config in mem0 backend
* refactor: adjust fact storage message for OSS mode
* refactor: expand user paths, add collection recreation on dimension change for Qdrant
* fix(mem0): make MEM0_USER_ID override gateway-native ids and tag writes with channel
When MEM0_USER_ID was configured (env or mem0.json), the gateway-native id
from kwargs (Telegram numeric id, Discord snowflake, ...) still won, so the
same human ended up under different user_ids per channel and memories never
merged across CLI / Telegram / Slack / Discord. Mirrors openclaw's cfg.userId
pattern: configured override wins, gateway-native id is the fallback.
The legacy "hermes-user" placeholder default written by the setup wizard is
treated as unset to avoid silently bucketing every gateway user together.
Also tag every write with metadata.channel (cli/telegram/discord/...) so the
dashboard can offer per-channel filtered views without coupling identity to
the channel; document the read/write filter asymmetry as intentional
(reads scope to user_id only for cross-agent recall).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: improve Mem0 memory provider backend, pagination, config, and error handling
* refactor: update mem0 telemetry code, docs, and bump version
* fix(mem0): make get_config_schema() return unified schema with mode-aware required flag
Schema always includes api_key field so picker shows "API key / local" for
both modes. In OSS mode api_key.required=False so status won't mislead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: improve mem0 telemetry, add env var key and OSS mode detection
* chore: bump mem0ai lower bound to 2.0.4 (latest SDK release)
* refactor: set telemetry sample rate to 1.0 and update docs for opt‑out
* fix(mem0): resolve 15 correctness, thread-safety, and resource bugs
Thread safety:
- Protect circuit breaker counters with _breaker_lock (race between
prefetch/sync daemon threads and main thread)
- Wrap sync_turn thread creation in _sync_lock; skip if previous sync
is still alive after 5 s join to prevent duplicate memory ingestion
- Guard _schedule_flush timer creation under _queue_lock (TOCTOU race)
- Capture local `backend` reference in prefetch/sync closures so
shutdown() nulling self._backend cannot crash in-flight threads
Correctness:
- Fix bool("false")==True for rerank param; parse string values explicitly
- Guard page/top_k with max(1,...) and move int() inside try blocks
- Fix fact_count=0 always in OSS mode (Memory.add returns list, not dict)
- Fix prefetch() not clearing result when thread still alive after timeout
- Fix atexit.register accumulating on repeated initialize() calls
Backend / setup:
- Handle Qdrant named-vector collections in _recreate_collection_if_dims_changed
(vectors is a dict; .size access raised AttributeError, swallowed silently)
- Wrap QdrantClient and psycopg2 conn/cursor in try/finally to prevent leaks
- Resolve ollama_bin at top of _ensure_ollama; use it for ollama pull
- Fix embedder key lookup when LLM provider has no env_var (e.g. ollama)
Also: remove _telemetry_enabled cache (env var check is cheap), bump
required mem0ai to >=2.0.7, minor README wording fix.
* fix(mem0): fix brittle qdrant path test + add telemetry sample-rate docs
- Replace generator-throw lambda with a proper def in
test_qdrant_path_not_writable; use tmp_path instead of a hardcoded
/nonexistent path so the test is root-safe
- Add MEM0_TELEMETRY_SAMPLE_RATE to memory-providers.md (was only
in the plugin README, not the user-guide docs)
* revert: remove MEM0_TELEMETRY_SAMPLE_RATE from user-guide docs
* refactor: remove telemetry from mem0 plugin and update documentation
* fix(mem0): set stdin=DEVNULL on setup subprocess calls
The TUI stdin guard (scripts/check_subprocess_stdin.py) requires every
subprocess call in plugin code to set stdin= so it can't inherit the
gateway's JSON-RPC stdin fd. Muzzle the docker/ollama calls in the OSS
setup wizard with stdin=subprocess.DEVNULL (none need interactive input).
Also covers the docker-inspect call the linter's regex misses.
---------
Co-authored-by: chaithanyak42 <chaithanya.kumar42a@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On resource-contended hosts the embedded Hindsight daemon can exceed a
single 2s /health check; upstream then waits a grace window before
treating it as stale and killing+restarting it (hindsight-embed reads
HINDSIGHT_EMBED_PORT_HEALTH_GRACE_TIMEOUT, default 30s, into a
module-level constant at import time). Users on busy boxes had no
Hermes-side way to raise it short of hand-setting an env var.
Add a 'port_health_grace_timeout' config.json option to the Hindsight
plugin. When set, initialize() exports it to the process env BEFORE
daemon_embed_manager is imported (the import-time read is the contract).
setdefault() so an explicit operator env override always wins. Exposed
in 'hermes memory setup' for local_embedded mode.
Follow-up to #50308 / issue #13125 comment thread.
Follow-up for salvaged PR #50256. Unit tests for the three behaviors:
retryable classification of Envoy/sidecar overflow strings, per-chat typing
cooldown with stop_typing reset, and the _supervise_sidecar crash-detection
path that raises a retryable fatal (and the clean-shutdown no-op).
PostgreSQL's initdb refuses to run as root, so the embedded Hindsight
daemon could never initialize its data directory under root. The
daemon-start thread would fail, retry, and loop forever — each cycle
reloading embedding models (~958MB RAM, ~33% CPU) with no user-visible
error, leaving Hermes sluggish on a common VPS/cloud root setup.
initialize() now detects root (os.geteuid() == 0) before spawning the
daemon thread, disables local_embedded mode, and surfaces a clear
warning to both the log and the terminal so the user knows to run as a
non-root user or switch to cloud / local_external mode.
Closes#13125.
Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
The raft platform plugin's check_raft_requirements() logged a WARNING every
time it returned False. Since check_fn is called on every load_gateway_config()
(~every 10s during normal gateway operation), users who don't have the raft
CLI installed get their logs flooded with no way to suppress it — hermes plugins
disable doesn't work for bundled platform plugins, and platforms.raft.enabled:
false doesn't gate the check_fn call.
Fix: make check_raft_requirements() a silent predicate (return True/False
only, no logging), matching the convention documented and used by other
platform adapters (e.g. teams/adapter.py). The caller in
gateway/platform_registry.py create_adapter() already emits its own warning
when requirements aren't met and an adapter is actually requested — that's the
correct place for a user-facing warning (fires once per connect attempt, not
once per config load).
Fixes#49234
The prior assertion `all("turn1" in k or "turn2" in k for k in keys)` was
weak on two counts: it passes vacuously when keys is empty (a regression
that lost all state would slip through), and after turn 2 finalizes only
turn 1 lingers, so it only ever inspected turn 1 anyway. Replace it with an
exact check that one key survives, it is turn 1, and turn 2 never merged
into it — the real isolation invariant the test name claims.
Scoping the trace key by turn_id (the prior commit) fixed cross-turn
collisions but introduced a slow leak: _finish_trace only pops a key when a
turn ends cleanly (final response has content and no tool calls), so any
turn that is interrupted, ends on a tool call, or has empty final content
now leaves its uniquely-keyed entry in _TRACE_STATE forever. Previously the
constant per-session key was overwritten by the next turn, capping growth at
~1 entry per session.
Add an LRU cap (_MAX_TRACE_STATE) enforced by _evict_stale_locked, called
under _STATE_LOCK immediately before each insert. It evicts the
least-recently-updated entries (using the previously-dead last_updated_at
field) and ends their root span so nothing dangles. Regression test drives
50 non-finalizing turns against a cap of 8 and asserts the dict stays bounded
with the most-recent turns surviving.
The turn- and api-scoped branches each repeated the same
task/session/thread fallback ladder with only the infix differing. Extract
the shared prefix into _scope_prefix so a future scope dimension touches one
ladder instead of three. The legacy branch still returns a bare task_id (not
the task: prefix) for backward compatibility, so it stays separate.
Output key strings are unchanged; a new test pins them across every
task/session/turn/api combination since the keys are matched across hooks
and any drift would silently break trace finalization.
The PR added helper-level tests for _trace_key but nothing exercised the
keys through the real hooks. This adds TestTurnTraceIsolation, which drives
on_pre_llm_request / on_post_llm_call across two turns of one gateway
session (task_id == session_id, unique turn_id, api_call_count reset per
turn) and asserts each turn opens its own root trace when the first turn
fails to finalize (tool-only final step). This test fails on the pre-fix
code (only one trace opened, turn 2 absorbed into turn 1) and passes with
the scoping fix.
Also pins the turn_id-over-api_request_id key precedence: the turn-scoped
post_llm_call carries no api_request_id, so it must still resolve to the
same key as the request-scoped hooks or finalization breaks.
Resolves conflicts from the OpenViking churn that merged after #32445 was
opened (#48042/#47662 session-switch + write hardening, #47311/#47973):
- plugins/memory/openviking/__init__.py: keep both __init__ field groups
(the PR's _runtime_start_* alongside main's _prefetch_threads/_shutting_down).
- tests/plugins/memory/test_openviking_provider.py: keep BOTH the PR's new
setup-validation tests and main's session-switch/concurrency tests (disjoint
additions to the same region).
Two fixes layered while reconciling (contributor work otherwise preserved):
- Restore the merged tenant-header contract (#22414/#21232). The PR had changed
_VikingClient defaults to '' and made empty account/user OMIT the tenant
headers; main's contract is that empty falls back to 'default' and the
X-OpenViking-Account/User headers are ALWAYS sent (ROOT API keys need them).
Reverted the constructor to 'account or os.environ.get(..., "default")' and
updated the two PR tests that asserted the omit-when-empty behavior.
- Close a secret-file TOCTOU in the setup writers. _write_env_vars and
_write_ovcli_config wrote the api_key/root_api_key file and chmod 0600
AFTERWARD, leaving a world-readable window on newly-created files. Added
_precreate_secret_file() to create with 0600 before any secret bytes land.
Phase 4E (E.1 + E.2). The inbound side of Chronos: NAS POSTs the agent when a
one-shot fires; the agent verifies a NAS-minted JWT and runs the job.
E.1 — plugins/cron/chronos/verify.py:
- verify_nas_fire_token(token, expected_audience, jwks_or_key, issuer): verifies
signature against the NAS JWKS (RS/ES family; symmetric rejected), aud == this
agent, exp/nbf, iss, and purpose == "cron_fire" (so a general agent JWT can't
be replayed against the fire endpoint). Returns claims or None; never raises.
Crypto delegated to PyJWT[crypto] (already a declared dep) — no hand-rolled
JWT, no new dependency. No key configured → refuse (never unsigned-decode a
security boundary).
- get_fire_verifier(): pluggable indirection so the DQ-4 escape hatch
(direct per-job cron-key) can swap in with no handler change.
E.2 — gateway/platforms/api_server.py:
- POST /api/cron/fire (registered only when _CRON_AVAILABLE). Authenticated by
the NAS-JWT via get_fire_verifier() — NOT API_SERVER_KEY (NAS holds no API
key; this is the only inbound that triggers remote job execution, so it gets
its own purpose-scoped check). Verifier args come from cron.chronos.* config.
401 on bad/missing/forged token. 400 on missing job_id. On success: 202 +
fire_due runs in the background (so a long agent turn never trips NAS's HTTP
timeout); the store CAS claim inside fire_due de-dupes a scheduler retry.
Tests:
- test_chronos_verify (11): REAL RS256 signing — valid→claims, wrong-aud,
missing/wrong purpose, expired, wrong-iss, tampered-signature (attacker key),
no-key-refuse, empty-token, JWKS-URL key resolution, get_fire_verifier.
- test_cron_fire_webhook (5): valid→202+fire, invalid→401+no-fire, missing
token→401, missing job_id→400, and fire path does NOT require API_SERVER_KEY.
api_server regression suites (214) green.
E.3 (NAS endpoints) is a separate cross-repo PR; the wire contract lands next
(docs/chronos-managed-cron-contract.md).
Phase 4D. The first non-default CronScheduler: plugins/cron/chronos/. Inert
unless cron.provider=chronos; resolve_cron_scheduler falls back to the built-in
if unavailable, so cron never loses its trigger.
Files:
- chronos/__init__.py — ChronosCronScheduler + register(ctx).
* is_available(): config-only, NO network (portal_url + callback_url + a
stored Nous access token via get_provider_auth_state). Returns False →
resolver falls back to built-in.
* start(): reconcile() then RETURN — no blocking loop, no 60s wake (DQ-1:
this is what makes scale-to-zero real; the machine wakes only on a
NAS→agent fire).
* _arm_one_shot(job): POST NAS provision {job_id, fire_at, agent_callback_url,
dedup_key=job_id:fire_at}. Agent owns the time → sub-minute fires survive
(no scheduler 1-minute floor).
* reconcile(): converge NAS arms toward jobs.json — arm missing/changed-time,
cancel orphaned, skip paused. Cold process rebuilds from jobs.json +
idempotent dedup_key.
* on_jobs_changed(): reconcile (re-arm/cancel the affected one-shot).
* fire_due(): ABC default (CAS claim + run_one_job) THEN re-arm the next
one-shot. Job gone (one-shot done / repeat-N exhausted) → no re-arm.
- chronos/_nas_client.py — thin HTTP wrapper for provision/cancel/list using
the agent's existing refresh-aware Nous token (resolve_nous_access_token).
Names no scheduler vendor; holds no scheduler creds.
- chronos/plugin.yaml — discovery metadata.
INVARIANT: zero "qstash"/"upstash" hits in plugins/cron, gateway, hermes_cli,
website/docs — the external scheduler is a NAS-internal detail, never named
agent-side.
Tests (13, all NAS mocked, zero network): is_available off-without-config +
on-with-config + makes-no-network; arm payload incl. sub-minute + noop without
next_run; reconcile arms-all / cancels-orphan / skips-paused / skips-already-
armed; fire_due re-arms next / no re-arm when job gone / no re-arm when claim
lost.
* fix(photon): preserve text in mixed iMessage attachments
When an iMessage bubble carried both text and an attachment, spectrum-ts'
inbound mapper returned only buildAttachmentMessage(...), dropping the user's
typed text before Hermes could see it. The Photon adapter then had no 'group'
content path, so the text was lost entirely.
- adapter.py: handle a new 'group' content type that flattens text + attachment
items, preserving the typed text alongside cached media (extracted shared
_normalize_binary_payload helper).
- sidecar: emit 'group' content in normalizeContent, and ship
patch-spectrum-mixed-attachments.mjs which patches spectrum-ts' pinned mapper
(at npm postinstall AND at sidecar startup, so existing installs self-heal).
Windows robustness fixes on top of the original PR:
- The patcher's CLI guard used 'import.meta.url === file://${argv[1]}', which
never matches on Windows (file:/// + drive letter) — it silently no-opped.
Switched to pathToFileURL(argv[1]).href.
- The patcher matched \n-joined strings, so a CRLF checkout (Windows git
autocrlf) defeated every replacement. It now normalizes CRLF->LF for matching
and restores the original EOL style on write.
Co-authored-by: Yuhang Lin <yuhanglin@YuhangdeMac-mini.local>
* chore: map YuhangLin contributor email for attribution (#46513)
---------
Co-authored-by: Yuhang Lin <yuhanglin@YuhangdeMac-mini.local>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
Follow-up hardening on @ehz0ah / @harshitAgr's session-switch work (#28296):
- on_session_switch no longer runs the old-session writer-drain + pending-token
GET + commit POST inline on the caller's command thread. /new, /branch,
/resume, /undo call it synchronously, so a slow drain (up to 10s) or wedged
commit blocked the user-facing command — the same hazard #41945 fixed for
end-of-turn sync. State now rotates synchronously (cheap) and the old-session
commit is offloaded to a daemon finalizer (generalized _finalize_session_async).
- Guard the (_session_id, _turn_count) pair with _session_state_lock: sync_turn
runs on the memory-manager executor thread while the session hooks run on the
command thread, so the snapshot+reset vs increment was a cross-thread race.
- _session_needs_commit checks the committed-session guard BEFORE the
turn_count>0 shortcut, closing a double-commit window when a racing sync_turn
re-increments after commit+reset.
- Add a _shutting_down flag so deferred finalizers stop POSTing against a
torn-down client; track all prefetch threads in a set so invalidate/shutdown
join every one, not just the latest slot.
Tests: regression for the non-blocking switch (asserts the caller returns while
a slow drain is parked off-thread) and the committed-guard ordering; updated the
deferred-commit test to the unified finalizer contract.