hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-06-27 11:22:03 +00:00

Author	SHA1	Message	Date
Ben	538c419d2e	fix(gateway): scope dashboard liveness fallback to the profile PR #52151 hardened the runtime-status liveness check to trust a readable live process command line over stale gateway_state.json argv, so a recycled PID now owned by an s6 supervisor no longer counts as a running gateway. That fix is correct but incomplete for the reported symptom: the web dashboard showed a named profile's gateway green while `hermes -p <name> gateway status` showed it stopped. Two further issues: 1. Cross-profile PID reuse. In per-profile Docker supervision, one profile's stale `gateway_state.json` can record a PID the OS later recycled onto a DIFFERENT profile's live gateway. That PID's command line still `looks_like_gateway`, so the dead profile was reported running. The recorded argv has its `-p <name>` selector stripped in-process by `_apply_profile_override`, so it cannot disambiguate; the live `/proc` cmdline still carries it. `get_runtime_status_running_pid` now accepts an `expected_home` and validates the live command line belongs to THAT profile (mirroring `hermes_cli.gateway._matches_current_profile`, the logic the CLI scan path already uses — which is why the CLI was correct). `_check_gateway_running` passes the enumerated profile dir. 2. The existing regression test `test_gateway_running_check_falls_back_to_runtime_state` used the live pytest PID with a gateway-shaped record; once the live cmdline became authoritative it no longer looked like a gateway. Updated to mock the live cmdline to the real separate-process scenario it describes. The active-profile path (`get_running_pid`) is intentionally left unscoped: it is lock-verified and any live gateway cmdline is acceptable there. Multiplex mode is unaffected — `running` state is only ever written to a gateway's own home, never a secondary served profile's. 
Adds coverage for: cross-profile PID reuse (named + default), matching profile cmdline (`-p`, `--profile`, explicit HERMES_HOME=), the bare default gateway, and the unreadable-cmdline cross-platform fallback. Each new cross-profile assertion fails without the profile scope and passes with it. Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>	2026-06-25 10:25:54 +10:00
helix4u	f1617a7ebb	fix(gateway): validate runtime status pid command line	2026-06-25 10:25:54 +10:00
Ben	d335164833	fix(relay): authorize relay inbound via connector-enforced upstream authz A hosted instance fronted by the Team Gateway connector dropped EVERY relay message as "Unauthorized user" and the agent never replied — despite the message routing correctly through the connector to the instance. Root cause: gateway authorization (_is_user_authorized) had no notion of upstream-enforced authz. Platform.RELAY matches no {PLATFORM}_ALLOWED_USERS allowlist and isn't in the HA/WEBHOOK always-authorized set, so a relay user with no env allowlist configured hit the default-deny ("No user allowlists configured. All unauthorized users will be denied."). The message was received, then silently denied before reaching the agent. This is incorrect for relay: the connector authenticates the gateway's WS with a per-instance secret and performs owner-only author-binding resolution BEFORE delivering. A message only reaches this gateway because the connector resolved it to THIS instance's bound user (user_instance_binding), keyed on the author id the connector OBSERVED off the event — never a gateway claim. The authorization decision is already made by a trusted, authenticated upstream; there is no local RELAY_ALLOWED_USERS allowlist to consult, and default-denying for its absence is the bug. Fix: add a generic BasePlatformAdapter.authorization_is_upstream capability (default False) that the relay adapter overrides to True, plus a dedicated trusted branch in _is_user_authorized that honors it. This is delegation to a trusted upstream, NOT a fail-open: it fires only for an adapter that explicitly declares the flag; every direct network-exposed adapter leaves it False and the env-allowlist default-deny (SECURITY.md §2.6) is unchanged. Distinct from enforces_own_access_policy, which mirrors a LOCAL config-driven allowlist — this delegates to an authenticated upstream's decision. 
Tests: behavior contract that the base defaults False, the relay adapter declares True, a relay user (group + DM) is authorized with no env allowlist, and crucially a non-upstream adapter with no allowlist still default-denies (guards against the fix becoming a blanket fail-open). 6 new tests; relay + authz + config-policy suites green (134 + 90). Found via live staging debug of the Discord self-serve onboarding flow.	2026-06-25 10:06:21 +10:00
liuhao1024	404b06ac4f	fix(gateway): honor server retry_after in _send_with_retry for Telegram flood control (#46762) When Telegram's sendRichMessage returns a FloodWait/RetryAfter error, _try_send_rich() now extracts the server-provided retry_after value and propagates it through SendResult.retry_after. The base _send_with_retry() layer honors this value instead of using its default short exponential backoff (~2s, ~4s), preventing the retry budget from being exhausted against a server that demands a 25-37s wait. Salvaged from #46774 by @liuhao1024. Telegram adapter path moved from gateway/platforms/telegram.py to plugins/platforms/telegram/adapter.py since the original PR. Closes #46762	2026-06-25 02:43:47 +05:30
kshitijk4poor	9c994377ed	fix(gateway): skip non-dict entries in session loading (#46994) Corrupted sessions.json entries (e.g. a bare bool where a dict is expected) caused a TypeError on `'origin' in data`, which escaped the (ValueError, KeyError) inner except and aborted loading ALL remaining sessions, not just the corrupted one. Two-layer fix: - Loop level: isinstance(entry_data, dict) guard before from_dict - from_dict: isinstance(data['origin'], dict) instead of bare truthiness - Added TypeError to the inner except as defense-in-depth Closes #46994	2026-06-25 01:26:13 +05:30
kshitijk4poor	e0272cfef2	Revert "fix(compression): make minimum context floor configurable (#31600)" This reverts commit `cae1ee44a7`.	2026-06-25 01:04:44 +05:30
Tranquil-Flow	cae1ee44a7	fix(compression): make minimum context floor configurable (#31600 ) Add compression.minimum_context_floor config key that allows users to lower the compression threshold floor below the hardcoded 64K default, preventing infinite tool-call loops on models whose structured output degrades well before 64K tokens. - agent/model_metadata.py: add get_configurable_minimum_context() helper with 16K hard safety limit - agent/context_compressor.py: accept minimum_context_floor param, thread it through _compute_threshold_tokens - agent/conversation_compression.py: use compressor's floor for aux model context validation - agent/agent_init.py: read compression.minimum_context_floor from config and pass to ContextCompressor - gateway/run.py: cache-busting includes new key Salvaged from #31686 by @Tranquil-Flow onto current main. Resolves conflicts with in-place compaction (#38763) and max_tokens threshold computation (#43547) that landed after the original PR. Closes #31600	2026-06-25 00:56:04 +05:30
kshitij	9214aa7dde	Merge pull request #52090 from NousResearch/salvage/35994-reset-deadlock fix(gateway): offload agent cleanup off the event loop in /new reset (#35994)	2026-06-25 00:34:21 +05:30
kshitijk4poor	0225480369	fix(gateway): offload agent cleanup off the event loop in /new reset (#35994 ) The /new (and /reset) confirmation-button callback runs the slash-confirm handler on the asyncio event loop (see _request_slash_confirm). That handler calls _handle_reset_command, which invoked the SYNCHRONOUS, potentially long-blocking _cleanup_agent_resources inline: agent.close() tears down terminal sandboxes, browser daemons and background processes (subprocess waits), and shutdown_memory_provider() can make a network call. A slow teardown wedged the entire event loop, so the bot went silent and stopped processing all messages until a manual restart. Offload _cleanup_agent_resources via the existing contextvar-preserving _run_in_executor_with_context helper, bounded by asyncio.wait_for with a named _RESET_CLEANUP_TIMEOUT_S (30s). The loop is never blocked; on timeout the reset proceeds and the worker thread is left to finish on its own (it cannot be cancelled). The text /new path is unaffected (already off-loop). Tests (tests/gateway/test_35994_reset_button_deadlock.py): the loop keeps ticking while close() blocks in its worker thread; a cleanup that raises is swallowed (warning logged) and the reset still rotates the session; a cleanup that times out degrades gracefully. All three are mutation-verified to fail without their respective production branch.	2026-06-25 00:27:22 +05:30
sweetcornna	b41d9b845d	fix(gateway): surface retry hint instead of silently dropping turn after /stop (#31884 ) After /stop, the next user message can hit a stale generation token and return with api_calls=0, no failure, no interruption. _normalize_empty_agent_response fell through to an empty string, so the gateway logged "response=0 chars" and sent nothing — the message was silently lost while internal work sometimes continued. Add the api_calls==0 / not-failed / not-interrupted / not-partial branch to the single normalization chokepoint so the user gets a short retry hint instead of silence. Regression test asserts the hint surfaces. Salvaged from #33851 (re-applied on current main; original was 1401 commits behind and the function had moved).	2026-06-24 23:51:31 +05:30
kshitij	ae20c3fb90	Merge pull request #51025 from NousResearch/salvage/cron-autoreset-override fix(gateway): consume was_auto_reset so /model survives session auto-reset (#48031)	2026-06-24 19:20:11 +05:30
x7peeps	6879d77d74	fix(gateway): consume was_auto_reset so /model survives session auto-reset When `/model X` is the FIRST message after an idle/daily/suspended auto-reset, the slash-command path stores a session model override but leaves `session_entry.was_auto_reset = True` (it never passes through `_handle_message_with_agent`, which is where the flag was consumed). On the NEXT regular message, the auto-reset cleanup block pops the freshly-stored model/reasoning override BEFORE the flag is consumed — so the switch is silently lost and resolution falls back to the config default, while the session DB still shows the switched model (a two-sources-of-truth divergence). Consume the flag at both sites: 1. gateway/run.py — capture `was_auto_reset` into a local and set the attribute False immediately at the top of the cleanup block, so the cleanup can't re-fire on a later message and wipe an override stored between turns. Downstream reads use the captured local. 2. gateway/slash_commands.py — the model path consumes the flag before storing the override, so a /model-first-after-auto-reset isn't wiped by the next message's cleanup. Salvaged from #48062 by x7peeps (authorship preserved). Tests: tests/gateway/test_48031_model_switch_after_auto_reset.py — AST invariants pinning both consume sites (load-bearing; verified they fail when either consume is removed). Mirrors the AST-pin approach in test_35809_auto_reset_clean_context.py. Gateway session/reset suite: 16 passed. Fixes #48031	2026-06-24 19:12:44 +05:30
kshitij	d68a133458	Merge pull request #51890 from NousResearch/salvage/40695-handoff-watcher-async fix(gateway): offload handoff-watcher SQLite calls to avoid blocking the async heartbeat (#40695)	2026-06-24 19:10:52 +05:30
liuhao1024	721cf54fb1	fix(gateway): offload /model provider-listing off the event loop (#41289) The Discord/Telegram /model slash command listed providers synchronously on the gateway's async event loop. list_picker_providers / list_authenticated_providers are blocking and can fall through to a synchronous urllib HTTP fetch when the on-disk provider cache is stale, freezing the loop for 120-150s -> "application did not respond" and delayed agent starts. Port #41304's asyncio.to_thread offload to the current handler location. The handler moved from gateway/run.py to gateway/slash_commands.py (_handle_model_command); wrap BOTH blocking call sites so the whole bug class is covered: - picker path -> list_picker_providers - text-fallback path -> list_authenticated_providers asyncio.to_thread is already idiomatic in this module (and asyncio is imported), so the loop now stays responsive while the (possibly network-bound) listing runs on a worker thread. Adds tests/gateway/test_model_command_async_offload.py asserting the offload contract at the real handler seam for both paths (mutation-survivable: reverting either to_thread wrap fails the matching test). Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-06-24 18:40:52 +05:30
r266-tech	f0c5d812b0	fix(gateway): offload handoff watcher SessionDB polling off the event loop The Discord gateway heartbeat stalled ('Shard ID None heartbeat blocked for more than N seconds') because _handoff_watcher polled the synchronous, blocking SQLite-backed SessionDB directly on the asyncio event loop every 2s. Each list_pending/claim/complete/fail call performed blocking disk I/O on the loop thread, starving the Discord heartbeat coroutine. Wrap every blocking SessionDB call inside the watcher loop in asyncio.to_thread(...) so the SQLite work runs on a worker thread and the event loop (and heartbeat) stays responsive. These four call sites are the only synchronous self._session_db.* calls inside the watcher loop body. Adds tests/gateway/test_handoff_watcher_async_db.py asserting the watcher offloads its SessionDB calls via asyncio.to_thread (mutation-survivable: reverting any to_thread wrap fails the corresponding assertion). Fixes #40695 Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-06-24 18:40:23 +05:30
Ben	c93b9f9057	feat(relay): terminal 4401 (opt-out) → clean "Relay disabled" state Phase 7 Unit 7d-B. When an operator opts an instance OUT of the Team Gateway relay (Unit 7b deprovision), the connector revokes the per-gateway secret and closes the gateway's WS with 4401. The reconnect supervisor previously treated EVERY close as retryable, so the live process spun "retrying 4401" forever and the dashboard showed a red error — opt-out looked like a failure. Now a 4401 close that arrives AFTER a successful handshake is recognized as a terminal credential revocation: - ws_transport.py: track `_handshake_succeeded` (set when a descriptor is received); on a 4401 close after a prior success, latch `auth_revoked` and do NOT spawn the reconnect supervisor. A 4401 BEFORE any successful handshake stays retryable (cold-start / not-yet-provisioned race, not a revocation). 
New `auth_revoked` property + a websockets-version-safe close-code reader (prefers `.rcvd`/`.sent` Close frames; `.code` is deprecated in websockets 13+). - adapter.py: a revocation monitor turns `transport.auth_revoked` into a clean, NON-retryable `relay_disabled` fatal and notifies the gateway's fatal-error handler (so the adapter is removed and NOT queued for reconnection — the credential is dead until the instance is recreated). Monitor is cancelled on disconnect; only started when the transport exposes `auth_revoked` (prod WS). - run.py: `_handle_adapter_fatal_error` maps the `relay_disabled` code to a `disabled` platform_state (not `fatal`/`retrying`). - web: PlatformsCard renders the `disabled` state with a neutral outline badge, a PowerOff icon, and muted (not destructive-red) text + message. New optional `status.disabled` i18n string ("Disabled"). Also bundles the Phase 7 contract-doc update (this doc is authoritative in hermes-agent): docs/relay-connector-contract.md gains an "Author-first resolution + the account-link (DM) path" section documenting the multi-tenant-guild rule (D-7.2 — route by authenticated author binding, never by guild; unlinked → fail-closed), the `/link <code>` DM flow, and the connector-authoritative opt-out + terminal-4401 behavior this PR implements. Tests: +2 ws_transport (4401-after-handshake terminal / no-reconnect; 4401-before-handshake stays retryable) and +2 adapter (revocation → non-retryable relay_disabled fatal + handler fired; no-revocation → no fatal). 138 relay tests pass (incl. the contract-doc conformance test); ruff clean; web tsc clean. Phase 7 Unit 7d-B (relay-adapter solo lane). Q17 → Option 2; Option 3 (live de-register, no recreate) + the restart-re-provision hole deferred post-alpha.	2026-06-24 18:43:01 +10:00
Chaz Dinkle	abc3662bf6	fix(gateway): detect launchd in /restart service-manager probe (#43475 ) On a launchd-managed gateway (macOS), /restart stopped the gateway but never relaunched it: the handler's service detection checks only INVOCATION_ID (systemd) and container markers, so under launchd it takes the detached path and exits 0 — which KeepAlive.SuccessfulExit=false treats as a deliberate stop. The gateway stays silently dead until a manual launchctl kickstart. Detect launchd via XPC_SERVICE_NAME, which launchd sets to the job label for processes it spawns. The probe deliberately excludes the literal "0": interactive macOS shells inherit XPC_SERVICE_NAME=0 (a truthy string), and routing an unsupervised interactive gateway to the service path would make it exit non-zero with nothing to revive it. Routing through via_service=True (rather than forcing a non-zero exit on the detached path) matters: the detached path also spawns a helper that relaunches the gateway, so exiting non-zero there would have BOTH the helper and launchd respawn it — two gateways racing for the same bot tokens. The service path spawns no helper; launchd is the single respawner. Fixes #43475. Supersedes the run.py-era probes in #19940/#33393 (the handler has since moved to gateway/slash_commands.py) and avoids the double-spawn risk in the exit-code-site approaches (#43498, #43596).	2026-06-24 00:14:25 -07:00
Teknium	0ef86febe2	docs(sessions): clarify sessions.json is the gateway routing index, not the session list (#51726 ) Users who inspect ~/.hermes/sessions/sessions.json see only gateway entries (e.g. agent:main:whatsapp:dm:...) and mistake it for the session index that hermes sessions list / /sessions read — which is actually state.db. Issue #49361 reported CLI sessions as 'invisible' on this premise. - gateway/session.py: write a self-documenting _README sentinel at the top of sessions.json explaining it's the gateway routing index and that ALL sessions (CLI/TUI/gateway) live in state.db; skip _-prefixed keys on load so the sentinel never round-trips into a SessionEntry. - Harden every sessions.json reader against the sentinel: mcp_serve loader, gateway/mirror.py, gateway/channel_directory.py all skip _-prefixed keys. - docs/user-guide/sessions.md: warning callout naming the exact symptom. - tests: assert prune ignores metadata sentinels; add round-trip coverage.	2026-06-23 23:56:36 -07:00
uperLu	0d4cecb352	fix(cron): avoid provider package shadowing core cron	2026-06-23 23:39:22 -07:00
Ben	31bced1607	fix(profiles): detect a separate-process gateway in profile status The dashboard Profiles view showed "Gateway stopped" for a gateway that is in fact running — while the sidebar status strip and `hermes gateway status` (CLI) both correctly showed it running. Reported on v0.17.0 running the gateway + dashboard in one Docker container. Root cause: three liveness surfaces with three detection strengths, all reading the same `gateway.pid`: - `hermes gateway status` -> find_gateway_pids() (process-table scan) - sidebar /api/status -> get_running_pid() + gateway_state.json PID fallback + health-URL probe - Profiles view -> _check_gateway_running() = get_running_pid() ONLY, no fallback `get_running_pid()` short-circuits to None the moment the runtime lock (`gateway.lock`) doesn't register as held by the calling process — which is always true when the reader is a separate process from the gateway (the dashboard is its own s6 service in the container), and also for any launch-service-managed gateway that left a fresh `gateway_state.json` but no live PID file. So the Profiles view alone reported the live gateway as stopped. Fix: give _check_gateway_running the same fallback the sidebar already has — after the pid-file/lock check misses, validate the PID recorded in that profile's gateway_state.json against the live process table via the existing get_runtime_status_running_pid(). read_runtime_status() gains an optional path arg so a profile's state file can be read without mutating the process-global HERMES_HOME (preserving the contextvar-based profile isolation the dashboard relies on). Backward compatible: every existing caller passes no argument. Tests: a regression test that fails pre-fix (live gateway, lock check returns None -> must still report running) and a guard test that a 'stopped' state file is never reported running even with a live PID.	2026-06-24 16:36:17 +10:00
teknium1	366c2a3766	fix(gateway): propagate fatal-config exit code through start_gateway clean-exit path The contributor PR stamped runner._exit_code=78 on non-retryable startup errors, but start_gateway()'s clean-exit branch returned True before the SystemExit(runner.exit_code) site, so main() exited 0. The s6 finish script's [ "$1" = "78" ] check never matched and s6 crash-looped the gateway anyway — the fix was dead as shipped (#51228). Honor runner.exit_code in the clean-exit branch: raise SystemExit(code) when set, else return True (normal /restart clean exit). Add a start_gateway()-level test that asserts process-level SystemExit(78) propagation — the gap the PR's object-level test missed — plus exit_code on the existing _CleanExitRunner mocks.	2026-06-24 16:34:51 +10:00
Francesco Mucio	776f68e1ee	fix(gateway): exit 78 (EX_CONFIG) on fatal startup errors, s6 finish script stops restart loop Profiles without their own messaging token inherit the default profile's token via os.getenv, hit a token collision, and exit with startup_failed. s6 restarts them immediately, creating ~30MB tirith sandbox dirs in /tmp each cycle — filling the disk in hours (#51228). Changes: - gateway/restart.py: add GATEWAY_FATAL_CONFIG_EXIT_CODE = 78 - gateway/run.py: set exit_code=78 on non-retryable startup errors (token collision, no platforms) - hermes_cli/service_manager.py: add _render_finish_script() that translates exit 78 → exit 125 (s6 permanent failure) - hermes_cli/container_boot.py: write finish script alongside run script during profile registration The s6 finish script pattern follows docker/s6-rc.d/dashboard/finish. Closes #51228	2026-06-24 16:34:51 +10:00
jeremy gu	044996e403	fix(gateway): track no-systemd restart runtimes	2026-06-23 23:29:28 -07:00
helix4u	06cbc3bae9	fix(photon): recover degraded upstream stream	2026-06-23 21:33:10 -07:00
Ben Barclay	6e88f7b6f7	feat(relay): Phase 5 Unit C — wake primitive (gateway side) (#51595 ) Register a per-instance wakeUrl and forward it to the connector at self-provision so a suspended gateway can be poked awake when buffered work arrives (pairs with the connector-side WakePoker). - relay_wake_url() resolver (env GATEWAY_RELAY_WAKE_URL, then gateway.relay_wake_url in config.yaml), mirroring relay_instance_id() - thread wake_url through _post_provision (adds wakeUrl to the body only when set) + self_provision_relay (resolve, forward, log) - hermes gateway enroll --wake-url <url> persists GATEWAY_RELAY_WAKE_URL - document the §5.2 wake poke in relay-connector-contract.md §3.3 - tests: relay_wake_url resolution (env/config/absent), provision forwarding, body-only-when-set (6 new; 130 relay tests pass) The actual reconnect+drain on wake is Unit B's loop; this unit only wires the wake SIGNAL. Opt-in: absent wakeUrl => connector never pokes.	2026-06-24 11:00:11 +10:00
Ben Barclay	40fddc9e4c	feat(relay): Phase 5 §5.3 going-idle / buffered-flip primitive (gateway side) (#51572 ) The gateway half of the going-idle/buffered-flip primitive (scale-to-zero PRIMITIVE, not the behaviour). Integrates with the EXISTING drain transition: - ws_transport: `go_idle()` sends `going_idle` + awaits the connector's `going_idle_ack` (connector-authoritative flip-then-ack, Q-5.3c — stays serving until the ack so nothing is lost in the flip window); acks a buffered inbound (bufferId present) via `inbound_ack` after the handler runs (drain-without-dup on the delivery leg); NET-NEW reconnect loop re-dials + re-handshakes after an unexpected close (off by default, on in production). - adapter: emits `going_idle` from its existing `disconnect()` drain seam before tearing down the socket; best-effort + guarded (never blocks shutdown). - transport Protocol + contract doc §3.2 document the 3 new frames. +6 relay tests (124 pass). NOT in scope: the autonomous idle timer / machine suspend / NAS health model (deferred behaviour). Ben's relay-adapter solo lane.	2026-06-24 09:50:30 +10:00
fyzanshaik	0ba1dfed78	fix(gateway): refuse model switch on stale checkout to avoid env_float ImportError	2026-06-24 04:16:54 +05:30
islam666	0c79992db5	fix(gateway): preserve _session_tasks on guard mismatch to enable stale lock healing (#48300 ) _session_task_is_stale() failed to detect a stale session lock when the owner task completed and cleaned _session_tasks (del in _process_message_background's finally) but _active_sessions was NOT released because _release_session_guard skipped on a guard mismatch (a concurrent reset/new command or drain handoff swapped _active_sessions[key] to a different guard). With no owner task left to inspect, _session_task_is_stale reported 'not stale', the orphaned guard was never healed, and the session deadlocked permanently — later messages received but never dispatched. Reorder the finally cleanup to release-then-conditional-delete: release the guard first, then drop the _session_tasks entry ONLY if the guard was actually released (session_key no longer in _active_sessions). On a guard mismatch the done-task entry survives, so the on-entry self-heal (_session_task_is_stale -> _heal_stale_session_lock) detects the stale lock and clears it on the next inbound message. Extracted the cleanup into a callable _cleanup_finished_session_task() helper so the regression test drives the REAL production code path rather than a copy of its logic (the original test inlined the fixed logic and passed regardless of the production order — mutation-verified the rewritten tests now fail on the buggy del-first order). Added a positive-path test (guard matches -> release + delete) so both branches are pinned. Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-06-24 03:06:21 +05:30
Teknium	e32ebc6aa2	feat(skills): /learn — distill a reusable skill from anything you describe (#51506 ) Open-ended skill learning across every surface. /learn <free text> takes a description of any source — a directory, a URL, the workflow you just walked the agent through, or pasted notes — and the live agent gathers it with the tools it already has (read_file/search_files, web_extract, the conversation, the pasted text), then authors a SKILL.md via skill_manage following the house authoring standards (<=60-char description, the standard section order, Hermes-tool framing, no invented commands). No engine, no model-tool footprint, works on any terminal backend (local, Docker, remote): /learn builds a standards-guided prompt and hands it to the agent as a normal turn. - agent/learn_prompt.py: shared standards-guided prompt builder - /learn registry entry (both surfaces) + CLI handler (inject onto input queue) + gateway handler (rewrite turn, fall through, /blueprint pattern) - tui_gateway command.dispatch returns a send directive -> TUI + dashboard chat - dashboard Skills page 'Learn a skill' panel (dir + URL + open-ended text) composes a /learn request and runs it in chat - docs (slash-commands ref + skills feature page), 11 targeted tests Inspired by OpenAI Codex's Record & Replay and the /learn concept from #47234 (dir-distillation engine); reworked to be open-ended and engine-free per review.	2026-06-23 13:51:28 -07:00
Teknium	6cc07b6cd0	feat(discord): render reasoning as -# subtext via display.reasoning_style (#51168) Adds a per-platform display.reasoning_style setting (code | blockquote | subtext) controlling how the show_reasoning summary renders on the gateway. Discord defaults to "subtext" (-# small grey metadata text); every other platform keeps the fenced code block. Resolves through the existing display.platforms.<platform>.reasoning_style override chain.	2026-06-23 10:44:02 -07:00
Ben Barclay	45bc4fb37f	feat(relay): declare relevance policy to the connector + document the management plane (#51248 ) The gateway half of Phase 6 Unit ζ: project the agent's existing relevance knobs into the connector's platform-agnostic vocabulary and declare them at boot over the /relay/policy route, so the SAME mention-gating / free-response / allow-bots behavior the agent applies directly also governs relay delivery (and excluded chatter never wakes a scaled-to-zero agent). - gateway/relay/__init__.py: - relay_relevance_policy(): project require_mention -> requireAddress, free_response_channels -> freeResponseScopes, {PLATFORM}_ALLOW_BOTS in {mentions,all} -> allowOtherBots. Reads the fronted platform's config block + bridged top-level keys. Returns None when all-default (the connector's quiet default already matches) or no concrete platform is fronted. - send_relay_policy(): POST /relay/policy authenticated with the gateway's own per-gateway upgrade token (make_upgrade_token — same bearer as the WS upgrade), so the connector attaches it to the authenticated instance, never a body-asserted id. Re-declares every boot (self-healing, full replace). NEVER raises, NEVER blocks boot — relevance is an optimization layered on the δ/ε authorization gate. Reuses the per-gateway secret + the /relay/provision host; no new inbound surface, no new credential. - _policy_url(): ws(s)://…/relay -> http(s)://…/relay/policy. - gateway/run.py: call send_relay_policy() after register_relay_adapter() succeeds (the secret is resolved by then). - docs/relay-connector-contract.md: new §7 documenting per-instance delivery + the management plane (/manage/* + /relay/policy) + the relevance-declaration contract; versioning renumbered to §8. Contract conformance test stays green (§2/§3 tables untouched). Tests: +12 (projection mapping incl. comma-string + top-level fallback; send auth/skip/fail-soft/non-200). Full relay suite 118 pass. 
The connector route is already E2E-proven (connector repo gateway_policy_driver.py); this adds the real gateway send-path it pairs with. This completes Phase 6 (Team Gateway per-user isolation) end to end.	2026-06-23 18:43:19 +10:00
kshitijk4poor	0e69cd4b37	fix(memory): honor configured char limits in the no-agent on-disk store Follow-up to the /memory approve fresh-store fix. Both the CLI fallback and the messaging-gateway handler built a bare MemoryStore() with the hardcoded default char limits (2200/1375), ignoring the user's configured memory.memory_char_limit / user_char_limit. A live agent honors those overrides (agent/agent_init.py), so an approval applied without a live agent could accept a write the user's lower cap would reject, or vice versa. Extract a shared tools.memory_tool.load_on_disk_store() factory that reads the configured limits (falling back to defaults if config can't load) and wire both the CLI and gateway handlers to it, closing the gap on both surfaces and de-duplicating the construction block.	2026-06-23 03:10:53 +05:30
kshitijk4poor	100e7be20e	fix(security): deny root-level credential stores in media delivery The media-delivery denylist in gateway/platforms/base.py enumerated only .env/auth.json/credentials/config.yaml under HERMES_HOME, so other credential stores that live at the root fell through and could be auto-attached to chat replies. The reported case: the Google Workspace skill's google_token.json refreshes every turn, bumping its mtime to 'now', which kept passing the strict-mode recency window and re-sent the OAuth token on every reply. Extend the explicit per-file denylist to mirror the canonical credential set already enforced by the read/write guards in agent/file_safety.py: google_token.json, google_oauth_pending.json, auth/google_oauth.json, .anthropic_oauth.json, webhook_subscriptions.json, cache/bws_cache.json, auth.lock, and the pairing/ token directory. Targeted per-file additions (not a blanket ~/.hermes deny, which was declined in #32090/#34425 because it would block skills/, logs/, and ad-hoc agent-written deliverables). mcp-tokens/ (#37222) and state.db/kanban.db (#41071) are left to their sibling targeted PRs. Reported-by: xxxigm (#50912)	2026-06-23 02:56:48 +05:30
Teknium	2ba1cfeb2e	feat(goals): completion contracts for /goal — evidence-based judging (#50501 ) Adds an optional structured completion contract to the standing-goal loop, adapted from OpenAI Codex's /goal guidance (a durable objective works best when it names what done means, how to prove it, what not to break, what's in scope, and when to stop). A contract has five optional fields — outcome, verification, constraints, boundaries, stop_when. When set, the continuation prompt tells the agent to target the verification surface and respect constraints, and the judge marks the goal done only when the verification criterion is met with concrete evidence (command result, file excerpt, test output) instead of a loose "looks done" claim. This tightens the most common /goal failure mode: premature completion / endless over-continuation on an underspecified goal. Two ways to set a contract, both backward compatible (bare /goal <text> behaves exactly as before): - /goal draft <objective> — expands plain text into a full contract via the goal_judge aux model (cache-safe side call), falls back to a free-form goal if the model is unavailable. - /goal <text> with inline 'field: value' lines (verify:, constraints:, boundaries:, stop when:, ...). Plain goals with an incidental colon are not mangled — only known field prefixes are pulled out. - /goal show prints the active contract. Contracts persist in SessionDB.state_meta alongside the goal (survive /resume), compose with /subgoal criteria, and old goal rows load unchanged. CLI + every gateway platform via the shared GoalManager engine; zero new model tools. Tests: +18 in tests/hermes_cli/test_goals.py (parse/serialize/judge-prompt/ draft/fallback), 73/73 green; 42/42 across the broader goal test surface; live E2E roundtrip (set -> persist -> reload -> contract-aware prompts) green.	2026-06-22 12:20:09 -07:00
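The inline `field: value` behaviour in the commit above (only known prefixes are pulled out, so a goal with an incidental colon is left alone) might look something like the following sketch. The alias table is an assumption: the commit lists `verify:`, `constraints:`, `boundaries:` and `stop when:` and then elides the rest, so `parse_goal` and `KNOWN_PREFIXES` are hypothetical names, not the GoalManager API.

```python
# Hypothetical alias table: inline prefix -> contract field.
# The five field names come from the commit; the full prefix list is not given there.
KNOWN_PREFIXES = {
    "outcome": "outcome",
    "verify": "verification",
    "verification": "verification",
    "constraints": "constraints",
    "boundaries": "boundaries",
    "stop when": "stop_when",
}


def parse_goal(text: str) -> tuple[str, dict[str, str]]:
    """Split goal text into (objective, contract) using known prefixes only."""
    objective_lines: list[str] = []
    contract: dict[str, str] = {}
    for line in text.splitlines():
        head, sep, tail = line.partition(":")
        field = KNOWN_PREFIXES.get(head.strip().lower())
        if sep and field and tail.strip():
            contract[field] = tail.strip()
        else:
            # Unknown prefix or no colon: the line stays part of the objective.
            objective_lines.append(line)
    return "\n".join(objective_lines).strip(), contract
```

With this shape, `Fix bug: the login page` stays intact as the objective because `fix bug` is not a known prefix, while a following `verify: pytest passes` line becomes the `verification` field.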
Teknium	ff85af3fc7	feat(goals): /goal wait <pid> — park the loop on a background process (#50503 ) * feat(goals): add /goal wait <pid> barrier to park the loop on a background process The /goal loop re-pokes the agent every turn via the post-turn judge. When a goal is gated on a long-running background process (CI poller, build, test matrix, deploy) that produces nothing to judge yet, this spins the agent into 'is it done?' busy-work and burns the turn budget. /goal wait <pid> [reason] parks the loop: while the PID is alive, the judge is skipped, no turn is consumed, no continuation fires, and /goal status shows a parked indicator. The barrier auto-clears the moment the process exits (the agent's notify_on_complete watcher is the natural wake signal), then the next turn resumes normal judging. /goal unwait clears it manually; pause/resume/clear drop it; a dead/stale PID can never wedge the loop. Wired across CLI, gateway, and the mid-run command guard for parity. Barrier persists in SessionDB.state_meta (survives /resume); GoalState gains backward-compatible waiting_on_pid/waiting_reason/waiting_since fields. 12 new tests; docs updated. * fix(goals): use gateway.status._pid_exists for liveness, not os.kill(pid,0) The Windows-footguns CI guard flagged os.kill(pid, 0) in _pid_alive — on Windows that's not a no-op, it routes to CTRL_C_EVENT and hard-kills the target's console process group (bpo-14484). Delegate to the canonical footgun-safe gateway.status._pid_exists (psutil + ctypes/POSIX fallback) instead, with a direct-psutil last resort. * feat(goals): judge-driven auto-wait — the loop parks itself, no manual /goal wait Makes the wait barrier automatic. 
Every turn the judge is shown the agent's live background processes (pid, command, uptime, output tail from the process_registry) alongside the goal + response, and can return a new 'wait' verdict instead of continue: {"verdict":"wait","wait_on_pid":N} → park until that process exits {"verdict":"wait","wait_for_seconds":N} → park until the deadline passes evaluate_after_turn acts on the directive (sets the barrier, parks the loop) so the agent isn't re-poked into busy-work while CI/builds/deploys run. Adds a time-based waiting_until barrier alongside the pid barrier; both auto-clear and can never wedge the loop. Drivers (CLI, gateway, tui_gateway) feed the live registry in via gather_background_processes(). Manual /goal wait stays as an override. Judge verdict contract widened to (verdict, reason, parse_failed, wait_directive); legacy {"done":bool} shape still accepted. * test(goals): update kanban _fake_judge to the 4-tuple judge contract CI test(3) caught it: test_kanban_goal_mode's _fake_judge still returned the 3-tuple (verdict, reason, parse_failed), but the kanban loop now unpacks the 4-tuple (+ wait_directive). Update the fake to return None for the directive and accept the background_processes kwarg. * feat(goals): trigger-based wait — park on a process's own signal, not just exit Addresses two gaps in the judge-driven wait: (1) the judge could only express 'wait until PID exits' or 'wait N seconds', so a long-lived watcher/server that fires a trigger MID-RUN (and may never exit) couldn't be waited on; (2) the process's own watch_patterns/notify_on_complete trigger was invisible to the judge. Adds a session-based barrier (waiting_on_session) that releases on the process's OWN trigger via process_registry.is_session_waiting(): the session exits, OR (if started with watch_patterns) its pattern matches — even while the process keeps running. 
list_sessions() now surfaces session_id + watch_patterns/watch_hit/notify_on_complete so the judge sees the trigger and is told to prefer wait_on_session for trigger processes. Judge verdict gains a {wait_on_session} directive (preferred over pid). Backward-compatible GoalState field; pid + time barriers unchanged. Tests: TestSessionTriggerBarrier (release on mid-run pattern match while alive, release on exit, unknown-session, full park→trigger→resume, parse, validation, backcompat load). 105 goal-surface + 85 process_registry tests green.	2026-06-22 06:27:29 -07:00
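The widened judge contract from this commit series, `(verdict, reason, parse_failed, wait_directive)` with the legacy `{"done": bool}` shape still accepted and `wait_on_session` preferred over pid, could be parsed along these lines. This is a sketch only: `parse_verdict` is a hypothetical name, and falling back to `continue` on a malformed reply or an invalid wait directive is an assumption, not something the log states.

```python
import json

# Order encodes the preference from the log: session trigger first, then pid, then time.
_WAIT_KEYS = ("wait_on_session", "wait_on_pid", "wait_for_seconds")


def _valid(key, value):
    if key == "wait_on_session":
        return isinstance(value, str) and bool(value)
    # bool is an int subclass in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool) and value > 0


def parse_verdict(raw):
    """Return (verdict, reason, parse_failed, wait_directive)."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        data = None
    if not isinstance(data, dict):
        return ("continue", "unparseable judge reply", True, None)
    reason = str(data.get("reason", ""))
    # Legacy shape: {"done": true/false}
    if "verdict" not in data and isinstance(data.get("done"), bool):
        return ("done" if data["done"] else "continue", reason, False, None)
    verdict = data.get("verdict")
    if verdict in ("done", "continue"):
        return (verdict, reason, False, None)
    if verdict == "wait":
        for key in _WAIT_KEYS:
            if key in data and _valid(key, data[key]):
                return ("wait", reason, False, {key: data[key]})
    # Assumed fallback: an unknown verdict or a bad directive keeps the loop running.
    return ("continue", reason, True, None)
```

Returning a single-key directive keeps the caller's job simple: it sets exactly one barrier (session, pid, or deadline) and parks the loop.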
teknium	e9cd8c5bf3	fix(delivery): drop env-var knob, flag all chunking adapters Follow-up to ScotterMonk's cron-truncation fix: - Remove HERMES_DELIVERY_MAX_PLATFORM_OUTPUT env var. Behavioral config belongs in config.yaml, not a new HERMES_* env var (.env is secrets only). The actual bug is fixed entirely by the adapter-aware skip; the configurable cap was unneeded scope. MAX_PLATFORM_OUTPUT is a constant again, collapsing the max_output=0 disable branch and the audit-vs-truncation threshold divergence. - Flag the remaining verified-chunking adapters (slack, matrix, feishu, mattermost, teams, whatsapp, whatsapp_cloud, weixin, bluebubbles, yuanbao) with splits_long_messages=True so the fix covers the whole bug class, not just Discord/Telegram. Each verified to chunk in its own send() via truncate_message(). - SMS deliberately left False: it chunks for normal replies but a multi-segment cron blast is cost-bearing; the 4000-cap + file save is the safer default there. - Update tests: drop the two env-override tests, add a test asserting a save failure during truncation (non-chunking) propagates.	2026-06-22 05:41:22 -07:00
ScotterMonk	86e4521cb1	fix(delivery): make cron output truncation configurable + adapter-aware Gateway-level truncation (MAX_PLATFORM_OUTPUT=4000) was pre-empting adapter-side message splitting. Discord and Telegram both chunk long content natively in their send() via truncate_message(), but the delivery router truncated to 3800 chars + footer before the adapter ever saw the full payload — so long cron output was cut short instead of being delivered as multiple messages (issue #50126). Changes: - HERMES_DELIVERY_MAX_PLATFORM_OUTPUT env var makes the cap configurable (default 4000, backward compatible). Set to 0 to disable truncation. - TRUNCATED_VISIBLE (3800) removed — visible portion now derived dynamically from max_output minus the actual footer length. - New BasePlatformAdapter.splits_long_messages capability flag (default False). Adapters that chunk in send() set True; delivery skips truncation for them but still saves full output to disk as audit. - Flagged Discord and Telegram (both verified to chunk in send()). Fixes #50126	2026-06-22 05:41:22 -07:00
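Read together with the follow-up commit e9cd8c5bf3 listed above it (which makes `MAX_PLATFORM_OUTPUT` a constant again), the adapter-aware skip amounts to roughly the following. This is a sketch under assumptions: `prepare_delivery` and the footer wording are invented for illustration, and the on-disk audit save is omitted.

```python
MAX_PLATFORM_OUTPUT = 4000  # constant cap, per the follow-up commit


def prepare_delivery(text: str, splits_long_messages: bool, saved_path: str) -> str:
    """Return the payload handed to the adapter's send()."""
    # Adapters that chunk in send() receive the full payload untouched.
    if splits_long_messages or len(text) <= MAX_PLATFORM_OUTPUT:
        return text
    # Hypothetical footer text; the visible portion is derived from the
    # actual footer length rather than a fixed 3800-character constant.
    footer = f"\n\n[truncated; full output saved to {saved_path}]"
    visible = MAX_PLATFORM_OUTPUT - len(footer)
    return text[:visible] + footer
```

Deriving `visible` from the footer length means the truncated message lands exactly on the cap regardless of how long the saved path is.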
Ben Barclay	75a70d98f3	feat(relay): forward a stable instance id at self-provision (Phase 6 Unit α) (#50772) Add relay_instance_id() (env GATEWAY_RELAY_INSTANCE_ID first, then gateway.relay_instance_id in config.yaml, mirroring the other relay readers) and forward it in the /relay/provision body so the connector can bind gatewayId -> instanceId and route inbound per-instance once Phase 6 delivery lands. The value is gateway-asserted but safely scoped: the org/tenant stays NAS-token-verified at the connector, so a dishonest gateway can only bind its OWN tenant's instance, the same posture as relay_endpoint(). instanceId is only added to the body when present, so omitting it lets the connector store null (back-compat: self-hosted / pre-Phase-6 gateways simply have no binding yet). For a managed (NAS-hosted) agent the id is NAS's AgentInstance.id, stamped into the container env beside GATEWAY_RELAY_URL. Tests: reader (env/config/absent), self_provision_relay forwards the id (set + absent), and the real _post_provision body includes instanceId ONLY when set. Refs: ~/nous/specs/gateway-gateway plan.md Phase 6 Unit α; decisions.md Q11.	2026-06-22 21:46:59 +10:00
kshitij	1f28b1a9b9	fix(gateway): redact credentials from approval prompts before sending to clients (#48456) (#50767) Tirith redacts its own findings, but the approval-request callbacks built the operator prompt from the RAW command string, so a credential-shaped value Tirith flagged was sent verbatim to clients, undoing the redaction one layer up. Two egress transports carried the leak; both are fixed via a shared module-level seam _redact_approval_command() (redact_sensitive_text force=True): 1. chat platforms: _approval_notify_sync (gateway/run.py) redacts before both the button path (send_exec_approval) and the plain-text /approve fallback. 2. SSE/API stream: _approval_notify (gateway/platforms/api_server.py) redacts event['command'] before it is enqueued to API/desktop clients. (whole-bug-class: sibling call path on a separate transport.) force=True so the prompt, a hard secret-egress boundary, honors redaction even when security.redact_secrets is off. Clean commands pass through unchanged. Tests bind the seam (synthetic credential-format fixtures, force-when-disabled) AND assert BOTH callbacks ASSIGN the redacted result before the send/enqueue sink, via an AST contract that rejects a discarded-result call. All mutation-checked.	2026-06-22 11:39:45 +00:00
Ben Barclay	64a507da44	feat(relay): handle passthrough_forward over the WS (Phase 5 §5.1, gateway half) (#50702 ) The connector half (gateway-gateway) moves the passthrough plane's post-ACK forward off the HTTP gatewayEndpoint onto the gateway's outbound /relay WS via a new passthrough_forward frame. This is the gateway side: the relay adapter now RECEIVES and handles that frame, so a hosted gateway (no public IP) can process forwarded Class-2/3 traffic (Discord interactions, Twilio) over the socket it already holds — closing the "passthrough inbound doesn't work for hosted gateways" gap. - ws_transport.py: decode the passthrough_forward frame; PassthroughForward dataclass + _passthrough_from_wire (base64 body -> exact bytes, byte parity with the connector's toPassthroughForward); set_passthrough_handler mirrors set_interrupt_inbound_handler. - transport.py: PassthroughHandler type + set_passthrough_handler on the RelayTransport protocol. - adapter.py: connect() wires the passthrough handler; _on_passthrough decodes the (already-sanitized, token-free) forward and, for a Discord interaction, converts it to a MessageEvent routed through the normal agent path (handle_message) — the reply egresses over the outbound / token-less follow_up path, so the gateway never holds the interaction credential. Never raises (a bad forward can't kill the read loop). Non-discord forwards (Twilio) are logged + dropped for now. - docs/relay-connector-contract.md: document the passthrough_forward frame + PassthroughForward shape + §3.1. The interaction -> MessageEvent CONVERSION semantics (slash-command vs button UX, option rendering) are the open sub-design flagged in the spec; the TRANSPORT + receive mechanism (this) is settled per Ben's Gate-2 decision: "the relay adapter handles receiving these events over the WS." 
Tests (tests/gateway/relay/test_relay_passthrough.py): byte-preservation round-trip (+ malformed-body tolerance), connect() wiring, application-command and message-component interactions route through handle_message with correct session source + scope capture, malformed/non-discord forwards dropped cleanly. 100 relay tests green. Pairs with the connector PR (gateway-gateway).	2026-06-22 20:10:57 +10:00
Shannon Sands	5dae502b86	Address email pairing review feedback	2026-06-21 22:43:57 -07:00
Shannon Sands	2455e1801b	Make email pairing opt-in	2026-06-21 22:43:57 -07:00
Shannon Sands	4b09903de5	fix Nous auth refresh for idle agents	2026-06-21 22:43:48 -07:00
teknium1	4314d451ca	fix(gateway): accept any inbound file type across all messaging platforms Authorization to message the agent is the gate, not the file extension. Previously the inbound-attachment allowlist (SUPPORTED_DOCUMENT_TYPES) was opt-OUT on Discord (allow_any_attachment defaulted false) and had no bypass at all on Telegram/Slack — so an .html (or any non-allowlisted type) was dropped or hard-rejected before the agent saw it. Now every authorized upload is cached and surfaced to the agent regardless of type: - base.cache_media_bytes(): unknown types cache as octet-stream (or the caller-supplied MIME) instead of returning None — fixes the chokepoint that Teams/Telegram-media route through. - discord/telegram/slack adapters: removed the allowlist reject/skip; any non-media attachment is typed DOCUMENT and cached. Known types keep their precise MIME. - Text inlining now gates on a shared _TEXT_INJECT_EXTENSIONS set (text + code + config + markup) instead of a blind UTF-8 decode, so binary formats (PDF/zip/docx) with ASCII headers are never inlined. - gateway/run.py emits the path-pointing context note for every DOCUMENT, including non text/application MIME types. - discord.allow_any_attachment is now a documented no-op kept for config back-compat. Validation: 357 gateway tests pass; E2E confirms .html/.bin/custom types cache, known types stay precise, PDFs are not inlined.	2026-06-21 22:43:45 -07:00
Ben Barclay	de6b3ae377	fix(terminal): bridge docker_extra_args to TERMINAL_DOCKER_EXTRA_ARGS in CLI + gateway (#50631) terminal.docker_extra_args passes flags verbatim to `docker run` (e.g. --gpus=all, --shm-size=16g). It was wired into DEFAULT_CONFIG, TERMINAL_CONFIG_ENV_MAP (so `hermes config set` bridged it), terminal_tool._get_env_config (reads TERMINAL_DOCKER_EXTRA_ARGS), and DockerEnvironment (applies extra_args) -- but it was MISSING from cli.py's env_mappings and gateway/run.py's _terminal_env_map. Consequence: a user who hand-edits config.yaml (rather than running `hermes config set`) has docker_extra_args silently dropped on the CLI and gateway/desktop startup paths, while docker_image / docker_volumes (which ARE in those maps) bridge correctly -- producing the reported 'Hermes partially reads the Docker config' symptom where --gpus=all and --shm-size=16g never reach docker run. This is the same bridge-coverage bug class that shipped before for docker_run_as_host_user (cli + gateway) and docker_mount_cwd_to_workspace (gateway). Fix by adding the key to both maps, plus a dedicated regression pin in test_terminal_config_env_sync.py mirroring the existing test_docker_*_is_bridged_everywhere guards.	2026-06-22 15:41:23 +10:00
teknium1	f45ace9318	feat(security): startup security posture audit (warn-on-load) Surface dangerous host/deployment posture at gateway startup so operators get the 'you're exposed' signal the June 2026 MCP-config persistence campaign victims never had. Warn-only — never blocks startup, never raises. Checks (each independently fail-safe): - Running as root (POSIX uid 0) - SSH daemon with PasswordAuthentication enabled (incl. the 'yes' default) - Running in a container with no persistent volume mount over HERMES_HOME - Network-accessible API server with no API_SERVER_KEY New module hermes_cli/security_audit_startup.py; invoked once per process from start_gateway() right after setup_logging(). Cross-platform (root/SSH checks no-op on Windows). Idea: @Cthulhu.	2026-06-21 19:05:27 -07:00
teknium1	7726ce3040	fix(security): close hermes-0day MCP-persistence attack surface Remove the dashboard --insecure auth-bypass, add an MCP persistence guard + IOC blocklist, and raise the API-server key entropy floor. Driven by the June 2026 hermes-0day campaign (r/hermesagent, live 854.media instance): scanners find exposed Hermes dashboards/API servers, drive the root agent to plant a 'command: bash' MCP entry that appends an attacker SSH key to authorized_keys, which cron + startup then re-execute every tick. - dashboard: --insecure no longer disables the auth gate. should_require_auth returns True for every non-loopback bind; a public bind ALWAYS requires an auth provider (bundled password provider or OAuth). --insecure kept as a warned no-op for backward compat. Fail-closed error now points at the password provider, not at --insecure. - mcp_security: validate_mcp_server_entry now also rejects shell payloads that write to OS persistence surfaces (authorized_keys/.ssh/pam.d/sudoers/cron/ rc files) and hard-rejects a hermes-0day IOC blocklist (attacker SSH key + source IPs) anywhere in command/args/env. Runs at save AND spawn time. - api_server: raise network-bind API_SERVER_KEY entropy floor 8->16 chars; warn when a network-accessible API server runs an unsandboxed local backend.	2026-06-21 19:05:27 -07:00
teknium1	012f40c98c	fix(status): cross-platform start-time fingerprint via psutil fallback The PID-reuse guard (#43846) reads /proc/<pid>/stat field 22, which only exists on Linux — on macOS/Windows it returned None and the guard silently degraded to a bare liveness check (a no-op, safety-wise). Add a psutil.create_time() fallback (psutil is a hard dep, cross-platform), quantized to centiseconds for stable equality, so the recycled-PID guard actually protects macOS/Windows too. /proc always wins first on Linux and always misses on macOS/Windows, so the two sources never mix on one host and same-source equality is all the guard needs.	2026-06-21 17:23:33 -07:00
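The two start-time sources named in this commit can be illustrated with a pair of pure helpers. This is a hedged sketch, not the code in gateway.status: `starttime_from_stat` and `quantize_create_time` are hypothetical names, and the second takes the float that `psutil.Process(pid).create_time()` would return rather than calling psutil itself.

```python
from typing import Optional


def starttime_from_stat(stat_line: str) -> Optional[int]:
    """Extract field 22 (starttime, in clock ticks) from a /proc/<pid>/stat line."""
    # Field 2 (comm) is parenthesised and may contain spaces or ')',
    # so split after the LAST ')' instead of splitting on whitespace.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    # 'rest' begins at field 3 (state), so field 22 sits at index 22 - 3 = 19.
    try:
        return int(fields[19])
    except (IndexError, ValueError):
        return None


def quantize_create_time(epoch_seconds: float) -> int:
    """Quantize an epoch timestamp to centiseconds for stable equality checks."""
    return int(round(epoch_seconds * 100))
```

The two values are in different units (ticks since boot versus centiseconds since the epoch), which is harmless for the reason the commit gives: a single host only ever uses one source, and the guard only compares a fingerprint against another from the same source.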
xxxigm	242ec45f45	fix(gateway): don't lazy-install SDKs for unconfigured platforms on startup For adapter plugins, ``PlatformEntry.check_fn`` doubles as a lazy installer: calling it pip-installs the platform SDK as a side effect (see e.g. ``plugins/platforms/discord/adapter.py::check_discord_requirements``). The enablement sweep in ``_apply_env_overrides`` called ``check_fn`` for every registered plugin platform unconditionally, so a single ``load_gateway_config()`` — which the desktop/dashboard readiness probe ``GET /api/status`` awaits synchronously — pip-installed Discord, Telegram, Slack, Feishu and Dingtalk even when the user configured none of them (``platforms: none``). On a slow or restricted network the installs ran long enough to block the event loop past the desktop's readiness timeouts, so the app timed out, killed and re-spawned the backend, and boot-looped (stuck at 94%). Consult the cheap ``is_connected`` credential check FIRST and only run the install-triggering ``check_fn`` for platforms that are already enabled or actually configured. Auto-enable-by-credentials is unchanged: a platform with its token set still gets its SDK installed and enabled.	2026-06-21 16:41:17 -07:00
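The ordering fix in this commit (consult the cheap credential check first, and only then call the install-triggering `check_fn`) can be sketched as below. The `PlatformEntry` here is a minimal stand-in with zero-argument callables, and `enable_platforms` is a hypothetical name; the real registry entry and `_apply_env_overrides` signatures are not shown in the log.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class PlatformEntry:
    """Hypothetical minimal stand-in for a registered plugin platform."""
    name: str
    is_connected: Callable[[], bool]  # cheap: are credentials configured?
    check_fn: Callable[[], bool]      # expensive: may pip-install the SDK


def enable_platforms(entries: Iterable[PlatformEntry], enabled: set) -> set:
    """Return the enabled set, running check_fn only where it is warranted."""
    result = set(enabled)
    for entry in entries:
        # Skip platforms that are neither enabled nor configured, so their
        # SDKs are never installed as a side effect of loading config.
        if entry.name not in enabled and not entry.is_connected():
            continue
        if entry.check_fn():
            result.add(entry.name)
    return result
```

A platform with its token set still reaches `check_fn`, so auto-enable-by-credentials behaves as before; only unconfigured platforms stop paying the install cost.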
teknium1	4d4ba0831e	refactor(session): simplify traversal guard to a helper + logger, harden non-leading separators Follow-up to the salvaged #9560 fix: - Replace the _TRAVERSAL_RE regex with an explicit _is_path_unsafe() helper (drops the now-unused `import re`); catches a path separator ANYWHERE, not just leading, so a non-leading Windows backslash can't slip through. - Switch the per-entry skip in _ensure_loaded_locked from print() to logger.warning to match the module's logging conventions. - Add AUTHOR_MAP entry for the contributor. - Add regression tests for the non-leading-separator case.	2026-06-21 15:23:36 -07:00
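A helper in the spirit of `_is_path_unsafe()` might look like the sketch below. Only the "separator anywhere, including a non-leading Windows backslash" rule comes from the commit; the function name without the underscore, the empty-string and dot-name checks, and the NUL-byte check are assumptions added for the example.

```python
def is_path_unsafe(name: str) -> bool:
    """Return True if a session entry name could escape its directory."""
    # Assumed extras: empty names and the special dot entries are rejected.
    if not name or name in (".", ".."):
        return True
    # Reject a separator ANYWHERE in the name, not just at the start,
    # so 'abc\\..\\secret' is caught on Windows as well as '../secret'.
    return "/" in name or "\\" in name or "\x00" in name
```

An explicit membership test like this is easier to audit than a regex anchored at the start of the string, which is the gap the refactor describes closing.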


2193 commits