Commit graph

4414 commits

Author SHA1 Message Date
teknium1
486d632cc2 fix(auxiliary): coerce None final.output to empty list in Codex aux adapter
Closes #33368.

`_CodexCompletionsAdapter.create()` iterates `final.output` from the
Codex Responses stream. The event-driven consumer (introduced in #33042)
always sets `final.output` to a list, so this shape can't come from our
own code path. But:

- Mocked clients in tests can return a typed Response with `output=None`
- Third-party shims / compatibility layers that bypass the consumer can
  do the same
- A future code path that wraps a different consumer could regress

The old code `getattr(final, "output", [])` returns `None` (not the
default `[]`) when the attribute EXISTS but is `None`. Iterating
`None` then raises `TypeError: 'NoneType' object is not iterable` —
the exact error logged by title-generation when this fires.

Fix: `getattr(final, "output", None) or []` — single-line defensive
coerce. Cheap; zero risk.

Regression test asserts the auxiliary path handles a final whose
`.output` is `None` (via monkey-patched consumer) without raising and
returns the expected chat.completions-shaped response.

Reporter: @pavegrid-1 (issue #33368).
2026-05-27 11:08:21 -07:00
Teknium
9919caff46
feat(image_gen): add Krea provider plugin (Krea 2 Medium + Large) (#33236)
* feat(image_gen): add Krea provider plugin (Krea 2 Medium + Large)

New built-in image_gen backend wrapping Krea's Krea 2 foundation
image model family. Auto-discovered like the other image_gen plugins
and appears in 'hermes tools' → Image Generation → Krea.

Krea's API is asynchronous — submit returns a job_id, poll /jobs/{id}
until terminal. The provider hides that behind the synchronous
ImageGenProvider.generate() contract: submit, poll every 2s with
light backoff (max 5s), 3-minute ceiling matching Krea's hosted-tool
timeout. Result URL is materialised to $HERMES_HOME/cache/images/
to avoid CDN-expiry 404s downstream (same fix as xAI #26942).

Models:
- krea-2-medium (default — Krea's 'start here' recommendation)
- krea-2-large

Aspect ratios map landscape→16:9, square→1:1, portrait→9:16.
Resolution: 1K (Krea's only current option).

Kwarg passthrough: seed, creativity (raw/low/medium/high), styles,
image_style_references (capped 10), moodboards (capped 1) — matches
Krea's per-request limits. Unknown kwargs are ignored.

Config knobs (config.yaml):
  image_gen.provider: krea
  image_gen.krea.model: krea-2-medium | krea-2-large
  image_gen.krea.creativity: raw | low | medium | high
Env overrides: KREA_API_KEY (required), KREA_IMAGE_MODEL.

KREA_API_KEY is registered in OPTIONAL_ENV_VARS so 'hermes setup'
prompts for it.

31 new tests; image_gen suite + picker + tools_config: 211/211.

* fix(image_gen/krea): address review feedback

- Update KREA_API_KEY setup URL to the canonical token-creation page
  (https://www.krea.ai/app/api/tokens). The previous URL returned 404.

- Fail fast on non-retryable HTTP statuses during poll. The previous
  loop retried every HTTPError for the full 180s deadline, so an auth
  (401), billing (402), forbidden (403), or not-found (404) response
  would make image_generate hang for three minutes. Only retry
  transient statuses (408/409/425/429/5xx); surface everything else
  immediately.

- Add 5 tests covering fail-fast on 401/403/404 and retry on 429/503.

* fix(krea): point users at the real API token dashboard URL

Three call sites linked users to dashboard pages that don't exist:
- hermes_cli/config.py: https://www.krea.ai/app/api/tokens
- plugins/image_gen/krea/__init__.py get_setup_schema: https://www.krea.ai/api-keys
- plugins/image_gen/krea/__init__.py auth_required error: https://www.krea.ai/api-keys

Per Krea's own docs (https://docs.krea.ai/developers/api-keys-and-billing),
the real dashboard URL is https://www.krea.ai/settings/api-tokens. All three
sites now point there.
2026-05-27 11:01:47 -07:00
Dora (kyra-nest)
bcae3fcc4e fix(honcho): align user context peer perspective
Use the shared observer/target resolver for session context so peer='user' and explicit configured peer IDs query Honcho from the same assistant-observed perspective when allowed. Add regression coverage for user alias, explicit peer, and self-observer fallback.
2026-05-27 10:49:33 -07:00
David Doan
1800a1c796 fix(honcho): align peer-card read and write paths
honcho_profile(peer="user") returned an empty card even when Honcho
held a populated peer card for the user. Two independent bugs combined
to produce the symptom:

1. Read path: get_peer_card() called _fetch_peer_card(observer, target=user),
   which hits GET /peers/{observer}/card?target={user} — the observer's local
   card of the user. On self-hosted Honcho v3 this slot is empty unless writes
   also use it. The peer card lives on the user peer itself
   (GET /peers/{user}/card). Add a fallback: when the observer-target slot is
   empty and a target exists, retry against the target peer's own card.

2. Write path: set_peer_card() resolved only the target peer and called
   user_peer.set_card(card). The read path uses the assistant peer as
   observer, so writes and reads addressed different Honcho card scopes.
   Align set_peer_card() with _resolve_observer_target() so writes go to
   assistant_peer.set_card(card, target=user_peer_id), matching the read.

Both paths now use the same observer/target resolution, and the read
path additionally falls back to the target's own card for compatibility
with deployments where cards were written directly to the peer.

Closes: related to #13375, #17124, #20729
2026-05-27 10:49:33 -07:00
Erosika
1a8e67076a fix(honcho): cover pinUserPeer + aiPeer edge cases in setup, clone, and gateway cache
Three related regressions stemming from the pinUserPeer alias landing:

- Setup wizard read host-only fields when detecting current shape but the
  parser supports root-level config and gives host pinUserPeer higher
  precedence than pinPeerName. Re-running setup could mis-detect shape
  and silently flip routing. Detection now uses the same resolver order
  as HonchoClientConfig, and each shape branch scrubs every peer-mapping
  key before writing so a stale pinUserPeer=false can't outrank a freshly
  written pinPeerName=true. Multi no longer auto-writes
  userPeerAliases={} (was silently masking root-level baselines).

- clone_honcho_for_profile inherited pinPeerName but not pinUserPeer, so
  a default profile configured with the newer key produced cloned
  profiles without the pin.

- Gateway cache-busting signature fingerprinted Honcho user-peer fields
  but not ai_peer. Since HonchoSessionManager freezes cfg.ai_peer at
  init, mid-flight aiPeer edits kept assistant writes on the old peer
  until an unrelated cache eviction. ai_peer is now part of the
  signature.
2026-05-27 10:49:33 -07:00
Erosika
939499beed chore(honcho): trim PR-history narration from docs and tests
Remove "PR #14984 / #27371 / #1969" references and "the original key /
legacy / backwards-compatible / Port #N" narration from the honcho
plugin README, tests, and one stale code comment. These artefacts age
poorly: they describe how a change happened rather than what the code
does today, and they tax readers who weren't around for the original
work.

Also drop a dangling reference to scratch/memory-plugin-ux-specs.md in
__init__.py — the file isn't in the repo or git history.

No behaviour change.
2026-05-27 10:49:33 -07:00
Erosika
6feb2afd50 fix(honcho): plug pinPeerName transition gaps
Three correctness gaps when honcho.json's identity-mapping config changes
mid-flight:

1. The gateway's agent cache signature ignored honcho identity keys, so
   editing peerName / pinPeerName / userPeerAliases / runtimePeerPrefix
   was silently dropped until an unrelated cache eviction. Extend
   _extract_cache_busting_config to fingerprint the resolved honcho
   config so the AIAgent rebuilds on the next message.

2. cmd_setup let single → multi flips orphan the pinned-pool history
   under peerName without warning. Detect the transition, warn that
   runtime users will resolve to fresh empty peers, and auto-steer to
   hybrid (alias the operator's runtime IDs back to peerName) so the
   operator's own continuity survives. yes / no overrides available.

3. README didn't document the orphaning behaviour. Add a "Migrating
   single → multi" callout under Deployment shapes.

Tests:
- TestPinTransition (test_pin_peer_name.py): fresh-manager flip resolves
  to runtime, in-process flip is gated by the per-key session cache
  (documents the gateway-cache-must-bust contract), 3 cache-bust
  signature tests for pin / aliases / prefix.
- TestProfilePeerUniqueness: two profiles pinned to distinct peerNames
  resolve to distinct peers; host-level peerName overrides root when
  pinned.
- test_single_to_multi_steers_to_hybrid_by_default and
  test_single_to_multi_yes_override_keeps_multi (test_cli.py): wizard
  guard end-to-end coverage.
2026-05-27 10:49:33 -07:00
erosika
3cf5e8225d refactor(honcho): accept pinUserPeer as backwards-compatible alias for pinPeerName
The original key 'pinPeerName' from #14984 is ambiguous: a fresh
reader can't tell whether it pins the user peer or the AI peer from
the name alone.  The resolver only ever pins the user-side
(_resolve_user_peer_id short-circuits when pin_peer_name is true; the
AI peer is already pinned by construction via aiPeer).

Add 'pinUserPeer' as the canonical alias.  Both keys land on the
same internal pin_peer_name field; precedence is host pinUserPeer →
host pinPeerName → root pinUserPeer → root pinPeerName → default.
Host-level always beats root-level regardless of alias, so a host
block can still explicitly disable a root-level pin even via the new
key.

Make _resolve_bool variadic so it can express the four-value
precedence chain.  All existing callers pass two positional args +
default keyword, which the new signature accepts unchanged.

Internal var name (pin_peer_name) stays the same to keep the
cherry-picked #27371 commits clean and avoid a noisy rename diff.
2026-05-27 10:49:33 -07:00
erosika
0bac880991 feat(honcho-setup): add deployment-shape step to identity-mapping wizard
The PR #27371 resolver introduced three identity-mapping config keys
(pinPeerName, userPeerAliases, runtimePeerPrefix), but operators had
no guided way to set them — they had to read the README, understand
the resolver ladder, and hand-edit honcho.json.  This commit adds an
interactive step to 'hermes honcho setup' that asks one question
('what's your deployment shape?') and writes the right combination
of keys.

Three shapes cover the realistic deployments:

* single -- pinPeerName=true.  All gateway users collapse to your
            peerName.  Recommended for personal/single-operator use.

* multi  -- pinPeerName=false, no aliases.  Each runtime user gets
            their own peer.  Optional runtimePeerPrefix for cross-
            platform namespace isolation.

* hybrid -- pinPeerName=false, with userPeerAliases mapping YOUR
            runtime IDs (Telegram UID, Discord snowflake, Slack
            user, Matrix MXID) to peerName.  Multi-user gateway
            where you are a privileged operator.

A 'skip' option leaves existing identity-mapping config untouched —
critical because re-running setup must not silently wipe operator-
curated aliases.

The wizard detects the current shape from existing config so the
prompt's default matches what the operator already has.
2026-05-27 10:49:33 -07:00
erosika
c03960decd fix(honcho): include user_id in agent cache signature to prevent shared-thread peer contamination
PR #27371 introduced a per-user-peer resolver in HonchoSessionManager,
but the resolved runtime identity is frozen into the manager at first-
message init.  When the gateway session_key intentionally omits the
participant ID (the default for threads via thread_sessions_per_user=
False), a cached AIAgent created by user A is reused for user B's
messages, attributing B's writes to A's resolved Honcho peer and
breaking #27371's per-user-peer contract.

Fix by including user_id and user_id_alt in _agent_config_signature so
the cache key distinguishes participants in shared threads.  Each user
in a shared thread now triggers a fresh AIAgent build (trading prompt-
cache warmth for memory-attribution correctness — the right tradeoff
for an external-memory backend where misattribution is unrecoverable).

The default-None case keeps the signature byte-identical to pre-fix
behavior so this change doesn't invalidate in-flight caches on deploy.
2026-05-27 10:49:33 -07:00
erosika
00e6830204 fix(honcho): inherit identity-mapping config in cloned profile blocks
PR #27371 added host-scoped userPeerAliases, runtimePeerPrefix, and
pinPeerName, but the cloned-profile allowlist in
plugins/memory/honcho/cli.py::clone_honcho_for_profile() omitted them.
A new profile created via 'hermes honcho setup' or similar would
silently drop the operator's identity-mapping config, causing gateway
users to resolve to raw runtime IDs and fragmenting Honcho memory
across an unintended set of peers.

Add the three keys to the allowlist and a regression test class
covering all three plus the unset case.
2026-05-27 10:49:33 -07:00
mavrickdeveloper
30b391ab36 Avoid Honcho runtime peer collisions
(cherry picked from commit 4ae3c1a228)
2026-05-27 10:49:33 -07:00
mavrickdeveloper
382b1fc1b6 Cover Honcho runtime peer edge cases
(cherry picked from commit d89a57ea40)
2026-05-27 10:49:33 -07:00
mavrickdeveloper
2e3c6627ce Add Honcho runtime peer mapping
(cherry picked from commit 864cdb3d2e)
2026-05-27 10:49:33 -07:00
zccyman
2e181602a1 fix(agent): isolate credential pool on provider fallback
Closes #33163.

When _try_activate_fallback() switches from one provider to another (e.g.
openai-codex → openrouter), the credential pool still belongs to the
primary provider. This causes two compounding bugs:

1. The pool retains the primary's base_url. Downstream pool recovery
   (rate_limit / billing / auth) calls _swap_credential() with a primary
   entry which overwrites the agent's base_url back to the primary's
   endpoint. Every fallback request then 404s against the wrong host.

2. Pool recovery acting on errors from the FALLBACK provider mutates the
   PRIMARY's pool state (#33088 reported a related corruption pattern),
   exhausting/rotating entries that have nothing to do with the failure.

Two layered fixes:

a) try_activate_fallback (agent/chat_completion_helpers.py): on fallback
   activation, clear agent._credential_pool when the fallback provider
   doesn't match the pool's provider. Pool is preserved when the fallback
   shares the pool's provider (e.g. multiple openrouter entries).

b) recover_with_credential_pool (agent/agent_runtime_helpers.py):
   defensive guard rejects any pool mutation when agent.provider doesn't
   match pool.provider. Defense-in-depth — should never fire after (a)
   is in place, but covers any future path that attaches a stale pool.

Salvaged from @zccyman's PR #33217. The original PR was written against
the pre-refactor monolithic run_agent.py; both target functions have
since been extracted to module-level helpers. Behavior is identical —
the guards live in the canonical extracted locations.

Tests
- New tests/run_agent/test_fallback_credential_isolation.py (7 tests
  covering: fallback clears mismatched pool, fallback preserves matching
  pool, recovery rejects mismatched pool, recovery accepts matching
  pool, 429-from-z.ai-doesn't-exhaust-codex-pool, _client_kwargs
  base_url survives pool clear, _swap_credential doesn't restore
  primary URL after fallback).
- Cross-verified: 77/77 passing across fallback isolation tests +
  agent/test_credential_pool.py — no regression.

Co-authored-by: zccyman <16263913+zccyman@users.noreply.github.com>
2026-05-27 10:45:26 -07:00
JohnC1009
414a5bc924 fix(auth): fall back to global auth.json in _load_provider_state
In profile mode, _load_provider_state previously returned None when a
provider was absent from the profile's auth.json — even if the user had
authenticated at the global root. This broke runtime credential resolvers
that read state directly (resolve_nous_access_token,
resolve_nous_runtime_credentials), causing profiles without their own
nous login to fail with 'Hermes is not logged into Nous Portal' despite
a valid global session.

Push the existing read-only global fallback (already used by
get_provider_auth_state and read_credential_pool) into _load_provider_state
so every caller benefits, and simplify get_provider_auth_state into a thin
wrapper. Writes still target the profile only — profile state continues to
shadow global state on the next read after a per-profile login. Behavior in
classic (non-profile) mode is unchanged because _load_global_auth_store
returns an empty dict.

Adds 5 tests covering the new contract on _load_provider_state directly.
Existing 770 auth/credential/nous tests still pass.
2026-05-27 09:38:58 -07:00
LeonSGP43
458a94e425 fix(cli): keep destructive slash modal on Linux 2026-05-27 05:57:01 -07:00
Teknium
f0de3cd0a0
fix(agent): roll back switch_model() state when client rebuild fails (#33228)
Closes #33175.

switch_model() in agent/agent_runtime_helpers.py mutated agent.model and
agent.provider before rebuilding the client, with no try/except to restore
them on failure. If the rebuild raised (bad API key, network error,
build_anthropic_client failure, etc.) the agent was left with the new
model+provider name paired with the OLD client — producing HTTP 400s like
"claude-sonnet-4-6 is not supported on openai-codex" on the next turn.

Callers in cli.py, gateway/run.py, and tui_gateway/server.py already catch
the exception and warn the user, but the warning was misleading because
the swap had partially succeeded; the agent's state was torn.

Snapshot every mutated field before the swap, wrap the swap+rebuild block
in try/except, and restore the snapshot on failure before re-raising so
the caller's warning surfaces.

Reported by @amirariff91. Tests cover both branches (chat_completions and
anthropic_messages) and the cross-branch case (anthropic -> openai).
2026-05-27 05:43:20 -07:00
Teknium
b4eea187d5 fix(xai-oauth): gate slash-enum strip on model name + add regression tests (#28490)
Three additions on top of @Nami4D's salvage:

1. Gate the preflight slash-enum strip on the model name pattern
   (grok-* / x-ai/grok-*).  The original PR stripped slash-containing
   enum values from every codex_responses request, but native Codex
   (OpenAI) and GitHub Models DO accept slash enums — stripping them
   there would silently degrade tool-schema constraints.  xAI is the
   only Responses-API surface that rejects the shape.

2. Resolve the merge conflict in agent/transports/codex.py by
   preserving both the timeout-forwarding block that landed on main
   between the PR's branch point and now AND the new service_tier
   strip.  Behavioural intent of both is preserved.

3. Six new tests in tests/agent/transports/test_codex_transport.py
   covering:
   - TestCodexTransportXaiServiceTierStrip (3 tests): xAI strips
     service_tier from request_overrides; non-xAI codex_responses
     and GitHub Models both KEEP service_tier (regression guards
     so the strip stays xAI-only).
   - TestPreflightSlashEnumStrip (3 tests): Grok and aggregator-
     prefixed Grok model names both trigger the safety-net strip;
     non-Grok models preserve slash enums as a regression guard
     against the strip becoming too broad.

51/51 in tests/agent/transports/test_codex_transport.py.

Co-authored-by: Nami4D <hello@nami4d.tech>
2026-05-27 05:25:38 -07:00
Teknium
0325e18f34
fix(gateway): keep Telegram heartbeat + interim commentary on; edit heartbeat in place (#33187)
#33151 flipped THREE Telegram display defaults to false:
  - tool_progress: new -> off            (kept: per-tool stream is too chatty)
  - interim_assistant_messages: T -> F   (REVERTED here)
  - long_running_notifications: T -> F   (REVERTED here)
  - busy_ack_detail: T -> F              (kept: verbose iteration counter)

The two reverts were wrong. interim_assistant_messages = the model's REAL
words mid-turn ("I'll inspect the repo first.", "Let me check both files
in parallel"). That is signal, not noise. Suppressing it left Telegram
users staring at "typing..." for the entire turn duration with no
feedback. long_running_notifications = the periodic heartbeat. Silent
agent for 30 minutes is worse than one bubble updating every 3 minutes.

Changes:
  - gateway/display_config.py: Telegram tier-1 inbox keeps both defaults
    on (only tool_progress and busy_ack_detail stay off).
  - gateway/run.py _notify_long_running(): edit a single heartbeat
    message in place (where the adapter supports it) instead of posting
    a new "Still working..." bubble each interval. Telegram, Discord,
    Slack, Matrix all qualify. Falls back to send-new when edit fails.
  - gateway/run.py: tighten heartbeat text. " Still working... (12 min
    elapsed — iteration 21/60, running: terminal)" -> " Working — 12
    min, terminal". Verbose iteration detail moves behind busy_ack_detail
    (one knob now controls both busy acks AND heartbeat verbosity).
  - tests/, cli-config.yaml.example, website/docs/user-guide/messaging:
    updated to reflect the corrected story.
2026-05-27 05:21:53 -07:00
Teknium
69dfcdcc15 fix(auth): codex chat path falls back to credential_pool when singleton is empty
Closes #32992.

The chat path resolves Codex credentials via `resolve_codex_runtime_credentials`
which only reads `providers.openai-codex.tokens` (the singleton). The auxiliary
path uses `_read_codex_access_token` which checks the credential_pool first.
For users whose tokens live only in the pool — manual seed, partial re-auth,
restore from backup, or any state where the singleton is empty but the pool
is healthy — the chat path raised AuthError or (worse, since OpenAI(api_key='')
silently attaches no header) the wire saw HTTP 401 "Missing Authentication header"
while the auxiliary path worked fine.

This adds a pool fallback to `resolve_codex_runtime_credentials`: when the
singleton has no usable access_token, scan `credential_pool.openai-codex` for
the first entry that has a non-empty access_token and isn't in an exhaustion
cooldown window (`last_error_reset_at` in the future). If found, return that
token with `source="credential_pool"`. If no usable entry exists, the original
AuthError propagates as before.

Regression tests cover:
- Empty singleton + healthy pool entry → pool token returned
- Pool fallback skips entries currently in cooldown
- Empty singleton + empty/wedged pool → AuthError propagates (existing contract preserved)
2026-05-27 03:43:51 -07:00
helix4u
ea34925002 fix(discord): recover Windows voice opus decoding 2026-05-27 03:35:33 -07:00
Teknium
0b6ace6498 test(verbose): align with telegram tier-1 inbox default
Two tests in test_verbose_command.py asserted Telegram's tool_progress
default was "new" and expected /verbose to cycle that to "all". The
default has since been overridden to "off" in gateway/display_config.py
(_PLATFORM_DEFAULTS for telegram — tier-1 inbox preset that keeps mobile
chats final-answer-first), making the first /verbose invocation cycle
off → new, not all → verbose.

The behavioral change was intentional; the tests were stale and missing
from the same commit. Surfaced as a pre-existing failure on origin/main
during CI for the unrelated #33164 / #33168 Codex auth salvages.
2026-05-27 03:13:15 -07:00
konsisumer
f1422ffd77 fix(gateway): classify Codex 429 quota as rate-limit, not missing credentials
When the Codex OAuth token endpoint returns 429 (usage-limit / quota
exhaustion), refresh_codex_oauth_pure raised a generic auth error that the
gateway surfaced as 'Primary provider auth failed: No Codex credentials
stored. Run hermes auth', prompting re-auth that cannot lift a quota cap.

Classify 429 distinctly (codex_rate_limited, relogin_required=False) with a
non-alarming quota message that honors Retry-After, log it as
'Primary provider rate-limited (429)', and stop format_auth_error from
appending the re-authenticate remediation. Also log the fallback provider's
literal config key instead of the resolved runtime category.

Refs #32790
2026-05-27 03:13:15 -07:00
konsisumer
2bbd53493d fix(cli): sync credential_pool on Codex re-auth
Codex re-auth via `hermes setup` / `hermes model` wrote fresh OAuth
tokens to providers.openai-codex.tokens but left the credential_pool
device_code entry holding the consumed refresh token and stale error
markers. Since the runtime selects from the pool, the next request
spent a dead token and got a 401 token_invalidated. Update the
singleton-seeded pool entries in lockstep and clear their error state.

Fixes #33000
2026-05-27 03:02:06 -07:00
houenyang-momo
60f84c6c28 gateway: quiet Telegram operational chatter 2026-05-27 02:41:24 -07:00
Robert DaSilva
efa952531b fix: ignore Telegram start pings 2026-05-27 02:41:24 -07:00
sir-ad
8807b1c727 fix(gateway): hide telegram compaction status noise 2026-05-27 02:41:24 -07:00
chaconne67
9c69204d87 fix(codex_responses_adapter): drop foreign-issuer reasoning on replay
reasoning.encrypted_content is sealed to the Responses endpoint that
minted it. When a session switches model providers mid-conversation —
say the user runs /model gpt-5.5 after several turns on grok-4.3, or
vice versa — the persisted codex_reasoning_items carry blobs the new
endpoint cannot decrypt, and every subsequent turn fails with HTTP 400
invalid_encrypted_content.

This is the cross-issuer prevention layer. Pairs with:
* PR #33035 — runtime recovery when the HTTP 400 fires anyway
* PR #33146 — prevention for transient rs_tmp_* items

Stamps each reasoning item with the issuer kind that minted it
(codex_backend / xai_responses / github_responses / other:<url>) at
normalize time, then drops items at replay time when the active
endpoint differs from the stamp. Unstamped (legacy) items pass
through for backwards compatibility.

Cherry-picked from @chaconne67's PR #31629. Conflict against current
main (#33035's replay_encrypted_reasoning parameter) resolved as
'keep both' — the two guards compose: replay_encrypted_reasoning=False
is the session-wide kill switch, current_issuer_kind is the per-item
filter that runs only when replay is still enabled.
2026-05-27 02:40:03 -07:00
Krishna
b1a46b3047 fix(codex): drop transient rs_tmp reasoning replay state 2026-05-27 02:25:59 -07:00
Teknium
187cf0f257
tools(terminal): nudge homebrewed CI pollers at the tool surface (#33142)
Background processes whose command contains `gh pr view --json
statusCheckRollup` or `gh pr checks | jq` now get a runtime hint in
the result pointing at the canonical green-ci-policy snippets. The
homebrew shape has caused at least seven silent CI-watcher failures
in the past two weeks (#31329, #31448, #31695, #31709, #31745,
#32264, #33131) — each one a different jq/awk/grep variation of the
same fundamental problem (stdout buffering, jq null-key edge cases,
conclusion-vs-status confusion, TTY-only banner grepping).

The skill that documents this anti-pattern is excellent, but a skill
only fires if the agent loads it. The tool surface fires on every
misuse. This is the embed-footguns-in-tool-surface pattern from
PR #31289 applied to a recurring failure mode that's outgrown
skill-only enforcement.

Detector is deliberately narrow — flags two specific shapes:

  1. Any command containing `statusCheckRollup` (the JSON-API path —
     conclusion vs status field semantics keep burning us).
  2. `gh pr view` / `gh pr checks` combined with `jq` (gh pr
     checks doesn't emit JSON, so any `| jq` here is confused intent;
     the canonical column-2 poller uses awk-on-tabs, not jq).

Does NOT flag the blessed column-2 awk-on-tabs poller (which uses
`awk -F"\t" "\==\"pending\""`) or the exit-code-driven
`gh pr checks $PR >/dev/null` snippet.

Hint composes with the existing background-without-notify_on_complete
hint — both can fire on the same call. Each is independently
actionable.

Tests:
- 4 new cases in tests/tools/test_notify_on_complete.py
- test_homebrew_ci_poller_via_statusCheckRollup_emits_hint (positive)
- test_homebrew_ci_poller_via_gh_pr_checks_piped_to_jq_emits_hint (positive)
- test_canonical_column2_awk_poller_does_not_emit_homebrew_hint (negative)
- test_canonical_gh_pr_checks_exit_code_loop_does_not_emit_hint (negative)
- test_non_ci_background_command_does_not_emit_homebrew_hint (negative)
- 30/30 passing (was 26)
2026-05-27 02:22:08 -07:00
Ben
a890389b69 feat(dashboard-auth): HERMES_DASHBOARD_PUBLIC_URL / dashboard.public_url override
Operators behind reverse proxies that don't reliably forward
X-Forwarded-Host / X-Forwarded-Proto / X-Forwarded-Prefix (manual
nginx setups, on-prem ingresses, custom-domain Fly deploys with
incomplete proxy chains) had no way to force the absolute base URL
the OAuth callback redirects from. The dashboard would reconstruct
the redirect_uri from request headers, the IDP would echo it back,
and the user would land on the wrong host or wrong path — 404.

Add `dashboard.public_url` to config.yaml with env override
HERMES_DASHBOARD_PUBLIC_URL. When set, it is the complete authority —
scheme + host + optional path prefix (e.g. https://example.com/hermes) —
and becomes the base for the OAuth `redirect_uri`. X-Forwarded-Prefix
is IGNORED on this code path because the operator has explicitly
declared the public URL; we no longer need to guess from proxy
headers, and stacking the prefix on top would double-prefix the
common case where the prefix is already baked into public_url.

When unset, the existing proxy_headers + X-Forwarded-Prefix
reconstruction runs untouched. Existing Fly.io deploys continue to
work without configuration — this is purely additive.

Precedence mirrors dashboard.oauth.client_id:

  env (non-empty) > config.yaml > reconstructed from request

Implementation:

  - hermes_cli/config.py: add dashboard.public_url to DEFAULT_CONFIG
    with a multi-paragraph doc comment explaining the use case,
    the X-Forwarded-Prefix interaction, and the validation rules.
  - hermes_cli/dashboard_auth/prefix.py: factored out the existing
    _REJECT_CHARS frozenset, added _normalise_public_url() validator
    (requires http/https scheme + non-empty host + no header-injection
    chars), _load_dashboard_section() loader (robust to load_config
    raising, non-dict shapes), and resolve_public_url() entry point
    with the env-overrides-config precedence. A malformed value
    silently falls through to ""; the caller treats "" as "reconstruct
    from request" so a typo never breaks the login flow.
  - hermes_cli/dashboard_auth/routes.py: rewrite _redirect_uri()
    docstring to spell out the three resolution tiers; add the
    public_url short-circuit before the existing X-Forwarded-Prefix
    splicing. Source-level comment notes that X-Forwarded-Prefix is
    intentionally ignored when public_url is set so a future reader
    doesn't try to "fix" the missing prefix layering.
  - cli-config.yaml.example: extend the existing dashboard section
    with a public_url block.
  - website/docs/user-guide/features/web-dashboard.md: new "Public
    URL override" section between the provider configuration and
    the OAuth flow walkthrough. Documents the env-vs-config table,
    the validation rules, and the `http://` `public_url` ↔ Secure
    cookie footgun.

Test coverage — new TestPublicUrlOverride class (8 tests):

  - env var overrides request reconstruction (the primary motivating
    case)
  - config.yaml used when env unset
  - env wins over config (precedence pin)
  - public_url with a path prefix already baked in (the Q1-a case the
    user explicitly chose)
  - public_url suppresses X-Forwarded-Prefix layering (defends
    against the double-prefix bug)
  - trailing slash stripped from public_url (no //auth/callback)
  - malformed public_url falls through to reconstruction (six
    hostile inputs: javascript:, ftp:, missing scheme, missing host,
    quote chars, CRLF injection)
  - empty env string doesn't shadow config.yaml entry (CI / Fly
    provisioned-but-empty secret case)

Mutation-tested: flipping the precedence in resolve_public_url() trips
exactly test_env_overrides_config_public_url; weakening the validator
(accept any scheme) trips exactly test_malformed_public_url_falls_through_to_reconstruction.
Both other tests in each pair stay green, confirming the suite
discriminates the specific regression each test pins.
2026-05-27 02:12:27 -07:00
Ben
61dcc33893 feat(dashboard-auth): config.yaml as canonical surface for dashboard.oauth
Per AGENTS.md, ~/.hermes/.env is reserved for API keys / secrets and
config.yaml is the surface for non-secret configuration. The Nous
Portal plugin previously read HERMES_DASHBOARD_OAUTH_CLIENT_ID and
HERMES_DASHBOARD_PORTAL_URL from the environment only, which forced
local-dev / on-prem operators to put non-secret per-instance
configuration in .env — violating the convention.

Add dashboard.oauth.{client_id,portal_url} to DEFAULT_CONFIG and have
the plugin resolve each setting with env-overrides-config precedence:

  1. Env var when set to a non-empty value (Fly.io platform-secret
     injection — what pushes per-deploy client_ids without baking
     them into the image).
  2. config.yaml entry (canonical surface for local dev / on-prem).
  3. Plugin default (no provider registered when client_id is empty;
     portal_url defaults to https://portal.nousresearch.com).

Empty env values are explicitly treated as unset so a provisioned-but-
not-populated Fly secret can't accidentally shadow a valid config.yaml
entry with an empty string — operators would otherwise lose the gate.

Implementation:

  - hermes_cli/config.py: add dashboard.oauth.{client_id,portal_url}
    block to DEFAULT_CONFIG with full doc comment explaining the
    override precedence and Fly.io rationale.
  - plugins/dashboard_auth/nous/__init__.py: add _load_config_oauth_section,
    _resolve_client_id, _resolve_portal_url helpers; replace the two
    direct os.environ.get() calls in register() with the resolvers.
    Update the skip-reason string to mention BOTH surfaces so an
    operator looking at the fail-closed bind error knows config.yaml
    is a valid alternative to the env var.
  - plugins/dashboard_auth/nous/plugin.yaml: update description to
    name both surfaces. requires_env stays pointing at the env var
    name — it's metadata-only (not used by the plugin loader for
    gating) so this is documentation/UX, not enforcement.
  - cli-config.yaml.example: append commented dashboard.oauth block
    with the same override rationale operators see in code.
  - website/docs/user-guide/features/web-dashboard.md: rewrite the
    'Default provider: Nous Research' section to lead with config.yaml,
    present env vars as operator overrides (Fly.io's primary path).
    Updated the example fail-closed bind error to match the new
    skip-reason text.

Test coverage — new TestConfigYamlSource class (8 tests) pinning
every tier of the precedence chain:

  - config-yaml-only path registers correctly
  - both config-yaml fields (client_id + portal_url) honoured
  - env var overrides config for client_id (Fly.io critical path)
  - env var overrides config for portal_url
  - empty env string does NOT shadow config (CI/Fly edge case)
  - neither source set → skip with reason mentioning BOTH surfaces
  - load_config() raising falls through to env-only path (resilience)
  - non-dict oauth section falls through cleanly (typo resilience)

Mutation-tested: flipping the precedence to config-wins-over-env trips
exactly test_env_overrides_config_client_id while the other 7 stay
green, confirming the suite discriminates the order, not just the
sources.

This closes the last item in Teknium's PR review (PR #30156).
2026-05-27 02:12:27 -07:00
Ben
b26d81d536 feat(dashboard-auth): honour X-Forwarded-Prefix + __Host-/__Secure- cookies
Mission-control style deploys reverse-proxy the dashboard at a path
prefix (e.g. mission-control.tilos.com/hermes/* -> :9119) and inject
X-Forwarded-Prefix: /hermes on every request. The SPA mount already
honoured this for asset URLs and the bootstrap __HERMES_BASE_PATH__,
but the OAuth gate didn't:

  1. The gate's Location: header to /login and the 401 envelope's
     login_url were built bare ("/login?next=..."). Under a /hermes
     prefix the browser follows that to mission-control.tilos.com/login
     which the proxy doesn't route to the dashboard.
  2. _redirect_uri (the OAuth callback URL handed to the IDP) used
     request.url_for() which doesn't honour X-Forwarded-Prefix
     (Starlette/uvicorn only proxy_headers Host + Proto + For). The
     IDP redirects back to /auth/callback instead of /hermes/auth/
     callback → 404 in the user's browser.
  3. Cookies were set with Path=/ which leaks them to other apps on
     the same origin and won't be sent back on requests under the
     prefix in the first place.

Fix threads the normalised prefix through every boundary:

  * New hermes_cli/dashboard_auth/prefix.py — single source of truth
    for X-Forwarded-Prefix parsing. web_server._normalise_prefix
    becomes a re-export so the SPA mount, the gate, and the cookies
    helper all agree.
  * middleware._unauth_response builds login_url = f"{prefix}/login".
  * routes._redirect_uri splices the prefix into the path component
    of the IDP-bound URL (with full validation of the header).
  * cookies.{set,clear}_{session,pkce}_cookie now take prefix="".
    Path attribute switches to /hermes when set; cookie name switches
    name variant (see below). Every caller passes the request's
    normalised prefix.

Cookie hardening (Teknium's lesser-note #1 in the PR review): adopt
the __Host- / __Secure- cookie name prefixes per draft-west-cookie-
prefixes. The variant is selected from (use_https, prefix):

  * Loopback HTTP → bare "hermes_session_at" (both prefixes require
    Secure, incompatible with HTTP).
  * HTTPS, direct deploy (Path=/) → "__Host-hermes_session_at".
    Strongest spec: bound to exact origin, no Domain attribute, Secure
    required.
  * HTTPS, behind a proxy prefix (Path=/hermes) →
    "__Secure-hermes_session_at". __Host- forbids Path != "/"; the
    explicit Path=/hermes covers same-origin app isolation.

Setter and reader BOTH consult the prefix because the cookie *name*
changes — a reader that looked up the bare name when the setter wrote
__Secure- would never find the value. The reader falls back across
all three variants so a request whose shape changed mid-session (e.g.
post-deploy from no-prefix to /hermes) still picks up the existing
cookie until it expires.

Test coverage:

  - tests/hermes_cli/test_dashboard_auth_prefix.py — new file. 11 tests
    pinning:
      • Location: /hermes/login on the gate's HTML redirect
      • 401 envelope login_url carries the prefix
      • Malformed X-Forwarded-Prefix is ignored (header-injection
        defence; the script-tag value is normalised to empty string)
      • _redirect_uri splices /hermes into the path (the property
        that prevents the IDP-returns-to-404 failure)
      • PKCE cookie uses Path=/hermes + __Secure- when proxied
      • Session cookies use __Host- when direct, __Secure- when
        proxied, bare on loopback HTTP
      • End-to-end round trip with hand-managed PKCE cookie carriage
        (TestClient can't simulate a Path=/hermes cookie automatically)
  - tests/hermes_cli/test_dashboard_auth_cookies.py — rewritten to pin
    each (use_https, prefix) shape produces its expected cookie name,
    plus reader-side coverage that __Host- and __Secure- variants are
    both recognised.
  - Existing tests across middleware / 401-reauth / etc. updated to
    match the new cookie names (substring contains instead of
    startswith).

Mutation-tested: reverting _unauth_response to build the bare
"/login" URL trips exactly the two tests that pin the prefix
carriage, confirming the suite discriminates the regression.
2026-05-27 02:12:27 -07:00
Ben
034ad95fed fix(dashboard-auth): propagate next= through login page + PKCE cookie
The gate's _unauth_response set next=<path> on the /login redirect URL,
but nothing downstream read it: render_login_html ignored next=,
auth_login dropped it, and auth_callback read next= from its own query
string — which an IDP never sets on the callback URL (real IDPs only
echo back code+state). The _validate_post_login_target plumbing in the
callback was unreachable on the happy path, so users always landed on
"/" regardless of what they originally requested.

Worse: reading next= from the callback URL was a latent open-redirect
sink, since an attacker could craft /auth/callback?...&next=/admin and
have the server honour it post-auth.

Fix carries next= through the round trip on a server-controlled channel:

  1. login_page reads request.query_params['next'] and passes it (post-
     validation) to render_login_html.
  2. render_login_html threads next= URL-encoded into each provider
     button's href, with HTML-attribute escaping as defence in depth.
  3. auth_login accepts ?next= as a query param, re-validates, and
     appends it as a fourth segment (next=<urlquoted>) in the PKCE
     cookie payload alongside provider/state/verifier.
  4. auth_callback no longer accepts a next: str = "" query param. It
     parses next= out of the PKCE cookie and validates that with the
     same same-origin rules. Any attacker-supplied ?next= on the
     callback URL is silently ignored — server-only carrier.

Test coverage adds three classes:

  - TestAuthCallbackNext drives /login → /auth/login → IDP-bounce →
    /auth/callback end-to-end without smuggling next= onto the callback
    URL (which is what the previous tests did and why they didn't
    catch the bug). Includes test_attacker_callback_next_param_is_ignored
    to pin the security property that the URL value is never read.
  - TestRenderLoginHtmlNext covers the rendering function at the
    unit boundary so a regression that drops next_path is caught
    without spinning up the full app.
  - TestAuthLoginPkceCookieNext inspects the Set-Cookie header on
    /auth/login responses so a regression in cookie encoding is caught
    without driving the full round trip.

Mutation-tested: reverting auth_callback to read next= from the URL
trips 3 of 6 TestAuthCallbackNext tests (the safe-path and attacker-
hardening ones), confirming the suite discriminates between the cookie
read and the URL read.
2026-05-27 02:12:27 -07:00
Ben
c3104195b8 fix(dashboard-auth): bypass loopback WS peer check in gated mode
When the OAuth gate is active, start_server runs uvicorn with
proxy_headers=True so the dashboard can honour X-Forwarded-Proto from
Fly's TLS terminator (cookies, redirect URI reconstruction). A side
effect: ws.client.host is rewritten to the X-Forwarded-For value, which
on Fly is the real internet client IP — never loopback. The loopback
peer guard in _ws_client_is_allowed then rejected every WS upgrade in
gated mode (4403 close) even after a successful OAuth round trip and
ticket consumption, silently breaking /api/pty, /api/ws, /api/pub, and
/api/events.

Fix: in gated mode, bypass the peer-IP check. The OAuth gate +
single-use ticket is the auth. The Host/Origin guard in
_ws_host_origin_is_allowed still runs and is what protects against
DNS-rebinding here, not the peer IP.

Loopback mode behaviour is unchanged: the legacy ?token= path is the
only auth there and we don't want LAN hosts guessing tokens.

Regression coverage: TestWsRequestIsAllowedGated pins all four
behaviours — non-loopback peer allowed in gated mode, non-loopback peer
rejected in loopback mode, loopback peer allowed in loopback mode, and
the Host/Origin guard still firing on a rebinding attempt with gated
mode + matching peer.
2026-05-27 02:12:27 -07:00
Ben
866cc988b5 fix(dashboard-auth): use fixed-length sig suffix in stub token framing
The stub auth provider's _sign/_unsign helpers joined payload and HMAC
with a 'b"."' separator and recovered the parts via bytes.rsplit. HMAC-SHA256
digests are random bytes, so ~12% of the time the digest contains 0x2E
('.') and rsplit picks the wrong split point -- HMAC verification then
spuriously rejects valid tokens.

test_stub_refresh_round_trips was failing ~25% of the time in isolation
because of this.

Switch to a fixed-length suffix (32 bytes, sliced off in _unsign): no
separator means no collision class. After the fix, 10/10 runs pass.
2026-05-27 02:12:27 -07:00
Ben
c598076b76 test(dashboard-auth): strip HERMES_DASHBOARD_OAUTH_* env vars in hermetic fixture
When these vars are set in the developer's shell, every /api/status call
triggers load_gateway_config() -> discover_plugins() -> the bundled
dashboard_auth/nous plugin auto-registers itself, leaking a provider into
the registry across tests on the same xdist worker. That breaks assertions
like 'auth_providers == []' (loopback) and '== ["stub"]' (gated) in
test_dashboard_auth_status_endpoint.py.

CI never has these set, so this only surfaced locally -- exactly the
hermeticity gap _hermetic_environment is meant to close. Add them to
_HERMES_BEHAVIORAL_VARS so the autouse fixture strips them, and to the
unset list in scripts/run_tests.sh as belt-and-suspenders for direct
pytest invocations.
2026-05-27 02:12:27 -07:00
Ben
a498485631 feat(dashboard-auth-nous): surface token iss/aud in verification-failure error
When jwt.decode raises InvalidTokenError, decode the token a second time
without signature verification (safe — we never trust the values, just
display them) and append the actual iss/aud claims plus our configured
expected values to the error message. Lets operators see config drift
between HERMES_DASHBOARD_PORTAL_URL / HERMES_DASHBOARD_OAUTH_CLIENT_ID
and what Portal is actually emitting without having to hand-decode the
JWT from the browser cookie.
2026-05-27 02:12:27 -07:00
Ben
b3dc539304 feat(dashboard-auth): Nous plugin always-on; default portal URL; specific error messages
The Nous OAuth provider plugin (plugins/dashboard_auth/nous) is bundled
and auto-loaded — same as before — but previously refused to register
unless BOTH HERMES_DASHBOARD_OAUTH_CLIENT_ID and HERMES_DASHBOARD_PORTAL_URL
were set, then the gate's fail-closed branch told the operator 'install
the default Nous provider'. That message is misleading: the provider IS
installed; it's just unconfigured. And the contract only really needs
the per-instance client_id — the portal URL is the same for everyone
in production.

Three changes:

1. plugins/dashboard_auth/nous/__init__.py:
   - HERMES_DASHBOARD_PORTAL_URL is now optional and defaults to
     'https://portal.nousresearch.com'. Override only for staging
     (portal.rewbs.uk) or a custom deployment. Empty string also
     falls back to the default so an empty Fly secret can't point
     the dashboard at nowhere.
   - Plugin exposes a module-level LAST_SKIP_REASON: str that the gate
     reads when no providers register. Cleared on each register() call.
     Skip reasons are human-readable and actionable
     ('HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. The Nous Portal
     provisions this env var…').

2. plugins/dashboard_auth/nous/plugin.yaml:
   - requires_env drops HERMES_DASHBOARD_PORTAL_URL; only the client_id
     is mandatory. Description updated to reflect this.

3. hermes_cli/web_server.py:
   - When the gate fail-closes for 'no providers', it now reads each
     bundled plugin's LAST_SKIP_REASON and embeds them in the SystemExit
     message. Operator sees the specific config fix needed:
       Bundled providers reported these issues:
         • nous: HERMES_DASHBOARD_OAUTH_CLIENT_ID is not set. …
     instead of the prior generic 'Install the default Nous provider'.

Tests:
  - TestPluginRegister rewritten to assert the new defaults +
    LAST_SKIP_REASON contents (6 tests, +1 new for empty-string env).
  - New gate test test_start_server_surfaces_nous_skip_reason_when_unconfigured.
  - test_get_method_is_not_allowed widened to handle the SPA-shell 200
    path explicitly — assertion now verifies no JSON ticket leaks
    rather than asserting a specific status code (covers all four of
    401/404/405/200).

Docs updated: web-dashboard.md's 'Default provider' section now shows
the env-var table with required/optional columns and embeds the
fail-closed error message verbatim so operators can match what they
see at the prompt.
2026-05-27 02:12:27 -07:00
Ben
2fc4615fc4 feat(dashboard-auth): Phase 7 — SPA AuthWidget + /api/status auth fields
Phase 7 surfaces the OAuth gate state to users.

web/src/components/AuthWidget.tsx (new):
  Sidebar widget that fetches /api/auth/me on mount and renders a
  compact 'Logged in as <user_id…> via <provider>' row with a logout
  icon. Contract V1 (Nous Portal) emits no email/display_name claims,
  so user_id is the display value (truncated to 14 chars + ellipsis);
  display_name and email fallthroughs are forward-compat for OQ-C1.
  Renders nothing on 401 from /api/auth/me — that's the signal the
  gate isn't engaged (loopback mode), in which case the widget would
  be confusing.
  Logout POSTs /auth/logout (which clears cookies + redirects to
  /login) then full-page-navigates to /login itself; the SPA's fetch
  wrapper doesn't follow that redirect, so the navigation is explicit.

web/src/App.tsx: mounts <AuthWidget /> above <SidebarFooter />.
  Component is self-hiding in loopback mode so there's no need for a
  conditional mount.

web/src/lib/api.ts:
  - getAuthMe() + logout() helpers
  - AuthMeResponse type
  - StatusResponse gets optional auth_required + auth_providers fields
    so the existing StatusPage can render a gated/loopback badge.

hermes_cli/web_server.py: /api/status payload now includes
  - auth_required: bool — whether app.state.auth_required is True
  - auth_providers: list[str] — registered DashboardAuthProvider names
  Lazy-imports list_providers so early-startup status calls don't
  crash if the dashboard_auth module is still being set up.

tests/hermes_cli/test_dashboard_auth_status_endpoint.py: 3 new tests
covering the new status fields in both gated and loopback modes plus
a regression that no existing field got dropped from the payload.

The hermes status CLI is unchanged in this commit — that command
tracks model providers + OAuth credentials, not running-dashboard
state. The /api/status endpoint is the canonical place to query
dashboard auth-gate state, consumed by the React StatusPage already.
2026-05-27 02:12:27 -07:00
Ben
5e9308b5b8 feat(dashboard-auth): Phase 6 — 401 re-auth envelope + next= propagation
Contract V1 of nous-account-service PR #180 ships no refresh tokens, so
the original Phase 6 silent-refresh design is replaced with a thinner
'401 → redirect to /login' UX. The dashboard's gated middleware now
emits a structured envelope on any auth failure; the SPA's fetch
wrapper sees it and full-page-navigates the user through re-auth.

hermes_cli/dashboard_auth/cookies.py:
  set_session_cookies(refresh_token='') SKIPS writing the
  hermes_session_rt cookie. Forward-compat: a non-empty refresh_token
  still emits the cookie unchanged, so a future Portal contract that
  starts issuing RTs flips the persistence on with no other change.
  clear_session_cookies still emits a Max-Age=0 deletion for the RT
  cookie so stale cookies from earlier deployments get flushed on
  logout / session expiry. Deprecation marker + rationale in
  module docstring per the user's docstring-only deprecation pattern.

hermes_cli/dashboard_auth/middleware.py:
  _unauth_response now builds a structured JSON envelope for API 401s:
    { error: 'session_expired' | 'unauthenticated',
      detail: 'Unauthorized',
      reason: <internal>,
      login_url: '/login?next=<safe-path>' }
  HTML redirects also carry next= so a user landing on /sessions
  without a cookie bounces back to /sessions after re-auth.
  _safe_next_target validates same-origin: drops protocol-relative
  paths (//evil.com), absolute URLs, and any /login or /auth/* loop.
  Dead cookies are cleared on the 401 path so the browser stops
  replaying invalid tokens.

hermes_cli/dashboard_auth/routes.py:
  /auth/callback accepts next= query param and validates via
  _validate_post_login_target (same rules as the gate's
  _safe_next_target — defence-in-depth because next= survived a full
  IDP round trip and attacker-controlled state can re-enter via the
  callback URL). Open-redirect attempts land at '/' instead.

web/src/lib/api.ts:
  fetchJSON parses the 401 envelope and full-page-navigates to
  body.login_url ONLY on the known session-expiry error codes.
  Domain-level 401s (e.g. permission errors) bubble up as regular
  errors. credentials: 'include' added so cookie auth works for all
  fetches routed through this wrapper. sessionStorage.lastLocation is
  preserved for future use by AuthWidget / hermes_status.

Test files marked with pytest.mark.xdist_group so the four files that
mutate web_server.app.state.auth_required serialize onto the same xdist
worker — eliminates 'works locally, fails in CI' app-state bleed.

20 new tests in test_dashboard_auth_401_reauth.py:
  - set_session_cookies(refresh_token='') skips RT cookie
  - clear_session_cookies still emits RT deletion
  - 401 envelope shape (unauthenticated vs session_expired)
  - dead cookie cleared on invalid-token 401
  - login_url carries next= for deep paths
  - login loop avoided when path is /login/auth/api-auth
  - protocol-relative URL rejected
  - _safe_next_target unit tests (accept same-origin, reject loops/abs)
  - /auth/callback respects safe next= but rejects open redirects

2 pre-existing tests updated to accept the new /login?next=%2F shape.

Full dashboard-auth suite: 168 passed, 1 skipped (Phase 0 pre-existing).
2026-05-27 02:12:27 -07:00
Ben
b2360ba44e feat(dashboard-auth): _ws_auth_ok helper + ticket auth on all 4 WS endpoints
Phase 5 task 5.2. Four WebSocket endpoints — /api/pty, /api/ws, /api/pub,
/api/events — previously authed with the same constant-time check against
`_SESSION_TOKEN`. Replaced with a single helper that branches on
`app.state.auth_required`:

  Loopback / --insecure: legacy ?token=<_SESSION_TOKEN> path (unchanged).
  Gated:                  ?ticket=<single-use> consumed against the
                          dashboard-auth ticket store.

Critical security property: gated mode UNCONDITIONALLY rejects the
?token= path. A leaked _SESSION_TOKEN value from a log line is not
replayable for WS access in gated deployments.

`_build_sidecar_url` now branches too: loopback uses the legacy token;
gated mode mints a server-internal ticket via mint_ticket() with
pseudo-user 'pty-sidecar' / provider 'server-internal' so audit logs can
distinguish PTY-internal sidecar tickets from browser tickets. PTY
children open /api/pub exactly once at startup so single-use suffices.

Ticket rejections audit-log as WS_TICKET_REJECTED with truncated reason
+ client IP + WS path. Operators debugging 'WS keeps closing' issues see
which endpoint and why.

17 new tests:
- POST /api/auth/ws-ticket: 200 with cookie, 401/302 without, distinct
  per call, GET-not-allowed.
- _ws_auth_ok loopback: token accept/reject, missing-token reject,
  ticket-param-ignored.
- _ws_auth_ok gated: ticket accept, single-use rejection, unknown reject,
  legacy-token-rejected-in-gated assertion, audit-log emission.
- _build_sidecar_url: loopback uses token=, gated uses ticket=, no-bound
  returns None.
2026-05-27 02:12:27 -07:00
Ben
b69fce9c86 feat(dashboard-auth): single-use WS tickets + POST /api/auth/ws-ticket
Phase 5 task 5.1. Browsers cannot set Authorization on a WebSocket
upgrade, so in gated mode the SPA needs an alternative way to bind the
upgrade to its authenticated session.

  hermes_cli/dashboard_auth/ws_tickets.py — in-memory single-use ticket
  store with 30s TTL. Thread-safe (threading.Lock), token_urlsafe(32)
  values, ticket value truncated to 8 chars in error messages for log
  hygiene. Module-level state with _reset_for_tests() helper.

  hermes_cli/dashboard_auth/routes.py — adds POST /api/auth/ws-ticket.
  Auth-required (the gate middleware already attaches Session to
  request.state.session). Returns {ticket, ttl_seconds}; emits
  WS_TICKET_MINTED audit event with user_id + provider + ip.

  hermes_cli/dashboard_auth/audit.py — adds WS_TICKET_REJECTED enum
  value for the consume-side rejection event (wired into the WS
  endpoints in task 5.2).

11 new tests covering round-trip, single-use, TTL boundary, unknown
ticket rejection, secret-hygiene truncation in error messages, and
concurrent mint+consume from 20 threads.
2026-05-27 02:12:27 -07:00
Ben
848baeb0a8 feat(dashboard-auth): plugins/dashboard_auth/nous — contract-compliant Nous OAuth provider
Bundled, kind=backend, auto-loads. Activates ONLY when Portal-injected
env vars are present:

  HERMES_DASHBOARD_OAUTH_CLIENT_ID  — agent:{instance_id}
  HERMES_DASHBOARD_PORTAL_URL       — Portal base URL

Loopback / --insecure operators leave both unset and never see this
plugin register anything. The fail-closed branch in start_server handles
the 'public bind + zero providers' case independently.

Implementation follows nous-account-service PR #180's published OAuth
contract verbatim:

  - client_id is per-instance (agent:{instance_id}); the suffix is
    cross-checked against the token's agent_instance_id claim as
    defense-in-depth (contract C9).
  - scope is agent_dashboard:access only (contract C3).
  - aud is the bare client_id, no hermes-cli: prefix (contract C2).
  - RS256 JWT verification against /.well-known/jwks.json with
    5-minute cache (contract C7).
  - No refresh tokens in V1: refresh_session always raises
    RefreshExpiredError; revoke_session is a no-op (contract C5).
  - oauth_contract_version claim: missing → warn + proceed; present
    and != 1 → refuse (contract C11, OQ-C2 tolerant treatment).
  - redirect_uri validated client-side as defense before bouncing to
    Portal; authoritative check is server-side per agent-redirect-uri.ts.

41 new tests covering construction, plugin-entry env gating, start_login
shape, complete_login httpx-mocked happy path + error mapping,
verify_session JWT verification (RSA keypair fixture, full claim-check
matrix), refresh_session always raising, revoke_session no-op.

PyJWT + cryptography are already in the venv (jose was previously
suggested; switched to pyjwt[crypto] since the latter is already
pulled in transitively).
2026-05-27 02:12:27 -07:00
Ben
53736b3922 feat(dashboard-auth): fail-closed on no providers; proxy_headers when gated; suppress _SESSION_TOKEN injection
Phase 3, Task 3.5. Three changes to web_server.py:

  1. start_server replaces the legacy SystemExit-refusing-to-bind guard
     with: if app.state.auth_required and no providers registered, exit
     with a clear message; otherwise log the gate-on banner. --insecure
     keeps its existing behaviour.

  2. uvicorn proxy_headers flag is computed from app.state.auth_required.
     Loopback / --insecure keep it False (so _ws_client_is_allowed sees
     the real peer for the loopback gate); gated mode flips it True so
     X-Forwarded-Proto from Fly's TLS terminator is honoured for cookie
     Secure-flag decisions in detect_https().

  3. _serve_index no longer injects window.__HERMES_SESSION_TOKEN__ when
     the gate is on — the SPA reads identity from /api/auth/me using
     cookie auth instead. window.__HERMES_AUTH_REQUIRED__ flag lets the
     SPA pick between ticket-auth (gated) and token-auth (loopback) for
     /api/pty + /api/ws (Phase 5 will wire this in the React layer).

4 new behavioural tests; loopback regression harness still green.
2026-05-27 02:12:27 -07:00
Ben
5b17eab67a feat(dashboard-auth): auth gate middleware + /auth/* routes + /login HTML
Phase 3, Tasks 3.2 + 3.3 + 3.4. These three pieces are mutually
dependent so they land together.

middleware.py - gated_auth_middleware engages when app.state.auth_required
is True.  Allowlists /login, /auth/*, /api/auth/providers, and static
asset paths; everything else demands a valid session_at cookie.  Verifies
by trying every registered provider's verify_session in turn (multi-
provider stack); attaches verified Session to request.state.session.
Returns 401 JSON for /api/* and 302 -> /login for HTML.  ProviderError
during verify -> 503.

routes.py - APIRouter with:
  GET  /login              server-rendered HTML
  GET  /auth/login?provider=N  302 to IDP + PKCE cookie
  GET  /auth/callback?code,state  completes login, sets session cookies
  POST /auth/logout        clears cookies + best-effort revoke
  GET  /api/auth/providers public bootstrap endpoint (503 if zero)
  GET  /api/auth/me        verified session as JSON (auth-required)

login_page.py - Inline-CSS HTML template, no React, no JavaScript.

web_server.py - Mounted gated_auth_middleware between host_header and
auth_middleware (FastAPI runs middlewares in registration order: host
check -> cookie auth -> token auth).  auth_middleware short-circuits
when auth_required so cookie auth is authoritative in gated mode.
Router is included before mount_spa so the catch-all doesn't swallow
/login or /auth/*.

17 new behavioural tests; loopback regression harness still green.
2026-05-27 02:12:27 -07:00
Ben
a30c4d8ebd feat(dashboard-auth): cookie helpers for session_at/session_rt/pkce
Phase 3, Task 3.1. Three cookies:
  - hermes_session_at: OAuth access token (HttpOnly, TTL = token TTL)
  - hermes_session_rt: OAuth refresh token (HttpOnly, 30d max-age)
  - hermes_session_pkce: PKCE state + verifier + provider hint (10min)

All SameSite=Lax + Path=/. Secure flag is set ONLY when the request
scheme is https — uvicorn proxy_headers=True (enabled in gated mode at
Phase 3.5) rewrites scheme from X-Forwarded-Proto so Fly's TLS
terminator works.
2026-05-27 02:12:27 -07:00
Ben
628a52fce2 test(dashboard-auth): stub auth provider for E2E gate testing
Phase 2, Task 2.1. Self-contained fake IDP — start_login redirects
straight back to {redirect_uri}?code=stub_code&state=<s> so tests can
walk the OAuth round trip in-process. Tokens are HMAC-signed JSON blobs
(not real JWTs) — enough structure for verify_session to detect tamper
and expiry without pulling in pyjwt.

Lives in tests/ only — never registered as a real plugin. Phase 3's
end-to-end tests import StubAuthProvider directly.

Convention: exp <= now counts as expired (TTL=0 means born-expired)
— matches what Phase 6's silent-refresh test will need.
2026-05-27 02:12:27 -07:00
Ben
865cae4f61 feat(dashboard-auth): json-lines audit log at $HERMES_HOME/logs/dashboard-auth.log
Phase 1, Task 1.4. Records every auth event (login start/success/failure,
logout, refresh success/failure, revoke, session verify failure, WS
ticket mint) as one JSON object per line. Token-like kwargs (access_token,
refresh_token, code, code_verifier, state, ticket, cookie, Authorization)
are dropped before serialisation so the log never contains live secrets.

Write failures log at WARNING but never raise — auth flows must not fail
because the audit logger broke.
2026-05-27 02:12:27 -07:00