hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-25 17:18:11 +00:00

Siddharth Balyan fcb1944b4f feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011 ) * feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits) L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that exercises the real header -> CreditsState -> TUI pipe end-to-end behind HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists. - agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are strings -> paid_access via == "true", never bool(); retain-last-known; only subscription_micros may be negative; _usd kept verbatim). 
- run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros, session-start baseline latch, + dev-gated "credits" capture log. - agent/chat_completion_helpers.py: capture on the streaming response. - agent/agent_init.py: init _credits_state + _credits_session_start_micros. - tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged. - ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner. Off by default; silent for normal users. Validated live against staging (capture log delta matches the TUI segment). Throwaway consumer (readout/log/ banner); credits_tracker + the capture plumbing are the real feature foundation. test(credits): lock parser under 9-state matrix + harden validation (L2) Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free, depleted, debt, missing, no_org) plus validation edge cases: version strict==1 with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off == "true"/"false", never bool()), half-pair subscription limit treated as both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros → None, negative non-subscription micros → None, as_of_ms junk → None, zero limit ZeroDivision guard. 
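The validation rules listed in the test description above (the bool-string trap, the USD regex, and micros values that fall back to None) can be illustrated with a minimal Python sketch. The helper names `parse_bool_header`, `parse_usd`, and `parse_micros` are hypothetical and are not taken from agent/credits_tracker.py; only the rules themselves come from the commit message.

```python
import re
from typing import Optional

# Pattern quoted in the commit message; fullmatch() is used below so a
# trailing newline cannot slip past the "$" anchor.
_USD_RE = re.compile(r"^-?\d+\.\d{2}$")


def parse_bool_header(value: Optional[str]) -> Optional[bool]:
    """Compare against the literal strings; bool("false") would be True."""
    if value == "true":
        return True
    if value == "false":
        return False
    return None  # missing or malformed


def parse_usd(value: Optional[str]) -> Optional[str]:
    """Return the string verbatim when it is well formed, else None."""
    if value is not None and _USD_RE.fullmatch(value):
        return value
    return None


def parse_micros(value: Optional[str], allow_negative: bool = False) -> Optional[int]:
    """Non-integer input yields None; negatives only where explicitly allowed."""
    if value is None:
        return None
    try:
        micros = int(value)
    except ValueError:
        return None
    if micros < 0 and not allow_negative:
        return None
    return micros
```

In this sketch only a caller that passes `allow_negative=True` (for example, for the subscription field) would accept a negative amount, which mirrors the "only subscription_micros may be negative" rule.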
Harden agent/credits_tracker.py to match the spec: - Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState - Add depleted property (== not paid_access, never remaining==0) - Change used_fraction guard to key off subscription_limit_micros (the actual denominator) not denominator_kind (metadata) - Replace fail-soft _safe_int with a sentinel-returning variant; full validation now returns None on any malformed field rather than silently defaulting - Add module-level warn-once latch for version > 1 - Add USD regex validation; add denominator_kind allow-list check - Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-) feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1) L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's policy will fire through and L5's TUI render will consume. - agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id; kind defaults "sticky", kept TTL-expressive for a future config seam). - run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and _emit_notice / _emit_notice_clear emitters (swallow all callback errors — a notice must never break the agent loop; no-op when unbound). - agent/agent_init.py: thread both callbacks through init_agent. - tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear WS events (snake_case payload, matching the existing gateway-event convention). - ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent. - tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op, signature threading, TUI binding payload shape). Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/ decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly. 
* feat(credits): threshold reconciliation policy + tests (L4.1) * feat(credits): wire threshold policy into capture + latch (L4.2) After a fresh header parse, _capture_credits runs evaluate_credits_notices against the agent's _credits_latch and emits the result — clears first, then shows (so a recovered depletion clears before the "restored" success lands, and depleted wins the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks) still caches state for /usage but runs no policy. Parse stays fail-open (miss → keep last-known); the eval/emit path warns on failure rather than swallowing, so a depletion-notice bug can't vanish silently. - run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn); latch lazy-guarded (object.__new__ safety). - agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}. * feat(tui): render credits notices in the status bar (L5, Strategy B) The TUI now renders the notification.show / notification.clear gateway events the agent emits — a level-colored notice overrides the status/verb slot when not busy. - Notice state machine on turnController (pendingNotice + dedicated noticeTimer + show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler decodes the events and delegates. - Render priority busy > notice > status (appChrome StatusRule); notice text rendered verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx; dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire). - Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites (recordMessageComplete / interruptTurn / recordError) — never idle(), which reset() also calls (would leak across sessions); reset() clears instead. 
- Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard; latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak). - 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority). * feat(credits): cold-start seed for new Nous sessions (L3) A genuinely-new Nous session has no inference header yet, so seed credits state from the authoritative GET /api/oauth/account snapshot at session start (in the new-session branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin hook gets no agent reference). The seed runs the shared notice policy, so a session that opens already depleted warns IMMEDIATELY rather than only after the first turn. - Maps the nested account fields (paid_service_access → paid_access; total_usable / subscription / purchased on paid_service_access_info; rollover on subscription), each None-guarded; float dollars → micros via round(d*1e6), _usd left "" (render formats from micros — never synthesize a verbatim usd from a float). - Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset → used_fraction None → no warn90 from the seed (% only once a header lands, per D-E). - Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never blocks startup); paid_access unknown ⇒ True (never falsely depleted). - run_agent.py: extracted the warm-path policy/emit block into a shared _emit_credits_notices() so capture and the seed fire notices identically. * feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6) Add Nous credit dollar magnitudes to /usage (subscription / top-up / total + rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the account endpoint exposes a denominator). 
Reuses the existing account-usage render machinery via a new pure build_nous_credits_snapshot() that maps a NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to fetch_account_usage (keeps the per-provider boundary intact). CLI /usage also doubles as a depletion-recovery trigger: a force_fresh account fetch, kept in a SEPARATE local so it never clobbers the header-sourced agent._credits_state (which alone carries used_fraction). If paid access recovered while credits.depleted is latched and a notice consumer is bound, it reuses agent._emit_credits_notices() to clear it. Gateway /usage displays magnitudes only — messaging binds no notice consumer, so it performs no recovery emit. Fail-open throughout: any portal hiccup leaves /usage unaffected. * refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers The dev-flag truthy check was inlined in three places. Replace with the shared utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the env check on every render). Behaviour-preserving; identical truthy set. * fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review) Adversarial review found the /usage depletion-recovery trigger dead AND broken: the CLI binds no notice_clear_callback, the TUI runs /usage in a separate slash-worker subprocess (its own agent/latch), and the no-clobber rule made it evaluate stale paid_access anyway. Recovery already happens on the next inference (warm path), so the trigger was redundant — remove it and stop the depleted notice over-promising. - cli.py: remove the dead recovery block; bound the /usage portal fetch with a 10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch — urllib's per-socket timeout is not a wall-clock guarantee. 
- agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance" (no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn). - agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch so a stalled portal can't hang session startup; tidy its time import. * chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE) Throwaway dev scaffolding to exercise the notice pipeline without real spend or Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct / grant_exhausted / depleted / clear) or a file path whose contents name a state (re-read each turn → flip states live for recovery testing). _capture_credits injects the chosen CreditsState instead of parsing real headers and runs the shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding. * feat(credits): /usage monthly-grant % gauge The portal /api/oauth/account subscription block now carries monthly_credits (the per-period grant allowance, the % denominator). The consumer parsed monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only. Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload. build_nous_credits_snapshot emits a Subscription usage window (real % used, routed through the existing render machinery) when monthly_credits is a finite positive denominator and credits_remaining is finite and <= cap; otherwise it degrades to magnitudes-only (older portals, rollover-over-cap, or non-finite payloads). Guards (adversarial-review-driven): reject non-finite operands (json.loads parses bare NaN/Infinity by default → would render $nan + a false 100% used), reject bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap (rollover spanning the period makes the cap a nonsensical denominator → the $X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%. 
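The gauge guards described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in agent/account_usage.py: the function names `is_finite_num` and `used_percent` are hypothetical, and returning None stands in for "degrade to magnitudes-only".

```python
import math
from typing import Any, Optional


def is_finite_num(value: Any) -> bool:
    """True for finite ints/floats; bools, None, NaN and Infinity are rejected."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return math.isfinite(value)


def used_percent(monthly_credits: Any, credits_remaining: Any) -> Optional[float]:
    """Percent of the monthly grant used, or None when no gauge should be shown."""
    if not (is_finite_num(monthly_credits) and is_finite_num(credits_remaining)):
        return None  # non-finite or non-numeric operand
    cap = float(monthly_credits)
    remaining = float(credits_remaining)
    if cap <= 0:
        return None  # division-by-zero guard
    if remaining > cap:
        return None  # rollover over cap: the cap is not a sensible denominator
    used = (cap - remaining) / cap
    return min(used, 1.0) * 100.0  # debt (remaining < 0) clamps to 100%
```

For example, `used_percent(20.0, 5.0)` gives 75.0, while a NaN operand or a remaining balance above the cap gives None, so the caller would render magnitudes only.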
Money rule preserved: the ratio + magnitudes are computed from numeric float account fields via display formatting, never by parsing a server _usd string (there are none on these dataclasses). 13 gauge tests added (tests/agent/test_nous_credits_gauge.py). fix(credits): show /usage Nous block whenever a Nous account is present /usage runs in a slash-worker subprocess whose resolved inference provider is often not "nous" even when the user has a Nous account, so gating the Nous credits block on (provider == "nous") hid it entirely — the account data was fully available but never rendered. Gate instead on "a Nous account is logged in": a cheap local auth-state lookup (get_provider_auth_state('nous') has an access_token) decides whether to attempt the portal fetch, regardless of which provider inference runs on. In the gateway the block is also lifted out of the 'if provider:' scope so a Nous-credentialled user with another (or no) resident inference provider still sees their balance. Fail-open and the per-fetch wall-clock timeout are preserved. * fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker) In the TUI, /usage runs in a slash-worker subprocess that resumes the session WITHOUT building an agent (self.agent is None), so _show_usage early-returned "(._.) No active agent" before ever reaching the Nous credits block — which is agent-independent (a portal fetch gated on Nous auth-state). Extract the block into _print_nous_credits_block() and run it at the no-agent / no-calls early-returns too (returns True if it printed, so the fallback message only shows when there's genuinely nothing). Verified live against staging: the block + monthly-grant gauge now render in the slash-worker /usage path (previously hidden). The plain CLI REPL + messaging paths are unchanged (they have a live agent). 
* feat(credits): escalating 50/75/90 usage bands (single status line) Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn, 90 warn) shown as ONE status-bar line: it displays the highest band the subscription grant has crossed, replaces the line as usage climbs, steps back down on recovery, and clears below 50%. No stacking, no per-turn churn. Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything from it. Single notice key (credits.usage) with a usage_band latch field so the notice only re-emits when the band actually changes. The crossing gate (seen_below_90) is preserved so a fresh live session that opens mid-range stays quiet until it has been observed below the lowest band (cold-start primes it when it wants an open-high warning). Denominator math unchanged: % = subscription grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %. Migrated test_credits_policy.py to the new key + added TestUsageBands (climb, step-down, recovery-clear, idempotent, inclusive boundaries). * feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn) Notices previously only fired inside a conversation turn (first message), so a session that opened already depleted / past a usage band showed nothing at 'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start() and call it (a) in the TUI/desktop agent build right after the notice callback is wired (fires at 'ready', before any message) and (b) as the first-turn fallback in conversation_loop. Idempotent (skips once _credits_state exists) and fail-open. The seed now maps monthly_credits -> subscription_limit_micros + denominator_kind='subscription_cap', so used_fraction is computable at seed time and usage-band warnings (not just depletion) hydrate on open. Primes the crossing latch so a session opening already in a band warns immediately. Degrades to depletion-only when monthly_credits is absent (older portals). 
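The escalating-band policy described above (highest band crossed wins, inclusive boundaries, re-emit only when the band changes) can be sketched as follows. Only the name CREDITS_USAGE_BANDS and the `usage_band` latch field come from the commit message; the tuple shape, the function names, and the dict latch are assumptions made for illustration, and the `seen_below_90` crossing gate is deliberately left out.

```python
from typing import Optional, Tuple

# (threshold percent, level), ascending. The tuple shape is an assumption.
CREDITS_USAGE_BANDS = [(50, "info"), (75, "warn"), (90, "warn")]


def used_fraction(cap_micros: int, grant_remaining_micros: int) -> Optional[float]:
    """Subscription grant burn, (cap - remaining) / cap, clamped to [0, 1]."""
    if cap_micros <= 0:
        return None
    return max(0.0, min(1.0, (cap_micros - grant_remaining_micros) / cap_micros))


def highest_band(fraction: Optional[float]) -> Optional[Tuple[int, str]]:
    """Return the highest band whose threshold has been reached (inclusive)."""
    if fraction is None:
        return None
    crossed = None
    for threshold, level in CREDITS_USAGE_BANDS:
        if fraction >= threshold / 100:
            crossed = (threshold, level)
    return crossed


def band_changed(latch: dict, fraction: Optional[float]) -> bool:
    """Update the latch; True means a notice (or a clear) should be emitted."""
    band = highest_band(fraction)
    new_band = band[0] if band else None
    if latch.get("usage_band") == new_band:
        return False  # same band as last time: stay quiet
    latch["usage_band"] = new_band
    return True
```

With an empty latch, a reading of 0.6 reports a change (band 50), a later 0.7 does not, 0.8 does (band 75), and a drop to 0.1 reports a change whose new band is None, which corresponds to clearing the line.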
Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap degradation, and the shared seed (fires/idempotent/skips-non-nous). * feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge when the portal supplies a positive, finite monthly_credits denominator with remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise. Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so the CLI and TUI /usage render the same block, and _snapshot_from_credits_state() so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too. TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage panel renders them regardless of API-call count or resume state — previously the TUI's separate /usage implementation only showed token counts. Money rule preserved: %% and magnitudes come from numeric float account fields via display formatting, never by parsing a server _usd string. feat(credits): CLI REPL inline notices (parity with TUI) The plain CLI agent bound no notice callbacks, so credit notices were TUI-only. Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders a single level-colored line above the prompt (error red / warn yellow / success green / info dim) via _cprint, and seed credits at session open so a depletion or usage-band warning shows before the first message — the same hydration the TUI got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot). 
* test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open seed, /usage gauge). * fix(credits): usage-band notice clears on next prompt (not sticky-forever) A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear the visible credits.usage notice when a new turn starts (startMessage), so it shows until your next prompt then yields. The server latch is unchanged, so it won't re-nag at the same band — it only re-shows when the band actually changes (climb) or clears when usage drops below the lowest band. Depletion stays sticky. * refactor(credits): consolidate the /usage credits block behind nous_credits_lines() The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command) each re-implemented the auth-gate + portal fetch + render, and both bypassed the dev-fixture short-circuit that only the TUI honored — so /usage ignored HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth gate, and the fixture works on every surface (~60 fewer duplicated lines). The gateway usage test recorded only the last asyncio.to_thread call; /usage now dispatches both the account fetch and the credits fetch, so it records every call and matches the account fetch by its provider arg. * fix(credits): keep the /usage gauge type-safe and log its fail-open path _is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge operands (monthly_credits / credits_remaining) and the magnitudes passed to _fmt_usd through it — no more None-operand warnings on the arithmetic. 
Add a debug breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block is diagnosable in agent.log without a dev flag. * fix(credits): harden the header tracker — prod-leak gate, hot-path probe, fire-and-forget seed - Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a real account. Matches the documented run workflow (both vars set together). - Hot-path probe: parse_credits_headers checks for the version sentinel header before allocating a lowercased copy of the response headers — skips that work on every non-Nous API call. Behaviour-identical and still case-insensitive. - Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now runs in a daemon thread, so a slow/unreachable portal never delays session "ready" (previously blocked up to 10s). The dev-fixture path stays synchronous; the thread re-checks idempotency before hydrating (a live header may land first). - Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed parser / dead seed is distinguishable from a legitimate no-headers miss. Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate. * test(tui): fix env-timing in the StatusRule dev-credits assertion DEV_CREDITS_MODE is read once at module load (config/env), so mutating process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner assertion only passed if the env was exported before vitest started, and failed in a normal run. Move that assertion to a sibling file that mocks config/env with DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard). 
* test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt - _snapshot_from_credits_state (the offline /usage renderer) had no direct test: lock the gauge math, the verbatim _usd magnitudes, the depletion line and the fixture marker, plus the no-cap (no gauge) and None-state cases. - turnController.startMessage had no test for clearing the credits.usage notice on the next prompt while leaving credits.depleted sticky. feat(credits): deliver credit notices over messaging gateways Bind notice_callback/notice_clear_callback on the per-turn gateway agent so usage-band / depletion / restored notices reach Telegram/Discord/Slack/ etc. Previously the messaging gateway bound neither callback, so the agent's _emit_credits_notices early-returned and a chat user crossing a band got nothing unless they ran /usage manually. - render_notice_line(): AgentNotice -> single plaintext line (level glyph + text), plaintext-only so it renders uniformly without per-platform escaping. Fail-soft on malformed/empty notices. - Standalone push for every notice (messaging has no persistent status bar): route through the shared _deliver_platform_notice rail (honors private/ public delivery + thread metadata), scheduled onto the gateway loop via safe_schedule_threadsafe from the agent's sync worker thread — same pattern as _status_callback_sync. - The fired-once latch lives on the cached (reused-in-place) agent and persists across turns, so a band crosses once -> one push, no per-turn re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder). - Recovery ('Credit access restored') rides the show path (emitted as a success notice, not a clear). notice_clear_callback is a no-op: a sent platform message can't be cleanly retracted. Tests: render glyph/levels/fail-soft + public/private delivery seam through _deliver_platform_notice + no-adapter no-op. 
* fix(credits): don't double the glyph on messaging notices render_notice_line prepended a per-level glyph, but the notice policy already bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used", "⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead level→glyph map. The render tests fed glyph-less text (and the success case only checked startswith), so the doubling slipped through. Rework them around the verbatim contract and add an end-to-end regression that runs real evaluate_credits_notices output through render_notice_line and asserts the line is returned unchanged.		2026-06-06 13:18:18 +05:30
..
lsp	fix(lsp): detect Windows wrapper binaries in installer probes	2026-05-30 02:08:36 -07:00
transports	fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730 )	2026-06-05 03:53:59 -07:00
__init__.py	test: add unit tests for 8 modules (batch 2)	2026-02-26 13:54:20 +03:00
test_anthropic_adapter.py	fix(anthropic): demote dead thinking signature when orphan-strip mutates the latest turn	2026-05-31 06:14:34 -07:00
test_anthropic_keychain.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_anthropic_mcp_prefix_strip.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_anthropic_oauth_pkce.py	fix(auth): don't launch a text-mode browser inside the terminal for OAuth (#34479 )	2026-05-29 01:23:06 -07:00
test_arcee_trinity_overrides.py	test(arcee): cover Trinity Large Thinking temperature + compression overrides	2026-05-05 17:23:45 -07:00
test_async_utils.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_client.py	fix(vision): use MiniMax type="video" block (not input_video) + tests	2026-06-04 05:38:11 -07:00
test_auxiliary_client_anthropic_custom.py	fix(anthropic): complete third-party Anthropic-compatible provider support (#12846 )	2026-04-19 22:43:09 -07:00
test_auxiliary_client_azure_foundry.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_client_xai_oauth_recovery.py	fix(auxiliary): detect xAI OAuth 403 bad-credentials as auth error	2026-05-29 00:28:02 -07:00
test_auxiliary_config_bridge.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_main_first.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_named_custom_providers.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_transport_autodetect.py	fix(auxiliary): auto-detect Anthropic Messages transport for all aux clients (#17027 )	2026-04-28 06:50:14 -07:00
test_azure_identity_adapter.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_bedrock_1m_context.py	feat(azure-foundry): add Microsoft Entra ID auth	2026-05-18 10:14:38 -07:00
test_bedrock_adapter.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_bedrock_integration.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_codex_cloudflare_headers.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_codex_responses_adapter.py	feat(prompt): universal task-completion guidance + local Python toolchain probe (#34340 )	2026-05-28 22:26:09 -07:00
test_codex_ttfb_watchdog.py	fix(codex): relax no-byte TTFB watchdog default from 12s to 120s	2026-05-29 02:02:25 -07:00
test_compress_focus.py	fix: resolve CI test failures — add missing functions, fix stale tests (#9483 )	2026-04-14 01:43:45 -07:00
test_compression_concurrent_fork.py	fix(compression): fail open when lock subsystem is missing (version skew) (#34475 )	2026-05-29 01:32:32 -07:00
test_compressor_historical_media.py	Port from Kilo-Org/kilocode#9434: strip historical media after compression (#27189 )	2026-05-16 17:18:25 -07:00
test_compressor_image_tokens.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_context_compressor.py	fix(compression): avoid repeat preflight compaction from rough estimates	2026-05-29 19:05:03 -07:00
test_context_compressor_summary_continuity.py	test: cover ci-unblocker production regressions	2026-05-27 22:14:53 -07:00
test_context_engine.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_context_engine_host_contract.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_context_references.py	fix(agent): fall back when rg is blocked for @folder references	2026-04-20 01:56:41 -07:00
test_copilot_acp_client.py	fix(ci): recover 38 failing tests on main (#17642 )	2026-04-29 20:05:32 -07:00
test_copilot_acp_deprecation.py	fix(copilot-acp): tighten deprecation detection + sharpen GitHub Models 413 hint	2026-05-16 02:24:48 -07:00
test_credential_pool.py	fix(auth): address Nous JWT fallback review	2026-05-29 02:24:48 -07:00
test_credential_pool_routing.py	refactor: remove smart_model_routing feature (#12732 )	2026-04-19 18:12:55 -07:00
test_credits_cold_start.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011 )	2026-06-06 13:18:18 +05:30
test_credits_fixture_snapshot.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011 )	2026-06-06 13:18:18 +05:30
test_credits_policy.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011 )	2026-06-06 13:18:18 +05:30
test_credits_tracker.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011 )	2026-06-06 13:18:18 +05:30
test_crossloop_client_cache.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_curator.py	feat(curator): prune built-in skills after inactivity + track usage for all skills (#36701 )	2026-06-01 02:07:32 -07:00
test_curator_activity.py	fix: use skill activity in curator status	2026-04-30 10:31:47 -07:00
test_curator_backup.py	fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671 ) (#18731 )	2026-05-02 01:29:57 -07:00
test_curator_classification.py	feat(curator): hint at `hermes curator pin` in the rename block (#23212 )	2026-05-10 06:44:53 -07:00
test_curator_reports.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_custom_provider_extra_body.py	fix(custom): pass custom provider extra body	2026-05-21 07:48:53 -07:00
test_deepseek_anthropic_thinking.py	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355 )	2026-05-17 02:29:41 -07:00
test_direct_provider_url_detection.py	fix: restrict provider URL detection to exact hostname matches	2026-04-20 22:14:29 -07:00
test_display.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_display_emoji.py	feat(tools): centralize tool emoji metadata in registry + skin integration	2026-03-15 20:21:21 -07:00
test_display_todo_progress.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_display_tool_failure.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_error_classifier.py	fix(agent): fallback immediately on provider content-policy blocks (#33883 )	2026-05-28 07:28:24 -07:00
test_external_skills.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_external_skills_dirs_cache.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_file_safety.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_file_safety_container_mirror.py	fix(file-safety): extend sandbox-mirror guard to cover inner-container path (#32049 ) (#32407 )	2026-06-02 14:03:37 +10:00
test_file_safety_credentials.py	fix(security): block read_file on project-local .env files	2026-05-25 03:40:47 -07:00
test_file_safety_cross_profile.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_file_safety_sandbox_mirror.py	fix(file-safety): add sandbox-mirror soft guard for writes to per-task .hermes mirrors (#32213)	2026-06-02 11:29:24 +10:00
test_gemini_cloudcode.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_gemini_fast_fallback.py	fix: wrap _pool_may_recover_from_rate_limit call through run_agent namespace	2026-05-18 20:04:57 -07:00
test_gemini_free_tier_gate.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_gemini_native_adapter.py	fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730)	2026-06-05 03:53:59 -07:00
test_gemini_schema.py	fix(gemini): drop integer/number/boolean enums from tool schemas (#15082)	2026-04-24 03:40:00 -07:00
test_i18n.py	fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix (#38383)	2026-06-03 12:00:27 -07:00
test_image_gen_registry.py	feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)	2026-04-21 21:30:10 -07:00
test_image_routing.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_insights.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_jiter_preload.py	fix(agent): preload jiter native parser	2026-05-28 00:20:11 -07:00
test_kimi_coding_anthropic_thinking.py	fix(anthropic): broaden Kimi thinking-suppression to custom endpoints (#17455)	2026-04-29 06:35:42 -07:00
test_last_total_tokens.py	fix(compressor): ABC compliance — total_tokens, api_mode, logger consistency	2026-05-23 17:38:19 -07:00
test_local_stream_timeout.py	fix(local): recognize unqualified hostnames as local endpoints (#9248)	2026-06-05 10:18:10 +10:00
test_markdown_tables.py	fix(cli): vertical fallback for markdown tables wider than terminal (#23948)	2026-05-11 16:49:13 -07:00
test_memory_provider.py	fix(memory): register parent packages for user-installed provider imports	2026-06-04 05:35:43 -07:00
test_memory_session_switch.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_memory_user_id.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_minimax_auxiliary_url.py	fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983)	2026-04-07 22:23:28 -07:00
test_minimax_provider.py	polish(minimax): address Copilot review comments on M3 default-aux fix	2026-06-04 05:53:35 -07:00
test_model_metadata.py	fix(model_metadata): drop stale ≤256,000 cache entries for Grok-4.3	2026-06-04 05:36:34 -07:00
test_model_metadata_local_ctx.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_model_metadata_ssl.py	fix(auth): honor SSL CA env vars across httpx + requests callsites	2026-04-24 03:00:33 -07:00
test_models_dev.py	test: remove low-value model-catalog mirror tests	2026-05-29 23:45:05 -07:00
test_moonshot_schema.py	Add Hermes desktop app (#20059)	2026-05-31 17:46:56 -05:00
test_non_stream_stale_timeout.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_nous_credits_gauge.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011)	2026-06-06 13:18:18 +05:30
test_nous_credits_snapshot.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011)	2026-06-06 13:18:18 +05:30
test_nous_oauth_401_guidance.py	feat(cli): make `hermes portal` the human-readable Portal onboarding alias	2026-06-04 01:19:28 +05:30
test_nous_rate_guard.py	fix(nous): don't trip cross-session rate breaker on upstream-capacity 429s (#15898)	2026-04-26 04:53:42 -07:00
test_onboarding.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_openrouter_response_cache.py	fix(openrouter): use canonical X-Title attribution header	2026-05-05 10:13:34 -07:00
test_plugin_llm.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_portal_tags.py	feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779)	2026-05-12 20:49:20 -07:00
test_prompt_builder.py	test(prompt): place cwd regression tests in TestEnvironmentHints (drop redundant docker case)	2026-06-01 16:55:04 -07:00
test_prompt_caching.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_proxy_and_url_validation.py	fix(agent): normalize socks:// env proxies for httpx/anthropic	2026-04-21 05:52:46 -07:00
test_rate_limit_tracker.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_redact.py	test(redact): assert Discord mentions pass through unchanged	2026-05-30 20:48:41 -07:00
test_resume_stale_active_task.py	fix(compressor): strip stale handoff prefix on resume; reconcile #26290+#32787 (#35344)	2026-05-30 07:29:21 -07:00
test_runtime_cwd.py	fix(desktop): stabilize project folder sessions (#37586)	2026-06-02 20:23:09 +00:00
test_save_url_image.py	fix(image_gen): cache xAI ephemeral URL responses to disk (#26942) (#31759)	2026-05-24 18:10:47 -07:00
test_set_runtime_main_custom_provider.py	test(auxiliary): e2e routing assertions for custom-provider aux resolution	2026-05-30 02:38:59 -07:00
test_shell_hooks.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_shell_hooks_consent.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_skill_bundles.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_skill_commands.py	test: use subprocesses for each test file (#29016)	2026-05-21 16:40:04 +05:30
test_skill_commands_reload.py	refactor(reload-skills): queue note for next turn, drop cache invalidation + agent tool	2026-04-29 21:07:47 -07:00
test_skill_utils.py	fix(skills): load Linux-tagged skills on Termux (android sys.platform)	2026-05-21 19:08:38 -07:00
test_streaming_context_scrubber.py	🐛 fix(memory): require newline after context tag	2026-05-18 10:53:08 -07:00
test_subagent_progress.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_subagent_stop_hook.py	feat: shell hooks — wire shell scripts as Hermes hook callbacks	2026-04-20 20:53:51 -07:00
test_subdirectory_hints.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_summary_prefix_semantics.py	fix(compression): drop conflicting 'resume Active Task' directive in summary prefix	2026-05-30 07:29:21 -07:00
test_system_prompt.py	refactor(prompt): route context-file cwd through runtime_cwd resolver	2026-06-01 16:55:04 -07:00
test_system_prompt_restore.py	perf(prompt-cache): date-only timestamp + loud gateway-DB roundtrip logging	2026-05-17 23:20:37 -07:00
test_think_scrubber.py	fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184)	2026-05-05 04:33:38 -07:00
test_title_generator.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_tool_dispatch_helpers.py	feat(security): promptware defense — shared threat patterns + memory load-time scan + tool-result delimiters (#32269)	2026-05-25 14:52:24 -07:00
test_tool_guardrails.py	fix: add recovery hints to loop guard warnings	2026-05-19 00:12:12 -07:00
test_tool_result_classification.py	fix: classify landed file mutations with diagnostics	2026-05-13 06:46:23 -07:00
test_transcription_registry.py	feat(stt): add register_transcription_provider() plugin hook	2026-05-25 01:41:19 -07:00
test_tts_registry.py	feat(tts): add register_tts_provider() plugin hook (closes #30398)	2026-05-24 18:04:54 -07:00
test_unsupported_parameter_retry.py	test: remove 50 stale/broken tests to unblock CI (#22098)	2026-05-08 14:55:40 -07:00
test_unsupported_temperature_retry.py	fix(auxiliary): stop capping output with max_tokens by default (#34530) (#34845)	2026-05-29 17:24:30 -07:00
test_usage_pricing.py	remove Vercel AI Gateway and Vercel Sandbox (#33067)	2026-05-27 00:43:32 -07:00
test_video_gen_registry.py	feat(video_gen): unified video_generate tool with pluggable provider backends (#25126)	2026-05-13 16:39:41 -07:00
test_vision_resolved_args.py	fix(vision): preserve explicit provider auth with custom base_url	2026-05-04 05:05:43 -07:00
test_vision_routing_31179.py	fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452)	2026-05-24 15:01:28 -07:00