Commit graph

1269 commits

Author SHA1 Message Date
Teknium
f3fe99863d
revert(web): remove keyless Parallel search fallback (#46350)
Remove the free Parallel Search MCP path and restore the keyed Parallel backend behavior from before it was introduced.

Also drops the keyless fallback registration/display labeling tests and returns the Parallel SDK pin to the prior version.
2026-06-14 16:47:57 -07:00
mr-r0b0t
bff78a34dc feat(zai): add GLM-5.2 with verified 1M context window
GLM-5.2 ships with a 1M (1,048,576) token context window. Without this
entry, Hermes falls through to the generic 'glm' key (202,752 tokens),
under-reporting the context bar and prematurely compressing conversations.

The 1M limit was verified empirically via needle-in-a-haystack retrieval
at 789,240 prompt tokens on api.z.ai/api/coding/paas/v4 — zero errors,
zero truncation, correct retrieval at every tested size (25K through 789K).

Changes:
- agent/model_metadata.py: add 'glm-5.2': 1_048_576 before 'glm' fallback
- hermes_cli/models.py: add glm-5.2 to zai curated models
- hermes_cli/setup.py: add glm-5.2 to setup wizard zai list
- hermes_cli/auth.py: add glm-5.2 to coding plan endpoint probes
- plugins/model-providers/zai/__init__.py: add glm-5.2 to fallback_models
- tests/agent/test_model_metadata.py: context resolution + vendor-prefix tests
2026-06-14 13:50:36 -07:00
Teknium
4e6d05c6a5
perf(skills): share raw config cache in skill utils (#46149) 2026-06-14 11:14:58 -07:00
kshitijk4poor
ce19fdb7ce fix(skills): apply global|platform disabled union to all resolution sites
The platform-disabled fix landed only in agent.skill_utils.get_disabled_skill_names
(the system-prompt path). Two sibling resolvers still used the old
replace-not-union semantics, so the same skill could be hidden from the
<available_skills> prompt yet reported enabled elsewhere:

- hermes_cli/skills_config.get_disabled_skills (the 'hermes skills config' UI)
  returned only the platform list, so a globally-disabled skill showed as
  enabled (unchecked) on any platform with a platform_disabled entry.
- tools/skills_tool._is_skill_disabled (gates whether skill_view loads a skill)
  ignored the global list when a platform list existed, so a globally-disabled
  skill could still be loaded on such a platform.

Both now union the global list with the platform list, matching
get_disabled_skill_names. An explicit empty platform list no longer re-enables
a globally-disabled skill — global disables hold on every platform (#46201).

Also: fix the now-stale get_disabled_skill_names docstring and drop a stray
blank line. Regression tests added for both sites (proven to fail on the old
replace semantics).
2026-06-14 22:54:54 +05:30
ibrahim özsaraç
7bbe7024c2 fix: filter platform-disabled skills from <available_skills> prompt (#46201)
build_skills_system_prompt() already resolved _platform_hint but called
get_disabled_skill_names() with no argument, so the resolved platform never
reached the filter and the prompt cache_key varied by platform while the
disabled set did not. Pass _platform_hint or None.

get_disabled_skill_names() also fully ignored the global 'disabled' list once
a platform-specific list was found. Return the union (global | platform) so a
globally-disabled skill stays disabled on every platform.

Salvaged from #46203 by @iborazzi; the unrelated apps/shared/tsconfig.json
ES2023 bump is intentionally dropped (one concern per PR).
2026-06-14 22:52:57 +05:30
Teknium
13a1bd0f83
perf(model-metadata): persist OpenRouter metadata cache (#46114) 2026-06-14 04:45:46 -07:00
Teknium
723c2331bd fix: make profile subprocess HOME policy explicit 2026-06-14 03:20:21 -07:00
zccyman
b00060ce54 fix(agent): expose HERMES_REAL_HOME in subprocess envs for profile isolation
When profile isolation activates ({HERMES_HOME}/home/ exists), child
processes receive HOME={HERMES_HOME}/home/ for tool config isolation
(git, ssh, gh). However, scripts using Path.home() to locate
~/.hermes/ would incorrectly resolve to the isolated profile home,
breaking helpers that rely on the real user home directory.

New get_real_home() helper in hermes_constants resolves the actual
user home independently of profile isolation. All four subprocess
spawners now inject HERMES_REAL_HOME alongside the profile HOME:

- tools/code_execution_tool.py (execute_code)
- tools/environments/local.py (terminal background, run_env)
- agent/copilot_acp_client.py (Copilot ACP)

Child scripts can now use:
  Path(os.environ.get("HERMES_REAL_HOME", os.environ.get("HOME", "")))

to reliably find the real user home regardless of profile isolation.

Closes #25114
2026-06-14 03:20:21 -07:00
helix4u
85e6232a07 fix(providers): support anthropic proxy v1 endpoints 2026-06-14 02:09:16 -07:00
Teknium
81e42335a1
fix(file-safety): relax user-write deny policy (#45947)
Allow file tools to edit shell startup files, user package-manager configs, and Hermes control files that the user can already modify directly. Keep hard blocks for SSH keys, .env/OAuth token stores, mcp-tokens, pairing files, and system privilege files.
2026-06-14 02:07:32 -07:00
Brooklyn Nicholson
715b691723 fix(desktop): show summarizing indicator during auto-compaction
Auto-compression rewrites history mid-turn, which made long threads look
like they reset. Re-tag the gateway lifecycle status as compacting and
surface it in the desktop thread loading indicators.
2026-06-14 02:28:07 -05:00
kshitijk4poor
10bd01972b refactor(agent): share the content_policy_blocked result builder + recovery hint
The HTTP-200 refusal handler (finish_reason=content_filter) and the
exception-path handler (a provider moderation error classified as
content_policy_blocked) independently built the same terminal turn result —
the same {final_response, messages, api_calls, completed:False, failed:True,
error:'content_policy_blocked: ...'} dict — and ended their user-facing
message with the same 'Try rephrasing... hermes fallback add' trailer, copied
verbatim. The two copies could drift.

Funnel both through a shared _content_policy_blocked_result() builder and a
shared _CONTENT_POLICY_RECOVERY_HINT constant. Also collapse the HTTP-200
path's two near-identical with/without-explanation templates into one (compute
the detail fragment once) and pass reason=FailoverReason.content_policy_blocked
.value to the error hook instead of a hand-written string literal, matching the
sibling hook call.

Behavior-preserving: the provider/refusal lead-in wording stays distinct (a
provider safety filter vs the model declining are genuinely different signals),
the with-text and exception messages are byte-identical to before, and the
no-explanation case only gains a paragraph break for consistency. Surfaced by
the simplify-code reuse/quality reviewers.

The efficiency reviewer's 'redundant normalize_response' flag was deliberately
NOT applied: that branch is cold (refusal-only) and pure-CPU, and reusing the
sibling-branch normalized locals would risk a NameError on the codex_responses
path (which sets finish_reason without normalizing) — re-normalizing is the
robust choice.
2026-06-14 12:19:19 +05:30
kshitijk4poor
12c84d6c77 fix(transports): only treat a refusal as terminal when it is the sole payload
A chat-completions response that carries real text or tool calls *alongside*
a `message.refusal` note is a normal, usable turn — the model did work. The
prior logic flipped finish_reason to `content_filter` whenever a refusal
string was present, so the conversation loop reframed a content-bearing turn
as a *failed* safety refusal (failed=True) and buried the model's actual
output inside the "model declined" template, or dropped tool calls entirely.

Only promote to a terminal `content_filter` when the refusal is the sole
payload (no visible text AND no tool calls). The refusal explanation is still
recorded in provider_data in every case for observability. Refusal-only
responses (the bug this feature targets) are unaffected and still surface
terminally; the empty+refusal, bare content_filter passthrough, and no-refusal
common cases are byte-identical to before.

Updates the partial-content test to the corrected contract and adds a
tool_calls-alongside-refusal regression guard.
2026-06-14 12:12:52 +05:30
SHL0MS
bb46bf8ce4 fix(agent): surface model refusals instead of retrying them as errors
A Claude refusal (HTTP 200, stop_reason="refusal", empty content) was
laundered into a generic retry loop and surfaced as a misleading
"rate limited / invalid response" or "no content after retries" error,
burning paid attempts reproducing a deterministic refusal.

This hit two distinct paths:

- Direct Anthropic (anthropic_messages): validate_response rejected the
  empty-content refusal *before* normalize_response mapped refusal ->
  content_filter, so it fell into the invalid-response retry loop.
- Nous Portal / OpenAI-compatible (chat_completions): the portal surfaces
  a Claude refusal via message.refusal with empty content, which sailed
  past validation and died in the empty-response retry loop.

Fix (one unified content_filter dispatch for all backends):
- AnthropicTransport.validate_response: accept empty content when
  stop_reason == "refusal" so it flows to normalize_response.
- ChatCompletionsTransport.normalize_response: promote message.refusal to
  content + a content_filter finish reason.
- conversation_loop: handle finish_reason == "content_filter" - fire the
  api_request_error hook (content_policy_blocked), try a configured
  fallback once, else return a clear terminal refusal message. Never retry
  a deterministic refusal.

Supersedes #43084, which fixed only the direct-Anthropic path and could
not reach the chat_completions/portal path.

Tests: transport-level (validate_response refusal, message.refusal
promotion) + end-to-end loop (refusal surfaced, exactly one API call).

(cherry picked from commit 01f546f92c)
2026-06-14 12:10:08 +05:30
brooklyn!
4b5ba112ad
fix: shrink images to reported provider dimension limit (#45979)
Parse provider-reported image pixel ceilings so many-image Anthropic requests can recover by shrinking Retina screenshots below the stricter limit instead of retrying the same rejected payload.
2026-06-14 01:07:43 -05:00
Teknium
7aaae7acd0 fix(ssl): align guard docs and escape hatch 2026-06-13 21:14:32 -07:00
Teknium
dc90ca4e17 fix(ssl): run CA guard during agent initialization 2026-06-13 21:14:32 -07:00
Teknium
af5b526472 fix(ssl): validate CA bundle paths before provider calls 2026-06-13 21:14:32 -07:00
chromalinx
a218a0f156 fix(agent,gateway,doctor): add SSL CA cert bundle fail-fast guard
A stale certifi CA bundle after a partial `hermes update` used to crash
the agent on the first outbound HTTPS call with a raw traceback and
trap the gateway in a retry loop.

This patch:

* Adds `agent/errors.py` with a typed `SSLConfigurationError`
* Adds `agent/ssl_guard.py` with a `verify_ca_bundle()` pre-flight
  that asserts the bundle exists, is non-trivial in size, and can build
  a working SSLContext. On macOS, it falls back to the system trust
  store when the bundle is empty but the system store is healthy
  (covers corporate proxies / MDM setups).
* Wires the guard into `run_agent.py` and `gateway/run.py` right
  after the `hermes_bootstrap` import, inside a try/except so a bug
  in the guard itself can never prevent startup.
* Adds a `SSL / CA Certificates` section to `hermes_cli doctor` so
  users can detect the failure with one command.
* Adds unit tests covering the healthy, missing, empty, skip-env, and
  macOS-fallback paths.
* Adds an RCA document describing the failure mode and the recovery
  path (`pip install -e .`).

When the bundle is broken the user sees:

    \u26a0\ufe0f SSL certificate bundle issue detected.
       Run: pip install -e .

`HERMES_SKIP_SSL_GUARD=1` disables the check for sandboxed
environments that ship their own trust store.
2026-06-13 21:14:32 -07:00
Teknium
069bfd6545 fix(agent): keep Codex reasoning replay on Codex path 2026-06-13 14:35:00 -07:00
Teknium
c8e5f34f24
fix(gemini): strip native self prefixes before generateContent (#36141)
Strip `google/` and `gemini/` self-prefixes before native Gemini generateContent calls, and keep provider-normalization expectations aligned.
2026-06-13 13:47:08 -07:00
briandevans
7d11fa4e9e fix(codex-responses): let final_answer complete top-level incomplete responses 2026-06-13 13:45:29 -07:00
ITheEqualizer
7c0605bf22 fix(telegram): preserve rich formatting on stream final 2026-06-13 13:44:45 -07:00
achaljhawar
819def44c7 fix(agent): scope Nous tags to Nous auxiliary calls 2026-06-13 13:24:40 -07:00
ashishpatel26
957a8ffa88 fix(bedrock): omit sampling params for restricted Claude models
Bedrock Converse rejects non-default sampling parameters for Opus 4.7 and 4.8 with a ValidationException. Reuse the Anthropic-native sampling-param guard in the Bedrock kwargs builder so those models omit temperature/topP while older Claude and non-Claude models keep existing behavior.

Includes the stop-sequence regression from the parallel fix to ensure stopSequences still pass through for restricted Opus models.

Co-authored-by: Tranquil-Flow <tranquil_flow@protonmail.com>
2026-06-13 10:45:56 -07:00
Tranquil-Flow
4fd9397ae3 fix(codex): drop extra_headers for chatgpt.com backend 2026-06-13 07:13:24 -07:00
Henrik Bentel
eed61a1251 fix(gemini): add role field to systemInstruction 2026-06-13 06:12:52 -07:00
Tranquil-Flow
5acd185f7c fix(moonshot): handle union type arrays in tool schemas 2026-06-13 05:51:41 -07:00
Teknium
a59d5e37e8
feat(telegram): make rich messages always on (#45584)
Remove the rich_messages config toggle entirely so Telegram replies always try the Bot API 10.1 rich-message path first, with the existing MarkdownV2 fallback/latch behavior for unsupported endpoints and per-message failures.

Restore the Telegram platform hint to encourage rich Markdown tables/task lists/math now that the rich path is the default, and remove the config/docs surface for the old toggle.
2026-06-13 05:45:11 -07:00
Teknium
4b646bc21e
fix(auxiliary): preserve main provider base url (#45587) 2026-06-13 05:44:18 -07:00
helix4u
2d474e39c7 fix(acp): preserve memory provider tools 2026-06-13 04:51:44 -07:00
Teknium
3803e5fc28
fix(agent): don't treat custom:<name> pools as cross-provider mismatch (#45289)
Custom endpoints carry two naming conventions for the same provider: the
agent's provider attribute is the generic 'custom' label while the pool
is keyed 'custom:<normalized-name>'. The defensive guard in
recover_with_credential_pool compared them literally, logged
'Credential pool provider mismatch: pool=custom:<name>, agent=custom',
and skipped recovery — so 401 refresh and 429 rotation never ran for
ANY custom-provider user (seen in the field on a Fireworks setup whose
dead key burned full retry cycles every turn with the skip warning on
each one).

Accept the pair only when the agent's CURRENT base_url resolves to the
same pool key via get_custom_provider_pool_key, preserving the guard's
original purpose (#33088/#33163): a fallback provider or a different
custom endpoint still skips pool mutation.
2026-06-13 02:01:09 -07:00
kyssta-exe
956af7f3c3 fix(agent): add metadata flag to context compression summary messages (#38389)
Summary messages (standalone insertion and merge-into-tail) now carry a
metadata flag so frontends (CLI, Desktop, gateway, TUI) can distinguish
them from real assistant/user messages without content-prefix heuristics.

Re-applied from PR #38434 onto current main (conflicted with the
_SUMMARY_END_MARKER hoist). Key renamed from the PR's
'is_compressed_summary' to '_compressed_summary': the wire sanitizers
strip underscore-prefixed message keys, so the flag stays in-process and
can never reach strict gateways (Fireworks/Mistral/Kimi reject unknown
keys with 'Extra inputs are not permitted').
2026-06-12 16:47:15 -07:00
Teknium
8905ee6b8a fix(agent): rewind flush cursor exactly when repair compacts before the cursor
Follow-up to the #44837 clamp: a min() clamp only fixes cursor overshoot
past the new end of the list. When repair_message_sequence drops/merges
messages at indexes below the cursor, the clamp leaves the cursor pointing
past unflushed rows and the turn-end flush silently skips them.

Extract repair_message_sequence_with_cursor(): snapshot the flushed prefix
by object identity before repair, then recompute the cursor as the count
of surviving flushed messages. Falls back to the clamp when no snapshot is
available. Keeps the safety guard in _flush_messages_to_session_db.

Adds targeted tests for overshoot, before-cursor compaction, no-repair,
bare-agent, and the flush guard.
2026-06-12 16:29:01 -07:00
kyssta-exe
5d0408d9fe fix(agent): clamp flush cursor after repair_message_sequence compaction (#44837) 2026-06-12 16:29:01 -07:00
konsisumer
aec38855b5 fix(agent): preserve recent turns during compression 2026-06-12 16:26:58 -07:00
xxxigm
691ff7c188 fix(compressor): keep last visible assistant reply out of compaction summary + label handoffs in WebUI (#29824)
Two-pronged fix for the WebUI "context compaction block in place of
last assistant response" regression.

Agent layer (the real fix). ``_find_tail_cut_by_tokens`` already had
``_ensure_last_user_message_in_tail`` to keep the most recent user
request out of the compressed middle (#10896), but no symmetric
anchor for the assistant side. When the conversation has an
oversized recent tool result or a long stretch of tool-call/result
pairs *after* the assistant's last visible reply, the token-budget
walk can stop with the previously-visible reply on the wrong side
of ``cut_idx``. The summariser then rolls it into the single
``[CONTEXT COMPACTION — REFERENCE ONLY]`` block persisted as
``role="user"`` or ``role="assistant"``, and from the operator's
perspective the WebUI session viewer
(``web/src/pages/SessionsPage.tsx``) and the TUI chat panel both
suddenly show the opaque "Context compaction" block in the slot
where they were just reading the actual answer:

    User:  "i cant see the output of the last message you sent,
            i did see it previously, however now see 'context
            compaction'"

Added ``_ensure_last_assistant_message_in_tail`` mirror of the
user-side anchor. It looks for the most recent assistant message
with non-empty text content (skipping tool-call-only assistant
"stubs" which the UI renders as small "calling tool X" indicators
rather than a readable bubble) and walks ``cut_idx`` back through
the standard ``_align_boundary_backward`` so we don't split a
tool_call/result group that immediately precedes it. The two
anchors are chained — each only walks ``cut_idx`` backward, so
the tail can only grow.

Falls back to "most recent assistant of any kind" only when no
content-bearing reply exists in the compressible region (fresh
multi-step tool sequence with no prior reply) — in that case the
agent-side fix is effectively a no-op and the existing
user-message anchor carries the load.

WebUI layer (clarity). Added ``isCompactionMessage`` detector that
recognises the ``[CONTEXT COMPACTION — REFERENCE ONLY]`` (current)
and ``[CONTEXT SUMMARY]:`` (legacy) prefixes from
``agent/context_compressor.py``, and a new ``compaction`` entry
in ``MessageBubble``'s ``ROLE_STYLES`` map. Compaction blocks
now render as muted, italicised system-style rows labelled
``Context handoff`` — clearly metadata, not the assistant's
actual reply — so an operator scrolling back through a long
session can't mistake the summary for a real answer.

Keeping the detected prefixes inline (rather than importing them)
because the WebUI bundle has no Python interop. A guardrail comment
points readers at the source-of-truth constants in
``agent/context_compressor.py``.
2026-06-12 15:41:57 -07:00
Teknium
0db5cb8e75 refactor(agent): hoist summary end marker to _SUMMARY_END_MARKER; strip it on rehydration
Follow-up to the #33346 cherry-pick:
- the marker string was duplicated at both insertion sites (standalone +
  merged-into-tail); hoist to a module constant
- _strip_summary_prefix now also strips a trailing end marker so a
  rehydrated handoff body doesn't leak the boundary directive into the
  iterative-update summarizer prompt (it is re-appended on insertion)
2026-06-12 15:05:00 -07:00
Tranquil-Flow
749b7219c4 fix(compression): always append END OF CONTEXT SUMMARY marker to standalone summaries regardless of role
When the compression summary lands as an assistant-role message (head ends
with user), the end marker was not appended. Models may regurgitate the
summary text as their own visible output when there's no clear boundary
signal (#33256).

The end marker was already appended for user-role summaries (#11475, #14521)
but the assistant-role path was missed in the original fix. This ensures ALL
standalone summary messages carry the boundary marker, preventing summary
text from leaking into user-visible chat output.
2026-06-12 15:05:00 -07:00
Aðalsteinn Helgason
2714fc8396 fix(agent): re-enter retry loop on genuine Nous 429 so fallback guard runs
The genuine-rate-limit branch set retry_count = max_retries before
continue, intending the top-of-loop Nous guard to handle fallback or
bail cleanly. But the loop condition is retry_count < max_retries, so
the guard never ran: no fallback activation, no clean rate-limit
message — just the generic retry-exhaustion error.

Set retry_count = max(0, max_retries - 1) so the loop body runs exactly
once more and the guard sees the breaker state recorded moments earlier.

Extracted from the #44061 bugfix rollup by @AIalliAI.
2026-06-12 12:21:29 -07:00
Teknium
652dd9c9f2 fix: rich messages follow-ups — reply_parameters, send latch, opt-in default
- Use reply_parameters per the sendRichMessage spec instead of the
  undocumented reply_to_message_id scalar (silently ignored -> reply
  anchor quietly dropped).
- Latch rich sends off after an endpoint-capability failure (old PTB /
  server without sendRichMessage) so every later reply doesn't pay a
  doomed extra roundtrip; per-message BadRequests do NOT latch.
- Default rich_messages to OFF (opt-in) while the day-old Bot API 10.1
  endpoint is validated live; revert the prompt-hint table guidance
  until the default flips on.
- Tests: reply_parameters shape, send-latch behavior, BadRequest
  non-latch; rich tests opt in explicitly via extra.
2026-06-12 11:47:54 -07:00
ITheEqualizer
05b9c84ca4 Add Telegram Bot API 10.1 rich message support
Introduce opportunistic support for Telegram Bot API 10.1 rich messages by sending raw agent Markdown via sendRichMessage and streaming previews via sendRichMessageDraft. Implements a rich-path fast‑path in gateway/platforms/telegram.py (RICH_MESSAGE_MAX_BYTES=32768, feature gate platforms.telegram.extra.rich_messages, bot capability checks, routing/thread handling, and conservative fallback rules: permanent/capability errors fall back to the legacy MarkdownV2 path, transient/network errors are surfaced without legacy-resend). Also add a latch for draft capability failures (_rich_draft_disabled) and preserve legacy chunking and draft behavior when needed. Update agent prompt hints (telegram encourages rich Markdown/tables), add CLI config example option, update English and Chinese docs to describe rich messages and fallbacks, and add/adjust tests for rich send and draft behavior.
2026-06-12 11:47:54 -07:00
Siddharth Balyan
7ba5df0d52
feat(billing): /credits command — balance + portal top-up handoff (#44776)
* feat(billing): /usage → portal top-up browser handoff

Add the terminal side of the billing slice (phase 2a): start a top-up by
throwing the user to the portal billing page with the top-up modal open. The
terminal does not confirm, poll, or track payment — checkout completes in the
browser and the next /usage shows the new balance.

- nous_account.py: parse organisation.slug/name from /api/oauth/account into
  NousPortalAccountInfo; add nous_portal_topup_url() building the org-pinned
  {base}/orgs/{slug}/billing?topup=open with a null-slug fallback to the legacy
  {base}/billing?topup=open (never /orgs/None/...).
- portal_cli.py: 'hermes portal topup' — fresh account fetch, identity line
  (Topping up as <email> / org <name>), browser open with printed-URL fallback,
  no-wait closing copy. No polling/confirmation (deferred to 2b).
- account_usage.py: the shared /usage credits block now links the org-pinned
  top-up URL (auto-opens the modal) + points to the command.

Depends on NAS #409 (organisation.slug/name + ?topup=open). Do not merge until
that is live on the target env; until then /api/oauth/account returns
organisation: { id } only and the URL falls back to legacy.

* feat(billing): /credits command for balance + top-up handoff

Replace the standalone `hermes portal topup` subcommand with an in-session
/credits slash command — a focused money surface (balance in, top-up out) that
works in the CLI, TUI, and every messaging platform from one registry entry.

- commands.py: register /credits (Info category). Slack is at its 50-slash cap,
  so /credits is routed via /hermes credits on Slack only (new
  _SLACK_VIA_HERMES_ONLY set) to avoid clamping a canonical command off the
  native list and breaking Telegram parity; native everywhere else.
- account_usage.py: build_credits_view() — one portal fetch → balance lines +
  identity line + org-pinned top-up URL + depleted flag, consumed by all
  surfaces. Reuses the same snapshot/URL builder as /usage so numbers match.
- cli.py: _show_credits() — balance block + identity line + 3-button panel
  (Open top-up / Copy link / Cancel) via the existing prompt_toolkit modal.
  ASK, never auto-launch; headless falls back to printing the URL.
- gateway/slash_commands.py: _handle_credits_command() — renders the block +
  tappable top-up URL + no-wait copy; works on button and plain-text platforms.
- /usage credits line now points to /credits.
- Retire `hermes portal topup` (portal_cli.py back to baseline); the engine
  (slug/name parse + nous_portal_topup_url) stays as the shared core.

No polling, no payment confirmation (billing phase 2a). Depends on NAS #409.

* fix(credits): /credits works in the TUI slash-worker (non-interactive)

In the TUI, /credits runs in the slash-worker subprocess where there is no
live prompt_toolkit app and stdin is the JSON-RPC pipe. _show_credits called
the 3-button modal unconditionally, which fell back to reading stdin →
exception → slash.exec rejected → the command produced no output (only the
pre-existing 'Credit access paused' banner showed).

- _show_credits: when self._app is None (TUI worker / piped / non-interactive),
  render the text variant — balance block + tappable top-up URL + no-wait line,
  same affordance as the messaging surfaces — and skip the modal entirely. The
  3-button panel still renders in the interactive CLI.
- Depleted banner copy: 'run /usage for balance' → 'run /credits to top up'
  now that /credits is the dedicated money surface (+ tests).
- Regression tests: _show_credits with self._app=None renders text and never
  invokes the modal; logged-out path.

* feat(tui): credits.view RPC for the /credits tappable top-up button

Add a credits.view JSON-RPC method returning the structured CreditsView
(logged_in, balance_lines, identity_line, topup_url, depleted) so the TUI can
render a clickable <Link> top-up button instead of plain text. Account-
independent (portal fetch gated on a logged-in Nous account), fail-open to
{logged_in: false} on any hiccup. Mirrors session.usage's credits-block pattern.

Frontend (TUI-local /credits command + Ink component) lands separately.

* feat(tui): /credits command with keyboard-driven top-up confirm

TUI-local /credits: fetches the structured balance via the credits.view RPC,
prints the balance + identity + top-up URL, then arms the EXISTING confirm
overlay (Enter = open top-up in browser via openExternalUrl, Esc = cancel).
Reuses ConfirmReq — no new overlay component/state/input handler. Headless
(openExternalUrl returns false) falls back to printing the URL.

- gatewayTypes.ts: CreditsViewResponse.
- commands/credits.ts: the command (mirrors /status's rpc+guarded pattern).
- registry.ts: register creditsCommands.
- test: balance+overlay armed, headless fallback, no-url, logged-out (4 cases).

Matches the CLI /credits 'Enter to open' affordance. Phase 2a: no polling.
2026-06-12 08:51:10 +00:00
Teknium
8e5b7592f8 refactor(agent): hoist MEDIA-directive regex to module level
Avoid recompiling the pattern on every _serialize_for_summary call; name it
beside _PATH_MENTION_RE with the #14665 rationale.
2026-06-12 01:14:28 -07:00
Tranquil-Flow
286ecd26d8 fix(agent): strip MEDIA directives from compressor summarizer input (#14665) 2026-06-12 01:14:28 -07:00
Teknium
c196269d8d
fix(credits): suppress usage gauge when top-up funds exist + add display.credits_notices toggle (#44716)
The subscription-cap usage gauge (50/75/90% bands) ignored purchased
(top-up) credits: a sub user with top-up funds got a sticky warn banner
at 90% of their cap — permanently at >=100%, alongside grant_spent —
despite being fully able to keep inferencing. The cap is the wrong
denominator for an account that can keep spending.

- evaluate_credits_notices: purchased_micros > 0 suppresses the usage
  band (grant_spent already covers the cap-reached + top-up case with
  the remaining balance). A top-up landing mid-session clears any
  showing band; spending top-up down to 0 resumes the gauge.
- New display.credits_notices config (default true): false silences all
  credits notices. State capture and /usage are unaffected. Read once
  per agent (cached) in _emit_credits_notices, fail-open true.
- Docs: configuration.md display block.
2026-06-12 01:06:46 -07:00
kshitijk4poor
15439bee47 refactor(memory): reuse _summarize_user_message_for_log instead of forking it
The original fix added agent/memory_manager.py:flatten_message_content, but
that helper was a near-exact duplicate of
agent/codex_responses_adapter.py:_summarize_user_message_for_log — same
None/str/list dispatch, same {text,input_text,output_text}/{image_url,input_image}
part sets, the identical [N image(s)] marker, and the same str() fallback. The
only difference was the join separator (newline for memory vs space for the
log/trajectory previews the existing helper already serves), and that helper is
already imported into agent/turn_finalizer.py — the same file whose call site the
memory fix touches.

Parameterize the existing helper with sep=' ' (default preserves every current
logging/trajectory caller byte-for-byte) and call it with sep='\n' at the memory
boundary; drop the forked flatten_message_content. Repoints the unit tests to the
consolidated helper and adds a case locking the default space-join.

Single source of truth for multimodal-content flattening; no behavior change for
the fix or for existing callers.
2026-06-12 12:49:18 +05:30
Erosika
87893fe4cb fix(memory): flatten multimodal content before provider sync
Multimodal turns carry message content as a list of typed parts
({type: "text"|"image_url", ...}). _sync_external_memory_for_turn
passed that list straight into MemoryManager.sync_all, and providers
feed it to regexes — Honcho's sync_turn calls sanitize_context, where
re.sub raised 'expected string or bytes-like object, got list'. Every
turn with an attached image silently never synced.

Flatten to plain text at the boundary: text parts joined, images noted
as an [N image(s)] marker so the attachment isn't erased from recall.
Fixing here covers all providers instead of patching each plugin.

(cherry picked from commit 705bdb6ffe)
2026-06-12 12:46:28 +05:30
Teknium
c7bee8f961 refactor(agent): drop unused tail_start param from _derive_auto_focus_topic
The parameter was reserved-but-unused (del'd immediately); YAGNI. Test
call site updated.
2026-06-11 23:03:52 -07:00
konsisumer
434c684bfa fix(agent): focus automatic compression on recent user turns 2026-06-11 23:03:52 -07:00