hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-07-25 17:18:11 +00:00

Author	SHA1	Message	Date
solyanviktor-star	f1ea4a56c2	fix(memory): cover the remaining setup-time .env reads with utf-8-sig Follow-up to review feedback: - mem0 _prompt_api_key read .env with the locale default, so a Notepad BOM hid the first key from the masked current-value lookup; read it with utf-8-sig + errors=replace like the canonical readers in hermes_cli/config.py. - hindsight _load_simple_env used plain utf-8; it also parses the Hermes .env during post_setup, where a BOM stuck to the first key. Switch to utf-8-sig + errors=replace. - Add hindsight regressions: BOM key matching in _load_simple_env and in the cloud post_setup writer, plus non-ASCII round-trip preservation, and a mem0 regression for the BOM'd masked-key lookup. The BOM tests fail without the fix on any platform. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-24 17:10:39 -07:00
solyanviktor-star	75afc47baa	fix(memory): read/write .env as UTF-8 in mem0 and hindsight setup The mem0 and hindsight memory-provider setup routines round-trip the user's ~/.hermes/.env: they read existing lines, update the keys they manage, and rewrite the whole file preserving every other line verbatim. Both used env_path.read_text() / write_text() with no encoding. read_text()/write_text() with no encoding fall back to the system locale (cp1252/GBK on Windows), so on a non-UTF-8 host the preserved lines get mangled or the call crashes on any non-ASCII value, and — because the reader never strips a BOM — a Notepad-edited .env makes the first key fail the in-place match and get duplicated instead of updated. Match the canonical .env readers in hermes_cli/config.py: read with encoding='utf-8-sig' (BOM-tolerant) and write with encoding='utf-8'. mem0/_setup.py already pins utf-8 for mem0.json, so this just aligns the .env path in the same file. Fixes both memory plugins in one class fix. Adds regression tests: a BOM'd .env updates the first key in place (locale-independent, fails without the fix) and non-ASCII existing lines survive the round-trip. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-24 17:10:39 -07:00
Hao Zhe	1cfe23c6e4	fix(openviking): serialize client refresh state	2026-07-24 13:00:53 +05:30
Hao Zhe	cf0bd5dd4c	fix(openviking): stop pending runtime start on shutdown	2026-07-24 13:00:53 +05:30
Hao Zhe	c4d0f1c1d6	fix(openviking): serialize local runtime recovery starts Avoid spawning multiple local OpenViking server processes while a runtime autostart waiter is already active. Remote endpoints still retry on later accesses because they do not install a local waiter.	2026-07-24 13:00:53 +05:30
Hao Zhe	8fdc9c58e0	fix(openviking): use readonly config loader	2026-07-24 13:00:53 +05:30
爪爪	5ea3abc3b3	fix(openviking): match tenant-header errors structurally instead of hard-coding strings The _needs_trusted_identity_retry method was hard-coding specific server-side error strings to detect when a request failed due to missing X-OpenViking-Account / X-OpenViking-User headers. Each new server-side error variant required another string added to the client. Replace the string enumeration with a structural match: the error message mentions one of the tenant headers AND the HTTP status is 400. This covers all current error variants: - "Trusted mode requests must include X-OpenViking-Account and User" - "ROOT requests to tenant-scoped APIs must include X-OpenViking-Account" - "Trusted mode requests must include X-OpenViking-Account." - "Trusted mode requests must include X-OpenViking-User." The 400 status guard avoids false-positives on 403 errors such as "USER API keys cannot override X-OpenViking-User", which must not trigger a retry. All 176 existing tests pass. (cherry picked from commit `5a24d6766c`)	2026-07-24 13:00:53 +05:30
Hao Zhe	21a634b98a	test(openviking): keep root tenant errors out of trusted retry	2026-07-24 13:00:53 +05:30
Hao Zhe	36f01d2e54	fix(openviking): sanitize splitline env separators	2026-07-24 13:00:53 +05:30
koshaji	d3520944c7	fix(openviking): join runtime-autostart thread on shutdown (SIGABRT-at-exit) `OpenVikingMemoryProvider.shutdown()` joins in-flight writers, deferred-commit threads, and prefetch threads, but not `_runtime_start_thread`, the tracked `daemon=True` waiter that runs `_finish_runtime_openviking_start`, which blocks on network health probes (`_wait_for_openviking_health` polling + a `_VikingClient.health()` request). If the local OpenViking runtime is slow or unreachable, that waiter can still be blocked in network I/O at interpreter exit. CPython then forcibly kills it during `Py_FinalizeEx` (`PyThread_exit_thread` -> `__pthread_unwind` -> `abort()`), producing SIGABRT (exit 134) with no traceback: the same daemon-thread-at-exit failure class fixed for the Honcho provider. Fix: - `shutdown()` now joins `_runtime_start_thread` (timeout-bounded) alongside the other tracked threads. - `_wait_for_openviking_health()` gains a `should_stop` callback; the waiter passes `lambda: self._shutting_down` so the poll loop bails out promptly once `shutdown()` flips the flag, instead of lingering up to the 60s autostart timeout and timing out the join (which would leave the thread alive). - Add tests/plugins/memory/test_openviking_shutdown.py covering the short-circuit and the shutdown-joins-runtime-thread behaviour. (cherry picked from commit `5471ec7021`)	2026-07-24 13:00:53 +05:30
pprism13	9291b786b4	fix(openviking): sanitize embedded newlines when writing .env secrets `_write_env_vars` in the OpenViking memory provider interpolates each secret straight into a `KEY=VALUE` line, but the values only ever pass through `_clean_config_value`, whose `value.strip()` trims surrounding whitespace and leaves internal CR/LF intact. Because the file is strictly line-oriented and is re-read via `read_text().splitlines()`, a value that carries an embedded newline spills onto a second physical line, and the tail is re-parsed as an independent `KEY=VALUE` entry on the next round trip. A secret pasted with a trailing record (e.g. an `OPENVIKING_API_KEY` copied with an extra line) therefore injects an arbitrary additional variable into the persisted credentials file and silently corrupts it. The fix neutralizes the line terminators at the single chokepoint where values reach the file. A small `_env_line_safe` helper strips `\r`, `\n`, and the NUL byte from each value, and both write sites in `_write_env_vars` (the existing-key update branch and the appended-key branch) route through it, so a value can only ever occupy the single line it is written on. ## What does this PR do? Hardens the OpenViking memory provider's `.env` writer so a malformed or pasted secret value can no longer break out of its `KEY=VALUE` line and inject a rogue variable into the profile-scoped credentials file. ## Related Issue N/A ## Type of Change - [x] 🐛 Bug fix (non-breaking change that fixes an issue) ## Changes Made - `plugins/memory/openviking/__init__.py`: add `_env_line_safe()` which removes `\r`, `\n`, and `\x00` from a value, and apply it to both the updated-key and appended-key write branches in `_write_env_vars()`. - `tests/plugins/memory/test_openviking_provider.py`: add two regression tests covering a fresh write and an in-place key update with embedded CR/LF, asserting no injected line survives the read-back. ## How to Test 1. Run the targeted tests: `pytest tests/plugins/memory/test_openviking_provider.py -k env_writer -q` 2. Reverting the `_env_line_safe` sanitization makes `test_openviking_env_writer_strips_embedded_newlines_in_values` and `test_openviking_env_writer_strips_newlines_when_updating_existing_key` fail with a rogue `INJECTED_KEY=`/`ROGUE=1` line appearing in the file, confirming the tests pin the bug. 3. `ruff check plugins/memory/openviking/__init__.py` and `python scripts/check-windows-footguns.py plugins/memory/openviking/__init__.py` both pass. ## Checklist ### Code - [x] I've read the Contributing Guide - [x] My commit messages follow Conventional Commits - [x] I searched for existing PRs to make sure this isn't a duplicate - [x] My PR contains only changes related to this fix - [x] I've run the relevant tests and they pass - [x] I've added tests for my changes (required for bug fixes) - [x] I've tested on my platform: macOS 15 (Darwin 25.5) ### Documentation & Housekeeping - [x] I've updated relevant documentation (docstrings), or N/A - [x] I've updated `cli-config.yaml.example` if I added/changed config keys: N/A - [x] I've updated `CONTRIBUTING.md` or `AGENTS.md` if I changed architecture or workflows: N/A - [x] I've considered cross-platform impact (strips CR as well as LF): done - [x] I've updated tool descriptions/schemas if I changed tool behavior: N/A (cherry picked from commit `f29dd2df84`)	2026-07-24 13:00:53 +05:30
Teknium	c1b0f6f3c1	feat(kanban): per-task model dropdown: set/override worker model+provider from the board (#69876) Adds the missing write path for the per-task model_override column (which was previously only settable via manual SQL) and pairs it with a provider_override so cross-provider switches resolve correctly: - kanban_db: provider_override column (+migration), set_model_override() with model_override_set event, create_task(model_override=, provider_override=), dispatcher spawns worker with -m <model> [--provider <name>] - dashboard: Model row in the task drawer: dropdown fed by a new /model-options endpoint (build_models_payload substrate, provider-grouped, free-text fallback), PATCH + bulk model override support - CLI: kanban create --model/--provider, new kanban set-model subcommand, show prints the provider - agent tools: kanban_create accepts model/provider; show/list expose provider_override Rate-limit recovery flow: override is settable on running tasks and takes effect on the next dispatch, without touching the worker profile's config.	2026-07-22 22:23:24 -07:00
wernerhp	cc84af9fad	fix(memory): preserve genuine pre-delimiter content in merged compaction rows teknium1 review on #57690: harvesting logic was skipping the ENTIRE merged row when a compaction summary was appended to the tail message, discarding real prior user content that context_compressor retains before the _MERGED_SUMMARY_DELIMITER. Extract and harvest that pre-delimiter segment instead of dropping it wholesale. Revert-to-fail: reverting plugins/memory/holographic/__init__.py alone drops test_merged_into_tail_preserves_genuine_pre_delimiter_preference (19 passed, 1 failed); restoring the fix returns 20/20 passed.	2026-07-22 06:59:14 -07:00
wernerhp	004de13f14	fix(memory/holographic): don't harvest compaction summaries; honor auto_extract=false string Two compounding defects in the holographic memory provider (#57682): 1. The on_session_end gate used plain truthiness on auto_extract, but the plugin's own config schema declares it as a string enum with default "false" — and not "false" is False, so extraction ran for users who had it configured off. Coerce with the shared utils.is_truthy_value (same fix class as the merged byterover no-op fix). 2. _auto_extract_facts scanned every role=user message. Context-compaction handoff summaries can be inserted as role=user messages and their prose reliably matches the decision patterns (we decided/agreed, the project uses), so the compactor's own output was persisted as a durable project fact on every rollover following a compaction — recreated even after manual deletion. Adds agent.context_compressor.is_compaction_summary_message(), a public helper that prefers the in-process COMPRESSED_SUMMARY_METADATA_KEY marker and falls back to _is_context_summary_content() (covers merged-into-tail and historical prefixes), since the metadata key is stripped by wire sanitizers and doesn't survive all session-store round-trips. The plugin skips summary messages before pattern matching. Fixes #57682	2026-07-22 06:59:14 -07:00
Hao Zhe	8af2133009	fix(openviking): align session context with shared profile contract	2026-07-22 14:12:50 +05:30
Flownium	11c1ca01c5	fix(openviking): inject session-start memory context (cherry picked from commit `18b474d0bd`)	2026-07-22 14:12:50 +05:30
kshitij	c3c80e1796	refactor: cleanup follow-up for salvaged PR #58871 - Remove dead current_sid parameter from _recover_pending_sessions - Remove dead cleanup parameter from _release_owner_run_claim (always True) - Set _run_lock_path after flock succeeds, not before - Collapse redundant BlockingIOError branch (covered by OSError+errno check) - Track _pending_marked_sids to skip re-writing marker file on every sync_turn	2026-07-22 14:05:28 +05:30
Chris Korhonen	81fc424592	fix(openviking): chunk structured session sync Preserve ordered structured turns across OpenViking's 100-message batch limit and resume retries from the first unconfirmed message. Based on the OpenViking batching work from commit `1a567f7067` in #58981.	2026-07-22 14:05:28 +05:30
Hao Zhe	3fd96583f4	fix(openviking): serialize orphan session recovery	2026-07-22 14:05:28 +05:30
Hao Zhe	323e9baf5d	fix(openviking): recover pending session commits	2026-07-22 14:05:28 +05:30
RenoMG	95c616be20	fix(supermemory): complete self-hosted endpoint routing	2026-07-20 00:40:40 -07:00
Dhravya Shah	24ac26a3da	feat(supermemory): support custom base URL for self-hosted servers The supermemory SDK already honors SUPERMEMORY_BASE_URL, but the raw urllib call used for session-end conversation ingest hardcoded https://api.supermemory.ai/v4/conversations, so ingest always hit the cloud even when pointing at a self-hosted server (e.g. http://localhost:6767). Resolve the base URL as config (supermemory.json base_url) > SUPERMEMORY_BASE_URL env var > https://api.supermemory.ai, strip any trailing slash, and use it for both the SDK client and the /v4/conversations ingest endpoint.	2026-07-20 00:40:40 -07:00
teknium1	c84c0c5277	Merge branch 'pr-51020' into lane/c3-memory-panel	2026-07-18 15:13:04 -07:00
Teknium	07f07c7b51	fix(mem0): migrate legacy OSS base URL aliases Normalize stale api_base keys to each mem0 provider's accepted URL field before Memory.from_config, without mutating the saved config.	2026-07-17 13:49:29 -07:00
Teknium	e4f87557b9	feat(kanban): modal create-task dialog, editable board project directory, comment workflow hint (#66333) Community feedback (@LSanapalli on X): the inline task-creation form is cramped inside a ~280px column with no way to resize; board-level workspace defaults can't be changed after board creation; and users believe they must block a task, comment, then unblock just to talk to a worker. - Create-task dialog: replace the inline column form with a centered modal (reuses hermes-kanban-dialog chrome, 36rem wide) with labeled fields for title, assignee, priority, skills, workspace kind/path, goal mode, and parent task. Same request shape; Enter/Escape behavior preserved; submit disabled until a title is present. - Board settings dialog: new Settings button in the board switcher opens a modal to edit display name, description, and the board-level default project directory (default_workdir). PATCH /boards/:slug now accepts default_workdir (validated absolute existing dir; empty string clears; omitted leaves unchanged) and returns the recomputed default_workspace_kind so task-creation defaults follow immediately. - Comment workflow hint: the task drawer's comment box now explains that comments land on the thread immediately and reach the worker on its next run/kanban_show() (no block/unblock dance needed), with a fuller tooltip for when blocking IS the right tool. - i18n: new keys optional in the kanban namespace with English fallbacks in the bundle (established pattern; avoids churning 17 locale files). - Docs: dashboard section updated for the dialog + Settings button.	2026-07-17 07:23:54 -07:00
amanning3390	311a5b0a55	feat(kimi): discover K3 on coding endpoint	2026-07-16 13:33:02 -07:00
arminanton	cf73b3d411	fix(copilot): clamp reasoning effort to the nearest supported level, not xhigh->high The Copilot provider profile unconditionally mapped ``xhigh`` to ``high`` before checking the model's catalog, so models that DO support ``xhigh`` (e.g. the gpt-5.x family per the live /models catalog) were silently capped one level down. Honor the requested effort when the catalog lists it as supported, and only downgrade when it does not, choosing the nearest weaker supported level (xhigh->high, minimal->low, else medium, else the first supported level). This matches the nearest-down clamp behavior used elsewhere for the ``max`` effort. Adds tests/plugins/model_providers/test_copilot_profile.py covering forward, downgrade, and fallback paths (catalog lookup stubbed).	2026-07-16 08:47:10 -07:00
kshitijk4poor	5d9a72b7c2	fix(ollama-cloud): capability-gate reasoning_effort + correct disable semantics Three follow-up fixes to the salvaged reasoning_effort support, all verified live against ollama.com /v1/chat/completions + /api/show on deepseek-v4-pro, gemma3, and qwen3-coder: 1. Capability-gate on /api/show 'thinking'. The original ignored the supports_reasoning flag and emitted reasoning_effort for every model. Now gated: only models whose native /api/show capabilities list contains 'thinking' (deepseek-v4 yes; gemma3 / qwen3-coder no) get reasoning_effort. Mirrors the LM Studio pattern: capability resolved once per (model, base_url) in run_agent._supports_reasoning_extra_body via a cached probe (hermes_cli.models.ollama_model_supports_thinking), threaded into the profile hook as supports_reasoning. No live HTTP in the per-request path. 2. Disable actually disables. Ollama Cloud defaults to thinking ON and IGNORES the extra_body.thinking:{type:disabled} shape (verified: still returned reasoning). The only working off switch is top-level reasoning_effort:'none'. The salvaged code returned ({}, {}) for enabled:false / effort:none, leaving thinking ON. Now emits {'reasoning_effort': 'none'}. 3. Omit unrecognized effort. The original forwarded any unknown string verbatim including 'minimal' (a real Hermes effort level). Ollama Cloud rejects unrecognized values with a hard HTTP 400 (accepted set: low/medium/high/max/none), so forwarding 'minimal' would break the request. Now omitted. Core touches (run_agent.py, hermes_cli/models.py) add the capability probe; the plugin profile only consumes the resolved flag. 24/24 profile tests green; 194 provider/transport tests unaffected.	2026-07-16 07:58:04 -07:00
otsune	3fccd698fd	feat(kanban): attachment toolset + CLI to match the dashboard surface The kanban board has had full attachment storage and a dashboard HTTP API (upload/list/download/delete) since #35338, but there was no agent toolset tool and no `hermes kanban` CLI verb for attachments. Agents and scripts that don't go through the dashboard server (or can't touch the DB directly) had no way to create or read real attachments — only links in comments. Close that gap by mirroring the existing comment surface: - `kanban_db.store_attachment_bytes()` — one shared write path (validate name, enforce the 25 MB cap, write the blob under the per-task dir with collision-free naming, insert the metadata row, clean up an orphan blob if the insert fails). `_MAX_ATTACHMENT_BYTES`, `_safe_attachment_name`, and a new `_collision_free_path` move here so the dashboard, the tool, and the CLI all share one implementation and can't drift. - Tools (`tools/kanban_tools.py`): `kanban_attach` (inline base64), `kanban_attach_url` (server-side http/https fetch with the same cap), `kanban_attachments` (list). Write tools respect worker task-ownership; list is read-only. Registered in the `kanban` toolset. - CLI (`hermes_cli/kanban.py`): `attach <id> <path>`, `attachments <id>`, `attach-rm <attachment_id>`. - Dashboard `upload_task_attachment` now imports the shared helpers and uses `_collision_free_path` — behavior identical (still streams to disk with the cap, still 413 on overflow). - Docs (AGENTS.md, kanban-worker skill) and toolset membership updated. Tests: tool round-trip + oversize + bad base64 + ownership; attach_url against a local HTTP fixture incl. oversize-mid-stream and non-http scheme rejection; CLI attach/attachments/attach-rm; shared-helper unit tests; dashboard parity preserved. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-07-16 07:33:14 -07:00
AhmetArif0	5a39f0501c	fix(video-gen): omit duration for range-based FAL families when unspecified _clamp_duration returned durations[0] for all families when duration=None, causing pixverse-v6, seedance-2.0, and kling-v3-4k to always send their minimum value (1s, 4s, 3s respectively) instead of omitting the field and letting the FAL endpoint apply its own default. Range families are now detected via the existing _is_duration_range heuristic and return None (field omitted) when no duration is requested. Enum families like veo3.1 keep sending their first entry as the default.	2026-07-16 04:24:10 -07:00
Jeff Watts	d68ac9092a	test(photon): cover hidden Windows sidecar spawns	2026-07-16 01:03:43 -07:00
Erosika	155a792013	Merge origin/main: keep unified declared-schema config surface Conflict resolutions: - web_server.py: keep the branch's unified provider-config handlers (declared schema served by default, profile-scoped) over main's surface=declared query param. Main's declared-surface helpers and the hermes_cli.memory_providers import are dropped — the branch deleted that module when provider schemas moved into their plugins, so main's code path could no longer import. - hermes.ts: keep profileScoped() calls without ?surface=declared to match the unified backend. - constants.ts: take main's new reasoning effort values (max, ultra), keep the branch's memory provider ordering. - config-settings.tsx: take main's FallbackModelsField import, drop the now-unused SECTIONS import (branch replaced it with sectionFieldEntries). - provider-config-panel.test.tsx: file moved to settings/memory/ on the branch; main's act() fixes targeted the old tests, and the rewritten suite produces no act() warnings, so the old path stays deleted. - test_web_server.py: keep both sides' new tests (main's MoA endpoint tests plus the branch's Honcho provider tests).	2026-07-15 17:50:31 -04:00
Teknium	7c954969b7	fix(auxiliary): route direct-create aux callers through call_llm (#65029) * fix(auxiliary): route direct-create aux callers through call_llm (#35566) Five callers (kanban_decompose, kanban_specify, profile_describer, and goals.py's judge + draft-contract) built raw clients via get_text_auxiliary_client() and passed extra_body=get_auxiliary_extra_body(), which only returns Nous portal tags and ignores auxiliary.<task>.extra_body from config.yaml entirely. That was the remaining half of #35566 after the call_llm path was fixed. Routing them through call_llm(task=...) gives each caller the full auxiliary contract for free: task extra_body, the reasoning_effort shorthand, transient retries, provider-profile projection, and fallback chains. goal_judge gains a DEFAULT_CONFIG block (it had none; its provider/model overrides silently didn't exist as documented keys). get_auxiliary_extra_body() now has zero non-test callers; kept for plugin back-compat. Fixes #35566. * test: migrate kanban dashboard + CLI specify mocks to call_llm Two more consumers of specify_task mocked the old get_text_auxiliary_client symbol (missed in the first sibling sweep; they live outside tests/hermes_cli's kanban files): the dashboard plugin's /specify endpoint tests and the /kanban slash-command E2E. Same migration as the rest: mock call_llm at the source, no-provider now surfaces via the LLM-error branch.	2026-07-15 07:39:17 -07:00
Epoxidex	8662254ab2	fix(ollama): emit top-level reasoning_effort=none on /v1/chat/completions (#25758) Ollama's /v1/chat/completions silently ignores extra_body.think (it only honours it on /api/chat; see ollama/ollama#14820), so agent.reasoning_effort: none never actually disabled thinking on OpenAI-compatible Ollama routes. Emit the top-level reasoning_effort='none' field (which Ollama respects) alongside think=False (kept for proxies and the native /api/chat path). The PR's second half (propagating reasoning_config to the background-review fork) already landed on main via agent/background_review.py, so only the provider-profile change is salvaged here, resolved onto the current GLM/effort-aware profile. Salvaged from PR #29820 by @Epoxidex.	2026-07-15 06:38:28 -07:00
Ben Barclay	9884b4faad	fix(cron/chronos): cache PyJWKClient across fires to stop JWKS fetch storm (#64641) The inbound cron-fire verifier constructed a fresh PyJWKClient on every fire, discarding the client's key cache and forcing a synchronous JWKS HTTP GET to the portal on each fire. Under a burst of concurrent fires (a hosted instance with several cron jobs firing in the same window) this fanned out into N simultaneous JWKS fetches that the portal rate-limited (HTTP 403 -> verification fails -> agent 401), or that blocked the event loop long enough that the fire webhook could not return its 202 before the relay's 30s timeout (observed in prod as relay 504s concentrated on high-job-count instances). Cache one PyJWKClient per JWKS URL at module scope (double-checked lock) so the signing keys are reused across fires; NAS keys rotate rarely, so the steady state is zero JWKS fetches per fire. Regression test proves 5 fires -> 1 client construction (was 5).	2026-07-15 09:27:35 +10:00
kshitijk4poor	1f41bdbecd	fix(upstage): collapse unknown future efforts to high; behavior-contract tests Review findings from the 4-angle pass: - Unknown-but-enabled effort levels now collapse to Solar's strongest (high) instead of silently downgrading to the medium default — guards against the next #62650-style vocabulary addition. Explicit-empty effort keeps the medium default. - fallback_models test now asserts the behavior contract (non-empty, no denied families) instead of freezing the exact model tuple (change-detector, AGENTS.md reject reason). - Drop unused pytest import in test_upstage_provider.py.	2026-07-15 00:09:24 +05:30
kshitijk4poor	f88cac71bc	fix(upstage): map 'ultra' reasoning effort to Solar's high Main added max/ultra effort levels (#62650) after this PR branched; without the mapping 'ultra' silently fell through to the medium default. Matches the xhigh/max collapse-to-strongest convention used by other profiles.	2026-07-15 00:09:24 +05:30
Changhyun Min	35d3fc3b09	refactor(agent): drop the solar-pro rolling alias, default to solar-pro3 Pin the Upstage default to the concrete solar-pro3 instead of the solar-pro rolling alias: - plugin fallback_models is now ("solar-pro3",); entry [0] is the setup default - drop the "solar-pro" context-window fallback entry (solar-pro3 covers it) - update the reasoning default-on docstring and profile tests accordingly Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-15 00:09:24 +05:30
Changhyun Min	0031c5c371	refactor(agent): treat unknown Solar models as reasoning-capable Invert the reasoning-support check from an allow-list (solar-pro, solar-open) to a deny-list of the known non-reasoning families (solar-mini, syn-pro). Newly released Solar models now get reasoning_effort by default instead of having it silently dropped. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-07-15 00:09:24 +05:30
Changhyun Min	20502b407c	feat(agent): add Upstage Solar as a model provider Adds Upstage Solar as a bundled model-provider plugin. Solar exposes an OpenAI-compatible chat-completions endpoint at https://api.upstage.ai/v1, so the generic chat_completions transport handles request/response/streaming/tool calls; the profile is the core integration. Provider registration (Upstage isn't in models.dev, so each registry that does not auto-wire from the plugin layer needs an explicit entry; same pattern as nvidia/gmi): - plugins/model-providers/upstage/: UpstageProfile + plugin.yaml. Picker default and offline catalog list only the agentic Solar Pro models, led by `solar-pro` (rolling alias for the latest Pro). default_aux_model empty so aux tasks use the main model. `solar` alias. UPSTAGE_BASE_URL overrides the host. - hermes_cli/providers.py: HERMES_OVERLAYS + label + `solar` alias, so resolve_provider_full('upstage') resolves (without this, an explicit `provider: upstage` in config was dropped and fell through to auto-detect). - hermes_cli/auth.py: PROVIDER_REGISTRY entry + `solar` alias, so `hermes doctor` / resolve_provider recognise upstage (the static-registry path the lazy profile-extension doesn't reliably cover at validation time). - hermes_cli/models.py: CANONICAL_PROVIDERS entry places Upstage Solar in the curated picker order (above the auto-appended `custom`). - agent/model_metadata.py: context-window fallbacks (/v1/models omits context_length); `solar-pro` carries the 128K Pro context as the catch-all. Reasoning: UpstageProfile.build_api_kwargs_extras wires Solar's top-level `reasoning_effort` (low|medium|high; xhigh/max→high). Reasoning-capable families are solar-pro* and solar-open*; solar-mini/syn-pro never receive it. Defaults ON at medium when unset (matches the /reasoning "medium (default)" label); `/reasoning none` disables; explicit/saved settings are honored. No reasoning_content echo handling needed (unlike DeepSeek/Kimi). Web dashboard: - web/src/pages/EnvPage.tsx: add an "Upstage Solar" provider group so UPSTAGE_API_KEY / UPSTAGE_BASE_URL appear under LLM Providers (not "Other"). Docs/tests: - .env.example: documents UPSTAGE_API_KEY / UPSTAGE_BASE_URL. - tests: profile wiring, reasoning_effort mapping (pro/open/mini, efforts, disabled, default-on), provider-resolver regression (resolve_provider_full / get_provider / solar alias / overlay), `solar-pro` default. Testing: pytest tests/providers tests/plugins/model_providers tests/hermes_cli/test_upstage_provider.py tests/run_agent/test_provider_parity.py tests/hermes_cli/test_api_key_providers.py; ruff clean. Verified end-to-end: `hermes doctor` shows "Upstage Solar", and live chat works via both `--provider upstage` and `--provider solar`. Reasoning wire format per https://console.upstage.ai/api/docs/for-agents/raw. Platforms tested: macOS. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-15 00:09:24 +05:30
Jeffrey Quesnelle	77d5b2d573	Merge pull request #59846 from bbednarski9/bbednarski/nemo-relay-upgrade feat(nemo-relay): nemo-relay observability version upgrade to support dynamic plugin activation	2026-07-14 13:15:49 -04:00
Bryan Bednarski	7e201fa1b6	fix(nemo-relay): align dynamic plugin configuration Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>	2026-07-13 17:01:08 -06:00
kshitijk4poor	2fc3f9c1ff	fix(deepinfra): harden multimodal provider routing Prevent credential forwarding across catalog redirects, retain explicit opt-in semantics for paid media backends, fail closed on invalid provider configuration, avoid mixed-catalog and output-limit assumptions, and reserve native STT provider names.	2026-07-14 02:59:39 +05:30
Georgi Atsev	fe002eb124	feat(providers): Support DeepInfra as an LLM provider	2026-07-14 02:59:39 +05:30
Bryan Bednarski	2d2bed5891	Merge origin/main into bbednarski/nemo-relay-upgrade Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>	2026-07-13 14:50:20 -06:00
Teknium	2bd721cebc	test(kanban): remove duplicate final-results footer	2026-07-13 01:49:35 -07:00
Teknium	98b4562947	fix(kanban): make Done-card results actionable	2026-07-13 01:49:35 -07:00
iborazzi	deae8e3b4d	feat(kanban): surface final_result for Done cards; show run summary when task.result is empty	2026-07-13 01:49:35 -07:00
Teknium	a10081f83b	test(image-gen): cover Codex capability HTTP boundary	2026-07-12 23:43:49 -07:00
Teknium	402969670d	fix(image-gen): classify unsupported Codex image accounts	2026-07-12 23:43:49 -07:00
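
The `.env` fixes listed above (f1ea4a56c2, 75afc47baa, 9291b786b4) describe two habits: read the file as `utf-8-sig` so a Notepad BOM does not stick to the first key, and strip CR, LF, and NUL from values so each one stays on its own line. A minimal sketch of both follows. It is not the repository's code: `env_line_safe` and `update_env_file` are hypothetical names chosen for illustration, and the parsing is deliberately simple.

```python
from pathlib import Path
import tempfile


def env_line_safe(value: str) -> str:
    """Drop CR, LF, and NUL so a value cannot spill onto a second line."""
    return value.replace("\r", "").replace("\n", "").replace("\x00", "")


def update_env_file(env_path: Path, updates: dict[str, str]) -> None:
    """Update managed keys in place, keep other lines verbatim, append new keys.

    Hypothetical helper: a sketch of the round-trip the commits describe.
    """
    lines: list[str] = []
    if env_path.exists():
        # utf-8-sig strips a leading BOM; errors="replace" avoids a crash
        # on stray undecodable bytes.
        text = env_path.read_text(encoding="utf-8-sig", errors="replace")
        lines = text.splitlines()

    remaining = dict(updates)
    out: list[str] = []
    for line in lines:
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in remaining:
            # In-place update of a key we manage.
            out.append(f"{key}={env_line_safe(remaining.pop(key))}")
        else:
            # Everything else is preserved as read.
            out.append(line)
    for key, value in remaining.items():
        out.append(f"{key}={env_line_safe(value)}")

    # Always write UTF-8, independent of the system locale.
    env_path.write_text("\n".join(out) + "\n", encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / ".env"
        # A BOM-prefixed file with a non-ASCII value, as Notepad might save it.
        path.write_bytes("\ufeffFIRST_KEY=old\nNOTE=caf\u00e9\n".encode("utf-8"))
        update_env_file(path, {"FIRST_KEY": "new\nINJECTED=1", "EXTRA": "x"})
        print(path.read_text(encoding="utf-8"))
```

With a plain `utf-8` read, the first key would be parsed as `\ufeffFIRST_KEY`, miss the in-place match, and be appended as a duplicate, which is the failure 75afc47baa describes. Stripping the line terminators keeps the pasted `INJECTED=1` tail inside the `FIRST_KEY` value instead of letting it become its own entry.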


332 commits