session.resume was building conversation history with only role and
content, stripping tool_call_id, tool_calls, and tool_name. The API
requires tool messages to reference their parent tool_call, so resumed
sessions with tool history would fail with HTTP 500.
Use get_messages_as_conversation() which already preserves the full
message structure including tool metadata and reasoning fields.
- Show full session ID in a fixed-width column for easy scanning
- Pad row numbers to 2 digits to keep alignment past 9 entries
- Always show session source (tui/cli) instead of conditionally hiding it
- Use Box-based column layout so ID, metadata, and title don't run together
- session.list RPC now queries both tui and cli sources, merged by recency
- Session picker shows source label for non-tui sessions (e.g. ", cli")
- Added source field to SessionItem interface
When HERMES_ROOT was added for Nix-bundled TUI support, the fallback
was set to os.getcwd(). This overrode the TUI's own import.meta.dirname
resolution, so launching `hermes --tui` from outside the repo caused
the gateway client to look for venv/bin/python relative to the user's
working directory instead of the repo root.
Use PROJECT_ROOT (resolved from the source file location) as the
fallback, which is stable regardless of where the command is invoked.
* feat: API server model name derived from profile name
For multi-user setups (e.g. OpenWebUI), each profile's API server now
advertises a distinct model name on /v1/models:
- Profile 'lucas' -> model ID 'lucas'
- Profile 'admin' -> model ID 'admin'
- Default profile -> 'hermes-agent' (unchanged)
Explicit override via API_SERVER_MODEL_NAME env var or
platforms.api_server.model_name config for custom names.
Resolves friction where OpenWebUI couldn't distinguish multiple
hermes-agent connections all advertising the same model name.
* docs: multi-user setup with profiles for API server + Open WebUI
- api-server.md: added Multi-User Setup section, API_SERVER_MODEL_NAME
to config table, updated /v1/models description
- open-webui.md: added Multi-User Setup with Profiles section with
step-by-step guide, updated model name references
- environment-variables.md: added API_SERVER_MODEL_NAME entry
When a streaming response is cut mid-tool-call (connection drop, timeout),
the accumulated function.arguments is invalid JSON. The mock response
builder defaulted finish_reason to 'stop', so the agent loop treated it
as a valid completed turn and tried to execute tools with broken args.
Fix: validate tool call arguments with json.loads() during mock response
reconstruction. If any are invalid JSON, override finish_reason to
'length'. In the main loop's length handler, if tool calls are present,
refuse to execute and return partial=True with a clear error instead of
silently failing or wasting retries.
Also fixes _thinking_exhausted to not short-circuit when tool calls are
present — truncated tool calls are not thinking exhaustion.
Original cherry-picked from PR #6776 by AIandI0x1.
Closes#6638.
The test was mocking _vprint entirely, bypassing the suppress guard.
Switch to capturing _print_fn output so the real _vprint runs and
the guard suppresses retry noise as intended.
Replace 6 identical copies of the Termux detection function across
cli.py, browser_tool.py, voice_mode.py, status.py, doctor.py, and
gateway.py with a single shared implementation in hermes_constants.py.
Each call site imports with its original local name to preserve all
existing callers (internal references and test monkeypatches).
_classify_by_message had no handling for _USAGE_LIMIT_PATTERNS, so
messages like 'usage limit exceeded, try again in 5 minutes' arriving
without an HTTP status code fell through to FailoverReason.unknown
instead of rate_limit.
Apply the same billing/rate-limit disambiguation that _classify_402
already uses: USAGE_LIMIT_PATTERNS + transient signal → rate_limit,
USAGE_LIMIT_PATTERNS alone → billing.
Add 4 tests covering the no-status-code usage-limit path.
* feat(nix): shared-state permission model for interactive CLI users
Enable interactive CLI users in the hermes group to share full
read-write state (sessions, memories, logs, cron) with the gateway
service via a setgid + group-writable permission model.
Changes:
nix/nixosModules.nix:
- Directories use setgid 2770 (was 0750) so new files inherit the
hermes group. home/ stays 0750 (no interactive write needed).
- Activation script creates HERMES_HOME subdirs (cron, sessions, logs,
memories) — previously Python created them but managed mode now skips
mkdir.
- Activation migrates existing runtime files to group-writable (chmod
g+rw). Nix-managed files (config.yaml, .env, .managed) stay 0640/0644.
- Gateway systemd unit gets UMask=0007 so files it creates are 0660.
hermes_cli/config.py:
- ensure_hermes_home() splits into managed/unmanaged paths. Managed mode
verifies dirs exist (raises RuntimeError if not) instead of creating
them. Scoped umask(0o007) ensures SOUL.md is created as 0660.
hermes_logging.py:
- _ManagedRotatingFileHandler subclass applies chmod 0660 after log
rotation in managed mode. RotatingFileHandler.doRollover() creates new
files via open() which uses the process umask (0022 → 0644), not the
scoped umask from ensure_hermes_home().
Verified with a 13-subtest NixOS VM integration test covering setgid,
interactive writes, file ownership, migration, and gateway coexistence.
Refs: #6044
* Fix managed log file mode on initial open
Co-authored-by: Siddharth Balyan <alt-glitch@users.noreply.github.com>
* refactor: simplify managed file handler and merge activation loops
- Cache is_managed() result in handler __init__ instead of lazy-importing
on every _open()/_chmod_if_managed() call. Avoids repeated stat+env
checks on log rotation.
- Merge two for-loops over the same subdir list in activation script
into a single loop (mkdir + chown + chmod + find in one pass).
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Siddharth Balyan <alt-glitch@users.noreply.github.com>
The overflow split loop required _message_id to be set, but on the
first streamed message (or after a segment break) _message_id is None.
Oversized text fell through to _send_or_edit → adapter.send(), which
split internally — but subsequent edits hit Telegram's 'message too
long' and were silently truncated with '…', cutting off the response.
Add a new code path for the _message_id is None case that uses
truncate_message() (same as the non-streaming path) to split with
proper word/code-fence boundaries and chunk indicators. Each chunk
is sent as a new message via _send_new_chunk().
Properly handles got_done (returns immediately after sending chunks
instead of continuing into an infinite loop) and got_segment_break.
Original cherry-picked from PR #6816 by dangelo352.
Fixes silent message truncation on Telegram for long streamed responses.
Chrome auto-launch now passes --user-data-dir, --no-first-run, and
--no-default-browser-check so the debug instance doesn't conflict with
an already-running Chrome using the default profile. The profile dir
lives at {hermes_home}/chrome-debug/.
Also updates the fallback manual instructions to include the same flags
and removes the stale 'close existing Chrome windows' hint.
When managed_persistence is enabled, cleanup_all now only clears local
tracking state without sending DELETE requests to the Camofox server.
This prevents persistent browser profiles (cookies, logins, localStorage)
from being destroyed during process-wide cleanup.
Ephemeral sessions still get full server-side deletion as before.
Chrome auto-launch now passes --user-data-dir, --no-first-run, and
--no-default-browser-check so the debug instance doesn't conflict with
an already-running Chrome using the default profile. The profile dir
lives at {hermes_home}/chrome-debug/.
Also updates the fallback manual instructions to include the same flags
and removes the stale 'close existing Chrome windows' hint.
When _generate_summary() failed (no provider, timeout, model error),
the compressor silently dropped all middle turns with just a debug
log. The agent would then see head + tail with no explanation of the
gap, causing total context amnesia (generic greetings instead of
continuing the conversation).
Now generates a static fallback marker that tells the model context
was lost and to continue from the recent tail messages. The fallback
flows through the same role-alternation logic as a real summary so
message structure stays valid.
- Rewrite test_google_workspace_api.py: test bridge token handling
and calendar date range instead of removed get_credentials()
- Update test_google_oauth_setup.py: partial scopes now accepted
with warning instead of rejected with SystemExit
Migrate the google-workspace skill from custom Python API wrappers
(google-api-python-client) to Google's official Rust CLI gws
(googleworkspace/cli). Add gws_bridge.py for headless-compatible
token refresh. Fix partial OAuth scope handling.
Co-authored-by: spideystreet <dhicham.pro@gmail.com>
Cherry-picked from PR #6713